Issues In Educational Research, Vol 14, 2004

Assessing early literacy and numeracy skills among Indigenous children with the Performance indicators in primary schools test

John R. Godfrey and Ann Galloway
Edith Cowan University
This report examines the Performance Indicators in Primary Schools (PIPS) test as a reliable and cohesive instrument to assess early literacy and numeracy skills among Indigenous children. The process includes the examination of the reliability of the PIPS test using the Cronbach Alpha and the Split-half method with Pearson's r correlation coefficient and the Spearman-Brown correction. Individual items are examined to ascertain their discrimination indices and their item difficulty levels. These analyses reveal that some items of the instrument should be revised to improve its suitability to assess early literacy and numeracy skills. Total scores on the two major item sub-groups are correlated with the total scores to determine the overall cohesiveness of the instrument. In spite of some possible improvements, this report indicates that overall the PIPS test is a highly reliable and thus adequately valid instrument to assess early literacy and numeracy skills among Indigenous children.


Introduction

The Performance Indicators in Primary Schools (PIPS) test is analysed in this report to ascertain if it is a reliable and cohesive instrument to assess early literacy and numeracy skills among Indigenous children. The research project in which the PIPS test was used, Teaching Indigenous students with conductive hearing loss in remote and urban schools in Western Australia (CHL Project), was being undertaken by a team of researchers from Edith Cowan University to investigate the effect of conductive hearing loss as a result of otitis media on the language development of Indigenous students. The Project was funded by an Australian Research Council SPIRT grant and industry partners, the Department of Education and Training (then the Education Department of Western Australia), the Catholic Education Office and the Association of Independent Schools. The Project was undertaken in 16 schools across three Western Australian education districts: Kimberley, Goldfields and Swan (part of the Perth metropolitan area). The research also had the endorsement of Aboriginal health services and the Western Australian government health services in the regions in which it was conducted.

The team believes that hearing loss due to otitis media may affect the development of auditory discrimination and processing skills and as a consequence may reduce phonological awareness, short-term auditory memory skills, auditory sequential memory skills and thus numeracy and literacy skills. They sought answers, among others, to the following questions.

  1. What is the relationship between conductive hearing loss and school related variables including: literacy; numeracy; attendance; and behaviour of pre-primary to Year 3 students?

  2. To what extent does the implementation of new teaching strategies result in improved literacy, numeracy, reduced absenteeism and reduced behaviour problems?
The choice of a reading test to ascertain the reading ability of Indigenous children who may have suffered CHL proved to be a most difficult and sensitive exercise. In particular, the research team was mindful that Indigenous community leaders are concerned that their children are frequently subjected to numerous assessments. Indigenous people have the right and indeed the responsibility to complain and seek to redress unfair, unreliable and invalid assessments made of Indigenous children. Therefore the researchers spent significant time endeavouring to find a suitable instrument.

The following instruments were examined to determine their suitability: the Kimberley Standard English Vocabulary Test (Brandenburg, c.1984); the Phonological Profile for the Hearing Impaired Test (Vardi, 1991); the Western Australian Action Picture Test (Kormendy, 1988); and The Hundred Pictures Naming Test (Fisher & Glenister, 1992). All were rejected for a multiplicity of reasons, including unsuitability of language, complexity of administration, length, difficulty of assessing K to Year 3 reading skills, or because they were considered outdated.

After careful consideration and close examination, the reading tests contained within Neil J. Waddington's (2000) Diagnostic Reading and Spelling Tests 1 & 2 (Second Edition) were chosen because these tests appeared to be simple to understand and the language appeared appropriate for Indigenous children in K through to Year 2. The items depicted relevant and current items to be recognised such as balls, horses, fish and the sun etc. The tests are easy to score. The use of pictures with a three option multiple-choice item narrowed choices and aided statistical analysis. The test was examined by three researchers, who all agreed that the face validity of the instrument appeared suitable for assessing the reading ability of Indigenous children. A small pilot study was also conducted with promising results (see Godfrey, Partington & Sinclair, 2001; Godfrey, 2003a; and Godfrey, 2003b).

Unfortunately the administration of the Waddington (2000) tests to Indigenous children produced a wide divergence of opinion. These differences of opinion may have been based on the location of various schools. For example, at a meeting in a remote school district those responsible for the educational welfare of Indigenous children in the district were clearly opposed to the Waddington tests being administered to Indigenous children. These strong opinions were due to perceptions that the test contained numerous inappropriate, culturally biased items. On the other hand, a few days later one of the researchers was informed by the Principal of a Perth metropolitan school that the Waddington tests were regarded as an important instrument for the assessment of students (including Indigenous students) in the school (G. Partington, personal communication, July 26, 2001).

These and similar reactions of teachers of Indigenous children and Indigenous educators to the Waddington tests led the CHL Research Team to examine other suitable instruments to assess early literacy skills among Indigenous children. After careful consideration and a thorough examination of the content validity of the more recent, widely used and computer based PIPS instrument, it was chosen as the instrument to assist with the assessment of early literacy and also early numeracy skills among Indigenous children.

Method

Test

The Australian edition (Edith Cowan University) of the Performance Indicators in Primary Schools (PIPS): Baseline Assessment 2001, developed by the CEM Centre, University of Durham, United Kingdom, is a computer-based, on-entry level literacy and numeracy assessment instrument designed for pre-primary and Year 1 students. However, the PIPS developers advised that it was also suitable for use with Year 2. The PIPS assessment is administered on a one-to-one basis at the start and end of the school year to measure progress over time. It comprises a variety of sections of questions of increasing difficulty covering general vocabulary knowledge, concepts of print, sounds and phonological awareness, letter knowledge, reading and word attack skills, concepts of maths, digit identification, and number problems. In addition, there are two optional sections, one assessing short term memory (which was included in the testing for the Project); the other assessing attitudes (not included in this Project). The assessment is designed so that if students fail to answer three questions correctly, the program automatically defaults to the next appropriate screen for them. That is, students not able to handle higher level items are not tested on those. Thus, 'failure' is not reinforced, and students are not aware that there is material that has not been covered.
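The section-skipping behaviour described above amounts to a simple discontinue rule. The sketch below (in Python, which is not part of the original study) illustrates the logic; the three-miss threshold comes from the description above, while the function names and data are purely illustrative, and the sketch assumes the three misses are counted within a section rather than consecutively.

```python
# Illustrative sketch of a PIPS-style discontinue rule: if a student fails
# three questions in a section, the program moves to the next appropriate
# section rather than presenting progressively harder items.

def administer_section(items, answer_fn, max_misses=3):
    """Present items in order; stop after `max_misses` incorrect answers.
    `answer_fn(item)` returns True for a correct response."""
    score, misses = 0, 0
    for item in items:
        if answer_fn(item):
            score += 1
        else:
            misses += 1
            if misses >= max_misses:
                break  # default to the next section; remaining items untested
    return score

# Hypothetical student who can answer only the easier items (below 5):
section = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(administer_section(section, lambda item: item < 5))  # 4
```

The student in the example answers the first four items, misses three in a row, and is never shown items 8 to 10, so 'failure' on the harder material is not experienced.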

Pilot study

Indigenous students in pre-primary and Years 1 and 2 from one independent Aboriginal community school and three Education Department of Western Australia (EDWA) schools were involved in an initial testing program using the PIPS instrument in November 2001 and February 2002 (see Godfrey, 2002). These thirty-three students were all volunteers whose parent or caregiver had given permission for the student's involvement in the CHL Project. Table 1 provides an overview of the participant group.

Table 1: Pilot study student sample

Type of school                 Pre-Primary   Year 1   Year 2
Aboriginal Community School         -           2        1
State Primary School A              -           -        3
State Primary School B              2           5        5
State Primary School C              4           4        7
Totals                              6          11       16
Grand Total                                             33

The results of the pilot study, in spite of the small sample available, indicated that the test is reliable with a Cronbach Alpha coefficient of reliability of 0.95. An analysis of the items indicated that most items were operating with positive discrimination values and with difficulty levels within acceptable limits. Ten items were of concern. However, the researchers decided not to attempt to remove or revise any items as the sample was small. It was therefore decided to administer the PIPS instrument to a larger population of Indigenous children in the light of the high coefficient of reliability and the apparent suitability of most items, and to re-examine the reliability and items after the first administration of PIPS.

Sample

The sample of one hundred and ninety-one Indigenous students who attempted the PIPS test came from a variety of schools which included a mix of State Primary, Catholic Primary and Aboriginal Community schools situated in metropolitan, country and remote areas of Western Australia (see Table 2).

Table 2: Sample of schools

Type of School          Metropolitan   Country   Remote   Total
Aboriginal Community         1            1         1        3
State Primary                3            3         1        7
Catholic School              1            2         0        3
Totals                       5            6         2       13

Test administration

Members of the CHL Project team made initial familiarisation visits to the schools to discuss the testing process and to arrange a suitable place within the school where the testing could be conducted. Several visits were usually required to complete all the testing, due to student absences and general issues of student availability. In most schools, the CHL team was assisted by the school's Aboriginal Education Officer; in other schools, they worked directly with the classroom teachers. Students were tested individually.

The second round of testing was carried out towards the close of the school year to endeavour to gain an indication of development in literacy and numeracy skills; that is to gain "value-added" scores. The first and second tests were administered by the same researcher to ensure comparability in administration conditions. Unfortunately, due to student absenteeism, illness and transfers to non-participating schools, matching scores for individual Indigenous students were only available for half the number of those tested during the first administration of the instrument.

Data analysis

To gain an index of the reliability of the first administration of the test, a Cronbach Alpha reliability coefficient was calculated, along with the Pearson's r correlation between the two halves of an odd-even item split, to which the Spearman-Brown correction was applied to produce a split-half reliability coefficient. For the second administration of the test, only a Cronbach Alpha reliability coefficient was calculated.
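Both reliability estimates can be computed directly from a student-by-item score matrix. A minimal sketch of the two procedures named above (Python with NumPy; illustrative only, not the actual analysis code used in the study):

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach Alpha from a matrix with rows = students, columns = items."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)          # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)      # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def split_half_reliability(scores):
    """Odd-even split: Pearson's r between the two half-test totals,
    stepped up with the Spearman-Brown correction for full test length."""
    scores = np.asarray(scores, dtype=float)
    odd = scores[:, 0::2].sum(axis=1)
    even = scores[:, 1::2].sum(axis=1)
    r = np.corrcoef(odd, even)[0, 1]
    return 2 * r / (1 + r)  # Spearman-Brown prophecy formula
```

Calling either function on a 0/1 item-response matrix reproduces the kind of coefficients reported in Table 3.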

An item analysis to obtain discrimination indices (DI), difficulty indices (DIFF) and an indication of the contribution of each item to the instrument (ICI) was carried out on the majority of the PIPS items using the EdStats computer program (Knibb, 1995). Some items proved to be unsatisfactory for analysis using the EdStats program, and a total of 132 items were finally analysed.
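EdStats' exact formulas are not reproduced here, but the two classical indices are conventionally computed as the proportion of candidates answering an item correctly (DIFF) and an item-total correlation (DI). A hedged sketch of that conventional approach follows; EdStats may define DI differently, for example via upper and lower scoring groups.

```python
import numpy as np

def item_difficulty(scores):
    """DIFF: proportion of candidates answering each item correctly.
    `scores` has rows = students, columns = items, entries 0/1."""
    return np.asarray(scores, dtype=float).mean(axis=0)

def item_discrimination(scores):
    """DI as the corrected item-total correlation: each item is correlated
    with the total of the remaining items, so that the item does not
    inflate its own index."""
    scores = np.asarray(scores, dtype=float)
    totals = scores.sum(axis=1)
    dis = []
    for j in range(scores.shape[1]):
        rest = totals - scores[:, j]                 # total excluding item j
        dis.append(np.corrcoef(scores[:, j], rest)[0, 1])
    return np.array(dis)
```

An item with a DIFF near 0.00 or 1.00 and a DI near zero, as in several rows of Tables 4 and 5, contributes little discriminating information.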

Correlations using Pearson's r correlation coefficient were calculated between the overall total score, reading (literacy) total score and mathematics total score of individuals on the first administration of the test with their corresponding score on the second administration of the test.

Finally a simple comparison of the means of the total scores on the two administrations of the PIPS test was conducted to ascertain whether there was change in the test scores of candidates on the two administrations.

Results

Reliability

The Cronbach Alpha reliability coefficient was calculated as 0.98 while the Pearson's r correlation between the two halves of an odd-even items split produced a coefficient r of 0.97 and after the Spearman-Brown correction was applied a Split-half reliability coefficient of 0.98. The mean and standard deviation of the 132 items was 62.61 and 28.08 respectively. The Standard Error of Measurement is 3.25 (see Table 3).

Table 3: PIPS test statistics

Administration 1
    Cronbach Alpha                    0.98
    Pearson's r                       0.97
    Split-half Reliability            0.98
    Mean                             62.61
    Standard Deviation               28.08
    Standard Error of Measurement     3.25
    Sample                             191
Administration 2
    Cronbach Alpha                    0.97
    Sample                              89
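The Standard Error of Measurement in Table 3 follows from the conventional formula SEM = SD × √(1 − r). A quick check (note that plugging in the rounded table values gives roughly 3.97; the reported 3.25 is presumably derived from the unrounded reliability coefficient, which would be nearer 0.987):

```python
import math

def sem(sd, reliability):
    """Standard Error of Measurement: SEM = SD * sqrt(1 - r)."""
    return sd * math.sqrt(1 - reliability)

# Rounded Table 3 values give about 3.97; the reported 3.25 is consistent
# with an unrounded reliability coefficient of roughly 0.987.
print(round(sem(28.08, 0.98), 2))  # 3.97
```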

Norm referenced analysis

The results were analysed as Norm Referenced Test (NRT) data with the assistance of EdStats producing DIs, DIFFs and ICIs. The ICI is an indication of the contribution of the item to the test as a whole with regard to the reliability of the instrument. The difficulty and discrimination of the item are used to determine the ICI value. "Items with ICIs less than 0 should be considered for modification or removal. Items with ICIs more than 20 are desirable" (Knibb, 1995). Selected results are shown in Table 4.

The DIs indicate that the correlation between the scores on the item and the total scores is positive for nearly all items. The DIFFs of the selected items reveal that these items are very difficult for this sample of Indigenous students. On the basis of NRT analysis it would be advisable to revise or remove from the instrument a number of the items listed in Table 4. However, most of these items are discriminating among the participants and thus add information regarding the literacy and numeracy of the Indigenous candidates. Notwithstanding, it is clear that the Sum B items number 13, 14 and 16 with both DIs and DIFFs of 0.00 add no information regarding the numeracy ability of the students. They should be removed from the instrument.

Table 4: Norm referenced PIPS analysis of selected items

Items            DI Range      DIFF Range    ICI Range
Walk 1-18        0.22 - 0.43   0.01 - 0.05   0 - 2
Maths 2; 5-8     0.21 - 0.55   0.01 - 0.12   0 - 3
Sum B 9; 11-16   0.00 - 0.35   0.00 - 0.04   0 - 2
Memory 5-7       0.11 - 0.35   0.01 - 0.07   0 - 3

Criterion referenced analysis

The results were further analysed as Criterion Referenced Test (CRT) data. The mastery level was set at 50% as an arbitrary level to enable an analysis of the suitability of the items. The analysis indicated that the items, as a mastery test, were operating satisfactorily. Items of interest and concern with their DI and DIFF results are listed in Table 5.

The discrimination indices for some of these twenty items are a cause of concern, with DIs of 0.00 due to the items being too easy or too difficult, with DIFFs of nearly 1.00 or 0.00. Most if not all of these items need to be revised or removed from the test for they add little, if any, information regarding the ability of the candidates. Notwithstanding, some of these items serve a useful purpose in the instrument; for example, the items Kitchen 1 to 3 and the two Classroom items appear early in the test and their ease gives candidates confidence with the computer based instrument. The Memory items 6 and 7 are the last two items in an instrument of nearly 150 items and it would be expected that only a few candidates would answer them; thus their DIFFs are close to zero.

Table 5: Criterion referenced PIPS analysis of selected items

Items           DI Range      DIFF Range
Kitchen 1-3     0.02 - 0.06   0.97 - 0.99
Classroom 1-2   0.05 - 0.09   0.97 - 0.98
Sizes 1-6       0.00 - 0.10   0.83 - 0.98
Maths 5-8       0.02 - 0.09   0.01 - 0.04
Sum B 12-16     0.00 - 0.01   0.00 - 0.02
Memory 6-7      0.01 - 0.05   0.01 - 0.02
Note: Mastery Level set at 50%
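For the criterion referenced analysis, one commonly used mastery-based discrimination index is the difference between the proportions of masters and non-masters answering an item correctly, with mastery defined by the 50% cut used above. The sketch below shows that conventional index; EdStats' own CRT definitions may differ, so this is illustrative only.

```python
import numpy as np

def mastery_discrimination(scores, mastery=0.5):
    """A common CRT discrimination index: the difference between the
    proportion of masters and non-masters answering each item correctly.
    Masters are candidates scoring at or above `mastery` of the maximum.
    (EdStats may define its CRT indices differently; this is illustrative.)"""
    scores = np.asarray(scores, dtype=float)
    totals = scores.mean(axis=1)             # proportion of items correct
    masters = totals >= mastery
    if masters.all() or (~masters).all():
        return np.zeros(scores.shape[1])     # no contrast group available
    return scores[masters].mean(axis=0) - scores[~masters].mean(axis=0)
```

Under this index, an item answered correctly by nearly everyone (or no one) yields a value near zero, matching the pattern of concern in Table 5.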

Correlations

Correlations were calculated between the overall total score, literacy total score and mathematics total score of individuals on the first administration of the test and their corresponding scores on the second administration, to investigate whether the instrument consistently measures the same abilities over the two administrations of the test.

The correlations between the individual total scores of the 89 students who completed both assessments indicate that the instrument has a very high correlation between these pairs of scores and thus a very high test-retest reliability (0.88; see Table 6).

Table 6: Total reading and mathematics scores correlations


            Total 1   Total 2   Reading 2   Math 2
Total 1                 0.88
Reading 1     0.99                  0.85
Reading 2               0.99
Math 1        0.93                             0.84
Math 2                  0.89

The correlations between reading scores on the first and second administration of the test for each candidate and the correlations between mathematics scores on the first and second administration of the test for each candidate are very high; 0.85 and 0.84 respectively (see Table 6). These results indicate that the two major sub-sections (reading and mathematics) of the test have high test-retest reliability. Both these subsections correlate highly with the total scores of each administration of the test (reading 0.99 with administration 1 and 0.99 with administration 2; mathematics 0.93 with administration 1 and 0.89 with administration 2; see Table 6) and thus indicate that the major sub-sections significantly contribute to the reliability of the instrument.

Indications of change

A comparison of the means of the total scores on the two administrations of the PIPS test shows that the students increased their scores on the sub-sections of the test (see Table 7). While these results are positive and valuable to the overall purposes of the research, further analysis of them is outside the parameters of this report.

Table 7: Change indicators


          Test 1   Test 2   Change
Reading     48       84       36
Maths       30       41       11
Total       84      126       42

Discussion

The PIPS instrument, on the basis of the results of this sample, is clearly a highly reliable instrument. On the basis of the PIPS reliability coefficients, it can be recommended to Indigenous educators and to teachers as a reliable instrument to use with Indigenous students. Notwithstanding the DI and DIFF results of 12 or more items out of the 132 analysed, the PIPS instrument appears to be a discriminating, reliable instrument for assessing the mastery of the skills and sub-skills of reading and numeracy of Indigenous children.

An interesting feature of the structure of the PIPS test is that the difficulty levels of the items in each of the sub-groups are mainly hierarchical in value. Generally, it appears that the subgroups are Guttman scales and the large majority of the PIPS items appear to conform to a Rasch measurement scale (Godfrey, 2003a). However these results are tentative and need to be the subject of further more thorough analysis with a larger sample size.

Implications

These analyses of the PIPS instrument indicate that overall the test is a highly reliable and thus adequately valid instrument to assess early literacy skills, numeracy skills and change in these skills among Indigenous children. This report also reveals that revisions to some of the items would possibly improve its suitability to assess Indigenous children. Moreover, the PIPS test may assist in isolating more effectively those areas of phonics, reading and numeracy skills that are of concern for Indigenous children.

The results of this sample of Indigenous students tested with the PIPS test may assist with the ongoing debate regarding suitable assessment instruments to use with Indigenous children and the testing of Indigenous children with standardised tests in general. The complaints of Indigenous people should not be directed in the first instance at the use of standardised instruments but rather at the misuse and misreporting of the results of standardised testing, and the unfair, often subtle assessments made of Indigenous children by school personnel without the aid of reliable and valid instruments (see Godfrey, Partington, Richer, & Harslett, 2001; Godfrey, Partington, Harslett, & Richer, 2001).

The problem of assessment will not dissipate; it is a feature of modern society to assess most areas of behaviour and achievement. If the assessment and the associated instruments are culturally appropriate, valid and reliable, then Indigenous communities and parents should welcome such evaluation programs. Indeed, carefully constructed evaluation programs have been used to support programs that have aided the education of Indigenous children (Cataldi & Partington, 1998). To succeed in the 21st century Australian Indigenous children need to be participants in these educational evaluation processes. The National Strategy for the Education of Aboriginal and Torres Strait Islander Peoples; 1996-2002 (Ministerial Council on Education, Employment, Training and Youth Affairs, 1995) realises the importance of assessment to Indigenous education programs. It lists as one of the strategies for both Early Childhood Education and Schooling, "Formalise assessment procedures, strategies and instruments which appropriately reveal Aboriginal and Torres Strait Islander children's achievement" (Strategies 5.2.6.e & 5.2.6.s).

To lessen the legitimate concerns of Indigenous Australians in regard to assessment procedures researchers need to adhere to measurement procedures such as: ensuring that a pilot study of any instrument is conducted before using it on the wider Indigenous community; comparing the results collected over time from valid and reliable instruments to ensure the long term usefulness of the results; and using the results of valid and reliable instruments to compare various groups within Australian society in order to assist educationally and socially those sectors that are disadvantaged.

In short, it is essential to work through the problems associated with various types of tests and assessment programs in general to ensure that accurate and valid instruments and assessment programs are established and maintained to allow Indigenous and non-Indigenous educators to be well informed of the achievements of Indigenous children and Indigenous educational programs "beyond even reasonable doubt" (Cataldi & Partington, 1998, p. 325). Educators should accept that assessments in their various formats are necessary and comparisons need to be made between sections of Australian society to highlight both deficiencies and achievements, and thus overcome the deficiencies and apply those principles that assist students to achieve (see Drew, 2000).

References

Brandenburg, P. (c.1984). Kimberley Standard English Vocabulary Test. Available from M. Kormendy of Edith Cowan University, Perth.

Cataldi, C., & Partington, G. (1998). Beyond even reasonable doubt: Student assessment. In G. Partington (Ed.), Perspectives on Aboriginal and Torres Strait Islander education (pp. 309-332). Katoomba: Social Science Press.

CEM Centre, University of Durham. (2001). Performance Indicators in Primary Schools: Baseline Assessment 2001 (Australian Edition: Edith Cowan University).

Drew, N. (2000). Psychological testing with Indigenous people in Australia. In P. Dudgeon, D. Garvey & H. Pickett (Eds.), Working with Indigenous Australians: A handbook for psychologists (pp. 325-333). Perth: Guanda Press.

Fisher, J. P., & Glenister, J. M. (1992). The Hundred Pictures Naming Test. Hawthorn: Australian Council for Educational Research.

Godfrey, J., Partington, G., Harslett, M., & Richer, K. (2001). Attitudes of Aboriginal students to schooling. Australian Journal of Teacher Education, 26 (1), 33-39.

Godfrey, J., Partington, G., Richer, K., & Harslett. M. (2001). Perceptions of their teachers by Aboriginal students. Issues in Educational Research, 11(1), 1-13. http://www.iier.org.au/iier11/godfrey.html

Godfrey, J., Partington, G., & Sinclair, A. (2001). To test or not to test?: The selection, adaptation, administration and analysis of instruments to assess literacy skills among Indigenous children. Proceedings of the Australian Association for Research in Education Conference held in Fremantle, 2nd to 6th December. http://www.aare.edu.au/01pap/god01617.htm [verified 5 Oct 2004]

Godfrey, J. R. (2002). An analysis of the Performance Indicators in Primary Schools (PIPS) instrument to assess early literacy skills among Indigenous children: A pilot study. Refereed paper presented at the Australian Indigenous Educators Conference held in Townsville, 4th to 7th July, 2002.

Godfrey, J. R. (2003a). Report on the administration and analysis of the Performance Indicators in Primary Schools test to assess literacy skills among Australian Indigenous children. Round Table discussion held at CEM Centre, University of Durham, Durham, 15th July.

Godfrey, J. R. (2003b). The selection, administration and analysis of Performance Indicators in Primary Schools instrument to assess literacy skills among Indigenous children. Paper presented at the Tenth International Literacy and Education Research Network Conference on Learning held at Institute of Education, London University, 14th to 18th of July.

Knibb, K. (1995). EdStats. Version 1.0.5. Available from Edith Cowan University.

Knibb, K. (1996). EdStats User's Guide. Mount Lawley: Mathematics, Science & Technology Education Centre, Edith Cowan University.

Kormendy, M. (1988). Western Australian Action Picture Test. Available from author at Edith Cowan University, Perth.

Ministerial Council on Education, Employment, Training and Youth Affairs. (1995). National Strategy for the Education of Aboriginal and Torres Strait Islander Peoples; 1996-2002. (P. Hughes, Chairperson). Canberra: Department of Employment, Education, Training and Youth Affairs.

Vardi, I. (1991). Phonological Profile for the Hearing Impaired Test. Perth: Iris Vardi.

Waddington, N. J. (2000). Diagnostic Reading and Spelling Tests 1 & 2 (Second Edition). Strathalbyn, SA: Waddington Educational Resources.

Authors: Dr John R. Godfrey is an Honorary Senior Fellow in Kurongkurl Katitjin, the School of Indigenous Australian Studies at Edith Cowan University, Perth. He has published in reading comprehension testing, academic dishonesty, assessment in religious schools, history of Australian assessment change, Indigenous education and teacher receptivity to change. Email: johngodfrey@westnet.com.au

Dr Ann Galloway was the Director of the Project Teaching Indigenous students with conductive hearing loss. Her research interests include language and literacy development and discourse analysis. She currently holds an Australian Research Council Post Doctoral Fellowship Industry in Kurongkurl Katitjin, the School of Indigenous Australian Studies, Edith Cowan University.

Please cite as: Godfrey, J. R. and Galloway, A. (2004). Assessing early literacy and numeracy skills among Indigenous children with the Performance indicators in primary schools test. Issues In Educational Research, 14(2), 144-155. http://www.iier.org.au/iier14/godfrey.html

