Administration is the common term for the way in which a test is conducted with students. The emphasis is on the action being done by the teacher to the students. However, the term fails to address what should be the essentially collaborative nature of testing, whereby teachers and students jointly set out to explore how much progress has been made and where the gaps are in the learning.

Criterion-referenced assessment

A criterion-referenced assessment is one that has been designed to measure a person’s skills, knowledge and understanding with reference to benchmarks of expected performance in relation to a specific competency or body of knowledge. It allows us to measure the level of an individual's learning.

Curriculum levels

There are eight curriculum levels as defined in the New Zealand Curriculum 2007. These broad levels typically relate to years at school. Many students, however, do not fit this pattern. They include those with special learning needs, those who are gifted and those who come from non-English-speaking backgrounds. The New Zealand Curriculum On-line has a diagram of the curriculum levels on page 45 of the document.


Dependability

A dependable assessment has both high validity and reliability. Sometimes there is a trade-off between validity and reliability. Formal assessments may be more reliable, while informal assessments may be more valid.

Diagnostic assessment (see Formative assessment)

Diagnostic assessment provides information for teachers on what or how students are achieving at a particular time. Diagnostic tools give detailed information about students' learning needs and prompt reflection on appropriate teaching strategies to meet them. Diagnostic assessment also informs future programme planning, and gives valuable information to teachers on how they may scaffold the learning to meet the individual learning needs of students.


Exemplars

Exemplars are samples of authentic student work, often annotated, that illustrate levels of achievement. They could be examples of written work, designed tasks, art works, or recordings of dance, drama or musical works.

Externally-referenced assessment

An externally-referenced assessment is one which is related to a wider context than the particular group being tested, and allows us to make comparisons with a general population or agreed benchmarks. An externally-referenced assessment may be either norm-referenced or criterion-referenced.


Feedback/feed-forward

Specific, constructive feedback about learning, as it is unfolding, is one of the most powerful influences on student achievement. Positive feedback celebrates success, and helps keep students motivated, whilst constructive feedback highlights important aspects to focus on. Feed-forward provides an outline of the next steps to be taken. Feedback/feed-forward includes all dialogue to support learning in both formal and informal situations.

Formative assessment

‘Formative assessment refers to all those assessment activities undertaken by teachers, and by the students themselves, which provide information to be used as feedback to modify the teaching and learning activities in which they are engaged. Such assessments become formative when the evidence is actually used to adapt the teaching to meet the needs of students.’ (Black and Wiliam, 1998). It is widely argued, on the basis of empirical evidence, that formative assessment has the greatest impact on learning and achievement.

Learning intention

Learning intentions describe the knowledge, skill, understanding(s) and/or attitudes/values that are needed to develop an aspect of the curriculum. They are usually negotiated with students and expressed in a lesson or series of lessons, as both global and specific ‘chunks’ of learning. Learning intentions should be expressed in language that students understand and should support them in understanding what they are supposed to be learning and why.

As a result of the learning process, intentions may well have to be renegotiated or transformed according to the achievement of students. A learning intention takes achievement of the original learning goal into account and aims to move students on towards the next part of the learning.


Moderation

Moderation is a process where teachers compare judgments to either confirm or adjust them. The process involves teacher collaboration to establish a shared understanding of what achievement of standards looks like and whether or not the student has demonstrated achievement of the standard. Teachers work towards making judgments that are consistent and comparable.

National Administration Guidelines (NAGs)

The National Administration Guidelines for school administration set out statements of desirable principles of conduct or administration for specified personnel or bodies. Recent amendments include the planning and reporting requirements, the footnote to 1(iii)c relating to gifted and talented learners (with effect from Term 1 2005), and clause 1(i)c regarding "regular quality physical activity" (with effect from Term 1 2006).

National Standards

The New Zealand Curriculum suggests a range of achievement for each year level and a rate of progress. National Standards set out what can reasonably be expected of most students by the end of the designated period or year.


Norms

Norms are statistical representations of a population. A norm-referenced score interpretation compares an individual's results on the test with the statistical representation of the population. In practice, rather than testing a population, a representative sample or group is tested. This provides a group norm or set of norms.

Norm-referenced assessment

A norm-referenced assessment is one that has been designed to determine the position of an individual relative to others in a population, with respect to the skills, knowledge and understanding being measured. When combined with a standardised score, it also allows us to track an individual's progress over time relative to a population.
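
As a rough illustration of locating an individual relative to a population, the sketch below computes a percentile rank against a norm sample. The function name `percentile_rank`, the sample scores and the convention used (the percentage of norm-sample scores strictly below the student's score) are assumptions made for this example; published tests define their own conversion tables.

```python
def percentile_rank(raw_score, norm_scores):
    """Percentage of norm-sample scores strictly below raw_score.

    This is one common convention, assumed here for illustration only.
    """
    if not norm_scores:
        raise ValueError("norm sample must not be empty")
    below = sum(1 for score in norm_scores if score < raw_score)
    return 100.0 * below / len(norm_scores)


# Hypothetical raw scores from a representative (norm) sample.
norm_sample = [12, 15, 18, 20, 22, 25, 27, 30, 33, 38]

# A student who scored 27 did better than 6 of the 10 students in the sample.
rank = percentile_rank(27, norm_sample)
print(rank)
```

A real norm sample would be far larger and chosen to represent the population of interest.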

Overall teacher judgment (OTJ)

An OTJ involves drawing on and applying the evidence gathered up to a particular point in time in order to make an overall judgment about a student’s progress and achievement.

Peer assessment

Peer assessment is the assessment by students of one another's work with reference to negotiated and specific criteria. This can occur using a range of strategies. The peer assessment process needs to be taught and students supported by opportunities to practise it regularly in a supportive and safe classroom environment.


Reliability

Reliability is the extent to which the results of the same assessment can be repeated across time and situations, expressed statistically. If an assessment comes up with very different results each time the student sits it, it lacks reliability.
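
One common way to express this statistically is a test-retest correlation: the same students sit the same assessment twice and the two sets of scores are correlated. The sketch below is a minimal illustration using the Pearson correlation coefficient; the function name `pearson_r` and the scores are hypothetical, and other reliability statistics exist.

```python
from math import sqrt


def pearson_r(first_sitting, second_sitting):
    """Pearson correlation between paired scores from two sittings."""
    if len(first_sitting) != len(second_sitting) or len(first_sitting) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(first_sitting) / len(first_sitting)
    mean_y = sum(second_sitting) / len(second_sitting)
    covariation = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(first_sitting, second_sitting)
    )
    spread_x = sum((x - mean_x) ** 2 for x in first_sitting)
    spread_y = sum((y - mean_y) ** 2 for y in second_sitting)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("scores in a sitting must not all be identical")
    return covariation / sqrt(spread_x * spread_y)


# Hypothetical scores for the same four students on two occasions.
first = [10, 20, 30, 40]
second = [12, 22, 32, 42]

# Every student keeps the same relative position, so r is 1 (up to rounding).
print(round(pearson_r(first, second), 3))
```

Values close to 1 suggest that students keep much the same relative position across sittings; values near 0 suggest the results are not repeatable.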


Rubric

A rubric communicates expectations of quality around a task. In many cases, rubrics are used to delineate consistent criteria for grading. Because the criteria are public, a rubric allows teachers and students alike to evaluate criteria, which can be complex and subjective. A rubric can also provide a basis for self-evaluation, reflection, and peer assessment. It is aimed at accurate and fair assessment, fostering understanding, and indicating a way to proceed with subsequent learning/teaching. This integration of performance and feedback is called ongoing assessment or formative assessment.


Self-assessment

Self-assessment is a process by which students engage in a systematic review of their progress and achievement, usually for the purpose of improvement. It may involve comparison with an exemplar, success criteria, or other criteria. It may also involve critiquing one's own work or a description of the achievement obtained.

Standardised assessment

In a standardised assessment, the content is set, the directions are prescribed and the scoring procedure is completely specified. There are norms against which we may compare the scores of the students being assessed. Standardised assessment tools enable the result for any student to be compared with the results for a norm sample of students.

Standardised scores

Standardised scores are derived from students’ results on a norm-referenced test in such a way that the underlying population has a predefined mean and standard deviation. They therefore allow us to interpret an individual’s results in a consistent way relative to the population.
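
The idea of a predefined mean and standard deviation can be sketched as a simple linear rescaling. The example below is illustrative only: the function name `standardised_score`, the target mean of 100 and standard deviation of 15, and the norm-sample scores are all assumptions, and a published test will specify its own scale and conversion method.

```python
from statistics import mean, pstdev


def standardised_score(raw_score, norm_scores, target_mean=100.0, target_sd=15.0):
    """Rescale raw_score so the norm sample has target_mean and target_sd."""
    norm_mean = mean(norm_scores)
    norm_sd = pstdev(norm_scores)  # standard deviation of the norm sample
    if norm_sd == 0:
        raise ValueError("norm sample has no spread")
    # Distance from the norm mean, measured in standard deviations.
    z = (raw_score - norm_mean) / norm_sd
    return target_mean + target_sd * z


# Hypothetical raw scores from a norm sample (mean 30).
norm_sample = [10, 20, 30, 40, 50]

# A raw score equal to the norm mean maps onto the target mean.
print(standardised_score(30, norm_sample))
# A raw score above the norm mean maps to a value above the target mean.
print(round(standardised_score(50, norm_sample), 1))
```

Because every raw score is placed on the same scale, a score of 115 on this assumed scale would always mean one standard deviation above the population mean, whichever test produced it.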

Standardised test

A standardised test is one designed so that its questions, conditions of administration, scoring procedures and interpretations are consistent, and it is administered and scored in a predetermined, standard manner.

Standards-referenced assessment

Standards-referenced assessment is a type of criterion-referenced assessment which makes direct and extensive use of teachers' qualitative judgments. It requires external, visible standards for the use of both teachers and students, ideally defined by exemplars and verbal descriptions. It allows us to make judgments about the level of an individual's learning with respect to shared benchmarks of expected performance.

Success criteria

Success criteria describe how students will go about achieving a learning intention or how they will know when they have learnt it. The purpose of creating success criteria is to ensure that students understand the teacher's criteria for making judgments about their work and gain an ‘anatomy of quality’ for that particular piece of work. If students have been involved in creating the success criteria, they are more likely to take ownership of their learning, be self-evaluative as they are working, and question the assessed work as it evolves. Measuring whether a single learning intention has been met may involve co-constructing several success criteria.

Summative assessment

This is an evaluation made by the teacher at the conclusion of a unit of work, instruction, or assessment activity to assess student skills, knowledge, and understandings at that particular point in time. However, these assessments can also be used formatively if they are used to promote future learning.


Validity

Validity is the most important single attribute of a good test. Nothing will be gained from assessment unless the assessment has some validity for your purpose.

There are several different types of validity:

  • Face validity - do the assessment items appear to be appropriate?
  • Content validity - does the assessment content cover what we want to assess?
  • Criterion-related validity - how well do results on the assessment agree with an external criterion, such as another established measure of the same learning?
  • Construct validity - are you measuring what you think you're measuring?