The Reliability of Test Scores
Indicators of quality
- Validity
- Reliability
- Utility
- Fairness
Question: how are these four qualities interrelated?
Validity
- Depends on the PURPOSE
- E.g. a ruler may be a valid device for measuring length, but it is not a valid device for measuring volume
- The degree to which a test measures what it is supposed to measure
- A matter of degree (how valid is it?)
- Specific to a particular purpose!
- Must be inferred from evidence; it cannot be measured directly (see the sketch below)
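One common kind of validity evidence is a criterion-related validity coefficient: the correlation between test scores and an external criterion the test is supposed to predict. The sketch below is a minimal illustration with made-up scores and a hypothetical criterion (supervisor ratings); it is one piece of evidence, not a complete validation study.

```python
# Minimal sketch: criterion-related validity evidence as a correlation
# between test scores and an external criterion. All data are hypothetical.
from statistics import correlation  # available in Python 3.10+

test_scores = [55, 62, 70, 48, 81, 66, 73, 59]                  # hypothetical test results
supervisor_ratings = [3.1, 3.4, 4.0, 2.8, 4.6, 3.6, 4.1, 3.2]   # hypothetical criterion

# A validity coefficient is evidence about validity for one purpose,
# not a direct measurement of "the" validity of the test.
validity_coefficient = correlation(test_scores, supervisor_ratings)
print(f"Criterion-related validity coefficient: {validity_coefficient:.2f}")
```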
Reliability
- Consistency of the results a test yields
- Across time and place
- Across participants
- Results need not be identical, only very close to one another (see the sketch after this list)
- When someone says you are a ‘reliable’ person, what do they really mean?
- Are you a reliable person? ☺
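Unlike validity, reliability can be estimated directly from score data. A minimal sketch, assuming a test-retest design (the same test given twice to the same students, with hypothetical scores): the correlation between the two administrations serves as the reliability estimate. Split-half and internal-consistency estimates (e.g. Cronbach's alpha) are common alternatives.

```python
# Minimal sketch: test-retest reliability as the correlation between two
# administrations of the same test to the same students. Data are hypothetical.
from statistics import correlation  # available in Python 3.10+

first_administration = [72, 65, 88, 54, 91, 77, 60, 83]
second_administration = [70, 68, 85, 57, 93, 74, 63, 80]

# Scores are not identical across occasions, only "very close" --
# that is exactly what a high reliability coefficient captures.
reliability_estimate = correlation(first_administration, second_administration)
print(f"Test-retest reliability estimate: {reliability_estimate:.2f}")
```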
What do you think…?
- Forced-choice assessment forms are high in reliability, but weak in validity (true/false)
- Performance-based assessment forms are high in both validity and reliability (true/false)
- A test item is said to be unreliable when most students answer the item incorrectly (true/false)
- When a test contains items that do not represent the content covered during instruction, it is known as an unreliable test (true/false)
- Test items that do not successfully measure the intended learning outcomes (objectives) are invalid items (true/false)
- Assessments that do not represent student learning well enough are definitely invalid and unreliable (true/false)
- A valid test can sometimes be unreliable (true/false)
- If a test is valid, it is reliable! (reliability comes as a by-product; see the note below)
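A standard result from classical test theory makes the last two statements concrete: a test's validity coefficient against any criterion cannot exceed the square root of its reliability, $r_{XY} \le \sqrt{r_{XX'}}$, where $r_{XY}$ is the test-criterion correlation and $r_{XX'}$ is the test's reliability coefficient. So an unreliable test cannot be highly valid (a reliability of 0.49 caps any validity coefficient at $\sqrt{0.49} = 0.70$), while a highly reliable test can still be invalid for a given purpose.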