Glossary
The consistency of scores when different raters or observers evaluate the same responses, behaviors, or material.
Inter-rater reliability is the consistency of scores or judgments when different raters evaluate the same material. It matters most for assessments that require human scoring, such as clinical interviews, behavioral observations, projective tests, and coding of open-ended responses.
Common statistics for inter-rater reliability include Cohen's kappa (for categorical judgments by two raters, corrected for the agreement expected by chance), intraclass correlation coefficients (for continuous ratings), and percent agreement (the simplest measure, with no correction for chance). Self-report assessments don't have inter-rater reliability in the usual sense, because there's only one rater: the respondent.
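For the two-rater, categorical case, percent agreement and Cohen's kappa can be computed by hand. The sketch below is a minimal illustration, not a reference implementation: the function names and the ratings are hypothetical, and it assumes both raters scored the same items in the same order. Kappa is computed as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from each rater's category frequencies.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of items on which the two raters gave the same category."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of items.")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical labels."""
    n = len(rater_a)
    p_o = percent_agreement(rater_a, rater_b)  # observed agreement

    # Chance agreement: for each category, multiply the proportion of items
    # each rater assigned to it, then sum over categories.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    p_e = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if p_e == 1.0:
        # Both raters used one and the same category for every item;
        # kappa is undefined in that case.
        return float("nan")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings of ten responses by two raters.
rater_a = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "yes", "yes"]
rater_b = ["yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "yes"]

print(round(percent_agreement(rater_a, rater_b), 3))  # 0.8
print(round(cohens_kappa(rater_a, rater_b), 3))       # 0.583
```

In this made-up example the raters agree on 8 of 10 items, but because each rater says "yes" 60% of the time, 52% agreement would be expected by chance alone, so kappa comes out noticeably lower than raw percent agreement.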