In recent years, interest has grown in using classroom observation as a means to several ends, including teacher development, teacher evaluation, and impact evaluation of classroom-based interventions. Although education practitioners and researchers have developed numerous observational instruments for these purposes, many developers fail to specify important criteria regarding instrument use. In this article, the authors argue that for classroom observation to succeed in its aims, improved observational systems must be developed. These systems should include not only observational instruments but also scoring designs capable of producing reliable and cost-efficient scores and processes for rater recruitment, training, and certification. To illustrate how such a system might be developed and improved, the authors provide an empirical example that applies generalizability theory to data from a mathematics observational instrument.