Tinkering Toward Utopia

In “Shooting Bottle Rockets at the Moon: Overcoming the Legacy of Incremental Education Reform,” Thomas Kane says:

…we must be able to make a plausible argument that a given set of reforms will produce improvements of the desired magnitude. There is no reason to expect that non-controversial, incremental policies such as more professional development, incrementally smaller class sizes, and better facilities will produce substantial change. The current backlash against the Common Core and new teacher evaluation systems is, at least in part, a result of our long history of underpowered, incremental reforms. By failing to worry about magnitudes, we have led politicians and voters to expect school reform without controversy. We cannot return to shooting bottle rockets when Saturn V’s are required. We need to recognize the magnitude of the changes required to achieve our goals.

I wonder what David Tyack and Larry Cuban would say.

Read the entire analysis here.

Test scores shouldn’t be used in teacher evaluation—now supported by research

In a large-scale analysis of new teacher evaluation systems that use student test scores as one element, Morgan Polikoff (University of Southern California) and Andrew Porter (University of Pennsylvania) found little or no correlation between the quality of teaching and teachers’ value-added ratings.

Under Race to the Top, the number of states using teacher evaluation systems based in part on student test scores has increased dramatically over the past five years. Many states are using those systems to make high-stakes decisions regarding hiring, firing, and compensation.

According to Polikoff and Porter:

Low correlations raise questions about the validity of high-stakes (e.g., performance evaluation) or low-stakes (e.g., instructional improvement) inferences made on the basis of value-added assessment data … the results suggest challenges to the effective use of VAM data. At a minimum, these results suggest it may be fruitless for teachers to use state test VAMs to inform adjustments to their instruction. Furthermore, this interpretation raises the question—If VAMs are not meaningfully associated with either the content or quality of instruction, what are they measuring?

Before moving forward with new high-stakes teacher evaluation policies based on multiple-measures teacher evaluation systems, it is essential that the research community develops a better understanding of how state tests reflect differences in instructional content and quality.

…this study contributes to a growing literature suggesting state tests may not be up to the task of differentiating effective from ineffective (or aligned from misaligned) teaching.

At the very least, these findings indicate a need to slow these implementations down. At most, they confirm what we’ve known all along: student test scores cannot be meaningfully used to evaluate teachers. Read the entire report here.