For a public service, too much standardized testing operates behind a veil of secrecy.
We have a more polite term for it — “test security” — but what it amounts to is secrecy.
When I was teaching, sealed MCA testing packets would arrive in boxes at our school. At the beginning of each test, I would have students unseal the packets, take the test, and re-seal the packets before collecting them into a box. Then they went back to the testing company (American Institutes for Research, a nonprofit) for scoring and analysis. Even though my school was being held “accountable” for how students did on these tests, the only people who ever saw the tests in their final form were the testing company and the students themselves.
Couldn’t I have snuck a peek at the questions? Sure, if I wanted to violate testing procedures. Check out the 2012-13 procedures manual [PDF] for MCA testing. Those administering tests are specifically told not to “View test items for any reason except as allowed in the administration of an assessment,” nor are they supposed to “‘Look over the shoulder’ to read test items when monitoring students taking a test.” Also on the “naughty” list: “Offer an opinion to a student, class or other staff member that a question is ‘bad’ or doesn’t have a correct answer.” (Apparently singing the praises of a question is totally fine.)
In other words, too much testing operates in a black box, with limited opportunities for oversight. Even when some efforts at oversight exist, they often don’t prove sufficient.
As the Atlanta Journal-Constitution has found, several states have administered and continue to administer questions of dubious quality. These include, for example, questions without right answers and questions that tend to be answered incorrectly by otherwise high-scoring students while being answered correctly by otherwise low-scoring students. While Minnesota does operate Item Review panels to advise AIR on question quality, we have to take it on faith that AIR is responding appropriately to our concerns and screening out questions with weird statistics.
Concerns about bias
We also have reason to be concerned about cultural and class bias on our tests. I’m not talking here of active racism, but rather of the passive bias that tends to come with privilege. Minnesota does operate Bias Review advisory panels (which you can register for here), but again we have to take it on trust that AIR is responding appropriately to those concerns.
Minnesota appears to be in better shape than many other states, though we’re still far from ideal. We’re also fortunate to have a progressive administration that at least brings some reasonable skepticism about the appropriate scope and purpose of testing. The demise of the GRAD tests for graduation in the last legislative session was a good start; I’m hoping to see further critical evaluations of our testing mindset and approach.
It’s important to remember, though, that no matter how lousy tests can be proven to be, and no matter how bad they are for kids, there will always be some with an incentive to argue for more testing. I’ll discuss two groups: testing-as-ends and testing-as-means.
Testing as ends, testing as means
The testing-as-ends interests are the ones who directly benefit from a large amount of testing. Primarily, I’m referring to the testing companies who collect big checks from the state for developing and scoring the tests, and who often collect plenty of not-so-small checks from districts for “aligned” curricula (i.e. test prep curricula). While some may show restraint, all of them have an incentive to advocate for more testing.
The testing-as-means crowd are the ones who think that a large number of tests, administered regularly and changed every few years to keep everyone off their game, are a great way to slam the school system. Never mind that 70 percent of variation in test scores is driven by out-of-school factors. What matters to these folks is that there always be a reason to slam the people working in the public schools.
The best way to counter these forces is by having a clear, limited definition of what we want testing to accomplish. There are valid reasons to support a baseline of standardized assessment. Disaggregation of test scores by student group helps us identify areas where we as a society (including but not limited to our school system) have equity gaps.
Red flag or trigger for punishments?
Used as a red flag instead of a trigger for punishments, a small amount of testing that identifies schools and districts in need of more support can be a useful tool for promoting equity. We need to be vigilant that this testing be used for building awareness and support. If it is linked to dire consequences, it will lead to a narrowing of curriculum, developmentally inappropriate instruction, and increased pressure to cheat.
We also need to increase public scrutiny of testing companies at a national level. I believe most of the people working for them do so for the right reasons, but we need to demand more sunlight on the final products and how they’re being used.
Michael Diedrich, who taught English for two years as a Teach for America corps member, is a master of public policy student at the University of Minnesota’s Humphrey School of Public Affairs, where he is pursuing a concentration in education policy. He is also an Education Fellow at Minnesota 2020, on whose website this article originally appeared.