MinnPost's education reporting is made possible by a grant from the Bush Foundation.

Do 'value-added' teacher data really add value?

Fifty-six years ago, a journalist by the name of Darrell Huff wrote a book entitled “How to Lie With Statistics.” Intended for the lay reader, it contained funny illustrations and witty, cautionary passages explaining things like why correlation does not imply causation. Happily, it’s still in circulation.

I read it during my J-school days and my takeaway then was the same as it is now: In most instances, statistics are as fungible as any commodity gets.

I raise this because there are three interesting controversies ripping through national education policy circles this week that Minnesotans might want to attend to. All three concern the use of standardized tests to assess teacher effectiveness, a topic that’s on the agendas of Gov. Mark Dayton and both caucuses in the current state Legislature.

It’s not unreasonable to ask, as we take up yet another incendiary teacher accountability discussion, whether our statistics might fib a little — or at least merit serious challenge.

Blogger challenges Rhee's claims
It seems a retired Washington, D.C., math teacher with a blog has done what the nation’s education writers could not do: He challenged former D.C. Schools Chancellor Michelle Rhee’s claims about her classroom achievements. Apparently, she claimed on her résumé that she was able to raise proficiency rates for 90 percent of her students in both reading and math to the 90th percentile or above. Supposedly, they started at the 13th percentile.

Rhee did boost test scores, but into the 50th percentile, data gathered by blogger G.F. Brandenburg purports to show (scroll down to Jan. 31). Why do I say purports? Because the underlying datasets would tell me, personally, less than a cup of dried tea leaves. More important, by my lights, Brandenburg’s numbers add up for Washington Post “Class Struggle” columnist Jay Mathews, a respected veteran who takes a frequent drubbing on the blog. 

To be clear, Mathews disagrees with Brandenburg’s assertion that Rhee lied. He suggests there is a more nuanced story to be told about Rhee and her principal and conclusions they drew from data available at the time. I think Mathews would know, but I still think Rhee — a reform rock star with a national platform — has some ‘splainin’ to do.

Her answers matter for several reasons. Rhee was a Teach for America teacher at the time she supposedly turned in her stellar performance. Similar claims of exceptional performance by TFA recruits and other grads of alternative teacher preparation conduits, including Rhee’s own New Teacher Project, are at the heart of the calls for the creation of alternative teacher licensure provisions here and elsewhere.

It matters because it suggests Rhee, who gained both a national following and notoriety for her assault on D.C.’s lowest-performing teachers, might not have survived her own tenure as chancellor. And of course it matters very much to teachers everywhere whose performance may now be tied to that of their students.

An LA dust-up
Tidy segue to raging controversy No. 2: In August, the Los Angeles Times published a story based on a survey by a RAND Corp. researcher on measuring teacher effectiveness, along with a database showing how each and every Los Angeles Unified School District teacher fared using the rubric. Within hours of its posting online, the database had garnered a quarter of a million hits. A month later, the paper came under fire from the teachers’ union, which asserted that its publication contributed to the suicide of a teacher who was rated “less effective.”

In the wake of the controversy, teachers unions in New York and elsewhere have petitioned to keep teacher evaluation data private. The counterargument: Parents and taxpayers have a right to know how effective teachers are.

Seems a couple of researchers with the Research and Evaluation Methodology Program at the University of Colorado at Boulder have parsed the LA Times’ data and concluded that, as a measurement of teacher effectiveness, it is “deeply flawed.” The researchers were unable to replicate RAND’s findings. Indeed, when they reran the same data using their own methodology, about half of the teachers’ ratings changed.

You can read the report. My takeaway: The paper was most accurate in assessing the highest and lowest performers; within the ranks of the “average” teachers, things get distinctly muddy — with potentially grave consequences. 

Small wonder that colleges of education, next on the “value-added” assessment firing line, are freaked out that the National Council on Teacher Quality and U.S. News & World Report are changing the way they rate teacher-preparation programs. Many of the changes are reportedly attempts to address colleges’ concerns that the ratings process is not transparent or accurate, but the council does plan “to supplement the content-based analysis at the heart of its methodology with information on candidate classroom performance culled from ‘value added’ data,” according to Education Week.

The stakes are high, indeed: One of the reforms pushed by the Obama administration’s Race to the Top education funding competition was the ability to measure the effectiveness of teacher-prep programs by tying alumni performance in the classroom to student achievement.

Jim Angermeyr is the director of Research, Evaluation & Testing for Bloomington Public Schools and one of the designers of a widely respected value-added test lots of Minnesota schoolchildren take two or three times a year. He’s also something of a standardized testing skeptic.

'The inferences we're drawing can be wrong'
His view of the controversies, in my vernacular: We’re looking at a bunch of blind men fondling an elephant. Economists, in his opinion, tend to be very supportive of the use of value-added data in evaluation. Educators and psychometricians, not so much.

“It’s not necessarily that the methodologies are wrong,” he said. “It’s that the inferences we’re drawing can be wrong.”

The kids are the greatest of the variables, of course. The tests may tell you a student is reading better or sliding in math, but they don’t tell you whether she spent the summer with a tutor or he is so young the test isn’t as accurate as it would be in an older child. 

Nor is the same test used from year to year. A particular student or teacher may fare better on a test closely normed with curriculum vs. one aligned with a set of knowledge-based standards.

“You leave out a lot of the potential variables,” Angermeyr said. “They’re just not at the point where we should use them to make decisions about jobs.”

Meanwhile, there’s a great deal of evidence that good old-fashioned classroom observation by peers and skilled principals is a terrific way to gauge teacher effectiveness. Ask any parent: Even before a child heads into kindergarten it’s usually possible to figure out which teachers are coveted, which seem just fine and which ones need coaching or a new line of work.

It’s enough to make one wonder whether we should slow down and make sure value-added data really are adding value.

Comments (1)

Forgive my impertinence, but I'm a dozen or more years retired from a 30-year career as a public high school teacher, and I was a good one, if students, parents and administrators are to be believed.

Teaching, especially good teaching, is both a calling and an art. There's no formula that the state, or a school district, can follow that will guarantee, or even make very likely, that a given teaching candidate will turn out to be the sort of teacher that motivates some parents to pull strings to get their children into that particular class, or that kids will show up in class for, even when they've skipped the rest of their classes that day.

Teachers and students have agendas that often overlap, but they are not identical, and the agenda of some kids is antithetical to what the teacher is trying to accomplish. Rating teachers, paying teachers, based on student performance on standardized tests is stupid.

At best, you find out how well that kid was able to take a standardized test on that particular day. It tells you nothing about that student's intellectual growth, or the lack thereof. Reliance on standardized tests wrecks both teacher performance and the district curricula. Teaching to the test becomes the norm.

The political equivalent is to grade the performance of former Governor Pawlenty, who had a DFL-controlled legislature throughout his tenure, and Governor Dayton – at least in his current term – who has a GOP-controlled legislature to deal with, in terms of how well each man accomplished the state’s presumed goals. Both Governor and legislators talk about having the best interests of the state and its citizens at heart, but what they mean by "best interests" can vary widely from being in complete agreement to being at opposite ends of the policy spectrum.

What is generally ignored in the ongoing debate over increasingly lousy educational outcomes, not to mention the appalling gap in achievement between and among ethnic and socioeconomic groups, is the role of the student and the culture – both of which are MUCH more difficult to address politically than a group of employees called teachers.

The focus is all on what teachers are doing wrong when it should perhaps be on the impact of the family environment, the student's own values, and the broader culture on that student's efforts. The most that a teacher – good, bad or indifferent – can do is to OFFER an education. It's entirely up to the student whether or not that offer is accepted. So far, in all the reading I've done on educational issues in the past few years, there's remarkably little that addresses the student's role in what ought to be a shared endeavor. Yes, a good teacher can work minor miracles, and I worked a few myself over the years, but it’s beyond unrealistic to expect any teacher to serve up those miracles for every single child, especially when many of those children and their families have agendas that don’t come close to matching those of the teacher and the school district.

“Bad,” as in ineffective, teachers shouldn’t be hired in the first place. Most school districts have at least 3 years of probationary status in which to evaluate the actual performance, as well as the potential, of a new teacher. That’s plenty of time to determine if someone has potential, or is worth making the necessary investment in training so that they’re better than adequate. If they screw that up, why aren’t we taking administrators to the woodshed for their lousy evaluation skills? If a tenured teacher loses interest in the job, most of the hysteria would have you believe that that person can’t be fired. Horsefeathers. All tenure does is guarantee due process, and there are lots of ways to document incompetence – and nothing wrong with insisting that you document it, rather than simply fall in line with the educational prejudices of a particular evaluator or principal.

In short, much of the current angry debate over educational outcomes addresses the easiest, most expedient whipping boy/girl, rather than making a serious attempt to discover why it is that, in recent years, kids are increasingly not learning what all of us want them to learn.