Debate on High-Stakes Testing

Opening Remarks - Annual Meeting of the American Association of Publishers - School Division
Austin, Texas, Jan. 23, 2001

by Richard P. Phelps

It is appropriate that this debate be held here. For at least fifteen years now, testing opponents have feared, hated, and focused on Texas.

In the 1980s, the citizens of Texas, discovering that some of their teachers were illiterate and noticing that there were no high-stakes requirements for new teachers, developed a basic literacy test, the TECAT, and required all teachers to pass it. By all accounts, the test was extremely easy; nonetheless, some testing researchers opposed it.

The federally-funded Center for Research on Evaluation, Standards, and Student Testing (CRESST) conducted a benefit-cost analysis of the test and decided the net benefits were negative, by about $70 million. Indeed, they were extremely critical of every aspect of the test.

Perhaps the most ironic of the authors' opinions coupled two conflicting assertions. On the one hand, the test was easy, simplistic, and beneath the dignity of professional educators, and so studying for the test should not be counted as a benefit. On the other hand, the teachers, their union, and the school districts were afraid that many would fail the test, so a massive effort was undertaken to prepare the teachers for it, and that effort should be counted as a cost.

The CRESST report, however, was riddled with mistakes. They counted as pure costs the expenses for activities that either should not have been counted or that had countervailing benefits. They counted only one among several benefits, and even with it they made arbitrary exclusions of personnel from their counts and cut off the stream of benefits after only one year, in order to get their net benefits number to dip into the red.

In a later, unrelated study, by contrast, Ron Ferguson found a teacher's TECAT score to be the strongest factor in predicting increases in minority student achievement in Texas. In another study, Solmon and Fagnano estimated that net benefits of the TECAT must have exceeded $1 billion. Among other factors, they estimated the value over students' lifetimes of the increased learning gained from having literate teachers.

The CRESST researchers had not considered these benefits because they claimed the TECAT could not possibly have had any instructional value. They simply defined that value out of existence.

A few years ago, another study critical of Texas testing appeared, employing a different technique that I call, for lack of a better phrase, semantic distortion. Studies of this kind describe educational practices they like with attractive language, and those they dislike with unappealing language, often without describing what the practices entail in practical terms.

Here is how this particular testing opponent describes Texas classrooms before the introduction of the Texas Assessment of Academic Skills (TAAS):

"creative, productive learning environments" "[a] culture of both equity and authentic academics" "rich curricula" "creative writing, literature, and science labs" "learning a variety of forms of writing, studying mathematics aimed at problem-solving and conceptual understanding," "teaching complex biology topics" and "the joy and magic of reading" "a substantive curriculum" "authentic teaching" "changing the futures for urban children"

...and after the TAAS was introduced?

"drudgery" "going over dreary one-paragraph passages" "repetitive drills, worksheets, and practice tests" "[teachers use] aggressive test drilling" and "test coaching" "...fosters an artificial curriculum..." "...not a curriculum that will educate these children for productive futures..." "[TAAS preparation] takes time from real teaching" "artificially shift teaching away from a focus on children's learning in favor of a format that held the teacher at the center" "excessive concern with discipline, mass processing of students for credentialing, overemphasis on efficiencies to the detriment of quality instruction" "[it is] de-skilling teachers' work, trivializing and reducing the quality of the content of the curriculum"

Moreover, the TAAS is described as too difficult:

"drab factories for test preparation," entire instructional budgets spent on "commercial test preparation materials," schools "handed over to 'test prep' from New Year's through April," "TAAS [test] camps," "[Friday night] lock-ins...where students do TAAS 'drills' until sunup," prizes given to students who do well, and "students cannot graduate if they fail the exams."

But, the TAAS is also too easy:

"could be passed by many fifth-graders," "'low expectations' are 'cause for concern,'" "...aimed at the lowest level of skills and information..." "Artificially simplified curricula that had been designed by bureaucrats seeking expedient curricular format"

Still another recent critical study of Texas testing relies more on the surreptitious alteration of definitions of terms. This researcher declared that Texas' TAAS scores and NAEP scores are artificially high because Texas has more dropouts and excludes more limited-ability students from these tests. Moreover, he argued that Texas' test scores are increasing only because these rates of exclusion are increasing.

But, he came to his dropout conclusion by making up his own definition of what a dropout is without explaining that he'd done so to the journalists to whom he talked. He came to his conclusions about exclusions by misreading one table in a NAEP report and ignoring all the others. All of the many journalists who covered his research, however, accepted his conclusions at face value.

In truth, Texas has had a lower dropout rate than other demographically similar states, and that rate is not increasing. In truth, Texas has been below the national average in its rate of exclusions from the NAEP, despite having the second largest limited-English-proficient (LEP) student population in the country. Indeed, at one of the NAEP grade levels, Texas had the lowest rate of exclusion in the country among states with large populations of LEP students. At the other grade level, it was fourth. Moreover, Texas' rate of exclusions increased only at the same rate as the average of all the states.

Imagine how you would feel if you were an honest, hard-working employee in the Texas Education Agency and you had managed to do such an outstanding job, and the newspapers in the other forty-nine states all declared that you do incompetent work and publish dishonest reports.

In other professions, in the hard sciences for example, such biased research is shunned from the field. In much of the education research world, unfortunately, it is often considered heroic.

The lead researcher on one of the aforementioned reports was later elected president of the American Educational Research Association (AERA). Another appeared on 60 Minutes with Lesley Stahl as the single expert on Texas testing most important to interview. The third researcher has been cited by hundreds of journalists and researchers as having exposed the "Texas miracle" as a "mirage."

How does such poorly-done research get accepted as truth? It's simple. These researchers are often the only ones allowed to talk. As many in this room probably know, you can do excellent research and write an excellent research article, but if the conclusion implies that standardized tests might not be evil incarnate or might actually have some benefits, your article simply will not be accepted at some of the "mainstream" education journals.

If you visit the websites of the AASA, ASCD, NEA, or other, similar organizations of education professionals or of their many state and local affiliates, you will be overwhelmed by exposure to what is called testing research, but it will be uniformly from one side of the argument.

For their part, journalists, who can be so suspicious of, and cynical about, any and all behavior of our elected representatives, seem overwhelmingly to accept what these researchers tell them at face value. They seldom seem to entertain the possibility that these researchers might have a hidden agenda, an ulterior motive, or a selfish interest. They are pure of heart and only interested in what's best for the children.

And only rarely do journalists subject the anti-tester claims to common sense. In the handout, I provide you the single example I was able to find from the year 2000, written by Steve Blow, where a journalist actually looked at a test in an effort to verify a false accusation made about it.

During the long election campaign season of the year 2000, I downloaded several hundred articles and transcripts on testing from the Web to see whom journalists relied on for expert commentary. I did not see a single instance, out of several dozen, where an expert with a favorable point of view toward high-stakes testing was interviewed. Meanwhile, in each of those dozens of articles or TV shows that featured expert interviews, the expert or experts interviewed were well-known opponents of testing.

For Texas in particular, a lawsuit was filed many months ago against the TAAS, alleging it to be discriminatory. Expert witnesses spoke for both sides and the suit eventually failed. You can find many articles and transcripts from the following months featuring interviews with prosecution expert witnesses, those opposed to the Texas test. You will not find any that interviewed a defense expert witness, even though the defense experts are among the most highly regarded psychometricians in the field.
