No. Author Co-author(s) Quote Quote type Title Source Link1 Link2
1 Daniel M. Koretz Matt Barnum [journalist] Journalist: I take it it’s very hard to quantify this test prep phenomenon, though? Koretz: It is extremely hard, and there’s a big hole in the research in this area. Dismissive Why one Harvard professor calls American schools’ focus on testing a ‘charade’ Chalkbeat, January 19, 2018 https://www.chalkbeat.org/posts/us/2018/01/19/why-one-harvard-professor-calls-american-schools-focus-on-testing-a-charade/  
2 Daniel M. Koretz   "However, this reasoning isn't just simple, it's simplistic--and the evidence is overwhelming that this approach [that testing can improve education] has failed. … these improvements are few and small. Hard evidence is limited, a consequence of our failure as a nation to evaluate these programs appropriately before imposing them on all children." Dismissive The Testing Charade: Pretending to Make Schools Better [Kindle location 142] University of Chicago Press, 2017    
3 Daniel M. Koretz   "The bottom line: the information yielded by tests, while very useful, is never by itself adequate for evaluating programs, schools, or educators. Self-evident as this should be, it has been widely ignored in recent years. Indeed, ignoring this obvious warning has been the bedrock of test-based education reform." Denigrating The Testing Charade: Pretending to Make Schools Better [Kindle location 301] University of Chicago Press, 2017    
4 Daniel M. Koretz   "…as of the late 1980s there was not a single study evaluating whether inflation occurred or how severe it was. With three colleagues, I set out to conduct one." 1stness The Testing Charade: Pretending to Make Schools Better [Kindle location 841] University of Chicago Press, 2017    
5 Daniel M. Koretz   "However, value-added estimates are rarely calculated with lower-stakes tests that are less likely to be inflated." Dismissive The Testing Charade: Pretending to Make Schools Better [Kindle location 1003] University of Chicago Press, 2017    
6 Daniel M. Koretz   "One reason we know less than we should … is that most of the abundant test score data available to us are too vulnerable to score inflation to be trusted. There is a second reason for the dearth of information, the blame for which lies squarely on the shoulders of many of the reformers." Dismissive The Testing Charade: Pretending to Make Schools Better [Kindle location 2717] University of Chicago Press, 2017    
7 Daniel M. Koretz   "High-quality evaluations of the test-based reforms aren't common, …" Denigrating The Testing Charade: Pretending to Make Schools Better [Kindle location 2757] University of Chicago Press, 2017    
8 Daniel M. Koretz   "The first solid study documenting score inflation was presented twenty-five years before I started writing this book." 1stness The Testing Charade: Pretending to Make Schools Better [Kindle location 3772] University of Chicago Press, 2017    
9 Daniel M. Koretz   "The first study showing illusory improvement in achievement gaps--the largely bogus "Texas miracle"--was published only ten years after that." 1stness The Testing Charade: Pretending to Make Schools Better [Kindle location 3772] University of Chicago Press, 2017
10 Daniel M. Koretz Holcombe, Jennings “To date, few studies have attempted to understand the sources of variation in score inflation across testing programs.” p. 3 Dismissive The roots of score inflation: an examination of opportunities in two states’ tests  Prepublication draft, to appear in Sunderman (Ed.), Charting Reform: Achieving Equity in a Diverse Nation http://dash.harvard.edu/bitstream/handle/1/10880587/roots%20of%20score%20inflation.pdf?sequence=1
11 Jennifer L. Jennings* Douglas Lee Lauen "Despite the ongoing public debate about the meaning of state test score gains, no study has examined the impact of accountability pressure from NCLB on multiple tests taken by the same students." 1stness Accountability, Inequality, and Achievement: The Effects of the No Child Left Behind Act on Multiple Measures of Student Learning, p.222 The Russell Sage Foundation Journal of the Social Sciences, Volume 2, Number 5, September 2016, pp. 220-241 https://muse.jhu.edu/article/633744  
12 Jennifer L. Jennings* Douglas Lee Lauen "Still, little is known about the effects of accountability pressure across demographic groups on multiple measures of student learning; addressing this gap is one goal of our study." Dismissive Accountability, Inequality, and Achievement: The Effects of the No Child Left Behind Act on Multiple Measures of Student Learning, p. 223 The Russell Sage Foundation Journal of the Social Sciences, Volume 2, Number 5, September 2016, pp. 220-241 https://muse.jhu.edu/article/633744  
13 Jennifer L. Jennings* Douglas Lee Lauen "In sum, all of the studies described here establish positive average effects of NCLB beyond state tests but do not assess the generalizability of state test gains to other measures of achievement. Our study…" 1stness Accountability, Inequality, and Achievement: The Effects of the No Child Left Behind Act on Multiple Measures of Student Learning, p. 223 The Russell Sage Foundation Journal of the Social Sciences, Volume 2, Number 5, September 2016, pp. 220-241 https://muse.jhu.edu/article/633744  
14 Jennifer L. Jennings* Douglas Lee Lauen "Our study contributes to a small but growing literature examining the relationship between school-based responses to accountability pressure and student performance on multiple measures of learning, which requires student-level data and test scores from multiple exams." Dismissive Accountability, Inequality, and Achievement: The Effects of the No Child Left Behind Act on Multiple Measures of Student Learning, p. 223 The Russell Sage Foundation Journal of the Social Sciences, Volume 2, Number 5, September 2016, pp. 220-241 https://muse.jhu.edu/article/633744  
15 Jennifer L. Jennings* Douglas Lee Lauen "Only one study has examined the effect of accountability pressure on multiple tests, but this study is from the pre-NCLB era. Jacob (2005) used item-level data to better understand the mechanisms underlying differential gains across tests." Dismissive Accountability, Inequality, and Achievement: The Effects of the No Child Left Behind Act on Multiple Measures of Student Learning, p. 223 The Russell Sage Foundation Journal of the Social Sciences, Volume 2, Number 5, September 2016, pp. 220-241 https://muse.jhu.edu/article/633744  
16 Jennifer L. Jennings* Douglas Lee Lauen "While the studies reviewed here have established the effects of accountability systems on outcomes, they have devoted less attention to studying heterogeneity in how educators perceive external pressures and react to them. Because the lever for change in accountability systems is educational improvement in response to external pressure, this is an important oversight." Denigrating Accountability, Inequality, and Achievement: The Effects of the No Child Left Behind Act on Multiple Measures of Student Learning, p. 224 The Russell Sage Foundation Journal of the Social Sciences, Volume 2, Number 5, September 2016, pp. 220-241 https://muse.jhu.edu/article/633744  
17 Jennifer L. Jennings* Douglas Lee Lauen "A unique feature of this study is the availability of multiple test scores for each student— both the Texas Assessment of Knowledge and Skills (TAKS) and the Stanford Achievement Test battery." 1stness Accountability, Inequality, and Achievement: The Effects of the No Child Left Behind Act on Multiple Measures of Student Learning, p. 225 The Russell Sage Foundation Journal of the Social Sciences, Volume 2, Number 5, September 2016, pp. 220-241 https://muse.jhu.edu/article/633744  
18 Daniel M. Koretz Waldman, Yu, Langli, Orzech “Few studies have applied a multi-level framework to the evaluation of inflation,” p. 1 Denigrating Using the introduction of a new test to investigate the distribution of score inflation  Working paper of the Education Accountability Project at the Harvard Graduate School of Education, Nov. 2014 http://projects.iq.harvard.edu/files/eap/files/ky_cot_3_2_15_working_paper.pdf
19 Daniel M. Koretz   "What we don’t know: What is the net effect on student achievement? -Weak research designs, weaker data -Some evidence of inconsistent, modest effects in elementary math, none in reading -Effects are likely to vary across contexts... Reason: grossly inadequate research and evaluation" Denigrating Using tests for monitoring and accountability Presentation at: Agencia de Calidad de la Educación, Santiago, Chile, November 3, 2014
20 Daniel M. Koretz Jennifer L. Jennings “We find that research on the use of test score data is limited, and research investigating the understanding of tests and score data is meager.” p. 1 Dismissive The Misunderstanding and Use of Data from Educational Tests  Prepared for Spencer Foundation meetings, Chicago, IL, February 11, 2010. Revised November 21, 2010 http://www.spencer.org/data-use-and-educational-improvement-initiative-activities/ http://www.spencer.org/resources/content/3/3/8/documents/Koretz--Jennings-paper.pdf
21 Daniel M. Koretz Jennifer L. Jennings “Because of the sparse research literature, we rely on experience and anecdote in parts of this paper, with the premise that these conclusions should be supplanted over time by findings from systematic research.” p. 1 Dismissive The Misunderstanding and Use of Data from Educational Tests  Prepared for Spencer Foundation meetings, Chicago, IL, February 11, 2010. Revised November 21, 2010 http://www.spencer.org/data-use-and-educational-improvement-initiative-activities http://www.spencer.org/sites/default/files/pdfs/Koretz-Jennings-paper.pdf
22 Daniel M. Koretz Jennifer L. Jennings "...the relative performance of schools is difficult to interpret in the presence of score inflation. At this point, we know very little about the factors that may predict higher levels of inflation —for example, characteristics of tests, accountability systems, students, or schools." p.4 Dismissive The Misunderstanding and Use of Data from Educational Tests  Prepared for Spencer Foundation meetings, Chicago, IL, February 11, 2010. Revised November 21, 2010 http://www.spencer.org/data-use-and-educational-improvement-initiative-activities http://www.spencer.org/sites/default/files/pdfs/Koretz-Jennings-paper.pdf
23 Daniel M. Koretz Jennifer L. Jennings “We focus on three issues that are especially relevant to test-based data and about which research is currently sparse: How do the types of data made available for use affect policymakers’ and educators’ understanding of data? What are the common errors made by policymakers and educators in interpreting test score data? How do high-stakes testing and the availability of test-based data affect administrator and teacher practice?” (p. 5) Dismissive The Misunderstanding and Use of Data from Educational Tests  Prepared for Spencer Foundation meetings, Chicago, IL, February 11, 2010. Revised November 21, 2010 http://www.spencer.org/data-use-and-educational-improvement-initiative-activities http://www.spencer.org/resources/content/3/3/8/documents/Koretz-Jennings-paper.pdf
24 Daniel M. Koretz Jennifer L. Jennings “Systematic research exploring educators’ understanding of both the principles of testing and appropriate interpretation of test-based data is meager.” p. 5 Dismissive The Misunderstanding and Use of Data from Educational Tests  Prepared for Spencer Foundation meetings, Chicago, IL, February 11, 2010. Revised November 21, 2010 http://www.spencer.org/data-use-and-educational-improvement-initiative-activities http://www.spencer.org/resources/content/3/3/8/documents/Koretz-Jennings-paper.pdf
25 Daniel M. Koretz Jennifer L. Jennings "Although current, systematic information is lacking, our experience is that that the level of understanding of test data among both educators and education policymakers is in many cases abysmally low.", p.6 Dismissive The Misunderstanding and Use of Data from Educational Tests  Prepared for Spencer Foundation meetings, Chicago, IL, February 11, 2010. Revised November 21, 2010 http://www.spencer.org/data-use-and-educational-improvement-initiative-activities http://www.spencer.org/resources/content/3/3/8/documents/Koretz-Jennings-paper.pdf
26 Daniel M. Koretz Jennifer L. Jennings "There has been a considerably (sic) amount of research exploring problems with standards-based reporting, but less on the use and interpretation of standards-based data by important stakeholders." p.12 Dismissive The Misunderstanding and Use of Data from Educational Tests  Prepared for Spencer Foundation meetings, Chicago, IL, February 11, 2010. Revised November 21, 2010 http://www.spencer.org/data-use-and-educational-improvement-initiative-activities http://www.spencer.org/resources/content/3/3/8/documents/Koretz-Jennings-paper.pdf
27 Daniel M. Koretz Jennifer L. Jennings "We have heard former teachers discuss this frequently, saying that new teachers in many schools are inculcated with the notion that raising scores in tested subjects is in itself the appropriate goal of instruction. However, we lack systematic data about this..." p.14 Dismissive The Misunderstanding and Use of Data from Educational Tests  Prepared for Spencer Foundation meetings, Chicago, IL, February 11, 2010. Revised November 21, 2010 http://www.spencer.org/data-use-and-educational-improvement-initiative-activities http://www.spencer.org/resources/content/3/3/8/documents/Koretz-Jennings-paper.pdf
28 Daniel M. Koretz Jennifer L. Jennings "Research on score inflation is not abundant, largely for the reason discussed above: policymakers for the most part feel no obligation to allow the relevant research, which is not in their self-interest even when it is in the interests of students in schools. However, at this time, the evidence is both abundant enough and sufficiently often discussed that that the existence of the general issue of score inflation appears to be increasingly widely recognized by the media, policymakers, and educators." p.17 Dismissive The Misunderstanding and Use of Data from Educational Tests  Prepared for Spencer Foundation meetings, Chicago, IL, February 11, 2010. Revised November 21, 2010 http://www.spencer.org/data-use-and-educational-improvement-initiative-activities http://www.spencer.org/resources/content/3/3/8/documents/Koretz-Jennings-paper.pdf
29 Daniel M. Koretz Jennifer L. Jennings "The issue of score inflation is both poorly understood and widely ignored in the research community as well." p.18 Denigrating The Misunderstanding and Use of Data from Educational Tests  Prepared for Spencer Foundation meetings, Chicago, IL, February 11, 2010. Revised November 21, 2010 http://www.spencer.org/data-use-and-educational-improvement-initiative-activities http://www.spencer.org/resources/content/3/3/8/documents/Koretz-Jennings-paper.pdf
30 Daniel M. Koretz Jennifer L. Jennings "Research on coaching is very limited." p.21 Dismissive The Misunderstanding and Use of Data from Educational Tests  Prepared for Spencer Foundation meetings, Chicago, IL, February 11, 2010. Revised November 21, 2010 http://www.spencer.org/data-use-and-educational-improvement-initiative-activities http://www.spencer.org/resources/content/3/3/8/documents/Koretz-Jennings-paper.pdf
31 Daniel M. Koretz Jennifer L. Jennings "How is test-based information used by educators? … The types of research done to date on this topic, while useful, are insufficient." p.26 Denigrating The Misunderstanding and Use of Data from Educational Tests  Prepared for Spencer Foundation meetings, Chicago, IL, February 11, 2010. Revised November 21, 2010 http://www.spencer.org/data-use-and-educational-improvement-initiative-activities http://www.spencer.org/resources/content/3/3/8/documents/Koretz-Jennings-paper.pdf
32 Daniel M. Koretz Jennifer L. Jennings "… We need to design ways of measuring coaching, which has been almost entirely unstudied." p.26 Dismissive The Misunderstanding and Use of Data from Educational Tests  Prepared for Spencer Foundation meetings, Chicago, IL, February 11, 2010. Revised November 21, 2010 http://www.spencer.org/data-use-and-educational-improvement-initiative-activities http://www.spencer.org/resources/content/3/3/8/documents/Koretz-Jennings-paper.pdf
33 Daniel M. Koretz Jennifer L. Jennings “We have few systematic studies of variations in educators’ responses. …” p. 26 Dismissive The Misunderstanding and Use of Data from Educational Tests  Prepared for Spencer Foundation meetings, Chicago, IL, February 11, 2010. Revised November 21, 2010 http://www.spencer.org/data-use-and-educational-improvement-initiative-activities http://www.spencer.org/resources/content/3/3/8/documents/Koretz-Jennings-paper.pdf
34 Daniel M. Koretz Jennifer L. Jennings "Ultimately, our concern is the impact of educators’ understanding and use of test data on student learning. However, at this point, we have very little comparative information about the validity of gains, ....  The comparative information that is beginning to emerge suggests..." p.26 Dismissive The Misunderstanding and Use of Data from Educational Tests  Prepared for Spencer Foundation meetings, Chicago, IL, February 11, 2010. Revised November 21, 2010 http://www.spencer.org/data-use-and-educational-improvement-initiative-activities http://www.spencer.org/resources/content/3/3/8/documents/Koretz-Jennings-paper.pdf
35 Daniel M. Koretz   “The field of measurement has not kept pace with this transformation of testing.” p. 3 Denigrating Implications of current policy for educational measurement  paper presented at the Exploratory Seminar: Measurement Challenges Within the Race to the Top Agenda, December 2009 http://www.k12center.org/rsc/pdf/KoretzPresenterSession3.pdf  
36 Daniel M. Koretz   “For the most part, notwithstanding Lindquist’s warning, the field of measurement has largely ignored the top levels of sampling.” p. 6 Dismissive Implications of current policy for educational measurement  paper presented at the Exploratory Seminar: Measurement Challenges Within the Race to the Top Agenda, December 2009 http://www.k12center.org/rsc/pdf/KoretzPresenterSession3.pdf  
37 Daniel M. Koretz   “Currently, research on accountability‐related topics, such as score inflation and effects on educational practice, is slowly growing but remains largely divorced from the core activities of the measurement field.” p. 15 Dismissive Implications of current policy for educational measurement  paper presented at the Exploratory Seminar: Measurement Challenges Within the Race to the Top Agenda, December 2009 http://www.k12center.org/rsc/pdf/KoretzPresenterSession3.pdf  
38 Daniel M. Koretz   “The data, however, are more limited and more complex than is often realized, and the story they properly tell is not quite so straightforward. . . . Data about student performance at the end of high school are scarce and especially hard to collect and interpret.” p. 38 Dismissive How do American students measure up? Making Sense of International Comparisons The Future of Children 19:1 Spring 2009 http://www.princeton.edu/futureofchildren/publications/docs/19_01_FullJournal.pdf  
39 Daniel M. Koretz   “International comparisons clearly do not provide what many observers of education would like. . . . The findings are in some cases inconsistent from one study to another. Moreover, the data from all of these studies are poorly suited to separating the effects of schooling from the myriad other influences on student achievement.” p. 48 Dismissive How do American students measure up? Making Sense of International Comparisons The Future of Children 19:1 Spring 2009 http://www.princeton.edu/futureofchildren/publications/docs/19_01_FullJournal.pdf
40 Daniel M. Koretz   “If truly comparable data from the end of schooling were available, they would presumably look somewhat different, though it is unlikely that they would be greatly more optimistic.” p. 49 Dismissive How do American students measure up? Making Sense of International Comparisons The Future of Children 19:1 Spring 2009 http://www.princeton.edu/futureofchildren/publications/docs/19_01_FullJournal.pdf
41 Daniel M. Koretz   “Few detailed studies of score inflation have been carried out. ...” p. 778 Dismissive Test-based educational accountability. Research evidence and implications Zeitschrift für Pädagogik 54 (2008) 6, S. 777–790 http://www.pedocs.de/volltexte/2011/4376/pdf/ZfPaed_2008_6_Koretz_Testbased_educational_accountability_D_A.pdf
42 Daniel M. Koretz   “Unfortunately, while we have a lot of anecdotal evidence suggesting that this [equity as the rationale for NCLB] is the case, we have very few serious empirical studies of this.” answer to 3rd question, 1st para Denigrating What does educational testing really tell us?  Education Week [interview ], 9.23.2008 http://blogs.edweek.org/edweek/eduwonkette/2008/09/what_does_educational_testing_1.html  
43 Daniel M. Koretz   "…we rarely know when [test] scores are inflated because we so rarely check." Dismissive Interpreting test scores: More complicated than you think [interview] Chronicle of Higher Education, August 15, 2008, p. A23    
44 Daniel M. Koretz   “… [T]he problem of score inflation is at best inconvenient and at worse [sic] threatening. (The later is one reason that there are so few studies of this problem. …)” p. 11 Dismissive Measuring up: What educational testing really tells us. Harvard University Press, 2008  Google Books  
45 Daniel M. Koretz   “The relatively few studies that have addressed this question support the skeptical interpretation: in many cases, mastery of material on the new test simply substitutes for mastery of the old.” p. 242 Dismissive Measuring up: What educational testing really tells us. Harvard University Press, 2008  Google Books  
46 Daniel M. Koretz   “Because so many people consider test-based accountability to be self-evaluating … there is a disturbing lack of good evaluations of these systems. …” p. 331 Denigrating Measuring up: What educational testing really tells us Harvard University Press, 2008  Google Books
47 Daniel M. Koretz   “Most of these few studies showed a rapid divergence of means on the two tests. …” p. 348 Dismissive Using aggregate-level linkages for estimation and validation, etc. in Linking and Aligning Scores and Scales, Springer, 2007 Google Books
48 Daniel M. Koretz   "Research to date makes clear that score gains achieved under high-stakes conditions should not be accepted at face value. ...policymakers embarking on an effort to create a more effective system of ...accountability must face uncertainty about how well alternatives will function in practice, and should be prepared for a period of evaluation and mid-course correction." Dismissive Alignment, High Stakes, and the Inflation of Test Scores CRESST Report 655, June 2005    
49 Daniel M. Koretz   "Thus, even in a well-aligned system, policymakers still face the challenge of designing educational accountability systems that create the right mix of incentives: incentives that will maximize real gains in student performance, minimize score inflation, and generate other desirable changes in educational practice. This is a challenge in part because of a shortage of relevant experience and research..." Dismissive Alignment, High Stakes, and the Inflation of Test Scores CRESST Report 655, June 2005    
50 Daniel M. Koretz   "Research has yet to clarify how variations in the performance targets set for schools affect the incentives faced by teachers and the resulting validity of score gains." Dismissive Alignment, High Stakes, and the Inflation of Test Scores CRESST Report 655, June 2005    
51 Daniel M. Koretz   "In terms of research, the jury is still out." Dismissive Alignment, High Stakes, and the Inflation of Test Scores CRESST Report 655, June 2005    
52 Daniel M. Koretz   "The first study to evaluate score inflation empirically (Koretz, Linn, Dunbar, and Shepard, 1991) looked at a district-testing program in the 1980s that used commercial, off-the-shelf, multiple-choice achievement tests."  1stness Alignment, High Stakes, and the Inflation of Test Scores, p.7 CRESST Report 655, June 2005    
53 Laura S. Hamilton Daniel F. McCaffrey, J.R. Lockwood, Daniel M. Koretz “The shortcomings of the studies make it difficult to determine the size of teacher effects, but we suspect that the magnitude of some of the effects reported in this literature are overstated.” p. xiii Denigrating Evaluating Value-Added Models for Teacher Accountability  Rand Corporation, 2003 https://www.rand.org/content/dam/rand/pubs/monographs/2004/RAND_MG158.pdf  
54 Laura S. Hamilton Daniel F. McCaffrey, J.R. Lockwood, Daniel M. Koretz “Using VAM to estimate individual teacher effects is a recent endeavor, and many of the possible sources of error have not been thoroughly evaluated in the literature.” p. xix Dismissive Evaluating Value-Added Models for Teacher Accountability  Rand Corporation, 2003 https://www.rand.org/content/dam/rand/pubs/monographs/2004/RAND_MG158.pdf  
55 Laura S. Hamilton Daniel F. McCaffrey, J.R. Lockwood, Daniel M. Koretz “Empirical evaluations do not exist for many of the potential sources of error we have identified. Studies need to be conducted to determine how these factors contribute to estimated teacher effects and to determine the conditions that exacerbate or mitigate the impact these factors have on teacher effects.” p. xix Dismissive Evaluating Value-Added Models for Teacher Accountability  Rand Corporation, 2003 https://www.rand.org/content/dam/rand/pubs/monographs/2004/RAND_MG158.pdf  
56 Laura S. Hamilton Daniel F. McCaffrey, J.R. Lockwood, Daniel M. Koretz “This lack of attention to teachers in policy discussions may be attributed in part to another body of literature that attempted to determine the effects of specific teacher background characteristics, including credentialing status (e.g., Miller, McKenna, and McKenna, 1998; Goldhaber and Brewer, 2000) and subject matter coursework (e.g., Monk, 1994).” p. 8 Dismissive Evaluating Value-Added Models for Teacher Accountability  Rand Corporation, 2003 https://www.rand.org/content/dam/rand/pubs/monographs/2004/RAND_MG158.pdf  
57 Laura S. Hamilton Daniel F. McCaffrey, J.R. Lockwood, Daniel M. Koretz “To date, there has been little empirical exploration of the size of school effects and the sensitivity of teacher effects to modeling of school effects.” p. 78 Dismissive Evaluating Value-Added Models for Teacher Accountability  Rand Corporation, 2003 https://www.rand.org/content/dam/rand/pubs/monographs/2004/RAND_MG158.pdf  
58 Laura S. Hamilton Daniel F. McCaffrey, J.R. Lockwood, Daniel M. Koretz “There are no empirical explorations of the robustness of estimates to assumptions about prior-year schooling effects.“ p. 81 Dismissive Evaluating Value-Added Models for Teacher Accountability  Rand Corporation, 2003 https://www.rand.org/content/dam/rand/pubs/monographs/2004/RAND_MG158.pdf  
59 Laura S. Hamilton Daniel F. McCaffrey, J.R. Lockwood, Daniel M. Koretz “There is currently no empirical evidence about the sensitivity of gain scores or teacher effects to such alternatives.” p. 89 Dismissive Evaluating Value-Added Models for Teacher Accountability  Rand Corporation, 2003 https://www.rand.org/content/dam/rand/pubs/monographs/2004/RAND_MG158.pdf  
60 Laura S. Hamilton Daniel F. McCaffrey, J.R. Lockwood, Daniel M. Koretz “Empirical evaluations do not exist for many of the potential sources of error we have identified. Studies need to be conducted to determine how these factors contribute to estimated teacher effects and to determine the conditions that exacerbate or mitigate the impact these factors have on teacher effects.” p. 116 Dismissive Evaluating Value-Added Models for Teacher Accountability  Rand Corporation, 2003 https://www.rand.org/content/dam/rand/pubs/monographs/2004/RAND_MG158.pdf  
61 Laura S. Hamilton Daniel F. McCaffrey, J.R. Lockwood, Daniel M. Koretz “Although we expect missing data are likely to be pervasive, there is little systematic discussion of the extent or nature of missing data in test score databases.” p. 117 Dismissive Evaluating Value-Added Models for Teacher Accountability  Rand Corporation, 2003 https://www.rand.org/content/dam/rand/pubs/monographs/2004/RAND_MG158.pdf  
62 Daniel M. Koretz   "Empirical research on the validity of score gains on high-stakes tests is limited, but the studies conducted to date show…" Dismissive Using multiple measures to address perverse incentives and score inflation, p.21 Educational Measurement: Issues and Practice, Summer 2003
63 Daniel M. Koretz   "Research on educators' responses to high-stakes testing is also limited, …" Dismissive Using multiple measures to address perverse incentives and score inflation, p.21 Educational Measurement: Issues and Practice, Summer 2003
64 Daniel M. Koretz   "Although extant research is sufficient to document problems of score inflation and unintended incentives from test-based accountability, it provides very little guidance about how one might design an accountability system to lessen these problems." Denigrating Using multiple measures to address perverse incentives and score inflation, p.22 Educational Measurement: Issues and Practice, Summer 2003
65 Daniel M. Koretz   “Relatively few studies, however, provide strong empirical evidence pertaining to inflation of entire scores on tests used for accountability.” p. 759 Denigrating Limitations in the use of achievement tests as measures of educators’ productivity  The Journal of Human Resources, 37:4 (Fall 2002) http://standardizedtests.procon.org/sourcefiles/limitations-in-the-use-of-achievement-tests-as-measures-of-educators-productivity.pdf  
66 Daniel M. Koretz   “Only a few studies have directly tested the generalizability of gains in scores on accountability-oriented tests.” p. 759 Dismissive Limitations in the use of achievement tests as measures of educators’ productivity  The Journal of Human Resources, 37:4 (Fall 2002) http://standardizedtests.procon.org/sourcefiles/limitations-in-the-use-of-achievement-tests-as-measures-of-educators-productivity.pdf  
67 Daniel M. Koretz   “Moreover, while there are numerous anecdotal reports of various types of coaching, little systematic research describes the range of coaching strategies and their effects.” p. 769 Dismissive Limitations in the use of achievement tests as measures of educators’ productivity  The Journal of Human Resources, 37:4 (Fall 2002) http://standardizedtests.procon.org/sourcefiles/limitations-in-the-use-of-achievement-tests-as-measures-of-educators-productivity.pdf  
68 Laura S. Hamilton Daniel M. Koretz "There is currently no substantial evidence on the effects of published report cards on parents’ decisionmaking or on the schools themselves." Dismissive Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 Chapter 2: Tests and their use in test-based accountability systems, p.44
69 Daniel M. Koretz Daniel F. McCaffrey, Laura S. Hamilton "Few efforts are made to evaluate directly score gains obtained under high-stakes conditions, and conventional validation tools are not fully adequate for the task.", p. 1 Dismissive Toward a framework for validating gains under high-stakes conditions CSE Technical Report 551, CRESST/Harvard Graduate School of Education, CRESST/RAND Education, December 2001    
70 Daniel M. Koretz Mark Berends “[T]here has been little systematic research exploring changes in grading standards. …” p. iii Dismissive Changes in high school grading standards in mathematics, 1982–1992  Rand Education, 2001 http://www.rand.org/content/dam/rand/pubs/monograph_reports/2007/MR1445.pdf  
71 Daniel M. Koretz Mark Berends “[F]ew studies have attempted to evaluate systematically changes in grading standards over time.” p. xi Dismissive Changes in high school grading standards in mathematics, 1982–1992  Rand Education, 2001 http://www.rand.org/content/dam/rand/pubs/monograph_reports/2007/MR1445.pdf
72 Daniel M. Koretz E. A. Hanushek, J. J. Heckman, and D. Neal (organizers) "Research provides sparse guidance about how to broaden the range of measured outcomes to provide a better mix of incentives and lessen score inflation.", p.27 Dismissive Limitations in the Use of Achievement Tests as Measures of Educators’ Productivity  Devising Incentives to Promote Human Capital, National Academy of Sciences Conference, May 2000 http://www.irp.wisc.edu/newsevents/other/symposia/koretz.pdf  
73 Daniel M. Koretz E. A. Hanushek, J. J. Heckman, and D. Neal (organizers) "...what types of accountability systems might be more effective, and what role might achievement tests play in them? Unfortunately, there is little basis in research for answering this question. The simple test-based accountability systems that have been in vogue for the past two decades have appeared so commonsensical to some policymakers that they have had little incentive to permit the evaluation of alternatives.", p.25 Dismissive Limitations in the Use of Achievement Tests as Measures of Educators’ Productivity  Devising Incentives to Promote Human Capital, National Academy of Sciences Conference, May 2000 http://www.irp.wisc.edu/newsevents/other/symposia/koretz.pdf  
74 Daniel M. Koretz E. A. Hanushek, J. J. Heckman, and D. Neal (organizers) "...while there are numerous anecdotal reports of various types of coaching, little systematic research describes the range of coaching strategies and their effects.", p.24 Denigrating Limitations in the Use of Achievement Tests as Measures of Educators’ Productivity  Devising Incentives to Promote Human Capital, National Academy of Sciences Conference, May 2000 http://www.irp.wisc.edu/newsevents/other/symposia/koretz.pdf  
75 Daniel M. Koretz E. A. Hanushek, J. J. Heckman, and D. Neal (organizers) "Only a few studies have directly tested the generalizability of gains in scores on accountability-oriented tests.", p.11 Denigrating Limitations in the Use of Achievement Tests as Measures of Educators’ Productivity  Devising Incentives to Promote Human Capital, National Academy of Sciences Conference, May 2000 http://www.irp.wisc.edu/newsevents/other/symposia/koretz.pdf  
76 Daniel M. Koretz E. A. Hanushek, J. J. Heckman, and D. Neal (organizers) "Relatively few studies, however, provide strong empirical evidence pertaining to inflation of entire scores on tests used for accountability.  Policy makers have little incentive to facilitate such studies, and they can be difficult to carry out.", p.11 Denigrating Limitations in the Use of Achievement Tests as Measures of Educators’ Productivity  Devising Incentives to Promote Human Capital, National Academy of Sciences Conference, May 2000 http://www.irp.wisc.edu/newsevents/other/symposia/koretz.pdf  
77 Daniel M. Koretz Sheila I. Barron “In the absence of systematic research documenting test-based accountability systems that have avoided the problem of inflated gains …” p. xvii Dismissive The validity of gains in scores on the Kentucky Instructional Results Information System (KIRIS)  Rand Education, 1998 http://www.rand.org/content/dam/rand/pubs/monograph_reports/2009/MR1014.pdf  
78 Daniel M. Koretz Sheila I. Barron “This study also illustrated in numerous ways the limitations of current research on the validity of gains.” p. xviii Dismissive The validity of gains in scores on the Kentucky Instructional Results Information System (KIRIS)  Rand Education, 1998 http://www.rand.org/content/dam/rand/pubs/monograph_reports/2009/MR1014.pdf  
79 Daniel M. Koretz Sheila I. Barron “The field of measurement has seen many decades of intensive development of methods for evaluating scores cross-sectionally, but much less attention has been devoted to the problem of evaluating gains. . . . [T]his methodological gap is likely to become ever more important.” p. 122 Dismissive The validity of gains in scores on the Kentucky Instructional Results Information System (KIRIS)  Rand Education, 1998 http://www.rand.org/content/dam/rand/pubs/monograph_reports/2009/MR1014.pdf  
80 Daniel M. Koretz Sheila I. Barron “The contrast between mathematics … and reading … underlines the limits of our current knowledge of the mechanisms that underlie score inflation.” p. 122 Dismissive The validity of gains in scores on the Kentucky Instructional Results Information System (KIRIS)  Rand Education, 1998 http://www.rand.org/content/dam/rand/pubs/monograph_reports/2009/MR1014.pdf  
81 Daniel M. Koretz reported by Debra Viadero “...all of the researchers interviewed agreed with FairTest’s contention that research evidence supporting the use of high-stakes tests as a means of improving schools is thin.”   Dismissive FairTest report questions reliance on high-stakes testing by states Debra Viadero, Education Week, January 28, 1998    
82 Daniel M. Koretz Erik A. Hanushek, D.W. Jorgenson (Eds.) "Despite the long history of assessment-based accountability, hard evidence about its effects is surprisingly sparse, and the little evidence that is available is not encouraging. ...The large positive effects assumed by advocates...are often not substantiated by hard evidence....” Dismissive Using student assessments for educational accountability Improving America’s schools: The role of incentives. Washington, D.C.: National Academy Press, 1996    
83 Daniel M. Koretz Robert L. Linn, Stephen Dunbar, Lorrie A. Shepard “Evidence relevant to this debate has been limited.” p. 2 Dismissive The Effects of High-Stakes Testing On Achievement: Preliminary Findings About Generalization Across Tests  Originally presented at the annual meeting of the AERA and the NCME, Chicago, April 5, 1991 http://nepc.colorado.edu/files/HighStakesTesting.pdf  
                 
  IRONIES:              
  Daniel M. Koretz   "Although this problem has been documented for more than a quarter of a century, it is still widely ignored, and the public is fed a steady diet of seriously misleading information about improvements in schools." Dismissive The Testing Charade: Pretending to Make Schools Better [Kindle location 723] University of Chicago Press, 2017    
  Daniel M. Koretz   "It is worth considering why we are so unlikely to ever find out how common cheating has become. … the press remains gullible…" Dismissive The Testing Charade: Pretending to Make Schools Better [Kindle location 1424] University of Chicago Press, 2017    
  Daniel M. Koretz   "…putting a stop to this disdain for evidence--this arrogant assumption that we know so much that we don't have to bother evaluating our ideas before imposing them on teachers and students--is one of the most important changes we have to make." Dismissive The Testing Charade: Pretending to Make Schools Better [Kindle location 2573] University of Chicago Press, 2017    
  Daniel M. Koretz   "But the failure to evaluate the reforms also reflects a particular arrogance." Dismissive The Testing Charade: Pretending to Make Schools Better [Kindle location 3184] University of Chicago Press, 2017    
  Daniel M. Koretz   "I've several times excoriated some of the reformers for assuming that whatever they dreamed up would work well without turning to actual evidence." Dismissive The Testing Charade: Pretending to Make Schools Better [Kindle location 3229] University of Chicago Press, 2017    
  Daniel M. Koretz Jennifer L. Jennings "Unfortunately, it is often exceedingly difficult to obtain the permission and access needed to carry out testing-related research in the public education sector. This is particularly so if the research holds out the possibility of politically inconvenient findings, which virtually all evaluations in this area do. In our experience, very few state or district superintendents or commissioners consider it an obligation to provide the public or the field with open and impartial research. Data are considered proprietary—a position that the restrictions imposed by the federal Family Educational Rights and Privacy Act (FERPA) have made easier to maintain publicly. Access is usually provided only for research which is not seen as unduly threatening to the leaders’ immediate political agendas. The fact that this last consideration is often openly discussed underscores the lack of a culture of public accountability."   The Misunderstanding and Use of Data from Educational Tests, pp. 4-5 Prepared for Spencer Foundation meetings, Chicago, IL, February 11, 2010. Revised November 21, 2010 http://www.spencer.org/data-use-and-educational-improvement-initiative-activities/ http://www.spencer.org/resources/content/3/3/8/documents/Koretz--Jennings-paper.pdf
  Daniel M. Koretz Jennifer L. Jennings "This unwillingness to countenance honest but potentially threatening research garners very little discussion, but in this respect, education is an anomaly. In many areas of public policy, such as drug safety or vehicle safety, there is an expectation that the public is owed honest and impartial evaluation and research. For example, imagine what would have happened if the CEO of Merck had responded to reports of side-effects from Vioxx by saying that allowing access to data was “not our priority at present,” which is a not infrequent response to data requests made to districts or states. In public education, there is no expectation that the public has a right to honest evaluation, and data are seen as the policymakers’ proprietary sandbox, to which they can grant access when it happens to serve their political needs."   The Misunderstanding and Use of Data from Educational Tests, p. 5 Prepared for Spencer Foundation meetings, Chicago, IL, February 11, 2010. Revised November 21, 2010 http://www.spencer.org/data-use-and-educational-improvement-initiative-activities/ http://www.spencer.org/resources/content/3/3/8/documents/Koretz--Jennings-paper.pdf
  Daniel M. Koretz   "One sometimes disquieting consequence of the incompleteness of tests is that different tests often provide somewhat inconsistent results." (p. 10)   Measuring up: What educational testing really tells us. Harvard University Press, 2008  Google Books  
  Daniel M. Koretz   "Even a single test can provide varying results. Just as polls have a margin of error, so do achievement tests. Students who take more than one form of a test typically obtain different scores." (p. 11)   Measuring up: What educational testing really tells us. Harvard University Press, 2008  Google Books  
  Daniel M. Koretz   "Even well-designed tests will often provide substantially different views of trends because of differences in content and other aspects of the tests' design. . . . [W]e have to be careful not to place too much confidence in detailed findings, such as the precise size of changes over time or of differences between groups." (p. 92)   Measuring up: What educational testing really tells us. Harvard University Press, 2008  Google Books  
  Daniel M. Koretz   "[O]ne cannot give all the credit or blame to one factor . . . without investigating the impact of others. Many of the complex statistical models used in economics, sociology, epidemiology, and other sciences are efforts to take into account (or 'control for') other factors that offer plausible alternative explanations of the observed data, and many apportion variation in the outcome-say, test scores-among various possible causes. …A hypothesis is only scientifically credible when the evidence gathered has ruled out plausible alternative explanations." (pp. 122-123)   Measuring up: What educational testing really tells us. Harvard University Press, 2008  Google Books  
  Daniel M. Koretz   "[A] simple correlation need not indicate that one of the factors causes the other." (p. 123)   Measuring up: What educational testing really tells us. Harvard University Press, 2008  Google Books  
  Daniel M. Koretz   "Any number of studies have shown the complexity of the non-educational factors that can affect achievement and test scores." (p. 129)   Measuring up: What educational testing really tells us. Harvard University Press, 2008  Google Books  
                 
    Cite themselves or colleagues in the group, but dismiss or denigrate all other work          
    Falsely claim that research has only recently been done on the topic.          
                 
      * Jennifer L. Jennings is a protégé of Daniel Koretz.