HOME: Dismissive Reviews in Education Policy Research
# | Author(s) | Dismissive Quote | Type | Title | Source | Link | Funders | Notes
1 | John F. Pane | "Practitioners and policymakers seeking to implement personalized learning, lacking clearly defined evidence-based models to adopt, are creating custom designs for their specific contexts. Those who want to use rigorous research evidence to guide their designs will find many gaps and will be left with important unanswered questions about which practices or combinations of practices are effective. It will likely take many years of research to fill these gaps." | Dismissive | Strategies for Implementing Personalized Learning While Evidence and Resources Are Underdeveloped, p.1 | Rand Corporation Perspective, October 2018 | https://www.rand.org/pubs/perspectives/PE314.html | Funded by the William and Flora Hewlett Foundation and RAND Corporation funders. (UCLA’s National Center for Research on Evaluation, Standards, and Student Testing (CRESST) is monitoring the extent to which the two consortia’s assessment development efforts are likely to produce tests that measure and support goals for deeper learning.) | Pane devotes considerable text to claims that no prior research exists, except for another Rand study, and then, on p.7, admits that some relevant mastery learning studies from the 1980s exist. He implies, however, that there were only a few. In fact, there were hundreds, by researchers including Homme, Csanyi, Gonzales, Rechs, O’Leary, Drabman, Kazdin, Bootzin, Staats, Cameron, Pierce, McMillan, Corcoran, and Wilson. There have also been thousands of studies of personalized instruction in conjunction with studies of special education, tutoring, teachers' aides, tracking, etc.
2 | John F. Pane | "The purpose of this Perspective is to offer strategic guidance for designers of personalized learning programs to consider while the evidence base is catching up." | Dismissive | Strategies for Implementing Personalized Learning While Evidence and Resources Are Underdeveloped, p.1 | Rand Corporation Perspective, October 2018 | https://www.rand.org/pubs/perspectives/PE314.html | Same funders as row 1 | Same notes as row 1
3 | John F. Pane | "This guidance draws on theory, basic principles of learning science, and the limited research that does exist on personalized learning and its component parts." | Dismissive | Strategies for Implementing Personalized Learning While Evidence and Resources Are Underdeveloped, p.1 | Rand Corporation Perspective, October 2018 | https://www.rand.org/pubs/perspectives/PE314.html | Same funders as row 1 | Same notes as row 1
4 | John F. Pane | "Thus far, the research evidence on personalized learning as an overarching schoolwide model is sparse." | Dismissive | Strategies for Implementing Personalized Learning While Evidence and Resources Are Underdeveloped, p.4 | Rand Corporation Perspective, October 2018 | https://www.rand.org/pubs/perspectives/PE314.html | Same funders as row 1 | Same notes as row 1
5 | John F. Pane | "A team of RAND Corporation researchers conducted the largest and most-rigorous studies of student achievement effects to date." | 1stness | Strategies for Implementing Personalized Learning While Evidence and Resources Are Underdeveloped, p.4 | Rand Corporation Perspective, October 2018 | https://www.rand.org/pubs/perspectives/PE314.html | Same funders as row 1 | Same notes as row 1
6 | John F. Pane | "While we await the answers to those questions, substantial enthusiasm around personalized learning persists. Educators, policy makers, and advocates are moving forward without the guidance of conclusive research evidence." | Dismissive | Strategies for Implementing Personalized Learning While Evidence and Resources Are Underdeveloped, p.5 | Rand Corporation Perspective, October 2018 | https://www.rand.org/pubs/perspectives/PE314.html | Same funders as row 1 | Same notes as row 1
7 | John F. Pane | "In the absence of comprehensive, rigorous evidence to help select the personalized learning components most likely to succeed, what is the path forward? I suggest a few guiding principles aimed at using existing scientific knowledge and the best available resources." | Denigrating | Strategies for Implementing Personalized Learning While Evidence and Resources Are Underdeveloped, p.5 | Rand Corporation Perspective, October 2018 | https://www.rand.org/pubs/perspectives/PE314.html | Same funders as row 1 | Same notes as row 1
8 | John F. Pane | "However, more work is necessary to establish causal evidence that the concept leads to improved outcomes for students" | Dismissive | Strategies for Implementing Personalized Learning While Evidence and Resources Are Underdeveloped, p.9 | Rand Corporation Perspective, October 2018 | https://www.rand.org/pubs/perspectives/PE314.html | Same funders as row 1 | Same notes as row 1
9 | John F. Pane | "Those who want to use rigorous research evidence to guide their designs will find many gaps and will be left with important unanswered questions about which practices or combinations of practices are effective." | Dismissive, Denigrating | Strategies for Implementing Personalized Learning While Evidence and Resources Are Underdeveloped, p.12 | Rand Corporation Perspective, October 2018 | https://www.rand.org/pubs/perspectives/PE314.html | Funded by the William and Flora Hewlett Foundation | Same notes as row 1
10 | John F. Pane | "Despite the lack of evidence, there is considerable enthusiasm about personalized learning among practitioners and policymakers, and implementation is spreading." | Dismissive | Strategies for Implementing Personalized Learning While Evidence and Resources Are Underdeveloped, p.12 | Rand Corporation Perspective, October 2018 | https://www.rand.org/pubs/perspectives/PE314.html | Funded by the William and Flora Hewlett Foundation | Same notes as row 1
11 | John F. Pane | "Thus, the purpose of this Perspective is to offer strategic guidance for designers of personalized learning programs to consider while the evidence base is catching up. This guidance draws on theory, basic principles from learning science, and the limited research that does exist on personalized learning and its component parts. This research was conducted in RAND Education." | Dismissive | Strategies for Implementing Personalized Learning While Evidence and Resources Are Underdeveloped, p.12 | Rand Corporation Perspective, October 2018 | https://www.rand.org/pubs/perspectives/PE314.html | Funded by the William and Flora Hewlett Foundation | Same notes as row 1
12 | Jennifer L. Steele, Matthew W. Lewis, Lucrecia Santibañez, Susannah Faxon-Mills, Mollie Rudnick, Brian M. Stecher, Laura S. Hamilton | "Despite taking on considerable momentum in the field, competency-based systems have not been extensively researched." p.2 | Dismissive | Competency-Based Education in Three Pilot Programs: Examining Implementation and Outcomes | Rand Education, 2014 | https://www.rand.org/content/dam/rand/pubs/research_reports/RR700/RR732/RAND_RR732.pdf | "The research described in this report was sponsored by the Bill & Melinda Gates Foundation" | Just some of the relevant pre-2008 studies of the effects of minimum-competency or exit exams, and of the problems with a single passing score, include those of Alvarez, Moreno, & Patrinos (2007); Grodsky & Kalogrides (2006); Audette (2005); Orlich (2003); StandardsWork (2003); Meisels et al. (2003); Braun (2003); Rosenshine (2003); Tighe, Wang, & Foley (2002); Carnoy & Loeb (2002); Baumert & Demmrich (2001); Rosenblatt & Offer (2001); Phelps (2001); Toenjes, Dworkin, Lorence, & Hill (2000); Wenglinsky (2000); Massachusetts Finance Office (2000); DeMars (2000); Bishop (1999, 2000, 2001, & 2004); Grissmer & Flanagan (1998); Strauss, Bowes, Marks, & Plesko (1998); Frederiksen (1994); Ritchie & Thorkildsen (1994); Chao-Qun & Hui (1993); Potter & Wall (1992); Jacobson (1992); Rodgers et al. (1991); Morris (1991); Winfield (1990); Ligon, Johnstone, Brightman, Davis, et al. (1990); Winfield (1987); Koffler (1987); Losack (1987); Marshall (1987); Hembree (1987); Mangino, Battaille, Washington, & Rumbaut (1986); Michigan Department of Education (1984); Ketchie (1984); Serow (1982); Indiana Education Department (1982); Brunton (1982); Paramore et al. (1980); Ogden (1979); Down(2) (1979); Wellisch (1978); and Findley (1978).
13 | Jennifer L. Steele et al. | "Recent studies have described the experiences of educators working to undertake competency-based reforms or have highlighted promising models, but these studies have not systematically examined the effects of these models on student learning or persistence." p.2 | Denigrating | Competency-Based Education in Three Pilot Programs: Examining Implementation and Outcomes | Rand Education, 2014 | https://www.rand.org/content/dam/rand/pubs/research_reports/RR700/RR732/RAND_RR732.pdf | Same funders as row 12 | Same notes as row 12
14 | Jennifer L. Steele et al. | "… there are no studies that would allow us to attribute outperformance to the competency-based education systems alone," p.2 | Dismissive | Competency-Based Education in Three Pilot Programs: Examining Implementation and Outcomes | Rand Education, 2014 | https://www.rand.org/content/dam/rand/pubs/research_reports/RR700/RR732/RAND_RR732.pdf | Same funders as row 12 | Same notes as row 12
15 | Jennifer L. Steele et al. | "Because it is one of the first studies we are aware of since the late 1980s that has attempted to estimate the impact of competency-based models on students’ academic outcomes," p.4 | 1stness | Competency-Based Education in Three Pilot Programs: Examining Implementation and Outcomes | Rand Education, 2014 | https://www.rand.org/content/dam/rand/pubs/research_reports/RR700/RR732/RAND_RR732.pdf | Same funders as row 12 | Same notes as row 12
16 | Jennifer L. Steele et al. | "In part, the lack of recent research on competency-based education may be due to variability around the concept of competency-based education itself." p.10 | Dismissive | Competency-Based Education in Three Pilot Programs: Examining Implementation and Outcomes | Rand Education, 2014 | https://www.rand.org/content/dam/rand/pubs/research_reports/RR700/RR732/RAND_RR732.pdf | Same funders as row 12 | Same notes as row 12
17 | Kun Yuan, Vi-Nhuan Le | "… there has been no systematic empirical examination of the extent to which other widely used achievement tests emphasize deeper learning." p.xi | Dismissive | Measuring Deeper Learning Through Cognitively Demanding Test Items | Rand Corporation Research Report, 2014 | https://www.rand.org/content/dam/rand/pubs/research_reports/RR400/RR483/RAND_RR483.pdf | "The research described in this report was sponsored by the William and Flora Hewlett Foundation"
18 | Pete Wilmoth | "The increasing availability of computers and Internet access makes technology based education an enticing option, both inside and outside the classroom. However, school districts have adopted many such tools without compelling evidence that they are effective in improving student achievement." | Dismissive | Cognitive Tutor: Encouraging Signs for Computers in the Classroom | The RAND Blog, November 19, 2013 | "The research reported here was supported by the Institute of Education Sciences, U.S. Department of Education, through Grant R305A070185 to the RAND Corporation."
19 | Pete Wilmoth | "To help fill this evidence gap, a RAND research team … " | Dismissive | Cognitive Tutor: Encouraging Signs for Computers in the Classroom | The RAND Blog, November 19, 2013 | Same funders as row 18
20 | Pete Wilmoth | "As one of the first large-scale assessments of a blended learning approach, this study suggests promise for using technology to improve student achievement." | 1stness | Cognitive Tutor: Encouraging Signs for Computers in the Classroom | The RAND Blog, November 19, 2013 | Same funders as row 18
21 | John F. Pane, Beth Ann Griffin, Daniel F. McCaffrey, and Rita Karam | "These tools allow self-paced instruction and provide students with customized feedback. These features, it is widely held, will improve student engagement and improve proficiency. However, evidence to support these claims remains scarce." p.2 | Dismissive | Does an Algebra Course with Tutoring Software Improve Student Learning? | Rand Corporation Brief, 2013 | "The research reported here was supported by the Institute of Education Sciences, U.S. Department of Education, through Grant R305A070185 to the RAND Corporation."
22 | John F. Pane, Beth Ann Griffin, Daniel F. McCaffrey, and Rita Karam | "To make headway in addressing this knowledge gap, a team of RAND researchers …" p.3 | Dismissive | Does an Algebra Course with Tutoring Software Improve Student Learning? | Rand Corporation Brief, 2013 | Same funders as row 21
23 | Susannah Faxon-Mills, Laura S. Hamilton, Mollie Rudnick, Brian M. Stecher | "In particular, there is still much to learn about how changes in testing might influence the education system and how tests of deeper content and more complex skills and processes could best be used to promote the Foundation’s goals for deeper learning." p.1 | Dismissive | New Assessments, Better Instruction? Designing Assessment Systems to Promote Instructional Improvement | Rand Corporation Research Report, 2013 | "Funding to support the research was provided by the William and Flora Hewlett Foundation." "Marc Chun at the Hewlett Foundation first approached us about reviewing the literature on the impact of assessment, and he was very helpful in framing this investigation."
24 | Susannah Faxon-Mills, Laura S. Hamilton, Mollie Rudnick, Brian M. Stecher | "Given the gaps in evidence regarding the link between testing and student outcomes … " p.1 | Dismissive | New Assessments, Better Instruction? Designing Assessment Systems to Promote Instructional Improvement | Rand Corporation Research Report, 2013 | "Funding to support the research was provided by the William and Flora Hewlett Foundation." "Marc Chun at the Hewlett Foundation first approached us about reviewing the literature on the impact of assessment, and he was very helpful in framing this investigation." | |||||
25 | Susannah Faxon-Mills, Laura S. Hamilton, Mollie Rudnick, Brian M. Stecher | "The first step for each of these research areas was to identify relevant material from previous literature reviews on these topics, including those conducted by RAND researchers (e.g., Hamilton, Stecher, and Klein, 2002; Hamilton, 2003; Stecher, 2010) and by the National Research Council (e.g., Koenig, 2011). p.5 | Dismissive | New Assessments, Better Instruction? Designing Assessment Systems to Promote Instructional Improvement | Rand Corporation Research Report, 2013 | "Funding to support the research was provided by the William and Flora Hewlett Foundation." "Marc Chun at the Hewlett Foundation first approached us about reviewing the literature on the impact of assessment, and he was very helpful in framing this investigation." | |||||
26 | Susannah Faxon-Mills, Laura S. Hamilton, Mollie Rudnick, Brian M. Stecher | "… we paid particular attention to sources from the past ten years, since these studies were less likely to have been included in previous literature reviews." p.5 | Dismissive | New Assessments, Better Instruction? Designing Assessment Systems to Promote Instructional Improvement | Rand Corporation Research Report, 2013 | "Funding to support the research was provided by the William and Flora Hewlett Foundation." "Marc Chun at the Hewlett Foundation first approached us about reviewing the literature on the impact of assessment, and he was very helpful in framing this investigation." | |||||
27 | Susannah Faxon-Mills, Laura S. Hamilton, Mollie Rudnick, Brian M. Stecher | "Time and resource constraints limited the extent of our literature reviews, but we do not think this had a serious effect on our findings. Most importantly, we included all the clearly relevant studies from major sources that were available for electronic searching. In addition, many of the studies we reviewed also included comprehensive reviews of other literature, leading to fairly wide coverage of each body of literature." p.8 | Dismissive | New Assessments, Better Instruction? Designing Assessment Systems to Promote Instructional Improvement | Rand Corporation Research Report, 2013 | "Funding to support the research was provided by the William and Flora Hewlett Foundation." "Marc Chun at the Hewlett Foundation first approached us about reviewing the literature on the impact of assessment, and he was very helpful in framing this investigation." | |||||
28 | Susannah Faxon-Mills, Laura S. Hamilton, Mollie Rudnick, Brian M. Stecher | "However, the amount of research on test attributes is limited, and the research has been conducted in a wide variety of contexts involving a wide variety of tests. Thus, while the findings are interesting, few have been replicated." p.22 | Dismissive | New Assessments, Better Instruction? Designing Assessment Systems to Promote Instructional Improvement | Rand Corporation Research Report, 2013 | "Funding to support the research was provided by the William and Flora Hewlett Foundation." "Marc Chun at the Hewlett Foundation first approached us about reviewing the literature on the impact of assessment, and he was very helpful in framing this investigation." | |||||
29 | Susannah Faxon-Mills, Laura S. Hamilton, Mollie Rudnick, Brian M. Stecher | "It is important to recognize that the literature on how school characteristics, such as urbanicity and governance, affect educators’ responses to testing is sparse." p.29 | Dismissive | New Assessments, Better Instruction? Designing Assessment Systems to Promote Instructional Improvement | Rand Corporation Research Report, 2013 | "Funding to support the research was provided by the William and Flora Hewlett Foundation." "Marc Chun at the Hewlett Foundation first approached us about reviewing the literature on the impact of assessment, and he was very helpful in framing this investigation." | |||||
30 | Susannah Faxon-Mills, Laura S. Hamilton, Mollie Rudnick, Brian M. Stecher | "… there is little empirical evidence that provides guidance on the amount and types of professional development that would promote constructive responses to assessment." | Dismissive | New Assessments, Better Instruction? Designing Assessment Systems to Promote Instructional Improvement | Rand Corporation Research Report, 2013 | "Funding to support the research was provided by the William and Flora Hewlett Foundation." "Marc Chun at the Hewlett Foundation first approached us about reviewing the literature on the impact of assessment, and he was very helpful in framing this investigation." | |||||
31 | Jinok Kim | Joan L. Herman | "However, the validity of existing criteria and procedures lack an empirical base; in fact, reclassification practices are formulated and implemented with little knowledge of the factors that may influence their success." | Dismissive, Denigrating | Understanding Patterns and Precursors of ELL Success Subsequent to Reclassification, p.1 | CRESST Report 818, August, 2012 | https://files.eric.ed.gov/fulltext/ED540604.pdf | "The work reported herein was supported under the National Research and Development Centers, PR/Award Number R305A09058101, as administered by the U.S. Department of Education, Institute of Education Sciences." | |||
32 | Jinok Kim | Joan L. Herman | "Because the research basis for making mainstreaming or reclassification decisions remains slim, it may not be surprising that criteria for reclassifying students from ELL to Reclassified as Fluent English Proficient (RFEP) status vary substantially across states, as documented by a recent report reviewing statewide practices related to ELLs." | Dismissive | Understanding Patterns and Precursors of ELL Success Subsequent to Reclassification, p.3 | CRESST Report 818, August, 2012 | https://files.eric.ed.gov/fulltext/ED540604.pdf | "The work reported herein was supported under the National Research and Development Centers, PR/Award Number R305A09058101, as administered by the U.S. Department of Education, Institute of Education Sciences." | |||
33 | Jinok Kim | Joan L. Herman | "Previous studies cited earlier have identified potential problems in current reclassification, qualitatively analyzed criteria, and student characteristics that may relate to high versus low redesignation rates, and examined related research questions, such as how long it takes for non native speakers to acquire ELP or be reclassified; but none of the existing literature has directly dealt with reclassification systems and their consequences, and more specifically with the consequences of various reclassification criteria." | 1stness | Understanding Patterns and Precursors of ELL Success Subsequent to Reclassification, p.6 | CRESST Report 818, August, 2012 | https://files.eric.ed.gov/fulltext/ED540604.pdf | "The work reported herein was supported under the National Research and Development Centers, PR/Award Number R305A09058101, as administered by the U.S. Department of Education, Institute of Education Sciences." | |||
34 | Lorraine M. McDonnell | "Over the past 30 years, accountability policies have become more prominent in public K-12 education and have changed how teaching and learning are organized. It is less clear the extent to which these policies have altered the politics of education." Abstract, p.170 | Dismissive | Educational Accountability and Policy Feedback | Educational Policy 27(2) 170–189, 2012 | https://journals.sagepub.com/doi/10.1177/0895904812465119 | "The author received financial support from the William T. Grant Foundation for research presented in this article." | ||||
35 | Lorraine M. McDonnell | "In contrast to other policy areas such as health and social welfare where research is more developed, we know less about policy feedback in education." p.171 | Dismissive | Educational Accountability and Policy Feedback | Educational Policy 27(2) 170–189, 2012 | https://journals.sagepub.com/doi/10.1177/0895904812465119 | "The author received financial support from the William T. Grant Foundation for research presented in this article." | ||||
36 | Lorraine M. McDonnell | "However, an essential question for those interested in the politics of education policy has not been central in past research: To what extent have recent accountability policies altered the politics of education? This article begins to address that question ..." p.171 | Dismissive | Educational Accountability and Policy Feedback | Educational Policy, 27(2) 170–189, 2012 | https://journals.sagepub.com/doi/10.1177/0895904812465119 | "The author received financial support from the William T. Grant Foundation for research presented in this article." | ||||
37 | Laura S. Hamilton | Brian M. Stecher, Kun Yuan | "He also noted that 'virtually all of the arguments, both for and against standards, are based on beliefs and hypotheses rather than on direct empirical evidence' (p. 427). Although a large and growing body of research has been conducted to examine the effects of SBA, the caution Porter expressed in 1994 about the lack of empirical evidence remains relevant today." pp.157-158 | Denigrating | Standards-Based Accountability in the United States: Lessons Learned and Future Directions | Education Inquiry, 3(2), June 2012, 149-170 | https://www.academia.edu/15201890/Standards_Based_Accountability_in_the_United_States_Lessons_Learned_and_Future_Directions_1 | "Material in this paper has been adapted from a paper commissioned by the Center on Education Policy: Hamilton, L.S., Stecher, B.M., & Yuan, K. (2009) Standards-based Reform in the United States: History, Research, and Future Directions. Washington, DC: Center on Education Policy. Portions of this work were supported by the National Science Foundation under Grant No. REC-0228295." | Relevant pre-2000 studies of the effects of standards, alignment, goal setting, setting reachable goals, etc. include those of Mitchell (1999); Morgan & Ramist (1998); the *Southern Regional Education Board (1998); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); the Florida Office of Program Policy Analysis (1997); Pomplun (1997); Schmoker (1996); Aguilera & Hendricks (1996); Banta, Lund, Black & Oblander (1996); Bottoms & Mikos (1995); *Bamburg & Medina (1993); Bishop (1993); the U. S. General Accounting Office (1993); Eckstein & Noah (1993); Mattsson (1993); Brown (1992); Heyneman & Ransom (1992); Whetton (1992); Anderson, Muir, Bateson, Blackmore & Rogers (1990); Csikszentmihalyi (1990); *Levine & Lezotte (1990); LaRoque & Coleman (1989); Hillocks (1987); Willingham & Morris (1986); Resnick & Resnick (1985); Ogle & Fritts (1984); *Natriello & Dornbusch (1984); Brooke & Oxenham (1984); Rentz (1979); Wellisch, MacQueen, Carriere & Dick (1978); *Rosswork (1977); Estes, Colvin & Goodwin (1976); Wood (1953); and Panlasigui & Knight (1930). | ||
38 | Laura S. Hamilton | Brian M. Stecher, Kun Yuan | "High-quality research on the effects of SBA is difficult to conduct for a number of reasons,…." p.158 | Dismissive | Standards-Based Accountability in the United States: Lessons Learned and Future Directions | Education Inquiry, 3(2), June 2012, 149-170 | https://www.academia.edu/15201890/Standards_Based_Accountability_in_the_United_States_Lessons_Learned_and_Future_Directions_1 | "Material in this paper has been adapted from a paper commissioned by the Center on Education Policy: Hamilton, L.S., Stecher, B.M., & Yuan, K. (2009) Standards-based Reform in the United States: History, Research, and Future Directions. Washington, DC: Center on Education Policy. Portions of this work were supported by the National Science Foundation under Grant No. REC-0228295." | Access to anonymized student data is granted all the time. Externally administered high-stakes testing is widely reviled among US educationists. It strains credulity that one cannot find one or a few districts out of the many thousands to cooperate in a study to discredit testing. | ||
39 | Laura S. Hamilton | Brian M. Stecher, Kun Yuan | "Even when the necessary data have been collected by states or other entities, it is often difficult for researchers to obtain these data because those responsible for the data refuse to grant access, either because of concerns about confidentiality or because they are not interested in having their programmes scrutinised by researchers. Thus, the amount of rigorous analysis is limited." p.158 | Dismissive | Standards-Based Accountability in the United States: Lessons Learned and Future Directions | Education Inquiry, 3(2), June 2012, 149-170 | https://www.academia.edu/15201890/Standards_Based_Accountability_in_the_United_States_Lessons_Learned_and_Future_Directions_1 | "Material in this paper has been adapted from a paper commissioned by the Center on Education Policy: Hamilton, L.S., Stecher, B.M., & Yuan, K. (2009) Standards-based Reform in the United States: History, Research, and Future Directions. Washington, DC: Center on Education Policy. Portions of this work were supported by the National Science Foundation under Grant No. REC-0228295." | Access to anonymized student data is granted all the time. Externally administered high-stakes testing is widely reviled among US educationists. It strains credulity that one cannot find one or a few districts out of the many thousands to cooperate in a study to discredit testing. | ||
40 | Laura S. Hamilton | Brian M. Stecher, Kun Yuan | "These evaluation findings reveal the challenges inherent in trying to judge the quality of standards. Arguably the most important test of quality is whether the standards promote high-quality instruction and improved student learning but, as we discuss later, there is very little research to address that question." p.158 | Dismissive | Standards-Based Accountability in the United States: Lessons Learned and Future Directions | Education Inquiry, 3(2), June 2012, 149-170 | https://www.academia.edu/15201890/Standards_Based_Accountability_in_the_United_States_Lessons_Learned_and_Future_Directions_1 | "Material in this paper has been adapted from a paper commissioned by the Center on Education Policy: Hamilton, L.S., Stecher, B.M., & Yuan, K. (2009) Standards-based Reform in the United States: History, Research, and Future Directions. Washington, DC: Center on Education Policy. Portions of this work were supported by the National Science Foundation under Grant No. REC-0228295." | Relevant pre-2000 studies of the effects of standards, alignment, goal setting, setting reachable goals, etc. include those of Mitchell (1999); Morgan & Ramist (1998); the *Southern Regional Education Board (1998); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); the Florida Office of Program Policy Analysis (1997); Pomplun (1997); Schmoker (1996); Aguilera & Hendricks (1996); Banta, Lund, Black & Oblander (1996); Bottoms & Mikos (1995); *Bamburg & Medina (1993); Bishop (1993); the U. S. General Accounting Office (1993); Eckstein & Noah (1993); Mattsson (1993); Brown (1992); Heyneman & Ransom (1992); Whetton (1992); Anderson, Muir, Bateson, Blackmore & Rogers (1990); Csikszentmihalyi (1990); *Levine & Lezotte (1990); LaRoque & Coleman (1989); Hillocks (1987); Willingham & Morris (1986); Resnick & Resnick (1985); Ogle & Fritts (1984); *Natriello & Dornbusch (1984); Brooke & Oxenham (1984); Rentz (1979); Wellisch, MacQueen, Carriere & Dick (1978); *Rosswork (1977); Estes, Colvin & Goodwin (1976); Wood (1953); and Panlasigui & Knight (1930). | ||
41 | Laura S. Hamilton | Brian M. Stecher, Kun Yuan | "In fact, the bulk of research relevant to SBA has focused on the links between high-stakes tests and educators’ practices rather than standards and practices." p.159 | Dismissive | Standards-Based Accountability in the United States: Lessons Learned and Future Directions | Education Inquiry, 3(2), June 2012, 149-170 | https://www.academia.edu/15201890/Standards_Based_Accountability_in_the_United_States_Lessons_Learned_and_Future_Directions_1 | "Material in this paper has been adapted from a paper commissioned by the Center on Education Policy: Hamilton, L.S., Stecher, B.M., & Yuan, K. (2009) Standards-based Reform in the United States: History, Research, and Future Directions. Washington, DC: Center on Education Policy. Portions of this work were supported by the National Science Foundation under Grant No. REC-0228295." | Relevant pre-2000 studies of the effects of standards, alignment, goal setting, setting reachable goals, etc. include those of Mitchell (1999); Morgan & Ramist (1998); the *Southern Regional Education Board (1998); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); the Florida Office of Program Policy Analysis (1997); Pomplun (1997); Schmoker (1996); Aguilera & Hendricks (1996); Banta, Lund, Black & Oblander (1996); Bottoms & Mikos (1995); *Bamburg & Medina (1993); Bishop (1993); the U. S. General Accounting Office (1993); Eckstein & Noah (1993); Mattsson (1993); Brown (1992); Heyneman & Ransom (1992); Whetton (1992); Anderson, Muir, Bateson, Blackmore & Rogers (1990); Csikszentmihalyi (1990); *Levine & Lezotte (1990); LaRoque & Coleman (1989); Hillocks (1987); Willingham & Morris (1986); Resnick & Resnick (1985); Ogle & Fritts (1984); *Natriello & Dornbusch (1984); Brooke & Oxenham (1984); Rentz (1979); Wellisch, MacQueen, Carriere & Dick (1978); *Rosswork (1977); Estes, Colvin & Goodwin (1976); Wood (1953); and Panlasigui & Knight (1930). | ||
42 | Laura S. Hamilton | Brian M. Stecher, Kun Yuan | "The existing evidence does not provide definitive guidance regarding the SBA system features that would be most likely to promote desirable outcomes." p.163 | Dismissive | Standards-Based Accountability in the United States: Lessons Learned and Future Directions | Education Inquiry, 3(2), June 2012, 149-170 | https://www.academia.edu/15201890/Standards_Based_Accountability_in_the_United_States_Lessons_Learned_and_Future_Directions_1 | "Material in this paper has been adapted from a paper commissioned by the Center on Education Policy: Hamilton, L.S., Stecher, B.M., & Yuan, K. (2009) Standards-based Reform in the United States: History, Research, and Future Directions. Washington, DC: Center on Education Policy. Portions of this work were supported by the National Science Foundation under Grant No. REC-0228295." | Relevant pre-2000 studies of the effects of standards, alignment, goal setting, setting reachable goals, etc. include those of Mitchell (1999); Morgan & Ramist (1998); the *Southern Regional Education Board (1998); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); the Florida Office of Program Policy Analysis (1997); Pomplun (1997); Schmoker (1996); Aguilera & Hendricks (1996); Banta, Lund, Black & Oblander (1996); Bottoms & Mikos (1995); *Bamburg & Medina (1993); Bishop (1993); the U. S. General Accounting Office (1993); Eckstein & Noah (1993); Mattsson (1993); Brown (1992); Heyneman & Ransom (1992); Whetton (1992); Anderson, Muir, Bateson, Blackmore & Rogers (1990); Csikszentmihalyi (1990); *Levine & Lezotte (1990); LaRoque & Coleman (1989); Hillocks (1987); Willingham & Morris (1986); Resnick & Resnick (1985); Ogle & Fritts (1984); *Natriello & Dornbusch (1984); Brooke & Oxenham (1984); Rentz (1979); Wellisch, MacQueen, Carriere & Dick (1978); *Rosswork (1977); Estes, Colvin & Goodwin (1976); Wood (1953); and Panlasigui & Knight (1930). | ||
43 | Girlie C. Delacruz | "Opportunities for student use of rubrics to improve learning appears logical, although only a few studies have examined this idea directly." | Dismissive | Impact of Incentives on the Use of Feedback in Educational Videogames | CRESST Report 813, March, 2012, p.3 | https://cresst.org/wp-content/uploads/R813.pdf | Institute of Education Sciences, US Education Department | Relevant pre-2000 studies of the effects of standards, alignment, goal setting, setting reachable goals, etc. include those of Mitchell (1999); Morgan & Ramist (1998); the *Southern Regional Education Board (1998); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); the Florida Office of Program Policy Analysis (1997); Pomplun (1997); Schmoker (1996); Aguilera & Hendricks (1996); Banta, Lund, Black & Oblander (1996); Bottoms & Mikos (1995); *Bamburg & Medina (1993); Bishop (1993); the U. S. General Accounting Office (1993); Eckstein & Noah (1993); Mattsson (1993); Brown (1992); Heyneman & Ransom (1992); Whetton (1992); Anderson, Muir, Bateson, Blackmore & Rogers (1990); Csikszentmihalyi (1990); *Levine & Lezotte (1990); LaRoque & Coleman (1989); Hillocks (1987); Willingham & Morris (1986); Resnick & Resnick (1985); Ogle & Fritts (1984); *Natriello & Dornbusch (1984); Brooke & Oxenham (1984); Rentz (1979); Wellisch, MacQueen, Carriere & Dick (1978); *Rosswork (1977); Estes, Colvin & Goodwin (1976); Wood (1953); and Panlasigui & Knight (1930). | |||
44 | Jinok Kim | "Though we can find many such statistics in various reports, few have dealt with comparisons across students reclassified in various grade levels. Lack of such studies may be in part due to the difficulty in defining who are reclassified students as well as when they are reclassified." | Dismissive | Relationships among and between ELL status, demographic characteristics, enrollment history, and school persistence | CRESST Report 810, December, 2011, p.6 | https://cresst.org/wp-content/uploads/R810.pdf | "The work reported herein was supported under the National Research and Development Centers, PR/Award Number R305A090581, as administered by the U.S. Department of Education, Institute of Education Sciences with funding to the National Center for Research on Evaluation, Standards, and Student Testing (CRESST)." ||||
45 | Joan Herman | 4 others | "While the challenge of teachers’ content-pedagogical knowledge has been documented (Heritage et al., 2009; Heritage, Jones & White, 2010; Herman et al., 2010), few studies have examined the relationship between such knowledge and teachers’ assessment practices, nor examined how teachers’ knowledge may moderate the relationship between assessment practices and student learning." | Dismissive | Relationships between Teacher Knowledge, Assessment Practice, and Learning: Chicken, Egg, or Omelet? | CRESST Report 809, November 2011 | http://cresst.org/wp-content/uploads/R809.pdf | Institute of Education Sciences, US Education Department | See, for example, https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/SurveyList.htm ; https://nonpartisaneducation.org/Review/Resources/QualitativeList.htm | ||
46 | Lorrie A. Shepard | Kristen L. Davidson, Richard Bowman | "Although some instruments, such as the Northwest Evaluation Association’s (NWEA) Measures of Academic Progress (MAP®), have been around for decades, few studies have been conducted to examine the technical adequacy of interim assessments or to evaluate their effects on teaching and student learning." | Dismissive | How Middle-School Mathematics Teachers Use Interim and Benchmark Assessment Data, p.2 | CRESST Report 807, October 2011 | http://cresst.org/wp-content/uploads/R807.pdf | Institute of Education Sciences, US Education Department | Studies of formative testing date back a century, and the evidence, on average, is strongly positive, which is not the result favored by CRESST, so they declare the studies nonexistent. See, for example, https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/QuantitativeList.htm ; https://nonpartisaneducation.org/Review/Resources/SurveyList.htm ; https://nonpartisaneducation.org/Review/Resources/QualitativeList.htm | ||
47 | Kristen L. Davidson | Greta Frohbieter | "Yet, districts’ processes to this end [of adopting interim or benchmark assessments] have been largely unexamined (Bulkley et al.; Mandinach et al.; Young & Kim)." | Dismissive | District Adoption and Implementation of Interim and Benchmark Assessments, p.2 | CRESST Report 806, September 2011 | https://eric.ed.gov/?id=ED525098 | Institute of Education Sciences, US Education Department | Studies of formative testing date back a century, and the evidence, on average, is strongly positive, which is not the result favored by CRESST, so they declare the studies nonexistent. See, for example, https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/QuantitativeList.htm ; https://nonpartisaneducation.org/Review/Resources/SurveyList.htm ; https://nonpartisaneducation.org/Review/Resources/QualitativeList.htm | ||
48 | Kristen L. Davidson | Greta Frohbieter | "As noted above, district processes with regard to interim assessment adoption and implementation remain largely uninvestigated. A review of the few relevant studies, however, reveals..." | Dismissive | District Adoption and Implementation of Interim and Benchmark Assessments, p.4 | CRESST Report 806, September 2011 | https://eric.ed.gov/?id=ED525098 | Institute of Education Sciences, US Education Department | Studies of formative testing date back a century, and the evidence, on average, is strongly positive, which is not the result favored by CRESST, so they declare the studies nonexistent. See, for example, https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/QuantitativeList.htm ; https://nonpartisaneducation.org/Review/Resources/SurveyList.htm ; https://nonpartisaneducation.org/Review/Resources/QualitativeList.htm | ||
49 | Marguerite Clarke | “The evidence base is stronger in some areas than in others. For example, there are many professional standards for assessment quality that can be applied to classroom assessments, examinations, and large-scale assessments (APA, AERA, and NCME, 1999), but less professional or empirical research on enabling contexts.” p. 20 | Dismissive | Framework for Building an Effective Student Assessment System | World Bank, READ/SABER Working Paper, Aug. 2011 | http://files.eric.ed.gov/fulltext/ED553178.pdf | World Bank funders | No matter that there exist hundreds of other countries, a century's worth of research prior to 2010, literally thousands of other journals that might publish such an article, and a large "grey literature" of alignment studies conducted as routine parts of test development. Virtually any standards-based, large-scale test development includes an alignment study, not to be found in a scholarly journal. Some notable alignment studies: with NRTs: Freeman, Kuhs, Porter, Floden, Schmidt, Schwille (1983); Debra P. v. Turlington (1984); Cohen, Spillane (1993); La Marca, Redfield, Winter, Bailey, and Despriet (2000); Wainer (2011); with Standards: Archbald (1994); Buckendahl, Plake, Impara, Irwin (2000); Bhola, Impara, Buckendahl (2003); Phelps (2005); with RTs: Massell, Kirst, Hoppe (1997); Wiley, Hembry, Buckendahl, Forte, Towles Nebelsick-Gullett (2015) ||||
50 | Marguerite Clarke | “Data for some of these indicator areas can be found in official documents, published reports (for example, Ferrer, 2006), research articles (for example, Braun and Kanjee, 2005), and online databases. For the most part, data have not been gathered in any comprehensive or systematic fashion. Those wishing to review this type of information for a particular assessment system will most likely need to collect the data themselves.” p. 21 | Denigrating | Framework for Building an Effective Student Assessment System | World Bank, READ/SABER Working Paper, Aug. 2011 | http://files.eric.ed.gov/fulltext/ED553178.pdf | World Bank funders | No matter that there exist hundreds of other countries, a century's worth of research prior to 2010, literally thousands of other journals that might publish such an article, and a large "grey literature" of alignment studies conducted as routine parts of test development. Virtually any standards-based, large-scale test development includes an alignment study, not to be found in a scholarly journal. Some notable alignment studies: with NRTs: Freeman, Kuhs, Porter, Floden, Schmidt, Schwille (1983); Debra P. v. Turlington (1984); Cohen, Spillane (1993); La Marca, Redfield, Winter, Bailey, and Despriet (2000); Wainer (2011); with Standards: Archbald (1994); Buckendahl, Plake, Impara, Irwin (2000); Bhola, Impara, Buckendahl (2003); Phelps (2005); with RTs: Massell, Kirst, Hoppe (1997); Wiley, Hembry, Buckendahl, Forte, Towles Nebelsick-Gullett (2015) ||||
51 | Marguerite Clarke | “This paper has extracted principles and guidelines from countries’ experiences and the current research base to outline a framework for developing a more effective student assessment system. The framework provides policy makers and others with a structure for discussion and consensus building around priorities and key inputs for their assessment system.” p. 27 | 1stness | Framework for Building an Effective Student Assessment System | World Bank, READ/SABER Working Paper, Aug. 2011 | http://files.eric.ed.gov/fulltext/ED553178.pdf | World Bank funders | No matter that there exist hundreds of other countries, a century's worth of research prior to 2010, literally thousands of other journals that might publish such an article, and a large "grey literature" of alignment studies conducted as routine parts of test development. Virtually any standards-based, large-scale test development includes an alignment study, not to be found in a scholarly journal. Some notable alignment studies: with NRTs: Freeman, Kuhs, Porter, Floden, Schmidt, Schwille (1983); Debra P. v. Turlington (1984); Cohen, Spillane (1993); La Marca, Redfield, Winter, Bailey, and Despriet (2000); Wainer (2011); with Standards: Archbald (1994); Buckendahl, Plake, Impara, Irwin (2000); Bhola, Impara, Buckendahl (2003); Phelps (2005); with RTs: Massell, Kirst, Hoppe (1997); Wiley, Hembry, Buckendahl, Forte, Towles Nebelsick-Gullett (2015) ||||
52 | Michael Hout, Stuart W. Elliott, Editors | "Unfortunately, there were no other studies available that would have allowed us to contrast the overall effect of state incentive programs predating NCLB…" p. 4-6 | Dismissive | Incentives and Test-Based Accountability in Education, 2011 | Board on Testing and Assessment, National Research Council | https://www.nap.edu/catalog/12521/incentives-and-test-based-accountability-in-education | National Research Council funders | Relevant studies of the effects of varying types of incentive or the optimal structure of incentives include those of Kelley (1999); the *Southern Regional Education Board (1998); Trelfa (1998); Heneman (1998); Banta, Lund, Black & Oblander (1996); Brooks-Cooper (1993); Eckstein & Noah (1993); Richards & Shen (1992); Jacobson (1992); Heyneman & Ransom (1992); *Levine & Lezotte (1990); Duran (1989); *Crooks (1988); *Kulik & Kulik (1987); Corcoran & Wilson (1986); *Guskey & Gates (1986); Brook & Oxenham (1985); Oxenham (1984); Venezky & Winfield (1979); Brookover & Lezotte (1979); McMillan (1977); Abbott (1977); *Staats (1973); *Kazdin & Bootzin (1972); *O’Leary & Drabman (1971); Cronbach (1960); and Hurlock (1925). *Covers many studies; study is a research review, research synthesis, or meta-analysis. Other researchers who, prior to 2000, studied test-based incentive programs include Homme, Csanyi, Gonzales, Rechs, O’Leary, Drabman, Kazdin, Bootzin, Staats, Cameron, Pierce, McMillan, Corcoran, Roueche, Kirk, Wheeler, Boylan, and Wilson. | "Others have considered the role of tests in incentive programs. These researchers have included Homme, Csanyi, Gonzales, Rechs, O’Leary, Drabman, Kazdin, Bootzin, Staats, Cameron, Pierce, McMillan, Corcoran, and Wilson. International organizations, such as the World Bank or the Asian Development Bank, have studied the effects of testing on education programs they sponsor. Researchers have included Somerset, Heyneman, Ransom, Psacharopoulos, Velez, Brooke, Oxenham, Bude, Chapman, Snyder, and Pronaratna. Moreover, the mastery learning/mastery testing experiments conducted from the 1960s through today varied incentives, frequency of tests, types of tests, and many other factors to determine the optimal structure of testing programs. Researchers included such notables as Bloom, Carroll, Keller, Block, Burns, Wentling, Anderson, Hymel, Kulik, Tierney, Cross, Okey, Guskey, Gates, and Jones." | What about: Brooks-Cooper, C. (1993), Brown, S. M. & Walberg, H. J. (1993), Heneman, H. G., III. (1998), Hurlock, E. B. (1925), Jones, J. et al. (1996), Kazdin, A. & Bootzin, R. (1972), Kelley, C. (1999), Kirkpatrick, J. E. (1934), O’Leary, K. D. & Drabman, R. (1971), Palmer, J. S. (2002), Richards, C. E. & Shen, T. M. (1992), Rosswork, S. G. (1977), Staats, A. (1973), Tuckman, B. W. (1994), Tuckman, B. W. & Trimble, S. (1997), Webster, W. J., Mendro, R. L., Orsack, T., Weerasinghe, D. & Bembry, K. (1997) ||
53 | Michael Hout, Stuart W. Elliot, Editors | "Test-based incentive programs, as designed and implemented in the programs that have been carefully studied have not increased student achievement enough to bring the United States close to the levels of the highest achieving countries.", p. 4-26 | Denigrating | Incentives and Test-Based Accountability in Education, 2011 | Board on Testing and Assessment, National Research Council | https://www.nap.edu/catalog/12521/incentives-and-test-based-accountability-in-education | National Research Council funders | Relevant studies of the effects of varying types of incentive or the optimal structure of incentives include those of Kelley (1999); the *Southern Regional Education Board (1998); Trelfa (1998); Heneman (1998); Banta, Lund, Black & Oblander (1996); Brooks-Cooper, 1993; Eckstein & Noah (1993); Richards & Shen (1992); Jacobson (1992); Heyneman & Ransom (1992); *Levine & Lezotte (1990); Duran, 1989; *Crooks (1988); *Kulik & Kulik (1987); Corcoran & Wilson (1986); *Guskey & Gates (1986); Brook & Oxenham (1985); Oxenham (1984); Venezky & Winfield (1979); Brookover & Lezotte (1979); McMillan (1977); Abbott (1977); *Staats (1973); *Kazdin & Bootzin (1972); *O’Leary & Drabman (1971); Cronbach (1960); and Hurlock (1925). *Covers many studies; study is a research review, research synthesis, or meta-analysis. Other researchers who, prior to 2000, studied test-based incentive programs include Homme, Csanyi, Gonzales, Rechs, O’Leary, Drabman, Kaszdin, Bootzin, Staats, Cameron, Pierce, McMillan, Corcoran, Roueche, Kirk, Wheeler, Boylan, and Wilson. | Others have
considered the role of tests in incentive programs. These researchers have included Homme,
Csanyi, Gonzales, Rechs, O’Leary, Drabman, Kaszdin, Bootzin, Staats, Cameron,
Pierce, McMillan, Corcoran, and Wilson. International organizations, such as
the World Bank or the Asian Development Bank, have studied the effects of
testing on education programs they sponsor.
Researchers have included Somerset, Heynemann, Ransom, Psacharopoulis,
Velez, Brooke, Oxenham, Bude, Chapman, Snyder, and Pronaratna. Moreover, the mastery learning/mastery testing experiments conducted from the 1960s through today varied incentives, frequency of tests, types of tests, and many other factors to determine the optimal structure of testing programs. Researchers included such notables as Bloom, Carroll, Keller, Block, Burns, Wentling, Anderson, Hymel, Kulik, Tierney, Cross, Okey, Guskey, Gates, and Jones. |
What about: Brooks-Cooper, C. (1993), Brown, S. M.
& Walberg, H. J. (1993), Heneman, H. G., III. (1998), Hurlock, E. B.
(1925), Jones, J. et al. (1996), Kazdin, A. & Bootzin, R. (1972), Kelley, C. (1999), Kirkpatrick, J. E. (1934), O’Leary, K. D. & Drabman, R. (1971), Palmer, J. S. (2002), Richards, C. E. & Shen, T. M. (1992), Rosswork, S. G. (1977), Staats, A. (1973), Tuckman, B. W. (1994), Tuckman, B. W. & Trimble, S. (1997), Webster, W. J., Mendro, R. L., Orsack, T., Weerasinghe, D. & Bembry, K. (1997) |
|
54 | Michael Hout, Stuart W. Elliott, Editors | "Despite using them for several decades, policymakers and educators do not yet know how to use test-based incentives to consistently generate positive effects on achievement and to improve education." p. 5-1 | Dismissive | Incentives and Test-Based Accountability in Education, 2011 | Board on Testing and Assessment, National Research Council | https://www.nap.edu/catalog/12521/incentives-and-test-based-accountability-in-education | National Research Council funders | Relevant studies of the effects of varying types of incentive or the optimal structure of incentives include those of Kelley (1999); the *Southern Regional Education Board (1998); Trelfa (1998); Heneman (1998); Banta, Lund, Black & Oblander (1996); Brooks-Cooper (1993); Eckstein & Noah (1993); Richards & Shen (1992); Jacobson (1992); Heyneman & Ransom (1992); *Levine & Lezotte (1990); Duran (1989); *Crooks (1988); *Kulik & Kulik (1987); Corcoran & Wilson (1986); *Guskey & Gates (1986); Brook & Oxenham (1985); Oxenham (1984); Venezky & Winfield (1979); Brookover & Lezotte (1979); McMillan (1977); Abbott (1977); *Staats (1973); *Kazdin & Bootzin (1972); *O’Leary & Drabman (1971); Cronbach (1960); and Hurlock (1925). *Covers many studies; study is a research review, research synthesis, or meta-analysis. Other researchers who, prior to 2000, studied test-based incentive programs include Homme, Csanyi, Gonzales, Rechs, O’Leary, Drabman, Kazdin, Bootzin, Staats, Cameron, Pierce, McMillan, Corcoran, Roueche, Kirk, Wheeler, Boylan, and Wilson. | "Others have
considered the role of tests in incentive programs. These researchers have included Homme,
Csanyi, Gonzales, Rechs, O’Leary, Drabman, Kazdin, Bootzin, Staats, Cameron,
Pierce, McMillan, Corcoran, and Wilson. International organizations, such as
the World Bank or the Asian Development Bank, have studied the effects of
testing on education programs they sponsor.
Researchers have included Somerset, Heyneman, Ransom, Psacharopoulos,
Velez, Brooke, Oxenham, Bude, Chapman, Snyder, and Pronaratna. Moreover, the mastery learning/mastery testing experiments conducted from the 1960s through today varied incentives, frequency of tests, types of tests, and many other factors to determine the optimal structure of testing programs. Researchers included such notables as Bloom, Carroll, Keller, Block, Burns, Wentling, Anderson, Hymel, Kulik, Tierney, Cross, Okey, Guskey, Gates, and Jones." |
What about: Brooks-Cooper, C. (1993), Brown, S. M.
& Walberg, H. J. (1993), Heneman, H. G., III. (1998), Hurlock, E. B.
(1925), Jones, J. et al. (1996), Kazdin, A. & Bootzin, R. (1972), Kelley, C. (1999), Kirkpatrick, J. E. (1934), O’Leary, K. D. & Drabman, R. (1971), Palmer, J. S. (2002), Richards, C. E. & Shen, T. M. (1992), Rosswork, S. G. (1977), Staats, A. (1973), Tuckman, B. W. (1994), Tuckman, B. W. & Trimble, S. (1997), Webster, W. J., Mendro, R. L., Orsack, T., Weerasinghe, D. & Bembry, K. (1997) |
|
55 | Michael Hout, Stuart W. Elliott, Editors | "The general lack of guidance coming from existing studies of test-based incentive programs in education…" | Dismissive | Incentives and Test-Based Accountability in Education, 2011 | Board on Testing and Assessment, National Research Council | https://www.nap.edu/catalog/12521/incentives-and-test-based-accountability-in-education | National Research Council funders | Relevant studies of the effects of varying types of incentive or the optimal structure of incentives include those of Kelley (1999); the *Southern Regional Education Board (1998); Trelfa (1998); Heneman (1998); Banta, Lund, Black & Oblander (1996); Brooks-Cooper (1993); Eckstein & Noah (1993); Richards & Shen (1992); Jacobson (1992); Heyneman & Ransom (1992); *Levine & Lezotte (1990); Duran (1989); *Crooks (1988); *Kulik & Kulik (1987); Corcoran & Wilson (1986); *Guskey & Gates (1986); Brook & Oxenham (1985); Oxenham (1984); Venezky & Winfield (1979); Brookover & Lezotte (1979); McMillan (1977); Abbott (1977); *Staats (1973); *Kazdin & Bootzin (1972); *O’Leary & Drabman (1971); Cronbach (1960); and Hurlock (1925). *Covers many studies; study is a research review, research synthesis, or meta-analysis. Other researchers who, prior to 2000, studied test-based incentive programs include Homme, Csanyi, Gonzales, Rechs, O’Leary, Drabman, Kazdin, Bootzin, Staats, Cameron, Pierce, McMillan, Corcoran, Roueche, Kirk, Wheeler, Boylan, and Wilson. | "Others have
considered the role of tests in incentive programs. These researchers have included Homme,
Csanyi, Gonzales, Rechs, O’Leary, Drabman, Kazdin, Bootzin, Staats, Cameron,
Pierce, McMillan, Corcoran, and Wilson. International organizations, such as
the World Bank or the Asian Development Bank, have studied the effects of
testing on education programs they sponsor.
Researchers have included Somerset, Heyneman, Ransom, Psacharopoulos,
Velez, Brooke, Oxenham, Bude, Chapman, Snyder, and Pronaratna. Moreover, the mastery learning/mastery testing experiments conducted from the 1960s through today varied incentives, frequency of tests, types of tests, and many other factors to determine the optimal structure of testing programs. Researchers included such notables as Bloom, Carroll, Keller, Block, Burns, Wentling, Anderson, Hymel, Kulik, Tierney, Cross, Okey, Guskey, Gates, and Jones." |
What about: Brooks-Cooper, C. (1993), Brown, S. M.
& Walberg, H. J. (1993), Heneman, H. G., III. (1998), Hurlock, E. B.
(1925), Jones, J. et al. (1996), Kazdin, A. & Bootzin, R. (1972), Kelley, C. (1999), Kirkpatrick, J. E. (1934), O’Leary, K. D. & Drabman, R. (1971), Palmer, J. S. (2002), Richards, C. E. & Shen, T. M. (1992), Rosswork, S. G. (1977), Staats, A. (1973), Tuckman, B. W. (1994), Tuckman, B. W. & Trimble, S. (1997), Webster, W. J., Mendro, R. L., Orsack, T., Weerasinghe, D. & Bembry, K. (1997) |
|
56 | Eva L. Baker | "At the same time that interest in alternative assessment is high, our knowledge about the design, distribution, quality and impact of such efforts is low. This is a time of tingling metaphor, cottage industry, and existence proofs rather than carefully designed research and development." p.2 | Dismissive, Denigrating | What Probably Works in Alternative Assessment, July 2010 | CRESST Report 772 | Institute of Education Sciences, US Education Department | It is selected-response item formats (e.g., multiple choice) that are new. Open-ended item formats have been with us for millennia. And thousands of research, evaluation, and validity studies have been conducted on them. See, for example, https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/QuantitativeList.htm ; https://nonpartisaneducation.org/Review/Resources/SurveyList.htm ; https://nonpartisaneducation.org/Review/Resources/QualitativeList.htm | ||||
57 | Eva L. Baker | "Moreover, because psychometric methods appropriate for dealing with such new measures are not readily available, nor even a matter of common agreement, no clear templates exist to guide the technical practices of alternative assessment developers (Linn, Baker, Dunbar, 1991)." p.2 | Dismissive | What Probably Works in Alternative Assessment, July 2010 | CRESST Report 772 | Institute of Education Sciences, US Education Department | It is selected-response item formats (e.g., multiple choice) that are new. Open-ended item formats have been with us for millennia. And thousands of research, evaluation, and validity studies have been conducted on them. See, for example, https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/QuantitativeList.htm ; https://nonpartisaneducation.org/Review/Resources/SurveyList.htm ; https://nonpartisaneducation.org/Review/Resources/QualitativeList.htm | ||||
58 | Eva L. Baker | "Given that the level of empirical work is so obviously low, one well might wonder what these studies are about. Some studies argue for new approaches to achievement testing." p.3 | Denigrating | What Probably Works in Alternative Assessment, July 2010 | CRESST Report 772 | Institute of Education Sciences, US Education Department | She looked in two databases -- ERIC and NTIS -- and then implied she had looked everywhere. | ||||
59 | Eva L. Baker | "Despite this fragile research base, alternative assessment has already taken off. What issues can we anticipate being raised by relevant communities about the value of these efforts?" p.6 | Dismissive, Denigrating | What Probably Works in Alternative Assessment, July 2010 | CRESST Report 772 | Institute of Education Sciences, US Education Department | She looked in two databases -- ERIC and NTIS -- and then implied she had looked everywhere. | ||||
60 | Lawrence O. Picus |
Frank Adamson William Montague Margaret Owens |
"As in the earlier studies, efforts are made to distinguish between the concept of economic or opportunity costs (i.e., the use of teacher time that is already “paid for” through the contract and used as part of the assessment process rather then for some other activity or function), and the direct expenditures made for assessment." p.1 | Dismissive | A New Conceptual Framework for Analyzing the Costs of Performance Assessment, 2010 | The Stanford Center for Opportunity Policy in Education (SCOPE) | https://edpolicy.stanford.edu/sites/default/files/publications/new-conceptual-framework-analyzing-costs-performance-assessment_0.pdf | SCOPE funders (1) | For at least two decades, Larry Picus has elevated the trivial and elemental difference between expenditure and cost to the level of heavenly revelation. As any beginning undergraduate in economics knows, expenditures -- particularly budgetary line-item expenditures -- don't necessarily equal the cost of an item or activity. The classifications of the amounts might or might not match. Picus needled the trivial point over and over for decades. Meanwhile, my project on testing costs at the GAO (1991-1993) was a cost study in every sense that Picus identified for the term, but the word "expenditures" was in the title of the report. So, when Picus repeated and repeated that most studies on the topic prior to his were "just expenditure studies" (and not really "cost" studies), there was the GAO report, one of the few cost studies done prior to his, with the word expenditure in its title. The ploy worked, and many were convinced then, and still today, that my work at the GAO relied on budgetary line-item expenditure data (it didn't), neglected to include the cost of personnel time (it did include those costs), or was otherwise suspect, an inferior study. Picus and CRESST managed to denigrate into oblivion a taxpayer-funded study that was vastly superior to any he would ever do. | ||
61 | Lawrence O. Picus |
Frank Adamson William Montague Margaret Owens |
"Determining the resources necessary to achieve each of these goals is, at best, a complex task. Because of this difficulty, many analysts stop short of estimating the true costs of a program, and instead focus on the expenditures required for its implementation." p.7 | Dismissive | A New Conceptual Framework for Analyzing the Costs of Performance Assessment, 2010 | The Stanford Center for Opportunity Policy in Education (SCOPE) | https://edpolicy.stanford.edu/sites/default/files/publications/new-conceptual-framework-analyzing-costs-performance-assessment_0.pdf | SCOPE funders (1) | For at least two decades, Larry Picus has elevated the trivial and elemental difference between expenditure and cost to the level of heavenly revelation. As any beginning undergraduate in economics knows, expenditures -- particularly budgetary line-item expenditures -- don't necessarily equal the cost of an item or activity. The classifications of the amounts might or might not match. Picus needled the trivial point over and over for decades. Meanwhile, my project on testing costs at the GAO (1991-1993) was a cost study in every sense that Picus identified for the term, but the word "expenditures" was in the title of the report. So, when Picus repeated and repeated that most studies on the topic prior to his were "just expenditure studies" (and not really "cost" studies), there was the GAO report, one of the few cost studies done prior to his, with the word expenditure in its title. The ploy worked, and many were convinced then, and still today, that my work at the GAO relied on budgetary line-item expenditure data (it didn't), neglected to include the cost of personnel time (it did include those costs), or was otherwise suspect, an inferior study. Picus and CRESST managed to denigrate into oblivion a taxpayer-funded study that was vastly superior to any he would ever do. | ||
62 | Lawrence O. Picus |
Frank Adamson William Montague Margaret Owens |
"The study defined purchase cost as the money spent on test-related goods and services, a category in line with what we call expenditures (U.S. GAO, 1993)." p.21 | Denigrating | A New Conceptual Framework for Analyzing the Costs of Performance Assessment, 2010 | The Stanford Center for Opportunity Policy in Education (SCOPE) | https://edpolicy.stanford.edu/sites/default/files/publications/new-conceptual-framework-analyzing-costs-performance-assessment_0.pdf | SCOPE funders (1) | For at least two decades, Larry Picus has elevated the trivial and elemental difference between expenditure and cost to the level of heavenly revelation. As any beginning undergraduate in economics knows, expenditures -- particularly budgetary line-item expenditures -- don't necessarily equal the cost of an item or activity. The classifications of the amounts might or might not match. Picus needled the trivial point over and over for decades. Meanwhile, my project on testing costs at the GAO (1991-1993) was a cost study in every sense that Picus identified for the term, but the word "expenditures" was in the title of the report. So, when Picus repeated and repeated that most studies on the topic prior to his were "just expenditure studies" (and not really "cost" studies), there was the GAO report, one of the few cost studies done prior to his, with the word expenditure in its title. The ploy worked, and many were convinced then, and still today, that my work at the GAO relied on budgetary line-item expenditure data (it didn't), neglected to include the cost of personnel time (it did include those costs), or was otherwise suspect, an inferior study. Picus and CRESST managed to denigrate into oblivion a taxpayer-funded study that was vastly superior to any he would ever do. | ||
63 | Lawrence O. Picus |
Frank Adamson William Montague Margaret Owens |
"Unfortunately, aggregating these different types of time disguises important differences between them that, in fairness to the GAO, have emerged in the NCLB era as more important considerations than in previous decades. Specifically, test-preparation time for students has become a subject of national debate about how much class time teachers spend 'teaching to the test.'" p.21 | Denigrating | A New Conceptual Framework for Analyzing the Costs of Performance Assessment, 2010 | The Stanford Center for Opportunity Policy in Education (SCOPE) | https://edpolicy.stanford.edu/sites/default/files/publications/new-conceptual-framework-analyzing-costs-performance-assessment_0.pdf | SCOPE funders (1) | I continued to publish articles and made presentations based on the GAO project for several years after I left the GAO. These publications reported the disaggregated costs and estimated benefits. Indeed, I published a net benefit (i.e., benefit/cost) study in the Journal of Education Finance ten years prior to this Picus article. Almost certainly he knows about it -- he has served as editor or on the editorial board for that journal for many years. In this report of his for SCOPE, my name is never mentioned, nor are any of my many publications or presentations related to the costs and benefits of testing. | ||
64 | Lawrence O. Picus |
Frank Adamson William Montague Margaret Owens |
"In its analysis, the GAO does provide aggregate time estimates. However, it does not provide disaggregated estimates of teacher time, nor estimated benefits in terms of either teacher PD or improved student learning." p.21 | Denigrating | A New Conceptual Framework for Analyzing the Costs of Performance Assessment, 2010 | The Stanford Center for Opportunity Policy in Education (SCOPE) | https://edpolicy.stanford.edu/sites/default/files/publications/new-conceptual-framework-analyzing-costs-performance-assessment_0.pdf | SCOPE funders (1) | I continued to publish articles and made presentations based on the GAO project for several years after I left the GAO. These publications reported the disaggregated costs and estimated benefits. Indeed, I published a net benefit (i.e., benefit/cost) study in the Journal of Education Finance ten years prior to this Picus article. Almost certainly he knows about it -- he has served as editor or on the editorial board for that journal for many years. In this report of his for SCOPE, my name is never mentioned, nor are any of my many publications or presentations related to the costs and benefits of testing. | ||
65 | Lawrence O. Picus |
Frank Adamson William Montague Margaret Owens |
"The performance assessments studied by the GAO also do not demonstrate much variety. Most included only writing samples, reading comprehension and response, and math/science problem-solving items. A few districts used science lab work, group work, and skills observations, but most still relied on paper-and-pencil testing (U.S. GAO, 1993)." p.21 | Denigrating | A New Conceptual Framework for Analyzing the Costs of Performance Assessment, 2010 | The Stanford Center for Opportunity Policy in Education (SCOPE) | https://edpolicy.stanford.edu/sites/default/files/publications/new-conceptual-framework-analyzing-costs-performance-assessment_0.pdf | SCOPE funders (1) | Picus neglects to mention that the GAO collected data from the universe of states with testing programs and a very large, representative sample (> 660) of public school districts. We collected all the data on all the systemwide testing occurring at the time. We oversampled districts in certain states, such as Maryland, the one state at the time with the most elaborate performance test types. In doing that, we did more than he ever did in his couple of state studies. Yet, as usual, he implies that the GAO study or my work must have left out something important. | ||
66 | Lawrence O. Picus |
Frank Adamson William Montague Margaret Owens |
"In every instance, test developers crafting the performance-based tests started from scratch, writing test questions that fit the state’s curriculum or guidelines, then testing the draft on pilot groups of students and using an iterative revision process that did not involve state curriculum, which was undergoing simultaneous development (U.S. GAO, 1993)." p.22 | Denigrating | A New Conceptual Framework for Analyzing the Costs of Performance Assessment, 2010 | The Stanford Center for Opportunity Policy in Education (SCOPE) | https://edpolicy.stanford.edu/sites/default/files/publications/new-conceptual-framework-analyzing-costs-performance-assessment_0.pdf | SCOPE funders (1) | This sentence doesn't make sense, and because he doesn't include page numbers in his citations, it is not even possible to locate the text he may have misunderstood. Within one sentence, Picus claims that test items were based on established content standards, but then not based on them, because they didn't yet exist. The latter point is certainly not true. When standards-based tests are developed, the content standards are completed first, and the test items are written directly from them. | ||
67 | Joan L. Herman |
Ellen Osmundson, David Silver | "These indeed are promising developments for pushing formative assessment to fruition in classroom practice. They acknowledge and work toward remedying the need for classroom tools to assess and support student learning. Yet at the same time, recent studies reveal challenges in implementing quality formative assessment and show non-robust results with regard to effects on student learning (Herman, Osmundson, Ayala, Schneider, & Timms, 2006; Furtak, et al., 2008)." | Dismissive, Denigrating | Capturing Quality in Formative Assessment Practice: Measurement Challenges, p.2 | CRESST Report 770, June 2010 | https://eric.ed.gov/?id=ED512648 | Institute of Education Sciences, US Education Department | Studies of formative testing date back a century, and the evidence, on average, is strongly positive, which is not the result favored by CRESST, so they declare the studies nonexistent. See, for example, https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/QuantitativeList.htm ; https://nonpartisaneducation.org/Review/Resources/SurveyList.htm ; https://nonpartisaneducation.org/Review/Resources/QualitativeList.htm | ||
68 | Joan L. Herman |
Ellen Osmundson, David Silver | "Just as the concept of formative assessment itself underscores the central role of evidence—learning data—in an effective teaching and learning process, so too do policymakers and practitioners need evidence on which to build effective formative practices. Toward this latter goal, this report explores ..." | 1stness | Capturing Quality in Formative Assessment Practice: Measurement Challenges, p.2 | CRESST Report 770, June 2010 | https://eric.ed.gov/?id=ED512648 | Institute of Education Sciences, US Education Department | Studies of formative testing date back a century, and the evidence, on average, is strongly positive, which is not the result favored by CRESST, so they declare the studies nonexistent. See, for example, https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/QuantitativeList.htm ; https://nonpartisaneducation.org/Review/Resources/SurveyList.htm ; https://nonpartisaneducation.org/Review/Resources/QualitativeList.htm | ||
69 | Diana Pullin (Chair) |
Joan Herman, Scott Marion, Dirk Mattson, Rebecca Maynard, Mark Wilson, | "However, there have been very few studies of how interim assessments are actually used by individual teachers in classrooms, by principals, and by districts or of their impact on student achievement." p. 6 | Dismissive | Best Practices for State Assessment Systems, Part I | Committee on Best Practices for State Assessment Systems: Improving Assessment While Revisiting Standards; Center for Education; Division of Behavioral and Social Sciences and Education; National Research Council | https://www.nap.edu/catalog/12906/best-practices-for-state-assessment-systems-part-i-summary-of | "With funding from the James B. Hunt, Jr. Institute for Educational Leadership and Policy, as well as additional support from the Bill & Melinda Gates Foundation and the Stupski Foundation, the National Research Council (NRC) planned two workshops designed to explore some of the possibilities for state assessment systems." | |||
70 | Diana Pullin (Chair) |
Joan Herman, Scott Marion, Dirk Mattson, Rebecca Maynard, Mark Wilson, | "Research indicates that the result has been emphasis on
lower-level knowledge and skills and very thin alignment with the standards.
For example, Porter, Polikoff, and Smithson (2009) found very low to
moderate alignment between state assessments and standards—meaning that large proportions of content standards are not covered on the assessments (see also Fuller et al., 2006; Ho, 2008)." p. 10 |
Denigrating | Best Practices for State Assessment Systems, Part I | Committee on Best Practices for State Assessment Systems: Improving Assessment While Revisiting Standards; Center for Education; Division of Behavioral and Social Sciences and Education; National Research Council | https://www.nap.edu/catalog/12906/best-practices-for-state-assessment-systems-part-i-summary-of | "With funding from the James B. Hunt, Jr. Institute for Educational Leadership and Policy, as well as additional support from the Bill & Melinda Gates Foundation and the Stupski Foundation, the National Research Council (NRC) planned two workshops designed to explore some of the possibilities for state assessment systems." | |||
71 | Diana Pullin (Chair) |
Joan Herman, Scott Marion, Dirk Mattson, Rebecca Maynard, Mark Wilson, | "Another issue is that the implications of computer-based approaches for validity and reliability have not been thoroughly evaluated." p. 40 | Dismissive | Best Practices for State Assessment Systems, Part I | Committee on Best Practices for State Assessment Systems: Improving Assessment While Revisiting Standards; Center for Education; Division of Behavioral and Social Sciences and Education; National Research Council | https://www.nap.edu/catalog/12906/best-practices-for-state-assessment-systems-part-i-summary-of | "With funding from the James B. Hunt, Jr. Institute for Educational Leadership and Policy, as well as additional support from the Bill & Melinda Gates Foundation and the Stupski Foundation, the National Research Council (NRC) planned two workshops designed to explore some of the possibilities for state assessment systems." | |||
72 | Diana Pullin (Chair) |
Joan Herman, Scott Marion, Dirk Mattson, Rebecca Maynard, Mark Wilson, | "For current tests, he [Lauress Wise] observed, there is little evidence that they are good indicators of instructional effectiveness or good predictors of students’ readiness for subsequent levels of instruction." | Dismissive | Best Practices for State Assessment Systems, Part I | Committee on Best Practices for State Assessment Systems: Improving Assessment While Revisiting Standards; Center for Education; Division of Behavioral and Social Sciences and Education; National Research Council | https://www.nap.edu/catalog/12906/best-practices-for-state-assessment-systems-part-i-summary-of | "With funding from the James B. Hunt, Jr. Institute for Educational Leadership and Policy, as well as additional support from the Bill & Melinda Gates Foundation and the Stupski Foundation, the National Research Council (NRC) planned two workshops designed to explore some of the possibilities for state assessment systems." | |||
73 | Laura S. Hamilton | Brian M. Stecher, Kun Yuan | “A few studies have attempted to examine how the creation and publication of standards, per se, have affected practices.” p. 3 | Dismissive | Standards-Based Reform in the United States: History, Research, and Future Directions | Center on Education Policy, December, 2008 | http://www.rand.org/content/dam/rand/pubs/reprints/2009/RAND_RP1384.pdf | Institute of Education Sciences, US Education Department | Relevant pre-2000 studies of the effects of standards, alignment, goal setting, setting reachable goals, etc. include those of Mitchell (1999); Morgan & Ramist (1998); the *Southern Regional Education Board (1998); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); the Florida Office of Program Policy Analysis (1997); Pomplun (1997); Schmoker (1996); Aguilera & Hendricks (1996); Banta, Lund, Black & Oblander (1996); Bottoms & Mikos (1995); *Bamburg & Medina (1993); Bishop (1993); the U. S. General Accounting Office (1993); Eckstein & Noah (1993); Mattsson (1993); Brown (1992); Heyneman & Ransom (1992); Whetton (1992); Anderson, Muir, Bateson, Blackmore & Rogers (1990); Csikszentmihalyi (1990); *Levine & Lezotte (1990); LaRoque & Coleman (1989); Hillocks (1987); Willingham & Morris (1986); Resnick & Resnick (1985); Ogle & Fritts (1984); *Natriello & Dornbusch (1984); Brooke & Oxenham (1984); Rentz (1979); Wellisch, MacQueen, Carriere & Dick (1978); *Rosswork (1977); Estes, Colvin & Goodwin (1976); Wood (1953); and Panlasigui & Knight (1930). | ||
74 | Laura S. Hamilton | Brian M. Stecher, Kun Yuan | “The research evidence does not provide definitive answers to these questions.” p. 6 | Denigrating | Standards-Based Reform in the United States: History, Research, and Future Directions | Center on Education Policy, December, 2008 | http://www.rand.org/content/dam/rand/pubs/reprints/2009/RAND_RP1384.pdf | Institute of Education Sciences, US Education Department | Relevant pre-2000 studies of the effects of standards, alignment, goal setting, setting reachable goals, etc. include those of Mitchell (1999); Morgan & Ramist (1998); the *Southern Regional Education Board (1998); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); the Florida Office of Program Policy Analysis (1997); Pomplun (1997); Schmoker (1996); Aguilera & Hendricks (1996); Banta, Lund, Black & Oblander (1996); Bottoms & Mikos (1995); *Bamburg & Medina (1993); Bishop (1993); the U. S. General Accounting Office (1993); Eckstein & Noah (1993); Mattsson (1993); Brown (1992); Heyneman & Ransom (1992); Whetton (1992); Anderson, Muir, Bateson, Blackmore & Rogers (1990); Csikszentmihalyi (1990); *Levine & Lezotte (1990); LaRoque & Coleman (1989); Hillocks (1987); Willingham & Morris (1986); Resnick & Resnick (1985); Ogle & Fritts (1984); *Natriello & Dornbusch (1984); Brooke & Oxenham (1984); Rentz (1979); Wellisch, MacQueen, Carriere & Dick (1978); *Rosswork (1977); Estes, Colvin & Goodwin (1976); Wood (1953); and Panlasigui & Knight (1930). | ||
75 | Laura S. Hamilton | Brian M. Stecher, Kun Yuan | “He [Poynter 1994] also noted that ‘virtually all of the arguments, both for and against standards, are based on beliefs and hypotheses rather than on direct empirical evidence’ (p. 427).” pp. 34-35 | Dismissive, Denigrating | Standards-Based Reform in the United States: History, Research, and Future Directions | Center on Education Policy, December, 2008 | http://www.rand.org/content/dam/rand/pubs/reprints/2009/RAND_RP1384.pdf | Institute of Education Sciences, US Education Department | Relevant pre-2000 studies of the effects of standards, alignment, goal setting, setting reachable goals, etc. include those of Mitchell (1999); Morgan & Ramist (1998); the *Southern Regional Education Board (1998); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); the Florida Office of Program Policy Analysis (1997); Pomplun (1997); Schmoker (1996); Aguilera & Hendricks (1996); Banta, Lund, Black & Oblander (1996); Bottoms & Mikos (1995); *Bamburg & Medina (1993); Bishop (1993); the U. S. General Accounting Office (1993); Eckstein & Noah (1993); Mattsson (1993); Brown (1992); Heyneman & Ransom (1992); Whetton (1992); Anderson, Muir, Bateson, Blackmore & Rogers (1990); Csikszentmihalyi (1990); *Levine & Lezotte (1990); LaRoque & Coleman (1989); Hillocks (1987); Willingham & Morris (1986); Resnick & Resnick (1985); Ogle & Fritts (1984); *Natriello & Dornbusch (1984); Brooke & Oxenham (1984); Rentz (1979); Wellisch, MacQueen, Carriere & Dick (1978); *Rosswork (1977); Estes, Colvin & Goodwin (1976); Wood (1953); and Panlasigui & Knight (1930). | ||
76 | Laura S. Hamilton | Brian M. Stecher, Kun Yuan | "Although a large and growing body of research has been conducted to examine the effects of SBR, the caution Poynter expressed in 1994 about the lack of empirical evidence remains relevant today.” pp. 34-35 | Dismissive | Standards-Based Reform in the United States: History, Research, and Future Directions | Center on Education Policy, December, 2008 | http://www.rand.org/content/dam/rand/pubs/reprints/2009/RAND_RP1384.pdf | Institute of Education Sciences, US Education Department | Relevant pre-2000 studies of the effects of standards, alignment, goal setting, setting reachable goals, etc. include those of Mitchell (1999); Morgan & Ramist (1998); the *Southern Regional Education Board (1998); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); the Florida Office of Program Policy Analysis (1997); Pomplun (1997); Schmoker (1996); Aguilera & Hendricks (1996); Banta, Lund, Black & Oblander (1996); Bottoms & Mikos (1995); *Bamburg & Medina (1993); Bishop (1993); the U. S. General Accounting Office (1993); Eckstein & Noah (1993); Mattsson (1993); Brown (1992); Heyneman & Ransom (1992); Whetton (1992); Anderson, Muir, Bateson, Blackmore & Rogers (1990); Csikszentmihalyi (1990); *Levine & Lezotte (1990); LaRoque & Coleman (1989); Hillocks (1987); Willingham & Morris (1986); Resnick & Resnick (1985); Ogle & Fritts (1984); *Natriello & Dornbusch (1984); Brooke & Oxenham (1984); Rentz (1979); Wellisch, MacQueen, Carriere & Dick (1978); *Rosswork (1977); Estes, Colvin & Goodwin (1976); Wood (1953); and Panlasigui & Knight (1930). | ||
77 | Laura S. Hamilton | Brian M. Stecher, Kun Yuan | “Arguably the most important test of quality is whether the standards promote high-quality instruction and improved student learning, but as we discuss later, there is very little research to address that question.” p. 37 | Dismissive | Standards-Based Reform in the United States: History, Research, and Future Directions | Center on Education Policy, December, 2008 | http://www.rand.org/content/dam/rand/pubs/reprints/2009/RAND_RP1384.pdf | Institute of Education Sciences, US Education Department | Relevant pre-2000 studies of the effects of standards, alignment, goal setting, setting reachable goals, etc. include those of Mitchell (1999); Morgan & Ramist (1998); the *Southern Regional Education Board (1998); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); the Florida Office of Program Policy Analysis (1997); Pomplun (1997); Schmoker (1996); Aguilera & Hendricks (1996); Banta, Lund, Black & Oblander (1996); Bottoms & Mikos (1995); *Bamburg & Medina (1993); Bishop (1993); the U. S. General Accounting Office (1993); Eckstein & Noah (1993); Mattsson (1993); Brown (1992); Heyneman & Ransom (1992); Whetton (1992); Anderson, Muir, Bateson, Blackmore & Rogers (1990); Csikszentmihalyi (1990); *Levine & Lezotte (1990); LaRoque & Coleman (1989); Hillocks (1987); Willingham & Morris (1986); Resnick & Resnick (1985); Ogle & Fritts (1984); *Natriello & Dornbusch (1984); Brooke & Oxenham (1984); Rentz (1979); Wellisch, MacQueen, Carriere & Dick (1978); *Rosswork (1977); Estes, Colvin & Goodwin (1976); Wood (1953); and Panlasigui & Knight (1930). | ||
78 | Laura S. Hamilton | Brian M. Stecher, Kun Yuan | “[T]here have been a few studies of SBR as a comprehensive system. . . . [T]here is some research on how the adoption of standards, per se, or the alignment of standards with curriculum influences school practices or student outcomes.” p. 38 | Dismissive | Standards-Based Reform in the United States: History, Research, and Future Directions | Center on Education Policy, December, 2008 | http://www.rand.org/content/dam/rand/pubs/reprints/2009/RAND_RP1384.pdf | Institute of Education Sciences, US Education Department | Relevant pre-2000 studies of the effects of standards, alignment, goal setting, setting reachable goals, etc. include those of Mitchell (1999); Morgan & Ramist (1998); the *Southern Regional Education Board (1998); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); the Florida Office of Program Policy Analysis (1997); Pomplun (1997); Schmoker (1996); Aguilera & Hendricks (1996); Banta, Lund, Black & Oblander (1996); Bottoms & Mikos (1995); *Bamburg & Medina (1993); Bishop (1993); the U. S. General Accounting Office (1993); Eckstein & Noah (1993); Mattsson (1993); Brown (1992); Heyneman & Ransom (1992); Whetton (1992); Anderson, Muir, Bateson, Blackmore & Rogers (1990); Csikszentmihalyi (1990); *Levine & Lezotte (1990); LaRoque & Coleman (1989); Hillocks (1987); Willingham & Morris (1986); Resnick & Resnick (1985); Ogle & Fritts (1984); *Natriello & Dornbusch (1984); Brooke & Oxenham (1984); Rentz (1979); Wellisch, MacQueen, Carriere & Dick (1978); *Rosswork (1977); Estes, Colvin & Goodwin (1976); Wood (1953); and Panlasigui & Knight (1930). | ||
79 | Laura S. Hamilton | Brian M. Stecher, Kun Yuan | “The lack of evidence about the effects of SBR derives primarily from the fact that the vision has never been fully realized in practice.” p. 47 | Dismissive | Standards-Based Reform in the United States: History, Research, and Future Directions | Center on Education Policy, December, 2008 | http://www.rand.org/content/dam/rand/pubs/reprints/2009/RAND_RP1384.pdf | Institute of Education Sciences, US Education Department | Relevant pre-2000 studies of the effects of standards, alignment, goal setting, setting reachable goals, etc. include those of Mitchell (1999); Morgan & Ramist (1998); the *Southern Regional Education Board (1998); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); the Florida Office of Program Policy Analysis (1997); Pomplun (1997); Schmoker (1996); Aguilera & Hendricks (1996); Banta, Lund, Black & Oblander (1996); Bottoms & Mikos (1995); *Bamburg & Medina (1993); Bishop (1993); the U. S. General Accounting Office (1993); Eckstein & Noah (1993); Mattsson (1993); Brown (1992); Heyneman & Ransom (1992); Whetton (1992); Anderson, Muir, Bateson, Blackmore & Rogers (1990); Csikszentmihalyi (1990); *Levine & Lezotte (1990); LaRoque & Coleman (1989); Hillocks (1987); Willingham & Morris (1986); Resnick & Resnick (1985); Ogle & Fritts (1984); *Natriello & Dornbusch (1984); Brooke & Oxenham (1984); Rentz (1979); Wellisch, MacQueen, Carriere & Dick (1978); *Rosswork (1977); Estes, Colvin & Goodwin (1976); Wood (1953); and Panlasigui & Knight (1930). | ||
80 | Laura S. Hamilton | Brian M. Stecher, Kun Yuan | “[A]lthough many conceptions of SBR emphasize autonomy, we currently know relatively little about the effects of granting autonomy or what the right balance is between autonomy and prescriptiveness.” p. 55 | Dismissive | Standards-Based Reform in the United States: History, Research, and Future Directions | Center on Education Policy, December, 2008 | http://www.rand.org/content/dam/rand/pubs/reprints/2009/RAND_RP1384.pdf | Institute of Education Sciences, US Education Department | Relevant pre-2000 studies of the effects of standards, alignment, goal setting, setting reachable goals, etc. include those of Mitchell (1999); Morgan & Ramist (1998); the *Southern Regional Education Board (1998); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); the Florida Office of Program Policy Analysis (1997); Pomplun (1997); Schmoker (1996); Aguilera & Hendricks (1996); Banta, Lund, Black & Oblander (1996); Bottoms & Mikos (1995); *Bamburg & Medina (1993); Bishop (1993); the U. S. General Accounting Office (1993); Eckstein & Noah (1993); Mattsson (1993); Brown (1992); Heyneman & Ransom (1992); Whetton (1992); Anderson, Muir, Bateson, Blackmore & Rogers (1990); Csikszentmihalyi (1990); *Levine & Lezotte (1990); LaRoque & Coleman (1989); Hillocks (1987); Willingham & Morris (1986); Resnick & Resnick (1985); Ogle & Fritts (1984); *Natriello & Dornbusch (1984); Brooke & Oxenham (1984); Rentz (1979); Wellisch, MacQueen, Carriere & Dick (1978); *Rosswork (1977); Estes, Colvin & Goodwin (1976); Wood (1953); and Panlasigui & Knight (1930). | ||
81 | Laura S. Hamilton | Brian M. Stecher, Kun Yuan | “One of the primary responsibilities of the federal government should be to ensure ongoing collection of evidence demonstrating the effects of the policies, which could be used to make decisions about whether to continue on the current course or whether small adjustments or a major overhaul are needed.” p. 55 | Dismissive | Standards-Based Reform in the United States: History, Research, and Future Directions | Center on Education Policy, December, 2008 | http://www.rand.org/content/dam/rand/pubs/reprints/2009/RAND_RP1384.pdf | Institute of Education Sciences, US Education Department | Relevant pre-2000 studies of the effects of standards, alignment, goal setting, setting reachable goals, etc. include those of Mitchell (1999); Morgan & Ramist (1998); the *Southern Regional Education Board (1998); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); the Florida Office of Program Policy Analysis (1997); Pomplun (1997); Schmoker (1996); Aguilera & Hendricks (1996); Banta, Lund, Black & Oblander (1996); Bottoms & Mikos (1995); *Bamburg & Medina (1993); Bishop (1993); the U. S. General Accounting Office (1993); Eckstein & Noah (1993); Mattsson (1993); Brown (1992); Heyneman & Ransom (1992); Whetton (1992); Anderson, Muir, Bateson, Blackmore & Rogers (1990); Csikszentmihalyi (1990); *Levine & Lezotte (1990); LaRoque & Coleman (1989); Hillocks (1987); Willingham & Morris (1986); Resnick & Resnick (1985); Ogle & Fritts (1984); *Natriello & Dornbusch (1984); Brooke & Oxenham (1984); Rentz (1979); Wellisch, MacQueen, Carriere & Dick (1978); *Rosswork (1977); Estes, Colvin & Goodwin (1976); Wood (1953); and Panlasigui & Knight (1930). | ||
82 | Douglas N. Harris | Lori L. Taylor, Amy A. Levine, William K. Ingle, Leslie McDonald | "However, previous studies under-state current costs by focusing on costs before NCLB was put in place and by excluding important cost categories." | Denigrating | The Resource Costs of Standards, Assessments, and Accountability | report to the National Research Council | | National Research Council funders | No, they did not leave out important cost categories; Harris' study deliberately exaggerates costs. See pages 3-10: https://nonpartisaneducation.org/Review/Essays/v10n1.pdf | |||
83 | Joan Herman | Katherine E. Ryan, Lorrie A. Shepard, Eds. | "Yet, available evidence suggests that the rhetoric surpasses the reality of formative assessment use" p.217 | Denigrating | Accountability and assessment: Is public interest in K-12 education being served? | Chapter 11 in The Future of Test-Based Educational Accountability | https://www.routledge.com/The-Future-of-Test-Based-Educational-Accountability-1st-Edition/Ryan-Shepard/p/book/9780805864700 | Institute of Education Sciences, US Education Department | Studies of formative testing date back a century, and the evidence, on average, is strongly positive, which is not the result favored by CRESST, so they declare the studies nonexistent. See, for example, https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/QuantitativeList.htm ; https://nonpartisaneducation.org/Review/Resources/SurveyList.htm ; https://nonpartisaneducation.org/Review/Resources/QualitativeList.htm | ||
84 | Joan Herman | Katherine E. Ryan, Lorrie A. Shepard, Eds. | "The research base examining effects on students with disabilities and on English Language learners is scanty." p.223 | Dismissive | Accountability and assessment: Is public interest in K-12 education being served? | Chapter 11 in The Future of Test-Based Educational Accountability | https://www.routledge.com/The-Future-of-Test-Based-Educational-Accountability-1st-Edition/Ryan-Shepard/p/book/9780805864700 | Institute of Education Sciences, US Education Department | |||
85 | Joan Herman | Katherine E. Ryan, Lorrie A. Shepard, Eds. | "...there is no obvious accountability mechanism for the "average student" who may have made it just over the proficient level. There is little research on this issue." p.224 | Dismissive | Accountability and assessment: Is public interest in K-12 education being served? | Chapter 11 in The Future of Test-Based Educational Accountability | https://www.routledge.com/The-Future-of-Test-Based-Educational-Accountability-1st-Edition/Ryan-Shepard/p/book/9780805864700 | Institute of Education Sciences, US Education Department | Relevant pre-2000 studies of the effects of minimum-competency testing and the problems with a single passing score include those of Frederiksen (1994); Jacobson (1992); Winfield (1990); Ligon, Johnstone, Brightman, Davis, et al. (1990); Losack (1987); Marshall (1987); Mangino & Babcock (1986); Michigan Department of Education (1984); Serow (1982); Brunton (1982); Paramore, et al. (1980); Ogden (1979); and Findley (1978). | ||
86 | Joan Herman | "The report considers how well the model fits available evidence by examining whether and how accountability assessment influences students’ learning opportunities and the relationship between accountability and learning." abstract | Dismissive | Accountability and assessment: Is public interest in K-12 education being served? | CRESST Report 728, October 2007 | https://eric.ed.gov/?id=ED499421 | Institute of Education Sciences, US Education Department | See, for example, Test Frequency, Stakes, and Feedback in Student Achievement: A Meta-Analysis https://journals.sagepub.com/doi/abs/10.1177/0193841X19865628#abstract | |||
87 | Joan Herman | "What of the impact of accountability on other segments of the student population--traditionally higher performing students? ...The average student? ...there is no obvious accountability mechanism for the "average student." There is little research on this issue." p.17 | Dismissive | Accountability and assessment: Is public interest in K-12 education being served? | CRESST Report 728, October 2007 | https://eric.ed.gov/?id=ED499421 | Institute of Education Sciences, US Education Department | Relevant pre-2000 studies of the effects of minimum-competency testing and the problems with a single passing score include those of Frederiksen (1994); Winfield (1990); Ligon, Johnstone, Brightman, Davis, et al. (1990); Losack (1987); Mangino & Babcock (1986); Serow (1982); Brunton (1982); Paramore, et al. (1980); Ogden (1979); and Findley (1978). | |||
88 | Joan Herman | "While a thorough treatment of the effects on teachers is also beyond the scope of this report, it is worth noting a growing literature that is cause for concern." p.17 | Dismissive | Accountability and assessment: Is public interest in K-12 education being served? | CRESST Report 728, October 2007 | https://eric.ed.gov/?id=ED499421 | Institute of Education Sciences, US Education Department | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | |||
89 | Joan Herman | "The research base examining effects on students with disabilities and on English language learner students is scanty." pp.16-17 | Dismissive | Accountability and assessment: Is public interest in K-12 education being served? | CRESST Report 728, October 2007 | https://eric.ed.gov/?id=ED499421 | Institute of Education Sciences, US Education Department | ||||
90 | Eva L. Baker | "Tests only dimly reflect in their design the results of research on learning, whether of skills, subject matter, or problem solving." p.310 | Denigrating | The End(s) of Testing | Educational Researcher, Vol. 36, No. 6, pp. 309–317 | 2007 Presidential Address for the American Educational Research Association | |||||
91 | Eva L. Baker | "To my mind, the evidential disconnect between test design and learning research is no small thing." p.310 | Dismissive | The End(s) of Testing | Educational Researcher, Vol. 36, No. 6, pp. 309–317 | 2007 Presidential Address for the American Educational Research Association | |||||
92 | Eva L. Baker | "What if we set aside learning-based design and ask, “How well do any of our external tests work?” The answer is that we often don’t know enough to know. We have little evidence that tests are in sync with their stated or de facto purposes or that their results lead to appropriate decisions." p.310 | Dismissive | The End(s) of Testing | Educational Researcher, Vol. 36, No. 6, pp. 309–317 | 2007 Presidential Address for the American Educational Research Association | |||||
93 | Laura S. Hamilton | Brian M. Stecher, Julie A. Marsh, Jennifer Sloan McCombs, Abby Robyn, Jennifer Lin Russell, Scott Naftel, Heather Barney | "For many educators, the utility of SBA was demonstrated in a few pioneering states in the 1990s. Two of the most prominent examples of SBA occurred in Texas and North Carolina, where scores on state accountability tests rose dramatically after the introduction of SBA systems (Grissmer and Flanagan, 1998)." p.4 | | Standards-Based Accountability Under No Child Left Behind: Experiences of Teachers and Administrators in Three States | Rand Corporation, 2007 | https://www.rand.org/pubs/monographs/MG589.html | "This research was sponsored by the National Science Foundation under grant number REC-0228295." | Relevant pre-2000 studies of the effects of standards, alignment, goal setting, setting reachable goals, etc. include those of Mitchell (1999); Morgan & Ramist (1998); the *Southern Regional Education Board (1998); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); the Florida Office of Program Policy Analysis (1997); Pomplun (1997); Schmoker (1996); Aguilera & Hendricks (1996); Banta, Lund, Black & Oblander (1996); Bottoms & Mikos (1995); *Bamburg & Medina (1993); Bishop (1993); the U. S. General Accounting Office (1993); Eckstein & Noah (1993); Mattsson (1993); Brown (1992); Heyneman & Ransom (1992); Whetton (1992); Anderson, Muir, Bateson, Blackmore & Rogers (1990); Csikszentmihalyi (1990); *Levine & Lezotte (1990); LaRoque & Coleman (1989); Hillocks (1987); Willingham & Morris (1986); Resnick & Resnick (1985); Ogle & Fritts (1984); *Natriello & Dornbusch (1984); Brooke & Oxenham (1984); Rentz (1979); Wellisch, MacQueen, Carriere & Dick (1978); *Rosswork (1977); Estes, Colvin & Goodwin (1976); Wood (1953); and Panlasigui & Knight (1930). | |||
94 | Laura S. Hamilton | Brian M. Stecher, Julie A. Marsh, Jennifer Sloan McCombs, Abby Robyn, Jennifer Lin Russell, Scott Naftel, Heather Barney | "However, the paths through which SBA [standards-based accountability] changes district, school, and classroom practices and how these changes in practice influence student outcomes are largely unexplored. There is strong evidence that SBA leads to changes in teachers’ instructional practices (Hamilton, 2004; Stecher, 2002)." p.5 | Dismissive | Standards-Based Accountability Under No Child Left Behind: Experiences of Teachers and Administrators in Three States | Rand Corporation, 2007 | https://www.rand.org/pubs/monographs/MG589.html | "This research was sponsored by the National Science Foundation under grant number REC-0228295." | Relevant pre-2000 studies of the effects of standards, alignment, goal setting, setting reachable goals, etc. include those of Mitchell (1999); Morgan & Ramist (1998); the *Southern Regional Education Board (1998); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); the Florida Office of Program Policy Analysis (1997); Pomplun (1997); Schmoker (1996); Aguilera & Hendricks (1996); Banta, Lund, Black & Oblander (1996); Bottoms & Mikos (1995); *Bamburg & Medina (1993); Bishop (1993); the U. S. General Accounting Office (1993); Eckstein & Noah (1993); Mattsson (1993); Brown (1992); Heyneman & Ransom (1992); Whetton (1992); Anderson, Muir, Bateson, Blackmore & Rogers (1990); Csikszentmihalyi (1990); *Levine & Lezotte (1990); LaRoque & Coleman (1989); Hillocks (1987); Willingham & Morris (1986); Resnick & Resnick (1985); Ogle & Fritts (1984); *Natriello & Dornbusch (1984); Brooke & Oxenham (1984); Rentz (1979); Wellisch, MacQueen, Carriere & Dick (1978); *Rosswork (1977); Estes, Colvin & Goodwin (1976); Wood (1953); and Panlasigui & Knight (1930). | ||
95 | Laura S. Hamilton | Brian M. Stecher, Julie A. Marsh, Jennifer Sloan McCombs, Abby Robyn, Jennifer Lin Russell, Scott Naftel, Heather Barney | "Much less is known about the impact of SBA at the district and school levels and the relationships among actions at the various levels and student outcomes. This study was designed to shed light on this complex set of relationships…" p.5 | Dismissive | Standards-Based Accountability Under No Child Left Behind: Experiences of Teachers and Administrators in Three States | Rand Corporation, 2007 | https://www.rand.org/pubs/monographs/MG589.html | "This research was sponsored by the National Science Foundation under grant number REC-0228295." | Relevant pre-2000 studies of the effects of standards, alignment, goal setting, setting reachable goals, etc. include those of Mitchell (1999); Morgan & Ramist (1998); the *Southern Regional Education Board (1998); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); the Florida Office of Program Policy Analysis (1997); Pomplun (1997); Schmoker (1996); Aguilera & Hendricks (1996); Banta, Lund, Black & Oblander (1996); Bottoms & Mikos (1995); *Bamburg & Medina (1993); Bishop (1993); the U. S. General Accounting Office (1993); Eckstein & Noah (1993); Mattsson (1993); Brown (1992); Heyneman & Ransom (1992); Whetton (1992); Anderson, Muir, Bateson, Blackmore & Rogers (1990); Csikszentmihalyi (1990); *Levine & Lezotte (1990); LaRoque & Coleman (1989); Hillocks (1987); Willingham & Morris (1986); Resnick & Resnick (1985); Ogle & Fritts (1984); *Natriello & Dornbusch (1984); Brooke & Oxenham (1984); Rentz (1979); Wellisch, MacQueen, Carriere & Dick (1978); *Rosswork (1977); Estes, Colvin & Goodwin (1976); Wood (1953); and Panlasigui & Knight (1930). | ||
96 | Julie A. Marsh, John F. Pane, and Laura S. Hamilton | "Unlike past studies of data use in schools, this paper brings together information systematically gathered from large, representative samples of educators at the district, school, and classroom levels in a variety of contexts." p.1 | Dismissive, Denigrating | Making Sense of Data-Driven Decision Making in Education | Rand Corporation Occasional Paper, 2006 | |||||
97 | Julie A. Marsh, John F. Pane, and Laura S. Hamilton | "Although a few studies have tried to link DDDM to changes in school culture or performance (Chen et al., 2005; Copland, 2003; Feldman and Tung, 2001; Schmoker and Wilson, 1995; Wayman and Stringfield 2005), most of the literature focuses on implementation. In addition, previous work has tended to describe case studies of schools or has taken the form of advocacy or technical assistance (such as the “how to” implementation guides described by Feldman and Tung, 2001)." p.4 | Dismissive, Denigrating | Making Sense of Data-Driven Decision Making in Education | Rand Corporation Occasional Paper, 2006 | |||||
98 | Eva L. Baker | Joan L. Herman, Robert L. Linn | "For example, performance assessment was a rage in the early 1990s because it was something new and flashy, and looked to have great promise. Before almost any research was done, a number of states dropped their multiple-choice accountability systems, replacing them with performance assessments." | Dismissive | ACCELERATING FUTURE POSSIBILITIES FOR ASSESSMENT AND LEARNING, p.1 | CRESST Line, Winter 2006 | https://www.researchgate.net/publication/277283780_in_Educational_Researcher_called_The_Awful_Reputation_of_Education_Research | Institute of Education Sciences, US Education Department | It is selected-response item formats (e.g., multiple choice) that are new. Open-ended item formats have been with us for millennia. And thousands of research, evaluation, and validity studies have been conducted on them. | ||
99 | Eva L. Baker | Joan L. Herman, Robert L. Linn | "By the end of this year, nearly half of all states will have graduation exams in place (Peterson, 2005). Short institutional memory forgets that similar minimum competency tests did not lead to increased achievement some 20 years ago, but instead contributed to higher numbers of high school dropouts and inequities along racial lines (Catterall, 1989; Haertel & Herman, 2005)." | Dismissive | ACCELERATING FUTURE POSSIBILITIES FOR ASSESSMENT AND LEARNING, p.3 | CRESST Line, Winter 2006 | https://www.researchgate.net/publication/277283780_in_Educational_Researcher_called_The_Awful_Reputation_of_Education_Research | Institute of Education Sciences, US Education Department | Just some of the relevant pre-2008 studies of the effects of minimum-competency or exit exams and the problems with a single passing score include those of Alvarez, Moreno, & Patrinos (2007); Grodsky & Kalogrides (2006); Audette (2005); Orlich (2003); StandardsWork (2003); Meisels, et al. (2003); Braun (2003); Rosenshine (2003); Tighe, Wang, & Foley (2002); Carnoy & Loeb (2002); Baumert & Demmrich (2001); Rosenblatt & Offer (2001); Phelps (2001); Toenjes, Dworkin, Lorence, & Hill (2000); Wenglinsky (2000); Massachusetts Finance Office (2000); DeMars (2000); Bishop (1999, 2000, 2001, & 2004); Grissmer & Flanagan (1998); Strauss, Bowes, Marks, & Plesko (1998); Frederiksen (1994); Ritchie & Thorkildsen (1994); Chao-Qun & Hui (1993); Potter & Wall (1992); Jacobson (1992); Rodgers, et al. (1991); Morris (1991); Winfield (1990); Ligon, Johnstone, Brightman, Davis, et al. (1990); Winfield (1987); Koffler (1987); Losack (1987); Marshall (1987); Hembree (1987); Mangino, Battaille, Washington, & Rumbaut (1986); Michigan Department of Education (1984); Ketchie (1984); Serow (1982); Indiana Education Department (1982); Brunton (1982); Paramore, et al. (1980); Ogden (1979); Down(2) (1979); Wellisch (1978); and Findley (1978). | ||
100 | Edward Haertel | Joan Herman | "Passing rates on MCTs in many states rose rapidly from year to year (Popham, Cruse, Rankin, Sandifer, & Williams, 1985). Despite these gains, and positive trends on examinations like the National Assessment of Educational Progress (NAEP), there is little evidence that MCTs were the reason for improvements on other examinations." | Dismissive | A Historical Perspective on Validity Arguments for Accountability Testing | CRESST Report 654, June 2005 | https://cresst.org/wp-content/uploads/R654.pdf | Institute of Education Sciences, US Education Department | Just some of the relevant pre-2008 studies of the effects of minimum-competency or exit exams and the problems with a single passing score include those of Alvarez, Moreno, & Patrinos (2007); Grodsky & Kalogrides (2006); Audette (2005); Orlich (2003); StandardsWork (2003); Meisels, et al. (2003); Braun (2003); Rosenshine (2003); Tighe, Wang, & Foley (2002); Carnoy & Loeb (2002); Baumert & Demmrich (2001); Rosenblatt & Offer (2001); Phelps (2001); Toenjes, Dworkin, Lorence, & Hill (2000); Wenglinsky (2000); Massachusetts Finance Office (2000); DeMars (2000); Bishop (1999, 2000, 2001, & 2004); Grissmer & Flanagan (1998); Strauss, Bowes, Marks, & Plesko (1998); Frederiksen (1994); Ritchie & Thorkildsen (1994); Chao-Qun & Hui (1993); Potter & Wall (1992); Jacobson (1992); Rodgers, et al. (1991); Morris (1991); Winfield (1990); Ligon, Johnstone, Brightman, Davis, et al. (1990); Winfield (1987); Koffler (1987); Losack (1987); Marshall (1987); Hembree (1987); Mangino, Battaille, Washington, & Rumbaut (1986); Michigan Department of Education (1984); Ketchie (1984); Serow (1982); Indiana Education Department (1982); Brunton (1982); Paramore, et al. (1980); Ogden (1979); Down(2) (1979); Wellisch (1978); and Findley (1978). | ||
101 | Robert L. Linn | "Despite the clear appeal of assessment-based accountability and the widespread use of this approach, the development of assessments that are aligned with content standards and for which there is solid evidence of validity and reliability is a challenging endeavor." | Dismissive | Issues in the Design of Accountability Systems | CRESST Report 650, April 2005 | https://cresst.org/wp-content/uploads/R650.pdf | Institute of Education Sciences, US Education Department | Relevant pre-2000 studies of the effects of standards, alignment, goal setting, setting reachable goals, etc. include those of Mitchell (1999); Morgan & Ramist (1998); the *Southern Regional Education Board (1998); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); the Florida Office of Program Policy Analysis (1997); Pomplun (1997); Schmoker (1996); Aguilera & Hendricks (1996); Banta, Lund, Black & Oblander (1996); Bottoms & Mikos (1995); *Bamburg & Medina (1993); Bishop (1993); the U. S. General Accounting Office (1993); Eckstein & Noah (1993); Mattsson (1993); Brown (1992); Heyneman & Ransom (1992); Whetton (1992); Anderson, Muir, Bateson, Blackmore & Rogers (1990); Csikszentmihalyi (1990); *Levine & Lezotte (1990); LaRoque & Coleman (1989); Hillocks (1987); Willingham & Morris (1986); Resnick & Resnick (1985); Ogle & Fritts (1984); *Natriello & Dornbusch (1984); Brooke & Oxenham (1984); Rentz (1979); Wellisch, MacQueen, Carriere & Dick (1978); *Rosswork (1977); Estes, Colvin & Goodwin (1976); Wood (1953); and Panlasigui & Knight (1930). | |||
102 | Robert L. Linn | "Alignment of an assessment with the content standards that it is intended to measure is critical if the assessment is to buttress rather than undermine the standards. Too little attention has been given to the evaluation of the alignment of assessments and standards." | Denigrating | Issues in the Design of Accountability Systems | CRESST Report 650, April 2005 | https://cresst.org/wp-content/uploads/R650.pdf | Institute of Education Sciences, US Education Department | Relevant pre-2000 studies of the effects of standards, alignment, goal setting, setting reachable goals, etc. include those of Mitchell (1999); Morgan & Ramist (1998); the *Southern Regional Education Board (1998); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); the Florida Office of Program Policy Analysis (1997); Pomplun (1997); Schmoker (1996); Aguilera & Hendricks (1996); Banta, Lund, Black & Oblander (1996); Bottoms & Mikos (1995); *Bamburg & Medina (1993); Bishop (1993); the U. S. General Accounting Office (1993); Eckstein & Noah (1993); Mattsson (1993); Brown (1992); Heyneman & Ransom (1992); Whetton (1992); Anderson, Muir, Bateson, Blackmore & Rogers (1990); Csikszentmihalyi (1990); *Levine & Lezotte (1990); LaRoque & Coleman (1989); Hillocks (1987); Willingham & Morris (1986); Resnick & Resnick (1985); Ogle & Fritts (1984); *Natriello & Dornbusch (1984); Brooke & Oxenham (1984); Rentz (1979); Wellisch, MacQueen, Carriere & Dick (1978); *Rosswork (1977); Estes, Colvin & Goodwin (1976); Wood (1953); and Panlasigui & Knight (1930). | |||
103 | Betheny Gross, Michael Kirst, Dana Holland, and Tom Luschei | Betheny Gross & Margaret E. Goertz, Eds. | "Unlike elementary and middle school leaders, for whose institutions countless reform models have been designed and subsequently employed in efforts to meet accountability demands, high school leaders have relatively few models or school designs to which they can turn for guidance." p.43 | Dismissive | Got You Under My Spell? How Accountability Policy Is Changing and Not Changing Decision Making in High Schools | Holding High Hopes: How High Schools Respond to State Accountability Policies, CPRE Research Report Series RR-056, March 2005 | US Education Department funding for the Consortium for Policy Research in Education | ||||
104 | Betheny Gross, Michael Kirst, Dana Holland, and Tom Luschei | Betheny Gross & Margaret E. Goertz, Eds. | "Perceptions that little information exists to be found may very well reduce the likelihood that information will be sought and that new strategies will be found." p.48 | Dismissive | Got You Under My Spell? How Accountability Policy Is Changing and Not Changing Decision Making in High Schools | Holding High Hopes: How High Schools Respond to State Accountability Policies, CPRE Research Report Series RR-056, March 2005 | US Education Department funding for the Consortium for Policy Research in Education | … perceptions that they encourage | |||
105 | Elliot H. Weinbaum | Betheny Gross & Margaret E. Goertz, Eds. | "However, state accountability policies and the research on those policies have traditionally overlooked the role of school districts. Little research is available about the ways in which districts respond to accountability pressure or, until recently, the strategies that they might use for improvement." | Dismissive | Stuck in the Middle With You: District Response to State Accountability, p.96 | Holding High Hopes: How High Schools Respond to State Accountability Policies, CPRE Research Report Series RR-056, March 2005 | US Education Department funding for the Consortium for Policy Research in Education | ||||
106 | Elliot H. Weinbaum | Betheny Gross & Margaret E. Goertz, Eds. | "Because of the limited investigation that has been done, and the urgent need for high school improvement" | Dismissive | Stuck in the Middle With You: District Response to State Accountability, p.96 | Holding High Hopes: How High Schools Respond to State Accountability Policies, CPRE Research Report Series RR-056, March 2005 | US Education Department funding for the Consortium for Policy Research in Education | ||||
107 | Elliot H. Weinbaum | Betheny Gross & Margaret E. Goertz, Eds. | "The research community has relatively little understanding of the ways in which state level, performance-based accountability systems and local school districts interact given various contexts." p.98 | Dismissive | Stuck in the Middle With You: District Response to State Accountability, p.98 | Holding High Hopes: How High Schools Respond to State Accountability Policies, CPRE Research Report Series RR-056, March 2005 | US Education Department funding for the Consortium for Policy Research in Education | ||||
108 | Elliot H. Weinbaum | Betheny Gross & Margaret E. Goertz, Eds. | "First of all, much of the research on districts has studied districts that are, for some reason, 'outliers.'" p.100 | Dismissive | Stuck in the Middle With You: District Response to State Accountability | Holding High Hopes: How High Schools Respond to State Accountability Policies, CPRE Research Report Series RR-056, March 2005 | US Education Department funding for the Consortium for Policy Research in Education | ||||
109 | Elliot H. Weinbaum | Betheny Gross & Margaret E. Goertz, Eds. | "This is particularly true at the high school level, where continued debates about standards, the subject-specific nature of teacher expertise, and the lack of basic research about effective practices at the high school level make effective improvement strategies complex." p.104 | Dismissive | Stuck in the Middle With You: District Response to State Accountability | Holding High Hopes: How High Schools Respond to State Accountability Policies, CPRE Research Report Series RR-056, March 2005 | US Education Department funding for the Consortium for Policy Research in Education | ||||
110 | Margaret E. Goertz and Diane Massell | Betheny Gross & Margaret E. Goertz, Eds. | "We know little about how high schools respond to external accountability pressures." p. 123 | Dismissive | Summary | Holding High Hopes: How High Schools Respond to State Accountability Policies, CPRE Research Report Series RR-056, March 2005 | US Education Department funding for the Consortium for Policy Research in Education | ||||
111 | Margaret E. Goertz and Diane Massell | Bethany Gross & Margaret E. Goertz, Eds. | "Little academic research has explored what motivates and helps district organizations intervene on behalf of state accountability goals, particularly at the high school level. Our study sheds some light on this question." p.129 | Dismissive | Summary | Holding High Hopes: How High Schools Respond to State Accountability Policies, CPRE Research Report Series RR-056, March 2005 | US Education Department funding for the Consortium for Policy Research in Education | ||||
112 | Joan L. Herman | Susan H. Fuhrman & Richard F. Elmore, Eds | "Based on available research, this chapter explores how well assessments serve these functions from the perspective of elementary schools." p.141 | Dismissive | Redesigning Accountability Systems for Education, Chapter 7 | Teachers College Press, 2004 | Joint project between CRESST and CPRE. | Institute of Education Sciences, US Education Department | Relevant pre-2000 studies of the effects of standards, alignment, goal setting, setting reachable goals, etc. include those of Mitchell (1999); Morgan & Ramist (1998); the Southern Regional Education Board (1998); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); the Florida Office of Program Policy Analysis (1997); Pomplun (1997); Schmoker (1996); Aguilera & Hendricks (1996); Banta, Lund, Black & Oblander (1996); Bottoms & Mikos (1995); *Bamburg & Medina (1993); Bishop (1993); the U. S. General Accounting Office (1993); Eckstein & Noah (1993); Mattsson (1993); Brown (1992); Heyneman & Ransom (1992); Whetton (1992); Anderson, Muir, Bateson, Blackmore & Rogers (1990); Csikszentmihalyi (1990); *Levine & Lezotte (1990); LaRoque & Coleman (1989); Hillocks (1987); Willingham & Morris (1986); Resnick & Resnick (1985); Ogle & Fritts (1984); *Natriello & Dornbusch (1984); Brooke & Oxenham (1984); Rentz (1979); Wellisch, MacQueen, Carriere & Dick (1978); *Rosswork (1977); Estes, Colvin & Goodwin (1976); Wood (1953); and Panlasigui & Knight (1930). It is selected-response item formats (e.g., multiple choice) that are new. Open-ended item formats have been with us for millennia, and thousands of research, evaluation, and validity studies have been conducted on them. See, for example, https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/QuantitativeList.htm ; https://nonpartisaneducation.org/Review/Resources/SurveyList.htm ; https://nonpartisaneducation.org/Review/Resources/QualitativeList.htm | "Others have considered the role of tests in incentive programs. These researchers have included Homme, Csanyi, Gonzales, Rechs, O’Leary, Drabman, Kazdin, Bootzin, Staats, Cameron, Pierce, McMillan, Corcoran, and Wilson. International organizations, such as the World Bank or the Asian Development Bank, have studied the effects of testing on education programs they sponsor. Researchers have included Somerset, Heyneman, Ransom, Psacharopoulos, Velez, Brooke, Oxenham, Bude, Chapman, Snyder, and Pronaratna. Moreover, the mastery learning/mastery testing experiments conducted from the 1960s through today varied incentives, frequency of tests, types of tests, and many other factors to determine the optimal structure of testing programs. Researchers included such notables as Bloom, Carroll, Keller, Block, Burns, Wentling, Anderson, Hymel, Kulik, Tierney, Cross, Okey, Guskey, Gates, and Jones." "What about: Brooks-Cooper, C. (1993), Brown, S. M. & Walberg, H. J. (1993), Heneman, H. G., III. (1998), Hurlock, E. B. (1925), Jones, J. et al. (1996), Kazdin, A. & Bootzin, R. (1972), Kelley, C. (1999), Kirkpatrick, J. E. (1934), O’Leary, K. D. & Drabman, R. (1971), Palmer, J. S. (2002), Richards, C. E. & Shen, T. M. (1992), Rosswork, S. G. (1977), Staats, A. (1973), Tuckman, B. W. (1994), Tuckman, B. W. & Trimble, S. (1997), Webster, W. J., Mendro, R. L., Orsack, T., Weerasinghe, D. & Bembry, K. (1997)" ||
113 | Joan L. Herman | Susan H. Fuhrman & Richard F. Elmore, Eds | "What is particularly new in standards-based assessment reform is being clear not only on the 'what' of what is expected (the content standards), but also on 'how well' it should be accomplished (the performance standards) (Linn and Herman, 1997)." pp.141-142 | Dismissive | Redesigning Accountability Systems for Education, Chapter 7 | Teachers College Press, 2004 | Joint project between CRESST and CPRE. | Institute of Education Sciences, US Education Department | |||
114 | Joan L. Herman | Susan H. Fuhrman & Richard F. Elmore, Eds | "More is known currently about the variation in those elements across states and localities than about their influence on schools, teaching, and student learning." p.154 | Dismissive, Denigrating | Redesigning Accountability Systems for Education, Chapter 7 | Teachers College Press, 2004 | Joint project between CRESST and CPRE. | Institute of Education Sciences, US Education Department | |||
115 | Joan L. Herman | Susan H. Fuhrman & Richard F. Elmore, Eds | "There is ample evidence to suggest that state assessment systems do create pressure for teachers and principals … but little clear evidence on how various stakes have differential effects on teachers, their curriculum and instruction, or, ultimately, student learning." p.155 | Dismissive, Denigrating | Redesigning Accountability Systems for Education, Chapter 7 | Teachers College Press, 2004 | Joint project between CRESST and CPRE. | Institute of Education Sciences, US Education Department | |||
116 | Joan L. Herman | Susan H. Fuhrman & Richard F. Elmore, Eds | "Similarly, states and districts differ in how they respond to low-performing schools, but evidence on whether and how their various responses influence classroom teaching, test performance, and student learning is limited." p.155 | Dismissive, Denigrating | Redesigning Accountability Systems for Education, Chapter 7 | Teachers College Press, 2004 | Joint project between CRESST and CPRE. | Institute of Education Sciences, US Education Department | |||
117 | Joan L. Herman | Susan H. Fuhrman & Richard F. Elmore, Eds | "Further research is necessary, however, to identify optimal approaches. Needed, too, is additional research on how schools can best orchestrate their improvement efforts." p.155 | Dismissive, Denigrating | Redesigning Accountability Systems for Education, Chapter 7 | Teachers College Press, 2004 | Joint project between CRESST and CPRE. | Institute of Education Sciences, US Education Department | |||
118 | Richard F. Elmore | Susan H. Fuhrman & Richard F. Elmore, Eds | "Nowhere is this question of what we don't know more apparent than in the issue of stakes. State policies require proficiency levels for grade promotion and graduation for students, for example, without any empirical evidence …" p.278 | Dismissive | Redesigning Accountability Systems for Education, Chapter 7 | Teachers College Press, 2004 | Joint project between CRESST and CPRE. | Institute of Education Sciences, US Education Department | |||
119 | Richard F. Elmore | Susan H. Fuhrman & Richard F. Elmore, Eds | "Likewise, state policies set expected levels of improvement in schools without any evidence or theory about how schools actually respond to external pressure for student performance ..." pp.278-279 | Dismissive | Redesigning Accountability Systems for Education, Chapter 7 | Teachers College Press, 2004 | Joint project between CRESST and CPRE. | Institute of Education Sciences, US Education Department | |||
120 | Lorraine M. McDonnell | "A growing body of research suggests that school and classroom practices do change in response to these assessments (Herman and Golan, 1993; Smith and Rottenberg, 1991; Madaus, 1988)" | 1stness | Politics, Persuasion, and Educational Testing, p.9 | Harvard University Press, 2004 | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | |||||
121 | Lorraine M. McDonnell | "A growing body of research suggests that school and classroom practices do change in response to these assessments (Herman and Golan, 1993; Smith and Rottenberg, 1991; Madaus, 1988)" | Dismissive | Politics, Persuasion, and Educational Testing, p.9 | Harvard University Press, 2004 | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | |||||
122 | Lorraine M. McDonnell | "Although most literature on policy instruments identifies this persuasive tool as one of the strategies available to policymakers, little theoretical or comparative empirical research has been conducted on its properties." | Dismissive | Politics, Persuasion, and Educational Testing, p.24 | Harvard University Press, 2004 | |||||
123 | Lorraine M. McDonnell | "There is empirical research on policies that rely on hortatory tools, but studies of these individual policies have not examined them within a broader theoretical framework." | Denigrating | Politics, Persuasion, and Educational Testing, p.24 | Harvard University Press, 2004 | ||||||
124 | Lorraine M. McDonnell | "This chapter represents an initial attempt to analyze the major characteristics of hortatory policy by taking an inductive approach and looking across several different policy areas to identify a few basic properties common to most policies of this type." | 1stness | Politics, Persuasion, and Educational Testing, p.24 | Harvard University Press, 2004 | ||||||
125 | Lorraine M. McDonnell | "This chapter has begun the task of building a conceptual framework for understanding hortatory policies by identifying their underlying causal assumptions and analyzing some basic properties common to most policies that rely on information and values to motivate action." | 1stness | Politics, Persuasion, and Educational Testing, p.44–45 | Harvard University Press, 2004 | |||||
126 | Lorraine M. McDonnell | "Because so little systematic research has been conducted on hortatory policy, it is possible at this point only to suggest, rather than to specify, the conditions under which its underlying assumptions will be valid and a policy likely to succeed." | Dismissive | Politics, Persuasion, and Educational Testing, p.45 | Harvard University Press, 2004 | ||||||
127 | Lorraine M. McDonnell | "Additional theoretical and empirical work is needed to develop a more rigorous and nuanced understanding of hortatory policy. Nevertheless, this study starts that process by articulating the policy theory undergirding hortatory policy and by outlining its potential promise and shortcomings." | Denigrating | Politics, Persuasion, and Educational Testing, p.45 | Harvard University Press, 2004 | |||||
128 | Lorraine M. McDonnell | "However, because research on the effects of high stakes testing is limited, finds mixed results, and suggests unintended consequences, the informational and persuasive dimensions of testing will continue to be critical to the success of this policy." | Dismissive | Politics, Persuasion, and Educational Testing, p.182–183 | Harvard University Press, 2004 | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | |||||
129 | Deborah Loewenberg Ball | Jo Boaler, Phil Daro, Andrew Porter, & 14 others | "...the federal government and the nation’s school systems have made and are continuing to make significant investments toward the improvement of mathematics education. However, the knowledge base upon which these efforts are founded is generally weak." p.iii | Denigrating | Mathematical Proficiency for All Students | Rand Corporation, 2003 | https://www.rand.org/pubs/monograph_reports/MR1643.html | Office of Research and Improvement, US Education Department | |||
130 | Deborah Loewenberg Ball | Jo Boaler, Phil Daro, Andrew Porter, & 14 others | "New curricular materials have been developed along with training and coaching programs intended to provide teachers with the knowledge and skills needed to use those materials. However, these efforts have been supported by only a limited and uneven base of research and research-based development, which is part of the reason for the limited success of those efforts." p. xi | Dismissive | Mathematical Proficiency for All Students | Rand Corporation, 2003 | https://www.rand.org/pubs/monograph_reports/MR1643.html | Office of Research and Improvement, US Education Department | |||
131 | Deborah Loewenberg Ball | Jo Boaler, Phil Daro, Andrew Porter, & 14 others | "More important, the intense debates over the past decade seem to be based more often on ideology than on evidence." p.xiii | Denigrating | Mathematical Proficiency for All Students | Rand Corporation, 2003 | https://www.rand.org/pubs/monograph_reports/MR1643.html | Office of Research and Improvement, US Education Department | |||
132 | Deborah Loewenberg Ball | Jo Boaler, Phil Daro, Andrew Porter, & 14 others | "However, despite more than a century of efforts to improve school mathematics in the United States, investments in research and development have been virtually nonexistent." p.xiv | Dismissive | Mathematical Proficiency for All Students | Rand Corporation, 2003 | https://www.rand.org/pubs/monograph_reports/MR1643.html | Office of Research and Improvement, US Education Department | |||
133 | Deborah Loewenberg Ball | Jo Boaler, Phil Daro, Andrew Porter, & 14 others | "There has never been a long-range programmatic effort to fund research and development in mathematics education, nor has funding been organized to focus on knowledge that would be usable in practice." p.xiv | Denigrating | Mathematical Proficiency for All Students | Rand Corporation, 2003 | https://www.rand.org/pubs/monograph_reports/MR1643.html | Office of Research and Improvement, US Education Department | |||
134 | Deborah Loewenberg Ball | Jo Boaler, Phil Daro, Andrew Porter, & 14 others | "Despite the strong history of work in this area, we lack research about what is happening today in algebra classrooms; how innovations in algebra teaching and learning can be designed, implemented, and assessed; and how policy decisions shape student learning and affect equity." p.xxi | Dismissive | Mathematical Proficiency for All Students | Rand Corporation, 2003 | https://www.rand.org/pubs/monograph_reports/MR1643.html | Office of Research and Improvement, US Education Department | |||
135 | Deborah Loewenberg Ball | Jo Boaler, Phil Daro, Andrew Porter, & 14 others | "Because most studies have focused on algebra at the high school level, we lack knowledge about younger students’ learning of algebraic ideas and skills." p.xxi | Dismissive | Mathematical Proficiency for All Students | Rand Corporation, 2003 | https://www.rand.org/pubs/monograph_reports/MR1643.html | Office of Research and Improvement, US Education Department | |||
136 | Deborah Loewenberg Ball | Jo Boaler, Phil Daro, Andrew Porter, & 14 others | "Little is known about what happens when algebra is viewed as a K–12 subject, what happens when it is integrated with other subjects, or what happens when it emphasizes a wider range of concepts and processes." p.xxi | Dismissive | Mathematical Proficiency for All Students | Rand Corporation, 2003 | https://www.rand.org/pubs/monograph_reports/MR1643.html | Office of Research and Improvement, US Education Department | |||
137 | Deborah Loewenberg Ball | Jo Boaler, Phil Daro, Andrew Porter, & 14 others | "Research could inform the perennial debates surrounding the algebra curriculum: what to include, emphasize, reduce, or omit." p.xxi | Dismissive | Mathematical Proficiency for All Students | Rand Corporation, 2003 | https://www.rand.org/pubs/monograph_reports/MR1643.html | Office of Research and Improvement, US Education Department | |||
138 | Deborah Loewenberg Ball | Jo Boaler, Phil Daro, Andrew Porter, & 14 others | "For the most part, these debates are poorly informed because research evidence is lacking." p.xxiv | Dismissive | Mathematical Proficiency for All Students | Rand Corporation, 2003 | https://www.rand.org/pubs/monograph_reports/MR1643.html | Office of Research and Improvement, US Education Department | |||
139 | Deborah Loewenberg Ball | Jo Boaler, Phil Daro, Andrew Porter, & 14 others | "Despite more than a century of efforts to improve school mathematics in the United States, efforts that have yielded numerous research studies and development projects, investments in research and development have been inadequate." p.5 | Dismissive | Mathematical Proficiency for All Students | Rand Corporation, 2003 | https://www.rand.org/pubs/monograph_reports/MR1643.html | Office of Research and Improvement, US Education Department | |||
140 | Deborah Loewenberg Ball | Jo Boaler, Phil Daro, Andrew Porter, & 14 others | "Federal agencies (primarily the National Science Foundation and the U.S. Department of Education) have contributed funding for many of these efforts. But the investments have been relatively small, and the support has been fragmented and uncoordinated." p.5 | Dismissive | Mathematical Proficiency for All Students | Rand Corporation, 2003 | https://www.rand.org/pubs/monograph_reports/MR1643.html | Office of Research and Improvement, US Education Department | |||
141 | Deborah Loewenberg Ball | Jo Boaler, Phil Daro, Andrew Porter, & 14 others | "There has never been a long-range programmatic effort devoted solely to funding research in mathematics education, nor has research (as opposed to development) funding been organized to focus on knowledge that would be usable in practice. Consequently, major gaps exist in the knowledge base and in knowledge-based development." p.5 | Dismissive | Mathematical Proficiency for All Students | Rand Corporation, 2003 | https://www.rand.org/pubs/monograph_reports/MR1643.html | Office of Research and Improvement, US Education Department | |||
142 | Laura S. Hamilton | Daniel F. McCaffrey, J.R. Lockwood, Daniel M. Koretz | “The shortcomings of the studies make it difficult to determine the size of teacher effects, but we suspect that the magnitude of some of the effects reported in this literature are overstated.” p. xiii | Denigrating | Evaluating Value-Added Models for Teacher Accountability | Rand Corporation, 2003 | https://www.rand.org/content/dam/rand/pubs/monographs/2004/RAND_MG158.pdf | Rand Corporation funders | Tennessee's TVAAS value-added measurement system had been running a decade when they wrote this and did much of what these authors claim had never been done. | ||
143 | Laura S. Hamilton | Daniel F. McCaffrey, J.R. Lockwood, Daniel M. Koretz | “Using VAM to estimate individual teacher effects is a recent endeavor, and many of the possible sources of error have not been thoroughly evaluated in the literature.” p. xix | Dismissive | Evaluating Value-Added Models for Teacher Accountability | Rand Corporation, 2003 | https://www.rand.org/content/dam/rand/pubs/monographs/2004/RAND_MG158.pdf | Rand Corporation funders | Tennessee's TVAAS value-added measurement system had been running a decade when they wrote this and did much of what these authors claim had never been done. | ||
144 | Laura S. Hamilton | Daniel F. McCaffrey, J.R. Lockwood, Daniel M. Koretz | “Empirical evaluations do not exist for many of the potential sources of error we have identified. Studies need to be conducted to determine how these factors contribute to estimated teacher effects and to determine the conditions that exacerbate or mitigate the impact these factors have on teacher effects.” p. xix | Dismissive | Evaluating Value-Added Models for Teacher Accountability | Rand Corporation, 2003 | https://www.rand.org/content/dam/rand/pubs/monographs/2004/RAND_MG158.pdf | Rand Corporation funders | Tennessee's TVAAS value-added measurement system had been running a decade when they wrote this and did much of what these authors claim had never been done. | ||
145 | Laura S. Hamilton | Daniel F. McCaffrey, J.R. Lockwood, Daniel M. Koretz | “This lack of attention to teachers in policy discussions may be attributed in part to another body of literature that attempted to determine the effects of specific teacher background characteristics, including credentialing status (e.g., Miller, McKenna, and McKenna, 1998; Goldhaber and Brewer, 2000) and subject matter coursework (e.g., Monk, 1994).” p. 8 | Dismissive | Evaluating Value-Added Models for Teacher Accountability | Rand Corporation, 2003 | https://www.rand.org/content/dam/rand/pubs/monographs/2004/RAND_MG158.pdf | Rand Corporation funders | Tennessee's TVAAS value-added measurement system had been running a decade when they wrote this and did much of what these authors claim had never been done. | ||
146 | Laura S. Hamilton | Daniel F. McCaffrey, J.R. Lockwood, Daniel M. Koretz | “To date, there has been little empirical exploration of the size of school effects and the sensitivity of teacher effects to modeling of school effects.” p. 78 | Dismissive | Evaluating Value-Added Models for Teacher Accountability | Rand Corporation, 2003 | https://www.rand.org/content/dam/rand/pubs/monographs/2004/RAND_MG158.pdf | Rand Corporation funders | Tennessee's TVAAS value-added measurement system had been running a decade when they wrote this and did much of what these authors claim had never been done. | ||
147 | Laura S. Hamilton | Daniel F. McCaffrey, J.R. Lockwood, Daniel M. Koretz | “There are no empirical explorations of the robustness of estimates to assumptions about prior-year schooling effects.” p. 81 | Dismissive | Evaluating Value-Added Models for Teacher Accountability | Rand Corporation, 2003 | https://www.rand.org/content/dam/rand/pubs/monographs/2004/RAND_MG158.pdf | Rand Corporation funders | Tennessee's TVAAS value-added measurement system had been running a decade when they wrote this and did much of what these authors claim had never been done. | ||
148 | Laura S. Hamilton | Daniel F. McCaffrey, J.R. Lockwood, Daniel M. Koretz | “There is currently no empirical evidence about the sensitivity of gain scores or teacher effects to such alternatives.” p. 89 | Dismissive | Evaluating Value-Added Models for Teacher Accountability | Rand Corporation, 2003 | https://www.rand.org/content/dam/rand/pubs/monographs/2004/RAND_MG158.pdf | Rand Corporation funders | Tennessee's TVAAS value-added measurement system had been running a decade when they wrote this and did much of what these authors claim had never been done. | ||
149 | Laura S. Hamilton | Daniel F. McCaffrey, J.R. Lockwood, Daniel M. Koretz | “Empirical evaluations do not exist for many of the potential sources of error we have identified. Studies need to be conducted to determine how these factors contribute to estimated teacher effects and to determine the conditions that exacerbate or mitigate the impact these factors have on teacher effects.” p. 116 | Dismissive | Evaluating Value-Added Models for Teacher Accountability | Rand Corporation, 2003 | https://www.rand.org/content/dam/rand/pubs/monographs/2004/RAND_MG158.pdf | Rand Corporation funders | Tennessee's TVAAS value-added measurement system had been running a decade when they wrote this and did much of what these authors claim had never been done. | ||
150 | Laura S. Hamilton | Daniel F. McCaffrey, J.R. Lockwood, Daniel M. Koretz | “Although we expect missing data are likely to be pervasive, there is little systematic discussion of the extent or nature of missing data in test score databases.” p. 117 | Dismissive | Evaluating Value-Added Models for Teacher Accountability | Rand Corporation, 2003 | https://www.rand.org/content/dam/rand/pubs/monographs/2004/RAND_MG158.pdf | Rand Corporation funders | Tennessee's TVAAS value-added measurement system had been running a decade when they wrote this and did much of what these authors claim had never been done. | ||
151 | Joan L. Herman, Noreen Webb, & Stephen Zuniga | "Despite the importance of the concept, the present state of alignment is weak (Feuer, Holland, Green, Bertenthal, & Hemphill, 1999; Rothman, Slattery, Vranek, & Resnick, 2000), and sound methodologies for examining and documenting it are just recently emerging." p.2 | Dismissive | Alignment and College Admissions: The Match of Expectations, Assessments, and Educator Perspectives | CSE Technical Report 593, April 2003 | Office of Research and Improvement, US Education Department | |||||
152 | Marguerite Clarke | 5 co-authors | “What this study adds to the body of literature in this area is a systematic look at how impact varies with the stakes attached to the test results.” p. 91 | 1stness | Perceived Effects of State-Mandated Testing Programs on Teaching and Learning etc. (5 co-authors) | National Board on Educational Testing and Public Policy monograph, January 2003 | http://files.eric.ed.gov/fulltext/ED474867.pdf | Ford Foundation | See, for example, Test Frequency, Stakes, and Feedback in Student Achievement: A Meta-Analysis https://journals.sagepub.com/doi/abs/10.1177/0193841X19865628#abstract | ||
153 | Marguerite Clarke | 5 co-authors | “Many calls for school reform assert that high-stakes testing will foster the economic competitiveness of the U.S. However, the empirical basis for this claim is weak.” p. 96, n. 1 | Denigrating | Perceived Effects of State-Mandated Testing Programs on Teaching and Learning etc. (5 co-authors) | National Board on Educational Testing and Public Policy monograph, January 2003 | http://files.eric.ed.gov/fulltext/ED474867.pdf | Ford Foundation | |||
154 | Brian M. Stecher | Laura S. Hamilton | "The business model of setting clear targets, attaching incentives to the attainment of those targets, and rewarding those responsible for reaching the targets has proven successful in a wide range of business enterprises. But there is no evidence that these accountability principles will work well in an educational context, and there are many reasons to doubt that the principles can be applied without significant adaptation." | Dismissive | Putting Theory to the Test: Systems of "Educational Accountability" Should be Held Accountable | Rand Review, Spring 2002 | https://www.rand.org/pubs/periodicals/rand-review/issues/rr-04-02/theory.html | Rand Corporation funders | See, for example: https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/QuantitativeList.htm . This list includes 24 studies completed before 2000 whose primary focus was to measure the effect of “test-based accountability.” A few dozen more pre-2000 studies also measured the effect of test-based accountability although such was not their primary focus. Include qualitative and program evaluation studies of test-based accountability, and the count of pre-2000 studies rises into the hundreds. | ||
155 | Brian M. Stecher | Laura S. Hamilton | " The lack of strong evidence regarding the design and effectiveness of accountability systems hampers policymaking at a critical juncture." | Denigrating | Putting Theory to the Test: Systems of "Educational Accountability" Should be Held Accountable | Rand Review, Spring 2002 | https://www.rand.org/pubs/periodicals/rand-review/issues/rr-04-02/theory.html | Rand Corporation funders | See, for example: https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/QuantitativeList.htm . This list includes 24 studies completed before 2000 whose primary focus was to measure the effect of “test-based accountability.” A few dozen more pre-2000 studies also measured the effect of test-based accountability although such was not their primary focus. Include qualitative and program evaluation studies of test-based accountability, and the count of pre-2000 studies rises into the hundreds. | ||
156 | Brian M. Stecher | Laura S. Hamilton | "Nonetheless, the evidence has yet to justify the expectations. The initial evidence is, at best, mixed. On the plus side, students and teachers seem to respond to the incentives created by the accountability systems ..." | Dismissive | Putting Theory to the Test: Systems of "Educational Accountability" Should be Held Accountable | Rand Review, Spring 2002 | https://www.rand.org/pubs/periodicals/rand-review/issues/rr-04-02/theory.html | Rand Corporation funders | See, for example: https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/QuantitativeList.htm . This list includes 24 studies completed before 2000 whose primary focus was to measure the effect of “test-based accountability.” A few dozen more pre-2000 studies also measured the effect of test-based accountability although such was not their primary focus. Include qualitative and program evaluation studies of test-based accountability, and the count of pre-2000 studies rises into the hundreds. | ||
157 | Brian M. Stecher | Laura S. Hamilton | "Proponents of accountability attribute the improved scores in these states to clearer expectations, greater motivation on the part of the students and teachers, a focused curriculum, and more-effective instruction. However, there is little or no research to substantiate these positive changes or their effects on scores." | Dismissive | Putting Theory to the Test: Systems of "Educational Accountability" Should be Held Accountable | Rand Review, Spring 2002 | https://www.rand.org/pubs/periodicals/rand-review/issues/rr-04-02/theory.html | Rand Corporation funders | In fact, the evidence "that testing can improve education" is voluminous. See, for example, Phelps, R. P. (2005). The rich, robust research literature on testing’s achievement benefits. In R. P. Phelps (Ed.), Defending standardized testing (pp. 55–90). Mahwah, NJ: Psychology Press. Or, see https://journals.sagepub.com/doi/abs/10.1177/0193841X19865628#abstract | ||
158 | Brian M. Stecher | Laura S. Hamilton | "One of the earliest studies on the effects of testing (conducted in two Arizona schools in the late 1980s) showed that teachers reduced their emphasis on important, nontested material." | Dismissive | Putting Theory to the Test: Systems of "Educational Accountability" Should be Held Accountable | Rand Review, Spring 2002 | https://www.rand.org/pubs/periodicals/rand-review/issues/rr-04-02/theory.html | Rand Corporation funders | Hundreds, perhaps thousands, of studies of the effects of testing predate CRESST's in the 1980s. See, for example: https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/QuantitativeList.htm ; https://nonpartisaneducation.org/Review/Resources/SurveyList.htm ; https://nonpartisaneducation.org/Review/Resources/QualitativeList.htm | ||
159 | Brian M. Stecher | Laura S. Hamilton | "Test-based accountability systems will work better if we acknowledge how little we know about them, if the federal government devotes appropriate resources to studying them, and if the states make ongoing efforts to improve them." | Dismissive | Putting Theory to the Test: Systems of "Educational Accountability" Should be Held Accountable | Rand Review, Spring 2002 | https://www.rand.org/pubs/periodicals/rand-review/issues/rr-04-02/theory.html | Rand Corporation funders | See, for example: https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/QuantitativeList.htm . This list includes 24 studies completed before 2000 whose primary focus was to measure the effect of “test-based accountability.” A few dozen more pre-2000 studies also measured the effect of test-based accountability although such was not their primary focus. Include qualitative and program evaluation studies of test-based accountability, and the count of pre-2000 studies rises into the hundreds. | ||
160 | Robert L. Linn | Eva L. Baker | "It is true that many of these accommodated test conditions are not subjected to validity studies to determine whether the construct or domain tested has been significantly altered. In part, this lack of empirical data results from restricted resources." p. 14 | Dismissive | Validity Issues for Accountability Systems | CSE Technical Report 585 (December 2002) | http://www.cse.ucla.edu/products/reports/TR585.pdf | Office of Research and Improvement, US Education Department | External evaluations of large-scale testing programs not only exist, but represent the norm. | ||
161 | Lauren B. Resnick | Robert Rothman, Jean B. Slattery, Jennifer L. Vranek | "States that have or adopt test-based accountability programs claim that their tests are aligned to their standards. But there has been, up to now, no independent methodology for checking alignment. This paper describes and illustrates such a methodology..." | 1stness | Benchmarking and Alignment of Standards and Testing, p.1 | CSE Technical Report 566, CRESST/Achieve, May 2002 | https://www.achieve.org/files/TR566.pdf | Office of Research and Improvement, US Education Department | Relevant pre-2000 studies of the effects of standards, alignment, goal setting, setting reachable goals, etc. include those of Mitchell (1999); Morgan & Ramist (1998); the *Southern Regional Education Board (1998); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); the Florida Office of Program Policy Analysis (1997); Pomplun (1997); Schmoker (1996); Aguilera & Hendricks (1996); Banta, Lund, Black & Oblander (1996); Bottoms & Mikos (1995); *Bamburg & Medina (1993); Bishop (1993); the U. S. General Accounting Office (1993); Eckstein & Noah (1993); Mattsson (1993); Brown (1992); Heyneman & Ransom (1992); Whetton (1992); Anderson, Muir, Bateson, Blackmore & Rogers (1990); Csikszentmihalyi (1990); *Levine & Lezotte (1990); LaRoque & Coleman (1989); Hillocks (1987); Willingham & Morris (1986); Resnick & Resnick (1985); Ogle & Fritts (1984); *Natriello & Dornbusch (1984); Brooke & Oxenham (1984); Rentz (1979); Wellisch, MacQueen, Carriere & Dick (1978); *Rosswork (1977); Estes, Colvin & Goodwin (1976); Wood (1953); and Panlasigui & Knight (1930). | ||
162 | Lauren B. Resnick | Robert Rothman, Jean B. Slattery, Jennifer L. Vranek | "Yet few, if any, states have put in place effective policies or resource systems for improving instructional quality (National Research Council, 1999)." | Dismissive | Benchmarking and Alignment of Standards and Testing, p.4 | CSE Technical Report 566, CRESST/Achieve, May 2002 | https://www.achieve.org/files/TR566.pdf | Office of Research and Improvement, US Education Department | Hundreds, perhaps thousands, of studies of the effects of testing predate CRESST's in the 1980s. See, for example: https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/QuantitativeList.htm ; https://nonpartisaneducation.org/Review/Resources/SurveyList.htm ; https://nonpartisaneducation.org/Review/Resources/QualitativeList.htm | ||
163 | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein | "Although test-based accountability has shown some compelling results, the issues are complex, the research is new and incomplete, and many of the claims that have received the most attention have proved to be premature and superficial." | Denigrating | Summary, p.xiv | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/pubs/monograph_reports/MR1554.html | US National Science Foundation | Hundreds, perhaps thousands, of studies of the effects of testing predate CRESST's in the 1980s. See, for example: https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/QuantitativeList.htm ; https://nonpartisaneducation.org/Review/Resources/SurveyList.htm ; https://nonpartisaneducation.org/Review/Resources/QualitativeList.htm | |||
164 | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein | "The research evidence does not provide definitive information about the actual costs of testing but the information that is available suggests that expenditures for testing have grown in recent years." | Dismissive | Introduction, p.9 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/pubs/monograph_reports/MR1554.html | US National Science Foundation | No. See, for example, Phelps, R.P. (2000, Winter). Estimating the cost of systemwide student testing in the United States. Journal of Education Finance, 25(3) 343–380; Danitz, T. (2001, February 27). Special report: States pay $400 million for tests in 2001. Stateline.org. Pew Center for the States; Hoxby, C.M. (2002). The cost of accountability, in W. M Evers & H.J. Walberg (Eds.), School Accountability, Stanford, CA: Hoover Institution Press; U.S. GAO. (1993, January). Student testing: Current extent and expenditures, with cost estimates for a national examination. GAO/PEMD-93-8. Washington, DC: US General Accounting Office; Picus, L.O., & Tralli, A. (1998, February). Alternative assessment programs: What are the true costs? CSE Technical Report 441, Los Angeles: CRESST; Phelps, R.P. (1998). Benefit-cost analysis of systemwide student testing, Paper presented at the annual meeting of the American Education Finance Association, Mobile, AL. | |||
165 | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein | "The General Accounting Office (1993) … estimate was $516 million … The estimate does not include time for more-extensive test preparation activities." p.9 | Denigrating | Introduction, p.9 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/pubs/monograph_reports/MR1554.html | US National Science Foundation | As a matter of fact the GAO report did include those costs -- all of them. The GAO surveys very explicitly instructed respondents to "include any and all costs related" to each test, including any and all test preparation time and expenses. | |||
166 | Laura S. Hamilton, Daniel M. Koretz | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "There is currently no substantial evidence on the effects of published report cards on parents’ decisionmaking or on the schools themselves." | Dismissive | Chapter 2: Tests and their use in test-based accountability systems, p.44 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/pubs/monograph_reports/MR1554.html | US National Science Foundation | For decades, consulting services have existed that help parents new to a city select the right school or school district for them. | ||
167 | Vi-Nhuan Le, Stephen P. Klein | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "Research on the inflation of gains remains too limited to indicate how prevalent the problem is." | Dismissive | Chapter 3: Technical criteria for evaluating tests, p. 68 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/pubs/monograph_reports/MR1554.html | US National Science Foundation | In fact, the test prep, or test coaching, literature is vast and dates back decades, with meta-analyses of the literature dating back at least to the 1970s. There's even a What Works Clearinghouse summary of the (post World Wide Web) college admission test prep research literature: https://ies.ed.gov/ncee/wwc/Docs/InterventionReports/wwc_act_sat_100416.pdf . See also: Gilmore (1927); DeWeerdt (1927); French (1959); French & Dear (1959); Ortar (1960); Marron (1965); ETS (1965); Messick & Jungeblut (1981); Ellis, Konoske, Wulfeck, & Montague (1982); DerSimonian & Laird (1983); Kulik, Bangert-Drowns & Kulik (1984); Powers (1985); Samson (1985); Scruggs, White, & Bennion (1986); Jones (1986); Fraker (1986/1987); Halpin (1987); Whitla (1988); Snedecor (1989); Bond (1989); Baydar (1990); Becker (1990); Smyth (1990); Moore (1991); Alderson & Wall (1992); Powers (1993); Oren (1993); Powers & Rock (1994); Scholes, Lane (1997); Allalouf & Ben Shakhar (1998); Robb & Ercanbrack (1999); McClain (1999); Camara (1999, 2001, 2008); Stone & Lane (2000, 2003); Din & Soldan (2001); Briggs (2001); Palmer (2002); Briggs & Hansen (2004); Cankoy & Ali Tut (2005); Crocker (2005); Allensworth, Correa, & Ponisciak (2008); Domingue & Briggs (2009); Koljatic & Silva (2014); Early (2019); Herndon (2021) | ||
168 | Vi-Nhuan Le, Stephen P. Klein | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "Relatively little is known about how testing accommodations affect score validity, and the few studies that have been conducted on the subject have had mixed results." | Dismissive | Chapter 3: Technical criteria for evaluating tests, p. 71 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/pubs/monograph_reports/MR1554.html | US National Science Foundation | |||
169 | Brian M. Stecher | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "High-stakes testing may also affect parents (e.g., their attitudes toward education, their engagement with schools, and their direct participation in their child's learning) as well as policymakers (their beliefs about system performance, their judgements about program effectiveness, and their allocation of resources). However, these issues remain largely unexamined in the literature." | Dismissive | Chapter 4: Consequences of large-scale, high-stakes testing on school and classroom practice, p. 79 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/content/dam/rand/pubs/monograph_reports/2002/MR1554.pdf | US National Science Foundation | Parents and other adults are typically reached through public opinion polls. See, for example: https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/SurveyList.htm . Among the hundreds of polls conducted between 1958 and 2008, a majority of them included parents in particular or adults in general. | ||
170 | Brian M. Stecher | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "As described in chapter 2, there was little concern about the effects of testing on teaching prior to the 1970s." | Dismissive | Chapter 4: Consequences of large-scale, high-stakes testing on school and classroom practice, p. 81 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/content/dam/rand/pubs/monograph_reports/2002/MR1554.pdf | US National Science Foundation | Rubbish. Entire books were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
171 | Brian M. Stecher | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "In light of the changes that occurred in the uses of large-scale testing in the 1980s and 1990s, researchers began to investigate teachers' reactions to external assessment. The initial research on the impact of large-scale testing was conducted in the 1980s and the 1990s." | Dismissive | Chapter 4: Consequences of large-scale, high-stakes testing on school and classroom practice, p. 83 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/content/dam/rand/pubs/monograph_reports/2002/MR1554.pdf | US National Science Foundation | Hundreds, perhaps thousands, of studies of the effects of testing predate CRESST's in the 1980s. See, for example: https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/QuantitativeList.htm ; https://nonpartisaneducation.org/Review/Resources/SurveyList.htm ; https://nonpartisaneducation.org/Review/Resources/QualitativeList.htm | ||
172 | Brian M. Stecher | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "The bulk of the research on the effects of testing has been conducted using surveys and case studies." | Dismissive | Chapter 4: Consequences of large-scale, high-stakes testing on school and classroom practice, p. 83 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/content/dam/rand/pubs/monograph_reports/2002/MR1554.pdf | US National Science Foundation | This is misleading. True, many of the hundreds of studies on the effects of testing have been surveys and case studies. But, many, and more by my count, have been randomized experiments. See, for example: https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/QuantitativeList.htm ; | ||
173 | Brian M. Stecher | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "Data on the incidence of cheating [on educational tests] are scarce…" | Dismissive | Chapter 4: Consequences of large-scale, high-stakes testing on school and classroom practice, p. 96 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/pubs/monograph_reports/MR1554.html | US National Science Foundation | Actually, such data have been collected in surveys, in which respondents freely admit that they cheat, and how. Moreover, news reports of cheating, by students or educators, have been voluminous. See, for example, Caveon Test Security's "Cheating in the News" section on its web site. | ||
174 | Brian M. Stecher | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "Less is known about changes in policies at the district and school levels in response to high-stakes testing, but mixed evidence of some impact has appeared." | Dismissive | Chapter 4: Consequences of large-scale, high-stakes testing on school and classroom practice, p. 96 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/pubs/monograph_reports/MR1554.html | US National Science Foundation | Relevant pre-2000 studies of the effects of testing on at-risk students, completion, dropping out, curricular offerings, attitudes, etc. include those of Schleisman (1999); the *Southern Regional Education Board (1998); Webster, Mendro, Orsak, Weerasinghe & Bembry (1997); Jones (1996); Boylan (1996); Jones, 1993; Jacobson (1992); Grisay (1991); Johnstone (1990); Task Force on Educational Assessment Programs [Florida] (1979); Wellisch, MacQueen, Carriere & Duck (1978); Enochs (1978); Pronaratna (1976); and McWilliams & Thomas (1976). | ||
175 | Brian M. Stecher | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "Although numerous news articles have addressed the negative effects of high-stakes testing, systematic research on the subject is limited." | Dismissive | Chapter 4: Consequences of large-scale, high-stakes testing on school and classroom practice, p. 98 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/pubs/monograph_reports/MR1554.html | US National Science Foundation | Relevant pre-2000 studies of the effects of testing on at-risk students, completion, dropping out, curricular offerings, attitudes, etc. include those of Schleisman (1999); the *Southern Regional Education Board (1998); Webster, Mendro, Orsak, Weerasinghe & Bembry (1997); Jones (1996); Boylan (1996); Jones, 1993; Jacobson (1992); Grisay (1991); Johnstone (1990); Task Force on Educational Assessment Programs [Florida] (1979); Wellisch, MacQueen, Carriere & Duck (1978); Enochs (1978); Pronaratna (1976); and McWilliams & Thomas (1976). | ||
176 | Brian M. Stecher | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "Research regarding the effects of test-based accountability on equity is very limited." | Dismissive | Chapter 4: Consequences of large-scale, high-stakes testing on school and classroom practice, p. 99 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/content/dam/rand/pubs/monograph_reports/2002/MR1554.pdf | US National Science Foundation | |||
177 | Brian M. Stecher | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "Researchers have not documented the desirable consequences of testing … as clearly as the undesirable ones." | Dismissive | Chapter 4: Consequences of large-scale, high-stakes testing on school and classroom practice, p. 99 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/content/dam/rand/pubs/monograph_reports/2002/MR1554.pdf | US National Science Foundation | See, for example: https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/QuantitativeList.htm ; https://nonpartisaneducation.org/Review/Resources/SurveyList.htm ; https://nonpartisaneducation.org/Review/Resources/QualitativeList.htm | ||
178 | Brian M. Stecher | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | " … researchers have not generally measured the extent or magnitude of the shifts in practice that they identified as a result of high-stakes testing." | Dismissive | Chapter 4: Consequences of large-scale, high-stakes testing on school and classroom practice, pp. 99–100 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/content/dam/rand/pubs/monograph_reports/2002/MR1554.pdf | US National Science Foundation | The 1993 GAO study did. See, also: https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/QuantitativeList.htm ; https://nonpartisaneducation.org/Review/Resources/SurveyList.htm ; https://nonpartisaneducation.org/Review/Resources/QualitativeList.htm | ||
179 | Lorraine M. McDonnell | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "...this chapter can only describe the issues that are raised when one looks at testing from a political perspective. Because of the lack of systematic studies on the topic." | Dismissive | Chapter 5: Accountability as seen through a political lens, p.102 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/pubs/monograph_reports/MR1554.html | US National Science Foundation | Parents and other adults are typically reached through public opinion polls. See, for example: https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/SurveyList.htm . Among the hundreds of polls conducted between 1958 and 2008, a majority of them included parents in particular or adults in general. | ||
180 | Lorraine M. McDonnell | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "...public opinion, as measured by surveys, does not always provide a clear and unambiguous measure of public sentiment." | Denigrating | Chapter 5: Accountability as seen through a political lens, p.108 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/pubs/monograph_reports/MR1554.html | US National Science Foundation | Parents and other adults are typically reached through public opinion polls. See, for example: https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/SurveyList.htm . Among the hundreds of polls conducted between 1958 and 2008, a majority of them included parents in particular or adults in general. | ||
181 | Laura S. Hamilton, Brian M. Stecher | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "So test-based accountability remains controversial because there is inadequate evidence to make clear judgments about its effectiveness in raising test scores and achieving its other goals." | Dismissive | Chapter 6: Improving test-based accountability, p.122 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/pubs/monograph_reports/MR1554.html | US National Science Foundation | In fact, the evidence "that testing can improve education" is voluminous. See, for example, Phelps, R. P. (2005). The rich, robust research literature on testing’s achievement benefits. In R. P. Phelps (Ed.), Defending standardized testing (pp. 55–90). Mahwah, NJ: Psychology Press. Or, see https://journals.sagepub.com/doi/abs/10.1177/0193841X19865628#abstract | ||
182 | Laura S. Hamilton, Brian M. Stecher | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "Unfortunately, the complexity of the issues and the ambiguity of the existing research do not allow our recommendations to take the form of a practical “how-to” guide for policymakers and practitioners." | Denigrating | Chapter 6: Improving test-based accountability, p.123 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/pubs/monograph_reports/MR1554.html | US National Science Foundation | In fact, the evidence "that testing can improve education" is voluminous. See, for example, Phelps, R. P. (2005). The rich, robust research literature on testing’s achievement benefits. In R. P. Phelps (Ed.), Defending standardized testing (pp. 55–90). Mahwah, NJ: Psychology Press. Or, see https://journals.sagepub.com/doi/abs/10.1177/0193841X19865628#abstract | ||
183 | Laura S. Hamilton, Brian M. Stecher | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "Additional research is needed to identify the elements of performance on tests and how these elements map onto other tests …." | Denigrating | Chapter 6: Improving test-based accountability, p.127 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/pubs/monograph_reports/MR1554.html | US National Science Foundation | |||
184 | Laura S. Hamilton, Brian M. Stecher | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "Another part of the interpretive question is the need to gather information in other subject areas to portray a more complete picture of achievement. The scope of constructs that have been considered in research to date has been fairly narrow, focusing on the subjects that are part of the accountability systems that have been studied. Many legitimate instructional objectives have been ignored in the literature to date." | Denigrating | Chapter 6: Improving test-based accountability, p.127 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/pubs/monograph_reports/MR1554.html | US National Science Foundation | Many studies of the effects of testing predate CRESST's in the 1980s and cover all subject fields, not just reading and math. See, for example: https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/QuantitativeList.htm ; https://nonpartisaneducation.org/Review/Resources/SurveyList.htm ; https://nonpartisaneducation.org/Review/Resources/QualitativeList.htm | ||
185 | Laura S. Hamilton, Brian M. Stecher | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "States should also conduct ongoing analyses of the performance of groups whose members may not be numerous enough to permit separate reporting. English-language learners and students with disabilities are increasingly being included in high-stakes testing systems, and, as discussed in Chapter Three, little is currently known about the validity of scores for these groups." | Dismissive | Chapter 6: Improving test-based accountability, p.131 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/pubs/monograph_reports/MR1554.html | US National Science Foundation | Difficult to believe given that the federal government has for decades generously funded research into testing students with disabilities. See, for example, https://nceo.info/ and Kurt Geisinger's and Janet Carlson's chapters in Defending Standardized Testing and Correcting Fallacies in Educational and Psychological Testing. | ||
186 | Laura S. Hamilton, Brian M. Stecher | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "It would be especially helpful to know what changes in instruction are made in response to different kinds of information and incentives. In particular, we need to know how teachers interpret information from tests and how they use it to modify instruction." | Dismissive | Chapter 6: Improving test-based accountability, p.133 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/pubs/monograph_reports/MR1554.html | US National Science Foundation | Relevant studies of the effects of tests and/or accountability programs on motivation and instructional practice include those of the *Southern Regional Education Board (1998); Johnson (1998); Schafer, Hultgren, Hawley, Abrams Seubert & Mazzoni (1997); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); Tuckman & Trimble (1997); Clarke & Stephens (1996); Zigarelli (1996); Stevenson, Lee, et al. (1995); Waters, Burger & Burger (1995); Egeland (1995); Prais (1995); Tuckman (1994); Ritchie & Thorkildsen (1994); Brown & Walberg (1993); Wall & Alderson (1993); Wolf & Rapiau (1993); Eckstein & Noah (1993); Chao-Qun & Hui (1993); Plazak & Mazur (1992); Steedman (1992); Singh, Marimutha & Mukjerjee (1990); *Levine & Lezotte (1990); O’Sullivan (1989); Somerset (1988); Pennycuick & Murphy (1988); Stevens (1984); Marsh (1984); Brunton (1982); Solberg (1977); Foss (1977); *Kirkland (1971); Somerset (1968); Stuit (1947); and Keys (1934). *Covers many studies; study is a research review, research synthesis, or meta-analysis. | "Others have considered the role of tests in incentive programs. These researchers have included Homme, Csanyi, Gonzales, Rechs, O’Leary, Drabman, Kaszdin, Bootzin, Staats, Cameron, Pierce, McMillan, Corcoran, and Wilson. International organizations, such as the World Bank or the Asian Development Bank, have studied the effects of testing on education programs they sponsor. Researchers have included Somerset, Heynemann, Ransom, Psacharopoulis, Velez, Brooke, Oxenham, Bude, Chapman, Snyder, and Pronaratna. Moreover, the mastery learning/mastery testing experiments conducted from the 1960s through today varied incentives, frequency of tests, types of tests, and many other factors to determine the optimal structure of testing programs. Researchers included such notables as Bloom, Carroll, Keller, Block, Burns, Wentling, Anderson, Hymel, Kulik, Tierney, Cross, Okey, Guskey, Gates, and Jones." |
187 | Laura S. Hamilton, Brian M. Stecher | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "It seems clear that aligning the components of the system and providing appropriate professional development should, at a minimum, increase teachers’ political support for test-based accountability policies .... Although there is no empirical evidence to suggest that this strategy will reduce inappropriate responses to high-stakes testing, ... Additional research needs to be done to determine the importance of alignment for promoting positive effects of test-based accountability." | Dismissive | Chapter 6: Improving test-based accountability, p.135 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/pubs/monograph_reports/MR1554.html | US National Science Foundation | Relevant studies of the effects of tests and/or accountability programs on motivation and instructional practice include those of the *Southern Regional Education Board (1998); Johnson (1998); Schafer, Hultgren, Hawley, Abrams Seubert & Mazzoni (1997); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); Tuckman & Trimble (1997); Clarke & Stephens (1996); Zigarelli (1996); Stevenson, Lee, et al. (1995); Waters, Burger & Burger (1995); Egeland (1995); Prais (1995); Tuckman (1994); Ritchie & Thorkildsen (1994); Brown & Walberg (1993); Wall & Alderson (1993); Wolf & Rapiau (1993); Eckstein & Noah (1993); Chao-Qun & Hui (1993); Plazak & Mazur (1992); Steedman (1992); Singh, Marimutha & Mukjerjee (1990); *Levine & Lezotte (1990); O’Sullivan (1989); Somerset (1988); Pennycuick & Murphy (1988); Stevens (1984); Marsh (1984); Brunton (1982); Solberg (1977); Foss (1977); *Kirkland (1971); Somerset (1968); Stuit (1947); and Keys (1934). *Covers many studies; study is a research review, research synthesis, or meta-analysis. | "Others have considered the role of tests in incentive programs. These researchers have included Homme, Csanyi, Gonzales, Rechs, O’Leary, Drabman, Kaszdin, Bootzin, Staats, Cameron, Pierce, McMillan, Corcoran, and Wilson. International organizations, such as the World Bank or the Asian Development Bank, have studied the effects of testing on education programs they sponsor. Researchers have included Somerset, Heynemann, Ransom, Psacharopoulis, Velez, Brooke, Oxenham, Bude, Chapman, Snyder, and Pronaratna. Moreover, the mastery learning/mastery testing experiments conducted from the 1960s through today varied incentives, frequency of tests, types of tests, and many other factors to determine the optimal structure of testing programs. Researchers included such notables as Bloom, Carroll, Keller, Block, Burns, Wentling, Anderson, Hymel, Kulik, Tierney, Cross, Okey, Guskey, Gates, and Jones." |
188 | Laura S. Hamilton, Brian M. Stecher | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "… we currently do not know enough about test-based accountability to design a system that is immune from the problems we have discussed…" | Dismissive | Chapter 6: Improving test-based accountability, p.136 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/pubs/monograph_reports/MR1554.html | US National Science Foundation | In fact, the evidence "that testing can improve education" is voluminous. See, for example, Phelps, R. P. (2005). The rich, robust research literature on testing’s achievement benefits. In R. P. Phelps (Ed.), Defending standardized testing (pp. 55–90). Mahwah, NJ: Psychology Press. Or, see https://journals.sagepub.com/doi/abs/10.1177/0193841X19865628#abstract | ||
189 | Laura S. Hamilton, Brian M. Stecher | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "There is some limited evidence that educators’ responses to test based accountability vary according to the characteristics of their student populations,…" | Denigrating | Chapter 6: Improving test-based accountability, p.138 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/pubs/monograph_reports/MR1554.html | US National Science Foundation | There was and is far more than "limited" evidence. Hundreds, perhaps thousands, of studies of the effects of testing predate CRESST's in the 1980s. See, for example: https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/SurveyList.htm ; https://nonpartisaneducation.org/Review/Resources/QualitativeList.htm | ||
190 | Laura S. Hamilton, Brian M. Stecher | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "... there is very limited evidence to guide thinking about political issues." | Dismissive | Chapter 6: Improving test-based accountability, p.139 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/pubs/monograph_reports/MR1554.html | US National Science Foundation | Parents and other adults are typically reached through public opinion polls. See, for example: https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/SurveyList.htm . Among the hundreds of polls conducted between 1958 and 2008, a majority of them included parents in particular or adults in general. | ||
191 | Laura S. Hamilton, Brian M. Stecher | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "First, we do not have an accurate assessment of the additional costs." | Denigrating | Chapter 6: Improving test-based accountability, p.141 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/pubs/monograph_reports/MR1554.html | US National Science Foundation | Yes, we did and we do. See, for example, Phelps, R.P. (2000, Winter). Estimating the cost of systemwide student testing in the United States. Journal of Education Finance, 25(3) 343–380; Danitz, T. (2001, February 27). Special report: States pay $400 million for tests in 2001. Stateline.org. Pew Center for the States; Hoxby, C.M. (2002). The cost of accountability, in W. M Evers & H.J. Walberg (Eds.), School Accountability, Stanford, CA: Hoover Institution Press; U.S. GAO. (1993, January). Student testing: Current extent and expenditures, with cost estimates for a national examination. GAO/PEMD-93-8. Washington, DC: US General Accounting Office; Picus, L.O., & Tralli, A. (1998, February). Alternative assessment programs: What are the true costs? CSE Technical Report 441, Los Angeles: CRESST; Phelps, R.P. (1998). Benefit-cost analysis of systemwide student testing, Paper presented at the annual meeting of the American Education Finance Association, Mobile, AL. | ||
192 | Laura S. Hamilton, Brian M. Stecher | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "However, many of these recommended reforms are relatively inexpensive in comparison with the total cost of education. This equation is seldom examined." | Denigrating | Chapter 6: Improving test-based accountability, p.141 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/pubs/monograph_reports/MR1554.html | US National Science Foundation | Wrong. See, for example, Phelps, R.P. (2000, Winter). Estimating the cost of systemwide student testing in the United States. Journal of Education Finance, 25(3) 343–380; Phelps, R.P. (1998). Benefit-cost analysis of systemwide student testing, Paper presented at the annual meeting of the American Education Finance Association, Mobile, AL. | ||
193 | Laura S. Hamilton, Brian M. Stecher | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "Part of the reason these issues are rarely considered may be that no one has produced a good estimate of the cost of an improved accountability system in comparison with its benefits." | Denigrating | Chapter 6: Improving test-based accountability, p.141 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/pubs/monograph_reports/MR1554.html | US National Science Foundation | No. See, for example, Phelps, R.P. (2000, Winter). Estimating the cost of systemwide student testing in the United States. Journal of Education Finance, 25(3) 343–380; Danitz, T. (2001, February 27). Special report: States pay $400 million for tests in 2001. Stateline.org. Pew Center for the States; Hoxby, C.M. (2002). The cost of accountability, in W. M Evers & H.J. Walberg (Eds.), School Accountability, Stanford, CA: Hoover Institution Press; U.S. GAO. (1993, January). Student testing: Current extent and expenditures, with cost estimates for a national examination. GAO/PEMD-93-8. Washington, DC: US General Accounting Office; Picus, L.O., & Tralli, A. (1998, February). Alternative assessment programs: What are the true costs? CSE Technical Report 441, Los Angeles: CRESST; Phelps, R.P. (1998). Benefit-cost analysis of systemwide student testing, Paper presented at the annual meeting of the American Education Finance Association, Mobile, AL. | ||
194 | Laura S. Hamilton, Brian M. Stecher | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "Nevertheless, our knowledge of the costs of alternative accountability systems is still somewhat limited. Policymakers need to know how much it would cost to change their current systems to be responsive to criticisms such as those described in this book. These estimates need to consider all of the associated costs, including possible opportunity costs associated with increased testing time and increased test preparation time." | Dismissive | Chapter 6: Improving test-based accountability, p.142 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/pubs/monograph_reports/MR1554.html | US National Science Foundation | No. See, for example, Phelps, R.P. (2000, Winter). Estimating the cost of systemwide student testing in the United States. Journal of Education Finance, 25(3) 343–380; Danitz, T. (2001, February 27). Special report: States pay $400 million for tests in 2001. Stateline.org. Pew Center for the States; Hoxby, C.M. (2002). The cost of accountability, in W. M Evers & H.J. Walberg (Eds.), School Accountability, Stanford, CA: Hoover Institution Press; U.S. GAO. (1993, January). Student testing: Current extent and expenditures, with cost estimates for a national examination. GAO/PEMD-93-8. Washington, DC: US General Accounting Office; Picus, L.O., & Tralli, A. (1998, February). Alternative assessment programs: What are the true costs? CSE Technical Report 441, Los Angeles: CRESST; Phelps, R.P. (1998). Benefit-cost analysis of systemwide student testing, Paper presented at the annual meeting of the American Education Finance Association, Mobile, AL. | ||
195 | Laura S. Hamilton, Brian M. Stecher | Laura S. Hamilton, Brian M. Stecher, Stephen P. Klein, Eds. | "However, there is still much about these systems that is not well understood. Lack of research-based knowledge about the quality of scores and the mechanisms through which high-stakes testing programs operate limits our ability to improve these systems. As a result, our discussions also identified unanswered questions..." | Dismissive | Chapter 6: Improving test-based accountability, p.143 | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | https://www.rand.org/pubs/monograph_reports/MR1554.html | US National Science Foundation | In fact, the evidence "that testing can improve education" is voluminous. See, for example, Phelps, R. P. (2005). The rich, robust research literature on testing’s achievement benefits. In R. P. Phelps (Ed.), Defending standardized testing (pp. 55–90). Mahwah, NJ: Psychology Press. Or, see https://journals.sagepub.com/doi/abs/10.1177/0193841X19865628#abstract | ||
196 | Eva L. Baker, Robert L. Linn, Joan L. Herman, and Daniel Koretz | "Because experience with accountability systems is still developing, the standards we propose are intended to help evaluate existing systems and to guide the design of improved procedures." p.1 | Dismissive | Standards for Educational Accountability Systems | CRESST Policy Brief 5, Winter 2002 | https://www.gpo.gov/fdsys/pkg/ERIC-ED466643/pdf/ERIC-ED466643.pdf | Office of Research and Improvement, US Education Department | See, for example: https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/QuantitativeList.htm . This list includes 24 studies completed before 2000 whose primary focus was to measure the effect of “test-based accountability.” A few dozen more pre-2000 studies also measured the effect of test-based accountability although such was not their primary focus. Include qualitative and program evaluation studies of test-based accountability, and the count of pre-2000 studies rises into the hundreds. | Relevant studies of the effects of tests and/or accountability program on motivation and instructional practice include those of the *Southern Regional Education Board (1998); Johnson (1998); Schafer, Hultgren, Hawley, Abrams Seubert & Mazzoni (1997); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); Tuckman & Trimble (1997); Clarke & Stephens (1996); Zigarelli (1996); Stevenson, Lee, et al. (1995); Waters, Burger & Burger (1995); Egeland (1995); Prais (1995); Tuckman (1994); Ritchie & Thorkildsen (1994); Brown & Walberg (1993); Wall & Alderson (1993); Wolf & Rapiau (1993); Eckstein & Noah (1993); Chao-Qun & Hui (1993); Plazak & Mazur (1992); Steedman (1992); Singh, Marimutha & Mukjerjee (1990); *Levine & Lezotte (1990); O’Sullivan (1989); Somerset (1988); Pennycuick & Murphy (1988); Stevens (1984); Marsh (1984); Brunton (1982); Solberg (1977); Foss (1977); *Kirkland (1971); Somerset (1968); Stuit (1947); and Keys (1934). *Covers many studies; study is a research review, research synthesis, or meta-analysis. ||
197 | Eva L. Baker, Robert L. Linn, Joan L. Herman, and Daniel Koretz | "It is not possible at this stage in the development of accountability systems to know in advance how every element of an accountability system will actually operate in practice or what effects it will produce." p.1 | Dismissive | Standards for Educational Accountability Systems | CRESST Policy Brief 5, Winter 2002 | https://www.gpo.gov/fdsys/pkg/ERIC-ED466643/pdf/ERIC-ED466643.pdf | Office of Research and Improvement, US Education Department | See, for example: https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/QuantitativeList.htm . This list includes 24 studies completed before 2000 whose primary focus was to measure the effect of “test-based accountability.” A few dozen more pre-2000 studies also measured the effect of test-based accountability although such was not their primary focus. Include qualitative and program evaluation studies of test-based accountability, and the count of pre-2000 studies rises into the hundreds. | Relevant studies of the effects of tests and/or accountability program on motivation and instructional practice include those of the *Southern Regional Education Board (1998); Johnson (1998); Schafer, Hultgren, Hawley, Abrams Seubert & Mazzoni (1997); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); Tuckman & Trimble (1997); Clarke & Stephens (1996); Zigarelli (1996); Stevenson, Lee, et al. (1995); Waters, Burger & Burger (1995); Egeland (1995); Prais (1995); Tuckman (1994); Ritchie & Thorkildsen (1994); Brown & Walberg (1993); Wall & Alderson (1993); Wolf & Rapiau (1993); Eckstein & Noah (1993); Chao-Qun & Hui (1993); Plazak & Mazur (1992); Steedman (1992); Singh, Marimutha & Mukjerjee (1990); *Levine & Lezotte (1990); O’Sullivan (1989); Somerset (1988); Pennycuick & Murphy (1988); Stevens (1984); Marsh (1984); Brunton (1982); Solberg (1977); Foss (1977); *Kirkland (1971); Somerset (1968); Stuit (1947); and Keys (1934). *Covers many studies; study is a research review, research synthesis, or meta-analysis. ||
198 | Jay P. Heubert | "For Heubert, it is very much an open question what the effect of standards and high-stakes testing will be." p.83 | Dismissive | Achieving High Standards for All | National Research Council | "This project was funded by grant R215U990023 from the Office of Educational Research and Improvement (OERI) of the United States Department of Education." | See, for example, https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/SurveyList.htm ; https://nonpartisaneducation.org/Review/Resources/QualitativeList.htm; https://nonpartisaneducation.org/Review/Resources/QuantitativeList.htm | Relevant pre-2000 studies of the effects of standards, alignment, goal setting, setting reachable goals, etc. include those of Mitchell (1999); Morgan & Ramist (1998); the *Southern Regional Education Board (1998); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); the Florida Office of Program Policy Analysis (1997); Pomplun (1997); Schmoker (1996); Aguilera & Hendricks (1996); Banta, Lund, Black & Oblander (1996); Bottoms & Mikos (1995); *Bamburg & Medina (1993); Bishop (1993); the U. S. General Accounting Office (1993); Eckstein & Noah (1993); Mattsson (1993); Brown (1992); Heyneman & Ransom (1992); Whetton (1992); Anderson, Muir, Bateson, Blackmore & Rogers (1990); Csikszentmihalyi (1990); *Levine & Lezotte (1990); LaRoque & Coleman (1989); Hillocks (1987); Willingham & Morris (1986); Resnick & Resnick (1985); Ogle & Fritts (1984); *Natriello & Dornbusch (1984); Brooke & Oxenham (1984); Rentz (1979); Wellisch, MacQueen, Carriere & Dick (1978); *Rosswork (1977); Estes, Colvin & Goodwin (1976); Wood (1953); and Panlasigui & Knight (1930). |
199 | Ready, Timothy, Ed.; Edley, Christopher, Jr., Ed.; Snow, Catherine E., Ed. | "To be sure, there is a largely unexamined empirical assertion underlying the arguments of high-stakes proponents: attaching high-stakes consequences for the students provides an indispensable, otherwise unobtainable incentive for students, parents, and teachers to pay careful attention to learning tasks." p. 128 | Dismissive | Achieving High Standards for All | National Research Council | "This project was funded by grant R215U990023 from the Office of Educational Research and Improvement (OERI) of the United States Department of Education." | Relevant studies of the effects of tests and/or accountability program on motivation and instructional practice include those of the *Southern Regional Education Board (1998); Johnson (1998); Schafer, Hultgren, Hawley, Abrams Seubert & Mazzoni (1997); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); Tuckman & Trimble (1997); Clarke & Stephens (1996); Zigarelli (1996); Stevenson, Lee, et al. (1995); Waters, Burger & Burger (1995); Egeland (1995); Prais (1995); Tuckman (1994); Ritchie & Thorkildsen (1994); Brown & Walberg (1993); Wall & Alderson (1993); Wolf & Rapiau (1993); Eckstein & Noah (1993); Chao-Qun & Hui (1993); Plazak & Mazur (1992); Steedman (1992); Singh, Marimutha & Mukjerjee (1990); *Levine & Lezotte (1990); O’Sullivan (1989); Somerset (1988); Pennycuick & Murphy (1988); Stevens (1984); Marsh (1984); Brunton (1982); Solberg (1977); Foss (1977); *Kirkland (1971); Somerset (1968); Stuit (1947); and Keys (1934). *Covers many studies; study is a research review, research synthesis, or meta-analysis. | "Others have considered the role of tests in incentive programs. These researchers have included Homme, Csanyi, Gonzales, Rechs, O’Leary, Drabman, Kaszdin, Bootzin, Staats, Cameron, Pierce, McMillan, Corcoran, and Wilson. International organizations, such as the World Bank or the Asian Development Bank, have studied the effects of testing on education programs they sponsor. Researchers have included Somerset, Heynemann, Ransom, Psacharopoulis, Velez, Brooke, Oxenham, Bude, Chapman, Snyder, and Pronaratna. Moreover, the mastery learning/mastery testing experiments conducted from the 1960s through today varied incentives, frequency of tests, types of tests, and many other factors to determine the optimal structure of testing programs. Researchers included such notables as Bloom, Carroll, Keller, Block, Burns, Wentling, Anderson, Hymel, Kulik, Tierney, Cross, Okey, Guskey, Gates, and Jones." |||
200 | Daniel M. Koretz | Daniel F. McCaffrey, Laura S. Hamilton | "Although high-stakes testing is now widespread, methods for evaluating the validity of gains obtained under high-stakes conditions are poorly developed. This report presents an approach for evaluating the validity of inferences based on score gains on high-stakes tests. It describes the inadequacy of traditional validation approaches for validating gains under high-stakes conditions and outlines an alternative validation framework for conceptualizing meaningful and inflated score gains." p.1 | Denigrating | Toward a framework for validating gains under high-stakes conditions | CSE Technical Report 551, CRESST/Harvard Graduate School of Education, CRESST/RAND Education, December 2001 | https://files.eric.ed.gov/fulltext/ED462410.pdf | Office of Research and Improvement, US Education Department | In fact, the test prep, or test coaching, literature is vast and dates back decades, with meta-analyses of the literature dating back at least to the 1970s. There's even a What Works Clearinghouse summary of the (post World Wide Web) college admission test prep research literature: https://ies.ed.gov/ncee/wwc/Docs/InterventionReports/wwc_act_sat_100416.pdf . See also: Gilmore (1927); DeWeerdt (1927); French (1959); French & Dear (1959); Ortar (1960); Marron (1965); ETS (1965); Messick & Jungeblut (1981); Ellis, Konoske, Wulfeck, & Montague (1982); DerSimonian and Laird (1983); Kulik, Bangert-Drowns & Kulik (1984); Powers (1985); Samson (1985); Scruggs, White, & Bennion (1986); Jones (1986); Fraker (1986/1987); Halpin (1987); Whitla (1988); Snedecor (1989); Bond (1989); Baydar (1990); Becker (1990); Smyth (1990); Moore (1991); Alderson & Wall (1992); Powers (1993); Oren (1993); Powers & Rock (1994); Scholes & Lane (1997); Allalouf & Ben Shakhar (1998); Robb & Ercanbrack (1999); McClain (1999); Camara (1999, 2001, 2008); Stone & Lane (2000, 2003); Din & Soldan (2001); Briggs (2001); Palmer (2002); Briggs & Hansen (2004); Cankoy & Ali Tut (2005); Crocker (2005); Allensworth, Correa, & Ponisciak (2008); Domingue & Briggs (2009); Koljatic & Silva (2014); Early (2019); Herndon (2021) | ||
201 | Daniel M. Koretz | Daniel F. McCaffrey, Laura S. Hamilton | "Few efforts are made to evaluate directly score gains obtained under high-stakes conditions, and conventional validation tools are not fully adequate for the task." p. 1 | Dismissive | Toward a framework for validating gains under high-stakes conditions | CSE Technical Report 551, CRESST/Harvard Graduate School of Education, CRESST/RAND Education, December 2001 | https://files.eric.ed.gov/fulltext/ED462410.pdf | Office of Research and Improvement, US Education Department | In fact, the test prep, or test coaching, literature is vast and dates back decades, with meta-analyses of the literature dating back at least to the 1970s. There's even a What Works Clearinghouse summary of the (post World Wide Web) college admission test prep research literature: https://ies.ed.gov/ncee/wwc/Docs/InterventionReports/wwc_act_sat_100416.pdf . See also: Gilmore (1927); DeWeerdt (1927); French (1959); French & Dear (1959); Ortar (1960); Marron (1965); ETS (1965); Messick & Jungeblut (1981); Ellis, Konoske, Wulfeck, & Montague (1982); DerSimonian and Laird (1983); Kulik, Bangert-Drowns & Kulik (1984); Powers (1985); Samson (1985); Scruggs, White, & Bennion (1986); Jones (1986); Fraker (1986/1987); Halpin (1987); Whitla (1988); Snedecor (1989); Bond (1989); Baydar (1990); Becker (1990); Smyth (1990); Moore (1991); Alderson & Wall (1992); Powers (1993); Oren (1993); Powers & Rock (1994); Scholes & Lane (1997); Allalouf & Ben Shakhar (1998); Robb & Ercanbrack (1999); McClain (1999); Camara (1999, 2001, 2008); Stone & Lane (2000, 2003); Din & Soldan (2001); Briggs (2001); Palmer (2002); Briggs & Hansen (2004); Cankoy & Ali Tut (2005); Crocker (2005); Allensworth, Correa, & Ponisciak (2008); Domingue & Briggs (2009); Koljatic & Silva (2014); Early (2019) | ||
202 | Karen J. Mitchell, David Z. Robinson, Barbara S. Plake, & Kaeli T. Knowles (Eds.) | "Despite their importance and widespread use, little is known about the impact of these tests on states’ recent efforts to improve teaching and learning." | Dismissive | Testing Teacher Candidates: The Role of Licensure Tests in Improving Teacher Quality, 2001, p.14 | Committee on Assessment and Teacher Quality | Board on Testing and Assessment, National Research Council | Every stage of test development, administration, and analysis at National Evaluation Systems—the contractors for dozens of states' teacher licensure tests—was thoroughly documented. But, instead of requesting that documentation from each state, which owned said documentation, the NRC committee insisted that NES provide it. NES refused to do so unless the NRC committee received permission from each state. The NRC committee, apparently, didn't feel like doing that much work, so declared the information nonexistent. ||||
203 | Karen J. Mitchell, David Z. Robinson, Barbara S. Plake, & Kaeli T. Knowles (Eds.) | "Little information about the technical soundness of teacher licensure tests appears in the published literature." | Dismissive | Testing Teacher Candidates: The Role of Licensure Tests in Improving Teacher Quality, 2001, p.14 | Committee on Assessment and Teacher Quality | Board on Testing and Assessment, National Research Council | Every stage of test development, administration, and analysis at National Evaluation Systems—the contractors for dozens of states' teacher licensure tests—was thoroughly documented. But, instead of requesting that documentation from each state, which owned said documentation, the NRC committee insisted that NES provide it. NES refused to do so unless the NRC committee received permission from each state. The NRC committee, apparently, didn't feel like doing that much work, so declared the information nonexistent. ||||
204 | Karen J. Mitchell, David Z. Robinson, Barbara S. Plake, & Kaeli T. Knowles (Eds.) | "Little research exists on the extent to which licensure tests identify candidates with the knowledge and skills necessary to be minimally competent beginning teachers." | Dismissive | Testing Teacher Candidates: The Role of Licensure Tests in Improving Teacher Quality, 2001, p.14 | Committee on Assessment and Teacher Quality | Board on Testing and Assessment, National Research Council | Every stage of test development, administration, and analysis at National Evaluation Systems—the contractors for dozens of states' teacher licensure tests—was thoroughly documented. But, instead of requesting that documentation from each state, which owned said documentation, the NRC committee insisted that NES provide it. NES refused to do so unless the NRC committee received permission from each state. The NRC committee, apparently, didn't feel like doing that much work, so declared the information nonexistent. ||||
205 | Karen J. Mitchell, David Z. Robinson, Barbara S. Plake, & Kaeli T. Knowles (Eds.) | "Information is needed about the soundness and technical quality of the tests that states use to license their teachers." | Dismissive | Testing Teacher Candidates: The Role of Licensure Tests in Improving Teacher Quality, 2001, p.14 | Committee on Assessment and Teacher Quality | Board on Testing and Assessment, National Research Council | Every stage of test development, administration, and analysis at National Evaluation Systems—the contractors for dozens of states' teacher licensure tests—was thoroughly documented. But, instead of requesting that documentation from each state, which owned said documentation, the NRC committee insisted that NES provide it. NES refused to do so unless the NRC committee received permission from each state. The NRC committee, apparently, didn't feel like doing that much work, so declared the information nonexistent. ||||
206 | Karen J. Mitchell, David Z. Robinson, Barbara S. Plake, & Kaeli T. Knowles (Eds.) | "policy and practice on teacher licensure testing in the United States are nascent and evolving" | Dismissive | Testing Teacher Candidates: The Role of Licensure Tests in Improving Teacher Quality, 2001, p.17 | Committee on Assessment and Teacher Quality | Board on Testing and Assessment, National Research Council | Every stage of test development, administration, and analysis at National Evaluation Systems—the contractors for dozens of states' teacher licensure tests—was thoroughly documented. But, instead of requesting that documentation from each state, which owned said documentation, the NRC committee insisted that NES provide it. NES refused to do so unless the NRC committee received permission from each state. The NRC committee, apparently, didn't feel like doing that much work, so declared the information nonexistent. ||||
207 | Karen J. Mitchell, David Z. Robinson, Barbara S. Plake, & Kaeli T. Knowles (Eds.) | "The paucity of data and these methodological challenges made the committee’s examination of teacher licensure testing difficult." | Dismissive | Testing Teacher Candidates: The Role of Licensure Tests in Improving Teacher Quality, 2001, p.17 | Committee on Assessment and Teacher Quality | Board on Testing and Assessment, National Research Council | Every stage of test development, administration, and analysis at National Evaluation Systems—the contractors for dozens of states' teacher licensure tests—was thoroughly documented. But, instead of requesting that documentation from each state, which owned said documentation, the NRC committee insisted that NES provide it. NES refused to do so unless the NRC committee received permission from each state. The NRC committee, apparently, didn't feel like doing that much work, so declared the information nonexistent. ||||
208 | Karen J. Mitchell, David Z. Robinson, Barbara S. Plake, & Kaeli T. Knowles (Eds.) | "There were a number of questions the committee wanted to answer but could not, either because they were beyond the scope of this study, the evidentiary base was inconclusive, or the committee’s time and resources were insufficient." | Dismissive | Testing Teacher Candidates: The Role of Licensure Tests in Improving Teacher Quality, 2001, p.17 | Committee on Assessment and Teacher Quality | Board on Testing and Assessment, National Research Council | Every stage of test development, administration, and analysis at National Evaluation Systems—the contractors for dozens of states' teacher licensure tests—was thoroughly documented. But, instead of requesting that documentation from each state, which owned said documentation, the NRC committee insisted that NES provide it. NES refused to do so unless the NRC committee received permission from each state. The NRC committee, apparently, didn't feel like doing that much work, so declared the information nonexistent. ||||
209 | Harold F. O’Neil, Jr., University of Southern California, CRESST | Jamal Abedi, UCLA/CRESST, Charlotte Lee, UCLA/CRESST, Judy Miyoshi, UCLA/CRESST, Ann Mastergeorge, UCLA/CRESST | "To our knowledge, based on an extensive literature review (to be reported elsewhere), our research group is the only one conducting research of this type; i.e., meaningful monetary incentives with released items from either NAEP or TIMSS with 12th graders." p.1 | Firstness | Monetary Incentives for Low-Stakes Tests, March 2001 | report to USED, CRESST | https://nces.ed.gov/pubs2001/2001024.pdf | "The work reported herein was funded at least in part with Federal funds from the U.S. Department of Education under the American Institutes for Research (AIR)/Education Statistical Services Institute (ESSI) contract number RN95127001, Task Order 1.2.93.1, as administered by the ... NCES. The work reported herein was also supported under the Educational Research and Development Centers Program, PR/Award Number R305B60002, as administered by the Office of Educational Research and Improvement (OERI), U.S. Department of Education." |||
210 | Marguerite Clarke | | “[T]here has been no analogous infrastructure for independently evaluating a testing program before or after implementation, or for monitoring test use and impact.” p. 19 | Dismissive | The Adverse Impact of High Stakes Testing on Minority Students: Evidence from 100 Years of Test Data | In G. Orfield and M. Kornhaber (Eds.), Raising standards or raising barriers? Inequality and high stakes testing in public education. New York: The Century Foundation (2001) | http://files.eric.ed.gov/fulltext/ED450183.pdf | The Century Foundation | External evaluations of large-scale testing programs not only exist, but represent the norm. | ||
211 | Marguerite Clarke | | “The effects of testing are now so diverse, widespread, and serious that it is necessary to establish mechanisms for catalyzing inquiry about, and systematic independent scrutiny of them.” p. 20 | Dismissive | The Adverse Impact of High Stakes Testing on Minority Students: Evidence from 100 Years of Test Data | In G. Orfield and M. Kornhaber (Eds.), Raising standards or raising barriers? Inequality and high stakes testing in public education. New York: The Century Foundation (2001) | http://files.eric.ed.gov/fulltext/ED450183.pdf | The Century Foundation | External evaluations of large-scale testing programs not only exist, but represent the norm. | ||
212 | Ronald Deitel | | "In the late 1980s, CRESST was among the first to research the measurement of rigorous, discipline-based knowledge for purposes of large-scale assessment." | Firstness | Center for Research on Evaluation, Standards, and Student Testing (CRESST) clarify the goals and activities of CRESST | EducationNews.org, November 18, 2000 | Office of Research and Improvement, US Education Department | Nonsense. Hundreds, perhaps thousands, of studies of the effects of testing predate CRESST's in the 1980s. See, for example: https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/QuantitativeList.htm ; https://nonpartisaneducation.org/Review/Resources/SurveyList.htm ; https://nonpartisaneducation.org/Review/Resources/QualitativeList.htm | ||
213 | Marguerite Clarke | Ann Mastergeorge, UCL | “[F]or most of this century, there has been no infrastructure for independently evaluating a testing programme before or after implementation, or for monitoring test use and impact. The commercial testing industry does not as yet have any structure in place for the regulation and monitoring of appropriate test use.” p. 177 | Dismissive | Retrospective on Educational Testing and Assessment in the 20th Century | Curriculum Studies, 2000, vol. 32, no. 2 | http://webpages.uncc.edu/~rglamber/Rsch6109%20Materials/HistoryAchTests_3958652.pdf | External evaluations of large-scale testing programs not only exist, but represent the norm. | | |
214 | Marguerite Clarke | Madaus, Horn, and Ramos | “Given the paucity of evidence available on the volume of testing over time, we examined five indirect indicators of growth in testing. . . .” p. 169 | Dismissive | Retrospective on Educational Testing and Assessment in the 20th Century | Curriculum Studies, 2000, vol. 32, no. 2 | http://webpages.uncc.edu/~rglamber/Rsch6109%20Materials/HistoryAchTests_3958652.pdf | There exist many sources of such information, from the Council of Chief State School Officers (CCSSO), the US Education Department, the US General Accounting Office (GAO), for example. | |||
215 | Sheila Barron | "Although this is a topic researchers ... talk about often, very little has been written about the difficulties secondary analysts confront." p.173 | Dismissive | Difficulties associated with secondary analysis of NAEP data, chapter 9 | Grading the Nation's Report Card, National Research Council, 2000 | https://www.nap.edu/catalog/9751/grading-the-nations-report-card-research-from-the-evaluation-of | National Research Council funders | In their 2009 Evaluation of NAEP for the US Education Department, Buckendahl, Davis, Plake, Sireci, Hambleton, Zenisky, & Wells (pp. 77–85) managed to find quite a lot of research on making comparisons between NAEP and state assessments: several of NAEP's own publications, Chromy (2005), Chromy, Ault, Black, & Mosquin (2007), McLaughlin (2000), Schulz & Mitzel (2005), Sireci, Robin, Meara, Rogers, & Swaminathan (2000), Stancavage, et al. (2002), Stoneberg (2007), WestEd (2002), and Wise, Le, Hoffman, & Becker (2004). | | |
216 | Sheila Barron | "...few articles have been written that specifically address the difficulties of using NAEP data." p.173 | Dismissive | Difficulties associated with secondary analysis of NAEP data, chapter 9 | Grading the Nation's Report Card, National Research Council, 2000 | https://www.nap.edu/catalog/9751/grading-the-nations-report-card-research-from-the-evaluation-of | National Research Council funders | In their 2009 Evaluation of NAEP for the US Education Department, Buckendahl, Davis, Plake, Sireci, Hambleton, Zenisky, & Wells (pp. 77–85) managed to find quite a lot of research on making comparisons between NAEP and state assessments: several of NAEP's own publications, Chromy (2005), Chromy, Ault, Black, & Mosquin (2007), McLaughlin (2000), Schulz & Mitzel (2005), Sireci, Robin, Meara, Rogers, & Swaminathan (2000), Stancavage, et al. (2002), Stoneberg (2007), WestEd (2002), and Wise, Le, Hoffman, & Becker (2004). | | |
217 | Herman, Joan L. | “Testing accommodations that attempt to reduce the language load of a test or otherwise compensate for students' reduced language skills (e.g., by providing students more time) are also currently being researched, but answers that are equitable and fair for all students have not yet been found.” p. 8 | Dismissive | Student Assessment and Student Achievement in the California Public School System (with Brown and Baker) | CSE Technical Report 519, April 2000 | https://www.cse.ucla.edu/products/reports/TECH519.pdf | Office of Educational Research and Improvement, US Education Department | | | |
218 | Herman, Joan L. | “Thus, the extent to which gains reflect real improvement in learning is an open question (see, e.g., Shepard, 1990).” p. 15 | Dismissive | Student Assessment and Student Achievement in the California Public School System (with Brown and Baker) | CSE Technical Report 519, April 2000 | https://www.cse.ucla.edu/products/reports/TECH519.pdf | Office of Educational Research and Improvement, US Education Department | | | |
219 | R. L. Linn | "There are many reasons for the Lake Wobegon Effect, most of which are less sinister than those emphasized by Cannell." | Denigrating | Assessments and Accountability, p.7 | Educational Researcher, March 2000, pp.4–16. | https://journals.sagepub.com/doi/abs/10.3102/0013189x029002004 | Office of Educational Research and Improvement, US Education Department | No. Cannell was exactly right. There was corruption, lax security, and cheating. See, for example, https://nonpartisaneducation.org/Review/Articles/v6n3.htm | | |
220 | Lorrie A. Shepard | "This portrayal derives mostly from research leading to Wood and Bruner’s original conception of scaffolding, from Vygotskian theory, and from naturalistic studies of effective tutoring described next. Relatively few studies have been undertaken in which explicit feedback interventions have been tried in the context of constructivist instructional settings." | Dismissive | The Role of Classroom Assessment in Teaching and Learning, p.59 | CSE Technical Report 517, February 2000 | https://nepc.colorado.edu/sites/default/files/publications/TECH517.pdf | Office of Educational Research and Improvement, US Education Department | | | |
221 | Lorrie A. Shepard | "The NCTM and NRC visions are idealizations based on beliefs about constructivist pedagogy and reflective practice. Although both are supported by examples of individual teachers who use assessment to improve their teaching, little is known about what kinds of support would be required to help large numbers of teachers develop these strategies or to ensure that teacher education programs prepared teachers to use assessment in these ways. Research is needed to address these basic implementation questions." | Dismissive | The Role of Classroom Assessment in Teaching and Learning, p.64 | CSE Technical Report 517, February 2000 | https://nepc.colorado.edu/sites/default/files/publications/TECH517.pdf | Office of Educational Research and Improvement, US Education Department | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | | |
222 | Lorrie A. Shepard | "This social-constructivist view of classroom assessment is an idealization. The new ideas and perspectives underlying it have a basis in theory and empirical studies, but how they will work in practice and on a larger scale is not known." | Dismissive | The Role of Classroom Assessment in Teaching and Learning, p.67 | CSE Technical Report 517, February 2000 | https://nepc.colorado.edu/sites/default/files/publications/TECH517.pdf | Office of Educational Research and Improvement, US Education Department | | | |
223 | Marguerite Clarke | Madaus, Pedulla, and Shore | “The National Board believes that we must as a nation conduct research that helps testing contribute to student learning, classroom practice, and state and district management of school resources.” p. 2 | Dismissive | An Agenda for Research on Educational Testing | NBETPP Statements, Vol. 1, No. 1, Jan. 2000 | http://files.eric.ed.gov/fulltext/ED456137.pdf | Ford Foundation | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
224 | Marguerite Clarke | Madaus, Pedulla, and Shore | “Validity research on teacher testing needs to address the following four issues in particular. . .” : [four bullet-point paragraphs follow] p. 3 | Dismissive | An Agenda for Research on Educational Testing | NBETPP Statements, Vol. 1, No. 1, Jan. 2000 | http://files.eric.ed.gov/fulltext/ED456137.pdf | Ford Foundation | |||
225 | Marguerite Clarke | Madaus, Pedulla, and Shore | “[W]e need to understand better the relationship between testing and the diversity of the college student body.” p. 6 | Dismissive | An Agenda for Research on Educational Testing | NBETPP Statements, Vol. 1, No. 1, Jan. 2000 | http://files.eric.ed.gov/fulltext/ED456137.pdf | Ford Foundation | |||
226 | Marguerite Clarke | Haney, Madaus | “We trust that further research will build on this good example and help all of us move from suggestive correlational studies towards more definitive conclusions.” p. 9 | 1stness | High Stakes Testing and High School Completion | NBETPP Statements, Volume 1, Number 3, Jan. 2000 | http://files.eric.ed.gov/fulltext/ED456139.pdf | Ford Foundation | Just some of the relevant pre-2008 studies of the effects of minimum-competency or exit exams and the problems with a single passing score include those of Alvarez, Moreno, & Patrinos (2007); Grodsky & Kalogrides (2006); Audette (2005); Orlich (2003); StandardsWork (2003); Meisels, et al. (2003); Braun (2003); Rosenshine (2003); Tighe, Wang, & Foley (2002); Carnoy & Loeb (2002); Baumert & Demmrich (2001); Rosenblatt & Offer (2001); Phelps (2001); Toenjes, Dworkin, Lorence, & Hill (2000); Wenglinsky (2000); Massachusetts Finance Office (2000); DeMars (2000); Bishop (1999, 2000, 2001, & 2004); Grissmer & Flanagan (1998); Strauss, Bowes, Marks, & Plesko (1998); Frederiksen (1994); Ritchie & Thorkildsen (1994); Chao-Qun & Hui (1993); Potter & Wall (1992); Jacobson (1992); Rodgers, et al. (1991); Morris (1991); Winfield (1990); Ligon, Johnstone, Brightman, Davis, et al. (1990); Winfield (1987); Koffler (1987); Losack (1987); Marshall (1987); Hembree (1987); Mangino, Battaille, Washington, & Rumbaut (1986); Michigan Department of Education (1984); Ketchie (1984); Serow (1982); Indiana Education Department (1982); Brunton (1982); Paramore, et al. (1980); Ogden (1979); Down(2) (1979); Wellisch (1978); and Findley (1978). | |
227 | Jay P. Heubert | Robert M. Hauser | "A growing body of research suggests that tests often do in fact change school and classroom practices (Corbett & Wilson, 1991; Madaus, 1988; Herman & Golan 1993; Smith & Rottenberg, 1991)." p.29 | Dismissive | High Stakes: Testing for Tracking, Promotion, and Graduation | Board on Testing and Assessment, National Research Council, 1999 | https://www.nap.edu/catalog/6336/high-stakes-testing-for-tracking-promotion-and-graduation | Ford Foundation | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
228 | Jay P. Heubert | Robert M. Hauser | "A growing body of research suggests that tests often do in fact change school and classroom practices (Corbett & Wilson, 1991; Madaus, 1988; Herman & Golan 1993; Smith & Rottenberg, 1991)." p.29 | Denigrating | High Stakes: Testing for Tracking, Promotion, and Graduation | Board on Testing and Assessment, National Research Council, 1999 | https://www.nap.edu/catalog/6336/high-stakes-testing-for-tracking-promotion-and-graduation | Ford Foundation | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
229 | Jay P. Heubert | Robert M. Hauser | "Most standards-based assessments have only recently been implemented or are still being developed. Consequently, it is too early to determine whether they will produce the intended effects on classroom instruction." p.36 | Dismissive | High Stakes: Testing for Tracking, Promotion, and Graduation | Board on Testing and Assessment, National Research Council, 1999 | https://www.nap.edu/catalog/6336/high-stakes-testing-for-tracking-promotion-and-graduation | Ford Foundation | Relevant pre-2000 studies of the effects of standards, alignment, goal setting, setting reachable goals, etc. include those of Mitchell (1999); Morgan & Ramist (1998); the *Southern Regional Education Board (1998); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); the Florida Office of Program Policy Analysis (1997); Pomplun (1997); Schmoker (1996); Aguilera & Hendricks (1996); Banta, Lund, Black & Oblander (1996); Bottoms & Mikos (1995); *Bamburg & Medina (1993); Bishop (1993); the U. S. General Accounting Office (1993); Eckstein & Noah (1993); Mattsson (1993); Brown (1992); Heyneman & Ransom (1992); Whetton (1992); Anderson, Muir, Bateson, Blackmore & Rogers (1990); Csikszentmihalyi (1990); *Levine & Lezotte (1990); LaRoque & Coleman (1989); Hillocks (1987); Willingham & Morris (1986); Resnick & Resnick (1985); Ogle & Fritts (1984); *Natriello & Dornbusch (1984); Brooke & Oxenham (1984); Rentz (1979); Wellisch, MacQueen, Carriere & Dick (1978); *Rosswork (1977); Estes, Colvin & Goodwin (1976); Wood (1953); and Panlasigui & Knight (1930). | ||
230 | Jay P. Heubert | Robert M. Hauser | "A recent review of the available research evidence by Mehrens (1998) reaches several interim conclusions. Drawing on eight studies...." p.36 | Dismissive | High Stakes: Testing for Tracking, Promotion, and Graduation | Board on Testing and Assessment, National Research Council, 1999 | https://www.nap.edu/catalog/6336/high-stakes-testing-for-tracking-promotion-and-graduation | Ford Foundation | Just some of the relevant pre-2008 studies of the effects of minimum-competency or exit exams and the problems with a single passing score include those of Alvarez, Moreno, & Patrinos (2007); Grodsky & Kalogrides (2006); Audette (2005); Orlich (2003); StandardsWork (2003); Meisels, et al. (2003); Braun (2003); Rosenshine (2003); Tighe, Wang, & Foley (2002); Carnoy & Loeb (2002); Baumert & Demmrich (2001); Rosenblatt & Offer (2001); Phelps (2001); Toenjes, Dworkin, Lorence, & Hill (2000); Wenglinsky (2000); Massachusetts Finance Office (2000); DeMars (2000); Bishop (1999, 2000, 2001, & 2004); Grissmer & Flanagan (1998); Strauss, Bowes, Marks, & Plesko (1998); Frederiksen (1994); Ritchie & Thorkildsen (1994); Chao-Qun & Hui (1993); Potter & Wall (1992); Jacobson (1992); Rodgers, et al. (1991); Morris (1991); Winfield (1990); Ligon, Johnstone, Brightman, Davis, et al. (1990); Winfield (1987); Koffler (1987); Losack (1987); Marshall (1987); Hembree (1987); Mangino, Battaille, Washington, & Rumbaut (1986); Michigan Department of Education (1984); Ketchie (1984); Serow (1982); Indiana Education Department (1982); Brunton (1982); Paramore, et al. (1980); Ogden (1979); Down(2) (1979); Wellisch (1978); and Findley (1978). | |
231 | Jay P. Heubert | Robert M. Hauser | "Although there are no national data summarizing how local districts use standardized tests in certifying students, we do know that several of the largest school systems have begun to use test scores in determining grade-to-grade promotion (Chicago) or are considering doing so (New York City, Boston)." p.37 | Dismissive | High Stakes: Testing for Tracking, Promotion, and Graduation | Board on Testing and Assessment, National Research Council, 1999 | https://www.nap.edu/catalog/6336/high-stakes-testing-for-tracking-promotion-and-graduation | Ford Foundation | Just some of the relevant pre-2008 studies of the effects of minimum-competency or exit exams and the problems with a single passing score include those of Alvarez, Moreno, & Patrinos (2007); Grodsky & Kalogrides (2006); Audette (2005); Orlich (2003); StandardsWork (2003); Meisels, et al. (2003); Braun (2003); Rosenshine (2003); Tighe, Wang, & Foley (2002); Carnoy & Loeb (2002); Baumert & Demmrich (2001); Rosenblatt & Offer (2001); Phelps (2001); Toenjes, Dworkin, Lorence, & Hill (2000); Wenglinsky (2000); Massachusetts Finance Office (2000); DeMars (2000); Bishop (1999, 2000, 2001, & 2004); Grissmer & Flanagan (1998); Strauss, Bowes, Marks, & Plesko (1998); Frederiksen (1994); Ritchie & Thorkildsen (1994); Chao-Qun & Hui (1993); Potter & Wall (1992); Jacobson (1992); Rodgers, et al. (1991); Morris (1991); Winfield (1990); Ligon, Johnstone, Brightman, Davis, et al. (1990); Winfield (1987); Koffler (1987); Losack (1987); Marshall (1987); Hembree (1987); Mangino, Battaille, Washington, & Rumbaut (1986); Michigan Department of Education (1984); Ketchie (1984); Serow (1982); Indiana Education Department (1982); Brunton (1982); Paramore, et al. (1980); Ogden (1979); Down(2) (1979); Wellisch (1978); and Findley (1978). | |
232 | Jay P. Heubert | Robert M. Hauser | "There is very little research that specifically addresses the consequences of graduation testing." p.172 | Dismissive | High Stakes: Testing for Tracking, Promotion, and Graduation | Board on Testing and Assessment, National Research Council, 1999 | https://www.nap.edu/catalog/6336/high-stakes-testing-for-tracking-promotion-and-graduation | Ford Foundation | Just some of the relevant pre-2008 studies of the effects of minimum-competency or exit exams and the problems with a single passing score include those of Alvarez, Moreno, & Patrinos (2007); Grodsky & Kalogrides (2006); Audette (2005); Orlich (2003); StandardsWork (2003); Meisels, et al. (2003); Braun (2003); Rosenshine (2003); Tighe, Wang, & Foley (2002); Carnoy & Loeb (2002); Baumert & Demmrich (2001); Rosenblatt & Offer (2001); Phelps (2001); Toenjes, Dworkin, Lorence, & Hill (2000); Wenglinsky (2000); Massachusetts Finance Office (2000); DeMars (2000); Bishop (1999, 2000, 2001, & 2004); Grissmer & Flanagan (1998); Strauss, Bowes, Marks, & Plesko (1998); Frederiksen (1994); Ritchie & Thorkildsen (1994); Chao-Qun & Hui (1993); Potter & Wall (1992); Jacobson (1992); Rodgers, et al. (1991); Morris (1991); Winfield (1990); Ligon, Johnstone, Brightman, Davis, et al. (1990); Winfield (1987); Koffler (1987); Losack (1987); Marshall (1987); Hembree (1987); Mangino, Battaille, Washington, & Rumbaut (1986); Michigan Department of Education (1984); Ketchie (1984); Serow (1982); Indiana Education Department (1982); Brunton (1982); Paramore, et al. (1980); Ogden (1979); Down(2) (1979); Wellisch (1978); and Findley (1978). | |
233 | Jay P. Heubert | Robert M. Hauser | "Catterall adds, 'initial boasts and doubts alike regarding the effects of gatekeeping competency testing have met with a paucity of follow-up research.'" p.172 | Dismissive | High Stakes: Testing for Tracking, Promotion, and Graduation | Board on Testing and Assessment, National Research Council, 1999 | https://www.nap.edu/catalog/6336/high-stakes-testing-for-tracking-promotion-and-graduation | Ford Foundation | Just some of the relevant pre-2008 studies of the effects of minimum-competency or exit exams and the problems with a single passing score include those of Alvarez, Moreno, & Patrinos (2007); Grodsky & Kalogrides (2006); Audette (2005); Orlich (2003); StandardsWork (2003); Meisels, et al. (2003); Braun (2003); Rosenshine (2003); Tighe, Wang, & Foley (2002); Carnoy & Loeb (2002); Baumert & Demmrich (2001); Rosenblatt & Offer (2001); Phelps (2001); Toenjes, Dworkin, Lorence, & Hill (2000); Wenglinsky (2000); Massachusetts Finance Office (2000); DeMars (2000); Bishop (1999, 2000, 2001, & 2004); Grissmer & Flanagan (1998); Strauss, Bowes, Marks, & Plesko (1998); Frederiksen (1994); Ritchie & Thorkildsen (1994); Chao-Qun & Hui (1993); Potter & Wall (1992); Jacobson (1992); Rodgers, et al. (1991); Morris (1991); Winfield (1990); Ligon, Johnstone, Brightman, Davis, et al. (1990); Winfield (1987); Koffler (1987); Losack (1987); Marshall (1987); Hembree (1987); Mangino, Battaille, Washington, & Rumbaut (1986); Michigan Department of Education (1984); Ketchie (1984); Serow (1982); Indiana Education Department (1982); Brunton (1982); Paramore, et al. (1980); Ogden (1979); Down(2) (1979); Wellisch (1978); and Findley (1978). | |
234 | Jay P. Heubert | Robert M. Hauser | "In one of the few such studies on this topic, Bishop (1997) compared the Third International Mathematics and Science Study (TIMSS) test scores of countries with and without rigorous graduation tests. He found that countries with demanding exit exams outperformed other countries at a comparable level of development. He concluded, however, that such exams were probably not the most important determinant of achievement levels and that more research was needed." p.173 | Dismissive | High Stakes: Testing for Tracking, Promotion, and Graduation | Board on Testing and Assessment, National Research Council, 1999 | https://www.nap.edu/catalog/6336/high-stakes-testing-for-tracking-promotion-and-graduation | Ford Foundation | Relevant pre-2000 studies of the effects of minimum-competency testing and the problems with a single passing score include those of Frederiksen (1994); Winfield (1990); Ligon, Johnstone, Brightman, Davis, et al. (1990); Losack (1987); Mangino & Babcock (1986); Serow (1982); Brunton (1982); Paramore, et al. (1980); Ogden (1979); and Findley (1978). | |
235 | Jay P. Heubert | Robert M. Hauser | "Very little is known about the specific consequences of passing or failing a high school graduation exam." p.176 | Dismissive | High Stakes: Testing for Tracking, Promotion, and Graduation | Board on Testing and Assessment, National Research Council, 1999 | https://www.nap.edu/catalog/6336/high-stakes-testing-for-tracking-promotion-and-graduation | Ford Foundation | Relevant pre-2000 studies of the effects of minimum-competency testing and the problems with a single passing score include those of Frederiksen (1994); Winfield (1990); Ligon, Johnstone, Brightman, Davis, et al. (1990); Losack (1987); Mangino & Babcock (1986); Serow (1982); Brunton (1982); Paramore, et al. (1980); Ogden (1979); and Findley (1978). | ||
236 | Jay P. Heubert | Robert M. Hauser | "American experience is limited and research is needed to explore their effectiveness. For instance, we do not know how to combine advance notice of high-stakes test requirements, remedial intervention, and opportunity to retake graduation tests." p.180 | Dismissive | High Stakes: Testing for Tracking, Promotion, and Graduation | Board on Testing and Assessment, National Research Council, 1999 | https://www.nap.edu/catalog/6336/high-stakes-testing-for-tracking-promotion-and-graduation | Ford Foundation | Relevant pre-2000 studies of the effects of minimum-competency testing and the problems with a single passing score include those of Frederiksen (1994); Winfield (1990); Ligon, Johnstone, Brightman, Davis, et al. (1990); Losack (1987); Mangino & Babcock (1986); Serow (1982); Brunton (1982); Paramore, et al. (1980); Ogden (1979); and Findley (1978). | ||
237 | Jay P. Heubert | Robert M. Hauser | "Research is also needed to explore the effects of different kinds of high school credentials on employment and other post-school outcomes." p.180 | Dismissive | High Stakes: Testing for Tracking, Promotion, and Graduation | Board on Testing and Assessment, National Research Council, 1999 | https://www.nap.edu/catalog/6336/high-stakes-testing-for-tracking-promotion-and-graduation | Ford Foundation | |||
238 | Jay P. Heubert | Robert M. Hauser | "At the same time, solid evaluation research on the most effective remedial approaches is sparse." p.183 | Denigrating | High Stakes: Testing for Tracking, Promotion, and Graduation | Board on Testing and Assessment, National Research Council, 1999 | https://www.nap.edu/catalog/6336/high-stakes-testing-for-tracking-promotion-and-graduation | Ford Foundation | Developmental (i.e., remedial) education researchers have conducted many studies to determine what works best to keep students from failing in their “courses of last resort,” after which there are no alternatives. Researchers have included Boylan, Roueche, McCabe, Wheeler, Kulik, Bonham, Claxton, Bliss, Schonecker, Chen, Chang, and Kirk. | ||
239 | Jay P. Heubert | Robert M. Hauser | "There is plainly a need for good research on effective remedial education." p.183 | Denigrating | High Stakes: Testing for Tracking, Promotion, and Graduation | Board on Testing and Assessment, National Research Council, 1999 | https://www.nap.edu/catalog/6336/high-stakes-testing-for-tracking-promotion-and-graduation | Ford Foundation | Developmental (i.e., remedial) education researchers have conducted many studies to determine what works best to keep students from failing in their “courses of last resort,” after which there are no alternatives. Researchers have included Boylan, Roueche, McCabe, Wheeler, Kulik, Bonham, Claxton, Bliss, Schonecker, Chen, Chang, and Kirk. | |
240 | Jay P. Heubert | Robert M. Hauser | "However, in most of the nation, much needs to be done before a world-class curriculum and world-class instruction will be in place." p.277 | Dismissive | High Stakes: Testing for Tracking, Promotion, and Graduation | Board on Testing and Assessment, National Research Council, 1999 | https://www.nap.edu/catalog/6336/high-stakes-testing-for-tracking-promotion-and-graduation | Ford Foundation | |||
241 | Jay P. Heubert | Robert M. Hauser | "The committee sees a strong need for better evidence on the benefits and costs of high-stakes testing." p.281 | Denigrating | High Stakes: Testing for Tracking, Promotion, and Graduation | Board on Testing and Assessment, National Research Council, 1999 | https://www.nap.edu/catalog/6336/high-stakes-testing-for-tracking-promotion-and-graduation | Ford Foundation | No. See, for example, Phelps, R.P. (2000, Winter). Estimating the cost of systemwide student testing in the United States. Journal of Education Finance, 25(3), 343–380; Danitz, T. (2001, February 27). Special report: States pay $400 million for tests in 2001. Stateline.org. Pew Center for the States; Hoxby, C.M. (2002). The cost of accountability, in W.M. Evers & H.J. Walberg (Eds.), School Accountability, Stanford, CA: Hoover Institution Press; U.S. GAO. (1993, January). Student testing: Current extent and expenditures, with cost estimates for a national examination. GAO/PEMD-93-8. Washington, DC: US General Accounting Office; Phelps, R.P. (1998). Benefit-cost analysis of systemwide student testing, Paper presented at the annual meeting of the American Education Finance Association, Mobile, AL. | |
242 | Jay P. Heubert | Robert M. Hauser | "Very little is known about the specific consequences of passing or failing a high school graduation exam." p.288 | Dismissive | High Stakes: Testing for Tracking, Promotion, and Graduation | Board on Testing and Assessment, National Research Council, 1999 | https://www.nap.edu/catalog/6336/high-stakes-testing-for-tracking-promotion-and-graduation | Ford Foundation | The many studies of district and state minimum competency or diploma testing programs popular from the 1960s through the 1980s found positive effects for students just below the cut score and mixed effects for students far below and anywhere above. Researchers have included Fincher, Jackson, Battiste, Corcoran, Jacobsen, Tanner, Boylan, Saxon, Anderson, Muir, Bateson, Blackmore, Rogers, Zigarelli, Schafer, Hultgren, Hawley, Abrams, Seubert, Mazzoni, Brookhart, Mendro, Herrick, Webster, Orsack, Weerasinghe, and Bembry | ||
243 | Jay P. Heubert | Robert M. Hauser | "At present, however, advanced skills are often not well defined and ways of assessing them are not well established." p.289 | Denigrating | High Stakes: Testing for Tracking, Promotion, and Graduation | Board on Testing and Assessment, National Research Council, 1999 | https://www.nap.edu/catalog/6336/high-stakes-testing-for-tracking-promotion-and-graduation | Ford Foundation | |||
244 | Jay P. Heubert | Robert M. Hauser | "...in many cases, the demands that full participation of these students [i.e., students with disabilities] place on assessment systems are greater than current assessment knowledge and technology can support." p.191 | Dismissive | High Stakes: Testing for Tracking, Promotion, and Graduation | Board on Testing and Assessment, National Research Council, 1999 | https://www.nap.edu/catalog/6336/high-stakes-testing-for-tracking-promotion-and-graduation | Ford Foundation | |||
245 | Jay P. Heubert | Robert M. Hauser | "...available evidence about the possible effects of graduation tests on learning and on high school dropout is inconclusive (e.g., Kreitzer et al., 1989; Reardon, 1996; Catterall, 1990; Cawthorne, 1990; Bishop, 1997)." | Dismissive | High Stakes: Testing for Tracking, Promotion, and Graduation | Board on Testing and Assessment, National Research Council, 1999 | https://www.nap.edu/catalog/6336/high-stakes-testing-for-tracking-promotion-and-graduation | Ford Foundation | The many studies of district and state minimum competency or diploma testing programs popular from the 1960s through the 1980s found positive effects for students just below the cut score and mixed effects for students far below and anywhere above. Researchers have included Fincher, Jackson, Battiste, Corcoran, Jacobsen, Tanner, Boylan, Saxon, Anderson, Muir, Bateson, Blackmore, Rogers, Zigarelli, Schafer, Hultgren, Hawley, Abrams, Seubert, Mazzoni, Brookhart, Mendro, Herrick, Webster, Orsack, Weerasinghe, and Bembry | |
246 | Jay P. Heubert | Robert M. Hauser | "We do not know how to combine advance notice of high-stakes test requirements, remedial intervention, and opportunity to retake graduation tests. Research is also needed to explore the effects of different kinds of high school credentials on employment and other post-school outcomes." p.289 | Dismissive | High Stakes: Testing for Tracking, Promotion, and Graduation | Board on Testing and Assessment, National Research Council, 1999 | https://www.nap.edu/catalog/6336/high-stakes-testing-for-tracking-promotion-and-graduation | Ford Foundation | The many studies of district and state minimum competency or diploma testing programs popular from the 1960s through the 1980s found positive effects for students just below the cut score and mixed effects for students far below and anywhere above. Researchers have included Fincher, Jackson, Battiste, Corcoran, Jacobsen, Tanner, Boylan, Saxon, Anderson, Muir, Bateson, Blackmore, Rogers, Zigarelli, Schafer, Hultgren, Hawley, Abrams, Seubert, Mazzoni, Brookhart, Mendro, Herrick, Webster, Orsack, Weerasinghe, and Bembry | ||
247 | Richard F. Elmore, Robert Rothman, Eds. | Eva L. Baker, Lauren B. Resnick, Robert L. Linn, Lorraine McDonnell, Lauress L. Wise, Michael Feuer, et al. | "But the practical nature of our charge and the limits of the evidence available to us have meant that we have also had to draw on the practical experience of committee members and outside experts in crafting our advice. Hence, this report relies heavily on expert advice from the field, in addition to scientific research." p. vii | Dismissive | Testing, Teaching, and Learning: A Guide for States and School Districts, 1999 | Committee on Title I Testing and Assessment, Board on Testing and Assessment, National Research Council | Pew Charitable Trusts, Spencer Foundation, William T. Grant Foundation | | | |
248 | Richard F. Elmore, Robert Rothman, Eds. | Eva L. Baker, Lauren B. Resnick, Robert L. Linn, Lorraine McDonnel, Lauress L. Wise, Michael Feuer, et al. | "we reviewed available evidence from research on assessment, accountability, and standards-based reform. However, we recognized that in many areas the evidentiary base was slim." p.11 | Dismissive | Testing, Teaching, and Learning: A Guide forStates and School Districts, 1999 | Committee on Title I Testing and Assessment, Board on Testing and Assessment, National Research Council | Pew Charitable Trusts, Spencer Foundation, William T. Grant Foundation | ||||
249 | Richard F. Elmore, Robert Rothman, Eds. | Eva L. Baker, Lauren B. Resnick, Robert L. Linn, Lorraine McDonnel, Lauress L. Wise, Michael Feuer, et al. | "Standards-based reform is a new idea, and few places have put all the pieces in place, and even fewer have put them in place long enough to enable scholars to observe their effects." p.11 | 1stness | Testing, Teaching, and Learning: A Guide forStates and School Districts, 1999 | Committee on Title I Testing and Assessment, Board on Testing and Assessment, National Research Council | Pew Charitable Trusts, Spencer Foundation, William T. Grant Foundation | ||||
250 | Richard F. Elmore, Robert Rothman, Eds. | Eva L. Baker, Lauren B. Resnick, Robert L. Linn, Lorraine McDonnel, Lauress L. Wise, Michael Feuer, et al. | "Yet despite the prominence of standards-based reform in the policy debate, there are few examples of districts or states that have put the entire standards-based puzzle together, much less achieved success through it. Some evidence is beginning to gather." p.16 | Dismissive | Testing, Teaching, and Learning: A Guide forStates and School Districts, 1999 | Committee on Title I Testing and Assessment, Board on Testing and Assessment, National Research Council | Pew Charitable Trusts, Spencer Foundation, William T. Grant Foundation | ||||
251 | Richard F. Elmore, Robert Rothman, Eds. | Eva L. Baker, Lauren B. Resnick, Robert L. Linn, Lorraine McDonnel, Lauress L. Wise, Michael Feuer, et al. | "In large part, the limited body of evidence in this country reflects the complexity of the concept." p.16 | Dismissive | Testing, Teaching, and Learning: A Guide forStates and School Districts, 1999 | Committee on Title I Testing and Assessment, Board on Testing and Assessment, National Research Council | Pew Charitable Trusts, Spencer Foundation, William T. Grant Foundation | ||||
252 | Richard F. Elmore, Robert Rothman, Eds. | Eva L. Baker, Lauren B. Resnick, Robert L. Linn, Lorraine McDonnel, Lauress L. Wise, Michael Feuer, et al. | "Despite the common use of such accommodations, however, there is little research on their effects on the validity of test score information, and most of the research has examined college admission tests and other postsecondary measures, not achievement tests in elementary and secondary schools (National Research Council, 1997a)." p.57 | Dismissive | Testing, Teaching, and Learning: A Guide forStates and School Districts, 1999 | Committee on Title I Testing and Assessment, Board on Testing and Assessment, National Research Council | Pew Charitable Trusts, Spencer Foundation, William T. Grant Foundation | ||||
253 | Richard F. Elmore, Robert Rothman, Eds. | Eva L. Baker, Lauren B. Resnick, Robert L. Linn, Lorraine McDonnel, Lauress L. Wise, Michael Feuer, et al. | "Because of the paucity of research, questions remain about whether test results from assessments using accommodations represent valid and reliable indicators of what students with disabilities know and are able to do (Koretz, 1997)." p.57 | Dismissive | Testing, Teaching, and Learning: A Guide forStates and School Districts, 1999 | Committee on Title I Testing and Assessment, Board on Testing and Assessment, National Research Council | Pew Charitable Trusts, Spencer Foundation, William T. Grant Foundation | ||||
254 | Richard F. Elmore, Robert Rothman, Eds. | Eva L. Baker, Lauren B. Resnick, Robert L. Linn, Lorraine McDonnel, Lauress L. Wise, Michael Feuer, et al. | "As with accommodations for students with disabilities, the research on the effects of test accommodations for English-language learners is inconclusive." p.62 | Dismissive | Testing, Teaching, and Learning: A Guide forStates and School Districts, 1999 | Committee on Title I Testing and Assessment, Board on Testing and Assessment, National Research Council | Pew Charitable Trusts, Spencer Foundation, William T. Grant Foundation | ||||
255 | Richard F. Elmore, Robert Rothman, Eds. | Eva L. Baker, Lauren B. Resnick, Robert L. Linn, Lorraine McDonnel, Lauress L. Wise, Michael Feuer, et al. | "The small body of research that has examined classrooms in depth suggests that such instructional practices may be rare, even among teachers who say they endorse the changes the standards are intended to foster." p.75 | Dismissive | Testing, Teaching, and Learning: A Guide forStates and School Districts, 1999 | Committee on Title I Testing and Assessment, Board on Testing and Assessment, National Research Council | Pew Charitable Trusts, Spencer Foundation, William T. Grant Foundation | ||||
256 | Richard F. Elmore, Robert Rothman, Eds. | Eva L. Baker, Lauren B. Resnick, Robert L. Linn, Lorraine McDonnel, Lauress L. Wise, Michael Feuer, et al. | "Districts' capacity to monitor the conditions of instruction in schools is limited, and there are few examples of districts that have been shown to be effective in analyzing such conditions and using the data to improve instruction. The research base on such efforts is slim, in large part because there are so few examples to study." p.76 | Dismissive | Testing, Teaching, and Learning: A Guide forStates and School Districts, 1999 | Committee on Title I Testing and Assessment, Board on Testing and Assessment, National Research Council | Pew Charitable Trusts, Spencer Foundation, William T. Grant Foundation | ||||
257 | Robert L. Linn | "Two obvious, but frequently ignored, cautions [from the TIERS experience] are these: . . . " p. 6 | Denigrating | Assessments and Accountability | CSE Technical Report 490 (November 1998) | http://www.cse.ucla.edu/products/Reports/TECH490.pdf | Office of Research and Improvement, US Education Department | ||||
258 | Robert L. Linn | "Moreover, it is critical to recognize first that the choice of constructs matters, and so does the way in which measures are developed and linked to the constructs. Although these two points may be considered obvious, they are too often ignored." p. 13 | Denigrating | Assessments and Accountability | CSE Technical Report 490 (November 1998) | http://www.cse.ucla.edu/products/Reports/TECH490.pdf | Office of Research and Improvement, US Education Department | ||||
259 | Robert L. Linn | “Although that claim is subject to debate, it seldom even gets considered when aggregate results are used either to monitor progress (e.g., NAEP) or for purposes of school, district, or state accountability.” p. 16 | Dismissive | Assessments and Accountability | CSE Technical Report 490 (November 1998) | http://www.cse.ucla.edu/products/Reports/TECH490.pdf | Office of Research and Improvement, US Education Department | ||||
260 | Lawrence O. Picus | Alisha Tralli | "What is surprising is, given the tremendous emphasis placed on assessment systems to measure school accountability, the relatively minuscule portion of educational expenditures devoted to this important and highly visible component of the educational system." p.66 | Dismissive | Alternative assessment programs: What are the true costs? | CSE Technical Report 441, February 1998 | https://cresst.org/publications/cresst-publication-2813/?_sf_s=441 | Office of Research and Improvement, US Education Department | The taxpayers ponied up big time to fund the GAO study, which Picus has spent his whole career misrepresenting, demeaning, or dismissing. By 1998, it is simply not believable that his continuing efforts stem from honest misunderstanding. He is deliberately misrepresenting previous research on the topic in order to advance his own work and career. | ||
261 | Lawrence O. Picus | Alisha Tralli | "In all of these analyses, except the GAO report, the cost estimates are based on the direct costs of the assessment program. The GAO is the only other organization we are aware of that has attempted to estimate the opportunity costs of personnel time, in attempting to determine the full costs of assessment programs. The GAO study, however, did not focus specifically on state assessment programs that included portfolios, an important factor in the higher cost estimates identified in the present study." p.64 | Denigrating | Alternative assessment programs: What are the true costs? | CSE Technical Report 441, February 1998 | https://cresst.org/publications/cresst-publication-2813/?_sf_s=441 | Office of Research and Improvement, US Education Department | The previous 63 pages of the Picus and Tralli report claimed: theirs was the first study to look at opportunity costs and all previous studies were "just expenditure studies" that ignored "true" opportunity costs. Then, here, on page 64, they finally admit something a bit truthful about the earlier and vastly better GAO report, but also immediately attempt to demean it, because it did not estimate the costs of Vermont's doomed portfolio program, which did not exist when the GAO did its study. | ||
262 | Lawrence O. Picus | Alisha Tralli | "Costs and expenditures are not synonymous terms. Monk (1995) distinguishes between these two terms. Costs are “measures of what must be foregone to realize some benefit,” while expenditures are “measures of resource flows regardless of their consequence” (p. 365). Expenditures are generally easier to track since accounting systems typically report resource flows by object, e.g., instruction, administration, transportation. Typically, most cost analyses in education focus on these measurable expenditures and ignore the more difficult measures of opportunity. The goal of this report is to move one step beyond past work and estimate these economic costs as well." p.5 | Denigrating | Alternative assessment programs: What are the true costs? | CSE Technical Report 441, February 1998 | https://cresst.org/publications/cresst-publication-2813/?_sf_s=441 | Office of Research and Improvement, US Education Department | No. Picus & Tralli neither did the first study of opportunity costs, nor the first study of opportunity costs in those two states. The 1993 GAO study did both. See, for example, Phelps, R.P. (2000, Winter). Estimating the cost of systemwide student testing in the United States. Journal of Education Finance, 25(3) 343–380; Danitz, T. (2001, February 27). Special report: States pay $400 million for tests in 2001. Stateline.org. Pew Center for the States; Hoxby, C.M. (2002). The cost of accountability, in W. M Evers & H.J. Walberg (Eds.), School Accountability, Stanford, CA: Hoover Institution Press; U.S. GAO. (1993, January). Student testing: Current extent and expenditures, with cost estimates for a national examination. GAO/PEMD-93-8. Washington, DC: US General Accounting Office; Phelps, R.P. (1998). Benefit-cost analysis of systemwide student testing, Paper presented at the annual meeting of the American Education Finance Association, Mobile, AL. | ||
263 | Lawrence O. Picus | Alisha Tralli | "Although several states have implemented new assessment programs, there has been little research on the costs of developing and implementing these new systems." p.4 | Dismissive | Alternative assessment programs: What are the true costs? | CSE Technical Report 441, February 1998 | https://cresst.org/publications/cresst-publication-2813/?_sf_s=441 | Office of Research and Improvement, US Education Department | No. Picus & Tralli neither did the first study of opportunity costs, nor the first study of opportunity costs in those two states. The 1993 GAO study did both. See, for example, Phelps, R.P. (2000, Winter). Estimating the cost of systemwide student testing in the United States. Journal of Education Finance, 25(3) 343–380; Danitz, T. (2001, February 27). Special report: States pay $400 million for tests in 2001. Stateline.org. Pew Center for the States; Hoxby, C.M. (2002). The cost of accountability, in W. M Evers & H.J. Walberg (Eds.), School Accountability, Stanford, CA: Hoover Institution Press; U.S. GAO. (1993, January). Student testing: Current extent and expenditures, with cost estimates for a national examination. GAO/PEMD-93-8. Washington, DC: US General Accounting Office; Phelps, R.P. (1998). Benefit-cost analysis of systemwide student testing, Paper presented at the annual meeting of the American Education Finance Association, Mobile, AL. | ||
264 | Lawrence O. Picus | Alisha Tralli | "The purpose of this report is to provide a first detailed analysis of the “economic” or opportunity costs of the testing systems in two states, Kentucky and Vermont." p.2 | 1stness | Alternative assessment programs: What are the true costs? | CSE Technical Report 441, February 1998 | https://cresst.org/publications/cresst-publication-2813/?_sf_s=441 | Office of Research and Improvement, US Education Department | No. Picus & Tralli neither did the first study of opportunity costs, nor the first study of opportunity costs in those two states. The 1993 GAO study did both. See, for example, Phelps, R.P. (2000, Winter). Estimating the cost of systemwide student testing in the United States. Journal of Education Finance, 25(3) 343–380; Danitz, T. (2001, February 27). Special report: States pay $400 million for tests in 2001. Stateline.org. Pew Center for the States; Hoxby, C.M. (2002). The cost of accountability, in W. M Evers & H.J. Walberg (Eds.), School Accountability, Stanford, CA: Hoover Institution Press; U.S. GAO. (1993, January). Student testing: Current extent and expenditures, with cost estimates for a national examination. GAO/PEMD-93-8. Washington, DC: US General Accounting Office; Phelps, R.P. (1998). Benefit-cost analysis of systemwide student testing, Paper presented at the annual meeting of the American Education Finance Association, Mobile, AL. | ||
265 | Anne Lewis | quoting Arnold Fege, National PTA | "The national testing proposal is based on 'quantum leap' theories, not on research, contended Arnold Fege of the National PTA. 'As I listened to the presentations this morning,’ he said, ‘I didn't hear about any research that backs up the introduction of national testing.’ In his opinion, ‘no parent in the country is losing sleep because his or her child is not meeting NAEP standards,’ and even though testing is pervasive in American education, it seems to not have made a big impact on change." | Dismissive | Assessing Student Achievement: Search for Validity and Balance | CSE Technical Report 481 (1997) | https://cresst.org/wp-content/uploads/TECH481.pdf | Office of Research and Improvement, US Education Department | In their 2009 Evaluation of NAEP for the US Education Department, Buckendahl, Davis, Plake, Sireci, Hambleton, Zenisky, & Wells (pp. 77–85) managed to find quite a lot of research on making comparisons between NAEP and state assessments: several of NAEP's own publications, Chromy (2005), Chromy, Ault, Black, & Mosquin (2007), McLaughlin (2000), Schulz & Mitzel (2005), Sireci, Robin, Meara, Rogers, & Swaminathan (2000), Stancavage, et al. (2002), Stoneberg (2007), WestEd (2002), and Wise, Le, Hoffman, & Becker (2004). | ||
266 | Eva L. Baker | Zenaida Aguirre-Munoz | "The extent and nature of the impact of language skills on performance assessments remains elusive due to the paucity of research in this area." | Dismissive | Improving the equity and validity of assessment-based information systems, p.3 | CSE Technical Report 462, December 1997 | https://cresst.org/wp-content/uploads/TECH462.pdf | Office of Research and Improvement, US Education Department | |||
267 | Joan L. Herman | "Although conceptual models for analyzing the cost of alternative assessment and for conducting cost-benefit analyses have been formulated (Catterall & Winters, 1994; Picus, 1994), definitive cost studies are yet to be completed (see, however, Picus & Tralli, forthcoming)." p. 30 | Dismissive, Denigrating | Large-Scale Assessment in Support of School Reform: Lessons in the Search for Alternative Measures | CSE Technical Report 446, Oct. 1997 | http://www.cse.ucla.edu/products/reports/TECH446.pdf | Office of Research and Improvement, US Education Department | No. See, for example, Phelps, R.P. (2000, Winter). Estimating the cost of systemwide student testing in the United States. Journal of Education Finance, 25(3) 343–380; Danitz, T. (2001, February 27). Special report: States pay $400 million for tests in 2001. Stateline.org. Pew Center for the States; Hoxby, C.M. (2002). The cost of accountability, in W. M Evers & H.J. Walberg (Eds.), School Accountability, Stanford, CA: Hoover Institution Press; U.S. GAO. (1993, January). Student testing: Current extent and expenditures, with cost estimates for a national examination. GAO/PEMD-93-8. Washington, DC: US General Accounting Office; Phelps, R.P. (1998). Benefit-cost analysis of systemwide student testing, Paper presented at the annual meeting of the American Education Finance Association, Mobile, AL. | |||
268 | Robert L. Linn | Eva L. Baker | "Very little research has been conducted to validate performance standards, particularly those that include specification of student response attributes." pp. 26-27 | Dismissive | Emerging Educational Standards of Performance in the United States | CSE Technical Report 437 (August 1997) | http://www.cse.ucla.edu/products/reports/TECH437.pdf | Office of Research and Improvement, US Education Department | Relevant pre-2000 studies of the effects of standards, alignment, goal setting, setting reachable goals, etc. include those of Mitchell (1999); Morgan & Ramist (1998); the *Southern Regional Education Board (1998); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); the Florida Office of Program Policy Analysis (1997); Pomplun (1997); Schmoker (1996); Aguilera & Hendricks (1996); Banta, Lund, Black & Oblander (1996); Bottoms & Mikos (1995); *Bamburg & Medina (1993); Bishop (1993); the U. S. General Accounting Office (1993); Eckstein & Noah (1993); Mattsson (1993); Brown (1992); Heyneman & Ransom (1992); Whetton (1992); Anderson, Muir, Bateson, Blackmore & Rogers (1990); Csikszentmihalyi (1990); *Levine & Lezotte (1990); LaRoque & Coleman (1989); Hillocks (1987); Willingham & Morris (1986); Resnick & Resnick (1985); Ogle & Fritts (1984); *Natriello & Dornbusch (1984); Brooke & Oxenham (1984); Rentz (1979); Wellisch, MacQueen, Carriere & Dick (1978); *Rosswork (1977); Estes, Colvin & Goodwin (1976); Wood (1953); and Panlasigui & Knight (1930). | ||
269 | Harold F. O'Neil, Jr. | Brenda Sugrue, Jamal Abedi, Eva L. Baker, Shari Golan | "However, as d'Ydewalle (1987) has pointed out, 'clear-cut results from neat experiments on the impact of motivation on learning [or performance] do not exist.'" | Dismissive | Final Report of Experimental Studies on Motivation and NAEP Test Performance, p.5 | CSE Technical Report 427, June 1997 | https://cresst.org/wp-content/uploads/TECH427.pdf | Office of Research and Improvement, US Education Department | Relevant studies of the effects of tests and/or accountability programs on motivation and instructional practice include those of the *Southern Regional Education Board (1998); Johnson (1998); Schafer, Hultgren, Hawley, Abrams, Seubert & Mazzoni (1997); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); Tuckman & Trimble (1997); Clarke & Stephens (1996); Zigarelli (1996); Stevenson, Lee, et al. (1995); Waters, Burger & Burger (1995); Egeland (1995); Prais (1995); Tuckman (1994); Ritchie & Thorkildsen (1994); Brown & Walberg (1993); Wall & Alderson (1993); Wolf & Rapiau (1993); Eckstein & Noah (1993); Chao-Qun & Hui (1993); Plazak & Mazur (1992); Steedman (1992); Singh, Marimutha & Mukjerjee (1990); *Levine & Lezotte (1990); O’Sullivan (1989); Somerset (1988); Pennycuick & Murphy (1988); Stevens (1984); Marsh (1984); Brunton (1982); Solberg (1977); Foss (1977); *Kirkland (1971); Somerset (1968); Stuit (1947); and Keys (1934). *Covers many studies; study is a research review, research synthesis, or meta-analysis. | "Others have considered the role of tests in incentive programs. These researchers have included Homme, Csanyi, Gonzales, Rechs, O’Leary, Drabman, Kaszdin, Bootzin, Staats, Cameron, Pierce, McMillan, Corcoran, and Wilson. International organizations, such as the World Bank or the Asian Development Bank, have studied the effects of testing on education programs they sponsor. Researchers have included Somerset, Heynemann, Ransom, Psacharopoulis, Velez, Brooke, Oxenham, Bude, Chapman, Snyder, and Pronaratna. Moreover, the mastery learning/mastery testing experiments conducted from the 1960s through today varied incentives, frequency of tests, types of tests, and many other factors to determine the optimal structure of testing programs. Researchers included such notables as Bloom, Carroll, Keller, Block, Burns, Wentling, Anderson, Hymel, Kulik, Tierney, Cross, Okey, Guskey, Gates, and Jones." | |
270 | Harold F. O'Neil, Jr. | Brenda Sugrue, Jamal Abedi, Eva L. Baker, Shari Golan | "In the educational context, most existing studies have focused on the influence of characteristics of the classroom learning environment, such as rewards, teacher feedback, goal structures, evaluation practices, on either the antecedents or consequences of motivation." | Dismissive | Final Report of Experimental Studies on Motivation and NAEP Test Performance, p.5 | CSE Technical Report 427, June 1997 | https://cresst.org/wp-content/uploads/TECH427.pdf | Office of Research and Improvement, US Education Department | Relevant studies of the effects of tests and/or accountability programs on motivation and instructional practice include those of the *Southern Regional Education Board (1998); Johnson (1998); Schafer, Hultgren, Hawley, Abrams, Seubert & Mazzoni (1997); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); Tuckman & Trimble (1997); Clarke & Stephens (1996); Zigarelli (1996); Stevenson, Lee, et al. (1995); Waters, Burger & Burger (1995); Egeland (1995); Prais (1995); Tuckman (1994); Ritchie & Thorkildsen (1994); Brown & Walberg (1993); Wall & Alderson (1993); Wolf & Rapiau (1993); Eckstein & Noah (1993); Chao-Qun & Hui (1993); Plazak & Mazur (1992); Steedman (1992); Singh, Marimutha & Mukjerjee (1990); *Levine & Lezotte (1990); O’Sullivan (1989); Somerset (1988); Pennycuick & Murphy (1988); Stevens (1984); Marsh (1984); Brunton (1982); Solberg (1977); Foss (1977); *Kirkland (1971); Somerset (1968); Stuit (1947); and Keys (1934). *Covers many studies; study is a research review, research synthesis, or meta-analysis. | "Others have considered the role of tests in incentive programs. These researchers have included Homme, Csanyi, Gonzales, Rechs, O’Leary, Drabman, Kaszdin, Bootzin, Staats, Cameron, Pierce, McMillan, Corcoran, and Wilson. International organizations, such as the World Bank or the Asian Development Bank, have studied the effects of testing on education programs they sponsor. Researchers have included Somerset, Heynemann, Ransom, Psacharopoulis, Velez, Brooke, Oxenham, Bude, Chapman, Snyder, and Pronaratna. Moreover, the mastery learning/mastery testing experiments conducted from the 1960s through today varied incentives, frequency of tests, types of tests, and many other factors to determine the optimal structure of testing programs. Researchers included such notables as Bloom, Carroll, Keller, Block, Burns, Wentling, Anderson, Hymel, Kulik, Tierney, Cross, Okey, Guskey, Gates, and Jones." | |
271 | Harold F. O'Neil, Jr. | Brenda Sugrue, Jamal Abedi, Eva L. Baker, Shari Golan | "Most of the studies that have compared goal orientations have examined their effects on performance during classroom learning activities rather than at the time of test taking." | Dismissive | Final Report of Experimental Studies on Motivation and NAEP Test Performance, p.7 | CSE Technical Report 427, June 1997 | https://cresst.org/wp-content/uploads/TECH427.pdf | Office of Research and Improvement, US Education Department | Relevant studies of the effects of tests and/or accountability programs on motivation and instructional practice include those of the *Southern Regional Education Board (1998); Johnson (1998); Schafer, Hultgren, Hawley, Abrams, Seubert & Mazzoni (1997); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); Tuckman & Trimble (1997); Clarke & Stephens (1996); Zigarelli (1996); Stevenson, Lee, et al. (1995); Waters, Burger & Burger (1995); Egeland (1995); Prais (1995); Tuckman (1994); Ritchie & Thorkildsen (1994); Brown & Walberg (1993); Wall & Alderson (1993); Wolf & Rapiau (1993); Eckstein & Noah (1993); Chao-Qun & Hui (1993); Plazak & Mazur (1992); Steedman (1992); Singh, Marimutha & Mukjerjee (1990); *Levine & Lezotte (1990); O’Sullivan (1989); Somerset (1988); Pennycuick & Murphy (1988); Stevens (1984); Marsh (1984); Brunton (1982); Solberg (1977); Foss (1977); *Kirkland (1971); Somerset (1968); Stuit (1947); and Keys (1934). *Covers many studies; study is a research review, research synthesis, or meta-analysis. | "Others have considered the role of tests in incentive programs. These researchers have included Homme, Csanyi, Gonzales, Rechs, O’Leary, Drabman, Kaszdin, Bootzin, Staats, Cameron, Pierce, McMillan, Corcoran, and Wilson. International organizations, such as the World Bank or the Asian Development Bank, have studied the effects of testing on education programs they sponsor. Researchers have included Somerset, Heynemann, Ransom, Psacharopoulis, Velez, Brooke, Oxenham, Bude, Chapman, Snyder, and Pronaratna. Moreover, the mastery learning/mastery testing experiments conducted from the 1960s through today varied incentives, frequency of tests, types of tests, and many other factors to determine the optimal structure of testing programs. Researchers included such notables as Bloom, Carroll, Keller, Block, Burns, Wentling, Anderson, Hymel, Kulik, Tierney, Cross, Okey, Guskey, Gates, and Jones." | |
272 | Harold F. O'Neil, Jr. | Brenda Sugrue, Jamal Abedi, Eva L. Baker, Shari Golan | "As yet, there appear to be no published studies that investigate the direct and indirect causal paths from motivational antecedents through use of metacognitive strategies to achievement." | Dismissive | Final Report of Experimental Studies on Motivation and NAEP Test Performance, p.8 | CSE Technical Report 427, June 1997 | https://cresst.org/wp-content/uploads/TECH427.pdf | Office of Research and Improvement, US Education Department | Relevant studies of the effects of tests and/or accountability programs on motivation and instructional practice include those of the *Southern Regional Education Board (1998); Johnson (1998); Schafer, Hultgren, Hawley, Abrams, Seubert & Mazzoni (1997); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); Tuckman & Trimble (1997); Clarke & Stephens (1996); Zigarelli (1996); Stevenson, Lee, et al. (1995); Waters, Burger & Burger (1995); Egeland (1995); Prais (1995); Tuckman (1994); Ritchie & Thorkildsen (1994); Brown & Walberg (1993); Wall & Alderson (1993); Wolf & Rapiau (1993); Eckstein & Noah (1993); Chao-Qun & Hui (1993); Plazak & Mazur (1992); Steedman (1992); Singh, Marimutha & Mukjerjee (1990); *Levine & Lezotte (1990); O’Sullivan (1989); Somerset (1988); Pennycuick & Murphy (1988); Stevens (1984); Marsh (1984); Brunton (1982); Solberg (1977); Foss (1977); *Kirkland (1971); Somerset (1968); Stuit (1947); and Keys (1934). *Covers many studies; study is a research review, research synthesis, or meta-analysis. | "Others have considered the role of tests in incentive programs. These researchers have included Homme, Csanyi, Gonzales, Rechs, O’Leary, Drabman, Kaszdin, Bootzin, Staats, Cameron, Pierce, McMillan, Corcoran, and Wilson. International organizations, such as the World Bank or the Asian Development Bank, have studied the effects of testing on education programs they sponsor. Researchers have included Somerset, Heynemann, Ransom, Psacharopoulis, Velez, Brooke, Oxenham, Bude, Chapman, Snyder, and Pronaratna. Moreover, the mastery learning/mastery testing experiments conducted from the 1960s through today varied incentives, frequency of tests, types of tests, and many other factors to determine the optimal structure of testing programs. Researchers included such notables as Bloom, Carroll, Keller, Block, Burns, Wentling, Anderson, Hymel, Kulik, Tierney, Cross, Okey, Guskey, Gates, and Jones." | |
273 | Harold F. O'Neil, Jr. | Brenda Sugrue, Jamal Abedi, Eva L. Baker, Shari Golan | "In general, there is a need for more studies to focus on the effects on test performance of motivational antecedents (not just anxiety) introduced at the time of test taking." | Dismissive | Final Report of Experimental Studies on Motivation and NAEP Test Performance, p.10 | CSE Technical Report 427, June 1997 | https://cresst.org/wp-content/uploads/TECH427.pdf | Office of Research and Improvement, US Education Department | Relevant studies of the effects of tests and/or accountability programs on motivation and instructional practice include those of the *Southern Regional Education Board (1998); Johnson (1998); Schafer, Hultgren, Hawley, Abrams, Seubert & Mazzoni (1997); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); Tuckman & Trimble (1997); Clarke & Stephens (1996); Zigarelli (1996); Stevenson, Lee, et al. (1995); Waters, Burger & Burger (1995); Egeland (1995); Prais (1995); Tuckman (1994); Ritchie & Thorkildsen (1994); Brown & Walberg (1993); Wall & Alderson (1993); Wolf & Rapiau (1993); Eckstein & Noah (1993); Chao-Qun & Hui (1993); Plazak & Mazur (1992); Steedman (1992); Singh, Marimutha & Mukjerjee (1990); *Levine & Lezotte (1990); O’Sullivan (1989); Somerset (1988); Pennycuick & Murphy (1988); Stevens (1984); Marsh (1984); Brunton (1982); Solberg (1977); Foss (1977); *Kirkland (1971); Somerset (1968); Stuit (1947); and Keys (1934). *Covers many studies; study is a research review, research synthesis, or meta-analysis. | "Others have considered the role of tests in incentive programs. These researchers have included Homme, Csanyi, Gonzales, Rechs, O’Leary, Drabman, Kaszdin, Bootzin, Staats, Cameron, Pierce, McMillan, Corcoran, and Wilson. International organizations, such as the World Bank or the Asian Development Bank, have studied the effects of testing on education programs they sponsor. Researchers have included Somerset, Heynemann, Ransom, Psacharopoulis, Velez, Brooke, Oxenham, Bude, Chapman, Snyder, and Pronaratna. Moreover, the mastery learning/mastery testing experiments conducted from the 1960s through today varied incentives, frequency of tests, types of tests, and many other factors to determine the optimal structure of testing programs. Researchers included such notables as Bloom, Carroll, Keller, Block, Burns, Wentling, Anderson, Hymel, Kulik, Tierney, Cross, Okey, Guskey, Gates, and Jones." | |
274 | Brian M. Stecher | Stephen P. Klein | "In contrast, relatively little has been published on the costs of such measures [performance tests] in operational programs. An Office of Technology Assessment (1992) … (Hoover and Bray) …." | Dismissive | The Cost of Science Performance Assessments in Large-Scale Testing Programs, p.1 | Educational Evaluation and Policy Analysis, Spring 1997, 19(1) | "This article is based on work supported by the National Science Foundation under Grant No. MDR-9154406." p.12 | The January 1993 GAO report on testing costs included such information. CRESST has spent a quarter century denigrating that report. | ||
275 | Brian M. Stecher | Stephen P. Klein | "However, empirical and observational data suggest much more needs to be done to understand what hands-on tasks actually measure. Klein et al. (1996b) … Shavelson et al. (1992) … Hamilton (1994) …." pp.9-10 | Dismissive | The Cost of Science Performance Assessments in Large-Scale Testing Programs, p.1 | Educational Evaluation and Policy Analysis, Spring 1997, 19(1) | "This article is based on work supported by the National Science Foundation under Grant No. MDR-9154406." p.12 | Article references only works by other CRESST authors and completely ignores the career-tech education literature, where such studies are most likely to be found. | |||
276 | Brian M. Stecher | Stephen P. Klein | "Future research will no doubt shed more light on the validity question, but for now, it is not clear how scores on hands-on performance tasks should be interpreted." p.10 | Dismissive | The Cost of Science Performance Assessments in Large-Scale Testing Programs, p.1 | Educational Evaluation and Policy Analysis, Spring 1997, 19(1) | "This article is based on work supported by the National Science Foundation under Grant No. MDR-9154406." p.12 | Article references only works by other CRESST authors and completely ignores the career-tech education literature, where such studies are most likely to be found. | |||
277 | Brian M. Stecher | Stephen P. Klein | "Advocates of performance assessment believe that the use of these measures will reinforce efforts to reform curriculum and instruction. … Unfortunately, there is very little research to confirm either the existence or the size of most of these potential benefits. Those few studies ... Klein (1995) ... Jovanovic, Solano-Flores, & Shavelson, 1994; Klein et al., 1996a)." p.10 | Dismissive | The Cost of Science Performance Assessments in Large-Scale Testing Programs, p.1 | Educational Evaluation and Policy Analysis, Spring 1997, 19(1) | "This article is based on work supported by the National Science Foundation under Grant No. MDR-9154406." p.12 | Article references only works by other CRESST authors and completely ignores the career-tech education literature, where such studies are most likely to be found. | | |
278 | Mary Lee Smith | 11 others | "The purpose of the research described in this report is to understand what happens in the aftermath of a change in state assessment policy that is designed to improve schools and make them more accountable to a set of common standards. Although theoretical and rhetorical works about this issue are common in the literature, empirical evidence is novel and scant." | Dismissive | Reforming schools by reforming assessment: Consequences of the Arizona Student Assessment Program (ASAP): Equity and teacher capacity building, p.3 | CSE Technical Report 425, March 1997 | https://cresst.org/wp-content/uploads/TECH425.pdf | Office of Research and Improvement, US Education Department | Relevant pre-2000 studies of the effects of standards, alignment, goal setting, setting reachable goals, etc. include those of Mitchell (1999); Morgan & Ramist (1998); the *Southern Regional Education Board (1998); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); the Florida Office of Program Policy Analysis (1997); Pomplun (1997); Schmoker (1996); Aguilera & Hendricks (1996); Banta, Lund, Black & Oblander (1996); Bottoms & Mikos (1995); *Bamburg & Medina (1993); Bishop (1993); the U. S. General Accounting Office (1993); Eckstein & Noah (1993); Mattsson (1993); Brown (1992); Heyneman & Ransom (1992); Whetton (1992); Anderson, Muir, Bateson, Blackmore & Rogers (1990); Csikszentmihalyi (1990); *Levine & Lezotte (1990); LaRoque & Coleman (1989); Hillocks (1987); Willingham & Morris (1986); Resnick & Resnick (1985); Ogle & Fritts (1984); *Natriello & Dornbusch (1984); Brooke & Oxenham (1984); Rentz (1979); Wellisch, MacQueen, Carriere & Dick (1978); *Rosswork (1977); Estes, Colvin & Goodwin (1976); Wood (1953); and Panlasigui & Knight (1930). | ||
279 | Robert L. Linn | Joan L. Herman | "How much do standards-led assessments cost? Dependable estimates are difficult to obtain, in part because many of the costs associated with assessment -- the time spent by teachers in preparation, administration, and scoring -- are typically absorbed by schools' normal operations and not priced in a separate budget." p.14 | Denigrating | A Policymaker's Guide to Standards-Led Assessment | Education Commission of the States, February, 1997 | The January 1993 GAO report on testing costs included such information. CRESST has spent a quarter century denigrating that report. See, for example, Phelps, R.P. (2000, Winter). Estimating the cost of systemwide student testing in the United States. Journal of Education Finance, 25(3), 343–380; Danitz, T. (2001, February 27). Special report: States pay $400 million for tests in 2001. Stateline.org. Pew Center for the States; Hoxby, C.M. (2002). The cost of accountability, in W.M. Evers & H.J. Walberg (Eds.), School Accountability, Stanford, CA: Hoover Institution Press; U.S. GAO. (1993, January). Student testing: Current extent and expenditures, with cost estimates for a national examination. GAO/PEMD-93-8. Washington, DC: US General Accounting Office; Phelps, R.P. (1998). Benefit-cost analysis of systemwide student testing, Paper presented at the annual meeting of the American Education Finance Association, Mobile, AL. | | | |
280 | Robert L. Linn | Joan L. Herman | "None of the above estimates includes operational costs for schools, districts, or states." p.14 | Denigrating | A Policymaker's Guide to Standards-Led Assessment | Education Commission of the States, February, 1997 | The January 1993 GAO report on testing costs included such information. CRESST has spent a quarter century denigrating that report. | ||||
281 | Eva L. Baker | Robert L. Linn, Joan L. Herman | "How do we assure accurate placement of students with varying abilities and language capabilities? There is little research to date to guide policy and practice (August, et al., 1994)." | Dismissive | CRESST: A Continuing Mission to Improve Educational Assessment, p.12 | Evaluation Comment, Summer 1996 | Office of Research and Improvement, US Education Department | ||||
282 | Eva L. Baker | Robert L. Linn, Joan L. Herman | "Alternative assessments are needed for these students (see Kentucky Portfolios for Special Education, Kentucky Department of Education, 1995). Although promising, there has been little or no research investigating the validity of inferences from these adaptations or alternatives." | Dismissive | CRESST: A Continuing Mission to Improve Educational Assessment, p.13 | Evaluation Comment, Summer 1996 | Office of Research and Improvement, US Education Department | ||||
283 | Eva L. Baker | Robert L. Linn, Joan L. Herman | "Similarly, research is needed to provide a basis for understanding the implications of using different summaries of student performance, such as group means or percentage of students meeting a standard, for measuring progress." p.15 | Dismissive | CRESST: A Continuing Mission to Improve Educational Assessment | Evaluation Comment, Summer 1996 | Office of Research and Improvement, US Education Department | ||||
284 | Eva L. Baker | Harold F O'Neil, Jr | "Few research findings exist about the performance of ethnically different groups of students on performance-based assessment in its present form." p.193 | Dismissive | Chapter 10 in Implementing Performance Assessment: Promises, Problems, and Challenges | Lawrence Erlbaum Associates Publishers, 1996 | Office of Research and Improvement, US Education Department | | | |
285 | Eva L. Baker | Harold F O'Neil, Jr | "The authors have not been able to find studies of the interaction of raters and student ethnicities in educational settings." p.193 | Dismissive | Chapter 10 in Implementing Performance Assessment: Promises, Problems, and Challenges | Lawrence Erlbaum Associates Publishers, 1996 | Office of Research and Improvement, US Education Department | | | |
286 | Robert L. Linn | Daniel M. Koretz, Eva Baker | "'Yet we do not have the necessary comprehensive dependable data. . . .' (Tyler 1996a, p. 95)" p. 8 | Dismissive | Assessing the Validity of the National Assessment of Educational Progress | CSE Technical Report 416 (June 1996) | http://www.cse.ucla.edu/products/reports/TECH416.pdf | Office of Research and Improvement, US Education Department | In their 2009 Evaluation of NAEP for the US Education Department, Buckendahl, Davis, Plake, Sireci, Hambleton, Zenisky, & Wells (pp. 77–85) managed to find quite a lot of research on making comparisons between NAEP and state assessments: several of NAEP's own publications, Chromy (2005), Chromy, Ault, Black, & Mosquin (2007), McLaughlin (2000), Schulz & Mitzel (2005), Sireci, Robin, Meara, Rogers, & Swaminathan (2000), Stancavage et al. (2002), Stoneberg (2007), WestEd (2002), and Wise, Le, Hoffman, & Becker (2004). | |
287 | Robert L. Linn | Daniel M. Koretz, Eva Baker | "There is a need for more extended discussion and reconsideration of the approach being used to measure long-term trends." p. 21 | Dismissive | Assessing the Validity of the National Assessment of Educational Progress | CSE Technical Report 416 (June 1996) | http://www.cse.ucla.edu/products/reports/TECH416.pdf | Office of Research and Improvement, US Education Department | There was extended discussion and consideration. Simply put, they did not get their way because others disagreed with them. | |
288 | Robert L. Linn | Daniel M. Koretz, Eva Baker | "Only a small minority of the articles that discussed achievement levels made any mention of the judgmental nature of the levels, and most of those did so only briefly." p. 27 | Denigrating | Assessing the Validity of the National Assessment of Educational Progress | CSE Technical Report 416 (June 1996) | http://www.cse.ucla.edu/products/reports/TECH416.pdf | Office of Research and Improvement, US Education Department | All achievement levels, just like all course grades, are set subjectively. This information was never hidden. | |
289 | Thomas Kellaghan | George F. Madaus, Anastasia Raczek | "The limited evidence on the effectiveness of external, or extrinsic, rewards in education is also reviewed." p.vii | Dismissive | The Use of External Examinations to Improve Student Motivation | American Educational Research Association monograph | "Work on this monograph was supported by Grant 910-1205-1 from the Ford Foundation." | See, for example: https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/QuantitativeList.htm . This list includes 24 studies completed before 2000 whose primary focus was to measure the effect of “test-based accountability.” A few dozen more pre-2000 studies also measured the effect of test-based accountability although such was not their primary focus. Include qualitative and program evaluation studies of test-based accountability, and the count of pre-2000 studies rises into the hundreds. | | |
290 | Lawrence O. Picus | Alisha Tralli, Suzanne Tacheny | "Although several states have implemented new assessment programs, there has been little research on the costs of developing and implementing these new systems." p.4 | Dismissive | Estimating the Costs of Student Assessment in North Carolina and Kentucky: A State-Level Analysis | CSE Technical Report 408 (February 1996) | http://www.cse.ucla.edu/products/reports/TECH408.pdf | Office of Research and Improvement, US Education Department | The January 1993 GAO report on testing costs included such information. Picus has spent over two decades denigrating that report, both directly and by insinuation. | |
291 | Lawrence O. Picus | Alisha Tralli, Suzanne Tacheny | "Although several states have implemented new assessment programs, there has been little research on the cost of developing and implementing these new systems." p.3 | Dismissive | Estimating the Costs of Student Assessment in North Carolina and Kentucky: A State-Level Analysis | CSE Technical Report 408 (February 1996) | http://www.cse.ucla.edu/products/reports/TECH408.pdf | Office of Research and Improvement, US Education Department | The January 1993 GAO report on testing costs included such information. Picus has spent over two decades denigrating that report, both directly and by insinuation. | |
292 | Thomas Kellaghan | George F. Madaus, Anastasia Raczek | "At the very least, a careful analysis of relevant issues and a consideration of empirical evidence are required before reaching such a conclusion. However, the arguments put forward by reformers are not based on such analysis or consideration. Indeed, their arguments often lack clarity, even in the terminology they use. Further, although not much research deals directly with the relationship between external examinations and motivation, ..." p.2 | Dismissive, Denigrating | The Use of External Examinations to Improve Student Motivation | American Educational Research Association monograph | "Work on this monograph was supported by Grant 910-1205-1 from the Ford Foundation." | Relevant studies of the effects of tests and/or accountability programs on motivation and instructional practice include those of the *Southern Regional Education Board (1998); Johnson (1998); Schafer, Hultgren, Hawley, Abrams, Seubert & Mazzoni (1997); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); Tuckman & Trimble (1997); Clarke & Stephens (1996); Zigarelli (1996); Stevenson, Lee, et al. (1995); Waters, Burger & Burger (1995); Egeland (1995); Prais (1995); Tuckman (1994); Ritchie & Thorkildsen (1994); Brown & Walberg (1993); Wall & Alderson (1993); Wolf & Rapiau (1993); Eckstein & Noah (1993); Chao-Qun & Hui (1993); Plazak & Mazur (1992); Steedman (1992); Singh, Marimutha & Mukjerjee (1990); *Levine & Lezotte (1990); O’Sullivan (1989); Somerset (1988); Pennycuick & Murphy (1988); Stevens (1984); Marsh (1984); Brunton (1982); Solberg (1977); Foss (1977); *Kirkland (1971); Somerset (1968); Stuit (1947); and Keys (1934). *Covers many studies; study is a research review, research synthesis, or meta-analysis. | | |
293 | Thomas Kellaghan | George F. Madaus, Anastasia Raczek | "The final proposition in the armory of proponents of external examinations anticipates that all students at selected grades at both elementary and high school levels will take such examinations. This proposition is presumably based on the unexamined assumption that the motivational power of examinations will operate more or less the same way for students of all ages." p.10 | Dismissive, Denigrating | The Use of External Examinations to Improve Student Motivation | American Educational Research Association monograph | "Work on this monograph was supported by Grant 910-1205-1 from the Ford Foundation." | Relevant studies of the effects of tests and/or accountability programs on motivation and instructional practice include those of the *Southern Regional Education Board (1998); Johnson (1998); Schafer, Hultgren, Hawley, Abrams, Seubert & Mazzoni (1997); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); Tuckman & Trimble (1997); Clarke & Stephens (1996); Zigarelli (1996); Stevenson, Lee, et al. (1995); Waters, Burger & Burger (1995); Egeland (1995); Prais (1995); Tuckman (1994); Ritchie & Thorkildsen (1994); Brown & Walberg (1993); Wall & Alderson (1993); Wolf & Rapiau (1993); Eckstein & Noah (1993); Chao-Qun & Hui (1993); Plazak & Mazur (1992); Steedman (1992); Singh, Marimutha & Mukjerjee (1990); *Levine & Lezotte (1990); O’Sullivan (1989); Somerset (1988); Pennycuick & Murphy (1988); Stevens (1984); Marsh (1984); Brunton (1982); Solberg (1977); Foss (1977); *Kirkland (1971); Somerset (1968); Stuit (1947); and Keys (1934). *Covers many studies; study is a research review, research synthesis, or meta-analysis. | | |
294 | Robert L. Linn | Eva L. Baker | "Although the connection between student achievement and economic competitiveness is not well established, exhortations for higher standards of student achievement nonetheless are frequently based on the assumption of a strong connection." | Dismissive | What Do International Assessments Imply for World-Class Standards? | Educational Evaluation and Policy Analysis, Dec. 1, 1995 | https://journals.sagepub.com/doi/abs/10.3102/01623737017004405 | Office of Research and Improvement, US Education Department | |||
295 | Robert Rothman | "Though Cannell's methods were flawed and he overstated his case, …" p.51 | Dismissive, Denigrating | Measuring Up: Standards, Assessment, and School Reform | Jossey-Bass Publishers, 1995 | "This book would not have come about without the support of two extraordinary groups of people, to whom I owe incalculable debt." CRESST, Dean Ted Mitchell, Director Eva Baker; Education Week, Editors Ron Wolk, Ginny Edwards. Also, Steve Ferrara, Chester Finn, Joan Herman, Laura Resnick | Rothman correctly claims that there were likely multiple causes for test score inflation, including outdated norms and genuine improved student achievement. Then, he suggests that Cannell had insisted that there was only one cause--cheating. That is false. Cannell specifically acknowledged other possible causes. See https://eric.ed.gov/?q=Cannell&pg=2&id=ED314454 | | | |
296 | Robert Rothman | "To those familiar with testing—the finding—confirmed by a federally sponsored study by leading experts—pointed up many of the problems brought on by reliance on high-stakes testing. In any event, Cannell's small, crude study helped fuel a mounting criticism of the enterprise." p.52 | Dismissive, Denigrating | Measuring Up: Standards, Assessment, and School Reform | Jossey-Bass Publishers, 1995 | "This book would not have come about without the support of two extraordinary groups of people, to whom I owe incalculable debt." CRESST, Dean Ted Mitchell, Director Eva Baker; Education Week, Editors Ron Wolk, Ginny Edwards. Also, Steve Ferrara, Chester Finn, Joan Herman, Laura Resnick | Cannell surveyed education departments in all fifty states and, in states where districts made all the testing decisions, the larger districts within each state. He was unusually successful in retrieving responses, which required many hours and persistence. It was an enormous undertaking, and very revealing. Most states and districts admitted that they were not following many professional test security standards. See https://eric.ed.gov/?q=Cannell&pg=2&id=ED314454 | | | |
297 | Robert Rothman | "And as a big man with a booming baritone voice, Cannell was able to make himself heard from statehouses to the corridors of the U.S. Education Department." p.52 | Denigrating | Measuring Up: Standards, Assessment, and School Reform | Jossey-Bass Publishers, 1995 | "This book would not have come about without the support of two extraordinary groups of people, to whom I owe incalculable debt." CRESST, Dean Ted Mitchell, Director Eva Baker; Education Week, Editors Ron Wolk, Ginny Edwards. Also, Steve Ferrara, Chester Finn, Joan Herman, Laura Resnick | Cannell was exactly right. There was corruption, lax security, and cheating. See, for example, https://nonpartisaneducation.org/Review/Articles/v6n3.htm | | | |
298 | Robert Rothman | "To Cannell, the high scores reflected flagrant cheating. … This charge lent an air of sensationalism to Cannell's already provocative findings and helped attract even more publicity for them. … Cannell began receiving letters from other teachers around the country confessing their own misdeeds or charging others with committing similar ones." p.56 | Denigrating | Measuring Up: Standards, Assessment, and School Reform | Jossey-Bass Publishers, 1995 | "This book would not have come about without the support of two extraordinary groups of people, to whom I owe incalculable debt." CRESST, Dean Ted Mitchell, Director Eva Baker; Education Week, Editors Ron Wolk, Ginny Edwards. Also, Steve Ferrara, Chester Finn, Joan Herman, Laura Resnick | Rothman correctly claims that there were likely multiple causes for test score inflation, including outdated norms and genuine improved student achievement. Then, he suggests that Cannell had insisted that there was only one cause--cheating. That is false. Cannell specifically acknowledged other possible causes. See https://eric.ed.gov/?q=Cannell&pg=2&id=ED314454 | | | |
299 | Robert Rothman | "Despite those cases, there is little evidence that cheating is epidemic in schools or that such practices are the reason test scores have risen." p.57 | Dismissive | Measuring Up: Standards, Assessment, and School Reform | Jossey-Bass Publishers, 1995 | "This book would not have come about without the support of two extraordinary groups of people, to whom I owe incalculable debt." CRESST, Dean Ted Mitchell, Director Eva Baker; Education Week, Editors Ron Wolk, Ginny Edwards. Also, Steve Ferrara, Chester Finn, Joan Herman, Laura Resnick | Rothman cites one CRESST study. Meanwhile, Cannell surveyed all 50 states on their test security practices and found most lacking. | | | |
300 | Robert Rothman | "Daniel M. Koretz and his colleagues (at CRESST) found that students performed much worse on tests they had not seen before than they did on the district's tests, even though the test measured the same general content and skills." p.62 | Denigrating | Measuring Up: Standards, Assessment, and School Reform | Jossey-Bass Publishers, 1995 | "This book would not have come about without the support of two extraordinary groups of people, to whom I owe incalculable debt." CRESST, Dean Ted Mitchell, Director Eva Baker; Education Week, Editors Ron Wolk, Ginny Edwards. Also, Steve Ferrara, Chester Finn, Joan Herman, Laura Resnick | The comparison test most likely did not measure the same content and skills, as it was a "competing test" in an era when national norm-referenced tests included widely varying content and sequencing of topics. We cannot check, though, as Koretz has kept the identity of the tests and the schools secret. | | | |
301 | Robert Rothman | "'Teachers have gotten the message loud and clear that they would be rated on how kids score on tests. That's all it takes. The problem is, it simply hasn't worked in raising performance. I don't know why we would want to try it again when it hasn't worked before.'" p.134 | Denigrating | Measuring Up: Standards, Assessment, and School Reform | Jossey-Bass Publishers, 1995 | "This book would not have come about without the support of two extraordinary groups of people, to whom I owe incalculable debt." CRESST, Dean Ted Mitchell, Director Eva Baker; Education Week, Editors Ron Wolk, Ginny Edwards. Also, Steve Ferrara, Chester Finn, Joan Herman, Laura Resnick | In fact, the evidence "that testing can improve education" is voluminous. See, for example, Phelps, R. P. (2005). The rich, robust research literature on testing’s achievement benefits. In R. P. Phelps (Ed.), Defending standardized testing (pp. 55–90). Mahwah, NJ: Psychology Press. Or, see https://journals.sagepub.com/doi/abs/10.1177/0193841X19865628#abstract | | | |
302 | Robert Rothman | "Moreover, Madaus and Kellaghan found that almost no country tests students before age sixteen, and most use tests to select students for scarce slots in higher education and training programs." p.135 | Dismissive | Measuring Up: Standards, Assessment, and School Reform | Jossey-Bass Publishers, 1995 | "This book would not have come about without the support of two extraordinary groups of people, to whom I owe incalculable debt." CRESST, Dean Ted Mitchell, Director Eva Baker; Education Week, Editors Ron Wolk, Ginny Edwards. Also, Steve Ferrara, Chester Finn, Joan Herman, Laura Resnick | Madaus and Kellaghan did not "find" anything. They simply declared that such was a fact. It was not. See https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1745-3992.2000.tb00018.x | | | |
303 | Robert Rothman | "But scholars are just beginning to learn how the new instruments can be used to measure students' abilities." p.149 | Dismissive | Measuring Up: Standards, Assessment, and School Reform | Jossey-Bass Publishers, 1995 | "This book would not have come about without the support of two extraordinary groups of people, to whom I owe incalculable debt." CRESST, Dean Ted Mitchell, Director Eva Baker; Education Week, Editors Ron Wolk, Ginny Edwards. Also, Steve Ferrara, Chester Finn, Joan Herman, Laura Resnick | It is selected-response item formats (e.g., multiple choice) that are new. Open-ended item formats have been with us for millennia. And, thousands of research, evaluation, and validity studies have been conducted on them. | | | |
304 | Lawrence O. Picus | "While our understanding of how each of these assessment instruments can best be used is growing, information of their costs is virtually nonexistent." p.1 | Dismissive | A Conceptual Framework for Analyzing the Costs of Alternative Assessment | CSE Technical Report 384 (August 1994) | https://cresst.org/wp-content/uploads/TECH384.pdf | Office of Research and Improvement, US Education Department | The January 1993 GAO report on testing costs included such information. Picus has spent over two decades denigrating that report, both directly and by insinuation. | |||
305 | Lawrence O. Picus | "Research at the Center for Research on Evaluation, Standards, and Student Testing (CRESST) has found that policy makers have little information about the costs of alternative assessments, and that they are concerned about the cost trade-offs involved in using alternative assessment compared to the many other activities they feel continue to be necessary." p.1 | Dismissive | A Conceptual Framework for Analyzing the Costs of Alternative Assessment | CSE Technical Report 384 (August 1994) | https://cresst.org/wp-content/uploads/TECH384.pdf | Office of Research and Improvement, US Education Department | The January 1993 GAO report on testing costs included such information. Picus has spent over two decades denigrating that report, both directly and by insinuation. | | |
306 | Lawrence O. Picus | "A number of important issues must be resolved before accurate estimates of costs can be developed. Central among those issues is the development of a clear definition of what constitutes a cost." p.1 | Denigrating | A Conceptual Framework for Analyzing the Costs of Alternative Assessment | CSE Technical Report 384 (August 1994) | https://cresst.org/wp-content/uploads/TECH384.pdf | Office of Research and Improvement, US Education Department | The January 1993 GAO report on testing costs included such information. Picus has spent over two decades denigrating that report, both directly and by insinuation. | |||
307 | Lawrence O. Picus | "Determining the resources necessary to achieve each of these goals is, at best, a difficult task. Because of this difficulty, many analysts stop short of estimating the true cost of a program, and instead focus on the expenditures required for its implementation." pp.3-4 | Denigrating | A Conceptual Framework for Analyzing the Costs of Alternative Assessment | CSE Technical Report 384 (August 1994) | https://cresst.org/wp-content/uploads/TECH384.pdf | Office of Research and Improvement, US Education Department | The January 1993 GAO report on testing costs included such information. Picus has spent over two decades denigrating that report, both directly and by insinuation. | |||
308 | Lawrence O. Picus | "… cost analysts in education have often resorted to estimating the monetary value of the resources devoted to the program being evaluated. ... However, it is important to remember the opportunity costs that result from time commitments of individuals not directly compensated through the assessment program, such as the teachers who are required to spend time on tasks that previously did not exist or were not their responsibility. Determining the value of these opportunity costs will improve the quality of educational cost analyses dramatically." p.33 | Denigrating | A Conceptual Framework for Analyzing the Costs of Alternative Assessment | CSE Technical Report 384 (August 1994) | https://cresst.org/wp-content/uploads/TECH384.pdf | Office of Research and Improvement, US Education Department | The January 1993 GAO report on testing costs included such information. Picus has spent over two decades denigrating that report, both directly and by insinuation. | |||
309 | Mary Lee Smith | 5 others | "This study also draws on previous research on the role of mandated testing. …The question unanswered by extant research is whether assessments that differ in form from the traditional, norm- or criterion-referenced standardized tests would produce similar reactions and effects." | Dismissive | What Happens When the Test Mandate Changes? Results of a Multiple Case Study | CSE Technical Report 380, July 1994 | https://cresst.org/wp-content/uploads/TECH380.pdf | Office of Research and Improvement, US Education Department | It is selected-response item formats (e.g., multiple choice) that are new. Open-ended item formats have been with us for millennia. And, thousands of research, evaluation, and validity studies have been conducted on them. | |
310 | Linn, R.L. | "Evidence is also needed that the uses and interpretations are contributing to enhanced student achievement and at the same time, not producing unintended negative outcomes." p.8 | Performance Assessment: Policy promises and technical measurement standards. | Educational Researcher, 23(9), 4-14, 1994 | As quoted in William A. Mehrens, Consequences of Assessment: What is the Evidence?, Education Policy Analysis Archives Volume 6 Number 13 July 14, 1998, https://epaa.asu.edu/ojs/article/view/580/ | Office of Research and Improvement, US Education Department | |||||
311 | Audrey J. Noble | Mary Lee Smith | "Are the behaviorist beliefs underlying measurement-driven reform warranted? A small body of evidence addresses the functions of assessments from the traditional viewpoint." | Dismissive | Old and New Beliefs About Measurement-Driven Reform: The More Things Change, the More They Stay the Same, p.3 | CSE Technical Report 373, CRESST/Arizona State University | https://cresst.org/wp-content/uploads/TECH373.pdf | Office of Research and Improvement, US Education Department | It is selected-response item formats (e.g., multiple choice) that are new. Open-ended item formats have been with us for millennia. And, thousands of research, evaluation, and validity studies have been conducted on them. | |
312 | Audrey J. Noble | Mary Lee Smith | "Few empirical studies exist of the use and effects of performance testing in high-stakes environments." | Dismissive | Old and New Beliefs About Measurement-Driven Reform: The More Things Change, the More They Stay the Same, p.10 | CSE Technical Report 373, CRESST/Arizona State University | https://cresst.org/wp-content/uploads/TECH373.pdf | Office of Research and Improvement, US Education Department | It is selected-response item formats (e.g., multiple choice) that are new. Open-ended item formats have been with us for millennia. And, thousands of research, evaluation, and validity studies have been conducted on them. | |
313 | Baker, E.L. | O'Neil, H.F., & Linn, R.L. | "Sufficient high-quality assessments must be available before their impact on educational reform can be assessed. Although interest in performance-based assessment is high, our knowledge about its quality is low." | Dismissive | Policy and validity prospects for performance-based assessment, 1993, p.332 | American Psychologist, 48(12), 1210-1218. | http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.816.7823&rep=rep1&type=pdf | Office of Research and Improvement, US Education Department | It is selected-response item formats (e.g., multiple choice) that are new. Performance assessments have been with us for millennia. And, thousands of research, evaluation, and validity studies have been conducted on them. | |
314 | Baker, E.L. | O'Neil, H.F., & Linn, R.L. | "Moreover, few psychometric templates exist to guide the technical practices of assessment developers." | Dismissive | Policy and validity prospects for performance-based assessment, 1993, p.332 | American Psychologist, 48(12), 1210-1218. | http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.816.7823&rep=rep1&type=pdf | Office of Educational Research and Improvement, US Education Department | It is selected-response item formats (e.g., multiple choice) that are new. Performance assessments have been with us for millennia. And, thousands of research, evaluation, and validity studies have been conducted on them. | ||
315 | Baker, E.L. | O'Neil, H.F., & Linn, R.L. | "Most of the arguments in favor of performance-based assessment ... are based on single instances, essentially hand-crafted exercises whose virtues are assumed because they have been developed by teachers or because they are thought to model good instructional practice." | Denigrating | Policy and validity prospects for performance-based assessment, 1993, p.334 | American Psychologist, 48(12), 1210-1218. | http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.816.7823&rep=rep1&type=pdf | Office of Educational Research and Improvement, US Education Department | It is selected-response item formats (e.g., multiple choice) that are new. Open-ended item formats have been with us for millennia. And, thousands of research, evaluation, and validity studies have been conducted on them. | ||
316 | Baker, E.L. | O'Neil, H.F., & Linn, R.L. | "Although there is a considerable literature on the problem of unit or team assessment in the military (Swezey & Salas, 1992) and in technical fields such as antisubmarine warfare (Franken, in press), no compelling solutions have been forwarded for disaggregating group or team performance into individual records, a potential problem if assessments are to be used to allocate individual access or certification." | Denigrating | Policy and validity prospects for performance-based assessment, 1993, p.336 | American Psychologist, 48(12), 1210-1218. | http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.816.7823&rep=rep1&type=pdf | Office of Educational Research and Improvement, US Education Department | It is selected-response item formats (e.g., multiple choice) that are new. Open-ended item formats have been with us for millennia. And, thousands of research, evaluation, and validity studies have been conducted on them. | ||
317 | Baker, E.L. | O'Neil, H.F., & Linn, R.L. | "What is the evidence in support of performance assessment? Reviews conducted of literature in military performance assessments (Baker, O’Neil, & Linn, 1990) and of literature in education (Baker, 1990b) have reported the relatively low incidence of any empirical literature in the field; less than 5% of the literature cited empirical data." | Dismissive | Policy and validity prospects for performance-based assessment, 1993, p.339-340 | American Psychologist, 48(12), 1210-1218. | http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.816.7823&rep=rep1&type=pdf | Office of Educational Research and Improvement, US Education Department | It is selected-response item formats (e.g., multiple choice) that are new. Open-ended item formats have been with us for millennia. And, thousands of research, evaluation, and validity studies have been conducted on them. | ||
318 | Baker, E.L. | O'Neil, H.F., & Linn, R.L. | "To date, there is some evidence that precollegiate performance assessments result in relatively low levels of student performance in almost every subject matter area in which they have been tried. There is also emerging data from NAEP analyses (Koretz, Lewis, Skewes-Cox, & Burstein, 1992) that students differ by ethnicity in the rate at which they attempt more open-ended types of items." | Dismissive | Policy and validity prospects for performance-based assessment, 1993, p.341 | American Psychologist, 48(12), 1210-1218. | http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.816.7823&rep=rep1&type=pdf | Office of Educational Research and Improvement, US Education Department | It is selected-response item formats (e.g., multiple choice) that are new. Open-ended item formats have been with us for millennia. And, thousands of research, evaluation, and validity studies have been conducted on them. | ||
319 | Baker, E.L. | O'Neil, H.F., & Linn, R.L. | "Research is underway attempting to address the motivational aspects of these assessments (Gearhart, Saxe, Stipek, & Hakansson, 1992; O’Neil, Sugrue, Abedi, Baker, & Golan, 1992)." | Dismissive | Policy and validity prospects for performance-based assessment, 1993, p.341 | American Psychologist, 48(12), 1210-1218. | http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.816.7823&rep=rep1&type=pdf | Office of Educational Research and Improvement, US Education Department | It is selected-response item formats (e.g., multiple choice) that are new. Open-ended item formats have been with us for millennia. And, thousands of research, evaluation, and validity studies have been conducted on them. | ||
320 | Baker, E.L. | O'Neil, H.F., & Linn, R.L. | "Another approach might require the reconceptualization of the unit of assessment to include both teacher and student and thereby to legitimate help of various sorts. As yet, there is little research and only occasional speculation about the degree to which new assessments will be corrupted." | Dismissive | Policy and validity prospects for performance-based assessment, 1993, p.344-345 | American Psychologist, 48(12), 1210-1218. | http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.816.7823&rep=rep1&type=pdf | Office of Educational Research and Improvement, US Education Department | It is selected-response item formats (e.g., multiple choice) that are new. Open-ended item formats have been with us for millennia. And, thousands of research, evaluation, and validity studies have been conducted on them. | ||
321 | Baker, E.L. | O'Neil, H.F., & Linn, R.L. | "A better research base is needed to evaluate the degree to which newly developed assessments fulfill expectations" | Denigrating | Policy and validity prospects for performance-based assessment, 1993, p.346 | American Psychologist, 48(12), 1210-1218. | http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.816.7823&rep=rep1&type=pdf | Office of Educational Research and Improvement, US Education Department | It is selected-response item formats (e.g., multiple choice) that are new. Open-ended item formats have been with us for millennia. And, thousands of research, evaluation, and validity studies have been conducted on them. | ||
322 | Eva L. Baker | Robert L. Linn | "Because performance assessments are emerging phenomena, procedures for assessing their quality are in some disorder." | Denigrating | The Technical Merits of Performance Assessments, p.1 | CRESST Line, Special 1993 AERA Issue | Office of Educational Research and Improvement, US Education Department | Emerging? It is selected-response item formats (e.g., multiple choice) that are new. Open-ended item formats have been with us for millennia. And, thousands of research, evaluation, and validity studies have been conducted on them. | |||
323 | Eva L. Baker | Robert L. Linn | "Second, there is relatively little analysis of the sequence of technical procedures required to render assessments sound for some uses." | Dismissive | The Technical Merits of Performance Assessments, p.1 | CRESST Line, Special 1993 AERA Issue | Office of Educational Research and Improvement, US Education Department | Emerging? It is selected-response item formats (e.g., multiple choice) that are new. Open-ended item formats have been with us for millennia. And, thousands of research, evaluation, and validity studies have been conducted on them. | |||
324 | Eva L. Baker | Robert L. Linn | "The problem is that we cannot learn enough from the conduct of short-term instructional studies, nor can we wait for the results of longer-term instructional programs. ...We must continue to operate on faith." | Denigrating | The Technical Merits of Performance Assessments, p.2 | CRESST Line, Special 1993 AERA Issue | Office of Educational Research and Improvement, US Education Department | Emerging? It is selected-response item formats (e.g., multiple choice) that are new. Open-ended item formats have been with us for millennia. And, thousands of research, evaluation, and validity studies have been conducted on them. | |||
325 | Walter M. Haney | George F. Madaus, Robert Lyons | "Academics who write about educational and psychological testing similarly have given little attention to the commercial side of testing." p.9 | Dismissive | The Fractured Marketplace for Standardized Testing | National Commission on Testing and Public Policy, Boston College, Kluwer Academic Publishers, 1993 | "Finally we thank the Ford Foundation, and three present and former officials there, …" | ||||
326 | Walter M. Haney | George F. Madaus, Robert Lyons | "Nor is there much clear evidence on the potential distortions introduced by the Lake Wobegon phenomenon." p.231 | Dismissive | The Fractured Marketplace for Standardized Testing | National Commission on Testing and Public Policy, Boston College, Kluwer Academic Publishers, 1993 | "Finally we thank the Ford Foundation, and three present and former officials there, …" | John J. Cannell's original "Lake Wobegon Effect" studies did a fine job of specifying the results, in detail. See: http://nonpartisaneducation.org/Review/Books/CannellBook1.htm http://nonpartisaneducation.org/Review/Books/Cannell2.pdf | |||
327 | Robert L. Linn | Vonda L. Kiplinger | "Unfortunately, there have been no empirical studies to date to either support or reject the hypothesized lack of motivation generated by the NAEP testing environment, or to show whether students' performance would be improved if motivation were increased." | 1stness | Raising the stakes of test administration: The impact on student performance on NAEP, p.3 | CSE Technical Report 360, March 3, 1993 | https://files.eric.ed.gov/fulltext/ED378221.pdf | Office of Educational Research and Improvement, US Education Department | A cornucopia of research has shown "no stakes" tests to be relatively unreliable, less reliable than high stakes tests, and to dampen student effort (see, e.g., Ackerman & Kanfer, 2009; S. M. Brown & Walberg, 1993; Cole, Bergin, & Whittaker, 2008; Eklof, 2007; Finn, 2015; Hawthorne, Bol, Pribesh, & Suh, 2015; Wise & DeMars, 2005, 2015). Relevant studies of the effects of tests and/or accountability programs on motivation and instructional practice include those of the *Southern Regional Education Board (1998); Johnson (1998); Schafer, Hultgren, Hawley, Abrams, Seubert & Mazzoni (1997); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); Tuckman & Trimble (1997); Clarke & Stephens (1996); Zigarelli (1996); Stevenson, Lee, et al. (1995); Waters, Burger & Burger (1995); Egeland (1995); Prais (1995); Tuckman (1994); Ritchie & Thorkildsen (1994); Brown & Walberg (1993); Wall & Alderson (1993); Wolf & Rapiau (1993); Eckstein & Noah (1993); Chao-Qun & Hui (1993); Plazak & Mazur (1992); Steedman (1992); Singh, Marimuthu & Mukherjee (1990); *Levine & Lezotte (1990); O’Sullivan (1989); Somerset (1988); Pennycuick & Murphy (1988); Stevens (1984); Marsh (1984); Brunton (1982); Solberg (1977); Foss (1977); *Kirkland (1971); Somerset (1968); Stuit (1947); and Keys (1934). *Covers many studies; study is a research review, research synthesis, or meta-analysis. | ||
328 | Robert L. Linn | Vonda L. Kiplinger | "Although much has been written on achievement motivation per se, there has been surprisingly little empirical research on the effects of different motivation conditions on test performance." | Dismissive | Raising the stakes of test administration: The impact on student performance on NAEP, p.3 | CSE Technical Report 360, March 3, 1993 | https://files.eric.ed.gov/fulltext/ED378221.pdf | Office of Educational Research and Improvement, US Education Department | A cornucopia of research has shown "no stakes" tests to be relatively unreliable, less reliable than high stakes tests, and to dampen student effort (see, e.g., Ackerman & Kanfer, 2009; S. M. Brown & Walberg, 1993; Cole, Bergin, & Whittaker, 2008; Eklof, 2007; Finn, 2015; Hawthorne, Bol, Pribesh, & Suh, 2015; Wise & DeMars, 2005, 2015). Relevant studies of the effects of tests and/or accountability programs on motivation and instructional practice include those of the *Southern Regional Education Board (1998); Johnson (1998); Schafer, Hultgren, Hawley, Abrams, Seubert & Mazzoni (1997); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); Tuckman & Trimble (1997); Clarke & Stephens (1996); Zigarelli (1996); Stevenson, Lee, et al. (1995); Waters, Burger & Burger (1995); Egeland (1995); Prais (1995); Tuckman (1994); Ritchie & Thorkildsen (1994); Brown & Walberg (1993); Wall & Alderson (1993); Wolf & Rapiau (1993); Eckstein & Noah (1993); Chao-Qun & Hui (1993); Plazak & Mazur (1992); Steedman (1992); Singh, Marimuthu & Mukherjee (1990); *Levine & Lezotte (1990); O’Sullivan (1989); Somerset (1988); Pennycuick & Murphy (1988); Stevens (1984); Marsh (1984); Brunton (1982); Solberg (1977); Foss (1977); *Kirkland (1971); Somerset (1968); Stuit (1947); and Keys (1934). *Covers many studies; study is a research review, research synthesis, or meta-analysis. | ||
329 | Robert L. Linn | Vonda L. Kiplinger | "Before examining the paucity of research on the relationship of motivation and test performance, we first review briefly the general literature on the relationship of motivation and achievement." | Dismissive | Raising the stakes of test administration: The impact on student performance on NAEP, p.3 | CSE Technical Report 360, March 3, 1993 | https://files.eric.ed.gov/fulltext/ED378221.pdf | Office of Educational Research and Improvement, US Education Department | A cornucopia of research has shown "no stakes" tests to be relatively unreliable, less reliable than high stakes tests, and to dampen student effort (see, e.g., Ackerman & Kanfer, 2009; S. M. Brown & Walberg, 1993; Cole, Bergin, & Whittaker, 2008; Eklof, 2007; Finn, 2015; Hawthorne, Bol, Pribesh, & Suh, 2015; Wise & DeMars, 2005, 2015). Relevant studies of the effects of tests and/or accountability programs on motivation and instructional practice include those of the *Southern Regional Education Board (1998); Johnson (1998); Schafer, Hultgren, Hawley, Abrams, Seubert & Mazzoni (1997); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); Tuckman & Trimble (1997); Clarke & Stephens (1996); Zigarelli (1996); Stevenson, Lee, et al. (1995); Waters, Burger & Burger (1995); Egeland (1995); Prais (1995); Tuckman (1994); Ritchie & Thorkildsen (1994); Brown & Walberg (1993); Wall & Alderson (1993); Wolf & Rapiau (1993); Eckstein & Noah (1993); Chao-Qun & Hui (1993); Plazak & Mazur (1992); Steedman (1992); Singh, Marimuthu & Mukherjee (1990); *Levine & Lezotte (1990); O’Sullivan (1989); Somerset (1988); Pennycuick & Murphy (1988); Stevens (1984); Marsh (1984); Brunton (1982); Solberg (1977); Foss (1977); *Kirkland (1971); Somerset (1968); Stuit (1947); and Keys (1934). *Covers many studies; study is a research review, research synthesis, or meta-analysis. | ||
330 | Robert L. Linn | Vonda L. Kiplinger | "Prior to 1980, achievement motivation theory focused primarily on the need for achievement and the effects of test anxiety on test performance." | Dismissive | Raising the stakes of test administration: The impact on student performance on NAEP, p.3 | CSE Technical Report 360, March 3, 1993 | https://files.eric.ed.gov/fulltext/ED378221.pdf | Office of Educational Research and Improvement, US Education Department | Relevant studies of the effects of tests and/or accountability programs on motivation and instructional practice include those of the *Southern Regional Education Board (1998); Johnson (1998); Schafer, Hultgren, Hawley, Abrams, Seubert & Mazzoni (1997); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); Tuckman & Trimble (1997); Clarke & Stephens (1996); Zigarelli (1996); Stevenson, Lee, et al. (1995); Waters, Burger & Burger (1995); Egeland (1995); Prais (1995); Tuckman (1994); Ritchie & Thorkildsen (1994); Brown & Walberg (1993); Wall & Alderson (1993); Wolf & Rapiau (1993); Eckstein & Noah (1993); Chao-Qun & Hui (1993); Plazak & Mazur (1992); Steedman (1992); Singh, Marimuthu & Mukherjee (1990); *Levine & Lezotte (1990); O’Sullivan (1989); Somerset (1988); Pennycuick & Murphy (1988); Stevens (1984); Marsh (1984); Brunton (1982); Solberg (1977); Foss (1977); *Kirkland (1971); Somerset (1968); Stuit (1947); and Keys (1934). *Covers many studies; study is a research review, research synthesis, or meta-analysis. | "Others have considered the role of tests in incentive programs. These researchers have included Homme, Csanyi, Gonzales, Rechs, O’Leary, Drabman, Kazdin, Bootzin, Staats, Cameron, Pierce, McMillan, Corcoran, and Wilson. International organizations, such as the World Bank or the Asian Development Bank, have studied the effects of testing on education programs they sponsor. Researchers have included Somerset, Heyneman, Ransom, Psacharopoulos, Velez, Brooke, Oxenham, Bude, Chapman, Snyder, and Pronaratna. Moreover, the mastery learning/mastery testing experiments conducted from the 1960s through today varied incentives, frequency of tests, types of tests, and many other factors to determine the optimal structure of testing programs. Researchers included such notables as Bloom, Carroll, Keller, Block, Burns, Wentling, Anderson, Hymel, Kulik, Tierney, Cross, Okey, Guskey, Gates, and Jones." |
331 | Robert L. Linn | Vonda L. Kiplinger | "Despite continuing concern regarding the effects of motivation on student achievement and test performance in general, ...there has been very little empirical research on students' self-reported motivation levels or experimental manipulation of motivational conditions--until recently." | Dismissive | Raising the stakes of test administration: The impact on student performance on NAEP, p.3 | CSE Technical Report 360, March 3, 1993 | https://files.eric.ed.gov/fulltext/ED378221.pdf | Office of Educational Research and Improvement, US Education Department | Relevant studies of the effects of tests and/or accountability programs on motivation and instructional practice include those of the *Southern Regional Education Board (1998); Johnson (1998); Schafer, Hultgren, Hawley, Abrams, Seubert & Mazzoni (1997); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); Tuckman & Trimble (1997); Clarke & Stephens (1996); Zigarelli (1996); Stevenson, Lee, et al. (1995); Waters, Burger & Burger (1995); Egeland (1995); Prais (1995); Tuckman (1994); Ritchie & Thorkildsen (1994); Brown & Walberg (1993); Wall & Alderson (1993); Wolf & Rapiau (1993); Eckstein & Noah (1993); Chao-Qun & Hui (1993); Plazak & Mazur (1992); Steedman (1992); Singh, Marimuthu & Mukherjee (1990); *Levine & Lezotte (1990); O’Sullivan (1989); Somerset (1988); Pennycuick & Murphy (1988); Stevens (1984); Marsh (1984); Brunton (1982); Solberg (1977); Foss (1977); *Kirkland (1971); Somerset (1968); Stuit (1947); and Keys (1934). *Covers many studies; study is a research review, research synthesis, or meta-analysis. | "Others have considered the role of tests in incentive programs. These researchers have included Homme, Csanyi, Gonzales, Rechs, O’Leary, Drabman, Kazdin, Bootzin, Staats, Cameron, Pierce, McMillan, Corcoran, and Wilson. International organizations, such as the World Bank or the Asian Development Bank, have studied the effects of testing on education programs they sponsor. Researchers have included Somerset, Heyneman, Ransom, Psacharopoulos, Velez, Brooke, Oxenham, Bude, Chapman, Snyder, and Pronaratna. Moreover, the mastery learning/mastery testing experiments conducted from the 1960s through today varied incentives, frequency of tests, types of tests, and many other factors to determine the optimal structure of testing programs. Researchers included such notables as Bloom, Carroll, Keller, Block, Burns, Wentling, Anderson, Hymel, Kulik, Tierney, Cross, Okey, Guskey, Gates, and Jones." |
332 | Joan L. Herman | "Although the development of new alternatives is a popular idea, and many are engaged in the process, most developers of these new alternatives (with the exception of writing assessments) are at the design and prototyping stages, at some distance from having validated assessments." | Dismissive | Accountability and Alternative Assessment: Research and Development Issues, p.9 | CSE Technical Report 348, August 1992 | https://cresst.org/wp-content/uploads/TECH348.pdf | Office of Educational Research and Improvement, US Education Department | It is selected-response item formats (e.g., multiple choice) that are new. Performance and authentic test formats have been with us for millennia. And, thousands of research, evaluation, and validity studies have been conducted on them. | |||
333 | Joan L. Herman | "Yet what we know about alternative or performance-based measures is relatively small when compared to what we have yet to discover." | Dismissive | Accountability and Alternative Assessment: Research and Development Issues, p.9 | CSE Technical Report 348, August 1992 | https://cresst.org/wp-content/uploads/TECH348.pdf | Office of Educational Research and Improvement, US Education Department | It is selected-response item formats (e.g., multiple choice) that are new. Performance and authentic test formats have been with us for millennia. And, thousands of research, evaluation, and validity studies have been conducted on them. | |||
334 | Lorrie A. Shepard | "Proponents of measurement-driven instruction (MDI) argued, in the 1980s, that high-stakes tests would set clear targets thus assuring that teachers would focus greater attention on essential basic skills. Critics countered that measurement-driven instruction distorts the curriculum, .... Each side argued theoretically and from limited observations but without systematic proof of these assertions." | Dismissive | Will National Tests Improve Student Learning?, p.6 | CSE Technical Report 342, April 1992 | https://files.eric.ed.gov/fulltext/ED348382.pdf | Office of Educational Research and Improvement, US Education Department | Relevant pre-2000 studies of the effects of standards, alignment, goal setting, setting reachable goals, etc. include those of Mitchell (1999); Morgan & Ramist (1998); the *Southern Regional Education Board (1998); Miles, Bishop, Collins, Fink, Gardner, Grant, Hussain, et al. (1997); the Florida Office of Program Policy Analysis (1997); Pomplun (1997); Schmoker (1996); Aguilera & Hendricks (1996); Banta, Lund, Black & Oblander (1996); Bottoms & Mikos (1995); *Bamburg & Medina (1993); Bishop (1993); the U. S. General Accounting Office (1993); Eckstein & Noah (1993); Mattsson (1993); Brown (1992); Heyneman & Ransom (1992); Whetton (1992); Anderson, Muir, Bateson, Blackmore & Rogers (1990); Csikszentmihalyi (1990); *Levine & Lezotte (1990); LaRoque & Coleman (1989); Hillocks (1987); Willingham & Morris (1986); Resnick & Resnick (1985); Ogle & Fritts (1984); *Natriello & Dornbusch (1984); Brooke & Oxenham (1984); Rentz (1979); Wellisch, MacQueen, Carriere & Dick (1978); *Rosswork (1977); Estes, Colvin & Goodwin (1976); Wood (1953); and Panlasigui & Knight (1930). | |||
335 | Lorrie A. Shepard | "The vision of curriculum-driven examinations offered by the National Education Goals Panel is inspired. However, we do not at present have the technical, curricular, or political know-how to install such a system at least not on so large a scale." | Dismissive | Will National Tests Improve Student Learning?, p.10 | CSE Technical Report 342, April 1992 | https://files.eric.ed.gov/fulltext/ED348382.pdf | Office of Educational Research and Improvement, US Education Department | ||||
336 | Lorrie A. Shepard | "Moreover, there is no evidence available about what would happen to the quality of instruction if all high-school teachers, not just those who volunteered, were required to teach to the AP curricula." | Dismissive | Will National Tests Improve Student Learning?, p.10 | CSE Technical Report 342, April 1992 | https://files.eric.ed.gov/fulltext/ED348382.pdf | Office of Educational Research and Improvement, US Education Department | ||||
337 | Lorrie A. Shepard | "Research evidence on the effects of traditional standardized tests when used as high-stakes accountability instruments is strikingly negative." | Dismissive | Will National Tests Improve Student Learning?, pp.15-16 | CSE Technical Report 342, April 1992 | https://files.eric.ed.gov/fulltext/ED348382.pdf | Office of Educational Research and Improvement, US Education Department | In fact, the evidence "that testing can improve education" is voluminous. See, for example, Phelps, R. P. (2005). The rich, robust research literature on testing’s achievement benefits. In R. P. Phelps (Ed.), Defending standardized testing (pp. 55–90). Mahwah, NJ: Psychology Press. Or, see https://journals.sagepub.com/doi/abs/10.1177/0193841X19865628#abstract | |||
338 | Joan L. Herman | Shari Golan | "Using greater technical rigor, Linn et al. (1989) replicated Cannell's findings, but moved beyond them in identifying underlying causes for such seemingly spurious results, among them the age of norms." pp.10-11 | Denigrating | Effects of Standardized Testing on Teachers and Learning—Another Look | CSE Report No. 334 | https://eric.ed.gov/?id=ED341738 | Office of Educational Research and Improvement, US Education Department | No. Cannell was exactly right. There was corruption, lax security, and cheating. See, for example, https://nonpartisaneducation.org/Review/Articles/v6n3.htm | ||
339 | R.J. Dietel, J.L. Herman, and R.A. Knuth | "Although there is now great excitement about performance-based assessment, we still know relatively little about methods for designing and validating such assessments. CRESST is one of many organizations and schools researching the promises and realities of such assessments." p.3 | Dismissive | What Does Research Say About Assessment? | North Central Regional Education Laboratory, 1991 | http://methodenpool.uni-koeln.de/portfolio/What%20Does%20Research%20Say%20About%20Assessment.htm | Office of Educational Research and Improvement, US Education Department | It is selected-response item formats (e.g., multiple choice) that are new. Open-ended item formats have been with us for millennia. And, thousands of research, evaluation, and validity studies have been conducted on them. | |||
340 | R.J. Dietel, J.L. Herman, and R.A. Knuth | "What we know about performance-based assessment is limited and there are many issues yet to be resolved." p.6 | Dismissive | What Does Research Say About Assessment? | North Central Regional Education Laboratory, 1991 | http://methodenpool.uni-koeln.de/portfolio/What%20Does%20Research%20Say%20About%20Assessment.htm | Office of Educational Research and Improvement, US Education Department | It is selected-response item formats (e.g., multiple choice) that are new. Open-ended item formats have been with us for millennia. And, thousands of research, evaluation, and validity studies have been conducted on them. | |||
341 | Mary Lee Smith | Carole Edelsky, Kelly Draper, Claire Rottenberg, Meredith Cherland | "Although schools have administered standardized tests of achievement for decades, only recently have such tests been used as instruments of social policy." p.1 | Dismissive | The Role of Testing in Elementary Schools | CSE Technical Report 321, May 1991 | https://cresst.org/publications/cresst-publication-2695/ | Office of Educational Research and Improvement, US Education Department | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
342 | Mary Lee Smith | Carole Edelsky, Kelly Draper, Claire Rottenberg, Meredith Cherland | "The research literature on the effects of external testing is small but growing." p.3 | Dismissive | The Role of Testing in Elementary Schools | CSE Technical Report 321, May 1991 | https://cresst.org/publications/cresst-publication-2695/ | Office of Educational Research and Improvement, US Education Department | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
343 | Mary Lee Smith | Carole Edelsky, Kelly Draper, Claire Rottenberg, Meredith Cherland | "Past researchers have not examined the classroom directly for traces of testing effects." p.5 | Dismissive | The Role of Testing in Elementary Schools | CSE Technical Report 321, May 1991 | https://cresst.org/publications/cresst-publication-2695/ | Office of Educational Research and Improvement, US Education Department | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
344 | Eva L. Baker | "Knowledge Base: Paltry But Sure to Improve: At the same time that interest in alternative assessment is high, our knowledge about the design, distribution, quality and impact of such efforts is low. This is a time of tingling metaphor, cottage industry, and existence proofs rather than carefully designed research and development." | Dismissive | What Probably Works in Alternative Assessment, p.2 | Paper presented at the Annual Meetings of the American Educational Research Association (Chicago, IL, April 3-7, 1991) | https://files.eric.ed.gov/fulltext/ED512658.pdf | Office of Educational Research and Improvement, US Education Department | It is selected-response item formats (e.g., multiple choice) that are new. Performance and authentic test formats have been with us for millennia. And, thousands of research, evaluation, and validity studies have been conducted on them. | |||
345 | Eva L. Baker | "Moreover, because psychometric methods appropriate for dealing with such new measures are not readily available, nor even a matter of common agreement, no clear templates exist to guide the technical practices of alternative assessment developers (Linn, Baker, Dunbar, 1991)." | Dismissive | What Probably Works in Alternative Assessment, p.2 | Paper presented at the Annual Meetings of the American Educational Research Association (Chicago, IL, April 3-7, 1991) | https://files.eric.ed.gov/fulltext/ED512658.pdf | Office of Educational Research and Improvement, US Education Department | It is selected-response item formats (e.g., multiple choice) that are new. Performance and authentic test formats have been with us for millennia. And, thousands of research, evaluation, and validity studies have been conducted on them. | |||
346 | Eva L. Baker | "Given that the level of empirical work is so obviously low, one well might wonder what these studies are about." | Denigrating | What Probably Works in Alternative Assessment, p.3 | Paper presented at the Annual Meetings of the American Educational Research Association (Chicago, IL, April 3-7, 1991) | https://files.eric.ed.gov/fulltext/ED512658.pdf | Office of Research and Improvement, US Education Department | It is selected-response item formats (e.g., multiple choice) that are new. Performance and authentic test formats have been with us for millenia. And, thousands of research, evalution, and validity studies have been conducted on them. | |||
347 | Eva L. Baker | "Despite this fragile research base, alternative assessment has already taken off. What issues can we anticipate being raised by relevant communities about the value of these efforts?" | Dismissive | What Probably Works in Alternative Assessment, p.6 | Paper presented at the Annual Meetings of the American Educational Research Association (Chicago, IL, April 3-7, 1991) | https://files.eric.ed.gov/fulltext/ED512658.pdf | Office of Research and Improvement, US Education Department | It is selected-response item formats (e.g., multiple choice) that are new. Performance and authentic test formats have been with us for millenia. And, thousands of research, evalution, and validity studies have been conducted on them. | |||
348 | Eva L. Baker | "This phenomenon may be due to lack of coherent specifications of the performance task domain, lack of coherent instructional experience, or the inherent instability of more complex performance? Until some insight on this phenomenon can be developed, however, using a single performance assessment for individual student decisions is a scary prospect." | Dismissive | What Probably Works in Alternative Assessment, p.7 | Paper presented at the Annual Meetings of the American Educational Research Association (Chicago, IL, April 3-7, 1991) | https://files.eric.ed.gov/fulltext/ED512658.pdf | Office of Research and Improvement, US Education Department | It is selected-response item formats (e.g., multiple choice) that are new. Performance and authentic test formats have been with us for millenia. And, thousands of research, evalution, and validity studies have been conducted on them. | |||
349 | Lorrie A. Shepard | Catherine Cutts Dougherty | "Evidence to support the positive claims for measurement-driven instruction comes primarily from high-stakes tests themselves. For example, Popham, Cruse, Rankin, Sandifer, and Williams (1985) and Popham (1987) pointed to the steeply rising passing rates on minimum competency tests as demonstrations that MDI had improved student learning." p.2 | Denigrating | Effect of High-Stakes Testing on Instruction | Paper presented at the Annual Meetings of the American Educational Research Association (Chicago, IL, April 3-7, 1991) and the National Council on Measurement in Education (Chicago, IL, April 4-6, 1991) | https://files.eric.ed.gov/fulltext/ED337468.pdf | Office of Research and Improvement, US Education Department | The many studies of district and state minimum competency or diploma testing programs popular from the 1960s through the 1980s found positive effects for students just below the cut score and mixed effects for students far below and anywhere above. Researchers have included Fincher, Jackson, Battiste, Corcoran, Jacobsen, Tanner, Boylan, Saxon, Anderson, Muir, Bateson, Blackmore, Rogers, Zigarelli, Schafer, Hultgren, Hawley, Abrams, Seubert, Mazzoni, Brookhart, Mendro, Herrick, Webster, Orsack, Weerasinghe, and Bembry. | |
350 | Lorrie A. Shepard | Catherine Cutts Dougherty | "Evidence documenting the negative influence on instruction is limited to a few studies. Darling-Hammond and Wise (1985) reported that teachers in their study were pressured to 'teach to the test.'" | Dismissive | Effect of High-Stakes Testing on Instruction | Paper presented at the Annual Meetings of the American Educational Research Association (Chicago, IL, April 3-7, 1991) and the National Council on Measurement in Education (Chicago, IL, April 4-6, 1991) | https://files.eric.ed.gov/fulltext/ED337468.pdf | Office of Research and Improvement, US Education Department | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | |
351 | Daniel M. Koretz | Robert L. Linn, Stephen Dunbar, Lorrie A. Shepard | “Evidence relevant to this debate has been limited.” p. 2 | Dismissive | The Effects of High-Stakes Testing On Achievement: Preliminary Findings About Generalization Across Tests | Originally presented at the annual meeting of the AERA and the NCME, Chicago, April 5, 1991 | http://nepc.colorado.edu/files/HighStakesTesting.pdf | Office of Research and Improvement, US Education Department | See, for example, https://www.tandfonline.com/doi/full/10.1080/15305058.2011.602920 ; https://nonpartisaneducation.org/Review/Resources/QuantitativeList.htm ; https://nonpartisaneducation.org/Review/Resources/SurveyList.htm ; https://nonpartisaneducation.org/Review/Resources/QualitativeList.htm | ||
352 | James S. Catterall | "Before proceeding, readers should note that the observations do not result from an accumulated weight of in-depth cost-benefit type studies, since no such weight has been registered." p.2 | Dismissive | Estimating the Costs and Benefits of Large-Scale Assessments: Lessons from Recent Research | CSE Report No. 319, 1990 | https://cresst.org/wp-content/uploads/TECH319.pdf | Office of Research and Improvement, US Education Department | ||||
353 | James S. Catterall | "The points tend to build on the small number of interesting developments reported (particularly Shepard & Kreitzer, 1987a, 1987b; Solmon & Fagnano, in press), as well as on the author's experiences in conducting cost-benefit type analyses of educational assessment practices (Catterall, 1984, 1989). We also base inferences on the paucity of research itself." p.2 | Dismissive | Estimating the Costs and Benefits of Large-Scale Assessments: Lessons from Recent Research | CSE Report No. 319, 1990 | https://cresst.org/wp-content/uploads/TECH319.pdf | Office of Research and Improvement, US Education Department | ||||
354 | Hartigan, J. A., & Wigdor, A. K. | "The empirical evidence cited for the standard deviation of worker productivity is quite slight." p.239 | Dismissive | Fairness in employment testing: Validity generalization, minority issues, and the General Aptitude Test Battery. | Washington, DC: National Academy Press, 1989 | https://www.nap.edu/catalog/1338/fairness-in-employment-testing-validity-generalization-minority-issues-and-the | National Research Council funders | See, for example, The National Research Council’s Testing Expertise, https://www.apa.org/pubs/books/supplemental/correcting-fallacies-educational-psychological-testing/Phelps Web Appendix D new.doc | |||
355 | Hartigan, J. A., & Wigdor, A. K. | "Some fragmentary confirming evidence that supports this point of view can be found in Hunter et al. (1988)... We regard the Hunter and Schmidt assumption as plausible but note that there is very little evidence about the nature of the relationship of ability to output." p.243 | Dismissive | Fairness in employment testing: Validity generalization, minority issues, and the General Aptitude Test Battery. | Washington, DC: National Academy Press, 1989 | https://www.nap.edu/catalog/1338/fairness-in-employment-testing-validity-generalization-minority-issues-and-the | National Research Council funders | See, for example, The National Research Council’s Testing Expertise, https://www.apa.org/pubs/books/supplemental/correcting-fallacies-educational-psychological-testing/Phelps Web Appendix D new.doc | |||
356 | Hartigan, J. A., & Wigdor, A. K. | "It is also important to remember that the most important assumptions of the Hunter-Schmidt models rest on a very slim empirical foundation .... Hunter and Schmidt's economy-wide models are based on simple assumptions for which the empirical evidence is slight." p.245 | Dismissive, Denigrating | Fairness in employment testing: Validity generalization, minority issues, and the General Aptitude Test Battery. | Washington, DC: National Academy Press, 1989 | https://www.nap.edu/catalog/1338/fairness-in-employment-testing-validity-generalization-minority-issues-and-the | National Research Council funders | See, for example, The National Research Council’s Testing Expertise, https://www.apa.org/pubs/books/supplemental/correcting-fallacies-educational-psychological-testing/Phelps Web Appendix D new.doc | |||
357 | Hartigan, J. A., & Wigdor, A. K. | "It is important to remember that the most important assumptions of the Hunter-Schmidt models rest on a very slim empirical foundation." p.245 | Dismissive, Denigrating | Fairness in employment testing: Validity generalization, minority issues, and the General Aptitude Test Battery. | Washington, DC: National Academy Press, 1989 | https://www.nap.edu/catalog/1338/fairness-in-employment-testing-validity-generalization-minority-issues-and-the | National Research Council funders | See, for example, The National Research Council’s Testing Expertise, https://www.apa.org/pubs/books/supplemental/correcting-fallacies-educational-psychological-testing/Phelps Web Appendix D new.doc | |||
358 | Hartigan, J. A., & Wigdor, A. K. | "Hunter and Schmidt's economy wide models are based on simple assumptions for which the empirical evidence is slight." p.245 | Dismissive, Denigrating | Fairness in employment testing: Validity generalization, minority issues, and the General Aptitude Test Battery. | Washington, DC: National Academy Press, 1989 | https://www.nap.edu/catalog/1338/fairness-in-employment-testing-validity-generalization-minority-issues-and-the | National Research Council funders | See, for example, The National Research Council’s Testing Expertise, https://www.apa.org/pubs/books/supplemental/correcting-fallacies-educational-psychological-testing/Phelps Web Appendix D new.doc | |||
359 | Hartigan, J. A., & Wigdor, A. K. | "That assumption is supported by only a very few studies." p.245 | Dismissive, Denigrating | Fairness in employment testing: Validity generalization, minority issues, and the General Aptitude Test Battery. | Washington, DC: National Academy Press, 1989 | https://www.nap.edu/catalog/1338/fairness-in-employment-testing-validity-generalization-minority-issues-and-the | National Research Council funders | See, for example, The National Research Council’s Testing Expertise, https://www.apa.org/pubs/books/supplemental/correcting-fallacies-educational-psychological-testing/Phelps Web Appendix D new.doc | |||
360 | Hartigan, J. A., & Wigdor, A. K. | "There is no well-developed body of evidence from which to estimate the aggregate effects of better personnel selection...we have seen no empirical evidence that any of them provide an adequate basis for estimating the aggregate economic effects of implementing the VG-GATB on a nationwide basis." p.247 | Dismissive, Denigrating | Fairness in employment testing: Validity generalization, minority issues, and the General Aptitude Test Battery. | Washington, DC: National Academy Press, 1989 | https://www.nap.edu/catalog/1338/fairness-in-employment-testing-validity-generalization-minority-issues-and-the | National Research Council funders | See, for example, The National Research Council’s Testing Expertise, https://www.apa.org/pubs/books/supplemental/correcting-fallacies-educational-psychological-testing/Phelps Web Appendix D new.doc | |||
361 | Hartigan, J. A., & Wigdor, A. K. | "Furthermore, given the state of scientific knowledge, we do not believe that realistic dollar estimates of aggregate gains from improved selection are even possible." p.248 | Dismissive | Fairness in employment testing: Validity generalization, minority issues, and the General Aptitude Test Battery. | Washington, DC: National Academy Press, 1989 | https://www.nap.edu/catalog/1338/fairness-in-employment-testing-validity-generalization-minority-issues-and-the | National Research Council funders | See, for example, The National Research Council’s Testing Expertise, https://www.apa.org/pubs/books/supplemental/correcting-fallacies-educational-psychological-testing/Phelps Web Appendix D new.doc | |||
362 | Hartigan, J. A., & Wigdor, A. K. | "...primitive state of knowledge..." p.248 | Denigrating | Fairness in employment testing: Validity generalization, minority issues, and the General Aptitude Test Battery. | Washington, DC: National Academy Press, 1989 | https://www.nap.edu/catalog/1338/fairness-in-employment-testing-validity-generalization-minority-issues-and-the | National Research Council funders | See, for example, The National Research Council’s Testing Expertise, https://www.apa.org/pubs/books/supplemental/correcting-fallacies-educational-psychological-testing/Phelps Web Appendix D new.doc | |||
363 | Joan L. Herman, Donald W. Dorr-Bremme | Walter E. Hathaway, Ed. | "Despite the controversy and the important issues that it raises, little information has been forthcoming on the nature of testing as it is actually used in the schools. What functions do tests serve in the classrooms? How do teachers and principals use test results? What kinds of tests do principals and teachers trust and rely on most? These and similar questions have gone largely unaddressed." p.8 | Dismissive | Uses of Testing in the Schools: A National Profile | Testing in the Schools, New Directions for Testing and Measurement #19, Jossey-Bass, September 1983 | Office of Research and Improvement, US Education Department | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | |||
364 | Joan L. Herman, Donald W. Dorr-Bremme | Walter E. Hathaway, Ed. | "A few studies have indicated teachers' circumspect attitudes toward and limited use of one type of achievement measure, the norm-referenced test. Beyond this, however, the landscape of test uses in American schools has remained largely unexplored." p.8 | Dismissive | Uses of Testing in the Schools: A National Profile | Testing in the Schools, New Directions for Testing and Measurement #19, Jossey-Bass, September 1983 | Office of Research and Improvement, US Education Department | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | |||
365 | Joan L. Herman, Donald W. Dorr-Bremme | Walter E. Hathaway, Ed. | "We know very little about the quality of teacher-developed tests." p.15 | Dismissive | Uses of Testing in the Schools: A National Profile | Testing in the Schools, New Directions for Testing and Measurement #19, Jossey-Bass, September 1983 | Office of Research and Improvement, US Education Department | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | |||
366 | Don Dorr-Bremme | James Catterall | "Relatively little is known about students' attitudes and feelings toward assessment in general. Even less is known regarding their feelings about different forms of assessment." p.48-1 | Dismissive | Costs of Testing: Test Use Project | CSE Report, November 1982 | https://files.eric.ed.gov/fulltext/ED224835.pdf | National Institute of Education, US Education Department | See https://nonpartisaneducation.org/Review/Resources/QualitativeList.htm for a list of 19 pre-1982 qualitative studies of student attitudes toward testing. | |
367 | Don Dorr-Bremme | James Catterall | "in light of these few and certainly non-definitive findings, student interviews were undertaken to explore the affective valence that different forms of achievement assessment have for students." p.48-2 | Dismissive | Costs of Testing: Test Use Project | CSE Report, November 1982 | https://files.eric.ed.gov/fulltext/ED224835.pdf | National Institute of Education, US Education Department | See https://nonpartisaneducation.org/Review/Resources/QualitativeList.htm for a list of 19 pre-1982 qualitative studies of student attitudes toward testing. | |
368 | Don Dorr-Bremme | James Catterall | "Because of the small sample size and the paucity of research in this topic, these findings suggests potential avenues for research as much as they provide information." p.48-26 | Dismissive | Costs of Testing: Test Use Project | CSE Report, November 1982 | https://files.eric.ed.gov/fulltext/ED224835.pdf | National Institute of Education, US Education Department | See https://nonpartisaneducation.org/Review/Resources/QualitativeList.htm for a list of 19 pre-1982 qualitative studies of student attitudes toward testing. | |
369 | Jennie P. Yeh | Joan L. Herman | "Testing in American schools is increasing in both scope and visibility. … What return are we getting for this quite considerable investment? Little information is available. How are tests used in schools? What functions do tests serve in classrooms?", p.1 | Dismissive | Teachers and testing: A survey of test use | CSE Report No. 166, 1981 | https://files.eric.ed.gov/fulltext/ED218336.pdf | National Institute of Education, US Education Department | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | |
370 | Joan L. Herman | James Burry, Don Dorr-Bremme, Charlotte M. Lazar-Morrison, James D. Lehman, Jennie P. Yeh | "Despite the great controversy that surrounds testing and its potential uses and abuses, there is little empirical information available about the nature of testing as it actually occurs and is used (or not used) in schools. The Test Use Project at the Center for the Study of Evaluation seeks to fill this gap and answer basic questions about tests and schooling.", p.2 | Dismissive | Teaching and testing: Allies or adversaries | CSE Report No. 165, 1981 | https://files.eric.ed.gov/fulltext/ED218336.pdf | National Institute of Education, US Education Department | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
371 | Joan L. Herman | James Burry, Don Dorr-Bremme, Charlotte M. Lazar-Morrison, James D. Lehman, Jennie P. Yeh | "Clearly the policy toward testing in this country has been one of accretion, but the full magnitude is undocumented. The CSE Test Use Project ... ", p.2 | Dismissive | Teaching and testing: Allies or adversaries | CSE Report No. 165, 1981 | https://files.eric.ed.gov/fulltext/ED218336.pdf | National Institute of Education, US Education Department | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
372 | James Burry | "As instructional considerations have come into prominence, the dialogue over testing has become somewhat adversarial, with a great deal of the recent literature forming a series of position papers espousing the value of one kind of test over another, but offering little empirical data (Lazar-Morrison, Polin, Moy, & Burry, 1980)." p.27 | Dismissive | The Design of Testing Programs with Multiple and Complimentary Uses | Paper presented at the Annual Meeting of the National Council on Measurement in Education (Los Angeles, CA, April 1981) | https://files.eric.ed.gov/fulltext/ED218337.pdf | National Institute of Education, US Education Department | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | |||
373 | James Burry | "This paper makes a preliminary step toward explicating school peoples' points of view about the kinds of assessment that are useful for external accountability concerns and for instructional decision making." pp.27-28 | 1stness | The Design of Testing Programs with Multiple and Complimentary Uses | Paper presented at the Annual Meeting of the National Council on Measurement in Education (Los Angeles, CA, April 1981) | https://files.eric.ed.gov/fulltext/ED218337.pdf | National Institute of Education, US Education Department | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | |||
374 | Joan L. Herman | Jennie Yeh | "Despite the great controversy that surrounds testing and its potential uses and abuses, there is little empirical information available about the nature of testing as it actually occurs and is used (or not used) in schools. The Test Use Project …." p.2 | Dismissive | Contextual Examination of Test Use: The Test, The Setting, The Cost | Paper presented at the Annual Meeting of the National Council on Measurement in Education (Los Angeles, CA, April 1981) | https://files.eric.ed.gov/fulltext/ED218337.pdf | National Institute of Education, US Education Department | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
375 | Joan L. Herman | Jennie Yeh | "Clearly the policy toward testing in this country has been one of accretion, but the full magnitude is undocumented. The CSE Test Use Project ... ", p.2 | Dismissive | Contextual Examination of Test Use: The Test, The Setting, The Cost | Paper presented at the Annual Meeting of the National Council on Measurement in Education (Los Angeles, CA, April 1981) | https://files.eric.ed.gov/fulltext/ED218337.pdf | National Institute of Education, US Education Department | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
376 | Charlotte Lazar-Morrison | Linda Polin, Raymond Moy, James Burry | "There is little research-based information about current testing practice." | Dismissive | A review of the literature on test use, p.3 | CSE Report No. 144, August 1980 | https://cresst.org/publications/cresst-publication-2531/ | National Institute of Education, US Department of Health and Human Services | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
377 | Charlotte Lazar-Morrison | Linda Polin, Raymond Moy, James Burry | "Almost ten years ago, Kirkland (1971) reviewed the literature on test impact on students and schools and found that while much had been written about tests, few empirical studies were evident." | Dismissive | A review of the literature on test use, p.3 | CSE Report No. 144, August 1980 | https://cresst.org/publications/cresst-publication-2531/ | National Institute of Education, US Department of Health and Human Services | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
378 | Charlotte Lazar-Morrison | Linda Polin, Raymond Moy, James Burry | "What is significant about [Kirkland's] exclusions is the correct observation that these issues are 'implications,' often not founded on empirical research." | Denigrating | A review of the literature on test use, p.3 | CSE Report No. 144, August 1980 | https://cresst.org/publications/cresst-publication-2531/ | National Institute of Education, US Department of Health and Human Services | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
379 | Charlotte Lazar-Morrison | Linda Polin, Raymond Moy, James Burry | "Today, there still remains a plethora of publications on these very issues and a dearth of empirical support on actual test use practices." | Dismissive | A review of the literature on test use, p.3 | CSE Report No. 144, August 1980 | https://cresst.org/publications/cresst-publication-2531/ | National Institute of Education, US Department of Health and Human Services | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
380 | Charlotte Lazar-Morrison | Linda Polin, Raymond Moy, James Burry | "Kirkland's review of the literature is concentrated mainly upon the social and psychological issues in testing, more than upon instructional issues. Also, then as now, little empirical research had accumulated on the latter." | Dismissive | A review of the literature on test use, p.3 | CSE Report No. 144, August 1980 | https://cresst.org/publications/cresst-publication-2531/ | National Institute of Education, US Department of Health and Human Services | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | |
381 | Charlotte Lazar-Morrison | Linda Polin, Raymond Moy, James Burry | "Only recently has the testing dialogue begun to move away from social and psychological issues ...and begun to focus on the instructional issues of testing." | Dismissive | A review of the literature on test use, p.3 | CSE Report No. 144, August 1980 | https://cresst.org/publications/cresst-publication-2531/ | National Institute of Education, US Department of Health and Human Services | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | |
382 | Charlotte Lazar-Morrison | Linda Polin, Raymond Moy, James Burry | " ...the testing dialogue has taken the form of a debate, with the bulk of the test literature being a series of position papers citing little empirical data. This debate is being carried on predominantly by people outside the schools." | Denigrating | A review of the literature on test use, p.4 | CSE Report No. 144, August 1980 | https://cresst.org/publications/cresst-publication-2531/ | National Institute of Education, US Department of Health and Human Services | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
383 | Charlotte Lazar-Morrison | Linda Polin, Raymond Moy, James Burry | "There is little empirical research available that can answer the questions that have arisen." | Dismissive | A review of the literature on test use, p.5 | CSE Report No. 144, August 1980 | https://cresst.org/publications/cresst-publication-2531/ | National Institute of Education, US Department of Health and Human Services | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
384 | Charlotte Lazar-Morrison | Linda Polin, Raymond Moy, James Burry | "... little is known about the amount of other testing that takes place." | Dismissive | A review of the literature on test use, p.6 | CSE Report No. 144, August 1980 | https://cresst.org/publications/cresst-publication-2531/ | National Institute of Education, US Department of Health and Human Services | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
385 | Charlotte Lazar-Morrison | Linda Polin, Raymond Moy, James Burry | "Although much has been written about minimum competency issues, there has yet to be any report of the actual uses or extent of the use of competency-based tests." | Dismissive | A review of the literature on test use, p.7 | CSE Report No. 144, August 1980 | https://cresst.org/publications/cresst-publication-2531/ | National Institute of Education, US Department of Health and Human Services | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
386 | Charlotte Lazar-Morrison | Linda Polin, Raymond Moy, James Burry | "Virtually nothing is known about the amount of testing taking place using other types of assessments." | Dismissive | A review of the literature on test use, p.7 | CSE Report No. 144, August 1980 | https://cresst.org/publications/cresst-publication-2531/ | National Institute of Education, US Department of Health and Human Services | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
387 | Charlotte Lazar-Morrison | Linda Polin, Raymond Moy, James Burry | "The literature on curriculum-embedded tests is equally scant." | Dismissive | A review of the literature on test use, p.8 | CSE Report No. 144, August 1980 | https://cresst.org/publications/cresst-publication-2531/ | National Institute of Education, US Department of Health and Human Services | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
388 | Charlotte Lazar-Morrison | Linda Polin, Raymond Moy, James Burry | "The current information focuses on norm- and criterion-referenced tests with some emphasis on minimum competency testing. Since literature on the other evaluative processes is lacking, there is a great need to look at various types of assessments to determine the purposes they serve." | Dismissive | A review of the literature on test use, p.9 | CSE Report No. 144, August 1980 | https://cresst.org/publications/cresst-publication-2531/ | National Institute of Education, US Department of Health and Human Services | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
389 | Charlotte Lazar-Morrison | Linda Polin, Raymond Moy, James Burry | "The kinds of contextual factors which influence testing and the use of test results are just beginning to be appreciated." | Dismissive | A review of the literature on test use, p.9 | CSE Report No. 144, August 1980 | https://cresst.org/publications/cresst-publication-2531/ | National Institute of Education, US Department of Health and Human Services | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
390 | Charlotte Lazar-Morrison | Linda Polin, Raymond Moy, James Burry | "Concern exists about the level of teacher training in testing. ... The literature does not appear to reflect any great follow-up to such suggestions [regarding teacher competence with testing]." | Dismissive | A review of the literature on test use, p.9 | CSE Report No. 144, August 1980 | https://cresst.org/publications/cresst-publication-2531/ | National Institute of Education, US Department of Health and Human Services | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
391 | Charlotte Lazar-Morrison | Linda Polin, Raymond Moy, James Burry | "All of the studies mentioned included information about standardized achievement testing. As of yet, there is no evidence about how teacher attitudes toward other types of tests affect the use of those assessments." | Dismissive | A review of the literature on test use, p.19 | CSE Report No. 144, August 1980 | https://cresst.org/publications/cresst-publication-2531/ | National Institute of Education, US Department of Health and Human Services | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
392 | Charlotte Lazar-Morrison | Linda Polin, Raymond Moy, James Burry | "The effect of the actual testing environment on test use is only beginning to emerge. Evidence suggests that characteristics of the test-takers and the instructional environment need to be explored." | Dismissive | A review of the literature on test use, p.19 | CSE Report No. 144, August 1980 | https://cresst.org/publications/cresst-publication-2531/ | National Institute of Education, US Department of Health and Human Services | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
393 | Charlotte Lazar-Morrison | Linda Polin, Raymond Moy, James Burry | "These factors have been considered in research on teachers' instructional decision-making or in studies of the social or organizational qualities of the classroom. The investigation of these variables as factors affecting teachers' use of tests and test data is minimal." | Dismissive | A review of the literature on test use, p.20 | CSE Report No. 144, August 1980 | https://cresst.org/publications/cresst-publication-2531/ | National Institute of Education, US Department of Health and Human Services | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
394 | Charlotte Lazar-Morrison | Linda Polin, Raymond Moy, James Burry | "In the community, parent involvement, accountability pressures, and news media coverage of test scores are possible influences on the nature and amount of testing, but they have yet to be researched." | Dismissive | A review of the literature on test use, p.20 | CSE Report No. 144, August 1980 | https://cresst.org/publications/cresst-publication-2531/ | National Institute of Education, US Department of Health and Human Services | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
395 | Charlotte Lazar-Morrison | Linda Polin, Raymond Moy, James Burry | "We know very little about the costs of testing." | Dismissive | A review of the literature on test use, p.20 | CSE Report No. 144, August 1980 | https://cresst.org/publications/cresst-publication-2531/ | National Institute of Education, US Department of Health and Human Services | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
396 | Charlotte Lazar-Morrison | Linda Polin, Raymond Moy, James Burry | "Little information is available about these types of costs, and the little information that is available concerns teachers and student attitudes." | Dismissive | A review of the literature on test use, p.22 | CSE Report No. 144, August 1980 | https://cresst.org/publications/cresst-publication-2531/ | National Institute of Education, US Department of Health and Human Services | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
397 | Charlotte Lazar-Morrison | Linda Polin, Raymond Moy, James Burry | "The question of whether test scores affect a student's self-concept has also been raised. ... As indicated previously, information on any of the aforementioned issues is scant." | Dismissive | A review of the literature on test use, p.23 | CSE Report No. 144, August 1980 | https://cresst.org/publications/cresst-publication-2531/ | National Institute of Education, US Department of Health and Human Services | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
398 | Charlotte Lazar-Morrison | Linda Polin, Raymond Moy, James Burry | "Other evidence suggests that tests of many types are being administered and the results are being utilized. To what extent this is occurring is not specifically known." | Dismissive | A review of the literature on test use, pp.23-24 | CSE Report No. 144, August 1980 | https://cresst.org/publications/cresst-publication-2531/ | National Institute of Education, US Department of Health and Human Services | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
399 | Charlotte Lazar-Morrison | Linda Polin, Raymond Moy, James Burry | "There are a number of areas concerning teachers and testing for which there is no information." | Dismissive | A review of the literature on test use, p.24 | CSE Report No. 144, August 1980 | https://cresst.org/publications/cresst-publication-2531/ | National Institute of Education, US Department of Health and Human Services | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
400 | Charlotte Lazar-Morrison | Linda Polin, Raymond Moy, James Burry | "The impact of other testing must also be considered. In-class assessments made by individual teachers have yet to be examined in depth." | Dismissive | A review of the literature on test use, p.24 | CSE Report No. 144, August 1980 | https://cresst.org/publications/cresst-publication-2531/ | National Institute of Education, US Department of Health and Human Services | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
401 | Charlotte Lazar-Morrison | Linda Polin, Raymond Moy, James Burry | "Teachers place greater reliance on, and have more confidence in, the results of their own judgments of students' performance, but little is known about the kinds of activities that give voice to this information." | Dismissive | A review of the literature on test use, p.25 | CSE Report No. 144, August 1980 | https://cresst.org/publications/cresst-publication-2531/ | National Institute of Education, US Department of Health and Human Services | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
402 | Charlotte Lazar-Morrison | Linda Polin, Raymond Moy, James Burry | "The settings and factors which affect the use of tests and their results is yet another uninformed area." | Dismissive | A review of the literature on test use, p.25 | CSE Report No. 144, August 1980 | https://cresst.org/publications/cresst-publication-2531/ | National Institute of Education, US Department of Health and Human Services | Rubbish. Entire books dating back a century were written on the topic, for example: C.C. Ross, Measurement in Today’s Schools, 1942; G.M. Ruch, G.D. Stoddard, Tests and Measurements in High School Instruction, 1927; C.W. Odell, Educational Measurement in High School, 1930. Other testimonies to the abundance of educational testing and empirical research on test use starting in the first half of the twentieth century can be found in Lincoln & Workman 1936, 4, 7; Butts 1947, 605; Monroe 1950, 1461; Holman & Docter 1972, 34; Tyack 1974, 183; and Lohman 1997, 88. | ||
IRONIES: | |||||||||||
Rand Corporation | "All RAND [monographs/occasional papers/etc.] undergo rigorous peer review to ensure that they meet high standards for research quality and objectivity." | ||||||||||
Susannah Faxon-Mills, Laura S. Hamilton, Mollie Rudnick, Brian M. Stecher | "We found considerable research on the effects of testing in U.S. schools, including studies of high-stakes testing, performance assessment, and formative assessment." p. viii | New Assessments, Better Instruction? Designing Assessment Systems to Promote Instructional Improvement | Rand Corporation Research Report, 2013 | "Funding to support the research was provided by the William and Flora Hewlett Foundation." "Marc Chun at the Hewlett Foundation first approached us about reviewing the literature on the impact of assessment, and he was very helpful in framing this investigation." | |||||||
Michael J. Feuer | "To challenge authority is to hold authority accountable. Challenging people in power requires them to show that what they are doing is legitimate; we invite them to rise to the challenge and prove their case; and they, in turn, trust that the system will treat them fairly." | Measuring Accountability When Trust Is Conditional | Education Week, September 24, 2012 | https://www.edweek.org/ew/articles/2012/09/24/05feuer_ep.h32.html?print=1 | |||||||
Michael J. Feuer | "No profession is granted automatic autonomy or an exemption from evaluation." | Measuring Accountability When Trust Is Conditional | Education Week, September 24, 2012 | https://www.edweek.org/ew/articles/2012/09/24/05feuer_ep.h32.html?print=1 | |||||||
Joan L. Herman | Susan H. Fuhrman & Richard F. Elmore, Eds | "Granted, one would expect to see higher growth on KIRIS, which was customized to Kentucky's learning objectives, than to the more general and thereby less curricularly sensitive NAEP measure." | Redesigning Accountability Systems for Education, Chapter 7 | Teachers College Press, 2004 | Institute of Education Sciences, US Education Department | ||||||
Deborah Loewenberg Ball | Jo Boaler, Phil Daro, Andrew Porter, & 14 others | "High-quality work depends on open debate unconstrained by orthodoxies and political agendas. It is crucial that the composition of the panels and the extended research communities be inclusive, engaging individuals with a wide range of views and skills." p.xxiii | Mathematical Proficiency for All Students | Rand Corporation, 2003 | https://www.rand.org/pubs/monograph_reports/MR1643.html | Office of Research and Improvement, US Education Department | |||||
Laura S. Hamilton | Brian M. Stecher, Stephen P. Klein | "Greater knowledge about testing and accountability can lead to better system design and more-effective system management." p.xiv | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | Summary, p.xiv | |||||||
Laura S. Hamilton | Brian M. Stecher | "Incremental improvements to existing systems, based on current research on testing and accountability, should be combined with long-term research and development efforts that may ultimately lead to a major redesign of these systems. Success in this endeavor will require the thoughtful engagement of educators, policymakers, and researchers in discussions and debates about tests and testing policies." | Making Sense of Test-Based Accountability in Education, Rand Corporation, 2002 | Chapter 6, Improving test-based accountability, pp.143-144 | |||||||
Brian M. Stecher | Stephen P. Klein | "Additional information about the impact of performance assessments on curriculum and instruction would provide policymakers with valuable data on the benefits that may accrue from this relatively expensive form of assessment." p.11 | The Cost of Science Performance Assessments in Large-Scale Testing Programs, p.1 | Educational Evaluation and Policy Analysis, Spring 1997, 19(1) | |||||||
Ronald James Dietel | "comparative information from other research organizations would aid decision makers in measuring program quality;" Abstract | Evaluation of the Dissemination Program from an Education Research and Development Center | Doctoral Dissertation, University of California, Los Angeles | ||||||||
Eva L. Baker | Robert L. Linn, Joan L. Herman | "Diverse perspectives are needed to clarify real differences and to find equitable, workable balances." | CRESST: A Continuing Mission to Improve Educational Assessment, p.13 | Evaluation Comment, Summer 1996 | |||||||
Eva L. Baker | Robert L. Linn, Joan L. Herman | "Impartiality, not advocacy, is the key to the credibility of research and development." | CRESST: A Continuing Mission to Improve Educational Assessment, p.13 | Evaluation Comment, Summer 1996 | |||||||
Madaus, G.F. | "too often policy debates emphasize only one side or the other of the testing effects coin" | The effects of important tests on students: Implications for a National Examination System, 1991 | Phi Delta Kappan, 73(3), 226-231. | As quoted in William A. Mehrens, Consequences of Assessment: What is the Evidence?, Education Policy Analysis Archives Volume 6 Number 13 July 14, 1998, https://epaa.asu.edu/ojs/article/view/580/ | |||||||
Author cites (and accepts as fact without checking) someone else's dismissive review | |||||||||||
Cite selves or colleagues in the group, but dismiss or denigrate all other work | |||||||||||
Falsely claim that research has only recently been done on a topic. | |||||||||||
1) [as of July 4, 2021] SCOPE funders include: Bill & Melinda Gates Foundation; California Education Policy Fund; Carnegie Corporation of New York; Center for American Progress; Community Education Fund, Silicon Valley Community Foundation; Ford Foundation; James Irvine Foundation; Joyce Foundation; Justice Matters; Learning Forward; Metlife Foundation; National Center on Education and the Economy; National Education Association; National Public Education Support Fund; Nellie Mae Education Foundation; NoVo Foundation; Rose Foundation; S. D. Bechtel, Jr. Foundation; San Francisco Foundation; Sandler Foundation; Silver Giving Foundation; Spencer Foundation; Stanford University; Stuart Foundation; The Wallace Foundation; William and Flora Hewlett Foundation; William T. Grant Foundation |