NAEP, or the National Assessment of Educational Progress, is often called the "Nation's Report Card." It is the only measure of student achievement in the United States where you can compare the performance of students in your state with the performance of students across the nation or in other states. NAEP, sponsored by the U.S. Department of Education, has been conducted for over 30 years. The results are widely reported by the national and local media.
Federal law dictates complete privacy for all test takers and their families. Under the National Assessment of Educational Progress Authorization Act (Public Law 107-279 III, section 303), the Commissioner of the National Center for Education Statistics (NCES) is charged with ensuring that NAEP tests do not question test-takers about personal or family beliefs or make information about their personal identity publicly available.
After publishing NAEP reports, NCES makes data available to researchers but withholds students' names and other identifying information. Students' names are not allowed to leave the schools after NAEP assessments are administered. Because it might be possible to deduce the identities of some NAEP schools from the data, researchers must promise, under penalty of fines and jail terms, to keep these identities confidential.
No. By design, information is not available at these levels. Reports traditionally present state, regional, and national results. In 2002, NAEP began to report (on a trial basis) results from several large urban districts (Trial Urban District Assessments), after the release of state and national results. Because NAEP is a large-group assessment, each student takes only a small part of the overall assessment. In most schools, only a small portion of the total grade enrollment is selected to take the assessment, and these students may not reliably or validly represent the total school population. Only when the student scores are aggregated at the state or national level are the data considered reliable and valid estimates of what students know and can do in the content area; consequently, school- or student-level results are never reported.
In recent years there has been considerable interest among education policymakers and researchers in linking NAEP results to other assessment data. Much of this interest has centered on linking NAEP to international assessments. The 1992 NAEP mathematics assessment results were successfully linked to those from the International Assessment of Educational Progress (IAEP) of 1991, and the 1996 grade 8 mathematics and science results for NAEP have been linked to the 1995 Third International Mathematics and Science Study (TIMSS); those results are reported separately. The feasibility of linking the 2000 NAEP to the 1999 TIMSS-R has also been studied. Various methods for linking NAEP scores to state assessment results have been examined and continue to be explored.
The National Center for Education Statistics (NCES) grants members of the educational research community permission to use NAEP data.
NAEP results are provided in formats that the general public can easily access. Tailored to specific audiences, NAEP reports are widely disseminated. Since the 1994 assessment, all reports and data have been placed on the World Wide Web to provide even easier access. In addition, NCES periodically offers seminars to stimulate interest in using NAEP data to address educational research questions, deepen participants' understanding of the methodological and technological issues relevant to NAEP, and demonstrate the steps necessary for conducting accurate statistical analyses of NAEP data. These seminars are advertised in advance on the NCES Web site. Research using NAEP data is supported by grants from several sources.
Through the NAEP State Service Center, NCES provides technical assistance and resources to NAEP coordinators.
The NAEP program has always endeavored to assess all students selected as part of its sampling process. In all NAEP schools, accommodations are provided as necessary for students with disabilities (SD) and/or English language learners (ELL).
Inclusion in NAEP of an SD or ELL student is encouraged if that student (a) participated in the regular state academic assessment in the subject being tested, and (b) can participate in NAEP with the accommodations NAEP allows. Even if the student did not participate in the regular state assessment, or needs accommodations NAEP does not allow, school staff are asked whether the student could participate in NAEP with the permitted accommodations. (Examples of accommodations not allowed in NAEP are administering the reading assessment in a language other than English and reading the reading passages aloud to the student. Extending testing over several days is also not allowed, because NAEP administrators are in each school for only one day.)
NAEP has developed a number of different publications and web-based tools that provide direct access to assessment results at the state and national levels. For every major assessment release, content is developed specifically for the Web environment.
- The Nation's Report Card is a Web site developed especially to display the results of each assessment in a clear and comprehensive format. To locate this useful information, there are links to the most recent results from any subject information page on this Web site. See, for instance, the link on the mathematics subject page.
- State Profiles present state-level results and a history of state participation in NAEP. The NAEP Data Explorer and State Comparisons provide comprehensive information on student performance.
- Explore NAEP Questions links users to the Questions Tool and Item Maps that provide student responses, scoring guides, and other information on the questions that have been released to the public.
Several types of printed reports published by NAEP can be found under publications on the NAEP Web site. These range from the NAEP Report Card, a comprehensive report that contains all the major results for each assessment, to technical reports that contain psychometric details of a national or state assessment.
NAEP materials such as frameworks, released questions, and reports have many uses in the educational community. For instance, frameworks can serve as models for designing an assessment or revising curricula. Also, released constructed-response questions and their corresponding scoring guides can serve as models of innovative assessment practices.
NAEP findings are reported in many publications specifically targeted to educators. Furthermore, NAEP staff host seminars to discuss NAEP results and their implications.
Before the data are analyzed, responses from the groups of students assessed are assigned sampling weights to ensure that their representation in NAEP results matches their actual percentage of the school population in the grades assessed.
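As a simplified illustration of the weighting idea described above (the group names, shares, and scores below are invented for illustration and are not actual NAEP weights or procedures), a sampling weight can be computed as the ratio of a group's share of the population to its share of the sample, so that weighted estimates reflect the population composition rather than the raw sample:

```python
# Hypothetical illustration of sampling weights. All names and numbers
# here are invented; actual NAEP weighting is far more elaborate.

# Suppose group_a is 40% of the grade enrollment but only 25% of the sample.
population_share = {"group_a": 0.40, "group_b": 0.60}
sample_share = {"group_a": 0.25, "group_b": 0.75}

# Weight each group so its weighted share matches its population share.
weights = {g: population_share[g] / sample_share[g] for g in population_share}

# A weighted mean then reflects the population, not the raw sample.
sample_means = {"group_a": 210.0, "group_b": 230.0}
weighted_mean = sum(
    sample_share[g] * weights[g] * sample_means[g] for g in sample_share
)
print(weights)        # group_a is weighted up (1.6), group_b down (0.8)
print(weighted_mean)  # 0.40 * 210 + 0.60 * 230 = 222.0
```

Without the weights, the raw sample mean (0.25 × 210 + 0.75 × 230 = 225.0) would overstate the better-represented group's contribution.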
- Data for national and state NAEP assessments in most subjects are analyzed by a process involving the following steps:
- Check Item Data and Performance: The data and performance of each item are checked in a number of ways, including scoring reliability checks, item analyses, and differential item functioning (DIF), to assure fair and reliable measures of performance in the subject of the assessment.
- Set the Scale for Assessment Data: Each subject assessed is divided into subskills, purposes, or content domains specified by the subject framework. Separate scales are developed relating to the content domains in an assessment subject area. A special statistical procedure, Item Response Theory scaling, is used to estimate the measurement characteristics of each assessment question.
- Estimate Group Performance Results: Because NAEP must minimize the burden of time on students and schools by keeping assessment administration brief, no individual student takes more than a small portion of the assessment for a given content domain. NAEP uses the results of scaling procedures to estimate the performance of groups of students (e.g., of all fourth-grade students in the nation, of female eighth-grade students in a state).
- Transform Results to the Reporting Scale: Results for assessments conducted in different years are linked to reporting scales to allow comparison of year-to-year trend results for common populations on related assessments.
- Create a Database: A database is created and used to make comparisons of all results, such as scale scores, percentiles, percentages at or above achievement levels, and comparisons between groups and between years for a group. All comparisons are subjected to testing for statistical significance, and estimates of standard errors are computed for all statistics.
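The Item Response Theory scaling mentioned in the steps above can be illustrated with the two-parameter logistic (2PL) model, one common IRT form; the item parameters below are invented for illustration, and NAEP's actual scaling models are more elaborate:

```python
import math

def irt_2pl(theta: float, a: float, b: float) -> float:
    """Probability of a correct response under the 2PL IRT model.

    theta: student proficiency; a: item discrimination; b: item difficulty.
    (Parameters here are hypothetical, for illustration only.)
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Invented (discrimination, difficulty) pairs for three illustrative items.
items = [(1.2, -0.5), (0.8, 0.0), (1.5, 1.0)]

# When proficiency equals the item's difficulty, the model gives a
# 50% chance of a correct response.
for a, b in items:
    assert abs(irt_2pl(b, a, b) - 0.5) < 1e-12

# Higher proficiency yields a higher probability on every item.
for a, b in items:
    assert irt_2pl(1.5, a, b) > irt_2pl(-1.5, a, b)
```

Estimating these item parameters from response data is what allows NAEP to place students who answered different subsets of questions on a common scale.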
To ensure reliability of NAEP results, extensive quality control and plausibility checks are carefully conducted as part of each analysis step. Quality control tasks are intended to verify that analysis steps have not introduced errors or artifacts into the results. Plausibility checks are intended to encourage thinking about the results, whether they make sense, and what story they tell.
NAEP data are collected using a closely monitored and standardized process. The tight controls that guide the data collection process help ensure the comparability of the results generated for the national and the state assessments. All NAEP sessions use the same assessment booklets and identical administration procedures, and contractor staff members direct all sessions during a single calendar assessment period.
The national sample is a combined sample of students assessed in each participating state, plus an additional sample from the states that did not participate in the state assessment, ensuring that the national sample is representative of the total national student population. The full data set is analyzed together, allowing all data to contribute to the final results and setting a single scale for the assessment. All results are then reported in the scale score metric used for the specific assessment. In years with both national and state assessments in the same subjects, the national sample comprises the combined sample of public school students assessed in each participating state, the additional sample from nonparticipating states, and a national nonpublic school sample.
While multiple-choice questions allow students to select an answer from a list of options, constructed-response questions require students to provide their own answers. Qualified and trained raters score constructed-response questions.
Scoring a large number of constructed responses with a high level of reliability and within a limited time frame is essential to NAEP's success. (In a typical year, over three million constructed responses are scored.) To ensure reliable, quick scoring, NAEP takes the following steps:
- develops focused, explicit scoring guides that match the criteria delineated in the assessment frameworks;
- recruits qualified and experienced scorers, trains them, and verifies their ability to score particular questions through qualifying tests;
- employs an image-processing and scoring system that routes images of student responses directly to the scorers so they can focus on scoring rather than paper routing;
- monitors scorer consistency through ongoing reliability checks;
- assesses the quality of scorer decision-making through frequent monitoring by NAEP assessment experts; and
- documents all training, scoring, and quality control procedures in the technical reports.
NAEP assessments generally contain both constructed-response and multiple-choice questions. The constructed responses are scored using the image-processing system, whereas the responses to the multiple-choice questions are scored by scanning the test booklets.
The number of students selected to be in a NAEP sample depends on whether it is a national-only sample or a combined state and national sample. In the national-only sample, there are approximately 10,000 to 20,000 students. In a combined national and state sample, there are approximately 3,000 students per participating jurisdiction from approximately 100 schools. Typically, 45 to 55 jurisdictions participate in such an assessment.
Data for the national and state NAEP are collected at the same time during the winter. Data for the national long-term trend assessments are collected in the fall for 13-year-olds, in the winter for 9-year-olds, and in the spring for 17-year-olds. Other NAEP special studies can occur at different points throughout the school year.
Federal law specifies that NAEP is voluntary for every student, school, school district, and state. However, federal law also requires all states that receive Title I funds to participate in NAEP reading and mathematics assessments at fourth and eighth grades. Similarly, school districts that receive Title I funds and are selected for the NAEP sample are also required to participate in NAEP reading and mathematics assessments at fourth and eighth grades. All other NAEP assessments are voluntary.
The NAEP science framework says that "Innovative assessments in the United States and other countries use three major item types: performance exercises, open-ended paper-and-pencil exercises, and multiple-choice items probing understanding of conceptual and reasoning skills. In performance exercises, students actually manipulate selected physical objects and try to solve a scientific problem about the objects. An extra period of time (20 or 30 minutes) may be necessary for students who have been assigned to perform complex tasks." Read more about the importance of performance tasks in the science framework.
Some of the students in the sample perform hands-on experiments: one-half of the students in each participating school receive one of three hands-on tasks and related questions. These performance tasks require students to conduct actual experiments using materials provided to them and to record their observations and conclusions in their test booklets by responding to both multiple-choice and constructed-response questions. For example, students at grade 12 might be given a bag containing three different metals, sand, and salt and asked to separate them using a magnet, sieve, filter paper, funnel, spoon, and water, and to document the steps they used to do so.
NAEP has two major goals: to compare student achievement in states and other jurisdictions and to track changes in achievement of fourth-, eighth-, and twelfth-graders over time in mathematics, reading, writing, science, and other content domains. To meet these dual goals, NAEP selects nationally representative samples of students who participate in either the main NAEP assessments or the long-term trend NAEP assessments.
The NAEP sample in each state is designed to be representative of the students in that state. At the state level, results are currently reported for public school students only and are broken down by several demographic groupings of students. When NAEP is conducted at the state level (i.e., in mathematics, reading, science, and writing), results are also reported for the nation. The national NAEP sample is then composed of all the state samples of public school students, as well as a national sample of nonpublic school students. In non-participating states, a certain number of schools and students are selected to complete the national-level sample.
For assessments conducted at the national level only, samples are designed to be representative of the nation as a whole. Data are reported for public and nonpublic school students as well as for several major demographic groups of students.
Subject-matter achievement is reported in two ways, scale scores and achievement levels, so that student performance can be more easily understood. NAEP scale score results provide a numeric summary of what students know and can do in a particular subject and are presented for groups and subgroups. Achievement levels categorize student achievement as Basic, Proficient, and Advanced, using ranges of performance established for each grade. (A fourth category, below Basic, is also reported for this scale.) Achievement levels are used to report results in terms of a set of standards for what students should know and be able to do.
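A minimal sketch of how a scale score maps to an achievement level via range cut points; the cut scores below are illustrative placeholders, since actual NAEP cut scores differ by subject and grade:

```python
# Illustrative cut scores only; actual NAEP cut scores differ by
# subject and grade.
CUT_SCORES = [("Advanced", 282), ("Proficient", 249), ("Basic", 214)]

def achievement_level(scale_score: float) -> str:
    """Map a scale score to the highest achievement level it reaches."""
    for level, cut in CUT_SCORES:
        if scale_score >= cut:
            return level
    return "below Basic"

print(achievement_level(260))  # "Proficient"
print(achievement_level(200))  # "below Basic"
```

Because the cut points partition the whole scale, every score falls into exactly one of the four reporting categories.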
NAEP provides results about subject-matter achievement, instructional experiences, and school environment and reports these results for populations of students (e.g., fourth-graders) and groups within those populations (e.g., male students or Hispanic students). NAEP does not provide individual scores for the students or schools assessed.
Because NAEP scales are developed independently for each subject, scale score and achievement level results cannot be compared across subjects. However, these reporting metrics greatly facilitate performance comparisons within a subject from year to year and from one group of students to another in the same grade.
Since its inception in 1969, NAEP has assessed numerous academic subjects, including mathematics, reading, science, writing, the arts, civics, economics, geography, and U.S. history. NAEP assessments in foreign language and world history are under development.
Beginning with the 2003 assessments, NAEP national and state assessments are conducted in mathematics and reading at least once every two years at grades 4 and 8. These assessments are conducted in the same year and initial results are released six months after administration, in the fall of that year. Results from all other assessments are released about one year after administration, usually in the spring of the following year. Many NAEP assessments are conducted at the national level for grade 12, as well as at grades 4 and 8.
Since 1988, the National Assessment Governing Board has selected the subjects assessed by NAEP. Furthermore, the Governing Board oversees creation of the frameworks that underlie the assessments and the specifications that guide the development of the assessment instruments. The framework for each subject area is determined through a collaborative development process that involves teachers, curriculum specialists, subject-matter specialists, school administrators, parents, and members of the general public.
The national results are based on a representative sample of students in public schools, private schools, Bureau of Indian Education schools, and Department of Defense schools. Private schools include Catholic, Conservative Christian, Lutheran, and other private schools. The state results are based on public school students only. The main NAEP assessment is usually administered at grades 4 and 8 (at the state level), plus grade 12 at the national level. The long-term trend assessments report national results (in mathematics and reading only) for age samples 9, 13, and 17 in public and nonpublic schools.
Because NAEP findings have an impact on the public's understanding of student academic achievement, precautions are taken to ensure the reliability of these findings. In its current legislation, as in previous legislative mandates, Congress has called for an ongoing evaluation of the assessment as a whole. In response to these legislative mandates, the National Center for Education Statistics (NCES) has established various panels of technical experts to study NAEP, and panels are formed periodically by NCES or external organizations, such as the National Academy of Sciences, to conduct evaluations. The Buros Center for Testing, in collaboration with the University of Massachusetts/Center for Educational Assessment and the University of Georgia, recently conducted an external evaluation of NAEP.
To meet the nation's growing need for information about what students know and can do, the NAEP assessment instruments must meet the highest standards of measurement reliability and validity. They must measure change over time and must reflect changes in curricula and instruction in diverse subject areas.
Developing the assessment instruments from writing questions to analyzing pilot test results to constructing the final instruments is a complex process that consumes most of the time during the interval between assessments. In addition to conducting national pilot tests, developers oversee numerous reviews of the assessment instruments by internal NAEP measurement experts, by the National Assessment Governing Board, and by external groups that include representatives from each of the states and jurisdictions that participate in the NAEP program.
In addition to assessing subject area knowledge and abilities, NAEP collects information from participating students, teachers, and principals about contextual or background variables that are related to student achievement. When developing the questionnaires used to gather this information, NAEP ensures that the questions do not infringe on respondents' privacy, that they are grounded in current educational research, and that the answers can provide information relevant to the discussions about educational reform.
Some questionnaires, such as the student context questionnaires, appear as separately timed blocks of questions in the assessment booklets; others, such as the questionnaires for teachers, schools, and students with disabilities or who are classified as English language learners, are printed separately. Four general sources provide context for NAEP results as follows:
- student questionnaires, which examine background characteristics and subject area instructional experience;
- teacher questionnaires, which gather data on teacher training and classroom instruction;
- school questionnaires, which gather information about school characteristics and policies; and
- SD/ELL questionnaires, which provide information about students within the sample who have disabilities or are English language learners (previously called SD/LEP questionnaires, for students with disabilities or who were classified as limited English proficient).
These questionnaires were developed using a framework and process similar to those used for the cognitive questions. This process included reviews by external advisory groups, pilot testing, and reviews by NCES, the Governing Board, and the Office of Management and Budget (OMB). For the main and state NAEP, the student questions appear in non-cognitive blocks. Both the background characteristic questions and the subject area experience questions vary somewhat by grade level within a subject. Unlike the cognitive blocks, these non-cognitive blocks do not differ among the assessment booklets for a given grade and subject. The teacher questionnaires vary based on subject area and may differ by grade level. The school questionnaires are completed by a school official for each grade of students participating in the assessment.
Beginning with the 2003 assessment, results in mathematics and reading are released six months after the administration of the assessments. Results from all other assessments will be released one year after administration.