Student Score Reporting

Smarter Balanced for ELA and Mathematics

Final student scale scores represent estimates of student ability. For Smarter Balanced Summative assessments, once the responses from the PT and CAT portions are merged for final scoring, the resulting ability estimates are based on the responses to the specific test items that a student answered, not the total number of items answered correctly. Higher ability estimates are associated with students who correctly answer more difficult and more discriminating items; lower ability estimates are associated with students who correctly answer easier and less discriminating items. Two students can arrive at the same scale score by very different paths. This type of scoring is called “item pattern scoring.”
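To make the idea concrete, the sketch below shows one common way pattern scoring can work under a two-parameter logistic (2PL) IRT model, in which each item has a discrimination and a difficulty parameter. The item parameters, grid search, and values are illustrative assumptions only, not the operational Smarter Balanced scoring engine.

```python
import math

# Illustrative 2PL item parameters (discrimination a, difficulty b);
# these are made-up values, not actual Smarter Balanced item parameters.
items = [
    {"a": 0.8, "b": -1.0},
    {"a": 1.2, "b": 0.0},
    {"a": 1.6, "b": 0.7},
    {"a": 1.4, "b": 1.5},
]

def p_correct(theta, item):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-item["a"] * (theta - item["b"])))

def log_likelihood(theta, responses):
    """Log-likelihood of a full response pattern (1 = correct, 0 = incorrect)."""
    ll = 0.0
    for item, r in zip(items, responses):
        p = p_correct(theta, item)
        ll += math.log(p) if r == 1 else math.log(1.0 - p)
    return ll

def estimate_ability(responses):
    """Crude grid-search maximum-likelihood ability estimate."""
    grid = [x / 100.0 for x in range(-400, 401)]  # theta from -4.00 to 4.00
    return max(grid, key=lambda t: log_likelihood(t, responses))

# Two students with the same number correct (2) but different response
# patterns receive different ability estimates under pattern scoring.
print(estimate_ability([1, 1, 0, 0]))  # easier, less discriminating items correct
print(estimate_ability([0, 0, 1, 1]))  # harder, more discriminating items correct
```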

Reporting Achievement

CAASPP Smarter Balanced assessments in ELA and mathematics were scaled vertically, which means that scores at adjacent grade levels were linked through a set of common items. This makes it possible to monitor students’ year-to-year progress toward mastery of the CCSS and to describe student growth over time across grade levels.

Overall Achievement Levels

Overall achievement levels are categorical labels given to particular scale score ranges. The minimum and maximum scale scores for each achievement level vary by grade level and content area; these are presented in Appendix A: Scale Score Ranges. Achievement levels were set during a process called standard setting, which established the association between scores and their category of achievement.

Student test results are reported in the following overall achievement levels:

  • Level 4—Standard Exceeded
  • Level 3—Standard Met
  • Level 2—Standard Nearly Met
  • Level 1—Standard Not Met

The establishment of achievement levels through the standard setting process ensures alignment with the CCSS. Information on the process can be found on the Reporting Scores web page of the Smarter Balanced Assessment Consortium website.
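Reporting an overall achievement level is then a range lookup against the scale score cuts. The sketch below uses made-up cut scores purely for illustration; the actual minimum and maximum scale scores for each grade level and content area are listed in Appendix A: Scale Score Ranges.

```python
# Hypothetical cut scores (minimum scale score for each level) for a single
# grade level and content area; the real values are in Appendix A.
LEVEL_CUTS = [
    (2580, "Level 4—Standard Exceeded"),
    (2500, "Level 3—Standard Met"),
    (2410, "Level 2—Standard Nearly Met"),
]

def achievement_level(scale_score):
    """Return the overall achievement level label for a scale score."""
    for minimum, label in LEVEL_CUTS:
        if scale_score >= minimum:
            return label
    return "Level 1—Standard Not Met"

print(achievement_level(2550))  # Level 3—Standard Met with these example cuts
```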

Writing Extended Response

WER scores for ELA performance tasks provide additional information about a student’s writing performance. These scores are available in CERS as well as in the LEA Student Score Data File, which can be downloaded from TOMS. Note, however, that for a percentage of randomly assigned students whose responses are included in a Smarter Balanced field test, WER results will appear in CERS at a later date because those responses are sent to Smarter Balanced for scoring.

The WER scores in the LEA Student Score Data File provide information on how a student scored on the three writing traits—organization/purpose, evidence/elaboration, and conventions—for an essay. In addition, CERS and the LEA Student Score Data File provide explanations for a 0 (zero) score on the ELA WER items, when applicable, such as that the response was off topic, off purpose, or insufficient.

WER condition codes are presented in table 1 and defined in the Condition Codes for the ELA Writing Extended Response web document.

Table 1. WER Scoring Condition Codes

B (Blank): No response.

I (Insufficient): Use the “I” code when a student has not provided a meaningful response; examples can include

  • random keystrokes,
  • undecipherable text,
  • “I hate this test,”
  • “I like pizza!” (in response to a reading passage about helicopters), or
  • a response consisting entirely of profanity.

For ELA WER items, use the “I” code for the responses previously described and also if

  • the student’s original work is insufficient to determine whether the student is able to organize, cite evidence and elaborate, and use conventions as defined in the rubrics; or
  • the response is too brief to make a determination regarding whether it is on purpose or on topic.

L (Nonscorable Language): A language other than English was used.

T (Off-Topic; ELA WER items only):

  • The response is unrelated to the task or sources or shows no evidence that the student has read the task or the sources (especially for informational or explanatory and opinion or argumentative writing).
  • Off-topic responses are generally substantial responses.

M (Off-Purpose; ELA WER items only): The student has clearly not written to the purpose designated in the task:

  • An off-purpose response addresses the topic of the task but not the purpose of the task.
  • Students may use some narrative techniques in an explanatory essay or use some argumentative or persuasive techniques to explain, for example, and still be on purpose.
  • Off-purpose responses are generally developed responses (essays, poems, etc.) clearly not written to the designated purpose.

If a response receives the “off-purpose” code, the conventions trait is scored, but the evidence/elaboration and organization/purpose traits are not. The student therefore still has the opportunity to receive some credit for the response, which affects the student’s overall ELA score.

Because item difficulty differs, WER raw scores should not be compared across students, grade levels, or test administration years.

Claims and Assessment Targets for Smarter Balanced Assessments

The Smarter Balanced content areas of ELA and mathematics are broken down into claims and assessment targets.

Some claims are broken down into content categories, which contain a varying number of assessment targets. An assessment target defines the grade level–specific knowledge, skill, or ability that students should know or be able to demonstrate within the domain. For example, the overall claim “Reading” has a content category called “Literary” that contains an assessment target called “Reasoning and Evaluation.”

Claims and their assessment targets are found on the Smarter Balanced Content Explorer website. Please note that not all assessment targets are tested for all students given the adaptive nature of the CAT portion of the test.

Area (Claim) Performance Levels

Assessment claims are evidence-based statements about what students know and can do as demonstrated by their achievement on the summative assessments. They are defined in the item specifications for ELA and mathematics available on the Smarter Balanced Assessment Consortium Development and Design web page.

Claim performance levels are based on a smaller collection of items than the overall achievement levels. Moreover, as a result of the adjusted-form test blueprint used for Smarter Balanced Online Summative Assessments for ELA and mathematics, the number of items for each claim is smaller than on the previous full-form blueprint, which increases the amount of classification error and makes it difficult to provide reliable information about a student’s claim achievement level. Therefore, individual claim performance levels are not reported for the Smarter Balanced Online Summative Assessments for ELA and mathematics (although aggregated claim performance levels are reported for student groups of 30 or more on the Test Results for California’s Assessments website).

There are four claims (but three reporting categories) per mathematics assessment and four claims per ELA assessment, each with a varying number of content categories (subcategories that may apply to some specific claims) and assessment targets. Performance on claims is reported as one of three levels:

  • Above Standard
  • Near Standard
  • Below Standard

Performance levels for claims provide supplemental information regarding a student’s strengths and weaknesses. Because each claim is measured by relatively few items, only three performance levels were developed, and levels rather than scores are reported; with so few items, the levels provide a more accurate characterization of a student’s performance than scores would.

A student’s ability, along with the corresponding standard error, is estimated for each claim. Performance levels for claims are based on the distance of a student’s performance on the claim from the Level 3 Standard Met achievement level. An interval estimate corresponding to the student’s true performance on the claim is constructed; the interval extends from 1.5 times the standard error below the student’s ability estimate to 1.5 times the standard error above it. If the interval contains the Level 3 Standard Met criterion value for a particular claim, the student’s results are near the standard for that claim. If the entire interval is above the Level 3 Standard Met criterion, the student’s results are above the standard; if the entire interval is below it, the student’s results are below the standard.
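A minimal sketch of that interval check, using placeholder numbers for the ability estimate, standard error, and Level 3 criterion value:

```python
def claim_performance_level(theta, standard_error, level3_criterion):
    """Compare an interval of +/- 1.5 standard errors around the claim ability
    estimate with the Level 3 Standard Met criterion value for the claim."""
    lower = theta - 1.5 * standard_error
    upper = theta + 1.5 * standard_error
    if lower > level3_criterion:
        return "Above Standard"   # entire interval above the criterion
    if upper < level3_criterion:
        return "Below Standard"   # entire interval below the criterion
    return "Near Standard"        # criterion falls inside the interval

# Placeholder values for illustration only.
print(claim_performance_level(theta=2480.0, standard_error=30.0,
                              level3_criterion=2500.0))  # Near Standard
```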

No standard setting occurred for claims.

Assessment Targets

Assessment targets describe what is to be assessed within a claim and are used to develop test items (questions). Assessment target reports are available in CERS and show target scores for groups of students; these are reported as Performance Relative to the Entire Test and Performance Relative to Level 3 (Met Standard). Target reports are not available for individual students.

Assessment targets provide information regarding a group’s strengths and weaknesses relative to the group’s achievement on the assessment as a whole and relative to the performance that indicates Standard Met. For non-WER targets, only targets with 10 or more items in the pool are included in reporting. To receive a score, a student must answer at least 10 CAT items and 1 PT. Students who log on to both the CAT and the PT but do not meet this scoring threshold receive the LOSS (lowest obtainable scale score). Scores are sent to CERS, which displays target results only for groups of 30 or more students.
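Stated as simple threshold checks (an illustrative restatement of the rules above, not actual CERS logic):

```python
def non_wer_target_is_reportable(items_in_pool, group_size):
    """A non-WER target needs at least 10 items in the pool, and CERS
    displays target results only for groups of 30 or more students."""
    return items_in_pool >= 10 and group_size >= 30

def student_receives_score(cat_items_answered, performance_tasks_answered):
    """A student must answer at least 10 CAT items and 1 PT to receive a
    score; otherwise a student who logged on to both parts receives the LOSS."""
    return cat_items_answered >= 10 and performance_tasks_answered >= 1
```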

While the claims do not vary among grade levels, assessment targets for ELA Claims 1–4 and mathematics Claim 1 are unique to each grade level. Note that assessment targets are reported for mathematics Claim 1 only, because, according to the Assessment Target Reports Frequently Asked Questions, “for claims 2, 3, and 4, items are intended to emphasize the mathematical practices, and therefore, items may align to the content included in several mathematics assessment targets. The best common descriptors of the items included in these claims are the claim labels themselves.”

CAST

The CAST process converts each possible raw score to an ability estimate and then equates the score to the number-right score on a base test form so that scores from different forms of the CAST are comparable. The number-right scores are then transformed to scale scores to facilitate score interpretation. If two students take the same form of the CAST, the student who provides more correct responses receives the higher scale score.
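A minimal sketch of this kind of conversion, with the ability estimation and equating steps collapsed into a single hypothetical raw-to-scale lookup table; the values are placeholders, not actual CAST conversion tables:

```python
# Hypothetical raw-score-to-scale-score conversion table for one form; real
# tables come from the ability estimation and equating steps described above.
RAW_TO_SCALE = {
    0: 100,   # placeholder lowest obtainable scale score for this form
    1: 118,
    2: 127,
    3: 134,
    # ...one entry for every possible raw (number-right) score on the form
}

def scale_score(raw_score):
    """Look up the scale score for a number-right raw score on this form."""
    return RAW_TO_SCALE[raw_score]

# On the same form, more correct responses always yield a higher scale score.
assert scale_score(3) > scale_score(2)
```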

Reporting Achievement

Overall Achievement Levels

Overall achievement levels are categorical labels given to particular scale score ranges. The minimum and maximum scale scores for each achievement level vary by grade level; these are presented in Appendix A: Scale Score Ranges. Achievement levels were set during a process called standard setting, which established the association between scores and their category of achievement.

Student test results are reported in the following overall achievement levels:

  • Level 4—Standard Exceeded
  • Level 3—Standard Met
  • Level 2—Standard Nearly Met
  • Level 1—Standard Not Met

Achievement level setting ensures that the achievement levels align to the CA NGSS. Information about achievement level descriptors and scale score ranges can be found in the “Scores and Results Reporting” section of the CDE California Science Test web page.

Domain (Area) Performance Levels

In addition to achievement levels for the total test, domain performance levels are reported for the Earth and Space Sciences, Life Sciences, and Physical Sciences domains for students who answered enough items in each domain. A student might receive a performance level for some domains but not others, depending on the number of items the student completed in each domain. Science domain performance levels are not reported for students who answered fewer than 10 items on the total test.

Domain performance levels are based on a smaller collection of items. This makes it more difficult to provide information about a student’s domain performance level without increasing the amount of classification error. A larger classification error increases the chance that a student could be misclassified as belonging to one performance level when the student actually belongs to another. For this reason, there are only three domain performance levels. While the actual domain scores are not reported, the domain performance level indicates that the score for a domain is one of the following:

  • If the scale score of a domain is above the interval that was estimated using the scale score of the “Standard Met” achievement level on the total test and the standard error of the domain scale score, the performance level for the domain is “Above Standard.”
  • If the scale score of a domain is within the interval that was estimated using the scale score of the “Standard Met” achievement level on the total test and the standard error of the domain score, the performance level for the domain is “Near Standard.”
  • If the scale score of a domain is below the interval that was estimated using the scale score of the “Standard Met” achievement level on the total test and the standard error of the domain scale score, the performance level for the domain is “Below Standard.”

A student’s ability, along with the corresponding standard error, is estimated for each domain. Domain performance levels are based on the distance of a student’s performance on the domain from the Level 3 “Standard Met” achievement level criterion. Using the standard error, an interval estimate corresponding to the student’s true performance on the domain is constructed; the interval extends from 1.5 times the standard error below the student’s ability estimate to 1.5 times the standard error above it. If the interval contains the Level 3 Standard Met criterion value for a domain, the student’s results are near the standard for that domain. If the entire interval is above the criterion, the student’s results are above the standard; if the entire interval is below it, the student’s results are below the standard.

CAAs for ELA, Mathematics, and Science

Reporting Achievement

For the CAAs for ELA and mathematics, scale scores reflect estimates of student ability that are based on which items a student correctly answers in a multistage adaptive test setting. A two-stage testing approach adapts the difficulty of a test to each student’s ability in order to achieve a more precise measurement. The first stage consists of a routing test that provides an initial student ability estimate. The second stage consists of a test that varies in difficulty depending on that initial ability estimate. A student whose initial ability estimate is high will respond to a second-stage module consisting of difficult items that will help to determine just how high the student’s ability is. A student whose initial ability estimate is low will respond to a second-stage module consisting of less difficult items. In certain cases where a student does not answer enough items correctly, the student’s test will be stopped at the end of Stage 1, in accordance with the DFAs.
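A minimal sketch of two-stage routing logic of this kind; the routing cutoff, stop rule, and module labels are illustrative assumptions, not the operational CAA rules defined in the DFAs:

```python
def select_stage2_module(stage1_correct, stage1_total,
                         routing_cutoff=0.5, stop_minimum=2):
    """Illustrative two-stage routing: Stage 1 performance determines whether
    the student stops (per the DFAs) or which Stage 2 module is delivered.
    The cutoffs and labels are placeholders, not operational CAA rules."""
    if stage1_correct < stop_minimum:
        return "test ends after Stage 1"
    if stage1_correct / stage1_total >= routing_cutoff:
        return "Stage 2: higher-difficulty module"
    return "Stage 2: lower-difficulty module"

print(select_stage2_module(stage1_correct=6, stage1_total=8))  # higher-difficulty module
print(select_stage2_module(stage1_correct=1, stage1_total=8))  # test ends after Stage 1
```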

For the CAA for Science, once the responses to each embedded PT are merged for final scoring, the CAA for Science process first converts each possible raw score to an ability estimate so that scores from different forms of the CAA for Science are comparable. The ability estimates are then transformed to scale scores to facilitate score interpretation. If two students take the same form of the CAA for Science, the student who provides more correct responses receives the higher scale score.

Overall Achievement Levels

CAA overall achievement levels are categorical labels given to particular scale score ranges. The minimum and maximum scale scores for each achievement level vary by grade level and content area; these are presented in Appendix A: Scale Score Ranges. Achievement levels were set during a process called standard setting, which established the association between scores and their category of achievement.

Student test results for the CAAs for ELA, mathematics, and science are reported in the following overall achievement levels:

  • Understanding (Level 3)
  • Foundational Understanding (Level 2)
  • Limited Understanding (Level 1)

Regardless of the grade level—which is indicated by the first digit of the scale score—the minimum and maximum scale scores for each achievement level are the same within each content area. Standard setting also ensures that the performance levels align to the CCSS and CA NGSS Connectors achievement level descriptors.

CSA

Reporting Achievement

CSA reporting levels are categorical labels given to particular scale score ranges. The minimum and maximum scale scores for each reporting level vary by grade level and content area; these are presented in Appendix A: Scale Score Ranges. Reporting levels were set during a process called standard setting, which established the association between scores and their reporting category.

Overall Reporting Levels

Student test results for the CSA are reported in the following overall reporting levels, where the first digit of the scale score, shown here as “x,” indicates the student’s grade level (the high school grade band is indicated with “9”):

  • Score Reporting Range 3 (x60–x99)
  • Score Reporting Range 2 (x46–x59)
  • Score Reporting Range 1 (x00–x45)
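A minimal sketch of mapping a CSA scale score to its reporting range under this layout, where the first digit is the grade-level digit and the last two digits determine the range:

```python
def csa_reporting_range(scale_score):
    """Map a CSA scale score of the form x## to its overall reporting range;
    x is the grade-level digit (9 for the high school grade band)."""
    last_two = scale_score % 100   # drop the leading grade-level digit
    if last_two >= 60:
        return "Score Reporting Range 3"
    if last_two >= 46:
        return "Score Reporting Range 2"
    return "Score Reporting Range 1"

print(csa_reporting_range(972))  # a high school grade band score in Range 3
```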