Data Quality Component 5: Subgroup Analysis and Reporting


Data analysis involves examining the relationships among multiple data elements to identify links, trends, patterns, and probabilities. Once data are analyzed, SEAs must consider how best to convey the results for use by stakeholders. These considerations are the same whether reporting subgroup data or data that are not disaggregated by subgroup. Data reports must adhere to state and federal policies and regulations on data disclosure and privacy; therefore, the SEA will need to establish methods for disclosure avoidance and for protecting personally identifiable information. Managing disclosure risk takes careful consideration and may include strategies such as minimum-n thresholds for reporting, fixed top/bottom coding thresholds, and complementary suppression. The following considerations outline how new or revised subgroup analysis and reporting processes fit into the larger data-quality system.

  • Determining analyses to conduct: SEA staff will need to conduct analyses, such as comparisons between relevant subgroups of students, to inform stakeholders about the performance of students, staff, and programs at the state, district, and school levels. SEA staff will need to determine the type of analysis to conduct based on what needs to be reported, e.g., basic percentages and counts or more involved calculations of relationships across the data using cross-tabulation or other methods. SEA staff will also need to establish whether district staff have the capacity to do these analyses and, if not, provide the needed guidance and training.
  • Establishing reporting processes: SEA staff will need to ensure that internal programs, e.g., early childhood, special education, and Title I, are able to report data to the SEA system and to the U.S. Department of Education, as needed, and that there is a plan in place to meet federal and state reporting requirements. The SEA will need processes for reporting data publicly and a plan for protecting personally identifiable information for those publicly reported data. These processes will include establishing a minimum n-size, i.e., the minimum number within a subgroup for which data can be disaggregated while still protecting the individual identities of those within the subgroup. The SEA will need to be diligent to ensure these policies are established and implemented for required state, district, and school report cards, and for the evolving use of data visualization through dashboards, infographics, and other displays.
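
As a rough illustration of the cross-tabulation mentioned above, the following Python sketch counts students by subgroup and performance level and expresses each cell as a percentage of its subgroup. The records, category labels, and function names are hypothetical and are not drawn from any SEA system.

```python
from collections import Counter

# Hypothetical student-level records: (subgroup, performance level).
records = [
    ("English learner", "Proficient"),
    ("English learner", "Not proficient"),
    ("English learner", "Not proficient"),
    ("Not English learner", "Proficient"),
    ("Not English learner", "Proficient"),
    ("Not English learner", "Not proficient"),
]


def cross_tabulate(rows):
    """Count records for each (subgroup, level) cell and for each subgroup."""
    cell_counts = Counter(rows)
    group_totals = Counter(subgroup for subgroup, _ in rows)
    return cell_counts, group_totals


def row_percentages(cell_counts, group_totals):
    """Express each cell as a percentage of its subgroup total."""
    return {
        (subgroup, level): 100.0 * count / group_totals[subgroup]
        for (subgroup, level), count in cell_counts.items()
    }


cell_counts, group_totals = cross_tabulate(records)
percentages = row_percentages(cell_counts, group_totals)

for (subgroup, level), pct in sorted(percentages.items()):
    print(f"{subgroup:20s} {level:15s} {pct:5.1f}%")
```

Counts this small would fall below any realistic minimum n-size, so a table like this would need disclosure avoidance applied before it could be reported publicly.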

For years, SEAs have had to report data about students while ensuring those students’ privacy. Ensuring privacy has not always been easy, often requiring additional statistical procedures and data inspection, as well as manual labor to correct disclosure errors. The ESEA, as amended by ESSA, established new requirements to report data on district and state report cards for students who are homeless, in foster care, or connected to a military parent or guardian. Because these are often small groups of students, the requirements have heightened discussion and concern about how to avoid disclosing data that could lead to the identification of an individual. Techniques to protect data, such as suppression, blurring, and perturbation, are often difficult to implement across levels, in part because complementary data in various cells can together give away enough information to risk personal identification.

SEAs can use the Data Systems Self-Reflection Checklist to consider how their current actions promote quality data systems. The checklist is available for download.

For more information on how Delaware is currently approaching data analysis, please see the vignette below.

Subgroup Analysis and Reporting Vignette: Automated System in Delaware Protects Student Data Privacy

The Delaware Department of Education (DDOE) is implementing a strategy to systematize data privacy using computer code that applies cell suppression consistently across data-reporting levels.

Delaware, like many other states, wanted an efficient way to report aggregate data that could not be traced back to individual students. Given the limited number of staff in DDOE’s Data Management and Governance office, the solution needed to be cost-effective and not labor intensive while adhering to federal and state policy. State policy sets Delaware’s population threshold at 15; that is, any counts or grade-level percentages representing 15 or fewer students in a group must be suppressed, and percentages in the top or bottom 5% are reported as >95% or <5% rather than as exact values. To further avoid disclosure of personally identifiable information, DDOE follows the guidelines recommended by the U.S. Department of Education’s Privacy Technical Assistance Center and suppresses single-cell data for a subgroup category of a particular population that contains three or fewer students. Within these boundaries, DDOE developed computer code that suppresses potentially identifiable data by applying a simplified set of rules at the database level, using DDOE’s database engine.
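
The thresholds described above can be sketched in a few lines of Python. This is a minimal illustration, not DDOE's code, which runs in its database engine and is not shown in this document. The function names, the `*` placeholder, the way the two thresholds are combined, and the treatment of values exactly at 5% or 95% are all assumptions.

```python
MIN_GROUP_SIZE = 15  # groups of 15 or fewer students are suppressed
MIN_CELL_SIZE = 3    # cells of three or fewer students are suppressed
SUPPRESSED = "*"     # placeholder symbol (an assumption, not DDOE's notation)


def needs_suppression(count, group_size):
    """Return True when either the group or the cell is too small to report."""
    return group_size <= MIN_GROUP_SIZE or count <= MIN_CELL_SIZE


def report_count(count, group_size):
    """Return a publishable count, or the suppression placeholder."""
    if needs_suppression(count, group_size):
        return SUPPRESSED
    return str(count)


def report_percentage(count, group_size):
    """Return a publishable percentage with top/bottom coding applied."""
    if needs_suppression(count, group_size):
        return SUPPRESSED
    pct = 100.0 * count / group_size
    # Assumption: values exactly at 5% or 95% are reported as-is.
    if pct > 95:
        return ">95%"
    if pct < 5:
        return "<5%"
    return f"{pct:.0f}%"


print(report_percentage(2, 14))    # group of 14 is below the threshold
print(report_percentage(98, 100))  # top-coded
print(report_percentage(40, 100))  # reported as an ordinary percentage
```

Keeping the thresholds as named constants makes it straightforward to adapt the sketch to a different state's policy.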

The program searches for needed suppression not only by subgroup, using the specified threshold, but also by categories within the subgroups (mutually exclusive grouping). For example, when reporting on the number of homeless students (n=14) scoring proficient on the state standardized test (n=2), both the population of homeless students, which is below the threshold, and the count of those who scored proficient would be suppressed. This is considered a level-one search, in which only an individual cell needs disclosure avoidance. The program then goes a step further and conducts a level-two search, seeking neighboring cells that by themselves may not need disclosure avoidance but that, combined with other cells, reveal enough to identify students and therefore also need suppression (complementary suppression). In this way, the program handles the repetitive work of determining when suppression is needed, which takes extensive staff time when done by hand. DDOE has been able to produce federal education reports using the program and continues to use it for state and local report cards. After much beta-testing, DDOE created a programming template that can be customized for other databases in other states, with some rules for data formatting, input, and engine logic. For further information, contact DDOE.
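
The two search levels can be illustrated with a simplified Python sketch for a single row of mutually exclusive categories whose total is also published. It is not DDOE's program: the function name, the choice of the smallest visible cell as the complementary partner, and the example counts are assumptions made for illustration.

```python
def suppress_row(cells, min_cell_size=3):
    """Apply level-one and level-two suppression to one row of cells.

    `cells` maps each mutually exclusive category to its student count.
    Returns the same mapping with suppressed cells replaced by None.
    """
    # Level one: suppress every cell at or below the minimum cell size.
    suppressed = {name for name, count in cells.items() if count <= min_cell_size}

    # Level two: a single hidden cell could be recovered by subtracting the
    # visible cells from the published total, so hide one more cell.
    if len(suppressed) == 1 and len(cells) > 1:
        visible = [name for name in cells if name not in suppressed]
        # Assumption: pick the smallest visible cell as the partner.
        partner = min(visible, key=lambda name: cells[name])
        suppressed.add(partner)

    return {
        name: (None if name in suppressed else count)
        for name, count in cells.items()
    }


# Hypothetical counts by performance level for one subgroup.
row = {"Advanced": 2, "Proficient": 20, "Basic": 30, "Below basic": 12}
print(suppress_row(row))
```

Here only "Advanced" fails the level-one check, but "Below basic" is also hidden so that the suppressed value cannot be derived from the row total. A production system would also have to check columns and totals across school, district, and state levels, which this sketch does not attempt.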

Subgroup Analysis and Reporting Resources:

Data Quality Campaign. Privacy and Effective Data Use Go Hand in Hand. Washington, DC: Data Quality Campaign, 2016.

Institute of Education Sciences. Issue Briefs and White Papers. Washington, DC: Institute of Education Sciences National Center for Education Statistics, n.d.

Data Visualization:

Evergreen, Stephanie. “Evergreen Data: Intentional Reporting & Data Visualization.” Stephanie Evergreen. (accessed October 26, 2018).

Ribecca, Severino. 2014. “The Data Visualisation Catalogue.” The Data Visualisation Catalogue. (accessed October 26, 2018).

The Center on Standards and Assessment Implementation:

Center on Standards and Assessment Implementation. Accountability Requirements for Subgroups of Students. Washington, DC: WestEd, 2017.

U.S. Department of Education:

National Forum on Education Statistics. Forum Guide to Data Visualization: A Resource for Education Agencies (NFES 2017-016). Washington, DC: U.S. Department of Education, National Center for Education Statistics, 2016.

Seastrom, Marilyn. Best Practices for Determining Subgroup Size in Accountability Systems While Protecting Personally Identifiable Student Information (IES 2017-147). Washington, DC: U.S. Department of Education, Institute of Education Sciences, 2017.