Data Analysis

All infographics were made for Balkan Insider

Freelance Project | Sun Journal | Lewiston, ME


Based on 2018-2019 Maine Department of Education data, I used linear regression to examine the relationship between demographic factors and student performance at varying school levels. Combining public enrollment and assessment data, I found that the factor with the strongest relationship to student success was “economically disadvantaged,” which the state defines as the percent of students receiving free or reduced-price lunch.

The percent of economically disadvantaged students correlated more strongly with "below expectation" results than any other demographic factor did, at all school levels, but especially in high school ELA and math.

A stronger measure of the relationship is the r-squared value, which comes from running a linear regression model and represents how much of the variation in one variable is explained by another. In this case, the strongest results were again found in high school ELA and math, where over 55% of the variance in scores below state expectations could be explained by the percent of economically disadvantaged students. Other grade and subject levels had an r-squared value around 30%, still higher than for any other demographic factor.
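As a rough illustration of how an r-squared value like the ones above can be computed, here is a minimal sketch using NumPy. The function name `r_squared` and the sample numbers are made up for the example; they are not the Maine data or the code used for the story.

```python
import numpy as np

def r_squared(x, y):
    """Share of the variance in y explained by a one-variable linear fit on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Fit y = slope * x + intercept by ordinary least squares.
    slope, intercept = np.polyfit(x, y, 1)
    predicted = slope * x + intercept
    ss_res = np.sum((y - predicted) ** 2)   # variation left unexplained
    ss_tot = np.sum((y - y.mean()) ** 2)    # total variation in y
    return 1.0 - ss_res / ss_tot

# Hypothetical values for illustration only.
pct_econ_disadvantaged = [10, 20, 30, 40, 50]
pct_below_expectation = [12, 18, 33, 38, 52]
print(round(r_squared(pct_econ_disadvantaged, pct_below_expectation), 3))
```

With a single explanatory variable, this value equals the square of the correlation coefficient between the two columns.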

But no matter the grade or subject, as the percent of economically disadvantaged students goes up, test scores can be expected to go down. In high school math and ELA results, for every 10 percentage point increase in economically disadvantaged students there is a 6 percentage point drop in students exceeding state expectations.
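The "per 10 points" figure is a rescaled regression slope. A small sketch of that arithmetic, assuming a simple one-variable fit; the function name `change_per_ten_points` and the sample data are hypothetical and were chosen only so the slope is easy to read:

```python
import numpy as np

def change_per_ten_points(x, y):
    """Expected change in y for a 10-point increase in x, from a linear fit."""
    slope, _intercept = np.polyfit(np.asarray(x, dtype=float),
                                   np.asarray(y, dtype=float), 1)
    return 10.0 * slope

# Hypothetical values for illustration only, not the Maine data.
pct_econ_disadvantaged = [0, 10, 20, 30]
pct_exceeding_expectations = [50, 44, 38, 32]
print(change_per_ten_points(pct_econ_disadvantaged, pct_exceeding_expectations))
```

A negative result means the share of students exceeding expectations falls as the share of economically disadvantaged students rises.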

Using this analysis, we looked at how schools were predicted to perform given their proportion of economically disadvantaged students and compared that to how they actually performed. The difference between those two scores (the residual) was standardized, leaving a number that represents how closely a school performed to its predicted result.

Schools falling two or more standard deviations above the mean were classified as performing greater than expected. Those falling more than two standard deviations below the mean were classified as performing below expectations. The rest were classified as performing as expected.
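The classification described above can be sketched roughly as follows. This is an illustration under assumptions, not the code used for the story: the function name `classify_schools`, the use of the sample standard deviation (`ddof=1`), and the sample data are all choices made for the example.

```python
import numpy as np

def classify_schools(pct_disadvantaged, scores, threshold=2.0):
    """Label each school by its standardized residual from a linear fit."""
    x = np.asarray(pct_disadvantaged, dtype=float)
    y = np.asarray(scores, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    # Residual = actual score minus the score predicted from demographics.
    residuals = y - (slope * x + intercept)
    # Least-squares residuals average to zero, so dividing by their
    # standard deviation standardizes them (ddof=1 is an assumption).
    z = residuals / residuals.std(ddof=1)
    labels = np.where(
        z >= threshold, "greater than expected",
        np.where(z < -threshold, "below expectations", "as expected"),
    )
    return z, labels

# Hypothetical data: 20 schools on a straight line, with one clear outlier.
x = np.arange(20)
y = 2.0 * x
y[10] += 50.0
z, labels = classify_schools(x, y)
print(labels[10], round(float(z[10]), 2))
```

The cutoffs follow the wording above: two or more standard deviations above the mean on one side, more than two below on the other.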

This type of analysis is commonly used to assess school performance. It's a method that puts schools on a demographically "level" playing field.

Link to story:

Class Project | The Data of Divides | Washington, DC 

Shown here is a simple screenshot overview, but the project went beyond mapping with ArcGIS and also analyzed the changing census data behind redistricting.


© 2023 by Train of Thoughts.