
Abortion Data in Illinois - RStudio

Our infographic visually combines and presents two sets of data on abortions in Illinois from 1995 to 2012 (data source).

My Contributions: Built graphs in RStudio

challenge

Illinois has offered thousands of women access to critical healthcare services, including access to safe abortions. Tens of thousands of abortions are performed in the state each year. We decided to create an infographic that conveys those numbers as a function of time, age, and marital status. View infographic

process

The choices we made when creating this graph were based on Kosslyn’s (Kosslyn, 2006) principles, grouped by three main goals:

  1. Connect with your Audience

  2. Direct and Hold Attention

  3. Promote Understanding and Memory

Analysis

We looked at four variables within the dataset: year (ordinal), county (nominal), age (ordinal), and marital status (nominal). Five relationships are presented in our infographic. The bar graph compares abortion counts across age ranges for 1995 and 2012, treating the two years as nominal categories. Three time series graphs present the change in abortion counts over time with respect to marital status, age range, and residential status in Illinois. Our geospatial graph presents the change in abortion counts from 1995 to 2012 for each county in Illinois.

Goal 1: Connect with the Audience

This graph could be viewed by the general public, and more specifically by anyone interested in learning about abortion rates in Illinois between 1995 and 2012.

Ethical debates on topics such as abortion rights are widespread in the current political climate. As one of the most contentious partisan issues, abortion rights are salient and in the public’s spotlight. This information visualization aims to present objective data, void of opinion.

Principles of Relevance and Appropriate Knowledge

The bar graph on age group and year shows how abortion rates differed between 1995 and 2012. This graph follows the Principles of Relevance and Appropriate Knowledge by letting the viewer easily compare these changes across age groups and see the overall trend as age increases. For people who feel attached to a particular county or region in Illinois, the heat map connects with them on a personal level by showing how much the abortion rates changed between 1995 and 2012 in each county. Lastly, we added a brief explanation to the heat map to help disambiguate the legend.

Goal 2: Direct and Hold Attention

Grouping Laws

When organizing the graphs, we aligned certain graphs with each other so that they could be easily compared. For example, we wanted viewers to be able to compare the line graph on marital status with the one on residency to see if there are any patterns between the two. For individual graphs, such as the bar graph, we followed the Grouping Laws by placing the age range labels in close proximity to the bars they describe.

Principle of Salience

To simplify the message of the heat map, we used one color, as opposed to two with a white midpoint. The majority of counties saw a decrease in the number of abortions between 1995 and 2012 and are shown in black and grey, so the counties shown in red stand out. This follows the Principle of Salience by drawing the viewer’s attention to the increases in abortion counts.

Goal 3: Promote Understanding and Memory

Principle of Capacity Limitations

We wanted to use as few stylistic elements as possible in each visualization so that each one conveys a single, simple message effectively. Our line graph on abortion counts per age group over time is the most visually complex, since it represents multiple age groups. On every graph, the y-axis represents the total number of abortions.

The heat map represents the changes of abortion counts per county from 1995 to 2012 through a color scale: black and grey indicate a decrease in overall counts, while the red hues indicate increases.
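A color scale like this reads correctly only if the neutral color sits exactly at "no change". One way to guarantee that is to position the color stops explicitly. The sketch below is an illustration under stated assumptions, not the code used for the infographic: the `delta` values are invented, the three-stop palette is a simplification of ours, and `zero_position()` is a hypothetical helper. It also assumes the data contain both decreases and increases.

```r
# Hypothetical helper: where does zero fall on a 0-1 scale spanning range(x)?
zero_position <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (0 - rng[1]) / (rng[2] - rng[1])
}

# Invented per-county changes (2012 count minus 1995 count), illustration only.
delta <- c(-300, -120, -40, 0, 25, 100)

# Positions of the three color stops: darkest at the minimum,
# white at zero, red at the maximum.
stops <- c(0, zero_position(delta), 1)

# Optional: build the fill scale if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  fill_scale <- ggplot2::scale_fill_gradientn(
    colours = c("#000000", "#ffffff", "#CB0000"),
    values  = stops
  )
}
```

Without `values`, `scale_fill_gradientn()` spaces the colors evenly across the data range, so white lands on zero only when the stops happen to line up with the data.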

For our bar graph on the number of abortions in 1995 and 2012, we colored the most recent year red and the starting year black. The high contrast between the two aims to emphasize the differences in counts between 1995 and 2012, since the counts do not decrease consistently across all age groups. We kept this pattern and color scheme throughout in order to follow the Principle of Capacity Limitations.

RStudio Code for each graph

Line Graphs: 

  • Graph 1: > ggplot(melt(Abortion, id.vars="YEAR", measure.vars=c("0_14", "15_17", "18_19", "20_24", "25_29", "30_34", "35_39", "40_44", "45_over"), variable.name="Age", value.name="Counts"), aes(x=YEAR, y=Counts, group=Age, colour=Age))+geom_line()

  • Graph 2: > ggplot(melt(Abortion, id.vars="YEAR", measure.vars=c("MarriedILResident", "UnmarriedILResident"), variable.name="MaritalStatus", value.name="Counts"), aes(x=YEAR, y=Counts, group=MaritalStatus, colour=MaritalStatus))+geom_line()

  • Graph 3: > ggplot(melt(Abortion, id.vars="YEAR", measure.vars=c("ILResidents", "OutOfState"), variable.name="Residency", value.name="Counts"), aes(x=YEAR, y=Counts, group=Residency, colour=Residency))+geom_line()
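All three line graphs depend on the same step: reshaping a wide table (one column per category) into a long one (one row per year and category), so that a single column can drive `colour`. The sketch below shows that reshaping on invented numbers. It uses base R instead of `melt()` so that it runs without extra packages, and only the column names are taken from the calls above.

```r
# Toy wide table; the counts are invented for illustration only.
wide <- data.frame(
  YEAR = c(1995, 1996),
  MarriedILResident = c(10, 12),
  UnmarriedILResident = c(40, 38)
)

measure_cols <- c("MarriedILResident", "UnmarriedILResident")

# Long format: one row per (YEAR, MaritalStatus) pair.
long <- data.frame(
  YEAR = rep(wide$YEAR, times = length(measure_cols)),
  MaritalStatus = factor(rep(measure_cols, each = nrow(wide)),
                         levels = measure_cols),
  Counts = unlist(wide[measure_cols], use.names = FALSE)
)

print(long)
```

Each measured column becomes a block of rows labelled with its own name. That label is the column the line graphs map to `group` and `colour`.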

Geospatial:

> mergedData <- merge(Illinois, countynew2, by.x='subregion', by.y='county')

> arranged <- arrange(mergedData, group, order)

> ggplot(arranged, aes(x=long, y=lat, group=group, fill=delta95_12))+geom_polygon(colour="black", size=.2)+coord_map() +scale_fill_gradientn(colours=c("#000000","#8A8A8A","#ffffff", "#FF5B5B", "#EC0000", "#CB0000","#CB0000","#CB0000","#CB0000", "#B70000"))
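The map relies on two details that are easy to miss: county names must match exactly between the two tables, and `merge()` does not preserve row order, so the polygon vertices have to be re-sorted before drawing. The sketch below illustrates both on invented data, using base R only. It assumes that `delta95_12` is the 2012 count minus the 1995 count and that the polygon table uses lowercase county names (as `map_data("county")` does). Neither is confirmed by the original code, and the `n1995`/`n2012` column names are hypothetical.

```r
# Stand-in for the county totals; numbers are invented.
counts <- data.frame(
  county = c("Cook", "Kane", "Lake"),
  n1995  = c(100, 50, 30),
  n2012  = c(80, 55, 30)
)
counts$delta95_12 <- counts$n2012 - counts$n1995  # negative = decrease (assumed definition)
counts$county <- tolower(counts$county)           # match lowercase map names

# Stand-in for the polygon table: one row per boundary vertex.
polygons <- data.frame(
  subregion = c("lake", "lake", "cook", "cook", "kane", "kane"),
  group     = c(3, 3, 1, 1, 2, 2),
  order     = c(5, 6, 1, 2, 3, 4)
)

# Keep every polygon row, even when no count matches (all.x = TRUE).
merged <- merge(polygons, counts[c("county", "delta95_12")],
                by.x = "subregion", by.y = "county", all.x = TRUE)

# Restore drawing order; otherwise polygon outlines can come out scrambled.
merged <- merged[order(merged$group, merged$order), ]

# Counties that failed to match would show up as NA (drawn grey by default).
unmatched <- unique(merged$subregion[is.na(merged$delta95_12)])
```

Checking `unmatched` before plotting catches spelling or capitalization mismatches that would otherwise appear only as unexplained grey counties.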

Bar Graph:

> difference <- read.csv("CountDiff.csv")

> ggplot(difference, aes(x=Age, y=Count, fill=Year)) + geom_bar(position="dodge", stat="identity")
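For the bars to sit side by side in black and red, `Year` has to be treated as a category rather than a number. The sketch below shows one way to ensure that. The data frame is an invented stand-in for `CountDiff.csv`, whose actual contents are not shown here, and the explicit color mapping is our assumption about how the black and red scheme could be set.

```r
# Invented stand-in for CountDiff.csv; the real file's values may differ.
difference <- data.frame(
  Age   = rep(c("15_17", "18_19", "20_24"), times = 2),
  Year  = rep(c(1995, 2012), each = 3),
  Count = c(30, 60, 150, 20, 40, 120)
)

# A numeric Year would get a continuous fill scale; a factor yields
# one discrete bar per year within each age group.
difference$Year <- factor(difference$Year)

# Starting year in black, most recent year in red.
year_colours <- c("1995" = "black", "2012" = "red")

# Optional: build the plot if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(difference, aes(x = Age, y = Count, fill = Year)) +
    geom_col(position = "dodge") +
    scale_fill_manual(values = year_colours)
}
```

`geom_col()` is equivalent to `geom_bar(stat="identity")`. If `Year` is already stored as text in the CSV, the `factor()` step changes nothing.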



© 2023 Carolina Barrios