Visualizing data: getting started with bar charts

Once you’ve done a first-pass analysis of your data using summaries and tables, it’s time to visualize it with graphs and charts.

For exploring categorical variables, it makes sense to start with a simple bar chart. We will start with R’s built-in “barplot” function before getting into ggplot2, a package that allows for a bit more flexibility in specifying graph details.

Before creating a bar chart with “barplot”, you first need to make a table of the variable from your data.frame:

table1 <- table(data.frame$variable1)
barplot(table1)

For example:

ethnicitytable <- table(data3$ethnicity)
barplot(ethnicitytable)

[Image: resulting bar chart of ethnicitytable]

Notice that you won’t need to specify the data.frame name anymore when working with this table, since the table is now its own object, listed separately in your RStudio workspace on the right.
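If you don't have a dataset at hand, here is a minimal sketch you can paste in to try the two steps. The contents of data3 below are made up for illustration and are not the data behind the charts in this post:

```r
# Hypothetical toy data standing in for data3 (values are made up)
data3 <- data.frame(
  ethnicity = c("Asian", "Hispanic", "Asian", "Caucasian", "Hispanic", "Hispanic")
)

# table() counts how often each category occurs
ethnicitytable <- table(data3$ethnicity)
print(ethnicitytable)

# barplot() draws one bar per category, in the order of the table,
# and invisibly returns the midpoints of the bars
midpoints <- barplot(ethnicitytable)
```

From here on, ethnicitytable can be passed around on its own, without the data3$ prefix.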

If you want to include missing values as a category in the bar chart, add the “useNA” option when creating a table, as shown here:

table1 <- table(data.frame$variable1, useNA="ifany")
barplot(table1)
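To see what “useNA” changes, here is a small sketch with a made-up vector that contains two missing values. The name variable1 and its values are assumptions for illustration only:

```r
# Hypothetical values with two missing entries (made up)
variable1 <- c("a", "b", NA, "a", NA, "a")

# By default, table() silently drops the missing values
table_default <- table(variable1)
print(table_default)

# With useNA = "ifany", missing values get their own category
table1 <- table(variable1, useNA = "ifany")
print(table1)

barplot(table1)
```

Besides "ifany", table() also accepts useNA = "always", which adds the missing-value category even when its count is zero.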

To order the bars by the frequency of each category, in decreasing (decreasing = T) or increasing (decreasing = F) order, use:

barplot(table1[order(table1, decreasing = T)])  
barplot(table1[order(table1, decreasing = F)])

For example:

barplot(ethnicitytable[order(ethnicitytable, decreasing = T)])  
barplot(ethnicitytable[order(ethnicitytable, decreasing = F)])

Output for increasing order:

[Image: bar chart of ethnicitytable with bars in increasing order]
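As a side note, base R's sort() should give the same ordering with a little less typing. The sketch below compares the two approaches on made-up counts (not the data from this post). It also spells out TRUE instead of T, which is a bit safer because T is an ordinary variable that can be overwritten:

```r
# Hypothetical toy table (values are made up)
ethnicitytable <- table(c("Asian", "Hispanic", "Asian",
                          "Caucasian", "Hispanic", "Hispanic"))

# Approach used in this post: index the table by its sorted positions
by_order <- ethnicitytable[order(ethnicitytable, decreasing = TRUE)]

# Alternative: sort() the table directly
by_sort <- sort(ethnicitytable, decreasing = TRUE)

print(names(by_order))
print(names(by_sort))

barplot(by_sort)
```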

To make the bars horizontal, add “horiz = T”:

barplot(table1[order(table1, decreasing = F)], horiz = T)

For example:

barplot(ethnicitytable[order(ethnicitytable, decreasing = F)], horiz = T)

[Image: horizontal bar chart of ethnicitytable]

You can see that my labels on the categorical axis are oriented the wrong way. In order to reorient the labels, add the “las” option:

barplot(table1[order(table1, decreasing = F)], horiz = T, las = 1)

The values of las are: 0 = always parallel to the axis (the default), 1 = always horizontal, 2 = always perpendicular to the axis, and 3 = always vertical.

For example:

barplot(ethnicitytable[order(ethnicitytable, decreasing = F)], horiz = T, las = 1)
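If you want to compare the four “las” settings side by side, the sketch below draws the same horizontal chart once per value in a 2-by-2 grid. The counts are made up, and old_par is just a name I picked for storing the previous layout:

```r
# Hypothetical counts (made up for illustration)
counts <- c(Asian = 2, Caucasian = 1, Hispanic = 3)

# Split the plotting area into a 2 x 2 grid; par() returns the old setting
old_par <- par(mfrow = c(2, 2))

# Draw the same chart with each label orientation
for (style in 0:3) {
  barplot(counts, horiz = TRUE, las = style, main = paste("las =", style))
}

# Go back to one plot per page
par(old_par)
```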

To replace the category labels on the axis, supply exactly one label per bar with the “names.arg” option:

barplot(table1[order(table1, decreasing = F)], horiz = T, las = 1, names.arg=c("category1", "category2", "..."))

barplot(ethnicitytable[order(ethnicitytable, decreasing = F)], horiz = T, las = 1, names.arg=c("American Indian/Alaska Native", "Native Hawaiian/Pacific Islander", "African American", "Other", "Multiple", "Caucasian", "Asian", "Hispanic"))

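Keep in mind that the labels are assigned in the order in which the bars are drawn, and a horizontal bar plot draws its bars from bottom to top. With “decreasing = F”, the first label therefore belongs to the smallest category at the bottom and the last label to the largest category at the top.

[Image: horizontal bar chart of ethnicitytable with custom category labels]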
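Typing the labels by hand in bar order is easy to get wrong if the counts change. One possible way around this, sketched here with made-up category codes and a hypothetical lookup vector called labels, is to look up each label by the category name so the order takes care of itself:

```r
# Hypothetical toy table with short category codes (values are made up)
ethnicitytable <- table(c("asian", "hisp", "asian", "cauc", "hisp", "hisp"))

# Increasing order, as used for the horizontal charts in this post
sorted <- sort(ethnicitytable)

# Hypothetical lookup from category code to display label
labels <- c(asian = "Asian", cauc = "Caucasian", hisp = "Hispanic")

# Pick the labels in the same order as the bars
bar_labels <- labels[names(sorted)]
print(bar_labels)

barplot(sorted, horiz = TRUE, las = 1, names.arg = bar_labels)
```

This is only a sketch of the idea; it assumes every category in the table has an entry in the lookup vector.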

There are a few more things to fix. First of all, the labels are cut off on the left. This can be prevented by enlarging the margins, which are measured in lines of text and default to 5.1 (bottom), 4.1 (left), 4.1 (top), and 2.1 (right). For example, the chart will fit if we set the left margin to 14 instead of 4.1 before plotting the bar graph. Note that par() returns the previous settings, so “marginsetting1” stores the old margins; you can use any name instead of “marginsetting1”, but remember it for restoring the default values later (see below):

marginsetting1 <- par(mar = c(5.1,14,4.1,2.1))

barplot(ethnicitytable[order(ethnicitytable, decreasing = F)], horiz = T, las = 1, names.arg=c("American Indian/Alaska Native", "Native Hawaiian/Pacific Islander", "African American", "Other", "Multiple", "Caucasian", "Asian", "Hispanic"))

[Image: horizontal bar chart of ethnicitytable with a wider left margin]

Only two more labels are missing: the title and a label for the x-axis. We can add these using the “main” and “xlab” options, respectively:

barplot(table1[order(table1, decreasing = F)], horiz = T, las = 1, main="title", xlab="xaxistitle", names.arg=c("category1", "category2", "..."))

For example:

barplot(ethnicitytable[order(ethnicitytable, decreasing = F)], horiz = T, las = 1, main="Race/ethnicity", xlab="Frequency", names.arg=c("American Indian/Alaska Native", "Native Hawaiian/Pacific Islander", "African American", "Other", "Multiple", "Caucasian", "Asian", "Hispanic"))

[Image: finished horizontal bar chart with title “Race/ethnicity” and x-axis label “Frequency”]

Notice that the margins will keep the values that we just specified until we reset them. One way to do this is to reset all graphical parameters in RStudio by clearing the plots. Make sure to first save any graphs that you still need, as this will delete them from your “Plots” tab.

Plots  →  Clear all…  →  Yes

Another way to reset the margins is to restore the settings from before you changed them. If you stored the result of par() as “marginsetting1” when creating the new margins, it holds the old values, and you can now restore them with:

par(marginsetting1)
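To convince yourself that the save-and-restore round trip works, here is a minimal sketch that checks the margins at each step. The counts are made up, and widened and restored are just names I picked for the intermediate values:

```r
# Hypothetical counts with one long label (made up for illustration)
counts <- c("A fairly long category label" = 2, "Short" = 5)

# Widen the left margin; par() hands back the previous settings
marginsetting1 <- par(mar = c(5.1, 14, 4.1, 2.1))
widened <- par("mar")
print(widened)

barplot(counts, horiz = TRUE, las = 1)

# Restore the margins that were in effect before the change
par(marginsetting1)
restored <- par("mar")
print(restored)
```

After the last call, the margins should match whatever was stored in marginsetting1.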

 

While the barplot function contains several more options for specifying details of charts (color, scale etc), and for adding variables as groups or stacks, we will now continue with ggplot2, a useful package that allows for creating pretty nice looking graphs with lots of added functionality. Before continuing with ggplot2, check out the basics and possibilities on this cheat sheet.

 

 
