In the process of doing an ROI analysis, I wanted a simple area chart showing the negative burn down of our costs while the project was in development, through to the positive savings of the project after completion. Excel didn’t do a great job for me, so I thought I’d give R a try.
As you can see in the image below, Excel put the x axis right through the center of the chart and since I had to use a column chart, it’s more blocky/chunky than I liked. With R and ggplot2, I was able to smooth out those lines and put the x axis in the right place. I also had a little more control over the labels on the x axis, so I could show the intervals at 12/24/… months instead of 1/13/….. Of course it’s super-easy to create charts in Excel, but with a little extra effort in R, you can have a much better final product.
The code should be fairly well commented, but here’s the general idea:
- Install ggplot2 and scales to build the chart and format it with dollar signs.
- Load data from a csv file. Ours simply contains months as numbers, and the total savings.
- Interpolate our 60 months into 1000 separate values to reduce the choppines.
- Add a valence column to indicate which values are positive or negative.
- Plot the chart.
### Begin R Code ###
# If you haven't already, install the packages below.
library(ggplot2) # ggplot2 for creating the charts
library(scales) # scales for the dollar formatting of the axis.
#Load the csv file containing the values:
SampleROI <- read.csv("C:/Users/convalytics/Documents/R/SampleROI.csv") # Make sure to use your own path *** # Interpolate the data into 1000 separate values. (to smooth out the choppiness of having only a few values) interp <- approx(SampleROI$Month, SampleROI$TotalCostVsSavings, n=1000) #Rebuild the data frame with the interpolated values. roi <- data.frame(Month=interp$x, Savings=interp$y) #Add a "valence" column to indicate positive vs negative values. #Essentially selects the not-yet-existing "valence" column on a # subset of the positive or negative values in the "Savings" column, # and inserts a value of "pos" or "neg" accordingly. roi$valence[roi$Savings >= 0] <- "pos" roi$valence[roi$Savings < 0] <- "neg" # Plot the chart ggplot(roi, aes(x=Month, y=Savings, width=1)) + geom_area(aes(fill=valence, alpha=.8,stat="identity")) + scale_x_continuous(breaks=seq(0,60,12),expand=c(0,0)) + scale_y_continuous(breaks=seq(-4000000,6000000,1000000),labels=dollar) + scale_fill_manual(values=c("darkred","darkgreen")) + labs(title="Sample ROI",x="Months from Project Start",y="Running Cost vs. Savings") + theme(plot.title=element_text(face="bold")) # R Code by Jason Green : Convalytics.com # ### End R Code ###
I hope this makes for a good example that you can use to tweak for your own uses. If you have any questions or suggestions, please leave us a comment!