In the process of doing an ROI analysis, I wanted a simple area chart showing the negative burn down of our costs while the project was in development, through to the positive savings of the project after completion. Excel didn’t do a great job for me, so I thought I’d give R a try.
As you can see in the image below, Excel put the x axis right through the center of the chart and since I had to use a column chart, it’s more blocky/chunky than I liked. With R and ggplot2, I was able to smooth out those lines and put the x axis in the right place. I also had a little more control over the labels on the x axis, so I could show the intervals at 12/24/… months instead of 1/13/….. Of course it’s super-easy to create charts in Excel, but with a little extra effort in R, you can have a much better final product.
The code should be fairly well commented, but here’s the general idea:
- Install ggplot2 and scales to build the chart and format it with dollar signs.
- Load data from a csv file. Ours simply contains months as numbers, and the total savings.
- Interpolate our 60 months into 1000 separate values to reduce the choppines.
- Add a valence column to indicate which values are positive or negative.
- Plot the chart.
### Begin R Code ###
# If you haven't already, install the packages below.
library(ggplot2) # ggplot2 for creating the charts
library(scales) # scales for the dollar formatting of the axis.
#Load the csv file containing the values:
SampleROI <- read.csv("C:/Users/convalytics/Documents/R/SampleROI.csv") # Make sure to use your own path ***
# Interpolate the data into 1000 separate values. (to smooth out the choppiness of having only a few values)
interp <- approx(SampleROI$Month, SampleROI$TotalCostVsSavings, n=1000)
#Rebuild the data frame with the interpolated values.
roi <- data.frame(Month=interp$x, Savings=interp$y)
#Add a "valence" column to indicate positive vs negative values.
#Essentially selects the not-yet-existing "valence" column on a
# subset of the positive or negative values in the "Savings" column,
# and inserts a value of "pos" or "neg" accordingly.
roi$valence[roi$Savings >= 0] <- "pos"
roi$valence[roi$Savings < 0] <- "neg"
# Plot the chart
ggplot(roi, aes(x=Month, y=Savings, width=1)) +
geom_area(aes(fill=valence, alpha=.8,stat="identity")) +
labs(title="Sample ROI",x="Months from Project Start",y="Running Cost vs. Savings") +
# R Code by Jason Green : Convalytics.com #
### End R Code ###
I hope this makes for a good example that you can use to tweak for your own uses. If you have any questions or suggestions, please leave us a comment!