3 Ways To Analyze Google Analytics Data in R with RGA and ggplot2

In my opinion, Google Analytics is the single most influential development in marketing analytics ever.  Quantcast estimates that 70% of its top 10,000 websites have GA installed.  Google has shown a relentless drive to improve the product over the years, and its free price tag ensures access for almost anyone who runs a website.  That said, Google Analytics is a service, and no service (great or lacking) is without flaws.  One of GA’s hidden advantages is a robust API, which lets users build some of the features that are missing from the standard interface.  Here I cover three ways to use R to fill in some of those missing features.

To use any of these techniques, you will need to install R along with the RGA and dplyr packages, which are available on CRAN.  The other packages used are ggplot2 for visualization, scales, lubridate and zoo.  Use the script below to install them.

install.packages("RGA")
install.packages("dplyr")
install.packages("ggplot2")
install.packages("scales")
install.packages("lubridate")
install.packages("zoo")
  1. Event Conversion Rate Script

    One of my gripes with Google Analytics is that the Top Events report includes total event counts but does not include a conversion metric.  If you use the Google Tag Manager click-listening technique to add events to your site, you could add a bit of custom JavaScript to pass an impression for the same element.  In many cases, however, a simple total event count divided by the pageview count would suffice.  Here’s a script that grabs that simple metric:

    library(RGA)
    library(dplyr)
    
    #Authorization for RGA
    authorize()
    
    ##Enter your view ID here
    profile <- XXXXXXXX
    
    #Enter your start and end date here
    start.date = "2017-05-01"
    end.date = "2017-05-01"
    
    #RGA script pulls event parameters, and event page
    gaevents <- get_ga(profile, start.date, end.date, 
                 metrics = "ga:totalEvents",
                 dimensions = "ga:eventCategory,ga:eventAction,ga:eventLabel,ga:pagePath")
    
    #pull pageview counts as well as content group for easy grouping if content groups are used
    gapages <- get_ga(profile, start.date, end.date, 
                       metrics = "ga:pageviews",
                       dimensions = "ga:pagePath,ga:contentGroup1")
    
    #join event and page data
    gaevents <- inner_join(gaevents, gapages, by = "pagePath")
    
    #create conversion metric
    gaevents$pageconv <- round(gaevents$totalEvents/gaevents$pageviews,4)
    
    Example output (selected columns):

    eventLabel   pagePath  totalEvents  contentGroup1  pageviews  pageconv
    Social Link  /         3            Homepage       9          0.3333

    This gives all event parameters (Category, Action and Label) as well as the page URL and content group 1, allowing the user to easily aggregate pages if they are passing content groups.  I strongly encourage using content groupings.
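
    As a rough sketch of that aggregation, assuming a joined gaevents data frame shaped like the one the script produces, the rows can be rolled up by content group with dplyr.  The sample rows here are made up for illustration, and groupconv is a hypothetical column name.  Summing pageviews this way also assumes each page appears only once per event label within a content group.

```r
library(dplyr)

# Made-up rows shaped like the joined gaevents data frame (illustrative values only)
gaevents <- data.frame(
  eventLabel    = c("Social Link", "Social Link", "Social Link", "Newsletter Signup"),
  pagePath      = c("/", "/blog/post-a", "/blog/post-b", "/blog/post-a"),
  contentGroup1 = c("Homepage", "Blog", "Blog", "Blog"),
  totalEvents   = c(3, 4, 6, 2),
  pageviews     = c(9, 40, 60, 40),
  stringsAsFactors = FALSE
)

# Sum events and pageviews per content group and event label,
# then recompute the conversion metric on the totals
bygroup <- gaevents %>%
  group_by(contentGroup1, eventLabel) %>%
  summarise(totalEvents = sum(totalEvents),
            pageviews   = sum(pageviews),
            .groups = "drop") %>%
  mutate(groupconv = round(totalEvents / pageviews, 4))

print(bygroup)
```

    Recomputing the rate from the summed counts, rather than averaging the per-page pageconv values, keeps low-traffic pages from skewing the group figure.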

  2. Analyze Acquisition Mediums with ggplot2

    Google Analytics has some good embedded graphs for analyzing traffic mediums, and the advent of Google Data Studio gives users even more flexibility.  However, sites with a large number of marketing mediums (10+) pose problems for these tools.  Using ggplot2 in R allows a user to create what analysts call “small multiples”: a series of similar graphs or charts laid out on the same axes so that they can be easily compared.  Below is a script that returns small multiples for a year-over-year comparison of marketing mediums.

    library(RGA)
    library(ggplot2)
    library(scales)
    library(lubridate)
    library(zoo)
    
    #Authorization for RGA
    authorize()
    
    ##Enter your view ID here
    profile <- XXXXXXXX
    
    
    ##Enter your start and end date here
    start.date = "2016-01-01"
    end.date = "2017-05-31"
    
    ##RGA script
    ga <- get_ga(profile, start.date, end.date, metrics = "ga:sessions",
                 dimensions = "ga:yearMonth,ga:year,ga:medium")
    
    ##Zoo script converts YYYYMM format date to YYYY-MM-DD
    ga$yearMonth <- as.Date(as.yearmon(ga$yearMonth, "%Y%m"))
    
    ##Lubridate script converts month to monthname for X axis of graphs
    ga$monthname <- month(ga$yearMonth, label = TRUE)
    
    ##GGplot script creates line graph
    ggplot(ga, aes(monthname, sessions, color = year, group = year)) + geom_line(size = 1) +
      
      ##converts any integer Y axis to continuous scale and adds a comma for easy reading   
      scale_y_continuous(labels = comma)  +
      
      ##creates a facet for multiple graphs by medium and forces Y axis to independent scales  
      facet_wrap(~ medium, ncol=4, scales = "free_y")

    Figure: Small Multiples using ggplot2 and R
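
    To experiment with the plotting step without API access, here is a minimal sketch that runs the same date conversions and faceting on made-up data.  The session counts are random numbers and the three medium names are placeholders, so only the shape of the output is meaningful.

```r
library(ggplot2)
library(scales)
library(lubridate)
library(zoo)

# Made-up monthly sessions for three mediums over two years (illustrative only)
set.seed(42)
ga <- expand.grid(
  month  = sprintf("%02d", 1:12),
  year   = c("2016", "2017"),
  medium = c("organic", "cpc", "email"),
  stringsAsFactors = FALSE
)
ga$yearMonth <- paste0(ga$year, ga$month)   # "YYYYMM", as the API returns it
ga$sessions  <- round(runif(nrow(ga), min = 100, max = 5000))

# Same conversions as the script: "YYYYMM" -> Date -> month label
ga$yearMonth <- as.Date(as.yearmon(ga$yearMonth, "%Y%m"))
ga$monthname <- month(ga$yearMonth, label = TRUE)

# One panel per medium, one line per year, independent Y axes
p <- ggplot(ga, aes(monthname, sessions, color = year, group = year)) +
  geom_line() +
  scale_y_continuous(labels = comma) +
  facet_wrap(~ medium, ncol = 4, scales = "free_y")

print(p)
```

    Because year is kept as a character column, ggplot2 treats it as a discrete variable and draws one distinctly colored line per year in each panel.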
  3. Analyze Product Performance, Content Groups or Other Categories with ggplot2

    A user could also use the previous script for small multiples to learn about other categorical data like revenue by product:

    library(RGA)
    library(ggplot2)
    library(scales)
    library(lubridate)
    library(zoo)
    
    #Authorization for RGA
    authorize()
    
    ##Enter your view ID here
    profile <- XXXXXXXX
    
    
    ##Enter your start and end date here
    start.date = "2016-01-01"
    end.date = "2017-05-31"
    
    ##RGA script
    ga <- get_ga(profile, start.date, end.date, metrics = "ga:itemRevenue",
                 dimensions = "ga:yearMonth,ga:year,ga:productName")
    
    ##Zoo script converts YYYYMM format date to YYYY-MM-DD
    ga$yearMonth <- as.Date(as.yearmon(ga$yearMonth, "%Y%m"))
    
    ##Lubridate script converts month to monthname for X axis of graphs
    ga$monthname <- month(ga$yearMonth, label = TRUE)
    
    ##GGplot script creates line graph
    ggplot(ga, aes(monthname, itemRevenue, color = year, group = year)) + geom_line(size = 1) +
      
      ##converts any integer Y axis to continuous scale and adds a comma for easy reading   
      scale_y_continuous(labels = comma)  +
      
      ##creates a facet for multiple graphs by product and forces Y axis to independent scales  
      facet_wrap(~ productName, ncol=4, scales = "free_y")
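
    With a large catalog, one panel per product can become hard to read.  One possible approach, sketched here on made-up data, is to keep only the top products by total revenue before plotting.  The cutoff top_n_products and the ga_top name are hypothetical, not part of the script above.

```r
library(dplyr)

# Made-up product revenue rows shaped like the ga data frame (illustrative only)
ga <- data.frame(
  yearMonth   = rep(c("201701", "201702"), times = 4),
  productName = rep(c("Widget", "Gadget", "Gizmo", "Doohickey"), each = 2),
  itemRevenue = c(500, 700, 50, 30, 900, 1100, 10, 5),
  stringsAsFactors = FALSE
)

top_n_products <- 2  # hypothetical cutoff; pick whatever fits on one screen

# Total revenue per product, then the names of the top earners
top_products <- ga %>%
  group_by(productName) %>%
  summarise(revenue = sum(itemRevenue), .groups = "drop") %>%
  arrange(desc(revenue)) %>%
  head(top_n_products) %>%
  pull(productName)

# Keep only rows for those products
ga_top <- filter(ga, productName %in% top_products)

print(top_products)
```

    Passing ga_top to the ggplot call in place of ga would then draw one panel per remaining product.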

    If you haven’t had a chance yet, please read my post on why an analyst should learn R.

    Have any questions or comments???  Let me know in the comments section.

     
