Finding a Marketing Mix with Google Analytics Multi Channel Funnels and R

Google Analytics Multi Channel Analysis

Online marketing channels such as Paid Search and Display Advertising are used by scores of organizations to improve outreach.  Having the ability to improve your visibility by simply purchasing traffic is very useful.  What is not useful, at least for most marketing organizations, is having to come to grips with whether the money for said traffic was spent as efficiently as possible.  Most organizations will simply use the Acquisition reporting in Google Analytics to learn how much traffic their marketing campaigns generate (bad).  Some will even venture to see how many conversions or how much revenue they produce (better but still bad).

Only the savvy organization will employ the technique known as “multi-session marketing analytics.”  This technique uses user and session data to analyze the activity of users across multiple sessions.  This improves on the simplistic “last click” attribution model used for the regular Google Analytics Acquisition reporting.  Google has provided some reporting tools for this type of analysis in the Multi-Channel Funnels and Attribution reporting found under the “Conversions” tab in Google Analytics.  The Model Comparison Tool report can even be used to compare different models (e.g. “last click”, “first click”, etc.).

Model Comparison Tool in Google Analytics
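
To make the difference between these heuristic models concrete, here is a minimal sketch in Python (for illustration only; it is not one of the R scripts from this post).  It assumes each conversion path is an ordered list of mediums plus a conversion count.  The function name `heuristic_attribution` and the example paths are made up.

```python
from collections import defaultdict

def heuristic_attribution(paths):
    """Credit conversions to channels under three heuristic models.

    `paths` is a list of (channels, conversions) pairs, where `channels`
    is the ordered list of mediums a user touched before converting.
    """
    first = defaultdict(float)
    last = defaultdict(float)
    linear = defaultdict(float)
    for channels, conversions in paths:
        first[channels[0]] += conversions      # first click gets all the credit
        last[channels[-1]] += conversions      # last click gets all the credit
        share = conversions / len(channels)    # linear splits the credit evenly
        for channel in channels:
            linear[channel] += share
    return {
        "first_touch": dict(first),
        "last_touch": dict(last),
        "linear": dict(linear),
    }

# Hypothetical conversion paths, not real Google Analytics data.
example_paths = [
    (["cpc", "organic", "email"], 2),
    (["display", "cpc"], 1),
    (["organic"], 3),
]
result = heuristic_attribution(example_paths)
```

Every model hands out the same six conversions here; only the split between mediums changes, which is exactly why the choice of model matters.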

The unfortunate thing about using these models is that there is no such thing as a one-size-fits-all model for analyzing a site’s marketing channels.  Each site has its own flavor and its own user base with its own marketing behavior.

In order to deal with this issue, an organization could either pay for Google Analytics Premium (which uses a machine learning algorithm to predict the best model to use) or it could run a probabilistic model on its GA data. A Markov chain is one of the easier models to use. It is also a model that is used by a number of marketing attribution analytics consultancies.

I’m no statistician, but I believe the simplest way to explain a Markov chain is that it is a way of describing the probability of events based only on the state of the immediately preceding event. In this case each event is a session with an assigned medium, and the result is a re-assignment of conversion value across mediums based on their probabilities of leading to a conversion.  Read more on Markov Chain here.

Markov Chain Illustration
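
For readers who want to see the mechanics, below is a rough Python sketch of a first-order Markov chain attribution using the “removal effect”: estimate the transition probabilities between mediums, then measure how much the overall chance of conversion drops when each medium is taken out.  This is my own simplified illustration under stated assumptions, not necessarily what any particular package computes internally.  The state names, function names and example paths are all made up, and it assumes at least one path converts.

```python
from collections import defaultdict

START, CONV, NULL = "(start)", "(conversion)", "(null)"

def transition_probabilities(paths):
    """Estimate first-order transition probabilities from observed paths.

    `paths` is a list of (channels, converted) pairs; `converted` is a bool.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for channels, converted in paths:
        states = [START] + list(channels) + [CONV if converted else NULL]
        for current, nxt in zip(states, states[1:]):
            counts[current][nxt] += 1
    return {
        state: {nxt: n / sum(nexts.values()) for nxt, n in nexts.items()}
        for state, nexts in counts.items()
    }

def conversion_probability(probs, removed=None, sweeps=1000):
    """Chance of reaching CONV from START, optionally with one channel removed.

    Traffic that would have entered the removed channel is treated as lost.
    Repeated sweeps approximate the absorption probabilities.
    """
    reach = {state: 0.0 for state in probs}
    reach[CONV], reach[NULL] = 1.0, 0.0
    for _ in range(sweeps):
        for state, nexts in probs.items():
            if state == removed:
                continue
            reach[state] = sum(
                p * reach.get(nxt, 0.0)
                for nxt, p in nexts.items()
                if nxt != removed
            )
    return reach[START]

def markov_attribution(paths):
    """Split total conversions across channels in proportion to removal effect."""
    probs = transition_probabilities(paths)
    baseline = conversion_probability(probs)
    channels = [state for state in probs if state != START]
    effects = {
        c: 1 - conversion_probability(probs, removed=c) / baseline
        for c in channels
    }
    total_effect = sum(effects.values())
    total_conversions = sum(1 for _, converted in paths if converted)
    return {c: total_conversions * e / total_effect for c, e in effects.items()}

# Hypothetical paths: two convert (both via email), two do not.
example_paths = [
    (["cpc", "email"], True),
    (["cpc"], False),
    (["organic", "email"], True),
    (["display"], False),
]
attribution = markov_attribution(example_paths)
```

In this toy data set email sits on every converting path, so removing it wipes out all conversions and it earns the largest share of credit, while display never leads to a conversion and earns none.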

Fortunately for those of us without an advanced math degree, there is a package for R called ChannelAttribution, which allows us to run the Markov Chain model on data pulled directly from the Google Analytics API. It also compares the Markov Chain model to other models such as last touch, first touch and linear without much fuss.  There is a great tutorial on using ChannelAttribution on the Lunametrics Blog.  Read below to see how this technique can be used with data pulled directly from the Google Analytics API in R.  The script seems a bit verbose, but it works very well, thanks to Kaelin Harmon!

This produces a dataframe and a plot which compares each of the heuristic models (last touch, first touch, linear) and the Markov Model.

Heuristic Models vs Markov Model

If you liked this post, you might want to take a look at my last post on using the GA API and R as an alternative to Google Analytics Premium or just leave me a comment below.

An Alternative to Google Analytics BigQuery Export Using R and Google Tag Manager

Google Analytics has proven to be one of the most influential tools ever created for marketing analysis.  Google is pretty unrelenting in their pursuit of innovation for Google Analytics and that innovation shows in the number of other tools they’ve built for analysts.  From Google Sheets to BigQuery to Google Data Studio, the complementary tools built are a great aid for dealing with the wealth of data that can be mined from Google Analytics.  One of the little known yet game-breaking tools available for use with Google Analytics data is the Google Analytics BigQuery Export.

Google Analytics Premium BigQuery Export

This tool, which is only available to users of Google Analytics’ premium product, is in essence a raw data export of a website’s Google Analytics data.  This frees any analyst with a decent knowledge of SQL from the shackles of Google’s canned reports.  It also allows an analyst to create much more robust logic for creating reports.  For instance, if an analyst wanted to create a report for all users that viewed a particular page during their session and returned to the site within 6 days, they would be limited only by their knowledge of SQL and their ability to fork out the $150K Google charges for their premium product!!!
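
To show what that kind of report logic looks like once you have hit-level rows, here is a small Python sketch (illustration only; the rest of this post uses R).  It assumes each row is a (client ID, session ID, date, page path) tuple and counts a “return” as any session by the same client on a later date within the window.  The function name and the sample rows are hypothetical.

```python
from collections import defaultdict
from datetime import date

def viewed_then_returned(hits, page_path, max_days=6):
    """Return client IDs that viewed `page_path` in a session and came back
    on a later date no more than `max_days` days afterwards.

    `hits` is a list of (client_id, session_id, day, path) tuples.
    """
    session_day = defaultdict(dict)   # client_id -> {session_id: date}
    saw_page = defaultdict(set)       # client_id -> sessions containing the page
    for client_id, session_id, day, path in hits:
        session_day[client_id].setdefault(session_id, day)
        if path == page_path:
            saw_page[client_id].add(session_id)

    matched = set()
    for client_id, session_ids in saw_page.items():
        for session_id in session_ids:
            viewed_on = session_day[client_id][session_id]
            if any(
                0 < (day - viewed_on).days <= max_days
                for day in session_day[client_id].values()
            ):
                matched.add(client_id)
    return matched

# Hypothetical rows: A returns after 4 days, B after 9, C never views the page.
example_hits = [
    ("A", "A1", date(2017, 3, 1), "/pricing"),
    ("A", "A2", date(2017, 3, 5), "/"),
    ("B", "B1", date(2017, 3, 1), "/pricing"),
    ("B", "B2", date(2017, 3, 10), "/"),
    ("C", "C1", date(2017, 3, 1), "/"),
    ("C", "C2", date(2017, 3, 2), "/blog"),
]
returned = viewed_then_returned(example_hits, "/pricing")
```

Only client A qualifies in this sample, since B returns too late and C never sees the page.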

Google Analytics does not provide data at the user level out of the box. However, with the aid of a process outlined in Simo Ahava’s tremendously useful blog, you can use Google Tag Manager to pull Google’s user and session IDs out of the cookie (also known as Client ID) and feed them back to the interface in a custom dimension or event.  This gives an analyst the ability to report on user activity at the user ID level.

Remember: passing personally identifying information to Google Analytics is a violation of the terms of service, so don’t pass any personally identifying information you might have, like email addresses, to Google.

Below are the steps I use for passing pageview data along with user and session data to the GTM data layer for logging in Google Analytics. You could technically use a slightly different process to pass ecommerce, event, goal, or custom metric/dimension data as well.  I’ll cover that in a later post.  I’ll assume that the reader has already tagged all of their pages with a Google Tag Manager container, but if not, start by reading this post and make sure to tag your pages.

Steps:

  1. Create a Custom Dimension by going to the admin page in your Google Analytics view: Under “Custom Definitions” select “Custom Dimensions”, create a dimension and call it “Client ID” or whatever name you prefer. This dimension will have a scope of “Session”.  Make note of the dimension index (you’ll need to enter that later).
  2. Create a Custom JavaScript Variable in Google Tag Manager and give it a title such as {{Set Client ID in Dimension 1}}.
    Here is the code:

    Make sure to assign the correct index to the customDimensionIndex variable.  If you’ve completed this step correctly, you will be able to see the Client ID being passed under whichever custom dimension you have set it up for in the Google Analytics Debugger tool.

    Client ID being passed into dimension 1
  3. If everything shows up, move back to Google Tag Manager and edit the pageview Tag for your site. Under “Fields to Set”, type “customTask” and under “Value” use the dropdown to select the variable we created in step 2, {{Set Client ID in Dimension 1}}.

    That concludes the first part of the process.  Once you’ve reached this step, you could technically start playing with the Client ID dimension in Google Analytics’ custom reports.

    Pull Client ID Custom Dimension Data

    So we’ve tagged our site to send user and session data to Google Analytics and have dealt with sampling, now for the fun part.  This string pulls page URLs, user and session IDs by date based on the dimensions detailed above.  Where pro

    Run some other scripts

    Using some other scripts, an analyst can answer a number of other questions, like how long it takes a new user to become a repeat user.  These scripts rely heavily on the data.table syntax instead of base R.  Please take a look at my prior post on using data.table to learn why I do so.

    Data Returned

    Then run the rest:

    This will return the number of days on average it takes a new user to become a repeat user.

    calculate number of days for new user to return
    There are a number of other uses for the client ID data in GA.  For instance, a marketer might want to do some attribution modelling, or a content manager might want to know if viewing an article in one session affects subsequent sessions.
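
The repeat-visit calculation described above boils down to a simple piece of logic, sketched here in Python for illustration (it is not the data.table script from this post).  It assumes the data has been reduced to (client ID, session date) pairs and treats the gap between a client’s first and second distinct visit dates as the time to become a repeat user.  The function name and sample rows are made up.

```python
from datetime import date

def average_days_to_return(sessions):
    """Average number of days between each client's first and second visit date.

    `sessions` is a list of (client_id, session_date) pairs. Clients seen on
    only one date are ignored. Returns None if nobody returned.
    """
    dates_by_client = {}
    for client_id, day in sessions:
        dates_by_client.setdefault(client_id, set()).add(day)

    gaps = []
    for days in dates_by_client.values():
        if len(days) > 1:
            first, second = sorted(days)[:2]
            gaps.append((second - first).days)
    return sum(gaps) / len(gaps) if gaps else None

# Hypothetical sessions: A returns after 3 days, B after 7, C never returns.
example_sessions = [
    ("A", date(2017, 3, 1)),
    ("A", date(2017, 3, 4)),
    ("B", date(2017, 3, 2)),
    ("B", date(2017, 3, 9)),
    ("B", date(2017, 3, 20)),
    ("C", date(2017, 3, 5)),
]
average_gap = average_days_to_return(example_sessions)
```

One-time visitors are left out of the average on purpose, since they never became repeat users during the period.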

One consideration around doing this type of analysis is scale.  Most smaller websites won’t pose an issue, but some larger sites (like the one I currently work on) will.  Pulling an individual non-aggregated row for every session, page or event can yield some extremely large datasets.  In this case, it would make sense to send the data to a cloud storage data warehouse such as BigQuery.  Want to learn more on using R to solve for this?  Stay tuned…

3 Ways To Analyze Google Analytics Data in R with RGA and ggplot2

In my opinion, Google Analytics is the single most influential development in marketing analytics ever.  Quantcast estimates that 70% of its top 10,000 websites have GA installed.  Google has shown a relentless drive to improve the product over the years and its free price tag ensures access for almost anyone that runs a website.  With that said, Google Analytics is a service and no service (great or lacking) is without flaws.  One of the hidden advantages GA possesses is a robust API, and this advantage allows users to build some of the features that are missing from the standard interface.  I wanted to cover some of the ways a user could use R to deal with some of the features not available in GA.

In order to use any of these techniques, you will have to install R as well as the rga and dplyr packages, which are available on CRAN.  Other packages used include ggplot2 for visualization, scales, lubridate and zoo.  Use the script below to install.

  1. Event Conversion Rate Script

    One of my gripes with Google Analytics is that the Top Events report includes total event counts but does not include a conversion metric.  If you are using the Google Tag Manager click listening technique to add events to your site by listening for click elements, you could add a bit of custom JavaScript to pass an impression for the same element. In many cases, however, just a simple total event count over the pageview count would suffice.  Here’s a script that grabs that simple metric:

    eventLabel    pagePath   totalEvents   contentGroup1   pageviews   pageconv
    Social Link   /          3             Homepage        9           .3333

    This gives all event parameters (Category, Action and Label) as well as the page URL and content group 1, allowing the user to easily aggregate pages if they are passing content groups.  I strongly encourage using content groupings.
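
The metric itself is just total events divided by pageviews for the same page.  Here is a minimal Python sketch of that join (illustration only; it is not the R script from this post), assuming event rows and pageview rows have already been pulled separately and share a `pagePath` key.  The function name is hypothetical and the sample row mirrors the output above.

```python
def event_conversion_rates(event_rows, pageview_rows):
    """Attach pageviews and an events-per-pageview rate to each event row."""
    pageviews = {row["pagePath"]: row["pageviews"] for row in pageview_rows}
    report = []
    for row in event_rows:
        views = pageviews.get(row["pagePath"], 0)
        # Leave the rate empty rather than divide by zero.
        rate = round(row["totalEvents"] / views, 4) if views else None
        report.append({**row, "pageviews": views, "pageconv": rate})
    return report

# Sample rows shaped like the output above.
events = [{"eventLabel": "Social Link", "pagePath": "/", "totalEvents": 3}]
views = [{"pagePath": "/", "contentGroup1": "Homepage", "pageviews": 9}]
report = event_conversion_rates(events, views)
```

With 3 events over 9 pageviews, the homepage row comes out at 0.3333, matching the table.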

  2. Analyze Acquisition Mediums with ggplot2

    Google Analytics has some good embedded graphs for analyzing traffic mediums and the advent of Google Data Studio gives users even more flexibility. However, sites with high numbers of marketing mediums (10+) will pose issues for these tools.  Using ggplot2 in R allows a user to create what analysts call “small multiples”: a series of similar graphs or charts using the same scale and axes, allowing them to be easily compared.  Below is a script that returns small multiples for a year over year comparison of marketing mediums.

    Small Multiples using ggplot2 and R
  3. Analyze Product Performance, Content Groups or Other Categories with ggplot2

    A user could also use the previous script for small multiples to learn about other categorical data like revenue by product:

    If you haven’t had a chance yet, please read my post on why an analyst should learn R.

    Have any questions or comments???  Let me know in the comments section.