The Missing Link: 3 Steps For Connecting TV & SEM Performance

Columnist Benjamin Vigneron shares his method for attributing changes in SEM performance over time to multiple internal and external variables.

Every marketer who thinks about their marketing mix holistically cares not only about each channel individually, but also about how those channels perform in combination with each other.

While this can get tricky to measure accurately, I’ll share some basic techniques to connect online and offline data — and, more specifically, how marketers can measure the impact of TV and seasonality on their SEM efforts.

1. Pick Relevant Data

Ideally, you’d want to run a test on a significant sample of your audience and compare the results with the rest of your audience. Unfortunately, that is not always possible in real life.

For example, if you run TV ads nationally, you won’t be able to target a randomized sample of the population and compare the results with the rest of the population, so you won’t be able to form nice and tidy test and control groups. Instead, you’ll have to analyze how much of an impact national TV has on your online initiatives over time.

Assuming our response variable is the weekly SEM impression volume we’re getting on a selection of branded search queries on Google and Bing, our first input variable would be how much was spent on TV ads over time. Note that seasonal trends may play a major role in general SEM performance and should almost always be taken into consideration when attributing changes in performance over time.

You essentially want to normalize the data based on seasonal trends — this will prevent you from attributing a change to TV ads when you were actually expecting more volume based on historical seasonal trends.

Similarly, budget changes — whether they are online (SEM, Social advertising, RTB, emailing, etc.) or offline (TV, radio, etc.) — can hugely impact performance over time and should definitely be factored in.

For the purposes of this article, I’ll keep it simple and focus on the following variables: national TV spend and seasonal trends. However, the logic would hold true for more variables, as long as those variables are independent of each other.

In this case, we’ll use the following variables:

  • Response variable Y1: weekly branded SEM impressions
  • Input variable X1: weekly national TV spend
  • Input variable X2: weekly Google Trends index on top non-branded queries, which supposedly reflects market demand
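Before any analysis, the three series need to line up week by week. Here is a minimal Python sketch of that alignment, assuming each series is available as a dictionary keyed by week start date. The function name `align_weekly_series` and all of the numbers are hypothetical and are not taken from the spreadsheet used in this article.

```python
from datetime import date


def align_weekly_series(impressions, tv_spend, trends_index):
    """Join three {week_start: value} dicts on the weeks they all share.

    Returns a list of (week_start, y1, x1, x2) tuples sorted by week.
    Weeks missing from any one series are dropped.
    """
    shared_weeks = sorted(set(impressions) & set(tv_spend) & set(trends_index))
    return [
        (week, impressions[week], tv_spend[week], trends_index[week])
        for week in shared_weeks
    ]


# Hypothetical values for illustration only.
impressions = {
    date(2015, 1, 5): 120_000,
    date(2015, 1, 12): 135_000,
    date(2015, 1, 19): 128_000,
}
tv_spend = {
    date(2015, 1, 5): 50_000.0,
    date(2015, 1, 12): 80_000.0,
}  # no TV spend figure for the third week
trends_index = {
    date(2015, 1, 5): 62,
    date(2015, 1, 12): 70,
    date(2015, 1, 19): 66,
}

rows = align_weekly_series(impressions, tv_spend, trends_index)
for week, y1, x1, x2 in rows:
    print(week, y1, x1, x2)
```

Dropping incomplete weeks is only one possible choice. If a missing TV figure really means nothing was spent, recording it as zero would be more faithful to the data.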

2. Run A Contribution Analysis

The next step is to run a contribution analysis (more specifically a multiple linear regression analysis, in this case) so that we can predict our response variable (i.e., weekly branded SEM impressions) from two independent variables: TV ad spend and seasonal trends. For the sake of this post, let’s use some hard numbers and this downloadable spreadsheet: Actual vs. Modeled (.XLSX file). Say we have nineteen weeks of SEM and TV data, as well as Google Trends data.
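Written out, the model looks like the equation below. The symbols Y1, X1 and X2 come from the variable list in step 1. The coefficient names and the error term are standard regression notation that I am assuming here; they are not labels from the spreadsheet.

```latex
Y_1 = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon
```

Here \beta_0 is the baseline level of weekly branded impressions, \beta_1 is the change in impressions associated with one additional unit of TV spend, \beta_2 is the change associated with one additional point of the Google Trends index, and \varepsilon is whatever the two inputs leave unexplained.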

(Note: We could use R, which is very well suited for this type of analysis. For the sake of this post, however, we’ll just use Excel, which is by far more widely used.)
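If you would rather script the analysis than click through a dialog, the same kind of fit can be sketched in plain Python. This is only an illustration of ordinary least squares with two inputs, not a reproduction of Excel’s internals. The function name `fit_two_input_regression` is hypothetical, and the data is synthetic, built from a known relationship so the result can be checked.

```python
def fit_two_input_regression(y, x1, x2):
    """Ordinary least squares fit of y = b0 + b1*x1 + b2*x2.

    Solves the 3x3 normal equations (X'X) b = X'y by Gauss-Jordan
    elimination with partial pivoting. Returns (b0, b1, b2).
    """
    n = len(y)
    if not (n == len(x1) == len(x2)) or n < 3:
        raise ValueError("need three equal-length series of at least 3 weeks")
    cols = [[1.0] * n, [float(v) for v in x1], [float(v) for v in x2]]
    # Augmented matrix [X'X | X'y].
    aug = [
        [sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(3)]
        + [sum(a * float(t) for a, t in zip(cols[i], y))]
        for i in range(3)
    ]
    scale = max(abs(aug[i][j]) for i in range(3) for j in range(3))
    for col in range(3):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) <= 1e-12 * scale:
            raise ValueError("inputs look collinear; effects cannot be separated")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Clear this column from every other row.
        for r in range(3):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return tuple(aug[i][3] / aug[i][i] for i in range(3))


# Synthetic data generated from y = 1000 + 2*x1 + 50*x2 (hypothetical units).
tv_spend = [0, 10, 20, 0, 30, 10]
trends_index = [50, 55, 60, 65, 70, 75]
impressions = [1000 + 2 * a + 50 * b for a, b in zip(tv_spend, trends_index)]

coefficients = fit_two_input_regression(impressions, tv_spend, trends_index)
print(coefficients)
```

Because the synthetic data contains no noise, the fit should land very close to the coefficients used to generate it. Real weekly data will not be that tidy, which is why the summary statistics discussed next matter.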

Excel offers a “Data Analysis” package, which will help run a multiple regression analysis. Step-by-step instructions are as follows:

  1. Load Excel’s Analysis ToolPak once and for all (see Load the Analysis ToolPak for instructions).
  2. Launch the data analysis package via the “Data” tab.

  • Select “Regression” in the Analysis Tools box.
  • Select your response variable (“Input Y Range”), input variables (“Input X Range”), pick a cell where you want to output the results, such as $G$1, and click “OK.”
  [Figure: regression settings]

    In the regression summary, look for:

    • A high adjusted R squared, that is, a value above roughly 0.6 to 0.8, which would indicate that the combination of TV spend and seasonality explains 60-80% of the week-to-week variation in your branded impressions.

    [Figure: regression summary output]

    • Low p-values for each input variable. By convention, a variable with a p-value greater than 0.05 is not considered statistically significant (its apparent effect might be due to random chance rather than a real relationship).
    • Positive coefficients for each contributing variable. Negative coefficients can occur as a result of collinearity between two input variables: the inputs are correlated with each other (for example, they rise and fall at the same time), so the regression analysis cannot distinguish their individual impacts.
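Two of these checks are easy to reproduce by hand. The sketch below computes the adjusted R squared from actual and modeled values, and the correlation between two input series as a rough collinearity warning. Both function names are hypothetical, the numbers are made up, and p-values are left out because they require a t-distribution that plain Python does not provide.

```python
import math


def adjusted_r_squared(actual, predicted, n_inputs):
    """Adjusted R squared: 1 - (1 - R2) * (n - 1) / (n - k - 1)."""
    n = len(actual)
    mean_actual = sum(actual) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r_squared = 1 - ss_res / ss_tot
    return 1 - (1 - r_squared) * (n - 1) / (n - n_inputs - 1)


def pearson_correlation(xs, ys):
    """Pearson correlation between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up numbers for illustration only.
actual = [1, 2, 3, 4, 5]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
fit_quality = adjusted_r_squared(actual, predicted, n_inputs=2)

tv_spend = [0, 10, 20, 0, 30, 10]
trends_index = [50, 55, 60, 65, 70, 75]
input_correlation = pearson_correlation(tv_spend, trends_index)

print(round(fit_quality, 3), round(input_correlation, 3))
```

A correlation between inputs that is close to 1 or -1 suggests that the coefficients may be unstable. Where to draw the line is a judgment call rather than a fixed rule.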

    [Figure: average variable contribution]

    Of course, this is the best-case scenario, where the data is particularly clean. In real life, you might first need to clean up the data (remove outliers, normalize the data further, add more input variables).

    However, this technique can be very useful for getting a first feel for connecting online and offline data and, more generally, for attributing changes in performance to multiple internal and external variables. You can then test your predictive model, see how accurate it is, and fine-tune it over time.
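One simple way to see how accurate the model is: compare actual and modeled impressions week by week and average the percentage error. The sketch below uses mean absolute percentage error, which is my choice of metric rather than something prescribed above. The function name and the numbers are hypothetical.

```python
def mean_absolute_percentage_error(actual, modeled):
    """Average of |actual - modeled| / |actual|, expressed as a percentage.

    Assumes no actual value is zero, which should hold for weekly
    branded impression counts.
    """
    if len(actual) != len(modeled) or not actual:
        raise ValueError("need two non-empty series of equal length")
    total = sum(abs(a - m) / abs(a) for a, m in zip(actual, modeled))
    return 100 * total / len(actual)


# Made-up weekly impressions for illustration only.
actual_impressions = [100, 200]
modeled_impressions = [110, 180]

error_pct = mean_absolute_percentage_error(actual_impressions, modeled_impressions)
print(f"{error_pct:.1f}% average error")
```

Tracking this figure on weeks the model has not seen gives a more honest read than scoring it on the weeks it was fitted to.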