
How To Get Unsampled Data From Google Analytics Standard Into BigQuery

By: Balkan Misirli
April 24, 2017

 

Sampling is a great statistical tool, but also something of a nuisance for those of us who use Google Analytics. Google tells us that ad-hoc queries are subject to sampling when the number of sessions for the selected date range is above the general limit of 500,000. Over this limit, GA will not process all of your data but only a limited subset of it, and will extrapolate from those results.

Google does not, however, apply this limit to Google Analytics 360 customers.

One way we have developed to get around this sampling issue is to use Google Apps Script and Google Sheets together with Google BigQuery to set up a daily export of data from GA to a table in BigQuery, which we can then query in full to our heart's content.

Step 1: Set Up A BigQuery Account And Create A Project

This is pretty easy to do, given that Google now offers $300 credit to use over 12 months on a trial. Link is here. Once you have signed up, follow the instructions to create a new project. 

 

Step 2: Set Up The Intermediary Google Sheet

Open up a new Google Sheet that you want to use as a staging post for your GA data. Give it a nice name.


When the sheet loads up, click on ‘Tools’ > ‘Script Editor’, which will open the Google Apps Script editor page.


 

Step 3: Script To Automate GA Import To Sheets

Give the script project a new name and delete the placeholder text in the Code.gs file (the ‘function myFunction() { }’ part). You can also rename Code.gs to something more descriptive, such as ‘GA to GS’, using the downward triangle button to the right of ‘Code.gs’ in the file list.

Copy and paste the code below and change the relevant fields to reflect your GA account and the dimensions/metrics that you want to export.

The profile number is the number after the ‘p’ in the URL of your Google Analytics reports. You can also find it in Google Analytics under Admin > View Settings, listed as View ID.

The script will get the data from Google Analytics and put it into a new sheet, which it will place as the first sheet in the document and rename to yesterday’s date.
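For illustration, the renaming logic can be mirrored in plain JavaScript. Note that the actual script uses Apps Script's Utilities.formatDate with the GMT time zone; this hypothetical helper uses the local time zone instead:

```javascript
// Hypothetical plain-JavaScript equivalent of the script's date stamp
// (the real script uses Utilities.formatDate(yestDate, "GMT", "yyyyMMdd");
// this version uses the local time zone instead).
function yesterdayStamp(now) {
  var d = new Date(now.getTime());
  d.setDate(d.getDate() - 1);              // step back one day
  var pad = function (n) { return (n < 10 ? '0' : '') + n; };
  return '' + d.getFullYear() + pad(d.getMonth() + 1) + pad(d.getDate());
}

console.log(yesterdayStamp(new Date(2017, 3, 24))); // "20170423"
```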

Enter the required dimensions as a comma-separated list within single quotes. The names follow the format used in the Dimensions & Metrics Explorer (‘ga:xxx’).

The sort argument can also be configured, or disabled by adding // to the beginning of its line, like the segment and filter arguments (similarly, removing the // re-enables those arguments).

Configure the metrics to be pulled in as a comma-separated list (like the dimensions); the API allows a maximum of 10 metrics per query.

Save the file with Ctrl + S or ‘File’ > ‘Save’.

function GaExport() {
  try {

    var firstProfile = 'XXXXXXXX';  // enter your GA view (profile) ID here
    var results = getReportDataForProfile(firstProfile);
    outputToSpreadsheet(results);
    var yestDate = new Date();
    yestDate.setDate(yestDate.getDate()-1);
    var formattedDate = Utilities.formatDate(yestDate, "GMT", "yyyyMMdd");
    SpreadsheetApp.getActiveSpreadsheet().renameActiveSheet(formattedDate);

  } catch(error) {
    Browser.msgBox(error.message);
  }
}

function getReportDataForProfile(firstProfile) {

  var profileId = firstProfile;      // the view (profile) ID passed in from GaExport().
  var tableId = 'ga:' + profileId;
  var startDate = getLastNdays(1);   // Yesterday.
  var endDate = getLastNdays(1);     // Also yesterday: a single-day report.

  var optArgs = {
    'dimensions': 'ga:date,ga:source,ga:medium,ga:campaign,ga:adContent',      // Comma separated list of dimensions.
    'sort': '-ga:date',                                                       // Sort descending by date.
    //'segment': 'dynamic::ga:isMobile==Yes',                          // Process only mobile traffic.
    //'filters': 'ga:source==google',                                  // Display only google traffic.
    'start-index': '1',
    'max-results': '100000'                                         // Display the first 100000 results.
  };

  // Make a request to the API.
  var results = Analytics.Data.Ga.get(
      tableId,
      startDate,                  
      endDate,                    
      'ga:sessions,ga:pageviews', // Comma separated list of metrics.
      optArgs);

  if (results.getRows()) {
    return results;

  } else {
    throw new Error('No rows returned for this view (profile).');
  }
}

function getLastNdays(nDaysAgo) {
  var today = new Date();
  var before = new Date();
  before.setDate(today.getDate() - nDaysAgo);
  return Utilities.formatDate(before, 'GMT', 'yyyy-MM-dd');
}

// Create a new sheet in Google Sheets, print each column header onto it, then fill the columns with the GA data.

function outputToSpreadsheet(results) {
  var sheet = SpreadsheetApp.getActiveSpreadsheet().insertSheet(0);

  // Print the headers.
  var headerNames = [];
  for (var i = 0, header; header = results.getColumnHeaders()[i]; ++i) {
    headerNames.push(header.getName());
  }
  sheet.getRange(1, 1, 1, headerNames.length)
      .setValues([headerNames]);

  // Print the rows of data.
  sheet.getRange(2, 1, results.getRows().length, headerNames.length)
      .setValues(results.getRows());

}

 

Step 4: Script To Import Data Into BigQuery

Now we add a second script to export the newly created sheet’s contents into a BigQuery table.

You will need a dataset in BigQuery for this. Click the triangle button next to the project name and choose to create a new dataset. Enter a name into the Dataset ID field and leave the other fields as they are.


Back in the Script Editor, I like to put this second import script into a separate script file, via ‘File’ > ‘New’ > ‘Script file’. Again, delete all text within the new file and replace it with the code below.


Replace the projectId, datasetId, tableId and fileId fields with your relevant information. 

tableId refers to the table to which BigQuery will append the data. The script will create this table if it does not already exist.

For the fileId, use the alphanumeric string between the ‘d/’ and the ‘/edit’ in your Google Sheet’s URL.
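To illustrate, that ID can be pulled out of the URL with a small regular expression. This helper is not part of the original script, just a sketch:

```javascript
// Illustrative helper (not in the original script): extract the file ID,
// i.e. the characters between 'd/' and the next '/', from a Sheets URL.
function extractFileId(url) {
  var match = url.match(/\/d\/([^\/]+)/);
  return match ? match[1] : null;
}

console.log(extractFileId('https://docs.google.com/spreadsheets/d/1AbCdEfG123/edit#gid=0'));
// "1AbCdEfG123"
```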

Make sure the ‘fields’ part of the ‘schema’ correctly reflects your table’s fields, in the correct order. You can copy, paste and rename entries to configure this.
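If you have many columns, the ‘fields’ array can also be generated from a list of header names. A hypothetical convenience, assuming every column is typed as STRING as in the script below:

```javascript
// Hypothetical helper (not in the original script): build the schema's
// 'fields' array from column names, typing every column as STRING.
function buildSchemaFields(names) {
  return names.map(function (name) {
    return { name: name, type: 'STRING' };
  });
}
```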


function loadToBQ() {

  var projectId = 'lowercase project ID within quotes';
  var datasetId = 'dataset name within quotes';
  var tableId = 'name of new table within quotes, no spaces';
  var fileId = 'the alphanumeric string in the Google Sheets URL between d/ and /edit (within quotes)';

  // Define our load job.
  var jobSpec = {
    configuration: {
      load: {
        destinationTable: {
          projectId: projectId,
          datasetId: datasetId,
          tableId: tableId
        },
        allowJaggedRows: true,
        writeDisposition: 'WRITE_APPEND',
        schema: {
          fields: [
            {name:'date', type: 'STRING'},
            {name:'source', type: 'STRING'},
            {name:'medium', type: 'STRING'},
            {name:'campaign', type: 'STRING'},
            {name:'adContent', type: 'STRING'},
            {name:'sessions', type: 'STRING'},
            {name:'pageviews', type: 'STRING'}
          ]
        }
      }
    }
  };

  var spreadsheet = SpreadsheetApp.openById(fileId);
  
  var MAX_ROWS = 5000;
  var sheet = spreadsheet.getSheets()[0]; 
  var data = sheet.getDataRange().getValues();
  var csvdata = "";
  for (var row = 1; row < data.length && row < MAX_ROWS + 1; row++) {
    for (var col = 0; col < data[row].length; col++) {
      var cell = data[row][col].toString();
      if (cell.indexOf(",") != -1 || cell.indexOf("\"") != -1) {
        // Quote the cell and double any embedded quotes, per CSV convention.
        csvdata += "\"" + cell.replace(/"/g, '""') + "\"";
      } else {
        csvdata += cell;
      }

      if (col < data[row].length - 1) {
        csvdata += ",";
      }
    }
    csvdata += "\r\n";
  }
  var csvBlob = Utilities.newBlob(csvdata, "application/octet-stream");

  // Execute the load job.
  BigQuery.Jobs.insert(jobSpec, projectId, csvBlob);
}

Save the file.

Please note that this method will append the rows of each Google Sheet to the bottom of the BigQuery table.
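The conversion loop above can be isolated as a plain-JavaScript function for testing outside Apps Script. A sketch that mirrors its behaviour (skip the header row, quote cells containing commas, and double any embedded quotes per CSV convention):

```javascript
// Sketch of the script's Sheets-to-CSV conversion as a standalone function.
function rowsToCsv(data, maxRows) {
  var lines = [];
  // Start at row 1: row 0 holds the headers, which BigQuery's schema supplies.
  for (var row = 1; row < data.length && row < maxRows + 1; row++) {
    var cells = [];
    for (var col = 0; col < data[row].length; col++) {
      var cell = data[row][col].toString();
      if (cell.indexOf(',') !== -1 || cell.indexOf('"') !== -1) {
        cells.push('"' + cell.replace(/"/g, '""') + '"');  // quote and escape
      } else {
        cells.push(cell);
      }
    }
    lines.push(cells.join(','));
  }
  return lines.length ? lines.join('\r\n') + '\r\n' : '';
}
```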

 

Step 5: Turn On API And Authorisations

First, turn on the necessary APIs via ‘Resources’ > ‘Advanced Google services’.


In the dialog that pops up, turn on the Google Analytics API, the BigQuery API and the Google Sheets API.


Then click the link at the bottom that says ‘Google API Console’. In the search box, search for the API name. For Analytics, click on the first result there.


Click the Enable button.


Do the same for the BigQuery API and the Google Sheets API. You can click the left arrow next to the Analytics API name to go back to the menu.

Now return to the Script Editor. Click the triangle on the ‘Select function’ box to open the dropdown, select the GaExport function, then press the play button to its left.


This should prompt you to authorise Apps Script to access your data. Click Review Permissions and then Allow.


Wait a few moments, and the Google Sheet should now contain a new first sheet populated with the data from Google Analytics.

Go to the second script (the BigQuery upload), select the loadToBQ function and press play. This should load the Sheets data into the table in BigQuery.

Now, when you look at the dataset in BigQuery, you should see a shiny new table populated with your Google Analytics data!

 

Step 6: Set Up Time Triggers

Finally, we set these two scripts up to run every day, giving a daily inflow of data into our BigQuery table.

Go to ‘Edit’ > ‘Current project’s triggers’ and click the link that says ‘No triggers set up. Click here to add one now.’ Choose to run the GaExport function on the day timer between 1am and 2am each day. Then add another trigger to run the loadToBQ function on the day timer between 2am and 3am daily. Click Save.


That’s it. From now on, you will have your BigQuery table grow each day based on the previous day’s Google Analytics data - unsampled, and in a format and platform where you can do whatever analysis you like!

Get in touch

To find out more, email us on hello@datarunsdeep.com.au or complete a quick contact form.
