Occasionally, we write R scripts to pull and manipulate data from Google Analytics and put it back into BigQuery. We then usually put these scripts on a virtual machine somewhere and schedule them to run automatically. One of the more finicky parts of this process is the authentication involved.
I tend to use Mark Edmondson's googleAnalyticsR package for all GA-related work. It's awesome. However, if the script that you wrote works fine on your own computer but throws access errors when a cron job tries to run it, read on.
In order to access the API data, R first has to authenticate the session. In your local RStudio it does this in one of two ways:
- It gives you a custom URL which leads to a Google login page, where you manually authenticate and permit googleAnalyticsR to use the API through your account.
- If you have already done this before, the token generated by that process is saved in the .httr-oauth file in your working directory. In that case, R just authenticates automatically and you don't need to worry about it.
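For reference, here is a minimal sketch of that interactive flow. It assumes googleAnalyticsR is installed and that you are sitting at an interactive session. The guard variable `can_run` is my own addition, so the snippet simply does nothing when run from a cron job or on a machine without the package.

```r
# Only attempt the browser flow when the package is installed
# and a person is at the keyboard (can_run is a hypothetical helper flag).
can_run <- requireNamespace("googleAnalyticsR", quietly = TRUE) && interactive()

if (can_run) {
  library(googleAnalyticsR)

  # On first use this sends you to the Google login page;
  # afterwards the cached token is reused.
  ga_auth()

  # Any API call confirms that the session is authenticated.
  accounts <- ga_account_list()
  print(head(accounts))
}
```

Because the flow relies on someone clicking through a login page, it has nothing to fall back on when no cached token is available on the machine.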
However, if you are looking to host this script on a server and have it scheduled to run automatically, this method will not work for you. You'll have to do the following:
- In the GCP console, go to APIs & Services > Credentials > Create Credentials > Service Account Key. Choose the service account that the VM (or whatever resource runs the script) belongs to, and choose JSON as the key type. This generates a service account key file and downloads it to your computer.
- Take this JSON file and upload it to wherever the R script will be running.
- In this JSON file, replace every instance of \n with \\n. This escapes the newline characters so that the script can read the key properly.
- In the directory where the R script will run, create a new file called .Renviron. Inside it, add a single line that sets the environment variable GA_AUTH_FILE to the full path of the JSON key file.
When googleAnalyticsR is loaded, it checks whether a variable with this name exists in the environment before it attempts anything else, and if it does, it authenticates with the key file that the variable points to. Thus, the script should now be able to run non-interactively.
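Here is a rough sketch of the .Renviron step, written in base R so you can check the result before handing things over to cron. It assumes GA_AUTH_FILE should hold the path to the JSON key, which is how I understand googleAnalyticsR to use it. The directory, the key file name and the placeholder key contents are all made up for illustration; substitute your own.

```r
# Hypothetical locations: tempdir() stands in for the directory the script
# runs in, and the key file name is invented. Substitute your own.
script_dir <- normalizePath(tempdir(), winslash = "/")
key_path <- file.path(script_dir, "service-account-key.json")

# Placeholder for the uploaded key, only so this sketch runs end to end.
writeLines('{"type": "service_account"}', key_path)

# The .Renviron entry: one NAME="value" pair per line.
renviron_path <- file.path(script_dir, ".Renviron")
writeLines(sprintf('GA_AUTH_FILE="%s"', key_path), renviron_path)

# R reads .Renviron at startup; readRenviron() loads it into a session
# that is already running, which is handy for checking the file.
readRenviron(renviron_path)

auth_file <- Sys.getenv("GA_AUTH_FILE")
if (!nzchar(auth_file) || !file.exists(auth_file)) {
  stop("GA_AUTH_FILE is unset or does not point to an existing file")
}

# With the variable in place, loading the package should authenticate on its own:
# library(googleAnalyticsR)
```

One thing to watch: R looks for .Renviron in the directory it is started from and, failing that, in your home directory. If the cron entry launches the script from somewhere else, change into the script's directory first so the file is picked up.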
Now, go forth and access some API!