
In many of our upcoming posts, you will be reading about how to connect to Google services using a service account. In this post, we will look at what service accounts are, why you would use them, how to create one, and how to apply permissions for a service account.
When Do I Need To Start Thinking About Service Accounts?
You will need to start considering service accounts if you (or one of your developers or partners) are going to write scripts or applications that are going to interact with your data sitting in the Google Cloud Platform (GCP).
What Is A Service Account?
In order to understand what a service account is, we should compare it to a user account. When someone joins an organisation, a user account is provisioned for them. For example, vxxxx@datarunsdeep.com.au is my user account. This account represents that user, and they will be able to access certain resources based on the permissions given to their account.
A service account, on the other hand, does not represent a person. It instead represents a computational resource (e.g. a script in Google Cloud Functions, a Google App Engine app, a virtual machine, etc.) that wishes to speak to the GCP services housing your data. Your script or application assumes the identity of the service account whenever it calls the APIs.
When you create an application, it can interact with GCP’s services either using your user account or the application’s service account. So why would we prefer a service account over a user account?
Why Use A Service Account?
You will be familiar with logging into applications using your user account. If you have ever clicked on a “Sign in with Google” button, you are signing in with your user account.
Now if the application needs to interact with GCP services, and provided you have granted it the appropriate scopes, it will interact with those services as you.
This should be familiar to all of us: we’ve all logged into various third-party applications with our user accounts. And it all seems harmless. So why do we need to work with service accounts?
There are potential pitfalls in working with a user’s credentials, which is why we prefer service accounts for analytics workloads:
If you need to run an automated job in the background, you will need to store the user’s credentials along with your scripts. First off, that is not great security practice. Second, if the user leaves the organisation or their permissions change, then your automation will fail.
A user, for the purposes of performing their day-to-day tasks, may have a wide range of lenient permissions across a number of products. Errant application code could affect data in these products because it is allowed open access to them via these permissions. Service accounts, on the other hand, can be given restrictive permissions while the user’s permissions are unchanged. Therefore the application is restricted in what it can do, and the user can still perform their day-to-day job without hindrance.
If you need to bring in third-party developers or testers to write your application, you only need to provide them with your service account’s details instead of creating new users for them in your organisation. If you had to create user accounts within your domain for them, they could potentially have access to sensitive organisational information. Service accounts allow you to grant third parties access to specific services without having to bring them into your organisation.
In short, service accounts allow us to better enforce security and reduce reliance on specific individuals.
Scenario: Adding A Service Account To BigQuery
Let’s look at how to set up and apply a service account by walking through one of our more common scenarios. We often automate analytics workloads for our customers utilising their Google BigQuery data and, sometimes, Google Cloud Storage. We use service accounts to authenticate our scripts and are confident the solution will continue running even if our individual engineers’ permissions change.
In this walkthrough we are going to:
Create a service account for our automation application
Apply the correct permissions to the service account
Download and start using the service account’s credentials
Create A Service Account
First, log in to the Google Cloud Platform Console.
Ensure you have selected the correct project in the dropdown box
Using the hamburger menu at the top left, navigate to IAM & admin > Service accounts
Click on + CREATE SERVICE ACCOUNT
Create your service account with the following settings:
- Service account name: <team name>-<resources accessed>-<what your app does>. In our example, we’ve gone with "drd-cloud-bq-gcs-automation" because:
  - drd-cloud: stands for the Data Runs Deep Cloud Services team
  - bq-gcs: this service account will access Google BigQuery and Google Cloud Storage
  - automation: this service account will be used in an application that automates data workloads
- Service account ID: this will default to <service account name>@<your project ID>.iam.gserviceaccount.com
- Service account description: be kind to the administrators of the project and include a description of how this service account is going to be used.
Click CONTINUE
Next, we need to apply permissions for this service account. Because we only want read-only access to Google BigQuery and Google Cloud Storage, we give the service account the BigQuery Data Viewer, BigQuery Job User (needed to run queries), and Storage Object Viewer roles.
Click CONTINUE
Click DONE
You have just created your first service account.
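For reference, each role chosen in the console has a corresponding IAM role ID. The sketch below shows roughly how the same grants would look as IAM policy bindings. The project ID "my-project" is a placeholder and the helper names are ours for illustration; they are not part of any Google library.

```python
# Console role titles mapped to their IAM role IDs.
ROLE_IDS = {
    "BigQuery Data Viewer": "roles/bigquery.dataViewer",
    "BigQuery Job User": "roles/bigquery.jobUser",
    "Storage Object Viewer": "roles/storage.objectViewer",
}


def service_account_email(account_id: str, project_id: str) -> str:
    """Build the email address GCP assigns to a service account."""
    return f"{account_id}@{project_id}.iam.gserviceaccount.com"


def read_only_bindings(account_id: str, project_id: str) -> list:
    """Return one IAM policy binding per role for the service account."""
    # IAM identifies service accounts with the "serviceAccount:" prefix.
    member = "serviceAccount:" + service_account_email(account_id, project_id)
    return [{"role": role, "members": [member]} for role in ROLE_IDS.values()]


# "my-project" is a placeholder project ID (an assumption for illustration).
for binding in read_only_bindings("drd-cloud-bq-gcs-automation", "my-project"):
    print(binding["role"], "->", binding["members"][0])
```

Seeing the grants as data makes it easier to review exactly what the service account is allowed to do.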
Obtain The Credentials For Your Service Account
In order for your application or script to access these services as the service account, it will need the service account’s credentials.
Using the hamburger menu at the top left, navigate to IAM & admin > Service accounts
Find the service account that you’re interested in and click on the three dots next to it
Click on Create key
Select JSON and then click on CREATE
A JSON file will be downloaded to your machine. These are the credentials you will use in your code to authenticate with Google Cloud Platform’s services.
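It can help to check what was downloaded before wiring it into an application. The sketch below uses only the Python standard library to load a key file and report which service account it belongs to, without printing the private key. The field names (type, project_id, private_key, client_email) are ones found in JSON key files; the function name is ours for illustration.

```python
import json
from pathlib import Path

# Fields we expect to find in a service account JSON key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def describe_key_file(path: str) -> dict:
    """Load a key file and return its non-secret identifying fields."""
    info = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [field for field in REQUIRED_FIELDS if field not in info]
    if missing:
        raise ValueError(f"key file is missing fields: {missing}")
    if info["type"] != "service_account":
        raise ValueError(f"unexpected credential type: {info['type']!r}")
    # Deliberately leave out private_key so it never ends up in logs.
    return {"project_id": info["project_id"], "client_email": info["client_email"]}
```

Calling describe_key_file with the path to the downloaded file returns the project ID and the service account's email address, which is a quick way to confirm you have the key you expected.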
Use Your Service Account Credentials
Now that you have your service account credentials, it’s time to put them to use! Our future blog posts will go into more in-depth use cases, but for this article let’s look at connecting to BigQuery and running the simplest of queries (i.e. SELECT 1).
With Python, you store the credential file somewhere your code can read it (but outside version control). Set the environment variable GOOGLE_APPLICATION_CREDENTIALS to point to the file. Once you have done that, Google Cloud’s client libraries will automatically take care of authenticating the service account with the services you are accessing.
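A minimal sketch of that flow follows. It assumes the google-cloud-bigquery package is installed (pip install google-cloud-bigquery) and that GOOGLE_APPLICATION_CREDENTIALS has already been set in your shell. The helper names are ours for illustration; bigquery.Client() is what reads the credentials from the environment variable.

```python
import os


def credentials_path() -> str:
    """Return the key file path from the environment, or fail with a clear message."""
    path = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    if not os.path.isfile(path):
        raise RuntimeError(f"credentials file not found: {path}")
    return path


def run_select_one() -> int:
    """Run SELECT 1 against BigQuery as the service account."""
    # Imported here so credentials_path() works without the library installed.
    from google.cloud import bigquery

    credentials_path()  # fail early if the environment is not configured
    client = bigquery.Client()  # authenticates via GOOGLE_APPLICATION_CREDENTIALS
    rows = client.query("SELECT 1 AS one").result()  # waits for the job to finish
    return next(iter(rows))["one"]
```

Calling run_select_one() from your script should return 1, provided the service account is allowed to run query jobs in the project (the BigQuery Job User role in our walkthrough).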
Checkpoint
By this point you should:
Have an understanding of what service accounts are
Have created your first service account
Be able to download service account credentials
Be able to connect your application or script to GCP services using the service account credentials
Keep reading below for suggested best practices and other things you can do with service accounts.
Best Practices For Service Accounts
Here are our recommendations when it comes to creating and managing service accounts:
Ensure the names of your service accounts are intuitive and easy to read. Administrators who are looking through your GCP IAM settings should be able to gauge what the purpose of a service account is.
Define and stick to a naming convention for your service accounts. The key is to be consistent in naming over time. We usually recommend a pattern that looks like "<team_name>-<resources_accessed>-<purpose>". For example, if our Cloud team requires Google BigQuery and Google Cloud Storage access to automate data uploads, we would provision a service account named "drd-cloud-bq-gcs-autoupload@XXXX.iam.gserviceaccount.com".
Be as restrictive as possible. Give the service accounts only the permissions that they need; nothing more.
Please please please DO NOT commit your credentials files to your source repositories.
Your application documentation should include a list of which service accounts it relies on. When new developers join the project, they will know which account credentials they need to download in order to get the application working in their local environment.
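A naming convention is easier to keep when a script checks it. The sketch below builds an ID from the three parts of our pattern and checks it against the constraints GCP places on service account IDs (6 to 30 characters; lowercase letters, digits and hyphens; starting with a letter). The function name is hypothetical.

```python
import re

# Service account IDs start with a lowercase letter, continue with lowercase
# letters, digits or hyphens, and do not end with a hyphen.
_ID_PATTERN = re.compile(r"[a-z][-a-z0-9]*[a-z0-9]")


def build_account_id(team: str, resources: str, purpose: str) -> str:
    """Join the parts as <team>-<resources>-<purpose> and validate the result."""
    account_id = "-".join(part.strip().lower() for part in (team, resources, purpose))
    if not 6 <= len(account_id) <= 30:
        raise ValueError(f"{account_id!r} must be 6 to 30 characters long")
    if not _ID_PATTERN.fullmatch(account_id):
        raise ValueError(f"{account_id!r} contains characters that are not allowed")
    return account_id


print(build_account_id("drd-cloud", "bq-gcs", "autoupload"))
# prints: drd-cloud-bq-gcs-autoupload
```

The 30-character limit is worth keeping in mind when choosing team and purpose abbreviations, since descriptive names reach it quickly.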
Things We Haven’t Covered Here, But You Can Do
Service accounts can access resources across projects. The example above assumes the service account and the resources it is accessing reside in the same project. Just like with users, however, you can add service accounts to any project through that project’s IAM facilities.
You can programmatically provision and configure service accounts via the IAM API (using one of the GCP client libraries or gcloud).
You can monitor service account usage in Stackdriver (since rebranded as Cloud Monitoring and Cloud Logging). This is handy for verifying that they are being used to access the appropriate resources and only by the right applications.
TL;DR
What is a service account? A service account represents a computational resource (e.g. application script, virtual machine instance, etc.) instead of a user.
Why use a service account? Using a service account is a better security practice. It is also more robust in the long term, as your code doesn’t have to rely on specific users being around in the organisation.
How do I create a service account in GCP? Go to the Cloud Console and navigate to IAM & admin > Service accounts to get started. For full details, see the instructions above.
How do I use my service account credentials? Download the JSON key file for the service account, and set the environment variable GOOGLE_APPLICATION_CREDENTIALS to point to it. The GCP client library in use will then take care of the authentication.
Useful resources
Here are some useful resources on service accounts:
Google Cloud Identity and Access Management > Understanding service accounts
git-secrets is a useful utility to prevent you from accidentally uploading credentials to git
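In the same spirit, a much simpler check can be scripted by hand. The sketch below walks a directory and flags JSON files that look like service account keys. It only illustrates the idea and is not a replacement for git-secrets; the function name is hypothetical.

```python
import json
from pathlib import Path


def find_key_files(root: str) -> list:
    """Return paths of JSON files under root that look like service account keys."""
    suspects = []
    for path in sorted(Path(root).rglob("*.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable or not valid JSON, so not a key file
        looks_like_key = (
            isinstance(data, dict)
            and data.get("type") == "service_account"
            and "private_key" in data
        )
        if looks_like_key:
            suspects.append(str(path))
    return suspects
```

Running find_key_files(".") at the root of a repository before committing gives a list of files that should probably be moved out of the source tree.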