We love Google Cloud Functions for its simplicity in allowing us to quickly deploy code without having to worry about servers, networking, and security.
However, it does get tricky when we need to forecast how much these Cloud Functions are going to cost us over time. To do that, we need to know how much memory each function typically consumes, how much processing power it uses, and how long it runs. The reporting of these metrics is not very user-friendly.
Enter Stackdriver and Google BigQuery to the rescue. Stackdriver logging is an awesome tool, made even more awesome by the fact that you can easily export all your logs to BigQuery, as described here.
If you have the Stackdriver to BigQuery integration set up for Cloud Functions logs, you may be interested in knowing the total number of milliseconds your Cloud Functions have run over the last day/month/year. Along with other aggregate values (such as average execution time), you can then dig deeper to discover function invocations that took longer than usual and perhaps diagnose why. With all this information at hand, we are able to forecast costs with a higher level of confidence.
The query is below. It extracts the number of milliseconds each function invocation took to complete and casts it to an integer. The query covers the period from 1 January to 15 March 2019, but you can change that as required. You will also need to replace <cf_name> with the actual name of your Cloud Function.
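If you would like to check the extraction logic outside BigQuery, here is a minimal Python sketch of the same idea: pull the millisecond count out of each log line, cast it to an integer, and aggregate. It assumes the completion line looks like "Function execution took 120 ms, finished with status: 'ok'"; that wording, and the helper names `extract_duration_ms` and `summarize_durations`, are assumptions for illustration, so adjust the pattern to whatever your own logs contain.

```python
import re

# Assumed wording of the completion line written to the function's logs.
# Adjust this pattern if your log lines look different.
_DURATION_RE = re.compile(r"Function execution took (\d+) ms")


def extract_duration_ms(text_payload):
    """Return the execution time in milliseconds as an int.

    Returns None when the line is not a completion line.
    """
    match = _DURATION_RE.search(text_payload)
    return int(match.group(1)) if match else None


def summarize_durations(text_payloads):
    """Aggregate execution times over an iterable of log lines."""
    durations = [
        d for d in map(extract_duration_ms, text_payloads) if d is not None
    ]
    total = sum(durations)
    return {
        "invocations": len(durations),
        "total_ms": total,
        "average_ms": total / len(durations) if durations else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical sample lines, not real log data.
    sample = [
        "Function execution started",
        "Function execution took 120 ms, finished with status: 'ok'",
        "Function execution took 80 ms, finished with status: 'ok'",
    ]
    print(summarize_durations(sample))
```

Lines that do not match the pattern are simply skipped, so you can feed in every log line for a function without filtering first.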
Here is another query that will first identify which function invocations contained at least one log entry with “ERROR” level severity, then calculate the percentage of erroneous invocations for each individual Google Cloud Function, finally displaying these results in a descending list. This way, you can easily discern which of your Cloud Functions need some more love.
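To make the logic concrete, here is a rough Python sketch of the same calculation on rows that have already been pulled out of the logs. The row shape `(function_name, execution_id, severity)` and the name `error_rates` are assumptions for illustration only; your exported table will have its own column names.

```python
from collections import defaultdict


def error_rates(log_rows):
    """Return (function_name, error_percentage) pairs, worst first.

    log_rows is an iterable of (function_name, execution_id, severity)
    tuples. This row shape is hypothetical, chosen for the example.
    """
    all_ids = defaultdict(set)    # every invocation seen per function
    error_ids = defaultdict(set)  # invocations with at least one ERROR entry

    for function_name, execution_id, severity in log_rows:
        all_ids[function_name].add(execution_id)
        if severity == "ERROR":
            error_ids[function_name].add(execution_id)

    rates = [
        (name, 100.0 * len(error_ids[name]) / len(ids))
        for name, ids in all_ids.items()
    ]
    # Descending order, so the functions needing attention come first.
    return sorted(rates, key=lambda rate: rate[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical rows, not real log data.
    rows = [
        ("resize-image", "exec-1", "INFO"),
        ("resize-image", "exec-1", "ERROR"),
        ("resize-image", "exec-2", "INFO"),
        ("send-email", "exec-3", "INFO"),
    ]
    for name, percentage in error_rates(rows):
        print(f"{name}: {percentage:.1f}% of invocations had an error")
```

Using sets of execution IDs means an invocation with several ERROR entries is still counted only once.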
Finally, here is another query that lists all function invocations that took longer than 130% of the mean execution time to finish, along with the timestamp of each invocation. This allows you to diagnose whether there are particular times when the function tends to run slowly or fail more often.
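The 130% threshold is easy to reproduce in a few lines of Python if you want to experiment with a different cutoff. This is only a sketch: the `(timestamp, duration_ms)` pair shape and the name `slow_invocations` are assumptions for illustration.

```python
from statistics import mean


def slow_invocations(invocations, threshold=1.3):
    """Return the invocations slower than threshold times the mean duration.

    invocations is a list of (timestamp, duration_ms) pairs. This shape is
    hypothetical, chosen for the example.
    """
    if not invocations:
        return []
    cutoff = threshold * mean(duration for _, duration in invocations)
    return [(ts, duration) for ts, duration in invocations if duration > cutoff]


if __name__ == "__main__":
    # Hypothetical data, not real log output.
    sample = [
        ("2019-01-05T10:00:00Z", 100),
        ("2019-01-05T10:05:00Z", 100),
        ("2019-01-05T23:59:00Z", 400),
    ]
    for timestamp, duration in slow_invocations(sample):
        print(f"{timestamp}: {duration} ms")
```

In this sample the mean is 200 ms, so the cutoff is 260 ms and only the 400 ms invocation is reported.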
Happy log mining!