BigQuery Cost per Query Visibility with Labels

by Vantage Team


BigQuery Cost Per Query

BigQuery ranks at number three in top Google Cloud spending. Given it is a high contributor to cloud costs, users are looking for ways to cut costs by optimizing their expensive queries. However, pinpointing which specific queries are driving the highest costs can be a challenge. This is where labels come into play.

By leveraging labels on query jobs you can filter to see the spending of each query job, thereby gaining insight into which queries may need to be optimized.

BigQuery Pricing

BigQuery pricing is relatively simple, primarily consisting of compute pricing (the cost to process queries) and storage pricing (the cost to store data). It gets a little more complicated if you consider additional features or the free tier. For our purposes, we’re just going to focus on on-demand compute pricing.

Price Free Tier
$6.25 per TiB 1 free TiB per month

BigQuery on-demand compute pricing

For companies running many queries, these costs can add up and users look to understand ways to cut costs on costly queries. The challenge lies in the fact that BigQuery does not provide a direct way to view the cost associated with each query. While you can see the overall cost incurred during a particular period or try to estimate it yourself, breaking down these costs on a per-query basis is not readily available through the native BigQuery interface.

BigQuery Labels

Labels can be applied to query jobs and provide an excellent way to filter and gain insights into your query costs. The difference between query jobs and queries in BigQuery are as follows:

  • Query: The SQL statement or request that you submit to the system to retrieve, manipulate, or analyze data.
  • Query Job: The complete process of executing a query, which includes the submission of the query, its processing by BigQuery’s infrastructure, and the retrieval of the results.

So by adding a label to a query job you can track the total cost associated with queries in that query job label. However, be mindful of some limitations when determining if it is the best solution.

  • Non-Retroactive: You cannot apply the labels retroactively as they can’t be added or updated for pending, running, or completed jobs.
  • Manual Labeling: While adding labels is straightforward, it can be a tedious process for large amounts of query jobs because you will need to add labels to query jobs individually.

Labels are formatted KEY:VALUE, where keys must be unique. Keys and values “can contain only lowercase letters, numeric characters, underscores, and dashes”.

Add Labels

Labels can be added from the bq command-line tool or API.

Pre-Requisites

Before you can start adding labels, ensure that you have been granted the necessary access permission, the BigQuery User (roles/bigquery.user) IAM role.

Add Via API

Use the jobs.insert method and populate the labels property in the job configuration.

Add Via bq

Add the --label flag to your bq query` command.

Example - Add Via bq

This query retrieves NYC 311 calls from a public dataset on a given date. By adding the --label flag, the query will be identified with the “purpose” key and “6-15-311-analysis” value.

bq query --nouse_legacy_sql --label=purpose:6-15-311-analysis \
'SELECT
  COUNT(*) as `311_count`
FROM
  `bigquery-public-data.new_york_311.311_service_requests`
WHERE
  DATE(`created_date`) = DATE(2021, 06, 15)'

View Labels In Vantage

To view queries with the labels use the filter feature to sort by the tag(s) or group by your tag.

BigQuery filter by tag

BigQuery filter by tag

After applying you will be able to see the cost for queries with those filters.

BigQuery group by tag

BigQuery group by tag

Note there may be a two day delay in seeing resources appear.

Conclusion

Optimizing query costs in BigQuery is essential for cost control and efficient resource utilization. The ability to add labels to your query jobs provides a valuable tool for tracking and managing costs. By identifying the most expensive queries, you can proactively optimize them and cut unnecessary expenses.