Google BigQuery: Truncate date and timestamp – Example with sample data included

Google Big Query @ Freshers.in

When dealing with data in Google BigQuery, especially time-series data, it’s common to want to truncate dates and timestamps to a specific level of precision. For instance, you might want to round a timestamp down to the start of the hour or the day. This guide will provide a step-by-step walkthrough on how to truncate dates and timestamps in Google BigQuery.

Step 1: Set up the table

Let’s first create a table named freshers_in_page_views which will contain the page view counts and their associated timestamps.

CREATE TABLE freshers_in_page_views (
    view_id INT64,
    view_timestamp TIMESTAMP,
    view_count INT64
);

Step 2: Insert sample data

For the sake of demonstration, we’ll insert a few rows of sample data into our table.

Insert Statement:

INSERT INTO freshers_in_page_views (view_id, view_timestamp, view_count)
VALUES
(1, TIMESTAMP "2023-09-14 15:43:22 UTC", 120),
(2, TIMESTAMP "2023-09-14 12:05:45 UTC", 89),
(3, TIMESTAMP "2023-09-14 09:15:00 UTC", 155);

Step 3: Truncate Date and Timestamp

Now, let’s look at some common truncation scenarios:

Sample data:

view_id | view_timestamp             | view_count
------- | -------------------------- | ----------
1       | 2023-09-14 15:43:22 UTC    | 120
2       | 2023-09-14 12:05:45 UTC    | 89
3       | 2023-09-14 09:15:00 UTC    | 155

Truncate to the start of the day:

This will get you the date without the time portion.

SELECT
    view_id,
    DATE(view_timestamp) AS truncated_date,
    view_count
FROM freshers_in_page_views;
Result
view_id | truncated_date | view_count
------- | -------------- | ----------
1       | 2023-09-14     | 120
2       | 2023-09-14     | 89
3       | 2023-09-14     | 155

Truncate to the start of the hour:

This will give you the timestamp rounded down to the start of the hour.

SELECT
    view_id,
    TIMESTAMP_TRUNC(view_timestamp, HOUR) AS truncated_hour,
    view_count
FROM freshers_in_page_views;
Result
view_id | truncated_hour            | view_count
------- | ------------------------- | ----------
1       | 2023-09-14 15:00:00 UTC   | 120
2       | 2023-09-14 12:00:00 UTC   | 89
3       | 2023-09-14 09:00:00 UTC   | 155
Truncate to the start of the month:

To get the timestamp at the start of the month.

SELECT
    view_id,
    TIMESTAMP_TRUNC(view_timestamp, MONTH) AS truncated_month,
    view_count
FROM freshers_in_page_views;
Result
view_id | truncated_month           | view_count
------- | ------------------------- | ----------
1       | 2023-09-01 00:00:00 UTC   | 120
2       | 2023-09-01 00:00:00 UTC   | 89
3       | 2023-09-01 00:00:00 UTC   | 155

Truncating dates and timestamps is essential when you need to aggregate data over specific periods or when you want to simplify the visual representation of time-series data. Google BigQuery offers functions like DATE() and TIMESTAMP_TRUNC() to make this process straightforward and efficient.

Author: user

Leave a Reply