Formatting timestamp data according to a specified string format in Google Bigquery

Google Big Query @ Freshers.in

Google BigQuery addresses this challenge with the FORMAT_TIMESTAMP() function, allowing users to format timestamps into more comprehensible and standardized outputs. This article offers an extensive overview of FORMAT_TIMESTAMP(), supplemented with executable examples in BigQuery.

Understanding FORMAT_TIMESTAMP():

The FORMAT_TIMESTAMP() function in Google BigQuery serves to format timestamp data according to a specified string format. It converts a TIMESTAMP data type to a STRING data type, presenting the date and time in a format that’s more readable or that fits certain criteria.

The syntax for FORMAT_TIMESTAMP() is as follows:

FORMAT_TIMESTAMP(format_string, timestamp_expression [, timezone])

format_string: The string that dictates how the timestamp should be formatted, defined using specific format elements (like %Y for year, %m for month, etc.).

timestamp_expression: The TIMESTAMP data type that you wish to format.

timezone (optional): The time zone to consider for the formatting. If not specified, BigQuery uses the default time zone set at the project level.

Creating Sample Data: For demonstration purposes, we’ll create sample data in BigQuery. Execute the following commands in the BigQuery console to set up your dataset and table:

-- Create a dataset
CREATE SCHEMA IF NOT EXISTS time_data;

-- Create a table
CREATE TABLE IF NOT EXISTS time_data.UserLogs (
  id INT64,
  activity STRING,
  activity_timestamp TIMESTAMP
);

-- Insert sample data
INSERT INTO time_data.UserLogs (id, activity, activity_timestamp) VALUES
  (1, 'Login', TIMESTAMP("2023-01-01 08:30:00 UTC")),
  (2, 'Purchase', TIMESTAMP("2023-04-15 12:45:00 UTC")),
  (3, 'Logout', TIMESTAMP("2023-07-20 22:10:00 UTC"));

Example

Suppose we want to present the ‘activity_timestamp’ in a more readable format, like “YYYY-MM-DD hh:mm:ss”, and convert the time to the “America/Los_Angeles” timezone. We would use the FORMAT_TIMESTAMP() function as follows:

SELECT 
  id, 
  activity, 
  FORMAT_TIMESTAMP('%Y-%m-%d %H:%M:%S', activity_timestamp, "America/Los_Angeles") AS formatted_timestamp
FROM 
  time_data.UserLogs;

This query will format the ‘activity_timestamp’ into the desired structure and convert it from UTC to the specified timezone.

In the ‘format_string’ parameter, different elements dictate how the timestamp’s components should be displayed. For example, ‘%Y’ represents the four-digit year, ‘%m’ represents the two-digit month, and ‘%d’ represents the two-digit day. There are many such elements available, and you can consult the BigQuery documentation for a comprehensive list.

Author: user