The STRUCT data type in Google BigQuery enables you to encapsulate multiple fields, possibly of different data types, into one complex structure. Think of it as a mini-table within a table, where each field in the STRUCT has both a name and a value. This allows for more organized and intricate storage of related data fields. The real power of the STRUCT data type is felt when dealing with semi-structured data or when needing to encapsulate multiple related attributes into one column.
How to Define and Use STRUCTs in Tables
To define a STRUCT, you must specify both field names and their corresponding data types.
CREATE TABLE freshers_in_most_visited_pages (
id INT64,
page_name STRING,
visitor_info STRUCT<
name STRING,
age INT64,
visit_count INT64
>
);
Insert Statement
INSERT INTO freshers_in_most_visited_pages (id, page_name, visitor_info)
VALUES
(1, 'Home', STRUCT("John Doe", 28, 5)),
(2, 'About', STRUCT("Jane Smith", 25, 3)),
(3, 'Contact', STRUCT("Alan Walker", 32, 2)),
(4, 'Services', STRUCT("Emily Stone", 29, 4));
Querying Fields Within a STRUCT
Once your data is structured in the STRUCT
data type, querying specific fields within it requires a slightly different approach.
Example 1: Accessing Specific STRUCT Fields
To get the name of the visitor for each page:
SELECT page_name, visitor_info.name AS visitor_name
FROM freshers_in_most_visited_pages;
Filtering using STRUCT Fields
To fetch pages visited more than 3 times:
SELECT page_name, visitor_info.name AS visitor_name, visitor_info.visit_count
FROM freshers_in_most_visited_pages
WHERE visitor_info.visit_count > 3;
Output
+-----------+--------------+-------------+
| page_name | visitor_name | visit_count |
+-----------+--------------+-------------+
| Home | John Doe | 5 |
| Services | Emily Stone | 4 |
+-----------+--------------+-------------+
BigQuery import urls to refer