How to copy data from Redshift to Hive ? How to unload from Redshift in Parquet format ? Explained with Sample code .

user August 13, 2021 Leave a Comment

There may be some business scenario to copy daily data from Redshift to Hive. For the compression , it is always preferable to have the data in Parquet format. Before Dec 2019 , redshift don’t have the capability of unloading data to parquet format.

From Dec 2019 you can unload the result of an Amazon Redshift query to your Amazon S3 data lake as Apache Parquet, an efficient open columnar storage format for analytics. The great advantage is Parquet format is up to 2x faster to unload and consumes up to 6x less storage in Amazon S3, compared to text formats. During unload you can specify one or more partition columns so that unloaded data is automatically partitioned into folders in your Amazon S3 bucket.

UNLOAD (‘select * from hrm.my_freshers_in_data where as_on_date = ”2021-08-01”’)
to ‘s3://sem-freshers-in-us-east-1-daily/hrm/base/hrm.my_freshers_in_data’ iam_role ‘arn:aws:iam::100988878888:role/FRESHERS_IN_Redshift_S3_Role’ PARQUET ALLOWOVERWRITE PARTITION BY (as_on_date);

PARQUET with ENCRYPTED is only supported with server-side encryption with an AWS Key Management Service key (SSE-KMS).By default, each row group is compressed using SNAPPY compression. During unload you can’t use PARQUET with DELIMITER, FIXEDWIDTH, ADDQUOTES, ESCAPE, NULL AS, HEADER, GZIP, BZIP2, or ZSTD.

Reference
Amazon Redshift introducing Apache Parquet
Redshift Unload Command

Post Views: 134

Author: user

How to copy data from Redshift to Hive ? How to unload from Redshift in Parquet format ? Explained with Sample code .

Leave a Reply Cancel reply

Trending

Recent Posts

Featured Posts – Slider Widget

AWS EC2 vs Azure Virtual Machines

Production and Industrial Engineering

Engineering Technical campus placement question and answers

JavaScript’s reduceRight() method to iterate over an array from right to left

Merging Multiple Images into a Single PDF File Using Python

Nanotechnology

Electronics and Instrumentation

Chemical Engineering

Civil Engineering

Backpressure in AWS Kinesis Streams: Optimizing Data Processing

Most Viewed Posts

Related Posts

Related Articles

Leave a Reply Cancel reply

Trending

Recent Posts

Featured Posts – Slider Widget