Hive : Optimizing queries using Materialized Views using REWRITE option

Apache Hive is a popular data warehousing tool built on top of Hadoop for managing and querying large datasets. Among various features and options, one that stands out is the REWRITE option, especially useful in optimizing queries using Materialized Views (MVs).

Materialized Views provide a way to precompute and store aggregated or processed data for more efficient querying. With the REWRITE option, Hive can automatically utilize these Materialized Views when executing queries, reducing processing time and resources.

In this article, we will explore the REWRITE option in Hive, focusing on how to create tables using the freshers_in_ prefix and leverage Materialized Views.

1. Creating Tables with freshers_in_ Prefix

Let’s start by creating a table named “freshers_in_employees” with necessary columns:

CREATE TABLE freshers_in_employees (
    id INT,
    name STRING,
    department STRING,
    salary FLOAT
)
STORED AS ORC;

2. Materialized Views in Hive

Materialized Views are stored queries that contain the result set of a query. They can be leveraged by Hive’s query optimizer to rewrite queries, automatically substituting parts of a user’s query with the precomputed results in the Materialized View.

For instance, let’s create a Materialized View that contains the average salary per department from the “freshers_in_employees” table:

CREATE MATERIALIZED VIEW freshers_in_avg_salary_by_department
AS SELECT department, AVG(salary) AS avg_salary
FROM freshers_in_employees
GROUP BY department;

3. Query Optimization with REWRITE

Once the Materialized View is created, Hive’s query optimizer can use it to rewrite suitable queries, automatically substituting them with the Materialized View. The REWRITE option enables this behavior.

Consider the following query, which retrieves the average salary by department:

SELECT department, AVG(salary) AS avg_salary
FROM freshers_in_employees
GROUP BY department;

With the REWRITE option enabled, Hive can automatically rewrite this query to:

SELECT department, avg_salary
FROM freshers_in_avg_salary_by_department;

This rewriting reduces the computation required to answer the query, as the aggregation is already computed and stored in the Materialized View.

4. Enabling and Disabling Automatic Query Rewrite

The automatic query rewrite functionality can be controlled using the hive.materializedview.rewriting configuration parameter. It can be set to true or false to enable or disable the feature, respectively.

Enable:

SET hive.materializedview.rewriting=true;

Disable:

SET hive.materializedview.rewriting=false;

5. Manual Control of REWRITE

In some cases, developers may wish to manually control the rewriting process. This can be done using the EXPLAIN REWRITE command to see how the optimizer would rewrite a given query using the Materialized Views.

EXPLAIN REWRITE
SELECT department, AVG(salary) AS avg_salary
FROM freshers_in_employees
GROUP BY department;

The REWRITE option in Hive represents an advanced optimization technique, enhancing query performance through the utilization of Materialized Views.

Hive important pages to refer

Post Views: 11

Hive : Optimizing queries using Materialized Views using REWRITE option

2. Materialized Views in Hive

3. Query Optimization with REWRITE

4. Enabling and Disabling Automatic Query Rewrite

5. Manual Control of REWRITE

Leave a Reply Cancel reply

Trending

Recent Posts

Featured Posts – Slider Widget

AWS EC2 vs Azure Virtual Machines

Production and Industrial Engineering

Engineering Technical campus placement question and answers

JavaScript’s reduceRight() method to iterate over an array from right to left

Merging Multiple Images into a Single PDF File Using Python

Nanotechnology

Electronics and Instrumentation

Chemical Engineering

Civil Engineering

Backpressure in AWS Kinesis Streams: Optimizing Data Processing

Most Viewed Posts

2. Materialized Views in Hive

3. Query Optimization with REWRITE

4. Enabling and Disabling Automatic Query Rewrite

5. Manual Control of REWRITE

Related Articles

Leave a Reply Cancel reply

Trending

Recent Posts

Featured Posts – Slider Widget