Hive interview questions

Hive @ Freshers.in

57. What will be the result when you do cast(‘abc’ as INT) ?
Hive will return NULL

58. Can the name of a view be same as the name of a hive table ?
No. The name of a view must be unique when compared to all other tables and views present in the same database.
CREATE VIEW [IF NOT EXISTS] view_name [(column_name [COMMENT column_comment], …) ]
[COMMENT table_comment]
AS SELECT …
hive> CREATE VIEW emp_30000 AS
SELECT * FROM employee
WHERE salary>30000;

59. Can we LOAD data into a view ?
No. A view can not be the target of a INSERT or LOAD statement.

60. What types of costs are associated in creating index on hive tables ?
Indexes occupies space and there is a processing cost in arranging the values of the column on which index is created.

61. Give the command to see the indexes on a table ?
SHOW INDEX ON table_name
This will list all the indexes created on any of the columns in the table table_name.

62. What is bucketing ?
The values in a column are hashed into a number of buckets which is defined by user. It is a way to avoid too many partitions or nested partitions while ensuring optimizes query output.

63. What does /*streamtable(table_name)*/ do ?
It is query hint to stream a table into memory before running the query. It is a query optimization Technique.
Fortunately, you don’t have to put the largest table last in the query. Hive also provides
a “hint” mechanism to tell the query optimizer which table should be streamed:
SELECT /*+ STREAMTABLE(s) */ s.ymd, s.symbol, s.price_close, d.dividend
FROM stocks s JOIN dividends d ON s.ymd = d.ymd AND s.symbol = d.symbol
WHERE s.symbol = ‘AAPL’;
Now Hive will attempt to stream the stocks table, even though it’s not the last table in the query.

“STREAMTABLE” hint is not longer required. Moreover, if you enable CBO and run analyze commands for table/columns, TEZ execution engine will optimize the plan and dynamically determine which table to stream.

Author: user

Leave a Reply