Hive interview questions


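1. What is the syntax for creating a table in Hive?
CREATE TABLE records (year STRING, temperature INT, quality INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
The ROW FORMAT clause tells Hive that each row in the data file is tab-delimited text.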

2. How do you populate a Hive table with data?
LOAD DATA LOCAL INPATH 'input/ncdc/micro-tab/sample.txt'
OVERWRITE INTO TABLE records;
Running this command tells Hive to put the specified local file in its warehouse directory. The OVERWRITE keyword in the LOAD DATA statement tells Hive to delete any existing files in the directory for the table. If it is omitted, the new files are simply added to the table’s directory.
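To try the create and load steps end to end, a minimal shell sketch might look like the following. It assumes the hive command is on the PATH; the scratch directory and the three sample rows are made up for illustration and are not the original data set. The Hive step is skipped when hive is not installed.

```shell
#!/bin/sh
# Hypothetical scratch location and sample rows (illustration only).
demo_dir="${TMPDIR:-/tmp}/hive-demo"
sample="$demo_dir/sample.txt"
mkdir -p "$demo_dir"

# Three tab-separated columns: year, temperature, quality.
printf '1950\t0\t1\n1950\t22\t1\n1949\t111\t1\n' > "$sample"

# Create the table and load the file only if the hive CLI is available.
if command -v hive >/dev/null 2>&1; then
  hive -e "
    CREATE TABLE IF NOT EXISTS records (year STRING, temperature INT, quality INT)
      ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
    LOAD DATA LOCAL INPATH '$sample' OVERWRITE INTO TABLE records;
    SELECT year, MAX(temperature) FROM records GROUP BY year;
  "
else
  echo "hive not found on PATH; created $sample only"
fi
```

Because OVERWRITE is used, rerunning the script replaces the table's files instead of adding a second copy of the rows.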

3. Where can we see the Hive table location?
Tables are stored as directories under Hive’s warehouse directory, which is controlled by the hive.metastore.warehouse.dir property and defaults to /user/hive/warehouse. Thus, when Hive is running against the local filesystem, the files for the records table are found in the /user/hive/warehouse/records directory. On a cluster the same path is in HDFS and can be listed with hadoop fs -ls.
% ls /user/hive/warehouse/records/
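Rather than relying on the default, you can ask Hive where a table lives. The sketch below assumes the hive CLI is on the PATH and that a table named records exists; parse_value is a hypothetical helper name introduced only for this example.

```shell
#!/bin/sh
# Hypothetical helper: extract the value from a "key=value" line, which is
# the form in which Hive's SET command prints a property.
parse_value() {
  printf '%s\n' "$1" | cut -d= -f2-
}

if command -v hive >/dev/null 2>&1; then
  # -S runs the CLI in silent mode so that only the result is printed.
  line=$(hive -S -e 'SET hive.metastore.warehouse.dir;')
  echo "warehouse directory: $(parse_value "$line")"
  # DESCRIBE FORMATTED includes a Location row for the table.
  hive -S -e 'DESCRIBE FORMATTED records;' | grep -i 'Location'
else
  echo "hive not found on PATH; skipping"
fi
```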

4. Where can we see the configuration file in Hive?
Hive is configured using an XML configuration file like Hadoop’s. The file is called hive-site.xml and is located in Hive’s conf directory. The same directory contains hive-default.xml, which documents the properties that Hive exposes and their default values. You can override the directory in which Hive looks for hive-site.xml by passing the --config option to the hive command:
% hive --config /Users/tom/dev/hive-conf
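A site file only needs the properties you want to override. As a rough sketch, the script below writes a minimal hive-site.xml into a scratch configuration directory and points Hive at it. The directory path and the /data/hive/warehouse value are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical scratch configuration directory.
conf_dir="${TMPDIR:-/tmp}/hive-conf-demo"
mkdir -p "$conf_dir"

# Override a single property; everything else keeps its default value.
cat > "$conf_dir/hive-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <!-- Hypothetical location, used only for this example. -->
    <value>/data/hive/warehouse</value>
  </property>
</configuration>
EOF

# Start Hive against that directory and print the effective value.
if command -v hive >/dev/null 2>&1; then
  hive --config "$conf_dir" -S -e 'SET hive.metastore.warehouse.dir;'
else
  echo "hive not found on PATH; wrote $conf_dir/hive-site.xml only"
fi
```

For a one-off change, the same property can instead be passed for a single session with the -hiveconf option, for example hive -hiveconf hive.metastore.warehouse.dir=/data/hive/warehouse.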

5. How do you share the warehouse among multiple users?
If more than one Hive user shares a Hadoop cluster, you need to make the directories that Hive uses writable by all users. The following commands will create the directories and set their permissions appropriately:
% hadoop fs -mkdir /tmp
% hadoop fs -chmod a+w /tmp
% hadoop fs -mkdir -p /user/hive/warehouse
% hadoop fs -chmod a+w /user/hive/warehouse

6. What is SET -v?
SET -v lists all the properties in the system, including Hadoop defaults. By itself, SET lists only the properties set by Hive.
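The effect is easiest to see by comparing SET -v with plain SET, which lists only the properties set by Hive. A small sketch, assuming the hive CLI is on the PATH; count_props is a hypothetical helper that counts key=value lines.

```shell
#!/bin/sh
# Hypothetical helper: count the lines on stdin that contain "=".
# "|| true" keeps the exit status at zero when nothing matches.
count_props() {
  grep -c '=' || true
}

if command -v hive >/dev/null 2>&1; then
  # SET alone: the properties set by Hive.
  echo "SET:    $(hive -S -e 'SET;' | count_props) properties"
  # SET -v: everything, including Hadoop defaults.
  echo "SET -v: $(hive -S -e 'SET -v;' | count_props) properties"
  # SET with a property name prints just that property.
  hive -S -e 'SET hive.metastore.warehouse.dir;'
else
  echo "hive not found on PATH; skipping"
fi
```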

7. What are the metastore and the embedded metastore?
The metastore is the central repository of Hive metadata. It is divided into two pieces: a service and the backing store for the data. By default, the metastore service runs in the same JVM as the Hive service and contains an embedded Derby database instance backed by the local disk. This is called the embedded metastore configuration. However, only one embedded Derby database can access the database files on disk at any one time, which means you can have only one Hive session open at a time that shares the same metastore.
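In the embedded configuration, Derby keeps its files in a directory named metastore_db under the directory from which hive was started (the default connection URL is jdbc:derby:;databaseName=metastore_db;create=true). The sketch below simply checks for that directory; has_embedded_metastore is a hypothetical helper, and the check assumes the default database name has not been changed.

```shell
#!/bin/sh
# Hypothetical helper: report whether a directory holds an embedded Derby
# metastore, assuming Hive's default database name, metastore_db.
has_embedded_metastore() {
  [ -d "$1/metastore_db" ]
}

if has_embedded_metastore "$PWD"; then
  echo "embedded metastore found in $PWD/metastore_db"
else
  echo "no metastore_db directory in $PWD"
fi
```

This also explains a common surprise: starting hive from a different working directory uses a different metastore_db, so tables created earlier appear to be missing.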

Author: user
