Hive interview questions

29. What is a Hive variable? What do we use it for?
A Hive variable is a variable created in the Hive environment that Hive scripts can reference. It is used to pass values into Hive queries; the value is substituted into the query text before the query runs.
Variables defined with the set command live in the hiveconf namespace, so you reference them with the hiveconf: prefix.
hive> set CURRENT_DATE=2012-09-16;
hive> select * from foo where day >= '${hiveconf:CURRENT_DATE}';
Note that there are env and system variables as well, so you can reference ${env:USER}, for example. To see all available variables, run the following from the command line:
% hive -e 'set;'
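Variables can also be supplied when Hive is launched rather than with set. The sketch below assumes a hypothetical script name (daily_report.hql) and variable name (run_date); it uses the hivevar namespace, which is filled by the --hivevar command-line option.

```sql
-- Hypothetical script, saved for illustration as daily_report.hql.
-- It could be launched from the command line as:
--   hive --hivevar run_date=2012-09-16 -f daily_report.hql
-- The reference below is replaced with the supplied text before the query is compiled.
SELECT *
FROM foo
WHERE day >= '${hivevar:run_date}';
```

Because the substitution is purely textual, the quotes belong in the query, not in the value that is passed in.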

30. Can Hive queries be executed from script files? How?
Yes, using the source command from inside the Hive shell.
Example −
hive> source /path/to/file/file_with_query.hql;
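As a rough illustration, the script file is just a sequence of HiveQL statements, each ending in a semicolon. The file contents below are an assumption made up for this example; the -f option shown in the comments is the standard way to run the same file without entering the Hive shell.

```sql
-- Assumed contents of /path/to/file/file_with_query.hql
USE default;
SELECT COUNT(*) FROM foo;

-- From inside the Hive shell:
--   source /path/to/file/file_with_query.hql;
-- From the operating-system shell:
--   hive -f /path/to/file/file_with_query.hql
```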

31. What is the .hiverc file?
It is a file that is executed when you launch the Hive shell, which makes it an ideal place for any Hive configuration or customization you want applied at startup. This could be:
– Setting column headers to be visible in query results
– Making the current database name part of the hive prompt
– Adding any jars or files
– Registering UDFs
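Location in this Cloudera setup: /etc/hive/conf.cloudera.hive1. When started without the -i option, the Hive CLI also looks for $HOME/.hiverc and $HIVE_HOME/bin/.hiverc.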

32. Can you give a sample .hiverc file?
add jar /home/airawat/hadoop-lib/hive-contrib-0.10.0-cdh4.2.0.jar;
set hive.cli.print.header=true;
set hive.cli.print.current.db=true;
set hive.mapjoin.smalltable.filesize=30000000;
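The sample above does not register a UDF. A line such as the following could be added for that. It is a sketch that assumes the hive-contrib jar from the first line has been added, and the function name row_sequence is chosen only for illustration.

```sql
-- Register a temporary function from the hive-contrib jar added above.
-- UDFRowSequence ships with hive-contrib; the name row_sequence is arbitrary.
CREATE TEMPORARY FUNCTION row_sequence
  AS 'org.apache.hadoop.hive.contrib.udf.UDFRowSequence';
```

A temporary function lasts only for the session, which is why it suits a file that runs at every shell start.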

33. What are the default record and field delimiters used for Hive text files?
The default record delimiter is − \n
And the default field delimiters are − \001 (between fields), \002 (between collection items) and \003 (between map keys and values)
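These defaults can be written out explicitly in a table definition. The table and column names below are made up for illustration; the ROW FORMAT clause simply restates what Hive would use anyway for a text table.

```sql
-- Hypothetical table that spells out the default text-file delimiters.
CREATE TABLE delimiter_demo (
  id    INT,
  tags  ARRAY<STRING>,
  props MAP<STRING, STRING>
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\001'            -- between columns
  COLLECTION ITEMS TERMINATED BY '\002'  -- between array or map entries
  MAP KEYS TERMINATED BY '\003'          -- between a map key and its value
  LINES TERMINATED BY '\n'               -- between records
STORED AS TEXTFILE;
```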

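34. What do you mean by schema on read?
The schema is applied to the data when it is read, not enforced when it is written. Hive does not validate data against the table definition at load time; a mismatch only shows up when a query reads the data.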
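A minimal sketch of the idea, assuming comma-separated files already sit in a hypothetical HDFS directory /data/events_raw: creating the table only records metadata, and the files are parsed against the schema when a query runs.

```sql
-- Hypothetical external table over files that already exist.
-- Nothing is checked or rewritten at this point.
CREATE EXTERNAL TABLE events_raw (
  event_id   INT,
  event_name STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/events_raw';

-- The schema is applied here. With the default text format, a field that
-- cannot be parsed as the declared type (for example 'abc' for event_id)
-- comes back as NULL instead of causing a load-time error.
SELECT event_id, event_name FROM events_raw;
```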

35. How do you list all databases whose name starts with p?
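One way to do this is the LIKE clause of SHOW DATABASES. In Hive the pattern uses * as the wildcard (and | for alternatives), not the SQL % character, although newer releases may accept SQL-style patterns as well.

```sql
-- Lists every database whose name begins with the letter p.
SHOW DATABASES LIKE 'p*';
```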

Author: user