8. Explain Schema on Read Versus Schema on Write ?
In a traditional database, a table’s schema is enforced at data load time. If the data being loaded doesn’t conform to the schema, then it is rejected. This design is sometimes called schema on write because the data is checked against the schema when it is written into the database.Hive, on the other hand, doesn’t verify the data when it is loaded, but rather when a query is issued. This is called schema on read. Schema on read makes for a very fast initial load, since the data does not have to be read, parsed, and serialized to disk in the database’s internal format. Schema on write makes query time performance faster because the database can index columns and perform compression on the data.
9. What is Hive Locks ?
Hive also has support for table and partition-level locking. Locks prevent, for example,one process from dropping a table while another is reading from it. Locks are managed transparently using ZooKeeper.
10. What are Indexes in Hive ?
There are currently two index types: compact and bitmap.
Compact indexes store the HDFS block numbers of each value, rather than each file offset.
Bitmap indexes use compressed bitsets to efficiently store the rows that a particular value appears in, and they are usually appropriate for low-cardinality columns.
11. What are the hive Data Types ?
Hive supports both primitive and complex data types. Primitives include numeric,Boolean, string, and timestamp types. The Complex data types include arrays, maps, and structs.
BOOLEAN type for storing true and false values. There are four signed integral types: TINYINT, SMALLINT, INT, and BIGINT, which are equivalent to Java’s byte, short, int, and long primitive types, respectively; they are 1-byte, 2-byte, 4-byte, and 8-byte signed integers.
Hive’s floating-point types, FLOAT and DOUBLE, correspond to Java’s float and double, which are 32-bit and 64-bit floating point numbers. The DECIMAL data type is used to represent arbitrary-precision decimals, like
12. What are the data type for storing Text ?
There are three Hive data types for storing text. STRING is a variable-length character string with no declared maximum length. (The theoretical maximum-size STRING that may be stored is 2GB, although in practice it may be inefficient to materialize such large values.
VARCHAR types are similar except they are declared with a maximum length between 1
and 65355; for example, VARCHAR(100).
CHAR types are fixed-length strings that are padded with trailing spaces if necessary;
12. What is BINARY data type in Hive ?
The BINARY data type is for storing variable-length binary data.
12. How timestamp stored in Hive ?
TIMESTAMP data type stores timestamps with nanosecond precision
the to_utc_timestamp and from_utc_timestamp functions make it possible to do timezone conversions.
DATE data type stores a date with year, month and day components
13. What are the Complex data types ?
Hive has four complex types: ARRAY, MAP, STRUCT, and UNION. ARRAY and MAP are like their namesakes in Java, whereas a STRUCT is a record type that encapsulates a set of named fields. A UNION specifies a choice of data types; values must match exactly one of these types. Complex type declarations must specify the type of the fields in the collection, using an angled bracket notation.
CREATE TABLE complex (
c2 MAP<STRING, INT>,
c3 STRUCT<a:STRING, b:INT, c:DOUBLE>,
c4 UNIONTYPE<STRING, INT>
14. How to do type Conversions in hive?
TINYINT will be converted to an INT, the reverse conversion will not occur.
The implicit conversion rules can be summarized as follows. Any numeric type can be implicitly converted to a wider type, or to a text type (STRING, VARCHAR, CHAR) . All the text types can be implicitly converted to another text type. Perhaps surprisingly, they can also be converted to DOUBLE or DECIMAL. BOOLEAN types cannot be converted to any other type, and they cannot be implicitly converted to any other type in expressions. TIMESTAMP and DATE can be implicitly converted to a text type. You can perform explicit type conversion using CAST. For example, CAST(‘1’ AS INT) will convert the string ‘1’ to the integer value 1. If the cast fails – as it does in CAST(‘X’ AS INT), for example – the expression returns NULL.