Tag: Apache

Hive – What are the metastore tables in Hive?

The metastore is the central repository of Apache Hive metadata. It stores metadata for Hive tables in metastore tables such as AUX_TABLE, BUCKETING_COLS, CDS, COLUMNS_V2, COMPACTION_QUEUE…


How to remove csv header using Spark (PySpark)

A common use case when dealing with CSV files is to remove the header from the source to do data…


Apache PIG interview questions

1. What is Pig? Pig is an Apache open source project that runs on top of Hadoop and provides an engine for data…
