Tag: Big Data

PySpark @ Freshers.in

How to run dataframe as Spark SQL – PySpark

If you are in a situation where the result is easy to get with SQL, or the SQL already exists, then you…


Hive – What are the metastore tables in Hive?

The metastore is the central repository of Apache Hive metadata. It stores metadata for Hive tables. Metastore tables: AUX_TABLE, BUCKETING_COLS, CDS, COLUMNS_V2, COMPACTION_QUEUE…


How to remove csv header using Spark (PySpark)

A common use case when dealing with CSV files is to remove the header from the source to do data…
