Tag: Spark_Interview

PySpark @ Freshers.in

PySpark: How to read a date datatype from CSV?

We specify inferSchema = true when a CSV file is being read. Spark determines the data type of a column…


PySpark: How to return the first column that is not null

pyspark.sql.functions.coalesce: If you want to return the first non-null value from a list of columns, you can use the coalesce function in…


How can you convert a PySpark DataFrame to JSON?

pyspark.sql.DataFrame.toJSON: There may be situations where you need to send your DataFrame to a file, to a server, or…


What is the difference between repartition() and coalesce()?

The repartition algorithm performs a full shuffle and creates new partitions with evenly distributed data. The repartition algorithm makes…
