Tag: PySpark

PySpark @ Freshers.in

PySpark: how to find the difference between two dates and round it down to whole days without decimals (datediff, floor)

pyspark.sql.functions.datediff and pyspark.sql.functions.floor In this article we will learn two functions, mainly datediff and floor. pyspark.sql.functions.datediff: To get…

PySpark – How to convert a string date to the Date datatype

pyspark.sql.functions.to_date This article gives a brief overview of how you can convert a string date to the Date datatype. With…

PySpark – How to return the first column that is not null

pyspark.sql.functions.coalesce If you want to return the first non-null value from a list of columns, you can use the coalesce function in…

How can you convert a PySpark DataFrame to JSON?

pyspark.sql.DataFrame.toJSON There may be situations where you need to send your DataFrame to a file, to a server or…

How can I see the full column values in a Spark DataFrame?

When we call dataframe.show(), we can see that some of the column values get truncated. Here we…

What is the difference between repartition() and coalesce()?

The repartition algorithm performs a full shuffle and creates new partitions with data that’s distributed evenly. The repartition algorithm makes…

Convert a column containing a StructType, ArrayType or MapType into a JSON string – PySpark (to_json)

You can convert a column containing a StructType, ArrayType or MapType into a JSON string using the to_json function. pyspark.sql.functions.to_json…

How to round the given value to scale decimal places using HALF_EVEN rounding in Spark – PySpark

bround function The bround function returns the rounded expr using the HALF_EVEN rounding mode. That means bround will round the given value…
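
To make the HALF_EVEN rule concrete, here is a minimal sketch in plain Python using the standard decimal module. It does not call Spark or bround; the helper name half_even is hypothetical and only illustrates the rounding mode itself, where an exact tie goes to the nearest even digit.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def half_even(value: str, scale: int = 0) -> Decimal:
    """Round a decimal string to `scale` places using HALF_EVEN.

    Hypothetical helper for illustration only; it is not part of PySpark.
    """
    # Build the target precision, e.g. scale=2 -> Decimal("0.01")
    quantum = Decimal(1).scaleb(-scale)
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_EVEN)

print(half_even("2.5"))       # tie goes to the even neighbour: 2
print(half_even("3.5"))       # tie goes to the even neighbour: 4
print(half_even("1.125", 2))  # 1.12
```

Values are passed as strings so that the tie cases are exact decimals rather than binary floating-point approximations.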

How to replace a value with another value in a column of a PySpark DataFrame?

In PySpark we can replace a value in one column or multiple columns, or multiple values in a column, to…

How to drop nulls in a DataFrame: PySpark

For most data cleansing, the first thing that you may need to do is drop the nulls in the…