Category: spark

PySpark @ Freshers.in

What is the difference between concat and concat_ws in PySpark

concat vs concat_ws. Syntax: pyspark.sql.functions.concat(*cols) and pyspark.sql.functions.concat_ws(sep, *cols). concat: concatenates multiple input columns into a single column. The…

How to add a new column in PySpark using withColumn

withColumn. Syntax: DataFrame.withColumn(column_name, col). withColumn is commonly used to add a column to an existing DataFrame. withColumn returns a new…

How to use filter or where condition in PySpark

filter / where: filter() keeps the rows that satisfy a condition, which may combine multiple conditions. where() is an alias for filter(). In…

Explain complex data types in PySpark (ArrayType, MapType, StructType)

There are three complex data types in PySpark: (1) ArrayType, (2) MapType, and (3) StructType. ArrayType: ArrayType represents values comprising a sequence…
