PySpark: Exploding a column of arrays or maps into multiple rows in a Spark DataFrame [posexplode_outer]


pyspark.sql.functions.posexplode_outer

The posexplode_outer function in PySpark is part of the pyspark.sql.functions module and is used to explode a column of arrays or maps into multiple rows in a Spark DataFrame. It is similar to posexplode, but it also keeps the original row when the column is empty or null. In other words, posexplode_outer is an "outer explode" that preserves rows even when they have no values in the exploded column.
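
To make the contrast concrete, here is a minimal sketch (the id and letters column names and the sample data are only illustrative, not part of the example further below) that runs posexplode and posexplode_outer over the same array column:

from pyspark.sql import SparkSession
from pyspark.sql.functions import posexplode, posexplode_outer
spark = SparkSession.builder.appName("posexplode vs posexplode_outer").getOrCreate()
# One row with array elements, one row with a null array
df_cmp = spark.createDataFrame([(1, ["a", "b"]), (2, None)], ["id", "letters"])
# posexplode drops the row whose array is null (or empty): only (1, 0, a) and (1, 1, b) come back
df_cmp.select("id", posexplode("letters")).show()
# posexplode_outer keeps that row as well, filling pos and col with null: (2, null, null)
df_cmp.select("id", posexplode_outer("letters")).show()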

It returns a new row for each element in the specified array or map, along with its position. In contrast to posexplode, the row (null, null) is produced if the array or map is null or empty. Unless aliases are provided, it uses the default column names pos for the position, col for array elements, and key and value for map entries.
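
As a hedged sketch of those defaults (the properties map below is made-up sample data), exploding a map column without an alias produces columns named pos, key and value:

from pyspark.sql import SparkSession
from pyspark.sql.functions import posexplode_outer
spark = SparkSession.builder.appName("posexplode_outer default names").getOrCreate()
# A map column; no alias is supplied, so the defaults pos, key and value apply
map_df = spark.createDataFrame([(1, {"a": 10, "b": 20}), (2, None)], ["id", "properties"])
map_df.select("id", posexplode_outer("properties")).show()
# Columns: id, pos, key, value; the null-map row comes back as (2, null, null, null)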

Here is an example of how to use the posexplode_outer function in PySpark:

from pyspark.sql import SparkSession
from pyspark.sql.functions import posexplode_outer
# Start a SparkSession
spark = SparkSession.builder.appName("PosexplodeOuter Example @ Freshers.in").getOrCreate()
# Create a DataFrame with an array column that includes an empty array and a null
data = [([1, 2, 3],), ([4, 5],), ([],), (None,)]
df = spark.createDataFrame(data, ["values"])
df.show()
+---------+
|   values|
+---------+
|[1, 2, 3]|
|   [4, 5]|
|       []|
|     null|
+---------+

Applying the posexplode_outer function

# Use the posexplode_outer function to transform the values column
df2 = df.select("values", posexplode_outer("values").alias("position", "value"))
df2.show()

Result

+---------+--------+-----+
|   values|position|value|
+---------+--------+-----+
|[1, 2, 3]|       0|    1|
|[1, 2, 3]|       1|    2|
|[1, 2, 3]|       2|    3|
|   [4, 5]|       0|    4|
|   [4, 5]|       1|    5|
|       []|    null| null|
|     null|    null| null|
+---------+--------+-----+
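
Because the outer variant keeps the rows whose arrays were empty or null, those source rows are easy to pick out afterwards. A small hedged follow-up on the df2 built above (assuming the same SparkSession is still active):

# Rows whose array was empty or null carry a null position after the outer explode
df2.filter("position IS NULL").show()
# Only the [] and null source rows remain, each with a null position and value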

Important Spark URLs to refer to

  1. Spark Examples
  2. PySpark Blogs
  3. Bigdata Blogs
  4. Spark Interview Questions
  5. Official Page