Understanding Series.transform(func[, axis])

In this article, we’ll explore the Series.transform(func[, axis]) function of the Pandas API on Spark, illustrating its behavior through worked examples and their outputs.

Understanding Series.transform(func[, axis]): The Series.transform(func[, axis]) function in the Pandas API on Spark calls func on the Series and returns a new Series with the transformed values. The result always has the same length (and index) as the input, so it can be assigned straight back as a column of a pandas-on-Spark DataFrame. Under the hood, the function is executed against batches of the data as pandas Series, so func must work when a pandas Series is passed to it. This makes transform the natural tool for custom, element-wise transformations on Series data in Spark.
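
To make this concrete, here is a minimal sketch on a standalone pandas-on-Spark Series (the names psser and square are illustrative). Annotating the return type as ps.Series[int] lets the engine skip the extra pass it would otherwise spend inferring the result type:

import pyspark.pandas as ps
# A three-element pandas-on-Spark Series
psser = ps.Series([1, 2, 3])
# The return-type hint avoids schema inference by sampling
def square(x) -> ps.Series[int]:
    return x ** 2
# Same length as the input, with transformed values
print(psser.transform(square))
# 0    1
# 1    4
# 2    9
# dtype: int64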

Syntax:

Series.transform(func[, axis])

Where:

  • func: The transformation function. It is executed against batches of the data, each passed as a pandas Series, so it must accept and return a pandas Series; vectorized element-wise operations such as x * 2 work naturally.
  • axis (optional): Accepted for compatibility with DataFrame.transform; for a Series, only the default of 0 is valid. A short sketch of the full signature follows this list.
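
Besides func and axis, the signature forwards any extra positional and keyword arguments to func, which is convenient for parameterized transformations. A small sketch, reusing the illustrative psser Series from above:

# Arguments after func (other than axis) are passed through to func
def scale(x, factor):
    return x * factor
print(psser.transform(scale, factor=10))
# 0    10
# 1    20
# 2    30
# dtype: int64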

Examples and Outputs: Let’s walk through practical examples that demonstrate Series.transform(func[, axis]) on Spark data.

Example 1: Applying a Simple Transformation Function.

Consider a Spark DataFrame df with a numeric column column2. We’ll convert df to a pandas-on-Spark DataFrame and double each element of the column2 Series.

# Sample data
from pyspark.sql import SparkSession
spark = SparkSession.builder \
    .appName("Pandas API on Spark") \
    .getOrCreate()
data = [("A", 10), ("B", 20), ("C", 30)]
df = spark.createDataFrame(data, ["column1", "column2"])
# Convert to a pandas-on-Spark DataFrame so Series.transform is available
psdf = df.pandas_api()
# Define transformation function
def double_value(x):
    return x * 2
# Apply transformation function using Series.transform(func)
psdf["transformed_column"] = psdf["column2"].transform(double_value)
# Convert back to a Spark DataFrame to display
psdf.to_spark().show()

Output:

+-------+-------+------------------+
|column1|column2|transformed_column|
+-------+-------+------------------+
|      A|     10|                20|
|      B|     20|                40|
|      C|     30|                60|
+-------+-------+------------------+
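
The same transformation can be written inline with a lambda. Note that without a return-type hint, pandas-on-Spark infers the result type by executing the function on a sample of the data, which costs an extra pass. A sketch, assuming the psdf from above:

# Equivalent inline form of the doubling transformation
psdf["transformed_column"] = psdf["column2"].transform(lambda x: x * 2)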

Example 2: Applying a Custom Transformation Function.

Let’s define a custom transformation function that converts strings to uppercase and apply it to a Series containing strings.

# Sample data
data = [("A", "hello"), ("B", "world"), ("C", "spark")]
df = spark.createDataFrame(data, ["column1", "column2"])
psdf = df.pandas_api()
# Define a custom transformation function; it receives pandas Series
# batches, so the vectorized .str accessor applies
def to_uppercase(s):
    return s.str.upper()
# Apply custom transformation function using Series.transform(func)
psdf["transformed_column"] = psdf["column2"].transform(to_uppercase)
psdf.to_spark().show()

Output:

+-------+-------+------------------+
|column1|column2|transformed_column|
+-------+-------+------------------+
|      A|  hello|             HELLO|
|      B|  world|             WORLD|
|      C|  spark|             SPARK|
+-------+-------+------------------+
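
Finally, mirroring pandas, Series.transform can also accept a list of functions, in which case it returns a DataFrame with one column per function. A minimal sketch (output values rounded):

import numpy as np
import pyspark.pandas as ps
psser = ps.Series([1, 4, 9])
# One output column per function, named after the function
print(psser.transform([np.sqrt, np.exp]))
#    sqrt          exp
# 0   1.0     2.718282
# 1   2.0    54.598150
# 2   3.0  8103.083928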