Spark_Interview - Freshers.in

PySpark : Connecting and updating postgres table in spark SQL
Apache Spark is an open-source, distributed computing system that can process large amounts of data…
PySpark : How do I read a parquet file in Spark
To read a Parquet file in Spark, you can use the spark.read.parquet() method, which returns…
What is the difference between concat and concat_ws in Pyspark
Syntax: pyspark.sql.functions.concat(*cols) and pyspark.sql.functions.concat_ws(sep, *cols). concat concatenates multiple input columns together…
How to run dataframe as Spark SQL - PySpark
If you have a situation where you can easily get the result using SQL/ SQL…
PySpark : Explanation of MapType in PySpark with Example
MapType in PySpark is a data type used to represent a value that maps keys…
PySpark : How to decode in PySpark ?
pyspark.sql.functions.decode: PySpark is a popular library for processing big data…
What is the difference between repartition() and coalesce() ?
The repartition algorithm performs a full shuffle and creates new partitions with data that's…
PySpark : How to read date datatype from CSV ?
We specify inferSchema = true when a CSV file is being read. Spark determines the…
How to remove csv header using Spark (PySpark)
A common use case when dealing with a CSV file is to remove the header from…
Spark : Calculate the number of unique elements in a column using PySpark
pyspark.sql.functions.countDistinct: In PySpark, the countDistinct function is used to calculate the number of unique elements…

Tag: Spark_Interview

In PySpark, what is the difference between spark.table() and spark.read.table()?

PySpark: How to accept date in a Dataframe : DateType can not accept object ‘YYYY-MM-DD’ in type

How to transform columns into list of objects [arrays] on top of group by in PySpark – collect_list and collect_set

Convert data from the PySpark DataFrame columns to Row format or get elements in columns in row

PySpark: How to add months to a date column in Spark DataFrame (add_months)

PySpark: How to return the first column that is not null

How can you convert a PySpark DataFrame to JSON ?

How can I see the full column values in a Spark Dataframe ?

