MapType in PySpark is a data type used to represent a value that maps keys…
Tag: PySpark
PySpark : Setting PySpark parameters – A complete Walkthrough [3 Ways]
In PySpark, you can set various parameters to configure your Spark application. These parameters can be set in different ways…
PySpark : Using CASE WHEN in Spark SQL to conditionally execute expressions : DataFrame and SQL ways explained
The WHEN clause is used in Spark SQL to conditionally execute expressions. It’s similar to a CASE statement in SQL…
Spark : Calculation of executor memory in Spark – A complete guide
The executor memory is the amount of memory allocated to each executor in a Spark cluster. It determines the amount…
PySpark : PySpark program to write DataFrame to Snowflake table.
Overview of Snowflake and PySpark. Snowflake is a cloud-based data warehousing platform that allows users to store and analyze large…
Hive : Hive optimizer – Detailed walk through
Hive is a popular open-source data warehouse system that allows users to store, manage, and analyze large datasets using SQL-like…
Hive : Difference between the Tez execution engine and the Spark execution engine in Hive
Hive is a data warehousing tool built on top of Hadoop, which allows us to write SQL-like queries on large…
Hive : Different types of Hive execution engines
Hive is an open-source data warehouse tool built on top of Hadoop. It allows users to write SQL-like queries, called…
Hive : Difference between the MapReduce execution engine and the Tez execution engine in Hive
MapReduce and Tez are two popular execution engines used in Apache Hive for processing large-scale datasets. While both engines are…
PySpark : LongType and ShortType data types in PySpark
pyspark.sql.types.LongType, pyspark.sql.types.ShortType. In this article, we will explore PySpark’s LongType and ShortType data types, their properties, and how to work…
PySpark : HiveContext in PySpark – A brief explanation
One of the key components of PySpark is the HiveContext, which provides a SQL-like interface to work with data stored…