Category: article

PySpark @ Freshers.in

PySpark : Extracting minutes of a given date as integer in PySpark [minute]

pyspark.sql.functions.minute: The minute function in PySpark is part of the pyspark.sql.functions module and is used to extract the minute from…

PySpark @ Freshers.in

PySpark : Function to perform simple column transformations [expr]

pyspark.sql.functions.expr: The expr function is part of the pyspark.sql.functions module and is used to create column expressions that can…

getDbt

How do you use DBT to manage your data lineage?

Data lineage refers to the history of data as it moves from its source to its destination, including transformations and…

PySpark @ Freshers.in

PySpark : Formatting numbers to a specific number of decimal places.

pyspark.sql.functions.format_number: One of the useful functions in PySpark is the format_number function, which is used to format numbers to a…

PySpark @ Freshers.in

PySpark : Creating multiple rows for each element in the array [explode]

pyspark.sql.functions.explode: One of the important operations in PySpark is the explode function, which is used to convert a column of…

PySpark @ Freshers.in

PySpark : How decode works in PySpark?

One of the important concepts in PySpark is data encoding and decoding, which refers to the process of converting data…

PySpark @ Freshers.in

PySpark : Extracting dayofmonth, dayofweek, and dayofyear in PySpark

pyspark.sql.functions.dayofmonth, pyspark.sql.functions.dayofweek, pyspark.sql.functions.dayofyear: One of the most common data manipulations in PySpark is working with date and time columns. PySpark…

python @ Freshers.in

Python : Understanding traceback.format_exc() in Python

In Python, the traceback module provides functions for working with tracebacks, which are snapshots of the call stack at a…
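
As a rough illustration of the function named in the title, the sketch below captures a traceback as a string with traceback.format_exc(). The helper name safe_divide and the division-by-zero scenario are assumptions made up for this example and do not come from the linked article.

```python
import traceback


def safe_divide(numerator, denominator):
    """Return the quotient, or the formatted traceback text if division fails.

    Hypothetical helper used only for illustration.
    """
    try:
        return numerator / denominator
    except ZeroDivisionError:
        # format_exc() returns the traceback of the exception currently
        # being handled as a single string instead of printing it.
        return traceback.format_exc()


report = safe_divide(1, 0)
print(report)
```

Returning the text rather than printing it makes it straightforward to pass the traceback to a logger or store it alongside a failed record.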

AWS Glue @ Freshers.in

Explain the purpose of the AWS Glue data catalog.

The AWS Glue data catalog is a central repository for storing metadata about data sources, transformations, and targets used in…
