Category: article

PySpark : Reading a Parquet file stored on Amazon S3 using PySpark

user March 27, 2023 0 Comments

To read a Parquet file stored on Amazon S3 using PySpark, you can use the following code: from pyspark.sql import…

Redshift : Role of VACUUM and ANALYZE in Redshift

user March 27, 2023 0 Comments

Amazon Redshift is a popular data warehousing solution that is widely used by businesses to manage and analyze large volumes…

Google Dataflow : Handling Late Data in Google Dataflow

user March 25, 2023 0 Comments

Handling late-arriving data is a common challenge when working with streaming data processing systems like Google Dataflow. Late data refers…

Google Dataflow : An Overview and the Programming Languages Supported by Google Dataflow

user March 25, 2023 0 Comments

Google Dataflow is a cloud-based data processing service that allows developers to easily and efficiently process large volumes of data…

Python : extend() and append() – Purpose and difference – A Comprehensive Guide with example

user March 23, 2023 0 Comments

When working with lists in Python, two common methods used for adding elements to a list are extend() and append()…

Python-Pandas : Rename columns dynamically without specifying the name of the index column using Python

user March 23, 2023 0 Comments

To rename columns dynamically without specifying the name of the index column, you can retrieve the index column name using…

Hive : Hive Table Properties : How are Hive Table Properties used?

user March 21, 2023 0 Comments

One of the key features of Hive is the ability to define table properties, which can be used to control…

Hive : Implementation of UDF in Hive using Python. A Comprehensive Guide

user March 18, 2023 0 Comments

A User-Defined Function (UDF) in Hive is a function that is defined by the user and can be used in…

Python : Steps to Upgrade from Python 2.7 to Python 3.7 [This can be used for any lower version to higher version]

user March 18, 2023 0 Comments

Upgrading from Python 2.7 to Python 3.7 requires you to install Python 3.7 and then re-point all the libraries installed…

Hive : Hive metastore and its importance.

user March 18, 2023 0 Comments

The Hive Metastore is an important component of the Apache Hive data warehouse software. It acts as a central repository…
