Tag: Big Data

What are the query operators supported by Snowflake?

Snowflake supports most of the standard operators defined in SQL:1999. Arithmetic operators: +, -, *, /,…

PySpark @ Freshers.in

PySpark: how to get rows with nulls in a column, rows without nulls, or the count of non-null values

pyspark.sql.Column.isNotNull isNotNull(): True if the current expression is NOT null. isNull(): True if the current expression is null. With…

PySpark @ Freshers.in

PySpark – groupby with aggregation (count, sum, mean, min, max)

pyspark.sql.DataFrame.groupBy The PySpark groupBy function groups the DataFrame by the specified columns so that aggregations (count, sum, mean, min, max) can be run on them…

PySpark @ Freshers.in

PySpark filter: How to filter data in PySpark – multiple options explained.

pyspark.sql.DataFrame.filter The PySpark filter function is used to filter the data in a Spark DataFrame; in short, it is used for cleansing…

Amazon Aurora @ Freshers.in

Amazon Aurora quick reference and cheat sheet.

1. Aurora is an AWS proprietary database. 2. Aurora is a fully managed service. 3. Aurora has high performance and…

AWS Athena @ Freshers.in

Amazon Athena quick reference and cheat sheet

1. Amazon Athena is an interactive query service to analyze data in Amazon S3 using standard SQL. 2. Athena is…

Hive @ Freshers.in

How to drop multiple partitions in Hive by giving a condition.

Hive partitions are a good and easy way to organize Hive tables by dividing them into different parts…

Hive @ Freshers.in

How to delete a partition's data as well from a Hive external table on a DROP command?

As you know, external tables are tables where Hive does not manage the data. So when…

Hive @ Freshers.in

How to convert a Hive managed table to an external table without recreating it?

In Hive, managed (internal) tables are Hive-owned tables whose data is managed and controlled by…

AWS Redshift @ Freshers.in

How to force serialization in AWS Redshift by locking all tables?

You can force serialization by locking all tables in each session. The LOCK command blocks operations that would result in…