How to rename a Spark DataFrame having a complex schema with AWS Glue – PySpark
There can be multiple reasons to rename a Spark DataFrame. Even though withColumnRenamed can be used to rename…
How can you track the change metadata of a Snowflake table?
The CHANGES clause enables querying the change tracking metadata for a table within a specified interval of time without having…
How to PIVOT in Snowflake?
PIVOT in Snowflake: PIVOT rotates a table by turning the unique values from one column in the input expression into…
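To make the rotation concrete, here is a minimal plain-Python sketch of the idea, not Snowflake's implementation. The `pivot` helper, the column names, and the sample rows are all hypothetical, and summing is just one possible aggregate.

```python
from collections import defaultdict

def pivot(rows, key, pivot_col, value_col):
    """Turn each distinct value of pivot_col into its own column, summing value_col."""
    columns = sorted({row[pivot_col] for row in rows})
    # Simplification: missing combinations start at 0 here;
    # a SQL engine would typically return NULL for them instead.
    totals = defaultdict(lambda: dict.fromkeys(columns, 0))
    for row in rows:
        totals[row[key]][row[pivot_col]] += row[value_col]
    return [{key: k, **vals} for k, vals in totals.items()]

# Hypothetical sample data: one row per (employee, month) sale.
sales = [
    {"emp": 1, "month": "JAN", "amount": 100},
    {"emp": 1, "month": "FEB", "amount": 200},
    {"emp": 2, "month": "JAN", "amount": 50},
    {"emp": 1, "month": "JAN", "amount": 25},
]

pivoted = pivot(sales, "emp", "month", "amount")
for row in pivoted:
    print(row)
```

Each distinct `month` value ends up as a column, with one output row per `emp`.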
How can you UNPIVOT in Snowflake?
UNPIVOT in Snowflake: UNPIVOT rotates a table by transforming columns into rows. UNPIVOT is a relational operator which accepts…
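The columns-to-rows transformation can be sketched in plain Python as follows. This is only an illustration of the idea, not Snowflake's implementation. The `unpivot` helper, the column names, and the sample rows are hypothetical, and skipping missing values is a choice made for this sketch.

```python
def unpivot(rows, id_col, value_cols, name_col, value_col):
    """Emit one output row per (input row, listed column) pair."""
    out = []
    for row in rows:
        for col in value_cols:
            if row.get(col) is not None:  # assumption: drop missing values
                out.append({id_col: row[id_col], name_col: col, value_col: row[col]})
    return out

# Hypothetical wide table: one column per month.
wide = [
    {"emp": 1, "JAN": 125, "FEB": 200},
    {"emp": 2, "JAN": 50, "FEB": None},
]

long_rows = unpivot(wide, "emp", ["JAN", "FEB"], "month", "amount")
for row in long_rows:
    print(row)
```

The former column names `JAN` and `FEB` become values in the new `month` column, and their cell contents move into `amount`.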
Physiology of Calcium – (Endocrinology)
1. 99% of the body's calcium is in the skeleton. 2. Only 1% is in circulation, and only half of this…
PySpark – How to read a text file as an RDD using Spark 3 and display the result on Windows 10
Here we will see how to read a sample text file as an RDD using the Spark environment and version which we…
What is the problem with having lots of small files in HDFS? What is the remediation plan?
In the Hadoop ecosystem we store files under folders in HDFS; most of the time the folder name we are…
Explain the distributed cache in Hadoop
Distributed cache is a facility provided by the Hadoop MapReduce framework to access small files needed by an application during its…
What is the swappiness value? What is the role of the swappiness value during cluster setup?
vm.swappiness is a Linux kernel parameter; its value ranges from 0 to 100 and controls the swapping…
What are the Python modules provided in AWS Glue?
AWS Glue version 2.0 supports the following Python modules. Note: Different Glue versions support different Python versions. boto3==1.12.4 botocore==1.15.4…