Lateral Join is a powerful feature in Snowflake that allows you to join a table…
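As a quick taste of the pattern the teaser describes, here is a minimal sketch that runs a LATERAL query through the official snowflake-connector-python package. The connection parameters and the orders table (with an items VARIANT column) are placeholder assumptions, not details from the article.

```python
# Minimal sketch of a Snowflake lateral join, run via snowflake-connector-python.
# Connection parameters and the ORDERS table (with an ITEMS variant column)
# are placeholder assumptions.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",    # placeholder
    user="my_user",          # placeholder
    password="my_password",  # placeholder
)
try:
    cur = conn.cursor()
    # LATERAL lets the table function reference columns of the preceding
    # table, producing one row per element of the ITEMS array.
    cur.execute("""
        SELECT o.order_id, i.value:sku::string AS sku
        FROM orders o,
             LATERAL FLATTEN(input => o.items) i
    """)
    for row in cur.fetchall():
        print(row)
finally:
    conn.close()
```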
Snowflake : Retrieve the SQL script used to create a specific database, schema, table, view, or materialized view
In Snowflake, you can use the “GET_DDL” function to retrieve the SQL script used to create a specific database, schema,…
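For a concrete flavor of the function, here is a minimal sketch using snowflake-connector-python. GET_DDL itself is Snowflake's built-in function; the credentials and the fully qualified table name are placeholders.

```python
# Sketch: fetch the CREATE statement for an object with GET_DDL, using
# snowflake-connector-python. Credentials and names are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="my_password"
)
try:
    cur = conn.cursor()
    # GET_DDL takes the object type and its fully qualified name.
    cur.execute("SELECT GET_DDL('TABLE', 'MY_DB.MY_SCHEMA.MY_TABLE')")
    print(cur.fetchone()[0])  # the full CREATE TABLE script
finally:
    conn.close()
```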
PySpark : Understanding the ‘take’ Action in PySpark with Examples [Retrieves a specified number of elements from the beginning of an RDD or DataFrame]
In this article, we will focus on the ‘take’ action, which is commonly used in PySpark operations. We’ll provide a…
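As a preview of what the article covers, here is a minimal self-contained sketch of `take` on both an RDD and a DataFrame:

```python
# Sketch of the `take` action on both an RDD and a DataFrame.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("take-demo").getOrCreate()

# On an RDD: returns the first n elements to the driver as a Python list.
rdd = spark.sparkContext.parallelize([10, 20, 30, 40, 50])
print(rdd.take(3))  # [10, 20, 30]

# On a DataFrame: returns the first n rows as a list of Row objects.
df = spark.createDataFrame([(1, "a"), (2, "b"), (3, "c")], ["id", "letter"])
print(df.take(2))  # [Row(id=1, letter='a'), Row(id=2, letter='b')]

spark.stop()
```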
AWS Glue : Handling Errors and Retries in AWS Glue
AWS Glue is a fully managed ETL service that simplifies and automates data processing tasks. While AWS Glue is designed…
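One of the simplest levers in this area is Glue's built-in retry setting. Here is a minimal sketch of configuring it with boto3's `create_job`; the job name, IAM role, and script location are placeholder assumptions.

```python
# Sketch: configuring automatic retries on a Glue job with boto3.
# The job name, role ARN, and script location are placeholders.
import boto3

glue = boto3.client("glue", region_name="us-east-1")

glue.create_job(
    Name="my-etl-job",                                   # placeholder
    Role="arn:aws:iam::123456789012:role/GlueRole",      # placeholder
    Command={
        "Name": "glueetl",
        "ScriptLocation": "s3://my-bucket/scripts/job.py",  # placeholder
    },
    MaxRetries=2,   # Glue re-runs the job up to 2 times after a failure
    Timeout=60,     # minutes before a run is force-failed
)
```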
Redshift : Redshift UNLOAD appends 000 to the output filename – how to get the actual filename
Unfortunately, Redshift’s UNLOAD command appends a part number (like 000) to the output file names by default, and there’s no…
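A common workaround, sketched below, is to let UNLOAD (with PARALLEL OFF, so a single part file is produced) write its `000` file, and then rename it with an S3 server-side copy. The bucket and key names are placeholders.

```python
# Sketch: UNLOAD with PARALLEL OFF still writes `<prefix>000`, so copy that
# object to the desired key and delete the original. Names are placeholders.
import boto3

s3 = boto3.client("s3")
bucket = "my-bucket"              # placeholder
part_key = "exports/report000"    # what UNLOAD actually wrote
final_key = "exports/report.csv"  # the name you wanted

s3.copy_object(
    Bucket=bucket,
    Key=final_key,
    CopySource={"Bucket": bucket, "Key": part_key},
)
s3.delete_object(Bucket=bucket, Key=part_key)
```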
DBT : Restarting a job from a failure step in DBT on DBT Cloud
To restart a job from a failure step in DBT on DBT Cloud, you can follow these steps: Go to…
DBT : Best Practices for Restartable dbt Jobs: Tips for Resilient Data Pipelines
To ensure restartability in dbt jobs, you can use a combination of incremental models, snapshots, and custom materializations. Additionally, it’s…
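As one illustration of restartability, here is a minimal wrapper sketch around dbt's CLI, assuming dbt Core 1.6 or later, where the `dbt retry` command (which re-runs only the failed and skipped nodes of the previous invocation) is available. The wrapper itself is an illustrative assumption, not code from the article.

```python
# Sketch of a restartable dbt wrapper (assumes dbt Core >= 1.6 for `dbt retry`).
import subprocess

def run_with_retry(max_attempts: int = 2) -> None:
    # First attempt: run the full build.
    result = subprocess.run(["dbt", "build"])
    attempts = 1
    # On failure, `dbt retry` re-runs only failed and skipped nodes,
    # using the run_results.json from the previous invocation.
    while result.returncode != 0 and attempts < max_attempts:
        result = subprocess.run(["dbt", "retry"])
        attempts += 1
    result.check_returncode()

if __name__ == "__main__":
    run_with_retry()
```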
DBT : Organizing DBT Models in Subdirectories: A Guide to YAML Configuration
DBT (Data Build Tool) is an essential tool for data engineers and analysts to build, test, and document data pipelines…
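Since the article's subject is the YAML side of dbt, here is a minimal `dbt_project.yml` sketch showing how the nesting under `models:` mirrors the subdirectory layout; the project and folder names are placeholders.

```yaml
# dbt_project.yml (sketch): the nesting under `models:` mirrors the
# models/ subdirectories; project and folder names are placeholders.
name: my_project

models:
  my_project:
    staging:            # applies to models/staging/*.sql
      +materialized: view
      +schema: staging
    marts:              # applies to models/marts/*.sql
      +materialized: table
```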
DBT : Converting S3 Paths with DBT Macros Based on Environment Variables
In data engineering, it is common to work with cloud-based storage systems such as Amazon S3. Often, the location of…
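A minimal sketch of the idea in dbt's own Jinja, using dbt's built-in `env_var()` function; the `DBT_TARGET_ENV` variable name and the bucket names are placeholder assumptions.

```sql
-- macros/s3_path.sql (sketch): env_var() is dbt's built-in Jinja function;
-- DBT_TARGET_ENV and the bucket names are placeholder assumptions.
{% macro s3_path(relative_path) %}
    {%- if env_var('DBT_TARGET_ENV', 'dev') == 'prod' -%}
        s3://prod-data-bucket/{{ relative_path }}
    {%- else -%}
        s3://dev-data-bucket/{{ relative_path }}
    {%- endif -%}
{% endmacro %}
```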
DBT : Demystifying the DBT Model: A Comprehensive Guide
Data Build Tool (DBT) has become an indispensable tool for data engineers and analysts in modern data environments. It enables…
Python : Read a large ZIP file from an S3 bucket, split it into smaller ZIP files, and save those directly to the S3 bucket
To read a large ZIP file from an S3 bucket, split it into smaller ZIP files, and save those directly…
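A minimal in-memory sketch of the approach with boto3 and the standard-library `zipfile` module; the bucket/key names and the chunk size are placeholders, and a truly huge archive would need streaming rather than `BytesIO`.

```python
# Sketch: read a ZIP out of S3, regroup its members into smaller ZIPs,
# and upload those back, all in memory. Names and chunk size are placeholders.
import io
import zipfile
import boto3

s3 = boto3.client("s3")
bucket = "my-bucket"            # placeholder
source_key = "big/archive.zip"  # placeholder
files_per_chunk = 100           # placeholder split size

# Read the source ZIP into memory.
body = s3.get_object(Bucket=bucket, Key=source_key)["Body"].read()
src = zipfile.ZipFile(io.BytesIO(body))
names = src.namelist()

# Write each group of members into its own smaller ZIP and upload it.
for part, start in enumerate(range(0, len(names), files_per_chunk)):
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as out:
        for name in names[start:start + files_per_chunk]:
            out.writestr(name, src.read(name))
    s3.put_object(
        Bucket=bucket,
        Key=f"big/archive_part{part:03d}.zip",
        Body=buf.getvalue(),
    )
```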