Python: If the current row’s date is null or blank, fill it with the date from the previous row using Pandas


In pandas, you can fill the null values in a DataFrame with the previous row’s value by forward filling. Two methods in the pandas library are relevant for handling missing or null values. Here’s an explanation of each:

fillna() fills the missing or null values in a DataFrame or Series with a specified value. It takes parameters such as value, limit, and inplace that let you customize how missing values are filled. Older pandas versions also accepted method='ffill' here, but that parameter has been deprecated since pandas 2.1, so calling ffill() directly is preferred.

ffill() fills missing or null values with the previous valid value along a specified axis. This is called forward filling: the last valid observation is propagated forward to fill the gaps.

Here’s some sample code that demonstrates this:

import pandas as pd

# Create a sample DataFrame with dates
df = pd.DataFrame({'date': ['2023-01-01', '2023-01-02', None, '2023-01-04', None], 'value': [1, 2, 3, 4, 5]})

# Convert the 'date' column to datetime format
df['date'] = pd.to_datetime(df['date'])

# Fill null values with the previous row's value using ffill()
df['date'] = df['date'].ffill()

# Print the updated DataFrame
print(df)

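In this example, we create a DataFrame with dates in the ‘date’ column, and some of the dates are null. We convert the ‘date’ column to datetime format using pd.to_datetime(), which turns the null entries into NaT, and then call ffill() to fill them with the previous row’s value. The result is assigned back to the column rather than passing inplace=True on the selected column, because an in-place call on df['date'] is a chained assignment that recent pandas versions warn about and that does not update df once Copy-on-Write is enabled.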

The output of the code will be:

        date  value
0 2023-01-01      1
1 2023-01-02      2
2 2023-01-02      3
3 2023-01-04      4
4 2023-01-04      5
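
The limit parameter mentioned earlier is also accepted by ffill(). As a minimal sketch, assuming you only want to carry a date forward by one row and leave longer gaps unfilled, it could look like this (the sample dates are made up for illustration):

```python
import pandas as pd

# Hypothetical sample data: three consecutive missing dates after the first row
dates = pd.to_datetime(pd.Series(['2023-01-01', None, None, None, '2023-01-05']))

# Carry each valid date forward by at most one row;
# the remaining missing rows stay as NaT
filled = dates.ffill(limit=1)

print(filled)
```

Here only the row directly after 2023-01-01 is filled, and the next two rows remain missing, which can be useful when a long run of blanks should be reviewed rather than silently filled.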

CASE II

What if the missing value is a blank (an empty string) instead of None? How can the code be rewritten?

If the missing values in your DataFrame are empty strings rather than None, you can use the replace() method to turn them into missing values first, and then fill them with the previous row’s value using ffill(). Here’s the modified code:

import pandas as pd

# Create a sample DataFrame with dates
df = pd.DataFrame({'date': ['2023-01-01', '2023-01-02', '', '2023-01-04', ''], 'value': [1, 2, 3, 4, 5]})

# Replace empty strings with the missing-value marker pd.NA
df = df.replace('', pd.NA)

# Convert the 'date' column to datetime format
df['date'] = pd.to_datetime(df['date'])

# Fill null values with the previous row's value using ffill()
df['date'] = df['date'].ffill()

# Print the updated DataFrame
print(df)

In this modified code, we replace empty strings with pd.NA using the replace() method. We then convert the ‘date’ column to datetime format using pd.to_datetime(), so the missing entries become NaT, and call ffill() to fill them with the previous row’s value. As before, the filled column is assigned back to df['date'].

The output of the modified code will be the same as the previous example:

        date  value
0 2023-01-01      1
1 2023-01-02      2
2 2023-01-02      3
3 2023-01-04      4
4 2023-01-04      5

As you can see, the null values in the ‘date’ column have been filled with the previous row’s value, even though they were initially empty strings. Note that ffill() and fillna() can also be chained: forward fill first, then use fillna() to fill any values that are still missing with a specified value. This matters, for example, when the first row is null, because it has no previous row to copy from.
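
A minimal sketch of that combination is shown here, assuming the first row is missing and that a fixed fallback date is acceptable. The default_date name and its value are hypothetical and chosen only for illustration:

```python
import pandas as pd

# Hypothetical sample data: the first row has no date, so ffill() cannot fill it
df = pd.DataFrame({'date': [None, '2023-01-02', None], 'value': [1, 2, 3]})
df['date'] = pd.to_datetime(df['date'])

# Assumed fallback for rows that have no earlier valid date
default_date = pd.Timestamp('2023-01-01')

# Forward fill first, then fill whatever is still missing with the fallback
df['date'] = df['date'].ffill().fillna(default_date)

print(df)
```

The order matters in this sketch: calling fillna(default_date) first would replace every missing date with the fallback and leave nothing for ffill() to propagate.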


Author: user
