2026-05-05 12:17:25+08
Data cleaning is often said to take up as much as 80% of a data scientist's time. This prompt automates the repetitive parts of cleaning a new dataset with the pandas library for Python.
Write a Python script using Pandas to clean a CSV file named "data.csv". The script should:
1. Remove duplicate rows.
2. Fill missing values in the "price" column with the median.
3. Convert the "date" column to datetime objects.
4. Remove any rows where "email" is invalid.
Listing specific steps steers the model toward a script that handles each one separately, which makes the result easier to test against your own data problems.
df['price'] = df['price'].fillna(df['price'].median())
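The line above covers only step 2. A minimal sketch of a script that handles all four steps might look like the following. The prompt does not define what an "invalid" email is, so the regular expression here is a deliberately simple assumption, and the names clean, is_valid_email, and the output file "data_clean.csv" are hypothetical choices, not part of the prompt.

```python
import re
from pathlib import Path

import pandas as pd

# Assumption: "valid" means roughly something@something.tld.
# This is a loose check, not a full RFC 5322 validator.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_valid_email(value) -> bool:
    """Return True only for strings that match the simple pattern above."""
    return isinstance(value, str) and EMAIL_RE.match(value) is not None


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the four steps from the prompt and return a new DataFrame."""
    # 1. Remove duplicate rows (copy so the caller's frame is not modified).
    df = df.drop_duplicates().copy()

    # 2. Fill missing prices with the median of the remaining prices.
    df["price"] = df["price"].fillna(df["price"].median())

    # 3. Convert dates; values that cannot be parsed become NaT.
    df["date"] = pd.to_datetime(df["date"], errors="coerce")

    # 4. Keep only rows whose email passes the simple check.
    valid = df["email"].map(is_valid_email).astype(bool)
    return df[valid].reset_index(drop=True)


if __name__ == "__main__":
    source = Path("data.csv")
    if source.exists():
        # "data_clean.csv" is a hypothetical output name.
        clean(pd.read_csv(source)).to_csv("data_clean.csv", index=False)
    else:
        print(f"{source} not found")
```

Keeping the steps inside one function that takes and returns a DataFrame means each step can be checked against a small hand-built frame before the script is pointed at real data. Note that the median is computed after duplicates are removed, so repeated rows do not skew the fill value.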