Efficient CSV File Handling in Python with pandas
Working with CSV Files in Python Using pandas
The pandas library provides powerful tools for processing, analyzing, and manipulating data in CSV format. Let's explore basic operations and some advanced techniques.
Library Installation
Install pandas using pip:
pip install pandas
Basic CSV Operations
Reading and Writing CSV Files
import pandas as pd

# Reading a CSV file
df = pd.read_csv('data.csv')

# Displaying the first 5 rows
print(df.head())

# Writing data back to a CSV file (index=False omits the row index)
df.to_csv('output.csv', index=False)
This script demonstrates basic reading of a CSV file into a DataFrame, displaying the first few rows, and saving the data to a new file.
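As a small self-contained sketch of the same round trip, the example below reads CSV text from an in-memory buffer instead of a file on disk. The sample data and column names are made up for illustration. `usecols` and `dtype` are optional `read_csv` parameters that limit which columns are loaded and how they are typed, which can help with larger files.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the contents of 'data.csv'
csv_text = "name,age,city\nAlice,34,Paris\nBob,27,Berlin\nCarol,45,Madrid\n"

# Load only the columns we need and set the dtype of 'age' explicitly
df = pd.read_csv(
    io.StringIO(csv_text),
    usecols=["name", "age"],
    dtype={"age": "int64"},
)

# Write to an in-memory buffer; index=False omits the row index column
buffer = io.StringIO()
df.to_csv(buffer, index=False)
round_trip = buffer.getvalue()
print(round_trip)
```

With a real file, the buffer arguments would simply be replaced by file paths.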
Data Filtering
# Filtering rows where the value in the 'age' column is greater than 30
filtered_df = df[df['age'] > 30]
print(filtered_df)
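Filters can also combine several conditions. The following is a minimal sketch on made-up data (the `name`, `age`, and `city` columns are assumptions for illustration), showing a combined boolean mask and a membership test with `isin`.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Carol", "Dave"],
    "age": [34, 27, 45, 31],
    "city": ["Paris", "Berlin", "Paris", "Madrid"],
})

# Each condition needs its own parentheses; use & for AND, | for OR
older_in_paris = df[(df["age"] > 30) & (df["city"] == "Paris")]

# isin() tests membership in a list of values
in_selected_cities = df[df["city"].isin(["Berlin", "Madrid"])]

print(older_in_paris)
print(in_selected_cities)
```

The parentheses matter because `&` binds more tightly than comparison operators in Python.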
Merging Multiple CSV Files
import glob

# Getting a sorted list of all CSV files in the current directory
# (glob does not guarantee any particular order)
csv_files = sorted(glob.glob('*.csv'))

# Reading and combining all CSV files
df_list = [pd.read_csv(file) for file in csv_files]
combined_df = pd.concat(df_list, ignore_index=True)
print(combined_df)
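When combining files it is often useful to record which file each row came from. The sketch below is one way to do that, assuming two small made-up files written to a temporary directory. The file names and the `source_file` column are hypothetical.

```python
import glob
import os
import tempfile

import pandas as pd

with tempfile.TemporaryDirectory() as tmp_dir:
    # Create two small hypothetical CSV files to combine
    pd.DataFrame({"id": [1, 2], "value": [10, 20]}).to_csv(
        os.path.join(tmp_dir, "part_a.csv"), index=False
    )
    pd.DataFrame({"id": [3], "value": [30]}).to_csv(
        os.path.join(tmp_dir, "part_b.csv"), index=False
    )

    # sorted() makes the file order deterministic
    csv_files = sorted(glob.glob(os.path.join(tmp_dir, "*.csv")))

    # assign() adds a column recording which file each row came from
    frames = [
        pd.read_csv(path).assign(source_file=os.path.basename(path))
        for path in csv_files
    ]

# ignore_index=True renumbers the rows 0..n-1 in the combined result
combined_df = pd.concat(frames, ignore_index=True)
print(combined_df)
```

Note that a pattern like `*.csv` in the working directory would also pick up any CSV written earlier by the same script, so a dedicated input directory tends to be safer.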
Handling Missing Values
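# Filling missing values with the mean of the column
# (assigning the result back avoids calling inplace=True on a column selection,
# which newer pandas versions warn about and which may not update df)
df['column_name'] = df['column_name'].fillna(df['column_name'].mean())

# Removing rows that still contain missing values
df = df.dropna()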
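To see both steps on concrete values, here is a minimal sketch with made-up data (the `score` and `label` columns are assumptions). It counts the gaps first, fills the numeric column with its mean, and then drops only the rows where a specific column is still missing.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with gaps in both columns
df = pd.DataFrame({
    "score": [10.0, np.nan, 30.0, 20.0],
    "label": ["a", "b", None, "d"],
})

# Count missing values per column before deciding how to handle them
missing_counts = df.isna().sum()

# Fill the numeric gap with the column mean (NaN is skipped when averaging)
df["score"] = df["score"].fillna(df["score"].mean())

# Drop only the rows where 'label' is still missing
cleaned = df.dropna(subset=["label"])

print(missing_counts)
print(cleaned)
```

Using `subset` keeps rows that are incomplete only in columns you do not care about.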
Data Grouping and Aggregation
# Grouping by 'category' column and calculating the mean of 'value'
grouped = df.groupby('category')['value'].mean()
print(grouped)
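Several statistics can be computed per group in one pass. The sketch below uses named aggregation on made-up data. The result column names (`mean_value`, `max_value`, `row_count`) are chosen here purely for illustration.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "category": ["a", "b", "a", "b", "a"],
    "value": [1, 4, 3, 6, 5],
})

# Named aggregation: each keyword becomes a column in the result,
# defined as (source column, aggregation function)
summary = df.groupby("category").agg(
    mean_value=("value", "mean"),
    max_value=("value", "max"),
    row_count=("value", "count"),
)
print(summary)
```

The result is a DataFrame indexed by `category`, so it can be written out with `to_csv` like any other.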
Applying Functions to Columns
# Applying a custom function to a column
df['new_column'] = df['old_column'].apply(lambda x: x * 2)
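For simple arithmetic like this, a vectorized expression gives the same result and is usually faster on large columns, since `apply` calls the Python function once per value. A minimal comparison on made-up data:

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({"old_column": [1, 2, 3]})

# Element-wise apply calls the Python function once per value
doubled_apply = df["old_column"].apply(lambda x: x * 2)

# The vectorized form operates on the whole column at once
doubled_vectorized = df["old_column"] * 2

# Check that both approaches produce identical values
print(doubled_apply.equals(doubled_vectorized))
```

`apply` remains a reasonable choice when the logic cannot be expressed with built-in column operations.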
Conclusion
pandas significantly simplifies working with CSV files in Python, offering a wide range of functions for data processing and analysis. From simple file reading and writing to filtering, grouping, and transforming data, pandas is an indispensable tool for working with tabular data.
Experiment with various pandas functions to increase the efficiency of your data work!