In the world of data analysis and manipulation, DataFrames and Pandas are indispensable tools for Python developers. Whether you’re cleaning messy datasets, transforming raw data into insights, or conducting exploratory data analysis, Pandas provides a powerful, flexible, and user-friendly framework to handle structured data with ease.
At the heart of Pandas lies the DataFrame—a two-dimensional, size-mutable, and labeled data structure that resembles a table in relational databases or an Excel spreadsheet. With its intuitive design and robust functionality, DataFrames make it easy to organize, analyze, and manipulate data in Python.
In short, Pandas is the library and the DataFrame is its primary data structure.
Pandas provides tools for working with structured data such as tables or spreadsheets, and it is particularly well suited to data cleaning, transformation, and exploratory data analysis.
A DataFrame holds that data in rows and columns. It is two-dimensional, size-mutable (rows and columns can be added or removed after creation), and labeled on both axes. It can also be thought of as a NumPy array with labeled rows and columns, where each column may hold a different data type.
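To make the three properties concrete, here is a minimal sketch. The sample values and the row labels "a", "b", "c" are assumptions chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data with custom row labels (assumed for this sketch).
df = pd.DataFrame(
    {"Name": ["Yuva", "Ganapati", "Charlie"], "Age": [25, 30, 35]},
    index=["a", "b", "c"],
)

# Two-dimensional: shape is (number of rows, number of columns).
shape_before = df.shape

# Labeled: a single value is addressed by row label and column label.
age_of_b = df.loc["b", "Age"]

# Size-mutable: a column can be added after the DataFrame is created.
df["Salary"] = [50000, 60000, 70000]
shape_after = df.shape

print(shape_before, age_of_b, shape_after)
```

The shape grows from two columns to three without rebuilding the DataFrame, which is what "size-mutable" means in practice.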
Here’s a quick guide to getting started with Pandas and DataFrames:
import pandas as pd
Create a DataFrame from a Dictionary:
data = {
    'Name': ['Yuva', 'Ganapati', 'Charlie'],
    'Age': [25, 30, 35],
    'Salary': [50000, 60000, 70000]
}
df = pd.DataFrame(data)
print(df)
Output:
       Name  Age  Salary
0      Yuva   25   50000
1  Ganapati   30   60000
2   Charlie   35   70000
Access Columns:
print(df['Name'])  # Access the 'Name' column
Filter Rows:
print(df[df['Age'] > 25])  # Keep only rows where Age is greater than 25
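Filters are often combined in practice. The sketch below rebuilds the same sample data so it runs on its own, then shows two common variations: joining conditions with & and using .loc to filter rows and pick columns in one step. The variable names are illustrative, not part of Pandas.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Yuva", "Ganapati", "Charlie"],
    "Age": [25, 30, 35],
    "Salary": [50000, 60000, 70000],
})

# Combine conditions with & (and) or | (or).
# Each condition needs its own parentheses.
older_and_under_70k = df[(df["Age"] > 25) & (df["Salary"] < 70000)]

# .loc takes a row filter and a list of column labels together.
names_over_25 = df.loc[df["Age"] > 25, ["Name", "Age"]]

print(older_and_under_70k)
print(names_over_25)
```

Python's plain `and` / `or` keywords do not work here, because each condition produces a whole column of True/False values rather than a single boolean.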
Add a New Column:
df['Bonus'] = df['Salary'] * 0.1  # Bonus is 10% of Salary
print(df)
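If you would rather leave the original DataFrame untouched, assign() returns a new DataFrame with the extra columns. This is a small sketch; the 10% rate and the Total column are assumptions added for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Yuva", "Ganapati", "Charlie"],
    "Salary": [50000, 60000, 70000],
})

bonus_rate = 0.1  # assumed rate, matching the 10% used above

# assign() returns a copy with the new columns; df itself is not modified.
# A later keyword can use a column created by an earlier one via a lambda.
with_bonus = df.assign(
    Bonus=df["Salary"] * bonus_rate,
    Total=lambda d: d["Salary"] + d["Bonus"],
)

print(with_bonus)
print(list(df.columns))  # the original columns are unchanged
```

This style is handy when chaining several transformation steps, since each step produces a new result instead of changing data in place.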
Summary Statistics:
print(df.describe()) # Summary statistics of numeric columns
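describe() returns its results as a DataFrame, so individual statistics can be read back by label. The sketch below, using the same sample data, also shows the single-statistic methods mean() and max() for cases where the full summary is more than you need.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Yuva", "Ganapati", "Charlie"],
    "Age": [25, 30, 35],
    "Salary": [50000, 60000, 70000],
})

# By default describe() covers only numeric columns, so 'Name' is skipped.
stats = df.describe()

# Read one value out of the summary by row label and column label.
mean_salary = stats.loc["mean", "Salary"]

# Single statistics can also be computed directly on a column.
mean_age = df["Age"].mean()
max_salary = df["Salary"].max()

print(stats)
print(mean_salary, mean_age, max_salary)
```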
Read CSV File:
df = pd.read_csv('data.csv')  # Assumes data.csv exists in the current working directory
Save DataFrame to CSV:
df.to_csv('output.csv', index=False)  # index=False leaves the row index out of the file
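To try the save and load steps without creating files on disk, the round trip can be sketched with an in-memory text buffer. Using io.StringIO in place of a file path is an assumption made for this example; with real files you would pass a path such as 'output.csv' instead.

```python
import io

import pandas as pd

df = pd.DataFrame({
    "Name": ["Yuva", "Ganapati", "Charlie"],
    "Age": [25, 30, 35],
    "Salary": [50000, 60000, 70000],
})

# Write the DataFrame as CSV text into an in-memory buffer.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

# Rewind the buffer and read the CSV back into a new DataFrame.
buffer.seek(0)
restored = pd.read_csv(buffer)

print(csv_text)
print(restored)
```

Because index=False was used when writing, the file contains only the three data columns, and read_csv() gives the restored DataFrame a fresh default index.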
Indian Institute of Embedded Systems – IIES