StudyLover Pandas
Unit:1 Foundations of Python and Its Applications in Machine Learning

Pandas is one of the most widely used Python libraries for data manipulation and analysis. It provides high-performance, easy-to-use data structures and data analysis tools, which is why data scientists, analysts, and engineers commonly rely on it.

The two primary data structures in Pandas are:

  • Series: A one-dimensional labeled array, similar to a column in a spreadsheet.

  • DataFrame: A two-dimensional labeled data structure with columns of potentially different types, much like a full spreadsheet or a SQL table. This is the most commonly used Pandas object.
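
To make the difference between the two structures concrete, here is a minimal sketch. The values and the names sales and frame are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample values, used only for illustration
labels = ['Laptops', 'Monitors', 'Keyboards']

# A Series is one-dimensional: one label per value
sales = pd.Series([150, 220, 310], index=labels, name='Sales_2024')
print(sales['Monitors'])  # look up a value by its label
print(sales.shape)        # a single dimension

# A DataFrame is two-dimensional: rows and columns
frame = pd.DataFrame(
    {'Sales_2024': [150, 220, 310], 'Sales_2025': [180, 250, 290]},
    index=labels,
)
print(frame.shape)               # (rows, columns)
print(type(frame['Sales_2025'])) # each column is itself a Series
```

Selecting a single column from a DataFrame gives back a Series, so the two structures are used together constantly.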


Code Examples

To run these examples, you first need to install Pandas: pip install pandas

1. Creating a DataFrame

You can create a DataFrame from various sources, but a common way is from a dictionary or by reading a file.

 

import pandas as pd

# Create a DataFrame from a dictionary
data = {
    'Product': ['Laptops', 'Monitors', 'Keyboards', 'Mice'],
    'Sales_2024': [150, 220, 310, 450],
    'Sales_2025': [180, 250, 290, 400],
    'Category': ['Electronics', 'Electronics', 'Accessories', 'Accessories']
}
df = pd.DataFrame(data)

print("--- Created DataFrame ---")
print(df)
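
The text above also mentions reading a file. As a rough sketch, pd.read_csv can load CSV data; to keep the example self-contained it reads from an in-memory string instead of a file on disk. The CSV content and the name df_from_csv are assumptions made for illustration.

```python
import io
import pandas as pd

# Hypothetical CSV content; in practice this would usually live in a file
csv_text = """Product,Sales_2024,Sales_2025,Category
Laptops,150,180,Electronics
Monitors,220,250,Electronics
"""

# read_csv accepts a file path or any file-like object
df_from_csv = pd.read_csv(io.StringIO(csv_text))

print("--- DataFrame read from CSV text ---")
print(df_from_csv)
```

With a real file, you would pass its path to pd.read_csv instead of the StringIO object.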

 

2. Inspecting Data

Pandas provides simple methods to get a quick overview of your DataFrame.

 

import pandas as pd

# Recreate the DataFrame from the first example so this snippet runs on its own
data = {
    'Product': ['Laptops', 'Monitors', 'Keyboards', 'Mice'],
    'Sales_2024': [150, 220, 310, 450],
    'Sales_2025': [180, 250, 290, 400],
    'Category': ['Electronics', 'Electronics', 'Accessories', 'Accessories']
}
df = pd.DataFrame(data)

# Get the first few rows
print("--- First 2 Rows (.head(2)) ---")
print(df.head(2))

# Get a concise summary of the DataFrame (.info() prints directly)
print("\n--- DataFrame Info (.info()) ---")
df.info()

# Get descriptive statistics for numerical columns
print("\n--- Descriptive Statistics (.describe()) ---")
print(df.describe())
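
A few other attributes are often handy for a quick overview. The sketch below uses the same sample data and shows .shape, .columns, .dtypes, and .tail().

```python
import pandas as pd

# Same sample data as in the first example
df = pd.DataFrame({
    'Product': ['Laptops', 'Monitors', 'Keyboards', 'Mice'],
    'Sales_2024': [150, 220, 310, 450],
    'Sales_2025': [180, 250, 290, 400],
    'Category': ['Electronics', 'Electronics', 'Accessories', 'Accessories']
})

# Number of rows and columns as a tuple
print("--- Shape (.shape) ---")
print(df.shape)

# Column names as a plain Python list
print("\n--- Column Names (.columns) ---")
print(df.columns.tolist())

# Data type of each column
print("\n--- Column Types (.dtypes) ---")
print(df.dtypes)

# Last row of the DataFrame
print("\n--- Last Row (.tail(1)) ---")
print(df.tail(1))
```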

 

3. Selecting Data

You can select columns, rows, and specific data points in several ways.

 

import pandas as pd

# Recreate the DataFrame from the first example so this snippet runs on its own
data = {
    'Product': ['Laptops', 'Monitors', 'Keyboards', 'Mice'],
    'Sales_2024': [150, 220, 310, 450],
    'Sales_2025': [180, 250, 290, 400],
    'Category': ['Electronics', 'Electronics', 'Accessories', 'Accessories']
}
df = pd.DataFrame(data)

# Select a single column (returns a Series)
print("--- Selecting the 'Product' Column ---")
print(df['Product'])

# Select multiple columns (returns a DataFrame)
print("\n--- Selecting 'Product' and 'Sales_2025' Columns ---")
print(df[['Product', 'Sales_2025']])

# Select rows by their integer position using .iloc
print("\n--- Selecting the first row (.iloc[0]) ---")
print(df.iloc[0])
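
The examples above cover columns and rows by position. For label-based selection and for single data points, .loc, .at, and .iat are the usual tools. The following is a small sketch on the same sample data; the variable names are only illustrative.

```python
import pandas as pd

# Same sample data as in the first example
df = pd.DataFrame({
    'Product': ['Laptops', 'Monitors', 'Keyboards', 'Mice'],
    'Sales_2024': [150, 220, 310, 450],
    'Sales_2025': [180, 250, 290, 400],
    'Category': ['Electronics', 'Electronics', 'Accessories', 'Accessories']
})

# .loc selects by label; with the default index the labels are 0, 1, 2, 3.
# Unlike Python slices, a .loc slice includes its end label.
subset = df.loc[1:2, ['Product', 'Sales_2024']]
print("--- Rows 1 to 2, two columns (.loc) ---")
print(subset)

# .at reads a single value by row label and column name
value = df.at[2, 'Sales_2025']
print("\n--- Keyboards sales in 2025 (.at) ---")
print(value)

# .iat reads a single value by integer row and column position
cell = df.iat[0, 1]
print("\n--- First row, second column (.iat) ---")
print(cell)
```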

 

4. Filtering Data

Filtering with boolean conditions is one of the most useful features of Pandas: it lets you select only the rows that meet certain criteria.

 

import pandas as pd

# Recreate the DataFrame from the first example so this snippet runs on its own
data = {
    'Product': ['Laptops', 'Monitors', 'Keyboards', 'Mice'],
    'Sales_2024': [150, 220, 310, 450],
    'Sales_2025': [180, 250, 290, 400],
    'Category': ['Electronics', 'Electronics', 'Accessories', 'Accessories']
}
df = pd.DataFrame(data)

# Filter for products with 2024 sales greater than 300
high_sales_2024 = df[df['Sales_2024'] > 300]
print("--- Products with 2024 Sales > 300 ---")
print(high_sales_2024)

# Filter for products in the 'Electronics' category
electronics_products = df[df['Category'] == 'Electronics']
print("\n--- Products in 'Electronics' Category ---")
print(electronics_products)
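
Conditions can also be combined. The sketch below, on the same sample data, uses & (and), | (or), and .isin(). Each condition needs its own parentheses, because & and | bind more tightly than comparison operators. The variable names are only illustrative.

```python
import pandas as pd

# Same sample data as in the first example
df = pd.DataFrame({
    'Product': ['Laptops', 'Monitors', 'Keyboards', 'Mice'],
    'Sales_2024': [150, 220, 310, 450],
    'Sales_2025': [180, 250, 290, 400],
    'Category': ['Electronics', 'Electronics', 'Accessories', 'Accessories']
})

# Both conditions must hold: accessories with 2025 sales of at least 300
both = df[(df['Category'] == 'Accessories') & (df['Sales_2025'] >= 300)]
print("--- Accessories with 2025 Sales >= 300 ---")
print(both)

# Either condition may hold: low 2024 sales or high 2025 sales
either = df[(df['Sales_2024'] < 200) | (df['Sales_2025'] >= 400)]
print("\n--- 2024 Sales < 200 or 2025 Sales >= 400 ---")
print(either)

# Match against a list of allowed values
named = df[df['Product'].isin(['Laptops', 'Mice'])]
print("\n--- Products named Laptops or Mice ---")
print(named)
```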

 

5. GroupBy and Aggregation

The groupby() method allows you to split your data into groups based on some criteria and then apply a function (like sum, mean, count) to each group.

 

import pandas as pd

# Recreate the DataFrame from the first example so this snippet runs on its own
data = {
    'Product': ['Laptops', 'Monitors', 'Keyboards', 'Mice'],
    'Sales_2024': [150, 220, 310, 450],
    'Sales_2025': [180, 250, 290, 400],
    'Category': ['Electronics', 'Electronics', 'Accessories', 'Accessories']
}
df = pd.DataFrame(data)

# Group by 'Category' and calculate the sum of sales for each group
category_sales = df.groupby('Category')[['Sales_2024', 'Sales_2025']].sum()
print("--- Total Sales by Category ---")
print(category_sales)
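
Several functions can be applied to each group in one step with .agg(). This is a minimal sketch on the same sample data; the output column names total_2025, mean_2025, and products are hypothetical labels chosen for this example.

```python
import pandas as pd

# Same sample data as in the first example
df = pd.DataFrame({
    'Product': ['Laptops', 'Monitors', 'Keyboards', 'Mice'],
    'Sales_2024': [150, 220, 310, 450],
    'Sales_2025': [180, 250, 290, 400],
    'Category': ['Electronics', 'Electronics', 'Accessories', 'Accessories']
})

# Named aggregation: new_column=(source_column, function)
summary = df.groupby('Category').agg(
    total_2025=('Sales_2025', 'sum'),    # total 2025 sales per category
    mean_2025=('Sales_2025', 'mean'),    # average 2025 sales per category
    products=('Product', 'count'),       # number of products per category
)

print("--- 2025 Sales Summary by Category ---")
print(summary)
```

The result is a DataFrame indexed by Category, with one column per aggregation.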

 

6. Creating New Columns

You can easily create new columns, often based on calculations from existing columns.

 

import pandas as pd

# Recreate the DataFrame from the first example so this snippet runs on its own
data = {
    'Product': ['Laptops', 'Monitors', 'Keyboards', 'Mice'],
    'Sales_2024': [150, 220, 310, 450],
    'Sales_2025': [180, 250, 290, 400],
    'Category': ['Electronics', 'Electronics', 'Accessories', 'Accessories']
}
df = pd.DataFrame(data)

# Create a new column for the change in sales from 2024 to 2025
df['Sales_Change'] = df['Sales_2025'] - df['Sales_2024']
print("--- DataFrame with New 'Sales_Change' Column ---")
print(df)
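
An alternative is .assign(), which returns a new DataFrame and leaves the original unchanged. The sketch below computes a percentage change on the same sample data; the column name Growth_Pct and the variable df_with_growth are hypothetical names chosen for this example.

```python
import pandas as pd

# Same sample data as in the first example
df = pd.DataFrame({
    'Product': ['Laptops', 'Monitors', 'Keyboards', 'Mice'],
    'Sales_2024': [150, 220, 310, 450],
    'Sales_2025': [180, 250, 290, 400],
    'Category': ['Electronics', 'Electronics', 'Accessories', 'Accessories']
})

# Percentage change from 2024 to 2025, computed element-wise per row
growth = (df['Sales_2025'] - df['Sales_2024']) / df['Sales_2024'] * 100

# .assign() returns a copy with the extra column; df itself is not modified
df_with_growth = df.assign(Growth_Pct=growth).round({'Growth_Pct': 1})

print("--- DataFrame with 'Growth_Pct' Column ---")
print(df_with_growth)
```

Because .assign() does not modify df, it is convenient when several steps are chained together.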

 

Copyright © StudyLover