Pandas – Python library

Pandas, created by Wes McKinney in 2008, is a powerful open-source Python library for data manipulation and analysis. McKinney needed a flexible tool for handling financial data and developed Pandas to address the shortcomings he found in existing data analysis workflows. With its intuitive DataFrame and Series structures, Pandas simplifies cleaning, transforming, and exploring structured data. Since then, the library has become a fundamental component of the Python ecosystem, used widely in data science, finance, and research. Today, Pandas remains a go-to solution for anyone working with tabular data in Python.

We are going to learn pandas starting from the beginner level.

Step 1 : Download Anaconda – https://www.anaconda.com/download/

Step 2 : Open a terminal and run jupyter notebook (a browser tab will open at localhost, port 8888 by default)

Step 3 : Create a folder in any location and insert the ‘titanic.csv’ file into the newly created folder.

Download the titanic.csv file : https://drive.google.com/file/d/1I_TNj706IerJbc0CSDvJVofxYiU_S-Kv/view?usp=sharing (we will simply be analyzing the ‘titanic.csv’ file)

Step 4 : Open the created folder in Jupyter and add a new Python 3 (ipykernel) notebook

Step 5 : Open the created notebook (this window will pop up)

Let’s start by reading the ‘titanic.csv’ file and analyzing it.

import pandas as pd

This command imports the pandas library under the alias ‘pd’.

df = pd.read_csv('titanic.csv')

This command reads a CSV file and stores the result, a DataFrame, in the ‘df’ variable.
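To get a feel for what read_csv gives back, here is a minimal sketch that reads a tiny in-memory CSV, so it runs even without the downloaded file. The column names and values are made up for illustration and are not taken from ‘titanic.csv’; the name sample_df is also just an assumption for this example.

```python
import io

import pandas as pd

# Hypothetical sample standing in for a real CSV file on disk.
csv_text = """Name,Age,Survived
Alice,29,1
Bob,41,0
Carol,35,1
"""

# read_csv accepts a file path or any file-like object such as StringIO.
sample_df = pd.read_csv(io.StringIO(csv_text))

print(type(sample_df))          # the result is a pandas DataFrame
print(sample_df.shape)          # (number of rows, number of columns)
print(list(sample_df.columns))  # column names taken from the header line
```

With the real file you would pass the path 'titanic.csv' instead of the StringIO object.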

# Reading a CSV file

csv_file_path = 'path/to/your/file.csv'

df_csv = pd.read_csv(csv_file_path)

# Reading an Excel file (XLSX); this needs an Excel engine such as openpyxl installed

xlsx_file_path = 'path/to/your/file.xlsx'

df_xlsx = pd.read_excel(xlsx_file_path)

# Reading a JSON file

json_file_path = 'path/to/your/file.json'

df_json = pd.read_json(json_file_path)

In this way you can read different file formats in pandas.
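As a rough sketch of the idea that each reader returns the same kind of object, the example below loads the same two records once as CSV and once as JSON from in-memory text. The data and the variable names are invented for illustration. Excel is left out here only because it needs an extra engine and a real file.

```python
import io

import pandas as pd

# The same two hypothetical records, written in two formats.
csv_text = "name,age\nAlice,29\nBob,41\n"
json_text = '[{"name": "Alice", "age": 29}, {"name": "Bob", "age": 41}]'

# Both readers accept file-like objects as well as file paths.
from_csv = pd.read_csv(io.StringIO(csv_text))
from_json = pd.read_json(io.StringIO(json_text))

# Whatever the source format, the result is a DataFrame with the same columns.
print(list(from_csv.columns))
print(list(from_json.columns))
print(from_csv["age"].tolist() == from_json["age"].tolist())
```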

Let’s learn some of the built-in functions in the Pandas library.

1. head() -> It returns the first 5 rows of the DataFrame.

If you want a different number of rows, simply pass the desired number to the function.

df.head(10)

df.head()
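The two calls above can be tried on a small made-up table. This is only a sketch: the 12-row DataFrame and its column names are assumptions for the example, not the Titanic data.

```python
import pandas as pd

# Hypothetical 12-row table so that head() has something to cut off.
demo = pd.DataFrame({
    "passenger_id": range(1, 13),
    "age": [22, 38, 26, 35, 35, 27, 54, 2, 27, 14, 4, 58],
})

first_five = demo.head()    # default: first 5 rows
first_ten = demo.head(10)   # explicit row count

print(len(first_five))
print(len(first_ten))
print(first_five["passenger_id"].tolist())
```

Note that head() returns a new DataFrame; in a notebook, the last expression in a cell is displayed automatically, which is why df.head() appears to "print" the rows.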

Similarly,

2. tail() -> It returns the last 5 rows of the DataFrame.
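A short sketch of tail() on a made-up table (the data below is assumed for illustration only). Like head(), it accepts an optional row count.

```python
import pandas as pd

# Hypothetical 12-row table with the default 0..11 index.
demo = pd.DataFrame({"passenger_id": range(1, 13)})

last_five = demo.tail()    # default: last 5 rows
last_three = demo.tail(3)  # explicit row count

# The original index labels are kept, so you can see where the rows came from.
print(last_five.index.tolist())
print(last_three["passenger_id"].tolist())
```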

3. iloc -> It selects specific rows of your choice by their integer position (counting from 0).

df.iloc[50]

Selecting a specific row using iloc
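Besides a single position, iloc also takes slices and lists of positions. The sketch below uses a small invented table; the names and ages are hypothetical.

```python
import pandas as pd

# Hypothetical passenger table; the values are illustrative only.
passengers = pd.DataFrame({
    "name": ["Alice", "Bob", "Carol", "Dan", "Eve"],
    "age": [29, 41, 35, 52, 23],
})

# A single integer returns that row as a Series (positions start at 0).
third_row = passengers.iloc[2]

# A slice returns a DataFrame; the end position is excluded.
middle_rows = passengers.iloc[1:3]

# A list of positions picks arbitrary rows in the given order.
picked_rows = passengers.iloc[[0, 4]]

print(third_row["name"])
print(middle_rows["name"].tolist())
print(picked_rows["name"].tolist())
```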

4. Use of the loc function
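As a rough sketch of how loc is typically used: it selects by label rather than by position, and it also accepts boolean conditions. The table, its row labels, and the column names below are made up for illustration.

```python
import pandas as pd

# Hypothetical table whose index holds row labels instead of 0, 1, 2, ...
passengers = pd.DataFrame(
    {"age": [29, 41, 35, 52], "survived": [1, 0, 1, 0]},
    index=["Alice", "Bob", "Carol", "Dan"],
)

# A single label returns that row as a Series.
bob = passengers.loc["Bob"]

# Label slices include both endpoints, unlike iloc slices.
bob_to_carol = passengers.loc["Bob":"Carol"]

# A boolean condition selects matching rows; the second argument picks a column.
ages_over_30 = passengers.loc[passengers["age"] > 30, "age"]

print(bob["age"])
print(bob_to_carol.index.tolist())
print(ages_over_30.tolist())
```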
