Introduction

pandas is an open-source Python library for data analysis and manipulation.

Loading a dataset

If you are working in a Jupyter notebook on Google Colab, the environment comes with all the required dependencies installed.

If you are working on a project locally, see this article on how to set up your virtual environment.

To load a dataset from a CSV file, import pandas and call pd.read_csv with the path to the file:


import pandas as pd

dataframe = pd.read_csv('path/to/file.csv')

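pd.read_csv returns a DataFrame object: a two-dimensional, labeled table that holds the values from the file. It is not a Python dictionary, but its columns can be accessed by name in a similar way, and it comes with many methods for manipulating the data.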
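As a minimal sketch of what comes back, the snippet below reads a small in-memory CSV instead of a file on disk. The sample data and the column names (name, age) are made up for illustration and do not come from any real dataset.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the contents of a CSV file
csv_text = "name,age\nAda,36\nLinus,54\n"

# read_csv accepts a file path or a file-like object such as StringIO
sample = pd.read_csv(io.StringIO(csv_text))

# Columns are accessed by name, much like keys in a dictionary
ages = sample["age"]

print(type(sample).__name__)  # DataFrame
print(list(sample.columns))   # ['name', 'age']
print(ages.tolist())          # [36, 54]
```

Using io.StringIO keeps the example self-contained; with a real dataset you would pass the file path instead.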

# print the first five rows of the data
print(dataframe.head())

# print a summary of the columns: names, dtypes, and non-null counts
dataframe.info()
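head returns the first five rows by default and also accepts a row count. The sketch below uses a small made-up table (the names demo, city, and visits are assumptions for illustration) to show this alongside the shape attribute, which gives the number of rows and columns.

```python
import pandas as pd

# Hypothetical data, built directly so the example needs no file
demo = pd.DataFrame({
    "city": ["Lagos", "Nairobi", "Accra", "Kigali"],
    "visits": [12, 7, 9, 4],
})

# Ask for only the first two rows instead of the default five
first_two = demo.head(2)
print(first_two)

# shape is a (rows, columns) tuple
print(demo.shape)  # (4, 2)
```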

Read more about Pandas here

Found this article helpful? You may follow me on Twitter where I tweet about interesting topics on software development.