Data Analytics in Python – How to use .loc, .iloc, .ix in Pandas – Learning by Doing


In data science and machine learning, the first step after getting access to data is to analyze it. Data analysis is an essential part of extracting useful information from the data.

Before applying any machine learning model or technique, it is necessary to know the attributes and dimensions of the data so that it can be treated accordingly. In this tutorial, we take a hands-on approach and analyze an actual dataset that is used for machine learning: the Iris dataset from the UCI Machine Learning Repository. We use Python and Pandas for this purpose and select data with .loc and .iloc. The older .ix indexer was deprecated in Pandas 0.20 and removed in Pandas 1.0, so the steps here do not rely on it. We start by loading the data and naming its columns as per the data description in the repository.

import pandas as pd

df = pd.read_csv(
    filepath_or_buffer='https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data',
    header=None,
    sep=',')

df.columns=['sepal_len', 'sepal_wid', 'petal_len', 'petal_wid', 'class']

df.dropna(how="all", inplace=True) # drops the empty line at file-end

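# Preview the first and the last five rows
df.head()
df.tail()

# Use the class name as the row index so rows can be selected by label with .loc
df = df.set_index('class')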
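
Before selecting anything, it can help to confirm what was actually loaded. The sketch below follows the same loading steps but, as an assumption made so that it runs without a network connection, reads a few hand-typed rows in the same five-column layout instead of downloading the file. The name `sample` and the values in it are illustrative only.

```python
import io
import pandas as pd

# Hand-typed rows in the same layout as iris.data (illustrative values only).
raw = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
    "6.3,3.3,6.0,2.5,Iris-virginica\n"
    "5.8,2.7,5.1,1.9,Iris-virginica\n"
)

sample = pd.read_csv(raw, header=None, sep=",")
sample.columns = ["sepal_len", "sepal_wid", "petal_len", "petal_wid", "class"]
sample = sample.dropna(how="all")

# Dimensions, column types, and how many rows each class has.
print(sample.shape)
print(sample.dtypes)
print(sample["class"].value_counts())

# After set_index the class names become row labels, and they repeat.
sample = sample.set_index("class")
print(sample.index.is_unique)
```

Because the index labels repeat, a single label passed to .loc can match many rows at once.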

SELECTING A COLUMN IN PANDAS:

df['petal_len']

SELECTING MULTIPLE COLUMNS IN PANDAS:

df[['petal_len', 'petal_wid']]
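
The two forms of column selection return different types: a single label gives a Series, while a list of labels gives a DataFrame, even when the list has only one item. A minimal sketch, assuming a small hand-typed frame (`sample` and its values are illustrative, not the real dataset):

```python
import pandas as pd

# Illustrative frame with the same column names, indexed by class.
sample = pd.DataFrame(
    {
        "sepal_len": [5.1, 7.0, 6.3],
        "sepal_wid": [3.5, 3.2, 3.3],
        "petal_len": [1.4, 4.7, 6.0],
        "petal_wid": [0.2, 1.4, 2.5],
    },
    index=pd.Index(
        ["Iris-setosa", "Iris-versicolor", "Iris-virginica"], name="class"
    ),
)

one = sample["petal_len"]                   # single label -> Series
many = sample[["petal_len", "petal_wid"]]   # list of labels -> DataFrame
also_df = sample[["petal_len"]]             # one-item list -> still a DataFrame

print(type(one).__name__, one.shape)
print(type(many).__name__, many.shape)
print(type(also_df).__name__, also_df.shape)
```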

SELECTING ALL ROWS BY INDEX LABEL:

# Select all rows with class 'Iris-virginica'
df.loc['Iris-virginica']
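
.loc also accepts a column label after the row label, a list of labels, or a boolean mask. The sketch below shows these variants on a small hand-typed frame (`sample` and its values are illustrative assumptions). Note that a label that occurs once returns a single row as a Series, whereas a repeated label returns a DataFrame.

```python
import pandas as pd

# Illustrative frame; 'Iris-virginica' appears twice, 'Iris-setosa' once.
sample = pd.DataFrame(
    {"petal_len": [1.4, 4.7, 6.0, 5.1], "petal_wid": [0.2, 1.4, 2.5, 1.9]},
    index=pd.Index(
        ["Iris-setosa", "Iris-versicolor", "Iris-virginica", "Iris-virginica"],
        name="class",
    ),
)

virginica = sample.loc["Iris-virginica"]           # repeated label -> DataFrame
setosa = sample.loc["Iris-setosa"]                 # unique label -> Series (one row)
setosa_df = sample.loc[["Iris-setosa"]]            # list of labels -> DataFrame
petal = sample.loc["Iris-virginica", "petal_len"]  # rows and one column -> Series
long_petals = sample.loc[sample["petal_len"] > 5.0]  # boolean mask on a column

print(virginica)
print(petal.tolist())
print(long_petals)
```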

SELECTING ROWS IN PANDAS:

# Select the first five rows (positions 0 to 4; the stop position is excluded)
df.iloc[:5]

# Select the fourth and fifth rows (positions 3 and 4)
df.iloc[3:5]

# Select every row after the fifth row (position 5 onwards)
df.iloc[5:]
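
Positions in .iloc start at zero and the stop position of a slice is excluded, so `iloc[a:b]` returns `b - a` rows. A small sketch with a hypothetical ten-row frame (`rows` is an illustrative name, and each value equals its one-based row number) makes the counting easy to check:

```python
import pandas as pd

# Hypothetical frame: the value k sits in the k-th row (position k - 1).
rows = pd.DataFrame({"value": range(1, 11)})

first_five = rows.iloc[:5]     # positions 0..4
fourth_fifth = rows.iloc[3:5]  # positions 3 and 4
after_fifth = rows.iloc[5:]    # positions 5..9
only_fourth = rows.iloc[3:4]   # stop is excluded, so only position 3

print(first_five["value"].tolist())
print(fourth_fifth["value"].tolist())
print(after_fifth["value"].tolist())
print(only_fourth["value"].tolist())
```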

SELECTING COLUMNS IN PANDAS:

# Select the first 2 columns
df.iloc[:,:2]
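
Code written for .ix, which mixed labels and positions in one call, no longer runs on Pandas 1.0 or later. One way to express the same mixed lookups is to translate one side first and then use .loc or .iloc. The following is a minimal sketch under that assumption, using a small hand-typed frame (`sample` and its values are illustrative):

```python
import pandas as pd

# Illustrative frame with the same column names, indexed by class.
sample = pd.DataFrame(
    {
        "sepal_len": [5.1, 4.9, 6.3, 5.8],
        "sepal_wid": [3.5, 3.0, 3.3, 2.7],
        "petal_len": [1.4, 1.4, 6.0, 5.1],
        "petal_wid": [0.2, 0.2, 2.5, 1.9],
    },
    index=pd.Index(
        ["Iris-setosa", "Iris-setosa", "Iris-virginica", "Iris-virginica"],
        name="class",
    ),
)

# Rows by label, columns by position: turn the positions into labels for .loc.
first_two_cols = sample.columns[:2]
by_label = sample.loc["Iris-virginica", first_two_cols]

# Rows by position, column by label: turn the label into a position for .iloc.
col_pos = sample.columns.get_loc("petal_len")
by_position = sample.iloc[:2, col_pos]

print(by_label)
print(by_position.tolist())
```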

Electronics Engineer by book, Software Architect and Technopreneur by passion, Open Source Enthusiast, Problem Hacker, Enabler, Do-Tank, Blogger, Autodidact, Yogi and an avid Reader. Involved in Building Products. Having loads of experience and technical expertise in areas ranging from Full Stack Web Application Development to Big Data Analysis, Modeling, Processing and Visualization, he is currently involved in working on Python, Django, Javascript, SQL, Bootstrap, PostgreSQL, RRD (Round Robin Database), MySQL, MonetDB, LevelDB, BerkeleyDB, Redis, Apache Spark, Pandas, SciPy, NumPy etc.

Ali Raza received his Masters Degree in Electronics Engineering which involved Research focused on Machine Learning. He is currently working as a Chief Technical Officer at BitWits (Pvt) Limited, CEO & Founder at DataLysis.io and CEO & Founder at LearningByDoing.io.
