To get the rows, or row IDs, at which each column of a DataFrame reaches its maximum value, use the pandas DataFrame method .idxmax().
.idxmax() returns the row ID (index label) of the maximum value of each column of a DataFrame. If the maximum occurs more than once in a column, the label of its first occurrence is returned.
We are not using .max() here because .max() returns the actual value rather than the row in which that value resides. At times it is useful to look at the whole row of a DataFrame in which a column's maximum value is present. That's where .idxmax() is useful.
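A minimal sketch of the difference between the two methods, assuming a small made-up DataFrame (the column names, row labels and values are illustrative only):

```python
import pandas as pd

# Hypothetical data: labels and numbers are invented for illustration.
df = pd.DataFrame(
    {"math": [70, 95, 88], "physics": [82, 60, 91]},
    index=["alice", "bob", "carol"],
)

max_values = df.max()     # the largest value in each column
max_labels = df.idxmax()  # the row label at which each largest value sits

# A label returned by idxmax() can be passed to .loc to pull the whole row.
top_math_row = df.loc[max_labels["math"]]

print(max_values)
print(max_labels)
print(top_math_row)
```

Here .max() only reports the values, while the label from .idxmax() lets you see the other columns of the same row as well.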
Syntax:
DataFrame.idxmax(axis=0, skipna=True)
Parameters :
axis : 0 or ‘index’ (default) to return, for each column, the row label of its maximum; 1 or ‘columns’ to return, for each row, the column label of its maximum
skipna : Exclude NA/null values. If an entire row/column is NA, the result will be NA
Returns : idxmax : Series
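To make the two parameters concrete, here is a short sketch assuming a hypothetical DataFrame with one missing value. It only uses the default skipna=True, so the NaN is simply ignored when the maximum is located:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in column "q1".
scores = pd.DataFrame(
    {"q1": [3.0, np.nan, 7.0], "q2": [9.0, 4.0, 2.0]},
    index=["r0", "r1", "r2"],
)

# axis=0 (default): one result per column, giving a row label.
per_column = scores.idxmax(axis=0)

# axis=1: one result per row, giving a column label.
per_row = scores.idxmax(axis=1)

print(per_column)
print(per_row)
```

In both cases the result is a Series: indexed by column names for axis=0, and by row labels for axis=1.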
The following code, with comments, descriptions and results, shows the commands to run in Python 3 to get the row IDs of the maximum values of each column using .idxmax():
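A minimal sketch of such a walkthrough, assuming a made-up sales table (the DataFrame, its integer row IDs and its values are hypothetical and stand in for any real dataset):

```python
import pandas as pd

# Hypothetical data: integer row IDs 101-104 with two numeric columns.
sales = pd.DataFrame(
    {"units": [12, 30, 25, 8], "revenue": [240.0, 450.0, 610.0, 95.0]},
    index=[101, 102, 103, 104],
)

# Row ID of the maximum value of each column.
row_ids = sales.idxmax()

# Full rows in which those maxima occur, one row per column checked.
rows_at_max = sales.loc[row_ids.tolist()]

# Report each column together with the row that holds its maximum.
for column, row_id in row_ids.items():
    print(column, row_id, sales.loc[row_id].to_dict())
```

If two columns reach their maximum in the same row, that row appears twice in rows_at_max; call .drop_duplicates() on it if each row should be listed once.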
Electronics Engineer by book, Software Architect and Technopreneur by passion, Open Source Enthusiast, Problem Hacker, Enabler, Do-Tank, Blogger, Autodidact, Yogi and an avid Reader. Involved in Building Products. Having loads of experience and technical expertise in areas ranging from Full Stack Web Application Development to Big Data Analysis, Modeling, Processing and Visualization, he is currently involved in working on Python, Django, Javascript, SQL, Bootstrap, PostgreSQL, RRD (Round Robin Database), MySQL, MonetDB, LevelDB, BerkeleyDB, Redis, Apache Spark, Pandas, SciPy, NumPy etc.
Ali Raza received his Masters Degree in Electronics Engineering which involved Research focused on Machine Learning. He is currently working as a Chief Technical Officer at BitWits (Pvt) Limited, CEO & Founder at DataLysis.io and CEO & Founder at LearningByDoing.io.