Pandas – Apply

The pandas apply() function applies a function along an axis of a DataFrame. This tutorial explains how to use the apply() method with a DataFrame by applying a NumPy function, a lambda function, and a user-defined function.

Syntax:

DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), **kwds)

Parameters:

func - Function to apply to each column or row.
axis - (default 0) Axis along which the function is applied:
       0 or ‘index’: apply function to each column.
       1 or ‘columns’: apply function to each row.

raw - default False
      False: passes each row or column as a Series to the function
      True: the passed function will receive ndarray objects instead. If you are just 
      applying a NumPy reduction function this will achieve much better performance.

result_type - {‘expand’, ‘reduce’, ‘broadcast’, None}, default None
              These only act when axis=1 (columns):
              ‘expand’ : list-like results will be turned into columns.
              ‘reduce’ : returns a Series if possible rather than expanding list-like 
                         results. This is the opposite of ‘expand’.
              ‘broadcast’ : results will be broadcast to the original shape of the 
                            DataFrame, the original index and columns will be retained. 

args - tuple. Positional arguments to pass to func in addition to the array/Series.

**kwds - Additional keyword arguments to pass to func.
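The way extra arguments reach func can be sketched as follows. This is a minimal illustration, not part of the original tutorial: scale_and_shift, factor, and offset are hypothetical names chosen for the example.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [6, 1], [9, 5], [1, 4]], columns=list('AB'))

# Hypothetical helper: receives one column (a Series) at a time,
# multiplies it by `factor`, then adds `offset`.
def scale_and_shift(col, factor, offset=0):
    return col * factor + offset

# `factor` is supplied positionally through args,
# `offset` is forwarded as a keyword argument through **kwds.
result = df.apply(scale_and_shift, args=(10,), offset=1)
print(result)
```

Note that args must be a tuple, so a single extra argument is written as (10,) with a trailing comma.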

Examples

In [1]:
# Let's define a DataFrame
import pandas as pd
import numpy as np
df = pd.DataFrame([[1, 2], [6, 1], [9, 5], [1, 4]], columns=list('AB'))
df

Out[1]:
   A  B
0  1  2
1  6  1
2  9  5
3  1  4

Using a numpy function

In [2]: df.apply(np.square)   # Find the square of each element
Out[2]:
    A   B
0   1   4
1  36   1
2  81  25
3   1  16

In [3]: df.apply(np.sum, axis=1)    # Find the sum over each row
Out[3]:
0     3
1     7
2    14
3     5
dtype: int64

In [4]: df.apply(np.sum, axis=0)   # Find the sum over each column
Out[4]:
A    17
B    12
dtype: int64
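The raw parameter described under Parameters can be tried with the same reduction. The sketch below rebuilds the two-column DataFrame so that it runs on its own. It only checks that the function receives ndarray objects and that the sums match the default behaviour. Any speed-up depends on the data and the pandas version, so none is measured here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2], [6, 1], [9, 5], [1, 4]], columns=list('AB'))

# With raw=True each column is passed to the function as a NumPy ndarray
# instead of a Series.
is_array = df.apply(lambda col: isinstance(col, np.ndarray), raw=True)

# The column sums are the same as with the default raw=False.
col_sums = df.apply(np.sum, axis=0, raw=True)
print(col_sums)
```

Because an ndarray carries no index or labels, a function that relies on Series features such as col.name or label-based access will not work with raw=True.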

Using a user-defined function

In [5]: df
Out[5]:
   A  B
0  1  2
1  6  1
2  9  5
3  1  4

# Let's define a function
In [6]:
def multiply_by_5(x):
    return x*5

# Apply the 'multiply_by_5' function to each element of column 'B' and store the result in a new column 'C'
In [7]: df['C'] = df['B'].apply(multiply_by_5) 

In [8]: df
Out[8]:
   A  B   C
0  1  2  10
1  6  1   5
2  9  5  25
3  1  4  20

Using a lambda function

In [9]: df['D'] = df['A'].apply(lambda x: x*5)

In [10]: df
Out[10]:
   A  B   C   D
0  1  2  10   5
1  6  1   5  30
2  9  5  25  45
3  1  4  20   5
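The lambda above works on a single column. A lambda can also be applied to the whole DataFrame with axis=1, in which case it receives each row as a Series and can combine several columns. A minimal sketch, using a fresh copy of the original two-column DataFrame so that it runs on its own:

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [6, 1], [9, 5], [1, 4]], columns=list('AB'))

# With axis=1 the lambda receives one row at a time as a Series,
# so individual values can be looked up by column label.
products = df.apply(lambda row: row['A'] * row['B'], axis=1)
print(products)
```

For simple arithmetic like this, the vectorized expression df['A'] * df['B'] gives the same result and is usually faster. Row-wise apply() is mainly useful when the logic cannot easily be expressed with column operations.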

Returning a list-like will result in a Series

In [11]: df.apply(lambda x: [1, 2], axis=1)
Out[11]:
0    [1, 2]
1    [1, 2]
2    [1, 2]
3    [1, 2]
dtype: object

Passing result_type='expand' will expand list-like results into the columns of a DataFrame

In [12]: df.apply(lambda x: [1, 2], axis=1, result_type='expand')
Out[12]:
   0  1
0  1  2
1  1  2
2  1  2
3  1  2
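The 'broadcast' option listed under Parameters can be sketched in the same way. The example rebuilds the original two-column DataFrame, because the list returned by the function must have one value per column for broadcasting to work. With the four-column DataFrame from the previous steps, a two-element list would not fit.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [6, 1], [9, 5], [1, 4]], columns=list('AB'))

# result_type='broadcast' keeps the original shape, index and column labels;
# each returned list is written back into the corresponding row.
broadcast = df.apply(lambda row: [1, 2], axis=1, result_type='broadcast')
print(broadcast)
```

Compared with 'expand', which produces new columns labelled 0 and 1, 'broadcast' keeps the column names A and B.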
