Pandas also provides various functions and attributes for data processing. This tutorial explains several of the built-in ones with examples.
Function | Description |
df.head(n) | Return the first n rows of the Series/DataFrame (default n=5). |
df.tail(n) | Return the last n rows of the Series/DataFrame (default n=5). |
df.shape | Return the dimensions of the DataFrame as a (rows, columns) tuple. |
df.index | Return the index (row labels) of the DataFrame. |
df.columns | Return the column labels of the DataFrame. |
df.info() | Print a concise summary of the DataFrame. |
df.values | Return the Series/DataFrame as a NumPy ndarray. |
df.size | Return the total number of elements in the DataFrame. |
df.ndim | Return the number of dimensions. |
df.dtypes | Return the data type of each column. |
df.axes | Return a list containing the row axis labels and the column axis labels. |
df.empty | Return True if the Series/DataFrame is empty. |
. . .
Example
In [1]: import pandas as pd
   ...: data = {'Name': ['Tomy', 'Jack', 'Steve', 'Ricky', 'Mark', 'Johi'],
   ...:         'Age': [28, 34, 29, 51, 25, 60],
   ...:         'gender': ['F', 'M', 'M', 'F', 'M', 'F']}
   ...: df = pd.DataFrame(data)
   ...: df
Out[1]:
    Name  Age gender
0   Tomy   28      F
1   Jack   34      M
2  Steve   29      M
3  Ricky   51      F
4   Mark   25      M
5   Johi   60      F

In [2]: df.shape  # Return the dimensions of the DataFrame
Out[2]: (6, 3)
df.head(n) – This function returns the first n rows of the DataFrame/Series (Default n=5).
In [3]: df.head()
Out[3]:
    Name  Age gender
0   Tomy   28      F
1   Jack   34      M
2  Steve   29      M
3  Ricky   51      F
4   Mark   25      M
df.tail(n) – This function returns the last n rows of the DataFrame/Series (Default n=5).
In [4]: df.tail(n=3)
Out[4]:
    Name  Age gender
3  Ricky   51      F
4   Mark   25      M
5   Johi   60      F
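Since head and tail are described as working on a Series as well as a DataFrame, the following is a minimal sketch of both on a Series. The `ages` Series is a hypothetical stand-in that mirrors the Age column of the example DataFrame.

```python
import pandas as pd

# Hypothetical Series mirroring the 'Age' column used in this tutorial.
ages = pd.Series([28, 34, 29, 51, 25, 60], name="Age")

first_two = ages.head(2)     # first 2 values (row labels 0 and 1)
last_two = ages.tail(2)      # last 2 values (row labels 4 and 5)
default_head = ages.head()   # n is omitted, so the default of 5 applies

print(first_two.tolist())
print(last_two.tolist())
print(len(default_head))
```

Note that tail keeps the original row labels, so the last two values are still labelled 4 and 5.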
DataFrame.values – Return a NumPy representation of the DataFrame. Only the values in the DataFrame are returned; the axis labels are removed. The pandas documentation recommends DataFrame.to_numpy() for new code.
In [5]: df.values
Out[5]:
array([['Tomy', 28, 'F'],
       ['Jack', 34, 'M'],
       ['Steve', 29, 'M'],
       ['Ricky', 51, 'F'],
       ['Mark', 25, 'M'],
       ['Johi', 60, 'F']], dtype=object)
DataFrame.to_numpy() – Convert the DataFrame to a NumPy array.
In [6]: df.to_numpy()
Out[6]:
array([['Tomy', 28, 'F'],
       ['Jack', 34, 'M'],
       ['Steve', 29, 'M'],
       ['Ricky', 51, 'F'],
       ['Mark', 25, 'M'],
       ['Johi', 60, 'F']], dtype=object)
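Both outputs above report dtype=object because the DataFrame mixes strings and integers, and a NumPy array holds a single dtype. As a small sketch (using a shortened, hypothetical copy of the example data), converting only the numeric column keeps an integer dtype:

```python
import numpy as np
import pandas as pd

# Shortened, hypothetical version of the tutorial's example data.
df = pd.DataFrame({
    "Name": ["Tomy", "Jack", "Steve"],
    "Age": [28, 34, 29],
})

mixed = df.to_numpy()           # strings and integers together -> object dtype
ages = df[["Age"]].to_numpy()   # numeric column only -> integer dtype

print(mixed.dtype, mixed.shape)
print(ages.dtype, ages.shape)
```

Selecting the columns of interest before converting is usually preferable to working with an object array, since most NumPy numeric operations expect a numeric dtype.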
DataFrame.index – Return the index (row labels) of the DataFrame.
In [6]: df.index
Out[6]: RangeIndex(start=0, stop=6, step=1)
DataFrame.columns – Return the column labels of the DataFrame.
In [7]: df.columns
Out[7]: Index(['Name', 'Age', 'gender'], dtype='object')
DataFrame.dtypes – Return the data type of each column of the DataFrame.
In [8]: df.dtypes
Out[8]:
Name      object
Age        int64
gender    object
dtype: object
DataFrame.info() – Print a concise summary of a DataFrame. This method prints information about a DataFrame including the index dtype and column dtypes, non-null values and memory usage.
In [9]: df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6 entries, 0 to 5
Data columns (total 3 columns):
Name      6 non-null object
Age       6 non-null int64
gender    6 non-null object
dtypes: int64(1), object(2)
memory usage: 272.0+ bytes
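Because info() prints its summary and returns None, the text cannot be stored by assigning the call to a variable. One possible way to keep it, sketched here with a small hypothetical DataFrame, is to pass an in-memory buffer through the `buf` parameter:

```python
import io

import pandas as pd

# Small hypothetical DataFrame, just for illustration.
df = pd.DataFrame({"Name": ["Tomy", "Jack"], "Age": [28, 34]})

buffer = io.StringIO()
result = df.info(buf=buffer)   # the summary is written to the buffer, not stdout
summary = buffer.getvalue()    # the summary as an ordinary string

print(result)    # info() itself returns None
print(summary)
```

The exact layout of the summary varies between pandas versions, so code that parses it should be treated as fragile.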
DataFrame.axes – Return a list representing the row axis labels and column axis labels of the DataFrame.
In [10]: df.axes
Out[10]:
[RangeIndex(start=0, stop=6, step=1),
 Index(['Name', 'Age', 'gender'], dtype='object')]
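The table at the top also lists size, ndim and empty, which the examples above do not exercise. The following sketch rebuilds the same example DataFrame and shows how these attributes relate to shape and axes:

```python
import pandas as pd

data = {
    "Name": ["Tomy", "Jack", "Steve", "Ricky", "Mark", "Johi"],
    "Age": [28, 34, 29, 51, 25, 60],
    "gender": ["F", "M", "M", "F", "M", "F"],
}
df = pd.DataFrame(data)

rows, cols = df.shape          # (number of rows, number of columns)
row_axis, col_axis = df.axes   # same objects as df.index and df.columns

print(df.ndim)                 # a DataFrame is two-dimensional
print(df.size)                 # total number of elements: rows * cols
print(df.empty)                # False, because the DataFrame holds data
print(pd.DataFrame().empty)    # True for a DataFrame with no rows or columns
```

For a DataFrame, size always equals the product of the two entries in shape, which makes it a quick sanity check after loading data.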
. . .