Python Pandas – Basic functions

Pandas provides many built-in functions and attributes for inspecting and processing data. This tutorial explains several of them with examples.

 

Function      Description
df.head(n)    Return the first n rows of the Series/DataFrame (default n=5).
df.tail(n)    Return the last n rows of the Series/DataFrame (default n=5).
df.shape      Return a tuple (rows, columns) giving the dimensions of the DataFrame.
df.index      Return the index (row labels) of the DataFrame.
df.columns    Return the column labels of the DataFrame.
df.info()     Print a concise summary of the DataFrame.
df.values     Return the Series/DataFrame as a NumPy ndarray.
df.size       Return the total number of elements in the DataFrame.
df.ndim       Return the number of dimensions.
df.dtypes     Return the data type of each column.
df.axes       Return a list containing the row axis labels and the column axis labels.
df.empty      Return True if the Series/DataFrame has no elements.

.     .     .

Example

In [1]:
import pandas as pd
data = {'Name':['Tomy', 'Jack', 'Steve', 'Ricky','Mark','Johi'],
        'Age':[28,34,29,51,25,60],
        'gender':['F','M','M','F','M','F']}
df = pd.DataFrame(data)
df
Out[1]:
    Name  Age gender
0   Tomy   28      F
1   Jack   34      M
2  Steve   29      M
3  Ricky   51      F
4   Mark   25      M
5   Johi   60      F

In [2]: df.shape  # Return the dimensions of the DataFrame as (rows, columns)
Out[2]: (6, 3)
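The table also lists df.size, df.ndim and df.empty, which are closely related to df.shape. The following is a minimal sketch of how they fit together; it rebuilds the same example DataFrame so that it runs on its own.

```python
import pandas as pd

# Same example data as above, rebuilt so this snippet is self-contained.
data = {'Name': ['Tomy', 'Jack', 'Steve', 'Ricky', 'Mark', 'Johi'],
        'Age': [28, 34, 29, 51, 25, 60],
        'gender': ['F', 'M', 'M', 'F', 'M', 'F']}
df = pd.DataFrame(data)

n_rows, n_cols = df.shape      # shape is a (rows, columns) tuple

print(df.shape)                # (6, 3)
print(df.ndim)                 # 2, a DataFrame always has two axes
print(df.size)                 # 18, rows * columns
print(df.empty)                # False, the frame has elements
print(pd.DataFrame().empty)    # True, a frame with no rows or columns
```

For a DataFrame, df.size equals the product of the two entries in df.shape.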

df.head(n) – This function returns the first n rows of the DataFrame/Series (Default n=5)

In [3]: df.head()
Out[3]:
    Name  Age gender
0   Tomy   28      F
1   Jack   34      M
2  Steve   29      M
3  Ricky   51      F
4   Mark   25      M

df.tail(n) – This function returns the last n rows of the DataFrame/Series (Default n=5).

In [4]: df.tail(n=3)
Out[4]:
    Name  Age gender
3  Ricky   51      F
4   Mark   25      M
5   Johi   60      F
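Both functions accept other values of n as well. The sketch below, which assumes the same example data, shows an explicit n, an n larger than the number of rows, and a negative n (which pandas interprets as "all rows except the last or first |n|").

```python
import pandas as pd

# Same example data as above, rebuilt so this snippet is self-contained.
data = {'Name': ['Tomy', 'Jack', 'Steve', 'Ricky', 'Mark', 'Johi'],
        'Age': [28, 34, 29, 51, 25, 60],
        'gender': ['F', 'M', 'M', 'F', 'M', 'F']}
df = pd.DataFrame(data)

first_two = df.head(2)          # first two rows
last_two = df.tail(2)           # last two rows
all_rows = df.head(10)          # n larger than the frame returns every row
without_last_two = df.head(-2)  # negative n: everything except the last two rows

print(first_two['Name'].tolist())         # ['Tomy', 'Jack']
print(last_two['Name'].tolist())          # ['Mark', 'Johi']
print(len(all_rows))                      # 6
print(without_last_two['Name'].tolist())  # ['Tomy', 'Jack', 'Steve', 'Ricky']
```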

DataFrame.values – Return a NumPy representation of the DataFrame. Only the values are returned; the axis labels are dropped. The pandas documentation recommends using DataFrame.to_numpy() instead.

In [5]: df.values
Out[5]:
array([['Tomy', 28, 'F'],
       ['Jack', 34, 'M'],
       ['Steve', 29, 'M'],
       ['Ricky', 51, 'F'],
       ['Mark', 25, 'M'],
       ['Johi', 60, 'F']], dtype=object)

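DataFrame.to_numpy() – Convert the DataFrame to a NumPy array. Because the columns here hold mixed types (strings and integers), the resulting array has dtype=object.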

In [6]: df.to_numpy()
Out[6]:
array([['Tomy', 28, 'F'],
       ['Jack', 34, 'M'],
       ['Steve', 29, 'M'],
       ['Ricky', 51, 'F'],
       ['Mark', 25, 'M'],
       ['Johi', 60, 'F']], dtype=object)
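The dtype of the resulting array depends on what is converted. As a rough illustration, assuming the same example data, converting the whole mixed-type frame gives an object array, while converting only the numeric Age column gives an integer array that NumPy can compute on directly.

```python
import pandas as pd

# Same example data as above, rebuilt so this snippet is self-contained.
data = {'Name': ['Tomy', 'Jack', 'Steve', 'Ricky', 'Mark', 'Johi'],
        'Age': [28, 34, 29, 51, 25, 60],
        'gender': ['F', 'M', 'M', 'F', 'M', 'F']}
df = pd.DataFrame(data)

mixed = df.to_numpy()        # whole frame: strings and integers together
ages = df['Age'].to_numpy()  # a single numeric column

print(mixed.dtype)   # object
print(mixed.shape)   # (6, 3)
print(ages.dtype)    # an integer dtype
print(ages.sum())    # 227
```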

DataFrame.index – Return the index (row labels) of the DataFrame.

In [6]: df.index
Out[6]: RangeIndex(start=0, stop=6, step=1)

DataFrame.columns – Return the column labels of the DataFrame.

In [7]: df.columns
Out[7]: Index(['Name', 'Age', 'gender'], dtype='object')
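Both df.index and df.columns are Index objects, so they can be measured, converted to lists, and tested for membership. A small sketch, assuming the same example data:

```python
import pandas as pd

# Same example data as above, rebuilt so this snippet is self-contained.
data = {'Name': ['Tomy', 'Jack', 'Steve', 'Ricky', 'Mark', 'Johi'],
        'Age': [28, 34, 29, 51, 25, 60],
        'gender': ['F', 'M', 'M', 'F', 'M', 'F']}
df = pd.DataFrame(data)

row_labels = df.index.tolist()   # row labels as a plain Python list
column_names = list(df.columns)  # column labels as a plain Python list

print(len(df.index))             # 6
print(row_labels)                # [0, 1, 2, 3, 4, 5]
print(column_names)              # ['Name', 'Age', 'gender']
print('Age' in df.columns)       # True
print('Salary' in df.columns)    # False
```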

DataFrame.dtypes – Return the data type of each column of the DataFrame.

In [8]: df.dtypes
Out[8]: 
Name      object
Age        int64
gender    object
dtype: object
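Since df.dtypes is itself a Series indexed by column name, it can be used to pick out columns by type. The sketch below is one possible way to do that, using the helper pd.api.types.is_numeric_dtype; the variable name numeric_cols is only illustrative.

```python
import pandas as pd

# Same example data as above, rebuilt so this snippet is self-contained.
data = {'Name': ['Tomy', 'Jack', 'Steve', 'Ricky', 'Mark', 'Johi'],
        'Age': [28, 34, 29, 51, 25, 60],
        'gender': ['F', 'M', 'M', 'F', 'M', 'F']}
df = pd.DataFrame(data)

# Look up the dtype of a single column by its name.
age_dtype = df.dtypes['Age']

# Collect the names of all columns whose dtype is numeric.
numeric_cols = [col for col, dtype in df.dtypes.items()
                if pd.api.types.is_numeric_dtype(dtype)]

print(age_dtype)      # int64
print(numeric_cols)   # ['Age']
```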

DataFrame.info() – Print a concise summary of a DataFrame. This method prints information about a DataFrame including the index dtype and column dtypes, non-null values and memory usage.

In [9]: df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6 entries, 0 to 5
Data columns (total 3 columns):
Name      6 non-null object
Age       6 non-null int64
gender    6 non-null object
dtypes: int64(1), object(2)
memory usage: 272.0+ bytes
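Note that df.info() prints its summary and returns None, so the text cannot be assigned to a variable directly. If the summary is needed as a string, one option is the buf parameter, sketched here with an in-memory io.StringIO buffer and the same example data.

```python
import io

import pandas as pd

# Same example data as above, rebuilt so this snippet is self-contained.
data = {'Name': ['Tomy', 'Jack', 'Steve', 'Ricky', 'Mark', 'Johi'],
        'Age': [28, 34, 29, 51, 25, 60],
        'gender': ['F', 'M', 'M', 'F', 'M', 'F']}
df = pd.DataFrame(data)

buffer = io.StringIO()
result = df.info(buf=buffer)   # write the summary into the buffer instead of stdout
summary = buffer.getvalue()    # the summary text as an ordinary string

print(result)                  # None
print(summary)
```

The exact layout of the summary differs between pandas versions, so code that parses this text should be treated as fragile.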

DataFrame.axes – Return a list representing the row axis labels and column axis labels of the DataFrame.

In [10]: df.axes
Out[10]:
[RangeIndex(start=0, stop=6, step=1),
 Index(['Name', 'Age', 'gender'], dtype='object')]
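The two entries of df.axes are the same objects exposed by df.index and df.columns. A short sketch, assuming the same example data, unpacks the list and checks that correspondence:

```python
import pandas as pd

# Same example data as above, rebuilt so this snippet is self-contained.
data = {'Name': ['Tomy', 'Jack', 'Steve', 'Ricky', 'Mark', 'Johi'],
        'Age': [28, 34, 29, 51, 25, 60],
        'gender': ['F', 'M', 'M', 'F', 'M', 'F']}
df = pd.DataFrame(data)

# axes is a two-element list: row labels first, column labels second.
row_axis, column_axis = df.axes

print(row_axis.equals(df.index))       # True
print(column_axis.equals(df.columns))  # True
print(len(df.axes) == df.ndim)         # True, one entry per dimension
```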

.     .     .
