Python Pandas – Concatenation & append

The pandas concat() function provides a variety of options for concatenating Series or DataFrame objects along an axis.

pandas.concat(objs, axis=0, join='outer', join_axes=None, ignore_index=False, keys=None, levels=None, names=None, verify_integrity=False, sort=None, copy=True)

Parameters:

  • objs : a sequence or mapping of Series or DataFrame objects
  • axis : The axis to concatenate along. {0/’index’, 1/’columns’}, default 0
  • join : How to handle indexes on other axes. {‘inner’, ‘outer’}, default ‘outer’
  • ignore_index : bool, default False
      • If True, do not use the index values along the concatenation axis. The resulting axis will be labeled 0, …, n – 1.
  • keys : sequence, default None
      • Construct a hierarchical index using the passed keys as the outermost level.
  • levels : list of sequences, default None
      • Specific levels (unique values) to use for constructing a MultiIndex. Otherwise they will be inferred from the keys.
  • names : list, default None
      • Names for the levels in the resulting hierarchical index.
  • verify_integrity : bool, default False
      • Check whether the new concatenated axis contains duplicates.
  • sort : bool, default None
      • Sort the non-concatenation axis if it is not already aligned.
  • copy : bool, default True
      • If False, do not copy data unnecessarily.
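The effect of verify_integrity and ignore_index is easiest to see side by side. The following is a minimal sketch, assuming two small hypothetical frames (left and right, not part of the example data used later) whose indexes share the label 1:

```python
import pandas as pd

# Hypothetical frames whose indexes overlap on the label 1.
left = pd.DataFrame({'name': ['mark', 'juli']}, index=[0, 1])
right = pd.DataFrame({'name': ['Saty', 'Jonathan']}, index=[1, 2])

# verify_integrity=True raises ValueError when the concatenated
# axis would contain duplicate labels.
try:
    pd.concat([left, right], verify_integrity=True)
    duplicate_detected = False
except ValueError:
    duplicate_detected = True

# ignore_index=True discards the original labels and renumbers
# the rows 0..n-1, so no duplicates remain.
renumbered = pd.concat([left, right], ignore_index=True)

print(duplicate_detected)
print(renumbered)
```

Using verify_integrity=True is a reasonable guard when duplicate labels would indicate a data problem; ignore_index=True fits cases where the original labels carry no meaning.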

.     .     .

 

Example

import pandas as pd
data1 = {'name' :['mark','juli'],'city':['New York','Paris']}
data2 = {'name' :['john','alex'],'city':['London','Tokyo'],'age':[28,56]}
data3 = {'name' :['Saty','Jonathan'],'city':['germany','Moscow']}

df1 = pd.DataFrame(data1,index = [0,1])
df2 = pd.DataFrame(data2,index=[2,3])
df3 = pd.DataFrame(data3,index=[1,2])

Let’s concatenate the DataFrames df1 and df3.

In [1]: pd.concat([df1,df3])
Out[1]: 
       name      city
0      mark  New York
1      juli     Paris              # Here, the index label 1 is duplicated.
1      Saty   germany
2  Jonathan    Moscow

# To avoid the duplicate index, use the parameter ignore_index=True.

In [2]: pd.concat([df1,df3],ignore_index=True) 
Out[2]: 
       name      city
0      mark  New York
1      juli     Paris
2      Saty   germany
3  Jonathan    Moscow

You can also concatenate more than two DataFrames at once. Columns that are missing from a DataFrame are filled with NaN.

In [3]: pd.concat([df1,df2,df3])
Out[3]:
       name      city   age
0      mark  New York   NaN
1      juli     Paris   NaN
2      john    London  28.0
3      alex     Tokyo  56.0
1      Saty   germany   NaN
2  Jonathan    Moscow   NaN
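If the NaN-filled columns are not wanted, the join parameter described earlier can restrict the result to the columns shared by all inputs. This is a small sketch that rebuilds df1 and df2 from the example so it runs on its own:

```python
import pandas as pd

df1 = pd.DataFrame({'name': ['mark', 'juli'],
                    'city': ['New York', 'Paris']}, index=[0, 1])
df2 = pd.DataFrame({'name': ['john', 'alex'],
                    'city': ['London', 'Tokyo'],
                    'age': [28, 56]}, index=[2, 3])

# join='inner' keeps only the columns present in every input,
# so the 'age' column (found only in df2) is dropped.
shared = pd.concat([df1, df2], join='inner')
print(shared)
```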

Concatenate the DataFrames Horizontally

By default, the pandas concat() function concatenates DataFrames vertically, because the axis parameter defaults to 0. You can also concatenate DataFrames horizontally by specifying axis=1. Rows are then aligned on their index labels, and positions without a matching label are filled with NaN.

In [4]: pd.concat([df1,df2],axis=1)     # axis=1 (concatenate horizontally)
Out[4]:
   name      city  name    city   age
0  mark  New York   NaN     NaN   NaN
1  juli     Paris   NaN     NaN   NaN
2   NaN       NaN  john  London  28.0
3   NaN       NaN  alex   Tokyo  56.0

In [5]: pd.concat([df1,df3],axis=1,join='inner')    # join='inner' keeps only index labels present in both
Out[5]:
   name   city  name     city
1  juli  Paris  Saty  germany

In [6]: pd.concat([df1, df3], axis=1).reindex(df1.index)    # keep only the index labels of df1
Out[6]: 
   name      city  name     city
0  mark  New York   NaN      NaN
1  juli     Paris  Saty  germany

Construct hierarchical indexing

By passing the keys parameter, you can construct a hierarchical index whose outermost level identifies the DataFrame each row came from.

In [7]: result = pd.concat([df1,df3],keys=['x','y'])

In [8]: result
Out[8]: 
         name      city
x 0      mark  New York
  1      juli     Paris
y 1      Saty   germany
  2  Jonathan    Moscow

In [9]: result.loc['y']
Out[9]: 
       name     city
1      Saty  germany
2  Jonathan   Moscow
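The objs parameter also accepts a mapping, and names can label the levels of the resulting index. The sketch below rebuilds df1 and df3 from the example; the level names 'source' and 'row' are illustrative choices, not something the example above defines:

```python
import pandas as pd

df1 = pd.DataFrame({'name': ['mark', 'juli'],
                    'city': ['New York', 'Paris']}, index=[0, 1])
df3 = pd.DataFrame({'name': ['Saty', 'Jonathan'],
                    'city': ['germany', 'Moscow']}, index=[1, 2])

# Passing a dict uses its keys as the outermost index level;
# names labels both levels ('source' and 'row' are made-up names).
result = pd.concat({'x': df1, 'y': df3}, names=['source', 'row'])

y_rows = result.loc['y']        # all rows that came from df3
single = result.loc[('y', 2)]   # one row, selected by both levels

print(result)
print(single['name'])
```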

.     .     .

Concatenating Using append

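The append() instance methods on Series and DataFrame are a useful shortcut to concat(). These methods predate concat() and concatenate along axis=0, namely the index. Note that append() was deprecated in pandas 1.4 and removed in pandas 2.0, so the calls in this section only run on older versions; on current versions, use pd.concat() instead.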

In [10]: df1.append(df3)
Out[10]: 
       name      city
0      mark  New York
1      juli     Paris
1      Saty   germany
2  Jonathan    Moscow

In [11]: df1.append([df3,df2])
Out[11]: 
       name      city   age
0      mark  New York   NaN
1      juli     Paris   NaN
1      Saty   germany   NaN
2  Jonathan    Moscow   NaN
2      john    London  28.0
3      alex     Tokyo  56.0
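On pandas 2.0 and later, where DataFrame.append() is no longer available, the two calls above can be reproduced with pd.concat(). The sketch below rebuilds the example frames so it runs on its own:

```python
import pandas as pd

df1 = pd.DataFrame({'name': ['mark', 'juli'],
                    'city': ['New York', 'Paris']}, index=[0, 1])
df2 = pd.DataFrame({'name': ['john', 'alex'],
                    'city': ['London', 'Tokyo'],
                    'age': [28, 56]}, index=[2, 3])
df3 = pd.DataFrame({'name': ['Saty', 'Jonathan'],
                    'city': ['germany', 'Moscow']}, index=[1, 2])

# Counterpart of df1.append(df3)
appended = pd.concat([df1, df3])

# Counterpart of df1.append([df3, df2]); the 'age' column exists
# only in df2, so the other rows hold NaN there.
appended_many = pd.concat([df1, df3, df2])

print(appended)
print(appended_many)
```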
