Matplotlib – Histogram – Study Machine Learning

A histogram represents the distribution of numeric data. It is an estimate of the probability distribution of a continuous variable. It differs from a bar graph: a bar graph describes a categorical variable, whereas a histogram describes a numeric feature. Histograms are widely used in data analysis. This tutorial demonstrates several ways to plot a histogram.

The pyplot.hist() function plots a histogram of numeric data points.

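Parameters:

x : input values (an array or a sequence of arrays)
bins : number of equal-width bins (int), or a sequence of bin edges
range : the lower and upper range of the bins; values outside it are ignored
density : bool; if True, the histogram is normalized so that its total area is 1
weights : an array of weights, of the same shape as x
bottom : location of the bottom baseline of each bin
align : {‘left’, ‘mid’, ‘right’}, optional; horizontal alignment of the bars
orientation : {‘horizontal’, ‘vertical’}, optional
label : label of the histogram, used in the legend
color : color of the histogram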
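As a quick illustration of how several of these parameters combine, here is a minimal sketch. The sample data, the seed, the bin count, and the colour are arbitrary choices made for illustration only.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)   # seeded so the sketch is repeatable
x = rng.normal(size=1000)        # illustrative sample data

fig, ax = plt.subplots()
counts, edges, patches = ax.hist(
    x,
    bins=10,                     # ten equal-width bins
    range=(-3, 3),               # values outside [-3, 3] are ignored
    orientation='horizontal',    # bars grow along the x-axis
    color='steelblue',
    label='sample',
)
ax.legend()                      # shows the text passed as `label`
```

hist() returns the per-bin counts, the array of bin edges (one more entry than there are bins), and the drawn bar patches. Call plt.show() afterwards to display the figure.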

Example:

import matplotlib.pyplot as plt
import numpy as np
from matplotlib import colors
from matplotlib.ticker import PercentFormatter
 
N_points = 100000
n_bins = 20
x = np.random.randn(N_points)
 
fig, axs = plt.subplots(1, 2, figsize=(9,5),tight_layout=True)
# N is the count in each bin, bins holds the bin edges (n_bins + 1 values)
N, bins, patches = axs[0].hist(x, bins=n_bins)
 
# We'll color code by height, but you could use any scalar
fracs = N / N.max()
# we need to normalize the data to 0..1 for the full range of the colormap
norm = colors.Normalize(fracs.min(), fracs.max())
 
# Now, we'll loop through our objects and set the color of each accordingly
for thisfrac, thispatch in zip(fracs, patches):
    color = plt.cm.viridis(norm(thisfrac))
    thispatch.set_facecolor(color)
 
# density=True rescales the bars so that their total area is 1
axs[1].hist(x, bins=n_bins, density=True)
# Format the y-axis ticks as percentages (the values are densities, not bin fractions)
axs[1].yaxis.set_major_formatter(PercentFormatter(xmax=1))
plt.show()

This produces the following result:

. . .
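With density=True the bar areas, not the bar heights, sum to 1, so the percentage labels in the right-hand panel are densities rather than shares of the data. If the goal is bars whose heights are the fraction of points in each bin, one option is to pass weights of 1/N. This is a sketch of that alternative, not part of the original example; the seed and sample size are arbitrary.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter

rng = np.random.default_rng(0)   # seeded so the sketch is repeatable
x = rng.normal(size=10000)

# Give every point the weight 1/N so that the bar heights sum to 1
weights = np.ones_like(x) / x.size

fig, ax = plt.subplots()
heights, edges, _ = ax.hist(x, bins=20, weights=weights)

# Each tick now reads as "percent of all points that fall in this bin"
ax.yaxis.set_major_formatter(PercentFormatter(xmax=1))
ax.set_ylabel('Share of points')
```

Here heights holds the fraction of the sample in each bin, so the percent formatter shows true percentages.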

import numpy as np
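import matplotlib.pyplot as plt
 
mu = 100 # mean of distribution
sigma = 15 # standard deviation of distribution
x = mu + sigma * np.random.randn(10000)
num_bins = 20
 
# density=True replaces the `normed` argument, which has been removed from Matplotlib
n, bins, patches = plt.hist(x, num_bins, density=True, facecolor='green', alpha=0.5)
# add a 'best fit' line: the normal PDF evaluated at the bin edges
# (computed with NumPy, since mlab.normpdf has been removed from Matplotlib)
y = np.exp(-0.5 * ((bins - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))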
plt.plot(bins, y, 'r-o')
 
plt.xlabel('X')
plt.ylabel('Probability density')
plt.title(r'Histogram : $\mu=100$, $\sigma=15$')
# Tweak spacing to prevent clipping of ylabel
plt.subplots_adjust(left=0.15)
plt.show()
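The normal curve lines up with the bars because a density histogram is scaled so that its total area is 1, just like a probability density function. The following sketch checks this numerically with numpy.histogram, which is the routine pyplot.hist() relies on for binning; the seed is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded so the sketch is repeatable
mu, sigma = 100, 15
x = mu + sigma * rng.standard_normal(10000)

# Same binning as a 20-bin density histogram, without drawing anything
density, edges = np.histogram(x, bins=20, density=True)

# Area of each bar = height * width; the areas should add up to 1
widths = np.diff(edges)
area = float(np.sum(density * widths))
print(area)
```

The printed area should equal 1 up to floating-point rounding, whatever the number of bins.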

. . .
