02/03/2020 Matplotlib-TUTORIALS
localhost:8889/lab 1/32
PYTHON FOR DATA SCIENCE
Visualisation
Matplotlib & Seaborn
Matplotlib is a Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms. Matplotlib can be used in Python scripts, the Python and IPython shells, the Jupyter notebook, web application servers, and four graphical user interface toolkits.
Third party packages
A large number of third party packages extend and build on Matplotlib functionality, including several higher-level plotting interfaces (seaborn, holoviews, ggplot, ...), and two projection and mapping toolkits (basemap and cartopy).
matplotlib.pyplot is a collection of command style functions that make matplotlib work like MATLAB. Each pyplot function makes some change to a figure: e.g., creates a figure, creates a plotting area in a figure, plots some lines in a plotting area, decorates the plot with labels, etc.
In matplotlib.pyplot various states are preserved across function calls, so that it keeps track of things like the current figure and plotting area, and the plotting functions are directed to the current axes (please note that "axes" here and in most places in the documentation refers to the axes part of a figure and not the strict mathematical term for more than one axis).
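A minimal sketch of the "current figure" and "current axes" bookkeeping described above. The Agg backend line is an assumption added only so the snippet also runs outside a notebook; inside Jupyter you would rely on %matplotlib inline instead.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs outside a notebook
import matplotlib.pyplot as plt

fig = plt.figure()          # becomes the current figure
ax = fig.add_subplot(111)   # becomes the current axes

# pyplot functions act on whatever is current; fig and ax are not passed in
plt.plot([1, 2, 3], [2, 4, 6])
plt.title("drawn on the current axes")

same_figure = plt.gcf() is fig   # gcf = get current figure
same_axes = plt.gca() is ax      # gca = get current axes
line_count = len(ax.lines)       # the line drawn by plt.plot landed on ax
print(same_figure, same_axes, line_count)
```

Because pyplot tracks this state for you, short scripts can skip the fig and ax variables entirely; keeping them around becomes useful once a figure holds more than one axes.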
Tip: In Jupyter Notebook, you can also include %matplotlib inline to display your plots inside your notebook.
Load the required libraries
In [101]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
# plt.plot?   (uncomment to open the help for plt.plot)
Plot a point
In [82]:
plt.plot(4, 3, '.')
Plot a number of points
In [102]:
x = np.array([2,4,6,8,10,12,14,16])
y = x/2
plt.figure(figsize=(10,5))
plt.scatter(x, y, c='green')
plt.show()
plt.subplot(131) #find the meaning of the parameter inside the subplot function
plt.bar(names, scores)
plt.subplot(132)
plt.scatter(names, scores)
plt.subplot(133)
plt.plot(names, scores)
plt.suptitle('Categorical Plotting') #you can give titles, xlabels and ylabels to each of the plots as well
plt.show()
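On the meaning of the subplot parameter: the three digits are read as rows, columns and position, so 131 is the first slot on a grid of 1 row and 3 columns. The sketch below checks this. The names and scores values are hypothetical, since the cell that defines them is not part of this transcript, and the Agg backend is an assumption for running outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs outside a notebook
import matplotlib.pyplot as plt

# Hypothetical data; the notebook's own names/scores cell is not shown in the transcript
names = ["group_a", "group_b", "group_c"]
scores = [1, 10, 100]

fig = plt.figure(figsize=(9, 3))
ax_bar = plt.subplot(131)            # 1 row, 3 columns, position 1
ax_bar.bar(names, scores)
ax_scatter = plt.subplot(1, 3, 2)    # same grid, position 2, written as three integers
ax_scatter.scatter(names, scores)
ax_line = plt.subplot(133)           # position 3
ax_line.plot(names, scores)
plt.suptitle("Categorical Plotting")

# Ask the first axes which grid it was placed on: (rows, columns)
grid_shape = ax_bar.get_subplotspec().get_gridspec().get_geometry()
print(grid_shape)
```

The compact 131 form only works while each number is a single digit; the comma-separated form plt.subplot(1, 3, 2) is equivalent and has no such limit.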
What are the differences between add_axes and add_subplot?
The calling signature of add_axes is add_axes(rect), where rect is a list [x0, y0, width, height] giving the lower-left corner (x0, y0) of the new axes in figure coordinates (fractions of the figure width and height, from 0 to 1), together with its width and height. The axes is therefore positioned in absolute coordinates on the canvas.
The calling signature of add_subplot does not directly provide the option to place the axes at a predefined position. Instead, it lets you specify where the axes should sit on a subplot grid. The usual and easiest way to specify this position is the three-integer notation,
e.g. ax = fig.add_subplot(231)
In this example a new axes is created at the first position (1) on a grid of 2 rows and 3 columns. To produce only a single axes, add_subplot(111) would be used (first plot on a 1 by 1 subplot grid). (In newer matplotlib versions, add_subplot() without any arguments is possible as well.)
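The difference can be checked by creating one axes each way and asking where it ended up. This is a small sketch, not part of the original notebook; the rectangle values are arbitrary and the Agg backend is an assumption for running outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs outside a notebook
import matplotlib.pyplot as plt

fig = plt.figure()

# add_axes: you state the rectangle yourself, in figure fractions
ax_abs = fig.add_axes([0.1, 0.1, 0.5, 0.3])     # [x0, y0, width, height]

# add_subplot: you state a grid slot and matplotlib computes the rectangle
ax_grid = fig.add_subplot(231)                  # 2 rows, 3 columns, slot 1

abs_bounds = tuple(round(v, 6) for v in ax_abs.get_position().bounds)
grid_rows, grid_cols = ax_grid.get_subplotspec().get_geometry()[:2]

print(abs_bounds)              # the rectangle passed to add_axes
print(grid_rows, grid_cols)    # the grid that add_subplot placed ax_grid on
```

add_axes is the natural choice for insets or axes that must sit at an exact spot; add_subplot is simpler when the plots should tile the figure evenly.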
Seaborn
Seaborn comes with a large number of high-level interfaces and customized themes that matplotlib lacks; with matplotlib alone it can be difficult to figure out the settings that make plots attractive.
Matplotlib functions mostly don’t work as well with dataframes as seaborn's do.
NB: Seaborn visualisations are based on matplotlib
In [107]:
import seaborn as sns
Let's load a dataset to be used
In [108]:
ourdata=pd.read_excel("Pokemon.xls")
In [111]:
ourdata.head()
In [112]:
# lmplot() quickly plots the linear relationship between two variables; "lm" stands for linear regression model
sns.lmplot(x='Attack', y='Defense', data=ourdata)
plt.show()
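To make the "linear regression model" part concrete, here is a rough sketch of the kind of straight-line least-squares fit that the regression line represents, done directly with NumPy. It uses only the Attack and Defense values of the five rows printed later in this transcript, not the full dataset, and np.polyfit is a stand-in for illustration rather than a claim about seaborn's internals.

```python
import numpy as np

# Attack and Defense for the five rows shown in this transcript (not the full dataset)
attack = np.array([49, 62, 82, 52, 64], dtype=float)
defense = np.array([49, 63, 83, 43, 58], dtype=float)

# Fit defense = slope * attack + intercept by ordinary least squares
slope, intercept = np.polyfit(attack, defense, 1)

fitted = slope * attack + intercept
residuals = defense - fitted     # vertical distance of each point from the line

print(round(slope, 3), round(intercept, 3))
```

The line is chosen so that the squared residuals are as small as possible; with an intercept in the model, the residuals also average out to zero.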
No regression line and adding hue
Setting fit_reg=False to remove the regression line
Out[111]:
Name Type1 Type 2 Total HP Attack Defense Atk Def Speed Stage Legenda
We set hue='Stage' to color our points by the Pokémon's evolution stage. This hue argument is very useful because it allows you to express a third dimension of information using color.
Out[43]:
<seaborn.axisgrid.FacetGrid at 0x1a1e37a860>
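The cell that produced this output is not included in the transcript. A call along the following lines combines the two options described above. The small DataFrame is a hypothetical stand-in for ourdata (only the column names follow the notebook), and the Agg backend is an assumption for running outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs outside a notebook
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for ourdata; column names follow the notebook
toy = pd.DataFrame({
    "Attack":  [49, 62, 82, 52, 64, 84],
    "Defense": [49, 63, 83, 43, 58, 78],
    "Stage":   [1, 2, 3, 1, 2, 3],
})

# fit_reg=False drops the regression line; hue colors the points by Stage
g = sns.lmplot(x="Attack", y="Defense", data=toy, fit_reg=False, hue="Stage")

ax = g.axes.flat[0]
n_lines = len(ax.lines)                 # no regression line was drawn
n_point_groups = len(ax.collections)    # one scatter collection per Stage value
print(type(g).__name__, n_lines, n_point_groups)
```

lmplot returns a FacetGrid, which is why the notebook shows a FacetGrid object as the cell output.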
In [44]:
fig = plt.figure()
# The calling signature of add_axes is add_axes(rect), where rect is a list
# [x0, y0, width, height] giving the lower-left corner (x0, y0) of the new axes
# in figure coordinates and its width and height. The axes is therefore
# positioned in absolute coordinates on the canvas.
a1 = fig.add_axes([0,0,1,1])
x = np.arange(1,10)
a1.plot(x, np.exp(x),'r')
a1.set_title('range of numbers')
plt.ylim(0,10000)
plt.xlim(0,10)
# explicitly set x and y labels
plt.xlabel("x-axis")
plt.ylabel('y-axis')
plt.show()
In [113]:
ourdata.head()
Out[113]:
Name Type1 Type 2 Total HP Attack Defense Atk Def Speed Stage Legenda
<matplotlib.axes._subplots.AxesSubplot at 0x1a26b2ff60>
In [117]:
ourdata1.head()
Univariate Visualisation
Distplot
The most convenient way to take a quick look at a univariate distribution in seaborn is the distplot() function. By default, this draws a histogram and fits a kernel density estimate (KDE). It is meant for a univariate set of observations, that is, a single variable, which is why we pass one particular column of the dataset.
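To show what a kernel density estimate is, the sketch below builds one by hand with NumPy: a small Gaussian bump is placed on every observation and the bumps are averaged into one smooth curve. It uses the Defense values of the five rows printed in this transcript. The bandwidth of 8.0 is an arbitrary assumption for illustration and is not the value seaborn would pick.

```python
import numpy as np

# Defense values of the five rows shown in this transcript (not the full column)
defense = np.array([49, 63, 83, 43, 58], dtype=float)

bandwidth = 8.0   # assumption: hand-picked width of each bump, not seaborn's default
grid = np.linspace(defense.min() - 5 * bandwidth,
                   defense.max() + 5 * bandwidth, 500)

# One Gaussian bump per observation, evaluated at every grid point
z = (grid[:, None] - defense[None, :]) / bandwidth
bumps = np.exp(-0.5 * z ** 2) / (bandwidth * np.sqrt(2 * np.pi))

# The KDE is the average of the bumps
density = bumps.mean(axis=1)

# A density should integrate to about 1 (simple Riemann sum over the grid)
area = density.sum() * (grid[1] - grid[0])
print(round(area, 3))
```

A larger bandwidth gives a smoother, flatter curve; a smaller one follows the individual observations more closely.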
In [118]:
sns.distplot(ourdata1['Defense'])
Use a boxplot to confirm your distribution
Out[117]:
   Name        Type 1  Type 2  HP  Attack  Defense  Atk  Def  Speed
0  Bulbasaur   Grass   Poison  45  49      49       65   65   45
1  Ivysaur     Grass   Poison  60  62      63       80   80   60
2  Venusaur    Grass   Poison  80  82      83       100  100  80
3  Charmander  Fire    NaN     39  52      43       60   50   65
4  Charmeleon  Fire    NaN     58  64      58       80   65   80
Out[118]:
<matplotlib.axes._subplots.AxesSubplot at 0x1a26bc97f0>
In [51]:
sns.boxplot(ourdata1['Defense'])
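A box plot is a drawing of a few summary numbers, so it can help to compute them directly. The sketch below does this for the Defense values of the five rows printed in this transcript; the 1.5 × IQR rule for the whiskers is the default that matplotlib and seaborn use for flagging outliers.

```python
import numpy as np

# Defense values of the five rows shown in this transcript (not the full column)
defense = np.array([49, 63, 83, 43, 58])

q1, median, q3 = np.percentile(defense, [25, 50, 75])   # edges and middle of the box
iqr = q3 - q1                                           # height of the box

# Points beyond these fences are drawn as individual outlier markers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = defense[(defense < lower_fence) | (defense > upper_fence)]

print(q1, median, q3, iqr)
print(lower_fence, upper_fence, outliers)
```

If the median sits near the middle of the box and the whiskers are of similar length, the distribution is roughly symmetric, which is what you would compare against the histogram.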
You can explicitly turn off the KDE.
Read about KDE: https://pythontic.com/pandas/series-plotting/kernel%20density%20estimation%20plot