The Palmer penguins dataset, introduced by Allison Horst, Alison Hill, and Kristen Gorman, is a great dataset for data exploration and visualization and serves as an alternative to iris. It was first released as an R package. The released version of palmerpenguins can be installed from CRAN with:

    install.packages("palmerpenguins")
10.2.3 plotly R package for interactive data visualization
Interactive visualization focuses on graphic representations of data that improve the way we interact with information.
plotly is an R package for creating interactive web-based graphs via the open source JavaScript graphing library plotly.js.
    library(plotly)          # also attaches ggplot2
    library(palmerpenguins)  # provides the penguins data frame

    p <- ggplot(penguins) +
      geom_point(aes(x = flipper_length_mm, y = body_mass_g,
                     color = species, shape = species)) +
      xlab("Flipper Length") +
      ylab("Body Mass")

    # The function ggplotly() converts a ggplot2::ggplot() object to a plotly object.
    plotly::ggplotly(p)
Method 2
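    library(plotly)

    # Build the same scatter plot directly with plot_ly();
    # mode = "markers" makes the marker-only scatter explicit.
    fig <- plot_ly(penguins,
                   x = ~flipper_length_mm, y = ~body_mass_g,
                   color = ~species, symbol = ~species,
                   type = "scatter", mode = "markers")
    fig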
10.3 Python
Load data
    # Load the data-loading function from the palmerpenguins package
    from palmerpenguins import load_penguins

    penguins = load_penguins()

    # Return the first rows of the dataset
    penguins.head()

    # Retrieve the column names
    list(penguins.columns)
10.3.1 Matplotlib package
Matplotlib is mainly used for basic plotting. Typical Matplotlib visualizations include bar charts, pie charts, line plots, scatter plots and so on.
    # Import matplotlib's pyplot interface for plotting.
    # By convention, it is imported with the shorthand plt.
    import matplotlib.pyplot as plt

    # Map each species to a fixed color
    colors = {'Adelie': 'blue', 'Gentoo': 'orange', 'Chinstrap': 'green'}

    plt.scatter(penguins.flipper_length_mm, penguins.body_mass_g,
                c=penguins.species.map(colors))
    plt.xlabel('Flipper Length')
    plt.ylabel('Body Mass')
10.3.2 seaborn package
Seaborn is an easy-to-use, high-level statistical plotting library that provides a variety of visualization patterns. It needs less syntax than Matplotlib and comes with attractive default themes.
It tries to provide a ‘grammar of graphics’ style of creating plots, but in a Pythonic way, without adopting the exact ggplot syntax as plotnine does.
    # Import seaborn to make statistical graphics.
    # By convention, it is imported with the shorthand sns.
    import seaborn as sns

    # Load the data-loading function from the palmerpenguins package
    from palmerpenguins import load_penguins

    penguins = load_penguins()

    # Apply the default theme
    sns.set_theme()
    # sns.set_style('whitegrid')

    p = sns.relplot(x='flipper_length_mm', y='body_mass_g',
                    hue='species', style='species', data=penguins)
    p.set_xlabels('Flipper Length')
    p.set_ylabels('Body Mass')
The function relplot() is named that way because it is designed to visualize many different statistical relationships. It has a convenient kind parameter that lets you switch between its two underlying representations: scatterplot() with kind="scatter" (the default) and lineplot() with kind="line".
10.3.3 plotnine package
https://pypi.org/project/plotnine/
plotnine is an implementation of a grammar of graphics in Python, based on ggplot2. The grammar allows users to compose plots by explicitly mapping data to the visual objects that make up the plot.
Plotting with a grammar is powerful: it makes custom (and otherwise complex) plots easy to think about and then create, while simple plots remain simple.
NOTE: R vs Python Syntax
Unlike in R, variable names inside aes() must be given as quoted strings.
    from plotnine import ggplot, aes, geom_point, xlab, ylab

    # Unlike in R, variable names inside aes() must be quoted strings
    (ggplot(penguins)
     + geom_point(aes(x='flipper_length_mm', y='body_mass_g',
                      color='species', shape='species'))
     + xlab("Flipper Length")
     + ylab("Body Mass"))
10.3.4 plotly Python library for interactive data visualization
The plotly.express (Plotly Express or PX) module contains functions that can create entire figures at once. It is usually imported as px. Plotly Express is a built-in part of the plotly library.