Feb 2, 2018

Visualizations in R vs Python

Python and R are both among the top languages for data analysis and data modeling, but R is better for data visualization because you can get complex visuals with fewer lines of code.

We can see this in action with the “Iris” data. The four measurements for each species are plotted as subplots. Contrast Figure-1 (R visual) and Figure-2 (Python visual).

Figure-1 (R visual)

Figure-2 (Python visual)

R Code
library(data.table)
library(dplyr)
library(tidyr)
library(ggplot2)

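# Read the iris data (four measurement columns plus species)
dfiris <- fread("D:\\Data Science\\Data Sets\\iris.csv")

# Reshape to long form, then draw one boxplot panel per measurement
dfiris %>%
  gather(Measurement, Value, -species) %>%
  ggplot(aes(x = species, y = Value, fill = species)) +
  geom_boxplot() +
  facet_grid(Measurement ~ .)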

Python Code
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

dfiris = pd.read_csv('D:/Data Science/Data Sets/iris.csv')
# Create one figure with four stacked axes, one per measurement
fig, axis = plt.subplots(4, 1, figsize=(10, 8))
sns.boxplot(y="petal_length", x="species", data=dfiris, orient='v', ax=axis[0])
sns.boxplot(y="petal_width", x="species", data=dfiris, orient='v', ax=axis[1])
sns.boxplot(y="sepal_length", x="species", data=dfiris, orient='v', ax=axis[2])
sns.boxplot(y="sepal_width", x="species", data=dfiris, orient='v', ax=axis[3])
plt.tight_layout()
plt.show()

As you can see, R’s ggplot2 displays a polished plot with a legend by default. In Python’s Matplotlib and seaborn, you need to write additional code to show a legend and to theme the plot to make it look attractive.

Simple visuals are easy to code in both Python and R. For complex visuals, however, more lines of code have to be written in Python. The Python community is actively working on porting R’s ggplot2, but the port has yet to mature.
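For comparison, the gather plus facet_grid pattern from the R code has a rough counterpart in pandas and seaborn. The sketch below is only an illustration, not a replacement for the Python code above: the helper name plot_measurements and its id_col parameter are made up for this example, and it assumes a seaborn version that provides catplot. It reshapes the wide table to long form with melt and then lets seaborn draw one row of boxplots per measurement. Legends and theming would still need extra calls.

```python
import pandas as pd
import seaborn as sns


def plot_measurements(df, id_col="species"):
    """Reshape a wide table to long form and draw one boxplot row per measurement.

    Hypothetical helper for illustration; the names are not from the original post.
    """
    # Rough equivalent of tidyr's gather(Measurement, Value, -species)
    long_df = df.melt(id_vars=id_col, var_name="Measurement", value_name="Value")

    # One facet row per measurement, similar in spirit to facet_grid(Measurement ~ .)
    grid = sns.catplot(
        data=long_df,
        x=id_col,
        y="Value",
        row="Measurement",
        kind="box",
        height=2,
        aspect=4,
    )
    return long_df, grid
```

With the dfiris frame loaded as in the Python code above, calling plot_measurements(dfiris) should give a four-row figure, since the Iris file has four measurement columns besides species.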

Please share your experiences: for which visualizations did R or Python work better for you, and why?
