8 Popular Python Visualization Toolkits You’ll Love
Participation: Chi Meng, Wang Shuting
When it comes to creating visualizations with Python, the options can be overwhelming, and the best choice depends on what you are trying to achieve. Before we dive in, let’s be clear about the goal: Do you want to understand the initial distribution of your data? Do you want to impress people with a show-stopping visualization? Or do you want to give someone a glimpse into the inner workings of a moderately complex system? In this article, we’ll explore some common Python visualization packages, their advantages and disadvantages, and when to use them.
Matplotlib, Seaborn, and Pandas: The Holy Trinity
These three packages are often used together for a reason. Seaborn and Pandas are built on top of Matplotlib, which means that when you call a Seaborn function such as sns.barplot() or Pandas’ df.plot(), you’re essentially writing Matplotlib code. As a result, the visualizations produced by these packages share a similar look, grammar, and customizability. When it comes to these visual tools, I think of three words: exploratory data analysis. These packages are perfect for exploring your data, but they might not be the best choice for presentations.
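One way to see the shared Matplotlib foundation is to look at what a Pandas plotting call returns. The sketch below uses a small made-up DataFrame (the team names and salaries are hypothetical) and assumes the non-interactive Agg backend so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({"Team": ["A", "B", "C"], "Salary": [3.0e6, 2.5e6, 4.1e6]})

# DataFrame.plot returns a Matplotlib Axes object
ax = df.plot(kind="bar", x="Team", y="Salary", legend=False)

# Because it is an ordinary Axes, regular Matplotlib calls work on it
ax.set_ylabel("Salary ($)")
ax.set_title("Salary by Team")

is_mpl_axes = isinstance(ax, matplotlib.axes.Axes)
print(is_mpl_axes)
```

Anything Matplotlib can do to an Axes (labels, limits, annotations) can therefore be applied after the Pandas call.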
Matplotlib: The Low-Level Library
Matplotlib is a relatively low-level library, but it’s incredibly customizable. You can select from various styles, including ggplot and xkcd, to create visually appealing plots. Here’s an example of how I used Matplotlib’s xkcd color names, with Seaborn drawing the bars, to create a bar graph showing the median salary of the top 10 teams in the NBA:
import seaborn as sns
import matplotlib.pyplot as plt
# top10 is assumed to be a DataFrame with 'Team' and 'Salary' columns
# Matplotlib's xkcd color names take no space after the colon
color_order = ['xkcd:cerulean', 'xkcd:ocean', 'xkcd:black', 'xkcd:royal purple',
               'xkcd:royal purple', 'xkcd:navy blue', 'xkcd:powder blue',
               'xkcd:light maroon', 'xkcd:lightish blue', 'xkcd:navy']
sns.barplot(x=top10.Team, y=top10.Salary, palette=color_order)
# Show y-axis tick labels in scientific notation
plt.ticklabel_format(style='sci', axis='y', scilimits=(0, 0))
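For comparison, a similar chart can be sketched with Matplotlib alone, which shows the style sheets and xkcd color names mentioned above. This is a minimal sketch, not the original analysis: the team names and salaries are hypothetical, and the Agg backend is assumed so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt

# Hypothetical data, for illustration only
teams = ["A", "B", "C"]
salaries = [3.0e6, 2.5e6, 4.1e6]
colors = ["xkcd:cerulean", "xkcd:ocean", "xkcd:royal purple"]

# Apply the built-in 'ggplot' style sheet only inside this block
with plt.style.context("ggplot"):
    fig, ax = plt.subplots()
    bars = ax.bar(teams, salaries, color=colors)
    # Scientific notation on the numeric y-axis only
    ax.ticklabel_format(style="sci", axis="y", scilimits=(0, 0))
    ax.set_xlabel("Team")
    ax.set_ylabel("Salary")
```

Calling fig.savefig() with a file name would write the result to disk. The extra lines compared with the Seaborn call are the price of Matplotlib’s lower-level control.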
Seaborn: The Higher-Level Library
Seaborn is a higher-level library that builds on top of Matplotlib. It provides a more convenient and intuitive way to create visualizations. Here’s an example of how I used Seaborn to create a bar graph:
import seaborn as sns
import matplotlib.pyplot as plt
sns.barplot(x=top10.Team, y=top10.Salary)
plt.ticklabel_format(style='sci', axis='y', scilimits=(0, 0))
Pandas: The Data Analysis Powerhouse
Pandas is a powerful library for data analysis, but it’s not primarily a visualization tool. However, it does provide some basic visualization capabilities. Here’s an example of how I used Pandas to create a bar graph:
import pandas as pd
import matplotlib.pyplot as plt
top10.plot(kind='bar')
ggplot2: The Python Port of the R Package
ggplot2 is a popular R package, and Python ports of it exist. They use the “grammar of graphics” approach, in which a visualization is built by adding layers to a base plot. Here’s an example of how I used this approach to create a simple visualization:
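import pandas as pd
# Assumes the plotnine package, one Python implementation of the ggplot2 grammar
from plotnine import ggplot, aes, geom_point, theme, labs
# df is assumed to have 'season_start', 'salary', and 'team' columns
(ggplot(df, aes(x='season_start', y='salary', color='team'))
 + geom_point()
 + theme(legend_position='none')
 + labs(title='Salary Over Time', x='Year', y='Salary ($)'))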
Bokeh: The Interactive Visualization Library
Bokeh is an interactive visualization library that provides a high-level interface for creating web-based visualizations. Here’s an example of how I used Bokeh to create a bar chart of survey response counts:
import pandas as pd
from bokeh.plotting import figure
from bokeh.io import show
# is_masc is assumed to be a DataFrame with one boolean column per response
counts = is_masc.sum()
# A categorical x_range needs a plain list of strings
resps = list(is_masc.columns)
p2 = figure(title='Do You View Yourself As Masculine?',
            x_axis_label='Response',
            y_axis_label='Count',
            x_range=resps)
p2.vbar(x=resps, top=counts, width=0.6, fill_color='red', line_color='black')
show(p2)
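The Bokeh snippet relies on a DataFrame named is_masc that is not shown. A minimal sketch of what such a frame might look like follows, assuming one boolean column per survey response; the response labels and values here are hypothetical.

```python
import pandas as pd

# Hypothetical survey data: one boolean column per possible response,
# True where a respondent chose that response
is_masc = pd.DataFrame({
    "Very": [True, False, False, True],
    "Somewhat": [False, True, False, False],
    "Not at all": [False, False, True, False],
})

counts = is_masc.sum()         # column-wise count of True values
resps = list(is_masc.columns)  # category labels for the x-axis

print(dict(zip(resps, counts.tolist())))
# → {'Very': 2, 'Somewhat': 1, 'Not at all': 1}
```

The labels and counts produced this way are the kind of values that the x_range, x, and top arguments receive in the Bokeh call.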
Plotly: The Interactive Visualization Library
Plotly is another interactive visualization library that provides a high-level interface for creating web-based visualizations. Here’s an example of how I used Plotly to create a bar chart:
import plotly.graph_objs as go
# team_ave_df is assumed to have 'team' and 'turnovers_per_mp' columns
axis_font = dict(family='Courier New, monospace', size=18, color='#7f7f7f')
data = [go.Bar(x=team_ave_df.team, y=team_ave_df.turnovers_per_mp)]
layout = go.Layout(title='Turnovers per Minute by Team',
                   xaxis=dict(title=dict(text='Team', font=axis_font)),
                   yaxis=dict(title=dict(text='Average Turnovers / Minute', font=axis_font)),
                   autosize=True, hovermode='closest')
fig = go.Figure(data=data, layout=layout)
# Renders locally; the original py.iplot(...) call published the figure through
# Plotly's online service, which now requires the separate chart_studio package
fig.show()
Pygal: The Simple Visualization Library
Pygal is a simple visualization library that renders charts as SVG files. Here’s an example of how I used Pygal to create a simple bar chart:
import pygal
bar_chart = pygal.Bar()
bar_chart.title = 'Bar Chart Example'
bar_chart.x_labels = ['A', 'B', 'C']
bar_chart.add('First', [1, 2, 3])
bar_chart.add('Second', [2, 3, 4])
bar_chart.render_to_file('bar_chart.svg')
NetworkX: The Graph Analysis Library
NetworkX is a graph analysis library that provides a high-level interface for creating and manipulating graphs. Here’s an example of how I used NetworkX to create a simple graph:
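import itertools
import networkx as nx
import matplotlib.pyplot as plt
# v (an iterable of nodes) and network (an iterable of node groups) are assumed to be defined
G = nx.Graph()
G.add_nodes_from(v)
# Connect every pair of nodes that share a group
edges = [itertools.combinations(net, 2) for net in network]
for edge_group in edges:
    G.add_edges_from(edge_group)
# Drawing keyword names are lowercase
options = {'node_color': 'lime', 'node_size': 3, 'width': 1, 'with_labels': False}
nx.draw(G, **options)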
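The snippet above depends on v and network, which are not shown. A minimal sketch of the same construction follows, assuming that network is a list of groups whose members should all be connected to one another; the node names are hypothetical.

```python
import itertools
import networkx as nx

# Hypothetical input: each inner list is a group whose members are all connected
network = [["a", "b", "c"], ["c", "d"]]
v = sorted({node for group in network for node in group})

G = nx.Graph()
G.add_nodes_from(v)
for group in network:
    # Every unordered pair within a group becomes an edge
    G.add_edges_from(itertools.combinations(group, 2))

print(G.number_of_nodes(), G.number_of_edges())
# → 4 4
```

The first group contributes three edges and the second contributes one, so node "c" ends up linking the two groups. The resulting graph can be drawn with the same nx.draw call and options used above.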
In conclusion, there are many Python visualization toolkits available, each with its own strengths and weaknesses. By understanding the characteristics of each toolkit, you can choose the best one for your specific needs. Whether you’re exploring your data, creating a presentation, or building a complex graph, there’s a toolkit out there that can help you achieve your goals.