Create a racing bar chart showing the top n values for each period.
Type
Default
Details
df
pandas.DataFrame
A DataFrame containing three columns for entity, metric, and period. The names can be anything bu they have to be in this specific order.
n
int
10
The number of top items to display for each period.
title
str
Racing Chart
The title of the chart.
frame_duration
int
500
The duration of each frame during animation, before transitioning to the following frame, in milliseconds.
theme
str
none
The theme of the chart. Any of the following values can be used:
width
NoneType
None
The width of the chart in pixels.
height
int
600
The height of the chart in pixels.
font_size
int
12
The size of the fonts in the chart.
Returns
plotly.graph_objects.Figure
Entity-Metric-Period DataFrame
Code
# import random# [random.randint(10, 35) for i in range(20)]df = pd.DataFrame({'entity': ['blue', 'green', 'yellow', 'red', 'orange'] *4,'metric': [16, 35, 10, 25, 13, 35, 25, 25, 27, 19, 10, 18, 34, 20, 25, 20, 24, 25, 14, 21],'period': [i for i inrange(1, 5) for x inrange(5)],}).sort_values(['period', 'metric'], ascending=[True, False]).reset_index(drop=True)display_html(#'<div style="margin-left: 20%">' + df .style .hide() .bar(subset=['metric'], color='lightgray') .background_gradient(subset=['period'], cmap='Blues') .to_html() # + '</div>', raw=True)
entity
metric
period
green
35
1
red
25
1
blue
16
1
orange
13
1
yellow
10
1
blue
35
2
red
27
2
green
25
2
yellow
25
2
orange
19
2
yellow
34
3
orange
25
3
red
20
3
green
18
3
blue
10
3
yellow
25
4
green
24
4
orange
21
4
blue
20
4
red
14
4
Get the top three values for each period
racing_chart(df, n=3)
Important
The DataFrame supplied to the racing_chart function needs to have the three columns containing entity, metric, and period, in exactly this particular order. Their names don’t matter, but the order does
Entities
Some examples:
countries
URLs
keywords
product names
Metrics
clicks
impressions
sales
conversions
population
count
Period
days
months
weeks
quarters
years
Example: Google Search Console data
First three contries and months by clicks:
gsc = pd.read_csv('data/gsc_country_month_report.csv')gsc['flag'] = [adviz.flag(cc) for cc in gsc['country']]gsc.groupby('date').head(3).head(9)
country
clicks
impressions
ctr
position
date
flag
0
usa
185
11397
0.016232
38.121435
2022-01-31
🇺🇸
1
ind
157
6777
0.023167
33.000443
2022-01-31
🇮🇳
2
deu
69
1538
0.044863
20.850455
2022-01-31
🇩🇪
197
usa
230
12642
0.018193
32.848837
2022-02-28
🇺🇸
198
ind
178
7025
0.025338
31.743203
2022-02-28
🇮🇳
199
deu
67
1701
0.039389
20.089947
2022-02-28
🇩🇪
394
usa
285
15394
0.018514
36.662986
2022-03-31
🇺🇸
395
ind
185
7142
0.025903
30.034584
2022-03-31
🇮🇳
396
gbr
114
3029
0.037636
32.233080
2022-03-31
🇬🇧
Modifying the animation speed with animation_duration (in milliseconds)
racing_chart( gsc[['country', 'clicks', 'date']], frame_duration=1500, height=700, title='Google Search Console <b>mywebsite.com</b><br>clicks per month - top 10<br><b>frame_duration=1500</b>', theme='plotly_dark')
Make it faster and use flags instead of country codes
fig = racing_chart( gsc[['flag', 'clicks', 'date']], frame_duration=500, height=700, title='Google Search Console <b>mywebsite.com</b><br>clicks per month - top 10<br><b>frame_duration=500</b>', theme='plotly_dark')fig.layout.yaxis.tickfont.size =25for frame in fig.frames: frame.data[0].marker.color ='white'fig.data[0].marker.color ='white'fig
Queries can be long and take a lot of space, so we can set the left margin of the Figure object to a larger number to fit them.
fig = racing_chart( queries[['query', 'impressions', 'date']], title='Google Search Console <b>mywebsite.com</b><br>impressions per month - top 15', n=15, theme='seaborn', height=800)fig.layout.margin.l =250fig
Filtering entities
Taking a look at the top queries is definitely interesting, but many times you may have tens or hundreds of thousands of queries, and you want to go deeper.
One way is to filter those based on some criterion.
For example, let’s see which are the top queries that contain “log” and see how we are doing on log file analysis queries: