Racing Bar Chart

Get the top n values per period, and create a “racing” bar chart across periods.

racing_chart

 racing_chart (df, n=10, title='Racing Chart', subtitle=None,
               frame_duration=500, template='none', width=None,
               height=600, font_size=12, **kwargs)

Create a racing bar chart showing the top n values for each period.

	Type	Default	Details
df	pandas.DataFrame		A DataFrame containing three columns for entity, metric, and period. The names can be anything bu they have to be in this specific order.
n	int	10	The number of top items to display for each period.
title	str	Racing Chart	The title of the chart.
subtitle	NoneType	None	The subtitle of the chart.
frame_duration	int	500	The duration of each frame during animation, before transitioning to the following frame, in milliseconds.
template	str	none	The template of the chart.
width	NoneType	None	The width of the chart in pixels.
height	int	600	The height of the chart in pixels.
font_size	int	12	The size of the fonts in the chart.
kwargs	VAR_KEYWORD
Returns	plotly.graph_objects.Figure

Entity-Metric-Period `DataFrame`

Code

# import random
# [random.randint(10, 35) for i in range(20)]

df = (
    pd.DataFrame(
        {
            "entity": ["blue", "green", "yellow", "red", "orange"] * 4,
            "metric": [
                16,
                35,
                10,
                25,
                13,
                35,
                25,
                25,
                27,
                19,
                10,
                18,
                34,
                20,
                25,
                20,
                24,
                25,
                14,
                21,
            ],
            "period": [i for i in range(1, 5) for x in range(5)],
        }
    )
    .sort_values(["period", "metric"], ascending=[True, False])
    .reset_index(drop=True)
)  # noqa
display_html(  #'<div style="margin-left: 20%">' +
    df.style.hide()
    .bar(subset=["metric"], color="lightgray")
    .background_gradient(subset=["period"], cmap="Blues")
    .to_html(),
    # + '</div>'
    raw=True,
)

entity	metric	period
green	35	1
red	25	1
blue	16	1
orange	13	1
yellow	10	1
blue	35	2
red	27	2
green	25	2
yellow	25	2
orange	19	2
yellow	34	3
orange	25	3
red	20	3
green	18	3
blue	10	3
yellow	25	4
green	24	4
orange	21	4
blue	20	4
red	14	4

Get the top three values for each period

racing_chart(df, n=3)

Important

The DataFrame supplied to the racing_chart function needs to have the three columns containing entity, metric, and period, in exactly this particular order. Their names don’t matter, but the order does

Entities

Some examples:

countries
URLs
keywords
product names

Metrics

clicks
impressions
sales
conversions
population
count

Period

days
months
weeks
quarters
years

Example: Google Search Console data

First three contries and months by clicks:

gsc = pd.read_csv("data/gsc_country_month_report.csv")
gsc["flag"] = [adviz.flag(cc) for cc in gsc["country"]]
gsc.groupby("date").head(3).head(9)


KeyboardInterrupt

Modifying the animation speed with `animation_duration` (in milliseconds)

racing_chart(
    gsc[["country", "clicks", "date"]],
    frame_duration=1500,
    height=700,
    title="Google Search Console <b>mywebsite.com</b><br>clicks per month - top 10<br><b>frame_duration=1500</b>",
    template="plotly_dark",
)

Make it faster and use flags instead of country codes

fig = racing_chart(
    gsc[["flag", "clicks", "date"]],
    frame_duration=500,
    height=700,
    title="Google Search Console <b>mywebsite.com</b><br>clicks per month - top 10<br><b>frame_duration=500</b>",
    template="plotly_dark",
)
fig.layout.yaxis.tickfont.size = 25
for frame in fig.frames:
    frame.data[0].marker.color = "white"
fig.data[0].marker.color = "white"
fig

queries = pd.read_csv("data/gsc_query_month_report.csv")
queries

	query	clicks	impressions	ctr	position	date
0	advertools	156	246	0.634146	1.016260	2022-01-31
1	advertools python	27	43	0.627907	1.000000	2022-01-31
2	python advertools	19	24	0.791667	1.000000	2022-01-31
3	python seo crawler	5	7	0.714286	1.000000	2022-01-31
4	advertools github	4	26	0.153846	3.846154	2022-01-31
...	...	...	...	...	...	...
30166	🤟 emoji	0	1	0.000000	43.000000	2022-12-31
30167	🥑 meaning in text	0	1	0.000000	45.000000	2022-12-31
30168	🥔 meaning in text	0	1	0.000000	38.000000	2022-12-31
30169	🥕 meaning in text	0	1	0.000000	36.000000	2022-12-31
30170	🫐 meaning in text	0	1	0.000000	27.000000	2022-12-31

30171 rows × 6 columns

Explore the top queries by month (impressions)

Queries can be long and take a lot of space, so we can set the left margin of the Figure object to a larger number to fit them.

fig = racing_chart(
    queries[["query", "impressions", "date"]],
    title="Google Search Console <b>mywebsite.com</b><br>impressions per month - top 15",
    n=15,
    template="seaborn",
    height=800,
)
fig.layout.margin.l = 250
fig

Filtering entities

Taking a look at the top queries is definitely interesting, but many times you may have tens or hundreds of thousands of queries, and you want to go deeper.

One way is to filter those based on some criterion.

For example, let’s see which are the top queries that contain “log” and see how we are doing on log file analysis queries:

fig = racing_chart(
    queries[queries["query"].str.contains("log")][["query", "impressions", "date"]],
    title='Google Search Console <b>mywebsite.com</b> - queries containing "log"<br>impressions per month - top 15',
    n=15,
    template="seaborn",
    height=800,
)
fig.layout.margin.l = 250
fig