URL Structure

Visualizing a website’s URL structure with a treemap


url_structure

 url_structure (url_list, items_per_level=10, height=600, width=None,
                template='none', domain='example.com',
                title='URL Structure', subtitle=None)

Create a treemap of the first two URL path directories, e.g. example.com/dir_1/dir_2/.

| Parameter | Type | Default | Details |
|---|---|---|---|
| url_list | list | | Any list-like object containing URLs. |
| items_per_level | int | 10 | The number of items to display for each level of the treemap. All other items are grouped under a special item called “Others”. |
| height | int | 600 | The height of the chart in pixels. |
| width | NoneType | None | The width of the chart in pixels. |
| template | str | none | Name of the template to use for the chart. |
| domain | str | example.com | The main domain of the URL list. It is displayed in the top panel of the treemap, where values are shown like a breadcrumb. |
| title | str | URL Structure | The title of the figure. You can include the following HTML tags in the title: <a>, <b>, <br>, <i>, <sub>, <sup> |
| subtitle | NoneType | None | The subtitle of the figure. |
| Returns | plotly.graph_objects.Figure | | |
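
The “Others” grouping described for items_per_level can be illustrated with a minimal sketch using only the standard library. This is not the library's implementation; the helper name top_items_with_others and the sample directory names are hypothetical and only show the idea of keeping the most frequent items and lumping the rest together.

```python
from collections import Counter


def top_items_with_others(items, items_per_level=10, others_label="Others"):
    """Keep the most frequent items and sum the rest under one label.

    Hypothetical helper for illustration; not part of url_structure.
    """
    counts = Counter(items)
    top = counts.most_common(items_per_level)
    kept = {name for name, _ in top}
    # Everything outside the top N is added up into a single bucket.
    others_total = sum(n for name, n in counts.items() if name not in kept)
    result = dict(top)
    if others_total:
        result[others_label] = result.get(others_label, 0) + others_total
    return result


# Made-up first-level directories, one entry per URL.
dirs = ["shop", "shop", "shop", "music", "music", "tv", "watch"]
print(top_items_with_others(dirs, items_per_level=2))
# {'shop': 3, 'music': 2, 'Others': 2}
```

With a larger items_per_level nothing is left over, so no “Others” entry appears in the result.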

Read a list of URLs from a text/CSV file

import pandas as pd

# filepath is the path to a text/CSV file whose header row is "url"
apple = pd.read_csv(filepath)
apple.head(10)
url
0 https://www.apple.com/ae/shop/accessories/all
1 https://www.apple.com/ae/shop/accessories/all/...
2 https://www.apple.com/ae/shop/accessories/all/...
3 https://www.apple.com/ae/shop/accessories/all/...
4 https://www.apple.com/ae/shop/accessories/all/...
5 https://www.apple.com/ae/shop/accessories/all/...
6 https://www.apple.com/ae/shop/accessories/all/...
7 https://www.apple.com/ae/shop/accessories/all/...
8 https://www.apple.com/ae/shop/accessories/all/...
9 https://www.apple.com/ae/shop/accessories/all/...
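
To make “the first two URL path directories” concrete, here is a small sketch of how they could be pulled out of a URL like the ones above with the standard library. The helper first_two_dirs is hypothetical, and url_structure may parse URLs differently internally.

```python
from urllib.parse import urlsplit


def first_two_dirs(url):
    """Return the first two path segments of a URL, or None where missing.

    Hypothetical helper for illustration; not part of url_structure.
    """
    # Drop empty strings produced by leading, trailing, or doubled slashes.
    segments = [s for s in urlsplit(url).path.split("/") if s]
    dir_1 = segments[0] if len(segments) > 0 else None
    dir_2 = segments[1] if len(segments) > 1 else None
    return dir_1, dir_2


print(first_two_dirs("https://www.apple.com/ae/shop/accessories/all"))
# ('ae', 'shop')
```

For the first URL in the table, the two levels are ae and shop; deeper segments such as accessories/all fall outside the two levels named in the function description.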

Visualize the URL structure with the default settings

url_structure(apple["url"])

Show fewer items per level

url_structure(url_list=apple["url"], items_per_level=5)

Show more items per level

url_structure(url_list=apple["url"], items_per_level=25)

Pick a template

url_structure(url_list=apple["url"], items_per_level=25, template="plotly_dark")

Pick another template

url_structure(url_list=apple["url"], items_per_level=15, template="seaborn")

Set domain name and chart title

url_structure(
    url_list=apple["url"],
    items_per_level=15,
    template="ggplot2",
    domain="apple.com",
    title='URL Structure: <b>apple.com</b><br>Raw data: <a href="data/apple_url_list.csv">Apple.com URLs</a>',
)