Counting values

Count repeated values in a list, pandas.Series, or pandas.DataFrame. This function extends the pandas value_counts method in a few ways: it reports counts both as absolute numbers and as percentages, it provides cumulative counts and cumulative percentages, and it lumps all values beyond the top show_top values under “Others:”. The values are also colored with a sequential color scale of your choice.
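The columns described above can be illustrated with a small standard-library sketch. This is not the function's actual implementation; the helper name count_table and the toy input are assumptions made for illustration only.

```python
from collections import Counter


def count_table(items, show_top=10):
    """Hypothetical helper: rows of (value, count, cum_count, perc, cum_perc)."""
    counts = Counter(items).most_common()  # sorted by count, descending
    top, rest = counts[:show_top], counts[show_top:]
    if rest:
        # Lump everything beyond the top values into a single row.
        top.append(("Others:", sum(n for _, n in rest)))
    total = sum(n for _, n in top)
    rows, running = [], 0
    for value, n in top:
        running += n  # cumulative count so far
        rows.append((value, n, running, n / total, running / total))
    return rows


# Toy data, not taken from the dataset used on this page.
rows = count_table(["ae", "ae", "ae", "ca", "ca", "uk", "de"], show_top=2)
for row in rows:
    print(row)
```

With show_top=2, the two least frequent values are merged into one “Others:” row, and the last cumulative percentage is 1.0.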

value_counts

 value_counts (data, dropna=False, show_top=10, sort_others=False,
               title=None, style=True, width=900, height=650,
               colorscale='cividis')

Count the values of data and return a table of counts (absolute, cumulative, percentage, and cumulative percentage).

Parameter Type Default Details
data list, tuple, pandas.Series, pandas.DataFrame A collection of items to count, using any of the above data structures.
dropna bool False Whether or not to drop missing values.
show_top int 10 How many top items to show. All remaining items will be grouped into “Others:”.
sort_others bool False Whether or not to place “Others:” in its sorted position. By default, this item is placed at the bottom.
title NoneType None The title of the chart.
style bool True Whether or not to style the resulting table with a heatmap.
width int 900 The width in pixels of the resulting figure. Set this to None to make it use the width of its container.
height int 650 The height in pixels of the resulting figure. Set this to None to make it use the height of its container.
colorscale str cividis Which color scale to use for the heatmaps.
Returns plotly.graph_objects.Figure
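
Note that dropna defaults to False here, whereas pandas' own Series.value_counts defaults to dropna=True. The following sketch uses plain pandas on a toy Series (the data is an assumption, not from this page's dataset) to show what that setting changes.

```python
import pandas as pd

# Toy Series with three missing values.
s = pd.Series(["shop", "shop", None, "iphone", None, None])

# Missing values get their own row in the counts.
with_missing = s.value_counts(dropna=False)

# pandas default (dropna=True): missing values are left out.
without_missing = s.value_counts()

print(with_missing)
print(without_missing)
```

Keeping missing values in the counts means the percentages are computed over all rows rather than only the non-missing ones.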

Get URL data and count values

import advertools as adv
import pandas as pd

apple = pd.read_csv('data/apple_url_list.csv')
urldf = adv.url_to_df(apple['url'])
urldf.head()
url scheme netloc path query fragment dir_1 dir_2 dir_3 dir_4 ... query_src query_tab query_modelList query_itscg query_itsct query_select query_useASL query_int_cid query_board_id query_sel
0 https://www.apple.com/ae/shop/accessories/all https www.apple.com /ae/shop/accessories/all NaN NaN ae shop accessories all ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 https://www.apple.com/ae/shop/accessories/all/... https www.apple.com /ae/shop/accessories/all/accessibility NaN NaN ae shop accessories all ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 https://www.apple.com/ae/shop/accessories/all/... https www.apple.com /ae/shop/accessories/all/airtag NaN NaN ae shop accessories all ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 https://www.apple.com/ae/shop/accessories/all/... https www.apple.com /ae/shop/accessories/all/beats NaN NaN ae shop accessories all ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 https://www.apple.com/ae/shop/accessories/all/... https www.apple.com /ae/shop/accessories/all/beats-featured NaN NaN ae shop accessories all ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

5 rows × 28 columns

Default behavior for a single column

value_counts(urldf['dir_1'])

Count unique row combinations in more than one column

Setting width=None makes the chart responsive, so it takes on the width of its container.

value_counts(urldf[['dir_1', 'dir_2']], width=None)
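
Counting row combinations means treating each unique pairing of values across the selected columns as one item. As a rough illustration with plain pandas (toy data assumed; this is not necessarily how the function does it internally), DataFrame.value_counts does the same kind of counting:

```python
import pandas as pd

# Toy DataFrame standing in for urldf[['dir_1', 'dir_2']].
df = pd.DataFrame({
    "dir_1": ["ae", "ae", "ca", "ca", "ca"],
    "dir_2": ["shop", "shop", "shop", "iphone", "shop"],
})

# One count per unique (dir_1, dir_2) combination, sorted descending.
combo_counts = df.value_counts()
print(combo_counts)
```

The result is a Series indexed by (dir_1, dir_2) pairs, so ("ca", "shop") and ("ae", "shop") are counted separately even though they share a dir_2 value.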

Filter for a better overview and change colorscale

value_counts(urldf[urldf['dir_1'].eq('shop')]['dir_2'], colorscale='magma')
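
The filter in the call above is an ordinary pandas boolean mask. A minimal sketch on toy data (the values are assumptions for illustration) shows the same selection written step by step with .loc:

```python
import pandas as pd

# Toy DataFrame standing in for urldf.
df = pd.DataFrame({
    "dir_1": ["shop", "shop", "ca", "shop"],
    "dir_2": ["buy-iphone", "buy-mac", "iphone", "buy-iphone"],
})

# True for rows whose first directory is "shop".
mask = df["dir_1"].eq("shop")

# Keep only those rows, and only the dir_2 column.
shop_dir_2 = df.loc[mask, "dir_2"]
print(shop_dir_2)
```

The resulting Series is what gets counted, so values of dir_2 under other first-level directories do not dilute the table.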

Filter data and set an informative title for the chart

value_counts(urldf[urldf['dir_1'].eq('ca')]['dir_2'], title='<b>apple.com/ca/dir_2/</b> directory counts')