Cumulative proportions are useful when categories have a meaningful order, such as ratings or levels of satisfaction. We can compute them with cumsum():
library(dplyr)

songs %>%
  count(THEME) %>%
  mutate(
    prop = n / sum(n),
    cum_prop = cumsum(prop)
  )
                 THEME   n  prop cum_prop
1           Heartbreak 145 0.145    0.145
2       Life_and_death 131 0.131    0.276
3                 Love 139 0.139    0.415
4          Party_songs 162 0.162    0.577
5    People_and_places 145 0.145    0.722
6 Politics_and_protest 141 0.141    0.863
7                  Sex 131 0.131    0.994
8                 <NA>   6 0.006    1.000
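This produces a table in which prop is each theme's share of all songs and cum_prop is the running total of those shares down the rows, reaching 1.000 in the last row. The final row, <NA>, holds the 6 songs with no recorded theme.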
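To illustrate the ordered case, here is a minimal sketch using made-up survey data. The ratings data frame and its satisfaction column are hypothetical and not part of the songs data. Because satisfaction is a factor with explicitly ordered levels, count() returns the rows in level order, so each cum_prop value can be read as the share of responses at or below that level.

```r
library(dplyr)

# Hypothetical survey responses (not from the songs data).
# The `levels` argument fixes the order Low < Medium < High.
ratings <- data.frame(
  satisfaction = factor(
    c("Low", "High", "Medium", "Medium", "High",
      "High", "Low", "Medium", "High", "High"),
    levels = c("Low", "Medium", "High")
  )
)

rating_cum <- ratings %>%
  count(satisfaction) %>%        # rows follow the factor-level order
  mutate(
    prop = n / sum(n),           # share of each level
    cum_prop = cumsum(prop)      # share at or below each level
  )

print(rating_cum)
```

With these ten responses, half of them fall at Medium or below.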
If the rows are not already in the order you want, arrange() them before computing the cumulative values, because cumsum() accumulates in row order.
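As a sketch of that step, the example below sorts the categories from most to least frequent before accumulating. It uses a tiny stand-in data frame, songs_demo, which is an assumption made so the code runs on its own; with the real data you would start from songs instead. Sorting by desc(n) is only one possible ordering, chosen here for illustration.

```r
library(dplyr)

# Hypothetical stand-in for the songs data: one row per song.
songs_demo <- data.frame(
  THEME = c("Love", "Love", "Love", "Party_songs", "Party_songs", "Heartbreak")
)

theme_cum <- songs_demo %>%
  count(THEME) %>%
  arrange(desc(n)) %>%           # most frequent theme first
  mutate(
    prop = n / sum(n),
    cum_prop = cumsum(prop)      # accumulates in the arranged row order
  )

print(theme_cum)
```

The arrange() call has to come before mutate(): if the rows were sorted afterwards, cum_prop would still reflect the earlier row order.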