The heart of tidyverse descriptive statistics is the summarise() function. It reduces a dataset to one or more summary statistics.
Suppose we want to compute the mean, minimum, and maximum of the YEAR variable in the stock data:
library(dplyr)

stocks %>%
  summarise(
    mean_year = mean(YEAR, na.rm = TRUE),
    min_year = min(YEAR, na.rm = TRUE),
    max_year = max(YEAR, na.rm = TRUE)
  )

  mean_year min_year max_year
1    1971.5     1928     2015
This code produces a single row with one column for each requested statistic (a data frame, or a tibble if stocks is stored as one).
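Notice that we must explicitly set na.rm = TRUE. Base R’s summary() drops missing values automatically and reports how many there were. By contrast, mean(), min(), and max() return NA whenever their input contains a missing value, and summarise() passes that result through unchanged. Writing na.rm = TRUE makes missing-data handling visible and intentional.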
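To see the effect of na.rm in isolation, here is a minimal sketch on a small made-up data frame. The names toy and x are hypothetical and are not part of the stocks data; the point is only to compare the same summarise() call with and without na.rm = TRUE.

```r
library(dplyr)

# Hypothetical toy data (not the stocks dataset): one value is missing.
toy <- data.frame(x = c(1, 2, NA, 4))

# Without na.rm, the missing value propagates and the mean is NA.
without_rm <- toy %>%
  summarise(mean_x = mean(x))

# With na.rm = TRUE, the mean is taken over the three observed values.
with_rm <- toy %>%
  summarise(mean_x = mean(x, na.rm = TRUE))

without_rm$mean_x  # NA
with_rm$mean_x     # (1 + 2 + 4) / 3
```

The first result is NA because a single missing value is enough to make the mean undefined; the second ignores the missing entry and averages the remaining three values.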
Try requesting the mean of SPSTOCKS from the stocks dataset using dplyr.
NOTE: The stocks dataset and the dplyr package are already loaded in this webR session.