Counting and Missing Values

The tidyverse (dplyr in particular), together with a few base R functions, provides simple helpers for counting observations, unique values, and missing data. These are especially useful for quick checks on a dataset.

If you just want to know how many rows your dataset contains, you can use n() inside summarise():

library(dplyr)
stocks %>% 
  summarise(n = n())
   n
1 88
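As a small sketch of the same idea, n() also works per group once the data is grouped, and base R's nrow() gives the total row count directly. The toy data frame below is a hypothetical stand-in for stocks; its values are invented for illustration.

```r
library(dplyr)

# Hypothetical stand-in for `stocks`; the values are invented.
toy <- tibble(
  TBONDS    = c(1.2, 0.5, 0.8, 1.1),
  SPSTOCK_D = c(1, 0, 1, 1)
)

# Total number of rows, the base R way.
total_rows <- nrow(toy)

# Number of rows within each value of SPSTOCK_D.
per_group <- toy %>%
  group_by(SPSTOCK_D) %>%
  summarise(n = n())

total_rows
per_group
```

When only the total is needed, nrow() is the shorter route; n() is the one to reach for when the count belongs in a summary alongside other statistics.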

To look at missing data, we combine is.na() with sum(). is.na() returns TRUE for every missing value and sum() counts those TRUEs, so the result is the number of missing values in that column:

stocks %>% 
  summarise(missing = sum(is.na(TBONDS)))
  missing
1       6
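The same pattern extends to every column at once. The sketch below assumes dplyr 1.0.0 or later (for across()) and uses a hypothetical toy data frame in place of stocks, with invented values.

```r
library(dplyr)

# Hypothetical stand-in for `stocks`; the values are invented.
toy <- tibble(
  TBONDS    = c(1.2, NA, 0.8, NA),
  SPSTOCK_D = c(1, 0, NA, 1)
)

# Apply sum(is.na()) to each column in a single summarise() call.
missing_by_col <- toy %>%
  summarise(across(everything(), ~ sum(is.na(.x))))

missing_by_col
```

Here TBONDS has two missing values and SPSTOCK_D has one, so the result is a one-row table holding those two counts.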

Similarly, we might want to know how many distinct values appear in a variable:

stocks %>% 
  summarise(unique = n_distinct(SPSTOCK_D))
  unique
1      2
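One detail worth checking when a column has missing values: by default n_distinct() treats NA as a value of its own, and na.rm = TRUE excludes it. A minimal illustration on an invented vector (not the stocks data):

```r
library(dplyr)

# Invented example vector with one missing value.
x <- c(1, 0, NA, 1)

with_na    <- n_distinct(x)                # NA counts as a distinct value
without_na <- n_distinct(x, na.rm = TRUE)  # NA is dropped before counting

with_na     # 3
without_na  # 2
```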

We can also check how many rows are fully complete, meaning they have no missing values in any column. Base R's complete.cases() returns TRUE for each such row, and the dot (.) passes the piped data frame to it:

stocks %>% summarise(complete_rows = sum(complete.cases(.)))
  complete_rows
1            82
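Once the number of complete rows is known, it is often useful to look at the incomplete ones. The sketch below negates complete.cases() inside filter(); the toy data frame is a hypothetical stand-in for stocks with invented values, and the dot placeholder assumes the magrittr pipe (%>%) rather than the native |> pipe.

```r
library(dplyr)

# Hypothetical stand-in for `stocks`; the values are invented.
toy <- tibble(
  TBONDS    = c(1.2, NA, 0.8, NA),
  SPSTOCK_D = c(1, 0, NA, 1)
)

# Rows with no missing value in any column.
n_complete <- sum(complete.cases(toy))

# Keep only the rows that have at least one missing value.
incomplete_rows <- toy %>%
  filter(!complete.cases(.))

n_complete
incomplete_rows
```

In this toy data only the first row is complete, so the filtered table contains the other three rows.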