Calculating summary statistics of variables

In this tutorial, we use a dataset about profits from stocks and bonds. We can take a look at the data using head(), which shows us the first 6 rows of the data frame.

head(stocks)

  YEAR TBONDS SPSTOCK TBONDS_D SPSTOCK_D
1 1928      1      44        1         1
2 1929      4      -8        1         0
3 1930     NA     -25       NA         0
4 1931     -3     -44        0         0
5 1932      9      -9        1         0
6 1933      2      50        1         1

The data set contains five variables:

YEAR: The year.
TBONDS: Change in the price of Treasury Bonds (US government bonds) that year, in percent. For example, a figure of 4 means an increase of 4%.
SPSTOCK: Change in the price of S&P 500 stocks (shares of the 500 largest US companies) that year, in percent. For example, a figure of 44 means an increase of 44%.
TBONDS_D: Indicates whether Treasury Bonds rose (1) or fell/remained the same (0) that year.
SPSTOCK_D: Indicates whether the S&P 500 stocks rose (1) or fell/remained the same (0) that year.

The str() function tells us the class of each variable. You will notice that all five variables in the stocks data are stored as integers (int).

str(stocks)

'data.frame':   88 obs. of  5 variables:
 $ YEAR     : int  1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 ...
 $ TBONDS   : int  1 4 NA -3 9 2 8 4 5 1 ...
 $ SPSTOCK  : int  44 -8 -25 -44 -9 50 -1 47 32 -35 ...
 $ TBONDS_D : int  1 1 NA 0 1 1 1 1 1 1 ...
 $ SPSTOCK_D: int  1 0 0 0 0 1 0 1 1 0 ...
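
To check the class of a single variable, or of all variables at once, class() and sapply() can be used alongside str(). The sketch below is only an illustration: it rebuilds the six rows printed by head() into a small data frame with the made-up name stocks_head, so that it runs without the full dataset.

```r
# Toy subset: the six rows shown by head(stocks).
# The name stocks_head is hypothetical and used only for this illustration.
stocks_head <- data.frame(
  YEAR      = c(1928L, 1929L, 1930L, 1931L, 1932L, 1933L),
  TBONDS    = c(1L, 4L, NA, -3L, 9L, 2L),
  SPSTOCK   = c(44L, -8L, -25L, -44L, -9L, 50L),
  TBONDS_D  = c(1L, 1L, NA, 0L, 1L, 1L),
  SPSTOCK_D = c(1L, 0L, 0L, 0L, 0L, 1L)
)

# Class of one variable
class(stocks_head$YEAR)

# Class of every variable, returned as a named character vector
sapply(stocks_head, class)
```

Both calls should report "integer" here, matching the int labels in the str() output.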

We will explore this data set further. As a first step, we want to learn more about the variables we have collected. In most cases, we want to know at least the following summary statistics:

the mean
the standard deviation
the range (minimum and maximum)

For this purpose, R provides the following functions:

mean()
sd()
min()
max()

Each of these functions works on a single variable at a time. You select a variable by adding $ and the variable name after the name of the data frame, for example stocks$YEAR.
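
As a minimal sketch of this pattern, the code below applies sd(), min() and max() to one variable selected with $. It uses only the six SPSTOCK values printed by head(), stored in a made-up data frame called stocks_head, so the results describe those six rows and not the full stocks data.

```r
# Toy subset: YEAR and SPSTOCK from the six rows shown by head(stocks).
# The name stocks_head is hypothetical and used only for this illustration.
stocks_head <- data.frame(
  YEAR    = 1928:1933,
  SPSTOCK = c(44L, -8L, -25L, -44L, -9L, 50L)
)

# Standard deviation of the six values
sd(stocks_head$SPSTOCK)

# Smallest and largest value
min(stocks_head$SPSTOCK)   # -44
max(stocks_head$SPSTOCK)   # 50

# range() returns the minimum and maximum in one call
range(stocks_head$SPSTOCK) # -44 50
```

mean() is called in exactly the same way.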

Practice

Try requesting the mean of SPSTOCK from the stocks dataset.

NOTE: The stocks dataset is already loaded into this webR session, so you do not need to import it yourself.

Solution

mean(stocks$SPSTOCK)

[1] 11.42045
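
One thing to watch for when trying the same functions on other variables: TBONDS contains a missing value (NA) for 1930. mean(), sd(), min() and max() return NA when their input contains a missing value, unless you set na.rm = TRUE. The sketch below shows this on the six TBONDS values printed by head(), again using a made-up data frame called stocks_head.

```r
# Toy subset: YEAR and TBONDS from the six rows shown by head(stocks).
# The name stocks_head is hypothetical and used only for this illustration.
stocks_head <- data.frame(
  YEAR   = 1928:1933,
  TBONDS = c(1L, 4L, NA, -3L, 9L, 2L)
)

# How many values are missing?
sum(is.na(stocks_head$TBONDS))          # 1

# With a missing value present, the result is NA
mean(stocks_head$TBONDS)                # NA

# na.rm = TRUE drops missing values before calculating
mean(stocks_head$TBONDS, na.rm = TRUE)  # 2.6
```

The value 2.6 is the mean of the five non-missing values in this six-row subset only, not of the full stocks data.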