In this tutorial, we use a dataset about profits from stocks and bonds. We can take a look at the data using head(), which shows us the first 6 rows of the data frame.
TBONDS: Yearly percentage change in the price of Treasury Bonds (US government bonds). For example, a figure of 4 means an increase of 4%.
SPSTOCK: Yearly percentage change in the price of S&P 500 stocks (shares of the 500 largest US companies). For example, a figure of 44 means an increase of 44%.
TBONDS_D: Indicates whether Treasury Bonds rose (1) or fell/remained the same (0) that year.
SPSTOCK_D: Indicates whether the S&P 500 stocks rose (1) or fell/remained the same (0) that year.
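As a minimal sketch of what head() does, the block below rebuilds the first ten rows of the data by hand under the hypothetical name stocks_demo, because the real stocks object exists only inside the tutorial session. The values are copied from the str() output shown in this tutorial. The optional n argument changes how many rows are returned.

```r
# Hypothetical stand-in for the tutorial's stocks data: first ten rows only,
# with values copied from the str() output.
stocks_demo <- data.frame(
  YEAR      = 1928:1937,
  TBONDS    = c(1L, 4L, NA, -3L, 9L, 2L, 8L, 4L, 5L, 1L),
  SPSTOCK   = c(44L, -8L, -25L, -44L, -9L, 50L, -1L, 47L, 32L, -35L),
  TBONDS_D  = c(1L, 1L, NA, 0L, 1L, 1L, 1L, 1L, 1L, 1L),
  SPSTOCK_D = c(1L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 1L, 0L)
)

# head() returns the first 6 rows by default
first_rows <- head(stocks_demo)

# The n argument requests a different number of rows
first_three <- head(stocks_demo, n = 3)

print(first_rows)
```

In the tutorial session itself, the equivalent call would simply be head(stocks).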
The str() function tells us the class of each variable. You will notice that all five variables in the stocks data contain integers.
str(stocks)
'data.frame': 88 obs. of 5 variables:
$ YEAR : int 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 ...
$ TBONDS : int 1 4 NA -3 9 2 8 4 5 1 ...
$ SPSTOCK : int 44 -8 -25 -44 -9 50 -1 47 32 -35 ...
$ TBONDS_D : int 1 1 NA 0 1 1 1 1 1 1 ...
$ SPSTOCK_D: int 1 0 0 0 0 1 0 1 1 0 ...
We will explore this dataset further. To start, we want to learn more about the variables we collected. In most cases, we want to know at least the following summary statistics:
the mean
the standard deviation
the range (minimum and maximum)
For this purpose, R provides the following functions:
mean()
sd()
min()
max()
You can apply each of these functions to one variable at a time. To refer to a variable, add $ followed by the variable name after the name of the data frame, for example stocks$TBONDS.
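To make the $ notation concrete, here is a small sketch using TBONDS. It assumes a hypothetical stand-in data frame, stocks_demo, holding only the first ten TBONDS values from the str() output, so the results will differ from those for the full dataset. Because TBONDS contains a missing value (NA), these functions return NA unless na.rm = TRUE is supplied. Treating missing values this way is an assumption made for this example, not something the tutorial prescribes.

```r
# Hypothetical stand-in: the first ten TBONDS values from the str() output.
stocks_demo <- data.frame(
  TBONDS = c(1L, 4L, NA, -3L, 9L, 2L, 8L, 4L, 5L, 1L)
)

# With a missing value present, the plain call returns NA
mean(stocks_demo$TBONDS)

# na.rm = TRUE drops missing values before computing each statistic
bond_mean <- mean(stocks_demo$TBONDS, na.rm = TRUE)
bond_sd   <- sd(stocks_demo$TBONDS, na.rm = TRUE)
bond_min  <- min(stocks_demo$TBONDS, na.rm = TRUE)
bond_max  <- max(stocks_demo$TBONDS, na.rm = TRUE)

# range() returns the minimum and maximum together
bond_range <- range(stocks_demo$TBONDS, na.rm = TRUE)

print(c(mean = bond_mean, sd = bond_sd, min = bond_min, max = bond_max))
```

A variable without missing values, such as SPSTOCK in the rows shown by str(), does not need the na.rm argument.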
Practice
Try requesting the mean of SPSTOCK from the stocks dataset.
NOTE: The stocks dataset is already loaded into the workspace of this webr session.