Frequency Tables

In 2009, The Guardian newspaper published a list entitled “1000 songs you need to hear before you die”. What kinds of songs did they select? That’s exactly what we’re going to find out now!

We acquired a dataset containing the 1000 songs and information about them. After importing the data into R, we can look at the first rows of the dataset with the following command:

head(songs)
  THEME                 TITLE          ARTIST YEAR
1  Love      The_Look_of_Love             ABC 1982
2  Love           The_Shining Badly_Drawn_Boy 2000
3  Love        God_Only_Knows  The_Beach_Boys 1966
4  Love       Good_Vibrations  The_Beach_Boys 1966
5  Love   Wouldn’t_It_Be_Nice  The_Beach_Boys 1966
6  Love     Eight_Days_a_Week     The_Beatles 1964
                                           SPOTIFY_URL
1 http://open.spotify.com/track/78j3qTBdzcIiT3eS7XymoD
2 http://open.spotify.com/track/2PojSoZ94AIzp7fsz6wtMt
3 http://open.spotify.com/track/0ObrXLrfrqJUNc8RfmIBHP
4 http://open.spotify.com/track/2oF7FZHIJbzjeEXZ3D0Ku4
5 http://open.spotify.com/track/0cx32rX0uZvcJUP92Wkj2y
6                                                     

The dataset contains five columns, one for each characteristic of a song (we also call these “variables”), and 1000 rows, one for each song.

We are now going to create a frequency table for the THEME variable in the dataset. To do this, we will use the table() function. We add the exclude = NULL argument so that the table also shows how many missing values there are.

table(songs$THEME, exclude = NULL)

          Heartbreak       Life_and_death                 Love 
                 145                  131                  139 
         Party_songs    People_and_places Politics_and_protest 
                 162                  145                  141 
                 Sex                 <NA> 
                 131                    6 

Here you see the resulting frequency table. Frequency means how often a value occurs, so this table shows how often each theme occurs. We can see, for example, that there are 145 songs about “Heartbreak” and 131 songs about “Sex”. The <NA> entry shows that the theme is missing for 6 songs.
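To see what the exclude = NULL argument changes, here is a minimal sketch on a small made-up vector. The themes vector is hypothetical and is not taken from the songs dataset. By default, table() leaves missing values out of the count; with exclude = NULL they get their own <NA> entry.

```r
# Hypothetical toy data, not taken from the songs dataset
themes <- c("Love", "Sex", "Love", NA, "Heartbreak", "Love", NA)

# Default behaviour: missing values are dropped from the table
default_counts <- table(themes)
print(default_counts)

# exclude = NULL: missing values are counted under <NA>
all_counts <- table(themes, exclude = NULL)
print(all_counts)

# The counts only add up to the length of the vector when NAs are included
sum(default_counts)  # 5
sum(all_counts)      # 7
```

Comparing the two sums against length(themes) is a quick way to check whether a frequency table is hiding missing values.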

Practice

Make a frequency table for the variable YEAR. Again, use the exclude = NULL argument.

NOTE: The songs dataset is already loaded in this webR session.

table(songs$YEAR, exclude = NULL)

1916 1922 1928 1929 1931 1932 1935 1936 1938 1939 1940 1941 1944 1946 1949 1950 
   1    1    6    2    2    2    1    1    2    3    1    2    2    2    1    2 
1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961 1962 1963 1964 1965 1966 
   3    1    1    4    3    9    3    6   10    5   14    7   17   27   33   37 
1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 
  33   39   31   23   30   27   24   27   23   18   27   25   34   25   19   18 
1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 
  25   22   18   17   18   15   19   10   10   14   11   18   12    7    8    7 
1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 <NA> 
  11   12    7   12   14    9   15   14   16   19    6 
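
A frequency table with this many distinct values is hard to scan by eye. One possible way to find the most frequent values is to sort the table by its counts. The sketch below uses a small made-up vector (the years vector is hypothetical and is not taken from the songs dataset); the same idea should carry over to songs$YEAR.

```r
# Hypothetical toy data, not taken from the songs dataset
years <- c(1966, 1968, 1966, NA, 1979, 1968, 1966)

# Frequency table including missing values
year_counts <- table(years, exclude = NULL)

# Sort the table from most frequent to least frequent
sorted_counts <- sort(year_counts, decreasing = TRUE)
print(sorted_counts)

# The name of the first entry is the most frequent value
names(sorted_counts)[1]
```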