In 2009, The Guardian newspaper published a list entitled “1000 songs you need to hear before you die”. What kind of songs would they have selected for that? That’s exactly what we’re going to find out now!
We acquired a dataset containing the 1000 songs and information about them. After importing the data into R, we can look at the first six rows of the dataset with the following command:
head(songs)
THEME TITLE ARTIST YEAR
1 Love The_Look_of_Love ABC 1982
2 Love The_Shining Badly_Drawn_Boy 2000
3 Love God_Only_Knows The_Beach_Boys 1966
4 Love Good_Vibrations The_Beach_Boys 1966
5 Love Wouldn’t_It_Be_Nice The_Beach_Boys 1966
6 Love Eight_Days_a_Week The_Beatles 1964
SPOTIFY_URL
1 http://open.spotify.com/track/78j3qTBdzcIiT3eS7XymoD
2 http://open.spotify.com/track/2PojSoZ94AIzp7fsz6wtMt
3 http://open.spotify.com/track/0ObrXLrfrqJUNc8RfmIBHP
4 http://open.spotify.com/track/2oF7FZHIJbzjeEXZ3D0Ku4
5 http://open.spotify.com/track/0cx32rX0uZvcJUP92Wkj2y
6
The dataset contains five columns, one for each characteristic of a song (we also call these “variables”), and 1000 rows, one for each song.
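If you want to check the number of rows and columns yourself, base R has a few small helper functions for that. The sketch below uses a tiny stand-in data frame built from three of the rows printed above; the name songs_preview is made up for this illustration and is not part of the original dataset.

```r
# Toy data frame copied from three of the rows shown in the head() output.
# The name songs_preview is hypothetical and used only for illustration.
songs_preview <- data.frame(
  THEME  = c("Love", "Love", "Love"),
  TITLE  = c("The_Look_of_Love", "The_Shining", "God_Only_Knows"),
  ARTIST = c("ABC", "Badly_Drawn_Boy", "The_Beach_Boys"),
  YEAR   = c(1982, 2000, 1966)
)

dim(songs_preview)    # number of rows, then number of columns
nrow(songs_preview)   # number of rows only
ncol(songs_preview)   # number of columns only
names(songs_preview)  # the names of the variables
```

The same functions can be applied to the full data, for example dim(songs).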
We are now going to create a frequency table for the THEME variable in the dataset. To do this, we use the table() function. We add the exclude = NULL argument so that missing values are counted as well and shown under <NA>.
table(songs$THEME, exclude = NULL)
Heartbreak Life_and_death Love
145 131 139
Party_songs People_and_places Politics_and_protest
162 145 141
Sex <NA>
131 6
This is the resulting frequency table. A frequency is the number of times a value occurs, so the table shows how often each theme occurs. For example, there are 145 songs about “Heartbreak” and 131 songs about “Sex”, and the theme is missing (<NA>) for 6 songs.
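To see what the exclude = NULL argument changes, it can help to try table() on a very small example first. The sketch below uses a made-up vector called themes (not the real data) that contains one missing value.

```r
# Hypothetical toy vector of themes with one missing value (NA).
themes <- c("Love", "Sex", "Love", NA, "Heartbreak", "Love")

# Default behaviour: the missing value is left out of the counts.
table(themes)

# With exclude = NULL, the missing value gets its own <NA> category.
table(themes, exclude = NULL)

# For this vector, useNA = "ifany" gives the same result.
table(themes, useNA = "ifany")
```

Without exclude = NULL the counts add up to 5 rather than 6, so the missing value would go unnoticed.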
Practice
Make a frequency table for the variable YEAR. Again, use the exclude = NULL argument.
NOTE: The songs dataset is already loaded in this webR session.