For this project, you will be exploring and visualizing data from boardgamegeek.com. Your goal is to observe and visualize interesting trends and patterns in the data, and to tell a cohesive and compelling story about the insights you gain. This project is intentionally open-ended!
Exceptional projects will include creative and unique insights into the board game data. For example, have the most popular types of board games changed over time? How do complexity and rating interact, if at all? Do the number of players and playing time seem to be associated with user rating and complexity? These are just a few examples of the types of questions I hope you will explore for this project.
Your projects will be evaluated on the quality of your visualizations and exploratory analyses. This includes, but is not limited to, the quality of your writing, the usefulness and clarity of your visualizations, and how interesting the insights you provide are. Your submission should read as one continuous and cohesive report, rather than six distinct and unconnected sections. To this end, your report should include an introductory paragraph as well as a conclusion/summary paragraph at the end. The target audience of your report is an educated reader who is uninformed about the details of the data, but is interested in learning more about board games. I encourage you to use this 538 post for inspiration, but please do not just copy their graphs and analyses.
You have the option to work with a partner for this project and submit a group report. If you do so, you must both submit the same .Rmd, .html, and .Rproj files on Canvas with both of your names at the top of the .html.
Use read_csv() to read online .csv files. I strongly recommend saving a version of the unprocessed .csv on your machine in a Data subfolder within your Project 1 folder so you will be able to work offline.
You may want to filter() the raw data. (Make sure never to save processed data over your original raw .csv file!)
The category and mechanic variables are untidy. I recommend using the cSplit() function within the splitstackshape package to fix these. Try setting both direction = "wide" and direction = "long" within your cSplit() call. Both might be useful to you, depending on what plot you want to generate.
(Use echo = FALSE to hide code.)
Keep in mind: there are no right answers for this project! These are real data, and I’m hoping for creative and interesting analyses that tell a compelling story about the data rather than cookie-cutter reports. Have some fun with it, and good luck!
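As a rough illustration of the cSplit() tip above, here is a minimal sketch of the two directions on a made-up two-row data frame. The game names, the category values, and the comma separator are all assumptions for illustration; check how the values are actually delimited in the real category and mechanic columns before reusing this.

```r
library(splitstackshape)

# Made-up stand-in for the real data: one row per game, with several
# categories packed into a single comma-separated string.
games <- data.frame(
  name = c("Game A", "Game B"),
  category = c("Economic, Strategy", "Party"),
  stringsAsFactors = FALSE
)

# Long: one row per game-category pair (handy for counting or faceting).
games_long <- cSplit(games, "category", sep = ",", direction = "long")

# Wide: one row per game, with the categories spread across new columns.
games_wide <- cSplit(games, "category", sep = ",", direction = "wide")

print(games_long)
print(games_wide)
```

The long form tends to suit plots that group or count by category, while the wide form keeps one row per game, which is convenient when each game should appear only once.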