Hello! Please follow these instructions to download and install the necessary software and files before the workshop, ideally somewhere with a reliable internet connection.
The workshop will be hands on from the get go, so if you don’t have the tools ready, there wouldn’t be much to do other than stare at a neighbbour’s screen, which is not exactly the best way to learn things. Therefore, following these instructions beforehand is a necessarily mandatory prerequisite for participation in the workshop. Note that it is not enough to have just R installed (which you might have) - you will need a number of packages and a file which you will download below.
This document contains step-by-step instructions for:
All steps except 5 are mandatory.
Importantly, if something went wrong and you could not install the software, please get in touch before the workshop starts so we can try to quickly troubleshoot. We cannot afford to waste any time on issues like installation during the workshop.
You must have two pieces of (free) software installed on you laptop before coming to the workshop. Bring your laptop with you. The installation process is very simple. Before you start, please make sure your operating system is up to date as well (particularly Macs: there are known conflicts between old versions of R and some newer packages, which will manifest if you have a Mac with an old version of the Mac OS, which in turn would lead you to download an old version of R).
First and foremost, you need R. If you already have R installed, please still update it to the most recent version (which is done just by downloading the most recent installer and installing). Depending on your operating system, go to:
Download the installer and install (with default options, just keep clicking Next). Run the R once to see that it works (in Windows, Rgui.exe should appear as a shortcut in the start menu and/or desktop; on a Mac, look for the R application in Finder). It should look something like this, depending on your OS:
Good job. Now close R (if it asks to save the workspace: not necessary). Once you get RStudio, there is no need to look at this ugly interface ever again.
While it is fine to use R from the command line or the bare-bones R interface application, we are going to use RStudio instead, which will make using R a lot easier and less of a hassle. It also has nice support for R Markdown, which we will be using.
Common issues and troubleshooting:
RStudio is an integrated development environment for R (which is why we had to install that first) - the Console panel on the left is basically the same thing that you saw when you ran “plain” R. But RStudio also features a number of very helpful things that will become apparent in the workshop. RStudio comes with a handy script editor, which we are going to use right away.
Before we do that, we need to quickly change two options in RStudio to make it behave in a more useful way for us (fortunately, the RStudio interface is highly customisable).
Soft-wrap R source file
(this will make using the script editor much easier, by wrapping long lines so you won’t have to keep scrolling left and right all the time).Show output inline for all R Markdown documents
. This will disable notebook-style plot previews in the script editor and show plots on their own.Almost done! We need to make sure your RStudio and certain packages get along so we can use R Markdown and some more advanced plotting tools in the workshop. During this process, RStudio might need to download a few things - make sure you have internet access.
p=c("ggplot2","ggmosaic","cowplot","ggbeeswarm","ggstance","ggrepel","RColorBrewer","magrittr","reshape2","dplyr","corrplot","plotly","languageR","igraph","visNetwork","quanteda","stringdist","rmarkdown","rworldmap","gapminder"); install.packages(p); x=p%in%rownames(installed.packages());if(all(x)){print("All packages installed successfully!")}else{print(paste("Failed to install:", paste(p[!x]), ", try again and make sure you have internet connection."))};rm(x,p)
Common issues & troubleshooting:
Do you want to install from sources the package which needs compilation? (Yes/no/cancel)
type “no” and press enter.Would you like to create a personal library...
- click yes. Alternatively, if on Windows: run RStudio as administrator (right-click the RStudio icon, choose run as administrator), this will allow it to install packages to the Program Files folder. Second alternative, install Rstudio in a location with write access, i.e. not in Program Files.RcppArmadillo
: ignore it.package ... is not available (for R version ...)
- seems you didn’t update R, see above.Feel free to close this window now. If all this worked, great! If something did not work here, try restarting RStudio once and redoing the steps. If still no luck, find me beforehand so we can fix this.
You will need to ownload a script file for the workshop. Right-click (Mac ctrl-click) here and choose Save link as… (or Download linked file…, or similar) and save the file. Make sure that you save it as an .Rmd
file, not a .txt
file: the file on your computer should be named artofthefigure_dh2019.Rmd
. Open it in RStudio (if RStudio is properly installed, double-clicking it should do it). That’s it, you’re all set for the workshop. Feel free to have a look around, but don’t be intimidated by the amount of code - you will be guided through all of that, step by step.
If you don’t know R (and particularly if you haven’t written code in the past at all) but would like to get a head start, I would highly recommending doing this 5-minute exercise before the workshop begins. It’s just a quick introduction to how basic R syntax works. After doing this, you’ll find it easier to do the exercises and interactive parts at the workshop. If you have used R a while ago, feel free to do it as well as a refresher.
Below is a block of code. What you need to do is select and copy this block of code, and paste it into the RStudio script editor. First create a new script by clicking on the New icon in the top left corner of RStudio (or File->New File) and choose “R Script”.
A blank text file will open above the Console panel, titled something like Untitled1.R
.
Good job! Now copy-paste the code below (the gray box) into the new script file you just created, and follow the instructions therein.
# Start of the exercise. You should be reading this message in the RStudio script editor.
print("Hello! Put your text cursor on this line (click on the line). Anywhere on the line. Now press CTRL+ENTER (PC) or CMD+ENTER (Mac). Just do it.")
# The command above, when executed (what you just did), printed the text in the console below. Also, this here is a comment. Commented parts of the script (anything after a # ) are not executed.
# Also, if you've been scrolling left and right in the script window to read all this, turn on text wrapping ASAP: on the menu bar above, go to Tools -> Global Options -> Code (on the left) -> tick "Soft-wrap R source files". Better, right?
# So, print() is a function. Most functions look something like this:
# myfunction(inputs, parameters)
# All the inputs to the function go inside the ( ) brackets. In the above case, the text (all text, or "strings", must be within quotes) is the input to the print() function.
# Let's try another function, sum().
sum(1,10) # cursor on the line, press CTRL+ENTER (or CMD+ENTER on Mac)
# You should see the output (sum of 1 and 10) in the console. Note that you can also write commands directly in the console, and executing them with ENTER. Try some simple maths: write 2*3+1 in the console below, and press ENTER.
# Important: you can always get help for a function and check its input parameters by executing
help(sum) # put the name of any function in the brackets
# ...or by searching for the function by name in the Help tab on the right, or by selecting a function name in the code and pressing F1.
# Let's plot something. The command for plotting things is, surprisingly, plot().
plot(42, main = "Greatest plot in the world") # execute the command; a plot should appear on the right. If it appears inline in the script editor window, go back to the pre-workshop instructions and follow the steps for configuring R Markdown output.
# OK, this was not very exciting. But notice that a function can have multiple inputs, or arguments. In this case, the first argument is the data (a vector of length one), and the second is 'main', which specifies the main title of the plot.
# Note that you can make to plot bigger by pressing the 'Zoom' button above the plot panel on the right.
# Let's create some data to play with. We'll use the sample() command, which creates random numbers from a predifined sample. Basically it's like rolling a dice some n times, and recording the results.
sample(x = 1:6, size = 50, replace = T) # execute this; its output is 50 numbers
# If an output is not assigned to some object, it usually just gets printed in the console. It would be easier to work with data, if we saved it in an object. For this, we need to learn assignement, which in R works using the equals = symbol (or the <-, but let's stick with = for simplicity).
xdata = sample(x = 1:6, size = 50, replace = T) # what it means: xdata is the name of a (new) object, the equals sign (=) signifies assignement, with the object on the left and the data on the right. In this case, the data is the output of the sample() function. Instead of printing in the console, the output is assigned to the object.
xdata # execute to inspect: calling an object usually prints its contents into the console below.
# Let's plot:
plot(xdata) # plots data as it is ordered in the object
hist(xdata, breaks=20, main="Frequency of dice values") # plots a histogram (distribution of values)
# Since we already installed the plotly package, why not try it out:
library(plotly) # load the package
ydata = xdata+runif(length(xdata),-1,1) # some more random data
plot_ly(x=~xdata, y=~ydata, type="scatter", mode="markers", alpha=0.5, marker=list(size=10)) # creating interactive plots is easy! (also all the stuff beyond "y=~" is optional, only for making it prettier)
# Well done!! This is the end of the pre-workshop exercise. Congratulations, you have successfully installed R and RStudio, and have now learned the basics of programming (functions, assignement, parameters, input-output; as well as authoring basic webpages).
# You have also learned how to make simple plots in R. This will form the basis of the workshop, where we will learn how to recognize ways to plot different kinds of data (not just random dice rolls!), and how to modify the plotting parameters in order to create informative but also visually more pleasing graphs.
# Best of all, all the code is reusable in the sense that you can easily use the very same commands that we will be working with to plot your own data later on, just by changing the inputs.
# Also, the "programming" side of things will not get any more complicated than what you did in the last 10 minutes. That is literally all you need - but we will explore how to use these basic skills in various different ways.