R Resources and Tips



Resources to get started with R

The basics


Lengthier material

Why R?

R is a programming language that is best known for being excellent for statistical analysis and data visualization.

While the learning curve is steeper than for most programs with graphical user interfaces (GUIs), it pays off to invest in learning R:

  • R gives you greater flexibility to do anything you want.

  • Writing computer instructions as code, like you have to do in R, is more reproducible than clicking around in a GUI. It also makes it much easier to redo analyses with slight modifications!

  • R is highly interdisciplinary and can be used with many different kinds of data. To name just two examples, R has a very strong ecosystem for bioinformatics analysis (the “Bioconductor” project), and can be used to create maps and perform GIS analyses.

  • R is more than a platform to perform analysis and create figures. R Markdown combines R with a simple text markup language to produce analysis reports that integrate code, results, and text, and to create slide decks, data dashboards, websites, and even books!

  • While not as versatile outside of data-focused topics as a language like Python, R can be used as a general programming language, for instance to automate tasks such as large-scale file renaming.

Finally, R:

  • Is open-source and freely available for all platforms (Windows, Mac, Linux).

  • Has a large and welcoming user community.



Miscellaneous R tips

Installing R packages

CRAN packages

To install an R package that is available at CRAN, the default R package repository, from within R (e.g. in the R console in RStudio), use the install.packages() function.
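As a minimal sketch of how this can look in practice, the snippet below wraps install.packages() in a small helper that only installs a package when it is not yet present. The helper name install_if_missing is hypothetical (it is not part of R or of any package), and the package name in the usage comment is just an example.

```r
# Hypothetical helper (not part of base R): install a CRAN package
# only if it cannot already be found in your library.
install_if_missing <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)   # Installs from CRAN, including R dependencies
  }
  # Return (invisibly) whether the package is now available
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# Example usage with an arbitrary CRAN package:
# install_if_missing("ggplot2")
```

Checking first avoids re-downloading a package you already have, which is convenient at the top of a script that others will run.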

The install.packages() function will handle dependencies within R – i.e., it will install other R packages that your package depends on. Occasionally, when the install function needs to compile a package from source, errors arise that relate to missing system dependencies (i.e. software outside of R).

On Mac and Linux, these system dependencies are best installed outside of R, such as with homebrew on Mac or apt on Ubuntu. The error message you get when trying to install an R package should tell you which system dependencies are needed.

On Windows, you can use the installr package to install such dependencies or other software from within R – for example:

install.packages("installr")    # Install the installr package first
installr::install.RStudio()     # Install RStudio
installr::install.python()      # Install Python

System setup for installing packages “from source”

Sometimes you need to install a package from source, that is, you need to compile the package rather than simply installing a pre-existing binary. (On Linux, where installing from source is often needed, this should work without additional steps.) On Windows and Mac, installing from source is generally only needed when you install a package from outside of CRAN (such as from GitHub, see below), but you will need to make sure you have the following non-R software:

On Windows, you will need Rtools (Rtools installation instructions).

On a Mac, you will need Xcode (which can be installed from the Mac App store).

You can test whether or not you are able to install packages from source using the devtools package:

install.packages("devtools")    # Install the devtools package
devtools::has_devel()           # Check whether you can install packages from source

For a bit more info, see this page.


Installing packages from GitHub

To install a package from GitHub, use either the devtools or the remotes package – for example:

install.packages("remotes")                # Install the remotes package
remotes::install_github("kbroman/broman")  # Install from a repository using "<username>/<repo-name>"

This will install the package from source, so you will need to make sure you are able to do so by following the instructions in the section right above this one.


Installing packages from Bioconductor

If you’re doing bioinformatic analyses in R, you will probably run into packages that are not on CRAN but on Bioconductor. To install a package from Bioconductor, use the BiocManager package – for example:

install.packages("BiocManager")  # Install the BiocManager package
BiocManager::install("edgeR")    # Install the edgeR package from Bioconductor


Updating R

Consider updating R if you have an older version of R installed – as of June 2022, the current version is R 4.2 and we would certainly recommend updating R if the version is below R 4.0.

You can check which version of R you have by looking at the first line of output when running the following command inside R:

sessionInfo()
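
If you only want the version itself, or want to test it in a script, base R also provides R.version.string and getRversion(). The sketch below uses the R 4.0 threshold mentioned above; the variable names are just for illustration.

```r
R.version.string                      # One-line description of your R version

r_version <- getRversion()            # The version as a comparable object
needs_update <- r_version < "4.0.0"   # TRUE if your R is older than R 4.0
needs_update
```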

To update:


Re-installing your packages after updating (Mac and Linux)

While the installr::updateR() function for Windows users takes care of re-installing your packages along with updating R, Mac and Linux users will have to manually re-install their packages. Some people prefer to re-install these packages on the fly, which can end up being a way to get rid of packages you no longer use.

But if you want to immediately re-install all your packages, run this before you upgrade:

my_packages <- installed.packages()
saveRDS(my_packages, "my_packages.rds")

Then, after you’ve installed the latest R version:

my_packages <- readRDS("my_packages.rds")      # Same file name as saved before upgrading
install.packages(my_packages[, "Package"])     # The "Package" column holds the package names

This will only work for packages available on CRAN. Of course, you can check your list for GitHub-only and Bioconductor packages and then install those with their respective commands (see above). Yes, this can be a bit of a hassle!
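
To see which packages from your saved list are still missing after re-installing, you can compare the list with what is currently installed. This is a rough sketch: the helper name missing_packages is hypothetical, and it assumes the saved object is the matrix returned by installed.packages(), which has a column called "Package".

```r
# Hypothetical helper: which packages in a saved installed.packages()
# matrix are not present in the current R library?
missing_packages <- function(saved, current = installed.packages()) {
  setdiff(saved[, "Package"], current[, "Package"])
}

# Example usage, with `my_packages` being the matrix you saved before upgrading:
# missing_packages(my_packages)
```

Whatever this returns are likely candidates for the GitHub and Bioconductor installation commands, or packages that are no longer available on CRAN.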






Jelmer Poelstra
Bioinformatician at MCIC