Objectives


Getting started

Let’s assume you’ve downloaded R to your computer and are getting all excited to start down the path of learning statistical computing. Maybe you don’t quite know what “statistical computing” is, other than that it sounds exciting! So hopefully you’re wondering by now: What the heck is R? Let’s start by looking at a really boring definition, and then pick it apart a bit:

“R is a language and environment for statistical computing and graphics. […] R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible. […] R is available as Free Software under the terms of the Free Software Foundation’s GNU General Public License in source code form.”

So, what does this mean?…

“R is a language and environment for statistical computing”: This means that R doesn’t look like Excel or other spreadsheet programs you may be used to. Instead, R looks a lot more like programming languages such as Python or Java. Using R, you will be calling functions and writing code snippets that are very similar to those used in other programming languages, but that have a special emphasis on data analysis and visualization. (Note: Python, Java, and other programming languages can also do data analysis and visualization, but R was designed with these tasks in mind, so many statistical and graphical tools are built right in!)
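To give a feel for what “functions and code snippets” look like in practice, here is a minimal sketch. It assumes only a standard R installation and uses `mtcars`, a small example data set that ships with R. The variable names `avg_mpg` and `fit` are made up for this illustration.

```r
# mtcars is an example data set that ships with R (one row per car model)
data(mtcars)

# Summarize one column: average fuel efficiency in miles per gallon
avg_mpg <- mean(mtcars$mpg)
print(avg_mpg)

# Fit a simple linear model: how does fuel efficiency relate to car weight?
fit <- lm(mpg ~ wt, data = mtcars)

# Show the estimated intercept and slope
print(coef(fit))

# Draw a scatterplot of the data and add the fitted line
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
abline(fit)
```

Notice that summarizing, modelling, and plotting each take just one function call. That is the “special emphasis on data analysis and visualization” in action.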

“R is highly extensible”: This means that people can write packages for R and share them with other R users. For example, let’s say you wanted to write a package that lets people make graphs and charts that look like web comics. You could write a package that does that, and then share it back to the R user community for others to use. (In fact, this has already been done!)
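Using a package that someone else has shared usually takes two steps: install it once, then load it in each session where you need it. The sketch below is one way to do that, not the only one. `use_package()` is a hypothetical helper written for this example, and it is called with `"tools"` only because that package ships with R, so the example runs without an internet connection.

```r
# use_package() is a hypothetical helper written for this example, not part of R
use_package <- function(pkg) {
  # Install from CRAN only if the package is not already on this computer
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Load the package so its functions can be called in this session
  library(pkg, character.only = TRUE)
}

# "tools" ships with R, so nothing needs to be downloaded here;
# replace it with the name of a CRAN package such as "ggplot2"
use_package("tools")
```

In everyday use, most people simply type `install.packages("ggplot2")` once and then `library(ggplot2)` at the top of each script.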

“R is available as Free Software”: This means that anyone can download, use, study, and modify R without paying for it. R itself is maintained by a core team of developers, and a large, global community of geeks writes and maintains the various packages that are available for R. The software and packages are distributed through CRAN–the “Comprehensive R Archive Network”–a network of servers around the world that offers R to you and other students and scientists for free! If you have already downloaded R onto your computer, you probably got the software from the CRAN website. R is the result of a massive amount of collaboration and coordination between data geeks around the world!

Now that you have some background on where R comes from and how it is connected to a broader global community, here are some tips you should keep in mind as you get started learning R…

1) Keep each project organized in its own folder–this helps R find your files.

In general, it is best to create a separate folder for each project you’re working on, and keep all of your files for that project within the same folder. This is because R is very specific in how it looks for files on your computer. It always starts in what R calls the “working directory”. The working directory is essentially just a folder on your computer that R has flagged to say: “Start here! When you’re looking for things, start here first!”. You can also change your working directory whenever you need to, so that R starts looking for files in a different place.
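Here is a minimal sketch of checking and changing the working directory with `getwd()` and `setwd()`. The folder name `my_project` and the file name `example.csv` are made up for this illustration. The folder is created inside R’s temporary directory so the example is safe to run; in practice you would point R at your own project folder instead.

```r
# Show the folder R is currently using as its working directory
old_dir <- getwd()
print(old_dir)

# Hypothetical project folder, created in R's temporary directory so that
# running this example does not touch your real files
project_dir <- file.path(tempdir(), "my_project")
dir.create(project_dir, showWarnings = FALSE)

# Tell R to start looking for files in the project folder
setwd(project_dir)

# A plain file name is now read from and written to the project folder
write.csv(data.frame(x = 1:3), "example.csv", row.names = FALSE)
print(list.files())

# Switch back to the original working directory
setwd(old_dir)
```

Because `"example.csv"` is given without a folder path, R saves it in whatever the working directory is at that moment. This is why keeping one folder per project makes files easy to find.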