Objectives


Related to: Data Computing, “Functions, Arguments and Commands”, p. 21; “R Command Patterns”, Ch. 3


Functions are R’s “recipes” for your data

A function in computer programming is like a recipe: it defines a set of inputs, how to combine them, and what will come out in the end. Similar to a recipe, a function expects you to put in a specific number and type of inputs in order to end with something tasty (in a statistical sense).

The inputs you put into functions are called arguments. If you’ve ever taken a cooking class, you know that some recipes, like a peanut butter & jelly sandwich, can be very simple and require very few ingredients. Other recipes, like a pizza, can be more complex, and the number of ingredients can vary a bit based on your preferences. R functions are similar: some take only one or two arguments, while others can take a long list of arguments. A typical R function has a small set of required arguments. Functions may also have optional arguments that you can add on, like “pizza toppings”, when you call the function.
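To make the required-versus-optional distinction concrete, here is a minimal sketch using base R’s round(). The number is made up for illustration. The value to round is the required argument, and digits is an optional one.

```r
# round() needs one ingredient: the number to round
plain <- round(3.14159)

# digits is an optional "topping": how many decimal places to keep
with_topping <- round(3.14159, digits = 2)

print(plain)         # 3
print(with_topping)  # 3.14
```

Leaving out digits does not cause an error. R falls back on the default of 0 decimal places.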

[Figure: pizza ingredients compared to arguments, and a pizza recipe compared to a function]

Pro tip: Wrong thing in, wrong thing out

Many of the errors you’ll see in R happen when you feed the wrong type of argument into a function. Functions can be very picky about the type of input they expect. For example, if you feed a function a column of numbers (a quantitative variable) when it’s expecting a column of words (a qualitative variable), it is likely to spit back an error message in the R console. It’s like baking cupcakes with salt instead of sugar or feeding your dog chocolate for dinner: expect something gross to come out!
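As a small illustration of this kind of error, the sketch below hands a column of words to sum(), which expects numbers. The building_types vector is made up for this example, and tryCatch() is used only so the error message can be printed without stopping the script.

```r
# A made-up column of words (a qualitative variable)
building_types <- c("library", "fire station", "school")

# sum() expects numbers, so handing it words produces an error.
# tryCatch() captures the error message so the script keeps running.
result <- tryCatch(
  sum(building_types),
  error = function(e) conditionMessage(e)
)

print(result)
```

The printed message complains about the type of the argument, which is your clue that the wrong kind of ingredient went in.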

1-argument function example

Now, let’s start by looking at some examples of simple functions that expect either one or multiple arguments so you can get a better sense of how this all fits together in R syntax. For these examples, we’ll be using a dataset describing the energy efficiency of different government and public buildings across the city of Minneapolis.

Simple functions like mean(), sum(), and summary() can be called with a single argument in R. The structure is generally as follows:

[Figure: basic structure of a single-argument R function]
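In code, that structure is the function’s name followed by the argument in parentheses. Here is a minimal sketch with mean(), where the scores vector is made up for illustration:

```r
# General shape: function_name(argument)
# Here mean is the function name and scores is the single argument.
scores <- c(50, 75, 100)  # made-up values for illustration

average <- mean(scores)
print(average)  # 75
```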

For example, let’s look at a basic summary of the “energy_star_score” variable in our dataset. We can use the 1-argument function summary() to get a quick look at the minimum, quartiles, median, mean, and maximum of this variable, along with a count of its missing values (NA’s):

summary(data$energy_star_score)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##    1.00   46.25   71.50   65.05   86.75  100.00     189
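The NA’s entry in the output above counts missing values. Missing values are a common place where an optional argument becomes useful, so here is a hedged sketch on a small made-up vector (not the real dataset) showing how mean() behaves with and without its optional na.rm argument:

```r
# Made-up scores with one missing value (NA)
scores <- c(40, 60, NA, 80)

# summary() counts the missing values for you in the NA's column
print(summary(scores))

# mean() returns NA when any value is missing...
mean_with_na <- mean(scores)

# ...unless the optional na.rm argument tells it to drop NAs first
mean_without_na <- mean(scores, na.rm = TRUE)

print(mean_with_na)     # NA
print(mean_without_na)  # 60
```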

Multiple-argument function example

Once we get beyond 1-argument functions, the world of R starts getting a bit more complex. Multiple-argument functions typically have some combination of required arguments, along with some additional optional arguments. For example, the function plot() accepts two numeric vectors (here, two columns of our dataset) and draws a scatterplot, with the first on the x-axis and the second on the y-axis:

plot(data$year_built, data$energy_star_score)
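Optional arguments can then be layered on top of the same call. The sketch below uses a small made-up data frame named buildings as a stand-in for the Minneapolis dataset (the values are invented for illustration) and adds the optional xlab, ylab, and main arguments to label the plot:

```r
# A small made-up data frame standing in for the Minneapolis dataset
buildings <- data.frame(
  year_built = c(1925, 1958, 1974, 1996, 2011),
  energy_star_score = c(42, 55, 61, 78, 90)
)

# The first two arguments supply the x and y values.
# xlab, ylab, and main are optional arguments that label the plot.
plot(
  buildings$year_built,
  buildings$energy_star_score,
  xlab = "Year built",
  ylab = "Energy Star score",
  main = "Energy Star score by year built"
)
```

Naming the optional arguments (xlab = ..., and so on) tells R which “topping” each value belongs to, so their order does not matter.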