In this lesson, you’ll receive your first taste of the R programming language. Specifically, you’ll learn how to use R as a calculator and the basics of variables, functions, and packages.
When using R as a calculator, the order of operations is the same as you would have learned back in school.
From highest to lowest precedence:
Parentheses: ( , )
Exponents: ^ or **
Multiply: *
Divide: /
Add: +
Subtract: -
3 + 5 * 2
[1] 13
Use parentheses to group operations to force the order of evaluation if it differs from the default, or to make clear what you intend.
(3 + 5) * 2
[1] 16
Parentheses can get unwieldy when they are not needed, but they clarify your intentions. Remember that others may later read your code.
(3 + (5 * (2 ^ 2))) # hard to read
3 + 5 * 2 ^ 2 # clear, if you remember the rules
3 + 5 * (2 ^ 2) # if you forget some rules, this might help
The text after each line of code is called a “comment.” Anything that follows the hash (or octothorpe) symbol # is ignored by R when it executes code. (Note the difference between a code comment and a Quarto chunk option, which is specified with #|.)
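The precedence rules above have a few consequences that are easy to miss. The following is a small sketch, using only base R arithmetic, of how exponents interact with a leading minus sign and how operators at the same level are grouped:

```r
# Exponentiation binds tighter than a leading minus, so this is -(2^2)
-2^2        # -4
(-2)^2      # 4

# ^ groups from right to left: 2^(3^2), not (2^3)^2
2^3^2       # 512
(2^3)^2     # 64

# * and / sit at the same level and are evaluated left to right
8 / 4 * 2   # 4

# + and - also sit at the same level and are evaluated left to right
10 - 4 - 3  # 3
```

When in doubt, adding a pair of parentheses costs little and removes the ambiguity.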
Very small or very large numbers are displayed in scientific notation:
2/10000
[1] 2e-04
Here, the e is shorthand for “multiplied by 10^XX”. So 2e-4 is shorthand for 2 * 10^(-4).
You can write numbers in scientific notation too:
5e3 # Note the lack of minus here
[1] 5000
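As a quick check of the notation, the sketch below compares a few values written both ways. It also uses format(), a base R function, to ask for a particular display style; the specific numbers are only illustrative.

```r
# The e notation and the long form describe the same number
5e3 == 5000                         # TRUE

# For very small values, compare with all.equal() rather than ==,
# because 10^(-4) cannot be stored exactly
all.equal(2e-4, 2 * 10^(-4))        # TRUE

# format() returns the number as text in the requested style
format(1e6, scientific = FALSE)     # "1000000"
format(123456, scientific = TRUE)   # "1.23456e+05"
```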
R has many built-in mathematical functions. To call a function, we can type its name, followed by open and closing parentheses. Functions take arguments as inputs; anything we type inside the parentheses of a function is considered an argument.
Depending on the function, the number of arguments can vary from none to multiple. For example:
getwd() #returns an absolute filepath
doesn’t require an argument. In contrast, the following mathematical functions need a value to compute a result:
sin(1) # trigonometry functions
[1] 0.841471
log(1) # natural logarithm
[1] 0
log10(10) # base-10 logarithm
[1] 1
exp(0.5) # e^(1/2)
[1] 1.648721
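Some functions accept more than one argument. As a brief sketch, log() takes an optional second argument called base, and round() takes one called digits; arguments can be given by name or by position:

```r
# Second argument given by name
log(100, base = 10)           # 2

# Same idea, with the base matched by position
log(8, 2)                     # 3

# The result of one function can be the argument of another
round(exp(0.5), digits = 2)   # 1.65

sqrt(16)                      # 4
```

Naming arguments makes a call longer but easier to read, which helps when a function has many of them.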
Don’t worry about remembering every function in R. You can look them up on Google, or if you can remember the start of the function’s name, use the tab completion in RStudio. This is one advantage that RStudio has over R on its own: its auto-completion makes it easy to look up functions, their arguments, and the values that they take.
Typing ? before the name of a command will open the help page for that command. When using RStudio, this will open the ‘Help’ pane; if using R in the terminal, the help page will open in the terminal’s pager or in your browser, depending on how R is configured. The help page will include a detailed description of the command. The bottom of the help page usually shows a collection of code examples illustrating command usage. We’ll go through an example later.
We can also make comparisons in R:
1 == 1 # equality (note two equals signs, read as "is equal to")
[1] TRUE
1 != 2 # inequality (read as "is not equal to")
[1] TRUE
1 < 2 # less than
[1] TRUE
1 <= 1 # less than or equal to
[1] TRUE
1 > 0 # greater than
[1] TRUE
1 >= -9 # greater than or equal to
[1] TRUE
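One caution applies when comparing numbers with decimal places: most decimal fractions cannot be stored exactly, so == can give a surprising answer. A minimal sketch of the problem, with all.equal() from base R as one way around it:

```r
# Looks like it should be TRUE, but tiny rounding errors get in the way
0.1 + 0.2 == 0.3                      # FALSE

# all.equal() compares within a small tolerance;
# wrapping it in isTRUE() gives a plain TRUE or FALSE
isTRUE(all.equal(0.1 + 0.2, 0.3))     # TRUE

# Whole numbers of ordinary size are stored exactly, so == is safe here
2 + 2 == 4                            # TRUE
```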
We can store values in variables using the assignment operator <-, like this:
x <- 1/40
Notice that assignment does not print a value. Instead, we stored it for later in something called a variable. x now contains the value 0.025:
x
[1] 0.025
Look for the Environment tab in the top right panel of RStudio, and you will see that x and its value have appeared. Our variable x can be used in place of a number in any calculation that expects a number:
log(x)
[1] -3.688879
Notice also that variables can be reassigned:
x <- 100
x used to contain the value 0.025 and now equals 100.
Assignment values can contain the variable being assigned to:
x <- x + 1 # notice how RStudio updates its description of x on the top right tab
y <- x * 2
The right-hand side of the assignment can be any valid R expression. The right-hand side is fully evaluated before the assignment occurs.
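To trace what “fully evaluated before the assignment” means, here is the same sequence again with the value of each variable noted at every step. It starts from x equal to 100, as in the text above:

```r
x <- 100      # x is 100
x <- x + 1    # the right-hand side uses the old x (100), so x becomes 101
y <- x * 2    # uses the current x (101), so y is 202; x is unchanged

x             # 101
y             # 202
```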
Variable names can contain letters, numbers, underscores, and periods but no spaces. They must start with a letter, or with a period that is not followed by a number.
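A few names that follow these rules are sketched below. The variable names are made up for illustration. The last line uses make.names(), a base R function that converts arbitrary text into valid names, to show what R does with names that break the rules.

```r
# Valid names (all hypothetical, chosen only for this example)
min_height <- 1.5    # underscore between words
max.height <- 2.5    # period between words
age2 <- 30           # a number is fine after the first character

# These would be syntax errors if uncommented:
# 2age <- 30         # starts with a number
# my var <- 30       # contains a space

# make.names() shows how R would repair invalid names
make.names(c("min_height", "2age", "my var"))
# gives "min_height", "X2age" and "my.var"
```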
It is recommended to use a consistent variable naming style throughout your code.
Note that it is also possible to use the = operator for assignment:
x = 1/40
But this is much less common among R users, and the general recommendation is to use <-.
Note that a variable can contain many values at once. For example, a vector in R corresponds to a collection of values stored in a certain order, all with the same data type. There are many ways to create vectors. Some examples include:
c(1, 4, 2)
[1] 1 4 2
1:5
[1] 1 2 3 4 5
2^(1:5)
[1] 2 4 8 16 32
x <- 1:5
2^x
[1] 2 4 8 16 32
This is incredibly powerful; we will discuss this further in an upcoming lesson.
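As a small preview, the sketch below shows that arithmetic on a vector is applied to every element, and that a vector holds only one data type. It reuses x as defined above.

```r
x <- 1:5

# Arithmetic is applied element by element
x * 2            # 2 4 6 8 10
x + x            # 2 4 6 8 10

# Some functions summarise the whole vector
sum(x)           # 15
length(x)        # 5

# Mixing types forces everything to a common type (here, character)
c(1, "a", TRUE)  # "1" "a" "TRUE"
```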
There are a few useful commands you can use to interact with the R session.
ls will list all of the variables and functions stored in the global environment (your working R session):
ls()
[1] "x" "y"
Note here that we didn’t give any arguments to ls, but we still needed to give the parentheses to tell R to call the function.
If we type ls by itself, R prints a bunch of code instead of a listing of objects.
ls
function (name, pos = -1L, envir = as.environment(pos), all.names = FALSE,
    pattern, sorted = TRUE)
{
    if (!missing(name)) {
        pos <- tryCatch(name, error = function(e) e)
        if (inherits(pos, "error")) {
            name <- substitute(name)
            if (!is.character(name))
                name <- deparse(name)
            warning(gettextf("%s converted to character string",
                sQuote(name)), domain = NA)
            pos <- name
        }
    }
    all.names <- .Internal(ls(envir, all.names, sorted))
    if (!missing(pattern)) {
        if ((ll <- length(grep("[", pattern, fixed = TRUE))) &&
            ll != length(grep("]", pattern, fixed = TRUE))) {
            if (pattern == "[") {
                pattern <- "\\["
                warning("replaced regular expression pattern '[' by '\\\\['")
            }
            else if (length(grep("[^\\\\]\\[<-", pattern))) {
                pattern <- sub("\\[<-", "\\\\\\[<-", pattern)
                warning("replaced '[<-' by '\\\\[<-' in regular expression pattern")
            }
        }
        grep(pattern, all.names, value = TRUE)
    }
    else all.names
}
<bytecode: 0x10e3d74a8>
<environment: namespace:base>
What’s going on here?
Like everything in R, ls is the name of an object, and entering the name of an object by itself prints the contents of the object. The object x that we created earlier contains 1, 2, 3, 4, 5:
x
[1] 1 2 3 4 5
The object ls contains the R code that makes the ls function work! We’ll talk more about how functions work and start writing our own later.
You can use rm to delete objects you no longer need:
rm(x)
If you have lots of things in your environment and want to delete all of them, you can pass the results of ls to the rm function (or you can click the “broom” icon in the environment panel):
rm(list = ls())
In this case, we’ve combined the two. Like the order of operations, anything inside the innermost parentheses is evaluated first, and so on.
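To watch the two steps happen separately without clearing your own workspace, the sketch below works in a separate scratch environment. The name scratch and the objects a and b are made up for this example; new.env(), assign(), and the envir argument of ls() and rm() are part of base R.

```r
# A throwaway environment, so nothing in your session is deleted
scratch <- new.env()
assign("a", 1, envir = scratch)
assign("b", 2, envir = scratch)

ls(envir = scratch)                     # "a" "b"

# Remove one object by name
rm("a", envir = scratch)

# Step 1: the inner call runs first and returns the remaining names
remaining <- ls(envir = scratch)        # "b"

# Step 2: those names are handed to rm() through its list argument
rm(list = remaining, envir = scratch)

length(ls(envir = scratch))             # 0
```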
Here, we’ve specified that the results of ls should be used for the list argument in rm. When assigning values to arguments by name, you must use the = operator!
If, instead, we use <-, there will be unintended side effects, or you may get an error message:
rm(list <- ls())
Error in rm(list <- ls()): ... must contain names or character strings
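The side effect is easier to see with a function that does not raise an error. In the sketch below, compare_assignment is a hypothetical helper written for this example; it calls mean() once with = and once with <-, and checks each time whether a variable named x has appeared.

```r
# Hypothetical helper: reports whether each style of call created a variable x
compare_assignment <- function() {
  # = names the argument; no variable is created
  mean(x = c(4, 5, 6))
  after_equals <- exists("x", inherits = FALSE)

  # <- creates a variable x first, then passes its value to mean()
  mean(x <- c(1, 2, 3))
  after_arrow <- exists("x", inherits = FALSE)

  c(after_equals = after_equals, after_arrow = after_arrow)
}

result <- compare_assignment()
result   # after_equals is FALSE, after_arrow is TRUE
```

With <- the call still works, but it quietly leaves a new variable behind, which is the unintended side effect mentioned above.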
We can add functions to R by writing a package or obtaining a package written by someone else. As of this writing, there are over 17,000 packages available on CRAN (the Comprehensive R Archive Network). R and RStudio have functionality for managing packages:
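You can install a package by typing install.packages("packagename"), where packagename is the package name in quotes.
You can update installed packages by typing update.packages().
You can make a package available for use by typing library(packagename).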
Packages can also be viewed, loaded, and detached in the Packages tab of the lower right panel in RStudio. Clicking on this tab will display all of the installed packages with a checkbox next to them. If the box next to a package name is checked, the package is loaded and if it is empty, the package is not loaded. Click an empty box to load that package and click a checked box to detach that package.
Packages can be installed and updated from the Packages tab with the Install and Update buttons at the top of the tab.
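The same steps can be carried out from the console. The sketch below uses the tools package as a stand-in because it ships with R, so nothing is downloaded; substitute the name of the package you actually need. requireNamespace(), search(), and detach() are base R functions.

```r
# "tools" is only a stand-in here; it is installed with R itself
pkg <- "tools"

# Install only if the package is not already available
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)
}

# Load the package, then confirm it appears on the search path
library(tools)
loaded_after_library <- "package:tools" %in% search()   # TRUE

# Detach it again (the console equivalent of unticking the box)
detach("package:tools")
loaded_after_detach <- "package:tools" %in% search()    # FALSE
```

Checking before installing avoids downloading a package again each time the script runs.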