ACTL1101 Introduction to Actuarial Studies
By the end of this topic, you should be able to do the following in R:
These slides often make reference to specific pages from the following book:
This book serves as the main (though not only) reference for the R content of ACTL1101. While you can buy a hard copy of this book at the UNSW bookshop, it is also downloadable for free at the following link (you may have to enter your UNSW credentials to access it).
R can easily replace all the functionality of a (sophisticated!) calculator.
Calculate the following
R responds to your requests by displaying the result obtained after evaluation. However, this result is displayed, then lost.
To store values, one can use the assignment arrows: <- or ->, or the more standard =.
[1] 1
[1] 2
[1] 3
[1] 1
These stored values are known as variables (more on these later).
While there are many ways to assign variables in R, it is recommended that you either use = or <-.
Some of you may also be asking whether there are any differences between the different assignment operators, and there are, but they are unlikely to have any effect on the kind of things we will do in this course.
If you are interested in the differences anyway, see this link for more information.
Once we move to RStudio, there is an inbuilt shortcut Alt + - for typing <-.
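As a minimal sketch of the assignment operators described above (the variable names a, b, c_val and total are illustrative and do not come from the slides):

```r
# Three equivalent ways to store a value
a <- 1       # leftward arrow
2 -> b       # rightward arrow
c_val = 3    # equals sign

# Typing a variable's name displays its stored value
a
b
c_val

# Stored values can be reused in later calculations
total <- a + b + c_val
total
```

Unlike a bare calculation, the result of each assignment remains available for later use.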
Perform the following tasks
- Create var1 whose value is the sum of \(\{1,2,3,4\}\)
- Create var2 whose value is the product of \(e\) and \(\pi\)
- Create var3 whose value is the sum of var1 and var2
- Display var3

Which of the following are valid methods of assigning x with the value 2?

- x = 2? Yes
- 2 -> x? Yes
- x -> 2? No
- x <- 2? Yes
- 2 = x? No
- 2 <- x? No

Vectors are a very important type of data structure in R. We will see other types of data structures in Week 2, but for now we concentrate on vectors.
A vector is a sequence of data points of the same type.
You can create a vector in different ways. For instance, the function c() produces a vector.
Operations performed on vectors are done element by element.
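A short sketch of creating vectors with c() and operating on them element by element (the vectors x and y are illustrative):

```r
x <- c(1, 2, 3, 4, 5)        # combine five numbers into one vector
y <- c(10, 20, 30, 40, 50)

x * 2     # each element is doubled
x + y     # elements are added pairwise
x^2       # each element is squared
```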
seq()
[1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
[1] 0 5 10 15 20
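The calls that produced these two sequences are not shown here. As an assumption, calls of the following form generate the same kind of regularly spaced vectors (the names s1, s2 and s3 are illustrative):

```r
s1 <- seq(from = 0, to = 1, by = 0.1)    # steps of 0.1 from 0 to 1
s2 <- seq(from = 0, to = 20, by = 5)     # steps of 5 from 0 to 20
s3 <- seq(0, 20, length.out = 5)         # same values, specified by length

s1
s2
s3
```

Using by fixes the step size, while length.out fixes the number of points and lets R work out the step.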
You can perform relational operations in R, which output logical values (TRUE/FALSE).
[1] TRUE
[1] FALSE
[1] TRUE
[1] TRUE
[1] FALSE
[1] TRUE TRUE FALSE
[1] TRUE FALSE TRUE
Note that in the last case, since two vectors are compared, three results are given, one for each individual relation.
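A minimal sketch of the relational operators, using illustrative values rather than the ones from the original slide:

```r
3 < 5      # less than
3 >= 5     # greater than or equal to
3 == 3     # equality (two equals signs)
3 != 3     # inequality

# Comparing two vectors gives one result per pair of elements
u <- c(1, 5, 3)
v <- c(2, 4, 3)
u <= v
```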
You can perform logical operations in R, which will output logical values. Any logical operation takes the form
statement1 OPERATOR statement2
We start with the operator AND. If statement1 is T and statement2 is T, then AND outputs T, otherwise, it outputs F.
In R there are two AND operators, & and &&. We start with & which is an element-wise comparison.
It compares the first element of the first vector to the first element of the second vector, then the second of the first vector to the second of the second vector, etc.
It therefore returns a vector of logical values. See if you can understand how the output below is produced.
Obviously, we can combine this with the previous section to construct statements like
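As an illustration (the vectors p, q and z are hypothetical examples, not taken from the slides), the element-wise & can be applied to logical vectors directly or to the results of relational operations:

```r
p <- c(TRUE, TRUE, FALSE, FALSE)
q <- c(TRUE, FALSE, TRUE, FALSE)
p & q     # TRUE only where both elements are TRUE

# Combined with relational operators:
# which values lie strictly between 2 and 8?
z <- c(1, 4, 7, 10)
in_range <- (z > 2) & (z < 8)
in_range
```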
The && operator is the short-circuit version of AND. It compares single logical values rather than working element by element.
[1] FALSE
[1] TRUE
So what is the point of this? Well, ‘short-circuit’ means that as soon as one condition is FALSE, the whole expression is FALSE and R does not evaluate the remaining conditions. This means you can do things like
because as soon as it finds that the first statement was false, it exits the operation rather than trying to take the square root of a negative number.
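A minimal sketch of this behaviour, assuming a hypothetical variable w that holds a negative number:

```r
w <- -4

# The first condition is FALSE, so sqrt(w) is never evaluated
safe <- (w >= 0) && (sqrt(w) > 1)
safe

# With the element-wise &, both sides are evaluated, so sqrt(-4) is
# computed and R would warn that NaNs were produced
# (the warning is suppressed here to keep the output clean)
unsafe <- suppressWarnings((w >= 0) & (sqrt(w) > 1))
unsafe
```

Both expressions return FALSE, but only the && version avoids computing the square root of a negative number.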
The OR operator does exactly what it sounds like. If statement1 is T or statement2 is T, then it returns T, otherwise it returns F.
Again we have both | and ||.
[1] TRUE TRUE TRUE FALSE
[1] TRUE FALSE
[1] TRUE
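The two OR operators can be sketched as follows (p, q and n are illustrative and not necessarily the values used on the original slide):

```r
p <- c(TRUE, TRUE, FALSE, FALSE)
q <- c(TRUE, FALSE, TRUE, FALSE)
p | q     # element-wise OR: FALSE only where both elements are FALSE

# || works on single values and stops as soon as one side is TRUE,
# so the right-hand side is skipped here
n <- 0
ok <- (n == 0) || (1 / n > 5)
ok
```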
The NOT operator is much simpler and represented by !; it just inverts the result, i.e. T becomes F, and F becomes T:
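For instance (the vector flags is illustrative):

```r
!TRUE                            # a single value is inverted

flags <- c(TRUE, FALSE, TRUE)
!flags                           # each element is inverted

!(3 > 5)                         # negating a comparison
```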
There are many other functions in R that operate on logical values/vectors, such as:
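The slide's own list is not reproduced here. As an assumption about which functions are meant, the base R functions any(), all(), which() and xor() are common examples; the vector marks is hypothetical:

```r
marks <- c(45, 72, 58, 90)
passed <- marks >= 50      # logical vector: one entry per mark

any(passed)        # is at least one element TRUE?
all(passed)        # are all elements TRUE?
which(passed)      # positions of the TRUE elements
xor(TRUE, FALSE)   # exclusive OR: TRUE when exactly one side is TRUE
```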
Another useful trick is that internally R (like most programming languages) treats T as 1 and F as 0, e.g.,
You can also do things like:
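A small sketch of how this numeric treatment of logical values can be used for counting (the vector marks is hypothetical):

```r
TRUE + TRUE      # logical values are treated as 1 when added

marks <- c(45, 72, 58, 90)
n_pass <- sum(marks >= 50)        # number of marks of at least 50
prop_pass <- mean(marks >= 50)    # proportion of marks of at least 50

n_pass
prop_pass
```

Taking the mean of a logical vector is a convenient way to estimate a proportion or a probability from simulated data.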
A variable is an object in R. There are rules for choosing a variable name:
Note: use meaningful names for your variables to improve the readability of your code.
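The sketch below relies only on two standard facts about R names: they are case-sensitive, and they cannot start with a digit. The names used are hypothetical, and make.names() is a base R helper that converts a string into a syntactically valid name.

```r
claim_total <- 120
Claim_total <- 45      # a different variable: names are case-sensitive
claim_total == Claim_total

# A name starting with a digit is invalid; make.names() repairs it
fixed <- make.names("2nd_claim")
fixed
```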
One of the main strengths of R is its ability to organise data in a structured way. This will turn out to be very useful for many statistical procedures and data analysis.
| Data type | Type in R | Display |
|---|---|---|
| real number (integer or not) | numeric (double) | 3.27 |
| integer | numeric (integer) | 3 |
| complex number | complex | 3+2i |
| logical (true/false) | logical | TRUE or FALSE |
| missing | logical | NA |
| text (string) | character | “text” |
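
The types in the table can be checked with the base R function typeof(); a minimal sketch:

```r
typeof(3.27)      # "double"
typeof(3)         # "double": numbers are stored as doubles by default
typeof(3L)        # "integer": the L suffix requests an integer
typeof(3 + 2i)    # "complex"
typeof(TRUE)      # "logical"
typeof(NA)        # "logical"
typeof("text")    # "character"
```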
A missing or undefined value is indicated by the value NA (for Not Available).
Normally, if you try to do numerical operations with NA, the whole operation becomes NA. This is done to alert you that there are NAs in your data, but if you want to ignore them, some functions (e.g. mean and sum) come with an na.rm argument.
This argument removes all the NAs before applying the operation.
Dealing with NAs is very important once we get to importing and using external data.
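A minimal sketch of how NA propagates and how na.rm changes the result (the vector claims is hypothetical):

```r
claims <- c(100, NA, 250, 400)

sum(claims)       # NA: the missing value propagates
is.na(claims)     # locate the missing values

total <- sum(claims, na.rm = TRUE)    # NA removed before summing
avg <- mean(claims, na.rm = TRUE)     # mean of the three observed values
total
avg
```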
Any information between quotation marks (single ' ' or double " ") corresponds to a character string. Try the following commands:
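The original commands are not shown here; the following are illustrative examples of working with character strings using base R functions:

```r
s1 <- "actuarial"     # double quotes
s2 <- 'studies'       # single quotes work the same way

is.character(s1)      # check that s1 is a character string
nchar(s1)             # number of characters in the string

full <- paste(s1, s2) # join strings, separated by a space
full
toupper(full)         # convert to upper case
```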
Given that
- Check the types of var1 and var2
- Convert var1 to a double precision number using as.double

Notice that the result of as.double still needs to be re-assigned to the variable; otherwise the variable is left unchanged. Most functions in R behave this way and do not perform “in-place” changes.
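A short sketch of why re-assignment is needed, using a hypothetical integer variable count:

```r
count <- 5L                     # an integer value

as.double(count)                # result is displayed but not stored
type_before <- typeof(count)    # count itself is still an integer

count <- as.double(count)       # re-assign to keep the converted value
type_after <- typeof(count)

type_before
type_after
```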
We have seen in the ‘Probability’ theory part of this course that random variables are mathematical descriptions of the random phenomena encountered in everyday life.
Many probability distributions are implemented in base R. There are typically four functions you can use for each distribution. For the Normal distribution they are dnorm (density function), pnorm (CDF), qnorm (quantile function) and rnorm (random generator).
[1] 0.05844094
[1] 0.9750021
[1] 0
[1] 1.959964
[1] -0.01639022
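The four Normal distribution functions can be tried as follows. This is a sketch with illustrative arguments; the seed and the parameters passed to rnorm() are arbitrary choices, not taken from the slides.

```r
d <- dnorm(1.96)      # density of the standard Normal at 1.96
p <- pnorm(1.96)      # CDF: P(Z <= 1.96)
q <- qnorm(0.975)     # quantile: the z with P(Z <= z) = 0.975

set.seed(1101)                      # arbitrary seed, for reproducibility
r <- rnorm(5, mean = 10, sd = 2)    # five draws from Normal(10, 2^2)

d
p
q
r
pnorm(q)              # the quantile function inverts the CDF
```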
- The same four-function pattern applies to the other distributions implemented in R. For instance, can you guess what dexp() does? Or pbinom()?
- To simulate random variables, R is an apt tool through the use of functions runif(), rnorm(), rgamma(), etc.

We place below some problems for you to solve as extra practice. Try them out!
What is the output produced by the following R code? Try to predict it before typing the commands in R!

- 1:3^2
- (1:5)*2
- root.of.four <- sqrt(4)
- TRUE + T +FALSE*F + T*FALSE + F

(Inspired by Exercises 3.1-3.13 in the R Textbook.)
What can be improved about the following variable names? Suggest a better alternative.
delaytime
the_number_of_marks_higher_than_50
20_day_limit
child/adult
pi
variable1
Create a vector x consisting of \(10\) randomly generated Binomial(\(n=100, p=0.1\)) variables, and display the result.
Create \(3\) randomly generated standard Normal random variables.
By generating an independent sample of size \(10000\), approximate the probability that a standard normal random variable is greater than \(1.96\)
Hint: check slide 24 if you are having trouble with calculating the probability after simulating.
