Applying a label depending on which condition is met using R

I would like to use a simple R function where the contents of a specified data frame column are read row by row, then depending on the value, a string is applied to that row in a new column.
So far, I've tried to use a combination of loops and generating individual columns which were combined later. However, I cannot seem to get the syntax right.
The input looks like this:
head(data,10)
# A tibble: 10 x 5
Patient T1Score T2Score T3Score T4Score
<dbl> <dbl> <dbl> <dbl> <dbl>
1 3 96.4 75 80.4 82.1
2 5 100 85.7 53.6 55.4
3 6 82.1 85.7 NA NA
4 7 82.1 85.7 60.7 28.6
5 8 100 76.8 64.3 57.7
6 10 46.4 57.1 NA 75
7 11 71.4 NA NA NA
8 12 98.2 92.9 85.7 82.1
9 13 78.6 89.3 37.5 42.9
10 14 89.3 100 64.3 87.5
and the function I have written looks like this:
minMax<-function(x){
#make an empty data frame for the output to go
output<-data.frame()
#making sure the rest of the commands only look at what I want them to look at in the input object
a<-x[2:5]
#here I'm gathering the columns necessary to perform the calculation
minValue<-apply(a,1,min,na.rm=T)
maxValue<-apply(a,1,max,na.rm=T)
tempdf<-as.data.frame((cbind(minValue,maxValue)))
Difference<-tempdf$maxValue-tempdf$minValue
referenceValue<-ave(Difference)
referenceValue<-referenceValue[1]
#quick aside to make the first two thirds of the output file
output<-as.data.frame((cbind(x[1],Difference)))
#Now I need to define the class based on the referenceValue, and here is where I run into trouble.
apply(output, 1, FUN =
for (i in Difference) {
ifelse(i>referenceValue,"HIGH","LOW")
}
)
output
}
I also tried...
if (i>referenceValue) {
apply(output,1,print("HIGH"))
}else(print("LOW")) {}
}
)
output
}
Regardless, both end up giving me the error message,
c("'for (i in Difference) {' is not a function, character or symbol", "' ifelse(i > referenceValue, \"HIGH\", \"LOW\")' is not a function, character or symbol", "'}' is not a function, character or symbol")
The expected output should look like:
Patient Difference Toxicity
3 21.430000 LOW
5 46.430000 HIGH
6 3.570000 LOW
7 57.140000 HIGH
8 42.310000 HIGH
10 28.570000 HIGH
11 0.000000 LOW
12 16.070000 LOW
13 51.790000 HIGH
14 35.710000 HIGH
Is there a better way for me to organize the last loop?

Since you seem to be using tibbles anyway, here's a much shorter version using dplyr and tidyr:
> d %>%
gather(key = tscore,value = score,T1Score:T4Score) %>%
group_by(Patient) %>%
summarise(Difference = max(score,na.rm = TRUE) - min(score,na.rm = TRUE)) %>%
ungroup() %>%
mutate(AvgDifference = mean(Difference),
Toxicity = if_else(Difference > mean(Difference),"HIGH","LOW"))
# A tibble: 10 x 4
Patient Difference AvgDifference Toxicity
<int> <dbl> <dbl> <chr>
1 3 21.4 30.3 LOW
2 5 46.4 30.3 HIGH
3 6 3.6 30.3 LOW
4 7 57.1 30.3 HIGH
5 8 42.3 30.3 HIGH
6 10 28.6 30.3 LOW
7 11 0 30.3 LOW
8 12 16.1 30.3 LOW
9 13 51.8 30.3 HIGH
10 14 35.7 30.3 HIGH
Your expected output appears to be based on a slightly different average difference, so this output differs slightly: patient 10 (difference 28.6, below the average of 30.3) is labelled LOW here rather than HIGH.
And a much simpler base R version if you prefer:
d$min <- apply(d[,2:5],1,min,na.rm = TRUE)
d$max <- apply(d[,2:5],1,max,na.rm = TRUE)
d$diff <- d$max - d$min
d$avg_diff <- mean(d$diff)
d$toxicity <- with(d,ifelse(diff > avg_diff,"HIGH","LOW"))
A few notes on your existing code:
as.data.frame((cbind(minValue,maxValue))) is not an advisable way to create data frames. This is more awkward than simply doing data.frame(minValue = minValue,maxValue = maxValue) and risks unintended coercion from cbind.
ave is for computing summaries over groups; just use mean if you have a single vector
The FUN argument in apply expects a function, not an arbitrary expression, which is what you're trying to pass at the end. The general syntax for an "anonymous" function in that context would be apply(...,FUN = function(arg) { do some stuff and return exactly the thing you want}).
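To make the last note concrete, here is a minimal base R sketch of passing an anonymous function to apply. It assumes a small data frame (called scores here, a hypothetical name) holding only the first three patients from the question, so the average is taken over those three rows rather than all ten:

```r
# First three patients from the question (scores is a hypothetical name)
scores <- data.frame(
  Patient = c(3, 5, 6),
  T1Score = c(96.4, 100, 82.1),
  T2Score = c(75, 85.7, 85.7),
  T3Score = c(80.4, 53.6, NA),
  T4Score = c(82.1, 55.4, NA)
)

# FUN is a function of one row; it returns max - min, ignoring NAs
difference <- unname(apply(scores[, 2:5], 1, function(row) {
  max(row, na.rm = TRUE) - min(row, na.rm = TRUE)
}))

# ifelse() is vectorised, so no loop is needed for the label
result <- data.frame(
  Patient = scores$Patient,
  Difference = difference,
  Toxicity = ifelse(difference > mean(difference), "HIGH", "LOW")
)
result
```

The function body can be as long as needed; apply only requires that FUN be a function and uses whatever its last expression evaluates to.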

Related

How can I apply calculations multiple times on similar variables in the Tidyverse?

I am trying to run calculations on multiple variables with similar names (mx1_var1...mx2_var1 etc).
A simplified version of the data is below.
structure(list(mx1_amenable = c(70.0382790687902, 20.8895416774022,
98.1328630153307, 8.63038330575823, 21.098387740395, 31.959849814698,
9.22952906324882, 74.4660849895597, 29.6851613973842, 60.941434908354
), mx1_Other = c(50.0261607893197, 46.0117649431311, 51.8219837573084,
73.7814971552898, 93.8008571298187, 92.6841115228084, 95.660659297798,
10.8184536035572, 43.6606611340557, 81.4415005182801), mx1_preventable = c(38.6864667127179,
22.5707957186912, 13.324746863086, 74.9369833030818, 13.0413382062397,
98.3757571024402, 86.6179643621766, 19.7927752780922, 2.28293032845359,
67.0137368426169), mx2_amenable = c(63.6636904898683, 40.361275660631,
3.2234218985236, 80.4870440564426, 49.483719663574, 71.0484920255819,
97.3726798797323, 30.0044347466731, 25.8476044496246, 39.4468283905231
), mx2_Other = c(4.0822540063483, 52.9579932985574, 38.3393867228102,
80.8093349013419, 89.5704617034906, 7.15269982141938, 44.9889904260212,
94.1639871656393, 17.4307996383923, 91.9360333328057), mx2_preventable = c(97.9327560952081,
42.7026845980086, 74.6785922702186, 27.4754587243202, 14.5174992869947,
29.298035056885, 3.2058044369044, 44.6985715883816, 33.7262168187378,
50.9358501169921)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-10L))
I want to run calculations e.g.
mutate(diff_amenable = mx1_amenable)
Across all variables in the dataset as well as further calculations based on the output of these new figures. I think using some sort of string match and function should be able to do it, but all I could come across was the question "Function to perform similar calculations on variables with similar names".
At the moment I am working with the data in wide format and manually inputting the column names to run the calculations, which is not feasible as I work with more variables (up to 70 paired values).
Any ideas how this could be done?
This might be a slight step forward - writing functions that give the calculation for a pair of selected columns by name detection in the across function. This works for the six example columns in your dataset:
library(tidyverse)
difference <- function(...) {
x <- list(...)
x[[1]][[1]] - x[[1]][[2]]
}
proportion <- function(...) {
x <- list(...)
x[[1]][[1]] / x[[1]][[2]]
}
df %>%
rowwise() %>%
transmute(
mx1_allcause = sum(across(starts_with("mx1"))),
mx2_allcause = sum(across(starts_with("mx2"))),
diff_amenable = difference(across(ends_with("_amenable"))),
diff_allcause = difference(across(ends_with("_allcause"))),
prop_amenable = proportion(across(starts_with("diff")))
)
#> # A tibble: 10 x 5
#> # Rowwise:
#> mx1_allcause mx2_allcause diff_amenable diff_allcause prop_amenable
#> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 159. 166. 6.37 -6.93 -0.920
#> 2 89.5 136. -19.5 -46.5 0.418
#> 3 163. 116. 94.9 47.0 2.02
#> 4 157. 189. -71.9 -31.4 2.29
#> 5 128. 154. -28.4 -25.6 1.11
#> 6 223. 107. -39.1 116. -0.338
#> 7 192. 146. -88.1 45.9 -1.92
#> 8 105. 169. 44.5 -63.8 -0.697
#> 9 75.6 77.0 3.84 -1.38 -2.79
#> 10 209. 182. 21.5 27.1 0.794
Created on 2021-04-09 by the reprex package (v2.0.0)
Expanding this to your 70+ variables though might be different. My solution here relies on each calculation combining two columns being able to select the two (in order) based on a text match. If there's a need for a more complicated matching of one name to another, you might need a smarter approach or to give in and manually define pairings.
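If every pairing follows the mx1_<name> / mx2_<name> pattern, one possible way to avoid typing each pair is to derive the suffixes from the column names and loop over them. This is a base R sketch on made-up toy values, not the across-based approach above; the toy data frame and the diff_ prefix are assumptions for illustration:

```r
# Toy data with the same naming scheme as the question (values are made up)
df <- data.frame(
  mx1_amenable = c(70, 20), mx2_amenable = c(63, 40),
  mx1_Other    = c(50, 46), mx2_Other    = c(4, 52)
)

# Collect the suffixes once, from the mx1_ columns only
mx1_cols <- grep("^mx1_", names(df), value = TRUE)
suffixes <- sub("^mx1_", "", mx1_cols)

# One new diff_<suffix> column per mx1/mx2 pair
for (s in suffixes) {
  df[[paste0("diff_", s)]] <- df[[paste0("mx1_", s)]] - df[[paste0("mx2_", s)]]
}
df
```

The loop fails loudly if an mx2_ partner is missing, which is arguably what you want when the pairing is assumed to be complete.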

Controlling decimal places displayed in a tibble. Understanding what pillar.sigfig does

I have a csv file weight.csv with the following contents.
weight,weight_selfreport
81.5,81.66969147005445
72.6,72.59528130671505
92.9,93.01270417422867
79.4,79.4010889292196
94.6,96.64246823956442
80.2,79.4010889292196
116.2,113.43012704174228
95.4,95.73502722323049
99.5,99.8185117967332
If I do
library(readr)
Df <- read_csv('weight.csv')
Df
I get
# A tibble: 9 x 2
weight weight_selfreport
<dbl> <dbl>
1 81.5 81.7
2 72.6 72.6
3 92.9 93.0
4 79.4 79.4
5 94.6 96.6
6 80.2 79.4
7 116. 113.
8 95.4 95.7
9 99.5 99.8
If I convert that tibble to a normal data frame, I'll see more digits.
as.data.frame(Df)
weight weight_selfreport
1 81.5 81.66969
2 72.6 72.59528
3 92.9 93.01270
4 79.4 79.40109
5 94.6 96.64247
6 80.2 79.40109
7 116.2 113.43013
8 95.4 95.73503
9 99.5 99.81851
Initially I thought that to get this type of display for the tibble, I would do options(pillar.sigfig = 5).
However, that's not what it does.
options(pillar.sigfig = 5)
Df
# A tibble: 9 x 2
weight weight_selfreport
<dbl> <dbl>
1 81.5 81.670
2 72.600 72.595
3 92.9 93.013
4 79.4 79.401
5 94.6 96.642
6 80.2 79.401
7 116.2 113.43
8 95.4 95.735
9 99.5 99.819
And so I see that pillar.sigfig is about controlling significant digits, not decimal places.
Fine, but:
Why is (row 2, col 1) 72.6 being displayed as 72.600?
What can I do, if anything, to get five decimal places?
This might come a little late...3 years late, but it might help others looking for answers.
The issue lies with tibble. It has a very opinionated way of representing data frames. I presume you do not often need to look at your data in this way, but if you do, there are two options I frequently use, though both are arguably just workarounds.
Option 1: Use num()
This neat function enforces decimals. So you can mutate() all columns you want to format with the following:
library(tidyverse)
data <- tribble(
~ weight, ~ weight_selfreport,
81.5,81.66969147005445,
72.6,72.59528130671505,
92.9,93.01270417422867,
79.4,79.4010889292196,
94.6,96.64246823956442,
80.2,79.4010889292196,
116.2,113.43012704174228,
95.4,95.73502722323049,
99.5,99.8185117967332
)
data <-
data %>%
mutate(across(where(is.numeric), ~ num(., digits = 3)))
data
#> # A tibble: 9 × 2
#> weight weight_selfreport
#> <num:.3!> <num:.3!>
#> 1 81.500 81.670
#> 2 72.600 72.595
#> 3 92.900 93.013
#> 4 79.400 79.401
#> 5 94.600 96.642
#> 6 80.200 79.401
#> 7 116.200 113.430
#> 8 95.400 95.735
#> 9 99.500 99.819
Option 2: Use table packages
Usually, when I inspect a tibble it is because it contains results I want to report. Thus, I use one of the many table-generator packages, e.g.
flextable,
gt,
formattable,
reactable,
etc.
Here is an example you can try using flextable:
library(tidyverse)
data <- tribble(
~ weight, ~ weight_selfreport,
81.5,81.66969147005445,
72.6,72.59528130671505,
92.9,93.01270417422867,
79.4,79.4010889292196,
94.6,96.64246823956442,
80.2,79.4010889292196,
116.2,113.43012704174228,
95.4,95.73502722323049,
99.5,99.8185117967332
)
flextable::flextable(data)
I assume Option 1 might have been what you were looking for.
I have the same issue. Using pillar.sigfig helps. You can also use it with round() and you have more control. But if the last figure is 0 it will not display it.
The "trick" I used was to save the results in a variable and then use print.data.frame(). Then it works fine. But maybe there is an easier solution...
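For readers unsure about the distinction the question runs into, this small base R sketch contrasts significant digits with decimal places on three of the values from weight.csv. It does not touch tibble printing at all; it only shows how signif, round and formatC differ (the variable names x and fixed are mine):

```r
x <- c(81.66969147005445, 72.6, 116.2)

# signif() keeps a number of significant digits, counted from the first non-zero digit
signif(x, 5)

# round() keeps a number of decimal places instead
round(x, 5)

# formatC() returns character strings with exactly five decimals, trailing zeros included
fixed <- formatC(x, format = "f", digits = 5)
fixed
```

Because formatC returns character strings, it suits display or reporting rather than further arithmetic.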

Resampling cross-sectional time series data in R

I'm dealing with cross-sectional time series data (many DIFFERENT individuals over time). At the individual level, each person has a quantity of a good demanded. This data is unbalanced with respect to how many individuals are in each period. For each time period, I've aggregated the individual data into a single time series. Example data structure below
Cross-Section Time Series
Time | Person | Quantity
----------------------
11/18| Bob | 2
11/18| Sally | 1
11/18| Jake | 5
12/18| Jim | 2
12/18| Roger | 8
Time Series
Time | Total Q
-------------
11/18| 8
12/18| 10
What I want to do for each period is resample (with replacement) the individual quantities, aggregate across the individuals, iterate X times, and then get a mean and standard error from the bootstrap.
The end result should look like
Time | Total Q | Boot Strap Total Mean
-------------------------------------
11/18| 8 | 8.5
12/18| 10 | 10.05
Here is some code to create example sample data:
library(tidyverse)
set.seed(1234)
Cross_Time = data.frame(Period = sample(1:10, 50, replace=TRUE),
Q=rnorm(50,10,1)) %>%
arrange(Period)
Timeseries = Cross_Time %>%
group_by(Period) %>%
summarize(Total=sum(Q))
I know this is possible in R, but I'm at a loss as to how to code it or what the right questions I need to ask are. All help is appreciated!
We may do the following:
X <- 1000
Cross_Time %>% group_by(Period) %>%
do({QS <- colSums(replicate(sample(.$Q, replace = TRUE), n = X))
data.frame(Period = .$Period[1], `Total Q` = sum(.$Q), Mean = mean(QS), `Standard Error` = sd(QS))})
# A tibble: 10 x 4
# Groups: Period [10]
# Period Total.Q Mean Standard.Error
# <int> <dbl> <dbl> <dbl>
# 1 1 28.8 28.8 0.284
# 2 2 35.9 35.8 0.874
# 3 3 109. 109. 3.90
# 4 4 48.9 48.9 2.16
# 5 5 20.2 20.2 0.658
# 6 6 59.0 58.8 3.57
# 7 7 88.7 88.6 2.64
# 8 8 22.7 22.7 1.04
# 9 9 47.7 47.7 2.46
# 10 10 27.9 27.9 0.575
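I think the code is quite self-explanatory. In every group we resample its values with replacement X times with replicate and compute the two desired statistics: the mean of the bootstrap totals and their standard deviation, which serves as the bootstrap standard error. It's also straightforward to add any others!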
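The same per-period bootstrap can be sketched in base R without do(). The names cross_time, boot_one and n_boot below are assumptions for illustration, and the data is simulated in the same shape as the question's. Resampling by index avoids a quirk of sample(): given a single number, it samples from 1 up to that number rather than returning the number itself, which would matter for a period with only one observation.

```r
set.seed(1234)

# Simulated data in the same shape as Cross_Time (hypothetical name cross_time)
cross_time <- data.frame(
  Period = sample(1:10, 50, replace = TRUE),
  Q = rnorm(50, 10, 1)
)

n_boot <- 1000  # number of bootstrap iterations (X in the question)

# Bootstrap the total for one period's quantities
boot_one <- function(q, n_boot) {
  totals <- replicate(n_boot, sum(q[sample.int(length(q), replace = TRUE)]))
  c(Total = sum(q), BootMean = mean(totals), BootSE = sd(totals))
}

# One row per period: observed total, bootstrap mean, bootstrap standard error
res <- t(sapply(split(cross_time$Q, cross_time$Period), boot_one, n_boot = n_boot))
res
```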

Apply function to specific object in data.frame and add the results to new columns in the

I have a table with some vessel GPS data.
Just like
ID POSTIME LON LAT SPEED AZIMUTH
1 2015-12-31 23:56:15 123.4003 32.39449 5.2 145
2 2015-12-31 23:56:53 123.3982 32.39487 5.2 138
3 2015-12-31 23:59:53 123.3884 32.39625 5.3 138
4 2016-01-01 00:01:19 123.3836 32.39702 5.2 146
5 2016-01-01 00:02:58 123.3788 32.39915 5.1 154
6 2016-01-01 00:06:41 123.3708 32.40391 5.1 157
And I want to calculate the distance, time difference and angle difference of the ship at each sample point.
I have written a function point.distance for calculating distance by lon and lat of different points, just like
point.distance <- function(lon1,lat1,lon2,lat2)
lon1/2 and lat1/2 stands for different points
also with a point.angle function to calculate angle difference
point.angle <- function(lon1,lat1,lon2,lat2,lon3,lat3)
I know how to use functions on 2 individual points, but how to apply the functions to all the rows and add the results to new columns in order to further analyze?
I hope my results might be like
ID POSTIME LON LAT SPEED AZIMUTH DISTANCE TD AD
1 2015-12-31 23:56:15 123.4003 32.39449 5.2 145 NA 00:00:38 -7
2 2015-12-31 23:56:53 123.3982 32.39487 5.2 138 201.873 00:03:00 0
3 2015-12-31 23:59:53 123.3884 32.39625 5.3 138 ... ... ...
4 2016-01-01 00:01:19 123.3836 32.39702 5.2 146 ... ... ...
Is there any package or function that will act like this?
Or should I just save the results in different vectors and then write to the xlsx file at last?
If you're just getting started in R, I'd recommend you check out the dplyr and tidyr packages for data manipulation. I'm going to use dplyr to help answer your question. I'm going to use a simpler example that gets at what I think is the heart of your question:
how do I calculate a value based on two successive rows of data in my data.frame?
I've used two functions from the dplyr package below:
mutate - which takes a data.frame and transforms it by adding columns. Note I am able to reference new columns I've created in the same mutate command.
lag - this function takes a vector as an argument and returns a shifted copy of the vector. So for example
lag(c(1, 2, 3))
# = NA, 1, 2
So here's my simple example. I'm going to make some coordinates in the xy-plane and compute the Euclidean distance between successive points. I'm going to add columns to my table to bring the coordinates from row i to row i + 1 and then I'll compute the distance using the two sets of coordinates.
# install.packages("dplyr")
library(dplyr)
d <- data.frame(x = c(-1, 2, 0, 0, -2), y = c(-3, -2, -1, 1, 3))
d
# x y
#1 -1 -3
#2 2 -2
#3 0 -1
#4 0 1
#5 -2 3
mydist <- function(x1, y1, x2, y2){
sqrt((x2 - x1)^2 + (y2 - y1)^2)
}
mutate(d, x0 = lag(x), y0 = lag(y), distance = mydist(x0, y0, x, y))
# x y x0 y0 distance
#1 -1 -3 NA NA NA
#2 2 -2 -1 -3 3.162278
#3 0 -1 2 -2 2.236068
#4 0 1 0 -1 2.000000
#5 -2 3 0 1 2.828427
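The same lag idea carries over to the vessel data without any packages. In this sketch, lag1 is a hand-rolled stand-in for dplyr's lag, and haversine_m is an assumed substitute for the asker's point.distance (whose body is not shown), using a spherical Earth of radius 6371 km. Only the first three rows of the question's table are used:

```r
# Great-circle distance in metres on a sphere (assumed stand-in for point.distance)
haversine_m <- function(lon1, lat1, lon2, lat2, radius = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 + cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Shift a plain vector down by one position, padding with NA
lag1 <- function(x) c(NA, head(x, -1))

gps <- data.frame(
  POSTIME = as.POSIXct(c("2015-12-31 23:56:15", "2015-12-31 23:56:53",
                         "2015-12-31 23:59:53"), tz = "UTC"),
  LON = c(123.4003, 123.3982, 123.3884),
  LAT = c(32.39449, 32.39487, 32.39625),
  AZIMUTH = c(145, 138, 138)
)

# Each row is compared with the row before it; the first row has no predecessor
gps$DISTANCE <- haversine_m(lag1(gps$LON), lag1(gps$LAT), gps$LON, gps$LAT)
gps$TD_sec   <- c(NA, diff(as.numeric(gps$POSTIME)))  # seconds since previous fix
gps$AD       <- gps$AZIMUTH - lag1(gps$AZIMUTH)       # change in azimuth
gps
```

Because every function here is vectorised, no explicit loop over rows is needed; the new columns stay in the data frame for further analysis.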
Here is a tidyverse and geosphere driven version. If you are a pandas fan or familiar with SQL or just new to R, you will probably find the tidyverse a very comfortable language in which to work.
For the distance calculation, I have used the most precise function available in geosphere. If you are finding your calculations are taking too long, please feel free to back down the complexity to Haversine or lower: options are detailed well here: see Section 2 - Great Circle Distance (p.2)
I have also left the code in a very verbose state. That way you may review all the steps in the process. I just wanted to make sure this answer might be most accessible to you and others who might also have just begun to get excited about the thrilling sport of data wrangling.
The libraries used:
library(tidyverse)
library(lubridate)
library(geosphere)
A replicable dataset transformation of the OP view of the data sample above:
df_dat <-
read.table(text = " ID POSDATE POSTIME LON LAT SPEED AZIMUTH
1 2015-12-31 23:56:15 123.4003 32.39449 5.2 145
2 2015-12-31 23:56:53 123.3982 32.39487 5.2 138
3 2015-12-31 23:59:53 123.3884 32.39625 5.3 138
4 2016-01-01 00:01:19 123.3836 32.39702 5.2 146
5 2016-01-01 00:02:58 123.3788 32.39915 5.1 154
6 2016-01-01 00:06:41 123.3708 32.40391 5.1 157
", header = TRUE, stringsAsFactors = FALSE
)
df_dat
As seen below:
> df_dat
ID POSDATE POSTIME LON LAT SPEED AZIMUTH
1 1 2015-12-31 23:56:15 123.4003 32.39449 5.2 145
2 2 2015-12-31 23:56:53 123.3982 32.39487 5.2 138
3 3 2015-12-31 23:59:53 123.3884 32.39625 5.3 138
4 4 2016-01-01 00:01:19 123.3836 32.39702 5.2 146
5 5 2016-01-01 00:02:58 123.3788 32.39915 5.1 154
6 6 2016-01-01 00:06:41 123.3708 32.40391 5.1 157
Below is the code for wrangling your dataframe down into your desired shape. I have also included into the preparation dataframe a column called TD_per that you might find to be a helpful format.
output <-
df_dat %>%
arrange(ID) %>%
mutate(DTM = ymd_hms(paste(POSDATE, POSTIME)),
LON_prev = lag(LON),
LAT_prev = lag(LAT),
AZM_prev = lag(AZIMUTH),
DTM_prev = lag(DTM),
TD_sec = difftime(DTM, DTM_prev, units = "secs"),
TD_per = as.period(TD_sec), # an alternative way to list the times
AD = AZIMUTH - AZM_prev) %>%
rowwise %>% # to keep geosphere on the straight and narrow
mutate(DISTANCE = distVincentyEllipsoid(c(LON_prev, LAT_prev), c(LON, LAT)),
TD = format(ymd(POSDATE, tz = "UTC") + TD_sec, "%H:%M:%S")
) %>%
select(ID, # getting dataframe all presentable
POSTIME = DTM,
LON,
LAT,
SPEED,
AZIMUTH,
DISTANCE,
TD,
AD)
output
output
Source: local data frame [6 x 9]
Groups: <by row>
# A tibble: 6 x 9
ID POSTIME LON LAT SPEED AZIMUTH DISTANCE TD AD
<int> <dttm> <dbl> <dbl> <dbl> <int> <dbl> <chr> <int>
1 1 2015-12-31 23:56:15 123.4003 32.39449 5.2 145 NA <NA> NA
2 2 2015-12-31 23:56:53 123.3982 32.39487 5.2 138 202.0246 00:00:38 -7
3 3 2015-12-31 23:59:53 123.3884 32.39625 5.3 138 934.6486 00:03:00 0
4 4 2016-01-01 00:01:19 123.3836 32.39702 5.2 146 459.6053 00:01:26 8
5 5 2016-01-01 00:02:58 123.3788 32.39915 5.1 154 509.6387 00:01:39 8
6 6 2016-01-01 00:06:41 123.3708 32.40391 5.1 157 919.2855 00:03:43 3
Finally, you can write your output dataframe directly to a .csv.
write_excel_csv(output, "output.csv")

Functions without arguments

I'm not sure about this. Here is an example of a function which does not work:
myfunction<-function(){
mydata=read_excel("Square_data.xlsx", sheet = "Data", skip=0)
mydata$Dates=as.Date(mydata$Dates, format= "%Y-%m-%d")
mydata.ts=ts(mydata, start=2006, frequency=1)
}
The files do not load. When I execute each command line by line in R the files are loaded, so there's no problem with the commands. My question is, can I run a function such as myfunction to load the files? Thanks.
Last statement in function is an assignment. If the last executed statement in a function is an assignment, its value is not displayed on the console unless you use print. If the function result is assigned to a variable, you can print the assigned value later. For example, using the built-in BOD data frame:
> f <- function() bod <- BOD
> f() # no result printed on console because f() was not explicitly printed
> print(f()) # explicitly print
Time demand
1 1 8.3
2 2 10.3
3 3 19.0
4 4 16.0
5 5 15.6
6 7 19.8
> X <- f() # assign and then print the assigned value
> X
Time demand
1 1 8.3
2 2 10.3
3 3 19.0
4 4 16.0
5 5 15.6
6 7 19.8
Last statement in function is an expression producing a result. If the last statement produces a value rather than being an assignment, the result is printed on the console. For example:
> g <- function() BOD
> g()
Time demand
1 1 8.3
2 2 10.3
3 3 19.0
4 4 16.0
5 5 15.6
6 7 19.8
Thus make sure that the last statement in your function is not an assignment if you want it to display on the console automatically.
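A quick way to check this behaviour yourself is base R's withVisible, which reports whether a value would be auto-printed. The sketch below uses two throwaway functions (f_assign and f_value, names chosen here for illustration) that return the same value but differ in visibility:

```r
# Last statement is an assignment: the value is returned invisibly
f_assign <- function() {
  x <- 1:3
}

# Last statement is a bare expression: the value is returned visibly
f_value <- function() {
  x <- 1:3
  x
}

withVisible(f_assign())$visible  # FALSE
withVisible(f_value())$visible   # TRUE

# The returned values themselves are identical
identical(f_assign(), f_value())  # TRUE
```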
Note 1: sourcing code. Also note that if your code is sourced using a source() statement, or if the code is called by another function, then it also won't print automatically on the console unless you use print.
Note 2: two results. Regarding some comments to the question, if you want to output two results then return them in a named list. For example, this returns a list with components named BOD and BOD2:
h <- function() list(BOD = BOD, BOD2 = 2*BOD)
h()
$BOD
Time demand
1 1 8.3
2 2 10.3
3 3 19.0
4 4 16.0
5 5 15.6
6 7 19.8
$BOD2
Time demand
1 2 16.6
2 4 20.6
3 6 38.0
4 8 32.0
5 10 31.2
6 14 39.6
We could refer to them like this:
> H <- h()
> H$BOD
Time demand
1 1 8.3
2 2 10.3
3 3 19.0
4 4 16.0
5 5 15.6
6 7 19.8
> H$BOD2
Time demand
1 2 16.6
2 4 20.6
3 6 38.0
4 8 32.0
5 10 31.2
6 14 39.6
Note 3: <<- operator. Regarding the comments to the question, in general, using the <<- operator should be avoided because it undesirably links the internals of your function to the global workspace in an invisible and therefore error-prone way. If you want to return a value, it is normally best to return it as the output of the function. There are some situations where <<- is warranted, but they are relatively uncommon.
Sure. Just give it a value to be returned:
myfunction<-function(){
mydata=read_excel("Square_data.xlsx", sheet = "Data", skip=0)
mydata$Dates=as.Date(mydata$Dates, format= "%Y-%m-%d")
ts(mydata, start=2006, frequency=1) # The last object is returned by an R function
}
so calling dat <- myfunction() will make dat the ts-object that was created inside the function.
P.S.: There is also a return function in R. As a best practice, only use it if you want to return an object early, e.g. in combination with if.
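As a small illustration of that early-return pattern, here is a sketch with a made-up function, safe_sqrt, which is not part of the question's code. It uses return only for the early exit and lets the last expression supply the normal result:

```r
safe_sqrt <- function(x) {
  # Early exit: bail out before doing any work on unsuitable input
  if (!is.numeric(x)) {
    return(NA_real_)
  }
  # Normal path: the last expression is the return value, no return() needed
  sqrt(x)
}

safe_sqrt(9)
safe_sqrt("nine")
```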
