R split call to funciton over several lines - r

I wonder if there is a way to split a call to a function in R over several lines, other then using commas or '+' which is not always applicable.
I am basically looking like Python's '\' escape.
For example, I want to display this line:
PromoterIslands$illumina_probes[bins_with_probes]-tapply(CGIP_to_Probe$subjectHits,CGIP_to_Probe$subjectHits,function(x) length(x))
as:
PromoterIslands$illumina_probes[bins_with_probes]
<-tapply(CGIP_to_Probe$subjectHits,CGIP_to_Probe$subjectHits,
function(x) length(x))]
Is there any way to do this?
Thanks in advance

You can just split commands on various lines, no special sign needed.
The command you wrote is almost OK, you just need to put the <- operator on the first line.
So for instance, this is valid R code, and will assign 13 to a
a <-
5 +
8
But this is not
a
<- 5 +
8
Note, however, that this is valid code
a <-
5
+ 8
But would assign 5 to a and print 8.

Assuming you're talking about source (as opposed to desiring to get something to print a particular way), you can break R code in any place that doesn't produce a syntactically complete statement. There is no special character that tells R that the statement isn't complete. In your case, one option is:
PromoterIslands$illumina_probes[bins_with_probes] <-
tapply(
CGIP_to_Probe$subjectHits,
CGIP_to_Probe$subjectHits,
function(x) length(x)
)
You have to leave the <- at the end of the first line, otherwise PromoterIslands$illumina_probes[bins_with_probes] is a syntactically complete statement that would get evaluated. Similarly, on the next line, you have to leave the ( on the same line as tapply otherwise tapply is a syntactically complete statement (returns the contents of the tapply function).
While this doesn't quite answer your question, hopefully you will find there are enough places you can break a line in R that the lack of the special command you are looking for isn't a problem.

Related

Why does R require a return inside of {} when using inside of with statement?

I could not find a reason why the code below would not work.
with(df, {a<-plot(x,y) b<-lines(x1,x2)})
Then I found some examples that used the following syntax.
with(df, {a<-plot(x,y)
b<-lines(x1,x2)})
When I used the second syntax I received no errors. What am I missing?
If you want two commands on the same line, separate them with a semicolon
with(df, {a<-plot(x,y); b<-lines(x1,x2)})
This isn't unique towith or {}. You can't just do
a <-5 b<-3 a+b # syntax error if on the same line.
in R and have it run. You either need semicolons or new line characters between separate statements that you want R to run.
From An Introduction to R section 1.8
Commands are separated either by a semi-colon (‘;’), or by a newline.
Elementary commands can be grouped together into one compound
expression by braces (‘{’ and ‘}’).
Below codes are equivalent
with(df1, {a<-plot(x,y) ; b<-lines(x1,x2)})
with(df1, {a<-plot(x,y)
b<-lines(x1,x2)})

R - Why does frameex[ind, ] needs a ", " to display rows

I am new to R and I have troubles understanding how displaying an index works.
# Find indices of NAs in Max.Gust.SpeedMPH
ind <- which(is.na(weather6$Max.Gust.SpeedMPH))
# Look at the full rows for records missing Max.Gust.SpeedMPH
weather6[ind, ]
My code here works, no problem but I don't understand why weather6[ind] won't display the same thing as weather6[ind, ] . I got very lucky and mistyped the first time.
I apologize in advance that the question might have been posted somewhere else, I searched and couldn't find a proper answer.
So [ is a function just like any other function in R, but we call it strangely. Another way to write it in this case would be:
'[.data.frame'(weather6,ind,)
or the other way:
'[.data.frame'(weather6,ind)
The first three arguments to the function are named x, i and j. If you look at the code, early on it branches with the line:
if (Narg < 3L)
Putting the extra comma tells R that you've called the function with 3 arguments, but that the j argument is "missing". Otherwise, without the comma, you have only 2 arguments, and the function code moves on the the next [ method for lists, in which it will extract the first column instead.

R programming, naming output file using variable

I would like to direct output to a file, using a write.csv statement. I am wanting to write 16 different output files, labeling each one with the extension 1 through 16.
Example as written now:
trackfilenums=1:16
for (i in trackfilenums){
calculations etc
write.csv(max.hsi, 'Severity_Index.csv', row.names=F)
}
I would like for the output csv files to be labeled 'Severity_Index_1.csv', 'Severity_Index_2.csv', etc. Not sure how to do this in R language.
Thanks!
Kimberly
You will want to use the paste command:
write.csv(max.hsi, paste0("Severity_Index_", i,".csv"), row.names=F)
Some people like to have file names like Name_01 Name_02 etc instead of Name_1 Name_2 etc. This may, for example, make the alphabetical order more reasonable: with some software, otherwise, 10 would come after 1, 20 after 2, etc.
This kind of numbering can be achieved with sprintf:
sprintf("Severity_Index_%02d.csv", 7)
The interesting part is %02d -- this says that i is an integer value (could actually use %02i as well) that will take at least 2 positions, and leading zero will be used if necessary.
# try also
sprintf("Severity_Index_%03d.csv", 7)
sprintf("Severity_Index_%2d.csv", 7)
To add to the other answers here, I find it's also a good idea to sanitise the pasted string to make sure it is ok for the file system. For that purpose I have the following function:
fsSafe <- function(string) {
safeString <- gsub("[^[:alnum:]]", "_", string)
safeString <- gsub("_+", "_", safeString)
safeString
}
This simply strips out all non-alphabetic and non-numeric characters and replacing them with an underscore.

Using lists in R

Sorry for possibly a complete noob question but I have just started programming with R today and I am stuck already.
I am reading some data from a file which is in the format.
3.482373 8.0093238198371388 47.393873
0.32 20.3131 31.313
What I want to do is split each line then deal with each of the individual numbers.
I have imported the stringr package and using
x = str_split(line, " ")
This produces a list which I would like to index but don't know how.
I have learnt that x[[1:2]] gets the second element but that is about it. Ideally I would like something like
x1 = x[1]
x2 = x[2]
x3 = x[3]
But can't find anyway of doing this.
Thanks in advance
By using unlist you will get a vector instead of a list of vectors, and you will then be able to index it directly :
R> unlist(str_split("foo bar baz", " "))
[1] "foo" "bar" "baz"
But maybe you should read your file directly from read.table or one of its variant ?
And if you are beginning with R, you really should read one of the introduction available if you want to understand subsetting, indexing, etc.
you can wrap your call to str_split with unlist to get the behavior you're looking for.
The usual way to get this in would be to import it into a dataframe (a special sort of list). If file name is "fil.dat"" and is in "C:/dir/"
dfrm <- read.table("C:/dir/fil.dat") # resist the temptation to use backslashes
dfrm[2,2] # would give you the second item on the second row.
By default the field separator in R is "white-space" and that seems to be what you have, so you do not need to supply a sep= argument and the read.table function will attempt to import as numeric. To be on the safe side, you might consider forcing that option with colClasses=rep("numeric", 3) because if it encounters a strange item (such as often produced by Excel dumps), you will get a factor variable and will probably not understand how to recover gracefully.

Split code over multiple lines in an R script

I want to split a line in an R script over multiple lines (because it is too long). How do I do that?
Specifically, I have a line such as
setwd('~/a/very/long/path/here/that/goes/beyond/80/characters/and/then/some/more')
Is it possible to split the long path over multiple lines? I tried
setwd('~/a/very/long/path/here/that/goes/beyond/80/characters/and/
then/some/more')
with return key at the end of the first line; but that does not work.
Thanks.
Bah, comments are too small. Anyway, #Dirk is very right.
R doesn't need to be told the code starts at the next line. It is smarter than Python ;-) and will just continue to read the next line whenever it considers the statement as "not finished". Actually, in your case it also went to the next line, but R takes the return as a character when it is placed between "".
Mind you, you'll have to make sure your code isn't finished. Compare
a <- 1 + 2
+ 3
with
a <- 1 + 2 +
3
So, when spreading code over multiple lines, you have to make sure that R knows something is coming, either by :
leaving a bracket open, or
ending the line with an operator
When we're talking strings, this still works but you need to be a bit careful. You can open the quotation marks and R will read on until you close it. But every character, including the newline, will be seen as part of the string :
x <- "This is a very
long string over two lines."
x
## [1] "This is a very\nlong string over two lines."
cat(x)
## This is a very
## long string over two lines.
That's the reason why in this case, your code didn't work: a path can't contain a newline character (\n). So that's also why you better use the solution with paste() or paste0() Dirk proposed.
You are not breaking code over multiple lines, but rather a single identifier. There is a difference.
For your issue, try
R> setwd(paste("~/a/very/long/path/here",
"/and/then/some/more",
"/and/then/some/more",
"/and/then/some/more", sep=""))
which also illustrates that it is perfectly fine to break code across multiple lines.
Dirk's method above will absolutely work, but if you're looking for a way to bring in a long string where whitespace/structure is important to preserve (example: a SQL query using RODBC) there is a two step solution.
1) Bring the text string in across multiple lines
long_string <- "this
is
a
long
string
with
whitespace"
2) R will introduce a bunch of \n characters. Strip those out with strwrap(), which destroys whitespace, per the documentation:
strwrap(long_string, width=10000, simplify=TRUE)
By telling strwrap to wrap your text to a very, very long row, you get a single character vector with no whitespace/newline characters.
For that particular case there is file.path :
File <- file.path("~",
"a",
"very",
"long",
"path",
"here",
"that",
"goes",
"beyond",
"80",
"characters",
"and",
"then",
"some",
"more")
setwd(File)
The glue::glue function can help. You can write a string on multiple lines in a script but remove the line breaks from the string object by ending each line with \\:
glue("some\\
thing")
something
I know this post is old, but I had a Situation like this and just want to share my solution. All the answers above work fine. But if you have a Code such as those in data.table chaining Syntax it becomes abit challenging. e.g. I had a Problem like this.
mass <- files[, Veg:=tstrsplit(files$file, "/")[1:4][[1]]][, Rain:=tstrsplit(files$file, "/")[1:4][[2]]][, Roughness:=tstrsplit(files$file, "/")[1:4][[3]]][, Geom:=tstrsplit(files$file, "/")[1:4][[4]]][time_[s]<=12000]
I tried most of the suggestions above and they didn´t work. but I figured out that they can be split after the comma within []. Splitting at ][ doesn´t work.
mass <- files[, Veg:=tstrsplit(files$file, "/")[1:4][[1]]][,
Rain:=tstrsplit(files$file, "/")[1:4][[2]]][,
Roughness:=tstrsplit(files$file, "/")[1:4][[3]]][,
Geom:=tstrsplit(files$file, "/")[1:4][[4]]][`time_[s]`<=12000]
There is no coinvent way to do this because there is no operator in R to do string concatenation.
However, you can define a R Infix operator to do string concatenation:
`%+%` = function(x,y) return(paste0(x,y))
Then you can use it to concat strings, even break the code to multiple lines:
s = "a" %+%
"b" %+%
"c"
This will give you "abc".
This will keep the \n character, but you can also just wrap the quote in parentheses. Especially useful in RMarkdown.
t <- ("
this is a long
string
")
I do this all of the time. Use the paste0() function.
Rootdir = "/myhome/thisproject/part1/"
Subdir = "subdirectory1/subsubdir2/"
fullpath = paste0( Rootdir, Subdir )
fullpath
> fullpath
[1] "/myhome/thisproject/part1/subdirectory1/subsubdir2/"

Resources