Operation overloading in R [duplicate]

This question already has answers here:
Making a string concatenation operator in R
(5 answers)
Closed 8 years ago.
What's the most straightforward way of overloading '+' for characters?
I have defined '%+%' <- function(...) paste(...,sep=""):
str <- "aa"%+%"bb"%+%"cc" #str="aabbcc"
But I don't like the syntax. I think str <- "aa"+"bb"+"cc" would be nicer.
(I am building long SQL queries to use with RODBC, and the usual paste is not very handy in such situations. Any suggestions?)

You could try something like this:
R> oldplus <- `+`
R> `+` <- function(e1, e2) {
     # Unary plus (e.g. +3) arrives with e2 missing; hand it to the original.
     if (missing(e2)) return(oldplus(e1))
     if (is.character(e1) && is.character(e2)) {
       paste(e1, e2, sep = "")
     } else {
       oldplus(e1, e2)
     }
   }
Which gives:
R> 2+3
[1] 5
R> "aa"+"bb"
[1] "aabb"
But as Sacha pointed out, overloading such a basic function is very dangerous, and I can't assure you it will not break your R session and make your computer explode :-)
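A less invasive variant of the same idea is to leave the global `+` alone and attach the behaviour to a small S3 class instead. This is only a sketch: the class name chr_cat and its constructor are made up for illustration and are not part of the question or of any package.

```r
# Hypothetical constructor: tags a character vector with the class "chr_cat".
chr_cat <- function(x) structure(x, class = "chr_cat")

# S3 method for `+`: used whenever either operand carries the class.
`+.chr_cat` <- function(e1, e2) {
  chr_cat(paste0(unclass(e1), unclass(e2)))
}

s <- chr_cat("aa") + "bb" + "cc"
unclass(s)   # "aabbcc"
2 + 3        # 5, ordinary arithmetic is untouched
```

Only the first operand needs wrapping, because each result keeps the class and so the chain continues to dispatch to the method.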

I think that using two arguments is better than the dots:
'%+%' <- function(x,y) paste(x,y,sep="")
"a"%+%"b"%+%"C"
[1] "abC"
If you really really want to you can overwrite +, but be veeeeery careful when doing this as you will break one of the most important functions in R. I can't think of any reason why you would want to do that over %+%:
# '+' <- function(x,y) paste(x,y,sep="")
# "a"+"b"+"C"
# [1] "abC"
rm('+')
I commented it out to be sure I don't accidentally break someone's R :)
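To make the role of rm('+') concrete, here is a short sketch of what happens if the commented-out definition is actually run. The definition only masks the base operator in the current environment, so removing it brings the original back. The variable names masked and restored are just for this example.

```r
`+` <- function(x, y) paste(x, y, sep = "")  # masks base `+` in the current environment
masked <- "a" + "b" + "C"                    # "abC"; note that 1 + 2 would now give "12"

rm("+")                                      # remove the mask; base `+` is found again
restored <- 1 + 2                            # 3
```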

Why is the usual 'paste' not very handy? It's what it's meant for. Suggestions:
Write yourself an unusual paste function that does what you want. Maybe you just don't like typing 'sep=""' all the time. So write a function that calls paste with sep="". Or whatever.
Building long SQL queries with string concatenation is a potential failure anyway. See http://xkcd.com/327/ for the canonical example.
Another possibility is some kind of templating solution. I've used the brew package in the past and it's great for that.
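As a rough sketch of the first suggestion: a one-line wrapper removes the need to type sep="" each time (base R's paste0() does the same thing), and sprintf() is a base alternative that reads more like a template. The helper name p0, the table name, and the values are invented for the example, and the values are hard-coded and trusted, which sidesteps the injection problem rather than solving it.

```r
# Hypothetical wrapper name; base R's paste0() is equivalent.
p0 <- function(...) paste(..., sep = "")

tbl    <- "orders"   # assumed example values, hard-coded and trusted
min_id <- 10L

query <- p0("SELECT * FROM ", tbl, " WHERE id > ", min_id)
same  <- sprintf("SELECT * FROM %s WHERE id > %d", tbl, min_id)

identical(query, same)   # TRUE
```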

You can find this operator in the stringi package, where it is spelled %s+%.
http://docs.rexamine.com/R-man/stringi/oper_plus.html

Related

Substring by specific character [duplicate]

I would like to extract the filename from a URL in R. For now I do it as follows, but maybe it can be done more concisely, as in Python. Assume path is just a string.
path="http://www.exanple.com/foo/bar/fooXbar.xls"
in R:
tail(strsplit(path,"[/]")[[1]],1)
in Python:
path.split("/")[-1:]
Maybe some sub, gsub solution?
There's a function for that...
basename(path)
[1] "fooXbar.xls"
@SimonO101 has the most robust answer IMO, but some other options:
Since regular expressions are greedy, you can use that to your advantage:
sub('.*/', '', path)
# [1] "fooXbar.xls"
Also, you shouldn't need the [] around the / in your strsplit.
> tail(strsplit(path,"/")[[1]],1)
[1] "fooXbar.xls"
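The greediness that the sub() answer relies on can be seen by comparing it with a lazy quantifier. This is a small illustration using the path from the question; perl = TRUE is passed only so that the lazy form `.*?` is interpreted the same way in both calls.

```r
path <- "http://www.exanple.com/foo/bar/fooXbar.xls"

# Greedy: .* runs to the LAST slash, so only the file name is left.
greedy <- sub(".*/", "", path, perl = TRUE)

# Lazy: .*? stops at the FIRST slash, so only "http:/" is removed.
lazy <- sub(".*?/", "", path, perl = TRUE)

greedy   # "fooXbar.xls"
lazy     # "/www.exanple.com/foo/bar/fooXbar.xls"
```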

How can a function which prints be called repeatedly without returning the value of what was printed and without a for loop? [duplicate]

This question already has answers here:
How to suppress part of the output from `lapply()`?
(2 answers)
Closed 2 years ago.
I recently wrote for(i in 1:100){foo(i,bar)} as the last line in a script. In this case, the last line of foo is a call to print and I definitely do not want to see the return values of foo. I only want the printing. This for loop works, but it feels unidiomatic to use a loop like this in R. Is there any more idiomatic way to achieve this? foo must take each i in 1:100 individually before being called, so foo(1:100,bar) won't work.
sapply(1:100,function(x) foo(x,bar)) seems more idiomatic, but it gives me both the return values of foo and its printing. I had considered using do.call, but having to use as.list(1:100) disgusted me. What are my alternatives?
Minimum example:
foo <- function(i, bar) {
  print(paste0("alice", i, bar, collapse = ""))
}
for (i in 1:100) { foo(i, "should've used cat") }
sapply(1:100, function(x) foo(x, "ugly output"))
You can use invisible in base R to suppress function return output:
invisible(sapply(1:5, function(x) foo(x, "ugly")))
[1] "alice1ugly"
[1] "alice2ugly"
[1] "alice3ugly"
[1] "alice4ugly"
[1] "alice5ugly"
You can also use purrr::walk - it is like sapply in that it executes a function over iterated values, but wrapped with invisible by default:
purrr::walk(1:100, ~foo(., "ugly"))
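The difference invisible() makes can be checked in base R by capturing what each call would print at the console. This is a sketch using a shortened 1:3 range and capture.output(); the names noisy and quiet are just for the example.

```r
foo <- function(i, bar) {
  print(paste0("alice", i, bar))
}

# Without invisible(): foo()'s three lines plus the auto-printed result of sapply().
noisy <- capture.output(sapply(1:3, function(x) foo(x, "ugly")))

# With invisible(): only the three lines that foo() prints itself.
quiet <- capture.output(invisible(sapply(1:3, function(x) foo(x, "ugly"))))

length(quiet)                  # 3
length(noisy) > length(quiet)  # TRUE
```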

How do I remove the extraneous printing of [1] in R? [duplicate]

This question already has answers here:
Output in R, Avoid Writing "[1]"
(4 answers)
R output without [1], how to nicely format?
(1 answer)
Closed 3 years ago.
Consider the following example:
library(digest)
hash <- digest("hello world", algo="md5", serialize=F)
hash
produces [1] "5eb63bbbe01eeed093cb22bb8f5acdc3"
For my purposes, I only want the raw string output with no embellishments or extras. The objective is to alter the script so it produces 5eb63bbbe01eeed093cb22bb8f5acdc3.
I've spent over an hour looking for any way to get rid of the [1] and the documentation has been absolutely terrible. Most of the search results are manipulation, clickbait, wrong, or scams.
Array indexing doesn't work:
hash[1]
produces [1] "5eb63bbbe01eeed093cb22bb8f5acdc3", because apparently an array is the first element of itself which makes no programmatic sense whatsoever.
typeof(hash)
produces [1] "character". Really?
substr(hash[1], 4, 1000)
produces [1] "63bbbe01eeed093cb22bb8f5acdc3".
How do I just make that [1], and preferably the quotes as well, go away? There are absolutely no instructions searchable on the web as far as I know.
More generally, I'd like a function or procedure to convert anything to a string for manipulation and post-processing.
library(digest)
hash <- digest("hello world", algo="md5", serialize=F)
cat(hash)
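Some context for why cat() works: hash is a character vector of length one, and the [1] and the quotes are added by print(), which is what typing a bare name at the console calls. The sketch below hard-codes the hash value from the question so that it runs without the digest package.

```r
hash <- "5eb63bbbe01eeed093cb22bb8f5acdc3"  # value from the question, hard-coded here

length(hash)                  # 1: a one-element character vector, not a scalar type
cat(hash, "\n", sep = "")     # bare text; cat() adds no newline unless asked
writeLines(hash)              # bare text followed by a newline
print(hash, quote = FALSE)    # drops the quotes but still shows the [1] index
```

cat() and writeLines() write the characters themselves, so either is suitable when the output is meant to be consumed by another program.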

How to use variable in xpath in R?

When I parse a web file, this works fine:
tdata=xpathApply(data,"//table[@id='PL']")
I want to use a variable in xpathApply:
x="PL"
tdata=xpathApply(data,"//table[@id=x]")
This does not work. How do I write the XPath expression in xpathApply with a variable?
Thanks to Dason's suggestion,
x="PL"
y=paste0("//table[@id='",x,"']")
tdata=xpathApply(data,y)
works, but I find it ugly. How can I write it more elegantly?
The gsubfn package can do string interpolation somewhat along the lines of Perl if we preface the function whose arguments are to contain substitutions with fn$. Here $x means substitute in the value of x . See ?fn and the gsubfn home page.
library(gsubfn)
x <- "PL"
tdata <- fn$xpathApply(data, "//table[@id='$x']")
@Dason's suggestion of using paste or one alike is most likely the only way to go. If you find it ugly, you can sweep it under the rug by creating your own function:
my.xpathApply <- function(data, x) xpathApply(data, paste0("//table[@id='",x,"']"))
tdata <- my.xpathApply(data, "PL")
After all, you must use a lot of package functions that use paste somewhere, so you should be ok with having one of your own :-)
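Another base R option in the same spirit is sprintf(), which keeps the XPath expression readable as a single template string. This sketch only builds the expression; the xpathApply call is left as a comment because it needs the XML package and the parsed document from the question.

```r
x <- "PL"

# %s is replaced by the value of x inside the quoted attribute test.
xpath <- sprintf("//table[@id='%s']", x)
xpath   # "//table[@id='PL']"

# tdata <- xpathApply(data, xpath)   # as in the question; requires XML and `data`
```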

Using lists in R

Sorry for possibly a complete noob question but I have just started programming with R today and I am stuck already.
I am reading some data from a file which is in the following format:
3.482373 8.0093238198371388 47.393873
0.32 20.3131 31.313
What I want to do is split each line then deal with each of the individual numbers.
I have imported the stringr package and am using
x = str_split(line, " ")
This produces a list which I would like to index but don't know how.
I have learnt that x[[1:2]] gets the second element but that is about it. Ideally I would like something like
x1 = x[1]
x2 = x[2]
x3 = x[3]
But I can't find any way of doing this.
Thanks in advance
By using unlist you will get a vector instead of a list of vectors, and you will then be able to index it directly:
R> unlist(str_split("foo bar baz", " "))
[1] "foo" "bar" "baz"
But maybe you should read your file directly with read.table or one of its variants?
And if you are beginning with R, you really should read one of the available introductions if you want to understand subsetting, indexing, etc.
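To make the indexing concrete, here is a sketch using the first line from the question. It uses base R's strsplit() in place of stringr's str_split() so that it runs without extra packages; both return a list with one character vector per input string.

```r
line <- "3.482373 8.0093238198371388 47.393873"

x <- strsplit(line, " ")   # a list of length 1 holding a character vector of length 3

x[[1]][2]                  # second piece of the first (only) string
x[[1:2]]                   # same element, via recursive indexing

nums <- as.numeric(unlist(x))   # flatten to a plain vector and convert to numbers
x1 <- nums[1]
x2 <- nums[2]
x3 <- nums[3]
```

After unlist() the pieces sit in an ordinary vector, so single brackets give exactly the x1, x2, x3 assignments the question asks for.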
You can wrap your call to str_split with unlist to get the behavior you're looking for.
The usual way to get this in would be to import it into a dataframe (a special sort of list). If the file name is "fil.dat" and it is in "C:/dir/":
dfrm <- read.table("C:/dir/fil.dat") # resist the temptation to use backslashes
dfrm[2,2] # would give you the second item on the second row.
By default the field separator in R is "white-space" and that seems to be what you have, so you do not need to supply a sep= argument and the read.table function will attempt to import as numeric. To be on the safe side, you might consider forcing that with colClasses=rep("numeric", 3): without it, a strange item (such as those often produced by Excel dumps) silently turns the whole column into a factor (a character vector in R 4.0 and later), and you will probably not understand how to recover gracefully.
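The same import can be tried without a file by passing the two sample lines from the question through read.table's text= argument. This is a sketch only; the inline string stands in for the real file.

```r
sample_lines <- "3.482373 8.0093238198371388 47.393873
0.32 20.3131 31.313"

# text= reads from the string instead of a file; whitespace is the default separator.
dfrm <- read.table(text = sample_lines, colClasses = rep("numeric", 3))

dim(dfrm)    # 2 rows, 3 columns (named V1, V2, V3 by default)
dfrm[2, 2]   # 20.3131, the second item on the second row
```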

Resources