I am working with RODBC and parallel to make multiple queries against a data system for some internal reporting. To facilitate making new connections, I am going to extract the connection string from the RODBC object. To do this, I planned to use attributes(). However, I've encountered a behavior that I do not understand. A minimal working example is below:
> example.data <- data.frame(letters = sample(x = LETTERS,size = 20,replace = T),
+ numbers = sample(x = 0:9,size = 20,replace = T))
>
> attributes(obj = example.data)
Error in attributes(obj = example.data) :
supplied argument name 'obj' does not match 'x'
> attributes(example.data)
$names
[1] "letters" "numbers"
$row.names
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
$class
[1] "data.frame"
It should be noted that the obj = behavior is the one tab-suggested by RStudio. However, it causes an error. I tried to review the source code for attributes, but it is a primitive, so I would have to go digging into the C source - with which I am not nearly as familiar.
Why does attributes() fail when an explicit argument (obj =) is used, but runs fine when it is not used? (And should the behavior of RStudio with regard to suggesting obj = be changed?)
This seems like a bug in the documentation for attributes. The parameter probably should be named x. You can call it that way
attributes(x = example.data)
The problem is that attributes() is a primitive function and primitive functions behave differently than regular functions in R. They don't have formal parameters (formals(attributes) returns NULL). For these types of functions, R typically isn't going to parse out parameters by name and will assume they are in a certain positional order for efficiency reasons. That's why it's better not to name them because you cannot change the order of these parameters. There should be no need to name the parameter here.
There are other functions out there that have mismatches between the parameter name in the documentation and the value checked by the code. For example
isS4(pi)
# [1] FALSE
# documented parameter name is "object"
isS4(object=pi)
# Error in isS4(object = pi) :
# supplied argument name 'object' does not match 'x'
isS4(x=pi)
# [1] FALSE
But there are also other primitives out there that use names other than x: e.g. seq_along (uses "along.with=") and quote (uses "expr=").
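The points above can be checked at the console. This is only a small sketch; the data frame `df` is made up for illustration, and the exact wording of error messages may vary between R versions.

```r
# attributes() is a primitive, so it has no R-level formals to match against
is.primitive(attributes)      # TRUE
is.null(formals(attributes))  # TRUE

# Hypothetical example data
df <- data.frame(n = 1:3)

# The positional call and the x = call return the same thing
identical(attributes(df), attributes(x = df))  # TRUE

# Primitives whose checked argument name is something other than x
seq_along(along.with = c("a", "b", "c"))  # [1] 1 2 3
quote(expr = a + b)                        # a + b
```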
I have a function that returns a list.
I would like this list to be immutable, similar to the way that lockBinding prevents overwriting or editing an object.
This would look something like the following:
myfun <- function(x){
out <- list(a = 1, val = x)
make_read_only(out)
out
}
test <- myfun(9)
test$a
[1] 1
test$val
[1] 9
test
$a
[1] 1
$val
[1] 9
test$newval <- 7
Error:
Where make_read_only() is just a standin for a function or some code that accomplishes this task.
I have tried using lockBinding which would work perfectly, however the 'lock' doesn't survive being passed upward into the parent environment after the function returns its output.
I have also looked into trying to lock symbol in the parent environment, but there doesn't seem to be a way to learn what the output will be assigned to from inside the function, which is needed as an argument of lockBinding.
It seems like there might be a way to do this by returning a reference to an environment and locking it with lockEnvironment(), or something similar, but I would like to hear what options are out there before beginning.
Similarly, it seems like it might be achievable using R6, but I would prefer to avoid R6 since it is not required for any other part of this codebase, and again, I would just like to hear what options are available to achieve this behaviour.
In summary, the function will return an arbitrary list. This list will be at least a few levels deep.
mylist <- myfun(list(b = list(c = 3), foo = "bar"))
The user should be able to access the sublists/elements in a straightforward way, similar to the current dollar-sign access:
mylist$a$b$c
[1] 3
It should not be possible to edit this object.
mylist$a <- 5
Error:
It should be possible to remove the object via
rm(mylist)
mylist
Error: object 'mylist' not found
So my question is, what available options are there in R to accomplish this?
One way of achieving this could be by creating a new class (locked) along with methods for [<-, [[<-, and $<- for this new class, which will return an error.
For example:
`[[<-.locked` <- function(...) {stop("Can't assign into locked object")}
a<-list(a="a",b=2)
a
$a
[1] "a"
$b
[1] 2
class(a)<-"locked"
a[[1]]<-"moose"
Error in `[[<-.locked`(`*tmp*`, 1, value = "moose") :
Can't assign into locked object
a
$a
[1] "a"
$b
[1] 2
attr(,"class")
[1] "locked"
In that case, all your "make read only" function needs to do is redefine the class for the object as locked:
class(out)<-c("locked",class(out))
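Putting the pieces together, a fuller sketch might look like the following. It assumes the class name "locked" from above and a hypothetical shared helper `stop_locked`. Because R copies on modify, `make_read_only()` has to return the re-classed object instead of changing `out` in place.

```r
# Every replacement method for the "locked" class refuses to modify the object
stop_locked <- function(x, ...) stop("Can't assign into locked object", call. = FALSE)
`$<-.locked`  <- stop_locked
`[<-.locked`  <- stop_locked
`[[<-.locked` <- stop_locked

# Prepend the class; the underlying list is otherwise unchanged
make_read_only <- function(x) {
  class(x) <- c("locked", class(x))
  x
}

myfun <- function(x) {
  out <- list(a = 1, val = x)
  make_read_only(out)  # return the locked copy
}

test <- myfun(list(b = list(c = 3), foo = "bar"))
test$val$b$c                                  # [1] 3
res <- try(test$newval <- 7, silent = TRUE)   # assignment is rejected
inherits(res, "try-error")                    # TRUE
```

This is a guard by convention and not true immutability: a determined user can still call unclass(test) or reset the class attribute and then modify the result.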
Hi, I frequently find myself using the .first = FALSE argument in purrr::partial.
So today I decided to stop repeating myself and I tried to write a partial of partial itself:
backwards_partial <- partial(partial,.first = FALSE)
This function made me nervous rather quickly, because it is ambiguous as to how .first = FALSE will be used:
as a default parameter of the outputted function
as an argument of the call to partial that moves the pre-filled arguments to the back in the outputted function
I thought I could remedy this ambiguity by writing this:
backwards_partial <- lift_ld(lift_dl(partial),list(.first = FALSE))
But this failed and it does not seem elegant.
So my question is...
Is there a correct way (best practice, community standard) that I'm missing here?
If so what is it?
Otherwise how would you solve this problem?
EDIT:
I should mention my use case for having backwards_partial.
I am looking to pre-fill arguments of multiple functions that I will pass into compose, which passes the result of the previous function into the first argument of the next; hence .first = FALSE ensures that we are not overwriting pre-filled arguments.
Here's a way:
# copy function
backwards_partial <- purrr::partial
# change the default of .first (by name, so its position in the signature does not matter)
formals(backwards_partial)$.first <- FALSE
Let's test it:
partial(head,2)(1:5)
# Error in head.default(2, ...) : length(n) == 1L is not TRUE
partial(head,2,.first = FALSE)(1:5)
# [1] 1 2
backwards_partial(head,2)(1:5)
# [1] 1 2
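If you would rather not depend on the internals of purrr::partial (I believe newer purrr releases deprecate `.first`, so the formals trick may not carry over), the same idea can be sketched in base R. The helper name `partial_last` is hypothetical and this is not how purrr implements it; it simply appends the pre-filled arguments after the ones supplied at call time.

```r
# Hypothetical helper: pre-filled arguments go after the call-time arguments
partial_last <- function(.f, ...) {
  prefilled <- list(...)
  function(...) do.call(.f, c(list(...), prefilled))
}

first_two <- partial_last(head, 2)
first_two(1:5)
# [1] 1 2

# Named pre-filled arguments work the same way
paste_dash <- partial_last(paste, sep = "-")
paste_dash("a", "b")
# [1] "a-b"
```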
I'll keep it simple. Why does this work:
> as.data.frame(c('a', 'b'))
c("a", "b")
1 a
2 b
But this doesn't:
> as(c('a', 'b'), "data.frame")
Error in as(c("a", "b"), "data.frame") :
no method or default for coercing “character” to “data.frame”
I assumed that the latter would simply somehow convert into the former, but I suppose not.
Maybe the R authors thought replicating the first method would be encouraging bad coding practice. The first result does not look particularly worth emulating because the name of the column will not be easy to use. The data.frame method for character values delivers a much better behaved result since it gets created with a valid name:
> as.data.frame(c('a','b'))
c("a", "b")
1 a
2 b
data.frame(c('a','b'))
c..a....b..
1 a
2 b
See what happens when you try to extract values with the name of that column. Since everyone knows that dataframes are really list objects, (right?)... then it would be more natural to expect coders to use a list argument:
data.frame(list(b=c('a', 'b')) )
b
1 a
2 b
# same as
> as.data.frame(list(f=c('a','b')))
f
1 a
2 b
Alex's answer directs you to the as-function code, which elaborates and confirms joran's comment above. That function doesn't use the S3 dispatch, but rather looks up registered coercion methods that have been created by packages or constructed with setAs which is a process that is more commonly used in building S4-methods.
> setAs("character", "data.frame", function(from){ to=as.data.frame.character(from)})
> new=as(c('a', 'b'), "data.frame")
> new
from
1 a
2 b
The setAs function also allows you to use custom coercion at the time of input with the read.*-functions; see the question "How can I completely remove scientific notation for the entire R session".
I believe that it has to do with the fact that as is not a generic function, such as mean:
R> mean
function (x, ...)
UseMethod("mean")
<bytecode: 0x000000000a617ed0>
<environment: namespace:base>
Since it's not a generic, there is no call to method dispatch (i.e. UseMethod).
On the other hand, as.data.frame is a generic function-- see methods(class= "data.frame") or the source for as.data.frame
If there was method dispatch on as, your assumption "that the latter would convert to the former" would be correct. Since as is not a generic function, your assumption is wrong.
If you look at the source code to as, you see that it's essentially a call to a number of if-else cases instead of a call to method dispatch. On line 52, you see the catch that returns your error:
if (is.null(asMethod))
stop(gettextf("no method or default for coercing %s to %s",
dQuote(thisClass), dQuote(Class)), domain = NA)
Which gives the return that you see.
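To make the difference concrete, here is a toy model of the two mechanisms. It is only an illustration and not the actual implementation of as(); the names `to_frame`, `register_coercion` and `coerce_to` are made up.

```r
# S3 generic: UseMethod() picks a method based on class(x)
to_frame <- function(x, ...) UseMethod("to_frame")
to_frame.character <- function(x, ...) data.frame(value = x, stringsAsFactors = FALSE)

# Registry-based coercion, loosely in the spirit of as(): no S3 dispatch,
# only a lookup that fails when nothing has been registered
coercions <- new.env()
register_coercion <- function(from, to, fun) {
  assign(paste(from, to, sep = "->"), fun, envir = coercions)
}
coerce_to <- function(x, to) {
  key <- paste(class(x)[1], to, sep = "->")
  fun <- get0(key, envir = coercions, inherits = FALSE)
  if (is.null(fun)) stop(sprintf("no method for coercing %s to %s", class(x)[1], to))
  fun(x)
}

to_frame(c("a", "b"))                                             # works via S3 dispatch
err <- try(coerce_to(c("a", "b"), "data.frame"), silent = TRUE)   # fails: nothing registered
register_coercion("character", "data.frame", to_frame.character)
coerce_to(c("a", "b"), "data.frame")                              # works once registered
```

The existence of an S3 method does nothing for the registry lookup until the coercion is registered explicitly, which mirrors why as() needs setAs.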
I am getting a strange error converting Gene Symbols to Entrez ID. Here is my code:
testData = read.delim("IL_CellVar.txt",head=T,row.names = 2)
testData[1:5,1:3]
# ClustID Genes.Symbol ChrLoc
# NM_001034168.1 4 Ank2 chrNA:-1--1
# NM_013795.4 4 Atp5l chrNA:-1--1
# NM_018770 4 Igsf4a chrNA:-1--1
# NM_146150.2 4 Nrd1 chrNA:-1--1
# NM_134065.3 4 Epdr1 chrNA:-1--1
clustNum = 5
filteredClust = testData[testData$ClustID == clustNum,]
any(is.na(filteredClust$Genes.Symbol))
# [1] FALSE
selectedEntrezIds <- unlist(mget(filteredClust$Genes.Symbol,org.Mm.egSYMBOL2EG))
# Error in unlist(mget(filteredClust$Genes.Symbol, org.Mm.egSYMBOL2EG)) :
# error in evaluating the argument 'x' in selecting a method for function
# 'unlist': Error in #.checkKeysAreWellFormed(keys) :
# keys must be supplied in a character vector with no NAs
Another approach fails too:
selectedEntrezIds = select(org.Mm.eg.db,filteredClust$Genes.Symbol, "ENTREZID")
# Error in .select(x, keys, columns, keytype = extraArgs[["kt"]], jointype = jointype) :
# 'keys' must be a character vector
Just to rule it out, removing NAs doesn't help either:
a <- filteredClust$Genes.Symbol[!is.na(filteredClust$Genes.Symbol)]
selectedEntrezIds <- unlist(mget(a,org.Mm.egSYMBOL2EG))
# Error in unlist(mget(a, org.Mm.egSYMBOL2EG)) :
# error in evaluating the argument 'x' in selecting a method for function
# 'unlist': Error in # .checkKeysAreWellFormed(keys) :
# keys must be supplied in a character vector with no NAs
I am not sure why I am getting this error, as the master file from which the gene symbols were extracted for testData gives no problem when converting to Entrez IDs. I would appreciate help on this.
Since you didn't provide a minimal reproducible example that lets us replicate the error, I'm speculating based on the error message. This is most likely caused by the default behavior of read.delim and similar functions (read.csv, read.table, etc.), which convert strings in your data file to factors.
You need to pass an extra argument to read.delim, specifically stringsAsFactors = FALSE (in R versions before 4.0.0 the default is TRUE).
That is,
testData = read.delim("IL_CellVar.txt", header = TRUE, row.names = 2, stringsAsFactors = FALSE)
If you read the documentation:
stringsAsFactors
logical: should character vectors be converted to factors? Note that this is overridden by as.is and colClasses, both of which allow finer control.
You can check the class of your Genes.Symbol column by:
class(testData$Genes.Symbol)
and I guess it would be "factor".
This leads to the error you had:
# Error in .select(x, keys, columns, keytype = extraArgs[["kt"]], jointype = jointype) :
# 'keys' must be a character vector
You can also manually convert the factors to strings/characters by:
testData$Genes.Symbol <- as.character(testData$Genes.Symbol)
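The effect is easy to reproduce without the original file. The sketch below uses a made-up two-row, tab-separated snippet in place of IL_CellVar.txt and sets stringsAsFactors explicitly, so it behaves the same regardless of the R version's default.

```r
# Hypothetical stand-in for the contents of the data file
raw <- "ClustID\tGenes.Symbol\n4\tAnk2\n5\tAtp5l\n"

as_factor <- read.delim(text = raw, stringsAsFactors = TRUE)
as_char   <- read.delim(text = raw, stringsAsFactors = FALSE)

class(as_factor$Genes.Symbol)  # "factor"
class(as_char$Genes.Symbol)    # "character"

# Converting after the fact recovers a character vector suitable as keys
keys <- as.character(as_factor$Genes.Symbol)
is.character(keys)             # TRUE
```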
You can read more about this peculiar behavior in this chapter of Hadley's book "Advanced R". And I'm quoting the relevant paragraph here:
... Unfortunately, most data loading functions in R automatically convert character vectors to factors. This is suboptimal, because there’s no way for those functions to know the set of all possible levels or their optimal order. Instead, use the argument stringsAsFactors = FALSE to suppress this behaviour, and then manually convert character vectors to factors using your knowledge of the data. A global option, options(stringsAsFactors = FALSE), is available to control this behaviour, but I don’t recommend using it. Changing a global option may have unexpected consequences when combined with other code (either from packages, or code that you’re source()ing), and global options make code harder to understand because they increase the number of lines you need to read to understand how a single line of code will behave. ...
Is there a way to test whether two objects are identical in the R language?
For clarity: I do not mean identical in the sense of the identical function,
which compares objects based on certain properties like numerical values or logical values etc.
I am really interested in object identity, which for example could be tested using the is operator in the Python language.
UPDATE: A more robust and faster implementation of address(x) (not using .Internal(inspect(x))) was added to data.table v1.8.9. From NEWS :
New function address() returns the address in RAM of its argument. Sometimes useful in determining whether a value has been copied or not by R, programatically.
There's probably a neater way but this seems to work.
address = function(x) substring(capture.output(.Internal(inspect(x)))[1],2,17)
x = 1
y = 1
z = x
identical(x,y)
# [1] TRUE
identical(x,z)
# [1] TRUE
address(x)==address(y)
# [1] FALSE
address(x)==address(z)
# [1] TRUE
You could modify it to work on 32bit by changing 17 to 9.
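A variant that avoids hard-coding the address width is sketched below. It assumes the first line printed by .Internal(inspect()) starts with "@" followed by the address and a space, which is undocumented and could change; the name `address2` is made up.

```r
# Hypothetical helper: keep everything between "@" and the first whitespace
# of the header line printed by the undocumented .Internal(inspect())
address2 <- function(x) {
  header <- capture.output(.Internal(inspect(x)))[1]
  sub("^@(\\S+)\\s.*$", "\\1", header)
}

x <- runif(3)
y <- x + 0   # equal values, but a freshly allocated vector
z <- x       # same object until one of them is modified

identical(x, y)             # TRUE
address2(x) == address2(y)  # FALSE
address2(x) == address2(z)  # TRUE
```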
You can use the pryr package.
For example, return the memory location of the mtcars object:
pryr::address(mtcars)
Then, for variables a and b, you can check:
pryr::address(a) == pryr::address(b)