Simple hash table for a conversion R script - r

I would like to use a hash table to implement a simple conversion script.
An input should be multiplied by a factor selected by its symbol, i.e. y = x * 1E-12 should be callable as y <- X2Y(x,"p"), with "p" being the symbol for 1E-12.
library(hash)
sym2num <- function(x) {
h <- hash( c("f"=1E-15,"p"=1E-12,"n"=1E-9,"mu"=1E-6,"m"=1E-3,"c"=1E-2) )
return(h$x)
}
X2Y <- function(X,x) {
xNum <- sym2num(x)
Y <- X * xNum
return(Y)
}
# y = x * 1E-12
y <- X2Y(x,"p")
print(y)
With the above code I get numeric(0) as the result. Any ideas where it goes wrong?

There’s no benefit to using the {hash} library here. Indeed, since you rehash your vector before each subsetting, this will be substantially less efficient than a direct lookup.
Even if you only constructed the hash table once instead of repeatedly, it would probably still be faster not to use it: the hash table implementation carries a substantial constant overhead. It’s only faster than direct vector or list subsetting for fairly large tables.
Instead, just do this:
sym2num <- function(x) {
c(f = 1E-15, p = 1E-12, n = 1E-9, mu = 1E-6, m = 1E-3, c = 1E-2)[x]
}
This is idiomatic, efficient R code.
Fundamentally, the mistake in your R code was the subsetting, h$x. This fails because the subset operator $ doesn’t work with variables: it expects an unevaluated name on its right-hand side. The code will thus always look up the literal name x inside h. thc’s answer shows how to avoid the issue.
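For completeness, here is a minimal sketch of the whole conversion built on the named-vector lookup. The name prefix_factors and the unknown-symbol check are additions for illustration and are not part of the question.

```r
# Lookup table of prefix symbols; built once, outside the functions.
prefix_factors <- c(f = 1e-15, p = 1e-12, n = 1e-9, mu = 1e-6, m = 1e-3, c = 1e-2)

sym2num <- function(x) {
  # Fail loudly on unknown symbols instead of silently returning NA.
  unknown <- setdiff(x, names(prefix_factors))
  if (length(unknown) > 0) {
    stop("unknown prefix symbol: ", toString(unknown))
  }
  # unname() drops the "p" label so the result is a plain number.
  unname(prefix_factors[x])
}

X2Y <- function(X, x) X * sym2num(x)

y <- X2Y(5, "p")
print(y)
```

Because the lookup is vectorized, X2Y(c(1, 2), c("p", "n")) also works element by element.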

Your function sym2num always returns the hash of "x", which is NULL.
h$x is a shortcut for h[["x"]], but what you want is h[[x]].
Instead use this:
sym2num <- function(x) {
h <- hash( c("f"=1E-15,"p"=1E-12,"n"=1E-9,"mu"=1E-6,"m"=1E-3,"c"=1E-2) )
return(h[[x]])
}
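The difference between the two operators can be seen without the hash package. The sketch below uses a plain list as a stand-in for the hash object (an assumption made to keep the example self-contained) and also shows where the numeric(0) in the question comes from.

```r
h <- list(f = 1e-15, p = 1e-12, n = 1e-9)
x <- "p"

literal  <- h$x     # looks for an element literally named "x"
computed <- h[[x]]  # uses the value stored in x, i.e. "p"

print(is.null(literal))  # TRUE
print(computed)          # 1e-12
print(5 * literal)       # numeric(0), the result seen in the question
```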

Related

Assigning objects to arguments in R functions - switches, if else statements or functions?

I am new to R and programming in general and am trying to write a very basic function where the input is 2 numbers and a selection from one of 3 operations. The output is supposed to be the result of a further calculation (divide the result of the input by 3*pi) and then a character string to confirm what operation was selected/performed. I want the default operation to be addition.
I've read up a little on the switch function and if... else type statements but not sure what is the most efficient way to achieve what I am trying to do and so far I haven't been able to get anything to work anyway. I seem to be getting a massive matrix as the output or an error to say I can't return multiple arguments in my current attempt. Can someone help with where I am going wrong? Thank you in advance.
basiccalc <- function(x, y, operation = addition){
addition <- x + y
subtraction <- x - y
multiplication <- x * y
calculation <- operation/(3*pi)
return(calculation, "operation")
}
switch would be useful
basiccalc <- function(x, y, operation = addition) {
operation <- deparse(substitute(operation))
op <- switch(operation,
addition = x + y,
subtraction = x - y,
multiplication = x * y)
return(op/(3 *pi))
}
Testing:
> basiccalc(3, 5)
[1] 0.8488264
> 8/(3 * pi)
[1] 0.8488264
> basiccalc(3, 5, operation = subtraction)
[1] -0.2122066
> (3- 5)/(3 * pi)
[1] -0.2122066
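The question also asked for a character string confirming the operation. One possible variation, sketched here as an alternative rather than as the answer's method, takes the operation as a string, validates it with match.arg(), and returns a list holding both values.

```r
basiccalc <- function(x, y,
                      operation = c("addition", "subtraction", "multiplication")) {
  # match.arg() uses the first choice as the default and rejects anything else.
  operation <- match.arg(operation)
  op <- switch(operation,
    addition       = x + y,
    subtraction    = x - y,
    multiplication = x * y
  )
  # A list lets the function return both the number and the label.
  list(result = op / (3 * pi), operation = operation)
}

out <- basiccalc(3, 5)
print(out$result)
print(out$operation)
```

A function can only return one object, which is why return(calculation, "operation") fails in the original attempt; wrapping the values in a list or vector avoids that.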

Finding root of a function containing an array using uniroot.all

I'm trying to find the roots of a function like this
f <- function(x) {
sum( tanh(Carray + x) ) - x
}
library(rootSolve)
roots <- uniroot.all(f,c(0,1))
where x is a scalar and Carray is an array. The problem seems to be that uniroot.all sends to f a whole array of x'es at once, and then R gets confused about doing Carray+x (warning: "longer object length is not a multiple of shorter object length").
Function f works as intended when x is a scalar. I am not sure how to vectorize f so that I can pass to it an array of x'es.
If we knew the size of Carray, it would maybe be possible to vectorize f more naturally, but for a general Carray you could use Vectorize:
roots <- uniroot.all(Vectorize(f), c(0, 1))
yes, example:
f <- function(x) {
sum(tanh(y+x))/100-x
}
vf <- Vectorize(f)
y=runif(100,-1,1)
f(x=0.5)
f(x=-0.5)
vf(x=1:2)
uniroot.all(vf,c(-.99,.99))
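As a sketch of what Vectorize does here, the same effect can be written by hand with vapply, which calls the scalar function once per element of x. The data below is made up to stand in for Carray. Base R's uniroot is used only to show that a solver passing one x at a time needs no vectorization.

```r
set.seed(1)
Carray <- runif(100, -1, 1)  # made-up data standing in for the real Carray

# Scalar version: only correct when x has length 1.
f <- function(x) sum(tanh(Carray + x)) / 100 - x

# Explicitly vectorized: evaluate f once per element of x.
vf <- function(x) vapply(x, f, numeric(1))

xs <- c(-0.5, 0, 0.5)
print(vf(xs))

# stats::uniroot() evaluates the function at a single x per call,
# so the scalar f is enough when one root in the interval will do.
root <- uniroot(f, c(-0.99, 0.99))$root
print(root)
```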

R BB package - no way to pass parameters to objective function?

I am eager to use the R package BB to solve a system of non-linear equations, but the syntax does not seem to allow for parameters to be passed to the system of equations. Very strange since this would severely limit what appears to be an otherwise very appealing and powerful alternative to nleqslv().
To be clear: "Normally", you expect a solver to have a space for passing parameters to the underlying objective function. For example, in nleqslv:
out <- nleqslv(in_x, obj_fn, jac = NULL, other_pars1, other_pars2, method = "Broyden")
Where "in_x" is the vector of initial guesses at a solution, and the "other_pars1, other_pars2" are additional fixed parameters (can be scalars, vectors, matrices, whatever) required by "obj_fn".
In BBsolve, on the other hand, you just have
out <- BBsolve(in_x, obj_fn)
With no space to put in all the "other_pars1, other_pars2" required by obj_fn.
Create a function that "attaches" additional parameters to your objective function. The key concept here is that the return value is itself a function:
gen_obj_fn <- function( obj_fn, other_pars1, other_pars2 )
{
function(x) { obj_fn( x, other_pars1, other_pars2 ) }
}
The output of gen_obj_fn can now be passed directly to BBsolve:
## Previous call using nleqslv():
out <- nleqslv( in_x, myFun, jac = NULL, myParam1, myParam2, ... )
## Equivalent call using BBsolve():
myObjF <- gen_obj_fn( myFun, myParam1, myParam2 )
is.function( myObjF ) ## TRUE
out <- BBsolve( in_x, myObjF )
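The factory idea can be checked without any solver installed. The sketch below is a slightly more general variant that binds any number of extra arguments through "..."; obj_fn and bind_params are hypothetical names used only for illustration.

```r
# Hypothetical objective with two extra parameters.
obj_fn <- function(x, p1, p2) {
  c(10 * x[1] + 3 * x[2]^2 - p1,
    x[1]^2 - exp(x[2]) - p2)
}

# Function factory: returns a one-argument function with the extras captured.
bind_params <- function(fn, ...) {
  force(fn)
  extra <- list(...)
  function(x) do.call(fn, c(list(x), extra))
}

bound <- bind_params(obj_fn, p1 = 3, p2 = 2)

x0 <- c(1, 1)
print(bound(x0))
print(identical(bound(x0), obj_fn(x0, p1 = 3, p2 = 2)))  # TRUE
```

The returned function keeps p1 and p2 in its enclosing environment, so a solver that only ever calls fn(x) still sees the fixed parameters.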
You haven't shown how you are using BBsolve. As I said in my comment BBsolve certainly does accept additional function arguments.
But you must name those arguments.
See this example for how to do what you seem to want:
library(nleqslv)
f <- function(x,p1=3,p2=2) {
y <- numeric(2)
y[1] <- 10*x[1]+3*x[2]^2 - p1
y[2] <- x[1]^2 -exp(x[2]) -p2
y
}
xstart <- c(1,1)
nleqslv(xstart, f)
library(BB)
BBsolve(xstart,f)
Try slightly different values for p1 and p2:
nleqslv(xstart,f,p1=2.7,p2=2.1)
BBsolve(xstart,f,p1=2.7,p2=2.1)
Both functions find the same solution.

Is there a better clean approach to single-use functions in R?

In a lot of cases, I need to write some code that makes up a logical block, and it feels right to place it in a function. However, since it is used only once, it is cumbersome to move the code away from where it is applied and give it a single-use name, thus polluting the namespace.
Today, I was experimenting and I also came across a question about lambda expressions in R. So I implemented my logic as following:
x <- (function(charsBase, n, m) {
z <- apply(
matrix(
sample(unique(charsBase), n*m*3, replace = TRUE)
, nrow = n*3, ncol = m
)
, 1
, paste, collapse="")
head(unique(z), n)
}) (LETTERS, 1000, 3)
Questions:
Is there a better way of creating a lambda in R?
While the namespace is apparently kept clean, how about the memory? In my experience, R usually leaks when you create / remove objects in the global environment. If extensive allocation / freeing is done within the function, would this keep the memory under control?
Thanks a lot in advance!
You can use with() with a list or data frame as its first argument. For example:
result <- with(list(a=3, b=4), {
foo <- a + b
foo^2
})
This keeps the global environment clean, because the part enclosed in brackets is evaluated in a separate environment that is destroyed after the evaluation takes place.
However, in my experience it can become cumbersome to program in this style. Sometimes I find it more practical to clean up unwanted objects with rm() when they're no longer needed. It's not as elegant, I agree.
I would use local with shorter lines and more readable code:
x <- local({
charsBase <- LETTERS
n <- 1000
m <- 3
sam <- sample(unique(charsBase), n*m*3, replace = TRUE)
mtx <- matrix(sam, nrow = n*3, ncol = m)
z <- apply(mtx, 1, paste, collapse="")
head(unique(z), n)
})
Nothing of the above "leaks" to the global environment unless you explicitly use global assignment with <<-. The value of the last "thing" evaluated within the curly brackets becomes the value of x. You can get identical result with ...
local({
charsBase <- LETTERS
n <- 1000
m <- 3
sam <- sample(unique(charsBase), n*m*3, replace = TRUE)
mtx <- matrix(sam, nrow = n*3, ncol = m)
z <- apply(mtx, 1, paste, collapse="")
x <<- head(unique(z), n) # notice the wyrd assignment operator
})
... I'd say it's less elegant but then again, it's a matter of preference.
A useful trick I sometimes use when experimenting is ...
local(browser())
You don't have to keep track of the assignments, everything is temporary unless you use global assignment.
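A small check of the claim that nothing leaks; the variable name scratch is arbitrary.

```r
result <- local({
  scratch <- 1:10   # temporary, visible only inside local()
  sum(scratch^2)
})

print(result)                               # 385
print(exists("scratch", inherits = FALSE))  # FALSE: scratch did not leak
```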

Why can a matrix 'protect' itself and how can I implement real restrictions to custom classes?

I have been trying to get my head around the validity of objects. I have read Hadley's advanced programming and get what he says about aiming at your feet (with a gun):
R doesn't protect you from yourself: you can easily shoot yourself in the foot, but if you don't aim the gun at your foot and pull the trigger, you won't have a problem.
So this holds for S3. Looking for a more rigorous implementation I looked into S4.
The man page to setValidity brought up the following:
setClass("track",
representation(x="numeric", y = "numeric"))
t1 <- new("track", x=1:10, y=sort(stats::rnorm(10)))
## A valid "track" object has the same number of x, y values
validTrackObject <- function(object) {
if(length(object@x) == length(object@y)) TRUE
else paste("Unequal x,y lengths: ", length(object@x), ", ",
length(object@y), sep="")
}
## assign the function as the validity method for the class
setValidity("track", validTrackObject)
## t1 should be a valid "track" object
validObject(t1)
## Now we do something bad
t2 <- t1
t2@x <- 1:20
## This should generate an error
## Not run: try(validObject(t2))
Bottom line: if I do not add validObject to the initializer or constructor, there's little I can do. This post from Martin Morgan and Bioconductor's Seth Falcon was also interesting, but I could still always do t2@x <- 1:1111.
I guess there's not much I can do about this? Though the matrix class for example makes me wonder if there's an option.
a <- matrix(c(1:12),3,4)
cbind(a,"somechar")
# or similarly
a[1,1] <- "b"
Obviously all elements of a matrix have to be of the same class. So that's why once a character is added all elements are coerced to the common denominator, which is class character.
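The coercion can be observed directly. A matrix is an atomic vector with a dim attribute, so it can hold only one type at a time; the assignment converts the whole vector rather than rejecting the value.

```r
a <- matrix(1:12, 3, 4)
print(typeof(a))   # "integer"

a[1, 1] <- "b"
print(typeof(a))   # "character"
print(a[2, 1])     # "2": the untouched elements were converted too
```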
So my question is: how is this possible? In which way is the matrix class defined so that it can enforce the restriction "same class for all elements" under all circumstances? And is there a way to implement such a restriction for a custom class, too?
E.g. a class 'customlist' that has to be a named list, with names restricted to being exactly two characters long.
AFAIK, there isn't a way to prevent you (or your users) doing silly things with assignment, short of possibly overriding <-. Since that is primitive, and quite fundamental to R, there is a danger of breaking other things if you go down that route.
If you use reference classes then you can include accessors which allow checks before assignments are made.
trackFactory <- setRefClass(
"track",
fields = list(
x = "numeric",
y = "numeric"
),
methods = list(
initialize = function(x, y)
{
assertIsValid(x, y)
x <<- x
y <<- y
},
assertIsValid = function(x, y)
{
if(length(x) != length(y))
{
stop(
"Unequal x,y lengths: ",
toString(c(length(x), length(y)))
)
}
},
setX = function(x)
{
assertIsValid(x, .self$y)
x <<- x
},
setY = function(y)
{
assertIsValid(.self$x, y)
y <<- y
}
)
)
track1 <- trackFactory$new(1:10, runif(10))
track1$setX(1:5)
## Error in assertIsValid(x, .self$y) : Unequal x,y lengths: 5, 10
Unfortunately, you can still use direct assignment to skip the checks.
track1$x <- 1:7
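The same idea can be sketched for the S4 class from the question: route modifications through a replacement method that calls validObject(). The generic name x<- is chosen for illustration, and, as with reference classes, direct slot assignment with @ still bypasses the check.

```r
library(methods)

setClass("track",
  representation(x = "numeric", y = "numeric"),
  validity = function(object) {
    if (length(object@x) == length(object@y)) TRUE
    else paste0("Unequal x,y lengths: ", length(object@x), ", ", length(object@y))
  }
)

# Replacement generic (name chosen for illustration): x(obj) <- value
setGeneric("x<-", function(object, value) standardGeneric("x<-"))
setMethod("x<-", "track", function(object, value) {
  object@x <- value
  validObject(object)  # raises an error if the lengths no longer match
  object
})

t1 <- new("track", x = 1:10, y = sort(rnorm(10)))

outcome <- tryCatch({
  x(t1) <- 1:20
  "accepted"
}, error = function(e) "rejected")

print(outcome)        # "rejected"
print(length(t1@x))   # 10: t1 is unchanged
```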
