Programming R functions that accommodate spelling variations - r

Suppose you want to program an R function that uses an argument with different possible spellings (e.g., using the arguments centre and center from British and US English respectively). You want the function to allow the user to use either spelling, but treated essentially as a single argument. What is "best practice" for how to structure a function like this?

Whilst I cannot speak to best practice, I am going to give one possible structure for this task, which seems to me to accommodate spelling variations. This is the method that is used in some of the base R functions for probability distributions when there are alternative parameterisations of the distribution (e.g., in the gamma distribution you can specify a scale or rate parameter). In these cases the function includes both arguments and then uses a clever set of commands to ensure that the function will use whichever argument is specified by the user.
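As a quick illustration of the base R behaviour mentioned above, the sketch below calls dgamma with either rate or scale and confirms the two agree. The wrapper gamma_mean is a hypothetical name introduced only to mimic the default-argument pattern; it is not a base R function.

```r
# dgamma() accepts either 'rate' or 'scale'; the two parameterisations agree
d_rate  <- dgamma(1.5, shape = 2, rate = 4)
d_scale <- dgamma(1.5, shape = 2, scale = 1 / 4)
isTRUE(all.equal(d_rate, d_scale))

# Hypothetical wrapper mimicking the same pattern: 'scale' defaults to 1/rate,
# so the body only ever needs to work with 'scale'
gamma_mean <- function(shape, rate = 1, scale = 1 / rate) {
  shape * scale
}

gamma_mean(2, rate = 4)      # 'scale' is computed lazily from 'rate'
gamma_mean(2, scale = 0.25)  # the supplied 'scale' is used directly
```

The key design point is that the default for the second argument refers to the first, and R only evaluates that default when the second argument is actually needed.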
To create a function using centre and center as spelling variations for the same argument, you could use the structure shown below. Note that the computation in this method is done using the second spelling variation (center), and this latter argument is set equal to the former argument (centre) in the default values for the function. This ensures that the function will use the user-specified argument.
MY_FUNCTION <- function(centre = 0, center = centre, ...) {
  # Check inputs
  if (!missing(centre) && !missing(center)) {
    if (sum((centre - center)^2) < 1e-15) {
      warning("specify 'centre' or 'center' but not both")
    } else {
      stop("Error: specify 'centre' or 'center' but not both")
    }
  }
  # Do some computations using the 'center' argument and then give output
}
Here is an example of this function structure applied to generate a function DISTANCE that computes the Euclidean distance between a point and a centre/center.
DISTANCE <- function(point, centre = 0, center = centre, ...) {
  # Check inputs
  if (!missing(centre) && !missing(center)) {
    if (sum((centre - center)^2) < 1e-15) {
      warning("specify 'centre' or 'center' but not both")
    } else {
      stop("Error: specify 'centre' or 'center' but not both")
    }
  }
  sqrt(sum((point - center)^2))
}
In the examples below, we demonstrate that this structure works properly regardless of whether the user uses the centre argument or the center argument. In the pathological case where the user specifies both arguments (which they really shouldn't) the function compares the values center and centre; if the values are close together then the function continues but gives the user a warning not to specify both arguments; if the values are not close together then the function stops and gives an error message telling the user not to specify both arguments.
POINT <- c(4, 2, 3);
CENT <- c(0, 3, 1);
DISTANCE(point = POINT, centre = CENT);
[1] 4.582576
DISTANCE(point = POINT, center = CENT);
[1] 4.582576
DISTANCE(point = POINT, centre = CENT, center = CENT);
[1] 4.582576
Warning message:
In DISTANCE(point = POINT, centre = CENT, center = CENT) :
specify 'centre' or 'center' but not both
DISTANCE(point = POINT, centre = CENT, center = CENT + 1);
Error in DISTANCE(point = POINT, centre = CENT, center = CENT + 1) :
Error: specify 'centre' or 'center' but not both

Related

Convert P to Z in log space

Given a -log10(P) value, I'd like to calculate the Z score in log space, how would I do that?
So, given the following code, how to recode the last line so that it calculates Z from log10P in the log space?
Z=10
log10P = -1*(pnorm(-abs(Z),log.p = T)*1/log(10) + log10(2))
Z== -1*(qnorm(10^-log10P/2)) # <- this needs to be in log space.
qnorm also has a log.p argument analogous to pnorm's, so you can reverse the operations that you used to get log10P in the first place (it took me a couple of tries to get this right ...)
I rearranged your log10P calculation slightly.
log10P_from_Z <- function(Z) {
  abs((pnorm(-abs(Z), log.p = TRUE) + log(2)) / log(10))
}
Z_from_log10P <- function(log10P) {
  -1 * qnorm(-(log10P * log(10)) - log(2), log.p = TRUE)
}
We can check the round-trip accuracy (i.e. convert from Z to -log10(p) and back, and see how close we get to the original value). This works perfectly for values around 20, but does incur a little bit of round-off error for large values (I would have to look more carefully to see if there's anything that can be remedied here).
zvec <- seq(20, 400)
err <- sapply(zvec, function(z) {
  abs(Z_from_log10P(log10P_from_Z(z)) - z)
})
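To see why the log.p route matters, here is a small sketch that contrasts it with the direct inversion from the question. The name Z_naive and the test value 400 are my own choices for illustration; the other two functions are repeated from above so the block runs on its own.

```r
# Log-space conversions, as defined in the answer above
log10P_from_Z <- function(Z) {
  abs((pnorm(-abs(Z), log.p = TRUE) + log(2)) / log(10))
}
Z_from_log10P <- function(log10P) {
  -1 * qnorm(-(log10P * log(10)) - log(2), log.p = TRUE)
}

# Direct inversion from the question (hypothetical name): works on P itself
Z_naive <- function(log10P) {
  -1 * qnorm(10^-log10P / 2)
}

big <- 400            # P = 10^-400 cannot be represented as a double
Z_naive(big)          # not finite, because 10^-400 underflows to 0
Z_from_log10P(big)    # finite, because P is never formed explicitly

# Round trip for a moderate Z: the difference should be tiny
abs(Z_from_log10P(log10P_from_Z(10)) - 10)
```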

How do I find level sets for a function on R^d, in R?

I am looking for an efficient way to find level sets of an arbitrary function from [0,1]^d to R.
To be clear: by a level set I mean the set of points in [0,1]^d that are mapped to the same value.
In all of my applications, the level sets are connected. They are lines, planes, or some higher-dimensional hyperplane, but apart from the connectedness, they do not satisfy any general criterion.
I am looking for a subset of the level set that has a high density everywhere.
When I limit my functions to 2d, I can use the function contourLines from the package grDevices, which does exactly what I am looking for:
test <- function(x, y) {
  y - (x^2 - 6*x + 9)
}
Mat = matrix(0, 100, 100)
x <- seq(-10, 10, length.out = 100)
y <- seq(-10, 10, length.out = 100)
for (i in 1:100) {
  for (j in 1:100) {
    Mat[i, j] = test(x[i], y[j])
  }
}
cont <- contourLines(x, y, Mat, levels = 0)
Unfortunately I have not been able to find a function that does the same trick in higher dimensions.
To give a bit more context to the problem:
I have a 'wild' function, of which I hardly know anything, but I can easily evaluate it at any point in R^d. This function divides the R^d (or [0,1]^d, to make it a bit simpler), into a positive part (level sets larger than 0), and a negative part (level sets smaller than 0). I am looking for the boundary separating the two, which is the level set for 0.
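One possible way to attack the boundary problem described above, offered only as a rough sketch and not as an established method from this thread: draw random pairs of points in [0,1]^d, and whenever the function has opposite signs at the two ends, use uniroot along the connecting segment to locate a zero. This assumes the function is continuous. The helper name zero_level_points, its parameters, and the toy function f are all hypothetical.

```r
# Hypothetical helper, not from the original post: sample points on {f = 0}
# by root-finding along random segments in [0,1]^d. Assumes f is continuous.
zero_level_points <- function(f, d, n_pairs = 500, tol = 1e-8) {
  pts <- vector("list", n_pairs)
  for (i in seq_len(n_pairs)) {
    a <- runif(d)
    b <- runif(d)
    if (f(a) * f(b) < 0) {
      # f changes sign between a and b, so the segment crosses the boundary
      g <- function(t) f(a + t * (b - a))
      t0 <- uniroot(g, interval = c(0, 1), tol = tol)$root
      pts[[i]] <- a + t0 * (b - a)
    }
  }
  do.call(rbind, pts)  # NULL entries (pairs without a sign change) are dropped
}

set.seed(1)
f <- function(x) sum(x) - 1.5   # toy function whose zero set is a plane in d = 3
boundary <- zero_level_points(f, d = 3)
max(abs(apply(boundary, 1, f)))  # how far the sampled points are from f = 0
```

The sampled points are not guaranteed to cover the level set evenly; increasing n_pairs gives more points, but the density depends on how the random segments happen to cross the boundary.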

Plotting functions of functions

So I'm kind of new to R, and I need to plot some functions. My understanding of curve in R is that it requires a function that has x as its single input. Since my functions are all just different representations of the same main function, I first thought I would create the main function, and then I would define each specific function individually.
# The principal function
puiss <- function(theta, inf, sup) {
for(k in inf:sup) {
total += (choose(30,k) * (theta^k) * ((1-theta)^(30-k)))
}
}
# The specific functions I need to draw on the same plot
p1 <- function(x) { puiss(x,2,13) }
p2 <- function(x) { puiss(x,3,14) }
p3 <- function(x) { puiss(x,3,13) }
# Can't even get to trace just a single one... :'(
curve(p1,
0, 1, # from 0 to 1
main="puissance(theta)", # title
xlab="x", ylab="theta") # axes
curve(p2, add=T) # adding the other function
curve(p3, add=T) # adding the other function
I get this error:
'expr' did not evaluate to an object of length 'n'
I've tried multiple approaches, but this one seemed to be the closest one to what it should have been.
Among other alternatives, I've tried:
changing from <- to = for the specific functions
using no {} (brackets) for the specific functions
plugging the for loop directly into the curve and replacing theta by x and inf:sup appropriately
trying to use p1(x) inside curve
I've also read that sometimes Vectorize() is needed, so I've tried Vectorize(p1) inside curve
What am I doing wrong?
It might help to disclose that my main function is just a Binomial(30, theta)'s mass function (probability) evaluated in different regions (the summation within the boundaries, my sigma which is a for loop because I couldn't figure out how to properly create a sigma function in R). In other words, it is a cumulative distribution function.
Ultimately, I'm trying to plot the 3 specific functions together on the same plot.
Thanks for the help! :)
It seems you are using some Python (or similar) code in your function definition.
Here is the R version of it, which for me will plot the results when calling curve.
puiss <- function(theta, inf, sup) {
  total = 0
  for (k in inf:sup) {
    # "+=" does not work for R
    total <- total + (choose(30, k) * (theta^k) * ((1 - theta)^(30 - k)))
  }
  # you need to use parentheses around total
  return(total)
}
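Since the question describes the function as a sum of Binomial(30, theta) mass values between two bounds, the same quantity can also be written with pbinom. The sketch below is an alternative I am adding for comparison, not part of the answer above; puiss_cdf is a hypothetical name.

```r
# Loop version, as in the answer above
puiss <- function(theta, inf, sup) {
  total <- 0
  for (k in inf:sup) {
    total <- total + choose(30, k) * theta^k * (1 - theta)^(30 - k)
  }
  total
}

# Hypothetical alternative: P(inf <= X <= sup) as a difference of binomial CDFs
puiss_cdf <- function(theta, inf, sup) {
  pbinom(sup, size = 30, prob = theta) - pbinom(inf - 1, size = 30, prob = theta)
}

# Both versions are vectorised in theta and should agree numerically
theta <- seq(0, 1, by = 0.1)
all.equal(puiss(theta, 2, 13), puiss_cdf(theta, 2, 13))
```

Because both versions accept a vector for theta, either can be wrapped as in the question, e.g. p1 <- function(x) puiss_cdf(x, 2, 13), and passed to curve.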

Error in `[<-`(`*tmp*`, i, succ, value = 1) : subscript out of bounds

I have to simulate an image with a white crack on a black background. So I defined a function that adds to a matrix with all elements equal to zero some consecutive points equal to one.
The function is the following:
crepa<-function(matrice) {
start<-sample(1:ncol(matrice),1)
matrice[1,start]<-1
for (i in 2:nrow(matrice)) {
alpha<-sample(c(-1,0,1),1)
succ<-start+alpha
if (succ==(ncol(matrice)+1)) succ==ncol(matrice)
if (succ==0) succ==1
matrice[i,succ]<-1
start<-succ
}
matrice<-as.matrix(matrice)
}
To check whether the function works well, I applied it over and over again to the following matrix:
m<-matrix(0,64,64)
imma<-crepa(m)
par(mar=rep(0,4))
image(t(imma), axes = FALSE, col = grey(seq(0, 1, length = 256)))
In most cases the result is correct. However, in a few cases I run into this error:
Error in `[<-`(`*tmp*`, i, succ, value = 1) : subscript out of bounds
These two lines:
if (succ==(ncol(matrice)+1)) succ==ncol(matrice)
if (succ==0) succ==1
Should be:
if (succ==(ncol(matrice)+1)) succ=ncol(matrice)
if (succ==0) succ=1
In case you still can't see it, you've used the equality test == when you should use assignment = or <-.
The error message told me it had to be the element going off the matrix, so I started printing out the values of succ and then noticed it wasn't being reset within the right range, and only then did I spot the mistake. I probably looked at the code ten times without noticing. I also figured that kind of error was more likely with a small matrix, and so tested with a 6x6 matrix which meant I could be more likely to see it than with a 64x64!
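For completeness, here is a sketch of the whole function with the fix applied, followed by the kind of small-matrix stress test described above. Clamping with min and max is my own variation on the two corrected if statements, and the explicit return value and the seed are assumptions added for the example.

```r
# Corrected version of the question's function, with the column index clamped
crepa <- function(matrice) {
  start <- sample(seq_len(ncol(matrice)), 1)
  matrice[1, start] <- 1
  for (i in 2:nrow(matrice)) {
    alpha <- sample(c(-1, 0, 1), 1)
    # keep the next column inside 1..ncol(matrice)
    succ <- min(max(start + alpha, 1), ncol(matrice))
    matrice[i, succ] <- 1
    start <- succ
  }
  matrice
}

# Stress test on a small matrix, where hitting an edge is likely
set.seed(42)
results <- replicate(500, crepa(matrix(0, 6, 6)), simplify = FALSE)
# every row of every simulated matrix should contain exactly one crack pixel
all(vapply(results, function(m) all(rowSums(m) == 1), logical(1)))
```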

Bisection method error in R. Specifying midpoint and length

Hi, I'm trying to write the bisection method in R. I know there have been questions like this before, but my specific question is: how could I write the function if I specify the midpoint and length instead of a and b? Here is my code, but it gives me lots of errors.
The reason I want to specify the midpoint and length is that, if the initial interval does not contain the root, the search could widen the interval, e.g. from (pt-h/2, pt+h/2) to (pt-3h/2, pt+3h/2).
Here is my code.
f1<-function(x){exp(x)-1}
find<-function(pt,h,tol = 0.0001){
if f1(pt-h/2)*f1(pt+h/2)<0{
time = 1
while (h>tol) {
if f1(pt)*f1(pt-h/2)<0 pt+h/2<-pt
if f1(pt)*f1(pt+h/2)<0 pt-h/2<-pt
time<-time+1
}
cat("Iteration time is ",time,"\n")
cat("The root is ",md,"\n")
cat("The real y of root is ",fmd,"\n")
}
}
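The code above does not parse as written: if conditions need parentheses, an expression such as pt+h/2 cannot be the target of an assignment, and md and fmd are never defined. Below is one possible rewrite, a sketch under my own assumptions rather than a definitive answer. The names find_root and max_expand are hypothetical, the function is passed in as an argument, and the widening step follows the interval example in the question (length h becomes 3h).

```r
f1 <- function(x) exp(x) - 1

# Hypothetical rewrite: bisection parameterised by midpoint 'pt' and length 'h'
find_root <- function(f, pt, h, tol = 1e-4, max_expand = 50) {
  # Widen the interval around pt until f changes sign across it
  expand <- 0
  while (f(pt - h / 2) * f(pt + h / 2) > 0) {
    if (expand >= max_expand) stop("no sign change found")
    h <- 3 * h  # (pt - h/2, pt + h/2) becomes (pt - 3h/2, pt + 3h/2)
    expand <- expand + 1
  }
  iter <- 0
  while (h > tol) {
    if (f(pt - h / 2) * f(pt) <= 0) {
      pt <- pt - h / 4  # root is in the left half; move to its midpoint
    } else {
      pt <- pt + h / 4  # root is in the right half; move to its midpoint
    }
    h <- h / 2
    iter <- iter + 1
  }
  list(root = pt, f_root = f(pt), iterations = iter)
}

res <- find_root(f1, pt = 1, h = 1)
cat("Iteration count is", res$iterations, "\n")
cat("The root is", res$root, "\n")
cat("The value of f at the root is", res$f_root, "\n")
```

Tracking only pt and h keeps the midpoint/length parameterisation from the question: each step halves h and shifts pt by a quarter of the old length towards the half that contains the sign change.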
