How to calculate arcsin(sgn(x)√|x|)? - r

I'm trying to apply an arcsine square-root transformation to data lying on [-1,1]. Using transf.arcsine from the metafor package produces NaNs when it tries to take the square root of the negative data points. Conceptually, I want to use arcsin(sgn(x)√|x|), i.e. take the square root of the absolute value, restore its original sign, then arcsine transform it. The trouble is I have no idea how to begin doing this in R. Help would be appreciated.

x <- seq(-1, 1, length = 20)
asin(sign(x) * sqrt(abs(x)))
or as a function
trans.arcsine <- function(x){
asin(sign(x) * sqrt(abs(x)))
}
trans.arcsine(x)

Help in R is just help() or help.search(). So, let's try the obvious,
> help(arcsin)
No documentation for ‘arcsin’ in specified packages and libraries:
OK, that's not good. But R must be able to do trig... let's try something even simpler.
help(sin)
That page lists all the trig functions, and I note there's a link to Math on it. Clicking that seems to provide all of the functions you need. It turns out that I could have just typed
help(Math)
also,
help.search('trigonometry')

I had a similar problem. I wanted to arcsine transform most of the dataset "logmeantd.ascvr" and approached it in this manner:
First make sure the data have been rescaled to lie between -1 and 1 (in this case they were expressed as percentages):
logmeantd.ascvr[1:12] <- logmeantd.ascvr[1:12] * 0.01
Next apply the square root function, sqrt():
logmeantd.ascvr[1:12] <- sqrt(logmeantd.ascvr[1:12])
Lastly apply the arcsine function, asin():
logmeantd.ascvr[1:12] <- asin(logmeantd.ascvr[1:12])
Note: in this instance I excluded the MEAN variable of my dataset because I wanted to apply a log function to it, log():
logmeantd.ascvr$MEAN <- log(logmeantd.ascvr$MEAN)
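The three steps above can be collapsed into one helper. This is only a minimal sketch, assuming the values are percentages in [-100, 100]; the function name asin_sqrt_pct and the toy data frame d are made up for illustration, and the sign handling is borrowed from the first answer so that negative percentages do not produce NaN.

```r
# Hypothetical helper: signed arcsine square-root transform of percentages.
asin_sqrt_pct <- function(p) {
  x <- p / 100                     # rescale percentages to [-1, 1]
  asin(sign(x) * sqrt(abs(x)))     # signed square root, then arcsine
}

# Toy data standing in for logmeantd.ascvr (not the original dataset)
d <- data.frame(a = c(-100, -25, 0, 25, 100),
                b = c(0, 1, 4, 9, 16),
                MEAN = c(1, 2, 3, 4, 5))

cols <- c("a", "b")                # columns to arcsine transform
d[cols] <- lapply(d[cols], asin_sqrt_pct)
d$MEAN <- log(d$MEAN)              # MEAN gets a log transform instead
```

Applying the transform through lapply() keeps the column selection in one place, so the same helper can be reused on any subset of columns.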

Related

Determining the mean center and standard distance of a dataset in R

I have a dataset called mypoints and I have created a polygon and plotted the points as below:
mypoints=read.csv("d:\\data\\venus.csv",header = T)
mypoints
minx=min(mypoints[,1])
maxx=max(mypoints[,1])
miny=min(mypoints[,2])
maxy=max(mypoints[,2])
mypolygon=cbind(c(minx,maxx,maxx,minx),c(miny,miny,maxy,maxy))
plot(mypoints)
polygon(mypolygon)
I now want to write a function that calculates both the mean center and standard distance for mypoints. I then need to plot the standard distance as a circle centered on the mean center of all points with the radius equal to the standard distance. Note that the last expression evaluated in a function becomes the return value, the result of invoking the function.
So far:
#I think this how I calculate the mean center for x and y:
x1=sum(mypoints[,1])/length(mypoints[,1])
y1=sum(mypoints[,2])/length(mypoints[,2])
#This is the formula I was shown for standard distance:
sd.mypoints=sqrt(sum(x1+y1)/n)
#This is the formula I was shown for creating the circle:
symbols(sd.mypoints[1],sd.mypoints[2],sd.mypoints$sd,add=T,inches=F)
#This is the error that I get when I run the circle formula:
Error in sd.mypoints$sd : $ operator is invalid for atomic vectors
I have found it easier to find the Nearest Neighbor, do KDE, Ghat, and Fhat for this dataset than to figure this out. I am sure there is an easy solution for this but I just can't seem to get it. Third class in R and it has been a lot of fun up to this point.
You have the line
symbols(sd.mypoints[1],sd.mypoints[2],sd.mypoints$sd,add=T,inches=F)
in your code. As said in the comments, sd.mypoints is not a data.frame, so subsetting it with sd.mypoints$sd causes the error you see.
From the documentation of symbols, which you can access with ?symbols, you'll see that exactly one of the shape arguments must be specified; for circles that is the circles argument, which tells the function what sort of figure it is drawing.
EDIT:
Also, please notice that the x and y values you pass to symbols are not the ones you already calculated. So you need to replace that line with:
symbols(x1, y1,circles = sd.mypoints,add=T,inches=F)
Notice the use of x1 and y1. I can see the plot now.
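For the function the question asks for, a minimal sketch follows. It assumes that "standard distance" means the root mean squared distance of the points from the mean center, dividing by n; the name mean_center_sd and the toy points are made up for illustration and stand in for venus.csv.

```r
# Hypothetical helper: mean center and standard distance of 2-D points.
# `pts` is assumed to hold x in column 1 and y in column 2.
mean_center_sd <- function(pts) {
  x <- pts[, 1]
  y <- pts[, 2]
  cx <- mean(x)
  cy <- mean(y)
  # root mean squared distance of the points from the mean center
  sdist <- sqrt(sum((x - cx)^2 + (y - cy)^2) / length(x))
  c(x = cx, y = cy, sd = sdist)    # last expression is the return value
}

# Toy points, not the original data
pts <- cbind(x = c(0, 2, 2, 0), y = c(0, 0, 2, 2))
res <- mean_center_sd(pts)

plot(pts, asp = 1, xlim = c(-1, 3), ylim = c(-1, 3))
symbols(res["x"], res["y"], circles = res["sd"], add = TRUE, inches = FALSE)
```

Setting asp = 1 keeps the circle round on screen; without it the x and y axes may be scaled differently.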

How to create a random walk in R that goes in different directions than -1 or +1?

Consider this two‐dimensional random walk:
where Zt, Wt, t = 1, 2, 3, … are independent and identically distributed standard normal random variables.
I am having problems in finding a way to simulate and plot the sample path of (X,Y) for t = 0,1, … ,100. I was given a sample:
The following code is an example of the way I am used to plot random walks in R:
set.seed(13579)
r<-sample(c(-1,1),size=100,replace=T,prob=c(0.5,0.5))
r<-c(10,r)
(w<-cumsum(r))
w<-as.ts(w)
plot(w,main="random walk")
I am not sure how to achieve this.
The problem I am having is that this kind of code gives a simpler result, with a line that goes either up or down, -1 or +1:
while the plot I need to create also goes from left to right and vice versa.
Could you help me correct the code I know so that it fits my task, or suggest a smarter way to go about it? It would be greatly appreciated.
Cheers!
Instead of using sample, you need to use rnorm(100) to draw 100 samples from a standard normal distribution. Since the walk starts at [0, 0], we need to append a 0 at the start and do a cumsum on the result, i.e. cumsum(c(0, rnorm(100))).
We want to do this for both the x and y variables, then plot. The whole thing can be done in a single line of code in base R:
plot(x = cumsum(c(0, rnorm(100))), y = cumsum(c(0, rnorm(100))), type = 'l')
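The same idea can be written out step by step, which keeps the path available for later inspection. This is a sketch that assumes each coordinate is the running sum of its own standard normal steps, starting from (0, 0); the seed is simply the one used in the question.

```r
set.seed(13579)                  # seed reused from the question, for reproducibility
n <- 100
x <- cumsum(c(0, rnorm(n)))      # X_0 = 0, then add the standard normal steps Z_t
y <- cumsum(c(0, rnorm(n)))      # Y_0 = 0, then add the standard normal steps W_t

plot(x, y, type = "l", main = "2-D random walk")
points(x[1], y[1], pch = 19)     # mark the starting point (0, 0)
```

Storing x and y before plotting means the path for t = 0, 1, …, 100 can be checked or reused, which the one-line version does not allow.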

how to intersect an interpolated surface z=f(x,y) with z=z0 in R

I found some posts and discussions about the above, but I'm not sure... could someone please check if I am doing anything wrong?
I have a set of N points of the form (x,y,z). The x and y coordinates are independent variables that I choose, and z is the output of a rather complicated (and of course non-analytical) function that uses x and y as input.
My aim is to find a set of values of (x,y) where z=z0.
I looked up this kind of problem in R-related forums, and it appears that I need to interpolate the points first, perhaps using a package like akima or fields.
However, it is less clear to me: 1) if that is necessary, or the basic R functions that do the same are sufficiently good; 2) how I should use the interpolated surface to generate a correct matrix of the desired (x,y,z=z0) points.
E.g. this post seems somewhat related to the problem I am describing, but it looks extremely complicated to me, so I am wondering whether my simpler approach is correct.
Please see below some example code (not the original one, as I said the generating function for z is very complicated).
I would appreciate if you could please comment / let me know if this approach is correct / suggest a better one if applicable.
df <- merge(data.frame(x=seq(0,50,by=5)),data.frame(y=seq(0,12,by=1)),all=TRUE)
df["z"] <- (df$y)*(df$x)^2
ta <- xtabs(z~x+y,df)
contour(ta,nlevels=20)
contour(ta,levels=c(1000))
#why are the x and y axes [0,1] instead of showing the original values?
#and how accurate is the algorithm that draws the contour?
li2 <- as.data.frame(contourLines(ta,levels=c(1000)))
#this extracts the contour data, but all (x,y) values are wrong
require(akima)
s <- interp(df$x,df$y,df$z)
contour(s,levels=c(1000))
li <- as.data.frame(contourLines(s,levels=c(1000)))
#at least now the axis values are in the right range; but are they correct?
require(fields)
image.plot(s)
#fancier, but same problem - are the values correct? better than the akima ones?
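On the axis question in the code comments: when contour() or contourLines() is given only a matrix, the rows and columns are placed on an evenly spaced grid from 0 to 1 by default, which would explain the [0,1] axes. A sketch of one way around this, passing the grid coordinates explicitly; the names xs, ys, zmat and cl are made up here, and the example function is the same toy z = y * x^2 as above. As far as I know the contour positions are found by interpolating between neighbouring grid values, so a finer grid should give a closer curve.

```r
xs <- seq(0, 50, by = 5)
ys <- seq(0, 12, by = 1)
zmat <- outer(xs, ys, function(x, y) y * x^2)    # rows follow xs, columns follow ys

contour(xs, ys, zmat, levels = 1000)             # axes now in the original units

# contourLines() returns a list with one element per contour segment
cl <- contourLines(xs, ys, zmat, levels = 1000)
li <- do.call(rbind, lapply(cl, function(s) {
  data.frame(x = s$x, y = s$y, z = s$level)
}))
head(li)
```

Because the data already sit on a regular grid, this needs no interpolation package; akima's interp() is mainly useful when the (x, y) points are scattered irregularly.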

R: Writing a recursive function for a Random Walk (initial values)

I'm a new user to R, and I am trying to create a function that will simulate a random walk. The issue for me is trying to integrate some initial values smoothly. Say I have this basic function.
y(t) = y(t-2) + eps(t)
Epsilon (or eps(t)) will be the randomness factor. I want to define y(-1)=0, and y(0)=0.
Here is my code:
ran.walk=function(n){ # 'n' steps will be the input
eps=rnorm(n) # creates a vector taking random values from N(0,1)
y= c(eps[1], eps[2]) # this will set up my initial vector
for (i in 3:n){
ytemp = y[i-2] + eps[i] ## !!! problem is here. Details below !!!
y= c(y, ytemp)
}
return(y)
}
I'm trying to get this start adding y3, y4, y5, etc, but I think there is a flaw in this design... I'm not sure if I should just set up two separate lines, with an if statement: testing if n is even or odd, perhaps with:
if (i %% 2 == 1) # using modulus
Since,
y1= eps1,
y2= eps2,
y3= y1 + eps3,
y4= y2 + eps4,
y5= y3 + eps5 and so on...
Currently, I see the error in my code.
I have y1, and y2 concatenated, but I don't think it knows how to incorporate y[1]
Can I define beforehand somehow y[-1]=0, and y[0]=0 ? I tried this also and got an error.
Thank you kindly in advance for any assistance. This is my first time attempting a for loop with recursion.
-N (sorry for any formatting issues, I had a lot of problems getting this question to go through)
I found that your odd and even series are independent of each other. Assuming that is the case, I just split the problem into two columns and use cumsum to get the random walk. The final data frame includes the random numbers and the random walk, so you can check that it is working properly.
Hoping it helps
ran.walk=function(n) {
eps=rnorm(ceiling(n / 2)*2)
dim(eps) <- c(2,ceiling(n/2))
# since each series is independent, we can tally each one in its own
eps2 <- apply(eps, 1, cumsum)
# and just reorganize it
eps2 <- as.numeric(t(eps2))
rndwlk <- data.frame(rnd=as.numeric(eps), walk=eps2)
# remove the extra value if needed
rndwlk <- rndwlk[1:n,]
return(rndwlk)
}
ran.walk(13)
After taking a break with my piano, it came to me. It's funny how simple the answer becomes once you discover it... almost trivial.
Setting the initial value to be a vector, that is:
[y(1) = y(-1) + eps(1), y(2)= y(0) + eps(2)]
everything works out. It is still true that the evens and odds don't interact, but there is no reason to specify any of that.
The method to split the iterations with modulus, then concatenating it back into the main vector would also work, but is unnecessary and more complicated. Shorter is better for users and computers. As Einstein said, make it as simple as possible, but no simpler.
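The self-answer does not show its final code, so the following is only a sketch of what it describes: because y(-1) = y(0) = 0, the first two values are just eps[1] and eps[2], and every later value adds its own eps to the value two steps back. Preallocating y is my own choice, not something stated in the answer.

```r
ran.walk <- function(n) {
  eps <- rnorm(n)                  # n draws from N(0, 1)
  y <- numeric(n)                  # preallocate instead of growing with c()
  for (i in seq_len(n)) {
    # y(-1) = y(0) = 0, so steps 1 and 2 have nothing to add to eps
    prev <- if (i > 2) y[i - 2] else 0
    y[i] <- prev + eps[i]
  }
  y                                # last expression is the return value
}

set.seed(1)
ran.walk(6)
```

Using seq_len(n) rather than 3:n also keeps the function safe for n smaller than 3, where 3:n would count backwards.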

Get summary vectors of raster cell centers in R

I want to extract summary vectors that contain the coordinates for the centers of the different cells in a raster. The following code works but I believe involves an n-squared comparison operation. Is there a more efficient method? Not seeing anything obvious in {raster}'s guidance.
require(raster)
r = raster(volcano)
pts = rasterToPoints(r)
x_centroids = unique(pts[,1])
y_centroids = unique(pts[,2])
To get the centers of the raster cells, you should use the functions xFromCol, yFromRow and friends (see also the help pages)
In this case, you get exactly the same result as follows:
require(raster)
r <- raster(volcano)
x_centers <- xFromCol(r)
y_centers <- yFromRow(r)
Note that these functions do little more than read the extent and the resolution of the raster. From these values, they calculate the sequence of centers as follows (rows are numbered from the top, so the y centers run downward from ymax):
xmin(r) + (seq_len(ncol(r)) - 0.5) * xres(r)
ymax(r) - (seq_len(nrow(r)) - 0.5) * yres(r)
Still, it is better to use the functions mentioned above, as they do a few more safety checks.
