I have predicted values and confidence intervals that I want to add to my 3D plot using trans3d, but I get an error on the line that uses seq. I already tried length(z.bin) and read other possible solutions, but it's still not working.
Error in seq.default(lowerCI, upperCI, length.out = 25) :
'from' must be of length 1
I hope you can help me to fix my code. Here are the predicted values (z.bin), upper CI (UCI) and lower CI (LCI):
z.bin= c(0.0293498087331418, 0.090245714112389, 0.184180408140189, 0.288479689911685,
0.380290727519617, 0.447221380019439, 0.486749948207999, 0.515460732539617,
0.524544278048373, 0.517863012982977, 0.499015552138662, 0.471040830332284,
0.436384769878271, 0.39696995466237, 0.354295721949241, 0.309542936297033,
0.263681366413638, 0.217589473510825, 0.172201272125033, 0.128688774135519,
0.0886552840745102, 0.0542241604227149, 0.0277504883386967, 0.0108213094005216,
0.00277584412160996)
UCI=c(0.0366603230533126, 0.0902131425743432, 0.190710608825939,
0.329281535177887, 0.37359325824382, 0.49083302601992, 0.502923852215148,
0.532414036794941, 0.542594424500199, 0.544876477822669, 0.513975201348124,
0.500360540087923, 0.460641689148807, 0.415363280410005, 0.358399020245284,
0.321189810843667, 0.285678220416678, 0.234306786216362, 0.185151688725085,
0.141800528101782, 0.0848830167493455, 0.0596895934068413, 0.034797331186028,
0.0136423698337293, 0.00416130620917585)
LCI=c(0.0203880237502624, 0.0639803379126716, 0.15252099326726, 0.279883133515488,
0.321969495145084, 0.433138773211774, 0.445700330934391, 0.474863237969827,
0.485779389412345, 0.489219946727086, 0.461012139808171, 0.449297954511444,
0.412682077834953, 0.370799794091489, 0.317884618001687, 0.283779930784182,
0.251320227770169, 0.20400383106003, 0.158982316141284, 0.119627373509671,
0.0683623411169277, 0.0464255905587446, 0.0252020843583765, 0.00810835262770212,
0.0014811836711362)
Code (please don't run the points() and trans3d() lines; res2 is not included, but they're there to show you the loop I want to use to create the CI bars):
y.bin <- rep(1,25)
x.bin <- seq(-10,10,length.out = 25)
# points(trans3d(x.bin, y.bin, z.bin, pmat = res2), col = 1, pch = 16)
for (i in 1:length(z.bin)) {
  lowerCI <- LCI
  upperCI <- UCI
  CI.bar <- seq(lowerCI, upperCI, length.out = 25)
  # lines(trans3d(x.bin[i], y.bin[i], z = CI.bar, pmat = res2), col = 1, lwd = 2)
}
Looks like you just need to index the LCI and UCI values; otherwise you're feeding seq() the entire vectors:
for (i in 1:length(z.bin)) {
  lowerCI <- LCI[i]
  upperCI <- UCI[i]
  CI.bar <- seq(lowerCI, upperCI, length.out = 25)
  # lines(trans3d(x.bin[i], y.bin[i], z = CI.bar, pmat = res2), col = 1, lwd = 2)
}
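For completeness, here is a minimal self-contained sketch of the whole picture. The flat dummy surface handed to persp() is my stand-in (the question's real surface and res2 are not included), just so a projection matrix exists to draw against:
# Dummy flat surface so persp() returns a projection matrix (res2);
# the real surface from the question is not included.
res2 <- persp(x = c(-10, 10), y = c(0, 2), z = matrix(0, 2, 2),
              zlim = c(0, 1), theta = 35, phi = 50, col = "white", border = NA)
x.bin <- seq(-10, 10, length.out = 25)
y.bin <- rep(1, 25)
# predicted points, then a vertical CI bar at each point
points(trans3d(x.bin, y.bin, z.bin, pmat = res2), col = 1, pch = 16)
for (i in seq_along(z.bin)) {
  CI.bar <- seq(LCI[i], UCI[i], length.out = 25)
  lines(trans3d(x.bin[i], y.bin[i], CI.bar, pmat = res2), col = 1, lwd = 2)
}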
I am trying to use a for-loop over a range of positive and negative values and then plot the results. However, I'm having trouble getting R to plot the correct values, since the negative values seem to screw up the indexing.
More precisely, the code I am running is:
# Setup objects
R = (1:20)
rejection = rep(NA, 20)
t = seq(from = -10, to = 10, by = 1)
avg_rej_freq = rep(NA, 21)
# Test a hypothesis for each possible value of x and each replication
for (x in t) {
  for (r in R) {
    # Generate 1 observation from N(x,1)
    y = rnorm(1, x, 1)
    # Take the average of this observation
    avg_y = mean(y)
    # Test this observation using the test we found in part a
    if (avg_y >= 1 + pnorm(.95)) {rejection[r] = 1}
    if (y < 1 + pnorm(.95)) {rejection[r] = 0}
  }
  # Calculate the average rejection frequency across the 20 samples
  avg_rej_freq[x] = mean(rejection)
}
# Plot the different values of x against the average rejection frequency
plot(t, avg_rej_freq)
The resulting graph should look something like this
# Define the rejection probability for n=1
rej_prob = function(x)(1-pnorm(1-x+qnorm(0.95)))
# Plot it
curve(rej_prob,from = -10, to = 10, xlab = expression(theta),
ylab = "Rejection probability")
...but there's clearly something wrong with my code that is shifting the positive values on the graph over to the left.
Any help on how to fix this would be much appreciated!
Yep, as you suspected the negative indices are causing problems. R doesn't know how to store something at the "negative first" position of a vector: a negative index means "drop that element", and index 0 is silently ignored, so those assignments are lost. Instead, use seq_along to produce a vector of all-positive indices and loop over those:
# Setup objects
R = (1:20)
rejection = rep(NA, 20)
t = seq(from = -10, to = 10, by = 1)
avg_rej_freq = rep(NA, 21)
# Test a hypothesis for each possible value of x and each replication
for (x in seq_along(t)) {
  for (r in R) {
    # Generate 1 observation from N(t[x], 1)
    # Now we look up the value of t at index x rather than using x directly
    y = rnorm(1, t[x], 1)
    # Take the average of this observation
    avg_y = mean(y)
    # Test this observation using the test we found in part a
    if (avg_y >= 1 + pnorm(.95)) {rejection[r] = 1}
    if (y < 1 + pnorm(.95)) {rejection[r] = 0}
  }
  # Calculate the average rejection frequency across the 20 samples
  avg_rej_freq[x] = mean(rejection)
}
# Plot the different values of x against the average rejection frequency
plot(t, avg_rej_freq)
which produces the following plot:
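(A side note from me, not part of either answer: a tiny illustration of the indexing behaviour that caused the problem.)
v <- rep(NA, 3)
v[-1] <- 99   # a negative index means "everything except that element"
v[0] <- 99    # index 0 is silently ignored
v             # NA 99 99 -- nothing was ever stored "at" position -1 or 0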
Not sure why you want to simulate the vectorized function pnorm() using for loops; still, here is your code with the mistakes corrected (check the comments):
# Test a hypothesis for each possible value of x and each replication
for (x in t) {
  for (r in R) {
    # Generate 1 observation from N(x,1)
    y = rnorm(1, x, 1)
    # no need to take an average since you have a single observation
    # Test this observation using the test we found in part a
    rejection[r] = ifelse(y >= 1 + pnorm(.95), 1, 0)
  }
  # Calculate the average rejection frequency across the 20 samples
  # R vector indices start at 1, so shift x so that negative values
  # map to positive indices
  avg_rej_freq[x - min(t) + 1] = mean(rejection)
}
# Define the rejection probability for n=1
rej_prob = function(x)(1-pnorm(1-x+qnorm(0.95)))
# Plot it
curve(rej_prob,from = -10, to = 10, xlab = expression(theta),
ylab = "Rejection probability")
# plot your points
points(t, avg_rej_freq, pch=19, col='red')
Not sure why the for loops etc.; what you are doing can be collapsed into one line. The rest of the code is taken from @Sandipan Dey:
R <- 20
t <- seq(from = -10, to = 10, by = 1)
#All the for-loops collapsed into this one line:
avg_rej_freq <- rowMeans(matrix(rnorm(R * length(t), t), 21) >= 1 + pnorm(.95))
rej_prob <- function(x) 1 - pnorm(1 - x + qnorm(0.95))
curve(rej_prob,from = -10, to = 10, xlab = expression(theta),
ylab = "Rejection probability")
# plot your points
points(t, avg_rej_freq, pch=19, col='red')
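One subtlety worth verifying (a quick check of my own, not part of the original answer): rnorm(R * length(t), t) recycles t across the draws, and matrix() fills column-major, so row i of the matrix always corresponds to t[i]:
# Same layout without the randomness: rep(t, R) fills each of the
# R columns with t, so row i holds t[i] in every replication.
m <- matrix(rep(t, times = R), nrow = length(t))
stopifnot(all(m[1, ] == t[1]), all(m[length(t), ] == t[length(t)]))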
I have been finding hyperSpec very useful; however, I continue to receive errors when running through "Outlier Removal by Principal Component Analysis (PCA)" following the steps in the Chondro tutorial. The code I'm running is below:
pca <- prcomp (spc_N2, center = TRUE)
scores <- decomposition (spc_N2, pca$x, label.wavelength="PC",label.spc_N2="score/a.u.")
loadings <- decomposition (spc_N2, t(pca$rotation), scores = FALSE, label.spc_N2 = "loading I/a.u.")
pairs (scores [[,,1:20]], pch = 19, cex = 0.5)
This results in a plot of the first 20 score pairs as expected. When I try to identify spectra:
out <- map.identify (scores [,,5])
I receive the following error:
Error in eval(modelRHS[[2]], data, env) : object 'x' not found
Any suggestions would be greatly appreciated.
Thank you
Haley
Edit:
I've added an example file. Here is the code I used to import and pre-process the file:
library(hyperSpec)
#import file
file <- read.table ("t0_CA_bln_adj.csv", header = TRUE, dec = ".", sep = ",")
spc <- new("hyperSpec", wavelength = file[, 1], spc = t(file[, -1]),
           data = data.frame(sample = colnames(file[, -1])),
           labels = list(.wavelength = "cm-1", spc = "I"))
#initial plot
plot (spc)
#intensity standardize to mean of N2 peak
factors_N2 <- 1/apply(spc[, , 2200~2400],1,mean)
spc_N2<-sweep(spc,1,factors_N2,"*")
plot(spc_N2)
#PCA
pca <- prcomp (spc_N2, center = TRUE)
scores <- decomposition (spc_N2, pca$x, label.wavelength="PC",label.spc_N2="score/a.u.")
loadings <- decomposition (spc_N2, t(pca$rotation), scores = FALSE, label.spc_N2 = "loading I/a.u.")
#plot score plots of the first 20 and first 5 PCs
pairs (scores [[,,1:20]], pch = 19, cex = 0.5)
pairs (scores [[,,1:5]], pch = 19, cex = 0.5)
#attempt to identify outliers
out <- map.identify (scores [,,5])
file example: t0_CA_bln_adj
I hope you can help me to solve this issue; I've been trying different things but nothing has worked so far.
I have a 3D graph that uses a squared term (x2) on the x axis (values go from 0 to 100). The original x has positive and negative values (from -10 to 10), so in x2, and therefore on the x axis of my 3D graph, the values are all positive: x2 = 100 is obtained from both x = -10 and x = 10, x2 = 25 comes from both x = -5 and x = 5, and so on. I therefore have only "half" of the graph, and I would like to:
1) Have the graph with the original scale going from -10 to 10 on the X axis.
2) Complete the other half of the graph to have the non-linear relationship (i.e. to complete the surface that corresponds from -10 to 0, which I assume should be a mirror of the one I have right now).
Using different colours you can see the nonlinear relationship better, but I didn't include them here to simplify the code.
Since it is not possible to recover the negative x values from x2 (the square root is always positive), I duplicated the data in Excel: I added negative values (values now go from -100 to 100) and made an R list again. This is not a solution because it still has the same scale as x2, but in any case it doesn't work.
This is how I plot the graph:
Data: https://www.dropbox.com/s/fv943jf35eqtkd8/NSSH.csv?dl=0
link function code:
logexp <- function(days = 1)
{
  linkfun <- function(mu) qlogis(mu^(1/days))
  linkinv <- function(eta) plogis(eta)^days
  mu.eta <- function(eta) days * plogis(eta)^(days - 1) *
    .Call("logit_mu_eta", eta, PACKAGE = "stats")
  valideta <- function(eta) TRUE
  link <- paste("logexp(", days, ")", sep = "")
  structure(list(linkfun = linkfun, linkinv = linkinv,
                 mu.eta = mu.eta, valideta = valideta, name = link),
            class = "link-glm")
}
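(Aside, added for illustration and not part of the question: the "link-glm" object returned by logexp() can be passed straight to binomial(). A minimal toy check with invented data:)
# Toy check that the custom link plugs into glm(); the data are made up.
set.seed(1)
expo <- sample(1:28, 200, replace = TRUE)  # hypothetical exposure in days
surv <- rbinom(200, 1, 0.9^expo)           # daily survival probability 0.9
xx   <- rnorm(200)                         # hypothetical covariate
glm(surv ~ xx, family = binomial(link = logexp(expo)))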
The 3D graph:
library(akima)
x <- NSSH$reLDM
x2<- x^2
y <- NSSH$yr
y2 <-y^2
n <-NSSH$AgeDay1
z <- NSSH$survive
m <- glm(z~x2+y+y2+x2:y+n,family=binomial(link=logexp(NSSH$exposure)))
# interaction
i <- 25
xtemp <- seq(min(x),max(x),length.out=i)
xrange <- rep(xtemp,times=i)
x2temp <- seq(min(0),max(100),length.out=i)
x2range <- rep(x2temp,times=i)
ytemp <- seq(min(y),max(y),length.out=i)
yrange <- rep(ytemp,each=i)
y2temp <- seq(min(y2),max(y2),length.out=i)
y2range <- rep(y2temp,each=i)
ntemp <- rep(mean(n),times=i)
nrange <- rep(ntemp,times=i)
newdata <- data.frame(x2=x2range,y=yrange,y2=y2range,n=nrange)
zhat <- predict(m,newdata=newdata)
NS <- zhat^27
xyz <- interp(x2range,yrange,NS)
quartz()
persp(xyz,
theta = 35, phi = 50,col="blue", border="grey40", ticktype = "detailed", zlim=c(0,1)) -> res2
Is there a way I can copy the "half" graph I have as a “mirror” and put it next to the part I already have and use the original scale from x?
Thanks a lot for your help!
UPDATE:
The 3D graph is perfect!
But when I use the "half graph" to make a contour plot it looks like this:
And now with the new graph it looks like this. I wonder why the region around 0 next to the value 0.7 (area in the red circle) doesn't look the same as in the first contour plot. Do you have any idea? Is it possible to fix it? Thanks again.
this is the code of the contour plot:
image(xyz2,col = "white")
contour(xyz2,add=T)
I think you don't have to worry about the small details, with the exception that x and y must be increasing and dim(z) must be c(length(x), length(y)).
xyz2 <- interp(sqrt(x2range), yrange, NS) # change scale before interpolate
xyz2$x <- c(rev(xyz2$x)*-1, xyz2$x) # reverse and combine
xyz2$x[41] <- 1.0E-8 # because [40] = [41] = 0 (40 is interp's nx value)
xyz2$z <- rbind(apply(xyz2$z, 2, rev), xyz2$z) # reverse and combine
persp(xyz2,xlab="Relative laying date",ylab="Year",zlab="Nest success",
theta = 35, phi = 50,col="blue", border="grey40", ticktype = "detailed")
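To make the bookkeeping explicit, here is the same mirror trick on a tiny toy surface (my illustration, not the nest-success data):
# Toy "half" surface: x runs 0..3 and dim(z) is c(length(x), length(y))
g <- list(x = 0:3, y = 0:2, z = outer(0:3, 0:2))
g$x <- c(rev(g$x) * -1, g$x)           # mirrored x grid: -3..0 then 0..3
g$x[4] <- -1e-8                        # nudge the duplicated 0 so x keeps increasing
g$z <- rbind(apply(g$z, 2, rev), g$z)  # mirror the z rows to match
persp(g$x, g$y, g$z, theta = 35, phi = 50)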
[EDITED]
I can't reproduce your additional question.
origin <- list(x = unique(x2range),
y = unique(yrange),
z = matrix(NS, ncol=length(unique(yrange))))
xyz <- interp(x2range,yrange,NS) # OP's code
image(origin, col = "white", xlim=c(-10,10), ylim=c(7, 24))
contour(origin, add=T, lwd=1.5, drawlabels=F) # no interp : black
contour(xyz, add=T, col=2, drawlabels=F) # OP's code : red
contour(x=sqrt(xyz$x), y=xyz$y, z=xyz$z, add=T, col=3, drawlabels=F) # only scale change : green
contour(xyz2, add=T, col=4, drawlabels=F) # my code : blue
I have this code
N <- 1000
beta1 = runif(N, -1,1);
beta2 = runif(N, -1,1);
x1 = seq(-500, 500, 0.01);
and for each i evaluated from 1 to N, I want to plot this function
z = beta1[i] + beta2[i]*x1;
pr = 1/(1+exp(-z));
plot (x1,pr);
at the end I would expect 1000 curves of pr vs x1.
for that I've tried this
for (i in 1:N){
  z[i] = res[i,1] + res[i,2]*x1
  pr[i] = 1/(1+exp(-z[i]));
  plot(x1, pr[i])
}
But it gave a list of 50 warnings and it didn't work.
Any help?
This is a great time for some matrix multiplication to simplify and speed up calculation. Your biggest problem was that plot opens a new plot every time it's called. I assume you want all the lines plotted on the same graph.
N <- 1000
beta1 = runif(N, -1, 1)
beta2 = runif(N, -1, 1)
# I changed this to by = 1
# for plotting purposes you really don't need 100k points per line
x1 = seq(-500, 500, 1)
z = cbind(1, x1) %*% rbind(beta1, beta2)
pr = 1 / (1 + exp(-z))
# this is the key step you were missing
# initialize an empty plot with sufficient range
plot(range(x1), range(pr), type = "n")
# then just add to it in the for loop
for (i in 1:N) {
  lines(x1, pr[, i])
}
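As a side note (not part of the original answer), matplot() draws every column of a matrix as its own line, so the loop can be dropped entirely:
# each column of pr becomes one semi-transparent line
matplot(x1, pr, type = "l", lty = 1, col = rgb(0, 0, 0, 0.05))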
Assume A follows an Exponential distribution and B follows a Gamma distribution.
How do I plot the PDF of 0.5*(A+B)?
This is fairly straightforward using the "distr" package:
library(distr)
A <- Exp(rate=3)
B <- Gammad(shape=2, scale=3)
conv <- 0.5*(A+B)
plot(conv)
plot(conv, to.draw.arg=1)
Edit by JD Long
Resulting plot looks like this:
If you're just looking for a fast graph, I usually do the quick and dirty simulation approach. I do some draws, slam a kernel density on the draws and plot that bad boy:
numDraws <- 1e6
gammaDraws <- rgamma(numDraws, 2)
expDraws <- rexp(numDraws)
combined <- .5 * (gammaDraws + expDraws)
plot(density(combined))
output should look a little like this:
Here is an attempt at doing the convolution (which @Jim Lewis refers to) in R. Note that there are probably much more efficient ways of doing this.
lower <- 0
upper <- 20
t <- seq(lower,upper,0.01)
fA <- dexp(t, rate = 0.4)
fB <- dgamma(t,shape = 8, rate = 2)
## C has the same distribution as (A + B)/2
dC <- function(x, lower, upper, exp.rate, gamma.rate, gamma.shape){
  integrand <- function(Y, X, exp.rate, gamma.rate, gamma.shape){
    dexp(Y, rate = exp.rate) * dgamma(2*X - Y, rate = gamma.rate, shape = gamma.shape) * 2
  }
  out <- NULL
  for(ix in seq_along(x)){
    out[ix] <-
      integrate(integrand, lower = lower, upper = upper,
                X = x[ix], exp.rate = exp.rate,
                gamma.rate = gamma.rate, gamma.shape = gamma.shape)$value
  }
  return(out)
}
fC <- dC(t, lower=lower, upper=upper, exp.rate=0.4, gamma.rate=2, gamma.shape=8)
## plot the resulting distribution
plot(t,fA,
ylim = range(fA,fB,na.rm=TRUE,finite = TRUE),
xlab = 'x',ylab = 'f(x)',type = 'l')
lines(t,fB,lty = 2)
lines(t,fC,lty = 3)
legend('topright', c('A ~ exp(0.4)','B ~ gamma(8,2)', 'C ~ (A+B)/2'),lty = 1:3)
I'm not an R programmer, but it might be helpful to know that for independent random variables with PDFs f1(x) and f2(x), the PDF of their sum is given by the convolution (f1 * f2)(x) = ∫ f1(y) f2(x − y) dy of the two input PDFs. For C = (A + B)/2, a change of variables then gives f_C(x) = 2 (f1 * f2)(2x), which is exactly what the integrand in the code above implements.