I am a beginner with R, so hopefully this will be an easy fix.
I am trying to use a for loop on a dataset for neuron firing direction in order to:
Incrementally add the next value from the dataset to a vector
Run a Rayleigh test on that vector and save it to a variable
Test if the Rayleigh test I just ran has a larger statistic than the the Rayleigh test in the last loop just before it, as well as having a p-value of less than .05
If the value is larger, save the statistic value, so that the next loop can compare to it
If the value is larger, save the vector
So far I have this for the code, and after going through it for a long time I'm at a loss for why it's not working. Every time I run it, the for loop goes all the way to the end and just reports the rayleigh value and vector for the whole dataset, which I know for sure isn't correct.
(I'm using the circular package for the rayleigh test function)
# This first line is just to create an initial rayleigh statistic to compare to in the loop that is low
best_rayleigh <- rayleigh.test(1:10)
data_vector <- c()
for (i in firing_directions) {
data_vector <- append(data_vector, i)
ray_lee_test <- rayleigh.test(data_vector)
if ((ray_lee_test$statistic>best_rayleigh$statistic)&(ray_lee_test$p.value<=.05)) {
best_rayleigh <- ray_lee_test
best_rayleigh_vector <- data_vector
} else {
NULL
}
}
Any help is appreciated. Thank you!
Update: I tried using && instead of single & in the if statement, however it returned the same result
The following code doesn't give warnings and selects the vector with highest test statistic and "significant" p-value.
library(circular)
set.seed(2020)
firing_directions <- rvonmises(n = 25, mu = circular(pi), kappa = 2)
plot(firing_directions)
best_rayleigh <- rayleigh.test(circular(1:10))
for(i in seq_along(firing_directions)){
dv <- firing_directions[seq_len(i)]
rltest <- rayleigh.test(dv)
if((rltest$statistic > best_rayleigh$statistic) && (rltest$p.value <= 0.05)){
best_rayleigh <- rltest
best_rayleigh_vector <- dv
}
}
best_rayleigh
#
# Rayleigh Test of Uniformity
# General Unimodal Alternative
#
#Test Statistic: 0.8048
#P-value: 0.0298
best_rayleigh_vector
#Circular Data:
#Type = angles
#Units = radians
#Template = none
#Modulo = asis
#Zero = 0
#Rotation = counter
#[1] 4.172219 2.510826 2.997495 4.095335 3.655613
I think the NULL is throwing up some issues. Not sure what will have if your throw a NULL. You only update the vector if it passes your criteria:
library(circular)
firing_directions= rvonmises(n=25, mu=circular(pi), kappa=2)
best_rayleigh <- rayleigh.test(1:10)
data_vector <- c()
for (i in firing_directions){
data_vector <- c(data_vector, i)
ray_lee_test <- rayleigh.test(data_vector)
if ((ray_lee_test$statistic>best_rayleigh$statistic)&(ray_lee_test$p.value<=.05)) {
best_rayleigh <- ray_lee_test
best_rayleigh_vector <- data_vector
}
}
Related
I am trying to plot histogram from the R output which is not a data frame. Below are my codes and the output.
x <- replicate(1000,
{y <- rpois(200, 1)
{lambda0 <- 1
for(i in 1:1)
{
if( i == 1 ) cat( sprintf("%15s %15s %15s %15s\n", "LogL", "Score", "Information", "New Estimate"))
logL <- sum((-lambda0) + y*(log(lambda0)))
score <- sum((y/lambda0)-1)
information <- sum(y/(lambda0)^2)
lambda1 <- lambda0 + score/information
cat( sprintf("%15.4f %15.4f %15.4f %15.5f\n", logL, score, information, lambda1))
lambda0 <- lambda1
}
}
})
Below is my output
I'm trying to take the new estimate from the output and create histogram. Can you please help?
Thank you.
You need to store the value for New Estimate during your loop. This way you can retrieve your results after the loop is finished. Normally when using a loop, you specify a variable in advance in which you can save the result for each iteration. E.g.:
numbers <- 1:3
result <- list(length = length(numbers)
for (i in seq_along(numbers){
result[[i]] <- numbers[[i]] + 1
}
In this example there is a vector of three numbers, you want to add one to each number and save the result. You can do this by creating a list of length 3 (adding length is better, but not necessary) and for each ith iteration, you save the result in the ith element of the list.
After finishing the loop you can retrieve the results from the result variable. And you can retrieve the ith result by using square brackets: result[[i]].
I made a matrix based population model, however, I would like to run more than one simultaneously in order to represent different groups of animals, in order that dispersing individuals can move between matrices. I originally just repeated everything to get a second matrix but then I realised that because I run the model using a for loop and break() under certain conditions (when that specific matrix should stop running, ie that group has died out) it is, understandably, stopping the whole model rather than just that singular matrix.
I was wondering if anyone had any suggestions on the best ways to code the model so that instead of breaking, and stopping the whole for loop, it just stops running across that specific matrix. I'm a little stumped. I have include a single run of one matrix below.
Also if anyone has a more efficient way of creating and running 9 matrices than writing everything out 9 times advice much appreciated.
n.steps <- 100
mats <- array(0,c(85,85,n.steps))
ns <- array(0,c(85,n.steps))
ns[1,1]<-0
ns[12,1]<-rpois(1,3)
ns[24,1]<-rpois(1,3)
ns[85,1] <- 1
birth<-4
nextbreed<-12
for (i in 2:n.steps){
# set up an empty matrix;
mat <- matrix(0,nrow=85,ncol=85)
surv.age.1 <- 0.95
x <- 2:10
diag(mat[x,(x-1)]) <- surv.age.1
surv.age.a <- 0.97
disp <- 1:74
disp <- disp*-0.001
disp1<-0.13
disp<-1-(disp+disp1)
survdisp<-surv.age.a*disp
x <- 11:84
diag(mat[x,(x-1)])<-survdisp
if (i == nextbreed) {
pb <- 1
} else {
pb <- 0
}
if (pb == 1) {
(nextbreed <- nextbreed+12)
}
mat[1,85] <- pb*birth
mat[85,85]<-1
death<-sample(c(replicate(1000,
sample(c(1,0), prob=c(0.985, 1-0.985), size = 1))),1)
if (death == 0) {
break()}
mats[,,i]<- mat
ns[,i] <- mat%*%ns[,i-1]
}
group.size <- apply(ns[1:85,],2,sum)
plot(group.size)
View(mat)
View(ns)
As somebody else suggested on Twitter, one solution might be to simply turn the matrix into all 0s whenever death happens. It looks to me like death is the probability that a local population disappears? It which case it seems to make good biological sense to just turn the entire population matrix into 0s.
A few other small changes: I made a list of replicate simulations so I could summarize them easily.
If I understand correctly,
death<-sample(c(replicate(1000,sample(c(1,0), prob=c(0.985, 1-0.985), size =1))),1)
says " a local population dies completely with probability 1.5% ". In which case, I think you could replace it with rbinom(). I did that below and my plots look similar to those I made with your code.
Hope that helps!
lots <- replicate(100, simplify = FALSE, expr = {
for (i in 2:n.steps){
# set up an empty matrix;
mat <- matrix(0,nrow=85,ncol=85)
surv.age.1 <- 0.95
x <- 2:10
diag(mat[x,(x-1)]) <- surv.age.1
surv.age.a <- 0.97
disp <- 1:74
disp <- disp*-0.001
disp1<-0.13
disp<-1-(disp+disp1)
survdisp<-surv.age.a*disp
x <- 11:84
diag(mat[x,(x-1)])<-survdisp
if (i == nextbreed) {
pb <- 1
} else {
pb <- 0
}
if (pb == 1) {
(nextbreed <- nextbreed+12)
}
mat[1,85] <- pb*birth
mat[85,85]<-1
death<-rbinom(1, size = 1, prob = 0.6)
if (death == 0) {
mat <- 0
}
mats[,,i]<- mat
ns[,i] <- mat%*%ns[,i-1]
}
ns
})
lapply(lots, FUN = function(x) apply(x[1:85,],2,sum))
I want to perform an IDW cross-validation and find out which "power"-value gives the smallest RMSE. In order to do this, I want to store the "power" and "RMSE"-values in a list and sort them by the smallest RMSE, for example
I'd like something like this:
RMSE Power
[1,] 1.230 2.5
[2,] 1.464 1.5
[3,] 1.698 2.0
[4,] 1.932 3.0
What I have so far is this:
require(sp)
require(gstat)
data("meuse")
#### create grid:
pixels <- 500 #define resolution
#define extent
raster.grd <- expand.grid(x=seq(floor(min(x=meuse$x)),
ceiling(max(x=meuse$x)),
length.out=pixels),
y=seq(floor(min(y=meuse$y)),
ceiling(max(y=meuse$y)),
length.out=pixels))
# convert the dataframe to a spatial points and then to a spatial pixels
grd.pts <- SpatialPixels(SpatialPoints((raster.grd)))
grd <- as(grd.pts, "SpatialGrid")
gridded(grd) = TRUE
#### perform IDW and loop through different power-values
power = seq(from = 1.5, to = 3, by = 0.5)
results=list()
results.cv=list()
for(i in power) {
results[[paste0(i,"P")]] <- gstat::idw(meuse$zinc ~ 1, meuse, grd, idp = i)
results.cv[[paste0(i,"P")]] <- krige.cv(zinc ~ 1, meuse, nfold = nrow(meuse),set = list(idp = i))
}
Now my attempt to calculate and store the RMSE with a for-loop:
results_rmse <- list()
pwr <- names(results.cv)
for(i in results.cv){ #for each Element (1.5P, 2P, etc) in results.cv
for(j in 1:length(pwr)){ #for each Power
results_rmse <- sqrt(mean(i$residual^2))
print(pwr[j])
}
print(paste("RMSE",results_rmse))
}
But with this loop, it prints each RMSE individually. So I changed the code like this
results_rmse[[i]] <- sqrt(mean(i$residual^2))
But then I get an error
Error in results_rmse[[i]] <- sqrt(mean(i$residual^2)) : invalid subscript type 'S4'
I tried several versions of the for-loop, but I couldn't even figure out how to store the values in a list, not to mention to sort them by the smallest RMSE.
There is an extra loop for j in the RMSE calculation that is not needed as far as I understand the problem. Also, I rearranged the loop in such a way that it cycles through a sequence of elements rather than calling them by their names.
# Data, because your script doesn't run for me. The rest is identical from your code
for(i in power) {
results.cv[[paste0(i,"P")]]$residual <- rnorm(50)
}
# Fixed loop
for(i in 1:length(results.cv)){
results_rmse[[i]] <- sqrt(mean(results.cv[[i]]$residual^2))
}
names(results_rmse) <- names(results.cv)
Alternatively, the for loop can be avoided with the apply function. The result is a named list corresponding to the input names, so the last line can be omitted to achieve the same results_rmse.
results_rmse <- lapply(results.cv, function(x) sqrt(mean(x$residual^2)))
To print the data as you showed in your question:
cbind(RMSE=unlist(results_rmse), Power=power)
I have code that successfully calculates VaR based on Extreme Value Theory using historical data. I'm trying to run this same code on multiple simulated price paths (i.e. calculating a VaR for each path) and then taking the median or average of those VaRs.
Every example I could find online had the simulation function return the price at the end of the period and then they replicated the function X many time. That makes sense to me, except that I essentially need to calculate value-at-risk for each simulated path. Below is the code I have so far. I can say that the code works when using historical data (i.e. the "evt" function works fine and the datatable is populated correctly when the lossOnly, u, and evtVar lines aren't in a function). However, I've been trying to implement simulation in the second function and trying various combinations, which have all failed.
library('RODBC')
library('nor1mix')
library('fExtremes')
library('QRM')
library('fGarch')
#function for computing the EVT VaR
evt <- function(data,u){
#fit excess returns to gpd to get estimates
gpdfit = tryCatch({
gpdfit <- gpdFit(data,u,type="mle")
}, warning = function(w) {
gpdfit <- gpdFit(data,u,type="mle",optfunc="nlminb")
return(gpdfit)
}, error = function(e) {
gpdfit <- gpdFit(data,u,type="pwm",optfunc="nlminb")
return(gpdfit)
}, finally = {})
#now calculate VaRs
xi <- gpdfit#fit$par.ests["xi"]
beta <- gpdfit#fit$par.ests["beta"]
Nu <- length(gpdfit#data$exceedances)
n <- length(data)
evtVar95 <- (u+((beta/xi)*(((n/Nu)*.05)^(-xi) - 1.)))*100
evtVar99 <- (u+((beta/xi)*(((n/Nu)*.01)^(-xi) - 1.)))*100
evtVar997 <- (u+((beta/xi)*(((n/Nu)*.003)^(-xi) - 1.)))*100
evtVar999 <- (u+((beta/xi)*(((n/Nu)*.001)^(-xi) - 1.)))*100
#return calculations
return(cbind(evtVar95,evtVar99,evtVar997,evtVar999,u,xi,beta,Nu,n))
}
#data <- read.table("pricedata.txt")
prices <- data$V1
returns <- diff(log(prices)) #or returns <- log(prices[-1]/prices[-n])
xi <- mean(returns)
std <- sd(returns)
N <- length(prices)
lstval <- prices[N]
options(scipen = 999)
p <- c(lstval, rep(NA, N-1))
gen.path <- function(){
N <- length(prices)
for(i in 2:N)
p[i] <- p[i-1] * exp(rnorm(1, xi, std))
# plot(p, type = "l", col = "brown", main = "Simulated Price")
#evt calculation
#first get only the losses and then make them absolute
lossOnly <- abs(p[p<0])
#get threshold
u <- quantile(lossOnly, probs = 0.9, names=FALSE)
evtVar <- evt(lossOnly,u)
return(evtVar)
}
runs <- 10
sim.evtVar <- replicate(runs, gen.path())
evtVar <- mean(sim.evtVar)
#add data to total table
VaR <- c(evtVar[1],evtVar[2],evtVar[3],evtVar[4],evtVar[5],evtVar[6],evtVar[7],evtVar[8],evtVar[9])
DF <- data.frame(VaR, row.names=c("evtVar95","evtVaR_99","evtVaR_997","evtVaR_999","u","xi","beta","Nu","n"))
In short, I'm trying to run the value-at-risk function (first function) within the monte carlo function (second function) and trying to put the average simulated values into a data tables. I know the first function works, but it's the second function that's driving me crazy. There are the errors I'm getting:
> sim.evtVar <- replicate(runs, gen.path())
Error in if (xi > 0.5) { : missing value where TRUE/FALSE needed
Called from: .gpdpwmFit(x, u)
Browse[1]> evtVar <- mean(sim.evtVar)
Error during wrapup: object 'sim.evtVar' not found
Browse[1]>
> #add data to total table
> VaR <- c(evtVar[1],evtVar[2],evtVar[3],evtVar[4],evtVar[5],evtVar[6],evtVar[7],evtVar[8],evtVar[9])
Error: object 'evtVar' not found
> DF <- data.frame(VaR, row.names=c("evtVar95","evtVaR_99","evtVaR_997","evtVaR_999","u","xi","beta","Nu","n"))
Error in as.data.frame.default(x[[i]], optional = TRUE) :
cannot coerce class ""function"" to a data.frame
Any help you can provide is greatly appreciated! Thank you in advance!
I think the Problem is this row:
lstval <- prices[N]
because if you take a stock price, that can't ever be negative, you produce an empty vector at this row in your function:
lossOnly <- abs(p[p<0])
you should try instead:
lstval <- min(returns)
if you want the highest negative return of your dataset
I've got a data set in R of a variable, repeated 10,000 times and sampled 200 times on each repeat so a 10,000 by 200 matrix, I would like to calculate statistical moments for the variable up to an arbitrary number. So in the end I would like a numeric vector holding the value of moments.
I can get the variance and the mean for the data set using colMean and colVar, but they only go so far.
I am also aware of the moments package in R, however using the all.moments command is returning me moments for each time course, or treating each column or row as an individual variable, not what I want.
Does anyone know an equivalent to colMean and colVar for higher order moments? And if possible also for cross moments?
Many thanks!
I stole this code from an obscure R package e1071:
theskew<- function (x) {
x<-as.vector(x)
sum((x-mean(x))^3)/(length(x)*sd(x)^3)
}
thekurt <- function (x) {
x<-as.vector(x)
sum((x-mean(x))^4)/(length(x)*var(x)^2) - 3
}
You can fold that into your code by feeding them one column at a time
Okay did this yesterday for posterity here is a loop that will do what I asked.
Provided your data is a time course of a variable you are measuring, and you want the moments of that variable:
rm(list=ls())
yourdata<-read.table("whereveryourdatais/and/variableyouwant")
yourdata<-t(yourdata) #only do this at your own discretion
mu<-colMeans(yourdata,1:ncol(yourdata))
NumMoments <- 5
rawmoments <- matrix(NA, nrow=NumMoments, ncol=ncol(yourdata))
for(i in 1:NumMoments) {
rawmoments[i, ] <- colMeans(yourdata^i)
}
plot(rawmoments[1,])
holder<-matrix(NA,nrow=nrow(yourdata),ncol=ncol(yourdata))
middles<-matrix(NA,nrow=1,ncol=ncol(yourdata))
for(j in 1:nrow(yourdata)){
for(o in 1:ncol(rawmoments)){
middles[o]<-yourdata[j,o]-rawmoments[1,o]
}
holder[j,] <- middles
}
centmoments<-matrix(NA,nrow=NumMoments,ncol=ncol(yourdata))
for(i in 1:NumMoments){
centmoments[i,]<-colMeans(holder^i)
}
Then centmoments has the centralmoments and rawmoments has the raw moments, you can specify how many moments to take by changing the value of NumMoments.
Note that the first row in "centmoments" will be approximately 0.
Is this what you're looking for?
X <- matrix(1:12, 3, 4) # your data
NumMoments <- 5
moments <- matrix(NA, nrow=NumMoments, ncol=ncol(X))
for(i in 1:NumMoments) {
moments[i, ] <- colMeans(X^i)
}
EDIT:
okay, apparently you want "central moments"
X <- matrix(1:12, 3, 4)
NumMoments <- 5
moments <- matrix(NA, nrow=NumMoments, ncol=ncol(X))
Y <- X
for(i in 1:ncol(X)) {
Y[, i] <- Y[, i] - moments[1, i]
}
for(i in 2:NumMoments) {
moments[i, ] <- colMeans(Y^i)
}