I'm trying to create a series of variables in a loop, where each variable gets a name built from its parameters, for example:
M000
M001
M002
M003
Example
c.n_vars <- nrow(comb)
for (i in 1:c.n_vars)
{
paste("M",comb[i,1],comb[i,2],comb[i,3]) = Arima(y,order=c(arima[1,1],arima[2,1],arima[3,1]),seasonal=list(order=c(comb[i,1],comb[i,2],comb[i,3]),period=12))
}
where comb contains all combinations:
a <- c(0,1,2,3,4)
b <- c(0,1,2,3,4)
c <- c(0,1,2,3,4)
comb <- expand.grid(a, b,c)
row parameter1 parameter2 parameter3
1 0 0 0
2 1 0 0
3 2 0 0
4 3 0 0
5 4 0 0
6 0 1 0
7 1 1 0
8 2 1 0
9 3 1 0
10 4 1 0
11 0 2 0
12 1 2 0
13 2 2 0
14 3 2 0
15 4 2 0
and arima is
arima <- data.frame(c(2,1,4))
row parameters
1 2
2 1
3 4
I am trying to run
c.n_vars <- nrow(comb)
for (i in 1:c.n_vars)
{
paste("M",comb[i,1],comb[i,2],comb[i,3]) = Arima(y,order=c(arima[1,1],arima[2,1],arima[3,1]),seasonal=list(order=c(comb[i,1],comb[i,2],comb[i,3]),period=12))
}
This code should produce:
for i = 1
M000 = arima model saved in that variable
for i = 2
M100 = arima model saved in that variable
for i = 3
M200 = arima model saved in that variable
.
.
.
.
.
for i = 15
M420 = arima model saved in that variable
and the following error appears
Error in paste("M", comb[i, 1], comb[i, 2], comb[i, 3]) = Arima(y, order = c(arima[1, :
target of assignment expands to non-language object
I need the result of each iteration of 'i' to be saved in a different variable.
Is there a solution, or another way to do this?
Your sample code is still incomplete. I was not able to run it. For example y is missing.
As Base_R_Best_R pointed out, you cannot use paste to create variables like that. You can use the following pattern instead. Also note that I replaced paste() with paste0() to avoid spaces in the names:
result = list()
for (i in 1:c.n_vars)
{
result[[paste0("M",comb[i,1],comb[i,2],comb[i,3])]] = Arima(y,order=c(arima[1,1],arima[2,1],arima[3,1]),seasonal=list(order=c(comb[i,1],comb[i,2],comb[i,3]),period=12))
}
Access your variables like this:
result$M100
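As a variation on the same pattern, here is a minimal sketch that builds all the names at once with a vectorized paste0() and fills the list with lapply(). Note that fit_model() is a hypothetical stand-in for the Arima() call (it just returns the seasonal order it was given), so the example runs without y or the forecast package:

```r
# Hypothetical stand-in for the Arima() call: returns the seasonal order it received
fit_model <- function(P, D, Q) list(seasonal_order = c(P, D, Q))

comb <- expand.grid(a = 0:4, b = 0:4, c = 0:4)

# paste0() is vectorized, so all names are built in one call
model_names <- paste0("M", comb[, 1], comb[, 2], comb[, 3])

# One list element per row of comb, named after its parameters
result <- setNames(
  lapply(seq_len(nrow(comb)), function(i) {
    fit_model(comb[i, 1], comb[i, 2], comb[i, 3])
  }),
  model_names
)

names(result)[1:3]          # "M000" "M100" "M200"
result$M100$seasonal_order  # 1 0 0
```

Keeping the models in one named list also makes it easy to loop over them later, e.g. with sapply(result, ...), which is harder with separate global variables.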
I tried to create a table with three variables. Now I want to run a Chi-square on both of the outputs. How do I run a Chi-square on the output , , = 1 and again on output , , = 2?
> emphasis<-table(Pilot$emphasis.GI, Pilot$emphasis.race, Pilot$required.learning)
> emphasis
, , = 1
2 3 4
2 11 5 0
3 2 8 0
4 0 0 0
, , = 2
2 3 4
2 0 0 0
3 0 2 0
4 0 0 1
It is a 3D array. We can use apply with MARGIN = 3 and apply the test
apply(emphasis, 3, chisq.test)
Or use a for loop
out <- vector('list', dim(emphasis)[3])
for(i in seq_along(out)) out[[i]] <- chisq.test(emphasis[,, i])
You can try asplit over the third dimension and run chisq.test with Map
Map(chisq.test,asplit(emphasis, 3))
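To show what comes back from these calls, here is a small sketch using made-up counts (not the data from the question). Each approach returns a list with one test result per slice, from which the p-values can be pulled out. As a side note, the slices shown in the question contain all-zero rows and columns, so chisq.test would likely return NaN with a warning on them.

```r
# Illustrative counts only (not the asker's data): a 2 x 2 x 2 array
counts <- array(c(20, 15, 30, 35,
                  25, 25, 10, 40),
                dim = c(2, 2, 2))

# One chi-square test per slice along the third dimension
tests <- apply(counts, 3, chisq.test)

# Collect the p-values into a plain numeric vector
p_values <- vapply(tests, function(t) t$p.value, numeric(1))
p_values
```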
I have a time series (or simply a vector) that is binary, returning 0 or 1's depending on some condition (generated with ifelse). I would like to be able to return the counts (in this case corresponds to time series, so days) in between the 1's.
I can do this very easily in Excel, by simply calling the Column I am trying to calculate and then adding the row above (if working with Ascending data, or calling row below if working with descending). See below
I tried doing something similar in R but I am getting an error.
DaysBetweenCondition1 = as.numeric(ifelse((Condition1 ==0 ),0,lag(DaysBetweenCondition1)+1))
Is there an easier way to do this besides writing a function?
Row# Date Condition1 DaysBetweenCondition1
1 5/2/2007 NA NA
2 5/3/2007 NA NA
3 5/4/2007 NA NA
4 5/5/2007 NA NA
5 5/6/2007 0 NA
6 5/7/2007 0 NA
7 5/8/2007 0 NA
8 5/9/2007 0 NA
9 5/10/2007 0 NA
10 5/11/2007 0 NA
11 5/12/2007 0 NA
12 5/13/2007 0 NA
13 5/14/2007 1 0
14 5/15/2007 0 1
15 5/16/2007 0 2
16 5/17/2007 0 3
17 5/18/2007 0 4
18 5/19/2007 0 5
19 5/20/2007 0 6
20 5/21/2007 0 7
21 5/22/2007 1 0
22 5/23/2007 0 1
23 5/24/2007 0 2
24 5/25/2007 0 3
25 5/26/2007 0 4
26 5/27/2007 1 0
27 5/28/2007 0 1
28 5/29/2007 0 2
29 5/30/2007 1 0
(FWIW, the dates in this example are made up. The real data uses business days, so it is a bit different, and I don't want to reference the dates; they are just included for clarity.)
This gets the counting done in one line. Borrowing PhiSeu's code and a line from How to reset cumsum at end of consecutive string and modifying it to count zeros:
# Example
df_date <- cbind.data.frame(c(1:20),
c(rep("18/08/2016",times=20)),
c(rep(NA,times=5),0,1,0,0,1,0,0,0,0,1,1,0,1,0,0)
,stringsAsFactors=FALSE)
colnames(df_date) <- c("Row#","Date","Condition1")
# add the new column with 0 as default value
DaysBetweenCondition1 <- c(rep(0,nrow(df_date)))
# bind column to dataframe
df_date <- cbind(df_date,DaysBetweenCondition1)
df_date$DaysBetweenCondition1<-sequence(rle(!df_date$Condition1)$lengths) * !df_date$Condition1
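To see why the one-liner works, here is a rough step-by-step breakdown on a short made-up vector (cond and the intermediate names are just for illustration):

```r
cond <- c(NA, NA, 0, 1, 0, 0, 1, 0)

# TRUE where the condition is 0, FALSE where it is 1, NA stays NA
is_zero <- !cond

# Run lengths of consecutive equal values; every NA counts as its own run
runs <- rle(is_zero)
runs$lengths                            # 1 1 1 1 2 1 1

# Position of each element inside its run
pos_in_run <- sequence(runs$lengths)    # 1 1 1 1 1 2 1 1

# Multiplying by is_zero zeroes out the 1s and keeps the NAs
days <- pos_in_run * is_zero
days                                    # NA NA 1 0 1 2 0 1
```

One thing to be aware of: zeros that come before the first 1 are counted as well (the third element above is 1), whereas the table in the question shows NA for those rows.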
R is very good when working with rows that don't depend on each other. Therefore a lot of functions are vectorized. When working with functions that depend on the value of other rows it is not so easy.
At the moment I can only provide you with a solution using a loop. I assume there is a better solution without a loop.
# Example
df_date <- cbind.data.frame(c(1:20),
c(rep("18/08/2016",times=20)),
c(rep(NA,times=5),0,1,0,0,1,0,0,0,0,1,1,0,1,0,0)
,stringsAsFactors=FALSE)
colnames(df_date) <- c("Row#","Date","Condition1")
# add the new column with 0 as default value
DaysBetweenCondition1 <- c(rep(0,nrow(df_date)))
# bind column to dataframe
df_date <- cbind(df_date,DaysBetweenCondition1)
# loop over rows
for(i in 1:nrow(df_date)){
if(is.na(df_date$Condition1[i])) {
df_date$DaysBetweenCondition1[i] <- NA
} else if(df_date$Condition1[i]==0 && is.na(df_date$Condition1[i-1])) {
df_date$DaysBetweenCondition1[i] <- NA
} else if(df_date$Condition1[i]==0) {
df_date$DaysBetweenCondition1[i] <- df_date$DaysBetweenCondition1[i-1]+1
} else {
df_date$DaysBetweenCondition1[i] <- 0
}
}
Here's a solution that should be relatively fast
f0 = function(x) {
y = x # template for return value
isna = is.na(x) # used a couple of times
grp = cumsum(x[!isna]) # use '1' to mark start of each group
lag = lapply(tabulate(grp + 1), function(len) {
seq(0, length.out=len) # sequence from 0 to len-1
})
split(y[!isna], grp) <- lag # split y, set to lag element, unsplit
data.frame(x, y)
}
A faster version avoids the lapply() loop. It creates a vector along x (seq_along(x)) and an offset vector that describes how the vector along x should be corrected, based on its value at the start of each group.
f1 = function(x0) {
y0 = x0
x = x0[!is.na(x0)]
y = seq_along(x)
offset = rep(c(1, y[x==1]), tabulate(cumsum(x) + 1))
y0[!is.na(y0)] = y - offset
data.frame(x0, y0)
}
Walking through the first solution, here's some data
> set.seed(123)
> x = c(rep(NA, 5), rbinom(30, 1, .15))
> x
[1] NA NA NA NA NA 0 0 0 1 1 0 0 1 0 0 1 0 0 0 0 1 0 0 0 1
[26] 1 0 0 1 0 0 0 0 0 0
use cumsum() to figure out the group the non-NA data belong to
> isna = is.na(x)
> grp = cumsum(x[!isna])
> grp
[1] 0 0 0 1 2 2 2 3 3 3 4 4 4 4 4 5 5 5 5 6 7 7 7 8 8 8 8 8 8 8
use tabulate() to figure out the number of elements in each group, lapply() to generate the relevant sequences
> lag = lapply(tabulate(grp + 1), function(len) seq(0, length.out=len))
finally, create a vector to hold the result, and use split<- to update with the lag
> y = x
> split(y[!isna], grp) <- lag
> data.frame(x, y)
x y
1 NA NA
2 NA NA
3 NA NA
4 NA NA
5 NA NA
6 0 0
7 0 1
8 0 2
9 1 0
10 1 0
11 0 1
12 0 2
13 1 0
14 0 1
15 0 2
16 1 0
17 0 1
...
The key to the second solution is the calculation of the offset. The goal is to be able to 'correct' y = seq_along(x) by the value of y at the most recent 1 in x, kind of like 'fill down' in Excel. The starting values are c(1, y[x==1]) and each needs to be replicated by the number of elements in the group tabulate(cumsum(x) + 1).
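The offset calculation can be traced by hand on a short NA-free vector. This is only an illustration of the idea described above, with made-up input and helper names (starts, sizes):

```r
x <- c(0, 0, 1, 0, 0, 0, 1, 0)
y <- seq_along(x)                   # 1 2 3 4 5 6 7 8

# Starting index of each group: position 1, then every position where x is 1
starts <- c(1, y[x == 1])           # 1 3 7

# Number of elements in each group (cumsum(x) is 0 0 1 1 1 1 2 2)
sizes <- tabulate(cumsum(x) + 1)    # 2 4 2

# 'Fill down' the starting index across each group
offset <- rep(starts, sizes)        # 1 1 3 3 3 3 7 7

y - offset                          # 0 1 0 1 2 3 0 1
```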
I'd like to create a new variable that contains 1s and 0s. A 1 represents agreement between the raters (both raters gave 1 or both gave 0) and a 0 represents disagreement.
rater_A <- c(1,0,1,1,1,0,0,1,0,0)
rater_B <- c(1,1,0,0,1,1,0,1,0,0)
df <- cbind(rater_A, rater_B)
The new variable would be like the following vector I created manually:
df$agreement <- c(1,0,0,0,1,0,1,1,1,1)
Maybe there's a package or a function I don't know. Any help would be great.
You could create df as a data.frame (instead of using cbind) and use within and ifelse:
rater_A <- c(1,0,1,1,1,0,0,1,0,0)
rater_B <- c(1,1,0,0,1,1,0,1,0,0)
df <- data.frame(rater_A, rater_B)
##
df <- within(df,
agreement <- ifelse(
rater_A==rater_B,1,0))
##
> df
rater_A rater_B agreement
1 1 1 1
2 0 1 0
3 1 0 0
4 1 0 0
5 1 1 1
6 0 1 0
7 0 0 1
8 1 1 1
9 0 0 1
10 0 0 1
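A slightly shorter alternative, as a sketch: the comparison rater_A == rater_B already yields TRUE/FALSE for each row, so it can be converted straight to 1/0 without ifelse:

```r
rater_A <- c(1, 0, 1, 1, 1, 0, 0, 1, 0, 0)
rater_B <- c(1, 1, 0, 0, 1, 1, 0, 1, 0, 0)
df <- data.frame(rater_A, rater_B)

# TRUE/FALSE converts to 1/0
df$agreement <- as.integer(df$rater_A == df$rater_B)
df$agreement         # 1 0 0 0 1 0 1 1 1 1

# Proportion of rows where the raters agree
mean(df$agreement)   # 0.6
```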
I've got a column in my dataset that contains a collection of 0,1 and 2. The 2's are a weird leftover from some previous transformation, and I need to convert them to 1. I've written a simple loop to do this
for (i in my.cl.accept$enroll){
if (i==2){
i=1
}
}
however, this doesn't change the actual contents of the dataframe. ifelse() doesn't work, because I don't need to change the other digits at all; just the number 2.
I've been using R a little more after coming from python, what simple thing am I misunderstanding here?
Let's generate a sample set:
set.seed(10)
DF <- data.frame(
a=1:10,
b=sample(0:2,10,replace=TRUE))
DF
Now, replace every entry corresponding to 2 with 1:
DF$b[DF$b==2] <- 1
DF
Note: This is a vectorized method, and it is typically much faster than iterating with a loop.
Dunno whether this is what you want?
> A<- 1:10
> B<- c(rep(0,5), rep(1,3), rep(2,2))
> data <- data.frame(A,B)
> data
A B
1 1 0
2 2 0
3 3 0
4 4 0
5 5 0
6 6 1
7 7 1
8 8 1
9 9 2
10 10 2
> data[data$B==2,]$B <- 1
> data
A B
1 1 0
2 2 0
3 3 0
4 4 0
5 5 0
6 6 1
7 7 1
8 8 1
9 9 1
10 10 1
Are you sure you're using ifelse correctly? It actually does allow you to only change one value to another. Here's an example:
> x <- sample(c(0, 1, 2), 10, TRUE)
> x
## [1] 2 1 1 0 2 2 0 0 2 1
> ifelse(x == 2, 1, x)
## [1] 1 1 1 0 1 1 0 0 1 1
For future reference, a good old-fashioned for loop has to index into the column and assign back to it, something like this...
for (i in seq_along(my.cl.accept$enroll)){
if (my.cl.accept$enroll[i] == 2){
my.cl.accept$enroll[i] <- 1
}
}
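As for what the original loop was missing: in for (i in x), the variable i receives a copy of each value, so assigning to i never touches x itself. A small sketch with a made-up vector (enroll here is a stand-in for the real column) shows the difference:

```r
enroll <- c(0, 2, 1, 2)

# Assigning to the loop variable only changes the local copy
for (i in enroll) {
  if (i == 2) i <- 1
}
unchanged <- enroll                        # still 0 2 1 2

# Replacing by logical index modifies the selected elements
fixed <- replace(enroll, enroll == 2, 1)   # 0 1 1 1
```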
I am working with multiple binary vectors e.g., A,B,C,D,E,F,G,H.
I want to find the classification between them. I have tried the following:
log_data<-read.csv(choose.files(), as.is = T, header = T, blank.lines.skip = TRUE)
data<-log_data[2:ncol(log_data)]
data
TIME A B C D E F G
1 1 1 1 0 1 0 1 1
2 0 0 1 1 1 1 0 1
3 1 1 1 1 1 0 1 1
4 1 0 1 1 1 1 0 1
.....................
fit <- network(data)
fit.prior <- jointprior(fit)
fit <- getnetwork(learn(fit,rats,fit.prior))
Error in postc0c(node$condposterior[[1]]$mu, node$condposterior[[1]]$tau, :
NA/NaN/Inf in foreign function call (arg 1)
I seem to be getting this error because all the variables are treated as continuous and mu is NULL.
How should I proceed in order to classify after creating a network?