I just started learning to code in R. I need to keep adding an unknown number of values to different vectors, and the number of vectors is not known in advance either. I tried to implement this with:
clust_oo = c()
clust_oo[k] = c(clust_oo[k],init_dataset[k,1])
Without the [k], the above code works, but since I don't know the number of vectors/lists, I have to use [k] to tell them apart. For example, clust_oo[1] could hold the values 1, 23, 45; clust_oo[2] could hold 4, 40; and clust_oo[3] could hold 44, 67, 455, 885, with the values added dynamically.
Is this the right way to go about it?
Try:
clust_oo = c()
for(i in 1:3)
clust_oo[length(clust_oo)+1] = i
clust_oo
[1] 1 2 3
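The snippet above grows a single vector. If each clust_oo[k] is supposed to hold several values, a list of vectors may be a closer fit. Here is a minimal sketch under that assumption; add_value is a hypothetical helper name, and the values are the ones from the question.

```r
# Sketch: keep one vector per key k inside a list and append to it.
# add_value() is a hypothetical helper, not part of base R.
add_value <- function(lst, k, value) {
  key <- as.character(k)               # a missing name yields NULL, not an error
  lst[[key]] <- c(lst[[key]], value)   # c(NULL, value) is just value
  lst
}

clust_oo <- list()
clust_oo <- add_value(clust_oo, 1, 1)
clust_oo <- add_value(clust_oo, 2, 4)
clust_oo <- add_value(clust_oo, 1, 23)
clust_oo <- add_value(clust_oo, 1, 45)
clust_oo <- add_value(clust_oo, 2, 40)

clust_oo[["1"]]   # values collected for k = 1
clust_oo[["2"]]   # values collected for k = 2
```

Character keys are used because clust_oo[[k]] with a numeric k larger than the current length of the list raises a "subscript out of bounds" error, while a name that is not there yet simply returns NULL.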
I'm trying to compare a "regular" data set to a contaminated one, but I'm having trouble creating the contaminated data set.
Each list contains 25 data frames, one for each sample size n; each data frame contains m = 850 samples of size n = {100, 200, ..., 2500} from an exponential distribution.
I have tried replacing the first n/4 items of each sample in each data frame.
The current way I am doing it adds extra entries to the contaminated data frames, which I do not want; I merely wish to replace existing entries.
However, if I switch c(j) with c(1:n/4), an error pops up saying replacement has 25 rows, data has 100.
What could I do better?
set.seed(915)
n_lst <- seq(from = 100, to = 2500, by=100)
m_lst <- seq(from=1, to=850, by=1)
l = list()
lCont = list()
i=1
for (n in n_lst) {
  l[[i]] = lCont[[i]] = data.frame(replicate(850, rexp(n, 0.73)))
  for (j in m_lst) {
    lCont[[i]][c(j), c(1:n/4)] = rexp(n/4, 0.01)
  }
  i <-i+1
}
Below are the original list and the contaminated list (sorry about the formatting; I was having trouble with the formatting verification).
Original List
Contaminated List
The main problem is that you are indexing using [columns, rows], which is backwards. R indexes data frames and matrices as [rows, columns]. Switching to lCont[[i]][1:(n / 4), j] will solve that.
Also note that : has higher precedence than / in R, so 1:n / 4 is evaluated as (1:n) / 4. You want 1:(n / 4), not 1:n / 4.
One last comment: c() is only needed if you're combining more than one thing, like c(1:5, 12). c(j) is a long way to write j.
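Putting those three points together, the loop could look roughly like the sketch below. The sizes are shortened (three values of n and m = 5 samples instead of 850) purely so it runs quickly; that shortening and the variable m are assumptions for illustration, not part of the original code.

```r
set.seed(915)
n_lst <- seq(from = 100, to = 300, by = 100)  # shortened from 100..2500
m <- 5                                        # shortened from 850 samples

l <- list()
lCont <- list()
for (i in seq_along(n_lst)) {
  n <- n_lst[i]
  # each column is one sample of size n
  l[[i]] <- lCont[[i]] <- data.frame(replicate(m, rexp(n, 0.73)))
  for (j in seq_len(m)) {
    # [rows, columns]: overwrite the first n/4 rows of column j
    lCont[[i]][1:(n / 4), j] <- rexp(n / 4, 0.01)
  }
}

sapply(lCont, dim)   # dimensions match the originals, no extra entries

# Precedence check: ':' binds tighter than '/'
identical(1:8 / 4, (1:8) / 4)
```

Looping with seq_along(n_lst) also removes the need for the manual counter i <- i + 1.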
I am doing a simple exercise involving vectors and lists. It revolves around just generating 10 random values drawn for each given value of the mean from a normal distribution. The exercise involves a simple use of a for loop:
mu = c(-10, 0, 10, 100)
random_norm = vector(mode = "list", length = 4)
for(i in seq_along(mu)){
  random_norm[i] = rnorm(n = 10, mean = mu[i])
}
random_norm
The output I expected was a list of length 4 where each entry in the list would be a vector containing 10 values for each of the respective means used to generate the values, i.e:
[[1]]
[1] - vector of length 10
[[2]]
[1] - vector of length 10
etc
But instead I'm getting the error:
#> Warning in random_norm[i] <- rnorm(n = 10, mean = mu[i]): number of items to
#> replace is not a multiple of replacement length
So I deduce it must be something I'm misunderstanding with regards to the structure of vectors and lists. Could somebody explain what is going wrong?
It's really pretty hard to search SO for prior answers on this topic. There are two fundamental assignment operators for lists: [[<- and [<-. The first stores the value as a single element of the list. The second replaces a sub-list, so the right-hand side is treated element by element; a vector of length 10 cannot fit into a slot of length 1.
Technically that was only a warning. The code did produce a list of length 4, but each entry holds only the first of the 10 values generated for it.
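For completeness, here is a small sketch of the loop using [[<- instead, which should give the structure the question expected. The set.seed call is only there to make the example reproducible and is not in the original code.

```r
set.seed(1)  # assumption: added only for reproducibility
mu <- c(-10, 0, 10, 100)
random_norm <- vector(mode = "list", length = length(mu))

for (i in seq_along(mu)) {
  # [[i]] stores the whole length-10 vector as one list element
  random_norm[[i]] <- rnorm(n = 10, mean = mu[i])
}

lengths(random_norm)   # one length per list element
```

The same result can also be built without an explicit loop, for example with lapply(mu, function(m) rnorm(n = 10, mean = m)).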
I have a vector with values from 1 to 100, v1 <- (1:100). I would like to print the values at indexes 44, 50, 51, 52, ..., 71.
I have tried v1 <- c(seq(44,44), seq(50,71)), but this overwrites the original vector instead of printing the values.
Could you tell me how to get the output I need using only one instruction? Is it possible? I'd be grateful for any help. Thanks
You can access elements of a vector by index using the [] operator. So, for your case it would be v1[c(44, 50:71)].
Here we put a vector containing the required indexes inside the square brackets to select elements of v1.
50:71 is a short form of seq(50, 71).
I'd advise you to get familiar with the indexing section of the R manual, https://cran.r-project.org/doc/manuals/R-lang.html#Indexing, and with the R help page you get by typing ?"[" in the console.
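As a quick check, the one-liner can be run like this; the name picked is just a hypothetical variable used to inspect the result.

```r
v1 <- 1:100

# Index with a vector of positions; v1 itself is left unchanged
picked <- v1[c(44, 50:71)]
picked

length(picked)   # 1 index (44) plus the 22 indexes from 50 to 71
length(v1)       # still 100
```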
I am using R to code simulations for a research project I am conducting in college. After creating relevant data structures and generating data, I seek to randomly modify a proportion P of observations (in increments of 0.02) in a 20 x 20 matrix by some effect K. In order to randomly determine the observations to be modified, I sample a number of integers equal to P*400 twice to represent row (rRow) and column (rCol) indices. In order to guarantee that no observation will be modified more than once, I perform this algorithm:
I create a matrix, alrdyModded, that is 20 x 20 and initialized to 0s.
I take the first value in rRow and rCol, and check whether alrdyModded[rRow[1]][rCol[1]]==1; WHILE alrdyModded[rRow[1]][rCol[1]]==1, I randomly select new integers for the indices until it ==0
When alrdyModded[rRow[1]][rCol[1]]==0, modify the value in a treatment matrix with same indices and change alrdyModded[rRow[1]][rCol[1]] to 1
Repeat for the entire length of rRow and rCol vectors
I believe a good method to perform this operation is a while loop nested in a for loop. However, when I enter the code below into R, I receive the following error code:
R CODE:
propModded<-1.0
trtSize<-2
numModded<-propModded*400
trt1<- matrix(rnorm(400,0,1),nrow = 20, ncol = 20)
cont<- matrix(rnorm(400,0,1),nrow = 20, ncol = 20)
alrdyModded1<- matrix(0, nrow = 20, ncol = 20)
## data structures for computation have been initialized and filled
rCol<-sample.int(20,numModded,replace = TRUE)
rRow<-sample.int(20,numModded,replace = TRUE)
## indices for modifying observations have been generated
for(b in 1:numModded){
  while(alrdyModded1[rRow[b]][rCol[b]]==1){
    rRow[b]<-sample.int(20,1)
    rCol[b]<-sample.int(20,1)}
  trt1[rRow[b]][rCol[b]]<-'+'(trt1[rRow[b]][rCol[b]],trtSize)
  alrdyModded[rRow[b]][rCol[b]]<-1
}
## algorithm for guaranteeing no observation in trt1 is modified more than once
R OUTPUT
" Error in while (alrdyModded1[rRow[b]][rCol[b]] == 1) { :
missing value where TRUE/FALSE needed "
When I take out the for loop and run the code, the while loop evaluates the statement just fine, which implies an issue with accessing the correct values from the rRow and rCol vectors. I would appreciate any help in resolving this problem.
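It appears you're not indexing the matrix correctly. Instead of having a condition like while(alrdyModded1[rRow[b]][rCol[b]]==1){, it should read like this: while(alrdyModded1[rRow[b], rCol[b]]==1){. Matrices are indexed like this: matrix[1, 1], and it looks like you're forgetting your commas. Without the comma, alrdyModded1[rRow[b]] returns a single value, and indexing that length-one vector with [rCol[b]] gives NA whenever rCol[b] is greater than 1, which is what triggers the error. The for-loop should be something closer to this: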
for(b in 1:numModded){
  while(alrdyModded1[rRow[b], rCol[b]]==1){
    rRow[b]<-sample.int(20,1)
    rCol[b]<-sample.int(20,1)
  }
  trt1[rRow[b], rCol[b]]<-'+'(trt1[rRow[b], rCol[b]],trtSize)
  alrdyModded1[rRow[b], rCol[b]]<-1
}
On a side note, why not make alrdyModded1 a logical matrix (populated with just TRUE and FALSE values) with alrdyModded1<- matrix(FALSE, nrow = 20, ncol = 20) in line 7, and have the condition be just while(alrdyModded1[rRow[b], rCol[b]]){ instead?
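To see the difference between the two indexing forms on a tiny matrix, and as one possible alternative to the retry loop, consider the sketch below. The alternative is a different technique from the one in the question: it samples linear cell indices without replacement, so no cell can be picked twice. The 0.5 proportion, the set.seed call, and the names demo, orig and idx are assumptions made for illustration.

```r
# Part 1: m[i][j] versus m[i, j]
demo <- matrix(1:6, nrow = 2)
demo[2][3]   # demo[2] is one value; asking for its 3rd element gives NA
demo[2, 3]   # row 2, column 3

# Part 2: alternative sketch, sampling cells without replacement
set.seed(42)
propModded <- 0.5                       # assumed proportion for the example
trtSize <- 2
numModded <- round(propModded * 400)
trt1 <- matrix(rnorm(400, 0, 1), nrow = 20, ncol = 20)
orig <- trt1                            # copy kept only to verify the change

idx <- sample.int(400, numModded)       # replace = FALSE by default
trt1[idx] <- trt1[idx] + trtSize        # a matrix accepts linear indices

sum(trt1 != orig)                       # number of modified cells
```

Because each of the 400 cells has a unique linear index, the tracking matrix and the while loop are not needed in this variant.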
A TA wants me to create a new random variable Y_n=sum(X_i), where X_i are n binomial random variables, with N = 4 and p = 1/3. This wasn't too bad; I just use the following for loop:
for(i in 1:100){yn[i] <- c(sum(rbinom(i, 4, (1/3))))}
However, he then wants me to recreate Y_n for every tenth number from 1 to 10,000 (i.e., 10, 20, 30,...,9990,10000). I tried to use this code:
yseq <- seq(10, 10000, by=10)
for(i in yseq){
Y2[i] <- c(sum(rbinom(i,4,(1/3))))}
It sorta works, but not really. It returns a list (I checked its class) with seemingly correct values, but a bunch of NAs. This has created two problems for me: 1) R won't let me reclass the list as a vector, and 2) R tells me that the list is length 1, which is a bunch of rubbish.
Can someone please tell me where I am going wrong? I've said it before: programming is not my forte, but I am always doing my best to learn!
Thanks!
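One likely cause of the NAs is that Y2[i] uses the sample size i (10, 20, 30, ...) as the position, so positions 1 to 9, 11 to 19, and so on are never filled. A minimal sketch of a possible fix, assuming Y2 is meant to be a plain numeric vector with one entry per element of yseq; the counter k and the set.seed call are introduced here for illustration only.

```r
set.seed(123)  # assumption: added only for reproducibility
yseq <- seq(10, 10000, by = 10)

# Pre-allocate a numeric vector with one slot per sample size
Y2 <- numeric(length(yseq))

for (k in seq_along(yseq)) {
  # k is the position (1, 2, 3, ...); yseq[k] is the sample size
  Y2[k] <- sum(rbinom(yseq[k], 4, 1/3))
}

length(Y2)   # one value per element of yseq
anyNA(Y2)    # no gaps
```

Creating Y2 explicitly with numeric() also avoids accidentally reusing an older object named Y2 of a different class, which may explain why the original attempt reported a list.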