Let us say we have 20 questions with different numbers of answers, and we want the questions not to appear in the same order in the generated NOPS exams. How can we do it?
I tried:
myexam <- dput(dir("exercises/"))
exams2nops(file = myexam,
n = 180,
nsamp = length(myexam),
dir = "nops",
edir = getwd(),
encoding = "UTF-8",
blank = 1,
reglength = 8,
samepage = TRUE)
But it gives an error saying that only 45 exercises per exam are supported.
PS: if the exercises are in a list and I use nsamp, I get an error about the groups of exercises not having the same length.
Thanks for your help.
Your setup is almost correct but file and nsamp have to be specified slightly differently. For an exam with 20 shuffled exercises:
file = list(c("ex1.Rmd", ..., "ex20.Rmd")) should be a list with a single vector of length 20.
nsamp = 20.
So in your case probably:
myexam <- list(dir("exercises/"))
exams2nops(myexam, nsamp = length(unlist(myexam)), ...)
The reason behind this is the following:
When myexam is a list, then the exams2xyz() interfaces first draw nsamp elements from each element of the list.
Thus, if myexam is a list with only a single vector, then nsamp elements from that vector are sampled.
If nsamp is equal to the length of that one vector, then the vector is permuted/shuffled.
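The sampling logic described above can be mimicked in base R. This is only a sketch: it does not call the exams package, and the file names and the use of sample() are assumptions made for illustration rather than the package's actual internals.

```r
# Hypothetical exercise files; in practice these would come from dir("exercises/")
myexam <- list(paste0("ex", 1:20, ".Rmd"))
nsamp <- length(unlist(myexam))

set.seed(1)
# Draw nsamp elements (without replacement) from each element of the list.
# With a single vector of length 20 and nsamp = 20 this is just a permutation.
drawn <- unlist(lapply(myexam, function(pool) sample(pool, nsamp)))

length(drawn)
setequal(drawn, myexam[[1]])
```

Each call with a different seed gives a different ordering of the same 20 exercises, which is the shuffling effect wanted for the individual exams.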
Related
I am doing a simple exercise involving vectors and lists. It revolves around just generating 10 random values drawn for each given value of the mean from a normal distribution. The exercise involves a simple use of a for loop:
mu = c(-10, 0, 10, 100)
random_norm = vector(mode = "list", length = 4)
for(i in seq_along(mu)){
random_norm[i] = rnorm(n = 10, mean = mu[i])
}
random_norm
The output I expected was a list of length 4 where each entry in the list would be a vector containing 10 values for each of the respective means used to generate the values, i.e:
[[1]]
[1] - vector of length 10
[[2]]
[1] - vector of length 10
etc
But instead I'm getting the error:
#> Warning in random_norm[i] <- rnorm(n = 10, mean = mu[i]): number of items to
#> replace is not a multiple of replacement length
So I deduce it must be something I'm misunderstanding with regards to the structure of vectors and lists. Could somebody explain what is going wrong?
It's really pretty hard to search SO for prior answers on this topic. There are two fundamental assignment operators for lists: [[<- and [<-. The first stores the value as a single list element, whatever its length. The second replaces a sub-list, so it expects a replacement with as many items as there are slots selected. Since random_norm[i] selects one slot and the value was a vector of length 10, only the first of the 10 values was used.
Technically that was only a warning, not an error. The code did produce a list of length 4, but each element held only the first of the 10 generated values.
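A minimal sketch of the loop with [[<- in place of [<-, reusing the names from the question. The lapply() variant is an optional alternative, not something the original post used.

```r
set.seed(42)
mu <- c(-10, 0, 10, 100)
random_norm <- vector(mode = "list", length = length(mu))

for (i in seq_along(mu)) {
  # [[i]] stores the whole length-10 vector as one list element
  random_norm[[i]] <- rnorm(n = 10, mean = mu[i])
}

lengths(random_norm)

# Equivalent without an explicit loop
random_norm2 <- lapply(mu, function(m) rnorm(n = 10, mean = m))
```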
Can someone help me with this? I got the cut_interval code to work for a single test column, but can't seem to get it to work in a for loop to have it run on all of the columns.
#Bin worker data into three groups (low/medium/high %methylation) for the cpg cg10757709
#This code works
cg10757709_interval <- cut_interval(cpgs$cg10757709, n=3, labels = c("low","med","high"))
View(cg10757709_interval)
#Write a loop so that data for each of the significant cpgs will be binned into low, medium, and high groups
#This code gives an error (more elements supplied than there are to replace)
cpgs_interval <- matrix(ncol = length(cpgs), nrow = 29)
for (i in seq_along(cpgs)) {
cpgs_interval[[i]] <- cut_interval(cpgs[[i]], n=3, labels = c("low","med","high"))
}
View(cpgs_interval)
The error says "Error in cpgs_interval[[i]] <- cut_interval(cpgs[[i]], n = 3, labels = c("low", : more elements supplied than there are to replace". Should I not be using a matrix for cpgs_interval? Or is something else the problem? I'm rather new to writing for loops. Thanks.
In your example, cpgs_interval is a matrix. If you want to put the variable into the ith column of the matrix, you could do:
for (i in seq_along(cpgs)) {
cpgs_interval[,i] <- cut_interval(cpgs[[i]], n=3, labels = c("low","med","high"))
}
That said, you might be better off making cpgs_interval a data frame; then you'll retain the factor rather than turning it into text.
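A sketch of the data frame route. The column names and data are made up for illustration, and base R's cut() with breaks = 3 stands in for ggplot2's cut_interval() so the example runs without extra packages. Both split the range into three equal-width bins, though the exact break points may differ slightly.

```r
set.seed(1)
# Hypothetical stand-in for the real cpgs data: 29 rows, one column per CpG
cpgs <- data.frame(cg_a = runif(29), cg_b = runif(29), cg_c = runif(29))

# Bin one numeric vector into three equal-width groups
bin3 <- function(x) cut(x, breaks = 3, labels = c("low", "med", "high"))

# Apply to every column; the result keeps one factor per column
cpgs_interval <- as.data.frame(lapply(cpgs, bin3))

str(cpgs_interval)
```

With ggplot2 loaded, bin3 could instead wrap cut_interval(x, n = 3, labels = c("low", "med", "high")) as in the question.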
In order to label thousands of random points I need a huge vector of labels. For logistical reasons I would like all strings to have length 2. What I have so far is this:
sl = paste(letters[1],letters,":0",sep="")
for (i in 2:26){
ll = paste(letters[i],letters,":0",sep="")
sl = c(sl,ll)
}
SL = paste(LETTERS[1],LETTERS,":0",sep="")
for (i in 2:26){
ll = paste(LETTERS[i],LETTERS,":0",sep="")
SL = c(SL,ll)
}
S1 = paste(LETTERS[1],0:9,":0",sep="")
for (i in 2:26){
ll = paste(LETTERS[i],1:10,":0",sep="")
SL = c(SL,ll)
}
s1 = paste(letters[1],0:9,":0",sep="")
for (i in 2:26){
ll = paste(letters[i],1:10,":0",sep="")
SL = c(SL,ll)
}
sl=c(sl,SL,S1,s1)
This vector has only 1872 strings. With that in mind, my questions are:
Do you know a more elegant way to build something like this? I am building a package and I find these lines not elegant at all.
Do you know how I can easily increase the length of the vector with more ordinary strings of length 2?
Any help is appreciated.
Limiting yourself to two-character strings and including all ordered pairs (repeats allowed) of c(letters, LETTERS, 0:9) gives you a maximum of 62^2 = 3844 possibilities. That full vector can be generated via
paste0(
as.vector(
outer(c(letters, LETTERS, 0:9),
c(letters, LETTERS, 0:9),
paste0)
),
":0"
)
If you need more labels than that, you will need to either include more characters to select from, or increase the length of the string.
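If longer strings are acceptable, the same idea extends to any length. The helper below, make_labels(), is a hypothetical name introduced for this sketch. It builds every string of length k from a character set with expand.grid().

```r
chars <- c(letters, LETTERS, 0:9)  # 62 characters, coerced to character

# All strings of length k over `chars` (hypothetical helper)
make_labels <- function(chars, k) {
  grid <- expand.grid(rep(list(chars), k), stringsAsFactors = FALSE)
  do.call(paste0, grid)
}

labels2 <- make_labels(chars, 2)
labels3 <- make_labels(chars, 3)

length(labels2)  # 62^2
length(labels3)  # 62^3
```

Appending ":0" works as before, for example paste0(labels3, ":0").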
However, I think such a labeling scheme may not be as useful as you hope. Labeling points like this on a plot runs the risk of making the plot unreadable. Are you sure this is the approach you need?
I am using R to code simulations for a research project I am conducting in college. After creating relevant data structures and generating data, I seek to randomly modify a proportion P of observations (in increments of 0.02) in a 20 x 20 matrix by some effect K. In order to randomly determine the observations to be modified, I sample a number of integers equal to P*400 twice to represent row (rRow) and column (rCol) indices. In order to guarantee that no observation will be modified more than once, I perform this algorithm:
I create a matrix, alrdyModded, that is 20 x 20 and initialized to 0s.
I take the first value in rRow and rCol and check whether alrdyModded[rRow[1]][rCol[1]]==1; while it is, I randomly select new integers for the indices until the entry equals 0
When alrdyModded[rRow[1]][rCol[1]]==0, modify the value in a treatment matrix with same indices and change alrdyModded[rRow[1]][rCol[1]] to 1
Repeat for the entire length of rRow and rCol vectors
I believe a good method to perform this operation is a while loop nested in a for loop. However, when I enter the code below into R, I receive the following error code:
R CODE:
propModded<-1.0
trtSize<-2
numModded<-propModded*400
trt1<- matrix(rnorm(400,0,1),nrow = 20, ncol = 20)
cont<- matrix(rnorm(400,0,1),nrow = 20, ncol = 20)
alrdyModded1<- matrix(0, nrow = 20, ncol = 20)
## data structures for computation have been initialized and filled
rCol<-sample.int(20,numModded,replace = TRUE)
rRow<-sample.int(20,numModded,replace = TRUE)
## indices for modifying observations have been generated
for(b in 1:numModded){
while(alrdyModded1[rRow[b]][rCol[b]]==1){
rRow[b]<-sample.int(20,1)
rCol[b]<-sample.int(20,1)}
trt1[rRow[b]][rCol[b]]<-'+'(trt1[rRow[b]][rCol[b]],trtSize)
alrdyModded[rRow[b]][rCol[b]]<-1
}
## algorithm for guaranteeing no observation in trt1 is modified more than once
R OUTPUT
" Error in while (alrdyModded1[rRow[b]][rCol[b]] == 1) { :
missing value where TRUE/FALSE needed "
When I take out the for loop and run the code, the while loop evaluates the statement just fine, which implies an issue with accessing the correct values from the rRow and rCol vectors. I would appreciate any help in resolving this problem.
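It appears you're not indexing the matrix correctly. Matrices are indexed as matrix[row, col], e.g. matrix[1, 1], and your code is missing the commas. alrdyModded1[rRow[b]] picks a single element by its linear index, and applying [rCol[b]] to that length-one result returns NA whenever rCol[b] is greater than 1, which is why the while condition fails with "missing value where TRUE/FALSE needed". Instead of a condition like while(alrdyModded1[rRow[b]][rCol[b]]==1){, it should read while(alrdyModded1[rRow[b], rCol[b]]==1){. The for loop should be something closer to this: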
for(b in 1:numModded){
while(alrdyModded1[rRow[b], rCol[b]]==1){
rRow[b]<-sample.int(20,1)
rCol[b]<-sample.int(20,1)}
trt1[rRow[b], rCol[b]]<-'+'(trt1[rRow[b], rCol[b]],trtSize)
alrdyModded1[rRow[b], rCol[b]]<-1
}
On a side note, why not make alrdyModded1 a boolean matrix (populated with just TRUE and FALSE values) with alrdyModded1<- matrix(FALSE, nrow = 20, ncol = 20) in line 7, and have the condition be just while(alrdyModded1[rRow[b], rCol[b]]){ instead?
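As an alternative to the rejection loop, the cells can be drawn without replacement in one step, which guarantees that no observation is modified twice. This is a different technique from the one in the question, sketched here with the same variable names. The value of propModded is only an example.

```r
set.seed(123)
propModded <- 0.5
trtSize <- 2
numModded <- round(propModded * 400)

trt1 <- matrix(rnorm(400, 0, 1), nrow = 20, ncol = 20)
original <- trt1  # kept only to check the result

# Sample distinct linear indices into the 20 x 20 matrix (no replacement)
idx <- sample.int(length(trt1), numModded)

# Add the treatment effect to exactly those cells
trt1[idx] <- trt1[idx] + trtSize

sum(trt1 != original)
```

If row and column positions are needed, arrayInd(idx, dim(trt1)) converts the linear indices back to (row, column) pairs.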
A TA wants me to create a new random variable Y_n = sum(X_i), where the X_i are n binomial random variables with N = 4 and p = 1/3. This wasn't too bad; I just used the following for loop:
for(i in 1:100){yn[i] <- c(sum(rbinom(i, 4, (1/3))))}
However, he then wants me to recreate Y_n for every tenth number from 1 to 10,000 (i.e., 10, 20, 30, ..., 9990, 10000). I tried to use this code:
yseq <- seq(10, 10000, by=10)
for(i in yseq){
Y2[i] <- c(sum(rbinom(i,4,(1/3))))}
It sort of works, but not really. It returns a list (I checked its class) with seemingly correct values, but also a bunch of NAs. This has created two problems for me: 1) R won't let me reclass the list as a vector, and 2) R tells me that the list is length 1, which is a bunch of rubbish.
Can someone please tell me where I am going wrong? I've said it before: programming is not my forte, but I am always doing my best to learn!
Thanks!