I have a list containing 100 lists within it, each of which has 552 numerical values. How do I sequentially extract the 1st value (and so on up to 552) from each of the 100 lists?
Example: 5 lists within a list containing the numbers 1-10
list(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), c(1, 2, 3, 4, 5, 6, 7,
8, 9, 10), c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), c(1, 2, 3, 4, 5,
6, 7, 8, 9, 10), c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10))
I want to extract each term sequentially i.e. 1,1,1,1,1 and then
2,2,2,2,2 and so on
Assuming your list is called x, this statement produces a list of vectors: the first holds the first element of each of your original vectors, the second holds the second elements, and so on, with NA wherever a vector is too short:
num <- max(lengths(x)) ## Length of the longest vector in x
lapply(seq_len(num), function(i) unlist(lapply(x, `[`, i)))
And here's a matrix approach:
matrix(unlist(x), ncol=length(x))
Row i of that matrix holds the i-th element of every vector. This relies on each vector being the same length.
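Both approaches can be checked against the sample data from the question. This is a minimal sketch; the names x, by_position and m are only illustrative, and sapply is swapped in for the inner unlist(lapply(...)) call.

```r
# Sample data from the question: five vectors holding 1..10
x <- replicate(5, 1:10, simplify = FALSE)

# List approach: the i-th result holds the i-th element of every vector
num <- max(lengths(x))
by_position <- lapply(seq_len(num), function(i) sapply(x, `[`, i))
by_position[[1]]  # 1 1 1 1 1
by_position[[2]]  # 2 2 2 2 2

# Matrix approach: each column is one original vector,
# so row i holds the i-th element of every vector
m <- matrix(unlist(x), ncol = length(x))
m[2, ]  # 2 2 2 2 2
```

With 100 vectors of 552 values, the same code should give a list of 552 vectors of length 100, or a 552 x 100 matrix.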
I need some programming/statistics help.
I have a database with multiple groups (variable "group"). The members of each group rated some items (in our example dataset, the variables "var1", "var2" and "var3").
I would like to get the intraclass variance for each group. In particular, I would like to calculate r*wg(j), ICC(1) and ICC(2).
I looked for a solution, but the icc function in R expects the raters (my team members) as columns and not as rows. I could do it by creating a subset for every group and then transposing each one, but I believe there is an easier solution.
Thanks to anyone who can help me with this.
group <- c(1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4)
var1 <- c(4, 5, 4, 2, 3, 4, 5, 3, 5, 8, 4, 3, 4, 4, 5)
var2 <- c(2, 3, 4, 2, 4, 4, 5, 6, 6, 9, 3, 3, 2, 5, 4)
var3 <- c(4, 5, 6, 2, 3, 6, 7, 6, 7, 8, 5, 6, 3, 3, 6)
df <- data.frame(group, var1, var2, var3)
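The subset-and-transpose step described above does not have to be done by hand. The following is only a sketch of that reshaping, not of the ICC or r*wg(j) calculation itself; by_group is an illustrative name, and whether the resulting layout suits a particular ICC function is an assumption to check against that function's documentation.

```r
group <- c(1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4)
var1 <- c(4, 5, 4, 2, 3, 4, 5, 3, 5, 8, 4, 3, 4, 4, 5)
var2 <- c(2, 3, 4, 2, 4, 4, 5, 6, 6, 9, 3, 3, 2, 5, 4)
var3 <- c(4, 5, 6, 2, 3, 6, 7, 6, 7, 8, 5, 6, 3, 3, 6)
df <- data.frame(group, var1, var2, var3)

# Split the item columns by group, then transpose each piece:
# one matrix per group, rows = items (var1..var3), columns = raters
by_group <- lapply(split(df[, -1], df$group), t)

dim(by_group[["1"]])  # 3 items x 3 raters
dim(by_group[["2"]])  # 3 items x 5 raters
```

Each matrix in by_group could then be passed to a function that expects raters as columns, for example with lapply.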
Let's say I have a df like this
df1 <- data.frame(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9),
var2 = c(2, 8, 0, 7, 3, 4, 1, 10, 13))
I want to get a vector of values produced by the following operation:
(x-median(x-1))/median(x-1)
where the -1 means that the element at the current index is left out of the column before taking the median. For example, for the first element in column var2 the result is:
(2-(median(c( 8, 0, 7, 3, 4, 1, 10, 13))) )/(median(c( 8, 0, 7, 3, 4, 1, 10, 13)))
-0.63636
Thanks!
Using sapply, we can loop over the index of each value in var2, drop that value, calculate the median of the remaining values, and perform the calculation.
sapply(seq_along(df1$var2), function(i) {
med_i <- median(df1$var2[-i])
(df1$var2[i] - med_i)/med_i
})
#[1] -0.6364 1.2857 -1.0000 1.0000 -0.4545 -0.2000 -0.8182 1.8571 2.7143
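The same leave-one-out calculation can be wrapped in a small reusable function. This is a sketch under the assumption that the input is a plain numeric vector; loo_median_ratio is a made-up name, and vapply is used in place of sapply only so that the return type is fixed.

```r
# Hypothetical helper: for each element, compare it with the median
# of all the other elements in the vector
loo_median_ratio <- function(x) {
  vapply(seq_along(x), function(i) {
    med_i <- median(x[-i])   # median with the i-th value left out
    (x[i] - med_i) / med_i
  }, numeric(1))
}

var2 <- c(2, 8, 0, 7, 3, 4, 1, 10, 13)
res <- loo_median_ratio(var2)
round(res, 4)
# -0.6364 1.2857 -1.0000 1.0000 -0.4545 -0.2000 -0.8182 1.8571 2.7143
```

If a leave-one-out median is 0, the division returns Inf or NaN, so such cases may need separate handling.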
I have a data frame, in wide format, with each column representing one questionnaire item for one particular version of a questionnaire for a particular time point (repeated measures design).
My data would look something like the following:
df <- data.frame(id = c(1:5), t1_QOL_child_Q1 = c(5, 3, 6, 2, 7), t1_QOL_child_Q2 = c(5, 2, 3, 7, 1), t1_QOL_child_Q3 = c(7, 7, 6, 2, 5), t1_QOL_child_joy = c(9,9, 5, 3, 6), t1_QOL_teen_Q1 = c(5, 3, 6, 2, 7), t1_QOL_teen_Q2 = c(5, 2, 3, 7, 1), t1_QOL_teen_Q3 = c(7, 7, 6, 2, 5), t1_QOL_teen_joy = c(5, 7, 4, 7, 9), t1_QOL_adult_Q1 = c(5, 3, 6, 2, 7), t1_QOL_adult_Q2 = c(5, 2, 3, 7, 1), t1_QOL_adult_Q3 = c(7, 7, 6, 2, 5), t1_QOL_adult_joy = c(6, 5, 3, 3, 2), t2_QOL_child_Q1 = c(5, 3, 6, 2, 7), t2_QOL_child_Q2 = c(5, 2, 3, 7, 1), t2_QOL_child_Q3 = c(7, 7, 6, 2, 5), t2_QOL_child_joy = c(9,9, 5, 3, 6), t2_QOL_teen_Q1 = c(5, 3, 6, 2, 7), t2_QOL_teen_Q2 = c(5, 2, 3, 7, 1), t2_QOL_teen_Q3 = c(7, 7, 6, 2, 5), t2_QOL_teen_joy = c(5, 7, 4, 7, 9), t2_QOL_adult_Q1 = c(5, 3, 6, 2, 7), t2_QOL_adult_Q2 = c(5, 2, 3, 7, 1), t2_QOL_adult_Q3 = c(7, 7, 6, 2, 5), t2_QOL_adult_joy = c(6, 5, 3, 3, 2))
For example, column t1_QOL_child_Q1 would mean Question 1 (Q1) of the child version (child) of Quality of Life (QOL) questionnaire, with time point 1 (t1) data.
I want to select only the subscales/columns whose suffixes are labelled differently. In the sample data above, these would be the columns ending with "joy".
I have over 3000 columns and many more suffixes and it would be a pain to use the following:
select(df, ends_with("joy"), ends_with(<another suffix>), ends_with(<another suffix>))
I have thought of putting all the potential suffixes in a string vector, and use the vector as an input to the ends_with function, but ends_with could only take a single string instead of a vector of strings.
I have searched on Stackoverflow and found a solution that could accommodate a small vector of strings, which is the following:
select(df, sapply(vector_of_strings, starts_with))
However, I have too many suffixes in my vector of strings and the following error message resulted from it: Error: sapply(vector_of_strings, ends_with) must resolve to integer column positions, not a list
Help appreciated. Thanks!
We can use a single matches() call with multiple patterns separated by | to match substrings at the end ($) of the column name.
library(dplyr)
df %>%
  select(matches("(joy|Q2)$"))
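With many suffixes, the pattern can be built from a character vector instead of typed by hand. The sketch below uses base R so it runs without extra packages. The suffixes vector and the cut-down data frame are illustrative assumptions, and the suffixes are assumed to contain no regex metacharacters.

```r
# Hypothetical vector of suffixes to keep
suffixes <- c("joy", "Q2")

# Collapse into one alternation anchored at the end of the name
pattern <- paste0("(", paste(suffixes, collapse = "|"), ")$")
pattern  # "(joy|Q2)$"

# Cut-down version of the sample data
df <- data.frame(id = 1:5,
                 t1_QOL_child_Q1 = c(5, 3, 6, 2, 7),
                 t1_QOL_child_Q2 = c(5, 2, 3, 7, 1),
                 t1_QOL_child_joy = c(9, 9, 5, 3, 6))

# Keep only the columns whose names match the pattern
selected <- df[, grepl(pattern, names(df)), drop = FALSE]
names(selected)  # "t1_QOL_child_Q2" "t1_QOL_child_joy"
```

The same pattern string can be passed to dplyr as select(df, matches(pattern)).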
If I have a vector:
x <- c(5, 6, 2, 9, 5, 2, 1, 9, 9)
How can I make another vector that contains elements that were never repeated? In this case it would be: c(6, 1) (because 5, 2, and 9 are repeated)
test <- c(5, 6, 2, 9, 5, 2, 1, 9, 9)
setdiff(test, test[duplicated(test)])
vector.a <- c(5, 6, 2, 9, 5, 2, 1, 9, 9)
not.reap <- logical(length(vector.a))
for (i in seq_along(vector.a)) {
  # TRUE when the i-th value appears nowhere else in the vector
  not.reap[i] <- !(vector.a[i] %in% vector.a[-i])
}
vector.a[not.reap]
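A loop-free variant of the same idea flags every occurrence of a repeated value by running duplicated() from both ends. This is a sketch on the question's example vector; is_dup is an illustrative name.

```r
x <- c(5, 6, 2, 9, 5, 2, 1, 9, 9)

# duplicated() marks the second and later occurrences;
# fromLast = TRUE marks all but the last occurrence.
# Combined, every copy of a repeated value is flagged.
is_dup <- duplicated(x) | duplicated(x, fromLast = TRUE)

x[!is_dup]  # 6 1
```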
I have to use R instead of MATLAB and I'm new to it.
I have a large array of data repeating like 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10...
I need to find the locations where values equal to 1, 4, 7, 10 are found to create a sample using those locations.
In this case it will be position(=corresponding value) 1(=1) 4(=4) 7(=7) 10(=10) 11(=1) 14(=4) 17(=7) 20(=10) and so on.
In MATLAB it would be y = find(ismember(x, [1, 4, 7, 10])).
Please, help! Thanks, Pavel
something like this?
foo <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
bar <- c(1, 4, 7, 10)
which(foo %in% bar)
#> [1] 1 4 7 10 11 14 17 20
@nicola, feel free to copy this into your own answer and get the recognition for it; I am simply trying to close answered questions.
The %in% operator is what you want. For example,
# data in x
targets <- c(1, 4, 7, 10)
locations <- x %in% targets
# locations is a logical vector you can then use:
y <- x[locations]
There'll be an extra step or two if you wanted the row and column indices of the locations, but it's not clear if you do. (Note, the logicals will be in column order).
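The extra step mentioned above might look like the following sketch. The data are made up to match the question, and the 5 x 4 matrix m is purely an assumption to illustrate row and column indices. Because %in% drops the matrix dimensions, arrayInd is used to map the linear positions back to rows and columns.

```r
x <- rep(1:10, times = 2)
targets <- c(1, 4, 7, 10)

locations <- x %in% targets   # logical vector
positions <- which(locations) # numeric positions, like MATLAB's find()
positions                     # 1 4 7 10 11 14 17 20
y <- x[locations]             # the matching values themselves

# If the data were a matrix (filled in column order), convert the
# linear positions to row/column indices
m <- matrix(x, nrow = 5)
idx <- arrayInd(which(m %in% targets), dim(m))
idx[3, ]                      # 2 2: the third match sits in row 2, column 2
```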