I am trying to write some code which will take a .csv file which contains some sample names as input and will output a data.frame containing the sample names and either a 96 well plate or 384 well plate format (A1, B1, C1...). For those who do not know, a 96 well plate has eight alphabetically labeled rows (A, B, C, D, E, F, G, H) and 12 numerically labeled columns (1:12) and a 384 well plate has 16 alphabetically labeled rows (A:P) and 24 numerically labeled columns (1:24). I am trying to write some code that will generate either of these formats (there CAN be two different functions to do this) allowing for the samples to be labeled either DOWN (A1, B1, C1, D1, E1, F1, G1, H1, A2...) or ACROSS (A1, A2, A3, A4, A5 ...).
So far, I have figured out how to get the row names fairly easily
rowLetter <- rep(LETTERS[1:8], length.out = variable)
#variable will be based on how many samples I have
I just cannot figure out how to get the numeric column names to apply correctly... I have tried:
colNumber <- rep(1:12, times = variable)
but it isn't that simple. All 8 rows must be filled before the col number increases by 1 if you're going 'DOWN' or all 12 columns must be filled before the row letter increases by 1 if you're going 'ACROSS'.
EDIT:
Here is a clunky version. It takes the number of samples that you have, a 'plate format' which IS NOT functional yet, and a direction and will return a data.frame with the wells and plate numbers. Next, I am going to a) fix the plate format so that it will work correctly and b) give this function the ability to take a list of samples names or ID's or whatever and return the sample names, well positions, and plate numbers!
plateLayout <- function(numOfSamples, plateFormat = 96, direction = "DOWN"){
#This assumes that each well will be filled in order. I may need to change this, but let's get it working first.
#Calculate the number of plates required
platesRequired <- ceiling(numOfSamples/plateFormat)
rowLetter <- character(0)
colNumber <- numeric(0)
plateNumber <- numeric(0)
#The following will work if the samples are going DOWN
if(direction == "DOWN"){
for(k in 1:platesRequired){
rowLetter <- c(rowLetter, rep(LETTERS[1:8], length.out = 96))
for(i in 1:12){
colNumber <- c(colNumber, rep(i, times = 8))
}
plateNumber <- c(plateNumber, rep(k, times = 96))
}
plateLayout <- paste0(rowLetter, colNumber)
plateLayout <- data.frame(plateLayout, plateNumber)
plateLayout <- plateLayout[1:numOfSamples,]
return(plateLayout)
}
#The following will work if the samples are going ACROSS
if(direction == "ACROSS"){
for(k in 1:platesRequired){
colNumber <- c(colNumber, rep(1:12, times = 8))
for(i in 1:8){
rowLetter <- c(rowLetter, rep(LETTERS[i], times = 12))
}
plateNumber <- c(plateNumber, rep(k, times = 96))
}
plateLayout <- paste0(rowLetter, colNumber)
plateLayout <- data.frame(plateLayout, plateNumber)
plateLayout <- plateLayout[1:numOfSamples,]
return(plateLayout)
}
}
Does anybody have any thoughts on what else might make this cool? I'm going to use this function to generate .csv or .txt files to use as sample name imports for different instruments so I will be kind of constrained in terms of 'cool features', but I think it would be cool to use ggplot to make a graphic which shows the plates and sample names?
You don't need for loops. Here is a start:
#some sample ids
ids <- c(LETTERS, letters)
#plate size:
n <- 96
nrow <- 8
samples <- character(n)
samples[seq_along(ids)] <- ids
samples <- matrix(samples, nrow=nrow)
colnames(samples) <- seq_len(n/nrow)
rownames(samples) <- LETTERS[seq_len(nrow)]
# 1 2 3 4 5 6 7 8 9 10 11 12
# A "A" "I" "Q" "Y" "g" "o" "w" "" "" "" "" ""
# B "B" "J" "R" "Z" "h" "p" "x" "" "" "" "" ""
# C "C" "K" "S" "a" "i" "q" "y" "" "" "" "" ""
# D "D" "L" "T" "b" "j" "r" "z" "" "" "" "" ""
# E "E" "M" "U" "c" "k" "s" "" "" "" "" "" ""
# F "F" "N" "V" "d" "l" "t" "" "" "" "" "" ""
# G "G" "O" "W" "e" "m" "u" "" "" "" "" "" ""
# H "H" "P" "X" "f" "n" "v" "" "" "" "" "" ""
library(reshape2)
samples <- melt(samples)
samples$position <- paste0(samples$Var1, samples$Var2)
# Var1 Var2 value position
# 1 A 1 A A1
# 2 B 1 B B1
# 3 C 1 C C1
# 4 D 1 D D1
# 5 E 1 E E1
# 6 F 1 F F1
# 7 G 1 G G1
# 8 H 1 H H1
# 9 A 2 I A2
# 10 B 2 J B2
# 11 C 2 K C2
# 12 D 2 L D2
# 13 E 2 M E2
# 14 F 2 N F2
# 15 G 2 O G2
# 16 H 2 P H2
# 17 A 3 Q A3
# 18 B 3 R B3
# 19 C 3 S C3
# 20 D 3 T D3
# 21 E 3 U E3
# 22 F 3 V F3
# 23 G 3 W G3
# 24 H 3 X H3
# 25 A 4 Y A4
# 26 B 4 Z B4
# 27 C 4 a C4
# 28 D 4 b D4
# 29 E 4 c E4
# 30 F 4 d F4
# 31 G 4 e G4
# 32 H 4 f H4
# 33 A 5 g A5
# 34 B 5 h B5
# 35 C 5 i C5
# 36 D 5 j D5
# 37 E 5 k E5
# 38 F 5 l F5
# 39 G 5 m G5
# 40 H 5 n H5
# 41 A 6 o A6
# 42 B 6 p B6
# 43 C 6 q C6
# 44 D 6 r D6
# 45 E 6 s E6
# 46 F 6 t F6
# 47 G 6 u G6
# 48 H 6 v H6
# 49 A 7 w A7
# 50 B 7 x B7
# 51 C 7 y C7
# 52 D 7 z D7
# 53 E 7 E7
# 54 F 7 F7
# 55 G 7 G7
# 56 H 7 H7
# 57 A 8 A8
# 58 B 8 B8
# 59 C 8 C8
# 60 D 8 D8
# 61 E 8 E8
# 62 F 8 F8
# 63 G 8 G8
# 64 H 8 H8
# 65 A 9 A9
# 66 B 9 B9
# 67 C 9 C9
# 68 D 9 D9
# 69 E 9 E9
# 70 F 9 F9
# 71 G 9 G9
# 72 H 9 H9
# 73 A 10 A10
# 74 B 10 B10
# 75 C 10 C10
# 76 D 10 D10
# 77 E 10 E10
# 78 F 10 F10
# 79 G 10 G10
# 80 H 10 H10
# 81 A 11 A11
# 82 B 11 B11
# 83 C 11 C11
# 84 D 11 D11
# 85 E 11 E11
# 86 F 11 F11
# 87 G 11 G11
# 88 H 11 H11
# 89 A 12 A12
# 90 B 12 B12
# 91 C 12 C12
# 92 D 12 D12
# 93 E 12 E12
# 94 F 12 F12
# 95 G 12 G12
# 96 H 12 H12
Use the byrow argument when building the matrix (that is, before the melt step) to fill it in the other direction:
samples <- matrix(samples, nrow=nrow, byrow=TRUE)
To fill more than one plate, you can use basically the same idea, but use an array instead of a matrix.
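As a rough sketch of the array idea (not necessarily how the answerer would write it), the same column-major fill extends to a third dimension for the plate. The sample IDs and the names n_row, n_col, n_plate, plates and layout are made up for illustration, and expand.grid is used here instead of melt only to stay in base R.

```r
# Hypothetical sample IDs: 100 samples, so two 96-well plates are needed
ids <- sprintf("sample%03d", 1:100)
n_row <- 8
n_col <- 12
n_plate <- ceiling(length(ids) / (n_row * n_col))

# Pad with empty strings so the vector fills the array exactly
padded <- character(n_row * n_col * n_plate)
padded[seq_along(ids)] <- ids

# Arrays fill with rows varying fastest, then columns, then plates ("DOWN" order)
plates <- array(padded,
                dim = c(n_row, n_col, n_plate),
                dimnames = list(LETTERS[seq_len(n_row)],
                                as.character(seq_len(n_col)),
                                as.character(seq_len(n_plate))))

# Long format: expand.grid also varies its first factor fastest,
# so its rows line up with as.vector(plates)
layout <- expand.grid(row = LETTERS[seq_len(n_row)],
                      col = seq_len(n_col),
                      plate = seq_len(n_plate),
                      stringsAsFactors = FALSE)
layout$sample <- as.vector(plates)
layout$position <- paste0(layout$row, layout$col)

# Drop the unused (padding) wells
layout <- layout[layout$sample != "", ]
```

plates[, , 1] is then the first plate as an 8 x 12 matrix, and layout has one row per sample with its plate and well position.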
I've never written this code in R before but it should be the same as Perl, Python or Java
For Row major order (going across) the pseudocode algorithm is simply:
for each( i : 0..totalNumWells - 1){
column = (i % numColumns)
row = ((i % totalNumWells) / numColumns)
}
Where numColumns is 12 for a 96 well plate or 24 for a 384 well plate, and totalNumWells is 96 or 384 respectively. The division is integer division. This will give you a column and row index in 0-based coordinates which is perfect for accessing arrays.
wellName = ABCs[row], column + 1
Where ABCs is an array of all the valid letters in your plate (or A-Z). +1 is to convert 0-based into 1-based, otherwise the first well will be A0 instead of A1.
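For what it's worth, the pseudocode above translates to R fairly directly, since %% and %/% (integer division) are vectorized. This is only a sketch; the function name well_names_across and its argument names are invented here.

```r
# Sketch: well names for the first n wells in row-major ("across") order.
# num_columns and total_wells correspond to numColumns and totalNumWells above.
well_names_across <- function(n, num_columns = 12, total_wells = 96) {
  i <- seq_len(n) - 1L                          # 0-based well index
  column <- i %% num_columns                    # 0-based column
  row <- (i %% total_wells) %/% num_columns     # 0-based row, wraps on each new plate
  paste0(LETTERS[row + 1L], column + 1L)        # +1 converts to 1-based labels
}

well_names_across(14)
# "A1" ... "A12" "B1" "B2"
```

Because R vectors are 1-based, the +1 is applied to the row index as well as the column when looking up the letter.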
I also want to point out that 384-well plates often aren't filled in row-major order. Most often I've seen sequencing centers prefer a "checkerboard" pattern (A01, A03, A05... then A02, A04, A06..., B01, B03... etc.) so that four 96-well plates can be combined into a single 384-well plate without changing the layout, which simplifies the picking robot's work. That makes the ith well much harder to compute.
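One way to think about the checkerboard layout is as an interleave of four 96-well "quadrants". The sketch below assumes quadrant 1 lands on odd rows and odd columns (starting at A01), quadrant 2 on odd rows and even columns (A02), quadrant 3 on even rows and odd columns (B01), and quadrant 4 on even rows and even columns (B02). That numbering is an assumption, as is the function name map_96_to_384; check what your own instrument expects.

```r
# Sketch: map a well on one of four 96-well plates to its 384-well position.
# quadrant: 1-4 (numbering assumed as described above)
# row96: 1-8 (A-H), col96: 1-12
map_96_to_384 <- function(quadrant, row96, col96) {
  row_offset <- (quadrant - 1L) %/% 2L   # 0 for quadrants 1-2, 1 for quadrants 3-4
  col_offset <- (quadrant - 1L) %% 2L    # 0 for quadrants 1 and 3, 1 for quadrants 2 and 4
  row384 <- 2L * (row96 - 1L) + row_offset + 1L
  col384 <- 2L * (col96 - 1L) + col_offset + 1L
  sprintf("%s%02d", LETTERS[row384], col384)
}

map_96_to_384(1, 1, 1:3)   # "A01" "A03" "A05"
map_96_to_384(2, 1, 1:3)   # "A02" "A04" "A06"
```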
The following code does what I set out to do. It makes as many plates as you need, assuming that your import list is already in order, and adds a "plateNumber" column indicating which plate (batch) each sample is on. It can only handle 96 or 384 well plates, but that is all I deal in so that is fine.
plateLayout <- function(numOfSamples, plateFormat = 96, direction = "DOWN"){
#This assumes that each well will be filled in order.
#Calculate the number of plates required
platesRequired <- ceiling(numOfSamples/plateFormat)
rowLetter <- character(0)
colNumber <- numeric(0)
plateNumber <- numeric(0)
#define the number of columns and number of rows based on plate format (96 or 384 well plate)
switch(as.character(plateFormat),
"96" = {numberOfColumns = 12; numberOfRows = 8},
"384" = {numberOfColumns = 24; numberOfRows = 16})
#The following will work if the samples are going DOWN
if(direction == "DOWN"){
for(k in 1:platesRequired){
rowLetter <- c(rowLetter, rep(LETTERS[1:numberOfRows], length.out = plateFormat))
for(i in 1:numberOfColumns){
colNumber <- c(colNumber, rep(i, times = numberOfRows))
}
plateNumber <- c(plateNumber, rep(k, times = plateFormat))
}
plateLayout <- paste0(rowLetter, colNumber)
plateLayout <- data.frame(plateNumber,plateLayout)
plateLayout <- plateLayout[1:numOfSamples,]
return(plateLayout)
}
#The following will work if the samples are going ACROSS
if(direction == "ACROSS"){
for(k in 1:platesRequired){
colNumber <- c(colNumber, rep(1:numberOfColumns, times = numberOfRows))
for(i in 1:numberOfRows){
rowLetter <- c(rowLetter, rep(LETTERS[i], times = numberOfColumns))
}
plateNumber <- c(plateNumber, rep(k, times = plateFormat))
}
plateLayout <- paste0(rowLetter, colNumber)
plateLayout <- data.frame(plateNumber, plateLayout)
plateLayout <- plateLayout[1:numOfSamples,]
return(plateLayout)
}
}
An example of how to use this would be as follows
#load whatever data you're going to use to get a plate layout on (sample ID's or names or whatever)
thisData <- read.csv("data.csv")
#make a data.frame containing your sample names and the function's output
#alternatively you can use length() if you have a list
plateLayoutDataFrame <- data.frame(thisData$sampleNames, plateLayout(nrow(thisData), plateFormat = 96, direction = "DOWN"))
#It will return something similar to the following, depending on your selections
#data plateNumber plateLayout
#sample1 1 A1
#sample2 1 B1
#sample3 1 C1
#sample4 1 D1
#sample5 1 E1
#sample6 1 F1
#sample7 1 G1
#sample8 1 H1
#sample9 1 A2
#sample10 1 B2
#sample11 1 C2
#sample12 1 D2
#sample13 1 E2
#sample14 1 F2
#sample15 1 G2
That sums up this function for now. Roland offered a good method of doing this which is less verbose, but I wanted to avoid the use of external packages if possible. I'm working on a shiny app now which actually uses this! I want it to be able to automatically subset based on the 'plateNumber' and write each plate as its own file... for more on this, go to: Automatic multi-file download in R-Shiny
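Outside of shiny, the "one file per plate" step could look roughly like the following. The layout data frame here is a made-up stand-in for the function's output (only the plateNumber column name comes from the function), and the files go to a temporary directory purely for illustration.

```r
# Hypothetical layout table, shaped like the output described above:
# 100 samples, so 96 on plate 1 and 4 on plate 2
layout <- data.frame(sample = sprintf("sample%03d", 1:100),
                     plateNumber = rep(1:2, times = c(96, 4)),
                     stringsAsFactors = FALSE)

out_dir <- tempdir()   # assumption: swap in your own output folder

# split() gives one data frame per plate, named by plate number
by_plate <- split(layout, layout$plateNumber)

# Write each plate to its own csv and keep the paths
files <- vapply(names(by_plate), function(p) {
  path <- file.path(out_dir, sprintf("plate_%s.csv", p))
  write.csv(by_plate[[p]], path, row.names = FALSE)
  path
}, character(1))
```

files is a named character vector of the paths that were written, one per plate.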
Here's how I'd do it.
put_samples_in_plates = function(sample_list, nwells=96, direction="across")
{
if(!nwells %in% c(96, 384)){
stop("Invalid plate size")
}
nsamples = nrow(sample_list)
nplates = ceiling(nsamples/nwells);
if(nwells==96){
rows = LETTERS[1:8]
cols = 1:12
}else if(nwells==384){
rows = LETTERS[1:16]
cols = 1:24
}else{
stop("Unrecognized nwells")
}
nrows = length(rows)
ncols = length(cols)
if(tolower(direction)=="down"){
single_plate_df = data.frame(row = rep(rows, times=ncols),
col = rep(cols, each=nrows))
}else if(tolower(direction)=="across"){
single_plate_df = data.frame(row = rep(rows, each=ncols),
col = rep(cols, times=nrows))
}else{
stop("Unrecognized direction")
}
single_plate_df = transform(single_plate_df,
well = sprintf("%s%02d", row, col))
toobig_plate_df = cbind(data.frame(plate=rep(1:nplates, each=nwells)),
do.call("rbind", replicate(nplates,
single_plate_df,
simplify=FALSE)))
res = cbind(sample_list, toobig_plate_df[1:nsamples,])
return(res)}
# Quick test
a_sample_list = data.frame(x=1:386, y=rnorm(386))
r.096.across = put_samples_in_plates(sample_list = a_sample_list,
nwells= 96,
direction="across")
r.096.down = put_samples_in_plates(sample_list = a_sample_list,
nwells= 96,
direction="down")
r.384.across = put_samples_in_plates(sample_list = a_sample_list,
nwells=384,
direction="across")
r.384.down = put_samples_in_plates(sample_list = a_sample_list,
nwells=384,
direction="down")
Two points worth noting in the function above:
the use of the times and each parameters within the rep function to differentiate "across" and "down" directions, and
the use of replicate to repeat the individual plate as many times as needed along with the use of a call to rbind from do.call.
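To see the times/each point in isolation, here is a toy 2 x 3 "plate" (sizes chosen only to keep the output short):

```r
rows <- LETTERS[1:2]
cols <- 1:3

# "down": rows cycle fastest, each column number is held for a full set of rows
down <- paste0(rep(rows, times = length(cols)),
               rep(cols, each = length(rows)))

# "across": columns cycle fastest, each row letter is held for a full set of columns
across <- paste0(rep(rows, each = length(cols)),
                 rep(cols, times = length(rows)))

down    # "A1" "B1" "A2" "B2" "A3" "B3"
across  # "A1" "A2" "A3" "B1" "B2" "B3"
```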
Related
suppose I have the following dataframe
x <- c(12,30,45,100,150,305,2,46,10,221)
x2 <- letters[1:10]
df <- data.frame(x,x2)
df <- df[with(df, order(x)), ]
x x2
7 2 g
9 10 i
1 12 a
2 30 b
3 45 c
8 46 h
4 100 d
5 150 e
10 221 j
6 305 f
And I would like to split these into groups based on another vector,
v <- seq(0, 500, 50)
Basically, I would like to partition each row based on column x and how it compares to v (for example, x <= an element in v); the location/index of that element in v is then used to assign a group to that row. The resulting table should look something like the following:
x x2 group
7 2 g g1
9 10 i g1
1 12 a g1
2 30 b g1
3 45 c g1
8 46 h g2
4 100 d g3
5 150 e g4
10 221 j g4
6 305 f g6
I could try to loop through each row and match it to v, but I'm still confused as to how I could easily detect where the match x <= element of v occurs so that I can assign a group id to it. Thanks.
You can use cut to break up df$x by the values of v:
df$group <- as.numeric(cut(df$x, breaks = v))
df$group <- paste0('g', df$group)
cut returns a factor, so as.numeric pulls out the integer code of the interval (bucket) of v that each value of df$x falls into.
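A small illustration of what cut is doing here, using a few of the values from the question. Note that the intervals are closed on the right by default, so 100 lands in (50,100], and a value equal to the lowest break is NA unless include.lowest = TRUE is set.

```r
v <- seq(0, 500, 50)
x <- c(2, 46, 100, 305)

bins <- cut(x, breaks = v)
levels(bins)[1:2]   # "(0,50]" "(50,100]"
as.integer(bins)    # 1 1 2 7

# A value sitting exactly on the lowest break is not covered by default
cut(0, breaks = v)                          # NA
cut(0, breaks = v, include.lowest = TRUE)   # falls in the first interval
```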
I'd like to be able to compare two tables and have R return a list of records and variables that don't match.
For example, with the following two tables
> df1
id let num
1 1a a 1
2 2b b 2
3 3c c 3
4 4d d 4
5 5e e 5
> df2
id let num
1 1a a 1
2 2b b 2
3 3c c 3
4 4d e 4
5 5e d 5
I would want a compare() function to return something like "id=4d, let" to let me know that the let variable in the record with id = 4d doesn't match.
I have seen the compare library in CRAN but it only returns TRUE or FALSE for the entire variable if there is a mismatch. Is there a library with a different compare function, or a way to do this manually?
df1 <- read.table(text="
id let1 num1
1a a 1
2b b 2
3c c 3
4d d 4
5e e 5", head=T, as.is=T)
df2 <- read.table(text="
id let2 num2
1a a 1
2b b 2
3c c 3
4d e 4
5e d 5", head=T, as.is=T)
df <- merge(df1, df2, by="id")
df$let <- ifelse(df$let1 == df$let2, "equal", "not equal")
df$num <- ifelse(df$num1 == df$num2, "equal", "not equal")
df
# id let1 num1 let2 num2 let num
# 1 1a a 1 a 1 equal equal
# 2 2b b 2 b 2 equal equal
# 3 3c c 3 c 3 equal equal
# 4 4d d 4 e 4 not equal equal
# 5 5e e 5 d 5 not equal equal
You mean something like which? Quick reproducible example:
> m1 <- m2 <- matrix(1:9, 3)
> diag(m1) <- 0
> which(m1 != m2, arr.ind = TRUE)
row col
[1,] 1 1
[2,] 2 2
[3,] 3 3
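The same which(..., arr.ind = TRUE) idea seems to carry over to the two data frames from the question, giving an "id, variable" report. This sketch assumes both tables have the same columns and the same row order (otherwise merge or sort by id first); the name report is made up.

```r
df1 <- data.frame(id = c("1a", "2b", "3c", "4d", "5e"),
                  let = c("a", "b", "c", "d", "e"),
                  num = 1:5, stringsAsFactors = FALSE)
df2 <- data.frame(id = c("1a", "2b", "3c", "4d", "5e"),
                  let = c("a", "b", "c", "e", "d"),
                  num = 1:5, stringsAsFactors = FALSE)

# Comparing two data frames cell by cell gives a logical matrix
mism <- which(df1 != df2, arr.ind = TRUE)

# Translate row/column indices into the id and the variable name
report <- data.frame(id = df1$id[mism[, "row"]],
                     variable = names(df1)[mism[, "col"]],
                     stringsAsFactors = FALSE)
report
```

Here report has two rows, one for id 4d and one for id 5e, both pointing at the let variable.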
Something like:
df_diff <- list()
for (i in 1:ncol(df1))
{
df_diff[[i]] <- df1$id[df2[i] != df1[i]]
names(df_diff)[i] <- names(df1)[i]
}
This should produce (hopefully :)) a list of character vectors (one for each variable). Each vector contains the IDs of df1 where the records of the two df don't match.
I would like to create a loop that will create a new column, then paste together two columns if a condition is met in a separate column. If the condition is not met, then the column would equal whatever value is in the existing column. Finally, I would like to delete the old columns and rename the new columns to match the old columns. In my example below, I create columns called a1_t, a2_t, a3_t. Then, if a1 == A, paste a1 and a1_c together and place the value in a1_t, otherwise copy the value from a1 into a1_t. Repeat this procedure for a2_t and a3_t.
Here is the data:
set.seed(1)
dat <- data.frame(a1 = sample(LETTERS[1:9],15,replace=T),
a1_c = sample (1:100,15),
a2 = sample(LETTERS[1:9],15,replace=T),
a2_c = sample (1:100,15),
a3 = sample(LETTERS[1:9],15,replace=T),
a3_c = sample (1:100,15))
Here is the long hand way of creating my end goal:
dat$a1_t <- 'none'
dat$a1_t[dat$a1=="A"] <- paste((dat$a1[dat$a1=="A"]),(dat$a1_c[dat$a1=="A"]),sep="_")
dat$a1_t[dat$a1=="B"] <- 'B'
dat$a1_t[dat$a1=="C"] <- 'C'
dat$a1_t[dat$a1=="D"] <- 'D'
dat$a1_t[dat$a1=="E"] <- 'E'
dat$a1_t[dat$a1=="F"] <- 'F'
dat$a1_t[dat$a1=="G"] <- 'G'
dat$a1_t[dat$a1=="H"] <- 'H'
dat$a1_t[dat$a1=="I"] <- 'I'
dat$a2_t <- 'none'
dat$a2_t[dat$a2=="A"] <- paste((dat$a2[dat$a2=="A"]),(dat$a2_c[dat$a2=="A"]),sep="_")
dat$a2_t[dat$a2=="B"] <- 'B'
dat$a2_t[dat$a2=="C"] <- 'C'
dat$a2_t[dat$a2=="D"] <- 'D'
dat$a2_t[dat$a2=="E"] <- 'E'
dat$a2_t[dat$a2=="F"] <- 'F'
dat$a2_t[dat$a2=="G"] <- 'G'
dat$a2_t[dat$a2=="H"] <- 'H'
dat$a2_t[dat$a2=="I"] <- 'I'
dat$a3_t <- 'none'
dat$a3_t[dat$a3=="A"] <- paste((dat$a3[dat$a3=="A"]),(dat$a3_c[dat$a3=="A"]),sep="_")
dat$a3_t[dat$a3=="B"] <- 'B'
dat$a3_t[dat$a3=="C"] <- 'C'
dat$a3_t[dat$a3=="D"] <- 'D'
dat$a3_t[dat$a3=="E"] <- 'E'
dat$a3_t[dat$a3=="F"] <- 'F'
dat$a3_t[dat$a3=="G"] <- 'G'
dat$a3_t[dat$a3=="H"] <- 'H'
dat$a3_t[dat$a3=="I"] <- 'I'
If you are dealing with a small number of columns, you might just want to use within and ifelse, like this:
within(dat, {
a1_t <- ifelse(a1 == "A", paste(a1, a1_c, sep = "_"),
as.character(a1))
a2_t <- ifelse(a2 == "A", paste(a2, a2_c, sep = "_"),
as.character(a2))
a3_t <- ifelse(a3 == "A", paste(a3, a3_c, sep = "_"),
as.character(a3))
})
You can, however, extend the idea programmatically, if necessary.
I've added comments throughout the code below so you can see what it's doing.
## What variables are we checking?
checkMe <- c("a1", "a2", "a3")
## Let's convert those to character first
dat[checkMe] <- lapply(dat[checkMe], as.character)
cbind(dat, ## We'll combine the original data using cbind
setNames( ## setNames is for the resulting column names
lapply(checkMe, function(x) { ## lapply is an optimized loop
Get <- c(x, paste0(x, "_c")) ## We need this for the "if" part
ifelse(dat[, x] == "A", ## logical comparison
## if matched, paste together the value from
## the relevant column
paste(dat[, Get[1]], dat[, Get[2]], sep = "_"),
dat[, x]) ## else return the original value
}),
paste0(checkMe, "_t"))) ## the column names we want
# a1 a1_c a2 a2_c a3 a3_c a1_t a2_t a3_t
# 1 C 50 E 79 I 90 C E I
# 2 D 72 F 3 C 86 D F C
# 3 F 98 E 47 E 39 F E E
# 4 I 37 B 72 C 76 I B C
# 5 B 75 H 67 F 93 B H F
# 6 I 89 G 46 C 42 I G C
# 7 I 20 H 81 E 67 I H E
# 8 F 61 A 41 G 38 F A_41 G
# 9 F 12 G 23 A 30 F G A_30
# 10 A 25 D 7 H 69 A_25 D H
# 11 B 35 H 9 D 19 B H D
# 12 B 2 F 29 H 64 B F H
# 13 G 34 H 95 D 11 G H D
# 14 D 76 E 58 D 22 D E D
# 15 G 30 E 35 E 13 G E E
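If the end goal is for the new values to replace the old columns (rather than sit next to them as *_t), a plain loop that overwrites in place might be enough, which avoids the create/delete/rename steps. This is a sketch on a tiny made-up data frame, assuming the columns are character rather than factor.

```r
# Small made-up data in the same shape as the question's
dat <- data.frame(a1 = c("A", "B", "A"), a1_c = c(10, 20, 30),
                  a2 = c("C", "A", "D"), a2_c = c(1, 2, 3),
                  stringsAsFactors = FALSE)

checkMe <- c("a1", "a2")
for (x in checkMe) {
  counts <- dat[[paste0(x, "_c")]]        # the matching *_c column
  dat[[x]] <- ifelse(dat[[x]] == "A",
                     paste(dat[[x]], counts, sep = "_"),  # condition met: paste
                     dat[[x]])                            # otherwise keep as is
}
dat$a1   # "A_10" "B" "A_30"
dat$a2   # "C" "A_2" "D"
```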
I have a simple question. I have a list of objects. Each object holds a few lists. Before this gets too complicated, let me illustrate:
x = a list
x[[1]] = some object
x[[2]] = another object
...
x[[n]] = another object
And as I said, each object holds some more lists. But I'm interested in a specific list, let's call it "a".
x[[1]][[a]] = ('A': 1, 'B': 2, 'C': 3, ..., Z: 26)
Sorry for the python-like syntax! I am really just learning R. Anyway, what I want to do is combine the lists held in these objects, then take their median. To make this more clear, I want to group all 'A' elements, then take their median:
x[[1]][[a]][['A']], x[[2]][[a]][['A']], x[[3]][[a]][['A']], ..., x[[n]][[a]][['A']]
Similarly I want to group all 'B', 'C', ..., 'Z' elements and take their median...
x[[1]][[a]][['Z']], x[[2]][[a]][['Z']], x[[3]][[a]][['Z']], ..., x[[n]][[a]][['Z']]
So the question is what's the best way to do this? I've spent hours trying to figure this out! It would be great if someone could help me.
And if you would like to know what I'm actually doing, basically I have a list (x) of random forest objects. So x[[1]] is the first random forest, x[[100]] is the 100th random forest. Each random forest has a list of predicted values, which are stored in, e.g. x[[1]][['predicted']]. Each prediction list has a label associated with its predicted value. What I'm actually trying to do is calculate each label's median predicted value across all 100 random forests. And I want to do it efficiently. In Python, this is easy, but in R I'm not so sure. Anyway, thanks for the help!!! I really appreciate it.
Here's one way you could do it. It's a bit tough because you can't use rapply to subset by the names of list elements (which is frustrating). But you can unlist and then subset on names and take the median that way...
# Make some reproducible data
set.seed(1)
l <- list( a = sample(10,3) , b = sample(10,3) , c = sample(10,3) )
ll <- list( l , l , l )
# Unlist - we get a named vector but all a's have unique names - e.g. a1 , a2... an
unl <- unlist(ll)
# a1 a2 a3 b1 b2 b3 c1 c2 c3 a1 a2 a3 b1 b2 b3 c1 c2 c3 a1 a2 a3 b1 b2 b3 c1 c2 c3
# 3 4 5 10 2 8 10 6 9 3 4 5 10 2 8 10 6 9 3 4 5 10 2 8 10 6 9
# Subset by those elements that contain 'a' in their name
a.unl <- unl[ grepl("a",names(unl)) ]
# a1 a2 a3 a1 a2 a3 a1 a2 a3
# 3 4 5 3 4 5 3 4 5
# Take median
median( a.unl )
# [1] 4
To loop over multiple names try this...
sapply( c( "a" , "b" , "c" ) , function(x) median( unl[ grepl(x,names(unl) ) ] ) )
# a b c
# 4 8 9
you could do this with a simple loop for every A,B,C,...
# use a separate accumulator so the list x itself is not overwritten
vals <- c()
for( i in seq_along(x) ) vals <- c( vals, x[[i]][["a"]][["A"]] )
median(vals)
Sample data for creating your top-level list x:
x <- replicate(3, list(a = as.list(setNames(sample(1:100, 26), LETTERS)),
b = runif(10)),
simplify = FALSE)
First, extract a from each list:
a.only <- lapply(x, `[[`, "a")
Then, to compute all A through Z medians in one shot, do:
do.call(mapply, c(a.only, FUN = function(...) median(unlist(list(...)))))
# A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
# 55 59 41 21 93 72 65 74 51 42 87 25 60 40 13 77 35 31 92 51 57 37 87 67 29 46
If the sublists contain more items than you need, say you only want to compute medians on A, C, Z, do:
a.slices <- lapply(a.only, `[`, c("A", "C", "Z"))
do.call(mapply, c(a.slices, FUN = function(...) median(unlist(list(...)))))
# A C Z
# 55 41 46
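Another way to look at the same computation, in case it reads more easily: bind each object's "a" into one column of a matrix and take row medians. This is only a sketch and it assumes every "a" has the same labels in the same order; a_mat and label_medians are names made up here.

```r
set.seed(42)
# Made-up top-level list: 5 objects, each with a labelled list "a"
x <- replicate(5, list(a = as.list(setNames(runif(26), LETTERS)),
                       b = runif(10)),
               simplify = FALSE)

# One column per top-level object, one row per label
a_mat <- sapply(x, function(obj) unlist(obj[["a"]]))

# Median of each label across all objects
label_medians <- apply(a_mat, 1, median)
head(label_medians, 3)
```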
I have a list of names in A1:A144 and I want to move A49:A96 to B1:B48 and A97:A144 to C1:C48.
So after every 48th row, I want the next 48 rows moved to a new column.
How to do that?
If you want to consider a VBA alternative then:
Sub MoveData()
nF = 1
nL = 48
nSize = Cells(Rows.Count, "A").End(xlUp).Row
nBlock = nSize / nL
For k = 1 To nBlock
nF = nF + 48
nL = nL + 48
Range("A" & nF & ":A" & nL).Copy Cells(1, k + 1)
Range("A" & nF & ":A" & nL).ClearContents
Next k
End Sub
Not sure how scalable this solution is, but it does work.
First let's pretend your names are x and you want the solution to be in new.df
number.shifts <- ceiling(length(x) / 48) # work out how many columns we need
# create an empty (NA) matrix with 48 rows and one column per block of 48
new.df <- matrix(data = NA, nrow = 48, ncol = number.shifts)
# run a for-loop over x, moving to the next column after every 48th item
for (i in seq_along(x)){
j <- (i - 1) %/% 48 + 1 # column: 1 for items 1-48, 2 for items 49-96, ...
r <- (i - 1) %% 48 + 1 # row within that column
new.df[r, j] <- x[i]
}
I think you have to elaborate on your question a little more. Do you have the data in R or in Excel and do you want the output to be in R or in Excel?
That being said, if x is your vector indicating clusters
x <- rep(1:3, each = 48)
and y is the variable containing names or whatever that you want to distribute over columns A:C (each having 48 rows),
y <- sample(letters, 3 * 48, replace = TRUE)
you can do this:
y.wide <- do.call(cbind, split(y, x))
Just as there is stack in R to create a very long representation of a group of columns, there is unstack to take a long column and make it into a wide form.
Here's a basic example:
mydf <- data.frame(A = 1:144)
mydf$groups <- paste0("A", gl(n=3, k=48)) ## One of many ways to create groups
mydf2 <- unstack(mydf)
head(mydf2)
# A1 A2 A3
# 1 1 49 97
# 2 2 50 98
# 3 3 51 99
# 4 4 52 100
# 5 5 53 101
# 6 6 54 102
tail(mydf2)
# A1 A2 A3
# 43 43 91 139
# 44 44 92 140
# 45 45 93 141
# 46 46 94 142
# 47 47 95 143
# 48 48 96 144
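If the data is already a plain vector in R, the reshape can arguably be done with matrix alone, since matrices fill column by column. A sketch with made-up names, including padding for the case where the length is not a multiple of 48:

```r
x <- sprintf("name%03d", 1:144)   # hypothetical names
block <- 48

n_col <- ceiling(length(x) / block)
length(x) <- block * n_col        # pads with NA if x does not fill the last column

# Columns are filled top to bottom, so items 49-96 land in the second column
wide <- matrix(x, nrow = block,
               dimnames = list(NULL, LETTERS[seq_len(n_col)]))

wide[1, ]    # first row:  "name001" "name049" "name097"
wide[48, ]   # last row:   "name048" "name096" "name144"
```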