Using a different variable in each iteration of a function - r
Hi, I've got a function which performs calculations on several columns (written with kind support from the Stack Overflow community). I'd like to adapt it so that instead of calculating MOD in each iteration, it looks up a value in a table and uses that. The MOD for each column will be different.
Here is the function that does the calculations:
library(data.table)
dataDT<- data.frame(CAG = c(13, 14, 15, 17),
A01 = c(6485,35,132, 12),
A02 = c(0,42,56, 4))
thres <- 0.2
dataDT<-setDT(dataDT)
colsToBeUsed<-names(dataDT[,!'CAG'])
sumDataSetdata<-c()
sumDataSet<-unlist(lapply(X=1:length(colsToBeUsed),function(X){s=colsToBeUsed[X]
eval(parse(text=paste0('dataDT[',s,'<thres*max(',s,'),',s,':=0]')))
eval(parse(text=paste0('dataDT[,MOD',s,':=dataDT[',s,'==max(',s,'),CAG]]')))
eval(parse(text=paste0('dataDT[,norm',s,':=',s,'/sum(',s,')]')))
eval(parse(text=paste0('dataDT[,sum',s,':=',s,'/sum(',s,')*(CAG-MOD',s,'),]')))
eval(parse(text=paste0('rbind(sumDataSetdata,dataDT[,sum(sum',s,')])')))
}))
Here is the table which gives the MOD:
MODs <- data.frame(samples = c('A01', 'A02', 'A03', 'A04'), MOD = c(117.8, 120.2, 124.5, 130.6))
Here is the table which says which MOD to use for each 'sample' column
ctrls <- data.frame(samples = c('A01', 'A02', 'A03', 'A04'), ctrl = c('A01', 'A01', 'A03', 'A03'))
Response to answer 1
Thank you, that works well for the example. I've been trying to apply it to my real data and am having a few difficulties. Here is the code for my real data.
library(data.table)
dataDT <- data.frame(area[,7:ncol(height)])
dataDT <- setDT(dataDT)
colsToBeUsed<-names(dataDT[,!'CAG'])
MODs <- data.frame(samples = samples$unique.inputdf.SampleFileName., MOD = htresults$mode)
ctrls <- data.frame(samples = samples$unique.inputdf.SampleFileName., ctrl = 'A01_RR20170609_FA_A01_2017-06-09_1.fsa')
myFun <- function(x, mod, cag, thres) {
x[x < (thres * max(x))] <- 0
norm_x <- x / sum(x)
sum_x <- norm_x * (cag - mod)
sum(sum_x)
}
transision_matrix <- merge(ctrls, MODs, by.x = "ctrl", by.y = "samples")
setDT(transision_matrix)
mf2 <- function(colname, dataDT, transision_matrix){
x <- dataDT[, colname, with = F][[1]]
mod <- transision_matrix[samples == colname, MOD]
cag <- dataDT[, "CAG"][[1]]
myFun(x, mod, cag, thres = 0.2)
}
sapply(colsToBeUsed, function(x) mf2(x, dataDT, transision_matrix))
The control sample for all columns in this experiment is A01_RR20170609_FA_A01_2017.06.09_1.fsa and its mode is 20.67000
This is the result I get
A01_RR20170609_FA_A01_2017.06.09_1.fsa A02_RR20170609_FA_A02_2017.06.09_1.fsa
0 0
A03_RR20170609_FA_A03_2017.06.09_1.fsa A04_RR20170609_FA_A04_2017.06.09_1.fsa
0 0
A05_RR20170609_FA_A05_2017.06.09_1.fsa A06_RR20170609_FA_A06_2017.06.09_1.fsa
0 0
A07_RR20170609_FA_A07_2017.06.09_1.fsa A08_RR20170609_FA_A08_2017.06.09_1.fsa
0 0
A09_RR20170609_FA_A09_2017.06.09_1.fsa A10_RR20170609_FA_A10_2017.06.09_1.fsa
0 0
A11_RR20170609_FA_A11_2017.06.09_1.fsa A12_RR20170609_FA_A12_2017.06.09_1.fsa
0 0
The results I'm expecting are:
[1] 4.108246 5.868355 4.608756 -1.159657 4.015066 4.364199 5.262355 4.337760 6.496672 5.574396
[11] 5.102111 8.911440
In case it's useful, here is info about the table 'height':
'data.frame': 660 obs. of 19 variables:
$ Dye/SamplePeak : chr "B,66" "B,67" "B,68" "B,69" ...
$ Marker : chr NA NA NA NA ...
$ Allele : chr NA NA NA NA ...
$ Size : num 144 147 148 150 151 ...
$ Area : num 148288 110 907 3355 1274 ...
$ DataPoint : num 2591 2622 2641 2655 2671 ...
$ CAG : num 13.9 14.9 15.5 15.9 16.4 ...
$ A01_RR20170609_FA_A01_2017-06-09_1.fsa: num 6485 32 125 450 211 ...
$ A02_RR20170609_FA_A02_2017-06-09_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ A03_RR20170609_FA_A03_2017-06-09_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ A04_RR20170609_FA_A04_2017-06-09_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ A05_RR20170609_FA_A05_2017-06-09_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ A06_RR20170609_FA_A06_2017-06-09_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ A07_RR20170609_FA_A07_2017-06-09_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ A08_RR20170609_FA_A08_2017-06-09_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ A09_RR20170609_FA_A09_2017-06-09_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ A10_RR20170609_FA_A10_2017-06-09_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ A11_RR20170609_FA_A11_2017-06-09_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ A12_RR20170609_FA_A12_2017-06-09_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
The table 'height' is quite large, but I've added the results from column 'CAG' and 'A01_RR20170609_FA_A01_2017.06.09_1.fsa' in case it's useful for working.
**CAG**
[1] 13.92667 14.88000 15.46333 15.89333 16.38333 16.84333 17.39667 17.79667 18.41000 18.78000 19.39333
[12] 19.76000 20.33667 20.67000 21.03333 21.36667 23.52000 24.45667 25.39667 27.31000 28.34667 29.32000
[23] 30.23333 31.17667 32.15000 32.72667 33.08333 33.67667 34.03667 34.60667 34.96667 35.54000 35.93333
[34] 36.54000 36.87333 37.49333 37.87333 38.50667 38.89000 39.48333 39.84667 40.45000 40.81000 41.41000
[45] 41.73667 42.45333 42.77667 43.34000 43.69333 46.21000 49.21333 52.21667 52.60667 13.90000 14.95000
[56] 15.47333 15.99667 16.46000 16.86333 17.48000 17.85000 18.43667 18.80667 19.42667 19.76333 20.43667
[67] 21.38667 21.72333 22.70000 23.31333 23.61667 24.23000 24.59667 25.60333 26.55333 27.16333 27.44000
[78] 28.42000 29.40000 30.37667 30.98667 31.32333 32.30000 33.23667 34.16667 35.13333 36.07667 37.05667
[89] 38.03333 38.63333 39.01667 39.60667 39.97333 40.57667 40.94000 41.54333 41.87333 42.47333 42.83333
[100] 43.46000 43.78667 44.70333 12.48333 13.03667 13.92667 14.88000 15.49333 15.86333 16.44667 16.78333
[111] 17.36667 17.82667 18.38000 18.84333 19.42333 19.79000 20.34000 20.79667 21.28333 21.74333 23.60000
[122] 24.54000 25.51333 26.48667 27.43333 28.40667 29.32000 29.96000 30.26333 30.90333 31.36000 32.24000
[133] 33.17333 34.25667 35.40667 36.01667 36.56667 36.99667 37.58667 37.97000 38.60333 38.98333 39.57667
[144] 39.94000 40.54667 40.90667 41.51000 41.86667 42.43667 42.79333 43.38667 43.74000 12.63667 13.95667
[155] 15.52333 15.92333 16.53667 16.90667 17.49000 17.89000 18.44333 18.81000 19.42333 19.79000 20.40000
[166] 20.76667 21.37667 21.74333 22.38000 24.54000 25.51333 27.46333 28.43667 29.04667 29.38000 30.29333
[177] 31.26667 32.24000 33.17333 34.10667 35.04333 35.68000 35.98667 36.59667 36.96333 37.55667 37.97000
[188] 38.60333 38.95333 39.57667 39.94000 40.51333 40.87333 41.47333 41.83333 42.43000 42.75667 43.35000
[199] 43.70667 44.29667 44.65000 45.24000 45.56000 46.54333 47.54333 12.45333 13.04000 13.87333 14.86667
[210] 15.48667 15.86000 16.44667 16.79000 17.37667 17.81333 18.43333 18.80667 19.42667 19.79667 20.41333
[221] 20.75000 21.36667 21.73333 22.59667 23.54667 24.53000 25.51000 26.46000 27.44000 28.38667 29.00000
[232] 29.33667 29.91667 30.28333 31.26333 31.84333 33.08667 34.08000 34.68667 35.08333 35.97333 36.37333
[243] 36.96000 37.59000 37.94667 38.59000 38.98000 39.57667 39.94333 40.55000 40.88333 41.48667 41.85000
[254] 42.42000 42.77667 43.37333 43.73000 44.29333 44.64667 45.26667 45.56000 12.57667 12.94000 13.55333
[265] 13.92000 14.50000 14.86667 15.41667 15.87667 16.51667 16.88667 17.49667 17.86333 18.47667 18.81333
[276] 19.42333 19.79000 20.40000 20.76667 21.40667 21.74333 23.60000 27.42667 28.94333 29.76333 30.33667
[287] 32.21333 33.17000 34.09333 34.57333 35.05667 35.69000 36.05333 36.69000 36.99667 37.58667 37.97000
[298] 38.60333 38.98333 39.60333 39.96333 40.56333 40.92000 41.48667 41.84333 42.43333 42.78667 43.40667
[309] 43.70000 12.05333 12.51333 12.91333 13.90000 14.48667 14.85667 15.47667 15.78333 16.40000 16.83333
[320] 17.48000 17.88333 18.47000 18.81000 19.39333 19.79000 20.40000 20.76667 21.34667 21.71000 22.26000
[331] 22.62333 23.56667 24.57333 26.43000 27.99333 28.66667 29.40000 31.29333 32.30000 33.17000 34.09000
[342] 34.69000 35.08000 35.65333 36.02000 36.23000 36.93333 37.52667 37.94667 38.59000 38.94667 39.57667
[353] 39.94000 40.54333 40.90333 41.47333 41.83333 42.42667 42.78333 43.40333 43.72667 52.24333 52.57667
[364] 12.02667 12.30000 13.73667 14.99333 15.57333 15.94000 16.52333 16.92000 17.44000 17.96000 18.44667
[375] 18.81333 19.45333 19.81667 20.42000 20.78333 21.35667 21.66000 22.41333 24.52333 25.52000 26.45333
[386] 27.42667 28.06333 28.33333 28.94000 29.36333 29.91000 30.30333 30.88000 31.24333 31.85000 32.18333
[397] 33.14333 33.77000 34.13000 34.70000 35.03000 35.66333 36.02667 36.63333 37.00000 37.58333 37.99000
[408] 38.58333 38.96000 39.57333 39.93333 40.53000 40.88667 41.48000 41.83667 42.43000 42.78333 43.37000
[419] 43.69333 44.36667 44.69000 45.27333 45.65000 46.24000 46.54000 50.86667 51.52667 12.42333 13.03333
[430] 13.85667 14.47000 14.86667 15.47667 15.84667 16.39667 16.70000 17.40667 17.80333 19.21000 19.79000
[441] 20.18667 20.73667 21.31667 21.71000 22.20000 22.59333 23.56667 24.51000 25.48333 26.42667 27.03333
[452] 27.33667 28.33667 29.30667 29.94333 30.91333 31.91000 32.78667 33.64667 34.00333 34.63333 34.99333
[463] 35.59667 35.99333 36.54000 36.93667 37.52333 37.90333 38.54000 38.92000 39.54667 39.90667 40.51000
[474] 40.84000 41.44000 41.79667 42.39333 42.74667 43.31000 43.69333 44.22000 52.21667 52.60667 12.27000
[485] 13.79333 14.95000 15.44000 16.01667 16.44333 16.84000 17.47667 17.84333 18.45000 18.81667 19.42333
[496] 19.69667 20.12000 20.48000 21.38667 21.68667 22.59333 23.52667 24.52333 25.12667 25.45667 26.45667
[507] 26.97333 27.39667 28.33667 28.97333 29.36667 29.91333 30.27667 31.24667 31.82000 32.18333 33.14000
[518] 34.06000 34.65667 35.01667 35.61333 35.97333 36.54667 36.94000 37.52000 37.92667 38.55333 38.92667
[529] 39.51333 39.87000 40.46667 40.82000 41.41333 41.76667 42.35667 42.71000 43.29667 43.64667 44.23000
[540] 44.58000 45.21667 45.53667 46.15000 46.51000 54.90333 12.13000 12.51667 13.81333 14.84000 15.44333
[551] 15.89333 16.74000 17.37333 17.76333 18.33667 18.73000 19.30333 19.60333 20.37667 21.30333 21.63000
[562] 22.52333 23.50667 24.48667 25.46667 26.38667 26.92333 27.34000 27.90667 28.29333 29.24333 30.22667
[573] 30.79333 31.24000 31.83333 32.43000 32.93000 33.46000 34.02333 34.67333 34.97000 35.53667 35.92333
[584] 36.52333 36.88333 37.48667 37.85667 38.50333 38.87333 39.45333 39.83667 40.42667 40.81000 41.36667
[595] 41.71667 42.33000 42.68000 43.26333 43.61000 44.56333 48.30000 12.46000 12.84667 13.83333 14.43000
[606] 14.82000 15.35667 15.65667 16.40333 17.42000 17.78000 18.55667 18.88333 19.30333 19.69000 20.31333
[617] 20.64000 21.23333 21.61667 22.21000 22.47667 23.45667 24.04667 24.40333 24.99333 25.35000 25.94000
[628] 26.29667 26.92000 27.27667 27.84000 28.25667 28.85000 29.20333 30.15333 31.13000 31.72000 32.10667
[639] 32.69667 33.04333 33.91667 34.91000 35.91000 36.44333 36.79667 37.39333 37.79333 38.41000 38.78000
[650] 39.39333 39.77333 40.36000 40.71000 41.29000 41.64000 42.24667 42.59333 43.54667 44.49000 45.40333
A01_RR20170609_FA_A01_2017.06.09_1.fsa
[1] 6485 32 125 450 211 703 553 1549 1360 3526 5028 13610 15986 31233 713 1260
[17] 31 37 33 46 43 48 40 63 78 33 118 40 176 65 296 103
[33] 501 242 923 545 2006 1355 4348 2564 8615 3886 12985 227 669 85 61 57
[49] 103 32 42 50 64 0 0 0 0 0 0 0 0 0 0 0
[65] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[81] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[97] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[113] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[129] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[145] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[161] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[177] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[193] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[209] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[225] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[241] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[257] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[273] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[289] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[305] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[321] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[337] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[353] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[369] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[385] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[401] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[417] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[433] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[449] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[465] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[481] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[497] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[513] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[529] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[545] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[561] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[577] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[593] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[609] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[625] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[641] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[657] 0 0 0 0
Response to answer 2
dputs as requested
dput(transision_matrix)
structure(list(ctrl = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L), .Label = "A01_RR20170609_FA_A01_2017-06-09_1.fsa", class = "factor"),
samples = structure(1:12, .Label = c("A01_RR20170609_FA_A01_2017-06-09_1.fsa",
"A02_RR20170609_FA_A02_2017-06-09_1.fsa", "A03_RR20170609_FA_A03_2017-06-09_1.fsa",
"A04_RR20170609_FA_A04_2017-06-09_1.fsa", "A05_RR20170609_FA_A05_2017-06-09_1.fsa",
"A06_RR20170609_FA_A06_2017-06-09_1.fsa", "A07_RR20170609_FA_A07_2017-06-09_1.fsa",
"A08_RR20170609_FA_A08_2017-06-09_1.fsa", "A09_RR20170609_FA_A09_2017-06-09_1.fsa",
"A10_RR20170609_FA_A10_2017-06-09_1.fsa", "A11_RR20170609_FA_A11_2017-06-09_1.fsa",
"A12_RR20170609_FA_A12_2017-06-09_1.fsa"), class = "factor"),
MOD = c(20.67, 20.67, 20.67, 20.67, 20.67, 20.67, 20.67,
20.67, 20.67, 20.67, 20.67, 20.67)), .Names = c("ctrl", "samples",
"MOD"), row.names = c(NA, -12L), class = c("data.table", "data.frame"
), .internal.selfref = <pointer: 0x10180d178>, index = structure(integer(0), "`__samples`" = integer(0)))
> dput(ctrls)
structure(list(samples = structure(1:12, .Label = c("A01_RR20170609_FA_A01_2017-06-09_1.fsa",
"A02_RR20170609_FA_A02_2017-06-09_1.fsa", "A03_RR20170609_FA_A03_2017-06-09_1.fsa",
"A04_RR20170609_FA_A04_2017-06-09_1.fsa", "A05_RR20170609_FA_A05_2017-06-09_1.fsa",
"A06_RR20170609_FA_A06_2017-06-09_1.fsa", "A07_RR20170609_FA_A07_2017-06-09_1.fsa",
"A08_RR20170609_FA_A08_2017-06-09_1.fsa", "A09_RR20170609_FA_A09_2017-06-09_1.fsa",
"A10_RR20170609_FA_A10_2017-06-09_1.fsa", "A11_RR20170609_FA_A11_2017-06-09_1.fsa",
"A12_RR20170609_FA_A12_2017-06-09_1.fsa"), class = "factor"),
ctrl = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L), class = "factor", .Label = "A01_RR20170609_FA_A01_2017-06-09_1.fsa")), .Names = c("samples",
"ctrl"), row.names = c(NA, -12L), class = "data.frame")
> dput(MODs)
structure(list(samples = structure(1:12, .Label = c("A01_RR20170609_FA_A01_2017-06-09_1.fsa",
"A02_RR20170609_FA_A02_2017-06-09_1.fsa", "A03_RR20170609_FA_A03_2017-06-09_1.fsa",
"A04_RR20170609_FA_A04_2017-06-09_1.fsa", "A05_RR20170609_FA_A05_2017-06-09_1.fsa",
"A06_RR20170609_FA_A06_2017-06-09_1.fsa", "A07_RR20170609_FA_A07_2017-06-09_1.fsa",
"A08_RR20170609_FA_A08_2017-06-09_1.fsa", "A09_RR20170609_FA_A09_2017-06-09_1.fsa",
"A10_RR20170609_FA_A10_2017-06-09_1.fsa", "A11_RR20170609_FA_A11_2017-06-09_1.fsa",
"A12_RR20170609_FA_A12_2017-06-09_1.fsa"), class = "factor"),
MOD = c(20.67, 19.7633333333333, 16.7833333333333, 21.7433333333333,
16.79, 14.8666666666667, 15.7833333333333, 21.66, 16.3966666666667,
19.6966666666667, 19.6033333333333, 15.3566666666667)), .Names = c("samples",
"MOD"), row.names = c(NA, -12L), class = "data.frame")
Your approach looks quite complicated, so I tried to use base vectors for calculations.
1) Firstly, I created a function which takes vectors as arguments, because your supplied code with eval(parse()) was hard to understand.
myFun <- function(x, mod, cag, thres) {
x[x < (thres * max(x))] <- 0
norm_x <- x / sum(x)
sum_x <- norm_x * (cag - mod)
sum(sum_x)
}
I hope I got it right.
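As a quick check of the arithmetic, the function can be run by hand on the A02 column of the toy data from the question. This is only a sketch: myFun is repeated so the snippet runs on its own, and the mode 117.8 is the MOD of A01, which the ctrls table names as the control for A02.

```r
# Same logic as myFun above, repeated so the snippet is self-contained.
myFun <- function(x, mod, cag, thres) {
  x[x < (thres * max(x))] <- 0   # zero every height below thres * tallest peak
  norm_x <- x / sum(x)           # normalise the remaining heights to sum to 1
  sum(norm_x * (cag - mod))      # height-weighted mean distance of CAG from mod
}

cag <- c(13, 14, 15, 17)
a02 <- c(0, 42, 56, 4)

# The cut-off is 0.2 * 56 = 11.2, so only the heights 42 and 56 survive.
# The result is then (42 * 14 + 56 * 15) / 98 - 117.8.
res_a02 <- myFun(a02, mod = 117.8, cag = cag, thres = 0.2)
res_a02
```

Because the normalised heights sum to 1, the result is simply the weighted mean CAG of the surviving peaks minus the control mode.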
2) Then we create a transition matrix, from which we will take the MOD values.
transision_matrix <- merge(ctrls, MODs, by.x = "ctrl", by.y = "samples")
setDT(transision_matrix)
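To see what this merge produces, here is a small sketch on the toy ctrls and MODs tables from the question, using base R only. The short name tm is just a stand-in for transision_matrix in this example.

```r
ctrls <- data.frame(samples = c("A01", "A02", "A03", "A04"),
                    ctrl    = c("A01", "A01", "A03", "A03"))
MODs  <- data.frame(samples = c("A01", "A02", "A03", "A04"),
                    MOD     = c(117.8, 120.2, 124.5, 130.6))

# Match each sample's control name (ctrl) against the samples column of MODs,
# so every sample row picks up the MOD of its control rather than its own MOD.
tm <- merge(ctrls, MODs, by.x = "ctrl", by.y = "samples")
tm

# Look up the control MOD for one sample.
mod_a02 <- tm$MOD[tm$samples == "A02"]
mod_a04 <- tm$MOD[tm$samples == "A04"]
```

The merged table has the columns ctrl, samples and MOD, where samples comes from ctrls and MOD belongs to the control.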
3) Then we can write a function which takes a column name and the two data.tables as arguments to obtain your desired results:
mf2 <- function(colname, dataDT, transision_matrix){
x <- dataDT[, colname, with = F][[1]]
mod <- transision_matrix[samples == colname, MOD]
cag <- dataDT[, "CAG"][[1]]
myFun(x, mod, cag, thres = 0.2)
}
4) Lastly, we only need to sapply it over the vector of column names:
sapply(colsToBeUsed, function(x) mf2(x, dataDT, transision_matrix))
A01 A02
-104.8000 -103.2286
UPDATE
It looks like the problem is in the samples column of transision_matrix: its values do not match the column names of dataDT. You should change either the column names of dataDT or the values in the samples column so that their formats match.
> 'A01_RR20170609_FA_A01_2017-06-09_1.fsa' == "A01_RR20170609_FA_A01_2017.06.09_1.fsa"
[1] FALSE
(change the dots)
You can do it like this:
> colnames(dataDT)
[1] "A01_RR20170609_FA_A01_2017.06.09_1.fsa" "CAG"
> colnames(dataDT) <- gsub(".","-", colnames(dataDT), fixed = T) #change all dots to -
> colnames(dataDT) <- gsub("-fsa",".fsa", colnames(dataDT), fixed = T) #change back the end of string
> colnames(dataDT)
[1] "A01_RR20170609_FA_A01_2017-06-09_1.fsa" "CAG"
Now everything should work.
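A short sketch of why the mismatch gives zeros rather than an error, and of one possible way to avoid the renaming in the first place. It assumes the dots were introduced by the data.frame() call in the question, which makes column names syntactic by default. The column name below is shortened, and lookup_mod is a hypothetical helper, not something from the original code.

```r
# 1) A lookup with no matching row returns a zero-length vector, and
#    arithmetic with a zero-length vector is zero-length, so sum() gives 0.
cag <- c(13, 14, 15, 17)
mod <- numeric(0)
empty_sum <- sum(c(1, 0, 0, 0) * (cag - mod))
empty_sum

# 2) data.frame() rewrites hyphens to dots unless check.names = FALSE.
raw <- data.frame(CAG = c(13, 14),
                  `A01_2017-06-09_1.fsa` = c(6485, 35),
                  check.names = FALSE)
mangled <- data.frame(raw)                       # default check.names = TRUE
fixed   <- data.frame(raw, check.names = FALSE)  # names kept as they are
names(mangled)
names(fixed)

# 3) Hypothetical guard: fail loudly when a sample has no (or several) MODs.
lookup_mod <- function(tm, colname) {
  mod <- tm$MOD[tm$samples == colname]
  if (length(mod) != 1) {
    stop("expected exactly one MOD for ", colname, ", found ", length(mod))
  }
  mod
}

tm <- data.frame(samples = "A01_2017-06-09_1.fsa", MOD = 20.67)
ok  <- lookup_mod(tm, names(fixed)[2])
bad <- try(lookup_mod(tm, names(mangled)[2]), silent = TRUE)
```

With a guard like this, a name mismatch stops with a message instead of quietly returning 0 for every column.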
Thanks. This is the function at the moment:
library(data.table)
dataDT <- height[,13:ncol(height)] #Create a data frame containing CAG and height columns
dataDT <- setDT(dataDT) #Convert to a data table
colsToBeUsed<-names(dataDT[,!'CAG']) #Assigns the columns to be analysed
myFun <- function(x, mod, cag, thres) { #Function that takes vectors as arguments.
x[x < (thres * max(x))] <- 0 #First sets all heights < thres * max(height) to 0.
norm_x <- x / sum(x) #Then normalises heights by dividing by the sum of the heights.
sum_x <- norm_x * (cag - mod) #Then multiplies by the change in CAG from mode
sum(sum_x) #Then sums the results
}
transision_matrix <- merge(propsettings, modeHt, by.x = "control", by.y = "sample") #Transition matrix that determines control modes for each sample
setDT(transision_matrix)
mf2 <- function(colname, dataDT, transision_matrix){ #Function that takes column name and data table as arguments.
x <- dataDT[, colname, with = F][[1]]
mod <- transision_matrix[sample == colname, mode] #Vector of CONTROL modes to use
cag <- dataDT[, "CAG"][[1]]
thres <- resultsHt$iithreshold #THIS LIKELY SOURCE OF ERROR AS IT WORKS WITH 0.2
myFun(x, mod, cag, thres)
}
iiHt <- sapply(colsToBeUsed, function(x) mf2(x, dataDT, transision_matrix))
resultsHt$iiHt <- iiHt
This is 'resultsHt$iithreshold':
[1] 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2
[29] 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2
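The thread does not say what goes wrong here, so the following is only a guess. myFun expects a single threshold, while resultsHt$iithreshold holds 40 values, apparently one per sample. If that reading is right, one option is to name the threshold vector by sample and pick the matching value for each column. The sketch uses toy stand-ins for the real objects, and the layout of thresh (one named entry per sample) is an assumption.

```r
myFun <- function(x, mod, cag, thres) {
  x[x < (thres * max(x))] <- 0   # thres must be a single number here
  norm_x <- x / sum(x)
  sum(norm_x * (cag - mod))
}

# Toy stand-ins for height, transision_matrix and resultsHt$iithreshold.
cag     <- c(13, 14, 15, 17)
heights <- list(A01 = c(6485, 35, 132, 12), A02 = c(0, 42, 56, 4))
mods    <- c(A01 = 117.8, A02 = 117.8)   # control mode per sample
thresh  <- c(A01 = 0.2,   A02 = 0.8)     # assumed: one threshold per sample

# Look up the scalar mode and scalar threshold for each sample by name.
ii <- vapply(names(heights), function(s) {
  myFun(heights[[s]], mod = mods[[s]], cag = cag, thres = thresh[[s]])
}, numeric(1))
ii
```

In the real code this would mean something like thres <- thresh[[colname]] inside mf2, with thresh built so that its names match colsToBeUsed. Whether the rows of resultsHt are in that order is not stated in the question and would need checking.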
This is modeHt:
sample mode
1 A01_MF20170623_FA_A01_2017-06-23_1.fsa 130.93667
2 A02_MF20170623_FA_A02_2017-06-23_1.fsa 131.14667
3 A03_MF20170623_FA_A03_2017-06-23_1.fsa 132.07333
4 A05_MF20170623_FA_A05_2017-06-23_1.fsa 151.97333
5 A06_MF20170623_FA_A06_2017-06-23_1.fsa 128.39333
6 A07_MF20170623_FA_A07_2017-06-23_1.fsa 116.02333
7 A08_MF20170623_FA_A08_2017-06-23_1.fsa 127.42667
8 A09_MF20170623_FA_A09_2017-06-23_1.fsa 163.22000
9 A10_MF20170623_FA_A10_2017-06-23_1.fsa 131.92667
10 A11_MF20170623_FA_A11_2017-06-23_1.fsa 133.57333
11 A12_MF20170623_FA_A12_2017-06-23_1.fsa 164.85333
12 B01_MF20170623_FA_B01_2017-06-23_1.fsa 180.34333
13 B02_MF20170623_FA_B02_2017-06-23_1.fsa 133.08000
14 B03_MF20170623_FA_B03_2017-06-23_1.fsa 163.53333
15 B04_MF20170623_FA_B04_2017-06-23_1.fsa 133.13333
16 B05_MF20170623_FA_B05_2017-06-23_1.fsa 133.08000
17 B06_MF20170623_FA_B06_2017-06-23_1.fsa 167.23000
18 B07_MF20170623_FA_B07_2017-06-23_1.fsa 115.05667
19 B08_MF20170623_FA_B08_2017-06-23_1.fsa 179.62333
20 C01_MF20170623_FA_C01_2017-06-23_1.fsa 115.93000
21 C02_MF20170623_FA_C02_2017-06-23_1.fsa 115.17333
22 C05_MF20170623_FA_C05_2017-06-23_1.fsa 131.18667
23 C07_MF20170623_FA_C07_2017-06-23_1.fsa 131.13333
24 C08_MF20170623_FA_C08_2017-06-23_1.fsa 131.13000
25 C09_MF20170623_FA_C09_2017-06-23_1.fsa 130.09333
26 C10_MF20170623_FA_C10_2017-06-23_1.fsa 115.09000
27 C11_MF20170623_FA_C11_2017-06-23_1.fsa 130.04000
28 C12_MF20170623_FA_C12_2017-06-23_1.fsa 115.70000
29 D02_MF20170623_FA_D02_2017-06-23_1.fsa 116.03667
30 D03_MF20170623_FA_D03_2017-06-23_1.fsa 131.14000
31 D04_MF20170623_FA_D04_2017-06-23_1.fsa 115.22667
32 D05_MF20170623_FA_D05_2017-06-23_1.fsa 19.88000
33 D06_MF20170623_FA_D06_2017-06-23_1.fsa 19.91000
34 D08_MF20170623_FA_D08_2017-06-23_1.fsa 19.84667
35 D10_MF20170623_FA_D10_2017-06-23_1.fsa 72.32333
36 D11_MF20170623_FA_D11_2017-06-23_1.fsa 130.00333
37 D12_MF20170623_FA_D12_2017-06-23_1.fsa 130.01333
38 A01_MF20170522_FA_A01_2017-05-22_1.fsa 136.94667
39 C02_MF20170529_FA_C02_2017-05-30_1.fsa 132.31667
40 B08_MF20170522_FA_B08_2017-05-22_1.fsa 121.00000
This is 'propsettings':
sample control
1 A01_MF20170623_FA_A01_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
2 A02_MF20170623_FA_A02_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
3 A03_MF20170623_FA_A03_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
4 A05_MF20170623_FA_A05_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
5 A06_MF20170623_FA_A06_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
6 A07_MF20170623_FA_A07_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
7 A08_MF20170623_FA_A08_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
8 A09_MF20170623_FA_A09_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
9 A10_MF20170623_FA_A10_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
10 A11_MF20170623_FA_A11_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
11 A12_MF20170623_FA_A12_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
12 B01_MF20170623_FA_B01_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
13 B02_MF20170623_FA_B02_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
14 B03_MF20170623_FA_B03_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
15 B04_MF20170623_FA_B04_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
16 B05_MF20170623_FA_B05_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
17 B06_MF20170623_FA_B06_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
18 B07_MF20170623_FA_B07_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
19 B08_MF20170623_FA_B08_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
20 C01_MF20170623_FA_C01_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
21 C02_MF20170623_FA_C02_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
22 C05_MF20170623_FA_C05_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
23 C07_MF20170623_FA_C07_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
24 C08_MF20170623_FA_C08_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
25 C09_MF20170623_FA_C09_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
26 C10_MF20170623_FA_C10_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
27 C11_MF20170623_FA_C11_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
28 C12_MF20170623_FA_C12_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
29 D02_MF20170623_FA_D02_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
30 D03_MF20170623_FA_D03_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
31 D04_MF20170623_FA_D04_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
32 D05_MF20170623_FA_D05_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
33 D06_MF20170623_FA_D06_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
34 D08_MF20170623_FA_D08_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
35 D10_MF20170623_FA_D10_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
36 D11_MF20170623_FA_D11_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
37 D12_MF20170623_FA_D12_2017-06-23_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
38 A01_MF20170522_FA_A01_2017-05-22_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
39 C02_MF20170529_FA_C02_2017-05-30_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
40 B08_MF20170522_FA_B08_2017-05-22_1.fsa B08_MF20170522_FA_B08_2017-05-22_1.fsa
This is a summary of what 'height' looks like:
> str(height)
'data.frame': 1078 obs. of 53 variables:
$ Dye/SamplePeak : chr "B,65" "B,66" "B,67" "B,68" ...
$ Marker : chr NA NA NA NA ...
$ Allele : chr NA NA NA NA ...
$ Size : num 418 432 435 438 441 ...
$ Area : num 285 354 300 334 342 385 359 370 410 439 ...
$ DataPoint : num 5665 5827 5859 5890 5924 ...
$ flank : num 108 108 108 108 108 108 108 108 108 108 ...
$ correction : num 2 2 2 2 2 2 2 2 2 2 ...
$ start : num 100 100 100 100 100 100 100 100 100 100 ...
$ end : num 200 200 200 200 200 200 200 200 200 200 ...
$ control : chr "B08_MF20170522_FA_B08_2017-05-22_1.fsa" "B08_MF20170522_FA_B08_2017-05-22_1.fsa" "B08_MF20170522_FA_B08_2017-05-22_1.fsa" "B08_MF20170522_FA_B08_2017-05-22_1.fsa" ...
$ iithreshold : num 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 ...
$ CAG : num 105 110 111 112 113 ...
$ A01_MF20170623_FA_A01_2017-06-23_1.fsa: num 31 32 32 33 40 37 36 41 45 38 ...
$ A02_MF20170623_FA_A02_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ A03_MF20170623_FA_A03_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ A05_MF20170623_FA_A05_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ A06_MF20170623_FA_A06_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ A07_MF20170623_FA_A07_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ A08_MF20170623_FA_A08_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ A09_MF20170623_FA_A09_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ A10_MF20170623_FA_A10_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ A11_MF20170623_FA_A11_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ A12_MF20170623_FA_A12_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ B01_MF20170623_FA_B01_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ B02_MF20170623_FA_B02_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ B03_MF20170623_FA_B03_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ B04_MF20170623_FA_B04_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ B05_MF20170623_FA_B05_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ B06_MF20170623_FA_B06_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ B07_MF20170623_FA_B07_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ B08_MF20170623_FA_B08_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ C01_MF20170623_FA_C01_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ C02_MF20170623_FA_C02_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ C05_MF20170623_FA_C05_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ C07_MF20170623_FA_C07_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ C08_MF20170623_FA_C08_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ C09_MF20170623_FA_C09_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ C10_MF20170623_FA_C10_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ C11_MF20170623_FA_C11_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ C12_MF20170623_FA_C12_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ D02_MF20170623_FA_D02_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ D03_MF20170623_FA_D03_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ D04_MF20170623_FA_D04_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ D05_MF20170623_FA_D05_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ D06_MF20170623_FA_D06_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ D08_MF20170623_FA_D08_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ D10_MF20170623_FA_D10_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ D11_MF20170623_FA_D11_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ D12_MF20170623_FA_D12_2017-06-23_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ A01_MF20170522_FA_A01_2017-05-22_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ C02_MF20170529_FA_C02_2017-05-30_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
$ B08_MF20170522_FA_B08_2017-05-22_1.fsa: num 0 0 0 0 0 0 0 0 0 0 ...
Related
Converting matrix data frame to a list but I can't the same entry data to iNEXT
I am trying to convert a matrix data frame like this (lowland): species DT1 DT3 DT6 DT7 DT12 DT13 DT14 DT15 DT28 DT29 1 M_vaccinifolia 0 0 0 0 0 0 1 0 0 1 2 M_vaccinifolia 0 0 0 0 0 0 0 0 0 1 3 M_vaccinifolia 0 0 0 0 0 0 0 0 0 1 4 M_vaccinifolia 0 0 0 0 0 0 0 0 0 1 5 M_vaccinifolia 0 0 0 0 0 0 0 0 0 1 6 M_vaccinifolia 0 0 0 0 0 0 0 0 0 1 7 M_vaccinifolia 0 0 0 0 0 0 0 0 0 0 8 M_vaccinifolia 0 0 0 0 0 0 0 0 0 0 9 M_vaccinifolia 0 0 0 0 0 0 0 0 0 0 10 M_vaccinifolia 0 0 0 0 0 0 0 0 0 0 11 M_vaccinifolia 0 0 0 0 0 0 0 0 0 0 12 M_vaccinifolia 0 0 0 0 0 0 0 0 0 0 13 M_vaccinifolia 0 0 0 0 0 0 0 0 0 0 14 M_vaccinifolia 0 0 0 0 0 0 0 0 0 0 15 M_vaccinifolia 0 0 0 0 0 0 0 0 0 0 16 M_vaccinifolia 0 0 0 0 0 0 0 0 0 0 17 M_vaccinifolia 0 0 0 0 0 0 0 0 0 0 18 M_vaccinifolia 0 0 0 0 0 0 0 0 0 0 19 M_vaccinifolia 0 0 0 0 0 0 0 0 0 0 20 M_vaccinifolia 0 0 0 0 0 0 0 0 0 0 21 M_vaccinifolia 0 0 0 0 0 0 0 0 0 0 22 M_vaccinifolia 0 0 0 0 0 0 0 0 0 0 23 M_vaccinifolia 0 0 0 0 0 0 0 0 0 0 24 M_vaccinifolia 0 0 0 0 0 0 0 0 0 0 And I want to transform to a list that I can enter the data as the iNEXT data "ciliates" list is used to perform the examples in the rarefaction curves (example in the section "RAW INCIDENCE DATA FUNCTION: incidence_raw" in this link: https://cran.r-project.org/web/packages/iNEXT/vignettes/Introduction.html. Below is how the list is interpreted: command str(ciliates$EtoshaPan) int [1:365, 1:19] 0 0 0 0 0 0 0 0 0 0 ... - attr(*, "dimnames")=List of 2 ..$ : chr [1:365] "Acaryophrya.collaris" "Actinobolina.multinucleata.n..sp." "Afroamphisiella.multinucleata.n..sp." "Afrothrix.multinucleata.n..sp." ... ..$ : chr [1:19] "x53" "x54" "x55" "x56" ... When I convert my data lowland, I just can reach this kind of list lowland_list <- list(lowland) str(lowland_list) List of 1 $ :'data.frame': 24 obs. of 11 variables: ..$ species: chr [1:24] "M_vaccinifolia" "M_vaccinifolia" "M_vaccinifolia" "M_vaccinifolia" ... ..$ DT1 : int [1:24] 0 0 0 0 0 0 0 0 0 0 ... 
..$ DT3 : int [1:24] 0 0 0 0 0 0 0 0 0 0 ... ..$ DT6 : int [1:24] 0 0 0 0 0 0 0 0 0 0 ... ..$ DT7 : int [1:24] 0 0 0 0 0 0 0 0 0 0 ... ..$ DT12 : int [1:24] 0 0 0 0 0 0 0 0 0 0 ... ..$ DT13 : int [1:24] 0 0 0 0 0 0 0 0 0 0 ... ..$ DT14 : int [1:24] 1 0 0 0 0 0 0 0 0 0 ... ..$ DT15 : int [1:24] 0 0 0 0 0 0 0 0 0 0 ... ..$ DT28 : int [1:24] 0 0 0 0 0 0 0 0 0 0 ... ..$ DT29 : int [1:24] 1 1 1 1 1 1 0 0 0 0 ... What is not a proper entry data format to iNEXT read as the example. I spent many hours trying to make a list to enter this data but I couldn't figure it out. How can I do this?
Not sure if this helps, but I have 2 data.frames that are in species columns and sites rows (datA and datB). Here's some example code. make an empty list: datlist <- list() datlist$A <- data.frame(t(datA)) datlist$B <- data.frame(t(datB)) then run iNEXT iNEXT(datlist , q = c(0, 1, 2) , "incidence_raw" , conf = 0.95 , se = TRUE , knots = 200 , nboot = 200 )
The data frame does not reflect the conversion of NA to 0 when "0", "1" and "NA" are input in the data about the death
I used the replace function to convert an integer r _ death to a factor and convert NA to 0. You can see that the display is only 0 or 1, but when you summarize again, the NA remains. What should I do? str(df$r_death) #> int [1:2639] NA NA NA NA NA NA NA NA NA NA ... summary(df$r_death) #> Min. 1st Qu. Median Mean 3rd Qu. Max. NA's #> 0.0000 0.0000 0.0000 0.2676 1.0000 1.0000 1219 df$r_death<-as.factor(df$r_death) summary(df$r_death) #> 0 1 NA's #> 1040 380 1219 tidyr::replace_na(df$r_death,0) #> [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 1 0 #> [45] 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 #> [89] 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 #> [133] 0 0 0 0 0 0 0 0 0 0 0 0 summary(df$r_death) #> 0 1 NA's #> 1040 380 1219
How can I convert matrix to different matrices in R?
Here I have some codes generating a matrix like this: N = 200 T = 10 mu_0 <- matrix(diag(1, T)) dim(mu_0) <- c(T,T) mu_t_0 <- matrix(rep(t(mu_0), N), ncol = T, byrow = TRUE) And generally the result looks like this V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 1 1 0 0 0 0 0 0 0 0 0 2 0 1 0 0 0 0 0 0 0 0 3 0 0 1 0 0 0 0 0 0 0 4 0 0 0 1 0 0 0 0 0 0 5 0 0 0 0 1 0 0 0 0 0 6 0 0 0 0 0 1 0 0 0 0 7 0 0 0 0 0 0 1 0 0 0 8 0 0 0 0 0 0 0 1 0 0 9 0 0 0 0 0 0 0 0 1 0 10 0 0 0 0 0 0 0 0 0 1 11 1 0 0 0 0 0 0 0 0 0 12 0 1 0 0 0 0 0 0 0 0 13 0 0 1 0 0 0 0 0 0 0 14 0 0 0 1 0 0 0 0 0 0 15 0 0 0 0 1 0 0 0 0 0 16 0 0 0 0 0 1 0 0 0 0 17 0 0 0 0 0 0 1 0 0 0 18 0 0 0 0 0 0 0 1 0 0 19 0 0 0 0 0 0 0 0 1 0 20 0 0 0 0 0 0 0 0 0 1 ... Now for later calculation I want to split this large matrix into different small matrices like this: Matrix One: 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 ... Matrix Two: 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 ... I tried the split function but I cannot get what I want. Are there any solutions?
You can use asplit to split an array or matrix by its margin.
x <- asplit(mu_t_0, 2)
str(x)
#List of 10
# $ : num [1:2000(1d)] 1 0 0 0 0 0 0 0 0 0 ...
# $ : num [1:2000(1d)] 0 1 0 0 0 0 0 0 0 0 ...
# $ : num [1:2000(1d)] 0 0 1 0 0 0 0 0 0 0 ...
# $ : num [1:2000(1d)] 0 0 0 1 0 0 0 0 0 0 ...
# $ : num [1:2000(1d)] 0 0 0 0 1 0 0 0 0 0 ...
# $ : num [1:2000(1d)] 0 0 0 0 0 1 0 0 0 0 ...
# $ : num [1:2000(1d)] 0 0 0 0 0 0 1 0 0 0 ...
# $ : num [1:2000(1d)] 0 0 0 0 0 0 0 1 0 0 ...
# $ : num [1:2000(1d)] 0 0 0 0 0 0 0 0 1 0 ...
# $ : num [1:2000(1d)] 0 0 0 0 0 0 0 0 0 1 ...
# - attr(*, "dim")= int 10
This puts each column into a list as a one-column matrix.
res <- lapply(1:ncol(mu_t_0), function(i) mu_t_0[, i, drop = FALSE])
head(res[[1]])
#      [,1]
# [1,]    1
# [2,]    0
# [3,]    0
# [4,]    0
# [5,]    0
# [6,]    0
head(res[[2]])
#      [,1]
# [1,]    0
# [2,]    1
# [3,]    0
# [4,]    0
# [5,]    0
# [6,]    0
To create the one-column matrices as separate objects, use list2env; the list elements need names beforehand.
names(res) <- paste0("m.", 1:length(res))
list2env(res, envir = .GlobalEnv)
ls()
# [1] "m.1" "m.10" "m.2" "m.3" "m.4" "m.5" "m.6" "m.7" "m.8" "m.9" "mu_0" "mu_t_0"
# [13] "N" "res" "T"
If you want individual named vectors in the global environment (e.g. objects V1, V2, etc.):
invisible(mapply(assign, names(as.data.frame(mu_t_0)), as.data.frame(mu_t_0), MoreArgs=list(envir = globalenv())))
Though I'm really interested in why the OP doesn't want to just work with a single matrix...
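The "Matrix One" and "Matrix Two" examples in the question can also be read as a request to group rows rather than columns: every row that repeats the first pattern goes into one matrix, every row that repeats the second pattern into the next, and so on. That reading is an assumption, not something the asker confirmed. Under it, a minimal sketch takes every 10th row starting at each offset. The name n_t is introduced here in place of the question's T, which masks the TRUE shorthand.

```r
N   <- 200
n_t <- 10   # stands in for T from the question

# Same construction as in the question: the 10 x 10 identity stacked N times
mu_0   <- diag(1, n_t)
mu_t_0 <- matrix(rep(t(mu_0), N), ncol = n_t, byrow = TRUE)

# For each offset k, pick rows k, k + n_t, k + 2 * n_t, ...
blocks <- lapply(seq_len(n_t), function(k) {
  mu_t_0[seq(k, nrow(mu_t_0), by = n_t), , drop = FALSE]
})

dim(blocks[[1]])   # one N x n_t matrix per row pattern
head(blocks[[2]], 3)
```

Each element of blocks is then a 200 x 10 matrix whose rows are all identical, matching the layout sketched in the question.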
How to calculate max value for certain vector of column names and assign value to new column?
After import my dataset looks as follows: Classes ‘data.table’ and 'data.frame': 820600 obs. of 2180 variables: $ count_comments : int 0 0 0 0 0 2 2 0 0 1 ... $ count_faves : int 5 2 2 15 1 3 19 5 1 4 ... $ dateadded : int 1530174689 1530174688 1530174687 1530162494 1530159458 1530158648 1530158074 1529994404 1529992211 1529868922 ... $ datetaken : chr "2018-05-10 15:50:59" "2018-05-10 15:50:53" "2018-05-10 15:50:03" "2006-11-27 00:00:00" ... $ dateupload : int 1530174672 1530174671 1530174669 1498275521 1436228321 1482723483 1496706006 1529994381 1529992197 1529868901 ... $ group_url : chr "https://www.flickr.com/groups/capriceclassic/" "https://www.flickr.com/groups/capriceclassic/" "https://www.flickr.com/groups/capriceclassic/" "https://www.flickr.com/groups/capriceclassic/" ... $ id :integer64 42341316794 42341318944 42341324184 35456820766 19292939750 31070311463 34738418140 42964602432 ... $ license : int 6 6 6 0 0 0 0 6 0 6 ... $ oid.800metres : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Abbey : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Abdomen : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Academicconference : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Academicdress : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Accipitriformes : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Acousticguitar : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Acoustic-electricguitar : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Acrylicpaint : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Actionfigure : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Adolescent : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Adult : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Adventure : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Advertising : num 0 0 0 0 0 ... $ oid.Aeolianlandform : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Aerialphotography : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Aerobatics : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Aerospaceengineering : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Africanelephant : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Afterglow : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Agaric : num 0 0 0 0 0 0 0 0 0 0 ... 
$ oid.Agaricaceae : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Agaricus : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Agriculturalmachinery : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Agriculture : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Airforce : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Airracing : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Airshow : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Airsports : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Airtravel : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Airbusa320family : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Aircraft : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Aircraftcabin : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Aircraftcarrier : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Aircraftengine : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Airline : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Airliner : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Airplane : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Airport : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Airportapron : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Airportterminal : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Aisle : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Albumcover : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Alcohol : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Alcoholicbeverage : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Ale : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Algae : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Alley : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Alligator : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Alloywheel : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.All-terrainvehicle : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Alps : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Altar : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Amateurboxing : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Amateurwrestling : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Ambulance : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Americanfootball : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Amphibian : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Amphibiousassaultship : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Amphibioustransportdock : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Amphitheatre : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Amusementpark : num 0 0 0 0 0 0 0 0 0 0 ... 
$ oid.Amusementride : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Ancientgreektemple : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Ancienthistory : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Ancientromanarchitecture : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Ancientrome : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Animal : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Animalmigration : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Animalshelter : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Animalsports : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Animaltraining : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Anime : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Annualplant : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Ant : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Antelope : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Antique : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Antiquecar : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Apartment : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Ape : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Apple : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Aqua : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Aquarium : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Aquaticplant : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Aqueduct : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Arabiancamel : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Arcade : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Arch : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Archbridge : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Archaeologicalsite : num 0 0 0 0 0 0 0 0 0 0 ... $ oid.Archipelago : num 0 0 0 0 0 0 0 0 0 0 ... [list output truncated] - attr(*, ".internal.selfref")=<externalptr> After that I import a second data set with the following structure: Classes ‘data.table’ and 'data.frame': 2333 obs. of 4 variables: $ Labels : chr "oid.800 metres" "oid.Abbey" "oid.Abdomen" "oid.Academic conference" ... $ CodingCategory: chr "Unclear" "UrbanContext" "HumanBodyParts" "Unclear" ... $ Occurrences : num 28 286 666 5925 569 ... $ Occurrences % : num 0.00341 0.03485 0.08116 0.72203 0.06934 ... 
- attr(*, ".internal.selfref")=<externalptr> > As you can see every "oid.XXX" Labels is assigned to one of the 12 CodingCategories: - UrbanContext - HumanBodyParts - Animals - SportsContext - NatureContex ... Here is a snippet of the first datatset: count_comments count_faves dateadded datetaken dateupload group_url id license oid.800metres 1: 0 5 1530174689 2018-05-10 15:50:59 1530174672 https://www.flickr.com/groups/capriceclassic/ 42341316794 6 0 2: 0 2 1530174688 2018-05-10 15:50:53 1530174671 https://www.flickr.com/groups/capriceclassic/ 42341318944 6 0 3: 0 2 1530174687 2018-05-10 15:50:03 1530174669 https://www.flickr.com/groups/capriceclassic/ 42341324184 6 0 4: 0 15 1530162494 2006-11-27 00:00:00 1498275521 https://www.flickr.com/groups/capriceclassic/ 35456820766 0 0 5: 0 1 1530159458 2007-05-17 12:38:02 1436228321 https://www.flickr.com/groups/capriceclassic/ 19292939750 0 0 6: 2 3 1530158648 2013-09-26 18:02:39 1482723483 https://www.flickr.com/groups/capriceclassic/ 31070311463 0 0 7: 2 19 1530158074 2017-05-13 10:15:45 1496706006 https://www.flickr.com/groups/capriceclassic/ 34738418140 0 0 8: 0 5 1529994404 2018-05-10 15:09:55 1529994381 https://www.flickr.com/groups/capriceclassic/ 42964602432 6 0 9: 0 1 1529992211 2017-06-26 02:25:33 1529992197 https://www.flickr.com/groups/capriceclassic/ 28146093407 0 0 10: 1 4 1529868922 2018-05-10 13:42:01 1529868901 https://www.flickr.com/groups/capriceclassic/ 42984216801 6 0 oid.Abbey oid.Abdomen oid.Academicconference oid.Academicdress oid.Accipitriformes oid.Acousticguitar oid.Acoustic-electricguitar oid.Acrylicpaint 1: 0 0 0 0 0 0 0 0 2: 0 0 0 0 0 0 0 0 3: 0 0 0 0 0 0 0 0 4: 0 0 0 0 0 0 0 0 5: 0 0 0 0 0 0 0 0 6: 0 0 0 0 0 0 0 0 7: 0 0 0 0 0 0 0 0 8: 0 0 0 0 0 0 0 0 9: 0 0 0 0 0 0 0 0 10: 0 0 0 0 0 0 0 0 oid.Actionfigure oid.Adolescent oid.Adult 1: 0 0 0 2: 0 0 0 3: 0 0 0 4: 0 0 0 5: 0 0 0 6: 0 0 0 7: 0 0 0 8: 0 0 0 9: 0 0 0 10: 0 0 0 Here is a snippet of the second data set: Labels CodingCategory 
Occurrences Occurrences % 1: oid.800 metres Unclear 28 0.003412137 2: oid.Abbey UrbanContext 286 0.034852547 3: oid.Abdomen HumanBodyParts 666 0.081160127 4: oid.Academic conference Unclear 5925 0.722032659 5: oid.Academic dress <NA> 569 0.069339508 6: oid.Academicconference Unclear NA NA 7: oid.Academicdress <NA> NA NA 8: oid.Accipitriformes Animals 19 0.002315379 9: oid.Acoustic guitar <NA> 329 0.040092615 10: oid.Acoustic-electric guitar <NA> 43 0.005240068 11: oid.Acrylic paint <NA> 735 0.089568608 12: oid.Acrylicpaint Unclear NA NA 13: oid.Action figure <NA> 3650 0.444796490 14: oid.Actionfigure <NA> NA NA 15: oid.Adolescent HumanBodyParts 1123 0.136851085 16: oid.Adult HumanBodyParts 983 0.119790397 17: oid.Adventure <NA> 17603 2.145137704 18: oid.Advertising <NA> 46194 5.629295637 19: oid.Aeolian landform NatureContext 5911 0.720326590 20: oid.Aeolianlandform NatureContext NA NA 21: oid.Aerial photography Unclear 10382 1.265171825 22: oid.Aerialphotography Unclear NA NA 23: oid.Aerobatics SportsContext 224 0.027297100 24: oid.Aerospace engineering <NA> 1526 0.185961492 25: oid.Aerospaceengineering <NA> NA NA 26: oid.African elephant Animals 7 0.000853034 27: oid.Afterglow NatureContext 1998 0.243480380 28: oid.Agaric NatureContext 1 0.000121862 29: oid.Agaricaceae NatureContext 1 0.000121862 30: oid.Agaricus NatureContext 1 0.000121862 31: oid.Agricultural machinery CarType 34249 4.173653424 32: oid.Agriculture <NA> 12034 1.466487936 33: oid.Air force <NA> 8143 0.992322691 34: oid.Air racing SportsContext 79 0.009627102 35: oid.Air show SportsContext 331 0.040336339 36: oid.Air sports SportsContext 113 0.013770412 37: oid.Air travel <NA> 551 0.067145991 38: oid.Airbus a320 family <NA> 73 0.008895930 39: oid.Aircraft <NA> 20583 2.508286620 40: oid.Aircraft cabin <NA> 3 0.000365586 41: oid.Aircraft carrier <NA> 5 0.000609310 42: oid.Aircraft engine <NA> 59190 7.213014867 43: oid.Airline <NA> 8080 0.984645381 44: oid.Airliner <NA> 1343 0.163660736 45: 
oid.Airplane <NA> 12043 1.467584694
46: oid.Airport UrbanContext 2386 0.290762856
47: oid.Airport apron <NA> 133 0.016207653
48: oid.Airport terminal UrbanContext 714 0.087009505
49: oid.Airportterminal UrbanContext NA NA
50: oid.Airtravel <NA> NA NA
What I am trying to do now is create additional columns in the first data set that give, per row, the maximum of the "oid.XXX" values over all "oid.XXX" columns that belong to a specific CodingCategory. For example, I want to create a new column named "UrbanContext" that contains the maximum of its corresponding "oid.XXX" values in each row. This is what I have come up with so far:
require(data.table)
require(QuantPsyc)
library(lmSupport)
library(dbplyr)
#Import header of OID Label Data only
x <- fread("/Users/01_Flickr Car Data/01_Open Image Data Set/01_Raw Data/flickrexport_cars_oid_201903.csv",sep=",", encoding = "Latin-1", header=TRUE, nrows=0)
#Define oid Columns as numeric
colNames <- grep('^oid', names(x), value = TRUE)
colClasses <- rep('numeric', length(colNames))
names(colClasses) <- colNames
#Import OID Label Data based on Open Image Data Set only
flickrcar <- fread("/Users/01_Flickr Car Data/01_Open Image Data Set/01_Raw Data/flickrexport_cars_oid_201903.csv",colClasses = colClasses, sep=",", encoding = "Latin-1", header=TRUE)
str(flickrcar)
#NAs to zeros
f_dowle3 = function(DT) {
for (j in names(DT))
set(DT,which(is.na(DT[[j]])),j,0)
}
f_dowle3(flickrcar)
#Import Coding for Open Image Data Set Labels
flickrcar_label_coding <- fread("/Users/01_Open Image Data Set/01_Raw Data/180220_OID_Labels_Coding_NTT_FINAL.csv", sep=",", header = TRUE)
flickrcar_label_coding[1:50, 1:4]
#Set to Data Table
setDT(flickrcar)
setDT(flickrcar_label_coding)
#Group OID Labels into LabelCategory
humanbodyparts <- flickrcar_label_coding[grep("HumanBodyParts",flickrcar_label_coding$CodingCategory), "Labels"]
urbancontext <- flickrcar_label_coding[grep("UrbanContext",flickrcar_label_coding$CodingCategory), "Labels"]
animals <- flickrcar_label_coding[grep("Animals",flickrcar_label_coding$CodingCategory), "Labels"]
sportscontext <- flickrcar_label_coding[grep("SportsContext",flickrcar_label_coding$CodingCategory), "Labels"]
naturecontext <- flickrcar_label_coding[grep("NatureContext",flickrcar_label_coding$CodingCategory), "Labels"]
exhibitioncontext <- flickrcar_label_coding[grep("ExhibitionContext",flickrcar_label_coding$CodingCategory), "Labels"]
manualimageprocessing <- flickrcar_label_coding[grep("ManualImageProcessing",flickrcar_label_coding$CodingCategory), "Labels"]
regularroadecomtext <- flickrcar_label_coding[grep("RegularRoadContext",flickrcar_label_coding$CodingCategory), "Labels"]
racingcontext <- flickrcar_label_coding[grep("RacingContext",flickrcar_label_coding$CodingCategory), "Labels"]
cartype <- flickrcar_label_coding[grep("CarType",flickrcar_label_coding$CodingCategory), "Labels"]
carparts <- flickrcar_label_coding[grep("CarParts",flickrcar_label_coding$CodingCategory), "Labels"]
carbrand <- flickrcar_label_coding[grep("CarBrand",flickrcar_label_coding$CodingCategory), "Labels"]
#Insert Label Category Columns
flickrcar$HumanBodyParts <- flickrcar[1, which.max(flickrcar[1,humanbodyparts])]
Unfortunately this does not work. In the end, the result for one row should look something like this:
count_comments count_faves dateadded datetaken dateupload group_url id license oid.800metres
1: 0 5 1530174689 2018-05-10 15:50:59 1530174672 https://www.flickr.com/groups/capriceclassic/ 42341316794 6 0
oid.Abbey oid.Abdomen oid.Academicconference oid.Academicdress oid.Accipitriformes oid.Acousticguitar oid.Acoustic-electricguitar oid.Acrylicpaint
1: 0 0 0 0 0 0 0 0
oid.Actionfigure oid.Adolescent oid.Adult NatureContext RoadContext...
1: 0 0 0 0.88 0.54...
Thank you very much in advance for your help!
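A minimal sketch of one way to get a per-row maximum for each category, assuming the data.table package and using small toy tables in place of the real files. The toy values, the label "oid.Air port" and the helper column col are hypothetical. Two details come from the snippets in the question: the labels in the coding table contain spaces while the column names in the first table do not, so the spaces are stripped before matching, and some labels have no category (NA), so those rows are dropped. The row-wise maximum uses pmax across the matched columns.

```r
library(data.table)

# Toy stand-in for the first data set (values are made up)
flickrcar <- data.table(
  id          = 1:3,
  oid.Abbey   = c(0.0, 0.2, 0.0),
  oid.Airport = c(0.5, 0.1, 0.0),
  oid.Adult   = c(0.0, 0.9, 0.3)
)

# Toy stand-in for the coding table; labels may contain spaces
coding <- data.table(
  Labels         = c("oid.Abbey", "oid.Air port", "oid.Adult", "oid.Adventure"),
  CodingCategory = c("UrbanContext", "UrbanContext", "HumanBodyParts", NA)
)

# Strip spaces so labels line up with the column names, drop uncoded labels
coding[, col := gsub(" ", "", Labels, fixed = TRUE)]
coding <- coding[!is.na(CodingCategory)]

# One new column per category: the row-wise maximum over its oid columns
for (categ in unique(coding$CodingCategory)) {
  cols <- intersect(coding[CodingCategory == categ, col], names(flickrcar))
  if (length(cols) > 0) {
    flickrcar[, (categ) := do.call(pmax, c(.SD, na.rm = TRUE)), .SDcols = cols]
  }
}

flickrcar
```

Using pmax keeps the computation vectorised over rows, which matters with 820600 rows; the intersect() call guards against labels that have no matching column.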
Why can't I use aggregate with cbind on a range of columns in a data.frame?
20 Lines of the data I'm working on: Zv9_NA110 6176 7276 5'to3'IntronExon 0 + 1100 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Zv9_NA110 10126 11226 5'to3'IntronExon 0 + 1100 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 7 9 9 15 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 18 13 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Zv9_NA110 11219 12319 5'to3'ExonIntron 0 + 1100 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Zv9_NA110 14887 15987 5'to3'IntronExon 0 + 1100 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 7 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 Zv9_NA110 18923 20023 5'to3'IntronExon 0 + 1100 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Zv9_NA110 21069 22169 5'to3'ExonIntron 0 + 1100 0 135 115 65 54 45 36 27 16 9 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Zv9_NA113 1615 2715 5'to3'IntronExon 0 - 1100 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Zv9_NA113 2335 3435 5'to3'ExonIntron 0 - 1100 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 7 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Zv9_NA113 5398 6498 5'to3'IntronExon 0 - 1100 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Zv9_NA113 7173 8273 5'to3'ExonIntron 0 - 1100 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Zv9_NA118 11674 12774 5'to3'IntronExon 0 + 1100 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Zv9_NA118 12711 13811 5'to3'ExonIntron 0 + 1100 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Zv9_NA123 38151 39251 5'to3'ExonIntron 0 - 1100 0 1061 958 844 796 695 600 464 346 265 210 150 133 94 81 72 46 18 4 0 0 0 0 0 0 0 0 0 7 9 9 9 11 21 35 43 58 91 108 180 268 406 547 712 833 882 960 1094 1172 1245 1331 1432 1510 1604 1711 1810 1830 1837 1823 1781 1690 1638 1560 1489 1257 854 731 631 589 551 497 439 404 369 301 231 168 123 76 58 50 42 28 20 11 9 9 24 27 27 27 27 27 25 18 18 18 18 18 18 18 18 18 18 18 18 18 14 5 0 0 Zv9_NA124 2578 3678 5'to3'ExonIntron 0 + 1100 0 423 407 401 377 357 345 324 304 249 185 111 54 30 12 0 0 0 0 0 0 0 0 0 0 0 0 0 1 9 9 9 9 14 18 25 27 27 27 27 27 27 27 27 27 27 27 26 18 18 18 18 18 18 16 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 8 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 
9 9 9 9 9 9 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Zv9_NA129 4939 6039 5'to3'IntronExon 0 + 1100 226 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 9 9 9 9 9 9 9 9 9 9 9 9 14 34 45 60 97 128 175 293 395 524 621 764 894 1036 1164 1334 1469 1639 1801 1885 1983 Zv9_NA132 12589 13689 5'to3'ExonIntron 0 - 1100 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Zv9_NA132 13634 14734 5'to3'IntronExon 0 - 1100 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Zv9_NA132 14481 15581 5'to3'ExonIntron 0 - 1100 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 9 9 9 9 9 9 9 9 9 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Zv9_NA132 19534 20634 5'to3'IntronExon 0 - 1100 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Zv9_NA132 28708 29808 5'to3'ExonIntron 0 - 1100 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 9 15 18 24 27 42 46 73 112 142 157 162 162 162 162 162 162 162 162 159 153 153 153 153 153 150 144 132 112 76 52 30 25 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 I get this into R as follows: > dat <- read.table("dat.dat",header=F) I need to get the averages for columns 9 through 118, parsed by column 4. 
This works:
> all_means <- aggregate(cbind(V9,V10,V11)~V4,data=dat,FUN=mean)
                V4 V9  V10 V11
1 5'to3'ExonIntron  0 0.00   0
2 5'to3'IntronExon  0 0.75   1
But there's no way I'm typing this out to V118. I've tried this:
> aggregate(cbind(9:118)~V4,data=blah,FUN=mean)
But I get this error:
Error in model.frame.default(formula = cbind(9:118) ~ V4, data = blah) : variable lengths differ (found for 'V4')
Is there something dumb I'm missing?
You have a number of options.
Create a formula using . and pass a subset of the data:
aggregate( . ~ V4, data = dat[,c(4,9:118)], FUN = mean)
You could also create the vector of column names using paste0
nn <- paste0('V', 9:118)
and refer to the columns by name:
aggregate( . ~ V4, data = dat[,c('V4',nn)], FUN = mean)
There isn't much point in using cbind here, given that the formula approach works, but for example:
aggregate( do.call(cbind,lapply(nn, as.name)) ~ V4, data = dat, FUN = mean)
This is messy, though: it doesn't name the columns nicely and is hard to follow.
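The dot-formula approach can be tried on a small example. The toy dat below is hypothetical (three value columns instead of V9 to V118, shortened group labels), but the call has the same shape as the one above: subset the data to the grouping column plus the value columns, then let . stand for everything except V4.

```r
# Toy data: V4 is the grouping column, V9..V11 stand in for V9..V118
dat <- data.frame(
  V4  = c("ExonIntron", "IntronExon", "ExonIntron", "IntronExon"),
  V8  = 1100,
  V9  = c(0, 1, 2, 3),
  V10 = c(10, 0, 30, 0),
  V11 = c(1, 1, 1, 5)
)

# Build the column names programmatically instead of typing them out
nn <- paste0("V", 9:11)

# "." expands to every column in the subset other than V4
res <- aggregate(. ~ V4, data = dat[, c("V4", nn)], FUN = mean)
res
```

Note that the formula interface of aggregate drops rows containing NA by default (na.action = na.omit), which is worth keeping in mind if the real data has missing values.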
If speed is an issue in general (it is not for this operation) and you want to use the data.table package, this is done as follows.
Safer solution (thanks to mnel's comment, this is the one I would use):
library(data.table)
dat <- as.data.table(dat)
dat[,lapply(.SD,mean),by="V4",.SDcols=paste0("V", 9:118)]
Old solution:
dat[,lapply(.SD,mean),by="V4",.SDcols=9:118]
You can use ## S3 method for class 'data.frame' aggregate(x, by, FUN, ..., simplify = TRUE) With your data assuming your data is in dataframe DF DF <- read.table(text = txt, header = FALSE, stringsAsFactors = FALSE) result <- aggregate(DF[, 9:118], by = list(DF[, 4]), FUN = mean) # Using pander to print result table nicely. It's not needed for aggregation :) require(pander) pandoc.table(result) ## ## ---------------------------------------------------- ## Group.1 V9 V10 V11 V12 V13 V14 ## ---------------- ----- ----- ----- ----- ----- ----- ## 5'to3'ExonIntron 161.9 148 131 122.7 109.7 98.1 ## ## 5'to3'IntronExon 0.0 0 0 0.0 0.0 0.0 ## ---------------------------------------------------- ## ## Table: Table continues below ## ## ## ----------------------------------------------- ## V15 V16 V17 V18 V19 V20 V21 V22 ## ----- ----- ----- ----- ----- ----- ----- ----- ## 81.5 66.6 52.3 39.5 26.1 18.7 12.4 9.3 ## ## 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ## ----------------------------------------------- ## ## Table: Table continues below ## ## ## ----------------------------------------------- ## V23 V24 V25 V26 V27 V28 V29 V30 ## ----- ----- ----- ----- ----- ----- ----- ----- ## 7.2 4.6 1.8 0.4 0 0 0 0.5 ## ## 0.0 0.0 0.0 0.0 0 0 0 0.0 ## ----------------------------------------------- ## ## Table: Table continues below ## ## ## ----------------------------------------------- ## V31 V32 V33 V34 V35 V36 V37 V38 ## ----- ----- ----- ----- ----- ----- ----- ----- ## 0.9 1.5 1.8 2.4 2.7 5 6.4 9.1 ## ## 0.0 0.0 0.0 0.0 0.0 0 0.0 0.0 ## ----------------------------------------------- ## ## Table: Table continues below ## ## ## ----------------------------------------------- ## V39 V40 V41 V42 V43 V44 V45 V46 ## ----- ----- ----- ----- ----- ----- ----- ----- ## 13 16.2 19.2 21.5 23 24.7 28 29.7 ## ## 0 0.0 0.0 0.0 0 0.0 0 0.0 ## ----------------------------------------------- ## ## Table: Table continues below ## ## ## ----------------------------------------------- ## V47 V48 V49 
V50 V51 V52 V53 V54 ## ----- ----- ----- ----- ----- ----- ----- ----- ## 36.9 45.7 59.5 73.3 89.2 101.3 106.2 114 ## ## 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 ## ----------------------------------------------- ## ## Table: Table continues below ## ## ## ----------------------------------------------- ## V55 V56 V57 V58 V59 V60 V61 V62 ## ----- ----- ----- ----- ----- ----- ----- ----- ## 127.3 134 140.7 148.1 156.2 160.4 167.4 175.7 ## ## 0.0 0 0.0 0.0 0.0 0.0 0.0 0.0 ## ----------------------------------------------- ## ## Table: Table continues below ## ## ## ----------------------------------------------- ## V63 V64 V65 V66 V67 V68 V69 V70 ## ----- ----- ----- ----- ----- ----- ----- ----- ## 183.9 183.1 183.7 182.3 178.1 169 163.8 156.7 ## ## 0.0 0.0 0.0 0.0 0.0 0 0.0 0.0 ## ----------------------------------------------- ## ## Table: Table continues below ## ## ## ----------------------------------------------- ## V71 V72 V73 V74 V75 V76 V77 V78 ## ----- ----- ----- ----- ----- ----- ----- ----- ## 149.8 126.6 86.3 74.0 64.0 59.8 56.0 50.6 ## ## 0.7 0.9 0.9 1.5 1.8 1.8 1.8 1.8 ## ----------------------------------------------- ## ## Table: Table continues below ## ## ## ----------------------------------------------- ## V79 V80 V81 V82 V83 V84 V85 V86 ## ----- ----- ----- ----- ----- ----- ----- ----- ## 45.6 42.2 38.7 31.9 24.9 18.6 14.1 9.4 ## ## 1.8 1.8 1.8 1.8 1.8 1.8 2.2 2.7 ## ----------------------------------------------- ## ## Table: Table continues below ## ## ## ----------------------------------------------- ## V87 V88 V89 V90 V91 V92 V93 V94 ## ----- ----- ----- ----- ----- ----- ----- ----- ## 7.6 6.2 5.1 3.7 2.9 2.5 2.7 2.7 ## ## 2.7 2.7 2.7 2.7 2.7 2.7 2.2 0.9 ## ----------------------------------------------- ## ## Table: Table continues below ## ## ## -------------------------------------------------- ## V95 V96 V97 V98 V99 V100 V101 V102 ## ----- ----- ----- ----- ----- ------ ------ ------ ## 4.2 4.5 4.5 4.5 4.5 4.5 4.3 2.5 ## ## 0.9 0.9 0.9 1.4 
4.1 5.4 6.9 10.6 ## -------------------------------------------------- ## ## Table: Table continues below ## ## ## ------------------------------------------------------- ## V103 V104 V105 V106 V107 V108 V109 V110 ## ------ ------ ------ ------ ------ ------ ------ ------ ## 1.8 1.8 1.8 1.8 1.8 1.8 1.8 1.8 ## ## 13.7 18.4 30.2 40.4 53.3 63.0 77.3 90.3 ## ------------------------------------------------------- ## ## Table: Table continues below ## ## ## ------------------------------------------------------- ## V111 V112 V113 V114 V115 V116 V117 V118 ## ------ ------ ------ ------ ------ ------ ------ ------ ## 1.8 1.8 1.8 1.8 1.4 0.5 0.0 0.0 ## ## 104.5 117.3 134.3 147.8 164.8 181.0 189.4 199.2 ## ------------------------------------------------------- ##