I have a dataframe called mydf. I also have a vector called myvec <- c("chr5:11", "chr3:112", "chr22:334"). What I want to do is select a range of rows (including the 3 values above and the 3 values below) whenever any of the vector elements matches the key in mydf, and make a subset of mydf (result).
Since chr5:11 in myvec matches a key in mydf, we select the rows matching chr5:8 (three values below) through chr5:14 (three values above) in the result.
mydf<- structure(list(key = structure(c(5L, 2L, 7L, 8L, 4L, 1L, 6L,
3L, 11L, 10L, 9L), .Names = c("34", "35", "36", "37", "38", "39",
"40", "41", "42", "43", "44"), .Label = c("chr5:10", "chr5:11",
"chr5:1123", "chr5:118", "chr5:12", "chr5:123", "chr5:13", "chr5:14",
"chr5:19", "chr5:8", "chr5:9"), class = "factor"), variantId = structure(1:11, .Names = c("34",
"35", "36", "37", "38", "39", "40", "41", "42", "43", "44"), .Label = c("9920068",
"9920069", "9920070", "9920071", "9920072", "9920073", "9920074",
"9920075", "9920076", "9920077", "9920078"), class = "factor")), .Names = c("key",
"variantId"), row.names = c("34", "35", "36", "37", "38", "39",
"40", "41", "42", "43", "44"), class = "data.frame")
result
key variant
43 "chr5:8" "9920077"
42 "chr5:9" "9920076"
39 "chr5:10" "9920073"
35 "chr5:11" "9920069"
34 "chr5:12" "9920068"
36 "chr5:13" "9920070"
37 "chr5:14" "9920071"
How about the following (I use data.table but the base version is almost the same)
library(data.table)
mydf <- as.data.table(mydf) #(if mydf really is stored as a matrix currently)
myvec2 <- lapply(strsplit(gsub("chr", "", myvec), split=":"), as.integer)
mydf[unique(Reduce(c, sapply(myvec2, function(x){
which(key %in% paste0("chr", x[1], ":", seq((x2 <- x[2]) - 3L, x2 + 3L)))}
))), ]
(in base, replace as.data.table with as.data.frame and key with mydf$key; the subsetting call already ends in , ] so it works unchanged on a data.frame)
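For the base R route, a minimal sketch might look like the following. The data frame `toy` and its values are made up for illustration (they only mimic the "chrN:pos" layout of the key column), and the helper names `parts` and `wanted` are not from the original answer.

```r
# Hypothetical stand-in for mydf, with the same "chrN:pos" key layout
toy <- data.frame(
  key = c("chr5:12", "chr5:11", "chr5:14", "chr5:118", "chr5:8", "chr3:200"),
  variantId = c("a", "b", "c", "d", "e", "f"),
  stringsAsFactors = FALSE
)
myvec <- c("chr5:11", "chr3:112", "chr22:334")

# For each element of myvec, build every key within 3 positions of it
parts <- strsplit(myvec, ":", fixed = TRUE)
wanted <- unlist(lapply(parts, function(p) {
  pos <- as.integer(p[2])
  paste0(p[1], ":", seq(pos - 3L, pos + 3L))
}))

# Keep the rows whose key falls in any of those windows
res <- toy[toy$key %in% wanted, ]
print(res)
```

Building the candidate keys once and using a single %in% avoids looping over the rows of the data, which should matter more as the table grows.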
Extra option for sorting
Actually, I think this option is better in general, since it stores your information in a more pliable way in the first place. This version's a bit heavier in the data.table parlance.
mydf <- as.data.table(mydf)
#Split your `key` variable into its pre- and post-colon components
# (of course using better names if those numbers mean something
# more specific to you)
mydf[ , c("chr", "sub") :=
.(as.integer(gsub("chr|:.*", "", key)),
as.integer(gsub(".*:", "", key)))]
Now, proceeding much as before with a slight tweak:
myvec2<-lapply(strsplit(gsub("chr","",myvec),split=":"),as.integer)
mydf[unique(Reduce(c, sapply(myvec2, function(x){
which(chr == x[1] & sub %in% seq((x2 <- x[2]) - 3L, x2 + 3L))}
)))][order(chr, sub)]
Outputs:
key variantId chr sub
1: chr5:8 9920077 5 8
2: chr5:9 9920076 5 9
3: chr5:10 9920073 5 10
4: chr5:11 9920069 5 11
5: chr5:12 9920068 5 12
6: chr5:13 9920070 5 13
7: chr5:14 9920071 5 14
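The splitting step can also be sketched in base R. The vector `key` below is a made-up example, and the sketch assumes purely numeric chromosome names; a key such as "chrX:10" would give NA for `chr` (with a coercion warning), so it would need separate handling.

```r
# Hypothetical keys in the same "chrN:pos" layout
key <- c("chr5:12", "chr5:8", "chr22:334")

# Chromosome number: the digits between "chr" and the colon
chr <- as.integer(sub("^chr([0-9]+):.*$", "\\1", key))
# Position: everything after the colon
pos <- as.integer(sub("^.*:", "", key))

# Sort numerically by chromosome, then by position
ord <- order(chr, pos)
print(key[ord])
```

Sorting on the integer columns rather than on the raw strings is what avoids orderings such as "chr5:118" coming before "chr5:12".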
You can use the GenomicRanges package.
library(GenomicRanges)
myvec <- c("chr5:11", "chr3:112", "chr22:334")
myvec.gr <- GRanges(gsub(":.+", "", myvec),
IRanges(as.numeric(gsub(".+:", "", myvec)) - 3,
as.numeric(gsub(".+:", "", myvec)) + 3))
mydf.gr <- GRanges(gsub(":.+", "", mydf[,"key"]),
IRanges(as.numeric(gsub(".+:", "", mydf[,"key"])),
as.numeric(gsub(".+:", "", mydf[,"key"]))))
d.v.op <- findOverlaps(mydf.gr, myvec.gr)
mydf[queryHits(d.v.op), ]
# key variantId
# 34 "chr5:12" "9920068"
# 35 "chr5:11" "9920069"
# 36 "chr5:13" "9920070"
# 37 "chr5:14" "9920071"
# 39 "chr5:10" "9920073"
# 42 "chr5:9" "9920076"
# 43 "chr5:8" "9920077"
I have a large dataset and I want to subtract specific columns from each other based on their position. I want to subtract column 2 from column 8, column 3 from column 9 and column 4 from column 10.
Thanks a lot
Magnus
structure(list(Stamp_summertime = structure(c(1546684744, 1546685858,
1546687004, 1547030061, 1547030835, 1547031816), tzone = "UTC", class = c("POSIXct",
"POSIXt")), X26.013 = c(0.138461, 0.138461, 0.138461, 0.144421,
0.144421, 0.144421), X27.024 = c(0.0752111, 0.0752111, 0.0752111,
0.0426819, 0.0426819, 0.0426819), X33.031 = c(3.75788, 3.75788,
3.75788, 3.12581, 3.12581, 3.12581), jar_camp = c("1_pf1.1",
"2_pf1.1", "3_pf1.1", "1_pf2.1", "2_pf2.1", "3_pf2.1"), jar = structure(c(1L,
12L, 23L, 1L, 12L, 23L), .Label = c("1", "10_blank", "11", "12",
"13", "14", "15", "16_blank", "17", "18", "19", "2", "20_blank",
"21", "22", "23", "24", "25", "26", "27", "28", "29", "3", "30_blank",
"31", "32", "33", "34", "35", "36", "37", "38_blank", "39", "4",
"40", "41", "42", "43", "44_blank", "45", "46", "47", "48", "49",
"5_blank", "blank_50", "51", "52", "53", "54", "55", "56", "57",
"6", "7", "8", "9", "X_blank"), class = "factor"), campaign = c("pf1.1",
"pf1.1", "pf1.1", "pf2.1", "pf2.1", "pf2.1"), i.X26.013 = c(0.144658,
0.21502, 0.458296, 0.191571, 0.0789067, 0.711814), i.X27.024 = c(0.0595547,
0.0651149, 0.146772, 0.0997815, 0.0539976, 0.185398), i.X33.031 = c(5.4066,
3.30406, 18.0479, 6.13854, 1.3028, 22.2226)), sorted = "Stamp_summertime", class = c("data.table",
"data.frame"), row.names = c(NA, -6L))
We can create 2 vectors of position and subtract the columns directly. Since you have data.table we use ..column_number to select columns by position.
library(data.table)
col1group <- 2:4
col2group <- 8:10
df[, ..col1group] - df[, ..col2group]
If you want to add them as new columns to original data you can rename them and cbind
cbind(df, setNames(df[, ..col1group] - df[, ..col2group],
paste0(names(df)[col1group], '_diff')))
Something like the following computes the subtractions in the question.
library(data.table)
nms <- names(df1)
iCols <- grep("^i\\.", nms, value = TRUE)
Cols <- sub("^i\\.", "", iCols)
df1[, lapply(seq_along(Cols), function(i) get(Cols[i]) - get(iCols[i]))]
# V1 V2 V3
#1: -0.0061970 0.0156564 -1.64872
#2: -0.0765590 0.0100962 0.45382
#3: -0.3198350 -0.0715609 -14.29002
#4: -0.0471500 -0.0570996 -3.01273
#5: 0.0655143 -0.0113157 1.82301
#6: -0.5673930 -0.1427161 -19.09679
Following Ronak Shah's answer I realized that the code below also works.
df1[, ..Cols] - df1[, ..iCols]
The numeric results are the same but the column names are the vector Cols.
To create new columns, try
newCols <- paste(Cols, "diff", sep = "_")
df1[, (newCols) := lapply(seq_along(Cols), function(i) get(Cols[i]) - get(iCols[i]))]
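The same name-based pairing can be sketched in base R on a plain data.frame. The data frame `toy` and its values are hypothetical; the only assumption carried over from the answer is that each "i."-prefixed column has a partner without the prefix.

```r
# Hypothetical data: columns a, b and their "i."-prefixed partners
toy <- data.frame(a = c(1, 2), b = c(3, 4), i.a = c(0.5, 1), i.b = c(1, 1))

iCols <- grep("^i\\.", names(toy), value = TRUE)  # "i.a", "i.b"
Cols <- sub("^i\\.", "", iCols)                   # "a", "b"

# Guard against an "i." column that has no partner
stopifnot(all(Cols %in% names(toy)))

# Whole-block subtraction, stored under new "_diff" names
toy[paste0(Cols, "_diff")] <- toy[Cols] - toy[iCols]
print(toy)
```

Pairing by name rather than by position keeps the code working if columns are later reordered or inserted.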
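Base R solution (this assumes df is a plain data.frame; if it is a data.table, as in the question, convert it first with as.data.frame(df)):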
idx <- c(2, 3, 4)
jdx <- c(8, 9, 10)
Using lapply() and column binding the list:
setNames(do.call("cbind", lapply(seq_along(idx), function(i){
df[, jdx[i], drop = FALSE] - df[, idx[i], drop = FALSE]
}
)
), c(paste("x", jdx, idx, sep = "_")))
Using sapply() and coercing vectors to a data.frame:
setNames(data.frame(sapply(seq_along(idx), function(i){
df[, jdx[i], drop = FALSE] - df[, idx[i], drop = FALSE]
}
)
), c(paste("x", jdx, idx, sep = "_")))
Using Map() and Reduce() and column binding to original data.frame:
cbind(df, setNames(Reduce(cbind, Map(function(i){
df[, jdx[i], drop = FALSE] - df[, idx[i], drop = FALSE]
}, seq_along(idx))), c(paste("x", jdx, idx, sep = "_"))))
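As a possibly simpler variant of the above, two equally sized blocks of numeric columns of a plain data.frame can be subtracted in one step, without looping over the positions. The data frame `toy` is made up for illustration; `idx` and `jdx` play the same roles as in the answer.

```r
# Hypothetical data: columns 2:3 are subtracted from columns 5:6
toy <- data.frame(id = 1:3,
                  a = c(1, 2, 3), b = c(10, 20, 30),
                  x = c("p", "q", "r"),
                  i.a = c(5, 5, 5), i.b = c(1, 1, 1))
idx <- c(2, 3)
jdx <- c(5, 6)

# Element-wise difference of the two column blocks
diffs <- toy[jdx] - toy[idx]
names(diffs) <- paste("x", jdx, idx, sep = "_")

# Append the differences to the original data
out <- cbind(toy, diffs)
print(out)
```

This relies on both blocks being numeric and having the same number of columns; a non-numeric column in either block would need to be excluded first.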
I want to replace values in the first column of mat1
mat1 <- matrix(c("vect-1822", "vect3", "vect-1818", "vect3", "vect-2030", "vect4", "vect-1926", "vect5", "vect-1818", "vect9", "vect-1818", "vect3", "vect-2030", "vect7"), ncol = 2, byrow=T)
with values from the second column in mat2:
mat2 <- matrix(c("vect-1822", "1", "vect-1818", "33", "vect-2030", "34", "vect-1926", "42"), ncol = 2, byrow=T)
The result will be :
mat_res <- matrix(c("1", "vect3", "33", "vect3", "34", "vect4", "42", "vect5", "33", "vect9", "33", "vect3", "34", "vect7"), ncol = 2, byrow=T)
I tried with two indices i and j, but that is not optimal because my matrix is very large.
We can use a named vector to match and replace:
mat3 <- mat1
mat3[,1] <- setNames(mat2[,2], mat2[,1])[mat1[,1]]
Checking against the OP's output:
identical(mat3, mat_res)
#[1] TRUE
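One thing to watch with the named-vector lookup is that a key missing from the lookup table comes back as NA. A minimal sketch of a variant that keeps the original value in that case is shown below; the vectors `lookup` and `keys` are made-up examples, not the OP's matrices.

```r
# Hypothetical lookup table (names are the old values) and keys to translate
lookup <- c("vect-1822" = "1", "vect-1818" = "33")
keys <- c("vect-1818", "vect-9999", "vect-1822")

# Position of each key in the lookup table (NA when absent)
hit <- match(keys, names(lookup))
found <- !is.na(hit)

# Replace only the keys that were found, leave the others untouched
replaced <- keys
replaced[found] <- lookup[hit[found]]
print(replaced)
```

Both the lookup and the replacement are vectorised, so this should scale to a large matrix column in the same way as the answer above.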
One of my favorite things about library(readr) and the read_csv() function in R is that it almost always sets the column types of my data to the correct class. However, I am currently working with an API in R that returns data to me as a dataframe of all character classes, even if the data is clearly numbers. Take this dataframe for example, which has some sports data:
dput(mydf)
structure(list(isUnplayed = c("false", "false", "false"), isInProgress =
c("false", "false", "false"), isCompleted = c("true", "true", "true"), awayScore = c("106",
"95", "95"), homeScore = c("94", "97", "111"), game.ID = c("31176",
"31177", "31178"), game.date = c("2015-10-27", "2015-10-27",
"2015-10-27"), game.time = c("8:00PM", "8:00PM", "10:30PM"),
game.location = c("Philips Arena", "United Center", "Oracle Arena"
), game.awayTeam.ID = c("88", "86", "110"), game.awayTeam.City = c("Detroit",
"Cleveland", "New Orleans"), game.awayTeam.Name = c("Pistons",
"Cavaliers", "Pelicans"), game.awayTeam.Abbreviation = c("DET",
"CLE", "NOP"), game.homeTeam.ID = c("91", "89", "101"), game.homeTeam.City = c("Atlanta",
"Chicago", "Golden State"), game.homeTeam.Name = c("Hawks",
"Bulls", "Warriors"), game.homeTeam.Abbreviation = c("ATL",
"CHI", "GSW"), quarterSummary.quarter = list(structure(list(
`#number` = c("1", "2", "3", "4"), awayScore = c("25",
"23", "34", "24"), homeScore = c("25", "18", "23", "28"
)), .Names = c("#number", "awayScore", "homeScore"), class = "data.frame", row.names = c(NA,
4L)), structure(list(`#number` = c("1", "2", "3", "4"), awayScore = c("17",
"23", "28", "27"), homeScore = c("26", "20", "25", "26")), .Names = c("#number",
"awayScore", "homeScore"), class = "data.frame", row.names = c(NA,
4L)), structure(list(`#number` = c("1", "2", "3", "4"), awayScore = c("35",
"14", "26", "20"), homeScore = c("39", "20", "35", "17")), .Names = c("#number",
"awayScore", "homeScore"), class = "data.frame", row.names = c(NA,
4L)))), .Names = c("isUnplayed", "isInProgress", "isCompleted",
"awayScore", "homeScore", "game.ID", "game.date", "game.time",
"game.location", "game.awayTeam.ID", "game.awayTeam.City", "game.awayTeam.Name",
"game.awayTeam.Abbreviation", "game.homeTeam.ID", "game.homeTeam.City",
"game.homeTeam.Name", "game.homeTeam.Abbreviation", "quarterSummary.quarter"
), class = "data.frame", row.names = c(NA, 3L))
It is quite a hassle to deal with this dataframe once it is returned by the API, given the class types. I've come up with a sort of a hack to update the column classes, which is as follows:
write_csv(mydf, 'mydf.csv')
mydf <- read_csv('mydf.csv')
By writing to CSV and then re-reading the CSV using read_csv(), the dataframe columns update. Unfortunately I am left with a CSV file in my directory that I don't want. Is there a way to update the columns of an R dataframe to their 'read_csv()' column classes, without actually having to write the CSV?
Any help is appreciated!
You don't need to write and read the data if you just want readr to guess your column types. You could use readr::type_convert for that:
iris %>%
dplyr::mutate(Sepal.Width = as.character(Sepal.Width)) %>%
readr::type_convert() %>%
str()
For comparison:
iris %>%
dplyr::mutate(Sepal.Width = as.character(Sepal.Width)) %>%
str()
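Try this code. type.convert converts a character vector to logical, integer, numeric, complex or factor as appropriate (with as.is = TRUE, strings stay character instead of becoming factors).
# convert every character column to its most specific type
indx <- which(sapply(df, is.character))
df[indx] <- lapply(df[indx], type.convert, as.is = TRUE)
# turn any remaining factor columns back into character
indx <- which(sapply(df, is.factor))
df[indx] <- lapply(df[indx], as.character)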
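As a small self-contained illustration of the type.convert approach, the sketch below uses a made-up all-character data frame `raw` that loosely mimics the API output in the question (the column names are hypothetical).

```r
# Hypothetical all-character data, similar in spirit to the API result
raw <- data.frame(score = c("106", "95"),
                  ratio = c("1.5", "2"),
                  done = c("true", "false"),
                  team = c("DET", "CLE"),
                  stringsAsFactors = FALSE)

# Convert each column in place; as.is = TRUE keeps strings as character
converted <- raw
converted[] <- lapply(raw, type.convert, as.is = TRUE)

# Inspect the resulting class of every column
classes <- vapply(converted, function(col) class(col)[1], character(1))
print(classes)
```

Assigning into `converted[]` keeps the data.frame structure and the column names, while each column gets its newly guessed type. Nested list columns, such as quarterSummary.quarter in the question, would have to be left out before applying this.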
Let's start with some data:
structure(list(Group = c("Mark", "Matt", "Tim", "Tom"), `1` = c(0.749552072382562,
1.06820497349356, 1.00116263663573, 0.864987635002866), `2` = c(1.00839505250436,
0.796306651704629, 1.02603677593328, 1.00321936833133), `3` = c(0.736638669191169,
0.973483626272054, 1.14805519301778, 0.899272693725192), `4` = c(0.728882841159455,
0.871211836418332, 1.0442119745299, 0.859935708928745), `5` = c(0.749552072382562,
1.06820497349356, 1.00116263663573, 0.864987635002866), `6` = c(1.00839505250436,
0.796306651704629, 1.02603677593328, 1.00321936833133), `7` = c(0.736638669191169,
0.973483626272054, 1.14805519301778, 0.899272693725192), `8` = c(0.728882841159455,
0.871211836418332, 1.0442119745299, 0.859935708928745), `9` = c(0.749552072382562,
1.06820497349356, 1.00116263663573, 0.864987635002866), `10` = c(1.00839505250436,
0.796306651704629, 1.02603677593328, 1.00321936833133), `11` = c(0.736638669191169,
0.973483626272054, 1.14805519301778, 0.899272693725192), `12` = c(0.728882841159455,
0.871211836418332, 1.0442119745299, 0.859935708928745), `13` = c(0.749552072382562,
1.06820497349356, 1.00116263663573, 0.864987635002866), `14` = c(1.00839505250436,
0.796306651704629, 1.02603677593328, 1.00321936833133), `15` = c(0.736638669191169,
0.973483626272054, 1.14805519301778, 0.899272693725192), `16` = c(0.728882841159455,
0.871211836418332, 1.0442119745299, 0.859935708928745), `17` = c(0.766036811789943,
0.871085862829362, 1.02210371210681, 0.937452345474458), `18` = c(1.0357237385154,
1.02805558505417, 0.946794300033338, 1.04688545274238), `19` = c(0.763210436944137,
0.801397021884422, 0.952553568039278, 0.990226493248718), `20` = c(0.789338028300063,
0.822815644347233, 0.958462750269733, 1.04183361434861), `21` = c(0.766036811789943,
0.871085862829362, 1.02210371210681, 0.937452345474458), `22` = c(1.0357237385154,
1.02805558505417, 0.946794300033338, 1.04688545274238), `23` = c(0.763210436944137,
0.801397021884422, 0.952553568039278, 0.990226493248718), `24` = c(0.789338028300063,
0.822815644347233, 0.958462750269733, 1.04183361434861), `25` = c(0.766036811789943,
0.871085862829362, 1.02210371210681, 0.937452345474458), `26` = c(1.0357237385154,
1.02805558505417, 0.946794300033338, 1.04688545274238), `27` = c(0.763210436944137,
0.801397021884422, 0.952553568039278, 0.990226493248718), `28` = c(0.789338028300063,
0.822815644347233, 0.958462750269733, 1.04183361434861), `29` = c(0.766036811789943,
0.871085862829362, 1.02210371210681, 0.937452345474458), `30` = c(1.0357237385154,
1.02805558505417, 0.946794300033338, 1.04688545274238), `31` = c(0.763210436944137,
0.801397021884422, 0.952553568039278, 0.990226493248718), `32` = c(0.789338028300063,
0.822815644347233, 0.958462750269733, 1.04183361434861), `33` = c(0.937894856206067,
NA, 1.00383773624603, 1.04181193834546), `34` = c(1.03944921519508,
NA, 0.983868286249464, 1.10409633668759), `35` = c(0.949802513948967,
NA, 1.06522152108054, 1.04376827636719), `36` = c(0.965871712940006,
NA, 1.18437146805406, 1.01355356488254), `37` = c(0.937894856206067,
NA, 1.00383773624603, 1.04181193834546), `38` = c(1.03944921519508,
NA, 0.983868286249464, 1.10409633668759), `39` = c(0.949802513948967,
NA, 1.06522152108054, 1.04376827636719), `40` = c(0.965871712940006,
NA, 1.18437146805406, 1.01355356488254), `41` = c(0.937894856206067,
NA, 1.00383773624603, 1.04181193834546), `42` = c(1.03944921519508,
NA, 0.983868286249464, 1.10409633668759), `43` = c(0.949802513948967,
NA, 1.06522152108054, 1.04376827636719), `44` = c(0.965871712940006,
NA, 1.18437146805406, 1.01355356488254), `45` = c(0.937894856206067,
NA, 1.00383773624603, 1.04181193834546), `46` = c(1.03944921519508,
NA, 0.983868286249464, 1.10409633668759), `47` = c(0.949802513948967,
NA, 1.06522152108054, 1.04376827636719)), .Names = c("Group",
"1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12",
"13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23",
"24", "25", "26", "27", "28", "29", "30", "31", "32", "33", "34",
"35", "36", "37", "38", "39", "40", "41", "42", "43", "44", "45",
"46", "47"), row.names = c(NA, 4L), class = "data.frame")
Each row is a collection of ratios which I got from a comparison of two groups. I would like to know if the ratios are significantly different from 1. So, I would like to test whether each row (vector) is different from 1 by using the two tests mentioned in the title (t-test and Wilcoxon). How do I apply those tests to my data? Please consider that each row may have a different length. NAs should be ignored. As an output I would like to have a table with 3 columns: Group name, p-value t-test, p-value Wilcoxon.
Can someone help me with that?
There might be a way to use the rows of the original data frame you have, but I'd strongly recommend working with columns (tidy data frame).
library(dplyr)
library(tidyr)
# assuming this is the name of your original dataset
dt
# reshape to create a column for each name
dt2 = data.frame(t(dt), stringsAsFactors = F)
names(dt2) = dt2[1,]
dt2 = dt2[-1,]
dt2[,names(dt2)] = sapply(dt2[,names(dt2)], as.numeric)
# reshape to create a column of names and values
dt3 = dt2 %>%
gather(name,value,Mark:Tom) %>%
filter(!is.na(value)) # remove NAs
dt3 %>%
group_by(name) %>% # for each name
summarise(pval_ttest = t.test(value, mu=1)$p.value, # calculate t test p value
pval_wilc = wilcox.test(value, mu=1)$p.value) # calculate Wilcoxon p value
# # A tibble: 4 × 3
# name pval_ttest pval_wilc
# <chr> <dbl> <dbl>
# 1 Mark 4.408038e-09 1.020895e-06
# 2 Matt 6.679416e-06 2.502045e-04
# 3 Tim 1.777060e-02 6.932590e-02
# 4 Tom 2.433548e-01 5.148382e-01
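If reshaping is not wanted, the same two tests can be sketched directly on the rows in base R. The data frame `toy` below is made up (two groups, one NA), and the output column names simply mirror the ones used above; this is an illustration under those assumptions rather than a drop-in replacement.

```r
# Hypothetical ratios: one row per group, NA where a value is missing
toy <- data.frame(Group = c("A", "B"),
                  `1` = c(0.9, 1.2), `2` = c(0.8, NA),
                  `3` = c(0.95, 1.1), `4` = c(0.85, 1.3),
                  check.names = FALSE)

# For each row: drop NAs, then test the ratios against 1
res <- do.call(rbind, lapply(seq_len(nrow(toy)), function(i) {
  x <- unlist(toy[i, -1])
  x <- x[!is.na(x)]
  data.frame(Group = toy$Group[i],
             pval_ttest = t.test(x, mu = 1)$p.value,
             pval_wilc = wilcox.test(x, mu = 1)$p.value)
}))
print(res)
```

Because the NAs are dropped per row, each group is tested on however many ratios it actually has, which matches the requirement that rows may have different lengths.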
Some additional info about how a paired t test "understands" the measurements you give it and why differences and ratios might give different results.
Consider the following examples:
# paired t test of 2 vectors of same size (before and after treatment)
# it compares the means of those vectors
t.test(1:10, 13:4, paired = T)
# Paired t-test
#
# data: 1:10 and 13:4
# t = -1.5667, df = 9, p-value = 0.1516
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
# -7.331701 1.331701
# sample estimates:
# mean of the differences
# -3
# t test that compares one vector's mean to 0
# that vector is the differences of the two initial vectors
t.test(1:10 - 13:4, mu=0)
# One Sample t-test
#
# data: 1:10 - 13:4
# t = -1.5667, df = 9, p-value = 0.1516
# alternative hypothesis: true mean is not equal to 0
# 95 percent confidence interval:
# -7.331701 1.331701
# sample estimates:
# mean of x
# -3
# t test that compares one vector's mean to 1
# that vector is the ratios of the two initial vectors
t.test(1:10 / 13:4, mu=1)
# One Sample t-test
#
# data: 1:10/13:4
# t = -0.46036, df = 9, p-value = 0.6562
# alternative hypothesis: true mean is not equal to 1
# 95 percent confidence interval:
# 0.3229789 1.4480623
# sample estimates:
# mean of x
# 0.8855206
You can see that the paired t test is a simple t test of the vector of differences, which is possible because you have 2 vectors of the same length (before and after treatment). It is not the same as a simple t test of the vector of ratios.
So, it's reasonable to have different results, but in some applications the ratio test is better. Check your bibliography on that.
I have a nested data.frame
dput(res)
structure(list(date = structure(list(pretty = "12:00 PM CDT on August 14, 2015",
year = "2015", mon = "08", mday = "14", hour = "12", min = "00",
tzname = "America/Chicago"), .Names = c("pretty", "year",
"mon", "mday", "hour", "min", "tzname"), class = "data.frame", row.names = 1L),
fog = "0", rain = "1", snow = "0", snowfallm = "0.00", snowfalli = "0.00",
monthtodatesnowfallm = "", monthtodatesnowfalli = "", since1julsnowfallm = "",
since1julsnowfalli = "", snowdepthm = "", snowdepthi = "",
hail = "0", thunder = "0", tornado = "0", meantempm = "26",
meantempi = "79", meandewptm = "17", meandewpti = "63", meanpressurem = "1019",
meanpressurei = "30.09", meanwindspdm = "11", meanwindspdi = "7",
meanwdire = "", meanwdird = "139", meanvism = "16", meanvisi = "10",
humidity = "", maxtempm = "32", maxtempi = "90", mintempm = "21",
mintempi = "69", maxhumidity = "86", minhumidity = "36",
maxdewptm = "18", maxdewpti = "65", mindewptm = "15", mindewpti = "59",
maxpressurem = "1021", maxpressurei = "30.15", minpressurem = "1017",
minpressurei = "30.04", maxwspdm = "19", maxwspdi = "12",
minwspdm = "0", minwspdi = "0", maxvism = "16", maxvisi = "10",
minvism = "16", minvisi = "10", gdegreedays = "29", heatingdegreedays = "0",
coolingdegreedays = "14", precipm = "0.00", precipi = "0.00",
precipsource = "", heatingdegreedaysnormal = "", monthtodateheatingdegreedays = "",
monthtodateheatingdegreedaysnormal = "", since1sepheatingdegreedays = "",
since1sepheatingdegreedaysnormal = "", since1julheatingdegreedays = "",
since1julheatingdegreedaysnormal = "", coolingdegreedaysnormal = "",
monthtodatecoolingdegreedays = "", monthtodatecoolingdegreedaysnormal = "",
since1sepcoolingdegreedays = "", since1sepcoolingdegreedaysnormal = "",
since1jancoolingdegreedays = "", since1jancoolingdegreedaysnormal = ""), .Names = c("date",
"fog", "rain", "snow", "snowfallm", "snowfalli", "monthtodatesnowfallm",
"monthtodatesnowfalli", "since1julsnowfallm", "since1julsnowfalli",
"snowdepthm", "snowdepthi", "hail", "thunder", "tornado", "meantempm",
"meantempi", "meandewptm", "meandewpti", "meanpressurem", "meanpressurei",
"meanwindspdm", "meanwindspdi", "meanwdire", "meanwdird", "meanvism",
"meanvisi", "humidity", "maxtempm", "maxtempi", "mintempm", "mintempi",
"maxhumidity", "minhumidity", "maxdewptm", "maxdewpti", "mindewptm",
"mindewpti", "maxpressurem", "maxpressurei", "minpressurem",
"minpressurei", "maxwspdm", "maxwspdi", "minwspdm", "minwspdi",
"maxvism", "maxvisi", "minvism", "minvisi", "gdegreedays", "heatingdegreedays",
"coolingdegreedays", "precipm", "precipi", "precipsource", "heatingdegreedaysnormal",
"monthtodateheatingdegreedays", "monthtodateheatingdegreedaysnormal",
"since1sepheatingdegreedays", "since1sepheatingdegreedaysnormal",
"since1julheatingdegreedays", "since1julheatingdegreedaysnormal",
"coolingdegreedaysnormal", "monthtodatecoolingdegreedays", "monthtodatecoolingdegreedaysnormal",
"since1sepcoolingdegreedays", "since1sepcoolingdegreedaysnormal",
"since1jancoolingdegreedays", "since1jancoolingdegreedaysnormal"
), class = "data.frame", row.names = 1L)
and I am using the following command to retrieve data from it
df <- data.frame()
df <- rbind(df, ldply(res, function(x) x[[1]]))
To use this data frame, I convert it into data table, using dt <- data.table(df) and now I know how to work with the data, for instance dt[.id=="fog"].
Is there a more elegant/efficient solution?
The problem was solved by #antoine-sac. It was not necessary to use apply to get the data; it was only a matter of "un-nesting" the data.
Your problem is that your data is a data.frame and one of its columns is date. But date is itself a data.frame. As you say, it is a nested list. So let's "un-nest" it.
You can simply do (assuming your data is in data):
df.date <- data$date
# removing incorrectly formatted date from data
data$date <- NULL
At this point, data is a normal data.frame and df.date is also a basic data.frame.
> df.date
pretty year mon mday hour min tzname
1 12:00 PM CDT on August 14, 2015 2015 08 14 12 00 America/Chicago
If you want to merge that with your existing data.frame:
# binding df.date with your data
data <- cbind(data, df.date)
No need for any kind of apply.
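When more than one column might be nested, the same un-nesting idea can be sketched generically. The data frame `nested` below is a made-up miniature of the structure in the question, and the helper names `cols` and `is_df` are hypothetical.

```r
# Hypothetical miniature of the nested structure: `date` is itself a data.frame
nested <- data.frame(fog = "0", rain = "1", stringsAsFactors = FALSE)
nested$date <- data.frame(year = "2015", mon = "08", stringsAsFactors = FALSE)

# Find the columns that are data.frames
cols <- as.list(nested)
is_df <- vapply(cols, is.data.frame, logical(1))

# Bind the ordinary columns with the contents of every nested data.frame
flat <- do.call(cbind, c(list(nested[!is_df]), unname(cols[is_df])))
print(names(flat))
```

The nested columns are unnamed before binding so that their inner names (year, mon) are kept as they are rather than being prefixed.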
Now, if you don't know how to access variables in a data.frame, that's another matter.
If you want, say, meantempm, you can simply do data$meantempm.
For the rest, I refer you to a beginner tutorial on R; there are plenty to choose from with a quick Google search.