Subset xts object using vector of unique index days - r

I'm trying to subset an xts object using a vector of xts timestamps that have been processed into a vector of unique timestamps. This follows on from this previous question that was only partially answered.
Some sample data:
dput(sample.data.merge, control="all")
structure(c(11.65, 11.13, 11.13, 11.5, 11.8, 11.45, 11.45, 11.08,
11.08, 11.25, 9.8, 10.45, 10.9, 10.9, 10.9, 10.9, 10.9, 10.9,
10.45, 10.5, 10.5, 10.08, 10.08, 10.65, 10.08, 10.65, 10.6, 10.65,
10.65, 10.085, 10.145, 11.9, 11.085, 9.35, 9.15, 9.15, 9.9, 9.0875,
9.3, 9.3, 9.3, 9.35, 9.35, 9.35, 9.25, 9.5, 9.45, 9.3, 11.15,
11.15, 11.15, 11.15, 11.8, 8, 10.05, 10.05, 10.25, 10.4, 10.15,
10.15, 10.3, 10.15, 10.1, 11.08, 11.08, 11.08, 11.65, 11.85,
11.9, 11.9, 11.9, 12.65, 13.35, 13.35, 15.95, 15.9, 15.4, 15.4,
15.4, 15.4, 15.13, 12.13, 12.35, 11.082, 11.082, 11.08, 12.1,
12.3, 12.3, 12.4, 12.6, 12.6, 12.13, 12.45, 12.9, 12.9, 12.9,
14, 12.6, 12.6, 12.45, 15.25, 12.085, 12.95, 12.95, 12.35, 12.13,
12.8, 14, 14, 12.45, 12.45, 12.45, 12.45, 12.25, 12.6, 12.085,
15.1, 15.15, 15.35, 15.3, 12.5, 12.5, 12.15, 12.2, 11.085, 11.35,
11.45, 11.13, 11.13, 11.35, 11.2, 12.5, 12.6, 12.95, 12.95, 12.5,
12.45, 12.3, 12.3, 12.3, 12.45, 12.45, 12.45, 12.5, 12.45, 12.45,
12.13, 12.13, 12.65, 190, 190, 190, 190, 130, 190, 190, 190,
190, 190, 130, 190, 130, 130, 445, 445, 445, 445, 130, 445, 190,
445, 445, 190, 190, 190, 190, 130, 190, 190, 190, 190, 190, 190,
190, 190, 190, 190, 190, 190, 190, 190, 190, 190, 190, 190, 190,
190, 275, 190, 190, 190, 190, 190, 190, 190, 190, 190, 130, 130,
190, 190, 190, 130, 130, 130, 190, 130, 190, 190, 190, 130, 190,
190, 190, 190, 190, 190, 190, 190, 190, 190, 190, 190, 190, 190,
1190, 190, 190, 130, 130, 130, 190, 1130, 190, 190, 130, 190,
190, 190, 190, 190, 190, 130, 130, 190, 190, 375, 190, 190, 190,
130, 190, 130, 190, 190, 190, 190, 130, 190, 190, 190, 190, 190,
190, 190, 190, 190, 190, 190, 190, 190, 190, 190, 190, 190, 190,
130, 130, 130, 190, 130, 190, 190, 190, 130, 130, 445, 445, 130,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, 0, 0, NA, NA, NA, NA, NA, 0.21, 0.21, 0.26, 0.0250000000000004,
0, 0.0250000000000004, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, 0.0249999999999995, 0.0250000000000004, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0.0250000000000004,
0.100000000000001, 0.39, NA, NA, NA, NA, NA, 0.0250000000000004,
NA, NA, NA, NA, NA, 0.524999999999999, 0.25, 0, 0, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, 0.149999999999999, 0.135000000000001,
0.149999999999999, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, 0.409999999999999, 0.375, 0.3, 0.635, 0.385, 0.335, 0.175000000000001,
0, NA, NA, NA, NA, NA, 1.4, 0.2, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, 0.109999999999999, NA, NA, NA, NA, NA, NA, NA, NA,
NA, 0.0749999999999993, 0.0749999999999993, 0.0749999999999993,
0.0250000000000004, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, NA, NA, NA,
NA, NA, 127.5, 0, 0, 0, 0, 0, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, 0, 0, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, 30, 30, 30, NA, NA, NA, NA,
NA, 0, NA, NA, NA, NA, NA, 0, 0, 0, 0, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, 30, 30, 30, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, 0, 30, 30, 0, 0, 0, 0, 0, NA, NA, NA, NA, NA, 0,
0, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, NA, NA, NA, NA,
NA, NA, NA, NA, NA, 0, 0, 30, 0, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 10.9,
10.9, NA, NA, NA, NA, NA, 10.29, 10.29, 10.34, 10.625, 10.65,
10.625, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 9.325,
9.325, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, 10.15, 10.225, 10.69, NA, NA, NA, NA, NA, 11.9,
NA, NA, NA, NA, NA, 15.4, 15.4, 15.4, 15.4, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, 12.35, 12.35, 12.425, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, 12.65, 12.575, 12.875, 12.875, 12.625,
12.625, 12.625, 12.45, NA, NA, NA, NA, NA, 13.85, 15.125, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, 11.275, NA, NA, NA, NA, NA,
NA, NA, NA, NA, 12.375, 12.375, 12.375, 12.45, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, 445, 445, NA, NA, NA, NA, NA, 317.5, 190, 190, 190, 190,
190, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 190,
190, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, 160, 160, 160, NA, NA, NA, NA, NA, 190, NA, NA,
NA, NA, NA, 190, 190, 190, 190, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, 160, 160, 160, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, 190, 190, 190, 190, 190, 190, 190, 190, NA, NA, NA, NA,
NA, 190, 190, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 190, NA,
NA, NA, NA, NA, NA, NA, NA, NA, 130, 130, 160, 190, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NaN, Inf, NA, NA, NA, NA, NA, 0.999999999999996,
1.71428571428572, 1, 1, NaN, 21.5999999999997, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, 1.00000000000004, 2.99999999999993,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, 37.1999999999995, 8.54999999999987, 0.999999999999998,
NA, NA, NA, NA, NA, 29.9999999999996, NA, NA, NA, NA, NA, 0,
0, NaN, Inf, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1.66666666666666,
1.62962962962963, 0.166666666666658, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, 1.26829268292683, 0.600000000000004,
3.75, 1.77165354330709, 0.454545454545457, 0.522388059701495,
1, NaN, NA, NA, NA, NA, NA, 1.07142857142857, 0.875000000000003,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0.681818181818179, NA,
NA, NA, NA, NA, NA, NA, NA, NA, 1, 1, 1, 2, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NaN, Inf, NA, NA, NA, NA, NA, 1, NaN, NaN, Inf, NaN, NaN,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, NaN,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, 1, 1, 1, NA, NA, NA, NA, NA, Inf, NA, NA, NA, NA, NA,
NaN, NaN, NaN, NaN, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1,
1, 32.3333333333333, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NaN, 6.16666666666667, 0, NaN, NaN, Inf, NaN, Inf, NA,
NA, NA, NA, NA, NaN, NaN, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NaN, NA, NA, NA, NA, NA, NA, NA, NA, NA, NaN, Inf, 1, NaN,
NA, NA, NA, NA, NA), .Dim = c(150L, 8L), .Dimnames = list(NULL,
c("price", "volume", "madprice", "madvolume", "medianprice",
"medianvolume", "absdevmadprice", "absdevmadvolume")), index = structure(c(1325584080,
1325594940, 1325594940, 1325604600, 1325759100, 1325762520, 1325762520,
1325769300, 1325769300, 1325848080, 1325864880, 1326128220, 1326196500,
1326196500, 1326196500, 1326196500, 1326196500, 1326196500, 1326209700,
1326279480, 1326283620, 1326288300, 1326288300, 1326289680, 1326289680,
1326289680, 1326292320, 1326294060, 1326294600, 1326297600, 1326387000,
1326456720, 1326467160, 1326711600, 1326723000, 1326724260, 1326809940,
1326814860, 1326885960, 1326885960, 1326889980, 1326894000, 1326895200,
1326895200, 1326898080, 1326986700, 1326987240, 1326992100, 1327072140,
1327328040, 1327328040, 1327328040, 1327417920, 1327423140, 1327424820,
1327425240, 1327483200, 1327496520, 1327570320, 1327570320, 1327575420,
1327588680, 1327588980, 1327595880, 1327595880, 1327595880, 1327664820,
1327674720, 1327680660, 1327680780, 1327680780, 1327683960, 1327914300,
1327914300, 1327915260, 1327918140, 1327924860, 1327924920, 1327924980,
1327924980, 1327927680, 1328013360, 1328014200, 1328025000, 1328025000,
1328026740, 1328089440, 1328091360, 1328091360, 1328110620, 1328111340,
1328111340, 1328112420, 1328113800, 1328193540, 1328194080, 1328194140,
1328196720, 1328274360, 1328274420, 1328278320, 1328519280, 1328520120,
1328520600, 1328520600, 1328524140, 1328527980, 1328531580, 1328540880,
1328540880, 1328547600, 1328547660, 1328547720, 1328547780, 1328607060,
1328608080, 1328618760, 1328623380, 1328623380, 1328625720, 1328631480,
1328717760, 1328717880, 1328793000, 1328797980, 1329132840, 1329210480,
1329215400, 1329215820, 1329215820, 1329219480, 1329223140, 1329300900,
1329301620, 1329315240, 1329315240, 1329388740, 1329389700, 1329390000,
1329390000, 1329390180, 1329391860, 1329391860, 1329391860, 1329402120,
1329467700, 1329467700, 1329469080, 1329469080, 1329471300), tzone = "", tclass = c("POSIXlt",
"POSIXt")), .indexCLASS = c("POSIXlt", "POSIXt"), .indexTZ = "", tclass = c("POSIXlt",
"POSIXt"), tzone = "", class = c("xts", "zoo"))
The code:
sample.data.mergesub <- sample.data.merge['T10:30/T17:30']
sample.data.mergeout <- sample.data.mergesub[ which((sample.data.mergesub$absdevmadprice >=5 & sample.data.mergesub$absdevmadprice < Inf) | (sample.data.mergesub$absdevmadvol>=10 & sample.data.mergesub$absdevmadvol<Inf)),]
sample.data.unique <- unique(.indexday(sample.data.mergeout))
sample.data.unique is therefore a vector of index days. Question: I'd like to use it to extract the full day of data from the original dataset sample.data, so that I can later graph the full day of trades rather than only the subset. For instance, if Jan 03 2012 10:53:00 meets the condition that absdevmadprice is at least 5 and finite, then I'd like to store that day (Jan 03 2012) in a vector and use it to subset the original dataset. This would select all observations on that day (over the whole trading period), which I could then graph.
I've tried this code (based on Joshua's answer here) but it doesn't work:
> sample.data.uniquePOS<-sample.data.merge[paste(as.Date(as.POSIXct(sample.data.unique, origin = "1970-01-01 00:00.00 UTC", tz="GMT")))]
It returns simply the column names:
> sample.data.uniquePOS
price volume madprice madvolume medianprice medianvolume absdevmadprice
absdevmadvolume
For info, the structure of the variables:
> str(sample.data.merge)
An ‘xts’ object on 2012-01-03 09:48:00/2012-02-17 09:35:00 containing:
Data: num [1:150, 1:8] 11.6 11.1 11.1 11.5 11.8 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:8] "price" "volume" "madprice" "madvolume" ...
Indexed by objects of class: [POSIXlt,POSIXt] TZ:
xts Attributes:
NULL
> str(sample.data.uniquePOS)
An 'xts' object of zero-width
> str(sample.data.unique)
num 15371
Thanks for the help (especially if anyone can explain why the code doesn't work!).

Answer to own question:
Using these posts (Ananda's answer to this, Joshua's answer to this, and the as.Date.numeric function I found out about here) I was able to solve my own problem. This line of code seems to do it:
sample.data.uniquePOS <- sample.data.merge[paste(as.Date.numeric(sample.data.unique, origin= "1970-01-01 00:00.00 UTC", tz="GMT")),]
I can't give a great explanation of why it works when the line below does not, but perhaps as.POSIXct can't take the same input that as.Date.numeric can?
sample.data.uniquePOS <- sample.data.merge[paste(as.Date(as.POSIXct(sample.data.unique, origin = "1970-01-01 00:00.00 UTC", tz="GMT")))]
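A likely explanation for the difference, sketched in base R only: the value shown by str(sample.data.unique) is 15371, which is a count of days since 1970-01-01, not a count of seconds. as.Date() on a number treats it as days, while as.POSIXct() on a number treats it as seconds, so the second line ends up asking for a day in 1970 that the data does not contain. The variable names below are illustrative and not from the original post.

```r
# Value taken from str(sample.data.unique) above: days since 1970-01-01.
day_count <- 15371

# Read as a day count, it maps to the intended calendar day.
as_days <- as.Date(day_count, origin = "1970-01-01")

# Read as seconds (what as.POSIXct does with a number), it lands
# roughly four hours after the epoch.
as_seconds <- as.POSIXct(day_count, origin = "1970-01-01", tz = "GMT")

format(as_days)                           # "2012-02-01"
format(as_seconds, "%Y-%m-%d %H:%M:%S")   # "1970-01-01 04:16:11"
format(as.Date(as_seconds))               # "1970-01-01", a day absent from the data
```

If that reading is right, an alternative that skips the string conversion altogether might be sample.data.merge[.indexday(sample.data.merge) %in% sample.data.unique, ], which should also cope with more than one flagged day; I have not run this against the sample data.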

Related

How to load combined data in R

I have 60 Excel files that look like this (the original file has 135 rows):
structure(list(Ce = c(NA, NA, NA, "AC12-PRD-C1", "Camarda", "45431",
"171,0 cm", "754 mmHg", "20W-25W", "RÈsumÈ", NA, "Moyennag",
"Time", "Load", "V'O2", "VO2/kg", "dO2/dW", "V'CO2", "MET", "RER",
"V'E", "BF", "VTex", "PETO2", "PETCO2", "EqCO2", "EqO2", "HR",
"HRR %", "SpO2", "O2/HR", "Ce", NA, NA, NA, "AC12-PRD", NA, "t-ph",
"min", NA, "9.7222222222222224E-3", "1.9444444444444445E-2",
"2.9166666666666664E-2", "3.888888888888889E-2", "4.9999999999999996E-2",
"6.1111111111111116E-2", "7.2222222222222229E-2", "8.2638888888888887E-2",
"9.1666666666666674E-2", "0.10208333333333335", "0.1111111111111111",
"0.12430555555555556", "9.7222222222222224E-3", "2.013888888888889E-2",
"3.0555555555555555E-2", "4.0972222222222222E-2", "5.0694444444444452E-2",
"6.1805555555555558E-2", "7.1527777777777787E-2", "8.3333333333333329E-2",
"9.2361111111111116E-2", "0.10347222222222223", "0.11319444444444444",
"0.12361111111111112", NA, "t-ph", "min", NA, "9.7222222222222224E-3",
"2.013888888888889E-2", "3.0555555555555555E-2", "4.0972222222222222E-2",
"5.0694444444444452E-2", "6.1805555555555558E-2", "7.1527777777777787E-2",
"8.3333333333333329E-2", "9.2361111111111116E-2", "0.10347222222222223",
"0.11319444444444444", "0.12361111111111112", NA, "t-ph", "min",
NA, "1.0416666666666666E-2"), `ntre …` = c("5055", NA, NA, "linear",
NA, "54", NA, NA, "+ 15W", NA, NA, "e temp", NA, NA, NA, "ml",
"ml/m", NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
"ntre …", "5055", NA, NA, "-C", NA, "Load", "W", NA, "0", "0",
"0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "20", "20",
"20", "20", "20", "20", "20", "20", "20", "20", "20", "20", NA,
"Load", "W", NA, "20", "20", "20", "20", "20", "20", "20", "20",
"20", "20", "20", "20", NA, "Load", "W", NA, "25"), `PIC, Insti` = c("rue St Zot",
"tÈl.", "fax.", NA, NA, "masc", "84", "01/0", NA, NA, NA, "orel 15 Se",
"min", "W", "ml/min", "/min/kg", "in/Watt", "ml/min", NA, NA,
"L/min", "1/min", "L", "mmHg", "mmHg", NA, NA, "1/min", "%",
"%", "ml", "PIC, Insti", "rue St Zot", "tÈl.", "fax.", NA, NA,
"HR", "1/min", NA, "73", "71", "74", "72", "81", "77", "78",
"77", "78", "76", "76", "80", "86", "82", "85", "84", "85", "86",
"86", "86", "90", "89", "90", "91", NA, "HR", "1/min", NA, "86",
"82", "85", "84", "85", "86", "86", "86", "90", "89", "90", "91",
NA, "HR", "1/min", NA, "92"), `tut de` = c("ique E", "-514",
"-514", NA, "Ant", "ulin", "kg", "9/2016", NA, "ThÈo", NA, "con",
NA, "163", "2041", NA, NA, NA, NA, NA, "71", "29", NA, NA, NA,
NA, NA, "170", NA, NA, "12.1", "tut de", "ique E", "-514", "-514",
NA, "Rep", "BF", "1/min", NA, "24", "22", "22", "22", "22", "19",
"19", "19", "19", "20", "19", "18", "25", "24", "24", "24", "22",
"23", "21", "21", "22", "19", "23", "24", "RÈfÈr", "BF", "1/min",
NA, "25", "24", "24", "24", "22", "23", "21", "21", "22", "19",
"23", "24", "Exer", "BF", "1/min", NA, "23"), Cardiolo = c("st, Montr",
"3741480", "3742416", NA, "oine", NA, NA, NA, NA, "Repos", NA,
NA, "0.12430555555555556", "0", "380", "4.5", "0.00", "347",
"1.3", "0.91", "14", "18", "0.777", "111.75", "34.10", "37.4",
"34.1", "80", "53", "-4", "4.8", "Cardiolo", "st, Montr", "3741480",
"3742416", NA, "os", "V'E", "L/min", NA, "12", "13", "14", "12",
"9", "12", "12", "12", "13", "14", "13", "14", "15", "17", "18",
"18", "18", "18", "20", "19", "22", "22", "21", "22", "ence",
"V'E", "L/min", NA, "15", "17", "18", "18", "18", "18", "20",
"19", "22", "22", "21", "22", "cice", "V'E", "L/min", NA, "23"
), `gie de M` = c("Èal, HT1", NA, NA, NA, NA, "62 AnnÈ", "Dr Math",
"0.62850694444444444", NA, "AT", "Manuel", NA, "0.33333333333333331",
"40", "823", "9.8", "11.08", "704", "2.8", "0.86", "23", "23",
"0.996", "101.82", "40.90", "30.1", "25.7", "94", "45", "-4",
"8.8", "gie de M", "Èal, HT1", NA, NA, NA, NA, "V'O2", "ml/min",
NA, "365", "389", "417", "310", "281", "364", "354", "382", "401",
"405", "349", "380", "461", "516", "566", "594", "626", "641",
"720", "664", "789", "724", "746", "762", NA, "V'O2", "ml/min",
NA, "461", "516", "566", "594", "626", "641", "720", "664", "789",
"724", "746", "762", NA, "V'O2", "ml/min", NA, "777"), ontrÈal = c("1N6",
NA, NA, NA, NA, "es", "ieu Gayda", "8", NA, "MaxVO2", NA, NA,
"0.54166666666666663", "115", "1703", "20.3", "11.50", "2034",
"5.8", "1.19", "62", "27", "2.316", "112.88", "40.62", "29.5",
"35.2", "139", "18", "-4", "12.2", "ontrÈal", "1N6", NA, NA,
NA, NA, "V'CO2 d", "ml/min", "m", "283", "309", "344", "264",
"210", "296", "283", "304", "325", "339", "299", "347", "370",
"427", "464", "491", "506", "512", "591", "559", "655", "658",
"640", "657", NA, "V'CO2 d", "ml/min", "m", "370", "427", "464",
"491", "506", "512", "591", "559", "655", "658", "640", "657",
NA, "V'CO2 d", "ml/min", "m", "677"), ...8 = c(NA, NA, NA, NA,
NA, NA, NA, "VÈlo", NA, "MaxVO2", "%thÈo.", NA, NA, "71", "83",
NA, NA, NA, NA, NA, "87", "93", NA, NA, NA, NA, NA, "82", NA,
NA, "101", NA, NA, NA, NA, NA, NA, "O2/dW", "ml/", "in/Watt",
"0.00", "0.00", "0.00", "0.00", "0.00", "0.00", "0.00", "0.00",
"0.00", "0.00", "0.00", "0.00", "4.03", "6.76", "9.30", "10.70",
"12.27", "13.04", "16.97", "14.20", "20.42", "17.16", "18.29",
"19.06", NA, "O2/dW", "ml/", "in/Watt", "4.03", "6.76", "9.30",
"10.70", "12.27", "13.04", "16.97", "14.20", "20.42", "17.16",
"18.29", "19.06", NA, "O2/dW", "ml/", "in/Watt", "19.81"), ...9 = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, "1", NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, "RER", NA, NA, "0.78", "0.80", "0.83", "0.85",
"0.75", "0.82", "0.80", "0.79", "0.81", "0.84", "0.86", "0.91",
"0.80", "0.83", "0.82", "0.83", "0.81", "0.80", "0.82", "0.84",
"0.83", "0.91", "0.86", "0.86", NA, "RER", NA, NA, "0.80", "0.83",
"0.82", "0.83", "0.81", "0.80", "0.82", "0.84", "0.83", "0.91",
"0.86", "0.86", NA, "RER", NA, NA, "0.87"), ...10 = c(NA, NA,
NA, NA, NA, NA, NA, NA, NA, "Max", "Watts", NA, "0.55208333333333337",
"130", "1583", "18.9", "9.25", "1952", "5.4", "1.23", "64", "30",
"2.114", "14.84", "39.35", "31.5", "38.8", "142", "16", "-4",
"11.2", NA, NA, NA, NA, NA, NA, "EqO2", NA, NA, "27.6", "29.3",
"30.8", "33.9", "25.3", "30.3", "28.9", "29.1", "29.2", "30.5",
"32.9", "34.1", "29.4", "28.8", "28.0", "27.0", "26.3", "25.9",
"25.7", "26.3", "25.3", "28.6", "26.4", "27.3", NA, "EqO2", NA,
NA, "29.4", "28.8", "28.0", "27.0", "26.3", "25.9", "25.7", "26.3",
"25.3", "28.6", "26.4", "27.3", NA, "EqO2", NA, NA, "27.4"),
...11 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "EqCO2", NA, NA,
"35.7", "36.8", "37.3", "39.9", "33.8", "37.2", "36.2", "36.6",
"36.0", "36.5", "38.4", "37.4", "36.6", "34.9", "34.2", "32.7",
"32.5", "32.5", "31.4", "31.2", "30.5", "31.4", "30.8", "31.7",
NA, "EqCO2", NA, NA, "36.6", "34.9", "34.2", "32.7", "32.5",
"32.5", "31.4", "31.2", "30.5", "31.4", "30.8", "31.7", NA,
"EqCO2", NA, NA, "31.5"), ...12 = c(NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, "PETCO2", "mmHg", NA, "35.68", "32.78", "35.26", "33.61",
"36.27", "35.53", "35.87", "36.38", "36.51", "36.36", "35.92",
"34.10", "36.80", "36.34", "37.09", "32.65", "38.35", "38.99",
"39.26", "39.91", "39.95", "39.22", "40.18", "39.63", NA,
"PETCO2", "mmHg", NA, "36.80", "36.34", "37.09", "32.65",
"38.35", "38.99", "39.26", "39.91", "39.95", "39.22", "40.18",
"39.63", NA, "PETCO2", "mmHg", NA, "39.66"), ...13 = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, "VES (ml)", NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "86"), ...14 = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, "VESi (ml/m²)", NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "42.6"), ...15 = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, "FC (bpm)", NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "90"), ...16 = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, "QC (l/min)", NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "7.8"), ...17 = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, "IC (l/min/m²)", NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "3.9"), ...18 = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, "PAS (mmHg)", NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "120"), ...19 = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, "PAD (mmHg)", NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "74"), ...20 = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, "PAM (mmHg)", NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "94"), ...21 = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, "ICT", NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, "153"), ...22 = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, "TEV (ms)", NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "415.6"), ...23 = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, "RPD (%)", NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "49.3"), ...24 = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, "WCI (kg.m/m²)", NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "4.7"), ...25 = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, "RVSi (dyn.s/cm5.m²)", NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "1819"
), ...26 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "RVS (dyn.s/cm5)",
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, "901"), ...27 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "VTD est (ml)",
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, "137.5"), ...28 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "FE est (%)",
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, "62.7"), ...29 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "O2Hb",
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, "3.8E-3"), ...30 = c(NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "HHb",
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, "-0.29199999999999998"), ...31 = c(NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, "tHb", NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, "-0.28820000000000001"), ...32 = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, "HbDiff", NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "0.29580000000000001"
)), row.names = c(NA, -85L), class = c("tbl_df", "tbl", "data.frame"
))
I want to get rid of rows and columns so that all my files look like this:
structure(list(time = c("00:15", "00:30", "00:46", "01:00", "01:15",
"01:31", "01:46", "01:59", "02:14", "02:30", "02:44", "02:59",
"03:16", "03:29", "03:46", "03:57", "04:14", "04:31", "04:44",
"05:00", "05:16", "05:30", "05:46", "06:01", "06:15", "06:31",
"06:44", "07:00", "07:16", "00:01"), power = c(25, 25, 25, 25,
40, 40, 40, 40, 55, 55, 55, 55, 70, 70, 70, 70, 85, 85, 85, 85,
100, 100, 100, 100, 115, 115, 115, 115, 130, 130), hr = c(92,
90, 86, 87, 93, 91, 95, 94, 98, 100, 101, 100, 108, 108, 110,
113, 115, 118, 120, 122, 122, 126, 130, 130, 131, 136, 137, 139,
142, 144), fr = c(23, 22, 21, 25, 25, 22, 22, 23, 24, 23, 23,
21, 22, 24, 24, 22, 23, 23, 22, 23, 25, 26, 27, 26, 30, 27, 27,
28, 30, 31), VE = c(23, 23, 25, 20, 23, 25, 24, 23, 25, 26, 28,
28, 32, 33, 32, 33, 37, 42, 38, 41, 47, 48, 48, 50, 52, 56, 55,
60, 65, 64), absVO2 = c(0.777, 0.744, 0.761, 0.674, 0.808, 0.868,
0.787, 0.816, 0.921, 0.935, 0.989, 0.96, 1.05, 0.994, 1.121,
1.136, 1.227, 1.193, 1.164, 1.212, 1.333, 1.289, 1.358, 1.403,
1.463, 1.494, 1.441, 1.593, 1.585, 1.515), VCO2 = c(0.677, 0.68,
0.728, 0.583, 0.689, 0.766, 0.714, 0.691, 0.784, 0.825, 0.9,
0.907, 1.008, 0.982, 1.066, 1.132, 1.204, 1.269, 1.241, 1.291,
1.463, 1.488, 1.527, 1.596, 1.633, 1.756, 1.74, 1.912, 1.972,
1.93), PETCO2 = c(39.66, 39.08, 38.76, 39.67, 39, 39.4, 38.71,
41.37, 41.02, 41.19, 41.54, 41.89, 41.12, 41.16, 42.98, 42.8,
42.53, 40.75, 42.64, 41.45, 40.82, 40.23, 41.46, 41.87, 41.37,
40.54, 40.36, 40.64, 39.3, 38.91), VES = c(86, 92.4, 91.4, 84.8,
86, 88.1, 88.9, 86.5, 90.8, 93.2, 94.7, 94.8, 92.6, 93.7, 95.9,
93.1, 101.4, 101.1, 89.4, 97.6, 105.1, 100.3, 97.5, 100.7, 99.8,
108, 105.6, 104.5, 108.3, 106.5), QC = c(7.8, 8.5, 8.3, 7.5,
7.7, 8.1, 8.2, 8.2, 8.6, 9.1, 9.5, 9.5, 9.4, 10, 10.4, 10.3,
11.3, 11.5, 10.5, 11.6, 12.7, 12.3, 12.3, 12.9, 13, 14.2, 14.2,
14.4, 15.1, 15.2), IC = c(3.9, 4.2, 4.1, 3.7, 3.8, 4, 4.1, 4,
4.3, 4.5, 4.7, 4.7, 4.6, 4.9, 5.1, 5.1, 5.6, 5.7, 5.2, 5.8, 6.3,
6.1, 6.1, 6.4, 6.4, 7, 7, 7.1, 7.5, 7.5), WCI = c(4.7, 5.1, 5,
4.5, 4.6, 4.9, 4.9, 4.9, 5.2, 5.5, 5.7, 5.7, 5.6, 6, 6.2, 6.2,
6.8, 6.9, 6.3, 7, 7.6, 7.4, 7.4, 7.7, 7.8, 8.5, 8.5, 8.7, 9.1,
9.1), RVSi = c(1819, 1657, 1709, 1890, 1832, 1745, 1729, 1732,
1641, 1555, 1486, 1480, 1505, 1423, 1358, 1371, 1256, 1224, 1347,
1216, 1112, 1151, 1152, 1096, 1092, 993, 993, 978, 930, 928),
RVS = c(901, 821, 846, 936, 907, 865, 857, 858, 813, 770,
736, 733, 745, 705, 673, 679, 622, 606, 667, 603, 551, 570,
571, 543, 541, 492, 492, 484, 461, 460), VTD = c(137.5, 146.4,
149.6, 138.6, 146.4, 137.2, 142.2, 140.7, 148.5, 146.9, 153.6,
152.3, 144.2, 144, 147.5, 147.3, 146.3, 155.9, 142.4, 151.3,
152.8, 151.4, 154.3, 158.4, 159.5, 157.2, 159.8, 155.4, 150.9,
152.8), FE = c(62.7, 63.1, 61.1, 61.3, 58.8, 64.2, 62.4,
61.5, 61.1, 63.4, 61.6, 62.3, 64.2, 65.1, 65, 63.2, 69.3,
64.8, 62.7, 64.5, 68.8, 66.3, 63.2, 63.6, 62.8, 68.7, 66.2,
67.3, 71.8, 69.7), O2Hb = c("3.8E-3", "0.37130000000000002",
"0.10929999999999999", "0.1457", "0.24440000000000001", "0.39960000000000001",
"0.51039999999999996", "0.28970000000000001", "0.1762", "0.78080000000000005",
"1.0116000000000001", "1.2717000000000001", "1.4643999999999999",
"2.5387", "2.3233000000000001", "2.5627", "2.5230000000000001",
"2.7890000000000001", "2.8567", "3.5232999999999999", "3.7351999999999999",
"4.2393000000000001", "4.3661000000000003", "4.3578000000000001",
"4.2431000000000001", "4.6954000000000002", "4.5110000000000001",
"5.1020000000000003", "4.9044999999999996", "4.8182999999999998"
), HHb = c(-0.292, -0.3309, -0.2811, -0.1445, -0.1007, -0.2498,
-0.4758, -0.2753, -0.0079, -0.2002, -0.4731, -0.3644, -0.5278,
-0.7466, -0.8117, -0.8199, -1.0041, -1.1128, -0.9041, -0.9066,
-1.0106, -0.9902, -0.9717, -0.9746, -1.0317, -1.0096, -0.9691,
-1.0462, -0.8992, -0.8552), tHb = c("-0.28820000000000001",
"4.0399999999999998E-2", "-0.17180000000000001", "1.1999999999999999E-3",
"0.14369999999999999", "0.14979999999999999", "3.4599999999999999E-2",
"1.44E-2", "0.16830000000000001", "0.58050000000000002",
"0.53849999999999998", "0.9073", "0.93659999999999999", "1.7921",
"1.5116000000000001", "1.7427999999999999", "1.5188999999999999",
"1.6761999999999999", "1.9525999999999999", "2.6166999999999998",
"2.7246000000000001", "3.2490000000000001", "3.3944000000000001",
"3.3832", "3.2113999999999998", "3.6859000000000002", "3.5419",
"4.0556999999999999", "4.0053000000000001", "3.9630999999999998"
), HbDiff = c("0.29580000000000001", "0.70209999999999995",
"0.39050000000000001", "0.29020000000000001", "0.34510000000000002",
"0.64939999999999998", "0.98619999999999997", "0.56499999999999995",
"0.18410000000000001", "0.98099999999999998", "1.4846999999999999",
"1.6361000000000001", "1.9922", "3.2852999999999999", "3.1349999999999998",
"3.3826000000000001", "3.5270999999999999", "3.9018000000000002",
"3.7608000000000001", "4.4298999999999999", "4.7458", "5.2294999999999998",
"5.3377999999999997", "5.3323999999999998", "5.2747999999999999",
"5.7050000000000001", "5.4802", "6.1482000000000001", "5.8037000000000001",
"5.6734999999999998"), id = c("AC12-PRD-C1", "AC12-PRD-C1",
"AC12-PRD-C1", "AC12-PRD-C1", "AC12-PRD-C1", "AC12-PRD-C1",
"AC12-PRD-C1", "AC12-PRD-C1", "AC12-PRD-C1", "AC12-PRD-C1",
"AC12-PRD-C1", "AC12-PRD-C1", "AC12-PRD-C1", "AC12-PRD-C1",
"AC12-PRD-C1", "AC12-PRD-C1", "AC12-PRD-C1", "AC12-PRD-C1",
"AC12-PRD-C1", "AC12-PRD-C1", "AC12-PRD-C1", "AC12-PRD-C1",
"AC12-PRD-C1", "AC12-PRD-C1", "AC12-PRD-C1", "AC12-PRD-C1",
"AC12-PRD-C1", "AC12-PRD-C1", "AC12-PRD-C1", "AC12-PRD-C1"
), body_mass = c("84", "84", "84", "84", "84", "84", "84",
"84", "84", "84", "84", "84", "84", "84", "84", "84", "84",
"84", "84", "84", "84", "84", "84", "84", "84", "84", "84",
"84", "84", "84"), training = c("linear", "linear", "linear",
"linear", "linear", "linear", "linear", "linear", "linear",
"linear", "linear", "linear", "linear", "linear", "linear",
"linear", "linear", "linear", "linear", "linear", "linear",
"linear", "linear", "linear", "linear", "linear", "linear",
"linear", "linear", "linear")), row.names = c(NA, -30L), class = c("tbl_df",
"tbl", "data.frame"))
To do so, I used this code to combine my data and select the rows and columns I want:
load_files <- function(files){
temp <- read_excel(files) %>%
select(-(c(8:11, 14:15, 18:23))) %>%
mutate(id = pull(.[4,1])) %>% ##ID
mutate(body_mass = pull(.[7,3])) %>% ##body mass
mutate(training = pull(.[4,2])) %>% ##training group
set_names(c("time", "power", "hr", "fr", "VE", "absVO2", "VCO2", "PETCO2", "VES", "QC", "IC", "WCI", "RVSi", "RVS", "VTD", "FE", "O2Hb", "HHb", "tHb", "HbDiff", "id", "body_mass", "training")) %>%
slice(86:which(grepl("ration", VE))-1) %>% ##until recovery period
mutate_at(vars(1:16), as.numeric) %>%
mutate_at(vars(18), as.numeric) %>%
mutate(time = format(as.POSIXct(Sys.Date() + time), "%H:%M", tz="UTC"),
absVO2 = absVO2/1000,
VCO2 = VCO2/1000)
}
df <- map_df(file_list, load_files)
But I get this output:
Error in `set_names()`:
! The size of `nm` (23) must be compatible with the size of `x` (25).
Run `rlang::last_error()` to see where the error occurred.
It was working before, but since I added a column to the Excel files, it doesn't work anymore.
Thank you for your support!
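The error message reports that the data has 25 columns by the time set_names() runs (after select() and the three mutate() calls), while only 23 names are supplied, so the positions dropped in select() presumably no longer match the new Excel layout. A minimal sketch of a guard that makes such a mismatch easier to diagnose, assuming a hypothetical helper name_columns() (not part of any package) and toy data:

```r
# Hypothetical helper: refuse to rename when the column count is unexpected,
# and list the columns actually present so the select() indices can be adjusted.
name_columns <- function(data, nm) {
  if (ncol(data) != length(nm)) {
    stop(sprintf(
      "expected %d columns but found %d: %s",
      length(nm), ncol(data), paste(names(data), collapse = ", ")
    ))
  }
  stats::setNames(data, nm)
}

# Toy data standing in for one imported sheet (assumption, not the real file)
sheet <- data.frame(a = 1:2, b = 3:4, extra = 5:6)

ok <- name_columns(sheet[, c("a", "b")], c("time", "power"))
msg <- tryCatch(
  name_columns(sheet, c("time", "power")),
  error = function(e) conditionMessage(e)
)
```

Selecting columns by name rather than by position in select() would make the import less sensitive to columns being added later.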

R, Pivot longer, multiple observations per row

I think I have a question that is nearly identical to this one: R Pivot multiple columns from wide to long but I am hopelessly lost on the regex when trying to follow along.
I am also trying to pivot data to be longer, and I also have multiple columns I'd like to save. My data currently:
FollowUpScans<-structure(list(study_id = c(40, 44, 49, 61, 66, 67, 68, 84, 86,
94, 95, 101, 123, 126, 131, 153, 154, 155, 156, 161, 166, 169,
175, 185, 199, 203, 207, 211, 217, 221, 227, 256, 257, 259, 266,
275, 284, 301, 306, 307, 309, 313, 320, 353, 382, 392, 398, 401,
402, 412, 415, 428, 431, 433, 434, 436), Score1 = c(3, 0, 4,
4, NA, 0, 0, 5, 0, 0, 7, 0, 4, 0, 4, 2, 3, 1, 0, 2, 2, 0, 3,
0, 0, 0, 9, 0, 0, 0, 6, 0, 0, 7, 5, 7, 0, 0, 8, 0, 0, 0, 5, 0,
3, 0, 5, 0, 2, 0, 0, 0, 0, 7, 0, 2), TimeBetweenScans = structure(c(316,
113, 335, 104, 7, 42, 30, 643, 404, 40, 171, 51, 449, 56, 104,
79, 116, 65, 39, 1193, 142, 106, 221, 36, 125, 137, 927, 63,
156, 32, 411, 201, 160, 166, 459, 212, 50, 312, 1627, 354, 33,
62, 842, 174, 216, 17, 214, 24, 149, 72, 9, 13, 42, 771, 113,
122), class = "difftime", units = "days"), Score2 = c(NA, 0,
7, NA, NA, NA, 0, 7, NA, 5, 8, 0, NA, NA, NA, 8, NA, NA, 9, NA,
NA, 0, 4, NA, NA, 0, 9, 2, 0, NA, NA, NA, NA, NA, NA, NA, 4,
1, 8, NA, NA, 3, NA, 0, 8, NA, 5, NA, 7, NA, 0, 3, NA, 7, NA,
4), TimeBetweenScans2 = structure(c(NA, 139, 660, NA, NA, NA,
84, 1794, NA, 221, 320, 227, NA, NA, NA, 989, NA, NA, 411, NA,
NA, 216, 474, NA, NA, 372, 1006, 429, 447, NA, NA, NA, NA, NA,
NA, NA, 313, 530, 1706, NA, NA, 130, NA, 300, 264, NA, 268, NA,
382, NA, 38, 138, NA, 1200, 166, 475), class = "difftime", units = "days"),
Score3 = c(NA, NA, NA, NA, NA, NA, 7, NA, NA, 8, NA, NA,
NA, NA, NA, 8, NA, NA, NA, NA, NA, 1, 4, NA, NA, 0, NA, 5,
0, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, NA, NA, NA, NA,
NA, NA, NA, 5, NA, NA, NA, NA, NA, NA, 8, 0, 4), TimeBetweenScans3 = structure(c(NA,
NA, NA, NA, NA, NA, 467, NA, NA, 394, NA, NA, NA, NA, NA,
1097, NA, NA, NA, NA, NA, 266, 796, NA, NA, 941, NA, 533,
470, NA, NA, NA, NA, NA, NA, NA, NA, 783, NA, NA, NA, NA,
NA, NA, NA, NA, 388, NA, NA, NA, NA, NA, NA, 1512, 180, 640
), class = "difftime", units = "days"), Score4 = c(NA, NA,
NA, NA, NA, NA, 8, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, 5, NA, NA, NA, 1, NA, 5, 0, NA, NA, NA, NA,
NA, NA, NA, NA, 1, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA), TimeBetweenScans4 = structure(c(NA,
NA, NA, NA, NA, NA, 826, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, 497, NA, NA, NA, 1102, NA, 567, 1204,
NA, NA, NA, NA, NA, NA, NA, NA, 1574, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), class = "difftime", units = "days"),
Score5 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, NA, NA, 1, NA,
NA, 0, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA),
TimeBetweenScans5 = structure(c(NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 575,
NA, NA, NA, 1225, NA, NA, 1266, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA), class = "difftime", units = "days")), row.names = c(NA,
-56L), class = c("tbl_df", "tbl", "data.frame"))
And instead of columns that look like: study_id, Score1, TimeBetweenScans, Score2, TimeBetweenScans2, Score3, TimeBetweenScans3, etc.,
I'd love it to ultimately look like: study_id, Score, Time, Occurence
The "Occurence" column would just have a 1, 2, 3, 4, etc. to show which column the value came from. The study_id column would be nice to keep because it shows which "person" it came from.
Any help would be appreciated! Thank you!
You can try:
FollowUpScans %>%
rename(TimeBetweenScans1 = TimeBetweenScans) %>%
pivot_longer(-study_id,
names_to = c(".value", "Time"),
names_pattern = "([A-Za-z]+)([0-9]+)")
The steps are:
Rename the column that is likely to cause problems (TimeBetweenScans has no numeric suffix, so it would not match the pattern).
Call pivot_longer, specifying that each column name consists of one or more letters followed by one or more digits. You can use different regex patterns than the one I've shared here. For example, you could probably use "(.*)(\\d+)" for this particular dataset. Note that the column called "Time" in names_to holds the numeric suffix (1 to 5), i.e. the occurrence, while the time values stay in TimeBetweenScans.
If you don't rename first, I would suspect that you would end up with too many rows. You should end up with nrow(FollowUpScans) * 5 rows.
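As a sketch of the same approach that produces the column names asked for in the question (study_id, Score, Time, Occurence), here is a toy version. The two-row tibble and the plain numeric time values are assumptions for illustration; the real data stores the times as difftime.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for FollowUpScans (assumed values, two occurrences only)
scans <- tibble(
  study_id = c(40, 44),
  Score1 = c(3, 0),
  TimeBetweenScans = c(316, 113),
  Score2 = c(NA, 0),
  TimeBetweenScans2 = c(NA, 139)
)

long <- scans %>%
  # give the first time column a suffix so it matches the pattern
  rename(TimeBetweenScans1 = TimeBetweenScans) %>%
  pivot_longer(
    -study_id,
    names_to = c(".value", "Occurence"),
    names_pattern = "([A-Za-z]+)([0-9]+)"
  ) %>%
  # the suffix is captured as text, so convert it to an integer
  mutate(Occurence = as.integer(Occurence)) %>%
  rename(Time = TimeBetweenScans)
```

Each study_id ends up with one row per occurrence, so the toy result has nrow(scans) * 2 rows.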

extracting information from excel into lists in R

Hello all, I have this dataset:
> dput(test1)
structure(list(startdate = c("2019-11-06", "2019-11-06", "2019-11-06",
"2019-11-06", "2019-11-06", "2019-11-06", "2019-11-06", "2019-11-06",
"2019-11-06", "2019-11-06", "2019-11-06", "2019-11-06", "2019-11-06",
"2019-11-06", "2019-11-06", "2019-11-06", "2019-11-06", "2019-11-06",
"2019-11-06", "2019-11-06", "2019-11-06", "2019-11-27", "2019-11-27",
"2019-11-27", "2019-11-27", "2019-11-27", "2019-11-27", "2019-11-27",
"2019-11-27", "2019-11-27", "2019-11-27", "2019-11-27", "2019-11-27",
"2019-11-27", "2019-11-27", "2019-11-27", "2019-11-27", "2019-11-27",
"2019-11-27", "2019-11-27", "2019-11-01", "2019-11-05", "2019-11-15",
"2019-11-16", "2019-11-17", "2019-11-18", "2019-11-19", "2019-11-20",
"2019-11-21", NA), id = c("POL55", "POL56", "POL57", "POL58",
"POL59", "POL60", "POL61", "POL62", "POL63", "POL64", "POL65",
"POL66", "POL67", "POL68", "POL69", "POL56", "POL57", "POL58",
"POL59", "POL60", "POL61", "POL55", "POL56", "POL57", "POL58",
"POL59", "POL60", "POL61", "POL55", "POL56", "POL57", "POL58",
"POL59", "POL60", "POL61", "POL55", "POL56", "POL57", "POL58",
"POL59", "POL60", "POL61", "POL62", "POL63", "POL64", "POL65",
"POL66", "POL67", "POL68", NA), m0_9 = c(NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
32, 34, NA, NA, NA, NA, 55, 3, NA, NA, NA, 7, 9, 1, 65, 3, 98,
33, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), m10_19 = c(NA,
NA, NA, 32, 34, NA, NA, NA, NA, 55, 3, NA, NA, NA, 7, 9, 1, 65,
3, 98, 33, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
), m20_29 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 32, 34, NA,
NA, NA, NA, 55, 3, NA, NA, NA, 7, 9, 1, 65, 3, 98, 33, NA, NA,
NA, NA, NA, NA, NA), m30_39 = c(NA, NA, NA, NA, NA, NA, NA, NA,
NA, 32, 34, NA, NA, NA, NA, 55, 3, NA, NA, NA, 7, 9, 1, 65, 3,
98, 33, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA), m40_49 = c(32, 34, NA, NA,
NA, NA, 55, 3, NA, NA, NA, 7, 9, 1, 65, 3, 98, 33, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), m50_59 = c(NA,
NA, NA, NA, NA, NA, 32, 34, NA, NA, NA, NA, 55, 3, NA, NA, NA,
7, 9, 1, 65, 3, 98, 33, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
), m60_69 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, 32, 34, NA, NA, NA, NA, 55, 3, NA, NA, NA, 7, 9,
1, 65, 3, 98, 33, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA), m70 = c(NA, NA, NA, NA, NA, NA, 32,
34, NA, NA, NA, NA, 55, 3, NA, NA, NA, 7, 9, 1, 65, 3, 98, 33,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), f0_9 = c(32, 34, NA,
NA, NA, NA, 55, 3, NA, NA, NA, 7, 9, 1, 65, 3, 98, 33, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), f10_19 = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, 32, 34, NA, NA, NA, NA, 55,
3, NA, NA, NA, 7, 9, 1, 65, 3, 98, 33, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
), f20_29 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, 32, 34, NA, NA, NA, NA, 55, 3, NA, NA, NA, 7, 9, 1, 65, 3,
98, 33, NA, NA, NA), f30_39 = c(NA, NA, NA, 32, 34, NA, NA, NA,
NA, 55, 3, NA, NA, NA, 7, 9, 1, 65, 3, 98, 33, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA), f40_49 = c(NA, NA, NA, NA,
NA, 32, 34, NA, NA, NA, NA, 55, 3, NA, NA, NA, 7, 9, 1, 65, 3,
98, 33, NA, NA, NA, NA, NA, NA, NA, NA, 32, 34, NA, NA, NA, NA,
55, 3, NA, NA, NA, 7, 9, 1, 65, 3, 98, 33, NA), f50_59 = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 32, 34, NA, NA, NA, NA,
55, 3, NA, NA, NA, 7, 9, 1, 65, 3, 98, 33, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
), f60_69 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 32, 34, NA, NA, NA, NA,
55, 3, NA, NA, NA, 7, 9, 1, 65, 3, 98, 33, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA), f70 = c(NA, NA, NA, NA, NA, NA, NA, NA,
NA, 32, 34, NA, NA, NA, NA, 55, 3, NA, NA, NA, 7, 9, 1, 65, 3,
98, 33, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA)), row.names = c(NA, -50L), class = c("tbl_df",
"tbl", "data.frame"))
I would like to create a list called ageCat. This list should contain a number of lists, one per age category. For each age category I would like to extract the following info: startAge, endAge, maleCount, femaleCount, totalCount.
Additionally, I only want to sum up individuals that have the same id and start date. For now I have written this:
# create list of age
createLists <- function(startdate, id){
testFiltered = test1[policyid == id & start == startdate]
ageGroup <- vector("list", length == 8)
names(ageGroup) <- as.character(seq_along(ageGroup))
for(ageCat in seq_along(ageGroup)){
ageGroup[[ageCat]] <- getAgeInfo(testFiltered, ageCat)
}
getAgeInfo <- function(testFiltered, ageCat){
start =
end =
nomales =
nofemales =
}
ageGroup <- list(startAge = start,
endAge = end ,
maleCount = nomales ,
femaleCount = nofemales)
}
I have hard-coded the length of the vector ageGroup. How can I do this without hard-coding it, i.e. by looking up how many age-category columns I have for each gender?
Secondly, how can I extract the information startAge, endAge, maleCount, femaleCount, totalCount?
Instead of working with lists, I suggest converting your data.frame to long format, getting rid of missing values and extracting sex and age. A `tidyverse` approach might look like this:
library(dplyr)
library(tidyr)
library(tibble)
df <- tibble(
startdate = c(
"2019-11-06", "2019-11-06", "2019-11-06",
"2019-11-06", "2019-11-06", "2019-11-06", "2019-11-06", "2019-11-06",
"2019-11-06", "2019-11-06", "2019-11-06", "2019-11-06", "2019-11-06",
"2019-11-06", "2019-11-06", "2019-11-06", "2019-11-06", "2019-11-06",
"2019-11-06", "2019-11-06", "2019-11-06", "2019-11-27", "2019-11-27",
"2019-11-27", "2019-11-27", "2019-11-27", "2019-11-27", "2019-11-27",
"2019-11-27", "2019-11-27", "2019-11-27", "2019-11-27", "2019-11-27",
"2019-11-27", "2019-11-27", "2019-11-27", "2019-11-27", "2019-11-27",
"2019-11-27", "2019-11-27", "2019-11-01", "2019-11-05", "2019-11-15",
"2019-11-16", "2019-11-17", "2019-11-18", "2019-11-19", "2019-11-20",
"2019-11-21", NA
),
id = c(
"POL55", "POL56", "POL57", "POL58",
"POL59", "POL60", "POL61", "POL62", "POL63", "POL64", "POL65",
"POL66", "POL67", "POL68", "POL69", "POL56", "POL57", "POL58",
"POL59", "POL60", "POL61", "POL55", "POL56", "POL57", "POL58",
"POL59", "POL60", "POL61", "POL55", "POL56", "POL57", "POL58",
"POL59", "POL60", "POL61", "POL55", "POL56", "POL57", "POL58",
"POL59", "POL60", "POL61", "POL62", "POL63", "POL64", "POL65",
"POL66", "POL67", "POL68", NA
),
m0_9 = c(
NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
32, 34, NA, NA, NA, NA, 55, 3, NA, NA, NA, 7, 9, 1, 65, 3, 98,
33, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
),
m10_19 = c(
NA,
NA, NA, 32, 34, NA, NA, NA, NA, 55, 3, NA, NA, NA, 7, 9, 1, 65,
3, 98, 33, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
),
m20_29 = c(
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 32, 34, NA,
NA, NA, NA, 55, 3, NA, NA, NA, 7, 9, 1, 65, 3, 98, 33, NA, NA,
NA, NA, NA, NA, NA
),
m30_39 = c(
NA, NA, NA, NA, NA, NA, NA, NA,
NA, 32, 34, NA, NA, NA, NA, 55, 3, NA, NA, NA, 7, 9, 1, 65, 3,
98, 33, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA
),
m40_49 = c(
32, 34, NA, NA,
NA, NA, 55, 3, NA, NA, NA, 7, 9, 1, 65, 3, 98, 33, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
),
m50_59 = c(
NA,
NA, NA, NA, NA, NA, 32, 34, NA, NA, NA, NA, 55, 3, NA, NA, NA,
7, 9, 1, 65, 3, 98, 33, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
), m60_69 = c(
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, 32, 34, NA, NA, NA, NA, 55, 3, NA, NA, NA, 7, 9,
1, 65, 3, 98, 33, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA
), m70 = c(
NA, NA, NA, NA, NA, NA, 32,
34, NA, NA, NA, NA, 55, 3, NA, NA, NA, 7, 9, 1, 65, 3, 98, 33,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
), f0_9 = c(
32, 34, NA,
NA, NA, NA, 55, 3, NA, NA, NA, 7, 9, 1, 65, 3, 98, 33, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
), f10_19 = c(
NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, 32, 34, NA, NA, NA, NA, 55,
3, NA, NA, NA, 7, 9, 1, 65, 3, 98, 33, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
), f20_29 = c(
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, 32, 34, NA, NA, NA, NA, 55, 3, NA, NA, NA, 7, 9, 1, 65, 3,
98, 33, NA, NA, NA
), f30_39 = c(
NA, NA, NA, 32, 34, NA, NA, NA,
NA, 55, 3, NA, NA, NA, 7, 9, 1, 65, 3, 98, 33, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA
), f40_49 = c(
NA, NA, NA, NA,
NA, 32, 34, NA, NA, NA, NA, 55, 3, NA, NA, NA, 7, 9, 1, 65, 3,
98, 33, NA, NA, NA, NA, NA, NA, NA, NA, 32, 34, NA, NA, NA, NA,
55, 3, NA, NA, NA, 7, 9, 1, 65, 3, 98, 33, NA
), f50_59 = c(
NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 32, 34, NA, NA, NA, NA,
55, 3, NA, NA, NA, 7, 9, 1, 65, 3, 98, 33, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
), f60_69 = c(
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 32, 34, NA, NA, NA, NA,
55, 3, NA, NA, NA, 7, 9, 1, 65, 3, 98, 33, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA
), f70 = c(
NA, NA, NA, NA, NA, NA, NA, NA,
NA, 32, 34, NA, NA, NA, NA, 55, 3, NA, NA, NA, 7, 9, 1, 65, 3,
98, 33, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA
)
)
# Convert to tidy data frame
df_age <- df %>%
gather(age_sex, count, -startdate, -id) %>%
filter(!is.na(count)) %>%
extract(age_sex, into = c("sex", "start_age", "end_age"), regex = "(m|f)(\\d+)_?(\\d+)?", remove = FALSE) %>%
mutate(ageg = paste0(start_age, "_", end_age))
df_age
#> # A tibble: 187 x 8
#> startdate id age_sex sex start_age end_age count ageg
#> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <chr>
#> 1 2019-11-27 POL55 m0_9 m 0 9 32 0_9
#> 2 2019-11-27 POL56 m0_9 m 0 9 34 0_9
#> 3 2019-11-27 POL61 m0_9 m 0 9 55 0_9
#> 4 2019-11-27 POL55 m0_9 m 0 9 3 0_9
#> 5 2019-11-27 POL59 m0_9 m 0 9 7 0_9
#> 6 2019-11-27 POL60 m0_9 m 0 9 9 0_9
#> 7 2019-11-27 POL61 m0_9 m 0 9 1 0_9
#> 8 2019-11-27 POL55 m0_9 m 0 9 65 0_9
#> 9 2019-11-27 POL56 m0_9 m 0 9 3 0_9
#> 10 2019-11-27 POL57 m0_9 m 0 9 98 0_9
#> # ... with 177 more rows
# df back to nested list by startdate and ageg
df_list <- df_age %>%
# Count by startdate, ageg, start_age, end_age, sex
count(startdate, ageg, start_age, end_age, sex, wt = count) %>%
# male and female counts back in columns
spread(sex, n, fill = 0) %>%
# split by startdate
split(.$startdate) %>%
# ... and split each startdate list by ageg
lapply(function(x) split(x, x$ageg))
Created on 2020-03-10 by the reprex package (v0.3.0)
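On the first part of the question (not hard-coding the number of age categories), the count can be derived from the column names. A minimal base R sketch, assuming the columns follow the m/f prefix convention shown in the data; the shortened vector of names below is an illustration, in practice it would be names(test1):

```r
# Column names as in the question, shortened for illustration (assumption)
cols <- c("startdate", "id", "m0_9", "m10_19", "m70", "f0_9", "f10_19", "f70")

# Keep only columns that look like <sex><start>[_<end>]
age_cols <- grep("^[mf][0-9]+(_[0-9]+)?$", cols, value = TRUE)

# Drop the sex prefix and de-duplicate to get the age categories
age_groups <- unique(sub("^[mf]", "", age_cols))
n_groups <- length(age_groups)

# An empty list with one slot per age category, sized from the data
ageGroup <- vector("list", n_groups)
names(ageGroup) <- age_groups
```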

How to further format forest Plots in R, from the metafor package?

I'm quite new to R and have been struggling with properly formatting a forest plot I've created.
When I click the "zoom" option in R to open the graph in a new window, it looks like this:
[Image: Forest Plot Currently]
My main goal is to get the forest plot as compact as possible, i.e. publication quality/style. I currently have way too much white space in my plot. I think it has something to do with me messing around with the par() function, and I now have no clue how to revert to the defaults.
#Metafor library
library(metafor)
#ReadXL library to import excel sheet
library(readxl)
#Name the data sheet from the excel file
ACDF<- read_excel("outpatient_ACDF_meta_analysis.xlsx")
#View the data sheet with view(ACDF)
par(mar=c(20,1,1,1))
#This computes log odds ratios (measure="OR"). If you want risk ratios instead, use argument measure="RR"
returnop <- escalc(measure="OR", ai=op_return_OR, bi=op_no_return_OR, ci=ip_return_OR, di=ip_no_return_OR, data=ACDF)
#Generate a Random Effects Model
REmodel<-rma(yi=yi, vi=vi, data=returnop, slab=paste(Author, Year, sep=", "), method="REML")
#Generate a forest plot of the data
forest(REmodel, xlim=c(-17, 6),
ilab=cbind(ACDF$op_return_OR, ACDF$op_no_return_OR, ACDF$ip_return_OR, ACDF$ip_no_return_OR),
ilab.xpos=c(-9.5,-8,-6,-4.5), cex=.75, ylim=c(-1, 27),
psize=1)
### add column headings to the plot
text(c(-9.5,-8,-6,-4.5), 26, c("Return+", "Return-", "Return+", "Return-"))
text(c(-8.75,-5.25), 27, c("Outpatient", "Inpatient"))
text(-16, 26, "Study", pos=4)
text(6, 26, "Log Odds Ratio [95% CI]", pos=2)
I'm not 100% sure how to provide my data otherwise, but I used the dput function to provide it as follows. Apologies for the NAs, I'm still fleshing out the data for the future.
structure(list(Study = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16, 17, 18, 19, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA), Author = c("Stieber", "Villavicencio",
"Lied", "Liu", "Garringer", "Joseffer", "Trahan", "Lied", "Sheperd",
"Talley", "Martin", "McGirt", "Adamson", "Fu", "Arshi", "Khanna",
"McClelland", "Purger", "McLellend2", NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Year = c(2005, 2007,
2007, 2009, 2010, 2010, 2011, 2012, 2012, 2013, 2015, 2015, 2016,
2017, 2017, 2017, 2017, 2017, 2017, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA), op_return_OR = c(NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, 1, 3, 2, 16, 257, 7, NA, 5, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
), op_no_return_OR = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
596, 769, 992, 4581, 958, 1749, NA, 3120, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), ip_return_OR = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, 8, 9, 2, 257, 2034, 12, NA,
200, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA), ip_no_return_OR = c(NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, 589, 641, 482, 16171, 8930, 1744, NA, 46312, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
), op_death = c(NA, NA, NA, 0, NA, NA, NA, NA, NA, NA, 1, NA,
1, 0, NA, 2, NA, 0, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA), op_no_death = c(NA, NA, NA, 45, NA,
NA, NA, NA, NA, NA, 596, NA, 993, 4597, NA, 1754, NA, 3125, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
), ip_death = c(NA, NA, NA, 0, NA, NA, NA, NA, NA, NA, 0, NA,
0, 42, NA, 2, NA, 20, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA), ip_no_death = c(NA, NA, NA, 64,
NA, NA, NA, NA, NA, NA, 597, NA, 484, 16386, NA, 1754, NA, 46492,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
2979.79797979798), op_thrombo = c(NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, 0, NA, NA, 8, 20, 4, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), op_no_thrombo = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, 597, NA, NA, 4589, 1195,
1752, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA), ip_thrombo = c(NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, 2, NA, NA, 67, 150, 4, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), ip_no_thrombo = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, 595, NA, NA, 16361, 10814,
1752, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA), op_stroke = c(NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, 0, NA, NA, 2, 12, 0, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), op_no_stroke = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, 597, NA, NA, 4595, 1203,
1756, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA), ip_stroke = c(NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, 2, NA, NA, 14, 132, 0, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), ip_no_stroke = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, 595, NA, NA, 16414, 10832,
1756, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA), op_dysphagia = c(NA, NA, NA, 0, NA, NA,
NA, NA, NA, NA, NA, NA, 11, NA, NA, NA, NA, 2, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), op_no_dysphagia = c(NA,
NA, NA, 45, NA, NA, NA, NA, NA, NA, NA, NA, 618, NA, NA, NA,
NA, 49, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA), ip_dysphagia = c(NA, NA, NA, 1, NA, NA, NA, NA,
NA, NA, NA, NA, 1, NA, NA, NA, NA, 59, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), ip_no_dysphagia = c(NA,
NA, NA, 63, NA, NA, NA, NA, NA, NA, NA, NA, 273, NA, NA, NA,
NA, 2917, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA), op_hematoma = c(NA, NA, NA, 0, NA, NA, NA, NA,
NA, NA, NA, NA, 1, NA, NA, NA, 1, 4, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), op_no_hematoma = c(NA,
NA, NA, 45, NA, NA, NA, NA, NA, NA, NA, NA, 629, NA, NA, NA,
2015, 47, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA), ip_hematoma = c(NA, NA, NA, 1, NA, NA, NA, NA,
NA, NA, NA, NA, 1, NA, NA, NA, 273, 65, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), ip_no_hematoma = c(NA,
NA, NA, 63, NA, NA, NA, NA, NA, NA, NA, NA, 273, NA, NA, NA,
7791, 1713, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA)), .Names = c("Study", "Author", "Year", "op_return_OR",
"op_no_return_OR", "ip_return_OR", "ip_no_return_OR", "op_death",
"op_no_death", "ip_death", "ip_no_death", "op_thrombo", "op_no_thrombo",
"ip_thrombo", "ip_no_thrombo", "op_stroke", "op_no_stroke", "ip_stroke",
"ip_no_stroke", "op_dysphagia", "op_no_dysphagia", "ip_dysphagia",
"ip_no_dysphagia", "op_hematoma", "op_no_hematoma", "ip_hematoma",
"ip_no_hematoma"), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-35L))
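The par option looks ok to me. Only seven studies have complete data for this outcome, so ylim=c(-1, 27) leaves most of the vertical range empty. I changed the ylim option and modified the y location and size of some of the header text as below: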
#Generate a forest plot of the data
forest(REmodel, xlim=c(-17, 6),
ylim=c(-1, 10),
ilab=cbind(ACDF$op_return_OR, ACDF$op_no_return_OR, ACDF$ip_return_OR,
ACDF$ip_no_return_OR),
ilab.xpos=c(-9.5,-8,-6,-4.5), cex=.75,
psize=1)
### add column headings to the plot
text(c(-9.5,-8,-6,-4.5), 8.5, c("Return+", "Return-", "Return+", "Return-"),
cex = 0.65)
text(c(-8.75,-5.25), 9.5, c("Outpatient", "Inpatient"))
text(-17, 8.5, "Study", pos=4)
text(6, 8.5, "Log Odds Ratio [95% CI]", pos=2)
This gives the following plot: [image not shown]
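On the question of how to revert par() to its defaults: par() returns the previous settings when you change them, so they can be saved and restored. A minimal sketch; the pdf(NULL) device is only an assumption so that the example also runs non-interactively, and in an interactive session closing the plot window with dev.off() resets the parameters as well.

```r
# Throwaway graphics device so the sketch runs without a screen
pdf(NULL)

# par() invisibly returns the old values of the parameters it changes
old_par <- par(mar = c(20, 1, 1, 1))
changed <- par("mar")

# ... plotting code would go here ...

# Restore the margins that were in effect before the change
par(old_par)
restored <- par("mar")

dev.off()
```

On a fresh device the default margins are c(5.1, 4.1, 4.1, 2.1) (bottom, left, top, right), so that vector can also be passed to par(mar = ...) directly.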

Combine columns that have the same row values but are spread out

I want to combine these columns based on the value in "Date" so that there are only unique values of Date, with the corresponding age groups combined into a single row. This was the result of using spread() in tidyr. If you look, the values for Date are repeated.
dput(dataframe) gives:
structure(list(Date = c("201740", "201740", "201740", "201740",
"201741", "201741", "201741", "201741", "201742", "201742", "201742",
"201742", "201743", "201743", "201743", "201743", "201743", "201743",
"201744", "201744", "201744", "201744", "201744", "201744", "201745",
"201745", "201745", "201745", "201745", "201745", "201746", "201746",
"201746", "201746", "201746", "201746", "201747", "201747", "201747",
"201747", "201747", "201747", "201748", "201748", "201748", "201748",
"201748", "201748", "201749", "201749", "201749", "201749", "201749",
"201749", "201750", "201750", "201750", "201750", "201750", "201750",
"201751", "201751", "201751", "201751", "201751", "201751", "201752",
"201752", "201752", "201752", "201752", "201752", "201801", "201801",
"201801", "201801", "201801", "201801", "201802", "201802", "201802",
"201802", "201802", "201802", "201803", "201803", "201803", "201803",
"201803", "201803", "201804", "201804", "201804", "201804", "201804",
"201804", "201805"), `0-4 yr` = c(NA, 0.1, NA, NA, NA, 0.2, NA,
NA, NA, 0.2, NA, NA, NA, NA, 0.3, NA, NA, NA, NA, NA, 0.6, NA,
NA, NA, NA, NA, 0.7, NA, NA, NA, NA, NA, 1, NA, NA, NA, NA, NA,
1.8, NA, NA, NA, NA, NA, 2.7, NA, NA, NA, NA, NA, 3.3, NA, NA,
NA, NA, NA, 5.2, NA, NA, NA, NA, NA, 7.9, NA, NA, NA, NA, NA,
13.7, NA, NA, NA, NA, NA, 18.3, NA, NA, NA, NA, NA, 23.3, NA,
NA, NA, NA, NA, 28.2, NA, NA, NA, NA, NA, 35.6, NA, NA, NA, 41.9
), `18-49 yr` = c(NA, 0.1, NA, NA, 0.1, NA, NA, NA, NA, 0.2,
NA, NA, NA, 0.2, NA, NA, NA, NA, NA, 0.4, NA, NA, NA, NA, NA,
0.5, NA, NA, NA, NA, NA, 0.7, NA, NA, NA, NA, NA, 1, NA, NA,
NA, NA, NA, 1.4, NA, NA, NA, NA, NA, 1.9, NA, NA, NA, NA, NA,
2.7, NA, NA, NA, NA, NA, 4.2, NA, NA, NA, NA, NA, 6.6, NA, NA,
NA, NA, NA, 9.3, NA, NA, NA, NA, NA, 12.5, NA, NA, NA, NA, NA,
15.2, NA, NA, NA, NA, NA, 17.7, NA, NA, NA, NA, NA), `5-17 yr` = c(0,
NA, NA, NA, 0.1, NA, NA, NA, 0.1, NA, NA, NA, 0.1, NA, NA, NA,
NA, NA, 0.2, NA, NA, NA, NA, NA, 0.3, NA, NA, NA, NA, NA, 0.5,
NA, NA, NA, NA, NA, 0.7, NA, NA, NA, NA, NA, 0.9, NA, NA, NA,
NA, NA, 1.2, NA, NA, NA, NA, NA, 1.7, NA, NA, NA, NA, NA, 2.5,
NA, NA, NA, NA, NA, 3.5, NA, NA, NA, NA, NA, 4.3, NA, NA, NA,
NA, NA, 5.9, NA, NA, NA, NA, NA, 7.3, NA, NA, NA, NA, NA, 9,
NA, NA, NA, NA, NA, NA), `50-64 yr` = c(NA, NA, 0.2, NA, NA,
NA, 0.3, NA, NA, NA, 0.5, NA, NA, NA, NA, NA, 0.8, NA, NA, NA,
NA, NA, 1.1, NA, NA, NA, NA, NA, 1.6, NA, NA, NA, NA, NA, 2.2,
NA, NA, NA, NA, NA, 3.1, NA, NA, NA, NA, 4.1, NA, NA, NA, NA,
NA, 5.4, NA, NA, NA, NA, NA, NA, 8.1, NA, NA, NA, NA, NA, 13.7,
NA, NA, NA, NA, 21.7, NA, NA, NA, NA, NA, NA, 32.6, NA, NA, NA,
NA, NA, 42.9, NA, NA, NA, NA, NA, 52, NA, NA, NA, NA, NA, 60.2,
NA, NA), `65+ yr` = c(NA, NA, NA, 0.5, NA, NA, NA, 1, NA, NA,
NA, 2.1, NA, NA, NA, NA, NA, 3, NA, NA, NA, NA, NA, 3.9, NA,
NA, NA, NA, NA, 5.1, NA, NA, NA, NA, NA, 6.5, NA, NA, NA, NA,
NA, 9.2, NA, NA, NA, NA, NA, 14.3, NA, NA, NA, NA, NA, 20.5,
NA, NA, NA, NA, NA, 30.2, NA, NA, NA, NA, NA, 50.2, NA, NA, NA,
NA, NA, 90.1, NA, NA, NA, NA, NA, 137.9, NA, NA, NA, NA, NA,
179.5, NA, NA, NA, NA, NA, 217.4, NA, NA, NA, NA, NA, 251.8,
NA)), .Names = c("Date", "0-4 yr", "18-49 yr", "5-17 yr", "50-64 yr",
"65+ yr"), class = "data.frame", row.names = c(NA, 97L))
You could try aggregation. This could have been done before your spread, but it works afterwards as well:
library(tidyverse)
dataframe %>%
group_by(Date) %>%
summarise_all(~ sum(., na.rm = TRUE))
I've used sum() here because it's not clear how you want to summarise.
A more suitable way might be:
dataframe %>%
gather("age_group", "value", -Date) %>%
filter(!is.na(value)) %>%
spread(age_group, value)
Here we gather the data back into what may have been your original input, filter out the NA values, and then simply re-spread it.
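Another way to sketch the same idea is to keep the first non-missing value per Date and column. Unlike sum(), this leaves NA where a Date has no value for an age group at all. The helper first_non_na() and the small toy data frame are assumptions for illustration, and across() needs dplyr 1.0 or later.

```r
library(dplyr)

# Toy stand-in for the spread result (assumed values, two age groups only)
toy <- data.frame(
  Date = c("201740", "201740", "201741", "201741"),
  `0-4 yr` = c(NA, 0.1, NA, 0.2),
  `5-17 yr` = c(0, NA, 0.1, NA),
  check.names = FALSE
)

# Hypothetical helper: first non-NA element, or NA if there is none
first_non_na <- function(x) x[!is.na(x)][1]

collapsed <- toy %>%
  group_by(Date) %>%
  summarise(across(everything(), first_non_na))
```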
