Correlation Matrix Between Variables in R [duplicate] - r

I have been trying to determine the correlation between variables in panel data. My data has the following form (the full data set has more dates, and some values of PM10 are NA):
structure(list(NetC = c("Cosenza Provincia", "Cosenza Provincia",
"Cosenza Provincia", "Cosenza Provincia", "Cosenza Provincia",
"Cosenza Provincia", "Cosenza Provincia", "Cosenza Provincia",
"Cosenza Provincia", "Reti Private", "Reti Private", "Reti Private",
"Reti Private", "Reti Private", "Reti Private"), ID = c("IT1938A",
"IT1938A", "IT1938A", "IT2086A", "IT2086A", "IT2086A", "IT2110A",
"IT2110A", "IT2110A", "IT1766A", "IT1766A", "IT1766A", "IT2090A",
"IT2090A", "IT2090A"), Stat = c("Citta dei Ragazzi", "Citta dei Ragazzi",
"Citta dei Ragazzi", "Rende", "Rende", "Rende", "Acri", "Acri",
"Acri", "Firmo", "Firmo", "Firmo", "Schiavonea", "Schiavonea",
"Schiavonea"), Data = c("1/1/2022", "1/2/2022", "1/3/2022", "1/1/2022",
"1/2/2022", "1/3/2022", "1/1/2022", "1/2/2022", "1/3/2022", "1/1/2022",
"1/2/2022", "1/3/2022", "1/1/2022", "1/2/2022", "1/3/2022"),
PM10 = c(13.29, 11.14, 9.08, 16.62, 12.98, 10.4, 16.2, 19.4,
15.7, 10.82, 12.29, 9.54, 24.54, 22.88, 27.33)), class = "data.frame", row.names = c(NA,
-15L))
I have tried using plm::cortab, but it doesn't calculate the correlation.
library(plm)
cortab(data$PM10, grouping = Stat, groupnames = c("Citta dei Ragazzi", "Rende",
"Acri", "Firmo", "Schiavonea"))
The output should look like:
                   Citta dei Ragazzi  Rende  Acri
Citta dei Ragazzi  1
Rende              x                  1
Acri               x                  x      1

This has essentially been asked before (How can I complete a correlation in R of one variable across it's factor levels, matching by date), but for convenience I have adapted that answer to your data:
library(dplyr) # select(), bind_rows(), %>%
library(tidyr) # pivot_wider()
library(broom) # tidy()
# simple correlation matrix:
data.wider <- data %>%
  select(-ID, -NetC) %>% # remove unnecessary vars
  pivot_wider(names_from = 'Stat', values_from = 'PM10')
cor(data.wider[,-1], use = 'pairwise.complete.obs') # drop the date column
# more lines required to set up correlation testing:
pw <- combn(unique(data$Stat),2) # make pairwise sets
pw
pairwise_c <- apply(pw,2,function(i){
  tidy(cor.test(data.wider[[i[1]]],data.wider[[i[2]]]))
})
results <- cbind(data.frame(t(pw)),bind_rows(pairwise_c))
results
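For readers without the tidyverse installed, the same station-by-station matrix can be sketched in base R. This is only a minimal sketch, not the method from the linked answer: it uses a three-station subset of the question's data (the object name pm is hypothetical) and reshapes with tapply() so that each station becomes a column and each date a row.

```r
# Subset of the question's data: three stations, three dates each
pm <- data.frame(
  Stat = rep(c("Citta dei Ragazzi", "Rende", "Acri"), each = 3),
  Data = rep(c("1/1/2022", "1/2/2022", "1/3/2022"), times = 3),
  PM10 = c(13.29, 11.14, 9.08, 16.62, 12.98, 10.4, 16.2, 19.4, 15.7)
)

# One row per date, one column per station;
# a date/station pair with no observation becomes NA
wide <- tapply(pm$PM10, list(pm$Data, pm$Stat), mean)

# Pairwise-complete correlations, so an NA only drops the affected pairs
cor_mat <- cor(wide, use = "pairwise.complete.obs")
print(round(cor_mat, 3))
```

tapply() orders the stations alphabetically, so the columns will not necessarily appear in the order they have in the data.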

Related

Rollmean over one month with different groups in R

I have following extract of my dataset:
library(dplyr)
library(runner)
example <- data.frame(Date <- c("2020-03-24", "2020-04-06" ,"2020-04-08" ,
"2020-04-13", "2020-04-14", "2020-04-15",
"2020-04-16", "2020-04-18", "2020-04-23",
"2020-04-24", "2020-04-26", "2020-04-29",
"2020-03-24", "2020-04-06" ,"2020-04-08" ,
"2020-04-01", "2020-04-12", "2020-04-15",
"2020-04-17", "2020-04-18", "2020-04-22",
"2020-05-01", "2020-05-15", "2020-05-29",
"2020-03-08", "2020-04-06" ,"2020-04-15",
"2020-04-22", "2020-04-28", "2020-05-05",
"2020-05-08", "2020-05-22", "2020-05-23"),
username <- c("steves_" ,"steves_" ,"steves_",
"steves_" ,"steves_" ,"steves_",
"steves_" ,"steves_" ,"steves_",
"steves_" ,"steves_" ,"steves_",
"jules_" ,"jules_" ,"jules_",
"jules_" ,"jules_" ,"jules_",
"jules_" ,"jules_" ,"jules_",
"jules_" ,"jules_" ,"jules_",
"mia" ,"mia" ,"mia",
"mia" ,"mia" ,"mia",
"mia" ,"mia" ,"mia"),
ER <- as.numeric(c("0.092", "0.08", "0.028",
"0.1", "0.09", "0.02",
"0.02", "0.8", "0.001",
"0.001", "0.1", "0.098",
"0.001", "0.002","0.02",
"0.0098", "0.002","0.0019",
"0.002", "0.11","0.002",
"0.02", "0.01", "0.009",
"0.19", "0.09", "0.21",
"0.22", "0.19", "0.22",
"0.09", "0.19", "0.28")))
colnames(example) <- c("Date", "username", "ER")
example$Date <- as.Date(example$Date)
str(example)
For each row, I would like to calculate the average ER over the month preceding that row's date, separately for each username.
I know there are similar questions on the forum already, but unfortunately I could not find a solution that works for my case.
I have tried the following:
example$avgER_30days <- example %>%
arrange(username, Date) %>%
group_by(username) %>%
mutate(rollmean(example$ER, Date > (Date %m-% months(1)) & Date < Date, fill = NA))
or with the runner package:
example$average <- example %>%
group_by(username) %>%
arrange(username, Date) %>%
mutate(mean_run(x = example$ER, k = 30, lag = 1, idx=example$Date)) %>%
ungroup(username)
I would be happy if you could help me!
Here are two equivalent alternatives.
In the first alternative below, the second argument to rollapplyr is a list such that the ith component is the vector of offsets to average over for the ith row of the group.
In the second alternative we can specify the width as a vector of widths, one per row, and then when taking the mean eliminate the last value.
Note that w is slightly different in the two alternatives.
Review ?rollapply for details on the arguments and for further examples.
library(dplyr, exclude = c("filter", "lag"))
library(zoo)
# Alternative 1: a list of offset vectors, one per row
example %>%
  arrange(username, Date) %>%
  group_by(username) %>%
  mutate(w = seq_along(Date) - findInterval(Date - 30, Date) - 1,
         avg30 = rollapplyr(ER, lapply(-w, seq, to = -1), mean, fill=NA)) %>%
  ungroup()
# Alternative 2: a vector of widths, dropping the last (current) value before averaging
example %>%
  arrange(username, Date) %>%
  group_by(username) %>%
  mutate(w = seq_along(Date) - findInterval(Date - 30, Date),
         avg30 = rollapplyr(ER, w, function(x) mean(head(x, -1)), fill = NA)) %>%
  ungroup()
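To make the window logic concrete, here is a minimal base R sketch of the same idea without zoo. It assumes the intended window is the 30 days strictly before each row's date, excluding the row itself. The helper name roll_mean_prev and the six-row subset are hypothetical, and the result may differ from the rollapplyr() code in edge cases such as repeated dates or a row with no earlier observation inside the window.

```r
# Hypothetical helper: mean of `values` over the `days` days before each date,
# excluding the current row; NA when the window is empty
roll_mean_prev <- function(dates, values, days = 30) {
  vapply(seq_along(dates), function(i) {
    in_window <- dates > dates[i] - days & dates < dates[i]
    if (any(in_window)) mean(values[in_window]) else NA_real_
  }, numeric(1))
}

# Small subset of the question's data
example <- data.frame(
  Date = as.Date(c("2020-03-24", "2020-04-06", "2020-04-08",
                   "2020-03-08", "2020-04-06", "2020-04-15")),
  username = c("steves_", "steves_", "steves_", "mia", "mia", "mia"),
  ER = c(0.092, 0.08, 0.028, 0.19, 0.09, 0.21)
)

# Sort, then apply the helper within each username
example <- example[order(example$username, example$Date), ]
example$avg30 <- unsplit(
  lapply(split(example, example$username),
         function(g) roll_mean_prev(g$Date, g$ER)),
  example$username
)
print(example)
```

The first row of each user has no earlier observation, so its avg30 is NA.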

Finding statistics after grouping in data.table

I have a small question regarding data.table. Since I'm not very experienced with it, I'm not sure how to do this in data.table.
Basically, I have 3 columns and want to group by the first two (key and date) and then, for each key and each date, find the maximum and minimum of the third column (fare).
I tried the following, but it gives me an error:
flights[, c("max_day", "min_day") := unlist(lapply(gross_fare, findr)), by = c("key", "created_date")]
Error in `[.data.table`(flights, , `:=`(c("max_day", "min_day"), unlist(lapply(gross_fare, :
Supplied 18 items to be assigned to group 1 of size 9 in column 'max_day'. The RHS length must either be 1 (single values are ok) or match the LHS length exactly. If you wish to 'recycle' the RHS please use rep() explicitly to make this intent clear to readers of your code.
findr is a function that just finds the max and min, i.e.
findr <- function(x) {list(max = max(x), min = min(x))}
I've already done what I want in dplyr, and the code is below, but since I have millions of rows, dplyr eats up my RAM, so data.table would help:
test <- flights %>%
select(key, created_date, gross_fare) %>%
group_by(key, created_date) %>%
summarise(
max_day = max(gross_fare),
min_day = min(gross_fare),
diff = max_day - min_day) %>%
arrange(created_date)
Here is the dput output in case anyone wants to use it.
If anyone can help, that would be great. Thank you :)
data.table::setDT(structure(list(key = c("LHE_KHI_LHE+KHI_PA-405_15.0_1", "KHI_ISB_KHI+ISB_PK-370_20.0_0",
"LHE_KHI_LHE+KHI_PK-307_20.0_0", "ISB_KHI_ISB+KHI_PF-124_20.0_1",
"LHE_KHI_LHE+KHI_PK-307_20.0_0", "LHE_KHI_LHE+KHI_PA-405_15.0_1",
"KHI_LHE_KHI+LHE_PK-304_20.0_0", "KHI_ISB_KHI+ISB_PA-204_15.0_1",
"ISB_KHI_ISB+KHI_PA-207_15.0_1", "KHI_ISB_KHI+ISB_PA-200_20.0_1",
"KHI_LHE_KHI+LHE_PK-304_40.0_0", "ISB_KHI_ISB+KHI_PA-201_35.0_1",
"ISB_KHI_ISB+KHI_ER-501_20.0_1", "KHI_LHE_KHI+LHE_PF-145_20.0_2",
"KHI_ISB_KHI+ISB_PA-204_20.0_1", "LHE_KHI_LHE+KHI_PA-401_0.0_0",
"ISB_KHI_ISB+KHI_PK-309_40.0_0", "KHI_ISB_KHI+ISB_PF-123_20.0_2",
"ISB_KHI_ISB+KHI_PA-205_15.0_1", "LHE_KHI_LHE+KHI_PF-142_0.0_0",
"ISB_KHI_ISB+KHI_PA-223_15.0_1", "ISB_KHI_ISB+KHI_PF-126_20.0_2",
"ISB_KHI_ISB+KHI_PK-309_20.0_0", "KHI_ISB_KHI+ISB_PF-121_20.0_2",
"ISB_KHI_ISB+KHI_PK-373_20.0_0", "KHI_LHE_KHI+LHE_PF-145_20.0_2",
"KHI_LHE_KHI+LHE_PA-402_15.0_1", "LHE_KHI_LHE+KHI_PA-407_20.0_1",
"KHI_ISB_KHI+ISB_PK-308_40.0_0", "KHI_LHE_KHI+LHE_PF-145_20.0_2",
"LHE_KHI_LHE+KHI_PF-144_0.0_0", "ISB_KHI_ISB+KHI_PK-369_40.0_0",
"ISB_KHI_ISB+KHI_PF-124_20.0_2", "KHI_ISB_KHI+ISB_PA-204_15.0_1",
"KHI_ISB_KHI+ISB_PA-200_15.0_1", "ISB_KHI_ISB+KHI_PF-124_20.0_1",
"KHI_ISB_KHI+ISB_PK-300_20.0_0", "ISB_KHI_ISB+KHI_PF-122_20.0_2",
"KHI_ISB_KHI+ISB_PK-368_20.0_0", "KHI_ISB_KHI+ISB_PA-204_15.0_1",
"ISB_KHI_ISB+KHI_ER-503_20.0_1", "ISB_KHI_ISB+KHI_PA-209_15.0_1",
"KHI_ISB_KHI+ISB_PK-308_40.0_0", "ISB_KHI_ISB+KHI_PF-124_20.0_1",
"ISB_KHI_ISB+KHI_PK-301_40.0_0", "KHI_LHE_KHI+LHE_PA-408_35.0_1",
"LHE_KHI_LHE+KHI_PF-144_20.0_2", "KHI_ISB_KHI+ISB_PF-121_20.0_2",
"KHI_ISB_KHI+ISB_PA-204_35.0_1", "ISB_KHI_ISB+KHI_PK-309_40.0_0",
"ISB_KHI_ISB+KHI_PA-223_20.0_1", "KHI_ISB_KHI+ISB_PA-206_35.0_1",
"LHE_KHI_LHE+KHI_PF-142_32.0_1", "LHE_KHI_LHE+KHI_PF-142_20.0_1",
"KHI_ISB_KHI+ISB_PF-123_20.0_2", "ISB_KHI_ISB+KHI_PA-209_15.0_1",
"KHI_ISB_KHI+ISB_PA-204_35.0_1", "ISB_KHI_ISB+KHI_PA-201_20.0_1",
"KHI_ISB_KHI+ISB_PK-368_20.0_0", "ISB_KHI_ISB+KHI_PA-205_20.0_1",
"KHI_ISB_KHI+ISB_PF-121_20.0_1", "ISB_KHI_ISB+KHI_PF-124_20.0_1",
"ISB_KHI_ISB+KHI_PA-205_15.0_1", "KHI_LHE_KHI+LHE_PF-145_20.0_2",
"KHI_LHE_KHI+LHE_PA-406_35.0_1", "KHI_ISB_KHI+ISB_PK-308_20.0_0",
"LHE_KHI_LHE+KHI_PA-401_20.0_1", "LHE_KHI_LHE+KHI_PA-401_15.0_1",
"KHI_ISB_KHI+ISB_PA-204_35.0_1", "KHI_LHE_KHI+LHE_PA-406_35.0_1",
"KHI_ISB_KHI+ISB_PA-206_35.0_1", "KHI_ISB_KHI+ISB_PF-121_20.0_1",
"ISB_KHI_ISB+KHI_PA-205_20.0_1", "LHE_KHI_LHE+KHI_PF-142_20.0_1",
"LHE_KHI_LHE+KHI_PF-146_20.0_2", "LHE_KHI_LHE+KHI_PA-401_35.0_1",
"ISB_KHI_ISB+KHI_PA-209_15.0_1", "ISB_KHI_ISB+KHI_PK-301_40.0_0",
"ISB_KHI_ISB+KHI_PA-205_35.0_1", "KHI_LHE_KHI+LHE_PA-406_15.0_1",
"KHI_ISB_KHI+ISB_PF-123_20.0_1", "ISB_KHI_ISB+KHI_PA-201_35.0_1",
"KHI_ISB_KHI+ISB_PK-300_40.0_0", "KHI_LHE_KHI+LHE_PA-402_35.0_1",
"ISB_KHI_ISB+KHI_ER-505_20.0_1", "ISB_KHI_ISB+KHI_PF-122_20.0_2",
"ISB_KHI_ISB+KHI_PA-207_15.0_1", "KHI_LHE_KHI+LHE_PA-404_35.0_1",
"KHI_ISB_KHI+ISB_PF-123_20.0_1", "ISB_KHI_ISB+KHI_ER-503_20.0_1",
"ISB_GIL_ISB+GIL_PK-605_20.0_0", "KHI_ISB_KHI+ISB_PF-123_20.0_1",
"KHI_ISB_KHI+ISB_PA-200_15.0_1", "ISB_KHI_ISB+KHI_PF-122_20.0_2",
"KHI_LHE_KHI+LHE_PA-404_35.0_1", "ISB_KHI_ISB+KHI_PF-122_20.0_2",
"PEW_KHI_PEW+KHI_PF-152_20.0_1", "LHE_KHI_LHE+KHI_PK-303_20.0_0",
"KHI_ISB_KHI+ISB_PA-222_35.0_1", "ISB_KHI_ISB+KHI_PF-124_20.0_1"
), created_date = c("2021-04-20", "2021-05-27", "2021-02-13",
"2021-08-14", "2021-08-11", "2021-08-21", "2021-01-26", "2021-08-21",
"2021-05-24", "2021-09-15", "2021-06-05", "2021-07-19", "2021-09-29",
"2021-07-02", "2021-08-10", "2021-01-04", "2021-07-15", "2021-07-14",
"2021-08-13", "2021-01-11", "2021-09-13", "2021-09-20", "2021-05-27",
"2021-02-20", "2021-08-15", "2021-07-27", "2021-08-26", "2021-09-15",
"2021-08-02", "2021-06-25", "2021-05-15", "2021-08-26", "2021-07-30",
"2021-06-27", "2021-08-07", "2021-03-19", "2021-03-02", "2021-06-06",
"2021-08-15", "2021-06-27", "2021-09-19", "2021-07-28", "2021-08-09",
"2021-08-16", "2021-09-09", "2021-06-04", "2021-08-12", "2021-05-15",
"2021-07-26", "2021-05-27", "2021-08-12", "2021-08-02", "2021-01-26",
"2021-04-20", "2021-08-26", "2021-08-26", "2021-03-21", "2021-01-09",
"2021-04-23", "2021-01-04", "2021-08-13", "2021-06-22", "2021-05-31",
"2021-08-18", "2021-06-16", "2021-08-14", "2021-08-10", "2021-06-16",
"2021-04-08", "2021-05-20", "2021-06-22", "2021-04-20", "2021-01-05",
"2021-02-27", "2021-07-07", "2021-03-26", "2021-08-16", "2021-05-01",
"2021-07-31", "2021-06-14", "2021-06-16", "2021-03-25", "2021-09-14",
"2021-06-06", "2021-09-02", "2021-08-06", "2021-07-18", "2021-02-28",
"2021-04-28", "2021-09-19", "2021-08-25", "2021-06-17", "2021-06-07",
"2021-06-17", "2021-07-07", "2021-08-23", "2021-07-09", "2021-07-19",
"2021-07-14", "2021-05-21"), gross_fare = c(7796, 7427, 11504,
6870, 6580, 14945, 8697, 7524, 7124, 6785, 11858, 7524, 11500,
9525, 6785, 8739, 8200, 13560, 9045, 7400, 7524, 12500, 7458,
14000, 6570, 9525, 6220, 10545, 8310, 7900, 7820, 8410, 11285,
19892, 6810, 9800, 11441, 11900, 6570, 13592, 11500, 8300, 20380,
8525, 7340, 9707, 7870, 10655, 10545, 11798, 14645, 10545, 8650,
8650, 7870, 12945, 10799, 10227, 6765, 10227, 20120, 11045, 9403,
7870, 7124, 6570, 6810, 6531, 8605, 7124, 11072, 7390, 10227,
13435, 10530, 12280, 18945, 11147, 10545, 6531, 6620, 10799,
18480, 32702, 5606, 13560, 23895, 8027, 9655, 11500, 11990, 6620,
9403, 7620, 14645, 19105, 9000, 6440, 12645, 8025)), row.names = c(NA,
-100L), class = c("data.table", "data.frame")))
I guess this line of code should do the job:
library(data.table)
flights[, .(min_day = min(gross_fare), max_day = max(gross_fare), diff = max(gross_fare) - min(gross_fare)), by = .(key, created_date)][]
Since the function findr returns a list, there's no need to complicate things:
findr <- function(x) {list(max = max(x), min = min(x))}
flights[, c("max_day", "min_day") := findr(gross_fare), by = list(key, created_date)][]
To also return the difference between max and min, use
findr2 <- function(x) {
list(max = max(x), min = min(x), diff = diff(range(x)))
}
flights[, c("max_day", "min_day", "diff_day") := findr2(gross_fare), by = list(key, created_date)][]
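The error in the question comes from applying findr to each fare separately. A small base R sketch (the fares vector is a hypothetical stand-in for one group's gross_fare values) shows the difference in lengths: lapply() calls findr once per element and yields two values each time, whereas calling findr on the whole vector yields exactly one max and one min per group.

```r
findr <- function(x) list(max = max(x), min = min(x))

# Hypothetical fares for a single (key, created_date) group
fares <- c(7796, 7427, 11504)

# One findr call per element: 2 values for each of the 3 fares
per_element <- unlist(lapply(fares, findr))
length(per_element)

# One findr call for the whole group: a single max and a single min
whole_group <- findr(fares)
str(whole_group)
```

This matches the message in the question, where a group of size 9 received 18 items.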

how to do the mean absolute error (mae) on two dataframes with NaN in r

My data looks like this:
> dput(head(df1,25))
structure(list(Date = structure(c(16644, 16645, 16646, 16647,
16648, 16649, 16650, 16651, 16652, 16653, 16654, 16655, 16656,
16657, 16658, 16659, 16660, 16661, 16662, 16663, 16664, 16665,
16666, 16667, 16668), class = "Date"), AU = c(0.241392906920806,
0.257591745069017, 0.263305712230276, NaN, 0.252892547032525,
0.251771180928526, 0.249211746794207, 0.257289083109259, 0.205017582640463,
0.20072274573488, 0.210154167590338, 0.207384553271337, 0.193725450540089,
0.199282601988984, 0.216267134143314, 0.217052471451736, NaN,
0.220703029531909, 0.2164619798534, 0.223442036108148, 0.22061326758891,
NaN, 0.277777461504811, NaN, 0.200839628485262)), row.names = c(NA,
-25L), class = c("tbl_df", "tbl", "data.frame"))
> dput(head(df2,25))
structure(list(UF1 = c(0.2559, 0.2565, 0.257, 0.2577, 0.2583,
0.259, 0.2596, 0.2603, 0.2611, 0.2618, 0.2625, 0.2633, 0.2641,
0.2649, 0.2657, 0.2665, 0.2674, 0.2682, 0.2691, 0.27, 0.2709,
0.2718, 0.2727, 0.2736, 0.2745), UF2 = c(0.2597, 0.2602, 0.2608,
0.2614, 0.2621, 0.2627, 0.2634, 0.2641, 0.2648, 0.2655, 0.2663,
0.267, 0.2678, 0.2686, 0.2694, 0.2702, 0.2711, 0.2719, 0.2728,
0.2737, 0.2745, 0.2754, 0.2763, 0.2773, 0.2782), UF3 = c(0.2912,
0.2915, 0.2918, 0.2922, 0.2926, 0.293, 0.2934, 0.2938, 0.2943,
0.2947, 0.2952, 0.2957, 0.2962, 0.2968, 0.2973, 0.2979, 0.2985,
0.2991, 0.2997, 0.3003, 0.3009, 0.3016, 0.3022, 0.3029, 0.3035
), Date = structure(c(16644, 16645, 16646, 16647, 16648, 16649,
16650, 16651, 16652, 16653, 16654, 16655, 16656, 16657, 16658,
16659, 16660, 16661, 16662, 16663, 16664, 16665, 16666, 16667,
16668), class = "Date")), row.names = c(NA, 25L), class = "data.frame")
I'm trying to compute the mean absolute error (MAE) between the observed values df1$AU and the predicted values df2$UF1, df2$UF2 and df2$UF3 using the following code (How to make a function of MAE and RAE without using library(Metrics)?):
mae1 <- function(df1$AU,df2$UF1, na.rm=TRUE)
{
mean(abs(df1$AU-df2$UF1), na.rm=na.rm)
}
mae1(df1$AU,df2$UF1, na.rm=TRUE)
but I always get this error:
Error in mean.default(abs(df1$AU - df2$UF1), :
object 'na.rm' not found
I also tried library(Metrics):
mae(df1$AU, df2$UF1, na.rm=TRUE)
but always get this error
Error in mae(df1$AU, df2$UF1, :
unused argument (na.rm = TRUE)
I tried to do only:
mean(abs(df1$AU-df2$UF1), na.rm=TRUE)
and I got a value, but I don't know whether it corresponds to the real MAE value.
Note:
My data contains NaN values.
I think this error may be related to na.rm rather than to the mae function itself, but I couldn't solve it either way.
Any help will be much appreciated.
Try
mae1 <- function(o,p,m=T) {
mean(abs(o-p),na.rm=m)
}
mae1(df1$AU,df2$UF1)
[1] 0.03733099
Note: your function should work as well, I don't know why you are getting these errors.
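As a quick check that this really is the MAE, here is a minimal sketch with made-up vectors (obs and pred are hypothetical, not the question's data). It also points at a probable cause of the original error: function arguments must be plain names such as observed, not expressions such as df1$AU, so the definition in the question would not parse and its body would then run at top level, where na.rm does not exist.

```r
# Arguments are plain names; the data are passed in when the function is called
mae1 <- function(observed, predicted, na.rm = TRUE) {
  mean(abs(observed - predicted), na.rm = na.rm)
}

obs  <- c(0.2, NaN, 0.4)   # hypothetical observed values, one missing
pred <- c(0.3, 0.5, 0.1)   # hypothetical predictions

# Absolute errors are 0.1, NaN, 0.3; the NaN pair is dropped before averaging
mae_drop <- mae1(obs, pred)
print(mae_drop)

# Without na.rm, a single NaN makes the whole result NaN
mae_keep <- mae1(obs, pred, na.rm = FALSE)
print(mae_keep)
```

With na.rm = TRUE, the result is the mean absolute error over the pairs where both values are available.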
Base R Solution:
# Extract the required vector names in a list:
# req_vec_names => list of vector names
req_vec_names <- list(
act_vec_name = "AU",
pred_vec_names = grep(
"UF\\d+",
c(
colnames(df1),
colnames(df2)
),
value = TRUE
)
)
# Function to create the mean absolute error:
# mae => function()
mae <- function(actual_vec, pred_vec){
return(
mean(
abs(actual_vec - pred_vec),
na.rm = TRUE
)
)
}
# Calculate the mean absolute error:
# maes => named double vector
maes <- vapply(
req_vec_names$pred_vec_names,
function(x){
mae(
# [[ ]] extracts the column as a vector, which also works when df1 is a tibble
as.double(df1[[req_vec_names$act_vec_name]]),
as.double(df2[[x]])
)
},
double(1)
)
# Print result: named double vector => stdout(console)
maes
#UF1 UF2 UF3
#0.03733099 0.04024130 0.06848576

how to do the mean of two dataframes columns to be subtracted "mean(df1$a-df2$b)" in r

My two dataframes looks like this:
> dput(head(df1,25))
structure(list(Date = structure(c(16644, 16645, 16646, 16647,
16648, 16649, 16650, 16651, 16652, 16653, 16654, 16655, 16656,
16657, 16658, 16659, 16660, 16661, 16662, 16663, 16664, 16665,
16666, 16667, 16668), class = "Date"), AU = c(0.241392906920806,
0.257591745069017, 0.263305712230276, NaN, 0.252892547032525,
0.251771180928526, 0.249211746794207, 0.257289083109259, 0.205017582640463,
0.20072274573488, 0.210154167590338, 0.207384553271337, 0.193725450540089,
0.199282601988984, 0.216267134143314, 0.217052471451736, NaN,
0.220703029531909, 0.2164619798534, 0.223442036108148, 0.22061326758891,
NaN, 0.277777461504811, NaN, 0.200839628485262)), row.names = c(NA,
-25L), class = c("tbl_df", "tbl", "data.frame"))
> dput(head(df2,25))
structure(list(UF1 = c(0.2559, 0.2565, 0.257, 0.2577, 0.2583,
0.259, 0.2596, 0.2603, 0.2611, 0.2618, 0.2625, 0.2633, 0.2641,
0.2649, 0.2657, 0.2665, 0.2674, 0.2682, 0.2691, 0.27, 0.2709,
0.2718, 0.2727, 0.2736, 0.2745), UF2 = c(0.2597, 0.2602, 0.2608,
0.2614, 0.2621, 0.2627, 0.2634, 0.2641, 0.2648, 0.2655, 0.2663,
0.267, 0.2678, 0.2686, 0.2694, 0.2702, 0.2711, 0.2719, 0.2728,
0.2737, 0.2745, 0.2754, 0.2763, 0.2773, 0.2782), UF3 = c(0.2912,
0.2915, 0.2918, 0.2922, 0.2926, 0.293, 0.2934, 0.2938, 0.2943,
0.2947, 0.2952, 0.2957, 0.2962, 0.2968, 0.2973, 0.2979, 0.2985,
0.2991, 0.2997, 0.3003, 0.3009, 0.3016, 0.3022, 0.3029, 0.3035
), Date = structure(c(16644, 16645, 16646, 16647, 16648, 16649,
16650, 16651, 16652, 16653, 16654, 16655, 16656, 16657, 16658,
16659, 16660, 16661, 16662, 16663, 16664, 16665, 16666, 16667,
16668), class = "Date")), row.names = c(NA, 25L), class = "data.frame")
I want to compute the mean of the difference between columns of two different dataframes (mean(df1$AU-df2$UF)).
The closest I have come to a solution is the following:
data.frame(mean = colMeans(df1$AU, na.rm = TRUE) - colMeans(df2$UF))
but I got this error:
Error in colMeans(df1$mAU, na.rm = TRUE) :
'x' must be an array of at least two dimensions
I managed to run the same code only for dataframes with one column each, but since I have 3 or more columns per dataframe that I want to compare against df1$AU, I need something more efficient.
Any help will be much appreciated. Thank you.
Assuming what you meant is that you want the subtraction of the means of the (numeric) columns in df1 with the mean of the (numeric) columns in df2, this can be done like this:
mean(df1$AU, na.rm = T) - colMeans(df2[,1:3], na.rm = T)
This outputs one value per column of df2:
       UF1        UF2        UF3
-0.0367389 -0.0404509 -0.0688949
I hope this is helpful.
Here are two base R functions to compute the mean of the differences. The 2nd is faster.
meanDiffs1 <- function(x, y, na.rm = TRUE){
z <- if(na.rm) na.omit(cbind(x, -1*y)) else cbind(x, -1*y)
mean(rowSums(z))
}
meanDiffs2 <- function(x, y, na.rm = TRUE){
if(na.rm){
i <- is.na(x)
j <- is.na(y)
mean(x[!i & !j] - y[!i & !j])
} else {
mean(x - y)
}
}
meanDiffs1(df1$AU, df2$UF1)
#[1] -0.0361429
meanDiffs2(df1$AU, df2$UF1)
#[1] -0.0361429
To compute all mean differences between df1$AU and df2$UF*, use sapply.
sapply(df2[1:3], \(y) meanDiffs2(df1$AU, y))
# UF1 UF2 UF3
#-0.03614290 -0.03986195 -0.06848576
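Note that the two answers compute slightly different quantities, which is why their numbers differ. A minimal sketch with made-up vectors (x and y are hypothetical): when x contains NaN, the difference of the means still uses every value of y, while the mean of the differences drops the whole pair.

```r
x <- c(1, NaN, 3)   # hypothetical observed values, one missing
y <- c(2, 10, 4)    # hypothetical predictions

# Difference of means: mean(x) skips the NaN, mean(y) keeps all three values
diff_of_means <- mean(x, na.rm = TRUE) - mean(y)

# Mean of differences: the pair with the NaN is dropped entirely
mean_of_diffs <- mean(x - y, na.rm = TRUE)

print(c(diff_of_means = diff_of_means, mean_of_diffs = mean_of_diffs))
```

Which of the two is appropriate depends on whether rows with a missing observation should still contribute their predicted value.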

Data sorting with R for XLConnect

Ok, so I have a workbook in Excel with sheets that look like this:
Date 1/2/15 1/3/15 1/4/15
Euro 3.54 2.50 #N/A
USD 3.20 3.30 3.35
Yen 2.50 2.35 2.40
The sheets are arranged exactly like this. I loaded each sheet individually via XLConnect such that:
wbFX <- loadWorkbook("fx.xlsx")
FX.high <- readWorksheet(wbFX, sheet=1)
FX.low <- readWorksheet(wbFX, sheet=2)
FX.close <- readWorksheet(wbFX, sheet=3)
One sheet is for high prices, another for low prices, and the last one for closing prices.
I want to merge the currency rows (i.e: grouping Euro low, Euro high, Euro close) and create a table or dataframe such that I have:
Date 1/2/15 1/3/15 1/4/15
Euro close 3.54 2.50 #N/A
Euro low 3.20 3.30 3.35
Euro high 2.50 2.35 2.40
I only have very rudimentary knowledge of R and I'm not very familiar with the for() loop function in R. I understand the basics of it, but I'm struggling to reproduce what I want.
Suggestions would be very much appreciated!
UPDATED: Adding dput(FX.high) gives:
> dput(head(FX.high))
structure(list(Col1 = c("EUR CURNCY", "JPY CURNCY", "GBP CURNCY",
"CHF CURNCY", "AUD CURNCY", "CAD CURNCY"), X2016.08.30.00.00.00 = c(1.1192,
102.56, 1.312, 0.9815, 0.758, 1.3056), X2016.08.29.00.00.00 = c(1.1208,
102.39, 1.3172, 0.9807, 0.7582, 1.3048), X2016.08.28.00.00.00 = c(1.1341,
101.94, 1.3279, 0.9793, 0.7692, 1.3012), X2016.08.27.00.00.00 = c(1.1341,
101.94, 1.3279, 0.9793, 0.7692, 1.3012), X2016.08.26.00.00.00 = c(1.1341,
101.94, 1.3279, 0.9793, 0.7692, 1.3012), X2016.08.25.00.00.00 = c(1.1298,
100.62, 1.3264, 0.9688, 0.7639, 1.294), X2016.08.24.00.00.00 = c(1.1312,
100.61, 1.3273, 0.9684, 0.7634, 1.2958), X2016.08.23.00.00.00 = c(1.1355,
100.39, 1.3211, 0.9634, 0.7655, 1.2948), X2016.08.22.00.00.00 = c(1.1334,
100.93, 1.3157, 0.9649, 0.764, 1.2965), X2016.08.21.00.00.00 = c(1.136,
100.46, 1.3185, 0.9611, 0.7691, 1.2892), X2016.08.20.00.00.00 = c(1.136,
100.46, 1.3185, 0.9611, 0.7691, 1.2892), X2016.08.19.00.00.00 = c(1.136,
100.46, 1.3185, 0.9611, 0.7691, 1.2892), X2016.08.18.00.00.00 = c(1.1366,
100.5, 1.3173, 0.9627, 0.7723, 1.2858), X2016.08.17.00.00.00 = c(1.1316,
101.17, 1.3086, 0.9659, 0.7708, 1.2918), X2016.08.16.00.00.00 = c(1.1323,
101.29, 1.3051, 0.9735, 0.7749, 1.2934), X2016.08.15.00.00.00 = c(1.1204,
101.45, 1.2945, 0.9775, 0.7692, 1.2976), X2016.08.14.00.00.00 = c(1.1221,
102.27, 1.3035, 0.9766, 0.7725, 1.2994), X2016.08.13.00.00.00 = c(1.1221,
102.27, 1.3035, 0.9766, 0.7725, 1.2994), X2016.08.12.00.00.00 = c(1.1221,
102.27, 1.3035, 0.9766, 0.7725, 1.2994), X2016.08.11.00.00.00 = c(1.1192,
102.06, 1.3028, 0.9766, 0.775, 1.308), X2016.08.10.00.00.00 = c(1.119,
101.96, 1.3094, 0.9819, 0.7756, 1.3124), X2016.08.09.00.00.00 = c(1.1123,
102.53, 1.3049, 0.9844, 0.7687, 1.319), X2016.08.08.00.00.00 = c(1.1105,
102.66, 1.3097, 0.9842, 0.7672, 1.319), X2016.08.07.00.00.00 = c(1.1161,
102.06, 1.3175, 0.9831, 0.7664, 1.32), X2016.08.06.00.00.00 = c(1.1161,
102.06, 1.3175, 0.9831, 0.7664, 1.32), X2016.08.05.00.00.00 = c(1.1161,
102.06, 1.3175, 0.9831, 0.7664, 1.32), X2016.08.04.00.00.00 = c(1.1156,
101.67, 1.3346, 0.975, 0.7641, 1.3089), X2016.08.03.00.00.00 = c(1.1227,
101.57, 1.3372, 0.9739, 0.7616, 1.3148), X2016.08.02.00.00.00 = c(1.1234,
102.83, 1.3366, 0.9698, 0.7638, 1.3142), X2016.08.01.00.00.00 = c(1.1184,
102.68, 1.3273, 0.9703, 0.7615, 1.3127), X2016.07.31.00.00.00 = c(1.1197,
105.63, 1.3301, 0.981, 0.761, 1.3186), X2016.07.30.00.00.00 = c(1.1197,
105.63, 1.3301, 0.981, 0.761, 1.3186), X2016.07.29.00.00.00 = c(1.1197,
105.63, 1.3301, 0.981, 0.761, 1.3186), X2016.07.28.00.00.00 = c(1.1119,
105.51, 1.3248, 0.9868, 0.7549, 1.3192), X2016.07.27.00.00.00 = c(1.1065,
106.54, 1.3235, 0.995, 0.7566, 1.3253), X2016.07.26.00.00.00 = c(1.103,
105.89, 1.3176, 0.9928, 0.754, 1.3244), X2016.07.25.00.00.00 = c(1.0999,
106.72, 1.3165, 0.9897, 0.7492, 1.3242), X2016.07.24.00.00.00 = c(1.1041,
106.4, 1.3291, 0.9895, 0.7508, 1.3185), X2016.07.23.00.00.00 = c(1.1041,
106.4, 1.3291, 0.9895, 0.7508, 1.3185), X2016.07.22.00.00.00 = c(1.1041,
106.4, 1.3291, 0.9895, 0.7508, 1.3185), X2016.07.21.00.00.00 = c(1.106,
107.49, 1.3275, 0.9907, 0.7514, 1.3101), X2016.07.20.00.00.00 = c(1.103,
107.02, 1.3226, 0.9905, 0.7517, 1.3096), X2016.07.19.00.00.00 = c(1.1081,
106.53, 1.3276, 0.9878, 0.7592, 1.3054), X2016.07.18.00.00.00 = c(1.1084,
106.26, 1.3315, 0.9847, 0.7607, 1.3022), X2016.07.17.00.00.00 = c(1.1149,
106.32, 1.3481, 0.9847, 0.7676, 1.2988), X2016.07.16.00.00.00 = c(1.1149,
106.32, 1.3481, 0.9847, 0.7676, 1.2988), X2016.07.15.00.00.00 = c(1.1149,
106.32, 1.3481, 0.9847, 0.7676, 1.2988), X2016.07.14.00.00.00 = c(1.1165,
105.94, 1.3475, 0.9854, 0.7653, 1.2987), X2016.07.13.00.00.00 = c(1.112,
104.88, 1.3338, 0.9894, 0.7638, 1.3084), X2016.07.12.00.00.00 = c(1.1126,
104.99, 1.3295, 0.9894, 0.7658, 1.3133), X2016.07.11.00.00.00 = c(1.1075,
102.89, 1.3018, 0.9858, 0.7576, 1.314), X2016.07.10.00.00.00 = c(1.112,
101.28, 1.3019, 0.9866, 0.7574, 1.309), X2016.07.09.00.00.00 = c(1.112,
101.28, 1.3019, 0.9866, 0.7574, 1.309), X2016.07.08.00.00.00 = c(1.112,
101.28, 1.3019, 0.9866, 0.7574, 1.309), X2016.07.07.00.00.00 = c(1.1107,
101.41, 1.3047, 0.9792, 0.7539, 1.3021), X2016.07.06.00.00.00 = c(1.1112,
101.77, 1.3028, 0.9806, 0.7529, 1.3056), X2016.07.05.00.00.00 = c(1.1186,
102.59, 1.3291, 0.9774, 0.7545, 1.3017), X2016.07.04.00.00.00 = c(1.116,
102.8, 1.3341, 0.9759, 0.7545, 1.2926), X2016.07.03.00.00.00 = c(1.1169,
103.39, 1.335, 0.9781, 0.7503, 1.2975), X2016.07.02.00.00.00 = c(1.1169,
103.39, 1.335, 0.9781, 0.7503, 1.2975), X2016.07.01.00.00.00 = c(1.1169,
103.39, 1.335, 0.9781, 0.7503, 1.2975), X2016.06.30.00.00.00 = c(1.1155,
103.29, 1.3496, 0.9821, 0.7473, 1.3016), X2016.06.29.00.00.00 = c(1.113,
102.94, 1.3534, 0.9823, 0.7456, 1.3042), X2016.06.28.00.00.00 = c(1.1112,
102.84, 1.3419, 0.9837, 0.7415, 1.3108), X2016.06.27.00.00.00 = c(1.1084,
102.48, 1.3566, 0.9819, 0.7459, 1.312), X2016.06.26.00.00.00 = c(1.1428,
106.84, 1.5018, 0.9804, 0.7648, 1.3099), X2016.06.25.00.00.00 = c(1.1428,
106.84, 1.5018, 0.9804, 0.7648, 1.3099), X2016.06.24.00.00.00 = c(1.1428,
106.84, 1.5018, 0.9804, 0.7648, 1.3099), X2016.06.23.00.00.00 = c(1.1421,
106.17, 1.4947, 0.9602, 0.7616, 1.2847), X2016.06.22.00.00.00 = c(1.1338,
104.85, 1.4774, 0.9629, 0.7527, 1.2853), X2016.06.21.00.00.00 = c(1.135,
105.06, 1.4783, 0.9624, 0.7513, 1.2827), X2016.06.20.00.00.00 = c(1.1382,
104.85, 1.472, 0.9643, 0.7481, 1.2889), X2016.06.19.00.00.00 = c(1.1296,
104.83, 1.4388, 0.9659, 0.7411, 1.2968), X2016.06.18.00.00.00 = c(1.1296,
104.83, 1.4388, 0.9659, 0.7411, 1.2968), X2016.06.17.00.00.00 = c(1.1296,
104.83, 1.4388, 0.9659, 0.7411, 1.2968), X2016.06.16.00.00.00 = c(1.1295,
106.03, 1.4254, 0.9687, 0.744, 1.3086), X2016.06.15.00.00.00 = c(1.1298,
106.4, 1.4218, 0.9664, 0.7446, 1.2943), X2016.06.14.00.00.00 = c(1.1298,
106.42, 1.4271, 0.9669, 0.7405, 1.2873), X2016.06.13.00.00.00 = c(1.1303,
106.98, 1.4328, 0.9679, 0.7411, 1.2839), X2016.06.12.00.00.00 = c(1.1321,
107.26, 1.4473, 0.9658, 0.7438, 1.2784), X2016.06.11.00.00.00 = c(1.1321,
107.26, 1.4473, 0.9658, 0.7438, 1.2784), X2016.06.10.00.00.00 = c(1.1321,
107.26, 1.4473, 0.9658, 0.7438, 1.2784), X2016.06.09.00.00.00 = c(1.1416,
107.18, 1.4527, 0.9656, 0.7505, 1.2767), X2016.06.08.00.00.00 = c(1.1411,
107.39, 1.4601, 0.9659, 0.7482, 1.2761), X2016.06.07.00.00.00 = c(1.1381,
107.9, 1.466, 0.9719, 0.7464, 1.284), X2016.06.06.00.00.00 = c(1.1393,
107.66, 1.4529, 0.978, 0.7391, 1.2983), X2016.06.05.00.00.00 = c(1.1374,
109.14, 1.4582, 0.992, 0.7369, 1.3107), X2016.06.04.00.00.00 = c(1.1374,
109.14, 1.4582, 0.992, 0.7369, 1.3107), X2016.06.03.00.00.00 = c(1.1374,
109.14, 1.4582, 0.992, 0.7369, 1.3107), X2016.06.02.00.00.00 = c(1.122,
109.59, 1.4473, 0.9911, 0.727, 1.3144), X2016.06.01.00.00.00 = c(1.1194,
110.83, 1.4508, 0.9951, 0.7299, 1.3123), X2016.05.31.00.00.00 = c(1.1173,
111.35, 1.4725, 0.9951, 0.7267, 1.3134), X2016.05.30.00.00.00 = c(1.1145,
111.45, 1.4642, 0.9956, 0.7188, 1.3095), X2016.05.29.00.00.00 = c(1.1201,
110.45, 1.4689, 0.9949, 0.7235, 1.3068), X2016.05.28.00.00.00 = c(1.1201,
110.45, 1.4689, 0.9949, 0.7235, 1.3068), X2016.05.27.00.00.00 = c(1.1201,
110.45, 1.4689, 0.9949, 0.7235, 1.3068), X2016.05.26.00.00.00 = c(1.1217,
110.23, 1.474, 0.9928, 0.7244, 1.3037), X2016.05.25.00.00.00 = c(1.1167,
110.45, 1.4729, 0.9936, 0.7219, 1.3133), X2016.05.24.00.00.00 = c(1.1227,
110.13, 1.4642, 0.9938, 0.7228, 1.3188), X2016.05.23.00.00.00 = c(1.1243,
110.24, 1.4549, 0.9924, 0.726, 1.3174), X2016.05.22.00.00.00 = c(1.1237,
110.59, 1.4613, 0.9927, 0.725, 1.3162), X2016.05.21.00.00.00 = c(1.1237,
110.59, 1.4613, 0.9927, 0.725, 1.3162), X2016.05.20.00.00.00 = c(1.1237,
110.59, 1.4613, 0.9927, 0.725, 1.3162), X2016.05.19.00.00.00 = c(1.123,
110.38, 1.4663, 0.9923, 0.7242, 1.3154), X2016.05.18.00.00.00 = c(1.1316,
110.26, 1.4635, 0.9881, 0.7332, 1.3037), X2016.05.17.00.00.00 = c(1.1349,
109.65, 1.4524, 0.9809, 0.7366, 1.2955), X2016.05.16.00.00.00 = c(1.1342,
109.1, 1.4415, 0.9784, 0.7309, 1.2963), X2016.05.15.00.00.00 = c(1.138,
109.56, 1.4456, 0.9775, 0.7326, 1.2958), X2016.05.14.00.00.00 = c(1.138,
109.56, 1.4456, 0.9775, 0.7326, 1.2958), X2016.05.13.00.00.00 = c(1.138,
109.56, 1.4456, 0.9775, 0.7326, 1.2958), X2016.05.12.00.00.00 = c(1.1429,
109.4, 1.4531, 0.9726, 0.738, 1.2879), X2016.05.11.00.00.00 = c(1.1447,
109.38, 1.4488, 0.9762, 0.7402, 1.2942), X2016.05.10.00.00.00 = c(1.141,
109.35, 1.4478, 0.9763, 0.7374, 1.298), X2016.05.09.00.00.00 = c(1.142,
108.6, 1.448, 0.9736, 0.7385, 1.3015), X2016.05.08.00.00.00 = c(1.1479,
107.42, 1.4543, 0.973, 0.7478, 1.2952), X2016.05.07.00.00.00 = c(1.1479,
107.42, 1.4543, 0.973, 0.7478, 1.2952), X2016.05.06.00.00.00 = c(1.1479,
107.42, 1.4543, 0.973, 0.7478, 1.2952), X2016.05.05.00.00.00 = c(1.1494,
107.5, 1.4529, 0.9697, 0.7514, 1.2875), X2016.05.04.00.00.00 = c(1.1529,
107.46, 1.4572, 0.959, 0.7517, 1.2886), X2016.05.03.00.00.00 = c(1.1616,
106.68, 1.477, 0.9554, 0.7719, 1.2732), X2016.05.02.00.00.00 = c(1.1536,
106.82, 1.4696, 0.9605, 0.7672, 1.2571), X2016.05.01.00.00.00 = c(1.1459,
108.2, 1.467, 0.9672, 0.7669, 1.2587), X2016.04.30.00.00.00 = c(1.1459,
108.2, 1.467, 0.9672, 0.7669, 1.2587), X2016.04.29.00.00.00 = c(1.1459,
108.2, 1.467, 0.9672, 0.7669, 1.2587), X2016.04.28.00.00.00 = c(1.1368,
111.88, 1.4623, 0.9734, 0.7658, 1.2606), X2016.04.27.00.00.00 = c(1.1362,
111.75, 1.4622, 0.9753, 0.7765, 1.2694), X2016.04.26.00.00.00 = c(1.134,
111.47, 1.4639, 0.9767, 0.7765, 1.2688), X2016.04.25.00.00.00 = c(1.1278,
111.91, 1.452, 0.9794, 0.7729, 1.2717), X2016.04.24.00.00.00 = c(1.1309,
111.81, 1.4452, 0.9797, 0.7774, 1.2758), X2016.04.23.00.00.00 = c(1.1309,
111.81, 1.4452, 0.9797, 0.7774, 1.2758), X2016.04.22.00.00.00 = c(1.1309,
111.81, 1.4452, 0.9797, 0.7774, 1.2758), X2016.04.21.00.00.00 = c(1.1398,
109.9, 1.444, 0.9753, 0.7835, 1.275), X2016.04.20.00.00.00 = c(1.1388,
109.88, 1.441, 0.9733, 0.7829, 1.273), X2016.04.19.00.00.00 = c(1.1385,
109.49, 1.4419, 0.965, 0.7826, 1.2798), X2016.04.18.00.00.00 = c(1.1332,
108.99, 1.4291, 0.9679, 0.7759, 1.299), X2016.04.17.00.00.00 = c(1.1317,
109.73, 1.4242, 0.9688, 0.7734, 1.2903), X2016.04.16.00.00.00 = c(1.1317,
109.73, 1.4242, 0.9688, 0.7734, 1.2903), X2016.04.15.00.00.00 = c(1.1317,
109.73, 1.4242, 0.9688, 0.7734, 1.2903), X2016.04.14.00.00.00 = c(1.1295,
109.55, 1.4208, 0.9688, 0.7737, 1.2897), X2016.04.13.00.00.00 = c(1.1391,
109.41, 1.4279, 0.9672, 0.7716, 1.2828), X2016.04.12.00.00.00 = c(1.1465,
108.79, 1.4348, 0.9594, 0.769, 1.2921), X2016.04.11.00.00.00 = c(1.1447,
108.44, 1.4287, 0.9571, 0.763, 1.3016), X2016.04.10.00.00.00 = c(1.1419,
109.1, 1.4141, 0.9582, 0.7579, 1.3157), X2016.04.09.00.00.00 = c(1.1419,
109.1, 1.4141, 0.9582, 0.7579, 1.3157), X2016.04.08.00.00.00 = c(1.1419,
109.1, 1.4141, 0.9582, 0.7579, 1.3157), X2016.04.07.00.00.00 = c(1.1454,
109.9, 1.4157, 0.9581, 0.7637, 1.3181), X2016.04.06.00.00.00 = c(1.1432,
110.64, 1.4171, 0.9622, 0.7619, 1.3187), X2016.04.05.00.00.00 = c(1.1405,
111.36, 1.4279, 0.9605, 0.7632, 1.3219), X2016.04.04.00.00.00 = c(1.1413,
111.8, 1.4322, 0.9615, 0.7679, 1.3088), X2016.04.03.00.00.00 = c(1.1438,
112.58, 1.4372, 0.9626, 0.7701, 1.3147), X2016.04.02.00.00.00 = c(1.1438,
112.58, 1.4372, 0.9626, 0.7701, 1.3147), X2016.04.01.00.00.00 = c(1.1438,
112.58, 1.4372, 0.9626, 0.7701, 1.3147), X2016.03.31.00.00.00 = c(1.1412,
112.66, 1.4426, 0.9663, 0.7723, 1.3011), X2016.03.30.00.00.00 = c(1.1365,
112.81, 1.4459, 0.9672, 0.7709, 1.3081), X2016.03.29.00.00.00 = c(1.1303,
113.8, 1.4404, 0.9763, 0.7645, 1.3216), X2016.03.28.00.00.00 = c(1.122,
113.69, 1.4283, 0.9787, 0.7558, 1.3285), X2016.03.27.00.00.00 = c(1.1181,
113.32, 1.4159, 0.9788, 0.7535, 1.3285), X2016.03.26.00.00.00 = c(1.1181,
113.32, 1.4159, 0.9788, 0.7535, 1.3285), X2016.03.25.00.00.00 = c(1.1181,
113.32, 1.4159, 0.9788, 0.7535, 1.3285), X2016.03.24.00.00.00 = c(1.1188,
113.01, 1.4183, 0.9774, 0.7538, 1.3296), X2016.03.23.00.00.00 = c(1.1224,
112.91, 1.4227, 0.9766, 0.7649, 1.3219), X2016.03.22.00.00.00 = c(1.126,
112.49, 1.4398, 0.9736, 0.7643, 1.3139), X2016.03.21.00.00.00 = c(1.1288,
111.98, 1.4469, 0.9731, 0.7627, 1.3102), X2016.03.20.00.00.00 = c(1.1337,
111.76, 1.4514, 0.9717, 0.768, 1.3043), X2016.03.19.00.00.00 = c(1.1337,
111.76, 1.4514, 0.9717, 0.768, 1.3043), X2016.03.18.00.00.00 = c(1.1337,
111.76, 1.4514, 0.9717, 0.768, 1.3043), X2016.03.17.00.00.00 = c(1.1342,
112.96, 1.4503, 0.9787, 0.7657, 1.3133), X2016.03.16.00.00.00 = c(1.1242,
113.82, 1.4274, 0.9914, 0.7561, 1.3405), X2016.03.15.00.00.00 = c(1.1125,
114.14, 1.4306, 0.9896, 0.7528, 1.3402), X2016.03.14.00.00.00 = c(1.1176,
114.01, 1.4401, 0.9882, 0.7594, 1.3308), X2016.03.13.00.00.00 = c(1.121,
113.92, 1.4437, 0.9891, 0.7584, 1.3359), X2016.03.12.00.00.00 = c(1.121,
113.92, 1.4437, 0.9891, 0.7584, 1.3359), X2016.03.11.00.00.00 = c(1.121,
113.92, 1.4437, 0.9891, 0.7584, 1.3359), X2016.03.10.00.00.00 = c(1.1218,
114.45, 1.4317, 1.0093, 0.7512, 1.3398), X2016.03.09.00.00.00 = c(1.1035,
113.45, 1.4241, 1.0039, 0.7528, 1.3446), X2016.03.08.00.00.00 = c(1.1058,
113.52, 1.4276, 0.9971, 0.7473, 1.3425), X2016.03.07.00.00.00 = c(1.1026,
114.09, 1.4284, 1.0012, 0.7485, 1.3377), X2016.03.06.00.00.00 = c(1.1043,
114.26, 1.4248, 0.9989, 0.7443, 1.3471), X2016.03.05.00.00.00 = c(1.1043,
114.26, 1.4248, 0.9989, 0.7443, 1.3471), X2016.03.04.00.00.00 = c(1.1043,
114.26, 1.4248, 0.9989, 0.7443, 1.3471), X2016.03.03.00.00.00 = c(1.0973,
114.27, 1.4194, 0.9983, 0.7374, 1.3473), X2016.03.02.00.00.00 = c(1.0881,
114.56, 1.4093, 1.0009, 0.7301, 1.3499), X2016.03.01.00.00.00 = c(1.0894,
114.19, 1.4018, 1.0009, 0.7192, 1.3552), X2016.02.29.00.00.00 = c(1.0963,
113.99, 1.3946, 1.0038, 0.7168, 1.3587), X2016.02.28.00.00.00 = c(1.1068,
114, 1.4043, 0.9989, 0.7257, 1.3565), X2016.02.27.00.00.00 = c(1.1068,
114, 1.4043, 0.9989, 0.7257, 1.3565), X2016.02.26.00.00.00 = c(1.1068,
114, 1.4043, 0.9989, 0.7257, 1.3565), X2016.02.25.00.00.00 = c(1.105,
113.02, 1.3997, 0.9952, 0.7244, 1.3735), X2016.02.24.00.00.00 = c(1.1046,
112.27, 1.4028, 0.9953, 0.7213, 1.3859), X2016.02.23.00.00.00 = c(1.1053,
113.05, 1.4156, 1.0002, 0.7259, 1.3821), X2016.02.22.00.00.00 = c(1.1135,
113.39, 1.4332, 1.0004, 0.7247, 1.3813), X2016.02.21.00.00.00 = c(1.1139,
113.38, 1.4409, 0.9968, 0.7162, 1.3847), X2016.02.20.00.00.00 = c(1.1139,
113.38, 1.4409, 0.9968, 0.7162, 1.3847), X2016.02.19.00.00.00 = c(1.1139,
113.38, 1.4409, 0.9968, 0.7162, 1.3847), X2016.02.18.00.00.00 = c(1.115,
114.33, 1.4394, 0.9969, 0.7185, 1.3752), X2016.02.17.00.00.00 = c(1.1179,
114.51, 1.4339, 0.9942, 0.7187, 1.3899), X2016.02.16.00.00.00 = c(1.1193,
114.87, 1.4516, 0.9896, 0.7182, 1.3912), X2016.02.15.00.00.00 = c(1.1261,
114.73, 1.4567, 0.9889, 0.7172, 1.3867), X2016.02.14.00.00.00 = c(1.1334,
113.54, 1.457, 0.9791, 0.7129, 1.3965), X2016.02.13.00.00.00 = c(1.1334,
113.54, 1.457, 0.9791, 0.7129, 1.3965), X2016.02.12.00.00.00 = c(1.1334,
113.54, 1.457, 0.9791, 0.7129, 1.3965), X2016.02.11.00.00.00 = c(1.1376,
113.6, 1.4564, 0.9762, 0.7153, 1.4016), X2016.02.10.00.00.00 = c(1.1311,
115.26, 1.4578, 0.982, 0.7125, 1.3999), X2016.02.09.00.00.00 = c(1.1338,
115.85, 1.4516, 0.9875, 0.7096, 1.396), X2016.02.08.00.00.00 = c(1.1216,
117.53, 1.4547, 0.9973, 0.7129, 1.3978), X2016.02.07.00.00.00 = c(1.1246,
117.43, 1.4592, 0.9985, 0.7219, 1.3919), X2016.02.06.00.00.00 = c(1.1246,
117.43, 1.4592, 0.9985, 0.7219, 1.3919), X2016.02.05.00.00.00 = c(1.1246,
117.43, 1.4592, 0.9985, 0.7219, 1.3919), X2016.02.04.00.00.00 = c(1.1239,
118.24, 1.4668, 1.0074, 0.7243, 1.3798), X2016.02.03.00.00.00 = c(1.1146,
120.04, 1.4649, 1.0196, 0.7189, 1.4103), X2016.02.02.00.00.00 = c(1.094,
121.04, 1.4446, 1.0224, 0.7129, 1.4082), X2016.02.01.00.00.00 = c(1.0913,
121.49, 1.4445, 1.025, 0.7121, 1.4062), X2016.01.31.00.00.00 = c(1.0949,
121.69, 1.4413, 1.0257, 0.7141, 1.4109), X2016.01.30.00.00.00 = c(1.0949,
121.69, 1.4413, 1.0257, 0.7141, 1.4109), X2016.01.29.00.00.00 = c(1.0949,
121.69, 1.4413, 1.0257, 0.7141, 1.4109), X2016.01.28.00.00.00 = c(1.0968,
118.99, 1.4408, 1.0178, 0.7129, 1.4123), X2016.01.27.00.00.00 = c(1.0917,
119.07, 1.4355, 1.0189, 0.7082, 1.4157), X2016.01.26.00.00.00 = c(1.0874,
118.62, 1.4367, 1.0199, 0.7021, 1.4326), X2016.01.25.00.00.00 = c(1.0857,
118.86, 1.4332, 1.0184, 0.7032, 1.4293), X2016.01.24.00.00.00 = c(1.0877,
118.88, 1.4363, 1.0166, 0.7046, 1.4301), X2016.01.23.00.00.00 = c(1.0877,
118.88, 1.4363, 1.0166, 0.7046, 1.4301), X2016.01.22.00.00.00 = c(1.0877,
118.88, 1.4363, 1.0166, 0.7046, 1.4301), X2016.01.21.00.00.00 = c(1.0921,
117.81, 1.4249, 1.0147, 0.7018, 1.4541), X2016.01.20.00.00.00 = c(1.0976,
117.69, 1.4219, 1.0058, 0.6926, 1.469), X2016.01.19.00.00.00 = c(1.0939,
118.11, 1.434, 1.0082, 0.6957, 1.4589), X2016.01.18.00.00.00 = c(1.0942,
117.44, 1.4323, 1.0073, 0.6928, 1.466), X2016.01.17.00.00.00 = c(1.0985,
118.27, 1.4428, 1.0061, 0.7002, 1.4554), X2016.01.16.00.00.00 = c(1.0985,
118.27, 1.4428, 1.0061, 0.7002, 1.4554), X2016.01.15.00.00.00 = c(1.0985,
118.27, 1.4428, 1.0061, 0.7002, 1.4554), X2016.01.14.00.00.00 = c(1.0943,
118.28, 1.4445, 1.0092, 0.6997, 1.4397), X2016.01.13.00.00.00 = c(1.0888,
118.38, 1.4476, 1.0107, 0.7049, 1.438), X2016.01.12.00.00.00 = c(1.09,
118.07, 1.456, 1.0047, 0.7021, 1.4315), X2016.01.11.00.00.00 = c(1.097,
118.02, 1.4604, 1.0023, 0.7036, 1.4246), X2016.01.10.00.00.00 = c(1.0934,
118.83, 1.4645, 1.0052, 0.7077, 1.4178), X2016.01.09.00.00.00 = c(1.0934,
118.83, 1.4645, 1.0052, 0.7077, 1.4178), X2016.01.08.00.00.00 = c(1.0934,
118.83, 1.4645, 1.0052, 0.7077, 1.4178), X2016.01.07.00.00.00 = c(1.094,
118.76, 1.4641, 1.0081, 0.7086, 1.417), X2016.01.06.00.00.00 = c(1.0799,
119.17, 1.4682, 1.0121, 0.7172, 1.4109), X2016.01.05.00.00.00 = c(1.0839,
119.7, 1.4726, 1.0125, 0.7215, 1.4019), X2016.01.04.00.00.00 = c(1.0946,
120.47, 1.4816, 1.0063, 0.7305, 1.3983), X2016.01.03.00.00.00 = c(1.0867,
120.55, 1.476, 1.0083, 0.7304, 1.3856), X2016.01.02.00.00.00 = c(1.0867,
120.55, 1.476, 1.0083, 0.7304, 1.3856), X2016.01.01.00.00.00 = c(1.0867,
120.55, 1.476, 1.0083, 0.7304, 1.3856)), .Names = c("Col1", "X2016.08.30.00.00.00",
"X2016.08.29.00.00.00", "X2016.08.28.00.00.00", "X2016.08.27.00.00.00",
"X2016.08.26.00.00.00", "X2016.08.25.00.00.00", "X2016.08.24.00.00.00",
"X2016.08.23.00.00.00", "X2016.08.22.00.00.00", "X2016.08.21.00.00.00",
"X2016.08.20.00.00.00", "X2016.08.19.00.00.00", "X2016.08.18.00.00.00",
"X2016.08.17.00.00.00", "X2016.08.16.00.00.00", "X2016.08.15.00.00.00",
"X2016.08.14.00.00.00", "X2016.08.13.00.00.00", "X2016.08.12.00.00.00",
"X2016.08.11.00.00.00", "X2016.08.10.00.00.00", "X2016.08.09.00.00.00",
"X2016.08.08.00.00.00", "X2016.08.07.00.00.00", "X2016.08.06.00.00.00",
"X2016.08.05.00.00.00", "X2016.08.04.00.00.00", "X2016.08.03.00.00.00",
"X2016.08.02.00.00.00", "X2016.08.01.00.00.00", "X2016.07.31.00.00.00",
"X2016.07.30.00.00.00", "X2016.07.29.00.00.00", "X2016.07.28.00.00.00",
"X2016.07.27.00.00.00", "X2016.07.26.00.00.00", "X2016.07.25.00.00.00",
"X2016.07.24.00.00.00", "X2016.07.23.00.00.00", "X2016.07.22.00.00.00",
"X2016.07.21.00.00.00", "X2016.07.20.00.00.00", "X2016.07.19.00.00.00",
"X2016.07.18.00.00.00", "X2016.07.17.00.00.00", "X2016.07.16.00.00.00",
"X2016.07.15.00.00.00", "X2016.07.14.00.00.00", "X2016.07.13.00.00.00",
"X2016.07.12.00.00.00", "X2016.07.11.00.00.00", "X2016.07.10.00.00.00",
"X2016.07.09.00.00.00", "X2016.07.08.00.00.00", "X2016.07.07.00.00.00",
"X2016.07.06.00.00.00", "X2016.07.05.00.00.00", "X2016.07.04.00.00.00",
"X2016.07.03.00.00.00", "X2016.07.02.00.00.00", "X2016.07.01.00.00.00",
"X2016.06.30.00.00.00", "X2016.06.29.00.00.00", "X2016.06.28.00.00.00",
"X2016.06.27.00.00.00", "X2016.06.26.00.00.00", "X2016.06.25.00.00.00",
"X2016.06.24.00.00.00", "X2016.06.23.00.00.00", "X2016.06.22.00.00.00",
"X2016.06.21.00.00.00", "X2016.06.20.00.00.00", "X2016.06.19.00.00.00",
"X2016.06.18.00.00.00", "X2016.06.17.00.00.00", "X2016.06.16.00.00.00",
"X2016.06.15.00.00.00", "X2016.06.14.00.00.00", "X2016.06.13.00.00.00",
"X2016.06.12.00.00.00", "X2016.06.11.00.00.00", "X2016.06.10.00.00.00",
"X2016.06.09.00.00.00", "X2016.06.08.00.00.00", "X2016.06.07.00.00.00",
"X2016.06.06.00.00.00", "X2016.06.05.00.00.00", "X2016.06.04.00.00.00",
"X2016.06.03.00.00.00", "X2016.06.02.00.00.00", "X2016.06.01.00.00.00",
"X2016.05.31.00.00.00", "X2016.05.30.00.00.00", "X2016.05.29.00.00.00",
"X2016.05.28.00.00.00", "X2016.05.27.00.00.00", "X2016.05.26.00.00.00",
"X2016.05.25.00.00.00", "X2016.05.24.00.00.00", "X2016.05.23.00.00.00",
"X2016.05.22.00.00.00", "X2016.05.21.00.00.00", "X2016.05.20.00.00.00",
"X2016.05.19.00.00.00", "X2016.05.18.00.00.00", "X2016.05.17.00.00.00",
"X2016.05.16.00.00.00", "X2016.05.15.00.00.00", "X2016.05.14.00.00.00",
"X2016.05.13.00.00.00", "X2016.05.12.00.00.00", "X2016.05.11.00.00.00",
"X2016.05.10.00.00.00", "X2016.05.09.00.00.00", "X2016.05.08.00.00.00",
"X2016.05.07.00.00.00", "X2016.05.06.00.00.00", "X2016.05.05.00.00.00",
"X2016.05.04.00.00.00", "X2016.05.03.00.00.00", "X2016.05.02.00.00.00",
"X2016.05.01.00.00.00", "X2016.04.30.00.00.00", "X2016.04.29.00.00.00",
"X2016.04.28.00.00.00", "X2016.04.27.00.00.00", "X2016.04.26.00.00.00",
"X2016.04.25.00.00.00", "X2016.04.24.00.00.00", "X2016.04.23.00.00.00",
"X2016.04.22.00.00.00", "X2016.04.21.00.00.00", "X2016.04.20.00.00.00",
"X2016.04.19.00.00.00", "X2016.04.18.00.00.00", "X2016.04.17.00.00.00",
"X2016.04.16.00.00.00", "X2016.04.15.00.00.00", "X2016.04.14.00.00.00",
"X2016.04.13.00.00.00", "X2016.04.12.00.00.00", "X2016.04.11.00.00.00",
"X2016.04.10.00.00.00", "X2016.04.09.00.00.00", "X2016.04.08.00.00.00",
"X2016.04.07.00.00.00", "X2016.04.06.00.00.00", "X2016.04.05.00.00.00",
"X2016.04.04.00.00.00", "X2016.04.03.00.00.00", "X2016.04.02.00.00.00",
"X2016.04.01.00.00.00", "X2016.03.31.00.00.00", "X2016.03.30.00.00.00",
"X2016.03.29.00.00.00", "X2016.03.28.00.00.00", "X2016.03.27.00.00.00",
"X2016.03.26.00.00.00", "X2016.03.25.00.00.00", "X2016.03.24.00.00.00",
"X2016.03.23.00.00.00", "X2016.03.22.00.00.00", "X2016.03.21.00.00.00",
"X2016.03.20.00.00.00", "X2016.03.19.00.00.00", "X2016.03.18.00.00.00",
"X2016.03.17.00.00.00", "X2016.03.16.00.00.00", "X2016.03.15.00.00.00",
"X2016.03.14.00.00.00", "X2016.03.13.00.00.00", "X2016.03.12.00.00.00",
"X2016.03.11.00.00.00", "X2016.03.10.00.00.00", "X2016.03.09.00.00.00",
"X2016.03.08.00.00.00", "X2016.03.07.00.00.00", "X2016.03.06.00.00.00",
"X2016.03.05.00.00.00", "X2016.03.04.00.00.00", "X2016.03.03.00.00.00",
"X2016.03.02.00.00.00", "X2016.03.01.00.00.00", "X2016.02.29.00.00.00",
"X2016.02.28.00.00.00", "X2016.02.27.00.00.00", "X2016.02.26.00.00.00",
"X2016.02.25.00.00.00", "X2016.02.24.00.00.00", "X2016.02.23.00.00.00",
"X2016.02.22.00.00.00", "X2016.02.21.00.00.00", "X2016.02.20.00.00.00",
"X2016.02.19.00.00.00", "X2016.02.18.00.00.00", "X2016.02.17.00.00.00",
"X2016.02.16.00.00.00", "X2016.02.15.00.00.00", "X2016.02.14.00.00.00",
"X2016.02.13.00.00.00", "X2016.02.12.00.00.00", "X2016.02.11.00.00.00",
"X2016.02.10.00.00.00", "X2016.02.09.00.00.00", "X2016.02.08.00.00.00",
"X2016.02.07.00.00.00", "X2016.02.06.00.00.00", "X2016.02.05.00.00.00",
"X2016.02.04.00.00.00", "X2016.02.03.00.00.00", "X2016.02.02.00.00.00",
"X2016.02.01.00.00.00", "X2016.01.31.00.00.00", "X2016.01.30.00.00.00",
"X2016.01.29.00.00.00", "X2016.01.28.00.00.00", "X2016.01.27.00.00.00",
"X2016.01.26.00.00.00", "X2016.01.25.00.00.00", "X2016.01.24.00.00.00",
"X2016.01.23.00.00.00", "X2016.01.22.00.00.00", "X2016.01.21.00.00.00",
"X2016.01.20.00.00.00", "X2016.01.19.00.00.00", "X2016.01.18.00.00.00",
"X2016.01.17.00.00.00", "X2016.01.16.00.00.00", "X2016.01.15.00.00.00",
"X2016.01.14.00.00.00", "X2016.01.13.00.00.00", "X2016.01.12.00.00.00",
"X2016.01.11.00.00.00", "X2016.01.10.00.00.00", "X2016.01.09.00.00.00",
"X2016.01.08.00.00.00", "X2016.01.07.00.00.00", "X2016.01.06.00.00.00",
"X2016.01.05.00.00.00", "X2016.01.04.00.00.00", "X2016.01.03.00.00.00",
"X2016.01.02.00.00.00", "X2016.01.01.00.00.00"), row.names = c(NA,
6L), class = "data.frame")
I haven't tested this, since I don't have your data files, so let me know if the code below gives you what you wanted. The steps are as follows:
Get the names of the three worksheets in the Excel workbook (I'm assuming they'll be something like "close", "low", and "high").
Read each worksheet into a list of data frames using lapply. The data in your worksheets is transposed (i.e., the variables are in rows instead of columns), so we transpose it so that each variable is in a column, and we also add an extra column that tells us which worksheet the data originally came from.
Combine the three data frames into a single data frame.
Melt the combined data frame so that the three different currencies are stacked in long format.
The final output, df, should be a data frame in "long" format that's ready for analysis.
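library(reshape2)
library(XLConnect)
wbFX <- loadWorkbook("fx.xlsx")
# Get names of worksheets
sheets <- getSheets(wbFX)
# Read the three worksheets into a list of data frames
df <- lapply(sheets, function(sh) {
  raw <- readWorksheet(wbFX, sheet = sh)
  # Transpose only the numeric columns; the first column is assumed to hold the currency names
  dat <- as.data.frame(t(raw[, -1]))
  names(dat) <- raw[[1]]
  # Assumes column names like "X2016.08.30.00.00.00", as in the dput output above
  dat$Date <- as.Date(substr(rownames(dat), 2, 11), format = "%Y.%m.%d")
  dat$Price_Type <- sh
  rownames(dat) <- NULL
  dat
})
# Combine the list elements into a single data frame
df <- do.call(rbind, df)
# Melt to long format
df <- melt(df, id.vars = c("Date", "Price_Type"), variable.name = "Currency", value.name = "Price")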
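To try the transpose-and-melt steps without an Excel file, here is a minimal sketch on a small made-up data frame. The currency labels, the "close" label, and the two date columns are assumptions for illustration only; the column-name pattern follows the dput output in the question.

```r
library(reshape2)

# Hypothetical stand-in for one worksheet: currencies in rows, dates in columns
raw <- data.frame(
  Col1 = c("EURUSD", "USDJPY"),            # made-up currency labels
  X2016.01.02.00.00.00 = c(1.0867, 120.55),
  X2016.01.01.00.00.00 = c(1.0867, 120.55),
  stringsAsFactors = FALSE
)

# Transpose the numeric part so that each currency becomes a column
wide <- as.data.frame(t(raw[, -1]))
names(wide) <- raw$Col1

# Recover the date from row names such as "X2016.01.02.00.00.00"
wide$Date <- as.Date(substr(rownames(wide), 2, 11), format = "%Y.%m.%d")
wide$Price_Type <- "close"                 # made-up worksheet label
rownames(wide) <- NULL

# Stack the currency columns into long format
long <- melt(wide, id.vars = c("Date", "Price_Type"),
             variable.name = "Currency", value.name = "Price")
long
```

Each row of long then holds one date, one price type, one currency and one price, which is the shape the combined df above is meant to have.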
Maybe try
library(reshape2)
lst <- lapply(list(high, low, close), melt, id.vars = 1, variable.name = "var")
df <- Reduce(function(...) merge(..., by = c("Date", "var")), lst )
names(df) <- c("currency", "date", "high", "low", "close")
recast(df, currency+variable~date, id.var = 1:2)
which should give you something like
# currency variable 1/2/15 1/3/15 1/4/15
# 1 Euro high ... ... ...
# 2 Euro low ... ... ...
# 3 Euro close ... ... ...
