I am trying to calculate diameter growth for a set of trees over a number of years in a dataframe in which each row is a given tree during a given year. Typically, this sort of data has each individual stem as a single row with that stem's diameter for each year given in a separate column, but for various reasons, this dataframe needs to remain such that each row is an individual stem in an individual year. A simplistic model version of the data would be as follows
df<-data.frame("Stem"=c(1:5,1:5,1,2,3,5,1,2,3,5,6),
"Year"=c(rep(1997,5), rep(1998,5), rep(1999,4), rep(2000,5)),
"Diameter"=c(1:5,seq(1.5,5.5,1),2,3,4,6,3,5,7,9,15))
df
Stem Year Diameter
1 1 1997 1.0
2 2 1997 2.0
3 3 1997 3.0
4 4 1997 4.0
5 5 1997 5.0
6 1 1998 1.5
7 2 1998 2.5
8 3 1998 3.5
9 4 1998 4.5
10 5 1998 5.5
11 1 1999 2.0
12 2 1999 3.0
13 3 1999 4.0
14 5 1999 6.0
15 1 2000 3.0
16 2 2000 5.0
17 3 2000 7.0
18 5 2000 9.0
19 6 2000 15.0
What I am trying to accomplish is to make a new column that takes the diameter for a given stem in a given year and subtracts the diameter for that same stem in the previous year. I assume that this will require some set of nested for loops. Something like
for (i in 1:length(unique(df$Stem))) {
for (t in 2:length(unique(df$Year))) {
.....
}
}
What I'm struggling with is how to write the function that calculates:
Diameter[t]-Diameter[t-1] for each stem. Any suggestions would be greatly appreciated.
Try:
> do.call(rbind, lapply(split(df, df$Stem), function(x) transform(x, diff = c(0,diff(x$Diameter)))))
Stem Year Diameter diff
1.1 1 1997 1.0 0.0
1.6 1 1998 1.5 0.5
1.11 1 1999 2.0 0.5
1.15 1 2000 3.0 1.0
2.2 2 1997 2.0 0.0
2.7 2 1998 2.5 0.5
2.12 2 1999 3.0 0.5
2.16 2 2000 5.0 2.0
3.3 3 1997 3.0 0.0
3.8 3 1998 3.5 0.5
3.13 3 1999 4.0 0.5
3.17 3 2000 7.0 3.0
4.4 4 1997 4.0 0.0
4.9 4 1998 4.5 0.5
5.5 5 1997 5.0 0.0
5.10 5 1998 5.5 0.5
5.14 5 1999 6.0 0.5
5.18 5 2000 9.0 3.0
6 6 2000 15.0 0.0
Rnso's answer works. You could also do the slightly shorter:
>df <- df[order(df$Stem, df$Year),]
>df$diff <- unlist(tapply(df$Diameter,df$Stem, function(x) c(NA,diff(x))))
Stem Year Diameter diff
1 1 1997 1.0 NA
6 1 1998 1.5 0.5
11 1 1999 2.0 0.5
15 1 2000 3.0 1.0
2 2 1997 2.0 NA
7 2 1998 2.5 0.5
12 2 1999 3.0 0.5
16 2 2000 5.0 2.0
3 3 1997 3.0 NA
8 3 1998 3.5 0.5
13 3 1999 4.0 0.5
17 3 2000 7.0 3.0
4 4 1997 4.0 NA
9 4 1998 4.5 0.5
5 5 1997 5.0 NA
10 5 1998 5.5 0.5
14 5 1999 6.0 0.5
18 5 2000 9.0 3.0
19 6 2000 15.0 NA
Or if you're willing to use the data.table package you can be very succinct:
>require(data.table)
>DT <- data.table(df)
>setkey(DT,Stem)
>DT <- DT[,diff:= c(NA, diff(Diameter)), by = Stem]
>df <- as.data.frame(DT)
Stem Year Diameter diff
1 1 1997 1.0 NA
2 1 1998 1.5 0.5
3 1 1999 2.0 0.5
4 1 2000 3.0 1.0
5 2 1997 2.0 NA
6 2 1998 2.5 0.5
7 2 1999 3.0 0.5
8 2 2000 5.0 2.0
9 3 1997 3.0 NA
10 3 1998 3.5 0.5
11 3 1999 4.0 0.5
12 3 2000 7.0 3.0
13 4 1997 4.0 NA
14 4 1998 4.5 0.5
15 5 1997 5.0 NA
16 5 1998 5.5 0.5
17 5 1999 6.0 0.5
18 5 2000 9.0 3.0
19 6 2000 15.0 NA
If you have a large dataset, data.table has the advantage of being extremely fast.
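A base R variant of the same idea, offered as a minimal sketch rather than a definitive solution: ave() applies a function within each Stem and returns the values in the original row positions, so the result can be assigned straight to a new column. The column name growth is only an illustrative choice, and the sketch assumes that consecutive rows of a stem are one year apart once the data are sorted by Year.

```r
# Rebuild the example data from the question
df <- data.frame(Stem = c(1:5, 1:5, 1, 2, 3, 5, 1, 2, 3, 5, 6),
                 Year = c(rep(1997, 5), rep(1998, 5), rep(1999, 4), rep(2000, 5)),
                 Diameter = c(1:5, seq(1.5, 5.5, 1), 2, 3, 4, 6, 3, 5, 7, 9, 15))

# Sort so that each stem's rows are in chronological order
df <- df[order(df$Stem, df$Year), ]

# Within each stem: NA for the first record, then year-to-year differences
df$growth <- ave(df$Diameter, df$Stem, FUN = function(x) c(NA, diff(x)))
df
```

If a stem skips a year, diff() would span the gap, so dividing by diff(Year) within each stem may be worth considering in that case.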
I have the following data, where e_in is an exogenously given amount. ann is then an equal distribution of e_in; however, e_in can only be distributed downwards along a string (this is why 7 and 8 have ann=9 while 1 to 6 have ann=8.5).
e_in<-c(13,10,4,9,14,1,11,7)
ann<-c(8.5,8.5,8.5,8.5,8.5,8.5,9,9)
Dat_1<-data.frame(e_in,ann)
>Dat_1
e_in ann
1 13 8.5
2 10 8.5
3 4 8.5
4 9 8.5
5 14 8.5
6 1 8.5
7 11 9.0
8 7 9.0
I would now like to calculate how much e_in is available at each point down the string (shown as smn). So for 1 there is 13 e_in available, of which 1 will take 8.5. Number 2 will then have its own e_in plus whatever is sent downwards from 1 (here 10 + (13-8.5) = 14.5), and so on.
As the following:
smn<-c(13,14.5,10,10.5,16,8.5,11,9)
Dat_2<-data.frame(e_in,ann,smn)
>Dat_2
e_in ann smn
1 13 8.5 13.0
2 10 8.5 14.5
3 4 8.5 10.0
4 9 8.5 10.5
5 14 8.5 16.0
6 1 8.5 8.5
7 11 9.0 11.0
8 7 9.0 9.0
Is there any easy way/package for this sort of calculation?
(I have done it ‘by hand’ for this example, but it becomes significantly more time-consuming with longer strings.)
I think you just need the cumulative sum of e_in minus the lagged cumulative sum of ann:
Dat_1$smn <- cumsum(Dat_1$e_in) - cumsum(c(0, head(Dat_1$ann, -1)))
Dat_1
# e_in ann smn
# 1 13 8.5 13.0
# 2 10 8.5 14.5
# 3 4 8.5 10.0
# 4 9 8.5 10.5
# 5 14 8.5 16.0
# 6 1 8.5 8.5
# 7 11 9.0 11.0
# 8 7 9.0 9.0
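Why this works: the question's rule is smn[i] = e_in[i] + smn[i-1] - ann[i-1], and unrolling it gives the sum of e_in up to i minus the sum of ann up to i-1. A small sketch that checks the two forms against each other on the example data (the names smn_loop and smn_vec are only illustrative):

```r
e_in <- c(13, 10, 4, 9, 14, 1, 11, 7)
ann  <- c(8.5, 8.5, 8.5, 8.5, 8.5, 8.5, 9, 9)

# Step-by-step recurrence: own e_in plus whatever the previous position passed down
smn_loop <- numeric(length(e_in))
smn_loop[1] <- e_in[1]
for (i in seq_along(e_in)[-1]) {
  smn_loop[i] <- e_in[i] + smn_loop[i - 1] - ann[i - 1]
}

# Vectorised form: cumulative e_in minus cumulative ann lagged by one position
smn_vec <- cumsum(e_in) - cumsum(c(0, head(ann, -1)))

all.equal(smn_loop, smn_vec)
```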
I tried to think of a lag & window solution but couldn't so curious to see if anyone else managed it.
In place of that here's a loop that can do it:
Dat_1['smn'] = c(Dat_1[1, 'e_in'])
for (i in 2:nrow(Dat_1)){
Dat_1[i, 'smn'] <- Dat_1[i, 'e_in'] + Dat_1[i-1, 'smn'] - Dat_1[i-1, 'ann']
}
e_in ann smn
1 13 8.5 13.0
2 10 8.5 14.5
3 4 8.5 10.0
4 9 8.5 10.5
5 14 8.5 16.0
6 1 8.5 8.5
7 11 9.0 11.0
8 7 9.0 9.0
EDIT
Just seen Allan Cameron's answer which inspired me to correct it using dplyr
library(dplyr)
Dat_1 %>%
  mutate(
    smn = cumsum(e_in) - cumsum(lag(ann, n = 1L, default = 0))
  )
Same result
I have an output.txt file that results from an analysis I am doing in R. I would like to extract:
The << Output >> tables from the .txt file for each subject and combine them into an R data frame. The output column names are consistent between each of the subjects.
The same for the << Predicted Output >> and combine into a data frame in R.
As noted, the output file has written text in between that I don't need.
To make it easy for looking into the structure of the output.txt file, I have uploaded the .txt file on the following link here. I am also putting a screenshot below to show how the output is structured.
I attempted to do this using things like this but no luck:
df <- read.delim("ivivc_outputs.txt").
Try this to get started. Add more conditions if you need. I hope, this is helpful. If you need anything feel free to ask.
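# read the whole file as a character vector, one element per line;
# passing the path lets readLines() open and close the connection itself
b = readLines('ivivc_outputs.txt')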
n_out = 1
n_pred = 1
listOutput = list()
listPredictOutput = list()
for(i in 1:length(b)){
if(b[i] == "<< Output >>"){
a = strsplit(b[i+1], " ")[[1]]
a = a[a != ""]
# print(a)
df <- data.frame(matrix(ncol = length(a), nrow = 0))
colnames(df) = a
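# data rows sit at offsets 2..19 after the marker line;
# the stop value 20 is hard-coded for this file's 18-row tables
control = 2
while(control != 20){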
l = strsplit(b[i+control], " ")[[1]]
l = l[l != ""]
df[control-1,] = l
control = control + 1
}
listOutput[[n_out]] = df
n_out = n_out+1
}
if(b[i] == "<< Predicted Output >>"){
a = strsplit(b[i+1], " ")[[1]]
a = a[a != ""]
# print(a)
df <- data.frame(matrix(ncol = length(a), nrow = 0))
colnames(df) = a
# same hard-coded layout as above: 18 data rows at offsets 2..19
control = 2
while(control != 20){
l = strsplit(b[i+control], " ")[[1]]
l = l[l != ""]
df[control-1,] = l
control = control + 1
}
listPredictOutput[[n_pred]] = df
n_pred = n_pred+1
}
}
# to merge all data frames use `bind_rows` from `dplyr`
library(dplyr)
dfOutput = bind_rows(listOutput)
dfPredictOutput = bind_rows(listPredictOutput)
> dfPredictOutput
# pH subj formula. time FABpred conc.pred AUCpred
# 1 1.2 1 capsule 0.0 0.000000 0.00000 0.00000
# 2 1.2 1 capsule 1.0 2.528737 8.39300 4.19650
# 3 1.2 1 capsule 2.0 7.415708 22.57987 19.68293
# 4 1.2 1 capsule 3.0 15.845734 45.08950 53.51761
# 5 1.2 1 capsule 4.0 24.275759 62.14611 107.13542
# 6 1.2 1 capsule 5.0 33.133394 76.48998 176.45346
# 7 1.2 1 capsule 6.0 41.991029 87.35901 258.37796
# 8 1.2 1 capsule 6.5 45.900606 89.90799 302.69471
# 9 1.2 1 capsule 7.0 49.810183 92.12832 348.20378
# 10 1.2 1 capsule 7.5 53.719760 94.06237 394.75146
# 11 1.2 1 capsule 8.0 57.629337 95.74705 442.20381
# 12 1.2 1 capsule 8.0 57.629337 95.74705 442.20381
# 13 1.2 1 capsule 9.0 63.860225 93.23271 536.69369
# 14 1.2 1 capsule 10.0 70.091113 91.32747 628.97378
# 15 1.2 1 capsule 12.0 79.498532 79.70975 800.01101
# 16 1.2 1 capsule 16.0 90.372043 49.52751 1058.48553
# 17 1.2 1 capsule 20.0 101.245554 40.79704 1239.13462
# 18 1.2 1 capsule 24.0 107.354268 26.67212 1374.07293
# 19 1.2 1 capsuleContent 0.0 0.000000 0.000000 0.000000
# 20 1.2 1 capsuleContent 1.0 1.490256 4.946231 2.473115
# 21 1.2 1 capsuleContent 2.0 5.338746 16.521316 13.206889
# 22 1.2 1 capsuleContent 3.0 13.341161 39.079386 41.007240
# 23 1.2 1 capsuleContent 4.0 21.343576 56.172708 88.633287
# 24 1.2 1 capsuleContent 5.0 30.017950 71.355393 152.397337
# 25 1.2 1 capsuleContent 6.0 38.692324 82.860036 229.505052
# 26 1.2 1 capsuleContent 6.5 42.571357 85.881180 271.690356
# 27 1.2 1 capsuleContent 7.0 46.450390 88.512793 315.288850
# 28 1.2 1 capsuleContent 7.5 50.329423 90.805099 360.118323
# 29 1.2 1 capsuleContent 8.0 54.208457 92.801847 406.020060
# 30 1.2 1 capsuleContent 8.0 54.208457 92.801847 406.020060
# 31 1.2 1 capsuleContent 9.0 60.439345 91.000989 497.921478
# 32 1.2 1 capsuleContent 10.0 66.670233 89.636393 588.240168
# 33 1.2 1 capsuleContent 12.0 76.322001 79.472871 757.349432
# 34 1.2 1 capsuleContent 16.0 87.256599 49.607701 1015.510577
# 35 1.2 1 capsuleContent 20.0 98.191197 40.968947 1196.663873
# 36 1.2 1 capsuleContent 24.0 104.177736 26.424419 1331.450604
# 37 1.2 2 capsule 0.0 0.000000 0.00000 0.00000
# 38 1.2 2 capsule 1.0 2.528737 8.39300 4.19650
# 39 1.2 2 capsule 2.0 7.415708 22.57987 19.68293
# 40 1.2 2 capsule 3.0 15.845734 45.08950 53.51761
# 41 1.2 2 capsule 4.0 24.275759 62.14611 107.13542
# 42 1.2 2 capsule 5.0 33.133394 76.48998 176.45346
# 43 1.2 2 capsule 6.0 41.991029 87.35901 258.37796
# 44 1.2 2 capsule 6.5 45.900606 89.90799 302.69471
# 45 1.2 2 capsule 7.0 49.810183 92.12832 348.20378
# 46 1.2 2 capsule 7.5 53.719760 94.06237 394.75146
# 47 1.2 2 capsule 8.0 57.629337 95.74705 442.20381
# 48 1.2 2 capsule 8.0 57.629337 95.74705 442.20381
# 49 1.2 2 capsule 9.0 63.860225 93.23271 536.69369
# 50 1.2 2 capsule 10.0 70.091113 91.32747 628.97378
# 51 1.2 2 capsule 12.0 79.498532 79.70975 800.01101
# 52 1.2 2 capsule 16.0 90.372043 49.52751 1058.48553
# 53 1.2 2 capsule 20.0 101.245554 40.79704 1239.13462
# 54 1.2 2 capsule 24.0 107.354268 26.67212 1374.07293
# 55 1.2 2 capsuleContent 0.0 0.000000 0.000000 0.000000
# 56 1.2 2 capsuleContent 1.0 1.490256 4.946231 2.473115
# 57 1.2 2 capsuleContent 2.0 5.338746 16.521316 13.206889
# 58 1.2 2 capsuleContent 3.0 13.341161 39.079386 41.007240
# 59 1.2 2 capsuleContent 4.0 21.343576 56.172708 88.633287
# 60 1.2 2 capsuleContent 5.0 30.017950 71.355393 152.397337
# 61 1.2 2 capsuleContent 6.0 38.692324 82.860036 229.505052
# 62 1.2 2 capsuleContent 6.5 42.571357 85.881180 271.690356
# 63 1.2 2 capsuleContent 7.0 46.450390 88.512793 315.288850
# 64 1.2 2 capsuleContent 7.5 50.329423 90.805099 360.118323
# 65 1.2 2 capsuleContent 8.0 54.208457 92.801847 406.020060
# 66 1.2 2 capsuleContent 8.0 54.208457 92.801847 406.020060
# 67 1.2 2 capsuleContent 9.0 60.439345 91.000989 497.921478
# 68 1.2 2 capsuleContent 10.0 66.670233 89.636393 588.240168
# 69 1.2 2 capsuleContent 12.0 76.322001 79.472871 757.349432
# 70 1.2 2 capsuleContent 16.0 87.256599 49.607701 1015.510577
# 71 1.2 2 capsuleContent 20.0 98.191197 40.968947 1196.663873
# 72 1.2 2 capsuleContent 24.0 104.177736 26.424419 1331.450604
# 73 1.2 3 capsule 0.0 0.000000 0.00000 0.00000
# 74 1.2 3 capsule 1.0 2.528737 8.39300 4.19650
# 75 1.2 3 capsule 2.0 7.415708 22.57987 19.68293
# 76 1.2 3 capsule 3.0 15.845734 45.08950 53.51761
# 77 1.2 3 capsule 4.0 24.275759 62.14611 107.13542
# 78 1.2 3 capsule 5.0 33.133394 76.48998 176.45346
# 79 1.2 3 capsule 6.0 41.991029 87.35901 258.37796
# 80 1.2 3 capsule 6.5 45.900606 89.90799 302.69471
# 81 1.2 3 capsule 7.0 49.810183 92.12832 348.20378
# 82 1.2 3 capsule 7.5 53.719760 94.06237 394.75146
# 83 1.2 3 capsule 8.0 57.629337 95.74705 442.20381
# 84 1.2 3 capsule 8.0 57.629337 95.74705 442.20381
# 85 1.2 3 capsule 9.0 63.860225 93.23271 536.69369
# 86 1.2 3 capsule 10.0 70.091113 91.32747 628.97378
# 87 1.2 3 capsule 12.0 79.498532 79.70975 800.01101
# 88 1.2 3 capsule 16.0 90.372043 49.52751 1058.48553
# 89 1.2 3 capsule 20.0 101.245554 40.79704 1239.13462
# 90 1.2 3 capsule 24.0 107.354268 26.67212 1374.07293
# 91 1.2 3 capsuleContent 0.0 0.000000 0.000000 0.000000
# 92 1.2 3 capsuleContent 1.0 1.490256 4.946231 2.473115
# 93 1.2 3 capsuleContent 2.0 5.338746 16.521316 13.206889
# 94 1.2 3 capsuleContent 3.0 13.341161 39.079386 41.007240
# 95 1.2 3 capsuleContent 4.0 21.343576 56.172708 88.633287
# 96 1.2 3 capsuleContent 5.0 30.017950 71.355393 152.397337
# 97 1.2 3 capsuleContent 6.0 38.692324 82.860036 229.505052
# 98 1.2 3 capsuleContent 6.5 42.571357 85.881180 271.690356
# 99 1.2 3 capsuleContent 7.0 46.450390 88.512793 315.288850
# 100 1.2 3 capsuleContent 7.5 50.329423 90.805099 360.118323
# 101 1.2 3 capsuleContent 8.0 54.208457 92.801847 406.020060
# 102 1.2 3 capsuleContent 8.0 54.208457 92.801847 406.020060
# 103 1.2 3 capsuleContent 9.0 60.439345 91.000989 497.921478
# 104 1.2 3 capsuleContent 10.0 66.670233 89.636393 588.240168
# 105 1.2 3 capsuleContent 12.0 76.322001 79.472871 757.349432
# 106 1.2 3 capsuleContent 16.0 87.256599 49.607701 1015.510577
# 107 1.2 3 capsuleContent 20.0 98.191197 40.968947 1196.663873
# 108 1.2 3 capsuleContent 24.0 104.177736 26.424419 1331.450604
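The two near-identical branches in the loop above could be folded into one helper. The following is only a sketch under assumptions: read_block and extract_tables are hypothetical names, the number of data rows per table (n_rows) must be supplied by the caller, and the demo lines are invented to mimic the layout (marker line, header line, then data rows), not taken from the real file.

```r
# Parse one table: a header line followed by n_rows data lines
read_block <- function(lines, header_idx, n_rows) {
  block <- lines[header_idx:(header_idx + n_rows)]
  read.table(text = block, header = TRUE, stringsAsFactors = FALSE)
}

# Find every line equal to `marker` and stack the tables that follow
extract_tables <- function(lines, marker, n_rows) {
  starts <- which(trimws(lines) == marker)
  do.call(rbind, lapply(starts, function(i) read_block(lines, i + 1, n_rows)))
}

# Invented demo input with two small tables separated by free text
demo <- c("some text",
          "<< Output >>",
          "  pH subj time",
          "  1.2 1 0.0",
          "  1.2 1 1.0",
          "more text",
          "<< Output >>",
          "  pH subj time",
          "  1.2 2 0.0",
          "  1.2 2 1.0")
out <- extract_tables(demo, "<< Output >>", n_rows = 2)
out
```

With the real file, something like extract_tables(b, "<< Predicted Output >>", n_rows = 18) would correspond to the loop above, assuming every table has 18 rows. Unlike the row-by-row assignment, read.table() also converts numeric columns to numeric.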
I want to conditionally create a new variable equal to an existing one. My data looks like this:
id id2
1.1 1 1
1.2 2 2
1.3 3 3
1.4 4 4
1.5 NA 5
5.5 5 6
5.6 6 7
5.7 7 8
5.8 8 9
5.51 NA 10
9.9 9 11
9.10 10 12
9.11 11 13
9.4 NA 14
12.12 12 15
12.2 NA 16
13.13 13 17
13.14 14 18
13.15 15 19
13.16 16 20
How can I create a new variable id3 that equals id2 when id is missing? When id is not missing, id3 should be missing (NA).
id id2 id3
1.1 1 1
1.2 2 2
1.3 3 3
1.4 4 4
1.5 NA 5 5
5.5 5 6
5.6 6 7
5.7 7 8
5.8 8 9
5.51 NA 10 10
9.9 9 11
9.10 10 12
9.11 11 13
9.4 NA 14 14
12.12 12 15
12.2 NA 16 16
13.13 13 17
13.14 14 18
13.15 15 19
13.16 16 20
Thanks!!
Assuming that dat is your data frame, you can do the following based on ifelse in base R.
dat$id3 <- with(dat, ifelse(is.na(id), id2, NA))
Or
dat2 <- transform(dat, id3 = ifelse(is.na(id), id2, NA))
DATA
dat <- read.table(text = " id id2
1.1 1 1
1.2 2 2
1.3 3 3
1.4 4 4
1.5 NA 5
5.5 5 6
5.6 6 7
5.7 7 8
5.8 8 9
5.51 NA 10
9.9 9 11
9.10 10 12
9.11 11 13
9.4 NA 14
12.12 12 15
12.2 NA 16
13.13 13 17
13.14 14 18
13.15 15 19
13.16 16 20",
header = TRUE)
I have a data set that looks like this:
Group Year Height
1 1.0 2004-2005 27
2 1.0 2005-2006 32600
3 2.0 2005-2006 520
4 1.0 2006-2007 216
5 2.0 2006-2007 39059
6 1.0 2006-2007 428
7 1.0 2007-2008 10624
8 2.0 2007-2008 30391
9 3.0 2007-2008 7450
10 4.0 2007-2008 234
11 1.0 2008-2009 2487
12 2.0 2008-2009 170
13 3.0 2008-2009 2606
14 4.0 2008-2009 519
15 5.0 2008-2009 54857
16 1.0 2009-2010 2272
17 2.0 2009-2010 3592
18 3.0 2009-2010 4792
19 4.0 2009-2010 75292
20 5.0 2009-2010 7555
21 6.0 2009-2010 9185
22 2.0 2010-2011 2073
23 3.0 2010-2011 582
24 4.0 2010-2011 6248
25 5.0 2010-2011 215
26 6.0 2010-2011 9153
27 7.0 2010-2011 3831
28 3.5 2011 5560
29 4.5 2011 1396
I have created an index to remove certain groups as well as year classes. For example
logical= (mydata$Group != 1 & mydata$Group != 2 & mydata$Year !=2011)
mydata_wk = mydata[logical,]
Now I would like to plot this data. My problem is that when I plot it using the command below, the x-axis still shows the years that I removed via indexing. For example, the plot still shows 2011, which I removed using the index above. I have tried converting the subset with data.matrix, but that turns the years into numeric codes corresponding to the original data frame's Year column, which I do NOT want to plot. What I want is to plot just the values that are in the final data frame after indexing. Any thoughts on this?
plot(mydata_wk$Group, mydata_wk$Year)
Thank you in advance.
Is this the output you require?
mydata<-read.table(text=' Group ,Year, Height
1.0 ,2004-2005, 27
1.0 ,2005-2006, 32600
2.0 ,2005-2006, 520
1.0 ,2006-2007, 216
2.0 ,2006-2007, 39059
1.0 ,2006-2007, 428
1.0 ,2007-2008, 10624
2.0 ,2007-2008, 30391
3.0 ,2007-2008, 7450
4.0 ,2007-2008, 234
1.0 ,2008-2009, 2487
2.0 ,2008-2009, 170
3.0 ,2008-2009, 2606
4.0 ,2008-2009, 519
5.0 ,2008-2009, 54857
1.0 ,2009-2010, 2272
2.0 ,2009-2010, 3592
3.0 ,2009-2010, 4792
4.0 ,2009-2010, 75292
5.0 ,2009-2010, 7555
6.0 ,2009-2010, 9185
2.0 ,2010-2011, 2073
3.0 ,2010-2011, 582
4.0 ,2010-2011, 6248
5.0 ,2010-2011, 215
6.0 ,2010-2011, 9153
7.0 ,2010-2011, 3831
3.5 , 2011, 5560
4.5 , 2011, 1396',header=TRUE,sep=",",stringsAsFactors=FALSE)
# the NOT applies to all conditions; note that grepl('2011', ...) also matches '2010-2011'
logical= ( !( mydata$Group %in% c(1,2) | grepl('2011',mydata$Year) ) )
mydata_wk<-mydata[logical,]
> mydata_wk
Group Year Height
9 3 2007-2008 7450
10 4 2007-2008 234
13 3 2008-2009 2606
14 4 2008-2009 519
15 5 2008-2009 54857
18 3 2009-2010 4792
19 4 2009-2010 75292
20 5 2009-2010 7555
21 6 2009-2010 9185
> str(mydata_wk)
'data.frame': 9 obs. of 3 variables:
$ Group : num 3 4 3 4 5 3 4 5 6
$ Year : chr "2007-2008" "2007-2008" "2008-2009" "2008-2009" ...
$ Height: int 7450 234 2606 519 54857 4792 75292 7555 9185
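If Year was read in as a factor (which the symptom in the question suggests, though that is an assumption), the subset keeps every original level, and plot() draws all of them. A minimal sketch using four rows taken from the question's data, with droplevels() removing the unused levels; the name keep is only illustrative:

```r
# Four rows from the question's data, with Year deliberately stored as a factor
mydata <- data.frame(
  Group  = c(1, 3, 4, 3.5),
  Year   = factor(c("2004-2005", "2007-2008", "2007-2008", "2011")),
  Height = c(27, 7450, 234, 5560)
)

keep <- !(mydata$Group %in% c(1, 2)) & mydata$Year != "2011"
mydata_wk <- mydata[keep, ]

levels(mydata_wk$Year)   # unused levels are still attached to the subset

# Drop factor levels that no longer occur in the subset
mydata_wk <- droplevels(mydata_wk)
levels(mydata_wk$Year)
```

Reading the data with stringsAsFactors=FALSE, as in the answer above, avoids the issue in a different way by keeping Year as character.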
Hi,
I have a 10-year data set of dust concentration at 5-minute resolution,
and separately a 15-year data set of the synoptic classification at daily resolution.
How can I combine these two data sets? They do not have the same length or resolution.
Here is a sample of the data:
> head(synoptic)
date synoptic
1 01/01/1995 8
2 02/01/1995 7
3 03/01/1995 7
4 04/01/1995 20
5 05/01/1995 1
6 06/01/1995 1
>
head(beit.shemesh)
X........................ StWd SHT PRE GSR RH Temp WD WS PM10 CO O3
1 NA 64 19.8 0 -2.9 37 15.2 61 2.2 241 0.9 40.6
2 NA 37 20.1 0 1.1 38 15.2 344 2.1 241 0.9 40.3
3 NA 36 20.2 0 0.7 39 15.1 32 1.9 241 0.9 39.4
4 NA 52 20.1 0 0.9 40 14.9 20 2.1 241 0.9 38.7
5 NA 42 19.0 0 0.9 40 14.6 11 2.0 241 0.9 38.7
6 NA 75 19.9 0 0.2 40 14.5 341 1.3 241 0.9 39.1
No2 Nox No SO2 date
1 1.4 2.9 1.5 1.6 31/12/2000 24:00
2 1.7 3.1 1.4 0.9 01/01/2001 00:05
3 2.1 3.5 1.4 1.2 01/01/2001 00:10
4 2.7 4.2 1.5 1.3 01/01/2001 00:15
5 2.3 3.8 1.5 1.4 01/01/2001 00:20
6 2.8 4.3 1.5 1.3 01/01/2001 00:25
Any ideas?
Make an extra column for calculating the dates, and then merge. To do this, you have to generate a variable in each dataframe bearing the same name, hence you first need some renaming. Also make sure that the merge column you use has the same type in both dataframes :
beit.shemesh$datetime <- beit.shemesh$date
beit.shemesh$date <- as.Date(beit.shemesh$datetime,format="%d/%m/%Y")
synoptic$date <- as.Date(synoptic$date,format="%d/%m/%Y")
merge(synoptic, beit.shemesh,by="date",all.y=TRUE)
Using all.y=TRUE keeps the beit.shemesh dataset intact. If you also want empty rows for all non-matching rows in synoptic, you could use all=TRUE instead.
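A toy run of the same steps, as a sketch only: the two small data frames below are made up for illustration (dust stands in for beit.shemesh, and the synoptic values are invented), but the column handling mirrors the code above.

```r
# Invented daily table: one synoptic class per day
synoptic <- data.frame(date = c("31/12/2000", "01/01/2001"),
                       synoptic = c(8, 7),
                       stringsAsFactors = FALSE)

# Invented 5-minute table: several readings per day
dust <- data.frame(date = c("31/12/2000 23:55", "01/01/2001 00:05", "01/01/2001 00:10"),
                   PM10 = c(241, 241, 241),
                   stringsAsFactors = FALSE)

# Keep the full timestamp, then reduce `date` to the calendar day in both tables
dust$datetime <- dust$date
dust$date <- as.Date(dust$datetime, format = "%d/%m/%Y")
synoptic$date <- as.Date(synoptic$date, format = "%d/%m/%Y")

# Every 5-minute row picks up the synoptic class of its day
merged <- merge(synoptic, dust, by = "date", all.y = TRUE)
merged
```

as.Date() stops parsing once the format is matched, so the time of day after the date is ignored here.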