Related
My measured variable V1 follows cycles: it goes up to reach a maximum and down to reach a minimum. I call "cycle" the range of data points between 2 consecutive maxima (cycle 1 is maximum 1 - minimum 1 - maximum 2, cycle 2 is maximum 2 - minimum 2 - maximum 3). The minima and maxima of each cycle are different.
My 2 questions are:
how to identify the range of data points in V1 corresponding to each cycle?
how to extract all the minima and all the maxima in V1?
I have used ggplot to identify my minima and maxima using stat_peaks() and stat_valleys(). I want to find a way of doing it without plotting it, to apply it to many data frames.
library(ggplot2)
library(ggpmisc)
#I plotted my data to visualize the minima (in yellow) and maxima (in blue) with stat_peaks and stat_valleys.
plot <- ggplot(df, aes(x=V0, y=V1))+
geom_point()+
stat_peaks(color="yellow", span=61)+
stat_valleys(color="blue", span=101)
#I used the ggplot_build function to extract the values of the highlighted peaks and valleys.
pb <- ggplot_build(plot)
I wanted to identify the 10 largest values in pb for which colour == "yellow" and the 10 lowest values in pb for which colour == "blue" but it does not work because pb is not a dataframe.
dput(df[1:200, c(1,2)])
structure(list(V0 = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8,
0.9, 1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2, 2.1,
2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3, 3.1, 3.2, 3.3, 3.4,
3.5, 3.6, 3.7, 3.8, 3.9, 4, 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7,
4.8, 4.9, 5, 5.1, 5.2, 5.3, 5.4, 5.5, 5.6, 5.7, 5.8, 5.9, 6,
6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 6.7, 6.8, 6.9, 7, 7.1, 7.2, 7.3,
7.4, 7.5, 7.6, 7.7, 7.8, 7.9, 8, 8.1, 8.2, 8.3, 8.4, 8.5, 8.6,
8.7, 8.8, 8.9, 9, 9.1, 9.2, 9.3, 9.4, 9.5, 9.6, 9.7, 9.8, 9.9,
10, 10.1, 10.2, 10.3, 10.4, 10.5, 10.6, 10.7, 10.8, 10.9, 11,
11.1, 11.2, 11.3, 11.4, 11.5, 11.6, 11.7, 11.8, 11.9, 12, 12.1,
12.2, 12.3, 12.4, 12.5, 12.6, 12.7, 12.8, 12.9, 13, 13.1, 13.2,
13.3, 13.4, 13.5, 13.6, 13.7, 13.8, 13.9, 14, 14.1, 14.2, 14.3,
14.4, 14.5, 14.6, 14.7, 14.8, 14.9, 15, 15.1, 15.2, 15.3, 15.4,
15.5, 15.6, 15.7, 15.8, 15.9, 16, 16.1, 16.2, 16.3, 16.4, 16.5,
16.6, 16.7, 16.8, 16.9, 17, 17.1, 17.2, 17.3, 17.4, 17.5, 17.6,
17.7, 17.8, 17.9, 18, 18.1, 18.2, 18.3, 18.4, 18.5, 18.6, 18.7,
18.8, 18.9, 19, 19.1, 19.2, 19.3, 19.4, 19.5, 19.6, 19.7, 19.8,
19.9, 20), V1 = c(32.56, 31.97, 29.08, 27.34, 25.34, 22.58,
20.93, 17.93, 14.65, 12.2, 9.88, 7, 5.52, 3.96, 3.26, 2.76, 3.23,
3.38, 3.5, 3.67, 4.24, 7.1, 9.94, 14.58, 17.57, 21.64, 23.83,
27.28, 29.48, 33.13, 34.37, 36.74, 37.13, 36.52, 35.87, 36, 35.49,
33.81, 32.89, 30.47, 29.87, 27.84, 25.83, 23.31, 21.39, 18.63,
16.42, 12.9, 10.6, 7.43, 5.95, 4.52, 3.76, 2.61, 2.94, 3.42,
2.89, 3.38, 3.64, 4.2, 5.74, 9.48, 12.71, 17.46, 19.76, 23.93,
27.46, 31.99, 34.07, 40.37, 46.48, 42.89, 48.33, 56.99, 47.16,
43.53, 39.86, 37.48, 30.36, 26.01, 23.03, 20.57, 15.92, 13.87,
11.61, 8.58, 6.52, 4.79, 3.88, 2.9, 2.94, 3.22, 3.45, 3.66, 3.89,
6.01, 8.37, 12.83, 15.06, 18.68, 21.2, 24.12, 26.97, 28.48, 26.69,
37.06, 40.15, 39.36, 35.73, 35.61, 35.83, 35.14, 31.55, 30.05,
25.34, 24.24, 23.4, 21.09, 18.32, 16.04, 13.18, 10.07, 8.23,
5.78, 4.71, 3.44, 3.48, 3.71, 3.72, 3.9, 4.56, 6.93, 9.3, 14.04,
14.66, 16.25, 18.43, 20.76, 21.86, 23.87, 26.63, 24.85, 29.98,
26.67, 26.99, 27.36, 25.08, 25.24, 26.48, 24.1, 22.66, 22.28,
23.29, 21.87, 21.02, 19.53, 22.75, 22.04, 20.64, 19.05, 19.4,
21, 18.93, 25.38, 23.59, 21.48, 21.9, 23.75, 23.38, 25.06, 25.2,
26.38, 25.22, 28.62, 27.38, 34.16, 35.94, 34.03, 28.95, 24.33,
24.76, 25.56, 24.96, 21.99, 23.53, 23.76, 24.5, 22.39, 23.01,
23.42, 24, 22.65, 21.44, 22.15, 21.72, 18.46, 17.65, 15.34, 16.11,
14.93)), row.names = c(NA, 200L), class = "data.frame")
You can add a variable to your data frame that labels the maxima and minima quite easily with the following line:
df$is_min_max <- c(FALSE, diff(as.numeric(diff(df$V1) > 0)) != 0, FALSE)
I'll explain how this works:
You can find out the difference between consecutive points in your data by doing
diff(df$V1)
so you can see where your data are going up or down by doing
as.numeric(diff(df$V1) > 0)
Which will give you a 1 between two points on an upward gradient and 0 on a downward gradient. So if you do
diff(as.numeric(diff(df$V1) > 0))
You will get a +1 or -1 at the points where the direction changes.
So if you do:
diff(as.numeric(diff(df$V1) > 0)) != 0
You will get a logical vector of the points that are local maxima and minima. Note the start and end points have been removed because we have double-diffed. Therefore we need to add a FALSE on at either end:
c(FALSE, diff(as.numeric(diff(df$V1) > 0)) != 0, FALSE)
So we could add this to your data frame as
df$is_min_max <- c(FALSE, diff(as.numeric(diff(df$V1) > 0)) != 0, FALSE)
You haven't included the actual data in your example, so I will show an example here using a simple sine wave:
df <- data.frame(x = seq(1, 20, 0.1), V1 = sin(seq(1, 20, 0.1)))
plot(df$x, df$V1)
And now we can just find our local maxima and minima...
df$is_min_max <- c(FALSE, diff(as.numeric(diff(df$V1) > 0)) != 0, FALSE)
And plot them:
points(df$x[df$is_min_max], df$V1[df$is_min_max], col = "red", cex = 3 )
Note that this will show up every change in direction, so if there are local "wobbles" in your data you will find maxima and minima there too. Removing these is possible but a little more complex.
Created on 2020-02-27 by the reprex package (v0.3.0)
The coordinates of peaks and valleys are contained in pb:
The peaks are in
pb$data[[2]]
xintercept yintercept label x y PANEL group x.label y.label shape colour size fill alpha stroke
1 7.9 0.9989413 7.9 7.9 0.9989413 1 -1 7.9 0.9989 19 yellow 1.5 NA NA 0.5
2 14.1 0.9993094 14.1 14.1 0.9993094 1 -1 14.1 0.9993 19 yellow 1.5 NA NA 0.5
The valleys are in
pb$data[[3]]
xintercept yintercept label x y PANEL group x.label y.label shape colour size fill alpha stroke
1 11 -0.9999902 11 11 -0.9999902 1 -1 11 -1 19 blue 1.5 NA NA 0.5
Note that the order of list elements may change depending on the order of ggplot function calls (layers).
Also note that the sample data provided by the OP is too small with respect to the spans given in calls stat_peaks(color="yellow", span=61) and stat_valleys(color="blue", span=101), resp.
Therefore, I have used the sample data from Allan's answer:
df <- data.frame(V0 = seq(1, 20, 0.1), V1 = sin(seq(1, 20, 0.1)))
which highlights two peaks and one valley using OP's code:
library(ggplot2)
library(ggpmisc)
plot <- ggplot(df, aes(x=V0, y=V1))+
geom_point()+
stat_peaks(color="yellow", span=61)+
stat_valleys(color="blue", span=101)
plot
This is my data.frame:
tsdata=structure(list(S.Educ = c(228.3000255, 237.2500067, 248.2500235,
235.8500022, 243.3000045, 254.9999758, 258.1000174, 261.1000144,
268.3000056, 277.1000373, 249.2000057, 234.7000229, 239.8500266,
0.66159, 0.67453, 0.64213, 0.6248, 0.65231, 228.3000255, 237.2500067,
248.2500235, 235.8500022, 243.3000045, 254.9999758, 258.1000174,
261.1000144, 268.3000056, 277.1000373, 249.2000057, 234.7000229,
239.8500266, 0.76481, 0.81037, 0.82068, 0.83542, 0.86355, 0.88456,
0.92678, 0.80906, 228.3000255, 237.2500067, 248.2500235, 235.8500022,
243.3000045, 254.9999758, 258.1000174, 261.1000144, 268.3000056
), i.Educ.A = c(6.76, 6.53, 6.93, 6.99, 7.16, 7.46, 7.16, 7.49,
7.43, 7.06, 7.43, 6.96, 6.68, 11.36, 10.86, 10.31, 10.46, 9.76,
6.76, 6.53, 6.93, 6.99, 7.16, 7.46, 7.16, 7.49, 7.43, 7.06, 7.43,
6.96, 6.68, 10.88, 10.81, 10.63, 9.75, 10.13, 12.44, 13.88, 13.25,
6.76, 6.53, 6.93, 6.99, 7.16, 7.46, 7.16, 7.49, 7.43), i.Globi.A = c(14.76,
15.16, 15.56, 14.96, 14.46, 15.96, 13.16, 11.56, 11.56, 10.11,
11.01, 9.36, 9.44, 8.86, 9.69, 9.06, 9.56, 9.81, 14.76, 15.16,
15.56, 14.96, 14.46, 15.96, 13.16, 11.56, 11.56, 10.11, 11.01,
9.36, 9.44, 12, 11.31, 10.06, 9.19, 8.72, 8.54, 9.56, 9.13, 14.76,
15.16, 15.56, 14.96, 14.46, 15.96, 13.16, 11.56, 11.56), i.Educ.B = c(6.76,
6.53, 6.93, 6.99, 7.16, 7.46, 7.16, 7.49, 7.43, 7.06, 7.43, 6.96,
6.68, 11.36, 10.86, 10.31, 10.46, 9.76, 6.76, 6.53, 6.93, 6.99,
7.16, 7.46, 7.16, 7.49, 7.43, 7.06, 7.43, 6.96, 6.68, 10.88,
10.81, 10.63, 9.75, 10.13, 12.44, 13.88, 13.25, 6.76, 6.53, 6.93,
6.99, 7.16, 7.46, 7.16, 7.49, 7.43), i.Globi.B = c(14.76, 15.16,
15.56, 14.96, 14.46, 15.96, 13.16, 11.56, 11.56, 10.11, 11.01,
9.36, 9.44, 8.86, 9.69, 9.06, 9.56, 9.81, 14.76, 15.16, 15.56,
14.96, 14.46, 15.96, 13.16, 11.56, 11.56, 10.11, 11.01, 9.36,
9.44, 12, 11.31, 10.06, 9.19, 8.72, 8.54, 9.56, 9.13, 14.76,
15.16, 15.56, 14.96, 14.46, 15.96, 13.16, 11.56, 11.56), S.Torto = c(228.3000255,
237.2500067, 248.2500235, 235.8500022, 243.3000045, 254.9999758,
258.1000174, 261.1000144, 268.3000056, 277.1000373, 249.2000057,
234.7000229, 239.8500266, 237.850026, 239.0000239, 238.3000059,
238.8500041, 239.280016, 241.7499798, 246.4500108, 235.6499768,
234.0500179, 232.5000145, 231.6999874, 234.6700059, 233.4299733,
224.7500106, 227.0000177, 231.5800102, 237.3000006, 245.2999909,
241.8999792, 246.8999855, 245.7000042, 247.5000025, 251.5999872,
254.7999858, 259.4500127, 250.9999839, 251.5000089, 251.1999823,
248.3999935, 236.4499965, 239.0000239, 216.4999833, 211.5499951,
202.0999805, 200.2499921), i.Torto.A = c(6.76, 6.53, 6.93, 6.99,
7.16, 7.46, 7.16, 7.49, 7.43, 7.06, 7.43, 6.96, 6.68, 6.81, 6.56,
6.31, 6.81, 6.56, 6.66, 6.76, 6.75, 6.37, 6.9, 6.37, 6.4, 6.5,
6.5, 6.5, 6.5, 6.3, 6.31, 6.31, 6.38, 6.44, 6.44, 6.31, 6.38,
6.5, 6.44, 6.38, 6.38, 6.38, 6.38, 6.44, 6.56, 7.81, 7.94, 7.5
), i.Globi.A = c(14.76, 15.16, 15.56, 14.96, 14.46, 15.96, 13.16,
11.56, 11.56, 10.11, 11.01, 9.36, 9.44, 8.86, 9.69, 9.06, 9.56,
9.81, 10.36, 10.36, 9.6, 9.62, 9.9, 9.87, 9.75, 9.13, 9.13, 11.75,
11.75, 11.6, 11.94, 12, 11.31, 10.06, 9.19, 8.72, 8.54, 9.56,
9.13, 8.75, 7.88, 7.88, 8.31, 8.06, 8.19, 8.06, 8.19, 8.06),
i.Torto.B = c(6.76, 6.53, 6.93, 6.99, 7.16, 7.46, 7.16, 7.49,
7.43, 7.06, 7.43, 6.96, 6.68, 6.81, 6.56, 6.31, 6.81, 6.56,
6.66, 6.76, 6.75, 6.37, 6.9, 6.37, 6.4, 6.5, 6.5, 6.5, 6.5,
6.3, 6.31, 6.31, 6.38, 6.44, 6.44, 6.31, 6.38, 6.5, 6.44,
6.38, 6.38, 6.38, 6.38, 6.44, 6.56, 7.81, 7.94, 7.5), i.Globi.B = c(14.76,
15.16, 15.56, 14.96, 14.46, 15.96, 13.16, 11.56, 11.56, 10.11,
11.01, 9.36, 9.44, 8.86, 9.69, 9.06, 9.56, 9.81, 10.36, 10.36,
9.6, 9.62, 9.9, 9.87, 9.75, 9.13, 9.13, 11.75, 11.75, 11.6,
11.94, 12, 11.31, 10.06, 9.19, 8.72, 8.54, 9.56, 9.13, 8.75,
7.88, 7.88, 8.31, 8.06, 8.19, 8.06, 8.19, 8.06), S.Pub = c(7.6025,
7.9875, 8.228, 7.96, 7.99, 8.51, 8.57, 8.715, 8.8475, 9.01,
8.67, 8.38, 8.655, 8.67, 234.7000229, 239.8500266, 237.850026,
239.0000239, 238.3000059, 238.8500041, 239.280016, 241.7499798,
246.4500108, 235.6499768, 234.0500179, 232.5000145, 231.6999874,
9.971, 10.049, 10.2025, 10.62, 10.5125, 11.1175, 10.9685,
11.186, 11.285, 11.2925, 11.96, 234.7000229, 239.8500266,
237.850026, 239.0000239, 238.3000059, 238.8500041, 239.280016,
241.7499798, 246.4500108, 235.6499768), i.Pub.A = c(15.56,
15.66, 16.46, 20.06, 19.66, 15.46, 14.06, 14.06, 20.06, 21.31,
22.31, 19.66, 18.66, 17.16, 6.96, 6.68, 6.81, 6.56, 6.31,
6.81, 6.56, 6.66, 6.76, 6.75, 6.37, 6.9, 6.37, 10.75, 10.75,
11.25, 11.62, 11.88, 11.5, 11.88, 12, 12, 12, 11.25, 6.96,
6.68, 6.81, 6.56, 6.31, 6.81, 6.56, 6.66, 6.76, 6.75), i.Globi.A = c(14.76,
15.16, 15.56, 14.96, 14.46, 15.96, 13.16, 11.56, 11.56, 10.11,
11.01, 9.36, 9.44, 8.86, 9.36, 9.44, 8.86, 9.69, 9.06, 9.56,
9.81, 10.36, 10.36, 9.6, 9.62, 9.9, 9.87, 11.75, 11.75, 11.6,
11.94, 12, 11.31, 10.06, 9.19, 8.72, 8.54, 9.56, 9.36, 9.44,
8.86, 9.69, 9.06, 9.56, 9.81, 10.36, 10.36, 9.6), i.Pub.B = c(15.56,
15.66, 16.46, 20.06, 19.66, 15.46, 14.06, 14.06, 20.06, 21.31,
22.31, 19.66, 18.66, 17.16, 6.96, 6.68, 6.81, 6.56, 6.31,
6.81, 6.56, 6.66, 6.76, 6.75, 6.37, 6.9, 6.37, 10.75, 10.75,
11.25, 11.62, 11.88, 11.5, 11.88, 12, 12, 12, 11.25, 6.96,
6.68, 6.81, 6.56, 6.31, 6.81, 6.56, 6.66, 6.76, 6.75), i.Globi.B = c(14.76,
15.16, 15.56, 14.96, 14.46, 15.96, 13.16, 11.56, 11.56, 10.11,
11.01, 9.36, 9.44, 8.86, 9.36, 9.44, 8.86, 9.69, 9.06, 9.56,
9.81, 10.36, 10.36, 9.6, 9.62, 9.9, 9.87, 11.75, 11.75, 11.6,
11.94, 12, 11.31, 10.06, 9.19, 8.72, 8.54, 9.56, 9.36, 9.44,
8.86, 9.69, 9.06, 9.56, 9.81, 10.36, 10.36, 9.6), S.Hemot = c(228.3000255,
237.2500067, 248.2500235, 235.8500022, 243.3000045, 254.9999758,
258.1000174, 261.1000144, 268.3000056, 277.1000373, 249.2000057,
234.7000229, 239.8500266, 237.850026, 239.0000239, 238.3000059,
238.8500041, 239.280016, 241.7499798, 246.4500108, 235.6499768,
234.0500179, 232.5000145, 231.6999874, 234.6700059, 233.4299733,
224.7500106, 227.0000177, 231.5800102, 237.3000006, 245.2999909,
241.8999792, 246.8999855, 245.7000042, 247.5000025, 251.5999872,
254.7999858, 259.4500127, 250.9999839, 251.5000089, 251.1999823,
248.3999935, 236.4499965, 239.0000239, 216.4999833, 211.5499951,
202.0999805, 200.2499921), i.Hemot.A = c(13.8595, 14.069,
15.1918, 16.7617, 234.7000229, 239.8500266, 237.850026, 239.0000239,
238.3000059, 238.8500041, 239.280016, 241.7499798, 246.4500108,
235.6499768, 234.0500179, 232.5000145, 231.6999874, 13.3219,
13.2862, 13.19, 12.9323, 13.2048, 13.1795, 13.0952, 13.0033,
12.9988, 12.9988, 12.8194, 12.8194, 12.65, 12.476, 12.88,
13.3279, 13.36, 13.2716, 13.1788, 12.8091, 12.49, 12.733,
12.7595, 12.668, 12.6874, 12.6965, 12.6845, 12.6471, 12.4939,
12.5433, 12.555), i.Globi.A = c(14.76, 15.16, 15.56, 14.96,
6.96, 6.68, 6.81, 6.56, 6.31, 6.81, 6.56, 6.66, 6.76, 6.75,
6.37, 6.9, 6.37, 9.81, 10.36, 10.36, 9.6, 9.62, 9.9, 9.87,
9.75, 9.13, 9.13, 11.75, 11.75, 11.6, 11.94, 12, 11.31, 10.06,
9.19, 8.72, 8.54, 9.56, 9.13, 8.75, 7.88, 7.88, 8.31, 8.06,
8.19, 8.06, 8.19, 8.06), i.Hemot.B = c(13.8595, 14.069, 15.1918,
16.7617, 234.7000229, 239.8500266, 237.850026, 239.0000239,
238.3000059, 238.8500041, 239.280016, 241.7499798, 246.4500108,
235.6499768, 234.0500179, 232.5000145, 231.6999874, 13.3219,
13.2862, 13.19, 12.9323, 13.2048, 13.1795, 13.0952, 13.0033,
12.9988, 12.9988, 12.8194, 12.8194, 12.65, 12.476, 12.88,
13.3279, 13.36, 13.2716, 13.1788, 12.8091, 12.49, 12.733,
12.7595, 12.668, 12.6874, 12.6965, 12.6845, 12.6471, 12.4939,
12.5433, 12.555), i.Globi.B = c(14.76, 15.16, 15.56, 14.96,
6.96, 6.68, 6.81, 6.56, 6.31, 6.81, 6.56, 6.66, 6.76, 6.75,
6.37, 6.9, 6.37, 9.81, 10.36, 10.36, 9.6, 9.62, 9.9, 9.87,
9.75, 9.13, 9.13, 11.75, 11.75, 11.6, 11.94, 12, 11.31, 10.06,
9.19, 8.72, 8.54, 9.56, 9.13, 8.75, 7.88, 7.88, 8.31, 8.06,
8.19, 8.06, 8.19, 8.06)), .Names = c("S.Educ", "i.Educ.A",
"i.Globi.A", "i.Educ.B", "i.Globi.B", "S.Torto", "i.Torto.A",
"i.Globi.A", "i.Torto.B", "i.Globi.B", "S.Pub", "i.Pub.A", "i.Globi.A",
"i.Pub.B", "i.Globi.B", "S.Hemot", "i.Hemot.A", "i.Globi.A",
"i.Hemot.B", "i.Globi.B"), row.names = c(NA, -48L), class = "data.frame")
I have 4 countries: c("Educ","Torto","Pub","Hemot")
For each country I need to run a regression. For example, for Educ country:
Reg.Educ.A = lm (data.country $ S.Educ ~ data.country $ i.Educ.A - data.country $ i.Globi.A)
Reg.Educ.B = lm (data.country $ S.Educ ~ data.country $ i.Educ.B - data.country $ i.Globi.B)
I'm not very good at R. What I got to do was this, but it is not working.
I need to use the If/else statements.
The idea is: "If the country is "Educ" I will work with data.countra=tsdata[,1:5]. If the country is "Torto" I will work with data.country=tsdata[,6:10], and so on until the fourth country which is "Hemot".
The code below was the maximum I could do and it is wrong. I do not know how to use it "if/else" statements:
country <- c("Educ", "Torto", "Pub", "Hemot")
coef.matrix=matrix(0,4,4)
for (i in 1:4){
if (country[i]=="Educ"){
data.country=tsdata[,1:5]
coef.matrix[1,1]=summary(lm(data.country$S.Educ~data.country$i.Educ.A - data.country$i.Globi.A))$coef[1,1]
coef.matrix[1,2]=summary(lm(data.country$S.Educ~data.country$i.Educ.A - data.country$i.Globi.A))$coef[2,1]
coef.matrix[1,3]=summary(lm(data.country$S.Educ~data.country$i.Educ.B - data.country$i.Globi.B))$coef[1,1]
coef.matrix[1,4]=summary(lm(data.country$S.Educ~data.country$i.Educ.B - data.country$i.Globi.B))$coef[2,1]
}
if (country[i]=="Torto"){
data.country=tsdata[,6:10]
coef.matrix[2,1]=summary(lm(data.country$S.Torto~data.country$i.Torto.A - data.country$i.Globi.A))$coef[1,1]
coef.matrix[2,2]=summary(lm(data.country$S.Torto~data.country$i.Torto.A - data.country$i.Globi.A))$coef[2,1]
coef.matrix[2,3]=summary(lm(data.country$S.Torto~data.country$i.Torto.B - data.country$i.Globi.B))$coef[1,1]
coef.matrix[2,4]=summary(lm(data.country$S.Torto~data.country$i.Torto.B - data.country$i.Globi.B))$coef[2,1]
}
if (country[i]=="Pub"){
data.country=tsdata[,11:15]
coef.matrix[3,1]=summary(lm(data.country$S.Pub~data.country$i.Pub.A - data.country$i.Globi.A))$coef[1,1]
coef.matrix[3,2]=summary(lm(data.country$S.Pub~data.country$i.Pub.A - data.country$i.Globi.A))$coef[2,1]
coef.matrix[3,3]=summary(lm(data.country$S.Pub~data.country$i.Pub.B - data.country$i.Globi.B))$coef[1,1]
coef.matrix[3,4]=summary(lm(data.country$S.Pub~data.country$i.Pub.B - data.country$i.Globi.B))$coef[2,1]
}
if (country[i]=="Hemot"){
data.country=tsdata[,16:20]
coef.matrix[4,1]=summary(lm(data.country$S.Hemot~data.country$i.Hemot.A - data.country$i.Globi.A))$coef[1,1]
coef.matrix[4,2]=summary(lm(data.country$S.Hemot~data.country$i.Hemot.A - data.country$i.Globi.A))$coef[2,1]
coef.matrix[4,3]=summary(lm(data.country$S.Hemot~data.country$i.Hemot.B - data.country$i.Globi.B))$coef[1,1]
coef.matrix[4,4]=summary(lm(data.country$S.Hemot~data.country$i.Hemot.B - data.country$i.Globi.B))$coef[2,1]
}
}
Any help would be great.
This is simpler if you use long-format data. For example, with data.table...
# fix data
library(data.table)
DT = melt(setDT(tsdata),
meas = split(seq_along(tsdata), 1:5),
value.name = c("S", "iA", "iGA", "iB", "iGB"),
variable.name = "country"
)
# run models, extracting coefficients
DT[, c(
A = as.list(coef(lm(S ~ iA - iGA))),
B = as.list(coef(lm(S ~ iB - iGB)))
), by=country]
country A.(Intercept) A.iA B.(Intercept) B.iB
1: 1 613.7994 -52.85050685 613.7994 -52.85050685
2: 2 277.2805 -5.47557795 277.2805 -5.47557795
3: 3 333.4474 -19.39535189 333.4474 -19.39535189
4: 4 236.4338 0.05428531 236.4338 0.05428531
I want to analyze my data as here with factor analysis and PCA.
It works so far, but what I did figure out is the following.
How can I perform a varimax rotation and visualize the rotated matrix in the correlation circle?
res.pca <- prcomp(decathlon2.active, scale = TRUE)
my.var <- varimax(res.pca$rotation)
res.pca is a prcomp object, my.var is a list, therefore I cannot use it for the plots as described in the article.
Any ideas?
Edit (output of dput(decathlon2.active)):
> data(decathlon2)
> decathlon2.active <- decathlon2[1:23, 1:10]
> dput(decathlon2.active)
structure(list(X100m = c(11.04, 10.76, 11.02, 11.34, 11.13, 10.83,
11.64, 11.37, 11.33, 11.33, 11.36, 10.85, 10.44, 10.5, 10.89,
10.62, 10.91, 10.97, 10.69, 10.98, 10.95, 10.9, 11.14), Long.jump = c(7.58,
7.4, 7.23, 7.09, 7.3, 7.31, 6.81, 7.56, 6.97, 7.27, 6.8, 7.84,
7.96, 7.81, 7.47, 7.74, 7.14, 7.19, 7.48, 7.49, 7.31, 7.3, 6.99
), Shot.put = c(14.83, 14.26, 14.25, 15.19, 13.48, 13.76, 14.57,
14.41, 14.09, 12.68, 13.46, 16.36, 15.23, 15.93, 15.73, 14.48,
15.31, 14.65, 14.8, 14.01, 15.1, 14.77, 14.91), High.jump = c(2.07,
1.86, 1.92, 2.1, 2.01, 2.13, 1.95, 1.86, 1.95, 1.98, 1.86, 2.12,
2.06, 2.09, 2.15, 1.97, 2.12, 2.03, 2.12, 1.94, 2.06, 1.88, 1.94
), X400m = c(49.81, 49.37, 48.93, 50.42, 48.62, 49.91, 50.14,
51.1, 49.48, 49.2, 51.16, 48.36, 49.19, 46.81, 48.97, 47.97,
49.4, 48.73, 49.13, 49.76, 50.79, 50.3, 49.41), X110m.hurdle = c(14.69,
14.05, 14.99, 15.31, 14.17, 14.38, 14.93, 15.06, 14.48, 15.29,
15.67, 14.05, 14.13, 13.97, 14.56, 14.01, 14.95, 14.25, 14.17,
14.25, 14.21, 14.34, 14.37), Discus = c(43.75, 50.72, 40.87,
46.26, 45.67, 44.41, 47.6, 44.99, 42.1, 37.92, 40.49, 48.72,
50.11, 51.65, 48.34, 43.73, 45.62, 44.72, 44.75, 42.43, 44.6,
44.41, 44.83), Pole.vault = c(5.02, 4.92, 5.32, 4.72, 4.42, 4.42,
4.92, 4.82, 4.72, 4.62, 5.02, 5, 4.9, 4.6, 4.4, 4.9, 4.7, 4.8,
4.4, 5.1, 5, 5, 4.6), Javeline = c(63.19, 60.15, 62.77, 63.44,
55.37, 56.37, 52.33, 57.19, 55.4, 57.44, 54.68, 70.52, 69.71,
55.54, 58.46, 55.39, 63.45, 57.76, 55.27, 56.32, 53.45, 60.89,
64.55), X1500m = c(291.7, 301.5, 280.1, 276.4, 268, 285.1, 262.1,
285.1, 282, 266.6, 291.7, 280.01, 282, 278.11, 265.42, 278.05,
269.54, 264.35, 276.31, 273.56, 287.63, 278.82, 267.09)), .Names = c("X100m",
"Long.jump", "Shot.put", "High.jump", "X400m", "X110m.hurdle",
"Discus", "Pole.vault", "Javeline", "X1500m"), row.names = c("SEBRLE",
"CLAY", "BERNARD", "YURKOV", "ZSIVOCZKY", "McMULLEN", "MARTINEAU",
"HERNU", "BARRAS", "NOOL", "BOURGUIGNON", "Sebrle", "Clay", "Karpov",
"Macey", "Warners", "Zsivoczky", "Hernu", "Bernard", "Schwarzl",
"Pogorelov", "Schoenbeck", "Barras"), class = "data.frame")
I followed these steps to plot the results of a piecewise linear regression with one breakpoint which I have done by segmented package:
lin.mod <- lm(ChH~CL)
segmented.mod <- segmented(lin.mod, seg.Z=~CL)
data1 <- data.frame(x = CL, y = ChH)
data2 <- data.frame(x = CL, y = broken.line(segmented.mod)$fit)
ggplot(data1, aes(x = CL, y = ChH)) +
geom_point() +
geom_line(data = data2, color = 'blue')
and I get this plot which does not show two lines with a breakpoint!!!
How should I change my codes to get the correct plot?
This is my dataset:
(ChH has 11 missing data)
CL <- c(9.26, 9.38, 9.41, 9.44, 9.52, 9.58, 9.74, 9.91, 10.03, 10.22,
10.23, 10.4, 10.92, 11.15, 11.38, 11.77, 11.79, 12, 12.45, 12.5,
12.54, 12.79, 12.98, 13.04, 13.04, 13.54, 14.26, 14.33, 14.4,
14.56, 14.77, 14.83, 15.14, 15.19, 15.21, 15.46, 15.61, 15.62,
15.82, 15.87, 16.02, 16.04, 16.05, 16.07, 16.26, 16.32, 16.33,
16.41, 16.53, 16.57, 16.63, 16.64, 16.68, 16.76, 16.87, 17.13,
17.2, 17.37, 17.47, 17.49, 17.68, 17.72, 18.04, 18.1, 18.14,
18.16, 18.18, 18.18, 18.18, 18.22, 18.42, 18.55, 18.63, 18.72,
18.75, 18.77, 18.84, 19, 19.03, 19.3, 19.34, 19.35, 19.36, 19.46,
19.58, 19.61, 19.64, 19.7, 19.73, 19.76, 19.85, 19.85, 19.89,
19.93, 19.97, 20.1, 20.13, 20.16, 20.16, 20.22, 20.26, 20.29,
20.31, 20.31, 20.37, 20.43, 20.46, 20.47, 20.61, 20.64, 20.65,
20.66, 20.78, 20.85, 20.85, 20.88, 20.9, 20.98, 21, 21.02, 21.23,
21.26, 21.29, 21.33, 21.39, 21.4, 21.41, 21.45, 21.5, 21.5, 21.58,
21.6, 21.76, 21.85, 21.9, 22.1, 22.12, 22.14, 22.17, 22.2, 22.21,
22.23, 22.24, 22.3, 22.4, 22.42, 22.43, 22.46, 22.47, 22.48,
22.5, 22.68, 22.7, 22.7, 22.75, 22.8, 22.85, 22.89, 22.89, 22.92,
22.93, 22.94, 22.99, 23.19, 23.3, 23.33, 23.42, 23.51, 23.53,
23.67, 23.7, 23.7, 23.72, 23.72, 23.76, 23.77, 23.78, 23.91,
24.05, 24.05, 24.06, 24.08, 24.11, 24.16, 24.17, 24.2, 24.21,
24.3, 24.38, 24.38, 24.43, 24.49, 24.62, 24.89, 24.89, 24.91,
24.92, 24.95, 24.95, 25.07, 25.1, 25.11, 25.13, 25.13, 25.16,
25.28, 25.3, 25.32, 25.42, 25.43, 25.47, 25.6, 25.71, 25.87,
25.92, 25.94, 25.96, 26.14, 26.18, 26.22, 26.32, 26.33, 26.36,
26.43, 26.6, 26.69, 26.73, 26.73, 26.82, 26.83, 26.86, 27, 27,
27.08, 27.09, 27.1, 27.14, 27.23, 27.24, 27.27, 27.3, 27.55,
27.56, 27.81, 27.9, 27.94, 27.94, 27.98, 28.03, 28.03, 28.17,
28.18, 28.2, 28.49, 28.55, 28.7, 28.76, 28.88, 29.07, 29.13,
29.23, 29.43, 29.63, 29.71, 29.75, 29.97, 30.8, 30.87, 31.27,
31.28, 31.33, 31.45, 31.61, 31.64, 31.68, 32.11, 32.91, 33, 33.6,
34.04, 35.04, 36.05, 36.85)
And:
ChH <- c(2.76, 3.03, 2.86, 2.86, 2.99, 3, 2.96, 3.17, 3.12, 3.27, 3.21,
3.08, 3.53, 3.6, 8.7, 3.75, 3.87, 4.17, 4.38, 4.23, 4.04, 4.24,
4.36, 4.2, 8.78, 4.17, 5.02, 5.22, 5.06, 4.9, NA, 5.3, 5.16,
5.51, 4.25, 5.3, 5.25, 5.65, 5.52, 5.57, 5.5, 5.48, 6.14, 4.65,
5.75, 5.41, 5.42, 5.73, 5.63, 5.85, 6.09, 6.05, 5.88, 5.97, 6.64,
5.18, 6.51, 6.38, 6.27, 6.09, 6.62, 6.3, 4.2, 7.13, NA, 5.85,
6.83, 6.75, 6.94, 6.73, 6.23, 6.79, 6.7, 6.87, NA, 6.7, 6.52,
NA, 7.17, 7.06, 7.01, 7.33, 7.04, 6.94, 7.35, 7.01, 7.54, 7.8,
7.75, 7.86, 7.58, 7.09, 7.42, 7.52, 6.69, NA, 7.69, 7.57, 7.34,
7.52, 8.18, 7.51, 7.8, 7.77, 8.07, 7.92, 6.7, 7.43, 7.58, 8.09,
7.7, 7.81, 8.11, 7.83, 7.48, 7.81, 8.27, 8.32, 7.86, 8.1, 8.63,
7.8, 5.42, 8.36, 8.08, NA, 7.78, 8.27, 8.44, 6.62, 8.01, 8.5,
7.86, 9.1, 8.15, 8.69, 8.6, 8.49, 7.98, 8.76, 8.34, 8.75, 7.97,
9.08, 8.29, NA, 8.92, 8.71, 8.94, 8.44, 9, 8.63, 9.15, 8.93,
9.37, 8.77, 9.21, 9.07, 9.1, 8.89, 7.43, 8.34, 8.64, 8.5, 9.59,
7.59, 9.08, 9.4, 9.07, 8.83, 9.46, 9.3, 9.24, 9.44, 9, 9.43,
9.17, 7.68, 9.56, 9.27, 9.33, 6.8, 9.98, 9.81, 9.59, 9.49, 9.55,
9.39, 10.04, 9.5, 9.93, 9.3, 9.49, 8.45, 7.77, 7.84, 9.88, 9.35,
10.09, 10.22, 10.75, 10.75, 8.04, 8.07, 10.14, 9.94, 10.44, 10.25,
9.49, 10.6, 8.41, 9.57, 11.25, NA, 11.61, 6.72, 10.63, 11.12,
10.55, 10.7, 10.18, 10.94, 11.02, 10.66, 10.73, 8.65, 11.84,
NA, 11.25, 11.59, 10.96, 11.58, 11.43, 12.46, 10.46, 10.99, 11.94,
8.77, 11.58, 12.36, 11, 11.05, 11.86, 9.52, 12.48, 12.39, 12.64,
12.28, 12.12, 11.27, 10.86, 12.49, 12.13, 12.74, 9.64, 10.97,
12.41, 12.32, 13.86, 13.04, NA, 10.26, 13.24, 13.89, 12.77, 13.33,
13.37, 13.55, 14.01, 14.25, 14.75, 14.3, 13.87, 14.96, 14.32,
14.49, NA, 15.41, 15.47, 14.31, 17.7, 12.48, 16.46)
Edited to take into account OP's real data
Put everything inside the same data.frame:
library(segmented)
library(ggplot2)
lin.mod <- lm(ChH~CL)
segmented.mod <- segmented(lin.mod, seg.Z=~CL)
fit <- numeric(length(CL)) * NA
fit[complete.cases(rowSums(cbind(ChH, CL)))] <- broken.line(segmented.mod)$fit
data1 <- data.frame(CL = CL, ChH = ChH, fit = fit)
ggplot(data1, aes(x = CL, y = ChH)) +
geom_point() +
geom_line(aes(x = CL, y = fit), color = 'blue')
I have written a short script to plot trends of measured radioactivity activities on 2 separate measurement devices. The script is shown below
pkgLoad <- function(x)
{
if (!require(x,character.only = TRUE))
{
install.packages(x,dep=TRUE, repos='http://star-www.st-andrews.ac.uk/cran/')
if(!require(x,character.only = TRUE)) stop("Package not found")
}
}
pkgLoad("ggplot2")
pkgLoad("XLConnect")
pkgLoad("reshape2")
#Load the workbook
wb<-loadWorkbook("CapintecQC.xlsx")
df_blue <-readWorksheet(wb, sheet = "Blue", startCol=1, endCol=6)
#sort date format
df_blue$Date <- as.Date(df_blue$Date , "%d/%m/%y")
df_blue[order(df_blue$Date ),]
df_gold <-readWorksheet(wb, sheet = "Gold", startCol=1, endCol=6)
df_gold$Date <- as.Date(df_gold$Date , "%d/%m/%y")
df_gold[order(df_gold$Date ),]
#Reference Cs-137 details
ref_activity <- 9.3
half_life <- 30.23
ref_date <- as.Date('06/01/08',format='%d/%m/%y')
blue_melt <- melt(df_blue[,c(1,2:6)], id="Date", value.name="Activity", variable.name="Isotope")
#Add new column to data frame with expected activity
df_gold["Exp_Act"] <- round(ref_activity*exp((-0.693/half_life)*as.numeric(difftime(df_gold$Date,ref_date))/365.25),3)
df_gold["Exp_Act_0.95"] <- 0.95 * df_gold$Exp_Act
df_gold["Exp_Act_1.05"] <- 1.05 * df_gold$Exp_Act
gold_melt <- melt(df_gold[,c(1,2:6)], id="Date", value.name="Activity", variable.name="Isotope")
p <- ggplot( NULL )+geom_point(data = gold_melt, aes(x=Date,y=Activity, col=Isotope)) + geom_ribbon(data = df_gold, aes(x = Date, ymin = Exp_Act_0.95, ymax = Exp_Act_1.05), fill='blue', alpha=0.2) + geom_point(data = blue_melt, aes(x=Date,y=Activity, col=Isotope), shape=2) + theme_bw()
print(p)
I am not very competent with R/ggplot2. I would like the final plot to show the measured activity for each radionuclide to be the same color for both devices (i.e Cs-137 in red, 99mTc in Blue). How can I do this as my graph plots different colours.
Also the legend is non-pleasing.
(i) The format for each nuclide, which is picked up from the excel header changes from Cs-137 to Cs.137. How can I have Cs-137, Tc-99m etc as headers?
(ii) Each radionuclide is duplicated in the legend - one for each device. Is it possible to show just the legend for the first data frame (df_gold) or better just have text in the legend, with the text color matched to the marker color in the plot?)
df_gold structure
structure(list(Date = structure(c(15708, 15709, 15712, 15713,
15714, 15715, 15716, 15719, 15720, 15721, 15722, 15723, 15726,
15727, 15729, 15730, 15733, 15734, 15735, 15736, 15740, 15741,
15743, 15747, 15748, 15749, 15750, 15751, 15754, 15755, 15756,
15757, 15758, 15761, 15762, 15764, 15765, 15768, 15769, 15770,
15771, 15772, 15775, 15776, 15777, 15779, 15782, 15783, 15784,
15785, 15786, 15789, 15790, 15791, 15792, 15797, 15798, 15799,
15800), class = "Date"), Cs..137 = c(8.2, 8.1, 8.1, 8.1, 8.1,
8.1, 8.1, 8.1, 8.1, 8.1, 8.1, 8.1, 8.1, 8.1, 8.1, 8.1, 8.1, 8.1,
8.1, 8.1, 8.1, 8.2, 8.1, 8.1, 8.1, 8.1, 8.1, 8.1, 8.1, 8, 8.2,
8.1, 8.1, 8.1, 8.1, 8.1, 8.1, 8.1, 8.1, 8.1, 8.1, 8.1, 8.1, 8.1,
8.1, 8.1, 8.1, 8, 8.1, 8.1, 8.1, 8.1, 8.1, 8.1, 8.1, 8.1, 8.1,
8.1, 8.1), In..111 = c(6.49, 6.47, 6.48, 6.43, 6.49, 6.51, 6.5,
6.47, 6.48, 6.4, 6.48, 6.48, 6.48, 6.49, 6.49, 6.47, 6.48, 6.48,
6.5, 6.47, 6.49, 6.55, 6.46, 6.49, 6.48, 6.48, 6.46, 6.48, 6.49,
6.44, 6.49, 6.46, 6.45, 6.46, 6.46, 6.43, 6.49, 6.47, 6.45, 6.43,
6.44, 6.44, 6.44, 6.46, 6.45, 6.47, 6.45, 6.43, 6.44, 6.47, 6.45,
6.46, 6.45, 6.46, 6.39, 6.46, 6.44, 6.42, 6.41), I..123 = c(6.97,
6.94, 6.96, 6.91, 6.92, 6.95, 6.93, 6.92, 6.93, 7, 6.97, 6.96,
6.96, 6.94, 6.98, 6.97, 6.95, 6.95, 6.94, 6.96, 6.97, 7.01, 6.92,
7, 6.98, 6.97, 6.91, 6.99, 6.95, 6.88, 6.96, 6.91, 6.91, 6.93,
6.94, 6.94, 6.97, 6.93, 6.93, 6.93, 6.96, 6.94, 6.94, 6.92, 6.93,
6.91, 6.93, 6.92, 6.92, 6.91, 6.91, 6.89, 6.92, 6.9, 6.9, 6.91,
6.91, 6.9, 6.9), I..131 = c(10.5, 10.5, 10.5, 10.5, 10.5, 10.5,
10.5, 10.5, 10.5, 10.8, 10.5, 10.6, 10.5, 10.5, 10.5, 10.5, 10.5,
10.5, 10.5, 10.5, 10.5, 10.6, 10.5, 10.5, 10.5, 10.5, 10.5, 10.5,
10.5, 10.5, 10.5, 10.5, 10.5, 10.5, 10.5, 10.5, 10.5, 10.5, 10.4,
10.5, 10.4, 10.5, 10.5, 10.5, 10.4, 10.5, 10.4, 10.4, 10.5, 10.4,
10.4, 10.4, 10.4, 10.4, 10.3, 10.5, 10.5, 10.5, 10.6), Tc..99m = c(15,
15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15.1, 15, 15, 15.1, 15,
15, 15, 15, 15.1, 15, 15.1, 15, 15, 15, 15, 15, 15, 15, 15, 15,
15, 15, 15, 15, 14.9, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15,
14.9, 14.8, 14.9, 14.9, 14.9, 14.9, 15, 15, 14.8, 15, 15, 15,
15), Exp_Act = c(8.294, 8.293, 8.292, 8.291, 8.291, 8.29, 8.29,
8.288, 8.288, 8.287, 8.287, 8.286, 8.285, 8.284, 8.283, 8.283,
8.281, 8.28, 8.28, 8.279, 8.277, 8.277, 8.276, 8.274, 8.273,
8.273, 8.272, 8.272, 8.27, 8.27, 8.269, 8.269, 8.268, 8.266,
8.266, 8.265, 8.264, 8.263, 8.262, 8.262, 8.261, 8.261, 8.259,
8.259, 8.258, 8.257, 8.256, 8.255, 8.255, 8.254, 8.254, 8.252,
8.251, 8.251, 8.25, 8.248, 8.247, 8.247, 8.246), Exp_Act_0.95 = c(7.8793,
7.87835, 7.8774, 7.87645, 7.87645, 7.8755, 7.8755, 7.8736, 7.8736,
7.87265, 7.87265, 7.8717, 7.87075, 7.8698, 7.86885, 7.86885,
7.86695, 7.866, 7.866, 7.86505, 7.86315, 7.86315, 7.8622, 7.8603,
7.85935, 7.85935, 7.8584, 7.8584, 7.8565, 7.8565, 7.85555, 7.85555,
7.8546, 7.8527, 7.8527, 7.85175, 7.8508, 7.84985, 7.8489, 7.8489,
7.84795, 7.84795, 7.84605, 7.84605, 7.8451, 7.84415, 7.8432,
7.84225, 7.84225, 7.8413, 7.8413, 7.8394, 7.83845, 7.83845, 7.8375,
7.8356, 7.83465, 7.83465, 7.8337), Exp_Act_1.05 = c(8.7087, 8.70765,
8.7066, 8.70555, 8.70555, 8.7045, 8.7045, 8.7024, 8.7024, 8.70135,
8.70135, 8.7003, 8.69925, 8.6982, 8.69715, 8.69715, 8.69505,
8.694, 8.694, 8.69295, 8.69085, 8.69085, 8.6898, 8.6877, 8.68665,
8.68665, 8.6856, 8.6856, 8.6835, 8.6835, 8.68245, 8.68245, 8.6814,
8.6793, 8.6793, 8.67825, 8.6772, 8.67615, 8.6751, 8.6751, 8.67405,
8.67405, 8.67195, 8.67195, 8.6709, 8.66985, 8.6688, 8.66775,
8.66775, 8.6667, 8.6667, 8.6646, 8.66355, 8.66355, 8.6625, 8.6604,
8.65935, 8.65935, 8.6583)), row.names = c(NA, -59L), .Names = c("Date",
"Cs..137", "In..111", "I..123", "I..131", "Tc..99m", "Exp_Act",
"Exp_Act_0.95", "Exp_Act_1.05"), class = "data.frame")
df_blue structure
structure(list(Date = structure(c(15790, 15791, 15792, 15797,
15798, 15799, 15800), class = "Date"), Cs.137 = c(8.1, 8.2, 8.2,
8.2, 8.2, 8.2, 8.2), I.123 = c(6.82, 6.85, 6.91, 6.84, 6.82,
6.82, 6.83), I.131 = c(10.5, 10.6, 10.6, 10.5, 10.6, 10.6, 10.6
), In.111 = c(6.35, 6.45, 6.43, 6.37, 6.38, 6.4, 6.37), X99m.Tc = c(15,
15, 15.1, 15.1, 15.1, 15.1, 15.1)), .Names = c("Date", "Cs.137",
"I.123", "I.131", "In.111", "X99m.Tc"), row.names = c(NA, -7L
), class = "data.frame")
My approach would be to bind together both data frames and then add new column that contains name of device (gold or blue).
df<-rbind(gold_melt,blue_melt)
df$device<-rep(c("gold","blue"),c(nrow(gold_melt),nrow(blue_melt)))
With function recode() from library car change names of Isotope as they should be.
df$Isotope<-recode(df$Isotope,"c('Cs..137','Cs.137')='Cs-137';
c('I..123','I.123')='I-123';
c('I..131','I.131')='I-131';
c('In..111','In.111')='In-111'
;c('Tc..99m','X99m.Tc')='Tc-99m'")
Now you just need one call to geom_point() using new data frame. I added also shape=device to get different shapes for each device.
ggplot(NULL) +
geom_point(data=df,aes(x=Date,y=Activity, col=Isotope,shape=device))+
geom_ribbon(data = df_gold, aes(x = Date, ymin = Exp_Act_0.95, ymax = Exp_Act_1.05), fill='blue', alpha=0.2)
Just in case you'd want to "fuse" the legend together, then building up on Didzis' answer:
df <- transform(df, device = factor(device, levels=unique(device)),
grp = paste(Isotope, device, sep="_"))
require(RColorBrewer)
ggplot() + geom_point(data = df, aes(x = Date, y = Activity,
colour=grp, shape = grp, fill=grp)) +
geom_ribbon(data = df_gold, aes(x = Date, ymin = Exp_Act_0.95,
ymax = Exp_Act_1.05), fill='blue', alpha=0.2) +
scale_shape_manual("", values=rep(c(21,24), 5)) +
scale_fill_manual("", values=rep(brewer.pal(5, "Set1"), each=2)) +
scale_colour_manual("", values=rep(brewer.pal(5, "Set1"), each=2))