R generate 2D histogram from raw data - r
I have some raw 2D data (x, y), given below, and I want to generate a 2D histogram from it. Specifically, I want to divide the x and y values into bins of size 0.5 and count the number of occurrences in each bin (for x and y jointly). Is there any way to do that?
> df
x y
1 4.2179611 5.7588577
2 5.3901279 5.8219784
3 4.1933089 6.4317645
4 5.8076411 5.8999598
5 5.5781166 5.9382342
6 4.5569735 6.7833469
7 4.4024492 5.8019719
8 4.1734975 6.0896355
9 5.1707871 5.5640962
10 5.6380258 6.9112775
11 4.6405353 5.2251746
12 4.1809004 6.1127144
13 4.2764079 5.4598799
14 5.4466446 6.0130047
15 5.2443804 5.5421851
16 5.7521515 5.4115965
17 4.9667564 5.3519795
18 4.5007141 6.8669231
19 5.0268273 5.7681888
20 4.4738948 6.4241168
21 4.4116357 5.9819519
22 4.5741988 6.4595129
23 4.0839075 6.8105259
24 4.7154364 6.5054761
25 4.8986785 5.5511226
26 5.6262397 6.8996480
27 4.9034275 5.6716375
28 4.1872928 5.8387641
29 4.0444855 5.2554446
30 4.8911393 5.8449165
31 5.7268887 6.7100432
32 5.9136374 6.5059128
33 4.9481286 6.4679917
34 4.6198987 5.7462047
35 5.7306916 6.0613158
36 5.5818586 6.4533566
37 5.9240267 6.7748290
38 4.8160926 6.4942865
39 5.5456258 5.7911897
40 4.3075173 6.8165520
41 4.9654533 5.8904734
42 5.9581820 5.7692468
43 4.2417172 5.7990554
44 5.3670112 5.8252479
45 5.2932098 5.3983672
46 5.7456521 6.2563828
47 4.9398795 5.2879065
48 4.8526884 6.9827555
49 5.6135753 6.5219431
50 4.0727956 5.2647714
51 6.9418969 5.2584325
52 5.4189039 5.9936456
53 3.9193741 6.7099562
54 5.5885252 5.9680734
55 5.9581279 5.1843804
56 4.5724421 6.6774004
57 4.7700303 6.6083613
58 5.5490254 6.2431170
59 4.1668548 5.1017475
60 5.8948947 6.7646917
61 6.5501872 5.2803433
62 5.6011444 4.2733087
63 5.1337226 6.5225780
64 5.3153358 6.6164809
65 3.3815056 6.4077659
66 3.8405670 5.3677008
67 6.7036350 4.3090214
68 3.2446588 4.0965275
69 4.6563593 7.6868628
70 5.2382914 7.0020874
71 6.0771605 6.6232541
72 3.5672511 6.9333691
73 5.0865233 4.0778233
74 5.6743559 5.5177734
75 4.5759146 7.2210012
76 5.8203140 4.9787148
77 3.1106176 6.3937707
78 4.6310679 4.4731806
79 6.8237641 6.2679791
80 3.7653803 5.9188107
81 5.6139040 5.8586176
82 6.2016662 5.3514293
83 3.9362048 5.3217560
84 6.8005236 7.9247371
85 5.8030101 7.7492432
86 6.0143418 6.0709249
87 6.5734089 7.6112815
88 4.0569383 5.8440535
89 4.6825752 7.7926235
90 4.8204027 6.3106798
91 3.5001675 6.3156079
92 3.6521280 7.5155810
93 5.0945236 4.8206873
94 3.8732946 5.6771599
95 6.4812309 5.6082170
96 5.0308355 7.6877289
97 5.2193389 7.7133717
98 6.2239631 5.5387684
99 4.6501488 7.8559335
100 3.5389389 5.4594034
101 5.7139486 4.5008182
102 3.5425132 7.3562487
103 6.9950663 6.1036549
104 5.3801845 5.8903123
105 4.7629191 5.3394552
106 4.4102815 7.2312852
107 5.8723641 4.1410996
108 3.4691208 4.6383708
109 4.6479362 5.8562699
110 3.0315732 6.8614265
111 5.9456145 4.7497545
112 4.8461189 4.4730002
113 4.9606723 5.1099093
114 4.7802659 7.8147864
115 5.0189229 6.9308301
116 6.4738074 5.0539666
117 5.3725075 5.3282273
118 6.5374505 7.0508875
119 4.0907139 5.0855075
120 5.0557532 5.6449829
121 6.5483249 7.5800015
122 3.1083616 7.3697234
123 3.6119548 7.7639486
124 6.5157691 7.7152933
125 4.0305622 7.0521419
126 3.2197769 6.5881246
127 4.7570419 6.4564400
128 4.0063007 6.3981942
129 4.4412649 7.6576221
130 5.7348769 6.7601804
131 3.1312551 5.6295996
132 3.8627964 7.5817083
133 5.2008281 5.1082509
134 6.4229161 6.2816475
135 2.5241894 6.0802138
136 7.3759753 5.1090478
137 3.7284166 5.2045976
138 3.4404286 6.9708127
139 6.4237399 5.1363851
140 4.1829368 5.1612791
141 5.9500285 5.4765621
142 3.3555182 6.2627360
143 7.7691356 5.1877095
144 4.0684189 7.1663495
145 7.3929140 7.3819058
146 2.1659981 7.9796005
147 4.8539955 7.3108966
148 5.3932658 4.7116979
149 3.5610560 4.6096759
150 5.1883331 6.8068501
151 6.4233558 7.2955388
152 7.3308739 6.1761356
153 3.0710449 4.5296235
154 7.5400128 5.1559900
155 3.5776389 5.2057676
156 4.0402288 7.1487121
157 2.3107258 6.9816127
158 7.2065591 7.7307439
159 5.7577620 5.6652052
160 2.0595554 7.4373547
161 7.5994468 4.6216856
162 4.8053745 3.9113634
163 7.5769460 7.6019067
164 5.5362034 8.9270974
165 3.6713241 3.9060205
166 6.0612046 7.3862080
167 6.9205755 7.0792392
168 6.0892821 6.3248315
169 2.0532905 4.1545875
170 3.4086310 3.5510909
171 5.2148895 5.3266145
172 4.7638780 7.9240988
173 6.4717329 5.1350172
174 7.8287022 4.3457324
175 6.0299681 3.0952274
176 3.2760103 5.2730464
177 2.5729991 7.6594251
178 3.9403251 7.8928014
179 6.0021556 7.5313493
180 7.8561727 4.5092728
181 3.5818174 4.1140876
182 7.4972295 5.5313987
183 6.0138287 6.9369784
184 3.9257191 7.6395296
185 3.0462106 3.1347680
186 6.0630447 4.1847229
187 7.4878528 5.1004141
188 4.5145570 4.6389011
189 6.2777996 4.2647980
190 3.0166336 7.5755042
191 2.8791041 6.4471746
192 7.1029767 7.0061048
193 2.4526181 6.3373793
194 5.8762775 7.0746223
195 7.0609100 8.1256569
196 4.7252400 8.4829780
197 3.3695501 8.8786640
198 3.8505741 6.8260398
199 5.3573846 6.3864944
200 3.7039072 8.9951078
201 4.6216933 6.7890198
202 7.0390643 5.9458624
203 5.7172605 6.9083246
204 2.3814644 8.3856125
205 2.4432566 3.2618192
206 4.3881965 6.7022219
207 5.2583749 7.2432485
208 5.8540367 8.5154705
209 6.4267791 4.9593757
210 5.0668461 3.1358129
211 2.6845736 8.9880143
212 7.3094761 5.4049133
213 4.2176252 5.5062193
214 5.2025716 4.0798478
215 6.5592571 8.1852765
216 2.0417939 7.0843906
217 7.6045374 7.4870940
218 6.5971789 8.8641329
219 5.3541694 7.2176914
220 2.8314803 6.4831720
221 2.4252467 4.0918736
222 6.6804732 6.3624739
223 6.0325285 6.2057468
224 2.2751047 5.1275412
225 5.5397481 5.9890834
226 4.6420585 4.6013327
227 7.6385642 5.1722194
228 6.7378078 5.8246169
229 5.0647686 7.9219705
230 2.8672731 6.6371082
231 7.5487359 4.5727898
232 1.0837662 7.1788146
233 5.4483746 6.8955122
234 9.3085746 4.8330044
235 3.8484225 6.0133789
236 2.8034987 3.0023096
237 2.8952626 8.2623788
238 5.7666136 3.2158710
239 6.4978214 5.7866574
240 1.5184268 5.9791716
241 2.3836147 8.2897188
242 4.7318649 6.1174515
243 5.8544588 7.5056688
244 9.6776416 6.5151695
245 0.4319531 4.2470331
246 0.9810053 8.6452087
247 7.0819634 3.2488110
248 1.9084265 6.1122130
249 7.5096342 3.3495096
250 8.9564496 3.4960564
251 5.7603943 6.9091760
252 0.8801204 7.2744429
253 1.2183581 6.4264214
254 1.7761613 7.1199729
255 3.2490662 7.9935963
256 3.5420375 8.4801333
257 8.7709382 3.8011487
258 8.4770868 3.4749692
259 0.9965042 6.7509705
260 7.5049457 5.4313474
261 9.7261151 6.5909553
262 5.3893371 4.0194548
263 9.6154510 7.3117416
264 1.0327841 6.2376586
265 4.0064715 3.7333634
266 6.6941050 3.9452152
267 4.1317951 9.3322756
268 9.6481471 7.5330023
269 7.3474233 1.0310166
270 3.7343864 4.9808341
271 9.1412231 2.6655861
272 5.8414100 0.1329439
273 2.4837309 7.4956203
274 2.7983337 1.3563719
275 0.6335727 7.9273816
276 7.5566740 0.4321263
277 8.6182079 0.6038505
278 0.8928523 8.0131172
279 5.7375090 8.5275545
280 0.7864533 3.3954255
281 8.7808839 1.7059789
282 9.6621659 0.9215045
283 8.4894688 8.7667948
284 1.0358920 7.2505891
285 0.7378660 0.1173287
286 9.5485481 3.3186128
287 6.8987508 9.5480887
288 7.4105831 5.8809522
289 6.6984457 5.9509037
290 1.7878216 9.1932955
291 0.8443295 5.1662902
292 0.4498266 8.9636923
293 2.5068754 5.3692908
294 9.2509052 2.4204235
295 4.1333742 6.2581851
296 6.5510938 7.2923688
297 4.3412873 3.5514825
298 4.2349765 9.3207514
299 2.8730785 7.2752405
300 2.0425362 6.6513146
301 6.4498432 7.2949259
302 5.7453188 6.3263712
303 7.0501276 8.2238207
304 4.1915008 1.5325379
305 8.1307954 7.7681944
306 7.3156552 6.3031412
307 4.0302052 0.3039900
308 3.3740358 2.1386235
309 8.2055657 2.9112215
310 1.8817856 7.0503046
311 7.0820523 6.8739097
312 5.0725238 6.9951556
313 1.6246224 5.4126084
314 3.8865553 7.6398192
315 6.6727672 8.9677947
316 9.6048687 7.6757966
317 2.2006018 9.6385351
318 9.6403802 7.6438900
319 0.1267512 0.9048408
320 1.8160829 7.3193066
321 9.9318386 9.6068456
322 2.1275892 7.8034724
323 1.2232242 1.0695030
324 3.0198057 3.8964732
325 3.3265773 8.5865587
326 5.1519605 7.5068253
327 0.4137485 5.9223826
328 1.6896445 0.6071874
329 1.8534083 2.3554291
330 1.7182264 9.3488597
331 6.4165456 9.8670765
332 7.6270001 2.1839607
333 8.9867227 5.9565743
334 6.9185079 0.2440980
335 6.7359209 7.1072908
336 3.8034763 5.8466404
337 3.4583027 6.9041502
338 1.7983897 1.7108336
339 6.9184406 6.3632716
340 1.3538600 6.8484462
341 3.6731748 4.9846946
342 5.6139620 8.0637827
343 9.0991782 2.3051189
344 1.1220448 8.9624365
345 2.5925265 8.3673795
346 9.9977377 8.5423564
347 5.1761187 5.1240824
348 5.9330451 9.4141322
349 6.3337224 6.8055697
350 2.7287418 5.7100024
351 6.1022411 2.9733360
352 2.7331869 3.7135612
353 6.7394034 8.2721572
354 2.1757932 9.0574057
355 5.5011486 6.0124142
356 4.5301911 2.5865048
357 5.3137001 0.7062267
358 0.6959286 3.2395043
359 5.3494169 6.5742589
360 7.1472046 6.3821916
361 0.1749855 0.3954287
362 6.7709760 6.5212015
363 7.2983482 3.0086604
364 0.6147726 9.3336870
365 7.4417342 2.6836695
366 1.2769881 4.0591093
367 9.5342317 5.3443613
368 0.9368862 1.1391497
369 8.4271193 8.6641296
370 6.2000851 8.2987486
371 2.1768279 6.0684896
372 5.2021222 6.9222675
373 0.6095874 8.4759464
374 2.0217473 9.5844241
375 4.8080163 6.5052801
376 3.6099334 0.3272768
377 6.0132712 7.9920535
378 4.0495344 8.8153621
379 6.9646704 7.0375214
380 3.9211171 2.5994333
381 4.4749268 1.0517360
382 1.1683429 3.8710614
383 1.7618115 0.3513996
384 1.1257639 5.7446745
385 3.7351688 8.7376011
386 4.9234662 7.1975462
387 7.4899861 7.3846309
388 7.4170082 2.2885060
389 0.8526702 3.8160722
390 4.5907512 8.9315418
391 7.6996179 9.8409051
392 0.2340987 4.2906009
393 2.2502736 1.7819172
394 3.5679969 1.7419479
395 5.4214908 5.6001803
396 3.9965213 9.2021549
397 3.8610336 2.0462740
398 5.9490575 4.4422382
399 9.8897791 5.6402915
400 6.1153192 4.1236797
401 5.8906384 2.6153750
402 8.0582664 2.7137804
403 7.2969209 2.9362187
404 3.8673527 1.0837191
405 3.5647339 6.2338014
406 9.6490210 0.8373270
407 0.8133243 6.3393130
408 2.8760565 9.9462423
409 3.3836457 7.4451869
410 4.7772609 2.9141127
411 8.6635971 5.7812494
412 5.6192160 1.4764255
413 9.1334625 8.9822399
414 0.4662385 6.6440937
415 3.4503559 4.2064800
416 0.6704780 2.8508758
417 0.5211872 4.3109175
418 7.5615411 9.2851454
419 7.5081906 4.0019450
420 8.8851669 9.7323717
421 7.3856288 8.6152906
422 9.5926351 0.3993818
423 1.4478981 1.4845263
424 5.0425560 1.3501638
425 0.8952120 7.9407680
426 6.4732584 7.1493210
427 9.6595225 5.2377876
428 7.2204625 2.0300222
429 3.5410601 7.3117738
430 6.7991771 3.6368291
Just for clarification, I want to get something like the plot below. (This plot has nothing to do with my raw data; I am only showing it to explain the problem more clearly.) If I use hist(df$x), it shows the distribution of x only.
The ggplot is elegant and fast and pretty, as usual. But if you want to use base graphics (image, contour, persp) and display your actual frequencies (instead of the smoothing 2D kernel), you have to first obtain the binnings yourself and create a matrix of frequencies. Here's some code (not necessarily elegant, but pretty robust) that does 2D binning and generates plots somewhat similar to the ones above:
require(mvtnorm)
xy <- rmvnorm(1000,c(5,10),sigma=rbind(c(3,-2),c(-2,3)))
nbins <- 20
x.bin <- seq(floor(min(xy[,1])), ceiling(max(xy[,1])), length=nbins)
y.bin <- seq(floor(min(xy[,2])), ceiling(max(xy[,2])), length=nbins)
freq <- as.data.frame(table(findInterval(xy[,1], x.bin),findInterval(xy[,2], y.bin)))
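# table() returns factors; go through as.character() to recover the actual bin indices
# (as.numeric() on a factor returns level codes, which differ when a bin is empty)
freq[,1] <- as.numeric(as.character(freq[,1]))
freq[,2] <- as.numeric(as.character(freq[,2]))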
freq2D <- diag(nbins)*0
freq2D[cbind(freq[,1], freq[,2])] <- freq[,3]
par(mfrow=c(1,2))
image(x.bin, y.bin, freq2D, col=topo.colors(max(freq2D)))
contour(x.bin, y.bin, freq2D, add=TRUE, col=rgb(1,1,1,.7))
palette(rainbow(max(freq2D)))
cols <- (freq2D[-1,-1] + freq2D[-1,-(nbins-1)] + freq2D[-(nbins-1),-(nbins-1)] + freq2D[-(nbins-1),-1])/4
persp(freq2D, col=cols)
For a really fun time, try making an interactive, zoomable, 3D surface:
require(rgl)
surface3d(x.bin,y.bin,freq2D/10, col="red")
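The question asks for bins of a fixed width (0.5) rather than a fixed number of bins. A minimal sketch of that variant, using only base R, could look like the following. The data here are simulated for illustration, and the names x_breaks, y_breaks, x_mid and y_mid are my own; swap in your own df.

```r
# Illustrative data; replace with your own df (columns x and y).
set.seed(1)
df <- data.frame(x = runif(200, 0, 10), y = runif(200, 0, 10))

bin_size <- 0.5
# Breaks aligned to multiples of bin_size that cover the data range.
x_breaks <- seq(floor(min(df$x) / bin_size) * bin_size,
                ceiling(max(df$x) / bin_size) * bin_size, by = bin_size)
y_breaks <- seq(floor(min(df$y) / bin_size) * bin_size,
                ceiling(max(df$y) / bin_size) * bin_size, by = bin_size)

# cut() keeps every bin as a factor level, so table() returns a full
# matrix of counts, including the empty bins.
counts <- table(cut(df$x, x_breaks, include.lowest = TRUE),
                cut(df$y, y_breaks, include.lowest = TRUE))
freq2D <- unclass(counts)

# Bin midpoints for plotting.
x_mid <- head(x_breaks, -1) + bin_size / 2
y_mid <- head(y_breaks, -1) + bin_size / 2
image(x_mid, y_mid, freq2D, xlab = "x", ylab = "y")
```

Because cut() produces factors with one level per bin, there is no need to convert factor labels back to indices, and freq2D can be passed to contour or persp in the same way as above.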
Bivariate density estimates can be done with MASS::kde2d, or KernSmooth::bkde2D (both supplied with the base R distribution). The latter uses an algorithm based on the fast Fourier transform over a grid of points, and is very fast. The result can be plotted with contour or persp or similar functions in other graphing packages.
Using your data:
require(KernSmooth)
z <- bkde2D(df, .5)
persp(z$fhat)
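For comparison, here is a small sketch of the MASS::kde2d route mentioned above. It uses simulated data and an arbitrary grid size of n = 50, both chosen only for illustration. Keep in mind that this gives a smoothed density estimate, not bin counts.

```r
library(MASS)  # recommended package shipped with standard R installations

# Illustrative data; replace with your own df (columns x and y).
set.seed(1)
df <- data.frame(x = rnorm(500, mean = 5), y = rnorm(500, mean = 5))

# kde2d() evaluates a Gaussian kernel density estimate on an n x n grid
# and returns a list with components x, y and z.
dens <- kde2d(df$x, df$y, n = 50)

image(dens, xlab = "x", ylab = "y")
contour(dens, add = TRUE)
```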
If you want it with a 2d contour, you can also use the package ggplot2. Some example code is shown in this question:
gradient breaks in a ggplot stat_bin2d plot
Adjusted slightly:
x <- rnorm(10000)+5
y <- rnorm(10000)+5
df <- data.frame(x,y)
require(ggplot2)
p <- ggplot(df, aes(x, y))
p <- p + stat_bin2d(bins = 20)
p
Here's the output of the code above:
For completeness, you can also use the hist2d{gplots} function. It seems to be the most straightforward for a 2D plot:
library(gplots)
# data is in variable df
# define bin sizes
bin_size <- 0.5
# round up so that the number of bins is a whole number
xbins <- ceiling((max(df$x) - min(df$x))/bin_size)
ybins <- ceiling((max(df$y) - min(df$y))/bin_size)
# create plot
hist2d(df, same.scale=TRUE, nbins=c(xbins, ybins))
# if you want to retrieve the data for other purposes
df.hist2d <- hist2d(df, same.scale=TRUE, nbins=c(xbins, ybins), show=FALSE)
df.hist2d$counts
I came to this page from http://www.r-bloggers.com/5-ways-to-do-2d-histograms-in-r/, which lists one of the answers above.
It provides code samples for a total of 5 methods:
hist2d from the library gplots
hexbin,hexbinplot from the library hexbin
stat_bin2d from the library ggplot2
kde2d from the library MASS
the "hard way" solution listed above.
freq <- as.data.frame(table(findInterval(xy[,1], x.bin),findInterval(xy[,2], y.bin)))
freq[,1] <- as.numeric(freq[,1])
freq[,2] <- as.numeric(freq[,2])
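This is probably wrong, since it destroys the original indices: as.numeric() applied to a factor returns the internal level codes, not the interval indices stored in the labels, so the two disagree whenever a bin is empty. Using as.numeric(as.character(freq[,1])) keeps the original indices.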
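A tiny example makes the index problem concrete. The vector idx is made up for illustration; it stands for bin indices in which bin 2 happens to be empty.

```r
# Bin indices where bin 2 happens to be empty.
idx <- c(1, 3, 3, 4)
f <- factor(idx)

codes  <- as.numeric(f)                # level codes: 1 2 2 3 (bin 3 became 2)
labels <- as.numeric(as.character(f))  # original indices: 1 3 3 4
```

With the level codes, the counts for bins 3 and 4 would be written into rows 2 and 3 of the frequency matrix.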
Related
Using dplyr to compute calculated fields depending on multiple columns without explicitly writing column names
Consider the following code.
set.seed(56)
library(dplyr)
df <- data.frame(
  NUM_1 = sample.int(500, replace = TRUE),
  DENOM_1 = sample.int(500, replace = TRUE),
  NUM_2 = sample.int(500, replace = TRUE),
  DENOM_2 = sample.int(500, replace = TRUE)
)
head(df)
NUM_1 DENOM_1 NUM_2 DENOM_2
1 417 379 154 173
2 160 437 239 154
3 243 315 106 361
4 291 169 393 340
5 170 450 429 421
6 422 131 75 64
Without having to manually specify each of the column names (the actual problem has about 40 of these I need to create), I would like to create columns FRAC_1 and FRAC_2 for which FRAC_X = NUM_X/DENOM_X. So, this would be what I'm looking for with regard to output, but since I'm dealing with about 40 of these, I don't want to have to manually type out each column:
df_frac <- df %>%
  mutate(FRAC_1 = NUM_1 / DENOM_1,
         FRAC_2 = NUM_2 / DENOM_2)
head(df_frac)
NUM_1 DENOM_1 NUM_2 DENOM_2 FRAC_1 FRAC_2
1 417 379 154 173 1.1002639 0.8901734
2 160 437 239 154 0.3661327 1.5519481
3 243 315 106 361 0.7714286 0.2936288
4 291 169 393 340 1.7218935 1.1558824
5 170 450 429 421 0.3777778 1.0190024
6 422 131 75 64 3.2213740 1.1718750
I would strongly prefer a dplyr solution to this. I thought maybe I could use mutate() with across(), but it isn't clear to me how to tell across() to pair the NUM_x with the corresponding DENOM_x columns.
Here is one option in the tidyverse:
Loop across the columns whose names start with 'NUM'
Extract the current column name with cur_column() and replace the substring 'NUM' with 'DENOM' using str_replace
get the value of that DENOM column, divide the NUM column by it, and change the column name in .names to create the 'FRAC' columns
library(dplyr)
library(stringr)
df <- df %>%
  mutate(across(starts_with("NUM"),
                ~ ./get(str_replace(cur_column(), 'NUM', 'DENOM')),
                .names = "{str_replace(.col, 'NUM', 'FRAC')}"))
-output
head(df)
NUM_1 DENOM_1 NUM_2 DENOM_2 FRAC_1 FRAC_2
1 417 379 154 173 1.1002639 0.8901734
2 160 437 239 154 0.3661327 1.5519481
3 243 315 106 361 0.7714286 0.2936288
4 291 169 393 340 1.7218935 1.1558824
5 170 450 429 421 0.3777778 1.0190024
6 422 131 75 64 3.2213740 1.1718750
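The pairing logic itself (derive each DENOM and FRAC name from the matching NUM name) does not depend on dplyr. As a rough base R sketch of the same idea, shown only for comparison and using the first two rows of the example data; the helper names num_cols, denom_cols and frac_cols are my own:

```r
df <- data.frame(NUM_1 = c(417, 160), DENOM_1 = c(379, 437),
                 NUM_2 = c(154, 239), DENOM_2 = c(173, 154))

num_cols   <- grep("^NUM_", names(df), value = TRUE)
denom_cols <- sub("^NUM_", "DENOM_", num_cols)  # matching denominators
frac_cols  <- sub("^NUM_", "FRAC_", num_cols)   # names for the new columns

# Data frames of equal size divide element-wise, column by column.
df[frac_cols] <- df[num_cols] / df[denom_cols]
```

This assumes every NUM_x column has a DENOM_x partner; a missing partner would make the column selection fail.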
ggplot_line: label the top 2 peak with X-axis values
I am new to R programming. I am plotting a mass spectrum with ggplot and would like to label the top 2 peaks with their x-axis values (i.e. m). Does anyone know how to achieve that? Thanks so much for your help! Here is part of the raw data I used for the ggplot. m Intensity 1 30001 2.964e+01 2 30002 3.336e+01 3 30003 3.968e+01 4 30004 5.015e+01 5 30005 6.838e+01 6 30006 1.016e+02 7 30007 1.464e+02 8 30008 2.130e+02 9 30009 3.115e+02 10 30010 3.951e+02 11 30011 5.134e+02 12 30012 5.316e+02 13 30013 6.377e+02 14 30014 8.813e+02 15 30015 1.071e+03 16 30016 1.119e+03 17 30017 1.202e+03 18 30018 1.299e+03 19 30019 1.112e+03 20 30020 1.205e+03 21 30021 1.422e+03 22 30022 1.653e+03 23 30023 1.726e+03 24 30024 2.423e+03 25 30025 3.059e+03 26 30026 3.267e+03 27 30027 3.993e+03 28 30028 5.172e+03 29 30029 5.278e+03 30 30030 2.794e+03 31 30031 1.459e+03 32 30032 2.512e+03 33 30033 6.590e+03 34 30034 1.245e+04 35 30035 1.144e+04 36 30036 5.197e+03 37 30037 6.012e+03 38 30038 1.453e+04 39 30039 1.513e+04 40 30040 5.802e+03 41 30041 9.226e+03 42 30042 5.809e+03 43 30043 3.074e+03 44 30044 3.882e+03 45 30045 9.941e+02 46 30046 8.170e+02 47 30047 1.149e+03 48 30048 3.567e+02 49 30049 3.805e+02 50 30050 3.654e+02 51 30051 4.724e+02 52 30052 7.819e+02 53 30053 8.634e+02 54 30054 5.235e+02 55 30055 1.712e+02 56 30056 9.232e+01 57 30057 9.434e+01 58 30058 7.191e+01 59 30059 8.036e+01 60 30060 4.456e+01 61 30061 9.428e+01 62 30062 9.392e+01 63 30063 8.413e+01 64 30064 5.671e+01 65 30065 2.639e+01 66 30066 2.027e+01 67 30067 4.584e+01 68 30068 6.956e+01 69 30069 6.181e+01 70 30070 6.450e+01 71 30071 2.826e+01 72 30072 3.610e+01 73 30073 6.325e+01 74 30074 3.509e+01 75 30075 3.478e+01 76 30076 1.120e+01 77 30077 6.993e+00 78 30078 9.936e+00 79 30079 7.738e+00 80 30080 9.771e+00 81 30081 1.762e+01 82 30082 3.060e+01 83 30083 2.175e+01 84 30084 2.816e+01 85 30085 2.700e+01 86 30086 2.114e+01 87 30087 4.378e+01 88 30088 5.824e+01 89 30089 6.193e+01 90 30090 4.146e+01 91 30091 9.697e+04 92 
30092 9.458e+04 93 30093 9.216e+04 94 30094 8.972e+04 95 30095 8.723e+04 96 30096 8.468e+04 97 30097 8.211e+04 98 30098 7.959e+04 99 30099 7.726e+04 100 30100 7.527e+04 101 30101 7.379e+04 102 30102 7.298e+04 103 30103 7.301e+04 104 30104 7.399e+04 105 30105 7.602e+04 106 30106 7.916e+04 107 30107 8.340e+04 108 30108 8.862e+04 109 30109 9.460e+04 110 30110 1.010e+05 111 30111 1.074e+05 112 30112 1.133e+05 113 30113 1.180e+05 114 30114 1.211e+05 115 30115 1.222e+05 116 30116 1.213e+05 117 30117 1.186e+05 118 30118 1.146e+05 119 30119 1.100e+05 120 30120 1.054e+05 121 30121 1.014e+05 122 30122 9.838e+04 123 30123 9.637e+04 124 30124 9.535e+04 125 30125 9.508e+04 126 30126 9.520e+04 127 30127 9.527e+04 128 30128 9.484e+04 129 30129 9.355e+04 130 30130 9.128e+04 131 30131 8.809e+04 132 30132 8.425e+04 133 30133 8.012e+04 134 30134 7.603e+04 135 30135 7.225e+04 136 30136 6.895e+04 137 30137 6.617e+04 138 30138 6.392e+04 139 30139 6.214e+04 140 30140 6.078e+04 141 30141 5.980e+04 142 30142 5.922e+04 143 30143 5.905e+04 144 30144 5.934e+04 145 30145 6.013e+04 146 30146 6.143e+04 147 30147 6.324e+04 148 30148 6.552e+04 149 30149 6.816e+04 150 30150 7.100e+04 151 30151 7.384e+04 152 30152 7.655e+04 153 30153 7.904e+04 154 30154 8.132e+04 155 30155 8.353e+04 156 30156 8.595e+04 157 30157 8.896e+04 158 30158 9.302e+04 159 30159 9.864e+04 160 30160 1.063e+05 161 30161 1.165e+05 162 30162 1.293e+05 163 30163 1.443e+05 164 30164 1.605e+05 165 30165 1.759e+05 166 30166 1.883e+05 167 30167 1.957e+05 168 30168 1.969e+05 169 30169 1.921e+05 170 30170 1.824e+05 171 30171 1.693e+05 172 30172 1.544e+05 173 30173 1.390e+05 174 30174 1.241e+05 175 30175 1.102e+05 176 30176 9.755e+04 177 30177 8.644e+04 178 30178 7.692e+04 179 30179 6.900e+04 180 30180 6.262e+04 181 30181 5.766e+04 182 30182 5.397e+04 183 30183 5.137e+04 184 30184 4.972e+04 185 30185 4.889e+04 186 30186 4.881e+04 187 30187 4.940e+04 188 30188 5.059e+04 189 30189 5.230e+04 190 30190 5.444e+04 191 30191 5.690e+04 192 30192 
5.960e+04 193 30193 6.244e+04 194 30194 6.539e+04 195 30195 6.842e+04 196 30196 7.153e+04 197 30197 7.471e+04 198 30198 7.795e+04 199 30199 8.118e+04 200 30200 8.430e+04 201 30201 8.719e+04 202 30202 8.976e+04 203 30203 9.193e+04 204 30204 9.364e+04 205 30205 9.480e+04 206 30206 9.531e+04 207 30207 9.504e+04 208 30208 9.391e+04 209 30209 9.189e+04 210 30210 8.912e+04 211 30211 8.587e+04 212 30212 8.251e+04 213 30213 7.939e+04 214 30214 7.680e+04 215 30215 7.492e+04 216 30216 7.381e+04 217 30217 7.349e+04 218 30218 7.394e+04 219 30219 7.510e+04 220 30220 7.690e+04 221 30221 7.919e+04 222 30222 8.174e+04 223 30223 8.425e+04 224 30224 8.637e+04 225 30225 8.776e+04 226 30226 8.826e+04 227 30227 8.788e+04 228 30228 8.690e+04 229 30229 8.569e+04 230 30230 8.465e+04 231 30231 8.405e+04 232 30232 8.398e+04 233 30233 8.434e+04 234 30234 8.494e+04 235 30235 8.554e+04 236 30236 8.598e+04 237 30237 8.623e+04 238 30238 8.638e+04 239 30239 8.665e+04 240 30240 8.736e+04 241 30241 8.884e+04 242 30242 9.147e+04 243 30243 9.559e+04 244 30244 1.016e+05 245 30245 1.097e+05 246 30246 1.200e+05 247 30247 1.321e+05 Here is my code for ggplot: ggplot(data=raw.1) + geom_line(mapping = aes(x=m, y=Intensity)) Below is the ggplot output:
I would do it this way. My solution requires the ggrepel package as well as some dplyr functions. The key to this working is that you can set data = for each geom_ layer in ggplot2. The geom_text_repel() layer from ggrepel ensures that the labels will not overlap your data from geom_line().
library(ggplot2)
library(dplyr)
library(ggrepel)
ggplot(mapping = aes(x = m, y = Intensity, label = m)) +
  geom_line(data = raw.1) +
  geom_text_repel(data = raw.1 %>%
                    arrange(desc(Intensity)) %>% # arranges in descending order
                    slice_head(n = 2)) # only keeps the top two intensities
My plot does not look like yours since you only shared the first 247 data points. As a chemist, I have some idea of what you hope to accomplish, and I suspect that this initial solution might not work for you. This approach labels the two highest intensities, not necessarily the top two peaks. We need to identify all local maxima and then select the two tallest. Here is how we do that. The following code calculates the slope between consecutive points, looks for points where a positive slope changes to a negative slope (a local maximum), and then sorts and selects the top two by intensity.
top_two <- raw.1 %>%
  mutate(deriv = Intensity - lag(Intensity),
         max = case_when(deriv >= 0 & lead(deriv) < 0 ~ TRUE,
                         TRUE ~ FALSE)) %>%
  filter(max) %>%
  arrange(desc(Intensity)) %>%
  slice_head(n = 2)
Let's modify the original plot code to put this in.
ggplot(mapping = aes(x = m, y = Intensity, label = m)) + geom_line(data = raw.1) + geom_text_repel(data = top_two, nudge_y = 1e4) Data: raw.1 <- structure(list(m = c(30001, 30002, 30003, 30004, 30005, 30006, 30007, 30008, 30009, 30010, 30011, 30012, 30013, 30014, 30015, 30016, 30017, 30018, 30019, 30020, 30021, 30022, 30023, 30024, 30025, 30026, 30027, 30028, 30029, 30030, 30031, 30032, 30033, 30034, 30035, 30036, 30037, 30038, 30039, 30040, 30041, 30042, 30043, 30044, 30045, 30046, 30047, 30048, 30049, 30050, 30051, 30052, 30053, 30054, 30055, 30056, 30057, 30058, 30059, 30060, 30061, 30062, 30063, 30064, 30065, 30066, 30067, 30068, 30069, 30070, 30071, 30072, 30073, 30074, 30075, 30076, 30077, 30078, 30079, 30080, 30081, 30082, 30083, 30084, 30085, 30086, 30087, 30088, 30089, 30090, 30091, 30092, 30093, 30094, 30095, 30096, 30097, 30098, 30099, 30100, 30101, 30102, 30103, 30104, 30105, 30106, 30107, 30108, 30109, 30110, 30111, 30112, 30113, 30114, 30115, 30116, 30117, 30118, 30119, 30120, 30121, 30122, 30123, 30124, 30125, 30126, 30127, 30128, 30129, 30130, 30131, 30132, 30133, 30134, 30135, 30136, 30137, 30138, 30139, 30140, 30141, 30142, 30143, 30144, 30145, 30146, 30147, 30148, 30149, 30150, 30151, 30152, 30153, 30154, 30155, 30156, 30157, 30158, 30159, 30160, 30161, 30162, 30163, 30164, 30165, 30166, 30167, 30168, 30169, 30170, 30171, 30172, 30173, 30174, 30175, 30176, 30177, 30178, 30179, 30180, 30181, 30182, 30183, 30184, 30185, 30186, 30187, 30188, 30189, 30190, 30191, 30192, 30193, 30194, 30195, 30196, 30197, 30198, 30199, 30200, 30201, 30202, 30203, 30204, 30205, 30206, 30207, 30208, 30209, 30210, 30211, 30212, 30213, 30214, 30215, 30216, 30217, 30218, 30219, 30220, 30221, 30222, 30223, 30224, 30225, 30226, 30227, 30228, 30229, 30230, 30231, 30232, 30233, 30234, 30235, 30236, 30237, 30238, 30239, 30240, 30241, 30242, 30243, 30244, 30245, 30246, 30247), Intensity = c(29.64, 33.36, 39.68, 50.15, 68.38, 101.6, 146.4, 213, 311.5, 395.1, 513.4, 531.6, 637.7, 
881.3, 1071, 1119, 1202, 1299, 1112, 1205, 1422, 1653, 1726, 2423, 3059, 3267, 3993, 5172, 5278, 2794, 1459, 2512, 6590, 12450, 11440, 5197, 6012, 14530, 15130, 5802, 9226, 5809, 3074, 3882, 994.1, 817, 1149, 356.7, 380.5, 365.4, 472.4, 781.9, 863.4, 523.5, 171.2, 92.32, 94.34, 71.91, 80.36, 44.56, 94.28, 93.92, 84.13, 56.71, 26.39, 20.27, 45.84, 69.56, 61.81, 64.5, 28.26, 36.1, 63.25, 35.09, 34.78, 11.2, 6.993, 9.936, 7.738, 9.771, 17.62, 30.6, 21.75, 28.16, 27, 21.14, 43.78, 58.24, 61.93, 41.46, 96970, 94580, 92160, 89720, 87230, 84680, 82110, 79590, 77260, 75270, 73790, 72980, 73010, 73990, 76020, 79160, 83400, 88620, 94600, 101000, 107400, 113300, 118000, 121100, 122200, 121300, 118600, 114600, 110000, 105400, 101400, 98380, 96370, 95350, 95080, 95200, 95270, 94840, 93550, 91280, 88090, 84250, 80120, 76030, 72250, 68950, 66170, 63920, 62140, 60780, 59800, 59220, 59050, 59340, 60130, 61430, 63240, 65520, 68160, 71000, 73840, 76550, 79040, 81320, 83530, 85950, 88960, 93020, 98640, 106300, 116500, 129300, 144300, 160500, 175900, 188300, 195700, 196900, 192100, 182400, 169300, 154400, 139000, 124100, 110200, 97550, 86440, 76920, 69000, 62620, 57660, 53970, 51370, 49720, 48890, 48810, 49400, 50590, 52300, 54440, 56900, 59600, 62440, 65390, 68420, 71530, 74710, 77950, 81180, 84300, 87190, 89760, 91930, 93640, 94800, 95310, 95040, 93910, 91890, 89120, 85870, 82510, 79390, 76800, 74920, 73810, 73490, 73940, 75100, 76900, 79190, 81740, 84250, 86370, 87760, 88260, 87880, 86900, 85690, 84650, 84050, 83980, 84340, 84940, 85540, 85980, 86230, 86380, 86650, 87360, 88840, 91470, 95590, 101600, 109700, 120000, 132100 )), row.names = c(NA, -247L), class = c("tbl_df", "tbl", "data.frame" ))
This approach treats your x-axis as discrete values of a continuous variable and finds the local maxima based on the 2nd derivative, using code from "Finding local maxima and minima". The rest of the plotting is similar to Ben Norris's answer, using geom_text_repel() to label the points of interest. Also, as noted, the data you provided differ from the figure in your question.
library(ggplot2)
library(ggrepel)
# find local maxima aka peaks
local_maximas <- raw.1[which(diff(sign(diff(raw.1$Intensity)))==-2)+1,]
# subset of top 2 highest peaks
top2 <- tail(local_maximas[order(local_maximas$Intensity),],2)
# make labels for plot
raw.1$label <- ifelse(raw.1$m %in% top2$m, raw.1$m, NA)
ggplot(data = raw.1) +
  geom_line(aes(x=m, y=Intensity)) +
  geom_text_repel(aes(x = m, y = Intensity, label = label))
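To see what the diff(sign(diff(...))) expression does, it may help to run it on a short made-up vector (the values below are purely illustrative):

```r
intensity <- c(1, 3, 2, 5, 9, 4, 4, 6, 1)

# diff() gives the slopes, sign() reduces them to -1/0/+1, and a second
# diff() equal to -2 marks a switch from rising (+1) to falling (-1).
peaks <- which(diff(sign(diff(intensity))) == -2) + 1

# Positions of the two tallest peaks.
top2 <- peaks[order(intensity[peaks], decreasing = TRUE)][1:2]
```

Here peaks is 2, 5, 8 and top2 is 5, 8. One caveat: a flat-topped peak such as c(1, 3, 3, 1) is not detected, because the sign goes from +1 to 0 to -1 and no single step equals -2.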
Why are my 95% confidence intervals of my multivariate regression being plotted as a loess line?
I've been trying to plot a 95% prediction interval for a multivariate regression line in ggplot2. The graph is a regression of three independent variables ("x", "y", and "z") being used to predict a dependent variable ("a") on the y-axis. However, when I actually try to plot the results in ggplot2, I get a rather unusual result where the regression line is straightforward but the 95% prediction interval bands are very squiggly and do not resemble a straight line at all. They look like loess lines more than anything. Here is a picture showing the result I get: Does anyone know why I am getting a result where the 95% confidence intervals aren't smooth lines? The only thing I can think of is that this is related to the fact that this is a multivariate regression rather than a univariate one, but checking the actual variables all three show a strong correlation with the dependent variable (r2 > 0.95). I looked up the results of a plot of a multivariate regression with a 95% confidence interval, but none of them seemed similar to my result, they all seemed to have pretty smooth lines. I tried fitting a method="lm" into the predict() call of my code following this question, but that did not work either. Below is a dataset and code that replicates this result. 
x y z a 1 2.366153239 5.420534999 2.328204243 10.55858156 2 1.431094272 2.975529566 1.724972338 2.533696814 3 2.60453538 5.75827066 2.399639694 11.48783737 4 2.483771412 5.470167623 2.338838948 10.74706177 5 1.971210737 4.287715955 2.070680071 7.334766592 6 2.5596573 5.558000525 2.357541203 11.6127708 7 2.177892158 4.730480377 2.174966753 8.631949429 8 1.49665751 3.203559121 1.78984891 3.020424886 9 2.728865195 6.376658918 2.525204728 12.51412704 10 1.908668224 4.025351691 2.006327912 6.593044534 11 1.978895443 4.24563401 2.060493633 7.402451521 12 1.627855104 3.344274234 1.828735693 3.731699451 13 1.53436705 3.350605596 1.83046595 3.170525564 14 2.448831586 5.585936937 2.363458681 10.76866329 15 2.443160968 5.331752143 2.309058714 10.58310613 16 2.156078216 4.417635062 2.101817086 8.109576771 17 1.931534652 4.249610334 2.061458303 6.790693233 18 1.452715015 3.225752129 1.796037897 3.356200016 19 1.729354145 3.683866912 1.919340228 5.420225217 20 1.861239059 3.912023005 1.977883466 6.267750682 21 1.822955174 3.804437795 1.950496807 5.991464547 22 2.113126565 4.492001488 2.119434238 8.114076324 23 2.171856126 4.662613282 2.159308519 7.806138626 24 1.391215895 3.010620886 1.735114084 2.461296784 25 1.319165859 2.895911938 1.701737917 2.055404964 26 2.034006688 4.322608316 2.079088338 6.977001452 27 2.85574569 6.160996329 2.482135437 14.34613881 28 1.411579618 3.097385927 1.759939183 2.613006652 29 2.576957482 6.029643051 2.45553315 11.91628836 30 1.796913834 3.923259637 1.980721999 5.911392672 31 2.024389004 4.345833727 2.084666335 8.022132643 32 1.63435577 3.493472658 1.869083374 3.515715835 33 1.584595569 3.453157121 1.858267236 3.397523976 34 1.881578895 4.030076005 2.00750492 6.011267174 35 1.728309802 3.752101123 1.937034105 5.225370259 36 1.414715557 3.140049044 1.772018353 2.736961545 37 1.488730081 3.116621591 1.76539559 2.902519892 38 1.522138034 3.257327011 1.804806641 2.890371758 39 1.800033345 3.987130478 1.996780027 5.640594153 40 1.794222122 4.143928062 
2.035664035 6.206575927 41 2.676710091 6.289901082 2.50796752 13.49805633 42 2.328582719 5.13691546 2.266476442 9.430961545 43 2.484723966 5.458712793 2.336388836 10.7561993 44 2.287108375 4.856940066 2.203846652 9.917240545 45 2.417128932 5.582744146 2.362783136 10.54534144 46 2.328332495 5.105945474 2.259633925 9.840475333 47 2.362264634 5.293304825 2.300718328 9.848820151 48 2.28292536 5.018934097 2.24029777 9.269934816 49 1.449825221 3.006177531 1.73383319 3.121042465 50 2.211679876 4.692264893 2.166163635 8.631218063 51 2.704614597 6.072756474 2.464296345 12.31992499 52 2.48097622 5.43590303 2.331502312 11.2245765 53 1.497529983 3.380994674 1.838748127 3.752088968 54 2.696365396 5.825540285 2.413615604 12.36222133 55 2.165729837 4.666265285 2.160153996 8.455875079 56 2.410978268 5.417499423 2.327552239 10.08813972 57 2.185447829 4.991792206 2.234231905 9.215327913 58 2.041898307 4.22566518 2.055642279 7.418180823 59 2.099077244 4.375757022 2.091831021 7.696212639 60 2.000032635 4.234467391 2.057782153 7.110696123 61 2.025963678 4.260852439 2.064183238 6.851163763 62 2.083395224 4.351567427 2.08604109 7.884576511 63 1.981523362 4.318820559 2.07817722 7.43543802 64 2.033235038 4.336636932 2.082459347 7.313220387 65 1.423999144 3.206803244 1.790754937 2.564949357 66 2.217982257 4.825910853 2.196795587 8.920558764 67 1.240285111 2.808498672 1.675857593 1.568615918 68 2.215837149 5.041487758 2.245325758 8.802372134 69 2.134859238 4.731890939 2.175291001 8.132101136 70 2.306998207 5.059171458 2.249260202 9.336074756 71 1.896404791 4.104681782 2.026001427 6.445449942 72 1.922935417 4.151905673 2.037622554 6.818169682 73 2.111422924 4.716264233 2.171696165 8.366370302 74 2.28264494 4.852811209 2.202909714 9.210340372 75 2.190760504 4.574710979 2.1388574 8.447427164 76 2.037589062 4.275276265 2.06767412 6.989197008 77 1.717192759 3.810543836 1.952061433 4.610157727 78 1.876769266 4.043051268 2.010734012 6.306275287 79 2.030134158 4.579339426 2.139939117 7.715792425 80 
1.93577016 4.356708827 2.08727306 6.788521191 81 2.056518774 4.445588116 2.108456335 7.636510887 82 2.120080841 4.615120517 2.148283156 7.916807491 83 2.232689054 4.861361591 2.204849562 8.694167142 84 2.181147406 4.782479201 2.186888017 8.854567878 85 2.92779884 6.305666829 2.511108685 13.9593635 86 1.860080456 4.459637473 2.111785376 6.163314804 87 1.913818428 4.602767301 2.145406092 7.174915716 88 1.877883958 4.594104966 2.143386332 6.335054251 89 1.994987686 4.632100752 2.152231575 7.707952547 90 2.14756511 5.023880521 2.241401464 9.161721393 91 1.503591471 3.687628672 1.92031994 4.280824129 92 1.4536743 3.579343567 1.891915317 3.761200116 93 1.50872427 3.584888833 1.893380266 4.106767082 94 1.537573733 3.649466946 1.910357806 4.126327608 95 1.934796461 4.373238129 2.091228856 7.584097036 96 1.526250724 3.248434627 1.802341429 3.228826156 97 1.606399474 3.500439216 1.870946075 4.939855112 98 1.943162189 4.329208633 2.080675043 6.460498957 99 1.963384107 4.353112625 2.086411423 6.649308332 100 2.183124049 4.711248626 2.170541091 8.474527832 101 1.640763809 3.543853682 1.882512598 3.832330237 102 1.659456682 3.523415014 1.877076188 3.997282849 103 1.436096958 3.166318574 1.779415234 2.839078464 104 2.428955194 4.91133048 2.216152179 10.44793169 105 2.668500746 6.154858094 2.480898646 12.73883098 106 2.676812229 6.178980921 2.485755604 12.64109656 107 2.126920019 4.640923356 2.154280241 8.600833727 108 1.878254881 4.025530246 2.00637241 6.253828812 109 2.242102174 4.726797674 2.174119977 8.29404964 110 1.676813632 3.822754538 1.955186574 5.370638028 111 1.874531192 4.17438727 2.043131731 7.265087007 112 1.998637301 4.2363594 2.058241822 6.722389092 113 1.944116978 4.159527009 2.039491851 6.038562805 114 2.308184503 5.192956851 2.278806014 9.36048303 115 2.042370888 4.49535532 2.120225299 7.320526962 116 2.015621187 4.318820559 2.07817722 7.081078135 117 1.81401665 4.146304301 2.036247603 6.492542819 118 1.676813632 3.87937827 1.969613736 5.221868194 119 
2.807346477 6.428545769 2.535457704 13.72308897 120 1.621259207 3.543853682 1.882512598 4.162470391 121 1.50100345 3.321793359 1.822578766 3.106378794 122 1.582428764 3.464319806 1.861268333 4.143134726 123 1.654547625 3.591817741 1.895209155 4.509649984 124 2.332936461 4.937777822 2.222111118 9.398917323 125 2.498105588 5.513601542 2.348105948 11.29414737 126 1.890319403 3.887730313 1.97173282 5.847161058 127 1.804890841 3.940999114 1.985194981 6.17864926 128 2.096209309 4.6042388 2.145749007 7.788418833 129 2.047658751 4.337290741 2.082616321 7.612336837 130 2.680572077 5.989462544 2.447337848 12.15745472 131 2.333554566 5.407171771 2.325332615 10.44467195 132 2.212180997 4.932817886 2.220994797 8.881836305 133 1.478852439 3.063390922 1.750254531 2.890371758 134 1.648334702 3.518387649 1.875736562 4.141546164 135 2.307921185 4.90823336 2.215453308 9.305650552 136 2.13384989 4.645130271 2.155256428 8.018790088 137 1.728309802 3.555348061 1.885563062 4.941642423 138 1.691821236 3.556775613 1.885941572 4.886582645 139 1.746238611 3.891820298 1.972769702 5.363543151 140 1.679155631 3.642966397 1.908655652 4.754882459 141 1.94348069 4.156536582 2.038758589 6.277601677 142 1.549402462 3.250374492 1.8028795 3.342508385 143 1.856975574 4.232023463 2.057188242 6.413458957 144 2.529503815 5.684310793 2.38417927 11.22830537 145 2.035545742 4.643428898 2.154861689 7.244227516 146 2.467132416 5.697093487 2.386858497 11.50287513 147 2.298324686 4.870031331 2.206814748 9.286469586 148 1.937388065 4.34601078 2.0847088 7.322972679 149 1.956955486 4.536730733 2.129960266 7.739019572 150 2.036823984 4.518958489 2.125784206 8.594154233 151 1.972996546 4.529692045 2.128307319 7.967481199 152 1.58746864 3.283839256 1.812136655 3.314186005 153 1.521311054 3.464922216 1.861430153 3.681603045 154 2.44446969 5.445011746 2.333454895 10.3609124 155 2.294121109 4.731979033 2.17531125 9.105210941 156 3.126345733 6.927557906 2.632025438 15.6772624 157 1.867746396 4.253056253 2.06229393 
6.32459191 158 1.839082858 4.029806041 2.00743768 5.382980154 159 2.127330896 4.844974178 2.201130205 7.863266724 160 2.404523583 5.236441963 2.288327329 10.04902409 161 2.262955985 4.845642719 2.201282063 9.034969801 162 2.253418218 4.727387819 2.174255693 9.130463484 163 2.302083991 5.167955549 2.273313781 10.06411762 164 2.192165626 4.835011259 2.198865903 9.262695602 165 1.672685332 3.734489965 1.93248285 4.565493369 166 1.568460311 3.539508997 1.881358285 3.52282487 167 1.609819887 3.523868735 1.877197042 3.920784511 168 1.616583967 3.587676949 1.894116403 4.394572604 169 1.643301653 3.654700957 1.911727218 3.912023005 170 1.621923158 3.581532841 1.892493815 3.891820298 171 2.090637708 4.527208645 2.127723818 8.536995819 172 2.109497906 4.585222548 2.141313277 8.203668045 173 2.03091153 4.429625613 2.104667578 7.785783239 174 2.09487893 4.582924577 2.140776629 8.204589814 175 2.040382454 4.335786342 2.08225511 6.632541816 176 2.312894869 5.342334252 2.311349011 9.798127037 177 1.430087263 3.148453361 1.774388165 2.939161922 178 2.293711966 4.871098263 2.20705647 9.392661929 179 2.391075023 4.894101478 2.212261621 9.375295332 180 2.517077345 5.718436483 2.391325257 11.47221284 181 1.989024673 4.154969184 2.038374152 6.872128101 182 2.02016078 4.294014757 2.072200463 7.403304815 183 1.797360845 4.076689627 2.019081382 5.90560705 184 1.705239225 3.931825633 1.982883162 5.697965589 185 1.471533812 3.312439025 1.820010721 3.529590596 186 1.438083095 3.346917175 1.829458164 3.533978493 187 1.619261465 3.559624618 1.886696748 4.109233175 188 1.609819887 3.6558396 1.912025 4.166355098 189 2.346796539 5.146965796 2.26869253 9.872567414 190 1.784208279 3.519720884 1.876091918 4.879539029 191 1.832126365 3.811539467 1.952316436 5.259368616 192 1.677986168 3.452840615 1.858182073 3.885884348 193 1.966109701 4.163870625 2.04055645 6.526348436 194 1.701367309 3.828641396 1.956691441 4.605170186 195 1.931534652 4.279440046 2.06868075 6.927802974 196 1.36183801 3.102342009 
1.761346646 2.645465326 197 2.432819556 5.883322388 2.425556099 10.46486408 198 2.078341803 4.564943223 2.136572775 7.650468513 199 1.432099112 3.171155089 1.780773733 2.931193752 200 2.174427741 4.839451482 2.199875333 8.482392615 201 2.16404302 4.710430697 2.170352666 8.620246046 202 1.738643812 3.737669618 1.933305361 5.834810737 203 2.303817478 5.000921602 2.236274044 9.718344619 204 1.741189967 3.731819205 1.931791708 5.090062428 205 1.794671893 3.904293207 1.975928442 5.247024072 206 1.757635562 3.857777991 1.964122703 5.006560336 207 1.676226207 3.66137978 1.913473224 4.566637236 208 1.77911412 3.86388263 1.965676125 5.669260041 209 2.059914227 4.564348191 2.136433521 7.695152987 210 1.32424147 3.104586678 1.761983734 2.182674796 211 1.604334732 3.751518852 1.936883799 4.85787254 212 1.662497734 3.79739748 1.948691222 5.073109185 213 1.44885795 3.04690056 1.745537327 2.907447359 214 2.487551021 5.598973005 2.366214911 10.97673998 215 2.438166592 5.528436532 2.351262753 10.75773968 216 1.892477044 4.164647686 2.040746845 7.15334893 217 1.520482581 3.272335343 1.80895974 3.424588334 218 2.488969385 5.681996883 2.383693957 10.74868607 219 2.215837149 4.53044664 2.128484588 7.620705087 220 2.442786243 5.526780079 2.350910479 10.69919132 221 2.570602875 5.907702431 2.430576563 11.59161344 222 2.608344119 6.053264948 2.460338381 12.33182385 223 2.524368131 5.738731256 2.395564914 11.20612853 224 1.539964086 3.38269391 1.839210132 3.571221411 225 1.541550744 3.476614021 1.864568052 3.523119986 226 2.111209474 4.695924549 2.167008202 8.126284621 227 1.910391851 4.139955073 2.034687955 6.467590025 228 2.801971864 6.015864434 2.452725919 13.0280527 229 2.616209119 5.780126041 2.404189269 11.53329656 230 2.570130461 5.673975975 2.38201091 10.97701107 231 2.545595117 5.629669374 2.372692431 11.14107887 232 2.618299253 5.800606659 2.408444863 11.97035031 233 2.443348195 5.385412073 2.320649063 10.85417971 234 2.385152788 5.279188197 2.297648406 10.67131308 235 
2.512400994 5.685007319 2.384325338 11.58593194 236 2.39352554 5.12693575 2.26427378 10.4590302 237 1.823796962 3.992680908 1.998169389 6.109247583 238 1.768267491 3.745968421 1.935450444 5.260096154 239 2.376820756 5.302583255 2.302733865 10.4487146 240 2.042402374 4.477336814 2.115971837 7.810068783 241 2.159700495 4.673996377 2.161942732 8.189916149 242 1.948229832 4.378018613 2.092371528 6.932447892 243 1.330510703 3.059880093 1.749251295 2.083184528 244 1.464097665 3.342685111 1.828301154 3.072693315 245 1.446917352 3.196630216 1.787912251 2.829087196 246 2.082252099 4.60990894 2.14706985 8.075582637 247 1.933494729 4.136126096 2.033746812 7.003065459 248 1.840298976 3.949126093 1.987240824 7.056175284 249 1.649584193 3.645188765 1.909237745 4.51129897 250 1.778648064 3.883623531 1.97069113 5.09681299 251 2.526339825 5.903056741 2.429620699 11.66907415 252 2.512244141 5.734958092 2.394777253 10.93748043 253 1.947599667 4.356708827 2.08727306 6.514712691 254 2.181687439 4.946274535 2.224022153 8.799405331 255 2.109497906 4.510859507 2.123878411 8.132101136 256 1.831713667 4.188138442 2.046494183 6.109247583 257 1.5517319 3.446807893 1.856558077 3.765840495 258 2.47549747 5.727881894 2.393299374 10.78967984 259 1.96580772 4.156693187 2.038796995 6.229496711 260 1.978602442 4.21508618 2.053067505 7.258412151 261 2.064000486 4.339901708 2.083243075 7.670717659 262 2.117775721 4.510639702 2.123826665 7.731676304 263 2.221912965 4.838923916 2.199755422 8.877208949 264 1.940925986 4.266896327 2.065646709 6.450865289 265 2.040382454 4.579852378 2.140058966 7.857666456 266 2.173143952 4.666735542 2.160262841 8.561717125 267 2.240859653 4.901564199 2.21394765 8.808442394 268 1.888874933 4.080921542 2.02012909 6.163314804 269 1.845529749 4.082609306 2.020546784 6.885284696 270 2.238519604 4.984229093 2.23253871 8.987910316 271 2.393206767 5.29338648 2.300736073 10.70491521 272 2.702044102 5.884714177 2.425842983 12.3883942 273 2.219296721 4.854631045 2.203322728 
9.263449766 274 1.96161829 4.090838423 2.022582118 6.993932975 275 2.00561407 4.171305603 2.042377439 7.324270223 276 2.467836387 5.578051269 2.361789844 10.8016414 277 1.390119244 3.100092289 1.760707894 2.379546134 278 1.365322726 3.044760505 1.744924212 2.401525041 279 1.598782218 3.516726026 1.875293584 4.234975692 280 1.94538671 4.131961426 2.032722663 6.199494461 281 2.172592522 4.89858579 2.213274902 8.804952261 282 1.908668224 4.102312732 2.025416681 6.374172668 283 1.944434766 4.112266337 2.027872367 6.54672802 284 1.58387445 3.505557397 1.872313381 3.941581808 285 1.743721514 3.832670536 1.95772075 5.11349268 286 1.592453126 3.549329989 1.883966557 4.871143315 287 1.283414418 2.79971739 1.673235605 1.54329811 288 1.320439849 2.90690106 1.704963654 2.070653036 289 1.194572818 2.708716646 1.645817926 1.184789985 290 1.231175294 2.681021529 1.637382524 1.115141591 291 1.365322726 3.074795481 1.753509476 2.254444718 292 1.408422528 3.000719815 1.732258588 2.422144328 293 2.184734225 4.886582645 2.210561613 8.803574418 294 2.030652566 4.649187071 2.156197364 7.901007052 295 1.890679763 4.02356438 2.005882444 6.212726329 296 1.855414729 4.027135813 2.006772486 5.858647185 297 1.819146836 3.737669618 1.933305361 5.340274716 298 1.51380043 3.337192052 1.826798306 3.514823642 299 1.923936518 4.162158962 2.040136996 6.485993092 300 2.54480266 5.875913394 2.42402834 11.44989333 301 2.015083881 4.471638793 2.114624977 7.725330038 302 1.902054478 4.30514559 2.074884476 7.141300544 303 1.932189012 4.149463861 2.037023284 7.112433389 304 1.357151358 2.977059008 1.725415605 2.054123734 305 2.172040349 4.677490848 2.162750759 7.584422406 306 2.12856108 4.80073697 2.191058413 8.165269798 307 1.597383378 3.38269391 1.839210132 3.948162052 308 1.571436916 3.451890496 1.857926397 3.970291914 309 1.669116161 3.728100167 1.930828881 4.514479321 310 1.792870023 3.818920387 1.95420582 5.395716273 311 2.701422654 6.042632834 2.45817673 12.51412704 312 2.724885462 6.056784013 
2.461053436 12.63817968 313 2.649668658 5.96870756 2.44309385 12.34583459 314 1.328012928 3.19047635 1.786190457 2.231089091 315 2.290836238 4.827072968 2.197060074 8.67484801 316 2.375600157 5.495650681 2.344280419 10.05803872 317 1.625886455 3.693369359 1.92181408 5.110178924 318 2.329332455 5.313205979 2.305039258 9.613803477 319 1.515480102 3.456316681 1.859117178 3.185525845 320 1.472454994 3.284663565 1.812364082 3.025291076 321 1.506165026 3.349904087 1.83027432 3.054001182 322 1.473374347 3.306520335 1.81838399 3.25617161 323 1.527068855 3.325036021 1.82346813 3.36729583 324 2.110354575 4.662495253 2.159281189 8.165363632 325 1.523787537 3.514526067 1.874706928 3.062923523 326 2.023599447 4.094344562 2.02344868 6.938769333 327 2.753898938 5.917548864 2.432601255 12.66032792 328 2.617941755 5.678362097 2.382931408 11.46939146 329 2.119034653 4.483002552 2.117310216 8.240121298 330 2.066147705 4.476199805 2.115703147 7.14397299 331 2.101925481 4.630837933 2.151938181 7.659327016 332 2.508777239 5.407171771 2.325332615 11.8987611 333 2.005568244 4.463030419 2.112588559 7.397665697 334 1.726738648 3.759687344 1.938991321 4.430816799 335 1.774901671 3.812975852 1.952684268 4.937562683 336 1.648959883 3.423610976 1.85030024 3.988984047 337 1.777714463 3.797733859 1.948777529 5.403847868 338 1.704136403 3.63758616 1.9072457 5.272486607 339 1.844729114 3.968445871 1.992095849 6.429622699 340 1.768267491 3.797733859 1.948777529 5.523339153 341 2.159320704 4.744410253 2.178166718 8.301035184 342 2.109497906 4.580877493 2.140298459 7.726636028 343 2.521315024 5.573617308 2.360850971 11.60597801 344 2.576758408 5.79269513 2.406801847 11.8427421 345 2.669803365 5.872117789 2.423245301 12.427118 346 2.441001399 5.430134791 2.330264962 11.48863277 347 2.117775721 4.750395438 2.17954019 8.556413905 348 2.023599447 4.553734634 2.133948133 7.34601021 349 2.394268344 5.066826574 2.250961255 9.954988325 350 2.106053393 4.696472344 2.167134593 7.892825526 351 2.100394247 
4.736330019 2.176311103 7.72870183 352 2.160269524 4.922204729 2.21860423 8.255482913 353 2.188997276 4.774912961 2.185157422 9.409191231 354 1.874905013 3.86388263 1.965676125 6.003887067 355 2.061842158 4.182126476 2.045024811 6.745236349 356 1.418864374 3.119939077 1.766334928 2.7631695

    library(ggplot2)
    # read the data and fit the linear model
    data <- read.csv("data.csv", header = TRUE)
    fit.all <- lm(a ~ x + y + z, data = data)
    # append the fit, lwr and upr columns (prediction interval) to the data
    b <- data.frame(data, predict(fit.all, interval = "prediction"))
    # plot from b, because lwr and upr only exist there
    ggplot(b, aes(x = x + y + z, y = a)) +
      geom_point(size = 3, shape = 1, col = "black") +
      geom_smooth(method = "lm") +
      geom_line(aes(y = lwr), color = "red", linetype = "dashed") +
      geom_line(aes(y = upr), color = "red", linetype = "dashed") +
      theme_classic()
As Ben suggested, that approach is not going to produce a sensible graphical display. What you could do instead is examine the relationship between each predictor and the outcome separately, holding the other predictors constant at some chosen level. Here I use their means as those levels.

    library(ggplot2)

    # vary x, hold y and z at their means
    data_x.yz <- data.frame(
      x = seq(min(data$x), max(data$x), 0.1),
      y = mean(data$y),
      z = mean(data$z)
    )
    data_x.yz <- cbind(
      data_x.yz,
      predict(fit.all, newdata = data_x.yz, interval = "prediction")
    )
    ggplot(data_x.yz, aes(x, fit, ymin = lwr, ymax = upr)) +
      geom_line(color = "blue") +
      geom_ribbon(fill = NA, color = "red", linetype = "dashed")

    # vary y, hold x and z at their means
    data_y.xz <- data.frame(
      x = mean(data$x),
      y = seq(min(data$y), max(data$y), 0.1),
      z = mean(data$z)
    )
    data_y.xz <- cbind(
      data_y.xz,
      predict(fit.all, newdata = data_y.xz, interval = "prediction")
    )
    ggplot(data_y.xz, aes(y, fit, ymin = lwr, ymax = upr)) +
      geom_line(color = "blue") +
      geom_ribbon(fill = NA, color = "red", linetype = "dashed")

    # vary z over a fixed grid from 1.6 to 2.6, hold x and y at their means
    data_z.yx <- data.frame(
      x = mean(data$x),
      y = mean(data$y),
      z = seq(1.6, 2.6, 0.1)
    )
    data_z.yx <- cbind(
      data_z.yx,
      predict(fit.all, newdata = data_z.yx, interval = "prediction")
    )
    ggplot(data_z.yx, aes(z, fit, ymin = lwr, ymax = upr)) +
      geom_line(color = "blue") +
      geom_ribbon(fill = NA, color = "red", linetype = "dashed")
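The three near-identical blocks above could be folded into one helper. The following is only a sketch under stated assumptions: the data are synthetic stand-ins for the question's file (columns a, x, y, z assumed), and `profile_grid` is a hypothetical name, not part of any package. It uses `length.out` instead of a fixed step of 0.1, which is a design choice rather than something the answer prescribes.

```r
# Synthetic stand-in for the question's data (assumption: columns a, x, y, z)
set.seed(1)
n <- 100
data <- data.frame(
  x = runif(n, 1, 3),
  y = runif(n, 3, 7),
  z = runif(n, 1.6, 2.6)
)
data$a <- 2 * data$x + data$y - data$z + rnorm(n)
fit.all <- lm(a ~ x + y + z, data = data)

# Hypothetical helper: vary one predictor over its observed range,
# hold the other predictors at their means, and attach fit/lwr/upr.
profile_grid <- function(fit, data, vary,
                         predictors = c("x", "y", "z"), n_points = 50) {
  # start with every predictor fixed at its mean
  grid <- as.data.frame(
    lapply(data[predictors], function(col) rep(mean(col), n_points))
  )
  # overwrite the predictor of interest with an evenly spaced grid
  grid[[vary]] <- seq(min(data[[vary]]), max(data[[vary]]),
                      length.out = n_points)
  cbind(grid, predict(fit, newdata = grid, interval = "prediction"))
}

# one data frame per predictor, each ready for the ggplot call above
grids <- lapply(c(x = "x", y = "y", z = "z"),
                function(v) profile_grid(fit.all, data, v))
head(grids$x)
```

Each element of `grids` has the same fit, lwr and upr columns as the data frames in the answer, so it can be passed to the same `geom_line()` plus `geom_ribbon()` plot.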
Subset timeseries (date sequence) into a list
I have a data frame with a series of dates; here's a simplified version of it:

    > eventdates
      dr.rank   dr.start     dr.end
    1      14 1964-09-30 1964-10-06
    2      16 1964-11-01 1964-12-24

I also have a time series of dates with associated values; here's a much simplified version of it:

    ts1964 <- data.frame(
      DATE = seq(from = as.Date("1964-01-01"), to = as.Date("1964-12-31"), by = "days"),
      Q = 1:366
    )

What I am trying to do is subset by each date range in eventdates, i.e.:

    > filter(ts1964, ts1964$DATE >= eventdates[1,2] & ts1964$DATE <= eventdates[1,3])
             DATE   Q
    1  1964-09-30 274
    2  1964-10-01 275
    3  1964-10-02 276
    4  1964-10-03 277
    5  1964-10-04 278
    6  1964-10-05 279
    7  1964-10-06 280
    8  1964-10-07 281
    9  1964-10-08 282
    10 1964-10-09 283
    11 1964-10-10 284
    12 1964-10-11 285
    13 1964-10-12 286
    14 1964-10-13 287
    15 1964-10-14 288
    16 1964-10-15 289
    17 1964-10-16 290
    18 1964-10-17 291
    19 1964-10-18 292
    20 1964-10-19 293
    21 1964-10-20 294
    22 1964-10-21 295
    23 1964-10-22 296
    24 1964-10-23 297
    25 1964-10-24 298
    26 1964-10-25 299
    27 1964-10-26 300
    28 1964-10-27 301
    29 1964-10-28 302
    30 1964-10-29 303
    31 1964-10-30 304
    32 1964-10-31 305
    33 1964-11-01 306
    >

But I need to do this hundreds of times. What I would like is for each subset to form an element in a list. I would normally consider using something like dlply in plyr, but this isn't an option when I'm using dplyr. Could anyone advise on how I might achieve this otherwise? Thanks
We can use Map:

    library(dplyr)
    # one filtered data frame per (dr.start, dr.end) pair, returned as a list
    Map(function(x, y) filter(ts1964, DATE >= x & DATE <= y),
        eventdates$dr.start, eventdates$dr.end)
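For a self-contained check of the Map idea, here is a minimal sketch that rebuilds the two data frames from the question and uses base subsetting in place of dplyr's filter (a substitution made only to avoid the package dependency). The list names built from dr.rank are an illustrative assumption, not something the question asks for.

```r
# Rebuild the question's example data
eventdates <- data.frame(
  dr.rank  = c(14, 16),
  dr.start = as.Date(c("1964-09-30", "1964-11-01")),
  dr.end   = as.Date(c("1964-10-06", "1964-12-24"))
)
ts1964 <- data.frame(
  DATE = seq(as.Date("1964-01-01"), as.Date("1964-12-31"), by = "days"),
  Q = 1:366
)

# Map walks over the start and end dates in parallel and returns a list
# with one subset of ts1964 per row of eventdates
subsets <- Map(
  function(start, end) ts1964[ts1964$DATE >= start & ts1964$DATE <= end, ],
  eventdates$dr.start, eventdates$dr.end
)

# Illustrative naming of the list elements by event rank
names(subsets) <- paste0("rank_", eventdates$dr.rank)

# Number of days captured for each event
sapply(subsets, nrow)
```

Because the result is an ordinary list, it scales to hundreds of events without any change, and `lapply()` can then be used to summarise each subset.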
grep: How can I search through my data using a wildcard in R
I have recently started using R and am trying to get some data out of it, but the results I get are quite confusing. I have daily data from 1961 to 1963, with dates in the format 1961-04-25, stored in a vector called date. To display only the dates between April 10 and May 21, I used grep in this command:

    date[date >= grep("196.-04-10", date, value = TRUE) & date <= grep("196.-05-21", date, value = TRUE)]

The result is confusing: it steps through the dates three days at a time instead of giving me every single day, see below.

    [1] "1961-04-10" "1961-04-13" "1961-04-16" "1961-04-19" "1961-04-22" "1961-04-25" "1961-04-28" "1961-05-01" "1961-05-04" "1961-05-07" "1961-05-10"
    [12] "1961-05-13" "1961-05-16" "1961-05-19" "1962-04-12" "1962-04-15" "1962-04-18" "1962-04-21" "1962-04-24" "1962-04-27" "1962-04-30" "1962-05-03"
    [23] "1962-05-06" "1962-05-09" "1962-05-12" "1962-05-15" "1962-05-18" "1962-05-21" "1963-04-11" "1963-04-14" "1963-04-17" "1963-04-20" "1963-04-23"
    [34] "1963-04-26" "1963-04-29" "1963-05-02" "1963-05-05" "1963-05-08" "1963-05-11" "1963-05-14" "1963-05-17" "1963-05-20"
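A likely explanation for the three-day steps is vector recycling: each grep call returns three strings (one per year), and R recycles those three thresholds along the whole date vector. The sketch below reproduces the pattern, assuming date is a character vector covering every day from 1961-01-01 to 1963-12-31; the variable names lower, upper, skipped and every_day are illustrative.

```r
# Assumption: `date` is a character vector of daily dates, 1961 to 1963
date <- as.character(seq(as.Date("1961-01-01"), as.Date("1963-12-31"), by = "day"))

lower <- grep("196.-04-10", date, value = TRUE)  # one match per year
upper <- grep("196.-05-21", date, value = TRUE)  # one match per year

# The comparison recycles the 3 thresholds along the 1095 dates, so
# element i is tested against threshold ((i - 1) %% 3) + 1 rather than
# against the threshold for its own year. 1095 is a multiple of 3, so
# R gives no recycling warning.
skipped <- date[date >= lower & date <= upper]
head(skipped)

# Pairing each date with the thresholds of its own year keeps every day
yr <- substr(date, 1, 4)
every_day <- date[date >= paste0(yr, "-04-10") & date <= paste0(yr, "-05-21")]
length(every_day)
```

Only every third date is compared against the thresholds for its own year, which is why two out of three days drop out.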
I think the grep strategy is misguided, but maybe something like this will work. Basically, I'm computing the day of the year (Julian date, yday()) and using that for comparison.

    library(lubridate)
    z <- as.Date(c("1961-04-10","1961-04-11","1961-04-12",
                   "1961-05-21","1961-05-22","1961-05-23",
                   "1963-04-09","1963-04-12","1963-05-21","1963-05-22"))
    z[yday(z) >= yday(as.Date("1961-04-10")) & yday(z) <= yday(as.Date("1961-05-21"))]
    ## [1] "1961-04-10" "1961-04-11" "1961-04-12" "1961-05-21" "1963-04-12"
    ## [6] "1963-05-21"

Actually, this solution is fragile to leap years. Better (?):

    yz <- year(z)
    # build the window boundaries from each date's own year
    z[z >= as.Date(paste0(yz, "-04-10")) & z <= as.Date(paste0(yz, "-05-21"))]

(You should definitely test this for yourself; I haven't tested it carefully!)
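To make the leap-year caveat concrete, here is a small sketch in base R. It swaps lubridate's yday() and year() for format() with "%j" and "%Y", and it uses 1964 as the leap year purely for illustration, since the question's data only cover 1961 to 1963, none of which is a leap year.

```r
# Dates around the window in a common year (1961) and a leap year (1964)
z <- as.Date(c("1961-04-09", "1961-04-10", "1961-05-21", "1961-05-22",
               "1964-04-09", "1964-04-10", "1964-05-21", "1964-05-22"))

# Day of year; after 29 February every leap-year date is shifted by one
doy <- as.integer(format(z, "%j"))

# Fixed day-of-year thresholds taken from 1961 (April 10 = 100, May 21 = 141)
# wrongly keep 1964-04-09 and wrongly drop 1964-05-21
by_doy <- z[doy >= 100 & doy <= 141]

# Thresholds built from each date's own year are not affected
yr <- format(z, "%Y")
by_year <- z[z >= as.Date(paste0(yr, "-04-10")) &
             z <= as.Date(paste0(yr, "-05-21"))]

by_doy
by_year
```

The second form compares real dates against real dates, so it stays correct whatever mix of years the data contain.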
Using a date format for your variable would be the best bet here.

    ## set up some test data
    datevar <- seq.Date(as.Date("1961-01-01"), as.Date("1963-12-31"), by = "day")
    test <- data.frame(date = datevar, id = 1:(length(datevar)))
    head(test)

    ## which looks like:
    > head(test)
            date id
    1 1961-01-01  1
    2 1961-01-02  2
    3 1961-01-03  3
    4 1961-01-04  4
    5 1961-01-05  5
    6 1961-01-06  6

    ## find the date ranges you want
    selectdates <-
      (format(test$date, "%m") == "04" & as.numeric(format(test$date, "%d")) >= 10) |
      (format(test$date, "%m") == "05" & as.numeric(format(test$date, "%d")) <= 21)

    ## subset the original data
    result <- test[selectdates, ]

    ## which looks as expected:
    > result
              date  id
    100 1961-04-10 100
    101 1961-04-11 101
    102 1961-04-12 102
    103 1961-04-13 103
    104 1961-04-14 104
    105 1961-04-15 105
    106 1961-04-16 106
    107 1961-04-17 107
    108 1961-04-18 108
    109 1961-04-19 109
    110 1961-04-20 110
    111 1961-04-21 111
    112 1961-04-22 112
    113 1961-04-23 113
    114 1961-04-24 114
    115 1961-04-25 115
    116 1961-04-26 116
    117 1961-04-27 117
    118 1961-04-28 118
    119 1961-04-29 119
    120 1961-04-30 120
    121 1961-05-01 121
    122 1961-05-02 122
    123 1961-05-03 123
    124 1961-05-04 124
    125 1961-05-05 125
    126 1961-05-06 126
    127 1961-05-07 127
    128 1961-05-08 128
    129 1961-05-09 129
    130 1961-05-10 130
    131 1961-05-11 131
    132 1961-05-12 132
    133 1961-05-13 133
    134 1961-05-14 134
    135 1961-05-15 135
    136 1961-05-16 136
    137 1961-05-17 137
    138 1961-05-18 138
    139 1961-05-19 139
    140 1961-05-20 140
    141 1961-05-21 141
    465 1962-04-10 465
    ...
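A more compact variant of the same idea, offered only as a sketch, is to format each date as a zero-padded "month-day" string and compare those strings directly. Because every string has the same "mm-dd" layout, alphabetical order matches calendar order within a year. The names md and in_window are illustrative.

```r
## same test data as above
datevar <- seq(as.Date("1961-01-01"), as.Date("1963-12-31"), by = "day")
test <- data.frame(date = datevar, id = seq_along(datevar))

## zero-padded month-day strings, e.g. "04-10"
md <- format(test$date, "%m-%d")

## keep April 10 through May 21 of every year
in_window <- md >= "04-10" & md <= "05-21"
result <- test[in_window, ]

## number of selected days per year
table(format(result$date, "%Y"))
```

This avoids spelling out a separate condition for each month, which becomes more convenient when the window spans three or more months.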