Extrapolating data using interp linear interpolation for water quality data visualizations - r

Background: I'm creating data visualizations from water quality data. The code I have works well for the most part, but sometimes we have to drop the surface readings when the water is too choppy to get good data, so we end up with data starting at 2 m (sometimes 3 m) depth. When we drop those values, the interpolation returns NA values at the surface, which makes the visualizations unusable for our purpose. See images for examples: "Example of a good visualization" and "Example of visualization with surface NAs".
An example data snippet and further explanation follow the code.
The akima::interp function does not allow for extrapolation when using linear interpolation, and the non-linear option isn't appropriate for our data.
I'm looking for a workaround for this. My thoughts were either:
1. a line of code that does something like: "if the minimum depth at a station is greater than 1, copy the data from the minimum depth to all depths between 1 and the minimum" (disclaimer: I know this isn't appropriate for analysis, but for our purpose of creating visualizations to show data trends we are okay with this), or
2. something I'm missing within the function that would allow extrapolation (or a different linear interpolation function).
Here is the code I have so far (this is taken from within a larger function).
library(akima); library(dplyr)
library("data.table");library(tidyverse);library(naniar)
library(magrittr);library(janitor);library(lubridate);library(wql)
##Example data import
example.data<-structure(list(Station = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6,
6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8,
8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10,
10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 11, 11, 11,
11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 12, 12, 12, 12, 12, 12,
12, 12, 13, 13, 13, 13, 13, 13, 13, 13, 13, 14, 14, 14, 14, 14,
14, 14, 14, 14, 14, 14, 14, 14, 14, 15, 15, 15, 15, 15, 15, 15,
15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 16,
16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 17, 17, 17, 17, 17, 17,
17, 17, 17, 17, 17, 17, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18,
18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18,
18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 20, 20, 20,
20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20,
20, 20, 20, 20, 20, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21,
21, 21, 21, 21, 21, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22,
22, 22, 22, 22, 22, 22, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23,
23, 23, 23, 24, 24, 24, 24, 24, 24, 24, 24, 25, 25, 25, 25, 25,
25, 26, 26, 26, 26, 26, 26, 26, 26, 27, 27, 27, 27, 27, 27, 27,
27, 27, 27, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 29,
29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29.5, 29.5, 29.5, 29.5,
29.5, 29.5, 29.5, 29.5, 29.5, 29.5, 29.5, 29.5, 30, 30, 30, 30,
30, 30, 30, 30, 30, 30, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31,
32, 32, 32, 32, 32, 32, 32, 32, 33, 33, 33, 33, 33, 33, 33, 33,
33, 33, 34, 34, 649, 649, 649, 649, 649, 649, 649, 649, 649,
649, 657, 657, 657, 657, 657, 657, 657, 657, 657, 657), Distance.from.36 = c(127.7053,
127.7053, 127.7053, 127.7053, 127.7053, 127.7053, 127.7053, 127.7053,
127.7053, 127.7053, 125.3243, 125.3243, 125.3243, 125.3243, 125.3243,
125.3243, 125.3243, 125.3243, 125.3243, 125.3243, 119.8957, 119.8957,
119.8957, 119.8957, 119.8957, 119.8957, 119.8957, 119.8957, 119.8957,
119.8957, 119.8957, 119.8957, 119.8957, 119.8957, 119.8957, 119.8957,
115.6252, 115.6252, 115.6252, 115.6252, 115.6252, 115.6252, 115.6252,
115.6252, 115.6252, 115.6252, 115.6252, 115.6252, 110.8977, 110.8977,
110.8977, 110.8977, 110.8977, 110.8977, 110.8977, 110.8977, 110.8977,
110.8977, 104.981, 104.981, 104.981, 104.981, 104.981, 104.981,
104.981, 104.981, 104.981, 104.981, 104.981, 104.981, 104.981,
99.7699, 99.7699, 99.7699, 99.7699, 99.7699, 99.7699, 99.7699,
99.7699, 99.7699, 99.7699, 99.7699, 99.7699, 99.7699, 99.7699,
99.7699, 96.7889, 96.7889, 96.7889, 96.7889, 96.7889, 96.7889,
96.7889, 96.7889, 96.7889, 96.7889, 96.7889, 96.7889, 96.7889,
96.7889, 96.7889, 96.7889, 96.7889, 96.7889, 96.7889, 96.7889,
96.7889, 96.7889, 96.7889, 96.7889, 96.7889, 93.5951, 93.5951,
93.5951, 93.5951, 93.5951, 93.5951, 93.5951, 93.5951, 93.5951,
93.5951, 93.5951, 93.5951, 93.5951, 93.5951, 93.5951, 93.5951,
93.5951, 89.7672, 89.7672, 89.7672, 89.7672, 89.7672, 89.7672,
89.7672, 89.7672, 89.7672, 89.7672, 89.7672, 89.7672, 89.7672,
84.4458, 84.4458, 84.4458, 84.4458, 84.4458, 84.4458, 84.4458,
84.4458, 78.5444, 78.5444, 78.5444, 78.5444, 78.5444, 78.5444,
78.5444, 78.5444, 78.5444, 74.4288, 74.4288, 74.4288, 74.4288,
74.4288, 74.4288, 74.4288, 74.4288, 74.4288, 74.4288, 74.4288,
74.4288, 74.4288, 74.4288, 69.9895, 69.9895, 69.9895, 69.9895,
69.9895, 69.9895, 69.9895, 69.9895, 69.9895, 69.9895, 69.9895,
69.9895, 69.9895, 69.9895, 69.9895, 69.9895, 69.9895, 69.9895,
69.9895, 69.9895, 69.9895, 69.9895, 63.0794, 63.0794, 63.0794,
63.0794, 63.0794, 63.0794, 63.0794, 63.0794, 63.0794, 63.0794,
63.0794, 58.8909, 58.8909, 58.8909, 58.8909, 58.8909, 58.8909,
58.8909, 58.8909, 58.8909, 58.8909, 58.8909, 58.8909, 54.9481,
54.9481, 54.9481, 54.9481, 54.9481, 54.9481, 54.9481, 54.9481,
54.9481, 54.9481, 54.9481, 54.9481, 54.9481, 54.9481, 54.9481,
54.9481, 54.9481, 54.9481, 54.9481, 54.9481, 54.9481, 54.9481,
54.9481, 54.9481, 54.9481, 54.9481, 54.9481, 54.9481, 54.9481,
54.9481, 54.9481, 54.9481, 54.9481, 54.9481, 54.9481, 54.9481,
54.9481, 54.9481, 54.9481, 51.4501, 51.4501, 51.4501, 51.4501,
51.4501, 51.4501, 51.4501, 51.4501, 51.4501, 51.4501, 51.4501,
51.4501, 51.4501, 51.4501, 51.4501, 51.4501, 51.4501, 51.4501,
51.4501, 51.4501, 51.4501, 51.4501, 51.4501, 51.4501, 46.6266,
46.6266, 46.6266, 46.6266, 46.6266, 46.6266, 46.6266, 46.6266,
46.6266, 46.6266, 46.6266, 46.6266, 46.6266, 46.6266, 46.6266,
46.6266, 43.921, 43.921, 43.921, 43.921, 43.921, 43.921, 43.921,
43.921, 43.921, 43.921, 43.921, 43.921, 43.921, 43.921, 43.921,
43.921, 43.921, 39.6557, 39.6557, 39.6557, 39.6557, 39.6557,
39.6557, 39.6557, 39.6557, 39.6557, 39.6557, 39.6557, 39.6557,
39.6557, 37.0911, 37.0911, 37.0911, 37.0911, 37.0911, 37.0911,
37.0911, 37.0911, 32.8382, 32.8382, 32.8382, 32.8382, 32.8382,
32.8382, 28.756, 28.756, 28.756, 28.756, 28.756, 28.756, 28.756,
28.756, 26.1872, 26.1872, 26.1872, 26.1872, 26.1872, 26.1872,
26.1872, 26.1872, 26.1872, 26.1872, 23.5695, 23.5695, 23.5695,
23.5695, 23.5695, 23.5695, 23.5695, 23.5695, 23.5695, 23.5695,
23.5695, 23.5695, 20.2526, 20.2526, 20.2526, 20.2526, 20.2526,
20.2526, 20.2526, 20.2526, 20.2526, 20.2526, 20.2526, 17.564,
17.564, 17.564, 17.564, 17.564, 17.564, 17.564, 17.564, 17.564,
17.564, 17.564, 17.564, 14.7543, 14.7543, 14.7543, 14.7543, 14.7543,
14.7543, 14.7543, 14.7543, 14.7543, 14.7543, 10.6065, 10.6065,
10.6065, 10.6065, 10.6065, 10.6065, 10.6065, 10.6065, 10.6065,
10.6065, 8.1507, 8.1507, 8.1507, 8.1507, 8.1507, 8.1507, 8.1507,
8.1507, 6.8085, 6.8085, 6.8085, 6.8085, 6.8085, 6.8085, 6.8085,
6.8085, 6.8085, 6.8085, 3.529, 3.529, 132.9388, 132.9388, 132.9388,
132.9388, 132.9388, 132.9388, 132.9388, 132.9388, 132.9388, 132.9388,
147.2548, 147.2548, 147.2548, 147.2548, 147.2548, 147.2548, 147.2548,
147.2548, 147.2548, 147.2548), Depth = c(2L, 3L, 4L, 5L, 6L,
7L, 8L, 9L, 10L, 11L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L,
16L, 17L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 2L, 3L, 4L, 5L, 6L,
7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 2L, 3L, 4L, 5L, 6L, 7L,
8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 2L, 3L, 4L, 5L, 6L,
7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L,
20L, 21L, 22L, 23L, 24L, 25L, 26L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 2L, 3L, 4L, 5L,
6L, 7L, 8L, 9L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 2L, 3L,
4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 2L, 3L,
4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L,
18L, 19L, 20L, 21L, 22L, 23L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L,
10L, 11L, 12L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L,
13L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L,
16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L,
29L, 30L, 31L, 32L, 33L, 34L, 35L, 36L, 37L, 38L, 39L, 40L, 41L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L,
16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 2L,
3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L,
17L, 18L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L,
14L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 2L, 3L, 4L, 5L, 6L, 7L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L,
10L, 11L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 2L, 3L, 4L, 5L,
6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 10L, 11L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 2L, 3L,
4L, 5L, 6L, 7L, 8L, 9L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L,
3L, 4L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 9L, 10L, 11L), Calc.Chl = c(3.3, 3.2, 3.2, 3.2,
3.2, 3.2, 3.1, 3.1, 3.2, 3.2, 3.4, 3.4, 3.6, 3.6, 3.6, 3.7, 3.5,
3.4, 3.6, 3.6, 3.7, 3.6, 3.6, 3.6, 3.6, 3.6, 3.6, 3.6, 3.5, 3.4,
3.5, 3.5, 3.5, 3.5, 3.5, 3.4, 3.9, 3.9, 3.8, 3.9, 3.8, 3.8, 3.8,
3.8, 3.7, 3.7, 3.7, 3.6, 4.4, 4.4, 4.3, 4.4, 4.3, 4.1, 4.1, 4.1,
4, 4.1, 3.3, 3, 2.9, 2.9, 2.9, 2.8, 2.6, 2.8, 2.8, 2.7, 2.7,
2.7, 2.8, 3.1, 3.1, 2.8, 2.5, 2.5, 2.4, 2.2, 2.2, 2.1, 2, 1.7,
1.9, 2, 2.1, 2.1, 2.2, 2.2, 2, 2, 2, 1.9, 1.9, 1.9, 1.8, 1.7,
1.6, 1.6, 1.6, 1.8, 1.9, 1.9, 1.9, 1.9, 2, 1.9, 2, 2, 2, 2, 2,
1.2, 1.9, 1.9, 1.8, 1.7, 1.6, 1.6, 1.6, 1.6, 1.6, 1.6, 1.6, 1.7,
1.4, 1.3, 1.8, 2, 1.2, 1.7, 1.7, 1.6, 1.6, 1.5, 1.5, 1.4, 1.5,
1.7, 1.7, 1.7, 1.7, 2.9, 2.8, 1.8, 1.8, 1.8, 2.2, 2.1, 1.7, 4.5,
3.8, 2.8, 2.3, 2.8, 2.9, 2.6, 2.4, 2.9, 3.8, 3.6, 3.1, 3, 3.1,
3, 2.9, 2.6, 3, 3.1, 2.8, 2.6, 2.8, 2.3, 4, 3.8, 3.2, 2.8, 2.6,
2.5, 2.7, 2.6, 2.5, 2.5, 2.5, 2.6, 2.7, 2.7, 2.6, 2.7, 3, 3,
2.8, 2.8, 2.8, 2.8, 4.1, 4.3, 4.4, 4.3, 4.3, 4.4, 4.6, 4.8, 4.9,
4.8, 4.8, 3.6, 4, 4.2, 4.2, 4, 4.3, 4.9, 5.1, 5.5, 5.5, 5.3,
5.7, 4.1, 5, 4.7, 4.9, 4.9, 5.2, 5.2, 5.2, 5.2, 5, 4.9, 4.9,
5.1, 5.1, 5.3, 5.2, 5.5, 5.6, 5.7, 5.7, 5.3, 5.2, 5.2, 5.4, 5.6,
5.8, 5.8, 5.8, 5.8, 6, 6.1, 6, 6, 6, 5.9, 6, 5.8, 6.1, 6, 4.5,
5, 5.1, 4.5, 4.6, 4.5, 4.5, 4.4, 4.4, 4.5, 4.5, 4.5, 4.4, 4.6,
4.6, 4.7, 4.8, 4.7, 4.9, 4.8, 4.5, 4.9, 5.3, 5.6, 4.1, 4.1, 4.1,
4.2, 4.2, 4.4, 4.4, 4.3, 4.4, 4.4, 4.5, 4.8, 4.7, 4.7, 4.9, 5.1,
4.1, 3.9, 3.9, 3.9, 3.8, 3.8, 3.9, 4, 4.2, 4.3, 4.5, 4.6, 5,
5.1, 5.2, 5.3, 5.2, 6.2, 5.7, 4.5, 4.1, 4.1, 4.2, 4.2, 4.3, 4.3,
4.4, 4.4, 4.4, 4.6, 4.3, 4.2, 4.1, 4.2, 4.2, 4.2, 4.3, 4.3, 4.7,
4.6, 4.6, 4.6, 4.6, 4.6, 4.2, 4.2, 4.2, 4.1, 4.1, 4.3, 4.3, 4.3,
4.8, 4.6, 4.6, 4.6, 4.6, 4.5, 4.4, 4.3, 4.3, 4.4, 4.7, 4.6, 4.7,
4.6, 4.5, 4.4, 4.3, 4.2, 4.3, 4.3, 4.4, 4.5, 5.5, 5.4, 5.3, 5.1,
5, 4.9, 4.9, 5, 5, 5.1, 5, 7.1, 6.9, 6.9, 6.9, 6.8, 6.8, 6.7,
6.5, 6.4, 6.1, 5.5, 5.3, 7.3, 7.2, 7.3, 7.4, 7.1, 6.8, 6.6, 6.4,
6.2, 6, 7.3, 8.2, 8.8, 9.6, 10.4, 10.5, 10.8, 11.1, 11.3, 11.4,
6.7, 6.6, 6.8, 6.9, 6.9, 7.3, 7.6, 7.8, 8.2, 8.2, 8.2, 8.1, 8.1,
8.1, 8.1, 8.1, 8.1, 8.5, 8.3, 8.1, 3.1, 3.1, 3.1, 3.1, 3.2, 3.1,
3.1, 3.2, 3.2, 3.2, 2.1, 2.2, 2.2, 2.3, 2.4, 2.4, 2.2, 2.3, 2.2,
2.1), DO = c(91L, 91L, 91L, 91L, 91L, 91L, 91L, 91L, 91L, 91L,
91L, 91L, 91L, 91L, 91L, 91L, 91L, 91L, 91L, 91L, 91L, 91L, 91L,
91L, 91L, 91L, 91L, 91L, 91L, 91L, 91L, 91L, 91L, 91L, 91L, 91L,
92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L, 92L,
92L, 92L, 91L, 91L, 91L, 91L, 91L, 91L, 91L, 92L, 91L, 91L, 91L,
91L, 91L, 90L, 90L, 90L, 90L, 90L, 90L, 90L, 91L, 90L, 90L, 89L,
89L, 89L, 89L, 88L, 88L, 88L, 88L, 88L, 88L, 88L, 88L, 89L, 89L,
89L, 88L, 88L, 88L, 88L, 88L, 88L, 87L, 87L, 87L, 87L, 87L, 87L,
87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 90L, 90L, 89L,
88L, 88L, 88L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L,
87L, 88L, 88L, 88L, 87L, 87L, 87L, 87L, 86L, 86L, 86L, 86L, 86L,
86L, 89L, 87L, 86L, 86L, 86L, 86L, 86L, 85L, 89L, 87L, 86L, 86L,
86L, 86L, 86L, 86L, 85L, 87L, 87L, 86L, 86L, 86L, 86L, 86L, 86L,
85L, 85L, 85L, 85L, 85L, 85L, 88L, 87L, 87L, 86L, 86L, 86L, 86L,
85L, 85L, 85L, 85L, 85L, 85L, 85L, 85L, 85L, 85L, 85L, 85L, 85L,
85L, 85L, 87L, 87L, 86L, 86L, 86L, 85L, 85L, 85L, 85L, 85L, 85L,
84L, 84L, 84L, 84L, 84L, 84L, 84L, 84L, 84L, 84L, 84L, 84L, 86L,
86L, 86L, 86L, 86L, 86L, 86L, 86L, 86L, 86L, 85L, 85L, 85L, 84L,
84L, 84L, 84L, 83L, 83L, 83L, 83L, 83L, 82L, 82L, 82L, 82L, 82L,
82L, 82L, 82L, 82L, 82L, 82L, 82L, 82L, 82L, 82L, 82L, 82L, 85L,
84L, 84L, 84L, 84L, 84L, 84L, 84L, 84L, 84L, 84L, 84L, 84L, 84L,
84L, 84L, 84L, 84L, 84L, 83L, 83L, 83L, 83L, 83L, 85L, 85L, 84L,
84L, 84L, 84L, 84L, 84L, 84L, 84L, 84L, 84L, 84L, 84L, 84L, 84L,
86L, 86L, 85L, 85L, 85L, 85L, 85L, 84L, 84L, 84L, 84L, 84L, 84L,
84L, 84L, 84L, 84L, 92L, 90L, 89L, 88L, 88L, 87L, 87L, 87L, 87L,
87L, 87L, 87L, 87L, 90L, 90L, 89L, 89L, 89L, 89L, 89L, 90L, 92L,
92L, 92L, 92L, 92L, 91L, 91L, 91L, 91L, 91L, 90L, 90L, 90L, 90L,
92L, 92L, 92L, 91L, 91L, 91L, 91L, 91L, 91L, 91L, 91L, 91L, 90L,
90L, 91L, 91L, 91L, 91L, 91L, 92L, 92L, 92L, 91L, 90L, 90L, 89L,
89L, 89L, 89L, 88L, 88L, 88L, 88L, 91L, 91L, 92L, 91L, 91L, 91L,
91L, 90L, 89L, 88L, 87L, 87L, 92L, 92L, 92L, 92L, 92L, 92L, 92L,
92L, 92L, 92L, 89L, 89L, 89L, 89L, 89L, 89L, 89L, 89L, 89L, 89L,
85L, 85L, 85L, 85L, 85L, 85L, 86L, 86L, 82L, 83L, 83L, 84L, 84L,
85L, 85L, 86L, 86L, 86L, 74L, 74L, 92L, 92L, 92L, 92L, 92L, 92L,
92L, 92L, 92L, 92L, 90L, 90L, 90L, 90L, 90L, 90L, 90L, 90L, 90L,
90L), Calc.SPM = c(35L, 38L, 37L, 39L, 42L, 42L, 45L, 48L, 46L,
46L, 46L, 44L, 46L, 52L, 50L, 53L, 59L, 67L, 65L, 65L, 41L, 41L,
41L, 40L, 42L, 41L, 41L, 45L, 46L, 47L, 46L, 47L, 47L, 48L, 48L,
49L, 47L, 47L, 46L, 48L, 48L, 51L, 55L, 55L, 65L, 66L, 70L, 72L,
55L, 55L, 60L, 60L, 62L, 68L, 71L, 69L, 77L, 72L, 47L, 54L, 60L,
63L, 68L, 74L, 94L, 94L, 106L, 123L, 120L, 127L, 130L, 36L, 33L,
31L, 34L, 34L, 41L, 55L, 68L, 87L, 114L, 168L, 204L, 226L, 240L,
262L, 42L, 44L, 54L, 58L, 52L, 52L, 51L, 50L, 58L, 68L, 90L,
118L, 156L, 174L, 180L, 187L, 180L, 190L, 189L, 187L, 187L, 185L,
184L, 183L, 192L, 28L, 28L, 28L, 29L, 30L, 31L, 31L, 35L, 40L,
41L, 45L, 48L, 53L, 78L, 169L, 186L, 195L, 25L, 25L, 25L, 27L,
29L, 32L, 32L, 41L, 65L, 70L, 67L, 66L, 69L, 27L, 29L, 74L, 102L,
148L, 141L, 149L, 217L, 24L, 25L, 33L, 88L, 118L, 133L, 170L,
223L, 225L, 27L, 32L, 36L, 41L, 45L, 52L, 66L, 106L, 110L, 106L,
113L, 136L, 149L, 185L, 23L, 23L, 23L, 23L, 24L, 28L, 33L, 40L,
60L, 72L, 76L, 84L, 90L, 92L, 91L, 91L, 90L, 89L, 86L, 85L, 84L,
86L, 15L, 14L, 15L, 14L, 20L, 29L, 32L, 34L, 34L, 37L, 45L, 14L,
14L, 15L, 16L, 17L, 40L, 47L, 53L, 67L, 70L, 85L, 100L, 11L,
11L, 11L, 12L, 12L, 12L, 11L, 11L, 11L, 11L, 12L, 12L, 12L, 12L,
13L, 14L, 16L, 15L, 15L, 15L, 15L, 16L, 16L, 16L, 17L, 18L, 19L,
18L, 20L, 22L, 23L, 24L, 27L, 30L, 28L, 26L, 25L, 24L, 25L, 13L,
14L, 16L, 16L, 17L, 17L, 19L, 20L, 21L, 21L, 21L, 22L, 23L, 23L,
23L, 23L, 21L, 20L, 23L, 22L, 25L, 36L, 39L, 41L, 30L, 34L, 40L,
47L, 53L, 56L, 62L, 64L, 67L, 70L, 88L, 93L, 87L, 89L, 98L, 109L,
17L, 19L, 19L, 21L, 22L, 23L, 28L, 38L, 50L, 63L, 70L, 100L,
118L, 117L, 128L, 135L, 124L, 22L, 19L, 18L, 26L, 42L, 51L, 63L,
65L, 70L, 68L, 70L, 79L, 92L, 31L, 31L, 30L, 31L, 35L, 38L, 42L,
44L, 47L, 45L, 48L, 50L, 51L, 50L, 22L, 23L, 23L, 23L, 31L, 54L,
58L, 56L, 25L, 25L, 25L, 27L, 27L, 25L, 24L, 26L, 30L, 34L, 23L,
23L, 23L, 24L, 24L, 22L, 21L, 20L, 20L, 20L, 21L, 24L, 23L, 25L,
26L, 26L, 26L, 28L, 31L, 36L, 34L, 36L, 34L, 24L, 26L, 28L, 28L,
32L, 32L, 32L, 32L, 31L, 32L, 34L, 37L, 27L, 27L, 29L, 29L, 27L,
27L, 27L, 25L, 25L, 23L, 35L, 35L, 35L, 35L, 39L, 40L, 40L, 43L,
45L, 43L, 67L, 72L, 75L, 82L, 96L, 137L, 142L, 105L, 245L, 264L,
267L, 279L, 248L, 224L, 180L, 137L, 104L, 98L, 293L, 294L, 33L,
35L, 36L, 36L, 35L, 34L, 38L, 38L, 38L, 39L, 27L, 27L, 27L, 28L,
28L, 28L, 28L, 28L, 28L, 29L), Salinity = c(0.06, 0.06, 0.06,
0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.08, 0.08, 0.08, 0.08,
0.08, 0.08, 0.07, 0.07, 0.07, 0.07, 0.08, 0.08, 0.08, 0.08, 0.08,
0.08, 0.08, 0.08, 0.08, 0.08, 0.08, 0.08, 0.08, 0.08, 0.08, 0.08,
0.16, 0.16, 0.16, 0.16, 0.16, 0.16, 0.16, 0.16, 0.16, 0.16, 0.16,
0.16, 0.66, 0.65, 0.66, 0.66, 0.66, 0.66, 0.66, 0.65, 0.65, 0.65,
2.97, 3.12, 3.15, 3.16, 3.17, 3.17, 3.16, 3.16, 3.2, 3.23, 3.23,
3.24, 3.24, 3.86, 4.2, 4.57, 4.68, 4.7, 5, 5.26, 5.5, 5.84, 6.05,
6.12, 6.13, 6.12, 6.12, 6.12, 6.01, 6.23, 6.78, 7.03, 7.16, 7.25,
7.35, 7.34, 7.74, 8.05, 8.25, 8.38, 8.54, 8.59, 8.59, 8.6, 8.58,
8.61, 8.61, 8.63, 8.63, 8.64, 8.63, 8.64, 8.64, 6.87, 6.88, 6.93,
7.37, 7.67, 7.89, 8.09, 8.34, 8.5, 8.54, 8.57, 8.67, 8.81, 9.28,
10.01, 10.07, 10.09, 9.45, 9.5, 9.53, 9.74, 9.92, 10.11, 10.23,
10.94, 11.77, 11.87, 11.79, 11.77, 11.85, 13.1, 13.71, 14.52,
14.63, 14.98, 15.06, 15.06, 15.05, 16.77, 17.64, 18.82, 20.03,
20.27, 20.33, 20.48, 20.56, 20.57, 20.6, 21.19, 21.56, 21.66,
21.73, 21.78, 21.83, 21.88, 21.87, 21.87, 21.88, 21.89, 21.89,
21.89, 20.12, 20.13, 20.23, 20.31, 20.65, 21.24, 21.62, 21.87,
22.12, 22.18, 22.24, 22.34, 22.45, 22.53, 22.55, 22.53, 22.53,
22.52, 22.56, 22.57, 22.57, 22.56, 23.69, 25.05, 25.27, 25.53,
26.04, 26.32, 26.36, 26.38, 26.41, 26.46, 26.46, 25.11, 25.72,
25.97, 26.19, 26.59, 26.97, 27.05, 27.09, 27.15, 27.15, 27.22,
27.22, 27.36, 27.43, 27.48, 27.44, 27.33, 27.26, 27.27, 27.33,
27.37, 27.57, 27.68, 27.72, 27.79, 28.01, 28.19, 28.38, 28.47,
28.48, 28.48, 28.57, 28.66, 28.68, 28.8, 28.85, 28.92, 28.93,
28.95, 28.95, 28.95, 28.93, 28.93, 28.93, 28.91, 28.9, 28.91,
28.91, 28.92, 28.92, 28.92, 26.77, 26.95, 27.12, 27.26, 27.35,
27.33, 27.39, 27.41, 27.44, 27.46, 27.46, 27.47, 27.51, 27.58,
27.61, 27.62, 27.66, 27.71, 27.82, 27.73, 27.87, 27.94, 27.94,
27.94, 26.28, 26.32, 26.42, 26.48, 26.55, 26.59, 26.62, 26.62,
26.64, 26.65, 26.65, 26.65, 26.66, 26.66, 26.65, 26.65, 25.74,
25.9, 25.91, 26.16, 26.23, 26.25, 26.37, 26.45, 26.48, 26.49,
26.5, 26.52, 26.53, 26.53, 26.53, 26.53, 26.53, 24.72, 25, 25.13,
25.3, 25.43, 25.46, 25.49, 25.49, 25.5, 25.49, 25.49, 25.5, 25.5,
25.36, 25.36, 25.36, 25.36, 25.35, 25.35, 25.34, 25.34, 24.45,
24.45, 24.45, 24.45, 24.44, 24.44, 23.78, 23.78, 23.78, 23.78,
23.8, 23.81, 23.81, 23.81, 23.56, 23.56, 23.56, 23.58, 23.63,
23.68, 23.72, 23.73, 23.74, 23.74, 23.31, 23.31, 23.31, 23.33,
23.37, 23.47, 23.51, 23.58, 23.61, 23.61, 23.66, 23.74, 23.12,
23.16, 23.22, 23.26, 23.27, 23.27, 23.29, 23.33, 23.36, 23.38,
23.39, 23.05, 23.05, 23.06, 23.07, 23.08, 23.08, 23.09, 23.1,
23.12, 23.18, 23.25, 23.27, 22.63, 22.63, 22.7, 22.87, 22.89,
22.9, 22.93, 22.95, 22.98, 23.01, 22.33, 22.47, 22.51, 22.55,
22.58, 22.6, 22.62, 22.64, 22.68, 22.72, 21.29, 21.67, 21.67,
21.66, 21.68, 21.74, 21.9, 22.15, 20.73, 20.83, 20.91, 21.02,
21.1, 21.25, 21.39, 21.5, 21.68, 21.83, 20.46, 20.51, 0.06, 0.06,
0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06,
0.06, 0.06, 0.06, 0.06, 0.06, 0.06, 0.06), Temperature = c(19.14,
19.12, 19.11, 19.09, 19.08, 19.08, 19.06, 19.05, 19.06, 19.06,
19.1, 19.11, 19.09, 19.07, 19.07, 19.06, 19.06, 19.05, 19.05,
19.05, 19.32, 19.32, 19.33, 19.32, 19.32, 19.32, 19.32, 19.31,
19.3, 19.3, 19.3, 19.3, 19.3, 19.3, 19.3, 19.29, 19.71, 19.71,
19.72, 19.7, 19.7, 19.68, 19.65, 19.66, 19.62, 19.62, 19.62,
19.62, 19.78, 19.79, 19.76, 19.76, 19.74, 19.72, 19.72, 19.72,
19.71, 19.71, 19.63, 19.54, 19.5, 19.49, 19.48, 19.45, 19.39,
19.39, 19.35, 19.32, 19.33, 19.31, 19.3, 19.47, 19.21, 18.91,
18.86, 18.86, 18.81, 18.76, 18.72, 18.67, 18.64, 18.63, 18.63,
18.63, 18.63, 18.63, 18.93, 18.85, 18.7, 18.65, 18.64, 18.63,
18.62, 18.62, 18.56, 18.51, 18.48, 18.45, 18.42, 18.41, 18.41,
18.41, 18.41, 18.4, 18.41, 18.41, 18.41, 18.41, 18.41, 18.41,
18.41, 19.24, 19.21, 19.09, 18.62, 18.56, 18.5, 18.47, 18.45,
18.43, 18.44, 18.44, 18.44, 18.44, 18.41, 18.34, 18.34, 18.33,
18.72, 18.66, 18.65, 18.5, 18.39, 18.3, 18.28, 18.21, 18.13,
18.12, 18.13, 18.13, 18.12, 19.18, 18.47, 18.05, 18.02, 17.96,
17.95, 17.95, 17.95, 18.36, 17.55, 17.21, 16.85, 16.78, 16.76,
16.71, 16.69, 16.69, 16.84, 16.61, 16.51, 16.45, 16.42, 16.39,
16.37, 16.34, 16.34, 16.34, 16.34, 16.33, 16.33, 16.32, 17.11,
17.05, 16.88, 16.81, 16.69, 16.52, 16.4, 16.31, 16.21, 16.19,
16.17, 16.14, 16.12, 16.1, 16.1, 16.1, 16.1, 16.11, 16.11, 16.11,
16.11, 16.12, 16.32, 15.69, 15.54, 15.46, 15.2, 15.05, 15.02,
15.01, 14.99, 14.96, 14.96, 15.28, 15.16, 15.09, 15.04, 14.93,
14.77, 14.74, 14.72, 14.69, 14.69, 14.66, 14.65, 14.77, 14.74,
14.71, 14.73, 14.8, 14.83, 14.83, 14.8, 14.78, 14.66, 14.59,
14.57, 14.52, 14.39, 14.27, 14.15, 14.09, 14.09, 14.09, 14.03,
13.97, 13.96, 13.89, 13.86, 13.8, 13.79, 13.77, 13.77, 13.77,
13.78, 13.78, 13.78, 13.78, 13.78, 13.78, 13.78, 13.77, 13.77,
13.77, 14.97, 14.87, 14.77, 14.7, 14.65, 14.67, 14.65, 14.64,
14.63, 14.62, 14.62, 14.61, 14.59, 14.54, 14.52, 14.5, 14.49,
14.47, 14.4, 14.45, 14.38, 14.34, 14.34, 14.34, 15.6, 15.58,
15.53, 15.49, 15.46, 15.44, 15.43, 15.43, 15.41, 15.39, 15.38,
15.38, 15.38, 15.38, 15.39, 15.4, 15.69, 15.63, 15.63, 15.58,
15.58, 15.59, 15.57, 15.53, 15.5, 15.49, 15.48, 15.46, 15.45,
15.45, 15.44, 15.44, 15.45, 17.95, 17.59, 17.42, 17.15, 16.93,
16.87, 16.83, 16.81, 16.8, 16.81, 16.8, 16.79, 16.79, 17.2, 17.2,
17.2, 17.2, 17.21, 17.21, 17.22, 17.22, 18.16, 18.16, 18.16,
18.16, 18.16, 18.16, 18.88, 18.88, 18.88, 18.88, 18.87, 18.86,
18.86, 18.86, 19.34, 19.35, 19.34, 19.33, 19.26, 19.13, 19, 18.97,
18.96, 18.96, 19.76, 19.76, 19.77, 19.75, 19.68, 19.46, 19.37,
19.2, 19.14, 19.12, 19.03, 18.88, 20.13, 20.1, 19.98, 19.88,
19.86, 19.85, 19.81, 19.77, 19.73, 19.7, 19.68, 20.28, 20.27,
20.26, 20.25, 20.24, 20.23, 20.21, 20.18, 20.13, 19.96, 19.78,
19.76, 20.48, 20.48, 20.46, 20.31, 20.27, 20.25, 20.23, 20.2,
20.19, 20.17, 20.38, 20.36, 20.37, 20.36, 20.33, 20.31, 20.31,
20.31, 20.29, 20.26, 20.51, 20.43, 20.39, 20.37, 20.34, 20.22,
20.06, 19.91, 20.53, 20.49, 20.48, 20.46, 20.47, 20.48, 20.48,
20.48, 20.49, 20.48, 19.47, 19.43, 19.11, 19.1, 19.08, 19.09,
19.09, 19.09, 19.09, 19.1, 19.1, 19.11, 18.22, 18.22, 18.23,
18.22, 18.2, 18.2, 18.2, 18.2, 18.2, 18.2)), class = "data.frame", row.names = c(NA,
-453L))
##Setting up data
example.data$Date<-'06/10/2005'
example.data$Date<-as.Date(example.data$Date,"%m/%d/%Y")
d_date<-example.data[example.data$Date=='2005-06-10',]
d_date<-drop_na(d_date,c(Calc.Chl,DO,Calc.SPM,Salinity,Temperature))
d_date$logDepth<-log10(d_date$Depth)
##TODO: add a line here: if a station's minimum depth is > 1, copy the values at that minimum depth to all depths between 1 and the minimum
#Interpolations
interp.chl <- interp(d_date$Distance.from.36, d_date$logDepth, d_date$Calc.Chl, nx = 1000, ny = 800,yo=seq(0,max(d_date$logDepth), length = 800))
interp.df.chl <- interp.chl %>% interp2xyz() %>% as.data.frame()
names(interp.df.chl) <- c("x", "y", "Chl")
interp.do <- interp(d_date$Distance.from.36, d_date$logDepth, d_date$DO, nx = 1000, ny = 800,yo=seq(0,max(d_date$logDepth), length = 800))
interp.df.do <- interp.do %>% interp2xyz() %>% as.data.frame()
names(interp.df.do) <- c("x", "y", "Oxy")
interp.spm <- interp(d_date$Distance.from.36, d_date$logDepth, d_date$Calc.SPM, nx = 1000, ny = 800,yo=seq(0,max(d_date$logDepth), length = 800))
interp.df.spm <- interp.spm %>% interp2xyz() %>% as.data.frame()
names(interp.df.spm) <- c("x", "y", "SPM")
interp.sal <- interp(d_date$Distance.from.36, d_date$logDepth, d_date$Salinity, nx = 1000, ny = 800,yo=seq(0,max(d_date$logDepth), length = 800))
interp.df.sal <- interp.sal %>% interp2xyz() %>% as.data.frame()
names(interp.df.sal) <- c("x", "y", "Sal")
interp.temp <- interp(d_date$Distance.from.36, d_date$logDepth, d_date$Temperature, nx = 1000, ny = 800,yo=seq(0,max(d_date$logDepth), length = 800))
interp.df.temp <- interp.temp %>% interp2xyz() %>% as.data.frame()
names(interp.df.temp) <- c("x", "y", "Temp")
For context, our data are organized by date, station, and depth: each date is sampled at many locations (stations), at depths in 1 m increments. If we drop the surface data, there is no 1 m depth and the first depth is 2 m. Here is a snippet of data from two stations on one date, where one station's minimum depth is 1 m and the other's is 2 m.
TIA!
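One possible way to do the first option (padding the surface) is sketched below in base R. This is only a sketch under some assumptions: `pad_surface` is a hypothetical helper name (it is not part of akima), it assumes the `Station` and `Depth` column names from the example data, and it assumes depths are whole metres. The padding makes a difference because the output grid `yo` starts at 0, which is log10(1), while linear interpolation only fills the region enclosed by the data points, so a station that starts at 2 m has nothing to interpolate from at the surface.

```r
# Hypothetical helper (not from the original post): for every station whose
# shallowest reading is deeper than `surface`, copy that shallowest row to
# each missing whole-metre depth from `surface` up to (minimum depth - 1).
pad_surface <- function(df, surface = 1L) {
  padded <- lapply(split(df, df$Station, drop = TRUE), function(st) {
    min_depth <- min(st$Depth)
    if (min_depth <= surface) return(st)  # surface already sampled
    shallowest <- st[which.min(st$Depth), , drop = FALSE]
    missing_depths <- seq(surface, min_depth - 1L)
    # Repeat the shallowest row once per missing depth, then relabel depths
    filler <- shallowest[rep(1L, length(missing_depths)), , drop = FALSE]
    filler$Depth <- missing_depths
    rbind(filler, st)
  })
  out <- do.call(rbind, padded)
  rownames(out) <- NULL
  out
}

# Made-up toy data: station 1 starts at 1 m, station 2 starts at 3 m
toy <- data.frame(
  Station  = c(1, 1, 2, 2),
  Depth    = c(1L, 2L, 3L, 4L),
  Salinity = c(20.1, 20.4, 22.0, 22.3)
)
padded_toy <- pad_surface(toy)
print(padded_toy)
```

In the code above this would probably go at the placeholder comment, for example `d_date <- pad_surface(d_date)` before `logDepth` is computed, so the copied rows get a log depth too. It works on one date at a time; with several dates in the same data frame it would need to be applied per date.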

Related

Informative linear discriminant plot using ggplot

I am trying to carry out linear discriminant analysis and plot the results graphically:
library(MASS)       # lda()
library(tidyverse)  # read_csv(), mutate(), %>%, ggplot2
aircraft = read_csv(file = "aircraft.csv") %>%
mutate( Period = factor( Period ))
lda.0 = lda( Period ~ Power + Span + Length + Weight + Speed + Range, data = aircraft )
plot( lda.0 )
Using my full dataset, I get the following graph:
As you can see, it is difficult to see what is going on here. I want to plot this in a more informative way.
I was thinking of using ggplot with something like this:
ggplot( lda.0, aes( ) ) +
geom_density_2d( ) +
geom_point( aes( colour = ), alpha = 0.5 ) +
theme( legend.position = "bottom") +
ggtitle("Contour Plot") + theme(plot.title=element_text(hjust=0.5))
So that I get a graph like one of these (three example plots; images not included):
How do I accomplish this (as I said, I would like to use something more flexible like ggplot)?
The full dataset is too large to include in this post, so I have included a smaller version of it:
structure(list(Year = c(14L, 14L, 14L, 15L, 15L, 15L, 15L, 16L,
16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 17L, 17L,
17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L,
18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 19L, 19L, 20L, 20L, 20L,
20L, 21L, 21L, 21L, 22L, 22L, 22L, 22L, 22L, 23L, 23L, 23L, 23L,
23L, 23L, 23L, 23L, 23L, 24L, 24L, 24L, 24L, 24L, 25L, 25L, 25L,
25L, 25L, 25L, 25L, 26L, 26L, 26L, 26L, 26L, 26L, 26L, 26L, 26L,
26L, 27L, 27L, 27L, 27L, 27L, 27L, 27L, 27L, 28L, 28L, 28L, 28L,
28L), Period = c(1L, 3L, 3L, 1L, 2L, 1L, 3L, 2L, 1L, 3L, 2L,
3L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 3L, 1L, 3L, 3L, 2L, 1L, 1L, 1L,
1L, 3L, 2L, 1L, 1L, 3L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 3L,
2L, 3L, 1L, 1L, 2L, 3L, 1L, 3L, 2L, 1L, 2L, 1L, 1L, 1L, 3L, 2L,
2L, 3L, 1L, 3L, 1L, 3L, 2L, 1L, 1L, 2L, 1L, 3L, 1L, 1L, 2L, 2L,
3L, 3L, 1L, 3L, 1L, 2L, 1L, 2L, 1L, 1L, 3L, 1L, 1L, 2L, 3L, 2L,
1L, 2L, 1L, 1L, 1L, 3L, 2L, 1L, 2L), Power = c(82, 82, 223.6,
164, 119, 74.5, 74.5, 279.5, 82, 67, 112, 149, 119, 119, 238.5,
205, 82, 119, 194, 336, 558.9, 287, 388, 164, 194, 194, 186.3,
119, 119, 89.4, 126.7, 149, 119, 536.6, 402, 298, 298, 342.8,
536, 223.6, 521.6, 186.3, 238.5, 287, 335.3, 335.3, 335.3, 335.3,
335.3, 335.3, 357.7, 313, 782.6, 298, 670.6, 223.5, 335.3, 391,
391, 436, 391, 436, 171.4, 350, 298, 223.6, 298, 634, 223.5,
864.4, 760, 503.5, 63.3, 357.7, 812, 335.3, 298, 298, 335.3,
298, 317, 231, 335.3, 432, 918, 745.2, 424.8, 372.6, 782, 626,
544, 335.3, 372.6, 373, 391.2, 864, 894, 179, 74.5, 391.2), Span = c(12.8,
11, 17.9, 14.5, 12.9, 7.5, 11.13, 14.3, 7.8, 11, 11.7, 12.8,
8.5, 13.3, 14.9, 12, 9.4, 15.95, 16.74, 22.2, 23.4, 14.3, 23.72,
11.9, 14.4, 14.4, 9.7, 8, 9.4, 14.55, 9.1, 8.11, 9.5, 20.73,
22.8, 38.4, 14, 26.5, 30.48, 9.7, 15.5, 9.1, 14.17, 10.1, 14.8,
15.62, 14.05, 14.05, 14.8, 15.24, 14, 12.24, 27.2, 8.84, 22.86,
7.7, 9.5, 9.8, 15.93, 15.93, 15.93, 15.93, 13.08, 15.21, 8.94,
9.6, 10.8, 13.72, 8.9, 26.72, 25, 9.6, 8.84, 11.58, 17.3, 12.5,
12.1, 12.09, 9.8, 15.3, 9.08, 17.75, 15.3, 15.15, 27.4, 22, 13.7,
10.3, 22.76, 22.25, 17.25, 11, 12, 9.5, 14.15, 20.4, 20.4, 14.5,
8.84, 11.35), Length = c(7.6, 9, 10.35, 9.8, 7.9, 6.3, 8.28,
9.4, 6.7, 8.3, 8, 8.7, 7.4, 9.6, 8.9, 7.9, 6.2, 10.25, 10.77,
10.9, 12.6, 9.4, 11.86, 9.8, 9.2, 8.9, 8, 6.5, 6.95, 9.83, 7.3,
6.38, 8.5, 13.27, 13.5, 20.85, 9.2, 14.33, 19.16, 6.5, 9.7, 8.1,
9.68, 7.7, 10.8, 11.89, 10.97, 11.28, 9.5, 11.42, 11, 7.3, 18.2,
7.01, 18.08, 6.8, 6.8, 7.1, 11.5, 11.5, 11.5, 11.5, 9.27, 9.78,
6.17, 6.4, 7.32, 10.74, 6.9, 18.97, 15.1, 7.06, 7.17, 9.5, 10.55,
8.38, 8.7, 8.81, 6.7, 9.42, 5.99, 10.27, 10.22, 11, 19.8, 14.63,
11.2, 6.56, 14.88, 13.81, 12.6, 7, 7.5, 7.2, 9.91, 14.8, 15,
9.8, 7.17, 8.94), Weight = c(1070, 830, 2200, 1946, 1190, 653,
930, 1575, 676, 920, 1353, 1550, 888, 1275, 1537, 1292, 611,
1350, 1700, 3312, 4920, 1510, 3625, 900, 1665, 1640, 1081, 625,
932, 1378, 886, 902, 1070, 5670, 3636, 12925, 2107, 4770, 6060,
1192, 1900, 1050, 2155, 1379, 2858, 3380, 2290, 2290, 2347, 3308,
2630, 1333, 10000, 1351, 6250, 885, 1531, 1438, 3820, 3820, 3820,
3820, 1905, 2646, 1151, 1266, 1575, 2383, 860, 7983, 6200, 1484,
567, 1867, 4350, 1935, 1823, 2253, 1487, 2220, 1244, 2700, 2280,
3652, 8165, 5500, 3568, 1414, 5875, 5460, 4310, 1500, 1795, 1628,
2449, 6900, 6900, 1900, 567, 2102), Speed = c(105L, 145L, 135L,
138L, 140L, 177L, 113L, 230L, 175L, 106L, 140L, 170L, 175L, 157L,
183L, 201L, 209L, 145L, 120L, 135L, 152L, 176L, 140L, 190L, 175L,
175L, 205L, 196L, 165L, 146L, 175L, 222L, 159L, 166L, 158L, 146L,
185L, 120L, 157L, 226L, 205L, 230L, 161L, 251L, 171L, 206L, 171L,
171L, 235L, 161L, 145L, 245L, 183L, 214L, 180L, 220L, 237L, 254L,
169L, 169L, 169L, 169L, 153L, 183L, 261L, 245L, 235L, 200L, 246L,
174L, 180L, 319L, 146L, 251L, 230L, 290L, 230L, 233L, 250L, 255L,
233L, 175L, 230L, 180L, 145L, 185L, 196L, 298L, 183L, 198L, 195L,
300L, 270L, 297L, 225L, 212L, 195L, 197L, 146L, 296L), Range = c(400L,
402L, 500L, 500L, 400L, 350L, 402L, 700L, 525L, 300L, 560L, 550L,
250L, 450L, 700L, 600L, 175L, 450L, 450L, 450L, 600L, 800L, 500L,
600L, 600L, 600L, 600L, 400L, 250L, 400L, 350L, 547L, 450L, 1770L,
800L, 2365L, 925L, 400L, 1205L, 580L, 600L, 600L, 684L, 402L,
563L, 644L, 885L, 885L, 800L, 440L, 557L, 750L, 3600L, 500L,
805L, 330L, 600L, 628L, 1640L, 1640L, 1640L, 1640L, 604L, 1046L,
644L, 500L, 600L, 1046L, 550L, 1585L, 650L, 917L, 515L, 805L,
750L, 1110L, 772L, 1127L, 500L, 850L, 523L, 850L, 900L, 700L,
668L, 700L, 1706L, 600L, 1385L, 1000L, 902L, 600L, 500L, 450L,
579L, 1125L, 1300L, 660L, 515L, 756L)), row.names = c(NA, 100L
), class = "data.frame")
EDIT:
Is this what you want?
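# Scores of each observation on the linear discriminants
lda.0.val = predict(lda.0)$x
df = data.frame(
  LD1 = lda.0.val[,1],
  LD2 = lda.0.val[,2],
  Period = aircraft$Period)
# Density contours of the scores, with points coloured by Period
ggplot(df, aes(x=LD1, y=LD2)) +
  geom_density_2d() + geom_point(aes(color=Period))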
Output:

Adding smooth curve to my ggplot histogram

I am trying to add a smooth curve to my ggplot histogram plot using geom_density():
ggplot(aircraft, aes(log10(Power))) + geom_histogram() + geom_density()
However, as you can see from the small curve at the bottom of the graph, it isn't working as I wanted:
This is an example of the type of smooth curve I want:
How do I add this smooth curve to my histogram?
My data is too large to add here, so here is a sample of it:
structure(list(Year = c(14L, 14L, 14L, 15L, 15L, 15L, 15L, 16L,
16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 17L, 17L,
17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L,
18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 19L, 19L, 20L, 20L, 20L,
20L, 21L, 21L, 21L, 22L, 22L, 22L, 22L, 22L, 23L, 23L, 23L, 23L,
23L, 23L, 23L, 23L, 23L, 24L, 24L, 24L, 24L, 24L, 25L, 25L, 25L,
25L, 25L, 25L, 25L, 26L, 26L, 26L, 26L, 26L, 26L, 26L, 26L, 26L,
26L, 27L, 27L, 27L, 27L, 27L, 27L, 27L, 27L, 28L, 28L, 28L, 28L,
28L), Period = c(1L, 3L, 3L, 1L, 2L, 1L, 3L, 2L, 1L, 3L, 2L,
3L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 3L, 1L, 3L, 3L, 2L, 1L, 1L, 1L,
1L, 3L, 2L, 1L, 1L, 3L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 3L,
2L, 3L, 1L, 1L, 2L, 3L, 1L, 3L, 2L, 1L, 2L, 1L, 1L, 1L, 3L, 2L,
2L, 3L, 1L, 3L, 1L, 3L, 2L, 1L, 1L, 2L, 1L, 3L, 1L, 1L, 2L, 2L,
3L, 3L, 1L, 3L, 1L, 2L, 1L, 2L, 1L, 1L, 3L, 1L, 1L, 2L, 3L, 2L,
1L, 2L, 1L, 1L, 1L, 3L, 2L, 1L, 2L), Power = c(82, 82, 223.6,
164, 119, 74.5, 74.5, 279.5, 82, 67, 112, 149, 119, 119, 238.5,
205, 82, 119, 194, 336, 558.9, 287, 388, 164, 194, 194, 186.3,
119, 119, 89.4, 126.7, 149, 119, 536.6, 402, 298, 298, 342.8,
536, 223.6, 521.6, 186.3, 238.5, 287, 335.3, 335.3, 335.3, 335.3,
335.3, 335.3, 357.7, 313, 782.6, 298, 670.6, 223.5, 335.3, 391,
391, 436, 391, 436, 171.4, 350, 298, 223.6, 298, 634, 223.5,
864.4, 760, 503.5, 63.3, 357.7, 812, 335.3, 298, 298, 335.3,
298, 317, 231, 335.3, 432, 918, 745.2, 424.8, 372.6, 782, 626,
544, 335.3, 372.6, 373, 391.2, 864, 894, 179, 74.5, 391.2), Span = c(12.8,
11, 17.9, 14.5, 12.9, 7.5, 11.13, 14.3, 7.8, 11, 11.7, 12.8,
8.5, 13.3, 14.9, 12, 9.4, 15.95, 16.74, 22.2, 23.4, 14.3, 23.72,
11.9, 14.4, 14.4, 9.7, 8, 9.4, 14.55, 9.1, 8.11, 9.5, 20.73,
22.8, 38.4, 14, 26.5, 30.48, 9.7, 15.5, 9.1, 14.17, 10.1, 14.8,
15.62, 14.05, 14.05, 14.8, 15.24, 14, 12.24, 27.2, 8.84, 22.86,
7.7, 9.5, 9.8, 15.93, 15.93, 15.93, 15.93, 13.08, 15.21, 8.94,
9.6, 10.8, 13.72, 8.9, 26.72, 25, 9.6, 8.84, 11.58, 17.3, 12.5,
12.1, 12.09, 9.8, 15.3, 9.08, 17.75, 15.3, 15.15, 27.4, 22, 13.7,
10.3, 22.76, 22.25, 17.25, 11, 12, 9.5, 14.15, 20.4, 20.4, 14.5,
8.84, 11.35), Length = c(7.6, 9, 10.35, 9.8, 7.9, 6.3, 8.28,
9.4, 6.7, 8.3, 8, 8.7, 7.4, 9.6, 8.9, 7.9, 6.2, 10.25, 10.77,
10.9, 12.6, 9.4, 11.86, 9.8, 9.2, 8.9, 8, 6.5, 6.95, 9.83, 7.3,
6.38, 8.5, 13.27, 13.5, 20.85, 9.2, 14.33, 19.16, 6.5, 9.7, 8.1,
9.68, 7.7, 10.8, 11.89, 10.97, 11.28, 9.5, 11.42, 11, 7.3, 18.2,
7.01, 18.08, 6.8, 6.8, 7.1, 11.5, 11.5, 11.5, 11.5, 9.27, 9.78,
6.17, 6.4, 7.32, 10.74, 6.9, 18.97, 15.1, 7.06, 7.17, 9.5, 10.55,
8.38, 8.7, 8.81, 6.7, 9.42, 5.99, 10.27, 10.22, 11, 19.8, 14.63,
11.2, 6.56, 14.88, 13.81, 12.6, 7, 7.5, 7.2, 9.91, 14.8, 15,
9.8, 7.17, 8.94), Weight = c(1070, 830, 2200, 1946, 1190, 653,
930, 1575, 676, 920, 1353, 1550, 888, 1275, 1537, 1292, 611,
1350, 1700, 3312, 4920, 1510, 3625, 900, 1665, 1640, 1081, 625,
932, 1378, 886, 902, 1070, 5670, 3636, 12925, 2107, 4770, 6060,
1192, 1900, 1050, 2155, 1379, 2858, 3380, 2290, 2290, 2347, 3308,
2630, 1333, 10000, 1351, 6250, 885, 1531, 1438, 3820, 3820, 3820,
3820, 1905, 2646, 1151, 1266, 1575, 2383, 860, 7983, 6200, 1484,
567, 1867, 4350, 1935, 1823, 2253, 1487, 2220, 1244, 2700, 2280,
3652, 8165, 5500, 3568, 1414, 5875, 5460, 4310, 1500, 1795, 1628,
2449, 6900, 6900, 1900, 567, 2102), Speed = c(105L, 145L, 135L,
138L, 140L, 177L, 113L, 230L, 175L, 106L, 140L, 170L, 175L, 157L,
183L, 201L, 209L, 145L, 120L, 135L, 152L, 176L, 140L, 190L, 175L,
175L, 205L, 196L, 165L, 146L, 175L, 222L, 159L, 166L, 158L, 146L,
185L, 120L, 157L, 226L, 205L, 230L, 161L, 251L, 171L, 206L, 171L,
171L, 235L, 161L, 145L, 245L, 183L, 214L, 180L, 220L, 237L, 254L,
169L, 169L, 169L, 169L, 153L, 183L, 261L, 245L, 235L, 200L, 246L,
174L, 180L, 319L, 146L, 251L, 230L, 290L, 230L, 233L, 250L, 255L,
233L, 175L, 230L, 180L, 145L, 185L, 196L, 298L, 183L, 198L, 195L,
300L, 270L, 297L, 225L, 212L, 195L, 197L, 146L, 296L), Range = c(400L,
402L, 500L, 500L, 400L, 350L, 402L, 700L, 525L, 300L, 560L, 550L,
250L, 450L, 700L, 600L, 175L, 450L, 450L, 450L, 600L, 800L, 500L,
600L, 600L, 600L, 600L, 400L, 250L, 400L, 350L, 547L, 450L, 1770L,
800L, 2365L, 925L, 400L, 1205L, 580L, 600L, 600L, 684L, 402L,
563L, 644L, 885L, 885L, 800L, 440L, 557L, 750L, 3600L, 500L,
805L, 330L, 600L, 628L, 1640L, 1640L, 1640L, 1640L, 604L, 1046L,
644L, 500L, 600L, 1046L, 550L, 1585L, 650L, 917L, 515L, 805L,
750L, 1110L, 772L, 1127L, 500L, 850L, 523L, 850L, 900L, 700L,
668L, 700L, 1706L, 600L, 1385L, 1000L, 902L, 600L, 500L, 450L,
579L, 1125L, 1300L, 660L, 515L, 756L)), row.names = c(NA, 100L
), class = "data.frame")
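The histogram is drawn as counts while geom_density() is drawn on the density scale, which is why the curve looks flat at the bottom. One solution is to set y = ..density.. in the aes so that both layers share the density scale: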
library(tidyverse)
ggplot(aircraft, aes(x = log10(Power),
y = ..density..)) +
geom_histogram(alpha =0.5) +
geom_density(color = "red",
size = 2)
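To see why the two layers need a common scale, here is a minimal base R sketch (not part of the original answer) using the first few Power values from the sample. The variable names are illustrative. It compares the count scale of a histogram with its density scale and with a kernel density estimate.

```r
# First twelve Power values from the sample data, on the log10 scale.
power <- log10(c(82, 82, 223.6, 164, 119, 74.5, 74.5, 279.5, 82, 67, 112, 149))

# Histogram computed without plotting: counts vs. density per bin.
h <- hist(power, plot = FALSE)
total_count <- sum(h$counts)                 # bar heights on the count scale add up to n
hist_area <- sum(h$density * diff(h$breaks)) # bar areas on the density scale add up to 1

# Kernel density estimate; approximate the area under it with the trapezoid rule.
d <- density(power)
n <- length(d$y)
dens_area <- sum(diff(d$x) * (d$y[-1] + d$y[-n]) / 2)

print(c(total_count = total_count, hist_area = hist_area, dens_area = dens_area))
```

On the count scale the bars are far taller than a curve whose area is about 1, so the curve appears squashed. In recent ggplot2 releases the same mapping is written after_stat(density), and the line width of geom_density() is set with linewidth instead of size.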

Creating smoothed histograms and contour plots using ggplot

I am trying to create some smoothed histograms and contour plots using ggplot.
An excerpt/sample of my data is as follows:
structure(list(Year = c(14L, 14L, 14L, 15L, 15L, 15L, 15L, 16L,
16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 17L, 17L,
17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L, 17L,
18L, 18L, 18L, 18L, 18L, 18L, 18L, 18L, 19L, 19L, 20L, 20L, 20L,
20L, 21L, 21L, 21L, 22L, 22L, 22L, 22L, 22L, 23L, 23L, 23L, 23L,
23L, 23L, 23L, 23L, 23L, 24L, 24L, 24L, 24L, 24L, 25L, 25L, 25L,
25L, 25L, 25L, 25L, 26L, 26L, 26L, 26L, 26L, 26L, 26L, 26L, 26L,
26L, 27L, 27L, 27L, 27L, 27L, 27L, 27L, 27L, 28L, 28L, 28L, 28L,
28L), Period = c(1L, 3L, 3L, 1L, 2L, 1L, 3L, 2L, 1L, 3L, 2L,
3L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 3L, 1L, 3L, 3L, 2L, 1L, 1L, 1L,
1L, 3L, 2L, 1L, 1L, 3L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 3L,
2L, 3L, 1L, 1L, 2L, 3L, 1L, 3L, 2L, 1L, 2L, 1L, 1L, 1L, 3L, 2L,
2L, 3L, 1L, 3L, 1L, 3L, 2L, 1L, 1L, 2L, 1L, 3L, 1L, 1L, 2L, 2L,
3L, 3L, 1L, 3L, 1L, 2L, 1L, 2L, 1L, 1L, 3L, 1L, 1L, 2L, 3L, 2L,
1L, 2L, 1L, 1L, 1L, 3L, 2L, 1L, 2L), Power = c(82, 82, 223.6,
164, 119, 74.5, 74.5, 279.5, 82, 67, 112, 149, 119, 119, 238.5,
205, 82, 119, 194, 336, 558.9, 287, 388, 164, 194, 194, 186.3,
119, 119, 89.4, 126.7, 149, 119, 536.6, 402, 298, 298, 342.8,
536, 223.6, 521.6, 186.3, 238.5, 287, 335.3, 335.3, 335.3, 335.3,
335.3, 335.3, 357.7, 313, 782.6, 298, 670.6, 223.5, 335.3, 391,
391, 436, 391, 436, 171.4, 350, 298, 223.6, 298, 634, 223.5,
864.4, 760, 503.5, 63.3, 357.7, 812, 335.3, 298, 298, 335.3,
298, 317, 231, 335.3, 432, 918, 745.2, 424.8, 372.6, 782, 626,
544, 335.3, 372.6, 373, 391.2, 864, 894, 179, 74.5, 391.2), Span = c(12.8,
11, 17.9, 14.5, 12.9, 7.5, 11.13, 14.3, 7.8, 11, 11.7, 12.8,
8.5, 13.3, 14.9, 12, 9.4, 15.95, 16.74, 22.2, 23.4, 14.3, 23.72,
11.9, 14.4, 14.4, 9.7, 8, 9.4, 14.55, 9.1, 8.11, 9.5, 20.73,
22.8, 38.4, 14, 26.5, 30.48, 9.7, 15.5, 9.1, 14.17, 10.1, 14.8,
15.62, 14.05, 14.05, 14.8, 15.24, 14, 12.24, 27.2, 8.84, 22.86,
7.7, 9.5, 9.8, 15.93, 15.93, 15.93, 15.93, 13.08, 15.21, 8.94,
9.6, 10.8, 13.72, 8.9, 26.72, 25, 9.6, 8.84, 11.58, 17.3, 12.5,
12.1, 12.09, 9.8, 15.3, 9.08, 17.75, 15.3, 15.15, 27.4, 22, 13.7,
10.3, 22.76, 22.25, 17.25, 11, 12, 9.5, 14.15, 20.4, 20.4, 14.5,
8.84, 11.35), Length = c(7.6, 9, 10.35, 9.8, 7.9, 6.3, 8.28,
9.4, 6.7, 8.3, 8, 8.7, 7.4, 9.6, 8.9, 7.9, 6.2, 10.25, 10.77,
10.9, 12.6, 9.4, 11.86, 9.8, 9.2, 8.9, 8, 6.5, 6.95, 9.83, 7.3,
6.38, 8.5, 13.27, 13.5, 20.85, 9.2, 14.33, 19.16, 6.5, 9.7, 8.1,
9.68, 7.7, 10.8, 11.89, 10.97, 11.28, 9.5, 11.42, 11, 7.3, 18.2,
7.01, 18.08, 6.8, 6.8, 7.1, 11.5, 11.5, 11.5, 11.5, 9.27, 9.78,
6.17, 6.4, 7.32, 10.74, 6.9, 18.97, 15.1, 7.06, 7.17, 9.5, 10.55,
8.38, 8.7, 8.81, 6.7, 9.42, 5.99, 10.27, 10.22, 11, 19.8, 14.63,
11.2, 6.56, 14.88, 13.81, 12.6, 7, 7.5, 7.2, 9.91, 14.8, 15,
9.8, 7.17, 8.94), Weight = c(1070, 830, 2200, 1946, 1190, 653,
930, 1575, 676, 920, 1353, 1550, 888, 1275, 1537, 1292, 611,
1350, 1700, 3312, 4920, 1510, 3625, 900, 1665, 1640, 1081, 625,
932, 1378, 886, 902, 1070, 5670, 3636, 12925, 2107, 4770, 6060,
1192, 1900, 1050, 2155, 1379, 2858, 3380, 2290, 2290, 2347, 3308,
2630, 1333, 10000, 1351, 6250, 885, 1531, 1438, 3820, 3820, 3820,
3820, 1905, 2646, 1151, 1266, 1575, 2383, 860, 7983, 6200, 1484,
567, 1867, 4350, 1935, 1823, 2253, 1487, 2220, 1244, 2700, 2280,
3652, 8165, 5500, 3568, 1414, 5875, 5460, 4310, 1500, 1795, 1628,
2449, 6900, 6900, 1900, 567, 2102), Speed = c(105L, 145L, 135L,
138L, 140L, 177L, 113L, 230L, 175L, 106L, 140L, 170L, 175L, 157L,
183L, 201L, 209L, 145L, 120L, 135L, 152L, 176L, 140L, 190L, 175L,
175L, 205L, 196L, 165L, 146L, 175L, 222L, 159L, 166L, 158L, 146L,
185L, 120L, 157L, 226L, 205L, 230L, 161L, 251L, 171L, 206L, 171L,
171L, 235L, 161L, 145L, 245L, 183L, 214L, 180L, 220L, 237L, 254L,
169L, 169L, 169L, 169L, 153L, 183L, 261L, 245L, 235L, 200L, 246L,
174L, 180L, 319L, 146L, 251L, 230L, 290L, 230L, 233L, 250L, 255L,
233L, 175L, 230L, 180L, 145L, 185L, 196L, 298L, 183L, 198L, 195L,
300L, 270L, 297L, 225L, 212L, 195L, 197L, 146L, 296L), Range = c(400L,
402L, 500L, 500L, 400L, 350L, 402L, 700L, 525L, 300L, 560L, 550L,
250L, 450L, 700L, 600L, 175L, 450L, 450L, 450L, 600L, 800L, 500L,
600L, 600L, 600L, 600L, 400L, 250L, 400L, 350L, 547L, 450L, 1770L,
800L, 2365L, 925L, 400L, 1205L, 580L, 600L, 600L, 684L, 402L,
563L, 644L, 885L, 885L, 800L, 440L, 557L, 750L, 3600L, 500L,
805L, 330L, 600L, 628L, 1640L, 1640L, 1640L, 1640L, 604L, 1046L,
644L, 500L, 600L, 1046L, 550L, 1585L, 650L, 917L, 515L, 805L,
750L, 1110L, 772L, 1127L, 500L, 850L, 523L, 850L, 900L, 700L,
668L, 700L, 1706L, 600L, 1385L, 1000L, 902L, 600L, 500L, 450L,
579L, 1125L, 1300L, 660L, 515L, 756L)), row.names = c(NA, 100L
), class = "data.frame")
I have the following code:
library(ggplot2)
data <- read.csv("data.csv")
data[,3:8] <- log10(data[,3:8])
# density plots
ggplot( data, aes( Power, group = Period, colour = Period ) ) + geom_density( aes( fill = Period ), alpha = 1 ) + ggtitle("All data")
ggplot( data, aes( Length, group = Period, colour = Period ) ) + geom_density( aes( fill = Period ), alpha = 1 ) + ggtitle("All data")
library(ggplot2)
data <- read.csv("data.csv")
data[,3:8] <- log10(data[,3:8])
ggplot( data, aes(Power, Weight ) ) +
geom_density_2d( ) +
geom_point( aes( colour = Period ), alpha = 3 ) +
theme( legend.position = "bottom")
ggplot( data, aes( Speed, Length ) ) +
geom_density_2d( ) +
geom_point( aes( colour = Period ), alpha = 3 ) +
theme( legend.position = "bottom")
This code produces these plots:
As you can see, Period 1.5 and 2.5 should not exist – only 1, 2, and 3. And, for the contour plots, I would like it to say "Period", rather than "colour", but the same code as used in the smoothed histograms does not seem to work. And lastly, is there a way to centre the heading, so that "All data" is in the middle?
The issue is caused by mapping a continuous variable to what is essentially a categorical aesthetic. Period is stored as an integer, so ggplot builds a continuous colour scale with intermediate breaks such as 1.5 and 2.5. If you map a factor instead, you will get what you are after.
The easiest fix is to coerce that variable in the data set itself:
library(dplyr)
data <- data %>%
mutate(Period = as.factor(Period))
Your plots will then look as you expect (I also reduced the alpha value a little so the density plots are transparent):
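A minimal sketch of the full fix, assuming a small stand-in data frame df built from the first rows of the sample (df, p_density and p_contour are illustrative names). It also covers the two remaining points from the question: labs(colour = ...) sets the legend title, and theme(plot.title = element_text(hjust = 0.5)) centres the heading.

```r
library(ggplot2)

# Hypothetical stand-in for the question's data: first eight rows of the sample.
df <- data.frame(
  Period = c(1L, 3L, 3L, 1L, 2L, 1L, 3L, 2L),
  Power  = c(82, 82, 223.6, 164, 119, 74.5, 74.5, 279.5),
  Weight = c(1070, 830, 2200, 1946, 1190, 653, 930, 1575)
)

# A factor has exactly the levels "1", "2", "3", so no 1.5 or 2.5 can appear.
df$Period <- as.factor(df$Period)

# Density plot with transparent fills and a centred title.
p_density <- ggplot(df, aes(log10(Power), colour = Period, fill = Period)) +
  geom_density(alpha = 0.4) +
  ggtitle("All data") +
  theme(plot.title = element_text(hjust = 0.5))

# Contour plot with an explicit legend title for the colour aesthetic.
p_contour <- ggplot(df, aes(log10(Power), log10(Weight))) +
  geom_density_2d() +
  geom_point(aes(colour = Period)) +
  labs(colour = "Period") +
  theme(legend.position = "bottom")
```

Printing p_density or p_contour draws the plots. Note that alpha must lie between 0 and 1, so a value such as 0.4 is used here for transparency.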

R: groupedData changes model fit in nlme?

> dput(mydat)
structure(list(ID = c(31L, 35L, 115L, 48L, 36L, 73L, 111L, 51L,
113L, 20L, 16L, 51L, 59L, 79L, 107L, 90L, 60L, 72L, 21L, 28L,
104L, 65L, 63L, 132L, 99L, 52L, 93L, 57L, 87L, 83L, 57L, 69L,
110L, 12L, 78L, 125L, 80L, 80L, 126L, 74L, 48L, 135L, 7L, 5L,
66L, 51L, 136L, 46L, 3L, 80L, 130L, 6L, 129L, 63L, 88L, 49L,
60L, 71L, 42L, 89L, 106L, 128L, 114L, 82L, 103L, 8L, 67L, 130L,
118L, 130L, 48L, 13L, 51L, 100L, 85L, 21L, 87L, 67L, 39L, 8L,
18L, 29L, 74L, 103L, 98L, 135L, 88L, 10L, 93L, 128L, 2L, 90L,
12L, 10L, 66L, 52L, 25L, 128L, 123L, 75L, 13L, 3L, 37L, 85L,
53L, 13L, 10L, 76L, 93L, 68L, 40L, 36L, 29L, 109L, 96L, 120L,
4L, 75L, 81L, 119L, 45L, 11L, 77L, 136L, 33L, 17L, 15L, 126L,
99L, 45L, 26L, 37L, 42L, 2L, 105L, 98L, 62L, 42L, 27L, 124L,
47L, 85L, 115L, 122L, 120L, 100L, 136L, 62L, 99L, 99L, 78L, 71L,
93L, 118L, 28L, 103L, 43L, 64L, 56L, 124L, 128L, 103L, 82L, 138L,
110L, 60L, 49L, 12L, 92L, 89L, 123L, 44L, 108L, 71L, 3L, 26L,
125L, 6L, 113L, 117L, 97L, 6L, 17L, 91L, 109L, 126L, 32L, 90L,
114L, 66L, 104L, 12L, 1L, 98L, 76L, 60L, 23L, 69L, 84L, 111L),
Y = c(1.50403545875011, 0.786396740696073, 4.47452220273871,
4.38068147273783, 3.12839926871781, 3.71525102887885, 4.91099771631064,
5.6099549267089, 2.56348108539441, 3.19948091486236, 2.08635983067475,
1.0763458953491, 1.51606413703901, 5.24654043255577, 4.52424029343984,
4.20774205260695, 4.12910958622073, 0.633743240652555, 4.77971190302622,
3.93816934639032, 1.49484285995404, 1.56126305852856, 4.46548695066214,
1.0084930673158, 3.04727486738418, 3.35888620440587, 3.40432046722173,
3.76440032295639, 4.07871050532828, 4.19226071864204, 1.7160033436348,
1.03724192908934, 1.58238166258979, 3.68196445899468, 3.94299959336604,
4.1723985779393, 1.48656664161404, 2.82216807936802, 4.29307514012286,
1.56346766351964, 2.82672252016899, 2.88817949391832, 1.579432355962,
2.75587485567249, 4.52577018572453, 4.78804103487477, 3.76900787094377,
4.59964294342393, 1.20237162906479, 1.54913073517381, 2.36361197989214,
5.29470645462496, 4.37803432245733, 1.53760300072777, 4.68198252411321,
4.24868424001548, 2.70586371228392, 0.795680498261033, 2.86864443839483,
5.05097104595277, 1.75587485567249, 1.4190891773674, 4.60685410171667,
2.06818586174616, 0.965391506489986, 1.64857793561105, 5.11577022280303,
3.23527587668705, 3.70722941932729, 1.59578845118596, 3.48826861549546,
4.15706370038262, 0.487678450889512, 3.22814360759774, 4.77382300021727,
3.69583177282669, 4.62949114909487, 4.4704545944726, 2.69108149212297,
3.4379090355395, 4.8963222496021, 2.03342375548695, 4.28386634847347,
2.83569057149243, 1.69219063796734, 4.30362797638389, 3.03981055414835,
5.36239952864841, 1.58185218466613, 4.36789612311481, 2.85064623518307,
0.684841449866386, 4.95956127990689, 4.73626101314068, 3.74036268949424,
4.58490763903562, 2.04139268515822, 4.33829709023269, 4.43218332439469,
4.84305827754328, 1.81291335664286, 1.3818767641537, 1.23195309926451,
3.45651785780526, 3.28375338333253, 4.76814952267996, 4.21208101599211,
4.61628642071719, 3.52930199778798, 3.87926795682461, 0.152221483815728,
3.37621185028267, 1.05830921980416, 4.56712051193916, 4.01973923267471,
4.52527809662657, 3.55762748842683, 6.16255544574663, 3.3392526340327,
4.9800761268684, 4.67728750108277, 2.77305469336426, 4.37963179601937,
5.08042444249764, 4.41390299750444, 1.13552331082485, 4.83799224531426,
0.949064591865248, 1.67706978322635, 4.95286990226433, 1.25024436906661,
1.66401619974254, 5.82804345699154, 3.19728055812562, 4.28768978393608,
3.66913084737333, 1.61566819024883, 2.77959649125782, 1.46691039598072,
1.3041916833582, 2.99475694458763, 1.02667774806713, 3.63346845557959,
4.97170714505069, 4.92332693090275, 4.34570692127843, 1.48434160170634,
2.78175537465247, 4.30446898485417, 3.35621713421974, 2,
2.01703333929878, 2.71180722904119, 4.74145100905455, 3.49262072204319,
4.93047527477251, 4.47468237035325, 1.79239168949825, 4.8662518850263,
1.49607694983225, 1.60572117915046, 4.3945392313722, 2.07918124604762,
1.22862414376523, 1.68741633267633, 4.06740565843782, 4.09537859956006,
3.53617953213723, 3.67089495352021, 4.00436437110775, 2.10720996964787,
3.90167623132638, 6.60281435674245, 6.51005266486288, 4.60659630917929,
1.6845945840158, 4.9596613702735, 1.69603672384819, 2.68841982200271,
1.48366771663978, 1.5218420041367, 4.65083185753557, 1.83884909073726,
3.05766610390983, 3.61151088712666, 3.78290240597464, 2.01283722470517,
3.34084054981233, 1.27334439191564, 2.32837960343874, 4.10859884597357,
0.864516798738721, 1.22753724273698, 3.99100444033076, 0.752257491617398,
5.11358574735257, 4.13624480174614, 3.87128097285797, 3.99690551069567,
3.05766610390983), X = c(5.7, 6.3, 4.7, 17, 0.9, 0.6, 3,
4.1, 6.9, 2, 11.1, 3.7, 2, 12, 1, 3.4, 8.9, 12, 12, 0, 3.9,
3.9, 7.9, 17, 19, 1.7, 16, 13.9, 7, 9.9, 0.9, 0, 3.6, 17.3,
11.7, 5, 1.7, 5.6, 8.1, 11, 3.9, 16.3, 2.1, 19.7, 19.4, 2.7,
0.9, 2, 15.9, 15.9, 12.1, 16.6, 14, 7.1, 1.9, 1, 18.7, 0,
3.9, 0.9, 8.1, 11.9, 0, 4.1, 7, 2, 2.9, 9.7, 3.6, 3.4, 9.1,
8.1, 13.9, 2.4, 6.9, 11.1, 4.9, 0, 9.1, 18.9, 1.9, 12, 1.1,
2.9, 4, 7.7, 3.9, 1.7, 2.9, 2, 6.1, 2, 5, 10.7, 14.3, 2.9,
2.4, 17.9, 2, 0.9, 10, 2.9, 2.9, 1.9, 8.7, 1.9, 13.4, 0,
14, 0.9, 2.4, 2, 8.9, 1.9, 0, 9.9, 6.9, 3.7, 0, 15.1, 0,
12.1, 8.9, 10.1, 0.9, 11.7, 2.7, 11.9, 10.9, 9, 0.9, 0.9,
9.9, 17.9, 18.9, 15, 8, 15.7, 10.9, 9.7, 13.9, 3.9, 1.9,
1.9, 2, 3.1, 5.1, 0.7, 15.9, 7.9, 10.1, 6.1, 5, 4.9, 6.3,
2, 1.7, 2.9, 13.3, 2.1, 1, 3, 0.7, 7.9, 10, 4, 4, 4, 3.9,
1.9, 3.9, 4.9, 4.1, 11.9, 7, 13, 7.9, 7.1, 2.6, 8.7, 10.9,
6.1, 16.9, 2.7, 3.3, 0.3, 16.9, 0, 6, 16.3, 1.4, 9.1, 1.9,
3.9, 8, 6, 8, 11.9, 0, 0.9)), .Names = c("ID", "Y", "X"), row.names = c(NA,
-200L), class = "data.frame")
I have a simple data set with 3 variables: X, Y, and a subject ID.
> head(mydat)
ID Y X
1 31 1.5040355 5.7
2 35 0.7863967 6.3
3 115 4.4745222 4.7
4 48 4.3806815 17.0
5 36 3.1283993 0.9
6 73 3.7152510 0.6
I run the following nonlinear mixed model, and it runs just fine.
library(nlme)
model <- nlme(Y ~ a*exp(X) + b,
data = mydat,
fixed = list(a ~ 1, b ~ 1),
random = list(ID = pdDiag(list(a ~ 1, b ~ 1))),
start = list(fixed = c(a = 1, b = 1)))
Now I use the groupedData command on my data set. However, when I run the same exact analysis on the groupedData object, the model doesn't fit anymore.
mydat2 <- groupedData(Y ~ X | ID, data = mydat)
model2 <- nlme(Y ~ a*exp(X) + b,
data = mydat2,
fixed = list(a ~ 1, b ~ 1),
random = list(ID = pdDiag(list(a ~ 1, b ~ 1))),
start = list(fixed = c(a = 1, b = 1)))
Error in nlme.formula(Y ~ a * exp(X) + b, data = mydat2, fixed = list(a ~ :
step halving factor reduced below minimum in PNLS step
I do not understand why because the groupedData call should not have changed the contents of mydat. What went wrong?
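Set order.groups = FALSE when creating the grouped data:
mydat2 <- groupedData(Y ~ X | ID, data = mydat, order.groups = FALSE)
The default is order.groups = TRUE (see the help for groupedData), which converts the grouping factor ID into an ordered factor, so the grouping variable in mydat2 is no longer the same as in mydat.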
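The effect of order.groups can be checked directly on a small example. This is a sketch with a made-up data frame (toy, g_default and g_unordered are illustrative names), not the data from the question.

```r
library(nlme)

# Hypothetical toy data: three subjects with three observations each.
toy <- data.frame(
  ID = rep(c(10L, 20L, 30L), each = 3),
  X  = rep(c(0, 1, 2), times = 3),
  Y  = c(5, 6, 7, 1, 2, 3, 8, 9, 10)
)

# Default: the grouping variable is converted to an ordered factor.
g_default <- groupedData(Y ~ X | ID, data = toy)

# With order.groups = FALSE no ordering is imposed on the groups.
g_unordered <- groupedData(Y ~ X | ID, data = toy, order.groups = FALSE)

print(is.ordered(g_default$ID))
print(levels(g_default$ID))
print(is.ordered(g_unordered$ID))
```

According to the help page, the default ordering is based on the maximum response within each group, so the level order generally differs from the order of the raw IDs.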

Calling function inside with-statement gives error variable not found in function scope

I am preparing a bootstrapped estimate of the mean prediction error on a multiply imputed dataset. My function seems unable to find the dependent variable in scope. Is there some way to work around that?
Multiple imputation runs smoothly, but the specific problem seems to be that the line
mod.nb.train <- with(data = data.mi.train, exp = glm.nb(f))
cannot find the variable CG.tot:
Error in eval(expr, envir, enclos) : object 'CG.tot' not found
However, if I write the formula out literally in the call:
glm.nb(formula=CG.tot~Fibrinogen)
it works...
Minimal running example:
library(mice)
library(MASS)
#compute the mean prediction error on a dataframe with missing data
predicterr <- function(f, data, indices){
if(!(class(f)=="formula")){stop("'f' must be of the 'formula' type")}
if(!(class(data)=="data.frame")){stop("'data' must be of the 'data.frame' type")}
#recompute random sampling & multiple imputation
data.test <- data[sample(nrow(data), 15),]
data.train <- data[setdiff(rownames(data), rownames(data.test)),]
data.mi.train <- mice(data.train)
data.mi.test <- mice(data.test)
#recompute model
mod.nb.train <- with(data = data.mi.train, exp = glm.nb(f))
coeffs <- summary(pool(mod.nb.train))[,"est"]
#compute prediction error on each dataset row
errvec <- apply(complete(data.mi.test, include = F, action = "long")[,c(names(coeffs)[-1], as.character(f)[2])],
1, function(x){
return(exp(sum(x[1:length(x)-1]*coeffs[-1], coeffs[1]))-x[length(x)])
})
return(mean(errvec))
}
predicterr(CG.tot~Fibrinogen, d.mi)
Dataset (a little long, but that's for the imputation...):
d <- structure(list(Hb = c(7.5, 12.9, 12.9, 10.2, 10.5, 11.2, 12.7,
9.3, 11.7, 13.4, 151, 10.9, 5.9, 12.8, 10.2, 15.3, 13.8, 9.6,
7.6, 12.2, 11.1, 13.6, 8.9, 7.2, 7.8, 8.7, 10.3, 14, 8.8, 7.5
), Hct = c(23, 39.8, 39.4, 31.6, 32.5, 34.4, 39, 28, 35.9, 41.2,
43.8, 33.7, 18.6, 37.7, 31.7, 44, 87.3, 29.4, 23.6, 37.7, 34.3,
39.8, 27.4, 22.6, 24.2, 29.1, 31.8, 43.1, 27.3, 23.3), EXTEM.CT = c(51L,
60L, 45L, 115L, 55L, 48L, 49L, 106L, 56L, 68L, 61L, 53L, 69L,
44L, 58L, 126L, 47L, 68L, 49L, 68L, 51L, 84L, 63L, 66L, 51L,
108L, 63L, 51L, 53L, 63L), EXTEM.CFT = c(133L, 162L, 175L, 216L,
101L, 60L, 140L, 248L, 137L, 203L, 113L, 199L, 316L, 90L, 224L,
235L, 133L, 46L, 308L, 300L, 119L, 420L, 44L, 207L, 91L, 69L,
96L, 130L, 153L, 99L), EXTEM.MCF = c(59L, 55L, 50L, 46L, 64L,
72L, 52L, 46L, 50L, 50L, 60L, 40L, 40L, 56L, 46L, 47L, 52L, 67L,
40L, 35L, 83L, 30L, 82L, 47L, 61L, 76L, 63L, 51L, 58L, 58L),
INTEM.CT = c(NA, 158L, 154L, 240L, 141L, 141L, 143L, 122L,
104L, 193L, 183L, 186L, 182L, 172L, 192L, 149L, 133L, 162L,
238L, 158L, 144L, 144L, 162L, 213L, 139L, 157L, 104L, 376L,
140L, 192L), INTEM.CFT = c(NA, 91L, 119L, 165L, 97L, 51L,
118L, 190L, 84L, 90L, 82L, 114L, 226L, 90L, 89L, 209L, NA,
64L, 203L, 222L, 64L, 104L, 43L, 170L, 66L, 50L, 61L, 332L,
70L, 66L), INTEM.MCF = c(NA, 57L, 48L, 48L, 74L, 70L, 49L,
50L, 50L, 55L, 58L, 49L, 40L, 57L, 48L, 46L, 64L, 68L, 44L,
39L, 64L, 54L, 80L, 51L, 64L, 78L, 68L, 54L, 62L, 61L), FIBTEM.CT = c(50L,
62L, 101L, 123L, 58L, 49L, 49L, 74L, 77L, 117L, 61L, 54L,
79L, 41L, 69L, 189L, 49L, 67L, 55L, 56L, 57L, 59L, 56L, 62L,
57L, 65L, 51L, 58L, 68L, 67L), FIBTEM.CFT = c(NA, NA, NA,
NA, NA, 94L, NA, NA, NA, NA, NA, 615L, NA, 56L, NA, NA, NA,
79L, NA, NA, 625L, NA, 75L, NA, 892L, NA, NA, NA, NA, 1206L
), FIBTEM.MCF = c(9L, 9L, NA, 5L, 10L, 21L, 11L, 4L, 6L,
3L, 16L, 7L, 6L, 31L, NA, 4L, NA, 35L, 11L, 10L, 42L, NA,
28L, 13L, 22L, 28L, 8L, 7L, 9L, 21L), INR = c(1.14, 1, 1,
1.33, 1.01, 1.07, 1.06, 1.43, 1.22, 1.12, 1.18, 1.54, NA,
1.3, 1.13, 1.05, 1.09, 1.11, 1.49, 1.22, 1.33, 1.04, NA,
1.87, 1.67, 1, 1, 1.07, 1.12, 1.88), PTT = c(30, 28.4, 22.1,
37.8, 25.6, 28.9, 27.2, 32.7, 27.2, 28.9, 27.3, 69.9, 132,
31.9, 26.5, NA, 28.9, 44.3, 50.8, 36.6, NA, 23.5, 30, 70.6,
41.2, 30.1, 25.7, 26.7, 26, 41.9), Platelets = c(150, 193,
343, 138, 284, 216, 141, 291, 142, 230, 254, 126, NA, 249,
153, 308, 253, 66, 30, 41, 293, 208, 545, 141, 136, 256,
249, 305, 327, 112), Fibrinogen = c(1.3, NA, NA, 0.9, 2.1,
3.4, 2.3, 1.1, 1.5, 1.1, 1.8, 0.8, NA, 2.3, 2.4, NA, 2.2,
7.4, 1.8, 1.7, NA, 2.6, 7.1, 0.6, 1.2, NA, 1.1, 2.5, 1.7,
2), CG.tot = c(3L, 2L, 3L, 11L, 12L, 0L, 1L, 10L, 4L, 4L,
5L, 0L, 12L, 11L, 3L, 9L, 5L, 0L, 4L, 0L, 0L, 3L, 0L, 21L,
2L, 1L, 1L, 1L, 2L, 3L)), .Names = c("Hb", "Hct", "EXTEM.CT",
"EXTEM.CFT", "EXTEM.MCF", "INTEM.CT", "INTEM.CFT", "INTEM.MCF",
"FIBTEM.CT", "FIBTEM.CFT", "FIBTEM.MCF", "INR", "PTT", "Platelets",
"Fibrinogen", "CG.tot"), row.names = c(50L, 38L, 54L, 82L, 86L,
4L, 24L, 78L, 59L, 58L, 72L, 16L, 85L, 81L, 45L, 77L, 70L, 6L,
63L, 7L, 11L, 53L, 13L, 93L, 36L, 30L, 18L, 19L, 40L, 43L), class = "data.frame")
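The formula f was created outside with(), so glm.nb looks for its variables in the formula's own environment rather than in the imputed data. Pass the current evaluation environment as the second argument of glm.nb (its data argument):
mod.nb.train <- with(data = data.mi.train, exp = glm.nb(f, environment()))
With that change the variables are found and the model fits.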
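The scoping issue can be reproduced without mice. The sketch below is an illustration under assumptions: it uses lm instead of glm.nb, a made-up data frame toy, and a hypothetical helper fit_with that evaluates a model call with the data frame's columns as the evaluation scope, which is roughly what with() appears to do for each imputed data set.

```r
# Hypothetical toy data; the column names are illustrative only.
toy <- data.frame(pred = 1:8,
                  resp = c(2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1))

# The formula is created here, so its environment does not contain pred or resp.
f <- resp ~ pred

# Evaluate a model call with the columns of d as the evaluation scope.
fit_with <- function(f, d, pass_env) {
  if (pass_env) {
    # environment() is resolved inside the evaluation scope built from d,
    # so the model frame is constructed from the data columns.
    eval(quote(lm(f, data = environment())), envir = d, enclos = environment())
  } else {
    # Without a data argument the variables are looked up in environment(f).
    eval(quote(lm(f)), envir = d, enclos = environment())
  }
}

# The call without the environment fails with an "object not found" error.
failed <- inherits(try(fit_with(f, toy, FALSE), silent = TRUE), "try-error")

# Passing the environment lets the fit succeed.
fit <- fit_with(f, toy, TRUE)
print(failed)
print(coef(fit))
```

Writing the formula literally inside the call works for the same reason: a formula created inside the evaluation scope carries that scope as its environment.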
