I have a series of x and y values that I've used to build a linear model.
I can use predict() to find a value of y from a known value of x, but I'm struggling to calculate x from a known value of y. I've seen a few posts that talk about using the approx() function, but I can't figure out how to implement it for my use case. The idea is to write a function that takes a numerical value of y as an input and returns the expected value of x that it would correspond to, ideally with a prediction interval, eg "The expected value of x is 38.90, plus or minus 0.7", or something like that.
Here's my data:
> dput(x)
c(4.66, 5.53, 5.62, 5.85, 6.26, 6.91, 7.04, 7.32, 7.43, 7.85,
8.1, 8.3, 8.34, 8.53, 8.69, 8.7, 8.73, 8.76, 8.96, 9.06, 9.42,
9.78, 10.3, 10.82, 10.98, 11.07, 11.09, 11.32, 11.75, 12.1, 12.46,
12.5, 12.99, 13.02, 13.28, 13.43, 13.96, 14, 14.07, 14.29, 14.57,
14.66, 15.21, 15.56, 15.97, 16.44, 16.8, 17.95, 18.33, 18.62,
18.92, 19.49, 19.9, 19.92, 20.14, 20.18, 21.19, 22.7, 23.25,
23.48, 23.49, 23.58, 23.7, 23.83, 23.83, 23.97, 24.05, 24.14,
24.15, 24.19, 24.32, 24.62, 24.9, 24.92, 25, 25.06, 25.31, 25.36,
25.86, 25.9, 25.95, 25.99, 26.08, 26.2, 26.27, 26.39, 26.5, 26.51,
26.68, 26.78, 26.82, 26.92, 26.92, 27.05, 27.05, 27.07, 27.32,
27.6, 27.77, 27.8, 27.91, 27.96, 27.97, 28.04, 28.05, 28.15,
28.2, 28.28, 28.37, 28.51, 28.53, 28.53, 28.66, 28.68, 28.72,
28.74, 28.82, 28.83, 28.83, 28.86, 28.89, 28.91, 29.04, 29.2,
29.35, 29.4, 29.42, 29.48, 29.53, 29.65, 29.67, 29.69, 29.7,
29.72, 29.93, 29.97, 30.03, 30.08, 30.09, 30.11, 30.18, 30.62,
30.66, 30.78, 31, 31.32, 31.43, 31.47, 31.69, 31.96, 32.33, 32.5,
32.5, 32.58, 32.7, 32.92, 33.2, 33.6, 33.72, 33.77, 33.95, 34.02,
34.08, 34.42, 34.79, 34.91, 34.99, 35.08, 35.15, 35.49, 35.6,
35.6, 35.74, 35.8, 36.05, 36.17, 36.3, 36.37, 36.84, 37.31, 37.95,
38.75, 38.78, 38.81, 38.9, 39.21, 39.31, 39.5, 42.68, 43.92,
43.95, 44.64, 45.7, 45.95, 46.25, 46.8, 49.08, 50.33, 51.23,
52.76, 53.06, 62)
> dput(y)
c(11.91, 13.491, 13.708, 13.984, 14.624, 15.688, 15.823, 16.105,
16.387, 17.004, 17.239, 17.498, 17.686, 17.844, 17.997, 18.044,
18.003, 18.191, 18.332, 18.25, 18.778, 19.237, 19.693, 20.177,
20.441, 20.876, 20.512, 20.894, 21.493, 21.539, 21.951, 21.763,
22.498, 22.451, 22.744, 22.785, 23.409, 23.314, 23.408, 23.567,
23.849, 23.978, 24.472, 24.678, 25.236, 25.547, 25.676, 26.81,
26.83, 27.275, 27.331, 27.844, 28.009, 28.244, 28.497, 28.555,
29.067, 30.412, 30.788, 30.965, 31.058, 31.423, 31.346, 31.118,
31.252, 31.258, 31.399, 31.605, 31.552, 31.881, 31.822, 31.91,
32.333, 32.174, 32.222, 32.704, 32.445, 32.557, 32.993, 32.845,
32.997, 32.909, 32.911, 33.121, 33.191, 33.156, 33.426, 33.332,
33.52, 33.526, 33.697, 33.379, 33.849, 33.726, 33.538, 33.885,
33.961, 34.284, 34.208, 33.896, 34.278, 34.355, 34.276, 34.267,
34.399, 34.507, 34.492, 34.531, 34.695, 34.642, 34.872, 34.772,
34.813, 34.942, 34.883, 34.948, 34.719, 34.983, 34.99, 35.136,
35.007, 34.026, 35.148, 35.201, 35.459, 35.418, 35.236, 35.411,
35.459, 35.5, 35.665, 35.724, 35.636, 35.667, 35.747, 35.788,
35.882, 35.9, 35.83, 36.106, 36.029, 36.364, 36.358, 36.517,
37.005, 36.74, 36.963, 36.634, 37.04, 37.48, 37.581, 37.78, 37.686,
38.262, 37.998, 37.986, 38.498, 39.296, 38.467, 38.779, 38.885,
38.72, 39.038, 38.932, 39.719, 39.654, 39.367, 40.072, 39.707,
39.742, 39.919, 40.054, 40.189, 40.197, 40.154, 40.383, 42.146,
40.595, 40.971, 41.441, 41.964, 42.328, 42.463, 42.627, 42.633,
42.721, 42.786, 42.857, 45.318, 45.665, 46.406, 46.335, 47.663,
47.181, 48.074, 48.109, 49.931, 50.377, 51.053, 52.451, 53.004,
65.889)
> model <- lm(y ~ poly(x,3,raw=TRUE))
> model
Call:
lm(formula = y ~ poly(x, 3, raw = TRUE))
Coefficients:
(Intercept) poly(x, 3, raw = TRUE)1 poly(x, 3, raw = TRUE)2 poly(x, 3, raw = TRUE)3
6.6096981 1.4736619 -0.0238935 0.0002445
Since you have fitted a low order polynomial in ordinary form (raw = TRUE), you can use polyroot to directly find x given y.
## pc: polynomial coefficients in increasing order
solvePC <- function (pc, y) {
pc[1] <- pc[1] - y
## all roots, including complex ones
roots <- polyroot(pc)
## keep real roots
Re(roots)[abs(Im(roots)) / Mod(roots) < 1e-10]
}
y0 <- 38.9 ## example y-value
x0 <- solvePC(coef(model), y0)
#[1] 34.28348
plot(x, y, col = 8)
lines(x, model$fitted, lwd = 2)
abline(h = y0)
abline(v = x0)
To get an interval estimate, we can use sampling methods.
## polyfit: an ordinary polynomial regression model fitted by lm()
rootCI <- function (polyfit, y, nSamples = 1000, level = 0.05) {
## sample regression coefficients from their joint distribution
pc <- MASS::mvrnorm(nSamples, coef(polyfit), vcov(polyfit))
## for each row (a sample), call solvePC()
roots <- apply(pc, 1, solvePC, y)
## confidence interval
quantile(roots, prob = c(0.5 * level, 1 - 0.5 * level))
}
## 95% confidence interval
rootCI(model, y = y0)
# 2.5% 97.5%
#34.17981 34.38828
You can use optim:
Predict the y values given x:
pred_y <- function(x)predict(model, data.frame(x))
pred_y(x = 10)
[1] 19.20145
Now to predict x given y, we do:
pred_x <- function(y) optim(1, \(x) (y-pred_y(x))^2, method='BFGS')[[1]]
pred_x(19.20145)
[1] 10
The uniroot function is intended for this type of problem.
#coefficients for the model
coeff <- c(6.6096981, 1.4736619, -0.0238935, 0.0002445)
#define the equation which one needs the root of
modely <- function(x, y) {
# could use the predict function here
my<-coeff[1] + coeff[2]*x + coeff[3]*x**2 + coeff[4]*x**3
y-my
}
#use the uniroot functiion
#In this example y=10
uniroot(modely, lower=-100, upper=100, y=10)
$root
[1] 2.391022
$f.root
[1] -1.208443e-08
$iter
[1] 10
$init.it
[1] NA
$estim.prec
[1] 6.103516e-05
In this case for y=10, x = 2.391022
I have soil moisture data with x-, y- and z-coordinates like this:
gue <- structure(list(x = c(311939.1507, 311935.4607, 311924.7316, 311959.553,
311973.5368, 311953.3743, 311957.9409, 311948.3151, 311946.7169,
311997.0803, 312017.5236, 312006.0245, 312001.5179, 311992.7044,
311977.3076, 311960.4159, 311970.6047, 311957.2564, 311866.4246,
311870.8714, 311861.4461, 311928.7096, 311929.6291, 311929.4233,
311891.2915, 311890.3429, 311900.8905, 311864.4995, 311870.8143,
311866.9257, 312002.571, 312017.816, 312004.5024, 311947.1186,
311943.0152, 311952.2695, 311920.6095, 311929.8371, 311918.6095,
312011.9019, 311999.5755, 312011.1461, 311913.7251, 311925.3459,
311944.4701, 311910.2079, 311908.7618, 311896.0776, 311864.4814,
311856.9027, 311857.5747, 311967.3779, 311962.2024, 311956.8318,
311977.5254, 311971.1776, 311982.537, 311993.4709, 312004.6407,
312015.6118, 311990.8601, 311994.686, 311988.3037, 311990.518,
311986.3918, 311998.8876, 311923.9157, 311903.4563, 311915.714,
311856.9087, 311858.9812, 311874.5867, 311963.9099, 311938.4542,
311945.9505, 311804.3039, 311797.2571, 311791.6967, 311921.3965,
311928.9353, 311920.0597, 311833.5109, 311829.8683, 311847.6261,
311889.1243, 311902.4909, 311901.245, 311981.1118, 312005.7098,
311976.5858, 311819.8901, 311816.4143, 311819.4172, 311870.418,
311873.2656, 311888.3401, 311910.8377, 311897.6697, 311902.4571,
311846.8196, 311833.6235, 311846.2942, 311931.3916, 311930.1891,
311947.659, 311792.2642, 311793.2539, 311794.1931, 311795.1288,
311796.0806, 311797.0142, 311797.95, 311798.8822, 311799.8229,
311800.7774, 311801.7094, 311802.6395, 311803.583, 311804.5185,
311805.4558, 311806.391, 311807.3346, 311808.2757, 311809.2187,
311810.1549, 311811.1014, 311812.0366, 311812.9667, 311813.9107,
311814.8373, 311815.7777, 311816.7365, 311817.6522, 311818.6091,
311819.5335, 311820.4961, 311821.4337, 311822.3855, 311823.3195,
311824.2713, 311825.214, 311826.1705, 311827.1188, 311828.0501,
311828.9893, 311829.9324, 311830.8706, 311831.8181, 311832.7667,
311833.705, 311834.6546, 311835.609, 311836.5527, 311837.5157,
311838.4495, 311839.3926, 311840.3423, 311841.2799, 311842.2288,
311843.1691, 311844.118, 311845.0746, 311846.019, 311846.9709,
311847.9201, 311848.859, 311849.8105, 311850.7503, 311851.6889,
311852.6355, 311853.6045, 311854.5296, 311855.4717, 311856.4171,
311857.3759, 311858.3151, 311859.2604, 311860.2178, 311861.1636,
311862.1071, 311863.0347, 311863.9857, 311864.9316, 311865.8722,
311866.8158, 311867.7702, 311868.7155, 311869.649, 311870.6018,
311871.5449, 311872.4871, 311873.4352, 311874.385, 311875.3042,
311876.2617, 311877.2068, 311878.1429, 311879.0956, 311880.0401,
311880.9822, 311881.929, 311882.8651, 311883.8017, 311884.7429,
311885.6949, 311886.6349, 311887.7207, 311888.6653, 311889.6041,
311890.5358, 311891.4838, 311892.4292, 311893.3736, 311894.326,
311895.2703, 311896.2182, 311897.1635, 311898.1032, 311899.0496,
311899.9967, 311900.9456, 311901.8889, 311902.8162, 311903.7566,
311904.6996, 311905.6627, 311906.5899, 311907.5448, 311908.4856,
311909.4399, 311910.3649, 311911.3188, 311912.2629, 311913.2022,
311914.1527, 311915.1025, 311916.0425, 311916.985, 311917.9254,
311918.8661, 311919.8174, 311920.7668, 311921.7026, 311922.6517,
311923.5949, 311924.5252, 311925.4599, 311926.422, 311927.3646,
311928.3, 311929.2432, 311930.1796, 311931.1358, 311932.0726,
311933.0069, 311933.9585, 311934.845, 311935.7788, 311936.7193,
311937.6441, 311938.572, 311939.5094, 311940.4666, 311941.4067,
311942.3489, 311943.2712, 311944.2195, 311945.1536, 311946.0927,
311947.0413, 311947.9761, 311948.9082, 311949.8557, 311950.8201,
311951.7616, 311952.7148, 311953.7894, 311954.7289, 311955.6646,
311956.6081, 311957.5588, 311958.4896, 311959.4297, 311960.3761,
311961.3191, 311962.2653, 311963.195, 311964.1501, 311965.0856,
311966.0254, 311966.9739, 311967.9305, 311968.8592, 311971.7861,
311970.758, 311969.8205), y = c(5846548.408, 5846546.489, 5846538.014,
5846525.283, 5846510.302, 5846503.516, 5846529.769, 5846523.06,
5846522.742, 5846512.263, 5846525.347, 5846522.042, 5846537.487,
5846545.587, 5846532.112, 5846425.917, 5846406.543, 5846434.03,
5846500.989, 5846498.286, 5846487.134, 5846488.045, 5846483.29,
5846468.713, 5846534.269, 5846533.527, 5846504.056, 5846453.395,
5846438.43, 5846442.608, 5846406.8, 5846434.58, 5846419.229,
5846441.045, 5846436.903, 5846447.917, 5846460.757, 5846457.428,
5846451.067, 5846445.596, 5846474.031, 5846457.239, 5846532.694,
5846553.938, 5846565.323, 5846446.926, 5846432.549, 5846467.236,
5846473.963, 5846464.78, 5846498.142, 5846458.168, 5846474.018,
5846489.801, 5846559.513, 5846589.975, 5846555.723, 5846553.847,
5846560.066, 5846560.792, 5846455.642, 5846546.374, 5846465.999,
5846432.091, 5846422.061, 5846442.871, 5846485.956, 5846472.811,
5846506.756, 5846416.327, 5846419.623, 5846413.124, 5846587.334,
5846600.116, 5846589.515, 5846463.69, 5846456.712, 5846459.683,
5846600.118, 5846574.99, 5846597.804, 5846419.496, 5846437.615,
5846436.902, 5846567.872, 5846572.857, 5846556.904, 5846388.146,
5846393.088, 5846390.13, 5846481.09, 5846496.127, 5846493.586,
5846545.396, 5846532.126, 5846538.334, 5846388.343, 5846416.117,
5846392.223, 5846513.526, 5846486.644, 5846512.917, 5846395.509,
5846386.421, 5846383.873, 5846459.062, 5846459.36, 5846459.682,
5846460.026, 5846460.377, 5846460.703, 5846461.047, 5846461.378,
5846461.73, 5846462.071, 5846462.418, 5846462.765, 5846463.115,
5846463.466, 5846463.815, 5846464.128, 5846464.505, 5846464.843,
5846465.189, 5846465.52, 5846465.869, 5846466.217, 5846466.557,
5846466.893, 5846467.237, 5846467.586, 5846467.903, 5846468.274,
5846468.601, 5846468.943, 5846469.258, 5846469.592, 5846469.909,
5846470.247, 5846470.565, 5846470.891, 5846471.24, 5846471.536,
5846471.885, 5846472.224, 5846472.553, 5846472.884, 5846473.225,
5846473.532, 5846473.89, 5846474.179, 5846474.502, 5846474.827,
5846475.146, 5846475.448, 5846475.768, 5846476.102, 5846476.428,
5846476.746, 5846477.069, 5846477.37, 5846477.685, 5846478.009,
5846478.335, 5846478.656, 5846478.958, 5846479.299, 5846479.608,
5846479.926, 5846480.267, 5846480.603, 5846480.908, 5846481.246,
5846481.56, 5846481.877, 5846482.19, 5846482.503, 5846482.825,
5846483.144, 5846483.468, 5846483.811, 5846484.13, 5846484.458,
5846484.8, 5846485.125, 5846485.456, 5846485.778, 5846486.112,
5846486.421, 5846486.75, 5846487.08, 5846487.401, 5846487.744,
5846488.067, 5846488.39, 5846488.728, 5846489.067, 5846489.383,
5846489.716, 5846490.054, 5846490.38, 5846490.719, 5846491.044,
5846491.357, 5846491.694, 5846492.005, 5846492.402, 5846492.726,
5846493.045, 5846493.389, 5846493.708, 5846494.049, 5846494.363,
5846494.686, 5846494.982, 5846495.3, 5846495.64, 5846495.957,
5846496.263, 5846496.584, 5846496.911, 5846497.241, 5846497.591,
5846497.914, 5846498.226, 5846498.553, 5846498.893, 5846499.221,
5846499.538, 5846499.869, 5846500.19, 5846500.508, 5846500.82,
5846501.151, 5846501.492, 5846501.827, 5846502.147, 5846502.471,
5846502.803, 5846503.129, 5846503.46, 5846503.783, 5846504.11,
5846504.448, 5846504.76, 5846505.118, 5846505.445, 5846505.79,
5846506.106, 5846506.465, 5846506.795, 5846507.118, 5846507.448,
5846507.758, 5846508.081, 5846508.396, 5846508.645, 5846508.99,
5846509.34, 5846509.685, 5846510.031, 5846510.363, 5846510.693,
5846511.031, 5846511.362, 5846511.694, 5846512.024, 5846512.354,
5846512.701, 5846513.034, 5846513.353, 5846513.683, 5846513.998,
5846514.32, 5846514.636, 5846514.956, 5846515.326, 5846515.65,
5846515.968, 5846516.301, 5846516.634, 5846516.971, 5846517.318,
5846517.64, 5846517.952, 5846518.308, 5846518.626, 5846518.937,
5846519.27, 5846519.597, 5846519.921, 5846520.245, 5846520.581,
5846521.498, 5846521.209, 5846520.893), z = c(26.485, 26.411,
26.339, 27.248, 27.208, 26.799, 27.199, 27.023, 26.973, 26.908,
26.275, 26.474, 26.316, 26.226, 27.184, 25.903, 25.765, 25.931,
26.057, 26.181, 26.102, 26.436, 26.457, 26.396, 25.585, 25.572,
26.448, 25.637, 25.603, 25.634, 25.847, 26.185, 25.899, 26.016,
25.873, 26.299, 26.358, 26.344, 26.088, 26.264, 26.3, 26.306,
26.311, 25.857, 26.004, 25.824, 25.798, 26.326, 26.03, 25.625,
25.78, 26.368, 26.225, 26.582, 26.398, 25.343, 26.253, 25.908,
25.323, 25.381, 26.3, 26.179, 26.284, 26.024, 25.896, 26.251,
26.447, 26.385, 26.419, 25.188, 25.176, 25.169, 25.348, 25.188,
25.291, 25.285, 25.266, 25.262, 25.333, 25.308, 25.314, 25.145,
25.172, 25.22, 25.235, 25.204, 25.286, 25.155, 25.397, 25.202,
25.373, 25.327, 25.341, 25.172, 25.253, 25.318, 25.023, 25.24,
25.132, 25.264, 25.38, 25.221, 25.119, 25.179, 25.083, 25.258,
25.254, 25.235, 25.252, 25.266, 25.256, 25.264, 25.26, 25.262,
25.265, 25.265, 25.285, 25.28, 25.257, 25.254, 25.258, 25.287,
25.294, 25.282, 25.27, 25.268, 25.309, 25.303, 25.3, 25.312,
25.305, 25.3, 25.314, 25.319, 25.328, 25.304, 25.325, 25.308,
25.332, 25.333, 25.333, 25.346, 25.344, 25.339, 25.355, 25.362,
25.36, 25.391, 25.418, 25.434, 25.436, 25.447, 25.486, 25.5,
25.526, 25.552, 25.551, 25.564, 25.589, 25.606, 25.641, 25.672,
25.689, 25.709, 25.736, 25.758, 25.782, 25.836, 25.844, 25.866,
25.88, 25.935, 25.984, 26.037, 26.066, 26.071, 26.094, 26.106,
26.106, 26.118, 26.1, 26.146, 26.135, 26.156, 26.169, 26.162,
26.173, 26.198, 26.196, 26.228, 26.258, 26.276, 26.283, 26.277,
26.236, 26.277, 26.251, 26.264, 26.26, 26.261, 26.249, 26.307,
26.289, 26.243, 26.206, 26.231, 26.224, 26.238, 26.244, 26.245,
26.254, 26.2, 26.229, 26.24, 26.248, 26.223, 26.29, 26.344, 26.371,
26.364, 26.311, 26.343, 26.342, 26.334, 26.317, 26.342, 26.315,
26.312, 26.322, 26.325, 26.324, 26.32, 26.308, 26.329, 26.31,
26.32, 26.327, 26.34, 26.371, 26.442, 26.442, 26.483, 26.504,
26.526, 26.562, 26.562, 26.538, 26.534, 26.533, 26.541, 26.584,
26.642, 26.65, 26.691, 26.719, 26.755, 26.786, 26.794, 26.849,
26.867, 26.919, 26.93, 26.945, 26.947, 26.959, 26.984, 26.992,
27.006, 27.035, 27.021, 27.052, 27.094, 27.104, 27.119, 27.16,
27.182, 27.223, 27.236, 27.267, 27.304, 27.331, 27.348, 27.341,
27.379, 27.355, 27.378, 27.357, 27.373, 27.319, 27.299, 27.278,
27.28, 27.295, 27.288, 27.286, 27.279), soil_m_sat = c(24.1,
24.2, 26.9, 13.9, 20.6, 34.1, 16.2, 16.7, 16, 22.1, 23.9, 27.2,
26.8, 34.4, 26.3, 54.1, 51, 44.9, 46.4, 45.9, 54.7, 39.1, 38.7,
40.7, 56.5, 56.3, 40.6, 60.9, 56.8, 56.3, 40.7, 40.4, 44.1, 44.9,
46.2, 45.3, 46.1, 43.7, 44.9, 45.4, 33.1, 45.8, 27.6, 47.8, 37.3,
58.9, 51.4, 42.1, 46, 66.6, 51.1, 31.6, 48.7, 32.9, 28.1, 84,
37.7, 38.2, 80.4, 73.3, 35.6, 44.2, 39.7, 50.2, 49.9, 37.8, 37,
41.7, 27.3, 100, 100, 100, 80.9, 100, 88.4, 89.6, 93.8, 95.3,
91.9, 93.9, 96.1, 91.4, 100, 94.4, 100, 100, 80, 94.1, 84.4,
91.1, 80, 78.9, 85.9, 100, 97.5, 87.2, 88.6, 83.3, 90.7, 100,
82.2, 100, 96.3, 93.3, 99.6, 92.1, 92.8, 90.9, 92.3, 91.2, 94.5,
91.8, 89.4, 87, 86, 88, 83.7, 88.8, 92.9, 89.3, 83.3, 83.5, 84.5,
85.8, 87.4, 86.5, 82, 78.1, 85.8, 85.6, 88.7, 87.7, 84.9, 82,
87.9, 85.5, 86, 82, 83, 88.5, 81.2, 81.6, 76.5, 77.6, 84.5, 81.5,
82, 82.4, 68, 67.7, 62.1, 68.9, 61.7, 68.5, 68.6, 65.3, 59.5,
60.8, 67.3, 66.2, 59.9, 50.9, 46.9, 44.6, 47.9, 53, 52.1, 48.3,
41.3, 53.8, 51, 47, 53.7, 49.5, 51.1, 44.4, 35.1, 42.2, 41.5,
40, 48.2, 46.7, 48.6, 51.7, 51.2, 52.3, 53.4, 48.9, 50.7, 48.5,
46.5, 39.4, 38, 49.2, 43.6, 47.1, 40.4, 44.7, 45.7, 38.1, 41.9,
39.3, 40.2, 43.8, 47.3, 50.1, 41.2, 39.8, 46, 40.8, 40, 37.8,
42.6, 46, 43.8, 45.4, 42.2, 46.5, 40.4, 39.9, 53, 44.7, 35.8,
42.9, 43.9, 43.2, 40.6, 40.8, 32.2, 32.6, 33.5, 36.7, 34.6, 34.7,
50.9, 35.6, 34.2, 28.1, 42, 32, 42.3, 30, 29.6, 31, 29.8, 26,
37.8, 40, 37, 30.2, 28.2, 26.2, 27.4, 22.1, 28.4, 23.2, 24.8,
26.5, 23.9, 21.1, 27.2, 20.8, 12.5, 14, 17.9, 19.7, 19.4, 26,
16.7, 18.2, 23.9, 19, 25.9, 24.4, 22.1, 19.2, 18.4, 24.7, 17.3,
19.4, 19.6, 17.7, 21.3, 22.1, 17.9, 28.2, 16.3, 25.3, 19.7, 21.7,
19, 18.8, 11.8, 15.6, 9.8, 17.7)), .Names = c("x", "y", "z",
"soil_m_sat"), class = "data.frame", row.names = c(NA, -296L))
In order to estimate a variogram for this data I need to remove the spatial trend from it. The soil moisture, of course, varies with the surface - the higher a point is the dryer it is. And since this soil moisture data is percetagewise the relationship is hardly linear, what leads me to allow up to cubic dependencies of the soil moisture to the z-coordinate. It happens that in this area there is a small more or less elliptic elevation, so that I want to allow the soil moisture to be dependend of the x- and y-coordinates in a quadratic way. I hope the following model does exactly this:
polymod <- lm(soil_m_sat ~ poly(x + y, degree = 2) + poly(z, degree = 3), data = gue)
summary(polymod)
The summary shows me that there is no significance for the first coefficient of the x- and y-dependency (what summary names poly(x + y, degree = 2)1). Because the help page from poly() told me that it "returns or evaluates orthogonal polynomials of degree 1 to degree", I thought, removing a degree one polynom from the model might be the same as removing the first coefficient of the degree 2 polynom. Therefore I tried to remove it like this:
mod <- lm(soil_m_sat ~ poly(x + y, degree = 2) - poly(x + y, degree = 1) + poly(z, degree = 3), data = gue)
summary(mod)
But the summary of mod looks exactly the same as the summary of polymod, meaning mod does not differ from polymod. How is it possible to remove the unsignificant component then?
No, don't check with summary in this case. You should use anova. A polynomial term from poly(), or a spline term from bs() contains more than coefficients, so they are more like a factor variable with multiple levels.
> anova(polymod)
Analysis of Variance Table
Response: soil_m_sat
Df Sum Sq Mean Sq F value Pr(>F)
poly(x + y, degree = 2) 2 113484 56742 1600.8 < 2.2e-16 ***
poly(z, degree = 3) 3 68538 22846 644.5 < 2.2e-16 ***
Residuals 290 10280 35
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The ANOVA table clearly shows that you need all model terms. Do not drop any.
But I still need to answer your question and make you feel happy.
It is not impossible to drop the poly(x + y, degree = 2)1 term, but you need to access model matrix for such purpose. You may do
gue$XY_poly <- with(gue, poly(x + y, degree = 2))[, 2] ## use the 2nd column only
fit <- lm(soil_m_sat ~ XY_poly + poly(z, degree = 3), data = gue)
summary(fit)
## ...
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 52.3071 0.3459 151.217 < 2e-16 ***
XY_poly -18.8515 7.3894 -2.551 0.0112 *
poly(z, degree = 3)1 -418.1634 6.4937 -64.395 < 2e-16 ***
poly(z, degree = 3)2 116.5327 6.9171 16.847 < 2e-16 ***
poly(z, degree = 3)3 -28.7773 5.9517 -4.835 2.16e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 5.951 on 291 degrees of freedom
Multiple R-squared: 0.9464, Adjusted R-squared: 0.9457
F-statistic: 1285 on 4 and 291 DF, p-value: < 2.2e-16