How can I draw a 3D hyperplane to illustrate a decision boundary using ggplot? - r

I have a data frame df with 3D input data (x1, x2, x3) and a target t. I used logistic regression to create the decision boundary
a0 + a1 * x1 + a2 * x2 + a3 * x3 = 0
I was wondering if there is a way to draw the 3D hyperplane (along with the 3D input data) using ggplot to illustrate the decision boundary created by logistic regression.
Thanks

You cannot have a true 3D plot in ggplot2, but there are ways to represent a 3D plane using contour lines or colour fills. Here's an example using a coloured raster layer to represent the plane.
I assume from the question that you want the decision boundary to be where the probability is 0.5 (i.e. where the log odds are 0).
First we need a logistic regression model, so in the absence of any data in the question, let's create some that will allow us a nice example:
# Create dummy data for logistic regression
set.seed(69)
x1 <- sample(100, 1000, TRUE)
x2 <- sample(100, 1000, TRUE)
x3 <- sample(100, 1000, TRUE)
log_odds <- -1 + 0.02 * x1 + 0.005 * x2 - 0.03 * x3 + rnorm(1000, 0, 2)
odds <- exp(log_odds)
probs <- odds/(1 + odds)
y <- rbinom(1000, 1, probs)
df <- data.frame(y, x1, x2, x3)
Now we have a binary outcome, y, whose value depends on the three independent variables x1, x2 and x3, so we can run a logistic regression and grab its coefficients:
# Run logistic regression and extract coefficients
logistic_model <- glm(y ~ x1 + x2 + x3, data = df, family = binomial)
summary(logistic_model)
#>
#> Call:
#> glm(formula = y ~ x1 + x2 + x3, family = binomial, data = df)
#>
#> Deviance Residuals:
#> Min 1Q Median 3Q Max
#> -1.5058 -0.8689 -0.6296 1.1264 2.3669
#>
#> Coefficients:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) -0.888782 0.232728 -3.819 0.000134 ***
#> x1 0.012369 0.002562 4.828 1.38e-06 ***
#> x2 0.008031 0.002478 3.241 0.001191 **
#> x3 -0.020676 0.002560 -8.076 6.67e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> (Dispersion parameter for binomial family taken to be 1)
#>
#> Null deviance: 1235.0 on 999 degrees of freedom
#> Residual deviance: 1129.9 on 996 degrees of freedom
#> AIC: 1137.9
#>
#> Number of Fisher Scoring iterations: 4
coefs <- coef(logistic_model)
Our plot will show x1 on the x axis and x2 on the y axis. The colour at each point (x1, x2) will be the value of x3 that produces log odds of 0. We can get this by rearranging the formula a0 + a1 * x1 + a2 * x2 + a3 * x3 = 0 from the question into x3 = -(a0 + a1 * x1 + a2 * x2) / a3:
# Create a function that returns the value of x3 at p = 0.5, given x1 and x2
find_x3 <- function(x1, x2) (-coefs[1] -coefs[2] * x1 - coefs[3] * x2)/coefs[4]
Now we can create a data frame that contains all values of x1 and x2 between 1 and 100, and find the appropriate value of x3 that gives log odds of 0 for each point on this grid:
# Create a data frame to plot the 3d plane where p = 0.5
plot_df <- expand.grid(x2 = 1:100, x1 = 1:100)
plot_df$x3 <- find_x3(plot_df$x1, plot_df$x2)
head(plot_df)
#> x2 x1 x3
#> 1 1 1 -41.99975
#> 2 2 1 -41.61133
#> 3 3 1 -41.22291
#> 4 4 1 -40.83450
#> 5 5 1 -40.44608
#> 6 6 1 -40.05766
We can confirm that these points lie on the decision boundary by running predict() with this data frame as newdata; the predicted log odds should all be 0 (or very close to 0):
head(predict(logistic_model, newdata = plot_df))
#> 1 2 3 4 5
#> 0.000000e+00 0.000000e+00 -1.110223e-16 0.000000e+00 0.000000e+00
Good.
Finally, we can plot the result with a divergent colour scale to show the combinations of x1, x2 and x3 that together form your decision boundary:
library(ggplot2)
ggplot(plot_df, aes(x1, x2, fill = x3)) +
geom_raster() +
scale_fill_gradientn(colours = c("deepskyblue4", "forestgreen", "gold", "red")) +
coord_equal() +
theme_classic()
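The answer above also mentions contour lines as an option. Here is a hedged sketch (reusing plot_df and df from above; after_stat() needs ggplot2 >= 3.3.0) that draws the plane as iso-lines of x3 and overlays the raw observations in the x1-x2 plane, shaped by the outcome y, so the input data appear on the same panel:
# Contour-line alternative: iso-lines of the boundary's x3 value,
# with the observed points overlaid in the x1-x2 plane
ggplot(plot_df, aes(x1, x2)) +
  geom_contour(aes(z = x3, colour = after_stat(level))) +
  geom_point(data = df, aes(shape = factor(y)), alpha = 0.3) +
  scale_colour_gradientn(colours = c("deepskyblue4", "forestgreen", "gold", "red")) +
  coord_equal() +
  theme_classic()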
If you're looking for a genuine 3D perspective plot, you could try base R's persp function. Note that plot_df was built with x2 varying fastest, so the x3 values need transposing into a matrix whose rows follow x1 and columns follow x2:
# transpose so rows of z correspond to x1 and columns to x2
persp(x = 1:100, y = 1:100, z = t(matrix(plot_df$x3, ncol = 100)),
      xlab = "x1", ylab = "x2", zlab = "x3",
      theta = -45, phi = 25, d = 5,
      col = "gold", border = "orange",
      ticktype = "detailed")
Created on 2020-08-16 by the reprex package (v0.3.0)
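If you also want the raw observations on the perspective plot, here is a hedged sketch (assuming df and plot_df from above are still in the workspace): persp() invisibly returns the projection matrix, so the points can be projected onto the same view with trans3d().
# Overlay the observations on the persp() plane via trans3d()
pmat <- persp(x = 1:100, y = 1:100, z = t(matrix(plot_df$x3, ncol = 100)),
              xlab = "x1", ylab = "x2", zlab = "x3",
              zlim = range(c(plot_df$x3, df$x3)),
              theta = -45, phi = 25, d = 5,
              col = "gold", border = "orange",
              ticktype = "detailed")
points(trans3d(df$x1, df$x2, df$x3, pmat),
       pch = 16, cex = 0.6,
       col = ifelse(df$y == 1, "red", "blue"))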

Related

How to conduct joint significance test in seemingly unrelated regression

I'm trying to conduct a joint test of significance in a seemingly unrelated regression setup with robust standard errors. I have three outcomes Y1, Y2, and Y3 and I want to conduct a joint hypothesis test against the null that the average effect of the treatment Z is zero on all three outcomes.
I think that I have the model set up correctly, but I don't think that I have the hypothesis.matrix set correctly in car::linearHypothesis.
Here's some data:
library(tibble)
library(car)
library(systemfit)
set.seed(343)
N = 800
dat <-
tibble(
U = rnorm(N),
Z = rbinom(N, 1, 0.5),
Y = 0.2 * Z + U,
Y1 = Y + rnorm(N, sd = 0.3),
Y2 = Y + rnorm(N, sd = 0.5),
Y3 = Y + rnorm(N, sd = 0.5)
)
Here's the seemingly unrelated regression fit:
sur <- systemfit(list(Y1 ~ Z, Y2 ~ Z, Y3 ~ Z), method = "SUR", data = dat)
summary(sur)
Which is identical to the ols fit in this case:
ols <- lm(cbind(Y1, Y2, Y3) ~ Z, data = dat)
summary(ols)
Which is useful, because I need to estimate robust standard errors for this test:
linearHypothesis(ols, hypothesis.matrix = "Z = 0", white.adjust = "hc2")
This last line is the one that I think is incorrect. I think it's incorrect because the individual coefficients all have lower p-values than the joint test, but I could be wrong?
Looks right to me. You'd get the same result by estimating the null model (ols0 below) and using anova() to test the difference between the estimated and null models.
library(tibble)
library(car)
#> Loading required package: carData
set.seed(343)
N = 800
dat <-
tibble(
U = rnorm(N),
Z = rbinom(N, 1, 0.5),
Y = 0.2 * Z + U,
Y1 = Y + rnorm(N, sd = 0.3),
Y2 = Y + rnorm(N, sd = 0.5),
Y3 = Y + rnorm(N, sd = 0.5)
)
ols <- lm(cbind(Y1, Y2, Y3) ~ Z, data = dat)
linearHypothesis(ols, hypothesis.matrix = "Z = 0")
#>
#> Sum of squares and products for the hypothesis:
#> Y1 Y2 Y3
#> Y1 3.201796 4.693391 3.359617
#> Y2 4.693391 6.879863 4.924734
#> Y3 3.359617 4.924734 3.525216
#>
#> Sum of squares and products for error:
#> Y1 Y2 Y3
#> Y1 829.5535 756.1586 770.0808
#> Y2 756.1586 965.5959 770.4636
#> Y3 770.0808 770.4636 980.0664
#>
#> Multivariate Tests:
#> Df test stat approx F num Df den Df Pr(>F)
#> Pillai 1 0.0073689 1.96972 3 796 0.11703
#> Wilks 1 0.9926311 1.96972 3 796 0.11703
#> Hotelling-Lawley 1 0.0074236 1.96972 3 796 0.11703
#> Roy 1 0.0074236 1.96972 3 796 0.11703
ols0 <- lm(cbind(Y1, Y2, Y3) ~ 1, data = dat)
anova(ols, ols0, test="Pillai")
#> Analysis of Variance Table
#>
#> Model 1: cbind(Y1, Y2, Y3) ~ Z
#> Model 2: cbind(Y1, Y2, Y3) ~ 1
#> Res.Df Df Gen.var. Pillai approx F num Df den Df Pr(>F)
#> 1 798 0.48198
#> 2 799 1 0.48257 0.0073689 1.9697 3 796 0.117
Created on 2022-07-08 by the reprex package (v2.0.1)
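If it helps to see what the character shorthand expands to, here is a hedged sketch of the same (non-robust) multivariate test written with an explicit numeric hypothesis matrix; the model has two coefficients per equation, the intercept and Z, so a single row selecting the second coefficient reproduces "Z = 0":
# Equivalent numeric hypothesis matrix: 0 * (Intercept) + 1 * Z = 0
H <- matrix(c(0, 1), nrow = 1)
linearHypothesis(ols, hypothesis.matrix = H)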

Explain the code underlying a linear model in R visualised with ggplot

I am trying to understand how linear modelling can be used to as an alternative to the t-test when analysing gene expression data. For a single gene, I have a dataframe of 20 gene expression values altogether in group 1 (n=10) and group 2 (n=10).
gexp = data.frame(expression = c(2.7,0.4,1.8,0.8,1.9,5.4,5.7,2.8,2.0,4.0,3.9,2.8,3.1,2.1,1.9,6.4,7.5,3.6,6.6,5.4),
group = c(rep(1, 10), rep(2, 10)))
The data can be (box)plotted using ggplot as shown below:
plot <- gexp %>%
ggplot(aes(x = group, y = expression)) +
geom_boxplot() +
geom_point()
plot
I wish to model the expression in groups 1 and 2 using the regression formula:
Y = Beta0 + (Beta1 * X) + e, where Y is the expression I want to model and X represents the two groups, encoded as 0 and 1 respectively. Therefore, the expression in group 1 (when X = 0) is equal to Beta0, and the expression in group 2 (when X = 1) is equal to Beta0 + Beta1.
If this is modelled with:
mod1 <- lm(expression ~ group, data = gexp)
mod1
The above code outputs an intercept of 2.75 and a slope of 1.58. It is the visualisation of the linear model that I don't understand. I would be grateful for a clear explanation of the below code:
plot +
geom_point(data = data.frame(x = c(1, 2), y = c(2.75, 4.33)),
aes(x = x, y = y),
colour = "red", size = 5) +
geom_abline(intercept = coefficients(mod1)[1] - coefficients(mod1)[2],
slope = coefficients(mod1)[2])
I get why the data.frame values are the ones chosen (the value of 4.33 is the sum of the intercept Beta0 and the slope Beta1), but it is the geom_abline arguments I do not understand. Why is the intercept calculated as shown? The text I am using states, '...we need to subtract the slope from the intercept when plotting the linear model because groups 1 and 2 are encoded as 0 and 1 in the model, but plotted as 1 and 2 on the figure.' I don't follow this point and would be grateful for an explanation, without getting too technical.
I believe your code is correct if the group variable is encoded as a factor.
library(ggplot2)
gexp = data.frame(expression = c(2.7,0.4,1.8,0.8,1.9,5.4,5.7,2.8,2.0,4.0,3.9,2.8,3.1,2.1,1.9,6.4,7.5,3.6,6.6,5.4),
group = factor(c(rep(1, 10), rep(2, 10))))
plot <-
ggplot(gexp, aes(x = group, y = expression)) +
geom_boxplot() +
geom_point()
mod1 <- lm(expression ~ group, data = gexp)
plot +
geom_point(data = data.frame(x = c(1, 2), y = c(2.75, 4.33)),
aes(x = x, y = y),
colour = "red", size = 5) +
geom_abline(intercept = coefficients(mod1)[1] - coefficients(mod1)[2],
slope = coefficients(mod1)[2])
Created on 2022-03-30 by the reprex package (v2.0.1)
To understand the difference between factors and integers in specifying linear models, you can have a look at the model matrix.
model.matrix(y ~ f, data = data.frame(f = 1:3, y = 1))
#> (Intercept) f
#> 1 1 1
#> 2 1 2
#> 3 1 3
#> attr(,"assign")
#> [1] 0 1
model.matrix(y ~ f, data = data.frame(f = factor(1:3), y = 1))
#> (Intercept) f2 f3
#> 1 1 0 0
#> 2 1 1 0
#> 3 1 0 1
#> attr(,"assign")
#> [1] 0 1 1
#> attr(,"contrasts")
#> attr(,"contrasts")$f
#> [1] "contr.treatment"
Created on 2022-03-30 by the reprex package (v2.0.1)
In the first model matrix, what you specify is what you get: you're modelling something as a function of the intercept and the f variable. In this model, f = 2 is treated as twice as much as f = 1.
This works a little differently when f is a factor. A k-level factor is split into k - 1 dummy variables, each encoding with 1 or 0 whether an observation deviates from the reference level (the first factor level). Modelled this way, the 2nd factor level is not assumed to be twice the 1st.
Because ggplot2 displays the first factor level at position 1 rather than at position 0 (which is how it is modelled), the calculated intercept is off by one step of the slope. You need to subtract 1 * slope from the intercept to make the line display correctly in ggplot2.
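If you'd rather not adjust the intercept by hand at all, here is a hedged alternative sketch (pred_df is a hypothetical helper name; it assumes mod1 was fitted with group as a factor, as above): compute the fitted group means with predict() and plot those directly. The two fitted values are the 2.75 and 4.33 used earlier.
# Fitted group means straight from the model, no intercept arithmetic
pred_df <- data.frame(group = factor(c(1, 2)))
pred_df$fit <- predict(mod1, newdata = pred_df)
plot +
  geom_point(data = pred_df, aes(x = group, y = fit),
             colour = "red", size = 5) +
  geom_line(data = pred_df, aes(x = as.numeric(group), y = fit, group = 1))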

Code for a scatterplot of residuals from a linear regression

I ran a linear regression
lm.fit <- lm(intp.trust~age+v225+age*v225+v240+v241+v242,data=intp.trust)
summary(lm.fit)
and got the following results:
Call:
lm(formula = intp.trust ~ age + v225 + age * v225 + v240 + v241 +
v242, data = intp.trust)
Residuals:
Min 1Q Median 3Q Max
-1.32050 -0.33299 -0.04437 0.30899 2.35520
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.461e+00 2.881e-02 85.418 < 2e-16 ***
age -2.416e-03 5.144e-04 -4.697 2.66e-06 ***
v225 5.794e-04 1.574e-02 0.037 0.971
v240 2.111e-02 2.729e-03 7.734 1.07e-14 ***
v241 -1.177e-03 1.958e-04 -6.014 1.83e-09 ***
v242 -1.473e-02 4.166e-04 -35.354 < 2e-16 ***
age:v225 4.214e-06 3.101e-04 0.014 0.989
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.4833 on 34845 degrees of freedom
(21516 observations deleted due to missingness)
Multiple R-squared: 0.05789, Adjusted R-squared: 0.05773
F-statistic: 356.8 on 6 and 34845 DF, p-value: < 2.2e-16
"consider the residuals from the regression above. compare the residual distributions for females and males using an appropriate graph?"
Males and females is coded using variable v225. How do I go about on creating this graph?
at first I created :
lm.res <- resid(lm.fit)
but I'm not sure what the next step is.
The graph is supposed to be a scatterplot of residuals with different colour for females and males.
I tried this but was not working
ggplot(intp.trust, aes(x = intp.trust, y = lm.res, color = v225)) + geom_point()
In this line:
ggplot(intp.trust, aes(x = intp.trust, y = lm.res, color = v225)) + geom_point()
You are saying: "go look in the data.frame intp.trust for a variable called lm.res, and plot that as y"
But you created lm.res as a standalone object, not as a column of intp.trust. Assign the residuals from your model to a new column of the data frame like this:
intp.trust$lm.res <- resid(lm.fit)
And it should work. Example with dummy data:
library(ggplot2)
# generate data
true_function <- function(x, is_female) {
ifelse(is_female, 5, 2) +
ifelse(is_female, -1.5, 1.5) * x +
rnorm(length(x))
}
set.seed(123)
dat <- data.frame(x = runif(200, 1, 5), is_female = rbinom(200, 1, .5))
dat$y <- with(dat, true_function(x, is_female))
# regression
lm_fit <- lm(y ~ x + as.factor(is_female), data=dat)
# add residuals to data.frame
dat$resid <- resid(lm_fit)
# plot
ggplot(dat, aes(x=x, y=resid, color=as.factor(is_female))) +
geom_point()
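One hedged caveat for your real data: the regression summary reports 21516 observations deleted due to missingness, so resid(lm.fit) will be shorter than intp.trust and the column assignment above would fail with a length error. Refitting with na.action = na.exclude makes resid() return an NA-padded vector of the full length:
# Refit so resid() is padded with NA for rows dropped due to missingness
lm.fit <- lm(intp.trust ~ age + v225 + age * v225 + v240 + v241 + v242,
             data = intp.trust, na.action = na.exclude)
intp.trust$lm.res <- resid(lm.fit)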
Here is a sample you could follow to get what you want:
# Sample Data
x_1 <- rnorm(100)
x_2 <- runif(100, 10, 30)
x_3 <- rnorm(100) * runif(100)
y <- rnorm(100, mean = 10)
gender <- sample(c("F", "M"), 100, replace = TRUE)  # draw 100 values so lengths match
df <- data.frame(x_1, x_2, x_3, y, gender)
# Fit model
lm.fit <- lm(y ~ x_1 + x_2 + x_1 * x_2 + x_3, data = df)
# Update data.frame
df$residuals <- lm.fit$residuals
# Scatter Residuals
ggplot(df) +
geom_point(aes(x = as.numeric(row.names(df)), y = residuals, color = gender)) +
labs(x = 'Index', y = 'Residual value', title = 'Residual scatter plot')
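Since the task is to compare the residual distributions for females and males, an overlaid density (or a boxplot) per group may be a more appropriate graph than a scatterplot. Here is a hedged sketch using the same df and ggplot2 as in the sample above:
# Compare residual distributions rather than individual residuals
ggplot(df, aes(x = residuals, fill = gender)) +
  geom_density(alpha = 0.4) +
  labs(x = 'Residual value', title = 'Residual distributions by gender')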

ggplot exponential smooth with tuning parameter inside exp

ggplot provides various "smoothing methods" or "formulas" that determine the form of the trend line. However, it is unclear to me how the parameters of the formula are specified and how I can get an exponential formula to fit my data. In other words, how do I tell ggplot that it should fit the parameter inside the exp()?
df <- data.frame(x = c(65,53,41,32,28,26,23,19))
df$y <- c(4,3,2,8,12,8,20,15)
x y
1 65 4
2 53 3
3 41 2
4 32 8
5 28 12
6 26 8
7 23 20
8 19 15
p <- ggplot(data = df, aes(x = x, y = y)) +
geom_smooth(method = "glm", se=FALSE, color="black", formula = y ~ exp(x)) +
geom_point()
p
Problematic fit:
However, if the parameter inside the exponential is fitted, the form of the trend line becomes reasonable:
p <- ggplot(data = df, aes(x = x, y = y)) +
geom_smooth(method = "glm", se=FALSE, color="black", formula = y ~ exp(-0.09 * x)) +
geom_point()
p
Here is an approach with method nls instead of glm.
You can pass additional arguments to nls via a list supplied to method.args =. Here we define starting values from which the a and r coefficients are fitted.
library(ggplot2)
ggplot(data = df, aes(x = x, y = y)) +
geom_smooth(method = "nls", se = FALSE,
formula = y ~ a * exp(r * x),
method.args = list(start = c(a = 10, r = -0.01)),
color = "black") +
geom_point()
As discussed in the comments, the best way to get the coefficients on the graph is by fitting the model outside the ggplot call.
model.coeff <- coef(nls( y ~ a * exp(r * x), data = df, start = c(a = 50, r = -0.04)))
ggplot(data = df, aes(x = x, y = y)) +
geom_smooth(method = "nls", se = FALSE,
formula = y ~ a * exp(r * x),
method.args = list(start = c(a = 50, r = -0.04)),
color = "black") +
geom_point() +
geom_text(x = 40, y = 15,
label = as.expression(substitute(italic(y) == a %.% italic(e)^(r %.% x),
list(a = format(unname(model.coeff["a"]),digits = 3),
r = format(unname(model.coeff["r"]),digits = 3)))),
parse = TRUE)
Firstly, to pass additional parameters to the function passed to the method param of geom_smooth, you can pass a list of named parameters to method.args.
Secondly, the problem you're seeing is that glm is placing the coefficient in front of the whole term: y ~ coef * exp(x) instead of inside: y ~ exp(coef * x) like you want. You could use optimization to solve the latter outside of glm, but you can fit it into the GLM paradigm by a transformation: a log link. This works because it's like taking the equation you want to fit, y = exp(coef * x), and taking the log of both sides, so you're now fitting log(y) = coef * x, which is equivalent to what you want to fit and works with the GLM paradigm. (This ignores the intercept. It also ends up in transformed link units, but it's easy enough to convert back if you like.)
You can run this outside of ggplot to see what the models look like:
df <- data.frame(
  x = c(65,53,41,32,28,26,23,19),
  y = c(4,3,2,8,12,8,20,15)  # use `=` here, not `<-`, so the column is named y
)
bad_model <- glm(y ~ exp(x), family = gaussian(link = 'identity'), data = df)
good_model <- glm(y ~ x, family = gaussian(link = 'log'), data = df)
# this is bad
summary(bad_model)
#>
#> Call:
#> glm(formula = y ~ exp(x), family = gaussian(link = "identity"),
#> data = df)
#>
#> Deviance Residuals:
#> Min 1Q Median 3Q Max
#> -7.7143 -2.9643 -0.8571 3.0357 10.2857
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 9.714e+00 2.437e+00 3.986 0.00723 **
#> exp(x) -3.372e-28 4.067e-28 -0.829 0.43881
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> (Dispersion parameter for gaussian family taken to be 41.57135)
#>
#> Null deviance: 278.00 on 7 degrees of freedom
#> Residual deviance: 249.43 on 6 degrees of freedom
#> AIC: 56.221
#>
#> Number of Fisher Scoring iterations: 2
# this is better
summary(good_model)
#>
#> Call:
#> glm(formula = y ~ x, family = gaussian(link = "log"), data = df)
#>
#> Deviance Residuals:
#> Min 1Q Median 3Q Max
#> -3.745 -2.600 0.046 1.812 6.080
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 3.93579 0.51361 7.663 0.000258 ***
#> x -0.05663 0.02054 -2.757 0.032997 *
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> (Dispersion parameter for gaussian family taken to be 12.6906)
#>
#> Null deviance: 278.000 on 7 degrees of freedom
#> Residual deviance: 76.143 on 6 degrees of freedom
#> AIC: 46.728
#>
#> Number of Fisher Scoring iterations: 6
From here, you can reproduce what geom_smooth is going to do: make a sequence of x values across the domain and use the predictions as the y values for the line:
# new data is a sequence across the domain of the model
new_df <- data.frame(x = seq(min(df$x), max(df$x), length = 501))
# `type = 'response'` because we want values for y back in y units
new_df$bad_pred <- predict(bad_model, newdata = new_df, type = 'response')
new_df$good_pred <- predict(good_model, newdata = new_df, type = 'response')
library(tidyr)
library(ggplot2)
new_df %>%
# reshape to long form for ggplot
gather(model, y, contains('pred')) %>%
ggplot(aes(x, y)) +
geom_line(aes(color = model)) +
# plot original points on top
geom_point(data = df)
Of course, it's a lot easier to let ggplot handle all that for you:
ggplot(df, aes(x, y)) +
geom_smooth(
method = 'glm',
formula = y ~ x,
method.args = list(family = gaussian(link = 'log'))
) +
geom_point()
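As noted above, the log-link fit is in transformed units but is easy to convert back. Here is a hedged sketch (a_hat and r_hat are hypothetical names): exponentiating the intercept gives the multiplier a, and the x coefficient is r directly, recovering the y = a * exp(r * x) form:
# Back-transform the log-link coefficients to y = a * exp(r * x)
co <- coef(good_model)
a_hat <- exp(co[["(Intercept)"]])   # exp(3.93579) is roughly 51.2
r_hat <- co[["x"]]                  # roughly -0.0566
curve(a_hat * exp(r_hat * x), from = min(df$x), to = max(df$x),
      xlab = "x", ylab = "y")
points(df$x, df$y)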

How to plot lm slope modeled using poly()?

I need to plot the relationship between x and y, where polynomials of x predict y. This is done using the poly() function in order to ensure the polynomials are orthogonal.
How do I plot this relationship considering the linear, quadratic and cubic terms together? The issue is that the coefficients for the different terms are not on the same scale as x.
I provide some example code below. I have tried reassigning the corresponding contrast values for each polynomial to x.
This solution gives impossible predicted values.
Thank you in advance for your help!
Best wishes,
Eric
Here is the example code:
x = sample(0:6,100,replace = TRUE)
y = (x*0.2) + (x^2*.05) + (x^3*0.001)
y = y + rnorm(100)
x = poly(x,3)
m = lm(y~x)
TAB = summary(m)$coefficients
### Reassigning the corresponding contrast values to each polynomial of x:
eq = function(x,TAB,start) {
#argument 'start' is used to determine the position of the linear coefficient, quadratic and cubic follow
pols = poly(x,3)
x1=pols[,1]; x2=pols[,2]; x3=pols[,3]
TAB[1,1] + x1[x]*TAB[start,1] + x2[x] * TAB[start+1,1] + x3[x] * TAB[start+2,1]
}
plot(eq(0:7,TAB,2))
Actually, you can use poly() directly in the formula for lm():
y ~ poly(x, 3) in lm() might be what you want.
For the plot, I'll use the ggplot2 package, which has the geom_smooth() function to draw the fitted curve. You should specify the method = "lm" argument and the formula:
library(tidyverse)
x <- sample(0:6,100,replace = TRUE)
y <- (x*0.2) + (x^2*.05) + (x^3*0.001)
eps <- rnorm(100)
(df <- data_frame(y = y + eps, x = x))
#> # A tibble: 100 x 2
#> y x
#> <dbl> <int>
#> 1 3.34 4
#> 2 1.23 5
#> 3 1.38 3
#> 4 -0.115 2
#> 5 1.94 5
#> 6 3.87 6
#> 7 -0.707 3
#> 8 0.954 3
#> 9 1.19 3
#> 10 -1.34 0
#> # ... with 90 more rows
Using your simulated data set,
df %>%
ggplot() + # this should be declared at first with the data set
aes(x, y) + # aesthetic
geom_point() + # data points
geom_smooth(method = "lm", formula = y ~ poly(x, 3)) # lm fit
If you want to remove the points, simply omit geom_point():
df %>%
ggplot() +
aes(x, y) +
geom_smooth(method = "lm", formula = y ~ poly(x, 3))
Transparency solution: set alpha to a value less than 1:
df %>%
ggplot() +
aes(x, y) +
geom_point(alpha = .3) +
geom_smooth(method = "lm", formula = y ~ poly(x, 3))
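If you want the fitted cubic without geom_smooth(), here is a hedged sketch (m and grid are hypothetical names): let predict() handle the orthogonal-polynomial scaling instead of multiplying the coefficients by the contrast values by hand, which is what produced the impossible values in the question:
# predict() applies the stored poly() scaling to new x values correctly
m <- lm(y ~ poly(x, 3), data = df)
grid <- data.frame(x = seq(min(df$x), max(df$x), length.out = 100))
grid$fit <- predict(m, newdata = grid)
df %>%
  ggplot() +
  aes(x, y) +
  geom_point(alpha = .3) +
  geom_line(data = grid, aes(x = x, y = fit))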
