I have 5 (x,y) data points and I'm trying to find a best fit solution consisting of two lines which intersect at a point (x0,y0), and which follow these equations:
y1 = (m1)(x1 - x0) + y0
y2 = (m2)(x2 - x0) + y0
Specifically, I require that the intersection must occur between x=2 and x=3. Have a look at the code:
#Initialize x1, y1, x2, y2
x1 <- c(1,2)
y1 <- c(10,10)
x2 <- c(3,4,5)
y2 <- c(20,30,40)
q <- nls(c(y1, y2) ~ ifelse(g == TRUE, m1 * (x1 - x0) + y0, m2 * (x2 - x0) + y0), start = c(m1 = -1, m2 = 1, y0 = 0, x0 = 2), algorithm = "port", lower = c(m1 = -Inf, m2 = -Inf, y0 = -Inf, x0 = 2), upper = c(m1 = Inf, m2 = Inf, y0 = Inf, x0 = 3))
coef <- coef(q)
m1 <- coef[1]
m2 <- coef[2]
y0 <- coef[3]
x0 <- coef[4]
#Plot the original x1, y1, and x2, y2
#Plot the fits
x1 <- c(1,2,3,4,5)
fit1 <- m1 * (x1 - x0) + y0
lines(x1, fit1, col="red")
x2 <- c(1,2,3,4,5)
fit2 <- m2 * (x2 - x0) + y0
lines(x2, fit2, col="blue")
So, you can see the data points listed there. Then, I run it through my nls, get my parameters m1, m2, x0, y0 (the slopes, and the intersection point).
But, take a look at the solution:
Clearly, the red line (which is supposed to be based only on the first 2 points) is not the best-fit line for the first 2 points. The same is true of the blue line (the 2nd fit), which is supposed to depend only on the last 3 points. What is wrong here?

This is segmented regression:
library(segmented)
# input data
x1 <- c(1,2); y1 <- c(10,10); x2 <- c(3,4,5); y2 <- c(20,30,40)
x <- c(x1, x2); y <- c(y1, y2)
# segmented regression
fm <- segmented.lm(lm(y ~ x), ~ x, NA, seg.control(stop.if.error = FALSE, K = 2))
# plot the fitted segments, then overlay the data
plot(fm)
points(y ~ x)
See ?lm, ?segmented.lm and ?seg.control for more info.

I'm not exactly sure what's wrong but I can get it to work by rearranging things a bit. Please note the comment in ?nls about "Do not use ‘nls’ on artificial "zero-residual" data."; I added a bit of noise.
## Initialize x1, y1, x2, y2
x1 <- c(1,2)
y1 <- c(10,10)
x2 <- c(3,4,5)
y2 <- c(20,30,40)
## make single x, y vector
x <- c(x1,x2)
## (add a bit of noise to avoid zero-residual artificiality)
y <- c(y1,y2)+rnorm(5,sd=0.01)
g <- c(TRUE,TRUE,FALSE,FALSE,FALSE) ## specify identities of points
## particular changes:
## * you have lower=upper=2 for x0. Did you want 2<x0<3?
## * specified data argument explicitly (allows use of predict() etc.)
## * changed name from 'q' to 'fit1' (avoid R built-in function)
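fit1 <- nls(y ~ ifelse(g,m1,m1+delta_m)*(x - x0) + y0,
start = c(m1 = -1, delta_m = 2, y0 = 0, x0 = 2),
algorithm = "port",
lower = c(m1 = -Inf, delta_m = 0, y0 = -Inf, x0 = 2),
upper = c(m1 = Inf, delta_m = Inf, y0 = Inf, x0 = 3),
data = data.frame(x, y, g))
## plot the original 'data'
plot(x, y, col = ifelse(g, "red", "blue"))
## add predicted values; assigning grid points left of the fitted x0
## to the first segment is an assumption made for plotting only
xvec <- seq(1,5,length.out=101)
gvec <- xvec < coef(fit1)["x0"]
lines(xvec, predict(fit1, newdata = data.frame(x = xvec, g = gvec)))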
edit: based ifelse clause on point identity, not x position
edit: changed to require second slope to be > first slope
On a second look, I think the issue is probably due to the use of separate vectors x1 and x2, rather than a single x vector: I suspect these got recycled by R to match the length of the g vector, which would have messed things up pretty badly. For example, this stripped-down example:
ifelse(g, x1, x2)
## [1] 1 2 5 3 4
shows that x2 gets extended to (3 4 5 3 4) before being used in the ifelse clause. The scariest part is that normally one gets a warning such as this:
> x2 + 1:5
[1] 4 6 8 7 9
Warning message:
In x2 + 1:5 :
longer object length is not a multiple of shorter object length
but in this case there is no warning ...
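The silent recycling can be checked directly. This is a minimal sketch using the vectors from the question; `yes_full`, `no_full`, `scrambled` and `aligned` are names introduced here for illustration, and `rep_len()` is used to mimic the recycling that `ifelse()` applies when its branches are shorter than the test vector.

```r
x1 <- c(1, 2)
x2 <- c(3, 4, 5)
g <- c(TRUE, TRUE, FALSE, FALSE, FALSE)

# Recycle both branches to the length of g, as ifelse() does without warning
yes_full <- rep_len(x1, length(g))  # 1 2 1 2 1
no_full  <- rep_len(x2, length(g))  # 3 4 5 3 4

# Picking by position gives the scrambled result, not 1 2 3 4 5
scrambled <- ifelse(g, x1, x2)
print(scrambled)
print(identical(scrambled, ifelse(g, yes_full, no_full)))

# With one combined x vector, every position lines up with its own point
x <- c(x1, x2)
aligned <- ifelse(g, x, x)
print(aligned)
```

This is why the rearranged fit uses a single x vector together with g: each element of g then refers to the point at the same position.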


lines() function in base R plot giving several lines instead of smooth line

I am attempting to add a smoother to a plot of a regression model I have. I was just using base R to plot my X and Y vectors and add a smoother using plot() and then lines(). I've done this before, and it worked, but today I am given a plot with multiple lines connecting the points as opposed to one smooth line through all the data. I can't figure out what is different about this piece of code I have written, so I am hoping someone here could help me identify the issue.
Here is my code. I am using data I randomly generated to practice something else:
X and random variable vectors to create 'Y':
X <- rnorm(100, mean = 10, sd = 1)
epsilon <- rnorm(100, 0, 1)
b0 <- 0.27
b1 <- 0.49
b2 <- 0.62
b3 <- 0.8
Y <- b0 + b1*X + b2*2^2 + b3*X^3 + epsilon
Creating df and reg model/Yhat:
df = data.frame(Y,X,epsilon)
reg <- lm(Y ~ I(X^3), data = df)
Yhat <- fitted.values(reg)
cbind(df, Yhat) -> df
plot(X, Y)
lines(X, Yhat, col = "blue", lwd = 0.5)
For this to work, the X values have to be sorted and the Y values sorted according to their corresponding X values:
X <- rnorm(100, mean = 10, sd = 1)
epsilon <- rnorm(100, 0, 1)
b0 <- 0.27
b1 <- 0.49
b2 <- 0.62
b3 <- 0.8
Y <- b0 + b1*X + b2*2^2 + b3*X^3 + epsilon
df = data.frame(Y,X,epsilon)
reg <- lm(Y ~ I(X^3), data = df)
Yhat <- fitted.values(reg)
cbind(df, Yhat) -> df
plot(X, Y)
lines(X[order(X)], Yhat[order(X)], col = "blue", lwd = 0.5)
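An alternative that avoids reordering the fitted values is to predict on a grid of X values that is already sorted. This is only a sketch under simplified assumptions: the data-generating line is shortened, and `grid` and the seed are introduced here for illustration.

```r
set.seed(1)  # arbitrary seed so the sketch is repeatable
X <- rnorm(100, mean = 10, sd = 1)
Y <- 0.27 + 0.8 * X^3 + rnorm(100)
df <- data.frame(X, Y)
reg <- lm(Y ~ I(X^3), data = df)

# Evenly spaced, already sorted X values spanning the observed range
grid <- data.frame(X = seq(min(X), max(X), length.out = 200))
grid$Yhat <- predict(reg, newdata = grid)

plot(X, Y)
lines(grid$X, grid$Yhat, col = "blue", lwd = 0.5)
```

Because lines() joins points in the order they are given, a sorted grid yields a single curve, and it is also smoother than joining the 100 fitted values.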

Fitting 1 - exp(x) giving higher weight to the first values

I want to fit a 1 - exp(x) function to a data set while giving higher weight to the first values. However, the following code does not do that:
x <-sqrt((0.05)^2+(0.05)^2)*seq(from = 1, to = 20, by = 1)
y <- c(11,20,27,32,35,36,36.5,25,16,9,4,1,7.87e-16,2.07e-15,-9.36e-16,1.61e-15,-3.81e-16,3.92e-16,7.65e-16,-8.26e-16)
temp <- data.frame(cbind(x,y))
we <- 1/(log1p(seq_along(x)))
# fit non-linear model
mod <- nls(y ~ (1 - exp(a + b * x)), data = temp, start = list(a = 0, b = 0), weights = we)
#add fitted curve
lines(temp$x, predict(mod, list(x = temp$x)))
Here is the output:
Your specification of weights is correct. The bad fit you obtained is due to your faulty model assumption. You assumed:
y ~ 1 - exp(a + b * x)
Note that exp() gives strictly positive values, so the fitted y is always less than 1. However, the y values in your data range up to 36.5.
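The bound is easy to confirm numerically. In this small sketch, `z` is a stand-in for a + b * x over a wide range (an assumption for illustration), and `y` holds only the leading values from the question.

```r
y <- c(11, 20, 27, 32, 35, 36, 36.5, 25, 16, 9, 4, 1)  # leading values from the question
z <- seq(-20, 20, by = 0.1)  # stand-in for a + b * x

model_values <- 1 - exp(z)
print(max(model_values))      # stays below 1
print(max(y))                 # far above anything the model can produce
print(all(model_values < 1))
```

No choice of a and b can lift the curve to the observed peak, so the weights cannot rescue this model form.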
My idea is not perfect, but it might give you a better starting point. Consider:
y ~ a * x * exp(b * x * x + c * x)
Using your data:
x <- c(0, sqrt((0.05)^2+(0.05)^2)*seq(from = 1, to = 20, by = 1))
y <- c(0, 11,20,27,32,35,36,36.5,25,16,9,4,1,7.87e-16,2.07e-15,-9.36e-16,1.61e-15,-3.81e-16,3.92e-16,7.65e-16,-8.26e-16)
fit <- nls(y ~ a * x * exp(b * x * x + c * x), start = list(a = 30, b= -1, c = -1))
plot(x, y)
lines(x, predict(fit, list(x = x)))

Reproduce Fisher linear discriminant figure

Many books illustrate the idea of Fisher linear discriminant analysis using the following figure (this particular one is from Pattern Recognition and Machine Learning, p. 188)
I wonder how to reproduce this figure in R (or in any other language). Pasted below is my initial effort in R. I simulate two groups of data and draw linear discriminant using abline() function. Any suggestions are welcome.
library(MASS) # For mvrnorm()
library(DiscriMiner) # For scatter matrices
# Simulate bivariate normal distribution with 2 classes
mu1 <- c(2, -4)
mu2 <- c(2, 6)
rho <- 0.8
s1 <- 1
s2 <- 3
Sigma <- matrix(c(s1^2, rho * s1 * s2, rho * s1 * s2, s2^2), byrow = TRUE, nrow = 2)
n <- 50
X1 <- mvrnorm(n, mu = mu1, Sigma = Sigma)
X2 <- mvrnorm(n, mu = mu2, Sigma = Sigma)
y <- rep(c(0, 1), each = n)
X <- rbind(x1 = X1, x2 = X2)
X <- scale(X)
# Scatter matrices
B <- betweenCov(variables = X, group = y)
W <- withinCov(variables = X, group = y)
# Eigenvectors
ev <- eigen(solve(W) %*% B)$vectors
slope <- - ev[1,1] / ev[2,1]
intercept <- ev[2,1]
par(pty = "s")
plot(X, col = y + 1, pch = 16)
abline(a = intercept, b = slope, lwd = 2, lty = 2) # a is the intercept, b the slope
I pasted my current solution below. The main question is how to rotate (and move) the density plot according to decision boundary. Any suggestions are still welcome.
library(MASS) # mvrnorm(), lda()
library(ggplot2)
library(grid) # viewport()
# Simulation parameters
mu1 <- c(5, -9)
mu2 <- c(4, 9)
rho <- 0.5
s1 <- 1
s2 <- 3
Sigma <- matrix(c(s1^2, rho * s1 * s2, rho * s1 * s2, s2^2), byrow = TRUE, nrow = 2)
n <- 50
# Multivariate normal sampling
X1 <- mvrnorm(n, mu = mu1, Sigma = Sigma)
X2 <- mvrnorm(n, mu = mu2, Sigma = Sigma)
# Combine into data frame
y <- rep(c(0, 1), each = n)
X <- rbind(x1 = X1, x2 = X2)
X <- scale(X)
X <- data.frame(X, class = y)
# Apply lda()
m1 <- lda(class ~ X1 + X2, data = X)
m1.pred <- predict(m1)
# Compute intercept and slope for abline
gmean <- m1$prior %*% m1$means
const <- as.numeric(gmean %*% m1$scaling)
z <- as.matrix(X[, 1:2]) %*% m1$scaling - const
slope <- - m1$scaling[1] / m1$scaling[2]
intercept <- const / m1$scaling[2]
# Projected values
LD <- data.frame(predict(m1)$x, class = y)
# Scatterplot
p1 <- ggplot(X, aes(X1, X2, color=as.factor(class))) +
geom_point() +
theme_bw() +
theme(legend.position = "none") +
scale_x_continuous(limits=c(-5, 5)) +
scale_y_continuous(limits=c(-5, 5)) +
geom_abline(intercept = intercept, slope = slope)
# Density plot
p2 <- ggplot(LD, aes(x = LD1)) +
geom_density(aes(fill = as.factor(class), y = ..scaled..)) +
theme_bw() +
theme(legend.position = "none")
# Draw the scatterplot first, then overlay the density plot in a viewport
print(p1)
vp <- viewport(width = .7, height = 0.6, x = 0.5, y = 0.3, just = c("centre"))
print(p2, vp = vp)
Basically you need to project the data along the direction of the classifier, plot a histogram for each class, and then rotate the histogram so its x axis is parallel to the classifier. Some trial-and-error with scaling the histogram is needed in order to get a nice result. Here's an example of how to do it in Matlab, for the naive classifier (difference of class' means). For the Fisher classifier it is of course similar, you just use a different classifier w. I changed the parameters from your code so the plot is more similar to the one you gave.
n = 1000;
mu1 = [1,3]';
mu2 = [4,1]';
rho = 0.3;
s1 = .8;
s2 = .5;
Sigma = [s1^2,rho*s1*s2;rho*s1*s2, s2^2];
X1 = mvnrnd(mu1,Sigma,n);
X2 = mvnrnd(mu2,Sigma,n);
X = [X1; X2];
Y = [zeros(n,1);ones(n,1)];
scatter(X1(:,1), X1(:,2), [], 'b' );
hold on
scatter(X2(:,1), X2(:,2), [], 'r' );
axis equal
m1 = mean(X(1:n,:))';
m2 = mean(X(n+1:end,:))';
plot([m1(1),m2(1)], [m1(2),m2(2)],'g')
%% classifier taking only means into account
w = m2 - m1;
w = w / norm(w);
% project data onto w
X1_projected = X1 * w;
X2_projected = X2 * w;
% plot histogram and rotate it
angle = 180/pi * atan(w(2)/w(1));
[hy1, hx1] = hist(X1_projected);
[hy2, hx2] = hist(X2_projected);
hy1 = hy1 / sum(hy1); % normalize
hy2 = hy2 / sum(hy2); % normalize
scale = 4; % set manually
h1 = bar(hx1, scale*hy1,'b');
h2 = bar(hx2, scale*hy2,'r');
set([h1, h2],'ShowBaseLine','off')
% rotate around the origin
rotate(get(h1,'children'),[0,0,1], angle, [0,0,0])
rotate(get(h2,'children'),[0,0,1], angle, [0,0,0])
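For readers working in R, the projection step can be sketched as follows, including the Fisher direction that the Matlab example leaves out. This is an illustration under stated assumptions, not a reproduction of the figure: the data are simulated with base R via a Cholesky factor, the Fisher direction is taken as proportional to Sw^{-1} (m2 - m1), and the names `unit`, `J`, `w_naive`, `w_fisher`, `proj1`, `proj2` and `angle_deg` are introduced here.

```r
set.seed(42)  # arbitrary seed
n <- 200
Sigma <- matrix(c(0.64, 0.12, 0.12, 0.25), nrow = 2)  # s1 = 0.8, s2 = 0.5, rho = 0.3
R <- chol(Sigma)  # t(R) %*% R equals Sigma
X1 <- sweep(matrix(rnorm(2 * n), n) %*% R, 2, c(1, 3), "+")
X2 <- sweep(matrix(rnorm(2 * n), n) %*% R, 2, c(4, 1), "+")

m1 <- colMeans(X1)
m2 <- colMeans(X2)
# Within-class scatter matrix (sweep() subtracts the class mean by default)
Sw <- crossprod(sweep(X1, 2, m1)) + crossprod(sweep(X2, 2, m2))

unit <- function(v) v / sqrt(sum(v^2))
w_naive  <- unit(m2 - m1)             # difference of class means
w_fisher <- unit(solve(Sw, m2 - m1))  # Fisher direction

# Fisher criterion: squared separation of projected means over projected scatter
J <- function(w) sum(w * (m2 - m1))^2 / drop(t(w) %*% Sw %*% w)

# One-dimensional projections; these are what the histograms are built from
proj1 <- drop(X1 %*% w_fisher)
proj2 <- drop(X2 %*% w_fisher)

# Rotation angle (degrees) needed to align a histogram with the direction w
angle_deg <- atan2(w_fisher[2], w_fisher[1]) * 180 / pi
print(c(naive = J(w_naive), fisher = J(w_fisher)))
```

Histograms of `proj1` and `proj2` (for example via hist(..., plot = FALSE)) can then be rotated by `angle_deg` and drawn over the scatterplot, with manual scaling as in the Matlab example.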

lattice plot only lines with positive slope

Is there an easy and efficient way to define a function like panel.xyplot (or rather panel.lines) that connects two points (x1,y1) and (x2,y2) only if x1 <= x2 and y1 <= y2? (Ideally, all other properties of panel.xyplot(...) would be retained.)
I asked the same question a few months ago and the solution is great:
lattice, connect points only if the connection has a positive slope
Now it would be nice to have a real panel.xyplot-like function so that I can use my own groups. It should work and plot like below, except for the crossed-out lines.
I welcome suggestions.
I'm not sure I understand what you're after, but if I do, then I think this should work for any given group:
library(dplyr) # mutate(), lead()
dat <- data.frame(x=1:10,y=sample(1:10))
dat <- mutate(dat, x0 = x, y0 = y, x1 = lead(x), y1 = lead(y), slope = (y1 - y0)/(x1 - x0))
with(dat, plot(x, y))
with(dat[seq_len(nrow(dat) - 1),], segments(x0 = x0, y0 = y0, x1 = x1, y1 = y1,
col = ifelse(slope >= 0, "black", "white"))) # This bit makes line-drawing conditional
Here's what I get from that:
And here's a version for grouped data that doesn't depend on lattice:
dat2 <- data.frame(x = rep(seq(10), 10),
y = sample(1:10, size = 100, replace = TRUE),
indx = rep(seq(10), each = 10))
dat2g <- dat2 %>%
group_by(indx) %>%
mutate(x0 = x, y0 = y, x1 = lead(x), y1 = lead(y), slope = (y1 - y0)/(x1 - x0))
plotit <- function(group) {
datsub <- filter(dat2g, indx == group)
with(datsub, plot(x, y, main = group))
with(datsub[seq_len(nrow(datsub) - 1),], segments(x0 = x0, y0 = y0, x1 = x1, y1 = y1, col = ifelse(slope >= 0, "black", "white")))
}
par(mfrow=c( floor(sqrt(max(dat2g$indx))), ceiling(sqrt(max(dat2g$indx)))))
for (i in seq_along(unique(dat2g$indx))) { plotit(i) }
Here's the plot output from that process. It could use fine-tuning, but I think it's what you're after?
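If a lattice panel function is still wanted, one possible sketch is below. The names `rising_segments` and `panel.rising` are hypothetical helpers introduced here, not part of lattice; the example assumes that grouping is handled by panel.superpose(), which supplies `col.line` and `col.symbol` per group, and that points are connected in order of increasing x.

```r
library(lattice)

# Hypothetical helper: consecutive point pairs (sorted by x) with y1 >= y0
rising_segments <- function(x, y) {
  ord <- order(x)
  x <- x[ord]
  y <- y[ord]
  n <- length(x)
  if (n < 2) {
    return(data.frame(x0 = numeric(0), y0 = numeric(0),
                      x1 = numeric(0), y1 = numeric(0)))
  }
  keep <- diff(y) >= 0
  data.frame(x0 = x[-n][keep], y0 = y[-n][keep],
             x1 = x[-1][keep], y1 = y[-1][keep])
}

# Hypothetical panel function: draws the points, then only the rising segments
panel.rising <- function(x, y, col.line = "black", col.symbol = "black", ...) {
  panel.points(x, y, col = col.symbol, pch = 16)
  seg <- rising_segments(x, y)
  if (nrow(seg) > 0) {
    panel.segments(seg$x0, seg$y0, seg$x1, seg$y1, col = col.line)
  }
}

set.seed(1)  # arbitrary seed
dat2 <- data.frame(x = rep(1:10, 3),
                   y = sample(1:10, 30, replace = TRUE),
                   indx = rep(1:3, each = 10))
p <- xyplot(y ~ x, data = dat2, groups = indx,
            panel = panel.superpose, panel.groups = panel.rising)
print(p)
```

Unlike the white-line approach, falling segments are never drawn here, so nothing is painted over other groups. Other graphical parameters passed by panel.superpose() are deliberately ignored in this sketch.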

R neuralnet does not converge within stepmax for time series

I'm writing a neural network for prediction of elements in a time series x + sin(x^2) in R, using the neuralnet package. This is how training data is being generated, assuming a window of 4 elements, and that the last one is the one that has to be predicted:
nntr0 <- ((1:25) + sin((1:25)^2))
nntr1 <- ((2:26) + sin((2:26)^2))
nntr2 <- ((3:27) + sin((3:27)^2))
nntr3 <- ((4:28) + sin((4:28)^2))
nntr4 <- ((5:29) + sin((5:29)^2))
Then, I turn these into a data.frame:
nntr <- data.frame(nntr0, nntr1, nntr2, nntr3, nntr4)
Then, I proceed to train the NN:
net.sinp <- neuralnet(nntr4 ~ nntr0 + nntr1 + nntr2 + nntr3, data=nntr, hidden=10, threshold=0.04, act.fct="tanh", linear.output=TRUE, stepmax=100000)
Which, after a while, gives me the message
Warning message:
algorithm did not converge in 1 of 1 repetition(s) within the stepmax
Call: neuralnet(formula = nntr4 ~ nntr0 + nntr1 + nntr2 + nntr3, data = nntr, hidden = 10, threshold = 0.04, stepmax = 100000, act.fct = "tanh", linear.output = TRUE)
Can anyone help me figure out why it is not converging? Many thanks
With tanh as an activation function (it is bounded),
it is very difficult to reproduce the linear trend in your signal.
You can use linear activation functions instead,
or try to detrend the signal.
library(neuralnet)
# Data
dx <- 1
n <- 25
x <- seq(0,by=dx,length=n+4)
y <- x + sin(x^2)
y0 <- y[1:n]
y1 <- y[1 + 1:n]
y2 <- y[2 + 1:n]
y3 <- y[3 + 1:n]
y4 <- y[4 + 1:n]
d <- data.frame(y0, y1, y2, y3, y4)
# Default settings: logistic hidden units with a linear output unit
r <- neuralnet(y4 ~ y0 + y1 + y2 + y3, data=d, hidden=10)
plot(y4, compute(r, d[,-5])$net.result)
# No trend
d2 <- data.frame(
y0 = y0 - x[1:n],
y1 = y1 - x[1 + 1:n],
y2 = y2 - x[2 + 1:n],
y3 = y3 - x[3 + 1:n],
y4 = y4 - x[4 + 1:n]
)
r <- neuralnet(y4 ~ y0 + y1 + y2 + y3, data=d2, hidden=10, act.fct="tanh" )
plot(d2$y4, compute(r, d2[,-5])$net.result)
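The effect of detrending can be seen without training anything. This is a small sketch in base R; `detrended` is a name introduced here, and the trend is assumed to be known exactly (it is x itself in this series).

```r
x <- 1:29
y <- x + sin(x^2)

# tanh saturates: its output never leaves [-1, 1], however large the input
print(range(tanh(c(-100, -1, 0, 1, 100))))

# The raw series keeps growing with x ...
print(range(y))

# ... whereas subtracting the known trend leaves only sin(x^2), within [-1, 1]
detrended <- y - x
print(range(detrended))
```

The detrended target sits in the range that tanh units cover naturally, whereas the raw series can only be matched by scaling saturated units with large output weights, which is plausibly part of why training struggles.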
Warning message:
algorithm did not converge in 1 of 1 repetition(s) within the stepmax
means that your algorithm hit the step limit before it converged. If you type ?neuralnet and look at the definition of stepmax, it says:
the maximum steps for the training of the neural network. Reaching this maximum leads to a stop of the neural network's training process.
For your problem, I recommend increasing stepmax to 1e7 and seeing what happens.
The code would be:
net.sinp <- neuralnet(nntr4 ~ nntr0 + nntr1 + nntr2 + nntr3, data=nntr, hidden=10, threshold=0.04, act.fct="tanh", linear.output=TRUE, stepmax=1e7)
