Lotka-Volterra equations using R

How do I use ggplot to plot the predator species against the prey species?
These are the equations I'm using:
dX/dt = a1*X - b1*X*Y,  X(0) = X0  # prey
dY/dt = a2*X*Y - b2*Y,  Y(0) = Y0  # predator
These are the values of my constants:
a1 = 1.247
b1 = 0.384
a2 = 0.123
b2 = 0.699
X0 = 5.415
Y0 = 6.923
K = 9.438
The code below implements the right-hand side of the equations, but I'm unsure whether it is relevant to plotting the two species.
ydot.lv <- function(t, y, parms) {
  ydot <- rep(NA, 2)
  ydot[1] <- parms[1]*y[1] - parms[2]*y[1]*y[2]  # dX/dt = a1*X - b1*X*Y
  ydot[2] <- parms[3]*y[1]*y[2] - parms[4]*y[2]  # dY/dt = a2*X*Y - b2*Y
  return(list(ydot))
}
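That function is exactly the piece you need: it is the right-hand side an ODE solver calls at each step. Here is a minimal sketch of one way to get the plot (assuming the deSolve package for the integration; the time span and step size are arbitrary choices, and note that K is not used by these two equations):
library(deSolve)
library(ggplot2)
parms <- c(a1 = 1.247, b1 = 0.384, a2 = 0.123, b2 = 0.699)
y0 <- c(X = 5.415, Y = 6.923)
times <- seq(0, 50, by = 0.1)  # arbitrary time span and step
# Integrate the system; ode() calls ydot.lv(t, y, parms) at each step
out <- as.data.frame(ode(y = y0, times = times, func = ydot.lv, parms = parms))
# Phase plot: prey on the x axis, predator on the y axis
ggplot(out, aes(x = X, y = Y)) +
  geom_path() +
  labs(x = "Prey (X)", y = "Predator (Y)")
geom_path() (rather than geom_line()) connects the points in time order, which is what you want for a predator-against-prey trajectory.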

Related

lines() function in base R plot giving several lines instead of smooth line

I am attempting to add a smoother to a plot of a regression model I have. I was using base R to plot my X and Y vectors and add a smoother with plot() and then lines(). I've done this before and it worked, but today I get a plot with multiple lines connecting the points instead of one smooth line through all the data. I can't figure out what is different about this piece of code, so I am hoping someone here can help me identify the issue.
Here is my code. I am using data I randomly generated to practice something else:
X and random variable vectors to create 'Y':
X <- rnorm(100, mean = 10, sd = 1)
epsilon <- rnorm(100, 0, 1)
Y:
b0 <- 0.27
b1 <- 0.49
b2 <- 0.62
b3 <- 0.8
Y <- b0 + b1*X + b2*2^2 + b3*X^3 + epsilon
Creating df and reg model/Yhat:
df <- data.frame(Y, X, epsilon)
reg <- lm(Y ~ I(X^3), data = df)
Yhat <- fitted.values(reg)
df <- cbind(df, Yhat)
plot:
plot(X, Y)
lines(X, Yhat, col = "blue", lwd = 0.5)
For this to work, the X values have to be sorted and the Y values sorted according to their corresponding X values:
X <- rnorm(100, mean = 10, sd = 1)
epsilon <- rnorm(100, 0, 1)
b0 <- 0.27
b1 <- 0.49
b2 <- 0.62
b3 <- 0.8
Y <- b0 + b1*X + b2*2^2 + b3*X^3 + epsilon
df <- data.frame(Y, X, epsilon)
reg <- lm(Y ~ I(X^3), data = df)
Yhat <- fitted.values(reg)
df <- cbind(df, Yhat)
plot(X, Y)
lines(X[order(X)], Yhat[order(X)], col = "blue", lwd = 0.5)
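The underlying reason is that lines() connects points in the order they appear in its arguments, so an unsorted X walks back and forth across the plot. An equivalent way to write the fix, computing the ordering once:
ord <- order(X)
plot(X, Y)
lines(X[ord], Yhat[ord], col = "blue", lwd = 0.5)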

Convergence Failure: Iteration limit reached without convergence (10)

I'm having some difficulty getting a specific curve to fit my data with an nls model.
This is the formula for the data:
((b1 * ((b2 * x)^b4)) / (1 + ((b2 * x)^b4)))^(b3 / b4)
I use the nls2 package with a random-search algorithm to find the initial values.
library(nls2)
#FORMULA
eq <- y ~ (b1 * ((b2 * x)^b4)) / (1 + ((b2 * x)^b4))^(b3 / b4)
#LIMITS
values <- data.frame(
  b1 = c(60, 63),
  b2 = c(0, 0.05),
  b3 = c(0, 1),
  b4 = c(0, 0.9))
fit <- nls2(eq,
  data = .data,
  start = values,
  algorithm = "random",
  control = nls.control(maxiter = 1000))
nls(eq, .data, start = coef(fit), alg = "port", lower = 0)
plot(.data)
The values should be:
b1 = 62.2060
b2 = 0.0438
b3 = 0.9692
b4 = 0.8693
However, when I try to run the code, I always end up with an error message: Convergence Failure: Iteration limit reached without convergence (10)
How can I avoid the convergence failure error? Any help is highly appreciated. Thank You.
0. TLDR
You did not set the lower and upper bounds in nls, so you didn't get a converging result. If you set them, you will get a result near the boundary. See the code I wrote in the last paragraph.
Actually, even if you set the bounds, the data quality is poor (the sample size is small and the data are inconsistent with your formula), so it is hard to fit an optimal value near your true b1, b2, b3 and b4. See the nontechnical reason below.
1. Nontechnical reason of convergence failure
I think your code is right, and this convergence failure is due to your data quality or a misspecification of the formula.
In general, it is hard to estimate 4 parameters from only 6 points. If you had good data that actually fit your model well, nls would converge. In your case, either your data are wrong or your formula's specification bias is huge.
I drew a plot to show you:
Code
# generate a curve from the true parameters b1, b2, b3, b4
b1 = 62.2060
b2 = 0.0438
b3 = 0.9692
b4 = 0.8693
x_points = seq(50, 420, length.out = 200)
y_points = (b1 * ((b2 * x_points)^b4)) / (1 + ((b2 * x_points)^b4))^(b3 / b4)
# plot the function
plot(x = x_points, y = y_points, type = 'l', col = 'black', lwd = 5,
     xlim = c(min(yourdata$x) - 5, max(yourdata$x) + 5),
     ylim = c(min(yourdata$y) - 5, max(yourdata$y) + 5))
# plot the data you got
points(yourdata$x, yourdata$y, cex = 2)
Output: [figure omitted: the curve implied by the true parameters, with the 6 data points overlaid]
If we generate data from your formula, we can fit it quite easily, like this:
## generate data
b1 = 62.2060
b2 = 0.0438
b3 = 0.9692
b4 = 0.8693
x <- runif(6,60,450)
y <- (b1 * ((b2 * x)^b4)) / (1 + ((b2 * x)^b4))^(b3 / b4)
data <- data.frame(x,y)
yourdata <- data.frame(x = c(409.56, 195.25, 60.53, 359.56, 188.79, 67.12),
y = c(39.76100, 20.11875, 7.23675, 41.01100, 20.28035, 7.07200))
#FORMULA
eq <- y ~ (b1 * ((b2 * x)^b4)) / (1 + ((b2 * x)^b4))^(b3 / b4)
#LIMITS
values <- data.frame(
b1 = c(60, 63),
b2 = c(0, 0.05),
b3 = c(0, 1),
b4 = c(0, 0.9))
fit <- nls2(eq,
  data = data,
  start = values,
  algorithm = "random",
  control = nls.control(maxiter = 1000))
nls(eq, data, start = coef(fit), alg = "port",
  control = nls.control(maxiter = 1000, tol = 1e-05),
  lower = c(60, 0, 0, 0), upper = c(63, 0.05, 1, 0.9), trace = TRUE)
plot(x,y)
Output:
Nonlinear regression model
model: y ~ (b1 * ((b2 * x)^b4))/(1 + ((b2 * x)^b4))^(b3/b4)
data: data
     b1      b2      b3      b4
62.2060  0.0438  0.9692  0.8693
residual sum-of-squares: 3.616e-24
Algorithm "port", convergence message: absolute function convergence (6)
Also note that, in the above, I generated only 6 points to fit the model. If you generate more data, for instance 60 points, you will get much better convergence!
2. Technical reason
After reading the PORT docs, I think this error can mean:
the gradient is calculated incorrectly
the stopping tolerances are too tight
the gradient is discontinuous near some iterate
All of these may be related to your data and your training setup (your boundary and formula).
Try the code below and you will get a better result:
Code:
yourdata <- data.frame(x = c(409.56, 195.25, 60.53, 359.56, 188.79, 67.12),
y = c(39.76100, 20.11875, 7.23675, 41.01100, 20.28035, 7.07200))
#FORMULA
eq <- y ~ (b1 * ((b2 * x)^b4)) / (1 + ((b2 * x)^b4))^(b3 / b4)
#LIMITS
values <- data.frame(
b1 = c(60, 63),
b2 = c(0, 0.05),
b3 = c(0, 1),
b4 = c(0, 0.9))
fit <- nls2(eq,
  data = yourdata,
  start = values,
  algorithm = "random",
  control = nls.control(maxiter = 1000))
nls(eq, yourdata, start = coef(fit), alg = "port",
  control = nls.control(maxiter = 1000, tol = 1e-05),
  lower = c(60, 0, 0, 0), upper = c(63, 0.05, 1, 0.9), trace = TRUE)
plot(y ~ x, data = yourdata)
Outputs:
Nonlinear regression model
model: y ~ (b1 * ((b2 * x)^b4))/(1 + ((b2 * x)^b4))^(b3/b4)
data: yourdata
      b1       b2       b3       b4
63.00000  0.00155  0.00000  0.90000
residual sum-of-squares: 22.28
Algorithm "port", convergence message: both X-convergence and relative convergence (5)
As we can see, it converges to the boundary, which means that your data are inconsistent with your settings (formula or boundary).
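A quick visual check makes this concrete (a sketch: refit as above, keep the result this time, and overlay the fitted curve on the data):
fit2 <- nls(eq, yourdata, start = coef(fit), alg = "port",
  control = nls.control(maxiter = 1000, tol = 1e-05),
  lower = c(60, 0, 0, 0), upper = c(63, 0.05, 1, 0.9))
plot(y ~ x, data = yourdata, cex = 2)
xgrid <- seq(min(yourdata$x), max(yourdata$x), length.out = 200)
lines(xgrid, predict(fit2, newdata = data.frame(x = xgrid)), col = "red")
# If the red curve tracks the points poorly, the formula (or the bounds)
# cannot represent this data, no matter what the optimizer does.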

Making Contour Plot in R More Efficient

I need some help rewriting my code. I have some R code that draws the contour plot I want. The code works; however, it is very inefficient, in that it has to plot thousands of points to get exactly what I want (the grey area), so I would like to see if there is a simpler way to achieve the same result.
I basically want to plot the function
f = x1 + x2
subject to the following constraint functions
c1 = 3/2 - x1 - 2*x2 - 1/2*sin(2*pi*(x1^2 - 2*x2)) < 0
c2 = x1^2 + x2^2 - 3/2 < 0
Where either constraint is violated (c1 > 0 or c2 > 0), I would like to grey out the area, and only show the function f where both c1 and c2 are less than 0. The domain of x1 and x2 is between 0 and 1.
Here is my current R code:
x1 = seq(0,1,.001)
x2 = seq(0,1,.001)
f = function(x1, x2) {
  ans = x1 + x2
  return(ans)
}
h = function(x1, x2) {
  ans1 = 1.5 - x1 - 2*x2 - .5*sin(2*pi*(x1^2 - 2*x2))
  ans2 = x1^2 + x2^2 - 1.5
  ans1 = sapply(ans1, function(x) {max(x, 0)})
  ans2 = sapply(ans2, function(x) {max(x, 0)})
  ans = ans1 + ans2
  return(ans)
}
z = outer(x1,x2,f)
w = outer(x1,x2,h)
image(x1,x2,z,xlab=expression(x[1]),ylab=expression(x[2]))
contour(x1,x2,z,add=TRUE)
X = cbind(expand.grid(x1,x2),c(w))
points(X[X[,3]!=0,1],X[X[,3]!=0,2],pch=19,col="lightgrey")
You can do it with .filled.contour, the working function behind filled.contour. (The full filled.contour makes it hard to annotate your plot.) For example
x1 = seq(0,1,.001)
x2 = seq(0,1,.001)
f = function(x1, x2) {
  ans = x1 + x2
  return(ans)
}
h = function(x1, x2) {
  ans1 = 1.5 - x1 - 2*x2 - .5*sin(2*pi*(x1^2 - 2*x2))
  ans2 = x1^2 + x2^2 - 1.5
  ans1 = pmax(ans1, 0)
  ans2 = pmax(ans2, 0)
  ans = ans1 + ans2
  return(ans)
}
z = outer(x1,x2,f)
w = outer(x1,x2,h)
# Set up the plot, .filled.contour doesn't do that
plot(x1, x2, type="n", xlab=expression(x[1]),ylab=expression(x[2]))
# Set a gray background
rect(min(x1), min(x2), max(x1), max(x2), col = "gray")
# Make parts transparent
z[ w != 0 ] <- NA
# Choose the contour levels
levels <- pretty(z)
# Plot them
.filled.contour(x1, x2, z, levels,
                hcl.colors(length(levels) - 1, "YlOrRd", rev = TRUE))
# If your device does antialiasing, plot the filled contours twice to avoid
# ugly effects
.filled.contour(x1, x2, z, levels,
                hcl.colors(length(levels) - 1, "YlOrRd", rev = TRUE))
contour(x1,x2,z, levels = levels, add=TRUE)
This gives me: [figure omitted: filled contour plot of f over the feasible region, with the infeasible region greyed out]
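For comparison, roughly the same picture can be sketched in ggplot2 (an assumption on my part: ggplot2 >= 3.3.0 for geom_contour_filled; the grid is coarsened to keep it fast, and the infeasible cells are left NA so the grey panel background shows through):
library(ggplot2)
x1c <- seq(0, 1, .005)
x2c <- seq(0, 1, .005)
zc <- outer(x1c, x2c, f)
zc[outer(x1c, x2c, h) != 0] <- NA   # blank out the infeasible region
grid <- expand.grid(x1 = x1c, x2 = x2c)
grid$z <- c(zc)   # column-major unrolling matches expand.grid's ordering
ggplot(grid, aes(x1, x2, z = z)) +
  geom_contour_filled(na.rm = TRUE) +
  theme(panel.background = element_rect(fill = "grey"),
        panel.grid = element_blank()) +
  labs(x = expression(x[1]), y = expression(x[2]))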

Reproduce Fisher linear discriminant figure

Many books illustrate the idea of Fisher linear discriminant analysis using the following figure (this particular one is from Pattern Recognition and Machine Learning, p. 188).
I wonder how to reproduce this figure in R (or in any other language). Pasted below is my initial effort in R. I simulate two groups of data and draw the linear discriminant using the abline() function. Any suggestions are welcome.
set.seed(2014)
library(MASS)
library(DiscriMiner) # For scatter matrices
# Simulate bivariate normal distribution with 2 classes
mu1 <- c(2, -4)
mu2 <- c(2, 6)
rho <- 0.8
s1 <- 1
s2 <- 3
Sigma <- matrix(c(s1^2, rho * s1 * s2, rho * s1 * s2, s2^2), byrow = TRUE, nrow = 2)
n <- 50
X1 <- mvrnorm(n, mu = mu1, Sigma = Sigma)
X2 <- mvrnorm(n, mu = mu2, Sigma = Sigma)
y <- rep(c(0, 1), each = n)
X <- rbind(x1 = X1, x2 = X2)
X <- scale(X)
# Scatter matrices
B <- betweenCov(variables = X, group = y)
W <- withinCov(variables = X, group = y)
# Eigenvectors
ev <- eigen(solve(W) %*% B)$vectors
slope <- - ev[1,1] / ev[2,1]
intercept <- ev[2,1]
par(pty = "s")
plot(X, col = y + 1, pch = 16)
abline(a = intercept, b = slope, lwd = 2, lty = 2)  # abline() takes a = intercept, b = slope
MY (UNFINISHED) WORK
I pasted my current solution below. The main question is how to rotate (and move) the density plot to match the decision boundary. Any suggestions are still welcome.
require(ggplot2)
library(grid)
library(MASS)
# Simulation parameters
mu1 <- c(5, -9)
mu2 <- c(4, 9)
rho <- 0.5
s1 <- 1
s2 <- 3
Sigma <- matrix(c(s1^2, rho * s1 * s2, rho * s1 * s2, s2^2), byrow = TRUE, nrow = 2)
n <- 50
# Multivariate normal sampling
X1 <- mvrnorm(n, mu = mu1, Sigma = Sigma)
X2 <- mvrnorm(n, mu = mu2, Sigma = Sigma)
# Combine into data frame
y <- rep(c(0, 1), each = n)
X <- rbind(x1 = X1, x2 = X2)
X <- scale(X)
X <- data.frame(X, class = y)
# Apply lda()
m1 <- lda(class ~ X1 + X2, data = X)
m1.pred <- predict(m1)
# Compute intercept and slope for abline
gmean <- m1$prior %*% m1$means
const <- as.numeric(gmean %*% m1$scaling)
z <- as.matrix(X[, 1:2]) %*% m1$scaling - const
slope <- - m1$scaling[1] / m1$scaling[2]
intercept <- const / m1$scaling[2]
# Projected values
LD <- data.frame(predict(m1)$x, class = y)
# Scatterplot
p1 <- ggplot(X, aes(X1, X2, color = as.factor(class))) +
  geom_point() +
  theme_bw() +
  theme(legend.position = "none") +
  scale_x_continuous(limits = c(-5, 5)) +
  scale_y_continuous(limits = c(-5, 5)) +
  geom_abline(intercept = intercept, slope = slope)
# Density plot
p2 <- ggplot(LD, aes(x = LD1)) +
  geom_density(aes(fill = as.factor(class), y = ..scaled..)) +
  theme_bw() +
  theme(legend.position = "none")
grid.newpage()
print(p1)
vp <- viewport(width = .7, height = 0.6, x = 0.5, y = 0.3, just = c("centre"))
pushViewport(vp)
print(p2, vp = vp)
Basically you need to project the data along the direction of the classifier, plot a histogram for each class, and then rotate the histogram so its x axis is parallel to the classifier. Some trial and error with scaling the histogram is needed to get a nice result. Here's an example of how to do it in Matlab, for the naive classifier (difference of the class means). For the Fisher classifier it is of course similar; you just use a different direction w. I changed the parameters from your code so the plot is more similar to the one you gave.
rng('default')
n = 1000;
mu1 = [1,3]';
mu2 = [4,1]';
rho = 0.3;
s1 = .8;
s2 = .5;
Sigma = [s1^2, rho*s1*s2; rho*s1*s2, s2^2];
X1 = mvnrnd(mu1,Sigma,n);
X2 = mvnrnd(mu2,Sigma,n);
X = [X1; X2];
Y = [zeros(n,1);ones(n,1)];
scatter(X1(:,1), X1(:,2), [], 'b' );
hold on
scatter(X2(:,1), X2(:,2), [], 'r' );
axis equal
m1 = mean(X(1:n,:))';
m2 = mean(X(n+1:end,:))';
plot(m1(1),m1(2),'bx','markersize',18)
plot(m2(1),m2(2),'rx','markersize',18)
plot([m1(1),m2(1)], [m1(2),m2(2)],'g')
%% classifier taking only means into account
w = m2 - m1;
w = w / norm(w);
% project data onto w
X1_projected = X1 * w;
X2_projected = X2 * w;
% plot histogram and rotate it
angle = 180/pi * atan(w(2)/w(1));
[hy1, hx1] = hist(X1_projected);
[hy2, hx2] = hist(X2_projected);
hy1 = hy1 / sum(hy1); % normalize
hy2 = hy2 / sum(hy2); % normalize
scale = 4; % set manually
h1 = bar(hx1, scale*hy1,'b');
h2 = bar(hx2, scale*hy2,'r');
set([h1, h2],'ShowBaseLine','off')
% rotate around the origin
rotate(get(h1,'children'),[0,0,1], angle, [0,0,0])
rotate(get(h2,'children'),[0,0,1], angle, [0,0,0])
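For reference, here is a rough base-R translation of the same idea (a sketch: it uses the same naive mean-difference direction w and, as in the Matlab code, a manually chosen scale; each histogram bar is drawn as a polygon rotated so its baseline lies along w):
set.seed(1)
library(MASS)
n <- 1000
rho <- 0.3; s1 <- 0.8; s2 <- 0.5
Sigma <- matrix(c(s1^2, rho*s1*s2, rho*s1*s2, s2^2), 2)
X1 <- mvrnorm(n, c(1, 3), Sigma)
X2 <- mvrnorm(n, c(4, 1), Sigma)
plot(rbind(X1, X2), col = rep(c("blue", "red"), each = n), asp = 1,
     xlab = "x1", ylab = "x2")
w <- colMeans(X2) - colMeans(X1)
w <- w / sqrt(sum(w^2))            # naive direction: difference of class means
R <- cbind(w, c(-w[2], w[1]))      # rotation taking the x axis onto w
for (cls in list(list(X1, "blue"), list(X2, "red"))) {
  h <- hist(c(cls[[1]] %*% w), plot = FALSE)   # histogram of the projections
  sc <- 1.5 / max(h$counts)                    # manual vertical scaling
  for (i in seq_along(h$counts)) {
    corners <- rbind(c(h$breaks[i],     0),
                     c(h$breaks[i + 1], 0),
                     c(h$breaks[i + 1], h$counts[i] * sc),
                     c(h$breaks[i],     h$counts[i] * sc))
    polygon(corners %*% t(R), col = adjustcolor(cls[[2]], 0.4), border = NA)
  }
}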

R script - NLS not working

I have 5 (x,y) data points and I'm trying to find a best-fit solution consisting of two lines which intersect at a point (x0, y0) and follow these equations:
y1 = m1*(x1 - x0) + y0
y2 = m2*(x2 - x0) + y0
Specifically, I require that the intersection occur between x = 2 and x = 3. Have a look at the code:
#Initialize x1, y1, x2, y2
x1 <- c(1,2)
y1 <- c(10,10)
x2 <- c(3,4,5)
y2 <- c(20,30,40)
g <- c(TRUE, TRUE, FALSE, FALSE, FALSE)
q <- nls(c(y1, y2) ~ ifelse(g == TRUE, m1 * (x1 - x0) + y0, m2 * (x2 - x0) + y0),
         start = c(m1 = -1, m2 = 1, y0 = 0, x0 = 2),
         algorithm = "port",
         lower = c(m1 = -Inf, m2 = -Inf, y0 = -Inf, x0 = 2),
         upper = c(m1 = Inf, m2 = Inf, y0 = Inf, x0 = 3))
coef <- coef(q)
m1 <- coef[1]
m2 <- coef[2]
y0 <- coef[3]
x0 <- coef[4]
#Plot the original x1, y1, and x2, y2
plot(x1,y1,xlim=c(1,5),ylim=c(0,50))
points(x2,y2)
#Plot the fits
x1 <- c(1,2,3,4,5)
fit1 <- m1 * (x1 - x0) + y0
lines(x1, fit1, col="red")
x2 <- c(1,2,3,4,5)
fit2 <- m2 * (x2 - x0) + y0
lines(x2, fit2, col="blue")
So, you can see the data points listed there. Then, I run it through my nls, get my parameters m1, m2, x0, y0 (the slopes, and the intersection point).
But, take a look at the solution:
Clearly, the red line (which is supposed to be based only on the first 2 points) is not the best-fit line for the first 2 points. The same goes for the blue line (the 2nd fit), which is supposed to depend only on the last 3 points. What is wrong here?
This is segmented regression:
# input data
x1 <- c(1,2); y1 <- c(10,10); x2 <- c(3,4,5); y2 <- c(20,30,40)
x <- c(x1, x2); y <- c(y1, y2)
# segmented regression
library(segmented)
fm <- segmented.lm(lm(y ~ x), ~ x, NA, seg.control(stop.if.error = FALSE, K = 2))
summary(fm)
# plot
plot(fm)
points(y ~ x)
See ?lm, ?segmented.lm and ?seg.control for more info.
I'm not exactly sure what's wrong, but I can get it to work by rearranging things a bit. Please note the comment in ?nls: "Do not use nls on artificial 'zero-residual' data." I added a bit of noise.
## Initialize x1, y1, x2, y2
x1 <- c(1,2)
y1 <- c(10,10)
x2 <- c(3,4,5)
y2 <- c(20,30,40)
## make single x, y vector
x <- c(x1,x2)
set.seed(1001)
## (add a bit of noise to avoid zero-residual artificiality)
y <- c(y1,y2)+rnorm(5,sd=0.01)
g <- c(TRUE,TRUE,FALSE,FALSE,FALSE) ## specify identities of points
## particular changes:
## * you have lower=upper=2 for x0. Did you want 2<x0<3?
## * specified data argument explicitly (allows use of predict() etc.)
## * changed name from 'q' to 'fit1' (avoid R built-in function)
fit1 <- nls(y ~ ifelse(g, m1, m1 + delta_m) * (x - x0) + y0,
            start = c(m1 = -1, delta_m = 2, y0 = 0, x0 = 2),
            algorithm = "port",
            lower = c(m1 = -Inf, delta_m = 0, y0 = -Inf, x0 = 2),
            upper = c(m1 = Inf, delta_m = Inf, y0 = Inf, x0 = 3),
            data = data.frame(x, y))
#Plot the original 'data'
plot(x,y,col=rep(c("red","blue"),c(2,3)),
xlim=c(1,5),ylim=c(0,50))
## add predicted values
xvec <- seq(1,5,length.out=101)
lines(xvec,predict(fit1,newdata=data.frame(x=xvec)))
edit: based ifelse clause on point identity, not x position
edit: changed to require second slope to be > first slope
On a second look, I think the issue above is probably due to the use of separate x1 and x2 vectors rather than a single x vector: I suspect these got recycled by R to match the length of the g vector, which would have messed things up pretty badly. For example, this stripped-down example:
g <- c(TRUE, TRUE, FALSE, FALSE, FALSE)
ifelse(g,x1,x2)
## [1] 1 2 5 3 4
shows that x2 gets extended to (3 4 5 3 4) before being used in the ifelse clause. The scariest part is that normally one gets a warning such as this:
> x2 + 1:5
[1] 4 6 8 7 9
Warning message:
In x2 + 1:5 :
longer object length is not a multiple of shorter object length
but in this case there is no warning ...
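With a single full-length x vector, both branches of the ifelse line up with g and no silent recycling can occur:
x <- c(x1, x2)   # length 5, same as g
ifelse(g, x, x * 10)
## [1]  1  2 30 40 50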
