Simulate data from a Gompertz curve in R

Simulate data from a Gompertz curve in R - r

I have a set of data that I have collected which consists of a time series, where each y-value is found by taking the mean of 30 samples of grape cluster weight.
I want to simulate more data from this, with the same number of x and y values, so that I can carry out some Bayesian analysis to find the posterior distribution of the data.
I have the data, and I know that the growth follows a Gompertz curve with formula:
[y = a*exp(-exp(-(x-x0)/b))], with a = 88.8, b = 11.7, and x0 = 15.1.
The data I have is
x = c(0, 28, 36, 42, 50, 58, 63, 71, 79, 85, 92, 99, 106, 112)
y = c(0, 15, 35, 55, 62, 74, 80, 96, 127, 120, 146, 160, 177, 165).
Any help would be appreciated thank you

*Will edit when more information is given**
I am a little confused by your question. I have compiled what you have written into R. Please elaborate for me so that I can help you:
gompertz <- function(x, x0, a, b){
a*exp(-exp(-(x-x0)/b))
}
y = c(0, 15, 35, 55, 62, 74, 80, 96, 127, 120, 146, 160, 177, 165) # means of 30 samples of grape cluster weights?
x = c(0, 28, 36, 42, 50, 58, 63, 71, 79, 85, 92, 99, 106, 112) # ?
#??
gompertz(x, x0 = 15.1, a = 88.8, b = 11.7)
gompertz(y, x0 = 15.1, a = 88.8, b = 11.7)

Related

Plot a plane in R: PCA

I'm working with a PCA problem where I have 3 variables and I reduce them to 2 by doing PCA. I've already plot all the points in 3D using scatter3D. My question is, how can I plot the plane determined by two vectors (the first two eigenvectors of the sampled covariance matrix) in R?
This is what I have so far
library(plot3D)
X <- matrix(c(55, 75, 110,
47, 69, 108,
42, 71, 110,
48, 74, 114,
47, 75, 114,
52, 73, 104,
49, 72, 106,
44, 67, 107,
52, 73, 108,
45, 73, 111,
50, 80, 117,
50, 71, 110,
48, 75, 114,
51, 73, 106,
44, 66, 102,
42, 71, 112,
50, 68, 107,
48, 70, 108,
51, 72, 108,
52, 73, 109,
49, 72, 112,
49, 73, 108,
46, 70, 105,
39, 66, 100,
50, 76, 108,
52, 71, 108,
56, 75, 108,
53, 70, 112,
53, 72, 110,
49, 74, 113,
51, 72, 109,
55, 74, 110,
56, 75, 110,
62, 79, 118,
58, 77, 115,
50, 71, 105,
52, 67, 104,
52, 73, 107,
56, 73, 106,
55, 78, 118,
53, 68, 103), ncol = 3,nrow = 41,byrow = TRUE)
S <- cov(X)
Gamma <- eigen(S)$vectors
scatter3D(X[,1], X[,2], X[,3], pch = 18, bty = "u", colkey = FALSE,
main ="bty= 'u'", col.panel ="gray", expand =0.4,
col.grid = "white",ticktype = "detailed",
phi = 25,theta = 45)
pc <- scale(X,center=TRUE,scale=FALSE) %*% Gamma[,c(1,2)]
Now I would like to plot the plane using scatter3D

Perhaps this will do. Using the iris data. It uses scatter3d in package car which can add a regression surface to a 3d plot:
library(car)
data(iris)
iris.pr <- prcomp(iris[, 1:3], scale.=TRUE)
# Draw 3d plot with surface and color points by species
scatter3d(PC3~PC1+PC2, iris.pr$x, point.col=c(rep(2, 50), rep(3, 50), rep(4, 50)))
This plots a regression surface predicting PC3 from PC1 and PC2. By definition the correlation between any two principal components is zero so the surface should be PC3=0 for any values of PC1 and PC2, but I don't see a way to produce exactly that surface. It is pretty close though.

update valence shifter in sentimentr package in r

I am trying to remove certain rows from the lexicon::hash_valence_shifters in the sentimentr package. Specifically, i want to keep only rows:
c( 1 , 2 , 3 , 6 , 7 , 13, 14 , 16 , 19, 24, 25 , 26, 27 , 28, 36, 38, 39, 41, 42, 43, 45, 46, 53, 54, 55, 56, 57, 59, 60, 65, 70, 71, 73, 74, 79, 84, 85, 87, 88, 89, 94, 95, 96, 97, 98, 99, 100, 102, 103, 104, 105, 106, 107, 114, 115, 119, 120, 123, 124, 125, 126, 127, 128, 129, 135, 136, 138)
I have tried the below approach:
vsm = lexicon::hash_valence_shifters[c, ]
vsm[ , y := as.numeric(y)]
vsm = sentimentr::as_key(vsm, comparison = NULL, sentiment = FALSE)
sentimentr::is_key(vsm)
vsn = sentimentr::update_valence_shifter_table(vsm, drop = c(dropvalue$x), x= lexicon::hash_valence_shifters, sentiment = FALSE, comparison = TRUE )
However, when I am calculating the sentiment using the updated valence shifter table "vsn", it is giving the sentiment as 0.
Can someone please let me know how to just keep specific rows of the valence shifter table ?
Thanks!

Fit custom function to data

I have a data such that produced from special function:
where t0=1, alpha, q, gamma, C and beta are unknown parameters.
The question is how to fit the above function to following data, in R?
mydata<-structure(list(x = 1:100, y = c(0, 0, 2, 1, 3, 4, 4, 3, 7, 8,
9, 11, 12, 11, 15, 15, 17, 21, 49, 43, 117, 75, 85, 97, 113,
129, 135, 147, 149, 149, 123, 129, 127, 122, 143, 157, 144, 139,
123, 117, 141, 138, 124, 134, 158, 151, 136, 133, 121, 117, 122,
125, 117, 111, 98, 94, 92, 89, 73, 87, 91, 88, 94, 90, 93, 76,
60, 96, 71, 80, 71, 63, 65, 47, 74, 63, 78, 68, 55, 48, 51, 45,
48, 50, 71, 48, 35, 51, 69, 62, 64, 66, 51, 59, 58, 34, 57, 56,
63, 50)), class = "data.frame", row.names = c(NA, -100L))
I defined the function as follows:
t0<<-1
fyy<-function(t,cc0,alpha0,qq0,beta0,gamma0){
ret<-cc0*((t-t0)^alpha0)/(((1+(qq0-1)*beta0*(t-t0)^gamma0))^(1/(qq0-1)))
return(ret)
}
but I don't know how to continue?
as #mhovd mentioned I used "nls" function but I got an error as follows:
> fit <- nls(y~fyy(x,cc0 ,alpha0 ,beta0 ,gamma0 ,qq0 ),
data=data.frame(mydata), start=list(cc0 = .01,alpha0 =1,beta0 =.3,gamma0
= 2,qq0 = 1))
Error in numericDeriv(form[[3L]], names(ind), env) :
Missing value or an infinity produced when evaluating the model

In the comments #masoud references a paper about the specific function in the question. It suggests fixing gamma0 and qq0 and if we do that we do get a solution -- fm shown in red in the plot. We have also shown an alternate parametric curve as fm2 in blue. It also has 3 optimized parameters but has lower residual sum of squares (lower is better).
fyy <- function(t,cc0,alpha0,qq0,beta0,gamma0){
cc0 * ((t-t0)^alpha0) / (((1+(qq0-1)*beta0*(t-t0)^gamma0))^(1/(qq0-1)))
}
mydata0 <- subset(mydata, y > 0)
# fixed values
t0 <- 1
gamma0 <- 3
qq0 <- 1.2
st <- list(cc0 = 1, alpha0 = 1, beta0 = 1) # starting values
fm <- nls(y ~ fyy(x, cc0, alpha0, qq0, beta0, gamma0), mydata0,
lower = list(cc0 = 0.1, alpha0 = 0.1, beta0 = 0.00001),
start = st, algorithm = "port")
deviance(fm) # residual sum of squares
## [1] 61458.5
st2 <- list(a = 1, b = 1, c = 1)
fm2 <- nls(y ~ exp(a + b/x + c*log(x)), mydata0, start = st2)
deviance(fm2) # residual sum of squares
## [1] 16669.24
plot(mydata0, ylab = "y", xlab = "t")
lines(fitted(fm) ~ x, mydata0, col = "red")
lines(fitted(fm2) ~ x, mydata0, col = "blue")
legend("topright", legend = c("fm", "fm2"), lty = 1, col = c("red", "blue"))

R Binary Integer Optimization with Groups

I am trying to get Rsolnp to constrain my parameters to binary integers or to decimals that are nearly the same (.999 is close enough to 1 for example).
I have three vectors of equal length (52), each of which will get multiplied by my binary parameter vector in my objective function.
library(Rsolnp)
a <- c(251, 179, 215, 251, 63, 45, 54, 63, 47, 34, 40, 47, 141, 101, 121, 141, 47, 34, 40, 47, 94, 67, 81, 94, 47, 34, 40, 47, 157, 108, 133, 157, 126, 85, 106, 126, 126, 85, 106, 126, 110, 74, 92, 110, 110, 74, 92, 110, 63, 40, 52, 63)
b <- c(179, 251, 215, 0, 45, 63, 54, 0, 34, 47, 40, 0, 101, 141, 121, 0, 34, 47, 40, 0, 67, 94, 81, 0, 34, 47, 40, 0, 108, 157, 133, 0, 85, 126, 106, 0, 85, 126, 106, 0, 74, 110, 92, 0, 74, 110, 92, 0, 40, 63, 52, 0)
c <- c(179, 179, 118, 179, 45, 45, 30, 45, 34, 34, 22, 34, 101, 101, 67, 101, 34, 34, 22, 34, 67, 67, 44, 67, 34, 34, 22, 34, 108, 108, 71, 108, 85, 85, 56, 85, 85, 85, 56, 85, 74, 74, 49, 74, 74, 74, 49, 74, 40, 40, 27, 40)
x is my parameter vector and below if my objective function.
objective_function = function(x){
-(1166 * sum(x[1:52] * a) / 2000) *
(((sum(x[1:52] * b)) / 2100) + .05) *
(((sum(x[1:52] * c))/1500) + 1.5)
}
I essentially want 1 paramater in each group of 4 equal to 1 and the rest 0 and I'm not sure how to create the correct constraints for this but I believe I need to use these sum constraints in combination with another type of constraint as well. Here are my constraints:
eqn1=function(x){
z1=sum(x[1:4])
z2=sum(x[5:8])
z3=sum(x[9:12])
z4=sum(x[13:16])
z5=sum(x[17:20])
z6=sum(x[21:24])
z7=sum(x[25:28])
z8=sum(x[29:32])
z9=sum(x[33:36])
z10=sum(x[37:40])
z11=sum(x[41:44])
z12=sum(x[45:48])
z13=sum(x[49:52])
return(c(z1,z2,z3,z4,z5,z6,z7,z8,z9,z10,z11,z12,z13))
}
And finally, here is my function call:
opti<-solnp(pars=rep(1,52), fun = objective_function, eqfun = eqn1, eqB = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), LB=rep(0,52))
Calling opti$pars returns my solution vector:
[1] 7.199319e-01 2.800680e-01 6.015388e-08 4.886578e-10 5.540961e-01 4.459036e-01 2.906853e-07 4.635970e-08 5.389325e-01
[10] 4.610672e-01 2.979195e-07 3.651954e-08 6.228346e-01 3.771652e-01 1.980380e-07 3.348488e-09 5.389318e-01 4.610679e-01
[19] 2.979195e-07 3.651954e-08 5.820231e-01 4.179766e-01 2.099869e-07 2.624076e-08 5.389317e-01 4.610680e-01 2.979195e-07
[28] 3.651954e-08 6.499878e-01 3.500120e-01 1.959133e-07 1.059012e-08 6.249098e-01 3.750900e-01 2.588037e-07 1.752927e-08
[37] 6.249106e-01 3.750892e-01 2.588037e-07 1.752927e-08 6.095743e-01 3.904254e-01 2.741968e-07 2.233806e-08 6.095743e-01
[46] 3.904254e-01 2.741968e-07 2.233806e-08 5.679608e-01 4.320385e-01 6.821224e-07 3.997882e-08
As one can see the weight is getting split between multiple variables in each group of 4 instead of being forced onto just 1 with the rest being 0.
If this is not possible with this package could someone show me how to convert my objective function to work with other optimization packages? From what I have seen, they require the objective function to be converted to a vector of coefficients. Any help is appreciated. Thanks!

I tried with a few solvers. With MINLP solvers Couenne and Baron we can solve this directly. With Gurobi we need to decompose the objective into two quadratic parts. All these solvers give:
---- 119 VARIABLE x.L
i1 1.000, i5 1.000, i9 1.000, i14 1.000, i17 1.000, i21 1.000, i25 1.000, i29 1.000
i34 1.000, i38 1.000, i41 1.000, i46 1.000, i49 1.000
---- 119 VARIABLE z.L = -889.346 obj
Zeroes are not printed here.
I used GAMS (commercial) but if you want to use free tools you can use Pyomo(Python) + Couenne. I am not sure about MINLP solvers for R, but Gurobi can be used from R.
Note that the group constraint is simply:
groups(g).. sum(group(g,i),x(i)) =e= 1;
where g are the groups and group(g,i) is a 2d set with the mapping between groups and items.
For Gurobi you need to do something like (in pseudo code):
z1 = 1166 * sum(i,x(i)*a(i)) / 2000 (linear)
z2 = ((sum(i, x(i)*b(i))) / 2100) + .05 (linear)
z3 = ((sum(i, x(i)*c(i)))/1500) + 1.5 (linear)
z23 = z2*z3 (non-convex quadratic)
obj = -z1*z23 (non-convex quadratic)
and tell Gurobi to use the nonconvex MIQCP solver.
Sorry, no R code for this. But it may give you something to think about.

within CPLEX you may try mathematical programming as Paul wrote, but you may also use Constraint Programming.
In OPL (CPLEX modeling language)
using CP;
execute
{
cp.param.timelimit=5; // time limit 5 seconds
}
int n=52;
range r=1..n;
int a[r]=[251, 179, 215, 251, 63, 45, 54, 63, 47, 34, 40, 47, 141, 101, 121, 141, 47, 34, 40, 47, 94, 67,
81, 94, 47, 34, 40, 47, 157, 108, 133, 157, 126, 85, 106, 126, 126,
85, 106, 126, 110, 74, 92, 110, 110, 74, 92, 110, 63, 40, 52, 63];
int b[r]=[179, 251, 215, 0, 45, 63, 54, 0, 34, 47, 40, 0, 101, 141, 121, 0,
34, 47, 40, 0, 67, 94, 81, 0, 34, 47, 40, 0, 108, 157, 133, 0, 85, 126, 106, 0, 85,
126, 106, 0, 74, 110, 92, 0, 74, 110, 92, 0, 40, 63, 52, 0];
int c[r]=[179, 179, 118, 179, 45, 45, 30, 45, 34, 34, 22, 34, 101, 101, 67, 101,
34, 34, 22, 34, 67, 67, 44, 67, 34, 34, 22, 34, 108, 108, 71, 108, 85, 85, 56, 85,
85, 85, 56, 85, 74, 74, 49, 74, 74, 74, 49, 74, 40, 40, 27, 40];
// decision variable
dvar boolean x[r];
// objective
dexpr float obj=
-(1166 * sum(i in r) (x[i]*a[i]) / 2000) *
(((sum(i in r) (x[i]* b[i])) / 2100) + .05) *
(((sum(i in r) (x[i]*c[i]))/1500) + 1.5);
minimize obj;
subject to
{
// one and only one out of 4 is true
forall(i in 1..n div 4) count(all(j in 1+(i-1)*4..4+(i-1)*4)x[j],1)==1;
}
gives
// solution with objective -889.3463
x = [1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1
0 0 0 0 1 0 0 1 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 1 0 0 0 1 0 0 0 1 0
0 0];
within 5 seconds
NB: You could call OPL CPLEX from R or rely on any other CPLEX API
And in python you can write the same
from docplex.cp.model import CpoModel
n=52
r=range(0,n)
a =[251, 179, 215, 251, 63, 45, 54, 63, 47, 34, 40, 47, 141, 101, 121, 141, 47, 34, 40, 47, 94, 67, 81, 94, 47, 34, 40, 47, 157, 108, 133, 157, 126, 85, 106, 126, 126, 85, 106, 126, 110, 74, 92, 110, 110, 74, 92, 110, 63, 40, 52, 63]
b =[179, 251, 215, 0, 45, 63, 54, 0, 34, 47, 40, 0, 101, 141, 121, 0, 34, 47, 40, 0, 67, 94, 81, 0, 34, 47, 40, 0, 108, 157, 133, 0, 85, 126, 106, 0, 85, 126, 106, 0, 74, 110, 92, 0, 74, 110, 92, 0, 40, 63, 52, 0]
c =[179, 179, 118, 179, 45, 45, 30, 45, 34, 34, 22, 34, 101, 101, 67, 101, 34, 34, 22, 34, 67, 67, 44, 67, 34, 34, 22, 34, 108, 108, 71, 108, 85, 85, 56, 85, 85, 85, 56, 85, 74, 74, 49, 74, 74, 74, 49, 74, 40, 40, 27, 40]
mdl = CpoModel(name='x')
#decision variables
mdl.x = {i: mdl.integer_var(0,n,name="x"+str(i+1)) for i in r}
mdl.minimize(-1166 * sum(mdl.x[i]*a[i] / 2000 for i in r) \
*((sum(mdl.x[i]* b[i] / 2100 for i in r) +0.05)) \
*((sum(mdl.x[i]*c[i]/1500 for i in r) +1.5)) )
for i in range(0,n // 4):
mdl.add(1==sum( mdl.x[j] for j in range(i*4+0,i*4+4)))
msol=mdl.solve(TimeLimit=5)
# Dislay solution
for i in r:
if (msol[mdl.x[i]]==1):
print(i+1," ")
and that gives
! Best objective : -889.3464
1
5
9
13
17
22
25
30
34
38
41
45
49

I set up an R notebook to solve (or try to solve) the problem as a mixed integer linear program, using CPLEX as the MIP solver and the Rcplex package as the interface to it. The results were unspectacular. After five minutes of grinding, CPLEX had a solution somewhat inferior to what Erwin got (-886.8748 v. his -889.346) with a gap over 146% (which, given Erwin's result, is mostly just the upper bound converging very slowly). I'm happy to share the notebook, which shows the linearization, but to use it you would need to have CPLEX installed.
Update: I have a second notebook, using the GA genetic algorithm package, that consistently gets close to Erwin's solution (and occasionally hits it) in under five seconds. The results are random, so rerunning may do better (or worse), and there is no proof of optimality.

Calculate area of cross section for varying height

I'm trying to figure out how to calculate the area of a river cross section.
For the cross section I have the depth at every 25 cm over the 5 m wide river.
x_profile <- seq(0, 500, 25)
y_profile = c(50, 73, 64, 59, 60, 64, 82, 78, 79, 76, 72, 68, 63, 65, 62, 61, 56, 50, 44, 39, 25)
If anyone have some suggestions of how this could be done in r it's highly appreciated.

We can use the sf package to create a polygon showing the cross-section and then calculate the area. Notice that to create a polygon, it is necessary to provide three more points as c(0, 0), c(500, 0), and c(0, 0) when creating the matrix m.
x_profile <- seq(0, 500, 25)
y_profile <- c(50, 73, 64, 59, 60, 64, 82, 78, 79, 76, 72,
68, 63, 65, 62, 61, 56, 50, 44, 39, 25)
library(sf)
# Create matrix with coordinates
m <- matrix(c(0, x_profile, 500, 0, 0, -y_profile, 0, 0),
byrow = FALSE, ncol = 2)
# Create a polygon
poly <- st_polygon(list(m))
# View the polygon
plot(poly)
# Calcualte the area
st_area(poly)
31312.5

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex

Simulate data from a Gompertz curve in R - r

Related

Plot a plane in R: PCA

update valence shifter in sentimentr package in r

Fit custom function to data

R Binary Integer Optimization with Groups

Calculate area of cross section for varying height

Categories

Resources