How can I estimate the parameters in the function? - r

I am stuck on a problem of reproducing the data in an article (https://www.nature.com/articles/s41598-019-40993-w).
The attached picture shows concentration-response curves (left); the right graph plots the Emax of each individual curve (E'max) against its logEC50 (pEC50). The lower equation is Eq. (6) of the manuscript.
From the manuscript,
"Simulated data (Fig. 1, left) were fitted with Eq. (10) and resulting E’MAX values were plotted against corresponding EC50 values (Fig. 1, right). Fitting Eq. (6) to the data yielded system EMAX = 0.98 ± 0.01 and logarithm of the equilibrium dissociation constant of the agonist-receptor complex KA = −5.99 ± 0.01 (parameter estimate ± SD). ."
To sum up, the E'max and EC50 values are given. I need to fit Eq. (6) to them to estimate the Emax and KA values.
I tried that in GraphPad Prism. First I made X and Y columns (pEC50, E'max) and entered the data from the graph. Then I set up a non-linear, user-defined equation like Eq. (6):
Y= EMAX - (EMAX * 10^X)/KA
The rules for initial values are: EMAX = 1 (initial value, to be fit) and KA = 0 (initial value, to be fit).
The constraints for both EMAX and KA are left at the default, "No constraint".
When I run the fit with this user-defined equation, I get different values (EMAX = 1.0, KA = 0.0).
How can I correctly estimate the Emax and KA values?
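Since the question is tagged r, here is a minimal sketch of the same fit using nls(); the data frame df and its columns pEC50 (log EC50) and Emax_obs (E'max) are hypothetical placeholders, and fitting on the log10(KA) scale keeps KA positive and lets you start near the paper's reported value of about -6:
## Eq. (6): E'max = EMAX - EMAX * EC50 / KA, with EC50 = 10^pEC50 and KA = 10^logKA
fit <- nls(Emax_obs ~ EMAX - EMAX * 10^pEC50 / 10^logKA,
           data = df,
           start = list(EMAX = 1, logKA = -6))
summary(fit)  # parameter estimates and standard errors for EMAX and logKA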


How to estimate parameters of exponential function with R?

I am working on an investigation of ichthyofauna (the study of fishes). I need to find the condition factor of the fish.
The steps to find the condition factor are as follows:
1. W = aL^b ... (1)
Where:
W: The weight of fish in grams.
L: Total length of fish in centimeters.
a: A coefficient describing the rate of change of weight with length (= the intercept of the regression line on the Y axis).
b: The slope of the regression line (also referred to as the Allometric coefficient).
2. log W = log a + b log L ... (2)
Where:
a: constant
b: the regression co-efficient
3. K = 100 W / L^b ... (3)
Where:
W: Weight of the fish in grams
L: The total length of the fish in centimeters
b: The value obtained from the length-weight equation (1)
I understand that to calculate K, I must first obtain the regression slope (b in Eq. 1), then the regression coefficient (b in Eq. 2), and finally K. I need help to do this in R.
I would be very grateful for your support.
Thanks and regards!
So for a very simple regression, you may want to start with a linear model; you'd do something like this:
reg1 <- lm(log(W) ~ log(L), data=yourdataframename)
then check the summary for coefficients:
summary(reg1)
Note that you don't need to take the log of the intercept because it is essentially a column of ones (and it is included implicitly unless you put '-1' in the formula).
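From there, a minimal sketch of the condition-factor calculation might look like this (the data frame fish and its columns W and L are placeholders for your own measurements, and the values are made up):
fish <- data.frame(L = c(10, 12, 15, 18, 22),   # total length in cm (made-up values)
                   W = c(15, 26, 52, 90, 170))  # weight in g (made-up values)
reg1 <- lm(log(W) ~ log(L), data = fish)        # Eq. (2): log W = log a + b log L
b <- coef(reg1)[["log(L)"]]                     # slope = allometric coefficient b
a <- exp(coef(reg1)[["(Intercept)"]])           # back-transform the intercept to get a
K <- 100 * fish$W / fish$L^b                    # Eq. (3): condition factor for each fish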

Plotting ROC curve using ROCR or pROC

library(ROCR)
pred1 <- prediction(predictions = glm.prob2, labels = test_data$Direction)
perf1 <- performance(pred1, measure = "TP.rate", x.measure = "FP.rate")
plot(perf1)
I keep getting the following error message:
Wrong argument types: First argument must be of type 'prediction'; second and optional third argument must be available performance measures!
How can I get the ROC curve for this?
As the error suggests, your measure and x.measure arguments are invalid.
The documentation of the performance function lists the following options to choose from:
‘acc’: Accuracy. P(Yhat = Y). Estimated as: (TP+TN)/(P+N).
‘err’: Error rate. P(Yhat != Y). Estimated as: (FP+FN)/(P+N).
‘fpr’: False positive rate. P(Yhat = + | Y = -). Estimated as:
FP/N.
‘fall’: Fallout. Same as ‘fpr’.
‘tpr’: True positive rate. P(Yhat = + | Y = +). Estimated as:
TP/P.
‘rec’: Recall. Same as ‘tpr’.
‘sens’: Sensitivity. Same as ‘tpr’.
‘fnr’: False negative rate. P(Yhat = - | Y = +). Estimated as:
FN/P.
‘miss’: Miss. Same as ‘fnr’.
‘tnr’: True negative rate. P(Yhat = - | Y = -).
‘spec’: Specificity. Same as ‘tnr’.
‘ppv’: Positive predictive value. P(Y = + | Yhat = +). Estimated
as: TP/(TP+FP).
‘prec’: Precision. Same as ‘ppv’.
‘npv’: Negative predictive value. P(Y = - | Yhat = -). Estimated
as: TN/(TN+FN).
‘pcfall’: Prediction-conditioned fallout. P(Y = - | Yhat = +).
Estimated as: FP/(TP+FP).
‘pcmiss’: Prediction-conditioned miss. P(Y = + | Yhat = -).
Estimated as: FN/(TN+FN).
‘rpp’: Rate of positive predictions. P(Yhat = +). Estimated as:
(TP+FP)/(TP+FP+TN+FN).
‘rnp’: Rate of negative predictions. P(Yhat = -). Estimated as:
(TN+FN)/(TP+FP+TN+FN).
‘phi’: Phi correlation coefficient. (TP*TN -
FP*FN)/(sqrt((TP+FN)*(TN+FP)*(TP+FP)*(TN+FN))). Yields a
number between -1 and 1, with 1 indicating a perfect
prediction, 0 indicating a random prediction. Values below 0
indicate a worse than random prediction.
‘mat’: Matthews correlation coefficient. Same as ‘phi’.
‘mi’: Mutual information. I(Yhat, Y) := H(Y) - H(Y | Yhat), where
H is the (conditional) entropy. Entropies are estimated
naively (no bias correction).
‘chisq’: Chi square test statistic. ‘?chisq.test’ for details.
Note that R might raise a warning if the sample size is too
small.
‘odds’: Odds ratio. (TP*TN)/(FN*FP). Note that odds ratio produces
Inf or NA values for all cutoffs corresponding to FN=0 or
FP=0. This can substantially decrease the plotted cutoff
region.
‘lift’: Lift value. P(Yhat = + | Y = +)/P(Yhat = +).
‘f’: Precision-recall F measure (van Rijsbergen, 1979). Weighted
harmonic mean of precision (P) and recall (R). F = 1/
(alpha*1/P + (1-alpha)*1/R). If alpha=1/2, the mean is
balanced. A frequent equivalent formulation is F = (beta^2+1)
* P * R / (R + beta^2 * P). In this formulation, the mean is
balanced if beta=1. Currently, ROCR only accepts the alpha
version as input (e.g. alpha=0.5). If no value for alpha is
given, the mean will be balanced by default.
‘rch’: ROC convex hull. A ROC (=‘tpr’ vs ‘fpr’) curve with
concavities (which represent suboptimal choices of cutoff)
removed (Fawcett 2001). Since the result is already a
parametric performance curve, it cannot be used in
combination with other measures.
‘auc’: Area under the ROC curve. This is equal to the value of the
Wilcoxon-Mann-Whitney test statistic and also the probability
that the classifier will score a randomly drawn positive
sample higher than a randomly drawn negative sample. Since
the output of ‘auc’ is cutoff-independent, this measure
cannot be combined with other measures into a parametric
curve. The partial area under the ROC curve up to a given
false positive rate can be calculated by passing the optional
parameter ‘fpr.stop=0.5’ (or any other value between 0 and 1)
to ‘performance’.
‘prbe’: Precision-recall break-even point. The cutoff(s) where
precision and recall are equal. At this point, positive and
negative predictions are made at the same rate as their
prevalence in the data. Since the output of ‘prbe’ is just a
cutoff-independent scalar, this measure cannot be combined
with other measures into a parametric curve.
‘cal’: Calibration error. The calibration error is the absolute
difference between predicted confidence and actual
reliability. This error is estimated at all cutoffs by
sliding a window across the range of possible cutoffs. The
default window size of 100 can be adjusted by passing the
optional parameter ‘window.size=200’ to ‘performance’. E.g.,
if for several positive samples the output of the classifier
is around 0.75, you might expect from a well-calibrated
classifier that the fraction of them which is correctly
predicted as positive is also around 0.75. In a
well-calibrated classifier, the probabilistic confidence
estimates are realistic. Only for use with probabilistic
output (i.e. scores between 0 and 1).
‘mxe’: Mean cross-entropy. Only for use with probabilistic output.
MXE := - 1/(P+N) sum_{y_i=+} ln(yhat_i) + sum_{y_i=-}
ln(1-yhat_i). Since the output of ‘mxe’ is just a
cutoff-independent scalar, this measure cannot be combined
with other measures into a parametric curve.
‘rmse’: Root-mean-squared error. Only for use with numerical class
labels. RMSE := sqrt(1/(P+N) sum_i (y_i - yhat_i)^2). Since
the output of ‘rmse’ is just a cutoff-independent scalar,
this measure cannot be combined with other measures into a
parametric curve.
‘sar’: Score combining performance measures of different
characteristics, in the attempt of creating a more "robust"
measure (cf. Caruana R., ROCAI2004): SAR = 1/3 * ( Accuracy +
Area under the ROC curve + Root mean-squared error ).
‘ecost’: Expected cost. For details on cost curves, cf.
Drummond&Holte 2000,2004. ‘ecost’ has an obligatory x axis,
the so-called 'probability-cost function'; thus it cannot be
combined with other measures. While using ‘ecost’ one is
interested in the lower envelope of a set of lines, it might
be instructive to plot the whole set of lines in addition to
the lower envelope. An example is given in ‘demo(ROCR)’.
‘cost’: Cost of a classifier when class-conditional
misclassification costs are explicitly given. Accepts the
optional parameters ‘cost.fp’ and ‘cost.fn’, by which the
costs for false positives and negatives can be adjusted,
respectively. By default, both are set to 1.
So you should do something like:
perf1 <- performance(pred1, measure = "tpr", x.measure = "fpr")
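Putting it together with your own objects from above (glm.prob2 and test_data$Direction), the corrected sequence would be roughly:
library(ROCR)
pred1 <- prediction(predictions = glm.prob2, labels = test_data$Direction)
perf1 <- performance(pred1, measure = "tpr", x.measure = "fpr")
plot(perf1)
## The AUC can be read from a separate, cutoff-independent performance object:
auc1 <- performance(pred1, measure = "auc")
auc1@y.values[[1]]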

A Feature Selection Algorithm POE1ACC for features with continuous value

I want to implement the "Probability of Error and Average Correlation Coefficient" algorithm (more info on page 143). It is an algorithm for selecting features from a set of features. As far as I know, this algorithm is not limited to boolean-valued features, but I don't know how I can use it for continuous features.
This is the only example I could find of this algorithm:
Here, X is the feature to be predicted and C is any other feature. To calculate the Probability of Error value of C, they select the values which mismatch the green pieces. Thus the PoE of C is (1 - 7/9) + (1 - 6/7) = 3/16 = 0.1875.
My question is thus: How can we use a continuous feature instead of a boolean feature, like in this example, to calculate PoE? Or is it not possible?
The algorithm that you describe is a feature selection algorithm, similar to the forward selection technique. At each step, we find a new feature Fi that minimizes this criterion:
weight_1 * ErrorProbability(Fi) + weight_2 * Acc(Fi)
Acc(Fi) represents the mean correlation between the feature Fi and the features already selected. You want to minimize this so that your selected features are uncorrelated, which gives a well-conditioned problem.
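A rough sketch of that quantity in R (the data frame and column names are purely hypothetical):
acc <- function(candidate, selected) mean(abs(cor(candidate, selected)))
## e.g. acc(weather$temperature, weather[, c("humidity", "pressure")])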
ErrorProbability(Fi) represents how well the feature describes the variable you want to predict. For example, let's say you want to predict whether tomorrow will be rainy depending on temperature (a continuous feature).
The Bayes error rate (http://en.wikipedia.org/wiki/Bayes_error_rate) is:
P = Sum_Ci Integral_{x in Hi} P(x|Ci) * P(Ci) dx
In our example:
Ci belongs to {rainy; not rainy}
x are instances of temperatures
Hi represents all temperatures that would lead to a Ci prediction.
What is interesting is that you can take any predictor you like.
Now, suppose you have all temperatures in one vector and all states rainy/not rainy in another vector:
In order to have P(x|Rainy), consider the following values:
temperaturesWhenRainy <- temperatures[which(state=='rainy')]
What you should do next is plot a histogram of these values. Then you should try to fit a distribution to it. You will have a parametric formula for P(x|Rainy).
If your distribution is Gaussian, you can do it simply:
m <- mean(temperaturesWhenRainy)
s <- sd(temperaturesWhenRainy)
Given some x value, you have the probability density P(x|Rainy):
p <- dnorm(x, mean = m, sd = s)
You can do the same procedure for P(x|Not Rainy). Then P(Rainy) and P(Not Rainy) are easy to compute.
Once you have all that stuff you can use the Bayes error rate formula, which yields your ErrorProbability for a continuous feature.
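As a rough, self-contained sketch of those steps (temperatures and state are hypothetical vectors of equal length):
rainy     <- temperatures[state == "rainy"]
notRainy  <- temperatures[state != "rainy"]
pRainy    <- length(rainy) / length(temperatures)  # P(Rainy)
pNotRainy <- 1 - pRainy                            # P(Not Rainy)
## Class-conditional densities fitted as Gaussians
dRainy    <- function(x) dnorm(x, mean(rainy), sd(rainy))
dNotRainy <- function(x) dnorm(x, mean(notRainy), sd(notRainy))
## Bayes error rate: at each x, integrate the joint density of the class that is
## NOT predicted, i.e. the smaller of the two joint densities
err <- integrate(function(x) pmin(pRainy * dRainy(x), pNotRainy * dNotRainy(x)),
                 lower = -Inf, upper = Inf)$value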
Cheers

loess predict with new x values

I am attempting to understand how the predict.loess function is able to compute new predicted values (y_hat) at points x that do not exist in the original data. For example (this is a simple example and I realize loess is obviously not needed for an example of this sort but it illustrates the point):
x <- 1:10
y <- x^2
mdl <- loess(y ~ x)
predict(mdl, 1.5)
[1] 2.25
loess regression works by using polynomials at each x and thus it creates a predicted y_hat at each x. However, because there are no coefficients being stored, the "model" in this case is simply the details of what was used to predict each y_hat, for example, the span or degree. When I do predict(mdl, 1.5), how is predict able to produce a value at this new x? Is it interpolating between the two nearest existing x values and their associated y_hat? If so, what are the details behind how it is doing this?
I have read the cloess documentation online but am unable to find where it discusses this.
However, because there are no coefficients being stored, the "model" in this case is simply the details of what was used to predict each y_hat
Maybe you have used the print(mdl) command, or simply mdl, to see what the model mdl contains, but that is not the whole story. The model object is really complicated and stores a large number of parameters.
To get an idea of what's inside, you can use unlist(mdl) and see the long list of parameters in it.
This is the part of the command's manual describing how it really works:
Fitting is done locally. That is, for the fit at point x, the fit is made using points in a neighbourhood of x, weighted by their distance from x (with differences in ‘parametric’ variables being ignored when computing the distance). The size of the neighbourhood is controlled by α (set by span or enp.target). For α < 1, the neighbourhood includes proportion α of the points, and these have tricubic weighting (proportional to (1 - (dist/maxdist)^3)^3). For α > 1, all points are used, with the ‘maximum distance’ assumed to be α^(1/p) times the actual maximum distance for p explanatory variables.
For the default family, fitting is by (weighted) least squares. For family="symmetric" a few iterations of an M-estimation procedure with Tukey's biweight are used. Be aware that as the initial value is the least-squares fit, this need not be a very resistant fit.
What I believe is that it tries to fit a polynomial model in the neighborhood of every point (not just a single polynomial for the whole set). The neighborhood does not mean only one point before and one point after; if I were implementing such a function, I would put a large weight on the points nearest to x, lower weights on more distant points, and fit a polynomial that best fits this weighted neighborhood.
Then, if the new value x' for which a prediction is needed is closest to the point x, I would take the polynomial fitted on the neighborhood of x, call it P, evaluate it at x', i.e. P(x'), and that would be the prediction.
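For intuition only, here is a rough sketch of a tricube-weighted local quadratic fit around one target point; this is not R's actual loess algorithm, just the idea described above:
x <- 1:10; y <- x^2
x0 <- 1.5                                        # point at which we want a prediction
span <- 0.75
k <- ceiling(span * length(x))                   # number of points in the neighbourhood
d <- abs(x - x0)
in_nbhd <- rank(d, ties.method = "first") <= k
w <- ifelse(in_nbhd, (1 - (d / max(d[in_nbhd]))^3)^3, 0)  # tricube weights, 0 outside the neighbourhood
fit <- lm(y ~ x + I(x^2), weights = w)           # weighted local quadratic
predict(fit, data.frame(x = x0))                 # matches the loess prediction of 2.25 here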
Let me know if you are looking for anything special.
To better understand what is happening in a loess fit try running the loess.demo function from the TeachingDemos package. This lets you interactively click on the plot (even between points) and it then shows the set of points and their weights used in the prediction and the predicted line/curve for that point.
Note also that the default for loess is to do a second smoothing/interpolating on the loess fit, so what you see in the fitted object is probably not the true loess fitting information, but the secondary smoothing.
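For example (assuming the TeachingDemos package is installed):
library(TeachingDemos)
x <- 1:10
y <- x^2
loess.demo(x, y, span = 0.75, degree = 2)  # click on the plot to see the local weights and the fitted curve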
Found the answer on page 42 of the manual:
In this algorithm a set of points, typically small in number, is selected for direct computation using the loess fitting method and a surface is evaluated using an interpolation method that is based on blending functions. The space of the factors is divided into rectangular cells using an algorithm based on k-d trees. The loess fit is evaluated at the cell vertices and then blending functions do the interpolation. The output data structure stores the k-d trees and the fits at the vertices. This information is used by predict() to carry out the interpolation.
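You can see the effect of this interpolation scheme by switching it off via loess.control; a small sketch (the two predictions normally differ only negligibly):
x <- 1:10
y <- x^2
mdl_interp <- loess(y ~ x)  # default: surface = "interpolate" (k-d tree + blending functions)
mdl_direct <- loess(y ~ x, control = loess.control(surface = "direct"))
predict(mdl_interp, 1.5)  # uses the precomputed vertex fits and interpolates
predict(mdl_direct, 1.5)  # refits locally at the new point, no interpolation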
I guess that to predict at x, predict.loess makes a regression with some points near x and calculates the y-value at x.
Visit https://stats.stackexchange.com/questions/223469/how-does-a-loess-model-do-its-prediction

Derivative of Kernel Density

I am using density {stats} to construct a kernel ("gaussian") density of a vector of values. If I use the following example dataset:
x <- rlogis(1475, location = 0, scale = 1)  # x is a vector of values - taken from rlogis just for the purpose of explanation
d <- density(x = x, kernel = "gaussian")
Is there some way to get the first derivative of this density d at each of the n = 1475 points?
Edit #2:
Following up on Greg Snow's excellent suggestion to use the analytical expression for the derivative of a Gaussian, and our conversation following his post, this will get you the exact slope at each of those points:
s <- d$bw
## The derivative of each Gaussian kernel carries a 1/s^2 factor:
## d/dX dnorm(X - xi, sd = s) = dnorm(xi - X, sd = s) * (xi - X) / s^2
slope2 <- sapply(x, function(X) mean(dnorm(x - X, mean = 0, sd = s) * (x - X)) / s^2)
## And then, to compare to the method below, plot the results against one another
plot(slope2 ~ slope)
Edit:
OK, I just reread your question, and see that you wanted slopes at each of the points in the input vector x. Here's one way you might approximate that:
slope <- (diff(d$y)/diff(d$x))[findInterval(x, d$x)]
A possible further refinement would be to find the location of the point within its interval, and then calculate its slope as the weighted average of the slope of the present interval and the interval to its right or left.
I'd approach this by averaging the slopes of the segments just to the right and left of each point. (A bit of special care needs to be taken for the first and last points, which have no segment to their left and right, respectively.)
dy <- diff(d$y)
dx <- diff(d$x)[1] ## Works b/c density() returns points at equal x-intervals
((c(dy, tail(dy, 1)) + c(head(dy, 1), dy))/2)/dx
The curve of a density estimator is just the sum of all the kernels, in your case a gaussian (divided by the number of points). The derivative of a sum is the sum of the derivatives and the derivative of a constant times a function is that constant times the derivative. So the derivative of the density estimate at a given point will just be the average of the slopes for the 1475 different gaussian curves at that given point. Each gaussian curve will have a mean corresponding to each of the data points and a standard deviation based on the bandwidth. So if you can calculate the slope for a gaussian, then finding the slope for the density estimate is just a mean of the 1475 slopes.
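A compact, self-contained sketch of that idea (redefining x and d locally; the kernels are centred at the data points with sd equal to the bandwidth chosen by density()):
x <- rlogis(1475, location = 0, scale = 1)
d <- density(x, kernel = "gaussian")
## Derivative of the KDE at t: mean over the n Gaussian kernels of each kernel's slope at t
kde_deriv <- function(t) mean(dnorm(t, mean = x, sd = d$bw) * (x - t) / d$bw^2)
## Quick check against a centred finite difference on the density grid:
i <- 250
kde_deriv(d$x[i])
(d$y[i + 1] - d$y[i - 1]) / (d$x[i + 1] - d$x[i - 1])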
