How to fit a known linear equation to my data in R?

How to fit a known linear equation to my data in R? - r

I used a linear model to obtain the best fit to my data, lm() function.
From literature I know that the optimal fit would be a linear regression with the slope = 1 and the intercept = 0. I would like to see how good this equation (y=x) fits my data? How do I proceed in order to find an R^2 as well as a p-value?
This is my data
(y = modelled, x = measured)
measured<-c(67.39369,28.73695,60.18499,49.32405,166.39318,222.29022,271.83573,241.72247, 368.46304,220.27018,169.92343,56.49579,38.18381,49.33753,130.91752,161.63536,294.14740,363.91029,358.32905,239.84112,129.65078,32.76462,30.13952,52.83656,67.35427,132.23034,366.87857,247.40125,273.19316,278.27902,123.24256,45.98363,83.50199,240.99459,266.95707,308.69814,228.34256,220.51319,83.97942,58.32171,57.93815,94.64370,264.78007,274.25863,245.72940,155.41777,77.45236,70.44223,104.22838,294.01645,312.42321,122.80831,41.65770,242.22661,300.07147,291.59902,230.54478,89.42498,55.81760,55.60525,111.64263,305.76432,264.27192,233.28214,192.75603,75.60803,63.75376)
modelled<-c(42.58318,71.64667,111.08853,67.06974,156.47303,240.41188,238.25893,196.42247,404.28974,138.73164,116.73998,55.21672,82.71556,64.27752,145.84891,133.67465,295.01014,335.25432,253.01847,166.69241,68.84971,26.03600,45.04720,75.56405,109.55975,202.57084,288.52887,140.58476,152.20510,153.99427,75.70720,92.56287,144.93923,335.90871,NA,264.25732,141.93407,122.80440,83.23812,42.18676,107.97732,123.96824,270.52620,388.93979,308.35117,100.79047,127.70644,91.23133,162.53323,NA ,276.46554,100.79440,81.10756,272.17680,387.28700,208.29715,152.91548,62.54459,31.98732,74.26625,115.50051,324.91248,210.14204,168.29598,157.30373,45.76027,76.07370)
Now I would like to see how good the equation y=x fits the data presented above (R^2 and p-value)?
I am very grateful if somebody can help me with this (basic) problem, as I found no answers to my question on stackoverflow?
Best regards Cyril

Let's be clear what you are asking here. You have an existing model, which is "the modelled values are the expected value of the measured values", or in other words, measured = modelled + e, where e are the normally distributed residuals.
You say that the "optimal fit" should be a straight line with intercept 0 and slope 1, which is another way of saying the same thing.
The thing is, this "optimal fit" is not the optimal fit for your actual data, as we can easily see by doing:
summary(lm(measured ~ modelled))
#>
#> Call:
#> lm(formula = measured ~ modelled)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -103.328 -39.130 -4.881 40.428 114.829
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 23.09461 13.11026 1.762 0.083 .
#> modelled 0.91143 0.07052 12.924 <2e-16 ***
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Residual standard error: 55.13 on 63 degrees of freedom
#> Multiple R-squared: 0.7261, Adjusted R-squared: 0.7218
#> F-statistic: 167 on 1 and 63 DF, p-value: < 2.2e-16
This shows us the line that would produce the optimal fit to your data in terms of reducing the sum of the squared residuals.
But I guess what you are asking is "How well do my data fit the model measured = modelled + e ?"
Trying to coerce lm into giving you a fixed intercept and slope probably isn't the best way to answer this question. Remember, the p value for the slope only tells you whether the actual slope is significantly different from 0. The above model already confirms that. If you want to know the r-squared of measured = modelled + e, you just need to know the proportion of the variance of measured that is explained by modelled. In other words:
1 - var(measured - modelled) / var(measured)
#> [1] 0.7192672
This is pretty close to the r squared from the lm call.
I think you have sufficient evidence to say that your data is consistent with the model measured = modelled, in that the slope in the lm model includes the value 1 within its 95% confidence interval, and the intercept contains the value 0 within its 95% confidence interval.

As mentioned in the comments, you can use the lm() function, but this actually estimates the slope and intercept for you, whereas what you want is something different.
If slope = 1 and the intercept = 0, essentially you have a fit and your modelled is already the predicted value. You need the r-square from this fit. R squared is defined as:
R2 = MSS/TSS = (TSS − RSS)/TSS
See this link for definition of RSS and TSS.
We can only work with observations that are complete (non NA). So we calculate each of them:
TSS = nonNA = !is.na(modelled) & !is.na(measured)
# residuals from your prediction
RSS = sum((modelled[nonNA] - measured[nonNA])^2,na.rm=T)
# total residuals from data
TSS = sum((measured[nonNA] - mean(measured[nonNA]))^2,na.rm=T)
1 - RSS/TSS
[1] 0.7116585

If measured and modelled are supposed to represent the actual and fitted values of an undisclosed model, as discussed in the comments below another answer, then if fm is the lm object for that undisclosed model then
summary(fm)
will show the R^2 and p value of that model.
The R squared value can actually be calculated using only measured and modelled but the formula is different if there is or is not an intercept in the undisclosed model. The signs are that there is no intercept since if there were an intercept sum(modelled - measured, an.rm = TRUE) should be 0 but in fact it is far from it.
In any case R^2 and the p value are shown in the output of the summary(fm) where fm is the undisclosed linear model so there is no point in restricting the discussion to measured and modelled if you have the lm object of the undisclosed model.
For example, if the undisclosed model is the following then using the builtin CO2 data frame:
fm <- lm(uptake ~ Type + conc, CO2)
summary(fm)
we have the this output where the last two lines show R squared and p value.
Call:
lm(formula = uptake ~ Type + conc, data = CO2)
Residuals:
Min 1Q Median 3Q Max
-18.2145 -4.2549 0.5479 5.3048 12.9968
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 25.830052 1.579918 16.349 < 2e-16 ***
TypeMississippi -12.659524 1.544261 -8.198 3.06e-12 ***
conc 0.017731 0.002625 6.755 2.00e-09 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.077 on 81 degrees of freedom
Multiple R-squared: 0.5821, Adjusted R-squared: 0.5718
F-statistic: 56.42 on 2 and 81 DF, p-value: 4.498e-16

Related

How to calculate the partial R squared for a linear model with factor interaction in R

I have a linear model where my response Y is say the percentage (proportion) of fat in milk. I have two explanatory variables one (x1) is a continuous variable, the other (z) is a three level factor.
I now do the regression in R as:
contrasts(z) <- "contr.sum"
model<-lm(logit(Y) ~ log(x1)*z)
the model summary gives me the R2 of this model . However, I want to find out the importance of x1 in my model.
I can look at the p-value if the slope is statistically different from 0, but this does not tell me if x1 is actually a good predictor.
Is there a way to get the partial R2 for this model and the overall effect of x1? As this model includes an interaction I am not sure how to calculate this and if there is one unique solution or if I get a partial R2 for the main effect of x1 and a partial R2 for main effect of x1 plus its interaction.
Or would it be better to avoid partial R2 and explain the magnitude of the slope of the main effect and interaction. But given my logit transformation I am not sure if this has any practical meaning for say how log(x1) changes the log odds ratio of % fat in milk.
Thanks.
-I tried to fit the model without the interaction and without the factor to get a usual R2 , but this would not be my preferred solution and I would like to get the partial R2 when specifying a full model.
Update: As requested in a comment, here the output from the summary(model). As written above z is sum contrast coded.
Call:
lm(formula = y ~ log(x1) * z, data = mydata)
Residuals:
Min 1Q Median 3Q Max
-1.21240 -0.09487 0.03282 0.13588 0.85941
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.330678 0.034043 -68.462 < 2e-16 ***
log(x1) -0.012948 0.005744 -2.254 0.02454 *
z1 0.140710 0.048096 2.926 0.00357 **
z2 -0.348526 0.055156 -6.319 5.17e-10 ***
log(x1):z1 0.017051 0.008095 2.106 0.03558 *
log(x1):z2 -0.028201 0.009563 -2.949 0.00331 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.2288 on 594 degrees of freedom
Multiple R-squared: 0.1388, Adjusted R-squared: 0.1315
F-statistic: 19.15 on 5 and 594 DF, p-value: < 2.2e-16
Update: As requested in a comment, here the output from
print(aov(model))
Call:
aov(formula = model)
Terms:
log(x1) z log(x1):z Residuals
Sum of Squares 0.725230 3.831223 0.456677 31.105088
Deg. of Freedom 1 2 2 594
Residual standard error: 0.228835
Estimated effects may be unbalanced.
As written above, z is sum contrast coded.

How do I print the growth rate for an exponential regression model?

So, I created an exponential regression using 50 data points taken over 50 days. Finding the summary of it resulted in the following:
> summary(TotalModel)
Call:
lm(formula = log(Total) ~ Time)
Residuals:
Min 1Q Median 3Q Max
-1.0570 -0.4827 -0.1168 0.5545 0.8195
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.537196 0.165779 27.37 <2e-16 ***
Time 0.148937 0.005658 26.32 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.5773 on 48 degrees of freedom
Multiple R-squared: 0.9352, Adjusted R-squared: 0.9339
F-statistic: 692.9 on 1 and 48 DF, p-value: < 2.2e-16
Now, while this does provide me with some of the information I needed, I want to take the growth rate of this exponential model and print it to a variable that I will later export to a spreadsheet (and repeat this about 13 more times). How do I get this value?

First of all, in your example above, the fitted model is:
log(Total) = 4.537196 + 0.148937 * Time
Exponentiating both sides, we have:
Total = exp(4.537196)*exp(0.148937 * Time)
Your growth rate is 0.148937 which is the coefficient of the Time variable. To extract this value, you could do
TotalModel$coefficients["Time"]
In general, in the future, if you're trying to see how you can extract a specific value from an object, a few things you can try include: (will not always work for every object but worth a try)
Look at the structure of your model by doing str(TotalModel). In your case, you will see that one of the elements is coefficients, which you can access using TotalModel$coefficients
Type TotalModel$ (with the dollar sign) and hit the tab key on your keyboard to see what possible elements can be accessed.

So a generic exponential function has the following form, where is the exponential growth rate:
Looks like you have estimated the transformed log-linear model, ie:
Or, put another way, the OLS model you have estimated is:
where and .
So then, the exponential growth rate is equal to:
You can calculate this from the TotalModel object, where TotalModel$coefficient['Time'] is and so, r = exp(TotalModel$coefficient['Time']) - 1.

How to get probability from GLM output

I'm extremely stuck at the moment as I am trying to figure out how to calculate the probability from my glm output in R. I know the data is very insignificant but I would really love to be shown how to get the probability from an output like this. I was thinking of trying inv.logit() but didn't know what variables to put within the brackets.
The data is from occupancy study. I'm assessing the success of a hair trap method versus a camera trap in detecting 3 species (red squirrel, pine marten and invasive grey squirrel). I wanted to see what affected detection (or non detection) of the various species. One hypotheses was the detection of another focal species at the site would affect the detectability of red squirrel. Given that pine marten is a predator of the red squirrel and that the grey squirrel is a competitor, the presence of those two species at a site might affect the detectability of the red squirrel.
Would this show the probability? inv.logit(-1.14 - 0.1322 * nonRS events)
glm(formula = RS_sticky ~ NonRSevents_before1stRS, family = binomial(link = "logit"), data = data)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.7432 -0.7432 -0.7222 -0.3739 2.0361
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.1455 0.4677 -2.449 0.0143 *
NonRSevents_before1stRS -0.1322 0.1658 -0.797 0.4255
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 34.575 on 33 degrees of freedom
Residual deviance: 33.736 on 32 degrees of freedom
(1 observation deleted due to missingness)
AIC: 37.736
Number of Fisher Scoring iterations: 5*

If you want to predict the probability of response for a specified set of values of the predictor variable:
pframe <- data.frame(NonRSevents_before1stRS=4)
predict(fitted_model, newdata=pframe, type="response")
where fitted_model is the result of your glm() fit, which you stored in a variable. You may not be familiar with the R approach to statistical analysis, which is to store the fitted model as an object/in a variable, then apply different methods to it (summary(), plot(), predict(), residuals(), ...)
This is obviously only a made-up example: I don't know if 4 is a reasonable value for the NonRSevents_before1stRS variable)
you can specify more different values to do predictions for at the same time (data.frame(NonRSevents_before1stRS=c(4,5,6,7,8)))
if you have multiple predictors, you have to specify some value for every predictor for every prediction, e.g. data.frame(x=4:8,y=mean(orig_data$y), ...)
If you want the predicted probabilities for the observations in your original data set, just predict(fitted_model, type="response")
You're correct that inv.logit() (from a bunch of different packages, don't know which you're using) or plogis() (from base R, essentially the same) will translate from the logit or log-odds scale to the probability scale, so
plogis(predict(fitted_model))
would also work (predict provides predictions on the link-function [in this case logit/log-odds] scale by default).

The dependent variable in a logistic regression is a log odds ratio. We'll illustrate how to interpret the coefficients with the space shuttle autolander data from the MASS package.
After loading the data, we'll create a binary dependent variable where:
1 = autolander used,
0 = autolander not used.
We will also create a binary independent variable for shuttle stability:
1 = stable positioning
0 = unstable positioning.
Then, we'll run glm() with family=binomial(link="logit"). Since the coefficients are log odds ratios, we'll exponentiate them to turn them back into odds ratios.
library(MASS)
str(shuttle)
shuttle$stable <- 0
shuttle[shuttle$stability =="stab","stable"] <- 1
shuttle$auto <- 0
shuttle[shuttle$use =="auto","auto"] <- 1
fit <- glm(use ~ factor(stable),family=binomial(link = "logit"),data=shuttle) # specifies base as unstable
summary(fit)
exp(fit$coefficients)
...and the output:
> fit <- glm(use ~ factor(stable),family=binomial(link = "logit"),data=shuttle) # specifies base as unstable
>
> summary(fit)
Call:
glm(formula = use ~ factor(stable), family = binomial(link = "logit"),
data = shuttle)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.1774 -1.0118 -0.9566 1.1774 1.4155
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 4.747e-15 1.768e-01 0.000 1.0000
factor(stable)1 -5.443e-01 2.547e-01 -2.137 0.0326 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 350.36 on 255 degrees of freedom
Residual deviance: 345.75 on 254 degrees of freedom
AIC: 349.75
Number of Fisher Scoring iterations: 4
> exp(fit$coefficients)
(Intercept) factor(stable)1
1.0000000 0.5802469
>
The intercept of 0 is the log odds for unstable, and the coefficient of -.5443 is the log odds for stable. After exponentiating the coefficients, we observe that the odds of autolander use under the condition of an unstable shuttle 1.0, and are multiplied by .58 if the shuttle is stable. This means that the autolander is less likely to be used if the shuttle has stable positioning.
Calculating probability of autolander use
We can do this in two ways. First, the manual approach: exponentiate the coefficients and convert the odds to probabilities using the following equation.
p = odds / (1 + odds)
With the shuttle autolander data it works as follows.
# convert intercept to probability
odds_i <- exp(fit$coefficients[1])
odds_i / (1 + odds_i)
# convert stable="stable" to probability
odds_p <- exp(fit$coefficients[1]) * exp(fit$coefficients[2])
odds_p / (1 + odds_p)
...and the output:
> # convert intercept to probability
> odds_i <- exp(fit$coefficients[1])
> odds_i / (1 + odds_i)
(Intercept)
0.5
> # convert stable="stable" to probability
> odds_p <- exp(fit$coefficients[1]) * exp(fit$coefficients[2])
> odds_p / (1 + odds_p)
(Intercept)
0.3671875
>
The probability of autolander use when a shuttle is unstable is 0.5, and decreases to 0.37 when the shuttle is stable.
The second approach to generate probabilities is to use the predict() function.
# convert to probabilities with the predict() function
predict(fit,data.frame(stable="0"),type="response")
predict(fit,data.frame(stable="1"),type="response")
Note that the output matches the manually calculated probabilities.
> # convert to probabilities with the predict() function
> predict(fit,data.frame(stable="0"),type="response")
1
0.5
> predict(fit,data.frame(stable="1"),type="response")
1
0.3671875
>
Applying this to the OP data
We can apply these steps to the glm() output from the OP as follows.
coefficients <- c(-1.1455,-0.1322)
exp(coefficients)
odds_i <- exp(coefficients[1])
odds_i / (1 + odds_i)
# convert nonRSEvents = 1 to probability
odds_p <- exp(coefficients[1]) * exp(coefficients[2])
odds_p / (1 + odds_p)
# simulate up to 10 nonRSEvents prior to RS
coef_df <- data.frame(nonRSEvents=0:10,
intercept=rep(-1.1455,11),
nonRSEventSlope=rep(-0.1322,11))
coef_df$nonRSEventValue <- coef_df$nonRSEventSlope *
coef_df$nonRSEvents
coef_df$intercept_exp <- exp(coef_df$intercept)
coef_df$slope_exp <- exp(coef_df$nonRSEventValue)
coef_df$odds <- coef_df$intercept_exp * coef_df$slope_exp
coef_df$probability <- coef_df$odds / (1 + coef_df$odds)
# print the odds & probabilities by number of nonRSEvents
coef_df[,c(1,7:8)]
...and the final output.
> coef_df[,c(1,7:8)]
nonRSEvents odds probability
1 0 0.31806 0.24131
2 1 0.27868 0.21794
3 2 0.24417 0.19625
4 3 0.21393 0.17623
5 4 0.18744 0.15785
6 5 0.16423 0.14106
7 6 0.14389 0.12579
8 7 0.12607 0.11196
9 8 0.11046 0.09947
10 9 0.09678 0.08824
11 10 0.08480 0.07817
>

How to obtain Poisson's distribution "lambda" from R glm() coefficients

My R-script produces glm() coeffs below.
What is Poisson's lambda, then? It should be ~3.0 since that's what I used to create the distribution.
Call:
glm(formula = h_counts ~ ., family = poisson(link = log), data = pois_ideal_data)
Deviance Residuals:
Min 1Q Median 3Q Max
-22.726 -12.726 -8.624 6.405 18.515
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 8.222532 0.015100 544.53 <2e-16 ***
h_mids -0.363560 0.004393 -82.75 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for poisson family taken to be 1)
Null deviance: 11451.0 on 10 degrees of freedom
Residual deviance: 1975.5 on 9 degrees of freedom
AIC: 2059
Number of Fisher Scoring iterations: 5
random_pois = rpois(10000,3)
h=hist(random_pois, breaks = 10)
mean(random_pois) #verifying that the mean is close to 3.
h_mids = h$mids
h_counts = h$counts
pois_ideal_data <- data.frame(h_mids, h_counts)
pois_ideal_model <- glm(h_counts ~ ., pois_ideal_data, family=poisson(link=log))
summary_ideal=summary(pois_ideal_model)
summary_ideal

What are you doing here???!!! You used a glm to fit a distribution???
Well, it is not impossible to do so, but it is done via this:
set.seed(0)
x <- rpois(10000,3)
fit <- glm(x ~ 1, family = poisson())
i.e., we fit data with an intercept-only regression model.
fit$fitted[1]
# 3.005
This is the same as:
mean(x)
# 3.005

It looks like you're trying to do a Poisson fit to aggregated or binned data; that's not what glm does. I took a quick look for canned ways of fitting distributions to canned data but couldn't find one; it looks like earlier versions of the bda package might have offered this, but not now.
At root, what you need to do is set up a negative log-likelihood function that computes (# counts)*prob(count|lambda) and minimize it using optim(); the solution given below using the bbmle package is a little more complex up-front but gives you added benefits like easily computing confidence intervals etc..
Set up data:
set.seed(101)
random_pois <- rpois(10000,3)
tt <- table(random_pois)
dd <- data.frame(counts=unname(c(tt)),
val=as.numeric(names(tt)))
Here I'm using table rather than hist because histograms on discrete data are fussy (having integer cutpoints often makes things confusing because you have to be careful about right- vs left-closure)
Set up density function for binned Poisson data (to work with bbmle's formula interface, the first argument must be called x, and it must have a log argument).
dpoisbin <- function(x,val,lambda,log=FALSE) {
probs <- dpois(val,lambda,log=TRUE)
r <- sum(x*probs)
if (log) r else exp(r)
}
Fit lambda (log link helps prevent numerical problems/warnings from negative lambda values):
library(bbmle)
m1 <- mle2(counts~dpoisbin(val,exp(loglambda)),
data=dd,
start=list(loglambda=0))
all.equal(unname(exp(coef(m1))),mean(random_pois),tol=1e-6) ## TRUE
exp(confint(m1))
## 2.5 % 97.5 %
## 2.972047 3.040009

Multiple correlation coefficient in R

I am looking for a way to calculate the multiple correlation coefficient in R http://en.wikipedia.org/wiki/Multiple_correlation, is there a built-in function to calculate it ?
I have one dependent variable and three independent ones.
I am not able to find it online, any idea ?

The easiest way to calculate what you seem to be asking for when you refer to 'the multiple correlation coefficient' (i.e. the correlation between two or more independent variables on the one hand, and one dependent variable on the other) is to create a multiple linear regression (predicting the values of one variable treated as dependent from the values of two or more variables treated as independent) and then calculate the coefficient of correlation between the predicted and observed values of the dependent variable.
Here, for example, we create a linear model called mpg.model, with mpg as the dependent variable and wt and cyl as the independent variables, using the built-in mtcars dataset:
> mpg.model <- lm(mpg ~ wt + cyl, data = mtcars)
Having created the above model, we correlate the observed values of mpg (which are embedded in the object, within the model data frame) with the predicted values for the same variable (also embedded):
> cor(mpg.model$model$mpg, mpg.model$fitted.values)
[1] 0.9111681
R will in fact do this calculation for you, but without telling you so, when you ask it to create the summary of a model (as in Brian's answer): the summary of an lm object contains R-squared, which is the square of the coefficient of correlation between the predicted and observed values of the dependent variable. So an alternative way to get the same result is to extract R-squared from the summary.lm object and take the square root of it, thus:
> sqrt(summary(mpg.model)$r.squared)
[1] 0.9111681
I feel that I should point out, however, that the term 'multiple correlation coefficient' is ambiguous.

The built-in function lm gives at least one version, not sure if this is what you are looking for:
fit <- lm(yield ~ N + P + K, data = npk)
summary(fit)
Gives:
Call:
lm(formula = yield ~ N + P + K, data = npk)
Residuals:
Min 1Q Median 3Q Max
-9.2667 -3.6542 0.7083 3.4792 9.3333
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 54.650 2.205 24.784 <2e-16 ***
N1 5.617 2.205 2.547 0.0192 *
P1 -1.183 2.205 -0.537 0.5974
K1 -3.983 2.205 -1.806 0.0859 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 5.401 on 20 degrees of freedom
Multiple R-squared: 0.3342, Adjusted R-squared: 0.2343
F-statistic: 3.346 on 3 and 20 DF, p-value: 0.0397
More info on what's going on at ?summary.lm and ?lm.

Try this:
# load sample data
data(mtcars)
# calculate correlation coefficient between all variables in `mtcars` using
# the inbulit function
M <- cor(mtcars)
# M is a matrix of correlation coefficient which you can display just by
# running
print(M)
# If you want to plot the correlation coefficient
library(corrplot)
corrplot(M, method="number",type= "lower",insig = "blank", number.cex = 0.6)

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex

How to fit a known linear equation to my data in R? - r

Related

How to calculate the partial R squared for a linear model with factor interaction in R

How do I print the growth rate for an exponential regression model?

How to get probability from GLM output

How to obtain Poisson's distribution "lambda" from R glm() coefficients

Multiple correlation coefficient in R

Categories

Resources