My question might seem very basic, but since I am new to R, I am struggling with it. I would be grateful for any insight.
I need to fit a probit regression of people's thermal preference (desire warmer or desire cooler) on temperature. The preference is therefore my binary dependent variable. I would like a plot with the percentage of respondents desiring warmer/cooler on the y axis and temperature on the x axis. The plot must therefore have two probit curves: one representing desire warmer and the other representing desire cooler. I am interested in the intersection of the two curves.
I would be grateful if someone could help me with the R code for this.
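One possible way to set this up is sketched below. It is only a sketch under stated assumptions: the data are simulated, and the names votes, temp, warmer and cooler are hypothetical stand-ins for the real survey columns. It fits one probit model per preference with glm(), draws both curves as percentages, and finds the crossing temperature with uniroot(). Because the probit link is monotone, the two curves cross where the two linear predictors are equal.

```r
set.seed(42)

# Simulated stand-in data, one row per respondent (all names are hypothetical)
n <- 400
temp <- runif(n, 18, 34)
warmer <- rbinom(n, 1, pnorm((24 - temp) / 2.5))
# A respondent who wants it warmer cannot also want it cooler
cooler <- ifelse(warmer == 1, 0, rbinom(n, 1, pnorm((temp - 29) / 2.5)))
votes <- data.frame(temp, warmer, cooler)

# One probit regression per preference
fit_warm <- glm(warmer ~ temp, family = binomial(link = "probit"), data = votes)
fit_cool <- glm(cooler ~ temp, family = binomial(link = "probit"), data = votes)

# Predicted probabilities over a temperature grid
grid <- data.frame(temp = seq(18, 34, by = 0.1))
grid$p_warm <- predict(fit_warm, grid, type = "response")
grid$p_cool <- predict(fit_cool, grid, type = "response")

plot(grid$temp, 100 * grid$p_warm, type = "l", col = "red", ylim = c(0, 100),
     xlab = "Temperature", ylab = "Respondents (%)")
lines(grid$temp, 100 * grid$p_cool, col = "blue")
legend("top", legend = c("Desire warmer", "Desire cooler"),
       col = c("red", "blue"), lty = 1, bty = "n")

# Difference of the two linear predictors; zero at the intersection
diff_lp <- function(t) {
  nd <- data.frame(temp = t)
  predict(fit_warm, nd, type = "link") - predict(fit_cool, nd, type = "link")
}
cross <- uniroot(diff_lp, interval = c(18, 34))$root
abline(v = cross, lty = 2)
cat("Curves intersect at temperature:", round(cross, 2), "\n")
```

The interval passed to uniroot() has to bracket the crossing, so it should span the observed temperature range.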
I am looking at the relationship between agricultural intensity and functional diversity of birds.
In my GLM model I have included a number of other variables including forest, semi-natural habitat, temperature, pesticides etc.
To check whether my variables are normally distributed, I used a QQ plot, and there appear to be 3 outliers.
I wondered how I would remove these outliers to make my data more normally distributed?
I tried to use the outliers package but all the examples I found failed to work, or I failed to understand how they worked!
Any help would be appreciated. This is my QQ plot for my functional dispersion model and a scatter of functional dispersion x agricultural intensity.
QQ plot
functional dispersion x agriculture scatter
You could remove the observations that appear out of place. Given the number of observations, this is unlikely to change the estimates, but please check that this is indeed the case. Also, when reporting your work, make sure you justify why you removed those points based on your domain knowledge about the variable.
You can remove the observation using
model.data.scaled <- model.data.scaled[model.data.scaled$agri > -5, ]
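To check that the estimates really do not change, the model can be fitted with and without the suspect rows and the coefficients compared. The sketch below uses simulated data: the response column fdis and the three planted extreme values are hypothetical, and only the agri > -5 filter comes from the line above.

```r
set.seed(1)

# Simulated stand-in for model.data.scaled (the column `fdis` is hypothetical)
model.data.scaled <- data.frame(agri = rnorm(300))
model.data.scaled$fdis <- 0.3 * model.data.scaled$agri + rnorm(300, sd = 0.5)

# Overwrite three agri values with extreme ones to mimic the outliers
model.data.scaled$agri[1:3] <- c(-6.2, -5.8, -5.5)

# Same filter as above, kept as a logical index so the removed rows can be counted
keep <- model.data.scaled$agri > -5
trimmed <- model.data.scaled[keep, ]

# Fit the same model on the full and the trimmed data
fit_all  <- glm(fdis ~ agri, data = model.data.scaled)
fit_trim <- glm(fdis ~ agri, data = trimmed)

# Side-by-side coefficients: large differences mean the points are influential
comparison <- cbind(all = coef(fit_all), trimmed = coef(fit_trim))
print(round(comparison, 3))
cat("Rows removed:", sum(!keep), "\n")
```

If the two columns differ noticeably, the removed points were influential and the decision to drop them needs a stronger justification.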
I have constructed a mixed effect model using lmer() with the aim of comparing the growth in reading scores for four different groups of children as they age.
I would like to plot a graph of the 4 different slopes with confidence intervals in R in order to visualize this relationship but I keep getting stuck.
I have tried the plot function and some versions of ggplot, as I have done for previous lm() models, but it isn't working so far. Here is my attempted model, which I hope looks at how the change in reading scores over time (age) interacts with a child's SESDLD grouping (this indicates whether a child has a language problem and whether they are high or low income).
AgeSES.model <- lmer(ReadingMeasure ~ Age.c*SESDLD1 + (1|childid), data = reshapedomit, REML = FALSE)
ReadingMeasure is a continuous score and Age.c is centered age measured in months. SESDLD1 is a categorical measure with 4 levels. I would expect four positive slopes of ReadingMeasure growth with different intercepts and probably differing slopes.
I would really appreciate any pointers on how to do this!
Thank you so much!!
The type of plot I would like to achieve - this was done in Stata
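A minimal sketch of one way to draw the four slopes with confidence bands is shown below. It assumes the lme4 package is installed and uses simulated data with the same variable names as the model above. The group labels A to D and all numeric values are made up. The bands come from the fixed-effect covariance matrix only, so they describe uncertainty in the population-level lines and ignore the child-level random intercept.

```r
library(lme4)
set.seed(1)

# Simulated stand-in for `reshapedomit`: 80 children, 4 waves, 4 groups (values are made up)
n_child <- 80
d <- data.frame(childid = rep(seq_len(n_child), each = 4),
                Age.c   = rep(c(-18, -6, 6, 18), times = n_child))
group <- rep(c("A", "B", "C", "D"), each = n_child / 4)   # one group per child
d$SESDLD1 <- factor(group[d$childid])
b0 <- c(A = 100, B = 95, C = 90, D = 85)
b1 <- c(A = 0.5, B = 0.4, C = 0.3, D = 0.2)
child_eff <- rnorm(n_child, sd = 5)
d$ReadingMeasure <- unname(b0[group[d$childid]] + b1[group[d$childid]] * d$Age.c) +
  child_eff[d$childid] + rnorm(nrow(d), sd = 3)

fit <- lmer(ReadingMeasure ~ Age.c * SESDLD1 + (1 | childid),
            data = d, REML = FALSE)

# Population-level predictions on a grid of ages for each group
grid <- expand.grid(Age.c = seq(min(d$Age.c), max(d$Age.c), length.out = 50),
                    SESDLD1 = levels(d$SESDLD1))
X <- model.matrix(~ Age.c * SESDLD1, data = grid)
V <- as.matrix(vcov(fit))
grid$fit <- as.vector(X %*% fixef(fit))
se <- sqrt(rowSums((X %*% V) * X))        # SE of each fixed-effect prediction
grid$lwr <- grid$fit - qnorm(0.975) * se
grid$upr <- grid$fit + qnorm(0.975) * se

# One line and one shaded band per group
cols <- c("firebrick", "steelblue", "darkgreen", "darkorange")
plot(range(grid$Age.c), range(grid$lwr, grid$upr), type = "n",
     xlab = "Age (centered, months)", ylab = "ReadingMeasure")
for (i in seq_along(levels(grid$SESDLD1))) {
  g <- grid[grid$SESDLD1 == levels(grid$SESDLD1)[i], ]
  polygon(c(g$Age.c, rev(g$Age.c)), c(g$lwr, rev(g$upr)),
          col = adjustcolor(cols[i], alpha.f = 0.2), border = NA)
  lines(g$Age.c, g$fit, col = cols[i], lwd = 2)
}
legend("topleft", legend = levels(grid$SESDLD1), col = cols, lwd = 2, bty = "n")
```

The grid data frame has one row per age and group, so it can also be passed to ggplot with geom_line and geom_ribbon if that style is preferred.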
I'm trying to figure out how to fit a nonlinear regression to some cumulative data of X and Y values. The dataset is based on cumulative items and their respective cumulated demand. I have a plot that looks like this
based on the following observation of 5299 items, which is available here: abc.csv datafile
and I would like to fit a model that can explain it quite neatly. Given the plot, I reckon that there is a high degree of detail. Hence, I would believe that it would be possible to find a function that would explain the data with very high accuracy.
The problem is, however, that I find myself trying to fit a model with nls() by trial and error. Furthermore, some of the functions that I've tried give me some explanation, but not in full detail. For instance
nlm <- nls(Cumfreq ~ c * (1 - exp(-a * noe)) + b, data = abc,
           start = list(a = 4.14, b = 0.21, c = 0.79))
This yields:
My question is: how do I obtain a regression with a better fit? Is there a function in R or another way of achieving this? (fingers crossed for a math genius out there)
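Two things may reduce the trial and error, sketched below on simulated data (abc_sim is a hypothetical stand-in for abc.csv, and the curve parameters used to generate it are arbitrary). First, the model above is the asymptotic regression curve, for which base R ships a self-starting version, SSasymp(), so no start values are needed. Second, a residual plot shows where the curve misses systematically, which hints at what a richer function would have to capture. This does not identify the best function for the real data.

```r
set.seed(7)

# Simulated stand-in for abc.csv (hypothetical; the real data are not reproduced here)
abc_sim <- data.frame(noe = seq(0.005, 1, length.out = 200))
abc_sim$Cumfreq <- 0.21 + 0.79 * (1 - exp(-4.14 * abc_sim$noe)) +
  rnorm(200, sd = 0.005)

# The hand-started fit, written with bare column names
fit_manual <- nls(Cumfreq ~ c * (1 - exp(-a * noe)) + b, data = abc_sim,
                  start = list(a = 4.14, b = 0.21, c = 0.79))

# The same curve in self-starting form:
# Asym + (R0 - Asym) * exp(-exp(lrc) * noe), so Asym = b + c, R0 = b, a = exp(lrc)
fit_ss <- nls(Cumfreq ~ SSasymp(noe, Asym, R0, lrc), data = abc_sim)
print(coef(fit_ss))

# Fitted curve over the data, then residuals to look for systematic misfit
op <- par(mfrow = c(1, 2))
plot(abc_sim$noe, abc_sim$Cumfreq, pch = ".", xlab = "noe", ylab = "Cumfreq")
lines(abc_sim$noe, fitted(fit_ss), col = "red")
plot(abc_sim$noe, resid(fit_ss), pch = ".", xlab = "noe", ylab = "Residual")
abline(h = 0, lty = 2)
par(op)

# Candidate models fitted to the same data can be compared with AIC
print(AIC(fit_manual, fit_ss))
```

On the real data, a clear wave or trend in the residual panel would mean this functional form is too simple, and a different candidate can be compared in the same way.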
I have used the lsmeans package in R to get the estimated mean for each level of my treatment factor, averaged across the levels of a block factor in the experimental design (the block factor is included as a systematic effect because it has only 3 levels). I have used a sqrt transformation for my response variable.
Thus I have used the following commands in R.
First defining model
model <- lm(sqrt(response) ~ treatment + block)
Then applying lsmeans
model_lsmeans<-lsmeans(model,~treatment)
Then plotting this
plot(model_lsmeans,ylab="treatment", xlab="response(with 95% CI)")
This gives a very nice graph with estimates and 95% confidence intervals for the different treatments.
The problem is just that this graph is for the transformed response.
How do I get this same plot with the backtransformed response (so the squared response)?
I have tried to create a new data frame and extract the lsmean, lower.CL, and upper.CL:
a<-summary(model_lsmeans)
New_dataframe<-as.data.frame(a[c("treatment","lsmean","lower.CL","upper.CL")])
And then make these squared
New_dataframe$lsmean<-New_dataframe$lsmean^2
New_dataframe$lower.CL<-New_dataframe$lower.CL^2
New_dataframe$upper.CL<-New_dataframe$upper.CL^2
New_dataframe
This gives me the estimates and CI boundaries squared that I need.
The problem is that I cannot make the same graph for these estimates and CIs as the one I made with lsmeans above.
How can I do this? The reason that I ask is that I want to have graphs that are all of a similar style for my article. Since I very much like this LSmeans plot, and it is very convenient for me to use on the non-transformed response variables, I would like to have all my graphs in this style.
Thank you very much for your help! Hope everything is clear!
Kind regards
Ditlev
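One option that needs no extra packages is to draw the same kind of plot (estimate as a point, interval as a horizontal bar, one row per treatment) directly from the squared data frame. The sketch below assumes the column names of New_dataframe shown above. The numbers are placeholders, not real results, and plot_ci is a hypothetical helper name. Squaring the limits only gives a valid interval if the lower limits on the sqrt scale are non-negative.

```r
# Placeholder values standing in for the squared lsmeans output (not real results)
New_dataframe <- data.frame(
  treatment = c("A", "B", "C"),
  lsmean    = c(4.1, 6.3, 9.0),
  lower.CL  = c(3.2, 5.1, 7.6),
  upper.CL  = c(5.1, 7.6, 10.5)
)

# Hypothetical helper: horizontal interval plot, one row per treatment
plot_ci <- function(df, xlab = "response (with 95% CI)", ylab = "treatment") {
  y <- seq_len(nrow(df))
  plot(df$lsmean, y, type = "n",
       xlim = range(df$lower.CL, df$upper.CL),
       ylim = c(0.5, nrow(df) + 0.5),
       yaxt = "n", xlab = xlab, ylab = ylab)
  axis(2, at = y, labels = as.character(df$treatment), las = 1)
  # Interval as a thick bar, estimate as a point drawn on top of it
  segments(df$lower.CL, y, df$upper.CL, y, lwd = 6, col = "lightsteelblue")
  points(df$lsmean, y, pch = 16)
  invisible(y)   # row positions, returned so more elements can be added
}

plot_ci(New_dataframe)
```

The same function can be called on the summaries of the untransformed responses, so all figures share one style.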
I'm studying the effect of different predictors (dummy, categorical and continuous variables) on the presence of birds, obtained from bird counts at sea. To do that I used the glmmadmb function with a binomial family.
I've plotted the relationship between the response variable and the predictors in order to assess the model fit and the marginal effect of each predictor. To draw the graphs I used the visreg function, specifying the transformation of the vertical axis:
visreg(modelo.bn7, type="conditional", scale="response", ylab= "Bird Presence")
The output graphs showed very wide confidence bands when I used the original scale of the response variable (covering the whole vertical axis). In the graphs without transformation, the confidence bands were narrower, but they had the same width across the different levels of the dummy variables. Does anyone know how the confidence bands are calculated for binomial models? Could this indicate a problem with the estimated coefficients or with the model fit?
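For a binomial model the confidence bands are not derived from p-values. They are normally computed on the scale of the linear predictor (the logit scale, if the default binomial link is used) as the fitted value plus or minus a multiple of its standard error, and with scale="response" the limits are then passed through the inverse link to the probability scale. For a detailed explanation you can ask on stats.stackexchange.com. If the bands are very wide (and what counts as 'wide' is subjective and depends mostly on your goal), your estimates are not very precise. Large standard errors are usually due to a small or insufficient number of observations behind the estimate. If the number of observations is large, wide bands may instead point to a poor fit.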
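As a rough illustration of that calculation, the sketch below builds the band by hand for a plain glm() on simulated data. It swaps in glm() for glmmadmb to stay self-contained, and the predictor depth and all values are hypothetical. The interval is formed on the link scale and then transformed with plogis(), which is why the band is asymmetric on the probability scale and can never leave the range 0 to 1.

```r
set.seed(3)

# Simulated presence/absence data (the predictor `depth` is hypothetical)
d <- data.frame(depth = runif(150, 0, 10))
d$presence <- rbinom(150, 1, plogis(-2 + 0.4 * d$depth))

fit <- glm(presence ~ depth, family = binomial, data = d)

# Predictions and standard errors on the link (logit) scale
nd <- data.frame(depth = seq(0, 10, length.out = 100))
pr <- predict(fit, nd, type = "link", se.fit = TRUE)

# Wald interval on the link scale, back-transformed to probabilities
z <- qnorm(0.975)
nd$fit <- plogis(pr$fit)
nd$lwr <- plogis(pr$fit - z * pr$se.fit)
nd$upr <- plogis(pr$fit + z * pr$se.fit)

plot(nd$depth, nd$fit, type = "l", ylim = c(0, 1),
     xlab = "depth", ylab = "Bird Presence")
lines(nd$depth, nd$lwr, lty = 2)
lines(nd$depth, nd$upr, lty = 2)
```

A standard error of about 3 on the logit scale is already enough for the back-transformed band to span almost the whole 0 to 1 axis, so a band covering the vertical axis usually reflects a very imprecise coefficient.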