Plotting Difference-in-Differences Interaction Effect with a plm Model in R

I have a plm model with two dummy independent variables (treatments), each interacted with another dummy variable, and fixed effects estimated with the within estimator. The resulting coefficients tell me the difference in the interaction effect for each treatment. I now want to plot the difference-in-differences implied by these coefficients, to show how the interaction effect differs between the two treatments. For the plot, the data are not panel data, so the x axis would not be time but the interaction itself (i.e. t=1 (no interaction), t=2 (interaction)). However, I cannot find a way to produce a difference-in-differences plot like this from the fitted model. I would like to keep using plm() because of the fixed effects I have built in, and I feel there should be a way to make this DID plot even if the x axis is the interaction rather than a time variable.
I have tried various suggestions for predicted values and other ways to build the plot, but none have worked so far. The closest I have come to visualizing the effect is simple box plots of the group means, but I definitely want the DID plot to be based on the regression output, so those will not suffice.
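For what it's worth, here is a minimal sketch of one way to build the plot by hand from the coefficients. The names are assumptions, not your actual variables: mod is the fitted plm object, post is the dummy whose interactions you care about, and treat1:post / treat2:post are the two interaction terms (check names(coef(mod)) and adjust). Because the within estimator absorbs each group's level, the plot shows each group's predicted change relative to its own baseline at t=1:

b <- coef(mod)   # hypothetical coefficient names below

# predicted change from t=1 to t=2 implied by the model, per group
eff <- c(control = unname(b["post"]),
         treat1  = unname(b["post"] + b["treat1:post"]),
         treat2  = unname(b["post"] + b["treat2:post"]))

plot(NA, xlim = c(1, 2), ylim = range(0, eff), xaxt = "n",
     xlab = "", ylab = "predicted change in outcome")
axis(1, at = 1:2, labels = c("t=1 (no interaction)", "t=2 (interaction)"))
for (i in seq_along(eff)) lines(1:2, c(0, eff[i]), col = i, lwd = 2)
legend("topleft", legend = names(eff), col = seq_along(eff), lwd = 2)

The vertical gaps between the treatment lines and the control line at t=2 are exactly the interaction coefficients, so this is the regression-based DID picture rather than a plot of raw group means.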

Related

How to produce this figure of thin-plate splines fit to observed data with contours plotted?

How does one reproduce this figure from Elements of Statistical Learning page 166?
I understand that a regression is fit to the data using age and obesity as features. But I am wondering how to represent the fitted surface using contours as is shown in the figure. I would prefer an R implementation because I believe that is what was used in this case.
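Not an authoritative reconstruction, but a minimal sketch of that kind of figure, assuming the South African heart disease data sit in a data frame d with columns sbp (the response in that figure), age and obesity; mgcv's s(..., bs = "tp") fits a thin-plate regression spline:

library(mgcv)

# d is an assumed stand-in for the heart disease data
fit <- gam(sbp ~ s(age, obesity, bs = "tp"), data = d)

ag <- seq(min(d$age), max(d$age), length.out = 50)
ob <- seq(min(d$obesity), max(d$obesity), length.out = 50)
grid <- expand.grid(age = ag, obesity = ob)

# expand.grid varies age fastest, so filling the matrix column-wise
# lines the fitted values up with the (age, obesity) grid
z <- matrix(predict(fit, newdata = grid), nrow = length(ag))

contour(ag, ob, z, xlab = "age", ylab = "obesity")
points(d$age, d$obesity, pch = 20)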

LOESS smoothing - geom_smooth() vs loess()

I have some data which I would like to fit with a model. For this example we have been using LOESS smoothing (we have fewer than 1,000 observations). We applied the LOESS smoothing using the geom_smooth() function from the ggplot2 package. So far, so good.
The next step was to obtain the first derivative of the smoothed curve, and as far as we know this cannot be extracted from geom_smooth(). We therefore sought to build the model manually with loess() and take the first derivative from that.
Strangely, however, we observed that the plotted geom_smooth() curve differs from the manually constructed loess() curve. This can be seen in the figure below: geom_smooth() in red and loess() in orange.
If somebody would be interested, a minimal working reproducible example can be found here.
Would somebody be able to pinpoint why the curves are different? Is this because of the optimization settings of both curves? In order to acquire a meaningful derivative we need to ensure that these curves are identical.
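A hedged guess at the cause, with a sketch: geom_smooth(method = "loess") calls stats::loess() under the hood with span = 0.75 by default and then predicts on a grid of 80 evenly spaced x values, whereas plotting the fitted values of a hand-rolled loess() at the raw data points (or with a different span) traces a different curve. Matching the span and the evaluation grid should make the two coincide; df with columns x and y is an assumed stand-in for your data:

library(ggplot2)

fit  <- loess(y ~ x, data = df, span = 0.75)   # same span geom_smooth() uses by default
grid <- data.frame(x = seq(min(df$x), max(df$x), length.out = 80))   # geom_smooth()'s default n = 80
grid$y <- predict(fit, newdata = grid)

ggplot(df, aes(x, y)) +
  geom_point() +
  geom_smooth(method = "loess", span = 0.75, se = FALSE, colour = "red") +
  geom_line(data = grid, colour = "orange", linetype = "dashed")

# once the curves agree, a numerical first derivative is one line:
dydx <- diff(grid$y) / diff(grid$x)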

scatterplot() from car: extracting fits

Is there a way to extract the functions that are used when you plot with scatterplot() from car?
Example:
require(car)
scatterplot(x~y)
What it produces by default is a scatterplot with four lines: one for the linear regression, two for the residuals, and one (a solid red line) for a function that fits the data.
Now I want to know which function is used to produce the red line. Not this specific one, but in general: is there a way to obtain this function?
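As a hedged sketch of an answer: in car the red line is drawn by the smoother passed to scatterplot() (loessLine by default; see ?scatterplot and ?ScatterplotSmoothers), which is essentially a call to stats::loess(). You can refit it yourself, with the caveat that the exact defaults depend on your car version:

require(car)
scatterplot(x ~ y)

# span = 2/3, degree = 1 and family = "symmetric" are my best guess at
# loessLine's defaults -- verify against ?ScatterplotSmoothers for the
# car version you have
fit <- loess(x ~ y, span = 2/3, degree = 1, family = "symmetric")
ord <- order(y)
lines(y[ord], fitted(fit)[ord], col = "blue", lwd = 2)   # should track the red line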

Why does my linear regression fit line look wrong?

I have plotted a 2-D histogram in a way that I can add to the plot with lines, points etc.
Now I seek to apply a linear regression fit to the region of dense points; however, my linear regression line seems totally off from where it should be.
To demonstrate here is my plot on the left with both a lowess regression fit and linear fit.
lines(lowess(na.omit(a),na.omit(b),iter=10),col='gray',lwd=3)
abline(lm(b[cc]~a[cc]),lwd=3)
Here a and b are my values and cc are the points within the densest parts (i.e. most points lay there), red+yellow+blue.
Why doesn't my regression line look more like the one on the right (the hand-drawn fit)? If I were drawing a line of best fit by hand, that is where it would go.
I have numerous plots similar to this one, and I still get the same result.
Are there any alternative linear regression fits that could prove to be better for me?
A linear regression fits a linear function to a set of points (observations) by minimizing the squared vertical errors, i.e. the distances measured parallel to the y axis.
Now imagine a heatmap whose shape would clearly be fit best by a vertical line; your heatmap turned about 10 degrees counter-clockwise is exactly that. How would a linear function y = a + bx have to be defined for its graph to be vertical? It cannot be: the slope would need to be infinite.
The result of this little thought experiment is that linear regression is being asked to do something it is not designed for. What you most likely want is, as Gavin Simpson already indicated, the first principal component direction, which minimizes perpendicular rather than vertical distances to the line.
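A minimal sketch of overlaying that first-principal-component (total least-squares) line, reusing a, b and cc from the question:

xy <- cbind(a[cc], b[cc])            # points in the dense region
pc <- prcomp(xy)                     # prcomp() centers the data by default
v  <- pc$rotation[, 1]               # direction of the first principal component
ctr   <- colMeans(xy)
slope <- v[2] / v[1]                 # undefined if the PC is exactly vertical
abline(a = ctr[2] - slope * ctr[1], b = slope, col = "red", lwd = 3)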

Splitting lme residual plot into separate boxplots

Using the basic plot function (plot.intervals.lmList) on an lme model (called meef1), I produced a massive graph of boxplots. My grouping factor v2andv3commoditycombined has 98 levels.
plot(meef1, v2andv3commoditycombined~resid(.))
I would like to separate by the grouping values of my variable v2andv3commoditycombined to either graph them separately, order them, or exclude some. I'm not sure if there is code to do this or if I have to extract information from the lme output. If that is the case, I'm not sure what to extract to create the boxplots as extracting the residuals returns only one value for each level. If this is impossible, any advice on how to space out the commodity names would be equally helpful.
Thank you.
For each level of v2andv3commoditycombined, what exactly would you like your Y axis and your X axis to be? Since you're splitting the plots by v2andv3commoditycombined, you obviously can't also use that as one of your axes.
Let's pretend you just want the traditional residuals on the Y axis and fitted values on the X axis, in a separate plot for each of the 98 levels. You can change the code to plot whatever it is you actually want to plot.
As per ?plot.lme, you would do something like this:
plot(meef1,resid(.,type='pearson',level=1)~fitted(.,level=1)|v2andv3commoditycombined);
Make sure you stretch out your plot window beforehand so that it's nice and big, otherwise you might get an error saying something about margins. The following might produce a better-looking plot:
plot(meef1,resid(.,type='pearson',level=1)~fitted(.,level=1)|v2andv3commoditycombined,pch='.',cex=1.5,abline=0);
Since it wasn't clear from your question, I went ahead and assumed you're interested in the individual-level residuals (i.e. how much each datapoint differs from the predicted value given its random effects), and that you have one level of nesting in your random formula. If you want population residuals (i.e. how much each datapoint differs from the average predicted value), change both instances of level to level=0. If you have K levels of nesting, change them to level=K and good luck.
I also assumed you wanted standardized residuals (because you can use the convenient rule of thumb that absolute values greater than 3 are possible outliers, regardless of what scale the original data are on). If not, see ?residuals.lme for other valid options for the type argument.
Oh, and the names of your variables suggest that you're looking at some sort of financial time series. If so, have a look at ACF(meef1) to see if there is a lot of autocorrelation. If there is, you could remedy it by instead fitting a model where the response (Y) variable is the diff(...) of the original variable. If you're seeing really skewed residuals, you might also consider log-transforming the response before taking the diff.
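If you also want to order the levels or exclude some, as you asked, one option is to pull the residuals out and build the boxplots yourself. A sketch, assuming meef1 was fitted with a data argument so the grouping variable can be read back from meef1$data:

library(ggplot2)

d <- data.frame(resid = resid(meef1, type = "pearson", level = 1),
                grp   = meef1$data$v2andv3commoditycombined)

d$grp <- reorder(d$grp, d$resid, FUN = median)   # order levels by median residual
keep  <- tail(levels(d$grp), 20)                 # e.g. keep only the 20 highest

ggplot(subset(d, grp %in% keep), aes(grp, resid)) +
  geom_boxplot() +
  coord_flip()                                   # flipped axes keep the long names readable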
