r - Adding multiple unrelated nonlinear functions (not fitted) to a scatterplot - r

I have made a scatter plot of raw data. The equation for each quantile line takes the form y = 10^a * x^b. (The quantile equation was fitted on log-transformed data, and the log-scale version was meaningless to the audience.) How do I add a function of this form to the scatter plot of the raw data?
I think this may be a matter of my not knowing the terminology to search for the right method.

a <- 1
b <- 2
# add = TRUE draws on top of the existing scatter plot
curve(10^a * x^b, add = TRUE, from = 0, to = 10)
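To overlay several unrelated curves of this form, one option is to keep the coefficients in a small table and loop over it. The sketch below is only an illustration: the simulated data, the `coefs` values, and the helper name `power_curve` are assumptions and do not come from the original data.

```r
# Simulated raw data (hypothetical stand-in for the real scatter plot data)
set.seed(1)
x <- runif(100, 0.1, 10)
y <- 10^1 * x^2 * exp(rnorm(100, sd = 0.5))

# Hypothetical coefficients for three quantile lines of the form y = 10^a * x^b
coefs <- data.frame(a = c(0.7, 1.0, 1.3), b = c(2, 2, 2))

# Back-transformed power curve
power_curve <- function(x, a, b) 10^a * x^b

plot(x, y)

# Evaluate each curve on a fine grid and draw it over the points
xs <- seq(0.1, 10, length.out = 200)
for (i in seq_len(nrow(coefs))) {
  lines(xs, power_curve(xs, coefs$a[i], coefs$b[i]), col = i + 1, lty = 2)
}
```

Using `lines()` on a precomputed grid avoids relying on how `curve()` looks up `a` and `b` when it is called inside a loop or a function.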

Related

How to convert S3 objects with class roc for ggplot2?

I'm planning to use patchwork to assemble several ROC curves plotted with pROC. After constructing a pROC plot list (of S3: roc objects) and attempting to use wrap_plots(plots) to assemble, I came across the following error:
Error: Only know how to add ggplots and/or grobs
AFAIK, there may be several solutions:
Coerce S3:roc objects to ggplots. It seems the function fortify does this job for S3 objects generated by precrec package but I don't know if S3:roc objects can be done in the same way. Using ggplot2::fortify I ran into
`data` must be a data frame, or other object coercible by `fortify()`, not an S3 object with class roc.
Use precrec to streamline the conversion instead. What holds back my migration is that I want to print the Youden index point, the confidence intervals of the Youden index point, and the area under the curve (AUC) on the plot. It seems only the pROC package meets all my needs, so I would rather not move on. I would also need to adjust my code to meet the parameter requirements of precrec. That is a lot to learn and try, so tutorials and simple code are appreciated.
In any case, my end goal is to assemble all ROC curves programmatically, with automatic annotations. Each ROC curve needs to show its Youden index point, the confidence intervals of the Youden index point, and the area under the curve (AUC) on the plot.
The pROC package has drawbacks, too. The text for the Youden index and confidence interval values is too small relative to the whole plot once all ROC plots are assembled. I can adjust it by specifying par(cex=<text size>), but there is a risk that the text overlaps with the curves or runs out of bounds if it sits too close to the margin. pROC does not automatically reconcile text sizes, curves, and text positions. A smarter package that meets all of the demands above would strongly push me to adopt it for drawing ROC curves. Solutions may therefore vary in my scenario (but please don't recommend using a vector graphics editor to edit these curves by hand, because that is time-consuming, error-prone, and cannot keep up with changing demands from different journals). All insights from all perspectives are appreciated.
Have you tried the ggroc function from pROC? It does exactly what you're asking for: it creates a ggplot2 plot (class gg) which you can then manipulate as you wish.
However, I think you are slightly confused about this point:
Coerce S3:roc objects to ggplots. It seems the function fortify does this job for S3 objects generated by precrec package
It makes sense that the precrec package would be able to convert its own objects. However, note that it doesn't generate a ggplot2 object, but a data.frame with the coordinates of the ROC curve (which can then be used as input for ggplot2).
In pROC, this exact operation is done with the coords function, which extracts the coordinates of the ROC curve into a data.frame that you can then use as input for ggplot2.
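A minimal sketch of how these pieces might fit together is shown below. It assumes pROC 1.16 or later (where `coords()` returns a data.frame by default), plus ggplot2 and patchwork; the helper name `make_roc_plot`, the choice of markers from pROC's bundled `aSAH` data, and the annotation position are illustrative assumptions only.

```r
library(pROC)
library(ggplot2)
library(patchwork)

data(aSAH)  # example data shipped with pROC

# Hypothetical helper: one annotated ggplot per marker
make_roc_plot <- function(marker) {
  r <- roc(aSAH$outcome, aSAH[[marker]], quiet = TRUE)

  # Youden-optimal point(s) as a data.frame
  best <- coords(r, "best", best.method = "youden",
                 ret = c("threshold", "specificity", "sensitivity"))

  # ci.auc() returns lower bound, AUC, upper bound
  auc_ci <- as.numeric(ci.auc(r))
  label <- sprintf("AUC %.2f (95%% CI %.2f-%.2f)",
                   auc_ci[2], auc_ci[1], auc_ci[3])

  ggroc(r) +
    geom_point(data = best, aes(x = specificity, y = sensitivity),
               colour = "red", size = 2) +
    annotate("text", x = 0.35, y = 0.1, label = label, size = 3) +
    ggtitle(marker)
}

plots <- lapply(c("s100b", "ndka", "wfns"), make_roc_plot)
wrap_plots(plots)
```

Because each element of `plots` is an ordinary ggplot, text size and position are controlled through the usual ggplot2 arguments rather than `par(cex = ...)`. pROC also provides `ci.coords()` for bootstrap confidence intervals of the coordinates at a given point, which could feed a similar annotation.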

How to remove box and whiskers from plot() function in R?

I'm trying to do a very simple plot using the plot() function in R where I plot out the weights and diets of chicks in a stratified plot. For other simple plots like this I've been able to just use the plot() function and it's gone fine. But for some reason, R insists on plotting this as a box-and-whiskers chart instead of plotting all the values themselves. I've tried many different solutions I've found on the Web, from box=FALSE to box=0 to bty="n" to type="p" but nothing works. R always plots it as a box-and-whiskers chart. I can use whisklty=0 to get rid of the whiskers, but nothing I've tried (including all possible combinations of the above solutions) will replace the boxes with the actual values I want.
If the x-axis data is categorical, plot will return a boxplot by default. You could run plot.default() instead of plot() and that will give you a plot of points.
Compare, for example:
plot(iris$Species, iris$Petal.Width)
plot.default(iris$Species, iris$Petal.Width)
If you type methods(plot) in the console, you'll see all of the different methods the plot function can dispatch to, depending on the class of the object you give it as its first argument. plot.default is the "method" that gets dispatched when no more specific method applies, for example when you provide plot with two columns of numbers. plot.factor gets dispatched when the x-values are a factor; if the y-values are numeric, it draws boxplots (run ?plot.factor for details). If you do plot(table(mtcars$vs, mtcars$cyl)) the plot.table method gets dispatched. And so on.
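As an illustration with data resembling the question, the sketch below uses the built-in ChickWeight dataset (an assumption; the asker's actual data frame is not shown) and draws every observation per diet in two ways.

```r
# ChickWeight ships with R's datasets package; Diet is a factor,
# which is why plot(ChickWeight$Diet, ChickWeight$weight) draws boxplots
str(ChickWeight$Diet)

# Option 1: a strip chart draws every observation, grouped by the factor
stripchart(weight ~ Diet, data = ChickWeight,
           vertical = TRUE, method = "jitter", pch = 1,
           xlab = "Diet", ylab = "Weight")

# Option 2: convert the factor to its integer codes so that plot()
# falls through to plot.default, then relabel the axis
diet_codes <- as.numeric(ChickWeight$Diet)
plot(diet_codes, ChickWeight$weight,
     xaxt = "n", xlab = "Diet", ylab = "Weight")
axis(1, at = seq_along(levels(ChickWeight$Diet)),
     labels = levels(ChickWeight$Diet))
```

The jitter in the strip chart spreads overlapping points horizontally, which tends to be easier to read than points stacked on a single vertical line.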

LOESS smoothing - geom_smooth() vs loess()

I have some data which I would like to fit with a model. For this example we have been using LOESS smoothing (fewer than 1,000 observations). We applied LOESS smoothing using the geom_smooth() function from the ggplot2 package. So far, so good.
The next step was to obtain the first derivative of the smoothed curve, and as far as we know it is not possible to extract this from geom_smooth(). Thus, we sought to build the model manually using loess() and to extract the first derivative from that fit.
Strangely, however, we observed that the plotted geom_smooth() curve differs from the manually constructed loess() curve. This can be seen in the figure below, with the geom_smooth() curve in red and the loess() curve in orange.
If somebody would be interested, a minimal working reproducible example can be found here.
Would somebody be able to pinpoint why the curves are different? Is this because of the optimization settings of both curves? In order to acquire a meaningful derivative we need to ensure that these curves are identical.
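One way to investigate is to fit `loess()` with explicitly stated settings, predict on an evenly spaced grid, and take a numerical derivative from those predictions. The sketch below is an assumption-laden illustration: the simulated data, the grid size of 80, and the finite-difference derivative are choices made here, not taken from the question. Plotting `grid$fit` against a `geom_smooth(method = "loess", span = 0.75)` layer built from the same data is one way to check whether differing spans or evaluation points explain the mismatch.

```r
# Simulated data (hypothetical stand-in for the real observations)
set.seed(42)
d <- data.frame(x = seq(0, 10, length.out = 200))
d$y <- sin(d$x) + rnorm(200, sd = 0.2)

# Fit with the settings spelled out (these are the stats::loess defaults)
fit <- loess(y ~ x, data = d, span = 0.75, degree = 2)

# Predict on an evenly spaced grid spanning the data range
grid <- data.frame(x = seq(min(d$x), max(d$x), length.out = 80))
grid$fit <- predict(fit, newdata = grid)

# First derivative by finite differences, located at interval midpoints
dydx  <- diff(grid$fit) / diff(grid$x)
x_mid <- head(grid$x, -1) + diff(grid$x) / 2

plot(d$x, d$y, col = "grey")
lines(grid$x, grid$fit, col = "orange", lwd = 2)
```

A denser grid gives a smoother derivative estimate, at the cost of more calls into the fitted model.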

Scatterplot:car extracting fits

Is there a way to extract the functions that are used when you plot with scatterplot from car?
Example:
require(car)
scatterplot(x~y)
What it produces by default is a scatterplot with four lines, one for linear regression, 2 for residuals, and one (full red line) for a function that fits the data.
Now I want to know which function is used to produce the red line. Not this specific one, but generally is there a way to obtain this function?
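One possible approach is to fit comparable models directly and work with those objects. The sketch below assumes the straight line is an ordinary least-squares fit and the curved line is a loess-type smoother; it uses simulated data and base R defaults, so it will not necessarily reproduce the exact smoother settings of car (see ?scatterplot for the defaults in your installed version).

```r
# Simulated data (hypothetical)
set.seed(1)
x <- runif(100, 0, 10)
y <- 2 + 0.5 * x + sin(x) + rnorm(100, sd = 0.3)

# Fit the two curves yourself so the model objects are available
lin_fit    <- lm(y ~ x)      # straight regression line
smooth_fit <- loess(y ~ x)   # nonparametric smooth (base R defaults)

plot(x, y)
abline(lin_fit, col = "green")
ord <- order(x)
lines(x[ord], predict(smooth_fit)[ord], col = "red")

# The linear fit has explicit coefficients; the smoother has no
# closed-form equation, but it can be evaluated at any x in range
coef(lin_fit)
smooth_at_5 <- predict(smooth_fit, newdata = data.frame(x = 5))
```

A loess curve is defined by local fits rather than a single formula, so "obtaining the function" in practice means keeping the fitted object and calling `predict()` on it.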

Splitting lme residual plot into separate boxplots

Using the basic plot function (plot.intervals.lmList) from an lme model (called meef1), I produced a massive graph of boxplots. My vector v2andv3commoditycombined has 98 levels.
plot(meef1, v2andv3commoditycombined~resid(.))
I would like to separate by the grouping values of my variable v2andv3commoditycombined to either graph them separately, order them, or exclude some. I'm not sure if there is code to do this or if I have to extract information from the lme output. If that is the case, I'm not sure what to extract to create the boxplots as extracting the residuals returns only one value for each level. If this is impossible, any advice on how to space out the commodity names would be equally helpful.
Thank you.
For each level of v2andv3commoditycombined, what exactly would you like your Y axis and your X axis to be? Since you're splitting the plots by v2andv3commoditycombined, you obviously can't also use that as one of your axes.
Let's pretend you just want to plot the traditional residuals on the Y axis and fitted values on the X axis, in a separate plot for each of the 98 levels. You can change the code to plot whatever it is you actually want to plot.
As per ?plot.lme, you would do something like this:
plot(meef1, resid(., type = "pearson", level = 1) ~ fitted(., level = 1) | v2andv3commoditycombined)
Make sure you stretch out your plot window beforehand so that it's nice and big, otherwise you might get an error saying something about margins. The following might produce a better-looking plot:
plot(meef1, resid(., type = "pearson", level = 1) ~ fitted(., level = 1) | v2andv3commoditycombined, pch = ".", cex = 1.5, abline = 0)
Since it wasn't clear from your question I went ahead and assumed you're interested in the individual level residuals (i.e. how much each datapoint differs from the predicted value given its random variables), and that you have one level of nesting in your random formula. If you want population residuals (i.e. how much each datapoint differs from the average predicted value), change both instances of level to say level=0. If you have K levels of nesting, change them to level=K and good luck.
I also assumed you wanted standardized residuals (because you can use the convenient rule of thumb that absolute values greater than 3 are possible outliers, regardless of what scale the original data are on). If not, see ?residuals.lme for other valid options for the type argument.
Oh, and the name of your variables suggests that you're looking at some sort of financial time series. If so, have a look at ACF(meef1) to see if there is a lot of autocorrelation. If there is, you could remedy it by instead fitting a model where the response (Y) variable is diff(...) the original variable. If you're seeing really skewed residuals, you might consider log-transforming your response variable before taking the diff.
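If the goal is boxplots of residuals per group that can be ordered or filtered, another option is to pull the per-observation residuals into a data frame and plot from there. The sketch below uses the Orthodont data that ships with nlme as a stand-in for the real model; the object names `fm`, `res`, and `keep`, and the two excluded subjects, are illustrative assumptions.

```r
library(nlme)

# Stand-in model: Orthodont ships with nlme (27 subjects, 108 rows)
fm <- lme(distance ~ age, data = Orthodont, random = ~ 1 | Subject)

# resid() on an lme fit returns one residual per observation
res <- data.frame(
  group = factor(Orthodont$Subject, ordered = FALSE),
  resid = as.numeric(resid(fm, type = "pearson"))
)

# Order the groups by their median residual
res$group <- reorder(res$group, res$resid, FUN = median)

# Exclude some groups (hypothetical choice) and drop the unused levels
keep <- droplevels(subset(res, !group %in% c("M09", "F10")))

# Horizontal boxplots with small labels leave room for many group names
boxplot(resid ~ group, data = keep, horizontal = TRUE,
        las = 1, cex.axis = 0.6, xlab = "Pearson residual", ylab = "")
abline(v = 0, lty = 2)
```

With 98 levels, splitting `keep` into several subsets and drawing one boxplot panel per subset may be more legible than a single tall plot.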
