Plotting a fitted quadratic curve in R - r

I have some data which I have fit a quadratic curve to using
model<-lm(Frequency ~ poly(Distance, 2, raw=TRUE))
I want to then draw this curve on the scatterplot of my data. I've tried using
lines(predict(model))
based on some information I found online, but it doesn't work quite right: the resulting curve is squished into the left side of the plot.
Ignore the regression line.
I believe the problem is that my variable Distance is a set of values each 5 greater than the previous, and that when I plot the curve it ignores this and plots using increments of 1. What I'm not sure of is how to fix it. Any help would be appreciated.
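A minimal sketch of one way around this, assuming the diagnosis above is right: lines(predict(model)) receives only y-values, so R plots them against the index 1, 2, 3, ... instead of against Distance. Passing the x-values explicitly avoids that. The data below are made up for illustration (the real Distance and Frequency were not posted), and grid and fitted_curve are hypothetical names.

```r
# Made-up data standing in for the poster's: Distance in steps of 5
set.seed(1)
Distance  <- seq(5, 100, by = 5)
Frequency <- 50 - 0.8 * Distance + 0.005 * Distance^2 +
  rnorm(length(Distance), sd = 2)

model <- lm(Frequency ~ poly(Distance, 2, raw = TRUE))

plot(Distance, Frequency)

# Predict on a fine grid of Distance values and give lines() both x and y,
# so the curve is drawn against Distance rather than against the index
grid <- data.frame(Distance = seq(min(Distance), max(Distance), length.out = 200))
fitted_curve <- predict(model, newdata = grid)
lines(grid$Distance, fitted_curve, col = "blue", lwd = 2)
```

If a smooth curve is not needed, lines(Distance, predict(model)) should do the same job using only the observed Distance values (provided they are sorted).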

Related

Plotting a probability mass function for a poisson distribution

Suppose that I have a Poisson distribution with a mean of 6. I would like to plot its probability mass function with an overlay of the approximating normal density.
This is what I have tried:
plot( dpois( x=0:10, lambda=6 ))
This produces
which is wrong, since it doesn't contain an overlay of the approximating normal density.
How do I go about this?
Something like what you seem to be asking for (I'm outlining the commands and the basic ideas; checking the help on the functions and experimenting should fill in the remaining details):
take a wider range of x-values (out to at least 13 or so) and use xlim to extend the plot slightly into the negatives (maybe to -1.5), and
plot the pmf of the Poisson with solid dots (similar to your command but with pch=16 as an argument to plot) in a suitable colour, then
call points with the same x and y arguments as above and with type="h" and lty=3 to get vertical dotted lines (to give a clear impression of the relative heights, somewhat akin to the appearance of a Cleveland dot-chart); I'd use the same colour as the dots or a slightly lighter/greyer version of the dot-colour
use curve to draw the normal curve with the same mean and standard deviation as the Poisson with mean 6 (the Wikipedia page for the Poisson gives the mean and variance; both equal 6 here, so the standard deviation is sqrt(6)), but across the wider range we plotted; I'd use a slightly contrasting colour for that.
I'd also draw a light x-axis (e.g. using abline with the h argument).
Putting all those suggestions together:
(However, while it's what you're asking for, it's not strictly a suitable way to compare discrete and continuous variables: a density and a pmf are not on the same scale, because density is not probability. The "right" comparison between a Poisson and an approximating normal would be on the scale of the cdfs, so that you compare like with like; both would then be on the scale of probabilities.)
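The steps outlined in this answer could be sketched roughly as follows. The colours, the upper x-limit of 14 and the variable names are my own choices for illustration, not something the answer specifies.

```r
lambda <- 6
x   <- 0:14                      # wider range than 0:10
pmf <- dpois(x, lambda)

# Poisson pmf as solid dots, with the x-axis extended slightly below zero
plot(x, pmf, pch = 16, col = "steelblue",
     xlim = c(-1.5, 14), ylim = c(0, max(pmf) * 1.1),
     xlab = "x", ylab = "Probability / density")

# Vertical dotted lines from the axis up to each dot
points(x, pmf, type = "h", lty = 3, col = "steelblue")

# Normal density with the same mean and standard deviation as the Poisson
curve(dnorm(x, mean = lambda, sd = sqrt(lambda)),
      from = -1.5, to = 14, add = TRUE, col = "darkorange")

# Light horizontal axis at zero
abline(h = 0, col = "grey")
```

Note that x is passed explicitly to plot here; plot(dpois(x=0:10, lambda=6)) on its own plots against the index 1 to 11 instead of the values 0 to 10.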

R - locate intersection of two curves

There are a number of questions in this forum on locating intersections between a fitted model and some raw data. However, in my case, I am in an early stage project where I am still evaluating data.
To begin with, I have created a data frame that contains a ratio value whose ideal value should be 1.0. I have plotted the data frame and also used abline() function to plot a horizontal line at y=1.0. This horizontal line and the plot of ratios intersect at some point.
plot(a$TIME.STAMP, a$PROCESS.RATIO,
xlab='Time (5s)',
ylab='Process ratio',
col='darkolivegreen',
type='l')
abline(h=1.0,col='red')
My aim is to locate the intersection point, say x, and draw two vertical lines at x±k, as abline(v=x-k) and abline(v=x+k), where k is a band of tolerance.
Applying a grid on the plot is not really an option because this plot will be a part of a multi-panel plot. And, because ratio data is very tightly laid out, the plot will not be too readable. Finally, the x±k will be quite valuable in my discussions with the domain experts.
Can you please guide me how to achieve this?
Here are two solutions. The first one uses locator() and will be useful if you do not have too many charts to produce:
x <- 1:5
y <- log(1:5)
df1 <-data.frame(x= 1:5,y=log(1:5))
k <-0.5
plot(df1,type="o",lwd=2)
abline(h=1, col="red")
locator()
By clicking on the intersection (and stopping the locator top left of the chart), you will get the intersection:
> locator()
$x
[1] 2.765327
$y
[1] 1.002495
You would then add abline(v=2.765327).
If you need a more programmable way of finding the intersection, we will have to estimate the function of your data. Unfortunately, you haven’t provided us with PROCESS.RATIO, so we can only guess what your data looks like. Hopefully, the data is smooth. Here’s a solution that should work with nonlinear data. As you can see in the previous chart, all R does is draw a line between the dots. So, we have to fit a curve in there. Here I’m fitting the data with a polynomial of order 2. If your data is less linear, you can try increasing the order (2 here). If your data is linear, use a simple lm.
fit <-lm(y~poly(x,2))
newx <-data.frame(x=seq(0,5,0.01))
fitline = predict(fit, newdata=newx)
est <-data.frame(newx,fitline)
plot(df1,type="o",lwd=2)
abline(h=1, col="red")
lines(est, col="blue",lwd=2)
Using this fitted curve, we can then find the closest point to y=1. Once we have that point, we can draw vertical lines at the intersection and at +/-k.
cross <-est[which.min(abs(1-est$fitline)),] # row of est whose fitted value is closest to 1
plot(df1,type="o",lwd=2)
abline(h=1)
abline(v=cross$x, col="green")      # estimated intersection
abline(v=cross$x-k, col="purple")   # lower tolerance band
abline(v=cross$x+k, col="purple")   # upper tolerance band
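As an alternative to fitting a curve, here is a sketch of a different technique: since type="l" or type="o" just joins the points with straight segments, the crossing can be found by linear interpolation on the segment where the data pass through the target value. This is not the method used above; it assumes the x-values are sorted and that no data point lies exactly on the target. The names target, d, i and x_cross are hypothetical.

```r
x <- 1:5
y <- log(1:5)
target <- 1      # height of the horizontal line
k <- 0.5         # tolerance band

# Segments where (y - target) changes sign contain a crossing
d <- y - target
i <- which(d[-length(d)] * d[-1] < 0)

# Linear interpolation within each crossing segment
x_cross <- x[i] + (target - y[i]) * (x[i + 1] - x[i]) / (y[i + 1] - y[i])

plot(x, y, type = "o", lwd = 2)
abline(h = target, col = "red")
abline(v = x_cross, col = "green")
abline(v = c(x_cross - k, x_cross + k), col = "purple")
```

If the series crosses the line more than once, x_cross holds every crossing, so you would need to pick the one you want before drawing the bands.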

Understanding what the kde2d z values mean?

I have two data sets that I am comparing using a kde2d contour plot on a log10 scale.
Here I will use the following example data sets:
library(MASS) # kde2d comes from the MASS package
b<-log10(rgamma(1000,6,3))
a<-log10((rweibull(1000,8,2)))
density<-kde2d(a,b,n=100)
filled.contour(density,color.palette=colorRampPalette(c('white','blue','yellow','red','darkred')))
This produces the following plot,
Now my question is: what do the z values on the legend actually mean? I know they represent where most of the data lies, but the range 0-15 confuses me. I thought it could be a percentage, but without the log10 scale I get values ranging from 0-1? And I have also produced plots with scales 1-1.2 and 1-2 using my real data.
The colors represent the values of the estimated density function, which here apparently range from 0 to 15. Just like with your other question about the odd looking linear regression I can relate to your confusion.
You just have to understand that a density's integral over the full domain has to be 1, so you can use it to calculate the probability of an observation falling into a specific region. The z values themselves are not probabilities or percentages, and they can be well above 1 when the data are concentrated in a small area, as they are on the log10 scale.
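One way to convince yourself of this is to approximate the integral numerically from the kde2d output. The sketch below uses the example data from the question (with a seed added so it is reproducible) and a simple Riemann sum over the grid; dens, dx, dy and total are hypothetical names. The sum should come out close to 1, though not exactly, because the grid is finite and by default only spans the range of the data.

```r
library(MASS)  # for kde2d

set.seed(42)
b <- log10(rgamma(1000, 6, 3))
a <- log10(rweibull(1000, 8, 2))
dens <- kde2d(a, b, n = 100)

# Width and height of one grid cell
dx <- diff(dens$x[1:2])
dy <- diff(dens$y[1:2])

# Riemann-sum approximation of the integral of the density over the grid
total <- sum(dens$z) * dx * dy
total

# The peak density value itself; it is not bounded by 1
max(dens$z)
```

The same idea gives the probability of a region: sum the z values of the grid cells inside that region and multiply by dx * dy.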

Using Matlab, how does the visual geometric angle of a regression line change as I alter the axes of the graph?

I know that you can adjust the scale of the x and y axes to change the geometric angle of a regression line. For example, if you plotted a regression line with slope of b=0.3, perhaps the default settings of axes length etc. would create a regression angle of 35 degrees.
If you adjust the axes, you will change the angle the regression line makes with the x-axis so that it is greater or less than 35 degrees WITHOUT changing the mathematical value of the slope: it will still be b=0.3.
What systematic equation/set of equations is there that allows me to know how the geometric angle of the regression line will be changed as I change the axes of the graph itself?
I have spent a lot of time on the internet looking for the answer to this and have not yet succeeded. For some reason statistics and geometry do not overlap much.
Refer to this web page: http://www.mathworks.in/help/matlab/ref/axis.html
Based on the data you have, set the same ranges for all the axes in your plot. Then the regression line would have the same angle for both the datasets.
Hope this helps!
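For what it's worth, the relationship the question asks about can be written down directly: the on-screen angle depends on the slope and on how many screen units each axis devotes to one data unit. If the plot box is W wide and H tall (in the same physical units), and the axes span xmin..xmax and ymin..ymax, then angle = atan(b * (H / (ymax - ymin)) / (W / (xmax - xmin))). Below is a small sketch of that formula, written in R only to keep the examples on this page in one language; visual_angle is a hypothetical helper, not a Matlab or R built-in.

```r
# Hypothetical helper: on-screen angle (in degrees) of a line with slope b,
# given the axis limits and the physical width/height of the plot box
visual_angle <- function(b, xlim, ylim, width, height) {
  x_scale <- width  / diff(xlim)   # screen units per data unit along x
  y_scale <- height / diff(ylim)   # screen units per data unit along y
  atan(b * y_scale / x_scale) * 180 / pi
}

# Equal scaling on both axes: the angle is just atan(0.3), about 16.7 degrees
visual_angle(0.3, xlim = c(0, 10), ylim = c(0, 10), width = 5, height = 5)

# Halving the y-range stretches the line vertically: about 31 degrees
visual_angle(0.3, xlim = c(0, 10), ylim = c(0, 5), width = 5, height = 5)
```

This is also why fixing the same axis ranges (and the same plot box size) across plots keeps the angle the same.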

GraphPad Mann Whitney scatter plot in R

I am trying to make a plot similar to the top three plots of this:
I found a partial answer here, however I am unsure how to add the p-values in the scatter-plot.
Any tips?
You've already got a partial answer. If you just want to know how to put p-values on then use text. (looking at graph C).
text(x = 1.5, y = 73, 'p = 0.03')
If you want the p-values and the lines underneath, assuming you also want those caps on the lines, use arrows instead of segments.
arrows(1, 70, 2, 70, length = 0.05, angle = 90, code = 3) # length is the cap size in inches; angle = 90 makes flat caps
If you're sticking with solving this in base R that's a great learning exercise and can give you full control over your plot. However, if you just want to get it done I'd suggest the beeswarm package (you're making beeswarm plots).
As an aside, this prompted me to investigate why you get those upward curving lines in beeswarm plots. It's a consequence of the typical algorithm. The line curves upward because the positions are calculated through increasing y-values. If the next y-value is so close that the points would overlap in the y-axis it's plotted at an angle off the x position. Many points close together on Y results in upward curving lines until you get far enough along Y to go back to X. Smaller points should alleviate that. Also, the beeswarm package in R has several optional algorithms that avoid that as well.
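Putting the base R pieces together, a minimal sketch might look like the following. The data are simulated, the group names and positions are made up, and I'm assuming the p-value should come from wilcox.test (the Mann-Whitney test in R); adjust to whatever test you actually ran.

```r
# Simulated data for two groups (stand-ins for the real measurements)
set.seed(7)
g1 <- rnorm(15, mean = 50, sd = 8)
g2 <- rnorm(15, mean = 60, sd = 8)
values <- c(g1, g2)
group  <- factor(rep(c("A", "B"), each = 15))

# Mann-Whitney (Wilcoxon rank-sum) p-value
p <- wilcox.test(g1, g2)$p.value

# Jittered scatter plot, leaving headroom above the data for the annotation
top <- max(values)
stripchart(values ~ group, vertical = TRUE, method = "jitter", pch = 16,
           ylim = c(min(values), top * 1.15), ylab = "Value")

# Bracket with flat caps between group 1 and group 2, and the p-value above it
arrows(1, top * 1.05, 2, top * 1.05, length = 0.05, angle = 90, code = 3)
text(x = 1.5, y = top * 1.1, labels = paste("p =", signif(p, 2)))
```

The groups sit at x = 1 and x = 2 in a stripchart, which is why the bracket runs between those positions.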

Resources