Putting a best fit line through a binscatter in ggplot2 - r

Piggybacking off of this great thread, I am hoping to add a best-fit line through the orange lines from robust's answer. I understand that I can connect the points by adding another stat_summary_bin with geom = "line", but this generates a line that passes through every point. I am looking for a best-fit line through these points instead, if this is even possible.
Thanks!

You can use binsreg to do this: https://nppackages.github.io/binsreg/. binsreg can plot the best fit line or other parametric fits.
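If staying inside ggplot2 is preferred, one possible sketch is to layer geom_smooth(method = "lm") over the binned means. This assumes a straight line fitted to the raw, unbinned observations is what is wanted; the data frame and column names below are made up for illustration, and the fun argument assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

# Simulated data standing in for the asker's data (hypothetical names).
set.seed(1)
df <- data.frame(x = runif(2000, 0, 10))
df$y <- 2 + 0.5 * df$x + rnorm(2000, sd = 3)

p <- ggplot(df, aes(x, y)) +
  # Binned means: one orange point per bin of x.
  stat_summary_bin(fun = "mean", bins = 20, geom = "point",
                   colour = "orange", size = 3) +
  # Least-squares line fitted to the raw observations, not to the bin means.
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, colour = "black")

print(p)
```

Fitting to the raw data keeps the slope identical to what lm(y ~ x) reports; fitting to the bin means instead would weight every bin equally regardless of how many observations it holds.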

Related

3D Line Plot with Datavisualization

I am looking for a way to draw a 3d line plot. Preferably I would like to use the datavisualization framework, but it does not seem to provide this out of the box.
I experimented a little bit and ended up using 3D surface plots (Surface3D) displaying the lines as surfaces (i.e. ribbons) like this:
While this works and looks okay in the picture above, the thickness of the line depends on the perspective. By rotating the plot you can always find an angle at which the line disappears, since it has no thickness:
Is there a type of plot that would be better suited for this? I tested the bars which don't perform well for lots of samples and don't look nice in my application. I also tested scatterplots which are not suitable either.
If there isn't: Where would I start to implement this myself on top of the existing classes in the datavisualization framework? I am thinking about adding another surface "ribbon" in z direction, however that seems a little hackish.
I used the technique described as hackish above. While I am not too happy about the approach the overall look is quite okay:
So basically each data line consists of three QSurfaceDataRows that together form two 90° ribbons as can be seen here:

R - Heatmap from sparse 2d data

I'd like to achieve what this person has achieved without using ggplot. Any ideas?
How do I create a continuous density heatmap of 2D scatter data in R?
You can see what I get when using the solution detailed in that question.
ggplot(df,aes(x=x,y=y))+
stat_density2d(aes(alpha=..level..), geom="polygon") +
scale_alpha_continuous(limits=c(0,1),breaks=seq(0,1,by=0.1))+
geom_point(colour="red",alpha=0.2)+
theme_bw()
The heatmap is so sparse. I want it to cover much more than what it is covering now. It's terribly hard to see anything about the density. Any ideas of different ways to make density heatmaps from 2D data besides this ggplot solution?
One idea I had was instead of using linear color labeling (see the black to white spectrum on the left, which is linear), using logarithmic scale for the density labeling. Any ideas how I could do this?
"The heatmap is so sparse. I want it to cover much more than what it is covering now. It's terribly hard to see anything about the density."
Please be specific: what do you want to see in areas with most or all NAs?
If you use geom_point with alpha blending and position_jitter, the current plot is as good as it gets.
If you want some solid color, use geom_hex(); see http://mfcovington.github.io/r_club/solutions/2013/02/28/peer-produced-plots-solutions/ for code. Then play with the continuous color scale; you probably want a nonlinear transform. Post your revised attempt if you want a critique.
I actually ended up using smoothScatter, which works well and uses classic R plotting.
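A minimal sketch of the smoothScatter approach, combined with the logarithmic density scaling idea from the question. The data are simulated and the helper name log_density is made up for illustration; smoothScatter relies on the KernSmooth package, which ships with standard R installations.

```r
# Simulated sparse 2D data (hypothetical; substitute your own x and y).
set.seed(1)
x <- c(rnorm(500), rnorm(200, mean = 4))
y <- c(rnorm(500), rnorm(200, mean = 3))

# Hypothetical helper: compress high densities so sparse regions stay visible.
log_density <- function(d) log1p(d)

smoothScatter(
  x, y,
  nbin = 200,                                      # resolution of the density grid
  colramp = colorRampPalette(c("white", "black")), # white = low, black = high
  transformation = log_density,                    # replaces the default d^0.25
  nrpoints = 0                                     # do not overplot outlying points
)
```

The transformation argument is where the nonlinear color mapping happens, so any monotone function can be swapped in.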

How to specify equation for regression line in ggplot2?

I want to create a scatterplot in ggplot2 with one or more lines over-layed. Having looked at the documentation for geom_smooth() and geom_line(), it remains unclear to me how I can specify the equations for lines that I want to add to a plot. I understand that this must be very basic, so please feel free simply to point me toward the appropriate documentation that I must have overlooked.
geom_abline() is the name of the geom you're looking for, e.g. geom_abline(aes(intercept=a,slope=b)). There are examples in the online documentation.
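As a rough illustration of the answer above, with made-up data and coefficients: geom_abline() takes a fixed intercept and slope directly, and stat_function() is one way to overlay a curve that is not a straight line.

```r
library(ggplot2)

# Simulated data (hypothetical).
set.seed(1)
df <- data.frame(x = 1:50)
df$y <- 3 + 2 * df$x + rnorm(50, sd = 5)

a <- 3  # intercept of the line to draw
b <- 2  # slope of the line to draw

p <- ggplot(df, aes(x, y)) +
  geom_point() +
  # Straight line y = a + b * x, given as constants rather than aesthetics.
  geom_abline(intercept = a, slope = b, colour = "red") +
  # Arbitrary equation, here y = 3 + 0.04 * x^2, evaluated over the x range.
  stat_function(fun = function(x) 3 + 0.04 * x^2, colour = "blue")

print(p)
```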

R - Scatter plots, how to plot points in different lines to avoid overlapping?

I want to plot several lists of points, each list has distance (decimal) and error_no (1-8). So far I am using the following:
plot(b1$dist1, b1$e1, col="blue",type="p", pch=20, cex=.5)
points(b1$dist2, b1$e2, col="blue", pch=22)
to add them both to the same plot. (I will add legends, etc later on).
The problem I have is that points overlap, and even when changing the character used for plotting, it covers up previous points. Since I am planning on plotting a lot more than just 2, this will be a big problem.
I found some ways in:
http://www.rensenieuwenhuis.nl/r-sessions-13-overlapping-data-points/
But I would rather do something that would space the points along the y axis, one way would be to add .1, then .2, and so on, but I was wondering if there was any package to do that for me.
Cheers
M
ps: if I missed something, please let me know.
As noted in the very first point in the link you posted, jitter will slightly move all your points. If you just want to move the points on the y-axis:
plot(b1$dist1, b1$e1, col="blue",type="p", pch=20, cex=.5)
points(b1$dist2, jitter(b1$e2), col="blue", pch=22)
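If a fixed, reproducible shift is preferred over random jitter, the ".1, then .2" idea from the question can be written directly. This is only a sketch: the b1 data frame below is simulated and the step size is an arbitrary choice.

```r
# Simulated stand-in for the asker's b1 (hypothetical values).
set.seed(1)
b1 <- data.frame(
  dist1 = runif(50), e1 = sample(1:8, 50, replace = TRUE),
  dist2 = runif(50), e2 = sample(1:8, 50, replace = TRUE)
)

step <- 0.1  # vertical shift applied to each additional series

plot(b1$dist1, b1$e1, col = "blue", type = "p", pch = 20, cex = 0.5,
     ylim = c(0.5, 8.5), xlab = "distance", ylab = "error_no")
# Second series sits one step above its true error_no so it stays visible.
points(b1$dist2, b1$e2 + step, col = "red", pch = 22)
```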
It depends a lot on what information you wish to impart to the reader of your chart. A common solution is to use the transparency component of R's color specification. Instead of calling a color "blue", for example, set the color to #0000FF44 (apologies if I just set it to red or green). The final two hex digits define the opacity, from 00 (fully transparent) to FF (fully opaque), so overlapping data points will appear darker than standalone points.
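A short sketch of the transparency idea, using simulated data in place of b1. The hex string can be typed by hand or built with rgb(); #0000FF44 is indeed blue, with an alpha of 0x44 (68 out of 255).

```r
# Simulated stand-in for the asker's b1 (hypothetical values).
set.seed(1)
b1 <- data.frame(
  dist1 = round(runif(300), 1), e1 = sample(1:8, 300, replace = TRUE)
)

# Blue with alpha 68/255; equivalent to writing "#0000FF44" directly.
blue_alpha <- rgb(0, 0, 255, alpha = 68, maxColorValue = 255)

# Points that coincide stack their opacity and therefore look darker.
plot(b1$dist1, b1$e1, col = blue_alpha, pch = 20, cex = 1.5,
     xlab = "distance", ylab = "error_no")
```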
Look at the spread.labs function in the TeachingDemos package, particularly the example. It may be that you can use that function to create your plot (the examples deal with labels, but could just as easily be applied to the points themselves). The key is that you will need to find the new locations based on the combined data, then plot. If the function as is does not do what you want, you could still look at the code and use the ideas to spread out your points.
Another approach would be to restructure your data and use the ggplot2 package with "dodging". Other approaches rather than using points several times would be the matplot function, using the col argument to plot with a vector, or lattice or ggplot2 plots. You will probably need to restructure the data for any of these.

How do I increase the number of evaluation points in geom_smooth for ggplot2 in R

I'm creating a plot and adding a basic loess smooth line to it.
qplot(Age.GTS2004., X.d18O,data=deepsea, geom=c('point')) +
geom_smooth(method="loess",se=T,span=0.01, alpha=.5, fill='light blue',color='navy')
The problem is that the line is coming out really choppy. I need more evaluation points for the curve in certain areas. Is there a way to increase the number of evaluation points without having to reconstruct geom_smooth?
Use the n parameter, as documented in stat_smooth.
Hadley: The documentation leads people astray. geom_smooth does not document that it accepts parameters on behalf of stat_smooth, nor is there any link on that page to stat_smooth for continued reading.
I figured the parameter was buried on some other help page, but I landed here to clue in where.
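For reference, a sketch of how the n parameter from the answer can be passed through geom_smooth. The data frame is simulated and its column names are made up; n defaults to 80 in stat_smooth, and 500 here is an arbitrary choice.

```r
library(ggplot2)

# Simulated stand-in for the deepsea data (hypothetical names and values).
set.seed(1)
deepsea_demo <- data.frame(age = seq(0, 60, length.out = 600))
deepsea_demo$d18O <- sin(deepsea_demo$age / 2) + rnorm(600, sd = 0.2)

p <- ggplot(deepsea_demo, aes(age, d18O)) +
  geom_point() +
  # n is forwarded to stat_smooth: the curve is evaluated at 500 x values.
  geom_smooth(method = "loess", formula = y ~ x, se = TRUE, span = 0.1,
              n = 500, alpha = 0.5, fill = "lightblue", colour = "navy")

print(p)
```

Note that n spaces the evaluation points evenly over the x range, so it raises resolution everywhere rather than only in selected areas.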
