I used ggplot2 to draw a trend line based on my data.
Below is something I did in a spreadsheet.
But I only want to show the trend line (the black line in the upper plot) rather than all the dots, because the number of observations is > 20,000.
So I tried to do the same thing using ggplot2.
fig_a <- ggplot(df1, aes(data_x, data_y ))
fig_a + stat_smooth(method=lm)
fig_a + stat_smooth(method=gam)
Apparently it does not work well. Can anyone help?
Why does it give so many lines rather than a single trend line?
You can do the following. Add + geom_smooth(method = "lm") to your ggplot script.
Example using built-in data
ggplot(mpg, aes(displ, hwy)) + geom_point() + geom_smooth(method = "lm")
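Since the question asks for the trend line without the 20,000+ points, a minimal sketch is to leave out geom_point() altogether. The data frame `big` below is synthetic and only stands in for the asker's df1; the column names data_x and data_y are taken from the question, everything else is an assumption.

```r
library(ggplot2)

# Synthetic stand-in for df1 (assumption: two numeric columns)
set.seed(42)
n <- 20000
big <- data.frame(data_x = runif(n, 0, 100))
big$data_y <- 3 + 0.5 * big$data_x + rnorm(n, sd = 10)

# No geom_point() layer, so only the fitted line is drawn
trend_only <- ggplot(big, aes(data_x, data_y)) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, colour = "black")

trend_only
```

Setting se = FALSE drops the confidence band; remove it if the band is wanted.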
I would like a plot that looks like this:
Using ggplot. The one above was made from heatscatter in the LSD package.
I tried using this code:
p = ggplot(data = emr.ext.melt, aes(Date,NDVI))
p + geom_point() + stat_density2d(aes(fill=..level..), geom="polygon") +
scale_fill_gradient(low="blue", high="green")+ scale_y_continuous(limits = c(-1, 1))
But I got this weird plot. I just want a scatter plot coloured by the density of points for that given day. I do not want to use a hexplot either.
Thank you for your help!
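One possible approach, sketched here under assumptions and not taken from the thread, is to estimate a 2D kernel density with MASS::kde2d, look up the density at each point, and map that value to colour. The data frame `pts` is synthetic; with a Date column on the x-axis, the dates would first need converting with as.numeric() for the density estimate.

```r
library(ggplot2)

# Synthetic data standing in for the Date/NDVI columns (assumption)
set.seed(1)
pts <- data.frame(
  x = c(rnorm(800), rnorm(200, mean = 3)),
  y = c(rnorm(800), rnorm(200, mean = 3))
)

# Kernel density estimate on a 100 x 100 grid spanning the data range
dens <- MASS::kde2d(pts$x, pts$y, n = 100)

# Find the grid cell for each point and read off its density
ix <- findInterval(pts$x, dens$x, all.inside = TRUE)
iy <- findInterval(pts$y, dens$y, all.inside = TRUE)
pts$density <- dens$z[cbind(ix, iy)]

density_scatter <- ggplot(pts, aes(x, y, colour = density)) +
  geom_point() +
  scale_colour_gradient(low = "blue", high = "green")

density_scatter
```

This colours by the joint density over both axes. Colouring by density within each day would need the density computed per day instead.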
I have a couple of questions about plotting with ggplot2.
I have already used the commands below to colour data points in R.
library(ggplot2)
df <- read.csv(file="c:\\query2.csv")
ggplot(df, aes(x = Time, y = users, colour = users > 40)) + geom_point()
My question is: how do I draw a continuous line connecting the data points, and how do I circle the data points where users > 40?
To connect the points, use geom_line (if that doesn't give you what you need, please explain what you're trying to accomplish).
I haven't used geom_encircle, but another option is to use a filled marker with the fill deleted to create the circles. Here's an example, using the built-in mtcars data frame for illustration:
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
geom_point(data=mtcars[mtcars$mpg>30,],
pch=21, fill=NA, size=4, colour="red", stroke=1) +
theme_bw()
pch=21 is one of the filled markers (see ?pch for more info on other available point markers). We set fill=NA to remove the fill. stroke sets the thickness of the circle border.
UPDATE: To add a line to this chart, using the example above:
ggplot(mtcars, aes(wt, mpg)) +
geom_line() +
geom_point() +
geom_point(data=mtcars[mtcars$mpg>30,],
pch=21, fill=NA, size=4, colour="red", stroke=1) +
theme_bw()
However, if (as in my original code for this graph) you put the aes statement inside the geom, rather than in the initial call to ggplot, then you need to include an aes statement inside geom_line as well.
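As a sketch of that second layout, here is the same chart with no aesthetics in the initial ggplot() call, so every layer, including geom_line, carries its own aes():

```r
library(ggplot2)

# Nothing is mapped in ggplot(), so each geom needs its own aes()
per_layer <- ggplot(mtcars) +
  geom_line(aes(wt, mpg)) +
  geom_point(aes(wt, mpg)) +
  geom_point(data = mtcars[mtcars$mpg > 30, ], aes(wt, mpg),
             pch = 21, fill = NA, size = 4, colour = "red", stroke = 1) +
  theme_bw()

per_layer
```

Leaving aes() out of geom_line here would stop with an error about missing x and y aesthetics.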
I am trying to fill in a portion of a plot underneath a geom_smooth() line.
Example:
In the example the data fits on that curve. My data is not as smooth. I want to use geom_point() and a mix of geom_smooth() and geom_area() to fill in the area under the smoothed line while leaving the points above.
A picture of my data with a geom_smooth():
In other words, I want everything underneath that line to be filled in, like in Image 1.
Use predict with the type of smoothing being used. By default, geom_smooth uses loess when there are fewer than 1,000 observations and gam otherwise.
library(ggplot2)
ggplot(mpg, aes(displ, hwy)) +
geom_point() +
geom_smooth() +
geom_ribbon(aes(ymin = 0,ymax = predict(loess(hwy ~ displ))),
alpha = 0.3,fill = 'green')
Which gives:
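An alternative sketch, which is not the method used above, lets ggplot2 compute the smooth and draws the result as an area, so the fill follows whichever smoother is named without a separate predict call. Here loess is named explicitly to match the example:

```r
library(ggplot2)

# stat_smooth computes the loess fit; geom = "area" fills from 0 up to it
filled <- ggplot(mpg, aes(displ, hwy)) +
  stat_smooth(geom = "area", method = "loess", formula = y ~ x,
              se = FALSE, alpha = 0.3, fill = "green") +
  geom_smooth(method = "loess", formula = y ~ x) +
  geom_point()

filled
```

The points are added last so they are drawn on top of the filled area.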
I've read documentation and I think that my code should be right, but still there is no line between the points in the output. What is wrong?
The x-axis is discrete and the y-axis is continuous.
My code
point.sqrmPrice <- ggplot(overview.df, aes(x = areaSize, y = sqrmPrice)) +
geom_line() +
geom_point() +
scale_y_continuous(breaks = c(seq(min(overview.df$sqrmPrice), max(overview.df$sqrmPrice), by = 10000) )) +
theme_bw()
The underlying issue here is a duplicate of this stack post.
Here's a reproducible example showing what @SN248 meant about adding group to the code:
ggplot(iris, aes(x = factor(Sepal.Length), y = Sepal.Width)) +
geom_line(aes(group=1)) + geom_point() + theme_bw()
You are not getting a line because areaSize is a factor. Convert to numeric with
overview.df$areaSize <- as.numeric(as.character(overview.df$areaSize))
and then make the plot.
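The as.character step matters because calling as.numeric directly on a factor returns the internal level codes rather than the labels. A small illustration with a made-up factor (the values are hypothetical, not from overview.df):

```r
# Hypothetical factor whose labels are numbers stored as text
sizes <- factor(c("120", "45", "80"))

codes  <- as.numeric(sizes)                # level codes, not the labels
values <- as.numeric(as.character(sizes))  # the numbers the labels spell out

print(codes)
print(values)
```

Note that this conversion makes the x-axis continuous, so the points are spaced by value rather than evenly.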
What you have to think about is whether you expect a single line to connect all the dots.
If not, how many lines do you expect? That tells you how many groups you need.
You are missing the group aesthetic that geom_line() needs here, because you haven't specified how many groups (lines) you want in your plot.
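To make the "how many lines" question concrete, here is a sketch with a small made-up data frame (`prices`, `size`, `city` and `price` are hypothetical names). With a discrete x-axis, group = 1 gives one line through all points, while grouping by a variable gives one line per level:

```r
library(ggplot2)

# Hypothetical data: a discrete x variable and two cities
prices <- data.frame(
  size  = factor(rep(c("small", "medium", "large"), times = 2),
                 levels = c("small", "medium", "large")),
  city  = rep(c("A", "B"), each = 3),
  price = c(10, 14, 19, 12, 15, 25)
)

# One line: every point belongs to the same group
one_line <- ggplot(prices[prices$city == "A", ], aes(size, price)) +
  geom_line(aes(group = 1)) +
  geom_point()

# Two lines: one group per city
two_lines <- ggplot(prices, aes(size, price, colour = city)) +
  geom_line(aes(group = city)) +
  geom_point()

two_lines
```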
In ggplot2, the following command p <- qplot(wt, mpg, data=mtcars, colour=factor(cyl)) taken from here plots a scatter plot with each point coloured according to the factor.
I would like to fit all data with a geom_smooth irrespective of factor but keeping the colour of individual points according to factor. p + geom_smooth(method="lm") does a linear fit on each factor. How do I do this?
You can do this fairly easily by stepping back from the 'qplot' wrapper function and using the 'ggplot' and geometry functions directly.
ggplot(mtcars, aes(x=wt, y=mpg)) +
geom_point(aes(colour=factor(cyl))) +
geom_smooth(method="lm")
Step 1: Set your initial 'ggplot' settings. These are the settings that you want to be defaults for the geometry functions.
ggplot(mtcars, aes(x=wt, y=mpg))
In this case, we are using the 'mtcars' data for all geometries with 'wt' assigned to the x-axis and 'mpg' assigned to the y-axis. By specifying these at the beginning, we lessen the risk of messing something up when copy-pasting into the geometry functions.
Step 2: Draw the point geometry, using the factors of 'cyl' to color the points. This is what the original 'qplot' function was doing, but we're specifying it a little more explicitly.
geom_point(aes(colour=factor(cyl)))
Step 3: Draw the smoothed linear model. This is exactly what the OP wrote before, but now that the aesthetic of coloring is no longer part of the defaults, the model draws as intended.
geom_smooth(method="lm")
Chain it all together with the + et voila!
For reference: You could just as easily do this by being explicit in each layer, like so:
ggplot() +
geom_point(data=mtcars, aes(x=wt, y=mpg, colour=factor(cyl))) +
geom_smooth(data=mtcars, method="lm", aes(x=wt, y=mpg))
In my opinion, you'll find ggplot a lot easier if you start to use the ggplot() function rather than qplot. The control of aesthetics makes a lot more sense. In this case, you just build your base:
p <- ggplot(mtcars, aes(wt, mpg))
Then build the two geoms on top:
p + geom_point(aes(colour = factor(cyl))) +
geom_smooth(method = "lm")
Let me know if that wasn't what you're after.
I agree with the previous answers from @alexwhan and @Dinre that ggplot() + geom_point(...) + ... is the best approach to this problem.
However, if you would just like to modify your solution, try
p + geom_smooth(method = 'lm', aes(colour = NA), colour = 'magenta')
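Another way to keep the colour mapping in the base plot, offered as a sketch and not as this answer's method, is to stop the smooth layer from inheriting it with inherit.aes = FALSE. The base plot below is built with ggplot() to mimic what the qplot call maps:

```r
library(ggplot2)

# Colour is mapped globally, as in the qplot version
p <- ggplot(mtcars, aes(wt, mpg, colour = factor(cyl))) +
  geom_point()

# The smooth layer ignores the global mapping and gets its own x and y,
# so it fits a single line across all cylinders
p_single <- p +
  geom_smooth(aes(x = wt, y = mpg), method = "lm", formula = y ~ x,
              inherit.aes = FALSE, colour = "magenta")

p_single
```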