Conditionally circling around data plots using ggplot2 - r

I have a couple of questions about plotting with ggplot2.
I have already used the commands below to colour data points in R.
library(ggplot2)
df <- read.csv(file="c:\\query2.csv")
ggplot(df, aes(x = Time, y = users, colour = users > 40)) + geom_point()
My question is: how should I draw a continuous line connecting data points and how do I circle around data points for users >40?

To connect the points, use geom_line (if that doesn't give you what you need, please explain what you're trying to accomplish).
I haven't used geom_encircle, but another option is to use a filled marker with the fill deleted to create the circles. Here's an example, using the built-in mtcars data frame for illustration:
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
geom_point(data=mtcars[mtcars$mpg>30,],
pch=21, fill=NA, size=4, colour="red", stroke=1) +
theme_bw()
pch=21 is one of the filled markers (see ?pch for more info on other available point markers). We set fill=NA to remove the fill. stroke sets the thickness of the circle border.
UPDATE: To add a line to this chart, using the example above:
ggplot(mtcars, aes(wt, mpg)) +
geom_line() +
geom_point() +
geom_point(data=mtcars[mtcars$mpg>30,],
pch=21, fill=NA, size=4, colour="red", stroke=1) +
theme_bw()
However, if (as in my original code for this graph) you put the aes statement inside the geom, rather than in the initial call to ggplot, then you need to include an aes statement inside geom_line as well.
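As a rough sketch of that last point, here is the same idea applied to the question's setup with the aes statements placed inside each geom. The column names Time and users come from the question; the values below are made up, since query2.csv is not available.

```r
library(ggplot2)

# Made-up stand-in for the question's query2.csv (columns Time and users assumed)
df <- data.frame(Time  = 1:10,
                 users = c(12, 25, 41, 38, 55, 47, 30, 22, 44, 18))

p <- ggplot(df) +
  # No aes in ggplot(), so every layer (including the line) needs its own
  geom_line(aes(x = Time, y = users)) +
  geom_point(aes(x = Time, y = users, colour = users > 40)) +
  # Open circles drawn only around the rows with users > 40
  geom_point(data = df[df$users > 40, ], aes(x = Time, y = users),
             pch = 21, fill = NA, size = 4, colour = "red", stroke = 1) +
  theme_bw()
p
```

Leaving the aes out of geom_line here would stop with an error about missing x and y aesthetics, because there is no plot-level mapping to inherit.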

Related

How to set background color for each panel in grouped boxplot?

I plotted a grouped boxplot and am trying to change the background color of each panel. I can use the panel.background theme element to change the background of the whole plot, but how can this be done for an individual panel? I found a similar question here, but I failed to adapt the code to my plot.
Top few lines of my input data look like
Code
p <- ggplot(df, aes(x=Genotype, y=Length, fill=Treatment)) +
scale_fill_manual(values=c("#69b3a2", "#CF7737")) +
geom_boxplot(width=2.5) +
theme(text=element_text(size=20), panel.spacing.x=unit(0.4, "lines"),
axis.title.x=element_blank(), axis.text.x=element_blank(), axis.ticks.x=element_blank(),
axis.text.y=element_text(angle=90, hjust=1, colour="black")) +
labs(x="Genotype", y="Petal length (cm)") +
facet_grid(~divide, scales="free", space="free")
p + theme(panel.background=element_rect(fill="#F6F8F9", colour="#E7ECF1"))
Unfortunately, as with other theme elements, the fill argument of element_rect() cannot be mapped to data. You also cannot pass it a vector of colors to create a mapping of your own. In the end, the simplest solution is probably going to be very similar to the answer you linked to in your question, with a bit of a twist.
I'll use mtcars as an example. Note that I'm converting some of the continuous variables in the dataset to factors so that we can create some more discrete values.
It's important to note, the rect geom is drawn before the boxplot geom, to ensure the boxplot appears on top of the rect.
ggplot(mtcars, aes(factor(carb), disp)) +
geom_rect(
aes(fill=factor(carb)), alpha=0.5,
xmin=-Inf, xmax=Inf, ymin=-Inf, ymax=Inf) +
geom_boxplot() +
facet_grid(~factor(carb), scales='free_x') +
theme_bw()
All done... but not quite. Something is wrong and you might notice this if you pay attention to the boxes on the legend and the gridlines in the plot panels. It looks like the alpha value is incorrect for some facets and okay for others. What's going on here?
Well, this has to do with how geom_rect works. It's drawing a box on each plot panel, but just like the other geoms, it's mapped to the data. Even though the x and y aesthetics for the geom_rect are not actually used to draw the rectangle, they do determine how many rectangles are drawn. This means that the number of rectangles drawn in each facet equals the number of rows in the dataset that belong to that facet. If 3 observations exist, 3 rectangles are drawn. If 20 observations exist for one facet, 20 rectangles are drawn, and so on. Because each rectangle is semi-transparent, stacking them makes the fill look more opaque in facets with more observations.
So, the fix is to supply a dataframe that contains one observation each for every facet. We have to then make sure that we supply any and all other aesthetics (x and y here) that are included in the ggplot call, or we will get an error indicating ggplot cannot "find" that particular column. Remember, even if geom_rect doesn't use these for drawing, they are used to determine how many observations exist (and therefore how many to draw).
rect_df <- data.frame(carb=unique(mtcars$carb)) # supply one of each type of carb
# have to give something to disp
rect_df$disp <- 0
ggplot(mtcars, aes(factor(carb), disp)) +
geom_rect(
data=rect_df,
aes(fill=factor(carb)), alpha=0.5,
xmin=-Inf, xmax=Inf, ymin=-Inf, ymax=Inf) +
geom_boxplot() +
facet_grid(~factor(carb), scales='free_x') +
theme_bw()
That's better.
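To see the overdrawing directly, one option is to count the rows that ggplot2 builds for the rectangle layer in each version. This is only a sketch: it reuses the two plots above without the boxplot layer, and the object names overdrawn and fixed are made up for illustration.

```r
library(ggplot2)

# Version 1: geom_rect inherits mtcars, so one rectangle per row of data
overdrawn <- ggplot(mtcars, aes(factor(carb), disp)) +
  geom_rect(aes(fill = factor(carb)), alpha = 0.5,
            xmin = -Inf, xmax = Inf, ymin = -Inf, ymax = Inf) +
  facet_grid(~factor(carb), scales = "free_x")

# Version 2: geom_rect gets a data frame with one row per facet
rect_df <- data.frame(carb = unique(mtcars$carb), disp = 0)
fixed <- ggplot(mtcars, aes(factor(carb), disp)) +
  geom_rect(data = rect_df, aes(fill = factor(carb)), alpha = 0.5,
            xmin = -Inf, xmax = Inf, ymin = -Inf, ymax = Inf) +
  facet_grid(~factor(carb), scales = "free_x")

# layer_data() returns the data built for one layer; PANEL identifies the facet
table(layer_data(overdrawn, 1)$PANEL)  # rectangles per facet, before the fix
table(layer_data(fixed, 1)$PANEL)      # rectangles per facet, after the fix
```

The first table should show as many rectangles per panel as there are cars with that carb value, and the second exactly one per panel.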

ggplot2: line connecting axis to point

data:
df<-data.frame(grp=letters[1:4],perc=runif(4))
First option:
First, create a second dataset that contains zeros for each group
df2<-rbind(df,data.frame(grp=df[,1],perc=c(0,0,0,0)))
Then plot with geom_point and geom_line:
ggplot(df,aes(y=perc,x=grp))+
geom_point()+
geom_line(data=df2, aes(y=perc, x=grp))+
coord_flip()
Which looks just fine. Just too much extra work to create a second dataset.
The other option is using geom_bar and making the width tiny:
ggplot(df,aes(y=perc,x=grp))+
geom_point()+
geom_bar(stat="identity",width=.01)+
coord_flip()
But this is also weird, and when I save to .pdf, not all of the bars are the same width.
There clearly has to be an easier way to do this, any suggestions?
Use geom_segment with fixed yend = 0. You'll also need expand_limits to adjust the plotting area:
ggplot(df, aes(y=perc, x=grp)) +
geom_point() +
geom_segment(aes(xend=grp), yend=0) +
expand_limits(y=0) +
coord_flip()

How to change style settings in stacked barchart overlaid with density line (ggplot2)

I am trying to change the style settings of this kind of chart and hope you can help me.
R code:
theme_set(theme_bw())
cglac$pred2 <- as.factor(cglac$pred)
ggplot(cglac, aes(x=depth, colour=pred2)) +
geom_bar(aes(y=..density..), binwidth=3, alpha=.5, position="stack") +
geom_density(alpha=.2) +
xlab("Depth (m)") +
ylab("Counts & Density") +
coord_flip() +
scale_x_reverse() +
theme_bw()
which produces this graph:
Here are some points:
What I want is to have the density lines in black and white, distinguished by line style rather than colour (dashed line, dotted line, etc.).
The other thing is the histogram itself. How do I get rid of the grey background in the bars?
Can I also change the bars to black and white patterns (shaded, etc.), so that they match the density lines?
Last but not least, I want to add a second x axis, or in this case y axis, because of coord_flip(). The one I see right now is for the density. The other one I need would be the count data from the pred2 variable.
Thanks for helping.
Best,
Moritz
Different line types: inside aes(), put linetype = pred2. To make the lines black, add the argument color = "black" inside geom_density.
The "background" of the bars is called "fill". Inside geom_bar, you can set fill = NA for no fill. A more common approach is to fill in the bars with the colors, inside aes() specify fill = pred2. You might consider faceting by your variable, + facet_wrap(~ pred2, nrow = 1) might look very nice.
Shaded bars in ggplot? No, you can't do that easily. See the answers to this question for other options and hacks.
Second y-axis: as with shaded bars, the creator of ggplot2 thinks a second y-axis is a terrible design choice, so it is not easy to do. Here's a related question, including Hadley's point of view:
I believe plots with separate y scales (not y-scales that are transformations of each other) are fundamentally flawed.
It's definitely worth considering his point of view, and asking yourself if those design choices are really what you want.
Different linetypes for densities
Here's my built-in data version of what you're trying to do:
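# cyl is numeric in mtcars; linetype needs a discrete variable, hence factor()
ggplot(mtcars, aes(x = hp,
linetype = factor(cyl),
group = cyl,
color = factor(cyl))) +
geom_histogram(aes(y=..density.., fill = factor(cyl)),
alpha=.5, position="stack") +
geom_density(color = "black") +
coord_flip() +
theme_bw()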
And what I think you should do instead. This version uses facets instead of stacking/colors/linetypes. You seem to be aiming for black and white, which isn't a problem at all in this version.
ggplot(mtcars, aes(x = hp,
group = cyl)) +
geom_histogram(aes(y=..density..),
alpha=.5) +
geom_density() +
facet_wrap(~ cyl, nrow = 1) +
coord_flip() +
theme_bw()
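For a stacked version that stays black and white, the fill = NA and linetype suggestions above can be combined. This is only a sketch on mtcars: the bins = 10 value is an arbitrary choice, and after_stat(density) assumes ggplot2 3.3.0 or later (it is the newer spelling of ..density..).

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = hp, linetype = factor(cyl))) +
  # Unfilled bars with black outlines; the outline style follows cyl
  geom_histogram(aes(y = after_stat(density)), bins = 10,
                 fill = NA, colour = "black", position = "stack") +
  # Density curves distinguished only by line type
  geom_density(colour = "black") +
  coord_flip() +
  theme_bw()
p
```

Because colour is set to a constant outside aes(), the only legend produced is the one for line type.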

ggplot2 colour geom_point by factor but geom_smooth based on all data

In ggplot2, the following command, p <- qplot(wt, mpg, data=mtcars, colour=factor(cyl)), taken from here, draws a scatter plot with each point coloured according to the factor.
I would like to fit all data with a geom_smooth irrespective of factor but keeping the colour of individual points according to factor. p + geom_smooth(method="lm") does a linear fit on each factor. How do I do this?
You can do this fairly easily by stepping back from the 'qplot' wrapper function and using the 'ggplot' and geometry functions directly.
ggplot(mtcars, aes(x=wt, y=mpg)) +
geom_point(aes(colour=factor(cyl))) +
geom_smooth(method="lm")
Step 1: Set your initial 'ggplot' settings. These are the settings that you want to be defaults for the geometry functions.
ggplot(mtcars, aes(x=wt, y=mpg))
In this case, we are using the 'mtcars' data for all geometries with 'wt' assigned to the x-axis and 'mpg' assigned to the y-axis. By specifying these at the beginning, we lessen the risk of messing something up when copy-pasting into the geometry functions.
Step 2: Draw the point geometry, using the factors of 'cyl' to color the points. This is what the original 'qplot' function was doing, but we're specifying it a little more explicitly.
geom_point(aes(colour=factor(cyl)))
Step 3: Draw the smoothed linear model. This is exactly what the OP wrote before, but now that the aesthetic of coloring is no longer part of the defaults, the model draws as intended.
geom_smooth(method="lm")
Chain it all together with the + et voila!
For reference: You could just as easily do this by being explicit in each layer, like so:
ggplot() +
geom_point(data=mtcars, aes(x=wt, y=mpg, colour=factor(cyl))) +
geom_smooth(data=mtcars, method="lm", aes(x=wt, y=mpg))
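One way to check what the smoother is fitting is to count the groups in its built layer data. The sketch below compares the two placements of the colour aesthetic; the names per_group and pooled are made up for illustration, and formula = y ~ x is spelled out only to silence the message about the default formula.

```r
library(ggplot2)

# colour in the plot-level aes: inherited by geom_smooth, one fit per cyl
per_group <- ggplot(mtcars, aes(wt, mpg, colour = factor(cyl))) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)

# colour only in geom_point: geom_smooth sees a single group
pooled <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point(aes(colour = factor(cyl))) +
  geom_smooth(method = "lm", formula = y ~ x)

# Layer 2 is the smoother in both plots
length(unique(layer_data(per_group, 2)$group))  # three groups, one per cyl level
length(unique(layer_data(pooled, 2)$group))     # a single group for all the data
```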
In my opinion, you'll find ggplot a lot easier if you start to use the ggplot() function rather than qplot. The control of aesthetics makes a lot more sense. In this case, you just build your base:
p <- ggplot(mtcars, aes(wt, mpg))
Then build the two geoms on top:
p + geom_point(aes(colour = factor(cyl))) +
geom_smooth(method = "lm")
Let me know if that wasn't what you're after.
I agree with the previous answers from @alexwhan and @Dinre that ggplot() + geom_point(...) + ... is the best approach to this problem.
However, if you would just like to modify your solution, try:
p + geom_smooth(method = 'lm', aes(colour = NA), colour = 'magenta')

Plot thick line with dark dots at data points in ggplot2

I want to plot a path and show where the datapoints are.
The question "Combine Points with lines with ggplot2" uses geom_point() + geom_line(), but I do not like that the dots are much thicker and that the lines look discontinuous (- x - x ----- x ---), so I decided to create my own dotted line:
mya <- data.frame(a=1:20)
ggplot() +
geom_path(data=mya, aes(x=a, y=a, colour=2, size=1)) +
geom_point(data=mya, aes(x=a, y=a, colour=1, size=1)) +
theme_bw() +
theme(text=element_text(size=11))
I like that the dots and the line have the same size. I did not use the alpha channel because I fear trouble with the alpha channel when I include the files in other programs.
open problems:
R should not create those legends
can R calculate the "darker colour" itself? darker(FF0000) = AA0000
how can I manipulate the linethickness? The size= parameter did not work as expected in R 2.15
Aesthetics can be set or mapped within a ggplot call.
An aesthetic defined within aes(...) is mapped from the data, and a legend created.
An aesthetic may also be set to a single value, by defining it outside aes().
In your case it appears you want to set the size to a single value. You can also use scale_..._manual(values = ..., guide = 'none') to suppress the creation of a legend.
This appears to be what you want with colour.
You can then use named colours such as lightblue and darkblue (see ?colors for more details)
ggplot() +
geom_line(data=mya, aes(x=a, y=a, colour='light'), size = 2) +
geom_point(data=mya, aes(x=a, y=a, colour='dark'), size = 2) +
scale_colour_manual(values = setNames(c('darkblue','lightblue'),
c('dark','light')), guide = 'none') +
theme_bw()
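The "darker colour" open problem can also be handled in base R. The sketch below is one possible approach, not part of the answer above: darker() is a hypothetical helper that scales each RGB channel towards black, and linewidth assumes ggplot2 3.4.0 or later (older versions use size for lines as well).

```r
library(ggplot2)

# Hypothetical helper: multiply each RGB channel by `factor` (0 = black, 1 = unchanged)
darker <- function(col, factor = 2/3) {
  channels <- round(col2rgb(col) * factor)   # 3 x n matrix with values in 0-255
  rgb(t(channels), maxColorValue = 255)
}

base_col <- "#FF0000"
dark_col <- darker(base_col)   # "#AA0000", matching the example in the question

mya <- data.frame(a = 1:20)
p <- ggplot(mya, aes(x = a, y = a)) +
  # Colours and sizes are set outside aes(), so no legends are created
  geom_path(colour = base_col, linewidth = 2) +
  geom_point(colour = dark_col, size = 2) +
  theme_bw()
p
```

Since every aesthetic here is set rather than mapped, no scale_colour_manual() call is needed to suppress the legend.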
