Can I add a third variable to graph with geom_rug? - r

I have a sports dataset that records a team's result (win, draw or loss), cumulative games played and league standing. A simple plot of position by games played is produced thus:
df<- data.frame(played=c(1:5),
result=c("W","L","D","D","L"),
position=c(1,3,4,4,5))
ggplot() +
geom_line(data=df,aes(x=played,y=position)) +
scale_y_reverse()
I would like to add a rug on the x axis with a different colour for each result (say W green, L red and D blue), but I cannot seem to solve it using geom_rug or by adding a geom_bar.

This should do the trick:
##The data frame df is now inherited by
##the other geom's
ggplot(data=df,aes(x=played,y=position)) +
geom_line() +
scale_y_reverse() +
geom_rug(sides="b", aes(colour=result))
In the geom_rug function, we specify that we only want a rug on the bottom and that we should colour the lines conditional on the result. To change the colours, look at the scale_colour_* functions. For your particular colours, try:
+ scale_colour_manual(values=c("blue","red", "green"))
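With an unnamed vector, the colours are matched to the values of result in alphabetical order (D, L, W). A minimal sketch that names each value instead, so the mapping does not depend on that ordering; result_colours is a name introduced here for illustration:

```r
library(ggplot2)

df <- data.frame(played = 1:5,
                 result = c("W", "L", "D", "D", "L"),
                 position = c(1, 3, 4, 4, 5))

# Hypothetical helper vector: names are the values found in `result`
result_colours <- c(W = "green", L = "red", D = "blue")

p <- ggplot(df, aes(x = played, y = position)) +
  geom_line() +
  scale_y_reverse() +
  geom_rug(sides = "b", aes(colour = result)) +   # rug on the bottom axis only
  scale_colour_manual(values = result_colours)
p
```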

Related

ggplot histogram with labels

I want to label two different histograms that are in the same plot. By labels, I mean identifying each histogram by its color, for example one green that corresponds to x and one red that corresponds to y.
I tried to use the function label. But it is not working.
ggplot() +
geom_histogram(data=junk, aes(x),fill="green", alpha=.2) +
geom_histogram(data=jun, aes(y), fill="red", alpha=.2)+
labs(x = "something") +
ggtitle("title")
I expect to have both histograms, one green and the other red, with labels on the right describing each histogram.
For this, you need the data in long format: the values for the green histogram and the values for the red one stacked below one another in a single data frame, plus another column that defines the groups.
df=data.frame(values=rnorm(20000),colorby=c("red_values","green_values"))
ggplot(data=df,aes(x=values,fill=colorby))+
geom_histogram(position="dodge")+
scale_fill_manual(values=c("red_values"="red","green_values"="green"))
For the position argument you could also try if "stack" fits your needs better.
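If the two sets of values start out as separate vectors, they can be stacked into that long format with base R. A sketch under the assumption that the question's data are two numeric vectors, simulated here as x and y (both hypothetical stand-ins):

```r
library(ggplot2)

set.seed(1)
x <- rnorm(1000)            # stand-in for the "green" data
y <- rnorm(800, mean = 2)   # stand-in for the "red" data

# One row per observation, plus a column naming the group it came from
long <- data.frame(values  = c(x, y),
                   colorby = rep(c("x", "y"), times = c(length(x), length(y))))

p <- ggplot(long, aes(x = values, fill = colorby)) +
  geom_histogram(position = "identity", alpha = 0.2, bins = 30) +
  scale_fill_manual(values = c(x = "green", y = "red"))
p
```

Because fill is mapped inside aes(), the legend on the right is generated automatically.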

How do I add intensity legend of colors after I plot using grid.raster()?

I am doing kmeans clustering on a png image and have been plotting it using grid::grid.raster(image). But I would like to put a legend which shows the intensity in a bar(from blue to red) marked with values, essentially indicating the intensity on the image. (image is an array where the third dimension equals 3 giving the red, green and blue channels.)
I thought of using grid.legend() but couldn't figure it out. I am hoping the community can help me out. Following is the image I have been using and after I perform kmeans clustering want a legend beside it that displays intensity on a continuous scale on a color bar.
Also I tried with ggplot2 and could plot the image but still couldn't plot the legend. I am providing the ggplot code for plotting the image. I can extract the RGB channels separately using ggplot2 also, so showing that also helps.
colassign <- rgb(Kmeans2@centers[clusters(Kmeans2),])
library(ggplot2)
ggplot(data = imgVEC, aes(x = x, y = y)) +
geom_point(colour = colassign) +
labs(title = paste("k-Means Clustering of", kClusters, "Colours")) +
xlab("x") +
ylab("y")
I did not find a way to use grid.raster() properly, but I found a way to do it with ggplot2 by plotting the RGB channels separately. Note: this only works for plotting the panels separately, but that is what I needed. The following shows the code for the green channel.
#RGB channels are respectively stored in columns 1,2,3.
#x-axis and y-axis values are stored in columns 4,5.
#original image is a nx5 matrix
ggplot(original_img[,c(3,4,5)], aes(x, y)) +
geom_point(aes(colour = segmented_img[,3])) +
scale_color_gradient2()
# scale_color_distiller(palette="RdYlBu") can be used instead of scale_color_gradient2() to pick a ColorBrewer palette via the palette argument.
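For a continuous colour bar running from blue to red, one option is to map an intensity column to fill and let ggplot2 draw the legend. A sketch on simulated data; img_df and its intensity column are assumptions, not the asker's objects:

```r
library(ggplot2)

# Simulated 50 x 50 "image" with one intensity value per pixel
img_df <- expand.grid(x = 1:50, y = 1:50)
img_df$intensity <- with(img_df, (x + y) / 100)

p <- ggplot(img_df, aes(x = x, y = y, fill = intensity)) +
  geom_raster() +
  scale_fill_gradient(low = "blue", high = "red") +  # continuous colour bar legend
  coord_fixed()                                      # keep pixels square
p
```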

ggplot geom_histogram color by factor not working properly

When I color my stacked histogram according to a factor column, all the bars get a "green" roof. I want the top of each bar to be the same color as the bar itself. The figure below shows clearly what is wrong: all the bars have a "green" horizontal line at the top.
Here is a dummy data set :
BodyLength <- rnorm(100, mean = 50, sd = 3)
vector <- c("80","10","5","5")
colors <- c("black","blue","red","green")
color <- rep(colors,vector)
data <- data.frame(BodyLength,color)
And the program I used to generate the plot below :
plot <- ggplot(data = data, aes(x=data$BodyLength, color = factor(data$color), fill=I("transparent")))
plot <- plot + geom_histogram()
plot <- plot + scale_colour_manual(values = c("Black","blue","red","green"))
Also, since the data column itself contains color names, any way I don't have to specify them again in scale_color_manual? Can ggplot identify them from the data itself? But I would really like help with the first problem right now...Thanks.
Here is a quick way to get your colors to scale_colour_manual without writing out a vector:
data <- data.frame(BodyLength,color)
data$color<- factor(data$color)
and then later,
scale_colour_manual(values = levels(data$color))
Now, with respect to your first problem, I don't know exactly why your bars have green roofs. However, you may want to look at some different options for the position argument in geom_histogram, such as
plot + geom_histogram(position="identity")
or position="dodge". The identity option is closer to what you want, but since green is the last line drawn, it overwrites the previous colors.
I like density plots better for these problems myself.
ggplot(data=data, aes(x=BodyLength, color=color)) + geom_density()
ggplot(data=data, aes(x=BodyLength, fill=color)) + geom_density(alpha=.3)
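On the question of whether ggplot can take the colors from the data itself, ggplot2 also provides scale_colour_identity(), which uses the values in the mapped column directly as colors. A minimal sketch rebuilding the dummy data from the question (named dat here, an illustrative name); guide = "legend" is added because identity scales draw no legend by default:

```r
library(ggplot2)

set.seed(1)
dat <- data.frame(BodyLength = rnorm(100, mean = 50, sd = 3),
                  color = rep(c("black", "blue", "red", "green"),
                              times = c(80, 10, 5, 5)))

# The strings in `color` are used as the actual outline colours
p <- ggplot(dat, aes(x = BodyLength, colour = color)) +
  geom_histogram(fill = "transparent", position = "identity", bins = 30) +
  scale_colour_identity(guide = "legend")
p
```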

How to recycle colours in a colorbrewer palette using line symbols

I'm using ggplot2 to create quite a few facet_wrapped geom_line plots.
Although each plot only has a maximum of eight lines, when taken together, there are more like twenty categories to show on the legend.
In a similar vein to this:
Recommend a scale colour for 13 or more categories
and this:
In R,how do I change the color value of just one value in ggplot2's scale_fill_brewer?
I'd like to artificially up the number of colours I can show using colorbrewer's high-contrast colour sets.
An obvious way to do this would seem to be to 'recycle' the colours in the palette, with a different line symbol each time. So bright red with 'x's on the line could be a different category than bright red with 'o's etc.
Can anyone think how I might do this?
Thanks!
Edit
Here's some (sanitised) data to play with, and the R code I'm using to produce my plot.
Data: http://orca.casa.ucl.ac.uk/~rob/Stack%20Overflow%20question/stack%20overflow%20colours%20question%20data.csv
R code:
csvData <- read.csv("stack overflow colours question data.csv")
p <- ggplot(csvData,
aes(year, percentage_of_output, colour=category, group=category))
p +
geom_line(size=1.2) +
labs(title = "Can I recycle the palette colours?", y = "% of output") +
scale_colour_brewer(palette = "Set1") +
theme(plot.title = element_text(size = rel(1.5))) +
facet_wrap("country_iso3", scales="free_y")
I made a data frame containing 20 levels (as letters).
df<-data.frame(group=rep(c(LETTERS[1:20]),each=5),x=rep(1:5,times=20),y=1:100)
You can use scale_colour_manual() to set the line colours: in this example I took five colours from Set1 and repeated them four times (20 in total). To set the shapes, I added geom_point() and scale_shape_manual() with each shape repeated five times, so each block of five consecutive groups shares one shape; the vector holds 25 values, of which the first 20 are used.
library(RColorBrewer)
ggplot(df,aes(x,y,colour=group))+geom_line()+geom_point(aes(shape=group),size=5)+
scale_colour_manual(values=rep(brewer.pal(5,"Set1"),times=4))+
scale_shape_manual(values=rep(c(15,16,17,18,19),each=5))
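To confirm that recycling gives every group its own colour and shape pairing, the two vectors can be tabulated side by side. A sketch; the lookup table name combos is introduced here for illustration:

```r
library(RColorBrewer)

groups  <- LETTERS[1:20]
colours <- rep(brewer.pal(5, "Set1"), times = 4)        # cycles through 5 colours
shapes  <- rep(c(15, 16, 17, 18, 19), each = 5)[1:20]   # one shape per block of 5 groups

combos <- data.frame(group = groups, colour = colours, shape = shapes)

# TRUE when no colour/shape pair is used twice
all_unique <- !any(duplicated(combos[, c("colour", "shape")]))
all_unique
```

The pairing stays unique because the colours change with every group while the shape changes only once per full cycle of colours.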

How can I change the colors in a ggplot2 density plot?

Summary: I want to choose the colors for a ggplot2() density distribution plot without losing the automatically generated legend.
Details: I have a dataframe created with the following code (I realize it is not elegant but I am only learning R):
cands<-scan("human.i.cands.degnums")
non<-scan("human.i.non.degnums")
df<-data.frame(grp=factor(c(rep("1. Candidates", each=length(cands)),
rep("2. NonCands",each=length(non)))), val=c(cands,non))
I then plot their density distribution like so:
library(ggplot2)
ggplot(df, aes(x=val,color=grp)) + geom_density()
This produces the following output:
I would like to choose the colors the lines appear in and cannot for the life of me figure out how. I have read various other posts on the site but to no avail. The most relevant are:
Changing color of density plots in ggplot2
Overlapped density plots in ggplot2
After searching around for a while I have tried:
## This one gives an error
ggplot(df, aes(x=val,colour=c("red","blue"))) + geom_density()
Error: Aesthetics must either be length one, or the same length as the data
Problems: c("red", "blue")
## This one produces a single, black line
ggplot(df, aes(x=val),colour=c("red","green")) + geom_density()
The best I've come up with is this:
ggplot() + geom_density(aes(x=cands),colour="blue") + geom_density(aes(x=non),colour="red")
As you can see in the image above, that last command correctly changes the colors of the lines but it removes the legend. I like ggplot2's legend system. It is nice and simple, I don't want to have to fiddle about with recreating something that ggplot is clearly capable of doing. On top of which, the syntax is very very ugly. My actual data frame consists of 7 different groups of data. I cannot believe that writing + geom_density(aes(x=FOO),colour="BAR") 7 times is the most elegant way of coding this.
So, if all else fails, I will accept an answer that tells me how to get the legend back onto the second plot. However, if someone can tell me how to do it properly, I will be very happy.
set.seed(45)
df <- data.frame(x=c(rnorm(100), rnorm(100, mean=2, sd=2)), grp=rep(1:2, each=100))
ggplot(data = df, aes(x=x, color=factor(grp))) + geom_density() +
scale_color_brewer(palette = "Set1")
ggplot(data = df, aes(x=x, color=factor(grp))) + geom_density() +
scale_color_brewer(palette = "Set3")
gives me the same plots with different sets of colors.
Provide a vector of colours to the "values" argument to map the discrete values to manually chosen visual ones:
ggplot(df, aes(x=val,color=grp)) +
geom_density() +
scale_color_manual(values=c("red", "blue"))
To choose any colour you wish, enter the hex code for it instead:
ggplot(df, aes(x=val,color=grp)) +
geom_density() +
scale_color_manual(values=c("#f5d142", "#2bd63f")) # yellow/green
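With more groups, such as the seven mentioned in the question, the values vector can be built from the factor levels rather than typed out. A sketch on simulated data; the group names, the df7 data frame and the Dark2 palette are assumptions made for illustration:

```r
library(ggplot2)
library(RColorBrewer)

set.seed(45)
df7 <- data.frame(grp = factor(rep(paste("Group", 1:7), each = 50)),
                  val = rnorm(350, mean = rep(1:7, each = 50)))

# One colour per level, named so the mapping is explicit
grp_colours <- setNames(brewer.pal(7, "Dark2"), levels(df7$grp))

p <- ggplot(df7, aes(x = val, colour = grp)) +
  geom_density() +
  scale_colour_manual(values = grp_colours)
p
```

A single geom_density() call covers all seven groups, and the legend is kept because colour is mapped inside aes().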
