How can I control the colour of the dots in a scatter plot with ggplot2? I need the first 20 points to have one colour, then the next 20 to have a different colour. At the moment I am using base R plot output. The matrix looks like this
1 4
1 3
2 9
-1 8
9 9
and I have a colour vector which looks like
cols<-c("#B8DBD3","#FFB933","#FF6600","#0000FF")
then
plot(mat[,1],mat[,2],col=cols)
works.
How could I do this in ggplot?
Regarding the colours
my cols vector looks like this
n <- 100
colours<-c(rep("#B8DBD3",n),rep("#FFB933",n),rep("#FF6600",n),rep("#0000FF",n),rep("#00008B",n),rep("#ADCD00",n),rep("#008B00",n),rep("#9400D3",n))
when I then do
d<-ggplot(new,aes(x=PC1,y=PC2,col=rr))
d+theme_bw() +
scale_color_identity(breaks = rep(colours, each = 1)) +
geom_point(pch=21,size=7)
the colours look completely different from
plot(new[,1],new[,2],col=colours)
this looks like
http://fs2.directupload.net/images/150417/2wwvq9u2.jpg
while ggplot with the same colours looks like
http://fs1.directupload.net/images/150417/bwc5wn7b.jpg
I would recommend creating a column that designates which group each point belongs to.
library(ggplot2)
xy <- data.frame(x = rnorm(80), y = rnorm(80), col = as.factor(rep(1:4, each = 20)))
cols<-c("#B8DBD3","#FFB933","#FF6600","#0000FF")
ggplot(xy, aes(x = x, y = y, col = col)) +
theme_bw() +
scale_colour_manual(values = cols) +
geom_point()
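If the data already carries one literal colour per row, as with the colours vector in the question, a variation is to map that column itself and let scale_colour_identity() use the hex codes as-is. Note that breaks only controls which values appear in the legend, not which colours are drawn, so passing the hex codes as breaks while mapping a different column (rr in the question) may be why the two plots differed. A minimal sketch, assuming made-up data (xy2 and p_identity are illustrative names, not from the question):

```r
library(ggplot2)

set.seed(42)
n <- 20
# One literal colour per row, in row order (same idea as the question's `colours`)
colours <- rep(c("#B8DBD3", "#FFB933", "#FF6600", "#0000FF"), each = n)

# xy2 is made-up illustration data, not the asker's data
xy2 <- data.frame(x = rnorm(4 * n), y = rnorm(4 * n),
                  colour = colours, stringsAsFactors = FALSE)

p_identity <- ggplot(xy2, aes(x = x, y = y, colour = colour)) +
  theme_bw() +
  scale_colour_identity() +  # use the hex codes stored in the column as-is
  geom_point()
p_identity
```

With the identity scale no legend is drawn by default, which matches what base plot() does with a colour vector.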
I just started learning R. I melted my dataframe and used ggplot to get this graph. There's supposed to be two lines on the same graph, but the lines connecting seem random.
Correct points plotted, but wrong lines.
# Melted my data to create new dataframe
AvgSleep2_DF <- melt(AvgSleep_DF , id.vars = 'SleepDay_Date',
variable.name = 'series')
# Plotting
ggplot(AvgSleep2_DF, aes(SleepDay_Date, value, colour = series)) +
geom_point(aes(colour = series)) +
geom_line(aes(colour = series))
With or without the aes(colour = series) in the geom_line results in the same graph. What am I doing wrong here?
The following might explain what geom_line() does when you specify aesthetics in the ggplot() call.
I assign a deliberate colour column that differs from the series specification!
df <- data.frame(
x = c(1,2,3,4,5)
, y = c(2,2,3,4,2)
, colour = factor(c(rep(1,3), rep(2,2)))
, series = c(1,1,2,3,3)
)
df
x y colour series
1 1 2 1 1
2 2 2 1 1
3 3 3 1 2
4 4 4 2 3
5 5 2 2 3
Inheritance in ggplot will look for aesthetics defined in an upper layer.
ggplot(data = df, aes(x = x, y = y, colour = colour)) +
geom_point(size = 3) + # setting the size to stress point layer call
geom_line() # geom_line will "inherit" a "grouping" from the colour set above
This gives you
We can instead control the "grouping" associated with each line (segment) explicitly, as follows:
ggplot(data = df, aes(x = x, y = y, colour = colour)) +
geom_point(size = 3) +
geom_line(aes(group = series) # defining specific grouping
)
Note: As I defined a separate "group" in the series column for the 3rd point, it is depicted - in this case - as a single point "line".
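Applied to a melted data frame like the one in the question, the same idea would be to set group = series explicitly, so that each series gets its own line. The sketch below uses made-up values (sleep_long and its numbers are hypothetical stand-ins for AvgSleep2_DF) and also converts the date column to Date, on the assumption that it may have been read in as text:

```r
library(ggplot2)

# Hypothetical stand-in for the melted data; the values are invented
sleep_long <- data.frame(
  SleepDay_Date = rep(c("2016-04-12", "2016-04-13", "2016-04-14"), times = 2),
  series        = rep(c("asleep", "in_bed"), each = 3),
  value         = c(420, 390, 450, 460, 410, 480)
)

# Assumption: the date column arrived as text; a real Date gives a continuous x axis
sleep_long$SleepDay_Date <- as.Date(sleep_long$SleepDay_Date)

p_lines <- ggplot(sleep_long,
                  aes(SleepDay_Date, value, colour = series, group = series)) +
  geom_point() +
  geom_line()  # one line per series because of group = series
p_lines
```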
Say I have the following dummy data frame:
df <- data.frame(let = LETTERS[1:13], value = sample(13),
group = rep(c("foo", "bar"), times = c(5,8)))
df
let value group
1 A 2 foo
2 B 1 foo
3 C 12 foo
4 D 8 foo
5 E 4 foo
6 F 13 bar
7 G 11 bar
8 H 3 bar
9 I 7 bar
10 J 5 bar
11 K 10 bar
12 L 9 bar
13 M 6 bar
Using ggplot with facet_wrap allows me to make a panel for each of the groups...
library(ggplot2)
ggplot(df, aes(x= let, y = value)) +
geom_point() +
coord_flip() +
facet_wrap(~group, scales = "free")
..but the vertical axes are not equally spaced, i.e. the left plot contains more vertical ticks than the right one. I would like to fill up the right vertical axis with (unlabeled) ticks (with no plotted values). In this case that would add 3 empty ticks, but it should be scalable to any df size.
What is the best way to accomplish this? Should I change the data frame, or is there a way to do this using ggplot?
I'm not sure why you want to arrange the categorical variable on your chart this way, other than for aesthetics (it does seem to look better). At any rate, a simple workaround that seems to handle the general case relies on the fact that ggplot places the levels of a categorical variable at the integer positions 1, 2, 3, ... on the axis. The workaround for your chart is then to plot, for each x value, a transparent point at a y position just above the largest number of categories in any group (max_rows + .5 in the code below). Points are plotted for all x values as a simple way to handle non-overlapping ranges of x values between groups. I've added another group to your data frame to make the example a bit more general.
library(ggplot2)
set.seed(123)
df <- data.frame(let = LETTERS[1:19], value = c(sample(13),20+sample(6)),
group = rep(c("foo", "bar", "bar2"), times = c(5,8,6)))
num_rows <- xtabs(~ group, df)
max_rows <- max(num_rows)
sp <- ggplot(df, aes(y= let, x = value)) +
geom_point() +
geom_point(aes(y = max_rows +.5), alpha=0 ) +
facet_wrap(~group, scales = "free", nrow=1 )
plot(sp)
This gives the following chart:
A kludgy solution that requires magrittr (for the compound assignment pipe %<>%):
library(magrittr)
df %<>%
rbind(data.frame(let = c(" ", "  ", "   "),
value = NA,
group = "foo"))
I just add three more entries for foo that are blank strings (i.e., just spaces) of different lengths. There must be a more elegant solution, though.
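To make this padding scale to any data frame, the number of blank rows per group can be computed rather than typed in. A rough sketch in base R, where pad_groups is a hypothetical helper name and the column names let, value and group are taken from the question:

```r
# Hypothetical helper: pad every group with blank-labelled, NA-valued rows
# until all groups have as many rows as the largest one.
pad_groups <- function(df) {
  counts <- table(df$group)
  target <- max(counts)
  pads <- lapply(names(counts), function(g) {
    k <- target - counts[[g]]
    if (k == 0) return(NULL)
    # blank labels of lengths 1..k, so each one is a distinct axis level
    data.frame(let = strrep(" ", seq_len(k)), value = NA, group = g)
  })
  rbind(df, do.call(rbind, Filter(Negate(is.null), pads)))
}

df <- data.frame(let = LETTERS[1:13], value = sample(13),
                 group = rep(c("foo", "bar"), times = c(5, 8)))
padded <- pad_groups(df)
table(padded$group)
```

The padded data frame can then be passed to the same ggplot() call as before. The NA values are dropped when the points are drawn, typically with a warning.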
Use free_x instead of free, like this:
ggplot(df, aes(x= let, y = value)) +
geom_point() +
coord_flip() +
facet_wrap(~group, scales = "free_x")+
theme(axis.text.y=element_blank(),
axis.ticks.y=element_blank())
I'm somewhat new to R and ggplot2. I've been trying to create a scatterplot graph that has one specific point coloured. For example, here is my basic data frame
manager          Confirmed  Overturned  keeping  Stands  total
A.J. Hinch              11          24        0      14     49
Angel Hernandez          0           1        0       0      1
Bill Miller              3           1        0       4      8
Bob Melvin               6          16        0       6     28
Brad Ausmus              3          11        0      13     27
With this I can create a simple scatterplot using this code,
p <- ggplot(data = Outcome, aes(x = Overturned, y = total))
p + geom_point()
I know how to add general colour, and add a colour scale, but I don't know how to colour just one point. For example, let's say I wanted to colour A.J. Hinch blue, and make every other point a different colour (probably grey or black), how would I do that?
Here is a link to the graph I want to create in Tableau.
https://public.tableau.com/profile/julien1554#!/vizhome/ManagerChallenges2014-2015/Sheet1
All help is appreciated, thanks.
You would just add another scatter plot layer to your plot. Here is the code that I used. Hope it helps!
> df = as.data.frame(cbind(Overturned = c(24,1,1,16,11), total = c(49,1,8,28,27)))
> library(ggplot2)
> p <- ggplot(data = df, aes(x = Overturned, y = total)) # creates the graph
> p + geom_point(data = df, color = "gray") + # creates main scatter plot with gray points
geom_point(data = df[1,], color = "blue") # colors A.J. Hinch's point blue
Here is the resulting graph:
Note that I'm just using the last name because when I read your data from the clipboard it thought the first names were row labels.
Outcome$color_me <- ifelse(Outcome$manager == "Hinch", "color_me", "normal")
textdf <- Outcome[Outcome$manager == "Hinch", ]
mycolors <- c("color_me" = "blue", "normal" = "grey50")
ggplot(data = Outcome, aes(x = Overturned, y = total)) +
geom_point(size = 3, aes(colour = color_me))
or with the manually defined color:
ggplot(data = Outcome, aes(x = Overturned, y = total)) +
geom_point(size = 3, aes(colour = color_me)) +
scale_color_manual("Status", values = mycolors)
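A more compact variation, if a helper column is not wanted, is to map a logical expression directly inside aes() and give the manual scale colours for "TRUE" and "FALSE". This is only a sketch: the data frame below is typed in from the question's table and uses the full manager names, and p_flag is an illustrative name.

```r
library(ggplot2)

# Data typed in from the question's table (only the columns used here)
Outcome <- data.frame(
  manager    = c("A.J. Hinch", "Angel Hernandez", "Bill Miller",
                 "Bob Melvin", "Brad Ausmus"),
  Overturned = c(24, 1, 1, 16, 11),
  total      = c(49, 1, 8, 28, 27)
)

p_flag <- ggplot(Outcome, aes(x = Overturned, y = total)) +
  # the comparison yields TRUE for the highlighted manager, FALSE otherwise
  geom_point(aes(colour = manager == "A.J. Hinch"), size = 3) +
  scale_colour_manual(values = c("TRUE" = "blue", "FALSE" = "grey50"),
                      guide = "none")  # hide the TRUE/FALSE legend
p_flag
```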
I'm using ggplot2 to create a simple dot plot of -1 to +1 correlation values using the following R code:
ggplot(dataframe, aes(x = exit)) +
geom_point(aes(y= row.names(dataframe))) +
geom_text(aes(y=exit, label=samplesize))
The y-axis has text labels, and I believe those text labels may be the reason that my geom_text() data point labels are squished down into the bottom of the plot as pictured here:
How can I change my plotting so that the data point labels appear on the dots themselves?
I understand that you would like to have the samplesize appear above each data point in the plot. Here is a sample plot with a sample data frame that does this:
EDIT: Per note by Gregor, changed the geom_text() call to utilize aes() when referencing the data. Thanks for the heads up!
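top10_rank <- data.frame(
String = c("h", "a", "w", "z", "z", "b", "q", "k", "r", "x", "l"),
Number = c(0, 1, 1, 3, 3, 4, 5, 6, 9, 10, 11),
row.names = c("4", "1", "11", "3", "7", "2", "8", "6", "9", "5", "10")
)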
x<-ggplot(data=top10_rank, aes(x = Number,
y = String)) + geom_point(size=3) + scale_y_discrete(limits=top10_rank$String)
x + geom_text(data=top10_rank, size=5, color = 'blue',
aes(x = Number,label = Number), hjust=0, vjust=0)
Not sure if this is what you wanted though.
Your problem is simply that you switched the y variables:
# your code
ggplot(dataframe, aes(x = exit)) +
geom_point(aes(y = row.names(dataframe))) + # here y is the row names
geom_text(aes(y =exit, label = samplesize)) # here y is the exit column
Since you want the same y-values for both you can define this in the initial ggplot() call and not worry about repeating it later
# working version
ggplot(dataframe, aes(x = exit, y = row.names(dataframe))) +
geom_point() +
geom_text(aes(label = samplesize))
Using row names is a little fragile; it's safer and more robust to create a data column holding the y values you want:
# nicer code
dataframe$y = row.names(dataframe)
ggplot(dataframe, aes(x = exit, y = y)) +
geom_point() +
geom_text(aes(label = samplesize))
Having done this, you probably don't want the labels right on top of the points; a small offset may look better:
# best of all?
ggplot(dataframe, aes(x = exit, y = y)) +
geom_point() +
geom_text(aes(x = exit + .05, label = samplesize), vjust = 0)
In the last case, you'll have to play with the adjustment to the x aesthetic; what looks right will depend on the dimensions of your final plot.
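An alternative to shifting the x aesthetic by hand is the nudge_x argument of geom_text(), which offsets the labels by a fixed amount in data units. A small sketch with made-up data standing in for the question's dataframe (the values and the name p_nudge are invented for illustration):

```r
library(ggplot2)

# Made-up stand-in for the question's `dataframe`
dataframe <- data.frame(
  y          = c("a", "b", "c"),
  exit       = c(-0.4, 0.1, 0.7),
  samplesize = c(12, 30, 25)
)

p_nudge <- ggplot(dataframe, aes(x = exit, y = y)) +
  geom_point() +
  # nudge_x moves every label 0.05 x-units to the right of its point
  geom_text(aes(label = samplesize), nudge_x = 0.05, vjust = 0)
p_nudge
```

As with the manual offset, the right amount of nudge depends on the x range and the size of the final plot.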
I've got a problem interacting with the labels in ggplot2.
I have two data sets (Temperature vs. Time) from two experiments but recorded at different timesteps. I've managed to merge the data frames and put them in a long fashion to plot them in the same graph, using the melt function from the reshape2 library.
So, the initial data frames look something like this:
> d1
step Temp
1 512.5 301.16
2 525.0 299.89
3 537.5 299.39
4 550.0 300.58
5 562.5 300.20
6 575.0 300.17
7 587.5 300.62
8 600.0 300.51
9 612.5 300.96
10 625.0 300.21
> d2
step Temp
1 520 299.19
2 540 300.39
3 560 299.67
4 580 299.43
5 600 299.78
6 620 300.74
7 640 301.03
8 660 300.39
9 680 300.54
10 700 300.25
I combine it like this:
> mrgd <- merge(d1, d2, by = "step", all = T)
step Temp.x Temp.y
1 512.5 301.16 NA
2 520.0 NA 299.19
...
And put it into long format for ggplot2 with this:
> melt1 <- melt(mrgd, id = "step")
> melt1
step variable value
1 512.5 Temp.x 301.16
2 520.0 Temp.x NA
...
Now, I want to for example do a histogram of the distribution of values. I do it like this:
p <- ggplot(data = melt1, aes(x = value, color = variable, fill = variable)) + geom_histogram(alpha = 0.4)
My problem is when I try to modify the Legend of this graph, I don't know how to! I've followed what is suggested in the R Graphics Cookbook book, but I've had no luck.
I've tried to do this, for example (to change the labels of the Legend):
> p + scale_fill_discrete(labels = c("d1", "d2"))
But I just create a "new" Legend box, like so
Or even removing the Legend completely
> p + scale_fill_discrete(guide = F)
I just get this
Finally, doing this also doesn't help
> p + scale_fill_discrete("")
Again, it just adds a new Legend box
Does anyone know what's happening here? It looks as if I'm actually modifying another Label object, if that makes any sense. I've looked into other related questions on this site, but I haven't found anyone having the same problem as me.
Get rid of the aes(color = variable...) to remove the scale that belongs to aes(color = ...).
ggplot(data = melt1, aes(x = value, fill = variable)) +
geom_histogram(alpha = 0.4) +
scale_fill_discrete(labels = c("d1", "d2")) # Change the labels for `fill` scale
This second plot contains aes(color = variable...). Color in this case will draw colored outlines around the histogram bins. You can turn off the scale so that you only have one legend, the one created from fill
ggplot(data = melt1, aes(x = value, color = variable, fill = variable)) +
geom_histogram(alpha = 0.4) +
scale_fill_discrete(labels = c("d1", "d2")) +
scale_color_discrete(guide = "none") # Turn off the color (outline) scale
The most straightforward thing to do would be to not use reshape2 or merge at all, but instead to rbind your data frames:
dfNew <- rbind(data.frame(d1, Group = "d1"),
data.frame(d2, Group = "d2"))
ggplot(dfNew, aes(x = Temp, color = Group, fill = Group)) +
geom_histogram(alpha = 0.4) +
labs(fill = "", color = "")
If you wanted to vary alpha by group:
ggplot(dfNew, aes(x = Temp, color = Group, fill = Group, alpha = Group)) +
geom_histogram() +
labs(fill = "", color = "") +
scale_alpha_manual("", values = c(d1 = 0.4, d2 = 0.8))
Note also that the default position for geom_histogram is "stack". The bars won't overlap unless you use geom_histogram(position = "identity").
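Putting the pieces together, the rbind approach with overlapping bars might look like the sketch below. The d1 and d2 values are typed in from the question; bins = 10 is an arbitrary choice made here to silence the default-bins message, and p_hist is an illustrative name.

```r
library(ggplot2)

# d1 and d2 typed in from the question
d1 <- data.frame(step = seq(512.5, 625, by = 12.5),
                 Temp = c(301.16, 299.89, 299.39, 300.58, 300.20,
                          300.17, 300.62, 300.51, 300.96, 300.21))
d2 <- data.frame(step = seq(520, 700, by = 20),
                 Temp = c(299.19, 300.39, 299.67, 299.43, 299.78,
                          300.74, 301.03, 300.39, 300.54, 300.25))

dfNew <- rbind(data.frame(d1, Group = "d1"),
               data.frame(d2, Group = "d2"))

p_hist <- ggplot(dfNew, aes(x = Temp, color = Group, fill = Group)) +
  # position = "identity" draws both groups from the baseline, so bars overlap
  geom_histogram(alpha = 0.4, position = "identity", bins = 10) +
  labs(fill = "", color = "")  # same (empty) title, so the two legends merge
p_hist
```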