geom_point, colour by factor - r

I am trying to create some maps in R using ggplot2. I have the final map I want spatially, but I need to change the shape and colour of the points according to categorical factors.
The data frame is called genAQ in this context. Each row contains longitude and latitude, quality (high/low) and species (three species). Eventually, I would like the point colour to reflect species and the point shape to reflect quality.
#correct spatial map, but all points are black. This code works perfectly.
ggplot(data=world) +
geom_sf() +
geom_point(data=genAQ, aes(x=Lon, y=Lat), size=1,
shape = 19, col = "black") +
geom_polygon(data=lakes10, aes(long, lat, group = group), fill="lightblue") +
coord_sf(xlim = c(133.75, 140.52), ylim = c(-26.75, -31.30), expand=F)
Now I try to change the colour, utilising species as a factor.
If I were coding using the plot function, I would write it as
plot(genAQ$Lat~genAQ$Lon, pch=19, col=genAQ$Species).
Using this same principle, I try to apply it to the initial code which works.
ggplot(data=world) +
geom_sf() +
geom_point(data=genAQ, aes(x=Lon, y=Lat), size=1,
shape = 19, col = genAQ$Species) +
geom_polygon(data=lakes10, aes(long, lat, group = group), fill="lightblue") +
coord_sf(xlim = c(133.75, 140.52), ylim = c(-26.75, -31.30), expand=F)
It comes back with an error:
Error in grDevices::col2rgb(colour, TRUE) :
invalid color name 'Amytornis modestus indulkanna'.
So clearly ggplot does not understand the syntax I use with the plot() function: it still tries to read the value as a colour name rather than as a categorical factor. I've tried other methods I've seen on Stack Overflow, but I can't find an answer to a seemingly simple problem.
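A minimal sketch of the usual ggplot2 approach, for comparison: a column that should drive colour or shape is mapped inside aes(), while constants such as size stay outside it. The data frame genAQ_demo below is a made-up stand-in for genAQ, the species labels are placeholders, and the Quality column name is an assumption (the question only says the data contains a high/low quality value). The map layers are left out so the example runs on its own.

```r
library(ggplot2)

# Hypothetical stand-in for genAQ; values and species labels are invented.
genAQ_demo <- data.frame(
  Lon     = c(134.2, 135.8, 137.1, 138.4, 139.0, 136.5),
  Lat     = c(-27.5, -28.1, -29.0, -30.2, -28.7, -27.9),
  Species = factor(c("sp A", "sp B", "sp C", "sp A", "sp B", "sp C")),
  Quality = factor(c("high", "low", "high", "low", "high", "low"))
)

p_species <- ggplot() +
  geom_point(
    data = genAQ_demo,
    # Mapped inside aes(): ggplot2 builds the colour and shape scales itself.
    aes(x = Lon, y = Lat, colour = Species, shape = Quality),
    # Constant for every point, so it stays outside aes().
    size = 2
  )
p_species
```

In the original code the same geom_point() layer would sit between geom_sf() and geom_polygon(); only the aes() call changes.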

Related

R code of scatter plot for three variables

Hi I am trying to code for a scatter plot for three variables in R:
Race= [0,1]
YOI= [90,92,94]
ASB_mean = [1.56, 1.59, 1.74]
Antisocial <- read.csv(file = 'Antisocial.csv')
Table_1 <- ddply(Antisocial, "YOI", summarise, ASB_mean = mean(ASB))
Table_1
Race <- unique(Antisocial$Race)
Race
ggplot(data = Table_1, aes(x = YOI, y = ASB_mean, group_by(Race))) +
geom_point(colour = "Black", size = 2) + geom_line(data = Table_1, aes(YOI,
ASB_mean), colour = "orange", size = 1)
Image of plot: https://drive.google.com/file/d/1E-ePt9DZJaEr49m8fguHVS0thlVIodu9/view?usp=sharing
Data file: https://drive.google.com/file/d/1UeVTJ1M_eKQDNtvyUHRB77VDpSF1ASli/view?usp=sharing
Can someone help me understand where I am making a mistake? I want to plot mean ASB vs YOI grouped by Race. Thanks.
I am not sure what your desired output is. If I have understood your question correctly, I think you want something like this.
library(tidyverse)  # loads dplyr, ggplot2 and forcats (for as_factor)
g_Antisocial <- Antisocial %>%
group_by(Race) %>%
summarise(ASB = mean(ASB),
YOI = mean(YOI))
Antisocial %>%
ggplot(aes(x = YOI, y = ASB, color = as_factor(Race), shape = as_factor(Race))) +
geom_point(alpha = .4) +
geom_point(data = g_Antisocial, size = 4) +
theme_bw() +
guides(color = guide_legend("Race"), shape = guide_legend("Race"))
and this is the output:
@Maninder: there are a few things you need to look at.
First of all: The grammar of graphics of ggplot() works with layers. You can add layers with different data (frames) for the different geoms you want to plot.
Your code does not work because you mix up the layer calls and never clearly specify which data should be drawn as the scatter plot and which as the line.
(I) Use ggplot() + geom_point() for a scatter plot
The ultimate first layer is: ggplot(). Think of this as your drawing canvas.
You then speak about adding a scatter plot layer, but you actually do not do it.
For example:
# plotting the Antisocial data set
ggplot() +
geom_point(data = Antisocial, aes(x = YOI, y = ASB, colour = as.factor(Race)))
will plot your Antisocial data set using the scatter layer, i.e. geom_point().
Note that I converted Race to a factor to get a categorical colour scheme; otherwise you might end up with a continuous palette.
(II) line plot
In analogy to above, you would get for the line plot the following:
# plotting Table_1
ggplot() +
geom_line(data = Table_1, aes(x = YOI, y = ASB_mean))
I will skip showing the plot of the line.
(III) combining different layers
# putting both together
ggplot() +
geom_point(data = Antisocial, aes(x = YOI, y = ASB, colour = as.factor(Race))) +
geom_line(data = Table_1, aes(x = YOI, y = ASB_mean)) +
## this is to set the legend title and have a nice(r) name in your colour legend
labs(colour = "Race")
This yields:
That should explain how ggplot layering works. Keep an eye on the datasets and geoms that you want to use. Before working with inheritance in aes(), I recommend keeping the data= and aes() calls inside each geom_xxxx(). This avoids confusion.
You may want to try geom_jitter() instead of geom_point() to get a better presentation of your dataset. Only a few points are visible because many data points share the same position and are overplotted.
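A small sketch of the geom_jitter() suggestion. The data frame demo is invented for illustration (it is not the Antisocial file), and the width, height and alpha values are arbitrary choices to tune for the real data.

```r
library(ggplot2)

# Made-up data with heavy overplotting: 60 rows, at most 9 distinct (YOI, ASB) positions.
set.seed(1)
demo <- data.frame(
  YOI  = rep(c(90, 92, 94), each = 20),
  ASB  = sample(0:2, 60, replace = TRUE),
  Race = rep(c(0, 1), times = 30)
)

p_jitter <- ggplot() +
  geom_jitter(
    data = demo,
    aes(x = YOI, y = ASB, colour = as.factor(Race)),
    # Small random offsets so coincident points become visible.
    width = 0.3, height = 0.1, alpha = 0.6
  ) +
  labs(colour = "Race")
p_jitter
```

The jitter is random, so the exact positions change between runs unless a seed is set.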
Moving away from plotting to your question "I want to plot mean ASB vs YOI grouped by Race."
I know too little about your research to fully comprehend what you mean with that.
I take it that the mean ASB you calculated over the whole population is your reference (aka your Table_1), and you would like to see how the Race groups feature vs this population mean.
One option is to group your race data points and show them as boxplots for each YOI.
This might be what you want. The boxplot gives you the median and quartiles, and you can compare this per group against the calculated ASB mean.
For presentation purposes, I highlighted the line by increasing its size and changing its linetype. You can play around with the colours, etc. to get the look you are aiming for.
Please note that for the grouped boxplot you also have to convert your integer variable YOI; I coerced it into a categorical factor. A boxplot uses fill for the body (colour sets only the outline). In this setup, you also need to supply a group value to geom_line() (I just assigned it to 1, but that is arbitrary; in other contexts you can assign another variable here).
ggplot() +
geom_boxplot(data = Antisocial, aes(x = as.factor(YOI), y = ASB, fill = as.factor(Race))) +
geom_line(data = Table_1, aes(x = as.factor(YOI), y = ASB_mean, group = 1)
, size = 2, linetype = "dashed") +
labs(x = "YOI", fill = "Race")
Hope this gets you going!

ggplot2 plotting coordinates on map using geom_point, unwanted lines appearing between points

I am trying to plot a set of lat/long coordinates on a map of the USA using ggplot2, here is my code:
states <- map_data("state")
usamap <- ggplot(states, aes(long, lat, group=1)) +
geom_polygon(fill = "white", colour = "black") +
geom_point(data = data_masks2, aes(x = lng, y = lat), alpha = 1, size = 1) +
theme_cowplot()
However, when I plot usamap I am getting strange lines connecting some of the points (seen below), and I am unsure why. Why are these appearing, and how do I get rid of them?
Thanks in advance
There's a very helpful vignette available here for creating maps, but the issue is with your geom_polygon() line. You definitely need this (as it's the thing responsible for drawing your state lines), but you have the group= aesthetic wrong. You need to set group=group to correctly draw the lines:
ggplot(states, aes(long, lat, group=group)) +
geom_polygon(fill = "white", colour = "black")
If you use group=1 as you have, you get the lines:
ggplot(states, aes(long, lat, group=1)) +
geom_polygon(fill = "white", colour = "black")
Why does this happen? Well, it's how geom_polygon() (and ggplot in general) works. The group= aesthetic tells ggplot what "goes together" for a geom. In the case of geom_polygon(), it tells ggplot which collection of points needs to be connected in order to draw a single polygon, which in this case is a single state. When you set group=1, you are assigning every point in the dataset to the same polygon. Believe it or not, the map with the weird lines is actually composed of a single polygon, with points that are drawn in the sequence in which they appear in the data.
Have a look at your states dataset and you will see that there is states$group, which is specifically designed to allow you to group the points that belong to each state together. Hence, we arrive at the somewhat confusing statement: group=group. This means "Set the group= aesthetic to the value of the group column in states, or states$group."
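The same effect can be reproduced on a toy dataset. The sketch below uses two invented squares (the data frame squares is hypothetical, not part of map_data()) in place of states.

```r
library(ggplot2)

# Two hypothetical squares standing in for two states.
squares <- data.frame(
  long  = c(0, 1, 1, 0,  3, 4, 4, 3),
  lat   = c(0, 0, 1, 1,  0, 0, 1, 1),
  group = rep(c(1, 2), each = 4)
)

# group = 1: all eight vertices form one polygon, so stray edges join the squares.
p_one <- ggplot(squares, aes(long, lat, group = 1)) +
  geom_polygon(fill = "white", colour = "black")

# group = group: each square is drawn as its own polygon.
p_two <- ggplot(squares, aes(long, lat, group = group)) +
  geom_polygon(fill = "white", colour = "black")

p_one
p_two
```

Printing p_one shows the connecting lines between the squares; p_two shows two separate squares.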

Add a box for the NA values to the ggplot legend for a continuous map

I have got a map with a legend gradient and I would like to add a box for the NA values. My question is really similar to this one and this one. Also I have read this topic, but I can't find a "nice" solution somewhere or maybe there isn't any?
Here is a reproducible example:
library(ggplot2)
map <- map_data("world")
map$value <- setNames(sample(-50:50, length(unique(map$region)), TRUE),
unique(map$region))[map$region]
map[map$region == "Russia", "value"] <- NA
ggplot() +
geom_polygon(data = map,
aes(long, lat, group = group, fill = value)) +
scale_fill_gradient2(low = "brown3", mid = "cornsilk1", high = "turquoise4",
limits = c(-50, 50),
na.value = "black")
So I would like to add a black box for the NA value for Russia. I know I can replace the NAs with a number so that they appear in the gradient, and I think I can write a workaround like the following, but none of these workarounds seems like a clean solution to me, and I would also like to avoid "senseless" warnings:
ggplot() +
geom_polygon(data = map,
aes(long, lat, group = group, fill = value)) +
scale_fill_gradient2(low = "brown3", mid = "cornsilk1", high = "turquoise4",
limits = c(-50, 50),
na.value = "black") +
geom_point(aes(x = -100, y = -50, size = "NA"), shape = NA, colour = "black") +
guides(size = guide_legend("NA", override.aes = list(shape = 15, size = 10)))
Warning messages:
1: Using size for a discrete variable is not advised.
2: Removed 1 rows containing missing values (geom_point).
One approach is to split your value variable into a discrete scale. I have done this using cut(). You can then use a discrete color scale where "NA" is one of the distinct color labels. I have used scale_fill_brewer(), but there are other ways to do this.
map$discrete_value = cut(map$value, breaks=seq(from=-50, to=50, length.out=8))
p = ggplot() +
geom_polygon(data=map, aes(long, lat, group=group, fill=discrete_value)) +
scale_fill_brewer(palette="RdYlBu", na.value="black") +
coord_quickmap()
ggsave("map.png", plot=p, width=10, height=5, dpi=150)
Another solution
Because the original poster said they need to retain the color gradient scale and the colorbar-style legend, I am posting another possible solution. It has 3 components:
1. We need to trick ggplot into drawing a separate color scale by using aes() to map something to color. I mapped a column of empty strings using aes(colour="").
2. To ensure that we do not draw a colored boundary around each polygon, I specified a manual color scale with a single possible value, NA.
3. Finally, guides() along with override.aes is used to ensure the new color legend is drawn as the correct color.
p2 = ggplot() +
geom_polygon(data=map, aes(long, lat, group=group, fill=value, colour="")) +
scale_fill_gradient2(low="brown3", mid="cornsilk1", high="turquoise4",
limits=c(-50, 50), na.value="black") +
scale_colour_manual(values=NA) +
guides(colour=guide_legend("No data", override.aes=list(colour="black")))
ggsave("map2.png", plot=p2, width=10, height=5, dpi=150)
It's possible, but I did it years ago. You can't use guides. You have to set individually the continuous scale for the values as well as the discrete scale for the NAs. This is what the error is telling you and this is how ggplot2 works. Did you try using both scale_continuous and scale_discrete since your set up is rather awkward, instead of simply using guides which is basically used for simple plot designs?

R ggplot function error

My name is Venus. I am a beginner in data mining and in R. Right now, I need to fill a map with my data and show it in different colours to distinguish the different levels.
I have installed the ggplot library and loaded my data with longitude and latitude.
My code looks like this:
ggplot(target_map, aes(long, lat, group = group, fill = Target.rate)) +
geom_polygon(colour = alpha("black", 1/2), size = 0.2) +
geom_polygon(data = needregion, colour ="white", fill = NA)+
scale_fill_brewer(palette = "PuRd")
I adapted this from an example, but I don't know why R always returns this error:
Error: Aesthetics must be either length 1 or the same as the data (19584): x, y, group, fill
Could you suggest what I am doing wrong? Thank you very much.

Why can't I change the shape of point using ggplot2?

I want to specify the shape of three kinds of points in my plot using ggplot2. However, no matter how I change the shape numbers, it doesn't work: the shapes of the points are always set automatically.
Here is my code (the shape numbers are in the first line):
shape <- c("min"="1","max"="2",mean="3")
fill <- c("Rate"="#25c25b")
ggplot (data, aes(x=order))+
geom_rect(aes(xmin=order-0.1, xmax=order+0.1, ymin = min, ymax=max), alpha=0, color="black")+
geom_bar(aes(y=rate, fill="Rate"),stat="identity", alpha=0.3, width=0.5)+
geom_point(aes(y=min, shape="min"), size=5)+
geom_point(aes(y=mean, shape="mean"), size=5)+
geom_point(aes(y=max, shape="max"), size=5)+
labs(shape = "F0", fill = "Rate")
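To change the shape of the points, use scale_shape_manual() and pass the shapes you need to its values= argument. The shape vector defined at the top of your code is never passed to any scale, so it has no effect on the plot.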
+ scale_shape_manual(values=c("min"=1,"max"=2,"mean"=3))
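Put together, a runnable sketch might look like the following. The data frame stats_demo is invented for illustration; its column names follow the question, and the geom_rect() and geom_bar() layers are omitted for brevity.

```r
library(ggplot2)

# Hypothetical summary data; column names follow the question.
stats_demo <- data.frame(
  order = 1:3,
  min   = c(1, 2, 1.5),
  mean  = c(2, 3, 2.5),
  max   = c(3, 4, 3.5)
)

p_shapes <- ggplot(stats_demo, aes(x = order)) +
  geom_point(aes(y = min,  shape = "min"),  size = 5) +
  geom_point(aes(y = mean, shape = "mean"), size = 5) +
  geom_point(aes(y = max,  shape = "max"),  size = 5) +
  # Numeric shape codes, keyed by the labels used inside aes().
  scale_shape_manual(values = c("min" = 1, "max" = 2, "mean" = 3)) +
  labs(shape = "F0", y = "value")
p_shapes
```

The names in values= must match the strings given to shape inside aes(); the numbers are the standard R point shape codes.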
