Problems creating points on a map with ggplot2 in R

I have a data set of UK earthquakes that I want to plot by location on a map. (Eventually I also want the point size to represent the magnitude.) I have made a map of the UK using ggmap, but I am struggling to add the points to it.
However, I keep getting one of two errors and cannot plot my points on the map. The errors are either
- Error: Aesthetics must be either length 1 or the same as the data (990): x, y
or
- Error in FUN(X[[i]], ...) : object 'group' not found
depending on how I try to plot the points.
This is what I have so far:
table <- data.frame(long2, lat2, mag1)
table
long2 lat2 mag1
1 -2.62 52.84 1.9
2 1.94 57.03 4.2
3 -0.24 51.16 0.6
4 -2.34 53.34 0.8
5 -3.16 55.73 2.0
6 -0.24 51.16 1.0
7 -4.11 53.03 1.5
8 -0.24 51.16 0.2
9 -0.24 51.16 1.1
10 -5.70 57.08 1.6
11 -2.40 53.00 1.4
12 -1.19 53.35 1.2
13 -1.02 53.84 1.7
14 -4.24 52.62 0.8
15 -3.23 54.24 0.3
16 -2.06 52.62 1.0
17 1.63 54.96 1.7
18 -5.24 56.05 0.7
19 -5.86 55.84 1.3
20 -3.22 54.23 0.3
21 -0.24 51.16 -1.4
22 -0.24 51.16 -0.7
23 -4.01 55.92 0.3
24 -5.18 50.08 2.3
25 -1.95 54.44 1.0
library(ggplot2)
library(maps)
w <- map_data("world", region = "uk")
uk <- ggplot(data = w, aes(x = long, y = lat, group=group)) + geom_polygon(fill = "seagreen2", colour="white") + coord_map()
uk + geom_point(data=table, aes(x=long2, y=lat2, colour="red", size=2), position="jitter", alpha=I(0.5))
Is it the way I have built my map, or how I am plotting my points? And how do I fix it?

I've made three changes to your code, and one or more of them solved the problems you were having. I'm not sure exactly which—feel free to experiment!
I named your data pdat (point data) instead of table. table is the name of a built-in R function, and it's best to avoid using it as a variable name.
I have placed each data= argument inside the geom function that needs that data (instead of placing data= and aes() inside the initial ggplot() call). When I use two or more data frames in a single plot, I do this defensively and find that it avoids many problems.
I have moved colour="red" and size=2 outside of the aes() function. aes() is used to create an association between a column in your data.frame and a visual attribute of the plot. Anything that's not a name of a column doesn't belong inside aes().
# Load data.
pdat <- read.table(header=TRUE,
text="long2 lat2 mag1
1 -2.62 52.84 1.9
2 1.94 57.03 4.2
3 -0.24 51.16 0.6
4 -2.34 53.34 0.8
5 -3.16 55.73 2.0
6 -0.24 51.16 1.0
7 -4.11 53.03 1.5
8 -0.24 51.16 0.2
9 -0.24 51.16 1.1
10 -5.70 57.08 1.6
11 -2.40 53.00 1.4
12 -1.19 53.35 1.2
13 -1.02 53.84 1.7
14 -4.24 52.62 0.8
15 -3.23 54.24 0.3
16 -2.06 52.62 1.0
17 1.63 54.96 1.7
18 -5.24 56.05 0.7
19 -5.86 55.84 1.3
20 -3.22 54.23 0.3
21 -0.24 51.16 -1.4
22 -0.24 51.16 -0.7
23 -4.01 55.92 0.3
24 -5.18 50.08 2.3
25 -1.95 54.44 1.0")
library(ggplot2)
library(maps)
w <- map_data("world", region = "uk")
uk <- ggplot() +
geom_polygon(data = w,
aes(x = long, y = lat, group = group),
fill = "seagreen2", colour = "white") +
coord_map() +
geom_point(data = pdat,
aes(x = long2, y = lat2),
colour = "red", size = 2,
position = "jitter", alpha = 0.5)
ggsave("map.png", plot=uk, height=4, width=6, dpi=150)
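The question also asks how to make point size reflect magnitude. A minimal sketch, assuming the pdat column names above and using only the first five rows of the data: mag1 is a column, so it is mapped inside aes(), while the constant colour and alpha stay outside. The size range of 1 to 6 and the legend title are arbitrary choices, and the map layer is left out to keep the example short.

```r
library(ggplot2)

# First five rows of the question's data (a subset, for brevity).
pdat <- data.frame(
  long2 = c(-2.62, 1.94, -0.24, -2.34, -3.16),
  lat2  = c(52.84, 57.03, 51.16, 53.34, 55.73),
  mag1  = c(1.9, 4.2, 0.6, 0.8, 2.0)
)

p <- ggplot() +
  geom_point(data = pdat,
             aes(x = long2, y = lat2, size = mag1),  # mapped: varies with the data
             colour = "red", alpha = 0.5) +          # set: the same for every point
  scale_size_continuous(range = c(1, 6), name = "Magnitude")

# Inspect what ggplot2 computed for the point layer.
pts <- layer_data(p, 1)
```

The same geom_point() call can replace the one in the map code above; because size is mapped rather than set, a legend for magnitude is drawn automatically.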

Related

Can't add legend panel to a certain scatter plot with multiple data sets

I simply can't find a way to plot legends panel in this specific ggplot with ggplot2 on R. Just want to make it appear.
For context, I'm plotting chemical abundances of sample versus the atomic number of the elements.
For background, I tried many things that are described here:
Reasons that ggplot2 legend does not appear
including links therein, however could not find a solution for my specific data set.
I know the problem could lie in the structure of the data set, since I've been able to do this with other data, but I can't solve it. I also suspect the problem has to do with the theme() in the code below, because with the default ggplot configuration the legend does appear. I use this personalized theme for consistency throughout my work.
This is what I have so far removing cosmetics:
ggplot(atomic, aes(x=atomic$Z, y = atomic$avg, group=1), fill = atomic$Z) +
# plot dots for average of values
geom_point(data=atomic, aes(x=atomic$Z, y=atomic$avg, group=1, color="black"), size=0.5, alpha=1, shape=16 ) +
# connect dots for average of values
geom_line(data=atomic, aes(x=atomic$Z, y=atomic$avg, group=1), color="black", linetype= "dashed") +
# plot dots for actual values from the samples
geom_point(data=atomic, aes(x=atomic$Z, y=atomic$SDSS, group=1, color="#00ba38"), size=5, alpha=1, shape=16, color="#00ba38") +
geom_point(data=atomic, aes(x=atomic$Z, y=atomic$HE22, group=1, color="#619cff"), size=5, alpha=1, shape=16, color="#619cff") +
geom_point(data=atomic, aes(x=atomic$Z, y=atomic$HE12, group=1, color="#F8766D"), size=5, alpha=1, shape=16, color="#F8766D") +
EDIT: the Definition of base_breaks (used below)
base_breaks_x <- function(x){
b <- pretty(x)
d <- data.frame(y=-Inf, yend=-Inf, x=min(b), xend=max(b))
list(geom_segment(data=d, aes(x=x, y=y, xend=xend, yend=yend), inherit.aes=FALSE),
scale_x_continuous(breaks=b))
}
base_breaks_y <- function(x){
b <- pretty(x)
d <- data.frame(x=-Inf, xend=-Inf, y=min(b), yend=max(b))
list(geom_segment(data=d, aes(x=x, y=y, xend=xend, yend=yend), inherit.aes=FALSE),
scale_y_continuous(breaks=b))
}
# the problem might be here
theme_bw() +
theme(plot.title = element_text(hjust = 0.5),
text = element_text(size=20),
legend.position="bottom",
panel.border = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank()) +
base_breaks_x(atomic$Z) +
base_breaks_y(atomic$HE22)
The data set is the following
Z Name HE22 SDSS HE12 avg
1 3 Li NA 1.00 NA 1.00
2 6 C 6.16 5.50 6.06 5.91
3 7 N NA NA 6.49 6.49
4 11 Na NA NA 3.53 3.53
5 12 Mg 5.32 4.43 4.99 4.91
6 13 Al 2.90 NA 3.08 2.99
7 14 Si NA 4.90 4.89 4.90
8 20 Ca 4.07 3.37 3.56 3.67
9 21 Sc 0.72 -0.07 0.24 0.30
10 22 Ti 2.74 1.79 2.47 2.33
11 23 V NA NA 1.18 1.18
12 24 Cr 2.88 2.14 2.67 2.56
13 25 Mn 2.34 1.59 2.44 2.12
14 26 Fe 4.92 4.14 4.59 4.55
15 27 Co 2.57 1.72 2.36 2.22
16 28 Ni 3.63 2.96 3.51 3.37
17 29 Cu NA NA 0.31 0.31
18 30 Zn 2.29 NA 2.44 2.37
19 38 Sr 0.62 0.29 0.41 0.44
20 39 Y -0.22 -0.44 -0.33 -0.33
21 40 Zr 0.60 NA 0.30 0.45
22 56 Ba 0.13 -0.10 0.12 0.05
23 57 La -0.77 -0.49 -0.77 -0.68
24 58 Ce NA NA -0.39 -0.39
25 59 Pr NA NA -0.78 -0.78
26 60 Nd -0.47 NA -0.37 -0.42
27 62 Sm NA NA -0.57 -0.57
28 63 Eu -1.02 -0.92 -0.85 -0.93
29 64 Gd NA NA -0.39 -0.39
30 66 Dy NA NA -0.16 -0.16
31 68 Er NA -0.40 NA -0.40
32 70 Yb NA -0.60 NA -0.60
33 90 Th NA -0.60 NA -0.60
where Z = atomic number, Name = element, HE12/HE22/SDSS = samples, avg = average of the samples.
I would like to know how I can add a legend panel consistent with the colors of my scatter plots.
Thank you so much! Hope I could describe the problem properly.
This is personally what I would do.
I converted the data from wide format to long format since it's easier to manipulate colors that way (Sorry I just used generic "key" and "value" since I'm not sure what you would want your columns to be named). Hopefully this will get you at least part of the way to where you want to go. Let me know if you have questions!
library(ggplot2)
library(tidyr)
p <- atomic %>%
  gather(key = "key", value = "value", SDSS, HE22, HE12) %>%
  ggplot(aes(Z, value, color = key)) +
  geom_point() +
  geom_text(aes(x = Z, y = avg, label = Name), # EDITED
            color = "black") +
  # named values keep each sample on the colour used in the question
  scale_color_manual(values = c(SDSS = "#00ba38", HE22 = "#619cff", HE12 = "#F8766D"))
p +
  geom_line(data = atomic, aes(x = Z, y = avg, group = 1), color = "black",
            linetype = "dashed") +
theme_bw() +
theme(plot.title = element_text(hjust = 0.5),
text = element_text(size=20),
legend.position="bottom",
panel.border = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank()) +
base_breaks_x(atomic$Z) +
base_breaks_y(atomic$HE22)
EDITED
I added the geom_text() command so labels show up. You can adjust the arguments so the labels look better. I've also heard geom_text_repel() in the ggrepel package is helpful for creating nice labels: https://cran.r-project.org/web/packages/ggrepel/vignettes/ggrepel.html#examples
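For readers on a recent tidyr, the same wide-to-long reshaping can be sketched with pivot_longer(), which the tidyr documentation recommends over gather() for new code. This is only an illustration on three rows of the question's data; the column names sample and abundance are my own choice, not from the original post.

```r
library(tidyr)
library(ggplot2)

# Three rows of the question's data (a subset, for brevity).
atomic <- data.frame(
  Z    = c(6, 12, 13),
  Name = c("C", "Mg", "Al"),
  HE22 = c(6.16, 5.32, 2.90),
  SDSS = c(5.50, 4.43, NA),
  HE12 = c(6.06, 4.99, 3.08),
  avg  = c(5.91, 4.91, 2.99)
)

# One row per (element, sample) pair; "sample" and "abundance" are
# illustrative column names.
atomic_long <- pivot_longer(atomic,
                            cols = c(SDSS, HE22, HE12),
                            names_to = "sample",
                            values_to = "abundance")

# Mapping colour to the sample column is what produces the legend.
p <- ggplot(atomic_long, aes(Z, abundance, colour = sample)) +
  geom_point(size = 5, na.rm = TRUE) +
  scale_colour_manual(values = c(SDSS = "#00ba38",
                                 HE22 = "#619cff",
                                 HE12 = "#F8766D")) +
  theme_bw() +
  theme(legend.position = "bottom")
```

The legend appears here because colour is mapped to a column inside aes() and is not also set as a constant outside it, which is the main difference from the code in the question.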

Colour palette from low to high for every row

I'm trying to plot a geom_tile plot for a dataset, where I need to highlight the max and min values in every row (colour palette going from green to red)
Dataset:
draft_mean trim rf_pwr
1 12.0 1.0 12253
2 12.0 0.8 12052
3 12.0 0.6 12132
4 12.0 0.4 12280
5 12.0 0.2 11731
6 12.0 0.0 11317
7 12.0 -0.2 12126
8 12.0 -0.4 12288
9 12.0 -0.6 12461
10 12.0 -0.8 12791
11 12.0 -1.0 12808
12 12.2 1.0 12346
13 12.2 0.8 12041
14 12.2 0.6 12345
15 12.2 0.4 12411
16 12.2 0.2 12810
17 12.2 0.0 12993
18 12.2 -0.2 12796
19 12.2 -0.4 12411
20 12.2 -0.6 12342
21 12.2 -0.8 12671
22 12.2 -1.0 13161
ggplot(dataset, aes(trim, draft_mean)) +
geom_tile(aes(fill=rf_pwr), color="black") +
scale_fill_gradient(low= "green", high= "red") +
scale_x_reverse() +
scale_y_reverse()
This plot (image) is taking the minimum values and plotting them as green and maximum values as red. What I need help with is that I need colour palette to go from green to red (minimum to maximum) for every row of the plot (2 rows in this plot) rather than the whole plot.
For draft_mean=12.2, rf_pwr should be colour formatted from minimum to maximum for trim values.
For every value of draft_mean, I should be able to tell the trim values with lowest and highest rf_pwr.
I can plot individual draft_mean values to check, but all draft_mean values needs to be visualized together.
You can create a scaled variable where min = 0 and max = 1 per group like this:
library(tidyverse)
# create toy data
set.seed(1)
df <- data.frame(
draft_mean =sort(rep(c(12,12.2),11 )),
trim=rep(sample(seq(-1,1,length.out = 11), replace = F),2),
rf_pwr = sample(11000:13000,22)
)
# create a scaled variable per unique draft_mean (min = 0 and max = 1)
df <- df %>%
  group_by(draft_mean) %>%
  mutate(rf_scl = (rf_pwr - min(rf_pwr)) / (max(rf_pwr) - min(rf_pwr)))
ggplot(df, aes(trim, draft_mean)) +
geom_tile(aes(fill=rf_scl), color="black") +
scale_fill_gradient(low= "green", high= "red") +
scale_x_reverse() +
scale_y_reverse()
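The same per-row rescaling can be sketched in base R with ave(), in case loading the tidyverse is not wanted. The helper name rescale01 is made up for this example, and the data are the first three trim values of each draft_mean from the question.

```r
# Subset of the question's data: three trim values per draft_mean.
df <- data.frame(
  draft_mean = c(12.0, 12.0, 12.0, 12.2, 12.2, 12.2),
  trim       = c(1.0, 0.8, 0.6, 1.0, 0.8, 0.6),
  rf_pwr     = c(12253, 12052, 12132, 12346, 12041, 12345)
)

# Hypothetical helper: map the smallest value to 0 and the largest to 1.
rescale01 <- function(x) (x - min(x)) / (max(x) - min(x))

# ave() applies the helper within each draft_mean group and returns
# the results in the original row order.
df$rf_scl <- ave(df$rf_pwr, df$draft_mean, FUN = rescale01)
```

Note that a group in which every rf_pwr is identical would divide by zero, so such rows would need separate handling.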

Coloring nodes of a graph according to the different scales

I want to plot different data sets as igraph objects. They can be like as follows:
library(igraph)
m<-matrix(data = c("a1_ghj", "a1_phj",
"b2_ghj", "c1_pht",
"c1_ght", "a1_ghi",
"g5_pht", "d2_phj",
"r5_phj", "u6_pht"), ncol = 2)
g<-graph_from_edgelist(m)
g
The node colors should be determined by different scales, for example:
aa qwr asd rty fgh vbn iop ert
ghj 1.8 -0.5 0.2 0.62 0.74 0.3 1.6
ght 2.5 -1 4.1 0.29 0.91 0.9 2
pht -3.5 3 -3.1 -0.9 0.62 -0.6 -9.2
phj -3.5 3 -1.8 -0.74 0.62 -0.7 -8.2
ghi 2.8 -2.5 4.4 1.19 0.88 0.5 3.7
In each node name, the part after _ is the name of the group the node belongs to. In the scale table, the columns are the scale types and the rows are the groups.
To plot these graphs I need a function that normalizes these scales to between -1 and 1 and then assigns colors to the nodes according to the values of a chosen scale type in the table. Can anybody help me with this?
First of all, as in the earlier question,
you can use sub on the vertex names to get the suffixes.
Suffixes = factor(sub(".*_", "", names(V(g))))
So the question becomes how to use the different scales to choose the colors
for the nodes. You asked to scale from -1 to 1, but actually I have scaled
0 to 1, because that is the type of argument expected by the function produced
by colorRamp.
Your scaling data
RawScales = read.table(text="aa qwr asd rty fgh vbn iop ert
ghj 1.8 -0.5 0.2 0.62 0.74 0.3 1.6
ght 2.5 -1 4.1 0.29 0.91 0.9 2
pht -3.5 3 -3.1 -0.9 0.62 -0.6 -9.2
phj -3.5 3 -1.8 -0.74 0.62 -0.7 -8.2
ghi 2.8 -2.5 4.4 1.19 0.88 0.5 3.7",
header=TRUE)
I will use both of the qwr and rty scales as examples.
Scale between 0 and 1.
qwr_Scaled = (RawScales$qwr - min(RawScales$qwr)) /
(max(RawScales$qwr) - min(RawScales$qwr))
rty_Scaled = (RawScales$rty - min(RawScales$rty)) /
(max(RawScales$rty) - min(RawScales$rty))
Set up a function to create color scales. Note: orange is the minimum value, red is the maximum value.
Color = colorRamp(c("orange", "yellow", "white", "pink", "red"))
Use the function to create a vector of colors for the nodes.
ColVals_qwr = rgb(Color(qwr_Scaled), maxColorValue=255)
names(ColVals_qwr) = RawScales$aa
ColVals_rty = rgb(Color(rty_Scaled), maxColorValue=255)
names(ColVals_rty) = RawScales$aa
Now plot using the color scales. I have added an explicit layout of the nodes so that the two plots would be comparable.
par(mfrow=c(1,2), mar=c(5, 1,3,1))
LO = layout_with_fr(g)
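# index by name (as.character), not by the factor's integer codes
plot(g, layout=LO, vertex.color=ColVals_qwr[as.character(Suffixes)], frame=TRUE)
plot(g, layout=LO, vertex.color=ColVals_rty[as.character(Suffixes)], frame=TRUE)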
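If the intermediate -1 to 1 scaling requested in the question is still wanted, one possible sketch is to rescale to [-1, 1] first and shift to [0, 1] only at the point where colorRamp needs it. The helper name scale_pm1 is hypothetical; the qwr values come from the table above, and the three node names are a small sample of the graph's vertices.

```r
# Hypothetical helper: linear rescale so min(x) -> -1 and max(x) -> 1.
scale_pm1 <- function(x) 2 * (x - min(x)) / (max(x) - min(x)) - 1

# The qwr column from the scale table, named by group.
qwr <- c(ghj = 1.8, ght = 2.5, pht = -3.5, phj = -3.5, ghi = 2.8)
qwr_pm1 <- scale_pm1(qwr)

# colorRamp() expects inputs in [0, 1], so shift and halve before calling it.
ramp <- colorRamp(c("orange", "yellow", "white", "pink", "red"))
cols <- rgb(ramp((qwr_pm1 + 1) / 2), maxColorValue = 255)
names(cols) <- names(qwr)

# Look up colours by the character suffix of each node name.
suffixes  <- sub(".*_", "", c("a1_ghj", "c1_pht", "a1_ghi"))
node_cols <- cols[suffixes]
```

Looking the colours up with a character vector matches on names; indexing with a factor would use its integer codes, which follow alphabetical level order and not the row order of the scale table.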

How to add shaded confidence intervals to line plot with specified values

I have a small table of summary data with the odds ratio and the upper and lower confidence limits for four categories, with six levels within each category. I'd like to produce a chart using ggplot2 that looks similar to the usual one created when you specify a lm and its se, but I'd like R to use just the pre-specified values in my table. I've managed to create the line graph with error bars, but these overlap and make the plot unclear. The data look like this:
interval OR Drug lower upper
14 0.004 a 0.002 0.205
30 0.022 a 0.001 0.101
60 0.13 a 0.061 0.23
90 0.22 a 0.14 0.34
180 0.25 a 0.17 0.35
365 0.31 a 0.23 0.41
14 0.84 b 0.59 1.19
30 0.85 b 0.66 1.084
60 0.94 b 0.75 1.17
90 0.83 b 0.68 1.01
180 1.28 b 1.09 1.51
365 1.58 b 1.38 1.82
14 1.9 c 0.9 4.27
30 2.91 c 1.47 6.29
60 2.57 c 1.52 4.55
90 2.05 c 1.31 3.27
180 2.422 c 1.596 3.769
365 2.83 c 1.93 4.26
14 0.29 d 0.04 1.18
30 0.09 d 0.01 0.29
60 0.39 d 0.17 0.82
90 0.39 d 0.2 0.7
180 0.37 d 0.22 0.59
365 0.34 d 0.21 0.53
I have tried this:
limits <- aes(ymax=upper, ymin=lower)
dodge <- position_dodge(width=0.9)
ggplot(data, aes(y=OR, x=days, colour=Drug)) +
geom_line(stat="identity") +
geom_errorbar(limits, position=dodge)
and searched for a suitable answer to create a pretty plot, but I'm flummoxed!
Any help greatly appreciated!
You need the following lines:
p <- ggplot(data = data, aes(x = interval, y = OR, colour = Drug)) + geom_point() + geom_line()
p <- p + geom_ribbon(aes(ymin = lower, ymax = upper), linetype = 2, alpha = 0.1)
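A variation that may read more clearly is to fill each ribbon in the colour of its drug and drop the ribbon outline. This is a sketch on a subset of the table (drugs a and b, first three intervals); the alpha value of 0.2 is an arbitrary choice.

```r
library(ggplot2)

# Rows for drugs a and b at the first three intervals (a subset of the table).
dat <- data.frame(
  interval = c(14, 30, 60, 14, 30, 60),
  OR       = c(0.004, 0.022, 0.13, 0.84, 0.85, 0.94),
  Drug     = c("a", "a", "a", "b", "b", "b"),
  lower    = c(0.002, 0.001, 0.061, 0.59, 0.66, 0.75),
  upper    = c(0.205, 0.101, 0.23, 1.19, 1.084, 1.17)
)

p <- ggplot(dat, aes(x = interval, y = OR, colour = Drug)) +
  # ribbon first so the lines and points are drawn on top of it
  geom_ribbon(aes(ymin = lower, ymax = upper, fill = Drug),
              alpha = 0.2, colour = NA) +
  geom_line() +
  geom_point()

# Inspect what ggplot2 computed for the ribbon layer.
ribbon <- layer_data(p, 1)
```

Because the limits are mapped by column name inside aes(), the ribbon stays aligned with the data even if the plot is later facetted or the data frame is reordered.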
Here is a base R approach using polygon(), since @jmb requested a solution in the comments. Note that I have to define two sets of x-values and associated y-values for the polygon to plot. It works by tracing the outer perimeter of the polygon. I set plot type = 'n' and use points() separately to get the points on top of the polygon. My personal preference is the ggplot solutions above when possible, since polygon() is pretty clunky.
library(tidyverse)
data('mtcars') #built in dataset
mean.mpg = mtcars %>%
group_by(cyl) %>%
summarise(N = n(),
avg.mpg = mean(mpg),
SE.low = avg.mpg - (sd(mpg)/sqrt(N)),
SE.high =avg.mpg + (sd(mpg)/sqrt(N)))
plot(avg.mpg ~ cyl, data = mean.mpg, ylim = c(10,30), type = 'n')
#note I have defined c(x1, x2) and c(y1, y2)
polygon(c(mean.mpg$cyl, rev(mean.mpg$cyl)),
c(mean.mpg$SE.low,rev(mean.mpg$SE.high)), density = 200, col ='grey90')
points(avg.mpg ~ cyl, data = mean.mpg, pch = 19, col = 'firebrick')

Alpha not applied for all points in ggplot scatterplot

I have been trying to produce a scatter plot with two levels of alpha applied to dots that are above or below a score threshold. To do so, I am storing the alpha value for each point in a vector, item_alpha, within the data frame and supplying this vector as the argument for alpha in my call to geom_point:
library( ggplot2 );
library( scales );
one.data <- read.table("test.data", header = TRUE)
p1 <- ggplot( data = one.data )
p1 <- p1 + geom_point( aes( plot_X, plot_Y, colour = log10_p_value, size = plot_size, alpha = item_alpha ) )
p1 <- p1 + scale_colour_gradientn( colours = c("red", "yellow", "green", "blue"), limits = c( min(one.data$log10_p_value), max(one.data$log10_p_value)));
p1 <- p1 + geom_point( aes(plot_X, plot_Y, size = plot_size), shape = 21, fill = "transparent", colour = I (alpha ("black", 0.6) ));
p1 <- p1 + scale_size( range=c(5, 30)) + theme_bw();
one.x_range = max(one.data$plot_X) - min(one.data$plot_X);
one.y_range = max(one.data$plot_Y) - min(one.data$plot_Y);
p1 <- p1 + xlim(min(one.data$plot_X) - one.x_range/10, max(one.data$plot_X) + one.x_range/10);
p1 <- p1 + ylim(min(one.data$plot_Y) - one.y_range/10, max(one.data$plot_Y) + one.y_range/10);
p1
However, it seems alpha is only being set properly for the eight points with the smaller value, while the remaining points remain opaque. I've consulted the ggplot documentation, played with the examples and tried some other variations which have mostly produced various errors and I'm really hoping someone will have some insight on this! Thanks in advance!
Contents of test.data:
"plot_X" "plot_Y" "plot_size" "log10_p_value" "item_alpha"
5.326 3.194 4.411 -27.3093 0.6
-2.148 7.469 3.434 -12.3487 0.6
-6.14 -2.796 3.062 -22.8069 0.6
3.648 6.091 3.597 -15.5032 0.6
0.356 -6.925 3.95 -10.4754 0.6
5.532 -0.135 3.246 -19.2883 0.6
3.794 -2.279 3.557 -16.4438 0.6
-3.784 1.42 2.914 -17.9687 0.6
-7.645 -1.571 3.163 -12.4498 0.6
-1.526 -4.756 3.509 -10.8972 0.6
-6.461 2.293 2.962 -13.4306 0.6
-5.806 0.983 4.38 -24.5422 0.6
-3.592 0.769 2.971 -17.8119 0.6
0.127 3.572 3.603 -11.4277 0.6
-0.566 0.706 3.77 -13.0952 0.3
2.25 -2.604 0.845 -11.7949 0.3
-7.845 -0.927 3.21 -12.6408 0.3
1.084 -6.691 3.654 -10.7319 0.3
-3.546 6.46 2.994 -11.6777 0.3
-5.478 -0.645 4.256 -17.7344 0.3
-6.251 -0.418 4.273 -19.29 0.3
-3.855 5.969 3.236 -10.9057 0.3
0.345 0.971 3.383 -11.5973 0.6
0.989 0.345 2.959 -10.8252 0.6
You're using a distinctly base plotting approach with ggplot2, which is obviously not the right way to go. Here are two options:
dat <- read.table(text = "plot_X plot_Y plot_size log10_p_value item_alpha
5.326 3.194 4.411 -27.3093 0.6
-2.148 7.469 3.434 -12.3487 0.6
-6.14 -2.796 3.062 -22.8069 0.6
3.648 6.091 3.597 -15.5032 0.6
0.356 -6.925 3.95 -10.4754 0.6
5.532 -0.135 3.246 -19.2883 0.6
3.794 -2.279 3.557 -16.4438 0.6
-3.784 1.42 2.914 -17.9687 0.6
-7.645 -1.571 3.163 -12.4498 0.6
-1.526 -4.756 3.509 -10.8972 0.6
-6.461 2.293 2.962 -13.4306 0.6
-5.806 0.983 4.38 -24.5422 0.6
-3.592 0.769 2.971 -17.8119 0.6
0.127 3.572 3.603 -11.4277 0.6
-0.566 0.706 3.77 -13.0952 0.3
2.25 -2.604 0.845 -11.7949 0.3
-7.845 -0.927 3.21 -12.6408 0.3
1.084 -6.691 3.654 -10.7319 0.3
-3.546 6.46 2.994 -11.6777 0.3
-5.478 -0.645 4.256 -17.7344 0.3
-6.251 -0.418 4.273 -19.29 0.3
-3.855 5.969 3.236 -10.9057 0.3
0.345 0.971 3.383 -11.5973 0.6
0.989 0.345 2.959 -10.8252 0.6",header = TRUE)
dat$alpha_grp <- ifelse(dat$item_alpha == 0.6,'High','Low')
#If you want a legend; although you can suppress the legend
# here if you want.
ggplot(data = dat,aes(x = plot_X,y = plot_Y)) +
geom_point(aes(alpha = alpha_grp)) +
# name the values so that High maps to 0.6 and Low to 0.3
scale_alpha_manual(values = c(High = 0.6, Low = 0.3))
#If you don't care about a legend
ggplot() +
geom_point(data = dat[dat$alpha_grp == 'High',],
aes(x = plot_X,y = plot_Y),alpha = 0.6) +
geom_point(data = dat[dat$alpha_grp == 'Low',],
aes(x = plot_X,y = plot_Y),alpha = 0.3)
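A likely reason for the behaviour in the question is that a numeric column mapped to alpha goes through a continuous scale, which by default stretches the data onto the range 0.1 to 1. With only the values 0.3 and 0.6 present, the lower one becomes nearly transparent and the higher one fully opaque. If the stored values are meant to be used literally, a third option is scale_alpha_identity(); the sketch below uses four rows of test.data.

```r
library(ggplot2)

# Four rows of test.data (a subset, for brevity).
dat <- data.frame(
  plot_X     = c(5.326, -2.148, -0.566, 2.25),
  plot_Y     = c(3.194, 7.469, 0.706, -2.604),
  item_alpha = c(0.6, 0.6, 0.3, 0.3)
)

p <- ggplot(dat, aes(plot_X, plot_Y)) +
  geom_point(aes(alpha = item_alpha), size = 5) +
  scale_alpha_identity()  # use the stored values as-is, no rescaling

# Inspect the alpha values ggplot2 will actually draw with.
pts <- layer_data(p, 1)
```

The identity scale draws no legend by default, so this fits the case where a legend is not needed.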

Resources