How to use continuous variable 2d distribution with ggmap - r

By copying the example below I could create a map that shows the density of points, but I would like to see a density distribution of the quantitative variable "dist" in table W. What should I do to get that?
Something like this example: "Density2d Plot using another variable for the fill (similar to geom_tile)?", but with stat_density2d instead of stat_summary2d.
W
lat lon dist
1 -3.844117 -32.44028 0.23
2 -3.841167 -32.39318 0.86
3 -3.808283 -32.38135 0.13
4 -3.815583 -32.39295 0.15
5 -3.844267 -32.44015 0.16
6 -3.845600 -32.44220 0.20
7 -3.866700 -32.45778 0.67
8 -3.833467 -32.39752 0.22
9 -3.871400 -32.46202 0.18
10 -3.833467 -32.39752 0.22
11 -3.833467 -32.39752 0.60
12 -3.833467 -32.39752 0.14
13 -3.833467 -32.39752 0.22
14 -3.833467 -32.39752 0.14
15 -3.833467 -32.39752 0.16
16 -3.872283 -32.42713 0.06
17 -3.849217 -32.39095 0.10
18 -3.833467 -32.39752 0.57
library(ggmap)
center <- c(-3.858331, -32.423985)
fernando.map <- get_map(location = c(center[2], center[1]), zoom = 13, color = "bw")
ggmap(fernando.map, extent = "normal", maprange=FALSE) %+% W + aes(x = lon, y = lat) +
#geom_density2d() +
stat_density2d(aes(fill = ..level.., alpha = ..level.., colour=dist),
size = 0.01, bins = 16, geom = 'polygon')
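stat_density2d estimates how densely the points themselves are packed, so it has no obvious slot for the values of "dist". One possible alternative, along the lines of the linked question, is to bin the area and summarise "dist" inside each bin with stat_summary_2d(). The sketch below is only an illustration: it uses plain ggplot2 without the map tiles so that it runs offline, it takes just the first six rows of W, and the 5 x 5 grid and the use of the mean are my assumptions.

```r
library(ggplot2)

# First six rows of table W from the question
W <- data.frame(
  lat  = c(-3.844117, -3.841167, -3.808283, -3.815583, -3.844267, -3.845600),
  lon  = c(-32.44028, -32.39318, -32.38135, -32.39295, -32.44015, -32.44220),
  dist = c(0.23, 0.86, 0.13, 0.15, 0.16, 0.20)
)

# Mean of dist in each cell of a 5 x 5 grid over the lon/lat range
p <- ggplot(W, aes(x = lon, y = lat, z = dist)) +
  stat_summary_2d(fun = mean, bins = 5, alpha = 0.7) +
  scale_fill_gradient(name = "mean dist", low = "yellow", high = "red")

print(p)
```

The same layer could presumably be added to the ggmap() call from the question, passing data = W and the aes() above to stat_summary_2d() so that the tiles are drawn over the map.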

Related

NDVI time series and axis limits

My code plots NDVI against time. Here it is:
ggplot(data = EdinburghNDVI, aes(x = EdinburghNDVIDate, y = NDVI)) +
geom_point(color = "blue") +
labs(title = "Edinburgh NDVI",
x = "Date",
y = "NDVI")
When I try to add ylim I get an error saying that there is a discrete value supplied to continuous scale.
Representative data
> head(EdinburghNDVIDate)
[1] "2000-02-24" "2000-02-25" "2000-02-26" "2000-02-27" "2000-02-28" "2000-02-29"
> head(EdinburghNDVI$NDVI)
[1] 0.39 0.48 0.47 0.47 0.47 0.47
82 Levels: -0.07 -0.08 -0.23 -0.24 -0.35 #DIV/0! 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.11 ... 0.76
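The "Levels:" line in the output suggests that NDVI is stored as a factor, probably because the spreadsheet error string "#DIV/0!" was read in together with the numbers. A factor gives ggplot2 a discrete y scale, which would explain the ylim error. A minimal sketch of one possible fix, using a small made-up data frame (ndvi_demo and its values are hypothetical, not the real EdinburghNDVI):

```r
# Hypothetical stand-in for EdinburghNDVI: NDVI was read as a factor
# because one cell holds the spreadsheet error string "#DIV/0!".
ndvi_demo <- data.frame(
  Date = c("2000-02-24", "2000-02-25", "2000-02-26", "2000-02-27"),
  NDVI = factor(c("0.39", "0.48", "#DIV/0!", "0.47"))
)

# Convert via character: as.numeric() on the factor itself would return
# the internal level codes instead of the values.
# "#DIV/0!" cannot be parsed and becomes NA (the warning is suppressed here).
ndvi_demo$NDVI <- suppressWarnings(as.numeric(as.character(ndvi_demo$NDVI)))
ndvi_demo$Date <- as.Date(ndvi_demo$Date)

str(ndvi_demo)
```

Once NDVI is numeric, ylim() should be accepted because the y scale is continuous. Another option would be to re-read the file with na.strings = c("NA", "#DIV/0!") so the column is numeric from the start.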

pheatmap : extracting row z score values

I am a little confused about row scaling in pheatmap. This is my data frame
gene s1 s2 s3
1 -3.83 -8.17 -8.59
2 0.33 -4.51 -7.27
3 0.15 -5.26 -6.2
4 -0.08 -6.13 -5.95
5 -1.15 -4.82 -5.75
6 -0.99 -4.11 -4.85
7 0.42 -4.18 -4.54
8 -0.32 -3.43 -4.4
9 -0.72 -3.37 -4.39
I need to extract those values of the data frame after pheatmap generates the graph with row z score
library(pheatmap)
my_colors <- c(min(d),seq(-4,4,by=0.01),max(d))
my_palette <- c("green",colorRampPalette(colors = c("green", "red"))
(n = length(my_colors)-2), "red")
pheatmap(as.matrix(d),
scale = "row",
cluster_cols=FALSE,
cluster_rows = FALSE,
treeheight_row=0,
show_rownames=FALSE,
main = "test.txt",
color = my_palette,
breaks = my_colors)
How can I get the new matrix that pheatmap uses to make the heatmap?
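As far as I know, scale = "row" makes pheatmap centre each row on its mean and divide it by its standard deviation before drawing, so the same matrix can be computed directly. A minimal sketch, assuming d holds only the three sample columns with the gene IDs as row names (the name d_z is mine):

```r
# The data from the question, with gene IDs as row names (assumed layout)
d <- data.frame(
  s1 = c(-3.83, 0.33, 0.15, -0.08, -1.15, -0.99, 0.42, -0.32, -0.72),
  s2 = c(-8.17, -4.51, -5.26, -6.13, -4.82, -4.11, -4.18, -3.43, -3.37),
  s3 = c(-8.59, -7.27, -6.20, -5.95, -5.75, -4.85, -4.54, -4.40, -4.39),
  row.names = 1:9
)

# scale() works on columns, so transpose, scale, and transpose back
d_z <- t(scale(t(as.matrix(d))))

round(d_z, 2)
```

Each row of d_z then has mean 0 and standard deviation 1, which is what a row z-score means.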

plotting with specific values for heatmap in pheatmap

I have a data frame like this:
gene s1 s2 s3
1 -3.83 -8.17 -8.59
2 0.33 -4.51 -7.27
3 0.15 -5.26 -6.2
4 -0.08 -6.13 -5.95
5 -1.15 -4.82 -5.75
6 -0.99 -4.11 -4.85
7 0.42 -4.18 -4.54
8 -0.32 -3.43 -4.4
9 -0.72 -3.37 -4.39
I want to make a heatmap using pheatmap where anything below -4 is green, anything above +4 is red, and everything in between is a shade between green and red. I also don't want any scaling or clustering. I have this code so far in R:
library(pheatmap)
d <- read.table("test.txt", header = TRUE, sep = "\t", row.names = 1, quote = "")
pheatmap(as.matrix(d),            # matrix
         scale = "none",          # no scaling
         cluster_cols = FALSE,    # do not cluster columns
         cluster_rows = FALSE,    # do not cluster rows
         treeheight_row = 0,      # do not show row dendrogram
         show_rownames = FALSE,   # do not show row names, i.e. gene names
         main = "test.txt",
         color = colorRampPalette(c("#0016DB", "#FFFFFF", "#FFFF00"))(50))
How can I plot this with the color scheme I mentioned above?
Thanks
d <-read.table(text="gene s1 s2 s3
1 -3.83 -8.17 -8.59
2 0.33 -4.51 -7.27
3 0.15 -5.26 -6.20
4 -0.08 -6.13 -5.95
5 -1.15 -4.82 -5.75
6 -0.99 -4.11 -4.85
7 0.42 -4.18 -4.54
8 -0.32 -3.43 -4.40
9 -0.72 -3.37 -4.39", header=T)
library(pheatmap)
my_colors <- c(min(d),seq(-4,4,by=0.01),max(d))
my_palette <- c("green",colorRampPalette(colors = c("green", "red"))
(n = length(my_colors)-2), "red")
pheatmap(as.matrix(d),
scale = "none",
cluster_cols=FALSE,
cluster_rows = FALSE,
treeheight_row=0,
show_rownames=FALSE,
main = "test.txt",
color = my_palette,
breaks = my_colors)
Created on 2019-05-29 by the reprex package (v0.3.0)

Can't add legend panel to a certain scatter plot with multiple data sets

I simply can't find a way to show the legend panel in this specific plot made with ggplot2 in R. I just want to make it appear.
For context, I'm plotting chemical abundances of sample versus the atomic number of the elements.
For background, I tried many things that are described here:
Reasons that ggplot2 legend does not appear
including links therein, however could not find a solution for my specific data set.
I know the problem could be in the structure of the data set, since I've been able to do this with other data, but I can't solve it. I also suspect that the problem has to do with the theme() call in the code below, because the legend does appear when I use the default ggplot configuration. I use this personalized theme for consistency throughout my work.
This is what I have so far removing cosmetics:
ggplot(atomic, aes(x=atomic$Z, y = atomic$avg, group=1), fill = atomic$Z) +
  # plot dots for average of values
  geom_point(data=atomic, aes(x=atomic$Z, y=atomic$avg, group=1, color="black"), size=0.5, alpha=1, shape=16 ) +
  # connect dots for average of values
  geom_line(data=atomic, aes(x=atomic$Z, y=atomic$avg, group=1), color="black", linetype= "dashed") +
  # plot dots for actual values from the samples
  geom_point(data=atomic, aes(x=atomic$Z, y=atomic$SDSS, group=1, color="#00ba38"), size=5, alpha=1, shape=16, color="#00ba38") +
  geom_point(data=atomic, aes(x=atomic$Z, y=atomic$HE22, group=1, color="#619cff"), size=5, alpha=1, shape=16, color="#619cff") +
  geom_point(data=atomic, aes(x=atomic$Z, y=atomic$HE12, group=1, color="#F8766D"), size=5, alpha=1, shape=16, color="#F8766D") +
EDIT: the Definition of base_breaks (used below)
base_breaks_x <- function(x){
b <- pretty(x)
d <- data.frame(y=-Inf, yend=-Inf, x=min(b), xend=max(b))
list(geom_segment(data=d, aes(x=x, y=y, xend=xend, yend=yend), inherit.aes=FALSE),
scale_x_continuous(breaks=b))
}
base_breaks_y <- function(x){
b <- pretty(x)
d <- data.frame(x=-Inf, xend=-Inf, y=min(b), yend=max(b))
list(geom_segment(data=d, aes(x=x, y=y, xend=xend, yend=yend), inherit.aes=FALSE),
scale_y_continuous(breaks=b))
}
# the problem might be here
theme_bw() +
theme(plot.title = element_text(hjust = 0.5),
text = element_text(size=20),
legend.position="bottom",
panel.border = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank()) +
base_breaks_x(atomic$Z) +
base_breaks_y(atomic$HE22)
The data set is the following
Z Name HE22 SDSS HE12 avg
1 3 Li NA 1.00 NA 1.00
2 6 C 6.16 5.50 6.06 5.91
3 7 N NA NA 6.49 6.49
4 11 Na NA NA 3.53 3.53
5 12 Mg 5.32 4.43 4.99 4.91
6 13 Al 2.90 NA 3.08 2.99
7 14 Si NA 4.90 4.89 4.90
8 20 Ca 4.07 3.37 3.56 3.67
9 21 Sc 0.72 -0.07 0.24 0.30
10 22 Ti 2.74 1.79 2.47 2.33
11 23 V NA NA 1.18 1.18
12 24 Cr 2.88 2.14 2.67 2.56
13 25 Mn 2.34 1.59 2.44 2.12
14 26 Fe 4.92 4.14 4.59 4.55
15 27 Co 2.57 1.72 2.36 2.22
16 28 Ni 3.63 2.96 3.51 3.37
17 29 Cu NA NA 0.31 0.31
18 30 Zn 2.29 NA 2.44 2.37
19 38 Sr 0.62 0.29 0.41 0.44
20 39 Y -0.22 -0.44 -0.33 -0.33
21 40 Zr 0.60 NA 0.30 0.45
22 56 Ba 0.13 -0.10 0.12 0.05
23 57 La -0.77 -0.49 -0.77 -0.68
24 58 Ce NA NA -0.39 -0.39
25 59 Pr NA NA -0.78 -0.78
26 60 Nd -0.47 NA -0.37 -0.42
27 62 Sm NA NA -0.57 -0.57
28 63 Eu -1.02 -0.92 -0.85 -0.93
29 64 Gd NA NA -0.39 -0.39
30 66 Dy NA NA -0.16 -0.16
31 68 Er NA -0.40 NA -0.40
32 70 Yb NA -0.60 NA -0.60
33 90 Th NA -0.60 NA -0.60
where Z = atomic number, Name = element, HE12/HE22/SDSS = samples, and avg = average of the samples.
I would like to know how I can add a legend panel that matches the colors of my scatter plots.
Thank you so much! Hope I could describe the problem properly.
This is personally what I would do.
I converted the data from wide format to long format since it's easier to manipulate colors that way (Sorry I just used generic "key" and "value" since I'm not sure what you would want your columns to be named). Hopefully this will get you at least part of the way to where you want to go. Let me know if you have questions!
library(ggplot2)
library(tidyr)
p <- atomic %>%
  gather(key = "key", value = "value", SDSS, HE22, HE12) %>%
  ggplot(aes(Z, value, color = key)) +
  geom_point() +
  geom_text(aes(x = Z, y = avg, label = Name), # EDITED
            color = "black") +
  # named values keep each sample on the colour used in the question
  scale_color_manual(values = c(SDSS = "#00ba38", HE22 = "#619cff", HE12 = "#F8766D"))
p +
  geom_line(data = atomic, aes(x = Z, y = avg, group = 1), color = "black",
            linetype = "dashed") +
  theme_bw() +
  theme(plot.title = element_text(hjust = 0.5),
        text = element_text(size = 20),
        legend.position = "bottom",
        panel.border = element_blank(),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank()) +
  base_breaks_x(atomic$Z) +
  base_breaks_y(atomic$HE22)
EDITED
I added the geom_text() command so labels show up. You can adjust the arguments so the labels look better. I've also heard geom_text_repel() in the ggrepel package is helpful for creating nice labels: https://cran.r-project.org/web/packages/ggrepel/vignettes/ggrepel.html#examples
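If reshaping is not wanted, a likely reason the legend goes missing in the question's code is that each sample layer sets a constant color outside aes(), which overrides the mapped color and leaves nothing for a legend to describe. A minimal sketch of that idea on a few rows of the question's data (atomic_demo, the legend title "Sample" and the axis label are my own choices):

```r
library(ggplot2)

# A few rows of the question's data (Z, SDSS and HE22 only)
atomic_demo <- data.frame(
  Z    = c(6, 12, 20, 26),
  SDSS = c(5.50, 4.43, 3.37, 4.14),
  HE22 = c(6.16, 5.32, 4.07, 4.92)
)

# Map a label inside aes() and leave the actual colour to the scale;
# no constant colour is set outside aes(), so the legend is kept.
p <- ggplot(atomic_demo, aes(x = Z)) +
  geom_point(aes(y = SDSS, color = "SDSS"), size = 5) +
  geom_point(aes(y = HE22, color = "HE22"), size = 5) +
  scale_color_manual(name = "Sample",
                     values = c(SDSS = "#00ba38", HE22 = "#619cff")) +
  labs(y = "abundance") +
  theme_bw() +
  theme(legend.position = "bottom")

print(p)
```

The long-format approach above scales better once there are more than a handful of samples, since one geom_point() layer then covers all of them.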

How to add shaded confidence intervals to line plot with specified values

I have a small table of summary data with the odds ratio and the upper and lower confidence limits for four categories, with six levels within each category. I'd like to produce a chart using ggplot2 that looks similar to the usual one created when you specify an lm and its se, but I'd like R to just use the pre-specified values I have in my table. I've managed to create the line graph with error bars, but these overlap and make it unclear. The data look like this:
interval OR Drug lower upper
14 0.004 a 0.002 0.205
30 0.022 a 0.001 0.101
60 0.13 a 0.061 0.23
90 0.22 a 0.14 0.34
180 0.25 a 0.17 0.35
365 0.31 a 0.23 0.41
14 0.84 b 0.59 1.19
30 0.85 b 0.66 1.084
60 0.94 b 0.75 1.17
90 0.83 b 0.68 1.01
180 1.28 b 1.09 1.51
365 1.58 b 1.38 1.82
14 1.9 c 0.9 4.27
30 2.91 c 1.47 6.29
60 2.57 c 1.52 4.55
90 2.05 c 1.31 3.27
180 2.422 c 1.596 3.769
365 2.83 c 1.93 4.26
14 0.29 d 0.04 1.18
30 0.09 d 0.01 0.29
60 0.39 d 0.17 0.82
90 0.39 d 0.2 0.7
180 0.37 d 0.22 0.59
365 0.34 d 0.21 0.53
I have tried this:
limits <- aes(ymax = upper, ymin = lower)
dodge <- position_dodge(width = 0.9)
ggplot(data, aes(y = OR, x = interval, colour = Drug)) +
  geom_line(stat = "identity") +
  geom_errorbar(limits, position = dodge)
and searched for a suitable answer to create a pretty plot, but I'm flummoxed!
Any help greatly appreciated!
You need the following lines:
p <- ggplot(data = data, aes(x = interval, y = OR, colour = Drug)) + geom_point() + geom_line()
p <- p + geom_ribbon(aes(ymin = lower, ymax = upper), linetype = 2, alpha = 0.1)
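For a version that can be pasted and run as is, here is a sketch of the ribbon approach on part of the table from the question (drugs a and b only). The data frame name or_data is mine, and mapping fill to Drug with colour = NA on the ribbon is one styling choice among many: it shades each band in its drug's colour and drops the band outline.

```r
library(ggplot2)

# Subset of the table from the question (drugs a and b only)
or_data <- data.frame(
  interval = rep(c(14, 30, 60, 90, 180, 365), times = 2),
  OR    = c(0.004, 0.022, 0.13, 0.22, 0.25, 0.31,
            0.84, 0.85, 0.94, 0.83, 1.28, 1.58),
  Drug  = rep(c("a", "b"), each = 6),
  lower = c(0.002, 0.001, 0.061, 0.14, 0.17, 0.23,
            0.59, 0.66, 0.75, 0.68, 1.09, 1.38),
  upper = c(0.205, 0.101, 0.23, 0.34, 0.35, 0.41,
            1.19, 1.084, 1.17, 1.01, 1.51, 1.82)
)

# Ribbon first so that the lines and points are drawn on top of it
p <- ggplot(or_data, aes(x = interval, y = OR, colour = Drug, fill = Drug)) +
  geom_ribbon(aes(ymin = lower, ymax = upper), alpha = 0.2, colour = NA) +
  geom_line() +
  geom_point()

print(p)
```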
Here is a base R approach using polygon(), since @jmb requested a solution in the comments. Note that I have to define two sets of x values and their associated y values for the polygon, because it works by tracing the outer perimeter of the shaded area. I set type = 'n' in plot() and call points() separately to get the points on top of the polygon. My personal preference is the ggplot solutions above when possible, since polygon() is pretty clunky.
library(tidyverse)
data('mtcars') #built in dataset
mean.mpg = mtcars %>%
group_by(cyl) %>%
summarise(N = n(),
avg.mpg = mean(mpg),
SE.low = avg.mpg - (sd(mpg)/sqrt(N)),
SE.high =avg.mpg + (sd(mpg)/sqrt(N)))
plot(avg.mpg ~ cyl, data = mean.mpg, ylim = c(10,30), type = 'n')
#note I have defined c(x1, x2) and c(y1, y2)
polygon(c(mean.mpg$cyl, rev(mean.mpg$cyl)),
c(mean.mpg$SE.low,rev(mean.mpg$SE.high)), density = 200, col ='grey90')
points(avg.mpg ~ cyl, data = mean.mpg, pch = 19, col = 'firebrick')

Resources