I'm trying to create an image similar to this one in R using ggplot2.
However, I'm new to using this package. I'm struggling to find out how to draw lines that each have a different gradient. I want each line to start with one colour and end in another colour (gradually changing throughout), and I want to be able to specify this for each individual line uniquely. Can I do this with geom_segment? Would it also be possible for curves with geom_curve? It seems that the package ggforce could be useful for this. Any help would be greatly appreciated! Thank you.
This is the best I could pull together in 20 minutes to just illustrate that ggforce can be handy.
library(ggplot2)
library(ggforce)
n <- 1000
df <- data.frame(
x = runif(2 * n),
id = rep(seq_len(n), each = 2),
y = rep(c(0:1), n)
)
g <- ggplot(df, aes(x = x, y = y)) +
geom_link2(aes(group = id, colour = x),
alpha = 0.3) +
scale_colour_gradientn(colours = rainbow(100),
guide = "none") +
theme_void() +
theme(plot.background = element_rect(fill = "black"))
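If every line needs its own start and end colour, one possible approach with plain ggplot2 is to cut each line into many short pieces and colour the pieces individually. The sketch below is only an illustration: the data frame `segs`, its column names, and the helper `split_segment()` are hypothetical and not part of ggplot2 or ggforce.

```r
library(ggplot2)

# Hypothetical input: one row per line, each with its own start and end colour.
segs <- data.frame(
  x = c(0, 0, 0), y = c(0, 1, 2),
  xend = c(1, 1, 1), yend = c(1, 1.5, 0),
  col_start = c("red", "yellow", "blue"),
  col_end   = c("blue", "darkgreen", "orange"),
  stringsAsFactors = FALSE
)

n_pieces <- 100  # more pieces give a smoother gradient

# Hypothetical helper: split one line into n_pieces short segments,
# each carrying a colour interpolated between col_start and col_end.
split_segment <- function(s) {
  u <- seq(0, 1, length.out = n_pieces + 1)
  data.frame(
    x    = s$x + head(u, -1) * (s$xend - s$x),
    y    = s$y + head(u, -1) * (s$yend - s$y),
    xend = s$x + tail(u, -1) * (s$xend - s$x),
    yend = s$y + tail(u, -1) * (s$yend - s$y),
    colour = colorRampPalette(c(s$col_start, s$col_end))(n_pieces),
    stringsAsFactors = FALSE
  )
}

pieces <- do.call(rbind, lapply(seq_len(nrow(segs)),
                                function(i) split_segment(segs[i, ])))

# scale_colour_identity() uses the colour strings as they are.
g <- ggplot(pieces) +
  geom_segment(aes(x = x, y = y, xend = xend, yend = yend, colour = colour)) +
  scale_colour_identity() +
  theme_void()
```

Printing `g` draws the three lines. The same splitting idea should carry over to curves if the curve coordinates are computed beforehand, but that is not shown here.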
I used the following code to plot a packing circle graph and I want to add the numbers (values) for each bubble in addition to the text. How do I do that?
Another question is whether someone knows how to deal with a large number of categories (about 200) which makes some of the plot unreadable. Is there another visualization that might be more useful in this case?
Thanks in advance!
library(packcircles)
library(ggplot2)
library(viridis)
library(ggiraph)
packing <- circleProgressiveLayout(data$Number, sizetype='area')
data <- cbind(data, packing)
dat.gg <- circleLayoutVertices(packing, npoints=50)
ggplot() +
geom_polygon(data = dat.gg, aes(x, y, fill = as.factor(id)), colour = "black", alpha = 0.6) +
geom_text(data = data, aes(x, y, size=Number, label = Journal)) +
scale_size_continuous(range = c(2,4)) +
theme_void() +
theme(legend.position="none")+
coord_equal()
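One way to show the value next to the name is to build a combined label column and map it in geom_text. This is a minimal sketch; the small `data` frame below is a made-up stand-in for the real data, assuming it has the columns Journal and Number used in the question.

```r
library(packcircles)
library(ggplot2)

# Hypothetical stand-in for the asker's data.
data <- data.frame(Journal = c("A", "B", "C", "D"),
                   Number  = c(40, 25, 10, 5),
                   stringsAsFactors = FALSE)

packing <- circleProgressiveLayout(data$Number, sizetype = "area")
data <- cbind(data, packing)
dat.gg <- circleLayoutVertices(packing, npoints = 50)

# Two-line label: journal name on top, value underneath.
data$label <- paste0(data$Journal, "\n", data$Number)

p <- ggplot() +
  geom_polygon(data = dat.gg, aes(x, y, group = id, fill = as.factor(id)),
               colour = "black", alpha = 0.6) +
  geom_text(data = data, aes(x, y, size = Number, label = label)) +
  scale_size_continuous(range = c(2, 4)) +
  theme_void() +
  theme(legend.position = "none") +
  coord_equal()
```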
I have created a function for creating a barchart using ggplot.
In my figure I want to overlay the plot with white horizontal bars at the position of the tick marks like in the plot below
p <- ggplot(iris, aes(x = Species, y = Sepal.Width)) +
geom_bar(stat = 'identity')
# By inspection I found the y-tick positions to be c(50,100,150)
p + geom_hline(aes(yintercept = seq(50,150,50)), colour = 'white')
However, I would like to be able to change the data, so I can't use static positions for the lines like in the example. For example, I might change Sepal.Width to Sepal.Length in the example above.
Can you tell me how to:
1. get the tick positions from my ggplot; or
2. get the function that ggplot uses for tick positions so that I can use this to position my lines.
so I can do something like
tickpositions <- ggplot_tickpostion_fun(iris$Sepal.Width)
p + scale_y_continuous(breaks = tickpositions) +
geom_hline(aes(yintercept = tickpositions), colour = 'white')
A possible solution for (1) is to use ggplot_build to grab the content of the plot object. ggplot_build results in "[...] a panel object, which contain all information about [...] breaks".
ggplot_build(p)$layout$panel_ranges[[1]]$y.major_source
# [1] 0 50 100 150
See edit for pre-ggplot2 2.2.0 alternative.
Check out ggplot2::ggplot_build - it can show you lots of details about the plot object. You have to give it a plot object as input. I usually like to str() the result of ggplot_build to see what all the different values it has are.
For example, I see that there is a panel --> ranges --> y.major_source vector that seems to be what you're looking for. So to complete your example:
p <- ggplot() +
geom_bar(data = iris, aes(x = Species, y = Sepal.Width), stat = 'identity')
pb <- ggplot_build(p)
str(pb)
y.ticks <- pb$panel$ranges[[1]]$y.major_source
p + geom_hline(aes(yintercept = y.ticks), colour = 'white')
Note that I moved the data argument from the main ggplot function to inside geom_bar, so that geom_hline would not try to use the same dataset and throw an error when the number of rows in iris is not a multiple of the number of lines we're drawing. Another option would be to pass a data = data.frame() argument to geom_hline; I cannot comment on which one is the more correct solution, or whether there's a nicer solution altogether. But the gist of my code still holds :)
For ggplot2 3.1.0 this worked for me:
ggplot_build(p)$layout$panel_params[[1]]$y.major_source
#[1] 0 50 100 150
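For the second option in the question (getting the function ggplot2 uses to place ticks), here is a minimal sketch, assuming a default continuous y scale. To my knowledge the default breaks for such a scale come from scales::extended_breaks(), which can be called on a numeric range directly. The range used below, from 0 to the tallest stacked bar, is an assumption about what the bar chart above displays.

```r
library(scales)

# Height of each stacked bar: total Sepal.Width per species.
totals <- tapply(iris$Sepal.Width, iris$Species, sum)

# extended_breaks() returns a function that maps a numeric range to "pretty" breaks.
tick_fun <- extended_breaks()
tickpositions <- tick_fun(c(0, max(totals)))
```

The result can then be passed to scale_y_continuous(breaks = tickpositions) and to geom_hline(yintercept = tickpositions, colour = 'white'), with yintercept set outside aes() so that it is not matched against the rows of iris.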
For sure you can. Read the help file for the seq() function.
seq(from = min(x), to = max(x), length.out = 5)
and do something like this.
p <- ggplot(iris, aes(x = Species, y = Sepal.Width)) +
geom_bar(stat = 'identity')
# x is the variable whose range the lines should span
x <- iris$Sepal.Width
p + geom_hline(yintercept = seq(from = min(x), to = max(x), length.out = 5), colour = 'white')
I have a dataset with binary variables like the one below.
M4 = matrix(sample(1:2,20*5, replace=TRUE),20,5)
M4 <- as.data.frame(M4)
M4$id <- 1:20
I have produced a stacked bar plot using the code below
library(reshape)
library(ggplot2)
library(scales)
M5 <- melt(M4, id="id")
M5$value <- as.factor(M5$value)
ggplot(M5, aes(x = variable)) + geom_bar(aes(fill = value), position = 'fill') +
scale_y_continuous(labels = percent_format())
Now I want the percentage for each field in each bar to be displayed in the graph, so that each bar reaches 100%. I have tried 1, 2, 3 and several similar questions, but I can't find any example that fits my situation. How can I manage this task?
Try this method:
test <- ggplot(M5, aes(x = variable, fill = value)) +
geom_bar(position = 'fill') +
scale_y_continuous(labels = percent_format()) +
geom_text(aes(label = paste("n = ", after_stat(count))), stat = "count", position = position_fill(vjust = 0.5))
test
EDITED: to give percentages and using the scales package:
require(scales)
test <- ggplot(M5, aes(x = variable, fill = value)) +
geom_bar(position = 'fill') +
scale_y_continuous(labels = percent_format()) +
geom_text(aes(label = scales::percent(after_stat(count / ave(count, x, FUN = sum)))), stat = "count", position = position_fill(vjust = 0.5))
test
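An alternative that avoids computing inside the plot is to tabulate the counts first and plot the resulting table. This is a sketch under assumptions: it reshapes with base R instead of reshape::melt, and the names `counts` and `share` are made up for the example.

```r
library(ggplot2)
library(scales)

set.seed(1)
M4 <- as.data.frame(matrix(sample(1:2, 20 * 5, replace = TRUE), 20, 5))

# Long format with base R: one row per observation.
M5 <- data.frame(
  variable = rep(names(M4), each = nrow(M4)),
  value    = factor(unlist(M4, use.names = FALSE))
)

# Count each value within each variable, then convert to a share of the bar.
counts <- as.data.frame(table(variable = M5$variable, value = M5$value))
counts$share <- counts$Freq / ave(counts$Freq, counts$variable, FUN = sum)

p <- ggplot(counts, aes(x = variable, y = Freq, fill = value)) +
  geom_col(position = "fill") +
  # position_fill(vjust = 0.5) centres each label inside its bar segment.
  geom_text(aes(label = percent(share, accuracy = 1)),
            position = position_fill(vjust = 0.5)) +
  scale_y_continuous(labels = percent_format())
```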
You could use the sjp.stackfrq function from the sjPlot-package (see examples here).
M4 = matrix(sample(1:2,20*5, replace=TRUE),20,5)
M4 <- as.data.frame(M4)
sjp.stackfrq(M4)
# alternative colors: sjp.stackfrq(M4, barColor = c("aquamarine4", "brown3"))
Plot appearance can be customized with various parameters...
I really like the usage of the implicit information that is created by ggplot itself, as described in this post:
using the ggplot_build() function
From my point of view this provides a lot of opportunities to finally control the appearance of a ggplot chart.
Hope this helps somehow
Tom
I have sampled 10,000 coordinates from my data in this file. I have around 130,000 points.
https://www.dropbox.com/s/40hfyx6a5hsjuv7/data.csv
I am trying to plot these points on the Americas map using ggplot2. Here is my code.
library(ggplot2)
library(maps)
map_world <- map_data("world")
map_world <- subset(map_world, (lat >= -60 & lat <= 75))
map_world <- subset(map_world, (long >= -170 & long <= -30))
p <- ggplot(data = data_coords, legend = FALSE) +
geom_polygon(data = map_world, aes(x = long, y = lat, group = group)) +
geom_point(aes(x = lon, y = lat), shape = 19, size = 0.00001,
alpha = 0.3, colour = "red") +
theme(panel.grid.major = element_blank()) +
theme(panel.grid.minor = element_blank()) +
theme(axis.text.x = element_blank(),axis.text.y = element_blank()) +
theme(axis.ticks = element_blank()) +
xlab("") + ylab("")
png("my_plot.png", width = 8000, height = 7000, res = 1000)
print(p)
dev.off()
The points seem to cover the whole area in which they were plotted. I would like them to be smaller to better represent a location. You can see that I've set the size to 0.00001. I was just trying to see if it has any effect, but it doesn't seem to help beyond a certain limit. Is this the best that is possible at this resolution, or could the size be reduced further?
I had actually plotted around 400,000 points but only on the US map before and they looked much better like below. Hoping to get something like this. Thanks.
https://www.dropbox.com/s/8d0niu9g6ygz0wo/Clusters_reduced.png
Try playing with very small values of alpha, instead of the point size:
http://docs.ggplot2.org/0.9.3.1/geom_point.html
# Varying alpha is useful for large datasets
d <- ggplot(diamonds, aes(carat, price))
d + geom_point(alpha = 1/1000)
Edit:
Additional ideas are given in the documentation. Here's a summary:
Details
The scatterplot is useful for displaying the relationship between two continuous variables, although it can also be used with one continuous and one categorical variable, or two categorical variables. See geom_jitter for possibilities.
The bubblechart is a scatterplot with a third variable mapped to the size of points. There are no special names for scatterplots where another variable is mapped to point shape or colour, however.
The biggest potential problem with a scatterplot is overplotting: whenever you have more than a few points, points may be plotted on top of one another. This can severely distort the visual appearance of the plot. There is no one solution to this problem, but there are some techniques that can help. You can add additional information with stat_smooth, stat_quantile or stat_density2d. If you have few unique x values, geom_boxplot may also be useful. Alternatively, you can summarise the number of points at each location and display that in some way, using stat_sum.
Another technique is to use transparent points, geom_point(alpha = 0.05).
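Two of these ideas can be sketched as follows. The data frame `pts` is a made-up stand-in for data_coords, and the bin count is an arbitrary choice. Shape "." is documented as the smallest visible rectangle (usually about one pixel), so it can help when filled shapes stop shrinking.

```r
library(ggplot2)

set.seed(42)
# Hypothetical stand-in for data_coords: 10,000 lon/lat pairs.
pts <- data.frame(lon = rnorm(1e4, -95, 10), lat = rnorm(1e4, 38, 6))

# Option 1: draw each point as a tiny rectangle, combined with transparency.
p_pixel <- ggplot(pts, aes(lon, lat)) +
  geom_point(shape = ".", alpha = 0.3, colour = "red")

# Option 2: count the points per grid cell and map the count to fill.
p_bins <- ggplot(pts, aes(lon, lat)) +
  geom_bin2d(bins = 100)
```

Either layer can be added on top of the geom_polygon map layer from the question in place of the original geom_point call.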
Edit 2:
Combining the details from the manual with the hints in "Transparency and Alpha levels for ggplot2 stat_density2d with maps and layers in R", a solution might look like this:
library(ggplot2)
library(maps)
data_coords <- read.csv("C:/Downloads/data.csv")
map_world <- map_data("world")
map_world <- subset(map_world, (lat >= -60 & lat <= 75))
map_world <- subset(map_world, (long >= -170 & long <= -30))
p <- ggplot( data = data_coords, legend = FALSE) +
geom_polygon( data = map_world, aes(x = long, y = lat, group = group)) +
stat_density2d( data = data_coords, aes(x=lon, y=lat, fill = as.factor(..level..)), size=1, bins=10, geom='polygon') +
scale_fill_manual(values = c("yellow","red","green","royalblue", "black","white","orange","brown","grey"))
png("my_plot2k.png", width = 2000, height = 2000, res = 500)
print(p)
dev.off()
Resulting image (not the best colour palette used):
Is there a way in ggplot2 to get the plot type "b"? See example:
x <- c(1:5)
y <- x
plot(x,y,type="b")
Ideally, I want to replace the points by their values to have something similar to this famous example:
EDIT:
Here is some sample data (I want to plot each "cat" in a facet with plot type "b"):
df <- data.frame(x=rep(1:5,9),y=c(0.02,0.04,0.07,0.09,0.11,0.13,0.16,0.18,0.2,0.22,0.24,0.27,0.29,0.31,0.33,0.36,0.38,0.4,0.42,0.44,0.47,0.49,0.51,0.53,0.56,0.58,0.6,0.62,0.64,0.67,0.69,0.71,0.73,0.76,0.78,0.8,0.82,0.84,0.87,0.89,0.91,0.93,0.96,0.98,1),cat=rep(paste("a",1:9,sep=""),each=5))
Set up the axes by drawing the plot without any content.
plot(x, y, type = "n")
Then use text to make your data points.
text(x, y, labels = y)
You can add line segments with lines.
lines(x, y, col = "grey80")
EDIT: Totally failed to clock the mention of ggplot in the question. Try this.
dfr <- data.frame(x = 1:5, y = 1:5)
p <- ggplot(dfr, aes(x, y)) +
geom_text(aes(x, y, label = y)) +
geom_line(col = "grey80")
p
ANOTHER EDIT: Given your new dataset and request, this is what you need.
ggplot(df, aes(x, y)) + geom_point() + geom_line() + facet_wrap(~cat)
YET ANOTHER EDIT: We're starting to approach a real question. As in 'how do you make the lines not quite reach the points'.
The short answer is that there isn't a standard way to do this in ggplot2. The proper way to do this would be to use geom_segment and interpolate between your data points. This is quite a lot of effort, however, so I suggest an easier fudge: draw big white circles around your points. The downside to this is that it makes the gridlines look silly, so you'll have to get rid of those.
ggplot(df, aes(x, y)) +
facet_wrap(~cat) +
geom_line() +
geom_point(size = 5, colour = "white") +
geom_point() +
theme(panel.background = element_blank())
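For comparison, the geom_segment route mentioned above could look roughly like this. It is a sketch for a single series, not a general solution: the `gap` value is an assumed fraction trimmed off both ends of every segment, so longer segments get proportionally larger gaps, and the sample y values are made up.

```r
library(ggplot2)

dfr <- data.frame(x = 1:5, y = c(1, 3, 2, 5, 4))

# One segment per consecutive pair of points.
seg <- data.frame(
  x = head(dfr$x, -1), y = head(dfr$y, -1),
  xend = tail(dfr$x, -1), yend = tail(dfr$y, -1)
)

# Trim a fraction `gap` off both ends of every segment (assumed value; tune by eye).
gap <- 0.15
dx <- seg$xend - seg$x
dy <- seg$yend - seg$y
seg_short <- data.frame(
  x = seg$x + gap * dx, y = seg$y + gap * dy,
  xend = seg$xend - gap * dx, yend = seg$yend - gap * dy
)

# Shortened lines plus the values drawn as text, similar to plot(type = "b").
p <- ggplot(dfr, aes(x, y)) +
  geom_segment(data = seg_short, aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_text(aes(label = y))
```

For faceted data the segments would need to be built separately within each group before plotting.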
There's an experimental grob in gridExtra to implement this in Grid graphics,
library(gridExtra)
grid.newpage() ; grid.barbed(pch=5)
This is now easy with ggh4x::geom_pointpath. Set shape = NA and add a geom_text layer.
library(ggh4x)
#> Loading required package: ggplot2
df <- data.frame(x = rep(1:5, each = 5),
y = c(outer(seq(0, .8, .2), seq(0.02, 0.1, 0.02), `+`)),
cat = rep(paste0("a", 1:5)))
ggplot(df, aes(x, y)) +
geom_text(aes(label = cat)) +
geom_pointpath(aes(group = cat, shape = NA))
Created on 2021-11-13 by the reprex package (v2.0.1)
Another way to make great slope graphs is using the package CGPfunctions.
library(CGPfunctions)
newggslopegraph(newcancer, Year, Survival, Type)
You also have many options to choose from. You can find a good tutorial here:
https://www.r-bloggers.com/2018/06/creating-slopegraphs-with-r/