Highlighting lines and graying out the rest in a multiple line chart with ggplot2

I generated a multiple line chart in ggplot2 with different countries, and I want to colour the top 10 and grey out the rest.
When I assign the colours black and red, they are applied to the first two countries in the legend. However, I want to colour others further down the list, such as the US, India and Brazil. Help much appreciated, thanks.
This is the code I have:
ggplot(data=y, aes(x = Date, y = Deaths, color = Country)) +
geom_line() +
scale_color_manual(values = c("black",
"red",
rep("gray", 196)))

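You first need to order your countries according to the number of deaths (reorder() sorts the factor levels by each country's mean of -Deaths, so the countries with the most deaths come first). Then, in scale_color_manual, the first 10 colours need to be colours of your choice, not just the first two: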
library(ggplot2)
y$Country <- reorder(y$Country, -y$Deaths)
ggplot(data = y, aes(x = Date, y = Deaths, color = Country)) +
geom_line() +
scale_color_manual(values = c(rep(c("black", "red"), each = 5),
# one gray per remaining country (count factor levels, not rows)
rep("gray", nlevels(y$Country) - 10))) +
guides(color = guide_none())
Note that since you didn't share your data, I made some up with a similar structure and the same names as yours so that the above code should also work on your own data set.
Made-up data
set.seed(1)
y <- data.frame(
Deaths = c(replicate(198, 1000 * cumprod(runif(100, 1, 1 + sample(10, 1)/100)))),
Date = rep(seq(as.POSIXct('2020-01-01'), as.POSIXct('2022-01-01'), len = 100),
198),
Country = factor(rep(1:198, each = 100)))
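If the goal is to highlight specific countries by name (for example US, India and Brazil) rather than the top 10, one option is to pass a named vector to scale_color_manual. The sketch below is only an illustration under assumptions: the toy data frame and the names `highlight` and `cols` are made up here and are not taken from the question.

```r
library(ggplot2)

# Hypothetical toy data: five countries, three dates each
y <- data.frame(
  Date    = rep(as.Date("2020-01-01") + 0:2, times = 5),
  Deaths  = c(1, 5, 9, 2, 4, 6, 1, 2, 3, 3, 6, 12, 0, 1, 2),
  Country = rep(c("Brazil", "Chile", "France", "India", "US"), each = 3)
)

# Countries to highlight and their colours (assumed choice)
highlight <- c(US = "black", India = "red", Brazil = "blue")

# Start with every country gray, then overwrite the highlighted ones by name
countries <- unique(y$Country)
cols <- setNames(rep("gray", length(countries)), countries)
cols[names(highlight)] <- highlight

p <- ggplot(y, aes(x = Date, y = Deaths, color = Country)) +
  geom_line() +
  # breaks restricts the legend to the highlighted countries
  scale_color_manual(values = cols, breaks = names(highlight))
```

Because the colours are matched by name, the result does not depend on the order of the factor levels.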

Related

ggplot: Using factors to determine colour, shape and fill (issue with fill/color)

I have the following:
set.seed(100)
df <- data.frame(
lng = runif(n=20, min=5, max=10),
lat = runif(n=20, min=40, max=50),
year = rep(c("2001","2002","2003","2004"), each=5),
season = sample(c("spring", "autumn"), 10, replace = T),
info = sample(c("yes","no"), 10, replace = T)
)
Which can be plotted by:
ggplot() +
geom_point(data=df,
aes(x = lng,
y = lat,
color = year,
shape = season),
size=3)
That produces the plot I expect. But I want a red outline on the shapes where info == "yes".
The desired output was shown as a mock-up made in PowerPoint, for demonstration purposes only and not using actual data.
Admittedly this is similar to another question, but not quite the same.
I am happy to split the df using a filter and use two geom_point() layers if that is easier.
Many thanks
Jim
Below is a quick solution (not the best one): use another scale, here size, and then use guides() to manually specify the shape that appears in the legend. You need to plot the bigger red shapes first and then plot the regular points over them, so that the red looks like an outline:
ggplot() +
geom_point(data=subset(df,info=="yes"),
aes(x=lng,y=lat,shape = season,size=info),col="red") +
scale_size_manual(values=3.6)+
geom_point(data=df,
aes(x = lng,
y = lat,
color = year,
shape = season),
size=3)+
guides(size = guide_legend(override.aes = list(shape = 1)))
You can change the legend for the shape by playing around with the options in guides().
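A different route, sketched here as an alternative rather than as the answer's method, is to use the fillable point shapes (21 to 25), which have separate fill and outline colours. The fill is mapped to year and the outline to info. The data below follows the question's structure, the chosen shapes and the "transparent" outline for info == "no" are assumptions made for this illustration.

```r
library(ggplot2)

set.seed(100)
df <- data.frame(
  lng    = runif(n = 20, min = 5, max = 10),
  lat    = runif(n = 20, min = 40, max = 50),
  year   = rep(c("2001", "2002", "2003", "2004"), each = 5),
  season = sample(c("spring", "autumn"), 20, replace = TRUE),
  info   = sample(c("yes", "no"), 20, replace = TRUE)
)

p <- ggplot(df, aes(x = lng, y = lat)) +
  # shapes 21 and 24 take a fill (inside) and a colour (outline)
  geom_point(aes(fill = year, shape = season, color = info),
             size = 3, stroke = 1) +
  scale_shape_manual(values = c(spring = 21, autumn = 24)) +
  # red outline only where info == "yes"
  scale_color_manual(values = c(yes = "red", no = "transparent")) +
  guides(
    # draw the fill keys with a fillable shape so the colours are visible
    fill  = guide_legend(override.aes = list(shape = 21, color = "transparent")),
    color = guide_legend(override.aes = list(shape = 21))
  )
```

With this approach the outline is part of the point itself, so no second layer or plotting order is needed.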

Show only data labels for every N day and highlight a line for a specific variable in R ggplot

I'm trying to figure out two problems in R ggplot:
1. Show data labels only for every Nth day/data point
2. Highlight the line for a specific variable (make it bigger and/or dotted)
My code is below:
ggplot(data = sales,
aes(x = dates, y = volume, colour = Country, size = ifelse(Country=="US", 1, 0.5), group = Country)) +
geom_line() +
geom_point() +
geom_text(data = sales, aes(label=volume), size=3, vjust = -0.5)
I can't find a way to space out the data labels. Currently they are shown for every data point on every day, which makes the plot very hard to read.
As for #2, unfortunately the size with ifelse doesn't work: the 'US' line becomes super huge, and I can't change that size no matter what I specify in the first parameter of ifelse.
Would appreciate any help!
As no data was provided, the solution is probably not perfect, but it nonetheless shows the general approach. Try this:
library(dplyr)
library(ggplot2)
sales_plot <- sales %>%
# Create label
# e.g. assuming dates are in Date format, labels are "only" created for even days
mutate(label = ifelse(lubridate::day(dates) %% 2 == 0, volume, ""))
ggplot(data = sales_plot,
# To adjust the size: Simply set labels. The actual size is set in scale_size_manual
aes(x = dates, y = volume, colour = Country, size = ifelse(Country == "US", "US", "other"), group = Country)) +
geom_line() +
geom_point() +
geom_text(aes(label = label), size = 3, vjust = -0.5) +
# Set the size according to the labels
scale_size_manual(values = c(US = 2, other = .9))
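The even-days rule above is one way to thin out the labels. To label every Nth data point per country instead, a row counter within each group can be used. This is a minimal sketch under assumptions: the toy `sales` data and the value of `n` are made up, since the original data was not shared.

```r
library(dplyr)

# Hypothetical toy data: two countries, ten consecutive days each
sales <- data.frame(
  dates   = rep(as.Date("2021-01-01") + 0:9, times = 2),
  Country = rep(c("US", "UK"), each = 10),
  volume  = c(1:10, 11:20)
)

n <- 3  # label every 3rd point (assumed value)

sales_plot <- sales %>%
  group_by(Country) %>%
  arrange(dates, .by_group = TRUE) %>%
  # row_number() counts within each country; keep rows 1, 1 + n, 1 + 2n, ...
  mutate(label = ifelse((row_number() - 1) %% n == 0,
                        as.character(volume), "")) %>%
  ungroup()
```

The resulting label column can then be passed to geom_text(aes(label = label)) exactly as in the code above.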

Add legend using geom_point and geom_smooth from different dataset

I am really struggling to set the correct legend for a geom_point plot with a loess regression when two data sets are used.
I have a data set that summarizes activity over a day. On the same graph I plot all the activity recorded per hour and per day, a regression curve smoothed with a loess function, and the mean of each hour over all the days.
To be more precise, here is the first code and the graph it returns, without a legend, which is exactly what I expected:
# first graph: gives what I expected, but with no legend
p <- ggplot(dat1, aes(x = Hour, y = value)) +
geom_point(color = "darkgray", size = 1) +
geom_point(data = dat2, mapping = aes(x = Hour, y = mean),
color = 20, size = 3) +
geom_smooth(method = "loess", span = 0.2, color = "red", fill = "blue")
The graph (all the data, per hour and per day, is in grey; the red curve is the loess regression; the blue dots are the means for each hour):
When I tried to set the legend, I failed to produce one that explains both kinds of dots (data in grey, mean in blue) and the loess curve (in red). Below are some examples of what I tried.
# second graph: gives what I expected plus the legend I wanted for the loess curve,
# but without the legend for the dots
p <- ggplot(dat1, aes(x = Hour, y = value)) +
geom_point(color = "darkgray", size = 1) +
geom_point(data = dat2, mapping = aes(x = Hour, y = mean),
color = "blue", size = 3) +
geom_smooth(method = "loess", span = 0.2, aes(color = "red"), fill = "blue") +
scale_color_identity(name = "legend model", guide = "legend",
labels = "loess regression \n with confidence interval")
I obtained the correct legend for the curve only.
Another attempt:
# I tried to combine both data sets into a single one as follows, but it did not
# work at all, and I really do not understand how legends work in ggplot2
# compared to normal plots
A <- rbind(dat1, dat2)
p <- ggplot(A, aes(x = Heure, y = value, color = variable)) +
geom_point(data = subset(A, variable == "data"), size = 1) +
geom_point(data = subset(A, variable == "Moy"), size = 3) +
geom_smooth(method = "loess", span = 0.2, aes(color = "red"), fill = "blue") +
scale_color_manual(name = "légende",
labels = c("Data", "Moy", "loess regression \n with confidence interval"),
values = c("darkgray", "royalblue", "red"))
It appears that all the legend settings are mixed together in a "weird" way: there is a grey dot covered by a grey line, and then the same in blue and in red (for the 3 labels). All of them have a background filled in blue:
If you need to label the mean, you might need to be a bit creative, because it's not so easy to add a legend manually in ggplot.
I simulate something that looks like your data below.
library(dplyr)
library(ggplot2)
dat1 = data.frame(
Hour = rep(1:24,each=10),
value = c(rnorm(60,0,1),rnorm(60,2,1),rnorm(60,1,1),rnorm(60,-1,1))
)
# classify this as raw data
dat1$Data = "Raw"
# calculate mean like you did
dat2 <- dat1 %>% group_by(Hour) %>% summarise(value=mean(value))
# classify this as mean
dat2$Data = "Mean"
# combine the data frames
plotdat <- rbind(dat1,dat2)
# add a dummy variable, we'll use it later
plotdat$line = "Loess-Smooth"
We make the basic dot plot first:
ggplot(plotdat, aes(x = Hour, y = value,col=Data,size=Data)) +
geom_point() +
scale_color_manual(values=c("blue","darkgray"))+
scale_size_manual(values=c(3,1),guide="none")
Note that for size we set guide to "none" so it will not appear in the legend. Now we add the loess smooth. One way to give it a legend entry is to map a linetype to the dummy variable; since it has only one value, the legend gets a single entry:
ggplot(plotdat, aes(x = Hour, y = value,col=Data,size=Data)) +
geom_point() +
scale_color_manual(values=c("blue","darkgray"))+
scale_size_manual(values=c(3,1),guide="none")+
geom_smooth(data=subset(plotdat,Data=="Raw"),
aes(linetype=line),size=1,alpha=0.3,
method = "loess", span = 0.2, color = "red", fill = "blue")
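Another option, shown here only as a sketch and not as the answer's method, is to keep everything in a single colour legend and clean up the keys with override.aes. Each layer maps colour to a constant label, and the legend keys are then told which entries are points and which is a line. The simulated data, the labels "Data", "Mean" and "Loess" and the legend title are assumptions for this illustration.

```r
library(ggplot2)

set.seed(1)
# Simulated raw data and hourly means (stand-ins for dat1 and dat2)
dat1 <- data.frame(Hour = rep(1:24, each = 10), value = rnorm(240))
dat2 <- aggregate(value ~ Hour, data = dat1, FUN = mean)

p <- ggplot(dat1, aes(x = Hour, y = value)) +
  geom_point(aes(color = "Data"), size = 1) +
  geom_point(data = dat2, aes(color = "Mean"), size = 3) +
  geom_smooth(aes(color = "Loess"), method = "loess", formula = y ~ x,
              span = 0.2, fill = "blue") +
  scale_color_manual(
    name   = "Legend",
    breaks = c("Data", "Mean", "Loess"),
    values = c(Data = "darkgray", Mean = "blue", Loess = "red")
  ) +
  # one value per legend entry, in the order given by breaks:
  # points for Data and Mean, a line only for Loess, no blue key background
  guides(color = guide_legend(override.aes = list(
    shape    = c(16, 16, NA),
    linetype = c("blank", "blank", "solid"),
    fill     = NA
  )))
```

The vectors inside override.aes must follow the order of the legend entries, which is why breaks is set explicitly.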

Adding points, symbols, and legends to ggplot

I have created a plot using ggplot (with the DF1 dataset below). I would like two additions to this plot:
1. Add symbols based on the DF.SYMBOL dataset (at the specified times for the two IDs, with a different shape and color per event).
2. Add a vertical line within the bar, with CONC as legend, based on the DF.LINE dataset.
I would appreciate your suggestions!
ID<-rep(c(1,2),each=6)
START <- c(0, 42,57,300,520,710, 0,31,56,85,120,300)
END <- c(42,57,300,520,710,711,31,56,85,120,300,301)
TYPE <- c("S","NR","R","NR","R","R","S","R","NR","R","NR","NR")
DF1 <-data.frame(ID,START,END,TYPE)
DF1
# converting ID from numeric to factor (the %<>% operator comes from magrittr)
library(magrittr)
DF1 %<>%
dplyr::mutate(ID = factor(ID))
ggplot(DF1,aes(y=ID,yend=ID,x=START,xend=END,color=TYPE))+
geom_segment(aes(y=ID,yend=ID,x=START,xend=END),size=6,lineend= "butt")
DF.SYMBOL dataset to add points and symbols to the plot
ID<-rep(c(1,2),each=2)
EVENT <- rep(c("TBR","PBR"))
TIME <- c(90, 220,120,200)
DF.SYMBOL<-data.frame(ID,EVENT,TIME)
DF.LINE dataset to add a vertical line in bar with CONC in legend above the vertical line for each ID
ID <- c(1,2)
TIME <- c(400, 265)
CONC <- c(23,97)
DF.LINE<-data.frame(ID,TIME, CONC)
Here's the desired plot (edited in PowerPoint): symbols based on the DF.SYMBOL dataset and a black line with a value based on the DF.LINE dataset.
This should do it. I used geom_errorbarh for the vertical line - I don't know a better way to get a vertical line across a horizontal bar on a discrete scale. For better control of the thickness you might consider changing the geom_segment to a geom_rect.
DF.SYMBOL$ID = factor(DF.SYMBOL$ID)
DF.LINE$ID = factor(DF.LINE$ID)
ggplot(DF1,aes(y=ID))+
geom_segment(aes(yend=ID, x=START, xend=END, color = TYPE),size=6,lineend= "butt") +
geom_point(data = DF.SYMBOL, aes(x = TIME, fill = EVENT, shape = EVENT), size = ) +
scale_shape_manual(values = c(21, 24)) +
scale_fill_manual(values = c("red", "yellow")) +
geom_errorbarh(data = DF.LINE, aes(xmin = TIME, xmax = TIME), height = 0.1) +
geom_text(data = DF.LINE, aes(x = TIME, label = CONC), vjust = -1.5)
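The geom_rect suggestion can be sketched as follows. Since geom_rect needs numeric y limits, the factor ID is converted to its integer position and the bar thickness is set through a half-height. The `half_height` name and its value are assumptions made for this illustration.

```r
library(ggplot2)

DF1 <- data.frame(
  ID    = factor(rep(c(1, 2), each = 6)),
  START = c(0, 42, 57, 300, 520, 710, 0, 31, 56, 85, 120, 300),
  END   = c(42, 57, 300, 520, 710, 711, 31, 56, 85, 120, 300, 301),
  TYPE  = c("S", "NR", "R", "NR", "R", "R", "S", "R", "NR", "R", "NR", "NR")
)

half_height <- 0.2  # half of the bar thickness in y units (assumed value)

p <- ggplot(DF1) +
  geom_rect(aes(xmin = START, xmax = END,
                ymin = as.numeric(ID) - half_height,
                ymax = as.numeric(ID) + half_height,
                fill = TYPE)) +
  # restore the ID labels on the now continuous y axis
  scale_y_continuous(name = "ID",
                     breaks = seq_along(levels(DF1$ID)),
                     labels = levels(DF1$ID))
```

Note that the bars are now coloured through fill rather than color, and any layers added on top need a numeric y (for example as.numeric(ID)) to match the continuous axis.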

Facet wrap radar plot with three apexes in R

I have created the following plot which gives the shape of the plot I desire. But when I facet wrap it, the shapes no longer remain triangular and become almost cellular. How can I keep the triangular shape after faceting?
Sample data:
lvls <- c("a","b","c","d","e","1","2","3","4","5","6","7","8","9","10","11","12","13","14","15")
df <- data.frame(Product = factor(rep(lvls, 3)),
variable = c(rep("Ingredients", 20),
rep("Defence", 20),
rep("Benefit", 20)),
value = rnorm(60, mean = 5))
Now when I use this code, I get the shapes I desire.
ggplot(df,
aes(x = variable,
y = value,
color = Product,
group = Product)) +
geom_polygon(fill = NA) +
coord_polar()
However, the products are all on top of one another so ideally I would like to facet wrap.
ggplot(df,
aes(x = variable,
y = value,
color = Product,
group = Product)) +
geom_polygon(fill = NA) +
coord_polar() +
facet_wrap(~Product)
But when I facet wrap, the shapes become oddly cellular and not triangular (straight lines from point to point). Any ideas on how to alter this output?
Thanks.
