I want to produce the scatterplots between the first variable of a dataset and all others, e.g. from iris the Sepal.Length with all others. I have created the following:
data <- iris[,c(-5)]
par(mfrow = c(2, 2))
for (i in seq(ncol(data))[-1]) {
plot(
data[, 1],
data[, i],
xlab = colnames(data)[i],
ylab = "Y"
)
lines(lowess(data[,1],data[,i]),col="red")
}
which results in:
Is there any way to make it look more professional and less plain?
ggplot2 is great for this type of thing. There are a bunch of themes that can be used to quickly create high-quality plots. It also gives you a lot of flexibility to customize your plot by changing individual elements.
In addition to making the plots pretty, it is very effective at creating the plots in the first place. Here is somewhere to start:
library(tidyverse)
#your example data
data <- iris[, c(-5)]
#pivot_longer rearranges the data into a long form, which makes it easier to plot
data_def <- pivot_longer(data, -Sepal.Length)
#the data to be plotted
ggplot(data_def, aes(x = Sepal.Length, y = value)) +
#adding the scatter plot (each value is a point)
geom_point() +
#adding a LOESS smoothed line (the default method of smoothing), without the standard error
geom_smooth(se = FALSE, color = "red", size = 0.5) +
#Splits into the three plots based on the measurements and moves the titles underneath the x-axis
facet_wrap( ~ name, scales = "free_y", strip.position = "bottom") +
#Changes the overall look of the plot
theme_classic() +
#Removes elements of the former title (now x-axis) so that there is no surrounding box
theme(strip.background = element_blank(),
strip.placement = "outside") +
#Manually change the axis labels
labs(x = NULL, y = "Y")
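As a small sketch of the "bunch of themes" point above: the plot can be built once and stored, and complete themes can then be swapped in to compare looks. The object names (base_plot, p_classic, p_minimal, p_bw) are only illustrative; the theme functions are the ones that ship with ggplot2.

```r
library(ggplot2)
library(tidyr)

# Same long-format data as in the answer above
data_def <- pivot_longer(iris[, -5], -Sepal.Length)

# Build the plot once, without committing to a theme
base_plot <- ggplot(data_def, aes(x = Sepal.Length, y = value)) +
  geom_point() +
  facet_wrap(~ name, scales = "free_y")

# Swap complete themes to compare the overall look
p_classic <- base_plot + theme_classic()
p_minimal <- base_plot + theme_minimal()
p_bw      <- base_plot + theme_bw(base_size = 12)
```

Printing any of the three objects draws that variant; further tweaks can be layered on with theme() as in the answer.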
I also use ggpubr, which is based on ggplot2.
I'm using the package patchwork to combine multiple ggplot2 plots vertically. I'd like the legends for each plot to be aligned directly above one another, regardless of the length of the scale name. At the moment, the legends are not aligned above one another.
I'm open to using ggpubr or facet_grid() if they would make it possible, but I've seen that facets don't allow multiple scales, and I haven't found any solution using ggpubr.
library(ggplot2)
library(patchwork)
set.seed(0)
testdata <- data.frame(x=1:10, y=1:10, col=runif(10))
g1 <- ggplot(testdata, aes(x=x,y=y,col=col)) + geom_point() +
scale_color_gradient(name="Short")
g2 <- ggplot(testdata, aes(x=x,y=y,col=col)) + geom_point() +
scale_color_gradient(name="A rather longer name")
g1/g2
ggsave("testfile.tiff", units = "mm", device="tiff",
width=100, height=100, dpi = 100)
Ideal output:
With plot_layout you can "collect" the legends. By default, the collected legends are placed as with theme(legend.position = 'right'). To move them, add & theme(legend.position = ...) after plot_layout and set the position you want.
g1/g2 + plot_layout(guides = 'collect') # & theme(legend.position = 'right') <- adjust position here!
ggsave("testfile.tiff", units = "mm", device="tiff",
width=100, height=100, dpi = 100)
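For illustration, here is a minimal sketch of the adjustment described above, moving the collected legends below the plots. The name combined is just a placeholder. The parentheses around g1 / g2 are not strictly required, but they make the grouping explicit; & applies the theme to every plot in the patchwork.

```r
library(ggplot2)
library(patchwork)

set.seed(0)
testdata <- data.frame(x = 1:10, y = 1:10, col = runif(10))

g1 <- ggplot(testdata, aes(x = x, y = y, col = col)) +
  geom_point() +
  scale_color_gradient(name = "Short")
g2 <- ggplot(testdata, aes(x = x, y = y, col = col)) +
  geom_point() +
  scale_color_gradient(name = "A rather longer name")

# Collect both legends into one shared area, then place that area at the bottom
combined <- (g1 / g2) +
  plot_layout(guides = "collect") &
  theme(legend.position = "bottom")

combined
```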
I'd also be curious to learn of a patchwork parameter that could fix this, but I don't think there is one (please correct me if I'm wrong). You may have noticed that Hadley's answer is more than 10 years old, and people have been working on ggplot2 since then. The ggnewscale package solves the problem of having multiple scales per plot. Here is a faceted approach using multiple colour scales:
library(ggplot2)
library(ggnewscale)
set.seed(0)
testdata <- data.frame(x=1:10, y=1:10, col=runif(10))
ggplot(mapping = aes(x = x, y = y)) +
geom_point(data = transform(testdata,
facet = factor("Top", c("Top", "Bottom"))),
aes(colour = col)) +
scale_colour_continuous(name = "Short") +
new_scale_colour() +
geom_point(data = transform(testdata,
facet = factor("Bottom", c("Top", "Bottom"))),
aes(colour = col)) +
scale_colour_continuous(name = "A rather longer name") +
facet_wrap(~ facet, ncol = 1)
I'm using Visual Studio with R version 3.5.1, and I tried to add a legend to a plot.
library(ggplot2)
library(pracma)  # assumed source of cumtrapz(); the original post does not say
f1 = function(x) {
return(x+1)}
x1 = seq(0, 1, by = 0.01)
data1 = data.frame(x1 = x1, f1 = f1(x1), F1 = cumtrapz(x1, f1(x1)) )
However, when I tried to plot it, it never gave me a legend!
For example, I used the same code as in this question (Missing legend with ggplot2 and geom_line):
ggplot(data = data1, aes(x1)) +
geom_line(aes(y = f1), color = "1") +
geom_line(aes(y = F1), color = "2") +
scale_color_manual(values = c("red", "blue"))
I also looked into (How to add legend to ggplot manually? - R) and many other pages on Stack Overflow, and I have tried every single function in https://www.rstudio.com/wp-content/uploads/2016/11/ggplot2-cheatsheet-2.1.pdf
i.e.
theme(legend.position = "bottom")
scale_fill_discrete(...)
group
guides()
show.legend=TRUE
I even tried to use the original plot() and legend() functions. Neither worked.
I thought there might be something wrong with the data frame, but when I split the columns (x1, f1, F1) apart, it still didn't work.
I thought there might be something wrong with the IDE, but the code given by kohske actually plotted a legend!
d<-data.frame(x=1:5, y1=1:5, y2=2:6)
ggplot(d, aes(x)) +
geom_line(aes(y=y1, colour="1")) +
geom_line(aes(y=y2, colour="2")) +
scale_colour_manual(values=c("red", "blue"))
What's wrong with the code?
As far as I know, you only have X and Y variables in your aesthetics; the colours are set as constants outside aes(), so ggplot2 has nothing to build a legend from. You have xlab and ylab to describe your two lines. If you want a legend, you should put the grouping in the aesthetics, which might require recoding your dataset:
d<- data.frame(x=c(1:5, 1:5), y=c(1:5, 2:6), colorGroup = c(rep("redGroup", 5),
rep("blueGroup", 5)))
ggplot(d, aes(x, y, color = colorGroup )) + geom_line()
This should give you two lines and a legend.
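A minimal sketch of the same idea applied to the data from the question, without reshaping it: the colour is mapped inside aes() for each line, which is what lets scale_colour_manual() build a legend. To keep the sketch self-contained, the cumulative integral is computed with a base R trapezoid rule as a stand-in for cumtrapz(); the legend title "Curve" and the labels "f1" and "F1" are arbitrary choices.

```r
library(ggplot2)

f1 <- function(x) x + 1
x1 <- seq(0, 1, by = 0.01)
y  <- f1(x1)

# Cumulative trapezoidal integral in base R (stand-in for cumtrapz())
F1 <- c(0, cumsum(diff(x1) * (head(y, -1) + tail(y, -1)) / 2))
data1 <- data.frame(x1 = x1, f1 = y, F1 = F1)

# colour is mapped inside aes(), so each line gets a legend entry
p <- ggplot(data1, aes(x = x1)) +
  geom_line(aes(y = f1, colour = "f1")) +
  geom_line(aes(y = F1, colour = "F1")) +
  scale_colour_manual(name = "Curve", values = c(f1 = "red", F1 = "blue"))

p
```

Naming the entries of values ties each colour to its label rather than relying on alphabetical order.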
Hopefully a trivial one. The code below is deliberately simple and is just meant to illustrate the issue (and hence is quite crude). I am wondering what the best way to set a maximum number of x-axis tick marks might be here.
library(ggplot2)
library(data.table)
data <- data.table(year=c(2009,2009,2009,2009,2010,2010,2010,2010,
2011,2011,2011,2011,2012,2012,2012,2012,2013,2013,2013,2013,
2014,2014,2014,2014,2015,2015,2015,2015),year_quart = c("2009-Q1","2009-Q2",
"2009-Q3","2009-Q4","2010-Q1","2010-Q2","2010-Q3","2010-Q4","2011-Q1","2011-Q2",
"2011-Q3","2011-Q4","2012-Q1","2012-Q2","2012-Q3","2012-Q4",
"2013-Q1","2013-Q2","2013-Q3","2013-Q4","2014-Q1","2014-Q2","2014-Q3",
"2014-Q4","2015-Q1","2015-Q2","2015-Q3","2015-Q4"),region = c("EU","EU",
"EU","EU","EU","EU","EU","EU","EU","EU","EU","EU","EU","EU","EU","EU","EU",
"EU","EU","EU","EU","EU","EU","EU","EU","EU","EU","EU"),value = c(390,621,
442,113,586,571,391,432,758,897,696,160,189,567,621,922,402,185,609,812,549,
783,211,974,723,584,745,609))
plot1 <- ggplot(data, aes(factor(year_quart),value, xmin="2009-Q1", xmax="2009-Q4")) +
geom_line(aes(group=region),size=0.4) +
labs(x = "year", y = "value", title = "Title") +
scale_x_discrete(
breaks = unique(data$year_quart),
labels = unique(data$year_quart),
limits = c("2009-Q1","2009-Q2","2009-Q3","2009-Q4","2010-Q1","2010-Q2")
)
So with this code I get a plot which looks OKish. However if I swap
limits=c("2009-Q1","2009-Q2","2009-Q3","2009-Q4","2010-Q1","2010-Q2")
with
limits=c("2009-Q1","2009-Q2","2009-Q3","2009-Q4","2010-Q1","2010-Q2","2010-Q3", "2010-Q4","2011-Q1","2011-Q2","2011-Q3","2011-Q4","2012-Q1","2012-Q2","2012-Q3",
"2012-Q4","2013-Q1","2013-Q2","2013-Q3","2013-Q4","2014-Q1","2014-Q2","2014-Q3",
"2014-Q4","2015-Q1","2015-Q2","2015-Q3","2015-Q4")
I generate far too many tickmarks to be viewed clearly. So what I would ideally like is, for a certain year/quarter range, specific code that generates a maximum number of (clearly viewable) tickmarks depending on this range.
Many thanks in advance!
If our goal is to make the tick labels more readable, we can always rotate them using the axis.text.x argument in theme:
ggplot(data, aes(factor(year_quart),value, xmin="2009-Q1", xmax="2009-Q4")) +
geom_line(aes(group=region),size=0.4) +
labs(x = "year", y = "value", title = "Title") +
scale_x_discrete(
breaks = unique(data$year_quart),
labels = unique(data$year_quart),
limits=c("2009-Q1","2009-Q2","2009-Q3","2009-Q4","2010-Q1","2010-Q2","2010-Q3", "2010-Q4","2011-Q1","2011-Q2","2011-Q3","2011-Q4","2012-Q1","2012-Q2","2012-Q3",
"2012-Q4","2013-Q1","2013-Q2","2013-Q3","2013-Q4","2014-Q1","2014-Q2","2014-Q3",
"2014-Q4","2015-Q1","2015-Q2","2015-Q3","2015-Q4")
) +
theme(axis.text.x = element_text(angle = 45))
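If rotating is not enough, another option is to cap the number of labels directly. The following is only a sketch: thin_breaks() is a hypothetical helper (not part of ggplot2) that relies on scale_x_discrete() accepting a function of the limits for breaks, and the values in df are made up for illustration.

```r
library(ggplot2)

# Hypothetical helper: returns a breaks function that keeps at most `max_n`
# evenly spaced labels from the scale limits
thin_breaks <- function(max_n) {
  function(limits) {
    step <- max(1, ceiling(length(limits) / max_n))
    limits[(seq_along(limits) - 1) %% step == 0]
  }
}

# Illustrative data: 28 quarters with made-up values
quarters <- paste0(rep(2009:2015, each = 4), "-Q", 1:4)
set.seed(1)
df <- data.frame(year_quart = quarters, region = "EU",
                 value = sample(100:1000, 28))

p <- ggplot(df, aes(year_quart, value, group = region)) +
  geom_line() +
  scale_x_discrete(breaks = thin_breaks(8)) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

p
```

Because the helper works from whatever limits the scale ends up with, the same call adapts when the range is narrowed: a short range keeps every label, while a long one is thinned to every n-th quarter.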
See my related question and the accepted answer here.
I am trying to produce a plot similar to that in the accepted answer i.e. a gridded plot with a shared common legend and a different unique legend attached to each plot on the grid.
Specifically, I want a 3 row, 1 column grid with 1 plot on each row. Like this:
Which was produced with the following code:
library(ggplot2)
library(gridExtra)
library(grid)
library(cowplot)
diamonds2 <- diamonds[sample(nrow(diamonds), 500), ]
# 3 ggplot plot objects with multiple legends 1 common legend and 3 unique legends
p1 <- ggplot(diamonds2, aes(x = price, y = depth, color = clarity, shape = cut)) +
  geom_point(size = 5) + labs(shape = "unique legend", color = "common legend")
p2 <- ggplot(diamonds2, aes(x = price, y = depth, color = clarity, shape = color)) +
  geom_point(size = 5) + labs(shape = "unique legend", color = "common legend")
p3 <- ggplot(diamonds2, aes(x = price, y = depth, color = clarity, shape = clarity)) +
  geom_point(size = 5) + labs(shape = "unique legend", color = "common legend")
cowplot::plot_grid(
cowplot::plot_grid(
p1 + scale_color_discrete(guide = FALSE),
p2 + scale_color_discrete(guide = FALSE),
p3 + scale_color_discrete(guide = FALSE),
nrow=3, ncol = 1))
But with a shared legend which relates to the color = argument of each plot object.
I've tried many variations of the below code and have added/adjusted/removed various arguments/parameters in consultation with the cowplot documentation but I cannot get a neat plot like the one above with the shared legend at the bottom (or anywhere useful!) - everything I have attempted returns a crowded plot like below.
Adaptation of the above code to include the shared legend:
cowplot::plot_grid(
cowplot::plot_grid(
p1 + scale_color_discrete(guide = FALSE),
p2 + scale_color_discrete(guide = FALSE),
p3 + scale_color_discrete(guide = FALSE),
nrow=3, ncol = 1
),
cowplot::get_legend(p1 + scale_shape(guide = FALSE) + theme(legend.position = "bottom")), nrow=3)
Which results in a crowded plot like this with a lot of empty space:
Could anyone suggest where I might be going wrong?
Each call to plot_grid splits your plotting area. Here, you are nesting two calls to plot_grid, and you are asking for 3 rows in each. The outer call only holds two items (the nested grid and the legend), so cowplot splits the plotting area into three equal rows:
in the top row, it puts your scatter plots
in the second row, it puts your legend, and the third row stays empty, creating a lot of empty space while squishing your scatter plots.
You can specify the relative height of each plotting area with rel_heights, giving more space to the scatter plots and less space to the legend at the bottom. For instance, for 85% plots and 15% legend:
cowplot::plot_grid(
  cowplot::plot_grid(
    # guide = "none" replaces the deprecated guide = FALSE
    p1 + scale_color_discrete(guide = "none"),
    p2 + scale_color_discrete(guide = "none"),
    p3 + scale_color_discrete(guide = "none"),
    ncol = 1, align = "v"
  ),
  cowplot::get_legend(p1 + scale_shape(guide = "none") +
                        theme(legend.position = "bottom")),
  ncol = 1, rel_heights = c(.85, .15))
which produces: