I'd like to create a line graph with irregular break intervals on the x-axis. When I define the breaks as in the code example, I get additional unlabeled breaks, which always seem to be exactly in the middle of two defined breaks and are thus irregular too (see image in link).
test_frame <- data.frame("v1"=1:3,"v2"=3:1)
library(ggplot2)
ggplot(data = test_frame, aes(x=v1, y=v2, group=1))+geom_line()+
scale_x_continuous(breaks = c(2.74,2.43,1.19))
Graph with additional breaks:
Is there any way to get rid of these vertical lines, so that there are only lines at the defined break position? I'd be grateful for any suggestions.
Set minor_breaks = NULL:
scale_x_continuous(breaks = c(2.74,2.43,1.19), minor_breaks = NULL)
ggplot(data = test_frame, aes(x = v1, y = v2, group = 1)) +
geom_line() +
scale_x_continuous(breaks = c(2.74, 2.43, 1.19)) +
theme(panel.grid.minor = element_blank()) # or panel.grid.minor.x to keep horizontal lines
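As a rough sketch of why the extra lines appear: by default, minor grid lines are placed halfway between neighbouring major breaks, so irregular major breaks produce irregular minor lines. The names `major` and `midpoints` below are only illustrative and are not part of the original question.

```r
library(ggplot2)

test_frame <- data.frame(v1 = 1:3, v2 = 3:1)
major <- sort(c(2.74, 2.43, 1.19))

# Midpoints between neighbouring major breaks; this is where the
# unlabeled grid lines show up when minor breaks are left at the default
midpoints <- head(major, -1) + diff(major) / 2

# minor_breaks = NULL drops the minor grid lines on this axis only
p <- ggplot(test_frame, aes(x = v1, y = v2, group = 1)) +
  geom_line() +
  scale_x_continuous(breaks = major, minor_breaks = NULL)
p
```

Passing your own vector to minor_breaks instead of NULL should also work if you want minor lines at chosen positions.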
I have two different datasets which I'd like to plot in the same ggplot2 plot, using different geoms for each. Ideally I would also like a legend which shows that the point geom corresponds to one type of data and the line geom corresponds to the other, but I cannot figure out how to do this. An example of what my data basically looks like is below, minus the legend.
require(ggplot2)
set.seed(1)
d1 = data.frame(y_values = rnorm(21), x_values = 1:21, factor_values = as.factor(sample(1:3, 21, replace=T)))
d2 = data.frame(y_values = seq(-1,1,by = .05), x_values = seq(1,21,by = .5))
ggplot() +
geom_point(data=d1, aes(x=x_values, y=y_values, color=factor_values)) +
geom_line(data=d2, aes(x = x_values, y=y_values), color="blue")
Maybe this is what you want: a separate legend for each dataset. You can map linetype to a constant inside aes() to create a new legend, so that the points and the line each get their own legend entry:
#Code
ggplot() +
geom_point(data=d1, aes(x=x_values, y=y_values, color=factor_values)) +
geom_line(data=d2, aes(x = x_values, y=y_values,linetype='myline'), color="blue")+
scale_linetype_manual('My line',values='solid')
Output:
Or you can also try this:
#Code 2
ggplot() +
geom_point(data=d1, aes(x=x_values, y=y_values, color=factor_values)) +
geom_line(data=d2, aes(x = x_values, y=y_values,linetype='myline'), color="blue")+
scale_linetype_manual('',values='solid')+
theme(
legend.spacing = unit(-17,'pt'),
legend.margin = margin(t=0,b=0,unit='pt'),
legend.background = element_blank()
)+guides(linetype=guide_legend(title="New Legend Title"),
color=guide_legend(title=""))
Output:
In a previous question, I asked about moving the label of a bar outside the bar if the bar was too small. I was given the following example:
library(ggplot2)
options(scipen=2)
dataset <- data.frame(Riserva_Riv_Fine_Periodo = 1:10 * 10^6 + 1,
Anno = 1:10)
ggplot(data = dataset,
aes(x = Anno,
y = Riserva_Riv_Fine_Periodo)) +
geom_bar(stat = "identity",
width=0.8,
position="dodge") +
geom_text(aes( y = Riserva_Riv_Fine_Periodo,
label = round(Riserva_Riv_Fine_Periodo, 0),
angle=90,
hjust= ifelse(Riserva_Riv_Fine_Periodo < 3000000, -0.1, 1.2)),
col="red",
size=4,
position = position_dodge(0.9))
And I obtain this graph:
The problem with the example is that the value at which the label is moved must be hard-coded into the plot, and an ifelse statement is used to reposition the label. Is there a way to automatically extract the value to cut?
A slightly better option might be to base the test and the positioning of the labels on the height of the bar relative to the height of the highest bar. That way, the cutoff value and label-shift are scaled to the actual vertical range of the plot. For example:
ydiff = max(dataset$Riserva_Riv_Fine_Periodo)
ggplot(dataset, aes(x = Anno, y = Riserva_Riv_Fine_Periodo)) +
geom_bar(stat = "identity", width=0.8) +
geom_text(aes(label = round(Riserva_Riv_Fine_Periodo, 0), angle=90,
y = ifelse(Riserva_Riv_Fine_Periodo < 0.3*ydiff,
Riserva_Riv_Fine_Periodo + 0.1*ydiff,
Riserva_Riv_Fine_Periodo - 0.1*ydiff)),
col="red", size=4)
You would still need to tweak the fractional cutoff in the test condition (I've used 0.3 in this case), depending on the physical size at which you render the plot. But you could package the code into a function to make any manual adjustments a bit easier.
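A minimal sketch of such a function, assuming the same rule as above (labels go above bars shorter than a fraction of the tallest bar, inside the others). The names `label_position` and `labelled_bars`, and the arguments `cutoff` and `shift`, are made up for illustration.

```r
library(ggplot2)

# Hypothetical helper: y position of each label, given the bar heights.
# cutoff and shift are fractions of the tallest bar's height.
label_position <- function(values, cutoff = 0.3, shift = 0.1) {
  top <- max(values)
  ifelse(values < cutoff * top, values + shift * top, values - shift * top)
}

# Hypothetical wrapper around the plot above; x and y are column names
labelled_bars <- function(data, x, y, cutoff = 0.3, shift = 0.1) {
  data$label_y <- label_position(data[[y]], cutoff, shift)
  data$label <- round(data[[y]], 0)
  ggplot(data, aes(x = .data[[x]], y = .data[[y]])) +
    geom_col(width = 0.8) +
    geom_text(aes(y = label_y, label = label),
              angle = 90, colour = "red", size = 4)
}

dataset <- data.frame(Riserva_Riv_Fine_Periodo = 1:10 * 10^6 + 1, Anno = 1:10)
p <- labelled_bars(dataset, "Anno", "Riserva_Riv_Fine_Periodo", cutoff = 0.3)
```

With this, adjusting for a different output size only means changing cutoff (and perhaps shift) in the call rather than editing the plot code.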
It's probably possible to automate this by determining the actual sizes of the various grobs that make up the plot and setting the condition and the positioning based on those sizes, but I'm not sure how to do that.
Just as an editorial comment, a plot with labels inside some bars and above others risks confusing the visual mapping of magnitudes to bar heights. I think it would be better to find a way to shrink, abbreviate, recode, or otherwise tweak the labels so that they contain the information you want to convey while being able to have all the labels inside the bars. Maybe something like this:
library(scales)
ggplot(dataset, aes(x = Anno, y = Riserva_Riv_Fine_Periodo/1000)) +
geom_col(width=0.8, fill="grey30") +
geom_text(aes(label = format(Riserva_Riv_Fine_Periodo/1000, big.mark=",", digits=0),
y = 0.5*Riserva_Riv_Fine_Periodo/1000),
col="white", size=3) +
scale_y_continuous(label=dollar, expand=c(0,1e2)) +
theme_classic() +
labs(y="Riserva (thousands)")
Or maybe go with a line plot instead of bars:
ggplot(dataset, aes(Anno, Riserva_Riv_Fine_Periodo/1e3)) +
geom_line(linetype="11", size=0.3, colour="grey50") +
geom_text(aes(label=format(Riserva_Riv_Fine_Periodo/1e3, big.mark=",", digits=0)),
size=3) +
theme_classic() +
scale_y_continuous(label=dollar, expand=c(0,1e2)) +
expand_limits(y=0) +
labs(y="Riserva (thousands)")
When splitting a basic bar plot with facet_wrap, the strip titles are drawn behind and barely above the plot area, meaning any character descenders are cut off. Is there a way I can space these a little? The problem persists whether or not I use ggthemr.
Thanks!
Example of what I'm talking about:
You can specify the margins of the element by adding it in the strip.text.x argument, as follows:
A = data.frame(x = 1:4, y = 1:4, z = c('A','A','B','B'))
ggplot(A) +
geom_point(aes(x = x, y = y)) +
facet_wrap(~z) +
theme_bw()+
theme(strip.text.x = element_text(margin = margin(2,0,2,0, "cm")))
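The arguments of margin() are top, right, bottom, left, followed by the unit, so only the first and third values matter for descenders. A hedged variation with named arguments and a smaller margin in points; the value of 6 pt is just a guess to adjust by eye:

```r
library(ggplot2)

A <- data.frame(x = 1:4, y = 1:4, z = c("A", "A", "B", "B"))

# Pad the strip text above and below only; 6 pt is an arbitrary starting value
p <- ggplot(A) +
  geom_point(aes(x = x, y = y)) +
  facet_wrap(~z) +
  theme_bw() +
  theme(strip.text.x = element_text(margin = margin(t = 6, b = 6, unit = "pt")))
p
```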
I have some data here
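I read the data into a data frame and then plot it with the following code,
# Packages used below (assumption: melt() comes from reshape2; trans_breaks(), trans_format() and math_format() come from scales)
library(ggplot2)
library(reshape2)
library(scales)
# Reading data from a .csv file into a data frame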
df <- read.table("newcsv_file.csv",header=T,sep="\t" )
# Now melting the data frame prior to plotting
df_mlt <- melt(df, id=names(df)[1], measure=names(df)[c(2, 6, 11,16,21,26,31,36,41,46,51,106,111,116,121,126,131,136,141,146,151)], variable = "cols")
# plotting the data
plt_fit <- ggplot(df_mlt, aes(x=x,y= value, color=cols)) +
geom_point(size=2) +
geom_smooth(method = "lm", se = FALSE) +
scale_y_log10(breaks = trans_breaks("log10", function(x) 10^x), labels = trans_format("log10", math_format(10^.x))) +
annotation_logticks(sides = "rl") +
theme_bw() +
theme(legend.text=element_text(size=12), legend.title=element_text(size=12))+
theme(axis.text=element_text(size=14)) +
theme(axis.title=element_text(size=14,face="bold")) +
labs(x = "x", y = "y") +
scale_color_discrete(name = "values", labels = c("0","-0.1","-0.2","-0.3","-0.4","-0.5","-0.6","-0.7","-0.8","-0.9","-1","+0.1","+0.2","+0.3","+0.4","+0.5","+0.6","+0.7","+0.8","+0.9","+1")) +
guides(colour = guide_legend(override.aes = list(size=3),nrow=2,title.position = 'top',title.hjust=0.5,legend.direction = "horizontal")) +
theme(legend.position = 'bottom', legend.margin=unit(1,"cm"),legend.background = element_rect(fill ='gray94')) +
theme(plot.margin=unit(c(0,2,0,0),"mm"))
The resulting plot looks like this. The problem is that the rightmost edge of the legend is cropped.
I use +theme(legend.margin=unit(1,"cm")) but this does not seem sufficient. Could someone please let me know what I can change to display the full legend properly in the plot?
Thanks.
The code is fine. The problem is the size of your plot window. Try making it wider and you'll see the whole legend.
Also,
ggsave("plot_fit.pdf",plt_fit)
will create a pdf where the full legend is displayed.
Changing the width and height of the saved plot with the following code
ggsave(file="new_png_file.png",width=22,height=21,units=c("cm"), dpi=600)
yields a plot such as this:
I want to create the following histogram density plot with ggplot2. In the "normal" way (base packages) it is really easy:
set.seed(46)
vector <- rnorm(500)
breaks <- quantile(vector,seq(0,1,by=0.1))
labels = 1:(length(breaks)-1)
den = density(vector)
hist(vector,
breaks=breaks,
col=rainbow(length(breaks)),
probability=TRUE)
lines(den)
With ggplot I have reached this so far:
seg <- cut(vector,breaks,
labels=labels,
include.lowest = TRUE, right = TRUE)
df = data.frame(vector=vector,seg=seg)
ggplot(df) +
geom_histogram(breaks=breaks,
aes(x=vector,
y=..density..,
fill=seg)) +
geom_density(aes(x=vector,
y=..density..))
But the "y" scale has the wrong dimension. I have noticed that the following run, without fill=seg, gets the "y" scale right.
ggplot(df) +
geom_histogram(breaks=breaks,
aes(x=vector,
y=..density..)) +
geom_density(aes(x=vector,
y=..density..))
I just do not understand it. y=..density.. is there, and that should be the height. So why on earth does my scale get modified when I try to fill it?
I do need the colours. I just want a histogram where the breaks are the ones I define and the colour of each block is set directly from the default ggplot fill colours.
Manually, I added colors to your percentile bars. See if this works for you.
library(ggplot2)
ggplot(df, aes(x=vector)) +
geom_histogram(breaks=breaks,aes(y=..density..),colour="black",fill=c("red","orange","yellow","lightgreen","green","darkgreen","blue","darkblue","purple","pink")) +
geom_density(aes(y=..density..)) +
scale_x_continuous(breaks=c(-3,-2,-1,0,1,2,3)) +
ylab("Density") + xlab("df$vector") + ggtitle("Histogram of df$vector") +
theme_bw() + theme(plot.title=element_text(size=20),
axis.title.y=element_text(size = 16, vjust=+0.2),
axis.title.x=element_text(size = 16, vjust=-0.2),
axis.text.y=element_text(size = 14),
axis.text.x=element_text(size = 14),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank())
fill=seg results in grouping. You are actually getting a different histogram for each value of seg. If you don't need the colours, you could use this:
ggplot(df) +
geom_histogram(breaks=breaks,aes(x=vector,y=..density..), position="identity") +
geom_density(aes(x=vector,y=..density..))
If you need the colours, it might be easiest to calculate the density values outside of ggplot2.
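One possible way to do that, as a sketch rather than the only option: because fill=seg splits the data into groups, each bar's density is normalised within its own group, which is what inflates the y scale. Letting hist() compute the densities over the whole sample and drawing the bars with geom_rect() avoids this. The names `h` and `bins` are illustrative.

```r
library(ggplot2)

set.seed(46)
vector <- rnorm(500)
breaks <- unname(quantile(vector, seq(0, 1, by = 0.1)))

# Base R computes density = count / (n * bin width) over the whole sample
h <- hist(vector, breaks = breaks, plot = FALSE)

# One row per bin, with its edges, its density and a label to fill by
bins <- data.frame(
  xmin = head(h$breaks, -1),
  xmax = tail(h$breaks, -1),
  density = h$density,
  seg = factor(seq_along(h$density))
)

# Bars are drawn from the precomputed values, so fill no longer regroups the data
p <- ggplot() +
  geom_rect(data = bins,
            aes(xmin = xmin, xmax = xmax, ymin = 0, ymax = density, fill = seg),
            colour = "black") +
  geom_density(data = data.frame(vector = vector), aes(x = vector))
p
```

The fill colours here are the default discrete ggplot2 palette; a scale_fill_manual() call could be added if specific colours are wanted.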
Or an option with ggpubr
library(ggpubr)
gghistogram(df, x = "vector", add = "mean", rug = TRUE, fill = "seg",
palette = c("#00AFBB", "#E7B800", "#E5A800", "#00BFAB", "#01ADFA",
"#00FABA", "#00BEAF", "#01AEBF", "#00EABA", "#00EABB"), add_density = TRUE)
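The confusion about interpreting the y-axis might arise because density is plotted rather than count. The values on the y-axis are then densities, not proportions: each bar's area (height times bin width) is the proportion of the sample falling in that bin, so it is the areas of the bars, not their heights, that sum to 1.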
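A quick base R check of how density relates to count, using the same simulated data and decile breaks as in the question (the names `h`, `widths` and `proportions` are illustrative):

```r
set.seed(46)
vector <- rnorm(500)
breaks <- unname(quantile(vector, seq(0, 1, by = 0.1)))

# Bin the data without drawing anything
h <- hist(vector, breaks = breaks, plot = FALSE)
widths <- diff(h$breaks)

# Proportion of the sample in each bin
proportions <- h$counts / sum(h$counts)

# Density is proportion divided by bin width, so area = density * width
areas <- h$density * widths

sum(areas)      # total area under the bars
sum(h$density)  # sum of the heights, which depends on the bin widths
```

Because the decile bins have very different widths, narrow bins near the centre get tall bars and wide bins in the tails get short ones, even though each holds roughly a tenth of the sample.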