Breaking value axis using ggplot2 [duplicate] - r

This question already has answers here:
Using ggplot2, can I insert a break in the axis?
(10 answers)
Closed 3 years ago.
I have used Thinkcell, and one of its cool features is that it can break a very long y-axis so that the graph fits. I am not sure whether we can do this with ggplot2. I am a beginner in ggplot2, so I'd appreciate any thoughts.
For example:
library(ggplot2)

Series <- 1:6
Values <- c(899, 543, 787, 35323, 121, 234)
df_val_break <- data.frame(Series, Values)

# One bar per series; Series 4 dwarfs the others
ggplot(data = df_val_break, aes(x = Series, y = Values)) +
  geom_bar(stat = "identity")
This creates a graph like this:
However, I want a graph that looks something like this:
However, it seems that a broken axis is not supported in ggplot2 because it is considered misleading (Source: Using ggplot2, can I insert a break in the axis?). That thread suggests a couple of alternatives: faceting and tables.
I like tables, but I don't like faceting because the levels of my categorical variable "Series" are closely related. Moreover, I'd prefer Excel for drawing tables; it's fast.
I have two questions:
Question 1: One of the options I liked is the graph shown at https://stats.stackexchange.com/questions/1764/what-are-alternatives-to-broken-axes
I am unable to replicate a similar graph because of the scaling issue.
Question 2: This is a minor question, in case new packages have been introduced that might help with this. (The linked SO thread above is more than 5 years old.) Are there any other options on the table?
Update: I don't think my question is a duplicate for two reasons: a) I have already gone through the indicated thread and have referenced it here, explaining that I am looking for a solution that looks like the third graph in my post. Specifically, I want to plot both graphs, one with the shorter scale and the other at 1/20 scale, in a single graph. I am unable to do this using ggplot2 because of the scale issue: either both sub-graphs get scaled to 1/nth, or one of them gets scaled to the normal range. I believe this version is much more relatable for a non-technical audience who don't understand log and inverse transformations.
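One way to picture the "1/20 scale in one graph" idea is to shrink only the outlier before plotting and label it. The following is a minimal sketch, not an established ggplot2 feature: the `threshold` and `scale_factor` values and the `Scaled` and `Plotted` columns are assumptions made up for illustration.

```r
library(ggplot2)

df_val_break <- data.frame(
  Series = 1:6,
  Values = c(899, 543, 787, 35323, 121, 234)
)

scale_factor <- 20    # assumption: outliers are drawn at 1/20 of their value
threshold    <- 5000  # assumption: anything above this counts as an outlier

# Flag the outliers and compute the height that will actually be drawn
df_val_break$Scaled  <- df_val_break$Values > threshold
df_val_break$Plotted <- ifelse(df_val_break$Scaled,
                               df_val_break$Values / scale_factor,
                               df_val_break$Values)

p_scaled <- ggplot(df_val_break,
                   aes(x = factor(Series), y = Plotted, fill = Scaled)) +
  geom_col() +
  # Label only the rescaled bars so the reader knows they are not to scale
  geom_text(data = subset(df_val_break, Scaled),
            aes(label = paste0("shown at 1/", scale_factor)),
            vjust = -0.5) +
  labs(x = "Series", y = "Values")
```

The fill colour and the text label both signal that the tall bar is not on the same scale as the others, which is the main risk of this kind of plot.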

I took a stab at this one. I'm a beginner, so I am not sure whether this can be improved further in terms of the placement of the text. I struggled with fitting both the high growth rate series and the low growth rate series in one graph because of their different scales. So, I used faceting.
Here's the code:
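library(ggplot2)

# Assumes df_val_break already has a Modified column with the values
# "HIGH_GROWTH" and "LOW_GROWTH" (how it was built is not shown here)
ggplot(data = df_val_break, aes(x = Series, y = Values)) +
  geom_bar(stat = "identity") +
  facet_wrap(~Modified) +
  geom_text(data = df_val_break[df_val_break$Modified == "HIGH_GROWTH", ],
            aes(label = "x20 growth rate"), hjust = 0.5, vjust = 0)
ggsave("post.png")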
Here's the output:
There are quite a few issues that I see:
a) The HIGH_GROWTH panel has Series 2 and Series 6 on the x-axis, although we don't need them. I don't know how to turn them off.
b) The geom_text label overlaps with the bar. This looks a little annoying.
c) I believe the graph is a little misleading, especially for the HIGH_GROWTH section, because its y-axis isn't on the same scale as LOW_GROWTH. I was originally thinking of showing two different y-axes: one scaled by 1/20 and the other unscaled.
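Issues a) and b) can probably be reduced with free facet scales and a small vertical offset on the label. This is a sketch under assumptions: the answer does not show how `Modified` was created, so the 5000 cut-off below is a guess used only to make the example run.

```r
library(ggplot2)

df_val_break <- data.frame(
  Series = 1:6,
  Values = c(899, 543, 787, 35323, 121, 234)
)

# Assumption: a simple cut-off separates the two groups
df_val_break$Modified <- ifelse(df_val_break$Values > 5000,
                                "HIGH_GROWTH", "LOW_GROWTH")

p_facet <- ggplot(df_val_break, aes(x = factor(Series), y = Values)) +
  geom_col() +
  # scales = "free" gives each panel its own y range and drops x levels
  # that do not occur in that panel
  facet_wrap(~Modified, scales = "free") +
  # vjust = -0.5 lifts the label above the bar instead of overlapping it
  geom_text(data = subset(df_val_break, Modified == "HIGH_GROWTH"),
            aes(label = "x20 growth rate"), vjust = -0.5) +
  # Extra headroom at the top so the lifted label is not clipped
  scale_y_continuous(expand = expansion(mult = c(0, 0.1))) +
  labs(x = "Series")
```

Free y scales do not remove issue c): the two panels still have different axes, so the label remains the only cue that the heights are not comparable.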

Related

What function would I use in R to give more detail to this Chart? [duplicate]

This question already has answers here:
Increase number of axis ticks in ggplot2 for dates
(1 answer)
Adding date ticks to ggplot in R
(4 answers)
Closed 3 months ago.
I am fairly new to all of this, and I'm currently working with some datasets for fun/practice. I took a course on R, and it explained the basics and dove a little deeper into ggplot2, but I am at a point where I'm not quite sure what to do.
The dataset is commodity prices from 1960-2021. I am trying to visualize the cost of crude oil over that time. I currently have a decent chart, but I want to change the intervals on the tick marks so that it is more readable. I will also take any advice you have on how to make it look even better!
I've attached my current chart here. (https://i.stack.imgur.com/o9Z63.png)
This is my current code for the chart:
# `graph` is the base ggplot object (its definition is not shown)
graph +
  geom_point(size = 2, alpha = .7) + geom_line() +
  ggtitle("Oil Prices from 1960 to 2021") +
  xlab("Years") + ylab("Oil Prices") +
  theme(axis.title.x = element_text(color = "Black", size = 20),
        axis.title.y = element_text(color = "Black", size = 20),
        plot.title = element_text(color = "Black", size = 40))
I have tried to google the function and have found a bunch of answers, but I'm still not sure which one fits what I'm trying to do. Ideally I would like the x axis (Years) to show a tick at least every 5-10 years, but the more descriptive the better. Also, the y axis (Prices) is very hard to read, so it is hard to tell what the actual price is.
I appreciate any and all help with this. I'm just working on it for the fun of it and to get some practice in, but figured I would use this resource and try to see what we can come up with.
Thanks!
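Tick spacing is usually controlled through the `breaks` argument of the scale functions. The sketch below assumes the year is stored as a number; the `oil` data frame and its prices are placeholders invented for illustration, not the real commodity data.

```r
library(ggplot2)

# Placeholder data: made-up prices, only here so the example runs
oil <- data.frame(
  Year  = 1960:2021,
  Price = seq(2, 70, length.out = 62)
)

year_breaks <- seq(1960, 2020, by = 5)   # one tick every 5 years

p_oil <- ggplot(oil, aes(x = Year, y = Price)) +
  geom_point(size = 2, alpha = .7) +
  geom_line() +
  scale_x_continuous(breaks = year_breaks) +
  # One tick every 10 units, formatted as currency
  scale_y_continuous(breaks = seq(0, 70, by = 10),
                     labels = scales::label_dollar()) +
  labs(title = "Oil Prices from 1960 to 2021",
       x = "Years", y = "Oil Prices")
```

If the x column is a Date rather than a number, `scale_x_date(date_breaks = "5 years", date_labels = "%Y")` plays the same role.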

How to display truncated error bars with ggplot? [duplicate]

This question already has answers here:
geom_bar bars not displaying when specifying ylim
(4 answers)
Closed 9 months ago.
I am trying to create a barplot using ggplot2, with the y axis starting at a value greater than zero.
Let's say I have the means and standard errors for a hypothetical dataset about carrot length at three different farms:
carrots <- data.frame(
  Mean = c(270, 250, 240),
  SE   = c(3, 4, 5),
  Farm = c("Plains", "Hill", "Valley")
)
I create a basic plot:
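library(ggplot2)

# stat = "identity" draws the bars at the supplied Mean values;
# current ggplot2 versions reject a y aesthetic with the default count stat
p <- ggplot(carrots, aes(y = Mean, x = Farm)) +
  geom_bar(stat = "identity", fill = "slateblue") +
  geom_errorbar(aes(ymin = Mean - SE, ymax = Mean + SE), width = 0)
p
This is nice, but as the scale starts at 0, it is difficult to see the differences in length. Therefore, I would like to rescale the y axis to something like c(200,300). However, when I try to do this with:
p + scale_y_continuous('Length (mm)', limits = c(200, 300))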
The bars disappear, although the error bars remain.
My question is: is it possible to plot a barplot with this adjusted axis using ggplot2?
Thank you for any help or suggestions you can offer.
Try this
p + coord_cartesian(ylim=c(200,300))
Setting the limits on the coordinate system performs a visual zoom;
the data is unchanged, and we just view a small portion of the original plot.
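The bars vanish with scale limits because, by default, a continuous scale treats values outside its limits as missing, and every bar starts at 0, which lies outside c(200, 300). The following sketch contrasts the zoom from this answer with an alternative that keeps the scale limits but squishes out-of-range values to the nearest limit through the `oob` argument; the names `base`, `p_zoom` and `p_squish` are made up for illustration.

```r
library(ggplot2)

carrots <- data.frame(
  Mean = c(270, 250, 240),
  SE   = c(3, 4, 5),
  Farm = c("Plains", "Hill", "Valley")
)

base <- ggplot(carrots, aes(x = Farm, y = Mean)) +
  geom_col(fill = "slateblue") +
  geom_errorbar(aes(ymin = Mean - SE, ymax = Mean + SE), width = 0)

# Option 1: zoom the view; the underlying data is left untouched
p_zoom <- base + coord_cartesian(ylim = c(200, 300))

# Option 2: keep scale limits, but move out-of-range values (the bar
# bottoms at 0) to the nearest limit instead of dropping them
p_squish <- base +
  scale_y_continuous("Length (mm)", limits = c(200, 300),
                     oob = scales::squish)
```

Either way the bars no longer start at zero, so their heights exaggerate the differences between farms; that trade-off is worth stating in the caption.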
If you are trying to get the same zoom effect for a flipped bar chart, the accepted answer won't work (even though it is perfect for the example in the question).
The solution for a flipped bar chart is to use the ylim argument of the coord_flip function. I decided to post this answer because my bars were also "disappearing", as in the original question, while I was trying to rescale with other methods, but in my case the chart was flipped. This may help other people with the same issue.
This is the adapted code, based on the example in the question:
ggplot(carrots, aes(y = Mean, x = Farm)) +
  geom_col(fill = "slateblue") +
  geom_errorbar(aes(ymin = Mean - SE, ymax = Mean + SE), width = 0) +
  coord_flip(ylim = c(200, 300))
Flipped chart example

Indicating the statistically significant difference in bar graph USING R

This is a repeat of a question originally asked here: Indicating the statistically significant difference in bar graph but asked for R instead of python.
My question is very simple. I want to produce barplots in R, using ggplot2 if possible, with an indication of significant difference between the different bars, e.g. produce something like this. I have had a search around but can't find another question asking exactly the same thing.
I know that this is an old question and the answer by Didzis Elferts already provides one solution for the problem. But I recently created a ggplot-extension that simplifies the whole process of adding significance bars: ggsignif
Instead of tediously adding the geom_path and annotate to your plot you just add a single layer geom_signif:
library(ggplot2)
library(ggsignif)
ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot() +
  geom_signif(comparisons = list(c("versicolor", "virginica")),
              map_signif_level = TRUE)
Full documentation of the package is available at CRAN.
You can use geom_path() and annotate() to get a similar result. For this example you have to determine suitable positions yourself. In geom_path(), four numbers are provided for x and for y so that each connecting line gets small ticks at both ends.
library(ggplot2)
df <- data.frame(group = c("A", "B", "C", "D"), numb = c(12, 24, 36, 48))
g <- ggplot(df, aes(group, numb)) + geom_bar(stat = "identity")
# Each geom_path() draws one bracket: up, across, down
g + geom_path(x = c(1, 1, 2, 2), y = c(25, 26, 26, 25)) +
  geom_path(x = c(2, 2, 3, 3), y = c(37, 38, 38, 37)) +
  geom_path(x = c(3, 3, 4, 4), y = c(49, 50, 50, 49)) +
  annotate("text", x = 1.5, y = 27, label = "p=0.012") +
  annotate("text", x = 2.5, y = 39, label = "p<0.0001") +
  annotate("text", x = 3.5, y = 51, label = "p<0.0001")
I used the method suggested above, but I found the annotate function easier for making lines than the geom_path function. Just use "segment" instead of "text". You have to break the bracket up into segments and define starting and ending x and y values for each line segment.
Example for making 3 line segments:
annotate("segment", x = c(1, 1, 2), xend = c(1, 2, 2), y = c(125, 130, 130), yend = c(130, 130, 125))
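To make the segment approach concrete, here is a sketch that applies it to the four-group `df` from the geom_path() answer. The bracket coordinates are reused from that answer for the first pair of bars; `p_sig` is a name made up for illustration.

```r
library(ggplot2)

df <- data.frame(group = c("A", "B", "C", "D"),
                 numb  = c(12, 24, 36, 48))

p_sig <- ggplot(df, aes(group, numb)) +
  geom_col() +
  # Three segments form one bracket between bars A and B:
  # left tick (up), horizontal line (across), right tick (down)
  annotate("segment",
           x    = c(1, 1, 2),
           xend = c(1, 2, 2),
           y    = c(25, 26, 26),
           yend = c(26, 26, 25)) +
  annotate("text", x = 1.5, y = 27, label = "p=0.012")
```

The positions still have to be chosen by hand for each comparison, which is the part that geom_signif automates.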

displaying stat_summary accurately on violin plots

I just started using ggplot2 on R and have a violin plot question.
I have a data set that can be accessed here: data.
The data comes from a study of making estimations. The variables of interest are the question.no (questions), condition, estimate.no (tr.est1 or tr.est2) and estimate.
The code below makes the plot look almost the way I want it to look at least for one question, yet the median dots generated by stat_summary() are displayed in between the "violins."
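library(plyr)     # provides d_ply()
library(ggplot2)

v.data <- read.csv("data.csv")

# Loop through each question number and save one plot per question
d_ply(v.data, c("question.no"), function(d.plot) {
  # Use the question number of the current subset, not the whole column
  q.no <- unique(d.plot$question.no)
  plot.q <- ggplot(d.plot, aes(condition, estimate, fill = estimate.no)) +
    geom_violin() +
    stat_summary(fun = "median", geom = "point") +  # `fun` replaces the deprecated `fun.y`
    scale_y_continuous("Change Scores") +
    scale_x_discrete("Conditions")
  ggsave(filename = paste(q.no, ".png", sep = ""), plot = plot.q)
})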
My Question: How can I make the median dots display correctly on the "violins" rather than in between them?
I searched the previous questions asked on ggplot2 on this site and looked at the ggplot2 documentation as well as other R forums but have not been able to find anything relevant.
I would appreciate any comments and suggestions as to how I can fix it. Also, if the questions I ask are already answered somewhere else, I would appreciate links to those threads, too. Many thanks in advance.
stat_summary is limited to the variable that determines your x-axis. One way to convey the information you want would be to replace condition in your call to aes with interaction(condition, estimate.no).
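A sketch of that suggestion, next to a commonly used alternative that dodges the summary points by the same width as the violins. The `v.demo` data frame is a made-up stand-in for data.csv (random values, for illustration only), and the 0.9 width is an assumption based on the default dodge width of geom_violin().

```r
library(ggplot2)

# Hypothetical stand-in for data.csv: random values for illustration only
set.seed(1)
v.demo <- data.frame(
  condition   = rep(c("A", "B"), each = 40),
  estimate.no = rep(rep(c("tr.est1", "tr.est2"), each = 20), times = 2),
  estimate    = rnorm(80)
)

# Option 1 (from the answer): one x position per condition/estimate.no pair
p_interaction <- ggplot(v.demo,
                        aes(interaction(condition, estimate.no), estimate,
                            fill = estimate.no)) +
  geom_violin() +
  stat_summary(fun = median, geom = "point")

# Option 2 (alternative): keep condition on x and dodge the median points
# by the same width as the violins so they line up
p_dodge <- ggplot(v.demo, aes(condition, estimate, fill = estimate.no)) +
  geom_violin() +
  stat_summary(fun = median, geom = "point",
               position = position_dodge(width = 0.9))
```

Option 1 changes the x-axis labels to combined names such as A.tr.est1, while option 2 keeps the original two conditions on the axis.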
Plotluck is a library based on ggplot2 that aims at automating the choice of plot type based on characteristics of 1-3 variables. For your data set, the command plotluck(v.data, condition, estimate, question.no) generates the following plot:
Note that the library chose to scale y logarithmically. You can override this behavior with plotluck(v.data,condition,estimate,question.no,opts=plotluck.options(trans.log.thresh=1E20)) but it doesn't display well, and the median points look like they are all on the zero line.
