How do I go about putting these lines separately on this graph? - r

I created a ggplot using this code:
ggplot(dataset1, aes(x = y, y = x)) + geom_smooth(span=0.2) + ylim(0,5) + xlim(0,23) + ylab("Count") +
labs(x="Hours") +
theme_classic()
I then wanted to add an additional 3 lines to this graph, so I tried this code:
ggplot(rbind(dataset1,dataset2,dataset3,dataset4), aes(x = y, y = x)) + geom_smooth(span=0.2) + ylim(0,5) + xlim(0,23) + ylab("count") +
labs(x="Hours") +
theme_classic()
However, the graph I got from this is nowhere near what I'm trying to do.
I also got a warning message after running this code:
Warning message:
Removed 1 rows containing non-finite values (stat_smooth).
I know I'm going majorly wrong with the second piece of code and am probably missing part of it, but I haven't used this code before, so I'm just trying my hand at it.
Thanks for any help!

If you need one line per dataset you should add some sort of category/grouping variable. By combining your data you just create one big dataset so ggplot has no way of knowing it should plot them separately.
dataset1$category <- 1
dataset2$category <- 2
...
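Now you can combine the datasets with rbind() and then add, for example, color = factor(category) to your aesthetics. Wrapping the numeric category in factor() makes it discrete, so ggplot draws one line per dataset.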
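A minimal sketch of this approach, assuming four data frames that share the question's column names x and y. The data here are simulated placeholders and make_dataset is a hypothetical helper, not the asker's data:

```r
library(ggplot2)

# Hypothetical stand-ins for dataset1..dataset4: hours in column y and
# counts in column x, matching the aes(x = y, y = x) mapping in the question.
set.seed(42)
make_dataset <- function(shift, n = 200) {
  hours <- runif(n, 0, 23)
  counts <- 2.5 + sin((hours + shift) / 4) + rnorm(n, sd = 0.2)
  # Clamp counts to [0, 5] so ylim(0, 5) does not drop any rows
  data.frame(y = hours, x = pmin(pmax(counts, 0), 5))
}
dataset1 <- make_dataset(0)
dataset2 <- make_dataset(3)
dataset3 <- make_dataset(6)
dataset4 <- make_dataset(9)

# Tag each data frame before combining so the origin of every row is kept
dataset1$category <- 1
dataset2$category <- 2
dataset3$category <- 3
dataset4$category <- 4
combined <- rbind(dataset1, dataset2, dataset3, dataset4)

# factor(category) is discrete, so geom_smooth() fits one curve per dataset
p <- ggplot(combined, aes(x = y, y = x, color = factor(category))) +
  geom_smooth(span = 0.2) +
  ylim(0, 5) + xlim(0, 23) +
  labs(x = "Hours", y = "Count", color = "Dataset") +
  theme_classic()
p
```

Mapping group = category instead of color would also separate the lines, but they would then all share one color.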

Related

Coxcomb charts in R

I have the following two graphs. The first one is provided, and we need to modify it to produce the second one. The code is provided below:
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, fill = cut), width = 1) +
labs(x=NULL) +
theme(axis.title.y=element_blank(),axis.text.y=element_blank(),axis.ticks.y=element_blank()) +
coord_polar()
This code produces the first image. To get the second graph, the geom_bar() call needs to be changed: specifically, stat() needs to be called to manually set the heights. How do I modify this line of code to produce the second graph?
For those who come across a similar issue: I solved this by adding a y aesthetic to the geom_bar() mapping and setting it to sqrt(stat(count)). This yields the second coxcomb chart shown above.
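The fix described above might look like the following sketch. It uses after_stat(count), which replaced stat(count) in ggplot2 3.3.0 and later; on older versions, sqrt(stat(count)) as written in the answer should behave the same way:

```r
library(ggplot2)

# Same chart as in the question, but the bar heights are the square root
# of the per-cut counts computed by geom_bar()'s default stat.
p <- ggplot(data = diamonds) +
  geom_bar(
    mapping = aes(x = cut, y = sqrt(after_stat(count)), fill = cut),
    width = 1
  ) +
  labs(x = NULL) +
  theme(
    axis.title.y = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank()
  ) +
  coord_polar()
p
```

With the square root applied, the area of each wedge rather than its radius is roughly proportional to the count, which is the usual convention for coxcomb charts.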

R, how to add one break to the default breaks in ggplot?

Suppose I have the following task: given a set of data, generate a chart indicating how many data points are below any given threshold.
This is fairly easy to achieve:
n.data <- 215
set.seed(0)
dt <- rnorm(n.data) ** 2
x <- seq(0, 5, by=.2)
y <- sapply(x, function(i) length(which(dt < i)))
ggplot() +
geom_point(aes(x=x,y=y)) +
geom_hline(yintercept = n.data)
The question is: suppose I want to add a label to indicate what the total number of observations was (n.data). How do I do that while keeping the other breaks at their defaults?
The outcome I'd like looks something like the image below, generated with the code
ggplot() +
geom_point(aes(x=x,y=y)) +
geom_hline(yintercept = n.data) +
scale_y_continuous(breaks = c(seq(0,200,50),n.data))
However, I'd like this to work even when I change the value of n.data, just by adding it to the default breaks.
(bonus points if you also get rid of the grid line between the last default break and the n.data one!)
Three years and some more knowledge of ggplot later, here's how I would do this today.
ggplot() +
geom_point(aes(x=x,y=y)) +
geom_hline(yintercept = n.data) +
scale_y_continuous(breaks = c(pretty(y), n.data))
Here is how you can get rid of the grid line between the last auto break and the manual one:
theme_update(panel.grid.minor=element_blank())
For the rest, I can't quite understand your question, as when you change n.data, your break is updated.
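As a sketch of how the two suggestions above could be combined so the plot adapts to any n.data: breaks_with_total is a hypothetical helper name introduced here, and theme() is applied to this plot only instead of changing the global theme with theme_update().

```r
library(ggplot2)

# Hypothetical helper: the "pretty" default-like breaks plus one extra break
breaks_with_total <- function(values, total) {
  sort(unique(c(pretty(values), total)))
}

n.data <- 215
set.seed(0)
dt <- rnorm(n.data)^2
x <- seq(0, 5, by = 0.2)
# Number of data points below each threshold
y <- sapply(x, function(i) sum(dt < i))

p <- ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) +
  geom_point() +
  geom_hline(yintercept = n.data) +
  scale_y_continuous(breaks = breaks_with_total(y, n.data)) +
  # Minor grid lines are drawn between breaks; hiding them removes the
  # extra line between the last pretty break and n.data
  theme(panel.grid.minor = element_blank())
p
```

Changing n.data and re-running the block updates the extra break without touching the scale definition.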

Add an average line to an existing plot

I want to add an average line to the existing plot.
library(ggplot2)
A <- c(1:10)
B <- c(1,1,2,2,3,3,4,4,5,5)
donnees <- data.frame(A,B)
datetime<-donnees[,2]
Indcatotvalue<-donnees[,1]
df<-donnees
mn<-tapply(donnees[,1],donnees[,2],mean)
moyenne <- data.frame(template=names(mn),mean=mn)
ggplot(data=df,
aes_q(x=datetime,
y=Indcatotvalue)) + geom_line()
I have tried to add:
geom_line(aes(y = moyenne[,2], colour = "blue"))
or:
lines(moyenne[,1],moyenne[,2],col="blue")
but nothing happens. I don't understand why, especially for the function lines().
When you say average line, I'm assuming you want to plot a line that represents the average value of Y (Indcatotvalue). For that you want to use geom_hline(), which plots horizontal lines on your graph:
ggplot(data=df,aes_q(x=datetime,y=Indcatotvalue)) +
geom_line() +
geom_hline(yintercept = mean(Indcatotvalue), color="blue")
Which, with the example numbers you gave, will give you a plot that looks like this:
The function stat_summary is perfect here.
I found the answer in a Google Groups post by Brian Diggs:
p + stat_summary(aes(group=bucket), fun.y=mean, geom="line", colour="green")
You need to set the group to the faceting variable explicitly since
otherwise it will be type and bucket (which looks like type since type
is nested in bucket).
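Applied to the data from the question, the stat_summary idea might look like the sketch below. It assumes the goal is the mean of A for each value of B (what the question computes with tapply), and it uses the fun argument, which replaced fun.y in ggplot2 3.3.0; no explicit group is set because this data has no facets or nested grouping.

```r
library(ggplot2)

donnees <- data.frame(A = 1:10, B = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5))

# Per-B means of A, as computed with tapply() in the question
mn <- tapply(donnees$A, donnees$B, mean)

p <- ggplot(donnees, aes(x = B, y = A)) +
  geom_line() +
  # stat_summary() computes mean(A) at each distinct B and joins the results
  stat_summary(fun = mean, geom = "line", colour = "blue")
p
```

The base-graphics function lines() from the question has no effect here because ggplot2 draws with the grid system, so base plotting calls cannot add to a ggplot.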

Changing the xlim of numeric value causing error ggplot R

I have a grouped barplot produced using ggplot in R with the following code
ggplot(mTogether, aes(x = USuniquNegR, y = value, fill = variable)) +
geom_bar(stat = "identity", position = "dodge") +
scale_fill_discrete(name = "Area",
labels = c("Everywhere", "New York")) +
xlab("Reasons") +
ylab("Proportion of total complaints") +
coord_flip() +
ggtitle("Comparison between NY and all areas")
mTogether is created using the following code
mTogether <- melt(together, id.vars = 'USuniquNegR')
The Data Frame together is made up of
   USperReasons         USperReasonsNY       USuniquNegR
1  0.198343304187759    0.191304347826087    Late Flight
2  0.35987114588127     0.321739130434783    Customer Service Issue
3  0.0667280257708237   0.11304347826087     Lost Luggage
4  0.0547630004601933   0.00869565217391304  Flight Booking Problems
5  0.109065807639208    0.121739130434783    Can't Tell
6  0.00460193281178095  0                    Damaged Luggage
7  0.0846755637367694   0.0782608695652174   Cancelled Flight
8  0.0455591348366314   0.0521739130434783   Bad Flight
9  0.0225494707777266   0.0347826086956522   longlines
10 0.0538426138978371   0.0782608695652174   Flight Attendant Complaints
Together can be generated by the following
together<-data.frame(cbind(USperReasons,USperReasonsNY,USuniquNegR))
where
USperReasons <- c(0.19834,0.35987,.06672,0.05476,0.10906,.00460,.08467,0.04555,0.02254,0.05384)
USperReasonsNY <- c(0.191304348,0.321739130,0.113043478,0.008695652,0.121739130,0.000000000,0.078260870,0.05217391,0.034782609,0.078260870)
USuniquNegR <- c("Late Flight","Customer Service Issue","Lost Luggage","Flight Booking Problems","Can't Tell","Damaged Luggage","Cancelled Flight","Bad Flight","longlines","Flight Attendant Complaints")
The problem is that when I try to change the xlim of the ggplot using
+ xlim(0, 1)
I just seem to get an error:
Discrete value supplied to continuous scale
I can't understand why this happens, but I need to resolve it because currently the x axis starts below 0 and is very densely packed.
The problem is that you are cbind()ing your column vectors together, which converts the numbers to characters. Fix that and the rest should fix itself.
together<-data.frame(USperReasons,USperReasonsNY,USuniquNegR)
You need to remove the cbind from
together<-data.frame(cbind(USperReasons,USperReasonsNY,USuniquNegR))
because str(together) shows that all three columns are factors (or character vectors in R 4.0.0 and later, where stringsAsFactors defaults to FALSE), not numbers.
With
together <- data.frame(USperReasons, USperReasonsNY, USuniquNegR)
the plot looks reasonable to me (without having to use ylim or xlim).
So, the error was not within ggplot2 but in data preparation.
Therefore, please, provide a full working example which can be copied, pasted and run when asking a question next time. Thank you.
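To see the coercion described in both answers, here is a small sketch using the first three values from the question. The names bad and good are introduced only for this illustration:

```r
USperReasons <- c(0.19834, 0.35987, 0.06672)
USuniquNegR <- c("Late Flight", "Customer Service Issue", "Lost Luggage")

# cbind() builds a matrix, and a matrix holds a single type, so mixing
# numbers with strings turns every value into a string
bad <- data.frame(cbind(USperReasons, USuniquNegR))

# data.frame() keeps each column's own type
good <- data.frame(USperReasons, USuniquNegR)

str(bad)
str(good)
is.numeric(bad$USperReasons)   # FALSE
is.numeric(good$USperReasons)  # TRUE
```

Once the value column is numeric, ggplot2 treats it as continuous and the axis limits behave as expected.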

R ggplot2 getting "Error: Aesthetics must either be length one, or the same length as the dataProblems"

I get an error
Error: Aesthetics must either be length one, or the same length as the dataProblems:Average, Average, Average
when trying to put a vertical line in a picture with multiple horizontal boxplots.
Here is a piece of the data.frame gs1_domain, used to draw the boxplots,
and here is the data.frame R_18, used to put the vertical lines in the plots.
Below is my code that draws the boxplots:
bp_domain <- ggplot(gs1_domain, aes(x=gs1_domain$Domain, y=gs1_domain$Average))
bp_domain + stat_boxplot(geom='errorbar') + geom_boxplot(outlier.shape = 1) +
coord_flip() + xlab("Domínio") + ylab("Média") + ggtitle("Box plot das médias por domínios")
With this code I get the following graph
I'm trying to put vertical lines in each boxplot, with data from the column Average in data.frame R_18.
Now, following some google pages I added to the code above the function geom_errorbar, and the new code is
bp_domain_R_18 <- ggplot(gs1_domain, aes(x=gs1_domain$Domain, y=gs1_domain$Average))
bp_domain_R_18 + stat_boxplot(geom='errorbar') + geom_boxplot(outlier.shape = 1) +
geom_errorbar(data=R_18, aes(y=Average, ymax=Average, ymin=Average)) +
coord_flip() + xlab("Domínio") + ylab("Média") + ggtitle("Box plot das médias por domínios")
but I get that error message.
I already removed NA's from gs1_domain.
Can someone tell me what's wrong?
Try replacing
aes(x=gs1_domain$Domain, y=gs1_domain$Average)
with
aes(x=Domain, y=Average)
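The data frame is already specified, so we only need to give the column names within aes(). Writing gs1_domain$Domain hard-codes a vector with as many elements as gs1_domain has rows, and the geom_errorbar() layer inherits that x mapping even though it uses data = R_18, which has a different number of rows. That mismatch is what the error reports.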
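A minimal sketch of the corrected plot, assuming both data frames have Domain and Average columns as in the question. The values below are simulated placeholders, since the original data were only shown as images:

```r
library(ggplot2)

# Hypothetical stand-ins for the asker's data frames
set.seed(1)
gs1_domain <- data.frame(
  Domain = rep(c("A", "B", "C"), each = 20),
  Average = runif(60, min = 1, max = 5)
)
R_18 <- data.frame(Domain = c("A", "B", "C"), Average = c(2.5, 3.0, 3.5))

# Bare column names are looked up in each layer's own data, so the
# errorbar layer takes Domain and Average from R_18, not gs1_domain
p <- ggplot(gs1_domain, aes(x = Domain, y = Average)) +
  stat_boxplot(geom = "errorbar") +
  geom_boxplot(outlier.shape = 1) +
  geom_errorbar(
    data = R_18,
    aes(ymin = Average, ymax = Average),
    colour = "red"
  ) +
  coord_flip() +
  xlab("Domínio") + ylab("Média") +
  ggtitle("Box plot das médias por domínios")
p
```

Setting ymin and ymax to the same value collapses each error bar into a single line, which appears vertical after coord_flip().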
