How to remove outliers from this box plotting r command [duplicate] - r

This question already has answers here:
Ignore outliers in ggplot2 boxplot
(8 answers)
Closed 4 years ago.
I am trying to draw a boxplot of my data frame using this command.
ggplot(combined_data,aes(x= factor(field), y=moisture1))+geom_boxplot()
How can I remove the outliers from the output of this command?

If you only want to hide the outliers in your graph, add the argument outlier.shape = NA to geom_boxplot(). The outlier points are then not drawn, but the data and the axis range stay the same.
Example:
library(tidyverse)
mtcars %>%
  ggplot(aes(1, wt)) +
  geom_boxplot(outlier.shape = NA)
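Because outlier.shape = NA only hides the points, the y axis is still scaled to include them. A minimal sketch of one way to zoom in on the whiskers, assuming a single ungrouped box; the 5% padding is an arbitrary choice, and boxplot.stats() may compute hinges slightly differently from ggplot2:

```r
library(ggplot2)

# boxplot.stats()$stats holds: lower whisker, lower hinge, median,
# upper hinge, upper whisker. Keep only the two whisker ends.
whiskers <- boxplot.stats(mtcars$wt)$stats[c(1, 5)]

# Pad the range a little (assumed 5%) so the whiskers are not clipped.
y_limits <- whiskers * c(0.95, 1.05)

p <- ggplot(mtcars, aes(x = factor(1), y = wt)) +
  geom_boxplot(outlier.shape = NA) +
  coord_cartesian(ylim = y_limits)  # zooms without dropping any rows
```

coord_cartesian() is used rather than ylim() because it only changes the visible window, so the box statistics are still computed from all of the data. With several groups on the x axis, the limits would have to cover the whiskers of every group.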

Related

How can I change the order of the stacked bar plot in R? [duplicate]

This question already has an answer here:
ggplot2 geom_bar - how to keep order of data.frame [duplicate]
(1 answer)
Closed 2 years ago.
I want to change the order of the bars in a stacked bar plot in R. In the example below, the green bar should be under the blue bar. How can I change this order? See the code below.
library(dplyr)    # provides %>%
library(ggplot2)
d1 <- data.frame(Gene = c("DNA", "DNA", "RNA", "RNA", "XX", "XX"),
                 Gender = c("M", "F", "M", "F", "M", "F"),
                 p_value = c(0.5, 0.1, 0.6, 0.01, 0.07, 0.02))
p <- d1 %>%
  ggplot(aes(x = forcats::fct_reorder(Gene, p_value), y = p_value, fill = Gender)) +
  geom_col(color = "black", position = position_dodge()) +
  coord_flip() +
  scale_fill_manual(values = c('#6495ED', '#2E8B57')) +
  labs(x = "Gene", y = "p-value")
Just convert d1$Gender into a factor and specify the levels in the order you want them.
d1$Gender <- factor(d1$Gender, levels = c("M", "F"))
Then, run the code to create your plot.
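A small sketch of what the conversion changes, using only the Gender column from the question. The name fill_cols is introduced here for illustration. An unnamed values vector in scale_fill_manual() is assigned in level order, so reordering the levels also swaps the colours unless the vector is named:

```r
d1 <- data.frame(Gender = c("M", "F", "M", "F", "M", "F"))

# Without explicit levels, factor() sorts them alphabetically.
default_levels <- levels(factor(d1$Gender))

# With explicit levels, the given order is kept.
d1$Gender <- factor(d1$Gender, levels = c("M", "F"))
new_levels <- levels(d1$Gender)

# Named colours stay attached to the same group regardless of level order.
# This reproduces the original assignment (first colour to "F", second to "M").
fill_cols <- c(F = '#6495ED', M = '#2E8B57')
```

Passing scale_fill_manual(values = fill_cols) in the plot code should then keep each gender's colour while the bar order follows the new levels.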

How to change legend of facet_grid? [duplicate]

This question already has answers here:
How to change legend title in ggplot
(14 answers)
Closed 4 years ago.
I used R and ggplot to do a small-multiple graph.
ggplot(data=datatest,aes(x=Percentage,y=Accuracy,group=interaction(Classifiers, Feature), color=interaction(Classifiers, Feature)))+geom_line()+facet_grid(OS ~ Dataset)
The graph I got is:
How can I change the legend? For example, I want to change the title interaction(Classifiers, Feature) to just 'Approaches', and to change the labels SVM.Ngram, LG.WE and SVM.WE to just 'approach1', 'approach2' and 'approach3'.
Try
tbl <- c(
  SVM.Ngram = "approach1",
  LG.WE = "approach2",
  SVM.WE = "approach3"
)
ggplot(data = datatest,
       aes(x = Percentage, y = Accuracy, group = interaction(Classifiers, Feature), color = interaction(Classifiers, Feature))) +
  geom_line() +
  # a named vector matches each legend key by its level name
  scale_color_discrete(name = "Approaches", labels = tbl) +
  facet_grid(OS ~ Dataset)
The labeller argument of facet_grid() (see http://ggplot2.tidyverse.org/reference/labeller.html) only renames the facet strips, so the legend title and key labels are set on the colour scale instead.
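Since datatest is not shown in the question, here is a sketch on made-up data to check that legend keys can be relabelled on the colour scale. The data frame toy and all of its values are hypothetical; only the column names and the lookup vector follow the question:

```r
library(ggplot2)

# Hypothetical stand-in for datatest (values invented for illustration).
toy <- data.frame(
  Classifiers = rep(c("SVM", "LG", "SVM"), each = 2),
  Feature     = rep(c("Ngram", "WE", "WE"), each = 2),
  Percentage  = rep(c(10, 20), times = 3),
  Accuracy    = c(0.70, 0.75, 0.65, 0.72, 0.80, 0.82)
)

tbl <- c(SVM.Ngram = "approach1", LG.WE = "approach2", SVM.WE = "approach3")

# interaction() joins the two columns with ".", giving the names used in tbl.
approach <- interaction(toy$Classifiers, toy$Feature)

p <- ggplot(toy, aes(x = Percentage, y = Accuracy,
                     group = interaction(Classifiers, Feature),
                     color = interaction(Classifiers, Feature))) +
  geom_line() +
  scale_color_discrete(name = "Approaches", labels = tbl)
```

The names of tbl have to match the interaction levels exactly; a level without a matching name would not get a new label.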

How to change factor names on x axis with ggplot2 and R? [duplicate]

This question already has answers here:
Customize axis labels
(4 answers)
Closed 5 years ago.
I am plotting the interaction between multiple variables with geom_boxplot, and the resulting factor names are very long. I want to rename these factor names on the plot without changing the factors in the original data set to make the plot easier to interpret.
As an example using the mtcars data set:
library(tidyverse)
ggplot(mtcars) + geom_boxplot(aes(factor(cyl), mpg))
This results in a boxplot with 4, 6, and 8 cylinders as the x axis factors. What I would like to do is change those x axis factors. For example, how could I change 4 to "Four Cyl" without editing the original data?
Try this:
ggplot(mtcars) +
geom_boxplot(aes(factor(cyl), mpg)) +
scale_x_discrete(labels = c('Four','Six','Eight'))
See ?scale_x_discrete.
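An unnamed labels vector is applied in the order of the factor levels. A hedged variant with a named vector makes the mapping explicit; the name cyl_labels and the "Cyl" wording are chosen here for illustration:

```r
library(ggplot2)

# Names are the existing factor levels, values are the labels to display.
cyl_labels <- c("4" = "Four Cyl", "6" = "Six Cyl", "8" = "Eight Cyl")

p <- ggplot(mtcars) +
  geom_boxplot(aes(factor(cyl), mpg)) +
  scale_x_discrete(labels = cyl_labels)

# The data itself is untouched; cyl is still numeric in mtcars.
cyl_class <- class(mtcars$cyl)
```

Only the axis text changes, so the original data set does not need to be edited.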

same bar width in ggplot2? [duplicate]

This question already has answers here:
A way to always dodge a histogram? [duplicate]
(2 answers)
Closed 8 years ago.
In this example:
library(ggplot2)
dat <- data.frame(a=factor(c(1,1,1,2,2,2,3,3,3,4)), b=c("A","B","D","A","B","C","A","B","D",NA), c=c(1,4,3,5,5,1,2,2,8,6))
plot <- ggplot(dat,aes(fill=b,x=a,y=c))
plot + geom_bar(width=.7, position=position_dodge(width=.7), stat = "identity")
factor 4 is wider than the other bars. Is there a way to make them all the same width?
Ideally you should have a row for every combination of a and b, even if its value is zero. That is, for a = 1 there should be rows for all of the b categories (A, B, C, D), and likewise for the other levels of a. Try modifying your data frame like this and plotting it. The NA category is labelled "other" here.
library(ggplot2)
dat <- data.frame(a=factor(c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4)),
b=c("A","B","C","D","other","A","B","C","D","other","A","B","C","D","other","A","B","C","D","other"),
c=c(1,4,0,3,0,5,5,1,0,0,2,2,0,8,0,0,0,0,0,6))
plot <- ggplot(dat,aes(fill=b,x=a,y=c))
plot + geom_bar(width=.7, position=position_dodge(width=.7), stat = "identity")
Compare this data frame with the original to see the difference. The combinations that were missing in your data now show up as empty slots where a bar would be, which does not look good, but I'm afraid this might be the only solution.
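The padded data frame does not have to be typed by hand. A minimal base R sketch, assuming missing combinations should count as zero and NA should be relabelled "other" as above; the names grid and full are introduced here:

```r
dat <- data.frame(a = factor(c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4)),
                  b = c("A", "B", "D", "A", "B", "C", "A", "B", "D", NA),
                  c = c(1, 4, 3, 5, 5, 1, 2, 2, 8, 6),
                  stringsAsFactors = FALSE)

# Give the NA category an explicit label.
dat$b[is.na(dat$b)] <- "other"

# Every combination of the levels of a and the categories of b.
grid <- expand.grid(a = factor(levels(dat$a), levels = levels(dat$a)),
                    b = sort(unique(dat$b)),
                    stringsAsFactors = FALSE)

# Left join onto the grid, then fill the combinations that had no row with 0.
full <- merge(grid, dat, by = c("a", "b"), all.x = TRUE)
full$c[is.na(full$c)] <- 0
```

Plotting full with the same geom_bar() call should then give every group the same number of slots. Newer ggplot2 releases also provide position_dodge(preserve = "single"), which may be worth trying on the original data.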

How to show count of each bin on histogram on the plot [duplicate]

This question already has answers here:
How to put labels over geom_bar in R with ggplot2
(4 answers)
Closed 8 years ago.
I figured out how to get the count of each bin from ggplot. Does anyone know how to show these numbers on the plot?
g <- ggplot()+geom_histogram()
ggplot_build(g)$data[[1]]$count
You can add a stat_bin layer to calculate the counts for you, and then use those values for both the labels and their heights. Here's an example:
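library(ggplot2)
set.seed(15)
dd <- data.frame(x = rnorm(100, 5, 2))
# after_stat(count) replaces the deprecated ..count.. notation (ggplot2 >= 3.3.0)
ggplot(dd, aes(x = x)) +
  geom_histogram() +
  stat_bin(aes(y = after_stat(count), label = after_stat(count)), geom = "text", vjust = -.5)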
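The labels only line up with the bars if both layers bin the data the same way. A sketch that sets the bin count explicitly on both layers and reads the counts back with ggplot_build(), as in the question; bins = 30 is simply the default made explicit, and after_stat() assumes ggplot2 3.3.0 or later:

```r
library(ggplot2)

set.seed(15)
dd <- data.frame(x = rnorm(100, 5, 2))

n_bins <- 30  # same value for both layers so bars and labels match

g <- ggplot(dd, aes(x = x)) +
  geom_histogram(bins = n_bins) +
  stat_bin(aes(y = after_stat(count), label = after_stat(count)),
           geom = "text", vjust = -0.5, bins = n_bins)

# Per-bin counts of the histogram layer (layer 1).
counts <- ggplot_build(g)$data[[1]]$count
total <- sum(counts)
```

The counts should add up to the number of rows in dd, which is a quick check that no observation fell outside the bins.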
