Boxplot not displaying correctly - r

I am facing an issue with boxplot. I have the following dataset:
dataset
The code I applied is as follows:
boxplot(bxplot$food1~bxplot$groupss)
It only shows the plot for one variable, and I want to include the other two variables, food2 and food3, as well, so that we have 6 boxplots.
Hope I have explained my question correctly.
Thanks

@Usman - hope this will be helpful. If you use as.factor on your variable groupss, you should get 6 boxplots as desired:
library(reshape2)
library(ggplot2)
dat.m <- melt(df, id.vars='groupss', measure.vars=c('food1','food2','food3'))
p <- ggplot(dat.m, aes(x=as.factor(groupss), y=value, color=variable)) +
geom_boxplot()
p
As suggested above, you can change your column groupss to a factor itself as follows:
dat.m$groupss <- as.factor(dat.m$groupss)
Instead of reshape2, I use the latest tidyr, which has pivot_longer as an alternative to melt. This would accomplish the same thing:
library(tidyr)
dat.m2 <- df %>%
pivot_longer(cols = starts_with("food"), names_to = "food", values_to = "value")
p <- ggplot(dat.m2, aes(x=as.factor(groupss), y=value, color=food)) +
geom_boxplot()
p
Edit: If you wish to have food1, food2, and food3 on the x-axis, and for each of those 3, have 2 boxplots for groups 1 and 2, you can do the following:
p <- ggplot(dat.m, aes(x=variable, y=value, color=as.factor(groupss))) +
geom_boxplot()
or for the pivot_longer version:
p <- ggplot(dat.m2, aes(x=food, y=value, color=as.factor(groupss))) +
geom_boxplot()
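Since the original dataset was only posted as an image, here is a self-contained sketch with made-up data. The values and the name bxplot_demo are assumptions; only the column names groupss, food1, food2 and food3 come from the question. It reshapes to long format and draws one box per group and food combination, six in total.

```r
library(tidyr)
library(ggplot2)

# Hypothetical data: 2 groups x 10 rows each, three numeric food columns
set.seed(1)
bxplot_demo <- data.frame(
  groupss = rep(c(1, 2), each = 10),
  food1 = rnorm(20, mean = 5),
  food2 = rnorm(20, mean = 6),
  food3 = rnorm(20, mean = 7)
)

# Long format: one row per (groupss, food) measurement
long <- pivot_longer(bxplot_demo, cols = starts_with("food"),
                     names_to = "food", values_to = "value")

# factor() makes ggplot2 treat the numeric group code as discrete,
# so each of the 2 groups x 3 foods gets its own box
p <- ggplot(long, aes(x = factor(groupss), y = value, color = food)) +
  geom_boxplot() +
  labs(x = "groupss")
p
```

Swapping the x and color mappings gives the layout from the edit above, with the foods on the x-axis and the groups as colors.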

Related

How to plot two lines (group of rows) in the same plot

I have data with 80 rows and 210 columns.
Here I plot one line:
ggplot(data=Data[c(1:5),], aes(x=x_lab, y=col1, group=1)) +
geom_line()+
geom_point() + ggtitle("plot of col1 ")
Can you please tell me how I can also plot rows 6 to 10 of col1
as another line (like I did for rows 1:5) and in another color?
Thank you
While with smaller datasets it's tempting to do geom_line(data=...) for each separate line, this scales poorly. Since ggplot2 benefits from having its data in a "long" format, I suggest you reshape it (reshape2::melt or tidyr::pivot_longer) and then plot.
Lacking your data, I think these will work:
library(ggplot2)
### pick one of these two
longData <- tidyr::pivot_longer(Data, -x_lab, names_to = "variable", values_to = "y")
longData <- reshape2::melt(Data, "x_lab", variable.name = "variable", value.name = "y")
### plot all at once
ggplot(longData, aes(x_lab, y = y, group = variable)) +
geom_line() + geom_point()
(I find it often useful to use group=variable, color=variable for more visual breakout of the lines.)
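A quick and easy solution would be to add a grouping variable directly to your data if you are only hoping to plot these 10 rows.
# keep only the 10 rows of interest, then label rows 1-5 and 6-10
Data10 <- Data[1:10, ]
Data10$group <- rep(c("group1", "group2"), each = 5)
ggplot(Data10, aes(x_lab, col1, group = group)) +
geom_line(aes(color = group))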
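A self-contained sketch of the grouping-variable idea, using made-up data. Data_demo and its values are assumptions; x_lab and col1 are the column names from the question. It also assumes that x_lab takes the same values in both blocks of rows, so the two lines share an x-axis.

```r
library(ggplot2)

# Hypothetical stand-in for the first 10 rows of Data
Data_demo <- data.frame(
  x_lab = rep(1:5, times = 2),
  col1  = c(2, 4, 3, 6, 5, 1, 3, 2, 4, 7)
)

# Rows 1-5 become "group1", rows 6-10 become "group2"
Data_demo$group <- rep(c("group1", "group2"), each = 5)

# Mapping group to both group and color draws one colored line per block
p <- ggplot(Data_demo, aes(x = x_lab, y = col1, group = group, color = group)) +
  geom_line() +
  geom_point() +
  ggtitle("plot of col1")
p
```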

plot selected columns using ggplot2

I would like to plot multiple separate plots and so far I have the following code:
However, I don't want the final column from my dataset; it makes ggplot2 plot x-variable vs x-variable.
library(ggplot2)
require(reshape)
d <- read.table("C:/Users/trinh/Desktop/Book1.csv", header=F,sep=",",skip=24)
t<-c(0.25,1,2,3,4,6,8,10)
d2<-d[,3:13] #removing unwanted columns
d2<-cbind(d2,t) #adding x-variable
df <- melt(d2, id = 't')
ggplot(data=df, aes(y=value,x=t)) +geom_point(shape=1) +
geom_smooth(method='lm',se=F)+facet_grid(.~variable)
I tried adding
data=subset(df,df[,3:12])
but I don't think I am writing it correctly. Please advise. Thanks.
Here's how you could do it, using data(iris) as an example:
(i) plot with all variables
df <- reshape2::melt(iris, id="Species")
ggplot(df, aes(y=value, x=Species)) + geom_point() + facet_wrap(~ variable)
(ii) plot without "Petal.Width"
library(dplyr)
df2 <- df %>% filter(variable != "Petal.Width")
ggplot(df2, aes(y=value, x=Species)) + geom_point() + facet_wrap(~ variable)

Plot including one categorical variable and two numeric variables

How can I show the values of AverageTime and AverageCost for their corresponding type on a graph? The scales of the variables are different, since one is an average time and the other is an average cost. I want Type on the x-axis and the values of AverageTime and AverageCost on the y-axis. (In this case, I will have two line plots in one graph.)
Type<-c("a","b","c","d","e","f","g","h","i","j","k")
AverageTime<-c(12,14,66,123,14,33,44,55,55,6,66)
AverageCost<-c(100,10000,400,20000,500000,5000,700,800,400000,500,120000)
df<-data.frame(Type,AverageTime,AverageCost)
This could be done using facet_wrap and scales="free_y" like so:
library(tidyr)
library(dplyr)
library(ggplot2)
df %>%
mutate(AverageCost=as.numeric(AverageCost), AverageTime=as.numeric(AverageTime)) %>%
gather(variable, value, -Type) %>%
ggplot(aes(x=Type, y=value, colour=variable, group=variable)) +
geom_line() +
facet_wrap(~variable, scales="free_y")
There you can compare the two lines even though they are on different scales.
HTH
# install.packages("ggplot2", dependencies = TRUE)
library(ggplot2)
p <- ggplot(df, aes(AverageTime, AverageCost, colour=Type)) + geom_point()
p + geom_abline()
Showing both lines in the same plot will be hard, since they are on different scales. You also need to convert AverageTime and AverageCost into numeric variables.
library(ggplot2)
library(reshape2)
library(dplyr) # provides %>%, group_by and summarise used below
To be able to plot both lines in one graph and take the average of the two, you need to do some reshaping.
df_ag <- melt(df, id.vars=c("Type"))
df_ag_sb <- df_ag %>% group_by(Type, variable) %>% summarise(meanx = mean(as.numeric(value), na.rm=TRUE))
ggplot(df_ag_sb, aes(x=Type, y=as.numeric(meanx), color=variable, group=variable)) + geom_line()

dplyr + ggplot2: Plotting not working via piping

I want to plot a subset of my dataframe. I am working with dplyr and ggplot2. My code only works with version 1, not version 2 via piping. What's the difference?
Version 1 (plotting is working):
data <- dataset %>% filter(type=="type1")
ggplot(data, aes(x=year, y=variable)) + geom_line()
Version 2 with piping (plotting is not working):
data %>% filter(type=="type1") %>% ggplot(data, aes(x=year, y=variable)) + geom_line()
Error:
Error in ggplot.data.frame(., data, aes(x = year, :
Mapping should be created with aes or aes_string
Thanks for your help!
Solution for version 2: a dot . instead of data:
data %>%
filter(type=="type1") %>%
ggplot(., aes(x=year, y=variable)) +
geom_line()
I usually do this, which also dispenses with the need for the .:
library(dplyr)
library(ggplot2)
mtcars %>%
filter(cyl == 4) %>%
ggplot +
aes(
x = disp,
y = mpg
) +
geom_point()
When piping, if you re-enter the data name (shown in bold below), the function confuses the order of its arguments: the pipe already supplies the data frame as the first argument, so data ends up in the mapping position.
data %>% filter(type=="type1") %>% ggplot(***data***, aes(x=year, y=variable)) + geom_line()
Hope it works for you.
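To make the argument order concrete, here is a small sketch using the built-in mtcars data, as in the answer above. The pipe passes its left-hand side as the first argument of the next call, so ggplot() should only be given the mapping. The names p_piped and p_plain are just for illustration.

```r
library(dplyr)
library(ggplot2)

# x %>% f(y) is evaluated as f(x, y), so the piped data frame
# already fills the `data` argument of ggplot()
p_piped <- mtcars %>%
  filter(cyl == 4) %>%
  ggplot(aes(x = disp, y = mpg)) +
  geom_point()

# The same plot written without the pipe
p_plain <- ggplot(filter(mtcars, cyl == 4), aes(x = disp, y = mpg)) +
  geom_point()

p_piped
```

Writing ggplot(data, aes(...)) after the pipe corresponds to ggplot(<piped data>, data, aes(...)), which is why the error message complains about the mapping.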

Clustered Bar plot in r using ggplot2

Snip of my data frame is
Basically, I want to display a bar plot grouped by country, i.e. the number of suicides for every country in one clustered plot, and similarly for accidents and stabbings. I am using ggplot2 for this, but I have no idea how to do it.
Any help is appreciated.
Thanks in advance
Edit to update for newer (2017) package versions
library(tidyr)
library(ggplot2)
dat.g <- gather(dat, type, value, -country)
ggplot(dat.g, aes(type, value)) +
geom_bar(aes(fill = country), stat = "identity", position = "dodge")
Original Answer
dat <- data.frame(country=c('USA','Brazil','Ghana','England','Australia'), Stabbing=c(15,10,9,6,7), Accidents=c(20,25,21,28,15), Suicide=c(3,10,7,8,6))
dat.m <- melt(dat, id.vars='country')
I guess this is the format you're after?
ggplot(dat.m, aes(variable, value)) +
geom_bar(aes(fill = country), position = "dodge")
library(ggplot2)
library(reshape2)
df <- data.frame(country=c('USA','Brazil','Ghana','England','Australia'), Stabbing=c(15,10,9,6,7), Accidents=c(20,25,21,28,15), Suicide=c(3,10,7,8,6))
mm <- melt(df, id.vars='country')
ggplot(mm, aes(x=country, y=value)) + geom_bar(stat='identity') + facet_grid(.~variable) + coord_flip() + labs(x='',y='')
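For current package versions, here is a sketch of the same clustered plot using pivot_longer() in place of gather() or melt(), and geom_col(), which is shorthand for geom_bar(stat = "identity"). The data frame is the one from the original answer; the object name dat_long is an assumption.

```r
library(tidyr)
library(ggplot2)

dat <- data.frame(
  country   = c("USA", "Brazil", "Ghana", "England", "Australia"),
  Stabbing  = c(15, 10, 9, 6, 7),
  Accidents = c(20, 25, 21, 28, 15),
  Suicide   = c(3, 10, 7, 8, 6)
)

# One row per (country, type) pair
dat_long <- pivot_longer(dat, cols = -country,
                         names_to = "type", values_to = "value")

# position = "dodge" places the countries side by side within each type
p <- ggplot(dat_long, aes(x = type, y = value, fill = country)) +
  geom_col(position = "dodge")
p
```

Mapping country to x and type to fill instead would cluster the bars by country rather than by type.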
