plot selected columns using ggplot2 - r

I would like to draw multiple separate plots. However, I don't want the final column of my dataset to be included; it makes ggplot2 plot the x-variable against itself. So far I have the following code:
library(ggplot2)
library(reshape)
d <- read.table("C:/Users/trinh/Desktop/Book1.csv", header=F, sep=",", skip=24)
t <- c(0.25,1,2,3,4,6,8,10)
d2 <- d[,3:13] # removing unwanted columns
d2 <- cbind(d2,t) # adding x-variable
df <- melt(d2, id = 't')
ggplot(data=df, aes(y=value,x=t)) + geom_point(shape=1) +
  geom_smooth(method='lm',se=F) + facet_grid(.~variable)
I tried adding
data=subset(df,df[,3:12])
but I don't think I am writing it correctly. Please advise. Thanks.

Here's how you could do it, using data(iris) as an example:
(i) plot with all variables
df <- reshape2::melt(iris, id="Species")
ggplot(df, aes(y=value, x=Species)) + geom_point() + facet_wrap(~ variable)
(ii) plot without "Petal.Width"
library(dplyr)
df2 <- df %>% filter(variable != "Petal.Width")
ggplot(df2, aes(y=value, x=Species)) + geom_point() + facet_wrap(~ variable)
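The same idea carries over to the wide data in the question: either drop the unwanted column before melting, or melt everything and filter the long data afterwards. The following is only a sketch under assumptions: the data frame and its column names (V1, V2, V3, extra) are made-up stand-ins, since the asker's CSV is not available.

```r
library(reshape2)
library(ggplot2)

# Hypothetical stand-in for the asker's data: three measurement columns,
# one unwanted column, and the x-variable t.
t <- c(0.25, 1, 2, 3)
d2 <- data.frame(V1 = c(1, 2, 3, 4),
                 V2 = c(2, 4, 6, 8),
                 V3 = c(1, 3, 5, 7),
                 extra = t,
                 t = t)

# Option 1: remove the column by name before reshaping.
wide <- d2[, setdiff(names(d2), "extra")]
df_long <- melt(wide, id.vars = "t")

# Option 2: reshape everything, then drop the unwanted panel.
df_all <- melt(d2, id.vars = "t")
df_filtered <- subset(df_all, variable != "extra")

# One panel per remaining column.
p <- ggplot(df_long, aes(x = t, y = value)) +
  geom_point(shape = 1) +
  geom_smooth(method = "lm", se = FALSE) +
  facet_grid(. ~ variable)
```

Removing the column by name rather than by position keeps working if the column order of the input changes.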

Related

Plotting multiple items in a list using ggplot in R

I have a list of data frames that all have the same structure, and I want to plot information from all of them on the same diagram in R using ggplot, the way facet_wrap shows multiple panels in a single image, but I am having trouble. Below is a reproducible example.
library(ggplot2)
#Designating 3 datasets:
data_1 <- mtcars
data_2 <- mtcars
data_3 <- mtcars
#Making them into a list:
mylist <- list(data_1, data_2, data_3)
#What things should look like, with facet_wrap being by "dataset", and thus a panel for each of the
#three datasets presented.
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) + geom_point() + facet_wrap(~Species)
But instead, when I run the following, I get an error saying that the data must be presented as a dataframe, not a list:
ggplot(mylist, aes(x = cyl, y = mpg)) + geom_point() + facet_wrap(~.x)
Does anyone know the best way to use ggplot to plot from a list like this? Do you have to somehow wrap ggplot within lapply()?
One option would be to bind your dataframes by row using e.g. dplyr::bind_rows:
library(ggplot2)
data_1 <- mtcars
data_2 <- mtcars
data_3 <- mtcars
mylist <- list(data_1, data_2, data_3) |>
dplyr::bind_rows(.id = "id")
ggplot(mylist, aes(x = cyl, y = mpg)) + geom_point() + facet_wrap(~id)
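If the list elements are named, bind_rows uses those names as the values of the .id column, which gives more readable facet labels than 1, 2, 3. The sketch below assumes hypothetical names (first, second, third) and a hypothetical column name dataset. It also shows the lapply() route mentioned in the question, which produces one separate plot per list element rather than one faceted plot.

```r
library(ggplot2)
library(dplyr)

# Hypothetical names for the three list elements.
mylist <- list(first = mtcars, second = mtcars, third = mtcars)

# Stack the data frames; the list names end up in the "dataset" column.
combined <- bind_rows(mylist, .id = "dataset")

p <- ggplot(combined, aes(x = cyl, y = mpg)) +
  geom_point() +
  facet_wrap(~dataset)

# Alternative: one independent plot per data frame, returned as a list.
plots <- lapply(mylist, function(d) {
  ggplot(d, aes(x = cyl, y = mpg)) + geom_point()
})
```

The faceted version shares axes across panels by default, which usually makes the datasets easier to compare than separate plots.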

Boxplot not displaying correctly

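I am facing an issue with boxplot. My dataset (not reproduced here) is a data frame bxplot with a grouping column groupss and three value columns food1, food2 and food3.
The code I applied is as follows: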
boxplot(bxplot$food1~bxplot$groupss)
It only shows the plot for one variable, and I want to include the other two variables, food2 and food3, as well, so that there are 6 boxplots.
Hope I have explained my questions correctly.
Thanks
@Usman - hope this will be helpful. If you use as.factor for your variable groupss, you should get 6 boxplots as desired:
library(reshape2)
library(ggplot2)
# bxplot is the data frame from the question
dat.m <- melt(bxplot, id.vars='groupss', measure.vars=c('food1','food2','food3'))
p <- ggplot(dat.m, aes(x=as.factor(groupss), y=value, color=variable)) +
  geom_boxplot()
p
As suggested above, you can also convert the groupss column itself to a factor:
dat.m$groupss <- as.factor(dat.m$groupss)
Instead of reshape2, I use a recent tidyr (1.0.0 or later), which provides pivot_longer as an alternative to melt. This accomplishes the same thing:
library(tidyr)
dat.m2 <- bxplot %>%
  pivot_longer(cols = starts_with("food"), names_to = "food", values_to = "value")
p <- ggplot(dat.m2, aes(x=as.factor(groupss), y=value, color=food)) +
  geom_boxplot()
p
Edit: If you wish to have food1, food2, and food3 on the x-axis, and for each of those 3, have 2 boxplots for groups 1 and 2, you can do the following:
p <- ggplot(dat.m, aes(x=variable, y=value, color=as.factor(groupss))) +
geom_boxplot()
or for the pivot_longer version:
p <- ggplot(dat.m2, aes(x=food, y=value, color=as.factor(groupss))) +
geom_boxplot()
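Since the original dataset is not shown, here is a self-contained sketch of the last variant. The bxplot data below is randomly generated and purely illustrative; only the column names groupss, food1, food2 and food3 come from the question.

```r
library(tidyr)
library(ggplot2)

# Made-up data standing in for the asker's bxplot data frame.
set.seed(1)
bxplot <- data.frame(
  groupss = rep(c(1, 2), each = 10),
  food1 = rnorm(20, mean = 5),
  food2 = rnorm(20, mean = 6),
  food3 = rnorm(20, mean = 7)
)

# Wide to long: one row per (observation, food) pair.
long <- pivot_longer(bxplot, cols = starts_with("food"),
                     names_to = "food", values_to = "value")

# A numeric grouping column must be a factor to split the boxplots.
long$groupss <- factor(long$groupss)

# Three foods on the x-axis, two boxes (one per group) for each food.
p <- ggplot(long, aes(x = food, y = value, color = groupss)) +
  geom_boxplot()
```

Leaving groupss numeric would give a single box per food, because ggplot2 does not split boxplots by a continuous colour variable.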

Q: Display grouped and combined boxplot in a single plot in R

I am trying to display grouped boxplots and a combined boxplot in one plot. Take the iris data for instance:
data(iris)
p1 <- ggplot(iris, aes(x=Species, y=Sepal.Length)) +
geom_boxplot()
p1
I am trying to compare the overall distribution with the distributions within each category. Is there a way to display a boxplot of all samples to the left of these three grouped boxplots?
Thanks in advance.
You can rbind a new version of iris, where Species equals "All" for all rows, to iris before piping to ggplot:
library(dplyr)
library(ggplot2)
p1 <- iris %>%
  rbind(iris %>% mutate(Species = 'All')) %>%
  ggplot(aes(x = Species, y = Sepal.Length)) +
  geom_boxplot()
Yes, you can just create a column for all species as follows:
iris = iris %>% mutate(all = "All Species")
p1 <- ggplot(iris) +
geom_boxplot(aes(x=Species, y=Sepal.Length)) +
geom_boxplot(aes(x=all, y=Sepal.Length))
p1
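The question asks for the combined box on the left. ggplot2 orders a discrete x-axis by factor levels, so one way to control the position is to set the levels explicitly with "All" first. A minimal sketch in base R plus ggplot2 (the names iris_all and combined are illustrative):

```r
library(ggplot2)

# Copy of iris in which every row belongs to the pseudo-species "All".
iris_all <- transform(iris, Species = "All")

# Stack the copy on top of the original data (Species as plain text).
combined <- rbind(iris_all,
                  transform(iris, Species = as.character(Species)))

# Put "All" first so it is drawn as the leftmost box.
combined$Species <- factor(combined$Species,
                           levels = c("All", levels(iris$Species)))

p <- ggplot(combined, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot()
```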

boxplot ggplot2::qplot() ungrouped and grouped data in same plot R

My data set features a factor (TypeOfCat) and a numeric variable (AgeOfCat).
I've made the box plot below. In addition to a box representing each type of cat, I've also tried to add a box representing the ungrouped data (i.e. the entire cohort of cats and their ages). What I've got is not quite what I'm after though, as sum() of course won't provide all the information needed to create such a plot. Any help would be much appreciated.
Data set and current code:
Df1 <- data.frame(TypeOfCat=c("A","B","B","C","C","A","B","C","A","B","A","C"),
AgeOfCat=c(14,2,5,8,4,5,2,6,3,6,12,7))
Df2 <- data.frame(TypeOfCat=c("AllCats"),
AgeOfCat=sum(Df1$AgeOfCat))
Df1 <- rbind(Df1, Df2)
qplot(Df1$TypeOfCat,Df1$AgeOfCat, geom = "boxplot") + coord_flip()
No need for sum. Just take all the values individually for AllCats:
# Your original code:
library(ggplot2)
Df1 <- data.frame(TypeOfCat=c("A","B","B","C","C","A","B","C","A","B","A","C"),
AgeOfCat=c(14,2,5,8,4,5,2,6,3,6,12,7))
# this is the different part:
Df2 <- data.frame(TypeOfCat=c("AllCats"),
AgeOfCat=Df1$AgeOfCat)
Df1 <- rbind(Df1, Df2)
qplot(Df1$TypeOfCat,Df1$AgeOfCat, geom = "boxplot") + coord_flip()
You can see you have all the observations if you add geom_point to the boxplot:
ggplot(Df1, aes(TypeOfCat, AgeOfCat)) +
geom_boxplot() +
geom_point(color='red') +
coord_flip()
Like this?
library(ggplot2)
# first double your data frame, but change "TypeOfCat", since it contains all:
df <- rbind(Df1, transform(Df1, TypeOfCat = "AllCats"))
# then plot it:
ggplot(data = df, mapping = aes(x = TypeOfCat, y = AgeOfCat)) +
geom_boxplot() + coord_flip()
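A quick way to confirm that the doubled data frame is what the boxplot needs is to count the rows per group: AllCats should hold every observation, while each individual type keeps only its own. This is just a sanity check in base R using the data from the question.

```r
Df1 <- data.frame(
  TypeOfCat = c("A","B","B","C","C","A","B","C","A","B","A","C"),
  AgeOfCat  = c(14,2,5,8,4,5,2,6,3,6,12,7)
)

# Append a copy of every row, relabelled as "AllCats".
df <- rbind(Df1, transform(Df1, TypeOfCat = "AllCats"))

# Number of observations behind each box.
counts <- table(df$TypeOfCat)
print(counts)
```

With sum() in place of the full vector, AllCats would contain a single row, which is why the original attempt could not produce a proper box.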

Plot including one categorical variable and two numeric variables

How can I show the values of AverageTime and AverageCost for each corresponding Type on a graph? The two variables are on different scales, since one is an average time and the other an average cost. I want Type on the x-axis and the values of AverageTime and AverageCost on the y-axis, giving two line plots in a single graph.
Type<-c("a","b","c","d","e","f","g","h","i","j","k")
AverageTime<-c(12,14,66,123,14,33,44,55,55,6,66)
AverageCost<-c(100,10000,400,20000,500000,5000,700,800,400000,500,120000)
df<-data.frame(Type,AverageTime,AverageCost)
This could be done using facet_wrap and scales="free_y" like so:
library(tidyr)
library(dplyr)
library(ggplot2)
df %>%
mutate(AverageCost=as.numeric(AverageCost), AverageTime=as.numeric(AverageTime)) %>%
gather(variable, value, -Type) %>%
ggplot(aes(x=Type, y=value, colour=variable, group=variable)) +
geom_line() +
facet_wrap(~variable, scales="free_y")
That way you can compare the two lines even though they are on different scales.
HTH
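For reference, the same reshaping can be written with pivot_longer, which tidyr now recommends over gather. This is a sketch that rebuilds the data from the question so it runs on its own; the name long is illustrative.

```r
library(tidyr)
library(ggplot2)

Type <- c("a","b","c","d","e","f","g","h","i","j","k")
AverageTime <- c(12,14,66,123,14,33,44,55,55,6,66)
AverageCost <- c(100,10000,400,20000,500000,5000,700,800,400000,500,120000)
df <- data.frame(Type, AverageTime, AverageCost)

# Every column except Type becomes a (variable, value) pair.
long <- pivot_longer(df, cols = -Type,
                     names_to = "variable", values_to = "value")

# One panel per variable, each with its own y-axis range.
p <- ggplot(long, aes(x = Type, y = value, colour = variable, group = variable)) +
  geom_line() +
  facet_wrap(~variable, scales = "free_y")
```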
# install.packages("ggplot2", dependencies = TRUE)
library(ggplot2)
p <- ggplot(df, aes(AverageTime, AverageCost, colour=Type)) + geom_point()
p + geom_abline()
Showing both lines in the same plot will be hard, since they are on different scales. You also need to make sure AverageTime and AverageCost are numeric variables.
library(ggplot2)
library(reshape2)
library(dplyr) # provides %>%, group_by and summarise
To plot both lines in one graph and take the averages, you need to do some reshaping.
df_ag <- melt(df, id.vars=c("Type"))
df_ag_sb <- df_ag %>% group_by(Type, variable) %>% summarise(meanx = mean(as.numeric(value), na.rm=TRUE))
ggplot(df_ag_sb, aes(x=Type, y=as.numeric(meanx), color=variable, group=variable)) + geom_line()
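If both lines really must share one panel, one possible workaround (not part of the answers above) is a logarithmic y-axis, which keeps the much smaller AverageTime values from being flattened against the bottom. A minimal sketch, reshaping with base R only; the name long is illustrative:

```r
library(ggplot2)

Type <- c("a","b","c","d","e","f","g","h","i","j","k")
AverageTime <- c(12,14,66,123,14,33,44,55,55,6,66)
AverageCost <- c(100,10000,400,20000,500000,5000,700,800,400000,500,120000)
df <- data.frame(Type, AverageTime, AverageCost)

# Long format built by stacking the two measurement columns.
long <- rbind(
  data.frame(Type = df$Type, variable = "AverageTime", value = df$AverageTime),
  data.frame(Type = df$Type, variable = "AverageCost", value = df$AverageCost)
)

# Both lines in one panel; the y-axis is log10-scaled.
p <- ggplot(long, aes(x = Type, y = value, colour = variable, group = variable)) +
  geom_line() +
  scale_y_log10()
```

The trade-off is that the two series still share one axis with mixed units, so the facetted version is usually easier to read.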

Resources