Reorder discrete variables in ggplot2 - r

I'm trying to reorder the discrete variables that I have in ggplot2. I would like them displayed in the order WTT, KOT, WTD, KOD, but the graph currently shows KOD, KOT, WTD, WTT. I have tried using match to manually reorder the data frame, but the graph itself does not change.
The data looks something like this:
type mean
WTT 100
KOT 110
WTD 1000
KOD 1300
The means will vary and I only care that the correct factors are paired to each other in a graph.
And the code I am primarily using is the following:
graph = ggplot(data = data_subset,aes(y = Mean, x = Type, color = Type))

A straightforward way is to re-level your Type variable:
graph = ggplot(data = data_subset, aes(y = Mean, x = factor(Type, levels = c("WTT", "KOT", "WTD", "KOD")), color = Type))
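Reordering the rows with match has no effect because ggplot2 orders a discrete axis by the factor levels (alphabetical for character columns), not by row order. A minimal sketch of the re-leveling approach, assuming a data frame shaped like the table in the question (the values and the geom_point layer are only for illustration):

```r
library(ggplot2)

# Hypothetical data mirroring the table in the question
data_subset <- data.frame(
  Type = c("WTT", "KOT", "WTD", "KOD"),
  Mean = c(100, 110, 1000, 1300)
)

# Store the desired order in the factor itself so every plot reuses it
data_subset$Type <- factor(data_subset$Type,
                           levels = c("WTT", "KOT", "WTD", "KOD"))

# The x axis now follows the level order instead of alphabetical order
graph <- ggplot(data_subset, aes(x = Type, y = Mean, color = Type)) +
  geom_point()
```

Setting the levels once on the column keeps the axis and the legend in the same order without repeating factor() inside aes().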

Related

I am facing a problem in generating bubble enrichment plot using ggplot from fgsea result

I want to create a bubble plot:
My data looks like this:
My code for the bubble plot:
bubble1 <- ggplot(a, aes(y = pathway, x = Condition, size = size)) +
geom_point(aes(color = NES, size=size)) +
scale_color_gradientn(colours = rainbow(5))
It does not show any error but the result looks as follows:
The plot was supposed to be something like this:
Both your x and y variables appear to be factors. Judging by the screenshot of your data in Excel, Condition is Tumor for every (visible) pathway value. Using Condition as the x variable therefore places every point at the same x position, which gives the kind of plot you show here.
To get the plot you actually want, your x variable should be numeric, e.g. any of the columns B to F in Excel.
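As a rough sketch of that suggestion, the plot below maps a numeric column to x. The data frame here is a made-up stand-in for the fgsea result (the real columns were only shown in a screenshot), so the pathway names and values are assumptions:

```r
library(ggplot2)

# Hypothetical stand-in for the fgsea result; all values are made up
a <- data.frame(
  pathway = c("Pathway A", "Pathway B", "Pathway C"),
  NES     = c(1.8, -1.2, 2.4),
  size    = c(25, 60, 40)
)

# A numeric x spreads the bubbles horizontally instead of stacking them in one column
bubble1 <- ggplot(a, aes(x = NES, y = pathway)) +
  geom_point(aes(color = NES, size = size)) +
  scale_color_gradientn(colours = rainbow(5))
```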

Question: Use a factor's index to plot variables

I'm very new to R so I'm sorry if this is something really simple.
I've had a look on a bunch of cheat sheets and can't see anything obvious.
I have a simple set of data that has date, temperature, and 4 different factors (based on the bloom of a tree // 1 = "", 2 = "bloom", 3 = "full", 4 = "scatter")
What I want to do, but have no idea how, is to do a scatter plot of the date and temperature of each factor individually.
One approach is to use ggplot2 with facet_wrap. First, set the level names of the Bloom factor so that the panels get useful labels.
Then we plot the data with ggplot, grouping and coloring by the Bloom factor, and add facet_wrap(. ~ Bloom) to draw one panel per Bloom level.
library(ggplot2)
levels(TreeData$Bloom) <- c("None","Bloom","Full","Scatter")
ggplot(TreeData, aes(x=Date,y=Temp,group = Bloom, color = Bloom)) +
geom_point(show.legend = FALSE) +
facet_wrap(. ~ Bloom)
Per your comment, if you wanted individual graphs you could use base R subsetting with TreeData[TreeData$Bloom == "Full",]. Note that "Full" is the factor level we set earlier.
ggplot(TreeData[TreeData$Bloom == "Full",], aes(x=Date,y=Temp)) +
geom_point() + labs(title="Full Bloom")
Data
set.seed(1)
TreeData <- data.frame(
  Date = rep(seq.Date(from = as.Date("2019-04-01"), to = as.Date("2019-08-01"), by = "week"), each = 10),
  Temp = round(runif(n = 180, min = 22, max = 38)),
  Bloom = as.factor(sample(1:4, 180, replace = TRUE))
)
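If you want an individual graph for every level rather than just one, the subsetting step can be repeated with split and lapply. This is only a sketch built on the example data above; the names by_bloom and plots are introduced here for illustration:

```r
library(ggplot2)

# Same example data as above
set.seed(1)
TreeData <- data.frame(
  Date = rep(seq.Date(from = as.Date("2019-04-01"), to = as.Date("2019-08-01"), by = "week"), each = 10),
  Temp = round(runif(n = 180, min = 22, max = 38)),
  Bloom = as.factor(sample(1:4, 180, replace = TRUE))
)
levels(TreeData$Bloom) <- c("None", "Bloom", "Full", "Scatter")

# One data frame per Bloom level, then one plot per data frame
by_bloom <- split(TreeData, TreeData$Bloom)
plots <- lapply(names(by_bloom), function(lvl) {
  ggplot(by_bloom[[lvl]], aes(x = Date, y = Temp)) +
    geom_point() +
    labs(title = lvl)
})
names(plots) <- names(by_bloom)
```

Calling print(plots[["Full"]]) then draws the plot for a single level.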

Barplot of groups based on counts

I'm trying to make a barplot.
The data are in several data frames. Each data frame has several columns, one named ID and another named count.
First I'm trying to group the counts. The barplot should show count=0, count=1, count=2, count>=3.
Some example data
data1 <- data.frame(ID="ID_1", count=(rep(seq(0,10,by=1),each=4)))
data2 <- data.frame(ID="ID_2", count=(rep(seq(0,10,by=1),each=4)))
data3 <- data.frame(ID="ID_3", count=(rep(seq(0,10,by=1),each=4)))
Obviously, the barplots of these data frames will look the same.
I tried to make this in ggplot (it's not nice at all)
ggplot(data1)+
geom_bar(aes(x = ID, fill = count),position = "fill")+
geom_bar(data=data2,aes(x = ID, fill = count),position = "fill")+
geom_bar(data=data3,aes(x = ID, fill = count),position = "fill")
I got something like that
What I'm trying to do is show different groups within each bar: the proportion of counts equal to 0, to 1, to 2, and the proportion of counts greater than or equal to 3.
I expect something like that
But of course in my example the barplots will look the same.
Also, if you have a suggestion for changing the Y axis from 1.00 to 100%, I would appreciate it.
One more issue is that my real data frames do not all have the same length, but that should not matter because I am after the percentage of each count group.
You need to put all the data in 1 dataframe, in long format. Then cast your counts to factors, and it works.
library(dplyr)    # for bind_rows()
library(ggplot2)
ggplot(bind_rows(data1, data2, data3)) +
geom_bar(aes(x = ID, fill = as.factor(count)), position = "fill") +
scale_y_continuous(labels=scales::percent) # To get the Y axis in percentage
So I did something to try to create my barplot
data1$var="first"
data2$var="second"
data3$var="third"
data4$var="fourth"
data5$var="fifth"
full_data=rbind(data1,data2,data3,data4,data5)
ggplot(full_data) +
geom_bar(aes(x = var, fill = as.factor(count)), position = "fill")+
scale_y_continuous(labels=scales::percent)
So I got something like that:
Does someone have a solution for making different groups of counts: count=0, count=1, count=2, count>=3?
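One possible way to get those four groups is to bin the counts with base R's cut before plotting. The sketch below uses the example data frames from the question; the column name count_group is an assumption introduced for illustration:

```r
library(ggplot2)

data1 <- data.frame(ID = "ID_1", count = rep(0:10, each = 4))
data2 <- data.frame(ID = "ID_2", count = rep(0:10, each = 4))
data3 <- data.frame(ID = "ID_3", count = rep(0:10, each = 4))
full_data <- rbind(data1, data2, data3)

# Bin counts into 0, 1, 2 and >=3.
# With the default right = TRUE the intervals are (-Inf,0], (0,1], (1,2], (2,Inf]
full_data$count_group <- cut(full_data$count,
                             breaks = c(-Inf, 0, 1, 2, Inf),
                             labels = c("0", "1", "2", ">=3"))

# Each bar is split by count group and scaled to 100%
p <- ggplot(full_data) +
  geom_bar(aes(x = ID, fill = count_group), position = "fill") +
  scale_y_continuous(labels = scales::percent)
```

Because position = "fill" works with proportions, data frames of different lengths can be combined this way without further adjustment.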

How to Plot Bar Charts for a Categorical Variable Against an Analytical Variable in R

I'm struggling with how to do something with R that comes very easily to me in Excel: so I'm sure this is something quite basic but I'm just not aware of the equivalent method in R.
In essence, I have two variables in my dataset: a categorical variable that holds a list of names, and an analytical variable that holds the frequency corresponding to each observation.
Something like this:
Name Freq
==== =========
X 100
Y 200
and so on.
I would like to plot a bar chart with the names listed on the X-Axis (X, Y and so on) and bars of height corresponding to the relevant value of the Freq. variable for that observation.
This is something very trivial with Excel; I can just select the relevant cells and create a bar chart.
However, in R I just can't seem to figure out how to do this! The bar charts in R seem to be univariate only and don't behave the way I want them to. Trying to plot the two variables results in a scatter plot, which is not what I'm going for.
Is there something very basic I'm missing here, or is R just not capable of performing this task?
Any pointers would be very helpful.
Edited to Add:
I was primarily trying to use base R's plot function to get the job done.
Using plot(dataset1$Name, dataset1$Freq) produces a scatter plot rather than a bar graph.
First the data.
dat <- data.frame(Name = c("X", "Y"), Freq = c(100, 200))
With base R.
barplot(dat$Freq, names.arg = dat$Name)
If you want to display a long list of names.arg, maybe the best way is to customize your horizontal axis with function staxlab from package plotrix. Here are two example plots.
One, with the axis labels rotated 45 degrees.
set.seed(3)
Name <- paste0("Name_", LETTERS[1:10])
dat2 <- data.frame(Name = Name, Freq = sample(100:200, 10))
bp <- barplot(dat2$Freq)
plotrix::staxlab(1, at = bp, labels = dat2$Name, srt = 45)
Another, with the labels spread over 3 lines.
bp <- barplot(dat2$Freq)
plotrix::staxlab(1, at = bp, labels = dat2$Name, nlines = 3)
Add colors with argument col. See help("par").
With ggplot2.
library(ggplot2)
ggplot(dat, aes(Name, Freq)) +
geom_bar(stat = "identity")
To add colors you have the aesthetics colour (for the contour of the bars) and fill (for the interior of the bars).
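A small sketch of those two aesthetics, using the same dat as above. It uses geom_col, which takes the y values as bar heights in the same way as geom_bar(stat = "identity"); the color names are arbitrary choices:

```r
library(ggplot2)

dat <- data.frame(Name = c("X", "Y"), Freq = c(100, 200))

# fill sets the interior of the bars, colour sets their outline
p <- ggplot(dat, aes(Name, Freq)) +
  geom_col(fill = "steelblue", colour = "black")
```

To color each bar differently, map the variable inside aes instead, for example aes(Name, Freq, fill = Name).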

ggplot2 | How to plot mean lines for x, y, colour=group, facets=Drug~. Cannot make it look right

I have a dataset with three factors (Group=Between; Drug=Within; Session=Within) and one response variable (DEs2mPre). I am able to plot faceted boxplot using
qplot(Session, DEs2mPre, data = Dummy.Data, colour = Drug, facets = Group~., geom = "boxplot")
I have three groups and two levels of Drug, so I get a nice 3X2 graph: 3 individual graphs, one for each group, with the two levels of Drug over the sessions on each graph. However, instead of boxplots I would like to see lines connecting the means at each session. When I change geom to geom="line", I get a mess of lines that looks like one line for every subject in the dataset, not a grouped (mean-like) visualization of the data like what you would see with lineplot.CI (sciplot package).
Is there any way to do that in ggplot2?
Sorry I couldn't add my graphs because I do not have enough "reputation points".
Thanks for any help in advance.
You get a mess of lines since ggplot connects all data points by default. You need to tell ggplot to use the mean of each group instead. The appropriate arguments are stat = "summary" and fun.y = "mean".
qplot(Session, DEs2mPre, data = Dummy.Data, colour = Drug, facets = Group~.,
stat = "summary", fun.y = "mean", geom = "line")
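In ggplot2 3.3.0 and later, fun.y is deprecated in favor of fun, so with a current version the same idea might be written with ggplot and stat_summary. The data frame below is a made-up stand-in for Dummy.Data (its size, the Subject column and the factor labels are assumptions), so treat this as a sketch rather than the original poster's setup:

```r
library(ggplot2)

# Hypothetical stand-in for Dummy.Data:
# 3 groups x 2 drugs x 4 sessions, 5 subjects per cell, random responses
set.seed(42)
Dummy.Data <- expand.grid(
  Subject = 1:5,
  Session = 1:4,
  Drug    = c("Drug1", "Drug2"),
  Group   = c("G1", "G2", "G3")
)
Dummy.Data$DEs2mPre <- rnorm(nrow(Dummy.Data), mean = 10, sd = 2)

# stat_summary() reduces each Session/Drug cell to its mean before the line is drawn;
# group = Drug keeps one line per drug even if Session is stored as a factor
p <- ggplot(Dummy.Data, aes(x = Session, y = DEs2mPre, colour = Drug, group = Drug)) +
  stat_summary(fun = mean, geom = "line") +
  facet_grid(Group ~ .)
```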
