ggplot2 - geom_histogram / scale_fill_manual - r

I am working with the following data frame (df):
df <- data.frame(
GP = c(0,0,0,1,1,2,3,3,3,3,4,4,9,15,18,18,19,19,20,20,21,22,22,23),
colour = c("g","g","g","g","g","g","g","g","g","g","g","g","t","t","g","g","g","g","g","g","g","g","g","g")
)
I want the histogram below, but with a different fill for colour=="g" and colour=="t".
However, when I run the following code, the bars for colour=="t" go out of scale (up to 1, plot2), whereas they should be at 0.25 (plot1).
ggplot(data=df,aes(x=GP,y=..ndensity..))+geom_histogram(bins=25,aes(fill=colour))+scale_fill_manual(values=c("black","grey"))
Do you have any idea of how this could be achieved?
Thank you very much for your help with this one!

I used a tibble for the dataset, with different variable names, and the result is what you want.
library(tibble)
library(ggplot2)
tb <- tibble(
tbx = c(0, 0, 0, 1, 1, 2, 3, 3, 3, 3, 4, 4, 9, 15, 18, 18, 19, 19, 20, 20, 21, 22, 22, 23),
tby = c("g","g","g","g","g","g","g","g","g","g","g","g","t","t","g","g","g","g","g","g","g","g","g","g")
)
ggplot(tb, aes(tbx, tby = ..ndensity..)) +
geom_histogram(bins = 25, aes(fill = tby)) +
scale_fill_manual(values = c("red", "grey"))
This is the output plot:
I hope this addresses your question.
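A possible explanation, offered as a sketch rather than a definitive account: ..ndensity.. is computed separately for each fill group, so each group's tallest bar is rescaled to 1, which is why the "t" bars reach 1 in the original code. In the snippet above, tby = ..ndensity.. does not map anything to the y aesthetic, so the bars fall back to plain counts. Assuming ggplot2 3.3.0 or later (for after_stat()), one way to keep a 0 to 1 axis while scaling both groups by the same maximum is to divide the counts by the overall maximum count:

```r
library(ggplot2)

df <- data.frame(
  GP = c(0, 0, 0, 1, 1, 2, 3, 3, 3, 3, 4, 4, 9, 15, 18, 18, 19, 19, 20, 20, 21, 22, 22, 23),
  colour = c(rep("g", 12), "t", "t", rep("g", 10))
)

# after_stat() is evaluated on the binned data of the whole layer,
# so max(count) is the tallest single bin across both fill groups.
p <- ggplot(df, aes(x = GP, y = after_stat(count / max(count)))) +
  geom_histogram(bins = 25, aes(fill = colour)) +
  scale_fill_manual(values = c(g = "black", t = "grey")) +
  ylab("count / max(count)")
p
```

With this data the tallest bin holds four observations, so the two single-observation "t" bins should come out at 0.25.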

Related

Adding title and subheadings to histogram in r keeps failing

I am brand new to R and I've looked around Stack Overflow for an answer but have not found anything that works. I would like to add a title and color to my histogram graph, but it keeps failing for some reason. I've sorted my data in descending order, and I'd like to give the main heading, the x-axis and the y-axis different titles. Thank you so much in advance!
Here is the code that I am using:
CompaniesOrder=c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14,
15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26)
GpMean=c(9905.5789474, 8794.1052632, 4893.0526316, 3723.1052632,
3045.6069474,1518.0444211,1200.4994211,842.4464737,765.5630588,
647.6224211,543.739875,324.5206316,217.9081579,213.0212857,
168.1743158,149.2178947,136.6547895,90.5400526,66.8915333,
57.7370526,8.3272143,3.3801053,0.2194286,0,0,0)
GpMeanTreatment <- data.frame(CompaniesOrder, GpMean)
library(ggplot2)
ggplot(GpMeanTreatment, aes(x = CompaniesOrder, y = GpMean)) +
geom_bar(stat="identity")
ggplot(GpMeanTreatment, aes(x = CompaniesOrder, y = GpMean, fill = CompaniesOrder)) +
geom_col() +
scale_fill_gradient(low='red', high='yellow') +
labs(x = "Your X axis title", y = "Your Y axis title", title = "Your Main Title")
Hint: geom_bar(stat="identity") is just a long way of using geom_col()
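Since the question also asks about subheadings, here is a minimal sketch on a small made-up data frame (the column names and label text are placeholders, not taken from the question). labs() accepts a subtitle and a caption alongside title, x and y:

```r
library(ggplot2)

# Hypothetical toy data standing in for GpMeanTreatment
toy <- data.frame(rank = 1:5, gp_mean = c(90, 60, 35, 10, 0))

p <- ggplot(toy, aes(x = rank, y = gp_mean)) +
  geom_col(fill = "steelblue") +   # a fixed fill colour goes outside aes()
  labs(
    title = "Your Main Title",
    subtitle = "Your subheading",
    caption = "Optional note under the plot",
    x = "Your X axis title",
    y = "Your Y axis title"
  )
p
```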

Stacked bar plot in R with the positive and negative values

I would like to plot a stacked bar plot in R and my data looks as such:
This table holds the values against dates and, as can be seen, some dates are repeated with values on different sides (positive and negative). I would like to draw a bar plot from this data.
combined = rbind(x,y)
combined = combined[order(combined$Group.1),]
barplot(combined$x,main=paste("x vs y Breakdown",Sys.time()),names.arg = combined$Group.1,horiz = TRUE,las=2,xlim=c(-30,30),col = 'blue',beside = TRUE)
I want a stacked plot where I can see the values against the dates. How do I change my code?
You can easily create this figure with ggplot2.
Here is a piece of code using a data frame similar to yours:
library(ggplot2)
my_data <- data.frame(
date = factor(c(1, 2, 2, 3, 3, 4, 5, 5, 6, 7, 8, 8)),
x = c(-2, 14, -8, -13, 3, -4, 9, 8, 3, -4, 8, -1)
)
g <- ggplot(my_data, aes(x = date, y = x)) +
geom_bar(
stat = "identity", position = position_stack(),
color = "white", fill = "lightblue"
) +
coord_flip()
g  # print the plot
This is the output:
The official ggplot2 documentation is a good place to start if you want to understand how to improve it further.
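If it helps to tell the positive and negative parts apart, one option (a sketch, not part of the answer above) is to map the fill to the sign of the value. The sign column and the colours are illustrative choices. Current ggplot2 versions stack negative values downward from zero and positive values upward, so each date can have a bar on either side of the axis:

```r
library(ggplot2)

my_data <- data.frame(
  date = factor(c(1, 2, 2, 3, 3, 4, 5, 5, 6, 7, 8, 8)),
  x = c(-2, 14, -8, -13, 3, -4, 9, 8, 3, -4, 8, -1)
)
# Hypothetical helper column: label each value by its sign
my_data$sign <- ifelse(my_data$x >= 0, "positive", "negative")

g <- ggplot(my_data, aes(x = date, y = x, fill = sign)) +
  geom_col(color = "white") +   # geom_col() stacks by default
  scale_fill_manual(values = c(negative = "tomato", positive = "lightblue")) +
  coord_flip()
g
```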

Changing legend labels in ggplotly()

I have a plot of polygons that are colored according to a quantitative variable in the dataset being cut off at certain discrete values (0, 5, 10, 15, 20, 25). I currently have a static ggplot() output that "works" the way I intend. Namely, the legend values are the cut off values (0, 5, 10, 15, 20, 25). The static plot is below -
However, when I simply convert this static plot to an interactive plot, the legend values become hexadecimal values (#54278F, #756BB1, etc.) instead of the cut off values (0, 5, 10, 15, 20, 25). A screenshot of this interactive plot is shown below -
I am trying to determine a way to change the legend labels in the interactive plot to be the cut off values (0, 5, 10, 15, 20, 25). Any suggestions or support would be greatly appreciated!
Below is the code I used to create the static and interactive plot:
library(plotly)
library(ggplot2)
library(RColorBrewer)
set.seed(1)
x = abs(rnorm(30))
y = abs(rnorm(30))
value = runif(30, 1, 30)
myData <- data.frame(x=x, y=y, value=value)
cutList = c(5, 10, 15, 20, 25)
purples <- brewer.pal(length(cutList)+1, "Purples")
myData$valueColor <- cut(myData$value, breaks=c(0, cutList, 30), labels=rev(purples))
# Static plot
sp <- ggplot(myData, aes(x=x, y=y, fill=valueColor)) + geom_polygon(stat="identity") + scale_fill_manual(labels = as.character(c(0, cutList)), values = levels(myData$valueColor), name = "Value")
# Interactive plot
ip <- ggplotly(sp)
Label using the cut points and use scale_fill_manual for the colors.
cutList = c(5, 10, 15, 20, 25)
purples <- brewer.pal(length(cutList)+1, "Purples")
myData$valueLab <- cut(myData$value, breaks=c(0, cutList, 30), labels=as.character(c(0, cutList)))
# Static plot
sp <- ggplot(myData, aes(x=x, y=y, fill=valueLab)) + geom_polygon(stat="identity") + scale_fill_manual(values = rev(purples))
# Interactive plot
ip <- ggplotly(sp)
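Why this helps can be checked without plotting. In the question the factor levels were the hex codes, and those are what the interactive legend showed, so ggplotly() appears to take its legend text from the levels rather than from the labels argument. A small base-R sketch (using only the value column, regenerated here, so the numbers will differ from the question's) shows that the levels are now the cut points:

```r
set.seed(1)
value <- runif(30, 1, 30)
cutList <- c(5, 10, 15, 20, 25)

# Label each interval by its lower cut point
valueLab <- cut(value, breaks = c(0, cutList, 30),
                labels = as.character(c(0, cutList)))

levels(valueLab)   # the legend entries
table(valueLab)    # how many values fall in each interval
```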

How to create a rose plot with lines

I am trying to create a rose plot of chromosome data as follows:
ZoutliersM <- structure(list(chr = c(11, 11, 11, 12, 12, 12, 13, 13, 13, 14,
16, 16, 18, 2, 2, 20, 20, 3, 4, 4), leftPos = c(17640000, 2880000,
29040000, 19680000, 6120000, 6480000, 14880000, 16200000, 17760000,
13560000, 21240000, 7080000, 10440000, 16800000, 49080000, 12240000,
8280000, 13320000, 12000000, 13560000), Means.x = c(218.523821652113,
256.117545073851, 343.541494875886, 348.237645885147, 426.983644179467,
228.568161732039, 283.269605668063, 440.686146437743, 218.674798674891,
264.556139561699, 232.068688576066, 226.877793789348, 224.282711224934,
215.935347248385, 253.472008896076, 230.683794652539, 305.788038763088,
285.805349707436, 644.897029710454, 485.630136931446), Def.x = c(1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), Means.y = c(236.188586172547,
345.958953367193, 250.194040077771, 304.25175004754, 336.629006416052,
221.495167672412, 231.719055660279, 231.252826188427, 334.254524914754,
271.392526335334, 236.848569235568, 261.62635228236, 246.090793604293,
370.773978424351, 242.493276055677, 245.097715487835, 280.225103337613,
370.736474095631, 1014.42590543955, 236.718929160423), Def.y = c(1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)), .Names = c("chr",
"leftPos", "Means.x", "Def.x", "Means.y", "Def.y"), row.names = c(NA,
20L), class = "data.frame")
Initially I was pretty pleased with myself because I could get these plots
using this code:
ggplot(ZoutliersM)+
geom_point(aes(x = ZoutliersM$leftPos/1000000,y = as.numeric(ZoutliersM$Def.x)),stat="identity",fill="magenta",size=2,colour="red")+
geom_bar(aes(x = ZoutliersM$leftPos/1000000,y = as.numeric(ZoutliersM$Def.x)),stat="identity",fill="purple",size=1,colour="red")+
ylim(0, 1)+
ggtitle("Shared")+
#geom_hline(aes(yintercept=0))+
coord_polar(theta = "x", start = 0)+
facet_wrap(~ chr)
However, I have a problem with geom_bar: I constantly get the message
position_stack requires constant width: output may be incorrect
and I think the output is indeed incorrect, as it doesn't plot all of the points.
I spent ages searching for an answer but didn't find much. I think the message arises because geom_bar treats the bars as having different widths. I've tried changing to stat='bin', but I don't want a frequency plot; I just want a line from each point to the x-axis.
So the question is: how can I do this and avoid geom_bar altogether? Is there, for example, a way of drawing a vline from each point down to y=0?
Edit
I then tried this:
ggplot(ZoutliersM)+
geom_point(aes(x = ZoutliersM$leftPos/1000000,y = as.numeric(ZoutliersM$Def.x)),stat="identity",fill="magenta",size=2,colour="red")+
geom_vline(xintercept = ZoutliersM$leftPos/1000000, linetype= 1, colour = "#919191")+
ylim(0, 1)+
ggtitle("Shared")+
#geom_hline(aes(yintercept=0))+
coord_polar(theta = "x", start = 0)+
facet_wrap(~ chr)
and I got this:
But now all the vlines are plotted on one graph and then replicated for each chromosome, so it is still not working.
Try geom_segment(), which allows you to use two coordinates to specify a line segment: (x,y) and (xend,yend). The (x,y) coordinates are the same as your point, while the (xend,yend) coordinates represent the other end of the line segment. In this case, since we want the line to extend from the point to the x-axis, xend should be the same as x and yend should be 0. I've consolidated all of your aes() variables into one, but everything else not related to geom_segment() I've kept the same:
ggplot(ZoutliersM,aes(x = leftPos/1000000,y = as.numeric(Def.x),
xend=leftPos/1000000,yend=0))+
geom_point(stat="identity",fill="magenta",size=2,colour="red")+
geom_segment(linetype= 1, colour = "#919191")+
ylim(0, 1)+
ggtitle("Shared")+
coord_polar(theta = "x", start = 0)+
facet_wrap(~ chr)
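As for why the geom_vline() attempt drew every line in every facet: a vector passed as xintercept outside aes() carries no chr column, so ggplot2 cannot assign the lines to panels and repeats them in all of them. A minimal sketch on made-up data (the toy data frame and its column names are hypothetical) compares that with mapping xintercept inside aes(), which keeps each line in its own facet:

```r
library(ggplot2)

toy <- data.frame(chr = c(1, 1, 2), pos = c(1, 2, 3), val = 1)

# xintercept as a plain vector: lines are repeated in every panel
p_all <- ggplot(toy, aes(x = pos, y = val)) +
  geom_point() +
  geom_vline(xintercept = toy$pos) +
  facet_wrap(~ chr)

# xintercept mapped from the data: lines follow the faceting variable
p_own <- ggplot(toy, aes(x = pos, y = val)) +
  geom_point() +
  geom_vline(aes(xintercept = pos)) +
  facet_wrap(~ chr)

nrow(ggplot_build(p_all)$data[[2]])  # lines drawn when passed as a vector
nrow(ggplot_build(p_own)$data[[2]])  # lines drawn when mapped in aes()
```

A vertical line still spans the full height of the panel, so geom_segment() remains the better fit when the line should stop at the point.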

Split barplot by grouping by days

I have the following bar chart produced using this code:
MD1<-read.csv("MD_qual_OTU_sorted.csv")
MD1<-data.frame(Samples=c("A","B","C","D","E","F","G","H","I","J","K","L","M", "N","O","P","Q", "R"), Number.of.OTUs=c(13,10,9,9,15,11,7,7,9,9,5,10,10,7,15,17,8,9))
par(las=1)
barplot(MD1[,2],names.arg=MD1[,1], ylab='OTU Count', yaxt='n', xlab='MD samples', main='Total OTU count/Sample',density=c(90,90, 90, 90, 90, 90, 10, 10, 10, 10, 10, 10, 40, 40, 40, 40, 40, 40), col=c("yellow","yellow","pink", "pink","green","green","red","red", "purple", "purple", "blue", "blue", "orange", "orange","cyan", "cyan","chartreuse4", "chartreuse4" ))
usr <- par("usr")
par(usr=c(usr[1:2], 0, 20))
axis(2, at=seq(0,20,5))
I want to split the samples into separate groups: A-F (Day 3), G-L (Day 5) and M-R (Day 15).
There are similar questions posted; however, I am not sure how to tidy up the way I have entered my data so that I can use those solutions.
You could consider using ggplot2; separate plots are very easy with facet_wrap and facet_grid.
library(ggplot2)
#create a grouping variable
MD1$Day <- rep(c("Day 03","Day 05","Day 15"),
each=6)
p1 <- ggplot(MD1, aes(x=Samples,y=Number.of.OTUs)) +
geom_bar(stat="identity") + facet_wrap(~Day,
scales="free_x")
p1
Or, if you want to use base-R and approach your original image:
#add colors/densities
MD1$col <- c("yellow","yellow","pink", "pink","green","green","red","red",
"purple", "purple", "blue", "blue", "orange", "orange","cyan", "cyan","chartreuse4", "chartreuse4" )
MD1$density <- c(90,90, 90, 90, 90, 90, 10, 10, 10, 10, 10, 10, 40, 40, 40, 40, 40, 40)
#set 1 row three cols for plotting
par(mfrow=c(1,3))
#split and plot
lapply(split(MD1, MD1$Day),function(x){
barplot(x[,2],
names.arg=x[,1],
ylab='OTU Count',
ylim=c(0,20),
main=unique(x$Day),
col=x$col,
density=x$density)
})
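Another base-R variant, sketched here as an alternative rather than as part of the answer above, keeps everything in a single plot and uses the space argument of barplot() to leave a wider gap before the first bar of each day. The gap sizes and the label placement are arbitrary choices:

```r
MD1 <- data.frame(
  Samples = LETTERS[1:18],
  Number.of.OTUs = c(13, 10, 9, 9, 15, 11, 7, 7, 9, 9, 5, 10, 10, 7, 15, 17, 8, 9)
)

# space is given per bar, as a fraction of bar width, and applies before each bar:
# a gap of 1 before the first bar of each day, 0.2 within a day
gaps <- rep(c(1, rep(0.2, 5)), times = 3)

mids <- barplot(MD1$Number.of.OTUs, names.arg = MD1$Samples, space = gaps,
                ylim = c(0, 20), ylab = "OTU Count", main = "Total OTU count/Sample")

# Label each group under its six bars, using the bar midpoints barplot() returns
day_centres <- tapply(mids, rep(c("Day 3", "Day 5", "Day 15"), each = 6), mean)
mtext(names(day_centres), side = 1, line = 3, at = day_centres)
```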

Resources