I've read through the ggplot2 docs website and other questions, but I couldn't find a solution. I'm trying to visualize some data for varying age groups. I have sort of managed to do it, but the plot does not look the way I intend.
Here is the code for my plot:
p <- ggplot(suggestion, aes(interaction(Age,variable), value, color = Age, fill = factor(variable), group = Age))
p + geom_bar(stat = "identity")+
facet_grid(.~Age)
[Figure: the facetting separates the age variables]
My ultimate goal is to create a stacked bar graph, which is why I used fill, but the plot does not put the TDX values in their corresponding Age group and Year. (Sometimes the TDX values equal the DX values, but I want to visualize the cases where they don't.)
Here's the dput(suggestion)
structure(list(Age = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L,
1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L,
3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L,
7L), .Label = c("0-2", "3-9", "10-19", "20-39", "40-59", "60-64",
"65+", "UNSP", "(all)"), class = "factor"), variable = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L,
8L, 8L, 8L, 8L, 8L, 8L, 8L), .Label = c("Year.10.DX", "Year.11.DX",
"Year.12.DX", "Year.13.DX", "Year.10.TDX", "Year.11.TDX", "Year.12.TDX",
"Year.13.TDX"), class = "factor"), value = c(26.8648932910636,
30.487741796656, 31.9938838749782, 62.8189679326958, 72.8480838120064,
69.3044125928752, 36.9789457527416, 21.808001825378, 24.1073451428435,
40.3305134762935, 70.4486116545885, 68.8342676191755, 63.9227718107745,
34.6086468618636, 8.84033719571875, 13.2807072303835, 28.4781516422802,
55.139497471546, 59.7230544500003, 67.9448927372699, 37.7293286937066,
6.9507024051526, 17.4393054963572, 33.1485743479821, 61.198647580693,
58.6845873573852, 48.0073013177248, 28.4455801248562, 26.8648932910636,
19.8044453272475, 23.0189084635948, 53.7037832071889, 60.6516550126422,
58.1573725886767, 27.0791868812255, 21.808001825378, 19.8146296425633,
35.0587750051557, 62.3308555053346, 59.3299998610862, 56.5341245769817,
27.7229319271878, 8.84033719571875, 13.2807072303835, 22.4081606349585,
48.0252683906252, 52.7560684009579, 65.2890977685045, 32.4142337849399,
6.9507024051526, 15.2833655677215, 24.5268503180754, 52.536784326675,
51.4100599515986, 40.9609231655724, 18.1306673637441)), row.names = c(NA,
-56L), .Names = c("Age", "variable", "value"), class = "data.frame")
It's unclear what you need but perhaps this.
# a is the data frame from the question (suggestion)
ggplot(a, aes(x = variable, y = value, fill = Age)) + geom_bar(stat = 'identity') +
facet_wrap(~Age)
If you want to visualize separately the TDX and the DX entries, we'll need to change the dataframe a bit.
> head(a)
    Age   variable    value
1   0-2 Year.10.DX 26.86489
2   3-9 Year.10.DX 30.48774
3 10-19 Year.10.DX 31.99388
4 20-39 Year.10.DX 62.81897
5 40-59 Year.10.DX 72.84808
6 60-64 Year.10.DX 69.30441
The column of interest, variable, combines the year and the TDX/DX label. We'll use the tidyr package to separate it into its own columns.
library(tidyr)
library(dplyr)
tidy_a<- a %>% separate(variable, into = c( 'nothing',"year",'label'), sep = "\\.")
This actually splits the levels of column variable into three components, since we split on . and the character . appears twice in each entry.
> head(tidy_a)
    Age nothing year label    value
1   0-2    Year   10    DX 26.86489
2   3-9    Year   10    DX 30.48774
3 10-19    Year   10    DX 31.99388
4 20-39    Year   10    DX 62.81897
5 40-59    Year   10    DX 72.84808
6 60-64    Year   10    DX 69.30441
So the column nothing is useless; it is just a by-product of using separate and splitting on the . character. The new year and label columns allow us to visualize TDX and DX separately.
ggplot(tidy_a,aes(x=year,y=value,fill=label)) + geom_bar(stat='identity') + facet_wrap(~Age)
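To see what splitting on . produces without tidyr, here is a minimal base-R sketch. The two input strings are copied from the levels of variable; the names parts, year, and label are only illustrative and not part of the answer above.

```r
# Two example entries taken from the levels of `variable`
variable <- c("Year.10.DX", "Year.13.TDX")

# fixed = TRUE treats "." as a literal dot rather than a regex wildcard
parts <- do.call(rbind, strsplit(variable, ".", fixed = TRUE))

# Each entry yields three pieces; the first ("Year") is the constant prefix
year  <- parts[, 2]
label <- parts[, 3]

print(data.frame(variable, year, label))
```

The first piece is the same for every row, which is why the nothing column carries no information.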
I want a barplot based on the number of occurrences of each string in a particular column of a dataset in R.
At the same time, I want to run a t-test and mark the significant p-values with stars on top of the bars. The nonsignificant ones can be labelled ns.
My attempt has been:
barplot(prop.table(table(ttcluster_dataset$Phenotype)),col=clustercolor,border="black",xlab="Phenotypes",ylab="Percentage of Samples expressed",main="Sample wise Phenotype distribution",cex.names = 0.8)
The dataset column is:
ttcluster_dataset$Phenotype<-
structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L,
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L,
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L,
7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L), .Label = c("Proneural (Cluster 1)", "Proneural (Cluster 2)", "Neural (Cluster 1)", "Neural (Cluster 2)",
"Classical (Cluster 1)", "Classical (Cluster 2)", "Mesenchymal (Cluster 1)",
"Mesenchymal (Cluster 2)"), class = "factor")
All suggestions are appreciated.
A t-test is probably not what you want since you are looking at counts and proportions between the two clusters. Your data is not really set up to do either one so first we need to split the two variables:
Pheno.splt <- strsplit(as.character(ttcluster_dataset$Phenotype), " ")
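# Bind the split pieces row-wise; keep the phenotype name (1) and the cluster number (3)
Pheno.mat <- do.call(rbind, Pheno.splt)[, c(1, 3)]
# Strip the closing parenthesis from the cluster number
ttclust <- data.frame(Phenotype=Pheno.mat[, 1], Cluster=gsub(")", "", Pheno.mat[, 2], fixed=TRUE))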
str(ttclust)
# 'data.frame': 171 obs. of 2 variables:
# $ Phenotype: chr "Proneural" "Proneural" "Proneural" "Proneural" ...
# $ Cluster : chr "1" "1" "1" "1" ...
Now Phenotype and Cluster are separate columns in the data frame ttclust. There are multiple ways to do this; here we simply split each Phenotype string on the spaces, which gives three parts, and keep the first and third. Next, a summary table and bar plot:
tbl <- xtabs(~Phenotype+Cluster, ttclust)
tbl
# Cluster
# Phenotype 1 2
# Classical 32 6
# Mesenchymal 44 10
# Neural 26 0
# Proneural 45 8
tbl.row <- prop.table(tbl, 1)
barplot(t(tbl.row), beside=TRUE)
At this point, a simple proportions test indicates no significant difference in the percentage of Cluster 1 across the four Phenotypes (p = 0.1517):
prop.test(tbl)
4-sample test for equality of proportions without continuity correction
data: tbl
X-squared = 5.2908, df = 3, p-value = 0.1517
alternative hypothesis: two.sided
sample estimates:
prop 1 prop 2 prop 3 prop 4
0.8421053 0.8148148 1.0000000 0.8490566
Using prop.test on each Phenotype separately indicates that the proportion in Cluster 1 differs significantly from 0.5 (that is, Cluster 1 is significantly different from Cluster 2) in every case:
for(i in 1:4) print(prop.test(t(tbl[i, ])))
# First test
#
# 1-sample proportions test with continuity correction
#
# data: t(tbl[i, ]), null probability 0.5
# X-squared = 16.447, df = 1, p-value = 5.002e-05
# alternative hypothesis: true p is not equal to 0.5
# 95 percent confidence interval:
# 0.6807208 0.9341311
# sample estimates:
# p
# 0.8421053
. . . .
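The question also asked for significance stars on top of the bars. Below is a minimal sketch, assuming the common convention of *** for p <= 0.001, ** for p <= 0.01, * for p <= 0.05, and ns otherwise. The thresholds and the helper name p_to_stars are assumptions, not something taken from the question, and the counts are typed in from the table above.

```r
# Counts copied from the xtabs() output above
tbl <- matrix(c(32, 6, 44, 10, 26, 0, 45, 8), ncol = 2, byrow = TRUE,
              dimnames = list(Phenotype = c("Classical", "Mesenchymal",
                                            "Neural", "Proneural"),
                              Cluster = c("1", "2")))

# Hypothetical helper: map p-values to star labels (thresholds are an assumption)
p_to_stars <- function(p) {
  as.character(cut(p, breaks = c(-Inf, 0.001, 0.01, 0.05, Inf),
                   labels = c("***", "**", "*", "ns")))
}

# One-sample proportion test per phenotype (Cluster 1 count vs. row total)
p_values <- apply(tbl, 1, function(r) prop.test(r[1], sum(r))$p.value)
stars <- p_to_stars(p_values)

# barplot() returns the bar midpoints; average each pair to centre the label
tbl.row <- prop.table(tbl, 1)
bp <- barplot(t(tbl.row), beside = TRUE, ylim = c(0, 1.15))
text(x = colMeans(bp), y = 1.05, labels = stars)
```

Placing the labels at a fixed height of 1.05 works here only because the plotted values are proportions; for other data the y position would need to follow the bar heights.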
I need to make a new factor column based on the value of the column Quadrat. There are 9 quadrats, and the new column, called Sponge, would be something like:
"Old Growth" if Quadrat = 1,4,9
"Absent" if Quadrat= 3,6,7
"New Growth" if Quadrat = 2,5,8
I am sorry if the answer is easy. I did check "How to convert integer to factor in R?",
and I am also trying to use recode_factor. Here is my code:
library(dplyr)
key <- list(`1,4,9` = "Old Growth", `3,6,7` = "Absent", `2,5,8` = "New Growth")
df <- mutate(df, Sponge = recode_factor(Quadrat, key))
I get error:
Error in mutate_impl(.data, dots) :
Evaluation error: Vector 1 must be length 108 or one, not 3.
Real data has much more entries than the dataset I include here, if that matters. Thank you for any help.
df <- structure(list(Quadrat = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L,
4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 8L, 9L,
9L, 9L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L,
5L, 6L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 8L, 9L, 9L, 9L, 1L, 1L, 1L,
2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L,
7L, 7L, 8L, 8L, 8L, 9L, 9L, 9L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L,
3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 8L,
9L, 9L, 9L), Month = structure(c(4L, 4L, 4L, 3L, 3L, 3L, 7L,
7L, 7L, 1L, 1L, 1L, 8L, 8L, 8L, 6L, 6L, 6L, 5L, 5L, 5L, 2L, 2L,
2L, 9L, 9L, 9L, 4L, 4L, 4L, 3L, 3L, 3L, 7L, 7L, 7L, 1L, 1L, 1L,
8L, 8L, 8L, 6L, 6L, 6L, 5L, 5L, 5L, 2L, 2L, 2L, 9L, 9L, 9L, 4L,
4L, 4L, 3L, 3L, 3L, 7L, 7L, 7L, 1L, 1L, 1L, 8L, 8L, 8L, 6L, 6L,
6L, 5L, 5L, 5L, 2L, 2L, 2L, 9L, 9L, 9L, 4L, 4L, 4L, 3L, 3L, 3L,
7L, 7L, 7L, 1L, 1L, 1L, 8L, 8L, 8L, 6L, 6L, 6L, 5L, 5L, 5L, 2L,
2L, 2L, 9L, 9L, 9L), .Label = c("Apr", "Aug", "Feb", "Jan", "Jul",
"Jun", "Mar", "May", "Sep"), class = "factor"), PopDens = c(65.6011820777785,
18.4913752602879, 12.151802276494, 68.0740840677172, 50.9832500135526,
36.8684287818614, 52.0825074084569, 26.8776902493555, 49.2173263626173,
25.5460870559327, 5.4171769618988, 34.4303709487431, 44.3439512783661,
2.25230997451581, 61.2502326716203, 25.9035727053415, 32.339118222706,
24.1017888628412, 12.340617884649, 53.3521768709179, 26.0048255382571,
52.8581868957262, 31.9503199581522, 18.1601244299673, 34.228305231547,
2.09199664392509, 22.6402857622597, 4.48008164577186, 48.2082461479586,
65.4937081446406, 5.43837511213496, 32.8203339113388, 4.44421968702227,
19.8568186087068, 24.2561273102183, 12.3652934685815, 39.0541164302267,
16.1970243314281, 12.9826903613284, 36.3537323835772, 48.7148000504822,
11.5067498446442, 68.7493303583469, 60.7505214684643, 49.3874175737146,
63.0705459746532, 23.721419940237, 53.4379795142449, 57.7867246468086,
38.4747762591578, 8.43540686019696, 20.5636212413665, 28.7687741059344,
53.2144687068649, 32.0859562589321, 10.5120962983929, 53.4312571119517,
13.6547974413261, 31.3038802060764, 14.5005466006696, 6.03453303268179,
62.6867637028918, 17.7734197168611, 11.0327071261127, 51.4377708046231,
26.8335341704078, 9.81126144807786, 43.993699422339, 20.5123583010864,
14.9305799969006, 23.8019575944636, 39.1543961388525, 30.4534046472982,
61.2751477411948, 48.0770866076928, 59.4514226955362, 42.9857548968866,
23.0139948409051, 1.76873184926808, 33.1222371393815, 10.8652087603696,
24.5235243474599, 62.4086231633555, 55.6522683221847, 68.8337469024118,
48.2195318546146, 6.75986870843917, 57.7931131315418, 18.2255988919642,
40.8185531077906, 38.066848333925, 31.8611310839187, 22.2724406518973,
51.7982920755167, 29.2363496678881, 35.541056742426, 66.5265460675582,
28.267403066624, 40.5209824540652, 31.8187582066748, 67.2972998009063,
53.6718824433628, 42.6495425191242, 31.6603209995665, 44.3039192620199,
21.6216275517363, 66.9763269643299, 36.3314134527463)), .Names = c("Quadrat",
"Month", "PopDens"), row.names = c(NA, -108L), class = "data.frame")
If we are using recode_factor, then create the list with one element per Quadrat value instead of pasting several values into one name:
key <- setNames(as.list(rep(c("Old Growth", "Absent", "New Growth"),
each = 3)), c(1, 4, 9, 3, 6, 7, 2, 5, 8))
df %>%
mutate(Sponge = recode_factor(Quadrat, !!! key)) %>%
head
# Quadrat Month PopDens Sponge
#1 1 Jan 65.60118 Old Growth
#2 1 Jan 18.49138 Old Growth
#3 1 Jan 12.15180 Old Growth
#4 2 Feb 68.07408 New Growth
#5 2 Feb 50.98325 New Growth
#6 2 Feb 36.86843 New Growth
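Use mutate with the factor function, giving one label per level (in the order of levels 1 to 9) so that each quadrat gets the group listed in the question:
df %>% mutate(Quadrat2 =
factor(Quadrat, levels = 1:9,
labels = c("Old Growth", "New Growth", "Absent", "Old Growth", "New Growth", "Absent", "Absent", "New Growth", "Old Growth")
)
)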
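For comparison, the same mapping can be sketched in base R with a lookup vector indexed by the quadrat number. This is only an illustration under the grouping stated in the question (Old Growth = 1, 4, 9; Absent = 3, 6, 7; New Growth = 2, 5, 8); the names lookup and Sponge, and the short Quadrat vector, are stand-ins rather than the real data.

```r
# Stand-in for df$Quadrat: one value per quadrat
Quadrat <- c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L)

# Position i holds the label for quadrat i
lookup <- c("Old Growth", "New Growth", "Absent",
            "Old Growth", "New Growth", "Absent",
            "Absent",     "New Growth", "Old Growth")

# Index the lookup by quadrat number, then fix the level order explicitly
Sponge <- factor(lookup[Quadrat],
                 levels = c("Old Growth", "Absent", "New Growth"))

print(data.frame(Quadrat, Sponge))
```

Because the lookup is indexed by position, it relies on the quadrats being numbered 1 to 9 with no gaps.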
We would like to present the change in muscle mass due to exercise in different age groups, together with the final performance (outcome) at the competition at the end of the study.
Muscle mass was measured at several time points. In this example I only show three time points; the full study comprises 12.
To present the change in muscle mass and the deviation from the average I was able to use geom_flow(). However, it becomes very tricky to add the age groups on the left of the chart and the performance on the right side, because these data are stored in different variables.
Please help us find a good way to present the data. Thanks.
Data Structure:
ID  Age_at_start  Month  Deviation_muscle  Performance
1   36            3      59                Outstanding
1   36            6      104               Outstanding
1   36            9      200               Outstanding
2   29            3      -40               average
2   29            6      -109              average
2   29            9      -30               average
3   22            3      310               above average
library(ggplot2)
library(ggalluvial)
df.san$age<-factor(df.san$age)
df.san$age<-factor(df.san$age, levels=c(1,2,3,4), labels=c("20 to 24 years","25 to 29 years","30 to 34 years","35 to 39 years"))
df.san$dev_group <-factor(df.san$dev_group,levels=c(1,2,3,4,5,6,7),labels=c("≥250g","≥150 to <250g","≥50 to <150g","> -50 to <50g","> -150 to ≤ -50","> -250 to ≤ -150", "≤ -250g"))
df.san$month <- factor(df.san$month,labels=c("1mo","2mo","3mo"))
df.san$perform<-factor(df.san$perform,levels=c(1,2,3,4),labels=c("outstanding "," above average "," average "," below average"))
ggplot(df.san,aes(x = month,stratum = dev_group, alluvium = ID, fill = dev_group,label = dev_group)) +
scale_fill_brewer(type = "qual", palette = "Set2") +
geom_flow(stat = "alluvium", lode.guidance = "rightleft", color = "darkgray") +
geom_stratum() +
theme(legend.position = "bottom") +
ggtitle("Effect of Exercise on Muscle Growth on Performance in 4 Different Age Groups")
Data for df.san:
structure(list(ID = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 8L, 9L, 9L, 9L, 10L, 10L, 10L, 11L, 11L, 11L, 12L, 12L, 12L, 13L, 13L, 13L, 14L, 14L, 14L, 15L, 15L, 15L), age = c(2L, 3L, 3L, 1L, 3L, 1L, 2L, 3L, 4L, 1L, 1L, 3L, 1L, 4L, 4L, 3L, 4L, 3L, 4L, 2L, 2L, 1L, 2L, 4L, 1L, 1L, 4L, 1L, 3L, 1L, 2L, 3L, 4L, 4L, 2L, 2L, 2L, 2L, 4L, 2L, 2L, 4L, 3L, 3L, 2L), month = c(2L, 4L, 6L, 2L, 4L, 6L, 2L, 4L, 6L, 2L, 4L, 6L, 2L, 4L, 6L, 2L, 4L, 6L, 2L, 4L, 6L, 2L, 4L, 6L, 2L, 4L, 6L, 2L, 4L, 6L, 2L, 4L, 6L, 2L, 4L, 6L, 2L, 4L, 6L, 2L, 4L, 6L, 2L, 4L, 6L), dev_muscle = c(-109.3, -236.2, -275.4, -44.5, -202.6, -436, 3, -115.8, -136.2, -142.1, -429, -561.4, -49, -248.8, -232.6, -15.9, -171.5, -391.6, -5.8, -21.7, -104.1, 12.6, -33.4, -25.4, -57.3, -50.7, -103.6, -124, -221.4, -457.2, 22.1, -126.9, -79.5, -76.8, -113.2, -129.7, -86.1, -126, -82.9, -10.8, -2.8, 88.3, 41.6, 0.2, 184.7), perform = c(1L, 2L, 1L, 2L, 4L, 1L, 1L, 4L, 3L, 4L, 2L, 4L, 4L, 4L, 2L, 2L, 4L, 3L, 3L, 4L, 1L, 2L, 1L, 1L, 2L, 3L, 2L, 2L, 2L, 1L, 2L, 3L, 2L, 1L, 2L, 4L, 3L, 2L, 1L, 3L, 2L, 1L, 1L, 4L, 4L), dev_group = c(5L, 6L, 7L, 4L, 6L, 7L, 4L, 5L, 5L, 5L, 7L, 7L, 4L, 6L, 6L, 4L, 6L, 7L, 4L, 4L, 5L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 6L, 7L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 4L, 4L, 3L, 4L, 4L, 2L)), class = "data.frame", row.names = c(NA, -45L))
I have a table whose columns hold the starting node, the ending node, and the cost of moving in that direction. The graph is one-directional: you can't move backwards. These are all the combinations. It seems like I'm making an obvious mistake.
mygraph = structure(list(V1 = c(1L, 1L, 2L, 2L, 2L, 3L, 3L, 4L, 4L, 4L,
5L, 5L, 6L, 7L, 8L, 9L), V2 = c(3L, 4L, 3L, 4L, 5L, 7L, 6L, 6L,
8L, 9L, 7L, 8L, 10L, 10L, 10L, 10L), V3 = c(3L, 2L, 4L, 2L, 4L,
2L, 3L, 4L, 2L, 5L, 2L, 2L, 3L, 4L, 2L, 3L)), .Names = c("V1",
"V2", "V3"), class = "data.frame", row.names = c(NA, -16L))
names(mygraph)=c('start','end','cost')
library(igraph)
mygraph = graph.data.frame(mygraph, directed=T) # I think this is right?
plot(mygraph) #looks completely wrong???
help=get.shortest.paths(mygraph,1,10) #I'm doing something wrong want to see route and total cost of going from node 1-10
help
If you rename the last column from 'cost' to 'weight' in the following line, you will get the right solution.
names(mygraph) <- c('start', 'end', 'weight')
This works because get.shortest.paths() uses the edge attribute named weight (not cost) as the edge costs by default.
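To sanity-check what the weighted route from node 1 to node 10 should be, independent of igraph, here is a small base-R sketch using Bellman-Ford style edge relaxation. It is a swapped-in technique for checking the result, not how igraph computes it; the names edges, dist, prev, and route are illustrative, and the edge list is copied from the question.

```r
# Edge list copied from the question (start node, end node, cost)
edges <- data.frame(
  start = c(1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 6, 7, 8, 9),
  end   = c(3, 4, 3, 4, 5, 7, 6, 6, 8, 9, 7, 8, 10, 10, 10, 10),
  cost  = c(3, 2, 4, 2, 4, 2, 3, 4, 2, 5, 2, 2, 3, 4, 2, 3)
)

n <- 10
dist <- rep(Inf, n)       # best known cost from node 1 to each node
dist[1] <- 0
prev <- rep(NA_real_, n)  # predecessor on the best known route

# Relax every edge n - 1 times; enough for any graph without negative cycles
for (pass in seq_len(n - 1)) {
  for (i in seq_len(nrow(edges))) {
    u <- edges$start[i]
    v <- edges$end[i]
    if (dist[u] + edges$cost[i] < dist[v]) {
      dist[v] <- dist[u] + edges$cost[i]
      prev[v] <- u
    }
  }
}

# Walk the predecessors back from node 10 to recover the route
route <- 10
while (!is.na(prev[route[1]])) route <- c(prev[route[1]], route)

print(route)     # route from node 1 to node 10
print(dist[10])  # total cost of that route
```

For this edge list the sketch gives the route 1, 4, 8, 10 with a total cost of 6, which is what the weighted igraph call should report as well. As far as I know, the shortest-path functions in igraph also accept a weights argument, so passing the cost column explicitly should be an alternative to renaming it.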
The figure in question was generated with the following code:
library(ggplot2)
library(reshape2)
dat <- structure(list(Type = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L,
6L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L), .Label = c("High_expression",
"KD.ip", "LG.ip", "LN.id", "LV.id", "LV.ip", "SP.id", "SP.ip"
), class = "factor"), ImmGen = structure(c(1L, 2L, 3L, 4L, 5L,
6L, 7L, 8L, 9L, 10L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L,
1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 1L, 2L, 3L, 4L, 5L,
6L, 7L, 8L, 9L, 10L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L), .Label = c("Bcells",
"DendriticCells", "Macrophages", "Monocytes", "NKCells", "Neutrophils",
"StemCells", "StromalCells", "abTcells", "gdTCells"), class = "factor"),
Exp_06hr = c(7174.40482999999, 23058.74882, 39819.39133,
15846.46146, 8075.78226, 105239.11609, 7606.34563, 19513.57747,
7116.51211, 6978.64995, 498.36828, 732.01788, 621.51576,
546.63461, 529.1711, 545.17219, 477.54658, 1170.50303, 550.99528,
607.56707, 775.0691, 1269.50773, 2138.69883, 1561.74652,
601.9372, 5515.59896, 744.48716, 997.32859, 639.13126, 657.64581,
4165.29899, 5465.1883, 7773.25723, 5544.86758, 3461.13442,
8780.64899, 4380.00437, 8721.84871, 3674.62723, 3911.00108,
2932.76554, 5903.48407, 6179.81046, 3683.64539, 2744.59622,
6760.37307, 4097.14665, 6845.31988, 2872.77771, 2912.84262
), Exp_24hr = c(1596.9091, 4242.52354, 9984.68861, 3519.18627,
1602.92511, 12203.57109, 1656.19357, 3389.93866, 1617.35484,
1579.00309, 715.47289, 643.98371, 689.40412, 580.26036, 608.22853,
695.10737, 830.77947, 670.34899, 640.67908, 637.47464, 356.75713,
393.13449, 549.60095, 466.76064, 336.95453, 617.20976, 339.2476,
469.57407, 292.86365, 305.45178, 2604.07605, 4210.64843,
5797.13123, 3650.88447, 2275.03269, 6475.27485, 2604.70614,
4796.3314, 2411.09694, 2458.23237, 1498.21516, 1996.6875,
2927.82836, 1911.00463, 1523.57171, 2199.62297, 1541.82034,
2815.82184, 1608.46099, 1588.80561), ExpDiff_06_24hr = c(5577.49572999999,
18816.22528, 29834.70272, 12327.27519, 6472.85715, 93035.545,
5950.15206, 16123.63881, 5499.15727, 5399.64686, -217.10461,
88.03417, -67.88836, -33.62575, -79.05743, -149.93518, -353.23289,
500.15404, -89.6838, -29.9075700000001, 418.31197, 876.37324,
1589.09788, 1094.98588, 264.98267, 4898.3892, 405.23956,
527.75452, 346.26761, 352.19403, 1561.22294, 1254.53987,
1976.126, 1893.98311, 1186.10173, 2305.37414, 1775.29823,
3925.51731, 1263.53029, 1452.76871, 1434.55038, 3906.79657,
3251.9821, 1772.64076, 1221.02451, 4560.7501, 2555.32631,
4029.49804, 1264.31672, 1324.03701)), .Names = c("Type",
"ImmGen", "Exp_06hr", "Exp_24hr", "ExpDiff_06_24hr"), row.names = c(NA,
-50L), class = "data.frame")
dat.m <- melt(dat)
setwd("~/Desktop/")
pdf("myfig.pdf",width=30,height=20)
p <- ggplot(dat.m,aes(ImmGen,value)) +
geom_bar(aes(fill = variable),position = "dodge",stat="identity")+
facet_wrap(~Type)
p
dev.off();
How can I modify it so that, instead of wrapping the panels into a (2x3) matrix like the above, we create a (5x1) matrix, with each row having its own y-axis scale?
Secondly, notice that the blue bar (ExpDiff_06_24hr) can contain negative values. How can I show that, so that in the plot the bar goes below 0 on the y-axis?
I think the subplots shouldn't be plotted in one row but in one column, for clarity. With some help from this answer (thanks to hrbrmstr for linking to it), and because I think this question deserves an answer, here is a solution:
dat$rank <- rank(dat$ExpDiff_06_24hr)
dat.m <- melt(dat, id = c("Type","ImmGen","rank"))
dat.t <- transform(dat.m, TyIm = factor(paste0(Type, ImmGen)))
dat.t <- transform(dat.t, TyIm = reorder(TyIm, rank(rank)))
p <- ggplot(dat.t, aes(TyIm,value)) +
geom_bar(aes(fill = variable), position = "dodge", stat="identity")+
facet_wrap(~Type, ncol=1, scales="free") +
scale_x_discrete("ImmGen", breaks=dat.t$TyIm, labels=dat.t$ImmGen)
p
The result: