I'm trying to create a stacked bar graph showing body composition. I have a table/data set (I don't know the correct term) that looks like this:
structure(list(data.Date = structure(1:7, .Label = c("2021-03-06",
"2021-03-07", "2021-03-08", "2021-03-09", "2021-03-10", "2021-03-11",
"2021-03-12"), class = "factor"), total_bf = c(19.6612, 18.2182,
19.6803, 21.7047, 18.126, 19.7, 19.1424), total_muscle = c(41.5948,
43.043, 42.1578, 42.1866, 43.4017, 42.2, 42.2728), other = c(37.544,
38.8388, 38.0619, 38.0087, 39.1723, 38.1, 38.2848)), class = "data.frame", row.names = c(NA,
-7L))
Each column is a weight in kilograms. Together they add up to the total body weight of the subject. What I want is a stacked bar graph where each bar represents a date and each bar is split by total_bf, total_muscle and other. All of the guides and Q&As I've seen don't seem to apply to my situation. Maybe this is because I am new but nothing I've tried has worked yet.
An example of what I'm trying to achieve:
The only difference is that on my graph blue would be body fat (total_bf), green would be other and red would be muscle (total_muscle).
You can convert the data from wide format to long format with the tidyr::pivot_longer() function:
library(ggplot2)
df <- structure(list(
data.Date = structure(
1:7,
.Label = c("2021-03-06", "2021-03-07", "2021-03-08", "2021-03-09",
"2021-03-10", "2021-03-11", "2021-03-12"), class = "factor"),
total_bf = c(19.6612, 18.2182, 19.6803, 21.7047, 18.126, 19.7, 19.1424),
total_muscle = c(41.5948, 43.043, 42.1578, 42.1866, 43.4017, 42.2, 42.2728),
other = c(37.544, 38.8388, 38.0619, 38.0087, 39.1723, 38.1, 38.2848)
), class = "data.frame", row.names = c(NA, -7L))
long <- tidyr::pivot_longer(df, -data.Date)
Then using ggplot2, the defaults already make a stacked bar chart, so you just need to specify x, y and fill aesthetics.
ggplot(long, aes(data.Date, value, fill = name)) +
geom_col()
Since your date is encoded as a factor, if you want to encode it as a real date you can convert it as follows:
long$date <- as.Date(strptime(as.character(long$data.Date), format = "%Y-%m-%d"))
ggplot(long, aes(date, value, fill = name)) +
geom_col()
Created on 2021-03-12 by the reprex package (v0.3.0)
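To get the specific colours asked for in the question (blue for body fat, green for other, red for muscle), one option is to add scale_fill_manual() with a named vector. This is only a sketch: it rebuilds the first two rows of the question's data so that it runs on its own, and the colour names are plain R colour strings chosen for illustration.

```r
library(ggplot2)
library(tidyr)

# First two rows of the question's data, rebuilt so the sketch is self-contained
df <- data.frame(
  data.Date    = c("2021-03-06", "2021-03-07"),
  total_bf     = c(19.6612, 18.2182),
  total_muscle = c(41.5948, 43.043),
  other        = c(37.544, 38.8388)
)

# pivot_longer() puts the old column names in "name" and the weights in "value"
long <- pivot_longer(df, -data.Date)

# Named values tie each colour to a component regardless of legend order
p <- ggplot(long, aes(data.Date, value, fill = name)) +
  geom_col() +
  scale_fill_manual(values = c(total_bf = "blue",
                               other = "green",
                               total_muscle = "red"))
print(p)
```

Because the vector is named, the mapping stays correct even if the factor levels of name are reordered later.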
I have the following code, that generates the following heatmap in R.
ggplot(data = hminput, color=category, aes(x = Poblaciones, y = Variantes)) +
geom_tile(aes(fill = Frecuencias)) + scale_colour_gradient(name = "Frecuencias",low = "blue", high = "white",guide="colourbar")
hminput is a data frame with three columns: Poblaciones, Variantes and Frecuencias, where the first two are the x and y axis and the third one is the color reference.
My desired output is a heatmap whose legend is a continuous colour bar instead of those blocks, and whose colouring is a white-blue gradient instead of that multicolour gradient.
To achieve that, I tried what's in my code, but I'm not achieving what I want (I'm getting the graph you see in the picture). Any thoughts? Thanks!
As some people asked, here is the dput of the data frame :
> dput(hminput)
structure(list(Variantes = structure(c(1L, 2L, 3L, 4L,...), .Label =
c("rs10498633", "rs10792832", "rs10838725",
"rs10948363", ..., "SNP"), class = "factor"),
Poblaciones = c("AFR", "AFR", ...), Frecuencias = structure(c(12L,
10L,...), .Label = c("0.01135", "0.0121",
"0.01286", "0.01513", "0.02194", "0.05144", "0.05825", "0.059",
"0.07716", "0.0938", "0.1051", "0.1225", "0.1346", "0.1407",
"0.1566", "0.1604", "0.1619", "0.1838", "0.1914", "0.1929",
...,
"0.45", "0.5", "0.4"), class = "factor")), .Names = c("Variantes",
"Poblaciones", "Frecuencias"), row.names = c("frqAFR.33", "frqAFR.31",
"frqAFR.27", "frqAFR.14", "frqAFR.24",...
), class = "data.frame")
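Judging from the dput output, two things are likely to be in the way: Frecuencias is stored as a factor, so ggplot2 treats it as discrete and draws one legend block per level, and scale_colour_gradient() acts on the colour aesthetic while geom_tile() is mapped to fill. A minimal sketch of a possible fix follows. The four rows are hypothetical stand-ins for the elided data, and only the column names are taken from the question.

```r
library(ggplot2)

# Hypothetical stand-in rows; column names and types follow the question
hminput <- data.frame(
  Variantes   = factor(c("rs10498633", "rs10792832", "rs10498633", "rs10792832")),
  Poblaciones = c("AFR", "AFR", "EUR", "EUR"),
  Frecuencias = factor(c("0.1225", "0.0938", "0.45", "0.5"))
)

# Convert the factor labels (not the integer codes) to numbers
hminput$Frecuencias <- as.numeric(as.character(hminput$Frecuencias))

# Use the fill scale, since geom_tile() maps Frecuencias to fill
p <- ggplot(hminput, aes(x = Poblaciones, y = Variantes)) +
  geom_tile(aes(fill = Frecuencias)) +
  scale_fill_gradient(name = "Frecuencias", low = "blue", high = "white",
                      guide = "colourbar")
print(p)
```

Swap low and high if white should mark the lowest frequencies instead.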
My data looks something like this:
There are 10,000 rows, each representing a city, with one column for every month from 1998-01 to 2013-09:
RegionName| State| Metro| CountyName| 1998-01| 1998-02| 1998-03
New York| NY| New York| Queens| 1.3414| 1.344| 1.3514
Los Angeles| CA| Los Angeles| Los Angeles| 12.8841| 12.5466| 12.2737
Philadelphia| PA| Philadelphia| Philadelphia| 1.626| 0.5639| 0.2414
Phoenix| AZ| Phoenix| Maricopa| 2.7046| 2.5525| 2.3472
I want to be able to do a plot for all months since 1998 for any city or more than one city.
I tried this but I get an error. I am not sure if I am even attempting this right. Any help will be appreciated. Thank you.
forecl <- ts(forecl, start=c(1998, 1), end=c(2013, 9), frequency=12)
plot(forecl)
Error in plotts(x = x, y = y, plot.type = plot.type, xy.labels = xy.labels, :
cannot plot more than 10 series as "multiple"
You might try
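library(reshape)
library(ggplot2)
# Start from the original data frame, not the ts object.
# id.vars must name the identifier columns in your data; adjust as needed.
forecl <- melt(forecl, id.vars = c("region","state","city"), variable_name = "month")
# Labels such as "1998-01" have no day, so append one before parsing
forecl$month <- as.Date(paste0(forecl$month, "-01"))
ggplot(forecl, aes(x = month, y = value, color = city)) + geom_line()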
To add to @JLLagrange's answer, you might want to pass city through facet_grid() if there are so many cities that the colors become hard to distinguish.
ggplot(forecl, aes(x = month, y = value, color = city, group = city)) +
geom_line() +
facet_grid( ~ city)
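The reshape-then-plot idea above can be tried end to end on the sample rows from the question. The sketch below swaps in tidyr::pivot_longer() for melt(), which is a substitution and not what the answers used, and it assumes the month columns are literally named "1998-01" and so on, which requires check.names = FALSE when the data frame is built or read.

```r
library(tidyr)
library(ggplot2)

# Two sample rows from the question; check.names = FALSE keeps "1998-01" as-is
forecl <- data.frame(
  RegionName = c("New York", "Los Angeles"),
  State      = c("NY", "CA"),
  `1998-01`  = c(1.3414, 12.8841),
  `1998-02`  = c(1.344, 12.5466),
  `1998-03`  = c(1.3514, 12.2737),
  check.names = FALSE
)

# One row per city and month
long <- pivot_longer(forecl, cols = -c(RegionName, State),
                     names_to = "month", values_to = "value")

# Append a day so that as.Date() can parse the month labels
long$month <- as.Date(paste0(long$month, "-01"))

p <- ggplot(long, aes(x = month, y = value, colour = RegionName)) +
  geom_line()
print(p)
```

To plot a single city or a handful of them, subset long on RegionName before calling ggplot().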
Could you provide an example of your data, e.g. dput(head(forecl)), before converting to a time-series object? The problem might also be with the ts object.
In any case, I think there are two problems.
First, the data are in wide format. I'm not sure about your column names, since they should start with a letter, but in any case, the general idea would be to do something like this:
test <- structure(list(
city = structure(1:2, .Label = c("New York", "Philly"),
class = "factor"), state = structure(1:2, .Label = c("NY",
"PA"), class = "factor"), a2005.1 = c(1, 1), a2005.2 = c(2, 5
)), .Names = c("city", "state", "a2005.1", "a2005.2"), row.names = c(NA,
-2L), class = "data.frame")
test.long <- reshape(test, varying=c(3:4), direction="long")
Second, I think you are trying to plot too many cities at the same time. Try:
plot(forecl[, 1])
or
plot(forecl[, 1:5])
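The 10-series limit applies to the default plot.type = "multiple", which gives each series its own panel. A small sketch of both workarounds is below, using a hypothetical matrix of random numbers in place of the real data. Note that ts() expects time to run down the rows, with one column per series.

```r
set.seed(1)

# Hypothetical data: 36 months (rows) for 12 series (columns)
m <- matrix(rnorm(36 * 12), nrow = 36, ncol = 12)
x <- ts(m, start = c(1998, 1), frequency = 12)

# Option 1: keep the panel layout but plot at most 10 series
plot(x[, 1:5])

# Option 2: draw all series in one panel, which has no 10-series limit
plot(x, plot.type = "single", col = seq_len(ncol(x)))
```

If the cities are in rows, as in the question, the numeric month columns would need to be transposed with t() before being passed to ts().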
I tried to make a boxplot today using ggplot2, and I encountered an error I haven't been able to solve yet. I have used a similar approach (which I actually took from an answer by user @joran) before without incident, but I must be doing something incorrectly this time.
Here is my data:
myboxplot<-structure(list(gap = structure(1:2, .Label = c("Jib", "NoJib"), class = "factor"),
Location = structure(c(4L, 4L), .Label = c("A", "B", "C",
"D"), class = "factor"), min = c(21.809, 21.081), q1 = c(25.582,
25.375), med = c(28.082, 27), q3 = c(30.142, 28.622), max = c(37.166,
39.808), lab = c(2342L, 119681L)), .Names = c("JibStat", "Location",
"min", "q1", "med", "q3", "max", "lab"), row.names = c(2L, 7L
), class = "data.frame")
The code that I have been attempting to use is as follows:
ggplot(myboxplot + aes(x=JibStat, fill=JibStat)) +
geom_boxplot(aes(lower = q1, upper = q3, middle = med, ymin = min, ymax = max), stat = "identity")
and I get the following error message:
Error in Ops.data.frame(myboxplot, aes(x = JibStat, fill = JibStat)) : list of length 2 not meaningful
I have worked on resolving the issue, but I have not been able to find much on resolving the error. My Google skills must be lacking today, but I can't think of what to search for to get help on this problem. What is it I am doing wrong here?
Additional info: R version 3.0.1, 64-bit Windows 8.
Try changing the first line to:
ggplot(myboxplot, aes(x=JibStat)) +
geom_boxplot(aes(lower = q1, upper = q3, middle = med,
ymin = min, ymax = max), stat = "identity")
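I think you typed a + where the comma should be: ggplot(myboxplot + aes(...)) tries to add the aes() object to the data frame itself, which is what triggers the Ops.data.frame error.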
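For reference, here is a self-contained sketch of the corrected call that also keeps the fill = JibStat mapping from the question. The data frame is a trimmed rebuild of the question's dput output, leaving out the Location and lab columns.

```r
library(ggplot2)

# Precomputed five-number summaries, copied from the question
myboxplot <- data.frame(
  JibStat = c("Jib", "NoJib"),
  min = c(21.809, 21.081),
  q1  = c(25.582, 25.375),
  med = c(28.082, 27),
  q3  = c(30.142, 28.622),
  max = c(37.166, 39.808)
)

# Data and aes() are separate arguments, joined by a comma.
# stat = "identity" tells geom_boxplot() to use the supplied summaries as-is.
p <- ggplot(myboxplot, aes(x = JibStat, fill = JibStat)) +
  geom_boxplot(aes(lower = q1, upper = q3, middle = med,
                   ymin = min, ymax = max),
               stat = "identity")
print(p)
```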
This question is a direct successor to a previous question asked here called “ggplot scatter plot of two groups with superimposed means with X and Y error bars”. That question's answer looks to do exactly what I am trying to accomplish; however, the code provided results in an error which I can’t get around. I will use my data as an example here, but I have tried the original question's code as well, with the same result.
I have a data frame which looks like this:
structure(list(Meta_ID = structure(c(15L, 22L, 31L, 17L), .Label = c("NM*624-46",
"NM*624-54", "NM*624-56", "NM*624-61", "NM*624-70", "NM624-36",
"NM624-38", "NM624-39", "NM624-40", "NM624-41", "NM624-43", "NM624-46",
"NM624-47", "NM624-51", "NM624-54 ", "NM624-56", "NM624-57",
"NM624-59", "NM624-61", "NM624-64", "NM624-70", "NM624-73", "NM624-75",
"NM624-77", "NM624-81", "NM624-82", "NM624-83", "NM624-84", "NM625-02",
"NM625-10", "NM625-11", "SM621-43", "SM621-44", "SM621-46", "SM621-47",
"SM621-48", "SM621-52", "SM621-53", "SM621-55", "SM621-56", "SM621-96",
"SM621-97", "SM622-51", "SM622-52", "SM623-14", "SM623-23", "SM623-26",
"SM623-27", "SM623-32", "SM623-33", "SM623-34", "SM623-55", "SM623-56",
"SM623-57", "SM623-58", "SM623-59", "SM623-61", "SM623-62", "SM623-64",
"SM623-65", "SM623-66", "SM623-67", "SM680-74", "SM681-16"), class = "factor"),
Region = structure(c(1L, 1L, 1L, 1L), .Label = c("N", "S"
), class = "factor"), Tissue = structure(c(1L, 2L, 1L, 1L
), .Label = c("M", "M*"), class = "factor"), Tag_Num = structure(c(41L,
48L, 57L, 43L), .Label = c("621-43", "621-44", "621-46",
"621-47", "621-48", "621-52", "621-53", "621-55", "621-56",
"621-96", "621-97", "622-51", "622-52", "623-14", "623-23",
"623-26", "623-27", "623-32", "623-33", "623-34", "623-55",
"623-56", "623-57", "623-58", "623-59", "623-61", "623-62",
"623-64", "623-65", "623-66", "623-67", "624-36", "624-38",
"624-39", "624-40", "624-41", "624-43", "624-46", "624-47",
"624-51", "624-54", "624-56", "624-57", "624-59", "624-61",
"624-64", "624-70", "624-73", "624-75", "624-77", "624-81",
"624-82", "624-83", "624-84", "625-02", "625-10", "625-11",
"680-74", "681-16"), class = "factor"), Lab_Num = structure(1:4, .Label = c("C4683",
"C4684", "C4685", "C4686", "C4687", "C4688", "C4689", "C4690",
"C4691", "C4692", "C4693", "C4694", "C4695", "C4696", "C4697",
"C4698", "C4699", "C4700", "C4701", "C4702", "C4703", "C4704",
"C4705", "C4706", "C4707", "C4708", "C4709", "C4710", "C4711",
"C4712", "C4713", "C4714", "C4715", "C4716", "C4717", "C4718",
"C4719", "C4720", "C4721", "C4722", "C4723", "C4724", "C4725",
"C4726", "C4727", "C4728", "C4729", "C4730", "C4731", "C4732",
"C4733", "C4734", "C4735", "C4736", "C4737", "C4738", "C4739",
"C4740", "C4741", "C4742", "C4743", "C4744", "C4745", "C4746",
"C4747", "C4748"), class = "factor"), C = c(46.5, 46.7, 45,
43.6), N = c(12.9, 13.7, 14.5, 13.4), C.N = c(3.6, 3.4, 3.1,
3.3), d13C = c(-19.7, -19.5, -19.4, -19.2), d15N = c(13.3,
12.4, 11.7, 11.9)), .Names = c("Meta_ID", "Region", "Tissue",
"Tag_Num", "Lab_Num", "C", "N", "C.N", "d13C", "d15N"), row.names = c(NA,
4L), class = "data.frame")
What I want to produce is a scatter plot of the raw data with an overlay of the data means for each “Region” with bidirectional error bars. To accomplish that I use plyr to summarize my data and generate the means and SDs. Then I use ggplot2:
library(plyr)
Basic <- ddply(First.run,.(Region),summarise,
N = length(d13C),
d13C.mean = mean(d13C),
d15N.mean = mean(d15N),
d13C.SD = sd(d13C),
d15N.SD = sd(d15N))
ggplot(data=First.run, aes(x = First.run$d13C, y = First.run$d15N))+
geom_point(aes(colour = Region))+
geom_point(data = Basic,aes(colour = Region))+
geom_errorbarh(data = Basic, aes(xmin = d13C.mean + d13C.SD, xmax = d13C.mean - d13C.SD,
y = d15N.mean, colour = Region, height = 0.01))+
geom_errorbar(data = Basic, aes(ymin = d15N.mean - d15N.SD, ymax = d15N.mean + d15N.SD,
x = d13C.mean,colour = Region))
But each time I run this code I get the same error and can’t figure out what the problem is.
Error: Aesthetics must either be length one, or the same length as the data
Problems:Region
Any help would be much appreciated.
Edit: Since my example data is taken from the head of my full dataset it only includes samples from the "N" Region. With only this one region the code works fine but if you use fix() to change the provided dataset so that at least one other Region is included (in my data the other Region is "S") then the error I get shows up. My mistake in not including some data from each Region.
I ended up changing two of the "N" Regions to "S" so I could calculate standard deviation for both groups.
I think the problem was that you were missing required aesthetics in some of your geoms (geom_point was missing x and y, for example). At least getting all the required aesthetics into each geom seemed to get everything working. I cleaned up a few other things while I was at it to shorten the code up a bit.
# Map x, y and colour once; every layer inherits them unless it overrides them
ggplot(data = First.run, aes(x = d13C, y = d15N, colour = Region)) +
geom_point() +
# Layers drawn from Basic must remap x and y to the summary columns
geom_point(data = Basic,aes(x = d13C.mean, y = d15N.mean)) +
geom_errorbarh(data = Basic, aes(xmin = d13C.mean - d13C.SD,
xmax = d13C.mean + d13C.SD, y = d15N.mean, x = d13C.mean), height = .5) +
geom_errorbar(data = Basic, aes(ymin = d15N.mean - d15N.SD,
ymax = d15N.mean + d15N.SD, x = d13C.mean, y = d15N.mean), width = .01)
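Here is a self-contained sketch of the whole workflow, so the summary step and the plot can be run together. Two things are assumptions for illustration: the choice of which two rows are relabelled "S" (the answer does not say which ones were changed), and the use of base R split() and lapply() in place of plyr::ddply() so that the example has one dependency fewer.

```r
library(ggplot2)

# The four rows from the question; the last two are relabelled "S" (assumption)
First.run <- data.frame(
  Region = c("N", "N", "S", "S"),
  d13C   = c(-19.7, -19.5, -19.4, -19.2),
  d15N   = c(13.3, 12.4, 11.7, 11.9)
)

# Per-region means and SDs, mirroring the columns produced by ddply()
Basic <- do.call(rbind, lapply(split(First.run, First.run$Region), function(g) {
  data.frame(Region = g$Region[1], N = nrow(g),
             d13C.mean = mean(g$d13C), d15N.mean = mean(g$d15N),
             d13C.SD = sd(g$d13C), d15N.SD = sd(g$d15N))
}))

# Each layer that uses Basic remaps x and y, because Basic has no d13C/d15N
p <- ggplot(First.run, aes(x = d13C, y = d15N, colour = Region)) +
  geom_point() +
  geom_point(data = Basic, aes(x = d13C.mean, y = d15N.mean), size = 3) +
  geom_errorbarh(data = Basic,
                 aes(x = d13C.mean, y = d15N.mean,
                     xmin = d13C.mean - d13C.SD, xmax = d13C.mean + d13C.SD),
                 height = 0.05) +
  geom_errorbar(data = Basic,
                aes(x = d13C.mean, y = d15N.mean,
                    ymin = d15N.mean - d15N.SD, ymax = d15N.mean + d15N.SD),
                width = 0.01)
print(p)
```

The original error probably came from geom_point(data = Basic, ...) inheriting x = First.run$d13C, a vector whose length no longer matches Basic once there is more than one region. Referring to columns by bare name inside aes() avoids that.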