Issue with ggplot2/geom_line - subjects' datasets erroneously plotted jointly - r

I have an issue with the following data frame, although I followed standard examples for line plots.
As shown in the plot, two of the three subjects' dummy time courses are plotted jointly instead of separately:
library(tidyverse)
blub <- structure(list(time = seq(0,10,by=1),
sub1 = seq(10,20,by=1),
sub2 = seq(20,30,by=1),
sub3 = seq(30,40,by=1)),
row.names = c(NA, -11L),
class = "data.frame")
bluba <- gather(data = blub, key=subjects, value=value, 2:ncol(blub))
basic <- c("N","P","P")
statusArray <- rep(basic,each=11)
bluba$status <- statusArray
print(ggplot(data=bluba,
aes(x=time,y=value, color=status)) +
geom_line())
Any comments would be appreciated!

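Add group to aes() so that geom_line knows which points to connect. Without it, both subjects with status "P" fall into the same group and are drawn as a single line: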
print(ggplot(data=bluba,
aes(x=time,y=value, color=status, group = subjects)) +
geom_line())
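To see what the group aesthetic changes, here is a minimal sketch that rebuilds the question's dummy data directly in long format and counts the line groups ggplot2 ends up drawing. The use of interaction() and layer_data() is my own illustration, not part of the original answer; group = subjects alone works just as well here.

```r
library(ggplot2)

# Same dummy values as in the question, written directly in long format.
bluba <- data.frame(
  time     = rep(0:10, times = 3),
  subjects = rep(c("sub1", "sub2", "sub3"), each = 11),
  value    = c(10:20, 20:30, 30:40),
  status   = rep(c("N", "P", "P"), each = 11)
)

# One group per subject/status combination, so every subject gets its own
# line even though sub2 and sub3 share a colour.
p <- ggplot(bluba, aes(x = time, y = value, color = status,
                       group = interaction(subjects, status))) +
  geom_line()
print(p)

# Count the distinct groups in the data ggplot2 computed for the layer.
n_groups <- length(unique(layer_data(p)$group))
```

If n_groups is smaller than the number of subjects, some time courses are still being joined.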

Alternatively, map linetype = subjects; this also splits the data into one line per subject:
bluba %>%
ggplot(aes(x=time,y=value, color=status, linetype=subjects)) +
geom_line()

Related

Plot a bar plot on R, grouped in 2s

Looking to plot grouped bar plots
data:
structure(list(Main = c(0.468893939007605, 0.0629924918425918,
0.561410474480681), Total = c(0.388090040532888, -0.0706047151157143,
0.483298239353565)), class = "data.frame", row.names = c(NA,
-3L))
intended output should look like this:
My current plot code which does not make sense to me:
barplot(main_total$Main, main_total$Total)
ggplot would be preferred, but I have trouble coding it. Any help will be appreciated. Thank you.
That's because barplot expects a matrix in which each column is one group of bars, so the data frame has to be converted and transposed.
m <- as.matrix(main_total)
Use t to transpose the matrix.
b <- barplot(t(m), beside=TRUE, ylab="Value",
ylim=c(round(min(m), 1), round(max(m), 1)), col=3:4)
axis(1, colMeans(b), c("H", "M", "S"))
legend("topleft", legend=c("Main", "Total"), fill=3:4)
box()
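As a rough illustration of why the transpose matters, the sketch below checks the matrix shapes involved. It assumes the question's data is stored in main_total; the variable names m and b follow the answer above.

```r
# Data from the question, assumed to be stored as main_total.
main_total <- data.frame(
  Main  = c(0.468893939007605, 0.0629924918425918, 0.561410474480681),
  Total = c(0.388090040532888, -0.0706047151157143, 0.483298239353565)
)

m <- as.matrix(main_total)
dim(m)     # rows are the three groups, columns are Main and Total
dim(t(m))  # after transposing: rows are Main/Total, columns are the groups

# With beside = TRUE each column becomes one cluster of bars, and barplot
# returns the bar midpoints in a matrix of the same shape as its input.
b <- barplot(t(m), beside = TRUE)
dim(b)
```

colMeans(b) then gives the centre of each cluster, which is what the axis() call above uses to place the group labels.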
Gives
Your example didn't include the grouping variable, so I've added one here; this should give you the idea.
df <- structure(list(Main = c(0.468893939007605, 0.0629924918425918,
0.561410474480681), Total = c(0.388090040532888, -0.0706047151157143,
0.483298239353565)), class = "data.frame", row.names = c(NA,
-3L))
df$Group <- c('H','M','S') # Assign group variables
library(reshape2) # Data frame needs to be in long format
df.m <- melt(df,id.vars = "Group")
df.m
library(ggplot2)
ggplot(df.m, aes(x=Group,y=value)) +
geom_bar(aes(fill=variable),stat = 'identity',position = 'dodge')
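The same reshaping can be sketched with tidyr instead of reshape2. This is an alternative I am swapping in, not what the answer used; it assumes tidyr 1.0 or later for pivot_longer, and the H/M/S labels are the ones assumed in the answers above.

```r
library(tidyr)
library(ggplot2)

main_total <- data.frame(
  Main  = c(0.468893939007605, 0.0629924918425918, 0.561410474480681),
  Total = c(0.388090040532888, -0.0706047151157143, 0.483298239353565)
)
main_total$Group <- c("H", "M", "S")  # assumed group labels

# Long format: one row per Group/variable pair.
long <- pivot_longer(main_total, cols = c(Main, Total),
                     names_to = "variable", values_to = "value")

# geom_col() is geom_bar(stat = "identity"): bar heights come from the data.
p <- ggplot(long, aes(x = Group, y = value, fill = variable)) +
  geom_col(position = "dodge")
print(p)
```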

Overlaying plots with a horizontal date in R

I was attempting to overlay two plots using ggplot2, I can graph them individually, but I want to overlay them to show a comparison. They have the same y axis. The y axis is a score from 0 to 100, the x axis is a specific date in the month (from a range of 3 weeks)
Here is what I have tried:
data <- read.table(text = Level5avg, header = TRUE)
data2 <- read.table(text = Level6avg, header = TRUE)
colnames(data) = c("x","y")
colnames(data2) = c("x","y")
ggplot(rbind(data.frame(data2, group="a"), data.frame(data, group="b")), aes(x=x,y=y)) +
stat_density2d(geom="tile", aes(fill = group, alpha=..density..), contour=FALSE) + scale_fill_manual(values=c("b"="#FF0000", "a"="#00FF00")) + geom_point() + theme_minimal()
When I do this, I get a strange graph that has several dots, but I'm not sure if my code is right, since I can't distinguish the data. I want to add 3 more (small) datasets to the plot, if it is possible. If it is possible, how do I make it into a line graph in order to distinguish the datasets?
Note: I was under the impression ggplot would work for my purposes because of this post (and several other posts on this site advised using ggplot as opposed to Lattice). I'm not sure if what I want is possible, so I came here.
Data sets:
dput(data)
structure(list(x = structure(1:6, .Label = c("10/27/2015", "10/28/2015",
"10/29/2015", "10/30/2015", "10/31/2015", "11/1/2015"), class = "factor"),
y = c(0, 12.5, 0, 0, 11, 43)), .Names = c("x", "y"), class = "data.frame",
row.names = c(NA, -6L))
dput(data2)
structure(list(x = structure(1:3, .Label = c("10/28/2015", "10/31/2015",
"11/1/2015"), class = "factor"), y = c(0, 0, 41.5)), .Names = c("x",
"y"), class = "data.frame", row.names = c(NA, -3L))
I've now managed to get my overlay, but is there a way to organize the horizontal axis? The dates have no order.
It seems to me that the answer that you are basing your plots on uses density plots that are not useful for your data. If you are just looking for some line plots with points, you could do the following (note I created a dataframe outside of the ggplot() call to make it look a little cleaner):
data$group <- "b"
data2$group <- "a"
df <- rbind(data2,data)
df$x <- as.Date(df$x,"%m/%d/%Y")
ggplot(df,aes(x=x,y=y,group=group,color=group)) + geom_line() +
geom_point() + theme_minimal()
Note that by converting the date, the dates end up in the right order all on their own.
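A small base R sketch of why the conversion fixes the axis order, using the two data sets from the question rebuilt by hand. My reading is that the disorder comes from the factor levels of the combined column, which do not have to be chronological; treat that explanation as an assumption and inspect the levels on your own data.

```r
data <- data.frame(
  x = factor(c("10/27/2015", "10/28/2015", "10/29/2015",
               "10/30/2015", "10/31/2015", "11/1/2015")),
  y = c(0, 12.5, 0, 0, 11, 43)
)
data2 <- data.frame(
  x = factor(c("10/28/2015", "10/31/2015", "11/1/2015")),
  y = c(0, 0, 41.5)
)
data$group  <- "b"
data2$group <- "a"
df <- rbind(data2, data)

# A discrete axis follows the factor levels, whatever order they are in.
levels(df$x)

# A Date column is continuous, so the axis is ordered by the actual dates.
df$x <- as.Date(df$x, format = "%m/%d/%Y")
range(df$x)
```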

How to group data and then draw bar chart in ggplot2

I have data frame (df) with 3 columns e.g.
NUMERIC1: NUMERIC2: GROUP(CHARACTER):
100 1 A
200 2 B
300 3 C
400 4 A
I want to group NUMERIC1 by GROUP(CHARACTER), and then calculate mean for each group.
Something like that:
mean(NUMERIC1): GROUP(CHARACTER):
250 A
200 B
300 C
Finally, I'd like to draw a bar chart using ggplot2 with GROUP(CHARACTER) on the x axis and mean(NUMERIC1) on the y axis.
It should look like:
I used
mean <- tapply(df$NUMERIC1, df$GROUP(CHARACTER), FUN=mean)
but I'm not sure if it's OK, and even if it is, I don't know what I'm supposed to do next.
This is what stat_summary(...) is designed for:
colnames(df) <- c("N1","N2","GROUP")
library(ggplot2)
# fun replaces the deprecated fun.y argument (ggplot2 >= 3.3.0)
ggplot(df) + stat_summary(aes(x=GROUP,y=N1),fun=mean,geom="bar",
fill="lightblue",col="grey50")
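One way to convince yourself that stat_summary draws the requested means is to compare the computed bar heights with tapply. This is only a sketch: it keeps the original column names rather than renaming them, assumes ggplot2 3.3.0 or later for the fun argument, and uses layer_data() as my own way of inspecting the plot.

```r
library(ggplot2)

# The four example rows from the question.
df <- data.frame(
  NUMERIC1 = c(100, 200, 300, 400),
  NUMERIC2 = 1:4,
  GROUP    = c("A", "B", "C", "A")
)

# Group means computed directly.
group_means <- tapply(df$NUMERIC1, df$GROUP, mean)

# Group means computed by ggplot2 while building the plot.
p <- ggplot(df, aes(x = GROUP, y = NUMERIC1)) +
  stat_summary(fun = mean, geom = "bar", fill = "lightblue", col = "grey50")
bar_heights <- layer_data(p)$y

print(group_means)
print(bar_heights)
```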
Try something like:
res <- aggregate(NUMERIC1 ~ GROUP, data = df, FUN = mean)
ggplot(res, aes(x = GROUP, y = NUMERIC1)) + geom_bar(stat = "identity")
data
df <- structure(list(NUMERIC1 = c(100L, 200L, 300L, 400L), NUMERIC2 = 1:4,
GROUP = structure(c(1L, 2L, 3L, 1L), .Label = c("A", "B",
"C"), class = "factor")), .Names = c("NUMERIC1", "NUMERIC2",
"GROUP"), class = "data.frame", row.names = c(NA, -4L))
I'd suggest something like:
# Load data.table, which makes it convenient to apply a function to
# each part of a data frame by unique value, and ggplot2.
library(data.table)
library(ggplot2)
# Convert df to a data.table. It remains a data.frame, so any function that works
# on a data.frame still works here.
data <- as.data.table(df)
# For each unique value in GROUP, subset and calculate the mean of the
# NUMERIC1 values within that subset. You end up with a data.frame/data.table
# with the columns GROUP and mean_value.
data <- data[, j = list(mean_value = mean(NUMERIC1)), by = "GROUP"]
# And now we play the plotting game (the plotting game is boring, let's
# play Hungry Hungry Hippos!). stat = "identity" is needed because the
# bar heights are already computed.
plot <- ggplot(data, aes(GROUP, mean_value)) + geom_bar(stat = "identity")
# And that should do it.
Here's a solution using dplyr to create the summary. In this case, the summary is created on the fly within ggplot, but you can also create a separate summary data frame first and then feed that to ggplot.
library(dplyr)
library(ggplot2)
ggplot(df %>% group_by(GROUP) %>%
summarise(`Mean NUMERIC1`=mean(NUMERIC1)),
aes(GROUP, `Mean NUMERIC1`)) +
geom_bar(stat="identity", fill=hcl(195,100,65))
Since you're plotting means rather than counts, it might make more sense to use points rather than bars. For example:
ggplot(df %>% group_by(GROUP) %>%
summarise(`Mean NUMERIC1`=mean(NUMERIC1)),
aes(GROUP, `Mean NUMERIC1`)) +
geom_point(pch=21, size=5, fill="blue") +
coord_cartesian(ylim=c(0,310))
Why ggplot when you could do the same with your own code and barplot:
barplot(tapply(df$NUMERIC1, df$GROUP, FUN=mean))

R ggplot2: colouring step plot depending on value

How do I configure a ggplot2 step plot so that when the value being plotted is over a certain level it is one colour and when it is below that certain level it is another colour? (Ultimately I would like to specify the colours used.)
My first thought was that this would be a simple issue that only required me to add a column to my existing data frame and map this column to the aes() for geom_step(). That works to a point: I get two colours, but they overlap as shown in this image:
I have searched SO for the past several hours and found many similar but not identical questions. However, despite trying a wide variety of combinations in different layers I have not been able to resolve the problem. Code follows. Any help much appreciated.
require(ggplot2)
tmp <- structure(list(date = structure(c(1325635200, 1325635800, 1325636400,
1325637000, 1325637600, 1325638200, 1325638800, 1325639400, 1325640000,
1325640600, 1325641200, 1325641800, 1325642400, 1325643000, 1325643600,
1325644200, 1325647800, 1325648400, 1325649000, 1325649600, 1325650200,
1325650800, 1325651400, 1325652000, 1325652600, 1325653200, 1325653800,
1325654400, 1325655000, 1325655600, 1325656200, 1325656800), tzone = "", tclass = c("POSIXct",
"POSIXt"), class = c("POSIXct", "POSIXt")), Close = c(739.07,
739.86, 740.41, 741.21, 740.99, 741.69, 742.64, 741.34, 741.28,
741.69, 741.6, 741.32, 741.95, 741.86, 741.02, 741.08, 742.08,
742.88, 743.19, 743.18, 743.78, 743.65, 743.66, 742.78, 743.34,
742.81, 743.31, 743.81, 742.91, 743.09, 742.47, 742.99)), .Names = c("date",
"Close"), row.names = c(NA, -32L), class = "data.frame")
prevclose <- 743
tmp$status <- as.factor(ifelse (tmp$Close> prevclose, "Above", "Below"))
ggplot() +
geom_step(data = tmp,aes(date, Close, colour = status))
You need group = 1 in aes:
# top panel
ggplot(tmp, aes(date, Close, colour = status, group = 1)) +
geom_step() + scale_colour_manual(values = c("pink", "green"))
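To illustrate what group = 1 does, the sketch below counts the groups ggplot2 forms with and without it. The eight Close values are invented for brevity and are not the question's data. Naming the colour values by factor level is a small variation of mine, so the mapping does not depend on level order.

```r
library(ggplot2)

# Invented example values, spaced ten minutes apart.
tmp <- data.frame(
  date  = as.POSIXct("2012-01-04 00:00:00", tz = "UTC") +
    seq(0, by = 600, length.out = 8),
  Close = c(742.5, 743.2, 743.4, 742.8, 742.9, 743.6, 743.1, 742.7)
)
prevclose <- 743
tmp$status <- factor(ifelse(tmp$Close > prevclose, "Above", "Below"))

# Without group = 1 the colour mapping splits the series into two
# separate step lines, which is what causes the overlap.
p_split <- ggplot(tmp, aes(date, Close, colour = status)) + geom_step()

# With group = 1 there is a single path whose segments change colour.
p_single <- ggplot(tmp, aes(date, Close, colour = status, group = 1)) +
  geom_step() +
  scale_colour_manual(values = c(Above = "green", Below = "pink"))

groups_split  <- length(unique(layer_data(p_split)$group))
groups_single <- length(unique(layer_data(p_single)$group))
```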
Maybe you want to do something like this:
library(dplyr) # provides arrange()
# make sure that data is sorted by date
tmp2 <- arrange(tmp, date)
# insert intermediate rows at the threshold wherever the status switches between below/above
tmp3 <- tmp2[1, ]
for (i in seq(nrow(tmp2))[-1]) {
if (tmp2[i-1, ]$status != tmp2[i, ]$status) {
tmp3 <- rbind(tmp3,
transform(tmp2[i, ], Close = prevclose, status = tmp2[i-1, ]$status),
transform(tmp2[i, ], Close = prevclose))
}
tmp3 <- rbind(tmp3, tmp2[i, ])
}
# bottom panel
ggplot(tmp3, aes(date, Close, colour = status, group = 1)) + geom_step() +
scale_colour_manual(values = c("pink", "green"))

Bull's-eye charts

A colleague of mine needs to plot 101 bull's-eye charts. This is not her idea. Rather than have her slave away in Excel or God knows what making these things, I offered to do them in R; mapping a bar plot to polar coordinates to make a bull's-eye is a breeze in ggplot2.
I'm running into a problem, however: the data is already aggregated, so Hadley's example here isn't working for me. I could expand the counts out into a factor to do this, but I feel like there's a better way - some way to tell the geom_bar how to read the data.
The data looks like this:
Zoo Animals Bears Polar Bears
1 Omaha 50 10 3
I'll be making a plot for each zoo - but that part I can manage.
and here's its dput:
structure(list(Zoo = "Omaha", Animals = "50", Bears = "10", `Polar Bears` = "3"), .Names = c("Zoo",
"Animals", "Bears", "Polar Bears"), row.names = c(NA, -1L), class = "data.frame")
Note: it is significant that Animals >= Bears >= Polar Bears. Also, she's out of town, so I can't just get the raw data from her (if there was ever a big file, anyway).
While we're waiting for a better answer, I figured I should post the (suboptimal) solution you mentioned. dat is the structure included in your question.
d <- data.frame(animal=factor(sapply(list(dat[2:length(dat)]),
function(x) rep(names(x),x))))
cxc <- ggplot(d, aes(x = animal)) + geom_bar(width = 1, colour = "black")
cxc + coord_polar()
You can use inverse.rle to recreate the data,
dd = list(lengths = unlist(dat[-1]), values = names(dat)[-1])
class(dd) = "rle"
inverse.rle(dd)
If you have multiple Zoos (rows), you can try
l = plyr::dlply(dat, "Zoo", function(z)
structure(list(lengths = unlist(z[-1]), values = names(z)[-1]), class = "rle"))
reshape2::melt(plyr::llply(l, inverse.rle))
The way to do this without disaggregating is to use stat="identity" in geom_bar.
It helps to have the data frame containing numeric values rather than character strings to start:
dat <- data.frame(Zoo = "Omaha",
Animals = 50, Bears = 10, `Polar Bears` = 3)
We do need reshape2::melt to get the data organized properly:
library(reshape2)
d3 <- melt(dat,id.var=1)
Now create the plot (identical to the other answer):
library(ggplot2)
ggplot(d3, aes(x = variable, y = value)) +
geom_bar(width = 1, colour = "black",stat="identity") +
coord_polar()
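Since the question mentions making one chart per zoo, here is a hedged sketch of how the stat = "identity" approach might be repeated over several rows. The second zoo and its counts are invented for illustration, the reshaping is done by hand in base R rather than with melt, and geom_col() stands in for geom_bar(stat = "identity").

```r
library(ggplot2)

# First row is from the question; the second row is made up.
zoos <- data.frame(
  Zoo         = c("Omaha", "Example Zoo"),
  Animals     = c(50, 40),
  Bears       = c(10, 8),
  Polar.Bears = c(3, 2)
)

# Long format, keeping the categories in their nested order.
cats <- c("Animals", "Bears", "Polar.Bears")
long <- data.frame(
  Zoo      = rep(zoos$Zoo, times = length(cats)),
  variable = factor(rep(cats, each = nrow(zoos)), levels = cats),
  value    = unlist(zoos[cats], use.names = FALSE)
)

# One polar bar chart per zoo, stored in a named list.
plots <- lapply(split(long, long$Zoo), function(d) {
  ggplot(d, aes(x = variable, y = value)) +
    geom_col(width = 1, colour = "black") +
    coord_polar() +
    ggtitle(d$Zoo[1])
})
```

Each element can then be printed or saved, for example with print(plots[["Omaha"]]).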
