I used the code below to create my plot above. Is there a way to adapt my code so that I do not have the long red line joining the two periods of non-peak hours?
library(data.table)  # as.ITime() and between()
library(ggplot2)

# Keep only the rows for Day 2
Day_2 <- non_cumul[(non_cumul$Day.No == 'Day 2'),]
# TRUE for observations between 09:00 and 17:00 (peak hours)
Day_2$time_test <- between(as.ITime(Day_2$date_time),
                           as.ITime("09:00:00"),
                           as.ITime("17:00:00"))
Day2plot <- ggplot(Day_2,
                   aes(date_time, non_cumul_measurement, color = time_test)) +
  geom_point() +
  geom_line() +
  theme(plot.title = element_text(hjust = 0.5)) +
  ggtitle('Water Meter Averages (Thurs 4th Of Jan 2018)',
          'Generally greater water usage between peak hours compared to non peak hours') +
  xlab('Date_Times') +
  ylab('Measurement in Cubic Feet') +
  scale_color_discrete(name = "Peak Hours?")
Day2plot +
  theme(axis.title.x = element_text(face = "bold", colour = "black", size = 10),
        axis.text.x = element_text(angle = 90, vjust = 0.5, size = 10))
From the sound of it, your plot consists of one observation for each position on the x-axis, and you want consecutive observations of the same color to be joined together in a line.
Here's a simple example that reproduces this:
set.seed(5)
df = data.frame(
x = seq(1, 20),
y = rnorm(20),
color = c(rep("A", 5), rep("B", 9), rep("A", 6))
)
ggplot(df,
aes(x = x, y = y, color = color)) +
geom_line() +
geom_point()
The following code creates a new column "group", which takes on a different value for each collection of consecutive points with the same color. "prev.color" and "change.color" are intermediary columns, included here for clarity:
library(dplyr)
df2 <- df %>%
arrange(x) %>%
mutate(prev.color = lag(color)) %>%
mutate(change.color = is.na(prev.color) | color != prev.color) %>%
mutate(group = cumsum(change.color))
> head(df2, 10)
x y color prev.color change.color group
1 1 -0.84085548 A <NA> TRUE 1
2 2 1.38435934 A A FALSE 1
3 3 -1.25549186 A A FALSE 1
4 4 0.07014277 A A FALSE 1
5 5 1.71144087 A A FALSE 1
6 6 -0.60290798 B A TRUE 2
7 7 -0.47216639 B B FALSE 2
8 8 -0.63537131 B B FALSE 2
9 9 -0.28577363 B B FALSE 2
10 10 0.13810822 B B FALSE 2
ggplot(df2,
aes(x = x, y = y, color = color, group = group)) +
geom_line() +
geom_point()
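The same run identifier can also be built without dplyr. The following is only a sketch in base R, assuming the rows are already sorted by x; it uses rle() to find runs of identical colors and numbers them, which should give the same "group" values as the cumsum() approach above.

```r
# Toy colour sequence matching the example above: 5 x A, 9 x B, 6 x A
color <- c(rep("A", 5), rep("B", 9), rep("A", 6))

# rle() finds runs of identical consecutive values
runs <- rle(color)

# Give each run its own id and repeat that id once per element of the run
group <- rep(seq_along(runs$lengths), times = runs$lengths)

# The two separate "A" stretches get different ids (1 and 3), so
# aes(group = group) draws them as separate lines
print(data.frame(color, group))
```

For the original question, the same idea would apply to the time_test column of Day_2 in place of color.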
I'm building a dynamic flexdashboard with plotly and I was wondering if there was a way to dynamically resize my dashboard. For example, I have created plots of subjects being tested over time. When I shrink the page down, what I'd like is for it to dynamically adjust to a time-series plot of the average for the group at each test day.
My data looks like this:
library(flexdashboard)
library(knitr)
library(tidyverse)
library(plotly)
subject <- rep(c("A", "B", "C"), each = 8)
testDay <- rep(1:8, times = 3)
variable1 <- rnorm(n = length(subject), mean = 30, sd = 10)
variable2 <- rnorm(n = length(subject), mean = 15, sd = 3)
df <- data.frame(subject, testDay, variable1, variable2)
subject testDay variable1 variable2
1 A 1 21.816831 8.575000
2 A 2 14.947327 17.387903
3 A 3 18.014435 16.734653
4 A 4 33.100524 11.381793
5 A 5 37.105911 13.862776
6 A 6 32.181317 10.722458
7 A 7 41.107293 9.176348
8 A 8 36.674051 17.114815
9 B 1 33.710838 17.508234
10 B 2 23.788428 13.903532
11 B 3 42.846120 17.032208
12 B 4 9.785957 15.275293
13 B 5 32.551619 21.172497
14 B 6 36.912465 18.694263
15 B 7 40.061797 13.759541
16 B 8 41.094825 15.472144
17 C 1 27.663408 17.949291
18 C 2 31.263966 11.546486
19 C 3 39.734050 19.831854
20 C 4 25.461309 19.239821
21 C 5 22.128139 10.837672
22 C 6 31.234339 16.976004
23 C 7 46.273664 19.255745
24 C 8 27.057218 21.086204
My plotly code looks like this (a graph of each subject over time):
Dynamic Chart
===========================
Row
-----------------------------------------------------------------------
```{r}
p1 <- df %>%
ggplot(aes(x = as.factor(testDay), y = variable1, color = subject, group = subject)) +
geom_line() +
theme_bw() +
ggtitle("Variable 1")
ggplotly(p1)
```
```{r}
p2 <- df %>%
ggplot(aes(x = as.factor(testDay), y = variable2, color = subject, group = subject)) +
geom_line() +
theme_bw() +
ggtitle("Variable 2")
ggplotly(p2)
```
Is there a way that when I shrink the website down these plots can dynamically change to a group average plot, like this:
p1_avg <- df %>%
ggplot(aes(x = as.factor(testDay), y = variable1, group = 1)) +
stat_summary(fun = "mean", geom = "line") +
theme_bw() +
ggtitle("Variable 1 Avg")
ggplotly(p1_avg)
p2_avg <- df %>%
ggplot(aes(x = as.factor(testDay), y = variable2, group = 1)) +
stat_summary(fun = "mean", geom = "line") +
theme_bw() +
ggtitle("Variable 2 Avg")
ggplotly(p2_avg)
You can wrap your plotly object in plotly's renderPlotly() function so that it resizes dynamically with the page (this needs runtime: shiny in the dashboard header). See an example of how I used the function in this blog post:
https://medium.com/analytics-vidhya/shiny-dashboards-with-flexdashboard-e66aaafac1f2
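Independently of how the switch between the two views is triggered, the group average that stat_summary() draws in the question can be computed up front. This is a minimal sketch in base R, assuming data shaped like the question's df (the seed is added here only to make the random data reproducible):

```r
set.seed(1)  # assumption: any seed, used only for reproducibility

# Same layout as the question's data: 3 subjects x 8 test days
subject   <- rep(c("A", "B", "C"), each = 8)
testDay   <- rep(1:8, times = 3)
variable1 <- rnorm(n = length(subject), mean = 30, sd = 10)
df <- data.frame(subject, testDay, variable1)

# One row per test day holding the mean of variable1 across subjects
avg <- aggregate(variable1 ~ testDay, data = df, FUN = mean)
print(avg)
```

A data frame like avg can then be passed to ggplot() with geom_line() directly, instead of relying on stat_summary().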
My data frame looks like this:
CatA CatB CatC
1 Y A
1 N B
1 Y C
2 Y A
3 N B
2 N C
3 Y A
4 Y B
4 N C
5 N A
5 Y B
I want CatA on the x-axis and its count on the y-axis. That graph works fine. However, I also want to group the bars by CatB and stack them by CatC, keeping the count on the y-axis. This is what I have tried, and this is how it looks:
I want it to look like this:
My code:
ggplot(data, aes(factor(CatA), CatB, fill = CatC)) +
geom_bar(stat = "identity", position = "stack") +
theme_bw() + facet_grid(~ CatC)
PS: Sorry for linking to the images instead of embedding them; I get an imgur error every time I try to upload.
You could use facets:
df <- data.frame(A = sample(1:5, 30, T),
B = sample(c('Y', 'N'), 30, T),
C = rep(LETTERS[1:3], 10))
ggplot(df) + geom_bar(aes(B, fill = C), position = 'stack', width = 0.9) +
facet_wrap(~A, nrow = 1) + theme(panel.spacing = unit(0, "lines"))
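When geom_bar() is given no y aesthetic, it counts rows for each combination of x, fill and facet. As a rough check of what the bars will show, those counts can be tabulated directly. The sketch below re-types the data from the question (the name dat is an assumption) and uses base R only:

```r
# Data re-typed from the question
dat <- data.frame(
  CatA = c(1, 1, 1, 2, 3, 2, 3, 4, 4, 5, 5),
  CatB = c("Y", "N", "Y", "Y", "N", "N", "Y", "Y", "N", "N", "Y"),
  CatC = c("A", "B", "C", "A", "B", "C", "A", "B", "C", "A", "B")
)

# Count rows for every CatA / CatB / CatC combination
counts <- as.data.frame(table(dat))

# Keep only the combinations that actually occur in the data
counts <- counts[counts$Freq > 0, ]
print(counts)
```

Each remaining row corresponds to one stacked segment: CatA selects the facet, CatB the bar, and CatC the fill.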
Given the following dataset:
Output<- read.table(text = "Type 2012-06-30' 2012-09-30
1 Market 2 3
2 Geography 3 -2
3 Industry -1 5 ",header = TRUE,sep = "",row.names = 1)
I'm trying to prepare the data in order to use the ggplot2 package and create a stacked bar chart with negative values. Here's the basic chart sequence I'm using:
Output$row <- seq_len(nrow(Output))
dat2 <- melt(Output, id.vars = "row")
But this gives me:
dat2
row variable value
1 1 Type Market
2 2 Type Geography
3 3 Type Industry
4 1 X2012.06.30. 2
5 2 X2012.06.30. 3
6 3 X2012.06.30. -1
7 1 X2012.09.30 3
8 2 X2012.09.30 -2
9 3 X2012.09.30 5
Ideally, the 'row' column would contain Market instead of 1, Geography instead of 2, and Industry instead of 3, so that my bar chart is filled by category (Market, Geography, Industry) rather than by 1-2-3. Also, rows 1 to 3 in dat2 should be dropped since they don't correspond to quarterly data. Thank you!
dat1 <- subset(dat2,value >= 0)
dat3 <- subset(dat2,value < 0)
ggplot() +
geom_bar(data = dat1, aes(x=variable, y=value, fill=row),stat = "identity") +
geom_bar(data = dat3, aes(x=variable, y=value, fill=row),stat = "identity") +
scale_fill_brewer(type = "seq", palette = 1)
I had a go below, but I am quite confused by the part of your question in bold. The odd formatting of your data seems to be caused by using id.vars = "row"; please clarify if need be.
Output<- read.table(text = "Type 2012-06-30' 2012-09-30
1 Market 2 3
2 Geography 3 -2
3 Industry -1 5 ",header = TRUE,sep = "",row.names = 1)
library(reshape2)
library(ggplot2)
# With no id.vars given, melt() uses the non-numeric column (Type) as the id
dat2 <- melt(Output)
dat1 <- subset(dat2,value >= 0)
dat3 <- subset(dat2,value < 0)
ggplot() +
geom_bar(data = dat1, aes(x=variable, y=value, fill=Type),stat = "identity") +
geom_bar(data = dat3, aes(x=variable, y=value, fill=Type),stat = "identity") +
scale_fill_brewer(type = "seq", palette = 1)
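To make the reshaping step less of a black box, here is a sketch of the same wide-to-long conversion in base R. It assumes a cleaned-up version of the data with plain date column names (check.names = FALSE keeps them as typed), so the variable names differ slightly from the melt() output above.

```r
# Wide data, re-typed from the question with clean date column names
Output <- data.frame(
  Type = c("Market", "Geography", "Industry"),
  `2012-06-30` = c(2, 3, -1),
  `2012-09-30` = c(3, -2, 5),
  check.names = FALSE
)

date_cols <- setdiff(names(Output), "Type")

# One row per Type / date combination, like melt(Output) with Type as the id
long <- data.frame(
  Type     = rep(Output$Type, times = length(date_cols)),
  variable = rep(date_cols, each = nrow(Output)),
  value    = unlist(Output[date_cols], use.names = FALSE)
)
print(long)
```

Because Type is carried along as its own column, it can be mapped to fill directly, and no rows need to be dropped afterwards.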
I am having trouble plotting paired data with ggplot2.
So, I have a database with paired (idpair) individuals (id) and their respective sequences, such as
idpair id 1 2 3 4 5 6 7 8 9 10
1 1 1 d b d a c a d d a b
2 1 2 e d a c c d a b a c
3 2 3 e a a a a c d b c e
4 2 4 d d b c d e a a a b
...
What I would like is to plot all the sequences in a way that lets us visually distinguish each pair.
I thought of using the grid such as: facet_grid(idpair~.). My issue looks like this:
How could I plot the two sequences of each pair side by side, removing the empty space in between caused by the other idpair values?
Any suggestions of alternative plotting of paired data are very welcome.
My code
library(ggplot2)
library(dplyr)
library(reshape2)
dtmelt = dt %>% melt(id.vars = c('idpair', 'id')) %>% arrange(idpair, id, variable)
dtmelt %>% ggplot(aes(y = id, x = variable, fill = value)) +
geom_tile() + scale_fill_brewer(palette = 'Set3') +
facet_grid(idpair~.) + theme(legend.position = "none")
Code to generate the data:
dt = as.data.frame( cbind( sort( rep(1:10, 2) ) , 1:20, replicate(10, sample(letters[1:5], 20, replace = T)) ) )
colnames(dt) = c('idpair', 'id', 1:10)
You can remove the unused levels in the facet by setting scales = "free_y". This will vary the y-axis limits for each facet.
dtmelt %>% ggplot(aes(y = id, x = variable, fill = value)) +
geom_tile() + scale_fill_brewer(palette = 'Set3') +
facet_grid(idpair~., scales = "free_y") + theme(legend.position = "none")
I have the following generated data frame called Raw_Data:
Time Velocity Type
1 10 1 a
2 20 2 a
3 30 3 a
4 40 4 a
5 50 5 a
6 10 2 b
7 20 4 b
8 30 6 b
9 40 8 b
10 50 9 b
11 10 3 c
12 20 6 c
13 30 9 c
14 40 11 c
15 50 13 c
I plotted this data with ggplot2:
ggplot(Raw_Data, aes(x=Time, y=Velocity))+geom_point() + facet_grid(Type ~.)
I have the objects: Regression_a, Regression_b, Regression_c. These are the linear regression equations for each plot. Each plot should display the corresponding equation.
Using annotate displays the same equation on every plot:
annotate("text", x = 1.78, y = 5, label = Regression_a, color="black", size = 5, parse=FALSE)
I tried to overcome the issue with the following code:
Regression_a_eq <- data.frame(x = 1.78, y = 1,label = Regression_a,
Type = "a")
p <- x + geom_text(data = Raw_Data,label = Regression_a)
This did not solve the problem. Every plot still showed Regression_a, rather than only the plot for Type a.
You can put the expressions as character values in a new data frame that has the same unique Type values as your data, and add them with geom_text:
regrDF <- data.frame(Type = c('a','b','c'), lbl = c('Regression_a', 'Regression_b', 'Regression_c'))
ggplot(Raw_Data, aes(x = Time, y = Velocity)) +
geom_point() +
geom_text(data = regrDF, aes(x = 10, y = 10, label = lbl), hjust = 0) +
facet_grid(Type ~.)
which gives:
You can replace the text values in regrDF$lbl with the appropriate expressions.
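One possible way to fill regrDF$lbl with real equations is to fit one lm() per Type and format the coefficients as text. This is only a sketch: the label format is an assumption, and it rebuilds Raw_Data from the table in the question.

```r
# Raw_Data re-typed from the question
Raw_Data <- data.frame(
  Time     = rep(c(10, 20, 30, 40, 50), times = 3),
  Velocity = c(1, 2, 3, 4, 5,  2, 4, 6, 8, 9,  3, 6, 9, 11, 13),
  Type     = rep(c("a", "b", "c"), each = 5)
)

# Fit Velocity ~ Time separately for each Type and keep the coefficients
fits <- lapply(split(Raw_Data, Raw_Data$Type),
               function(d) coef(lm(Velocity ~ Time, data = d)))

# One label per Type; the "y = a + b x" format is an arbitrary choice
regrDF <- data.frame(
  Type = names(fits),
  lbl  = unname(vapply(fits,
                       function(b) sprintf("y = %.2f + %.2f x", b[1], b[2]),
                       character(1)))
)
print(regrDF)
```

The resulting regrDF has the Type and lbl columns that the geom_text() call above expects, so it can be used in place of the hand-written one.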
A supplement to the accepted answer, for the case where the facets run in both the horizontal and vertical directions:
regrDF <- data.frame(Type1 = c('a','a','b','b'),
Type2 = c('c','d','c','d'),
lbl = c('Regression_ac', 'Regression_ad', 'Regression_bc', 'Regression_bd'))
ggplot(Raw_Data, aes(x = Time, y = Velocity)) +
geom_point() +
geom_text(data = regrDF, aes(x = 10, y = 10, label = lbl), hjust = 0) +
facet_grid(Type1 ~ Type2)
This works, but it is still imperfect: I do not know how to combine math expressions and a newline in the same label (see "Adding a newline in a substitute() expression").
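One commonly used workaround, offered here as a sketch rather than a complete solution, is to write each label as a plotmath string and stack the two lines with atop() instead of a newline. geom_text() can then be called with parse = TRUE so the strings are rendered as math. The slope and R^2 values below are hypothetical placeholders.

```r
# Hypothetical values, for illustration only
slope <- 0.1
r2    <- 0.99

# atop(top, bottom) is plotmath's way of stacking two expressions vertically
lbl <- sprintf("atop(y == %.2f * x, R^2 == %.2f)", slope, r2)

# Check that the string is valid plotmath by parsing it into an expression
expr <- parse(text = lbl)
print(expr)

# In the plot, the label column would then be used as
#   geom_text(data = regrDF, aes(x = 10, y = 10, label = lbl), parse = TRUE)
```

atop() centres the two lines and only handles two levels, so labels with more lines would need nested atop() calls.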