I have the following generated data frame called Raw_Data:
Time Velocity Type
1 10 1 a
2 20 2 a
3 30 3 a
4 40 4 a
5 50 5 a
6 10 2 b
7 20 4 b
8 30 6 b
9 40 8 b
10 50 9 b
11 10 3 c
12 20 6 c
13 30 9 c
14 40 11 c
15 50 13 c
I plotted this data with ggplot2:
ggplot(Raw_Data, aes(x=Time, y=Velocity))+geom_point() + facet_grid(Type ~.)
I have the objects: Regression_a, Regression_b, Regression_c. These are the linear regression equations for each plot. Each plot should display the corresponding equation.
Using annotate displays the same equation on every plot:
annotate("text", x = 1.78, y = 5, label = Regression_a, color="black", size = 5, parse=FALSE)
I tried to overcome the issue with the following code:
Regression_a_eq <- data.frame(x = 1.78, y = 1,label = Regression_a,
Type = "a")
p <- x + geom_text(data = Raw_Data,label = Regression_a)
This did not solve the problem. Each plot still showed Regression_a, rather than just plot a.
You can put the expressions as character values in a new data frame that has the same unique Type values as your data, and add them with geom_text:
regrDF <- data.frame(Type = c('a','b','c'), lbl = c('Regression_a', 'Regression_b', 'Regression_c'))
ggplot(Raw_Data, aes(x = Time, y = Velocity)) +
geom_point() +
geom_text(data = regrDF, aes(x = 10, y = 10, label = lbl), hjust = 0) +
facet_grid(Type ~.)
You can replace the text values in regrDF$lbl with the appropriate expressions.
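As a minimal sketch of that replacement, the labels could be built from per-group fits instead of typed by hand. The helper name eq_label and the "y = a + b x" label format are assumptions for illustration, not part of the original answer; Raw_Data is retyped from the question.

```r
library(ggplot2)

# Data copied from the question
Raw_Data <- data.frame(
  Time     = rep(c(10, 20, 30, 40, 50), times = 3),
  Velocity = c(1, 2, 3, 4, 5, 2, 4, 6, 8, 9, 3, 6, 9, 11, 13),
  Type     = rep(c("a", "b", "c"), each = 5)
)

# Hypothetical helper: fit Velocity ~ Time on one subset and format the equation
eq_label <- function(d) {
  cf <- coef(lm(Velocity ~ Time, data = d))
  sprintf("y = %.2f + %.2f x", cf[1], cf[2])
}

# One label per Type; split() names each piece after its Type value
lbls <- vapply(split(Raw_Data, Raw_Data$Type), eq_label, character(1))
regrDF <- data.frame(Type = names(lbls), lbl = unname(lbls))

# Same plotting call as in the answer; each facet picks up its own row of regrDF
p <- ggplot(Raw_Data, aes(x = Time, y = Velocity)) +
  geom_point() +
  geom_text(data = regrDF, aes(x = 10, y = 10, label = lbl), hjust = 0) +
  facet_grid(Type ~ .)
```

The key point is that regrDF carries the faceting variable, so geom_text draws each label only in the matching panel.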
A supplement to the accepted answer, for the case where the facets run in both horizontal and vertical directions:
regrDF <- data.frame(Type1 = c('a','a','b','b'),
Type2 = c('c','d','c','d'),
lbl = c('Regression_ac', 'Regression_ad', 'Regression_bc', 'Regression_bd'))
ggplot(Raw_Data, aes(x = Time, y = Velocity)) +
geom_point() +
geom_text(data = regrDF, aes(x = 10, y = 10, label = lbl), hjust = 0) +
facet_grid(Type1 ~ Type2)
The answer is good but still imperfect, as I do not know how to combine math expressions and a newline in the same label (see "Adding a newline in a substitute() expression").
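One possible workaround, offered as a sketch rather than a definitive fix: geom_text accepts parse = TRUE, which treats each label as a plotmath expression, and plotmath's atop() stacks two expressions, which can stand in for a line break. The helper eq_plotmath and the choice of a second line showing the number of points are assumptions for illustration.

```r
library(ggplot2)

# Data copied from the original question
Raw_Data <- data.frame(
  Time     = rep(c(10, 20, 30, 40, 50), times = 3),
  Velocity = c(1, 2, 3, 4, 5, 2, 4, 6, 8, 9, 3, 6, 9, 11, 13),
  Type     = rep(c("a", "b", "c"), each = 5)
)

# Hypothetical helper: build a plotmath string with two stacked lines via atop()
eq_plotmath <- function(d) {
  cf <- coef(lm(Velocity ~ Time, data = d))
  sprintf("atop(italic(y) == %.2f + %.2f * italic(x), italic(n) == %d)",
          cf[1], cf[2], nrow(d))
}

lbls <- vapply(split(Raw_Data, Raw_Data$Type), eq_plotmath, character(1))
regrDF <- data.frame(Type = names(lbls), lbl = unname(lbls))

# parse = TRUE makes geom_text interpret lbl as plotmath instead of plain text
p <- ggplot(Raw_Data, aes(x = Time, y = Velocity)) +
  geom_point() +
  geom_text(data = regrDF, aes(x = 10, y = 10, label = lbl),
            hjust = 0, parse = TRUE) +
  facet_grid(Type ~ .)
```

Note that atop() centres the two lines on each other, so the result is not identical to a left-aligned multi-line label.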
I'm building a dynamic flexdashboard with plotly and I was wondering if there was a way to dynamically resize my dashboard. For example, I have created plots of subjects being tested over time. When I shrink the page down, what I'd like is for it to dynamically adjust to a time-series plot of the average for the group at each test day.
My data looks like this:
library(flexdashboard)
library(knitr)
library(tidyverse)
library(plotly)
subject <- rep(c("A", "B", "C"), each = 8)
testDay <- rep(1:8, times = 3)
variable1 <- rnorm(n = length(subject), mean = 30, sd = 10)
variable2 <- rnorm(n = length(subject), mean = 15, sd = 3)
df <- data.frame(subject, testDay, variable1, variable2)
subject testDay variable1 variable2
1 A 1 21.816831 8.575000
2 A 2 14.947327 17.387903
3 A 3 18.014435 16.734653
4 A 4 33.100524 11.381793
5 A 5 37.105911 13.862776
6 A 6 32.181317 10.722458
7 A 7 41.107293 9.176348
8 A 8 36.674051 17.114815
9 B 1 33.710838 17.508234
10 B 2 23.788428 13.903532
11 B 3 42.846120 17.032208
12 B 4 9.785957 15.275293
13 B 5 32.551619 21.172497
14 B 6 36.912465 18.694263
15 B 7 40.061797 13.759541
16 B 8 41.094825 15.472144
17 C 1 27.663408 17.949291
18 C 2 31.263966 11.546486
19 C 3 39.734050 19.831854
20 C 4 25.461309 19.239821
21 C 5 22.128139 10.837672
22 C 6 31.234339 16.976004
23 C 7 46.273664 19.255745
24 C 8 27.057218 21.086204
My plotly code looks like this (a graph of each subject over time):
Dynamic Chart
===========================
Row
-----------------------------------------------------------------------
```{r}
p1 <- df %>%
ggplot(aes(x = as.factor(testDay), y = variable1, color = subject, group = 1)) +
geom_line() +
theme_bw() +
ggtitle("Variable 1")
ggplotly(p1)
```
```{r}
p2 <- df %>%
ggplot(aes(x = as.factor(testDay), y = variable2, color = subject, group = 1)) +
geom_line() +
theme_bw() +
ggtitle("Variable 2")
ggplotly(p2)
```
Is there a way that when I shrink the website down these plots can dynamically change to a group average plot, like this:
p1_avg <- df %>%
ggplot(aes(x = as.factor(testDay), y = variable1, group = 1)) +
stat_summary(fun.y = "mean", geom = "line") +
theme_bw() +
ggtitle("Variable 1 Avg")
ggplotly(p1_avg)
p2_avg <- df %>%
ggplot(aes(x = as.factor(testDay), y = variable2, group = 1)) +
stat_summary(fun.y = "mean", geom = "line") +
theme_bw() +
ggtitle("Variable 2 Avg")
ggplotly(p2_avg)
You can wrap your plotly object in plotly's renderPlotly() function so that it resizes dynamically with the page. For an example of how I used the function, see this blog post:
https://medium.com/analytics-vidhya/shiny-dashboards-with-flexdashboard-e66aaafac1f2
I used the code below to create my plot above. Is there a way to adapt my code so that I do not have the long red line joining the two periods of non-peak hours?
Day_2 <- non_cumul[(non_cumul$Day.No == 'Day 2'),]
Day_2$time_test <- between(as.ITime(Day_2$date_time),
as.ITime("09:00:00"),
as.ITime("17:00:00"))
Day2plot <- ggplot(Day_2,
aes(date_time, non_cumul_measurement, color = time_test)) +
geom_point()+
geom_line() +
theme(plot.title = element_text(hjust = 0.5)) +
ggtitle('Water Meter Averages (Thurs 4th Of Jan 2018)',
'Generally greater water usage between peak hours compared to non peak hours') +
xlab('Date_Times') +
ylab('Measurement in Cubic Feet') +
scale_color_discrete(name="Peak Hours?")
Day2plot +
theme(axis.title.x = element_text(face="bold", colour="black", size=10),
axis.text.x = element_text(angle=90, vjust=0.5, size=10))
From the sound of it, your plot comprises one observation for each position on the x-axis, and you want consecutive observations of the same color to be joined together in a line.
Here's a simple example that reproduces this:
set.seed(5)
df = data.frame(
x = seq(1, 20),
y = rnorm(20),
color = c(rep("A", 5), rep("B", 9), rep("A", 6))
)
ggplot(df,
aes(x = x, y = y, color = color)) +
geom_line() +
geom_point()
The following code creates a new column "group", which takes on a different value for each collection of consecutive points with the same color. "prev.color" and "change.color" are intermediary columns, included here for clarity:
library(dplyr)
df2 <- df %>%
arrange(x) %>%
mutate(prev.color = lag(color)) %>%
mutate(change.color = is.na(prev.color) | color != prev.color) %>%
mutate(group = cumsum(change.color))
> head(df2, 10)
x y color prev.color change.color group
1 1 -0.84085548 A <NA> TRUE 1
2 2 1.38435934 A A FALSE 1
3 3 -1.25549186 A A FALSE 1
4 4 0.07014277 A A FALSE 1
5 5 1.71144087 A A FALSE 1
6 6 -0.60290798 B A TRUE 2
7 7 -0.47216639 B B FALSE 2
8 8 -0.63537131 B B FALSE 2
9 9 -0.28577363 B B FALSE 2
10 10 0.13810822 B B FALSE 2
ggplot(df2,
aes(x = x, y = y, color = color, group = group)) +
geom_line() +
geom_point()
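The same run numbering can be checked without dplyr. This is a small base R sketch of the idea (flag every position where the value differs from the previous one, then take the cumulative sum), using the color vector from the example above:

```r
# Colors from the example: 5 x "A", 9 x "B", 6 x "A"
color <- c(rep("A", 5), rep("B", 9), rep("A", 6))

# TRUE for the first element and wherever the value differs from its predecessor
change <- c(TRUE, color[-1] != color[-length(color)])

# Each TRUE starts a new run, so the cumulative sum is a run identifier
group <- cumsum(change)

table(group)  # run 1 has 5 points, run 2 has 9, run 3 has 6
```

Because the two "A" runs get different identifiers (1 and 3), geom_line no longer joins them across the "B" run.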
I am trying to use color to highlight differences between and within factor levels. For example, with these reproducible data:
set.seed(123)
dat <- data.frame(
Factor = sample(c("AAA", "BBB", "CCC"), 50, replace = T),
ColorValue = sample(1:4, 50 , replace = T),
x = sample(1:50, 50, replace =T),
y = sample(1:50, 50, replace =T))
head(dat)
Factor ColorValue x y
1 AAA 1 30 43
2 CCC 2 17 25
3 BBB 4 25 20
4 CCC 1 48 13
5 CCC 3 25 6
6 AAA 1 45 20
I want to have a different color for each Factor. Then, within each factor I am trying to use ColorValue as a continuous coloring variable to show intensity.
In the plot below, each facet would have different shades of red, green, and blue that reflect the ColorValue, ideally with a single intensity (i.e. ColorValue) legend for all three factor levels.
ggplot(dat, aes(x = x, y = y, color = Factor)) +
geom_point(size = 3) +
facet_wrap(~Factor) +
theme_bw()
ggplot(dat, aes(x = x, y = y, color = Factor, alpha = ColorValue)) +
geom_point(size = 3) +
facet_wrap(~Factor) +
theme_bw()
A friend of mine helped me write code to generate a scatter plot using the base plot function. Now I would like to make the same plot using ggplot, but I do not understand how to convert the cex parameter of the base plot function into the corresponding ggplot option.
This is an example of my data
df=read.table(text="
A B C D E F G H I J
1 2 3 4 5 1 2 3 4 5
2 3 4 5 6 2 3 4 5 6
3 4 5 6 7 3 4 5 6 7
4 5 6 7 8 4 5 6 7 8
5 6 7 8 9 5 6 7 8 9
6 7 8 9 10 6 7 8 9 10
7 8 9 10 11 7 8 9 10 11
8 9 10 11 12 8 9 10 11 12
9 10 11 12 13 9 10 11 12 13",header=T)
And this is the code I use for the base plot function
temp <- as.matrix(df)
x <- ncol(temp)/2
y <- nrow(temp)
maxtemp <- max(temp [ , 1:x], na.rm = T)
plot(rep(1, y) ~ temp [, x + 1], type = "p", pch = 1, cex = 5*temp [, 1]/maxtemp , xlim = c(0, 15), ylim = c(0,6))
for(i in 1:(x-1)){
points(rep(1 + i, y) ~ temp [, x + 1 + i], type = "p", pch = 1, cex = 5*temp [, 1 + i]/maxtemp)
next}
To make the same plot in ggplot I wrote the code below. However, I do not get the same picture: the dots are not the same size. I guess this is because plot uses cex while ggplot uses size, and I do not know how to deal with this.
temp <- data.table(df)
colnames(temp) <- c(paste("I",c(1:5), sep=""), c(1:5))
x <- ncol(temp)/2
maxtemp <- max(temp [ , 1:x], na.rm = T)
for(t in 1:5){
temp[, t] <- 5*temp[, t, with = FALSE]/maxtemp
next}
#I do this to create the 'cex' values, as cex does not exist in ggplot
ggplot(gather(temp[, 6:10]),aes(x = value, y = as.numeric(key))) + geom_point(aes(size = gather(temp[, 1:5])$value), shape = 1) + xlim(0, 15) + ylim(0, 6) + theme_bw() + theme(legend.position = "none", panel.grid.major = element_blank(), panel.grid.minor = element_blank()) + xlab("") + ylab("")
What am I doing wrong?
Ok, I think I found a way to do what I wanted.
I was looking for something else and I came across the following sentence in the reference manual of ggplot2
# To set aesthetics, wrap in I()
qplot(mpg, wt, data = mtcars, colour = I("red"))
So I gave it a try, and after a few attempts I ended up with the following code
ggplot(gather(temp[, 6:10]),aes(x = value, y = as.numeric(key))) + geom_point(aes(size = I(gather(temp[, 1:5])$value*2)), shape = 1) + xlim(0, 15) + ylim(0, 6) + theme_bw() + theme(legend.position = "none", panel.grid.major = element_blank(), panel.grid.minor = element_blank()) + xlab("") + ylab("")
Using aes(size = I(gather(temp[, 1:5])$value*2)) I get almost exactly the same plot as if I use the plot() function (first block of code).
I do not fully understand the relationship between cex and aes(I()), but it does what I was looking for. Maybe someone can comment on this.
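A possible explanation, stated with some caution: by default ggplot2 passes a mapped size through a scale that rescales the values to its own range, whereas I() marks the values as "use as is", so they are drawn at the sizes given. As far as I know, scale_size_identity() expresses the same intent more explicitly. The sketch below assumes that reading; the long data frame and its column names are made up for illustration, and the factor 2 is the empirical value from the code above, not an exact cex-to-size conversion.

```r
library(ggplot2)

df <- read.table(text = "
A B C D E F G H I J
1 2 3 4 5 1 2 3 4 5
2 3 4 5 6 2 3 4 5 6
3 4 5 6 7 3 4 5 6 7
4 5 6 7 8 4 5 6 7 8
5 6 7 8 9 5 6 7 8 9
6 7 8 9 10 6 7 8 9 10
7 8 9 10 11 7 8 9 10 11
8 9 10 11 12 8 9 10 11 12
9 10 11 12 13 9 10 11 12 13", header = TRUE)

half   <- ncol(df) / 2                       # first half: sizes, second half: positions
maxval <- max(df[, 1:half], na.rm = TRUE)

# Long format: one row per point (illustrative column names)
long <- data.frame(
  x    = unlist(df[, (half + 1):ncol(df)]),   # x positions (columns F to J)
  row  = rep(seq_len(half), each = nrow(df)), # y position 1..5, one per column pair
  size = 5 * unlist(df[, 1:half]) / maxval    # same scaling as cex in the base plot
)

p <- ggplot(long, aes(x = x, y = row, size = size * 2)) +
  geom_point(shape = 1) +
  scale_size_identity() +                     # draw sizes as given, no rescaling
  xlim(0, 15) + ylim(0, 6) +
  theme_bw() +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  xlab("") + ylab("")
```

Keeping the sizes in a column of the plotted data frame also avoids referring to a second data frame inside aes().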
I have a data frame which I generated using the following piece of code,
x <- c(1:10)
y <- x^3
z <- y-20
s <- z/3
t <- s*6
q <- s*y
x1 <- cbind(x,y,z,s,t,q)
x1 <- data.frame(x1)
The data frame x1 thus has the following data,
x y z s t q
1 1 1 -19 -6.333333 -38 -6.333333
2 2 8 -12 -4.000000 -24 -32.000000
3 3 27 7 2.333333 14 63.000000
4 4 64 44 14.666667 88 938.666667
5 5 125 105 35.000000 210 4375.000000
6 6 216 196 65.333333 392 14112.000000
7 7 343 323 107.666667 646 36929.666667
8 8 512 492 164.000000 984 83968.000000
9 9 729 709 236.333333 1418 172287.000000
10 10 1000 980 326.666667 1960 326666.666667
Now I want to plot columns x vs y, z vs s and t vs q in the same plot, so for this I use the following code,
p <- ggplot() +
geom_line(data = x1, aes(x = x1[,1], y = x1[,2], color = "red")) +
geom_line(data = x1, aes(x = x1[,3], y = x1[,4], color = "blue")) +
geom_line(data = x1, aes(x = x1[,5], y = x1[,6], color = "green")) +
xlab('x') +
ylab('y')
While the above code works fine for a data frame of just 6 columns, I would like to perform the same operation for a data frame with many columns. For example, if there are 20 columns in a data frame, a single plot should be generated containing col 1 vs 2, col 3 vs 4, col 5 vs 6 and so on until col 19 vs 20. To do this I use the following code,
p <- ggplot() + geom_line(data = x1, aes(x = x1[,1], y = x1[,2], color = "red")) + xlab('x') + ylab('y')
ctr <- 1
for (iz in seq(3, ncol(x1), by = 2))
{
p$ctr <- p + geom_line(data = x1, aes(x = x1[,iz], y = x1[,iz+1], color = "green"))
ctr <- ctr+1
}
So the plots should be layered incrementally and the last object should contain the entire plot. With the above code the plot gets overwritten every time the loop runs. Could someone point out how to keep all the layers? I would also like to display a legend entry for each line.
Thanks
You don't need a loop if you put your data into the right format. You can create a long data frame based on your original data frame.
x1_long <- data.frame(x = unlist(x1[c(TRUE, FALSE)]),
y = unlist(x1[c(FALSE, TRUE)]),
ind = gl(ncol(x1) / 2, nrow(x1)))
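To see why this works: the logical index c(TRUE, FALSE) is recycled across the columns, so it selects columns 1, 3, 5, ..., while c(FALSE, TRUE) selects columns 2, 4, 6, .... A quick check on the data frame from the question:

```r
# Data frame from the question
x <- 1:10
y <- x^3
z <- y - 20
s <- z / 3
t <- s * 6
q <- s * y
x1 <- data.frame(x, y, z, s, t, q)

names(x1[c(TRUE, FALSE)])   # "x" "z" "t"  (odd columns, used as x values)
names(x1[c(FALSE, TRUE)])   # "y" "s" "q"  (even columns, used as y values)

# gl() builds the pair index: 3 levels, each repeated nrow(x1) = 10 times
ind <- gl(ncol(x1) / 2, nrow(x1))
table(ind)
```

unlist() then stacks the selected columns end to end, so row i of the long data frame holds an x value, its matching y value, and the number of the column pair they came from.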
Now, a single geom_line command is sufficient:
library(ggplot2)
ggplot(x1_long) +
geom_line(aes(x = x, y = y, colour = ind))
(Note. The red line is plotted too but its values are quite small.)
How about this?
library(ggplot2)
ggplot(x1) +
lapply(seq(1, ncol(x1), 2), # every second col index
function(i){ # return the geom_line calls in a list
geom_line(aes(x = .data[[names(x1)[i]]], # look up the x column by name
y = .data[[names(x1)[i + 1]]], # and the y column
color = names(x1)[i + 1]), # map color so each line gets a legend entry
size = 2) # and size
}) +
xlab('x') + ylab('y')