Here is a simple ggplot chart for two variables:
library("ggplot2")
library("directlabels")
library("tibble")
df <- tibble(
number = 1:10,
var1 = runif(10)*10,
var2 = runif(10)*10
)
ggplot(df, aes(number))+
geom_line(aes(y=var1), color='red')+
geom_line(aes(y=var2), color='blue')
Is it possible to label the last value of var1 and var2 with an expression like this:
direct.label(df, 'last.points')
In my case I get an error:
Error in UseMethod("direct.label") :
no applicable method for 'direct.label' applied to an object of class
Maria, you first need to restructure your data frame into long format by "stacking" the two variables. I like to use the melt function of the reshape2 package for this. It lets you draw both series with a single geom_line.
Then store the ggplot2 plot in an object and pass that object, rather than the data frame, to direct.label from the directlabels package.
library(ggplot2)
library(directlabels)
library(tibble)
library(dplyr)
library(reshape2)
set.seed(1)
df <- tibble::tibble(number = 1:10,
var1 = runif(10)*10,
var2 = runif(10)*10)
df <- df %>%
reshape2::melt(id.vars = "number")
p <- ggplot2::ggplot(df) +
  geom_line(aes(x = number, y = value, col = variable), show.legend = FALSE) +
  scale_color_manual(values = c("red", "blue"))
p
directlabels::direct.label(p, 'last.points')
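The same stacking step can also be sketched with tidyr::pivot_longer instead of reshape2::melt. This is an alternative to, not a copy of, the answer above. The column names series and value are illustrative choices.

```r
library(ggplot2)
library(tidyr)
library(tibble)
library(directlabels)

set.seed(1)
df <- tibble(number = 1:10,
             var1 = runif(10) * 10,
             var2 = runif(10) * 10)

# Stack var1 and var2 into a single value column, keyed by a series column
# (both column names are illustrative, not taken from the question)
df_long <- pivot_longer(df, cols = c(var1, var2),
                        names_to = "series", values_to = "value")

p <- ggplot(df_long) +
  geom_line(aes(x = number, y = value, colour = series)) +
  scale_colour_manual(values = c(var1 = "red", var2 = "blue"))

# direct.label() is applied to the plot object, not to the data frame
directlabels::direct.label(p, "last.points")
```

Naming the colours by series (var1 = "red") keeps the mapping explicit instead of relying on the order of the levels.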
This is basically the problem described in "Plotting an xts object using ggplot2", but I cannot adapt it to plot two series. The code is the following:
dates <- c("2014-10-01", "2014-11-01", "2014-12-01", "2015-01-01", "2015-02-01")
values1 <- as.numeric(c(1,2,3,4,5))
values2 <- as.numeric(c(10,9,8,7,6))
new_df <- data_frame(dates, values1, values2)
new_df$dates <- as.Date(dates)
new_df <- as.xts(new_df[, -1], order.by = new_df$dates)
Now I use ggplot:
ggplot(new_df, aes(x = Index, y = c(values1, values2))) +
  geom_point()
but I get the following error:
Error: Aesthetics must be either length 1 or the same as the data (5): y
Run rlang::last_error() to see where the error occurred.
Is it possible to have both series of this object on the same plot?
Option 1: specify each series as a layer:
ggplot(new_df, aes(x = Index)) +
geom_point(aes(y = values1, color = "values1")) +
geom_point(aes(y = values2, color = "values2"))
Option 2: convert to a longer shape of tibble, with series name as a column:
library(tidyverse)
new_df %>%
zoo::fortify.zoo() %>%
as_tibble() %>%
pivot_longer(-Index, names_to = "series", values_to = "values") %>%
ggplot(aes(x = Index, y = values, color = series)) +
geom_point()
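As a possible shortcut for Option 2, zoo::fortify.zoo has a melt argument that returns the long shape directly. The sketch below rebuilds new_df from the question's values so that it is self-contained. The column names Index, Series and Value are the function's defaults.

```r
library(xts)      # xts depends on zoo, so zoo is attached as well
library(ggplot2)

dates <- as.Date(c("2014-10-01", "2014-11-01", "2014-12-01",
                   "2015-01-01", "2015-02-01"))
new_df <- xts(cbind(values1 = c(1, 2, 3, 4, 5),
                    values2 = c(10, 9, 8, 7, 6)),
              order.by = dates)

# melt = TRUE gives one row per (date, series) pair
long <- zoo::fortify.zoo(new_df, melt = TRUE)

ggplot(long, aes(x = Index, y = Value, colour = Series)) +
  geom_point()
```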
Regarding the creation of new_df, the Note at the end revises the calculation to give the same value with less code:
- new_df is an xts object, not a data frame, so let us use x as a more fitting name.
- There is no real point in creating a data frame and then converting it to xts. Just create the xts object directly.
- We don't need as.numeric. Both instances of c(...) are already numeric.
- The ggplot command takes data frames, not xts objects. For an xts object use autoplot instead.
Note that x in the Note is identical in contents to new_df in the question. We have just used a different name.
Now use autoplot. Omit the geom="point" argument if you want lines and omit the facet=NULL argument if you want separate panels. See ?autoplot.zoo for more examples.
library(ggplot2)
library(xts)
autoplot(x, geom = "point", facet = NULL) + ggtitle("My Plot")
Note
Input used above.
library(xts)
dates <- c("2014-10-01", "2014-11-01", "2014-12-01", "2015-01-01", "2015-02-01")
values1 <- c(1,2,3,4,5)
values2 <- c(10,9,8,7,6)
x <- xts(cbind(values1, values2), as.Date(dates))
My first Q here, so please go lightly if I'm out of step anywhere.
I'm trying to write R code that produces a single chart containing several data series as lines. The number of series may vary, but they are all provided in the data frame. I have tried to adapt another thread's code to add the geom_line layers, but without success.
The logic is:
#desire to replace loop of 1:5 with ncol(df)
print(ggplot(df,aes(x=time))
for (i in 1:5) {
print (+ geom_line(aes(y=df[,i]))
}
#functioning geom point loops ggplot production:
for (i in 1:5) {
print(ggplot(df,aes(x=time,y=df[,i]))+geom_point())
}
#functioning multi-line ggplot where n is explicit:
ggplot(data=df, aes(x=time), group=1) +
geom_line(aes(y=df$`3`))+
geom_line(aes(y=df$`4`))
The working loop produces n separate point charts, 5 in this case. I would like a single chart containing n line series.
This may be similar to How to plot n dimensional matrix? for which there are currently no relevant answers
Any contributions much appreciated, thanks
You can use gather from the tidyverse to do that.
As you didn't supply sample data, I used mtcars.
I created two data frames, one with 3 variables besides mpg and one with 9. In each of them I plotted all of the variables against mpg.
library(tidyverse)
df3Columns <- mtcars[, 1:4]
df9Columns <- mtcars[, 1:10]
df3Columns %>%
gather(var, value, -mpg) %>%
ggplot(aes(mpg, value, group = var, color = var)) +
geom_line()
df9Columns %>%
gather(var, value, -mpg) %>%
ggplot(aes(mpg, value, group = var, color = var)) +
geom_line()
Edit - using the sample data in comments.
library(tidyverse)
df %>%
rownames_to_column("time") %>%
gather(var, value, -time) %>%
ggplot(aes(time, value, group = var, color = var)) +
geom_line()
Sample data:
df <- structure(list("39083" = c(96, 100, 100), "39090" = c(99, 100, 100), "39097" = c(99, 100, 100)), row.names = 3:5, class = "data.frame")
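The tidyr documentation recommends pivot_longer over gather for new code, so here is a hedged sketch of the same reshape on the sample data. Converting the row names to an integer time column is an assumption about what the rows represent.

```r
library(tidyr)
library(ggplot2)

df <- structure(list("39083" = c(96, 100, 100),
                     "39090" = c(99, 100, 100),
                     "39097" = c(99, 100, 100)),
                row.names = 3:5, class = "data.frame")

# Assumption: the row names (3, 4, 5) are the time index
df$time <- as.integer(rownames(df))

# One row per (time, series) pair, however many series columns there are
long <- pivot_longer(df, cols = -time, names_to = "var", values_to = "value")

ggplot(long, aes(time, value, group = var, colour = var)) +
  geom_line()
```

Because cols = -time selects every other column, the code does not need to know how many series the data frame holds.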
To strictly answer your question, you can simply store your ggplot in a variable and add the geom_line one by one:
df <- structure(list("39083" = c(96, 100, 100), "39090" = c(99, 100, 100), "39097" = c(99, 100, 100)), row.names = 3:5, class = "data.frame")
g <- ggplot(df, aes(x = 1:nrow(df)))
for (i in colnames(df))
{
g <- g + geom_line(y = df[,i])
}
g <- g + scale_y_continuous(limits = c(min(df), max(df)))
print(g)
However, this is not a very convenient solution. I would highly recommend reshaping your data frame into the long format that ggplot expects.
df.ultimate <- data.frame(time = numeric(), value = numeric(), group = character())
for (i in colnames(df))
{
df.ultimate <- rbind(df.ultimate, data.frame(time = 1:nrow(df), value = df[, i], group = i))
}
g <- ggplot(df.ultimate, aes(x = time, y = value, color = group))
g <- g + geom_line()
print(g)
A one-line solution:
ggplot(data.frame(time = rep(1:nrow(df), ncol(df)),
value = as.vector(as.matrix(df)),
group = rep(colnames(df), each = nrow(df))),
aes(x = time, y = value, color = group)) + geom_line()
I am trying to plot a set of time series with ggplot2 that sometimes have Greek names. As the plot uses dynamic names (i.e. the names of the variables are stored in another variable), I am having trouble getting it to work.
Here is an example:
# create some data
df <- data.frame(time = rep(1:10, 3),
variable = rep(letters[1:3], each = 10),
val = rnorm(30))
# create a variable for the group name
nam <- expression("alpha[i]")
library(ggplot2)
# plot the data as a line
ggplot(df, aes(x = time, y = val, color = variable)) +
geom_line() +
# Option 1: Does not work
scale_color_discrete(name = eval(nam))
# Option 2: works but has no variable input
# scale_color_discrete(name = expression(alpha[i]))
Do you have any idea of how I can evaluate the variable nam to be displayed as the name of the legend as in option 2?
Thank you very much!
Using this code
# create some data
df <- data.frame(time = rep(1:10, 3),
variable = rep(letters[1:3], each = 10),
val = rnorm(30))
# create a variable for the group name
nam <- expression(alpha[i])
library(ggplot2)
# plot the data as a line
ggplot(df, aes(x = time, y = val, color = variable)) +
geom_line() +
# Option 1: works now, because nam holds an unquoted expression
scale_color_discrete(name = nam)
# Option 2: works but has no variable input
# scale_color_discrete(name = expression(alpha[i]))
This gives me the plot you probably want to see. The two changes are: no eval in name = eval(nam), and no quotation marks in the assignment nam <- expression(alpha[i]).
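If the name is stored as a string, this will probably work:
nam <- "alpha[i]"
...
# parse() alone returns the expression; wrapping it in eval() would try to evaluate alpha[i]
scale_color_discrete(name = parse(text = nam))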
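A minimal sketch of the string-to-label step, assuming the legend title arrives as a character string. The names nam_expr and the sample data are illustrative. parse(text = ...) returns an unevaluated expression, which is what plotmath rendering needs.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(time = rep(1:10, 3),
                 variable = rep(letters[1:3], each = 10),
                 val = rnorm(30))

# The legend title is only known as a string at run time
nam <- "alpha[i]"

# Turn the string into an unevaluated expression; do not eval() it
nam_expr <- parse(text = nam, keep.source = FALSE)

is.expression(nam_expr)   # TRUE
deparse(nam_expr[[1]])    # "alpha[i]"

ggplot(df, aes(x = time, y = val, color = variable)) +
  geom_line() +
  scale_color_discrete(name = nam_expr)
```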
I've seen heatmaps with values made in various R graphics systems including lattice and base like this:
I tend to use ggplot2 a bit and would like to be able to make a heatmap with the corresponding cell values plotted. Here's the heat map and an attempt using geom_text:
library(reshape2)
library(ggplot2)
dat <- matrix(rnorm(100, 3, 1), ncol=10)
names(dat) <- paste("X", 1:10)
dat2 <- melt(dat, id.var = "X1")
p1 <- ggplot(dat2, aes(as.factor(Var1), Var2, group=Var2)) +
geom_tile(aes(fill = value)) +
scale_fill_gradient(low = "white", high = "red")
p1
#attempt
labs <- c(apply(round(dat[, -2], 1), 2, as.character))
p1 + geom_text(aes(label=labs), size=1)
Normally I can figure out the x and y values to pass but I don't know in this case since this info isn't stored in the data set. How can I place the text on the heatmap?
The key is to add a row identifier to the data and reshape it to long format.
Edit (Dec 2022): updated to make the code reproducible with R 4.2.2 / ggplot2 3.4.0 and to reflect changes in tidyverse semantics.
library(ggplot2)
library(tidyverse)
dat <- matrix(rnorm(100, 3, 1), ncol = 10)
## note: names() on a matrix does not set column names, so as_tibble()
## falls back to V1..V10 (see the warning below)
names(dat) <- paste("X", 1:10)
## convert to tibble, add row identifier, and shape "long"
dat2 <-
dat %>%
as_tibble() %>%
rownames_to_column("Var1") %>%
pivot_longer(-Var1, names_to = "Var2", values_to = "value") %>%
mutate(
Var1 = factor(Var1, levels = 1:10),
Var2 = factor(gsub("V", "", Var2), levels = 1:10)
)
#> Warning: The `x` argument of `as_tibble.matrix()` must have unique column names if
#> `.name_repair` is omitted as of tibble 2.0.0.
#> ℹ Using compatibility `.name_repair`.
ggplot(dat2, aes(Var1, Var2)) +
geom_tile(aes(fill = value)) +
geom_text(aes(label = round(value, 1))) +
scale_fill_gradient(low = "white", high = "red")
Created on 2022-12-31 with reprex v2.0.2
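The question asks where the x and y values come from when they are not stored in the data. A base-R sketch of one way to get them uses row() and col() to recover the position of every cell. The data frame name cells and its column names are illustrative.

```r
library(ggplot2)

set.seed(1)
dat <- matrix(rnorm(100, 3, 1), ncol = 10)

# row() and col() return the row and column index of every cell, in the
# same column-major order that as.vector() uses for the values
cells <- data.frame(
  row   = factor(as.vector(row(dat))),
  col   = factor(as.vector(col(dat))),
  value = as.vector(dat)
)

ggplot(cells, aes(x = row, y = col)) +
  geom_tile(aes(fill = value)) +
  geom_text(aes(label = round(value, 1))) +
  scale_fill_gradient(low = "white", high = "red")
```

Since the labels are computed from the same value column that fills the tiles, text and colour cannot drift out of alignment.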
There is another, simpler way to make heatmaps with values: you can use pheatmap.
dat <- matrix(rnorm(100, 3, 1), ncol=10)
names(dat) <- paste("X", 1:10)
install.packages('pheatmap') # if not installed already
library(pheatmap)
pheatmap(dat, display_numbers = TRUE)
This will give you a plot like this
If you want to remove clustering and use your color scheme you can do
pheatmap(dat, display_numbers = TRUE,
         color = colorRampPalette(c('white', 'red'))(100),
         cluster_rows = FALSE, cluster_cols = FALSE,
         fontsize_number = 15)
You can also change the fontsize, format, and color of the displayed numbers.
I have a qplot that is showing 5 different groupings (denoted with colour = type) with two dependent variables each. The command looks like this:
qplot(data = data, x = day, y = var1, geom = "line", colour = type) +
geom_line(aes(y = var2, colour = value))
I'd like to label the two different lines so that I can tell which five represent var1 and which five represent var2.
How do I do this?
You can convert the data to a "tall" format, with melt, and use another aesthetic, such as the line type, to distinguish the variables.
# Sample data
n <- 100
k <- 5
d <- data.frame(
day = rep(1:n,k),
type = factor(rep(1:k, each=n)),
var1 = as.vector( replicate(k, cumsum(rnorm(n))) ),
var2 = as.vector( replicate(k, cumsum(rnorm(n))) )
)
# Reshape the data to "tall" (long) format
library(reshape2)
d <- melt(d, id.vars=c("day","type"))
# Plot
library(ggplot2)
ggplot(d) + geom_line(aes(x=day, y=value, colour=type, linetype=variable))