I have this data frame and want to build a line chart with ggplot2. lb should be the label on the x-axis, and each of the other variables (x0.6, x0.8, x0.9, x0.95, x0.99, and x0.999) should be plotted against lb on the y-axis.
# my data
lb <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)
x0.6 <- c(0.9200795, 0.9315084, 0.9099002, 0.9160192, 0.9121120, 0.9134098, 0.9130619, 0.9128494, 0.9144164)
x0.8 <- c(0.9804872, 1.0144678, 0.9856382, 0.9730490, 1.0032707, 1.0036311, 0.9726198, 0.9986403, 1.0022643)
x0.9 <- c(1.055256, 1.016159, 1.067242, 1.089894, 1.043502, 1.041497, 1.037738, 1.023274, 1.040536)
x0.95 <- c(1.058024, 1.105353, 1.069076, 1.061077, 1.095764, 1.096789, 1.096670, 1.121497, 1.109918)
x0.99 <- c(1.107258, 1.098061, 1.118248, 1.101253, 1.083208, 1.109715, 1.083704, 1.083704, 1.118057)
x0.999 <- c(1.110732, 1.119625, 1.121221, 1.087423, 1.093228, 1.094003, 1.108910, 1.112413, 1.096734)
# my data frame
pos11 <- data.frame(lb, x0.6, x0.8, x0.9, x0.95, x0.99, x0.999)
#load packages
library("reshape2")
library("ggplot2")
# this `R` CODE reshapes the data
long_pos11 <- melt(pos11, id="lb")
# Here is the `R` code that produces the `line-chart`
pos_line <- ggplot(data = long_pos11,
aes(x=lb, y=value, colour=variable)) +
geom_line()
I want the line chart to show the elements of the vector lb (1, 2, 3, 4, 5, 6, 7, 8, 9) as labels on the x-axis, just as the dates are shown in "Plotting two variables as lines using ggplot2 on the same graph".
Try this. Because lb is numeric, you need to convert it to a factor and also add group to your aes() call. Here is the code:
library("reshape2")
library("ggplot2")
# this `R` CODE reshapes the data
long_pos11 <- melt(pos11, id="lb")
# Here is the `R` code that produces the `line-chart`
pos_line <- ggplot(data = long_pos11,
aes(x=factor(lb), y=value, colour=variable,group=variable)) +
geom_line()+xlab('lb')
We can also use pivot_longer from tidyr:
library(ggplot2)
library(tidyr)
library(dplyr)
pos11 %>%
pivot_longer(cols = -lb) %>%
mutate(lb = factor(lb)) %>%
ggplot(aes(x = lb, y = value, color = name, group = name)) +
geom_line() +
xlab('lb')
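If you would rather keep lb numeric, a possible alternative is to set the axis breaks explicitly with scale_x_continuous(). This is only a sketch, not part of the answers above: it uses two of the series as stand-in data and builds the long format by hand in base R.

```r
library(ggplot2)

# Stand-in for the question's data (only two of the six series)
pos11 <- data.frame(
  lb   = 1:9,
  x0.6 = c(0.9200795, 0.9315084, 0.9099002, 0.9160192, 0.9121120,
           0.9134098, 0.9130619, 0.9128494, 0.9144164),
  x0.8 = c(0.9804872, 1.0144678, 0.9856382, 0.9730490, 1.0032707,
           1.0036311, 0.9726198, 0.9986403, 1.0022643)
)

# Long format built by hand: one row per (lb, series) pair
long_pos11 <- data.frame(
  lb       = rep(pos11$lb, times = 2),
  variable = rep(c("x0.6", "x0.8"), each = nrow(pos11)),
  value    = c(pos11$x0.6, pos11$x0.8)
)

# lb stays numeric; the breaks put one tick at every lb value
pos_line <- ggplot(long_pos11, aes(x = lb, y = value, colour = variable)) +
  geom_line() +
  scale_x_continuous(breaks = pos11$lb)
print(pos_line)
```

With a numeric x and a discrete colour mapping, ggplot2 groups the lines by colour on its own, so no explicit group aesthetic is needed here.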
I would like to link variables I have in a dataframe i.e. ('prop1', 'prop2', 'prop3') to specific colours and shapes in the plot. However, I also want to exclude data (using dplyr::filter) to customise the plot display WITHOUT changing the points and shapes used for a specific variable. A minimal example is given below.
library(ggplot2)
library(dplyr)
library(magrittr)
obj <- c("cmpd 1","cmpd 1","cmpd 1","cmpd 2","cmpd 2")
x <- c(1, 2, 4, 7, 3)
var <- c("prop1","prop2","prop3","prop2","prop3")
y <- c(1, 2, 3, 2.5, 4)
col <- c("#E69F00","#9E0142","#56B4E9","#9E0142","#56B4E9")
shp <- c(0,1,2,1,2)
df2 <- cbind.data.frame(obj,x,var,y,col,shp)
plot <- ggplot(data = df2 %>%
filter(obj %in% c(
"cmpd 1",
"cmpd 2"
)),
aes(x = x,
y = y,
colour = as.factor(var),
shape = as.factor(var))) +
geom_point(size=2) +
#scale_shape_manual(values=shp) +
#scale_color_manual(values=col) +
facet_grid(.~obj)
plot
However, when I exclude cmpd 1 (by commenting it out in the code), the colour and shape of prop2 and prop3 for cmpd 2 change (please see plot2).
To address this, I tried adding scale_shape_manual and scale_color_manual to the code (currently commented out) and linking them to specific columns (col and shp) in the data frame (df2), but the same problem arises: both the shape and the colour of these variables change when one of the conditions is excluded.
Any and all help appreciated.
Try something like this:
library(tidyverse)
obj <- c("cmpd 1","cmpd 1","cmpd 1","cmpd 2","cmpd 2")
x <- c(1, 2, 4, 7, 3)
var <- c("prop1","prop2","prop3","prop2","prop3")
y <- c(1, 2, 3, 2.5, 4)
df2 <- cbind.data.frame(obj,x,var,y)
col <- c("prop1" = "#E69F00",
"prop2" = "#9E0142",
"prop3" = "#56B4E9")
shp <- c("prop1" = 0,
"prop2" = 1,
"prop3" = 2)
plot <- ggplot(data = df2 %>%
filter(obj %in% c(
"cmpd 1",
"cmpd 2"
)),
aes(x = x,
y = y,
colour = var,
shape = var)) +
geom_point(size=2) +
scale_shape_manual(values=shp) +
scale_color_manual(values=col) +
facet_grid(.~obj)
plot
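As far as I understand it, this works because scale_*_manual() matches a named values vector to the data by name rather than by position, so dropping a level does not shift the remaining colours and shapes. A small check along these lines, assuming the same col and shp vectors and using subset() in place of the filter() call, should confirm it:

```r
library(ggplot2)

df2 <- data.frame(
  obj = c("cmpd 1", "cmpd 1", "cmpd 1", "cmpd 2", "cmpd 2"),
  x   = c(1, 2, 4, 7, 3),
  var = c("prop1", "prop2", "prop3", "prop2", "prop3"),
  y   = c(1, 2, 3, 2.5, 4)
)
col <- c(prop1 = "#E69F00", prop2 = "#9E0142", prop3 = "#56B4E9")
shp <- c(prop1 = 0, prop2 = 1, prop3 = 2)

# Keep only cmpd 2, so prop1 no longer appears in the data
only_cmpd2 <- subset(df2, obj == "cmpd 2")

p2 <- ggplot(only_cmpd2, aes(x = x, y = y, colour = var, shape = var)) +
  geom_point(size = 2) +
  scale_shape_manual(values = shp) +
  scale_color_manual(values = col) +
  facet_grid(. ~ obj)

# Inspect the colours and shapes ggplot2 actually assigned
used <- unique(layer_data(p2)[, c("colour", "shape")])
print(used)
```

The printed table should list only the prop2 and prop3 entries of col and shp, which is the behaviour the question asks for.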
I am trying to create a histogram of my data.
My dataframe looks like this
x counts
4 78
5 45
... ...
where x is the variable I would like to plot and counts is the number of observations. If I do hist(x) the plot will be misleading because I am not taking into account the count. I have also tried:
hist(do.call("c", (mapply(rep, df$x, df$counts))))
Unfortunately, this does not work because the resulting vector would be too big:
sum(df$counts)
[1] 7943571126
Is there any other way I can try?
Thank you
The solution is a barplot, as @Rui Barradas suggested. I use ggplot2 to plot the data.
library(ggplot2)
x <- c(4, 5, 6, 7, 8, 9)  # six x values, so the length matches counts
counts <- c(78, 45, 50, 12, 30, 50)
df <- data.frame(x=x, counts=counts)
plt <- ggplot(df) + geom_bar(aes(x=x, y=counts), stat="identity")
print(plt)
Since creating a new row for each repetition of x was not possible due to the size of the data, you can plot the density with a weight in ggplot2 using geom_histogram.
library(tidyverse)
set.seed(1)
x <- 1:100
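counts <- sample(20:200, 100, TRUE)
df <- data.frame(x,counts)
# after_stat(density) replaces the older ..density.. notation (ggplot2 >= 3.3.0)
df %>% ggplot() + geom_histogram(aes(x=x, y=after_stat(density), weight=counts))
Compare this with the unweighted histogram, which ignores counts:
df %>% ggplot() + geom_histogram(aes(x=x))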
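To see what the weight aesthetic does, one can pull the binned data back out of the plot with layer_data(). The sketch below uses a tiny made-up data frame (the two rows shown in the question plus one invented row) and a binwidth of 1, both of which are assumptions for illustration only.

```r
library(ggplot2)

# Tiny illustrative table of values and their observation counts
df <- data.frame(x = c(4, 5, 6), counts = c(78, 45, 50))

# weight = counts makes each row contribute `counts` observations to its bin
p <- ggplot(df, aes(x = x, weight = counts)) +
  geom_histogram(binwidth = 1)

# The computed bin heights are the summed weights, not the number of rows
binned <- layer_data(p)
total_binned <- sum(binned$count)
print(total_binned)
print(sum(df$counts))
```

Both printed totals should agree, without ever expanding the data to one row per observation.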
Using matplot, I can plot a line for each row of a dataframe at given x values. For example
set.seed(1)
df <- matrix(runif(20, 0, 1), nrow = 5)
matplot(t(df), type = "l", x = c(1, 3, 7, 9)) # c(1, 3, 7, 9) are the x-axis positions I'd like to plot along
# the line colours are not important
I'd like to use ggplot2 instead, but I'm not sure how best to replicate the outcome. Using melt I can rename the columns to the desired x values, as below. But is there a 'cleaner' approach that I'm missing?
df1 <- as.data.frame(df)
names(df1) <- c(1, 3, 7, 9) # rename columns to the desired x-axis values
df1$id <- 1:nrow(df1)
df1_melt <- melt(df1, id.var = "id")
df1_melt$variable <- as.numeric(as.character(df1_melt$variable)) # convert x-axis values from factor to numeric
ggplot(df1_melt, aes(x = variable, y = value)) + geom_line(aes(group = id))
Any help would be much appreciated. Thanks
Since ggplot2 is increasingly used as part of the tidyverse family of packages, I thought I would post a tidy approach.
# load packages and generate data
library(tidyverse)
set.seed(1)
df <- matrix(runif(20, 0, 1), nrow = 5) %>% as.data.frame()
# put x-values into a data.frame
x_df <- data.frame(col=c('V1', 'V2', 'V3', 'V4'),
x=c(1, 3, 7, 9))
# make a tidy version of the data and graph
df %>%
rownames_to_column %>%
gather(col, value, -rowname) %>%
left_join(x_df, by='col') %>%
ggplot(aes(x=x, y=value, color=rowname)) +
geom_line()
The key idea is to gather() the data into tidy format, so that instead of being 5 rows × 4 columns, the data is 20 rows × 1 value column along with a few other identifier columns (col, rowname and eventually x, in this particular case).
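Since gather() has been superseded in tidyr, the same reshaping can probably be written with pivot_longer(). This is a sketch under the assumption that the columns keep their default names V1 to V4; the x_vals lookup vector and the id column are names introduced here for illustration.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

set.seed(1)
df <- as.data.frame(matrix(runif(20, 0, 1), nrow = 5))

# Hypothetical lookup: column name -> x position
x_vals <- c(V1 = 1, V2 = 3, V3 = 7, V4 = 9)

long <- df %>%
  mutate(id = factor(row_number())) %>%            # one id per original row
  pivot_longer(cols = -id, names_to = "col", values_to = "value") %>%
  mutate(x = unname(x_vals[col]))                  # map column names to x

# One line per original row, drawn at the requested x positions
ggplot(long, aes(x = x, y = value, colour = id)) +
  geom_line()
```

The named-vector lookup plays the role of the left_join() with x_df above.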
autoplot.zoo can do ggplot graphics of matrix data. Omit the facet argument if you want separate panels. The inputs are defined in the Note at the end.
library(ggplot2)
library(zoo)
z <- zoo(t(m), x) # use t so that series are columns
autoplot(z, facet = NULL) + xlab("x")
Note: The inputs used:
set.seed(1)
m <- matrix(runif(20, 0, 1), nrow = 5)
rownames(m) <- c("a", "b", "c", "d", "e")
x <- c(1, 3, 7, 9)
I want to arrange the bar graphs in ascending order by the levels of a variable. The other posts here and here do not work.
Here is some example data:
x <- c(rep(letters[1:2], each = 6))
y <- c(rep(letters[3:8], times = 2))
z <- c(2, 4, 7, 5, 11, 8, 9, 2, 3, 4, 10, 11)
dat <- data.frame(x,y,z)
What I want to achieve is to plot a bar graph of the levels of y grouped by x in increasing order.
The following arranges y in increasing order.
library(tidyverse)
dat2 <- dat %>% group_by(y) %>%
arrange(x, z) %>%
ungroup() %>%
mutate(y = reorder(y, z))
dat2
However, the resulting plots are not what I was expecting.
ggplot(dat2, aes(y,z)) +
geom_bar(stat = "identity") +
facet_wrap(~x, scales = "free")
How can I arrange the levels of y in increasing order of z by x?
If you expect the bars to be increasing within each facet, the same value of y needs to be in a different position for different facets. Try this instead:
dat3 <- dat %>%
mutate(x.y = paste(x, y, sep = ".")) %>%
mutate(x.y = factor(x.y, levels = x.y[order(z)]))
# this creates a new variable that combines x & y; different facets
# simply use different subsets of values from this variable
ggplot(dat3, aes(x.y,z)) +
geom_bar(stat = "identity") +
scale_x_discrete(name = "y", breaks = dat3$x.y, labels = dat3$y) +
facet_wrap(~x, scales = "free")
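A quick way to convince yourself that the combined factor gives increasing bars in every facet is to check the ordering without plotting. The following base-R sketch rebuilds the x.y factor from the example data and tests, per facet, whether z is sorted once rows are put in factor order.

```r
# Example data from the question
x <- rep(letters[1:2], each = 6)
y <- rep(letters[3:8], times = 2)
z <- c(2, 4, 7, 5, 11, 8, 9, 2, 3, 4, 10, 11)
dat <- data.frame(x, y, z)

# Combined x.y label, with levels ordered by z across the whole data set
x.y <- paste(dat$x, dat$y, sep = ".")
dat$x.y <- factor(x.y, levels = x.y[order(dat$z)])

# Within each facet, arranging rows by factor position should also sort z
by_facet <- split(dat, dat$x)
increasing <- sapply(by_facet, function(d) !is.unsorted(d$z[order(d$x.y)]))
print(increasing)
```

Each facet should report TRUE, since a global ordering by z is also an ordering by z inside any subset.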
For multiple (here: two) value lists I want to
plot values as line or points into one diagram
plot histograms into another diagram and
assign the same color to the respective line plot and histogram plot
I've come up with a combination of two examples using ggplot2, but it still uses different colors for the line plot and the histograms. It may also be a bit redundant.
How can I get the same color for line plot and histogram?
Bonus: How can I shorten the piece of used source code?
my result so far:
Source Code (R):
# input data lists
vals_x <- c(4, 3, 6, 7, 4, 6, 9, 3, 0, 8, 3, 7, 7, 5, 9, 0)
vals_y <- c(6, 6, 4, 8, 0, 3, 7, 3, 1, 8, 2, 1, 2, 3, 6, 5)
# ------------------------------------------------
library(ggplot2)
library(gridExtra)
# prepare data for plotting
df <- rbind( data.frame( fill = "blue", obs = vals_x),
data.frame( fill = "red", obs = vals_y))
test_data <- data.frame(
var0 = vals_x,
var1 = vals_y,
idx = seq(length(vals_x)))
stacked <- with(test_data,
data.frame(value = c(var0, var1),
variable = factor(rep(c("Values x","Values y"),
each = NROW(test_data))),
idx = rep(idx, 2),
fill_col = c( rep("blue", length(vals_x)),
rep("red", length(vals_y)))))
# plot line
p_line <- ggplot(stacked, aes(idx, value, colour = variable)) +
geom_line()
# plot histogram
p_hist <- ggplot( df, aes(x=obs, fill = fill)) +
geom_histogram(binwidth=2, colour="black", position="dodge") +
scale_fill_identity()
# arrange diagrams
grid.arrange( p_line, p_hist, ncol = 2)
The easiest thing to do is
Use the same data set in each ggplot object
Then use scale_*_manual (or some other scale call).
So
## Particularly awful colours
p_hist = ggplot(stacked, aes(x=value, fill=variable)) +
geom_histogram(binwidth=2, colour="black", position="dodge") +
scale_fill_manual(values=c("red", "yellow"))
p_line = ggplot(stacked, aes(idx, value, colour = variable)) +
geom_line() +
scale_colour_manual(values=c("red", "yellow"))
As an aside, I wouldn't use a histogram here; a boxplot or density plot would be much better.
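For what it's worth, here is a sketch of the density-plot variant with matched colours. It assumes a single named colour vector (series_cols, a name made up here) shared by both scales, and it rebuilds stacked directly rather than via test_data.

```r
library(ggplot2)

vals_x <- c(4, 3, 6, 7, 4, 6, 9, 3, 0, 8, 3, 7, 7, 5, 9, 0)
vals_y <- c(6, 6, 4, 8, 0, 3, 7, 3, 1, 8, 2, 1, 2, 3, 6, 5)

# One long data frame used by both plots
stacked <- data.frame(
  value    = c(vals_x, vals_y),
  variable = rep(c("Values x", "Values y"), each = length(vals_x)),
  idx      = rep(seq_along(vals_x), 2)
)

# Hypothetical shared palette, named by series so both scales agree
series_cols <- c("Values x" = "blue", "Values y" = "red")

p_line <- ggplot(stacked, aes(idx, value, colour = variable)) +
  geom_line() +
  scale_colour_manual(values = series_cols)

p_dens <- ggplot(stacked, aes(value, fill = variable)) +
  geom_density(alpha = 0.4) +
  scale_fill_manual(values = series_cols)

print(p_line)
print(p_dens)
```

The two plots can then be placed side by side with grid.arrange(p_line, p_dens, ncol = 2) from gridExtra, as in the question.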