R: Loop pairs of columns in a dataframe

Is it possible to plot pairs of columns in a single plot with a loop? For example, if I have a data frame of time series with 10 columns (x1, x2.. x10), I would like to create 5 plots: 1st plot will display x1 and x2, the 2nd plot would display x3 and x4 and so on.
Any plotting method would be useful, (zoo, lattice, ggplot2).
I got stuck at creating a loop to plot a single variable:
set.seed(1)
x<- data.frame(replicate(10,rnorm(10, mean = 0, sd = 1)))
cols <- seq(1,10)
library(zoo)
z <- read.zoo(x)
for (i in cols) {
plot(z[,i], screen = 1)
}
Thanks in advance.

How about this with ggplot2 and reshape2:
require(reshape2)
require(ggplot2)
m<-melt(matrix(z,10))
m$facet<-cut(m$Var2,c(0,2,4,6,8,10))
ggplot(m)+geom_line(aes(x=Var1,y=value,group=Var2,color=factor(Var2)))+facet_wrap(~ facet)
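The cut() call above is what assigns each column index to its pair. A minimal base-R sketch of just that step, using the same breaks as the answer (the names col_index and pair are illustrative, not from the original):

```r
# Column indices 1..10, playing the role of Var2 in the melted data
col_index <- 1:10

# Intervals (0,2], (2,4], ... place two consecutive columns in each bin
pair <- cut(col_index, breaks = c(0, 2, 4, 6, 8, 10))

# Each level (and hence each facet) should hold exactly two columns
table(pair)
```

For a different number of columns, the breaks could be generated with something like seq(0, ncol, by = 2) instead of being typed out.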

This can be done in a single line without a loop, as in the code that follows; the col argument makes the odd-numbered series black and the even-numbered series red. Note that z in the question has only 9 columns (read.zoo uses the first column of x as the time index), so a 10-column z is used here instead, which is likely what was intended.
library(zoo)
# test data
set.seed(123); z <- zoo(matrix(rnorm(250), 25)); colnames(z) <- make.names(1:10)
plot(z, screen = rep(colnames(z)[c(TRUE, FALSE)], each = 2), col = 1:2)
To produce a single column of panels, add the argument nc=1; to produce a lattice plot, replace plot with xyplot.
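To see why this pairs the series, it may help to look at the vector passed as the screen argument on its own. The sketch below rebuilds it in base R without zoo; nms stands in for colnames(z) and is an illustrative name:

```r
# Column names as produced by make.names(1:10) in the answer
nms <- make.names(1:10)

# c(TRUE, FALSE) is recycled, so it keeps every other name: X1, X3, X5, ...
odd_names <- nms[c(TRUE, FALSE)]

# Repeating each kept name twice gives one screen label per series,
# so consecutive series share a panel
screen <- rep(odd_names, each = 2)
screen
```

Series that share a label are drawn in the same panel, which is what groups the columns two by two.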

like this? Although I am not clear how you want to plot it.
par(mfrow=c(1,5))
for (i in seq(1,10,by=2)){
plot(x[,i],x[,i+1])
}
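If each pair should be drawn as two lines in one panel rather than as a scatter plot, a base-graphics loop along the following lines may be closer to what the question asks. This is only a sketch: it assumes the row number can serve as the time axis, and pair_index is a hypothetical helper name.

```r
set.seed(1)
x <- data.frame(replicate(10, rnorm(10)))

# Group column indices two by two: (1,2), (3,4), ..., (9,10)
pair_index <- split(seq_len(ncol(x)), ceiling(seq_len(ncol(x)) / 2))

# One panel per pair, stacked vertically
op <- par(mfrow = c(length(pair_index), 1), mar = c(2, 4, 2, 1))
for (p in pair_index) {
  # matplot draws one line per column of the selected pair
  matplot(x[, p], type = "l", lty = 1, col = 1:2,
          ylab = "value", main = paste(names(x)[p], collapse = " and "))
}
par(op)
```

Building the pairs with split() rather than with i and i+1 avoids indexing past the last column when the number of columns is odd.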

Related

Split facet plot into list of plots

Huge fan of facet plots in ggplot2. However, sometimes I have too many subplots and it'd be nice to break them up into a list of plots. For example
df <- data.frame(x=seq(1,24,1), y=seq(1,24,1), z=rep(seq(1,12),each=2))
df
x y z
1 1 1 1
2 2 2 1
3 3 3 2
4 4 4 2
5 5 5 3
. . . .
. . . .
myplot <- ggplot(df,aes(x=x, y=y))+geom_point()+facet_wrap(~z)
myplot
How would I write a function to take the resulting plot and split it into a list of plots? Something along these lines
splitFacet <- function(subsPerPlot){
# Method to break a single facet plot into a list of facet plots, each with at most `subsPerPlot` subplots
# code...
return(listOfPlots)
}
Split plot into individual plots
We build a function along these steps:
1. We go through the structure of the object to get the names of the variables used for faceting (here 'z').
2. We overwrite the facet element of our plot object with the one from the empty ggplot object (so if we print it at this stage facets are gone).
3. We extract the data and split it along the variables we identified in step 1.
4. We overwrite the original data with each subset (12 times here) and store all outputs in a list.
code
splitFacet <- function(x){
facet_vars <- names(x$facet$params$facets) # 1
x$facet <- ggplot2::ggplot()$facet # 2
datasets <- split(x$data, x$data[facet_vars]) # 3
new_plots <- lapply(datasets,function(new_data) { # 4
x$data <- new_data
x})
}
new_plots <- splitFacet(myplot)
length(new_plots) # [1] 12
new_plots[[3]] # 3rd plot
Split plot into faceted plots of n subplots max
If we want to keep the facets but have fewer facets per plot, we can skip step 2 and rework our split instead, so that each subset includes several values of the variables used for faceting.
Rather than making a separate function, we'll generalize the first one; n is the number of facets you get per plot.
n = NULL means you get the previous output, which is slightly different from n = 1 (one facet per plot).
splitFacet <- function(x, n = NULL){
facet_vars <- names(x$facet$params$facets) # 1
if(is.null(n)){
x$facet <- ggplot2::ggplot()$facet # 2a
datasets <- split(x$data, x$data[facet_vars]) # 3a
} else {
inter0 <- interaction(x$data[facet_vars], drop = TRUE) # 2b
inter <- ceiling(as.numeric(inter0)/n)
datasets <- split(x$data, inter) # 3b
}
new_plots <- lapply(datasets,function(new_data) { # 4
x$data <- new_data
x})
}
new_plots2 <- splitFacet(myplot,4)
length(new_plots2) # [1] 3
new_plots2[[2]]
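The grouping in steps 2b and 3b can be checked without ggplot2. The sketch below applies the same interaction() and ceiling() arithmetic to the example data; level_id and chunk are illustrative names:

```r
df <- data.frame(x = 1:24, y = 1:24, z = rep(1:12, each = 2))
n <- 4  # facets per plot

# Number the facet levels 1..12, as interaction() does inside splitFacet
level_id <- as.numeric(interaction(df["z"], drop = TRUE))

# Levels 1-4 fall in chunk 1, 5-8 in chunk 2, 9-12 in chunk 3
chunk <- ceiling(level_id / n)

datasets <- split(df, chunk)
sapply(datasets, function(d) length(unique(d$z)))
```

Each element of datasets then holds the rows for at most n facet levels, and that is the subset placed back into the plot object in step 4.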
This might come in handy too :
unfacet <- function(x){
x$facet <- ggplot2::ggplot()$facet
x
}
The tidy way
If the code that built the plot is available, there is no need to go through all this trouble: we can split the data before feeding it to ggplot:
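library(tidyverse)
n_facets <- 4 # facets per plot; not defined in the original snippet, 4 matches the earlier example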
myplots3 <-
df %>%
split(ceiling(group_indices(.,z)/n_facets)) %>%
map(~ggplot(.,aes(x =x, y=y))+geom_point()+facet_wrap(~z))
myplots3[[3]]
While I was looking for a solution for this I came across ggplus, specifically the function facet_multiple:
https://github.com/guiastrennec/ggplus
It lets you split a facet over a number of pages by specifying the amount of plots you want per page. In your example it would be:
library(ggplus)
df <- data.frame(x=seq(1,24,1), y=seq(1,24,1), z=rep(seq(1,12),each=2))
myplot <- ggplot(df,aes(x=x, y=y))+geom_point()
facet_multiple(plot = myplot, facets = 'z', ncol = 2, nrow = 2)
Is this the sort of thing you need? It worked a treat for me.
This is similar to Moody_Muddskipper's answer, but works with any type of faceting (facet_grid or facet_wrap), handles arbitrary expressions in facets, and doesn't draw facet strip bars.
library(rlang)
library(ggplot2)
split_facets <- function(x) {
facet_expr <- unlist(x[["facet"]][["params"]][c("cols", "rows", "facets")])
facet_levels <- lapply(facet_expr, rlang::eval_tidy, data = x[["data"]])
facet_id <- do.call(interaction, facet_levels)
panel_data <- split(x[["data"]], facet_id)
plots <- vector("list", length(panel_data))
for (ii in seq_along(plots)) {
plots[[ii]] <- x
plots[[ii]][["data"]] <- panel_data[[ii]]
plots[[ii]][["facet"]] <- facet_null()
}
plots
}
split_facets(ggplot(df,aes(x=x, y=y))+geom_point()+facet_wrap(~z))
split_facets(ggplot(df,aes(x=x, y=y))+geom_point()+facet_grid(z %% 2 ~ z %% 5))
It uses rlang::eval_tidy to evaluate the facet expressions, combines them into a single categorical factor, then uses that to split the data. It also "suppresses" each subplot's faceting part by replacing it with facet_null().
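The evaluate, combine, split sequence can be tried on its own. The sketch below is a base-R analogue that swaps rlang::eval_tidy for eval() and writes the two facet expressions out by hand instead of reading them from a plot object, so it illustrates the idea rather than reproducing the function above:

```r
df <- data.frame(x = 1:24, y = 1:24, z = rep(1:12, each = 2))

# Facet expressions corresponding to facet_grid(z %% 2 ~ z %% 5)
facet_expr <- list(rows = quote(z %% 2), cols = quote(z %% 5))

# Evaluate each expression with the columns of df in scope
facet_levels <- lapply(facet_expr, eval, envir = df)

# Combine into one factor with a level per observed (row, col) combination
facet_id <- do.call(interaction, c(facet_levels, drop = TRUE))

# One data subset per panel
panel_data <- split(df, facet_id)
length(panel_data)
```

Here drop = TRUE is passed so that combinations that never occur in the data do not produce empty subsets.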
Posting this for anyone wanting to use ggplus. ggplus will work with later versions of R, but you need to install it using the developer's directions, i.e.
devtools::install_github("guiastrennec/ggplus")
I ran into the same issue when trying to install it using RStudio, then realized that it's just not one of the "standard packages." I'm using 3.4.4.

plotting each column of a matrix individually in a single graph in R

I have a 10x10 matrix and I want to plot each column (as a line) in the following way:
1. There should be one y-axis that covers the scale of all columns of the matrix.
2. There should be a single x-axis with 10 points (= the number of columns).
3. The first column of the matrix should be plotted between point 1 and point 2 of the x-axis, the second column between point 2 and point 3, the third column between point 3 and point 4, and so on.
I have already seen other posts, but they all produce multiple plots, which does not meet my requirements. Could you please help me with how this can be done in R?
You could convert your data from wide to long format and then use a standard plotting utility like ggplot to appropriately group your data and position it:
# Build a sample matrix, dat
set.seed(144)
dat <- matrix(rnorm(100), nrow=10)
# Build a data frame, to.plot, where each element represents one value in the matrix
to.plot <- expand.grid(row=factor(seq(nrow(dat))), col=factor(seq(ncol(dat))))
to.plot$dat <- dat[cbind(to.plot$row, to.plot$col)]
to.plot$col <- as.factor(to.plot$col)
# Plot
library(ggplot2)
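# row is a factor, so convert it to numeric before doing arithmetic on it;
# the + must end the line, otherwise R treats the first line as a complete expression
ggplot(to.plot, aes(x=as.numeric(col)+(as.numeric(row)-1)/max(as.numeric(row)), y=dat, group=col, col=col)) +
geom_line() + scale_x_continuous(breaks=1:10) + xlab("Column")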
Here's how you do it with matplot.
# myData is the matrix to plot; type = "l" draws lines, as the question asks
matplot(y = myData,
x = matrix(seq(prod(dim(myData)))/nrow(myData),
nrow=nrow(myData),byrow=FALSE)
- 1/nrow(myData) + 1, type = "l")
The trick is constructing the right matrix for the x values.
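A short sketch of what that x matrix looks like for a 10x10 input may make the trick clearer. The random myData below is a stand-in for the real matrix, and x_pos is an illustrative name:

```r
set.seed(144)
myData <- matrix(rnorm(100), nrow = 10)
n <- nrow(myData)

# Running index 1..100 filled column by column, rescaled so that
# column j covers the x range from j to j + (n - 1) / n
x_pos <- matrix(seq_len(length(myData)) / n, nrow = n) - 1 / n + 1

range(x_pos[, 1])   # extent of the first column on the x-axis
x_pos[1, ]          # starting x position of every column

# One line per column, each drawn within its own unit interval
matplot(x = x_pos, y = myData, type = "l", lty = 1, xlab = "Column", ylab = "Value")
```

Because matplot pairs the columns of x and y one to one, each matrix column is drawn over its own stretch of the shared x-axis.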

Universal scale bar for paneled levelplots

I would like to have multiple heatmaps/levelplots in a single plot, with a universal scale bar. I have the plots arranged, and I think I'm close to the answer, but I want to make sure I don't mess the scale up.
#Fake data
library(lattice) # provides levelplot
library(gridExtra)
fill = rnorm(100,4)
matA = matrix(fill, ncol=10)
matB = matrix(fill * 2, ncol=10)
# Plotting
a=levelplot(matA, colorkey=FALSE)
b=levelplot(matB, colorkey=list(col=rainbow(1000), at=seq(0,6, length.out=1000)))
grid.arrange(a,b,ncol=2)
Thanks for any help!
Instead of using grid.arrange, you may rearrange your data so that you can use the formula method of levelplot. This allows you to easily create a plot with different panels based on a grouping variable g, with a common scale. Here g ('L1') corresponds to the different matrices.
library(reshape2)
library(lattice)
# put your matrices in a list and melt them to one data frame.
l <- list(matA, matB)
df <- melt(l)
# plot
levelplot(value ~ Var1 * Var2 | L1, data = df,
col.regions = rainbow(100))
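If reshape2 is not available, the same long layout can be built in base R, and the colour breaks can be made explicit through levelplot's at argument. This is a sketch under those assumptions: the names mats, long and breaks are illustrative, and 100 colour bins is an arbitrary choice.

```r
library(lattice)

set.seed(1)
fill <- rnorm(100, 4)
mats <- list(A = matrix(fill, ncol = 10), B = matrix(fill * 2, ncol = 10))

# Long format: one row per cell, plus a column saying which matrix it came from
long <- do.call(rbind, lapply(names(mats), function(nm) {
  m <- mats[[nm]]
  data.frame(row = as.vector(row(m)), col = as.vector(col(m)),
             value = as.vector(m), mat = nm)
}))
long$mat <- factor(long$mat)

# One set of breaks spanning both matrices, so both panels share the colour key
breaks <- seq(min(long$value), max(long$value), length.out = 101)

p <- levelplot(value ~ row * col | mat, data = long,
               at = breaks, col.regions = rainbow(100))
print(p)
```

Computing breaks from the combined values is what keeps a given colour tied to the same number in both panels.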

R violin plot overlay 2 dataframes

Say you have two dataframes
M1 <- data.frame(sample(1:3, 500, replace = TRUE), ncol = 5)
M2 <- data.frame(sample(1:3, 500, replace = TRUE), ncol = 5)
and I want to overlay them as violin plots as seen here:
Overlay violin plots ggplot2
but I have 2 dataframes like above (but bigger) not one with 3 columns as in the example above
I have tried the advice using melt as seen here:
Violin plot of a data frame
but I can't get it to overlay two dataframes
help is much appreciated:
Like this?
library(ggplot2)
library(reshape2)
set.seed(1)
M1 <- data.frame(matrix(sample(1:5, 500, replace = TRUE), ncol = 5))
M2 <- data.frame(matrix(sample(2:4, 500, replace = TRUE), ncol = 5))
M1.melt <- melt(M1)
M2.melt <- melt(M2)
ggplot() +
geom_violin(data=M1.melt, aes(x=variable,y=value),fill="lightblue",colour="blue")+
geom_violin(data=M2.melt, aes(x=variable,y=value),fill="lightgreen",colour="green")
There are several issues. First, data.frame(...) does not take an ncol argument, so your code just generates a pair of 2-column data frames with the second column called ncol with all values = 5. If you want 5 columns (do you??) then you have to use matrix(...) as above.
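The difference is easy to confirm by checking the dimensions of both versions. In this small sketch, M_bad and M_ok are illustrative names:

```r
set.seed(1)

# As written in the question: ncol = 5 becomes a second column named "ncol"
M_bad <- data.frame(sample(1:3, 500, replace = TRUE), ncol = 5)
dim(M_bad)
unique(M_bad$ncol)

# Going through matrix() first gives the intended shape
M_ok <- data.frame(matrix(sample(1:3, 500, replace = TRUE), ncol = 5))
dim(M_ok)
```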
Second, you do need to use melt(...) to reorganize the dataframes from "wide" format (categories in 5 different columns) to "long" format (all data in 1 column, called value, with categories distinguished by a second column, called variable).
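For readers who want to see the wide-to-long step without reshape2, base R's stack() produces a similar layout. Note that it names the columns values and ind rather than value and variable, so the aes() mappings would need adjusting. A minimal sketch:

```r
set.seed(1)
M1 <- data.frame(matrix(sample(1:5, 500, replace = TRUE), ncol = 5))

# Wide: 100 rows x 5 columns. Long: 500 rows x 2 columns.
M1.long <- stack(M1)

dim(M1)
dim(M1.long)
head(M1.long)
levels(M1.long$ind)  # the original column names become the categories
```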
Another way to do this combines the two dataframes first:
M3 <- rbind(M1,M2)
M3$group <- rep(c("A","B"),each=100)
M3.melt <- melt(M3, id="group")
ggplot(M3.melt, aes(x=variable, y=value, fill=group)) +
geom_violin(position="identity")
Note that this generates a slightly different plot because ggplot scales the width of the violins together, whereas in the earlier plot they were scaled separately.
EDIT (Response to OP's comment)
To put the fill colors in a legend, you have to make them part of an aesthetic scale: put fill=... inside the call to aes(...) as follows.
ggplot() +
geom_violin(data=M1.melt, aes(x=variable,y=value,fill="M1"),colour="blue")+
geom_violin(data=M2.melt, aes(x=variable,y=value,fill="M2"),colour="green")+
scale_fill_manual(name="Data Set",values=c(M1="lightblue",M2="lightgreen"))

Setting up repetition in lines()

I am trying to create a 'before and after' line chart which shows results of blood tests before and after an operation. I have 307 pairs of data so need to get the lines function to plot a line for each of the 307 columns in the matrix of data created from pre- and post-operative data (one column: one patient). So I tried this:
ylabel<-"Platelet count (millions/ml)"
preoptpk<-c(100,101,102,103,104,105)
postoptpk<-c(106,107,108,109,110,111)
preoptpk<-t(matrix(preoptpk))
postoptpk<-t(matrix(postoptpk))
preoptpk
postoptpk
beforeandafterdata<-rbind(preoptpk, postoptpk)
beforeandafterdata
ylimits<-c(0.8*min(beforeandafterdata,na.rm=TRUE),1.15*max(beforeandafterdata, na.rm=TRUE))
ylimits
plot(beforeandafterdata[,1], type = "l", col = "black", xlim = c(0.9, 2.1),
ylim = ylimits, ann = FALSE, axes = FALSE)
title(ylab=ylabel, cex.lab=1.4)
axis(1,at=1:2,lab=c("Preop.","Postop."),cex.axis=1.5)
axis(2,labels=TRUE)
x<-c(1*2:6)
x
lines(beforeandafterdata[,x],type="l",col="black",
xlim=c(0.9,2.1),ylim=ylimits,ann=FALSE)
..and nothing happened.
I don't understand why I can't use x<-c(1*2:307) since when I manually define x as 2 then 3 then 4 then 5 then 6 it works fine:
x <- 2
x
lines(beforeandafterdata[,x],type="l",col="black",xlim=c(0.9,2.1),ylim=ylimits,ann=FALSE)
x <- 3
x
lines(beforeandafterdata[,x],type="l",col="black",xlim=c(0.9,2.1),ylim=ylimits,ann=FALSE)
x <- 4
x
lines(beforeandafterdata[,x],type="l",col="black",xlim=c(0.9,2.1),ylim=ylimits,ann=FALSE)
x <- 5
x
lines(beforeandafterdata[,x],type="l",col="black",xlim=c(0.9,2.1),ylim=ylimits,ann=FALSE)
x <- 6
x
lines(beforeandafterdata[,x],type="l",col="black",xlim=c(0.9,2.1),ylim=ylimits,ann=FALSE)
x<-c(1*2:6)
Any help on how I can get this to work? I have several variables, and manually plotting 307 lines for each would be very time-consuming. Thanks for any reply.
Trying to stay close to your example, you need to use xy.coords within lines.
plot(beforeandafterdata[,1],type="l",col="black",xlim=c(0.9,2.1),ylim=ylimits,
ann=FALSE,axes=FALSE)
title(ylab=ylabel, cex.lab=1.4)
axis(1,at=1:2,lab=c("Preop.","Postop."),cex.axis=1.5)
axis(2,labels=TRUE)
x<-c(1*2:6)
x
lapply(x, function(x){
lines(xy.coords(x=c(1, 2), y=c(beforeandafterdata[,x])), type="l", col="black",
xlim=c(0.9,2.1),ylim=ylimits,ann=FALSE)
})
The lapply call draws each column with its own lines() call, which prevents one line being joined to the next.
You could use a for loop to do this. e.g.:
for (x in 2:6) {
lines(beforeandafterdata[,x], ...)
}
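A complete version of such a loop might look like the sketch below. It uses the toy data from the question and starts from an empty plot frame so that all columns, including the first, go through the same lines() call; that starting point is a choice made for this sketch, not part of the original answer.

```r
# Toy data from the question: row 1 = pre-op, row 2 = post-op, one column per patient
beforeandafterdata <- rbind(c(100, 101, 102, 103, 104, 105),
                            c(106, 107, 108, 109, 110, 111))
ylimits <- c(0.8 * min(beforeandafterdata, na.rm = TRUE),
             1.15 * max(beforeandafterdata, na.rm = TRUE))

# Empty frame with the desired limits, then the axes
plot(NA, xlim = c(0.9, 2.1), ylim = ylimits, ann = FALSE, axes = FALSE)
title(ylab = "Platelet count (millions/ml)")
axis(1, at = 1:2, labels = c("Preop.", "Postop."))
axis(2)

# One lines() call per patient, with explicit x positions 1 and 2
for (j in seq_len(ncol(beforeandafterdata))) {
  lines(x = 1:2, y = beforeandafterdata[, j], col = "black")
}
```

With the real data, the same loop would run over all 307 columns without any further change.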
Or you can use the reshape2 and ggplot2 packages. First melt your data into a long format that ggplot2 likes:
library(reshape2)
beforeandafter_melted <- melt(beforeandafterdata)
Then plot away. You don't need the color argument, but the group is important to force individual lines to be drawn.
library(ggplot2)
ggplot(beforeandafter_melted, aes(x=Var1, y=value, color=factor(Var2), group=Var2)) +
geom_line()
Where Var1 is the row (1 or 2) and Var2 is the column (1 to 6) from your initial matrix beforeandafterdata.
Also, why have you written x <- c(1*2:307)? This is no different than 2:307 (unless you're trying to force numeric conversion, but that isn't the way to go about it).
all.equal(c(1*2:307), 2:307)
# [1] TRUE

Resources