Could someone please suggest a method to print several matrices side by side in the terminal window?
For the matrices m1 and m2, I would like the desired output below.
m1 <- m2 <- matrix(1:4, nrow=2, dimnames=list(c("a", "b"), c("d", "e")))
Desired output
m1 m2
d e d e
a 1 3 a 1 3
b 2 4 b 2 4
The reason is that I have several 2x2 matrices that I am using in calculations and want to show in an R Markdown document. Printing them one below the other takes up too much of the page. Thanks.
EDIT
My attempt at a solution
fn <- function(x) setNames(data.frame(.=paste(" ", rownames(x)), x,
check.names=F, row.names=NULL),c(paste(substitute(x)), colnames(x)))
cbind(fn(m1), fn(m2))
# m1 d e m2 f g
#1 a 1 3 v 1 3
#2 b 2 4 w 2 4
But this of course doesn't look very good.
A little hack-ish, but I believe it is what you want:
m1 <- m2 <- m3 <- m4 <- matrix(1:4, nrow=2, dimnames=list(c("a", "b"), c("d", "e")))
fn <- function(x) setNames(data.frame(.=paste("", rownames(x)), x, check.names=F, row.names=NULL),c(" ", colnames(x)))
matrix.names <- Filter( function(x) is.matrix( get(x) ), ls(pattern = "^m[0-9]+$") ) # only objects named m1, m2, ...
matrix.list <- lapply(matrix.names, get)
matrix.chain <- do.call(cbind, lapply(matrix.list, fn))
cat(" ", paste0(matrix.names, collapse = " "), "\n"); print(matrix.chain, row.names = FALSE)
m1 m2 m3 m4
d e d e d e d e
a 1 3 a 1 3 a 1 3 a 1 3
b 2 4 b 2 4 b 2 4 b 2 4
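If collecting the matrices with ls() and get() feels too fragile, the same layout can be sketched in base R by capturing each matrix's printed lines and pasting them together row by row. This is only a minimal sketch: print_side_by_side and its gap argument are hypothetical names, and it assumes a named list of matrices that all print with the same number of lines.

```r
m1 <- m2 <- matrix(1:4, nrow = 2, dimnames = list(c("a", "b"), c("d", "e")))

# Hypothetical helper: expects a *named* list of matrices with equal row counts.
print_side_by_side <- function(mats, gap = "   ") {
  blocks <- lapply(names(mats), function(nm) {
    lines <- capture.output(print(mats[[nm]]))  # the lines print() would show
    format(c(nm, lines))                        # pad title and lines to one width
  })
  out <- do.call(paste, c(blocks, sep = gap))   # glue the blocks line by line
  cat(out, sep = "\n")
  invisible(out)
}

out <- print_side_by_side(list(m1 = m1, m2 = m2))
```

Passing the matrices explicitly avoids depending on which objects happen to exist in the workspace.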
Suppose that we have the following dataframe:
set.seed(1)
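# stringsAsFactors = TRUE is needed in R >= 4.0 so that R1 and R2 are factors
(tmp <- data.frame(x = 1:10, R1 = sample(LETTERS[1:5], 10, replace =
TRUE), R2 = sample(LETTERS[1:5], 10, replace = TRUE), stringsAsFactors = TRUE))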
x R1 R2
1 1 B B
2 2 B A
3 3 C D
4 4 E B
5 5 B D
6 6 E C
7 7 E D
8 8 D E
9 9 D B
10 10 A D
I want to do the following: if the difference between the level index
of factor R1 and that of factor R2 is an odd number, the levels of the
two factors need to be switched between them, which can be performed
through the following code:
for(ii in 1:dim(tmp)[1]) {
kk <- which(levels(tmp$R2) %in% tmp[ii,'R2'], arr.ind = TRUE) -
which(levels(tmp$R1) %in% tmp[ii,'R1'], arr.ind = TRUE)
if(kk%%2!=0) { # swap the values between the two factors
qq <- tmp[ii,]$R1
tmp[ii,]$R1 <- tmp[ii,]$R2
tmp[ii,]$R2 <- qq
}
}
Are there more concise and efficient ways to achieve this?
P.S. A slightly different situation is the following.
set.seed(1)
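# stringsAsFactors = TRUE is needed in R >= 4.0 so that R1 and R2 are factors
(tmp <- data.frame(x = 1:10, R1 = sample(LETTERS[1:5], 10, replace =
TRUE), R2 = sample(LETTERS[2:6], 10, replace = TRUE), stringsAsFactors = TRUE))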
x R1 R2
1 C B
2 B B
3 C E
4 E C
5 E B
6 D E
7 E E
8 D F
9 C D
10 A E
Notice that the factor levels between the two factors, R1 and R2, slide by one level; that is, factor R1 does not have level F while factor R2 does not have level A. I want to swap the factor levels based on the combined levels of the two factors as shown below:
tl <- unique(c(levels(tmp$R1), levels(tmp$R2)))
for(ii in 1:dim(tmp)[1]) {
kk <- which(tl %in% tmp[ii,'R2'], arr.ind = TRUE) - which(tl %in%
tmp[ii,'R1'], arr.ind = TRUE)
if(kk%%2!=0) { # swap the values between the two factors
qq <- tmp[ii,]$R1
tmp[ii,]$R1 <- tmp[ii,]$R2
tmp[ii,]$R2 <- qq
}
}
How to go about this case? Thanks!
#Find out the indices where difference is odd
inds = abs(as.numeric(tmp$R1) - as.numeric(tmp$R2)) %% 2 != 0
#create new columns where values for the appropriate inds are from relevant columns
tmp$R1_new = replace(tmp$R1, inds, tmp$R2[inds])
tmp$R2_new = replace(tmp$R2, inds, tmp$R1[inds])
tmp
# x R1 R2 R1_new R2_new
#1 1 B B B B
#2 2 B A A B
#3 3 C D D C
#4 4 E B B E
#5 5 B D B D
#6 6 E C E C
#7 7 E D D E
#8 8 D E E D
#9 9 D B D B
#10 10 A D D A
Delete the old R1 and R2 if necessary
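If the extra R1_new and R2_new columns are not wanted, the swap can also be sketched in place with base R. The idea is that assigning a data frame to a block of columns matches by position, so reversing the column order on the right-hand side swaps the two columns for the selected rows. The small data frame below is a hand-typed copy of the first four rows of the example, with levels A to E assumed for both factors.

```r
# Hand-typed copy of the first four example rows (assumption: levels A..E).
tmp <- data.frame(
  x  = 1:4,
  R1 = factor(c("B", "B", "C", "E"), levels = LETTERS[1:5]),
  R2 = factor(c("B", "A", "D", "B"), levels = LETTERS[1:5])
)

# Rows where the level indices differ by an odd number.
odd <- (as.integer(tmp$R1) - as.integer(tmp$R2)) %% 2 != 0

# Columns are matched by position, so this swaps R1 and R2 on those rows.
tmp[odd, c("R1", "R2")] <- tmp[odd, c("R2", "R1")]
tmp
```

This only works cleanly when both factors share the same levels, as they do here.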
A solution using dplyr; dt is the final output. Note that we need if_else from dplyr here rather than base R's ifelse, because ifelse drops the factor attributes and returns the underlying integer codes.
library(dplyr)
dt <- tmp %>%
mutate(R1_new = if_else((as.numeric(R2) - as.numeric(R1)) %% 2 != 0, R2, R1),
R2_new = if_else((as.numeric(R2) - as.numeric(R1)) %% 2 != 0, R1, R2)) %>%
select(x, R1 = R1_new, R2 = R2_new)
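The difference between if_else and ifelse on factors can be checked in base R alone. The sketch below uses made-up vectors f, g and cond (not from the question) to show that ifelse returns plain numbers instead of a factor, together with one possible base R workaround that goes through character values.

```r
# Illustrative vectors only; they are not part of the original data.
f    <- factor(c("A", "B", "C"))
g    <- factor(c("C", "A", "B"))
cond <- c(TRUE, FALSE, TRUE)

# Base ifelse() loses the factor class and yields the integer codes.
res_base <- ifelse(cond, f, g)
is.factor(res_base)

# Workaround: pick among the labels, then rebuild the factor.
res_fixed <- factor(ifelse(cond, as.character(f), as.character(g)),
                    levels = levels(f))
res_fixed
```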
Update
For the updated case, add one mutate call to redefine the factor levels of R1 and R2. The rest is the same.
tl <- unique(c(levels(tmp$R1), levels(tmp$R2)))
dt <- tmp %>%
mutate(R1 = factor(R1, levels = tl), R2 = factor(R2, levels = tl)) %>%
mutate(R1_new = if_else((as.numeric(R2) - as.numeric(R1)) %% 2 != 0, R2, R1),
R2_new = if_else((as.numeric(R2) - as.numeric(R1)) %% 2 != 0, R1, R2)) %>%
select(x, R1 = R1_new, R2 = R2_new)
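The same re-leveling step can be sketched without dplyr. Putting both factors on the union of their levels makes the integer codes comparable and lets a value such as "F" be stored in R1, which would otherwise become NA. The three rows below are made up for illustration and are not taken from the sampled data.

```r
# Made-up rows; R1 lacks level "F" and R2 lacks level "A", as in the question.
tmp <- data.frame(
  x  = 1:3,
  R1 = factor(c("C", "B", "E"), levels = LETTERS[1:5]),
  R2 = factor(c("B", "B", "F"), levels = LETTERS[2:6])
)

# Shared level set, then rebuild both factors on it.
tl <- union(levels(tmp$R1), levels(tmp$R2))
tmp$R1 <- factor(tmp$R1, levels = tl)
tmp$R2 <- factor(tmp$R2, levels = tl)

# Swap positionally where the combined-level indices differ by an odd number.
odd <- (as.integer(tmp$R2) - as.integer(tmp$R1)) %% 2 != 0
tmp[odd, c("R1", "R2")] <- tmp[odd, c("R2", "R1")]
tmp
```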
Here is an option using data.table
library(data.table)
setDT(tmp)[(as.integer(R1) - as.integer(R2))%%2 != 0, c('R2', 'R1') := .(R1, R2)]
tmp
# x R1 R2
#1: 1 B B
#2: 2 A B
#3: 3 D C
#4: 4 B E
#5: 5 B D
#6: 6 E C
#7: 7 D E
#8: 8 E D
#9: 9 D B
#10:10 D A
Suppose I have a data frame like this:
df<- read.table(text="groups names
1 a
1 b
1 c
1 d
2 e
2 f
2 g
2 h
", header=T)
I divided this data frame into two groups by using
split_groups <-split(df, df$groups)
Then I used a for loop with rollapply from the zoo package to obtain the overlapping lists from split_groups[[1]] and split_groups[[2]] as follows:
library(zoo) # provides rollapply
slide <- list()
for(i in 1:2){
slide[[i]] <- rollapply(split_groups[[i]][,2], width =2,by=1, matrix, align="right")
}
And obtained this :-
slide[[1]]:
a
b
**b**
c
**c**
d
slide[[2]] :
e
f
**f**
g
**g**
h
Then I divided slide[[1]] and slide[[2]] into lists of equal rows:
divide <- split(slide[[1]], cumsum(seq_len(nrow(slide[[1]])) %%2 == 1))
and obtained divide[[1]] = a,b ; divide[[2]] = b,c and so on.
Similarly from slide[[2]], divide[[1]] = e,f and so on.
I want to rbind divide[[1]] from slide[[1]] and slide[[2]], i.e. set1 = a,b,e,f, in the form of a list or data frame.
Similarly, divide[[2]] from slide[[1]] and slide[[2]], i.e. set2 = b,c,f,g.
That is:
set1:
a
b
e
f
set2:
b
c
f
g
How can I do this?
Maybe you want this (the slide output I get is different from what was shown in the post):
divide1 <- split(slide[[1]], cumsum(seq_len(nrow(slide[[1]])) %%2 == 1))
divide2 <- split(slide[[2]], cumsum(seq_len(nrow(slide[[2]])) %%2 == 1))
nm1 <- paste0("set", 1:2)
Map(function(x,y,z) setNames(data.frame(c(x,y)),z), divide1, divide2, nm1)
#$`1`
# set1
#1 a
#2 b
#3 e
#4 f
#$`2`
# set2
#1 b
#2 f
Or if you have more list elements in slide, you could do:
divide <- lapply(slide, function(x) split(x, cumsum(!!seq_len(nrow(x)) %%2)))
divN <- unlist(divide)
lstN <- split(unname(divN), substr(names(divN),1,1))
nm1 <- paste0("set", seq_along(lstN))
Map(function(x,y) setNames(data.frame(x),y), lstN, nm1)
#$`1`
# set1
#1 a
#2 b
#3 e
#4 f
#$`2`
# set2
#1 b
#2 f
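Since the real goal is "window k of every group, stacked together", it may be simpler to build the sets directly instead of reshaping the rollapply output. The following is a base R sketch under the assumption that every group has the same number of rows; width, windows and sets are names introduced here for illustration.

```r
df <- data.frame(groups = rep(1:2, each = 4), names = letters[1:8],
                 stringsAsFactors = FALSE)

by_group <- split(df$names, df$groups)
width <- 2  # window size, as in the rollapply call

# For each group, a list of overlapping windows: (a,b), (b,c), (c,d), ...
windows <- lapply(by_group, function(v) {
  starts <- seq_len(length(v) - width + 1)
  lapply(starts, function(s) v[s:(s + width - 1)])
})

# Set k is the k-th window of every group, concatenated in group order.
n_sets <- length(windows[[1]])
sets <- lapply(seq_len(n_sets), function(k) {
  unlist(lapply(windows, `[[`, k), use.names = FALSE)
})
names(sets) <- paste0("set", seq_len(n_sets))
sets
```

Each element of sets can be wrapped in data.frame() if a data frame per set is preferred.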
If I have a data frame as below, with the first row the column names (row names not included here)
A B C D E F G H I
a b c a a b c c c
1 2 3 4 5 6 7 8 9
How would I be able to create a new data frame such that:
a b c
1 2 3
4 6 7
5 NA 8
NA NA 9
Notice the NA for empty values.
UPDATE
If d.frame is the dataframe in question:
new.df <- data.frame();
firstrow <- d.frame[,1]
names <- unique(firstrow)
for (n in names) {
#cbind.fill comes from a third-party package (it is not in base R)
new.df <- cbind.fill(new.df, d.frame[3,which(firstrow == n)])
}
colnames(new.df) <- names;
I think that works well, but it isn't efficient and it relies on a third-party package. Any suggestions?
Here is another solution, based on the cbind.fill function from the question "cbind a df with an empty df (cbind.fill?)":
cbind.fill<-function(...){
nm <- list(...)
nm<-lapply(nm, as.matrix)
n <- max(sapply(nm, nrow))
do.call(cbind, lapply(nm, function (x)
rbind(x, matrix(, n-nrow(x), ncol(x)))))
}
df <- read.table(text = "A B C D E F G H I
a b c a a b c c c
1 2 3 4 5 6 7 8 9", header = T, as.is=T)
df <- as.matrix(df)
do.call(cbind.fill, split(df[2,], df[1,]))
And another solution:
df <- as.matrix(df)
lst <- split(df[2,], df[1,])
m <- max(sapply(lst, length))
result <- sapply(lst, function(x) {length(x) <- m; x})
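The last line works because assigning a larger value to length(x) pads a vector with NA. A small self-contained sketch of that idea follows; the vectors values and keys are typed in by hand from the example, and storing the values as numbers is an assumption made here for readability.

```r
# Hand-typed from the example: second row (keys) and third row (values).
keys   <- c("a", "b", "c", "a", "a", "b", "c", "c", "c")
values <- 1:9 + 0  # stored as numeric for illustration

lst <- split(values, keys)  # one vector per key
m <- max(lengths(lst))      # length of the longest group

# `length<-` extends each vector to m elements, filling with NA.
result <- sapply(lst, function(v) { length(v) <- m; v })
as.data.frame(result)
```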
Couldn't find a simple solution for this, so here's one option using base R as you requested in comments. This solution will work no matter how many columns you have in the original data
temp <- read.table(text = "A B C D E F G H I
a b c a a b c c c
1 2 3 4 5 6 7 8 9", header = T) # your data
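temp <- data.frame(t(temp), stringsAsFactors = FALSE) # keep the values as character
lengths <- table(temp[, 1]) # number of values per group
maxval <- max(lengths)
# names(lengths) is used instead of levels(), which returns NULL for character columns (the default in R >= 4.0)
data.frame(do.call(cbind, lapply(names(lengths), function(x) c(x, temp[temp[, 1] == x, 2], rep(NA, maxval - lengths[x])))))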
## X1 X2 X3
## 1 a b c
## 2 1 2 3
## 3 4 6 7
## 4 5 <NA> 8
## 5 <NA> <NA> 9
I would transpose the original two-row data.frame, create a "time" variable, use reshape to reorganize the data, and transpose the result.
Like this:
x <- t(mydf)
y <- data.frame(cbind(x, ave(x[, 1], x[, 1], FUN = seq_along)))
t(reshape(y, direction = "wide", idvar = "X1", timevar = "X3"))
# A B C
# X1 "a" "b" "c"
# X2.1 "1" "2" "3"
# X2.2 "4" "6" "7"
# X2.3 "5" NA "8"
# X2.4 NA NA "9"
I would like to join matrices by both colnames and rownames in R:
m1 = matrix(c(1,2,3, 11,12,13), nrow = 2, ncol = 3, byrow = TRUE,
dimnames = list(c("r1", "r2"),
c("a", "b", "c")))
m2 = matrix(c(4, 5, 0, 2,3,4), nrow = 2, ncol = 3, byrow = TRUE,
dimnames = list(c("r2", "r3"),
c("d", "b", "c")))
Check m1:
> m1
a b c
r1 1 2 3
r2 11 12 13
Check m2:
> m2
d b c
r2 4 5 0
r3 2 3 4
I want to get m3 which looks like this:
> m3
a b c d
r1 1 2 3 0
r2 11 17 13 4
r3 0 3 4 2
I didn't find an elegant way to do so. Using the rbind.fill.matrix function in package plyr, I can indirectly get m3.
require(plyr)
m3 = rbind.fill.matrix(m1, m2)
rownames(m3) = c(rownames(m1), rownames(m2))
m3[is.na(m3)]=0 # replace na with zero
m3 = t(sapply(by(m3,rownames(m3),colSums),identity)) # aggregate matrix by rownames
I guess there must be some better ways to do so. What's your suggestion?
The following seems valid:
tmp = rbind(as.data.frame(as.table(m1)), as.data.frame(as.table(m2)))
#tmp = aggregate(Freq ~ Var1 + Var2, tmp, sum) #unnecessary
xtabs(Freq ~ Var1 + Var2, tmp)
# Var2
#Var1 a b c d
# r1 1 2 3 0
# r2 11 17 13 4
# r3 0 3 4 2
Edit: As noted by @AnandaMahto, xtabs is a contingency-table function rather than a data-reshaping one, so it sums by default.
I used this code:
m1 = m1[sort(rownames(m1)),sort(colnames(m1))]
m2 = m2[sort(rownames(m2)),sort(colnames(m2))]
nr = unique(c(rownames(m1),rownames(m2)))
nc = unique(c(colnames(m1),colnames(m2)))
m3 = matrix(0,nr=length(nr),nc=length(nc),dimnames=list(nr,nc))
m3[rownames(m3)%in%rownames(m1),colnames(m3)%in%colnames(m1)]=m1
m3[rownames(m3)%in%rownames(m2),colnames(m3)%in%colnames(m2)]=m3[rownames(m3)%in%rownames(m2),colnames(m3)%in%colnames(m2)]+m2
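The logical indexing above depends on the rows and columns being sorted consistently. A variant that indexes by name avoids that dependency; the sketch below is one way to write it, assuming every matrix has unique row and column names. The function name add_by_names is hypothetical.

```r
m1 <- matrix(c(1, 2, 3, 11, 12, 13), nrow = 2, byrow = TRUE,
             dimnames = list(c("r1", "r2"), c("a", "b", "c")))
m2 <- matrix(c(4, 5, 0, 2, 3, 4), nrow = 2, byrow = TRUE,
             dimnames = list(c("r2", "r3"), c("d", "b", "c")))

# Hypothetical helper: sums any number of matrices by row and column name.
add_by_names <- function(...) {
  mats <- list(...)
  rn <- sort(unique(unlist(lapply(mats, rownames))))
  cn <- sort(unique(unlist(lapply(mats, colnames))))
  out <- matrix(0, nrow = length(rn), ncol = length(cn),
                dimnames = list(rn, cn))
  for (m in mats) {
    # Character indices pick the matching cells regardless of their order.
    out[rownames(m), colnames(m)] <- out[rownames(m), colnames(m)] + m
  }
  out
}

m3 <- add_by_names(m1, m2)
m3
```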
I have the following issue, which I could only partly solve:
set.seed (1234)
mydf <- data.frame (var1a = sample (c("TA", "AA", "TT"), 5, replace = TRUE),
varb2 = sample (c("GA", "AA", "GG"), 5, replace = TRUE),
varAB = sample (c("AC", "AA", "CC"), 5, replace = TRUE)
)
mydf
var1a varb2 varAB
1 TA AA CC
2 AA GA AA
3 AA GA AC
4 AA AA CC
5 TT AA AC
I want to split each two-letter value into separate columns and then order the letters alphabetically.
Edit: Ordering can be done before the split, so that for example the var1a value "TA" becomes "AT", or after the split, so that var1aa is "A" and var1ab is "T" (instead of "T", "A").
So the sorting is within each cell.
split_col <- function(.col, data){
.x <- colsplit( data[[.col]], names = paste0(.col, letters[1:2]))
}
Split each column and combine:
require(reshape)
splitdf <- do.call(cbind, lapply(names(mydf), split_col, data = mydf))
var1aa var1ab varb2a varb2b varABa varABb
1 T A A A C C
2 A A G A A A
3 A A G A A C
4 A A A A C C
5 T T A A A C
But the unsolved part is that I want to order each pair of columns so that column name "a" and column name "b" are ordered alphabetically within each row. Thus the expected output is:
var1aa var1ab varb2a varb2b varABa varABb
1 A T A A C C
2 A A A G A A
3 A A A G A C
4 A A A A C C
5 T T A A A C
How can I order (sort within each pair of variables)?
mylist <-as.list(mydf)
splits <- lapply(mylist, reshape::colsplit, names=c("a", "b"))
rowsort <- lapply(splits, function(x) t(apply(x, 1, sort)))
comb <- do.call(data.frame, rowsort)
comb
var1a.1 var1a.2 varb2.1 varb2.2 varAB.a varAB.b
1 A T A A C C
2 A A A G A A
3 A A A G A C
4 A A A A C C
5 T T A A A C
EDIT:
If names are important, you can replace them:
replaceNums <- function(x){
.which <- regmatches(x, regexpr("[[:alnum:]]*(?=\\.)", x, perl=TRUE)) # text before the literal dot
stopifnot(length(x) %% 2 == 0) #checkstep
paste0(.which, c("a", "b"))
}
names(comb) <- replaceNums(names(comb))
comb
var1aa var1ab varb2a varb2b varABa varABb
1 A T A A C C
2 A A A G A A
3 A A A G A C
4 A A A A C C
5 T T A A A C
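For comparison, the split, sort and rename steps can also be sketched in base R without the reshape package. The data frame below is typed in from the printed mydf rather than re-sampled, and split_sorted is a hypothetical helper name; the sketch assumes every cell holds exactly two characters.

```r
# Typed in from the printed mydf above (not re-sampled).
mydf <- data.frame(
  var1a = c("TA", "AA", "AA", "AA", "TT"),
  varb2 = c("AA", "GA", "GA", "AA", "AA"),
  varAB = c("CC", "AA", "AC", "CC", "AC"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split each cell into letters and sort them within the cell.
split_sorted <- function(col) {
  chars <- strsplit(as.character(col), "")
  t(vapply(chars, sort, character(2)))  # one row per cell, two columns
}

pieces <- lapply(mydf, split_sorted)
comb <- as.data.frame(do.call(cbind, pieces), stringsAsFactors = FALSE)

# Name the column pairs <original name>a and <original name>b.
names(comb) <- paste0(rep(names(mydf), each = 2), c("a", "b"))
comb
```

Setting the names directly from names(mydf) sidesteps the need to repair them afterwards with a regular expression.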