How to replace/shuffle one column with another in R?

I am very new to R and was wondering whether there is a way to swap two columns in a matrix.
I have an 800 x 12 matrix and want to replace column 1 with column 2 and column 2 with column 1. Can anyone help me, please?

x <- matrix(1:15,5,3) # create 5x3 matrix
x[,c(1,2)] <- x[,c(2,1)] # exchange columns 1 and 2

before <- data.frame(c1=1:3, c2=4:6)
after <- before[,c("c2", "c1")]
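As a small sketch of the difference between the two approaches above, on a matrix with named columns (the names a, b and c are made up for illustration): assigning into x[, c(1, 2)] swaps the values but leaves the column names where they were, whereas reordering with a column index moves the names along with the values.

```r
# Hypothetical 5x3 matrix with column names, for illustration only
x <- matrix(1:15, 5, 3, dimnames = list(NULL, c("a", "b", "c")))

# Approach 1: assign swapped values in place; names stay a, b, c
swapped_values <- x
swapped_values[, c(1, 2)] <- x[, c(2, 1)]

# Approach 2: reorder the columns; names travel with their values
reordered <- x[, c(2, 1, 3)]

colnames(swapped_values)
colnames(reordered)
```

Which one is appropriate depends on whether the column names should follow the data. For the 800 x 12 matrix in the question, the second form would need all twelve indices, for example c(2, 1, 3:12).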

Related

replace a column of a matrix by consecutive numbers

I have generated a matrix of dimension 5x3. Now I want to replace my 2nd column with the values 1 to 3, repeated, so that the column values become
[,2]
1
2
3
1
2
I am getting an error message like this:
Error in hdcell[, 2] <- (1:3) :
number of items to replace is not a multiple of replacement length
I am new to R. I know it is a simple question.
You can make @Martin Gal's answer work for a matrix or data frame with any number of rows with
hdcell[, 2] <- rep_len(1:3, nrow(hdcell))
Just in case. :)
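A short sketch of why this works, assuming hdcell is a 5x3 matrix as in the question (its contents here are placeholders): 1:3 has length 3, which does not divide 5, so R refuses to recycle it in the assignment. rep_len repeats the pattern and cuts it to exactly nrow(hdcell) values.

```r
# Placeholder 5x3 matrix standing in for the asker's hdcell
hdcell <- matrix(0, nrow = 5, ncol = 3)

# Repeat 1:3 until there is one value per row, then assign
hdcell[, 2] <- rep_len(1:3, nrow(hdcell))

hdcell[, 2]
```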

Excluding a number of answers from a R dataframe

I'm looking for a way to exclude a number of answers from a length function.
This is a follow-on question from "Getting R Frequency counts for all possible answers". In SQL the syntax could be
select * from someTable
where variableName not in ( 0, null )
Given
Id <- c(1,2,3,4,5)
ClassA <- c(1,NA,3,1,1)
ClassB <- c(2,1,1,3,3)
R <- c(5,5,7,NA,9)
S <- c(3,7,NA,9,5)
df <- data.frame(Id,ClassA,ClassB,R,S)
ZeroTenNAScale <- c(0:10,NA);
R.freq = setNames(nm=c('R','freq'),data.frame(table(factor(df$R,levels=ZeroTenNAScale,exclude=NULL))));
S.freq = setNames(nm=c('S','freq'),data.frame(table(factor(df$S,levels=ZeroTenNAScale,exclude=NULL))));
length(S.freq$freq[S.freq$freq!=0])
# 5
How would I change
length(S.freq$freq[S.freq$freq!=0])
to get an answer of 4 by excluding 0 and NA?
We can use colSums,
colSums(!is.na(S.freq)[S.freq$freq!=0,])[[1]]
#[1] 4
You can use sum to add up the frequencies. If the NAs were in the column being summed you could pass na.rm = TRUE; however, because the NA is located in a different column, you first need to remove the row containing it.
The solution is as follows: remove the rows containing NA by subsetting S.freq[!is.na(S.freq$S),], and select the second column, freq. Note that this counts the non-NA observations, which equals the number of distinct values here only because each value occurs once:
sum(S.freq[!is.na(S.freq$S), "freq"])
# 4
You can try na.omit (to remove NAs) and subset (to get rid of all rows where freq equals 0):
subset(na.omit(S.freq), freq != 0)
   S freq
4  3    1
6  5    1
8  7    1
10 9    1
From here, that's straightforward:
length(subset(na.omit(S.freq), freq != 0)$freq)
[1] 4
Does it solve your problem?
Just add !is.na(S.freq$S) as a second filter:
length(S.freq$freq[S.freq$freq!=0 & !is.na(S.freq$S)])
If you want to extend it with other conditions, you could make an index vector first for readability:
idx <- S.freq$freq!=0 & !is.na(S.freq$S)
length(S.freq$freq[idx])
You're looking for values with frequency > 0, which means you're looking for unique values. You get this information directly from vector S:
length(unique(df$S))
and leaving NA aside you get answer 4 by:
length(unique(df$S[!is.na(df$S)]))
Regarding your question on how to exclude a number of items based on their value:
In R this is easily done with logical vectors, as you already do in your code:
length(S.freq$freq[S.freq$freq!=0])
You can combine different conditions into one logical vector and use it for subsetting, e.g.
length(S.freq$freq[S.freq$freq!=0 & !is.na(S.freq$S)])
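The closest R counterpart to the SQL "not in" from the question is probably a negated %in%. The following is a minimal sketch working on the raw vector S rather than on the frequency table; the names excluded, kept and n_distinct are made up for illustration. Unlike ==, %in% treats NA as matchable, so NA can be listed among the values to exclude.

```r
S <- c(3, 7, NA, 9, 5)

# Values to leave out, mirroring "not in (0, null)"
excluded <- c(0, NA)

# %in% matches NA against NA, so the negation drops it
kept <- S[!(S %in% excluded)]

# Number of distinct remaining answers
n_distinct <- length(unique(kept))
n_distinct
```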

Extract vector from matrix based on column index changing by row

I am struggling with a theoretically simple problem with R:
say I have the following matrix:
a <- matrix(1:16,ncol=4)
and the following vector showing the column position I need to extract for each row:
b <- c(4,3,1,1)
I need to return the following vector:
[1] 13 10 3 4
In other words, for each row I need to extract the element whose column position is given by the corresponding value of b.
I have searched extensively on this site but could not find a solution.
Can anyone please help me? Thanks
You can try
a[cbind(1:nrow(a), b)]
#[1] 13 10 3 4
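To spell out what the one-liner does: indexing a matrix with a two-column matrix treats each row of the index as a (row, column) pair. The sketch below builds that index explicitly and, for comparison, computes the same elements from column-major linear positions; the variable names idx, by_matrix_index and by_linear_index are made up for illustration.

```r
a <- matrix(1:16, ncol = 4)
b <- c(4, 3, 1, 1)

# One (row, column) pair per row of a: row i is paired with column b[i]
idx <- cbind(seq_len(nrow(a)), b)
by_matrix_index <- a[idx]

# Same elements via linear positions: R stores matrices column by column,
# so element (i, j) sits at position (j - 1) * nrow(a) + i
by_linear_index <- a[(b - 1) * nrow(a) + seq_len(nrow(a))]

by_matrix_index
```

seq_len(nrow(a)) is used instead of 1:nrow(a) so that a matrix with zero rows yields an empty index rather than c(1, 0).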

Filling matrix with data from two columns of dataframe

I am desperate enough that I am ready to lose some more rep points, but I have to ask.
(Yes, I read some threads about it.)
I created a data frame with only the 2 columns I want to put into the matrix (I didn't know how to pick just 2 columns from the whole data set):
tbl_corel <- tbl_end[,c("diff", "abund_mean")]
In the next step I created an empty matrix:
## Creating a empty matrix to check the correlation between diff and abund_mean
mat_corel <- matrix(0, ncol = 2)
colnames(mat_corel) <- c("diff", "abund_mean")
I tried to use that function to fill the matrix with the data:
mat_corel <- matrix(tbl_corel), nrow = 676,ncol = 2)
Of course I had to check manually how many rows I have in my data frame...
It doesn't work.
Tried that function as well:
mat_corel[ as.matrix(tbl_corel) ] <- 1
It doesn't work. I'd be so grateful for the help.
  diff abund_mean
1    0 3444804.80
2    0  847887.02
3    0   93654.19
4    0  721692.76
5    0  382711.04
6    1  428656.66
6 1 428656.66
If you want to create a matrix from your two-column data frame, there is a simpler and more direct way: just convert your data frame to a matrix:
mat_corel <- as.matrix(tbl_corel)
But if you just want to compute a correlation coefficient, you can do it directly from your data frame:
cor(tbl_end$diff, tbl_end$abund_mean)
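As a self-contained sketch of both suggestions, the six rows shown in the question are typed in by hand below as a stand-in for the real tbl_end, which is not available here. as.matrix keeps the column names and picks up the number of rows itself, so nothing has to be counted manually. The name cor_matrix is made up for illustration.

```r
# Stand-in data: only the six rows printed in the question
tbl_corel <- data.frame(
  diff = c(0, 0, 0, 0, 0, 1),
  abund_mean = c(3444804.80, 847887.02, 93654.19,
                 721692.76, 382711.04, 428656.66)
)

# Data frame to numeric matrix; dimensions and column names carry over
mat_corel <- as.matrix(tbl_corel)
dim(mat_corel)

# cor() on a matrix returns the full correlation matrix of its columns
cor_matrix <- cor(mat_corel)
cor_matrix["diff", "abund_mean"]
```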

R how to produce a vector trimming another vector by choosing a fixed value of its components

This is an elementary question; I apologize for it.
Let x <- c(1,2,3,4,5). I would like to produce a vector z of length 5 in which every component of x greater than 2 is replaced by 2, i.e. if x[i]>2 then write 2.
The result should look like
z <- c(1,2,2,2,2)
I know that
z <- which(x>2)
gives me
3 4 5
but I cannot find a good way to implement it to arrive at the result.
I thank you all for your support.
EDIT. If instead of considering a vector x I have a matrix M with columns x and y and I want to apply the above trimming to the column x leaving y untouched, how should I proceed?
You can use pmin:
pmin(x, 2)
# [1] 1 2 2 2 2
Another option is logical indexing:
y <- x
y[x>2] <- 2
y
# [1] 1 2 2 2 2
If you have a matrix M with two columns and want to replace only the values > 2 in the first column with 2, then do:
M[,1][M[,1]>2] <- 2
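The pmin approach carries over to the matrix case as well. A minimal sketch, assuming M has columns named x and y as in the question's edit (the y values are invented for illustration):

```r
# Hypothetical two-column matrix; only x should be capped
M <- cbind(x = c(1, 2, 3, 4, 5), y = c(10, 20, 30, 40, 50))

# Cap column x at 2 element-wise and write it back; y is untouched
M[, "x"] <- pmin(M[, "x"], 2)

M
```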
