I have a matrix that has some unique rows and I would like to get the row names of those unique rows only.
m <- matrix( data = c(1,1,2,1,1,2,1,1,2), ncol = 3 )
If the expected row index is 3, because the other two rows are duplicates of each other, use duplicated() in both directions to get a logical index and wrap it in which() to get the numeric index.
which(!(duplicated(m)|duplicated(m,fromLast=TRUE)))
#[1] 3
If we consider the 1st and 3rd as the unique rows, then
which(!duplicated(m))
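Since the question asks for row names rather than indices, the same logical index can be used on rownames(). A minimal sketch, assuming the matrix has row names (the names r1 to r3 are made up for illustration and are not part of the original example):

```r
# Example matrix from the question; the row names are hypothetical.
m <- matrix(data = c(1, 1, 2, 1, 1, 2, 1, 1, 2), ncol = 3)
rownames(m) <- c("r1", "r2", "r3")

# Rows that have no duplicate anywhere in the matrix.
no_dup <- !(duplicated(m) | duplicated(m, fromLast = TRUE))
strictly_unique <- rownames(m)[no_dup]

# First occurrence of every distinct row.
first_seen <- rownames(m)[!duplicated(m)]
```

With this matrix, strictly_unique holds only the name of the third row, while first_seen holds the names of the first and third rows.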
I have a 93060-by-141 matrix file with filled values.
I need to assign zeros to some rows and columns under this condition: rows 1:65 of column 1, rows 67:131 of column 2, rows 133:197 of column 3, and so on.
Some columns are excluded from this condition, i.e. the values in columns 10, 20 and 36 stay unchanged.
I think I will need a for loop, but I have no idea how to code the link between rows and columns that this condition expresses.
You can change multiple values at once by indexing a matrix with a two-column matrix of indices. Each row of the index matrix gives the row number (first column) and column number (second column) of one cell. For example, my_mat[cbind(1:10, 1)] <- 0 would set the first through tenth rows of column 1 to zero. In your case, you could put together several such cbind() statements with a call to rbind(). For example, my_mat[rbind(cbind(1:5, 1), cbind(6:10, 2))] <- 0 would set the first through fifth rows of column 1 and the sixth through tenth rows of column 2 to zero. For the example you proposed above, it would be something like this:
my_mat[rbind(
cbind(1:65, 1),
cbind(67:131, 2),
cbind(133:197,3))] <- 0
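For 141 columns it may be easier to build the index matrix programmatically than to type every cbind() call. The sketch below is only an illustration: zero_blocks() is a hypothetical helper, and it assumes that the block for column j starts at row (j - 1) * stride + 1 and that skipped columns still use up their slot in the pattern. The question only shows the first three blocks (length 65, starting every 66 rows), so check that assumption against your data.

```r
# Hypothetical helper: zero a block of rows in each column, skipping some columns.
# Assumes column j owns rows (j - 1) * stride + 1 up to that start + block_len - 1.
zero_blocks <- function(mat, block_len, stride, skip_cols = integer(0)) {
  cols <- setdiff(seq_len(ncol(mat)), skip_cols)
  # One (row, column) index matrix per column, stacked with rbind().
  idx <- do.call(rbind, lapply(cols, function(j) {
    start <- (j - 1) * stride + 1
    cbind(start:(start + block_len - 1), j)
  }))
  mat[idx] <- 0
  mat
}

# Small demonstration: blocks of 3 rows, starting every 4 rows, column 2 untouched.
small <- matrix(1, nrow = 12, ncol = 3)
res <- zero_blocks(small, block_len = 3, stride = 4, skip_cols = 2)
```

Under the same assumption, the call for the matrix in the question would be zero_blocks(my_mat, block_len = 65, stride = 66, skip_cols = c(10, 20, 36)).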
I have a dataframe which has 15 columns. All values are numeric.
I have a vector of numeric values ranging from 1 to 15. Let's say x = c( 5,7,2,8,13,5,6...).
From each row of the data frame, I need to get the value in the column whose number is given by the corresponding vector element.
For example, using vector x, pull the 5th value from the first row, the 7th value from the 2nd row, the 2nd value from the 3rd row, etc.
PS: I'm nowhere in this
For anyone interested:
data[ cbind(1:nrow(data), x) ]
where data is our data frame with 15 columns and x is the vector of column numbers.
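A small worked example of this matrix indexing, using made-up data with three columns instead of fifteen. Indexing a data frame with a two-column matrix goes through as.matrix(), so it works cleanly when all columns are numeric, as in the question.

```r
# Made-up data frame; all columns numeric, as in the question.
data <- data.frame(a = 1:3, b = 4:6, c = 7:9)

# x[i] is the column to read from row i.
x <- c(2, 3, 1)

# Each row of the index matrix is one (row, column) pair.
picked <- data[cbind(seq_len(nrow(data)), x)]
```

Here picked contains the value from row 1, column 2, then row 2, column 3, then row 3, column 1.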
The vector has duplicate IDs which I would like to preserve:
vector <- c(1,1,1,1,2,2,3,5)
That is, it should return the first row 4 times, then the second row twice, the third row once, etc.
Typically when doing this I use something like:
df2 <- subset(df1, ID %in% vector)
However, this only returns the unique elements, i.e. the 1st row once, the 2nd row once, the 3rd row once, etc.
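One possible approach, sketched here under assumptions: %in% only tests membership, so the result keeps the rows and order of df1. match() instead returns one row position per element of the ID vector, which preserves the repeats. The data frame df1 below is made up, the ID vector is called ids to avoid masking base::vector(), and the sketch assumes that each ID occurs once in df1 and that every requested ID is present (a missing ID would give a row of NAs).

```r
# Made-up data frame with one row per ID.
df1 <- data.frame(ID = c(1, 2, 3, 4, 5), val = c("a", "b", "c", "d", "e"))

# IDs to look up, including repeats (the vector from the question).
ids <- c(1, 1, 1, 1, 2, 2, 3, 5)

# match() gives the row position in df1 for every element of ids.
df2 <- df1[match(ids, df1$ID), ]
```

df2 then has eight rows, in the same order as ids.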
I have a data frame with many rows and columns (3000x37) and I want to select only the rows that have NA in at least 2 columns. The columns hold different data types. I know how to do this for a single column:
df[is.na(df$col.name), ]
How do I make this selection for two (or more) columns?
First create a vector nn with the number of NAs in each row, then select only the rows with at least 2 NAs using d[nn>=2,]:
d = data.frame(x=c(NA,1,2,3), y=c(NA,"a",NA,"c"))
nn = apply(d, 1, FUN=function (x) {sum(is.na(x))})
d[nn>=2,]
#   x    y
#1 NA <NA>
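As an alternative sketch of the same idea, rowSums() on the logical matrix from is.na() counts the NAs per row without apply(). The column selection in the second part is an assumption about what "two (or more) columns" means, namely counting NAs only in a chosen set of columns.

```r
d <- data.frame(x = c(NA, 1, 2, 3), y = c(NA, "a", NA, "c"))

# Number of NAs in each row, counted over all columns.
na_count <- rowSums(is.na(d))
res <- d[na_count >= 2, ]

# Same idea restricted to selected columns (here both x and y must be NA).
cols <- c("x", "y")
res_cols <- d[rowSums(is.na(d[, cols, drop = FALSE])) == length(cols), ]
```

For this data, both res and res_cols contain only the first row.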
I have a data frame with 23000 rows and 8 columns.
I want to subset it so that each identifier in column 1 appears only once. I do this with:
total_res2 <- unique(total_res['Entrez.ID'])
This produces 17,000 rows with only the information from column 1.
I am wondering how to extract the rows that are unique with respect to this column while also keeping the information from the other 7 columns.
This returns the rows of total_res containing the first occurrences of each Entrez.ID value:
subset(total_res, ! duplicated( Entrez.ID ) )
Or, if you meant that you only want rows whose Entrez.ID occurs exactly once:
subset(total_res, ave(seq_along(Entrez.ID), Entrez.ID, FUN = length) == 1 )
Next time please provide test data and expected output.
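In the absence of test data, a sketch with made-up values shows the difference between the two readings. The score column and all values are invented for illustration.

```r
# Made-up data: IDs 10 and 30 occur twice, IDs 20 and 40 occur once.
total_res <- data.frame(Entrez.ID = c(10, 10, 20, 30, 30, 40), score = 1:6)

# Reading 1: keep the first row for each Entrez.ID (all columns retained).
first_rows <- subset(total_res, !duplicated(Entrez.ID))

# Reading 2: keep only rows whose Entrez.ID occurs exactly once.
single_rows <- subset(
  total_res,
  ave(seq_along(Entrez.ID), Entrez.ID, FUN = length) == 1
)
```

first_rows keeps one row each for IDs 10, 20, 30 and 40, while single_rows keeps only the rows for IDs 20 and 40.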