How can one determine the row index-numbers corresponding to particular row names? I have a vector of row names, and I would like to use these to obtain a vector of the corresponding row indices in a matrix.
I tried row() and as.integer(rownames(matrix.object)), but neither seems to work.
In addition to which, you can look at match:
m <- matrix(1:25, ncol = 5, dimnames = list(letters[1:5], LETTERS[1:5]))
vec <- c("e", "a", "c")
match(vec, rownames(m))
# [1] 5 1 3
Try which:
which(rownames(matrix.object) %in% c("foo", "bar"))
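The two answers above do not return the indices in the same order, which matters if the result is used for subsetting. A small sketch that reuses the example matrix from the match answer (idx_match and idx_which are names made up for illustration):

```r
m <- matrix(1:25, ncol = 5, dimnames = list(letters[1:5], LETTERS[1:5]))
vec <- c("e", "a", "c")

# match() returns one index per element of vec, in the order of vec
idx_match <- match(vec, rownames(m))

# which(... %in% ...) returns the matching positions in the order of the rows
idx_which <- which(rownames(m) %in% vec)

idx_match
idx_which
```

So match is the one to use when the order of the names matters, and it gives NA for a name that is not a row name, whereas which with %in% silently drops it.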
I have
d <- matrix(rnorm(6), ncol = 1,
dimnames = list(c("a", "a1", "d", "e", "f", "f2"), NULL))
I want to sort the rows in the following order: a1, a, e, d, f2, f.
Notes:
I am looking for a general solution. Of course, I know how to do it for this specific matrix.
Row names can be any kind of string, so string-based operations on the names won't work.
Matrix d will not have more than 16-20 entries, so don't worry about speed.
The matrix has always an even number of rows.
We can subset row.names(d) with recycled logical vectors to pull out the even- and odd-positioned names, interleave them by rbind-ing the two vectors into a matrix, drop the dim attribute with c() (which flattens the matrix column by column into a vector), and use the result as the row index:
d[c( rbind(row.names(d)[c(FALSE, TRUE)],
row.names(d)[c(TRUE, FALSE)])),, drop = FALSE]
# [,1]
#a1 -0.43704092
#a 0.41215035
#e 1.47443155
#d -1.78087570
#f2 -0.01673482
#f 0.98952497
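To see why this works, here are the intermediate steps on the row names alone. This is only a sketch; rn, evens, odds, pairs and new_order are names made up for illustration:

```r
rn <- c("a", "a1", "d", "e", "f", "f2")

evens <- rn[c(FALSE, TRUE)]  # recycled mask keeps positions 2, 4, 6
odds  <- rn[c(TRUE, FALSE)]  # recycled mask keeps positions 1, 3, 5

pairs <- rbind(evens, odds)  # 2 x 3 character matrix, one swapped pair per column
new_order <- c(pairs)        # c() drops the dim attribute, reading column by column

new_order
```

Because the matrix is flattened column by column, each column contributes the even-positioned name followed by the odd-positioned one, which is exactly the swapped order.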
Switching every pair of rows gives us row numbers 2, 1, 4, 3, 6, 5, etc. Hence, after the transformation the n-th row is the (n - (-1)^n)-th row in the original matrix.
Thus, the order of rows that you want is 1:nrow(d) - (-1)^(1:nrow(d)):
d[1:nrow(d) - (-1)^(1:nrow(d)), , drop = FALSE]
# [,1]
# a1 0.1228430
# a -1.4051684
# e -0.7928203
# d 1.3270429
# f2 0.3554126
# f -1.1388026
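A quick check of the index formula on its own, assuming six rows as in the example (n and swapped are illustrative names):

```r
n <- 1:6

# (-1)^n is -1 for odd n and +1 for even n,
# so odd positions move up by one and even positions move down by one
swapped <- n - (-1)^n

swapped
```

This relies on the number of rows being even, as stated in the question. With an odd number of rows the last index would point one past the end of the matrix.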
I want to create a numeric vector in R with a placeholder, just like in a character vector such as:
characterVec <- c("a", "b", "", "d")
This gives me a characterVec vector with a length of 4.
How can I create a numeric vector with a length of 4 that still has one empty value? In other words, what should I put in place of the question mark in the following vector?
numericVec <- c(1, 2, ?, 4)
If I'm understanding your question properly, you can use a named vector to create a data dictionary linking letters to corresponding numbers:
# data dictionary
dat <- 1:26
names(dat) <- letters
then map dictionary onto your vector
characterVec <- c("a", "b", "", "d")
numVec <- dat[characterVec]
gives
a b <NA> d
1 2 NA 4
You can remove the vector names with unname():
numVec <- unname(dat[characterVec])
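Note that the empty slot in the result above is NA, R's missing-value marker, and it can also be typed directly into a numeric vector. This is a minimal sketch of that more direct route, not part of the dictionary approach above:

```r
# NA acts as the numeric placeholder; the vector stays numeric
numericVec <- c(1, 2, NA, 4)

length(numericVec)              # the NA still counts as an element
is.na(numericVec)               # flags the placeholder position
mean(numericVec, na.rm = TRUE)  # most summaries need na.rm = TRUE to skip it
```

Unlike "" in a character vector, NA propagates through arithmetic, so functions such as mean or sum return NA unless na.rm = TRUE is set.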
From an old R thread captured in nabble the indication is that three separate operations are required to obtain the result described in the title of this post http://r.789695.n4.nabble.com/To-give-column-names-of-a-data-frame-td2249996.html:
results <- data.frame(matrix(c(1,2,3,4),nrow=2,ncol=2))
rownames(results) <- c("a","b")
colnames(results) <- c("c","d")
Can these be collapsed into a single operation?
We can use setNames together with the row.names argument of data.frame to set both in one line:
setNames(data.frame(matrix(c(1,2,3,4),nrow=2,ncol=2), row.names=c("a","b")), c("c", "d"))
# c d
#a 1 3
#b 2 4
You can use the option dimnames which is part of the matrix function. The first part of dimnames are the row names, the second part the column names.
data.frame(matrix(c(1,2,3,4),nrow = 2, ncol = 2, dimnames = list(c("a","b"), c("c","d"))))
The difference between matrix(c(1,2,3,4),nrow = 2, ncol = 2, dimnames = list(c("a","b"), c("c","d"))) and the previous line is that the matrix call on its own gives you a matrix with a dimnames attribute. The data.frame line transforms that matrix into a data.frame with row names and column headers.
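As a sanity check, the two one-liners can be compared side by side. This is a sketch; via_setnames and via_dimnames are names made up for illustration:

```r
# Approach 1: setNames() for the column names, row.names= for the row names
via_setnames <- setNames(
  data.frame(matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2), row.names = c("a", "b")),
  c("c", "d")
)

# Approach 2: put both sets of names on the matrix through dimnames
via_dimnames <- data.frame(
  matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2,
         dimnames = list(c("a", "b"), c("c", "d")))
)

rownames(via_dimnames)
colnames(via_dimnames)
all.equal(via_setnames, via_dimnames)
```

The dimnames version needs no helper function, but it only applies when the data start out as a matrix.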
I have a dataframe in which the 1st element of an associated 'name' vector is related to subsequent named numerical vectors. I am attempting to replace the meaningless number with the 1st element of the associated name vector.
Here is an example dataframe:
df <- data.frame(data.0.name = c("A", "A", "A"), data.0.one_minute_ago = c(1,2,1), data.0.one_hour_ago = c(2,2,3),
data.1.name = c("B", "B", "B"), data.1.one_minute_ago = c(3,3,2), data.1.one_hour_ago = c(5,6,2))
Each number.name vector is associated with a construct (either A or B in this case) and each number.time is associated with a time dimension. So, data.0.one_minute_ago is actually the number of A's you had one_minute_ago.
What I would like to do (because I have a large dataset with lots of these transformations) is to replace number.dimension with construct.dimension in the column names, and of course do that for each number from 0 to 9.
I've written some grep code to begin this task, but to no avail (I am stuck on retaining everything after the number).
grep( "data.[0-9].name" ,names(df), perl=TRUE)
as.character(df[1, 1])
as.character(df[1, 4])
as.character(names(df[2]))
as.character(names(df[3]))
as.character(names(df[5]))
as.character(names(df[6]))
df.1 <- (df[1, grep( "data.[0-9].name" ,names(df))])
df.1 <- data.frame(lapply(df.1, as.character), stringsAsFactors=FALSE)
constructs <- as.character(df.1[1,c(1:2)])
Here the 1st and 2nd element of constructs are the constructs associated with 0.name/0.dimension and 1.name/1.dimension respectively.
constructs [1]
constructs [2]
From there, I'm fairly certain the code would involve some names(df)[] <- but am uncertain on where to go from here.
Any and all help appreciated.
EDIT: here is the desired variable name output. It simply changes the variable names (and of course retains the values associated with them):
data.A.name data.A.one_minute_ago data.A.one_hour_ago data.B.name data.B.one_minute_ago data.B.one_hour_ago
EDIT 2: In my true dataset, the number of repetitions per dimensions (i.e., one_minute_ago, one_hour_ago, one_day_ago) can vary across construct (i.e, two dimensions for one construct and 3 for another, and 9 for another). I would like the solution to take that into account.
Here is a modified sample dataset to reflect this subtlety:
df <- data.frame(data.0.name = c("A", "A", "A"), data.0.one_minute_ago = c(1,2,1), data.0.one_hour_ago = c(2,2,3),
data.1.name = c("B", "B", "B"), data.1.one_minute_ago = c(3,3,2), data.1.one_hour_ago = c(5,6,2),
data.2.name = c("C", "C", "C"), data.2.one_minute_ago = c(3,3,2), data.2.one_hour_ago = c(5,6,2), data.2.one_day_ago = c(3,2,3))
We create a grouping variable 'indx' from the number in each column name and split the column names by 'indx' into a list ('lst'). We then take one value from each column whose name ends in 'name' ('r1'), and use 'Map' with gsub to replace the number in each element of 'lst' with the corresponding element of 'r1'. Finally, unsplit puts the renamed columns back in their original order.
indx <- gsub('[^0-9]+', '', names(df))
lst <- split(names(df), indx)
r1 <- as.character(unlist(df[1,grep('name', names(df))]))
lst2 <- Map(function(x,y) gsub('[0-9]+', y, x), lst, r1)
names(df) <- unsplit(lst2, indx)
names(df)
# [1] "data.A.name" "data.A.one_minute_ago" "data.A.one_hour_ago"
#[4] "data.B.name" "data.B.one_minute_ago" "data.B.one_hour_ago"
#[7] "data.C.name" "data.C.one_minute_ago" "data.C.one_hour_ago"
#[10] "data.C.one_day_ago"
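The same idea can also be written as a lookup from number to construct, which some may find easier to follow. This is a sketch under the assumption that every column is named data.<number>.<suffix>, as in the question; nms, num, name_cols, lookup and new_names are names made up for illustration, and the data frame is a cut-down version of the example:

```r
df <- data.frame(data.0.name = c("A", "A"), data.0.one_minute_ago = c(1, 2),
                 data.1.name = c("B", "B"), data.1.one_minute_ago = c(3, 3),
                 data.1.one_day_ago = c(5, 6))

nms <- names(df)

# the number embedded in each column name
num <- sub("^data\\.([0-9]+)\\..*$", "\\1", nms)

# build a lookup: number -> first value of the matching 'name' column
name_cols <- grep("^data\\.[0-9]+\\.name$", nms, value = TRUE)
lookup <- vapply(df[name_cols], function(x) as.character(x[1]), character(1))
names(lookup) <- sub("^data\\.([0-9]+)\\.name$", "\\1", name_cols)

# swap the number for the construct, keeping everything after it
new_names <- paste0("data.", lookup[num], sub("^data\\.[0-9]+", "", nms))
names(df) <- new_names

names(df)
```

Since the lookup is indexed per column, it does not matter how many dimensions each construct has.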
I think this works:
library(stringr)
splits <- str_split(names(df), "\\.")
trailing_name <- sapply(splits, "[[", 3)
constructs <- rep(constructs, each = 3)
constructs
# [1] "A" "A" "A" "B" "B" "B"
names(df) <- str_c("data", constructs, trailing_name, sep=".")
names(df)
# [1] "data.A.name" "data.A.one_minute_ago" "data.A.one_hour_ago" "data.B.name"
# [5] "data.B.one_minute_ago" "data.B.one_hour_ago"
I have two data.frames:
pattern <- data.frame(pattern = c("A", "B", "C", "D"), val = c(1, 1, 2, 2))
match <- data.frame(match = c("A", "C"))
I want to add another column called new_val to my data.frame pattern, assigning "X" to each row where the value in column pattern appears in the data.frame match, and "Y" otherwise.
is.element(pattern$pattern, match$match)
[1] TRUE FALSE TRUE FALSE
So, the resulting data.frame should look like:
pattern val new_val
1 A 1 X
2 B 1 Y
3 C 2 X
4 D 2 Y
I managed to do it with an ugly for-loop, but I am sure this can be done in a one-line R command using fancy stuff :-)
Is anyone able to help?
Many thanks!
I'm only really posting this since Tyler said "if you wanted a one liner data.table would likely do it" and I knew it was definitely possible with a one liner in base. I am also assuming match had been renamed to mat.
pattern$new_val <- c("Y", "X")[(pattern$pattern %in% mat)+1]
pattern
# pattern val new_val
#1 A 1 X
#2 B 1 Y
#3 C 2 X
#4 D 2 Y
pattern$pattern %in% mat is finding which of the elements of pattern are in mat which returns TRUE if it's in mat, FALSE if it's not. Then I add 1 to make it numeric in the range of 1-2 so that it can be used for indexing. Then we use that as an index to the self defined vector c("Y", "X") and since the index we created is always 1 or 2 we're always able to grab an element of interest. So in this case we'll grab "Y" if pattern wasn't in mat and "X" if it was - which is what you wanted.
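Here is the indexing trick broken into steps, next to the ifelse version that many people reach for first. This is a sketch that assumes mat is the character vector c("A", "C"); hit, by_index and by_ifelse are names made up for illustration:

```r
pattern <- data.frame(pattern = c("A", "B", "C", "D"), val = c(1, 1, 2, 2))
mat <- c("A", "C")

hit <- pattern$pattern %in% mat     # logical: is each value present in mat?

# adding 1 coerces FALSE/TRUE to 1/2, which then picks "Y" or "X"
by_index <- c("Y", "X")[hit + 1]

# the same result with ifelse()
by_ifelse <- ifelse(hit, "X", "Y")

by_index
by_ifelse
```

Both are one-liners once hit is inlined, so the choice between them is mostly a matter of readability.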
Here's one way (I renamed your match to mat since there's a pretty important base function named match that you could actually use to solve this problem; in fact %in% is a form of match):
pattern <- data.frame(pattern = c("A", "B", "C", "D"), val = c(1, 1, 2, 2))
mat <- c("A", "C")
pattern$new_val <- "Y" #pre allot everything to be Y
pattern$new_val[pattern$pattern %in% mat] <- "X" #replace any A or C with an X
pattern
PS if you wanted a one liner data.table would likely do it.
If you wanted something a little more complicated you could use a function from a package I'm working on:
library(qdap)
#original problem
pattern$new_val <- text2color(pattern$pattern, list(c("A", "C")), c("X", "Y"))
#extending it
#makes D a 5
text2color(pattern$pattern, list(c("A", "C"), "D"), c("X", 5, "Y"))
This function really is designed to do something else but if you want to grab the essential parts of it you can look at the source code.