Call names of a vector or list - r

I have a vector named "vec". I label the elements from "a" to "m"
vec <- c(1,1,1,2,2,2,2,2,2,4,4,4,4)
names(vec) <- c("a","b","c","d","e","f","g","h","i","j","k","l","m")
Then I split the vec according to the sequences.
split_vec <- split(vec, vec)
Now when I type
split_vec$"1" I get the first list element.
Instead of typing the specific name "1", I want to get the values
with something like
split_vec$vec[1]
But the above doesn't work. Is there a way to do that?

You can do
split_vec[[as.character(vec[1])]]
# a b c
# 1 1 1
Notice that you need as.character: [[ treats a bare number from vec[i] as a position rather than a name, so a call like split_vec[[vec[10]]] asks for the fourth element of a three-element list instead of the element named "4" (the third one).
split_vec[[vec[10]]]
# Error in split_vec[[vec[10]]] : subscript out of bounds
split_vec[[as.character(vec[10])]]
# j k l m
# 4 4 4 4
But in general, it's best to avoid names that begin with digits, because they are awkward to work with and can cause trouble.
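One way to follow that advice, as a minimal sketch: build the group labels with a non-numeric prefix so that the names are ordinary identifiers. The prefix "g" and the name split_named are illustrative choices, not part of the question.

```r
vec <- c(1, 1, 1, 2, 2, 2, 2, 2, 2, 4, 4, 4, 4)
names(vec) <- letters[1:13]

# Prefix each group label with "g" (an arbitrary, illustrative prefix)
split_named <- split(vec, paste0("g", vec))
names(split_named)
# [1] "g1" "g2" "g4"

# Look up the group of the 10th element by building the same label
split_named[[paste0("g", vec[10])]]
# j k l m
# 4 4 4 4
```

With names like these, split_named$g4 also works without quoting.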

Related

return indices of duplicated elements corresponding to the unique elements in R

Does anyone know if there's a built-in function in R that can return indices of duplicated elements corresponding to the unique elements?
For instance I have a vector
a <- ["A","B","B","C","C"]
unique(a) will give ["A","B","C"]
duplicated(a) will give [F,F,T,F,T]
Is there a built-in function to get a vector of indices of the same length as the original vector a that shows the location of a's elements in the unique vector (which is [1,2,2,3,3] in this example)?
i.e., something like the output variable "ic" in the MATLAB function "unique" (which is, if we let c = unique(a), then a = c(ic,:)).
http://www.mathworks.com/help/matlab/ref/unique.html
Thank you!
We can use match
match(a, unique(a))
#[1] 1 2 2 3 3
Or convert to factor and coerce to integer
as.integer(factor(a, levels = unique(a)))
#[1] 1 2 2 3 3
data
a <- c("A","B","B","C","C")
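This should work:
cumsum( !duplicated( sort( a)) ) # once you replace MATLAB syntax with R syntax.
Or just:
as.numeric(factor(a) )
Note that both of these index into the sorted unique values, so they agree with match(a, unique(a)) only when a is already sorted, as it is here.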
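To see how the match and factor approaches differ, here is a small sketch on an unsorted vector. The vector b and the names idx_match and idx_factor are made up for illustration.

```r
b <- c("C", "A", "C", "B")   # illustrative, deliberately unsorted

idx_match  <- match(b, unique(b))     # positions in unique(b): "C" "A" "B"
idx_factor <- as.numeric(factor(b))   # positions in the sorted levels: "A" "B" "C"

idx_match
# [1] 1 2 1 3
idx_factor
# [1] 3 1 3 2

# Only the match() indices rebuild b from unique(b)
identical(unique(b)[idx_match], b)
# [1] TRUE
```

If the indices need to point into unique(a) in order of first appearance, as with MATLAB's ic, match (or factor with levels = unique(a)) is the safer choice.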

Get index of vector between 1st and 2nd appearance of number 1

Suppose we have a vector:
v <- c(0,0,0,1,0,0,0,1,1,1,0,0)
Expected output:
v_index <- c(5,6,7)
v always starts and ends with 0, and there is only ever one cluster of zeros between two 1s.
Seems simple enough, can't get my head around...
I think this will do
which(cumsum(v == 1L) == 1L)[-1L]
## [1] 5 6 7
The idea here is to use cumsum to assign each element to a group numbered by how many "one"s have been seen so far, select the first group, and remove the occurrence of the "one" at its beginning (because you only want the zeroes).
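Spelling out the intermediate values may make this clearer. This is only a step-by-step sketch of the same one-liner; grp and in_first are illustrative names.

```r
v <- c(0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0)

grp <- cumsum(v == 1L)   # running count of ones seen so far
grp
# [1] 0 0 0 1 1 1 1 2 3 4 4 4

in_first <- which(grp == 1L)   # the first 1 and the zeros that follow it
in_first
# [1] 4 5 6 7

in_first[-1L]   # drop the position of the 1 itself
# [1] 5 6 7
```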
v <- c(0,0,0,1,0,0,0,1,1,1,0,0)
v_index<-seq(which(v!=0)[1]+1,which(v!=0)[2]-1,1)
> v_index
[1] 5 6 7
Explanation: I ask which indices are not equal to 0:
which(v!=0)
Then I take the first and second index from that vector and create a sequence that runs between them, excluding both ends.
This is probably one of the simplest answers out there. Find which items are equal to one, then produce a sequence using the first two indexes, incrementing the first and decrementing the other.
block <- which(v == 1)
start <- block[1] + 1
end <- block[2] - 1
v_index <- start:end
v_index
[1] 5 6 7

I have a numeric list where I'd like to add 0 or NA to extend the length of the list

I have 5 lists that need to be the same length as the lists will be combined into a dataframe. One of them may not be the same length as the other 4 so what I currently have is an if statement that checks the length against the length of one of the other lists and then...
1) I create a temporary list using rep( NA, length ) where length is the extra elements I need to add to extend the list
2) I use the concat function c() to combine the list that needs extending with the list with the NAs.
x <- as.numeric( list )
if( length( list ) < length( main ))
{
temp <- rep( NA, length( main ) - length( list ))
list <- c( list, temp )
}
List 1 - NA NA
List 2 - 32 53 45
Merged List - 32 53 45 NA NA
The problem with this is that I then get a ton of NAs introduced by coercion after the dataframe is created.
Is there a better way of handling this? I assume it has to do with the fact that the main list is numeric. I tried doing the same with 0 instead of NA but that failed for some reason. What I use to extend the length does not matter. I just need it to not be a number other than 0.
I will assume that you start with several lists like these:
n=as.list(1:2)
a=as.list(letters[1:3])
A=as.list(LETTERS[1:4])
First, I'd suggest combining them into a list of lists:
z <- list(n,a,A)
so you can find the length of the longest sub-lists:
max.length <- max(sapply(z,length))
and use length<- to fill the missing elements of the shorter sub-lists with NULL values (or with NA once the sub-lists have been unlisted into atomic vectors):
# z2 <- lapply(z,function(k) {length(k) <- max.length; return(k)}) # Original version
# z2 <- lapply(z, "length<-", max.length) # More elegant way
z2 <- lapply(lapply(z, unlist), "length<-", max.length) # Even better because it makes sure that the resulting data frame consists of atomic vectors
The resulting list can be easily transformed into data.frame:
df <- as.data.frame(do.call(rbind,z2))
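For the numeric vectors described in the question, the same length<- idea can be applied directly. This is a minimal sketch with made-up values; main stands in for the reference vector from the question and short for the one that needs padding.

```r
main  <- c(10, 20, 30, 40, 50)   # illustrative reference vector
short <- c(32, 53, 45)           # shorter vector to be padded

length(short) <- length(main)    # pads with NA and stays numeric
short
# [1] 32 53 45 NA NA

df2 <- data.frame(main = main, short = short)
is.numeric(df2$short)
# [1] TRUE
```

Because the padding happens on a numeric vector, there is no later conversion step that could introduce NAs by coercion.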
Another option, if you want to get the result as shown in "df", is to use stringi ("z" is from @Marat Talipov's post):
library(stringi)
as.data.frame(stri_list2matrix(lapply(z, as.character), byrow=TRUE))
#  V1 V2   V3   V4
#1  1  2 <NA> <NA>
#2  a  b    c <NA>
#3  A  B    C    D
NOTE: Now, the columns are all "factors" or "characters" (if we specify stringsAsFactors=FALSE). As @Richard Scriven mentioned in the comments, it would make more sense to have the "rows" as "columns". The above method is good when you have all 'numeric' or all 'character' lists.

Access R output with subsetting

I do have the output of a function which looks like this
function(i,var1,list1)->h
and then the output
value
2.8763
There is a line break in the output and I only need the number bit of the result but not the string. Hence, I tried to use h[1] but this is
value
2.87..
and length(h) is also equal 1. Is there any way to access only the number in this case?
Thanks,
You are already accessing only the value. The name you see is the name of that element in the vector your function returns. You can get rid of those names/attributes like this:
> v <- c("a" = 1, "b" = 2)
> v
a b
1 2
> attributes(v) <- NULL
> v
[1] 1 2
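If only the names should go and other attributes should stay, unname() or [[ are alternatives. A small sketch, where h is a stand-in for the function's named result:

```r
h <- c(value = 2.8763)   # stand-in for the named result from the question

unname(h)   # same vector with the names dropped
# [1] 2.8763

h[[1]]      # [[ extracts the element without its name
# [1] 2.8763
```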

Remove quotes from vector element in order to use it as a value

Suppose that I have a vector x whose elements I want to use to extract columns from a matrix or data frame M.
If x[1] = "A", I cannot use M$x[1] to extract the column with header name A, because M$A is recognized while M$"A" is not. How can I remove the quotes so that M$x[1] is M$A rather than M$"A" in this instance?
Don't use $ in this case; use [ instead. Here's a minimal example (if I understand what you're trying to do).
mydf <- data.frame(A = 1:2, B = 3:4)
mydf
#   A B
# 1 1 3
# 2 2 4
x <- c("A", "B")
x
# [1] "A" "B"
mydf[, x[1]] ## As a vector
# [1] 1 2
mydf[, x[1], drop = FALSE] ## As a single column `data.frame`
#   A
# 1 1
# 2 2
I think you would find your answer in the R Inferno. Start around Circle 8: "Believing it does as intended", in one of the "string not the name" sub-sections. You might also find some explanation in this line from the help page at ?Extract: "The main difference is that $ does not allow computed indices, whereas [[ does."
Note that this approach is taken because the question specified using the approach to extract columns from a matrix or data frame, in which case, the [row, column] mode of extraction is really the way to go anyway (and the $ approach would not work with a matrix).
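As a short sketch of the computed-index point from ?Extract: [[ accepts a name held in a variable for a data frame, and a matrix can be indexed by column name with [. The matrix m here is made up for illustration.

```r
mydf <- data.frame(A = 1:2, B = 3:4)
x <- c("A", "B")

mydf[[x[1]]]   # computed name with [[ on a data frame
# [1] 1 2

# A matrix has no $, so index by column name with [ instead
m <- matrix(1:4, ncol = 2, dimnames = list(NULL, c("A", "B")))
m[, x[2]]
# [1] 3 4
```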
