R: Get column size using index - r

I'd like to get the size of a column using the index. I tried using the length() function with the column index inside, but it doesn't work:
length(bd[7])
I'm sorry if this is too basic, I'm new to R. Thank you!

bd[7] is still a data.frame with a single column, and length() of a data.frame is the number of columns. We need to extract the column as a vector and then call length() on it. How to extract it depends on the class: for a data.frame or matrix, bd[,7] drops the dimensions and returns a vector, but that is not the case for a data.table or tibble. However, [[ (or $ with the column name) returns the column as a vector for a data.frame, data.table, and tibble alike:
length(bd[[7]])
Alternatively, NROW works whether it is given a data.frame or a vector:
NROW(bd[7])
For example:
> NROW(1:7)
[1] 7
> NROW(data.frame(col1 = 1:7))
[1] 7
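The difference between the three calls can be seen side by side. This is a minimal sketch; the data frame bd below is hypothetical (7 columns, 4 rows) and only stands in for the asker's data.

```r
# Hypothetical data frame with 7 columns and 4 rows (not the asker's data).
bd <- as.data.frame(matrix(1:28, nrow = 4, ncol = 7))

len_single_bracket <- length(bd[7])    # 1: bd[7] is a one-column data.frame
len_double_bracket <- length(bd[[7]])  # 4: bd[[7]] is the column as a vector
n_rows <- NROW(bd[7])                  # 4: NROW counts rows of the data.frame

print(c(len_single_bracket, len_double_bracket, n_rows))
```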

Related

List only values from a subset of a subset of list in R

Hi my datastructure is attached.
I need to access just the mean under each [[ element under extra.
I can do this manually individually opt.state[["opt.path"]][["env"]][["extra"]][[1]][["mean"]]
which gives me NA but there's 100 elements like this.
I followed the solution from this question: subset of a subset of a list
using lst <- lapply(opt.state[["opt.path"]][["env"]][["extra"]], function(x) x[["mean"]])
but I end up with two extra columns I don't need:
How do I go about getting just 1-column list with the values?
Cheers
You can call unlist(lst) to get a vector of the 100 mean values. The Name and Type columns appear for every list object in R when you inspect it with View(); they are not part of your data. Alternatively, you can use sapply(), which in this case returns a vector directly instead of a list.
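A minimal sketch of both options follows. The list extra is a hypothetical stand-in for opt.state[["opt.path"]][["env"]][["extra"]], and the se field is invented only to give each element more than one entry.

```r
# Hypothetical stand-in for opt.state[["opt.path"]][["env"]][["extra"]].
extra <- list(
  list(mean = NA_real_, se = 0.1),
  list(mean = 2.5, se = 0.2),
  list(mean = 3.0, se = 0.3)
)

# lapply returns a list; unlist flattens it into a plain vector.
lst <- lapply(extra, function(x) x[["mean"]])
means_unlist <- unlist(lst)

# sapply simplifies to a vector when every result has length 1.
means_sapply <- sapply(extra, function(x) x[["mean"]])

print(means_unlist)
print(means_sapply)
```

Note that unlist() silently drops elements that are NULL, so the result can be shorter than the list if some mean entries are missing entirely.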

Vector from tibble has length 0

I have a tibble ('df') with
> dim(df)
[1] 55 144
of which I extract a vector test <- c(df[,39]). I would expect the following result:
> length(test)
[1] 55
as I basically took column 39 from my tibble. Instead, I get
> length(test)
[1] 1
Now, class(test) yielded list, so I thought the class might be the reason; however, with class set to char, I get the same result.
I'm especially confused since length(df[39,]) yields [1] 155.
Background is I am searching in the vector using grep, which doesn't work with a vector taken from a column. Of course, as I am trying to recode all lines in my tibble, I can recode them by row instead of by column, so I think there is a workaround. However, what causes R to assume that test has length 1? What is the difference in the treatment of rows and columns?
Whenever you apply the [ operator to a tibble, it returns another tibble. This is one of the differences between a tibble and a base R data.frame.
For example:
library(tibble)
a <- 1:5
df <- tibble(a, b = a * 2, c = a^2)
df2 <- as.data.frame(df) # convert to a base data.frame
df[, 2]  # gives a tibble; its dim is 5 1
df2[, 2] # gives a vector; its dim is NULL and its length is 5
As you can see, subsetting the data.frame changes the return type from the original type, whereas the tibble is designed to keep the input and output types consistent.
There are two ways to work with a column of a tibble as a vector:
pull()
[[ ]]
Personally, I use pull() (from dplyr), which is also very intuitive.
Why does length(df[39,]) yield 155?
My understanding is that df[39,] gives you a tibble whose dim is 1 155, and its length equals the number of columns. Why? Because length() also works on lists, and both tibble and data.frame are built on lists: each column is one element of the list. That is also why a single tibble or data.frame can hold columns of different types.
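The list behaviour can be checked in base R alone. In this sketch a small hypothetical data.frame stands in for the tibble, and drop = FALSE is used to imitate the tibble's habit of returning a one-column table from [.

```r
# Hypothetical data: a base data.frame standing in for the tibble.
df <- data.frame(a = 1:5, b = (1:5) * 2, c = (1:5)^2)

is_list <- is.list(df)        # TRUE: a data.frame is a list of columns
n_cols <- length(df)          # 3: one list element per column
row_len <- length(df[2, ])    # 3: a one-row data.frame still has 3 columns

# drop = FALSE keeps a one-column data.frame, as a tibble does by default.
one_col <- df[, 2, drop = FALSE]
test <- c(one_col)            # a list with one element: the whole column
test_len <- length(test)      # 1

# [[ extracts the column itself, so its length is the number of rows.
col_len <- length(one_col[[1]])  # 5

print(c(n_cols, row_len, test_len, col_len))
```

This mirrors the question: c() applied to a one-column table yields a list of length 1, while [[ yields the 5 values.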

Clarification in colnames function in R

I am new to R and I wanted to ask experts about the colnames function in R. Using the function, I realized that it returns NULL when used on a single column of a matrix object, but works perfectly fine for more than one column. To illustrate, say I have the matrix test
> test <- matrix(0, ncol = 4, nrow = 5)
> colnames(test) <- c("A", "B", "C", "D")
> colnames(test[,1]) # colnames(test[,c(1)]) gives the same output
NULL
whereas the following works fine,
colnames(test[,c(1:2)])
[1] "A" "B"
I understand that an alternative is to use colnames(test)[c(1:2)]. Am I missing something in the case where I am getting NULL?
If you look at the description in ?colnames, you'll see that it takes an argument x, which is a matrix-like R object with at least two dimensions.
When you call colnames(test[,1]), you are giving colnames a plain vector, which has no dimensions. Compare class(test[,1]) vs. class(test[,c(1:2)]). Vectors don't have columns or rows and therefore have no column or row names. You can have named elements within a vector, but that is not equivalent to the column names of a matrix.
The best way to extract a single column name (or several) is to index into the full vector of column names:
colnames(test) # gives you all column names
colnames(test)[1] # gives you the column name 1
colnames(test)[c(1,2)] # gives you column names 1 and 2
Does this clarify this issue for you?
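If the single column needs to stay a matrix, base R's drop = FALSE argument to [ keeps the dimensions, so colnames() still has something to report. A short sketch using the matrix from the question:

```r
test <- matrix(0, ncol = 4, nrow = 5)
colnames(test) <- c("A", "B", "C", "D")

# Default subsetting drops to a plain vector, so there are no dimensions left.
dropped <- test[, 1]
dropped_has_dim <- !is.null(dim(dropped))   # FALSE

# drop = FALSE keeps a 5 x 1 matrix, and the column name survives.
kept <- test[, 1, drop = FALSE]
kept_name <- colnames(kept)                 # "A"

print(dropped_has_dim)
print(kept_name)
```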

How to get a matrix element without the column name in R?

This seems to be simple but I can't find the answer.
I combine two vectors using cbind().
> first = c(1:5)
> second = c(6:10)
> values = cbind(first,second)
When I want to retrieve a single element using values[1,2] I always get the column name in addition to the actual element.
> values[1,2]
second
6
How can I get the value without the column name?
I know I can remove the column names in the matrix like in this post: How to remove column names from a matrix in R? But how can I leave the matrix as is and only get the value I want?
We can use unname
unname(values[1,2])
#[1] 6
Or as.vector
as.vector(values[1,2])
You can use the [[ operator to extract a single element:
values[[1,2]]
# [1] 6
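The three suggestions can be compared directly. This sketch rebuilds the matrix from the question and checks that each approach returns the bare value, while plain [ carries the column name along.

```r
first <- 1:5
second <- 6:10
values <- cbind(first, second)

with_name <- names(values[1, 2])    # "second": [ keeps the column name

via_unname <- unname(values[1, 2])      # name removed explicitly
via_as_vector <- as.vector(values[1, 2]) # as.vector strips attributes, including names
via_double <- values[[1, 2]]            # [[ returns the element without names

print(with_name)
print(c(via_unname, via_as_vector, via_double))
```

The matrix itself is left unchanged in all three cases; only the extracted value loses its name.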

extract data from a list without using loop in R

I have a vector v with row positions:
v<-c(10,3,100,50,...)
with those positions I want to extract elements of a list, having a column fixed, for example lets suppose my column number is 2, so I am doing:
data<-c()
data<-c(list1[[v]][[2]])
list1 has the data in the following format:
[[34]]
[1] "200_s_at" "483" "1933" "3664"
So for example, I want to extract from the row 342 the value 1910 only, column 2, and do the same with the next rows
but I got an error when I want to do that, is it possible to do it directly? or should I have a loop that read one by one the positions in v and fill the data vector like:
# algorithm
for (i in seq_along(v)) {
  pos <- v[i]
  data[[i]] <- list1[[pos]][[2]]
}
Thanks
You can use sapply as below:
sapply(list1[v], `[`, 2)
However, depending on your data, you might get an unexpected output, as explained in Why is `vapply` safer than `sapply`?. For example, what if some of your list items have length < 2? What if some of the list items are not vectors but data.frames? Also, the output class may differ based on the class of your list elements (logical, integer, numeric, character). If for example, you expect that all your list items are character vectors of length >= 2, then it is safer to do:
vapply(list1[v], `[`, character(1), 2)
where vapply will double check your assumptions for you, and error out if it finds a problem.
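A small sketch of the difference, assuming list elements shaped like the one shown in the question. The contents of list1, v, and bad below are hypothetical.

```r
# Hypothetical data in the format shown in the question.
list1 <- list(
  c("200_s_at", "483", "1933", "3664"),
  c("201_s_at", "1910", "20", "7"),
  c("202_s_at", "55", "8", "912")
)
v <- c(3, 1)  # positions to extract, in this order

# Second entry of each selected element, checked to be one character value.
res <- vapply(list1[v], `[`, character(1), 2)
print(res)

# A list whose second element is numeric rather than character.
bad <- list(c("a", "b"), c(1, 2))
loose <- sapply(bad, `[`, 2)   # silently coerces everything to character
strict <- tryCatch(
  vapply(bad, `[`, character(1), 2),
  error = function(e) "vapply raised an error"
)
print(loose)
print(strict)
```

One caveat: a character vector shorter than 2 makes `[` return NA, which is still a single character value, so vapply accepts it. A separate check on lengths(list1[v]) would be needed to catch that case.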
