how to use R language $ symbol to extract column from a matrix - r

I am new to R. If I use us_stocks$"LNC" I get the corresponding zoo series. us_stocks is a zoo object (from the zoo library), and resB is a list with the following elements:
resB
# [[1]] LNC 7
# [[2]] GAM 62
# [[3]] CMA 7
class(resB)
# [1] "list"
names(resB[[1]])
# [1] "LNC"
But when I use us_stocks$names(resB[[1]]) I do not get the zoo series. How can I fix this?

It often takes a while to understand what is meant by "... $ is a function which does not evaluate its second argument." Most R functions would take names(resB[[1]]), evaluate it, and then act on the resulting value. But not $. It expects its second argument to be an actual column name, typed literally rather than computed. This is an example of "non-standard evaluation". You will also see it at work in the functions library and help, as well as in many functions in what is known, perhaps flippantly, as the hadleyverse, which includes the packages 'ggplot2' and 'dplyr'. The names of dataframe columns and of the nodes of R lists are character values. They are not really R names, in the sense that their values cannot be accessed by typing an unquoted sequence of letters at the console at the top level of R.
So you should be using d[[ names(resB[[1]]) ]], where d stands for the object you are extracting from. This is also much safer in programming, since the $ function often causes scoping problems when used anywhere other than at the interactive console.
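The difference between $ and [[ described above can be seen in a small sketch. The data frame d and the variable col are made up for illustration and merely stand in for the asker's objects:

```r
# Hypothetical data frame standing in for the asker's object
d <- data.frame(LNC = c(1.5, 2.5), GAM = c(3, 4))
col <- "LNC"                 # the column name, held in a variable

by_literal  <- d$LNC         # literal name after $: returns the column
by_variable <- d$col         # $ does not evaluate col; it looks for a column named "col"
by_brackets <- d[[col]]      # [[ evaluates col first, then extracts "LNC"

print(by_literal)
print(is.null(by_variable))  # there is no column called "col"
print(by_brackets)
```

Because [[ evaluates its argument, any expression that yields a name, such as names(resB[[1]]), can go inside it.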

Related

creating names using backticks vs using single and double quotes in R

"You can also create non-syntactic bindings using single or double quotes (e.g. "_abc" <- 1) instead of backticks, but you shouldn’t, because you’ll have to use a different syntax to retrieve the values. The ability to use strings on the left hand side of the assignment arrow is an historical artefact, used before R supported backticks."
The quote above is from Hadley Wickham's book.
Can you give an example of the bolded part ("you'll have to use a different syntax to retrieve the values")?
In my experience, I find no difference in retrieving names created with backticks or quotes.
I expect the intended difference is that if you instantiate with double-quotes, you can't access the variable in that fashion.
"mu" <- 2
"mu"
# [1] "mu"
mu
# [1] 2
Whereas if you create it with backticks, you can still access it with backticks:
`mu` <- 2
`mu`
# [1] 2
There are of course special ways (get("mu")), but that's different.
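The same point can be sketched with the non-syntactic name from the quoted passage. The variable names on the left are made up for illustration:

```r
"_abc" <- 1                # a string on the left-hand side binds the name _abc

as_string <- "_abc"        # quotes on retrieval give back the string itself
as_value  <- `_abc`        # backticks retrieve the bound value
via_get   <- get("_abc")   # get() looks the name up from a string

print(as_string)
print(as_value)
print(via_get)
```

So the binding can be created with quotes, but reading it back needs backticks or get(), which is the "different syntax" the quote refers to.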

Is there a built-in ordinal sequence vector in R?

I need a long ordinal sequence vector in R. As a simple example of what I want:
OS <- c("First","Second","Third")
Is there a built-in vector like that?
From library(english):
ordinal(1:5)
# [1] first second third fourth fifth
I googled "R cardinal numbers" and got to the vignette for the toOrdinal package, but unfortunately it doesn't actually get you words.
library(toOrdinal)
sapply(1:5,toOrdinal)
## [1] "1st" "2nd" "3rd" "4th" "5th"
The docs say
convert_to: OPTIONAL. Output type that provided 'cardinal_number' is
converted into. Default is 'ordinal_number' which refers to
the 'cardinal_number' followed by the appropriate ordinal
indicator. Additional options planned include 'ordinal_word'.
so maybe this will eventually do what you want ...
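If installing a package is not an option, numeric ordinals like those above can be produced in base R. This is only a minimal sketch, not the implementation used by toOrdinal or english, and the function name ordinal_suffix is made up:

```r
# Hypothetical helper: append the English ordinal indicator to whole numbers
ordinal_suffix <- function(n) {
  n <- as.integer(n)
  last_two <- n %% 100
  last_one <- n %% 10
  suffix <- ifelse(last_two %in% 11:13, "th",   # 11th, 12th, 13th are irregular
            ifelse(last_one == 1, "st",
            ifelse(last_one == 2, "nd",
            ifelse(last_one == 3, "rd", "th"))))
  paste0(n, suffix)
}

print(ordinal_suffix(c(1:5, 11, 12, 13, 21, 22, 101, 112)))
```

It is vectorized through ifelse(), so a long sequence such as ordinal_suffix(1:1000) is handled in a single call.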

Use of $ and %% operators in R

I have been working with R for about 2 months and have had a little trouble getting a handle on how the $ and %% operators work.
I understand I can use $ to pull a certain value from a function (e.g. t.test(x)$p.value), but I'm not sure if this is a universal definition. I also know it can be used to pull specific data out of an object.
I'm also curious about the use of %%, in particular when a value is placed between the percent signs (e.g. %x%). I am aware of its use as a modulo or remainder operator, e.g. 7 %% 5 returns 2. Perhaps I am being ignorant and this is not real?
Any help or links to literature would be greatly appreciated.
Note: I have been searching for this for a couple hours so excuse me if I couldn't find it!
You are not really pulling a value from a function but rather from the list object that the function returns. $ is actually an infix function that takes two arguments, the values preceding and following it. It is a convenience function that uses non-standard evaluation of its second argument. It is called non-standard because the unquoted characters following $ are first quoted before being used to extract a named element from the first argument.
t.test # is the function
t.test(x) # is a named list with one of the names being "p.value"
The value can be pulled in one of three ways:
t.test(x)$p.value
t.test(x)[['p.value']] # numeric vector
t.test(x)['p.value'] # a list with one item
my.name.for.p.val <- 'p.value'
t.test(x)[[ my.name.for.p.val ]]
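A runnable version of the extraction forms listed above, as a small sketch. The sample vector x is made up, and the result is stored once in res rather than calling t.test repeatedly:

```r
x <- c(5.1, 4.9, 6.2, 5.8, 6.0, 5.5)   # made-up sample

res <- t.test(x)                  # a list of class "htest"

p_dollar  <- res$p.value          # name typed literally after $
p_double  <- res[["p.value"]]     # the numeric value itself
p_single  <- res["p.value"]       # a list containing that one element

my.name.for.p.val <- "p.value"
p_by_name <- res[[my.name.for.p.val]]   # the name is taken from a variable

print(names(res))                 # the elements available for extraction
print(p_dollar)
```

Only the [[ form accepts a name computed at run time, which is why it is the one to prefer inside functions.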
When you surround a set of characters with flanking "%"-signs, you can create your own vectorized infix function. If you wanted a pmax for which the default was na.rm=TRUE, do this:
'%mypmax%' <- function(x,y) pmax(x,y, na.rm=TRUE)
And then use it without quotes:
> c(1:10, NA) %mypmax% c(NA,10:1)
[1] 1 10 9 8 7 6 7 8 9 10 1
First, the $ operator is for selecting an element of a list. See help('$').
The %% operator is the modulo operator. See help('%%').
The '$' operator is used to select a particular element from a list, or from any other data object that contains named sub-components.
For example, suppose data is a list which contains a matrix named MATRIX, among other things.
To get the matrix we write:
print(data$MATRIX)
The %% operator is the modulus operator, which gives the remainder.
For example: print(7%%3)
will print 1 as output.
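To separate the two things the question mixes together: %% on its own is one built-in operator, while other %name% forms are different operators that merely share the percent-sign syntax. A short sketch using only base R operators (the variable names are made up):

```r
rem  <- 7 %% 5        # remainder: 2
quot <- 7 %/% 5       # integer division: 1
neg  <- -7 %% 5       # 3, the result takes the sign of the divisor
isin <- 3 %in% 1:5    # membership test: TRUE

# %x% is also built in: it is the Kronecker product, unrelated to modulo
k <- diag(2) %x% matrix(1:2, nrow = 1)

print(c(rem, quot, neg))
print(isin)
print(dim(k))         # a 2 x 4 matrix
```

The relationship between the first two is that (a %/% b) * b + (a %% b) gives back a.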

How is R handling numbers so that they are the same but not identical() after exporting and re-importing?

A and B should be the same dataframe. A is generated in R, B is A exported and then imported back into R.
Both have dimensions 49 x 97, with the first column characters and all other columns numbers.
str() lists them as "chr" and "num" respectively.
Depending on how I look at the number columns, sometimes R finds them identical and sometimes not:
> identical(A,B)
FALSE
#The dataframes A and B are not the same
> identical(A[,1],B[,1])
TRUE
#The character-containing columns are the same
> identical(A[,-1],B[,-1])
FALSE
#The number-containing columns are not the same
> identical(matrix(A[,-1]),matrix(B[,-1]))
TRUE
#If the number-containing columns are converted into a matrix, they are the same
> identical(as.matrix(A[,-1]),as.matrix(B[,-1]))
FALSE
> identical(as.matrix(A[1:49,-1]),as.matrix(B[1:49,-1]))
TRUE
#But if they're converted into a matrix using as.matrix() instead of
# matrix() they're only the same if the 49 rows are explicitly indexed
My question:
What is the difference in how R interprets the numbers?
Are they sometimes treated as doubles and sometimes as floating points?
How do you know when R will do one or the other, and can I be sure that A and B really are the same?
EDIT: my advice after another 2 yrs experience in R:
use all.equal() instead of identical() to see an explanation of what's different and to ignore minute rounding errors
use saveRDS() and readRDS() to export and re-import with exact same format (and much faster)
remember that matrix() and as.matrix() can behave differently
Please read the help page for identical. It applies a much more stringent and extensive set of tests than just checking whether numerical entries are the same. All of the attributes of R objects, including names and non-printing attributes, are checked for identity. If you want to check numerical equivalence, then perhaps you should first strip the objects of attributes, for example with as.vector or similar functions.
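A minimal sketch of both effects discussed in this thread: precision lost in a text round trip, and attributes that make identical() fail. The data frame here is made up and only stands in for A. A CSV round trip is assumed, since the question does not say which export format was used:

```r
# Made-up data frame standing in for A
A <- data.frame(id = c("a", "b"), x = c(1/3, 2/3), stringsAsFactors = FALSE)

# Round trip through CSV: numbers are written with at most 15 significant digits
csv_file <- tempfile(fileext = ".csv")
write.csv(A, csv_file, row.names = FALSE)
B <- read.csv(csv_file, stringsAsFactors = FALSE)

print(identical(A$x, B$x))          # FALSE: the last bits of the doubles differ
print(isTRUE(all.equal(A$x, B$x)))  # TRUE: equal within all.equal()'s tolerance

# Round trip through RDS keeps the exact binary values and attributes
rds_file <- tempfile(fileext = ".rds")
saveRDS(A, rds_file)
C <- readRDS(rds_file)
print(identical(A, C))              # TRUE

# Attributes alone can break identical(): names are an attribute
v <- c(1, 2)
w <- c(first = 1, second = 2)
print(identical(v, w))              # FALSE
print(identical(v, as.vector(w)))   # TRUE once the names are stripped
```

When all.equal() does find a difference it returns a description of it instead of FALSE, hence the isTRUE() wrapper.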

R: Using ellipsis with a function that takes an arbitrary number of arguments

Many times, I find myself typing the following
print(paste0(val1,',',val2,',',val3)) to print the output from a function with variables separated by a comma.
It is handy when I want to generate a CSV file from the output.
I was wondering if I can write a function in R that does this for me. After many attempts, I could only get as far as the following.
ppc <- function(string1,string2,...) print(paste0(string1,',',string2,',',...))
It works for at most three arguments.
> ppc(1,2,3)
[1] "1,2,3"
> ppc(1,2,3,4)
[1] "1,2,34"
ppc(1,2,3,4) should have given "1,2,3,4". How can I correct my function? I somehow believe that this is possible in R.
You don't need to write your own function. You can do this with paste.
paste(1:3,collapse=",")
# [1] "1,2,3"
Or, in case you insist on a ppc() function:
ppc <- function(...) paste(...,sep=",")
ppc(1,2,3,4)
# [1] "1,2,3,4"
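The reason the original attempt gave "1,2,34" is that paste0() just concatenates whatever arrives through ..., with no separator between those extra arguments. A short sketch comparing the two approaches from the answer above. The name ppc_flat is made up for the variant that flattens its arguments first:

```r
# Passing ... straight to paste() puts the separator between every argument
ppc <- function(...) paste(..., sep = ",")

# Hypothetical variant: combine everything into one vector, then collapse it
ppc_flat <- function(...) paste(c(...), collapse = ",")

print(ppc(1, 2, 3, 4))        # "1,2,3,4"
print(ppc_flat(1, 2, 3, 4))   # "1,2,3,4"

# The two differ when an argument is itself a vector
print(ppc(1:3, "x"))          # "1,x" "2,x" "3,x"
print(ppc_flat(1:3, "x"))     # "1,2,3,x"
```

Which variant fits depends on whether vector arguments should be pasted element by element or joined into one line.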

Resources