"Named tuples" in r

If you load the pracma package in the R console and type
gammainc(2,2)
you get
lowinc uppinc reginc
0.5939942 0.4060058 0.5939942
This looks like some kind of named tuple.
But I can't work out how to extract the number below lowinc, namely 0.5939942. The code (gammainc(2,2))[1] doesn't work; it just gives
lowinc
0.5939942
which isn't a number.
How is this done?

As can be checked with str(gammainc(2,2)[1]) and class(gammainc(2,2)[1]), the output mentioned in the OP is in fact a number. It is just a named number. The names used as attributes of the vector are supposed to make the output easier to understand.
The function unname() can be used to obtain the numerical vector without names:
unname(gammainc(2,2))
#[1] 0.5939942 0.4060058 0.5939942
To select the first entry, one can use:
unname(gammainc(2,2))[1]
#[1] 0.5939942
In this specific case, a clearer version of the same might be:
unname(gammainc(2,2)["lowinc"])
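This can be checked without installing pracma by building a named vector by hand. In the sketch below, res is a hypothetical stand-in for gammainc(2,2), with the values copied from the printed output above:

```r
# Hypothetical stand-in for pracma::gammainc(2, 2); values copied from the output above
res <- c(lowinc = 0.5939942, uppinc = 0.4060058, reginc = 0.5939942)

is.numeric(res[1])   # TRUE: a named element is still a number
doubled <- res[1] * 2  # arithmetic works; the name is simply carried along
names(res)           # "lowinc" "uppinc" "reginc"

# Select by name, then drop the name
first <- unname(res["lowinc"])
names(first)         # NULL
```

Selecting by name rather than by position keeps the code readable if the order of the returned values ever changes.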

Double brackets strip the names:
gammainc(2,2)[[1]]
gammainc(2,2)[["lowinc"]]
I don't claim it to be intuitive, or obvious, but it is mentioned in the manual:
For vectors and matrices the [[ forms are rarely used, although they
have some slight semantic differences from the [ form (e.g. it drops
any names or dimnames attribute, and that partial matching is used for
character indices).
The partial matching can be employed like this
gammainc(2, 2)[["low", exact=FALSE]]
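The same behaviour can be tried on any named vector. Below is a minimal sketch in which res is a hand-built, hypothetical stand-in for gammainc(2,2), so pracma is not required:

```r
# Hypothetical stand-in for pracma::gammainc(2, 2)
res <- c(lowinc = 0.5939942, uppinc = 0.4060058, reginc = 0.5939942)

by_single <- res["lowinc"]     # single bracket keeps the name
by_double <- res[["lowinc"]]   # double bracket drops it

names(by_single)               # "lowinc"
names(by_double)               # NULL

# Partial matching on the name, as described in the manual excerpt above
by_partial <- res[["low", exact = FALSE]]
```

Partial matching only makes sense when the prefix is unambiguous, so the full name is usually the safer choice in scripts.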

In R, vectors may have a names attribute. Here is an example:
vector <- c(1, 2, 3)
names(vector) <- c("first", "second", "third")
If you display vector, you should get the desired output:
> vector
first second third
1 2 3
To check what type of output a function returns, you can use:
class(your_function())
I hope this helps.

Related

A function to convert words to numbers

I came across this function that converts numbers written in words into their numeric representation (e.g., five to 5). The function looks like this:
library(english)
words_to_numbers <- function(s) {
  s <- stringr::str_to_lower(s)
  for (i in 0:11)
    s <- stringr::str_replace_all(s, words(i), as.character(i))
  s
}
Can you explain how the function works? I am confused how as.character() is playing a role here.

The function works like this (note that you also need the stringr package).
First, it takes the string you pass in (i.e. "five" if you call words_to_numbers("five")).
Then, str_to_lower() normalizes it to lower case (avoiding issues if you typed "Five" or "FIVE" instead of "five").
It then runs a loop over 0:11, so i takes the values 0, 1, 2, and so on up to 11.
Within the loop, str_replace_all() takes your string (i.e., "five") and looks for a matching pattern. Here, the pattern is words(i), so when i == 5 the pattern is "five". In the english package, the words() function returns the English words for a number; for instance, english::words(1000) returns "one thousand". Once it finds the pattern, it replaces it with as.character(i). The as.character() function converts the number i to a character string, since str_replace_all() requires a character replacement. If you need the return value to be numeric, you can use as.numeric(words_to_numbers("five")).
For some reason, the loop stops at 11, meaning words_to_numbers("twelve") won't work (it returns "twelve"). So you will need to adjust that number if you want to use the function for values > 11.
Hope this helps and good luck learning R!
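To see the loop in action without installing english or stringr, here is a rough base-R sketch of the same idea. The vector number_words and the function words_to_numbers_base are hypothetical names introduced for illustration; the vector is a hand-written stand-in for what words(0), ..., words(11) return, and gsub() replaces stringr::str_replace_all():

```r
# Hypothetical stand-in for english::words(0:11), written out by hand
number_words <- c("zero", "one", "two", "three", "four", "five",
                  "six", "seven", "eight", "nine", "ten", "eleven")

words_to_numbers_base <- function(s) {
  s <- tolower(s)
  for (i in seq_along(number_words)) {
    # The vector starts at "zero", so position i corresponds to the number i - 1.
    # The replacement is text, hence the explicit as.character().
    s <- gsub(number_words[i], as.character(i - 1), s, fixed = TRUE)
  }
  s
}

as_text   <- words_to_numbers_base("Five")             # "5"
as_number <- as.numeric(words_to_numbers_base("five")) # 5
untouched <- words_to_numbers_base("twelve")           # "twelve": not in the list
```

The result is still a character string after the replacement, which is why as.numeric() is needed to get an actual number.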

vector - character/integer class (under the hood)

Starting to learn R, and I would appreciate some help understanding how R decides the class of different vectors. I initialize vec <- c(1:6) and when I perform class(vec) I get 'integer'. Why is it not 'numeric', because I thought integers in R looked like this: 4L
Also with vec2 <- c(1,'a',2,TRUE), why is class(vec2) 'character'? I'm guessing R picks up on the characters and automatically assigns everything else to be characters...so then it actually looks like c('1','a','2','TRUE') am I correct?

To see the help page of the colon operator, type the following:
?`:`
Here is the relevant paragraph:
For numeric arguments, a numeric vector. This will be of type integer
if from is integer-valued and the result is representable in the R
integer type, otherwise of type "double" (aka mode "numeric").
So, in your example c(1:6), since the from argument, 1, is integer-valued and the result is representable as an R integer, the resulting sequence is an integer vector.
By the way, c is not needed to create a vector in this case.
For the second question, since all the elements of a vector have to be of the same type, R automatically converts them to a common type. In this case, everything can be converted to character, but "a" cannot be converted to numeric, so the result is a character vector.
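Both points can be checked directly in the console. The following is a small sketch using only base R; the variable names are just for illustration:

```r
vec <- 1:6
class(vec)              # "integer": from is integer-valued
typeof(1.5:3)           # "double": from is not integer-valued
class(c(1, 2, 3))       # "numeric": plain literals without L are doubles

vec2 <- c(1, "a", 2, TRUE)
class(vec2)             # "character"
vec2                    # "1" "a" "2" "TRUE"

# Going the other way fails for "a", which becomes NA (with a warning)
back <- suppressWarnings(as.numeric(vec2))
```

So the guess in the question is right: every element is stored as a string once a single character value is present.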

what does lapply(Output_data,"[[",2) mean in R

In RHadoop, the results are made readable with the following code:
results <- data.frame(words=unlist(lapply(Output_data,"[[",1)),
                      count=unlist(lapply(Output_data,"[[",2)))
But what does lapply(Output_data,"[[",1) mean? Especially the "[[" and the 1 in lapply.

Extracting list elements with [ or [[ is common in R and is not specific to any package. The syntax
lapply(Output_data,"[[",1)
loops through the object 'Output_data' and extracts ([[) the first element of each item. So, if 'Output_data' is a list of data.frames, it extracts the first column of each data.frame, and if it is a list of vectors, it extracts the first element of each vector. It does the same thing as an anonymous function, i.e.
lapply(Output_data, function(x) x[[1]])
The latter syntax is clearer and easier to understand, but the former is compact and a bit more stylish...
More info about the [[ can be found in ?Extract
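Here is a small sketch of the two forms side by side. The contents of Output_data are made up for illustration (a list of word/count pairs); the real RHadoop output may be shaped differently:

```r
# Hypothetical stand-in for Output_data: a list of (word, count) pairs
Output_data <- list(list("apple", 3L), list("pear", 5L), list("plum", 1L))

words  <- unlist(lapply(Output_data, "[[", 1))
counts <- unlist(lapply(Output_data, "[[", 2))

# The compact form and the anonymous-function form give the same result
same <- identical(lapply(Output_data, "[[", 1),
                  lapply(Output_data, function(x) x[[1]]))

results <- data.frame(words = words, count = counts)
```

Any extra arguments after the function name in lapply() are passed on to that function, which is how the 1 and the 2 reach [[.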

Operators like [[ , [ and <- are actually functions.
list[[1]]
is equal to
`[[`(list,1)
In your case, lapply(Output_data,"[[",1) means to extract the first value of every element (column or sublist) of Output_data. The 1 is an argument passed to the [[ function.
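The function-call form can be tried directly in the console. This is a short sketch with made-up values; x and pairs are hypothetical names:

```r
x <- list(10, "b", TRUE)

a <- x[[1]]          # usual operator syntax
b <- `[[`(x, 1)      # the same call written as a function
three <- `+`(1, 2)   # arithmetic operators work the same way

# Because `[[` is a function, it can be handed to sapply() like any other
pairs <- list(c(4, 5), c(6, 7))
seconds <- sapply(pairs, `[[`, 2)   # 5 7
```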

Use of $ and %% operators in R

I have been working with R for about 2 months and have had a little trouble getting a handle on the $ and %% operators.
I understand I can use $ to pull a certain value from a function (e.g. t.test(x)$p.value), but I'm not sure if this is a universal definition. I also know it can be used to pull out specific data.
I'm also curious about the %% operator, in particular when a value is placed between the percent signs (e.g. %x%). I am aware of its use as a modulo or remainder operator, e.g. 7 %% 5 returns 2. Perhaps I am being ignorant and this is not real?
Any help or links to literature would be greatly appreciated.
Note: I have been searching for this for a couple hours so excuse me if I couldn't find it!

You are not really pulling a value from a function but rather from the list object that the function returns. $ is actually an infix operator that takes two arguments, the values preceding and following it. It is a convenience function that uses non-standard evaluation of its second argument. It's called non-standard because the unquoted characters following $ are quoted before being used to extract a named element from the first argument.
t.test # is the function
t.test(x) # is a named list with one of the names being "p.value"
The value can be pulled in one of three ways:
t.test(x)$p.value
t.test(x)[['p.value']] # numeric vector
t.test(x)['p.value'] # a list with one item
my.name.for.p.val <- 'p.value'
t.test(x)[[ my.name.for.p.val ]]
When you surround a set of characters with flanking "%" signs, you can create your own vectorized infix function. If you wanted a pmax for which the default was na.rm=TRUE, do this:
'%mypmax%' <- function(x,y) pmax(x,y, na.rm=TRUE)
And then use it without quotes:
> c(1:10, NA) %mypmax% c(NA,10:1)
[1] 1 10 9 8 7 6 7 8 9 10 1
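The difference between $ and [[ is easiest to see on a plain list. In the sketch below, result is a hypothetical list with made-up values standing in for what t.test() returns, and %+% is an illustrative custom infix, not a base R operator:

```r
# Hypothetical stand-in for the list returned by t.test(x); values are made up
result <- list(statistic = 2.1, p.value = 0.04)

by_dollar <- result$p.value        # name is quoted for you
by_double <- result[["p.value"]]   # same element, name given as a string
by_single <- result["p.value"]     # a list of length one

# $ does not evaluate a variable; it looks for an element literally named "field"
field <- "p.value"
by_variable <- result[[field]]     # 0.04
missing_el  <- result$field        # NULL

# A user-defined infix operator (illustrative name)
`%+%` <- function(a, b) paste0(a, b)
joined <- "p." %+% "value"         # "p.value"
```

When the element name is stored in a variable, [[ is the form to use, since $ always takes the text after it literally.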

First, the $ operator is for selecting an element of a list. See help('$').
The %% operator is the modulo operator. See help('%%').

The '$' operator is used to select a particular element from a list or any other data structure that contains named components.
For example, suppose data is a list that contains a matrix named MATRIX, among other things.
To get the matrix, we write
print(data$MATRIX)
The %% operator is the modulus operator, which gives the remainder.
For example, print(7%%3)
will print 1 as output.
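Both operators can be tried on a small example. The list below is made up for illustration; only the element name MATRIX comes from the description above:

```r
# Hypothetical list holding a matrix and one other component
data <- list(MATRIX = matrix(1:4, nrow = 2), label = "example")

m <- data$MATRIX          # the 2 x 2 matrix stored under the name MATRIX
print(m)

r1 <- 7 %% 3              # 1
r2 <- 7 %% 5              # 2
r3 <- c(7, 8, 9) %% 3     # 1 2 0: %% works element-wise on vectors
```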

How to correctly index an array?

Please download the file from
http://freeuploadfiles.com/bb3cwypih2d2
and run:
data=read.table("path/to/file", sep="|",quote='',
head=T,blank.lines.skip=T,as.is=T)
ddata=array(data,dim=c(nrow(data),ncol(data)))
ddata[1,1]
I want to extract the first element of the first column. The answer should be AAC.
How do I do that?

Some suggestions to clean your code and make life easier in the long term:
Work with the data in a data.frame, not an array.
Never refer to TRUE as T. TRUE is a reserved word that can never be redefined, whereas T can take any value, including FALSE.
Use the <- symbol for assignment.
Don't abbreviate argument names. The argument is header, not head. This might bite you.
Arrays can only contain a single class of object. Thus converting your data to an array will implicitly convert the numeric column to character, which surely is a bad thing.
You then index the data frame like this:
dat <- read.table("nasdaqlisted.txt", sep="|", quote='',
header=TRUE, blank.lines.skip=TRUE, as.is=TRUE)
dat$Symbol[1]
[1] "AAC"
The following alternative ways of indexing also return the same element:
dat[1, "Symbol"]
dat[1, 1]
dat[, 1][1]
dat[["Symbol"]][1]
If you really want to do the foolish thing and convert your data to an array, then use as.matrix:
mdat <- as.matrix(dat)
mdat[1, 1]
Symbol
"AAC"
Disclaimer: I only post this since you ask. Arrays and matrices are powerful and fast, but not appropriate for this data.
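Since the original file may not be available, here is a self-contained sketch with a small made-up data frame. Only the column name Symbol and the value "AAC" come from the answer above; the second row and the Lot column are hypothetical:

```r
# Hypothetical miniature of the data; only Symbol/"AAC" are taken from the answer
dat <- data.frame(Symbol = c("AAC", "AAL"),
                  Lot    = c(100, 200),
                  stringsAsFactors = FALSE)

# Equivalent ways to get the first element of the first column
e1 <- dat$Symbol[1]
e2 <- dat[1, "Symbol"]
e3 <- dat[1, 1]
e4 <- dat[, 1][1]
e5 <- dat[["Symbol"]][1]

# Converting to a matrix forces a single type: the numeric column becomes text
mdat <- as.matrix(dat)
typeof(mdat)            # "character"
is.numeric(dat$Lot)     # TRUE in the data frame
```

This shows why the data frame is the better container here: each column keeps its own type.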
