Could someone explain the output of str("Hello") == str("World!") in R? I was expecting "TRUE"

I'm just learning R, and I tried this:
str("Hello") == str("World!")
I was expecting a logical output, like TRUE or FALSE, but instead it returned this:
chr "Hello"
chr "World!"
logical(0)
I didn't understand why, because the str() function should print the class and whatever is inside the parentheses...
Could someone help me in understanding this, please?
Many thanks.
Tried: str("Hello") == str("World!")
Expected: "TRUE" or "FALSE"
Reality:
chr "Hello"
chr "World!"
logical(0)

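See the following:
tmp <- str("Hello")
print(tmp)
Together these print:
chr "Hello"
NULL
So NULL is assigned to tmp: str() prints its description as a side effect and returns NULL invisibly. The same is true for str("whatever").
So you are effectively calling:
NULL == NULL
Comparing two zero-length objects gives a zero-length logical vector, which is logical(0).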
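A minimal sketch of the point above, together with what was probably intended. The variable names (res, cmp, same_text, same_obj, same_class) are only illustrative:

```r
# str() is called for its side effect (printing); its return value is NULL.
res <- str("Hello")
is.null(res)                                      # TRUE

# Comparing NULL with NULL yields a zero-length logical vector.
cmp <- NULL == NULL
length(cmp)                                       # 0

# Compare the values themselves to get a single TRUE or FALSE.
same_text <- "Hello" == "World!"                  # FALSE
same_obj  <- identical("Hello", "World!")         # FALSE

# To compare the types instead of printing them, use class().
same_class <- class("Hello") == class("World!")   # TRUE
```

If the goal is to use the comparison in an if() condition, identical() is usually the safer choice, since it always returns exactly one TRUE or FALSE.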

Related

R Array creation with very specific structure

I am a little bit stuck with one exercise in a beginner R course that I need for the following exercises (we should replace values of the previously created object).
Create object A, which returns the following when the structure is queried:
> str(A)
num [1 : 2, 1 : 5, 1 : 3] TRUE FALSE TRUE FALSE TRUE FALSE ...
- attr(*, "dimnames")=List of 3
..$ : chr [1:2] "a" "b"
..$ : chr [1:5] "C1" "C2" "C3" "C4" ...
..$ : chr [1:3] "X" "Y" "Z"
Because I am a bit clueless about the content, this is my starting point:
a <- rep(c(0,1),15)
A <- array(a, dim= c(2,5,3))
rownames(A) <- letters[1:2]
colnames(A) <- paste("C",1:5,sep="")
Unfortunately, I struggle with the object itself: I don't see how the array should be filled to be numeric and still have TRUE/FALSE content. I also could not find sufficient information on how to name the third dimension.
Can anyone help me here?
This exercise appears to be trying to teach you about array dimensions.
array has 3 arguments:
args(array)
#function (data = NA, dim = length(data), dimnames = NULL)
data = is the data to be put in the array. Values are recycled if data is shorter than the array.
dim = an integer vector giving the "maximal indices in each dimension"
dimnames = is a list of character vectors each as long as the corresponding dimension. (As an aside, the character vectors themselves can also be named)
Thus, the following would get pretty close to your desired output:
A = array(data = c(TRUE,FALSE),
dim = c(2,5,3),
dimnames = list(c("a","b"), c("C1","C2","C3","C4","C5"),c("X","Y","Z")))
str(A)
# logi [1:2, 1:5, 1:3] TRUE FALSE TRUE FALSE TRUE FALSE ...
# - attr(*, "dimnames")=List of 3
# ..$ : chr [1:2] "a" "b"
# ..$ : chr [1:5] "C1" "C2" "C3" "C4" ...
# ..$ : chr [1:3] "X" "Y" "Z"
However, I do not see a way for str() to print TRUE FALSE TRUE FALSE TRUE FALSE ... while the array is also of type num; a numeric array would show 1 0 1 0 ... instead. Perhaps the exercise is incorrect.
You could also try your approach, but use dimnames(A)[3] to assign the third dimension's names:
dimnames(A)[3] <- list(c("X","Y","Z"))
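As a rough sketch of the difference between the two types, the following builds the logical array from the answer and then makes a numeric copy. A_num is a name introduced here only for illustration, and paste0() is used as a shorthand for typing out the column names:

```r
# Logical array; the two data values are recycled to fill all 30 cells.
A <- array(data = c(TRUE, FALSE),
           dim = c(2, 5, 3),
           dimnames = list(c("a", "b"), paste0("C", 1:5), c("X", "Y", "Z")))

# Numeric copy: dim and dimnames are kept, only the storage type changes.
A_num <- A
storage.mode(A_num) <- "double"

typeof(A)               # "logical"
typeof(A_num)           # "double"
A_num["a", "C1", "X"]   # 1, because TRUE becomes 1
A_num["b", "C1", "X"]   # 0, because FALSE becomes 0
str(A_num)              # reported as num, with values shown as 1 0 1 0 ...
```

So the closest match to the exercise is either a logi array showing TRUE/FALSE or a num array showing 1/0, not both at once.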

Why do identical dataframes become different when changing rownames to the same values?

I've come across a strange behavior when playing with some dataframes: when I create two identical dataframes a,b, then swap their rownames around, they don't come out as identical:
rm(list=ls())
a <- data.frame(a=c(1,2,3),b=c(2,3,4))
b <- a
identical(a,b)
#TRUE
identical(rownames(a),rownames(b))
#TRUE
rownames(b) <- rownames(a)
identical(a,b)
#FALSE
Can anyone reproduce/explain why?
This is admittedly a bit confusing. Starting with ?data.frame we see that:
If row.names was supplied as NULL or no suitable component was found
the row names are the integer sequence starting at one (and such row
names are considered to be ‘automatic’, and not preserved by
as.matrix).
So initially a and b each have an attribute called row.names that is an integer vector:
> str(attributes(a))
List of 3
$ names : chr [1:2] "a" "b"
$ row.names: int [1:3] 1 2 3
$ class : chr "data.frame"
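But rownames() returns a character vector (it calls dimnames() under the hood, which returns a list of character vectors). So after reassigning the row names, the row.names attribute of b is stored as character, and identical() reports FALSE because it compares attributes as well: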
> str(attributes(b))
List of 3
$ names : chr [1:2] "a" "b"
$ row.names: chr [1:3] "1" "2" "3"
$ class : chr "data.frame"
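The following sketch checks the stored row-name types directly and shows one way to get back to the automatic row names. It relies on .row_names_info(), an internal helper from base R that returns a negative count when the row names are automatic:

```r
a <- data.frame(a = c(1, 2, 3), b = c(2, 3, 4))
b <- a
rownames(b) <- rownames(a)       # stores "1", "2", "3" as character

class(attr(a, "row.names"))      # "integer"
class(attr(b, "row.names"))      # "character"

# A negative count marks automatic row names.
.row_names_info(a) < 0           # TRUE
.row_names_info(b) < 0           # FALSE

# Assigning NULL restores the automatic integer sequence.
row.names(b) <- NULL
identical(a, b)                  # TRUE
```

If only the data matter and not the row-name storage, all.equal(a, b) may be a more forgiving comparison than identical().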

Why does sapply() return a list with attributes when used on characters?

There is a strange behaviour of sapply() when used on a vector of characters:
y <- c("Hello", "bob", "daN")
z <- sapply(y, function(x) {toupper(x)})
z
# Hello bob daN
# "HELLO" "BOB" "DAN"
str(z)
# Named chr [1:3] "HELLO" "BOB" "DAN"
# - attr(*, "names")= chr [1:3] "Hello" "bob" "daN"
Why does sapply() return a vector with the old values as attributes? I don't want them, I don't need them and I am not aware of this behaviour when applied on e.g. numerical vectors.
By default, when the input is a character vector without names, sapply() uses its elements as the names of the result.
The result can be delivered without the names by using USE.NAMES = FALSE in the call.
sapply(y, toupper, USE.NAMES = FALSE)
# [1] "HELLO" "BOB" "DAN"
This is explained in help(sapply)
USE.NAMES - logical; if TRUE and if X is character, use X as names for the result unless it had names already. Since this argument follows ... its name cannot be abbreviated.
Note that when you are applying a single function only, there is no need to use an anonymous function (anonymous function usage is also slightly less efficient). This is also shown above.
Also note that sapply() is not necessary here, as toupper() is vectorized.
toupper(y)
# [1] "HELLO" "BOB" "DAN"
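For cases where the applied function is not vectorized, here is a small sketch of two other ways to get an unnamed result. The names z1 and z2 are only illustrative:

```r
y <- c("Hello", "bob", "daN")

# Names are only added automatically when the input is a character vector.
names(sapply(c(1, 2, 3), function(x) x * 2))   # NULL
names(sapply(y, toupper))                      # "Hello" "bob" "daN"

# Option 1: strip the names afterwards.
z1 <- unname(sapply(y, toupper))

# Option 2: vapply() also accepts USE.NAMES and checks the result type.
z2 <- vapply(y, toupper, character(1), USE.NAMES = FALSE)

identical(z1, z2)                              # TRUE
```

vapply() has the added benefit of failing loudly if the function ever returns something other than a single character string.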

Loop over vector containing NULL

I want to loop over a vector and send the values as parameters to a function. One of the values I want to send is NULL. This is what I've been trying:
things <- c('M','F',NULL)
for (thing in things){
doSomething(thing)
}
But the loop ignores the NULL value. Any suggestions?
The loop doesn't ignore it. Look at things and you'll see that the NULL isn't there.
You can't mix types in a vector, so you can't have both "character" and "NULL" types in the same vector. Use a list instead.
things <- list('M','F',NULL)
for (thing in things) {
print(thing)
}
[1] "M"
[1] "F"
NULL
When you construct a vector with c(), a value of NULL is ignored:
things <- c('M','F',NULL)
things
[1] "M" "F"
However, if it is important to pass the NULL downstream, you can use a list instead:
things <- list('M','F',NULL)
for (thing in things){
print(thing)
}
[1] "M"
[1] "F"
NULL
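A short sketch that puts both answers side by side. describe_thing() is a hypothetical stand-in for the doSomething() function from the question:

```r
things_vec  <- c("M", "F", NULL)     # NULL is dropped by c()
things_list <- list("M", "F", NULL)  # NULL is kept as a list element

length(things_vec)    # 2
length(things_list)   # 3

# Hypothetical stand-in for doSomething(); it handles NULL explicitly.
describe_thing <- function(thing) {
  if (is.null(thing)) "got NULL" else paste("got", thing)
}

# Looping over the list passes each element, including the NULL.
results <- character(0)
for (thing in things_list) {
  results <- c(results, describe_thing(thing))
}
results   # "got M" "got F" "got NULL"
```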
