In R what makes NULL atomical and therefore unable to exist in a vector? - r

In R for Everyone by Jared P. Lander on p. 54 it says "...NULL is atomical and cannot exist within a vector. If used inside a vector, it simply disappears."
I understand the concept of being atomic is being indivisible and that NULL represents "nothingness", used commonly to handle returns that are undefined.
So is NULL atomical because it always has the single value of "nothingness"? That is, because it means something simply does not exist, does R handle it by not letting it exist in a vector and, when it is assigned to a list element, by removing that element?
Trying to wrap my head around it and find a more intuitive and comprehensive answer.

In my opinion talking about vectors as being "atomic" is more confusing than helpful. Instead, consider that R has a series of data types built into the language. They are given by definition and are distinct from one another.
For example, one such data type is "integer vector", which represents a sequence of integer values. Note that R does not have a data type of "integer". If we are talking about integer 5 in R, it is actually an integer vector of length 1.
Another built-in data type is NULL. There is a single object of type NULL, which is also called NULL. Since NULL is a type and an object, but not an integer value, it cannot be part of an integer vector.
Missing data in an integer vector are represented by NA. In this context NA is considered an integer value. Note that NA can also be a numeric value, logical value, etc. NA is not a data type, but a value.
A complete list of built-in data types can be found in the R source code and also in the documentation, e.g. https://cran.r-project.org/doc/manuals/r-release/R-ints.html#SEXPTYPEs
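The distinction between NULL (a type with a single object) and NA (a value) can be checked at the console. The following is a minimal sketch using only base R; the variable names x, y and z are illustrative and not from the original answer.

```r
# NULL has its own type and a length of zero
typeof(NULL)            # "NULL"
length(NULL)            # 0

# Inside c(), NULL contributes no elements, so it appears to "disappear"
x <- c(1L, NULL, 3L)
length(x)               # 2
typeof(x)               # "integer"

# NA is a value, so it occupies a slot in the vector
y <- c(1L, NA, 3L)
length(y)               # 3
typeof(y)               # "integer"
is.na(y[2])             # TRUE

# A list built with list() can hold NULL as an element
z <- list(1L, NULL, 3L)
length(z)               # 3

# Assigning NULL to a list element removes that element
z[[2]] <- NULL
length(z)               # 2
```

This matches the behaviour described in the book quote: NULL vanishes from an atomic vector, while assigning it to a list element deletes that element.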

Related

What happens when you combine multiple datatypes in an atomic vector in R programming language?

I tried running a code to identify the type of the vector produced while combining different data types. Here is the code and what I got as the output. Can somebody explain why this output is seen?
v <- c(1L, 2, TRUE)
typeof(v)
Output: [1] "double"
Seems like this is the rule:
When you attempt to combine different types, they are coerced to the most flexible type present. Ordered from most to least flexible, the types are: character → double → integer → logical. For example, combining a character and an integer yields a character.
An atomic vector can only hold values of a single data type. If you put several different types in it, these get coerced to a common type. In your case double.
If you want to keep the data type of the original values, you need to use a list. Lists do not have this restriction.
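The coercion rule and the list alternative can be verified with a few calls. This is a small sketch in base R; the names l and types are illustrative.

```r
# Each combination is coerced to the most flexible type present
typeof(c(TRUE, 1L))        # "integer"
typeof(c(1L, 2))           # "double"
typeof(c(1L, 2, TRUE))     # "double"
typeof(c(1, "a"))          # "character"

# In the vector from the question, TRUE is stored as the double 1
v <- c(1L, 2, TRUE)
v                          # 1 2 1

# A list keeps the original type of every element
l <- list(1L, 2, TRUE)
types <- vapply(l, typeof, character(1))
types                      # "integer" "double" "logical"
```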

Can a SQLite user-defined function take a row argument?

They are described as scalar, but I think that refers to the return type rather than the arguments.
I'm trying to define one in Rust that provides a TEXT value derived from other columns in the row. For convenience and readability at the point of use, I'd like to call it as select myfunc(mytable) from mytable rather than explicitly listing the columns it derives from.
The rusqlite example simply gets an argument as f64, so it's not that clear to me how it might be possible to interpret it as a row and retrieve columnar values from within it. Nor have I been able to find examples in other languages.
Is this possible?
This doesn't seem possible.
The func(tablename) syntax that I'm familiar with seems to be PostgreSQL-specific. SQLite supports func(*), but when func is user-defined it receives zero arguments, not one (structured) or N (all columns separately) as I expected.

In Julia, how do I find out why Dict{String, Any} is Any?

I am very new to Julia and mostly code in Python these days. I am using Julia to work with and manipulate HDF5 files.
So when I get to writing out (h5write), I get an error because the data argument is of mixed type and I need to find out why.
The error message says Array{Dict{String,Any},4} is what I am trying to pass in, but when I look at the values (and it is a huge structure), I see a lot of 0xff and values like this. How do I quickly find why the Any and not a single type?
Just to make this an answer:
If my_dicts is an Array{Dict{String, Any}, 4}, then one way of working out what types are hiding in the Any part of the dict is:
unique(typeof.(values(my_dicts[1])))
To explain:
my_dicts[1] picks out the first element of your Array, i.e. one of your Dict{String, Any}
values then extracts the values, which is the Any part of the dictionary,
typeof. (notice the dot) broadcasts the typeof function over all elements returned by values, returning the types of all of these elements; and
unique takes the list of all these types and reduces it to its unique elements, so you'll end up with a list of each separate type contained in the Any part of your dictionary.

Subsetting list containing multiple classes by same index/vector

I need to subset a list that contains an array as well as a factor variable. Essentially, imagine that each component of the array relates to a single individual, who is in turn associated with a two-level factor variable (treatment).
list <- list(array = array(rnorm(2, 4, 1), c(5, 5, 10)), treatment = rep(c(1, 2), 5))
Typically when sub-setting multiple components of the array from the first component of the list I would use something like
list$array[,,c(2,4,6)]
This returns the array components at locations 2, 4 and 6. However, this would not work for the factor component of the list, because it is subset differently. What you would need is this:
list$treatment[c(2,4,6)]
I need to subset a list containing different classes (array and vector) by the same relative index.
You're treating your list of matrices as some kind of 3-dimensional object, but it's not.
Your list$matrices is itself a list, which means you can index it as a list too; it doesn't matter whether it is a list of matrices, numerics, plot objects, or whatever.
The data you provided as an example can just be indexed at one level, so list$matrices[c(2,4,6)] works fine.
And I don't really get your question about saving the indices in a numeric vector, what's to stop you from this code?
indices <- c(2,4,6)
mysubset <- list(list$matrices[indices], list$treatment[indices])
EDIT, adding new info for edited question:
I see you actually have a 3-D array now. Which is kind of weird, as there is no clear convention of what can be seen as "components". I mean, from your question I understand that list$array[,,n] refers to the n-th individual, but from a pure code point of view there is no reason why something like list$array[n,,] couldn't refer to that.
Maybe you got the idea from other languages, but this is not really R-ish, your earlier example with a list of matrices made more sense to me. And I think the most logical would have been a data.frame with columns matrix and treatment (which is conceptually close to a list with a vector and a list of matrices, but it's clearer to others what you have).
But anyway, what is your desired output?
If it's just subsetting: with this structure, as there are no constraints on what the content could have been, you just have to tell R exactly what you want. There is no single operator that takes a subset of a vector and of the 3rd dimension of an array at the same time. You have to tell R that you want to use the 3rd index of the array for subsetting, and that you want to use the same index for subsetting the vector. Which is basically just the code you already have:
idx <- c(2,4,6)
output <- list(list$array[,,idx], list$treatment[idx])
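If this subsetting is needed repeatedly, the two steps can be wrapped in a small function. The sketch below is one possible way to do it, not part of the original answer. The names dat and subset_individuals are hypothetical, the data are random stand-ins, and it assumes, as in the question, that the third array dimension indexes individuals.

```r
set.seed(1)  # only so that the sketch is reproducible

# Stand-in data with the same shape as in the question (hypothetical name: dat)
dat <- list(
  array = array(rnorm(250, mean = 4, sd = 1), dim = c(5, 5, 10)),
  treatment = rep(c(1, 2), 5)
)

# Hypothetical helper: apply one index to the 3rd array dimension and to the vector
subset_individuals <- function(x, idx) {
  list(
    array = x$array[, , idx, drop = FALSE],  # drop = FALSE keeps all 3 dimensions
    treatment = x$treatment[idx]
  )
}

out <- subset_individuals(dat, c(2, 4, 6))
dim(out$array)      # 5 5 3
out$treatment       # 2 2 2
```

Using drop = FALSE means the result stays a 3-D array even when idx selects a single individual.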
The way you subset multiple matrices actually gives an error, because you supply extra dimensions even though you have already specified which sublist you are in. Hence, in order to subset the matrices for the given indices, you can use my_list[[1]][indices] or directly my_list$matrices[indices]. The same applies to the treatment: my_list[[2]][indices] or my_list$treatment[indices].

How to add empty values in a vector, matrix, structure

I am trying to do some calculations where I divide two vectors. Sometimes I encounter a division by zero, which cannot take place. Instead of attempting this division, I would like to store an empty element in the output.
The question is: how do I do this? Can vectors have empty fields? Can a structure be the solution to my problem or what else should I use?
No, there must be something in the memory slot. Simply store NaN for floating-point values, or a sentinel such as INT_MIN for integer values.
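The question does not say which language it is about, so the following is only an illustration in R of the NaN placeholder idea. The vectors num and den are made up for the example.

```r
num <- c(1, 4, 9)
den <- c(2, 0, 3)

# Store NaN wherever the denominator is zero, and the quotient elsewhere
ratio <- ifelse(den == 0, NaN, num / den)
ratio            # 0.5 NaN 3.0

# The placeholder positions can be found again later
is.nan(ratio)    # FALSE TRUE FALSE
```

The output keeps the same length as the inputs, so positions still line up with the original vectors.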

Resources