Cannot modify subset of psp object - r

All,
I am trying to modify a subset of a psp object in the R package spatstat. Here is the code that is giving me an issue:
set.seed(10)
mat <- matrix(runif(40), ncol=4)
mx <- data.frame(v1=sample(1:4,10,TRUE),
v2=factor(sample(letters[1:4],10,TRUE),levels=letters[1:4]))
a <- as.psp(mat, window=owin(),marks=mx)
#subset to marking v1 = 2, modify one of its endpoints
a[a$marks$v1==2]$ends$x0<-rep(5,4)
this throws a warning at me:
Warning message:
In a[a$marks$v1 == 2]$ends$x0 <- rep(5, 4) :
number of items to replace is not a multiple of replacement length
What is the right way to modify some elements of a psp object? I commonly use this operation with dataframes and don't have an issue. My sense is that the subset operator ([) isn't set up for this operation with the psp class.
Thank you for reading; appreciate any help you may have.

The problem here is that you are trying to write to a subset of the psp object. Although the [ operator is defined for this class so you can extract a subset from it, the [<- operator is not defined, so you can't overwrite a subset.
However, the member that you are trying to overwrite is a data frame, which of course does have a [<- operator defined. So all you need to do is write to that without subsetting the actual psp object.
Here's a full reprex:
library(spatstat)
set.seed(10)
mat <- matrix(runif(40), ncol = 4)
mx <- data.frame(v1 = sample(1:4, 10, TRUE),
v2 = factor(sample(letters[1:4], 10, TRUE),
levels = letters[1:4]))
a <- as.psp(mat, window = owin(), marks = mx)
#subset to marking v1 = 2, modify one of its endpoints
a$ends$x0[a$marks$v1 == 2] <- rep(5, 4)
a
#> marked planar line segment pattern: 10 line segments
#> Mark variables: v1, v2
#> window: rectangle = [0, 1] x [0, 1] units
Created on 2020-08-18 by the reprex package (v0.3.0)

I will take that as a feature request to add a method for [<- for class psp.
Generally we advise against directly altering the components of objects in spatstat because this can destroy their internal consistency. So a method for [<- would be the best solution.
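To make the difference between extracting and replacing concrete without relying on spatstat internals, here is a minimal sketch using a toy S3 class. The class name toyseg and its layout are hypothetical and only mimic the shape of a psp object (a list holding an ends data frame). The `[<-` method at the end is one possible design for illustration, not what spatstat does or will do.

```r
# Toy stand-in for a segment pattern (hypothetical class, not spatstat's psp)
segs <- structure(
  list(ends = data.frame(x0 = c(0.1, 0.2, 0.3), x1 = c(0.4, 0.5, 0.6))),
  class = "toyseg"
)

# Extraction method: keep only the selected rows of the 'ends' data frame
`[.toyseg` <- function(x, i) {
  x$ends <- x$ends[i, , drop = FALSE]
  x
}

keep <- c(TRUE, FALSE, TRUE)

# Works without any `[<-` method: index the column inside the object
segs$ends$x0[keep] <- 0.9
segs$ends$x0
# 0.9 0.2 0.9

# With a replacement method, the "subset, then modify" form works as well
`[<-.toyseg` <- function(x, i, value) {
  x$ends[i, ] <- value$ends
  x
}
segs[keep]$ends$x1 <- 0.7
segs$ends$x1
# 0.7 0.5 0.7
```

The first assignment only needs the data frame's own replacement method, which is why it works on a class that defines `[` alone.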


Using mocking with apply R

I am currently mocking in some unit tests, using the packages testthat and mockery. I am trying to understand how the function expect_args from the mockery package works when the mocked function is actually called in a function using apply. Here is an example where the test is successful.
myMean <- function(A){
apply(A,1,mean)
}
myMat = matrix(rep(1,6), nrow = 2, ncol = 3)
test_that("myMean calls base::mean correctly",{
m <- mock(1, cycle = TRUE)
with_mock(
`base::mean` = m,
myMean(myMat),
expect_args(m, 1, as.double(myMat[1,])))
})
Let's now take a slightly more complicated example, where the argument of myMean is actually a data.frame and needs to be converted to a matrix within the function.
myMean <- function(A){
B = as.matrix(A)
apply(B,1,mean)
}
myMat = as.data.frame(myMat)
test_that("myMean calls base::mean correctly",{
m <- mock(1, cycle = TRUE)
with_mock(
`base::mean` = m,
myMean(myMat),
expect_args(m, 1, as.double(myMat[1,])))
})
I then get the following error message:
Error: Test failed: 'myMeanSimple calls base::mean correct number of times
* 1st actual argument not equal to 1st expected argument.
names for target but not for current
This error is explained on the vignette of the mockery package. Nevertheless I do not manage to find which argument name I should associate with as.double(myMat[1,]).
First of all, I'm happy this small utility became useful! Second of all, the error you see results from how your transformations are carried out and how expect_args compares results. Internally, we call expect_equal which requires all of the names of the matrix to be present there.
After calling your second example I run this:
> mock_args(m)
[[1]]
[[1]][[1]]
V1 V2 V3
1 1 1
[[2]]
[[2]][[1]]
V1 V2 V3
1 1 1
So you can see that in the first call a single named row was passed, and the same is true for the second call: there are names assigned to each column. This is because as.matrix preserves column names. So this is not about argument names, this is about names in the data that's compared.
Now, when you run your final comparison using expect_args you actually use as.double which doesn't preserve names. Thus, the error you see. To fix it you can simply change your expectation to:
expect_args(m, 1, as.matrix(myMat)[1,])
I hope this solves your problem.
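The name mismatch can be reproduced in base R alone, without testthat or mockery. The sketch below rebuilds the data frame from the question and compares the two ways of extracting its first row; the variable names are made up for illustration.

```r
df <- as.data.frame(matrix(rep(1, 6), nrow = 2, ncol = 3))

# What apply() hands to the mocked function: a named numeric vector
row_via_matrix <- as.matrix(df)[1, ]
names(row_via_matrix)
# "V1" "V2" "V3"

# What the failing expectation used: as.double() drops the names
row_via_double <- as.double(df[1, ])
names(row_via_double)
# NULL

# all.equal() reports the difference instead of returning TRUE
all.equal(row_via_matrix, row_via_double)
```

The values are identical; only the names attribute differs, and that is enough for an equality check that compares attributes to fail.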

Function name in single quotation marks in R

It may be a silly question but I have been bothered for quite a while. I've seen people use single quotation marks to surround the function name when they are defining a function. I keep wondering the benefit of doing so. Below is a naive example
'row.mean' <- function(mat){
return(apply(mat, 1, mean))
}
Thanks in advance!
Going off Richard's assumption, the back ticks allow you to use symbols in names which are normally not allowed. See:
`add+5` <- function(x) {return(x+5)}
defines a function, but
add+5 <- function(x) {return(x+5)}
returns
Error in add + 5 <- function(x) { : object 'add' not found
To refer to the function, you need to explicitly use the back ticks as well.
> `add+5`(3)
[1] 8
To see the code for this function, simply call it without its arguments:
> `add+5`
function(x) {return(x+5)}
See also this comment which deals with the difference between the backtick and quotes in name assignment: https://stat.ethz.ch/pipermail/r-help/2006-December/121608.html
Note, the usage of back ticks is much more general. For example, in a data frame you can have columns named with integers (maybe from using reshape::cast on integer factors).
For example:
test = data.frame(a = "a", b = "b")
names(test) <- c(1,2)
and to retrieve these columns you can use the backtick in conjunction with the $ operator, e.g.:
> test$1
Error: unexpected numeric constant in "test$1"
but
> test$`1`
[1] a
Levels: a
Funnily, back ticks don't help when naming the columns in the data.frame() call itself; the following doesn't give you columns named 1 and 2:
test = data.frame(`1` = "a", `2` = "b")
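The reason is that data.frame() by default passes column names through make.names() (its check.names argument is TRUE), so non-syntactic names are altered. A short sketch of both behaviours, with d1 and d2 as illustrative names:

```r
# Default: check.names = TRUE rewrites non-syntactic names
d1 <- data.frame(`1` = "a", `2` = "b")
names(d1)
# "X1" "X2"

# check.names = FALSE keeps the names exactly as given
d2 <- data.frame(`1` = "a", `2` = "b", check.names = FALSE)
names(d2)
# "1" "2"

# Back ticks are still needed to reach such a column with $
d2$`1`
```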
And responding to statechular's comments, here are the two more use cases.
Infix functions
Using the % symbol we can naively define the dot product between vectors x and y:
`%.%` <- function(x,y){
sum(x * y)
}
which gives
> c(1,2) %.% c(1,2)
[1] 5
for more, see: http://dennisphdblog.wordpress.com/2010/09/16/infix-functions-in-r/
Replacement functions
Here is a great answer demonstrating what these are: What are Replacement Functions in R?
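As a small illustration of the idea (not taken from the linked answer), a replacement function is one whose name ends in `<-`, so it has to be defined with back ticks or quotes. The name `second<-` below is made up for the example.

```r
# A replacement function takes the object and a 'value' argument
# and returns the modified object
`second<-` <- function(x, value) {
  x[2] <- value
  x
}

v <- c(1, 2, 3)
second(v) <- 10   # R rewrites this as v <- `second<-`(v, value = 10)
v
# 1 10 3
```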

Using apply() over columns to output subsets

I have a data frame in R where the majority of columns are values, but there is one character column. For each column excluding the character column I want to subset the values that are over a threshold and obtain the corresponding value in the character column.
I'm unable to find a built-in dataset that contains the pattern of data I want, so a dput of my data can be accessed here.
When I use subsetting, I get the output I'm expecting:
> df[abs(df$PA3) > 0.32,1]
[1] "SSI_01" "SSI_02" "SSI_04" "SSI_05" "SSI_06" "SSI_07" "SSI_08" "SSI_09"
When I try to iterate over the columns of the data frame using apply, I get a recursion error:
> apply(df[2:10], 2, function(x) df[abs(df[[x]])>0.32, 1])
Error in .subset2(x, i, exact = exact) :
recursive indexing failed at level 2
Any suggestions where I'm going wrong?
The reason your solution didn't work is that the x being passed to your user-defined function is actually a column of df. Therefore, you could get your solution working with a small modification (replacing df[[x]] with x):
apply(df[2:10], 2, function(x) df[abs(x)>0.32, 1])
You could use the ... argument to apply to pass an extra argument. In this case, you would want to pass the first column:
apply(df[2:10], 2, function(x, y) y[abs(x) > 0.32], y=df[,1])
Yet another variation:
apply(abs(df[-1]) > .32, 2, subset, x=df[[1]])
The cute trick here is to "curry" subset by specifying the x parameter. I was hoping I could do it with [ but that doesn't deal with named parameters in the typical way because it is a primitive function :..(
A quick and non-sophisticated solution might be:
sapply(2:10, function(x) df[abs(df[,x])>0.32, 1])
Try:
lapply(df[,2:10],function(x) df[abs(x)>0.32, 1])
Or using apply:
apply(df[2:10], 2, function(x) df[abs(x)>0.32, 1])
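Since the original data is not included here, the sketch below uses a small made-up data frame (the column names id, PA1 and PA2 are assumptions) to show the lapply variant end to end. One reason to prefer lapply is that it always returns a list, whereas apply and sapply may simplify to a matrix when every column happens to yield the same number of matches.

```r
# Hypothetical data: one character column, two numeric columns
df <- data.frame(
  id  = c("SSI_01", "SSI_02", "SSI_03"),
  PA1 = c(0.5, 0.1, -0.4),
  PA2 = c(0.2, 0.9, 0.0),
  stringsAsFactors = FALSE
)

# x is the column itself, so it is used directly rather than as df[[x]]
res <- lapply(df[-1], function(x) df$id[abs(x) > 0.32])

res$PA1
# "SSI_01" "SSI_03"
res$PA2
# "SSI_02"
```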

S4 class for chunked data in R- inherited numeric methods won’t work

I want to create an S4 class in R that will allow me to access large datasets (in chunks) from the cloud (similar to the goals of the ff package). Right now I'm working with a toy example called "range.vec" (I don't want to deal with internet access yet), which stores a sequence of numbers like so:
setClass("range.vec",
representation(start = "numeric", #beginning num in sequence
end = "numeric", #last num in sequence
step = "numeric", #step size
chunk = "numeric", #cache a chunk here to save memory
chunkpos = "numeric"), #where does the chunk start in the overall vec
contains="numeric" #inherits methods from numeric
)
I want this class to inherit the methods from "numeric", but I want it to use these methods on the whole vector, not just the chunk that I'm storing. For example, I don't want to define my own method for 'mean', but I want 'mean' to get the mean of the whole vector by accessing it chunk by chunk, using length(), '[', '[[', and el() functions that I've defined. I've also defined a chunking function:
setGeneric("set.chunk", function(x,...) standardGeneric("set.chunk"))
setMethod("set.chunk", signature(x = "range.vec"),
function (x, chunksize=100, chunkpos=1) {
#This function extracts a chunk of data from the range.vec object.
begin <- x@start + (chunkpos - 1)*x@step
end <- x@start + (chunkpos + chunksize - 2)*x@step
data <- seq(begin, end, x@step) #calculate values in data chunk
#get rid of out-of-bounds values
data[data > x@end] <- NA
x@chunk <- data
x@chunkpos <- chunkpos
return(x)
})
When I try to call a method like 'mean', the function inherits correctly, and accesses my length function, but returns NA because I don't have any data stored in the .Data slot. Is there a way that I can use the .Data slot to point to my chunking function, or to tell the class to chunk numeric methods without defining every single method myself? I'm trying to avoid coding in C if I can. Any advice would be very helpful!
You could remove your chunk slot and replace it by numeric's .Data slot.
Little example:
## class definition
setClass("foo", representation(bar="numeric"), contains="numeric")
setGeneric("set.chunk", function(x, y, z) standardGeneric("set.chunk"))
setMethod("set.chunk",
signature(x="foo", y="numeric", z="numeric"),
function(x, y, z) {
## instead of x@chunk you could use numeric's .Data slot
x@.Data <- y
x@bar <- z
return(x)
})
a <- new("foo")
a <- set.chunk(a, 1:10, 4)
mean(a) # 5.5
Looks like there isn't a good way to do this within the class. The only solution I've found is to tell the user to loop through all of the chunks of data from the cloud and calculate as they go.
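A rough sketch of what "calculate as they go" could look like for the toy sequence, written as a plain function rather than an S4 method. The function name chunked_mean and its arguments are hypothetical; each chunk is regenerated from start and step here, standing in for whatever would fetch a chunk from the cloud.

```r
# Mean of the sequence start, start + step, ..., up to end,
# computed one chunk at a time so only 'chunksize' values exist at once
chunked_mean <- function(start, end, step, chunksize = 100) {
  n_total <- floor((end - start) / step) + 1  # length of the full sequence
  total <- 0
  pos <- 1                                    # index of first element in chunk
  while (pos <= n_total) {
    last <- min(pos + chunksize - 1, n_total)
    chunk <- start + (seq(pos, last) - 1) * step
    total <- total + sum(chunk)
    pos <- last + 1
  }
  total / n_total
}

chunked_mean(1, 1000, 1, chunksize = 64)
# 500.5
```

Only a running sum and a count are kept between chunks, which works for the mean but would need a different accumulator for statistics such as the median.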

Assignment in R language

I am wondering how assignment works in the R language.
Consider the following R shell session:
> x <- c(5, 6, 7)
> x[1] <- 10
> x
[1] 10 6 7
>
which I totally understand. The vector (5, 6, 7) is created and bound to
the symbol 'x'. Later, 'x' is rebound to the new vector (10, 6, 7) because vectors
are immutable data structures.
But what happens here:
> c(4, 5, 6)[1] <- 10
Error in c(4, 5, 6)[1] <- 10 :
target of assignment expands to non-language object
>
or here:
> f <- function() c(4, 5, 6)
> f()[1] <- 10
Error in f()[1] <- 10 : invalid (NULL) left side of assignment
>
It seems to me that one can only assign values to named data structures (like 'x').
The reason why I am asking is because I try to implement the R language core and I am unsure
how to deal with such assignments.
Thanks in advance
It seems to me that one can only assign values to named data structures (like 'x').
That's precisely what the documentation for ?"<-" says:
Description:
Assign a value to a name.
x[1] <- 10 doesn't use the same function as x <- c(5, 6, 7). The former calls [<- while the latter calls <-.
As per @Owen's answer to this question, x[1] <- 10 is really doing two things. It is calling the [<- function, and it is assigning the result of that call to x.
So what you want to achieve your c(4, 5, 6)[1] <- 10 result is:
> `[<-`(c(4, 5, 6),1, 10)
[1] 10 5 6
You can make modifications to anonymous functions, but there is no assignment to anonymous vectors. Even R creates temporary copies with names and you will sometimes see error messages that reflect that fact. You can read this in the R language definition on page 21 where it deals with the evaluation of expressions for "subset assignment" and for other forms of assignment:
x[3:5] <- 13:15
# The result of this commands is as if the following had been executed
`*tmp*` <- x
x <- "[<-"(`*tmp*`, 3:5, value=13:15)
rm(`*tmp*`)
And there is a warning not to use *tmp* as an object name because it would be overwritten during the next call to [<-.
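The equivalence can be checked directly: the short form and an explicit call to `[<-` followed by rebinding the name give identical results. The variables x and y below are only for illustration.

```r
x <- c(5, 6, 7)
y <- x

# Short form: subset assignment on a named object
x[1] <- 10

# Explicit form: call the replacement function, then rebind the name
y <- `[<-`(y, 1, value = 10)

identical(x, y)
# TRUE

# For the result of a function call, bind it to a name first
f <- function() c(4, 5, 6)
z <- f()
z[1] <- 10
z
# 10 5 6
```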
