Some spatstat functions, such as crossing.psp, do not allow marks to be assigned within the function call. I am writing a complicated function with for loops and lapply calls that needs marks on ppp and psp objects. I run into problems when I try to assign marks to these objects through get(). Typically I would use the assign function in such cases, but I cannot get it to work. Here is an example:
library(spatstat)
win <- owin(c(0,1), c(0,1))
p1 <- ppp(0.1, 0.3, window = win)
p2 <- ppp(0.2, 0.4, window = win)
p3 <- ppp(0.4, 0.7, window = win)
points <- c("p1", "p2", "p3")
For those who are not familiar with the package, marks work as follows:
marks(p1) <- "p1"
What I want to do is (or something similar, which gives the desired result):
for(i in length(points)){
marks(get(points[i])) <- points[i]}
This, of course, does not work because I am using the assignment operator on the result of get(). If I try the assign function, I get an error:
for(i in 1:length(points)) assign(marks(get(points[i])), points[i])
#Error in assign(marks(get(points[i])), points[i]) :
# invalid first argument
# Or following also gives the same error:
for(i in 1:length(points)) assign(x = marks, value = points[i], envir = get(points[i]))
I have also tried:
setmarks(mget(points), points)
sapply(seq_along(points), function(i) marks(get(points[i])) <- points[i])
How can I assign marks to spatstat objects within loops or using apply commands?
This will do the trick:
for(i in points) {
assign(i, do.call(`marks<-`, list(x=as.symbol(i), value=i)))
}
## Check a point pattern to see that it works
marks(p3)
# [1] "p3"
If the need for a pair of nested function calls (the inner one to marks<-() and the outer one to assign()) seems mysterious, have a look at the "subset assignment" section of R-lang.
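To see why the nested call is needed, it can help to strip the idea down to base R. The following is a minimal sketch, not spatstat code: it uses the built-in replacement function attr<- as a stand-in for marks<-, and the object names x1 and x2 are made up for illustration. A call such as marks(p1) <- "p1" is shorthand for assigning the result of the replacement function back to p1, so when the object is only known by name, the replacement function can be called directly and its result stored with assign():

```r
# Two plain vectors standing in for the ppp objects (hypothetical names)
x1 <- 1:3
x2 <- 4:6
objs <- c("x1", "x2")

for (nm in objs) {
  # `attr<-`(x, which, value) returns a modified copy of x;
  # assign() then stores that copy back under the original name
  assign(nm, `attr<-`(get(nm), "label", value = nm))
}

attr(x1, "label")
attr(x2, "label")
```

The same pattern should carry over to marks<- because it is also a replacement function that returns the modified object.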
How can I dynamically generate a function parameter in R?
I want to use the paste0 function to build different parameters for the lines function:
lines(paste0(gdaxisymbol[i],"$",gdaxisymbol[i],".Adjusted"))
It ends with a warning:
In xy.coords(x, y) : NAs introduced by coercion
thx
Hi, thx for your answer and sorry for the unclear question
Here is my full code:
library(XML)
library(RCurl)
library(quantmod)  # provides getSymbols()
s <- getURL("https://de.finance.yahoo.com/quote/^GDAXI/components")
t=readHTMLTable(s)
gdaxi = t[["NULL"]]
gdaxi = gdaxi[,-(3:6) ]
gdaxisymbol <- gdaxi$Symbol
getSymbols(gdaxisymbol,from="2015-12-31",to="2021-01-31", auto.assign = TRUE)
plot(ADS.DE$ADS.DE.Adjusted)
for(i in 1:30)
{
lines(paste0(gdaxisymbol[i],"$",gdaxisymbol[i],".Adjusted"))
}
I would like to plot all adjusted share prices in one plot.
But if I execute the loop I get the warning above.
How can I pass the correct parameter dynamically to the lines function?
If I draw one line with
lines(ALV.DE$ALV.DE.Adjusted)
it works fine.
thx
This question isn't perfectly clear. I'm not sure whether the problem is with the lines function or something else. But I'll answer the question regarding how to dynamically pass parameters to a function.
If you want to dynamically pass parameters for a function and send them into paste0, the easiest way to do it is to pass the parameters in on a list or vector. Here is an example:
# Define function
paste_special <- function(inputs) {
ret <- paste0(inputs, collapse = "")
return(ret)
}
# Create sample data
myinputs <- c("A", "$", "B", ".adjusted")
# Call function and view results
paste_special(myinputs)
# [1] "A$B.adjusted"
Because a list or vector is flexible and can hold any number of elements, your function can also take any number of elements.
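One caveat worth illustrating: paste0() only ever produces a character string, and a string that merely spells out an expression such as ALV.DE$ALV.DE.Adjusted is not evaluated, which would explain the coercion warning from lines(). A possible way around this is to look the object up by name with get() and then select the column by name. The sketch below assumes small data frames as stand-ins for the objects that getSymbols() creates; the values and the helper name adjusted are made up for illustration.

```r
# Hypothetical stand-ins for the objects created by getSymbols()
ADS.DE <- data.frame(ADS.DE.Adjusted = c(100, 102, 101))
ALV.DE <- data.frame(ALV.DE.Adjusted = c(200, 198, 205))
symbols <- c("ADS.DE", "ALV.DE")

# Resolve the object by its name, then pick the column by its name
adjusted <- function(sym) {
  get(sym)[[paste0(sym, ".Adjusted")]]
}

series <- lapply(symbols, adjusted)
names(series) <- symbols
series
```

Inside the original loop, the same lookup could then be handed to lines(), for example lines(get(sym)[, paste0(sym, ".Adjusted")]) with sym taken from gdaxisymbol[i].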
I am trying to analyse a dataframe using hierarchical clustering hclust function in R.
I would like to pass in a vector of p values I'll write beforehand (maybe something like c(5/4, 3/2, 7/4, 9/4)) and be able to have these specified as the different p value options with Minkowski distance when I use expand.grid. Ideally, when hyperparams is viewed, it would also be clear which value of p has been used for each minkowski, i.e. they should be labelled. So for example, where (if you run my code for hyperparams) there would currently just be one minkowski under Dists, for each of the methods in Meths, there would be, if I supplied the p vector as c(5/4, 3/2, 7/4, 9/4), now instead 4 rows for Minkowski distance: minkowski, p=5/4, minkowski, p=3/2, minkowski, p=7/4, minkowski, p=9/4 (or looking something like that, making the p values clear). Any ideas?
(Note: no packages please, only base R!)
Edit: I worded it poorly before, now rewritten. Let's take the following example instead:
acc <- function(x){
first = sum(x)
second = sum(x^2)
return(list(First=first,Second=second))
}
iris0 <- iris
iris1 <- cbind(log(iris[,1:4]),iris[5])
iris2 <- cbind(sqrt(iris[,1:4]),iris[5])
Now the important bit:
tests <- expand.grid(Dists=c("euclidean","maximum","manhattan","canberra","binary"),
DS=c("iris0","iris1","iris2"))
Table <- Map(function(x, ds){acc(table(ds$Species, cutree(hclust(dist(get(ds)[,1:4], method=x)),3)))},tests[[1]], tests[[2]])
This will work. But now if I want to include a term like "minkowski",p=3 in expand.grid, how would I do it?
tests <- expand.grid(Dists=c("euclidean","maximum","manhattan","canberra","binary","minkowski,p=3"),
DS=c("iris0","iris1","iris2"))
Table <- Map(function(x, ds){acc(table(ds$Species, cutree(hclust(dist(get(ds)[,1:4], method=x)),3)))},tests[[1]], tests[[2]])
This gives an error.
In reality there should be no p argument unless method="minkowski". I have tried using strsplit to get the first part of the expression into ds, and a switch on the strsplit result to get the second part and then parse it (it would return NULL if the length of the strsplit result was not 2, which should pass no argument, I think). The issue seems to be that strsplit(x, ",") does not evaluate the vectorized x but instead tries to evaluate the character x, which is not a string. Can anyone suggest a workaround, fix, or other method for including minkowski,p=1.6 terms and the like?
We can create a 'p' value column. First, encode the value of p in the name of each Minkowski entry:
tests <- expand.grid(Dists=c("euclidean","maximum","manhattan","canberra","binary",
"minkowski3", "minkowski4", "minkowski5"),
DS=c("iris0","iris1","iris2"))
Then add a column of 'p' values to 'tests', strip the numeric suffix from the method names, and pass p on to dist:
tests$p <- as.list(args(dist))$p # default value
i1 <- grepl("minkowski", tests$Dists)
tests$Dists <- sub("[0-9.]+$", "", tests$Dists)
tests$p[i1] <- rep(3:5, length.out = sum(i1))
Map(function(x, ds, p){
dist1 <- dist(get(ds)[, 1:4], method = x, p = p)
ct <- cutree(hclust(dist1), 3)
acc(table(get(ds)$Species, ct))},
as.character(tests[[1]]), as.character(tests[[2]]), tests$p )
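If the p values are supplied as a numeric vector up front, an alternative is to build the grid in two parts and keep p as its own numeric column, with a label column for display. This is only a sketch in base R: the names p_values, datasets, grid, Label and run_one are made up for illustration, only two of the non-Minkowski methods and two of the data sets are used to keep it short, and p falls back to 2 (the default of dist) for the methods that ignore it.

```r
p_values   <- c(5/4, 3/2, 7/4, 9/4)
base_dists <- c("euclidean", "manhattan")
datasets   <- list(iris0 = iris,
                   iris2 = cbind(sqrt(iris[, 1:4]), iris[5]))

# One block of rows without p, one block with every Minkowski p
grid <- rbind(
  expand.grid(Dists = base_dists, p = NA_real_, DS = names(datasets),
              stringsAsFactors = FALSE),
  expand.grid(Dists = "minkowski", p = p_values, DS = names(datasets),
              stringsAsFactors = FALSE)
)
# Human-readable label that makes the p value visible
grid$Label <- ifelse(is.na(grid$p), grid$Dists,
                     paste0("minkowski, p=", grid$p))

run_one <- function(method, p, ds) {
  d <- datasets[[ds]]
  p_use <- if (is.na(p)) 2 else p   # dist() ignores p unless minkowski
  clusters <- cutree(hclust(dist(d[, 1:4], method = method, p = p_use)), 3)
  table(d$Species, clusters)
}

results <- Map(run_one, grid$Dists, grid$p, grid$DS)
names(results) <- paste(grid$Label, grid$DS, sep = " / ")
unique(grid$Label)
```

Keeping p numeric avoids having to parse it back out of a string, and the Label column gives the labelled rows asked for in the question.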
I am new to R and am working on writing some cool functions while I learn statistics in parallel. I'm trying to make a function that will take a numeric vector, perform the "root mean squared" operations and then have the output return essentially same vector with the possible outliers removed.
For example, if the vector is c(2,4,9,10,100) the resulting RMS would be about 37.
Therefore, I want the output to return the same vector with the possible outlier (in this case, 100) removed from the dataset. So the result would be 2, 4, 9, 10
I put my code below but the output isn't working. I tried it 2 different ways. Everything up to the line that says RMS final works. But below that it does not.
How can I modify this function so that it does what I want? Also, as a bonus, and this might be asking a lot but based on my coding below, any tips for a newbie on making functions would be something I'd be grateful for as well. Thanks so much!
RMS_x <- c(2,4,9,10,100)
#Root Mean Squared Function - Takes a numeric vector
RMS <- function(RMS_x){
RMS_MEAN <- mean(RMS_x)
RMS_DIFF <- (RMS_x-RMS_MEAN)
RMS_DIFF_SQ <- RMS_DIFF^2
RMS_FINAL <- sqrt(sum(RMS_DIFF_SQ)/length(RMS_x))
for(i in length(RMS_x)){
if(abs(RMS_x[i]) > RMS_FINAL){
output <- RMS_x[i]}
else {NULL} }
return(output)
}
#Root Mean Squared Function - Takes a numeric vector
RMS <- function(RMS_x){
RMS_MEAN <- mean(RMS_x)
RMS_DIFF <- (RMS_x-RMS_MEAN)
RMS_DIFF_SQ <- RMS_DIFF^2
RMS_FINAL <- sqrt(sum(RMS_DIFF_SQ)/length(RMS_x))
#output <- ifelse(abs(RMS_x) > RMS_FINAL,RMS_x, NULL)
return(RMS_FINAL)
}
Try the following in the first lines of the RMS function.
RMS <- function(RMS_x) {
bp <- boxplot(RMS_x, plot = FALSE)
RMS_x <- RMS_x[!(RMS_x %in% bp$out)]
...
Now, you have RMS_x sans the outliers.
The boxplot function has a way of determining the outliers. Here, I am using that to remove them.
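As a self-contained illustration of the same idea, the sketch below uses boxplot.stats(), the function that boxplot() relies on for its outlier rule, so no plot is involved at all. The helper name remove_outliers is made up for this example.

```r
# Drop the values that the boxplot rule (1.5 * IQR beyond the hinges)
# flags as outliers
remove_outliers <- function(x) {
  out <- boxplot.stats(x)$out
  x[!(x %in% out)]
}

RMS_x <- c(2, 4, 9, 10, 100)
remove_outliers(RMS_x)
# [1]  2  4  9 10
```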
Since you are asking more specifically about R and R functions I’ll focus my response on that. There are a couple errors I'll point out then provide a few alternative solutions.
Your first function isn’t producing the output you want for two reasons:
The logic makes the function return a single value rather than a vector. If you're trying to fill a vector within your for loop (one without the outlier), make sure to initialize the vector before the loop: output <- vector() (note that my solution below does not require this). Also, the value it returns is just a value in your vector RMS_x that is greater than the RMS rather than an identified outlier, just FYI in case that is not what you intended.
There's an error or typo in your for loop argument. It's minor, but it means the loop body runs only once, which is the opposite of what you intended. The for loop needs a vector to loop through, so the argument should be: for(i in 1:length(RMS_x))
In your code the loop jumps straight to i = 5 because that is the length of your vector (length(RMS_x) = 5). Because the values in RMS_x were already in ascending order, your code happens to give the "right" answer, but only because of how you initially loaded the vector. This may have been a typo in your question, and it's a difference of only two characters, but it totally changes what the function looks for.
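The difference is easy to see by recording which indices each loop visits. This is a small sketch with made-up variable names; it uses seq_along(), which behaves like 1:length(x) for non-empty vectors and is also safe when the vector is empty.

```r
x <- c(2, 4, 9, 10, 100)

# Loops over the single number 5, so the body runs once
visited_wrong <- c()
for (i in length(x)) visited_wrong <- c(visited_wrong, i)

# Loops over 1, 2, 3, 4, 5
visited_right <- c()
for (i in seq_along(x)) visited_right <- c(visited_right, i)

visited_wrong
# [1] 5
visited_right
# [1] 1 2 3 4 5
```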
Solution:
To get what you are trying to accomplish, you need two functions: 1.) one that defines what's considered an outlier in your data set and 2.) a second function that strips out the outliers and calculates RMS. From there, either keep the functions independent or nest them to pass variables (this also goes with your bonus request, since it shows multiple ways of writing functions).
Function to identify outliers:
outlrs <- function(vec){
Q1 <- summary(vec)["1st Qu."]
Q3 <- summary(vec)["3rd Qu."]
# defining outliers can get complicated depending on your sample data but
# your data set is super simple so we'll keep it that way
IQR <- Q3 - Q1
lower_bound <- Q1 - 1.5*(IQR)
upper_bound <- Q3 + 1.5*(IQR)
bounds <- c(lower_bound, upper_bound)
# assign() creates an actual object called non_outlier_range in your
# global environment that you can access directly; it must run before
# return(), because nothing after return() is executed
assign("non_outlier_range", bounds, envir = globalenv())
# return() just means the result is printed to the console or stored in
# whatever variable you assign the call to
return(bounds)
}
Now moving on to the second function, a few options here:
First Way: Input bounds argument into RMS_func()
RMS_func <- function(dat, bounds){
dat <- dat[!(dat < min(bounds)) & !(dat > max(bounds))]
dat_MEAN <- mean(dat)
dat_DIFF <- (dat-dat_MEAN)
dat_DIFF_SQ <- dat_DIFF^2
dat_FINAL <- sqrt(sum(dat_DIFF_SQ)/length(dat))
return(dat_FINAL)
}
# Call function from approach 1 - the bounds must exist first, created either
# by the assign() inside outlrs() or by storing its return value:
non_outlier_range <- outlrs(vec = RMS_x)
RMS_func(dat = RMS_x, bounds = non_outlier_range)
Second Way: Call outlrs() inside the second function
RMS_func <- function(dat){
bounds <- outlrs(vec = dat)
dat <- dat[!(dat < min(bounds)) & !(dat > max(bounds))]
dat_MEAN <- mean(dat)
dat_DIFF <- (dat-dat_MEAN)
dat_DIFF_SQ <- dat_DIFF^2
dat_FINAL <- sqrt(sum(dat_DIFF_SQ)/length(dat))
return(dat_FINAL)
}
# Call RMS_func - here the assign() in outlrs() is not needed because the
# output exists within the function's temporary environment and is passed
# to RMS_func
RMS_func(dat = RMS_x)
Third Way: Nest outlrs() definition within the RMS_Func - in this case you only need one nested function to accomplish your task
RMS_Func <- function(dat){
outlrs <- function(vec){
Q1 <- summary(vec)["1st Qu."]
Q3 <- summary(vec)["3rd Qu."]
# Alternative using quantile():
#Q1 <- quantile(vec)["25%"]
#Q3 <- quantile(vec)["75%"]
IQR <- Q3 - Q1
lower_bound <- Q1 - 1.5*(IQR)
upper_bound <- Q3 + 1.5*(IQR)
bounds <- c(lower_bound, upper_bound)
return(bounds)
}
bounds <- outlrs(vec = dat)
dat <- dat[!(dat < min(bounds)) & !(dat > max(bounds))]
dat_MEAN <- mean(dat)
dat_DIFF <- (dat-dat_MEAN)
dat_DIFF_SQ <- dat_DIFF^2
dat_FINAL <- sqrt(sum(dat_DIFF_SQ)/length(dat))
return(dat_FINAL)
}
P.S. Wrote this pretty quickly - will likely re-test and edit later. Hopefully for now this helps.
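Since the question asked for the vector with the outliers removed, here is one more compact sketch that returns both the cleaned vector and its RMS. It applies the same 1.5 * IQR rule via quantile(); the function name rms_without_outliers and the names of the list elements are made up for illustration.

```r
rms_without_outliers <- function(x) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  lower <- q[1] - 1.5 * iqr
  upper <- q[2] + 1.5 * iqr
  kept <- x[x >= lower & x <= upper]
  # Root mean squared deviation from the mean, as in the question
  rms <- sqrt(sum((kept - mean(kept))^2) / length(kept))
  list(kept = kept, rms = rms, bounds = c(lower, upper))
}

res <- rms_without_outliers(c(2, 4, 9, 10, 100))
res$kept
# [1]  2  4  9 10
res$bounds
# [1] -5 19
```

Returning a list keeps the function free of side effects, so nothing has to be written to the global environment with assign().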
Yesterday, I asked a question regarding how to plot multiple (horizontal-ish) lines, each with a user-specified color, without using a loop.
I tried to use the suggested function matplot() to plot the short vertical lines shown in the plot below with the relevant code.
ci = matrix(1:30, nrow=3, byrow=T)
ci=list(rbind(ci[1,], ci[1,]+2),
rbind(ci[2,], ci[2,]+2),
rbind(ci[3,], ci[3,]+2))
x = rbind(1:10, 1:10)
plot(-5, xlim=c(1,10), ylim=c(1,32))
invisible(mapply(matlines, x=list(x), y=ci,
col=c("red","blue","black"),
lty = 1))
This is all good. However, I am trying to wrap this code inside a function, and I would like to be able to pass a list of optional arguments on to mapply/matlines. I tried the MoreArgs argument of mapply() to no avail. It seems that arguments supplied through MoreArgs are not treated the same as the others. As you can see in the first code, each item of the list gets a different color, but when I put col inside the list args.ci, the 3 colors are recycled within each item of the list. Is there any way to resolve this so that, if I have multiple values for an argument, each value gets applied to one item of the list? Thanks!
args.ci = list(col=c("red","blue","black"), lty=1:3)
plot(-5, xlim=c(1,10), ylim=c(1,32))
invisible(mapply(matlines, x=list(x), y=ci,
MoreArgs = args.ci))
Here's a general approach to this sort of problem. You should be able to adapt it to do more exactly what you want:
myFun <- function(...) {
fixedArgs <- list(matlines, x=list(x), y=ci)
dots <- list(...)
allArgs <- c(fixedArgs, dots)
plot(-5, xlim=c(1,10), ylim=c(1,32))
invisible(do.call(mapply, allArgs))
}
myFun(col=c("red","blue","black"), lty=1)
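The reason this works where MoreArgs did not is that mapply() walks element by element through the arguments in ..., while everything in MoreArgs is handed over whole on every call. The sketch below shows the difference without any plotting; describe is a made-up helper that simply reports what it receives.

```r
# Reports the item together with every colour value it was given
describe <- function(y, col) paste(y, paste(col, collapse = "+"))

ys   <- c("a", "b", "c")
cols <- c("red", "blue", "black")

# Vectorised argument: one colour per item
per_item <- mapply(describe, y = ys, col = cols, USE.NAMES = FALSE)
per_item
# [1] "a red"   "b blue"  "c black"

# MoreArgs: the whole colour vector goes to every call
shared <- mapply(describe, y = ys, MoreArgs = list(col = cols),
                 USE.NAMES = FALSE)
shared
# [1] "a red+blue+black" "b red+blue+black" "c red+blue+black"
```

Building the argument list and calling do.call(mapply, ...) therefore keeps optional arguments such as col in the vectorised position.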
A previous post provided a solution to iteratively store plots in R: see ... iteratively in R... . I had a similar problem and after reading and implementing the solutions provided by the post I am still unable to solve my problem.
The previous post provided the following code:
# Create a list to hold the plot objects.
pltList <- list()
for( i in 2:15 ){
# Get data, perform analysis, etc.
# Create plot name.
pltName <- paste( 'a', i, sep = '' )
# Store a plot in the list using the name as an index.
pltList[[ pltName ]] <- plot()
}
The following is my code implementation:
a <- list.files("F:.../4hrs", pattern='.csv')
pltList <- list()
i=1
for (x in a) {
myfiles <- read.csv(a, header=TRUE, as.is=TRUE, nrows=2500)
h <- hist(data, plot=F)
# perform analysis, ect.
pltName <- paste('a', formatC(i, width=2, flag='0'), sep='')
pltList[[ pltName ]] <- plot(h)
i <- i+1
}
pltName does produce a list of names but pltList is of length zero.
I am not sure why pltList is not being assigned the plots.
What I eventually want to do is create a pltList with multiple plots contained therein. Then plot those plots in par(mfrow=c(2,1)) style and export as a .pdf.
I should mention that the above works for
pltList[[ pltName ]] <- xyplot(h)
but then I am unable to plot multiple plots in the style of par(mfrow=c(2,1)).
Any suggestions are appreciated.
In the original question you referenced and my answer to it, plot() was used as an abstract placeholder for a plotting function that returned an object and not a literal call to the R function plot. Most functions that return graphics objects are based on the 'grid' package such as xyplot from the lattice package or qplot from ggplot2.
Sorry for the confusion, I should have made this point clear in my answer, but I did not as the asker of the question was already aware of it.
Base graphics functions such as hist and plot render directly to the output device, and plot() returns NULL rather than an object that can be used to recreate the plot later, which is why you end up with a list of length zero.
In order to use this approach you will have to trade usage of the hist function for something that returns an object. Using ggplot2, you could do something like the following in your loop:
# Don't call your data variable 'data' or ggplot will confuse it with the
# `data` function and suffer an error.
h <- qplot(x = plot_data)
pltName <- paste('a', formatC(i, width=2, flag='0'), sep='')
pltList[[ pltName ]] <- h
I have edited my answer to the previous question to make it clear that the use of plot() in my example is not an actual call to the R function of the same name.
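If staying with base graphics and par(mfrow = c(2, 1)) is the goal, a different route from the ggplot2 one above is to store the histogram objects that hist(..., plot = FALSE) returns and only draw them later. This is a sketch under assumptions: the data are simulated, and the names samples, histList and out_file are made up for illustration.

```r
set.seed(1)
# Simulated stand-ins for the data read from each file
samples <- list(a01 = rnorm(100), a02 = runif(100))

# hist() with plot = FALSE returns an object of class "histogram"
histList <- lapply(samples, hist, plot = FALSE)

# Draw the stored histograms two per page into a PDF
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)
par(mfrow = c(2, 1))
for (nm in names(histList)) {
  plot(histList[[nm]], main = nm, xlab = "value")
}
dev.off()
```

What gets stored here is the binned data rather than the drawing itself, so the plots can be re-rendered with different titles or colours at any time.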
Your code uses files I don't have, so I can't replicate it. I am also not entirely sure what you are trying to accomplish, but I do see some problems in the code, and pointing them out might help fix it:
a <- list.files("F:.../4hrs", pattern='.csv')
I am not familiar with list.files. Is this correctly assigning a? .csv seems an odd pattern.
pltList <- list()
i=1
for (x in a) {
myfiles <- read.csv(a, header=TRUE, as.is=TRUE, nrows=2500)
Here I think a is a vector containing the filenames, right? You are looping x over every value of a, but I don't see x used anywhere in the code. Also, you are passing the whole vector of filenames to read.csv here. Shouldn't this be read.csv(x,..., or better yet, loop for (i in 1:length(a)) and index a[i].
h <- hist(data, plot=F)
I don't see the object data anywhere. Is h correctly assigned?
# perform analysis, ect.
pltName <- paste('a', formatC(i, width=2, flag='0'), sep='')
pltList[[ pltName ]] <- plot(h)
i <- i+1
}
What I like to do is simply run such a loop by hand and see what is going on. I think there is a problem in the assignment of myfiles or h.
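Putting these remarks together, a corrected loop might look like the sketch below. It is self-contained only because it first writes two small CSV files to a temporary directory; the directory, the file names and the column name value are all made up for illustration, and the pattern "\\.csv$" is used so that only names ending in .csv match.

```r
# Create two small example files (hypothetical data)
csv_dir <- file.path(tempdir(), "csv_demo")
dir.create(csv_dir, showWarnings = FALSE)
write.csv(data.frame(value = 1:10),
          file.path(csv_dir, "first.csv"), row.names = FALSE)
write.csv(data.frame(value = 11:20),
          file.path(csv_dir, "second.csv"), row.names = FALSE)

files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)

hists <- list()
for (i in seq_along(files)) {
  # Read one file per iteration, not the whole vector of names
  one_file <- read.csv(files[i], header = TRUE, as.is = TRUE)
  pltName <- paste0("a", formatC(i, width = 2, flag = "0"))
  # Store the histogram object; plot(hists[[pltName]]) can draw it later
  hists[[pltName]] <- hist(one_file$value, plot = FALSE)
}

names(hists)
# [1] "a01" "a02"
```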