I've written a simulation function that outputs a vector of 100 elements. I want to use the *apply functions to run it many times and store the output of each run in a new vector.
The function looks like:
J <- c(1:100)
species_richness <- function(J){
a <- table(J)
return(NROW(a))
}
simulation <- function(J,gens,ploton=FALSE,v=0.1){
species_richness_output <- rep(NA,gens)
for(rep in 1:gens){
index1 <- sample(1:length(J),1)
if(runif(1,0,1) < v){
J[index1] <- (rep+100)
}
else{
index2 <- sample(1:length(J),1)
while(index1==index2) {
index2 <- sample(1:length(J),1)
}
J[index1] <- J[index2]
}
species_richness_output[rep] <- species_richness(J)
}
species_abundance <- function(J){
a <- table(J)
return(a)
}
abuntable <- species_abundance(J)
print(abuntable)
octaves <- function(abuntable){
oct <- (rep(0,log2(sum(abuntable))))
for(i in 1:length(abuntable)){
oct2 <- floor(log2(abuntable[i])+1)
oct[oct2] <- oct[oct2]+1
}
print(oct)
}
# octaves(c(100,64,63,5,4,3,2,2,1,1,1,1))
if(ploton==TRUE){
hist(octaves(abuntable))
}
print(species_richness(J))
return(J)
}
simulation(J, 10000,TRUE,v=0.1)
So that's my function. It takes J, a vector I defined earlier, manipulates it, and then returns:
the newly simulated vector J of 100 elements
a function called octave that categorises the new vector
a histogram corresponding to the above "octave"
I have tried a number of variations: using lapply, mapply
putting args=args_from_original_simulation
simulation_repeated <- c(mapply(list, FUN=simulation(args),times=10000))
but I keep getting an error with the match.fun part of the mapply function
Error in match.fun(FUN) :
'simulation(J, 10000, FALSE, 0.1)' is not a function, character or symbol
This is despite the simulation I have written showing as being saved as a function in the workspace.
Does anyone know what this error is pointing to?
In this line:
simulation_repeated <- c(mapply(list, FUN=simulation(args),times=10000))
You are not giving a function to mapply. You are (essentially) passing the result of calling simulation(args), and simulation does not return a function. FUN has to be the function itself, not a call to it.
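As a minimal sketch of one way to repeat the run, assuming a much simpler stand-in called toy_simulation (a hypothetical name, not the simulation from the question): wrap the call in an anonymous function so that lapply receives a function, or let replicate re-evaluate the call expression.

```r
# Hypothetical stand-in for the real simulation: returns a vector of 100 values.
toy_simulation <- function(J, gens, v = 0.1) {
  for (g in seq_len(gens)) {
    i <- sample(length(J), 1)
    if (runif(1) < v) {
      J[i] <- g + 100                  # introduce a new species label
    } else {
      J[i] <- J[sample(length(J), 1)]  # copy the label of a random individual
    }
  }
  J
}

J <- 1:100
n_runs <- 5

# lapply needs a function; the anonymous wrapper ignores its index argument k.
runs_list <- lapply(seq_len(n_runs), function(k) toy_simulation(J, gens = 50))

# replicate re-evaluates the expression on every repetition; each column is one run.
runs_matrix <- replicate(n_runs, toy_simulation(J, gens = 50))
```

Here runs_list holds one vector per run, while runs_matrix has one column per run because replicate simplifies equal-length results into a matrix.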
Hokay, so I have a for loop right at the end of this function here that's supposed to output the generated prime numbers that are created early in the function.
When I run the code line by line, everything works as expected and the two for loops output every element in the lists: PrimeList_p, PrimeList_q, to the console window.
When I run the code by calling the function however, the for loops only output the first 20 or so elements to the console. Why doesn't it print the whole list?
require(gmp)
GenPrimes <- function(InitialSize) {
#List initialisation
PrimeList_p <<- list()
PrimeList_q <<- list()
#Loop initialisation
x <- 1
#LOOP START
while (x < 81) {
#Generate and compile prime numbers into Prime_List1.
PrimeList_p[[x]] <- nextprime(urand.bigz(size = InitialSize + x, seed =
Sys.time()))
x <- x+1
PrimeList_q[[x]] <- nextprime(urand.bigz(size = InitialSize + x, seed =
Sys.time()))
x <- x+1
}
#LOOP END
#Remove NULL entries in lists
PrimeList_p <<- PrimeList_p[-which(sapply(PrimeList_p, is.null))]
PrimeList_q <<- PrimeList_q[-which(sapply(PrimeList_q, is.null))]
cat("Prime p:")
for (i in 1:40){
message(PrimeList_p[[i]])
}
cat("Prime q")
for (j in 1:40){
message(PrimeList_q[[j]])
}
}
GenPrimes(1)
You're incrementing x twice in each loop iteration, which leaves NULL entries in both lists. Your code removes them, but it saves the cleaned lists to the global environment with <<-. The copies inside the function still contain the NULL values, and those are the ones the message loops read, which is why you see a blank line between each value and only about 20 primes per list. Change for (i in 1:40) to for (i in seq_along(PrimeList_p)) and this will become evident.
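The difference between the local and the global copy can be reproduced in a few lines. This is only a sketch with made-up names (shadow_demo and demo_list are hypothetical), not the original prime generator:

```r
shadow_demo <- function() {
  demo_list <<- list()     # creates demo_list in the global environment
  demo_list[[2]] <- "b"    # plain assignment: makes a LOCAL copy with a NULL in slot 1
  # The cleaned list is written to the global variable only.
  demo_list <<- demo_list[!sapply(demo_list, is.null)]
  length(demo_list)        # the local copy is looked up first and still has 2 slots
}

local_length <- shadow_demo()
global_length <- length(demo_list)
```

After the call, local_length is 2 while global_length is 1, which mirrors the blank lines printed by the message loops in the question.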
I have problems storing user defined functions in R list when they are put on it in a for loop.
I have to define some segment-specific functions based on some parameters, so I create functions and put them on a list looping through segments with for-loop. The problem is I get same function everywhere on a result list.
The code looks like this:
n <- 100
segmenty <- 1:n
segment_functions <- list()
for (i in segmenty){
segment_functions[[i]] <- function(){return(i)}
}
When I run the code, what I get is the same function (the last one created in the loop) for all indexes:
## for all k
segment_functions[[k]]()
[1] 100
There is no problem when I put the functions on list manually e.g.
segment_functions[[1]] <- function(){return(1)}
segment_functions[[2]] <- function(){return(2)}
segment_functions[[3]] <- function(){return(3)}
works just fine.
I honestly have no idea what's wrong. Could you help?
You need to use the force function to ensure that the evaluation of i is done during the assignment into the list:
n <- 100
segmenty <- 1:n
segment_functions <- list()
f <- function(i) { force(i); function() return(i) }
for (i in segmenty){
segment_functions[[i]] <- f(i)
}
I'd use lapply and capture i in a closure of the wrapper:
segment_functions <- lapply(1:100, function(i) function() i)
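To see why the factory approach helps, here is a small comparison. It is a sketch with illustrative names (make_constant, loop_functions and factory_functions are not from the question):

```r
# In a for loop every closure shares one environment, so all of them see the final i.
loop_functions <- list()
for (i in 1:3) {
  loop_functions[[i]] <- function() i
}

# Each factory call creates a fresh environment; force() evaluates i right away.
make_constant <- function(i) {
  force(i)
  function() i
}
factory_functions <- list()
for (i in 1:3) {
  factory_functions[[i]] <- make_constant(i)
}

loop_results <- sapply(loop_functions, function(f) f())        # 3 3 3
factory_results <- sapply(factory_functions, function(f) f())  # 1 2 3
```

The loop version looks up i only when a function is called, by which time the loop has finished, whereas each factory-made function carries its own copy of i.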
readStateData <- function() {
infile <- paste("state",i,".txt",sep="")
state <- readLines(infile,n=1)
statedata <- read.table(infile,header=FALSE,sep=",",skip=1,col.names=c("Rank","City","Population"))
statename <- list(state,statedata)
statename
}
# Start loop
for(i in 1:50) {
readStateData()
# Add function to big.list
big.list[[i]] <- readStateData(statename)
}
The assignment for class is to bring in 50 files, all named state#.txt, get the state via readLines, get the data via read.table, and ultimately put it all into big.list that'll have all of the data through a for loop.
The problem I'm having is calling the function in during the for loop. I get the error:
Error in readStateData(statename) : unused argument (statename)
I'm either not calling in the function properly or I've written the function wrong. Both are likely.
Thank you for your help.
You have different issues here.
Do not refer inside a function to a variable that is defined outside it. That is, instead of reading an externally defined i inside the function:
i <- 1
fct <- function() {
a <- i + 1
return(a)
}
fct()
Pass the variable as an argument to the function:
i <- 1
fct <- function(x) {
a <- x + 1
return(a)
}
fct(i)
Your function has no explicit return statement; its last line is just the variable name. R returns the value of the last evaluated expression, so this does work, but an explicit return() makes the intent clearer.
Ergo your code should look like this
readStateData <- function(x) {
infile <- paste("state",x,".txt",sep="")
state <- readLines(infile,n=1)
statedata <-read.table(infile,header=FALSE,sep=",",skip=1,col.names=c("Rank","City","Population"))
statename <- list(state,statedata)
return(statename)
}
# Create the list, then start the loop
big.list <- list()
for(i in 1:50) {
j <- readStateData(i)
# Add the result to big.list
big.list[[i]] <- j
}
If your files are all of the pattern: state[number].txt you can simplify your code to:
# Get all files with pattern state*.txt
fls <- dir(pattern='state.*txt')
readStateData <- function(x) {
state <- readLines(x, n=1)
statedata <-read.table(x, header=FALSE,sep=",",skip=1,col.names=c("Rank","City","Population"))
statename <- list(state,statedata)
return(statename)
}
# Create the list, then start the loop
big.list <- list()
for(i in seq_along(fls)) {
j <- readStateData(fls[i])
# Add the result to big.list
big.list[[i]] <- j
}
I have defined a function which I want to reapply to its own output multiple times. I tried
replicate(1000,myfunction)
but realised that this is just applying my function to my initial input 1000 times, rather than applying my function to the new output each time. In effect what I desire is:
function(function(...function(x_0)...))
1000 times over and being able to see the changes at each stage.
I have previously defined b as a certain vector of length 7.
b_0=b
C=matrix(0,7,1000)
for(k in 1:1000){
b_k=myfun(b_(k-1))
}
C=rbind(b_k)
C
Is this the right idea behind what I want?
You could use Reduce for this. For example
add_two <- function(a) a+2
ignore_current <- function(f) function(a,b) f(a)
Reduce(ignore_current(add_two), 1:10, init=4)
# 24
Normally Reduce expects to iterate over a set of new values, but in this case I use ignore_current to drop the sequence value (1:10) so that parameter is just used to control the number of times we repeat the process. This is the same as
add_two(add_two(add_two(add_two(add_two(add_two(add_two(add_two(add_two(add_two(4))))))))))
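Since the question also asks to see the changes at each stage, a possible variation is to pass accumulate = TRUE so that Reduce keeps every intermediate value. This is a sketch; the anonymous function and its argument names state and step are illustrative.

```r
add_two <- function(a) a + 2

# The second argument (step) is ignored; it only controls how many times we iterate.
stages <- Reduce(function(state, step) add_two(state),
                 seq_len(3), init = 4, accumulate = TRUE)
stages
# 4 6 8 10
```

The first element is the initial value and each later element is the result of one more application. If the state is a vector rather than a single number, Reduce returns a list with one element per stage.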
Pure functional programming approach, use Compose from functional package:
library(functional)
f = Reduce(Compose, replicate(100, function(x) x+2))
#> f(2)
#[1] 202
However, this solution fails when n is too large, presumably because the deeply nested calls exceed R's recursion limits.
A loop would work just fine here.
apply_fun_n_times <- function(input, fun, n){
for(i in seq_len(n)){
input <- fun(input)
}
return(input)
}
addone <- function(x){x+1}
apply_fun_n_times(1, addone, 3)
which gives
> apply_fun_n_times(1, addone, 3)
[1] 4
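If every intermediate result is needed, as with the matrix C in the question, the loop can write each step into a preallocated matrix. This is a sketch under assumptions: halve is a hypothetical stand-in for myfun, and iterate_with_history is an illustrative name.

```r
# Hypothetical update rule standing in for myfun from the question.
halve <- function(b) b / 2

iterate_with_history <- function(start, fun, n) {
  # One column per stage: column 1 is the start, column k + 1 is the k-th result.
  history <- matrix(NA_real_, nrow = length(start), ncol = n + 1)
  history[, 1] <- start
  for (k in seq_len(n)) {
    history[, k + 1] <- fun(history[, k])
  }
  history
}

C <- iterate_with_history(c(8, 4), halve, 3)
```

Each column of C is the input to the next step, so the last column holds the final result.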
You can try a recursive function:
rec_func <- function(input, i=1000) {
if (i == 0) {
return(input)
} else {
input <- myfunc(input)
i <- i - 1
rec_func(input, i)
}
}
Example:
myfunc <- function(item) {item + 1}
> rec_func(1, i=1000)
[1] 1001
How can I write this all in one line?
mydata is a "zoo" series, limit is a numeric vector of the same size
tmp <- ave(coredata(mydata), as.Date(index(mydata)),
FUN = function(x) cummax(x)-x)
tmp <- (tmp < limit)
final <- ave(tmp, as.Date(index(mydata)),
FUN = function(x) cumprod(x))
I've tried to use two vectors as argument to ave(...) but it seems to accept just one even if I join them into a matrix.
This is just an example, but any other function could be used.
Here I need to compare the value of cummax(mydata)-mydata with a numeric vector and
once it surpasses it I'll keep zeros till the end of the day. The cummax is calculated from the beginning of each day.
If limit were a single number instead of a vector (with different possible numbers) I could write it:
ave(coredata(mydata), as.Date(index(mydata)),
FUN = function(x) cumprod((cummax(x) - x) < limit))
But I can't pass in a vector longer than x there (it should have the same length as each day), and I don't know how to pass it as another argument to ave().
This looks like a routine that imposes an intraday stop-loss based on maximum drawdown. So I assume you want to pass the variable limit as a second argument to your aggregation function, which currently takes only one vector because of the way ave works.
If putting all this in one line is not an absolute must, I can share a function I've written that generalizes aggregation via "cut variables". Here's the code:
mtapplylist2 <- function(t, IDX, DEF, MoreArgs=NULL, ...)
{
if(mode(DEF) != "list")
{
cat("Definition must be list type\n");
return(NULL);
}
a <- c();
colnames <- names(DEF);
for ( i in 1:length(DEF) )
{
def <- DEF[[i]];
func <- def[1];
if(mode(func) == "character") { func <- get(func); }
cols <- def[-1];
# build the argument to be called
arglist <- list();
arglist[[1]] <- func;
for( j in 1:length(cols) )
{
col <- cols[j];
grp <- split(t[,col], IDX);
arglist[[1+j]] <- grp;
}
arglist[["MoreArgs"]] <- MoreArgs;
v <- do.call("mapply", arglist);
# print(class(v)); print(v);
if(is.matrix(v))
{
a <- cbind(a, as.vector(v));
} else {
a <- cbind(a, v);
}
}
colnames(a) <- colnames;
return(a);
}
And you can use it like this:
# assuming you have the data in the data.frame
df <- data.frame(date=rep(1:10,10), ret=rnorm(100), limit=rep(c(0.25,0.50),50))
dfunc <- function(x, ...) { return(cummax(x)-x ) }
pfunc <- function(x,y, ...) { return((cummax(x)-x) < y) }
# assumes you have the function declared in the same namespace
def <- list(
"drawdown" = c("dfunc", "ret"),
"hasdrawdown" = c("pfunc", "ret", "limit")
);
# from R console
> def <- list("drawdown" = c("dfunc", "ret"),"happened" = c("pfunc","ret","limit"))
> dim( mtapplylist2(df, df$date, def) )
[1] 100 2
Notice that the "def" variable is a list containing the following items:
computed column name
vector arg function name as a string
name of the variable in the input data.frame that are inputs into the function
If you look at the guts of "mtapplylist2" function, the key components would be "split" and "mapply". These functions are sufficiently fast (I think split is implemented in C).
This works with functions requiring multiple arguments, and also for functions returning either a vector of the same size or an aggregated value.
Try it out and let me know if this solves your problem.
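If a single call is preferred, another option is to let ave group the row indices rather than the values, so that the function can index into any number of vectors. This is a sketch on made-up plain vectors, not the zoo series from the question; x, day and limit are illustrative names.

```r
# Made-up example data: two "days" of four observations each.
x     <- c(1, 3, 2, 5, 4, 1, 2, 2)
day   <- c(1, 1, 1, 1, 2, 2, 2, 2)
limit <- c(2, 2, 0.5, 2, 1, 1, 1, 1)

# ave splits the indices by day; i holds the positions belonging to one day,
# so x[i] and limit[i] are the matching slices of both vectors.
final <- ave(seq_along(x), day, FUN = function(i) {
  cumprod((cummax(x[i]) - x[i]) < limit[i])
})
final
# 1 1 0 0 1 0 0 0
```

With a zoo series the same pattern would presumably use coredata(mydata) for x and as.Date(index(mydata)) for the grouping, as in the question.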