how to get output from data frames in R code

how to get output from data frames in R code - r

I have the following code to analyze data sets:
library("Matrix")
Data <-list(c(2,3),c(3,2),c(2,2))
TheSizes=c(3,4)
n=2
dd=2
StdGrid <- function(Data,TheSizes)
{
SGrid <- list(
Values = Data,
Sizes = TheSizes
)
class(SGrid) <- append(class(SGrid), c("StdGrid","Moment"))
return(SGrid)
}
theObject=StdGrid
MHistogramC <- function(theObject,n,dd)
{
sizes <- theObject$Sizes
l <- length(sizes)
data <- theObject$Values
Xarray <- matrix(rep(0,l*n),ncol=n)
N <- matrix(rep(0,l*n),ncol=n)
Histo <- matrix(rep(0,l*n),ncol=n)
GrandX <- lapply(data,function(x) log(x))
minX <- rep(0,l)
maxX <- rep(0,l)
DeltaX <- rep(0,l)
for(i in 1:l){
minX[i] <- min(GrandX[[i]])
maxX[i] <- max(GrandX[[i]])
DeltaX[i] <- maxX[i]/n-minX[i]/n
}
nzero <- numeric()
for(j in 1:n){
for(i in 1:l){
Xarray[i,j] <- minX[i]+(j-1/2)*DeltaX[i]
N[i,j] <- length(which((GrandX[[i]] >= minX[i]+(j-1/2)*DeltaX[i]-DeltaX[i]) & (GrandX[[i]] <= minX[i]+(j-1/2)*DeltaX[i]+DeltaX[i])))
Histo[i,j] <- log(N[i,j])
}
if(min(Histo[,j]) > - 10000){
nzero <- c(nzero,j)
}
}
alpha <- rep(0,lnzero)
falpha <- rep(0,lnzero)
for(j in 1:length(nzero)){
fit <- lm(Xarray[,nzero[j]] ~ log(sizes/dd))
alpha[j] <- fit$coefficients[[2]]
fit2 <- lm(Histo[,nzero[j]] ~ log(sizes/dd))
falpha[j] <- -fit2$coefficients[[2]]
}
Result <- data.frame(alpha=alpha,falpha=falpha)
return(Result)
}
MHistogramU <- function(theObject,n,dd)
{
sizes <- theObject$Sizes
l <- length(sizes)
data <- theObject$Values
Xarray <- matrix(rep(0,l*n),ncol=n)
N <- matrix(rep(0,l*n),ncol=n)
Histo <- matrix(rep(0,l*n),ncol=n)
GrandX <- lapply(data,function(x) log(x))
minX <- rep(0,l)
maxX <- rep(0,l)
DeltaX <- rep(0,l)
for(i in 1:l){
minX[i] <- min(GrandX[[i]])
maxX[i] <- max(GrandX[[i]])
DeltaX[i] <- maxX[i]/n-minX[i]/n
}
nzero <- numeric()
for(j in 1:n){
for(i in 1:l){
Xarray[i,j] <- minX[i]+(j-1/2)*DeltaX[i]
N[i,j] <- length(which((GrandX[[i]] >= minX[i]+(j-1/2)*DeltaX[i]-sqrt(DeltaX[i])) & (GrandX[[i]] <= minX[i]+(j-1/2)*DeltaX[i]+sqrt(DeltaX[i]))))
Histo[i,j] <- log(N[i,j])
}
if(min(Histo[,j]) > - 10000){
nzero <- c(nzero,j)
}
}
alpha <- rep(0,lnzero)
falpha <- rep(0,lnzero)
for(j in 1:length(nzero)){
fit <- lm(Xarray[,nzero[j]] ~ log(sizes/dd))
alpha[j] <- fit$coefficients[[2]]
fit2 <- lm(Histo[,nzero[j]] ~ log(sizes/dd))
falpha[j] <- -fit2$coefficients[[2]]
}
Result <- data.frame(alpha=alpha,falpha=falpha)
return(Result)
}
Which compiles, but i don't get anything in return. If I try to print "Result" the console says that the object "Result" was not found.
The inputs are:
Data : is a list of vector/grids
TheSizes : is a vector
theObject : the data defined as the class 'StdGrid' (defined below);
n : the number of values of alpha to be calculated;
dd : the dimension of the physical support of the measure.
What can I do to see the data frame that the code is supposed to return?

That is because your code is just a bunch of functions which are not called at all. A function would return value only when the function is called, it won't call itself.
Now looking at your code, It's hard to deduce what you are trying to calculate/analyze, but assuming all other codes are correct, and all functions are coded perfectly, you need to add the following lines to view the result at the end of your code:
var_MHistogramU <- MHistogramU(theObject,n,dd)
var_MHistogramC <- MHistogramC(theObject,n,dd)
To view the result, simply print the variables.
print(var_MHistogramU)
print(var_MHistogramC)
Remember: Printing Result won't work as the variable result is a local variable for the function, which is inaccessible globally.

Related

How to for-loop over a custom function R?

I wrote this function which returns the probability that the value 1 does not appear in a random sample when iterated 1000 times for varying sample sizes.
bday.function <- function(sample.size){
x <- vector()
for (i in 1:1000){
x[i] <- !any(data.frame(table(sample(1:365, sample.size, replace=TRUE)))$Var1 == 1)
}
return(mean(x))
}
Now I want to use this function and another for-loop to calculate the probability for every sample size between 500 and 1500 and make a simple scatter plot of my results. Here is what I tried:
z <- vector()
for (i in 500:1500) {
z[i] <- bday.function(i)
return(plot(z))
}
Edit: when I run bday.function the output is number of TRUE values divided by the total (1000) TRUE/FALSE outcomes:
bday.function(750)
[1] 0.122
I would like to replicate this for sample sizes between 500 and 1500 to generate a simple scatter plot
Edit 2: Thanks to everybody for the help! Here's my final solution:
x <- vector(length = 1000)
for (i in 1:1000){
x[i] <- !any(sample(1:365, 500, replace=TRUE) == 1)
}
x
bday.function <- function(sample.size){
x <- vector(length= 1000)
for (i in 1:1000){
x[i] <- !any(sample(1:365, sample.size, replace=TRUE) == 1)
}
return(mean(x))
}
bday.function(750)
z <- vector(length = 1000)
tmp.index <- 500:1500
for (i in seq_along(tmp.index)) {
z[i] <- bday.function(tmp.index[i])
}
#Plot
plot(tmp.index, z, xlab = "sample size", ylab = "Probability of no birthdays")

As #JohnColeman pointed in his sage comment, your function can be slow. Try these changes on your code for the printing output. I have run only 60 sims as I need to complete other things:
#Function
bday.function <- function(sample.size){
x <- vector()
for (i in 1:1000){
x[i] <- !any(data.frame(table(sample(1:365, sample.size, replace=TRUE)))$Var1 == 1)
}
return(mean(x))
}
#Loop
z <- vector()
vec <- 500:1500
for (i in seq_along(vec)) {
z[i] <- bday.function(vec[i])
}
#Plot
plot(z)
Output:

Are you looking for something like this?
bday.function <- function(sample.size) {
mean(sapply(seq(1000), function(x)
+!any(sample(365, sample.size, replace = TRUE) == 1)))
}
x <- 500:1500
y <- sapply(x, bday.function)
plot(x, y, xlab = "sample size", ylab = "Probability of no birthdays")

For optimx() with method of bobyqa, how to set the initial value

I was trying to maximize my Likelihood with the R package 'optimx'. Here is my code. With the initial value (5,5) and (1,1), I got different Maximized likelihood. I also have tried different method like 'Nelder=Mead', but the estimated log likelihood are different under different methods...
library('optimx')
n=225
X = matrix(runif(225),ncol=1)
e2 = matrix(runif(225,0,2),ncol=1)
set.seed(123)
This is the function to generate some data I will use
get_mls_basis<- function(p){
depth <- ceiling(runif(1)*p)
knot <- matrix(rep(0,depth+1),ncol=1)
lr <- runif(1) > 0.5
x <- matrix(rep(0,n),ncol=1)
not_finished <- 1
while (not_finished == 1) {
data_indx = ceiling(runif(1)*n)
var = matrix(rep(0,depth),ncol=1)
for (j in 1:depth) {
not_ok <- 1
while (not_ok == 1) {
ind <- ceiling(runif(1)*p)
if (!is.element (ind,var[1:j]))
{
var[j] <- ind
not_ok <- 0
}
}
}
x_v <- as.matrix(X[data_indx, var])
knot[1:depth] <- rgamma(depth,1,1)
knot[1:depth] <- knot[1:depth] / sqrt(sum(knot^2))
knot[depth+1] <- -x_v %*% knot[1:depth]
ones <- matrix(rep(1,n),ncol=1)
temp <- as.matrix(cbind(X[,var], ones)) %*% knot
if (lr == 0) {
for (i in 1:n)
{
temp[i] <- max(0,temp[i])
}
}
else {
for (i in 1:n)
{
temp[i] <- min(0,temp[i])
}
}
x <- temp
not_finished <- all(x==0)
}
mx <- mean(x)
stx <- sd(x)
x <- (x-mx)/stx
x
}
This is my log likelihood
Lik1<-function(theta, basis){
theta0=theta[1]
theta1=theta[2]
L=-n/2*log(theta0)-sum(basis/2)*log(theta1)-0.5/theta0*sum(e2/theta1^basis)
return(L)
}
basis1=get_mls_basis(1)
Here I used 5 as initial value
optimx(par=c(5,5), Lik1,
basis=basis1,method='bobyqa',control = list(maximize=TRUE))

R - lapply on deep nested elements of list

I have a large list that stored measurements (a product of other lapply() runs). I now want to gather these measurements and calculate median/mean/sd etc but I don't know how to access them. The structure of this list is like this:
foo[[i]][[j]][[k]][[1]]
foo[[i]][[j]][[k]][[2]]$bar
I can't figure out a function that would return e.g. mean of $bar (but not of $x) and keep relation the values of the indices i,j,k.
A sample list can be generated with the following R code:
library(purrr)
metrics <- function(y){
tt10r <- median(y)
list(y, flatten(list(bar = tt10r)))
}
example_list <- list()
for (i in 1:10)
{
v <- list()
for (j in 1:10)
{
w <- 1:10
v[j] <- list(w)
}
example_list[[i]] <- v
}
foo <- list()
for (i in 1:length(example_list))
{
u <- list()
values <- list()
for (j in 1:length(example_list[[i]]))
{
u[[j]] <- lapply(example_list[[i]][[j]], function(x) mean(x))
values[[j]] <- lapply(u[[j]], function(x) metrics(x))
}
foo[[i]] <- values
}

The following code works nicely, but I am not sure if it is efficient (loops!). Gives the anticipated result:
final <- matrix(nrow = tail(cumsum(unlist(lapply(foo, function(x) lengths(x) -2))), n=1), ncol = 3)
final <- data.frame(final)
j=1
i=1
all_js <- c(0, cumsum(lengths(foo)))
starts <- c(0, cumsum(unlist(lapply(foo, function(x) lengths(x) -2)))) + 1
ends <- c(0, cumsum(unlist(lapply(foo, function(x) lengths(x) -2))))
for (i in 1:length(foo))
{
a <- foo[[i]]
for (j in 1:length(a))
{
b <- a[[j]]
data <- unlist(lapply(lapply(b[1], '[', 2), '[[', 1))
for (k in 2:c(length(b)-2))
{
data <- rbind(data,unlist(lapply(lapply(b[k], '[', 2), '[[', 1)))
}
row.names(data) <- NULL
colnames(final) <- c("i", "j", colnames(data))
first <- starts[all_js[i] + j]
last <- ends[all_js[i] + j+1]
final[first:last,] <- data.frame(cbind(i = i, j = j, data))
}
}

R apply function to a list of price data

in the code below I am grabbing data for three symbols, then I want to apply a simple function(which is a trading strategy) to this data. Ideally, I would then run statistics on those returns, such as those native to PerformanceAnalytics.
library("quantmod")
library("PerformanceAnalytics")
options(scipen=999)
PriceData <- new.env()
Symbols <- c("SPY", "QQQ", "IWM")
StartDt <- as.Date("2015-01-01")
suppressWarnings(getSymbols(Symbols, src="yahoo", env=PriceData, from=StartDt))
x <- list()
for (i in 1:length(Symbols)) {
x[[i]] <- get(Symbols[i], pos=PriceData)
}
SYSTEM <- function(data){
ret<- Delt(Cl(x[[i]]),Op(x[[i]]),type = 'arithmetic')
mavga <- SMA(Cl(x[[i]]), n=10)
mavgb <- SMA(Cl(x[[i]]), n=20)
sig <- ifelse(mavga>mavgb,1,0)+ifelse(mavgb>mavga,-1,0)
sig <- lag(sig,1)
sig[is.na(sig)]=0
strategyreturn <- sig * ret
return(strategyreturn)
}
######I'm doing something wrong here######
y <- lappy(x,SYSTEM)
z <- do.call("cbind", y)
PerformanceTable <- function(returns){
scalar <- 252
CS <- t(Return.cumulative(returns, geometric = FALSE))
SR <- t(SharpeRatio.annualized(returns, scale=scalar, geometric = FALSE))
SOR <- t(SortinoRatio(returns))
MDD <- t(maxDrawdown(returns))
WP <- colSums(returns > 0)/colSums(returns != 0)
WP <- as.data.frame(WP)
ASD = t(sd.annualized(returns, scale=scalar))
Stat <- cbind(CS,SR,SOR,MDD,ASD,WP)
colnames(Stat) <- c("Profit","SharpeRatio", "Sortino","MaxDrawdown", "AnnStdDev", "WinPct")
print("Performance Table")
print(Stat)
return (Stat)
}
Perf <- PerformanceTable(y)

You are almost there, just replace x[[i]] by data in your function:
SYSTEM <- function(data){
ret<- Delt(Cl(data),Op(data),type = 'arithmetic')
mavga <- SMA(Cl(data), n=10)
mavgb <- SMA(Cl(data), n=20)
sig <- ifelse(mavga>mavgb,1,0)+ifelse(mavgb>mavga,-1,0)
sig <- lag(sig,1)
sig[is.na(sig)]=0
strategyreturn <- sig * ret
return(strategyreturn)
}
and add a l to lappy()
y <- lapply(x,SYSTEM)
z <- do.call("cbind", y)

Function: save returned data frame to workspace

I cannot really get my head around this problem:
I have a function that returns a data frame. However, the data frame is only printed in my console although I would like to have it stored in the work space. How can I achieve this?
Sample data:
n <- 32640
t <- seq(3*pi,n)
data_sim <- 30+ 2*sin(3*t)+rnorm(n)*10
data_sim <- floor(data_sim)
Function:
compress <- function (name, SR){
## -------------------------------------------------------
## COMPRESSION
library(zoo)
data <- get(name)
if (is.data.frame(data)==F){
data = as.data.frame(data)
}
SR <- SR
acrossmin <- 60
a <- nrow(data)
m <- acrossmin*SR*60
data_compress <- matrix(NA, nrow = a/m)
no_mov_subset <- matrix(NA, nrow = m)
for (i in 1:(a/m)){
subset <- data[(((i-1)*m)+1):((i*m)),]
b <- length(subset)
for (k in 1:b){
r <- subset[k]
if (r == 0){
no_mov_subset[k] <- 0
} else {
no_mov_subset[k] <- 1
}
sum_no_mov_subset <- sum(no_mov_subset)
data_compress[i] <- sum_no_mov_subset
}
colnames(data_compress) <- c("activity_count")
return(data_compress)
}
Run the code:
compress("data_sim", 4/60)
Obviously, the function returns something, but I would like it to be stored in the workspace rather than returned!

Instead of the return command you can use
data_compress <<- data_compress
This way, the data frame is stored in the workspace. So your function looks like this:
compress <- function (name, SR){
## -------------------------------------------------------
## COMPRESSION
library(zoo)
data <- get(name)
if (is.data.frame(data)==F){
data = as.data.frame(data)
}
SR <- SR
acrossmin <- 60
a <- nrow(data)
m <- acrossmin*SR*60
data_compress <- matrix(NA, nrow = a/m)
no_mov_subset <- matrix(NA, nrow = m)
for (i in 1:(a/m)){
subset <- data[(((i-1)*m)+1):((i*m)),]
b <- length(subset)
for (k in 1:b){
r <- subset[k]
if (r == 0){
no_mov_subset[k] <- 0
} else {
no_mov_subset[k] <- 1
}
sum_no_mov_subset <- sum(no_mov_subset)
data_compress[i] <- sum_no_mov_subset
}
colnames(data_compress) <- c("activity_count")
data_compress <<- data_compress
}
}
Edit: As commented by Heroka and hrbrmstr, this solution is not safe. It is better to assign the output of the function call to a variable:
data_compr <- compress("data_sim", 4/60)

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex

how to get output from data frames in R code - r

Related

How to for-loop over a custom function R?

For optimx() with method of bobyqa, how to set the initial value

R - lapply on deep nested elements of list

R apply function to a list of price data

Function: save returned data frame to workspace

Categories

Resources