I am new to R and am trying to vectorize my code below.
What is a better way to do this? Thanks so much!
l_mat <- data.frame(matrix(ncol = 4, nrow = 4))
datax <- data.frame("var1"= c(1,1,1,1), "Var2" = c(2,2,2,2), "Var3"=c(3,3,3,3), "Var4"=c(4,4,4,4))
for (i in 1:4) {
for (j in 1:4) {
if (datax[i, 2] == datax[j, 2]) {
l_mat[i, j] <- 100
} else {
l_mat[i, j] <- 1
}
}
}
This is easier with outer. Since we are comparing every value in the second column against every other value in that column, build the logical matrix with outer, convert it to a numeric index (1 or 2), and then use that index to pick 1 or 100:
out <- 1 + (outer(datax[,2], datax[,2], `==`))
out[] <- c(1, 100)[out]
Or in a single line
ifelse(outer(datax[,2], datax[,2], `==`), 100, 1)
Or use a variation with pmax and outer
do.call(pmax, list(outer(datax[,2], datax[,2], `==`) * 100, 1))
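As a quick sanity check, here is a minimal sketch comparing the original double loop with the outer approach. The toy values in Var2 are an assumption (the data in the question has identical values in every row, which would make every cell 100), and loop_mat and outer_mat are illustrative names.

```r
# Toy data with both matching and non-matching values in column 2 (assumed values)
datax <- data.frame(var1 = c(1, 1, 1, 1), Var2 = c(2, 5, 2, 7))
n <- nrow(datax)

# Reference result from the explicit double loop
loop_mat <- matrix(NA_real_, n, n)
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    loop_mat[i, j] <- if (datax[i, 2] == datax[j, 2]) 100 else 1
  }
}

# Vectorized result: outer() builds the full logical comparison matrix at once
outer_mat <- ifelse(outer(datax[, 2], datax[, 2], `==`), 100, 1)

isTRUE(all.equal(loop_mat, outer_mat))
```

Both versions fill an n-by-n matrix; outer just does all the pairwise comparisons in one call.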
I have a problem regarding nested looping. I need to do Row 1 - Row 2, Row 1 - Row 3 etc. for each sample, then loop on to Row 2 - Row 3 etc., with no duplicates (Row 4 - Row 4) or repeated combinations (Row 4 - Row 3). Next, I need all of the output placed into an Excel document.
Currently with the code below, I was only able to achieve the loop for Row 1 - Row(n).
set.seed(1)
df <- matrix(sample.int(10, size = 10*4, replace = TRUE), nrow = 10, ncol = 4) # 10*4 values to fill a 10 x 4 matrix
print(df)
df <- as.data.frame(df) # tabulate as dataframe
my_list <- list()
for(i in 1:nrow(df)){
for( j in 1:nrow(df)){
if (i != j) {
if(i < j){
my_list[[i]] <- df[i,] - df[j,]
}
}
}
}
df1 <- data.table::rbindlist(my_list, fill=TRUE) # rbindlist() comes from the data.table package
I assumed that having:
if (i != j) {
if(i < j){
would be enough, but it's not giving me what I want.
Please help!
You can use combn to get all possible combinations and subtract them.
do.call(rbind, combn(seq(nrow(df)), 2, function(x) {
df[x[1], ] - df[x[2], ]
}, simplify = FALSE)) -> result
write.csv(result, 'result.csv', row.names = FALSE)
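A variation on the same idea, sketched under the assumption that all columns are numeric: take the index pairs from combn once and subtract whole blocks of rows in a single matrix operation. The names pairs, m, and diffs, and the row labels, are illustrative.

```r
set.seed(1)
df <- as.data.frame(matrix(sample.int(10, size = 10 * 4, replace = TRUE),
                           nrow = 10, ncol = 4))

# Each column of `pairs` is one (i, j) combination with i < j
pairs <- combn(nrow(df), 2)

# Work on a numeric matrix so the subtraction is a single vectorized step
m <- as.matrix(df)
diffs <- m[pairs[1, ], , drop = FALSE] - m[pairs[2, ], , drop = FALSE]

# Label each result row with the pair it came from
rownames(diffs) <- paste0("Row", pairs[1, ], "-Row", pairs[2, ])

dim(diffs)
```

Because combn only produces pairs with i < j, self-differences and reversed pairs never appear.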
I want to modify an array but with an element-by-element condition. This is what I want to do
vector <- runif(18, 0,1)
xx <- array(vector, dim=c(2,3,3))
for (i in 1:2) {
for (j in 1:3) {
xx[i,j,1] <- ifelse(xx[i,j,1]<0.5,1,xx[i,j,1])
xx[i,j,2] <- ifelse(xx[i,j,2]<0.4,1.5,xx[i,j,2])
xx[i,j,3] <- ifelse(xx[i,j,3]<0.2,2,xx[i,j,3])
}
}
Is there a more efficient way to do it?
Thanks
Not sure what you mean by efficient but this avoids looping:
vector <- runif(18, 0,1)
xx <- array(vector, dim=c(2,3,3))
xx
xx[,,1][xx[,,1]<.5] <- 1
xx[,,2][xx[,,2]<.4] <- 1.5
xx[,,3][xx[,,3]<.2] <- 2
There are two ways that you could simplify this double loop
Option 1:
vector <- runif(18, 0,1)
xx <- array(vector, dim=c(2,3,3))
xx[,,1][xx[,,1]<.5] = 1
xx[,,2][xx[,,2]<.4] = 1.5
xx[,,3][xx[,,3]<.2] = 2
You still have to write one line for each condition, though.
The second way is to use lapply, but in this case you have to create three vectors: index, threshold, substitution
idx = 1:3
thr = c(.5, .4, .2)
sb = c(1, 1.5, 2)
lapply(idx, function(k){
xx[,,k][ xx[,,k]< thr[k] ] <<- sb[k] # index the threshold by k, the slice being processed
})
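If the global assignment with <<- feels fragile, two alternatives are sketched below: a plain loop over the slices, and a fully vectorized version that expands the thresholds and substitutions to the shape of the array. The names yy, zz, thr_arr, sb_arr, and hit are illustrative and not from the answers above.

```r
set.seed(42)
xx <- array(runif(18, 0, 1), dim = c(2, 3, 3))
thr <- c(0.5, 0.4, 0.2)  # threshold for each slice
sb  <- c(1, 1.5, 2)      # substitution for each slice

# Option A: loop over the third dimension only (3 iterations instead of 2 * 3)
yy <- xx
for (k in seq_along(thr)) {
  slice <- yy[, , k]
  slice[slice < thr[k]] <- sb[k]
  yy[, , k] <- slice
}

# Option B: no loop; repeat each threshold over all cells of its slice
per_slice <- prod(dim(xx)[1:2])
thr_arr <- array(rep(thr, each = per_slice), dim = dim(xx))
sb_arr  <- array(rep(sb,  each = per_slice), dim = dim(xx))
zz <- xx
hit <- zz < thr_arr
zz[hit] <- sb_arr[hit]

identical(yy, zz)
```

Option B relies on R storing arrays with the last dimension varying slowest, so rep(..., each = per_slice) lines up one value per slice.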
I'm trying to create my very own first project in R but have hit a roadblock.
I have a data frame such as below where every row represents dataset of a financial option.
type <- c("C", "C")
marketV <- c(1.1166, 1.911)
S <- c(20, 60)
K <- c(20, 56)
T <- c(0.333, 0.5)
df <- data.frame(type, marketV, S, K, T)
I made a user defined function to take this data frame as an input and works great when the data frame is one row long. However, I'm not sure how to have my function iterate through all the data frame rows and produce a result for all of them.
I'm new to R, so I'm unsure whether I should be running a 'for' loop around it or playing around with lapply, or if there's a simple syntax answer. I simply want the function to take the df as input, repeat its calculation for each of the n rows, and produce n results. Thank you for the help in advance.
My current function code for a df with 1 row below as reference:
This is a corrected version of your program:
df <- data.frame(type=c("C", "C"), marketV=c(1.1166, 1.911), S=c(20, 60), K=c(20, 56), T=c(0.333, 0.5))
IV <- function(df) {
# check if df has more than 1 row:
if (nrow(df)>1) { message("!! nrow(df)>1 !!"); return(NA) }
# Initializing of variables
r <- 0
sigma <- 0.3
sigma_down <- 0.001
sigma_up <- 1
count <- 0
type <- df$type; marketV <- df$marketV; S <- df$S; K <- df$K; T <- df$T
d1 <- (log(S/K) + (sigma^2/2)*T)/(sigma*sqrt(T))
d2 <- (log(S/K) - (sigma^2/2)*T)/(sigma*sqrt(T))
if(type=="C") {
V <- exp(-r*T)*(S*pnorm(d1) - K*pnorm(d2))
} else {
V <- exp(-r*T)*(K*pnorm(-d2) - S*pnorm(-d1)) }
difference <- V - marketV
# Root finding of sigma by Bisection method
while(abs(difference)>0.001 && count<1000) {
if(difference < 0) {
sigma_down <- sigma
sigma <- (sigma_up + sigma)/2
} else {
sigma_up <- sigma
sigma <- (sigma_down + sigma)/2
}
d1 <- (log(S/K) + (sigma^2/2)*T)/(sigma*sqrt(T))
d2 <- d1 - sigma*sqrt(T)
if(type=="C") {
V <- exp(-r*T)*(S*pnorm(d1) - K*pnorm(d2))
} else {
V <- exp(-r*T)*(K*pnorm(-d2) - S*pnorm(-d1)) }
difference <- V - marketV
count <- count + 1
}
if(count == 1000){
return(NA) # If sigma to satisfy Black76 price cannot be found
} else{
return(sigma)
}
}
sapply(split(df, seq(nrow(df))), IV)
The main thing is to run row by row through the dataframe. This is done by
sapply(split(df, seq(nrow(df))), IV)
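To see what that call does, here is a minimal sketch with a stand-in function. The helper moneyness and the data frame opts are hypothetical; they only illustrate how split produces one single-row data frame per row and how sapply collects one result per piece.

```r
opts <- data.frame(type = c("C", "C"), S = c(20, 60), K = c(20, 56))

# Hypothetical per-row function: expects a data frame with exactly one row
moneyness <- function(row) row$S - row$K

# split() by row number gives a list of one-row data frames
pieces <- split(opts, seq(nrow(opts)))

# sapply() calls the function once per piece and simplifies to a vector
res <- sapply(pieces, moneyness)
res
```

The names of res are the row numbers used as split groups, which makes it easy to match results back to rows.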
Your original function has several errors. The biggest is how it accesses S, K and so on: you probably thought the values were taken from the data frame df, but in fact they were taken from the workspace! I corrected this by redefining:
type <- df$type; marketV <- df$marketV; S <- df$S; K <- df$K; T <- df$T
I inserted a test for the number of rows in df, so you will get:
> IV(df)
!! nrow(df)>1 !!
[1] NA
Here is a cleaned up version of your program:
df <- data.frame(type=c("C", "C"), marketV=c(1.1166, 1.911), S=c(20, 60), K=c(20, 56), T=c(0.333, 0.5))
IV2 <- function(type, marketV, S, K, T) {
r <- 0; sigma <- 0.3
sigma_down <- 0.001; sigma_up <- 1
count <- 0
if(type=="C") {
f.sig <- function(sigma) {
d1 <- (log(S/K) + (sigma^2/2)*T)/(sigma*sqrt(T))
d2 <- d1 - sigma*sqrt(T)
exp(-r*T)*(S*pnorm(d1) - K*pnorm(d2)) - marketV
}
} else {
f.sig <- function(sigma) {
d1 <- (log(S/K) + (sigma^2/2)*T)/(sigma*sqrt(T))
d2 <- d1 - sigma*sqrt(T)
exp(-r*T)*(K*pnorm(-d2) - S*pnorm(-d1)) - marketV
}
}
ifelse(f.sig(sigma_down)*f.sig(sigma_up) < 0, uniroot(f.sig, c(sigma_down,sigma_up))$root, NA) # sigma
}
sapply(split(df, seq(nrow(df))), do.call, what="IV2")
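The pattern in IV2 of checking for a sign change before calling uniroot can be shown on its own. This is a minimal sketch on a simple function; safe_root is a hypothetical helper name, and the tol value is an arbitrary choice.

```r
# Hypothetical helper: returns NA when [lower, upper] does not bracket a root
safe_root <- function(f, lower, upper) {
  if (f(lower) * f(upper) < 0) {
    # Opposite signs at the ends, so uniroot() can search the interval
    uniroot(f, c(lower, upper), tol = 1e-8)$root
  } else {
    NA_real_
  }
}

f <- function(x) x^2 - 2

bracketed   <- safe_root(f, 0, 2)  # interval contains sqrt(2)
unbracketed <- safe_root(f, 2, 3)  # no sign change, so NA
```

Without the check, uniroot stops with an error when the function values at the end points do not have opposite signs, which would abort the whole sapply call.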
I'd like to perform this function on a matrix 100 times. How can I do this?
v = 1
m <- matrix(0,10,10)
rad <- function(x) {
idx <- sample(length(x), size=1)
flip = sample(0:1,1,rep=T)
if(flip == 1) {
x[idx] <- x[idx] + v
} else if(flip == 0) {
x[idx] <- x[idx] - v
return(x)
}
}
This is what I have so far but doesn't work.
for (i in 1:100) {
rad(m)
}
I also tried this, which seemed to work, but gave me an output of like 5226 rows for some reason. The output should just be a 10X10 matrix with changed values depending on the conditions of the function.
reps <- unlist(lapply(seq_len(100), function(x) rad(m)))
Ok I think I got it.
The return statement in your function sits inside only one branch of the if statement, so the function returns the matrix with a probability of ~50%, while in the other cases it does not return the matrix at all. You should change the function to this:
rad <- function(x) {
idx <- sample(length(x), size=1)
flip = sample(0:1,1,rep=T)
if(flip == 1) {
x[idx] <- x[idx] + v
} else if(flip == 0) {
x[idx] <- x[idx] - v
}
return(x)
}
Then you can do:
for (i in 1:100) {
m <- rad(m)
}
Note that this is semantically equal to:
for (i in 1:100) {
tmp <- rad(m) # return a modified version of m (m is not changed yet)
# and put it into tmp
m <- tmp # set m equal to tmp, then in the next iteration we will
# start from a modified m
}
Calling rad(m) on its own does not change m.
Why?
The function works on a local copy of the matrix m, and that copy disappears when the function ends.
So you need to save what the function returns.
As @digEmAll wrote, the right code is:
for (i in 1:100) {
m <- rad(m)
}
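The copy behaviour can be seen in a small sketch. The function add_one is a hypothetical stand-in for rad, without the randomness.

```r
# Hypothetical stand-in for rad(): modifies its argument and returns it
add_one <- function(x) {
  x[1] <- x[1] + 1  # changes only the local copy inside the function
  x
}

m <- matrix(0, 2, 2)

add_one(m)            # result is discarded, m is untouched
unchanged <- m[1]     # still 0

m <- add_one(m)       # assign the result back to keep the change
changed <- m[1]       # now 1
```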
You don't need a loop here. The whole operation can be vectorized.
v <- 1
m <- matrix(0,10,10)
n <- 100 # number of random replacements
idx <- sample(length(m), n, replace = TRUE) # indices
flip <- sample(c(-1, 1), n, replace = TRUE) # subtract or add
newVal <- aggregate(v * flip ~ idx, FUN = sum) # calculate new values for indices
m[newVal[[1]]] <- m[newVal[[1]]] + newVal[[2]] # add new values
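The aggregation step matters because the same index can be drawn more than once, and m[idx] <- m[idx] + flip would apply only one update per repeated index. Below is a minimal sketch that checks a vectorized version against the loop on the same random draws; it uses tapply instead of aggregate, which is a swapped-in but equivalent way to sum per index, and m_loop, m_vec, delta, and pos are illustrative names.

```r
set.seed(1)
v <- 1
n <- 100
idx  <- sample(100, n, replace = TRUE)       # cell indices of a 10 x 10 matrix
flip <- sample(c(-1, 1), n, replace = TRUE)  # subtract or add

# Reference: apply the updates one at a time
m_loop <- matrix(0, 10, 10)
for (i in seq_len(n)) {
  m_loop[idx[i]] <- m_loop[idx[i]] + v * flip[i]
}

# Vectorized: sum the updates per index first, then add them in one step
m_vec <- matrix(0, 10, 10)
delta <- tapply(v * flip, idx, sum)
pos <- as.integer(names(delta))
m_vec[pos] <- m_vec[pos] + as.vector(delta)

isTRUE(all.equal(m_loop, m_vec))
```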
I have written the code below to generate a matrix containing what is, to me, a fairly complex pattern. In this case I determined that there are 136 rows in the finished matrix by trial and error.
I could write a function to calculate the number of matrix rows in advance, but the function would be a little complex. In this example the number of rows in the matrix = ((4 * 3 + 1) + (3 * 3 + 1) + (2 * 3 + 1) + (1 * 3 + 1)) * 4.
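For reference, the row count in that formula can be computed directly. This is a minimal sketch assuming the loop structure shown in the code further down: for each c, the first d contributes 4 rows and every other d contributes 3. The names rows_per_c and n_rows are illustrative.

```r
v1 <- 5
v2 <- 2
v3 <- 2
v4 <- v1 - 1

# For a given c there are (v1 - c) values of d: one gives 4 rows, the rest 3 each,
# so the count is 4 + 3 * (v1 - c - 1) = 3 * (v1 - c) + 1
rows_per_c <- 3 * (v1 - seq_len(v4)) + 1

# The a and b loops repeat the whole pattern v2 * v3 times
n_rows <- sum(rows_per_c) * v2 * v3
n_rows
```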
Is there an easy and efficient way to create matrices in R without hard-wiring the number of rows in the matrix statement? In other words, is there an easy way to let R simply add a row to a matrix as needed when using for-loops?
I have presented one solution that employs rbind at each pass through the loops, but that seems a little convoluted and I was wondering if there might be a much easier solution.
Sorry if this question is redundant with an earlier question. I could not locate a similar question using the search feature on this site or using an internet search engine today, although I think I have found a similar question somewhere in the past.
Below are 2 sets of example code, one using rbind and the other where I used trial and error to set nrow=136 in advance.
Thanks for any suggestions.
v1 <- 5
v2 <- 2
v3 <- 2
v4 <- (v1-1)
my.matrix <- matrix(0, nrow=136, ncol=(v1+4) )
i = 1
for(a in 1:v2) {
for(b in 1:v3) {
for(c in 1:v4) {
for(d in (c+1):v1) {
if(d == (c+1)) l.s = 4
else l.s = 3
for(e in 1:l.s) {
my.matrix[i,c] = 1
if(d == (c+1)) my.matrix[i,d] = (e-1)
else my.matrix[i,d] = e
my.matrix[i,(v1+1)] = a
my.matrix[i,(v1+2)] = b
my.matrix[i,(v1+3)] = c
my.matrix[i,(v1+4)] = d
i <- i + 1
}
}
}
}
}
my.matrix2 <- matrix(0, nrow=1, ncol=(v1+4) )
my.matrix3 <- matrix(0, nrow=1, ncol=(v1+4) )
i = 1
for(a in 1:v2) {
for(b in 1:v3) {
for(c in 1:v4) {
for(d in (c+1):v1) {
if(d == (c+1)) l.s = 4
else l.s = 3
for(e in 1:l.s) {
my.matrix2[1,c] = 1
if(d == (c+1)) my.matrix2[1,d] = (e-1)
else my.matrix2[1,d] = e
my.matrix2[1,(v1+1)] = a
my.matrix2[1,(v1+2)] = b
my.matrix2[1,(v1+3)] = c
my.matrix2[1,(v1+4)] = d
i <- i+1
if(i == 2) my.matrix3 <- my.matrix2
else my.matrix3 <- rbind(my.matrix3, my.matrix2)
my.matrix2 <- matrix(0, nrow=1, ncol=(v1+4) )
}
}
}
}
}
all.equal(my.matrix, my.matrix3)
If you have some upper bound on the size of the matrix, you can create a matrix large enough to hold all the data
my.matrix <- matrix(0, nrow=v1*v2*v3*v4*4, ncol=(v1+4) )
and truncate it at the end.
my.matrix <- my.matrix[1:(i-1),]
This is the generic form to do it. You can adapt it to your problem
matrix <- NULL
for(...){
...
matrix <- rbind(matrix,vector)
}
where vector contains the row elements
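A common variation, sketched here as an alternative rather than as the method above, is to collect the rows in a list and call rbind once at the end, which avoids copying the growing matrix on every pass. The loop bounds and the names rows and result are illustrative.

```r
rows <- list()

for (i1 in 0:2) {
  for (i2 in 0:2) {
    # Append one row vector; the final number of rows need not be known in advance
    rows[[length(rows) + 1]] <- c(i1, i2)
  }
}

# Bind all collected rows into a matrix in a single step
result <- do.call(rbind, rows)
dim(result)
```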
I stumbled upon this solution today: convert the matrix to a data.frame. As new rows are needed by the for-loop those rows are automatically added to the data.frame. Then you can convert the data.frame back to a matrix at the end if you want. I am not sure whether this constitutes something similar to iterative use of rbind. Perhaps it becomes very slow with large data.frames. I do not know.
my.data <- matrix(0, ncol = 3, nrow = 2)
my.data <- as.data.frame(my.data)
j <- 1
for(i1 in 0:2) {
for(i2 in 0:2) {
for(i3 in 0:2) {
my.data[j,1] <- i1
my.data[j,2] <- i2
my.data[j,3] <- i3
j <- j + 1
}
}
}
my.data
my.data <- as.matrix(my.data)
dim(my.data)
class(my.data)
EDIT: July 27, 2015
You can also delete the first matrix statement, create an empty data.frame then convert the data.frame to a matrix at the end:
my.data <- data.frame(NULL,NULL,NULL)
j <- 1
for(i1 in 0:2) {
for(i2 in 0:2) {
for(i3 in 0:2) {
my.data[j,1] <- i1
my.data[j,2] <- i2
my.data[j,3] <- i3
j <- j + 1
}
}
}
my.data
my.data <- as.matrix(my.data)
dim(my.data)
class(my.data)
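For a regular pattern like this small example, where every combination of the loop variables becomes one row, the loops can be dropped altogether. This is a minimal sketch using expand.grid; it does not cover the irregular pattern in the original question, and grid_mat and loop_mat are illustrative names.

```r
# expand.grid varies its first argument fastest, so list the innermost
# loop variable first and then reorder the columns to i1, i2, i3
grid <- expand.grid(i3 = 0:2, i2 = 0:2, i1 = 0:2)
grid_mat <- as.matrix(grid[, c("i1", "i2", "i3")])

# Reference: the same rows built with nested loops into a preallocated matrix
loop_mat <- matrix(0, nrow = 27, ncol = 3)
j <- 1
for (i1 in 0:2) {
  for (i2 in 0:2) {
    for (i3 in 0:2) {
      loop_mat[j, ] <- c(i1, i2, i3)
      j <- j + 1
    }
  }
}

all(unname(grid_mat) == loop_mat)
```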