I'm creating a function that takes two inputs: delta and length. The function modifies an existing vector (let's call it base) by adding delta to the first 'length' elements.
I need to create a new vector, let's call it 'adjusted'. It should be populated based on one condition: if the index of an element of base is less than the length parameter passed to the function, that element should be increased by delta. All elements of base whose index is greater than the length parameter should remain the same as in base.
Here is an example of my code: I have tried to implement the logic in two ways.
1. Using an ifelse because it allows for vectorized output.
2. Using an if else logic.
adjusted <- vector() #initialize the vector.
#Create a function.
Adjustment <- function(delta, length) {
ifelse((index < length), (adjusted <- base + delta), (adjusted <- base))
head(adjusted)
}
#Method 2.
Adjustment <- function(delta, length) {
if (index < length) {
adjusted <- base + delta
}
else {adjusted <- base}
head(adjusted)
}
Where I am having trouble is that my newly created vector, adjusted, will not populate. I'm not quite sure what is happening - I am very new to R and any insight would be much appreciated.
EDIT: With some advice given, I was able to return output. The only issue is that the 'adjusted' vector is now twice the length of the index vector. I see that this is happening because there are two calculations, one for when the condition is true and one for when it is false. However, I want the function to write to adjusted only in the elements for which the index is less than the parameter length.
For example, let's say that the index vector has length 10. If I set length to 3, I want adjusted to reflect a change for elements 1:3 and the same values as in base for 4:10.
With my code, adjusted now has length 20. How can I approach this and maintain the original values of the vector?
adjusted <- c()
Adjustment <- function(delta, length) {
ifelse(index < length, adjusted <<- c(adjusted, base + delta), adjusted <<- c(adjusted, base))
head(adjusted)
}
How about using a bit of Boolean algebra:
adjusted <- base + delta*(base > length)
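Note that the expression above compares the values stored in base with length. If the adjustment should depend on each element's position instead, as the question describes, the same trick can be applied to seq_along(base). A minimal sketch, assuming the first len elements (inclusive) are the ones to change; adjust_first and len are illustrative names, not from the original post:

```r
base <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

# Add delta only where the position (not the value) is within the first len elements.
# The logical vector is coerced to 0/1 when multiplied by delta.
adjust_first <- function(base, delta, len) {
  base + delta * (seq_along(base) <= len)
}

adjusted <- adjust_first(base, delta = 2, len = 3)
```

Because the result is computed in one vectorized step, adjusted always has the same length as base.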
Adjustment <- function(delta, length) {
  if (index < length) {
    # parentheses matter: 1:length-1 would be read as (1:length)-1
    for (i in 1:(length - 1)) {
      adjusted[i] <- base[i] + delta
    }
  } else {
    adjusted <- base
  }
  head(adjusted)
}
index=2
length=5
base <- c(1,2,3,4,5,6,7,8,9,10)
delta = 2
Adjustment(delta,length)
This returns a vector in which the elements of base at positions below length have had delta added to them.
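One thing to watch when writing a range such as "the first length - 1 positions": in R the `:` operator binds more tightly than binary minus, so 1:n-1 means (1:n) - 1 and starts at 0. A small sketch of the difference (len is an illustrative name); seq_len is often the safer choice because it yields an empty sequence when its argument is 0:

```r
len <- 5

a <- 1:len - 1          # parsed as (1:len) - 1, so it runs from 0 to 4
b <- 1:(len - 1)        # runs from 1 to 4
d <- seq_len(len - 1)   # also 1 to 4, and empty when len is 1
```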
I am trying to write a function that will check to see if each value in a vector is greater than a certain value, and will then initialize a different empty column to 1 or 0. If the value is greater than the set parameter, then the same index of the empty column will be set to 1. Otherwise if it isn't greater than that number, it will be set to 0. So my function will take in a value, column containing probabilities, and then another column that is empty which will later be made into a column of 1's and 0's.
I don't get any errors when running the function however that empty column does not get updated.
I feel like my logic is right but clearly something is not working.
cut_off_prob <- function(x, prob, pred_class) {
for (i in 1:length(prob))
{
if(prob[i] > x)
{
pred_class[i] <- 1
}
else {
pred_class[i] <- 0
}
}
}
There is no need for pred_class as a function parameter. Try this:
cut_off_prob <- function(x, prob) {
pred_class <- c() # initialize as an empty vector
for (i in 1:length(prob)){
if(prob[i] > x){pred_class[i] <- 1}
else {pred_class[i] <- 0}
}
return(pred_class)
}
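The reason the original function appeared to do nothing is that R functions work on a local copy of their arguments: assigning to pred_class[i] inside the function never touches the caller's object, so the result has to be returned and assigned. A small sketch of that behaviour, using two hypothetical helpers that are not part of the original post:

```r
# Hypothetical helper: modifies its argument but does not return it
fill_no_return <- function(pred_class) {
  pred_class[1] <- 1
  invisible(NULL)
}

# Hypothetical helper: modifies its argument and returns the modified copy
fill_with_return <- function(pred_class) {
  pred_class[1] <- 1
  pred_class
}

pred <- c(0, 0, 0)
fill_no_return(pred)
unchanged <- pred               # the caller's vector is still all zeros

pred <- fill_with_return(pred)  # assigning the return value updates it
```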
On the other hand, I strongly recommend that you not use a for loop in this case since, as @Gregor Thomas said, it is very inefficient. For comparison, check this:
pred_class <- rep(0,length(prob));
pred_class[prob>x] <- 1
Running time is shown here (in seconds):
"Loop Time: 0.0029909610748291 Conditional Indexing Time: 0.000996828079223633"
I'm making a "pairwise" array in R. Given the vector combo, I'm finding every permutation of 4 elements. Thus, a 4-dimensional "pairwise" array. My current approach is making it as a simple list, using nested sapply functions, like so:
fourList <- sapply(X = combo, FUN = function(h) {
hi <- which(combo == h) #get index of h
sapply(X = combo[hi:n], FUN = function(i) {
ii <- which(combo == i) #get index of i
sapply(X = combo[ii:n], FUN = function(j) {
ji <- which(combo == j) #get index of j
sapply(X = combo[ji:n], FUN = function(k) {
list(c(h,i,j,k))
})
})
})
})
I'd like to make some sort of progress indicator, so I can report to the user what percentage of the array has been built. Ideally, I'd just take numberCasesCompleted and divide that by totalCases = length(combo)^4 to get the fraction that is done. However, I can't seem to figure out an algorithm that takes in hi, ji, and ii, and outputs the value numberCasesCompleted. How can I calculate this?
In the 2D (x by y) case (e.g. sapply(X, function(x) {sapply(X[xi:n], function(y) {list(c(x,y))})})), this could be calculated by sum(n - (x-2:x), y-(x-1)), but generalizing that to 4 dimensions sounds rather difficult.
I'm stupid. Just add the proportion complete of the first level to the proportion complete of the second level (scaled down to a single iteration at the first level), and so forth.
In my case: completion <- hi/(n+1) + (ii/(n+1))*(1/n) + (ji/n)*(1/n)*(1/n)
(The n+1 denominators are there because there's effectively another loop after hi is equal to n, as ii still has a full set of iterations to complete. Otherwise it would end at ~101%. But for a rough/quick estimation of progress, this is fine.)
However, it is worth noting that (according to @Gregor in the comments) there are much better ways of making combinations in R, so my original use case may be moot (just don't use nested sapply in the first place).
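As one possible alternative to the nested sapply, here is a sketch that builds the same tuples in a single step. It assumes combo contains unique values (which the which(combo == h) lookups imply) and that the goal is every 4-tuple whose positions in combo are non-decreasing, which is what the hi:n, ii:n, ji:n ranges produce. It relies on the standard correspondence between combinations with repetition and ordinary combinations of n + 3 items; the example combo is made up. Under these assumptions the number of cases is choose(n + 3, 4) rather than n^4.

```r
combo <- c("a", "b", "c")   # made-up example data
n <- length(combo)

# combn() gives strictly increasing index 4-tuples drawn from 1..(n + 3);
# subtracting 0, 1, 2, 3 column-wise maps them onto non-decreasing tuples in 1..n.
idx <- sweep(t(combn(n + 3, 4)), 2, 0:3)

# Look up the values; one row per 4-tuple
four <- matrix(combo[idx], ncol = 4)

total_cases <- nrow(four)
```

With everything generated at once there is no loop left to report progress on, but total_cases can still serve as the denominator if a loop is kept for other reasons.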
I have the following problem: I am considering a database of 10 different vectors, containing the return of the same number of portfolios. I want to split the sample in two subsets, one containing the returns when the market performed positively, and the other containing the returns when the market performed badly. For this reason I have another vector containing the return of the market.
What I did so far is this:
a) I have computed the number of days in which the market had positive or negative returns:
n_bull <- sum(Mkt_Ret >= 0 )
n_bear <- sum(Mkt_Ret < 0 )
b) I have created a series of vector that will contain the result for each portfolio, i.e.:
Portfolio_1_Bull <- rep(0, n_bull)
Portfolio_1_Bear <- rep(0, n_bear)
c) I run a loop to fill the latter:
for(i in 1:length(Portfolio_1_EW_tot_ret)){
  if(Mkt_Ret[i] >= 0){
    Portfolio_1_Bull[i] = Portfolio_1_EW_tot_ret[i]
  }
}
The problem is that the resulting vector Portfolio_1_Bull will have the same number of observations as the whole portfolio. Is there a way to solve this problem?
As @Roland said, the easiest way is Portfolio_1_EW_tot_ret[Mkt_Ret >= 0]. But if you want your loop to work, use another counter to index your portfolio:
j=1
for(i in 1:length(Portfolio_1_EW_tot_ret)){
if(Mkt_Ret[i] >= 0){
Portfolio_1_Bull[j] = Portfolio_1_EW_tot_ret[i]
j = j+1
}
}
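The logical-subsetting approach can also be applied to all portfolios at once. A sketch with made-up returns, assuming the portfolio vectors are collected in a list (the list name portfolios, its element names, and the numbers are illustrative, not from the question):

```r
# Made-up market and portfolio returns, all of the same length
Mkt_Ret <- c(0.01, -0.02, 0.00, 0.03, -0.01)
portfolios <- list(
  Portfolio_1 = c(0.02, -0.01, 0.01, 0.04, -0.03),
  Portfolio_2 = c(0.00, -0.03, 0.02, 0.01, -0.02)
)

# One logical mask, reused for every portfolio
is_bull <- Mkt_Ret >= 0

bull <- lapply(portfolios, function(r) r[is_bull])
bear <- lapply(portfolios, function(r) r[!is_bull])
```

Each element of bull then has sum(Mkt_Ret >= 0) observations and each element of bear has sum(Mkt_Ret < 0), without any pre-allocation or counters.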
I want to create a vector of numbers following a Weibull distribution (shape = c, scale = b). The length is not known when I start creating the vector (it depends on g). Using a function (c, b, g) with a repeat loop prints the result on screen, but does not store it in the vector. So I need the last iteration's result in a vector, but I don't know how to get it.
t<-NULL
z<-NULL
p<-NULL
neededvector<-function(c,b,g) {
p<-repeat{
t<-rweibull(1,c,b)
append(z,t)
z<-print(append(z,t))
if(sum(((z*0.01)^2*pi)/4)>g)
break
}
}
Normally it's a bad idea to grow an object in a loop, since that's slow in R. If we knew your resulting vector would be shorter than 1000 elements, we could use cumsum to know when to stop:
neededvector <- function(c,b,g) {
  z <- rweibull(1000, c, b)
  # keep values while the running total of (z*0.01)^2*pi/4 stays within g
  z[cumsum((z * 0.01)^2 * pi / 4) <= g]
}
This solution won't work for you if the resulting vector should have been longer than 1000. But you can make it work, and be a lot faster than 1-at-a-time, by doing it in chunks.
neededvector <- function(c,b,g) {
  z <- c()
  while (TRUE) {
    # generate values 1000 at a time
    z <- c(z, rweibull(1000, c, b))
    # running total of (z*0.01)^2*pi/4, as in the question's stopping rule
    threshold <- cumsum((z * 0.01)^2 * pi / 4) <= g
    # check if this would include all elements of z.
    # if not, return it.
    if (sum(threshold) < length(z)) {
      return(z[threshold])
    }
  }
}
Generally, rather than 1000, set the chunk size to some length greater than you expect the resulting vector to be. (If your vector ends up being length 100,000, this method will perform poorly unless you generate chunks closer to that length.)
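The question's loop stops after the draw that pushes the summed area sum((z*0.01)^2*pi/4) past g, and it keeps that draw. A chunked sketch that reproduces this stopping rule, assuming that is the intended behaviour; the parameter names shape, scale, and chunk are illustrative and replace the c and b of the original:

```r
neededvector <- function(shape, scale, g, chunk = 1000) {
  z <- numeric(0)
  repeat {
    # draw a whole chunk at once instead of one value per iteration
    z <- c(z, rweibull(chunk, shape = shape, scale = scale))
    # running total of the per-element area (z * 0.01)^2 * pi / 4
    area <- cumsum((z * 0.01)^2 * pi / 4)
    over <- which(area > g)
    if (length(over) > 0) {
      # keep everything up to and including the first draw that exceeds g
      return(z[seq_len(over[1])])
    }
  }
}

set.seed(1)
z <- neededvector(shape = 2, scale = 30, g = 1)
```

The function returns the vector instead of printing it, so the result has to be assigned by the caller, as in the last line.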
We all know that appending a vector to a vector within a for loop in R is a bad thing because it costs time. A solution would be to do it in a vectorized style. Here is a nice example by Joshua Ulrich. It is important to first create a vector with known length and then fill it up, instead of appending each new piece to an existing piece within the loop.
Still, his example demonstrates 'only' how to append one piece of data at a time. I am now struggling with how to fill a vector with vectors rather than scalars.
Imagine I have a vector with a length of 100
vector <- numeric(length=100)
and a smaller vector that would fit 10 times into the first vector
vec <- seq(1,10,1)
How would I have to construct a loop that adds the smaller vector to the large vector without using c() or append ?
EDIT: This example is simplified - vec does not always consist of the same sequence but is generated within a for loop and should be added to vector.
You could just use normal vector indexing within the loop to accomplish this:
vector <- numeric(length=100)
for (i in 1:10) {
vector[(10*i-9):(10*i)] <- 1:10
}
all.equal(vector, rep(1:10, 10))
# [1] TRUE
Of course if you were just trying to repeat a vector a certain number of times rep(vec, 10) would be the preferred solution.
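Since the question's EDIT says each piece is generated inside the loop, here is a sketch of the same indexing with the piece length held in a variable. The names k, n_pieces, and make_piece are illustrative; make_piece is a hypothetical stand-in for whatever produces each piece and is assumed to always return exactly k values:

```r
k <- 10          # length of each generated piece
n_pieces <- 10   # number of pieces to store

# Pre-allocate the full result once
vector <- numeric(k * n_pieces)

# Hypothetical generator: piece i is the sequence i, i + 1, ..., i + k - 1
make_piece <- function(i) seq(i, by = 1, length.out = k)

for (i in seq_len(n_pieces)) {
  # positions occupied by piece i: (i - 1) * k + 1 up to i * k
  idx <- ((i - 1) * k + 1):(i * k)
  vector[idx] <- make_piece(i)
}
```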
A similar approach, perhaps a little more clear if your new vectors are of variable length:
# Let's over-allocate so that we know the big vector is big enough
big_vec = numeric(1e4)
this.index = 1
for (i in 1:10) {
# Generate a new vector of random length
new_vec = runif(sample(1:20, size = 1))
# Stick it in big_vec by index
big_vec[this.index:(this.index + length(new_vec) - 1)] = new_vec
# update the starting index
this.index = this.index + length(new_vec)
}
# truncate to only include the added values
big_vec = big_vec[1:(this.index - 1)]
As @josilber suggested in the comments, lists would be more R-ish. This is a much cleaner approach, unless the new vector generation depends on the previous vectors, in which case the for loop might be necessary.
vec_list = list()
for (i in 1:10) {
# Generate a new vector of random length
vec_list[[i]] = runif(sample(1:20, size = 1))
}
# Or, use lapply
vec_list = lapply(1:10, FUN = function(x) {runif(sample(1:20, size = 1))})
# Then combine with do.call
do.call(c, vec_list)
# or more simply, just unlist
unlist(vec_list)