I know the question is silly, but I really can't solve it. I just want to apply different operations to the elements of a data frame depending on their sign. The following code generates a mock data frame:
mock<-data.frame(matrix(NA,ncol=5,nrow=2))
colnames(mock)<-as.vector(c("m","n","1985-02-04","1985-02-05","1985-02-06"))
rownames(mock)<-as.vector(c("fund1","fund2"))
mock
mock[1,]<-c(0.001,0.0045,-0.03,0.25,NA)
mock[2,]<-c(0.004,0.0004,NA,0.12,-0.087)
mock
so it looks like
m n 1985-02-04 1985-02-05 1985-02-06
fund1 0.001 0.0045 -0.03 0.25 NA
fund2 0.004 0.0004 NA 0.12 -0.087
For each fund, m and n represent two different ratios, and the last three figures are returns on the given days. I wish to do the following operations:
if the return x on one day is positive, I need (x+m)/(1+n) to replace the corresponding figure in the dataframe.
If the return x is negative, I need x+m to replace the corresponding figure in the dataframe.
If it is NA on the day, I will leave it NA.
I tried the following code:
Grossreturn<-function(x){
a<-x[3:5]
m<-x[1]
p<-x[2]
a[a>0]<-(a[a>0]+m)/(1-p)
a[a<0]<-a[a<0]+m
return(a)
}
apply(mock,1,Grossreturn)
and of course it failed and the error message is:
Error in a[a > 0] <- (a[a > 0] + m)/(1 - p) :
NAs are not allowed in subscripted assignments
I'm really stuck here and can't sort it out. Can someone help?
Thanks!
You should just exclude NAs from all your assignments. A sample syntax for doing this is below:
> foo = data.frame(x=runif(3)-0.5, y=runif(3)) #random data frame
> foo[2,1] <- NA #adding an NA
> foo
x y
1 -0.4616014 0.4892859
2 NA 0.4730237
3 0.4060813 0.1517448
If you now try to reassign without filtering out NAs, you get your error.
> foo[sign(foo$x)==-1, 1] <- -10
Error in `[<-.data.frame`(`*tmp*`, sign(foo$x) == -1, 1, value = -10) :
missing values are not allowed in subscripted assignments of data frames
But not if you explicitly leave out NAs:
> foo[sign(foo$x)==-1 & !is.na(foo$x), 1] <- -10
> foo
x y
1 -10.0000000 0.4892859
2 NA 0.4730237
3 0.4060813 0.1517448
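Applied to the function from the question, the same idea might look like the sketch below. The name gross_return and the masks pos and neg are illustrative, not from the original post, and the formula uses (1 + n) as in the written description rather than the (1 - p) in the posted code.

```r
# Sketch: exclude NAs from the logical masks before assigning.
gross_return <- function(x) {
  m <- x[1]
  n <- x[2]
  a <- x[-(1:2)]                 # the daily returns
  pos <- !is.na(a) & a > 0       # TRUE only for non-missing positive returns
  neg <- !is.na(a) & a < 0       # TRUE only for non-missing negative returns
  a[pos] <- (a[pos] + m) / (1 + n)
  a[neg] <- a[neg] + m
  a                              # NAs (and exact zeros) are left untouched
}

# Hypothetical row mirroring fund1 from the question
row1 <- c(m = 0.001, n = 0.0045, d1 = -0.03, d2 = 0.25, d3 = NA)
res <- gross_return(row1)
res
```

Because the masks are never NA, the subscripted assignments no longer raise the error.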
Here is code that solves your problem:
grossreturn <- function(x) {
  m <- x[1]
  n <- x[2]
  # iterate over all date columns and compute the new value
  for (i in 3:length(x)) {
    if (is.na(x[i])) {
      # NA remains NA
    } else if (x[i] < 0) {
      x[i] <- x[i] + m  # x + m
    } else {
      # x[i] >= 0
      # includes case where x[i] == 0
      x[i] <- (x[i] + m) / (1 + n)  # (x + m) / (1 + n)
    }
  }
  return(x)
}
result <- apply(mock, 1, FUN=function(x) grossreturn(x))
I wanted to use an apply function to iterate over the numerical columns after extracting m and n, but I could not find an apply-family function that passes several parameters to a vectorized operation (mapply still calls the function once per element, so it would not be a vectorized solution).
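I assumed that when a return is exactly 0 you want (x + m) / (1 + n). Also, you should test whether R drops either the row or column names when you run this code. Note that apply over rows returns one column per row, so result is transposed relative to mock; t(result) restores the original layout.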
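For comparison, a loop-free version is possible with ifelse on the matrix of returns. This is only a sketch: it rebuilds the mock data from the question, treats a return of exactly 0 like a positive one (the same assumption as above), and the names returns, adjusted and date_cols are made up for illustration.

```r
# Rebuild the mock data frame from the question
mock <- data.frame(
  m = c(0.001, 0.004),
  n = c(0.0045, 0.0004),
  "1985-02-04" = c(-0.03, NA),
  "1985-02-05" = c(0.25, 0.12),
  "1985-02-06" = c(NA, -0.087),
  row.names = c("fund1", "fund2"),
  check.names = FALSE
)

date_cols <- setdiff(names(mock), c("m", "n"))
returns <- as.matrix(mock[, date_cols])

# mock$m and mock$n are recycled down the columns, so row i always uses
# fund i's ratios; NA in the test gives NA in the result.
adjusted <- ifelse(returns < 0,
                   returns + mock$m,
                   (returns + mock$m) / (1 + mock$n))

mock[, date_cols] <- adjusted
mock
```

This keeps the row and column names and avoids the transposition that apply introduces.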
I am a new R user and have very limited programming experience, hence my question and poorly written code.
I was assigned a problem where I had to use a while loop to generate the numbers of the Fibonacci sequence that are less than 4,000,000 (the Fibonacci sequence is characterized by the fact that every number after the first two is the sum of the two preceding ones).
Next, I had to compute the sum of the even numbers in the sequence that was generated.
I got the right answer; however, I don't think the code is written very well. What could I have done better?
x <- 0
y <- 1
z <- 0
if (x == 0 & y == 1) {
  cat(x)
  cat(" ")
  cat(y)
  cat(" ")
  while (x < 4000000 & y < 4000000) {
    x <- x + y
    cat(x)
    cat(" ")
    if (x %% 2 == 0) {
      z <- x + z
    }
    y <- x + y
    cat(y)
    cat(" ")
    if (y %% 2 == 0) {
      z <- y + z
    }
  }
}
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765 10946 17711 28657 46368 75025 121393 196418 317811 514229 832040 1346269 2178309 3524578 5702887 9227465
cat(z)
4613732
First of all, cat comes with a sep argument. You can do cat(x, y, sep = " ") rather than using 3 lines for that.
Secondly, when you call while (x < 4000000 & y < 4000000) note that y will always be greater than x because it is the sum of the last x and y ... so it should suffice to check for y < 4000000 here.
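Putting the single-condition idea to work, a stripped-down version of the loop could look like this. It is a sketch only: it advances one term per iteration instead of two, drops the printing, and nxt is a helper name introduced here.

```r
x <- 0   # previous term
y <- 1   # current term
z <- 0   # running sum of even terms

while (y < 4000000) {        # y is never smaller than x, so one test is enough
  if (y %% 2 == 0) {
    z <- z + y
  }
  nxt <- x + y               # next Fibonacci number
  x <- y
  y <- nxt
}
z
```

Because y is tested before it is added, no term of 4000000 or more reaches the sum.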
For the while loop, you could also use a counter, which might be more intuitive. Indexing in R isn't that fast, though:
fib <- c(0, 1)
i <- 2
while (fib[i] < 4000000) {
fib <- c(fib, fib[i-1] + fib[i])
i <- i + 1
}
# the loop appends one value >= 4000000, so filter it out explicitly
sum(fib[fib %% 2 == 0 & fib < 4000000])
If you don't necessarily need the while, you could also approach it via recursion
fib <- function(x, y) {
s <- x + y
c(s, if (s < 4000000) fib(y, s))
}
f <- fib(0, 1)
sum(f[f %% 2 == 0])
First, there's no need to explicitly print everything out.
Second, it's more idiomatic in R to make a vector of the Fibonacci numbers and then sum. If you don't know an explicit closed form for the Fibonacci numbers, or if you've been told not to use this, then use a loop to create the list of Fibonacci numbers.
So to construct the list of Fibonacci numbers (two at a time) you can do
x <- 0
y <- 1
fib <- c()
while (x < 4000000 & y < 4000000) {
  x <- x + y
  y <- x + y
  fib <- c(fib, x, y)
}
This will give you a vector of Fibonacci numbers, containing all those less than 4000000 and a few more (the last element is 9227465).
Then run
sum(fib[fib %% 2 == 0 & fib < 4000000])
to get the result. This returns 4613732, like your code does. The subsetting operator [], when you put a logical condition inside it, will output just those numbers which satisfy the logical condition -- in this case, that they're even and less than 4000000.
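A small illustration of that logical subsetting, using a short hand-typed vector and a smaller cutoff of 30 so the result is easy to check by eye (the name keep is only for illustration):

```r
fib_small <- c(1, 2, 3, 5, 8, 13, 21, 34)

# One TRUE/FALSE per element: even AND below the cutoff
keep <- fib_small %% 2 == 0 & fib_small < 30
keep

# Only the elements where keep is TRUE survive
fib_small[keep]
sum(fib_small[keep])
```

Here 34 is even but fails the cutoff, so only 2 and 8 are kept.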
I am using the closed form of the fibonacci sequence as found here
fib <- function(n) round(((5 + sqrt(5)) / 10) * ((1 + sqrt(5)) / 2) ^ (1:n - 1))
numbers <- 2
# keep adding terms while the largest term is still below 4000000
while (max(fib(numbers)) < 4000000) {
  sequence <- fib(numbers)  # last sequence that satisfies the "< 4000000" condition
  numbers <- numbers + 1    # try one more term
}
total_sum <- sum(sequence[sequence %% 2 == 0])  # sum of the even numbers
This is how I would do it. First, I defined a global variable i to include the first two elements of the Fibonacci series. Then at the end, I re-assigned the global variable to its initial value (i.e. 1). If I don't do that, then when I call the function fib(0,1) again, the output is incorrect as it calls the function with the last value of i. It's also important to do return() to ensure it doesn't return anything in the else clause. If you don't specify return(), the final output will be 1, instead of the Fibonacci series.
Please note that the series only goes up to 13 (z < 14); obviously you can change that to whatever you want. It may also be a good option to pass this limit as a third argument of the function, something like fib(0,1,14). Try it out!
i <<- 1
fib <- function(x,y){
  z <- x+y
  if(z<14){
    if (i==1){
      i <<- i+1
      c(x,y,z,fib(y,z))
    }
    else c(z, fib(y,z))
  }
  else {
    i <<- 1
    return()
  }
}
a <- fib(0,1)
a
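The three-argument variant suggested above could be sketched as follows. This is one possible shape, not the code from this answer: fib_below and limit are hypothetical names, and the version avoids the global counter by returning the last two terms when the recursion stops.

```r
# Returns the Fibonacci terms starting at x, y that are below `limit`
# (assumes x and y themselves are below the limit).
fib_below <- function(x, y, limit) {
  z <- x + y
  if (z >= limit) {
    return(c(x, y))            # next term would be too large: stop here
  }
  c(x, fib_below(y, z, limit)) # keep x, continue with the next pair
}

a <- fib_below(0, 1, 14)
a
```

Since no state lives outside the function, repeated calls give the same result without any reset step.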
I want to return the number of times in string vector v that the element at the next successive index has more characters than the current index.
Here's my code
BiggerPairs <- function (v) {
  numberOfTimes <- 0
  for (i in 1:length(v)) {
    if((nchar(v[i+1])) > (nchar(v[i]))) {
      numberOfTimes <- numberOfTimes + 1
    }
  }
  return(numberOfTimes)
}
But I get this error:
missing value where TRUE/FALSE needed.
I do not know why this happens.
The error says that your code is evaluating a missing value (NA) where if() expects TRUE or FALSE. There are two likely reasons for this:
You have NAs in your vector v (I suspect this is not the actual issue).
The loop runs over 1:length(v), so on the last iteration it tries to compare nchar(v[n+1]) with nchar(v[n]). There is no v[n+1], so the comparison yields a missing value and you get the error.
To remove NAs, try the following code:
v <- na.omit(v)
To improve your loop, try the following code:
for(i in 1:(length(v) - 1)) {
  if(nchar(v[i + 1]) > nchar(v[i])) {
    numberOfTimes <- numberOfTimes + 1
  }
}
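Wrapped back into a function, both fixes might look like the sketch below. The name bigger_pairs is illustrative. One extra assumption: seq_len is used instead of 1:(length(v) - 1), because for a vector of length 1 the latter evaluates to c(1, 0) and the loop would still run.

```r
bigger_pairs <- function(v) {
  v <- v[!is.na(v)]                          # drop missing strings first
  count <- 0
  # seq_len(0) is empty, so vectors of length 0 or 1 skip the loop
  for (i in seq_len(max(length(v) - 1, 0))) {
    if (nchar(v[i + 1]) > nchar(v[i])) {
      count <- count + 1
    }
  }
  count
}

bigger_pairs(c("a", "bb", "b", "ccc", "dddd"))
bigger_pairs("only one")
```

The first call counts the steps a to bb, b to ccc, and ccc to dddd; the second has no pairs to compare.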
Here is some example dummy code.
# create random 15 numbers
set.seed(1)
v <- rnorm(15)
# accessing the 16th element produces an NA
v[16]
#[1] NA
# if we add an NA and try to do a comparison, the result is NA, which makes if() fail
v[10] <- NA
v[10] > v[9]
#[1] NA
# if we remove NAs and limit our loop to N-1, we should get a fair comparison
v <- na.omit(v)
numberOfTimes <- 0
for(i in 1:(length(v) -1)) {
if(nchar(v[i + 1]) > nchar(v[i])) {
numberOfTimes <- numberOfTimes + 1
}
}
numberOfTimes
#[1] 5
Is this what you're after? I don't think there is any need for a for loop.
I'm generating some sample data, since you don't provide any.
# Generate some sample data
set.seed(2017);
v <- sapply(sample(30, 10), function(x)
paste(sample(letters, x, replace = T), collapse = ""))
v;
#[1] "raalmkksyvqjytfxqibgwaifxqdc" "enopfcznbrutnwjq"
#[3] "thfzoxgjptsmec" "qrzrdwzj"
#[5] "knkydwnxgfdejcwqnovdv" "fxexzbfpampbadbyeypk"
#[7] "c" "jiukokceniv"
#[9] "qpfifsftlflxwgfhfbzzszl" "foltth"
The following vector marks the positions with 1 in v where entries have more characters than the previous entry.
# The following vector has the same length as v and
# returns 1 at the index position i where
# nchar(v[i]) > nchar(v[i-1])
idx <- c(0, diff(nchar(v)) > 0);
idx;
# [1] 0 0 0 0 1 0 0 1 1 0
If you're just interested in whether there is any entry with more characters than the previous entry, you can do this:
# If you just want to test if there is any position where
# nchar(v[i+1]) > nchar(v[i]) you can do
any(idx == 1);
#[1] TRUE
Or count the number of occurrences:
sum(idx);
#[1] 3
I've been set a question on the Fibonacci Sequence and although I've been successful in doing the sequence, I haven't been as lucky summing the even terms up (i.e. 2nd, 4th, 6th... etc.) My code is below as well as the part of the question I am stuck on. Any guidance would be brilliant!
Question:
Write a function which will take as an input x and y and will return either the sum of the first x even Fibonacci numbers or the sum of even Fibonacci numbers less than y.
That means the user will be able to specify either x or y but not both.
You have to return a warning if someone uses both numbers (decide on the message to return).
Code:
y <- 10
fibvals <- numeric(y)
fibvals[1] <- 1
fibvals[2] <- 1
for (i in 3:y) {
fibvals[i] <- fibvals[i-1]+fibvals[i-2]
if (i %% 2)
v<-sum(fibvals[i])
}
v
Here is something to get you started, since this sounds like an exercise.
I would split your loop up into steps rather than do the summing within the loop with an if statement. Since you already have the sequence code working, you can just return what is asked for by the user. The missing function would probably help you out here:
f <- function(x, y) {
  if (missing(y)) {
    warning('you must give y')
    y <- 10
  }
  fibvals <- numeric(y)
  fibvals[1] <- 1
  fibvals[2] <- 1
  for (i in 3:y) {
    fibvals[i] <- fibvals[i-1]+fibvals[i-2]
  }
  evens <- fibvals %% 2 == 0
  odds <- fibvals %% 2 != 0
  if (missing(x)) {
    return(sum(fibvals[evens]))
  } else return(fibvals)
}
f(y = 20)
# [1] 3382
f(10)
# [1] 1 1 2 3 5 8 13 21 34 55
# Warning message:
# In f(10) : you must give y
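Building on missing(), one way the full exercise could be approached is sketched below. It rests on assumptions that the question leaves open: "even" is read as even-valued (not even-positioned), the sequence starts 1, 1, and when both arguments are given the function warns and uses x. The name even_fib_sum and the warning text are made up.

```r
even_fib_sum <- function(x, y) {
  if (missing(x) && missing(y)) {
    stop("supply either x or y")
  }
  if (!missing(x) && !missing(y)) {
    warning("both x and y supplied; ignoring y")
  }
  use_count <- !missing(x)     # TRUE: first x even numbers; FALSE: values below y

  a <- 1                       # current Fibonacci number
  b <- 1                       # next Fibonacci number
  total <- 0
  found <- 0
  repeat {
    if (use_count) {
      if (found >= x) break
    } else if (a >= y) {
      break
    }
    if (a %% 2 == 0) {
      total <- total + a
      found <- found + 1
    }
    nxt <- a + b
    a <- b
    b <- nxt
  }
  total
}

even_fib_sum(x = 3)      # first three even values: 2, 8, 34
even_fib_sum(y = 100)    # even values below 100: 2, 8, 34
```

Both calls sum the same three numbers, which makes the two modes easy to compare.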
Just a general question:
When I run:
ok<-NULL
for (i in 1:3) {
ok[i]=i^2
i=i+1
}
The loop works (as expected).
> ok
[1] 1 4 9
Now when I try to do something like:
ok<-NULL
for (i in 1:3) {
ok[i]=i^2
x[i]<-ok[i]+1
y[i]<-cbind(ok[i],x)
i=i+1
}
And I want:
y = 1
2
4
5
9
10
Instead I get:
Warning messages:
1: In y[i] <- rbind(ok[i], x) :
number of items to replace is not a multiple of replacement length
2: In y[i] <- rbind(ok[i], x) :
number of items to replace is not a multiple of replacement length
3: In y[i] <- rbind(ok[i], x) :
number of items to replace is not a multiple of replacement length
4: In y[i] <- rbind(ok[i], x) :
number of items to replace is not a multiple of replacement length
5: In y[i] <- rbind(ok[i], x) :
number of items to replace is not a multiple of replacement length
Thanks in advance.
You should read up on R basics before starting to program.
You don't have to increment i in the loop (actually it's quite confusing).
You don't need cbind or rbind to append to a vector; those functions combine objects by columns or rows into a matrix or data frame. Use c() instead:
y <- NULL
for(i in 1:3){ ok <- i^2; x <- ok + 1; y <- c(y, ok, x) }
or:
as.vector(sapply(1:3, function(i){ ok <- i^2; x <- ok + 1; c(ok, x) }))
With the command y[i]<-cbind(ok[i],x) you attempt to replace one element of the vector with several. This triggers the warning, and only the first value is stored.
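If a loop is still wanted, one option is to assign to as many positions as there are values. A sketch, assuming the result vector is preallocated and each iteration fills two consecutive slots (n is a name introduced here):

```r
n <- 3
y <- numeric(2 * n)                  # room for two values per iteration

for (i in seq_len(n)) {
  ok <- i^2
  # two target positions on the left, two values on the right
  y[c(2 * i - 1, 2 * i)] <- c(ok, ok + 1)
}
y
```

The lengths on both sides of the assignment match, so nothing is dropped and no warning is raised.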
If you want to get 1:3 squared, you would use:
ok <- (1:3)^2
ok
# [1] 1 4 9
If you want to get 1:3 squared, along with the numbers right after them, you might try:
as.vector(rbind(ok, ok+1))
# [1] 1 2 4 5 9 10
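The interleaving works because R stores matrices column by column. A quick look at the intermediate matrix (named mat here only for illustration) makes this visible:

```r
ok <- (1:3)^2
mat <- rbind(ok, ok + 1)   # 2 rows, 3 columns: squares on top, squares + 1 below
mat

# as.vector reads the matrix column by column, so each square is
# immediately followed by its successor
flat <- as.vector(mat)
flat
```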
for loops in R are often the wrong solution to your problem.
Problem
Find the sum of all numbers below 1000 that are divisible by 3 or 5.
One solution I created:
x <- c(1:999)
values <- x[x %% 3 == 0 | x %% 5 == 0]
sum(values)
Second solution I can't get to work and need help with. I've pasted it below.
I'm trying to use a loop (here, I use while() and after this I'll try for()). I am still struggling with keeping references to indexes (locations in a vector) separate from values/observations within vectors. Loops seem to make it more challenging for me to distinguish the two.
Why does this not produce the answer to Euler #1?
x <- 0
i <- 1
while (i < 100) {
if (i %% 3 == 0 | i %% 5 == 0) {
x[i] <- c(x, i)
}
i <- i + 1
}
sum(x)
And in words, line by line this is what I understand is happening:
x gets value 0
i gets value 1
while object i's value (not the index #) is < 1000
if i is divisible by 3 or 5
add that number i to the vector x
add 1 to i (in order to keep the loop going to the defined limit of 1e3)
sum all items in vector x
I am guessing x[i] <- c(x, i) is not the right way to add an element to vector x. How do I fix this and what else is not accurate?
First, your loop runs until i < 100, not i < 1000.
Second, replace x[i] <- c(x, i) with x <- c(x, i) to add an element to the vector.
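With both changes applied, the loop from the question would read roughly as follows (a sketch that keeps the question's variable names and its starting value x <- 0, which does not affect the sum):

```r
x <- 0
i <- 1
while (i < 1000) {                     # limit corrected from 100 to 1000
  if (i %% 3 == 0 | i %% 5 == 0) {
    x <- c(x, i)                       # append the value i to the end of x
  }
  i <- i + 1
}
total <- sum(x)
total
```

Here i is the value being tested, while the position of each stored value in x is handled implicitly by c().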
Here is a shortcut that performs this sum, which is probably more in the spirit of the problem:
3*(333*334/2) + 5*(199*200/2) - 15*(66*67/2)
## [1] 233168
Here's why this works:
In the set of integers [1,999] there are:
333 values that are divisible by 3. Their sum is 3*sum(1:333) or 3*(333*334/2).
199 values that are divisible by 5. Their sum is 5*sum(1:199) or 5*(199*200/2).
Adding these up gives a number that is too high by the sum of their intersection, which consists of the values that are divisible by 15. There are 66 such values, and their sum is 15*sum(1:66) or 15*(66*67/2).
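The three pieces can be checked against brute-force sums. This is a verification sketch; the names by3, by5 and by15 are introduced here.

```r
x <- 1:999

by3  <- sum(x[x %% 3 == 0])    # multiples of 3
by5  <- sum(x[x %% 5 == 0])    # multiples of 5
by15 <- sum(x[x %% 15 == 0])   # counted in both sums above

# Each brute-force sum matches its closed form
c(by3 == 3 * (333 * 334 / 2),
  by5 == 5 * (199 * 200 / 2),
  by15 == 15 * (66 * 67 / 2))

# Inclusion-exclusion: subtract the overlap once
total <- by3 + by5 - by15
total
```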
As a function of N, this can be written:
f <- function(N) {
threes <- floor(N/3)
fives <- floor(N/5)
fifteens <- floor(N/15)
3*(threes*(threes+1)/2) + 5*(fives*(fives+1)/2) - 15*(fifteens*(fifteens+1)/2)
}
Giving:
f(999)
## [1] 233168
f(99)
## [1] 2318
And another way:
x <- 1:999
sum(which(x%%5==0 | x%%3==0))
# [1] 233168
A very efficient approach is the following:
div_sum <- function(x, n) {
# calculates the double of the sum of all integers from 1 to n
# that are divisible by x
max_num <- n %/% x
(x * (max_num + 1) * max_num)
}
n <- 999
a <- 3
b <- 5
(div_sum(a, n) + div_sum(b, n) - div_sum(a * b, n)) / 2
In contrast, a very short code is the following:
x=1:999
sum(x[!x%%3|!x%%5])
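The short form relies on ! treating 0 as FALSE and any non-zero number as TRUE, so !x %% 3 is TRUE exactly where the remainder is 0. A small check of that equivalence (the names short and explicit are only for illustration):

```r
x <- 1:999

# %% binds tighter than !, and ! binds tighter than |
short    <- !x %% 3 | !x %% 5
explicit <- x %% 3 == 0 | x %% 5 == 0

identical(short, explicit)
sum(x[short])
```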
Here is an alternative that I think gives the same answer (using 99 instead of 999 as the upper bound):
iters <- 100
x <- rep(0, iters-1)
i <- 1
while (i < iters) {
if (i %% 3 == 0 | i %% 5 == 0) {
x[i] <- i
}
i <- i + 1
}
sum(x)
# [1] 2318
Here is the for-loop mentioned in the original post:
iters <- 99
x <- rep(0, iters)
# for() sets i on every pass, so no manual initialisation or increment is needed
for (i in 1:iters) {
  if (i %% 3 == 0 | i %% 5 == 0) {
    x[i] <- i
  }
}
sum(x)
# [1] 2318