Question about passing sapply index to function - r

I am sure this is a simple problem, but I am new to programming and I am struggling. I think what I am trying to accomplish should be fairly clear from the code. Essentially, I want to generate a vector of random numbers of length i and check whether it contains fewer than i unique numbers, and I want to do this many times as a sort of simulation. When I do it manually, one value of i at a time, using the following code:
experiment<- function() {
ab <- rdunif(i, 1, 365)
ab <- data.frame(ab)
count <- uniqueN(ab)
if (count < i)
return(1)
else
return(0)
}
vector <- replicate(10, experiment(), simplify=FALSE)
sum <- sum(as.data.frame((vector)))
probability <- sum/(10)
It works fine. But I need to run this simulation 40 times and I would rather not do it by hand. However, I can't seem to get sapply to work for me and I cannot figure out what I am doing wrong:
i<-10:50
experiment<- function(i) {
ab <- rdunif(i, 1, 365)
ab <- data.frame(ab)
count <- uniqueN(ab)
if (count < i)
return(1)
else
return(0)
}
complete <- function(i) {
vector <- replicate(10, experiment(i), simplify=FALSE)
sum <- sum(as.data.frame((vector)))
probability <- sum/(10)
return(probability)
}
sapply(i, complete(i), simplify=FALSE)
This is the error I am currently experiencing:
Error in match.fun(FUN) :
'complete(i)' is not a function, character or symbol
In addition: Warning messages:
1: In if (count < i) return(1) else return(0) :
the condition has length > 1 and only the first element will be used
2: In if (count < i) return(1) else return(0) :
the condition has length > 1 and only the first element will be used
3: In if (count < i) return(1) else return(0) :
the condition has length > 1 and only the first element will be used
4: In if (count < i) return(1) else return(0) :
the condition has length > 1 and only the first element will be used
5: In if (count < i) return(1) else return(0) :
the condition has length > 1 and only the first element will be used
6: In if (count < i) return(1) else return(0) :
the condition has length > 1 and only the first element will be used
7: In if (count < i) return(1) else return(0) :
the condition has length > 1 and only the first element will be used
8: In if (count < i) return(1) else return(0) :
the condition has length > 1 and only the first element will be used
9: In if (count < i) return(1) else return(0) :
the condition has length > 1 and only the first element will be used
10: In if (count < i) return(1) else return(0) :
the condition has length > 1 and only the first element will be used

I figured it out:
experiment<- function(i) {
ab <- rdunif(i, 1, 365)
count <- length(unique(ab))
if (count < i) return(1)
else return(0)
}
i <- 10:50
replication <- function(i) {
replicate(100, experiment(i))
}
data<- sapply(i, replication)
colMeans(data)
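For anyone hitting the same error, two things appear to have changed in the working version: sapply receives the function itself (replication) rather than the result of calling it (complete(i)), and experiment is given one number at a time instead of the whole 10:50 vector. A minimal sketch of the same simulation in base R follows. It assumes sample.int() as a stand-in for rdunif(), and collision_prob is a hypothetical name that does not appear in the post.

```r
# Sketch only: sample.int() replaces rdunif(), and collision_prob is a
# hypothetical helper name, not something from the original post.
experiment <- function(n) {
  days <- sample.int(365, n, replace = TRUE)  # n random values in 1..365
  as.integer(length(unique(days)) < n)        # 1 if at least one duplicate
}

collision_prob <- function(n, reps = 100) {
  mean(replicate(reps, experiment(n)))        # share of runs with a duplicate
}

set.seed(1)                                   # only for reproducibility
sizes <- 10:50
probs <- sapply(sizes, collision_prob)        # pass the function, not a call
names(probs) <- sizes
```

Because sapply calls collision_prob once per element of sizes, the condition inside experiment always compares single numbers, so the "condition has length > 1" warning should not appear.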

Related

R: cumulative sum until certain value

I want to calculate how many values are taken until the cumulative reaches a certain value.
This is my vector: myvec = seq(0,1,0.1)
I started with coding the cumulative sum function:
cumsum_for <- function(x)
{
y = 1
for(i in 2:length(x)) # pardon the case where x is of length 1 or 0
{x[i] = x[i-1] + x[i]
y = y+1}
return(y)
}
Now, with the limit
cumsum_for <- function(x, limit)
{
y = 1
for(i in 2:length(x)) # pardon the case where x is of length 1 or 0
{x[i] = x[i-1] + x[i]
if(x >= limit) break
y = y+1}
return(y)
}
which unfortunately produces warnings:
myvec = seq(0,1,0.1)
cumsum_for(myvec, 0.9)
[1] 10
Warning messages:
1: In if (x >= limit) break :
the condition has length > 1 and only the first element will be used
[...]
What about this? You can use cumsum to compute the cumulative sum, and then count how many of the running totals are at or below a certain threshold n:
f <- function(x, n) sum(cumsum(x) <= n)
f(myvec, 4)
#[1] 9
f(myvec, 1.1)
#[1] 5
You can put a while loop in a function. This stops further calculation of the cumsum if the limit is reached.
cslim <- function(v, l) {
s <- 0
i <- 0L
while (s < l) {
i <- i + 1
s <- sum(v[1:i])
}
i - 1
}
cslim(myvec, .9)
# [1] 4
Especially useful for longer vectors, e.g.
v <- seq(0, 3e7, 0.1)
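Note that the while loop above recomputes sum(v[1:i]) on every pass and does not stop on its own if the limit is never reached. A hedged sketch of a variant that keeps a running total and falls back to the vector length in that case is shown here. cslim_running is a hypothetical name, and for a positive limit it is intended to give the same count as cslim, up to floating-point rounding.

```r
# Sketch only: cslim_running is a hypothetical name, not from the thread.
cslim_running <- function(v, l) {
  s <- 0
  for (i in seq_along(v)) {
    s <- s + v[i]                 # update the running total in O(1)
    if (s >= l) return(i - 1L)    # values taken before the limit was hit
  }
  length(v)                       # limit never reached: all values counted
}

myvec <- seq(0, 1, 0.1)
cslim_running(myvec, 0.9)
```

Returning length(v) when the limit is not reached is a design choice made for this sketch; returning NA would be an equally reasonable convention.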

What is the purpose of b <-c() in this code?

I have written a code to find a positive integer that has more divisors than any smaller positive integer has. My code is right but I noticed that I wrote a step only because I had solved other questions similarly but I don't really understand the intuition of why we write this particular line:
b <- c()
Also, why is there a "b" in c(b, sum..) as in the below line:
b <- c(b, sum(p %% c(1:p) == 0))
Here is the full code:
code <- function(n) {
if (n < 1 | n %% 1 != 0)
print("Only positive integers allowed")
else if (n <= 2)
return(TRUE)
else {
a <- sum(n %% c(1:n) == 0)
b <- c()
for (p in 1:(n-1)) {
b <- c(b, sum(p %% c(1:p) == 0))
}
return(max(b) < a)
}
}
code(8)
code(6)
code(-7)
As already explained in the comments, the purpose of b <- c() is to initialise an empty vector (c() with no arguments returns NULL) that is then filled in the loop. The reason for writing b <- c(b, sum(p %% c(1:p) == 0)) is to append each new value to the values already stored in b.
For example,
b <- c()
b
#NULL
b <- c(b, 1)
b
#[1] 1
b <- c(b, 2)
b
#[1] 1 2
Usually, it is not good practice to grow an object in a loop, because R has to copy the object each time it is extended, which is inefficient. If the size of the output is known in advance, you can initialise a vector of that size and then fill it in the loop.
code <- function(n){
if (n<1 | n%%1!=0)
print("Only positive integers allowed")
else if (n <= 2)
return(TRUE)
else{
a <- sum(n%%c(1:n) == 0)
b <- integer(n-1) #Creates a vector with 0's of length n-1
for (p in 1:(n-1)) {
b[p] <- sum(p%%c(1:p)== 0)
}
return(max(b) < a)
}
}
Or, in this case, you can keep only the running maximum, since the other values of b are never used.
code <- function(n){
if (n<1 | n%%1!=0)
print("Only positive integers allowed")
else if (n <= 2)
return(TRUE)
else{
a <- sum(n%%c(1:n) == 0)
max_b <- 0
for (p in 1:(n-1)) {
val <- sum(p%%c(1:p)== 0)
if(val > max_b) max_b <- val
}
return(max_b < a)
}
}
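The same check can also be written without an explicit loop. The following is a sketch under stated assumptions: n_divisors and has_most_divisors are hypothetical names, and n is taken to be a valid positive integer (the input validation from the original function is left out).

```r
# Sketch only: helper names are hypothetical; n is assumed to be a
# positive whole number, so no validation is performed here.
n_divisors <- function(p) sum(p %% seq_len(p) == 0)   # count divisors of p

has_most_divisors <- function(n) {
  if (n <= 2) return(TRUE)
  # divisor counts for every positive integer smaller than n
  smaller <- vapply(seq_len(n - 1), n_divisors, integer(1))
  max(smaller) < n_divisors(n)
}

has_most_divisors(6)
has_most_divisors(8)
```

vapply preallocates its result, so this avoids growing a vector in the same way the integer(n-1) version does.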

R: Creating a function using the control structure 'for'

I would like to create a function that, given a vector (v) and a number (n), checks whether any of the numbers in 'v' is divisible by 'n'; if so, the function should return 'TRUE'. How could I use the control structure 'for' for this?
So far I've solved this problem using the 'while' operator:
function.while <- function(v, n){
while (n %% v == 0)
return (TRUE)
}
But I can't fully understand the logic of 'for'.
Thanks.
You can use a for loop like this:
function.for <- function(v, n){
result <- logical(length(v))
for(i in seq_along(v)) {
result[i] <- v[i] %% n == 0
}
return(result)
}
function.for(1:10, 2)
#[1] FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE
However, this is a vectorised operation, so you don't need a for loop.
function.vectorized <- function(v, n){
v %% n == 0
}
function.vectorized(1:10, 2)
If you only want to check whether any value of v is divisible by n, wrap the comparison in any():
function.vectorized <- function(v, n){
any(v %% n == 0)
}
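Since the question asks specifically for a 'for' loop that yields a single TRUE, here is a minimal sketch that stops at the first divisible element. any_divisible is a hypothetical name, and returning FALSE when nothing matches is an assumption, since the question only specifies the TRUE case.

```r
# Sketch only: any_divisible is a hypothetical name; returning FALSE when
# no element matches is an assumption not stated in the question.
any_divisible <- function(v, n) {
  for (x in v) {
    if (x %% n == 0) return(TRUE)   # stop at the first divisible element
  }
  FALSE                             # loop finished without a match
}

any_divisible(c(3, 5, 8), 4)
any_divisible(c(3, 5, 7), 2)
```

The loop visits each element of v in turn, which is the main difference from while: 'for' iterates over a known sequence, whereas 'while' repeats until a condition becomes false.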

Error argument is of length zero occurs when creating my own sorting function

> csort <- function(c){
i<-1
for (i in 1:length(c)-1) {
j <- i+1
for (j in 2:length(c)) {
if(c[i] >= c[j])c[c(i,j)] <- c[c(j,i)]
j = j + 1
}
i = i + 1
}
}
> csort(a)
Error in if (c[i] >= c[j]) c[c(i, j)] <- c[c(j, i)] :
argument is of length zero
This is what RStudio does when I run it. I do not know what causes the length-zero argument here.
csort <- function(c){
p <- 1
povit <- c[1]
c <- c[-1]
left <- c()
right <- c()
left <- c[which(c <= povit)]
right <- c[which(c > povit)]
if(length(left) > 1){
left <- csort(left)
}
if(length(right) > 1){
right <- csort(right)
}
return(c(left ,povit,right))
}
I read more about sorting online, and this is a pivot-based (quicksort-style) approach.
Your mistake is in this line:
for (i in 1:length(c)-1)
It should be:
for (i in 1:(length(c)-1))
because the ':' operator has higher precedence than '-', so 1:length(c)-1 is evaluated as (1:length(c)) - 1.
For example:
1:(5-1)
#[1] 1 2 3 4
1:5-1
#[1] 0 1 2 3 4
So the error happens when the index is zero: c[0] has length zero, which makes the if condition an argument of length zero.
csort <- function(d){
for (i in 1:(length(d)-1)) {
for (j in (i+1):length(d)) {
if(d[i] >= d[j])d[c(i,j)] <- d[c(j,i)]
}
}
return(d)
}
d<-c(5:1,-1:3,-9,-3,10,9,-20,1,20,-6,5)
any((csort(d)==sort(d))==F)
#[1] FALSE
You can improve this function further.
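One possible improvement, sketched under the assumption that inputs of length 0 or 1 should be returned unchanged: for a single-element vector, 1:(length(d)-1) becomes 1:0, which counts down through 1 and 0 instead of being empty, so an early return guards against that. csort_safe is a hypothetical name.

```r
# Sketch only: csort_safe is a hypothetical name. It keeps the same
# exchange-sort idea and adds a guard for very short inputs.
csort_safe <- function(d) {
  n <- length(d)
  if (n < 2) return(d)                 # nothing to sort; also avoids 1:0
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      if (d[i] > d[j]) d[c(i, j)] <- d[c(j, i)]   # swap out-of-order pair
    }
  }
  d
}

d <- c(5:1, -1:3, -9, -3, 10, 9, -20, 1, 20, -6, 5)
all(csort_safe(d) == sort(d))
```

Using > instead of >= skips swaps between equal elements, which does not change the result but saves some work.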

If statement doesn't work with a matrix condition

I have absolutely no idea why this is happening. My if statement doesn't work with a condition involving matrices.
This is my input:
i = matrix(c(1,0,0,1),nrow=2,ncol=2,byrow=TRUE)
j = matrix(c(1,0,0,2),nrow=2,ncol=2,byrow=TRUE)
if(i%*%i == j){
print("yes")
}
This is my output:
> i = matrix(c(1,0,0,1),nrow=2,ncol=2,byrow=TRUE)
> j = matrix(c(1,0,0,2),nrow=2,ncol=2,byrow=TRUE)
> if(i%*%i == j){
+ print("yes")
+ }
[1] "yes"
Warning message:
In if (i %*% i == j) { :
the condition has length > 1 and only the first element will be used
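The warning arises because i %*% i == j compares the matrices element by element and yields a 2 x 2 logical matrix, while if expects a single TRUE or FALSE. Only the first element (1 == 1) was used, which is why "yes" was printed even though the matrices differ. R 4.2.0 and later treat this situation as an error rather than a warning. A minimal sketch of collapsing the comparison to one value, using the matrices from the question (same_exact and same_approx are hypothetical variable names):

```r
# Sketch only: same_exact and same_approx are hypothetical names.
i <- matrix(c(1, 0, 0, 1), nrow = 2, ncol = 2, byrow = TRUE)
j <- matrix(c(1, 0, 0, 2), nrow = 2, ncol = 2, byrow = TRUE)

same_exact  <- all(i %*% i == j)                 # exact element-wise equality
same_approx <- isTRUE(all.equal(i %*% i, j))     # tolerant of rounding error

if (same_exact) print("yes") else print("no")
```

all() is enough for whole-number matrices like these; the all.equal() form is usually the safer choice when the entries come from floating-point arithmetic.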
