I am trying to loop over the upper triangular part of a matrix, so I only want the elements with 1 <= i < j <= n. I tried it in R as follows:
for(i in 1:n-1) {
for(j in i+1:n) {
...
}
}
But instead of iterating over 1 <= i < j <= n, these for loops go over the elements i + 1 <= j <= i + n, with i running over 0 <= i <= n - 1.
I'm new to R, so I don't understand what is happening. Could someone give me a hint how to do it correctly?
Thanks.
for(i in seq(1, n - 1)) {
for(j in seq(i + 1, n)) {
...
}
}
Alternatively:
for(i in 1:(n - 1)) {
for(j in (i + 1):n) {
...
}
}
The issue is operator precedence: the colon operator binds more tightly than + and -, so R reads i+1:n as i + (1:n) and 1:n-1 as (1:n) - 1.
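The precedence rule above is easy to check interactively. The following is a minimal sketch, with n and i set to arbitrary illustrative values, that compares the parenthesised and unparenthesised forms and then collects the (i, j) pairs visited by the corrected loops.

```r
n <- 4
i <- 2

# `:` binds more tightly than binary + and -, so these differ:
1:n - 1      # (1:n) - 1  -> 0 1 2 3
1:(n - 1)    #            -> 1 2 3
i + 1:n      # i + (1:n)  -> 3 4 5 6
(i + 1):n    #            -> 3 4

# Collect the pairs visited by the corrected nested loops.
pairs <- list()
for (i in 1:(n - 1)) {
  for (j in (i + 1):n) {
    pairs[[length(pairs) + 1]] <- c(i, j)
  }
}
length(pairs)  # 6, i.e. n * (n - 1) / 2
```

One caveat: 1:(n - 1) counts down (1, 0) when n is 1, so seq_len(n - 1) is the usual guard if n can be that small.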
for (i in 2:100 )
{
count <- 0
for (j in i )
if( (i %% j ) == 0 )
count <- count + 1
if(count == 2 )
print(i)
}
I am trying to print prime numbers in R. Could anyone help me resolve this?
Let us look at your code and show what went wrong. It's the inner loop that did not loop at all:
for (i in 2:100 )
{
count <- 0
for (j in 1:i ) # you forgot to provide a vector here
if( i %% j == 0 )
count <- count + 1
if(count == 2)
print(i)
}
The answer above optimises the code further and is a more efficient solution. For example, it only tests odd numbers, because even numbers other than 2 are not prime.
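To illustrate the idea of skipping even candidates, here is a minimal sketch. It is not the code from the answer referred to above: is_prime is a hypothetical helper name, and stopping at the square root of k is a standard trial-division shortcut added here as an assumption.

```r
# Hypothetical helper: trial division by odd numbers only.
is_prime <- function(k) {
  if (k < 2) return(FALSE)
  if (k < 4) return(TRUE)          # 2 and 3 are prime
  if (k %% 2 == 0) return(FALSE)   # even numbers above 2 are not prime
  d <- 3
  while (d * d <= k) {             # divisors above sqrt(k) need no check
    if (k %% d == 0) return(FALSE)
    d <- d + 2                     # skip even divisors
  }
  TRUE
}

primes <- Filter(is_prime, 2:100)
length(primes)  # 25
```

Compared with counting every divisor from 1 to i, this does far fewer modulo operations per candidate, at the cost of a slightly longer function.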
The below code creates the function prime_numbers which returns prime numbers up to an inputted number.
prime_numbers <- function(n) {
if (n >= 2) {
x = seq(2, n)
prime_nums = c()
for (i in seq(2, n)) {
if (any(x == i)) {
prime_nums = c(prime_nums, i)
x = c(x[(x %% i) != 0], i)
}
}
return(prime_nums)
}
else {
stop("Input number should be at least 2.")
}
}
## return the answer to your question
prime_numbers(100)
If you wanted the range 3:100, after running the above code you could run something like this:
a <- prime_numbers(100)
a[a >= 3]
I hope this helps!
I would like to create my own Heapsort algorithm in R.
This is my code:
heapify <- function(array, n, i)
{
parent <- i
leftChild <- 2 * i + 1
rightChild <- 2 * i + 2
if ((leftChild < n) & (array[parent] < array[leftChild]))
{
parent = leftChild
}
if ((rightChild < n) & (array[parent] < array[rightChild]))
{
parent = rightChild
}
if (parent != i)
{
array = replace(array, c(i, parent), c(array[parent], array[i]))
heapify(array, n, parent)
}
}
heapSort <- function(array)
{
n <- length(array)
for (i in (n+1):1)
{
heapify(array, n, i)
}
for (i in n:1)
{
array = replace(array, c(i, 0), c(array[0], array[i]))
heapify(array, i, 1)
}
print(array)
}
However, that implementation seems to be incorrect. Here is an example of an input and output:
array <- c(5, 14, 3, 70, 64)
heapSort(array)
Output: [1] 5 14 3 70 64
I have spent quite a while and I have no idea where the problem is. I would appreciate any hints or tips.
My guess is that you were trying to convert the algorithm posted on GeeksforGeeks, where it is implemented in several zero-based languages. This is one source of your problem: R starts indexing at 1 instead of 0.
Base Zero Indexing Issues:
Example 1:
## We also need to swap these indices
array = replace(array, c(i, 0), c(array[0], array[i]))
heapify(array, i, 1)
Should be:
## Swap the root (index 1) with position i, then re-heapify the first i - 1 elements
array <- replace(array, c(i, 1), array[c(1, i)])
array <- heapify(array, i - 1, 1)
Example 2:
leftChild <- 2 * i + 1
rightChild <- 2 * i + 2
Should be:
leftChild <- 2 * i
rightChild <- 2 * i + 1
With one-based indexing the bounds checks also change from leftChild < n to leftChild <= n, because index n is itself part of the heap.
Pass By Reference Assumption
In R, you cannot pass an object by reference (see this question and answers: Can you pass-by-reference in R?). This means that a recursive function must return its result, and every call must assign that result.
In heapify we must return array, and every call to heapify must assign its output back to array.
Here is the amended code:
heapify <- function(array, n, i)
{
  # Sift array[i] down within the heap stored in array[1], ..., array[n]
  parent <- i
  leftChild <- 2 * i
  rightChild <- 2 * i + 1
  if ((leftChild <= n) && (array[parent] < array[leftChild]))
  {
    parent <- leftChild
  }
  if ((rightChild <= n) && (array[parent] < array[rightChild]))
  {
    parent <- rightChild
  }
  if (parent != i) {
    array <- replace(array, c(i, parent), array[c(parent, i)])
    array <- heapify(array, n, parent)
  }
  array
}
heapSort <- function(array)
{
  n <- length(array)
  if (n < 2) return(array)  # nothing to sort
  # Build a max-heap over all n elements
  for (i in floor(n / 2):1) {
    array <- heapify(array, n, i)
  }
  # Move the current maximum to position i, then shrink the heap to i - 1
  for (i in n:2) {
    array <- replace(array, c(i, 1), array[c(1, i)])
    array <- heapify(array, i - 1, 1)
  }
  array
}
Here are some tests (note that this algorithm is extremely inefficient in R, so do not try it on vectors much larger than the ones below):
array <- c(5, 14, 3, 70, 64)
heapSort(array)
[1] 3 5 14 64 70
set.seed(11)
largerExample <- sample(1e3)
head(largerExample)
[1] 278 1 510 15 65 951
identical(heapSort(largerExample), 1:1e3)
[1] TRUE
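The pass-by-value point made earlier in this answer can be checked in isolation. The sketch below uses a hypothetical function bump_first (not part of the heapsort code) to show that a change made inside a function is only visible to the caller through the returned value.

```r
# Hypothetical helper: modifies its argument and returns the modified copy.
bump_first <- function(v) {
  v[1] <- v[1] + 100   # changes the function's local copy only
  v
}

x <- c(1, 2, 3)
bump_first(x)        # result is discarded, x is unchanged
y <- bump_first(x)   # result is kept in y
x                    # 1 2 3
y                    # 101 2 3
```

This is why every call to heapify has to be written as array <- heapify(...): without the assignment, the swaps made inside the call are lost.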
I'm trying to write a condition that prints 0 if no matching "i" exists in the vector. For this vector it should print just 3 (the position of 26).
numbers_vector=c(1,5,26,7,94)
for (i in numbers_vector)
if ((i >24)&(i%%13 == 0)) {
print(which(numbers_vector==i))
} else {
print(0)
}
Here is the solution for your homework (using a loop):
v <- c(1, 5, 26, 7, 94)
w <- 0
for (i in seq_along(v)) {
if ((v[i] >24) & (v[i] %% 13 == 0)) { w <- i; break }
}
w
Without the loop restriction, the code can be shorter:
v <- c(1,5,27,7,94) # 27 instead of 26, so nothing matches here
w <- which((v >24) & (v%%13 == 0))
if (length(w)==0) w <- 0
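The which-based approach can be wrapped in a small function so that both outcomes are easy to compare. This is only a sketch: first_match is a hypothetical name, and returning only the first matching position is an assumption based on the loop version, which stops at the first hit.

```r
# Hypothetical helper: position of the first element that is greater
# than 24 and divisible by 13, or 0 if there is none.
first_match <- function(v) {
  w <- which(v > 24 & v %% 13 == 0)
  if (length(w) == 0) 0L else w[1]
}

first_match(c(1, 5, 26, 7, 94))  # 3
first_match(c(1, 5, 27, 7, 94))  # 0
```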
I am learning how to use while and for loops, and am having difficulty executing a for loop within a while loop. I am trying to recurse a simple procedure over a vector, and have the while loop set the conditionals for which parts of the vector are operated upon. This is really just meant as an exercise to understand how to use for and while loops in conjunction.
data <- c(1:200)
n=0
while(n < 100){
for(x in data){
n=x+1
if(x >= 10 & x < 100){
print("double digit")
} else {
print("single digit")
}}}
I want to stop the while loop when the procedure hits x=100 of the vector that runs from 1:200. But instead, the for loop runs through every element within the vector, from 1:200, failing to stop executing when n hits 100.
Would appreciate any advice to help out a new coder, thanks.
I have also tried
for(x in data){
n=x+1
while(n < 100){
if(x >= 10 & x < 100){
print("double digit")
} else {
print("single digit")
}}}
But the code does not stop executing.
First let's try a for loop. Each time through, n is set to the loop counter plus 1. While n is below 100, the code prints one message if x is between 10 and 99 and another message otherwise. Note that the for loop itself does not depend on n, so it still runs through all 200 elements.
data <- c(1:200)
n = 0
for (x in data) {
n = x + 1
if (n < 100) {
if (x >= 10 && x < 100) {
print("double digit")
} else {
print("single digit")
}
}
}
x
#[1] 200
n
#[1] 201
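If the goal is to actually leave the for loop once x reaches 100, break does that. The following is a minimal sketch under that assumption; it collects the messages in a vector named labels (a name chosen here for illustration) instead of printing them, so the result is easy to inspect.

```r
labels <- character(0)
for (x in 1:200) {
  if (x >= 100) break   # leave the loop as soon as x hits 100
  labels <- c(labels, if (x >= 10) "double digit" else "single digit")
}
x               # 100
length(labels)  # 99
table(labels)   # 90 "double digit", 9 "single digit"
```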
Now for a while loop. I believe it is much simpler, since it only involves one variable, n.
n <- 0
while (n < 100) {
n = n + 1
if (n < 100) {
if (n >= 10) {
print("double digit")
} else {
print("single digit")
}
}
}
n
#[1] 100
My reproducible R example:
f = runif(1500,10,50)
p = matrix(0, nrow=1250, ncol=250)
count = rep(0, 1250)
for(i in 1:1250) {
ref=f[i]
for(j in 1:250) {
p[i,j] = f[i + j - 1] / ref-1
if(p[i,j] == "NaN") {
count[i] = count[i]
}
else if(p[i,j] > (0.026)) {
count[i] = (count[i] + 1)
ref = f[i + j - 1]
}
}
}
To be more precise, I have a set of 600 f-series and this code runs 200 times for each f-series. Currently I am doing the iterations in loops and most of the operations are element-wise. My random variables are f, the condition if(p[i,j] > (0.026)), and the number 0.026 in itself.
One can drastically reduce the run-time by vectorizing my code and using functions, specifically the apply family, but I am rusty with apply and looking for some advice to proceed in the right direction.
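As a starting point for the apply-family idea, here is a hedged sketch that only reproduces the count vector, not the p matrix. count_jumps is a hypothetical helper name. Because ref depends on earlier elements of the same window, the inner loop stays sequential and only the outer loop is replaced by vapply, so this mainly restructures the code and may not be much faster.

```r
# Hypothetical helper: count how often the series rises more than
# `threshold` (relative) above the current reference value.
count_jumps <- function(window, threshold = 0.026) {
  ref <- window[1]
  count <- 0
  for (value in window) {
    change <- value / ref - 1
    if (!is.nan(change) && change > threshold) {
      count <- count + 1
      ref <- value   # the reference moves up to the new value
    }
  }
  count
}

set.seed(1)
f <- runif(1500, 10, 50)
# One window of 250 values per starting position i.
counts <- vapply(1:1250, function(i) count_jumps(f[i:(i + 249)]), numeric(1))
```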
It is quite easy to put for loops in Rcpp. I just copy-pasted your code to Rcpp and haven't checked the validity. In case of discrepancy, let me know. fCpp returns the list of p and c.
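library(Rcpp)
cppFunction('List fCpp(NumericVector f) {
const int n=1250;
const int k=250;
NumericMatrix p(n, k);
NumericVector c(n);
for(int i = 0; i < n; i++) {
double ref=f[i];
for(int j = 0; j < k; j++) {
// Note: with zero-based i and j, the R index f[i + j - 1] corresponds to f[i+j];
// f[i+j+1] as written here reads one element further along the series.
p(i,j) = f[i+j+1]/ref-1;
// A comparison with NAN is never true in C++, so test for NaN explicitly.
if(R_IsNaN(p(i,j))){
c[i]=c[i];
}
else if(p(i,j) > 0.026){
c[i] = c[i]+1;
ref = f[i+j+1];
}
}
}
return List::create(p, c);
}')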
Benchmark
set.seed(1)
f = runif(1500,10,50)
f1 <- function(f){
p = matrix(0, nrow=1250, ncol=250)
count = rep(0, 1250)
for(i in 1:1250) {
ref=f[i]
for(j in 1:250) {
p[i,j] = f[i + j - 1] / ref-1
if(p[i,j] == "NaN") {
count[i] = count[i]
}
else if(p[i,j] > (0.026)) {
count[i] = (count[i] + 1)
ref = f[i + j - 1]
}
}
}
list(p, count)
}
microbenchmark::microbenchmark(fCpp(f), f1(f), times=10L, unit="relative")
Unit: relative
expr min lq mean median uq max neval
fCpp(f) 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 10
f1(f) 785.8484 753.7044 734.4243 764.5883 718.0868 644.9022 10
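The values returned by fCpp(f) and f1(f) are essentially identical, except that column 1 of the p matrix returned by f1 is filled with 0s. That column is f[i] / ref - 1 with ref equal to f[i], whereas fCpp starts one element later, at f[i+j+1] with zero-based i and j.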
system.time(a <- f1(f))[3]
#elapsed
# 2.8
system.time(a1 <- fCpp(f))[3]
#elapsed
# 0
all.equal( a[[1]], a1[[1]])
#[1] "Mean relative difference: 0.7019406"
all.equal( a[[2]], a1[[2]])
#[1] TRUE
Here is an implementation using while, although it takes much longer than the nested for loops, which is a bit counterintuitive.
f1 <- function() {
n <- 1500
d <- 250
f = runif(n,1,5)
f = embed(f, d)
f = f[-(n-d+1),]
count = rep(0, n-d)
for(i in 1:(n-d)) {
tem <- f[i,]/f[i,1] - 1
ti <- which(tem[-d] > 0.026)[1]
while(ti < d & !is.na(ti)) {
ti.plus = ti+1
tem[ti.plus:d] = f[i, ti.plus:d] / tem[ti]
count[i] = count[i] + 1
# note: ti.plus:d-1 is parsed as (ti.plus:d) - 1, i.e. ti:(d-1)
ti <- ti + which(tem[ti.plus:d-1] > 0.026)[1]
}
f[i,] = tem
}
list(f, count)
}
system.time(f1())
#elapsed
#6.365
@ajmartin, your logic was better and reduced the number of iterations I was attempting. Here is the improved version of your code in R:
f1 <- function() {
n <- 1500
d <- 250
f = runif(n,1,5)
count = rep(0, n-d)
for(i in 1:(n-d)) {
tem <- f[i:(i+d-1)] / f[i] - 1
ind = which(tem>0.026)[1]
while(length(which(tem>0.026))){
count[i] = count[i] + 1
tem[ind:d] = f[ind:d] / tem[ind] - 1
ind = ind - 1 + (which(tem[ind:d] > 0.026)[1])
}
}
list(f, count)
}
system.time(f1())[3]
# elapsed
# 0.09
Implementing this in Rcpp would reduce the run time further, but I can't install Rtools because I don't have admin rights on my current machine. Meanwhile, this helps.