Bubble sort using R language?

I am new to programming and have just started learning R. I am trying to write a bubble sort, but it produces the following error message. Can anyone help me solve the problem?
x <-sample(1:100,10)
n <- length(x)
example <- function(x)
{
for (i in 1:n-1)
{
while (x[i] > x[i+1])
{
temp <- x[i+1]
x[i+1] <- x[i]
x[i] <- temp
}
i <- i+1
}
}
example(x)
Error in while (x[i] > x[i + 1]) { : argument is of length zero

x<-sample(1:100,10)
example <- function(x){
n<-length(x)
for(j in 1:(n-1)){
for(i in 1:(n-j)){
if(x[i]>x[i+1]){
temp<-x[i]
x[i]<-x[i+1]
x[i+1]<-temp
}
}
}
return(x)
}
res<-example(x)
#input
x
#output
res
It works fine with a small modification of your code. In R it is better to use the sort() function.
x <-sample(1:100,10)
x
res<-sort(x)
res

Your sorting algorithm has some inaccuracies. I've made changes to make it work.
set.seed(1)
x <-sample(1:100,10)
x
# [1] 27 37 57 89 20 86 97 62 58 6
example <- function(x)
{
n <- length(x) # better to compute the length inside the sorting function
for (k in n:2) # every iteration of the outer loop bubbles the maximum element
# of the array to the end
{
i <- 1
while (i < k) # i is the index for the nested loop; no need to go up to n
# because each pass of the for loop already leaves the largest
# remaining element in its final position at the end
{
if (x[i] > x[i+1]) # if the element is greater than the next one, swap them
{
temp <- x[i+1]
x[i+1] <- x[i]
x[i] <- temp
}
i <- i+1 # moving to the next element
}
}
x # returning sorted x (the last evaluated value inside the body
# of the function is returned), we can also write return(x)
}
example(x)
# [1] 6 20 27 37 57 58 62 86 89 97
By the way, R has many built-in functions and methods. This example function is useful for learning, but I advise using the existing sort function for real problems.
In R you should try to avoid explicit loops and use vectorized functions to make the code faster.
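As a small illustration of the built-in route recommended above, here is a sketch using only base R. The variable names are made up for the example.

```r
set.seed(1)
x <- sample(1:100, 10)

ascending  <- sort(x)                     # increasing order
descending <- sort(x, decreasing = TRUE)  # decreasing order
via_order  <- x[order(x)]                 # order() gives the permutation that sorts x

print(ascending)
print(descending)
```

order() is the one to reach for when a second vector or the rows of a data frame have to be rearranged in the same way.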

It gives you that error message because it cannot compare a value that is outside the vector's bounds, which is what happens at (x[i] > x[i + 1]): 1:n-1 is evaluated as (1:n)-1, so i starts at 0 and x[0] is empty. Try this if you want to sort your vector in decreasing order:
n = length(x)
for (i in 1:n){
j = i
while((j>1)){
if ((x[j]> x[j-1])){
temp = x[j]
x[j] = x[j-1]
x[j-1] = temp
}
j = j-1
}
}
For increasing order you just have to flip the > sign in the if condition.
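The "argument is of length zero" error from the question can be reproduced in isolation. A minimal sketch, using the sample vector from one of the answers, of how 1:n-1 is parsed and what indexing with 0 returns:

```r
n <- 10
x <- c(27, 37, 57, 89, 20, 86, 97, 62, 58, 6)

# The colon operator binds tighter than binary minus,
# so 1:n-1 means (1:n) - 1 and therefore starts at 0.
bad_index  <- 1:n-1
good_index <- 1:(n-1)

# Indexing with 0 returns an empty vector, so the comparison is empty too,
# and while()/if() cannot work with a zero-length condition.
empty_cmp <- x[0] > x[1]

print(bad_index)
print(good_index)
print(length(empty_cmp))
```

Writing the range as 1:(n-1) or seq_len(n - 1) avoids the problem.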

Related

how to calculate h-point

I am trying to write a function to calculate the h-point. The function is defined over a rank-frequency data frame.
Consider the following data.frame:
DATA <-data.frame(frequency=c(49,48,46,38,29,24,23,22,15,12,12,10,10,9,9), rank=c(seq(1, 15)))
and the formula for h-point is :
if {there is an r = f(r), h-point = r }
else { h-point = f(i)j-f(j)i / j-i+f(i)-f(j) }
where f(i) and f(j) are the frequencies at ranks i and j, and i and j are adjacent ranks such that i<f(i) and j>f(j).
Now, I have tried the following code:
fr <-function(x){d <-DATA$frequency[x]
return(d)}
for (i in 1:length(DATA$rank)) {
j <- i+1
if (i==fr(i))
return(i)
else(i<fr(i) && j>fr(j)) {
s <-fr(i)*j-fr(j)*i/j-i+fr(i)-fr(j)
return(s)
}}
I also tried:
for (i in 1:length(DATA$rank)) {
j <- i+1
if (i==fr(i))
return(i)
if (i<fr(i) while(j>fr(j))) {
s <-fr(i)*j-fr(j)*i/j-i+fr(i)-fr(j)
return(s)
}}
Neither of them works. For DATA, the desired result would be i=11 and j=12, so:
h-point=12×12 - 10×11 / 12 - 11 + 12 - 10
Can you please tell me what I'm doing wrong here?
You could do:
h_point <- function(data){
x <- seq(nrow(data))
f_x <- data[["frequency"]][x]
h <- which(x == f_x)
if(length(h)>0) h # at least one rank satisfies r == f(r)
else{
i <- which(x<f_x)
j <- which(x>f_x)
s <- which(outer(i,j,"-") == -1, TRUE)
i <- i[s[,1]]
j <- j[s[,2]]
cat("i: ",i, "j: ", j,"\n")
f_x[i]*j - f_x[j]*i / (i-j + f_x[i]-f_x[j])
}
}
h_point(DATA)
i: 11 j: 12
[1] 34
I think I have figured out what you are trying to achieve. My loop goes through DATA and breaks as soon as rank == frequency for a given row. It might be more prudent to test this explicitly with DATA$rank[i] == fr(i) rather than relying on i, in case of tied ranks, etc.
The second if statement calculates h-point (s) for rows i and j if row i has rank that is lower than freq and row j has a rank that is higher.
Is this what you wanted?
DATA <-data.frame(frequency=c(49,48,46,38,29,24,23,22,15,12,12,10,10,9,9), rank=c(seq(1, 15)))
fr <-function(x){d <-DATA$frequency[x]
return(d)}
for(i in 1:nrow(DATA)){
j <- i+1
if (i==fr(i)){
s <- list(ij=c(i=i,j=j), h=i)
break
}else if(i <fr(i) && j>fr(j)){
s <-list(ij=c(i=i,j=j),h=fr(i)*j-fr(j)*i/j-i+fr(i)-fr(j))
}}
I am not sure the formula is correct: in your loop you had j-i, but in the explanation it was i-j. I am also not sure whether the entire i-j+fr(i)-fr(j) is the denominator, and similarly for the numerator. These are simple fixes.
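The doubt about numerator and denominator matters, because R applies the usual operator precedence. Here is a sketch with the values from the question (i = 11, j = 12, frequencies 12 and 10). That the whole of j - i + f(i) - f(j) is the denominator is an assumption; the question does not state it.

```r
# Values taken from the question's DATA at ranks 11 and 12
i <- 11; j <- 12
f_i <- 12; f_j <- 10

# As typed in the question's loop: only f_j*i/j is a quotient
unparenthesized <- f_i*j - f_j*i/j - i + f_i - f_j

# Assumed reading: whole numerator over whole denominator
parenthesized <- (f_i*j - f_j*i) / (j - i + f_i - f_j)

print(unparenthesized)
print(parenthesized)
```

The two expressions give very different values, so it is worth adding the parentheses explicitly whichever reading is intended.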

Sorting a vector in R without using sort function

I am trying to write a function to sort a vector, but without using R's built-in sort function.
My Code:
sorting <- function(x){
for(i in 1:length(x)){
for(j in (i+1):length(x)){
if(x[i] > x[j]){
x[c(i,j)] = x[c(j,i)]
}
}
}
x
}
I get below output:
> x <- c(3,1,4,7,2,9)
> sorting(x)
Error in if (x[i] > x[j]) { : missing value where TRUE/FALSE needed
>
I understand that we'll get above error when the 'IF' condition returns 'NA' instead of TRUE/FALSE.
Is there an issue with the statement:
for(j in (i+1):length(x)){
Python code for same:
def sorting(a):
for i in range(len(a)):
for j in range(i+1,len(a)):
if a[i] > a[j]:
a[i],a[j] = a[j],a[i]
return a
Output:
sorting([3,1,4,7,2,9])
Out[380]: [1, 2, 3, 4, 7, 9]
In Python, the code works fine.
Could someone let me know the issue with my R code?
The problem is with (i+1). When i reaches length(x), (i+1):length(x) counts down from length(x)+1, so j goes out of range and x[j] is NA. I changed the outer loop to stop at (length(x)-1).
sorting <- function(x){
for(i in 1:(length(x)-1)){
for(j in (i+1):length(x)){
if(x[i] > x[j]){
x[c(i,j)] = x[c(j,i)] # +1 for this
}
}
}
x
}
sorting(c(3,1,4,7,2,9))
[1] 1 2 3 4 7 9
Sorting <- function(x){
xs <- rep(0, length(x))
for(i in 1:length(x)){
xs[i] = min(x)
x <- x[x != min(x)]
}
xs
}
Sorting(c(3,1,4,7,2,9))
[1] 1 2 3 4 7 9
Sorting is a function with a numeric vector argument x. Initially xs, the sorted vector of the same length as x, is filled with zeroes. The for loop assigns the minimum value of x to each component of xs in turn (xs[i] = min(x)). We then remove that minimum from x with x <- x[x != min(x)], so that the next iteration finds the next minimum. Finally, xs holds the sorted version of x. Note that x != min(x) drops every copy of the minimum, so this only works when x contains no duplicate values.
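If the input may contain repeated values, one possible variation is to remove a single element per step instead of every copy of the minimum. This is a sketch, not the answerer's code; the name sorting_with_ties is made up for the example.

```r
sorting_with_ties <- function(x) {
  xs <- numeric(length(x))
  for (i in seq_along(x)) {
    k <- which.min(x)   # position of the first minimum
    xs[i] <- x[k]
    x <- x[-k]          # drop only that one element
  }
  xs
}

result <- sorting_with_ties(c(3, 1, 4, 1, 2, 3))
print(result)
```

which.min() returns the position of the first minimum, so the ties are picked up one by one.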

Bubble sort using for loop in R

I'm learning basic algorithms in R. I want to create a bubble sort algorithm using a for loop.
x <- c(20,31,8,46,19)
for (i in 1:length(x)){
if (x[i] > x[i+1]){
temp <- x[i]
x[i] <- x[i+1]
x[i+1] <- temp
cat(x[i],"")
}
else cat (x[i],"")
}
I think that for this particular vector, sorting should take place in three "turns".
1st should give a result: 20 8 31 19 46
2nd: 8 20 19 31 46
3rd: 8 19 20 31 46 (correct one)
This loop works out the 1st turn, except printing the last element.
I also don't know how to work out those other "turns". I predict it's something about the other for loop, but I don't know how to implement it.
You're iterating until the last element in x, but you use x[i] > x[i+1]. So, in the last iteration x[i+1] is out of bounds. Look at this code, which stops the inner loop at length(x)-1 and adds an outer loop that runs from the back to the front of the vector.
bubblesort <- function(x) {
for (j in (length(x)-1):1) {
for (i in j:(length(x)-1)) {
if (x[i] > x[i+1]) {
temp <- x[i]
x[i] <- x[i+1]
x[i+1] <- temp
}
}
}
return(x)
}
x <- c(20,31,8,46,19)
print(x)
print(bubblesort(x))
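To see the "turns" the question asks about, the vector can be printed after every pass. The sketch below is a different arrangement from the function above: a conventional bubble sort that stops as soon as a pass makes no swap. The name bubble_passes is made up for the example.

```r
bubble_passes <- function(x) {
  n <- length(x)
  passes <- list()
  if (n < 2) return(list(sorted = x, passes = passes))
  for (pass in 1:(n - 1)) {
    swapped <- FALSE
    for (i in 1:(n - pass)) {          # the last `pass - 1` elements are already in place
      if (x[i] > x[i + 1]) {
        x[c(i, i + 1)] <- x[c(i + 1, i)]
        swapped <- TRUE
      }
    }
    if (!swapped) break                # nothing moved, so x is sorted
    passes[[length(passes) + 1]] <- x  # remember the state after this pass
    cat("pass", pass, ":", x, "\n")
  }
  list(sorted = x, passes = passes)
}

out <- bubble_passes(c(20, 31, 8, 46, 19))
```

For this vector the three printed passes are the three turns listed in the question.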

R Restrict a cycle

I want to restrict a for loop so that it only performs a task when j is within some range of i (3 units, for example).
I tried the following piece of code:
a <- c(1:100)
b <- c(1:100)
k1 <- length(a)
k2 <- length(b)
for (i in 1:k1){
for (j in 1:k2){
if (j>=i-3 & j<=i+3){
c<-c(a+b)
}
}
}
What I intended was
if i=1, j={1,2,3}, if i=6, j={1,2,3,4,5,6}
This doesn't really work, since j and i still end up running from 1 to 100.
If I understand, the problem is that you are looping through 100 combinations of j, when only three to seven are actually useful.
If this is correct, you can loop through seven iterations of j and filter for values that are positive and within bounds:
width <- 3
for (i in seq_along(a)) {
for (j in (i-width):(i+width)) {
if (j > 0 && j <= length(b)) {
# Do something
}
}
}
When you # Do something in your code, I would advise not assigning to a variable named c, since c() is the base function for combining values.
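A possible variation on the loop above is to clamp the bounds once per i, so that the inner range never leaves the vector and no if is needed. In this sketch the list visited is only there to record which j values are reached for each i.

```r
a <- 1:100
b <- 1:100
width <- 3

visited <- vector("list", length(a))
for (i in seq_along(a)) {
  lo <- max(1, i - width)            # never below the first index
  hi <- min(length(b), i + width)    # never above the last index
  visited[[i]] <- lo:hi              # the j values the inner loop would run over
}

print(visited[[1]])
print(visited[[6]])
```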

Rewriting loops with apply functions

I have the following 3 functions which I would like to make faster. I assume apply functions are the best way to go, but I have never used them, so I have no idea what to do. Any hints, ideas and code snippets will be much appreciated.
n, T, dt are global parameters and par is a vector of parameters.
Function 1: a function to create an m+1,n matrix containing Poisson-distributed jumps with exponentially distributed jump sizes. My trouble here is that I have 3 loops and I am not sure how to incorporate the if statement in the inner loop. I also have no idea whether it is possible to use apply functions on the outer layers of the loops only.
jump <- function(t=0,T=T,par){
jump <- matrix(0,T/dt+1,n) # initializing output matrix
U <- replicate(n,runif(100,t,T)) #matrix used to decide when the jumps will happen
Y <-replicate(n,rexp(100,1/par[6])) #matrix with jump sizes
for (l in 1:n){
NT <- rpois(1,par[5]*T) #number of jumps
k=0
for (j in seq(t,T,dt)){
k=k+1
if (NT>0){
temp=0
for (i in 1:NT){
u <- vector("numeric",NT)
if (U[i,l]>j){ u[i]=0
}else u[i]=1
temp=temp+Y[i,l]*u[i]
}
jump[k,l]=temp
}else jump[k,l]=0
}
}
return(jump)
}
Function 2: calculates a default intensity, based on Brownian motions and the jumps from function 1. Here my trouble is how to use apply functions when the variable used for the calculation is the values from the row above in the output matrix AND how to get the right values from the external matrices which are used in the calculations (BMz_C & J)
lambda <- function(t=0,T=T,par,fit=0){
lambda <- matrix(0,m+1,n) # matrix to hold intensity path output
lambda[1,] <- par[4] #initializing start value of the intensity path.
J <- jump(t,T,par) #matrix containing jumps
for(i in 2:(m+1)){
dlambda <- par[1]*(par[2]-max(lambda[i-1,],0))*dt+par[3]*sqrt(max(lambda[i- 1,],0))*BMz_C[i,]+(J[i,]-J[i-1,])
lambda[i,] <- lambda[i-1,]+dlambda
}
return(lambda)
}
Function 3: calculates a survival probability based on the intensity from function 2. Here a() and B() are functions that return numerical values. My problem here is that both i and j are used, because i is not always an integer and thus cannot be used to index the external matrix. I have earlier tried to use i/dt, but sometimes it would overwrite one line and skip the next lines in the matrix, most likely due to rounding errors.
S <- function(t=0,T=T,par,plot=0, fit=0){
S <- matrix(0,(T-t)/dt+1,n)
if (fit > 0) S.fit <- matrix(0,1,length(mat)) else S.fit <- 0
l=lambda(t,T,par,fit)
j=0
for (i in seq(t,T,dt)){
j=j+1
S[j,] <- a(i,T,par)*exp(B(i,T,par)*l[j,])
}
return(S)
}
Sorry for the long post, any help for any of the functions will be much appreciated.
EDIT:
First of all thanks to digEmAll for the great reply.
I have now worked on vectorising function 2. First I tried
lambda <- function(t=0,T=T,par,fit=0){
lambda <- matrix(0,m+1,n) # matrix to hold intensity path input
J <- jump(t,T,par,fit)
lambda[1,] <- par[4]
lambda[2:(m+1),] <- sapply(2:(m+1), function(i){
lambda[i-1,]+par[1]*(par[2]-max(lambda[i-1,],0))*dt+par[3]*sqrt(max(lambda[i-1,],0))*BMz_C[i,]+(J[i,]-J[i-1,])
})
return(lambda)
}
but it would only produce the first column. So I tried a two step apply function.
lambda <- function(t=0,T=T,par,fit=0){
lambda <- matrix(0,m+1,n) # matrix to hold intensity path input
J <- jump(t,T,par,fit)
lambda[1,] <- par[4]
lambda[2:(m+1),] <- sapply(1:n, function(l){
sapply(2:(m+1), function(i){
lambda[i-1,l]+par[1]*(par[2]-max(lambda[i-1,l],0))*dt+par[3]*sqrt(max(lambda[i-1,l],0))*BMz_C[i,l]+(J[i,l]-J[i-1,l])
})
})
return(lambda)
}
This seems to work, but only on the first row; all rows after that have an identical non-zero value, as if lambda[i-1] is not used in the calculation of lambda[i]. Does anyone have an idea how to manage that?
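A likely explanation for the behaviour described in the edit: inside sapply, every call reads lambda as it was before the assignment, when only row 1 was filled, so the recursion never sees the rows it has just computed. Because each row depends on the previous one, a loop over rows that is vectorized across columns is one way around it. The sketch below uses small made-up inputs (BMz_C and J are stand-ins) and a hypothetical function name lambda_path. It also swaps max for pmax, on the assumption that an element-wise floor at zero was intended.

```r
# Hypothetical small inputs, chosen only for illustration
set.seed(1)
m <- 4; n <- 3; dt <- 0.1
par <- c(0.5, 0.2, 0.1, 0.3)                   # only par[1..4] are used here
BMz_C <- matrix(rnorm((m + 1) * n), m + 1, n)  # stand-in for the Brownian increments
J <- matrix(0, m + 1, n)                       # stand-in for the jump matrix (no jumps)

lambda_path <- function(par, BMz_C, J, dt) {
  m <- nrow(J) - 1
  lambda <- matrix(0, m + 1, ncol(J))
  lambda[1, ] <- par[4]
  for (i in 2:(m + 1)) {                       # rows must be filled in order
    prev <- pmax(lambda[i - 1, ], 0)           # element-wise floor (assumption)
    lambda[i, ] <- lambda[i - 1, ] +
      par[1] * (par[2] - prev) * dt +
      par[3] * sqrt(prev) * BMz_C[i, ] +
      (J[i, ] - J[i - 1, ])
  }
  lambda
}

lam <- lambda_path(par, BMz_C, J, dt)
print(dim(lam))
```

Each iteration already handles all n columns at once, so the remaining loop runs only m times.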
I'm going to explain, step by step, how to vectorize the first function (one possible way of vectorizing it, maybe not the best one for your case).
For the other 2 functions, you can apply the same concepts and you should be able to do it.
Here, the key concept is: start vectorizing from the innermost loop.
1) First of all, rpois can generate more than one random value at a time, but you are calling it n times and asking for one random value each time. So, let's take it out of the loop, obtaining this:
jump <- function(t=0,T=T,par){
jump <- matrix(0,T/dt+1,n)
U <- replicate(n,runif(100,t,T))
Y <-replicate(n,rexp(100,1/par[6]))
NTs <- rpois(n,par[5]*T) # note the change
for (l in 1:n){
NT <- NTs[l] # note the change
k=0
for (j in seq(t,T,dt)){
k=k+1
if (NT>0){
temp=0
for (i in 1:NT){
u <- vector("numeric",NT)
if (U[i,l]>j){ u[i]=0
}else u[i]=1
temp=temp+Y[i,l]*u[i]
}
jump[k,l]=temp
}else jump[k,l]=0
}
}
return(jump)
}
2) Similarly, it is inefficient to call seq(t,T,dt) n times inside the loop, since it always generates the same sequence. So, let's take it out of the loop and store it in a vector, obtaining this:
jump <- function(t=0,T=T,par){
jump <- matrix(0,T/dt+1,n)
U <- replicate(n,runif(100,t,T))
Y <-replicate(n,rexp(100,1/par[6]))
NTs <- rpois(n,par[5]*T)
js <- seq(t,T,dt) # note the change
for (l in 1:n){
NT <- NTs[l]
k=0
for (j in js){ # note the change
k=k+1
if (NT>0){
temp=0
for (i in 1:NT){
u <- vector("numeric",NT)
if (U[i,l]>j){ u[i]=0
}else u[i]=1
temp=temp+Y[i,l]*u[i]
}
jump[k,l]=temp
}else jump[k,l]=0
}
}
return(jump)
}
3) Now, let's have a look at the innermost loop:
for (i in 1:NT){
u <- vector("numeric",NT)
if (U[i,l]>j){ u[i]=0
}else u[i]=1
temp=temp+Y[i,l]*u[i]
}
this is equal to :
u <- as.integer(U[1:NT,l]<=j)
temp <- sum(Y[1:NT,l]*u)
or, in one-line:
temp <- sum(Y[1:NT,l] * as.integer(U[1:NT,l] <= j))
hence, now the function can be written as :
jump <- function(t=0,T=T,par){
jump <- matrix(0,T/dt+1,n)
U <- replicate(n,runif(100,t,T))
Y <-replicate(n,rexp(100,1/par[6]))
NTs <- rpois(n,par[5]*T)
js <- seq(t,T,dt)
for (l in 1:n){
NT <- NTs[l]
k=0
for (j in js){
k=k+1
if (NT>0){
jump[k,l] <- sum(Y[1:NT,l]*as.integer(U[1:NT,l]<=j)) # note the change
}else jump[k,l]=0
}
}
return(jump)
}
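The equivalence used in step 3 can be checked on small inputs. In this sketch the values of NT, l, j and the one-column matrices U and Y are made up for the check.

```r
set.seed(42)
NT <- 5; l <- 1; j <- 0.5
U <- matrix(runif(10), 10, 1)   # hypothetical jump times
Y <- matrix(rexp(10), 10, 1)    # hypothetical jump sizes

# Loop version: add Y[i, l] whenever the jump time is not after j
temp_loop <- 0
for (i in 1:NT) {
  u <- if (U[i, l] > j) 0 else 1
  temp_loop <- temp_loop + Y[i, l] * u
}

# Vectorized version from step 3
temp_vec <- sum(Y[1:NT, l] * as.integer(U[1:NT, l] <= j))

print(all.equal(temp_loop, temp_vec))
```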
4) Again, let's have a look at the current innermost loop:
for (j in js){
k=k+1
if (NT>0){
jump[k,l] <- sum(Y[1:NT,l]*as.integer(U[1:NT,l]<=j)) # note the change
}else jump[k,l]=0
}
As you can see, NT does not depend on the iteration of this loop, so the inner if can be moved outside, as follows:
if (NT>0){
for (j in js){
k=k+1
jump[k,l] <- sum(Y[1:NT,l]*as.integer(U[1:NT,l]<=j)) # note the change
}
}else{
for (j in js){
k=k+1
jump[k,l]=0
}
}
This looks worse than before, and it is, but now the two branches can each be turned into a one-liner (note the use of sapply¹):
if (NT>0){
jump[1:length(js),l] <- sapply(js,function(j){ sum(Y[1:NT,l]*as.integer(U[1:NT,l]<=j)) })
}else{
jump[1:length(js),l] <- 0
}
obtaining the following jump function:
jump <- function(t=0,T=T,par){
jump <- matrix(0,T/dt+1,n)
U <- replicate(n,runif(100,t,T))
Y <-replicate(n,rexp(100,1/par[6]))
NTs <- rpois(n,par[5]*T)
js <- seq(t,T,dt)
for (l in 1:n){
NT <- NTs[l]
if (NT>0){
jump[1:length(js),l] <- sapply(js,function(j){ sum(Y[1:NT,l]*as.integer(U[1:NT,l]<=j)) })
}else{
jump[1:length(js),l] <- 0
}
}
return(jump)
}
5) Finally, we can get rid of the last loop, again using the sapply¹ function, obtaining the final jump function:
jump <- function(t=0,T=T,par){
U <- replicate(n,runif(100,t,T))
Y <-replicate(n,rexp(100,1/par[6]))
js <- seq(t,T,dt)
NTs <- rpois(n,par[5]*T)
jump <- sapply(1:n,function(l){
NT <- NTs[l]
if (NT>0){
sapply(js,function(j){ sum(Y[1:NT,l]*as.integer(U[1:NT,l]<=j)) })
}else {
rep(0,length(js))
}
})
return(jump)
}
(¹)
The sapply function is pretty easy to use. For each element of the list or vector passed in the X parameter, it applies the function passed in the FUN parameter, e.g.:
vect <- 1:3
sapply(X=vect,FUN=function(el){el+10})
# [1] 11 12 13
Since the simplify parameter is TRUE by default, the result is coerced to the simplest possible object. So in the previous case the result becomes a vector, while in the following example the result becomes a matrix (since for each element we return a vector of the same size):
vect <- 1:3
sapply(X=vect,FUN=function(el){rep(el,5)})
# [,1] [,2] [,3]
# [1,] 1 2 3
# [2,] 1 2 3
# [3,] 1 2 3
# [4,] 1 2 3
# [5,] 1 2 3
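If the shape of the result should not depend on simplification, base R also has vapply, which takes the expected type and length of each result in FUN.VALUE. A short sketch of the same two examples, offered as an alternative rather than as part of the answer:

```r
vect <- 1:3

# Each call must return exactly one numeric value
as_vector <- vapply(vect, function(el) el + 10, FUN.VALUE = numeric(1))

# Each call must return an integer vector of length 5, giving a 5 x 3 matrix
as_matrix <- vapply(vect, function(el) rep(el, 5), FUN.VALUE = integer(5))

print(as_vector)
print(dim(as_matrix))
```

vapply stops with an error if a call returns something of a different type or length, which makes mistakes easier to spot.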
Benchmark :
The following benchmark just gives you an idea of the speed gain; actual performance may differ depending on your input parameters.
As you can imagine, jump_old corresponds to your original function 1, while jump_new is the final vectorized version.
# let's use some random parameters
n = 10
m = 3
T = 13
par = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6)
dt <- 3
set.seed(123)
system.time(for(i in 1:5000) old <- jump_old(T=T,par=par))
# user system elapsed
# 12.39 0.00 12.41
set.seed(123)
system.time(for(i in 1:5000) new <- jump_new(T=T,par=par))
# user system elapsed
# 4.49 0.00 4.53
# check if last results of the 2 functions are the same:
isTRUE(all.equal(old,new))
# [1] TRUE
