Quicksort in R - array sorted after k steps

I am pretty new to R, so my code definitely needs some improvement. What I want is to run quicksort on an array of n elements, count the number of comparisons made, and output the array as it stands after k comparisons.
So far, I have reused the code for a quicksort algorithm found here:
quickSort <- function(arr) {
  # Pick a number at random.
  mid <- sample(arr, 1)
  print(arr)
  print(mid)
  # Place-holders for left and right values.
  left <- c()
  right <- c()
  # Move all the smaller values to the left, bigger values to the right.
  lapply(arr[arr != mid], function(d) {
    count <<- count + 1
    stopifnot(count <= k)
    if (d < mid) {
      left <<- c(left, d)
    }
    else {
      right <<- c(right, d)
    }
  })
  if (length(left) > 1) {
    left <- quickSort(left)
  }
  if (length(right) > 1) {
    right <- quickSort(right)
  }
  # Finally, return the sorted values.
  c(left, mid, right)
}
I am currently struggling with several things:
How can I get not only the partial vector that is currently being sorted but also the full vector?
Did I put the right stopping condition in the right place?
An example of what I want:
given an array (2,4,1,3,5) and the first pivot element 3, after four comparisons I would want the output to be (2,1,3,4,5).
Any help would be greatly appreciated!
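One way to keep the full vector available is to sort index ranges of a single shared vector instead of passing sub-vectors around. The following is only a minimal sketch under several assumptions: the names quicksort_k, sort_range and first_pivot are hypothetical, later pivots are simply the first element of each range rather than random, and when the budget runs out in the middle of a partition the elements not yet compared are left after the partitioned ones.

```r
quicksort_k <- function(arr, k, first_pivot = NULL) {
  a <- arr      # the full vector, shared by all recursive calls
  count <- 0    # number of element-vs-pivot comparisons made so far

  # Partition a[lo:hi] around the element at index p, then recurse.
  sort_range <- function(lo, hi, p = lo) {
    if (lo >= hi || count >= k) return(invisible(NULL))
    x <- a[lo:hi]
    pivot <- x[p - lo + 1]
    rest <- x[-(p - lo + 1)]
    n_cmp <- min(length(rest), k - count)  # comparisons still affordable
    seen <- rest[seq_len(n_cmp)]
    unseen <- rest[seq_len(length(rest) - n_cmp) + n_cmp]
    count <<- count + n_cmp
    left <- seen[seen < pivot]
    right <- seen[seen >= pivot]
    a[lo:hi] <<- c(left, pivot, right, unseen)
    if (length(unseen) > 0) return(invisible(NULL))  # budget exhausted
    mid <- lo + length(left)                         # final pivot position
    sort_range(lo, mid - 1)
    sort_range(mid + 1, hi)
  }

  p0 <- if (is.null(first_pivot)) 1 else first_pivot
  sort_range(1, length(a), p0)
  list(result = a, comparisons = count)
}

# The example from the question: first pivot is 3 (index 4), k = 4.
quicksort_k(c(2, 4, 1, 3, 5), k = 4, first_pivot = 4)$result
#> [1] 2 1 3 4 5
```

Returning the count alongside the vector avoids the global count and k variables, and checking the budget before each partition replaces the stopifnot call, which aborts with an error instead of returning the partial result.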

Related

How to implement a function with a sum inside in R?

I am trying to define a function containing a for loop with a conditional inside it, in RStudio. Yesterday, with the help of another thread, I came up with the code below. The problem is that I want to sum the elements of the vector ma for any possible x, and to do so inside the function l. This is a simplified version of the problem, which I plan to adapt to my original model. However, I do not know how to proceed.
ma<-rep(0,20)
l <- function(x, ma) {
  for(i in seq_along(ma)) {
    if(i %% 2 == 1) {
      ma[i] <- i + x
    } else {
      ma[i] <- 0
    }
  }
  return(ma)
}
My problem is that I would like to have the sum of i+x+0+i+x... for any possible x. I mean a function of the kind for any possible x.
Question:
Can someone explain to me how to implement such a function in R?
Thanks in advance!
I am going to update the original function:
Theta_alpha_s<-function(s,alpha,t,Basis){
  for (i in seq_along(Basis)){
    if(i%% 2==1) {Basis[i]=s*i^{-alpha-0.5}*sqrt(2)*cos(2*pi*i*t)}
    else{Basis[i]=s*i^{-alpha-0.5}*sqrt(2)*sin(2*pi*i*t)}
  }
  return(Basis)
}
If you don't want to change the values in Basis, you can create a new vector in the function (here result) that you will return:
l = function(s,alpha,t,Basis){
  # Split on the parity of the index (as in the original loop), not of the values
  idx = seq_along(Basis)
  is.odd = idx[idx %% 2 == 1]
  not.odd = idx[idx %% 2 == 0]
  result = rep(NA, length(Basis))
  result[is.odd] = s*is.odd^{-alpha-0.5}*sqrt(2)*cos(2*pi*is.odd*t)
  result[not.odd] = s*not.odd^{-alpha-0.5}*sqrt(2)*sin(2*pi*not.odd*t)
  #return(result)
  return(c(sum(result[is.odd]), sum(result[not.odd])))
}
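For the simpler function from the question, the sum can also be computed without a loop. This is a minimal sketch, assuming the vector length is passed as a number; sum_ma and n are hypothetical names that do not appear in the original code.

```r
# Sum of (i + x) over odd i in 1..n; even positions contribute 0.
sum_ma <- function(x, n = 20) {
  i <- seq_len(n)
  sum(ifelse(i %% 2 == 1, i + x, 0))
}

sum_ma(0)
#> [1] 100
sapply(c(0, 1, 2), sum_ma)  # evaluate the sum for several values of x
#> [1] 100 110 120
```

With n = 20 there are ten odd indices summing to 100, so each unit increase in x adds 10 to the result.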

How to make sequences that may be empty?

I recently tried to adapt some pseudocode for an in-place quicksort, quoted below:
function quicksort(array)
    if length(array) > 1
        pivot := select any element of array
        left := first index of array
        right := last index of array
        while left ≤ right
            while array[left] < pivot
                left := left + 1
            while array[right] > pivot
                right := right - 1
            if left ≤ right
                swap array[left] with array[right]
                left := left + 1
                right := right - 1
        quicksort(array from first index to right)
        quicksort(array from left to last index)
Following this, I wrote this code:
quicksort<-function(array)
{
  len<-length(array)
  if(len>1)
  {
    left<-1
    right<-len
    pivot<-array[(left+right)%/%2]
    while(left<=right)
    {
      while(array[left]<pivot){left<-left+1}
      while(array[right]<pivot){right<-right-1}
      if(left<=right)
      {
        array[c(left,right)]<-array[c(right,left)]
        left<-left+1
        right<-right-1
      }
    }
    array<-quicksort(array[1:right])#Bug here
    array<-quicksort(array[left:len])
  }
  array
}
If you run this code with more than one integer as input, you'll find that it eventually tries to sort vectors of NAs. I suspect that the problem is that the pseudocode wants array from first index to right - my 1:right - to be read as an empty sequence when right is zero. As is well known, R actually reads 1:0 as the descending sequence 1 0.
Is there any function to get the behavior that I intend? I could do this with an if statement, but R is usually good enough with sequences that I can't help but think that there will be a better way. I tried to use seq, but it will throw an error if you try to use anything like seq(from=1,to=0,by=1).
There are two issues with the code:
One of the inequality signs is flipped.
The array assignment needs to be modified to only change a subset of elements rather than replace (and shrink) the whole array.
Here is the corrected code
quicksort <- function(array) {
  len <- length(array)
  if(len > 1) {
    left <- 1
    right <- len
    pivot <- array[(left+right)%/%2]
    while(left <= right)
    {
      while(array[left] < pivot){
        left <- left + 1
      }
      while(array[right] > pivot) { # Changed "<" to ">"
        right <- right - 1
      }
      if(left <= right) {
        array[c(left, right)] <- array[c(right, left)]
        left <- left + 1
        right <- right - 1
      }
    }
    # Modified the following two lines to only set a subset of array
    array[1:right] <- quicksort(array[1:right])
    array[left:len] <- quicksort(array[left:len])
  }
  array
}
quicksort(c(2, 6, 3, 1, 4, 5))
#> [1] 1 2 3 4 5 6
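On the original question of sequences that may be empty: seq_len(n) returns an empty vector when n is 0, unlike 1:n. For ranges that do not start at 1, a small helper can be built on top of it. The sketch below uses a hypothetical name, seq_up, which is not part of base R.

```r
# Ascending sequence from `from` to `to`; empty when to < from.
seq_up <- function(from, to) {
  seq_len(max(0, to - from + 1)) + from - 1
}

right <- 0
1:right          # counts down instead of being empty
#> [1] 1 0
seq_len(right)   # empty, as the pseudocode expects
#> integer(0)

x <- c(10, 20, 30)
x[seq_up(1, right)]   # selects nothing
#> numeric(0)
x[seq_up(2, 3)]
#> [1] 20 30
```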

Error message in Bubble sort code in R language

I wrote a bubble sort in R. Sometimes it runs without any error message, but sometimes it fails with "Error in if (x[i] > x[i + 1]) { : argument is of length zero". Can anyone help me find what is wrong with it? My code is below.
example <- function(x) {
  n <- length(x)
  repeat {
    hasChanged <- FALSE
    n <- n - 1
    for(i in 1:n) {
      if ( x[i] > x[i+1] ) {
        temp <- x[i]
        x[i] <- x[i+1]
        x[i+1] <- temp
        hasChanged <- TRUE
        cat("The current Vector is", x ,"\n")
      }
    }
    if ( !hasChanged ) break;
  }
}
x <-sample(1:10,5)
cat("The original Vector is", x ,"\n")
example(x)
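The error occurs because you are iteratively decreasing n. Depending on the original vector's order (or lack thereof), n can reach the value of 1 after the last change. In that case, the next iteration reduces n to 0, and 1:n becomes 1:0, which is the sequence 1 0 rather than an empty one. The loop therefore reaches i = 0, and x[0] is a zero-length vector, so the comparison x[i] > x[i+1] has length zero and if() fails.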
With a minimal correction your code will work properly, without giving error messages. Try to replace the line
if ( !hasChanged ) break;
with
if ( !hasChanged || n==1 ) break
Basically you have two termination criteria: Either nothing has been changed in the previous iteration or n is equal to one. In both cases, a further iteration won't change the vector since it is already ordered.
By the way, in R programming you don't need a semicolon at the end of a command. It is tolerated/ignored by the interpreter, but it clutters the code and is not considered good programming style.
Hope this helps.
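As an alternative to the extra termination test, the loop bounds themselves can be made safe. This is a minimal sketch rather than the original poster's code: bubble_sort is a hypothetical name, the progress messages are omitted, and the function returns the sorted vector, which the original function does not.

```r
bubble_sort <- function(x) {
  n <- length(x)
  while (n > 1) {                 # nothing left to compare once n reaches 1
    has_changed <- FALSE
    for (i in seq_len(n - 1)) {   # seq_len() never counts down to 0
      if (x[i] > x[i + 1]) {
        x[c(i, i + 1)] <- x[c(i + 1, i)]   # swap neighbours
        has_changed <- TRUE
      }
    }
    if (!has_changed) break
    n <- n - 1                    # the largest element is now in place
  }
  x
}

bubble_sort(c(3, 1, 2))
#> [1] 1 2 3
```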

R - Arrays with variable dimension

I have a weird question..
Essentially, I have a function which takes a data frame of dimension Nx(2k) and transforms it into an array of dimension Nx2xk. I then further use that array in various locations in the function.
My issue is this: when k == 1, I'm left with a matrix of dimension Nx2, and even worse, if N = 1, I'm stuck with a matrix of dimension 1x2.
I would like to write myArray[thisRow,,] to select that slice of the array, but this falls short for the N = 1, k = 1 case. I tried myArray[thisRow,,,drop = FALSE] but that gives an 'incorrect number of dimensions' error. The same issue arises for the Nx2 case.
Is there a workaround for this issue, or do I need to break my code into cases?
Sample Code Shown Below:
library(abind)  # abind() comes from the abind package
thisFunction <- function(myDF)
{
  nGroups = NCOL(myDF)/2
  afMyArray = myDF
  if(nGroups > 1)
  {
    afMyArray = abind(lapply(1:nGroups, function(g){myDF[,2*(g-1) + 1:2]}),
                      along = 3)
  }
  sapply(1:NROW(myDF),
         function(r)
         {
           thisSlice = afMyArray[r,,]
           *some operation on thisSlice*
         })
}
Thanks,
James
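One possible workaround is to build the array with an explicit dim attribute, so that it always has three dimensions, even when N or k is 1. The sketch below assumes that all columns of the data frame are numeric and that each consecutive pair of columns forms one group; to_array and slice_row are hypothetical helper names.

```r
# Always returns an N x 2 x k array, including the N = 1 and k = 1 cases.
to_array <- function(myDF) {
  m <- as.matrix(myDF)
  k <- ncol(m) %/% 2
  array(m, dim = c(nrow(m), 2, k))
}

# Always returns a 2 x k matrix for row r, whatever got dropped by `[`.
slice_row <- function(arr, r) {
  matrix(arr[r, , ], nrow = 2)
}

arr <- to_array(data.frame(a = 1, b = 2))   # N = 1, k = 1
dim(arr)
#> [1] 1 2 1
dim(slice_row(arr, 1))
#> [1] 2 1
```

Because the array is filled in column-major order, columns 1 and 2 of the data frame land in arr[, , 1], columns 3 and 4 in arr[, , 2], and so on. With a true three-dimensional array, arr[r, , , drop = FALSE] also works and keeps all three dimensions.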

Finding duplicate values in r

So, In a string containing multiple 1's,
Now, it is possible that, the number
'1'
appears at several positions, let's say, at multiple positions. What I want is
(3)
This is not a complete answer, but some ideas (partly based on comments):
z <- "1101101101"
zz <- as.numeric(strsplit(z,"")[[1]])
Compute autocorrelation function and draw plot: in this case I'm getting the periodicity=3 pretty crudely as the first point at which there is an increase followed by a decrease ...
a1 <- acf(zz)
first.peak <- which(diff(sign(diff(a1$acf[,,1])))==-2)[1]
Now we know the periodicity is 3; create runs of 3 with embed() and analyze their similarities:
ee <- embed(zz,first.peak)
pp <- apply(ee,1,paste,collapse="")
mm <- outer(pp,pp,"==")
aa <- apply(mm[!duplicated(mm),],1,which)
sapply(aa,length) ## 3 3 2 ## number of repeats
sapply(aa,function(x) unique(diff(x))) ## 3 3 3
The following code does exactly what you ask for. Try it with str_groups('1101101101'). It returns a list of 3-vectors. Note that the first triple is (1, 3, 4) because the character at the 10th position is also a 1.
Final version, optimized and without errors
str_groups <- function (s) {
  digits <- as.numeric(strsplit(s, '')[[1]])
  index1 <- which(digits == 1)
  len <- length(digits)
  back <- length(index1)
  if (back == 0) return(list())
  maxpitch <- (len - 1) %/% 2
  patterns <- matrix(0, len, maxpitch)
  result <- list()
  for (pitch in 1:maxpitch) {
    # seq_len() gives an empty candidate set for pitch 1, where 1:0 would count down
    divisors <- which(pitch %% seq_len(pitch %/% 2) == 0)
    while (index1[back] > len - 2 * pitch) {
      back <- back - 1
      if (back == 0) return(result)
    }
    for (startpos in index1[1:back]) {
      if (patterns[startpos, pitch] != 0) next
      pos <- seq(startpos, len, pitch)
      if (digits[pos[2]] != 1 || digits[pos[3]] != 1) next
      repeats <- length(pos)
      if (repeats > 3) for (i in 4:repeats) {
        if (digits[pos[i]] != 1) {
          repeats <- i - 1
          break
        }
      }
      continue <- FALSE
      for (subpitch in divisors) {
        sublen <- patterns[startpos, subpitch]
        if (sublen > pitch / subpitch * (repeats - 1)) {
          continue <- TRUE
          break
        }
      }
      if (continue) next
      for (i in 1:repeats) patterns[pos[i], pitch] <- repeats - i + 1
      result <- append(result, list(c(startpos, pitch, repeats)))
    }
  }
  return(result)
}
Note: this algorithm has roughly quadratic runtime complexity, so if you make your strings twice as long, it will take four times as much time to find all patterns on average.
Pseudocode version
The following pseudocode is meant to aid understanding of the code. For particulars of R functions such as which, consult the R online documentation, for example by running ?which on the R command line.
PROCEDURE str_groups WITH INPUT $s (a string of the form /(0|1)*/):
    digits := array containing the digits in $s
    index1 := positions of the digits in $s that are equal to 1
    len := pointer to last item in $digits
    back := pointer to last item in $index1
    IF there are no items in $index1, EXIT WITH empty list
    maxpitch := the greatest possible interval between 1-digits, given $len
    patterns := array with $len rows and $maxpitch columns, initially all zero
    result := array of triplets, initially empty
    FOR EACH possible $pitch FROM 1 TO $maxpitch:
        divisors := array of divisors of $pitch (including 1, excluding $pitch)
        UPDATE $back TO the last position at which a pattern could start;
            IF no such position remains, EXIT WITH result
        FOR EACH possible $startpos IN $index1 up to $back:
            IF $startpos is marked as part of a pattern, SKIP TO NEXT $startpos
            pos := possible positions of pattern members given $startpos, $pitch
            IF either the 2nd or 3rd $pos is not 1, SKIP TO NEXT $startpos
            repeats := the number of positions in $pos
            IF there are more than 3 positions in $pos THEN
                count how long the pattern continues
                UPDATE $repeats TO the length of the pattern
            END IF (more than 3 positions)
            FOR EACH possible $subpitch IN $divisors:
                check $patterns for pattern with interval $subpitch at $startpos
                IF such a pattern is found AND it envelopes the current pattern,
                    SKIP TO NEXT $startpos
                    (using helper variable $continue to cross two loop levels)
                END IF (pattern found)
            END FOR (subpitch)
            FOR EACH consecutive position IN the pattern:
                UPDATE $patterns at row of position and column of $pitch TO ...
                    ... the remaining length of the pattern at that position
            END FOR (position)
            APPEND the triplet ($startpos, $pitch, $repeats) TO $result
        END FOR (startpos)
    END FOR (pitch)
    EXIT WITH $result
END PROCEDURE (str_groups)
Perhaps the following route will help:
Convert the string to a vector of integer digits:
v <- as.integer(strsplit(s, "")[[1]])
Repeatedly convert this vector to matrices of varying number of rows...
m <- matrix(v, nrow=...)
...and use rle to find relevant patterns in the rows of the matrix m:
rle(m[1, ]); rle(m[2, ]); ...
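The route above can be sketched for a single candidate number of rows as follows. This is only an illustration under assumptions: runs_at_pitch and its pitch argument are hypothetical names, and the vector is padded with zeros so that its length is a multiple of the number of rows, a detail the outline above does not specify.

```r
# Lay the digits out column by column in `pitch` rows, so that row i holds
# positions i, i + pitch, i + 2 * pitch, ...; then run-length encode each row.
runs_at_pitch <- function(s, pitch) {
  v <- as.integer(strsplit(s, "")[[1]])
  pad <- (-length(v)) %% pitch              # zeros needed to fill the last column
  m <- matrix(c(v, rep(0L, pad)), nrow = pitch)
  lapply(seq_len(pitch), function(i) rle(m[i, ]))
}

runs <- runs_at_pitch("1101101101", 3)
runs[[1]]$lengths   # positions 1, 4, 7, 10 are all 1
#> [1] 4
runs[[1]]$values
#> [1] 1
```

A run of three or more 1s in a row of the matrix corresponds to 1-digits repeating at that interval in the original string.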
