Largest Difference within a list of integers - recursion

I've been trying to write a recursive function that takes a list of numbers such as [5;6;7;8;2;3;4] and returns 3, the largest difference between the first and last element of any run of ascending numbers in the list, but I have no idea where to start.

The function should take three arguments: the lowest number of the current sequence, the highest number of the current sequence, and the remaining list. If the head of the remaining list is bigger than the highest number of the current sequence, just recurse with it as the new highest number. Otherwise, return the maximum of the current difference and the recursive call on the remainder, with the current head as both minimum and maximum. When the remaining list is empty, return the current difference.
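The recursion described above can be sketched as follows. The question's list syntax is not R, so this is only an illustration written in R to match the other snippets on this page; the names largest_diff and go are hypothetical, and returning 0 for an empty input is an assumption the question does not settle.

```r
# Largest difference (last - first) within any strictly ascending run.
# Assumption: an empty input yields 0.
largest_diff <- function(xs) {
  if (length(xs) == 0) return(0)

  # lo / hi: lowest and highest value of the current run
  # remaining: the part of the list not yet visited
  go <- function(lo, hi, remaining) {
    if (length(remaining) == 0) return(hi - lo)
    x <- remaining[1]
    if (x > hi) {
      # The run continues: x becomes the new highest value.
      go(lo, x, remaining[-1])
    } else {
      # The run ends: compare it with the best run in the remainder,
      # which starts fresh with x as both minimum and maximum.
      max(hi - lo, go(x, x, remaining[-1]))
    }
  }

  go(xs[1], xs[1], xs[-1])
}

largest_diff(c(5, 6, 7, 8, 2, 3, 4))
```

For the example list the runs are 5..8 (difference 3) and 2..4 (difference 2), so the call returns 3.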

Related

Compute nearly equal pattern of a string

Find near-duplicate strings. Hi, I know R has the match, unique, and duplicated functions, but none of these does what I really need. I have a single column in my dataset that I need to go through to check whether the numbers are nearly the same. For instance, the first element and the second are nearly equal, except for the number '9'. The second and the third are nearly equal, except for the last digit of the sequence: one ends with 6 while the other ends with 5. Lastly, the last two numbers are 100% equal. If I used the unique() function, only the last case would be correctly excluded.
I'm wondering if there is a function that can flag nearly equal values, maybe by calculating a percentage of equality, so I can focus my attention on the cases with a high equality rate.
dat <- data.frame(text = c("87775956",
"987775956",
"987775955",
"987481732",
"987481732"))
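One possible way to get a "percentage of equality" is the edit distance from base R's adist(), scaled by the length of the longer string. This is a sketch, not an answer from the original thread; the 0.85 threshold and the names d, sim, and pairs are assumptions chosen for illustration.

```r
dat <- data.frame(text = c("87775956", "987775956", "987775955",
                           "987481732", "987481732"),
                  stringsAsFactors = FALSE)

# Pairwise Levenshtein (edit) distances between all strings.
d <- adist(dat$text)

# Scale to a similarity in [0, 1]: 1 means identical strings.
len <- nchar(dat$text)
sim <- 1 - d / outer(len, len, pmax)

# Flag each pair (i < j) whose similarity reaches the assumed threshold.
pairs <- which(sim >= 0.85 & upper.tri(sim), arr.ind = TRUE)
pairs
```

With this threshold the flagged pairs are elements 1 and 2, 2 and 3, and 4 and 5, which are the three cases described in the question.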

rle command counting changes in vector

n <- length(rle(sign(z)))
z contains only 1 and -1. n should be the number of times the sign of z changes.
The code above does not lead to the desired outcome. If I expand the command to
length(rle(sign(z))[[1]])
it works. What is the underlying mechanism by which [[1]] solves the problem?
rle returns a list consisting of two components: lengths, and values. As such, its own length is always 2. By contrast, you want to know the length of either of those components (they obviously have the same length). So either length(rle(…)[[1]]) or length(rle(…)[[2]]) would work. Better to use the names instead of an index though, e.g.
length(rle(z)$lengths)
However, this won’t be the number of times the sign changes; rather, it will be the number of sign changes plus 1, because it counts runs and there is always one more run than there are changes.
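A small example makes the difference between the list length, the number of runs, and the number of sign changes concrete. The vector z below is made up for illustration.

```r
z <- c(1, 1, -1, -1, -1, 1, -1)   # hypothetical data: the sign changes 3 times

r <- rle(sign(z))

length(r)            # 2: the list has two components, lengths and values
length(r$lengths)    # 4: the number of runs
n <- length(r$lengths) - 1   # number of sign changes

# Cross-check without rle: count positions where neighbours differ.
n_alt <- sum(diff(sign(z)) != 0)
```

Both n and n_alt are 3 here.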

How to group data into unequal ranges and assign a value to those ranges in R? [duplicate]

I'm trying to make a function that determines which bucket a given value falls into, based on a given vector. So my function has two inputs: a vector determining the break points for the buckets (e.g. if the vector is (1,4,5,10), the buckets would be <=1, 1 to 4, 4 to 5, 5 to 10, and >10) and a number. I want the function to output a value identifying the bucket.
For example, for an input of 0.9 the output could be 1; for 1.6, 4; for 5.8, 10; and for 13, "10+".
The way I'm doing it right now is I first check if the input number is bigger than the vector's largest element or smaller than the vector's smallest element. If not, I then run a for loop (can't figure out how to use apply) to check if the number is in each specific interval. The problem is this is way too inefficient because I'm dealing with a large data set. Does anyone know an efficient way to do this?
The cut() function is convenient for bucketing: cut(splitme, breaks = vectorwithsplits).
However, it looks like you're actually trying to figure out an insertion point. You need something like binary search.
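Base R's findInterval() performs this kind of search over sorted break points and is vectorized, so no loop is needed. The sketch below assumes each bucket includes its upper bound (as the question's examples suggest); the helper name bucket and the labels vector are hypothetical.

```r
breaks <- c(1, 4, 5, 10)
labels <- c("1", "4", "5", "10", "10+")   # one label per bucket, plus overflow

# left.open = TRUE makes the intervals (lower, upper], so a value equal to
# a break point lands in the bucket named after that break point.
bucket <- function(x, breaks, labels) {
  labels[findInterval(x, breaks, left.open = TRUE) + 1]
}

res <- bucket(c(0.9, 1.6, 5.8, 13), breaks, labels)

# The same buckets with cut(), padding the breaks with -Inf and Inf.
res_cut <- as.character(cut(c(0.9, 1.6, 5.8, 13),
                            breaks = c(-Inf, breaks, Inf), labels = labels))
```

Both res and res_cut give "1", "4", "10", "10+" for the inputs from the question.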

Identifying indices of sequences which contain frequent subsequences

Using TraMineR I can identify frequent subsequences in a dataset of sequences. However, it only gives me a count of how often such a subsequence occurs in the overall dataset, e.g. that it occurs in 21 of 22 sequences.
Is there any way of getting indices of exactly which sequences contain a specific frequent subsequence?
See the function seqeapplysub.
According to the help page: Checks occurrences of the subsequences subseq among the event sequences and returns the result according to the selected method.

Finding and counting repeated occurrences

I wish to make a function that accepts three arguments (starting position, ending position, length). With that function, I wish to find out how many times each distinct pattern of that length appears, and then extract the most frequent one. Sounds confusing.
Try this:
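# Count each non-overlapping chunk of `len` characters between `start` and `end`.
countSubstring <- function(string, start, end, len) {
  startChar <- seq(start, end, by = len)  # start position of every chunk
  table(substring(string, startChar, startChar + (len - 1)))
}
string <- "aabaaaabaaaacaaaabaaaabaa"
countSubstring(string, start = 1, end = 15, len = 5)
aabaa aacaa
    2     1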
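The question also asks for the most frequent pattern, which the table alone does not return. A minimal sketch of that last step, repeating the function so the snippet runs on its own (the variable names counts and mostFrequent are hypothetical):

```r
countSubstring <- function(string, start, end, len) {
  startChar <- seq(start, end, by = len)
  table(substring(string, startChar, startChar + (len - 1)))
}

string <- "aabaaaabaaaacaaaabaaaabaa"
counts <- countSubstring(string, start = 1, end = 15, len = 5)

# which.max() returns the position of the first maximum,
# so ties resolve to the pattern that sorts first in the table.
mostFrequent <- names(counts)[which.max(counts)]
mostFrequent
max(counts)
```

Here mostFrequent is "aabaa", with a count of 2.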
