I am trying to write a really simple function in R. The idea is this:
Say I have the sequence "UUDDDDUDU", where "U" = 1 and "D" = -1. I want to add up the following: +1, +1, -1, -1, -1, -1, +1, -1, +1, which gives a final number of -1.
funky <- function(n, s){
current_level = 0
U = 1
D = -1
for(i in 1:n){
if(s[i] == "U"){current_level +1}
if(s[i] == "D"){current_level -1}
}
}
funky(9, UUDDDDUDU)
Any pointers in the right direction would be great!
You could use stringr::str_count
s <- "UUDDDDUDU"
library(stringr)
str_count(s, 'U') - str_count(s, 'D')
# [1] -1
or more generally
library(purrr)
weights <- c(U = 1L, D = -1L)
sum(imap_int(weights, ~str_count(s, .y)*.x))
# [1] -1
Base R solution (using weights and s as defined above)
sum(weights[strsplit(s, '')[[1]]])
# [1] -1
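For completeness, here is a minimal sketch of how the loop from the question could be repaired. The name funky_fixed and the single-argument signature are assumptions made for illustration. It addresses the points where the original goes wrong: the string has to be quoted and split into characters before indexing, the updated level has to be assigned back, and the function has to return a value.

```r
# Hypothetical reworked version of the question's loop (illustrative name)
funky_fixed <- function(s) {
  moves <- strsplit(s, "")[[1]]    # split the string into single characters
  current_level <- 0
  for (move in moves) {
    if (move == "U") current_level <- current_level + 1  # assign the result back
    if (move == "D") current_level <- current_level - 1
  }
  current_level                    # return the final level
}

funky_fixed("UUDDDDUDU")
# [1] -1
```

The length argument n is dropped here because the number of characters can be taken from the string itself.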
These methods are in base R and should work with vectors
x = "UUDDDDUDU"
1
with(data.frame(t(sapply(strsplit(x, ""), table))), U - D)
#[1] -1
2
foo = function(p, s) {
sapply(gregexpr(p, s), function(x) sum(x > 0))
}
foo("U", x) + (-1 * foo("D", x))
#[1] -1
3
2 * nchar(gsub("D", "", x)) - nchar(x)
#[1] -1
Here is a method making use of reticulate
library(reticulate)
s <- "UUDDDDUDU"
repl_python()
>>> from collections import Counter
>>> counts = Counter(r.s)  # look counts up by key, so the result does not depend on which letter appears first
>>> counts["U"] - counts["D"]
#-1
Just for fun (and vectorised)...
s <- c("UUDDDDUDU", "UUDUU", "DDDDDUD")
sapply(parse(text=as.expression(gsub("(.)","\\11",chartr("UD","+-",s)))),eval)
[1] -1 3 -5
This translates as...
replace U with +, and D with - (chartr)
replace every character with that character followed by 1 (gsub)
and evaluate these character strings ("+1+1-1-1-1" etc) as expressions
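The three steps can also be run one at a time to see the intermediate strings. This is only a sketch of the same idea; the variable names signs, terms and totals are chosen here for illustration.

```r
s <- c("UUDDDDUDU", "UUDUU", "DDDDDUD")

# Step 1: map U to "+" and D to "-"
signs <- chartr("UD", "+-", s)
signs[1]
# [1] "++----+-+"

# Step 2: put a literal 1 after every character
# ("\\1" is the matched character, the following 1 is a literal digit)
terms <- gsub("(.)", "\\11", signs)
terms[1]
# [1] "+1+1-1-1-1-1+1-1+1"

# Step 3: parse and evaluate each string as an R expression
totals <- vapply(terms, function(e) eval(parse(text = e)), numeric(1),
                 USE.NAMES = FALSE)
totals
# [1] -1  3 -5
```

Since eval(parse()) runs whatever text it is given, this is best kept to input you control.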
I have a binary string of length n which I wish to split into substrings of size k. Each substring is then decoded to decimal and added to a vector or a list.
Sorry if this is an obvious question but I'm very new to R
The example string should return: 01 01 01 01 01
Which should evaluate to 1 1 1 1 1
string <- 0101010101
n <- length(string)
k <- 2
#pseudocode
#For each substring in string:
# decode substring to decimal
# add substring to list/array
Does this help you?
It creates a small function splitter that takes the string (make sure that you input the string as a character and not as a number, otherwise leading zeroes are lost) and outputs the subsets.
# make sure that you use characters and not numbers here
# otherwise leading zeros are lost!
string <- "0101010101"
splitter <- function(t, k = 2) {
k <- min(k, nchar(t))
substring(t, seq(1, nchar(t) - 1, k), seq(k, nchar(t), k))
}
splitter(string)
#> [1] "01" "01" "01" "01" "01"
splitter(string, k = 3)
#> [1] "010" "101" "010"
splitter(string, k = 100)
#> [1] "0101010101"
## if you want to cast the values as numeric (are you sure that you want to cast binaries to numerics?)
## use
v <- splitter(string, k = 2)
as.numeric(v)
#> [1] 1 1 1 1 1
Created on 2020-03-18 by the reprex package (v0.3.0)
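One thing to keep in mind with splitter: when nchar(t) is not a multiple of k, the leftover characters at the end are dropped (with k = 3 above, the final "1" is missing from the output). If the remainder should be kept as a shorter last chunk, a variant along these lines might work. The name split_chunks is made up for this sketch, and it assumes a single non-empty string.

```r
# Hypothetical variant that keeps a shorter final chunk (illustrative name)
split_chunks <- function(t, k = 2) {
  n <- nchar(t)
  starts <- seq(1, n, by = k)        # first position of every chunk
  ends <- pmin(starts + k - 1, n)    # last position, capped at the string length
  substring(t, starts, ends)
}

split_chunks("0101010101", k = 3)
# [1] "010" "101" "010" "1"
strtoi(split_chunks("0101010101", k = 3), base = 2)
# [1] 2 5 2 1
```

Whether the short last chunk should be kept, padded, or rejected depends on how the binary data was encoded, so treat this as one option rather than the right answer.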
If you are sure that you want to always get a numeric value, use this function
splitter2 <- function(t, k = 2) {
k <- min(k, nchar(t))
v <- substring(t, seq(1, nchar(t) - 1, k), seq(k, nchar(t), k))
as.numeric(v)
}
If you want to decode the binary values to int, use strtoi(string, base = 2), i.e.,
string <- "0101010101"
splitter3 <- function(t, k = 2) {
k <- min(k, nchar(t))
v <- substring(t, seq(1, nchar(t) - 1, k), seq(k, nchar(t), k))
strtoi(v, base = 2)
}
splitter3(string, k = 2)
#> [1] 1 1 1 1 1
splitter3(string, k = 3)
#> [1] 2 5 2
I am having trouble finishing this R code. We are given a string containing parentheses, like the one below:
"( ((X)) (((Y))) )"
We need to find the maximum depth of balanced parentheses, which is 4 in the example above, since 'Y' is surrounded by 4 balanced parentheses.
If the parentheses are unbalanced, then return -1.
My code looks like this:
current_max = 0
max = 0
def = function (S){
n=S
for (i in nchar(n))
if (is.element('(',n[i]))
{
current_max <- current_max + 1
}
if (current_max > max)
{
max <- current_max
}
else if (is.element(')',n[i]))
{
if (current_max > 0)
{
current_max <- current_max - 1
}
else
{
return -1
}
}
if (current_max != 0)
{
return -1
}
return (max)
}
But when I call the function def("(A((B)))"), the answer should be 3. Instead it shows 0 every time, even when the parentheses are unbalanced. I'm not sure whether the code is correct or where the mistake is. I'm trying to learn R, so be patient with me. Thanks
If x <- "( ((X)) (((Y))) )", then remove all of the non-parentheses and split into characters...
y <- unlist(strsplit(gsub("[^\\(\\)]", "", x), ""))
y
[1] "(" "(" "(" ")" ")" "(" "(" "(" ")" ")" ")" ")"
and then the maximum nesting is the highest cumulative sum of +1 (for () and -1 (for ))...
z <- max(cumsum(ifelse(y=="(", 1, -1)))
z
[1] 4
If the parentheses are unbalanced, then either sum(ifelse(y=="(", 1, -1)) will not equal zero, or the cumulative sum will drop below zero at some point (as with ")(").
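The steps above can be wrapped into one function that also returns -1 for unbalanced input, as the question asks. This is a minimal sketch; max_depth is a name chosen here for illustration, and it treats the input as unbalanced when the running depth ever goes negative or does not end at zero.

```r
# Hypothetical helper (illustrative name): maximum nesting depth, or -1 if unbalanced
max_depth <- function(s) {
  chars <- strsplit(s, "")[[1]]
  steps <- (chars == "(") - (chars == ")")   # +1 for "(", -1 for ")", 0 otherwise
  depth <- c(0, cumsum(steps))               # running depth, starting at 0
  if (any(depth < 0) || tail(depth, 1) != 0) return(-1)
  max(depth)
}

max_depth("( ((X)) (((Y))) )")
# [1] 4
max_depth("(A((B)))")
# [1] 3
max_depth(")(")
# [1] -1
```

Starting the running depth at 0 means strings without any parentheses give 0 rather than an error.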
Here are three solutions. They are all vectorized, i.e. the input x can be a character vector, and they all handle the case of no parentheses properly.
1) strapply/proto. strapply in the gsubfn package matches the regular expression given as its second argument and, for each match, runs the function fun in the proto object p, which is passed as the third argument. The pre function in p initializes the calculation for each component of the input x. The proto object can retain memory of past matches (here lev is the nesting level), which allows counting to be done. We append an arbitrary character, here "X", to each string to ensure that there is always at least one match; if we knew there were no zero-length input strings, this could be omitted. The sapply uses Max, which takes the maximum of the returned depths, or returns -1 if the parentheses are not balanced.
library(gsubfn) # also pulls in proto
# test input
x <- c("(A((B)))", "((A) ((())) (B))", "abc", "", "(A)((B)", "(A(B)))")
p <- proto(pre = function(.) .$lev <- 0,
fun = function(., x) .$lev <- .$lev + (x == "(") - (x == ")") )
Max <- function(x) if (tail(x, 1) == 0 && min(x) == 0) max(x) else -1
sapply(strapply(paste(x, "X"), ".", p), Max)
## [1] 3 4 0 0 -1 -1
2) Reduce This is a base solution. It makes use of Max from (1).
fun <- function(lev, char) lev + (char == "(") - (char == ")")
sapply(x, function(x) Max(Reduce(fun, init = 0, unlist(strsplit(x, "")), acc = TRUE)))
##         (A((B))) ((A) ((())) (B))              abc
##                3                4                0                0
##          (A)((B)          (A(B)))
##               -1               -1
3) strapply/list Another possibility is to extract the parentheses and return with +1 or -1 for ( and ) using strapply with a replacement list. Then run cumsum and Max (from above) over that.
library(gsubfn)
fn$sapply(strapply(x, "[()]", list("(" = +1, ")" = -1), empty = 0), ~ Max(cumsum(x)))
## [1] 3 4 0 0 -1 -1
My aim is to create a vector with sum 0 that contains the same number of entries equal to -x as entries equal to x. The length of the vector is even, so it sums to 0.
I created a function that takes x as an input.
In it I use a sample of the vector length, but in the end it doesn't work out.
vector<-function(x){
for(i in length(sample)){
if(i %% 2!=0){
output[sample[i]]<-(-x)
}
if(i %% 2 ==0){
output[sample[i]]<-x
}
}
return(output)
}
Try this:
vector <- function(x, sample){
c(rep(x, sample/2), rep(-x, sample/2))
}
print(vector(x = 1, sample = 4))
# [1] 1 1 -1 -1
Edit
If alternating is required:
vector <- function(x, sample){
c(rbind(rep(-x, sample/2), rep(x, sample/2)))
}
print(vector(x = 1, sample = 4))
# [1] -1 1 -1 1
You can try
foo <- function(x, sample){
a <- sample(sample, x/2, replace = TRUE) # here x is the vector length and sample is the pool of values
c(a,-a)
# or alternating
# c(rbind(a,-a))
}
set.seed(123)
foo(4, 1:10)
[1] 3 8 -3 -8
According to the title you are looking for a random vector. In that case you can simply first generate an ordered vector with the desired properties and then use sample to shuffle it:
f <- function(x, size){
sample(c(rep(x, size/2), rep(-x, size/2), if(size %% 2 != 0) 0))
}
f(x = 1, size = 6)
#> [1] 1 -1 -1 1 -1 1
f(x = 1, size = 7)
#> [1] 0 -1 -1 1 -1 1 1
Edit: Now the function even allows for an odd size.
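To convince yourself that a vector built this way has the required properties, you can check the sum and the counts directly. The sketch below repeats the same construction under a made-up name, balanced_vector, so that it runs on its own; the value 2.5 and the size 7 are arbitrary.

```r
# Hypothetical stand-alone copy of the construction above (illustrative name)
balanced_vector <- function(x, size) {
  half <- size %/% 2                                    # number of x and of -x entries
  v <- c(rep(x, half), rep(-x, half), if (size %% 2 != 0) 0)
  v[sample.int(length(v))]                              # shuffle the entries
}

set.seed(1)
v <- balanced_vector(x = 2.5, size = 7)
sum(v) == 0                          # the entries cancel out
c(sum(v == 2.5), sum(v == -2.5))     # same number of x and -x entries
```

The order of the entries depends on the seed, but the sum and the counts do not.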
I have a vector with a variable number of elements, and I want to check whether its last two elements are of the same order of magnitude.
For example, if the last two elements are 0.0194 and 0.0198, return TRUE, because their order of magnitude is the same (0.01, i.e. 10^-2). As another example, the numbers could be 0.00014 and 0.00012; their precision is still about the same, so the function should also return TRUE.
How can we build a logical statement or function to check this?
x<- c(0.817104, 0.241665, 0.040581, 0.022903, 0.019478, 0.019846)
I may be over-thinking this, but you can test that the order of magnitude and first non-zero digit are identical for each.
x <- c(0.817104, 0.241665, 0.040581, 0.022903, 0.019478, 0.019846)
oom <- function(x, base = 10) as.integer(ifelse(x == 0, 0, floor(log(abs(x), base))))
oom(x)
# [1] -1 -1 -2 -2 -2 -2
(tr <- trunc(x / 10 ** oom(x, 10)))
# [1] 8 2 4 2 1 1
So for the last two, the order of magnitude for both is -2 and the first non-zero digit is 1 for both.
Put into a function:
f <- function(x) {
oom <- function(x, base = 10) as.integer(ifelse(x == 0, 0, floor(log(abs(x), base))))
x <- tail(x, 2)
oo <- oom(x)
tr <- trunc(x / 10 ** oo)
(oo[1] == oo[2]) & (tr[1] == tr[2])
}
## more test cases
x1 <- c(0.019, 0.011)
x2 <- c(0.01, 0.001)
f(x) ## TRUE
f(x1) ## TRUE
f(x2) ## FALSE
Here is a more general function than the above for checking the last n instead of 2
g <- function(x, n = 2) {
oom <- function(x, base = 10) as.integer(ifelse(x == 0, 0, floor(log(abs(x), base))))
x <- tail(x, n)
oo <- oom(x)
tr <- trunc(x / 10 ** oo)
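length(unique(oo)) == 1 && length(unique(tr)) == 1 # all equal; chaining `==` through Reduce compares TRUE/FALSE with the next value once n > 2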
}
g(c(.24, .15, .14), 2) ## TRUE
g(c(.24, .15, .14), 3) ## FALSE
@rawr worries about over-thinking. I guess I should as well. This is what I came up with; note that it handles the fact that printed representations of floating-point numbers are sometimes deceiving.
orddig <- function(x) which( sapply( 0:16, function(n){ isTRUE(all.equal(x*10^n ,
round(x*10^n,0)))}))[1]
> sapply( c(0.00014 , 0.00012 ), orddig)
[1] 6 6
My original efforts were with the signif function but that's a different numerical thought trajectory, since 0.01 and 0.001 have the same number of significant digits. Also notice that:
> sapply( 10^5*c(0.00014 , 0.00012 ), trunc, 4)
[1] 13 12
This is why we need the isTRUE(all.equal(... , ...)).
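Another way to sidestep the floating-point issue, offered only as a sketch, is to read the leading digit and the exponent from the number's scientific-notation text instead of dividing and truncating. The function name same_lead is made up, and the choice of 15 digits in formatC is an assumption about how much precision is worth trusting in a double.

```r
# Hypothetical helper (illustrative name): do the last two elements share
# the same leading digit and the same power of ten?
same_lead <- function(x) {
  last2 <- abs(tail(x, 2))
  sci <- formatC(last2, format = "e", digits = 15)  # e.g. "1.984600000000000e-02"
  lead <- substr(sci, 1, 1)                         # first significant digit
  expo <- sub(".*e", "", sci)                       # exponent part, e.g. "-02"
  lead[1] == lead[2] && expo[1] == expo[2]
}

same_lead(c(0.817104, 0.241665, 0.040581, 0.022903, 0.019478, 0.019846))
# [1] TRUE
same_lead(c(0.01, 0.001))
# [1] FALSE
same_lead(c(0.00014, 0.00012))
# [1] TRUE
```

Rounding to 15 decimals in the mantissa means a value stored as 0.2999999999999999889 is read as 3e-01, which matches how it prints.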
I want to create a function that finds the elements of a vector that are followed by k consecutive increases.
That is, if the function is f(x,k) and x=c(2,3,4,3,5,6,5,7), then
the value of f(x,1) is 2,3,3,5,5, since only these elements of x are followed by at least one increase.
In addition, if k=2, then the value of f(x,2) is 2,3, since only these elements are followed by two consecutive increases (2→3→4 and 3→5→6).
I guess that I ought to use a looping construct like for for this purpose.
1) Use rollapply from the zoo package:
library(zoo)
f <- function(x, k)
x[rollapply(x, k+1, function(x) all(diff(x) > 0), align = "left", fill = FALSE)]
Now test out f:
x <- c(2,3,4,3,5,6,5,7)
f(x, 1)
## [1] 2 3 3 5 5
f(x, 2)
## [1] 2 3
f(x, 3)
## numeric(0)
1a) This variation is slightly shorter and also works:
f2 <- function(x, k) head(x, -k)[ rollapply(diff(x) > 0, k, all) ]
2) Here is a version of 1a that uses no packages:
f3 <- function(x, k) head(x, -k)[ apply(embed(diff(x) > 0, k), 1, all) ]
A fully vectorized solution:
f <- function(x, k = 1) {
rlecumsum = function(x)
{ #cumsum with resetting
#http://stackoverflow.com/a/32524260/1412059
cs = cumsum(x)
cs - cummax((x == 0) * cs)
}
x[rev(rlecumsum(rev(c(diff(x) > 0, FALSE) ))) >= k]
}
f(x, 1)
#[1] 2 3 3 5 5
f(x, 2)
#[1] 2 3
f(x, 3)
#numeric(0)
I don't quite understand the second part of your question (the one with k=2), but for the first part you can use something like this:
test<-c(2,3,4,3,5,6,5,7) #Your vector
diff(test) #Differences between consecutive elements
diff(test)>0 #Logical vector: TRUE where the next element is larger
test[c(diff(test)>0, FALSE)] #Elements followed by an increase; FALSE is appended for the last element so the shorter index is not recycled
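Since the question mentions using for, here is a minimal loop-based sketch that also covers k greater than 1. The name f_loop is chosen for illustration; the idea is to look at each window of k + 1 consecutive elements and keep the first element when every step in the window is an increase.

```r
# Hypothetical loop version (illustrative name)
f_loop <- function(x, k) {
  n <- length(x)
  keep <- logical(n)                     # all FALSE to start with
  if (n > k) {
    for (i in seq_len(n - k)) {
      window <- x[i:(i + k)]             # element i and its k successors
      keep[i] <- all(diff(window) > 0)   # TRUE if each step goes up
    }
  }
  x[keep]
}

x <- c(2, 3, 4, 3, 5, 6, 5, 7)
f_loop(x, 1)
# [1] 2 3 3 5 5
f_loop(x, 2)
# [1] 2 3
f_loop(x, 3)
# numeric(0)
```

This does the same work as the windowed approaches in the other answers, just written out step by step.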