Specific Permutations in R

I am trying to create permutations of the alphabet {0,1,2,3} using combinat::permn.
I want each permutation converted to a string of the form '%s-%s-%s'... and so on, and stored in a list. For example,
> library(combinat)
> permn(numbers[1:4])
[[1]]
[1] "0" "1" "2" "3"
[[2]]
[1] "0" "1" "3" "2"
.
.
. and so on
But I want to convert the output for all permutations into a list of string sequences in my specific format, i.e. '0-1-2-3', '0-1-3-2', etc.

Use lapply to apply paste on each of the vectors and collapse them with the delimiter you want (in this case "-").
lapply(permn(0:3), paste, collapse = "-")
If you just want the output as a vector instead of a list, you could use sapply in place of lapply.
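To make the difference between lapply and sapply concrete, here is a minimal sketch. The name perms is an assumption for illustration: it is a small hand-written list of vectors of the kind permn(0:3) returns, so the snippet runs without combinat installed.

```r
# Stand-in for a few of the vectors returned by permn(0:3) (hypothetical data)
perms <- list(c(0, 1, 2, 3), c(0, 1, 3, 2), c(0, 3, 1, 2))

# lapply returns a list with one collapsed string per permutation
as_list <- lapply(perms, paste, collapse = "-")

# sapply simplifies the same result to a character vector
as_vector <- sapply(perms, paste, collapse = "-")

print(as_list)
print(as_vector)
```

With the real data, replacing perms by permn(0:3) should behave the same way.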

Related

Ascending order of vector of numeric characters

I have a vector of numbers of type character.
x = c("5","-.5","-.1",".01",".1","1","3")
Is there a quick and easy way to order this character vector using the numeric value of each character? I can't find a clean way to do this.
So for instance, I want a function
x <- characterOrder(x)
With output:
c("-.5","-.1",".01",".1","1","3", "5")
Thank you!
You can do this in base R: convert the values with as.numeric, pass them to order, and use the resulting index to subset the original character vector.
x = c("5","-.5","-.1",".01",".1","1","3")
x[order(as.numeric(x))]
[1] "-.5" "-.1" ".01" ".1" "1" "3" "5"
If you want this in a function:
characterOrder <- function(x) {
  return(x[order(as.numeric(x))])
}
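As a quick check of the function above, the following sketch runs it on the vector from the question. The second input (mixed) is a hypothetical example, not from the question, added to show what happens when an entry does not parse as a number.

```r
characterOrder <- function(x) {
  x[order(as.numeric(x))]
}

x <- c("5", "-.5", "-.1", ".01", ".1", "1", "3")
sorted <- characterOrder(x)
print(sorted)

# Hypothetical input with a non-numeric entry: as.numeric() yields NA for it
# (with a coercion warning, suppressed here), and order() puts NA last by default.
mixed <- suppressWarnings(characterOrder(c("2", "abc", "1")))
print(mixed)
```

The elements stay character strings throughout; only the ordering is computed on their numeric values.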
You could try mixedsort from gtools
library(gtools)
mixedsort(x)
#[1] "-.5" "-.1" ".01" ".1" "1" "3" "5"

extract numerical suffixes from strings in R

I have this character vector:
variables <- c("ret.SMB.l1", "ret.mkt.l1", "ret.mkt.l4", "vix.l4", "ret.mkt.l5", "vix.l6", "slope.l11", "slope.l12", "us2yy.l2")
Desired output:
> suffixes(variables)
[1] 1 1 4 4 5 6 11 12 2
In other words, I need a function that returns a numeric vector of the suffixes (each of which is 1 or 2 digits long). Note that I need something that works with a much larger number of strings, which may or may not have numbers somewhere in the middle. The numerical suffixes range from 1 to 99.
Many thanks
Just use gsub:
> gsub(".*?([0-9]+)$", "\\1", variables)
[1] "1" "1" "4" "4" "5" "6" "11" "12" "2"
Wrap it in as.numeric if you want the result as a number.
You could use the sub function.
> variables <- c("ret.SMB.l1", "ret.mkt.l1", "ret.mkt.l4", "vix.l4", "ret.mkt.l5", "vix.l6", "slope.l11", "slope.l12", "us2yy.l2")
> sub(".*\\D", "", variables)
[1] "1" "1" "4" "4" "5" "6" "11" "12" "2"
.*\\D matches all the characters from the start up to the last non-digit character. Replacing those matched characters with an empty string gives you the desired output.
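Putting the pieces together, here is one possible sketch of the suffixes function the question asks for. It wraps the sub call above in as.numeric; the function body is an assumption about how the two suggestions combine, not code from either answer.

```r
# suffixes() is the name used in the question; this body is one possible sketch
suffixes <- function(x) {
  # Drop everything up to the last non-digit, then convert what is left
  as.numeric(sub(".*\\D", "", x))
}

variables <- c("ret.SMB.l1", "ret.mkt.l1", "ret.mkt.l4", "vix.l4", "ret.mkt.l5",
               "vix.l6", "slope.l11", "slope.l12", "us2yy.l2")
result <- suffixes(variables)
print(result)
```

Note that the digit in the middle of "us2yy.l2" is ignored, because the greedy .* runs up to the last non-digit character.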

Changing a column of a dataframe in R

I have a dataframe in R with a column whose values look like "s1-112", "s10-112", "s3656-112", etc. I want to keep only the part after "s" and before "-112", that is, the number after the s. Is there a way?
You could use gsub here
x<-c("s1-112", "s10-112", "s3656-112")
gsub("s(.*)-112", "\\1", x)
# [1] "1" "10" "3656"
Or (using @MrFlick's data)
library(stringr)
str_extract(x, '\\d+(?=-)')  # lookahead works without perl(), which is deprecated in later stringr releases
#[1] "1" "10" "3656"
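Since the question is about a dataframe column rather than a bare vector, here is a minimal base R sketch of applying the same idea in place. The data frame df and the column names id and id_num are assumptions for illustration, and the pattern is anchored slightly more strictly than the one in the answer above.

```r
# Hypothetical data frame with the values from the question
df <- data.frame(id = c("s1-112", "s10-112", "s3656-112"),
                 stringsAsFactors = FALSE)

# Keep only the digits between the leading "s" and the trailing "-112"
df$id <- sub("^s(\\d+)-112$", "\\1", df$id)
print(df$id)

# Optionally store the result as integers in a new column
df$id_num <- as.integer(df$id)
print(df$id_num)
```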

Sapply different than individual application of function

When applied individually to each element of the vector, my function gives a different result than using sapply. It's driving me nuts!
Item I'm using: this (simplified) list of arguments another function was called with:
f <- as.list(match.call()[-1])
> f
$ampm
c(1, 4)
To replicate this you can run the following:
foo <- function(ampm) {as.list(match.call()[-1])}
f <- foo(ampm = c(1,4))
Here is my function. It just strips the 'c(...)' from a string.
stripConcat <- function(string) {
  sub(')','',sub('c(','',string,fixed=TRUE),fixed=TRUE)
}
When applied alone it works as so, which is what I want:
> stripConcat(f)
[1] "1, 4"
But when used with sapply, it gives something totally different, which I do NOT want:
> sapply(f, stripConcat)
ampm
[1,] "c"
[2,] "1"
[3,] "4"
Lapply doesn't work either:
> lapply(f, stripConcat)
$ampm
[1] "c" "1" "4"
And neither do any of the other apply functions. This is driving me nuts--I thought lapply and sapply were supposed to be identical to repeated applications to the elements of the list or vector!
The discrepancy you are seeing, I believe, is simply due to how as.character coerces elements of a list.
x2 <- list(1:3, quote(c(1, 5)))
as.character(x2)
[1] "1:3" "c(1, 5)"
lapply(x2, as.character)
[[1]]
[1] "1" "2" "3"
[[2]]
[1] "c" "1" "5"
f is not a call, but a list whose first element is a call.
is(f)
[1] "list" "vector"
as.character(f)
[1] "c(1, 4)"
> is(f[[1]])
[1] "call" "language"
> as.character(f[[1]])
[1] "c" "1" "4"
sub attempts to coerce anything that is not a character into a character.
When you pass sub a list, it calls as.character on the list.
When you pass it a call, it calls as.character on that call.
It looks like for your stripConcat function, you would prefer a list as input.
In that case, I would recommend the following for that function:
stripConcat <- function(string) {
  if (!is.list(string))
    string <- list(string)
  sub(')','',sub('c(','',string,fixed=TRUE),fixed=TRUE)
}
Note, however, that string is a misnomer for the argument, since it doesn't appear that you ever plan to pass stripConcat a string (not that this is an issue, of course).
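As a sanity check, the sketch below rebuilds f with the foo helper from the question and runs the revised stripConcat both on the whole list and through sapply. The variable names whole and each are assumptions for illustration.

```r
# Rebuild the list of unevaluated arguments from the question
foo <- function(ampm) {
  as.list(match.call()[-1])
}
f <- foo(ampm = c(1, 4))

# Revised function: wrap a bare call in a list so that as.character()
# deparses it as one string instead of splitting it into "c", "1", "4"
stripConcat <- function(string) {
  if (!is.list(string))
    string <- list(string)
  sub(')', '', sub('c(', '', string, fixed = TRUE), fixed = TRUE)
}

whole <- stripConcat(f)          # applied to the list itself
each  <- sapply(f, stripConcat)  # applied element by element

print(whole)
print(each)
```

Both calls should now agree, because each element that sapply hands over is wrapped in a list before sub coerces it.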

In R, how can a string be split without using a separator

I am trying the split method, and I want to get the second element of a string that contains only two characters. The length of the string is 2.
examples :
string= "AC"
The result should be a split after the first letter ("A"), so that I get:
res= [,1] [,2]
[1,] "A" "C"
I tried it with split, but I have no idea how to split after the first element.
strsplit() will do what you want (if I understand your question). You need to split on "" to split the string into its elements. Here is an example showing how to do what you want on a vector of strings:
strs <- rep("AC", 3) ## your string repeated 3 times
Next, split each of the three strings:
sstrs <- strsplit(strs, "")
which produces
> sstrs
[[1]]
[1] "A" "C"
[[2]]
[1] "A" "C"
[[3]]
[1] "A" "C"
This is a list, so we can process it with lapply() or sapply(). We need to subset each element of sstrs to select the second element. For this we apply the [ function:
sapply(sstrs, `[`, 2)
which produces:
> sapply(sstrs, `[`, 2)
[1] "C" "C" "C"
If all you have is one string, then
strsplit("AC", "")[[1]][2]
which gives:
> strsplit("AC", "")[[1]][2]
[1] "C"
split isn't used for this kind of string manipulation. What you're looking for is strsplit, which in your case would be used something like this:
strsplit(string,"",fixed = TRUE)
You may not need fixed = TRUE, but it's a habit of mine as I tend to avoid regular expressions. You seem to indicate that you want the result to be something like a matrix. strsplit will return a list, so you'll want something like this:
strsplit(string,"",fixed = TRUE)[[1]]
and then pass the result to matrix.
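A minimal sketch of that last step, assuming a one-row matrix is the shape the question is after (the names chars, res and second are illustrative):

```r
string <- "AC"

# Split into single characters and take the first (only) list element
chars <- strsplit(string, "", fixed = TRUE)[[1]]

# Arrange the characters as a one-row matrix, one character per column
res <- matrix(chars, nrow = 1)
print(res)

# The second character is then the entry in row 1, column 2
second <- res[1, 2]
print(second)
```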
If you are sure that it is always a two-character string (check with all(nchar(x)==2)) and you want only the second character, then you could use sub or substr:
x <- c("ab", "12")
sub(".", "", x)
# [1] "b" "2"
substr(x, 2, 2)
# [1] "b" "2"
