How can I get the last n characters from a string in R?
Is there a function like SQL's RIGHT?
I'm not aware of anything in base R, but it's straightforward to write a function for this using substr and nchar:
x <- "some text in a string"
substrRight <- function(x, n){
substr(x, nchar(x)-n+1, nchar(x))
}
substrRight(x, 6)
[1] "string"
substrRight(x, 8)
[1] "a string"
This is vectorised, as @mdsumner points out. Consider:
x <- c("some text in a string", "I really need to learn how to count")
substrRight(x, 6)
[1] "string" " count"
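One edge case worth checking is a string shorter than n. As a small sketch (the input vector here is made up for illustration), substr() treats a start position below 1 as 1, so short strings come back whole instead of raising an error:

```r
# Same helper as above, repeated so the snippet runs on its own
substrRight <- function(x, n) {
  substr(x, nchar(x) - n + 1, nchar(x))
}

# Hypothetical inputs: one long string, one shorter than n, one empty
x <- c("some text in a string", "abc", "")
res <- substrRight(x, 6)
print(res)
```

Missing values are not covered by this sketch and would need separate handling.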
If you don't mind using the stringr package, str_sub is handy because you can use negatives to count backward:
x <- "some text in a string"
str_sub(x,-6,-1)
[1] "string"
Or, as Max points out in a comment to this answer,
str_sub(x, start= -6)
[1] "string"
Use the stri_sub function from the stringi package.
To get a substring counted from the end, use negative indices.
See the examples below:
stri_sub("abcde",1,3)
[1] "abc"
stri_sub("abcde",1,1)
[1] "a"
stri_sub("abcde",-3,-1)
[1] "cde"
The development version of this package is on GitHub: https://github.com/Rexamine/stringi
It is now also available on CRAN; simply type
install.packages("stringi")
to install this package.
str = 'This is an example'
n = 7
result = substr(str,(nchar(str)+1)-n,nchar(str))
print(result)
[1] "example"
Another reasonably straightforward way is to use regular expressions and sub:
sub('.*(?=.$)', '', string, perl=T)
In words: "remove everything that is followed by exactly one final character". To keep more characters from the end, put that many dots (or a repeat count) in the lookahead assertion:
sub('.*(?=.{2}$)', '', string, perl=T)
where .{2} means .., or "any two characters", so the pattern reads "remove everything that is followed by two final characters".
sub('.*(?=.{3}$)', '', string, perl=T)
for three characters, etc. You can set the number of characters to keep with a variable, but you have to paste its value into the regular expression string:
n = 3
sub(paste('.*(?=.{', n, '}$)', sep=''), '', string, perl=T)
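The pattern can also be assembled with sprintf() and wrapped in a function. This is only a sketch: right_regex is a hypothetical helper name, and the `$` anchor ties the lookahead to the end of the string.

```r
# Hypothetical helper: keep the last n characters using a lookahead
right_regex <- function(x, n) {
  # ".*" is removed only where exactly n characters remain before the end ($)
  pattern <- sprintf(".*(?=.{%d}$)", as.integer(n))
  sub(pattern, "", x, perl = TRUE)
}

res <- right_regex(c("some text in a string", "ab"), 6)
print(res)
```

Strings shorter than n do not match the pattern at all, so they are returned unchanged.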
UPDATE: as noted by mdsumner, the original code is already vectorised because substr is. Should have been more careful.
And if you want a vectorised version (based on Andrie's code)
substrRight <- function(x, n){
sapply(x, function(xx)
substr(xx, (nchar(xx)-n+1), nchar(xx))
)
}
> substrRight(c("12345","ABCDE"),2)
12345 ABCDE
"45" "DE"
Note that I have changed (nchar(x)-n) to (nchar(x)-n+1) to get n characters.
A simple base R solution using the substring() function (who knew this function even existed?):
RIGHT = function(x,n){
substring(x,nchar(x)-n+1)
}
This works because substring() behaves essentially like substr(), but its end argument (last) defaults to 1,000,000, so it can be omitted.
Examples:
> RIGHT('Hello World!',2)
[1] "d!"
> RIGHT('Hello World!',8)
[1] "o World!"
Try this:
x <- "some text in a string"
n <- 5
substr(x, nchar(x)-n, nchar(x))
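It should give:
[1] "string"
Note that with n <- 5 this returns six characters, because substr() includes both the start and the end position.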
An alternative to substr is to split the string into a list of single characters and process that:
N <- 2
sapply(strsplit(x, ""), function(x, n) paste(tail(x, n), collapse = ""), N)
I take a different approach, using strsplit and tail. I want to extract the last 6 characters of "Give me your food." Here are the steps:
(1) Split the characters
splits <- strsplit("Give me your food.", split = "")
(2) Extract the last 6 characters
tail(splits[[1]], n=6)
Output:
[1] " " "f" "o" "o" "d" "."
Each of these characters can be accessed by indexing the result, e.g. tail(splits[[1]], n=6)[x], where x is 1 to 6.
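If a single string is wanted rather than separate characters, the pieces can be pasted back together. A minimal sketch, where last_chars is a hypothetical helper name and the second input is made up to show a short string:

```r
# Hypothetical helper built from strsplit() + tail()
last_chars <- function(x, n) {
  # strsplit() returns one character vector per input string
  pieces <- strsplit(x, split = "")
  # vapply() guarantees one character result per input string
  vapply(pieces, function(p) paste(tail(p, n), collapse = ""), character(1))
}

res <- last_chars(c("Give me your food.", "abc"), 6)
print(res)
```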
Someone above uses a similar solution to mine, but I find it easier to think of it as below:
> text<-"some text in a string" # we want only the last word, "string", which has 6 letters
> n<-5 # substr() also includes the character at the start position, so we subtract 1 from 6
> substr(x=text,start=nchar(text)-n,stop=nchar(text))
This returns the last characters as desired.
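Because substr() includes both endpoints, the two start conventions seen in this thread differ by one character. A short sketch comparing them (the variable names too_many and exact are only for illustration):

```r
text <- "some text in a string"
n <- 6

# Start at nchar(text) - n: both endpoints are included, so n + 1 characters
too_many <- substr(text, nchar(text) - n, nchar(text))

# Start at nchar(text) - n + 1: exactly n characters
exact <- substr(text, nchar(text) - n + 1, nchar(text))

print(c(nchar(too_many), nchar(exact)))
```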
For those coming from Microsoft Excel or Google Sheets, you would have seen functions like LEFT(), RIGHT(), and MID(). I have created a package known as forstringr and its development version is currently on Github.
if(!require("devtools")){
install.packages("devtools")
}
devtools::install_github("gbganalyst/forstringr")
library(forstringr)
str_left(): counts from the left and then extracts n characters
str_right(): counts from the right and then extracts n characters
str_mid(): extracts characters from the middle
Examples:
x <- "some text in a string"
str_left(x, 4)
[1] "some"
str_right(x, 6)
[1] "string"
str_mid(x, 6, 4)
[1] "text"
I used the following code to get the last character of a string.
substr(stringOfInterest, nchar(stringOfInterest), nchar(stringOfInterest))
You can adjust the start position, nchar(stringOfInterest), to get the last few characters.
A small modification of @Andrie's solution also gives the complement:
substrR <- function(x, n) {
if(n > 0) substr(x, (nchar(x)-n+1), nchar(x)) else substr(x, 1, (nchar(x)+n))
}
x <- "moSvmC20F.5.rda"
substrR(x,-4)
[1] "moSvmC20F.5"
That was what I was looking for. The same idea applies to the left side:
substrL <- function(x, n){
if(n > 0) substr(x, 1, n) else substr(x, -n+1, nchar(x))
}
substrL(substrR(x,-4),-2)
[1] "SvmC20F.5"
In case a range of characters needs to be picked:
# For example, to get the date part from the string
substrRightRange <- function(x, m, n){substr(x, nchar(x)-m+1, nchar(x)-m+n)}
value <- "REGNDATE:20170526RN"
substrRightRange(value, 10, 8)
[1] "20170526"
I'm trying to teach myself R and in doing some sample problems I came across the need to reverse a string.
Here's what I've tried so far but the paste operation doesn't seem to have any effect.
There must be something I'm not understanding about lists? (I also don't understand why I need the [[1]] after strsplit.)
test <- strsplit("greg", NULL)[[1]]
test
# [1] "g" "r" "e" "g"
test_rev <- rev(test)
test_rev
# [1] "g" "e" "r" "g"
paste(test_rev)
# [1] "g" "e" "r" "g"
From ?strsplit, a function that'll reverse every string in a vector of strings:
## a useful function: rev() for strings
strReverse <- function(x)
sapply(lapply(strsplit(x, NULL), rev), paste, collapse="")
strReverse(c("abc", "Statistics"))
# [1] "cba" "scitsitatS"
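On the question of why [[1]] is needed: strsplit() is vectorised over its input and returns a list with one element per string, even when only one string is passed. A small sketch (the second input string is made up for illustration):

```r
# strsplit() returns a list with one element per input string
pieces <- strsplit(c("greg", "abc"), NULL)
length(pieces)        # one list element for each input string
pieces[[1]]           # character vector of the letters of "greg"

# With a single input string the result is still a list of length 1,
# so [[1]] is needed to pull out the character vector itself
single <- strsplit("greg", NULL)
is.list(single)
reversed <- paste(rev(single[[1]]), collapse = "")
print(reversed)
```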
stringi has had this function for quite a long time:
stringi::stri_reverse("abcdef")
## [1] "fedcba"
Also note that it's vectorized:
stringi::stri_reverse(c("a", "ab", "abc"))
## [1] "a" "ba" "cba"
As @mplourde points out, you want the collapse argument:
paste(test_rev, collapse='')
Most commands in R are vectorized, but how exactly the command handles vectors depends on the command. paste will operate over multiple vectors, combining the ith element of each:
> paste(letters[1:5],letters[1:5])
[1] "a a" "b b" "c c" "d d" "e e"
collapse tells it to operate within a vector instead.
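The difference between sep and collapse can be seen side by side in this short sketch (the variable names are only for illustration):

```r
chars <- c("g", "e", "r", "g")

# Without collapse, paste() returns one string per element (nothing to combine)
unchanged <- paste(chars)

# sep joins the i-th elements of several vectors with each other
pairwise <- paste(chars, chars, sep = "-")

# collapse joins the elements of the resulting vector into one string
joined <- paste(chars, collapse = "")

print(joined)
```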
The following can be a useful way to reverse a vector of strings x, and is slightly faster (and more memory efficient) because it avoids generating a list (as in using strsplit):
x <- rep( paste( collapse="", LETTERS ), 100 )
str_rev <- function(x) {
sapply( x, function(xx) {
intToUtf8( rev( utf8ToInt( xx ) ) )
} )
}
str_rev(x)
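sapply() names its result after the input strings, which can be noisy for long inputs. A possible variant using vapply(), offered as a sketch (str_rev_cp is a hypothetical name, not part of any package):

```r
# Variant of the code-point approach with an unnamed, type-checked result
str_rev_cp <- function(x) {
  vapply(
    x,
    function(xx) intToUtf8(rev(utf8ToInt(xx))),
    character(1),
    USE.NAMES = FALSE  # sapply() would otherwise name results after the inputs
  )
}

res <- str_rev_cp(c("abc", "Statistics"))
print(res)
```

Like the original, this reverses code points, so text with combining characters may not come out as expected.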
If you know that you're going to be working with ASCII characters and speed matters, there is a fast C implementation for reversing a vector of strings built into Kmisc:
install.packages("Kmisc")
Kmisc::str_rev(x)
You can also use the IRanges package.
library(IRanges)
x <- "ATGCSDS"
reverse(x)
# [1] "SDSCGTA"
You can also use the Biostrings package.
library(Biostrings)
x <- "ATGCSDS"
reverse(x)
# [1] "SDSCGTA"
If your data is in a data.frame, you can use sqldf:
myStrings <- data.frame(forward = c("does", "this", "actually", "work"))
library(sqldf)
sqldf("select forward, reverse(forward) `reverse` from myStrings")
# forward reverse
# 1 does seod
# 2 this siht
# 3 actually yllautca
# 4 work krow
Here is a function that returns the whole reversed string, or optionally the reverse string keeping only the elements specified by index, counting backward from the last character.
revString = function(string, index = 1:nchar(string)){
paste(rev(unlist(strsplit(string, NULL)))[index], collapse = "")
}
First, define an easily recognizable string as an example:
(myString <- paste(letters, collapse = ""))
[1] "abcdefghijklmnopqrstuvwxyz"
Now try out the function revString with and without the index:
revString(myString)
[1] "zyxwvutsrqponmlkjihgfedcba"
revString(myString, 1:5)
[1] "zyxwv"
The easiest way to reverse a string:
#reverse string----------------------------------------------------------------
revString <- function(text){
paste(rev(unlist(strsplit(text,NULL))),collapse="")
}
#example:
revString("abcdef")
You can do this with the rev() function, as mentioned in a previous post.
X <- "MyString"
RevX <- paste(rev(unlist(strsplit(X,NULL))),collapse="")
Output : "gnirtSyM"
Here's a solution with gsub. Although I agree that it's easier with strsplit and paste (as pointed out in the other answers), it may be interesting to see that it works with regular expressions too:
test <- "greg"
n <- nchar(test) # the number of characters in the string
gsub(paste(rep("(.)", n), collapse = ""),
paste("", seq(n, 1), sep = "\\", collapse = ""),
test)
# [1] "gerg"
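One caveat, stated as an assumption: R's regex documentation describes replacement backreferences \1 to \9 only, so the gsub approach is presumably limited to strings of at most nine characters. A guarded sketch (rev_gsub is a hypothetical wrapper) falls back to strsplit for longer input:

```r
# Hypothetical wrapper: use the gsub() trick only when it is safe
rev_gsub <- function(s) {
  n <- nchar(s)
  if (n == 0) return(s)
  if (n > 9) {
    # Assumption: only backreferences 1 to 9 are usable, so fall back
    return(paste(rev(strsplit(s, NULL)[[1]]), collapse = ""))
  }
  pattern <- paste(rep("(.)", n), collapse = "")
  replacement <- paste("", seq(n, 1), sep = "\\", collapse = "")
  gsub(pattern, replacement, s)
}

print(rev_gsub("greg"))
print(rev_gsub("abcdefghijkl"))
```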
##function to reverse the given word or sentence
reverse <- function(mystring){
n <- nchar(mystring)
revstring <- rep(NA, n)
b <- n:1
c <- rev(b)
for (i in 1:n) {
revstring[i] <- substr(mystring,c[(n+1)- i], b[i])
}
newrevstring <- paste(revstring, sep = "", collapse = "")
return (cat("your string =", mystring, "\n",
("reverse letters = "), revstring, "\n",
"reverse string =", newrevstring,"\n"))
}
Here is one more base-R solution:
# Define function
strrev <- function(x) {
nc <- nchar(x)
paste(substring(x, nc:1, nc:1), collapse = "")
}
# Example
strrev("Sore was I ere I saw Eros")
[1] "sorE was I ere I saw eroS"
Solution was inspired by these U. Auckland slides.
The following code takes input from the user and reverses the entire string:
revstring=function(s)
print(paste(rev(strsplit(s,"")[[1]]),collapse=""))
str=readline("Enter the string:")
revstring(str)
So apparently front-end JS developers get asked to do this (for interviews) in JS without using built-in reverse functions. It took me a few minutes, but I came up with:
string <- 'hello'
foo <- vector()
for (i in nchar(string):1) foo <- append(foo,unlist(strsplit(string,''))[i])
paste0(foo,collapse='')
Which all could be wrapped in a function...
What about higher-order functionals? Reduce?
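As one possible answer to the Reduce question, here is a minimal sketch (rev_reduce is a hypothetical name) that prepends each character to an accumulator:

```r
# Hypothetical helper: reverse a single string with Reduce()
rev_reduce <- function(s) {
  chars <- strsplit(s, "")[[1]]
  # Prepend each character to the accumulator, starting from ""
  Reduce(function(acc, ch) paste0(ch, acc), chars, "")
}

res <- rev_reduce("hello")
print(res)
```

This builds a new string at every step, so it is likely slower than rev() plus paste() for long strings.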
Suppose I have a long string:
"XOVEWVJIEWNIGOIWENVOIWEWVWEW"
How do I split this to get every 5 characters followed by a space?
"XOVEW VJIEW NIGOI WENVO IWEWV WEW"
Note that the last one is shorter.
I can do a loop where I constantly count and build a new string character by character but surely there must be something better no?
Using regular expressions:
gsub("(.{5})", "\\1 ", "XOVEWVJIEWNIGOIWENVOIWEWVWEW")
# [1] "XOVEW VJIEW NIGOI WENVO IWEWV WEW"
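When the string length is an exact multiple of 5, the pattern above also adds a space after the last group. A sketch of one way around that, using a Perl-style lookahead (the input string is made up for illustration):

```r
x <- "ABCDEFGHIJ"  # hypothetical input: exactly two groups of five

# The plain pattern adds a space after every group, including the last
with_trailing <- gsub("(.{5})", "\\1 ", x)

# The lookahead (?=.) only matches groups that are followed by another character
without_trailing <- gsub("(.{5})(?=.)", "\\1 ", x, perl = TRUE)

print(c(with_trailing, without_trailing))
```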
Using sapply
> string <- "XOVEWVJIEWNIGOIWENVOIWEWVWEW"
> sapply(seq(from=1, to=nchar(string), by=5), function(i) substr(string, i, i+4))
[1] "XOVEW" "VJIEW" "NIGOI" "WENVO" "IWEWV" "WEW"
You can try something like the following:
s <- "XOVEWVJIEWNIGOIWENVOIWEWVWEW" # Original string
l <- seq(from=5, to=nchar(s), by=5) # Calculate the location where to chop
# Add sentinels 0 (beginning of string) and nchar(s) (end of string)
# and take substrings. (Thanks to @flodel for the condensed expression)
mapply(substr, list(s), c(0, l) + 1, c(l, nchar(s)))
Output:
[1] "XOVEW" "VJIEW" "NIGOI" "WENVO" "IWEWV" "WEW"
Now you can paste the resulting vector (with collapse=' ') to obtain a single string with spaces.
No *apply stringi solution:
x <- "XOVEWVJIEWNIGOIWENVOIWEWVWEW"
stri_sub(x, seq(1, stri_length(x),by=5), length=5)
[1] "XOVEW" "VJIEW" "NIGOI" "WENVO" "IWEWV" "WEW"
This extracts substrings just as in @Jilber's answer, but the stri_sub function is vectorized, so we don't need to use *apply here.
You can also use substring without a loop; substring is the vectorized form of substr:
x <- "XOVEWVJIEWNIGOIWENVOIWEWVWEW"
n <- seq(1, nc <- nchar(x), by = 5)
paste(substring(x, n, c(n[-1]-1, nc)), collapse = " ")
# [1] "XOVEW VJIEW NIGOI WENVO IWEWV WEW"
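The same idea can be wrapped in a reusable function. This is a sketch only: chunk_string is a hypothetical name, and it relies on substring() accepting a stop position past the end of the string.

```r
# Hypothetical helper: split one string into pieces of at most `size` characters
chunk_string <- function(x, size) {
  n <- nchar(x)
  if (n == 0) return(character(0))
  starts <- seq(1, n, by = size)
  # substring() is vectorised over start/stop; a stop past the end is allowed
  substring(x, starts, starts + size - 1)
}

pieces <- chunk_string("XOVEWVJIEWNIGOIWENVOIWEWVWEW", 5)
joined <- paste(pieces, collapse = " ")
print(joined)
```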