By default,
paste('hi', 'there')
[1] "hi there"
What if I want a function that does the following?
reverse_paste('hi','there')
[1] "there hi"
Is there a way to modify the elements of ... to get the second result?
I am not sure how to handle the ... arguments in this case. I want to use the function via apply to concatenate the elements of a data frame made with expand.grid; the order of pasting is the opposite of the order of expansion, and both orders matter.
Edit: I would like to clarify that I would also like to be able to use the sep and collapse keyword arguments.
You can get at the arguments in ... using list. Then you just need to reverse it, add the other arguments, and call paste.
paste_rev <- function(..., sep=" ", collapse=NULL) {
arg <- c(rev(list(...)), list(sep=sep, collapse=collapse))
do.call(paste, arg)
}
paste_rev(c("a1", "a2"), c("b", "c"), sep=".")
## [1] "b.a1" "c.a2"
However, if you're using this to apply paste to a data frame, this won't work: apply passes each row to the function as a single vector, so paste never receives multiple arguments and there is nothing for paste_rev to reorder.
out <- expand.grid(a=c("a1", "a2"), b=c("b1", "b2"), stringsAsFactors=FALSE)
out
## a b
## 1 a1 b1
## 2 a2 b1
## 3 a1 b2
## 4 a2 b2
apply(out, 1, paste_rev, collapse=".")
## [1] "a1.b1" "a2.b1" "a1.b2" "a2.b2"
Instead, I'd simply reverse the order of the columns before pasting.
apply(out[rev(colnames(out))], 1, paste, collapse=".")
## [1] "b1.a1" "b1.a2" "b2.a1" "b2.a2"
Or, reverse the elements of each argument individually.
paste_rev2 <- function(..., sep=" ", collapse=NULL) {
arg <- c(lapply(list(...), rev), list(sep=sep, collapse=collapse))
do.call(paste, arg)
}
apply(out, 1, paste_rev2, collapse=".")
## [1] "b1.a1" "b1.a2" "b2.a1" "b2.a2"
For a generic function that could do either, you could add a couple arguments.
pasteX <- function(..., sep=" ", collapse=NULL,
rev.elements=FALSE, rev.arguments=FALSE) {
arg <- list(...)
if(rev.arguments) arg <- rev(arg)
if(rev.elements) arg <- lapply(arg, rev)
do.call(paste, c(arg, list(sep=sep, collapse=collapse)))
}
pasteX(c("a", "b"), c(1, 2))
## [1] "a 1" "b 2"
pasteX(c("a", "b"), c(1, 2), rev.elements=TRUE)
## [1] "b 2" "a 1"
pasteX(c("a", "b"), c(1, 2), rev.arguments=TRUE)
## [1] "1 a" "2 b"
pasteX(c("a", "b"), c(1, 2), rev.elements=TRUE, rev.arguments=TRUE)
## [1] "2 b" "1 a"
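For the expand.grid case specifically, the row-wise apply can be avoided altogether. As a minimal sketch (assuming the same out data frame as above; the names cols and res are only illustrative), do.call can hand the reversed columns to paste as separate arguments, which keeps paste vectorized and lets sep work as usual:

```r
out <- expand.grid(a = c("a1", "a2"), b = c("b1", "b2"),
                   stringsAsFactors = FALSE)

# Reverse the column order, then pass each column to paste() as its own argument.
cols <- unname(as.list(out[rev(names(out))]))
res <- do.call(paste, c(cols, list(sep = ".")))
res
# "b1.a1" "b1.a2" "b2.a1" "b2.a2"
```

Because each column is a separate argument here, sep (not collapse) is the separator that applies.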
I'm trying to teach myself R and in doing some sample problems I came across the need to reverse a string.
Here's what I've tried so far but the paste operation doesn't seem to have any effect.
There must be something I'm not understanding about lists? (I also don't understand why I need the [[1]] after strsplit.)
test <- strsplit("greg", NULL)[[1]]
test
# [1] "g" "r" "e" "g"
test_rev <- rev(test)
test_rev
# [1] "g" "e" "r" "g"
paste(test_rev)
# [1] "g" "e" "r" "g"
From ?strsplit, a function that'll reverse every string in a vector of strings:
## a useful function: rev() for strings
strReverse <- function(x)
sapply(lapply(strsplit(x, NULL), rev), paste, collapse="")
strReverse(c("abc", "Statistics"))
# [1] "cba" "scitsitatS"
stringi has had this function for quite a long time:
stringi::stri_reverse("abcdef")
## [1] "fedcba"
Also note that it's vectorized:
stringi::stri_reverse(c("a", "ab", "abc"))
## [1] "a" "ba" "cba"
As @mplourde points out, you want the collapse argument:
paste(test_rev, collapse='')
Most commands in R are vectorized, but how exactly the command handles vectors depends on the command. paste will operate over multiple vectors, combining the ith element of each:
> paste(letters[1:5],letters[1:5])
[1] "a a" "b b" "c c" "d d" "e e"
collapse tells it to operate within a vector instead.
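To tie this back to the question, here is a small sketch of why paste appeared to do nothing and why [[1]] is needed. strsplit returns a list with one element per input string, so [[1]] extracts the character vector for the first (here, only) string. The variable names are illustrative only.

```r
parts <- strsplit("greg", NULL)
is.list(parts)        # TRUE: one list element per input string
chars <- parts[[1]]   # the character vector "g" "r" "e" "g"

# With a single vector and no collapse, paste() has nothing to combine,
# so it returns the elements unchanged.
unchanged <- paste(rev(chars))

# collapse joins the elements of the result into a single string.
reversed <- paste(rev(chars), collapse = "")
reversed
# "gerg"
```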
The following can be a useful way to reverse a vector of strings x, and is slightly faster (and more memory efficient) because it avoids generating a list (as in using strsplit):
x <- rep( paste( collapse="", LETTERS ), 100 )
str_rev <- function(x) {
sapply( x, function(xx) {
intToUtf8( rev( utf8ToInt( xx ) ) )
} )
}
str_rev(x)
If you know that you're going to be working with ASCII characters and speed matters, there is a fast C implementation for reversing a vector of strings built into Kmisc:
install.packages("Kmisc")
# Qualify the call; a bare str_rev(x) would pick up the function defined above.
Kmisc::str_rev(x)
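One detail of the sapply version above: sapply uses the input strings as names for the result, so the reversed vector comes back named. A sketch of a variant using vapply with USE.NAMES = FALSE (the name str_rev_plain is made up for this example) returns a plain character vector and also checks that each result is a single string:

```r
str_rev_plain <- function(x) {
  # utf8ToInt() works on one string at a time, hence the loop over x.
  vapply(x, function(xx) intToUtf8(rev(utf8ToInt(xx))),
         character(1), USE.NAMES = FALSE)
}

rev_plain <- str_rev_plain(c("abc", "Statistics"))
rev_plain
# "cba" "scitsitatS"
```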
You can also use the IRanges package.
library(IRanges)
x <- "ATGCSDS"
reverse(x)
# [1] "SDSCGTA"
You can also use the Biostrings package.
library(Biostrings)
x <- "ATGCSDS"
reverse(x)
# [1] "SDSCGTA"
If your data is in a data.frame, you can use sqldf:
myStrings <- data.frame(forward = c("does", "this", "actually", "work"))
library(sqldf)
sqldf("select forward, reverse(forward) `reverse` from myStrings")
# forward reverse
# 1 does seod
# 2 this siht
# 3 actually yllautca
# 4 work krow
Here is a function that returns the whole reversed string, or optionally the reverse string keeping only the elements specified by index, counting backward from the last character.
revString = function(string, index = 1:nchar(string)){
paste(rev(unlist(strsplit(string, NULL)))[index], collapse = "")
}
First, define an easily recognizable string as an example:
(myString <- paste(letters, collapse = ""))
[1] "abcdefghijklmnopqrstuvwxyz"
Now try out the function revString with and without the index:
revString(myString)
[1] "zyxwvutsrqponmlkjihgfedcba"
revString(myString, 1:5)
[1] "zyxwv"
The easiest way to reverse a string:
#reverse string----------------------------------------------------------------
revString <- function(text){
paste(rev(unlist(strsplit(text,NULL))),collapse="")
}
#example:
revString("abcdef")
You can do this with the rev() function, as mentioned in a previous post.
X <- "MyString"
RevX <- paste(rev(unlist(strsplit(X,NULL))),collapse="")
RevX
# [1] "gnirtSyM"
Here's a solution with gsub. Although I agree that it's easier with strsplit and paste (as pointed out in the other answers), it may be interesting to see that it works with regular expressions too:
test <- "greg"
n <- nchar(test) # the number of characters in the string
gsub(paste(rep("(.)", n), collapse = ""),
paste("", seq(n, 1), sep = "\\", collapse = ""),
test)
# [1] "gerg"
##function to reverse the given word or sentence
reverse <- function(mystring){
n <- nchar(mystring)
revstring <- rep(NA, n)
b <- n:1
c <- rev(b)
for (i in 1:n) {
revstring[i] <- substr(mystring,c[(n+1)- i], b[i])
}
newrevstring <- paste(revstring, sep = "", collapse = "")
return (cat("your string =", mystring, "\n",
("reverse letters = "), revstring, "\n",
"reverse string =", newrevstring,"\n"))
}
Here is one more base-R solution:
# Define function
strrev <- function(x) {
nc <- nchar(x)
paste(substring(x, nc:1, nc:1), collapse = "")
}
# Example
strrev("Sore was I ere I saw Eros")
[1] "sorE was I ere I saw eroS"
Solution was inspired by these U. Auckland slides.
The following code takes input from the user and reverses the entire string:
revstring=function(s)
print(paste(rev(strsplit(s,"")[[1]]),collapse=""))
str=readline("Enter the string:")
revstring(str)
So apparently front-end JS developers get asked to do this (for interviews) in JS without using built-in reverse functions. It took me a few minutes, but I came up with:
string <- 'hello'
foo <- vector()
for (i in nchar(string):1) foo <- append(foo,unlist(strsplit(string,''))[i])
paste0(foo,collapse='')
Which all could be wrapped in a function...
What about higher-order functionals? Reduce?
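On the Reduce question: one possible sketch, not taken from any of the answers above, is to fold over the characters and prepend each one to an accumulator. The function name rev_reduce is hypothetical.

```r
rev_reduce <- function(s) {
  chars <- strsplit(s, "")[[1]]
  # Start from "" and put each new character in front of what has been built so far.
  Reduce(function(acc, ch) paste0(ch, acc), chars, "")
}

rev_reduce("hello")
# "olleh"
```

This still relies on strsplit and paste0, so it mainly shows the fold pattern rather than avoiding built-ins.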
Regarding the bounty
Ben Bolker's paste2-solution produces a "" when the strings that are pasted contains NA's in the same position. Like this,
> paste2(c("a","b", "c", NA), c("A","B", NA, NA))
[1] "a, A" "b, B" "c" ""
The fourth element is "" instead of NA. The desired result is,
[1] "a, A" "b, B" "c" NA
I'm offering up this small bounty for anyone who can fix this.
Original question
I've read the help page ?paste, but I don't understand how to have R ignore NAs. I do the following,
foo <- LETTERS[1:4]
foo[4] <- NA
foo
[1] "A" "B" "C" NA
paste(1:4, foo, sep = ", ")
and get
[1] "1, A" "2, B" "3, C" "4, NA"
What I would like to get,
[1] "1, A" "2, B" "3, C" "4"
I could do like this,
sub(', NA$', '', paste(1:4, foo, sep = ", "))
[1] "1, A" "2, B" "3, C" "4"
but that seems like a detour.
I know this question is many years old, but it's still the top google result for r paste na. I was looking for a quick solution to what I assumed was a simple problem, and was somewhat taken aback by the complexity of the answers. I opted for a different solution, and am posting it here in case anyone else is interested.
bar <- apply(cbind(1:4, foo), 1,
function(x) paste(x[!is.na(x)], collapse = ", "))
bar
[1] "1, A" "2, B" "3, C" "4"
In case it isn't obvious, this will work on any number of vectors with NAs in any positions.
IMHO, the advantage of this over the existing answers is legibility. It's a one-liner, which is always nice, and it doesn't rely on a bunch of regexes and if/else statements which may trip up your colleagues or future self. Erik Shilts' answer mostly shares these advantages, but assumes there are only two vectors and that only the last of them contains NAs.
My solution doesn't satisfy the requirement in your edit, because my project has the opposite requirement. However, you can easily solve this by adding a second line borrowed from 42-'s answer:
is.na(bar) <- bar == ""
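As a sketch of the claim that this extends to any number of vectors, here is the same idea with three made-up vectors (a, b and n are illustrative), followed by the is.na step so that a row consisting only of NA yields NA rather than an empty string:

```r
a <- c("a", "b", NA, NA)
b <- c(NA, "B", "C", NA)
n <- c("1", "2", "3", NA)

# Drop the NAs in each row, then join whatever is left.
res <- apply(cbind(a, b, n), 1,
             function(x) paste(x[!is.na(x)], collapse = ", "))

# The all-NA row produced ""; turn it into a real NA.
is.na(res) <- res == ""
res
# "a, 1" "b, B, 2" "C, 3" NA
```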
For the purpose of a "true-NA": Seems the most direct route is just to modify the value returned by paste2 to be NA when the value is ""
paste3 <- function(...,sep=", ") {
L <- list(...)
L <- lapply(L,function(x) {x[is.na(x)] <- ""; x})
ret <-gsub(paste0("(^",sep,"|",sep,"$)"),"",
gsub(paste0(sep,sep),sep,
do.call(paste,c(L,list(sep=sep)))))
is.na(ret) <- ret==""
ret
}
val<- paste3(c("a","b", "c", NA), c("A","B", NA, NA))
val
#[1] "a, A" "b, B" "c" NA
I found a dplyr/tidyverse solution to that question, which is rather elegant in my opinion.
library(tidyr)
foo <- LETTERS[1:4]
foo[4] <- NA
df <- data.frame(foo, num = 1:4)
df %>% unite(., col = "New.Col", num, foo, na.rm=TRUE, sep = ",")
#   New.Col
# 1     1,A
# 2     2,B
# 3     3,C
# 4       4
A function that follows up on @ErikShilts's answer and @agstudy's comment. It generalizes the situation slightly by allowing sep to be specified and handling cases where any element (first, last, or intermediate) is NA. (It might break if there are multiple NA values in a row, or in other tricky cases ...) By the way, note that this situation is described exactly in the second paragraph of the Details section of ?paste, which indicates that at least the R authors are aware of the situation (although no solution is offered).
paste2 <- function(...,sep=", ") {
L <- list(...)
L <- lapply(L,function(x) {x[is.na(x)] <- ""; x})
gsub(paste0("(^",sep,"|",sep,"$)"),"",
gsub(paste0(sep,sep),sep,
do.call(paste,c(L,list(sep=sep)))))
}
foo <- c(LETTERS[1:3],NA)
bar <- c(NA,2:4)
baz <- c("a",NA,"c","d")
paste2(foo,bar,baz)
# [1] "A, a" "B, 2" "C, 3, c" "4, d"
This doesn't handle @agstudy's suggestions of (1) incorporating the optional collapse argument; (2) making NA-removal optional by adding an na.rm argument (and setting the default to FALSE to make paste2 backward compatible with paste). If one wanted to make this more sophisticated (i.e. remove multiple sequential NAs) or faster, it might make sense to write it in C++ via Rcpp (I don't know much about C++'s string handling, but it might not be too hard -- see "convert Rcpp::CharacterVector to std::string" and "Concatenating strings doesn't work as expected" for a start ...)
As Ben Bolker mentioned the above approaches may fall over if there are multiple NAs in a row. I tried a different approach that seems to overcome this.
paste4 <- function(x, sep = ", ") {
x <- gsub("^\\s+|\\s+$", "", x)
ret <- paste(x[!is.na(x) & !(x %in% "")], collapse = sep)
is.na(ret) <- ret == ""
return(ret)
}
The second line strips out extra whitespace introduced when concatenating text and numbers.
The above code can be used to concatenate multiple columns (or rows) of a dataframe using the apply command, or repackaged to first coerce the data into a dataframe if needed.
EDIT
After a few more hours' thought, I think the following code incorporates the suggestions above to allow specification of the collapse and na.rm options.
paste5 <- function(..., sep = " ", collapse = NULL, na.rm = F) {
if (na.rm == F)
paste(..., sep = sep, collapse = collapse)
else
if (na.rm == T) {
paste.na <- function(x, sep) {
x <- gsub("^\\s+|\\s+$", "", x)
ret <- paste(na.omit(x), collapse = sep)
is.na(ret) <- ret == ""
return(ret)
}
df <- data.frame(..., stringsAsFactors = F)
ret <- apply(df, 1, FUN = function(x) paste.na(x, sep))
if (is.null(collapse))
ret
else {
paste.na(ret, sep = collapse)
}
}
}
As above, na.omit(x) can be replaced with x[!is.na(x) & !(x %in% "")] to also drop empty strings if desired. Note that using collapse with na.rm = T returns a string without any "NA", though this could be changed by replacing the last line of code with paste(ret, collapse = collapse).
nth <- paste0(1:12, c("st", "nd", "rd", rep("th", 9)))
mnth <- month.abb
nth[4:5] <- NA
mnth[5:6] <- NA
paste5(mnth, nth)
[1] "Jan 1st" "Feb 2nd" "Mar 3rd" "Apr NA" "NA NA" "NA 6th" "Jul 7th" "Aug 8th" "Sep 9th" "Oct 10th" "Nov 11th" "Dec 12th"
paste5(mnth, nth, sep = ": ", collapse = "; ", na.rm = T)
[1] "Jan: 1st; Feb: 2nd; Mar: 3rd; Apr; 6th; Jul: 7th; Aug: 8th; Sep: 9th; Oct: 10th; Nov: 11th; Dec: 12th"
paste3(c("a","b", "c", NA), c("A","B", NA, NA), c(1,2,NA,4), c(5,6,7,8))
[1] "a, A, 1, 5" "b, B, 2, 6" "c, , 7" "4, 8"
paste5(c("a","b", "c", NA), c("A","B", NA, NA), c(1,2,NA,4), c(5,6,7,8), sep = ", ", na.rm = T)
[1] "a, A, 1, 5" "b, B, 2, 6" "c, 7" "4, 8"
You can use ifelse, a vectorized if-else construct to determine if a value is NA and substitute a blank. You'll then use gsub to strip out the trailing ", " if it isn't followed by any other string.
gsub(", $", "", paste(1:4, ifelse(is.na(foo), "", foo), sep = ", "))
Your answer is correct. There isn't a better way to do it. This issue is explicitly mentioned in the paste documentation in the Details section.
If working with df or tibbles using tidyverse, I use mutate_all or mutate_at with str_replace_na before paste or unite to avoid pasting NAs.
library(tidyverse)
new_df <- df %>%
mutate_all(~str_replace_na(., "")) %>%
mutate(combo_var = paste0(var1, var2, var3))
OR
new_df <- df %>%
mutate_at(c('var1', 'var2'), ~str_replace_na(., "")) %>%
mutate(combo_var = paste0(var1, var2))
This can be achieved in a single line.
For example:
vec<-c("A","B",NA,"D","E")
res<-paste(vec[!is.na(vec)], collapse=',' )
print(res)
[1] "A,B,D,E"
Or remove the NAs after paste with str_replace_all from stringr (note that this also strips the letters NA from inside ordinary text):
data$`1` <- stringr::str_replace_all(data$`1`, "NA", "")
A variant of Joe's solution (https://stackoverflow.com/a/49201394/3831096) that respects both sep and collapse and returns NA when all values are NA is:
paste_missing <- function(..., sep=" ", collapse=NULL) {
ret <-
apply(
X=cbind(...),
MARGIN=1,
FUN=function(x) {
if (all(is.na(x))) {
NA_character_
} else {
paste(x[!is.na(x)], collapse = sep)
}
}
)
if (!is.null(collapse)) {
paste(ret, collapse=collapse)
} else {
ret
}
}
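The answer above gives no usage example, so here is a short sketch. The function body is a compacted copy of paste_missing so that the snippet runs on its own; the input vectors and the names rows and joined are made up. It also shows one behaviour worth knowing about: when collapse is given, an all-NA row is handed to paste, which writes it out as the text "NA".

```r
# Compacted copy of paste_missing from above, so the example is self-contained.
paste_missing <- function(..., sep = " ", collapse = NULL) {
  ret <- apply(cbind(...), 1, function(x) {
    if (all(is.na(x))) NA_character_ else paste(x[!is.na(x)], collapse = sep)
  })
  if (!is.null(collapse)) paste(ret, collapse = collapse) else ret
}

rows <- paste_missing(c("a", "b", "c", NA), c("A", "B", NA, NA), sep = ", ")
rows
# "a, A" "b, B" "c" NA

# With collapse, the NA from the all-NA row becomes the string "NA".
joined <- paste_missing(c("a", "b", "c", NA), c("A", "B", NA, NA),
                        sep = ", ", collapse = "; ")
joined
# "a, A; b, B; c; NA"
```

If that is not wanted, one option would be to drop the NA entries before the final paste, as the paste2 answer in this thread does with na.omit.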
Here is a solution that behaves more like paste and handles more edge cases than current solutions (empty strings, "NA" strings, more than 2 arguments, use of collapse argument...).
paste2 <- function(..., sep = " ", collapse = NULL, na.rm = FALSE){
# in default case, use paste
if(!na.rm) return(paste(..., sep = sep, collapse = collapse))
# cbind is convenient to recycle, it warns though so use suppressWarnings
dots <- suppressWarnings(cbind(...))
res <- apply(dots, 1, function(...) {
if(all(is.na(c(...)))) return(NA)
do.call(paste, as.list(c(na.omit(c(...)), sep = sep)))
})
if(is.null(collapse)) res else
paste(na.omit(res), collapse = collapse)
}
# behaves like `paste()` by default
paste2(c("a","b", "c", NA), c("A","B", NA, NA))
#> [1] "a A" "b B" "c NA" "NA NA"
# trigger desired behavior by setting `na.rm = TRUE` and `sep = ","`
paste2(c("a","b", "c", NA), c("A","B", NA, NA), sep = ",", na.rm = TRUE)
#> [1] "a,A" "b,B" "c" NA
# handles edge cases
paste2(c("a","b", "c", NA, "", "", ""),
c("a","b", "c", NA, "", "", "NA"),
c("A","B", NA, NA, NA, "", ""),
sep = ",", na.rm = TRUE)
#> [1] "a,a,A" "b,b,B" "c,c" NA "," ",," ",NA,"
Created on 2019-10-01 by the reprex package (v0.3.0)
This works for me
library(stringr)
library(dplyr)  # if_else() comes from dplyr, not stringr
foo <- LETTERS[1:4]
foo[4] <- NA
foo
# [1] "A" "B" "C" NA
if_else(!is.na(foo),
str_c(1:4, str_replace_na(foo, ""), sep = ", "),
str_c(1:4, str_replace_na(foo, ""), sep = "")
)
# [1] "1, A" "2, B" "3, C" "4"
Updating @Erik Shilts' solution to get rid of the trailing comma:
x = gsub(",$", "", paste(1:4, ifelse(is.na(foo), "", foo), sep = ","))
If a trailing "," still remains in it, apply the same substitution once more:
x <- gsub(",$", "", x)