strsplit in R not working for $ as split character [duplicate] - r

This question already has answers here:
How do I strip dollar signs ($) from data/ escape special characters in R?
(4 answers)
Closed 7 years ago.
> str = "a$b$c"
> astr <- strsplit(str,"$")
> astr
[[1]]
[1] "a$b$c"
Still trying to figure the answer out!

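You need to escape it: the split argument is treated as a regular expression by default, and $ is the end-of-string anchor there.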
strsplit(str,"\\$")

Another option is to pass fixed = TRUE, which treats the split string literally:
strsplit(str,"$",fixed=TRUE)
## [1] "a" "b" "c"
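To compare the options side by side, here is a minimal sketch. The variable names (s, escaped, bracket, literal) are only illustrative and not from the original posts; the bracket form "[$]" is a third way to make the dollar sign literal.

```r
s <- "a$b$c"

# Escape the regex metacharacter with a double backslash
escaped <- strsplit(s, "\\$")[[1]]

# Put it in a character class, where $ has no special meaning
bracket <- strsplit(s, "[$]")[[1]]

# Skip regex interpretation entirely
literal <- strsplit(s, "$", fixed = TRUE)[[1]]

print(escaped)
# [1] "a" "b" "c"
```

All three calls return the same character vector; fixed = TRUE is usually the simplest choice when the separator is a plain string.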

Related

Print a result without a preceding square bracket in R [duplicate]

This question already has an answer here:
R how to not display the number into brackets of the row count in output
(1 answer)
Closed 2 years ago.
x <- 5+2
print(x)
[1] 7
How to suppress [1] and only print 7?
Similarly for characters:
y <- "comp"
print(y)
[1] "comp"
I want to remove both [1] and " ". Any help is appreciated!
Thanks!
With cat, it is possible
cat(x, '\n')
7
Or for characters (cat prints no quotes by default, so wrap the value in dQuote only if you want to keep them):
cat(dQuote(letters[1], FALSE), '\n')
"a"

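As a rough sketch of the alternatives for the question above, using the asker's own x and y: cat and writeLines drop both the [1] index and the quotes, while print(quote = FALSE) drops only the quotes.

```r
x <- 5 + 2
y <- "comp"

cat(x, "\n", sep = "")   # 7
cat(y, "\n", sep = "")   # comp
writeLines(y)            # comp
print(y, quote = FALSE)  # [1] comp
```
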
How to obtain character at a specific place? [duplicate]

This question already has answers here:
str_extract: Extracting exactly nth word from a string
(5 answers)
Closed 3 years ago.
example:
"A.B.C.D"
"apple.good.sad.sea"
"X1.AN2.ED3.LK8"
What I need is to obtain the string specifically between the second dot and the third dot.
result:
"C"
"sad"
"ED3"
How can I do this?
You can use base::strsplit and loop through the elements to get the 3rd one:
v <- c("A.B.C.D", "apple.good.sad.sea", "X1.AN2.ED3.LK8")
sapply(strsplit(v, "\\."), `[[`, 3L)
output:
[1] "C" "sad" "ED3"
For a single string str, you can use unlist(strsplit(str, split = ".", fixed = TRUE))[3] to get the third sub-string. The fixed = TRUE is required because a bare "." is a regular expression that matches any character.
I'd use sub, with x being the character vector:
sub("^([^.]*\\.){2}([^.]*)\\..*", "\\2", x)
# [1] "C" "sad" "ED3"
Using regex in gsub.
v <- c("A.B.C.D", "apple.good.sad.sea", "X1.AN2.ED3.LK8", "A.B.C.D.E")
gsub("(.*?\\.){2}(.*?)(\\..*)", "\\2", v)
# [1] "C" "sad" "ED3" "C"
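The strsplit approach can be wrapped in a small helper. This is only a sketch: nth_field is a hypothetical name, not a base R function, and returning NA for strings with too few fields is an assumption about the desired behaviour.

```r
# Hypothetical helper: return the n-th separator-delimited field of each string
nth_field <- function(x, n, sep = ".") {
  parts <- strsplit(x, sep, fixed = TRUE)  # literal split, no regex escaping needed
  vapply(
    parts,
    function(p) if (length(p) >= n) p[[n]] else NA_character_,
    character(1)
  )
}

v <- c("A.B.C.D", "apple.good.sad.sea", "X1.AN2.ED3.LK8", "A.B")
third <- nth_field(v, 3)
print(third)
# [1] "C"   "sad" "ED3" NA
```

Unlike the sapply version with `[[`, this does not raise a subscript error when a string has fewer than n fields.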

How to split words in R while keeping contractions [duplicate]

This question already has an answer here:
strsplit on all spaces and punctuation except apostrophes [duplicate]
(1 answer)
Closed 7 years ago.
I'm trying to turn a character vector novel.lower.mid into a list of single words. So far, this is the code I've used:
midnight.words.l <- strsplit(novel.lower.mid, "\\W")
This produces a list of all the words. However, it splits everything, including contractions. The word "can't" becomes "can" and "t". How do I make sure those words aren't separated, or that the function just ignores the apostrophe?
We can use
library(stringr)
str_extract_all(novel.lower.mid, "\\b[[:alnum:]']+\\b")
Or
strsplit(novel.lower.mid, "(?!')\\W", perl=TRUE)
If you just want your current "\\W" split to leave apostrophes alone, use a negated character class that excludes both word characters (\w) and ':
novel.lower.mid <- c("I won't eat", "green eggs and", "ham")
strsplit(novel.lower.mid, "[^\\w']", perl=TRUE)
# [[1]]
# [1] "I" "won't" "eat"
#
# [[2]]
# [1] "green" "eggs" "and"
#
# [[3]]
# [1] "ham"
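Real text usually has runs of separators such as ", " or leading punctuation, which leave empty strings in the result. A possible variation, assuming made-up sample text, is to add + to the pattern and drop any remaining empty elements:

```r
# Sample text is illustrative only
text <- c("i can't eat, sam.", "green eggs and ham")

# Split on runs of characters that are neither word characters nor apostrophes
words <- strsplit(text, "[^\\w']+", perl = TRUE)

# A leading separator would still produce an empty first element; remove those
words <- lapply(words, function(w) w[nzchar(w)])

print(words[[1]])
# [1] "i"     "can't" "eat"   "sam"
```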

Taking characters to the left of a character [duplicate]

This question already has answers here:
Splitting a file name into name,extension
(3 answers)
substring of a path variable
(2 answers)
Closed 9 years ago.
Given some data
hello <- c('13.txt','12.txt','14.txt')
I want to just take the numbers and convert to numeric, i.e. remove the .txt
You want file_path_sans_ext from the tools package
library(tools)
hello <- c('13.txt','12.txt','14.txt')
file_path_sans_ext(hello)
## [1] "13" "12" "14"
You can do this with regular expressions using the function gsub on the "hello" object in your original post.
hello <- c('13.txt','12.txt','14.txt')
as.numeric(gsub("([0-9]+).*","\\1",hello))
#[1] 13 12 14
Another regex solution
hello <- c("13.txt", "12.txt", "14.txt")
as.numeric(regmatches(hello, gregexpr("[0-9]+", hello)))
## [1] 13 12 14
If you know your extensions are all .txt then you can use substr() to drop the last four characters
> hello <- c('13.txt','12.txt','14.txt')
> as.numeric(substr(hello, 1, nchar(hello) - 4))
#[1] 13 12 14
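Another option, sketched here with the same hello vector, is to strip the extension with a pattern anchored at the end of the string, so digits or dots elsewhere in the name are left untouched. The names stems, nums and mixed are illustrative.

```r
hello <- c("13.txt", "12.txt", "14.txt")

# Remove a literal ".txt" only when it ends the string
stems <- sub("\\.txt$", "", hello)
nums <- as.numeric(stems)
print(nums)
# [1] 13 12 14

# Variant for arbitrary extensions: drop everything from the last dot onward
mixed <- sub("\\.[^.]*$", "", c("13.txt", "12.csv"))
print(mixed)
# [1] "13" "12"
```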

R: extract directory out of a path [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
How do I extract a file/folder_name only from a path?
May I ask how I can get the last subdirectory of a path?
For example I want to get the subdirectory "7" and the following code fails:
Path <- "123\\456\\7"
Split <- strsplit(Path, "\\") # Fails because of 'Trailing backslash'
LastElement <- Split[[1]][length(Split[[1]])]
Thank you in advance
You could also use the built-in function basename (this works on Windows, where \ is accepted as a path separator):
basename(Path)
[1] "7"
You have to add a second pair of backslashes: the R string "\\\\" is the regex \\, which matches one literal backslash:
> Path <- "123\\456\\7"
> Split <- strsplit(Path, "\\\\")
> Split[[1]][length(Split[[1]])]
[1] "7"
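The regex escaping can also be avoided altogether. The sketch below uses fixed = TRUE so that a single escaped backslash is enough, and then shows a variant that converts backslashes to forward slashes before calling basename, which should not depend on the platform. The names parts, last and portable are illustrative.

```r
Path <- "123\\456\\7"

# Literal split: "\\" is one backslash in an R string, no regex escaping needed
parts <- strsplit(Path, "\\", fixed = TRUE)[[1]]
last <- tail(parts, 1)
print(last)
# [1] "7"

# Normalise separators first, then let basename do the work
portable <- basename(gsub("\\", "/", Path, fixed = TRUE))
print(portable)
# [1] "7"
```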
