Gsub every element after a keyword in R

I'd like to remove all elements of a string after a certain keyword.
Example :
this.is.an.example.string.that.I.have
Desired Output :
this.is.an.example
I've tried using gsub('string', '', list), but that only removes the word string. I've also tried gsub('^string', '', list), but that doesn't seem to work either.
Thank you.

The following simple sub call may help here.
sub("\\.string.*","",variable)
Explanation: how sub is called
sub(regex_to_replace_text_in_variable,new_value,variable)
Difference between sub and gsub:
sub: replaces only the first match found in each element of the variable.
gsub: performs the same substitution, but on ALL matches found, whereas sub stops after the first match.
From help page of R:
sub(pattern, replacement, x, ignore.case = FALSE, perl = FALSE,
fixed = FALSE, useBytes = FALSE)
gsub(pattern, replacement, x, ignore.case = FALSE, perl = FALSE,
fixed = FALSE, useBytes = FALSE)
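To make the sub/gsub difference concrete, here is a minimal sketch. The short string x is a made-up input (not from the question), chosen only because it contains more than one match:

```r
# Hypothetical input with two dots, so first-match vs all-matches is visible
x <- "a.b.c"

first_only <- sub("\\.", "-", x)    # replaces only the first dot
all_dots   <- gsub("\\.", "-", x)   # replaces every dot

# The question's string: ".*" consumes the rest of the string, so there is
# only one match and sub() is sufficient
s <- "this.is.an.example.string.that.I.have"
trimmed <- sub("\\.string.*", "", s)

print(first_only)  # [1] "a-b.c"
print(all_dots)    # [1] "a-b-c"
print(trimmed)     # [1] "this.is.an.example"
```

Because the pattern ends in .*, gsub would give the same result here; sub just states the intent more directly.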

You can try this positive lookbehind regex
S <- 'this.is.an.example.string.that.I.have'
gsub('(?<=example).*', '', S, perl=TRUE)
# 'this.is.an.example'

You can use strsplit. Here you split your string after a key word, and retain the first part of the string.
x <- "this.is.an.example.string.that.I.have"
strsplit(x, '(?<=example)', perl=TRUE)[[1]][1]
# [1] "this.is.an.example"
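The regex-based answers above treat the keyword as a pattern. If the keyword might contain regex metacharacters, a literal search may be safer. The following is only a sketch under that assumption; truncate_after is a hypothetical helper name, not a base R function:

```r
# Hypothetical helper: keep everything up to and including the first
# literal occurrence of `keyword`; leave the string unchanged if absent
truncate_after <- function(x, keyword) {
  # as.integer() drops the attributes regexpr() attaches to its result
  pos <- as.integer(regexpr(keyword, x, fixed = TRUE))
  ifelse(pos == -1L, x, substr(x, 1, pos + nchar(keyword) - 1))
}

s <- "this.is.an.example.string.that.I.have"
print(truncate_after(s, "example"))   # [1] "this.is.an.example"
print(truncate_after(s, "missing"))   # string returned unchanged
```

The helper is vectorised over x, so it can be applied to a whole character vector in one call.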

Related

Replace slash with a single backslash in R

This is probably trivial, but I have failed to find any question referring to this exact issue.
My issue does not have to do with coming up with a suitable regex, it has to do with accurately specifying the replacement part.
x = "file_path/file_name.txt" # this is what I have
# "file_path\file_name.txt" - this is what I want
Here is what I tried:
library(stringr)
str_detect(string = x, pattern = "/") # returns TRUE, as expected
#str_replace_all(string = x, pattern = "/", replacement = "\") # fails, R believes I'm escaping the quote in the replacement
str_replace_all(string = x, pattern = "/", replacement = "\\") # this results to "file_pathfile_name.txt", missing the backslash altogether
str_replace_all(string = x, pattern = "/", replacement = "\\\\") # this results to "file_path\\file_name.txt", which is not what I want
Any help would be greatly appreciated.
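The solution is to escape the escape character, which means four backslashes in the end: the R string literal "\\\\" holds two backslashes, and the replacement engine reads those two as one literal backslash.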
cat(gsub('/', '\\\\', "file_path/file_name.txt"))
Note the difference between the default console output, which (like print()) displays the backslash in escaped form, and cat(), which writes the plain string:
str_replace_all(string = x, pattern = "/", replacement = "\\\\")
cat(str_replace_all(string = x, pattern = "/", replacement = "\\\\"))
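One way to convince yourself that the result really contains a single backslash is to compare character counts. This is a small sketch using base R's gsub rather than stringr, so it runs without extra packages:

```r
x <- "file_path/file_name.txt"
y <- gsub("/", "\\\\", x)

print(y)        # [1] "file_path\\file_name.txt"  (escaped display)
writeLines(y)   # file_path\file_name.txt         (actual content)

# One character was swapped for one character, so the length is unchanged
print(nchar(x))  # [1] 23
print(nchar(y))  # [1] 23
```

The doubled backslash only appears in the printed representation; the string itself holds one.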

Set number of arguments programmatically

I have the following string:
test <- "C:\\Users\\stefanj\\Documents\\Automation_Desk\\script.R"
I am separating the string on the backslash characters with the following code:
pdf_path_long <- unlist(strsplit(test, "\\\\",
fixed = FALSE, perl = FALSE, useBytes = FALSE))
What I want to do is:
pdf_path_short <- file.path(pdf_path_long[1], pdf_path_long[2], ...)
Problem is:
I know how to count the elements in pdf_path_long with length(pdf_path_long), but I don't know how to pass them to file.path, as the number of elements will vary with the length of the path.
You can use gsub directly on test (no need for a strsplit call) to change the separators. With fixed=TRUE the pattern is matched literally, so the backslash needs no extra regex escaping. You will get the same output as with file.path:
pdf_path_short <- gsub("\\", "/", test, fixed=TRUE)
pdf_path_short
# "C:/Users/stefanj/Documents/Automation_Desk/script.R"
Of course, you can change the replacement part with whatever separator you need.
Note: you can also check normalizePath function:
normalizePath(test, "/", mustWork=FALSE)
#[1] "C:/Users/stefanj/Documents/Automation_Desk/script.R"
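If you do want to keep the strsplit step and pass a variable number of pieces to file.path, do.call is the usual base R tool for calling a function with an argument list built at run time. A minimal sketch, reusing the variable names from the question:

```r
test <- "C:\\Users\\stefanj\\Documents\\Automation_Desk\\script.R"

# Split on single backslashes (regex "\\\\" matches one literal backslash)
pdf_path_long <- unlist(strsplit(test, "\\\\"))

# do.call() passes each list element as a separate argument, so this is
# equivalent to file.path(pdf_path_long[1], pdf_path_long[2], ...)
pdf_path_short <- do.call(file.path, as.list(pdf_path_long))

print(pdf_path_short)
# [1] "C:/Users/stefanj/Documents/Automation_Desk/script.R"
```

For this particular case, paste(pdf_path_long, collapse = "/") gives the same string, but do.call works for any function that takes a variable number of arguments.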

dir_ls - how to make "glob" case insensitive

library(fs)
dir_ls(glob = "*.R") lists all the .R files in the directory
However, dir_ls(glob = "*.r") does not return anything
How do I make glob become case insensitive?
You can use the ... argument in fs::dir_ls() to pass ignore.case to grep():
dir_ls(glob = "*.r", ignore.case = TRUE)
Use glob2rx to convert the glob to a regular expression and then use list.files with the ignore.case=TRUE argument:
list.files(pattern = glob2rx("*.R"), ignore.case = TRUE)
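To see what glob2rx produces and how ignore.case affects the match without touching the file system, here is a small sketch. The file names are made up for illustration:

```r
# Hypothetical file names, standing in for a directory listing
files <- c("analysis.R", "helpers.r", "notes.txt", "README.md")

# glob2rx() (from utils) translates a glob into an anchored regex
rx <- utils::glob2rx("*.R")
print(rx)   # [1] "^.*\\.R$"

# Case-sensitive match finds only the upper-case extension
print(grep(rx, files, value = TRUE))
# [1] "analysis.R"

# Case-insensitive match finds both
print(grep(rx, files, value = TRUE, ignore.case = TRUE))
# [1] "analysis.R" "helpers.r"
```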

Deletion using gsub in R [duplicate]

I'm trying to use the following code to replace two dots with only one:
test<-"test..1"
gsub("\\..", ".", test, fixed=TRUE)
and getting:
[1] "test..1"
I tried several combinations of escape strings, including brackets [] with no success.
What am I doing wrong?
With fixed = TRUE the pattern is taken literally, so "\\.." means a backslash followed by two dots, which never occurs in the string. If you are going to use fixed = TRUE, use the (non-interpreted) character .:
gsub("..", ".", test, fixed = TRUE)
Otherwise, within regular expressions (fixed = FALSE), . has a special meaning (any character), so you'll want to prefix it with a backslash to mean "the dot character":
gsub("\\.\\.", ".", test)
gsub("\\.{2}", ".", test)
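The variants can be compared side by side. The last call in this sketch is an extra illustration (not from the answer) of what happens when the dots are left unescaped in regex mode:

```r
test <- "test..1"

literal   <- gsub("..", ".", test, fixed = TRUE)  # ".." matched literally
escaped   <- gsub("\\.\\.", ".", test)            # regex, dots escaped
counted   <- gsub("\\.{2}", ".", test)            # regex, "exactly two dots"
unescaped <- gsub("..", ".", test)                # regex, "." = any character

print(literal)    # [1] "test.1"
print(escaped)    # [1] "test.1"
print(counted)    # [1] "test.1"
print(unescaped)  # [1] "...1"  (every pair of characters becomes one dot)
```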
