This question already has answers here:
Replace multiple strings in one gsub() or chartr() statement in R?
(9 answers)
Closed 1 year ago.
I have these two lines of R code:
df$symbol <- gsub("\\^", "-P", df$symbol) # find "^" and change it to "-P"
df$symbol <- gsub("/", "-", df$symbol) # find "/" and change it to "-"
How can I combine them into one line?
Thank you!
Given that you have two different replacement strings, there may not be a way to do this with just a single call to gsub. However, you could chain two calls to gsub here:
df$symbol <- gsub("/", "-", gsub("\\^", "-P", df$symbol))
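To see the chained call in action, here is a minimal sketch on made-up data. The ticker values are hypothetical; only the column name symbol comes from the question. If more than two replacements are needed, one possible approach (my assumption, not something from the answer above) is to fold over a named vector of pattern/replacement pairs with Reduce:

```r
# Hypothetical sample data; only the column name `symbol` is from the question.
df <- data.frame(symbol = c("BRK/B", "BAC^A", "AAPL"), stringsAsFactors = FALSE)

# Chained calls: the inner gsub() runs first, the outer one second.
chained <- gsub("/", "-", gsub("\\^", "-P", df$symbol))

# Same idea for any number of pairs: names are patterns, values are replacements.
pairs <- c("\\^" = "-P", "/" = "-")
folded <- Reduce(function(s, p) gsub(p, pairs[[p]], s), names(pairs), init = df$symbol)

chained  # "BRK-B" "BAC-PA" "AAPL"
```

The replacements are applied in the order the pairs are listed, which matters if one replacement string could be matched by a later pattern.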
This question already has answers here:
Split column at delimiter in data frame [duplicate]
(6 answers)
R strsplit with multiple unordered split arguments?
(4 answers)
Closed 2 years ago.
As the title of the question says, I want to know how to pass several different split parameters to the function
strsplit(x, split = " ")
If I apply it, I get every word in my string as a separate element whenever the words are separated by a space. OK. But I also want to split words that are joined by a dot (for example, banana.apple becoming "banana" and "apple").
I thought something like this (below) would work, but it doesn't:
strsplit(x, split = " ", "[.]")
Can anybody help me?
This should work if you want to split on both:
library(stringr)
x <- c("banana.apple turning.something")
str_split(x, "[\\.\\s]")
# [[1]]
# [1] "banana" "apple" "turning" "something"
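For reference, the attempt in the question fails because the third positional argument of strsplit is fixed, not a second delimiter, so "[.]" is never treated as a pattern. A base R sketch of the same idea as the stringr answer, putting both delimiters in one bracket expression, might look like this:

```r
x <- "banana.apple turning.something"

# Inside a bracket expression the dot is literal, so "[. ]" means "a dot or a space".
parts <- strsplit(x, split = "[. ]")[[1]]
parts  # "banana" "apple" "turning" "something"
```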
This question already has answers here:
Extracting numbers from vectors of strings
(12 answers)
Closed 2 years ago.
I have a dataframe X with column names such as
1_abc,
2_fgy,
27_msl,
936_hhq,
3_hdv
I want to just keep the numbers as the column name (so instead of 1_abc, just 1). How do I go about removing it while keeping the rest of the data intact?
All column names use an underscore as the separator between the numeric and character parts. There are about 400 columns, so I want to be able to code this without referring to specific column names.
You may use sub here for a base R option:
names(df) <- sub("^(\\d+).*$", "\\1", names(df))
Another option might be:
names(df) <- sub("_.*", "", names(df))
This would just strip off everything from the first underscore until the end of the column name.
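The two patterns agree on names like the ones in the question, but they differ when a name does not start with digits. A small sketch on hypothetical names (the last one is made up to show the difference):

```r
# Hypothetical column names; the last one deliberately has no leading number.
nms <- c("1_abc", "27_msl", "936_hhq", "abc_1")

by_digits <- sub("^(\\d+).*$", "\\1", nms)   # keep the leading run of digits
by_underscore <- sub("_.*", "", nms)          # drop from the first underscore on

by_digits      # "1" "27" "936" "abc_1"
by_underscore  # "1" "27" "936" "abc"
```

Keep in mind that purely numeric column names are non-syntactic, so they need backticks when used with $, for example df$`27`.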
This question already has answers here:
Getting and removing the first character of a string
(7 answers)
Extract the first (or last) n characters of a string
(5 answers)
Closed 2 years ago.
I'm working in R. I have a dataset with people first and last names. There is a column called "First" and another column called "Last".
I want to change "Bodie" to just "B" and do the same for all the observations in the "Last" column.
I'm newer to programming so I don't even know where to start. I have looked at some of the string packages in R and can't quite figure out what to do. Thanks for the help.
We can use substr to extract the first letter of the 'Last' column
df1$Last <- substr(df1$Last, 1, 1)
Or sub to remove all the characters other than the first
df1$Last <- sub("^(.).*", "\\1", df1$Last)
Or another option is to split the characters, select the first element
df1$Last <- sapply(strsplit(df1$Last, ""), `[`, 1)
Just a variation on @akrun's answer, which uses sub with a lookbehind instead of a capture group:
df1$Last <- sub("(?<=.).*$", "", df1$Last, perl=TRUE)
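As a quick check that the approaches above agree, here is a sketch on made-up data. The names are hypothetical; only the column names First and Last come from the question:

```r
# Hypothetical data; only the column names First and Last are from the question.
df1 <- data.frame(First = c("Ann", "Raj"), Last = c("Bodie", "Smith"),
                  stringsAsFactors = FALSE)

via_substr <- substr(df1$Last, 1, 1)
via_capture <- sub("^(.).*", "\\1", df1$Last)
via_split <- sapply(strsplit(df1$Last, ""), `[`, 1)
via_lookbehind <- sub("(?<=.).*$", "", df1$Last, perl = TRUE)

via_substr  # "B" "S"
```

If Last is a factor rather than a character column, strsplit will stop with a non-character argument error, so wrap the column in as.character() first for that variant.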
This question already has answers here:
R: How to replace . in a string?
(5 answers)
Closed 2 years ago.
I have the following data.frame.
df = data.frame(a.dfs.56=c(rep("a",8), rep("b",5), rep("c",7), rep("d",10)),
                b.fqh.28=rnorm(30, 6, 2),
                c.34.2.fgs=rnorm(30, 12, 3.5),
                d.tre.19.frn=rnorm(30, 8, 3)
)
How can I substitute all periods "." in the column names to have them become dashes "-"?
I am aware of options like check.names=FALSE when using read.table or data.frame, but in this case, I cannot use this.
I have also tried variations of the following posts, but they did not work for me.
Specifying column names in a data.frame changes spaces to "."
How can I use gsub in multiple specific column in r
R gsub column names in all data frames within a list
Thank you.
You can use gsub for name replacement
names(df) <- gsub(".", "-", names(df), fixed=TRUE)
Note that you need fixed=TRUE because normally gsub expects regular expressions and . is a special regular expression character.
But be aware that - is a non-standard character for variable names. If you try to use those columns with functions that use non-standard evaluation, you will need to surround the names in back-ticks to use them. For example
dplyr::filter(df, `a-dfs-56`=="a")
gsub("\\.", "-", names(df)) is the regex (regular expressions) way. The . is a special symbol in regex that means "match any single character". That's why the fixed = TRUE argument is included in MrFlick's answer.
The \\ (escape) tells R that we want the literal period and not the special symbol that it represents.
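To make the difference concrete, here is a small sketch using one of the column names from the question. The bracket-expression form in the last line is an extra alternative that neither answer mentions:

```r
nm <- "a.dfs.56"   # one of the column names from the question

unescaped <- gsub(".", "-", nm)                # dot matches every character
literal   <- gsub(".", "-", nm, fixed = TRUE)  # dot treated as plain text
escaped   <- gsub("\\.", "-", nm)              # dot escaped inside a regex
bracketed <- gsub("[.]", "-", nm)              # dot inside a bracket expression

unescaped  # "--------"
literal    # "a-dfs-56"
```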
This question already has answers here:
How to reverse a string in R
(14 answers)
Closed 5 years ago.
I have a sentence, ['this', 'is', 'my', 'house'].
After splitting it using "-" as a separator and reversing it to [house, my, is, this], how do I access the last part of the string? And how do I join "my" and "is" together with "house" to form another sentence?
sentence <- c("this","is","my","house")
strsplit(sentence[4], split="")[[1]][nchar(sentence[4]):1]
This code might be a bit dense for a beginner to interpret. The [[1]] is necessary because the value of strsplit is always a list, even when it's just one vector of individual characters; the indexing extracts that vector. The indexing after that, [nchar(sentence[4]):1], reorders the letters in that vector backwards, from the last to the first, in this case c(5,4,3,2,1). The split="" argument causes the strsplit function to split the string at every possible point, i.e. between each character.
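Broken into separate steps, the same one-liner might read as follows. This is only a sketch; the intermediate variable names are my own:

```r
word <- "house"

chars <- strsplit(word, split = "")[[1]]   # "h" "o" "u" "s" "e"
idx <- nchar(word):1                        # 5 4 3 2 1
reversed <- paste(chars[idx], collapse = "")

reversed  # "esuoh"
```

Using rev(chars) instead of chars[idx] gives the same result without computing the index by hand.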
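# Assumes sentence is a single hyphen-separated string such as "this-is-my-house"
out <- strsplit(sentence, "-")[[1]]  # strsplit returns a list; take its first element
flip <- rev(out)                     # reverse the order of the words
last <- flip[length(flip)]           # last element of the reversed vector
word <- paste(flip, collapse=' ')    # join the words back into one string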