Split comma delimited string [duplicate] - r

This question already has an answer here:
Split delimited single value character vector
(1 answer)
Closed 5 years ago.
I have a string in R in the following form:
"AAAAA","BBBBB","CCCCC",..
And I want to convert it to a standard R character vector containing the same string elements ("AAAAA", "BBBBB", etc.):
vector<-c("AAAAA","BBBBB","CCCCC",..)
I've read that strsplit could do it, but haven't managed to achieve it.

strsplit returns a list of character vectors, so if you want the result as a single vector, use unlist as well.
So,
unlist(strsplit(string, ","))
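As a minimal sketch of that answer, assuming the input is one character string: the variable names and sample values below are illustrative only. If the string literally contains double-quote characters (the question is ambiguous on this point), they survive the split and have to be stripped separately.

```r
# Case 1: a plain comma-delimited string (sample value is hypothetical)
string <- "AAAAA,BBBBB,CCCCC"
parts <- unlist(strsplit(string, ","))   # character vector of length 3

# Case 2: the string itself contains double-quote characters
quoted <- '"AAAAA","BBBBB","CCCCC"'
parts_q <- unlist(strsplit(quoted, ","))
parts_q <- gsub('"', "", parts_q, fixed = TRUE)  # drop the embedded quotes
```

In both cases the result is an ordinary character vector, so it can be indexed like any other, e.g. parts[2].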

Related

Split column names in R using a separator [duplicate]

This question already has answers here:
Extracting numbers from vectors of strings
(12 answers)
Closed 2 years ago.
I have a dataframe X with column names such as
1_abc,
2_fgy,
27_msl,
936_hhq,
3_hdv
I want to keep just the numbers as the column names (so instead of 1_abc, just 1). How do I go about removing the rest of the name while keeping the data intact?
All column names use an underscore as the separator between the numeric and character parts. There are about 400 columns, so I want to be able to code this without naming specific columns.
You may use sub here for a base R option:
names(df) <- sub("^(\\d+).*$", "\\1", names(df))
Another option might be:
names(df) <- sub("_.*", "", names(df))
This would just strip off everything from the first underscore until the end of the column name.
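To compare the two patterns, here is a small sketch on a toy data frame. The column names come from the question; the data values and the helper variable names are made up for illustration.

```r
# Toy data frame; names are assigned afterwards so data.frame() does not mangle them
df <- data.frame(a = 1:2, b = 3:4, c = 5:6)
names(df) <- c("1_abc", "27_msl", "936_hhq")

# Option 1: capture the leading digits and drop everything else
by_digits <- sub("^(\\d+).*$", "\\1", names(df))

# Option 2: delete everything from the first underscore onward
by_underscore <- sub("_.*", "", names(df))

# Both give the same result for names of the form <digits>_<text>
names(df) <- by_digits
```

The two options differ only for names that do not fit the pattern: a name without leading digits is left unchanged by the first, while the second still trims at the first underscore.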

Passing a vector through a select statement [duplicate]

This question already has answers here:
grep using a character vector with multiple patterns
(11 answers)
Closed 3 years ago.
I'm looking for a way to pass a vector of strings into a select statement. I want to subset a data frame so that it only outputs variables whose names contain one of the strings in my vector. I don't want an exact match, so I need to use a function like contains, because the variable names in the data frame include some text that is not in my vector.
Here is an example of the vector I want to pass into my select statement.
c("clrs_name", "_clrs_sitedetails_value", "_clrs_targetlicence_value",
"clrs_licenceclass", "clrs_licenceownership", "clrs_type", "statuscode")
For example, I want to extract the variable "odate_value_clrs_name" from my data frame and the string "clrs_name" in vector should extract that, but I am not sure how to incorporate contains and a vector into a select statement.
We can use matches in select after collapsing the pattern vector with | using either paste from base R or str_c (str_c would also return NA if there are any NAs). This would not return any error or warning if one of the patterns is missing or doesn't match any of the column names.
library(dplyr)
library(stringr)
df1 %>%
select(matches(str_c(v1, collapse = "|")))
where
v1 <- c("clrs_name", "_clrs_sitedetails_value", "_clrs_targetlicence_value",
"clrs_licenceclass", "clrs_licenceownership", "clrs_type", "statuscode")
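The same collapse-with-| idea can be sketched in base R with grepl, which avoids the dplyr and stringr dependencies. This is an alternative to the select/matches call above, not the answer's own code; the data frame df1 and all its columns except odate_value_clrs_name are hypothetical.

```r
v1 <- c("clrs_name", "_clrs_sitedetails_value", "_clrs_targetlicence_value",
        "clrs_licenceclass", "clrs_licenceownership", "clrs_type", "statuscode")

# Hypothetical data frame for illustration
df1 <- data.frame(odate_value_clrs_name = "a",
                  x_clrs_type = "b",
                  unrelated = "c",
                  statuscode = 1)

# Build one regular expression that matches any of the patterns
pattern <- paste(v1, collapse = "|")

# Keep the columns whose names contain at least one pattern
keep <- grepl(pattern, names(df1))
subset_df <- df1[, keep, drop = FALSE]
```

drop = FALSE keeps the result a data frame even when only one column matches. Patterns with no matching column are simply ignored, as with matches.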

R-taking reverse of substring of strsplit sentence [duplicate]

This question already has answers here:
How to reverse a string in R
(14 answers)
Closed 5 years ago.
I have a sentence, ['this', 'is', 'my', 'house'].
After splitting it using "-" as a separator and reversing it to [house, my, is, this], how do I access the last part of the string? And how do I join my and is together with house to form another sentence?
sentence <- c("this","is","my","house")
strsplit(sentence[4], split="")[[1]][nchar(sentence[4]):1]
This code might be a bit dense for a beginner to interpret. The [[1]] is necessary because the value of strsplit is always a list, even when it's just one vector of individual characters; the indexing extracts that vector. The indexing after that, [nchar(sentence[4]):1], reorders the letters in that vector backwards, from the last to the first, in this case c(5,4,3,2,1). The split="" argument causes the strsplit function to split the string at every possible point, i.e. between each character.
out <- strsplit(sentence, "-")
last <- out[length(out)]
flip <- rev(last)
word <- paste(flip, collapse='')
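Putting the pieces from this thread together, a minimal sketch might look like the following. It assumes the sentence is already a character vector of words, as in the snippet above; the variable names are illustrative, and "house my is" is one reading of what the question asks for.

```r
sentence <- c("this", "is", "my", "house")

# Reverse the order of the words
reversed_words <- rev(sentence)

# Access the last word of the original sentence
last_word <- sentence[length(sentence)]

# Reverse the characters within that word:
# strsplit() returns a list, so [[1]] extracts the character vector
chars <- strsplit(last_word, split = "")[[1]]
reversed_chars <- paste(rev(chars), collapse = "")

# Join the first three reversed words into a new sentence
joined <- paste(reversed_words[1:3], collapse = " ")
```

Using rev() on the character vector is equivalent to the [nchar(x):1] indexing explained above, and arguably easier to read.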

Keep leading zeros with colsplit in R [duplicate]

This question already has answers here:
How to avoid: read.table truncates numeric values beginning with 0
(3 answers)
Closed 6 years ago.
I'm using colsplit to split a large string into columns.
There are numbers in the string with leading zeros.
How can I prevent colsplit from converting them to numeric values?
Example:
value in string: 0000122517
after colsplit this becomes: 122517
I need the leading zeros, and the values can be of any length, so I
cannot add the zeros afterwards.
Kind regards,
Oene Douma
We can use read.table
read.table(text=str, sep="~", header=FALSE, colClasses = c("character", "character"))
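A small sketch of how this behaves, assuming a two-field string separated by "~" as in the call above. The sample string and the variable names are hypothetical; the first value is the one from the question.

```r
# Hypothetical input: two fields separated by "~", both with leading zeros
txt <- "0000122517~0042"

# Reading every column as character prevents conversion to numeric,
# so the leading zeros are preserved
res <- read.table(text = txt, sep = "~", header = FALSE,
                  colClasses = c("character", "character"))

first_value <- res$V1
second_value <- res$V2
```

With more than two fields, colClasses = "character" (a single value) is recycled across all columns, which avoids having to know the column count in advance.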
