Split a string into a vector of single character strings [duplicate] - r

This question already has answers here:
Split a character vector into individual characters? (opposite of paste or stringr::str_c)
(4 answers)
Closed 4 years ago.
I have a string like "abcde" and want it split into a vector like
> c("a", "b", "c", "d", "e")
[1] "a" "b" "c" "d" "e"
I found one way that I'll post as an answer but I am hoping someone else has a simpler way to do this, either in base R or using a package.

Using the stringr package:
library(stringr)
x <- "abcde"
as.vector(str_split_fixed(x, pattern = "", n = nchar(x)))
[1] "a" "b" "c" "d" "e"
str_split_fixed returns a character matrix, so the result has to be coerced into a vector with as.vector.
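For comparison, here is a minimal base R sketch that needs no package. strsplit with an empty pattern splits a string into its individual characters and returns a list with one element per input string, so [[1]] extracts the vector. The variable name chars is only illustrative.

```r
x <- "abcde"
# strsplit returns a list (one element per input string); take the first element
chars <- strsplit(x, "")[[1]]
chars
# [1] "a" "b" "c" "d" "e"
```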

Related

Two Column R Dataframe to Named List [duplicate]

This question already has answers here:
Named List To/From Data.Frame
(4 answers)
Closed 2 years ago.
I am trying to convert a two-column dataframe to a named list. There are several solutions on StackOverflow where every value in the first column becomes the 'name', but I am looking to collapse the values in column 2 into common values in column 1.
For example, the list should look like the following:
# Create a Named list of keywords associated with each file.
fileKeywords <- list(fooBar.R = c("A","B","C"),
driver.R = c("A","F","G"))
Where I can retrieve all keywords for "fooBar.R" using:
# Get the keywords for a named file
fileKeywords[["fooBar.R"]]
My data frame looks like:
df <- read.table(header = TRUE, text = "
file keyWord
'fooBar.R' 'A'
'fooBar.R' 'B'
'fooBar.R' 'C'
'driver.R' 'A'
'driver.R' 'F'
'driver.R' 'G'
")
I'm sure there is a simple solution that I am missing.
You could use unstack:
as.list(unstack(rev(df)))
$driver.R
[1] "A" "F" "G"
$fooBar.R
[1] "A" "B" "C"
This is equivalent to as.list(unstack(df, keyWord~file))
To go from the named list to a data frame, we can use stack in base R:
stack(fileKeywords)[2:1]
For the opposite direction (data frame to named list), we can do:
with(df, tapply(keyWord, file, FUN = I))
-output
#$driver.R
#[1] "A" "F" "G"
#$fooBar.R
#[1] "A" "B" "C"
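Another base R option worth sketching is split, which groups one vector by the values of another and returns a named list directly. The data frame below is rebuilt with data.frame instead of read.table so that the example is self-contained; stringsAsFactors = FALSE is set explicitly so the result holds character vectors on older R versions too.

```r
# Same data as in the question, built inline
df <- data.frame(
  file    = c("fooBar.R", "fooBar.R", "fooBar.R", "driver.R", "driver.R", "driver.R"),
  keyWord = c("A", "B", "C", "A", "F", "G"),
  stringsAsFactors = FALSE
)

# split() groups keyWord by file and names each list element after its group
fileKeywords <- split(df$keyWord, df$file)
fileKeywords[["fooBar.R"]]
# [1] "A" "B" "C"
```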

Compressing a string in R [duplicate]

This question already has an answer here:
Collapse vector to string of characters with respective numbers of consequtive occurences
(1 answer)
Closed 4 years ago.
I have a large sequence of strings containing only the following characters
"M", "D", "A"
such as:
"M" "M" "A" "A" "D" "D" "M" "D" "A"
and I would like to compress it to:
M2A2D2M1D1A1
in R. Googling has led me to this (a Java solution), but before implementing it I wanted to check whether something ready-made already exists. Thanks!
The R function rle() (run-length encoding) is your friend.
testVector <- sample(c("M", "D", "A"), 20, replace = TRUE)
res <- rle(testVector)
compressedString <- paste0(res$values, res$lengths, collapse = "")
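Because the sample above is random, here is the same idea applied to the exact sequence from the question, as a small sketch. The variable names input, runs and compressed are illustrative only.

```r
input <- c("M", "M", "A", "A", "D", "D", "M", "D", "A")

# rle() returns the distinct runs ($values) and how long each run is ($lengths)
runs <- rle(input)

# Glue each value to its run length, then collapse into a single string
compressed <- paste0(runs$values, runs$lengths, collapse = "")
compressed
# [1] "M2A2D2M1D1A1"
```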

Replace text values in a vector [duplicate]

This question already has answers here:
Dictionary style replace multiple items
(11 answers)
Closed 5 years ago.
Here's my data:
dataset <- c("h", "H", "homme", "masculin", "f", "femme", "épouse")
How can I replace text values in the vector, like this:
"femme" -> "f"
"épouse" -> "f"
"Homme" -> "h"
"masculin" -> "h"
What I tried for "femme" -> "f":
test_out <- sapply(dataset, switch,
"f"="femme")
test_out
Expected result:
"h" "h" "h" "masculin" "f" "f" "f"
Try gsub with regular expressions:
dataset = gsub("^((?!h).*)$", "f", gsub("^((h|H|m).*)$", "h", dataset), perl=TRUE)
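If the set of values is known in advance, a dictionary-style lookup with a named vector may be easier to read than a regular expression. This is only a sketch: the lookup table is an assumption built from the replacement rules listed in the question (so "masculin" becomes "h"), and tolower is used so that "H" matches too. Values missing from the table would come back as NA.

```r
dataset <- c("h", "H", "homme", "masculin", "f", "femme", "épouse")

# Names are the values to look for, elements are their replacements
lookup <- c(h = "h", homme = "h", masculin = "h",
            f = "f", femme = "f", "épouse" = "f")

# Index the lookup table by the lower-cased data; unname() drops the names
recoded <- unname(lookup[tolower(dataset)])
recoded
# [1] "h" "h" "h" "h" "f" "f" "f"
```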

Return the position of a vector element based on its name [duplicate]

This question already has answers here:
Convert letters to numbers
(5 answers)
Closed 5 years ago.
I need to return the position of an element in a vector based on the element's name. Let's say I have a vector of letters:
myLetters=letters[1:26]
> myLetters
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z"
and what I intend to do is create or find a function that returns the position of the element when called, for example:
myFunction(myLetters["b"])
[1] 2
myFunction(myLetters["z"])
[1] 26
In summary, I need a way to refer to Excel columns by their letters (A, B, C, later maybe even AA or further) and get the column number.
If you want to refer to Excel column names, you could create a reference vector with all column names of up to three letters:
eg1 <- expand.grid(LETTERS, LETTERS)
eg2 <- expand.grid(LETTERS, LETTERS, LETTERS)
excelcols <- c(LETTERS, paste0(eg1[[2]], eg1[[1]]), paste0(eg2[[3]], eg2[[2]], eg2[[1]]))
After which you can use which:
> which(excelcols == 'A')
[1] 1
> which(excelcols == 'AB')
[1] 28
> which(excelcols == 'ABC')
[1] 731
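As an alternative to building the full reference vector, the column number can be computed directly by treating the letters as digits in a base-26 system where A = 1 and Z = 26. The function name excel_col_to_num is hypothetical, and the sketch assumes the input is a single string made up only of the letters A to Z.

```r
# Hypothetical helper: convert an Excel column name such as "AB" to its number
excel_col_to_num <- function(col) {
  chars  <- strsplit(toupper(col), "")[[1]]   # individual letters
  digits <- match(chars, LETTERS)             # A = 1, ..., Z = 26
  # The leftmost letter carries the highest power of 26
  sum(digits * 26^(rev(seq_along(digits)) - 1))
}

excel_col_to_num("A")
# [1] 1
excel_col_to_num("AB")
# [1] 28
excel_col_to_num("ABC")
# [1] 731
```

The results agree with the which() lookups above, and the function is not limited to three letters.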
If you need to find the number of times a specific letter occurs, the following should work:
myLetters = c("a", "a", "b")
myFunction = function(myLetters, findLetter) {
  # count the elements equal to findLetter
  length(which(myLetters == findLetter))
}
Let's find how many times "a" occurs in myLetters:
myFunction(myLetters, "a")
# [1] 2

joining lists of different length in R

I am not sure if R has the capabilities to do this, but I'd like to join two different lists of different lengths so that it's like a nested list within a list (if that makes sense).
edit: I'd like to add values in x as an additional value in z.
z <- c("a", "b", "c")
x <- c("c", "g")
c(z, x)
[1] "a" "b" "c" "c" "g"
# what I'd really like to see
[1] "a" "b" "c" "c, g"
I think it would be something similar to doing the following in python pandas
self.z.append(x)
We can collapse 'x' into a single comma-separated string with toString and concatenate it with 'z':
c(z, toString(x))
#[1] "a" "b" "c" "c, g"
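If a genuinely nested structure is wanted instead of a pasted string, which is closer to what Python's list append does, one possible sketch is to turn 'z' into a list and add 'x' as a single element. The name nested is illustrative only.

```r
z <- c("a", "b", "c")
x <- c("c", "g")

# as.list(z) makes each value its own element; list(x) keeps x together as one element
nested <- c(as.list(z), list(x))
length(nested)
# [1] 4
nested[[4]]
# [1] "c" "g"
```

The trade-off is that the result is a list and no longer a character vector, so vectorised string functions will need sapply or similar.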
