What is the use of 'sep' in paste command of R? [duplicate] - r

This question already has answers here:
Concatenate a vector of strings/character
(8 answers)
Closed 6 years ago.
I was working with the paste command in R, when I found that
a <- c("something", "to", "paste")
paste(a, sep="_")
produces the output
# [1] "something" "to" "paste"
which is the same as when I print a:
# [1] "something" "to" "paste"
So what effect does the sep have on the paste command in R?

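sep only has an effect when you pass two or more arguments to paste: it is the string placed between the corresponding elements of those arguments. With a single vector there is nothing to separate, so a is returned unchanged. If you were looking to get "something_to_paste", then you want the collapse argument.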
Try the following to get a sense of what the sep argument does:
paste(a, 1:3, sep = "_")
# [1] "something_1" "to_2" "paste_3"
and compare it to collapse:
paste(a, collapse = "_")
# [1] "something_to_paste"
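The two arguments can also be combined. As a small sketch (the vector a is redefined so the snippet runs on its own, and the names pairs and joined are only illustrative), sep is applied first, element by element, and collapse then joins the resulting elements into one string:

```r
a <- c("something", "to", "paste")

# sep goes between the arguments, element by element
pairs <- paste(a, 1:3, sep = "_")

# collapse then joins the elements of that result into a single string
joined <- paste(a, 1:3, sep = "_", collapse = "-")

print(pairs)
# [1] "something_1" "to_2"        "paste_3"
print(joined)
# [1] "something_1-to_2-paste_3"
```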

Remove period and text after in list [duplicate]

This question already has answers here:
Remove part of string after "."
(6 answers)
gsub() in R is not replacing '.' (dot)
(3 answers)
Closed last year.
I have a list
t <- list('mcd.norm_1','mcc.norm_1', 'mcr.norm_1')
How can I convert the list to remove the period and everything after it, so the list is just
'mcd' 'mcc' 'mcr'
You may try
library(stringr)
lapply(t, function(x) str_split(x, "\\.", simplify = TRUE)[1])
Another possible solution:
library(tidyverse)
t <- list('mcd.norm_1','mcc.norm_1', 'mcr.norm_1')
t %>%
str_remove("\\..*")
#> [1] "mcd" "mcc" "mcr"
This could be another option:
unlist(sapply(t, \(x) regmatches(x, regexec(".*(?=\\.)", x, perl = TRUE))))
[1] "mcd" "mcc" "mcr"
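A base R variant without any packages is also possible. This is a minimal sketch, assuming every element is a single string; the names strings, stems and result are hypothetical. sub replaces only the first match, and the pattern covers the first literal dot plus everything after it:

```r
strings <- list('mcd.norm_1', 'mcc.norm_1', 'mcr.norm_1')

# "\\..*" matches the first literal dot and everything after it
stems <- sub("\\..*", "", unlist(strings))

# Convert back to a list if a list is needed downstream
result <- as.list(stems)

print(stems)
# [1] "mcd" "mcc" "mcr"
```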

How to get the most frequent character within a character string? [duplicate]

This question already has answers here:
Finding the most repeated character in a string in R
(2 answers)
Closed 1 year ago.
Suppose the next character string:
test_string <- "A A B B C C C H I"
Is there any way to extract the most frequent value within test_string?
Something like:
extract_most_frequent_character(test_string)
Output:
#C
We can use scan to read the string as a vector of individual elements by splitting at the spaces, get the frequency counts with table, find the named index with the maximum count (which.max), and return its name:
extract_most_frequent_character <- function(x) {
names(which.max(table(scan(text = x, what = '', quiet = TRUE))))
}
Testing:
extract_most_frequent_character(test_string)
[1] "C"
Or with strsplit
extract_most_frequent_character <- function(x) {
names(which.max(table(unlist(strsplit(x, "\\s+")))))
}
Here is another base R option (not as elegant as @akrun's answer)
> intToUtf8(names(which.max(table(utf8ToInt(gsub("\\s", "", test_string))))))
[1] "C"
One possibility involving stringr could be:
names(which.max(table(str_extract_all(test_string, "[A-Z]", simplify = TRUE))))
[1] "C"
Or marginally shorter:
names(which.max(table(str_extract_all(test_string, "[A-Z]")[[1]])))
Here is a solution using the stringr package, table and which.max:
library(stringr)
test_string <- str_split(test_string, " ")
test_string <- table(test_string)
names(test_string)[which.max(test_string)]
[1] "C"
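Note that which.max returns only the first maximum, so ties are silently dropped in all of the approaches above. If ties matter, a sketch along these lines returns every value that reaches the top count (the function name most_frequent_all is hypothetical, and it assumes the values are separated by whitespace):

```r
most_frequent_all <- function(x) {
  # Split on runs of whitespace and count each distinct value
  counts <- table(strsplit(x, "\\s+")[[1]])
  # Keep every name whose count equals the maximum
  names(counts)[as.vector(counts) == max(counts)]
}

most_frequent_all("A A B B C C C H I")
# [1] "C"
most_frequent_all("A A B B")
# [1] "A" "B"
```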

Find names that match either of two patterns [duplicate]

This question already has answers here:
grep using a character vector with multiple patterns
(11 answers)
Closed 2 years ago.
Is it possible to find the names in a vector that contain either id or group, or both, in the example below?
I have used grepl() without success.
a = c("c-id" = 2, "g_idgroups" = 3, "z+i" = 4)
grepl(c("id", "group"), names(a)) # return name of elements that contain either `id` OR `group` OR both
You can use:
pattern <- c("id", "group")
grep(paste0(pattern, collapse = '|'), names(a), value = TRUE)
#[1] "c-id" "g_idgroups"
With grepl you can get logical value
grepl(paste0(pattern, collapse = '|'), names(a))
#[1] TRUE TRUE FALSE
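The original call fails because grepl uses only the first element of a pattern vector (and warns about it). Collapsing with | works when the patterns are plain text; if they might contain regex metacharacters, one alternative is to match each pattern literally and combine the results. This is a sketch under that assumption, and the names hits and any_hit are only illustrative:

```r
a <- c("c-id" = 2, "g_idgroups" = 3, "z+i" = 4)
pattern <- c("id", "group")

# One logical vector per pattern; fixed = TRUE treats each pattern as literal text
hits <- lapply(pattern, grepl, x = names(a), fixed = TRUE)

# Element-wise OR across the patterns
any_hit <- Reduce(`|`, hits)

print(any_hit)
# [1]  TRUE  TRUE FALSE
print(names(a)[any_hit])
# [1] "c-id"       "g_idgroups"
```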
A stringr solution:
stringr::str_subset(names(a), paste0(pattern, collapse = '|'))
#[1] "c-id" "g_idgroups"
Using str_detect:
> names(a)[str_detect(names(a), 'id|groups')]
[1] "c-id" "g_idgroups"
> names(a)
[1] "c-id" "g_idgroups" "z+i"
>

R: Paste two strings without space [duplicate]

This question already has answers here:
How to remove all whitespace from a string?
(9 answers)
Closed 1 year ago.
I want to merge following two strings in R (and remove the spaces). I was using paste but I was not able to get desired results.
a <- "big earth"
b <- "small moon"
c <- paste(a,b, sep = "")
I want c to be "bigearthsmallmoon".
Thank you very much for the help.
You can paste the strings together into one with paste(). Then you can use gsub() to remove all spaces:
gsub(" ", "", paste(a, b))
# [1] "bigearthsmallmoon"
c <- paste(a, b, sep = "")
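Note that paste(a, b, sep = "") on its own only removes the separator between a and b; the spaces inside each string remain. A short sketch of the difference, using paste0 (equivalent to sep = "") and a whitespace class so that tabs are stripped as well; the names joined and result are only illustrative:

```r
a <- "big earth"
b <- "small moon"

# Joining without a separator keeps the spaces inside each string
joined <- paste0(a, b)

# Remove every run of whitespace from the joined string
result <- gsub("\\s+", "", joined)

print(joined)
# [1] "big earthsmall moon"
print(result)
# [1] "bigearthsmallmoon"
```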

Convert a string with dot and comma to numeric [duplicate]

This question already has answers here:
Factor with comma and percentage to numeric
(3 answers)
Closed 5 years ago.
Is there a way to convert a string into a numeric type?
For instance:
> as.numeric("1.560,65")
[1] NA
Warning message:
NAs introduced by coercion
I receive the above warning.
I need the thousands to be separated by a dot (i.e. 1.560) and the decimals to be separated by a comma.
> as.numeric("1.560")
[1] 1.56
> as.numeric("1.560")>2
[1] FALSE
In the above example, I want R to read 1.560 as one thousand five hundred sixty, but it converts it to 1.56, which is lower than 2, and thus my computations are wrong.
Any help is much appreciated.
You have to use a regular expression to turn your strings into a format that R understands and then convert the result to numeric:
as.numeric(gsub(",", "\\.", gsub("\\.","", "1.560,65")))
[1] 1560.65
For numeric formatting, see formatC:
formatC(1560.65, format = "f", big.mark = ".", decimal.mark = ",")
[1] "1.560,6500"
String pattern matching and replacement can be done by using gsub function. Here is an example for your case:
str_numbers <- c("1.560,65", "134,2","123","0,32")
as.numeric(gsub(",", "\\.", gsub("\\.", "", str_numbers)))
The inner call replaces each . with an empty string; the outer call then replaces the , with a . as shown step by step:
> (tmp <- gsub("\\.", "", str_numbers))
[1] "1560,65" "134,2" "123" "0,32"
> gsub(",", "\\.", tmp)
[1] "1560.65" "134.2" "123" "0.32"
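The two steps can be wrapped in a small helper. This is only a sketch: the function name parse_eu_number is hypothetical, and it assumes the input always uses . for grouping and , for decimals. With fixed = TRUE both characters are matched literally, so no escaping is needed:

```r
parse_eu_number <- function(x) {
  # Drop the thousands separators (literal dots)
  no_grouping <- gsub(".", "", x, fixed = TRUE)
  # Turn the decimal comma into a decimal point, then convert
  as.numeric(sub(",", ".", no_grouping, fixed = TRUE))
}

str_numbers <- c("1.560,65", "134,2", "123", "0,32")
converted <- parse_eu_number(str_numbers)

# The original concern: the value is now compared as 1560.65, not 1.56
converted[1] > 2
# [1] TRUE
```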
