This question already has answers here:
Digit sum function in R
(4 answers)
Closed 7 years ago.
I have a simple question: how can I turn a vector of ten numbers into a vector of ten numbers, each of which is the sum of the digits of the corresponding original number? For example, 11 in the first vector becomes 2, and 234 becomes 9.
We can use str_extract_all from stringr to get the individual digits, convert them to numeric, and take the sum.
library(stringr)
sapply(str_extract_all(c(11, 234), '\\d'), function(x) sum(as.numeric(x)))
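If loading stringr is not desirable, the same idea can be sketched in base R. The helper name digit_sum is hypothetical, and the sketch assumes the input holds non-negative whole numbers.

```r
# Hypothetical helper: digit sum for non-negative whole numbers, base R only.
digit_sum <- function(x) {
  # format() with scientific = FALSE avoids strings such as "1e+05"
  chars <- format(x, scientific = FALSE, trim = TRUE)
  # split each string into single characters and add them up as integers
  vapply(strsplit(chars, ""), function(d) sum(as.integer(d)), integer(1))
}

digit_sum(c(11, 234))
# [1] 2 9
```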
This question already has answers here:
How to calculate the number of occurrence of a given character in each row of a column of strings?
(14 answers)
Count the number of pattern matches in a string
(6 answers)
Closed 2 years ago.
Say I have two strings:
seq1 <- "ACTACTGGATGACT"
pattern1 <- "ACT"
What is the best way to find the number of times the pattern occurs in the sequence in R? I would like to use a sliding-window for loop, but I'm not clear on the proper way to handle the character strings.
We can use str_count from stringr:
library(stringr)
str_count(seq1, pattern1)
#[1] 3
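The sliding window mentioned in the question can also be written without an explicit loop. The sketch below is base R only, and the function name count_overlapping is hypothetical. Unlike a plain regex count, it also counts overlapping occurrences, which may or may not be what is wanted.

```r
# Hypothetical helper: slide a window of nchar(pattern) over the string
# and count the windows that equal the pattern (overlaps included).
count_overlapping <- function(s, pattern) {
  n <- nchar(s)
  k <- nchar(pattern)
  if (k == 0 || k > n) return(0L)
  starts <- seq_len(n - k + 1)                  # every possible window start
  windows <- substring(s, starts, starts + k - 1)
  sum(windows == pattern)
}

seq1 <- "ACTACTGGATGACT"
pattern1 <- "ACT"
count_overlapping(seq1, pattern1)
# [1] 3
count_overlapping("AAAA", "AA")                 # overlapping matches are counted
# [1] 3
```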
This question already has answers here:
Formatting Decimal places in R
(15 answers)
Closed 2 years ago.
How do I round the numbers to 2 decimal points within a column within a dataframe?
The name of the df is tax_data and the column that I want to round is called rate_percent
I tried using:
format(round(rate_percent ,2), nsmall =2) but this didn't work.
Does anyone have any suggestions?
Here is a base R solution:
tax_data$rate_percent <- round(tax_data$rate_percent, 2)
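A small sketch of the difference between rounding and formatting may help here. The sample values are made up for illustration, and the rate_label column is a hypothetical addition. round keeps the column numeric, while sprintf returns character strings that always show two decimals.

```r
# Made-up sample data; only the column name comes from the question.
tax_data <- data.frame(rate_percent = c(1.234, 2.5, 3.14159))

# Numeric rounding: values stay numeric, trailing zeros are not stored
tax_data$rate_percent <- round(tax_data$rate_percent, 2)

# Display formatting: character strings padded to exactly two decimals
tax_data$rate_label <- sprintf("%.2f", tax_data$rate_percent)

tax_data$rate_label
# [1] "1.23" "2.50" "3.14"
```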
This question already has answers here:
Trying to return a specified number of characters from a gene sequence in R
(3 answers)
Extracting the last n characters from a string in R
(15 answers)
Closed 5 years ago.
Is there a function in R that can cut out part of each value in a vector?
For example, I have this vector:
40754831597
64278107602
64212163451
From each value in the vector I want to extract, for example, the digits at positions 3 to 6, and get a new vector that looks like this:
7548
2781
2121
and so on
I don't really get why you would like to do this, but here you go:
# assuming it's a character vector
substring(vec,3,6)
# if it's numeric
substring(as.character(vec),3,6)
#output
#[1] "7548" "2781" "2121"
We can also use sub with a capture group:
sub(".{2}(.{4}).*", "\\1", v1)
#[1] "7548" "2781" "2121"
data
v1 <- c(40754831597, 64278107602, 64212163451)
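One caveat with numeric input: implicit conversion to character can switch to scientific notation for some values (for example, as.character(1e11) gives "1e+11"), which would break position-based extraction. A hedged sketch that converts explicitly with sprintf first, assuming the values are whole numbers:

```r
v1 <- c(40754831597, 64278107602, 64212163451)

# Convert to plain digit strings without scientific notation
s <- sprintf("%.0f", v1)

# Characters at positions 3 to 6 of each string
mid <- substr(s, 3, 6)
mid
# [1] "7548" "2781" "2121"

# Convert back to numeric if numbers are needed
as.numeric(mid)
# [1] 7548 2781 2121
```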
This question already has answers here:
Count values separated by a comma in a character string
(5 answers)
How to calculate the number of occurrence of a given character in each row of a column of strings?
(14 answers)
Closed 6 years ago.
I have a column with pipe-separated lists of identifiers:
Identifier
O75496|P62979|P62987|P0CG47|P0CG48|O00487|P25786
P28066|P60900|O14818|P20618|P40306
Q99436|P28062|P28065
P28062|P28065|P62191|P35998|P17980|P43686
How do I produce a column of the numbers of identifiers in each row?
Output to read something like this
Identifier Count
O75496|P62979|P62987|P0CG47|P0CG48|O00487|P25786 7
P28066|P60900|O14818|P20618|P40306 5
Q99436|P28062|P28065 3
P28062|P28065|P62191|P35998|P17980|P43686 6
Thanks in advance!
sapply(strsplit(df$Identifier, '[|]'), length)
To count only the unique identifiers in each row, add the unique function:
sapply(strsplit(df$Identifier, '[|]'), function(i) length(unique(i)))
A base R option without splitting would be
df1$Count <- nchar(gsub("[^|]", "", df1$Identifier)) + 1L
df1$Count
#[1] 7 5 3 6
Or with gregexpr
sapply(gregexpr("[|]", df1$Identifier),
function(x) sum(attr(x, "match.length"))+1)
#[1] 7 5 3 6
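For a self-contained check, the approaches above can be tried on the sample data from the question. The sketch below rebuilds that data as df1 and uses lengths with a fixed-string split, which is one more base R variant (lengths requires R 3.2.0 or later).

```r
# Sample data copied from the question
df1 <- data.frame(
  Identifier = c(
    "O75496|P62979|P62987|P0CG47|P0CG48|O00487|P25786",
    "P28066|P60900|O14818|P20618|P40306",
    "Q99436|P28062|P28065",
    "P28062|P28065|P62191|P35998|P17980|P43686"
  ),
  stringsAsFactors = FALSE
)

# fixed = TRUE treats "|" literally, so no regex escaping is needed
df1$Count <- lengths(strsplit(df1$Identifier, "|", fixed = TRUE))
df1$Count
# [1] 7 5 3 6
```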
This question already has answers here:
Sum rows in data.frame or matrix
(7 answers)
Closed 7 years ago.
I need to sum the columns of a table whose names start with a particular string.
An example table might be:
tbl<-data.frame(num1=c(3,2,9), num2=c(3,2,9),n3=c(3,2,9),char1=c('a', 'b', 'c'))
I get the list of columns (in this example there are only 2, but the real case has more than 20).
a<-colnames(tbl)[grep('num', colnames(tbl))]
I tried with
sum(tbl[,a])
But I get only a single number: the total of all elements in both columns.
What I need is the result of:
tbl$num1+ tbl$num2
We can either use Reduce
Reduce(`+`, tbl[a])
Or rowSums, which can also ignore NA elements with na.rm = TRUE.
rowSums(tbl[a])
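Putting the pieces together, here is a sketch on the example table. It anchors the pattern with "^" so that only names that start with "num" are selected, which is a small change from the unanchored grep in the question. The variable name row_totals is hypothetical.

```r
tbl <- data.frame(num1 = c(3, 2, 9), num2 = c(3, 2, 9),
                  n3 = c(3, 2, 9), char1 = c("a", "b", "c"))

# "^num" matches only column names that begin with "num"
a <- grep("^num", colnames(tbl), value = TRUE)

# Row-wise sum over the selected columns; NA values are skipped
row_totals <- rowSums(tbl[a], na.rm = TRUE)
unname(row_totals)
# [1]  6  4 18
```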