I would like to extract the first digit after a decimal place from a numeric vector in R. Is there a way to do this without turning it into a character string? For example:
x <- c(1.0,1.1,1.2)
I would like the function to return a vector:
0,1,2
thanks.
There'll be a bunch of ways, but here's one:
(x %% 1)*10
# [1] 0 1 2
This assumes there's only ever one digit after the decimal place. If that's not the case:
floor((x %% 1)*10)
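One caveat with the floor() version: values such as 1.2 are not stored exactly in binary floating point, so (x %% 1)*10 can land just below the whole number and floor() then returns one less than expected. A minimal sketch of a workaround, assuming a small tolerance is acceptable for your data (the name first_decimal and the tol default are made up for illustration):

```r
# Hypothetical helper: first digit after the decimal point, numerically.
# tol is an assumed fudge factor that absorbs floating-point error.
first_decimal <- function(x, tol = 1e-9) {
  # Shift the first decimal into the units place, nudge upward by tol,
  # drop the remaining fraction, then keep only the units digit.
  floor(abs(x) * 10 + tol) %% 10
}

x <- c(1.0, 1.1, 1.2)
first_decimal(x)
# [1] 0 1 2
```

The tolerance is a judgment call: too large and a value like 1.19999 would be treated as 1.2, too small and the rounding error is not absorbed.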
A little bit like here, I would like to extract the first digit from each element of a numeric vector, without having to turn it into a character vector and back.
d <- c(123, 2, 45)
Expected Output:
[1] 1 2 4
I tried different stuff with floor(), but without the desired result.
One numerical approach here would be to divide each input number by 10 raised to the floor of log base 10. This means that, for example, we divide an input of 123 by 100, to yield 1.23. Then, we take the floor of that to yield the first digit 1.
getFirstDigit <- function(x) {
  floor(x / (10 ^ floor(log10(x))))
}
d <- c(123, 2, 45)
getFirstDigit(d)
[1] 1 2 4
The more brute force way of doing this would be to cast the input vector to character, take the first character, and then cast back to a number. But, I doubt doing it that way would outperform what I have above.
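For comparison, the character-based route described above could look roughly like this. It is only a sketch, assuming positive whole numbers small enough that as.character() does not switch to scientific notation; the name first_digit_chr is made up here:

```r
# Hypothetical helper: first digit via a round trip through character.
first_digit_chr <- function(x) {
  # Convert to text, keep the first character, convert back to numeric.
  as.numeric(substr(as.character(x), 1, 1))
}

d <- c(123, 2, 45)
first_digit_chr(d)
# [1] 1 2 4
```

Which of the two is faster would need to be measured on your own data, for example with system.time().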
I have a vector of strings with 24 digits each. Each digit represents an hour, and if the digit is "0" then the rate from period 0 applies and if the digit is 1 then the rate from period 1 applies.
As an example consider the two strings below. I would like to return the number of periods in each string. For example:
str1 <- "000000000000001122221100"
str2 <- "000000000000000000000000"
#str1: 3
#str2: 1
Any recommendations? I've been thinking about how to use str_count from stringr here. Also, I've searched other posts but most of them focus on counting letters in character strings, whereas this is a slight modification because the string contains digits and not letters.
Thanks!
Here is another option by using charToRaw.
length(unique(charToRaw(str1)))
[1] 3
length(unique(charToRaw(str2)))
[1] 1
This is an ugly solution, but here goes
length(unique(unlist(strsplit(str1,split = ""))))
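Since the question mentions a whole vector of strings, here is a sketch of how the same idea might be applied to every element at once. The vector name rates and the result name n_periods are assumptions for illustration:

```r
# Assumed input: one 24-character string per element.
rates <- c("000000000000001122221100",
           "000000000000000000000000")

# Split each string into single characters, then count the distinct ones.
n_periods <- vapply(strsplit(rates, split = ""),
                    function(chars) length(unique(chars)),
                    integer(1))
n_periods
# [1] 3 1
```

vapply is used rather than sapply so that the result is guaranteed to be an integer vector.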
I'm trying to create a unique column in a data frame that has a numeric of the character matches between two strings from the left side of both strings.
Each row has a comparison string, which we want to use as a test against a user-given string. Given a data frame:
df <- data.frame(x=c("yhf", "rnmqjk", "wok"), y=c("yh", "rnmj", "ok"))
       x    y
1    yhf   yh
2 rnmqjk rnmj
3    wok   ok
Where x is our comparison string and y is our given string, I'm looking to have the values 2, 3, 0 output in column z, like so:
       x    y z
1    yhf   yh 2
2 rnmqjk rnmj 3
3    wok   ok 0
Essentially, I'm looking to have the given strings (y) checked from left to right against a comparison string (x); as soon as the characters don't line up, the rest of the string should not be checked and the number of matches so far should be recorded.
Thank you in advance!
This code works for your example:
df$z <- mapply(function(x, y) which.max(x != y),
               strsplit(as.character(df$x), split=""),
               strsplit(as.character(df$y), split="")) - 1
df
       x    y z
1    yhf   yh 2
2 rnmqjk rnmj 3
3    wok   ok 0
As an outline, strsplit splits a string vector into a list of character vectors. Here, each element of those vectors is a single character (because of the split="" argument). The which.max function returns the first position where its argument reaches the maximum of the vector. Since the vectors returned by x != y are logical, which.max returns the first position where a difference is observed. mapply takes a function and lists and applies the provided function to corresponding elements of the lists.
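To make the which.max step concrete, here is a small illustration on made-up strings ("wok" and "wom" are just examples, as are the variable names):

```r
# Split two example strings into single characters.
x_chars <- strsplit("wok", split = "")[[1]]  # "w" "o" "k"
y_chars <- strsplit("wom", split = "")[[1]]  # "w" "o" "m"

# Element-wise comparison gives a logical vector.
differs <- x_chars != y_chars                # FALSE FALSE TRUE

# TRUE counts as 1 and FALSE as 0, so the first maximum is the first TRUE.
first_diff <- which.max(differs)
first_diff
# [1] 3

# When nothing differs, every element is FALSE and the first maximum
# is simply position 1.
all_same <- which.max(c(FALSE, FALSE, FALSE))
all_same
# [1] 1
```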
Note that this produces warnings that the lengths of the strings don't match. This could be addressed in a couple of ways; the easiest is wrapping the function in suppressWarnings if the messages bug you.
As the OP notes in the comments, if there are instances where the entire word matches, then which.max returns 1 (so z ends up as 0). To return the length of the string instead, I'd add a second line of code that combines logical subsetting with the nchar function:
df$z[as.character(df$x) == as.character(df$y)] <-
  nchar(as.character(df$x[as.character(df$x) == as.character(df$y)]))
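An alternative that sidesteps both the length warnings and the full-match special case is to compare only the overlapping part of the two strings. This is a sketch rather than a drop-in replacement; the helper name common_prefix_length is invented for illustration:

```r
# Hypothetical helper: number of leading characters two strings share.
common_prefix_length <- function(a, b) {
  a_chars <- strsplit(a, split = "")[[1]]
  b_chars <- strsplit(b, split = "")[[1]]
  # Only compare as many characters as the shorter string has,
  # so no recycling (and no warning) takes place.
  n <- min(length(a_chars), length(b_chars))
  if (n == 0) return(0L)
  mismatch <- a_chars[seq_len(n)] != b_chars[seq_len(n)]
  # If nothing differs in the overlap, the whole overlap matches.
  if (any(mismatch)) which.max(mismatch) - 1L else n
}

df <- data.frame(x = c("yhf", "rnmqjk", "wok"),
                 y = c("yh", "rnmj", "ok"),
                 stringsAsFactors = FALSE)
df$z <- mapply(common_prefix_length, df$x, df$y, USE.NAMES = FALSE)
df$z
# [1] 2 3 0
```

With this version, two identical strings such as "abc" and "abc" give 3 without a second correction step.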
I'm having troubles with
set.seed(1)
sum(abs(rnorm(100)))
set.seed(1)
cumsum(abs(rnorm(100)))
Why does the value of the sum differ from the last value of the cumulative sum, with the cumulative sum preserving all the decimal digits and sum rounding one digit off?
Also note that this really is only about how values are printed, i.e. presented. This does not change the values themselves, e.g. ...
set.seed(1)
d1 <- sum(abs(rnorm(100)))
set.seed(1)
d2 <- cumsum(abs(rnorm(100)))
(d1 == d2)[100]
## [1] TRUE
This is a consequence of the way R prints atomic vectors.
With the default digits option set to 7 as you likely have, any value between -1 and 1 will print with seven decimal places. Because of the way R prints atomic vectors, all other values in the vector will also have seven decimal places. Furthermore, a value of .6264538 with digits option set to 7 must print with eight digits (0.6264538) because it must have a leading zero. There are two of these values in your rnorm() vector.
If you look at cumsum(abs(rnorm(100)))[100] on its own, you can see the difference: printed alone, it shows the same number of digits as sum(abs(rnorm(100))).
sum(abs(rnorm(100)))
# [1] 71.67207
cumsum(abs(rnorm(100)))[100]
# [1] 71.67207
Notice that both of these values have seven significant digits. Probably the most basic example of this I can think of is as follows:
0.123456789
#[1] 0.1234568
1.123456789
#[1] 1.123457
11.123456789
#[1] 11.12346
## and so on ...
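If you want to convince yourself that only the printing differs, one option is to ask for more digits explicitly and to compare the two numbers directly. A short sketch (the variable names total and running are made up for illustration):

```r
set.seed(1)
total <- sum(abs(rnorm(100)))
set.seed(1)
running <- cumsum(abs(rnorm(100)))

# Request more significant digits than the default of 7 for both values.
print(total, digits = 10)
print(running[100], digits = 10)

# Compare the numbers themselves rather than their printed form.
same_value <- isTRUE(all.equal(total, running[100]))
same_value

# format() applies the same significant-digit rule to a single value.
format(0.123456789, digits = 7)
format(11.123456789, digits = 7)
```

The digits argument only affects presentation here; the stored values are untouched.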
I have a vector
v<-c(1,2,3)
I need add the numbers in the vector in the following fashion
1,1+2,1+2+3
producing a second vector
v1<-c(1,3,6)
This is probably quite simple...but I am a bit stuck.
Use the cumulative sum function:
cumsum(v)
#[1] 1 3 6
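If it helps to see what cumsum is doing, the same running total can be written out with Reduce. This is only an illustration of the idea (the name running_total is made up); cumsum is the simpler choice in practice:

```r
v <- c(1, 2, 3)

# accumulate = TRUE keeps every intermediate result: 1, 1+2, 1+2+3.
running_total <- Reduce(`+`, v, accumulate = TRUE)
running_total
# [1] 1 3 6
```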