Read two numbers. Find their product after exchanging last digits.
For example: Input: 4270 and 153, output: 640950 (4273x150).
Input: 348 and 31, output: 12958 (341*38).
x <- 348
y <- 31
as.integer(paste0(x %/% 10, y %% 10)) * as.integer(paste0(y %/% 10, x %% 10))
# [1] 12958
This also works vectorized:
x <- c(348, 4270)
y <- c(31, 153)
as.integer(paste0(x %/% 10, y %% 10)) * as.integer(paste0(y %/% 10, x %% 10))
# [1] 12958 640950
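As an alternative sketch, the same swap can be done with arithmetic only, avoiding the round trip through character strings. The function name swap_last_product is made up for illustration, and the sketch assumes non-negative whole numbers.

```r
# Hypothetical helper: swap the last digits of x and y, then multiply.
# Assumes non-negative whole numbers.
swap_last_product <- function(x, y) {
  x_swapped <- x - x %% 10 + y %% 10  # drop x's last digit, append y's
  y_swapped <- y - y %% 10 + x %% 10  # drop y's last digit, append x's
  x_swapped * y_swapped
}

res <- swap_last_product(c(348, 4270), c(31, 153))
res
# [1]  12958 640950
```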
I would like to find out the three closest numbers in a vector.
Something like
v = c(10,23,25,26,38,50)
c = findClosest(v,3)
c
23 25 26
I tried with sort(colSums(as.matrix(dist(x))))[1:3], and it kind of works, but it selects the three numbers with minimum overall distance not the three closest numbers.
There is already an answer for matlab, but I do not know how to translate it to R:
%finds the index with the minimal difference in A
minDiffInd = find(abs(diff(A))==min(abs(diff(A))));
%extract this index, and it's neighbor index from A
val1 = A(minDiffInd);
val2 = A(minDiffInd+1);
How to find two closest (nearest) values within a vector in MATLAB?
My assumption is that for the n nearest values, the only thing that matters is the difference v[i] - v[i - (n - 1)] in the sorted vector. That is, we need the minimum of diff(x, lag = n - 1L).
findClosest <- function(x, n) {
x <- sort(x)
x[seq.int(which.min(diff(x, lag = n - 1L)), length.out = n)]
}
findClosest(v, 3L)
[1] 23 25 26
Let's define "nearest numbers" by "numbers with minimal sum of L1 distances". You can achieve what you want by a combination of diff and windowed sum.
You could write a much shorter function but I wrote it step by step to make it easier to follow.
v <- c(10,23,25,26,38,50)
#' Find the n nearest numbers in a vector
#'
#' @param v Numeric vector
#' @param n Number of nearest numbers to extract
#'
#' @details "Nearest numbers" defined as the numbers which minimise the
#' within-group sum of L1 distances.
#'
findClosest <- function(v, n) {
# Sort and remove NA
v <- sort(v, na.last = NA)
# Compute L1 distances between closest points. We know each point is next to
# its closest neighbour since we sorted.
delta <- diff(v)
# Compute sum of L1 distances on a rolling window with n - 1 elements
# Why n-1 ? Because we are looking at deltas and 2 deltas ~ 3 elements.
withingroup_distances <- zoo::rollsum(delta, k = n - 1)
# Now it's simply finding the group with minimum within-group sum
# And working out the elements
group_index <- which.min(withingroup_distances)
element_indices <- group_index + 0:(n-1)
v[element_indices]
}
findClosest(v, 2)
# 25 26
findClosest(v, 3)
# 23 25 26
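A side note, as a sketch in base R only: on a sorted vector the rolling sum of n - 1 consecutive gaps telescopes to the difference between the last and the first element of the window, so it matches diff(v, lag = n - 1). The cumsum-based rolling sum below is a stand-in for zoo::rollsum, not the code used above.

```r
v <- sort(c(10, 23, 25, 26, 38, 50))
n <- 3
k <- n - 1                  # number of gaps spanned by n elements

delta <- diff(v)            # gaps between sorted neighbours
cs <- c(0, cumsum(delta))   # running total of the gaps

# Rolling sum over k consecutive gaps, computed from the running total
window_sums <- cs[(k + 1):length(cs)] - cs[1:(length(cs) - k)]

# Difference between elements that are k positions apart
lag_diffs <- diff(v, lag = k)

window_sums
# [1] 15  3 13 24
lag_diffs
# [1] 15  3 13 24
```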
A base R option: the idea is to sort the vector, subtract every ith element from the (i + n - 1)th element of the sorted vector, and select the group with the minimum difference.
closest_n_vectors <- function(v, n) {
v1 <- sort(v)
inds <- which.min(sapply(head(seq_along(v1), -(n - 1)), function(x)
v1[x + n -1] - v1[x]))
v1[inds: (inds + n - 1)]
}
closest_n_vectors(v, 3)
#[1] 23 25 26
closest_n_vectors(c(2, 10, 1, 20, 4, 5, 23), 2)
#[1] 1 2
closest_n_vectors(c(19, 23, 45, 67, 89, 65, 1), 2)
#[1] 65 67
closest_n_vectors(c(19, 23, 45, 67, 89, 65, 1), 3)
#[1] 1 19 23
In case of a tie this returns the group with the smallest values, since which.min returns the first minimum.
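The tie-breaking behaviour can be checked by hand on the last example. This sketch recomputes the spans of each window of three sorted values directly, rather than calling the function above.

```r
x <- sort(c(19, 23, 45, 67, 89, 65, 1))
n <- 3

# Span (last minus first) of every window of n sorted values
spans <- diff(x, lag = n - 1)
spans
# [1] 22 26 42 22 24

# Two windows tie for the smallest span
tied <- which(spans == min(spans))
tied
# [1] 1 4

# which.min() picks the first of them
first_min <- which.min(spans)
picked <- x[first_min:(first_min + n - 1)]
picked
# [1]  1 19 23
```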
BENCHMARKS
Since we have quite a few answers, it is worth benchmarking all the solutions so far.
set.seed(1234)
x <- sample(100000000, 100000)
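results <- list(findClosest_antoine(x, 3), findClosest_Sotos(x, 3),
closest_n_vectors_Ronak(x, 3), findClosest_Cole(x, 3))
# identical() compares only two objects, so check each result against the first
all(vapply(results[-1], identical, logical(1), results[[1]]))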
#[1] TRUE
microbenchmark::microbenchmark(
antoine = findClosest_antoine(x, 3),
Sotos = findClosest_Sotos(x, 3),
Ronak = closest_n_vectors_Ronak(x, 3),
Cole = findClosest_Cole(x, 3),
times = 10
)
#Unit: milliseconds
# expr min lq mean median uq max neval cld
#antoine 148.751 159.071 163.298 162.581 167.365 181.314 10 b
# Sotos 1086.098 1349.762 1372.232 1398.211 1453.217 1553.945 10 c
# Ronak 54.248 56.870 78.886 83.129 94.748 100.299 10 a
# Cole 4.958 5.042 6.202 6.047 7.386 7.915 10 a
One idea is to use the zoo library to do a rolling operation, i.e.
library(zoo)
m1 <- rollapply(v, 3, by = 1, function(i)c(sum(diff(i)), c(i)))
m1[which.min(m1[, 1]),][-1]
#[1] 23 25 26
Or make it into a function,
findClosest <- function(vec, n) {
require(zoo)
vec1 <- sort(vec)
m1 <- rollapply(vec1, n, by = 1, function(i) c(sum(diff(i)), c(i)))
return(m1[which.min(m1[, 1]),][-1])
}
findClosest(v, 3)
#[1] 23 25 26
For use in a dataframe,
data%>%
group_by(var1,var2)%>%
do(data.frame(findClosest(.$val,3)))
I was wondering if there might be a way in R to change n to become an integer multiple of m?
For example, if n = 73, and m = 8, then I want n to change to 80 (Please note that n could change to 72 but I want n to be the next larger integer, e.g., 80 not 72)?
m = 8
n = 73
multiple <- function(n, m){
# your suggested solution #
}
multiple <- function(n,m){
ceiling(n/m) * m
}
multiple(72,8)
# [1] 72
multiple(73,8)
# [1] 80
multiple(80,8)
# [1] 80
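If n and m are stored as integers and an integer result is wanted, a possible variant uses integer division instead of ceiling(). This is only a sketch; the name multiple_int is hypothetical and m is assumed to be positive.

```r
# Hypothetical integer-only variant: next multiple of m that is >= n.
# Assumes integer inputs and m > 0.
multiple_int <- function(n, m) {
  stopifnot(is.integer(n), is.integer(m), all(m > 0L))
  ((n + m - 1L) %/% m) * m
}

res <- multiple_int(c(72L, 73L, 80L), 8L)
res
# [1] 72 80 80
```

Unlike ceiling(n / m) * m, which returns a double, this keeps the result as an integer vector.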
I wrote a somewhat grotesque function, which should simply return a vector with two values.
For example, if you put in 33, you should get back c(30, 40).
It couldn't get much simpler than this.
return_a_range <- function(number){
ans <- ifelse( (30 <= number & number <= 40), c(30, 40),
(ifelse( (40 < number & number <= 50), c(40, 50),
(ifelse( (50 < number & number <= 60), c(50, 60),
(ifelse( (60 < number & number <= 70), c(60, 70),
(ifelse( (70 < number & number <= 80), c(70, 80),
(ifelse( (80 < number & number <= 100), c(80, 100),
ans <- c("NA"))))))))))))
return(ans)}
return_a_range(33)
Why is this returning only 30? How am I not getting back c(30, 40)? Why did R decide to only return the value in the first position of the vector?
EDIT
Although most of the responses are concerned with (justifiably!) spanking me for writing a lousy ifelse statement, I think the real question was recognized and answered best by @MrFick in the comments directly below.
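For reference, the behaviour can be reproduced in a few lines. ifelse() returns a result with the same shape as its test, so a length-one test yields a length-one result and only the first element of c(30, 40) survives. A plain if / else returns the whole value of the branch it takes. The variable names below are illustrative only.

```r
number <- 33

# ifelse(): result has the length of the test (1 here)
res_ifelse <- ifelse(30 <= number & number <= 40, c(30, 40), NA)
res_ifelse
# [1] 30

# if / else: result is the full value of the chosen branch
res_if <- if (30 <= number && number <= 40) c(30, 40) else NA
res_if
# [1] 30 40
```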
You could just use:
> c(floor(33 / 10), ceiling(33 / 10))*10
[1] 30 40
Or as a function - thanks to @Khashaa for a nice modification (in the comments):
f <- function(x) if(abs(x) >= 100) NA else c(floor(x / 10), floor(x/10) + 1)*10
f(44)
#[1] 40 50
f(40)
#[1] 40 50
This kind of function will be a lot more efficient than multiple nested ifelse calls.
I initially overlooked that you want to return 30 - 40 for an input value of 40 (I thought you wanted 40 - 50, which is what the above function does).
So this is a slightly more elaborate function which should implement that behavior:
ff <- function(x) {
if (abs(x) >= 100L) {
NA
} else {
y <- floor(x / 10L) * 10L
if (x %% 10L == 0L) {
c(y - 10L, y)
} else {
c(y, y + 10L)
}
}
}
And in action:
ff(40)
#[1] 30 40
ff(45)
#[1] 40 50
Or if you had a vector of numbers you could lapply/sapply over it:
( x <- sample(-100:100, 3, F) )
#[1] 73 89 -97
lapply(x, ff)
#[[1]]
#[1] 70 80
#
#[[2]]
#[1] 80 90
#
#[[3]]
#[1] -100 -90
Or
sapply(x, ff)
# [,1] [,2] [,3]
#[1,] 70 80 -100
#[2,] 80 90 -90
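A vectorized sketch that avoids lapply/sapply altogether: since the upper bound is the input rounded up to a multiple of 10, both bounds can be computed for a whole vector at once. The name range10 is made up for this example, and the NA rule for |x| >= 100 is copied from ff() above.

```r
# Hypothetical vectorized counterpart of ff(): one row per input value.
range10 <- function(x) {
  upper <- ceiling(x / 10) * 10          # exact multiples of 10 stay as upper bound
  out <- cbind(lower = upper - 10, upper = upper)
  out[abs(x) >= 100, ] <- NA             # same cut-off as ff()
  out
}

r <- range10(c(73, 89, -97, 40, 45))
r
#      lower upper
# [1,]    70    80
# [2,]    80    90
# [3,]  -100   -90
# [4,]    30    40
# [5,]    40    50
```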
Here's another variation using %/% which will work for the f2(40) case too (but may fail somewhere else?)
f2 <- function(x) if(abs(x) >= 100) NA else c(x %/% 10, (x + 10) %/% 10) * 10
f2(40)
## [1] 40 50
If you really want to keep your function as written rather than go with docendo's answer (for this problem I don't see why you would), you can do the following, in case you need something similar in the future:
return_a_range <- function(number){
ans <- ifelse( (30 <= number & number <= 40), a<-c(30, 40),
(ifelse( (40 < number & number <= 50), a<-c(40, 50),
(ifelse( (50 < number & number <= 60), a<-c(50, 60),
(ifelse( (60 < number & number <= 70), a<-c(60, 70),
(ifelse( (70 < number & number <= 80), a<-c(70, 80),
(ifelse( (80 < number & number <= 100), a<-c(80, 100),
a <- c("NA"))))))))))))
return(a)}
> return_a_range(33)
[1] 30 40
> return_a_range(62)
[1] 60 70
The only thing I did was to save the vector in a variable a on each ifelse.
I am writing a function to plot data. I would like to specify a nice round number for the y-axis max that is greater than the max of the dataset.
Specifically, I would like a function foo that performs the following:
foo(4) == 5
foo(6.1) == 10 #maybe 7 would be better
foo(30.1) == 40
foo(100.1) == 110
I have gotten as far as
foo <- function(x) ceiling(max(x)/10)*10
for rounding to the nearest 10, but this does not work for arbitrary rounding intervals.
Is there a better way to do this in R?
The plyr library has a function round_any that is pretty generic to do all kinds of rounding. For example
library(plyr)
round_any(132.1, 10) # returns 130
round_any(132.1, 10, f = ceiling) # returns 140
round_any(132.1, 5, f = ceiling) # returns 135
If you just want to round up to the nearest power of 10, then just define:
roundUp <- function(x) 10^ceiling(log10(x))
This actually also works when x is a vector:
> roundUp(c(0.0023, 3.99, 10, 1003))
[1] 1e-02 1e+01 1e+01 1e+04
But if you want to round to a "nice" number, you first need to define what a "nice" number is. The following lets us define "nice" as a vector with nice base values from 1 to 10. The default is set to 1, the even numbers, and 5.
roundUpNice <- function(x, nice=c(1,2,4,5,6,8,10)) {
if(length(x) != 1) stop("'x' must be of length 1")
10^floor(log10(x)) * nice[[which(x <= 10^floor(log10(x)) * nice)[[1]]]]
}
The above doesn't work when x is a vector - too late in the evening right now :)
> roundUpNice(0.0322)
[1] 0.04
> roundUpNice(3.22)
[1] 4
> roundUpNice(32.2)
[1] 40
> roundUpNice(42.2)
[1] 50
> roundUpNice(422.2)
[1] 500
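One possible way to lift the length-1 restriction is to apply the scalar logic element by element with vapply(). This is a sketch under the assumption that all values are positive; both function names are hypothetical, and the scalar helper simply restates the logic of roundUpNice so the block runs on its own.

```r
# Scalar helper restating the roundUpNice logic (assumes x > 0)
round_up_nice_one <- function(x, nice = c(1, 2, 4, 5, 6, 8, 10)) {
  base <- 10^floor(log10(x))              # order of magnitude of x
  base * nice[which(x <= base * nice)[1]] # first nice value that is >= x
}

# Hypothetical vectorized wrapper
round_up_nice_vec <- function(x, nice = c(1, 2, 4, 5, 6, 8, 10)) {
  vapply(x, round_up_nice_one, numeric(1), nice = nice)
}

res <- round_up_nice_vec(c(0.0322, 3.22, 32.2, 42.2, 422.2))
```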
[[EDIT]]
If the question is how to round to a specified nearest value (like 10 or 100), then James's answer seems most appropriate. My version lets you take any value and automatically round it to a reasonably "nice" value. Some other good choices of the "nice" vector above are: 1:10, c(1,5,10), seq(1, 10, 0.1)
If you have a range of values in your plot, for example [3996.225, 40001.893] then the automatic way should take into account both the size of the range and the magnitude of the numbers. And as noted by Hadley, the pretty() function might be what you want.
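A minimal sketch of the pretty() route for the range mentioned above. pretty() returns round break points covering the data, so its largest value can serve as the axis maximum. The variable names are illustrative, and the exact breaks are not listed here because they depend on pretty()'s defaults.

```r
rng <- c(3996.225, 40001.893)

# Round, equally spaced break points that cover the range
ticks <- pretty(rng)

# Candidate axis limits taken from the outermost breaks
y_min <- min(ticks)
y_max <- max(ticks)
```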
The round function in R assigns special meaning to the digits parameter if it is negative.
round(x, digits = 0)
Rounding to a negative number of digits means rounding to a power of ten, so for example round(x, digits = -2) rounds to the nearest hundred.
This means a function like the following gets pretty close to what you are asking for.
foo <- function(x)
{
round(x+5,-1)
}
The output looks like the following
foo(4)
[1] 10
foo(6.1)
[1] 10
foo(30.1)
[1] 40
foo(100.1)
[1] 110
If you pass a negative number as the digits argument of round(), R rounds to the nearest multiple of 10, 100, etc.
round(9, digits = -1)
[1] 10
round(89, digits = -1)
[1] 90
round(89, digits = -2)
[1] 100
How about:
roundUp <- function(x,to=10)
{
to*(x%/%to + as.logical(x%%to))
}
Which gives:
> roundUp(c(4,6.1,30.1,100.1))
[1] 10 10 40 110
> roundUp(4,5)
[1] 5
> roundUp(12,7)
[1] 14
Round ANY number Up/Down to ANY interval
You can easily round numbers to a specific interval using the modulo operator %%.
The function:
round.choose <- function(x, roundTo, dir = 1) {
if(dir == 1) { ##ROUND UP
x + (roundTo - x %% roundTo)
} else {
if(dir == 0) { ##ROUND DOWN
x - (x %% roundTo)
}
}
}
Examples:
> round.choose(17,5,1) #round 17 UP to the next 5th
[1] 20
> round.choose(17,5,0) #round 17 DOWN to the next 5th
[1] 15
> round.choose(17,2,1) #round 17 UP to the next even number
[1] 18
> round.choose(17,2,0) #round 17 DOWN to the next even number
[1] 16
How it works:
The modulo operator %% gives the remainder of dividing the first number by the second. Subtracting that remainder from your number rounds it down to a multiple of the interval, while adding the interval minus the remainder rounds it up.
> 7 + (5 - 7 %% 5) #round UP to the nearest 5
[1] 10
> 7 + (10 - 7 %% 10) #round UP to the nearest 10
[1] 10
> 7 + (2 - 7 %% 2) #round UP to the nearest even number
[1] 8
> 7 + (100 - 7 %% 100) #round UP to the nearest 100
[1] 100
> 7 + (4 - 7 %% 4) #round UP to the nearest interval of 4
[1] 8
> 7 + (4.5 - 7 %% 4.5) #round UP to the nearest interval of 4.5
[1] 9
> 7 - (7 %% 5) #round DOWN to the nearest 5
[1] 5
> 7 - (7 %% 10) #round DOWN to the nearest 10
[1] 0
> 7 - (7 %% 2) #round DOWN to the nearest even number
[1] 6
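One edge case to be aware of: with the round-up formula above, a value that already sits on a multiple is pushed to the next one, because its remainder is 0 and the full interval gets added. If that is not wanted, a possible variation takes the remainder of -x instead. Both function names below are made up for this sketch.

```r
# Round-up branch of the approach above, restated as a one-liner
round_up_original <- function(x, roundTo) x + (roundTo - x %% roundTo)

# Hypothetical variation that leaves exact multiples unchanged
round_up_keep <- function(x, roundTo) x + (-x) %% roundTo

round_up_original(20, 5)
# [1] 25
round_up_keep(20, 5)
# [1] 20
round_up_keep(17, 5)
# [1] 20
```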
Update:
The convenient 2-argument version:
rounder <- function(x,y) {
if(y >= 0) { x + (y - x %% y)}
else { x - (x %% abs(y))}
}
Positive y values roundUp, while negative y values roundDown:
# rounder(7, -4.5) = 4.5, while rounder(7, 4.5) = 9.
Or....
Function that automatically rounds UP or DOWN based on standard rounding rules:
Round <- function(x,y) {
if((y - x %% y) <= x %% y) { x + (y - x %% y)}
else { x - (x %% y)}
}
Automatically rounds up if the x value is at least halfway between subsequent instances of the rounding value y:
# Round(1.3,1) = 1 while Round(1.6,1) = 2
# Round(1.024,0.05) = 1 while Round(1.03,0.05) = 1.05
Regarding rounding up to a multiple of an arbitrary number, e.g. 10, here is a simple alternative to James's answer.
It works for any real number being rounded up (from) and any real positive number rounded up to (to):
> RoundUp <- function(from,to) ceiling(from/to)*to
Example:
> RoundUp(-11,10)
[1] -10
> RoundUp(-0.1,10)
[1] 0
> RoundUp(0,10)
[1] 0
> RoundUp(8.9,10)
[1] 10
> RoundUp(135,10)
[1] 140
> RoundUp(from=c(1.3,2.4,5.6),to=1.1)
[1] 2.2 3.3 6.6
If you always want to round a number up to the nearest X, you can use the ceiling function:
#Round 354 up to the nearest 100:
> X=100
> ceiling(354/X)*X
[1] 400
#Round 47 up to the nearest 30:
> Y=30
> ceiling(47/Y)*Y
[1] 60
Similarly, if you always want to round down, use the floor function. If you want to simply round up or down to the nearest Z, use round instead.
> Z=5
> round(367.8/Z)*Z
[1] 370
> round(367.2/Z)*Z
[1] 365
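The three cases can be folded into one small helper that takes the rounding function as an argument. This is only a sketch; the name round_to and its argument names are hypothetical.

```r
# Hypothetical helper: round x to a multiple of `to` using rounding function f
round_to <- function(x, to, f = round) f(x / to) * to

round_to(354, 100, ceiling)  # always up
# [1] 400
round_to(354, 100, floor)    # always down
# [1] 300
round_to(367.8, 5)           # nearest (default)
# [1] 370
round_to(367.2, 5)
# [1] 365
```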
I think your code just works great with a small modification:
foo <- function(x, round=10) ceiling(max(x+10^-9)/round)*round
And your examples run:
> foo(4, round=1) == 5
[1] TRUE
> foo(6.1) == 10 #maybe 7 would be better
[1] TRUE
> foo(6.1, round=1) == 7 # you got 7
[1] TRUE
> foo(30.1) == 40
[1] TRUE
> foo(100.1) == 110
[1] TRUE
> # ALL in one:
> foo(c(4, 6.1, 30.1, 100))
[1] 110
> foo(c(4, 6.1, 30.1, 100), round=10)
[1] 110
> foo(c(4, 6.1, 30.1, 100), round=2.3)
[1] 101.2
I altered your function in two ways:
added a second argument (for your specified X)
added a small value (=1e-09, feel free to modify!) to max(x), so that a value already on a multiple is pushed up to the next one
This rounds x up to the nearest integer multiple of y when y is positive and down when y is negative:
rom=\(x,y)x+(y-x%%y)%%y
rom(8.69,.1) # 8.7
rom(8.69,-.1) # 8.6
rom(8.69,.25) # 8.75
rom(8.69,-.25) # 8.5
rom(-8.69,.25) # -8.5
This always rounds to the nearest multiple like round_any in plyr (https://github.com/hadley/plyr/blob/34188a04f0e33c4115304cbcf40e5b1c7b85fedf/R/round-any.r):
rnm=\(x,y)round(x/y)*y
rnm(8.69,.25) # 8.75
plyr::round_any(8.69,.25) # 8.75
round_any can also be given ceiling as the third argument to always round up or floor to always round down:
plyr::round_any(8.51,.25,ceiling) # 8.75
plyr::round_any(8.69,.25,floor) # 8.5
Below is an upgraded version of Tommy's answer that takes several cases into account:
Choosing between the lower or the higher bound
Handling negative and zero values
Two different nice scales, in case you want the function to round small and big numbers differently. Example: 4 would be rounded to 0 while 400 would be rounded to 400.
The code:
round.up.nice <- function(x, lower_bound = TRUE, nice_small=c(0,5,10), nice_big=c(1,2,3,4,5,6,7,8,9,10)) {
if (abs(x) > 100) {
nice = nice_big
} else {
nice = nice_small
}
if (lower_bound == TRUE) {
if (x > 0) {
return(10^floor(log10(x)) * nice[[max(which(x >= 10^floor(log10(x)) * nice))[[1]]]])
} else if (x < 0) {
return(- 10^floor(log10(-x)) * nice[[min(which(-x <= 10^floor(log10(-x)) * nice))[[1]]]])
} else {
return(0)
}
} else {
if (x > 0) {
return(10^floor(log10(x)) * nice[[min(which(x <= 10^floor(log10(x)) * nice))[[1]]]])
} else if (x < 0) {
return(- 10^floor(log10(-x)) * nice[[max(which(-x >= 10^floor(log10(-x)) * nice))[[1]]]])
} else {
return(0)
}
}
}
I tried this without using any external library or cryptic features and it works!
Hope it helps someone.
ceil <- function(val, multiple){
div = val/multiple
int_div = as.integer(div)
return (int_div * multiple + ceiling(div - int_div) * multiple)
}
> ceil(2.1, 2.2)
[1] 2.2
> ceil(3, 2.2)
[1] 4.4
> ceil(5, 10)
[1] 10
> ceil(0, 10)
[1] 0
Might be missing something but is it not as easy as:
some_number = 789
1000 * round(some_number/1000, 0)
to produce something rounded to 1000s?