What is the equivalent of the numpy.digitize() function in Julia? - julia

I would like to know how I can replicate the numpy.digitize() functionality in Julia.
I am trying to convert this Python example to Julia.
Python example:
x = np.array([0.2, 6.4, 3.0, 1.6])
bins = np.array([0.0, 1.0, 2.5, 4.0, 10.0])
inds = np.digitize(x, bins)
Output: array([1, 4, 3, 2], dtype=int64)
I tried using the searchsorted function in Julia, but it doesn't replicate the output from Python.
Please suggest a solution to this problem.
Thanks in advance!

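You may use searchsortedlast with broadcasting. For increasing bins it returns the index of the last edge that is less than or equal to the value, which is what np.digitize computes by default: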
julia> x = [0.2, 6.4, 3.0, 1.6]
4-element Array{Float64,1}:
0.2
6.4
3.0
1.6
julia> bins = [0.0, 1.0, 2.5, 4.0, 10.0]
5-element Array{Float64,1}:
0.0
1.0
2.5
4.0
10.0
julia> searchsortedlast.(Ref(bins), x)
4-element Array{Int64,1}:
1
4
3
2
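For comparison, here is a minimal pure-Python sketch of the same rule, assuming the bins are sorted in increasing order. The helper name digitize is only illustrative; bisect_right counts the edges that are less than or equal to each value, which is the index searchsortedlast returns.

```python
from bisect import bisect_right

def digitize(values, bins):
    """For increasing bins, return the number of bin edges <= each value."""
    return [bisect_right(bins, v) for v in values]

x = [0.2, 6.4, 3.0, 1.6]
bins = [0.0, 1.0, 2.5, 4.0, 10.0]
inds = digitize(x, bins)
print(inds)  # [1, 4, 3, 2]
```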

Related

How to round to nearest multiple?

Is there an idiomatic way to round to the nearest multiple of a number?
number  multiple  result
12.2    0.5       12.0
12.2    0.25      12.25
12.4    0.5       12.5
You can define a function:
round_step(x, step) = round(x / step) * step
Usage:
julia> round_step(12.2, 0.25)
12.25
Such a function is actually used internally in Base for rounding numbers to a certain number of digits in a certain base:
julia> Base._round_step(12.2, 0.25, RoundNearest)
12.25
However, since this is an internal implementation detail, you shouldn't rely on this function. It in turn calls _round_invstep:
julia> Base._round_invstep(12.2, 4, RoundNearest)
12.25
julia> Base._round_invstep(12.4, 2, RoundNearest)
12.5
This performs the operation round(x * invstep, r) / invstep.
Because the multiples in your examples, 0.25 and 0.5, are exactly 0.1 in base 4 and base 2 respectively, you can also use round directly for these particular cases:
julia> round(12.2; base=2, digits=1)
12.0
julia> round(12.2; base=4, digits=1)
12.25
julia> round(12.4; base=2, digits=1)
12.5
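The same divide, round, multiply idea can be sketched in Python for readers comparing languages. This is only an illustration of the formula above, reusing the name round_step; like Julia's default rounding mode, Python's one-argument round sends exact ties to the even integer.

```python
def round_step(x, step):
    """Round x to the nearest multiple of step."""
    # round() with a single argument returns the nearest integer
    return round(x / step) * step

# The three rows from the question's table
for number, multiple in [(12.2, 0.5), (12.2, 0.25), (12.4, 0.5)]:
    print(number, multiple, round_step(number, multiple))
```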

How to create samples using sampling technique from existing arrays?

I have two arrays as shown below,
x = collect(range(1, 10, length=10))
y = collect(range(1, 10, length=10))
I would like to know how I can convert them into either a Sobol or a uniform sample, using the corresponding algorithm.
Thanks, I look forward to your suggestions!
Are you trying to sample values (uniformly) from 1:10? If so, you can just pass the collection to rand:
julia> rand(1:10, 5)
5-element Vector{Int64}:
10
5
5
8
8
For the x you gave above, that would be:
julia> x = collect(range(1, 10, length=10)) ;
julia> rand(x, 5)
5-element Vector{Float64}:
2.0
4.0
6.0
7.0
3.0
I'm not sure about Sobol sampling.
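As a point of comparison, uniform sampling from an existing array can be sketched with Python's standard library. The seed value and variable names below are arbitrary choices for illustration, and this covers only the uniform case, not Sobol sequences.

```python
import random

# 1.0, 2.0, ..., 10.0, mirroring the x defined in the question
x = [float(v) for v in range(1, 11)]

rng = random.Random(42)  # fixed, arbitrary seed so the run is repeatable

with_replacement = rng.choices(x, k=5)    # elements may repeat, like rand(x, 5)
without_replacement = rng.sample(x, k=5)  # five distinct positions of x

print(with_replacement)
print(without_replacement)
```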

Multiply each component of vector by another vector (resulting in vector of length m*n)

Say I am making parts that come in three sizes, and each size has a certain tolerance:
target <- c(2, 4, 6)
tolerance <- c(0.95, 1.05)
What I'd like to end up with is an array that contains the limits of the tolerance for each target (i.e. target*0.95, target*1.05):
tol = (2*0.95, 2*1.05, 4*0.95, 4*1.05, 6*0.95, 6*1.05)
Here's a really ugly way of getting there, but I know there is a simple way to do this.
j<-1
tol<-NULL
for (i in target){
tol[j] <- i*tolerance[1]
tol[j+1] <- i*tolerance[2]
j<-j+2
}
The vector tol can be calculated using outer() like this:
tol <- c(outer(tolerance,target))
#> tol
#[1] 1.9 2.1 3.8 4.2 5.7 6.3
You can also achieve that using a matrix product, which returns the limits as a matrix with one row per target:
target <- c(2, 4, 6)
tolerance <- c(0.95, 1.05)
target %*% t(tolerance)
     [,1] [,2]
[1,]  1.9  2.1
[2,]  3.8  4.2
[3,]  5.7  6.3
The other answer would have my preference, but this alternative might generalise better in some specific contexts (more than two vectors):
Reduce("*", expand.grid(list(tolerance, target)))
Mostly for fun - using R's recycling:
rep(target, each = length(tolerance)) * tolerance
#[1] 1.9 2.1 3.8 4.2 5.7 6.3
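For readers coming from Python, the same ordering (both limits of the first target, then both limits of the second, and so on) can be sketched with a nested list comprehension. This is just an illustration of the pairing, not a translation of any particular answer above.

```python
target = [2, 4, 6]
tolerance = [0.95, 1.05]

# Outer loop over targets, inner loop over tolerance factors,
# so each target contributes its lower limit followed by its upper limit
tol = [t * f for t in target for f in tolerance]
print(tol)
```

The printed values may show small floating-point noise in the last digits.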

Rank a vector based on order and replace ties with their average

I'm new to R, and I find it quite interesting.
I have MATLAB code to rank a vector based on order which works fine. Now I want to convert it to R code, a typical Spearman ranking with ties:
# MATLAB CODE
function r=drank(x)
u = unique(x);
[xs,z1] = sort(x);
[z1,z2] = sort(z1);
r = (1:length(x))';
r=r(z2);
for i=1:length(u)
s=find(u(i)==x);
r(s,1) = mean(r(s));
end
This is what I tried:
# R CODE
x = c(10.5, 8.2, 11.3, 9.1, 13.0, 11.3, 8.2, 10.1)
drank <- function(x){
u = unique(x)
xs = order(x)
r=r[xs]
for(i in 1:length(u)){
s=which(u[i]==x)
r[i] = mean(r[s])
}
return(r)
}
r <- drank(x)
Expected results:
r = 5, 1.5, 6.5, 3, 8, 6.5, 1.5, 4
1.5 is the average rank of 8.2, which occurs twice (i.e. a tie)
6.5 is the average rank of 11.3, which occurs twice
Can anyone help me check it?
Thanks,
R has a built-in function for ranking, called rank() and it gives precisely what you are looking for. rank has the argument ties.method, "a character string specifying how ties are treated", which defaults to "average", i.e. replaces ties by their mean.
x = c(10.5, 8.2, 11.3, 9.1, 13.0, 11.3, 8.2, 10.1)
expected <- c(5, 1.5, 6.5, 3, 8, 6.5, 1.5, 4)
rank(x)
# [1] 5.0 1.5 6.5 3.0 8.0 6.5 1.5 4.0
identical(expected, rank(x))
# [1] TRUE
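To make the averaging rule concrete, here is a minimal Python sketch of ranking with ties replaced by their mean rank. The function name average_rank is hypothetical, and this is one straightforward way to get the same result, not how R implements rank().

```python
def average_rank(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    # Indices of values in ascending order of value
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend end to cover the whole run of equal values
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Mean of the 1-based positions pos+1 .. end+1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks

x = [10.5, 8.2, 11.3, 9.1, 13.0, 11.3, 8.2, 10.1]
print(average_rank(x))  # [5.0, 1.5, 6.5, 3.0, 8.0, 6.5, 1.5, 4.0]
```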

How do I round to 1, 1.5, 2 etc instead of 1, 2 or 1.1, 1.2, 1.3 in R?

I want to round numbers to the closest half or whole number. So I want to round 4.2 to 4, 4.3 to 4.5 and 4.8 to 5. I tried a few things with the round option:
> round(4.34,1)
[1] 4.3
> round(4.34)
[1] 4
> round(4.34,0.5)
[1] 4.3
> round(4.34,2)
[1] 4.34
So I only know how to increase the amount of significant numbers, but not how to do different kinds of rounding. Can that be done with the round function, or is there a different function to do that in R?
Use the function round_any in package plyr:
library(plyr)
x <- 4.34
round_any(x, 3)
[1] 3
round_any(x, 1)
[1] 4
round_any(x, 0.5)
[1] 4.5
round_any(x, 0.2)
[1] 4.4
This works without any extra package:
x <- c(4.2, 4.3, 4.8)
round(x*2)/2
#[1] 4.0 4.5 5.0
I don't know R, but I believe this would work, syntax aside:
y = x / 5
z = round(y, 1)
r = z * 5
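A quick way to check that the last suggestion agrees with the multiply-by-2 approach is to run both on the values from the question. The Python sketch below is only a sanity check on those three inputs; the function names are made up for illustration, and the two formulas are not guaranteed to agree on every input, particularly exact ties.

```python
import math

def round_half_a(x):
    """Round to the nearest 0.5 by doubling, rounding, and halving."""
    return round(x * 2) / 2

def round_half_b(x):
    """Round to the nearest 0.5 by dividing by 5, rounding to 1 decimal, and multiplying by 5."""
    return round(x / 5, 1) * 5

for x in (4.2, 4.3, 4.8):
    a, b = round_half_a(x), round_half_b(x)
    print(x, a, b, math.isclose(a, b))
```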
