R - nested loop alternatives/optimization

I'm trying to implement an algorithm in R that loops through the rows and columns of a matrix and computes, for every cell, a value based on previously computed cells.
Here is the code that does this; it is part of the Needleman-Wunsch algorithm:
globalSequenceAlignment <- function(seq1, seq2, match, mismatch, gap) {
  # splitting the sequences in order to use them as row and column names
  seq1_split <- unlist(strsplit(toString(seq1), ""))
  seq2_split <- unlist(strsplit(toString(seq2), ""))
  len1 <- length(seq1_split)
  len2 <- length(seq2_split)
  # creating the alignment matrix
  alignment_matrix <- matrix(0, nrow = len2+1, ncol = len1+1)
  colnames(alignment_matrix) <- c("-", seq1_split)
  rownames(alignment_matrix) <- c("-", seq2_split)
  # filling first row and column of the alignment matrix
  for (i in 2:ncol(alignment_matrix)) {
    alignment_matrix[1,i] <- (alignment_matrix[1,i]+(i-1))*(gap)
  }
  for (j in 2:nrow(alignment_matrix)) {
    alignment_matrix[j,1] <- (alignment_matrix[j,1]+(j-1))*(gap)
  }
  # filling every other cell from its left, upper and diagonal neighbours
  for (i in 2:ncol(alignment_matrix)) {
    for (j in 2:nrow(alignment_matrix)) {
      horizontal_score <- alignment_matrix[j,i-1] + gap
      vertical_score <- alignment_matrix[j-1,i] + gap
      if (colnames(alignment_matrix)[i] == rownames(alignment_matrix)[j]) {
        diagonal_score <- alignment_matrix[j-1,i-1] + match
      } else {
        diagonal_score <- alignment_matrix[j-1,i-1] + mismatch
      }
      scores <- c(horizontal_score, vertical_score, diagonal_score)
      alignment_matrix[j,i] <- max(scores)
    }
  }
  return(alignment_matrix)
}
a <- 'GAATC'
b <- 'CATACG'
globalSequenceAlignment(a, b, 10,-5,-4)
Using this code I get the result that I want.
The problem is that with matrices larger than 500x500 the nested loops become far too slow (running this code on a 500x500 matrix takes roughly 2 minutes).
I know that *apply functions could improve this, but I couldn't manage to use them, since computing each cell requires that the previous ones have already been computed.
I was wondering if there is a way to achieve the same result using *apply functions, or a way to vectorize this type of code so that it runs faster in R.

In case someone ever needs this: I wrote my own solution to this problem using the Rcpp package. The runtime for sequences of 500 characters went from about 3 minutes to about 0.3s.
I post here the code for the two nested loops shown in the question; I hope it will be useful for someone.
cppFunction('IntegerMatrix rcpp_compute_matrices(IntegerMatrix Am, StringMatrix Dm,
                                       StringVector seq1, StringVector seq2,
                                       int gap, int miss, int match) {
  int nrow = Am.nrow(), ncol = Am.ncol();
  for (int i = 1; i < nrow; i++) {
    for (int j = 1; j < ncol; j++) {
      int vertical_score = Am(i-1, j) + gap;
      int horizontal_score = Am(i, j-1) + gap;
      int diagonal_score = 0;
      if (seq1[j-1] == seq2[i-1]) {
        diagonal_score = Am(i-1, j-1) + match;
      } else {
        diagonal_score = Am(i-1, j-1) + miss;
      }
      IntegerVector score = {vertical_score, horizontal_score, diagonal_score};
      int max_score = max(score);
      Am(i, j) = max_score;
    }
  }
  return Am;
}')
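For checking the recurrence outside R, here is a minimal standalone C++ sketch of the same fill, including the first row and column that the excerpt above leaves to the R side. The function name nw_matrix and the use of std::string and std::vector are assumptions made for illustration; this is not the Rcpp code from the answer.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the post): fills the Needleman-Wunsch score
// matrix with seq1 along the columns and seq2 along the rows.
std::vector<std::vector<int>> nw_matrix(const std::string& seq1,
                                        const std::string& seq2,
                                        int match, int mismatch, int gap) {
    const std::size_t nrow = seq2.size() + 1;
    const std::size_t ncol = seq1.size() + 1;
    std::vector<std::vector<int>> m(nrow, std::vector<int>(ncol, 0));

    // First row and column: cumulative gap penalties.
    for (std::size_t j = 1; j < ncol; ++j) m[0][j] = static_cast<int>(j) * gap;
    for (std::size_t i = 1; i < nrow; ++i) m[i][0] = static_cast<int>(i) * gap;

    // Every other cell depends on its upper, left and diagonal neighbours,
    // which is why the loop order matters and the fill cannot be reordered freely.
    for (std::size_t i = 1; i < nrow; ++i) {
        for (std::size_t j = 1; j < ncol; ++j) {
            const int diagonal =
                m[i - 1][j - 1] + (seq1[j - 1] == seq2[i - 1] ? match : mismatch);
            const int vertical = m[i - 1][j] + gap;
            const int horizontal = m[i][j - 1] + gap;
            m[i][j] = std::max({diagonal, vertical, horizontal});
        }
    }
    return m;
}
```

With the question's inputs ('GAATC', 'CATACG', match 10, mismatch -5, gap -4) the bottom-right cell of this matrix holds the global alignment score.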


an alternative to nested for loops in r OR possible Rcpp needed?

I have a nested loop as
X <- matrix(c(0.5,0,0,0.75), nrow = 2)
k = nrow(X)
ans1 <- 0
ans2 <- 0
for (aa in 1:k) {
  for (bb in 1:k) {
    for (cc in 1:k) {
      for (dd in 1:k) {
        ans1 = ans1 + (0.45 * X[aa,bb] * X[cc,dd])
        for (xx in 1:k) {
          for (yy in 1:k){
            ans2 = ans2 + (1.7*X[aa,bb]*X[xx,yy]*X[cc,dd] + 0.2*X[aa,xx]*X[bb,yy]*X[cc,dd])
          }
        }
      }
    }
  }
}
But the matrix X, which must be square, can be of very high dimension, which slows the loops considerably, e.g. X = matrix(rnorm(10000),nrow = 100,byrow = T)
I am wondering if there is a shorter way to write this that would be readable and, most importantly, faster. I have tried expand.grid but this doesn't help much.
For instance for ans1
an1 <- expand.grid(rep(list(seq(length(X))),2)) %>% arrange_all()
an11 <- t(apply(an1, 1, function(x) as.vector(t(X))[x]))
But as I mentioned, this doesn't improve the speed. Any suggestions? I am also thinking Rcpp might help, but I am not sure and I have not tried that (I'm not very good with C++ syntax).
You don't need to use loops at all. Since your code for ans1 and ans2 is just a sum of terms, and those terms don't interact at all, the expressions simplify to
ans1simple <- 0.45*sum(X)^2
ans2simple <- 1.9*sum(X)^3
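A short derivation of why the nested sums collapse, writing S for sum(X) (the symbol S and the single-letter indices are introduced here only for brevity). Each summand is a product of factors whose index pairs are independent of one another, so the sums factor:

```latex
\begin{aligned}
S &= \sum_{a=1}^{k}\sum_{b=1}^{k} X_{ab},\\
\text{ans1} &= 0.45 \sum_{a,b,c,d} X_{ab}\,X_{cd}
  = 0.45 \Big(\sum_{a,b} X_{ab}\Big)\Big(\sum_{c,d} X_{cd}\Big)
  = 0.45\,S^{2},\\
\text{ans2} &= \sum_{a,b,c,d,x,y} \big(1.7\,X_{ab}\,X_{xy}\,X_{cd} + 0.2\,X_{ax}\,X_{by}\,X_{cd}\big)
  = 1.7\,S^{3} + 0.2\,S^{3}
  = 1.9\,S^{3}.
\end{aligned}
```

In the second term of ans2 the pairs (a,x), (b,y) and (c,d) each run over every cell of X exactly once, which is why it also contributes a multiple of S cubed.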
You can test this on random data. Change the seed or size of X if you're not convinced:
X <- matrix(rnorm(9), nrow = 3)
k = nrow(X)
ans1 <- 0
ans2 <- 0
for (aa in 1:k) {
  for (bb in 1:k) {
    for (cc in 1:k) {
      for (dd in 1:k) {
        ans1 = ans1 + (0.45 * X[aa,bb] * X[cc,dd])
        for (xx in 1:k) {
          for (yy in 1:k){
            ans2 = ans2 + (1.7*X[aa,bb]*X[xx,yy]*X[cc,dd] + 0.2*X[aa,xx]*X[bb,yy]*X[cc,dd])
          }
        }
      }
    }
  }
}
ans1simple <- 0.45*sum(X)^2
ans2simple <- 1.9*sum(X)^3
ans1 - ans1simple
#> [1] 2.220446e-16
ans2 - ans2simple
#> [1] -7.993606e-15
The differences are just rounding error.
for loops in R are very slow compared to for loops in C++.
C++ for loop syntax isn't too different from some flavors of R.
I highly suspect you can condense your code significantly. But just going by your very nested syntax:
Rcpp function:
#include <Rcpp.h>
// [[Rcpp::export]]
Rcpp::NumericVector foo(Rcpp::NumericMatrix& X) {
    Rcpp::NumericVector ans(2);
    int k = X.rows();
    for (int aa = 0; aa < k; ++aa) {
        for (int bb = 0; bb < k; ++bb) {
            for (int cc = 0; cc < k; ++cc) {
                for (int dd = 0; dd < k; ++dd) {
                    // X(i, j) indexes the matrix; X[i, j] would apply the comma operator
                    ans[0] += 0.45 * X(aa, bb) * X(cc, dd);
                    for (int xx = 0; xx < k; ++xx) {
                        for (int yy = 0; yy < k; ++yy) {
                            ans[1] += (1.7 * X(aa, bb) * X(xx, yy) * X(cc, dd) + 0.2 * X(aa, xx) * X(bb, yy) * X(cc, dd));
                        }
                    }
                }
            }
        }
    }
    return ans;
}
On the R end:
X <- matrix(c(0.5,0,0,0.75), nrow = 2)
ans <- foo(X)
ans1 <- ans[1]
ans2 <- ans[2]
Plugging-and-chugging with the above code is NOT an excuse for not optimizing your code. Again, cut the number of loops. You shouldn't need them all.
After seeing @user2554330's answer (which I suspected was the case but was too lazy to work out): the Rcpp implementation will not be significantly faster than the simplified R expressions from that answer (and I doubt you're worrying about such marginal gains).

Rcpp function for adding elements of a vector

I have a very long vector of parameters (approximately 4^10 elements) and a vector of indices. My aim is to add together all of the values of the parameters that are indexed in the indices vector.
For instance, if I had paras = [1,2,3,4,5,5,5] and indices = [3,3,1,6] then I would want to find the cumulative sum of the third value (3) twice, the first value (1) and the sixth (5), to get 12. There is additionally the option of warping the parameter values according to their location.
I am trying to speed up an R implementation, as I am calling it millions of times.
My current code always returns NA, and I can't see where it is going wrong.
Here's the Rcpp function:
double dot_prod_c(NumericVector indices, NumericVector paras,
                  NumericVector warp = NA_REAL) {
  int len = indices.size();
  LogicalVector indices_ok;
  for (int i = 0; i < len; i++){
    indices_ok.push_back(R_IsNA(indices[i]));
  }
  if(is_true(any(indices_ok))){
    return NA_REAL;
  }
  double counter = 0;
  if(NumericVector::is_na(warp[1])){
    for (int i = 0; i < len; i++){
      counter += paras[indices[i]];
    }
  } else {
    for (int i = 0; i < len; i++){
      counter += paras[indices[i]] * warp[i];
    }
  }
  return counter;
}
And here is the working R version:
dot_prod <- function(indices, paras, warp = NA){
  if (is.na(warp[1])){
    return(sum(sapply(indices, function(ind) paras[ind + 1])))
  } else {
    return(sum(sapply(1:length(indices), function(i){
      ind <- indices[i]
      paras[ind + 1] * warp[i]
    })))
  }
}
Here is some code for testing, and benchmarking using the microbenchmark package:
# testing
parameters <- list()
indices <- list()
indices_trad <- list()
for (i in 4:12){
  size <- 4^i
  window_size <- 100
  parameters[[i-3]] <- runif(size)
  indices[[i-3]] <- floor(runif(window_size)*size)
  temp <- rep(0, size)
  for (j in 1:window_size){
    temp[indices[[i-3]][j] + 1] <- temp[indices[[i-3]][j] + 1] + 1
  }
  indices_trad[[i-3]] <- temp
}
microbenchmark(
  x <- sapply(1:9, function(i) dot_prod(indices[[i]], parameters[[i]])),
  x_c <- sapply(1:9, function(i) dot_prod_c(indices[[i]], parameters[[i]])),
  x_base <- sapply(1:9, function(i) indices_trad[[i]] %*% parameters[[i]])
)
all.equal(x, x_base) # is true, does work
all.equal(x_c, x_base) # not true - C++ version returns only NAs
I was having a little trouble trying to interpret your overall goal through your code, so I'm just going to go with this explanation
For instance, if I had paras = [1,2,3,4,5,5,5] and indices = [3,3,1,6]
then I would want to find the cumulative sum of the third value (3)
twice, the first value (1) and the sixth (5), to get 12. There is
additionally the option of warping the parameter values according to
their location.
since it was most clear to me.
There are some issues with your C++ code. To start, instead of doing this - NumericVector warp = NA_REAL - use the Rcpp::Nullable<> template (shown below). This will solve a few problems:
It's more readable. If you're not familiar with the Nullable class, it's pretty much exactly what it sounds like - an object that may or may not be null.
You won't have to make any awkward initializations, such as NumericVector warp = NA_REAL. Frankly I was surprised that the compiler accepted this.
You won't have to worry about accidentally forgetting that C++ uses zero-based indexing, unlike R, as in this line: if(NumericVector::is_na(warp[1])){. That has undefined behavior written all over it.
Here's a revised version, going off of your quoted description of the problem above:
#include <Rcpp.h>
typedef Rcpp::Nullable<Rcpp::NumericVector> nullable_t;
// [[Rcpp::export]]
double DotProd(Rcpp::NumericVector indices, Rcpp::NumericVector params, nullable_t warp_ = R_NilValue) {
    R_xlen_t i = 0, n = indices.size();
    double result = 0.0;
    if (warp_.isNull()) {
        for ( ; i < n; i++) {
            result += params[indices[i]];
        }
    } else {
        Rcpp::NumericVector warp(warp_);
        for ( ; i < n; i++) {
            result += params[indices[i]] * warp[i];
        }
    }
    return result;
}
You had some elaborate code to generate sample data. I didn't take the time to go through this because it wasn't necessary, nor was the benchmarking. You stated yourself that the C++ version wasn't producing the correct results. Your first priority should be to get your code working on simple data. Then feed it some more complex data. Then benchmark. The revised version above works on simple data:
args <- list(
  indices = c(3, 3, 1, 6),
  params = c(1, 2, 3, 4, 5, 5, 5),
  warp = c(.25, .75, 1.25, 1.75)
)
all.equal(
  DotProd(args[[1]], args[[2]]),
  dot_prod(args[[1]], args[[2]]))
#[1] TRUE
all.equal(
  DotProd(args[[1]], args[[2]], args[[3]]),
  dot_prod(args[[1]], args[[2]], args[[3]]))
#[1] TRUE
It's also faster than the R version on this sample data. I have no reason to believe it wouldn't be for larger, more complex data either - there's nothing magical or particularly efficient about the *apply functions; they are just more idiomatic / readable R.
microbenchmark::microbenchmark(
  "Rcpp" = DotProd(args[[1]], args[[2]]),
  "R" = dot_prod(args[[1]], args[[2]]))
#Unit: microseconds
#expr min lq mean median uq max neval
#Rcpp 2.463 2.8815 3.52907 3.3265 3.8445 18.823 100
#R 18.869 20.0285 21.60490 20.4400 21.0745 66.531 100
microbenchmark::microbenchmark(
  "Rcpp" = DotProd(args[[1]], args[[2]], args[[3]]),
  "R" = dot_prod(args[[1]], args[[2]], args[[3]]))
#Unit: microseconds
#expr min lq mean median uq max neval
#Rcpp 2.680 3.0430 3.84796 3.701 4.1360 12.304 100
#R 21.587 22.6855 23.79194 23.342 23.8565 68.473 100
I omitted the NA checks from the example above, but that too can be revised into something more idiomatic by using a little Rcpp sugar. Previously, you were doing this:
LogicalVector indices_ok;
for (int i = 0; i < len; i++){
  indices_ok.push_back(R_IsNA(indices[i]));
}
if(is_true(any(indices_ok))){
  return NA_REAL;
}
It's a little aggressive - you are testing a whole vector of values (with R_IsNA), and then applying is_true(any(indices_ok)) - when you could just break prematurely and return NA_REAL on the first instance of R_IsNA(indices[i]) resulting in true. Also, the use of push_back will slow down your function quite a bit - you would have been better off initializing indices_ok to the known size and filling it by index access in your loop. Nevertheless, here's one way to condense the operation:
if (Rcpp::na_omit(indices).size() != indices.size()) return NA_REAL;
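The early-exit idea described above (return on the first missing index instead of testing the whole vector first) can be sketched in plain C++ without Rcpp. Everything here is an assumption for illustration: the name indexed_sum, the use of NaN as a stand-in for NA_REAL, and passing the optional warp as a pointer that may be null.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical standalone analogue of the indexed sum (not Rcpp code).
// Indices are zero-based values stored as doubles, as R would pass them;
// warp may be null, in which case every term has weight 1.
double indexed_sum(const std::vector<double>& indices,
                   const std::vector<double>& params,
                   const std::vector<double>* warp = nullptr) {
    double result = 0.0;
    for (std::size_t i = 0; i < indices.size(); ++i) {
        // Bail out at the first missing index; NaN stands in for NA_REAL here.
        if (std::isnan(indices[i])) {
            return std::numeric_limits<double>::quiet_NaN();
        }
        const std::size_t idx = static_cast<std::size_t>(indices[i]);
        const double weight = warp ? (*warp)[i] : 1.0;
        result += params[idx] * weight;
    }
    return result;
}
```

The sketch does no bounds checking on idx; a production version would need to decide what an out-of-range index should return.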
For completeness, here's a fully sugar-ized version which allows you to avoid loops entirely:
#include <Rcpp.h>
typedef Rcpp::Nullable<Rcpp::NumericVector> nullable_t;
// [[Rcpp::export]]
double DotProd3(Rcpp::NumericVector indices, Rcpp::NumericVector params, nullable_t warp_ = R_NilValue) {
    if (Rcpp::na_omit(indices).size() != indices.size()) return NA_REAL;
    if (warp_.isNull()) {
        Rcpp::NumericVector tmp = params[indices];
        return Rcpp::sum(tmp);
    } else {
        Rcpp::NumericVector warp(warp_), tmp = params[indices];
        return Rcpp::sum(tmp * warp);
    }
}
/*** R
all.equal(
  DotProd3(args[[1]], args[[2]]),
  dot_prod(args[[1]], args[[2]]))
#[1] TRUE
all.equal(
  DotProd3(args[[1]], args[[2]], args[[3]]),
  dot_prod(args[[1]], args[[2]], args[[3]]))
#[1] TRUE
*/

R: Optimise spike pruning function

Since I have not found an R package for analysis of electrophysiological data, I have used a function for spike pruning from my group:
prune.spikes <- function(spikes, min.isi) {
  # copy spike matrix
  prunedspikes <- spikes
  for (i in 1:ncol(spikes)) {
    # initialise index of last spike: infinitely before the first one.
    last <- -Inf
    for (j in 1:nrow(spikes)) {
      if (spikes[j, i] == 1) {
        if (j - last < min.isi) {
          prunedspikes[j, i] <- 0  # remove the spike
        }
        else {
          last <- j
        }
      }
    }
  }
  return(prunedspikes)
}
The function takes a spike vector or matrix consisting of 0 and 1 values and removes any 1 if it occurred within a minimum interval.
Because of the two nested loops it takes ages to run. In order to optimise it I have come up with this solution (removes one loop):
prune.cols <- function(spikes, min.isi) {
  prunedspikes <- apply(spikes, 2, FUN = prune.rows, min.isi = min.isi)
  return(prunedspikes)
}
prune.rows <- function(spikes, min.isi) {
  prunedspikes <- spikes
  last <- -Inf
  for (i in 1:length(spikes)) {
    if (spikes[i] == 1) {
      if (i - last < min.isi) {
        prunedspikes[i] <- 0  # remove the spike
      }
      else {
        last <- i
      }
    }
  }
  return(prunedspikes)
}
Calling prune.cols on a large data set is noticeably faster than the original version (~60 times). One loop remains, though. So far I could not come up with a nice and simple solution. How can the function be improved even further?
As @Khashaa proposed, I implemented the function with the help of Rcpp:
NumericMatrix prunespikes(NumericMatrix spikes, double minisi) {
  // clone() makes a deep copy; plain assignment would alias the input matrix
  NumericMatrix prunedspikes = clone(spikes);
  int ncol = spikes.ncol();
  int nrow = spikes.nrow();
  for (int i = 0; i < ncol; i++) {
    // advance to the first spike of this column, which is always kept
    int last = 0;
    while (last < nrow && spikes(last, i) == 0) {
      last++;
    }
    for (int j = last + 1; j < nrow; j++) {
      if (spikes(j, i) == 1) {
        if (j - last < minisi) {
          prunedspikes(j, i) = 0;
        } else {
          last = j;
        }
      }
    }
  }
  return prunedspikes;
}
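To test the pruning rule on its own, here is a minimal plain C++ sketch for a single column. The name prune_column, the int element type and the boolean flag that replaces last = -Inf are assumptions made for illustration, not part of the posted code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical single-column version of the pruning loop (plain C++, no Rcpp).
// A spike (value 1) is removed when it follows the last kept spike by fewer
// than min_isi positions; removed spikes do not reset the reference point.
std::vector<int> prune_column(const std::vector<int>& spikes, long min_isi) {
    std::vector<int> pruned = spikes;  // deep copy, the input stays untouched
    bool seen = false;                 // no spike kept yet (plays the role of last = -Inf)
    std::size_t last = 0;              // index of the last kept spike
    for (std::size_t j = 0; j < spikes.size(); ++j) {
        if (spikes[j] != 1) continue;
        if (seen && static_cast<long>(j - last) < min_isi) {
            pruned[j] = 0;             // too close to the previous kept spike
        } else {
            last = j;
            seen = true;
        }
    }
    return pruned;
}
```

Each position depends on the outcome at earlier positions (through last), which is the reason the loop resists straightforward vectorisation.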
If the speed difference is not a problem yet, it may be better to keep the loop instead of using Rcpp.
According to Hadley Wickham's article Loops that should be left as is, it is not a bad idea to have this loop as it can be categorized into the Recursive relationship case.
Once the speed is the bottleneck, then resorting to Rcpp or this page (suggested by the article too) may be the solution.

R: How to compute correlation between rows of a matrix without having to transpose it?

I have a big matrix and am interested in computing the correlation between the rows of the matrix. Since the cor method computes correlation between the columns of a matrix, I am transposing the matrix before calling cor. But since the matrix is big, transposing it is expensive and is slowing down my program. Is there a way to compute the correlations among the rows without having to take transpose?
EDIT: Thanks for the responses. I thought I'd share some findings. My input matrix is 16 rows by 239766 columns and comes from a .mat file. I wrote C# code to do the same thing using the csmatio library. It looks like this:
foreach (var file in Directory.GetFiles(path, interictal_pattern))
{
    var reader = new MatFileReader(file);
    var mla = reader.Data[0] as MLStructure;
    convert(mla.AllFields[0] as MLNumericArray<double>, data);
    double sum = 0;
    for (var i = 0; i < 16; i++)
        for (var j = i + 1; j < 16; j++)
            sum += cor(data, i, j);
    var avg = sum / 120;
    if (++count == 10)
    {
        var t2 = DateTime.Now;
        var t = t2 - t1;
    }
}
static double[][] createArray(int rows, int cols)
{
    var ans = new double[rows][];
    for (var row = 0; row < rows; row++)
        ans[row] = new double[cols];
    return ans;
}
static void convert(MLNumericArray<double> mla, double[][] M)
{
    var rows = M.Length;
    var cols = M[0].Length;
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < cols; j++)
            M[i][j] = mla.Get(i, j);
}
static double cor(double[][] M, int i, int j)
{
    var count = M[0].Length;
    double sum1 = 0, sum2 = 0;
    for (int ctr = 0; ctr < count; ctr++)
    {
        sum1 += M[i][ctr];
        sum2 += M[j][ctr];
    }
    var mu1 = sum1 / count;
    var mu2 = sum2 / count;
    double numerator = 0, sumOfSquares1 = 0, sumOfSquares2 = 0;
    for (int ctr = 0; ctr < count; ctr++)
    {
        var x = M[i][ctr] - mu1;
        var y = M[j][ctr] - mu2;
        numerator += x * y;
        sumOfSquares1 += x * x;
        sumOfSquares2 += y * y;
    }
    return numerator / Math.Sqrt(sumOfSquares1 * sumOfSquares2);
}
This gave a throughput of 22.22s for 10 files, or 2.22s/file.
Then I profiled my R code:
for(file in files)
{
  i = i + 1;
  mat = readMat(paste(path,file,sep=""))
  a = t(mat[[1]][[1]])
  C = cor(a)
  correlations[i] = mean(C[lower.tri(C)])
}
To my surprise it runs faster than the C# code, at 5.7s per 10 files or roughly 0.6s/file (an improvement of almost 4x!). The bottleneck in C# is the csmatio library's parsing of double values from the input stream.
If I do not convert the csmatio classes into a double[][], the C# code runs extremely slowly (an order of magnitude slower, ~20-30s/file).
Seeing that this problem arises from a data input issue whose details are not stated (and only hinted at in a comment), I will assume this is a comma-delimited file of unquoted numbers with the number of columns= Ncol. This does the transposition on input.
in.mat <- matrix( scan("path/to/the_file/fil.txt", what =numeric(0), sep=","),
ncol=Ncol, byrow=TRUE)
One dirty workaround would be to apply cor row-wise and build the correlation matrix from the results. You could test whether this is any more efficient (which I doubt, though you could fine-tune it by skipping the redundant symmetric pairs and the diagonal):
# Apply 2-fold nested row-wise functions
dat <- matrix(rnorm(1000), nrow=10)
cormat <- apply(dat, MARGIN=1, FUN=function(z) apply(dat, MARGIN=1, FUN=function(y) cor(z, y)))
cormat[1:3,1:3] # Show few first
# [,1] [,2] [,3]
#[1,] 1.000000000 0.002175792 0.1559263
#[2,] 0.002175792 1.000000000 -0.1870054
#[3,] 0.155926259 -0.187005418 1.0000000
Generally, though, I would expect the transpose to have a really, really efficient implementation, so it's hard to imagine when that would be the bottleneck. But you could also dig through the implementation of the 'cor' function and call the underlying C function yourself, after first making sure your rows are in a suitable form. Type 'cor' in the terminal to see the implementation, which is mostly a wrapper that makes the input suitable for the C function:
# Row with C-call from the implementation of 'cor':
# if (method == "pearson")
# .Call(C_cor, x, y, na.method, FALSE)
You can use outer:
outer(seq(nrow(mat)), seq(nrow(mat)),
Vectorize(function(x, y) cor(mat[x , ], mat[y , ])))
where mat is the name of your matrix.
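For comparison, here is a minimal plain C++ sketch of row-wise Pearson correlation that avoids both the transpose and the repeated work of calling cor once per pair: each row is centred and its norm computed once, then every pair of rows needs only one dot product. The name row_cor and the vector-of-rows layout are assumptions for illustration; the sketch keeps a centred copy of the data and assumes non-empty, non-constant rows (a constant row would give a division by zero where R returns NA).

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: Pearson correlation between all pairs of rows of a
// matrix stored as a vector of rows, without building a transposed copy.
std::vector<std::vector<double>> row_cor(const std::vector<std::vector<double>>& m) {
    const std::size_t n = m.size();
    std::vector<std::vector<double>> centred(m);
    std::vector<double> norm(n, 0.0);

    // Centre each row once and remember its Euclidean norm.
    for (std::size_t r = 0; r < n; ++r) {
        double mean = 0.0;
        for (double v : centred[r]) mean += v;
        mean /= static_cast<double>(centred[r].size());
        double sum_sq = 0.0;
        for (double& v : centred[r]) {
            v -= mean;
            sum_sq += v * v;
        }
        norm[r] = std::sqrt(sum_sq);
    }

    // Only the upper triangle is computed; the result is mirrored.
    std::vector<std::vector<double>> cor(n, std::vector<double>(n, 1.0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i + 1; j < n; ++j) {
            double dot = 0.0;
            for (std::size_t c = 0; c < centred[i].size(); ++c) {
                dot += centred[i][c] * centred[j][c];
            }
            cor[i][j] = cor[j][i] = dot / (norm[i] * norm[j]);
        }
    }
    return cor;
}
```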

Find closest value in a vector with binary search

As a silly toy example, suppose
x = 4.5
w = c(1,2,4,6,7)
I wonder if there is a simple R function that finds the index of the closest match to x in w. So if foo is that function, foo(w,x) would return 3. The function match is the right idea, but seems to apply only for exact matches.
Solutions here (e.g. which.min(abs(w - x)), which(abs(w-x)==min(abs(w-x))), etc.) are all O(n) instead of log(n) (I'm assuming that w is already sorted).
R>findInterval(4.5, c(1,2,4,5,6))
[1] 3
will do that with price-is-right matching (closest without going over).
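The "closest without going over" lookup can be sketched in C++ with a single binary search. The function name find_interval is an assumption chosen to mirror the R function for a single value; it returns the number of elements of the sorted vector that are less than or equal to x, which equals the one-based index of the last such element (0 if x lies below the first element).

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical analogue of findInterval(x, w) for one value and sorted w:
// counts the elements <= x using std::upper_bound, i.e. O(log n) comparisons.
std::size_t find_interval(double x, const std::vector<double>& w) {
    return static_cast<std::size_t>(
        std::upper_bound(w.begin(), w.end(), x) - w.begin());
}
```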
You can use data.table to do a binary search:
dt = data.table(w, val = w) # you'll see why val is needed in a sec
setattr(dt, "sorted", "w") # let data.table know that w is sorted
Note that if the column w isn't already sorted, then you'll have to use setkey(dt, w) instead of setattr(.).
# binary search and "roll" to the nearest neighbour
dt[J(x), roll = "nearest"]
# w val
#1: 4.5 4
In the final expression the val column will have the value you're looking for.
# or to get the index as Josh points out
# (and then you don't need the val column):
dt[J(x), .I, roll = "nearest", by = .EACHI]
# w .I
#1: 4.5 3
# or to get the index alone
dt[J(x), roll = "nearest", which = TRUE]
#[1] 3
See match.closest() from the MALDIquant package:
> library(MALDIquant)
> match.closest(x, w)
[1] 3
x = 4.5
w = c(1,2,4,6,7)
closestLoc = which.min(abs(w-x))
closestVal = w[which.min(abs(w-x))]
# On my phone- please pardon typos
If your vector is lengthy, try a 2-step approach:
x = 4.5
w = c(1,2,4,6,7)
sdev = sapply(w,function(v,x) abs(v-x), x = x)
closestLoc = which.min(sdev)
For maddeningly long vectors (millions of rows! Warning: this will actually be slower for data which is not very, very, very large.)
closestLoc = which.min(unlist(foreach(i = w) %dopar% {
  abs(i - x)
}))
This example is just to give you a basic idea of leveraging parallel processing when you have huge data. Note, I do not recommend you use it for simple & fast functions like abs().
To do this on character vectors, Martin Morgan suggested this function on R-help:
bsearch7 <-
  function(val, tab, L=1L, H=length(tab))
{
  b <- cbind(L=rep(L, length(val)), H=rep(H, length(val)))
  i0 <- seq_along(val)
  repeat {
    updt <- M <- b[i0,"L"] + (b[i0,"H"] - b[i0,"L"]) %/% 2L
    tabM <- tab[M]
    val0 <- val[i0]
    i <- tabM < val0
    updt[i] <- M[i] + 1L
    i <- tabM > val0
    updt[i] <- M[i] - 1L
    b[i0 + i * length(val)] <- updt
    i0 <- which(b[i0, "H"] >= b[i0, "L"])
    if (!length(i0)) break;
  }
  b[,"L"] - 1L
}
NearestValueSearch = function(x, w){
  ## A simple binary search algo
  ## Assume the w vector is sorted so we can use binary search
  left = 1
  right = length(w)
  while(right - left > 1){
    middle = floor((left + right) / 2)
    if(x < w[middle]){
      right = middle
    }
    else{
      left = middle
    }
  }
  if(abs(x - w[right]) < abs(x - w[left])){
    return(right)
  }
  else{
    return(left)
  }
}
x = 4.5
w = c(1,2,4,6,7)
NearestValueSearch(x, w) # return 3
Based on @neal-fultz's answer, here is a simple function that uses findInterval():
get_closest_index <- function(x, vec){
  # vec must be sorted
  iv <- findInterval(x, vec)
  dist_left <- x - vec[ifelse(iv == 0, NA, iv)]
  dist_right <- vec[iv + 1] - x
  ifelse(! is.na(dist_left) & (is.na(dist_right) | dist_left < dist_right), iv, iv + 1)
}
values <- c(-15, -0.01, 3.1, 6, 10, 100)
grid <- c(-2, -0.1, 0.1, 3, 7)
get_closest_index(values, grid)
#> [1] 1 2 4 5 5 5
You can always implement a custom binary search algorithm to find the closest value. Alternatively, you can leverage the standard libc bsearch(). Other binary search implementations work as well, but either way you have to write the comparison function carefully to find the closest element in the array. The issue with a standard binary search is that it is meant for exact comparison, so your comparison function has to decide for itself whether an element is close enough to count as a match. To do that, it needs to know about the other elements in the array, in particular:
the position of the current element (the one being compared with the key)
the distance to the key and how it compares with the neighbors (previous or next element).
To provide this extra knowledge to the comparison function, the key needs to be packaged with additional information (not just the key value). Once the comparison function is aware of these aspects, it can figure out whether the element itself is the closest. When it knows that it is the closest, it returns "match".
The following C code finds the closest value.
#include <stdio.h>
#include <stdlib.h>
struct key {
    int key_val;
    int *array_head;
    int array_size;
};
int compar(const void *k, const void *e) {
    struct key *key = (struct key*)k;
    int *elem = (int*)e;
    int *arr_first = key->array_head;
    int *arr_last = key->array_head + key->array_size -1;
    int kv = key->key_val;
    int dist_left;
    int dist_right;
    if (kv == *elem) {
        /* easy case: if both same, got to be closest */
        return 0;
    } else if (key->array_size == 1) {
        /* easy case: only element got to be closest */
        return 0;
    } else if (elem == arr_first) {
        /* element is the first in array */
        if (kv < *elem) {
            /* if keyval is less the first element then
             * first elem is closest.
             */
            return 0;
        } else {
            /* check distance between first and 2nd elem.
             * if distance with first elem is smaller, it is closest.
             */
            dist_left = kv - *elem;
            dist_right = *(elem+1) - kv;
            return (dist_left <= dist_right) ? 0:1;
        }
    } else if (elem == arr_last) {
        /* element is the last in array */
        if (kv > *elem) {
            /* if keyval is larger than the last element then
             * last elem is closest.
             */
            return 0;
        } else {
            /* check distance between last and last-but-one.
             * if distance with last elem is smaller, it is closest.
             */
            dist_left = kv - *(elem-1);
            dist_right = *elem - kv;
            return (dist_right <= dist_left) ? 0:-1;
        }
    }
    /* condition for remaining cases (other cases are handled already):
     * - elem is neither first or last in the array
     * - array has atleast three elements.
     */
    if (kv < *elem) {
        /* keyval is smaller than elem */
        if (kv <= *(elem -1)) {
            /* keyval is smaller than previous (of "elem") too.
             * hence, elem cannot be closest.
             */
            return -1;
        } else {
            /* check distance between elem and elem-prev.
             * if distance with elem is smaller, it is closest.
             */
            dist_left = kv - *(elem -1);
            dist_right = *elem - kv;
            return (dist_right <= dist_left) ? 0:-1;
        }
    }
    /* remaining case: (keyval > *elem) */
    if (kv >= *(elem+1)) {
        /* keyval is larger than next (of "elem") too.
         * hence, elem cannot be closest.
         */
        return 1;
    }
    /* check distance between elem and elem-next.
     * if distance with elem is smaller, it is closest.
     */
    dist_right = *(elem+1) - kv;
    dist_left = kv - *elem;
    return (dist_left <= dist_right) ? 0:1;
}
int main(int argc, char **argv) {
    int arr[] = {10, 20, 30, 40, 50, 60, 70};
    int *found;
    struct key k;
    if (argc < 2) {
        return 1;
    }
    k.key_val = atoi(argv[1]);
    k.array_head = arr;
    k.array_size = sizeof(arr)/sizeof(int);
    found = (int*)bsearch(&k, arr, sizeof(arr)/sizeof(int), sizeof(int),
                          compar);
    if(found) {
        printf("found closest: %d\n", *found);
    } else {
        printf("closest not found. absurd! \n");
    }
    return 0;
}
Needless to say, bsearch() in the above example should never fail (unless the array size is zero).
If you implement your own custom binary search, you essentially have to embed the same comparison logic in the main body of the binary search code (instead of in a comparison function as in the above example).
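One way to embed that logic directly, sketched in C++ rather than C: let std::lower_bound locate the first element not less than the key, then compare only that element and its left neighbour. The name nearest_index, the zero-based return value and the rule that ties go to the lower index are assumptions chosen to match the tie handling in the comparison function above; the vector is assumed sorted and non-empty.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: zero-based index of the element of a sorted,
// non-empty vector that is closest to x. Ties go to the lower index.
std::size_t nearest_index(const std::vector<int>& w, int x) {
    // First element that is >= x (or end() if every element is smaller).
    const auto it = std::lower_bound(w.begin(), w.end(), x);
    if (it == w.begin()) return 0;             // x is at or below the first element
    if (it == w.end()) return w.size() - 1;    // x is above the last element
    const std::size_t right = static_cast<std::size_t>(it - w.begin());
    const std::size_t left = right - 1;
    // Only the two neighbours around x can be the closest element.
    return (x - w[left] <= w[right] - x) ? left : right;
}
```

The search itself stays an ordinary exact-comparison binary search; the closeness decision is made once at the end, so no extra information has to be packed into the key.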
