Question: Given m sorted arrays each of size n, what is the fastest algorithm to find the kth smallest element in the union of the m arrays?
A friend sent me this optimization question on Quora. He said he’d solved it in Python, on a plane ride, in O(log(k)) time (Note: the algorithm uses O(log(k)) time, [presumably] not my friend). I thought sure, challenge accepted.
I didn’t have the luxury of a full plane ride to figure it out, but I came up with a solution in R, using linear search. I know, I know, linear search is O(n) time. No excuse, I lost the challenge. But again, no plane ride! C’mon!
I wonder, though. Sortation is a precondition for binary search, and sorting itself takes O(n log n) time for a comparison sort (O(n+k) for counting sort, O(nk) for radix sort). So binary searching a list you had to sort first really costs sort time plus O(log(n)) — at best O(n) + O(log(n)) in total — as opposed to linear/brute force search, which tends to O(n) with no setup at all. Of course, in this problem the arrays arrive already sorted. Not sure… should come back to Big O notation later.
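For curiosity's sake, here is a sketch of how a faster-than-linear answer to the original question could look. This is an assumption on my part — I never saw my friend's code — and kth.smallest is my own made-up name. It binary-searches the value space, using findInterval() to count, per sorted array, how many elements fall at or below a candidate value (integer arrays assumed):

```r
# kth.smallest: binary search over the value space (integer arrays assumed).
# The count of elements <= mid across all arrays tells us which side k falls on.
kth.smallest <- function(array.list, k) {
  lo <- min(sapply(array.list, min))
  hi <- max(sapply(array.list, max))
  while (lo < hi) {
    mid <- (lo + hi) %/% 2
    # findInterval() binary-searches each sorted array: O(log n) per array
    count <- sum(sapply(array.list, function(a) findInterval(mid, a)))
    if (count < k) lo <- mid + 1 else hi <- mid
  }
  lo  # the smallest value with at least k elements <= it
}

kth.smallest(list(c(1L, 3L, 5L), c(2L, 4L, 6L)), 3)  # 3
```

Each candidate costs O(m log n), so it beats a linear scan of the union, though it still isn't the O(log(k)) my friend claimed.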
As promised, please find the below solution on my recently not-so-anemic github page.
```r
# searchArrays takes a list of sorted vectors as an argument
# and finds the smallest element common to all of them
# note that the input vectors must be sorted ascending
searchArrays <- function(array.list, k = 1) {
  for (e in seq_along(array.list)) {
    # separate the current "search" array from the rest
    array.list.search <- array.list[-e]
    for (p in seq_along(array.list[[e]])) {
      x.search <- array.list[[e]][p]
      element.vector <- x.search
      for (l in seq_along(array.list.search)) {
        for (n in seq_along(array.list.search[[l]])) {
          element.vector <- append(element.vector, array.list.search[[l]][n])
          if (range(element.vector)[1] == range(element.vector)[2]) {
            break  # x.search found in this array; move on to the next one
          }
          # the new element broke uniformity: drop it and keep scanning
          element.vector <- element.vector[-length(element.vector)]
        }
      }
      # one copy of x.search per array means it is common to all of them
      if (length(element.vector) == length(array.list) &&
          range(element.vector)[1] == range(element.vector)[2]) {
        return(element.vector)
      }
    }
  }
  print("Error: no common element across arrays")
}
```
The input to the function searchArrays() is a list of sorted vectors. The goal is to find the smallest common element in all of the arrays.
Using loops, I take each array in turn as the “search” array e (starting with the first), then loop through each element of e, which I label x.search (maybe I should have called it e.i). I then separate e from the remaining arrays, labeling those array.list.search.
At the heart of the nested loops is element.vector, which builds up one element at a time, checking for uniformity at each step. To illustrate, if you print it in real time it looks something like:
a a b a a a a a a a a a c a a a a a a a a a a
To test the above:
```r
test.gen <- function(n) { sort(sample(letters, n, replace = TRUE)) }
test <- list()
for (i in 1:5) {
  a <- list(test.gen(100))
  test <- append(test, a)
}
searchArrays(test)
```
Final thought: solving the above required checking whether element.vector was uniform at each step. It was an unexpected challenge which I enjoyed solving. The aha moment came when I realized that if the vector is uniform, its min and max must be equal. Thus you can compare the two ends of its range:
range(element.vector)[1] == range(element.vector)[2]
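To see the trick in isolation — is.uniform is just an illustrative name here, not something from the function above. It works for character vectors too, since min and max compare strings alphabetically:

```r
# a vector is uniform iff the two ends of range() coincide
is.uniform <- function(v) range(v)[1] == range(v)[2]

is.uniform(c("a", "a", "a"))  # TRUE
is.uniform(c("a", "a", "b"))  # FALSE
```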
This blog post brought to you by Charles Mingus radio, via Spotify.