Galloping Search

I recently learned about an algorithm called Galloping Search. It’s used to search sorted items when the upper bound is unknown. It’s like binary search but without the ‘high’ value. In this short post, I’ll explain my problem and how I solved it.

I am building a distributed log over S3. In a bucket or directory, I continuously add files named with sequential integers:

(Figure: an S3 bucket containing sequentially numbered objects)

The writer keeps a counter in memory. On each insert request, the writer increments the counter, assigning a unique sequential number to the new object. There are no gaps. If the machine crashes, I need a way to locate the last inserted object—the one with the highest number.

S3’s limitations make this challenging.

Solution

Here’s what I came up with: I probe for objects at increasing intervals (1k, 10k, 50k, 100k) in parallel. When I find a gap (e.g., 100k missing but 50k present), I binary search that range (e.g., 60k, 75k, 90k) until I’ve narrowed it to a manageable window (5–10k objects). Then I use S3’s LIST API to fetch the objects from that point.

(Figure: galloping search over the object sequence)

Turns out this is called Exponential Search (or Galloping Search):

Exponential search allows for searching through a sorted, unbounded list for a specified input value (the search “key”). The algorithm consists of two stages. The first stage determines a range in which the search key would reside if it were in the list. In the second stage, a binary search is performed on this range.
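The two stages map directly onto code. Here’s a minimal sketch in Python, adapted to my case of finding the highest existing number; `exists(n)` is a hypothetical stand-in for an S3 HEAD request on object `n` (it assumes objects 1..last all exist, with no gaps):

```python
def find_last(exists):
    """Return the highest n >= 1 for which exists(n) is True, or 0 if none.

    `exists` stands in for an S3 HEAD request on the object named `n`.
    """
    if not exists(1):
        return 0
    # Stage 1: gallop -- keep doubling the probe until we overshoot.
    bound = 1
    while exists(bound * 2):
        bound *= 2
    # Stage 2: binary search between the last hit and the first miss.
    lo, hi = bound, bound * 2  # invariant: exists(lo) is True, exists(hi) is False
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if exists(mid):
            lo = mid
        else:
            hi = mid
    return lo
```

For example, `find_last(lambda n: n <= 137)` returns 137 after roughly 2·log₂(137) probes, rather than 137 sequential checks.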

When I posted this online, there were lots of questions, and many people offered alternative solutions to find the largest number:

Alternate Solutions

What I’m Doing

  1. Every 10k or 50k writes, I write the current count to a file called .metadata. This reduces both cost and write amplification. I call this the checkpointing operation.
  2. While searching, I start from the counter in the .metadata file and run the Galloping Search from there. Even if the metadata file doesn’t exist, the approach still works.
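Putting the two steps together, recovery can be sketched like this. Both `read_metadata` (returns the last checkpointed count, or 0 if the .metadata file is missing) and `exists(n)` are hypothetical stand-ins for the real S3 calls:

```python
def recover(read_metadata, exists):
    """Return the last written sequence number after a crash.

    Starts galloping from the checkpoint in .metadata rather than from 1,
    so only the writes since the last checkpoint need to be searched.
    """
    lo = read_metadata()  # exists(lo) is guaranteed by checkpointing (or lo == 0)
    # Stage 1: gallop forward from the checkpoint with doubling steps.
    step = 1
    while exists(lo + step):
        lo += step
        step *= 2
    hi = lo + step  # exists(hi) is False
    # Stage 2: binary search between the last hit and the first miss.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if exists(mid):
            lo = mid
        else:
            hi = mid
    return lo
```

With a checkpoint at 100k and the true last object at 105k, the search touches only a few dozen keys instead of galloping up from 1.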

After checkpointing, gaps may exist among the files written before the last checkpointed number, but this does not affect the search, since it only scans forward from the checkpoint. For now, I’m happy with the approach, though I may move to checkpointing + partitioning in the future.


1. Thanks to folks @0xriggler and @JustinWaugh on X (formerly known as Twitter) for telling me about partitioning search.
2. The s3-log project is open source.
3. I use S3’s conditional write to “append” and add a new object with the next sequence number.
4. Having gaps in the log can be catastrophic. A writer may add a new object at the gap and return success to the client 💀
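Footnote 3’s conditional-write “append” can be sketched as follows. This uses an in-memory dict standing in for a bucket; with real S3, the existence check and the write happen atomically in a single conditional PUT (the If-None-Match: * header), so two writers can never both claim the same sequence number:

```python
class ConditionalWriteError(Exception):
    """Raised when the target key already exists (S3 would return 412)."""

def append(store, counter, body):
    """Write `body` as object `counter + 1`; return the new counter.

    `store` is a dict standing in for an S3 bucket. The membership test
    models the If-None-Match precondition: if another writer already
    created this key, the write fails instead of overwriting it.
    """
    key = str(counter + 1)
    if key in store:  # S3: 412 Precondition Failed on the conditional PUT
        raise ConditionalWriteError(key)
    store[key] = body
    return counter + 1
```

A writer that crashes and restarts with a stale counter will hit `ConditionalWriteError`, which is exactly the signal to run the recovery search before resuming.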