Last Updated: January 8, 2026
Imagine you're building a food delivery app like Swiggy or DoorDash. When a hungry customer opens the app, you need to show them nearby restaurants within a few kilometers.
Your database has millions of restaurants across the globe, each with a latitude and longitude. How do you quickly find the ones near the customer without scanning through every single record?
The naive approach would be to calculate the distance from the customer to each restaurant using the Haversine formula. But with millions of restaurants and thousands of concurrent users, this becomes computationally expensive and slow.
Geohash solves this problem elegantly by converting two-dimensional geographic coordinates into a one-dimensional string that can be indexed and searched efficiently.
Geohash is a hierarchical spatial indexing system that encodes geographic coordinates (latitude and longitude) into a short alphanumeric string.
Invented by Gustavo Niemeyer in 2008, it divides the Earth's surface into a grid of cells, where each cell is represented by a unique string. The longer the string, the smaller and more precise the cell.
For example:
9q8yy represents a large area in San Francisco9q8yyk represents a smaller area within that region9q8yykbv represents an even more precise locationThe key insight of Geohash is that locations that share a common prefix are geographically close to each other. This property makes it extremely efficient for proximity searches.
Here's how San Francisco looks when divided into Geohash cells:
This hierarchical structure is what makes Geohash so powerful. Adding more characters to the hash narrows down the area, similar to how adding more digits to a zip code narrows down a postal region.
The Geohash encoding algorithm is a clever combination of binary interleaving and Base32 encoding. Let's walk through how it works step by step.
The Earth's surface is mapped using two values:
Let's encode the coordinates of the Eiffel Tower in Paris: latitude 48.8584, longitude 2.2945.
The algorithm repeatedly divides the coordinate range in half and records which half contains our target value.
Start with the full longitude range: [-180, 180]
| Iteration | Range | Midpoint | Value (2.2945) | Bit |
|---|---|---|---|---|
| 1 | [-180, 180] | 0 | Right (> 0) | 1 |
| 2 | [0, 180] | 90 | Left (< 90) | 0 |
| 3 | [0, 90] | 45 | Left (< 45) | 0 |
| 4 | [0, 45] | 22.5 | Left (< 22.5) | 0 |
| 5 | [0, 22.5] | 11.25 | Left (< 11.25) | 0 |
After 5 iterations, longitude bits: 10000
Start with the full latitude range: [-90, 90]
| Iteration | Range | Midpoint | Value (48.8584) | Bit |
|---|---|---|---|---|
| 1 | [-90, 90] | 0 | Right (> 0) | 1 |
| 2 | [0, 90] | 45 | Right (> 45) | 1 |
| 3 | [45, 90] | 67.5 | Left (< 67.5) | 0 |
| 4 | [45, 67.5] | 56.25 | Left (< 56.25) | 0 |
| 5 | [45, 56.25] | 50.625 | Left (< 50.625) | 0 |
After 5 iterations, latitude bits: 11000
Now we interleave the longitude and latitude bits, alternating between them (longitude bit first):
The interleaved binary string: 1101000000
Finally, we convert the binary string to Base32. Geohash uses a custom Base32 alphabet:
(Note: letters 'a', 'i', 'l', 'o' are excluded to avoid confusion)
Group the bits into 5-bit chunks and convert:
So the first two characters of the Eiffel Tower's Geohash are: u0
The complete encoding process is shown in this diagram:
The full Geohash for the Eiffel Tower with 8 characters is: u09tunqu
One of the most useful properties of Geohash is that the string length determines the precision. More characters mean a smaller, more precise cell.
Here's how precision varies with Geohash length:
| Length | Cell Width | Cell Height | Use Case |
|---|---|---|---|
| 1 | 5,000 km | 5,000 km | Continental scale |
| 2 | 1,250 km | 625 km | Country/Region |
| 3 | 156 km | 156 km | Large city |
| 4 | 39.1 km | 19.5 km | City |
| 5 | 4.9 km | 4.9 km | Neighborhood |
| 6 | 1.2 km | 0.61 km | Street level |
| 7 | 153 m | 153 m | Building block |
| 8 | 38 m | 19 m | Building |
| 9 | 4.8 m | 4.8 m | Parking spot |
| 10 | 1.2 m | 0.6 m | Person |
| 11 | 14.9 cm | 14.9 cm | Very precise |
| 12 | 3.7 cm | 1.8 cm | Survey-grade |
This hierarchical precision is what makes Geohash so versatile:
The relationship between precision levels can be visualized as nested cells:
Key insight: To search a larger area, simply use a shorter prefix. To narrow down to a precise location, use more characters. This prefix-based approach is what makes database queries so efficient.
The main reason Geohash is used in production systems is its ability to enable fast proximity searches. Let's see how this works.
Locations that share a common prefix are geographically close to each other. This is the foundation of proximity search.
For example, all these Geohashes are in or near San Francisco:
9q8yy9 (Downtown SF)9q8yyk (Financial District)9q8yyu (North Beach)They all share the prefix 9q8yy, indicating they're in the same general area.
Instead of calculating distances to millions of points, you can use a simple prefix query:
This query uses a standard B-tree index on the geohash column, making it extremely fast, even with millions of records.
The query flow looks like this:
Here's where things get tricky. The prefix property doesn't always hold true at cell boundaries.
Consider two restaurants that are only 10 meters apart, but located on opposite sides of a Geohash cell boundary. They might have completely different prefixes:
A prefix query for 9q8yy% would find Restaurant A but miss Restaurant B, even though they're neighbors.
To solve this, we query not just the target cell, but also its 8 neighboring cells:
The query becomes:
This ensures we don't miss any nearby results due to boundary effects.
After retrieving candidates from the 9 cells, apply a final distance calculation to filter results within the exact radius:
This two-phase approach gives you the best of both worlds:
Subscribe
Geohash is used extensively in production systems. Here are some notable examples:
When you request a ride, the app needs to find available drivers near you. Uber uses a system based on Geohash (they call it H3, a similar hexagonal grid system) to:
MongoDB uses Geohash internally for its 2dsphere index. When you create a geospatial index:
MongoDB converts coordinates to Geohash values and stores them in a B-tree structure, enabling efficient queries like:
Redis provides built-in Geohash support with commands like:
Redis stores the Geohash as a sorted set score, enabling O(log N) proximity queries.
Elasticsearch uses Geohash for its geo_point field type. It supports:
Food delivery apps use Geohash to:
While Geohash is powerful, it has some important limitations you should be aware of.
The most significant limitation is that nearby locations can have very different Geohash values if they're on opposite sides of a cell boundary.
Even worse, this can happen at multiple levels. Two points that are 1 meter apart might differ in all but the first character if they happen to be on a major grid boundary.
Solution: Always query neighboring cells when doing proximity searches.
Geohash cells are not uniform in size across the globe. Because they're based on latitude/longitude divisions:
This is because longitude lines converge at the poles. A 1-degree change in longitude at the equator is about 111 km, but near the poles, it could be just a few kilometers.
Impact: If you're building a global application, be aware that the same Geohash precision represents different actual distances at different latitudes.
Geohash uses a Z-order curve (also called Morton code) to map 2D space to 1D. This creates some counterintuitive behavior:
Notice how cell 3 and cell 4 are adjacent in the numbering but not geographically adjacent. This is why prefix matching alone isn't sufficient.
Geohash doesn't handle the International Date Line (180 longitude) gracefully. Locations on either side have completely different prefixes.
Similarly, areas very close to the poles can have unusual behavior due to the convergence of longitude lines.
Solution: For applications that span these boundaries, use additional logic or alternative coordinate systems.
How does Geohash compare to other spatial indexing approaches?
| Method | Pros | Cons | Best For |
|---|---|---|---|
| Geohash | Simple, string-based, works with B-tree indexes | Boundary issues, non-uniform cells | General proximity search |
| S2 (Google) | Uniform cell sizes, handles poles/antimeridian | More complex implementation | Global-scale applications |
| H3 (Uber) | Hexagonal cells, better neighbor relationships | Newer, less database support | Ride-sharing, analytics |
| R-tree | Handles arbitrary shapes, range queries | More complex, specialized index | Polygon searches |
| Quadtree | Adaptive resolution, good for sparse data | Not string-based | Game engines, graphics |
For most applications, Geohash is the best starting point due to its simplicity and wide database support. Consider S2 or H3 for applications that need more precision at global scale.
Here's a complete Python implementation of Geohash encoding and decoding:
Output: