How we built a sub-50ms query system handling 20M+ merchants across 30K+ pincodes using Apache HBase and smart caching strategies. A deep dive into scaling sparse matrix operations.
Picture this: You're building a system where 20 million merchants need to define which of India's 30,000+ pincodes they can serve. That's potentially 600 billion data points in a sparse matrix where 99% of the values are empty.
Oh, and it needs to respond in under 50ms while handling 10,000+ requests per minute.
Welcome to the ONDC Pincode Serviceability Challenge - and how we built a system that earned us TOP 7 National Finalist status at Build for Bharat 2024.
ONDC (Open Network for Digital Commerce) separates serviceability definition from verification. Merchants define their serviceable pincodes, while buyer apps verify if a merchant can serve a specific location. Simple concept, massive scale problem.
The numbers that kept us awake: 20 million merchants, 30,000+ pincodes, roughly 600 billion potential merchant-pincode combinations, all under a 50ms budget at 10,000+ requests per minute.
Traditional databases crumble under this scale. We needed storage built for sparse, wide data - which led us to Apache HBase.
// HBase table design for optimal sparse storage
// Row Key: merchant_id
// Column Family: 'pincodes'
// Column Qualifier: pincode
// Value: serviceability_metadata
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

import java.io.IOException;
import java.util.List;

public class ServiceabilityTable {
    private static final String TABLE_NAME = "merchant_serviceability";
    private static final String CF_PINCODES = "pincodes";

    private final Table table; // HBase Table handle obtained from a Connection

    public ServiceabilityTable(Table table) {
        this.table = table;
    }

    public void addServiceability(String merchantId, List<String> pincodes) throws IOException {
        // One row per merchant; each serviceable pincode becomes a column qualifier
        Put put = new Put(Bytes.toBytes(merchantId));
        for (String pincode : pincodes) {
            put.addColumn(
                Bytes.toBytes(CF_PINCODES),
                Bytes.toBytes(pincode),
                Bytes.toBytes("ACTIVE")
            );
        }
        table.put(put);
    }
}
Raw HBase performance wasn't enough. We implemented a multi-tier caching strategy:
@Service
public class ServiceabilityService {

    // L1: in-process cache (Caffeine); size/TTL limits omitted here
    private final Cache<String, List<String>> localCache = Caffeine.newBuilder().build();

    // L2: Redis cluster shared across service instances
    @Autowired
    private RedisTemplate<String, List<String>> redisTemplate;

    public List<String> getMerchantsByPincode(String pincode) {
        // L1: Local cache (Caffeine) - 1ms lookup
        List<String> cached = localCache.getIfPresent(pincode);
        if (cached != null) return cached;

        // L2: Redis cluster - 5ms lookup
        cached = redisTemplate.opsForValue().get("merchants:" + pincode);
        if (cached != null) {
            localCache.put(pincode, cached);
            return cached;
        }

        // L3: HBase - 20ms lookup; warm both cache tiers on the way back
        List<String> merchants = fetchFromHBase(pincode);
        redisTemplate.opsForValue().set("merchants:" + pincode, merchants);
        localCache.put(pincode, merchants);
        return merchants;
    }
}
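The fetchFromHBase call above is where the data model matters most: the primary table is keyed by merchant_id, so answering "which merchants serve pincode X?" efficiently needs a reverse index keyed by pincode. Here is a minimal sketch of that lookup, assuming a hypothetical pincode_merchants table (accessed through a reverseIndexTable handle) where the row key is the pincode and each serving merchant is a column qualifier; names and layout are illustrative, not the exact schema we shipped:

// Hypothetical reverse-index lookup: row key = pincode, qualifiers = merchant ids.
// Mirrors the merchant_serviceability layout, inverted for buyer-side queries.
private List<String> fetchFromHBase(String pincode) {
    List<String> merchants = new ArrayList<>();
    try {
        Get get = new Get(Bytes.toBytes(pincode));
        get.addFamily(Bytes.toBytes("merchants"));
        Result result = reverseIndexTable.get(get); // Table handle for "pincode_merchants"
        NavigableMap<byte[], byte[]> cells = result.getFamilyMap(Bytes.toBytes("merchants"));
        if (cells != null) {
            for (byte[] qualifier : cells.keySet()) {
                merchants.add(Bytes.toString(qualifier));
            }
        }
    } catch (IOException e) {
        throw new IllegalStateException("HBase lookup failed for pincode " + pincode, e);
    }
    return merchants;
}

In a layout like this, every serviceability write would touch both tables - the price you pay for fast pincode-first reads.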
This approach reduced database queries by 70% and achieved our sub-50ms target.
Merchants don't add pincodes one by one - they upload CSVs with thousands of entries. We optimized for this:
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class BulkUploadProcessor {
    private static final int BATCH_SIZE = 1000;

    private final Table table; // handle to the merchant_serviceability table

    public BulkUploadProcessor(Table table) {
        this.table = table;
    }

    public void processBulkUpload(String merchantId, List<String> pincodes) throws IOException {
        List<Put> puts = new ArrayList<>();
        for (String pincode : pincodes) {
            Put put = createPut(merchantId, pincode);
            puts.add(put);
            if (puts.size() >= BATCH_SIZE) {
                table.put(puts); // flush a full batch in one round trip
                puts.clear();
            }
        }
        // Flush the final partial batch
        if (!puts.isEmpty()) {
            table.put(puts);
        }
    }
}
Result: 40-50 rows/second processing speed for bulk uploads.
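The explicit List<Put> batching above is what we ran with; the HBase client also ships BufferedMutator, which buffers mutations and flushes them for you. A sketch of the same upload written against it (the table name matches our schema above, everything else is illustrative):

// Alternative batching via BufferedMutator: the client buffers Puts and
// flushes them once its write buffer fills, plus an explicit flush at the end.
public void processBulkUpload(Connection connection, String merchantId,
                              List<String> pincodes) throws IOException {
    TableName tableName = TableName.valueOf("merchant_serviceability");
    try (BufferedMutator mutator = connection.getBufferedMutator(tableName)) {
        for (String pincode : pincodes) {
            Put put = new Put(Bytes.toBytes(merchantId));
            put.addColumn(Bytes.toBytes("pincodes"), Bytes.toBytes(pincode),
                          Bytes.toBytes("ACTIVE"));
            mutator.mutate(put);
        }
        mutator.flush(); // push any remaining buffered mutations before returning
    }
}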
HBase performance depends heavily on row key design. Our strategy:
// Bad: Sequential keys cause hotspotting
String badRowKey = merchantId;
// Good: Hash prefix prevents hotspots
String goodRowKey = DigestUtils.md5Hex(merchantId).substring(0, 4) + "_" + merchantId;
This distributed load evenly across HBase regions, preventing bottlenecks.
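One consequence of salting worth spelling out: because the prefix is derived from the merchant_id itself, point reads simply recompute it rather than scanning every prefix. A small sketch of the corresponding read path, assuming the same Table handle as above (helper names are ours, not from the original code):

// The salt is a pure function of merchant_id, so reads recompute it
// instead of fanning out across all possible prefixes.
private String toRowKey(String merchantId) {
    return DigestUtils.md5Hex(merchantId).substring(0, 4) + "_" + merchantId;
}

public Result getMerchantRow(String merchantId) throws IOException {
    Get get = new Get(Bytes.toBytes(toRowKey(merchantId)));
    return table.get(get);
}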
For multi-pincode queries, we process them concurrently:
public Map<String, List<String>> getMerchantsForPincodes(List<String> pincodes) {
    Map<String, List<String>> results = new ConcurrentHashMap<>();
    pincodes.parallelStream().forEach(pincode -> {
        List<String> merchants = getMerchantsByPincode(pincode);
        results.put(pincode, merchants);
    });
    return results;
}
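parallelStream() runs on the JVM's common ForkJoinPool, which these I/O-bound cache and HBase lookups then share with everything else. A variant we could have used instead, with a dedicated executor so the fan-out is bounded explicitly (the pool size here is purely illustrative):

// Variant with an explicit executor: lookup concurrency is controlled by this pool
// rather than by the shared ForkJoinPool.
private final ExecutorService lookupPool = Executors.newFixedThreadPool(32);

public Map<String, List<String>> getMerchantsForPincodesAsync(List<String> pincodes) {
    Map<String, CompletableFuture<List<String>>> futures = new HashMap<>();
    for (String pincode : pincodes) {
        futures.put(pincode,
            CompletableFuture.supplyAsync(() -> getMerchantsByPincode(pincode), lookupPool));
    }
    Map<String, List<String>> results = new HashMap<>();
    futures.forEach((pincode, future) -> results.put(pincode, future.join()));
    return results;
}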
After these optimizations, the system hit its targets: sub-50ms lookups, a roughly 70% reduction in database queries, and bulk ingestion at 40-50 rows per second.
We built separate services for different concerns:
@RestController
@RequestMapping("/seller")
public class SellerController {

    @Autowired
    private BulkUploadService bulkUploadService;

    @PostMapping("/upload/csv")
    public ResponseEntity<UploadResponse> uploadCsv(
            @RequestParam("file") MultipartFile file) {
        String processingId = UUID.randomUUID().toString();
        // Hand the CSV off for async processing; return the tracking id immediately
        CompletableFuture.runAsync(() ->
            bulkUploadService.processCsv(processingId, file)
        );
        return ResponseEntity.ok(new UploadResponse(processingId));
    }
}
@RestController
@RequestMapping("/buyer")
public class BuyerController {

    @Autowired
    private ServiceabilityService serviceabilityService;

    @GetMapping("/merchants")
    public ResponseEntity<Map<String, List<String>>> getMerchants(
            @RequestParam String pincodes,
            @RequestParam(defaultValue = "pincodes") String mode) {
        List<String> pincodeList = Arrays.asList(pincodes.split(","));
        // For each requested pincode, return the merchants that can serve it
        Map<String, List<String>> merchants = serviceabilityService
                .getMerchantsForPincodes(pincodeList);
        return ResponseEntity.ok(merchants);
    }
}
Traditional RDBMS couldn't handle our sparse matrix efficiently. HBase's column-family storage was perfect for our use case.
Our multi-tier caching reduced database load by 70%, but we only cached frequently accessed data to avoid memory bloat.
Understanding that 99% of merchant-pincode combinations would be empty guided our entire architecture.
Real users don't make single requests - they upload thousands of records at once. Design for this reality.
We tracked P95 response times, not just averages. That 95th percentile tells you about real user experience.
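A small illustration of what that measurement can look like in code. This isn't our exact monitoring stack; it assumes Micrometer, which supports publishing percentiles from the client side:

// Hypothetical Micrometer wiring: publish p95/p99 for each lookup, not just the mean.
private final MeterRegistry registry = new SimpleMeterRegistry();
private final Timer lookupTimer = Timer.builder("serviceability.lookup")
        .publishPercentiles(0.95, 0.99)
        .register(registry);

public List<String> timedLookup(String pincode) {
    // record() returns the wrapped call's result while tracking its latency
    return lookupTimer.record(() -> getMerchantsByPincode(pincode));
}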
While our hackathon solution proved the concept, a production deployment would need further hardening beyond what we could build in the time available.
This project demonstrated that with the right architecture, you can build systems that handle massive scale while maintaining blazing-fast performance. The ONDC network can now efficiently verify serviceability for millions of merchants across thousands of pincodes in real-time.