Building QR Menus at Scale for Restaurants
The engineering challenges behind processing thousands of menu photos daily with near-perfect accuracy.
When a restaurant uploads a photo of their printed menu, we have roughly three seconds to extract every dish name, description, price, and category — then structure it into a queryable digital format. At scale, this means processing thousands of menus a day, in dozens of languages, from images taken in poor lighting with consumer smartphones.
This is a genuinely hard problem. And it's the core of Restaurant Management System.
The Pipeline
Our extraction pipeline runs in three stages. First, a layout detection model identifies the menu's structure — columns, sections, headers. Then a specialized OCR model optimized for restaurant typography extracts text. Finally, a language model parses the raw text into structured JSON, resolving ambiguities and normalizing currencies.
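To make the final stage concrete, here is a minimal sketch of what the parsing step aims to produce. The real pipeline uses a language model; this regex-based stand-in (all names hypothetical) only illustrates the target output shape: section headers, item names, and prices normalized to integer cents.

```python
import re

# Matches "Dish name .... $12.50" style lines; "," or "." decimal separators.
PRICE_RE = re.compile(r"(.+?)\s*\.*\s*[$€£]\s*(\d+(?:[.,]\d{2})?)\s*$")

def parse_menu_lines(lines, section="Uncategorized"):
    """Turn raw OCR lines into structured menu items (illustrative only)."""
    items = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        m = PRICE_RE.match(line)
        if m:
            name = m.group(1).strip(" .")
            # Normalize both "12,50" and "12.50" to integer cents.
            price_cents = int(round(float(m.group(2).replace(",", ".")) * 100))
            items.append({"section": section, "name": name, "price_cents": price_cents})
        else:
            # A line with no price is treated as a section header.
            section = line
    return items
```

A line like `Bruschetta ... $6.50` under a `Starters` header comes out as `{"section": "Starters", "name": "Bruschetta", "price_cents": 650}` — the queryable format the rest of the system consumes.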
The whole pipeline runs in a serverless environment, scaling from zero to 1,000 concurrent jobs in under 30 seconds.
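The platform handles the actual scaling, but the same back-pressure shape can be sketched client-side with a semaphore capping in-flight jobs. This is an illustrative sketch, not our production dispatcher; the names and the limit are assumptions.

```python
import asyncio

MAX_CONCURRENT = 1000  # hypothetical cap matching the platform's ceiling

async def run_jobs(jobs, worker, limit=MAX_CONCURRENT):
    """Run worker(job) for every job, never more than `limit` at once."""
    sem = asyncio.Semaphore(limit)

    async def guarded(job):
        async with sem:
            return await worker(job)

    # gather preserves input order in its results.
    return await asyncio.gather(*(guarded(j) for j in jobs))
```

The semaphore means a burst of uploads queues gracefully instead of overwhelming downstream stages.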
Handling Edge Cases
Real-world menus are beautifully chaotic. Hand-written chalkboards. Triple-folded laminated cards. Screenshots of PDFs. We've seen it all. Our test suite now includes 8,400 edge-case menu images, each tagged with the failure mode it was designed to catch.
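A suite like that is most useful when regressions can be reported per failure mode. Here is a hedged sketch of how tagged fixtures might be organized and scored; the type and function names are hypothetical, not our actual test harness.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class EdgeCase:
    image_path: str
    failure_mode: str  # e.g. "handwritten", "glare", "triple-fold"

def failures_by_mode(cases, run_extraction):
    """Run extraction over every tagged fixture; count failures per mode."""
    failed = Counter()
    for case in cases:
        if not run_extraction(case.image_path):
            failed[case.failure_mode] += 1
    return failed
```

Grouping failures by tag makes it obvious whether a model update regressed, say, handwritten chalkboards specifically rather than accuracy overall.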
Lessons Learned
The biggest lesson: never underestimate the diversity of human creativity in menu design. Our most impactful accuracy improvements have come not from model architecture changes but from better training data — specifically, from the thousands of restaurants who've provided feedback when extraction went wrong.