Project

Turning Browse Sessions into Routing Advantage

Predicting orders before customers checked out.

Context

Getir processed roughly 500,000 orders per day across nine countries. The routing and assignment system was reactive by design: an order came in, the algorithm assigned it to a courier. At that scale, even small improvements in how orders were sequenced and batched translated to meaningful cost and speed gains. The constraint that made this interesting: couriers were employees (not gig workers), so supply was fixed to whoever was scheduled. You couldn't summon more capacity during a spike.

The Problem

Batching and routing are inherently backward-looking. An order arrives, and you slot it into whatever courier availability exists right now. By the time you're optimizing, you've already lost your best window.

Every courier assignment is a commitment. Depending on the route, that's 15 to 50 minutes of locked capacity. During off-peak, the cost of a suboptimal assignment is low: another courier is probably free. During rush hours, it's a different story. Every courier is spoken for, and each dispatch decision removes that courier from the system for the duration of the route. Assign a courier to a solo trip at 12:01, and when three nearby orders drop at 12:03, you've already burned your best option for 20+ minutes.

The real cost wasn't in finding good routes for known orders. It was in making irreversible commitments without knowing what demand looked like two minutes from now.

Approach

The average user spent 5-10 minutes between opening the app and checking out. That's a window of forward visibility sitting in session data, unused.

The question wasn't "can we predict orders?" but "can we predict them accurately enough that acting on predictions beats ignoring them?" A false positive means the solver plans around an order that never materializes, potentially misallocating a courier. A false negative just means you're back to the status quo. The cost of errors was asymmetric, so the model needed to be conservative: only feed high-confidence predictions into the solver.
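One way to make "accurately enough" concrete is a break-even confidence: the conversion probability above which acting on a prediction has positive expected value. The sketch below is illustrative only. The function names and the gain and cost figures (in courier-minutes) are hypothetical, not numbers from the project.

```python
def break_even_confidence(gain_if_right: float, cost_if_wrong: float) -> float:
    """Minimum probability at which acting on a prediction pays off.

    Acting is worthwhile when p * gain - (1 - p) * cost > 0,
    which rearranges to p > cost / (gain + cost).
    Both inputs are hypothetical estimates in the same unit,
    for example courier-minutes.
    """
    if gain_if_right <= 0 or cost_if_wrong < 0:
        raise ValueError("gain must be positive and cost non-negative")
    return cost_if_wrong / (gain_if_right + cost_if_wrong)


def should_act(p_convert: float, gain_if_right: float, cost_if_wrong: float) -> bool:
    """True when the expected value of planning around the prediction is positive."""
    return p_convert * gain_if_right - (1.0 - p_convert) * cost_if_wrong > 0


# Hypothetical numbers: a correct prediction saves 5 courier-minutes,
# a phantom order wastes 15. The bar is then 15 / (5 + 15) = 0.75.
threshold = break_even_confidence(gain_if_right=5.0, cost_if_wrong=15.0)
```

The more a false positive costs relative to the gain from a correct call, the closer the bar moves to 1.0, which is the formal version of "be conservative."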

We built an ML model on session behavior (items added, browse patterns, time in app) that predicted which active baskets would convert to orders. High-confidence predictions were assigned expected checkout times and passed to the routing solver as anticipated future demand. Not confirmed orders, but likely enough to plan around.
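The hand-off from model to solver can be sketched roughly as follows. This is a minimal illustration, not the production pipeline: the class names, field names, and the 0.9 cutoff are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class SessionScore:
    """Hypothetical model output for one active basket."""
    session_id: str
    p_convert: float            # predicted probability the basket becomes an order
    minutes_to_checkout: float  # estimated time until checkout


@dataclass(frozen=True)
class AnticipatedOrder:
    """Hypothetical record passed to the routing solver as probable demand."""
    session_id: str
    expected_checkout_min: float  # minutes on the same clock as now_min
    confidence: float


def to_anticipated_demand(
    scores: Iterable[SessionScore],
    now_min: float,
    threshold: float = 0.9,  # assumed cutoff, not the project's actual value
) -> List[AnticipatedOrder]:
    """Keep only high-confidence sessions and order them by expected checkout time."""
    kept = [
        AnticipatedOrder(s.session_id, now_min + s.minutes_to_checkout, s.p_convert)
        for s in scores
        if s.p_convert >= threshold
    ]
    return sorted(kept, key=lambda o: o.expected_checkout_min)
```

Low-confidence sessions are simply dropped, so the solver sees either a confirmed order, a probable one, or nothing at all.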

This changed the optimization frame. Instead of "given these orders, find routes," the solver could ask "given current orders plus probable orders arriving in the next few minutes, what's the better assignment?" During peak hours especially, that forward visibility meant the algorithm could hold a courier for a higher-throughput batch rather than dispatching them on a solo trip that locked them for 20+ minutes.
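A toy version of the hold-or-dispatch trade-off, to show the shape of the decision. The real solver was a routing optimizer; this sketch reduces it to a single comparison of expected orders per courier-minute, and every name and number in it is hypothetical. It also ignores the delivery-time promise to the customer who is already waiting, which a real system would have to enforce.

```python
from typing import Sequence


def throughput_dispatch_now(confirmed_orders: int, solo_route_min: float) -> float:
    """Orders delivered per courier-minute if the courier leaves immediately."""
    return confirmed_orders / solo_route_min


def expected_throughput_hold(
    confirmed_orders: int,
    anticipated_probs: Sequence[float],
    wait_min: float,
    batch_route_min: float,
) -> float:
    """Expected orders per courier-minute if the courier waits for a batch.

    Each anticipated order counts in proportion to its conversion probability,
    and the waiting time is charged against the courier like route time.
    """
    expected_orders = confirmed_orders + sum(anticipated_probs)
    return expected_orders / (wait_min + batch_route_min)


def should_hold(
    confirmed_orders: int,
    anticipated_probs: Sequence[float],
    wait_min: float,
    solo_route_min: float,
    batch_route_min: float,
) -> bool:
    """Hold only when the expected batch throughput beats dispatching now."""
    hold = expected_throughput_hold(
        confirmed_orders, anticipated_probs, wait_min, batch_route_min
    )
    return hold > throughput_dispatch_now(confirmed_orders, solo_route_min)
```

With one confirmed order on a 20-minute solo route, holding 2 minutes for three nearby baskets at 0.9 confidence each (30-minute batch route) gives about 3.7 expected orders over 32 minutes, well above 1 order over 20. A single basket at 0.3 confidence does not clear the bar.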

Results

The model exceeded 85% on both precision and recall for high-confidence predictions. That threshold mattered because it meant the solver was planning around futures that mostly materialized. Solo trip frequency dropped by 8 percentage points, meaning more orders per courier-hour and fewer wasted dispatch commitments during the windows where capacity was most scarce.
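For reference, this is one plausible way to compute those two metrics from session IDs: precision asks how many predicted orders actually arrived, recall asks how many actual orders were predicted. The function and the example sets are illustrative, not the project's evaluation code or data.

```python
from typing import Set, Tuple


def precision_recall(predicted: Set[str], converted: Set[str]) -> Tuple[float, float]:
    """Precision and recall over session IDs.

    predicted: sessions the model flagged as high-confidence conversions
    converted: sessions that actually checked out
    Returns 0.0 for a metric whose denominator is empty.
    """
    true_positives = len(predicted & converted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(converted) if converted else 0.0
    return precision, recall


# Made-up example: four flagged sessions, three of which converted,
# and one conversion the model missed.
p, r = precision_recall({"a", "b", "c", "d"}, {"a", "b", "c", "e"})
```

Precision is the number the solver cares about most, since every false positive is a phantom order it planned around.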

What Transferred

Most operational systems are purely reactive. They optimize against current state and treat the near future as unknowable. But in any domain where users signal intent before committing (browsing before buying, searching before booking, configuring before ordering), there's a prediction window hiding in plain sight. You don't need a perfect model. You need one accurate enough that planning for likely futures beats ignoring them. The higher the commitment cost per decision (locked capacity, long fulfillment cycles, constrained supply), the more that forward visibility is worth.