AI Shelf Recognition in Emerging Markets: Can Computer Vision Really Work in Traditional Trade?

Computer vision · Mar 27, 2026 · 7 min read · Shelfforce Team

Quick answer: Yes. Modern AI shelf recognition works in traditional trade environments across Southeast Asia, PNG, and other emerging markets, with well-trained models achieving 80%+ recall on messy, unstructured shelves. The technology has improved dramatically in the past two years, but accuracy depends heavily on whether the AI was trained on emerging market imagery rather than retrofitted from clean modern trade datasets. Brands deploying AI shelf recognition in these markets need platforms built specifically for fragmented retail conditions.

Can AI-powered shelf recognition work on the messy, unstructured shelves of traditional trade in Southeast Asia?

For most of the past decade the honest answer was "not really." Early computer vision models for retail were trained almost entirely on clean modern trade shelves at retailers like Carrefour, Tesco, Walmart, and Coles. The training data assumed organised planograms, consistent lighting, products facing forward, and a single category per fixture. Those assumptions break completely the moment you walk into a warung, a sari-sari store, or a Port Moresby trade store.

That has changed. Two technical shifts drove the improvement:

Better model architectures. Modern object detection models (particularly transformer-based and hybrid CNN-transformer approaches) handle visual messiness much better than the convolutional networks that dominated earlier generations. They can identify products in cluttered scenes, under partial occlusion, and at unusual angles in ways that were impossible five years ago.
Training data from actual emerging markets. The platforms that work in traditional trade today were trained on real photos from real warungs, sari-sari stores, and trade stores, not just modern trade imagery with a "messy" filter applied. Training data quality is the single biggest determinant of how well a model performs in the field.

The practical test is whether a model can handle a photo with hanging sachet strips, stacked boxes on the floor, products partially blocked by signage, mixed categories on the same shelf, and inconsistent lighting from a doorway or a single overhead bulb. Models trained for this perform well. Models retrofitted from modern trade origins do not.

How accurate is image recognition for retail auditing in markets like Indonesia and the Philippines?

Accuracy in image recognition is usually expressed in two metrics: recall (what percentage of products that are actually in the photo did the model correctly identify) and precision (what percentage of the model's identifications were correct). Both matter, and they trade off against each other.
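To make the two definitions concrete, here is a minimal Python sketch that computes both metrics from detection counts. The function name and the example counts are illustrative assumptions, not figures from any real audit.

```python
def precision_recall(true_positives: int, false_positives: int,
                     false_negatives: int) -> tuple[float, float]:
    """Return (precision, recall) for one photo or a batch of photos.

    true_positives:  products in the photo that the model identified correctly
    false_positives: identifications the model made that were wrong
    false_negatives: products in the photo that the model did not correctly identify
    """
    detected = true_positives + false_positives   # everything the model reported
    present = true_positives + false_negatives    # everything actually on the shelf
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / present if present else 0.0
    return precision, recall


# Hypothetical photo: 50 products on the shelf, 44 identifications, 40 of them correct.
precision, recall = precision_recall(true_positives=40, false_positives=4,
                                     false_negatives=10)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up example the model reaches 80% recall (40 of 50 products found) and about 91% precision (40 of 44 identifications correct). Tuning a model to report more detections typically raises recall and lowers precision, which is the trade-off described above.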

For traditional trade environments, current state-of-the-art performance looks like:

Metric	Older Models (pre-2024)	Modern AI-Native Models
Recall (traditional trade)	50–65%	80%+
Precision (traditional trade)	70–80%	90%+
Recall (modern trade)	80–90%	95%+
New SKU onboarding	Hundreds of reference images	5–20 reference images

The 80%+ recall figure on traditional trade is the inflection point that makes AI shelf recognition genuinely useful in markets like Indonesia and the Philippines. Below about 70%, the data quality is too inconsistent to drive decisions, and managers spend more time questioning the numbers than acting on them. Above 80%, the data becomes reliable enough that the workflow shifts from "verify the AI" to "act on the insight."

It's worth being honest about what 80%+ recall means in practice: as many as one in five products in a busy traditional trade photo will still be missed or misidentified. For high-velocity SKUs that's usually fine, since they show up in enough photos that the aggregate picture is accurate. For long-tail SKUs that appear infrequently, you still need a human in the loop occasionally.
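The point about high-velocity SKUs can be illustrated with a rough calculation. The sketch below assumes each photo is an independent chance to detect the SKU at the stated recall. That is a simplifying assumption for illustration only: real misses are often correlated (the same awkward pack or the same dark corner), so treat the numbers as an upper bound rather than a guarantee.

```python
def chance_seen_at_least_once(recall: float, photos: int) -> float:
    """Probability that a SKU present in every photo is detected in at least one.

    Assumes misses are independent across photos (an illustrative simplification).
    """
    if not 0.0 <= recall <= 1.0:
        raise ValueError("recall must be between 0 and 1")
    return 1.0 - (1.0 - recall) ** photos


# How quickly coverage improves as a SKU appears in more photos at 80% recall.
for photos in (1, 2, 3, 5):
    print(photos, round(chance_seen_at_least_once(0.80, photos), 4))
```

Under that assumption, a SKU that appears in three photos is picked up at least once about 99% of the time, while a long-tail SKU that appears in a single photo stays at 80%. That gap is why occasional human review still matters for infrequent products.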

Is computer vision for retail shelf analysis practical in developing countries?

Practicality depends on three factors beyond raw model accuracy:

Cost per image processed. AI shelf recognition is only useful at scale, which means the per-image cost has to be low enough to run on every visit. Modern cloud-based vision APIs and purpose-built retail models have brought this down dramatically. What used to cost dollars per image now costs cents.
Speed of analysis. A rep can't wait two minutes for a photo to process. Modern systems return structured output in 5–30 seconds, which fits naturally into a normal store visit.
Tolerance for poor input conditions. Bad lighting, blurry photos, partial shelves, and entry-level smartphones are the reality in most developing markets. The system has to handle imperfect inputs gracefully rather than rejecting them.
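The cost factor is easiest to see as arithmetic. The sketch below uses entirely hypothetical numbers (store count, visit frequency, photos per visit, and both per-image prices are assumptions chosen for illustration) to show why the move from dollars to cents per image decides whether every visit can be covered.

```python
def monthly_image_cost(stores: int, visits_per_store: int,
                       photos_per_visit: int, cost_per_image: float) -> float:
    """Total monthly processing cost if every photo from every visit is analysed."""
    return stores * visits_per_store * photos_per_visit * cost_per_image


# Hypothetical field programme: 5,000 stores, 4 visits a month, 3 photos per visit.
for label, price in [("dollars per image", 2.00), ("cents per image", 0.02)]:
    total = monthly_image_cost(5_000, 4, 3, price)
    print(f"{label}: {total:,.0f} per month")
```

With these assumed inputs the programme generates 60,000 images a month. At 2.00 per image that is 120,000 a month, and at 0.02 per image it is 1,200, in whatever currency the per-image price is quoted.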

When all three are in place, computer vision is not just practical in developing countries. It's actually more valuable there than in developed markets. The reason is that traditional trade has historically had no good data source at all. There's no scan data, no central buying authority, no EDI feed. AI shelf recognition is the first technology that makes census-level visibility achievable in these channels. The before-and-after gap is much bigger than it is in modern trade, where alternative data sources already existed.

What AI tools can process shelf photos from small format stores in emerging markets?

The platforms capable of handling small format stores in emerging markets fall into two broad categories:

General-purpose vision APIs (Google Cloud Vision, Amazon Rekognition, Azure Computer Vision) can detect generic objects and read text reliably, but they don't know your products. To use them for retail shelf analysis, you need to layer your own SKU catalogue and matching logic on top, which is significant engineering work. Even then, accuracy on traditional trade tends to be limited without category-specific training.
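As a rough idea of what that matching layer involves, the Python sketch below takes text that a general-purpose API has already read off a pack and fuzzy-matches it against a SKU catalogue using the standard library. The catalogue entries, the SKU IDs, the `match_sku` helper, and the 0.6 threshold are all hypothetical. A production system would also need to handle logos, pack variants, local-language text, and products with no legible text at all.

```python
from difflib import SequenceMatcher

# Hypothetical catalogue: SKU ID -> normalised product name.
CATALOGUE = {
    "SKU-001": "sunrise instant noodles chicken 75g",
    "SKU-002": "sunrise instant noodles beef 75g",
    "SKU-003": "coco fresh shampoo sachet 10ml",
}


def match_sku(ocr_text: str, catalogue: dict[str, str], min_score: float = 0.6):
    """Return (sku_id, score) for the closest catalogue name, or None.

    ocr_text is assumed to be text already extracted by a vision API.
    min_score is an arbitrary illustrative threshold, not a tuned value.
    """
    cleaned = " ".join(ocr_text.lower().split())  # lowercase, collapse whitespace
    best_id, best_score = None, 0.0
    for sku_id, name in catalogue.items():
        score = SequenceMatcher(None, cleaned, name).ratio()
        if score > best_score:
            best_id, best_score = sku_id, score
    if best_id is None or best_score < min_score:
        return None  # nothing in the catalogue is close enough
    return best_id, best_score


# OCR output with a typical misread ("Chickn") still resolves to the right SKU.
print(match_sku("Sunrise  Instant Noodles Chickn 75g", CATALOGUE))
```

Even this toy version shows where the engineering effort goes: the vision API supplies raw text, and everything that turns it into a SKU (normalisation, scoring, thresholds, handling of near-identical variants such as chicken versus beef) has to be built and maintained by the team using it.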

Purpose-built retail vision platforms include the legacy enterprise players (Trax, FORM, Repsly, StayinFront) and AI-native challengers (Shelfforce AI among others). These come with retail-specific models trained on shelf imagery, SKU catalogue management, and the workflows needed to feed photos in and get structured insights out.

The question for an FMCG brand is rarely "general API or purpose-built platform." It's almost always purpose-built. The build cost and ongoing maintenance of a custom solution are hard to justify when proven platforms exist. The real choice is between legacy enterprise platforms designed around modern trade and AI-native platforms designed for traditional trade.

The practical evaluation criteria:

Training data origin. Was the model trained on photos from the markets you operate in?
Traditional trade benchmarks. Can the vendor show recall numbers on real warung, sari-sari, or trade store photos?
Onboarding speed for new SKUs. How many reference images per product, and how long does it take?
Offline and low-bandwidth performance. Does the field app work in real connectivity conditions?
Per-visit cost economics. Does the pricing model work at the volume you need to run on every visit?

How are FMCG brands using AI to replace manual retail audits in Southeast Asia?

The shift from manual audits to AI-powered shelf analysis has happened faster in Southeast Asia than in any other region. Three reasons:

The pain was bigger to start with. Manual audits in fragmented Southeast Asian markets were always more expensive, slower, and less reliable than in modern trade markets. The improvement from switching to AI is correspondingly larger.
The technology landed at the right moment. The first wave of multinational FMCG digital transformation programs in Southeast Asia coincided with AI shelf recognition becoming practical for traditional trade. Brands that had been frustrated for years suddenly had a workable solution.
Local players are pushing the technology forward. Platforms built in or for Southeast Asia, handling the specific shelf formats, languages, and store types of the region, have led the technical work on traditional trade computer vision.

The pattern that's emerging: brands run a paid trial in one country or one channel for 60–90 days, prove the value with real shelf data from their own stores, then expand to additional markets. The trial-to-production path is much faster than it was with legacy enterprise platforms because the implementation is lighter and the results show up in days rather than months.

Manual audits haven't disappeared entirely. They still play a role for unmanaged channels and competitive intelligence in stores brands don't directly visit. But for the core question of "what's happening on shelf in the stores my reps visit," AI has become the default answer across most of the region.

Frequently Asked Questions

How does AI shelf recognition handle products without barcodes? Computer vision doesn't rely on barcodes. It identifies products by their visual appearance (packaging design, shape, colour, logo). This is actually a structural advantage in traditional trade where barcode scanning is impractical or impossible.

What happens when a new SKU launches? Modern AI shelf recognition systems can onboard a new SKU from 5–20 reference photos, which can usually be sourced from existing brand assets. The new product is recognisable in the field within hours of being added to the catalogue.

Does AI shelf recognition work on photos taken with cheap smartphones? Yes, with caveats. Modern models handle lower-resolution and lower-quality images well, but extreme blur, very poor lighting, or extreme angles still cause problems. The platforms designed for emerging markets are tested specifically against the kind of phones field reps actually use, not just flagship devices.

How is Shelfforce AI different from other AI shelf recognition platforms? Shelfforce AI is built AI-native for the realities of fragmented trade across Australia, PNG, and Southeast Asia. It converts shelf photos into structured compliance, distribution, and pricing data across every channel, giving FMCG brands a single consolidated view of their route to market. Its image recognition is trained specifically for the messy, unstructured shelves of traditional trade rather than retrofitted from clean modern trade datasets.