Real-Time Fraud Scoring Latency: What 47ms Actually Means

By Amir Shachar · March 31, 2026 · 6 min read

Fraud vendors love to say they are fast. The problem is that “fast” usually means one cherry-picked number with no context.

If you are evaluating real-time fraud scoring for checkout, instant payments, payout approval, or account takeover flows, the only latency number that matters is the one your customer actually feels when the system is under load.

That is why “47ms” is useful only if you understand what sits behind it, and why averages by themselves are usually the wrong way to compare vendors.

Average latency is the easiest number to game

A product demo can produce a beautiful average. Production traffic usually does not.

A vendor can still claim “sub-100ms average” while giving you a P99 that spikes into the high hundreds of milliseconds. That may be fine for a batch workflow. It is not fine for a hot path.
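To see how a healthy average can coexist with a broken tail, here is a minimal sketch with synthetic numbers: 98% of requests respond around 45ms, and 2% hit a slow dependency. All figures are illustrative, not measured from any real system.

```python
import math
import random
import statistics

random.seed(7)

# Synthetic latency sample: most requests are fast, a small fraction
# hit a slow dependency and land in the high hundreds of milliseconds.
latencies_ms = [random.gauss(45, 8) for _ in range(980)] + \
               [random.uniform(600, 900) for _ in range(20)]
latencies_ms.sort()

def percentile(sorted_values, p):
    """Nearest-rank percentile: smallest value >= p% of the sample."""
    k = max(1, math.ceil(p / 100 * len(sorted_values)))
    return sorted_values[k - 1]

mean = statistics.mean(latencies_ms)
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)

# The mean stays comfortably "sub-100ms" while the P99 lands in the
# high hundreds -- exactly the gap a single average hides.
print(f"mean: {mean:.0f}ms  p50: {p50:.0f}ms  p99: {p99:.0f}ms")
```

The design point: the average is dominated by the 98% of fast requests, so it barely moves no matter how bad the slow 2% get. The P99 is the number that tracks what the unluckiest real users experience.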

Why tail latency is the real product metric

In fraud, the decision rarely happens alone. It sits inside a longer chain: device signals, enrichment calls, policy logic, auth windows, orchestration, and the user waiting on the other side.

Once the tail gets bad, every other service has less room to breathe. Review queues start to absorb more borderline cases, fallback rules become more aggressive, and the user experiences friction that has nothing to do with model quality.

Scoring target: 47ms median
Card network + orchestration budget: ~200-300ms
Enrichment provider timeout: 120ms
Risk engine tail spike: 800ms P99

Result: the average still looks fine. The user experience does not.
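The arithmetic is worth making explicit. A minimal budget check, using the illustrative figures above (none of these are measured from a real system): the median path fits comfortably inside the end-to-end budget, while the P99 path alone blows it.

```python
# Hypothetical end-to-end budget check. All figures are illustrative.
E2E_BUDGET_MS = 300          # card network + orchestration (upper bound)
ENRICHMENT_TIMEOUT_MS = 120  # enrichment provider timeout
SCORING_P50_MS = 47          # risk engine median
SCORING_P99_MS = 800         # risk engine tail spike

# Sequential path: enrichment, then scoring.
typical_path = ENRICHMENT_TIMEOUT_MS + SCORING_P50_MS  # 167ms: fits
worst_path = ENRICHMENT_TIMEOUT_MS + SCORING_P99_MS    # 920ms: over budget

for label, total in [("typical", typical_path), ("p99 tail", worst_path)]:
    status = "ok" if total <= E2E_BUDGET_MS else "over"
    print(f"{label}: {total}ms of {E2E_BUDGET_MS}ms budget -> {status}")
```

In other words, a 47ms median leaves ~130ms of headroom even after a worst-case enrichment call, but an 800ms tail spike consumes the entire budget several times over on its own.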

Fast enough depends on the flow

Not every workflow needs the same latency budget.

The right question is not “is this system fast?” It is “is this system fast enough for our exact flow when traffic is real, inputs are messy, and dependencies are imperfect?”

What to ask a vendor besides one number

If you want an honest answer, ask for these things together: the full latency distribution (median, P95, P99), not one average; the traffic conditions it was measured under, including peak load; what the system returns when an enrichment dependency times out; and how the numbers change when inputs are messy rather than clean demo data.

If the answer stays vague, the number is usually marketing, not engineering.
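The dependency-timeout question has a concrete shape in code. Below is a sketch of one common mitigation: enforce a hard latency budget on the scoring call and fall back to a conservative rule when the tail spikes. The names (`score_with_deadline`, `fast_scorer`, the 150ms deadline) are illustrative assumptions, not any vendor's API.

```python
import concurrent.futures
import time

# Shared pool so a blown deadline does not leave the caller waiting
# for the slow thread to finish before it can return the fallback.
_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)

FALLBACK = {"risk": None, "decision": "fallback_review"}

def score_with_deadline(scorer, txn, deadline_s=0.15):
    """Run a scoring call under a hard latency budget.

    If the score does not arrive inside the budget, return a
    conservative fallback decision instead of stalling the flow.
    """
    future = _pool.submit(scorer, txn)
    try:
        return future.result(timeout=deadline_s)
    except concurrent.futures.TimeoutError:
        return FALLBACK

def fast_scorer(txn):   # typical request: ~30ms
    time.sleep(0.03)
    return {"risk": 0.12, "decision": "approve"}

def slow_scorer(txn):   # tail spike: ~800ms
    time.sleep(0.8)
    return {"risk": 0.12, "decision": "approve"}

print(score_with_deadline(fast_scorer, {}))  # model decision in time
print(score_with_deadline(slow_scorer, {}))  # fallback rule fires
```

This is exactly the mechanism behind the earlier observation that bad tails make fallback rules more aggressive: every P99 spike past the deadline converts a model decision into a rule decision, regardless of model quality.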

Latency and explainability are connected

Teams often compare speed and explainability as if they trade off automatically. Sometimes they do. Often the real issue is bad product architecture, not physics.

A fast score that no one can trust creates queue work. A beautiful explanation that arrives too late breaks the flow. The useful system is the one that gives you both inside the budget the business can actually support.

If you are evaluating the full stack, the broader checklist is in Fraud Detection API: What to Look For in 2026. If you care about what the ops team sees after the decision lands, read SHAP Explainability for Fraud Ops.

The practical standard

A median of 47ms is useful because it leaves room for the rest of the transaction path. But it only matters if the tail stays controlled and the system keeps giving usable decisions when the environment is imperfect.

That is the bar teams should use when they compare fraud vendors: not one pretty number, but a distribution, under real conditions, with enough context to trust it.

About Riskernel

Riskernel is built for real-time fraud scoring in production flows, with fast decisions and explanations that ops teams can actually use. If you want to compare latency and decision quality against your current stack, start with a shadow test. Get early access.