
The Real Cost of Manual Call Reviews in Regulated Industries

June 23, 2026

Reviewing 3% of calls is not quality assurance. It's hoping and praying.

Most regulated contact centers have a call QA (quality assurance) program where they review a few percent of phone calls and tell themselves they have a good handle on compliance. They do not.

3%: Average share of calls reviewed by QA programs in regulated UK and US contact centers

100%: Share of mobile calls totally missing from any form of call review, insights, or compliance oversight

40%: Disclosure adherence improvement through automated call reviews and coaching

Most regulated contact centers have a QA program. They review a sample of calls, score them, file the reports, and tell themselves they have visibility. They do not. What they have is a statistically inadequate sample, a coaching program built on it, and a compliance exposure that grows every week as the rest of the calls go unreviewed.

One of the best contact center QA programs we've seen at scale had a dedicated team of eight analysts. A 47-point rubric developed over four years. Weekly calibration sessions. Coaching conversations within 48 hours of any scored interaction. The compliance team loved it. The operations director was proud of it.

They were reviewing 3.2% of calls.

That meant 96.8% of conversations (with customers making financial decisions, discussing claims, asking about coverage) were scored by nobody. Not because the program was badly designed. Because manual QA at any meaningful volume hits a ceiling fast, and 3% is roughly where most contact centers with reasonable resource levels end up.

The ceiling is not a failure of effort. It is a structural problem with how manual QA works. And in regulated environments, the gap above that ceiling is not just a quality issue — it is a compliance issue, a performance issue, and a management burden that nobody signed up for when they built the program.

What is actually inside the 97% nobody reviews?

This is the question most QA programs cannot answer — and cannot answer by design, because the answer requires reviewing the calls that were not reviewed.

When contact centers have moved to full-coverage automated review, the picture that emerges from the previously unreviewed calls is consistent enough to describe with confidence. It is not that the unreviewed calls are dramatically worse than the reviewed ones. It is that they contain the same distribution of issues (compliance gaps, coaching opportunities, risk language, customer dissatisfaction signals) that the reviewed calls contain. The QA program's findings are a sample of a larger pattern, not a window onto it.

3% — Average share of calls reviewed by QA programs in regulated UK and US contact centers. When SMS, chat, and mobile calls are included in the denominator, effective coverage often falls below 1%. (Source: NICE 2024 Contact Center Industry Report)
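To make the denominator effect concrete, here is a small sketch of the arithmetic. The channel names and monthly volumes are hypothetical, chosen only for illustration; the 3% review rate on desk calls is the figure quoted above, and the function name is an assumption.

```python
def effective_coverage(reviewed: int, volumes: dict[str, int]) -> float:
    """Share of all conversations, across every channel, that were reviewed."""
    total = sum(volumes.values())
    if total <= 0:
        raise ValueError("total volume must be positive")
    return reviewed / total


# Hypothetical monthly volumes (illustrative only, not from the cited report).
monthly_volumes = {
    "desk_calls": 100_000,
    "mobile_calls": 60_000,
    "sms": 90_000,
    "chat": 100_000,
}

# Assumption: QA reviews 3% of desk calls and nothing on the other channels.
reviewed = round(monthly_volumes["desk_calls"] * 0.03)

desk_only = effective_coverage(
    reviewed, {"desk_calls": monthly_volumes["desk_calls"]}
)
all_channels = effective_coverage(reviewed, monthly_volumes)

print(f"Desk calls only: {desk_only:.1%}")
print(f"All channels:    {all_channels:.2%}")
```

With these made-up volumes, a 3% desk-call sample works out to under 1% of all conversations once the other channels are counted.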

The specific things that live in the unreviewed population vary by sector but follow recognizable patterns. In financial services contact centers, the unreviewed calls contain the disclosure omissions that create suitability exposure: not concentrated in one agent but distributed across the team, and invisible because the sample did not find them. In insurance, they contain the claims handling conversations where expectations were set incorrectly, creating the disputes that surface weeks later with no call record to review. In healthcare, they contain the patient communication that fell short of the required standard without anyone knowing it happened.

The QA program finds some of these. The sample, by definition, misses most of them. And because the sample is not random — manual QA gravitates toward flagged calls, certain agents, or convenient time windows — the misses are not evenly distributed. The agents the program is least likely to catch are the ones whose calls are hardest to flag and easiest to skip.

"We thought our QA process was showing us where the problems were. What it was actually showing us was where the problems were in the calls we'd chosen to review. That's a very different thing." — Head of QA, UK financial services contact center, 400 agents

Why does manual QA fail at scale in regulated environments specifically?

Manual QA fails everywhere at sufficient volume. But it fails in a particular way in regulated environments that makes the consequences sharper.

In an unregulated contact center, the cost of missing a quality issue in an unreviewed call is typically a customer experience problem: a dissatisfied customer, a complaint, a lost renewal. These are real costs, but they surface through measurable outcomes the business can track. The feedback loop is imperfect but it exists.

In a regulated contact center, the cost of missing a compliance issue in an unreviewed call is different in kind. The disclosure that was not made on a call in January may not become a problem until a regulatory examination in October — by which point the call record may be the only evidence of what did or did not happen.

There are four specific ways manual QA breaks down in regulated environments.

1. The sampling bias problem

Manual QA does not review a random sample. It reviews a convenient sample. QA analysts tend toward calls that were flagged, calls from agents they are already watching, calls from comfortable time windows, calls of manageable duration. The result is a sample that over-represents calls most likely to surface issues and under-represents the calls where issues are hiding unnoticed.

This is not a criticism of the analysts; it is how cognitive load and time pressure work at scale. But in a compliance context, the calls most likely to be systematically missed (mobile calls, after-hours calls, calls from agents who have never been flagged) are often the ones carrying the highest risk.
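For contrast with a convenience sample, here is a minimal sketch of a sample that cannot skip awkward segments: calls are grouped by a chosen attribute and the same review rate is drawn at random from each group. The call records, the `channel` field, and the function name are all hypothetical. This does nothing for the coverage problem; it only removes selection bias from whichever calls do get reviewed.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(calls, key, rate, seed=0):
    """Draw the same fraction of calls, at random, from every group."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    groups = defaultdict(list)
    for call in calls:
        groups[key(call)].append(call)

    sample = []
    for group in groups.values():
        # At least one call per group, never more than the group holds.
        k = min(len(group), max(1, round(len(group) * rate)))
        sample.extend(rng.sample(group, k))
    return sample


# Hypothetical call log: 900 desk calls and 100 mobile calls.
calls = [{"id": i, "channel": "desk"} for i in range(900)]
calls += [{"id": 900 + i, "channel": "mobile"} for i in range(100)]

sample = stratified_sample(calls, key=lambda c: c["channel"], rate=0.03)
per_channel = Counter(c["channel"] for c in sample)
print(dict(per_channel))
```

Grouping by shift, agent tenure, or call duration instead of channel would follow the same pattern.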

2. The recency and lag problem

Manual QA takes time. A call reviewed on a Thursday is unlikely to generate coaching feedback before the following week. A compliance issue identified in week one may not appear in a coaching conversation until week three. In the intervening period, the same agent has had dozens more conversations the program was not watching.

Automated review closes this to minutes. A call ends, the transcript is scored, the compliance flag is raised. For regulated environments where the obligation is active supervision — not periodic review — that difference is material.

3. The consistency problem

Two analysts scoring the same call against the same rubric will produce different scores. This is documented, consistent, and not fully solvable by calibration. Human judgment varies by analyst, by time of day, by how many calls the analyst has already reviewed that week. In a compliance context, where the score is potentially evidence of whether a required disclosure was made, that variability is a defensibility problem as much as a quality problem.
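One common way to put a number on this kind of disagreement is Cohen's kappa, which measures agreement between two raters after discounting the agreement expected by chance. The article does not name a specific metric, so treat this as one possible choice. The pass/fail scores below are invented for illustration.

```python
from collections import Counter


def cohens_kappa(scores_a, scores_b):
    """Cohen's kappa for two raters scoring the same items."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two equal-length, non-empty score lists")
    n = len(scores_a)

    # Fraction of calls where both analysts gave the same label.
    observed = sum(a == b for a, b in zip(scores_a, scores_b)) / n

    # Agreement expected by chance, from each analyst's label frequencies.
    counts_a, counts_b = Counter(scores_a), Counter(scores_b)
    labels = set(counts_a) | set(counts_b)
    expected = sum(counts_a[l] * counts_b[l] for l in labels) / (n * n)

    if expected == 1.0:  # both raters used one identical label throughout
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical scores from two analysts on the same ten calls.
analyst_a = ["pass", "pass", "fail", "pass", "fail",
             "pass", "pass", "fail", "pass", "pass"]
analyst_b = ["pass", "fail", "fail", "pass", "pass",
             "pass", "pass", "fail", "pass", "fail"]

kappa = cohens_kappa(analyst_a, analyst_b)
print(f"Raw agreement: 70%, kappa: {kappa:.2f}")
```

In this invented example the analysts agree on 7 of 10 calls, but kappa is much lower than 0.7 because a large part of that agreement would be expected by chance alone.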

4. The manager time problem

In most contact centers, QA is not fully separated from management. Managers listen to calls. Supervisors contribute to scoring. Team leaders review flagged interactions. The hours consumed by manual QA at even a 3% review rate represent a significant share of management capacity not available for the work QA was supposed to enable: coaching, development, and performance improvement.

Dimension | Manual QA (3% sample) | Automated Review (100% coverage)
Coverage | 3–5% of calls | 100% of calls
Lag from call to score | 2–5 days | Minutes
Consistency | Varies by analyst | Consistent against defined rubric
Compliance evidence | Partial, sample-based | Complete, defensible
Manager time per week | 6–12 hours reviewing calls | 1–2 hours acting on surfaced items
Coaching basis | Small, potentially unrepresentative sample | Full call history per agent

Spoke Enlighten applies your existing call QA rubric to 100% of conversations automatically. Within minutes of each call ending it surfaces what needs attention without manual review overhead.

What is the compliance exposure your program is not catching?

This question is worth answering precisely, because it is easy to overstate and easy to underestimate.

The compliance exposure from inadequate or manual call QA is not that the contact center is full of agents systematically breaking rules. Most agents, most of the time, are doing a good job. The exposure is statistical and structural: in any large population of regulated conversations, a predictable proportion will contain compliance issues. Manual QA at 3% coverage finds a predictable proportion of those. The rest remain in the unreviewed population until a regulator, a litigant, or a customer complaint surfaces them.

40% — Improvement in required disclosure adherence observed in regulated contact centers after moving from manual sampling to automated full-coverage QA, in the first six months of deployment. (Source: Spoke customer operational data, 2024–2025. Five regulated contact centers, combined volume 1.8M calls/month. Anonymised.)

The most common compliance gaps that automated review surfaces and manual call QA does not fall into three categories, consistently across financial services, insurance, and healthcare.

1. Disclosure omissions. Required disclosures, suitability language, mandatory risk warnings, and regulatory references are the compliance obligations most directly tied to call content. Agents under handling-time pressure omit them: not always, not deliberately, but at a rate that is only visible across the full call population. In a contact center handling 50,000 calls a month with 3% coverage, if disclosure omissions run at even 5% in the unreviewed population (a conservative figure based on what full-coverage review typically finds), that represents roughly 2,425 calls a month carrying a compliance gap the program did not see.

2. Risk language and complaint signals. Automated review can be tuned to flag specific language patterns: phrases associated with elevated complaint risk, language that deviates from approved scripts, and escalation signals that should trigger supervisor review. Manual QA catches these when they appear in the reviewed sample. Automated review catches them in every call.

3. Post-call process gaps. In contact centers where agents are expected to complete specific workflow steps after a call, such as logging outcomes, completing documentation, and following required escalation procedures, automated call review technology can verify compliance with those requirements too. Manual QA checking post-call compliance at 3% is checking 3% of the exposure.

None of this is catastrophic in isolation. A single disclosure omission is a training issue. A pattern of disclosure omissions across a team, visible only when you review enough calls to see the pattern, is a compliance program issue. Manual QA at 3% coverage cannot distinguish between the two.
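The arithmetic behind that last point can be sketched in a few lines. The 50,000 calls, 3% coverage, and 5% omission rate are the figures used in this section; the agent with 400 calls a month is hypothetical, and the second function assumes every call independently carries the same omission rate, which is a simplification.

```python
def unseen_omissions(monthly_calls: int, coverage: float, omission_rate: float) -> float:
    """Expected number of omission calls that fall outside the reviewed sample."""
    return monthly_calls * (1 - coverage) * omission_rate


def prob_sample_shows_none(reviewed_calls: int, omission_rate: float) -> float:
    """Chance that a reviewed sample contains zero omissions.

    Assumes each call independently has the same omission rate.
    """
    return (1 - omission_rate) ** reviewed_calls


# Figures from this section: 50,000 calls a month, 3% reviewed, 5% omission rate.
expected_unseen = unseen_omissions(50_000, 0.03, 0.05)

# Hypothetical agent handling 400 calls a month, sampled at the same 3%.
agent_reviewed = round(400 * 0.03)
p_none = prob_sample_shows_none(agent_reviewed, 0.05)

print(f"Omissions outside the sample: about {expected_unseen:,.0f} a month")
print(f"Calls reviewed for one agent: {agent_reviewed}")
print(f"Chance that agent's sample looks clean: {p_none:.0%}")
```

Under these assumptions, an agent who omits a disclosure on one call in twenty still has a better-than-even chance of showing a clean sample in any given month, which is why a 3% sample struggles to tell an isolated slip from a pattern.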

What does the transition to full-coverage QA actually involve?

The practical question operations leaders ask when they understand the coverage problem is not whether to fix it — it is what fixing it requires and how disruptive it is to the program that is already running.

Automated call review does not replace your QA program. It replaces the time-consuming parts of it: manual call listening and building personalized coaching plans. The rubric stays. The calibration process stays. The coaching framework stays. What changes is the mechanism that applies the rubric to conversations, and how much manager time that takes.

Rubric translation

The first step is converting the existing QA rubric into a format the automated review system can apply consistently. This is less technical than it sounds for most rubrics, but it requires a level of precision that calibration meetings sometimes paper over. A rubric item that says "agent demonstrated empathy" needs to be translated into observable call behaviors the system can identify: specific phrases, specific response patterns, the absence of certain language. This process usually surfaces rubric ambiguities that calibration was resolving informally, which is useful in its own right.

Most contact centers complete the translation in four to six weeks working with implementation support. Rubrics with significant judgment components retain a higher proportion of human review for items the system flags but cannot fully assess.
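As an illustration of what a translated rubric item might look like, here is a minimal sketch. It is not how any particular product represents rubrics: the `RubricItem` fields, the example phrases, and the plain substring matching are all assumptions made for the example, and a real system would likely use something richer than substring checks.

```python
from dataclasses import dataclass, field


@dataclass
class RubricItem:
    name: str
    indicator_phrases: list[str]                 # language that counts as evidence
    forbidden_phrases: list[str] = field(default_factory=list)
    judgment_heavy: bool = False                 # unclear cases go to a human


def assess(item: RubricItem, transcript: str) -> str:
    """Return 'met', 'not_met', or 'needs_review' for one rubric item."""
    text = transcript.lower()
    if any(phrase in text for phrase in item.forbidden_phrases):
        return "not_met"
    if any(phrase in text for phrase in item.indicator_phrases):
        return "met"
    # No evidence either way: route judgment-heavy items to human review.
    return "needs_review" if item.judgment_heavy else "not_met"


# Hypothetical translation of "agent demonstrated empathy".
empathy = RubricItem(
    name="Agent demonstrated empathy",
    indicator_phrases=["i understand", "i'm sorry to hear"],
    forbidden_phrases=["calm down"],
    judgment_heavy=True,
)

print(assess(empathy, "I'm sorry to hear that. Let me check your claim."))
print(assess(empathy, "Your claim is closed."))
print(assess(empathy, "You need to calm down."))
```

Writing the item down this way forces the decisions that calibration meetings tend to leave implicit: which phrases count, which rule an item out, and which cases should still go to a person.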

The QA team's role after transition

The question QA teams ask first is what happens to their jobs. The honest answer: the time spent on call listening and scorecard completion shifts entirely to other work. Most QA professionals find the post-transition role more substantive. They spend their time on work that requires human judgment and effects change, rather than on listening to calls, processing, and data entry. But the transition requires explicit management. Operations leaders who handle it well are honest about what the team will be doing differently and why, and they have those conversations early.

The manager experience

The change that team leaders describe as most significant is not the compliance improvement. It is what happens to their week.

A team leader managing 15 agents in a regulated contact center typically spends six to twelve hours a week on call review and coaching preparation: finding calls, listening, scoring, writing feedback, scheduling conversations. That is time that should be going to the coaching conversations, performance management, and team development they were hired to deliver.

When automation handles the call review and scoring, the coaching plan is built automatically too. The calls that need attention are identified, and peer-based coaching plans can be created for each agent, focusing on their individual strengths and weaknesses. No one-size-fits-all coaching here. The manager's time now goes to the conversation: bringing the human element, holding the nuanced performance discussion, or handling the delicate situation with an agent who needs support. The system flags the issue, creates the content, and tells managers where to focus.

The efficiency argument for automated QA is real. The more important argument is what the efficiency enables: managers who are developing people rather than processing call records.

Automated Call QA and Coaching FAQs

Is 100% call review really achievable at scale?

Yes, via automation. Human QA at 100% coverage is not achievable at any reasonable cost. The headcount required to review 100,000 calls a month would exceed most operations' entire agent population. Automated review applies the rubric to every call within minutes of it ending, at a per-call cost that scales with volume rather than headcount.
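A rough sense of the headcount involved comes from scaling the example program described earlier, where eight analysts covered 3.2% of calls. The sketch below assumes review effort scales linearly with the number of calls reviewed and ignores calibration and management overhead, so it is an order-of-magnitude estimate only. The function name is hypothetical.

```python
def analysts_for_full_coverage(current_analysts: int, current_coverage: float) -> float:
    """Linear extrapolation of analyst headcount from partial to 100% coverage."""
    if not 0 < current_coverage <= 1:
        raise ValueError("coverage must be a fraction between 0 and 1")
    return current_analysts / current_coverage


# Figures from the example program: eight analysts reviewing 3.2% of calls.
needed = analysts_for_full_coverage(8, 0.032)
print(f"Analysts needed for 100% coverage: about {needed:.0f}")
```

On that linear assumption, the same program would need roughly 250 analysts to review every call by hand.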

What happens to our QA team when we automate the review?

The role changes rather than disappears. Time spent on call listening and scorecard completion shifts to work that requires judgment the automated system cannot replace. Most QA teams report finding the new roles more substantive and rewarding. The transition requires explicit management, including honest conversations about what the team will be doing differently.

How do you handle calls that require nuanced judgment?

Automated review systems can handle both structured and unstructured elements well. Rule-based checks cover questions such as whether the required disclosure was made, whether prohibited phrases were used, and whether the script was followed. Modern capabilities also extend to calls requiring nuanced judgment, such as complex complaints, ambiguous language, and tone concerns that did not fit the patterns older technology relied on. Either way, nuance and ambiguity can be flagged for human review.

How long before we see measurable improvement in compliance scores?

Most regulated contact centers see measurable improvement in disclosure adherence within six to eight weeks of implementing full-coverage review and coaching, and all reach 100% call review within the first week. The feedback loop closes faster: instead of waiting for coaching based on a small sample, agents receive structured weekly feedback based on all of their calls and their individual needs. The improvement compounds because the coaching is now based on actual patterns rather than only the visible ones.

What do we do about calls that happened before we implemented full coverage?

Historical calls on unmonitored or partially monitored channels are generally not recoverable without legal process. The practical answer: draw a clean line, document the change, treat prior calls as legacy exposure that counsel is aware of, and implement full coverage from a specific date. Regulators examining contact center QA programs understand that programs improve over time. What they look for is that improvement is happening and is documented.

“Spoke Enlighten exceeds its name - the insights it creates are truly enlightening. It's designed to lift people up not replace them. If you're struggling to get started with AI, Spoke Enlighten provides a clear and simple path to real ROI.”

Patrick Jones
Sage Freight

See how Spoke Reduces QA Costs and Compliance Risk

Book time with our team to see how automated call reviews and coaching for your contact center, desk, and mobile workforce calls can protect your company and set your team free.