Dograh

AI Voice Agents for Public Surveys: Phone Reach at Web Scale

Use Case · July 3, 2026 · 10 min read

Vemu Sandeep·GTM Engineer, Dograh AI

An AI voice agent runs a public-sector survey by calling residents, reading a scripted questionnaire with skip logic, and holding a real conversation in the respondent's own language. It records each answer, extracts it into a structured variable, and scores sentiment after the call, producing clean CSAT and NPS data without a human phone bank.

Key Takeaways

  • AI voice agents give agencies phone reach at online-survey cost.
  • Consistent scripting and live extraction turn spoken replies into clean CSAT and NPS data.
  • Self-hosting keeps constituent and voter survey data inside your own infrastructure and country.

This post is part of our guide to AI Voice Agents for Government & Public Services.

The reach gap online surveys keep leaving open

Online surveys are cheap, and they quietly reach the people already paying attention while missing everyone else. Response rates are sliding even for official government instruments. The UK Civil Service People Survey fell to 59% in 2025 from 61% a year earlier, and typical digital surveys land well below that. The residents a public body most needs to hear from tend to be the ones who drop out first: older people, and households with limited English or no reliable internet access. More than one in five US residents age five and older speak a language other than English at home, and an English-only web form never reaches most of them. Representative feedback is not a nicety for an agency. Program funding and service redesign both lean on it, and a sample that quietly excludes the harder-to-reach residents produces decisions that miss those same people.

Phone reaches the populations a web form cannot. The catch has always been price: a human phone bank is slow and expensive, and phone response rates have fallen for two decades. That has left agencies picking between a channel that is cheap but skewed and one that is representative but unaffordable. An AI voice agent opens a third path: phone reach at close to online economics. It is the outbound mirror of an inbound citizen helpline. Instead of waiting for residents to call in, the agency reaches out to a sample and brings the answers back.

How an AI voice agent actually runs a public survey

The agent dials from a list and works through a scripted questionnaire, adapting to the person on the line without wandering off script. It reads questions in order, follows skip logic so each respondent only hears the items that apply, randomizes answer order where the methodology asks for it, and re-asks when a reply is unclear. Because the wording is fixed, every respondent hears the same question with no interviewer drift and no leading. A 2025 study put an LLM voice interviewer through a 123-question, roughly 30-minute instrument where 73% of the people who started finished and 86% rated the experience neutral or positive, against a 24% break-off rate for legacy IVR. That gap between a real conversation and a touch-tone menu is the whole point, because people stay on the line when it feels like talking to someone rather than pressing numbers through a phone tree.
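
The script-walking behavior described above (fixed order, skip logic, optional answer-order randomization, re-asking on an unclear reply) can be sketched in a few lines of Python. This is a minimal illustration and not Dograh's implementation: the `Question` class, the `run_interview` function, the sample `SCRIPT`, and the `get_reply` callback (which stands in for the speech and language layer) are all hypothetical names invented for this example.

```python
import random
from dataclasses import dataclass
from typing import Callable


def _always(answers: dict) -> bool:
    return True


@dataclass
class Question:
    qid: str
    text: str
    options: list
    randomize: bool = False                    # shuffle option order per respondent
    ask_if: Callable[[dict], bool] = _always   # skip logic on earlier answers


def run_interview(questions, get_reply, rng=None, max_retries=2):
    """Walk the script in order. get_reply(text, options) returns a reply string."""
    rng = rng or random.Random()
    answers = {}
    for q in questions:
        if not q.ask_if(answers):
            continue                           # respondent never hears this item
        options = list(q.options)
        if q.randomize:
            rng.shuffle(options)               # only the read-out order changes
        answer = None
        for _ in range(1 + max_retries):       # first ask plus re-asks
            reply = get_reply(q.text, options)
            if reply in q.options:
                answer = reply
                break
        answers[q.qid] = answer                # None means unclear after retries
    return answers


# Hypothetical three-item instrument with one branch.
SCRIPT = [
    Question("used_service", "Have you used the permit office in the last year?",
             ["yes", "no"]),
    Question("csat", "How satisfied were you with the service?",
             ["very satisfied", "satisfied", "dissatisfied"],
             ask_if=lambda a: a.get("used_service") == "yes"),
    Question("reason_not_used", "What is the main reason you did not use it?",
             ["did not need it", "did not know about it", "too hard to reach"],
             randomize=True,
             ask_if=lambda a: a.get("used_service") == "no"),
]
```

Ordered scales such as the satisfaction item are left unshuffled here, and only the nominal list is randomized. That is one common convention; the survey methodology decides which items to randomize.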

It also runs at any hour and in the respondent's language, which matters when the sample includes shift workers and limited-English households. It slows down and does not talk over an older respondent who needs a moment, and it honors the call-time windows and consent rules that govern outbound public-sector calling. Barge-in lets a respondent interrupt and correct an answer mid-question, which keeps the data accurate instead of forcing them down a rigid path. The underlying plumbing is the same outbound stack agencies use to push proactive status updates to citizens, pointed at collecting answers rather than sending them.
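
A dialer can enforce call-time windows and consent with a small gate before each attempt. The sketch below is an assumption-laden illustration: the 09:00 to 20:00 window, the `may_call` function, and its parameters are hypothetical, and the actual window and consent rules depend on the jurisdiction and the agency's own policy.

```python
from datetime import datetime, time, timedelta, timezone

# Assumed policy for illustration only, not a statement of any legal rule.
CALL_WINDOW = (time(9, 0), time(20, 0))


def may_call(now_utc, utc_offset_hours, has_consent, on_do_not_call):
    """Return True only if consent is on file and it is inside the window locally."""
    if on_do_not_call or not has_consent:
        return False
    # Convert to the respondent's local time using a fixed UTC offset.
    local = now_utc.astimezone(timezone(timedelta(hours=utc_offset_hours)))
    start, end = CALL_WINDOW
    return start <= local.time() < end
```

A fixed offset keeps the example self-contained. A production dialer would more likely store a named time zone per respondent so that daylight-saving changes are handled correctly.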

Turning spoken answers into clean CSAT and NPS data

The real deliverable is the structured dataset that comes out the other end, and the call is only the way to get there. A spoken "yeah, it was fine, took a while though" is worthless as raw audio. The agent extracts each answer into a defined variable while the call is happening, so a scaled question becomes a number and an open comment becomes tagged text. A net-promoter question lands as a clean 0-to-10 integer with the follow-up reason stored beside it, and a batch of open comments can be themed after the field closes, so an analyst sees the top drivers rather than a transcript dump. After the call, sentiment scoring runs across the transcript, so a CSAT or NPS figure arrives with the reasoning attached instead of a bare digit.
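
To make the net-promoter step concrete, here is a minimal sketch of turning a spoken reply into a 0-to-10 integer and then computing NPS with the standard formula (percentage of promoters scoring 9 or 10, minus percentage of detractors scoring 0 to 6). The rule-based `extract_score` is a deliberately crude stand-in for whatever model-based extraction a real platform uses. Both function names are hypothetical and nothing here describes Dograh's internals.

```python
import re

NUMBER_WORDS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}


def extract_score(utterance):
    """Return a 0-10 int if exactly one score is mentioned, else None (re-ask)."""
    text = re.sub(r"out of (ten|10)", " ", utterance.lower())
    found = set()
    for tok in re.findall(r"[a-z]+|[0-9]+", text):
        if tok.isdigit():
            value = int(tok)
        elif tok in NUMBER_WORDS:
            value = NUMBER_WORDS[tok]
        else:
            continue
        if 0 <= value <= 10:
            found.add(value)
    # Zero or several candidate numbers is treated as ambiguous.
    return found.pop() if len(found) == 1 else None


def nps(scores):
    """Net Promoter Score: % promoters (9-10) minus % detractors (0-6)."""
    valid = [s for s in scores if s is not None]
    if not valid:
        return None
    promoters = sum(s >= 9 for s in valid)
    detractors = sum(s <= 6 for s in valid)
    return round(100 * (promoters - detractors) / len(valid))
```

For example, `extract_score("I'd say an eight out of ten")` yields 8, while a reply such as "somewhere between six and seven" yields `None`, which is the cue for the agent to re-ask rather than guess.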

That is what separates a pile of recordings from a dataset an analyst can load on Monday. Dograh handles variable extraction and post-call sentiment scoring as part of the run, and exports transcripts and recordings for audit. Because the extraction runs live, partial calls still yield usable answers up to the point where the respondent hung up. Consistent scripting means the wording behind each number is identical across ten thousand calls, which is what makes the aggregate defensible when someone challenges it.
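
One way to picture how a partial call still contributes data is a flat export in which unanswered items are simply left blank. The sketch below uses Python's standard `csv` module. The column names in `FIELDS` and the `to_csv` helper are hypothetical and do not reflect Dograh's actual export schema.

```python
import csv
import io

# Hypothetical column layout for one row per call.
FIELDS = ["call_id", "completed", "used_service", "csat", "nps", "nps_reason"]


def to_csv(records):
    """Serialize per-call answer dicts; items never reached are left empty."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=FIELDS, restval="", extrasaction="ignore",
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return buf.getvalue()


records = [
    {"call_id": "c1", "completed": True, "used_service": "yes",
     "csat": 4, "nps": 9, "nps_reason": "quick"},
    # Respondent hung up after the satisfaction item.
    {"call_id": "c2", "completed": False, "used_service": "yes", "csat": 3},
]
```

Keeping a `completed` flag beside the answers lets an analyst decide later whether to include partial interviews item by item or restrict to completes.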

Where this fits, and where it does not

AI voice holds up well on structured, quantitative surveys and still trails a skilled human on deep open-ended probing. A 2025 study at scale found an AI telephone agent performed comparably to an experienced human enumerator on consistency, with open-ended answers richer than a typical online form. That is a genuine result, and it is also the ceiling worth being honest about. When a question needs real follow-up, chasing a vague answer with the right prompt and reading hesitation on the line, a trained human interviewer still leads. Mode effects are real too, so a voice interview will not perfectly match a web panel on every item, and treating the two channels as interchangeable in a trend line is a mistake worth avoiding.

Survey institutions are actively studying whether the people who agree to an AI interview differ from those who decline, which stays an open methodological question. So the honest placement is structured, quantitative work at scale, from CSAT and NPS tracking to fixed-response constituent polling, with humans kept for the deep qualitative interviews where probing is the entire job. Anyone promising parity with a human enumerator on open-ended work is overselling it.

What to look for when the data is your constituents'

For public bodies, the buying criteria have less to do with voice quality and more to do with where the data lives and what a call costs at scale. Constituent survey data is sensitive. It can carry opinions about local officials, and in polling work it edges close to voter data. That makes data residency the first question rather than an afterthought. A hosted survey vendor routes every call and transcript through infrastructure the agency does not control, which is a hard sell for a records office bound by procurement rules. Self-hosting keeps the audio and the extracted answers inside the agency's own environment, and open source lets its team read and audit the code that touches the data instead of trusting a sealed box.

Where the data physically sits matters as much as who can read it. Self-hosting keeps constituent survey responses, and any voter or opinion data, inside the agency's own infrastructure and its own country, so nothing crosses a border to reach a third-party SaaS provider's servers. When a records office or an oversight body later asks where the data lived and who could touch it, an agency that ran the field on its own systems has a clean answer instead of a vendor's data-processing addendum.

The economics cut differently here as well. Survey fielding is bursty: a few weeks of heavy calling around a program cycle, then quiet. A per-minute platform fee punishes exactly that shape. A self-hosted stack with no per-minute platform charge lets an agency run a large field without the bill scaling in lockstep with reach. The same setup produces a complete transcript and recording for every call, which is exactly what a public-records request or an internal audit asks for later, and ownership of that archive stays with the agency. Dograh is built on this model: open source and self-hostable, following the same data-sovereignty logic that makes on-prem the winning shape for regulated voice AI, with 45+ languages so the sample finally includes the households that never fill out an English web form. The agencies that get representative feedback in 2026 will be the ones that stopped treating reach and budget as a trade-off and started being candid about which questions voice is ready to ask.

Glossary

CATI (Computer-Assisted Telephone Interviewing)
The traditional human-interviewer phone survey method, where a live agent reads questions off a screen and keys in answers. An AI voice agent automates this loop while keeping the conversational back-and-forth.

Break-off rate
The share of respondents who start a survey and abandon it before finishing. Legacy touch-tone IVR surveys see high break-off; conversational voice agents keep far more people to the end.

Structured extraction
Converting a spoken answer into a defined data field in real time, so a scaled reply becomes a number and an open comment becomes tagged text ready for analysis.

Skip logic
Questionnaire branching that routes a respondent past questions that do not apply based on earlier answers, so each person only hears the relevant path through the instrument.
