Writings

The AI India's Health Workers Actually Need

Introduction

Something is shifting in public health—and most people building AI are looking at the wrong layer.

I've been building deep learning systems for public health in India for seven years. Before that, a decade in the US—deep learning since the AlexNet days, across academia, IBM, and startups. I've watched a phone-based baby weight tool reach 100,000 vulnerable newborns, a pest advisory system hit a million farmers daily, and AI-powered TB screening catch 100,000 cases that would have been missed.

I've seen what works. I've seen what doesn't. And right now, I see something emerging that will either overwhelm the system—or transform it.

Let me tell you what I see.

The Problem

India's public health system runs on the backs of nearly a million frontline health workers. These are the people who actually reach families—walking into homes, asking about symptoms, tracking pregnancies, reminding mothers about immunizations.

They go by different names depending on their role and state. Accredited Social Health Activists (ASHAs) are incentive-based community volunteers who connect families to health services. Auxiliary Nurse Midwives (ANMs) are government health workers stationed at health sub-centres, handling everything from antenatal care to vaccinations. Community Health Worker (CHW) is the umbrella term for these frontline roles. There are also Anganwadi Workers (AWWs), who focus on nutrition and early childhood development.

I've sat with these workers. I've watched them navigate their day. And here's what it actually looks like:

Consider what a single ANM faces:

They record patient data on paper or WhatsApp during home visits because they can't fumble with seven different apps while talking to a scared pregnant woman. Then they go home at night and enter everything from memory across multiple systems. They lose an hour a day—minimum—to this administrative theater.

The systems of record that power our public health dashboards are erroneous and noisy by design. What time did they actually visit? What was actually said? It's all filtered through memory, through fatigue, through the cognitive tax of impossible systems. The data was never clean to begin with.

Here's the thing most AI interventions miss: the pain for these workers isn't the care—it's the logistics. They didn't sign up to be data entry clerks. They signed up to help mothers and children. But the administrative burden has become so overwhelming that it crowds out the care they're capable of giving.

The past decade brought digitization. It added screens to the workflow without removing friction. Dashboards proliferated; data quality didn't improve. Training happened in classrooms; protocols were forgotten in the field.

We digitized the paperwork. We didn't eliminate it.

Before we build fancy clinical decision support tools or screening AI, we need to fix the thing that's actually breaking these workers' days. The urgent problem isn't clinical knowledge—it's operational burden. Give them back the hour they lose every day to paperwork. Remove the friction that's standing between them and the care they already know how to deliver. Only then can we layer on capabilities that extend what they can do.


The Shift: Actions, Not Answers

Here's what's changed.

The next shift isn't better dashboards or another chatbot. It's agentic systems—AI that doesn't just answer questions, but actually does the digital work.

When I say "agent," I mean an LLM-powered, safety-wrapped system that can orchestrate software tools—pull data, fill forms, book appointments, submit records—end-to-end, across the messy stack we already have. The health worker asks for a referral; the agent checks eligibility, pre-fills the form from prior visit data, books the appointment, and updates the registers. The output isn't a paragraph. It's a submitted form, a scheduled follow-up, a referral booked.
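A minimal sketch of that end-to-end flow, in Python. Every function name and data field here (check_eligibility, prefill_form, book_appointment) is a hypothetical stand-in, not a real NHM or ABDM API:

```python
# Sketch of an agentic referral flow. All functions and fields below are
# hypothetical placeholders, not real government system calls.

def check_eligibility(patient):
    # A real system would query a scheme/eligibility service here.
    return patient["weeks_pregnant"] >= 28 and patient["bp_flagged"]

def prefill_form(patient, prior_visits):
    # Pre-fill the referral form from prior visit data instead of re-asking.
    return {
        "name": patient["name"],
        "risk": "high" if patient["bp_flagged"] else "normal",
        "last_bp": prior_visits[-1]["bp"],
    }

def book_appointment(form):
    # Placeholder for a booking API call; returns a confirmation record.
    return {"status": "booked", "facility": "district hospital", "form": form}

def handle_referral_request(patient, prior_visits):
    """One multi-step action: eligibility -> prefill -> book."""
    if not check_eligibility(patient):
        return {"status": "escalate", "reason": "eligibility unclear"}
    form = prefill_form(patient, prior_visits)
    # The output is an action taken, not a paragraph of advice.
    return book_appointment(form)

patient = {"name": "Asha Devi", "weeks_pregnant": 32, "bp_flagged": True}
visits = [{"bp": "150/95"}]
print(handle_referral_request(patient, visits)["status"])  # booked
```

The point of the sketch is the shape: one request triggers a chain of digital actions, and the worker sees a result, not a wall of text.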

And India is uniquely positioned for this leap—because we already built the rails.

Our Digital Public Infrastructure (DPI) gives agents what they need to actually operate: Aadhaar for identity, UPI for payments, ABDM (Ayushman Bharat Digital Mission) for health records—the pipes to authenticate, consent, and transact. Bhashini is building the language layer for real Indian speech—not just Hindi and English, but the dialects health workers actually speak. And we have scale that makes the investment worthwhile: nearly a million frontline workers, hundreds of millions of beneficiaries.

This combination—agentic AI running on public digital infrastructure, in a country with the scale and the need—doesn't exist anywhere else. If we can solve for the extreme constraints of an ANM in Uttar Pradesh, we build a foundation that can adapt to Kenya, Nigeria, Brazil. India becomes a producer of globally exportable health infrastructure—not a consumer of Silicon Valley tools.


The Solution: Four Agents, One System

Rather than scattered chatbots for isolated problems, I've converged on four agentic personas that together cover the full arc of a health worker's day.

Architecturally, these are capability modules, not standalone products. The orchestration layer, safety guardrails, and interaction patterns are shared. What changes per deployment is the grounding: which protocols Guru retrieves, which registries Clerk writes to, which dashboards Analyst reads. An ANM in UP running maternal health gets different configuration than an ASHA in Karnataka doing immunization—same system, parameterized differently.
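One way to picture that parameterization is as a per-deployment config object. This is a sketch; the protocol, registry, and dashboard names are illustrative placeholders, not real system identifiers:

```python
from dataclasses import dataclass, field

# Sketch of per-deployment grounding. All names below are invented
# placeholders standing in for real protocols, registries, and dashboards.

@dataclass
class DeploymentConfig:
    state: str
    program: str
    guru_protocols: list = field(default_factory=list)    # what Guru retrieves
    clerk_registries: list = field(default_factory=list)  # what Clerk writes to
    analyst_dashboards: list = field(default_factory=list)  # what Analyst reads

up_maternal = DeploymentConfig(
    state="Uttar Pradesh",
    program="maternal_health",
    guru_protocols=["anc_guidelines", "high_risk_pregnancy_sop"],
    clerk_registries=["anc_register", "rch_portal"],
    analyst_dashboards=["hrp_tracker"],
)

ka_immunization = DeploymentConfig(
    state="Karnataka",
    program="immunization",
    guru_protocols=["uip_schedule"],
    clerk_registries=["immunization_tracker"],
    analyst_dashboards=["coverage_dashboard"],
)
# Same system, different grounding per deployment.
```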

Sakhi — The Voice in the Pocket

On-the-Job Support Agent

A fast, voice-to-voice interface that provides real-time guidance in the field. Sakhi handles live conversation in local dialects, answers protocol questions, and serves as the front-facing personality the worker actually talks to.

"High-risk pregnancy, third trimester, BP flagged—what's next?" Sakhi calls Guru to retrieve the protocol, synthesizes the reasoning for the worker, and hands off to Clerk to book the referral.


Guru — The Walking Supervisor

Knowledge & Learning Agent

Trained on NHM (National Health Mission) protocols, training curricula, and verified medical guidelines. Guru is as good as (or better than) a senior supervisor in knowledge—available 24/7, in any language, without judgment.

Guru shifts training from episodic classrooms to continuous, embedded learning. Before a home visit, refresh the case. After a difficult interaction, learn from it.


Clerk — The Paperwork Machine

Form-Filling & Action Agent

The action-taker. Clerk talks to software systems, knows how to connect APIs and apps, and completes the administrative loop. When the conversation is done, Clerk submits forms, updates registers, syncs records, and leaves an auditable trail.

"Done—I'll update the 6 relevant forms: ANC register, RCH portal, ICDS referral, immunization tracker, nutrition log, and MO summary."


Analyst — The Ambient Brain

Data Interaction Agent

The silent layer that reads dashboards, forms, and logs. Analyst surfaces gaps ("3 high-risk cases haven't been visited this week"), flags inconsistencies, and enables supervisors and program managers to talk to their data instead of staring at tables.
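Analyst's gap-surfacing can be sketched as a simple query over a visit log. The field names and the seven-day window are assumptions for illustration:

```python
from datetime import date, timedelta

# Sketch of Analyst-style gap surfacing over a visit log.
# Field names and the 7-day rule are illustrative assumptions.

def unvisited_high_risk(cases, visits, today, window_days=7):
    """Return high-risk cases with no visit inside the window."""
    cutoff = today - timedelta(days=window_days)
    recently_visited = {v["case_id"] for v in visits if v["date"] >= cutoff}
    return [c for c in cases if c["high_risk"] and c["id"] not in recently_visited]

cases = [
    {"id": 1, "high_risk": True},
    {"id": 2, "high_risk": True},
    {"id": 3, "high_risk": False},
]
visits = [{"case_id": 1, "date": date(2024, 5, 10)}]
gaps = unvisited_high_risk(cases, visits, today=date(2024, 5, 14))
print(f"{len(gaps)} high-risk case(s) haven't been visited this week")
```

The real value is the last step: Analyst turns that query result into a sentence a supervisor can act on, instead of a row buried in a dashboard.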


How They Work Together

These aren't four separate products. They're four capabilities that orchestrate as a single system:

Time of Day | What Happens | Agents Involved
Morning (PHC) | Training session—ask "what-if" questions, refresh protocols | Sakhi + Guru
Before visit (on bike) | Review case file, check history, prepare for conversation | Sakhi + Guru + Analyst
During visit | Silent listener mode—ambient capture, minimal interruption | Sakhi (background)
After visit (on bike) | Post-visit review—clarify gaps, resolve inconsistencies | Sakhi + Analyst + Clerk
Evening (home) | Batch review—submit forms, confirm completion, plan tomorrow | Analyst + Clerk + Guru

The Agentic Architecture

Building agents for healthcare isn't a prompt engineering problem. It's a systems engineering problem.

I learned this the hard way. Early prototypes treated Sakhi like a smart chatbot—give it a prompt, plug in some tools, see what happens. It didn't work. Field visits are chaotic. Workers get interrupted. Connectivity drops. Questions come in fragments. A chatbot that loses context the moment the call drops is useless. What actually works is thinking about Sakhi as an orchestrator—a system that routes requests to the right capability, holds state across interruptions, and knows when to answer versus when to act.

For instance: a health worker mid-visit asks about a high-risk flag on her patient. Is that a protocol question? Route to Guru. Does she need the referral booked? Route to Clerk. Does she want to see the patient's last three visits? Route to Analyst. And when the question is vague—"what should I do?"—the system asks for clarification instead of blindly assuming. It exposes ambiguity rather than hiding it. The orchestration layer handles intent classification, context management, tool selection—all the machinery that lets a single conversation feel coherent even when it's drawing from multiple systems underneath.

This isn't a chatbot with plugins. It's a reasoning engine with agency—capable of multi-step planning, error recovery, and knowing when to ask for clarification versus when to act.
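A toy sketch of that routing step. In practice an LLM classifier with conversation context would do this; the keyword matching below just stands in for it:

```python
# Toy sketch of the orchestration layer's routing step. A real system would
# use an LLM intent classifier with context; keywords stand in for it here.

def route(utterance):
    """Map a request to a capability, or ask for clarification."""
    text = utterance.lower()
    if any(w in text for w in ("protocol", "what's next", "guideline")):
        return ("guru", "retrieve protocol")
    if any(w in text for w in ("book", "refer", "submit", "fill")):
        return ("clerk", "execute action with confirmation")
    if any(w in text for w in ("last", "history", "show", "visits")):
        return ("analyst", "query records")
    # Ambiguous: expose the ambiguity instead of guessing.
    return ("clarify", "Which do you need: guidance, a booking, or records?")

print(route("High-risk pregnancy, BP flagged—what's the protocol?")[0])  # guru
print(route("Book the referral")[0])                                     # clerk
print(route("Show her last three visits")[0])                            # analyst
print(route("What should I do?")[0])                                     # clarify
```

The design choice worth noting is the final branch: a vague request returns a clarifying question rather than a guessed action.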

The Platform

Here's what I realized building this: if the orchestration layer is robust enough, Sakhi becomes the universal interface to every AI capability in the health ecosystem.

There's already a screening model for oral cancer that works from a phone camera. Today, that's a separate app, separate login, separate training. With a proper orchestration layer, it becomes a skill Sakhi can invoke mid-conversation—"take a look at this"—and return a risk score without the worker ever leaving the flow. The same pattern applies to telehealth (an API call to eSanjeevani), lab results (a webhook from the district hospital), and scheme eligibility (a function call to the government database).

Whatever the terminology—APIs, skills, plugins—the point is the same: any future AI capability plugs into the same interface. The worker doesn't need ten apps. They need one conversation with the right capabilities behind it, added through proper governance, not a free-for-all.
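The plug-in pattern might look like this sketch: a registry with a governance gate in front of it. The skill name and the stubbed screening result are invented for illustration:

```python
# Sketch of a governed skill registry: capabilities plug into one interface.
# The skill name, approval flag, and stubbed result are illustrative only.

class SkillRegistry:
    def __init__(self):
        self._skills = {}

    def register(self, name, fn, approved=False):
        # Governance gate: only approved skills become invocable.
        if not approved:
            raise PermissionError(f"skill '{name}' lacks governance approval")
        self._skills[name] = fn

    def invoke(self, name, **kwargs):
        if name not in self._skills:
            return {"error": f"no skill named '{name}'"}
        return self._skills[name](**kwargs)

registry = SkillRegistry()

# A hypothetical oral-cancer screening model exposed as one skill among many.
def oral_screening(image_path):
    return {"risk_score": 0.12, "source": image_path}  # stubbed result

registry.register("oral_screening", oral_screening, approved=True)

# Mid-conversation, the orchestrator invokes it without leaving the flow.
result = registry.invoke("oral_screening", image_path="visit_photo.jpg")
print(result["risk_score"])  # 0.12
```

The registration gate is the governance hook: a capability that hasn't been reviewed simply cannot be invoked, which is cheaper to enforce than policy after the fact.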

That's the platform: a voice-first, agentic layer that unifies fragmented AI tools into a single, coherent experience.

Trust by Design

None of this works without trust—and trust has to be earned from three different directions.

Administrators need to trust that the system follows protocol, doesn't hallucinate, and leaves an audit trail. So agents speak only from verified sources—NHM protocols, ICMR guidelines, program-specific SOPs. Every factual claim traces back to a source document. When confidence is low, the system says so and escalates. Every action gets logged with timestamp, user, and context.

Health workers need to trust that the system helps them, not surveils them. So nothing happens without their explicit confirmation. The system fills the form, but the human submits it. Everything is reversible until they say yes. They stay in control.

People—the families being served—need to trust that their data is protected. So ambient listening happens on-device; recordings never leave the phone. PII gets detected and redacted before anything is transmitted. Consent flows align with DPDP, ABDM, and NDHM norms. Role-based access ensures health workers see only what they need, supervisors see aggregates.
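Those guarantees can be sketched in code: redact before transmit, confirm before submit, log every action. The regexes and record shapes below are simplified illustrations, not a production PII pipeline:

```python
import re
from datetime import datetime, timezone

# Sketch of the trust mechanics. The regexes and record shapes are
# simplified illustrations, not a production PII or audit system.

AUDIT_LOG = []

def redact(text):
    """Strip phone-number-like and ID-like tokens before anything leaves the device."""
    text = re.sub(r"\b\d{10}\b", "[PHONE]", text)
    text = re.sub(r"\b\d{4}\s?\d{4}\s?\d{4}\b", "[ID]", text)
    return text

def submit_form(form, confirmed_by_worker):
    """Nothing is submitted without explicit worker confirmation."""
    if not confirmed_by_worker:
        return {"status": "held", "note": "awaiting worker confirmation"}
    AUDIT_LOG.append({
        "action": "submit_form",
        "form": form["name"],
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return {"status": "submitted"}

print(redact("Patient reachable at 9876543210"))  # Patient reachable at [PHONE]
print(submit_form({"name": "ANC register"}, confirmed_by_worker=False)["status"])  # held
print(submit_form({"name": "ANC register"}, confirmed_by_worker=True)["status"])   # submitted
```

Note the default: an unconfirmed submission is held, not executed. Reversibility is the baseline, and the audit trail only records what actually happened.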

This is the DPI philosophy applied to AI: build trust into the architecture itself, not as a policy layer on top. Governance by technical design.


The Invitation

I'm building the interface that India's health workers deserve—one that collapses ten screens into one conversation, speaks their language, follows their protocol, and gives them back the hours currently lost to paperwork.

I've spent years in this space. I've built AI that handles noisy audio, dialectal variation, intermittent connectivity, and users who've never interacted with AI before. I've learned the gap between "technically possible" and "actually works at 2 PM on a field visit in rural UP." The hard problems are the last mile—integrating with government systems, building trust with users, navigating operational realities that no dataset captures. This is what I find meaningful.

If we get the interface right, every future AI innovation in health flows through it. Better screening models ship? The worker gets access through the same conversation. New diagnostic tools emerge? Same interface. Telehealth improves? Same voice, same experience. The ecosystem advances, and for once, the worker's cognitive load doesn't increase—it stays the same while their capabilities expand.

As AI scales exponentially, health worker capability scales with it. They ride the wave instead of drowning under it. The inequity gap closes at the same rate the technology improves.