Your Patients' Notes Never Touch a Cloud
"Had very incorrect information about the client. The note indicated past child sexual abuse and a medical condition, neither of which this client has ever experienced." — Therapist on Alma's AI-generated SOAP notes, December 2024
The Story
You're a therapist seeing six patients a day. After each session, you sit down to write clinical notes. Thirty to sixty minutes of documentation for every hour of therapy. It's the part of the job no one talks about — the part that leads to burnout, late nights, and weekends spent catching up on paperwork instead of resting.
AI tools exist for this. Freed, Nabla, Mentalyc — they'll record your session, transcribe it, and generate a SOAP note. The pitch is compelling. The problem is where the audio goes. Every one of these tools sends your patient's most intimate disclosures to cloud servers you don't control, processed by models you can't audit, stored under retention policies you didn't write.
Seventy percent of therapy clients don't know their sessions may be AI-transcribed. The ones who find out don't take it well. And the tools themselves aren't reliable — Alma's Note Assist launched in late 2024 and immediately started fabricating clinical details, inserting allegations of childhood sexual abuse into notes for patients who had never disclosed anything of the kind. HIPAA penalties start at $141 per violation and climb to annual caps above $2.1 million per violation category. A single hallucinated note could end a career.
You need the efficiency. You can't afford the risk.
The Idea
After each session, you type a brief summary — three or four sentences of what happened, what you observed, what you're thinking. A local LLM running on Ollama generates a complete SOAP note. No audio recording. No cloud transmission. No third-party data processor between you and your patient's most sensitive information.
Session with R.T. — reported increased anxiety since workplace restructuring. Conflict with new manager. Affect congruent with reported mood, speech rate elevated. Discussed cognitive distortions around job security. Assigned thought record homework.
SUBJECTIVE: Patient reports increased anxiety over the past two weeks, coinciding with organisational restructuring at work. Describes ongoing conflict with new direct manager regarding role expectations. Reports difficulty sleeping (sleep-onset insomnia, latency ~45 min) and increased irritability at home. Denies suicidal ideation.
OBJECTIVE: Patient alert and oriented x4. Affect anxious, congruent with reported mood. Speech rate mildly elevated. Eye contact appropriate. No psychomotor agitation. Thought process linear and goal-directed.
ASSESSMENT: Adjustment disorder with anxiety (F43.22). Symptoms consistent with situational stressor. No evidence of escalation to generalised anxiety disorder at this time. PHQ-9 not administered this session; previous score 8 (mild).
PLAN: Continue CBT weekly. Introduced cognitive restructuring targeting catastrophic thinking around job loss. Assigned thought record homework (minimum 3 entries before next session). Discussed progressive muscle relaxation as sleep hygiene intervention. Review PHQ-9 and GAD-7 at next session. No medication changes.
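Mechanically, the generation step is a single call to the local model. Here is a minimal sketch in Python, assuming Ollama's default REST endpoint and a pulled llama3 model; the function name and prompt are illustrative, not the product's actual implementation:

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

SYSTEM_PROMPT = (
    "You are a clinical documentation assistant. Expand the therapist's "
    "shorthand into a SOAP note with SUBJECTIVE, OBJECTIVE, ASSESSMENT, and "
    "PLAN sections. Use only facts present in the shorthand and the supplied "
    "patient context. Never invent clinical details."
)

def generate_soap_note(shorthand: str, patient_context: str, model: str = "llama3") -> str:
    """Send the typed summary to the local model; no token leaves the machine."""
    prompt = (
        f"{SYSTEM_PROMPT}\n\n"
        f"Patient context:\n{patient_context}\n\n"
        f"Session shorthand:\n{shorthand}\n\n"
        "SOAP note:"
    )
    resp = requests.post(
        OLLAMA_URL,
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]
```

Because the request goes to localhost, pulling the network cable changes nothing: the note still generates.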
The agent remembers this patient's history. It knows you administered the PHQ-9 two sessions ago and scored an 8. It knows you've been using CBT, not DBT. It adjusts the note format to your preferences — SOAP, DAP, BIRP, or whatever your practice requires. If you always include a risk assessment section, it includes one. If you prefer ICD-10 codes over DSM-5 labels, it uses those.
Between sessions, the agent becomes a clinical assistant.
Pull up my notes on the Thompson case
R. Thompson — 14 sessions since intake (September 2025). Primary dx: Major Depressive Disorder, recurrent, moderate (F33.1). Current treatment: CBT + sertraline 100mg (prescribed by Dr. Patel, last med check November). PHQ-9 trend: 16 → 12 → 8 → 11 (uptick last session, correlated with holiday family conflict). Last session focused on schema work around abandonment. Homework compliance has been inconsistent the past three weeks.
Everything lives in PostgreSQL on your laptop. The Judge layer monitors every interaction and blocks any attempt to transmit protected health information outside the local system. The LLM runs on Ollama — no API calls, no tokens leaving your machine. Your patients' words stay exactly where they belong: in your care.
What It Looks Like
After a session: Type three or four sentences about what happened. Get a complete, formatted clinical note in your preferred style within seconds. Review, edit if needed, save. Documentation that used to take 30 minutes takes 5.
Between sessions: Ask for a patient summary before their next appointment. The agent retrieves intake notes, treatment history, medication changes, homework compliance, and assessment score trends. You walk into the session prepared instead of scrambling through files.
Summarise my caseload trends this week
This week: 28 sessions across 24 clients. Three cancellations (two same-day). Average PHQ-9 delta across active clients: -1.2 points over past 4 weeks. Two clients flagged for review — M.K. reported passive SI for the second consecutive session, and J.L.'s GAD-7 increased from 9 to 14. Homework completion rate: 62%, down from 71% last month. Insurance documentation pending for 4 clients (Blue Cross x2, Aetna, United).
For insurance: "Generate a treatment summary for Blue Cross, sessions 1 through 12, R. Thompson." The agent produces a formatted document with dates, diagnoses, treatment modalities, progress measures, and medical necessity justification — all from data that already exists in your local notes.
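Under the hood, summaries like these are ordinary aggregate queries against the local database. A sketch assuming psycopg and hypothetical sessions and assessments tables; the real schema is the product's own:

```python
import psycopg  # psycopg 3

# Hypothetical local schema: sessions(client_id, held_at, status) and
# assessments(client_id, instrument, score, taken_at).
WEEKLY_TRENDS = """
    SELECT count(*) FILTER (WHERE status = 'held')      AS sessions_held,
           count(DISTINCT client_id)                    AS clients_seen,
           count(*) FILTER (WHERE status = 'cancelled') AS cancellations
    FROM sessions
    WHERE held_at >= now() - interval '7 days';
"""

# Average PHQ-9 change per client: latest score minus the earliest score
# recorded within the last four weeks.
PHQ9_DELTA = """
    SELECT avg(latest.score - earliest.score)
    FROM (SELECT DISTINCT ON (client_id) client_id, score
          FROM assessments
          WHERE instrument = 'PHQ-9'
          ORDER BY client_id, taken_at DESC) AS latest
    JOIN (SELECT DISTINCT ON (client_id) client_id, score
          FROM assessments
          WHERE instrument = 'PHQ-9'
            AND taken_at >= now() - interval '4 weeks'
          ORDER BY client_id, taken_at ASC) AS earliest USING (client_id);
"""

with psycopg.connect("dbname=practice") as conn:
    held, clients, cancelled = conn.execute(WEEKLY_TRENDS).fetchone()
    (avg_delta,) = conn.execute(PHQ9_DELTA).fetchone()
    delta = f"{avg_delta:+.1f}" if avg_delta is not None else "n/a"
    print(f"{held} sessions across {clients} clients, {cancelled} cancellations, "
          f"avg PHQ-9 delta {delta} over 4 weeks")
```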
How It Works
- Ollama — Runs a local LLM (Llama 3, Mistral, or similar) entirely on your hardware. No API calls. No cloud. Inference happens on your CPU or GPU. Clinical language understanding is strong enough for note generation without needing the largest frontier models.
- Memory — Three-tier memory stores patient history, treatment plans, assessment scores, and your clinical preferences. When you mention a patient, the agent retrieves their full context via hybrid search (BM25 + vector similarity; a retrieval sketch follows this list). It knows the difference between your Thompson and your Thomson.
- Judge — Evaluates every agent action for PHI risk. Note generation is local and low-risk — approved automatically. Any action that would transmit patient data externally (even accidentally, through a search query or API call) is blocked before execution. The Judge doesn't ask for permission; it refuses.
- PII Redaction — An additional safety layer that scans outbound content for names, dates of birth, addresses, and clinical identifiers. Even if the Judge misses something, the redaction layer catches it. Defence in depth; both layers are sketched together after this list.
- Desktop — The agent runs through the desktop app. No server, no tunnel, no network exposure. Your clinical workspace is a local application on your laptop, the same way your word processor is.
- PostgreSQL — All notes, patient records, and agent memory live in a local PostgreSQL database. You own the data. You control the backups. You decide the retention policy. When a patient requests their records, you export from a database you control (one query, sketched below).
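How might the Memory layer's hybrid retrieval look against PostgreSQL? The sketch below is an assumption-laden illustration, not the shipped implementation: pgvector's cosine-distance operator for the semantic half, Postgres full-text ranking standing in for BM25 on the lexical half, reciprocal rank fusion to merge the two lists, and a locally served Ollama embedding model:

```python
import psycopg
import requests

def embed(text: str) -> list[float]:
    """Embed the query locally via Ollama (assumes a pulled nomic-embed-text model)."""
    resp = requests.post(
        "http://localhost:11434/api/embeddings",
        json={"model": "nomic-embed-text", "prompt": text},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["embedding"]

# Hypothetical schema: notes(id, patient_id, body, body_tsv tsvector,
# embedding vector(768)). ts_rank stands in for BM25; <=> is pgvector's
# cosine-distance operator; 1/(60+rank) is reciprocal rank fusion.
HYBRID = """
    WITH lexical AS (
        SELECT id, row_number() OVER (
                   ORDER BY ts_rank(body_tsv, plainto_tsquery(%(q)s)) DESC) AS r
        FROM notes
        WHERE body_tsv @@ plainto_tsquery(%(q)s)
        ORDER BY r
        LIMIT 20
    ), semantic AS (
        SELECT id, row_number() OVER (ORDER BY embedding <=> %(v)s::vector) AS r
        FROM notes
        ORDER BY r
        LIMIT 20
    )
    SELECT n.id, n.body
    FROM notes n
    JOIN (SELECT id, sum(1.0 / (60 + r)) AS score
          FROM (SELECT * FROM lexical UNION ALL SELECT * FROM semantic) hits
          GROUP BY id) fused USING (id)
    ORDER BY fused.score DESC
    LIMIT 5;
"""

def retrieve_context(query: str) -> list[tuple[int, str]]:
    with psycopg.connect("dbname=practice") as conn:
        return conn.execute(HYBRID, {"q": query, "v": embed(query)}).fetchall()
```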
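The Judge and the redaction layer compose into a gate-then-scrub pipeline. A deliberately reduced sketch: the real Judge is a policy engine rather than a host allowlist, real redaction goes far beyond a date regex and a name list, and every identifier below is illustrative:

```python
import re
from dataclasses import dataclass

LOCAL_HOSTS = {"localhost", "127.0.0.1"}  # the only destinations the Judge tolerates

@dataclass
class Action:
    kind: str         # e.g. "llm_call", "db_query", "web_search"
    target_host: str  # where this action would send data
    payload: str

def judge(action: Action) -> bool:
    """Refuse anything that would move data off the machine. No prompt, no override."""
    return action.target_host in LOCAL_HOSTS

# Layer two: scrub identifiers from any payload before dispatch, using simple
# patterns plus the patient names the local database already knows about.
DATE_LIKE = re.compile(r"\b\d{1,2}[/-]\d{1,2}[/-]\d{2,4}\b")

def redact(text: str, known_names: list[str]) -> str:
    text = DATE_LIKE.sub("[REDACTED-DATE]", text)
    for name in known_names:
        text = re.sub(re.escape(name), "[REDACTED-NAME]", text, flags=re.IGNORECASE)
    return text

def execute(action: Action, known_names: list[str]) -> None:
    if not judge(action):
        raise PermissionError(
            f"blocked: {action.kind} would transmit data to {action.target_host}")
    action.payload = redact(action.payload, known_names)
    # ...dispatch the local, scrubbed action here...
```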
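And the records-request export is a single client-side query. A sketch with illustrative table and column names, assuming a local database named practice and psql on your PATH:

```python
import subprocess

# \copy runs client-side, so the CSV lands next to wherever you run this.
# patient_id = 42 and the filename are placeholders for illustration.
subprocess.run(
    ["psql", "practice", "-c",
     "\\copy (SELECT created_at, note_type, body FROM notes "
     "WHERE patient_id = 42 ORDER BY created_at) "
     "TO 'thompson_records.csv' WITH CSV HEADER"],
    check=True,
)
```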
What Breaks Without This
Cloud AI note tools are a compliance time bomb. They work until they don't — and when they don't, the consequences are career-ending. The value proposition is "save time on documentation." The hidden cost is transferring your patients' most sensitive disclosures to infrastructure you cannot verify, audit, or control.
Consider what's already happened:
- Serviceaide breach (2024): 483,000 patient records exposed because a cloud-based healthcare AI tool misconfigured its data storage. Names, diagnoses, treatment histories — all publicly accessible.
- Alma Note Assist: Fabricated clinical details including allegations of childhood sexual abuse. Therapists discovered the hallucinations only because they reviewed the generated notes carefully. How many didn't?
- HIPAA doesn't address AI training: Current regulations were written before LLMs existed. There is no explicit prohibition on using patient data submitted to AI tools for model training. "De-identified" data under HIPAA's Safe Harbor provision can be re-identified with as few as three data points.
- Data retention varies wildly: Cloud AI tools retain your patients' data for anywhere from 15 days to 10 years. Some don't specify. Some change their policies without notice. You agreed to the terms of service; your patients didn't.
- Re-identification risk: Research consistently demonstrates that de-identified clinical data can be re-identified by cross-referencing with public records. The combination of diagnosis, approximate age, and geographic region is often sufficient.
Local processing eliminates the entire category of risk. There is no breach vector for data that never leaves the machine. There is no retention policy debate for data you control. There is no re-identification risk for data that was never de-identified and transmitted in the first place.