US v. Heppner: Why Law Firms Need Private AI Now
Short answer: Judge Rakoff ruled that AI-generated documents aren't protected by attorney-client privilege.
On February 10, 2026, Judge Jed Rakoff of the Southern District of New York ruled that documents generated through Claude AI are not protected by attorney-client privilege. For law firms using ChatGPT, Claude, or other cloud AI tools with client data, the implications are immediate and serious.
What Happened in US v. Heppner
In United States v. Heppner, 25 Cr. 503 (S.D.N.Y.), the government moved to compel production of documents the defendant had created using Anthropic's Claude AI. The defendant claimed attorney-client privilege and work product protection.
Judge Rakoff rejected both claims. His reasoning was straightforward:
- The AI is not an attorney. No privilege attaches to communications with a non-attorney third party.
- No expectation of confidentiality. Anthropic's privacy policy expressly permits collection of prompts and outputs, use for model training, and disclosure to governmental authorities.
- Retroactive privilege fails. Sending pre-existing non-privileged documents to counsel after the fact doesn't make them privileged.
- Work product doesn't apply. Self-directed AI research isn't protected unless an attorney directed its use.
"The defendant voluntarily shared information with a platform whose own terms of service allow government access. This waiver cannot be undone." — Judge Jed Rakoff, US v. Heppner
Why This Matters for Every Law Firm
The ruling's logic extends far beyond criminal cases. Consider the standard privacy policies of major AI providers:
- OpenAI (ChatGPT): Retains prompts and outputs for 30 days minimum. May use data for model improvement unless you opt out via API.
- Anthropic (Claude): Collects conversation data. May comply with legal process and government requests.
- Google (Gemini): Human reviewers may read conversations. Data used to improve services.
- Microsoft Copilot: Enterprise agreements vary, but consumer versions retain data.
⚠️ The Core Problem
When you paste client information into a cloud AI service, you are transmitting confidential attorney-client communications to a third party whose terms of service permit disclosure to the government. Under Heppner, this may constitute waiver.
This isn't about whether AI providers will disclose your data. It's about whether their policies permit disclosure—and whether that permission destroys the reasonable expectation of confidentiality that privilege requires.
The Solution: Air-Gapped Local AI
The only way to use AI with privileged client data while maintaining confidentiality is to ensure the data never leaves your control. This means running AI models locally—on hardware you own, within your network, with no external data transmission.
This is no longer a "nice to have" for security-conscious firms. After Heppner, it's a compliance requirement.
What "Local AI" Actually Means
- The model runs on your hardware — a server, workstation, or even a laptop
- No API calls to external services — queries never leave your network
- No telemetry or logging to third parties — you control all data retention
- No training on your data — the model doesn't learn from your prompts
How to Set Up a Private AI Server
Here's a practical guide to deploying local AI that never phones home. This setup is appropriate for small to mid-sized firms and can be operational in an afternoon.
Option 1: Mac Mini / Mac Studio (Easiest)
Apple Silicon Macs are surprisingly capable for local AI inference. A Mac Mini M4 Pro with 24GB RAM can run capable models entirely offline.
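# Install Ollama (local AI runtime)
# Note: this script has historically targeted Linux. If it refuses to run on macOS,
# use the Ollama app download or Homebrew instead: brew install ollama
curl -fsSL https://ollama.com/install.sh | sh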
# Pull a capable model (runs entirely local)
ollama pull llama3.1:8b
# Or for better legal reasoning:
ollama pull qwen2.5:14b
# Verify it's running locally
curl http://localhost:11434/api/tags
Cost: ~$1,500 for Mac Mini M4 Pro (24GB). No ongoing API fees.
Performance: 20-40 tokens/second depending on model size. Adequate for document review, summarization, drafting assistance.
Option 2: Dedicated Linux Server (More Powerful)
For firms needing faster inference or larger models, a dedicated server with GPU acceleration is ideal.
# On Ubuntu 22.04+ server
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
# For GPU acceleration (NVIDIA), make sure the NVIDIA driver is installed and `nvidia-smi` works
# Then pull larger models:
ollama pull qwen2.5:32b
ollama pull llama3.1:70b # Requires ~48GB VRAM or CPU offload
# Run as a service
sudo systemctl enable ollama
sudo systemctl start ollama
Recommended Hardware:
- CPU: Any modern 8+ core processor
- RAM: 64GB minimum for larger models
- GPU (optional): NVIDIA RTX 4090 (24GB) or RTX 3090 (24GB) for up to roughly 10x faster inference than CPU-only, depending on the model
- Storage: 500GB SSD for models
Cost: $3,000-8,000 depending on GPU. No ongoing API fees.
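To sanity-check which models fit on which hardware, a rough rule of thumb is: weight size ≈ parameters × bits per weight ÷ 8, plus some headroom for context and runtime overhead. The sketch below assumes 4-bit quantized weights and 20% headroom. Both numbers are assumptions for illustration, not vendor figures, and `estimate_model_gb` is a hypothetical helper name.

```shell
# estimate_model_gb PARAMS_BILLIONS BITS_PER_WEIGHT
# Rough memory estimate in GB (assumption: weights plus 20% overhead).
#   weights_gb = params_billions * bits / 8
#   total_gb   = weights_gb * 1.2
estimate_model_gb() {
  params_b=$1
  bits=$2
  # Integer arithmetic in tenths of a GB: params * bits / 8 * 1.2 * 10
  tenths=$(( params_b * bits * 12 / 8 ))
  printf '%d.%d\n' $(( tenths / 10 )) $(( tenths % 10 ))
}

# Example:
#   estimate_model_gb 14 4   # 14B model at 4 bits
#   estimate_model_gb 32 4   # 32B model at 4 bits
#   estimate_model_gb 70 4   # 70B model at 4 bits
```

Under these assumptions a 32B model comes out near 19 GB, which fits on a single 24GB card, while a 70B model comes out near 42 GB, which does not. That is consistent with the note above that the 70B model needs about 48GB of VRAM or CPU offload.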
Option 3: Air-Gapped Workstation (Maximum Security)
For matters requiring the highest confidentiality—trade secrets, national security, sensitive M&A—consider a fully air-gapped machine:
- Set up a dedicated workstation with no network connection
- Install Ollama and models via USB transfer
- Transfer documents to/from the machine via encrypted USB only
- No WiFi, no Ethernet, no Bluetooth
This is extreme, but for certain practice areas, it may be appropriate.
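When models and documents move by USB, it helps to confirm that what arrives on the air-gapped machine is byte-for-byte what left the source machine. The sketch below writes a SHA-256 manifest before the transfer and checks it afterwards. The function names and the `MANIFEST.sha256` file name are hypothetical, and it only covers files at the top level of one directory.

```shell
# sha256_of FILE: print the SHA-256 hex digest using whichever tool is installed
# (sha256sum on most Linux systems, shasum on macOS).
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{print $1}'
  else
    shasum -a 256 "$1" | awk '{print $1}'
  fi
}

# write_manifest DIR: record "digest  name" for each regular file in DIR.
write_manifest() {
  ( cd "$1" || exit 1
    for f in *; do
      if [ -f "$f" ] && [ "$f" != MANIFEST.sha256 ]; then
        printf '%s  %s\n' "$(sha256_of "$f")" "$f"
      fi
    done > MANIFEST.sha256 )
}

# verify_manifest DIR: fail on the first file whose digest does not match.
verify_manifest() {
  ( cd "$1" || exit 1
    while read -r want name; do
      if [ "$(sha256_of "$name")" != "$want" ]; then
        echo "MISMATCH: $name" >&2
        exit 1
      fi
    done < MANIFEST.sha256 )
}

# Example (paths are placeholders):
#   write_manifest /media/usb/models      # on the source machine
#   verify_manifest /media/usb/models     # on the air-gapped machine
```

Keeping the manifest alongside your transfer log also gives you a record of exactly which model files were installed.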
Connecting Your AI to Legal Workflows
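Once Ollama is running, you can access it from other applications on your network. By default Ollama listens only on 127.0.0.1, so for access from other machines the server must first be told to listen on its LAN address through the OLLAMA_HOST environment variable.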
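On the Linux server from Option 2, where Ollama runs under systemd, the listen address is controlled by the `OLLAMA_HOST` environment variable, which can be set in a drop-in file for the service. The sketch below writes that file. The helper name `write_ollama_override` and the address `192.168.1.20` are placeholders, and binding to the server's private address rather than `0.0.0.0` is a cautious choice made here, not a requirement.

```shell
# write_ollama_override ADDR [DROPIN_DIR]
# Write a systemd drop-in that sets the address Ollama listens on.
# DROPIN_DIR defaults to the standard drop-in location for ollama.service.
write_ollama_override() {
  addr=$1
  dropin_dir=${2:-/etc/systemd/system/ollama.service.d}
  mkdir -p "$dropin_dir" &&
    printf '[Service]\nEnvironment="OLLAMA_HOST=%s"\n' "$addr" \
      > "$dropin_dir/override.conf"
}

# Example (run as root, then reload and restart the service):
#   write_ollama_override 192.168.1.20:11434
#   systemctl daemon-reload
#   systemctl restart ollama
```

After the restart, the server should answer on its LAN address from other machines, and the firewall should limit which of those machines may connect.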
Direct API Access
# Query your local AI from any machine on your network
curl http://your-server:11434/api/generate -d '{
"model": "qwen2.5:14b",
"prompt": "Summarize the key obligations in Section 4.2 of this agreement: [paste text]",
"stream": false
}'
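Pasting contract text directly into a hand-written JSON body tends to break as soon as the text contains double quotes or line breaks. The sketch below builds the request body from a file instead. `json_escape` and `build_payload` are hypothetical helpers, and the escaping covers only backslashes, double quotes, tabs, carriage returns, and newlines; other control characters are not handled, and the model name is assumed to need no escaping.

```shell
# json_escape: read text on stdin and print it escaped for use
# inside a JSON string (limited escaping, see the note above).
json_escape() {
  tab=$(printf '\t')
  cr=$(printf '\r')
  sed -e 's/\\/\\\\/g' \
      -e 's/"/\\"/g' \
      -e "s/${tab}/\\\\t/g" \
      -e "s/${cr}\$//" |
    awk 'NR > 1 { printf "\\n" } { printf "%s", $0 }'
}

# build_payload MODEL INSTRUCTION FILE
# Print a request body for /api/generate with the file text appended
# to the instruction.
build_payload() {
  model=$1
  instruction=$2
  file=$3
  body=$( { printf '%s\n\n' "$instruction"; cat "$file"; } | json_escape )
  printf '{"model": "%s", "prompt": "%s", "stream": false}\n' "$model" "$body"
}

# Example (agreement.txt is a placeholder file name):
#   build_payload "qwen2.5:14b" "Summarize the key obligations:" agreement.txt |
#     curl -s http://your-server:11434/api/generate -d @-
```

Reading the document from a file also avoids leaving client text in your shell history.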
Web Interface (Open WebUI)
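# Install a ChatGPT-like interface for your team
# (pulling the image needs internet access; on an isolated server, load it from a saved archive instead)
docker run -d -p 3000:8080 \
  -e OLLAMA_BASE_URL=http://your-server:11434 \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main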
This gives your attorneys a familiar chat interface that connects to your private AI server.
Microsoft Word Integration
For attorneys who live in Word, add-ins can connect to your local Ollama instance. The AI assists with drafting and review without data ever leaving your network.
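Key Point: Requests to http://localhost:11434 (or your server's private IP) are handled on the machine or within the office network, not sent over the public internet. As long as the server itself has no outbound route (see the checklist below), your queries, documents, and AI responses stay within your office network.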
Compliance Checklist
Before deploying local AI for privileged work, verify:
- AI server has no route to public internet (firewall rules)
- No cloud sync services running on AI server (Dropbox, OneDrive, iCloud)
- Telemetry disabled in all software
- Models downloaded on a separate machine and moved over by secure transfer, not pulled live on the AI server
- Access logs enabled for audit trail
- Physical security appropriate to data sensitivity
- Written policy documenting AI use procedures
- Staff training on what data can/cannot be processed
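Two of the checklist items can be spot-checked from the command line. The sketch below looks for a default route and for running cloud sync clients. It assumes a Linux server with the `ip` command available, the function names are hypothetical, and the list of process names is illustrative rather than complete. A missing default route is a useful signal but not proof of isolation, so firewall rules still need their own review.

```shell
# has_default_route: read a routing table on stdin (as printed by
# `ip route show`) and succeed if it contains a default route.
has_default_route() {
  grep -Eq '^default( |$)'
}

# find_sync_clients: read process names on stdin, one per line, and
# print any that look like cloud sync clients (illustrative list).
find_sync_clients() {
  grep -Ei 'dropbox|onedrive|icloud' || true
}

# Example on the AI server:
#   if ip route show | has_default_route; then
#     echo "WARNING: server has a default route"
#   fi
#   ps -A -o comm= | find_sync_clients
```

Running checks like these on a schedule and keeping the output gives you dated evidence for the written policy item on the list.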
What About Enterprise AI Agreements?
Some firms use enterprise agreements with OpenAI, Microsoft, or Anthropic that include Zero Data Retention (ZDR) clauses. These agreements state that the provider won't retain or train on your data.
After Heppner, the question is whether a ZDR agreement is sufficient to preserve privilege. The ruling focused on two factors:
- Data transmission to a third party
- The third party's ability (per their policies) to disclose to government
A ZDR agreement addresses retention and training, but may not address the provider's obligation to comply with subpoenas or government requests. Until this is tested in court, local AI remains the conservative choice for privileged work.
The Bottom Line
US v. Heppner draws a clear line: if you transmit privileged information to a cloud AI provider whose policies permit government disclosure, you may have waived privilege.
For law firms, the path forward is straightforward:
- Stop using consumer AI (ChatGPT, Claude web) for anything involving client data
- Deploy local AI infrastructure that keeps data within your control
- Document your procedures to demonstrate reasonable confidentiality measures
- Train your team on what's appropriate for AI assistance
The technology exists. The models are capable. The only question is whether your firm will adapt before a privilege challenge forces the issue.
Need Help Setting Up Private AI?
We help law firms deploy secure, compliant AI infrastructure. From hardware selection to Word integration, we handle the technical work so you can focus on practicing law.