Building a Compassionate Local AI for Antara Pathways
I wanted Antara Pathways, Inc. to have more than a generic support widget. I wanted a
trauma-aware, self-hosted assistant that could stay on-topic, protect privacy, and serve
families moving through grief and end-of-life transitions. To achieve this, I built a chatbot powered by
a local LLM running on Ollama, orchestrated with FastAPI, systemd, and a custom WordPress widget.
Below is an overview of how I designed and implemented the Antara Pathways chatbot – written both to
document the architecture and to demonstrate the kind of engineering and product thinking I bring
to my work.
High-Level Architecture
- Local LLM via Ollama: A dedicated antara model runs on my Ubuntu server through Ollama’s HTTP API.
- FastAPI Backend: A Python FastAPI app exposes POST /api/antara-chat to handle routing, safety, and LLM calls.
- Sitemap-Aware Prompting: A weekly cron job caches WordPress sitemaps into JSON so the model can reference real Antara pages (a sketch of that job follows this list).
- Systemd Service: Uvicorn runs as a persistent systemd service that starts on boot and survives SSH disconnects.
- WordPress Chat Widget: A lightweight HTML/CSS/JS widget in footer.php provides the chat UI on every page.
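To make the sitemap-aware piece concrete, here is a minimal sketch of the kind of script the weekly cron entry could run. The sitemap URL, cache path, and function name are placeholders for illustration, and it assumes a flat urlset rather than WordPress’s index-style wp-sitemap.xml (which would need one more level of fetching).

import json
import xml.etree.ElementTree as ET

import requests

SITEMAP_URL = "https://example.org/sitemap.xml"     # placeholder, not the real Antara URL
CACHE_PATH = "/opt/antara-chat/sitemap_cache.json"  # cache location assumed for illustration

def refresh_sitemap_cache() -> None:
    """Download the WordPress sitemap and cache its page URLs as JSON."""
    xml = requests.get(SITEMAP_URL, timeout=15).text
    root = ET.fromstring(xml)
    ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
    urls = [loc.text for loc in root.findall(".//sm:loc", ns) if loc.text]
    with open(CACHE_PATH, "w") as f:
        json.dump(urls, f, indent=2)

if __name__ == "__main__":
    refresh_sitemap_cache()

A weekly crontab entry pointing at this script keeps the cache fresh without touching WordPress itself.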
Ollama & Local Model
I chose Ollama so all inference happens locally, which keeps conversations private and keeps the model firmly within Antara’s domain. The antara model is prompted to stay gentle, non-clinical, and on topic.
The FastAPI backend calls Ollama’s HTTP endpoint like this:
import requests

# Call Ollama's local HTTP API (this deployment listens on port 8080 rather
# than Ollama's default 11434) and wait for the complete, non-streamed reply.
resp = requests.post(
    "http://127.0.0.1:8080/api/generate",
    json={"model": "antara", "prompt": prompt, "stream": False},
    timeout=25,
)
This lets me treat the LLM as a controlled “brain” for anything that falls outside the deterministic rules.
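For illustration, the prompt passed to that call could be assembled from the cached sitemap along these lines; the cache path and the exact wording are assumptions, not the production prompt.

import json

SITEMAP_CACHE = "/opt/antara-chat/sitemap_cache.json"  # cache path assumed for illustration

def build_prompt(question: str) -> str:
    """Wrap a visitor question in a sitemap-scoped, tone-setting prompt."""
    try:
        with open(SITEMAP_CACHE) as f:
            urls = json.load(f)  # list of page URLs written by the weekly cron job
    except OSError:
        urls = []

    page_list = "\n".join(f"- {u}" for u in urls[:40])  # keep the prompt small
    return (
        "You are Antara Pathways' gentle, non-clinical website guide. "
        "Stay on topic and only reference these real pages:\n"
        f"{page_list}\n\n"
        f"Visitor: {question}\nAssistant:"
    )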
FastAPI Backend & Routing
The backend lives in /opt/antara-chat/app.py and exposes a simple /api/antara-chat endpoint using
FastAPI and Pydantic. It performs:
- keyword-trigger matching (services, board, founder, hours, etc.)
- multi-question splitting for complex queries
- ambiguity checks (“president,” “where are you,” etc.)
- safety rules (no clinical, crisis, or diagnostic replies)
- sitemap-scoped responses only
- fallback to the LLM under controlled constraints (a condensed sketch follows this list)
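Here is a condensed, illustrative sketch of that routing shape, assuming the request body carries a single message field; the keyword table and canned replies are placeholders rather than the production rules, and the safety and ambiguity checks are omitted for brevity.

import requests
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Illustrative trigger table; the production keyword rules are broader.
KEYWORD_REPLIES = {
    "services": "Antara's services are described on the Services page.",
    "hours": "Our current hours are listed on the Contact page.",
}

class ChatRequest(BaseModel):
    message: str

class ChatResponse(BaseModel):
    reply: str

@app.post("/api/antara-chat", response_model=ChatResponse)
def antara_chat(req: ChatRequest) -> ChatResponse:
    text = req.message.lower()

    # 1. Deterministic keyword triggers answer common questions without the LLM.
    for keyword, reply in KEYWORD_REPLIES.items():
        if keyword in text:
            return ChatResponse(reply=reply)

    # 2. Everything else goes to the local model; in the real app the question is
    #    first wrapped in the sitemap-scoped prompt sketched earlier.
    try:
        resp = requests.post(
            "http://127.0.0.1:8080/api/generate",
            json={"model": "antara", "prompt": req.message, "stream": False},
            timeout=25,
        )
        resp.raise_for_status()
        return ChatResponse(reply=resp.json().get("response", "").strip())
    except requests.RequestException:
        return ChatResponse(
            reply="I'm having trouble answering right now. "
                  "Please contact Antara Pathways directly."
        )

Keeping the deterministic checks ahead of the LLM call is what keeps common questions fast and the model’s surface area small.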
Deployment with systemd
I configured Uvicorn to run as a persistent Linux service. This allows the chatbot to stay online
24/7 — even if I log out or the server reboots.
[Unit]
Description=Antara Chat FastAPI service
[Service]
ExecStart=/opt/antara-chat/venv/bin/uvicorn app:app --host 0.0.0.0 --port 9000
WorkingDirectory=/opt/antara-chat
Restart=always
[Install]
WantedBy=multi-user.target
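Assuming the unit file is saved as /etc/systemd/system/antara-chat.service (the filename is my choice here, not prescribed above), a sudo systemctl daemon-reload followed by sudo systemctl enable --now antara-chat registers the service to start at boot and brings it up immediately.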
Once enabled, systemd ensures the chatbot is always available to the public.
WordPress Chat Widget
The frontend chat bubble is a custom HTML/CSS/JS widget added to footer.php so it appears on every page.
It:
- autoscrolls to newest messages
- supports clickable URLs
- handles enter key submission
- elegantly toggles open/close with animation
- calls the FastAPI backend for responses
This keeps everything lightweight, fast, and on-brand.
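One practical detail: because the widget runs in the visitor’s browser on the WordPress origin while the FastAPI service listens on its own port, the backend has to accept cross-origin requests unless it is proxied through the main web server. A minimal sketch using FastAPI’s CORSMiddleware, with the site URL as a placeholder:

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

# Allow the WordPress front end (origin is a placeholder, not the real site URL)
# to POST to the chat endpoint from the browser.
app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://example.org"],
    allow_methods=["POST"],
    allow_headers=["*"],
)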
Why This Project Matters
This project merges my engineering background with my calling to support people through grief and end-of-life transitions.
By building a local, privacy-minded, compassionate AI, I created a tool that supports Antara Pathways’ mission while also demonstrating my technical ability across:
DevOps • Python • FastAPI • System architecture • Prompt engineering • WordPress development • Security • AI model design