Building a Compassionate Local AI for Antara Pathways
I wanted Antara Pathways, Inc. to have more than a generic support widget. I wanted a
trauma-aware, self-hosted assistant that could stay on-topic, protect privacy, and serve
families moving through grief and end-of-life transitions. To achieve this, I built a chatbot powered by
a local LLM running on Ollama, orchestrated with FastAPI, systemd, and a custom WordPress widget.
Below is an overview of how I designed and implemented the Antara Pathways chatbot – written both to
document the architecture and to demonstrate the kind of engineering and product thinking I bring
to my work.
High-Level Architecture
- Local LLM via Ollama: A dedicated antara model runs on my Ubuntu server through Ollama’s HTTP API.
- FastAPI Backend: A Python FastAPI app exposes POST /api/antara-chat to handle routing, safety, and LLM calls.
- Sitemap-Aware Prompting: A weekly cron job caches WordPress sitemaps into JSON so the model can reference real Antara pages (a sketch of that job follows this list).
- Systemd Service: Uvicorn runs as a persistent systemd service that starts on boot and survives SSH disconnects.
- WordPress Chat Widget: A lightweight HTML/CSS/JS widget in footer.php provides the chat UI on every page.
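To make the sitemap-aware piece concrete, here is a minimal sketch of the kind of script the weekly cron entry could run. The sitemap URL, cache path, and function name are placeholders for illustration, and it assumes a flat urlset rather than WordPress’s index-style wp-sitemap.xml (which would need one more level of fetching).

import json
import xml.etree.ElementTree as ET

import requests

SITEMAP_URL = "https://example.org/sitemap.xml"     # placeholder, not the real Antara URL
CACHE_PATH = "/opt/antara-chat/sitemap_cache.json"  # cache location assumed for illustration

def refresh_sitemap_cache() -> None:
    """Download the WordPress sitemap and cache its page URLs as JSON."""
    xml = requests.get(SITEMAP_URL, timeout=15).text
    root = ET.fromstring(xml)
    ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
    urls = [loc.text for loc in root.findall(".//sm:loc", ns) if loc.text]
    with open(CACHE_PATH, "w") as f:
        json.dump(urls, f, indent=2)

if __name__ == "__main__":
    refresh_sitemap_cache()

A weekly crontab entry pointing at this script keeps the cache fresh without touching WordPress itself.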
Ollama & Local Model
I chose Ollama so all inference happens locally, which keeps conversations private and keeps the model firmly within Antara’s domain. The antara model is prompted to stay gentle, non-clinical, and on topic.
The FastAPI backend calls Ollama’s HTTP endpoint like this:
import requests

# Call Ollama's local HTTP API (this deployment listens on port 8080 rather
# than Ollama's default 11434) and wait for the complete, non-streamed reply.
resp = requests.post(
    "http://127.0.0.1:8080/api/generate",
    json={"model": "antara", "prompt": prompt, "stream": False},
    timeout=25,
)
This lets me treat the LLM as a controlled “brain” for anything that falls outside the deterministic rules.
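For illustration, the prompt passed to that call could be assembled from the cached sitemap along these lines; the cache path and the exact wording are assumptions, not the production prompt.

import json

SITEMAP_CACHE = "/opt/antara-chat/sitemap_cache.json"  # cache path assumed for illustration

def build_prompt(question: str) -> str:
    """Wrap a visitor question in a sitemap-scoped, tone-setting prompt."""
    try:
        with open(SITEMAP_CACHE) as f:
            urls = json.load(f)  # list of page URLs written by the weekly cron job
    except OSError:
        urls = []

    page_list = "\n".join(f"- {u}" for u in urls[:40])  # keep the prompt small
    return (
        "You are Antara Pathways' gentle, non-clinical website guide. "
        "Stay on topic and only reference these real pages:\n"
        f"{page_list}\n\n"
        f"Visitor: {question}\nAssistant:"
    )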
FastAPI Backend & Routing
The backend lives in /opt/antara-chat/app.py and exposes a simple /api/antara-chat endpoint using
FastAPI and Pydantic. It performs:
- keyword-trigger matching (services, board, founder, hours, etc.)
- multi-question splitting for complex queries
- ambiguity checks (“president,” “where are you,” etc.)
- safety rules (no clinical, crisis, or diagnostic replies)
- sitemap-scoped responses only
- fallback to the LLM under controlled constraints (a condensed sketch follows this list)
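Here is a condensed, illustrative sketch of that routing shape, assuming the request body carries a single message field; the keyword table and canned replies are placeholders rather than the production rules, and the safety and ambiguity checks are omitted for brevity.

import requests
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Illustrative trigger table; the production keyword rules are broader.
KEYWORD_REPLIES = {
    "services": "Antara's services are described on the Services page.",
    "hours": "Our current hours are listed on the Contact page.",
}

class ChatRequest(BaseModel):
    message: str

class ChatResponse(BaseModel):
    reply: str

@app.post("/api/antara-chat", response_model=ChatResponse)
def antara_chat(req: ChatRequest) -> ChatResponse:
    text = req.message.lower()

    # 1. Deterministic keyword triggers answer common questions without the LLM.
    for keyword, reply in KEYWORD_REPLIES.items():
        if keyword in text:
            return ChatResponse(reply=reply)

    # 2. Everything else goes to the local model; in the real app the question is
    #    first wrapped in the sitemap-scoped prompt sketched earlier.
    try:
        resp = requests.post(
            "http://127.0.0.1:8080/api/generate",
            json={"model": "antara", "prompt": req.message, "stream": False},
            timeout=25,
        )
        resp.raise_for_status()
        return ChatResponse(reply=resp.json().get("response", "").strip())
    except requests.RequestException:
        return ChatResponse(
            reply="I'm having trouble answering right now. "
                  "Please contact Antara Pathways directly."
        )

Keeping the deterministic checks ahead of the LLM call is what keeps common questions fast and the model’s surface area small.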
Deployment with systemd
I configured Uvicorn to run as a persistent Linux service. This allows the chatbot to stay online
24/7 — even if I log out or the server reboots.
[Unit]
Description=Antara Chat FastAPI service
[Service]
ExecStart=/opt/antara-chat/venv/bin/uvicorn app:app --host 0.0.0.0 --port 9000
WorkingDirectory=/opt/antara-chat
Restart=always
[Install]
WantedBy=multi-user.target
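Assuming the unit file is saved as /etc/systemd/system/antara-chat.service (the filename is my choice here, not prescribed above), a sudo systemctl daemon-reload followed by sudo systemctl enable --now antara-chat registers the service to start at boot and brings it up immediately.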
Once enabled, systemd ensures the chatbot is always available to the public.
WordPress Chat Widget
The frontend chat bubble is a custom HTML/CSS/JS widget added to footer.php so it appears on every page.
It:
- autoscrolls to newest messages
- supports clickable URLs
- handles enter key submission
- elegantly toggles open/close with animation
- calls the FastAPI backend for responses
This keeps everything lightweight, fast, and on-brand.
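One practical detail: because the widget runs in the visitor’s browser on the WordPress origin while the FastAPI service listens on its own port, the backend has to accept cross-origin requests unless it is proxied through the main web server. A minimal sketch using FastAPI’s CORSMiddleware, with the site URL as a placeholder:

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

# Allow the WordPress front end (origin is a placeholder, not the real site URL)
# to POST to the chat endpoint from the browser.
app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://example.org"],
    allow_methods=["POST"],
    allow_headers=["*"],
)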
Why This Project Matters
This project merges my engineering background with my calling to support people through grief and end-of-life transitions.
By building a local, privacy-minded, compassionate AI, I created a tool that supports Antara Pathways’ mission while also demonstrating my technical ability across:
DevOps • Python • FastAPI • System architecture • Prompt engineering • WordPress development • Security • AI model design