NeuBird Desktop
The Production Ops Agent in your terminal.
NeuBird connects to your telemetry databases and investigates incidents, health, cost, and performance — using natural language.
What NeuBird Does
NeuBird is a terminal-native AI agent for site reliability engineering. It connects to your existing telemetry tools — PagerDuty, Datadog, CloudWatch, Grafana, GitHub, Snowflake, and 30+ more — and lets you ask questions in plain English:
- "What services have the highest error rates right now?"
- "Why did latency spike on api-gateway at 2am?"
- "How much did cloud costs increase this week and what drove it?"
NeuBird queries your telemetry, reasons over results across multiple data sources, and delivers a root cause analysis — complete with evidence, sources, and recommended actions.
Key Features
Predictive analysis — Ask what's likely to page you next. NeuBird identifies degradation trends, capacity cliffs, and silent failures before they become incidents.
Pre-deployment risk assessment — Evaluate code changes before they reach production. NeuBird cross-references PRs with live telemetry to tell you what could break and why.
Agentic investigations — NeuBird doesn't just answer questions. It explores your schema, runs multiple queries, correlates data across sources, and iterates until it finds the answer. You watch the investigation unfold in real time.
Health sweeps — Run /health for a full infrastructure health check. NeuBird scans incidents, alarms, error logs, and recent deployments, then produces a Good/Bad/Ugly summary with recommended actions.
Cost analysis — /cost analyzes cloud spending trends and projects 24-hour costs with breakdowns by service, team, and resource type.
Three agent personas — Switch between investigation styles to match the situation:
| Persona | Model | Best for |
|---|---|---|
| Responder | Claude Haiku | Fast triage, immediate next actions |
| Analyst | Claude Sonnet | Root cause analysis, deep investigation |
| Architect | Claude Opus | Runbooks, design reviews, systemic fixes |
Collapsible tool output — Press Ctrl+O to review every tool call and result from an investigation — even while it's still running. Expand individual calls to inspect full data tables, or collapse them to focus on the analysis. Works mid-investigation: pause to browse results, then press Esc to resume watching.
Local CLI integration — NeuBird automatically detects SRE tools on your machine (kubectl, docker, aws, gcloud, helm, git, terraform, curl, dig, openssl) and makes them available to the AI agent as read-only tools. Ask "how many pods are in the production namespace?" and NeuBird will run kubectl get pods directly. All commands are safety-gated — destructive operations (apply, delete, create, scale) are blocked at the code level.
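The read-only gate described above can be sketched as a simple verb blocklist. This is an illustrative sketch, not NeuBird's actual implementation — the real blocklist and parsing logic may differ; `is_allowed` and the extra verbs (`patch`, `edit`) are assumptions.

```python
# Hypothetical sketch of a read-only gate for local CLI commands.
# The blocked verbs mirror the examples in the text (apply, delete,
# create, scale); patch/edit are added here for illustration only.

import shlex

BLOCKED_VERBS = {"apply", "delete", "create", "scale", "patch", "edit"}

def is_allowed(command: str) -> bool:
    """Return True only if the command contains no mutating verb."""
    tokens = shlex.split(command)
    return not any(tok in BLOCKED_VERBS for tok in tokens)

print(is_allowed("kubectl get pods -n production"))  # True
print(is_allowed("kubectl delete pod payments-0"))   # False
```

A denylist like this is conservative by design: anything that looks like a mutation is rejected before it ever reaches a shell.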
Web search — When an investigation requires context beyond your telemetry — CVE details, vendor status pages, documentation, or recent outage reports — NeuBird can search the web for supporting evidence. Web search is enabled by default and integrates naturally into investigations alongside SQL queries and local CLI tools.
Tool inventory — On startup, NeuBird displays every tool available to the agent — grouped by source (built-in, cloud MCP, local CLI). Run /tools anytime to see the full inventory with capabilities. Example:
🔧 Tool inventory (18 tools)
📦 Built-in (12)
✓ exec_sql Execute SQL queries against the database
✓ list_schemas List database schemas
...
☁️ ACE MCP (4)
✓ external_tool External service queries
...
💻 Local CLI (2)
✓ kubectl Read-only Kubernetes inspection (pods, logs, events)
✓ run_local_command Execute read-only local CLI commands
🔍 Detected CLIs: kubectl (v1.28.2), docker (24.0.7), git (2.39.0)
⬚ Not found: aws, gcloud, helm, terraform
Persistent learning — NeuBird remembers which queries work with your data sources and which telemetry tables matter for each type of investigation. It gets faster and more accurate over time.
Custom slash commands — Drop a .md file into the skills/ directory to create custom investigation templates. The filename becomes the command, and the content becomes the prompt.
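The filename-to-command mapping could look like the sketch below. The derivation (file stem becomes the command, file body becomes the prompt) follows the text; the `load_skills` helper itself is hypothetical, not NeuBird's loader.

```python
# Sketch: mapping a skills/ directory of .md files to slash commands.
# Illustrative only — the real loader may validate names, parse
# front matter, or watch the directory for changes.

from pathlib import Path

def load_skills(skills_dir: Path) -> dict[str, str]:
    """Map each .md file to a slash command whose prompt is the file body."""
    return {
        f"/{md.stem}": md.read_text()
        for md in sorted(skills_dir.glob("*.md"))
    }

# e.g. skills/disk-pressure.md would register a /disk-pressure command
```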
Export and share — Use /export to save any investigation as a file or copy it to your clipboard. An interactive picker lets you choose format (plain text, markdown, or PDF) and scope (last answer or full conversation). Or go direct: /export pdf, /export full md, /export clipboard. PDF exports include the NeuBird logo, formatted tables, and proper section headings. Use /copy as a shortcut to copy the last answer to your clipboard instantly.
Branded PDF reports — RCA investigations, health sweeps, and cost analyses automatically generate branded PDF reports saved to ~/.config/neubird/reports/. Each report includes the NeuBird logo, confidence score, data sources, investigation timeline, and structured findings with evidence tables and action items.
Live investigation dashboard — While an investigation runs, three progress bars update in real time:
██████████████████████████████████░░░░░░ 85% wrapping up · turn 15 · 22 tools • 52.3s
████████████████████░░░░░░░░░░░░░░░░░░░░ 50% confidence
████████████████████████░░░░░░░░░░░░░░░░ 2.4 MB · 1,247 rows · 8 queries (Ctrl+C)
The first bar tracks investigation progress. The second shows the AI's self-reported confidence (can go down — that's useful signal). The third shows cumulative telemetry data read from SQL queries on a logarithmic scale. On completion, a summary line shows the final stats:
──────────────────────────────────────────────────────────────────────────
✅ completed in 3m52.1s · 17 turns · 21 tools · 80% confidence · 49.4 KB read
📊 Data sources: config_aws_prod.aws_elbv2_target_groups, metric_aws_prod.rds_cpu
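Rendering bytes on a logarithmic scale, as the third bar does, can be sketched as follows. The 1 KB to 1 GB range and 40-cell width are assumptions for illustration — the actual scale NeuBird uses is not documented here.

```python
# Sketch of a log-scale progress bar for cumulative bytes read.
# Range (1 KB .. 1 GB) and width (40 cells) are illustrative assumptions.

import math

def data_bar(bytes_read: int, width: int = 40,
             lo: float = 1e3, hi: float = 1e9) -> str:
    if bytes_read <= lo:
        filled = 0
    else:
        frac = (math.log10(bytes_read) - math.log10(lo)) / \
               (math.log10(hi) - math.log10(lo))
        filled = min(width, round(frac * width))
    return "█" * filled + "░" * (width - filled)

print(data_bar(2_400_000))  # 2.4 MB lands a little past halfway on a log scale
```

A log scale keeps the bar informative whether an investigation reads kilobytes or gigabytes; a linear bar would sit at zero for most queries.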
Live investigation narrative — While the agent explores schemas and runs queries, a background summarizer produces a rolling "what we know so far" status that updates after each tool round. Instead of watching raw tool calls scroll by, you see a plain-English narrative of findings and direction:
╭─ Investigation status ─────────────────────────────────────────────────╮
│ Error rates on payment-service have risen from 0.1% to 3.2% in the │
│ last 30 minutes, concentrated on the /checkout endpoint. A deploy │
│ went out 45 minutes ago touching the payment retry logic. Looking │
│ at upstream dependencies next to rule out cascading failures. │
│ confidence: 65% │
╰────────────────────────────────────────────────────────────────────────╯
The narrative is generated by a lightweight Haiku call running in parallel — it reads a snapshot of the investigation history but never modifies what Claude sees. The main investigation is not slowed or altered. The summary replaces itself on each update (not appending), so it stays compact. Web clients receive the narrative as interim_summary SSE events for rendering in a dedicated panel.
Built-in investigation skills — NeuBird ships with ready-to-use playbooks for the most common SRE workflows:
| Command | What it does |
|---|---|
| /handoff | On-call shift briefing — active incidents, recent deploys, current health, watch items |
| /changes | Compare two time windows — find every deploy, config change, metric shift, and correlate them |
| /timeline | Reconstruct a minute-by-minute incident timeline from all telemetry sources |
| /pir | Generate a leadership-ready post-incident review with 5-whys and action items |
| /slo | Calculate error budgets, burn rates, and project when SLOs will breach |
| /blast-radius | Map upstream/downstream dependencies and quantify failure impact |
| /certs | Scan TLS certificates across endpoints, flag anything expiring within 30 days |
Enable them by copying to your skills directory:
cp skills/*.md ~/.config/neubird/skills/
NeuBird advertises available skills on the welcome screen so you always know what's possible — no memorization required.
Sentinel mode — Run /sentinel to activate continuous background monitoring. NeuBird's sentinel polls for new alerts every 5 minutes and runs a full health sweep every hour. Findings are driven by real changes in your telemetry, not just the timer, and surface as soon as they appear:
> /sentinel
🛡️ Sentinel active — polling every 5m0s, full sweep every 1h0m0s.
Type /sentinel status to see findings, /sentinel off to stop.
🛡️ Sentinel: 2 new finding(s) detected — type /sentinel status to review
> /sentinel status
🛡️ Sentinel Status
Running · Polls every 5m0s · Sweeps every 1h0m0s
Last poll: 14:35:12 · Last sweep: 14:30:00
Active findings: 3
🔴 payment-service error rate spike ✨
Error rate jumped from 0.1% to 3.2% in the last 15 minutes
→ Investigate payment-service error rate spike and correlate with recent deploys
🟡 auth-service connection pool at 82% ✨
Connection pool utilization trending toward saturation
→ Check auth-service connection pool usage and project when it hits the limit
🔵 Upcoming cert expiry: api.example.com
TLS certificate expires in 12 days
→ Scan TLS certificates and identify renewal schedule
Use /sentinel off to stop. The sentinel runs in the background — you can investigate other things while it monitors.
MCP tool support — Extend NeuBird with Model Context Protocol servers for additional data sources and capabilities beyond SQL.
Installation
macOS / Linux (Homebrew)
brew install neubirdai/tap/neubird
Linux (Snap)
sudo snap install neubird-desktop
Linux (Debian / Ubuntu)
curl -LO https://github.com/neubirdai/neubird-desktop/releases/latest/download/neubird_linux_amd64.deb
sudo dpkg -i neubird_linux_amd64.deb
Linux (Fedora / RHEL)
curl -LO https://github.com/neubirdai/neubird-desktop/releases/latest/download/neubird_linux_amd64.rpm
sudo rpm -i neubird_linux_amd64.rpm
Windows
Download neubird_windows_amd64.zip from the latest release, extract it, and add the folder to your PATH:
# PowerShell — download and extract
Invoke-WebRequest -Uri "https://github.com/neubirdai/neubird-desktop/releases/latest/download/neubird_windows_amd64.zip" -OutFile neubird.zip
Expand-Archive neubird.zip -DestinationPath "$env:LOCALAPPDATA\neubird"
# Add to PATH (current session)
$env:PATH += ";$env:LOCALAPPDATA\neubird"
# Add to PATH (permanent — requires restart)
[Environment]::SetEnvironmentVariable("PATH", $env:PATH + ";$env:LOCALAPPDATA\neubird", "User")
Docker
docker run -it --rm neubirdai/neubird-desktop:latest
Verify installation
neubird --version
Quickstart
1. Log in
neubird login https://yourcompany.app.neubird.ai
2. Connect and investigate
neubird
# Reconnect to your last session
neubird --reconnect
# Run a health sweep immediately after connecting
neubird --assess
3. Ask questions
Once connected, type any question and press Enter. NeuBird streams its investigation — showing every tool call and reasoning step — then delivers a formatted analysis.
4. Use slash commands
| Command | What it does |
|---|---|
| /health | Infrastructure health sweep (1h lookback) |
| /health 4h | Health sweep with custom lookback |
| /cost | Cloud cost analysis + 24h projection |
| /handoff | On-call shift briefing |
| /changes | What changed? — compare time windows |
| /timeline | Reconstruct incident timeline |
| /pir | Post-incident review document |
| /slo | SLO burn rate check |
| /blast-radius | Map failure blast radius |
| /certs | TLS certificate expiry scan |
| /sentinel | Start sentinel mode — continuous alert monitoring |
| /sentinel status | Show sentinel findings |
| /sentinel off | Stop sentinel |
| /export | Export answer — interactive picker (txt/md/pdf, file/clipboard) |
| /export pdf | Export directly as PDF |
| /export full md | Export full conversation as markdown |
| /copy | Copy last answer to clipboard |
| /agent | Switch agent persona |
| /welcome | Show the welcome screen |
| /tables | List available telemetry tables |
| /tools | Show all tools with capabilities |
| /project | Switch database |
| /config | Show current connection and saved state |
| /mcp | Show MCP server status |
| /clear | Clear display (keeps AI context) |
| /reset | Clear conversation history |
Keyboard Shortcuts
| Key | Action |
|---|---|
| Enter | Submit question |
| Esc | Cancel in-progress investigation |
| Ctrl+O | Toggle tool output review (works mid-investigation) |
| Ctrl+T | Tool navigation mode — browse individual tool calls |
| ↑ / ↓ | Navigate input history |
| Ctrl+R | Reverse search through history |
| ← / → | Navigate welcome screen |
| Tab | Accept suggestion / complete command |
| ? | Toggle shortcut overlay |
| Ctrl+C (twice) | Quit |
Supported Data Sources
NeuBird connects to your existing telemetry and operations tools via read-only API integrations. Out of the box, it works with:
Incident Management — PagerDuty, Opsgenie, ServiceNow
Monitoring & Observability — Datadog, Grafana (Loki, Tempo, Mimir), CloudWatch, New Relic, Prometheus
Cloud Infrastructure — AWS (EC2, RDS, ECS, Lambda, Cost Explorer), GCP, Azure
CI/CD & Source Control — GitHub, GitLab, ArgoCD
Data Warehouses — Snowflake, BigQuery, Redshift
And more — Jira, Confluence, Slack, Kubernetes, custom FDW plugins
Local CLI tools (auto-detected on your machine) — kubectl, helm, docker/podman, AWS CLI, gcloud, Azure CLI, git, terraform, curl, dig, openssl
Server API
When running in server mode (neubird serve), the NeuBird server (internally named Falcon) exposes a REST API for programmatic access:
| Method | Endpoint | Description |
|---|---|---|
| GET | /health | Server health check |
| POST | /v1/infer | Run an investigation (SSE streaming) |
| GET | /v1/tools?session_id=... | List available tools for a session |
| GET | /v1/skills | List available investigation skills |
| POST | /v1/sentinel/start | Start sentinel mode |
| POST | /v1/sentinel/stop | Stop sentinel mode |
| GET | /v1/sentinel/status | Sentinel status and config |
| GET | /v1/sentinel/findings | Current sentinel findings |
| POST | /v1/sessions/reset | Reset conversation history |
| POST | /v1/sessions/delete | Delete a session |
| POST | /v1/sessions/feedback | Submit rating or insight |
| POST | /v1/prompt/submit | Submit a prompt for async processing (returns prompt_id) |
Async Prompt Queue
Prompts can be submitted to a Redis-backed queue instead of the synchronous /v1/infer endpoint. Any agent pod with access to the same Redis picks up the prompt and processes it. This enables horizontal scaling — add more pods to process more prompts in parallel.
# Submit a prompt
curl -X POST http://localhost:8080/v1/prompt/submit \
-H "Content-Type: application/json" \
-d '{"prompt": "Why did latency spike on api-gateway?", "project_uuid": "your-project-id"}'
# Response: {"prompt_id": "a1b2c3d4-..."}
The request body accepts the same fields as /v1/infer. If prompt_id is omitted, one is generated automatically.
Reading results — The agent streams output to Redis at <org>:response:<prompt_id>. Poll the status key <org>:response:<prompt_id>:status for the current state:
| Status | Meaning |
|---|---|
| processing | An agent is actively working on the prompt |
| complete | Prompt finished successfully |
| dead_letter | Prompt failed after exhausting all retries |
Retries and dead-letter — If an agent crashes or fails mid-prompt, the message stays in the queue and is automatically reclaimed by another agent after a configurable idle period. After the maximum number of delivery attempts (default 3), the prompt is moved to a dead-letter stream (<org>:prompt:deadletter) and marked dead_letter so it is not retried forever.
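A consumer of this queue might poll the status key until it reaches a terminal state. The key layout below follows the text (`<org>:response:<prompt_id>:status`); the dict-backed store stands in for a real Redis client, and `wait_for_result` is an illustrative helper, not part of NeuBird's API.

```python
# Sketch of polling the async-queue status key until a terminal state.
# `store` is a stand-in dict — a real consumer would use a Redis client
# and a sensible poll_interval instead of a tight loop.

import time

TERMINAL = {"complete", "dead_letter"}

def status_key(org: str, prompt_id: str) -> str:
    return f"{org}:response:{prompt_id}:status"

def wait_for_result(store: dict, org: str, prompt_id: str,
                    poll_interval: float = 0.0) -> str:
    key = status_key(org, prompt_id)
    while True:
        status = store.get(key, "processing")  # absent key: still in flight
        if status in TERMINAL:
            return status
        time.sleep(poll_interval)

fake = {status_key("local", "a1b2c3d4"): "complete"}
print(wait_for_result(fake, "local", "a1b2c3d4"))  # complete
```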
Configuration:
| Env var | Description | Default |
|---|---|---|
| REDIS_URI | Redis connection string | redis://redis-master.platform.svc.cluster.local:6379 |
| ORG_NAME | Org prefix for all Redis keys | local |
| PROMPT_QUEUE_MAX_CONCURRENT | Max parallel prompt workers per pod | 8 |
| PROMPT_QUEUE_AUTOCLAIM_IDLE | Idle time before a stalled prompt is reclaimed | 12m |
| PROMPT_QUEUE_MAX_RETRIES | Max delivery attempts before dead-letter | 3 |
The consumer loop starts automatically when Redis is reachable. If Redis is not configured, the /v1/prompt/submit endpoint is not registered and only synchronous /v1/infer is available.
Sentinel API
Start sentinel monitoring via the API:
# Start sentinel with custom intervals
curl -X POST http://localhost:8080/v1/sentinel/start \
-H "Content-Type: application/json" \
-d '{"poll_interval": "5m", "sweep_interval": "30m", "project_uuid": "your-project-id"}'
# Check findings
curl http://localhost:8080/v1/sentinel/findings
# Stop sentinel
curl -X POST http://localhost:8080/v1/sentinel/stop
All endpoints require the X-Desktop-Secret header when DESKTOP_SECRET is set.
SSE Event Types
The /v1/infer endpoint streams Server-Sent Events. Each event is a JSON object with a type field:
| SSE Event Type | Go Callback | TUI Handler | Web Proxy (falcon.py) | Description |
|---|---|---|---|---|
| chunk | OnText | sp.AddText() | Code-fence filter → RCA suppressor → yield markdown | Streamed text from Claude (investigation narration, final answer, Opus synthesis) |
| tool_start | OnToolExec (IsStart=true) | sp.AddToolStart() | update_status with friendly label, CoT tracker | Agent begins executing a tool |
| tool_done | OnToolExec (IsStart=false) | sp.AddToolResult() | Renders markdown table or hides exploratory errors | Tool execution completed (output may be truncated to 8KB) |
| tool_data | OnToolData | sp.AddToolData() | Fallback table render from full JSON; stores raw SQL results | Full structured output from data-producing tools (not truncated) |
| data_read | OnDataRead | sp.SetDataRead() | Logs bytes/rows/queries, updates CoT tracker | Cumulative telemetry data volume (bytes, rows, queries from exec_sql) |
| progress | OnProgress | sp.SetProgress() | update_progress(pct, confidence), CoT state | Investigation progress percentage, phase, and confidence |
| phase | OnPhase | sp.SetAgentPhase() → overrides progress bar label | Yields demarcated falcon_phase JSON block | Agent phase transition: investigating, answering, synthesizing |
| turn | OnTurn | sp.SetTurn() | Not handled (logged, ignored) | LLM iteration number within the tool-use loop |
| trace | OnTrace | sp.SetTraceID() | Not handled (logged, ignored) | OpenTelemetry trace ID for Langfuse correlation |
| sources | OnSources | sp.SetSourceSummary() | Stored; emitted as CONTENT_TYPE_SOURCES in finally block | Data sources (tables, URLs) used in the investigation |
| health_report | OnHealthReport | RenderHealthSummaryTiles() → sp.AddText() | Yields inline markdown block | Sonnet's structured health findings (mid-stream, before Opus synthesis) |
| rca_saved | OnRCASaved | Not wired (TUI reads file paths from text) | Persists RCA markdown + structured JSON to MongoDB | Opus synthesis report saved as PDF/markdown |
| session_metadata | OnSessionMetadata | Not wired | Yields demarcated JSON; persists to session store on COMPLETED | Running investigation metadata (cost, timing, actions, time-saved) |
| rating_request | OnRatingRequest | Not wired | Logged, ignored (Raven UI handles ratings) | Signal to show a thumbs-up/down prompt |
| interim_summary | OnInterimSummary | sp.SetInterimSummary() | Yields demarcated interim_summary JSON block | Background "what we know so far" narrative (Haiku, after each tool round) |
| synthesizing | — | — | Updates status spinner | Signal that Opus senior review is starting |
| summary_text | OnSummaryText | Not wired (falls through to OnText) | Logged (same content arrives via chunk) | Synthesis/answer text on a dedicated channel for clients that distinguish it |
| error | OnError | sp.SetError() | Yields warning message, breaks stream loop | Non-recoverable error (or synthetic event from connection loss) |
Investigation Flow
Every investigation follows this sequence. Opus synthesis only runs when specific conditions are met (see When Opus Runs below).
User question
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ Phase: investigating (Sonnet tool loop) │
│ │
│ Events: chunk, tool_start, tool_done, tool_data, data_read, │
│ progress, turn, interim_summary │
│ │
│ Claude iterates: think → call tools → observe results → think │
│ Tools execute in parallel. Up to 25 iterations per batch. │
│ Each iteration is one SendMessageStream API call. │
│ │
│ After each tool round, a background Haiku call produces a │
│ rolling "what we know so far" narrative (interim_summary). │
│ This is a read-only side-channel — it never modifies Claude's │
│ context window or alters the investigation flow. │
│ │
│ Loop exits when: │
│ • StopReason = "end_turn" → Claude is done (normal exit) │
│ • StopReason = "max_tokens" → auto-continue with same state │
│ • StopReason = "tool_use" → execute tools, loop again │
│ • Iteration limit (25) hit → summarize, ask to continue │
└─────────────┬───────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ Phase: answering (Sonnet final turn) │
│ │
│ Events: phase("answering") │
│ │
│ Sonnet's last text was already streamed during the loop. │
│ This phase marker tells clients the investigation is complete │
│ and any future chunks are synthesis. │
└─────────────┬───────────────────────────────────────────────────┘
│
▼
┌──────────────┐
│ Opus needed? │──── No ──── skip to completion
└──────┬───────┘
│ Yes
▼
┌─────────────────────────────────────────────────────────────────┐
│ Phase: synthesizing (Opus senior review) │
│ │
│ Events: phase("synthesizing"), health_report (health only), │
│ chunk (Opus text), rca_saved │
│ │
│ Opus reviews ALL evidence from the Sonnet loop and produces │
│ a rigorous synthesis: RCA, health assessment, or cost analysis.│
│ Text streams as chunk events — the user reads it live. │
│ A PDF/markdown report is saved automatically. │
└─────────────┬───────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ Completion │
│ │
│ Events: session_metadata (COMPLETED), sources, rating_request, │
│ phase not emitted — OnDone fires │
│ │
│ TUI: green "✅ completed in ..." footer bar │
│ Server: [DONE] SSE sentinel │
└─────────────────────────────────────────────────────────────────┘
When Opus Runs
Opus synthesis requires both conditions to be true. If either fails, Sonnet's answer stands alone — no Opus pass.
| Investigation Type | Question Detection | Minimum Sources | Opus Function |
|---|---|---|---|
| Health | Question starts with "Run a rapid infrastructure health assessment." (generated by /health slash command) | ≥ 3 | runHealthSynthesis — senior SRE review of Good/Bad/Ugly findings |
| Cost | Question starts with the cost analysis prefix (generated by /cost slash command) | ≥ 2 | runCostSynthesis — cost trend extrapolation and ROI-ranked cuts |
| RCA | Question contains trigger keywords (see below) | ≥ 3 | runSynthesis — rigorous root cause analysis with evidence chain |
| General | None of the above, or 0 sources queried | — | No Opus — Sonnet's answer is the final output |
RCA trigger keywords — if any of these appear anywhere in the user's question (case-insensitive), the question is eligible for Opus RCA synthesis:
root cause, rca, investigate, incident, what caused, what happened, why is, why are, why did, find the cause, diagnose, postmortem, post-mortem, find out why, long running, slow quer, slow job, performance, jobs, blocking, locks, deadlock, timeout
Common reasons Opus doesn't run:
- The question doesn't match any trigger keyword — e.g. "Show me trace latencies" won't trigger Opus even with 10 sources, because none of the RCA keywords are present. Rephrasing as "Investigate the trace latency slowdown" would trigger it.
- Claude answered from conversation context or knowledge without making any tool calls (0 sources).
- The investigation only queried 1-2 data sources — not enough evidence for a meaningful synthesis.
Stream Retry on Transient Errors
When the Anthropic API drops the connection mid-stream (HTTP/2 reset, INTERNAL_ERROR, TCP reset), the agent retries the current LLM call automatically. The conversation history is intact — the failed partial response was never appended — so Claude regenerates from the same state.
- Up to 2 retries with 3s / 6s backoff
- User sees 🔄 LLM connection lost, retrying (1/2)... inline
- Applies to both the Sonnet investigation loop and tool-use iterations
- Non-transient errors (auth failures, rate limits, context cancellation) are not retried
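The retry policy above (up to 2 retries, 3s then 6s backoff, transient errors only) can be sketched as follows. The exception class and `call_with_retry` helper are illustrative, not NeuBird's actual code; `sleep` is injectable so the example runs instantly.

```python
# Sketch of retry-on-transient-error with 3s/6s backoff, per the text.
# TransientStreamError stands in for HTTP/2 resets, INTERNAL_ERROR, etc.

class TransientStreamError(Exception):
    pass

def call_with_retry(call, retries=2, base_delay=3.0, sleep=lambda s: None):
    for attempt in range(retries + 1):
        try:
            return call()
        except TransientStreamError:
            if attempt == retries:
                raise  # retries exhausted — surface the error
            sleep(base_delay * (attempt + 1))  # 3s, then 6s

attempts = []
def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise TransientStreamError("HTTP/2 reset")
    return "ok"

print(call_with_retry(flaky))  # ok — succeeded on the 3rd attempt
```

Because the failed partial response is never appended to the conversation, retrying the same call really does regenerate from identical state.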
Thin-Answer Conclusion Injection
Claude sometimes stops after gathering evidence without writing a conclusion — especially on lower-confidence investigations where it queried data but didn't synthesize findings. The user sees tool results followed immediately by the green completion bar with no analysis in between.
The agent detects this and injects a conclusion prompt automatically:
- Triggers when: end_turn with < 200 characters of text AND ≥ 3 tool calls were executed
- What happens: A conclusion prompt is appended to the conversation and one more SendMessageStream call is made without tools (so Claude writes text only — no more queries)
- User sees: The analysis streams in naturally, as if Claude had written it in the first place
- Safety: Only fires once per investigation (a concludedOnce flag prevents loops). If the conclusion call fails, the investigation completes normally — the user still has all the evidence from the tool calls
This does not apply to simple questions (< 3 tool calls) where Claude's short answer is appropriate, nor to cases where Claude wrote a substantial response alongside its last tool call.
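The trigger condition reduces to a small predicate. The thresholds (200 characters, 3 tool calls) and the concludedOnce guard come from the text; the `needs_conclusion` helper is illustrative.

```python
# Sketch of the thin-answer trigger: end_turn with a short answer after
# several tool calls, fired at most once per investigation.

def needs_conclusion(stop_reason: str, answer_text: str,
                     tool_calls: int, concluded_once: bool) -> bool:
    return (
        stop_reason == "end_turn"
        and len(answer_text) < 200
        and tool_calls >= 3
        and not concluded_once
    )

print(needs_conclusion("end_turn", "", 5, False))         # True  — inject conclusion
print(needs_conclusion("end_turn", "", 2, False))         # False — simple question
print(needs_conclusion("end_turn", "x" * 500, 5, False))  # False — substantial answer
```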
Documentation
Full documentation is available at neubirdai.github.io/neubird-desktop.
License
Proprietary. Copyright 2024-2026 NeuBird, Inc.