NeuBird Desktop
The Production Ops Agent in your terminal.
NeuBird connects to your telemetry databases and investigates incidents, health, cost, and performance — using natural language.
What NeuBird Does
NeuBird is a terminal-native AI agent for site reliability engineering. It connects to your existing telemetry tools — PagerDuty, Datadog, CloudWatch, Grafana, GitHub, Snowflake, and 30+ more — and lets you ask questions in plain English:
- "What services have the highest error rates right now?"
- "Why did latency spike on api-gateway at 2am?"
- "How much did cloud costs increase this week and what drove it?"
NeuBird queries your telemetry, reasons over results across multiple data sources, and delivers a root cause analysis — complete with evidence, sources, and recommended actions.
Key Features
Predictive analysis — Ask what's likely to page you next. NeuBird identifies degradation trends, capacity cliffs, and silent failures before they become incidents.
Pre-deployment risk assessment — Evaluate code changes before they reach production. NeuBird cross-references PRs with live telemetry to tell you what could break and why.
Agentic investigations — NeuBird doesn't just answer questions. It explores your schema, runs multiple queries, correlates data across sources, and iterates until it finds the answer. You watch the investigation unfold in real time.
Health sweeps — Run /health for a full infrastructure health check. NeuBird scans incidents, alarms, error logs, and recent deployments, then produces a Good/Bad/Ugly summary with recommended actions.
Cost analysis — /cost analyzes cloud spending trends and projects 24-hour costs with breakdowns by service, team, and resource type.
Three agent personas — Switch between investigation styles to match the situation:
| Persona | Model | Best for |
|---|---|---|
| Responder | Claude Haiku | Fast triage, immediate next actions |
| Analyst | Claude Sonnet | Root cause analysis, deep investigation |
| Architect | Claude Opus | Runbooks, design reviews, systemic fixes |
Collapsible tool output — Press Ctrl+O to review every tool call and result from an investigation — even while it's still running. Expand individual calls to inspect full data tables, or collapse them to focus on the analysis. Works mid-investigation: pause to browse results, then press Esc to resume watching.
Local CLI integration — NeuBird automatically detects SRE tools on your machine (kubectl, docker, aws, gcloud, helm, git, terraform, curl, dig, openssl) and makes them available to the AI agent as read-only tools. Ask "how many pods are in the production namespace?" and NeuBird will run kubectl get pods directly. All commands are safety-gated — destructive operations (apply, delete, create, scale) are blocked at the code level.
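The read-only gate described above can be sketched as a simple verb blocklist. This is an illustrative sketch, not NeuBird's actual implementation — the real blocklist and parsing logic may differ; `is_allowed` and the extra verbs (`patch`, `edit`) are assumptions.

```python
# Hypothetical sketch of a read-only gate for local CLI commands.
# The blocked verbs mirror the examples in the text (apply, delete,
# create, scale); patch/edit are added here for illustration only.

import shlex

BLOCKED_VERBS = {"apply", "delete", "create", "scale", "patch", "edit"}

def is_allowed(command: str) -> bool:
    """Return True only if the command contains no mutating verb."""
    tokens = shlex.split(command)
    return not any(tok in BLOCKED_VERBS for tok in tokens)

print(is_allowed("kubectl get pods -n production"))  # True
print(is_allowed("kubectl delete pod payments-0"))   # False
```

A denylist like this is conservative by design: anything that looks like a mutation is rejected before it ever reaches a shell.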
Web search — When an investigation requires context beyond your telemetry — CVE details, vendor status pages, documentation, or recent outage reports — NeuBird can search the web for supporting evidence. Web search is enabled by default and integrates naturally into investigations alongside SQL queries and local CLI tools.
Tool inventory — On startup, NeuBird displays every tool available to the agent — grouped by source (built-in, cloud MCP, local CLI). Run /tools anytime to see the full inventory with capabilities. Example:
🔧 Tool inventory (18 tools)
📦 Built-in (12)
✓ exec_sql Execute SQL queries against the database
✓ list_schemas List database schemas
...
☁️ ACE MCP (4)
✓ external_tool External service queries
...
💻 Local CLI (2)
✓ kubectl Read-only Kubernetes inspection (pods, logs, events)
✓ run_local_command Execute read-only local CLI commands
🔍 Detected CLIs: kubectl (v1.28.2), docker (24.0.7), git (2.39.0)
⬚ Not found: aws, gcloud, helm, terraform
Persistent learning — NeuBird remembers which queries work with your data sources and which telemetry tables matter for each type of investigation. It gets faster and more accurate over time.
Custom slash commands — Drop a .md file into the skills/ directory to create custom investigation templates. The filename becomes the command, and the content becomes the prompt.
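The filename-to-command mapping could look like the sketch below. The derivation (file stem becomes the command, file body becomes the prompt) follows the text; the `load_skills` helper itself is hypothetical, not NeuBird's loader.

```python
# Sketch: mapping a skills/ directory of .md files to slash commands.
# Illustrative only — the real loader may validate names, parse
# front matter, or watch the directory for changes.

from pathlib import Path

def load_skills(skills_dir: Path) -> dict[str, str]:
    """Map each .md file to a slash command whose prompt is the file body."""
    return {
        f"/{md.stem}": md.read_text()
        for md in sorted(skills_dir.glob("*.md"))
    }

# e.g. skills/disk-pressure.md would register a /disk-pressure command
```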
Export and share — Use /export to save any investigation as a file or copy it to your clipboard. An interactive picker lets you choose format (plain text, markdown, or PDF) and scope (last answer or full conversation). Or go direct: /export pdf, /export full md, /export clipboard. PDF exports include the NeuBird logo, formatted tables, and proper section headings. Use /copy as a shortcut to copy the last answer to your clipboard instantly.
Branded PDF reports — RCA investigations, health sweeps, and cost analyses automatically generate branded PDF reports saved to ~/.config/neubird/reports/. Each report includes the NeuBird logo, confidence score, data sources, investigation timeline, and structured findings with evidence tables and action items.
Live investigation dashboard — While an investigation runs, three progress bars update in real time:
██████████████████████████████████░░░░░░ 85% wrapping up · turn 15 · 22 tools • 52.3s
████████████████████░░░░░░░░░░░░░░░░░░░░ 50% confidence
████████████████████████░░░░░░░░░░░░░░░░ 2.4 MB · 1,247 rows · 8 queries (Ctrl+C)
The first bar tracks investigation progress. The second shows the AI's self-reported confidence (can go down — that's useful signal). The third shows cumulative telemetry data read from SQL queries on a logarithmic scale. On completion, a summary line shows the final stats:
──────────────────────────────────────────────────────────────────────────
✅ completed in 3m52.1s · 17 turns · 21 tools · 80% confidence · 49.4 KB read
📊 Data sources: config_aws_prod.aws_elbv2_target_groups, metric_aws_prod.rds_cpu
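Rendering bytes on a logarithmic scale, as the third bar does, can be sketched as follows. The 1 KB to 1 GB range and 40-cell width are assumptions for illustration — the actual scale NeuBird uses is not documented here.

```python
# Sketch of a log-scale progress bar for cumulative bytes read.
# Range (1 KB .. 1 GB) and width (40 cells) are illustrative assumptions.

import math

def data_bar(bytes_read: int, width: int = 40,
             lo: float = 1e3, hi: float = 1e9) -> str:
    if bytes_read <= lo:
        filled = 0
    else:
        frac = (math.log10(bytes_read) - math.log10(lo)) / \
               (math.log10(hi) - math.log10(lo))
        filled = min(width, round(frac * width))
    return "█" * filled + "░" * (width - filled)

print(data_bar(2_400_000))  # 2.4 MB lands a little past halfway on a log scale
```

A log scale keeps the bar informative whether an investigation reads kilobytes or gigabytes; a linear bar would sit at zero for most queries.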
Live investigation narrative — While the agent explores schemas and runs queries, a background summarizer produces a rolling "what we know so far" status that updates after each tool round. Instead of watching raw tool calls scroll by, you see a plain-English narrative of findings and direction:
╭─ Investigation status ─────────────────────────────────────────────────╮
│ Error rates on payment-service have risen from 0.1% to 3.2% in the │
│ last 30 minutes, concentrated on the /checkout endpoint. A deploy │
│ went out 45 minutes ago touching the payment retry logic. Looking │
│ at upstream dependencies next to rule out cascading failures. │
│ confidence: 65% │
╰────────────────────────────────────────────────────────────────────────╯
The narrative is generated by a lightweight Haiku call running in parallel — it reads a snapshot of the investigation history but never modifies what Claude sees. The main investigation is not slowed or altered. The summary replaces itself on each update (not appending), so it stays compact. Web clients receive the narrative as interim_summary SSE events for rendering in a dedicated panel.
Built-in investigation skills — NeuBird ships with ready-to-use playbooks for the most common SRE workflows:
| Command | What it does |
|---|---|
| /handoff | On-call shift briefing — active incidents, recent deploys, current health, watch items |
| /changes | Compare two time windows — find every deploy, config change, metric shift, and correlate them |
| /timeline | Reconstruct a minute-by-minute incident timeline from all telemetry sources |
| /pir | Generate a leadership-ready post-incident review with 5-whys and action items |
| /slo | Calculate error budgets, burn rates, and project when SLOs will breach |
| /blast-radius | Map upstream/downstream dependencies and quantify failure impact |
| /certs | Scan TLS certificates across endpoints, flag anything expiring within 30 days |
Enable them by copying to your skills directory:
cp skills/*.md ~/.config/neubird/skills/
NeuBird advertises available skills on the welcome screen so you always know what's possible — no memorization required.
Sentinel mode — Run /sentinel to activate continuous background monitoring. NeuBird's sentinel polls for new alerts every 5 minutes and runs a full health sweep every hour. Findings are driven by real changes in your telemetry, not just the timer, and surface as soon as they appear:
> /sentinel
🛡️ Sentinel active — polling every 5m0s, full sweep every 1h0m0s.
Type /sentinel status to see findings, /sentinel off to stop.
🛡️ Sentinel: 2 new finding(s) detected — type /sentinel status to review
> /sentinel status
🛡️ Sentinel Status
Running · Polls every 5m0s · Sweeps every 1h0m0s
Last poll: 14:35:12 · Last sweep: 14:30:00
Active findings: 3
🔴 payment-service error rate spike ✨
Error rate jumped from 0.1% to 3.2% in the last 15 minutes
→ Investigate payment-service error rate spike and correlate with recent deploys
🟡 auth-service connection pool at 82% ✨
Connection pool utilization trending toward saturation
→ Check auth-service connection pool usage and project when it hits the limit
🔵 Upcoming cert expiry: api.example.com
TLS certificate expires in 12 days
→ Scan TLS certificates and identify renewal schedule
Use /sentinel off to stop. The sentinel runs in the background — you can investigate other things while it monitors.
MCP tool support — Extend NeuBird with Model Context Protocol servers for additional data sources and capabilities beyond SQL.
Installation
macOS / Linux (Homebrew)
brew install neubirdai/tap/neubird
Linux (Snap)
sudo snap install neubird-desktop
Linux (Debian / Ubuntu)
curl -LO https://github.com/neubirdai/neubird-desktop/releases/latest/download/neubird_linux_amd64.deb
sudo dpkg -i neubird_linux_amd64.deb
Linux (Fedora / RHEL)
curl -LO https://github.com/neubirdai/neubird-desktop/releases/latest/download/neubird_linux_amd64.rpm
sudo rpm -i neubird_linux_amd64.rpm
Windows
Download neubird_windows_amd64.zip from the latest release, extract it, and add the folder to your PATH:
# PowerShell — download and extract
Invoke-WebRequest -Uri "https://github.com/neubirdai/neubird-desktop/releases/latest/download/neubird_windows_amd64.zip" -OutFile neubird.zip
Expand-Archive neubird.zip -DestinationPath "$env:LOCALAPPDATA\neubird"
# Add to PATH (current session)
$env:PATH += ";$env:LOCALAPPDATA\neubird"
# Add to PATH (permanent — requires restart)
[Environment]::SetEnvironmentVariable("PATH", $env:PATH + ";$env:LOCALAPPDATA\neubird", "User")
Docker
docker run -it --rm neubirdai/neubird-desktop:latest
Verify installation
neubird --version
Quickstart
1. Log in
neubird login https://yourcompany.app.neubird.ai
2. Connect and investigate
neubird
# Reconnect to your last session
neubird --reconnect
# Run a health sweep immediately after connecting
neubird --assess
3. Ask questions
Once connected, type any question and press Enter. NeuBird streams its investigation — showing every tool call and reasoning step — then delivers a formatted analysis.
4. Use slash commands
| Command | What it does |
|---|---|
| /health | Infrastructure health sweep (1h lookback) |
| /health 4h | Health sweep with custom lookback |
| /cost | Cloud cost analysis + 24h projection |
| /handoff | On-call shift briefing |
| /changes | What changed? — compare time windows |
| /timeline | Reconstruct incident timeline |
| /pir | Post-incident review document |
| /slo | SLO burn rate check |
| /blast-radius | Map failure blast radius |
| /certs | TLS certificate expiry scan |
| /sentinel | Start sentinel mode — continuous alert monitoring |
| /sentinel status | Show sentinel findings |
| /sentinel off | Stop sentinel |
| /export | Export answer — interactive picker (txt/md/pdf, file/clipboard) |
| /export pdf | Export directly as PDF |
| /export full md | Export full conversation as markdown |
| /copy | Copy last answer to clipboard |
| /agent | Switch agent persona |
| /welcome | Show the welcome screen |
| /tables | List available telemetry tables |
| /tools | Show all tools with capabilities |
| /project | Switch database |
| /config | Show current connection and saved state |
| /mcp | Show MCP server status |
| /clear | Clear display (keeps AI context) |
| /reset | Clear conversation history |
Keyboard Shortcuts
| Key | Action |
|---|---|
| Enter | Submit question |
| Esc | Cancel in-progress investigation |
| Ctrl+O | Toggle tool output review (works mid-investigation) |
| Ctrl+T | Tool navigation mode — browse individual tool calls |
| ↑ / ↓ | Navigate input history |
| Ctrl+R | Reverse search through history |
| ← / → | Navigate welcome screen |
| Tab | Accept suggestion / complete command |
| ? | Toggle shortcut overlay |
| Ctrl+C (twice) | Quit |
Supported Data Sources
NeuBird connects to your existing telemetry and operations tools via read-only API integrations. Out of the box, it works with:
Incident Management — PagerDuty, Opsgenie, ServiceNow
Monitoring & Observability — Datadog, Grafana (Loki, Tempo, Mimir), CloudWatch, New Relic, Prometheus
Cloud Infrastructure — AWS (EC2, RDS, ECS, Lambda, Cost Explorer), GCP, Azure
CI/CD & Source Control — GitHub, GitLab, ArgoCD
Data Warehouses — Snowflake, BigQuery, Redshift
And more — Jira, Confluence, Slack, Kubernetes, custom FDW plugins
Local CLI tools (auto-detected on your machine) — kubectl, helm, docker/podman, AWS CLI, gcloud, Azure CLI, git, terraform, curl, dig, openssl
Server API
When running in server mode (neubird serve), the NeuBird server (internally named Falcon) exposes a REST API for programmatic access:
| Method | Endpoint | Description |
|---|---|---|
| GET | /health | Server health check |
| POST | /v1/infer | Run an investigation (SSE streaming) |
| GET | /v1/tools?session_id=... | List available tools for a session |
| GET | /v1/skills | List available investigation skills |
| POST | /v1/sentinel/start | Start sentinel mode |
| POST | /v1/sentinel/stop | Stop sentinel mode |
| GET | /v1/sentinel/status | Sentinel status and config |
| GET | /v1/sentinel/findings | Current sentinel findings |
| POST | /v1/sessions/reset | Reset conversation history |
| POST | /v1/sessions/delete | Delete a session |
| POST | /v1/sessions/feedback | Submit rating or insight |
| POST | /v1/prompt/submit | Submit a prompt for async processing (returns prompt_id) |
Async Prompt Queue
Prompts can be submitted to a Redis-backed queue instead of the synchronous /v1/infer endpoint. Any agent pod with access to the same Redis picks up the prompt and processes it. This enables horizontal scaling — add more pods to process more prompts in parallel.
# Submit a prompt
curl -X POST http://localhost:8080/v1/prompt/submit \
-H "Content-Type: application/json" \
-d '{"prompt": "Why did latency spike on api-gateway?", "project_uuid": "your-project-id"}'
# Response: {"prompt_id": "a1b2c3d4-..."}
The request body accepts the same fields as /v1/infer. If prompt_id is omitted, one is generated automatically.
Reading results — The agent streams output to Redis at <org>:response:<prompt_id>. Poll the status key <org>:response:<prompt_id>:status for the current state:
| Status | Meaning |
|---|---|
| processing | An agent is actively working on the prompt |
| complete | Prompt finished successfully |
| dead_letter | Prompt failed after exhausting all retries |
Retries and dead-letter — If an agent crashes or fails mid-prompt, the message stays in the queue and is automatically reclaimed by another agent after a configurable idle period. After the maximum number of delivery attempts (default 3), the prompt is moved to a dead-letter stream (<org>:prompt:deadletter) and marked dead_letter so it is not retried forever.
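A consumer of this queue might poll the status key until it reaches a terminal state. The key layout below follows the text (`<org>:response:<prompt_id>:status`); the dict-backed store stands in for a real Redis client, and `wait_for_result` is an illustrative helper, not part of NeuBird's API.

```python
# Sketch of polling the async-queue status key until a terminal state.
# `store` is a stand-in dict — a real consumer would use a Redis client
# and a sensible poll_interval instead of a tight loop.

import time

TERMINAL = {"complete", "dead_letter"}

def status_key(org: str, prompt_id: str) -> str:
    return f"{org}:response:{prompt_id}:status"

def wait_for_result(store: dict, org: str, prompt_id: str,
                    poll_interval: float = 0.0) -> str:
    key = status_key(org, prompt_id)
    while True:
        status = store.get(key, "processing")  # absent key: still in flight
        if status in TERMINAL:
            return status
        time.sleep(poll_interval)

fake = {status_key("local", "a1b2c3d4"): "complete"}
print(wait_for_result(fake, "local", "a1b2c3d4"))  # complete
```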
Configuration:
| Env var | Description | Default |
|---|---|---|
| REDIS_URI | Redis connection string | redis://redis-master.platform.svc.cluster.local:6379 |
| ORG_NAME | Org prefix for all Redis keys | local |
| PROMPT_QUEUE_MAX_CONCURRENT | Max parallel prompt workers per pod | 8 |
| PROMPT_QUEUE_AUTOCLAIM_IDLE | Idle time before a stalled prompt is reclaimed | 12m |
| PROMPT_QUEUE_MAX_RETRIES | Max delivery attempts before dead-letter | 3 |
The consumer loop starts automatically when Redis is reachable. If Redis is not configured, the /v1/prompt/submit endpoint is not registered and only synchronous /v1/infer is available.
Sentinel API
Start sentinel monitoring via the API:
# Start sentinel with custom intervals
curl -X POST http://localhost:8080/v1/sentinel/start \
-H "Content-Type: application/json" \
-d '{"poll_interval": "5m", "sweep_interval": "30m", "project_uuid": "your-project-id"}'
# Check findings
curl http://localhost:8080/v1/sentinel/findings
# Stop sentinel
curl -X POST http://localhost:8080/v1/sentinel/stop
All endpoints require the X-Desktop-Secret header when DESKTOP_SECRET is set.
SSE Event Types
The /v1/infer endpoint streams Server-Sent Events. Each event is a JSON object with a type field:
| SSE Event Type | Go Callback | TUI Handler | Web Proxy (falcon.py) | Description |
|---|---|---|---|---|
| chunk | OnText | sp.AddText() | Code-fence filter → RCA suppressor → yield markdown | Streamed text from Claude (investigation narration, final answer, Opus synthesis) |
| tool_start | OnToolExec (IsStart=true) | sp.AddToolStart() | update_status with friendly label, CoT tracker | Agent begins executing a tool |
| tool_done | OnToolExec (IsStart=false) | sp.AddToolResult() | Renders markdown table or hides exploratory errors | Tool execution completed (output may be truncated to 8KB) |
| tool_data | OnToolData | sp.AddToolData() | Fallback table render from full JSON; stores raw SQL results | Full structured output from data-producing tools (not truncated) |
| data_read | OnDataRead | sp.SetDataRead() | Logs bytes/rows/queries, updates CoT tracker | Cumulative telemetry data volume (bytes, rows, queries from exec_sql) |
| progress | OnProgress | sp.SetProgress() | update_progress(pct, confidence), CoT state | Investigation progress percentage, phase, and confidence |
| phase | OnPhase | sp.SetAgentPhase() → overrides progress bar label | Yields demarcated falcon_phase JSON block | Agent phase transition: investigating, answering, synthesizing |
| turn | OnTurn | sp.SetTurn() | Not handled (logged, ignored) | LLM iteration number within the tool-use loop |
| trace | OnTrace | sp.SetTraceID() | Not handled (logged, ignored) | OpenTelemetry trace ID for Langfuse correlation |
| sources | OnSources | sp.SetSourceSummary() | Stored; emitted as CONTENT_TYPE_SOURCES in finally block | Data sources (tables, URLs) used in the investigation |
| health_report | OnHealthReport | RenderHealthSummaryTiles() → sp.AddText() | Yields inline markdown block | Sonnet's structured health findings (mid-stream, before Opus synthesis) |
| rca_saved | OnRCASaved | Not wired (TUI reads file paths from text) | Persists RCA markdown + structured JSON to MongoDB | Opus synthesis report saved as PDF/markdown |
| session_metadata | OnSessionMetadata | Not wired | Yields demarcated JSON; persists to session store on COMPLETED | Running investigation metadata (cost, timing, actions, time-saved) |
| rating_request | OnRatingRequest | Not wired | Logged, ignored (Raven UI handles ratings) | Signal to show a thumbs-up/down prompt |
| interim_summary | OnInterimSummary | sp.SetInterimSummary() | Yields demarcated interim_summary JSON block | Background "what we know so far" narrative (Haiku, after each tool round) |
| synthesizing | — | — | Updates status spinner | Signal that Opus senior review is starting |
| summary_text | OnSummaryText | Not wired (falls through to OnText) | Logged (same content arrives via chunk) | Synthesis/answer text on a dedicated channel for clients that distinguish it |
| error | OnError | sp.SetError() | Yields warning message, breaks stream loop | Non-recoverable error (or synthetic event from connection loss) |
Investigation Flow
Every investigation follows this sequence. Opus synthesis only runs when specific conditions are met (see When Opus Runs below).
User question
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ Phase: investigating (Sonnet tool loop) │
│ │
│ Events: chunk, tool_start, tool_done, tool_data, data_read, │
│ progress, turn, interim_summary │
│ │
│ Claude iterates: think → call tools → observe results → think │
│ Tools execute in parallel. Up to 25 iterations per batch. │
│ Each iteration is one SendMessageStream API call. │
│ │
│ After each tool round, a background Haiku call produces a │
│ rolling "what we know so far" narrative (interim_summary). │
│ This is a read-only side-channel — it never modifies Claude's │
│ context window or alters the investigation flow. │
│ │
│ Loop exits when: │
│ • StopReason = "end_turn" → Claude is done (normal exit) │
│ • StopReason = "max_tokens" → auto-continue with same state │
│ • StopReason = "tool_use" → execute tools, loop again │
│ • Iteration limit (25) hit → summarize, ask to continue │
└─────────────┬───────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ Phase: answering (Sonnet final turn) │
│ │
│ Events: phase("answering") │
│ │
│ Sonnet's last text was already streamed during the loop. │
│ This phase marker tells clients the investigation is complete │
│ and any future chunks are synthesis. │
└─────────────┬───────────────────────────────────────────────────┘
│
▼
┌──────────────┐
│ Opus needed? │──── No ──── skip to completion
└──────┬───────┘
│ Yes
▼
┌─────────────────────────────────────────────────────────────────┐
│ Phase: synthesizing (Opus senior review) │
│ │
│ Events: phase("synthesizing"), health_report (health only), │
│ chunk (Opus text), rca_saved │
│ │
│ Opus reviews ALL evidence from the Sonnet loop and produces │
│ a rigorous synthesis: RCA, health assessment, or cost analysis.│
│ Text streams as chunk events — the user reads it live. │
│ A PDF/markdown report is saved automatically. │
└─────────────┬───────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ Completion │
│ │
│ Events: session_metadata (COMPLETED), sources, rating_request, │
│ phase not emitted — OnDone fires │
│ │
│ TUI: green "✅ completed in ..." footer bar │
│ Server: [DONE] SSE sentinel │
└─────────────────────────────────────────────────────────────────┘
When Opus Runs
Opus synthesis requires both conditions to be true. If either fails, Sonnet's answer stands alone — no Opus pass.
| Investigation Type | Question Detection | Minimum Sources | Opus Function |
|---|---|---|---|
| Health | Question starts with "Run a rapid infrastructure health assessment." (generated by /health slash command) | ≥ 3 | runHealthSynthesis — senior SRE review of Good/Bad/Ugly findings |
| Cost | Question starts with the cost analysis prefix (generated by /cost slash command) | ≥ 2 | runCostSynthesis — cost trend extrapolation and ROI-ranked cuts |
| RCA | Question contains trigger keywords (see below) | ≥ 3 | runSynthesis — rigorous root cause analysis with evidence chain |
| General | None of the above, or 0 sources queried | — | No Opus — Sonnet's answer is the final output |
RCA trigger keywords — if any of these appear anywhere in the user's question (case-insensitive), the question is eligible for Opus RCA synthesis:
root cause, rca, investigate, incident, what caused, what happened, why is, why are, why did, find the cause, diagnose, postmortem, post-mortem, find out why, long running, slow quer, slow job, performance, jobs, blocking, locks, deadlock, timeout
Common reasons Opus doesn't run:
- The question doesn't match any trigger keyword — e.g. "Show me trace latencies" won't trigger Opus even with 10 sources, because none of the RCA keywords are present. Rephrasing as "Investigate the trace latency slowdown" would trigger it.
- Claude answered from conversation context or knowledge without making any tool calls (0 sources).
- The investigation only queried 1-2 data sources — not enough evidence for a meaningful synthesis.
Stream Retry on Transient Errors
When the Anthropic API drops the connection mid-stream (HTTP/2 reset, INTERNAL_ERROR, TCP reset), the agent retries the current LLM call automatically. The conversation history is intact — the failed partial response was never appended — so Claude regenerates from the same state.
- Up to 2 retries with 3s / 6s backoff
- User sees 🔄 LLM connection lost, retrying (1/2)... inline
- Applies to both the Sonnet investigation loop and tool-use iterations
- Non-transient errors (auth failures, rate limits, context cancellation) are not retried
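The retry policy above (up to 2 retries, 3s then 6s backoff, transient errors only) can be sketched as follows. The exception class and `call_with_retry` helper are illustrative, not NeuBird's actual code; `sleep` is injectable so the example runs instantly.

```python
# Sketch of retry-on-transient-error with 3s/6s backoff, per the text.
# TransientStreamError stands in for HTTP/2 resets, INTERNAL_ERROR, etc.

class TransientStreamError(Exception):
    pass

def call_with_retry(call, retries=2, base_delay=3.0, sleep=lambda s: None):
    for attempt in range(retries + 1):
        try:
            return call()
        except TransientStreamError:
            if attempt == retries:
                raise  # retries exhausted — surface the error
            sleep(base_delay * (attempt + 1))  # 3s, then 6s

attempts = []
def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise TransientStreamError("HTTP/2 reset")
    return "ok"

print(call_with_retry(flaky))  # ok — succeeded on the 3rd attempt
```

Because the failed partial response is never appended to the conversation, retrying the same call really does regenerate from identical state.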
Thin-Answer Conclusion Injection
Claude sometimes stops after gathering evidence without writing a conclusion — especially on lower-confidence investigations where it queried data but didn't synthesize findings. The user sees tool results followed immediately by the green completion bar with no analysis in between.
The agent detects this and injects a conclusion prompt automatically:
- Triggers when: end_turn with < 200 characters of text AND ≥ 3 tool calls were executed
- What happens: A conclusion prompt is appended to the conversation and one more SendMessageStream call is made without tools (so Claude writes text only — no more queries)
- User sees: The analysis streams in naturally, as if Claude had written it in the first place
- Safety: Only fires once per investigation (a concludedOnce flag prevents loops). If the conclusion call fails, the investigation completes normally — the user still has all the evidence from the tool calls
This does not apply to simple questions (< 3 tool calls) where Claude's short answer is appropriate, nor to cases where Claude wrote a substantial response alongside its last tool call.
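The trigger condition reduces to a small predicate. The thresholds (200 characters, 3 tool calls) and the concludedOnce guard come from the text; the `needs_conclusion` helper is illustrative.

```python
# Sketch of the thin-answer trigger: end_turn with a short answer after
# several tool calls, fired at most once per investigation.

def needs_conclusion(stop_reason: str, answer_text: str,
                     tool_calls: int, concluded_once: bool) -> bool:
    return (
        stop_reason == "end_turn"
        and len(answer_text) < 200
        and tool_calls >= 3
        and not concluded_once
    )

print(needs_conclusion("end_turn", "", 5, False))         # True  — inject conclusion
print(needs_conclusion("end_turn", "", 2, False))         # False — simple question
print(needs_conclusion("end_turn", "x" * 500, 5, False))  # False — substantial answer
```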
Documentation
Full documentation is available at neubirdai.github.io/neubird-desktop.
License
Proprietary. Copyright 2024-2026 NeuBird, Inc.