
NeuBird Desktop

The Production Ops Agent in your terminal.

NeuBird connects to your telemetry databases and investigates incidents, health, cost, and performance — using natural language.


What NeuBird Does

NeuBird is a terminal-native AI agent for site reliability engineering. It connects to your existing telemetry tools — PagerDuty, Datadog, CloudWatch, Grafana, GitHub, Snowflake, and 30+ more — and lets you ask questions in plain English:

  • "What services have the highest error rates right now?"
  • "Why did latency spike on api-gateway at 2am?"
  • "How much did cloud costs increase this week and what drove it?"

NeuBird queries your telemetry, reasons over results across multiple data sources, and delivers a root cause analysis — complete with evidence, sources, and recommended actions.

Key Features

Predictive analysis — Ask what's likely to page you next. NeuBird identifies degradation trends, capacity cliffs, and silent failures before they become incidents.

Pre-deployment risk assessment — Evaluate code changes before they reach production. NeuBird cross-references PRs with live telemetry to tell you what could break and why.

Agentic investigations — NeuBird doesn't just answer questions. It explores your schema, runs multiple queries, correlates data across sources, and iterates until it finds the answer. You watch the investigation unfold in real time.

Health sweeps — Run /health for a full infrastructure health check. NeuBird scans incidents, alarms, error logs, and recent deployments, then produces a Good/Bad/Ugly summary with recommended actions.

Cost analysis — /cost analyzes cloud spending trends and projects 24-hour costs with breakdowns by service, team, and resource type.

Three agent personas — Switch between investigation styles to match the situation:

Persona Model Best for
Responder Claude Haiku Fast triage, immediate next actions
Analyst Claude Sonnet Root cause analysis, deep investigation
Architect Claude Opus Runbooks, design reviews, systemic fixes

Collapsible tool output — Press Ctrl+O to review every tool call and result from an investigation — even while it's still running. Expand individual calls to inspect full data tables, or collapse them to focus on the analysis. Works mid-investigation: pause to browse results, then press Esc to resume watching.

Local CLI integration — NeuBird automatically detects SRE tools on your machine (kubectl, docker, aws, gcloud, helm, git, terraform, curl, dig, openssl) and makes them available to the AI agent as read-only tools. Ask "how many pods are in the production namespace?" and NeuBird will run kubectl get pods directly. All commands are safety-gated — destructive operations (apply, delete, create, scale) are blocked at the code level.
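The safety gate can be sketched as a simple verb check. This is an illustrative sketch only, not NeuBird's actual implementation; the blocked-verb list is just the four examples named above.

```python
BLOCKED_VERBS = {"apply", "delete", "create", "scale"}

def is_read_only(argv: list[str]) -> bool:
    """Reject CLI invocations containing a destructive verb (a sketch).

    The real gate lives in NeuBird's code; this only mirrors the
    documented behavior for the four verbs listed in the docs.
    """
    return not BLOCKED_VERBS.intersection(argv)

# Allowed: inspection commands pass through
is_read_only(["kubectl", "get", "pods", "-n", "production"])   # True
# Blocked: destructive operations are refused
is_read_only(["kubectl", "delete", "pod", "web-1"])            # False
```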

Web search — When an investigation requires context beyond your telemetry — CVE details, vendor status pages, documentation, or recent outage reports — NeuBird can search the web for supporting evidence. Web search is enabled by default and integrates naturally into investigations alongside SQL queries and local CLI tools.

Tool inventory — On startup, NeuBird displays every tool available to the agent — grouped by source (built-in, cloud MCP, local CLI). Run /tools anytime to see the full inventory with capabilities. Example:

🔧 Tool inventory (18 tools)
  📦 Built-in (12)
     ✓ exec_sql                  Execute SQL queries against the database
     ✓ list_schemas              List database schemas
     ...
  ☁️  ACE MCP (4)
     ✓ external_tool             External service queries
     ...
  💻 Local CLI (2)
     ✓ kubectl                   Read-only Kubernetes inspection (pods, logs, events)
     ✓ run_local_command         Execute read-only local CLI commands
  🔍 Detected CLIs: kubectl (v1.28.2), docker (24.0.7), git (2.39.0)
  ⬚  Not found: aws, gcloud, helm, terraform

Persistent learning — NeuBird remembers which queries work with your data sources and which telemetry tables matter for each type of investigation. It gets faster and more accurate over time.

Custom slash commands — Drop a .md file into the skills/ directory to create custom investigation templates. The filename becomes the command, and the content becomes the prompt.
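A minimal sketch of that convention in Python; the `deploy-check` command name and its prompt text are made-up examples, and the helper function is not part of NeuBird itself:

```python
from pathlib import Path

def create_skill(skills_dir, name: str, prompt: str) -> Path:
    """Drop a .md file into the skills directory.

    The filename (minus .md) becomes the slash command; the file body
    becomes the prompt the agent runs.
    """
    path = Path(skills_dir).expanduser() / f"{name}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(prompt)
    return path

# Creates a hypothetical /deploy-check command
create_skill(
    "~/.config/neubird/skills",
    "deploy-check",
    "List every deployment in the last 24 hours and flag any service "
    "whose error rate rose after its deploy.\n",
)
```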

Export and share — Use /export to save any investigation as a file or copy it to your clipboard. An interactive picker lets you choose format (plain text, markdown, or PDF) and scope (last answer or full conversation). Or go direct: /export pdf, /export full md, /export clipboard. PDF exports include the NeuBird logo, formatted tables, and proper section headings. Use /copy as a shortcut to copy the last answer to your clipboard instantly.

Branded PDF reports — RCA investigations, health sweeps, and cost analyses automatically generate branded PDF reports saved to ~/.config/neubird/reports/. Each report includes the NeuBird logo, confidence score, data sources, investigation timeline, and structured findings with evidence tables and action items.

Live investigation dashboard — While an investigation runs, three progress bars update in real time:

  ██████████████████████████████████░░░░░░ 85% wrapping up · turn 15 · 22 tools • 52.3s
  ████████████████████░░░░░░░░░░░░░░░░░░░░ 50% confidence
  ████████████████████████░░░░░░░░░░░░░░░░ 2.4 MB · 1,247 rows · 8 queries      (Ctrl+C)

The first bar tracks investigation progress. The second shows the AI's self-reported confidence (can go down — that's useful signal). The third shows cumulative telemetry data read from SQL queries on a logarithmic scale. On completion, a summary line shows the final stats:

  ──────────────────────────────────────────────────────────────────────────
  ✅ completed in 3m52.1s · 17 turns · 21 tools · 80% confidence · 49.4 KB read
  📊 Data sources: config_aws_prod.aws_elbv2_target_groups, metric_aws_prod.rds_cpu
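The third bar's logarithmic scale can be sketched as below. The 100 MB full-scale value is an assumption for illustration, not NeuBird's actual constant; the point is that a log mapping keeps early KB-sized reads visible while still distinguishing MB- and GB-sized investigations.

```python
import math

def data_bar(bytes_read: int, width: int = 40,
             full_scale: int = 100 * 1024**2) -> str:
    """Render a cumulative-data progress bar on a log scale (a sketch).

    `full_scale` (100 MB here) is an assumed saturation point, not the
    documented one. Anything at or above it fills the bar completely.
    """
    if bytes_read <= 1:
        return "░" * width
    frac = min(math.log(bytes_read) / math.log(full_scale), 1.0)
    filled = int(frac * width)
    return "█" * filled + "░" * (width - filled)
```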

Live investigation narrative — While the agent explores schemas and runs queries, a background summarizer produces a rolling "what we know so far" status that updates after each tool round. Instead of watching raw tool calls scroll by, you see a plain-English narrative of findings and direction:

  ╭─ Investigation status ────────────────────────────────────────────────╮
  │ Error rates on payment-service have risen from 0.1% to 3.2% in the    │
  │ last 30 minutes, concentrated on the /checkout endpoint. A deploy     │
  │ went out 45 minutes ago touching the payment retry logic. Looking     │
  │ at upstream dependencies next to rule out cascading failures.         │
  │                                                       confidence: 65% │
  ╰───────────────────────────────────────────────────────────────────────╯

The narrative is generated by a lightweight Haiku call running in parallel — it reads a snapshot of the investigation history but never modifies what Claude sees. The main investigation is not slowed or altered. The summary replaces itself on each update (not appending), so it stays compact. Web clients receive the narrative as interim_summary SSE events for rendering in a dedicated panel.

Built-in investigation skills — NeuBird ships with ready-to-use playbooks for the most common SRE workflows:

Command What it does
/handoff On-call shift briefing — active incidents, recent deploys, current health, watch items
/changes Compare two time windows — find every deploy, config change, metric shift, and correlate them
/timeline Reconstruct a minute-by-minute incident timeline from all telemetry sources
/pir Generate a leadership-ready post-incident review with 5-whys and action items
/slo Calculate error budgets, burn rates, and project when SLOs will breach
/blast-radius Map upstream/downstream dependencies and quantify failure impact
/certs Scan TLS certificates across endpoints, flag anything expiring within 30 days

Enable them by copying to your skills directory:

cp skills/*.md ~/.config/neubird/skills/

NeuBird advertises available skills on the welcome screen so you always know what's possible — no memorization required.

Sentinel mode — Run /sentinel to activate continuous background monitoring. NeuBird's sentinel polls for new alerts every 5 minutes and runs full health sweeps every hour. It surfaces actionable findings as they appear — not on a dumb timer, but by detecting real changes in your telemetry:

> /sentinel
  🛡️ Sentinel active — polling every 5m0s, full sweep every 1h0m0s.
  Type /sentinel status to see findings, /sentinel off to stop.

  🛡️ Sentinel: 2 new finding(s) detected — type /sentinel status to review

> /sentinel status
  🛡️ Sentinel Status
  Running  · Polls every 5m0s · Sweeps every 1h0m0s
  Last poll: 14:35:12 · Last sweep: 14:30:00
  Active findings: 3

  🔴 payment-service error rate spike ✨
     Error rate jumped from 0.1% to 3.2% in the last 15 minutes
     → Investigate payment-service error rate spike and correlate with recent deploys

  🟡 auth-service connection pool at 82% ✨
     Connection pool utilization trending toward saturation
     → Check auth-service connection pool usage and project when it hits the limit

  🔵 Upcoming cert expiry: api.example.com
     TLS certificate expires in 12 days
     → Scan TLS certificates and identify renewal schedule

Use /sentinel off to stop. The sentinel runs in the background — you can investigate other things while it monitors.

MCP tool support — Extend NeuBird with Model Context Protocol servers for additional data sources and capabilities beyond SQL.

Installation

macOS / Linux (Homebrew)

brew install neubirdai/tap/neubird

Linux (Snap)

sudo snap install neubird-desktop

Linux (Debian / Ubuntu)

curl -LO https://github.com/neubirdai/neubird-desktop/releases/latest/download/neubird_linux_amd64.deb
sudo dpkg -i neubird_linux_amd64.deb

Linux (Fedora / RHEL)

curl -LO https://github.com/neubirdai/neubird-desktop/releases/latest/download/neubird_linux_amd64.rpm
sudo rpm -i neubird_linux_amd64.rpm

Windows

Download neubird_windows_amd64.zip from the latest release, extract it, and add the folder to your PATH:

# PowerShell — download and extract
Invoke-WebRequest -Uri "https://github.com/neubirdai/neubird-desktop/releases/latest/download/neubird_windows_amd64.zip" -OutFile neubird.zip
Expand-Archive neubird.zip -DestinationPath "$env:LOCALAPPDATA\neubird"

# Add to PATH (current session)
$env:PATH += ";$env:LOCALAPPDATA\neubird"

# Add to PATH (permanent — requires restart)
[Environment]::SetEnvironmentVariable("PATH", $env:PATH + ";$env:LOCALAPPDATA\neubird", "User")

Docker

docker run -it --rm neubirdai/neubird-desktop:latest

Verify installation

neubird --version

Quickstart

1. Log in

neubird login https://yourcompany.app.neubird.ai

2. Connect and investigate

neubird

# Reconnect to your last session
neubird --reconnect

# Run a health sweep immediately after connecting
neubird --assess

3. Ask questions

Once connected, type any question and press Enter. NeuBird streams its investigation — showing every tool call and reasoning step — then delivers a formatted analysis.

4. Use slash commands

Command What it does
/health Infrastructure health sweep (1h lookback)
/health 4h Health sweep with custom lookback
/cost Cloud cost analysis + 24h projection
/handoff On-call shift briefing
/changes What changed? — compare time windows
/timeline Reconstruct incident timeline
/pir Post-incident review document
/slo SLO burn rate check
/blast-radius Map failure blast radius
/certs TLS certificate expiry scan
/sentinel Start sentinel mode — continuous alert monitoring
/sentinel status Show sentinel findings
/sentinel off Stop sentinel
/export Export answer — interactive picker (txt/md/pdf, file/clipboard)
/export pdf Export directly as PDF
/export full md Export full conversation as markdown
/copy Copy last answer to clipboard
/agent Switch agent persona
/welcome Show the welcome screen
/tables List available telemetry tables
/tools Show all tools with capabilities
/project Switch database
/config Show current connection and saved state
/mcp Show MCP server status
/clear Clear display (keeps AI context)
/reset Clear conversation history

Keyboard Shortcuts

Key Action
Enter Submit question
Esc Cancel in-progress investigation
Ctrl+O Toggle tool output review (works mid-investigation)
Ctrl+T Tool navigation mode — browse individual tool calls
↑ / ↓ Navigate input history
Ctrl+R Reverse search through history
← / → Navigate welcome screen
Tab Accept suggestion / complete command
? Toggle shortcut overlay
Ctrl+C (twice) Quit

Supported Data Sources

NeuBird connects to your existing telemetry and operations tools via read-only API integrations. Out of the box, it works with:

Incident Management — PagerDuty, Opsgenie, ServiceNow

Monitoring & Observability — Datadog, Grafana (Loki, Tempo, Mimir), CloudWatch, New Relic, Prometheus

Cloud Infrastructure — AWS (EC2, RDS, ECS, Lambda, Cost Explorer), GCP, Azure

CI/CD & Source Control — GitHub, GitLab, ArgoCD

Data Warehouses — Snowflake, BigQuery, Redshift

And more — Jira, Confluence, Slack, Kubernetes, custom FDW plugins

Local CLI tools (auto-detected on your machine) — kubectl, helm, docker/podman, AWS CLI, gcloud, Azure CLI, git, terraform, curl, dig, openssl

Server API

When running in server mode (neubird serve), NeuBird exposes a REST API for programmatic access:

Method Endpoint Description
GET /health Server health check
POST /v1/infer Run an investigation (SSE streaming)
GET /v1/tools?session_id=... List available tools for a session
GET /v1/skills List available investigation skills
POST /v1/sentinel/start Start sentinel mode
POST /v1/sentinel/stop Stop sentinel mode
GET /v1/sentinel/status Sentinel status and config
GET /v1/sentinel/findings Current sentinel findings
POST /v1/sessions/reset Reset conversation history
POST /v1/sessions/delete Delete a session
POST /v1/sessions/feedback Submit rating or insight
POST /v1/prompt/submit Submit a prompt for async processing (returns prompt_id)

Async Prompt Queue

Prompts can be submitted to a Redis-backed queue instead of the synchronous /v1/infer endpoint. Any agent pod with access to the same Redis picks up the prompt and processes it. This enables horizontal scaling — add more pods to process more prompts in parallel.

# Submit a prompt
curl -X POST http://localhost:8080/v1/prompt/submit \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Why did latency spike on api-gateway?", "project_uuid": "your-project-id"}'

# Response: {"prompt_id": "a1b2c3d4-..."}

The request body accepts the same fields as /v1/infer. If prompt_id is omitted, one is generated automatically.

Reading results — The agent streams output to Redis at <org>:response:<prompt_id>. Poll the status key <org>:response:<prompt_id>:status for the current state:

Status Meaning
processing An agent is actively working on the prompt
complete Prompt finished successfully
dead_letter Prompt failed after exhausting all retries

Retries and dead-letter — If an agent crashes or fails mid-prompt, the message stays in the queue and is automatically reclaimed by another agent after a configurable idle period. After the maximum number of delivery attempts (default 3), the prompt is moved to a dead-letter stream (<org>:prompt:deadletter) and marked dead_letter so it is not retried forever.
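Polling the status key can be sketched as below. The `get` parameter is any callable mapping a key to a string or None, for example a redis-py client's `get` with `decode_responses=True`; injecting it keeps the sketch testable without a live Redis, and the function itself is illustrative, not part of NeuBird.

```python
import time

def wait_for_result(get, org: str, prompt_id: str,
                    timeout: float = 300.0, interval: float = 2.0) -> str:
    """Poll <org>:response:<prompt_id>:status until a terminal state.

    Returns "complete" or "dead_letter"; raises TimeoutError if the
    prompt is still in flight when the deadline passes.
    """
    status_key = f"{org}:response:{prompt_id}:status"
    deadline = time.monotonic() + timeout
    status = None
    while time.monotonic() < deadline:
        status = get(status_key)
        if status in ("complete", "dead_letter"):
            return status
        time.sleep(interval)
    raise TimeoutError(f"prompt {prompt_id} still {status!r} after {timeout}s")
```

Once the status is `complete`, read the streamed output from `<org>:response:<prompt_id>`.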

Configuration:

Env var Description Default
REDIS_URI Redis connection string redis://redis-master.platform.svc.cluster.local:6379
ORG_NAME Org prefix for all Redis keys local
PROMPT_QUEUE_MAX_CONCURRENT Max parallel prompt workers per pod 8
PROMPT_QUEUE_AUTOCLAIM_IDLE Idle time before a stalled prompt is reclaimed 12m
PROMPT_QUEUE_MAX_RETRIES Max delivery attempts before dead-letter 3

The consumer loop starts automatically when Redis is reachable. If Redis is not configured, the /v1/prompt/submit endpoint is not registered and only synchronous /v1/infer is available.

Sentinel API

Start sentinel monitoring via the API:

# Start sentinel with custom intervals
curl -X POST http://localhost:8080/v1/sentinel/start \
  -H "Content-Type: application/json" \
  -d '{"poll_interval": "5m", "sweep_interval": "30m", "project_uuid": "your-project-id"}'

# Check findings
curl http://localhost:8080/v1/sentinel/findings

# Stop sentinel
curl -X POST http://localhost:8080/v1/sentinel/stop

All endpoints require the X-Desktop-Secret header when DESKTOP_SECRET is set.

SSE Event Types

The /v1/infer endpoint streams Server-Sent Events. Each event is a JSON object with a type field:

SSE Event Type Go Callback TUI Handler Web Proxy (falcon.py) Description
chunk OnText sp.AddText() Code-fence filter → RCA suppressor → yield markdown Streamed text from Claude (investigation narration, final answer, Opus synthesis)
tool_start OnToolExec (IsStart=true) sp.AddToolStart() update_status with friendly label, CoT tracker Agent begins executing a tool
tool_done OnToolExec (IsStart=false) sp.AddToolResult() Renders markdown table or hides exploratory errors Tool execution completed (output may be truncated to 8KB)
tool_data OnToolData sp.AddToolData() Fallback table render from full JSON; stores raw SQL results Full structured output from data-producing tools (not truncated)
data_read OnDataRead sp.SetDataRead() Logs bytes/rows/queries, updates CoT tracker Cumulative telemetry data volume (bytes, rows, queries from exec_sql)
progress OnProgress sp.SetProgress() update_progress(pct, confidence), CoT state Investigation progress percentage, phase, and confidence
phase OnPhase sp.SetAgentPhase() → overrides progress bar label Yields demarcated falcon_phase JSON block Agent phase transition: investigating, answering, synthesizing
turn OnTurn sp.SetTurn() Not handled (logged, ignored) LLM iteration number within the tool-use loop
trace OnTrace sp.SetTraceID() Not handled (logged, ignored) OpenTelemetry trace ID for Langfuse correlation
sources OnSources sp.SetSourceSummary() Stored; emitted as CONTENT_TYPE_SOURCES in finally block Data sources (tables, URLs) used in the investigation
health_report OnHealthReport RenderHealthSummaryTiles() → sp.AddText() Yields inline markdown block Sonnet's structured health findings (mid-stream, before Opus synthesis)
rca_saved OnRCASaved Not wired (TUI reads file paths from text) Persists RCA markdown + structured JSON to MongoDB Opus synthesis report saved as PDF/markdown
session_metadata OnSessionMetadata Not wired Yields demarcated JSON; persists to session store on COMPLETED Running investigation metadata (cost, timing, actions, time-saved)
rating_request OnRatingRequest Not wired Logged, ignored (Raven UI handles ratings) Signal to show a thumbs-up/down prompt
interim_summary OnInterimSummary sp.SetInterimSummary() Yields demarcated interim_summary JSON block Background "what we know so far" narrative (Haiku, after each tool round)
synthesizing Updates status spinner Signal that Opus senior review is starting
summary_text OnSummaryText Not wired (falls through to OnText) Logged (same content arrives via chunk) Synthesis/answer text on a dedicated channel for clients that distinguish it
error OnError sp.SetError() Yields warning message, breaks stream loop Non-recoverable error (or synthetic event from connection loss)
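A minimal client can be sketched as follows. This handles only `data:` fields carrying JSON plus the `[DONE]` sentinel the server emits on completion; real SSE streams also allow `event:`, `id:`, and comment lines, which this sketch skips.

```python
import json

def iter_events(lines):
    """Yield (type, event) pairs from an SSE body, line by line (a sketch).

    `lines` is any iterable of decoded text lines, e.g. the chunked
    response body of POST /v1/infer. Stops at the [DONE] sentinel.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, other fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        event = json.loads(payload)
        yield event.get("type"), event
```

A client would typically dispatch on the first element of each pair: append `chunk` text, update a progress bar on `progress`, and so on.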

Investigation Flow

Every investigation follows this sequence. Opus synthesis only runs when specific conditions are met (see When Opus Runs below).

User question
  │
  ▼
┌─────────────────────────────────────────────────────────────────┐
│  Phase: investigating  (Sonnet tool loop)                       │
│                                                                 │
│  Events: chunk, tool_start, tool_done, tool_data, data_read,   │
│          progress, turn, interim_summary                        │
│                                                                 │
│  Claude iterates: think → call tools → observe results → think  │
│  Tools execute in parallel. Up to 25 iterations per batch.      │
│  Each iteration is one SendMessageStream API call.              │
│                                                                 │
│  After each tool round, a background Haiku call produces a      │
│  rolling "what we know so far" narrative (interim_summary).     │
│  This is a read-only side-channel — it never modifies Claude's  │
│  context window or alters the investigation flow.               │
│                                                                 │
│  Loop exits when:                                               │
│    • StopReason = "end_turn" → Claude is done (normal exit)     │
│    • StopReason = "max_tokens" → auto-continue with same state  │
│    • StopReason = "tool_use" → execute tools, loop again        │
│    • Iteration limit (25) hit → summarize, ask to continue      │
└─────────────┬───────────────────────────────────────────────────┘
              │
              ▼
┌─────────────────────────────────────────────────────────────────┐
│  Phase: answering  (Sonnet final turn)                          │
│                                                                 │
│  Events: phase("answering")                                     │
│                                                                 │
│  Sonnet's last text was already streamed during the loop.       │
│  This phase marker tells clients the investigation is complete  │
│  and any future chunks are synthesis.                           │
└─────────────┬───────────────────────────────────────────────────┘
              │
              ▼
        ┌──────────────┐
        │ Opus needed? │──── No ──── skip to completion
        └──────┬───────┘
               │ Yes
               ▼
┌─────────────────────────────────────────────────────────────────┐
│  Phase: synthesizing  (Opus senior review)                      │
│                                                                 │
│  Events: phase("synthesizing"), health_report (health only),    │
│          chunk (Opus text), rca_saved                            │
│                                                                 │
│  Opus reviews ALL evidence from the Sonnet loop and produces    │
│  a rigorous synthesis: RCA, health assessment, or cost analysis.│
│  Text streams as chunk events — the user reads it live.         │
│  A PDF/markdown report is saved automatically.                  │
└─────────────┬───────────────────────────────────────────────────┘
              │
              ▼
┌─────────────────────────────────────────────────────────────────┐
│  Completion                                                     │
│                                                                 │
│  Events: session_metadata (COMPLETED), sources, rating_request, │
│          phase not emitted — OnDone fires                       │
│                                                                 │
│  TUI: green "✅ completed in ..." footer bar                    │
│  Server: [DONE] SSE sentinel                                    │
└─────────────────────────────────────────────────────────────────┘

When Opus Runs

Opus synthesis requires both conditions to be true. If either fails, Sonnet's answer stands alone — no Opus pass.

Investigation Type Question Detection Minimum Sources Opus Function
Health Question starts with "Run a rapid infrastructure health assessment." (generated by /health slash command) ≥ 3 runHealthSynthesis — senior SRE review of Good/Bad/Ugly findings
Cost Question starts with the cost analysis prefix (generated by /cost slash command) ≥ 2 runCostSynthesis — cost trend extrapolation and ROI-ranked cuts
RCA Question contains trigger keywords (see below) ≥ 3 runSynthesis — rigorous root cause analysis with evidence chain
General None of the above, or 0 sources queried No Opus — Sonnet's answer is the final output

RCA trigger keywords — if any of these appear anywhere in the user's question (case-insensitive), the question is eligible for Opus RCA synthesis:

root cause, rca, investigate, incident, what caused, what happened, why is, why are, why did, find the cause, diagnose, postmortem, post-mortem, find out why, long running, slow quer, slow job, performance, jobs, blocking, locks, deadlock, timeout
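The keyword check above amounts to a case-insensitive substring match, sketched here for clarity (the function name is illustrative, not NeuBird's):

```python
RCA_KEYWORDS = (
    "root cause", "rca", "investigate", "incident", "what caused",
    "what happened", "why is", "why are", "why did", "find the cause",
    "diagnose", "postmortem", "post-mortem", "find out why", "long running",
    "slow quer", "slow job", "performance", "jobs", "blocking", "locks",
    "deadlock", "timeout",
)

def is_rca_candidate(question: str) -> bool:
    """True if any trigger keyword appears anywhere in the question.

    Note that "slow quer" is a deliberate stem: it matches both
    "slow query" and "slow queries".
    """
    q = question.lower()
    return any(kw in q for kw in RCA_KEYWORDS)
```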

Common reasons Opus doesn't run:

  • The question doesn't match any trigger keyword — e.g. "Show me trace latencies" won't trigger Opus even with 10 sources, because none of the RCA keywords are present. Rephrasing as "Investigate the trace latency slowdown" would trigger it.
  • Claude answered from conversation context or knowledge without making any tool calls (0 sources).
  • The investigation only queried 1-2 data sources — not enough evidence for a meaningful synthesis.

Stream Retry on Transient Errors

When the Anthropic API drops the connection mid-stream (HTTP/2 reset, INTERNAL_ERROR, TCP reset), the agent retries the current LLM call automatically. The conversation history is intact — the failed partial response was never appended — so Claude regenerates from the same state.

  • Up to 2 retries with 3s / 6s backoff
  • User sees 🔄 LLM connection lost, retrying (1/2)... inline
  • Applies to both the Sonnet investigation loop and tool-use iterations
  • Non-transient errors (auth failures, rate limits, context cancellation) are not retried
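The retry loop can be sketched as below. The exception types are stand-ins for the transport failures named above, and `send` is any zero-argument callable that replays the same conversation state; since the failed partial response was never appended to history, replaying is safe.

```python
import time

TRANSIENT = (ConnectionError, TimeoutError)  # stand-ins for HTTP/2 / TCP resets

def call_with_retry(send, retries: int = 2, base_delay: float = 3.0):
    """Retry a streaming LLM call on transient errors (a sketch).

    Backoff is base_delay * attempt number: 3s after the first failure,
    6s after the second, matching the documented 3s / 6s schedule.
    """
    for attempt in range(retries + 1):
        try:
            return send()
        except TRANSIENT:
            if attempt == retries:
                raise  # retries exhausted; surface the error
            time.sleep(base_delay * (attempt + 1))
```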

Thin-Answer Conclusion Injection

Claude sometimes stops after gathering evidence without writing a conclusion — especially on lower-confidence investigations where it queried data but didn't synthesize findings. The user sees tool results followed immediately by the green completion bar with no analysis in between.

The agent detects this and injects a conclusion prompt automatically:

  • Triggers when: end_turn with < 200 characters of text AND ≥ 3 tool calls were executed
  • What happens: A conclusion prompt is appended to the conversation and one more SendMessageStream call is made without tools (so Claude writes text only — no more queries)
  • User sees: The analysis streams in naturally, as if Claude had written it in the first place
  • Safety: Only fires once per investigation (concludedOnce flag prevents loops). If the conclusion call fails, the investigation completes normally — the user still has all the evidence from the tool calls

This does not apply to simple questions (< 3 tool calls) where Claude's short answer is appropriate, nor to cases where Claude wrote a substantial response alongside its last tool call.
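The trigger condition reduces to a single predicate, sketched here (names are illustrative; only the thresholds come from the description above):

```python
def needs_conclusion(stop_reason: str, answer_chars: int,
                     tool_calls: int, concluded_once: bool) -> bool:
    """True when a thin answer warrants an injected conclusion pass.

    Fires on end_turn with under 200 characters of text after 3 or more
    tool calls, and at most once per investigation.
    """
    return (stop_reason == "end_turn"
            and answer_chars < 200
            and tool_calls >= 3
            and not concluded_once)
```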

Documentation

Full documentation is available at neubirdai.github.io/neubird-desktop.

License

Proprietary. Copyright 2024-2026 NeuBird, Inc.