Roadmap

Distribution

| Feature | Status | Description |
|---------|--------|-------------|
| Nix flake | Planned | nix run github:blackwell-systems/agent-lsp |

Extensions

Extensions add language-specific tools beyond what LSP exposes. The 50 core tools cover everything the Language Server Protocol provides; extensions run arbitrary toolchain logic for a specific language.

Go extension (Wave 1 — test + module intelligence)

| Tool | Description |
|------|-------------|
| go.test_run | Run a specific test by name, return full output + pass/fail |
| go.test_coverage | Coverage % and uncovered lines for a file or package |
| go.benchmark_run | Run a benchmark, return ns/op and allocs/op |
| go.test_race | Run with -race, return any data races found |
| go.mod_graph | Full dependency tree as structured data |
| go.mod_why | Why is this package in go.mod? (go mod why) |
| go.mod_outdated | List deps with available upgrades |
| go.vulncheck | govulncheck scan: CVEs with affected symbols |
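
As a rough illustration of how go.test_run could derive pass/fail from the toolchain, the sketch below parses the event stream that `go test -json` emits. The `Action`, `Test`, and `Output` fields are part of that format; the `TestResult` shape and the `parseTestRun` helper are hypothetical names invented here, not the extension's actual API.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// testEvent holds the subset of `go test -json` event fields used here.
type testEvent struct {
	Action string `json:"Action"`
	Test   string `json:"Test"`
	Output string `json:"Output"`
}

// TestResult is a hypothetical return shape for go.test_run.
type TestResult struct {
	Name   string
	Passed bool
	Output string
}

// parseTestRun scans a `go test -json` stream and collects the output
// and final pass/fail status of the named test.
func parseTestRun(stream, name string) (TestResult, error) {
	res := TestResult{Name: name}
	var out strings.Builder
	seen := false
	sc := bufio.NewScanner(strings.NewReader(stream))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		var ev testEvent
		if err := json.Unmarshal([]byte(line), &ev); err != nil {
			return res, fmt.Errorf("decode event: %w", err)
		}
		if ev.Test != name {
			continue // package-level events and other tests
		}
		switch ev.Action {
		case "output":
			out.WriteString(ev.Output)
		case "pass":
			res.Passed, seen = true, true
		case "fail":
			res.Passed, seen = false, true
		}
	}
	if err := sc.Err(); err != nil {
		return res, err
	}
	if !seen {
		return res, fmt.Errorf("no pass/fail event for %q", name)
	}
	res.Output = out.String()
	return res, nil
}

func main() {
	// Hand-written sample events; a real run would read the output of
	// `go test -json -run '^TestAdd$' ./...`.
	stream := `{"Action":"run","Test":"TestAdd"}
{"Action":"output","Test":"TestAdd","Output":"=== RUN   TestAdd\n"}
{"Action":"pass","Test":"TestAdd","Elapsed":0.01}`
	res, err := parseTestRun(stream, "TestAdd")
	fmt.Println(res.Passed, err)
}
```

Skipped tests produce a "skip" action, which this sketch reports as an error; a real tool would likely surface it as a third status.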

Go extension (Wave 2 — build + quality)

| Tool | Description |
|------|-------------|
| go.escape_analysis | -gcflags="-m" output for a function: what allocates and why |
| go.cross_compile | Try cross-compiling for a target OS/arch, return errors |
| go.lint | staticcheck or golangci-lint output for a file |
| go.deadcode | Find exported symbols with no callers (the deadcode tool from golang.org/x/tools/cmd/deadcode) |
| go.vet_all | go vet ./... with structured output |

Go extension (Wave 3 — generation + docs)

| Tool | Description |
|------|-------------|
| go.generate | Run go generate on a file, return output |
| go.generate_status | Which //go:generate directives are stale |
| go.doc | go doc output for any symbol; richer than hover |
| go.examples | Find Example* test functions for a symbol |

TypeScript extension

| Tool | Description |
|------|-------------|
| typescript.tsconfig_diagnostics | Errors in tsconfig.json beyond what the language server reports |
| typescript.type_coverage | Type coverage % for a file (any usage, implicit types) |

Rust extension

| Tool | Description |
|------|-------------|
| rust.cargo_check | cargo check with structured error output |
| rust.dep_tree | Crate dependency tree (cargo tree) |
| rust.clippy | cargo clippy lint output for a file |
| rust.audit | cargo audit CVE scan on Cargo.lock |

Python extension

Python has one of the largest gaps between what pyright-langserver gives an agent and what the toolchain provides directly.

| Tool | Description |
|------|-------------|
| python.test_run | Run a specific pytest test by name, return output + pass/fail |
| python.test_coverage | coverage.py branch coverage for a file or module |
| python.lint | ruff lint output with structured violations |
| python.type_check | mypy type errors for a file (stricter than pyright diagnostics) |
| python.audit | pip-audit CVE scan on installed packages |
| python.security | bandit security scan for a file |
| python.deadcode | vulture dead code detection |
| python.imports | isort check: unsorted or incorrectly grouped imports |

C / C++ extension

The gap between what clangd provides and what the broader toolchain offers is larger than for any other language: sanitizers and profiling are completely outside LSP scope.

| Tool | Description |
|------|-------------|
| cpp.tidy | clang-tidy diagnostics for a file (beyond clangd's built-in checks) |
| cpp.static_analysis | cppcheck output with structured findings |
| cpp.asan_run | Build and run with AddressSanitizer, return memory error output |
| cpp.ubsan_run | Build and run with UndefinedBehaviorSanitizer |
| cpp.valgrind | valgrind --tool=memcheck output for a test binary |
| cpp.symbols | nm / objdump symbol table for a compiled object |

Java extension

| Tool | Description |
|------|-------------|
| java.test_run | Run a specific JUnit test, return output |
| java.coverage | JaCoCo coverage report for a class |
| java.build | Maven/Gradle build with structured error output |
| java.deps | jdeps dependency analysis: which packages does this class use? |
| java.checkstyle | Checkstyle violations for a file |
| java.spotbugs | SpotBugs static analysis findings |

Elixir extension

| Tool | Description |
|------|-------------|
| elixir.test_run | Run a specific ExUnit test, return output |
| elixir.dialyzer | Dialyzer type analysis (a BEAM tool shared with Erlang); finds type errors without annotations |
| elixir.credo | Credo static analysis findings |
| elixir.audit | mix deps.audit CVE scan |

Ruby extension

| Tool | Description |
|------|-------------|
| ruby.test_run | Run a specific RSpec or Minitest test, return output |
| ruby.lint | RuboCop violations for a file |
| ruby.security | Brakeman security scan (Rails) |
| ruby.audit | bundle-audit CVE scan on Gemfile.lock |

Product

| Feature | Status | Description |
|---------|--------|-------------|
| agent-lsp update | Planned | Self-update to the latest release; fetches from GitHub Releases and replaces the binary in-place |
| Config file format | Planned | ~/.agent-lsp.json or agent-lsp.json project file for complex setups with per-server options |
| Continue.dev config support | Planned | agent-lsp init currently skips Continue.dev; it uses a different config format than mcpServers |

Skills

20 skills shipped. See skills.md for the full catalog.

Creation skills

Current skills are oriented around modifying existing code. These skills target greenfield creation workflows where LSP can still add value through completions, diagnostics, and code actions.

| Skill | Description |
|-------|-------------|
| /lsp-create | Iterative file creation with diagnostic checks between steps. Create file, open in LSP, write incrementally, verify diagnostics after each addition, format on completion. Essentially /lsp-safe-edit for files that don't exist yet. |
| /lsp-implement (extend) | Given an interface or type definition, generate the full implementation using get_completions to discover required methods, verify it compiles via diagnostics, then format. |
| /lsp-discover-api | Completion-driven API exploration. Open a file, place the cursor after a package qualifier, call get_completions to show available methods/fields. Use LSP knowledge instead of training data (which may be outdated). |
| /lsp-bootstrap | Project scaffolding with LSP verification. Create build files (go.mod, package.json, Cargo.toml), start LSP, confirm indexing works, verify initial diagnostics are clean before writing application code. |
| /lsp-wire | After creating a new package/module, verify it's importable from the intended consumer, check the public API surface via get_document_symbols, confirm no dangling imports or missing exports. |

Skill composition

Skills should be able to call other skills. /lsp-refactor is already composed from /lsp-impact, /lsp-safe-edit, /lsp-verify, and /lsp-test-correlation. Formal runtime support for skill-to-skill invocation would enable arbitrary composition.

Skill Schema Specification

Skills are currently prose — markdown prompts the agent follows. The inputs and outputs are implicit and unvalidatable. A schema layer would make contracts explicit — what goes in, what comes out — enabling validation and eventual skill composition with typed interfaces.

The case for machine-readable skill contracts:

  • Tooling can validate that an agent invoked a skill correctly
  • Clearer interface between the agent and the skill: what goes in, what comes out
  • Enables skill composition with type safety (skill A's output feeds skill B's input)
  • Documentation that can be auto-generated and kept in sync

| Feature | Status | Description |
|---------|--------|-------------|
| Skill input/output schema | Planned | JSON Schema definitions for each skill's expected inputs and guaranteed outputs; machine-readable contracts alongside the prose skill files |
| Schema validation tooling | Planned | Validate agent skill invocations against the schema at runtime or in CI; surfaces misuse before it causes silent failures |
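
To make the idea of a checkable contract concrete, here is a deliberately small sketch of input validation. It is far simpler than JSON Schema and is not the planned format: the `SkillContract` and `FieldSpec` types, and the `file_path` and `dry_run` input names, are hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// FieldSpec describes one declared input (hypothetical shape).
type FieldSpec struct {
	Type     string // "string", "number", or "boolean"
	Required bool
}

// SkillContract is a hypothetical machine-readable skill contract.
type SkillContract struct {
	Name   string
	Inputs map[string]FieldSpec
}

// typeName maps a decoded JSON value to the contract's type vocabulary.
func typeName(v any) string {
	switch v.(type) {
	case string:
		return "string"
	case float64, int:
		return "number"
	case bool:
		return "boolean"
	default:
		return "unknown"
	}
}

// ValidateInputs returns a sorted list of problems; empty means valid.
func (c SkillContract) ValidateInputs(args map[string]any) []string {
	var problems []string
	for name, spec := range c.Inputs {
		v, ok := args[name]
		if !ok {
			if spec.Required {
				problems = append(problems, "missing required input "+name)
			}
			continue
		}
		if got := typeName(v); got != spec.Type {
			problems = append(problems,
				fmt.Sprintf("input %s: want %s, got %s", name, spec.Type, got))
		}
	}
	for name := range args {
		if _, ok := c.Inputs[name]; !ok {
			problems = append(problems, "unexpected input "+name)
		}
	}
	sort.Strings(problems) // map iteration order is random; keep output stable
	return problems
}

func main() {
	contract := SkillContract{
		Name: "/lsp-safe-edit",
		Inputs: map[string]FieldSpec{
			"file_path": {Type: "string", Required: true},
			"dry_run":   {Type: "boolean"},
		},
	}
	for _, p := range contract.ValidateInputs(map[string]any{"dry_run": "yes"}) {
		fmt.Println(p)
	}
}
```

A check like this could run in CI against recorded invocations or at runtime before a skill starts, which are the two places the table above names.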

IDE Integration

agent-lsp already works with any IDE that has an MCP client (VS Code via Continue/Cline, JetBrains via AI Assistant, Cursor, Windsurf, Neovim via mcp.nvim). The items below improve this from "works" to "native."

Passive mode (connect to existing language servers)

agent-lsp currently launches and manages its own language server processes. In IDE environments, the IDE already has gopls/pyright/rust-analyzer running and indexed. Passive mode would connect to an already-running server instead of spawning a duplicate, eliminating double-indexing and double memory usage.

agent-lsp --connect go:localhost:9999 typescript:localhost:9998

Some language servers support multi-client connections over TCP (gopls supports gopls -listen=:9999). Passive mode would connect to these sockets and share the IDE's warm index. No IDE plugin required for this path.
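
The `--connect` arguments shown above follow a `language:host:port` pattern. A minimal sketch of parsing them is below; the flag is only planned, so the `ConnectTarget` type and `parseConnectSpec` helper are assumptions about how it might be handled, not existing code.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// ConnectTarget is a hypothetical parsed form of one --connect argument.
type ConnectTarget struct {
	Language string
	Addr     string // host:port, suitable for net.Dial("tcp", Addr)
}

// parseConnectSpec splits "language:host:port" into its parts.
func parseConnectSpec(spec string) (ConnectTarget, error) {
	lang, addr, ok := strings.Cut(spec, ":")
	if !ok || lang == "" {
		return ConnectTarget{}, fmt.Errorf("invalid spec %q: want language:host:port", spec)
	}
	host, port, err := net.SplitHostPort(addr)
	if err != nil {
		return ConnectTarget{}, fmt.Errorf("invalid address in %q: %w", spec, err)
	}
	if port == "" {
		return ConnectTarget{}, fmt.Errorf("invalid address in %q: empty port", spec)
	}
	return ConnectTarget{Language: lang, Addr: net.JoinHostPort(host, port)}, nil
}

func main() {
	for _, spec := range []string{"go:localhost:9999", "typescript:localhost:9998", "rust"} {
		target, err := parseConnectSpec(spec)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", target.Language, target.Addr)
	}
}
```

Validating the address up front means a malformed argument fails at startup rather than at the first LSP request.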

| Feature | Status | Description |
|---------|--------|-------------|
| --connect transport | Planned | Connect to an existing language server TCP socket instead of spawning a new process |
| Shared index | Planned | Reuse the IDE's warm language server index; no duplicate indexing or memory overhead |

IDE extensions

| Feature | Status | Description |
|---------|--------|-------------|
| VS Code extension | Planned | Auto-start agent-lsp, command palette for skills, inline diff preview for speculative execution, code lens for blast-radius annotations |
| JetBrains plugin | Planned | Single plugin for all JetBrains IDEs (GoLand, IntelliJ, PyCharm, WebStorm, CLion, Rider). Only needs the com.intellij.modules.platform dependency since agent-lsp manages its own LSP connections. No language-specific module dependencies required. |
| Neovim plugin | Planned | Lua plugin using vim.lsp.get_clients() (the replacement for the deprecated vim.lsp.buf_get_clients()) to proxy requests through existing LSP connections |

CI Performance Metrics

Instrument the existing test suite to capture per-language timing data on every CI run, then publish it as a public docs/metrics.md table. This turns CI from a pass/fail gate into a performance baseline.

What to measure

| Metric | How | Where |
|--------|-----|-------|
| Server init time | start_lsp to first successful response | Existing multi-lang tests |
| Diagnostic settle time | open_document to get_diagnostics returning stable results | Existing multi-lang tests |
| Speculative execution confidence | confidence field from simulate_edit_atomic (high/partial/eventual) | New speculative test per language |
| Speculative round-trip time | simulate_edit_atomic call to response | New speculative test per language |
| Cross-file propagation time | Edit file A → diagnostics update in file B | New test using multi-file fixtures |
| Tool latency (hover, definition, references, completions) | Per-call time.Since wrapping | Existing tier-2 tool tests |

Output schema

Each CI job writes metrics/<language>.json:

{
  "language": "go",
  "server": "gopls",
  "init_ms": 1240,
  "diagnostic_settle_ms": 890,
  "speculative_confidence": "high",
  "speculative_round_trip_ms": 2100,
  "cross_file_propagation_ms": 1800,
  "tool_latency_ms": {
    "hover": 45,
    "definition": 62,
    "references": 310,
    "completions": 120
  },
  "timestamp": "2026-04-21T00:00:00Z",
  "ci_run_id": 12345
}

Files to create/modify

| File | Change |
|------|--------|
| test/metrics.go | New: timing harness, JSON serialization, WriteMetrics(path string) |
| test/multi_lang_test.go | Instrument TestMultiLanguage: wrap each tool call with time.Since, collect into LanguageMetrics struct |
| test/speculative_test.go | Expand to all supported languages (currently Go only); record speculative_confidence and speculative_round_trip_ms per language |
| .github/workflows/ci.yml | Add upload-artifact step per language job; add collect-metrics job that runs after all language jobs, downloads all artifacts, and commits merged metrics.json to a metrics branch |
| scripts/generate-metrics.py | New: reads metrics/<language>.json files, computes p50/p95 after 5+ runs from metrics/history.json, renders docs/metrics.md |
| docs/metrics.md | Generated output: Markdown table with one row per language |

Public dashboard format

| Language   | Server          | Init  | Diag Settle | Spec Confidence | Spec RT | Cross-file |
|------------|-----------------|-------|-------------|-----------------|---------|------------|
| Go         | gopls           | 1.2s  | 0.9s        | high            | 2.1s    | 1.8s       |
| Rust       | rust-analyzer   | 2.1s  | 1.4s        | high            | 2.8s    | 2.2s       |
| TypeScript | typescript-language-server | 0.8s  | 0.6s        | high            | 1.3s    | 1.1s       |
| Python     | pyright         | 1.5s  | 1.1s        | high            | 2.4s    | —          |

Rolling averages

After 5+ CI runs, generate-metrics.py reads metrics/history.json on the metrics branch and replaces single-run numbers with p50/p95 per metric. The history file is a JSON array of per-run records; the script appends the latest run and trims to the last 50 entries.
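
The append-trim-percentile cycle described above can be sketched as follows. The roadmap places this logic in a Python script; it is shown in Go here only to match the other examples. The nearest-rank percentile method, the flat per-metric history slice, and the decision to skip negative (unresolved) samples are all assumptions.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// percentile returns the p-th percentile (0 < p <= 100) of samples using
// the nearest-rank method. Negative samples are treated as unresolved
// measurements and skipped. It returns -1 if no usable samples remain.
func percentile(samples []int64, p float64) int64 {
	valid := make([]int64, 0, len(samples))
	for _, s := range samples {
		if s >= 0 {
			valid = append(valid, s)
		}
	}
	if len(valid) == 0 {
		return -1
	}
	sort.Slice(valid, func(i, j int) bool { return valid[i] < valid[j] })
	rank := int(math.Ceil(p / 100 * float64(len(valid))))
	if rank < 1 {
		rank = 1
	}
	return valid[rank-1]
}

// appendAndTrim appends the latest value and keeps only the last max entries.
func appendAndTrim(history []int64, latest int64, max int) []int64 {
	history = append(history, latest)
	if len(history) > max {
		history = history[len(history)-max:]
	}
	return history
}

func main() {
	// Illustrative init_ms values for one language across runs (made up).
	history := []int64{1240, 1180, 1310, 1205}
	history = appendAndTrim(history, 1990, 50)

	if len(history) >= 5 {
		fmt.Println("p50:", percentile(history, 50))
		fmt.Println("p95:", percentile(history, 95))
	} else {
		fmt.Println("single run:", history[len(history)-1])
	}
}
```

With only a handful of runs, p95 under nearest-rank is simply the slowest sample, so the two numbers become meaningfully different only as the history approaches its 50-entry cap.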

Implementation notes

  • The timing harness must not fail the test on timeout — capture what is available and write -1 for unresolvable metrics.
  • Cross-file propagation requires multi-file test fixtures; Go and TypeScript already have them in test/testdata; Python and Rust need new fixtures.
  • Some languages are expected to report speculative confidence below high; record the actual value rather than treating it as a failure.
  • The collect-metrics CI job should only run on the main branch to avoid polluting the metrics branch with PR data.

Control Plane

The agent-local pipeline (blast-radius → simulate → apply → verify → test) handles correctness for a single session. The control plane adds organizational primitives for teams running agents at scale.

| Feature | Status | Description |
|---------|--------|-------------|
| Audit trail | Shipped | JSONL log of every apply_edit, rename_symbol, and commit_session call with timestamp, affected files, edit summary, pre/post diagnostic state, and net_delta. Configure via --audit-log flag or AGENT_LSP_AUDIT_LOG env var. |
| Change plan output | Planned | Materialize simulate_chain output as a structured, human-reviewable artifact before apply: files, edits, per-step diagnostic delta, safe-to-apply watermark. Three community members have independently requested this. |
| Policy gates | Planned | Configurable rules that block apply based on blast-radius thresholds, public API changes, or path patterns. Evaluate at apply time using the audit record. |
| Cross-session coordination | Planned | Shared state between concurrent MCP sessions: symbol-level lock registry to prevent overlapping renames/refactors. Requires a sidecar daemon or file-based coordination. The hardest piece. |
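
As a sketch of how a policy gate might evaluate the three rule types named above (blast-radius thresholds, public API changes, path patterns), consider the following. The `Policy` and `ChangePlan` types and their fields are hypothetical; the feature is planned and its configuration format is not defined here.

```go
package main

import (
	"fmt"
	"path"
)

// Policy is a hypothetical gate configuration.
type Policy struct {
	MaxBlastRadius int      // 0 disables the threshold check
	BlockPublicAPI bool     // reject edits that change exported API
	ProtectedPaths []string // glob patterns in path.Match syntax
}

// ChangePlan is a hypothetical summary of a pending apply.
type ChangePlan struct {
	AffectedFiles    []string
	BlastRadius      int
	ChangesPublicAPI bool
}

// Evaluate returns the list of violations; an empty result means the
// apply may proceed.
func Evaluate(p Policy, c ChangePlan) []string {
	var violations []string
	if p.MaxBlastRadius > 0 && c.BlastRadius > p.MaxBlastRadius {
		violations = append(violations,
			fmt.Sprintf("blast radius %d exceeds limit %d", c.BlastRadius, p.MaxBlastRadius))
	}
	if p.BlockPublicAPI && c.ChangesPublicAPI {
		violations = append(violations, "change modifies public API")
	}
	for _, file := range c.AffectedFiles {
		for _, pattern := range p.ProtectedPaths {
			matched, err := path.Match(pattern, file)
			if err != nil {
				violations = append(violations,
					fmt.Sprintf("bad pattern %q: %v", pattern, err))
				continue
			}
			if matched {
				violations = append(violations,
					fmt.Sprintf("%s matches protected pattern %q", file, pattern))
			}
		}
	}
	return violations
}

func main() {
	policy := Policy{MaxBlastRadius: 10, BlockPublicAPI: true, ProtectedPaths: []string{"api/*.go"}}
	plan := ChangePlan{
		AffectedFiles:    []string{"api/server.go", "internal/cache/cache.go"},
		BlastRadius:      25,
		ChangesPublicAPI: false,
	}
	for _, v := range Evaluate(policy, plan) {
		fmt.Println("blocked:", v)
	}
}
```

Note that `path.Match` does not let `*` cross a `/`, so `api/*.go` covers `api/server.go` but not `api/v1/server.go`; a real gate would probably need recursive globs.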

Bigger Bets

| Feature | Status | Description |
|---------|--------|-------------|
| Observability | Planned | Metrics (requests/sec, latency per tool, error rate) for production deployments; valuable for teams running agent-lsp as shared infrastructure |