Code Search
The code search tool gives agents semantic and lexical access to indexed repositories. It started as a single search_codebase call; it now ships as a small toolkit of eight related tools, so an agent can move from "find the file" to "open the right lines" without ever leaving the toolkit.
The eight tools
| Tool | What it does | When the agent picks it |
|---|---|---|
| search_codebase | Vector search inside one or more named repos | The agent knows which repo to look in |
| multi_repo_search | Vector search across all indexed repos (or a subset) | The agent doesn't know which repo holds the answer |
| find_similar_code | Embeds an actual code snippet (not a query) and finds similar code | The agent has code and wants to find analogous code elsewhere |
| list_indexed_files | Lists every file embedded in a repo, with chunk counts | Sanity-checking what's actually in the index |
| get_file_chunks | Returns all indexed segments for one file | Drilling into a file after a hit, without reading raw bytes |
| grep | Regex/literal search in a file or folder, with context lines | Exact-string lookups: names, error messages, TODO tags |
| glob | Glob-pattern file listing (**/*.py, etc.) | Discovering files by name/extension |
| read_file_by_lines | Reads a line range (1-indexed, inclusive) from a file | Inspecting a slice after search_codebase or grep returns line numbers |
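The "1-indexed, inclusive" convention of read_file_by_lines is easy to get off by one when turning a search hit into a read. The following is a minimal sketch of that convention only; `read_lines` is a hypothetical helper written for illustration, not Synapse's implementation.

```python
from itertools import islice


def read_lines(file_path: str, start_line: int = 1, end_line: int = 100) -> list[str]:
    """Return lines start_line..end_line (1-indexed, inclusive), newlines stripped.

    Hypothetical helper: it mirrors the documented defaults of
    read_file_by_lines but is not the tool's actual code.
    """
    if start_line < 1 or end_line < start_line:
        raise ValueError("expected 1 <= start_line <= end_line")
    with open(file_path, encoding="utf-8") as handle:
        # islice is 0-indexed and end-exclusive, so only the start shifts by one.
        return [line.rstrip("\n") for line in islice(handle, start_line - 1, end_line)]
```

With this convention, a hit reported at lines 40 to 55 maps to `read_lines(path, 40, 55)` and returns 16 lines, provided the file is at least 55 lines long.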
Argument reference
search_codebase — required: query, repo_ids. Optional: top_k (default 10), file_filter (substring match on file path), min_score (0–1, default 0.0 — raise to ~0.7 to drop weak matches).
multi_repo_search — required: query. Optional: repo_ids (omit to search every indexed repo), top_k, file_filter, min_score.
find_similar_code — required: code (the snippet to find analogues of). Optional: repo_ids, top_k, file_filter, min_score.
list_indexed_files — required: repo_id. Optional: file_filter.
get_file_chunks — required: repo_id, filename (relative path as returned by search_codebase).
grep — required: path (absolute), pattern. Optional: file_pattern (glob filter when path is a folder, default *), recursive (default true), ignore_case (default false), fixed (literal substring, default false), context (lines before/after each match, default 0), max_matches (default 1000).
glob — required: folder_path (absolute). Optional: pattern (default **/*), recursive (default true), include_dirs (default false), include_hidden (default true), max_results (default 1000).
read_file_by_lines — required: file_path (absolute, or a relative name resolved against active repos and the vault). Optional: repo_id, start_line (default 1), end_line (default 100).
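To make the interaction of top_k, file_filter, and min_score concrete, here is a small sketch that applies them to a list of scored hits. The `Hit` type and `filter_hits` function are hypothetical, and the source does not say whether Synapse applies these options inside the vector query or afterwards; treating min_score as an inclusive threshold is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hit:
    """Hypothetical search result: a file path and a similarity score in 0..1."""
    filename: str
    score: float


def filter_hits(
    hits: list[Hit],
    top_k: int = 10,
    file_filter: Optional[str] = None,
    min_score: float = 0.0,
) -> list[Hit]:
    """Apply the documented optional arguments to already-scored hits."""
    kept = [
        hit
        for hit in hits
        if hit.score >= min_score  # assumption: the threshold is inclusive
        and (file_filter is None or file_filter in hit.filename)  # substring match on path
    ]
    kept.sort(key=lambda hit: hit.score, reverse=True)  # best matches first
    return kept[:top_k]
```

Raising min_score to around 0.7, as suggested above, trades recall for precision: weak matches disappear even when fewer than top_k results remain.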
Access control
grep, glob, and read_file_by_lines are scoped to configured repo paths plus the vault. They cannot read arbitrary host paths even when given absolute paths outside those roots — Synapse rejects the call. This is enforced at tool entry, not by the agent's prompt.
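A root-scoping check of this kind is commonly written by resolving the requested path and testing whether it sits under an allowed root. The sketch below illustrates that general technique under stated assumptions: `is_path_allowed` and the example roots are hypothetical, and Synapse's actual check may differ.

```python
from pathlib import Path


def is_path_allowed(candidate: str, allowed_roots: list[str]) -> bool:
    """Return True only if candidate resolves to an allowed root or a path beneath one.

    Hypothetical helper illustrating root scoping; not Synapse's code.
    """
    # resolve() collapses ".." segments and follows symlinks, so traversal
    # tricks are normalised before the comparison.
    resolved = Path(candidate).resolve()
    for root in allowed_roots:
        root_path = Path(root).resolve()
        # Compare whole path components, not string prefixes, so that
        # /srv/repos/app-secrets is not mistaken for a child of /srv/repos/app.
        if resolved == root_path or root_path in resolved.parents:
            return True
    return False
```

Comparing path components rather than raw string prefixes is the important design choice here: a plain `startswith` check would accept sibling directories that merely share a name prefix with an allowed root.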
Setting up code search
1. Enable code repository indexing
Open Settings → General and toggle Code Repository Indexing on. The first time you enable it, Synapse walks you through the PostgreSQL + pgvector setup (see PostgreSQL + pgvector setup below).
2. Add a repository
Go to Settings → Repos. The page asks for:
| Field | Description |
|---|---|
| Repo Name | Display name (e.g. "Frontend App") |
| Absolute Directory Path | Full path to the repo on the host (e.g. /home/user/projects/app) |
| Interconnection Description | Free-text hint for the LLM about what the repo contains |
| Extra Excluded Patterns (advanced) | One per line. data excludes every dir named data; backend/data excludes only that path; *.json excludes JSON files anywhere |
| Embedding model provider | Defaults to gemini; change in Settings → Models |
The defaults work for most repos. Included extensions: .py, .ts, .tsx, .js, .jsx, .rs, .go, .java, .md, .html, .vue, .css, .scss, .cpp, .c. Excluded by default: node_modules, __pycache__, venv, .git, *.pyc, and all dotfiles.
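The three pattern forms described for Extra Excluded Patterns (bare name, path, glob) can be read as follows. This is a sketch of one plausible interpretation, assuming paths are relative to the repo root and use forward slashes; `is_excluded` is a hypothetical function, and the real matcher may handle edge cases differently.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def is_excluded(rel_path: str, patterns: list[str]) -> bool:
    """Decide whether a repo-relative path matches any exclusion pattern.

    Hypothetical interpretation of the documented pattern forms.
    """
    parts = PurePosixPath(rel_path).parts
    for pattern in patterns:
        if "/" in pattern:
            # Path pattern such as "backend/data": anchored at the repo root,
            # matching that path and everything beneath it.
            prefix = PurePosixPath(pattern).parts
            if parts[: len(prefix)] == prefix:
                return True
        elif any(fnmatchcase(part, pattern) for part in parts):
            # Bare name ("data") or glob ("*.json"): matches any single
            # path component, at any depth.
            return True
    return False
```

Under this reading, `data` excludes frontend/data/seed.py but not src/database.py, because the pattern must match a whole path component rather than a substring.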
3. Index the repository
Click Index on the repo card. Synapse chunks files by function/class/section, embeds each chunk, and writes the vectors to PostgreSQL. Status transitions: pending → indexing → indexed. Large repos (10k+ files) take a few minutes.
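Chunking "by function/class/section" can be pictured with a small example. The sketch below splits Python source into one chunk per top-level function or class using the standard library `ast` module. It is an illustration of the idea only: the source does not describe Synapse's chunker, other languages and Markdown sections would need their own parsers, and the embedding and PostgreSQL write steps are left out.

```python
import ast


def chunk_python_source(source: str) -> list[dict]:
    """Split Python source into one chunk per top-level function or class.

    Hypothetical sketch; module-level statements outside any definition
    (imports, constants) are skipped here.
    """
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append(
                {
                    "name": node.name,
                    "kind": type(node).__name__,
                    "start_line": node.lineno,  # 1-indexed line of the def/class keyword
                    "end_line": node.end_lineno,  # inclusive last line of the body
                    "text": ast.get_source_segment(source, node),
                }
            )
    return chunks
```

Keeping start_line and end_line on each chunk is what lets a search hit be handed straight to read_file_by_lines.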
4. Assign repos to agents
Open the agent editor (Agents → click your agent), find the Repos field, and tick the repos this agent should search. Code-type agents get prompt scaffolding that nudges them to use search_codebase early in the loop.
Advanced: direct settings.json / agent JSON edit
{ "repos": ["repo-abc123", "repo-def456"] }
Example agent prompts
Find the function that handles MCP server registration
Where is the vault_write tool implemented?
Show me all places where the anthropic client is initialised
Find tests for the orchestration engine
For exact-string lookups (error messages, env var names), nudge the agent toward grep:
Use grep to find every place that references AWS_SECRET_KEY in the backend repo
Re-indexing
After significant code changes, click Re-index on the repo card. The tool cache for code_search is auto-invalidated when a repo is re-indexed, so subsequent searches see the new chunks immediately — see Caching.
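The invalidation behaviour can be sketched as a cache keyed by repo and query that drops a repo's entries when that repo is re-indexed. Everything here is an assumption made for illustration: `SearchCache`, its key layout, and per-repo granularity are hypothetical, and Synapse may instead clear the whole code_search cache.

```python
from typing import Optional


class SearchCache:
    """Hypothetical in-memory cache of search results, keyed by (repo_id, query)."""

    def __init__(self) -> None:
        self._entries: dict[tuple[str, str], list[str]] = {}

    def get(self, repo_id: str, query: str) -> Optional[list[str]]:
        """Return cached results, or None on a miss."""
        return self._entries.get((repo_id, query))

    def put(self, repo_id: str, query: str, results: list[str]) -> None:
        self._entries[(repo_id, query)] = results

    def invalidate_repo(self, repo_id: str) -> int:
        """Drop every cached entry for one repo; return how many were removed."""
        stale = [key for key in self._entries if key[0] == repo_id]
        for key in stale:
            del self._entries[key]
        return len(stale)
```

Calling `invalidate_repo` at the end of a re-index run is what makes the next search miss the cache and read the fresh chunks.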
PostgreSQL + pgvector setup
Toggling Code Repository Indexing in Settings → General kicks off a guided setup that adapts to what Synapse finds on your machine:
| State | What Synapse shows |
|---|---|
| PostgreSQL not installed | Install instructions for Ubuntu/Debian, macOS, and Windows |
| No database configured | A form for host, port, username, password, database name — click Create Database to provision and enable pgvector |
| Existing connection broken | The same form pre-filled, with the failing URL surfaced — click Save & Connect |
| pgvector extension missing | Instructions to run CREATE EXTENSION vector; plus OS package install commands |
If you prefer to set it up manually:
CREATE DATABASE synapse;
\c synapse
CREATE EXTENSION vector;
Then point Synapse at it via Settings → General → Code Repository Indexing → connection form.
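The connection form collects host, port, username, password, and database name. If you need the equivalent connection URL, for example to test the database with another client, the fields combine into the standard PostgreSQL URI form as sketched below. `build_postgres_url` is a hypothetical helper, and whether Synapse stores its connection in exactly this form is an assumption.

```python
from urllib.parse import quote


def build_postgres_url(host: str, port: int, username: str, password: str, database: str) -> str:
    """Assemble a postgresql:// URI from the individual connection fields.

    Hypothetical helper; the field names mirror the setup form.
    """
    # Percent-encode credentials and database name so characters such as
    # "@", ":" and "/" cannot be mistaken for URI delimiters.
    user = quote(username, safe="")
    secret = quote(password, safe="")
    name = quote(database, safe="")
    return f"postgresql://{user}:{secret}@{host}:{port}/{name}"
```

Unencoded special characters in a password are a common cause of a connection URL that fails to parse, so the encoding step is worth keeping even in a quick script.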