System Design
GIM connects your development environment to a shared knowledge base of community-verified fixes. Here's how the pieces fit together.
Architecture Overview
- Developer IDE: Claude Code CLI
- MCP Client: Tool invocation layer
- GIM Server: Matching & deduplication
- Knowledge Base: Embeddings & verified fixes
Data Flow
When your AI assistant encounters an error, the following sequence occurs:
- AI assistant encounters an error during coding
- Calls an MCP tool (e.g., `gim_search_issues`)
- GIM Server receives the request and sanitizes input
- Generates semantic embedding via Gemini
- Performs vector search in Qdrant
- Returns ranked results to the AI assistant
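The server-side part of this sequence can be sketched as a small pipeline. This is a minimal illustration, not GIM's actual implementation: `handle_search` and its three injected callables are hypothetical names, and the toy stand-ins below replace the real Gemini and Qdrant calls so the sketch runs on its own.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def handle_search(
    error_message: str,
    sanitize: Callable[[str], str],
    embed: Callable[[str], Vector],
    vector_search: Callable[[Vector], list[dict]],
) -> list[dict]:
    """Run the server-side steps in order: sanitize, embed, search, rank."""
    clean = sanitize(error_message)   # sanitize input before anything else
    vector = embed(clean)             # semantic embedding (Gemini in GIM)
    hits = vector_search(vector)      # vector search (Qdrant in GIM)
    # Return results best-first; each hit is assumed to carry a "score" field.
    return sorted(hits, key=lambda h: h["score"], reverse=True)


# Toy stand-ins (assumptions) so the sketch runs without external services.
results = handle_search(
    "  ModuleNotFoundError: No module named 'numpy'  ",
    sanitize=str.strip,
    embed=lambda text: [float(len(text))],
    vector_search=lambda v: [{"id": "a", "score": 0.4}, {"id": "b", "score": 0.9}],
)
```

Passing the steps in as callables keeps the order of operations visible and makes each stage replaceable in tests.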
Core Components
MCP Server
The GIM MCP server acts as the bridge between your IDE and the knowledge base. It exposes tools that your AI assistant calls automatically when handling errors.
Available Tools
| Tool | Purpose |
|---|---|
| `gim_search_issues` | Find existing solutions for an error |
| `gim_get_fix_bundle` | Get detailed fix for a matched issue |
| `gim_submit_issue` | Submit a new resolved issue |
| `gim_confirm_fix` | Report fix outcome (success/failure) |
| `gim_report_usage` | Manual analytics events |
Example Tool Call
gim_search_issues({
error_message: "ModuleNotFoundError: No module named 'numpy'",
language: "python",
framework: "fastapi"
})
Knowledge Base
Issues and their fixes are stored with semantic embeddings, enabling fuzzy matching even when error messages differ slightly between environments. Each entry includes the error context, the fix, and community verification data.
Dual Storage Architecture
| Storage | Type | Purpose |
|---|---|---|
| Supabase | Relational (PostgreSQL) | Issue metadata, fix bundles, user data |
| Qdrant | Vector Database | Semantic embeddings for search |
This dual-storage approach separates concerns: relational storage handles CRUD operations, relationships, and structured queries, while vector storage enables fast semantic similarity matching.
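One way to picture the split is two stores keyed by the same issue ID. The sketch below uses in-memory dictionaries in place of Supabase and Qdrant. The `DualStore` class, its method names, and the idea that a shared ID links the two stores are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DualStore:
    """In-memory stand-in: one dict per storage role."""
    relational: dict = field(default_factory=dict)  # role played by Supabase
    vectors: dict = field(default_factory=dict)     # role played by Qdrant

    def save_issue(self, issue_id: str, metadata: dict, embedding: list[float]) -> None:
        # Structured fields go to the relational side.
        self.relational[issue_id] = metadata
        # Only the embedding goes to the vector side, under the same ID.
        self.vectors[issue_id] = embedding

    def load_issue(self, issue_id: str) -> dict:
        # A vector hit yields an ID; the full record comes from the relational side.
        return self.relational[issue_id]


store = DualStore()
store.save_issue(
    "issue-1",
    {"error": "ModuleNotFoundError", "fix": "pip install numpy"},
    [0.1, 0.2],
)
```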
Embedding Engine
GIM uses Google's gemini-embedding-001 model to generate 3072-dimensional semantic embeddings. Rather than embedding just the error message, GIM combines multiple fields into a single embedding:
- Error message
- Root cause analysis
- Fix summary
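A minimal sketch of how the three fields might be joined into one string before embedding. The function name, the field labels, and the newline separator are assumptions. The source does not specify how GIM concatenates the fields, and the call to the embedding model itself is left out.

```python
def build_embedding_text(error_message: str, root_cause: str, fix_summary: str) -> str:
    """Join the three fields into one labelled string to embed."""
    parts = [
        ("Error", error_message),
        ("Root cause", root_cause),
        ("Fix", fix_summary),
    ]
    # Skip empty fields so a missing value does not leave a dangling label.
    return "\n".join(
        f"{label}: {text.strip()}" for label, text in parts if text.strip()
    )


text = build_embedding_text(
    "ModuleNotFoundError: No module named 'numpy'",
    "numpy is not installed in the active virtual environment",
    "",
)
```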
Matching Engine
When you encounter an error, GIM uses embedding-based semantic search to find similar issues in the knowledge base. Results are ranked by relevance and community confidence score, so the most reliable fixes surface first.
Technical Details
- Algorithm: Cosine similarity
- Search threshold: 0.2 (permissive for broad matching)
- Quantization: INT8 scalar for performance
- Ranking: Similarity score × confidence score
The low threshold (0.2) is intentional: it is better to return potentially relevant results for the AI to evaluate than to miss good matches. The confidence score then helps surface verified fixes over unverified ones.
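The ranking rule can be illustrated in plain Python. This is a sketch under assumptions: in GIM the similarity search runs inside Qdrant, not in application code. The candidate structure, the field names, and treating the threshold as inclusive are all hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query: list[float], candidates: list[dict], threshold: float = 0.2) -> list[dict]:
    """Drop candidates below the threshold, then sort by similarity x confidence."""
    scored = []
    for c in candidates:
        sim = cosine_similarity(query, c["embedding"])
        if sim >= threshold:  # assumption: the threshold is inclusive
            scored.append(
                {"id": c["id"], "similarity": sim, "rank_score": sim * c["confidence"]}
            )
    return sorted(scored, key=lambda r: r["rank_score"], reverse=True)


candidates = [
    {"id": "unverified", "embedding": [1.0, 0.0], "confidence": 0.3},
    {"id": "verified", "embedding": [0.8, 0.6], "confidence": 0.9},
    {"id": "unrelated", "embedding": [0.0, 1.0], "confidence": 1.0},
]
ranked = rank([1.0, 0.0], candidates)
```

In this toy data the exact textual match has low confidence, so the slightly less similar but well-verified entry ranks first, and the orthogonal entry is filtered out by the threshold.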
Deduplication Engine
Before a new issue is added to the knowledge base, GIM checks for existing duplicates using semantic similarity. This keeps the knowledge base clean and ensures fixes are consolidated rather than fragmented.
Deduplication Logic
- Threshold: 0.85 similarity
- If similarity ≥ 0.85: Create child issue linked to existing master
- If similarity < 0.85: Create new master issue
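The branching rule above fits in a few lines. The function name and the returned dictionary shape are hypothetical. The sketch assumes the caller has already found the most similar existing master issue and its similarity score.

```python
from typing import Optional

DEDUP_THRESHOLD = 0.85


def classify_submission(best_match_id: Optional[str], best_similarity: float) -> dict:
    """Decide whether a new submission becomes a child or a master issue."""
    if best_match_id is not None and best_similarity >= DEDUP_THRESHOLD:
        # Close enough to an existing master: link to it instead of duplicating.
        return {"kind": "child", "master_id": best_match_id}
    # No sufficiently similar issue exists: this submission starts a new master.
    return {"kind": "master", "master_id": None}
```

For example, `classify_submission("m1", 0.91)` yields a child of `m1`, while `classify_submission("m1", 0.60)` yields a new master.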
Security Model
GIM automatically sanitizes all content before storage to protect sensitive information. The sanitization pipeline has two layers:
Two-Layer Sanitization Pipeline
| Layer | Method | What It Catches |
|---|---|---|
| Layer 1 | Deterministic (Regex) | API keys, URLs, file paths, emails, IPs |
| Layer 2 | LLM-based (Gemini) | Context-aware secrets, domain-specific PII |
Layer 1 uses pattern matching for known secret formats (AWS keys, JWT tokens, etc.). Layer 2 uses an LLM to identify context-dependent sensitive data that regex can't catch, like custom variable names containing passwords or internal service endpoints.
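A rough sketch of what a deterministic Layer 1 pass could look like. The patterns and placeholder tokens below are illustrative assumptions, not GIM's actual rule set, and they cover only a few of the listed categories. Layer 2 is not shown.

```python
import re

# Illustrative patterns only; a production rule set would be far broader.
PATTERNS = [
    (re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "[AWS_KEY]"),               # AWS access key ID
    (re.compile(r"\beyJ[\w-]+\.[\w-]+\.[\w-]+"), "[JWT]"),            # JWT-shaped token
    (re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"), "[EMAIL]"),     # email address
    (re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "[IP]"),             # IPv4 address
]


def sanitize_layer1(text: str) -> str:
    """Replace every match of each known pattern with its placeholder."""
    for pattern, placeholder in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


sample = "Auth failed for dev@example.com from 10.0.0.12 using AKIAABCDEFGHIJKLMNOP"
cleaned = sanitize_layer1(sample)
```

Replacing matches with typed placeholders instead of deleting them keeps the error message readable for later matching.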
Rate Limiting
GIM implements rate limiting to ensure fair usage and system stability. Limits apply per user and reset daily.
| Operation | Rate Limited | Default Limit |
|---|---|---|
| `gim_search_issues` | Yes | 100/day |
| `gim_get_fix_bundle` | Yes | 100/day |
| `gim_submit_issue` | No | Unlimited |
| `gim_confirm_fix` | No | Unlimited |
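The limits in the table can be modelled with a per-user daily counter. This is a sketch under assumptions: the class name is hypothetical, each tool is given its own counter (the source does not say whether the 100/day budget is shared between tools), and any tool absent from `DAILY_LIMITS` is treated as unlimited.

```python
from collections import defaultdict
from datetime import date

DAILY_LIMITS = {"gim_search_issues": 100, "gim_get_fix_bundle": 100}


class DailyRateLimiter:
    def __init__(self) -> None:
        # Keyed by (user, tool, day); a new day starts from a fresh zero count.
        self._counts: dict = defaultdict(int)

    def allow(self, user_id: str, tool: str, today: date) -> bool:
        limit = DAILY_LIMITS.get(tool)
        if limit is None:
            return True  # tools without a configured limit are unlimited
        key = (user_id, tool, today)
        if self._counts[key] >= limit:
            return False
        self._counts[key] += 1
        return True
```

Including the date in the key gives the daily reset without a scheduled job; a real service would also need to expire old keys.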