# Local Chat - VS Code Extension

A powerful chat extension with MCP (Model Context Protocol) support, file attachments, and integrated authentication.
## ✨ Features

### Core Capabilities
- 💬 Chat Interface - Rich chat UI with Markdown, LaTeX, and code highlighting
- 🔌 MCP Support - Integrated Model Context Protocol for tool calling
- 🤖 Custom LLM Provider - Connects to agent_server with DeepSeek API
- 🔐 Authentication - Session management with 3-hour expiration
- 📁 File Attachment - Intelligent @ file selection with validation
- 💾 Conversation Management - Persistent chat history with multiple conversations
- 📊 Token Tracking - Real-time usage and cost statistics
### File Attachment & Management

- @ File Selection: Type `@` in the input box to trigger file search
- Smart Search: Filter files by name instantly
- Color-Coded Tags: Different colors for file types (PDB green, SDF blue, FASTA orange)
- Two-Level Validation:
  - Frontend: quick validation (existence, extension, directory)
  - Backend: deep validation (format, structure, BioPython/RDKit)
- Supported Formats: `.pdb`, `.cif`, `.pdbqt`, `.sdf`, `.mol`, `.mol2`, `.csv`, `.fasta`, `.fa`, `.faa`, `.fna`
- Data Directory: Configurable file access restriction (`localChat.dataDirectory`)
- Auto Content Reading: File content is automatically attached to messages for LLM analysis
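The frontend half of the two-level validation split described above can be sketched as follows. This is an illustrative Python sketch (the real check lives in the extension's TypeScript `FileManager.ts`); the function name and return shape are assumptions.

```python
import os

# Illustrative sketch of the frontend quick-validation pass: existence,
# extension, and directory checks only. Deep format validation
# (BioPython/RDKit) happens on the backend.
ALLOWED_EXTENSIONS = {".pdb", ".cif", ".pdbqt", ".sdf", ".mol", ".mol2",
                      ".csv", ".fasta", ".fa", ".faa", ".fna"}

def quick_validate(path, data_dir=None):
    """Return (ok, reason). data_dir, if set, restricts file access."""
    real = os.path.realpath(os.path.expanduser(path))
    ext = os.path.splitext(real)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        return False, "unsupported extension: " + ext
    if data_dir:
        root = os.path.realpath(os.path.expanduser(data_dir))
        if os.path.commonpath([root, real]) != root:
            return False, "file is outside the configured data directory"
    if not os.path.isfile(real):
        return False, "file not found"
    return True, "ok"
```

Checking the extension before touching the filesystem keeps the common failure case (wrong file type) cheap and instant in the UI.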
### MCP Integration

- Auto-discovery of MCP servers
- Tool call execution and result persistence
- Multi-turn tool usage with task_id tracking
- Backend-managed MCP connections (no client configuration needed)
### UI Features
- Markdown rendering with syntax highlighting
- LaTeX formula support (KaTeX)
- Code blocks with copy buttons
- Left sidebar with conversation list
- Green theme (borders, price info)
- Inline file tags (removable)
## Quick Start

### Prerequisites

- Agent Server must be running on `localhost:5001`
- Platform API access (for authentication)
- DeepSeek API key (configured in agent_server)
### 1. Start Agent Server

```bash
cd agent_server

# Install dependencies
pip install -r requirements.txt

# Set environment variables
export BACKEND_API_URL="http://120.26.248.92/api"
export BILLING_API_URL="http://120.26.248.92:8085"
export DEEPSEEK_API_KEY="your-api-key"

# Start service (port 5001)
python app.py
```

The server starts at `http://localhost:5001`.
Agent server features:
- User authentication (session validation)
- Request forwarding to DeepSeek API
- File validation and OSS upload
- MCP server management
- Token usage tracking
### 2. Configure VS Code Settings

Open VS Code settings (Cmd+Shift+P → "Preferences: Open User Settings (JSON)"):

```jsonc
{
  // Authentication
  "localChat.auth.apiBaseUrl": "http://120.26.248.92/api",
  "localChat.auth.apiKey": "sk-xxxxx",

  // LLM Provider
  "localChat.llmProvider": "custom",
  "localChat.custom.apiUrl": "http://localhost:5001/api/chat/completions",

  // Optional: file access restriction
  "localChat.dataDirectory": "~/data"
}
```
### 3. Build and Run Extension

Development mode (recommended):

```bash
cd /path/to/quregenai_ide/extensions/local_chat

# Terminal 1: continuous compilation
npm run watch

# VS Code: press F5 to launch the Extension Development Host
```

Package and install:

```bash
npm run compile
npx @vscode/vsce package --allow-missing-repository --allow-star-activation

# Install the generated .vsix file:
# Extensions: Install from VSIX...
```
### 4. First Use

- Open the Command Palette (`Cmd+Shift+P`)
- Run `Local Chat: Login`
- Enter your platform credentials
- The chat interface opens automatically after login
## Usage Guide

### Basic Chat

- Type your message in the input box
- Press `Enter` or click Send
- View the response with token usage and cost

### Attach Files

- Type `@` in the input box to trigger file search
- Type a filename to filter results
- Click to select a file
- A file tag appears before the input box (click `x` to remove)
- File content is automatically included in the message
Example:

```text
@protein.pdb
→ Select file from dropdown
→ Tag: [@protein.pdb x]
→ Type: "Analyze this protein structure"
```
### Manage Conversations

- New Chat: click the `+ New Chat` button
- Switch: click a conversation name in the left sidebar
- Rename: click the `Rename` button
- Clear: click `Clear` to empty the current conversation
- Delete: remove a conversation permanently
### View History

- All conversations are listed in the left sidebar
- Click to switch instantly
- Full context is preserved, including:
  - Chat messages
  - Token usage
  - Tool execution results (with task_id)

### Tool Execution

- Tools execute automatically when the LLM decides to call them
- Results include a task_id for tracking
- Tool outputs persist in conversation history
- Use the task_id to query status or download results
## Architecture

```text
┌─────────────────────────────────────┐
│          VSCode Extension           │
│  ┌─────────────────────────────┐    │
│  │ ChatPanel (Webview)         │    │  • Rich UI with Markdown/LaTeX
│  │  - File attachment (@)      │    │  • @ file selection
│  │  - Conversation list        │    │  • Token tracking
│  └──────────┬──────────────────┘    │
│             │                       │
│  ┌──────────▼──────────────────┐    │
│  │ Extension.ts                │    │  • Command handling
│  │  - Login check              │    │  • Message routing
│  │  - File validation          │    │
│  └──────────┬──────────────────┘    │
│             │                       │
│  ┌──────────▼──────────────────┐    │
│  │ AgentLoop.ts                │    │  • Multi-turn orchestration
│  │  - Context management       │    │  • Tool call handling
│  │  - ConversationManager      │    │  • History persistence
│  └──────────┬──────────────────┘    │
│             │                       │
│  ┌──────────▼──────────────────┐    │
│  │ CustomProvider.ts           │    │  • LLM communication
│  └──────────┬──────────────────┘    │
└─────────────┼───────────────────────┘
              │ HTTP
┌─────────────▼───────────────────────┐
│        Agent Server (Flask)         │
│        - Port 5001                  │
│  ┌─────────────────────────────┐    │
│  │ Authentication & Routing    │    │  • Session validation
│  │  - Session verification     │    │  • API key check
│  │  - API key validation       │    │
│  └──────────┬──────────────────┘    │
│             │                       │
│  ┌──────────▼──────────────────┐    │
│  │ File Validation             │    │  • BioPython/RDKit
│  │  - Deep format check        │    │  • Structure validation
│  │  - OSS upload               │    │  • 10MB / 4000 residues
│  └─────────────────────────────┘    │
│             │                       │
│  ┌──────────▼──────────────────┐    │
│  │ MCP Client                  │    │  • Server management
│  │  - Tool discovery           │    │  • Tool execution
│  │  - Call routing             │    │  • Result handling
│  └─────────────────────────────┘    │
│             │                       │
│  ┌──────────▼──────────────────┐    │
│  │ DeepSeek API                │    │  • Chat completions
│  │  - Stream/non-stream        │    │  • Tool calling
│  │  - Usage tracking           │    │  • Token stats
│  └─────────────────────────────┘    │
└─────────────────────────────────────┘
              │
┌─────────────▼───────────────────────┐
│        MCP Servers (stdio)          │
│  - quregenai_mcp                    │  • Structure prediction
│  - file_tools                       │  • Docking tools
│  - ...                              │  • Analysis tools
└─────────────────────────────────────┘
```
### Data Flow

```text
User input (+ attached files)
    ↓
ChatPanel validates files
    ↓
Extension.ts routes message
    ↓
AgentLoop prepares context
    ↓  (messages + tools + files)
CustomProvider → agent_server
    ↓
agent_server validates session
    ↓
agent_server → DeepSeek API
    ↓
DeepSeek returns response/tool_calls
    ↓  (if tool_calls)
agent_server → MCP servers
    ↓
MCP servers execute tools
    ↓
Results → DeepSeek (continued)
    ↓
Final response + usage
    ↓
AgentLoop saves to ConversationManager
    ↓
ChatPanel displays (with token cost)
```
## Conversation Logic Flow (AgentLoop)

The AgentLoop orchestrates multi-turn conversations with tool execution and user confirmation. Here's the detailed flow:
### 1. Message Preparation Phase

1. Load conversation history from ConversationManager
2. Filter out temporary messages:
   - Remove "No MCP server" errors
   - Remove "cancelled by user" messages
3. Apply a sliding window (`MAX_HISTORY_MESSAGES = 10`):
   - Keep the 10 most recent messages
   - Summarize older messages (extract task_ids, file names)
   - Combine: [summary] + [recent 10 messages]

Smart context management:

- Full history is preserved in ConversationManager
- Only recent context is sent to the LLM (reduces token usage)
- Old context is summarized to preserve key facts (task IDs, files)
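The sliding-window step reduces to a small list operation. A minimal Python sketch, assuming messages are role/content dicts (`build_context` and its signature are illustrative, not the actual AgentLoop.ts interface):

```python
MAX_HISTORY_MESSAGES = 10  # must match the backend value (see Known Limitations)

def build_context(history, summarize):
    """Keep the most recent messages verbatim and collapse everything
    older into a single summary message for the LLM."""
    if len(history) <= MAX_HISTORY_MESSAGES:
        return list(history)
    older = history[:-MAX_HISTORY_MESSAGES]
    recent = history[-MAX_HISTORY_MESSAGES:]
    summary = {"role": "system", "content": summarize(older)}
    return [summary] + recent
```

Note that the full history stays in ConversationManager untouched; only the context sent to the LLM is windowed.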
### 2. Agent Loop Iteration

```text
for each iteration (max 10):
├─ Call LLM with messages + available tools
├─ Receive response (message + optional tool_calls)
│
├─ Display assistant message
│  └─ Save to ConversationManager
│
├─ If no tool_calls → END
│
├─ If tool_calls exist:
│  ├─ Check if user cancelled before → prevent loop
│  ├─ Request user confirmation (once per turn)
│  │  ├─ User clicks "Execute" → continue
│  │  └─ User clicks "Cancel":
│  │     ├─ Set userCancelled flag
│  │     ├─ Send instruction to LLM (temporary, not saved)
│  │     ├─ LLM responds asking about issues
│  │     └─ If LLM returns tool_calls again → force stop
│  │
│  ├─ Execute tools via backend MCP API
│  │  ├─ Success → add tool result (JSON) to messages
│  │  └─ Error → add tool error to messages
│  │     (Not displayed or saved - only for LLM context)
│  │
│  ├─ Tool execution count++ (MAX = 1 per turn)
│  └─ Continue loop → LLM summarizes tool results
│
└─ LLM analyzes tool results:
   ├─ Extracts key info (task_id, status, errors)
   ├─ Generates natural language summary
   └─ Display summary → Save to history
```
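The loop above can be condensed into a runnable sketch. This is a simplified Python illustration of the control flow (the actual implementation is in `AgentLoop.ts`); the callback signatures are assumptions, and the cancellation re-prompt branch is omitted for brevity:

```python
MAX_ITERATIONS = 10       # at most 10 LLM calls per user message
MAX_TOOL_EXECUTIONS = 1   # tools run at most once per turn

def agent_loop(call_llm, execute_tools, confirm, messages):
    """call_llm(messages) -> (text, tool_calls or None); confirm(tool_calls)
    asks the user; execute_tools(tool_calls) returns result strings."""
    text = ""
    tool_executions = 0
    for _ in range(MAX_ITERATIONS):
        text, tool_calls = call_llm(messages)
        messages.append({"role": "assistant", "content": text})
        if not tool_calls:
            return text                    # no tool calls -> turn ends
        if tool_executions >= MAX_TOOL_EXECUTIONS:
            return text                    # safety: never re-execute in one turn
        if not confirm(tool_calls):
            return text                    # user cancelled
        for result in execute_tools(tool_calls):
            # Tool results feed the next LLM call only; they are neither
            # displayed to the user nor saved to conversation history.
            messages.append({"role": "tool", "content": result})
        tool_executions += 1
    return text
```

After the tool result is appended, the loop continues, so the next LLM call produces the natural-language summary the user actually sees.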
### 3. Tool Result Summarization

Old approach (removed):

- Tool executes → format JSON → display to user → save
- User sees raw technical details

New approach (current):

- Tool executes → add raw JSON to messages (temporary)
- Continue loop → LLM receives the tool result
- LLM summarizes: "Task abc123 submitted successfully. Expected wait time: 5 minutes."
- The LLM's summary is displayed and saved to history

Benefits:

- ✅ Natural language responses instead of raw JSON
- ✅ LLM extracts important info (task_id, status, errors)
- ✅ LLM provides context ("You can check status with...")
- ✅ Multilingual support (LLM responds in the user's language)
- ✅ Better error explanations
### 4. User Cancellation Flow

```text
LLM proposes tool call (e.g., structure prediction)
    ↓
Confirmation dialog appears
    ↓
User clicks "Cancel"
    ↓
Set userCancelled = true
    ↓
Add temporary instruction: "Ask user why they cancelled"
    ↓
LLM responds: "Would you like to modify the parameters?"
    ↓
User replies: "Change copies to 3"
    ↓
New conversation turn (userCancelled reset)
    ↓
LLM generates new tool_calls with updated params
    ↓
Confirmation dialog appears again
```
Safety mechanisms:

- The userCancelled flag prevents an immediate retry
- If the LLM ignores the instruction and returns tool_calls → force stop
- The cancellation instruction is temporary (not saved to history)
- The user can modify parameters over multiple turns
### 5. Loop Control & Safety

Protections against infinite loops:

- Tool execution limit (`MAX_TOOL_EXECUTIONS = 1`):
  - Tools execute at most once per user message
  - After execution, the loop continues for the LLM summary but does not execute again
- Iteration limit (`maxIterations = 10`):
  - At most 10 LLM calls per user message
  - Prevents runaway loops
- Cancellation protection:
  - If the user cancels and the LLM still returns tool_calls → terminate
  - Display a fallback message instead of looping
- No tool calls → end:
  - If the LLM does not return tool_calls, the conversation turn ends
  - Prevents unnecessary iterations
6. Message Types & Persistence
| Message Type |
Displayed |
Saved to History |
Purpose |
| User message |
✅ |
✅ |
User's input |
| Assistant message |
✅ |
✅ |
LLM's response |
| Tool result (JSON) |
❌ |
❌ |
LLM context only |
| LLM summary |
✅ |
✅ |
User-facing result |
| Cancel instruction |
❌ |
❌ |
Temporary guidance |
| Tool error |
❌ |
❌ |
LLM handles it |
Key Principle: Users only see natural language, never raw JSON or system messages.
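The table reduces to a small lookup, sketched below in Python; the message-kind names are illustrative, not identifiers from the codebase:

```python
# Sketch of the display/persistence policy above.
# Values are (displayed, saved_to_history).
PERSISTENCE_POLICY = {
    "user":               (True,  True),
    "assistant":          (True,  True),
    "tool_result":        (False, False),  # LLM context only
    "llm_summary":        (True,  True),
    "cancel_instruction": (False, False),  # temporary guidance
    "tool_error":         (False, False),  # LLM handles it
}

def is_displayed(kind):
    return PERSISTENCE_POLICY[kind][0]

def is_saved(kind):
    return PERSISTENCE_POLICY[kind][1]
```

Keeping the two flags together makes the invariant visible: anything hidden from the user is also excluded from history, so replaying a conversation never surfaces raw JSON.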
### 7. Context Window Management

Problem: long conversations confuse the LLM with old context.

Solution: a three-layer approach.

1. Message filtering:
   - Remove temporary errors ("No MCP server")
   - Remove cancellation messages
2. Sliding window:
   - Keep only the 10 most recent messages for the LLM
   - Older messages don't affect current intent
3. Smart summarization:
   - Extract task_ids from old messages
   - Extract file names from old messages
   - Create a compact summary: "[Previous: tasks abc123, xyz789; files protein.pdb]"
   - The LLM gets key facts without old conversation noise

Result: the LLM has fresh context for the current task plus key facts from history.
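The summarization layer might look like the sketch below. The regular expressions are assumptions about how task IDs and file names appear in message text, not patterns from the actual code:

```python
import re

def summarize_old_messages(messages):
    """Pull task IDs and file names out of old messages and emit one
    compact summary line in the "[Previous: ...]" format shown above."""
    text = " ".join(m.get("content", "") for m in messages)
    # Assumed pattern: "task_id: abc123" (case-insensitive)
    task_ids = sorted(set(re.findall(r"task[_ ]?id[:\s]+([A-Za-z0-9_-]+)", text, re.I)))
    # Assumed pattern: bare filenames with a supported extension
    files = sorted(set(re.findall(r"\b[\w.-]+\.(?:pdb|cif|sdf|mol2?|fasta|fa|csv)\b", text)))
    parts = []
    if task_ids:
        parts.append("tasks " + ", ".join(task_ids))
    if files:
        parts.append("files " + ", ".join(files))
    return "[Previous: " + "; ".join(parts) + "]" if parts else "[Previous conversation]"
```

The summary costs a handful of tokens regardless of how long the discarded history was, which is the point of the layer.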
### 8. Known Limitations & Potential Issues

Issue 1: LLM May Return Tool Calls After Summarizing ✅ FIXED

- Problem: After receiving tool results, the LLM might generate new tool_calls while summarizing (e.g., "Let me check task status")
- Solution: Added a system instruction after tool execution: "DO NOT make any new tool calls in this response - only provide a clear, user-friendly summary"
- Mitigation: `MAX_TOOL_EXECUTIONS = 1` still prevents re-execution as a backup
- Status: Fixed - the LLM now only summarizes instead of calling new tools
Issue 2: Multiple Tool Execution
- Problem: If LLM returns multiple tool_calls, all are executed in the same turn
- Current Behavior: All tools execute, then count increments by 1
- Risk: In edge cases, user might execute more tools than intended
- Mitigation: Tool execution limit per turn prevents runaway
Issue 3: Mixed Success/Failure Results
- Problem: If some tools succeed and others fail, LLM receives mixed results
- Current Behavior: LLM summarizes both successes and failures
- Benefit: Actually helpful - LLM can explain what worked and what didn't
- No action needed: This is acceptable behavior
Issue 4: Context Window Synchronization
- Problem: Frontend MAX_HISTORY_MESSAGES hardcoded, must match backend .env
- Current: Manual sync required (both set to 10)
- Risk: If values differ, context management breaks
- Future Fix: Backend should return MAX_HISTORY_MESSAGES in API response
Issue 5: Backend Tool Execution Mode
- Problem: Code supports backend auto-execution (toolResults) but currently unused
- Current: Frontend always executes tools after confirmation
- Risk: Dead code path may have bugs
- Recommendation: Remove backend auto-execution support or test thoroughly
## Key Components

### Frontend (VSCode Extension)

| File | Purpose |
|---|---|
| `src/extension.ts` | Entry point, command registration |
| `src/agent/AgentLoop.ts` | Multi-turn conversation orchestration |
| `src/llm/CustomProvider.ts` | Agent server communication |
| `src/auth/AuthManager.ts` | Session management (3h expiration) |
| `src/conversation/ConversationManager.ts` | Conversation persistence |
| `src/file/FileManager.ts` | File search and validation |
| `src/webview/ChatPanel.ts` | Chat UI with @ file selection |
| `src/webview/LoginPanel.ts` | Authentication UI |
### Backend (Agent Server)

| File | Purpose |
|---|---|
| `agent_server/app.py` | Flask server entry point |
| `agent_server/routes/chat.py` | Chat API endpoints |
| `agent_server/routes/upload.py` | File upload and validation |
| `agent_server/routes/user.py` | User authentication |
| `agent_server/services/mcp_client.py` | MCP server management |
| `agent_server/services/file_validator.py` | Deep file validation |
| `agent_server/services/upload.py` | OSS file upload |
## Configuration

VSCode settings (`settings.json`):

```jsonc
{
  "localChat.auth.apiBaseUrl": "Platform API URL",
  "localChat.auth.apiKey": "Your API key",
  "localChat.llmProvider": "custom",
  "localChat.custom.apiUrl": "http://localhost:5001/api/chat/completions",
  "localChat.dataDirectory": "~/data" // Optional: restrict file access
}
```
MCP configuration (`agent_server/mcp_servers.json`):

```json
{
  "mcpServers": {
    "quregenai_mcp": {
      "type": "stdio",
      "command": "python",
      "args": ["src/server.py"],
      "cwd": "/absolute/path/to/quregenai_mcp",
      "env": { "QUREGENAI_API_KEY": "your-key" },
      "disabled": false,
      "timeout": 60
    }
  }
}
```
See agent_server/MCP_SETUP.md for detailed MCP configuration.
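As a sketch of how a consumer of this file might behave, the snippet below parses the config and skips servers marked `disabled`. The helper name is illustrative, not the actual `mcp_client.py` API:

```python
import json

def load_enabled_servers(config_text):
    """Parse mcp_servers.json content and drop servers marked disabled."""
    config = json.loads(config_text)
    return {name: spec
            for name, spec in config.get("mcpServers", {}).items()
            if not spec.get("disabled", False)}
```

Treating a missing `disabled` key as `false` matches the convention in the example config above, where the flag is optional.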
## Development

### Build

```bash
npm run compile
```

### Watch Mode

```bash
npm run watch
```

### Debug

Press F5 in VS Code to launch the extension in debug mode.
## Troubleshooting

### Agent Server Connection Failed

Check:

- Is agent_server running? Test: `curl http://localhost:5001/health`
- Is the API key configured correctly in settings?
- Is a firewall blocking port 5001?

Solution:

```bash
cd agent_server
python app.py
# Look for "Agent Server starting on port 5001"
```
### Login Failed

Check:

- Is the platform backend API accessible? (`http://120.26.248.92/api`)
- Are the username and password correct?
- Has the session expired? (3-hour limit)

Solution:

- Re-run the `Local Chat: Login` command
- Verify the API key in settings
- Check network connectivity
### File Selection Not Working

Check:

- Is a data directory configured? (if set, files must be inside it)
- Is the file format supported?
- Does the file exist, and is it readable?

Solution:

```jsonc
// Remove the directory restriction (or set the correct path)
"localChat.dataDirectory": ""

// Or set a specific directory
"localChat.dataDirectory": "~/data"
```
### Token Stats Not Showing

Issue: old conversations don't have usage data.

Solution:

- Create a new conversation with `+ New Chat`
- New messages will show token usage and cost

### MCP Tools Not Available

Check:

- Do the agent server logs show "MCP client initialized successfully"?
- Are MCP servers configured in `agent_server/mcp_servers.json`?
- Are the paths correct? (use absolute paths)

Solution:

```bash
# Check agent_server logs
cd agent_server
python app.py

# Look for:
#   [MCP Client] Connected to server: quregenai_mcp
#   [MCP Client] Cached 50+ tools from quregenai_mcp
```
### Compilation Errors

```bash
# Clean and rebuild
cd extensions/local_chat
rm -rf node_modules package-lock.json
npm install
npm run compile
```
### Extension Not Loading

Issue: changes are not reflected after reload.

Solution:

- Reload the VSCode window after code changes
- Or use `npm run watch` + F5 during development
- A packaged .vsix requires reinstallation
## Security

- API Key Storage: stored in VSCode globalState (encrypted)
- Session Management: 3-hour expiration; automatic re-authentication required
- Agent Server: validates every request with the API key
- File Access: optional directory restriction via `localChat.dataDirectory`
- No API Key Leakage: API keys are never sent to DeepSeek (handled by agent_server)
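The 3-hour expiration reduces to a timestamp comparison. A minimal sketch, assuming the session stores a Unix issued-at time (the actual `AuthManager.ts` fields are not shown here):

```python
import time

SESSION_TTL_SECONDS = 3 * 60 * 60  # 3-hour expiration

def session_valid(issued_at, now=None):
    """Return True while the session is younger than the TTL."""
    if now is None:
        now = time.time()
    return (now - issued_at) < SESSION_TTL_SECONDS
```

When this returns False, the extension would prompt for re-authentication via the `Local Chat: Login` command.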
## Additional Documentation

- `SETUP.md` - Detailed setup guide with architecture overview
- `agent_server/README.md` - Agent server API documentation
- `agent_server/MCP_SETUP.md` - MCP integration and configuration guide
## Commands

- `Local Chat: Login` - Authenticate with platform credentials
- `Local Chat: Open Chat` - Open the chat interface (requires login)
- `Local Chat: New Conversation` - Create a new conversation
- `Local Chat: Switch Conversation` - Change the active conversation
- `Local Chat: Rename Conversation` - Rename the current conversation
- `Local Chat: Delete Conversation` - Remove a conversation
- `Local Chat: Clear Conversation` - Empty the current conversation
## Development Tips

- Use `npm run watch` for automatic recompilation
- Press F5 to launch the Extension Development Host
- Reload the window (`Cmd+R`) after code changes
- Check the extension output for debugging
- Agent server logs show MCP and tool execution details
## Project Structure

```text
local_chat/
├── src/
│   ├── extension.ts                 # Entry point
│   ├── agent/
│   │   └── AgentLoop.ts             # Conversation orchestration
│   ├── auth/
│   │   └── AuthManager.ts           # Session management
│   ├── conversation/
│   │   └── ConversationManager.ts   # History persistence
│   ├── file/
│   │   └── FileManager.ts           # File operations
│   ├── llm/
│   │   └── CustomProvider.ts        # Agent server client
│   ├── webview/
│   │   ├── ChatPanel.ts             # Chat UI
│   │   └── LoginPanel.ts            # Login UI
│   └── types.ts                     # TypeScript definitions
├── agent_server/
│   ├── app.py                       # Flask server
│   ├── routes/                      # API endpoints
│   ├── services/                    # Business logic
│   └── mcp_servers.json             # MCP configuration
└── README.md                        # This file
```
## License

Apache 2.0