# Local Chat - VS Code Extension

A powerful chat extension with MCP (Model Context Protocol) support, file attachments, and integrated authentication.
## ✨ Features

### Core Capabilities
- 💬 Chat Interface - Rich chat UI with Markdown, LaTeX, and code highlighting
- 🔌 MCP Support - Integrated Model Context Protocol for tool calling
- 🤖 Custom LLM Provider - Connects to agent_server with DeepSeek API
- 🔐 Authentication - Session management with 3-hour expiration
- 📁 File Attachment - Intelligent @ file selection with validation
- 💾 Conversation Management - Persistent chat history with multiple conversations
- 📊 Token Tracking - Real-time usage and cost statistics
### File Attachment & Management

- @ File Selection: Type `@` in the input box to trigger file search
- Smart Search: Filter files by name instantly
- Color-Coded Tags: Different colors for file types (PDB green, SDF blue, FASTA orange)
- Two-Level Validation:
  - Frontend: quick validation (existence, extension, directory)
  - Backend: deep validation (format, structure, BioPython/RDKit)
- Supported Formats: `.pdb`, `.cif`, `.pdbqt`, `.sdf`, `.mol`, `.mol2`, `.csv`, `.fasta`, `.fa`, `.faa`, `.fna`
- Data Directory: Configurable file access restriction (`localChat.dataDirectory`)
- Auto Content Reading: File content is automatically attached to messages for LLM analysis
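The frontend half of the two-level validation split described above can be sketched as follows. This is an illustrative Python sketch (the real check lives in the extension's TypeScript `FileManager.ts`); the function name and return shape are assumptions.

```python
import os

# Illustrative sketch of the frontend quick-validation pass: existence,
# extension, and directory checks only. Deep format validation
# (BioPython/RDKit) happens on the backend.
ALLOWED_EXTENSIONS = {".pdb", ".cif", ".pdbqt", ".sdf", ".mol", ".mol2",
                      ".csv", ".fasta", ".fa", ".faa", ".fna"}

def quick_validate(path, data_dir=None):
    """Return (ok, reason). data_dir, if set, restricts file access."""
    real = os.path.realpath(os.path.expanduser(path))
    ext = os.path.splitext(real)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        return False, "unsupported extension: " + ext
    if data_dir:
        root = os.path.realpath(os.path.expanduser(data_dir))
        if os.path.commonpath([root, real]) != root:
            return False, "file is outside the configured data directory"
    if not os.path.isfile(real):
        return False, "file not found"
    return True, "ok"
```

Checking the extension before touching the filesystem keeps the common failure case (wrong file type) cheap and instant in the UI.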
### MCP Integration

- Auto-discovery of MCP servers
- Tool call execution and result persistence
- Multi-turn tool usage with task_id tracking
- Backend-managed MCP connections (no client configuration needed)
### UI Features
- Markdown rendering with syntax highlighting
- LaTeX formula support (KaTeX)
- Code blocks with copy buttons
- Left sidebar with conversation list
- Green theme (borders, price info)
- Inline file tags (removable)
## Quick Start

### Prerequisites

- Agent Server must be running on `localhost:5001`
- Platform API access (for authentication)
- DeepSeek API key (configured in agent_server)
### 1. Start Agent Server

```bash
cd agent_server

# Install dependencies
pip install -r requirements.txt

# Set environment variables
export BACKEND_API_URL="http://120.26.248.92/api"
export BILLING_API_URL="http://120.26.248.92:8085"
export DEEPSEEK_API_KEY="your-api-key"

# Start service (port 5001)
python app.py
```

The server starts at `http://localhost:5001`.
Agent server features:
- User authentication (session validation)
- Request forwarding to DeepSeek API
- File validation and OSS upload
- MCP server management
- Token usage tracking
### 2. Configure VS Code Settings

Open VS Code settings (Cmd+Shift+P → "Preferences: Open User Settings (JSON)"):

```jsonc
{
  // Authentication
  "localChat.auth.apiBaseUrl": "http://120.26.248.92/api",
  "localChat.auth.apiKey": "sk-xxxxx",

  // LLM Provider
  "localChat.llmProvider": "custom",
  "localChat.custom.apiUrl": "http://localhost:5001/api/chat/completions",

  // Optional: file access restriction
  "localChat.dataDirectory": "~/data"
}
```
### 3. Build and Run Extension

Development mode (recommended):

```bash
cd /path/to/quregenai_ide/extensions/local_chat

# Terminal 1: continuous compilation
npm run watch

# VS Code: press F5 to launch the Extension Development Host
```

Package and install:

```bash
npm run compile
npx @vscode/vsce package --allow-missing-repository --allow-star-activation

# Install the generated .vsix file:
# Extensions: Install from VSIX...
```
### 4. First Use

- Open the Command Palette (`Cmd+Shift+P`)
- Run `Local Chat: Login`
- Enter your platform credentials
- The chat interface opens automatically after login
## Usage Guide

### Basic Chat

- Type your message in the input box
- Press `Enter` or click Send
- View the response with token usage and cost

### Attach Files

- Type `@` in the input box to trigger file search
- Type a filename to filter results
- Click to select a file
- A file tag appears before the input box (click `x` to remove)
- File content is automatically included in the message
Example:

```text
@protein.pdb
→ Select file from dropdown
→ Tag: [@protein.pdb x]
→ Type: "Analyze this protein structure"
```
### Manage Conversations

- New Chat: click the `+ New Chat` button
- Switch: click a conversation name in the left sidebar
- Rename: click the `Rename` button
- Clear: click `Clear` to empty the current conversation
- Delete: remove a conversation permanently
### View History

- All conversations are listed in the left sidebar
- Click to switch instantly
- Full context is preserved, including:
  - Chat messages
  - Token usage
  - Tool execution results (with task_id)

### Tool Execution

- Tools execute automatically when the LLM decides to call them
- Results include a task_id for tracking
- Tool outputs persist in conversation history
- Use the task_id to query status or download results
## Architecture

```text
┌─────────────────────────────────────┐
│          VSCode Extension           │
│  ┌─────────────────────────────┐    │
│  │ ChatPanel (Webview)         │    │  • Rich UI with Markdown/LaTeX
│  │  - File attachment (@)      │    │  • @ file selection
│  │  - Conversation list        │    │  • Token tracking
│  └──────────┬──────────────────┘    │
│             │                       │
│  ┌──────────▼──────────────────┐    │
│  │ Extension.ts                │    │  • Command handling
│  │  - Login check              │    │  • Message routing
│  │  - File validation          │    │
│  └──────────┬──────────────────┘    │
│             │                       │
│  ┌──────────▼──────────────────┐    │
│  │ AgentLoop.ts                │    │  • Multi-turn orchestration
│  │  - Context management       │    │  • Tool call handling
│  │  - ConversationManager      │    │  • History persistence
│  └──────────┬──────────────────┘    │
│             │                       │
│  ┌──────────▼──────────────────┐    │
│  │ CustomProvider.ts           │    │  • LLM communication
│  └──────────┬──────────────────┘    │
└─────────────┼───────────────────────┘
              │ HTTP
┌─────────────▼───────────────────────┐
│        Agent Server (Flask)         │
│        - Port 5001                  │
│  ┌─────────────────────────────┐    │
│  │ Authentication & Routing    │    │  • Session validation
│  │  - Session verification     │    │  • API key check
│  │  - API key validation       │    │
│  └──────────┬──────────────────┘    │
│             │                       │
│  ┌──────────▼──────────────────┐    │
│  │ File Validation             │    │  • BioPython/RDKit
│  │  - Deep format check        │    │  • Structure validation
│  │  - OSS upload               │    │  • 10MB / 4000 residues
│  └─────────────────────────────┘    │
│             │                       │
│  ┌──────────▼──────────────────┐    │
│  │ MCP Client                  │    │  • Server management
│  │  - Tool discovery           │    │  • Tool execution
│  │  - Call routing             │    │  • Result handling
│  └─────────────────────────────┘    │
│             │                       │
│  ┌──────────▼──────────────────┐    │
│  │ DeepSeek API                │    │  • Chat completions
│  │  - Stream/non-stream        │    │  • Tool calling
│  │  - Usage tracking           │    │  • Token stats
│  └─────────────────────────────┘    │
└─────────────────────────────────────┘
              │
┌─────────────▼───────────────────────┐
│        MCP Servers (stdio)          │
│  - quregenai_mcp                    │  • Structure prediction
│  - file_tools                       │  • Docking tools
│  - ...                              │  • Analysis tools
└─────────────────────────────────────┘
```
### Data Flow

```text
User input (+ attached files)
    ↓
ChatPanel validates files
    ↓
Extension.ts routes message
    ↓
AgentLoop prepares context
    ↓  (messages + tools + files)
CustomProvider → agent_server
    ↓
agent_server validates session
    ↓
agent_server → DeepSeek API
    ↓
DeepSeek returns response/tool_calls
    ↓  (if tool_calls)
agent_server → MCP servers
    ↓
MCP servers execute tools
    ↓
Results → DeepSeek (continued)
    ↓
Final response + usage
    ↓
AgentLoop saves to ConversationManager
    ↓
ChatPanel displays (with token cost)
```
## Conversation Logic Flow (AgentLoop)

The AgentLoop orchestrates multi-turn conversations with tool execution and user confirmation. Here's the detailed flow:
### 1. Message Preparation Phase

1. Load conversation history from ConversationManager
2. Filter out temporary messages:
   - Remove "No MCP server" errors
   - Remove "cancelled by user" messages
3. Apply a sliding window (`MAX_HISTORY_MESSAGES = 10`):
   - Keep the 10 most recent messages
   - Summarize older messages (extract task_ids, file names)
   - Combine: [summary] + [recent 10 messages]

Smart context management:

- Full history is preserved in ConversationManager
- Only recent context is sent to the LLM (reduces token usage)
- Old context is summarized to preserve key facts (task IDs, files)
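The sliding-window step reduces to a small list operation. A minimal Python sketch, assuming messages are role/content dicts (`build_context` and its signature are illustrative, not the actual AgentLoop.ts interface):

```python
MAX_HISTORY_MESSAGES = 10  # must match the backend value (see Known Limitations)

def build_context(history, summarize):
    """Keep the most recent messages verbatim and collapse everything
    older into a single summary message for the LLM."""
    if len(history) <= MAX_HISTORY_MESSAGES:
        return list(history)
    older = history[:-MAX_HISTORY_MESSAGES]
    recent = history[-MAX_HISTORY_MESSAGES:]
    summary = {"role": "system", "content": summarize(older)}
    return [summary] + recent
```

Note that the full history stays in ConversationManager untouched; only the context sent to the LLM is windowed.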
### 2. Agent Loop Iteration

```text
for each iteration (max 10):
├─ Call LLM with messages + available tools
├─ Receive response (message + optional tool_calls)
│
├─ Display assistant message
│  └─ Save to ConversationManager
│
├─ If no tool_calls → END
│
├─ If tool_calls exist:
│  ├─ Check if user cancelled before → prevent loop
│  ├─ Request user confirmation (once per turn)
│  │  ├─ User clicks "Execute" → continue
│  │  └─ User clicks "Cancel":
│  │     ├─ Set userCancelled flag
│  │     ├─ Send instruction to LLM (temporary, not saved)
│  │     ├─ LLM responds asking about issues
│  │     └─ If LLM returns tool_calls again → force stop
│  │
│  ├─ Execute tools via backend MCP API
│  │  ├─ Success → add tool result (JSON) to messages
│  │  └─ Error → add tool error to messages
│  │     (Not displayed or saved - only for LLM context)
│  │
│  ├─ Tool execution count++ (MAX = 1 per turn)
│  └─ Continue loop → LLM summarizes tool results
│
└─ LLM analyzes tool results:
   ├─ Extracts key info (task_id, status, errors)
   ├─ Generates natural language summary
   └─ Display summary → Save to history
```
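The loop above can be condensed into a runnable sketch. This is a simplified Python illustration of the control flow (the actual implementation is in `AgentLoop.ts`); the callback signatures are assumptions, and the cancellation re-prompt branch is omitted for brevity:

```python
MAX_ITERATIONS = 10       # at most 10 LLM calls per user message
MAX_TOOL_EXECUTIONS = 1   # tools run at most once per turn

def agent_loop(call_llm, execute_tools, confirm, messages):
    """call_llm(messages) -> (text, tool_calls or None); confirm(tool_calls)
    asks the user; execute_tools(tool_calls) returns result strings."""
    text = ""
    tool_executions = 0
    for _ in range(MAX_ITERATIONS):
        text, tool_calls = call_llm(messages)
        messages.append({"role": "assistant", "content": text})
        if not tool_calls:
            return text                    # no tool calls -> turn ends
        if tool_executions >= MAX_TOOL_EXECUTIONS:
            return text                    # safety: never re-execute in one turn
        if not confirm(tool_calls):
            return text                    # user cancelled
        for result in execute_tools(tool_calls):
            # Tool results feed the next LLM call only; they are neither
            # displayed to the user nor saved to conversation history.
            messages.append({"role": "tool", "content": result})
        tool_executions += 1
    return text
```

After the tool result is appended, the loop continues, so the next LLM call produces the natural-language summary the user actually sees.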
### 3. Tool Result Summarization

Old approach (removed):

- Tool executes → format JSON → display to user → save
- User sees raw technical details

New approach (current):

- Tool executes → add raw JSON to messages (temporary)
- Continue loop → LLM receives the tool result
- LLM summarizes: "Task abc123 submitted successfully. Expected wait time: 5 minutes."
- The LLM's summary is displayed and saved to history

Benefits:

- ✅ Natural language responses instead of raw JSON
- ✅ LLM extracts important info (task_id, status, errors)
- ✅ LLM provides context ("You can check status with...")
- ✅ Multilingual support (LLM responds in the user's language)
- ✅ Better error explanations
### 4. User Cancellation Flow

```text
LLM proposes tool call (e.g., structure prediction)
    ↓
Confirmation dialog appears
    ↓
User clicks "Cancel"
    ↓
Set userCancelled = true
    ↓
Add temporary instruction: "Ask user why they cancelled"
    ↓
LLM responds: "Would you like to modify the parameters?"
    ↓
User replies: "Change copies to 3"
    ↓
New conversation turn (userCancelled reset)
    ↓
LLM generates new tool_calls with updated params
    ↓
Confirmation dialog appears again
```
Safety mechanisms:

- The userCancelled flag prevents an immediate retry
- If the LLM ignores the instruction and returns tool_calls → force stop
- The cancellation instruction is temporary (not saved to history)
- The user can modify parameters over multiple turns
### 5. Loop Control & Safety

Protections against infinite loops:

- Tool execution limit (`MAX_TOOL_EXECUTIONS = 1`):
  - Tools execute at most once per user message
  - After execution, the loop continues for the LLM summary but does not execute again
- Iteration limit (`maxIterations = 10`):
  - At most 10 LLM calls per user message
  - Prevents runaway loops
- Cancellation protection:
  - If the user cancels and the LLM still returns tool_calls → terminate
  - Display a fallback message instead of looping
- No tool calls → end:
  - If the LLM does not return tool_calls, the conversation turn ends
  - Prevents unnecessary iterations
6. Message Types & Persistence
| Message Type |
Displayed |
Saved to History |
Purpose |
| User message |
✅ |
✅ |
User's input |
| Assistant message |
✅ |
✅ |
LLM's response |
| Tool result (JSON) |
❌ |
❌ |
LLM context only |
| LLM summary |
✅ |
✅ |
User-facing result |
| Cancel instruction |
❌ |
❌ |
Temporary guidance |
| Tool error |
❌ |
❌ |
LLM handles it |
Key Principle: Users only see natural language, never raw JSON or system messages.
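The table reduces to a small lookup, sketched below in Python; the message-kind names are illustrative, not identifiers from the codebase:

```python
# Sketch of the display/persistence policy above.
# Values are (displayed, saved_to_history).
PERSISTENCE_POLICY = {
    "user":               (True,  True),
    "assistant":          (True,  True),
    "tool_result":        (False, False),  # LLM context only
    "llm_summary":        (True,  True),
    "cancel_instruction": (False, False),  # temporary guidance
    "tool_error":         (False, False),  # LLM handles it
}

def is_displayed(kind):
    return PERSISTENCE_POLICY[kind][0]

def is_saved(kind):
    return PERSISTENCE_POLICY[kind][1]
```

Keeping the two flags together makes the invariant visible: anything hidden from the user is also excluded from history, so replaying a conversation never surfaces raw JSON.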
### 7. Context Window Management

Problem: long conversations confuse the LLM with old context.

Solution: a three-layer approach.

1. Message filtering:
   - Remove temporary errors ("No MCP server")
   - Remove cancellation messages
2. Sliding window:
   - Keep only the 10 most recent messages for the LLM
   - Older messages don't affect current intent
3. Smart summarization:
   - Extract task_ids from old messages
   - Extract file names from old messages
   - Create a compact summary: "[Previous: tasks abc123, xyz789; files protein.pdb]"
   - The LLM gets key facts without old conversation noise

Result: the LLM has fresh context for the current task plus key facts from history.
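The summarization layer might look like the sketch below. The regular expressions are assumptions about how task IDs and file names appear in message text, not patterns from the actual code:

```python
import re

def summarize_old_messages(messages):
    """Pull task IDs and file names out of old messages and emit one
    compact summary line in the "[Previous: ...]" format shown above."""
    text = " ".join(m.get("content", "") for m in messages)
    # Assumed pattern: "task_id: abc123" (case-insensitive)
    task_ids = sorted(set(re.findall(r"task[_ ]?id[:\s]+([A-Za-z0-9_-]+)", text, re.I)))
    # Assumed pattern: bare filenames with a supported extension
    files = sorted(set(re.findall(r"\b[\w.-]+\.(?:pdb|cif|sdf|mol2?|fasta|fa|csv)\b", text)))
    parts = []
    if task_ids:
        parts.append("tasks " + ", ".join(task_ids))
    if files:
        parts.append("files " + ", ".join(files))
    return "[Previous: " + "; ".join(parts) + "]" if parts else "[Previous conversation]"
```

The summary costs a handful of tokens regardless of how long the discarded history was, which is the point of the layer.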
### 8. Known Limitations & Potential Issues

Issue 1: LLM May Return Tool Calls After Summarizing ✅ FIXED

- Problem: After receiving tool results, the LLM might generate new tool_calls while summarizing (e.g., "Let me check task status")
- Solution: Added a system instruction after tool execution: "DO NOT make any new tool calls in this response - only provide a clear, user-friendly summary"
- Mitigation: `MAX_TOOL_EXECUTIONS = 1` still prevents re-execution as a backup
- Status: Fixed - the LLM now only summarizes instead of calling new tools
Issue 2: Multiple Tool Execution
- Problem: If LLM returns multiple tool_calls, all are executed in the same turn
- Current Behavior: All tools execute, then count increments by 1
- Risk: In edge cases, user might execute more tools than intended
- Mitigation: Tool execution limit per turn prevents runaway
Issue 3: Mixed Success/Failure Results
- Problem: If some tools succeed and others fail, LLM receives mixed results
- Current Behavior: LLM summarizes both successes and failures
- Benefit: Actually helpful - LLM can explain what worked and what didn't
- No action needed: This is acceptable behavior
Issue 4: Context Window Synchronization
- Problem: Frontend MAX_HISTORY_MESSAGES hardcoded, must match backend .env
- Current: Manual sync required (both set to 10)
- Risk: If values differ, context management breaks
- Future Fix: Backend should return MAX_HISTORY_MESSAGES in API response
Issue 5: Backend Tool Execution Mode
- Problem: Code supports backend auto-execution (toolResults) but currently unused
- Current: Frontend always executes tools after confirmation
- Risk: Dead code path may have bugs
- Recommendation: Remove backend auto-execution support or test thoroughly
## Key Components

### Frontend (VSCode Extension)

| File | Purpose |
|---|---|
| `src/extension.ts` | Entry point, command registration |
| `src/agent/AgentLoop.ts` | Multi-turn conversation orchestration |
| `src/llm/CustomProvider.ts` | Agent server communication |
| `src/auth/AuthManager.ts` | Session management (3h expiration) |
| `src/conversation/ConversationManager.ts` | Conversation persistence |
| `src/file/FileManager.ts` | File search and validation |
| `src/webview/ChatPanel.ts` | Chat UI with @ file selection |
| `src/webview/LoginPanel.ts` | Authentication UI |
### Backend (Agent Server)

| File | Purpose |
|---|---|
| `agent_server/app.py` | Flask server entry point |
| `agent_server/routes/chat.py` | Chat API endpoints |
| `agent_server/routes/upload.py` | File upload and validation |
| `agent_server/routes/user.py` | User authentication |
| `agent_server/services/mcp_client.py` | MCP server management |
| `agent_server/services/file_validator.py` | Deep file validation |
| `agent_server/services/upload.py` | OSS file upload |
## Configuration

VSCode settings (`settings.json`):

```jsonc
{
  "localChat.auth.apiBaseUrl": "Platform API URL",
  "localChat.auth.apiKey": "Your API key",
  "localChat.llmProvider": "custom",
  "localChat.custom.apiUrl": "http://localhost:5001/api/chat/completions",
  "localChat.dataDirectory": "~/data" // Optional: restrict file access
}
```
MCP configuration (`agent_server/mcp_servers.json`):

```json
{
  "mcpServers": {
    "quregenai_mcp": {
      "type": "stdio",
      "command": "python",
      "args": ["src/server.py"],
      "cwd": "/absolute/path/to/quregenai_mcp",
      "env": { "QUREGENAI_API_KEY": "your-key" },
      "disabled": false,
      "timeout": 60
    }
  }
}
```
See agent_server/MCP_SETUP.md for detailed MCP configuration.
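As a sketch of how a consumer of this file might behave, the snippet below parses the config and skips servers marked `disabled`. The helper name is illustrative, not the actual `mcp_client.py` API:

```python
import json

def load_enabled_servers(config_text):
    """Parse mcp_servers.json content and drop servers marked disabled."""
    config = json.loads(config_text)
    return {name: spec
            for name, spec in config.get("mcpServers", {}).items()
            if not spec.get("disabled", False)}
```

Treating a missing `disabled` key as `false` matches the convention in the example config above, where the flag is optional.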
## Development

### Build

```bash
npm run compile
```

### Watch Mode

```bash
npm run watch
```

### Debug

Press F5 in VS Code to launch the extension in debug mode.
## Troubleshooting

### Agent Server Connection Failed

Check:

- Is agent_server running? Test: `curl http://localhost:5001/health`
- Is the API key configured correctly in settings?
- Is a firewall blocking port 5001?

Solution:

```bash
cd agent_server
python app.py
# Look for "Agent Server starting on port 5001"
```
### Login Failed

Check:

- Is the platform backend API accessible? (`http://120.26.248.92/api`)
- Are the username and password correct?
- Has the session expired? (3-hour limit)

Solution:

- Re-run the `Local Chat: Login` command
- Verify the API key in settings
- Check network connectivity
### File Selection Not Working

Check:

- Is a data directory configured? (if set, files must be inside it)
- Is the file format supported?
- Does the file exist, and is it readable?

Solution:

```jsonc
// Remove the directory restriction (or set the correct path)
"localChat.dataDirectory": ""

// Or set a specific directory
"localChat.dataDirectory": "~/data"
```
### Token Stats Not Showing

Issue: old conversations don't have usage data.

Solution:

- Create a new conversation with `+ New Chat`
- New messages will show token usage and cost

### MCP Tools Not Available

Check:

- Do the agent server logs show "MCP client initialized successfully"?
- Are MCP servers configured in `agent_server/mcp_servers.json`?
- Are the paths correct? (use absolute paths)

Solution:

```bash
# Check agent_server logs
cd agent_server
python app.py

# Look for:
#   [MCP Client] Connected to server: quregenai_mcp
#   [MCP Client] Cached 50+ tools from quregenai_mcp
```
### Compilation Errors

```bash
# Clean and rebuild
cd extensions/local_chat
rm -rf node_modules package-lock.json
npm install
npm run compile
```
### Extension Not Loading

Issue: changes are not reflected after reload.

Solution:

- Reload the VSCode window after code changes
- Or use `npm run watch` + F5 during development
- A packaged .vsix requires reinstallation
## Security

- API Key Storage: stored in VSCode globalState (encrypted)
- Session Management: 3-hour expiration; automatic re-authentication required
- Agent Server: validates every request with the API key
- File Access: optional directory restriction via `localChat.dataDirectory`
- No API Key Leakage: API keys are never sent to DeepSeek (handled by agent_server)
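The 3-hour expiration reduces to a timestamp comparison. A minimal sketch, assuming the session stores a Unix issued-at time (the actual `AuthManager.ts` fields are not shown here):

```python
import time

SESSION_TTL_SECONDS = 3 * 60 * 60  # 3-hour expiration

def session_valid(issued_at, now=None):
    """Return True while the session is younger than the TTL."""
    if now is None:
        now = time.time()
    return (now - issued_at) < SESSION_TTL_SECONDS
```

When this returns False, the extension would prompt for re-authentication via the `Local Chat: Login` command.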
## Additional Documentation

- `SETUP.md` - Detailed setup guide with architecture overview
- `agent_server/README.md` - Agent server API documentation
- `agent_server/MCP_SETUP.md` - MCP integration and configuration guide
## Commands

- `Local Chat: Login` - Authenticate with platform credentials
- `Local Chat: Open Chat` - Open the chat interface (requires login)
- `Local Chat: New Conversation` - Create a new conversation
- `Local Chat: Switch Conversation` - Change the active conversation
- `Local Chat: Rename Conversation` - Rename the current conversation
- `Local Chat: Delete Conversation` - Remove a conversation
- `Local Chat: Clear Conversation` - Empty the current conversation
## Development Tips

- Use `npm run watch` for automatic recompilation
- Press F5 to launch the Extension Development Host
- Reload the window (`Cmd+R`) after code changes
- Check the extension output for debugging
- Agent server logs show MCP and tool execution details
## Project Structure

```text
local_chat/
├── src/
│   ├── extension.ts                 # Entry point
│   ├── agent/
│   │   └── AgentLoop.ts             # Conversation orchestration
│   ├── auth/
│   │   └── AuthManager.ts           # Session management
│   ├── conversation/
│   │   └── ConversationManager.ts   # History persistence
│   ├── file/
│   │   └── FileManager.ts           # File operations
│   ├── llm/
│   │   └── CustomProvider.ts        # Agent server client
│   ├── webview/
│   │   ├── ChatPanel.ts             # Chat UI
│   │   └── LoginPanel.ts            # Login UI
│   └── types.ts                     # TypeScript definitions
├── agent_server/
│   ├── app.py                       # Flask server
│   ├── routes/                      # API endpoints
│   ├── services/                    # Business logic
│   └── mcp_servers.json             # MCP configuration
└── README.md                        # This file
```
## License

Apache 2.0