How to Test and Debug MCP Apps: A Complete Guide for Developers
You've built an MCP App: a slick data dashboard, an interactive form, or maybe a 3D visualization. It works perfectly in your local development environment. But then you deploy it, open Claude Desktop, and... nothing renders. Or it renders but buttons don't work. Or it works in Claude but breaks in ChatGPT.
Welcome to the reality of MCP App testing and debugging.
Testing apps that run inside AI assistants comes with unique challenges. You're not just testing a web component; you're testing how that component behaves inside a sandboxed environment, across different AI clients, with varying CSP policies, and sometimes unpredictable AI-generated inputs.
Today, I'll walk you through a complete testing and debugging framework for MCP Apps (interactive UI components for AI assistants like Claude, ChatGPT, and VS Code). By the end, you'll have a battle-tested workflow to catch bugs before your users do.
🎯 The Testing Challenge: Why MCP Apps Are Different
Before diving into solutions, let's understand what makes MCP App testing unique:
| Traditional Web App | MCP App |
|---|---|
| Runs in browser you control | Runs inside AI client's sandbox |
| Direct access to dev tools | Limited debugging visibility |
| Predictable environment | Multiple client implementations |
| Standard error handling | Errors may be swallowed by AI client |
| Refresh to update | Requires AI to re-render component |
This means your testing strategy needs layers:
- Unit tests for component logic
- Integration tests for MCP protocol handling
- Client-specific tests for each AI assistant
- Manual QA for edge cases AI inputs create
🧪 Layer 1: Unit Testing Your Components
Start with standard React/Vue component testing. Use the same tools you'd use for any frontend project.
Setup with Vitest + React Testing Library
```bash
npm install -D vitest @testing-library/react @testing-library/jest-dom
```
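Rendering React components in tests also needs a DOM environment. Here is a minimal `vitest.config.ts` sketch; the jsdom environment and the setup-file path are assumptions, so adjust them for your project (and install jsdom with `npm install -D jsdom`):

```typescript
// vitest.config.ts — minimal sketch; assumes a jsdom environment and a
// setup file that imports @testing-library/jest-dom
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    environment: 'jsdom',              // provides document/window for render()
    globals: true,                     // describe/it/expect without imports
    setupFiles: './src/test-setup.ts', // e.g. import '@testing-library/jest-dom'
  },
});
```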
```tsx
// Counter.test.tsx
import { describe, it, expect } from 'vitest';
import { render, screen, fireEvent } from '@testing-library/react';
import Counter from './Counter';

describe('Counter', () => {
  it('renders initial count', () => {
    render(<Counter initialValue={5} />);
    expect(screen.getByText('5')).toBeInTheDocument();
  });

  it('increments when + button clicked', () => {
    render(<Counter initialValue={0} />);
    fireEvent.click(screen.getByText('+'));
    expect(screen.getByText('1')).toBeInTheDocument();
  });

  it('handles negative values', () => {
    render(<Counter initialValue={0} />);
    fireEvent.click(screen.getByText('-'));
    expect(screen.getByText('-1')).toBeInTheDocument();
  });
});
```
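The tests above assume a `Counter` component. One way to keep that component trivially testable is to factor the state transition into a pure function and let the component call it. A sketch (the reducer shape is my own, not anything from the MCP spec):

```typescript
// counter-logic.ts — pure state transition for a hypothetical Counter
// component. Keeping it free of React makes it testable without a DOM.
export type CounterAction = 'increment' | 'decrement' | { set: number };

export function counterReducer(count: number, action: CounterAction): number {
  if (action === 'increment') return count + 1;
  if (action === 'decrement') return count - 1;
  return action.set; // { set: n } jumps straight to a value (useful for AI calls)
}
```

The component then becomes a thin wrapper, e.g. `onClick={() => setCount(c => counterReducer(c, 'increment'))}`.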
Testing MCP-Specific Functions
If your app exposes functions to the AI, test those too:
```ts
// mcp-functions.test.ts
import { describe, it, expect, vi } from 'vitest';
import { registerMCPFunctions } from './mcp-functions';

describe('MCP Functions', () => {
  it('registers setCounter function', () => {
    const mockRegister = vi.fn();
    global.window.mcp = { registerFunction: mockRegister };
    registerMCPFunctions();
    expect(mockRegister).toHaveBeenCalledWith(
      'setCounter',
      expect.any(Function)
    );
  });

  it('setCounter updates state correctly', () => {
    const registeredFunctions: Record<string, Function> = {};
    global.window.mcp = {
      registerFunction: (name: string, fn: Function) => {
        registeredFunctions[name] = fn;
      }
    };
    registerMCPFunctions();
    const result = registeredFunctions.setCounter(42);
    expect(result).toEqual({ success: true, newValue: 42 });
  });
});
```
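For reference, here is one possible shape of `registerMCPFunctions` that would satisfy those tests. The `window.mcp.registerFunction` bridge is an assumption carried over from the tests above; real clients expose their bridges differently, so treat this as a sketch:

```typescript
// mcp-functions.ts — hypothetical implementation matching the tests above.
type MCPBridge = {
  registerFunction: (name: string, fn: (...args: any[]) => any) => void;
};

let counter = 0; // module-level state for the sketch

export function registerMCPFunctions(): void {
  const mcp = (globalThis as any).window?.mcp as MCPBridge | undefined;
  if (!mcp) return; // not running inside an MCP-capable host

  mcp.registerFunction('setCounter', (value: number) => {
    // Validate the AI-provided argument before trusting it
    if (typeof value !== 'number' || !Number.isFinite(value)) {
      return { success: false, error: 'value must be a finite number' };
    }
    counter = value;
    return { success: true, newValue: counter };
  });
}
```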
🔌 Layer 2: Integration Testing with the MCP Protocol
Your app communicates with AI clients via the Model Context Protocol. Test this integration.
Mocking the MCP Client
```ts
// mcp-mock.ts
export class MockMCPClient {
  private handlers: Map<string, Function> = new Map();
  private messageLog: any[] = [];

  registerFunction(name: string, handler: Function) {
    this.handlers.set(name, handler);
  }

  async callFunction(name: string, args: any) {
    this.messageLog.push({ direction: 'in', name, args });
    const handler = this.handlers.get(name);
    if (!handler) throw new Error(`Unknown function: ${name}`);
    const result = await handler(args);
    this.messageLog.push({ direction: 'out', result });
    return result;
  }

  getMessageLog() {
    return this.messageLog;
  }

  clearLog() {
    this.messageLog = [];
  }
}

// Make available globally for tests
global.MockMCPClient = MockMCPClient;
```
Testing Full Interaction Flows
```ts
// app-integration.test.ts
import { describe, it, expect, beforeEach } from 'vitest';
import { MockMCPClient } from './mcp-mock';
import { initializeApp } from './app';

describe('MCP App Integration', () => {
  let client: MockMCPClient;

  beforeEach(() => {
    client = new MockMCPClient();
    initializeApp(client);
  });

  it('handles complete user interaction flow', async () => {
    // AI calls function to set initial data
    await client.callFunction('loadData', { source: 'sales-q4' });

    // Verify data loaded
    expect(client.getMessageLog()).toContainEqual(
      expect.objectContaining({
        direction: 'out',
        result: expect.objectContaining({ status: 'loaded' })
      })
    );

    // AI requests chart generation
    const chartResult = await client.callFunction('generateChart', {
      type: 'bar',
      data: 'sales-q4'
    });

    expect(chartResult).toHaveProperty('chartId');
    expect(chartResult).toHaveProperty('renderUrl');
  });

  it('handles errors gracefully', async () => {
    // Request with invalid parameters
    const result = await client.callFunction('loadData', {
      source: 'non-existent-source'
    });

    expect(result).toEqual({
      success: false,
      error: 'Data source not found',
      availableSources: ['sales-q4', 'sales-q3', 'inventory']
    });
  });
});
```
🖥️ Layer 3: Client-Specific Testing
Different AI clients implement MCP differently. You need to test in each target environment.
Testing Matrix
| Feature | Claude Desktop | ChatGPT | VS Code | Test Priority |
|---|---|---|---|---|
| Component rendering | Critical | Critical | High | P0 |
| Button interactions | Critical | Critical | High | P0 |
| Form submissions | Critical | High | Medium | P1 |
| File uploads | High | Medium | Low | P2 |
| External API calls | Critical | Critical | Medium | P1 |
| Dark mode | High | Medium | Low | P2 |
Automated Browser Testing with Playwright
```ts
// e2e/claude-desktop.spec.ts
import { test, expect } from '@playwright/test';

test.describe('MCP App in Claude Desktop', () => {
  test('renders counter app', async ({ page }) => {
    // Navigate to the Claude web interface
    await page.goto('https://claude.ai');

    // Login (use a dedicated test account)
    await page.fill('[name="email"]', process.env.TEST_EMAIL!);
    await page.click('button[type="submit"]');

    // Start a new conversation
    await page.click('text=New Chat');

    // Ask Claude to render the app
    await page.fill('textarea', 'Show me the counter app');
    await page.press('textarea', 'Enter');

    // Wait for the MCP App iframe to appear
    await expect(page.locator('iframe[data-mcp-app]')).toBeVisible();

    // Interact with the app inside the iframe
    const frame = page.frameLocator('iframe[data-mcp-app]');
    await frame.locator('text=+').click();

    // Verify the counter updated
    await expect(frame.locator('.counter-value')).toHaveText('1');
  });
});
```
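A Playwright config tuned for this environment helps. The values below are assumptions based on AI clients being slower and flakier than ordinary pages; adjust to taste:

```typescript
// playwright.config.ts — sketch; adjust for your project
import { defineConfig } from '@playwright/test';

export default defineConfig({
  testDir: './e2e',
  timeout: 60_000,   // AI responses are slow; allow generous per-test timeouts
  retries: 2,        // rendering inside AI clients can be flaky
  use: {
    screenshot: 'only-on-failure',
    video: 'retain-on-failure',
  },
});
```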
Manual Testing Checklist
Some things are better tested manually. Create a checklist:
```markdown
## Pre-Release Manual QA Checklist

### Claude Desktop
- [ ] App renders on first request
- [ ] App re-renders correctly on refresh
- [ ] Buttons respond to clicks
- [ ] Forms validate input
- [ ] Loading states display
- [ ] Error messages appear
- [ ] Dark mode applies correctly
- [ ] App responds to AI function calls
- [ ] Large datasets don't freeze UI
- [ ] Mobile view (narrow width) works

### ChatGPT
- [ ] App renders on first request
- [ ] App re-renders correctly on refresh
- [ ] Buttons respond to clicks
- [ ] Forms validate input
- [ ] Loading states display
- [ ] Error messages appear
- [ ] App responds to AI function calls

### Cross-Client
- [ ] State persists correctly across re-renders
- [ ] App handles rapid consecutive requests
- [ ] App handles malformed AI inputs gracefully
- [ ] Memory usage stays reasonable (<100MB)
```
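For the memory item, there is no standard in-page API, but Chromium-based clients expose the non-standard `performance.memory`, which works as a rough heuristic during manual QA. A sketch that returns `null` wherever the API is unavailable:

```typescript
// Rough JS heap usage in MB, or null if the (non-standard, Chromium-only)
// performance.memory API is unavailable in this environment.
export function heapUsageMB(): number | null {
  const mem = (performance as any).memory;
  return mem ? mem.usedJSHeapSize / (1024 * 1024) : null;
}
```

Log it periodically during a long session and watch for unbounded growth rather than trusting any single reading.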
🔍 Debugging Techniques
When things go wrong, here's how to diagnose issues:
1. Console Logging Strategy
Add structured logging to trace execution:
```ts
// logger.ts
const DEBUG = import.meta.env.DEV || window.location.search.includes('debug=true');

export const log = {
  info: (msg: string, data?: any) => {
    if (DEBUG) console.log(`[MCP App] ${msg}`, data ?? '');
  },
  error: (msg: string, error: any) => {
    console.error(`[MCP App Error] ${msg}`, error);
    // Send to error tracking service
    if (window.Sentry) {
      window.Sentry.captureException(error);
    }
  },
  mcp: (direction: 'in' | 'out', message: any) => {
    if (DEBUG) {
      const arrow = direction === 'in' ? '←' : '→';
      console.log(`[MCP ${arrow}]`, message);
    }
  }
};

// Usage in your app
function handleAIRequest(data: any) {
  log.mcp('in', data);
  try {
    const result = processRequest(data);
    log.mcp('out', result);
    return result;
  } catch (error) {
    log.error('Failed to process AI request', error);
    throw error;
  }
}
```
2. Visual Debugging Overlay
Add a debug panel visible in development:
```tsx
// DebugPanel.tsx
function DebugPanel() {
  const [logs, setLogs] = useState<any[]>([]);
  const [showPanel, setShowPanel] = useState(false);

  useEffect(() => {
    if (!import.meta.env.DEV) return;

    // Capture console logs
    const originalLog = console.log;
    console.log = (...args) => {
      setLogs(prev => [...prev.slice(-50), args]);
      originalLog.apply(console, args);
    };

    // Restore the original on unmount so the patch doesn't leak
    return () => {
      console.log = originalLog;
    };
  }, []);

  if (!import.meta.env.DEV) return null;

  return (
    <>
      <button
        onClick={() => setShowPanel(!showPanel)}
        style={{ position: 'fixed', bottom: 10, right: 10, zIndex: 9999 }}
      >
        🐛 Debug
      </button>
      {showPanel && (
        <div style={{
          position: 'fixed',
          bottom: 50,
          right: 10,
          width: 400,
          height: 300,
          background: 'rgba(0,0,0,0.9)',
          color: '#0f0',
          fontFamily: 'monospace',
          fontSize: 12,
          overflow: 'auto',
          padding: 10,
          zIndex: 9999
        }}>
          <h4>MCP Debug Log</h4>
          {logs.map((log, i) => (
            <div key={i}>{JSON.stringify(log)}</div>
          ))}
        </div>
      )}
    </>
  );
}
```
3. Network Request Inspection
Monitor external API calls:
```ts
// Wrap fetch to log all network requests
const originalFetch = window.fetch;
window.fetch = async (...args) => {
  const [url, config] = args;
  log.info(`API Request: ${config?.method || 'GET'} ${url}`);
  try {
    const response = await originalFetch(...args);
    log.info(`API Response: ${response.status} ${url}`);
    return response;
  } catch (error) {
    log.error(`API Error: ${url}`, error);
    throw error;
  }
};
```
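One caveat with patching `window.fetch` like this: it is hard to undo, and the client itself (or another library) may patch fetch too. A reversible variant that returns an uninstall function is safer; a sketch:

```typescript
// Reversible fetch instrumentation: installing returns an uninstall function,
// so debug builds can remove the patch cleanly instead of stacking wrappers.
export function instrumentFetch(logFn: (msg: string) => void): () => void {
  const original = globalThis.fetch;
  globalThis.fetch = (async (...args: Parameters<typeof fetch>) => {
    logFn(`fetch: ${String(args[0])}`);
    return original(...args);
  }) as typeof fetch;
  return () => { globalThis.fetch = original; }; // uninstall
}
```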
4. Error Boundary for Crash Recovery
Prevent total app crashes:
```tsx
// ErrorBoundary.tsx
import { Component, ErrorInfo, ReactNode } from 'react';

interface Props {
  children: ReactNode;
  fallback?: ReactNode;
}

interface State {
  hasError: boolean;
  error?: Error;
}

export class MCPErrorBoundary extends Component<Props, State> {
  state: State = { hasError: false };

  static getDerivedStateFromError(error: Error): State {
    return { hasError: true, error };
  }

  componentDidCatch(error: Error, errorInfo: ErrorInfo) {
    log.error('App crash caught by boundary', { error, errorInfo });

    // Report to AI that app crashed
    if (window.mcp?.registerFunction) {
      window.mcp.registerFunction('getErrorReport', () => ({
        error: error.message,
        stack: error.stack,
        componentStack: errorInfo.componentStack,
        timestamp: new Date().toISOString()
      }));
    }
  }

  render() {
    if (this.state.hasError) {
      return this.props.fallback || (
        <div style={{ padding: 20, color: 'red' }}>
          <h3>⚠️ App Error</h3>
          <p>Something went wrong. Try refreshing or contact support.</p>
          {import.meta.env.DEV && (
            <pre>{this.state.error?.message}</pre>
          )}
          <button onClick={() => window.location.reload()}>
            Reload App
          </button>
        </div>
      );
    }
    return this.props.children;
  }
}

// Usage
<MCPErrorBoundary>
  <YourApp />
</MCPErrorBoundary>
```
🐛 Common Issues and Solutions
Issue: App Doesn't Render in Claude
Symptoms: Claude responds with text description instead of rendering your app.
Debugging steps:
1. Check the MCP server is running:

   ```bash
   npx mcp-cli serve mcp.json  # Should show "Server running on port..."
   ```

2. Verify the Claude Desktop config:

   ```json
   // claude_desktop_config.json
   {
     "mcpServers": {
       "my-app": {
         "command": "npx",
         "args": ["mcp-cli", "serve", "/path/to/mcp.json"]
       }
     }
   }
   ```

3. Test with an explicit prompt:
   - Instead of "show my app", try "render the my-app MCP application"
   - Claude needs clear intent to render vs. describe

4. Check the browser console:
   - Open Claude Desktop DevTools (Cmd+Option+I on Mac)
   - Look for CSP errors or 404s
Issue: Buttons Don't Work
Symptoms: App renders but interactions have no effect.
Common causes:
- Event delegation: AI clients may intercept clicks. Use `onClick` handlers, not delegated events.
- CSP restrictions: Inline scripts may be blocked. Move handlers to external files.
- State not updating: React state may not trigger a re-render inside the iframe. Force a parent notification:

```tsx
const [count, setCount] = useState(0);
const increment = () => {
  setCount(c => c + 1);
  // Force parent notification
  window.parent.postMessage({ type: 'state-update' }, '*');
};
```
Issue: Works in Claude, Broken in ChatGPT
Symptoms: App functions in one client but not another.
Differences to check:
| Aspect | Claude | ChatGPT |
|---|---|---|
| CSP strictness | Moderate | Strict |
| iframe sandbox | allow-scripts | allow-scripts allow-same-origin |
| CSS isolation | Partial | Full |
| Function calls | Synchronous | Async |
Fix: Test both environments during development, and branch on the client you detect:
```ts
function detectClient() {
  if (window.location.hostname.includes('claude')) return 'claude';
  if (window.location.hostname.includes('chatgpt')) return 'chatgpt';
  if (window.location.hostname.includes('vscode')) return 'vscode';
  return 'unknown';
}

// Adjust behavior based on client
const client = detectClient();
if (client === 'chatgpt') {
  // Apply stricter CSP workarounds
}
```
Issue: Slow Initial Load
Symptoms: App takes 3+ seconds to appear.
Solutions:
1. Code-split heavy dependencies:

   ```tsx
   const Chart = lazy(() => import('./HeavyChart'));
   ```

2. Show a skeleton immediately:

   ```tsx
   <Suspense fallback={<Skeleton />}>
     <Chart />
   </Suspense>
   ```

3. Preload critical resources:

   ```html
   <link rel="preload" href="/critical.css" as="style">
   <link rel="preload" href="/chart-data.json" as="fetch">
   ```
📊 Testing Metrics to Track
Monitor these in production:
| Metric | Target | Measurement |
|---|---|---|
| Time to First Render | < 2s | performance.now() on mount |
| Error Rate | < 1% | Sentry or similar |
| Client Compatibility | 100% P0 features | Automated test coverage |
| Function Call Success | > 99% | Logged MCP responses |
| User Interaction Latency | < 100ms | Event timestamp diff |
```tsx
// Instrument your app
useEffect(() => {
  // Report load time
  const loadTime = performance.now();
  log.info(`App loaded in ${loadTime.toFixed(2)}ms`);

  // Send to analytics
  analytics.track('mcp_app_loaded', {
    app: 'counter-app',
    loadTime,
    client: detectClient()
  });
}, []);
```
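The "User Interaction Latency" row can be measured from the event's own timestamp: in browsers, `event.timeStamp` and `performance.now()` share a time origin, so their difference is the queue-plus-dispatch delay. A small sketch:

```typescript
// Input-to-handler latency in ms. Call from inside the event handler;
// handlerTime defaults to "now" at the moment the handler runs.
export function interactionLatencyMs(
  event: { timeStamp: number },
  handlerTime: number = performance.now()
): number {
  return handlerTime - event.timeStamp;
}
```

Usage: `button.addEventListener('click', e => log.info(`latency ${interactionLatencyMs(e).toFixed(1)}ms`));`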
✅ Production Testing Checklist
Before shipping your MCP App:
Automated Tests
- [ ] Unit tests pass (>80% coverage)
- [ ] Integration tests pass
- [ ] E2E tests pass in target clients
- [ ] No console errors in production build

Manual QA
- [ ] Tested in Claude Desktop (latest version)
- [ ] Tested in ChatGPT (if supported)
- [ ] Tested with slow network (3G throttling)
- [ ] Tested with error conditions (offline API)
- [ ] Tested with malformed AI inputs

Performance
- [ ] Initial load < 2 seconds
- [ ] Bundle size < 500KB
- [ ] No memory leaks (test 10 min usage)
- [ ] Responsive at 320px width

Security
- [ ] Input validation on all AI-provided data
- [ ] No secrets in client bundle
- [ ] CSP headers configured
- [ ] Error messages don't leak stack traces
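For the first security item, validate every AI-provided payload before acting on it. A dependency-free type-guard sketch; the field names and limits are illustrative, not from the MCP spec:

```typescript
// Narrow unknown AI input to a known shape; return null rather than throw,
// so the caller can send a structured error back to the AI.
interface LoadDataArgs { source: string; limit?: number; }

export function parseLoadDataArgs(input: unknown): LoadDataArgs | null {
  if (typeof input !== 'object' || input === null) return null;
  const { source, limit } = input as Record<string, unknown>;
  if (typeof source !== 'string' || source.length === 0 || source.length > 200) {
    return null;
  }
  if (limit !== undefined &&
      (typeof limit !== 'number' || !Number.isInteger(limit) ||
       limit < 1 || limit > 10_000)) {
    return null;
  }
  return { source, limit: limit as number | undefined };
}
```

A schema library (zod, valibot, etc.) scales better once you have many functions, but the principle is the same: never trust the shape of AI-generated arguments.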
🚀 Advanced: CI/CD for MCP Apps
Automate testing in your pipeline:
```yaml
# .github/workflows/test.yml
name: MCP App Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Setup Node
        uses: actions/setup-node@v3
        with:
          node-version: '18'
      - name: Install dependencies
        run: npm ci
      - name: Run unit tests
        run: npm run test:unit
      - name: Run integration tests
        run: npm run test:integration
      - name: Build production bundle
        run: npm run build
      - name: Audit bundle size
        run: npx bundlesize
      - name: E2E tests
        run: npm run test:e2e
        env:
          CLAUDE_TEST_EMAIL: ${{ secrets.CLAUDE_TEST_EMAIL }}
          CLAUDE_TEST_PASSWORD: ${{ secrets.CLAUDE_TEST_PASSWORD }}
```
Wrapping Up
Testing MCP Apps requires going beyond traditional web testing. You're building for a multi-client ecosystem where your app runs inside AI assistants with varying capabilities and constraints.
Key takeaways:
- Layer your tests: Unit → Integration → Client-specific → Manual
- Test in real clients: Simulators miss edge cases
- Add observability: Logging and error tracking are essential
- Automate what you can: CI/CD catches regressions early
- Maintain a QA checklist: Consistency prevents shipping bugs
The extra testing effort pays off. Users expect MCP Apps to "just work", and with this framework, yours will.
🛠️ Resources
- MCP Official Documentation
- Vitest Testing Framework
- Playwright E2E Testing
- Sentry Error Tracking
- MCP Apps Directory - see how others structure their apps
Have a testing strategy that works for you? Share it with the community on Discord or Twitter.
Happy debugging! 🎉
Tags: #mcp-apps #testing #debugging #developer-tools #model-context-protocol #claude-apps #tutorial