
Building an AI Assistant That Can Interact with Web Pages: A Complete Project Based on Playwright MCP

Author: 测试人
  • 2025-10-14
    Beijing
  • Length: 18,495 characters

    Reading time: about 61 minutes

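Project Overview: Building an Intelligent Web Automation Assistant

In this tutorial we build a complete AI assistant that can actually interact with web pages. The assistant understands natural-language instructions and carries them out as web operations through Playwright MCP. We start from scratch and cover the whole path from environment setup to deployment.

Project Goals

Build an AI assistant that can:

  • Log in to websites automatically and handle authentication

  • Fill in complex forms and interactive elements

  • Extract, analyze, and structure web page data

  • Handle multi-step workflows

  • Cope with page errors and dynamic content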

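I. Project Architecture Design

Technology Stack

  • Backend framework: Node.js + Express

  • Browser automation: Playwright

  • AI model integration: Anthropic Claude API

  • Protocol layer: custom MCP Server

  • Frontend: React + Tailwind CSS

  • Database: SQLite (for session storage)

  • Task queue: Bull (for asynchronous task processing)

System Architecture

    User interface (React)
        ↓ (HTTP/REST API)
    Backend server (Express + AI routing)
        ↓ (MCP protocol)
    Playwright MCP Server
        ↓ (browser control)
    Chromium/Firefox instance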

II. Environment Setup and Project Initialization

Step 1: Create the project structure

    mkdir ai-web-assistant
    cd ai-web-assistant
    mkdir -p src/{mcp,ai,routes,models,utils} public/{css,js} tests
    touch package.json server.js .env.example README.md

Step 2: Define the project dependencies

Create package.json:

    {
      "name": "ai-web-assistant",
      "version": "1.0.0",
      "type": "module",
      "scripts": {
        "start": "node server.js",
        "dev": "nodemon server.js",
        "test": "node --experimental-vm-modules node_modules/jest/bin/jest.js",
        "mcp:dev": "node src/mcp/server.js"
      },
      "dependencies": {
        "express": "^4.18.2",
        "cors": "^2.8.5",
        "dotenv": "^16.3.0",
        "playwright": "^1.40.0",
        "@anthropic-ai/sdk": "^0.7.0",
        "sqlite3": "^5.1.6",
        "bull": "^4.11.0",
        "express-rate-limit": "^7.1.0",
        "helmet": "^7.0.0"
      },
      "devDependencies": {
        "nodemon": "^3.0.0",
        "jest": "^29.6.0"
      }
    }

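Run npm install to install the dependencies. The AI module later passes a tools parameter to the Messages API, which needs an @anthropic-ai/sdk release that supports tool use. The ^0.7.0 pin above is probably too old for that, so check the SDK changelog and raise the version if needed.

Step 3: Configure the environment

Create a .env file:

    # API configuration
    ANTHROPIC_API_KEY=your_anthropic_api_key_here
    PORT=3000
    NODE_ENV=development

    # Browser configuration
    BROWSER_TYPE=chromium
    HEADLESS_MODE=false
    BROWSER_TIMEOUT=30000

    # Database configuration
    DB_PATH=./data/sessions.db

    # Security configuration
    SESSION_SECRET=your_session_secret_here
    RATE_LIMIT_WINDOW=900000
    RATE_LIMIT_MAX=100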
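Values read from .env are always strings, so the browser settings need to be parsed before they can be handed to Playwright. The following is a minimal sketch of that step. `loadBrowserConfig` is a hypothetical helper that does not appear in the original project; it only assumes the three browser variables listed above.

```javascript
// Hypothetical helper (not part of the original project): converts the raw
// string values from the environment into a typed browser configuration.
function loadBrowserConfig(env) {
  const allowedBrowsers = ['chromium', 'firefox', 'webkit'];
  const browserType = allowedBrowsers.includes(env.BROWSER_TYPE)
    ? env.BROWSER_TYPE
    : 'chromium';

  // "false" is a non-empty string and therefore truthy, so compare explicitly.
  const headless = String(env.HEADLESS_MODE).toLowerCase() !== 'false';

  // Fall back to 30 seconds when the value is missing or not a positive number.
  const parsedTimeout = Number.parseInt(env.BROWSER_TIMEOUT, 10);
  const timeout = Number.isFinite(parsedTimeout) && parsedTimeout > 0
    ? parsedTimeout
    : 30000;

  return { browserType, headless, timeout };
}

const exampleBrowserConfig = loadBrowserConfig({
  BROWSER_TYPE: 'chromium',
  HEADLESS_MODE: 'false',
  BROWSER_TIMEOUT: '30000'
});
console.log(exampleBrowserConfig);
```

In the real server the argument would be process.env, and the result could be passed to the PlaywrightMCPServer constructor.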

III. Core Module Implementation

1. Playwright MCP Server

Create src/mcp/server.js:

    import { chromium, firefox, webkit } from 'playwright';
    import { EventEmitter } from 'events';

    class PlaywrightMCPServer extends EventEmitter {
      constructor(config = {}) {
        super();
        this.config = {
          ...config,
          browserType: config.browserType || 'chromium',
          headless: config.headless !== false,
          timeout: config.timeout || 30000
        };
        this.browser = null;
        this.context = null;
        this.page = null;
        this.isInitialized = false;
        this.sessionId = null;
      }

      // Launch the browser and open a page
      async initialize(sessionId = null) {
        try {
          this.sessionId = sessionId || `session_${Date.now()}`;
          const browserMap = { chromium, firefox, webkit };
          const browserType = browserMap[this.config.browserType] || chromium;
          this.browser = await browserType.launch({
            headless: this.config.headless,
            timeout: this.config.timeout,
            // These flags are Chromium-specific
            args: browserType === chromium ? ['--no-sandbox', '--disable-dev-shm-usage'] : []
          });
          this.context = await this.browser.newContext({
            viewport: { width: 1280, height: 720 },
            userAgent: 'AI-Web-Assistant/1.0',
            acceptDownloads: true,
            ignoreHTTPSErrors: true
          });
          // Forward uncaught page errors as events
          this.context.on('page', page => {
            page.on('pageerror', error => {
              this.emit('pageError', { sessionId: this.sessionId, error });
            });
          });
          this.page = await this.context.newPage();
          // Default timeouts
          this.page.setDefaultTimeout(this.config.timeout);
          this.page.setDefaultNavigationTimeout(this.config.timeout * 2);
          this.isInitialized = true;
          this.emit('initialized', { sessionId: this.sessionId });
          return {
            success: true,
            message: 'Playwright MCP Server initialized successfully',
            sessionId: this.sessionId
          };
        } catch (error) {
          console.error('Failed to initialize Playwright:', error);
          // Emitting 'error' without a listener would throw, so check first
          if (this.listenerCount('error') > 0) this.emit('error', error);
          return { success: false, error: error.message };
        }
      }

      // Tool definitions, the core of the protocol layer. The Anthropic Messages API
      // expects each tool as { name, description, input_schema } with a JSON Schema.
      getTools() {
        const tool = (name, description, properties = {}, required = []) => ({
          name,
          description,
          input_schema: { type: 'object', properties, required }
        });
        return {
          navigate: tool('navigate', 'Navigate to a specific URL', {
            url: { type: 'string', description: 'The URL to navigate to' },
            waitUntil: {
              type: 'string',
              description: 'When to consider navigation successful',
              enum: ['load', 'domcontentloaded', 'networkidle'],
              default: 'networkidle'
            }
          }, ['url']),
          click: tool('click', 'Click on an element using CSS selector, XPath, or text', {
            selector: { type: 'string', description: 'CSS selector, XPath, or text to identify the element' },
            selectorType: {
              type: 'string',
              description: 'Type of selector: css, xpath, or text',
              enum: ['css', 'xpath', 'text'],
              default: 'css'
            },
            waitForNavigation: {
              type: 'boolean',
              description: 'Whether to wait for navigation after click',
              default: false
            }
          }, ['selector']),
          fill_form: tool('fill_form', 'Fill a form with multiple fields', {
            fields: { type: 'object', description: 'Object mapping selectors to values' }
          }, ['fields']),
          extract_data: tool('extract_data', 'Extract structured data from the page', {
            schema: { type: 'object', description: 'Schema defining what data to extract' }
          }, ['schema']),
          wait_for_element: tool('wait_for_element', 'Wait for an element to appear', {
            selector: { type: 'string', description: 'CSS selector for the element' },
            state: {
              type: 'string',
              description: 'Element state to wait for',
              enum: ['attached', 'detached', 'visible', 'hidden'],
              default: 'visible'
            },
            timeout: { type: 'number', description: 'Timeout in milliseconds', default: 10000 }
          }, ['selector']),
          screenshot: tool('screenshot', 'Take a screenshot for debugging', {
            fullPage: { type: 'boolean', description: 'Whether to capture full page', default: false }
          }),
          get_page_info: tool('get_page_info', 'Get comprehensive information about the current page')
        };
      }

      // Tool execution engine
      async executeTool(toolName, parameters = {}) {
        if (!this.isInitialized) {
          throw new Error('Playwright not initialized. Call initialize() first.');
        }

        try {
          let result;
          switch (toolName) {
            case 'navigate':
              result = await this.navigateToUrl(parameters.url, parameters.waitUntil);
              break;
            case 'click':
              result = await this.clickElement(
                parameters.selector, parameters.selectorType, parameters.waitForNavigation);
              break;
            case 'fill_form':
              result = await this.fillForm(parameters.fields);
              break;
            case 'extract_data':
              result = await this.extractData(parameters.schema);
              break;
            case 'wait_for_element':
              result = await this.waitForElement(
                parameters.selector, parameters.state, parameters.timeout);
              break;
            case 'screenshot':
              result = await this.takeScreenshot(parameters.fullPage);
              break;
            case 'get_page_info':
              result = await this.getPageInfo();
              break;
            default:
              throw new Error(`Unknown tool: ${toolName}`);
          }
          this.emit('toolExecuted', { sessionId: this.sessionId, toolName, parameters, result });
          return { success: true, data: result };
        } catch (error) {
          console.error(`Tool execution failed: ${toolName}`, error);
          this.emit('toolError', {
            sessionId: this.sessionId, toolName, parameters, error: error.message
          });
          return {
            success: false,
            error: error.message,
            suggestion: this.getErrorSuggestion(error.message)
          };
        }
      }

      // Individual tool implementations
      async navigateToUrl(url, waitUntil = 'networkidle') {
        // Default to https when no scheme is given
        if (!/^https?:\/\//i.test(url)) {
          url = 'https://' + url;
        }
        const response = await this.page.goto(url, { waitUntil, timeout: this.config.timeout });
        return {
          url: this.page.url(),
          status: response?.status(),
          title: await this.page.title(),
          finalUrl: this.page.url()
        };
      }

      async clickElement(selector, selectorType = 'css', waitForNavigation = false) {
        let element;
        switch (selectorType) {
          case 'css':
            element = this.page.locator(selector);
            break;
          case 'xpath':
            element = this.page.locator(`xpath=${selector}`);
            break;
          case 'text':
            element = this.page.getByText(selector, { exact: false });
            break;
          default:
            throw new Error(`Unsupported selector type: ${selectorType}`);
        }
        await element.waitFor({ state: 'visible' });
        // Collect element details before clicking; the element may be gone afterwards
        const info = await this.getElementInfo(element);
        if (waitForNavigation) {
          await Promise.all([
            this.page.waitForNavigation({ waitUntil: 'networkidle' }),
            element.click()
          ]);
        } else {
          await element.click();
        }
        return { success: true, element: info };
      }

      async fillForm(fields) {
        const results = {};
        for (const [selector, value] of Object.entries(fields)) {
          try {
            const element = this.page.locator(selector);
            await element.waitFor({ state: 'visible' });
            await element.fill(String(value));
            results[selector] = { success: true, value };
          } catch (error) {
            results[selector] = { success: false, error: error.message };
          }
        }
        return results;
      }

      // Referenced by executeTool; waits until the element reaches the given state
      async waitForElement(selector, state = 'visible', timeout = 10000) {
        await this.page.locator(selector).first().waitFor({ state, timeout });
        return { selector, state };
      }

      async extractData(schema) {
        const data = {};
        for (const [key, config] of Object.entries(schema)) {
          try {
            const { selector, type = 'text', attribute } = config;
            const element = this.page.locator(selector);
            switch (type) {
              case 'attribute':
                data[key] = await element.first().getAttribute(attribute);
                break;
              case 'multiple':
                data[key] = await element.allTextContents();
                break;
              case 'text':
              default:
                // first() avoids a strict-mode error when several elements match
                data[key] = await element.first().textContent();
            }
          } catch (error) {
            data[key] = null;
          }
        }
        return data;
      }

      async getElementInfo(element) {
        try {
          const boundingBox = await element.boundingBox();
          const isVisible = await element.isVisible();
          return {
            visible: isVisible,
            boundingBox,
            tagName: await element.evaluate(el => el.tagName.toLowerCase())
          };
        } catch (error) {
          return { error: error.message };
        }
      }

      async takeScreenshot(fullPage = false) {
        const screenshot = await this.page.screenshot({ fullPage, type: 'png' });
        return { screenshot: screenshot.toString('base64'), type: 'png', fullPage };
      }

      async getPageInfo() {
        return {
          url: this.page.url(),
          title: await this.page.title(),
          content: await this.page.content(),
          viewport: this.page.viewportSize()
        };
      }

      // Map common error messages to a suggestion
      getErrorSuggestion(errorMessage) {
        const suggestions = {
          'timeout': 'Try increasing the wait time or check the network connection',
          'element not found': 'Check that the selector is correct, or wait for the element to load',
          'navigation failed': 'Check that the URL is correct and the site is reachable',
          'target closed': 'The browser page was closed and must be re-initialized'
        };
        for (const [key, suggestion] of Object.entries(suggestions)) {
          if (errorMessage.toLowerCase().includes(key)) {
            return suggestion;
          }
        }
        return 'Check the network connection and page state, then retry';
      }

      // Release resources
      async cleanup() {
        try {
          if (this.page) {
            await this.page.close();
          }
          if (this.context) {
            await this.context.close();
          }
          if (this.browser) {
            await this.browser.close();
          }
          this.isInitialized = false;
          this.emit('cleanedUp', { sessionId: this.sessionId });
          return { success: true, message: 'Resources cleaned up successfully' };
        } catch (error) {
          console.error('Cleanup failed:', error);
          return { success: false, error: error.message };
        }
      }
    }

    export default PlaywrightMCPServer;
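The tool definitions declare a type, an optional enum, and an optional default for each parameter, but the execution engine passes whatever the model sends straight to Playwright. The sketch below shows one way such declarations could be checked first. `validateToolInput` is a hypothetical helper, not something in the original project, and it assumes the declarations are given as a plain map of parameter name to `{ type, enum, default }`.

```javascript
// Hypothetical helper: checks tool input against parameter declarations of the
// form { type, enum, default }. The typeof comparison is deliberately coarse
// (arrays and null both count as 'object').
function validateToolInput(properties, required, input) {
  const errors = [];
  const values = {};

  for (const name of required) {
    if (input[name] === undefined) {
      errors.push(`missing required parameter: ${name}`);
    }
  }

  for (const [name, spec] of Object.entries(properties)) {
    // Fall back to the declared default when the caller omitted the value.
    const value = input[name] !== undefined ? input[name] : spec.default;
    if (value === undefined) continue; // optional and not supplied

    if (typeof value !== spec.type) {
      errors.push(`${name} must be of type ${spec.type}`);
    } else if (spec.enum && !spec.enum.includes(value)) {
      errors.push(`${name} must be one of: ${spec.enum.join(', ')}`);
    } else {
      values[name] = value;
    }
  }

  return { valid: errors.length === 0, errors, values };
}

// Parameter declarations mirroring the navigate tool.
const navigateProperties = {
  url: { type: 'string' },
  waitUntil: {
    type: 'string',
    enum: ['load', 'domcontentloaded', 'networkidle'],
    default: 'networkidle'
  }
};

const acceptedInput = validateToolInput(navigateProperties, ['url'], { url: 'https://example.com' });
const rejectedInput = validateToolInput(navigateProperties, ['url'], { waitUntil: 'later' });
console.log(acceptedInput);
console.log(rejectedInput);
```

Returning the collected errors instead of throwing lets the caller send all of them back to the model in a single tool result, so it can correct the call in one step.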

2. AI Handler Module

Create src/ai/handler.js:

    import Anthropic from '@anthropic-ai/sdk';
    import PlaywrightMCPServer from '../mcp/server.js';

    // Model ID used in this tutorial; replace it with one available to your account.
    const MODEL = 'claude-3-sonnet-20240229';
    // 10 rounds of user/assistant messages
    const MAX_HISTORY_MESSAGES = 20;

    class AIHandler {
      constructor(apiKey) {
        this.anthropic = new Anthropic({ apiKey });
        // Note: all sessions share this single browser instance
        this.mcpServer = new PlaywrightMCPServer();
        this.conversationHistory = new Map();
      }

      // Initialize a session
      async initializeSession(sessionId) {
        const result = await this.mcpServer.initialize(sessionId);
        // Use the ID the server settled on in case none was passed in
        const id = result.sessionId || sessionId;
        if (id && !this.conversationHistory.has(id)) {
          this.conversationHistory.set(id, []);
        }
        return result;
      }

      // Process a user instruction
      async processInstruction(sessionId, instruction, context = {}) {
        try {
          if (!this.conversationHistory.has(sessionId)) {
            this.conversationHistory.set(sessionId, []);
          }
          const history = this.conversationHistory.get(sessionId);
          const system = this.buildSystemPrompt(context);
          const tools = Object.values(this.mcpServer.getTools());

          // Working copy of the conversation; every tool round is appended to it
          const messages = [...history, { role: 'user', content: instruction }];

          // Call the Claude model
          let currentMessage = await this.anthropic.messages.create({
            model: MODEL, max_tokens: 4096, system, messages, tools
          });

          // Handle tool calls until the model stops requesting them
          while (currentMessage.content.some(item => item.type === 'tool_use')) {
            const toolResults = [];
            for (const contentItem of currentMessage.content) {
              if (contentItem.type === 'tool_use') {
                const toolResult = await this.mcpServer.executeTool(
                  contentItem.name, contentItem.input);
                toolResults.push({
                  type: 'tool_result',
                  tool_use_id: contentItem.id,
                  content: JSON.stringify(toolResult)
                });
              }
            }
            // Continue the conversation with the full record of earlier rounds
            messages.push(
              { role: 'assistant', content: currentMessage.content },
              { role: 'user', content: toolResults }
            );
            currentMessage = await this.anthropic.messages.create({
              model: MODEL, max_tokens: 4096, system, messages, tools
            });
          }

          // Extract the final response
          const textContent = currentMessage.content.find(item => item.type === 'text');
          const finalResponse = textContent ? textContent.text : 'Operation completed';

          // Update the conversation history
          history.push(
            { role: 'user', content: instruction },
            { role: 'assistant', content: currentMessage.content }
          );
          // Keep the most recent 10 rounds
          if (history.length > MAX_HISTORY_MESSAGES) {
            history.splice(0, history.length - MAX_HISTORY_MESSAGES);
          }
          return { success: true, response: finalResponse, sessionId };
        } catch (error) {
          console.error('AI processing failed:', error);
          return { success: false, error: error.message, sessionId };
        }
      }

      // Build the system prompt
      buildSystemPrompt(context) {
        return `You are a professional web automation assistant that performs web tasks through browser automation tools.

    Your capabilities include:
    - Navigating to a given URL
    - Clicking buttons and links
    - Filling in forms and input fields
    - Extracting data from web pages
    - Waiting for pages to load
    - Handling complex interactions

    Important guidelines:
    1. Analyze the page structure before acting
    2. Use appropriate selectors to locate elements
    3. Handle errors and exceptions that may occur
    4. Give clear feedback on each operation
    5. Break complex tasks into multiple steps

    Current context: ${JSON.stringify(context)}

    Operate carefully and make sure every step is carried out correctly. If you hit an error, analyze the cause and propose a solution.`;
      }

      // Get the session history
      getSessionHistory(sessionId) {
        return this.conversationHistory.get(sessionId) || [];
      }

      // Clean up a session
      async cleanupSession(sessionId) {
        this.conversationHistory.delete(sessionId);
        return await this.mcpServer.cleanup();
      }
    }

    export default AIHandler;
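The handler keeps roughly the last 10 rounds of conversation. One detail worth getting right is that the Messages API expects the conversation to start with a user message, so trimming should remove whole user/assistant pairs. The sketch below illustrates that idea. `trimHistory` is a hypothetical helper, and it assumes the stored history strictly alternates between user and assistant messages.

```javascript
// Hypothetical helper: keep only the most recent `maxRounds` user/assistant pairs.
// Assumes the history strictly alternates user, assistant, user, assistant, ...
function trimHistory(history, maxRounds = 10) {
  const maxMessages = maxRounds * 2;
  if (history.length <= maxMessages) {
    return history.slice();
  }
  // Cut at an even index so the trimmed history still begins with a user message.
  let start = history.length - maxMessages;
  if (start % 2 !== 0) {
    start += 1;
  }
  return history.slice(start);
}

// Build 12 rounds (24 messages) and trim them to the latest 10 rounds.
const sampleHistory = [];
for (let round = 1; round <= 12; round++) {
  sampleHistory.push({ role: 'user', content: `instruction ${round}` });
  sampleHistory.push({ role: 'assistant', content: `response ${round}` });
}

const trimmedHistory = trimHistory(sampleHistory, 10);
console.log(trimmedHistory.length, trimmedHistory[0]);
```

The function returns a new array rather than mutating its input, which makes it easy to test separately from the handler.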

3. Express Server and Routes

Create server.js:

    import express from 'express';
    import cors from 'cors';
    import helmet from 'helmet';
    import rateLimit from 'express-rate-limit';
    import dotenv from 'dotenv';
    import AIHandler from './src/ai/handler.js';

    // Load environment variables
    dotenv.config();

    const app = express();
    const PORT = process.env.PORT || 3000;

    // Initialize the AI handler
    const aiHandler = new AIHandler(process.env.ANTHROPIC_API_KEY);

    // Middleware. Helmet's default Content-Security-Policy blocks the inline script and
    // the Tailwind CDN used by public/index.html, so it is switched off for this demo.
    app.use(helmet({ contentSecurityPolicy: false }));
    app.use(cors());
    app.use(express.json({ limit: '10mb' }));

    // Rate limiting
    const limiter = rateLimit({
      windowMs: parseInt(process.env.RATE_LIMIT_WINDOW, 10) || 15 * 60 * 1000,
      max: parseInt(process.env.RATE_LIMIT_MAX, 10) || 100,
      message: 'Too many requests, please try again later'
    });
    app.use(limiter);

    // Serve the frontend from public/
    app.use(express.static('public'));

    // Session store (in memory)
    const sessions = new Map();

    // API routes

    // Health check
    app.get('/health', (req, res) => {
      res.json({ status: 'ok', timestamp: new Date().toISOString() });
    });

    // Initialize a session
    app.post('/api/session/init', async (req, res) => {
      try {
        const sessionId = req.body.sessionId ||
          `session_${Date.now()}_${Math.random().toString(36).slice(2, 11)}`;
        const result = await aiHandler.initializeSession(sessionId);
        if (result.success) {
          sessions.set(sessionId, { createdAt: new Date(), lastActivity: new Date() });
          res.json({ success: true, sessionId, message: 'Session initialized' });
        } else {
          res.status(500).json({ success: false, error: result.error });
        }
      } catch (error) {
        console.error('Session init error:', error);
        res.status(500).json({ success: false, error: error.message });
      }
    });

    // Process a user instruction
    app.post('/api/instruction', async (req, res) => {
      try {
        const { sessionId, instruction, context = {} } = req.body;
        if (!sessionId || !instruction) {
          return res.status(400).json({
            success: false,
            error: 'Missing required parameters: sessionId and instruction'
          });
        }
        // Update the session's last activity time
        const session = sessions.get(sessionId);
        if (session) {
          session.lastActivity = new Date();
        }
        const result = await aiHandler.processInstruction(sessionId, instruction, context);
        res.json(result);
      } catch (error) {
        console.error('Instruction processing error:', error);
        res.status(500).json({ success: false, error: error.message });
      }
    });

    // Get the session history
    app.get('/api/session/:sessionId/history', (req, res) => {
      const { sessionId } = req.params;
      const history = aiHandler.getSessionHistory(sessionId);
      res.json({ success: true, sessionId, history });
    });

    // Clean up a session
    app.delete('/api/session/:sessionId', async (req, res) => {
      try {
        const { sessionId } = req.params;
        await aiHandler.cleanupSession(sessionId);
        sessions.delete(sessionId);
        res.json({ success: true, sessionId, message: 'Session cleaned up' });
      } catch (error) {
        console.error('Session cleanup error:', error);
        res.status(500).json({ success: false, error: error.message });
      }
    });

    // Periodically clean up expired sessions
    const SESSION_TIMEOUT = 30 * 60 * 1000; // 30 minutes
    setInterval(() => {
      const now = Date.now();
      for (const [sessionId, session] of sessions.entries()) {
        if (now - session.lastActivity.getTime() > SESSION_TIMEOUT) {
          console.log(`Cleaning up expired session: ${sessionId}`);
          aiHandler.cleanupSession(sessionId)
            .catch(error => console.error('Session cleanup error:', error));
          sessions.delete(sessionId);
        }
      }
    }, 5 * 60 * 1000); // check every 5 minutes

    // 404 handler
    app.use((req, res) => {
      res.status(404).json({ success: false, error: 'Endpoint not found' });
    });

    // Error-handling middleware (registered last)
    app.use((error, req, res, next) => {
      console.error('Unhandled error:', error);
      res.status(500).json({ success: false, error: 'Internal server error' });
    });

    // Start the server
    app.listen(PORT, () => {
      console.log(`AI Web Assistant server running on port ${PORT}`);
      console.log(`Environment: ${process.env.NODE_ENV}`);
    });

    export default app;
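The periodic cleanup in server.js compares each session's lastActivity with a 30-minute timeout. Pulling that comparison into a pure function makes the expiry rule testable without timers or a running browser. The following is a sketch only: `findExpiredSessions` is a hypothetical helper that is not in the original code, and it assumes the same `Map` of `{ lastActivity: Date }` entries the server uses.

```javascript
// Hypothetical helper: return the IDs of sessions idle for longer than `timeoutMs`.
// `now` is a millisecond timestamp so tests can supply a fixed clock.
function findExpiredSessions(sessions, now, timeoutMs) {
  const expired = [];
  for (const [sessionId, session] of sessions.entries()) {
    if (now - session.lastActivity.getTime() > timeoutMs) {
      expired.push(sessionId);
    }
  }
  return expired;
}

const THIRTY_MINUTES = 30 * 60 * 1000;
const demoNow = Date.parse('2025-01-01T12:00:00Z');
const demoSessions = new Map([
  ['fresh', { lastActivity: new Date(demoNow - 5 * 60 * 1000) }],
  ['stale', { lastActivity: new Date(demoNow - 45 * 60 * 1000) }]
]);

const expiredIds = findExpiredSessions(demoSessions, demoNow, THIRTY_MINUTES);
console.log(expiredIds); // [ 'stale' ]
```

Inside the setInterval callback, the server would then call cleanupSession and delete the map entry for each returned ID.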

IV. Frontend Implementation

Create public/index.html. This version uses plain HTML and JavaScript with the Tailwind CDN rather than a React build.

    <!DOCTYPE html>
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <meta name="viewport" content="width=device-width, initial-scale=1.0">
        <title>AI Web Automation Assistant</title>
        <script src="https://cdn.tailwindcss.com"></script>
        <style>
            .message-user { background-color: #3b82f6; color: white; }
            .message-assistant { background-color: #e5e7eb; color: #374151; }
            .typing-indicator { display: inline-block; }
            .typing-dot {
                display: inline-block;
                width: 8px; height: 8px;
                background-color: #9ca3af;
                border-radius: 50%;
                margin: 0 2px;
                animation: typing 1.4s infinite ease-in-out;
            }
            .typing-dot:nth-child(1) { animation-delay: -0.32s; }
            .typing-dot:nth-child(2) { animation-delay: -0.16s; }
            @keyframes typing {
                0%, 80%, 100% { transform: scale(0); }
                40% { transform: scale(1); }
            }
        </style>
    </head>
    <body class="bg-gray-100 min-h-screen">
        <div class="container mx-auto px-4 py-8 max-w-4xl">
            <!-- Header -->
            <header class="text-center mb-8">
                <h1 class="text-3xl font-bold text-gray-800 mb-2">AI Web Automation Assistant</h1>
                <p class="text-gray-600">Automate web operations with natural-language instructions</p>
            </header>

            <!-- Main panel -->
            <div class="bg-white rounded-lg shadow-lg overflow-hidden">
                <!-- Session controls -->
                <div class="bg-gray-800 text-white p-4 flex justify-between items-center">
                    <div>
                        <span id="sessionStatus" class="text-sm">Disconnected</span>
                    </div>
                    <div class="space-x-2">
                        <button id="initSession" class="bg-green-600 hover:bg-green-700 px-4 py-2 rounded text-sm">
                            Start new session
                        </button>
                        <button id="clearSession" class="bg-red-600 hover:bg-red-700 px-4 py-2 rounded text-sm" disabled>
                            End session
                        </button>
                    </div>
                </div>

                <!-- Chat area -->
                <div class="h-96 overflow-y-auto p-4 space-y-4" id="chatMessages">
                    <div class="text-center text-gray-500 py-8">
                        Send an instruction to start talking to the AI assistant
                    </div>
                </div>

                <!-- Input area -->
                <div class="border-t p-4">
                    <div class="flex space-x-2">
                        <input
                            type="text"
                            id="instructionInput"
                            placeholder="Enter an instruction, e.g. open Baidu and search for the latest AI developments..."
                            class="flex-1 border rounded-lg px-4 py-2 focus:outline-none focus:ring-2 focus:ring-blue-500"
                            disabled
                        >
                        <button
                            id="sendButton"
                            class="bg-blue-600 hover:bg-blue-700 text-white px-6 py-2 rounded-lg disabled:bg-gray-400 disabled:cursor-not-allowed"
                            disabled
                        >
                            Send
                        </button>
                    </div>
                    <div class="mt-2 text-sm text-gray-500">
                        <p>Example instructions:</p>
                        <div class="flex flex-wrap gap-2 mt-1">
                            <button class="example-instruction text-xs bg-gray-200 hover:bg-gray-300 px-2 py-1 rounded" data-instruction="Open the Baidu home page">Open Baidu</button>
                            <button class="example-instruction text-xs bg-gray-200 hover:bg-gray-300 px-2 py-1 rounded" data-instruction="Search for today's top news">Search news</button>
                            <button class="example-instruction text-xs bg-gray-200 hover:bg-gray-300 px-2 py-1 rounded" data-instruction="Extract all headings from the current page">Extract headings</button>
                        </div>
                    </div>
                </div>
            </div>

            <!-- Session info -->
            <div class="mt-4 bg-white rounded-lg shadow p-4">
                <h3 class="font-semibold mb-2">Session info</h3>
                <div class="text-sm space-y-1">
                    <div>Session ID: <span id="sessionIdDisplay" class="font-mono">-</span></div>
                    <div>Status: <span id="connectionStatus">Disconnected</span></div>
                    <div>Messages: <span id="messageCount">0</span></div>
                </div>
            </div>
        </div>

        <script>
            const TYPING_DOTS = '<div class="typing-indicator"><span class="typing-dot"></span><span class="typing-dot"></span><span class="typing-dot"></span></div>';

            class AIAssistant {
                constructor() {
                    this.sessionId = null;
                    this.isConnected = false;
                    this.messageCount = 0;
                    this.initializeElements();
                    this.attachEventListeners();
                }

                initializeElements() {
                    this.sessionStatus = document.getElementById('sessionStatus');
                    this.sessionIdDisplay = document.getElementById('sessionIdDisplay');
                    this.connectionStatus = document.getElementById('connectionStatus');
                    this.messageCountDisplay = document.getElementById('messageCount');
                    this.chatMessages = document.getElementById('chatMessages');
                    this.instructionInput = document.getElementById('instructionInput');
                    this.sendButton = document.getElementById('sendButton');
                    this.initSessionBtn = document.getElementById('initSession');
                    this.clearSessionBtn = document.getElementById('clearSession');
                }

                attachEventListeners() {
                    this.initSessionBtn.addEventListener('click', () => this.initializeSession());
                    this.clearSessionBtn.addEventListener('click', () => this.clearSession());
                    this.sendButton.addEventListener('click', () => this.sendInstruction());
                    this.instructionInput.addEventListener('keypress', (e) => {
                        if (e.key === 'Enter') this.sendInstruction();
                    });

                    // Clicking an example fills in and sends that instruction
                    document.querySelectorAll('.example-instruction').forEach(btn => {
                        btn.addEventListener('click', (e) => {
                            this.instructionInput.value = e.target.dataset.instruction;
                            this.sendInstruction();
                        });
                    });
                }

                async initializeSession() {
                    try {
                        this.showLoading('Initializing session...');
                        const response = await fetch('/api/session/init', {
                            method: 'POST',
                            headers: { 'Content-Type': 'application/json' },
                            body: JSON.stringify({})
                        });

                        const data = await response.json();

                        if (data.success) {
                            this.sessionId = data.sessionId;
                            this.isConnected = true;
                            this.messageCount = 0;
                            this.updateUI();
                            this.addMessage('system', 'Session initialized. You can start sending instructions.');
                        } else {
                            throw new Error(data.error);
                        }
                    } catch (error) {
                        this.addMessage('error', `Initialization failed: ${error.message}`);
                    } finally {
                        this.hideLoading();
                    }
                }

                async sendInstruction() {
                    const instruction = this.instructionInput.value.trim();
                    if (!instruction || !this.isConnected) return;

                    // Add the user's message
                    this.addMessage('user', instruction);
                    this.instructionInput.value = '';

                    // Show a typing indicator while waiting for the response
                    const thinkingMessage = this.addMessage('thinking', '');

                    try {
                        const response = await fetch('/api/instruction', {
                            method: 'POST',
                            headers: { 'Content-Type': 'application/json' },
                            body: JSON.stringify({
                                sessionId: this.sessionId,
                                instruction: instruction
                            })
                        });

                        const data = await response.json();

                        // Remove the typing indicator
                        this.removeTypingIndicator(thinkingMessage);

                        if (data.success) {
                            this.addMessage('assistant', data.response);
                        } else {
                            this.addMessage('error', `Operation failed: ${data.error}`);
                        }
                    } catch (error) {
                        this.removeTypingIndicator(thinkingMessage);
                        this.addMessage('error', `Network error: ${error.message}`);
                    }
                }

                async clearSession() {
                    if (!this.sessionId) return;

                    try {
                        await fetch(`/api/session/${this.sessionId}`, { method: 'DELETE' });
                    } catch (error) {
                        console.error('Failed to clean up session:', error);
                    }

                    this.sessionId = null;
                    this.isConnected = false;
                    this.messageCount = 0;
                    this.updateUI();
                    this.clearMessages();
                    this.addMessage('system', 'Session ended. Click "Start new session" to begin again.');
                }

                addMessage(role, content) {
                    this.messageCount++;
                    this.messageCountDisplay.textContent = this.messageCount;

                    const messageDiv = document.createElement('div');
                    messageDiv.className = `p-3 rounded-lg max-w-[75%] ${
                        role === 'user' ? 'message-user ml-auto' :
                        role === 'error' ? 'bg-red-100 text-red-800 border border-red-200' :
                        'message-assistant'
                    }`;

                    if (role === 'thinking') {
                        messageDiv.innerHTML = TYPING_DOTS;
                    } else {
                        messageDiv.textContent = content;
                    }

                    this.chatMessages.appendChild(messageDiv);
                    this.chatMessages.scrollTop = this.chatMessages.scrollHeight;

                    return messageDiv;
                }

                // Remove the placeholder bubble and take it out of the message count
                removeTypingIndicator(messageElement) {
                    if (!messageElement.isConnected) return;
                    messageElement.remove();
                    this.messageCount--;
                    this.messageCountDisplay.textContent = this.messageCount;
                }

                clearMessages() {
                    this.chatMessages.innerHTML = '<div class="text-center text-gray-500 py-8">Send an instruction to start talking to the AI assistant</div>';
                }

                showLoading(message) {
                    this.initSessionBtn.disabled = true;
                    this.initSessionBtn.textContent = message;
                }

                hideLoading() {
                    this.initSessionBtn.disabled = false;
                    this.initSessionBtn.textContent = 'Start new session';
                }

                updateUI() {
                    this.sessionStatus.textContent = this.isConnected ? 'Connected' : 'Disconnected';
                    this.sessionIdDisplay.textContent = this.sessionId || '-';
                    this.connectionStatus.textContent = this.isConnected ? 'Active' : 'Disconnected';
                    this.connectionStatus.className = this.isConnected ? 'text-green-600' : 'text-red-600';
                    this.messageCountDisplay.textContent = this.messageCount;
                    this.instructionInput.disabled = !this.isConnected;
                    this.sendButton.disabled = !this.isConnected;
                    this.clearSessionBtn.disabled = !this.isConnected;
                }
            }

            // Start the app
            document.addEventListener('DOMContentLoaded', () => {
                new AIAssistant();
            });
        </script>
    </body>
    </html>

V. Testing and Verification

1. Create a test script

Create tests/integration.test.js. The tests are written for Jest, which is what the test script in package.json runs. They launch a real browser and call the Claude API, so a valid ANTHROPIC_API_KEY is required.

    import { jest, describe, test, expect, beforeEach, afterEach } from '@jest/globals';
    import AIHandler from '../src/ai/handler.js';
    import dotenv from 'dotenv';

    dotenv.config();

    // Browser start-up and API calls take far longer than Jest's default 5 seconds
    jest.setTimeout(120000);

    describe('AI Web Assistant Integration Tests', () => {
      let aiHandler;
      let sessionId;

      beforeEach(async () => {
        aiHandler = new AIHandler(process.env.ANTHROPIC_API_KEY);
        const initResult = await aiHandler.initializeSession(`test_${Date.now()}`);
        sessionId = initResult.sessionId;
      });

      afterEach(async () => {
        await aiHandler.cleanupSession(sessionId);
      });

      test('should initialize session successfully', async () => {
        expect(sessionId).toBeDefined();
        expect(typeof sessionId).toBe('string');
      });

      test('should process simple navigation instruction', async () => {
        const result = await aiHandler.processInstruction(
          sessionId,
          'Please open the Baidu home page https://www.baidu.com'
        );
        expect(result.success).toBe(true);
        expect(result.response).toBeDefined();
      });

      test('should handle invalid instruction gracefully', async () => {
        const result = await aiHandler.processInstruction(
          sessionId,
          'Perform an operation that does not exist'
        );
        // Even a problematic instruction should produce a sensible response
        expect(result.response).toBeDefined();
      });
    });

2. Run the tests

npm test

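VI. Deployment and Running

1. Production configuration

Create ecosystem.config.cjs. The .cjs extension is needed because package.json sets "type": "module", while this file uses CommonJS module.exports.

    module.exports = {
      apps: [{
        name: 'ai-web-assistant',
        script: 'server.js',
        // Sessions and the browser live in process memory, so run a single
        // process; cluster mode would route requests to workers without the session.
        instances: 1,
        exec_mode: 'fork',
        env: {
          NODE_ENV: 'production',
          PORT: 3000
        },
        env_production: {
          NODE_ENV: 'production'
        }
      }]
    };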

2. Docker configuration

Create a Dockerfile:

    # The official Playwright image ships the browsers and their system dependencies.
    # Keep the tag in sync with the Playwright version pinned in package-lock.json.
    FROM mcr.microsoft.com/playwright:v1.40.0-jammy

    WORKDIR /app

    # Copy the package manifests and install production dependencies
    COPY package*.json ./
    RUN npm ci --omit=dev

    # Copy the source code
    COPY . .

    # Run as the non-root user provided by the image
    USER pwuser

    EXPOSE 3000

    CMD ["npm", "start"]

3. Start the application

    # Development mode
    npm run dev

    # Production mode
    npm start

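VII. Practical Scenarios

Scenario 1: Automated data collection

    // Instruction: collect trending GitHub projects
    const instruction = `
    Please visit the GitHub Trending page (https://github.com/trending),
    collect today's top 5 most popular JavaScript projects,
    including project name, star count, and description,
    and return them as JSON.
    `;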

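Scenario 2: Automated form filling

    // Instruction: register a test user
    const instruction = `
    Please open our test registration page http://localhost:3000/register
    and fill in the following information:
    - Username: testuser_${Date.now()}
    - Email: test${Date.now()}@example.com
    - Password: TestPassword123
    Then click the register button and confirm that registration succeeded.
    `;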
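For a form-filling instruction like this one, the model would most likely end up issuing a fill_form tool call, whose fields parameter maps selectors to values. The sketch below shows what such a payload could look like. The selectors `#username`, `#email`, and `#password` and the helper `buildRegistrationFields` are hypothetical; a real registration page defines its own field selectors, which the model has to discover from the page.

```javascript
// Hypothetical selectors: a real page will use its own IDs or names.
function buildRegistrationFields(timestamp) {
  return {
    '#username': `testuser_${timestamp}`,
    '#email': `test${timestamp}@example.com`,
    '#password': 'TestPassword123'
  };
}

// Shape of a tool call as it would be handed to executeTool(name, input).
const registrationToolCall = {
  name: 'fill_form',
  input: { fields: buildRegistrationFields(1700000000000) }
};

console.log(JSON.stringify(registrationToolCall, null, 2));
```

Using one timestamp for both the username and the email keeps the two values consistent, which makes the created test account easier to find afterwards.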

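Scenario 3: Complex workflow

    // Instruction: full e-commerce flow test
    const instruction = `
    Please carry out the following e-commerce shopping flow:
    1. Log in to the test e-commerce site
    2. Search for "laptop"
    3. Select the first product
    4. Add it to the cart
    5. Proceed to checkout
    6. Fill in the test shipping information
    7. Confirm the order
    Report the status after each step.
    `;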

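Summary

In this tutorial we built a working AI web automation assistant with the following characteristics:

  1. Complete architecture: from the frontend through the backend service down to the browser automation layer

  2. Flexible MCP protocol layer: supports a range of web operation tools

  3. AI integration: uses the Claude model to interpret natural-language instructions

  4. Robust error handling: tool failures are caught and returned with a suggestion instead of crashing the session

  5. Extensible design: new tools and features are easy to add

The project shows how AI models can be combined with browser automation to produce an assistant that understands and carries out complex web operations. You can extend it further, for example with visual recognition, multi-browser support, or distributed task processing, to build a more capable automation solution.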

