MCP Services - LLM Gateway MCP Server

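LLM Gateway is an MCP-native server that lets advanced AI agents (such as Claude 3.7) intelligently delegate tasks to cheaper models (such as Gemini Flash), substantially reducing API costs while maintaining output quality.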

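Key Benefits

Intelligent task delegation: advanced models can hand routine tasks to cheaper models
Cost optimization: reduces API costs by 70-90%
Provider abstraction: a unified interface avoids vendor lock-in
Large-scale document processing: processes and analyzes large documents efficiently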
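
As a rough illustration of how delegation leads to cost savings, the sketch below picks the cheapest model that meets a task's required capability tier and compares the estimated cost with always using the premium model. The model names, tier scale, and prices are hypothetical placeholders, not real quotes and not LLM Gateway's actual routing logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str
    tier: int                      # hypothetical scale: 1 = routine, 2 = complex reasoning
    usd_per_million_tokens: float  # placeholder price, not a real quote


# Hypothetical catalogue; names and prices are illustrative only.
CATALOGUE = [
    ModelOption("premium-model", tier=2, usd_per_million_tokens=10.0),
    ModelOption("budget-model", tier=1, usd_per_million_tokens=1.0),
]


def pick_model(required_tier: int) -> ModelOption:
    """Return the cheapest model whose tier is at least the required tier."""
    candidates = [m for m in CATALOGUE if m.tier >= required_tier]
    if not candidates:
        raise ValueError(f"no model satisfies tier {required_tier}")
    return min(candidates, key=lambda m: m.usd_per_million_tokens)


def estimated_cost(model: ModelOption, tokens: int) -> float:
    """Estimate the cost in USD of processing `tokens` tokens."""
    return model.usd_per_million_tokens * tokens / 1_000_000


def savings_fraction(tokens: int) -> float:
    """Fraction saved by delegating a routine task instead of using the premium model."""
    premium = estimated_cost(pick_model(2), tokens)
    delegated = estimated_cost(pick_model(1), tokens)
    return 1 - delegated / premium


if __name__ == "__main__":
    print(pick_model(1).name)
    print(f"{savings_fraction(500_000):.0%}")
```

With a tenfold price gap like the placeholder one above, a delegated routine task costs one tenth of the premium run; actual savings depend on real prices and on how much of the workload can be delegated.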

Installation

pip install llm-gateway-mcp

Basic Usage

Start the server

llm-gateway-server --port 8000

Connect with an MCP client

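import asyncio
# Import path as given in the project's example; it may differ across MCP SDK versions.
from mcp.client import Client

async def main():
    # Connect to LLM Gateway
    client = Client("http://localhost:8000")

    try:
        # Call a tool exposed by the server
        response = await client.tools.summarize_document(
            document="Long document to summarize...",
            provider="gemini",
            model="gemini-2.0-flash-lite",
            format="paragraph"
        )

        print(f"Summary: {response['summary']}")
        print(f"Cost: ${response['cost']:.6f}")
    finally:
        # Close the connection even if the tool call fails
        await client.close()

if __name__ == "__main__":
    asyncio.run(main())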

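Delegation Workflow Example

# Takes a connected client (created as in the previous example) and the document text.
# Wrapped in a coroutine so that the awaits are valid.
async def process_large_document(client, large_document):
    # 1. Split the document into chunks
    chunks_response = await client.tools.chunk_document(
        document=large_document,
        chunk_size=1000,
        method="semantic"
    )

    # 2. Delegate summarization of each chunk to a cheaper model
    summaries = []
    for chunk in chunks_response["chunks"]:
        summary = await client.tools.summarize_document(
            document=chunk,
            provider="gemini",
            model="gemini-2.0-flash-lite"
        )
        summaries.append(summary["summary"])

    # 3. Extract entities from the full document
    entities = await client.tools.extract_entities(
        document=large_document,
        entity_types=["person", "organization", "date"]
    )

    return summaries, entities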
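
The chunk-then-summarize pattern in the workflow above can be tried without a running server. The following is a minimal local sketch: `chunk_by_words` is a simple fixed-size word splitter (not the server's "semantic" method), and `first_sentence` is a stand-in for a model call. All three function names are hypothetical and exist only for this illustration.

```python
from typing import Callable, List


def chunk_by_words(text: str, chunk_size: int) -> List[str]:
    """Split text into chunks of at most `chunk_size` whitespace-separated words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def first_sentence(chunk: str) -> str:
    """Stand-in for a model call: keep only the first sentence of the chunk."""
    head, sep, _ = chunk.partition(". ")
    return head + "." if sep else chunk


def summarize_in_chunks(text: str, chunk_size: int, summarize: Callable[[str], str]) -> str:
    """Summarize each chunk independently, then join the partial summaries."""
    partials = [summarize(chunk) for chunk in chunk_by_words(text, chunk_size)]
    return " ".join(partials)


if __name__ == "__main__":
    sample = "Alpha is first. It has details. Beta is second. It has more details."
    print(summarize_in_chunks(sample, chunk_size=6, summarize=first_sentence))
```

Passing the summarizer in as a callable keeps the chunking logic independent of any provider, so the stand-in could later be swapped for a real delegated call.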

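Key Features

Intelligent task routing: automatically selects the most suitable model
Advanced caching: reduces repeated API calls
Document tools: intelligent chunking, summarization, and entity extraction
Structured data extraction: JSON, table, and key-value pair extraction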
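
To make the caching idea concrete, here is a minimal sketch of a response cache keyed by a hash of the provider, model, and prompt, so that an identical request is answered from memory instead of triggering a second API call. The `ResponseCache` class is hypothetical and is not LLM Gateway's actual cache implementation.

```python
import hashlib
import json
from typing import Callable, Dict


class ResponseCache:
    """In-memory cache for model responses (illustrative only)."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(provider: str, model: str, prompt: str) -> str:
        # Serialize the request fields and hash them to get a fixed-length key.
        payload = json.dumps([provider, model, prompt], ensure_ascii=False)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_or_call(self, provider: str, model: str, prompt: str,
                    call: Callable[[str], str]) -> str:
        """Return the cached response, or invoke `call` once and store its result."""
        key = self._key(provider, model, prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = call(prompt)
        self._store[key] = result
        return result


if __name__ == "__main__":
    cache = ResponseCache()
    fake_model = lambda prompt: prompt.upper()  # stand-in for a real API call
    cache.get_or_call("gemini", "example-model", "hello", fake_model)
    cache.get_or_call("gemini", "example-model", "hello", fake_model)
    print(cache.hits, cache.misses)
```

Including the provider and model in the key prevents a response from one model being returned for a request aimed at another.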

Supported Providers

OpenAI (GPT models)
Anthropic (Claude models)
Google (Gemini models)
DeepSeek

Configuration

Set your API keys in config.yaml:

providers:
  anthropic:
    api_key: "your_anthropic_api_key"
  openai:
    api_key: "your_openai_api_key"
  gemini:
    api_key: "your_gemini_api_key"
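
Before starting the server it can help to check which providers actually have usable keys. The sketch below works on a dictionary with the same shape as the config above (for example, the result of parsing config.yaml with a YAML library) and treats empty values or the `your_...` placeholders as missing. The fallback to environment variables named `<PROVIDER>_API_KEY` is an assumption made for this example, not documented LLM Gateway behavior, and `resolve_api_keys` is a hypothetical helper.

```python
import os
from typing import Dict, Mapping, Optional


def resolve_api_keys(config: Mapping, env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a provider -> API key mapping, skipping providers with no usable key."""
    env = os.environ if env is None else env
    resolved: Dict[str, str] = {}
    for provider, settings in config.get("providers", {}).items():
        key = (settings or {}).get("api_key", "")
        # Treat empty values and template placeholders as "not configured".
        if not key or key.startswith("your_"):
            # Assumed naming convention for the fallback, e.g. GEMINI_API_KEY.
            key = env.get(f"{provider.upper()}_API_KEY", "")
        if key:
            resolved[provider] = key
    return resolved


if __name__ == "__main__":
    example = {
        "providers": {
            "anthropic": {"api_key": "your_anthropic_api_key"},
            "openai": {"api_key": "your_openai_api_key"},
            "gemini": {"api_key": "your_gemini_api_key"},
        }
    }
    configured = resolve_api_keys(example)
    print("Configured providers:", sorted(configured))
```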

For more information, visit the project's GitHub page.