# fast-agent-api

**Repository Path**: wangboa/fast-agent-api

## Basic Information

- **Project Name**: fast-agent-api
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-03-17
- **Last Updated**: 2026-03-20

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# Fast Agent API

A FastAPI-based AI agent inference service with LangChain/LangGraph support, designed for real-time conversational AI applications.

## Project Background

This project is an AI dialogue microservice for teaching platforms. It provides AI-powered conversation capabilities via SSE streaming or HTTP callbacks and suits a range of educational scenarios (e.g., "Story King" and "Public Speaking" courses).

## Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                        Teaching Platform                        │
│                         (main-service)                          │
│                                                                 │
│  ┌─────────────┐ ┌─────────────┐ ┌─────────────────────────┐    │
│  │   Student   │ │   Course    │ │      Chat Training      │    │
│  │ Management  │ │ Management  │ │  (Turbo + Stimulus +    │    │
│  │             │ │             │ │      EventSource)       │    │
│  └─────────────┘ └─────────────┘ └───────────┬─────────────┘    │
│                                              │                  │
│              ┌───────────────────────────────┘                  │
│              │ HTTP POST (SSE or Callback)                      │
│              ▼                                                  │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │                      fast-agent-api                       │  │
│  │                      (This Project)                       │  │
│  │                                                           │  │
│  │  ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐  │  │
│  │  │   FastAPI   │ │    Agent    │ │    Chat Storage     │  │  │
│  │  │ (Web Layer) │─│   Manager   │ │  (Session History)  │  │  │
│  │  └─────────────┘ └──────┬──────┘ └─────────────────────┘  │  │
│  │                         │                                 │  │
│  │                   ┌─────▼──────┐                          │  │
│  │                   │ LLM Layer  │                          │  │
│  │                   │(DeepAgents)│                          │  │
│  │                   └─────┬──────┘                          │  │
│  │                   ┌─────┴─────┐                           │  │
│  │                   │ Providers │                           │  │
│  │                   │ Anthropic │                           │  │
│  │                   │  OpenAI   │                           │  │
│  │                   │  MiniMax  │                           │  │
│  │                   └───────────┘                           │  │
│  └───────────────────────────────────────────────────────────┘  │
│                         │                                       │
│                         │ HTTP Callback                         │
│                         ▼                                       │
└─────────────────────────┬───────────────────────────────────────┘
                          │
             ┌────────────┴────────────┐
             │    Real-time Update     │
             │   (Turbo Stream/SSE)    │
             └─────────────────────────┘
```

## Project Structure

```
fast-agent-api/
├── src/fast_agent_api/
│   ├── main.py                  # FastAPI application entry
│   ├── config.py                # Configuration management
│   ├── agent/
│   │   ├── config.py            # Agent & Session config models
│   │   ├── factory.py           # Agent runner factory
│   │   ├── manager.py           # Agent lifecycle management
│   │   ├── tools.py             # Tool registry (name -> function mapping)
│   │   └── runners/
│   │       └── deepagent_runner.py  # DeepAgent implementation
│   ├── api/
│   │   ├── deps.py              # Dependency injection
│   │   └── routes/
│   │       ├── agent.py         # Agent chat endpoints
│   │       └── health.py        # Health check endpoint
│   ├── backend/
│   │   ├── schema.py            # Data models
│   │   ├── client.py            # Backend client
│   │   └── storage.py           # Chat storage
│   ├── llm/
│   │   └── providers/           # LLM providers (Anthropic, OpenAI, MiniMax)
│   ├── middleware/
│   │   ├── logging.py           # Request/response logging
│   │   └── monitoring.py        # Metrics & monitoring
│   ├── services/
│   │   └── callback.py          # HTTP callback service
│   └── logs/
│       └── __init__.py          # Structured logging config
├── tests/                       # Test suite
├── docs/                        # Documentation
│   ├── agent_api.md             # API documentation (Chinese)
│   └── specs/                   # Design specifications
├── pyproject.toml               # Project metadata & dependencies
└── .env.example                 # Environment variables template
```

## Key Features

- **FastAPI** - High-performance async web framework
- **DeepAgents** - Advanced AI agent framework with ReAct, tool system, and skill support
- **LangChain/LangGraph** - Underlying agent orchestration (via DeepAgents)
- **Multi-LLM Support** - Anthropic (Claude), OpenAI, MiniMax
- **SSE Streaming** - Real-time response delivery via Server-Sent Events
- **HTTP Callback** - Async chunk-by-chunk push to external URLs
- **Agent Configuration** - Dynamic system prompts, skills, and tools per session
- **Tool System** - Register and use Python functions as agent tools
- **Skill System** - Load custom skills for specialized agent behavior
- **Chat Storage** - In-memory session history (extensible to SQLite/PostgreSQL)
- **Structured Logging** - Development-friendly console output with structlog
- **Metrics** - Prometheus-compatible metrics endpoint
- **CORS Support** - Configurable cross-origin requests

## Communication Modes

### Mode 1: SSE Stream (Primary)

The frontend connects to the Rails SSE endpoint, which proxies to fast-agent-api:

```
Frontend (EventSource) → Rails SSE → fast-agent-api (/chat/stream)
```

### Mode 2: HTTP Callback (Alternative)

fast-agent-api calls back to Rails after processing each chunk:

```
Frontend → Rails → fast-agent-api → Rails Callback → Turbo Stream
```

## Teaching Course Architecture

This section describes how fast-agent-api integrates with the main teaching platform (Rails) to deliver structured educational conversations such as "Story King" courses.

### Architecture Overview

```
┌─────────────────────────────────────────────────────────────────────────┐
│                        Teaching Platform (Rails)                        │
│                                                                         │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────────────┐  │
│  │  Course Config  │  │ Student Progress│  │         Chat UI         │  │
│  │   (YAML/JSON)   │  │   (Database)    │  │   (Turbo + Stimulus)    │  │
│  └────────┬────────┘  └────────┬────────┘  └───────────┬─────────────┘  │
│           │                    │                       │                │
│           │   ┌────────────────┴───────────────────────┘                │
│           ▼   ▼                                                         │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                        Course Logic Layer                         │  │
│  │  • Stage management (intro → framework → practice → feedback)     │  │
│  │  • Generate system_prompt based on current stage                  │  │
│  │  • Store practice data and conversation history                   │  │
│  └───────────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────┘
                    │
                    │ HTTP POST (SSE)
                    │ system_prompt: "You are a story teacher..."
                    ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                             fast-agent-api                              │
│                                                                         │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                     Agent Capabilities (Core)                     │  │
│  │                                                                   │  │
│  │  • Receive message + system_prompt                                │  │
│  │  • Stream response via SSE                                        │  │
│  │  • Store conversation history                                     │  │
│  │  • Support session recovery                                       │  │
│  │                                                                   │  │
│  │  Note: fast-agent-api does NOT manage course stages,              │  │
│  │  prompts, or teaching logic. These are handled by Rails.          │  │
│  └───────────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────┘
```

### Responsibility Separation

| Responsibility | Rails (Main Service) | fast-agent-api |
|---------------|---------------------|----------------|
| Course configuration | ✅ YAML/JSON files | ❌ |
| Stage management | ✅ intro → framework → practice → feedback | ❌ |
| AI system prompt generation | ✅ Based on stage/theme | ❌ |
| Practice data storage | ✅ Database | ❌ |
| Conversation history | Partial (for recovery) | ✅ Full history |
| Streaming response | ✅ Proxy | ✅ Core capability |
| Session recovery | ✅ | ✅ |
| LLM inference | ❌ | ✅ |

### Workflow Example: Story King Course

**Step 1: Student selects a theme**

- Rails renders the theme selection UI
- The student selects "我最尴尬的一件事" ("My most embarrassing moment", ST-01)

**Step 2: Rails starts a learning session**

- Rails creates a `CourseProgress` record in the database
- Rails initializes a session_id for fast-agent-api

**Step 3: Stage 1 - Introduction**

- Rails generates the system_prompt for the "intro" stage
- Rails sends the first message to fast-agent-api
- The AI introduces the topic and asks engaging questions

**Step 4: Stage 2 - Framework**

- Rails switches to the "framework" stage
- Rails generates a new system_prompt explaining story structure
- The AI explains "起承转合" (Beginning, Development, Twist, Ending)

**Step 5: Stage 3 - Practice**

- Rails switches to the "practice" stage
- Rails generates a system_prompt with practice questions
- The AI guides the student through the story step by step

**Step 6: Stage 4 - Feedback**

- Rails switches to the "feedback" stage
- Rails generates a feedback system_prompt
- The AI provides constructive feedback and encouragement

### API Integration

Rails calls fast-agent-api with stage-specific prompts:

```json
{
  "message": "我想讲故事",
  "session_id": "session-001",
  "system_prompt": "你是故事大王老师,用亲切有趣的语言介绍主题...",
  "metadata": {
    "stage": "intro",
    "theme_id": "ST-01",
    "course_key": "story_master"
  }
}
```

The `system_prompt` is generated by Rails based on:

- The current stage (intro/framework/practice/feedback)
- The theme configuration
- The student's grade level

### Session Recovery

When a student returns to continue learning:

1. Rails fetches the conversation history from fast-agent-api:

   ```
   GET /api/v1/agent/sessions/{session_id}
   ```

2. Rails restores the UI with the previous conversation
3. Rails determines the current stage from the `CourseProgress` record
4. Rails continues with the appropriate system_prompt

## Quick Start

### Prerequisites

- Python 3.11+
- Poetry

### Local Development

```bash
# Install dependencies
poetry install

# Copy environment variables
cp .env.example .env

# Start the service
poetry run uvicorn fast_agent_api.main:app --reload

# Or run directly
poetry run python -m fast_agent_api.main
```

### Docker

```bash
# Build and run
docker build -t fast-agent-api:latest .
docker run -p 8000:8000 --env-file .env fast-agent-api:latest

# Or use docker-compose
docker-compose up -d
```

## Configuration

All configuration is managed via environment variables. Copy `.env.example` to `.env` and customize:

```bash
# Application
APP_ENV=development
APP_HOST=0.0.0.0
APP_PORT=8000
APP_DEBUG=true
APP_LOG_LEVEL=INFO

# LLM Providers (at least one required)
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
MINIMAX_API_KEY=...

# Default LLM provider: anthropic, openai, or minimax
DEFAULT_LLM_PROVIDER=anthropic
OPENAI_MODEL=gpt-4o
OPENAI_BASE_URL=https://api.openai.com/v1

# Backend Storage
BACKEND_TYPE=memory          # memory, sqlite, or postgresql
BACKEND_DB_PATH=./data/backend.db

# CORS
CORS_ORIGINS=http://localhost:3000,http://localhost:8080

# Monitoring
METRICS_ENABLED=true
METRICS_PORT=9090
```

## API Endpoints

### Health Check

```bash
curl http://localhost:8000/health
```

### Agent Chat (Blocking)

```bash
curl -X POST http://localhost:8000/api/v1/agent/chat \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Hello",
    "session_id": "test-001",
    "system_prompt": "You are a helpful assistant."
  }'
```

### Stream Chat (SSE)

```bash
curl -X POST http://localhost:8000/api/v1/agent/chat/stream \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Hello",
    "session_id": "test-001"
  }'
```

**Response Format (SSE):**

```
data: chunk1
data: chunk2
data: [DONE]
```

### Stream Chat with Callback

```bash
curl -X POST http://localhost:8000/api/v1/agent/chat/stream \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Hello",
    "session_id": "test-001",
    "callback_url": "http://localhost:3000/chat/callback/test-001/message"
  }'
```

This sends each streaming chunk to the callback URL.
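To experiment with the callback mode locally, a throwaway receiver can stand in for the Rails callback endpoint. The sketch below is hypothetical and not part of fast-agent-api; it assumes each chunk is posted as JSON with the `session_id`, `role`, `content`, and `metadata` fields shown in the Callback Payload example later in this README.

```python
# Hypothetical local callback receiver for manual testing -- NOT project code.
# Assumes each streamed chunk arrives as a JSON POST body with the fields
# shown in the "Callback Payload" example (session_id, role, content, metadata).
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Accumulate streamed chunks per session so the full reply can be inspected.
transcripts: dict[str, str] = {}


def handle_chunk(payload: dict) -> str:
    """Append one streamed chunk to its session transcript and return it."""
    session_id = payload.get("session_id", "unknown")
    transcripts[session_id] = transcripts.get(session_id, "") + payload.get("content", "")
    return transcripts[session_id]


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON body posted by fast-agent-api and record the chunk.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        handle_chunk(payload)
        self.send_response(200)
        self.end_headers()


if __name__ == "__main__":
    # Matches the callback_url host/port used in the curl example above.
    HTTPServer(("0.0.0.0", 3000), CallbackHandler).serve_forever()
```

Running this alongside the curl example lets you watch the assistant's reply assemble chunk by chunk without a Rails instance.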
### Get Session

```bash
curl http://localhost:8000/api/v1/agent/sessions/test-001
```

### Delete Session

```bash
curl -X DELETE http://localhost:8000/api/v1/agent/sessions/test-001
```

### List Sessions

```bash
curl http://localhost:8000/api/v1/agent/sessions
```

### Metrics

```bash
curl http://localhost:8000/metrics
```

## Request/Response Examples

### Chat Request

```json
{
  "message": "我想听故事",
  "session_id": "session-001",
  "system_prompt": "你是一个面向小学生的故事大王老师...",
  "callback_url": "http://localhost:3000/chat/callback/session-001/message",
  "skills": ["storytelling"],
  "tools": ["search"],
  "metadata": {
    "course_id": "story-king-101"
  }
}
```

### Chat Response (Blocking)

```json
{
  "session_id": "session-001",
  "message": "好的,让我给你讲一个有趣的故事..."
}
```

### SSE Stream Format

```
data: {"content": "好的", "done": false}
data: {"content": "让我给你", "done": false}
data: {"content": "讲故事...", "done": true}
data: [DONE]
```

### Callback Payload

Each streaming chunk triggers a callback to `callback_url`:

```json
{
  "session_id": "session-001",
  "role": "assistant",
  "content": "chunk content",
  "metadata": {}
}
```

## Testing

```bash
# Run all tests
poetry run pytest

# Run with coverage
poetry run pytest --cov

# Run specific test file
poetry run pytest tests/agent/test_manager.py
```

## License

MIT
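As a closing example, the `data:`-prefixed SSE formats shown in the examples above can be parsed with a few lines of standard-library Python. This is an illustrative sketch, not project code; it assumes each event arrives as a single `data: ...` line, payloads are either plain text or the JSON chunk format, and the stream ends with `data: [DONE]`.

```python
# Illustrative client-side parser for the SSE stream formats documented above.
# Assumption: one "data: ..." line per event, terminated by "data: [DONE]".
import json
from typing import Iterable, Iterator


def iter_sse_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield content chunks from an SSE stream until the [DONE] sentinel."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank separators, comments, keep-alives
        data = line[len("data: "):]
        if data == "[DONE]":
            return  # end of stream
        try:
            # JSON chunk format: {"content": "...", "done": false}
            yield json.loads(data).get("content", "")
        except json.JSONDecodeError:
            # Plain-text chunk format: data: chunk1
            yield data


if __name__ == "__main__":
    # With a live server, wrap the HTTP response's line iterator instead,
    # e.g. iter_sse_content(resp.iter_lines()) using httpx or requests.
    sample = ['data: {"content": "好的", "done": false}', "data: [DONE]"]
    print("".join(iter_sse_content(sample)))
```

The same helper handles both the plain-chunk format from the Stream Chat section and the JSON chunk format from the SSE Stream Format example.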