# LLMTEST

**Repository Path**: zcakin/llmtest

## Basic Information

- **Project Name**: LLMTEST
- **Description**: 本仓库用于实现对大模型的快速测试，对接指定服务器的vllm或ollama部署的模型，可支持vllm lora方式部署
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 1
- **Forks**: 0
- **Created**: 2026-03-26
- **Last Updated**: 2026-07-23

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# LLM Test Project

This project is a minimal test platform for validating fine-tuned models exposed through HTTP APIs.

Current capabilities:

- vLLM OpenAI-compatible endpoint support
- Ollama `/api/chat` endpoint support
- Web UI for one-turn question answering
- Python test scripts for direct model calls
- Config-file based connection management
- Sample evaluation cases in `test_cases/basic_capability_cases.json`

## 1. Install

```bash
pip install -r requirements.txt
```

## 2. Model config

The project reads model settings from:

```text
config/model_config.json
```

Current default example:

```json
{
  "provider": "vllm",
  "base_url": "http://211.87.232.207:8001",
  "endpoint_path": "/v1/completions",
  "model_name": "trajad",
  "api_key": "",
  "timeout": 60,
  "threshold": 0.7,
  "temperature": 0.7,
  "top_p": 0.9,
  "max_tokens": 200,
  "repetition_penalty": 1.2,
  "frequency_penalty": 0.5,
  "presence_penalty": 0.3,
  "stop_sequences": []
}
```

For completion-style customer-service models, the default decoding setup now favors shorter and less repetitive answers. If a LoRA still loops on stock closings, add task-specific `stop_sequences` in the UI or `config/model_config.json`.

## 3. Start web service

```bash
python app.py
```

Then open:

```text
http://127.0.0.1:5000
```

## 4. One-turn Python test

Use the config file directly:

```bash
python scripts/chat_once.py --prompt "Hello"
```

## 5. Ollama standalone test script

For the Ollama model:

```text
qwen2.5-7B_zhongjian_int4:latest
```

Python script:

```bash
python scripts/ollama_chat_test.py "???????????" --base-url http://127.0.0.1:11434
```

Linux wrapper script:

```bash
chmod +x scripts/run_ollama_chat_test.sh
./scripts/run_ollama_chat_test.sh "???????????"
```

Specify remote Ollama server:

```bash
./scripts/run_ollama_chat_test.sh "???????????" http://your-server-ip:11434
```

Or use environment variable:

```bash
OLLAMA_BASE_URL=http://your-server-ip:11434 ./scripts/run_ollama_chat_test.sh "??"
```

## 6. Future extensions

- Standard test-set management
- Batch evaluation and scoring
- Multi-turn dialogue replay
- Stability comparison tests
- CSV / JSON report export

## 7. Structure

```text
.
??? app.py
??? config/
?   ??? model_config.json
??? model_client.py
??? requirements.txt
??? scripts/
?   ??? chat_once.py
?   ??? ollama_chat_test.py
?   ??? run_ollama_chat_test.sh
??? static/
?   ??? style.css
??? templates/
?   ??? index.html
??? test_cases/
    ??? basic_capability_cases.json
```