# PaperBrain **Repository Path**: Tierra_avida/PaperBrain ## Basic Information - **Project Name**: PaperBrain - **Description**: No description available - **Primary Language**: Unknown - **License**: Not specified - **Default Branch**: main - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2026-03-22 - **Last Updated**: 2026-03-22 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # PaperBrain 🤖📄 - Automated Research Intelligence Hub ![PaperBrain Architecture](paperbrain_arch.png) [中文说明](#paperbrain---自动化科研情报中枢) | [English](#english) --- ## 🇬🇧 English **PaperBrain** is a fully automated research intelligence pipeline that turns daily paper streams into actionable research notes. In one run, it completes a full loop: **Fetch → Screen → Deep Analyze → Link to Your Knowledge Base → Notify/Podcast**. * **It Thinks**: Uses strong reasoning models for engineering-level analysis (method, math, evidence, limitations). * **It Remembers**: Uses **Context-Aware RAG** to compare new papers with your existing notes. * **It Speaks**: Generates a **~5-minute Deep Dive Podcast by default** (duration configurable via CLI). * **It Organizes**: Writes structured Markdown notes, metadata, and graph-ready links into **Obsidian**. ### 🚀 Core Features * **Intelligent Screening**: Fast first-pass scoring on large daily paper sets. * **Deep Analysis**: Detailed report for high-value papers (problem gap, method, formulas, evidence, critique). * **Context-Aware RAG**: Differential analysis against your prior notes. * **Strict Link Policy**: `[[Note]]` links are used **only** when a detailed note file exists; otherwise, it uses an ArXiv web link. * **Visual Extraction**: Automatic architecture figure extraction from PDFs. * **AI Podcast + Mobile Push**: Audio briefing generation and Bark notifications. ### 🏁 Getting Started (Zero to Hero) #### Prerequisites * **OS**: Windows / macOS / Linux * **Python**: 3.10+ (Conda recommended) * **Obsidian**: For the best knowledge base experience. #### Step 1: Installation Clone the repository and set up the environment: ```bash git clone https://github.com/YourUsername/PaperBrain.git cd PaperBrain/script # Create Conda environment conda create -n wd python=3.10 conda activate wd # Install dependencies pip install -r requirements.txt ``` #### Step 2: API Configuration PaperBrain uses **OpenRouter** (recommended) or Doubao for LLM inference. 1. Copy the example env file: ```bash cp .env.example .env ``` 2. Edit `.env` and fill in your keys: ```ini # Recommended: Use OpenRouter for access to Claude 3.7 / Gemini Pro OPENROUTER_API_KEY=sk-or-v1-your-key-here # Optional: For Bark Mobile Notifications (iOS) BARK_URL=https://api.day.app/your-token/ ``` #### Step 3: Customize Your Research Interests Edit `config.yaml` to define what you care about: ```yaml search: keywords: - "World Model" - "Embodied AI" - "Humanoid Robot" arxiv_categories: - "cs.RO" - "cs.AI" ``` #### Step 4: Run It! **Option A: Run Immediately (Manual Mode)** ```bash # Default: Generates Podcast (~5 minutes) python main.py --run-now --provider openrouter # Skip Podcast (faster) python main.py --run-now --provider openrouter --no-podcast # Custom podcast duration (e.g., 10 minutes) python main.py --run-now --provider openrouter --podcast-minutes 10 ``` **Option B: Generate Podcast for Existing Note** ```bash # Great for revisiting old papers! python generate_podcast.py "Solaris.md" --provider openrouter # Custom duration for single-note podcast python generate_podcast.py "Solaris.md" --provider openrouter --minutes 12 ``` **Option C: Auto-Schedule (Windows)** Use **Task Scheduler** to run `script/run_daily.bat` every morning at 8:00 AM. * *Action*: Start a program * *Program*: `D:\PaperBrain\script\run_daily.bat` * *Start in*: `D:\PaperBrain\script\` (Crucial!) ### 🔒 Privacy & Security This tool processes data locally and interacts with third-party AI services. * **API Keys**: Stored in `.env` (git-ignored). Never shared. * **Data Transmission**: Paper content is sent to ArXiv (search), Hugging Face (metadata), OpenAI/Anthropic (analysis), and Microsoft (TTS). * **See [SECURITY.md](SECURITY.md)** for full details. --- ## 🇨🇳 PaperBrain - 自动化科研情报中枢 **PaperBrain** 不只是“论文抓取器”,而是一个完整的**自动化科研工作流引擎**。 一次运行会自动完成:**抓取 → 筛选 → 深度分析 → 知识库关联 → 推送/播客**。 它每天自动从 ArXiv 和 Hugging Face 抓取最新论文,利用 **快慢思考 (System 1 & 2)** 架构进行筛选和深度解读,并将结构化的知识(公式、图谱、代码审计)同步到您的 **Obsidian** 知识库中。 ### ✨ 核心亮点(清晰版) 1. **智能筛选 + 深度分析双层架构**: * 第一层:快速筛选大批论文,给出相关性评分。 * 第二层:只对高价值论文生成深度报告,重点输出方法、公式、实验和局限。 2. **上下文感知 RAG 差异化分析**: * 新论文分析前,先检索您已有笔记。 * 报告会明确写出“与既有工作相比,哪里变好、哪里仍不足”。 3. **严格链接策略(避免误跳转)**: * 只有真正生成了深度笔记的论文,才使用 `[[内联链接]]`。 * 未生成深度笔记的论文,一律使用 ArXiv 网页链接。 4. **AI 播客与移动推送**: * 默认生成约 5 分钟英文深度播客(可通过命令行参数调整时长); * 通过 Bark 推送摘要和播客就绪提醒。 ### 🏁 快速上手指南 #### 准备工作 * 推荐安装 **Anaconda** 或 Miniconda。 * 推荐使用 **Obsidian** 作为知识库前端。 #### 第一步:安装项目 ```bash # 1. 克隆代码 git clone https://github.com/YourUsername/PaperBrain.git cd PaperBrain/script # 2. 创建环境 conda create -n wd python=3.10 conda activate wd # 3. 安装依赖 (包含 PyTorch, Edge-TTS 等) pip install -r requirements.txt ``` #### 第二步:配置密钥 为了保护隐私,我们使用 `.env` 文件存储密钥。 1. 复制模板: ```bash cp .env.example .env ``` 2. 编辑 `.env` 文件(使用记事本或 VS Code): ```ini # 推荐使用 OpenRouter (可访问 Claude 3.7, Gemini Pro 等 SOTA 模型) OPENROUTER_API_KEY=sk-or-v1-your-key-here # 选填:Bark 推送链接 (iOS App Store 下载 Bark) BARK_URL=https://api.day.app/your-token/ ``` #### 第三步:定义研究方向 打开 `config.yaml`,修改您的关注领域: ```yaml search: keywords: - "World Model" - "Embodied AI" - "Humanoid Robot" arxiv_categories: - "cs.RO" # 机器人 - "cs.CV" # 计算机视觉 ``` #### 第四步:运行! **方式一:立即运行 (手动)** ```bash # 默认模式:包含播客生成(默认约 5 分钟) python main.py --run-now --provider openrouter # 纯文本模式:不生成播客 (速度更快) python main.py --run-now --provider openrouter --no-podcast # 自定义播客时长(示例:10分钟) python main.py --run-now --provider openrouter --podcast-minutes 10 ``` **方式二:为特定笔记生成播客** 想听听以前读过的某篇论文? ```bash python generate_podcast.py "Solaris.md" --provider openrouter # 自定义该篇播客时长(示例:12分钟) python generate_podcast.py "Solaris.md" --provider openrouter --minutes 12 ``` **方式三:每日自动运行 (Windows)** 使用 Windows 自带的 **任务计划程序 (Task Scheduler)**: 1. 创建基本任务 -> 每天上午 8:00。 2. 操作:启动程序 -> 选择 `D:\PaperBrain\script\run_daily.bat`。 3. **关键点**:在“起始于 (Start in)”一栏中,**必须**填入脚本所在目录 `D:\PaperBrain\script\`,否则会报错。 ### 🔒 隐私与安全 本项目注重您的数据隐私与密钥安全。 * **API 密钥**:仅存储在本地 `.env` 文件中(默认被 git 忽略),绝不上传。 * **数据传输**:论文内容仅发送至 ArXiv (搜索)、Hugging Face (元数据)、OpenAI/Anthropic (分析) 及 Microsoft (语音合成) 用于生成报告。 * **详见 [SECURITY.md](SECURITY.md)** 了解完整安全策略。 --- *Powered by PaperBrain Team & LLMs*