# mgrep

**Repository Path**: syyang99/mgrep

## Basic Information

- **Project Name**: mgrep
- **Description**: A calm, CLI-native way to semantically grep everything, like code, images, pdfs and more.
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-03-02
- **Last Updated**: 2026-03-02

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

## Why mgrep?

- Natural-language search that feels as immediate as `grep`.
- Semantic, multilingual & multimodal (audio, video support coming soon!)
- Web search built in: query the web alongside your local files with `--web`.
- Smooth background indexing via `mgrep watch`, designed to detect and keep up-to-date everything that matters inside any git repository.
- Friendly device-login flow and first-class coding agent integrations.
- Built for agents and humans alike, and **designed to be a helpful tool**, not a restrictive harness: quiet output, thoughtful defaults, and escape hatches everywhere.
- Uses about 2x fewer agent tokens than grep-based workflows at similar or better judged quality, per our 50-task benchmark.

```bash
# index once
mgrep watch

# then ask your repo things in natural language
mgrep "where do we set up auth?"
```

## Quick Start

1. **Install**

   ```bash
   npm install -g @mixedbread/mgrep    # or pnpm / bun
   ```

2. **Sign in once**

   ```bash
   mgrep login
   ```

   A browser window (or verification URL) guides you through Mixedbread authentication.

   **Alternative: API Key Authentication**

   For CI/CD or headless environments, set the `MXBAI_API_KEY` environment variable:

   ```bash
   export MXBAI_API_KEY=your_api_key_here
   ```

   This bypasses the browser login flow entirely.

3. **Index a project**

   ```bash
   cd path/to/repo
   mgrep watch
   ```

   `watch` performs an initial sync, respects `.gitignore`, then keeps the Mixedbread store updated as files change.

4. **Search anything**

   ```bash
   mgrep "where do we set up auth?" src/lib
   mgrep -m 25 "store schema"
   ```

   Searches default to the current working directory unless you pass a path.

**Today, `mgrep` works great on:** code, text, PDFs, images.

**Coming soon:** audio & video.

## Using it with Coding Agents

> [!CAUTION]
> **Background Sync Enabled**: When installed with a coding agent, mgrep runs a
> background process that syncs your files to enable semantic search. This
> process starts automatically when you begin a session and stops when your
> session ends.
> You can see your current usage in the [Mixedbread platform](https://www.platform.mixedbread.com/).

> [!NOTE]
> **Default Limits**: mgrep enforces default limits to ensure optimal performance:
> - **Maximum file size**: 1MB per file
> - **Maximum file count**: 1,000 files per directory
>
> These limits can be customized via CLI flags (`--max-file-size`, `--max-file-count`),
> environment variables, or config files. See the [Configuration](#configuration) section for details.

If you prefer to manually start the file watcher instead of relying on the agent's automatic background sync, you can run:

```bash
mgrep watch /path/to/your/project
```

This gives you explicit control over when indexing occurs and which directories are watched.

`mgrep` supports assisted installation commands for many agents:

- `mgrep install-claude-code` for Claude Code
- `mgrep install-opencode` for OpenCode
- `mgrep install-codex` for Codex
- `mgrep install-droid` for Factory Droid

These commands sign you in (if needed) and add Mixedbread `mgrep` support to the agent. After that, you only have to start the agent in your project folder. That's it.

### More Agents Coming Soon

More agents (Cursor, Windsurf, etc.) are on the way; this section will grow as each integration lands.

## Making your agent smarter

We plugged `mgrep` into Claude Code and ran a benchmark of 50 QA tasks to evaluate the economics of `mgrep` against `grep`.

In our 50-task benchmark, `mgrep`+Claude Code used ~2x fewer tokens than grep-based workflows at similar or better judged quality.

`mgrep` finds the relevant snippets in a few semantic queries first, and the model spends its capacity on reasoning instead of scanning through irrelevant code from endless `grep` attempts. You can [try it yourself](http://demo.mgrep.mixedbread.com).

*Note: Win Rate (%) was calculated by using an LLM as a judge.*

## Why we built mgrep

`grep` is an amazing tool.
It's lightweight, compatible with just about every machine on the planet, and will reliably surface any potential match within any target folder.

But `grep` is **from 1973**, and it carries the limitations of its era: you need exact patterns and it slows down considerably in the cases where you need it most, on large codebases.

Worst of all, if you're looking for deeply-buried critical business logic, you cannot describe it: you have to be able to accurately guess what kind of naming patterns would have been used by the previous generations of engineers at your workplace for `grep` to find it. This will often result in watching a coding agent desperately try hundreds of patterns, filling its token window, and your upcoming invoice, with thousands of tokens.

But it doesn't have to be this way. Everything else in our toolkit is increasingly tailored to understand us, and so should our search tools. `mgrep` is our way to bring `grep` to 2025, integrating all of the advances in semantic understanding and code-search, without sacrificing anything that has made `grep` such a useful tool.

Under the hood, `mgrep` is powered by [Mixedbread Search](https://www.mixedbread.com/blog/mixedbread-search), our full-featured search solution. It combines state-of-the-art semantic retrieval models with context-aware parsing and optimized inference methods to provide you with a natural language companion to `grep`. We believe both tools belong in your toolkit: use `grep` for exact matches, `mgrep` for semantic understanding and intent.

## When to use what

We designed `mgrep` to complement `grep`, not replace it. The best code search combines `mgrep` with `grep`.

| Use `grep` (or `ripgrep`) for... | Use `mgrep` for... |
| --- | --- |
| **Exact Matches** | **Intent Search** |
| Symbol tracing, Refactoring, Regex | Code exploration, Feature discovery, Onboarding |

## Web Search

`mgrep` can also search the web alongside your local files.
This is useful when you need to find documentation, tutorials, or answers to programming questions without leaving your terminal.

```bash
# Search the web and get a summarized answer
mgrep --web --answer "How do I integrate a JavaScript runtime into Deno?"

# Get the URLs of the search results
mgrep --web "best practices for error handling in TypeScript"
```

Web search queries the `mixedbread/web` store in addition to your local store, merging results based on relevance. Use `--answer` (or `-a`) to get a concise summary instead of raw results.

## mgrep as Subagent

For complex questions that require information from multiple sources, `mgrep` can act as a subagent that automatically refines queries and performs multiple searches to find the best answer.

```bash
# Enable agentic search for complex multi-part questions
mgrep --agentic "What are the yearly numbers for 2020, 2021, 2022, 2023, 2024?"

# Combine with --answer for a synthesized response from multiple sources
mgrep --agentic -a "How does authentication work and where is it configured?"
```

When `--agentic` is enabled, mgrep will:

- Automatically break down complex queries into sub-queries
- Perform multiple searches as needed to gather comprehensive results
- Combine findings from different parts of your codebase

This is particularly useful for questions that span multiple files or concepts, where a single search might miss important context.

## Commands at a Glance

| Command | Purpose |
| --- | --- |
| `mgrep` / `mgrep search