# company-crawler

**Repository Path**: Cjimer/company-crawler

## Basic Information

- **Project Name**: company-crawler
- **Description**: Tianyancha and Qichacha crawler; crawls company information for the specified keywords
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 4
- **Forks**: 9
- **Created**: 2022-01-10
- **Last Updated**: 2026-02-26

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

### Tianyancha and Qichacha
### Company Information Crawler


------

## Usage

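1. Set up the user session
    
    Capture the mini program's network traffic and set the user authentication fields in the request headers.
2. Configure the data source
    ```python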
    MysqlConfig = {
        'develop': {
            'host': '192.168.1.103',
            'port': 3306,
            'db': 'enterprise',
            'username': 'root',
            'password': 'root@123'
        }
    }
    ```
3. Run ```db/data.sql``` to create the database schema
4. Configure the IP proxy in ```config/settings```
    ```python
    # Global proxy switch
    GLOBAL_PROXY = True
    PROXY_POOL_URL = "http://localhost:5010"
    ```
5. Set the crawl keywords in ```qichacha``` & ```tianyancha```
    ```python
    keys = ['Google']  # list of keywords to crawl
    crawler.load_keys(keys)
    crawler.start()
    ```
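
As a rough illustration of how the data source from step 2 might be consumed, the sketch below maps a `MysqlConfig` entry onto keyword arguments for `pymysql.connect`. The helper names `connection_kwargs` and `connect`, the `utf8mb4` charset, and the choice of PyMySQL as the driver are assumptions for this example; the project may open its connections differently.

```python
# Same structure as the MysqlConfig shown in step 2.
MysqlConfig = {
    'develop': {
        'host': '192.168.1.103',
        'port': 3306,
        'db': 'enterprise',
        'username': 'root',
        'password': 'root@123'
    }
}


def connection_kwargs(env='develop'):
    """Translate one MysqlConfig entry into pymysql.connect() keyword arguments.

    Hypothetical helper: the config uses 'username' and 'db', while
    pymysql.connect() expects 'user' and 'database'.
    """
    cfg = MysqlConfig[env]
    return {
        'host': cfg['host'],
        'port': int(cfg['port']),
        'user': cfg['username'],
        'password': cfg['password'],
        'database': cfg['db'],
        'charset': 'utf8mb4',  # assumption: company names may contain non-ASCII text
    }


def connect(env='develop'):
    """Open a connection; PyMySQL is imported lazily so the mapping works without it."""
    import pymysql  # third-party driver, assumed here for illustration
    return pymysql.connect(**connection_kwargs(env))
```

Keeping the key mapping in one small function means a second environment can be added to `MysqlConfig` without touching the connection code.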
   
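PS: **Using an IP proxy together with a random UA is strongly recommended; otherwise you will almost certainly be banned.**
1. For random UAs, [fake_useragent](https://github.com/hellysmile/fake-useragent) is recommended
2. For a proxy pool, [proxy_pool](https://github.com/jhao104/proxy_pool.git) is recommended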
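
The two recommendations above could be combined roughly as follows. This is a minimal sketch, not the crawler's actual code: the function names, the `Authorization` header, and the fallback UA list are hypothetical, and it assumes the proxy pool at `PROXY_POOL_URL` serves a `/get/` endpoint that returns JSON with a `proxy` field in `host:port` form, as recent proxy_pool releases do.

```python
import json
import random
import urllib.request

PROXY_POOL_URL = "http://localhost:5010"  # same value as in config/settings

# Small static fallback, used only when fake_useragent is unavailable.
FALLBACK_USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
]


def random_user_agent():
    """Return a random UA string, preferring fake_useragent when installed."""
    try:
        from fake_useragent import UserAgent
        return UserAgent().random
    except Exception:
        return random.choice(FALLBACK_USER_AGENTS)


def build_headers(auth_token):
    """Build request headers; 'Authorization' is a placeholder header name.

    Use whichever auth header the captured mini program request actually carries.
    """
    return {"User-Agent": random_user_agent(), "Authorization": auth_token}


def to_proxies(proxy):
    """Convert 'host:port' into a requests-style proxies mapping."""
    return {"http": f"http://{proxy}", "https": f"http://{proxy}"}


def fetch_proxy(pool_url=PROXY_POOL_URL, timeout=5):
    """Ask the proxy pool for one proxy; returns 'host:port' or None."""
    with urllib.request.urlopen(f"{pool_url}/get/", timeout=timeout) as resp:
        payload = json.load(resp)
    return payload.get("proxy")
```

With a pool running, `to_proxies(fetch_proxy())` and `build_headers(token)` can be passed as the `proxies` and `headers` arguments of an HTTP client such as `requests`, picking a fresh UA and proxy for each request.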