# CrawlerForReader

**Repository Path**: njn/CrawlerForReader

## Basic Information

- **Project Name**: CrawlerForReader
- **Description**: CrawlerForReader is a local web-novel crawler, built on jsoup and XPath, that parses web pages through templates
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 1
- **Created**: 2022-02-14
- **Last Updated**: 2022-02-14

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# CrawlerForReader

**This project is an ohos port and adaptation of the open-source project CrawlerForReader. The original project version can be traced through the project tags and the GitHub address ( https://github.com/smuyyh/CrawlerForReader ).**

#### Project Introduction

- Project name: CrawlerForReader
- Series: adaptation and porting of third-party components for ohos
- Function: CrawlerForReader is a local web-novel crawler, built on jsoup and XPath, that parses web pages through templates.
- Porting status: complete
- Differences in invocation: none
- Project author and maintainer: hihope
- Contact: hihope@hoperun.com
- Original project documentation: https://github.com/smuyyh/CrawlerForReader
- Original project baseline version: no release version, sha1:fc52a470291fef3b0d15c4554b2d6845e8bfa713
- Programming language: Java
- External library dependencies: gson, jsoup, commons-lang3

#### Installation

##### Method 1:

1. Download the HAR package CrawlerForReader.har.
2. Launch DevEco Studio and import the downloaded har/jar package into the project directory "entry->libs".
3. In the module-level build.gradle file, add a reference to the har/jar packages under the libs directory inside the dependencies block.

```
dependencies {
    implementation fileTree(dir: 'libs', include: ['*.jar', '*.har'])
}
```

4. Right-click the imported har package and choose "Add as Library" to reference it, select the module that needs it, and click "OK" to complete the reference.

##### Method 2:

1. In the project-level build.gradle, add the address of the Maven repository that hosts the HAR under allprojects:

```
repositories {
    maven {
        url 'http://106.15.92.248:8081/repository/Releases/'
    }
}
```

2. In the application module's build.gradle, add the following to the dependencies closure:

```
dependencies {
    implementation 'com.qy.reader.ohos:crawler:1.0.0'
}
```

#### Demo

![p1](screenshot/CrawlerForReader.gif)

#### Usage

##### Supported book sources

```java
public static final SparseArray SOURCES = new SparseArray() {
    {
        put(SourceID.LIEWEN, new Source(SourceID.LIEWEN, "猎文网", "https://www.liewen.cc/search.php?keyword=%s"));
        put(SourceID.CHINESE81, new Source(SourceID.CHINESE81, "八一中文网", "https://www.zwdu.com/search.php?keyword=%s"));
        put(SourceID.ZHUISHU, new Source(SourceID.ZHUISHU, "追书网", "https://www.zhuishu.tw/search.aspx?keyword=%s"));
        put(SourceID.BIQUG, new Source(SourceID.BIQUG, "笔趣阁", "http://zhannei.baidu.com/cse/search?s=1393206249994657467&q=%s"));
        put(SourceID.WENXUEMI, new Source(SourceID.WENXUEMI, "文学迷", "http://www.wenxuemi.com/search.php?keyword=%s"));
        put(SourceID.CHINESEXIAOSHUO, new Source(SourceID.CHINESEXIAOSHUO, "小说中文网", "http://www.xszww.com/s.php?ie=gbk&s=10385337132858012269&q=%s"));
        put(SourceID.DINGDIAN, new Source(SourceID.DINGDIAN, "顶点小说", "http://zhannei.baidu.com/cse/search?s=1682272515249779940&q=%s"));
        put(SourceID.BIQUGER, new Source(SourceID.BIQUGER, "笔趣阁2", "http://zhannei.baidu.com/cse/search?s=7928441616248544648&ie=utf-8&q=%s"));
        put(SourceID.CHINESEZHUOBI, new Source(SourceID.CHINESEZHUOBI, "着笔中文网", "http://www.zbzw.com/s.php?ie=utf-8&s=4619765769851182557&q=%s"));
        put(SourceID.DASHUBAO, new Source(SourceID.DASHUBAO, "大书包", "http://zn.dashubao.net/cse/search?s=9410583021346449776&entry=1&ie=utf-8&q=%s"));
        put(SourceID.CHINESEWUZHOU, new Source(SourceID.CHINESEWUZHOU, "梧州中文台", "http://www.gxwztv.com/search.htm?keyword=%s"));
        put(SourceID.UCSHUMENG, new Source(SourceID.UCSHUMENG, "UC书盟", "http://www.uctxt.com/modules/article/search.php?searchkey=%s", 4));
        put(SourceID.QUANXIAOSHUO, new Source(SourceID.QUANXIAOSHUO, "全小说", "http://qxs.la/s_%s"));
        put(SourceID.YANMOXUAN, new Source(SourceID.YANMOXUAN, "衍墨轩", "http://www.ymoxuan.com/search.htm?keyword=%s"));
        put(SourceID.AIQIWENXUE, new Source(SourceID.AIQIWENXUE, "爱奇文学", "http://m.i7wx.com/?m=book/search&keyword=%s"));
        put(SourceID.QIANQIANXIAOSHUO, new Source(SourceID.QIANQIANXIAOSHUO, "千千小说", "http://www.xqqxs.com/modules/article/search.php?searchkey=%s", 4));
        put(SourceID.PIAOTIANWENXUE, new Source(SourceID.PIAOTIANWENXUE, "飘天文学网", "http://www.piaotian.com/modules/article/search.php?searchtype=articlename&searchkey=%s"));
        put(SourceID.SUIMENGXIAOSHUO, new Source(SourceID.SUIMENGXIAOSHUO, "随梦小说网", "http://m.suimeng.la/modules/article/search.php?searchkey=%s", 4));
        put(SourceID.DAJIADUSHUYUAN, new Source(SourceID.DAJIADUSHUYUAN, "大家读书苑", "http://www.dajiadu.net/modules/article/searchab.php?searchkey=%s"));
        put(SourceID.SHUQIBA, new Source(SourceID.SHUQIBA, "书旗吧", "http://www.shuqiba.com/modules/article/search.php?searchkey=%s", 4));
        put(SourceID.XIAOSHUO52, new Source(SourceID.XIAOSHUO52, "小说52", "http://m.xs52.com/search.php?searchkey=%s"));
    }
};
```

##### Template example

For example, the template for 八一中文网 (`SourceID.CHINESE81`):

```json
{
  "id": 2,
  "search": {
    "charset": "UTF-8",
    "xpath": "//div[@class='result-item result-game-item']",
    "coverXpath": "//div[@class='result-game-item-pic']//a//img/@src",
    "titleXpath": "//div[@class='result-game-item-detail']//h3//a/@title",
    "linkXpath": "//div[@class='result-game-item-detail']//h3//a/@href",
    "authorXpath": "//div[@class='result-game-item-detail']//div[@class='result-game-item-info']//p[1]/span[2]/text()",
    "descXpath": "//div[@class='result-game-item-detail']//p[@class='result-game-item-desc']/text()"
  },
  "catalog": {
    "xpath": "//div[@id=list]//dl//dd",
    "titleXpath": "//a/text()",
    "linkXpath": "//a/@href"
  },
  "content": {
    "xpath": "//div[@id='content']/text()"
  }
}
```

##### Search for books

```java
Crawler.search("a", new SearchCallback() {
    @Override
    public void onResponse(String keyword, List appendList) {
    }

    @Override
    public void onFinish() {
    }

    @Override
    public void onError(String msg) {
    }
});
```

##### Load a book's chapters

```java
Crawler.catalog(new SearchBook.SL("https://www.81book.com/book/62867/", SourceManager.SOURCES.get(2).get()), new ChapterCallback() {
    @Override
    public void onResponse(List chapters) {
    }

    @Override
    public void onError(String msg) {
    }
});
```

##### Load chapter content

```java
Crawler.content(new SearchBook.SL("https://www.81book.com/book/62867/", SourceManager.SOURCES.get(2).get()), "/book/62867/29013513.html", new ContentCallback() {
    @Override
    public void onResponse(String content) {
    }

    @Override
    public void onError(String msg) {
    }
});
```

#### Version History

- v1.0.0

#### Copyright and License Information

```
Copyright 2016 smuyyh, All right reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
```
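
#### Appendix: URL handling sketch

The book sources above store a search URL with a `%s` placeholder, and the chapter-content call passes a site-relative link such as `/book/62867/29013513.html` alongside the book URL. The following is a minimal sketch, using only the Java standard library, of how such a placeholder might be filled and how a relative chapter link might be resolved. `SourceUrlSketch`, `buildSearchUrl`, and `resolveChapterUrl` are hypothetical names introduced here for illustration; they are not part of CrawlerForReader, and whether the library encodes keywords or resolves links this way is an assumption.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of CrawlerForReader.
class SourceUrlSketch {

    // Replace the %s placeholder of a search URL template with the
    // URL-encoded keyword. UTF-8 is an assumption; a source whose
    // template declares another charset would need that charset instead.
    static String buildSearchUrl(String template, String keyword) {
        String encoded = URLEncoder.encode(keyword, StandardCharsets.UTF_8);
        return template.replace("%s", encoded);
    }

    // Resolve a chapter link (absolute path or relative file name)
    // against the URL of the book's catalog page.
    static String resolveChapterUrl(String bookUrl, String chapterLink) {
        return URI.create(bookUrl).resolve(chapterLink).toString();
    }

    public static void main(String[] args) {
        // Search URL template taken from the 八一中文网 entry above.
        System.out.println(buildSearchUrl(
                "https://www.zwdu.com/search.php?keyword=%s", "a"));
        // Book URL and chapter link taken from the usage examples above.
        System.out.println(resolveChapterUrl(
                "https://www.81book.com/book/62867/",
                "/book/62867/29013513.html"));
    }
}
```

`String.replace` is used instead of `String.format` so that any other percent sign in a template is left untouched.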