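# pdf_searcher

**Repository Path**: along_coding/pdf_searcher

## Basic Information

- **Project Name**: pdf_searcher
- **Description**: Searching content in a large collection of PDF documents
- **Primary Language**: Python
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2016-09-10
- **Last Updated**: 2020-12-19

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

A small tool for full-text search over PDF documents.

The folder tree is shown in `tree.txt`.

Libraries used: pdfminer, pypdf2, pdfbox (Java), whoosh, jieba

Install the requirements: `pip install -r requirement.txt`

### HOWTOUSE

1. Link `./pdf_searcher/source` to your document folder.
2. Make sure `./pdf_searcher/src/webapp/static` is linked to the same directory as `./pdf_searcher/source`.
3. Run `python ./pdf_searcher/src/Main_Daemonlize.py start|stop|restart` to set up the web service and the auto-update service as daemon processes and to manage them. The web service runs on port 8080.
4. Run `python ./pdf_searcher/src/Main_CMD.py` for more administrator operations; type `help` for details on each operation.

### TODO

2017.5.29:

1. Define the result ordering, add front-end pagination for the result display, and polish the front end.
2. Optimize index performance.

2017.6.3 update:

1. Results can now be sorted by the library's internal score, and pagination is implemented.
2. The front end still needs polish, and the index still needs optimizing.

2017.6.3: Complete first, optimize later; get it usable first.

1. Make parameter configuration more flexible and centralize the configuration operations in a single config module.
2. Start thinking about deployment and how users interact with the tool, so that the software is easy to use.
3. Start reading the source code of the libraries in use, beginning with the features already used, to learn disciplined development and unit testing techniques.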
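
Steps 1 and 2 of the usage instructions ask for `source` and `src/webapp/static` to be linked to the same document folder, but do not say how. The following is a minimal sketch of one way to do it, assuming symbolic links are what is meant and that the platform allows creating them. The function name `link_document_folder` and its parameters are hypothetical and not part of the repository.

```python
from pathlib import Path


def link_document_folder(repo_root, documents_dir):
    """Symlink <repo_root>/source and <repo_root>/src/webapp/static to documents_dir.

    Hypothetical helper: the README only says both paths must be "linked" to
    the same folder; using symbolic links is an assumption.
    """
    repo_root = Path(repo_root)
    target = Path(documents_dir).resolve()
    if not target.is_dir():
        raise NotADirectoryError(f"{target} is not a directory")

    links = [repo_root / "source", repo_root / "src" / "webapp" / "static"]
    for link in links:
        # Create the parent directories (e.g. src/webapp) if they are missing.
        link.parent.mkdir(parents=True, exist_ok=True)
        if link.is_symlink():
            # Replace an existing (possibly stale) link.
            link.unlink()
        elif link.exists():
            # Refuse to overwrite a real file or directory.
            raise FileExistsError(f"{link} exists and is not a symlink")
        link.symlink_to(target, target_is_directory=True)
    return links
```

A call such as `link_document_folder("./pdf_searcher", "/path/to/your/pdfs")` (the second path is a placeholder) would point both locations at the same folder. The helper refuses to touch a real directory at either location, so existing files are not replaced silently.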