# cjhtmlparser
**Repository Path**: fuckcpps/cjhtmlparser
## Basic Information
- **Project Name**: cjhtmlparser
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-01-22
- **Last Updated**: 2021-01-22
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# cjhtmlparser
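An HTML parser for Windows and Linux.

Based on gumbo-parser and gumbo-query, adapted into an HTML parsing library that works on both Windows and Linux.

Build: add all files under gumbo-parser directly to your project and compile them; no separate build step is needed.

Basic usage: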
```cpp
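#include "stdafx.h"      // MSVC precompiled header; omit if your project does not use one (e.g. on Linux)
#include "enumtest.cpp"  // project-local source from the author's test project
#include <cstdio>
#include <string>
#include "gumbo-parser/Selector.h"
#include "gumbo-parser/Document.h"
#include "gumbo-parser/Selection.h"
#include "gumbo-parser/Node.h"

// Select a node by tag and class, then print its text and its source slice.
void test_parser() {
    // Illustrative sample markup, chosen to match the selector below.
    std::string page("<h1><a>wrong link</a><a class=\"special\">some link</a></h1>");
    CDocument doc;
    doc.parse(page.c_str());
    CSelection c = doc.find("h1 a.special");
    CNode node = c.nodeAt(0);
    printf("Node: %s\n", node.text().c_str());
    // startPos()/endPos() are offsets into the original page string.
    std::string content = page.substr(node.startPos(), node.endPos() - node.startPos());
    printf("Node: %s\n", content.c_str());
}

// Recover the raw HTML of a <div> from the original page string.
void test_html() {
    // Illustrative sample markup, chosen to match the selector below.
    std::string page = "<html><div><span>1\n</span>2\n</div></html>";
    CDocument doc;
    doc.parse(page.c_str());
    CNode pNode = doc.find("div").nodeAt(0);
    std::string content = page.substr(pNode.startPos(), pNode.endPos() - pNode.startPos());
    printf("Node: #%s#\n", content.c_str());
}

// Match an attribute value that contains a single quote.
void test_escape() {
    // Illustrative sample markup, chosen to match the selector below.
    std::string page = "<html><div><span id=\"that's\">1\n</span>2\n</div></html>";
    CDocument doc;
    doc.parse(page.c_str());
    CNode pNode = doc.find("span[id=\"that's\"]").nodeAt(0);
    std::string content = page.substr(pNode.startPos(), pNode.endPos() - pNode.startPos());
    printf("Node: #%s#\n", content.c_str());
}

int main() {
    test_parser();
    test_html();
    test_escape();
    return 0;
}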
```
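
Each test above recovers a node's source text with `page.substr(node.startPos(), node.endPos() - node.startPos())`. The sketch below isolates that slicing idiom using only the standard library, so it can be tried without building the parser. It assumes `startPos()` and `endPos()` are byte offsets into the original string that delimit the element's inner content, as in upstream gumbo-query; this README does not confirm that. `slice_between` and `inner_html_of` are hypothetical helpers, not part of cjhtmlparser, and the offsets are found by hand with `std::string::find` rather than by the parser.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of cjhtmlparser): returns the bytes of `page`
// in the half-open range [start, end), mirroring
// page.substr(node.startPos(), node.endPos() - node.startPos()).
std::string slice_between(const std::string& page, std::size_t start, std::size_t end) {
    assert(start <= end && end <= page.size());
    return page.substr(start, end - start);
}

// Hand-computed stand-in for a node's offsets: locates the inner content of
// the first <tag>...</tag> pair. It only handles attribute-free, non-nested
// tags and returns an empty string when the pair is not found.
std::string inner_html_of(const std::string& page, const std::string& tag) {
    const std::string open = "<" + tag + ">";
    const std::string close = "</" + tag + ">";
    const std::size_t open_at = page.find(open);
    if (open_at == std::string::npos) return "";
    const std::size_t start = open_at + open.size();
    const std::size_t end = page.find(close, start);
    if (end == std::string::npos) return "";
    return slice_between(page, start, end);
}
```

For example, calling `inner_html_of(page, "div")` on a page containing `<div><span>1\n</span>2\n</div>` yields the text between the opening and closing `div` tags. Real documents need the parser's offsets, since hand-searching for tags breaks on attributes and nesting.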