# parse5
**Repository Path**: mirrors_mightyiam/parse5
## Basic Information
- **Project Name**: parse5
- **Description**: Fast full-featured HTML parser for Node. Based on WHATWG HTML5 specification.
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2020-09-25
- **Last Updated**: 2026-05-24
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README

Fast, full-featured HTML parsing/serialization toolset for Node. Based on the WHATWG HTML5 specification.
To build [TestCafé](http://testcafe.devexpress.com/) we needed a fast, production-ready HTML parser that parses HTML the way a modern browser's parser does.
Existing solutions were either too slow or produced inaccurate output. That is how parse5 was born.
## Install
```
$ npm install parse5
```
## Simple usage
```js
var Parser = require('parse5').Parser;
//Instantiate parser
var parser = new Parser();
//Then feed it with an HTML document
var document = parser.parse('<!DOCTYPE html><html><head></head><body>Hi there!</body></html>');
//Now let's parse HTML-snippet
var fragment = parser.parseFragment('<title>Parse5 is fucking awesome!</title><h1>42</h1>');
```
## Is it fast?
Check out [this benchmark](https://github.com/inikulin/node-html-parser-bench).
```
Starting benchmark. Fasten your seatbelts...
html5 (https://github.com/aredridel/html5) x 0.18 ops/sec ±5.92% (5 runs sampled)
htmlparser (https://github.com/tautologistics/node-htmlparser/) x 3.83 ops/sec ±42.43% (14 runs sampled)
htmlparser2 (https://github.com/fb55/htmlparser2) x 4.05 ops/sec ±39.27% (15 runs sampled)
parse5 (https://github.com/inikulin/parse5) x 3.04 ops/sec ±51.81% (13 runs sampled)
Fastest is htmlparser2 (https://github.com/fb55/htmlparser2),parse5 (https://github.com/inikulin/parse5)
```
So parse5 is about as fast as simple, specification-incompatible parsers and roughly 15 times faster than the specification-compatible parser currently available for Node.
## API reference
### Enum: TreeAdapters
Provides built-in tree adapters which can be passed as an optional argument to the `Parser` and `TreeSerializer` constructors.
#### • TreeAdapters.default
Default tree format for parse5.
#### • TreeAdapters.htmlparser2
The popular [htmlparser2](https://github.com/fb55/htmlparser2) tree format (used, for example, in [cheerio](https://github.com/MatthewMueller/cheerio) and [jsdom](https://github.com/tmpvar/jsdom)).
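
To give a rough idea of how the two formats differ, the sketch below hand-writes the nodes for `<p class="greeting">Hi</p>` in each shape. The property names (`nodeName`, `tagName`, `attrs`, `value` for the default format; `type`, `name`, `attribs`, `children`, `data` for htmlparser2) are assumptions about the formats rather than output captured from parse5, and parent/sibling links are omitted. The helpers `getClassDefault` and `getClassHtmlparser2` are hypothetical.

```js
// Hand-written illustration, not actual parser output.
// Assumed default-format shape: attributes as an array of {name, value} pairs.
var defaultStyleNode = {
    nodeName: 'p',
    tagName: 'p',
    attrs: [{ name: 'class', value: 'greeting' }],
    childNodes: [{ nodeName: '#text', value: 'Hi' }]
};

// Assumed htmlparser2-format shape: attributes as a plain object.
var htmlparser2StyleNode = {
    type: 'tag',
    name: 'p',
    attribs: { 'class': 'greeting' },
    children: [{ type: 'text', data: 'Hi' }]
};

// The same attribute lookup, written against each shape.
function getClassDefault(node) {
    for (var i = 0; i < node.attrs.length; i++) {
        if (node.attrs[i].name === 'class') return node.attrs[i].value;
    }
    return null;
}

function getClassHtmlparser2(node) {
    return Object.prototype.hasOwnProperty.call(node.attribs, 'class') ? node.attribs['class'] : null;
}
```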
---------------------------------------
### Class: Parser
Provides HTML parsing functionality.
#### • Parser.ctor([treeAdapter])
Creates a new reusable instance of `Parser`. The optional `treeAdapter` argument specifies the resulting tree format. If `treeAdapter` is not specified, the `default` tree adapter is used.
*Example:*
```js
var parse5 = require('parse5');
//Instantiate new parser with default tree adapter
var parser1 = new parse5.Parser();
//Instantiate new parser with htmlparser2 tree adapter
var parser2 = new parse5.Parser(parse5.TreeAdapters.htmlparser2);
```
#### • Parser.parse(html)
Parses the specified `html` string. Returns a `document` node.
*Example:*
```js
var document = parser.parse('Hi there!');
```
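The returned `document` can be traversed through `childNodes`, as the other examples in this README do. As a minimal sketch, the hypothetical helper `collectText` below gathers the text from such a tree. It assumes text nodes are marked with `nodeName: '#text'` and carry their content in `value`, and it runs on a hand-built stand-in tree rather than on real parser output.

```js
// Hypothetical helper: concatenates the content of all text nodes under `node`.
// Assumes text nodes look like { nodeName: '#text', value: '...' }.
function collectText(node) {
    if (node.nodeName === '#text') return node.value;
    var text = '';
    var children = node.childNodes || [];
    for (var i = 0; i < children.length; i++) {
        text += collectText(children[i]);
    }
    return text;
}

// Hand-built stand-in for the tree that parser.parse('Hi there!') might return.
var sampleDocument = {
    nodeName: '#document',
    childNodes: [{
        nodeName: 'html',
        childNodes: [
            { nodeName: 'head', childNodes: [] },
            { nodeName: 'body', childNodes: [{ nodeName: '#text', value: 'Hi there!' }] }
        ]
    }]
};

console.log(collectText(sampleDocument)); // Hi there!
```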
#### • Parser.parseFragment(htmlFragment, [contextElement])
Parses the given `htmlFragment`. Returns a `documentFragment` node. The optional `contextElement` argument specifies the element in whose context the fragment is parsed. If `contextElement` is not specified, a default context element is used.
*Example:*
```js
var documentFragment = parser.parseFragment('<table></table>');
//Parse html fragment in context of the parsed <table> element
var trFragment = parser.parseFragment('<tr><td>Shake it, baby</td></tr>', documentFragment.childNodes[0]);
```
---------------------------------------
### Class: TreeSerializer
Provides tree-to-HTML serialization functionality.
#### • TreeSerializer.ctor([treeAdapter])
Creates a new reusable instance of `TreeSerializer`. The optional `treeAdapter` argument specifies the input tree format. If `treeAdapter` is not specified, the `default` tree adapter is used.
*Example:*
```js
var parse5 = require('parse5');
//Instantiate new serializer with default tree adapter
var serializer1 = new parse5.TreeSerializer();
//Instantiate new serializer with htmlparser2 tree adapter
var serializer2 = new parse5.TreeSerializer(parse5.TreeAdapters.htmlparser2);
```
#### • TreeSerializer.serialize(node)
Serializes the given `node`. Returns an HTML string.
*Example:*
```js
var document = parser.parse('Hi there!');
//Serialize document
var html = serializer.serialize(document);
//Serialize element content
var bodyInnerHtml = serializer.serialize(document.childNodes[0].childNodes[1]);
```
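Conceptually, serialization is a recursive walk that turns each node back into markup. The toy function `toySerialize` below illustrates the idea on a hand-built tree. It is not parse5's serializer: it assumes a simplified node shape (`nodeName`, `tagName`, `attrs`, `childNodes`, `value`), serializes the content of the node it is given, and ignores void elements, comments, doctypes and most escaping rules. For the sample `body` it produces `<p title="a &quot;b&quot;">1 &lt; 2</p>`.

```js
// Minimal escaping for text and double-quoted attribute values.
function escapeText(str) {
    return str.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

function escapeAttr(str) {
    return str.replace(/&/g, '&amp;').replace(/"/g, '&quot;');
}

// Toy serializer (illustrative only): returns the markup of `node`'s children.
function toySerialize(node) {
    var html = '';
    var children = node.childNodes || [];
    for (var i = 0; i < children.length; i++) {
        var child = children[i];
        if (child.nodeName === '#text') {
            html += escapeText(child.value);
        } else {
            var attrs = child.attrs || [];
            html += '<' + child.tagName;
            for (var j = 0; j < attrs.length; j++) {
                html += ' ' + attrs[j].name + '="' + escapeAttr(attrs[j].value) + '"';
            }
            html += '>' + toySerialize(child) + '</' + child.tagName + '>';
        }
    }
    return html;
}

// Hand-built stand-in for a parsed <body> element.
var body = {
    nodeName: 'body',
    tagName: 'body',
    attrs: [],
    childNodes: [{
        nodeName: 'p',
        tagName: 'p',
        attrs: [{ name: 'title', value: 'a "b"' }],
        childNodes: [{ nodeName: '#text', value: '1 < 2' }]
    }]
};

var bodyInnerHtml = toySerialize(body);
```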
---------------------------------------
## Testing
Test data is taken from the [html5lib project](https://github.com/html5lib). The parser is covered by more than 8000 test cases.
To run tests:
```
$ node test/run_tests.js
```
## Custom tree adapter
You can create a custom tree adapter so parse5 can work with your own DOM-tree implementation.
Just pass your adapter implementation to the parser's constructor as an argument:
```js
var Parser = require('parse5').Parser;
var myTreeAdapter = {
//Adapter methods...
};
//Instantiate parser
var parser = new Parser(myTreeAdapter);
```
A sample implementation can be found [here](https://github.com/inikulin/parse5/blob/master/lib/tree_adapters/default.js).
A custom tree adapter should implement all methods exposed via `exports` in the sample implementation.
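
As a rough sketch of what such an adapter can look like, the partial object below implements a handful of methods over a hypothetical array-based node shape. The method names and signatures (`createDocument`, `createElement`, `appendChild`, `insertText`, `getChildNodes`, `getTagName`, `isTextNode`) are assumptions modeled on the linked sample; check that file for the authoritative and complete list. The adapter is exercised by hand here, without parse5.

```js
// Partial, illustrative adapter over a hypothetical node shape:
// elements are { tag, ns, attrs, kids, parent }, text nodes are { text, parent }.
// Method names and signatures are assumptions; a real adapter must implement
// every method exported by the sample implementation.
var myTreeAdapter = {
    createDocument: function () {
        return { tag: '#document', attrs: [], kids: [], parent: null };
    },
    createElement: function (tagName, namespaceURI, attrs) {
        return { tag: tagName, ns: namespaceURI, attrs: attrs, kids: [], parent: null };
    },
    appendChild: function (parentNode, newNode) {
        parentNode.kids.push(newNode);
        newNode.parent = parentNode;
    },
    insertText: function (parentNode, text) {
        var last = parentNode.kids[parentNode.kids.length - 1];
        // Merge with a trailing text node instead of creating adjacent ones.
        if (last && myTreeAdapter.isTextNode(last)) {
            last.text += text;
        } else {
            myTreeAdapter.appendChild(parentNode, { text: text, parent: null });
        }
    },
    getChildNodes: function (node) { return node.kids; },
    getTagName: function (element) { return element.tag; },
    isTextNode: function (node) { return typeof node.text === 'string'; }
};

// Build a tiny tree by hand, the way a parser would call the adapter.
var doc = myTreeAdapter.createDocument();
var p = myTreeAdapter.createElement('p', 'http://www.w3.org/1999/xhtml', []);
myTreeAdapter.appendChild(doc, p);
myTreeAdapter.insertText(p, 'Hi ');
myTreeAdapter.insertText(p, 'there!');
```

Keeping the parser behind an interface like this means the tree-building logic never touches node properties directly, so the same parser can fill any DOM-like structure.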
## Questions or suggestions?
If you have any questions, please feel free to create an issue [here on GitHub](https://github.com/inikulin/parse5/issues).
## Author
[Ivan Nikulin](https://github.com/inikulin) (ifaaan@gmail.com)