xml5ever

High-performance browser-grade HTML5 parser

Latest version: 0.20.0
Maintenance score: 82
Safety score: 100
Popularity score: 88
Security
  Vulnerabilities
Version Suggest Low Medium High Critical
0.20.0 0 0 0 0 0
0.19.0 0 0 0 0 0
0.18.1 0 0 0 0 0
0.18.0 0 0 0 0 0
0.17.0 0 0 0 0 0
0.16.2 0 0 0 0 0
0.16.1 0 0 0 0 0
0.16.0 0 0 0 0 0
0.15.0 0 0 0 0 0
0.14.0 0 0 0 0 0
0.13.2 0 0 0 0 0
0.12.1 0 0 0 0 0
0.12.0 0 0 0 0 0
0.11.2 0 0 0 0 0
0.11.1 0 0 0 0 0
0.11.0 0 0 0 0 0
0.10.0 0 0 0 0 0
0.9.0 0 0 0 0 0
0.8.0 0 0 0 0 0
0.7.0 0 0 0 0 0
0.6.0 0 0 0 0 0
0.5.0 0 0 0 0 0
0.4.3 0 0 0 0 0
0.4.2 0 0 0 0 0
0.4.1 0 0 0 0 0
0.4.0 0 0 0 0 0
0.3.2 0 0 0 0 0
0.3.1 0 0 0 0 0
0.3.0 0 0 0 0 0
0.2.0 0 0 0 0 0
0.1.3 0 0 0 0 0
0.1.2 0 0 0 0 0
0.1.1 0 0 0 0 0
0.1.0 0 0 0 0 0

Stability
Latest release:

0.20.0 - This version has no known security vulnerabilities at this time.

Licensing


Apache-2.0   -   Apache License 2.0

Not a wildcard

Not proprietary

OSI Compliant


MIT   -   MIT License

Not a wildcard

Not proprietary

OSI Compliant



html5ever


API Documentation

html5ever is an HTML parser developed as part of the Servo project.

It can parse and serialize HTML according to the WHATWG specs (aka "HTML5"). However, its actual behavior currently differs from the specs in some respects, most of which are documented in the bug tracker. html5ever passes all tokenizer tests from html5lib-tests and most tree builder tests, apart from those covering unimplemented features. The goal is to pass all html5lib tests while also providing all the hooks a production web browser needs, e.g. document.write.

Note that HTML syntax is very similar to XML, but the two are not identical. To parse XHTML correctly, use an XML parser (that said, many XHTML documents in the wild are serialized in an HTML-compatible form).

html5ever is written in Rust, so it avoids the notorious memory-safety problems that come with using C. Rust also gives the library the kind of performance you would expect from an HTML parser written in C. In effect, html5ever offers what a C HTML parser does, without needing a garbage collector or other heavy runtime support.

Getting started in Rust

Add html5ever as a dependency in your Cargo.toml file:

[dependencies]
html5ever = "0.29"

You should also take a look at examples/html2html.rs, examples/print-rcdom.rs, and the API documentation.

Getting started in other languages

Bindings for Python and other languages are much desired.

Working on html5ever

To fetch the test suite, run:

git submodule update --init

Run cargo doc in the repository root to build local documentation under target/doc/.

Details

html5ever uses callbacks to manipulate the DOM, therefore it does not provide any DOM tree representation.
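
To illustrate the callback idea in general terms, here is a minimal std-only sketch. It does not use html5ever's API: `Sink`, `Event`, `drive`, `Outline`, and `build_outline` are hypothetical names invented for this example, and the `Event` list stands in for what a real parser would discover in its input. The point is only that the "parser" side holds no tree of its own and simply calls into whatever the caller supplies.

```rust
// Hypothetical names throughout: `Sink`, `Event`, `drive`, `Outline`, and
// `build_outline` are illustrative only and are NOT part of html5ever's API.

/// Callbacks a parser would invoke as it recognizes structure.
trait Sink {
    fn start_element(&mut self, name: &str);
    fn text(&mut self, content: &str);
    fn end_element(&mut self, name: &str);
}

/// Stand-in for parser output, so the sketch runs without a real parser.
enum Event<'a> {
    Start(&'a str),
    Text(&'a str),
    End(&'a str),
}

/// Plays the role of the parser: it owns no tree and only calls the sink.
fn drive<S: Sink>(events: &[Event<'_>], sink: &mut S) {
    for event in events {
        match event {
            Event::Start(name) => sink.start_element(name),
            Event::Text(content) => sink.text(content),
            Event::End(name) => sink.end_element(name),
        }
    }
}

/// One possible sink: it records an indented outline instead of a full DOM.
#[derive(Default)]
struct Outline {
    depth: usize,
    lines: Vec<String>,
}

impl Sink for Outline {
    fn start_element(&mut self, name: &str) {
        self.lines.push(format!("{}<{}>", "  ".repeat(self.depth), name));
        self.depth += 1;
    }

    fn text(&mut self, content: &str) {
        // `{:?}` quotes the text so whitespace-only nodes stay visible.
        self.lines.push(format!("{}{:?}", "  ".repeat(self.depth), content));
    }

    fn end_element(&mut self, _name: &str) {
        self.depth = self.depth.saturating_sub(1);
    }
}

/// Convenience wrapper: feed events to a fresh `Outline` and return its lines.
fn build_outline(events: &[Event<'_>]) -> Vec<String> {
    let mut sink = Outline::default();
    drive(events, &mut sink);
    sink.lines
}
```

Because the tree representation lives entirely in the sink, a caller could swap `Outline` for a reference-counted DOM, an arena, or a streaming filter without touching `drive`. For the real trait and its method signatures, see the API documentation.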

html5ever exclusively uses UTF-8 to represent strings. In the future it will support other document encodings (and UCS-2 document.write) by converting input.
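
Until such support exists, a caller holding input in another encoding has to convert it to UTF-8 first. The following is a rough sketch of one such conversion using only the Rust standard library; the function name `utf16le_bytes_to_utf8` is hypothetical, the little-endian byte order is an assumption, and this is not a description of how html5ever itself will handle encodings.

```rust
/// Hypothetical helper (not part of html5ever): decode UTF-16LE bytes, which
/// also covers UCS-2 input, into a UTF-8 `String`.
///
/// Unpaired surrogates become U+FFFD. A trailing odd byte, if any, is ignored.
fn utf16le_bytes_to_utf8(bytes: &[u8]) -> String {
    // Reassemble each pair of bytes into one 16-bit code unit.
    let units: Vec<u16> = bytes
        .chunks_exact(2)
        .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
        .collect();

    // Lossy decoding never fails; invalid sequences map to U+FFFD.
    String::from_utf16_lossy(&units)
}
```

The resulting `String` is valid UTF-8 and can then be handed to any UTF-8 based parser. Real documents may also need byte-order-mark handling and encoding detection, which this sketch leaves out.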

The code is cross-referenced with the WHATWG syntax spec, and eventually we will have a way to present code and spec side-by-side.

html5ever builds against the official stable releases of Rust, though some optimizations are only supported on nightly releases.