Class HtmlParser

java.lang.Object
com.oorian.html.parser.HtmlParser

public class HtmlParser extends Object
HTML5 parser that converts an HTML string into a DOM tree built from Oorian element classes.

This parser implements a simplified version of the HTML5 parsing algorithm, transforming HTML markup into a tree of Oorian Element objects. It handles HTML5 constructs such as void elements (for example, <br>, which takes no closing tag), optional closing tags (for example, an omitted </li>), and nested structures.
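
To make the terms "void elements" and "optional closing tags" concrete, here is a minimal, self-contained sketch of a stack-based tree builder. It is an illustration only and does not reflect how HtmlParser is actually implemented. The names MiniTreeBuilder, Node, VOID, and IMPLIED_END are hypothetical, attributes and comments are ignored, and the implied-end rule is reduced to "a new <li> or <p> closes an open element of the same name".

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Set;

/** Toy tree builder for illustration only; not Oorian's implementation. */
public class MiniTreeBuilder {

    /** Hypothetical node type standing in for an Oorian Element. */
    public static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) { this.name = name; }

        /** Renders the subtree as name(child,child,...). */
        @Override
        public String toString() {
            if (children.isEmpty()) return name;
            StringBuilder sb = new StringBuilder(name).append('(');
            for (int i = 0; i < children.size(); i++) {
                if (i > 0) sb.append(',');
                sb.append(children.get(i));
            }
            return sb.append(')').toString();
        }
    }

    // Void elements have no content and no end tag, so they are never pushed on the stack.
    static final Set<String> VOID = Set.of("area", "base", "br", "col", "embed", "hr",
            "img", "input", "link", "meta", "source", "track", "wbr");

    // Simplified rule: a new <li> or <p> implicitly closes an open element of the same name.
    static final Set<String> IMPLIED_END = Set.of("li", "p");

    public static Node parse(String html) {
        Node root = new Node("#root");
        Deque<Node> open = new ArrayDeque<>();   // stack of currently open elements
        open.push(root);
        int i = 0;
        while (i < html.length()) {
            int lt = html.indexOf('<', i);
            int gt = lt < 0 ? -1 : html.indexOf('>', lt);
            if (lt < 0 || gt < 0) {              // no further complete tag: the rest is text
                addText(open.peek(), html.substring(i));
                break;
            }
            addText(open.peek(), html.substring(i, lt));
            String tag = html.substring(lt + 1, gt).trim();
            i = gt + 1;
            if (tag.startsWith("/")) {
                closeUpTo(open, tagName(tag.substring(1)));
                continue;
            }
            String name = tagName(tag);
            if (name.isEmpty()) continue;        // skips <!DOCTYPE ...> and similar
            if (IMPLIED_END.contains(name) && open.peek().name.equals(name)) {
                open.pop();                      // optional closing tag was omitted
            }
            Node node = new Node(name);
            open.peek().children.add(node);
            if (!VOID.contains(name)) open.push(node);
        }
        return root;
    }

    /** Extracts the lowercase tag name, ignoring attributes and a trailing slash. */
    static String tagName(String tag) {
        int end = 0;
        while (end < tag.length()
                && (Character.isLetterOrDigit(tag.charAt(end)) || tag.charAt(end) == '-')) {
            end++;
        }
        return tag.substring(0, end).toLowerCase();
    }

    /** Adds a placeholder text node unless the text is only whitespace. */
    static void addText(Node parent, String text) {
        if (!text.trim().isEmpty()) parent.children.add(new Node("#text"));
    }

    /** Pops open elements up to and including the named one; ignores unmatched end tags. */
    static void closeUpTo(Deque<Node> open, String name) {
        boolean found = false;
        for (Node n : open) {
            if (n.name.equals(name)) { found = true; break; }
        }
        if (!found) return;
        while (!open.pop().name.equals(name)) { /* keep popping */ }
    }

    public static void main(String[] args) {
        System.out.println(parse("<ul><li>One<li>Two<br>more</ul><p>A<p>B"));
    }
}
```

Running main prints `#root(ul(li(#text),li(#text,br,#text)),p(#text),p(#text))`: the second <li> and second <p> close their predecessors, and <br> is added as a leaf without becoming the current open element.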

  • Constructor Details

    • HtmlParser

      public HtmlParser()
      Creates a new HTML5 parser instance with default configuration.
  • Method Details

    • parse

      public Element parse(String html) throws HtmlParseException
      Parses an HTML5 string and returns the root element of the DOM tree.
      Parameters:
      html - the HTML string to parse.
      Returns:
      the root element of the parsed DOM tree.
      Throws:
      HtmlParseException - if the input is null or empty, or if the HTML structure is invalid.
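
Because parse declares HtmlParseException and treats null or empty input as an error, callers may want a small guard around it. The sketch below shows one possible calling pattern using only the signature documented above. SafeParse, CheckedParser, and tryParse are hypothetical helper names, not part of Oorian, and the demo uses a stand-in lambda so that it runs without the Oorian jar on the classpath.

```java
import java.util.Optional;

/** Hypothetical helper; not part of the Oorian API. */
public class SafeParse {

    /** Functional shape matching a method like "Element parse(String) throws HtmlParseException". */
    @FunctionalInterface
    public interface CheckedParser<T, E extends Exception> {
        T parse(String html) throws E;
    }

    /**
     * Returns the parse result, or an empty Optional when the input is null or empty
     * or when the parser throws. Catching Exception also swallows runtime exceptions;
     * narrow the catch clause if that is not wanted.
     */
    public static <T, E extends Exception> Optional<T> tryParse(CheckedParser<T, E> parser,
                                                                 String html) {
        if (html == null || html.isEmpty()) {
            return Optional.empty();   // documented failure case, so skip the call entirely
        }
        try {
            return Optional.ofNullable(parser.parse(html));
        } catch (Exception e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Stand-in parser for the demo: rejects input that contains no markup.
        CheckedParser<String, Exception> fake = s -> {
            if (!s.contains("<")) throw new Exception("no markup");
            return s.trim();
        };
        System.out.println(tryParse(fake, " <p>x</p> ").isPresent());
        System.out.println(tryParse(fake, "").isPresent());
    }
}
```

With the real library on the classpath, the call would presumably look like `SafeParse.tryParse(new HtmlParser()::parse, html)`, which yields an `Optional<Element>`. Whether returning an empty Optional is preferable to propagating HtmlParseException depends on how much the caller needs the failure detail.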