Changes from 1.2 to 1.2.1
=========================

Match DOCTYPE case-blind
Extend PushbackReader's size for oddball cases like & followed by CR
Leo Sutic's 2x-4x speedup by precompiling HTMLScanner table
Daniel Janus's fix for ]] in CDATA sections
Remove bogus newline after printing children of the root element
Allow noscript element anywhere, same as script
Updated to 2011 edition of W3C character entities

Changes from 1.1.3 to 1.2
=========================

Changed license to Apache 2.0
Bogon default model is now ANY, not EMPTY
Support new DOCTYPE output switches --doctype-system and --doctype-public
Support new XML declaration output switches --standalone and --version
New --norootbogons switch makes bogons children of the root
Don't resolve entity references in attribute values unless
    semicolon-terminated
Support character entities above U+FFFF
Add character entities from the 2007-12-14 draft of xml-entity-names
Call SAX events startPrefixMapping and endPrefixMapping to report prefixes
Clean up newline processing, shrinking html.stml considerably
Allow link elements in the body as well as the head, to avoid excess bodies
Allow tables inside paragraphs
Allow cells and forms in thead and tfoot elements without intervening
    tr element
The span element is no longer restartable
Support non-standard elements bgsound, blink, canvas, comment, listing,
    marquee, nobr, ruby, rbc, rtc, rb, rt, rp, wbr, xmp
In HTML mode, boolean attributes like checked are output in minimized form
Correctly handle runs of less-than characters
Suppress all but the first DOCTYPE declaration
Modify PI targets containing colons to have underscores instead
The case of element tags is now canonicalized to the schema
PI targets are no longer forced to lower case

Changes from 1.1.2 to 1.1.3
===========================

Allow Parser.set* methods to accept null
Allow setting the LexicalHandler feature to be null
    In both cases, null means "use default behavior"

Changes from 1.1.1 to 1.1.2
===========================

Setting CDATAElementsFeature didn't really set CDATAElements instance variable

Changes from 1.1 to 1.1.1
=========================

Removed lexical handler calls to startCDATA/endCDATA from CDATA element
    handling
Added lexical handler calls to startCDATA/endCDATA from CDATA section
    handling
Added CDATAElementsFeature, the programmatic equivalent of the --nocdata
    switch

Changes from 1.0.5 to 1.1
=========================

Add Tatu Saloranta's JAXP support package

Changes from 1.0.4 to 1.0.5
===========================

Major repairs to comment scanning
Skip leading BOM
Comment out debugging code in PYXWriter
Allow &#X as well as &#x
Add net.sf.saxon to list of supported XSLT engines

Changes from 1.0.3 to 1.0.4
===========================

Certain options were mutually exclusive that should not have been
Blocked XML declaration from specifying an encoding of ""
--method=html was not doing the right thing

Changes from 1.0.2 to 1.0.3
===========================

Fixed build file to use Java target version 1.4
Fixed --version switch to print the right thing

Changes from 1.0.1 to 1.0.2
===========================

Version attribute default value removed from html element
Leading and trailing hyphens now trimmed properly from comments
Added --output-encoding switch to control encoding
If output encoding is Unicode, don't generate character references
Whitespace compressed and junk stripped from public identifiers

Changes from 1.0 to 1.0.1
=========================

Added ignorableWhitespaceFeature and --ignorable to report ignorable
    whitespace
    Patch due to David Pashley
Insert spaces to break up -- in comments
Change bogus chars in publicids to spaces
--lexical switch now outputs DOCTYPE if there is one
Remove unnecessary blank line after XML declaration

Changes from 1.0rc9 to 1.0
==========================

Added feature to control restartability
    Patch due to Nikita Zhuk
Added corresponding --norestart switch in CommandLine
Made translate-colons feature actually work

Changes from 1.0rc8 to 1.0rc9
=============================

If there is a publicid but no systemid, set systemid to ""

Changes from 1.0rc7 to 1.0rc8
=============================

Fixed paper-bag bug (source didn't match binary in release)

Changes from 1.0rc6 to 1.0rc7
=============================

LexicalHandler now gets DOCTYPE information (publicid and systemid)
    Patch due to Mike Bremford
HTMLScanner now reports more useful debug output when not commented out
    Patch due to Mike Bremford
Change "" to exclude "" pseudo-element
    This prevents "script" from being output as a root
The shared HTMLParser object has been eliminated

Changes from 1.0rc5 to 1.0rc6
=============================

If namespaceFeature is false, uri and localname are passed as empty strings
The namespacePrefixesFeature is now always false
Command line switch --nons no longer affects namespacePrefixesFeature
Command line switch --html now implies --nons
XMLWriter is now told directly to use the schema's URI as default namespace
XMLWriter now takes the element name from the qname if localname is empty

Changes from 1.0rc4 to 1.0rc5
=============================

The --nodefault switch now removes only default attributes, not all of them
Added --nocolons switch and translate-colons feature to convert ":" in
    names to "_" (thus suppressing namespaces other than the basic one)
The root element can be unknown without problem
Empty