PyTidyLib is a Python package that wraps the HTML Tidy library, letting you
"fix" invalid (X)HTML markup from Python code. The library's capabilities
include:

  * Clean up unclosed tags and unescaped characters such as ampersands
  * Output HTML 4 or XHTML, strict or transitional, and add missing doctypes
  * Convert named entities to numeric entities, which can then be used in XML
    documents without an HTML doctype
  * Clean up HTML from programs such as Word (to an extent)
  * Indent the output, including proper (i.e. no) indenting for pre elements,
    which some (X)HTML indenting code overlooks
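
As a rough illustration of the capabilities above, the following minimal
sketch passes a broken fragment through tidy_document() from the tidylib
module. The helper name clean_markup and the sample markup are hypothetical,
and the option set is only one plausible choice of HTML Tidy settings. The
sketch assumes both PyTidyLib and the HTML Tidy shared library are installed,
and returns None when they are not.

```python
# Sketch only: needs the pytidylib package and the HTML Tidy shared library.
try:
    from tidylib import tidy_document
except (ImportError, OSError):
    tidy_document = None  # PyTidyLib or libtidy is not available


def clean_markup(html):
    """Hypothetical helper: return (document, errors), or None if Tidy is unavailable."""
    if tidy_document is None:
        return None
    options = {
        "output-xhtml": 1,      # emit XHTML instead of HTML
        "numeric-entities": 1,  # write named entities in numeric form
        "indent": 1,            # indent block-level content
    }
    try:
        return tidy_document(html, options=options)
    except OSError:
        # Some versions load libtidy lazily and fail here if it is missing.
        return None


result = clean_markup("<p>fish &amp chips<br>an unclosed paragraph")
if result is not None:
    document, errors = result
    print(document)  # the repaired markup
    print(errors)    # Tidy's warnings and errors, one per line
```

The second return value holds Tidy's diagnostics as text, so a caller can log
what was repaired rather than silently accepting the cleaned markup.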

WWW: http://countergram.com/open-source/pytidylib