xfind – Tree traversal and filtering
This module contains XFind selectors and related classes and functions.
A selector specifies a condition that a node in an XIST tree must satisfy to match the selector. For example, the method Node.walk() will only output nodes that match the specified selector.
Selectors can be combined with various operations and form a language comparable to XPath but implemented as Python expressions.
- ll.xist.xfind.filter(iter, *selectors)
Filter an iterator over xsc.Cursor objects against a Selector object.
Example:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html, chars
>>> doc = parse.tree(
...     parse.URL("https://www.python.org/"),
...     parse.Tidy(),
...     parse.NS(html),
...     parse.Node(pool=xsc.Pool(xml, html, chars))
... )
>>> [c.node.string() for c in xfind.filter(doc.walk(), html.b, html.title)]
[
    '<title>Welcome to Python.org</title>',
    '<b>Web Programming</b>',
    '<b>GUI Development</b>',
    '<b>Scientific and Numeric</b>',
    '<b>Software Development</b>',
    '<b>System Administration</b>'
]
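The matching step behind filter() can be illustrated without XIST. The following is a minimal sketch, assuming a simplified model in which a path is a plain list of tag names rather than a list of XIST nodes, and in which paths are passed directly instead of xsc.Cursor objects. HasTag and filter_paths are hypothetical names used only for illustration; they are not part of ll.xist.xfind.

```python
class Selector:
    """Toy base class: subclasses decide matching in __contains__."""

    def __contains__(self, path):
        raise NotImplementedError


class HasTag(Selector):
    """Hypothetical selector matching paths whose last entry has a given name."""

    def __init__(self, name):
        self.name = name

    def __contains__(self, path):
        # The last entry of the path is the node in question.
        return path[-1] == self.name


def filter_paths(paths, selector):
    """Yield only those paths that match the selector (tested with ``in``)."""
    for path in paths:
        if path in selector:
            yield path


paths = [
    ["html"],
    ["html", "head"],
    ["html", "head", "title"],
    ["html", "body"],
    ["html", "body", "b"],
]
print(list(filter_paths(paths, HasTag("b"))))
```

Expressing the test as `path in selector` keeps the filtering loop independent of how a particular selector decides whether a path matches.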
- ll.xist.xfind.selector(*objs)
Create a Selector object from objs.
If objs is empty (i.e. selector() is called without arguments), any is returned (which matches every node).
If more than one argument is passed (or the argument is a tuple), an OrCombinator is returned.
Otherwise the following steps are taken for the single argument obj:
  - if obj already is a Selector object, it is returned unchanged;
  - if obj is a Node subclass, an IsInstanceSelector is returned (which matches if the node is an instance of this class);
  - if obj is a Node instance, an IsSelector is returned (which matches only obj);
  - if obj is callable, a CallableSelector is returned (where matching is done by calling obj);
  - if obj is None, any will be returned;
  - otherwise selector() will raise a TypeError.
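The dispatch rules above can be restated as a small stand-alone sketch. It does not use or reproduce the XIST implementation: all classes below are empty stand-ins that merely share names with the real ones, and make_selector and any_selector are hypothetical names chosen for this illustration.

```python
class Node:
    """Stand-in for an XIST node class (not the real xsc.Node)."""


class Selector:
    """Stand-in base class for all toy selectors."""


class AnySelector(Selector):
    pass


class IsInstanceSelector(Selector):
    def __init__(self, cls):
        self.cls = cls


class IsSelector(Selector):
    def __init__(self, node):
        self.node = node


class CallableSelector(Selector):
    def __init__(self, func):
        self.func = func


class OrCombinator(Selector):
    def __init__(self, *selectors):
        self.selectors = selectors


any_selector = AnySelector()


def make_selector(*objs):
    """Hypothetical re-statement of the documented dispatch rules."""
    if not objs:
        return any_selector
    if len(objs) > 1:
        return OrCombinator(*(make_selector(obj) for obj in objs))
    obj = objs[0]
    if isinstance(obj, tuple):
        return OrCombinator(*(make_selector(item) for item in obj))
    if isinstance(obj, Selector):
        return obj
    # Classes are callable too, so the class check must precede the callable check.
    if isinstance(obj, type) and issubclass(obj, Node):
        return IsInstanceSelector(obj)
    if isinstance(obj, Node):
        return IsSelector(obj)
    if callable(obj):
        return CallableSelector(obj)
    if obj is None:
        return any_selector
    raise TypeError(f"can't convert {obj!r} to a selector")
```

The order of the checks matters in this sketch: because every class is callable, testing for a Node subclass has to happen before the generic callable test, otherwise classes would end up wrapped as callables.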
- class ll.xist.xfind.Selector
Bases: object
A selector specifies a condition that a node in an XIST tree must satisfy to match the selector.
Whether a node matches the selector can be specified by overriding the __contains__() method. Selectors can be combined with various operations (see methods below).
- __contains__(path)
Return whether path (which is a list of XIST nodes from the root of the tree to the node in question) matches the selector.
- __truediv__(other)
Create a ChildCombinator with self as the left hand selector and other as the right hand selector.
- __rtruediv__(other)
Create a ChildCombinator with other as the left hand selector and self as the right hand selector.
- __floordiv__(other)
Create a DescendantCombinator with self as the left hand selector and other as the right hand selector.
- __rfloordiv__(other)
Create a DescendantCombinator with other as the left hand selector and self as the right hand selector.
- __mul__(other)
Create an AdjacentSiblingCombinator with self as the left hand selector and other as the right hand selector.
- __rmul__(other)
Create an AdjacentSiblingCombinator with other as the left hand selector and self as the right hand selector.
- __pow__(other)
Create a GeneralSiblingCombinator with self as the left hand selector and other as the right hand selector.
- __rpow__(other)
Create a GeneralSiblingCombinator with other as the left hand selector and self as the right hand selector.
- __and__(other)
Create an AndCombinator from self and other.
- __rand__(other)
Create an AndCombinator from other and self.
- __or__(other)
Create an OrCombinator from self and other.
- __ror__(other)
Create an OrCombinator from other and self.
- __invert__()
Create a NotCombinator inverting self.
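Because selectors are combined with ordinary Python operators, Python's operator precedence determines how a compound expression is grouped. The following sketch, which is a toy model and not the XIST classes, records the operator applied at each step so the grouping becomes visible. Name and the generic Combinator class are hypothetical, and the reflected methods (__rtruediv__ and so on) are omitted for brevity.

```python
class Selector:
    """Toy selector that only records how it was combined."""

    def __truediv__(self, other):
        return Combinator("/", self, other)

    def __floordiv__(self, other):
        return Combinator("//", self, other)

    def __mul__(self, other):
        return Combinator("*", self, other)

    def __pow__(self, other):
        return Combinator("**", self, other)

    def __and__(self, other):
        return Combinator("&", self, other)

    def __or__(self, other):
        return Combinator("|", self, other)

    def __invert__(self):
        return Combinator("~", self)


class Name(Selector):
    """Hypothetical leaf selector identified by a name."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name


class Combinator(Selector):
    """Hypothetical node recording an operator and its operands."""

    def __init__(self, op, *operands):
        self.op = op
        self.operands = operands

    def __repr__(self):
        if len(self.operands) == 1:
            return f"({self.op}{self.operands[0]!r})"
        left, right = self.operands
        return f"({left!r} {self.op} {right!r})"


a, b, c = Name("a"), Name("b"), Name("c")
print(a / b // c)   # ((a / b) // c)
print(a | b & c)    # (a | (b & c))
print(a & ~b)       # (a & (~b))
print(a / b ** c)   # (a / (b ** c))
```

In particular `**` binds more tightly than `/`, `//` and `*`, and `&` binds more tightly than `|`, so explicit parentheses are advisable whenever a different grouping is intended.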
- class ll.xist.xfind.AnySelector
Bases: Selector
Selector that selects all nodes.
An instance of this class named any is created as a module global, i.e. you can use xfind.any.
- class ll.xist.xfind.IsInstanceSelector
Bases: Selector
Selector that selects all nodes that are instances of the specified type. You can either create an IsInstanceSelector object directly or simply pass a class to a function that expects a selector (this class will be automatically wrapped in an IsInstanceSelector):
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html, chars
>>> doc = parse.tree(
...     parse.URL("https://www.python.org/"),
...     parse.Tidy(),
...     parse.NS(html),
...     parse.Node(pool=xsc.Pool(xml, html, chars))
... )
>>> for node in doc.walknodes(html.a):
...     print(node.attrs.href, node.attrs.title)
...
https://www.python.org/#content Skip to content
https://www.python.org/#python-network
https://www.python.org/ The Python Programming Language
https://www.python.org/psf-landing/ The Python Software Foundation
...
- class ll.xist.xfind.element
Bases: Selector
Selector that selects all elements that have a specified namespace name and element name:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html, chars
>>> doc = parse.tree(
...     parse.URL("https://www.python.org/"),
...     parse.Tidy(),
...     parse.NS(html),
...     parse.Node(pool=xsc.Pool(xml, html, chars))
... )
>>> for node in doc.walknodes(xfind.element(html, "img")):
...     print(node.string())
...
<img alt="python™" class="python-logo" src="https://www.python.org/static/img/python-logo.png" />
- class ll.xist.xfind.procinst
Bases: Selector
Selector that selects all processing instructions that have a specified name.
- class ll.xist.xfind.entity
Bases: Selector
Selector that selects all entities that have a specified name.
- class ll.xist.xfind.IsSelector
Bases: Selector
Selector that selects one specific node in the tree. This can be combined with other selectors via ChildCombinator or DescendantCombinator selectors to select children of this specific node. You can either create an IsSelector directly or simply pass a node to a function that expects a selector:
>>> from ll.xist import xsc, parse
>>> from ll.xist.ns import xml, html, chars
>>> doc = parse.tree(
...     parse.URL("https://www.python.org/"),
...     parse.Tidy(),
...     parse.NS(html),
...     parse.Node(pool=xsc.Pool(xml, html, chars))
... )
>>> for node in doc.walknodes(doc[0]/xsc.Element):
...     print(repr(node))
...
<element ll.xist.ns.html.head xmlns='http://www.w3.org/1999/xhtml' (89 children/no attrs) location='https://www.python.org/:?:?' at 0x104ad7630>
<element ll.xist.ns.html.body xmlns='http://www.w3.org/1999/xhtml' (14 children/2 attrs) location='https://www.python.org/:?:?' at 0x104cc1f28>
- class ll.xist.xfind.IsRootSelector
Bases: Selector
Selector that selects the node that is the root of the traversal.
An instance of this class named isroot is created as a module global, i.e. you can use xfind.isroot.
- class ll.xist.xfind.IsEmptySelector
Bases: Selector
Selector that selects all empty elements or fragments.
An instance of this class named empty is created as a module global, i.e. you can use xfind.empty:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html, chars
>>> doc = parse.tree(
...     parse.URL("https://www.python.org/"),
...     parse.Tidy(),
...     parse.NS(html),
...     parse.Node(pool=xsc.Pool(xml, html, chars))
... )
>>> for node in doc.walknodes(xfind.empty):
...     print(node.string())
...
<meta charset="utf-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<link href="https://ajax.googleapis.com/" rel="prefetch" />
<meta name="application-name" content="Python.org" />
...
- class ll.xist.xfind.OnlyChildSelector
Bases: Selector
Selector that selects all nodes that are the only child of their parents.
An instance of this class named onlychild is created as a module global, i.e. you can use xfind.onlychild:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html, chars
>>> doc = parse.tree(
...     parse.URL("https://www.python.org/"),
...     parse.Tidy(),
...     parse.NS(html),
...     parse.Node(pool=xsc.Pool(xml, html, chars))
... )
>>> for node in doc.walknodes(xfind.onlychild & html.a):
...     print(node.string())
...
<a class="text-shrink" href="javascript:;" title="Make Text Smaller">Smaller</a>
<a class="text-grow" href="javascript:;" title="Make Text Larger">Larger</a>
<a class="text-reset" href="javascript:;" title="Reset any font size changes I have made">Reset</a>
<a href="http://plus.google.com/+Python"><span aria-hidden="true" class="icon-google-plus"></span>Google+</a>
...
- class ll.xist.xfind.OnlyOfTypeSelector
Bases: Selector
Selector that selects all nodes that are the only nodes of their type among their siblings.
An instance of this class named onlyoftype is created as a module global, i.e. you can use xfind.onlyoftype:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html, chars
>>> doc = parse.tree(
...     parse.URL("https://www.python.org/"),
...     parse.Tidy(),
...     parse.NS(html),
...     parse.Node(pool=xsc.Pool(xml, html, chars))
... )
>>> for node in doc.walknodes(xfind.onlyoftype & xsc.Element):
...     print(repr(node))
...
<element ll.xist.ns.html.html xmlns='http://www.w3.org/1999/xhtml' (7 children/3 attrs) location='https://www.python.org/:?:?' at 0x108858d30>
<element ll.xist.ns.html.head xmlns='http://www.w3.org/1999/xhtml' (89 children/no attrs) location='https://www.python.org/:?:?' at 0x108858630>
<element ll.xist.ns.html.title xmlns='http://www.w3.org/1999/xhtml' (1 child/no attrs) location='https://www.python.org/:?:?' at 0x108c547b8>
<element ll.xist.ns.html.body xmlns='http://www.w3.org/1999/xhtml' (14 children/2 attrs) location='https://www.python.org/:?:?' at 0x108c54eb8>
...
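The difference between onlychild and onlyoftype can be sketched on a plain list of siblings. This is a toy model under the assumption that a node's "type" is represented by its tag name; in XIST the type is the node's class. The function names only_child and only_of_type are hypothetical.

```python
def only_child(siblings, index):
    """True if the node at `index` is the single child of its parent."""
    return len(siblings) == 1


def only_of_type(siblings, index):
    """True if no other sibling shares the type of the node at `index`."""
    return siblings.count(siblings[index]) == 1


head_children = ["title", "meta", "meta", "link"]
for index, name in enumerate(head_children):
    print(name, only_child(head_children, index), only_of_type(head_children, index))
```

A node that is an only child is always the only node of its type among its siblings, but not the other way around: `title` above is the only `title`, yet it has three siblings.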
- class ll.xist.xfind.hasattr
Bases: Selector
Selector that selects all element nodes that have an attribute with one of the specified names. (Names can be strings, (attribute name, namespace name) tuples or attribute classes or instances):
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html, chars
>>> doc = parse.tree(
...     parse.URL("https://www.python.org/"),
...     parse.Tidy(),
...     parse.NS(html),
...     parse.Node(pool=xsc.Pool(xml, html, chars))
... )
>>> for node in doc.walknodes(xfind.hasattr("id")):
...     print(node.xmlname, node.attrs.id)
...
body homepage
div touchnav-wrapper
div top
a close-python-network
...
- class ll.xist.xfind.attrhasvalue
Bases: Selector
Selector that selects all element nodes where an attribute with the specified name has one of the specified values. (Names can be strings, (attribute name, namespace name) tuples or attribute classes or instances). Note that “fancy” attributes (i.e. those containing non-text) will not be considered:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html, chars
>>> doc = parse.tree(
...     parse.URL("https://www.python.org/"),
...     parse.Tidy(),
...     parse.NS(html),
...     parse.Node(pool=xsc.Pool(xml, html, chars))
... )
>>> for node in doc.walknodes(xfind.attrhasvalue("rel", "stylesheet")):
...     print(node.attrs.href)
...
https://www.python.org/static/stylesheets/style.css
https://www.python.org/static/stylesheets/mq.css
- class ll.xist.xfind.attrcontains
Bases: Selector
Selector that selects all element nodes where an attribute with the specified name contains one of the specified substrings in its value. (Names can be strings, (attribute name, namespace name) tuples or attribute classes or instances). Note that “fancy” attributes (i.e. those containing non-text) will not be considered:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html, chars
>>> doc = parse.tree(
...     parse.URL("https://www.python.org/"),
...     parse.Tidy(),
...     parse.NS(html),
...     parse.Node(pool=xsc.Pool(xml, html, chars))
... )
>>> for node in doc.walknodes(xfind.attrcontains("rel", "stylesheet")):
...     print(node.attrs.rel, node.attrs.href)
...
stylesheet https://www.python.org/static/stylesheets/style.css
stylesheet https://www.python.org/static/stylesheets/mq.css
- class ll.xist.xfind.attrstartswith
Bases: Selector
Selector that selects all element nodes where an attribute with the specified name starts with any of the specified strings. (Names can be strings, (attribute name, namespace name) tuples or attribute classes or instances). Note that “fancy” attributes (i.e. those containing non-text) will not be considered:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html, chars
>>> doc = parse.tree(
...     parse.URL("https://www.python.org/"),
...     parse.Tidy(),
...     parse.NS(html),
...     parse.Node(pool=xsc.Pool(xml, html, chars))
... )
>>> for node in doc.walknodes(xfind.attrstartswith("class", "icon-")):
...     print(node.bytes())
...
b'<span aria-hidden="true" class="icon-arrow-down"><span>\xe2\x96\xbc</span></span>'
b'<span aria-hidden="true" class="icon-arrow-up"><span>\xe2\x96\xb2</span></span>'
b'<span aria-hidden="true" class="icon-search"></span>'
b'<span aria-hidden="true" class="icon-facebook"></span>'
...
- class ll.xist.xfind.attrendswith
Bases: Selector
Selector that selects all element nodes where an attribute with the specified name ends with one of the specified strings. (Names can be strings, (attribute name, namespace name) tuples or attribute classes or instances). Note that “fancy” attributes (i.e. those containing non-text) will not be considered:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html, chars
>>> doc = parse.tree(
...     parse.URL("https://www.python.org/"),
...     parse.Tidy(),
...     parse.NS(html),
...     parse.Node(pool=xsc.Pool(xml, html, chars))
... )
>>> for node in doc.walknodes(xfind.attrendswith("href", ".css")):
...     print(node.attrs.href)
...
https://www.python.org/static/stylesheets/style.css
https://www.python.org/static/stylesheets/mq.css
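The four attribute value tests (attrhasvalue, attrcontains, attrstartswith, attrendswith) differ only in how the attribute value is compared with the given strings. The sketch below illustrates that difference on a plain dict of attributes; it is not the XIST implementation, ignores namespaces and "fancy" attributes, and uses hypothetical function names.

```python
def attr_has_value(attrs, name, *values):
    """The attribute exists and equals one of the values."""
    return name in attrs and attrs[name] in values


def attr_contains(attrs, name, *values):
    """The attribute exists and contains one of the values as a substring."""
    return name in attrs and any(value in attrs[name] for value in values)


def attr_starts_with(attrs, name, *values):
    """The attribute exists and starts with one of the values."""
    # str.startswith accepts a tuple of alternatives.
    return name in attrs and attrs[name].startswith(values)


def attr_ends_with(attrs, name, *values):
    """The attribute exists and ends with one of the values."""
    return name in attrs and attrs[name].endswith(values)


link = {
    "rel": "stylesheet",
    "href": "https://www.python.org/static/stylesheets/style.css",
}
print(attr_has_value(link, "rel", "stylesheet", "prefetch"))
print(attr_contains(link, "rel", "style"))
print(attr_starts_with(link, "href", "http://", "https://"))
print(attr_ends_with(link, "href", ".css"))
```

In each case several candidate strings may be passed, and a single match is sufficient.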
- class ll.xist.xfind.hasid
Bases: Selector
Selector that selects all element nodes where the id attribute has one of the specified values:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html, chars
>>> doc = parse.tree(
...     parse.URL("https://www.python.org/"),
...     parse.Tidy(),
...     parse.NS(html),
...     parse.Node(pool=xsc.Pool(xml, html, chars))
... )
>>> for node in doc.walknodes(xfind.hasid("id-search-field")):
...     print(node.string())
...
<input class="search-field" id="id-search-field" name="q" placeholder="Search" role="textbox" tabindex="1" type="search" />
- class ll.xist.xfind.hasclass
Bases: Selector
Selector that selects all element nodes where the class attribute contains one of the specified values:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html, chars
>>> doc = parse.tree(
...     parse.URL("https://www.python.org/"),
...     parse.Tidy(),
...     parse.NS(html),
...     parse.Node(pool=xsc.Pool(xml, html, chars))
... )
>>> for node in doc.walknodes(xfind.hasclass("tier-1")/html.a):
...     print(node.string())
...
A
A
Socialize
Sign In
About
Downloads
...
- class ll.xist.xfind.InAttrSelector
Bases: Selector
Selector that selects all attribute nodes and nodes inside of attributes:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html, chars
>>> doc = parse.tree(
...     parse.URL("https://www.python.org/"),
...     parse.Tidy(),
...     parse.NS(html),
...     parse.Node(pool=xsc.Pool(xml, html, chars))
... )
>>> for path in doc.walkpaths(xfind.inattr & xsc.Text, enterattrs=True, enterattr=True):
...     print(path[-3].xmlname, path[-2].xmlname, path[-1].string())
...
html class no-js
html dir ltr
html lang en
meta charset utf-8
meta content IE=edge
meta http-equiv X-UA-Compatible
...
- class ll.xist.xfind.Combinator
Bases: Selector
A Combinator is a selector that transforms one or combines two or more other selectors in a certain way.
- class ll.xist.xfind.BinaryCombinator
Bases: Combinator
A BinaryCombinator is a combinator that combines two selectors: the left hand selector and the right hand selector.
- class ll.xist.xfind.ChildCombinator
Bases: BinaryCombinator
A ChildCombinator is a BinaryCombinator. To match the ChildCombinator the node must match the right hand selector and its immediate parent must match the left hand selector (i.e. it works similarly to the > combinator in CSS or the / combinator in XPath).
ChildCombinator objects can be created via the division operator (/):
>>> from ll.xist import xsc, parse
>>> from ll.xist.ns import xml, html, chars
>>> doc = parse.tree(
...     parse.URL("https://www.python.org/"),
...     parse.Tidy(),
...     parse.NS(html),
...     parse.Node(pool=xsc.Pool(xml, html, chars))
... )
>>> for node in doc.walknodes(html.a/html.img):
...     print(node.string())
...
<img alt="python™" class="python-logo" src="https://www.python.org/static/img/python-logo.png" />
- class ll.xist.xfind.DescendantCombinator
Bases: BinaryCombinator
A DescendantCombinator is a BinaryCombinator. To match the DescendantCombinator the node must match the right hand selector and any of its ancestor nodes must match the left hand selector (i.e. it works similarly to the descendant combinator in CSS or the // combinator in XPath).
DescendantCombinator objects can be created via the floor division operator (//):
>>> from ll.xist import xsc, parse
>>> from ll.xist.ns import xml, html, chars
>>> doc = parse.tree(
...     parse.URL("https://www.python.org/"),
...     parse.Tidy(),
...     parse.NS(html),
...     parse.Node(pool=xsc.Pool(xml, html, chars))
... )
>>> for node in doc.walknodes(html.div//html.img):
...     print(node.string())
...
<img alt="python™" class="python-logo" src="https://www.python.org/static/img/python-logo.png" />
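The difference between the child and descendant rules can be sketched on paths. The following toy model assumes that a path is a list of tag names from the root to the node in question and that a selector is a plain function taking such a path. The names is_tag, child and descendant are hypothetical, and the code is not the XIST implementation.

```python
def is_tag(name):
    """Selector matching paths whose last node has the given tag name."""
    return lambda path: path[-1] == name


def child(left, right):
    """The node matches `right` and its immediate parent matches `left`."""
    def match(path):
        return len(path) >= 2 and right(path) and left(path[:-1])
    return match


def descendant(left, right):
    """The node matches `right` and at least one ancestor matches `left`."""
    def match(path):
        if not right(path):
            return False
        # path[:i] is the path of the ancestor at depth i (the node itself is excluded).
        return any(left(path[:i]) for i in range(1, len(path)))
    return match


path = ["html", "body", "div", "a", "img"]
print(child(is_tag("a"), is_tag("img"))(path))         # True
print(child(is_tag("div"), is_tag("img"))(path))       # False
print(descendant(is_tag("div"), is_tag("img"))(path))  # True
```

Passing the shortened path to the left hand selector means that the left hand side can itself be a combinator, which is what allows chains of several steps.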
- class ll.xist.xfind.AdjacentSiblingCombinator
Bases: BinaryCombinator
An AdjacentSiblingCombinator is a BinaryCombinator. To match the AdjacentSiblingCombinator the node must match the right hand selector and the immediately preceding sibling must match the left hand selector.
AdjacentSiblingCombinator objects can be created via the multiplication operator (*). The following example outputs all span elements that immediately follow a form element:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html, chars
>>> doc = parse.tree(
...     parse.URL("https://www.python.org/"),
...     parse.Tidy(),
...     parse.NS(html),
...     parse.Node(pool=xsc.Pool(xml, html, chars))
... )
>>> for node in doc.walknodes(html.form*html.span):
...     print(node.string())
...
<span class="breaker"></span>
- class ll.xist.xfind.GeneralSiblingCombinator
Bases: BinaryCombinator
A GeneralSiblingCombinator is a BinaryCombinator. To match the GeneralSiblingCombinator the node must match the right hand selector and any of the preceding siblings must match the left hand selector.
GeneralSiblingCombinator objects can be created via the exponentiation operator (**). The following example outputs all meta elements that come after a link element:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html, chars
>>> doc = parse.tree(
...     parse.URL("https://www.python.org/"),
...     parse.Tidy(),
...     parse.NS(html),
...     parse.Node(pool=xsc.Pool(xml, html, chars))
... )
>>> for node in doc.walknodes(html.link**html.meta):
...     print(node.string())
...
<meta name="application-name" content="Python.org" />
<meta name="msapplication-tooltip" content="The official home of the Python Programming Language" />
<meta name="apple-mobile-web-app-title" content="Python.org" />
<meta name="apple-mobile-web-app-capable" content="yes" />
<meta name="apple-mobile-web-app-status-bar-style" content="black" />
...
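The two sibling rules can be compared on a plain list of siblings. This is a toy sketch, assuming that siblings are represented by tag names in document order and that a selector is a function taking a single name. The names adjacent_sibling and general_sibling are hypothetical and the code does not reproduce the XIST implementation.

```python
def adjacent_sibling(left, right):
    """The node matches `right` and the sibling directly before it matches `left`."""
    def match(siblings, index):
        return index > 0 and right(siblings[index]) and left(siblings[index - 1])
    return match


def general_sibling(left, right):
    """The node matches `right` and any earlier sibling matches `left`."""
    def match(siblings, index):
        return right(siblings[index]) and any(left(s) for s in siblings[:index])
    return match


def is_link(name):
    return name == "link"


def is_meta(name):
    return name == "meta"


siblings = ["meta", "link", "meta", "meta"]
adjacent = adjacent_sibling(is_link, is_meta)
general = general_sibling(is_link, is_meta)
print([i for i in range(len(siblings)) if adjacent(siblings, i)])  # [2]
print([i for i in range(len(siblings)) if general(siblings, i)])   # [2, 3]
```

Every node selected by the adjacent rule is also selected by the general rule, since the immediately preceding sibling is one of the preceding siblings.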
- class ll.xist.xfind.ChainedCombinator
Bases: Combinator
A ChainedCombinator combines any number of other selectors.
- class ll.xist.xfind.OrCombinator
Bases: ChainedCombinator
An OrCombinator is a ChainedCombinator where the node must match at least one of the selectors to match the OrCombinator. An OrCombinator can be created with the binary or operator (|):
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html, chars
>>> doc = parse.tree(
...     parse.URL("https://www.python.org/"),
...     parse.Tidy(),
...     parse.NS(html),
...     parse.Node(pool=xsc.Pool(xml, html, chars))
... )
>>> for node in doc.walknodes(xfind.hasattr("href") | xfind.hasattr("src")):
...     print(node.attrs.href if "href" in node.Attrs else node.attrs.src)
...
https://ajax.googleapis.com/
https://www.python.org/static/js/libs/modernizr.js
https://www.python.org/static/stylesheets/style.css
https://www.python.org/static/stylesheets/mq.css
https://www.python.org/static/favicon.ico
...
- class ll.xist.xfind.AndCombinator
Bases: ChainedCombinator
An AndCombinator is a ChainedCombinator where the node must match all of the combined selectors to match the AndCombinator. An AndCombinator can be created with the binary and operator (&):
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html, chars
>>> doc = parse.tree(
...     parse.URL("https://www.python.org/"),
...     parse.Tidy(),
...     parse.NS(html),
...     parse.Node(pool=xsc.Pool(xml, html, chars))
... )
>>> for node in doc.walknodes(html.input & xfind.hasattr("id")):
...     print(node.string())
...
<input class="search-field" id="id-search-field" name="q" placeholder="Search" role="textbox" tabindex="1" type="search" />
- class ll.xist.xfind.NotCombinator
Bases: Combinator
A NotCombinator inverts the selection logic of the underlying selector, i.e. a node matches only if it does not match the underlying selector. A NotCombinator can be created with the unary inversion operator (~).
The following example outputs all internal scripts:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html, chars
>>> doc = parse.tree(
...     parse.URL("https://www.python.org/"),
...     parse.Tidy(),
...     parse.NS(html),
...     parse.Node(pool=xsc.Pool(xml, html, chars))
... )
>>> for node in doc.walknodes(html.script & ~xfind.hasattr("src")):
...     print(node.string())
...
<script type="text/javascript">
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-39055973-1']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>
<script>window.jQuery || document.write('<script src="/static/js/libs/jquery-1.8.2.min.js"><\/script>')</script>
- class ll.xist.xfind.CallableSelector
Bases: Selector
A CallableSelector is a selector that calls a user specified callable to select nodes. The callable gets passed the path and must return a bool specifying whether this path is selected. A CallableSelector is created implicitly whenever a callable is passed to a method that expects a selector.
The following example outputs all links that point outside the python.org domain:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html, chars
>>> doc = parse.tree(
...     parse.URL("https://www.python.org/"),
...     parse.Tidy(),
...     parse.NS(html),
...     parse.Node(pool=xsc.Pool(xml, html, chars))
... )
>>> def isextlink(path):
...     return isinstance(path[-1], html.a) and not str(path[-1].attrs.href).startswith("https://www.python.org")
...
>>> for node in doc.walknodes(isextlink):
...     print(node.string())
...
<a href="http://docs.python.org/" title="Python Documentation">Docs</a>
<a href="https://pypi.python.org/" title="Python Package Index">PyPI</a>
<a class="text-shrink" href="javascript:;" title="Make Text Smaller">Smaller</a>
<a class="text-grow" href="javascript:;" title="Make Text Larger">Larger</a>
...
- class ll.xist.xfind.nthchild
Bases: Selector
An nthchild object is a selector that selects every node that is the n-th child of its parent. E.g. nthchild(0) selects every first child, nthchild(-1) selects each last child. Furthermore nthchild("even") selects each first, third, fifth, … child and nthchild("odd") selects each second, fourth, sixth, … child.
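The index conventions described above (zero-based positions, negative indices counting from the end, and "even"/"odd" referring to zero-based positions) can be sketched as follows. This is a toy illustration on a plain list, not the XIST implementation; nth_matches and select are hypothetical names.

```python
def nth_matches(index, position, count):
    """Return whether the child at zero-based `position` (of `count` children) is selected."""
    if index == "even":
        # Zero-based positions 0, 2, 4, ... are the first, third, fifth, ... child.
        return position % 2 == 0
    if index == "odd":
        # Zero-based positions 1, 3, 5, ... are the second, fourth, sixth, ... child.
        return position % 2 == 1
    if index < 0:
        # Negative indices count from the end, as with Python sequences.
        index += count
    return position == index


def select(index, children):
    """Return the children selected by `index`."""
    return [
        child
        for position, child in enumerate(children)
        if nth_matches(index, position, len(children))
    ]


children = ["a", "b", "c", "d", "e"]
print(select(0, children))       # ['a']
print(select(-1, children))      # ['e']
print(select("even", children))  # ['a', 'c', 'e']
print(select("odd", children))   # ['b', 'd']
```

Note that "even" and "odd" follow the zero-based numbering here, so "even" includes the first child. This is the opposite of the one-based convention used by :nth-child(even) in CSS.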
- class ll.xist.xfind.nthoftype
Bases: Selector
An nthoftype object is a selector that selects every node that is the n-th node of a specified type among its siblings. Similar to nthchild, nthoftype supports negative and positive indices as well as "even" and "odd". Which types are checked can be passed explicitly. If no types are passed the type of the node itself is used:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html, chars
>>> doc = parse.tree(
...     parse.URL("https://www.python.org/"),
...     parse.Tidy(),
...     parse.NS(html),
...     parse.Node(pool=xsc.Pool(xml, html, chars))
... )
>>> for node in doc.walknodes(xfind.nthoftype(0, html.h2)):
...     print(node.string())
...
<h2 class="widget-title"><span aria-hidden="true" class="icon-get-started"></span>Get Started</h2>
<h2 class="widget-title"><span aria-hidden="true" class="icon-download"></span>Download</h2>
<h2 class="widget-title"><span aria-hidden="true" class="icon-documentation"></span>Docs</h2>
<h2 class="widget-title"><span aria-hidden="true" class="icon-jobs"></span>Jobs</h2>
...