meerqat.data.wikidump module
input/output: entities.json
Parses the dump (which should be downloaded first, TODO add instructions), gathers images, and assigns them to the relevant entity given its common categories (retrieved in wiki.py commons rest).
Note that the wikicode is parsed very lazily and might need a second run depending on your application, e.g. templates are not expanded…
Usage: wikidump.py <subset>
- meerqat.data.wikidump.find(element, tag, namespace={'mw': 'http://www.mediawiki.org/xml/export-0.10/'})
Tests whether element is None before returning ET.Element.find, so a lookup on a missing parent yields None instead of raising.
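The behaviour described above can be sketched as follows. This is a minimal illustration inferred from the signature and the one-line description, not necessarily the module's actual code; the `NAMESPACE` constant name and the sample XML are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Assumed constant name; the value is the default from the documented signature.
NAMESPACE = {'mw': 'http://www.mediawiki.org/xml/export-0.10/'}


def find(element, tag, namespace=NAMESPACE):
    """None-safe wrapper around ET.Element.find."""
    if element is None:
        return None
    return element.find(tag, namespace)


# Hypothetical, heavily trimmed dump fragment used only for illustration.
sample = (
    '<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">'
    '<page><title>Example</title>'
    '<revision><text>hello</text></revision></page>'
    '</mediawiki>'
)
root = ET.fromstring(sample)
page = find(root, 'mw:page')

# Chained lookups do not need intermediate None checks.
text = find(find(page, 'mw:revision'), 'mw:text')
missing = find(find(page, 'mw:redirect'), 'mw:title')
```

The point of the wrapper is that nested lookups can be chained: when an intermediate element is absent, the outer call returns None rather than raising AttributeError.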