meerqat.data.wikidump module
input/output: entities.json
Parses the dump (which should be downloaded first, TODO add instructions), gathers images and assigns them to the relevant entity given its common categories (retrieved in wiki.py commons rest).
Note that the wikicode is parsed very lazily and might need a second run depending on your application, e.g. templates are not expanded…
Usage: wikidump.py <subset>
- meerqat.data.wikidump.find(element, tag, namespace={'mw': 'http://www.mediawiki.org/xml/export-0.10/'})
Tests whether element is None before returning the result of ET.Element.find, so lookups can be chained even when a parent element is missing.
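The behaviour described above can be illustrated with a minimal sketch. This is not the module's actual source: it assumes the helper returns None when given None and otherwise delegates to `Element.find` with the namespace map. The sample XML string and the variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Default namespace map, taken from the signature documented above.
NAMESPACE = {"mw": "http://www.mediawiki.org/xml/export-0.10/"}


def find(element, tag, namespace=NAMESPACE):
    """None-safe wrapper around Element.find (assumed behaviour)."""
    if element is None:
        return None
    return element.find(tag, namespace)


# Hypothetical, heavily shortened dump fragment used only for this example.
sample = (
    '<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">'
    "<page><title>Foo</title></page>"
    "</mediawiki>"
)
root = ET.fromstring(sample)

# Chained lookups: each call tolerates a missing parent.
title = find(find(root, "mw:page"), "mw:title")
missing = find(find(root, "mw:revision"), "mw:text")

print(title.text)  # prints "Foo"
print(missing)     # prints "None"
```

With a wrapper like this, a chain of lookups on a page that lacks an intermediate element yields None rather than raising AttributeError on a None parent.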