The multilingual lexicon of toponyms (WikiTopoPl) contains over 155,000 Polish geographical proper names (countries, cities, regions, hydronyms, etc.) together with their equivalents in Bulgarian, Croatian, English, German, Modern Greek, Hungarian, Romanian, Serbian and Slovak. The names and their equivalents, wherever available, were extracted automatically from the open encyclopedia Wikipedia. The Wikipedia categories attached to the lexicon entries have been mapped to a short list of succinct categories compliant with Prolexbase, a multilingual ontology of proper names.
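The category mapping described above can be pictured as a simple lookup from fine-grained Wikipedia categories to a few coarse types. The following is only a minimal sketch: the Polish category names, the coarse labels and the `map_categories` helper are all hypothetical illustrations, not the actual mapping table or code used to build WikiTopoPl.

```python
# Hypothetical mapping table: these category names and coarse labels are
# illustrative assumptions, not taken from WikiTopoPl or Prolexbase.
CATEGORY_MAP = {
    "Miasta w Polsce": "city",
    "Rzeki w Polsce": "hydronym",
    "Państwa w Europie": "country",
    "Regiony Polski": "region",
}


def map_categories(wikipedia_categories):
    """Return the sorted, de-duplicated coarse types for one lexicon entry.

    Categories that have no entry in CATEGORY_MAP are silently skipped.
    """
    coarse = {
        CATEGORY_MAP[category]
        for category in wikipedia_categories
        if category in CATEGORY_MAP
    }
    return sorted(coarse)


# An entry tagged with one mapped and one unmapped category.
print(map_categories(["Miasta w Polsce", "Artykuły do weryfikacji"]))
```

Skipping unmapped categories keeps the sketch short; a real pipeline might instead log them so that the mapping table can be extended.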
Translation counts for these Polish names, by language:
- over 8,000 Bulgarian translations,
- over 4,375 Croatian translations,
- over 43,000 German translations,
- over 3,000 modern Greek translations,
- over 16,000 Hungarian translations,
- over 155,000 English translations,
- over 19,000 Romanian translations,
- over 12,000 Slovak translations,
- over 21,000 Serbian translations.
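To compare languages, the figures above can be turned into rough coverage ratios relative to the number of Polish entries. This sketch treats every "over N" figure as an approximate count, which is an assumption, so the percentages are indicative only; the names `POLISH_ENTRIES`, `TRANSLATIONS` and `coverage` are introduced here for illustration.

```python
# Approximate counts copied from the list above; each "over N" figure is
# treated as if it were exactly N, which is an assumption of this sketch.
POLISH_ENTRIES = 155_000
TRANSLATIONS = {
    "Bulgarian": 8_000,
    "Croatian": 4_375,
    "German": 43_000,
    "Modern Greek": 3_000,
    "Hungarian": 16_000,
    "English": 155_000,
    "Romanian": 19_000,
    "Slovak": 12_000,
    "Serbian": 21_000,
}


def coverage(count, total=POLISH_ENTRIES):
    """Return the share of Polish entries with a translation, in percent."""
    return 100.0 * count / total


# Print the languages from best to worst covered.
for language, count in sorted(TRANSLATIONS.items(), key=lambda item: -item[1]):
    print(f"{language:<13}{coverage(count):6.1f}%")
```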
- Leszek Manicki
The lexicon is available under the Creative Commons Attribution-ShareAlike 3.0 Unported License (CC BY-SA 3.0).
The lexicon could be expanded by:
- using a newer Wikipedia dump,
- adding new languages to the lexicon,
- expanding the set of Wikipedia article categories to be included in the lexicon.