To help researchers investigate relation extraction, we’re releasing a human-judged dataset of two relations about public figures on Wikipedia: nearly 10,000 examples of “place of birth”, and over 40,000 examples of “attended or graduated from an institution”. Each example was judged by at least five raters, and the dataset can be used to train or evaluate relation extraction systems. We also plan to release more relations of new types in the coming months.
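Because each example carries several independent rater judgments, a common first step before training or evaluation is to collapse them into a single label. The following is a minimal sketch of a majority vote, not the release's own tooling. The record layout and the field names (`pred`, `sub`, `obj`, `judgments`, `rater`, `judgment`), along with the `majority_label` helper and its `min_raters` parameter, are assumptions made for illustration; check them against the actual files.

```python
from collections import Counter


def majority_label(judgments, min_raters=5):
    """Return the majority judgment, or None if there are too few raters or a tie.

    `judgments` is assumed to be a list of dicts with a "judgment" key
    (hypothetical layout, not confirmed by the announcement).
    """
    votes = Counter(j["judgment"] for j in judgments)
    if sum(votes.values()) < min_raters:
        return None  # fewer raters than the stated minimum; treat as unusable
    ranked = votes.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # tied vote; no majority
    return ranked[0][0]


# Hypothetical record: all identifiers and values below are placeholders.
example = {
    "pred": "place_of_birth",
    "sub": "PERSON_ID",
    "obj": "PLACE_ID",
    "judgments": [
        {"rater": "r1", "judgment": "yes"},
        {"rater": "r2", "judgment": "yes"},
        {"rater": "r3", "judgment": "no"},
        {"rater": "r4", "judgment": "yes"},
        {"rater": "r5", "judgment": "yes"},
    ],
}

label = majority_label(example["judgments"])
```

Returning `None` for ties and under-judged examples keeps ambiguous cases out of the training or evaluation set instead of forcing a label onto them; whether that is the right policy depends on how the data is used.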
M. Schwab, R. Jäschke, and F. Fischer. Proceedings of the 7th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, pp. 110–115. Association for Computational Linguistics, (2023)
M. Schwab, R. Jäschke, and F. Fischer. Proceedings of the 5th International Conference on Natural Language and Speech Processing, pp. 282–287. Association for Computational Linguistics, (2022)
F. Arnold and R. Jäschke. Proceedings of the Workshop Understanding LIterature references in academic full TExt at JCDL 2022, volume 3220 of ULITE-ws '22, pp. 7–15. CEUR Workshop Proceedings, (2022)
G. Muzny, M. Fang, A. Chang, and D. Jurafsky. Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, pp. 460–470. Valencia, Spain, Association for Computational Linguistics, (April 2017)
C. Scheible, R. Klinger, and S. Padó. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 1736–1745. Berlin, Germany, Association for Computational Linguistics, (August 2016)
B. Powley and R. Dale. Large Scale Semantic Access to Content (Text, Image, Video, and Sound), pp. 618–632. Paris, France, LE CENTRE DE HAUTES ETUDES INTERNATIONALES D'INFORMATIQUE DOCUMENTAIRE, (2007)