Structured information extraction from scientific text with large language models

J. Dagdelen, A. Dunn, S. Lee, N. Walker, A. Rosen, G. Ceder, K. Persson, and A. Jain. Nature Communications, 15 (1): 1418 (15 February 2024)
DOI: 10.1038/s41467-024-45563-x

Abstract

Extracting structured knowledge from scientific text remains a challenging task for machine learning models. Here, we present a simple approach to joint named entity recognition and relation extraction and demonstrate how pretrained large language models (GPT-3, Llama-2) can be fine-tuned to extract useful records of complex scientific knowledge. We test three representative tasks in materials chemistry: linking dopants and host materials, cataloging metal-organic frameworks, and general composition/phase/morphology/application information extraction. Records are extracted from single sentences or entire paragraphs, and the output can be returned as simple English sentences or a more structured format such as a list of JSON objects. This approach represents a simple, accessible, and highly flexible route to obtaining large databases of structured specialized scientific knowledge extracted from research papers.
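
The abstract mentions that output can be returned as a list of JSON objects. As a rough illustration of what consuming such output might look like for the dopant/host linking task, here is a minimal Python sketch. The schema (the `host` and `dopants` keys), the `DopingRecord` class, the `parse_completion` helper, and the sample completion string are all hypothetical and are not taken from the paper; the paper's actual output schemas may differ.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class DopingRecord:
    """One host material and the dopants linked to it (hypothetical schema)."""
    host: str
    dopants: List[str] = field(default_factory=list)


def parse_completion(completion: str) -> List[DopingRecord]:
    """Parse a model completion expected to contain a JSON list of objects.

    Returns an empty list if the text is not valid JSON or is not a list.
    Entries that are not objects, or that lack a non-empty "host" string,
    are skipped instead of raising an error.
    """
    try:
        payload = json.loads(completion)
    except json.JSONDecodeError:
        return []
    if not isinstance(payload, list):
        return []

    records = []
    for item in payload:
        if not isinstance(item, dict):
            continue
        host = item.get("host")
        if not isinstance(host, str) or not host:
            continue
        dopants = item.get("dopants", [])
        if not isinstance(dopants, list):
            dopants = []
        records.append(DopingRecord(host=host, dopants=[str(d) for d in dopants]))
    return records


# Made-up completion text for illustration only; not output from any real model.
example = '[{"host": "ZnO", "dopants": ["Al", "Ga"]}, {"dopants": ["Mn"]}]'
records = parse_completion(example)
```

Validating each entry before storing it is one possible design choice: generated text is not guaranteed to be well-formed JSON, so a tolerant parser keeps a single bad completion from halting a large extraction run.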

Links and resources

BibTeX key: Dagdelen2024
entry type: article
year: 2024
month: feb
day: 15
journal: Nature Communications
number: 1
pages: 1418
volume: 15
issn: 2041-1723
DOI: 10.1038/s41467-024-45563-x
url: https://doi.org/10.1038/s41467-024-45563-x

Metadata

Last modified 3 months ago
Created 3 months ago