Abstract
Large-scale knowledge graph construction remains infeasible because it requires significant human-expert involvement. Building graphs from domain-specific data is further complicated by unique vocabularies and their associated contexts. In this work, we demonstrate the ability of open-source large language models (LLMs), such as Llama-2 and Llama-3, to extract facts from domain-specific Maintenance Short Texts (MSTs). We employ an approach that combines ontology-guided triplet extraction with in-context learning. Using only 20 semantically similar examples with the Llama-3-70B-Instruct model, we achieve performance comparable to previous methods that relied on fine-tuned models such as SpERT and REBEL. This indicates that domain-specific fact extraction can be accomplished through inference alone, with minimal labeled data, opening up possibilities for effective and efficient semi-automated knowledge graph construction for domain-specific data.
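To illustrate the in-context learning setup the abstract describes, the sketch below retrieves the most semantically similar labeled examples for a query MST and assembles a few-shot extraction prompt. This is a minimal sketch, not the paper's actual pipeline: the `sentence-transformers` embedder, the relation set in `ONTOLOGY_HINT`, the labeled pool, and all example triplets are illustrative assumptions.

```python
# Sketch: retrieving semantically similar examples for in-context triplet extraction.
# Assumptions (not from the paper): sentence-transformers for similarity retrieval,
# a small in-memory pool of annotated MSTs, and an illustrative relation set.
from sentence_transformers import SentenceTransformer, util

# Hypothetical annotated pool of (text, triplets) pairs.
LABELED_POOL = [
    ("replace worn pump seal", "(pump seal, has_state, worn); (pump seal, activity, replace)"),
    ("engine oil leak repaired", "(engine, has_part, oil system); (oil leak, activity, repair)"),
    # ... more annotated examples ...
]

# Ontology guidance injected into the prompt; relation names are illustrative.
ONTOLOGY_HINT = (
    "Extract (head, relation, tail) triplets from the maintenance text. "
    "Allowed relations: has_part, has_state, activity."
)

embedder = SentenceTransformer("all-MiniLM-L6-v2")
pool_embeddings = embedder.encode([text for text, _ in LABELED_POOL], convert_to_tensor=True)

def build_prompt(query: str, k: int = 20) -> str:
    """Select the k most similar labeled MSTs and assemble a few-shot prompt."""
    query_emb = embedder.encode(query, convert_to_tensor=True)
    scores = util.cos_sim(query_emb, pool_embeddings)[0]          # cosine similarity to each pool text
    top = scores.topk(min(k, len(LABELED_POOL))).indices.tolist()  # indices of the most similar examples
    shots = "\n\n".join(
        f"Text: {LABELED_POOL[i][0]}\nTriplets: {LABELED_POOL[i][1]}" for i in top
    )
    return f"{ONTOLOGY_HINT}\n\n{shots}\n\nText: {query}\nTriplets:"

prompt = build_prompt("hydraulic hose leaking badly")
# The assembled prompt would then be sent to an instruction-tuned LLM
# such as Llama-3-70B-Instruct, whose completion is parsed into triplets.
print(prompt)
```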
Original language | English
---|---
Title of host publication | Proceedings of the 1st Workshop on Knowledge Graphs and Large Language Models (KaLLM 2024)
Editors | Russa Biswas, Lucie-Aimée Kaffee, Oshin Agarwal, Pasquale Minervini, Sameer Singh, Gerard de Melo
Publisher | Association for Computational Linguistics (ACL)
Pages | 75-84
Number of pages | 10
ISBN (Electronic) | 979-8-89176-147-6
Publication status | Published - 15 Aug 2024
Event | 1st Workshop on Knowledge Graphs and Large Language Models, KaLLM 2024 - Bangkok, Thailand. Duration: 15 Aug 2024 → 15 Aug 2024
Conference
Conference | 1st Workshop on Knowledge Graphs and Large Language Models, KaLLM 2024 |
---|---
Abbreviated title | KaLLM 2024 |
Country/Territory | Thailand |
City | Bangkok |
Period | 15/08/24 → 15/08/24 |
Funding
This work was made possible by the TKI MATTER grant. We would also like to thank Mykola Pechenizkiy, Tyler Bikaun, and Simon Koop for their comments.