Towards Concept-Aware Large Language Models

conference contribution
posted on 2024-04-08, 08:37 authored by Chen Shani, Jilles Vreeken, Dafna Shahaf
Concepts play a pivotal role in various human cognitive functions, including learning, reasoning, and communication. However, there is very little work on endowing machines with the ability to form and reason with concepts. In particular, state-of-the-art large language models (LLMs) work at the level of tokens, not concepts. In this work, we analyze how well contemporary LLMs capture human concepts and their structure. We then discuss ways to develop concept-aware LLMs, intervening at different stages of the pipeline: we sketch a method for pretraining LLMs on concepts, and also explore a simpler approach that operates on the output of existing LLMs. Despite its simplicity, our proof-of-concept better matches human intuition and improves the robustness of predictions. These preliminary results underscore the promise of concept-aware LLMs.
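To make the "simpler approach" mentioned in the abstract concrete, here is a minimal sketch (not the authors' code) of aggregating an existing LLM's next-token distribution into concept-level scores. It assumes a Hugging Face causal LM, and the concept clusters are hand-picked for illustration; the paper's actual concept-formation procedure may differ.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Any off-the-shelf causal LM works for this sketch; "gpt2" is an arbitrary choice.
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "For breakfast she poured herself a glass of"
inputs = tok(prompt, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]  # logits for the next token
probs = torch.softmax(logits, dim=-1)

# Hypothetical concept clusters: each concept groups several surface forms.
# These are illustrative only; the paper induces its concepts from data.
concepts = {
    "beverage": [" milk", " juice", " water", " wine"],
    "container": [" cup", " bottle", " bowl"],
}

# Concept score = sum of the member tokens' probabilities, so the prediction
# does not penalize probability mass that is split across near-synonyms.
for concept, words in concepts.items():
    ids = [tok.encode(w)[0] for w in words]  # first sub-token of each word
    print(concept, float(probs[ids].sum()))

The point of the aggregation step is robustness: a token-level LLM may spread probability over " milk", " juice", and " water" so that no single token ranks highly, while the concept "beverage" as a whole is clearly the model's best guess.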

History

Editor

Bouamor, Houda; Pino, Juan; Bali, Kalika

Primary Research Area

  • Trustworthy Information Processing

Name of Conference

Conference on Empirical Methods in Natural Language Processing (EMNLP)

Journal

Findings of the Association for Computational Linguistics: EMNLP 2023

Page Range

13158–13170

Publisher

Association for Computational Linguistics (ACL)

Open Access Type

  • Hybrid

BibTeX

@inproceedings{Shani:Vreeken:Shahaf:2023,
  title     = "Towards Concept-Aware Large Language Models",
  author    = "Shani, Chen and Vreeken, Jilles and Shahaf, Dafna",
  editor    = "Bouamor, Houda and Pino, Juan and Bali, Kalika",
  year      = 2023,
  month     = dec,
  booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2023",
  pages     = "13158--13170",
  publisher = "Association for Computational Linguistics (ACL)",
  doi       = "10.18653/v1/2023.findings-emnlp.877"
}
