α ILP: thinking visual scenes as differentiable logic programs

Research output: Contribution to journal, Article, Academic, peer-review

25 Citations (Scopus)

Abstract

Deep neural learning has shown remarkable performance at learning representations for visual object categorization. However, deep neural networks such as CNNs do not explicitly encode objects and relations among them. This limits their success on tasks that require a deep logical understanding of visual scenes, such as Kandinsky patterns and Bongard problems. To overcome these limitations, we introduce αILP, a novel differentiable inductive logic programming framework that learns to represent scenes as logic programs—intuitively, logical atoms correspond to objects, attributes, and relations, and clauses encode high-level scene information. αILP has an end-to-end reasoning architecture from visual inputs. Using it, αILP performs differentiable inductive logic programming on complex visual scenes, i.e., the logical rules are learned by gradient descent. Our extensive experiments on Kandinsky patterns and CLEVR-Hans benchmarks demonstrate the accuracy and efficiency of αILP in learning complex visual-logical concepts.
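The idea of learning logical rules by gradient descent can be illustrated with a minimal sketch. This is not the αILP implementation: the scenes, the three candidate clauses, the hand-written attribute probabilities (standing in for a perception module), and all function names are assumptions made for illustration. Each candidate clause gets a weight, a softmax over the weights mixes the clause valuations into one prediction, and binary cross-entropy is minimized with hand-derived gradients.

```python
import math

# Hypothetical perception output: each scene is a list of objects, and each
# object maps an attribute atom to a probability (hand-written here).
def obj(color, shape):
    return {"red": 1.0 if color == "red" else 0.0,
            "circle": 1.0 if shape == "circle" else 0.0}

# Toy data, labelled by the hidden concept "the scene contains a red circle".
SCENES = [
    ([obj("red", "circle"), obj("blue", "square")], 1.0),
    ([obj("red", "square"), obj("blue", "circle")], 0.0),
    ([obj("blue", "square")], 0.0),
    ([obj("red", "circle")], 1.0),
    ([obj("red", "square")], 0.0),
    ([obj("blue", "circle")], 0.0),
]

# Illustrative candidate clauses. A conjunction in the body is a product of
# atom probabilities; the existential over objects is a max.
CLAUSES = {
    "positive :- in(O), red(O).": lambda o: o["red"],
    "positive :- in(O), circle(O).": lambda o: o["circle"],
    "positive :- in(O), red(O), circle(O).": lambda o: o["red"] * o["circle"],
}
NAMES = list(CLAUSES)

def clause_valuations(scene):
    """Soft truth value of each candidate clause on one scene."""
    return [max(body(o) for o in scene) for body in CLAUSES.values()]

def softmax(ws):
    m = max(ws)
    exps = [math.exp(w - m) for w in ws]
    z = sum(exps)
    return [e / z for e in exps]

def train(steps=300, lr=0.5, eps=1e-6):
    """Full-batch gradient descent on the clause weights."""
    weights = [0.0] * len(NAMES)
    for _ in range(steps):
        s = softmax(weights)
        grads = [0.0] * len(weights)
        for scene, y in SCENES:
            v = clause_valuations(scene)
            # Prediction: softmax-weighted mixture of clause valuations.
            p = sum(si * vi for si, vi in zip(s, v))
            pc = min(max(p, eps), 1.0 - eps)  # clamp for a finite loss
            dloss_dp = -y / pc + (1.0 - y) / (1.0 - pc)  # d(BCE)/dp
            for j in range(len(weights)):
                # dp/dw_j = s_j * (v_j - p) for a softmax mixture
                grads[j] += dloss_dp * s[j] * (v[j] - p)
        weights = [w - lr * g / len(SCENES) for w, g in zip(weights, grads)]
    return weights

weights = train()
best_clause = NAMES[max(range(len(weights)), key=weights.__getitem__)]
print(best_clause)
```

On this toy data the two single-attribute clauses produce false positives, so the gradient moves weight toward the clause that requires a red circle. A real system would obtain the attribute probabilities from a learned perception model and would search a much larger clause space.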

Original language: English
Pages (from-to): 1465-1497
Number of pages: 33
Journal: Machine Learning
Volume: 112
Issue number: 5
DOIs
Publication status: Published - May 2023
Externally published: Yes

Funding

Open Access funding enabled and organized by Projekt DEAL. This work was supported by the AI lighthouse project “SPAICER” (01MK20015E), the EU ICT-48 Network of AI Research Excellence Center “TAILOR” (EU Horizon 2020, GA No 952215), and the Collaboration Lab “AI in Construction” (AICO). The work has also benefited from the Hessian Ministry of Higher Education, Research, Science and the Arts (HMWK) cluster projects “The Third Wave of AI” and “The Adaptive Mind”.

Funders
Technische Universität Darmstadt

Keywords

• Differentiable reasoning
• Inductive logic programming
• Neuro-symbolic AI
• Object-centric learning
