🥇 1st Place — CLEF 2026 GutBrainIE (all 4 subtasks)

GutBrainIE 2026 NER · Linking · RE

📅 2026 👤 Co-First Author & Corresponding 🏛️ CLEF 2026 Working Notes (BioASQ) 📂 GitHub Repository

Overview

NightSun is a single sequential pipeline that addresses all four GutBrainIE 2026 subtasks over gut-brain axis biomedical abstracts: named entity recognition (T611), entity disambiguation / linking (T612), and mention- and concept-level relation extraction (T621/T622). T611 entities feed T612 disambiguation and T621 relation extraction; T621 mention-level triples are then lifted to canonical concept URIs via T612.

🏆 Achievement: Ranked 1st in all four official subtasks. The system builds on our prior 2025 first-place NER system, extending it end-to-end to entity linking and relation extraction.

Why this domain is hard

Heterogeneous vocabulary: a single abstract may reference microorganism taxonomy (Lactobacillus rhamnosus GG), dietary components (inulin), psychiatric disorders (depression), and statistical methods (ANOVA) — each demanding a different recognition signal.
Overlapping categories: the 13 entity types have fuzzy boundaries — microbiome vs. bacteria, food vs. dietary supplement, DDF spanning both diseases and clinical findings.
Evolving taxonomy: NCBI Taxonomy and GTDB disagree on species boundaries, complicating disambiguation.
Near-synonymous predicates: relation types like influence / affect / impact share surface patterns but carry distinct semantics that classifiers easily confuse.

Pipeline

   ┌──────────────────────────────────────────────┐
   │   Gut-brain axis PubMed abstract (raw text)   │
   └───────────────────────┬───────────────────────┘
                           ▼
   ┌──────────────────────────────────────────────┐
   │  T611 · NER                                   │
   │  BiomedBERT-CRF · 27 BIO labels (13 types)    │
   │  chunked inference → cross-model voting       │
   └───────────────────────┬───────────────────────┘
            entity spans    │
            ┌───────────────┴───────────────┐
            ▼                               ▼
   ┌──────────────────────┐      ┌──────────────────────┐
   │  T612 · Entity        │      │  T621 · Mention-level │
   │  Linking (NERD)       │      │  Relation Extraction  │
   │  dict → SapBERT →     │      │  typed entity markers │
   │  type-prior rerank    │      │  CE-Dice · 18 classes │
   └──────────┬───────────┘      └───────────┬──────────┘
              │  canonical URIs              │  mention triples
              └──────────────┬───────────────┘
                             ▼
                  ┌──────────────────────┐
                  │  T622 · Concept-level │
                  │  RE — lift triples to │
                  │  canonical concept URIs│
                  └──────────────────────┘
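
The last box of the diagram can be sketched in a few lines. This is a minimal illustration, not the project's code: the mention keys, the `kb:` URIs, and the choice to drop triples with an unlinked argument are all assumptions made for the example.

```python
from typing import Dict, List, Set, Tuple

# Illustrative types only: a mention is identified by (doc_id, start, end).
Mention = Tuple[str, int, int]
MentionTriple = Tuple[Mention, str, Mention]  # (subject, predicate, object)
ConceptTriple = Tuple[str, str, str]          # (subject URI, predicate, object URI)


def lift_to_concepts(
    mention_triples: List[MentionTriple],
    mention_to_uri: Dict[Mention, str],
) -> Set[ConceptTriple]:
    """Replace each mention with its linked URI and deduplicate the result."""
    lifted: Set[ConceptTriple] = set()
    for subj, predicate, obj in mention_triples:
        subj_uri = mention_to_uri.get(subj)
        obj_uri = mention_to_uri.get(obj)
        # Assumption: a triple with an unlinked argument is dropped.
        if subj_uri is None or obj_uri is None:
            continue
        lifted.add((subj_uri, predicate, obj_uri))
    return lifted


# Hypothetical T612 output: two mentions of the same concept share one URI.
mention_to_uri = {
    ("d1", 0, 13): "kb:A",
    ("d1", 40, 50): "kb:B",
    ("d1", 60, 73): "kb:A",
}
# Hypothetical T621 output; the last triple has an unlinked object.
mention_triples = [
    (("d1", 0, 13), "influence", ("d1", 40, 50)),
    (("d1", 60, 73), "influence", ("d1", 40, 50)),
    (("d1", 0, 13), "influence", ("d1", 90, 95)),
]
concept_triples = lift_to_concepts(mention_triples, mention_to_uri)
```

Because the result is a set, two mention-level triples that resolve to the same concepts collapse into a single concept-level triple.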

T611 — Named Entity Recognition

Token-level sequence labeling with BertCrfForTokenClassification: a BiomedBERT encoder, a linear emission head, and a CRF layer over a 27-label BIO scheme.
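
As a quick check on the label space: 13 entity types under BIO tagging give 2 × 13 + 1 = 27 labels. The sketch below builds such a label set and converts a decoded tag sequence back to token spans. The type names are placeholders and the handling of stray `I-` tags is an assumption, neither is taken from the system itself.

```python
from typing import List, Tuple


def build_bio_labels(entity_types: List[str]) -> List[str]:
    """Return ["O", "B-x", "I-x", ...] for the given entity types."""
    labels = ["O"]
    for entity_type in entity_types:
        labels.append(f"B-{entity_type}")
        labels.append(f"I-{entity_type}")
    return labels


def bio_to_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Convert BIO tags to (start, end_exclusive, type) token spans."""
    spans: List[Tuple[int, int, str]] = []
    start, current = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        continues = tag.startswith("I-") and current == tag[2:]
        if current is not None and not continues:
            spans.append((start, i, current))
            start, current = None, None
        if tag.startswith("B-"):
            start, current = i, tag[2:]
        # Assumption: an I- tag with no matching open span is ignored.
    return spans


# Placeholder names standing in for the 13 official entity types.
ENTITY_TYPES = [f"TYPE_{i:02d}" for i in range(1, 14)]
LABELS = build_bio_labels(ENTITY_TYPES)
LABEL_TO_ID = {label: idx for idx, label in enumerate(LABELS)}

example_spans = bio_to_spans(["O", "B-TYPE_01", "I-TYPE_01", "O", "B-TYPE_02"])
```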

Differential learning rates: 2e-5 (encoder), 2e-4 (linear head), 2e-3 (CRF transition matrix) — respecting the different convergence speeds of pre-trained vs. randomly initialized components.
Bronze relabeling: the 2,972 distant-supervision "bronze" articles are re-annotated by a cross-model ensemble trained only on gold+silver data, keeping spans where ≥2/3 models agree. This cuts false positives: for gene mentions, the original bronze labels agreed with gold-trained models only 59% of the time.
Chunked inference: long abstracts split into ≤512-token chunks at punctuation boundaries, predictions merged back to original offsets so entities near the end of long abstracts are not lost to truncation.
Within-model ensembling: three seeds (7, 11, 17) averaged at the emission + CRF transition level before Viterbi decoding — soft averaging consistently beats majority voting on predictions.
Cross-model unanimous voting: the winning run requires 3/3 agreement across heterogeneous encoder-data combinations; per-class analysis shows each encoder has a categorical advantage tracing back to its pretraining corpus.
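
Both voting rules in the list above (≥2/3 for bronze relabeling, 3/3 for the winning run) are instances of the same k-of-n filter. A minimal sketch follows, assuming a vote counts only on an exact match of character offsets and entity type; the span tuples and example predictions are made up for illustration.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Span = Tuple[int, int, str]  # (char_start, char_end, entity_type)


def vote_spans(model_outputs: List[Iterable[Span]], min_votes: int) -> List[Span]:
    """Keep spans predicted identically by at least `min_votes` models."""
    if not 1 <= min_votes <= len(model_outputs):
        raise ValueError("min_votes must be between 1 and the number of models")
    counts: Counter = Counter()
    for spans in model_outputs:
        counts.update(set(spans))  # each model votes at most once per span
    return sorted(span for span, votes in counts.items() if votes >= min_votes)


# Hypothetical predictions from three models on one abstract.
model_a = [(0, 13, "bacteria"), (20, 26, "food")]
model_b = [(0, 13, "bacteria"), (20, 26, "dietary supplement")]
model_c = [(0, 13, "bacteria"), (20, 26, "food")]

majority = vote_spans([model_a, model_b, model_c], min_votes=2)
unanimous = vote_spans([model_a, model_b, model_c], min_votes=3)
```

Here `majority` keeps both spans, while `unanimous` keeps only the bacteria span, since the models disagree on the type of the second mention.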

T612 — Entity Linking (three-stage cascade)

Linking draws on a unified knowledge base of 52,263 concept aliases derived from MeSH, NCBI Gene, GTDB, NCBI Taxonomy, FoodOn, and USDA FNDDS, plus a direct 18,512-entry surface-form lookup table built from the training annotations.

Dictionary lookup — normalized (surface form, entity type) query against the lookup table; handles common unambiguous mentions with near-perfect precision in under a second.
SapBERT dense retrieval — unresolved mentions encoded with SapBERT, top-k (k=20) candidate URIs retrieved from a FAISS index over the 52,263 alias embeddings (95.6% top-1 where a correct KB entry exists).
Type-prior reranking — candidates reranked by sim(e,c) + α·log p(c|t), correcting type-ambiguous surface forms (e.g. a chemical that also appears as a drug) without over-relying on frequency priors.
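
The cascade can be sketched end to end with toy data. This is an illustration under stated assumptions rather than the actual implementation: brute-force cosine similarity stands in for the SapBERT + FAISS search, normalization is taken to mean lowercasing and whitespace collapsing, and the `alpha` value, the probability floor, the vectors, and the `kb:` URIs are all hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple

Vector = List[float]


def cosine(u: Vector, v: Vector) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def link_mention(
    surface: str,
    entity_type: str,
    mention_vec: Vector,
    lookup: Dict[Tuple[str, str], str],
    alias_index: List[Tuple[str, Vector]],      # (URI, alias embedding)
    type_prior: Dict[Tuple[str, str], float],   # p(concept | type)
    alpha: float = 0.1,
    k: int = 20,
    floor: float = 1e-6,
) -> Optional[str]:
    # Stage 1: exact lookup on the normalized (surface form, type) pair.
    key = (" ".join(surface.lower().split()), entity_type)
    if key in lookup:
        return lookup[key]
    # Stage 2: top-k candidates by similarity (brute force replaces FAISS here).
    scored = sorted(
        ((cosine(mention_vec, vec), uri) for uri, vec in alias_index),
        reverse=True,
    )[:k]
    if not scored:
        return None
    # Stage 3: rerank by sim(e, c) + alpha * log p(c | t).
    def rerank_score(item: Tuple[float, str]) -> float:
        sim, uri = item
        prior = max(type_prior.get((uri, entity_type), floor), floor)
        return sim + alpha * math.log(prior)
    return max(scored, key=rerank_score)[1]


lookup = {("inulin", "dietary supplement"): "kb:inulin"}
alias_index = [("kb:chem", [1.0, 0.0]), ("kb:drug", [0.96, 0.28])]
type_prior = {("kb:chem", "drug"): 0.05, ("kb:drug", "drug"): 0.9}

by_dictionary = link_mention("Inulin ", "dietary supplement", [0.0, 1.0],
                             lookup, alias_index, type_prior)
by_rerank = link_mention("some drug", "drug", [1.0, 0.0],
                         lookup, alias_index, type_prior)
by_similarity_only = link_mention("some drug", "drug", [1.0, 0.0],
                                  lookup, alias_index, type_prior, alpha=0.0)
```

In the toy data the nearest alias belongs to `kb:chem`, but the type prior for a drug mention moves `kb:drug` to the top; with `alpha=0.0` the ranking falls back to pure similarity.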

Key finding: retrieval accuracy on in-KB entities is near-ceiling — the real bottleneck is KB coverage, not retrieval quality, which explains the dev-to-test gap on this subtask.

T621 / T622 — Relation Extraction

Typed entity markers: BiomedBERT fine-tuned for 18-class relation classification (no_relation + 17 predicates); subject/object spans wrapped with type-bearing markers (@ * t · span * @, # t · span #) so entity-type information stays in the encoder's receptive field.
CE-Dice loss: Dice loss combined with cross-entropy to handle severe class imbalance where no_relation pairs vastly outnumber positives — stable gradients with effective minority-class focus.
Per-class threshold sweep: soft-probability ensembling with a per-predicate threshold sweep handles the 17-predicate imbalance better than majority voting.
T622: mention-level T621 triples are lifted to canonical concept-level triples using the T612 URI assignments.
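
One common way to combine the two loss terms is cross-entropy plus a soft Dice term averaged over classes. The sketch below uses that formulation in plain Python on already-normalized probabilities. The exact Dice variant, the weighting, and the smoothing constant used in the system are not stated here, so `dice_weight` and `eps` are assumptions.

```python
import math
from typing import List


def ce_dice_loss(
    probs: List[List[float]],   # per-example class probabilities
    targets: List[int],         # gold class index per example
    dice_weight: float = 1.0,   # assumed weighting between the two terms
    eps: float = 1e-8,          # assumed smoothing constant
) -> float:
    n, num_classes = len(probs), len(probs[0])
    # Cross-entropy: mean negative log-probability of the gold class.
    ce = -sum(math.log(max(p[y], eps)) for p, y in zip(probs, targets)) / n
    # Soft Dice per class: 1 - 2 * overlap / (predicted mass + gold mass).
    dice_terms = []
    for c in range(num_classes):
        overlap = sum(p[c] for p, y in zip(probs, targets) if y == c)
        predicted_mass = sum(p[c] for p in probs)
        gold_mass = sum(1 for y in targets if y == c)
        dice_terms.append(
            1.0 - (2.0 * overlap + eps) / (predicted_mass + gold_mass + eps)
        )
    dice = sum(dice_terms) / num_classes
    return ce + dice_weight * dice


# Class 0 plays the role of no_relation; class 1 is a rare predicate.
confident = ce_dice_loss([[1.0, 0.0], [0.0, 1.0]], [0, 1])
missed_minority = ce_dice_loss([[0.9, 0.1], [0.9, 0.1]], [0, 1])
ce_only = ce_dice_loss([[0.5, 0.5]], [0], dice_weight=0.0)
```

Because the Dice term is computed per class and then averaged, a missed minority-class example costs as much as an error on the dominant class, which is the imbalance effect the bullet on CE-Dice describes.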

Error analysis: the oracle-to-pipeline RE gap is driven mostly by upstream NER errors, not the relation classifier itself — identifying NER recall as the highest-leverage direction for further gains.

Tech Stack

Python PyTorch Transformers BiomedBERT BioLinkBERT CRF SapBERT FAISS Relation Extraction Entity Linking BioNLP
