NC-Bench And NCfold: A Benchmark And Closed-Loop Framework For RNA Non-Canonical Base-Pair Prediction

Heqin Zhu1, #, Ruifeng Li2, #, Ao Chang1, Mingqian Li2, Hongyang Chen2, *, Peng Xiong1, *, Shaohua Kevin Zhou1, *
University of Science and Technology of China
ICLR 2026

#Indicates Equal Contribution

*Indicates Corresponding Author

Overview of the NC-Bench pipeline and NCfold framework.

Abstract

Non-canonical (NC) base pairs play crucial roles in RNA structural stability, function, and recognition. Despite their importance, accurate NC base-pair prediction has been held back by a shortage of benchmark resources and dedicated deep learning approaches. We introduce NC-Bench, a benchmark dataset of 925 RNA sequences with 6,708 high-quality NC annotations curated from protein-RNA complexes and RNA-only 3D structures. NC-Bench features rigorous annotation standards, redundancy reduction, and multiple RNA types. We also present NCfold, a deep learning framework that fuses sequence features with structural priors from RNA foundation models. NCfold employs a dual-branch architecture with Representative Embedding Fusion (REF) to integrate multiple RNA foundation models and uses Base Pair Motif energy matrices as structural priors. Extensive experiments show that NCfold achieves state-of-the-art performance, with the AttnMatFusion_net variant significantly outperforming existing methods. Our analysis indicates that longer training sequences and careful foundation model selection improve performance, particularly for G-U wobble pairs and multi-branch loops. NC-Bench and NCfold provide resources and methodology for RNA structural biology research.
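
To make the idea of combining several embedding sources with a pairwise structural prior more concrete, here is a minimal, purely illustrative sketch in plain Python. It is not the NCfold implementation and not the REF module: the function names (`fuse_embeddings`, `pair_scores`), the softmax gating, the dot-product scoring, and the `prior_weight` parameter are all assumptions chosen for illustration. It assumes every model yields per-residue embeddings of the same shape and that the prior is an L x L matrix.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_embeddings(embeddings, gate_logits):
    """Weighted average of per-residue embeddings from several models.

    embeddings: list of L x D nested lists, one per (hypothetical) model.
    gate_logits: one score per model; softmax turns them into weights.
    """
    weights = softmax(gate_logits)
    length, dim = len(embeddings[0]), len(embeddings[0][0])
    fused = [[0.0] * dim for _ in range(length)]
    for weight, emb in zip(weights, embeddings):
        for i in range(length):
            for k in range(dim):
                fused[i][k] += weight * emb[i][k]
    return fused


def pair_scores(fused, prior, prior_weight=1.0):
    """Symmetric L x L scores: residue dot product plus a symmetrized prior."""
    n = len(fused)
    scores = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(fused[i], fused[j]))
            sym_prior = 0.5 * (prior[i][j] + prior[j][i])
            scores[i][j] = dot + prior_weight * sym_prior
    return scores


# Toy input: two hypothetical models, a 3-residue sequence, 2-dim embeddings.
emb_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
emb_b = [[3.0, 0.0], [0.0, 3.0], [1.0, 1.0]]
prior = [[0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

fused = fuse_embeddings([emb_a, emb_b], gate_logits=[0.0, 0.0])
scores = pair_scores(fused, prior)
```

In a trained model the gate logits and the scoring function would be learned; here they are fixed only to keep the sketch self-contained.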

BibTeX


@article{NCBench,
    author = {Zhu, Heqin and Li, Ruifeng and Chang, Ao and Li, Mingqian and Chen, Hongyang and Xiong, Peng and Zhou, Shaohua Kevin},
    title = {Toward Accurate RNA Non-Canonical Structure Prediction: The NC-Bench Benchmark and the NCfold Framework},
    elocation-id = {2025.11.16.688746},
    year = {2025},
    doi = {10.1101/2025.11.16.688746},
    publisher = {Cold Spring Harbor Laboratory},
    URL = {https://www.biorxiv.org/content/early/2025/11/17/2025.11.16.688746},
    eprint = {https://www.biorxiv.org/content/early/2025/11/17/2025.11.16.688746.full.pdf},
    journal = {bioRxiv}
}