ReflectiveRAG: Rethinking adaptivity in retrieval-augmented generation
2026
Retrieval-Augmented Generation (RAG) systems degrade sharply under extreme noise, where irrelevant or redundant passages dominate the retrieved context. Current methods (fixed top-k retrieval, cross-encoder reranking, or policy-based iteration) depend on static heuristics or costly reinforcement learning; they fail to assess evidence sufficiency, detect subtle mismatches, or reduce redundancy, which leads to hallucinations and poor grounding. We introduce ReflectiveRAG, a lightweight yet reasoning-driven architecture that enhances factual grounding through two complementary mechanisms: Self-Reflective Retrieval (SRR) and Contrastive Noise Removal (NR). SRR employs a small language model as a decision controller that iteratively evaluates evidence sufficiency, enabling adaptive query reformulation without fixed schedules or policy training. NR further refines the retrieved content via embedding-based contrastive filtering, enforcing semantic sparsity and removing redundant or tangential passages. Evaluated on WebQuestions, HotpotQA (distractor setting), and InternalQA with 50M Common Crawl distractors, ReflectiveRAG achieves substantial gains over strong baselines, including DeepRAG, improving EM by +2.7 pp and F1 by +2.5 pp while reducing evidence redundancy by 30.88% with only 18 ms of additional latency. Ablation studies confirm that SRR and NR jointly drive both factual accuracy and efficiency, validating our central claim that retrieval reasoning and contrastive filtering can outperform large-scale policy optimization in RAG.
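To make the two mechanisms concrete, the following is a minimal sketch (not the paper's implementation) of how an SRR-style retrieval loop and an NR-style contrastive filter could fit together. The callables `retrieve`, `embed`, `judge_sufficiency`, and `reformulate`, as well as the threshold and round budget, are hypothetical placeholders standing in for the retriever, the embedding model, and the small-language-model controller described in the abstract.

```python
from typing import Callable, List
import numpy as np


def contrastive_noise_removal(
    passages: List[str],
    embed: Callable[[str], np.ndarray],
    redundancy_threshold: float = 0.85,  # assumed value, not from the paper
) -> List[str]:
    """Greedy embedding-based filter: drop any passage whose cosine
    similarity to an already-kept passage exceeds the threshold."""
    kept, kept_vecs = [], []
    for p in passages:
        v = embed(p)
        v = v / (np.linalg.norm(v) + 1e-8)
        if all(float(v @ k) < redundancy_threshold for k in kept_vecs):
            kept.append(p)
            kept_vecs.append(v)
    return kept


def self_reflective_retrieve(
    question: str,
    retrieve: Callable[[str], List[str]],
    judge_sufficiency: Callable[[str, List[str]], bool],  # small LM as controller
    reformulate: Callable[[str, List[str]], str],         # small LM query rewriter
    embed: Callable[[str], np.ndarray],
    max_rounds: int = 3,  # assumed budget, not from the paper
) -> List[str]:
    """Iteratively retrieve, filter redundancy, and reformulate the query
    until the controller judges the evidence sufficient or the budget ends."""
    query, evidence = question, []
    for _ in range(max_rounds):
        evidence = contrastive_noise_removal(evidence + retrieve(query), embed)
        if judge_sufficiency(question, evidence):
            break
        query = reformulate(question, evidence)
    return evidence
```

In this reading, the controller replaces a fixed retrieval schedule or learned policy with per-question sufficiency judgments, and the contrastive filter keeps the evidence set semantically sparse across rounds.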
Research areas