What do these badges mean?
- 🚀 Shipping: Code exists. Multiple GitHub repos already reference this paper, so people are building on it.
- 📈 Climbing: Citation velocity is rising. Researchers are starting to pick it up.
- 💤 Quiet: Published but no notable signal yet. Most papers live here and could become anything later.
- 🎭 Hype: Heavy social buzz but no shipping signal. This is the counter-signal; it is deferred until Twitter/X data is wired up.
- 2605.18570 · May 18, 2026 · ~11 min · cs.AI
Query-Conditioned Knowledge Alignment for Reliable Cross-System Medical Reasoning
Yan Jiao, Jingran Xu, Pin-Han Ho, Limei Peng
ELI5: When doctors use multiple medical systems (like Traditional Chinese Medicine and Western Medicine), concepts don't always match one-to-one. This tool figures out which concepts in one system correspond to concepts in another, using the specific question being asked to guide the matching. It is like asking 'what does this symptom mean in the other system?' rather than assuming a fixed translation.
Problem solved: Medical AI systems that combine knowledge from different traditions or sources often fail because they use rigid, context-blind mappings between concepts. As a result, medical Q&A systems retrieve the wrong evidence and give inaccurate answers. QCEA makes alignments flexible and query-aware so the right knowledge gets surfaced.
- 2605.18561 · May 18, 2026 · ~10 min · cs.IR, cs.AI, cs.SE
Improving BM25 Code Retrieval Under Fixed Generic Tokenization: Adaptive q-Log Odds as a Drop-In BM25 Fix
Santosh Kumar Radha, Oktay Goktas
ELI5: When searching for code files, traditional BM25 search struggles because rare, distinguishing identifiers don't stand out enough from common ones. This fix adjusts BM25's scoring formula to give more weight to those distinguishing identifiers, so relevant code files show up in search results much more often.
Problem solved: Code search systems often fail to retrieve the right file because generic tokenization flattens the importance of the function and variable names that make code unique. This fix works with frozen search indexes, dramatically improving retrieval without rebuilding the index or slowing down queries.
- 2605.18490 · May 18, 2026 · ~13 min · cs.CL, cs.IR
Vector RAG vs LLM-Compiled Wiki: A Preregistered Comparison on a Small Multi-Domain Research
Theodore O. Cochran
ELI5: Two ways to help AI answer questions about research papers go head-to-head: traditional search-and-retrieve versus an AI-compiled wiki. The wiki connected ideas across papers better but cost more per question to run, and neither was clearly better overall.
Problem solved: Teams need to know which approach to use when building AI systems that answer questions over document collections: does it make sense to pre-compile a wiki or to retrieve chunks dynamically? This test shows the tradeoff depends on what matters most: accuracy, cost, or citation quality.
- 2605.18299 · May 18, 2026 · ~13 min · cs.AI, cs.CL, cs.IR
SD-Search: On-Policy Hindsight Self-Distillation for Search-Augmented Reasoning
Yufei Ma, Zihan Liang, Ben Chen, Zhipeng Qian, +5
ELI5: A search-augmented AI agent learns to write better search queries by comparing itself to a smarter version of itself that knows how previous attempts turned out. Instead of just getting one reward at the end, the agent gets feedback on each individual search decision.
Problem solved: Search-augmented reasoning agents struggle to learn which queries are worth making because they only get a single reward signal at the end of a rollout, not credit for individual search decisions. Previous fixes required expensive teacher models or manual annotations.
- 2605.18284 · May 18, 2026 · ~14 min · cs.SE, cs.AI
CommitDistill: A Lightweight Knowledge-Centric Memory Layer for Software Repositories
Divya Chukkapalli, Thejesh Avula, Aditya Aggarwal, Harsimran Singh, +1
ELI5: A tool that mines git commit history to extract useful facts, skills, and patterns from your codebase, then lets you search that distilled knowledge efficiently without needing embeddings or external services. It is like having a searchable memory of why your code is written the way it is.
Problem solved: Vast amounts of knowledge stay trapped in commit messages and git history, where developers and AI coding assistants cannot easily use it. CommitDistill makes that history queryable and trustworthy (deterministic, local, inspectable) so teams can actually reuse past decisions and patterns without external dependencies.
- 2605.18271 · May 18, 2026 · ~9 min · cs.CL, cs.AI, cs.IR
From Volume to Value: Preference-Aligned Memory Construction for On-Device RAG
Changmin Lee, Jaemin Kim, Taesik Gong
ELI5: A system that stores only the user preferences that matter on your phone instead of raw data, making personal AI assistants work better with tiny memory budgets. It's like keeping a cheat sheet of what you like instead of memorizing entire books.
Problem solved: Running personalized AI on-device requires storing lots of context, but phones have tiny memory budgets. This work addresses which data to keep so the AI actually remembers what you prefer without eating your storage.
- 2605.18226 · May 18, 2026 · ~8 min · cs.CL, cs.AI
Context Memorization for Efficient Long Context Generation
Yasuyuki Okoshi, Hao Mark Chen, Guanxi Lu, Hongxiang Fan, +2
ELI5: Instead of keeping long context in attention (which gets slower), this method pre-computes how the context should influence each token and stores those pre-made answers in a fast lookup table, so the model can grab the right context influence instantly without recomputing.
Problem solved: Long context prefixes slow down LLM inference because attention computation scales linearly with prefix length, and their influence fades during generation. This method cuts latency while maintaining or improving accuracy without expensive retraining.
- 2605.18199 · May 18, 2026 · ~8 min · cs.IR, cs.AI
PIPER: Content-Based Table Search via profiling and LLM-Generated Pseudoqueries
Riccardo Terrenzi, Matteo Falconi, Serkan Ayvaz, Pierluigi Plebani
ELI5: A system that finds relevant data tables by having an AI read what's actually in them and imagine what questions they could answer, then uses those imagined questions to match tables to user searches.
Problem solved: Searching through thousands of tables in data lakes is hard when table names and descriptions are missing or unhelpful. PIPER uses AI to understand table content directly, making it much easier to find the right dataset without relying on good metadata.
- 2605.18144 · May 18, 2026 · ~13 min · cs.AI
Evidence-Grounded Frontier Mapping and Agentic Hypothesis Generation in Nanomedicine
Christiaan G. A. Viviers, Koen de Bruin, Mirre M. Trines, Ayla M. Hokke, +5
ELI5: A tool that reads thousands of nanomedicine research papers, finds gaps and connections between different research areas, then uses AI to suggest new research directions, backed by actual citations so scientists can verify where the ideas came from.
Problem solved: Nanomedicine researchers drown in fragmented literature across chemistry, biology, and medicine, making it hard to spot promising new directions. This system helps researchers discover underexplored intersections and generates hypotheses grounded in existing evidence.
- 🚀 Shipping · 2605.16217 · May 15, 2026 · ~13 min · cs.CL, cs.AI, cs.IR
Argus: Evidence Assembly for Scalable Deep Research Agents
Zhen Zhang, Liangcai Su, Zhuo Chen, Xiang Lin, +6
⭐ 123 stars / 23 repos · 📚 0 cites
ELI5: A research AI system where one agent searches for evidence pieces while another agent tracks what's been found, spots what's missing, and assembles everything into a final answer, like coordinating a team to complete a jigsaw puzzle instead of having everyone solve it separately.
Problem solved: Current AI research agents waste compute by running parallel searches that duplicate effort instead of finding new information, and they struggle to fit all the results into context windows. This system makes parallel searching efficient by tracking what's been gathered and targeting searches at gaps.
- 🚀 Shipping · 2605.16117 · May 15, 2026 · ~9 min · cs.CL
SGR: A Stepwise Reasoning Framework for LLMs with External Subgraph Generation
Xin Zhang, Yang Cao, Baoxing Wu, Kai Song, +1
⭐ 199 stars / 10 repos · 📚 0 cites
ELI5: A system that helps AI language models answer tricky questions by first building a small, focused map of relevant facts from a knowledge base, then walking through that map step-by-step to reach a reliable answer.
Problem solved: Language models often hallucinate or give inconsistent answers on complex reasoning tasks because they're working from just their training data. This grounds them in real, structured facts and makes their reasoning process traceable and verifiable.
- 🚀 Shipping · 2605.16113 · May 15, 2026 · ~12 min · cs.CL, cs.AI
DebiasRAG: A Tuning-Free Path to Fair Generation in Large Language Models through Retrieval-Augmented Generation
Rui Chu, Bingyin Zhao, Thanh Quoc Hung Le, Duy Cao Hoang, +5
⭐ 197 stars / 10 repos · 📚 0 cites
ELI5: A system that fixes biased outputs from AI language models by automatically retrieving and inserting fairness-promoting text snippets into the model's context. No retraining is needed, just smarter retrieval.
Problem solved: Language models produce biased, stereotyped responses about race, gender, and age. Fine-tuning fixes are expensive and degrade performance; this approach reduces bias at inference time without touching the model.