What do these badges mean?
- 🚀 Shipping: Code exists. Multiple GitHub repos already reference this paper, so people are building on it.
- 📈 Climbing: Citation velocity is rising. Researchers are starting to pick it up.
- 💤 Quiet: Published but no notable signal yet. Most papers live here and could become anything later.
- 🎭 Hype: Heavy social buzz but no shipping signal. This is the counter-signal, deferred until Twitter/X data is wired up.
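The badge rules above can be read as a small decision procedure. The sketch below is only an illustration of that reading, not how this site computes badges: the `PaperSignals` fields, the `MIN_REPOS` and `BUZZ_THRESHOLD` values, and the order in which the rules are checked are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds; the legend only says "multiple repos" and "heavy buzz".
MIN_REPOS = 2
BUZZ_THRESHOLD = 100.0


@dataclass
class PaperSignals:
    """Hypothetical per-paper signals; field names are illustrative only."""
    repo_count: int                      # GitHub repos referencing the paper
    citation_velocity: float             # change in citation rate, > 0 means rising
    social_buzz: Optional[float] = None  # None while Twitter/X data is not wired up


def assign_badge(signals: PaperSignals) -> str:
    """Return one badge name, checking the shipping signal first (assumed order)."""
    if signals.repo_count >= MIN_REPOS:
        return "Shipping"
    # Hype needs social data, so it is skipped while that data is unavailable.
    if signals.social_buzz is not None and signals.social_buzz >= BUZZ_THRESHOLD:
        return "Hype"
    if signals.citation_velocity > 0:
        return "Climbing"
    return "Quiet"


if __name__ == "__main__":
    print(assign_badge(PaperSignals(repo_count=9, citation_velocity=0.0)))
    print(assign_badge(PaperSignals(repo_count=0, citation_velocity=0.0)))
```

With `social_buzz` left as `None`, the Hype branch never fires, which mirrors the legend's note that the badge is deferred until social data is available.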
- 🚀 Shipping · 2605.18747 · May 18, 2026 · ~13 min · cs.CL, cs.AI
Code as Agent Harness
Xuying Ning, Katherine Tieu, Dongqi Fu, Tianxin Wei, +38
⭐ 1.3k stars / 9 repos · 📚 0 cites
ELI5: Instead of treating code as just the output LLMs produce, this survey shows how code can be the central operating system for AI agents: the glue that lets them think, act, remember, and verify their work in a way humans can actually understand and check.
Problem solved: Current AI agents are hard to make reliable, debuggable, and controllable. Using code as the core infrastructure lets you write agent logic you can read, test, and fix, solving the black-box nature of pure neural approaches and making agents deployable in real systems.
- 🚀 Shipping · 2605.16250 · May 15, 2026 · ~4 min · cs.CL, cs.AI, cs.DB
A Generative AI Framework for Intelligent Utility Billing CO2 Analytics and Sustainable Resource Optimisation
Pavan Manjunath, Thomas Pruefer
⭐ 104 stars / 10 repos · 📚 0 cites
ELI5: A system that automatically writes personalized utility bills in plain English while tracking the carbon footprint of each unit of electricity and predicting tomorrow's grid demand to help balance renewable energy with customer usage.
Problem solved: Utility companies struggle to explain bills clearly to customers, can't easily attach verifiable carbon numbers to power usage, and lack accurate demand forecasts to manage grid stress, making it hard to optimize renewable energy integration and customer engagement.
- 🚀 Shipping · 2605.16238 · May 15, 2026 · ~9 min · cs.AI
Prospective multi-pathogen disease forecasting using autonomous LLM-guided tree search
Sarah Martinson, Michael P. Brenner, Martyna Plomecka, Brian P. Williams, +2
⭐ 171 stars / 10 repos · 📚 0 cites
ELI5: An AI system uses a language model to automatically design and test disease forecast models by searching through combinations of mathematical approaches, then picks the best ones to predict flu, COVID, and RSV, matching expert predictions without needing humans to build the models.
Problem solved: Disease forecasting currently requires expert teams to manually build and tune models for each pathogen and location, which is slow and doesn't scale. This system automates that work so forecasts can be deployed quickly for new diseases or regions without waiting for scarce modeling expertise.
- 🚀 Shipping · 2605.16217 · May 15, 2026 · ~13 min · cs.CL, cs.AI, cs.IR
Argus: Evidence Assembly for Scalable Deep Research Agents
Zhen Zhang, Liangcai Su, Zhuo Chen, Xiang Lin, +6
⭐ 123 stars / 23 repos · 📚 0 cites
ELI5: A research AI system where one agent searches for evidence pieces while another agent tracks what's been found, spots what's missing, and assembles everything into a final answer, like coordinating a team to complete a jigsaw puzzle instead of having everyone solve it separately.
Problem solved: Current AI research agents waste compute by running parallel searches that duplicate effort instead of finding new information, and they struggle to fit all the results into context windows. This system makes parallel searching actually efficient by tracking what's been gathered and targeting searches at gaps.
- 🚀 Shipping · 2605.16207 · May 15, 2026 · ~8 min · cs.AI, cs.CL
Confirming Correct, Missing the Rest: LLM Tutoring Agents Struggle Where Feedback Matters Most
Tahreem Yasir, Wenbo Li, Sam Gilson, Sutapa Dey Tithi, +2
⭐ 439 stars / 22 repos · 📚 0 cites
ELI5: Researchers tested whether AI tutors can actually tell the difference between correct answers, partially correct answers, and wrong answers, and found they're surprisingly bad at catching subtle mistakes that real tutors should catch.
Problem solved: Schools and education platforms are replacing human tutors with AI, but we didn't know if these AI tutors could actually diagnose student mistakes well enough to give useful feedback. This matters because bad diagnosis leads to bad teaching.
- 🚀 Shipping · 2605.16205 · May 15, 2026 · ~14 min · cs.AI, cs.CL, cs.LG
Context, Reasoning, and Hierarchy: A Cost-Performance Study of Compound LLM Agent Design in an Adversarial POMDP
Igor Bogdanov, Chung-Horng Lung, Thomas Kunz, Jie Gao, +2
⭐ 348 stars / 28 repos · 📚 0 cites
ELI5: Researchers tested different ways to build AI agents that play a cyber defense game where they can't see the full situation. They compared three design choices: what information to show the agent, how much the agent should think things through, and whether to use one big agent or split it into smaller specialist agents. They found that clean data representation and task splitting work best, but adding too much internal reasoning actually makes things worse.
Problem solved: Teams building AI agents for complex, partial-information tasks don't know which design patterns actually improve performance versus just burning compute. This study quantifies the cost-benefit tradeoffs of context, reasoning depth, and hierarchical decomposition so builders can stop guessing and start optimizing.
- 🚀 Shipping · 2605.16194 · May 15, 2026 · ~9 min · cs.DL, cs.AI, cs.IR
paper.json: A Coordination Convention for LLM-Agent-Actionable Papers
Arquimedes Canedo
⭐ 345 stars / 19 repos · 📚 0 cites
ELI5: A standard format (JSON file) that travels with research papers to help AI agents understand and use them better, listing key claims with IDs, what the paper doesn't claim, and exact commands to reproduce figures.
Problem solved: AI agents reading papers often misinterpret scope, can't cite specific claims within a paper, and struggle to find figure-generation code. This metadata layer makes papers machine-readable and actionable without changing the human-readable PDF.
- 🚀 Shipping · 2605.16143 · May 15, 2026 · ~9 min · cs.AI, cs.CL
Look Before You Leap: Autonomous Exploration for LLM Agents
Ziang Ye, Wentao Shi, Yuxin Liu, Yu Wang, +5
⭐ 159 stars / 10 repos · 📚 0 cites
ELI5: LLM agents jump to conclusions too fast in new environments instead of poking around first. This paper teaches them to systematically explore and map out what's possible before trying to solve tasks, like learning the layout before cooking dinner in an unfamiliar kitchen.
Problem solved: LLM-based agents fail in novel environments because they rely on pre-training rather than gathering real info about what's actually possible. Teams need agents that can adapt to new situations instead of confidently doing the wrong thing.