What do these badges mean?
- 🚀 Shipping: Code exists. Multiple GitHub repos already reference this paper, so people are building on it.
- 📈 Climbing: Citation velocity is rising. Researchers are starting to pick it up.
- 💤 Quiet: Published but no notable signal yet. Most papers live here and could become anything later.
- 🎭 Hype: Heavy social buzz but no shipping signal. This is the counter-signal; it is deferred until Twitter/X data is wired up.
- 💤 Quiet · 2605.18738 · May 18, 2026 · ~11 min · cs.AI
What Does the AI Doctor Value? Auditing Pluralism in the Clinical Ethics of Language Models
Payal Chandak, Victoria Alkin, David Wu, Maya Dagan, +10
⭐ 1 star / 2 repos · 📚 0 cites
ELI5: Researchers tested whether AI language models used in medicine make ethical decisions the same way human doctors do, or whether they have hidden value biases. They found that while AI systems discuss multiple viewpoints, they make the same decision over and over, and some consistently underweight patient choice, potentially replacing diverse doctor values with one AI's preferences at scale.
Problem solved: When hospitals deploy AI to advise on medical decisions, nobody knows what ethical values the AI actually prioritizes. Some AI systems might consistently favor treatment efficiency over patient autonomy, or other values, but this bias goes undetected, meaning one AI's hidden preferences could be applied to thousands of patients instead of respecting the natural variation in how good doctors make ethical trade-offs.
- 💤 Quiet · 2605.16234 · May 15, 2026 · ~9 min · cs.LG, cs.AI, cs.CL
Layer Equivalence Is Not a Property of Layers Alone: How You Test Redundancy Changes What You Find
Gabriel Garcia
⭐ 73 stars / 10 repos · 📚 0 cites
ELI5: When you test whether a layer in a transformer is redundant, different test methods give different answers about which layers are safe to remove. This paper shows that the gap between these tests is large and unpredictable, so you need to run both tests before deciding what to prune.
Problem solved: Model compression tools rely on identifying redundant layers to remove, but current equivalence tests disagree on which layers are actually safe to cut. This means compression pipelines built on one test method can fail when the layers chosen for removal don't actually compress well in practice.
- 💤 Quiet · 2605.16223 · May 15, 2026 · ~5 min · cs.GR, cs.AI, cs.CV
Evaluating Design Video Generation: Metrics for Compositional Fidelity
Adrienne Deganutti, Dingning Cao, Jaejung Seol, Elad Hirsch, +1
⭐ 78 stars / 10 repos · 📚 0 cites
ELI5: A new way to automatically grade how well AI video generators handle design animations: checking if objects move the right way, stay where they should, and follow the instructions given.
Problem solved: Design animation has strict rules (move this box left, keep that text still), but there was no automated way to measure whether generated videos actually follow them. Teams had to manually watch and grade videos, slowing down development.
- 💤 Quiet · 2605.16116 · May 15, 2026 · ~13 min · cs.AI
ShopGym: An Integrated Framework for Realistic Simulation and Scalable Benchmarking of E-Commerce Web Agents
Chinmay Savadikar, Mingyu Zhao, Yuanzheng Zhu, Han Li, +4
⭐ 72 stars / 9 repos · 📚 0 cites
ELI5: A framework that turns real online stores into controllable, reproducible test environments for AI shopping agents. It captures the real structure and complexity of e-commerce sites but lets researchers reset them, inspect them, and run consistent experiments.
Problem solved: Testing e-commerce agents on real websites is messy and irreproducible; testing on hand-built fake stores is too narrow and unrealistic. ShopGym bridges this by automatically converting real storefronts into stable, inspectable simulations that preserve actual shopping complexity.