
All 53 · 🚀 Shipping 2 · 📈 Climbing 0 · 💤 Quiet 33 · Unscored 18

What do these badges mean?

🚀 Shipping: Code exists. Multiple GitHub repos already reference this paper, so people are building on it.
📈 Climbing: Citation velocity is rising. Researchers are starting to pick it up.
💤 Quiet: Published but no notable signal yet. Most papers live here and could become anything later.
🎭 Hype: Heavy social buzz but no shipping signal. This is the counter-signal; it is deferred until Twitter/X data is wired up.
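The badge rules above can be read as a small decision procedure. The following is a minimal sketch under stated assumptions, not the page's actual scoring code: the thresholds, the `PaperSignals` type, and the `badge` function are hypothetical names introduced for illustration, and the Hype badge is left out because the legend says it is deferred.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds: the page does not publish its actual cutoffs.
MIN_REPOS_FOR_SHIPPING = 2        # "multiple GitHub repos"
MIN_VELOCITY_FOR_CLIMBING = 0.0   # any positive change in citation rate


@dataclass
class PaperSignals:
    # GitHub repos referencing the paper; None if not collected yet.
    repo_count: Optional[int]
    # Change in citation rate; None if not collected yet.
    citation_velocity: Optional[float]


def badge(signals: PaperSignals) -> str:
    """Map raw signals to a badge name, checking the strongest signal first."""
    if signals.repo_count is None or signals.citation_velocity is None:
        return "Unscored"
    if signals.repo_count >= MIN_REPOS_FOR_SHIPPING:
        return "Shipping"
    if signals.citation_velocity > MIN_VELOCITY_FOR_CLIMBING:
        return "Climbing"
    return "Quiet"
```

Checking Shipping before Climbing is a design choice in this sketch: a paper with both code and rising citations gets the stronger badge.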

11 min read
🚀 Shipping · 2606.12384 · Jun 10, 2026 · cs.LG, cs.AI
APPO: Agentic Procedural Policy Optimization
Xucong Wang, Ziyu Ma, Yong Wang, Yuxiang Ji, +4
⭐ 1.8k stars / 20 repos📚 0 cites
ELI5When training AI agents that use tools, this paper figures out which decisions actually matter and how to learn from them better. Instead of treating each tool call as a unit, it zooms in on individual tokens and uses a smart scoring system to pick which ones are worth exploring differently.
Problem solvedCurrent RL methods for tool-using agents struggle to pinpoint which intermediate decisions drive success, leading to wasted exploration on unimportant choices. This makes training inefficient and credit assignment unreliable, especially when good decisions are scattered throughout long sequences rather than at obvious tool-call boundaries.
13 min read
🚀 Shipping · 2606.12344 · Jun 10, 2026 · cs.LG, cs.CL
Claw-SWE-Bench: A Benchmark for Evaluating OpenClaw-style Agent Harnesses on Coding Tasks
Mengyu Zheng, Kai Han, Boxun Li, Haiyang Xu, +12
⭐ 1.5k stars / 22 repos📚 0 cites
ELI5A toolkit for fairly comparing different AI agent designs on real software engineering tasks across multiple programming languages. It standardizes how agents interact with code, extract patches, and get scored so you can see which agent setup actually works best.
Problem solvedTesting AI coding agents fairly is hard because each agent design needs different handling—there's no standard way to measure them against each other. This benchmark lets you actually compare agent harnesses (the glue code that connects models to tools) on an apples-to-apples basis, including cost.
13 min read
💤 Quiet · 2607.08717 · Jul 9, 2026 · cs.LG, eess.SP
Deep Learning for Joint Narrowband Interference Cancellation and Soft Demodulation in OFDM Systems
Emmanouil Kavvousanos, Francky Catthoor, Vassilis Paliouras
⭐ 0 stars / 0 repos📚 0 cites
ELI5A deep learning system that removes interference from wireless signals and recovers the original data in a single pass, instead of using slow traditional algorithms that leave corrupted leftovers that confuse decoders.
Problem solvedOFDM wireless systems get hit by narrowband interference that traditional algorithms struggle to clean up, leaving unreliable remnants that cause decoding failures and error floors. This solution fixes both the interference removal and data recovery simultaneously, reducing latency and eliminating those error floors.
💤 Quiet · 2607.09662 · Jul 10, 2026 · ~12 min · q-bio.NC, cs.AI, cs.LG
PHINN-EEG: Topological Time-Series Analysis of Dream-State EEG -- Dynamic Betti Curves for Dream Content Classification and Topology-Conditioned Neural Signal Synthesis
Ren Takahashi, Emre Yusuf, Jayabrata Bhaduri
⭐ 0 stars / 0 repos📚 0 cites
ELI5This paper uses topological mathematics (studying how shapes and patterns connect) on EEG brain waves to detect when people are dreaming, then generates synthetic dream EEG signals. Instead of just measuring signal power like current methods, it analyzes the geometric structure of neural activity patterns.
Problem solvedCurrent dream detection from EEG barely works (70% accuracy). This matters for brain-computer interfaces and sleep research where you need to know what state someone's brain is in. The paper proposes topology-based features that theoretically achieve much higher accuracy (82-90%) by capturing the actual geometry of brain activity patterns.
💤 Quiet · 2607.09657 · Jul 10, 2026 · ~8 min · cs.CV, cs.AI, cs.MM
Scalable Visual Pretraining for Language Intelligence
Yiming Zhang, Zhonghan Zhao, Wenwei Zhang, Haiteng Zhao, +12
⭐ 0 stars / 0 repos📚 0 cites
ELI5Instead of converting documents into plain text for training language models, this work shows that training directly on the visual layout and images of documents—charts, equations, page structure—gives better results than just using text alone.
Problem solvedLanguage models trained on text-only representations lose rich information from figures, equations, and document layouts. This wastes training data and limits what models can learn from visually complex sources like PDFs, papers, and web pages.
💤 Quiet · 2607.09654 · Jul 10, 2026 · ~14 min · cs.CV, cs.AI
Evolution of Accuracy and Visual-Cognitive Errors in a Decade of Vision-Language AI Models
Shravan Murlidaran, Miguel P. Eckstein
⭐ 0 stars / 0 repos📚 0 cites
ELI5Researchers tracked how well AI models describe images over 10 years, using a new dataset of complex social scenes instead of simple ones. They found modern AI now describes complicated behaviors almost as well as humans, but still sometimes looks at different parts of the image than people do.
Problem solvedPrevious benchmarks only tested AI on simple, curated images and didn't reveal what types of mistakes models were making. This gives builders a clearer picture of where vision-language models actually struggle on real-world complexity.
💤 Quiet · 2607.09653 · Jul 10, 2026 · ~11 min · cs.CR, cs.AI
VEXAIoT: Autonomous IoT Vulnerability EXploitation using AI Agents
Katherine Swinea, Kshitiz Aryal, Lopamudra Praharaj, Maanak Gupta
⭐ 0 stars / 0 repos📚 0 cites
ELI5AI agents that act like automated penetration testers can now hunt down and exploit security holes in IoT devices by reasoning about vulnerabilities and commanding hacking tools to attack them.
Problem solvedIoT devices are notoriously vulnerable but hard to test at scale—manually finding and exploiting flaws is slow and expensive. This automates the entire process, letting security teams rapidly assess risk across many devices.
💤 Quiet · 2607.09649 · Jul 10, 2026 · ~9 min · cs.AI
ConceptSMILE: Auditing the Trustworthiness of Concept-Based Explainable AI
Mohadeseh Mollapour, Koorosh Aslansefat, Zeinab Dehghani, Bhupesh Kumar Mishra, +2
⭐ 0 stars / 0 repos📚 0 cites
ELI5A tool that checks whether concept-based AI explanations (like 'this image shows vessel damage') are actually reliable, by testing how the model responds when you slightly change parts of the image and seeing if the explanation holds up.
Problem solvedConcept-based explanations seem intuitive to doctors and users, but there's no standard way to verify they're actually faithful to what the model is doing—you could get misleading explanations that sound trustworthy. This framework audits whether those concepts are real.
💤 Quiet · 2607.09645 · Jul 10, 2026 · ~11 min · stat.ML, cs.LG, math.ST
Deep Gaussian Processes on Directed Acyclic Graphs
Federico L. Perlino, Oliver Hamelijnck, Adam M. Johansen, Theodoros Damoulas
⭐ 0 stars / 0 repos📚 0 cites
ELI5This paper extends Gaussian Processes (a way to make predictions with built-in uncertainty) to work on directed acyclic graphs — chain-like structures where functions compose together. It solves the problem of handling noisy measurements at different points in these chains while tracking uncertainty through the whole system.
Problem solvedWhen you have a system where outputs feed into other processes (like gene regulation networks or multi-stage simulations), fitting models with uncertainty is hard because noise accumulates and you don't observe everything. This work lets you reconstruct the whole system from partial, messy observations while keeping track of what you're confident about.
💤 Quiet · 2607.09641 · Jul 10, 2026 · ~8 min · cs.LG, cs.AI
Semantic Pareto-DQN: A Multi-Objective Reinforcement Learning Framework for Financial Anomaly Detection
Cláudio Lúcio do Val Lopes, Lucca Machado da Silva
⭐ 0 stars / 0 repos📚 0 cites
ELI5A fraud detection system that uses AI to write short stories about transactions, then learns to catch suspicious ones without annoying legitimate customers—balancing two conflicting goals instead of picking just one.
Problem solvedFraud detection systems normally fail at catching fraud because legitimate transactions vastly outnumber fraudulent ones. This forces a painful trade-off: catch more fraud and block real customers, or avoid blocks and miss fraud. This approach lets you adjust that trade-off dynamically.
💤 Quiet · 2607.09632 · Jul 10, 2026 · ~9 min · quant-ph, cs.AI
Lean-QIT: Towards a Formal Infrastructure for Quantum Information Theory
Chengkai Zhu, Ziao Tang, Guocheng Zhen, Yimeng Cao, +5
⭐ 0 stars / 0 repos📚 0 cites
ELI5A library that lets computers formally verify quantum information theory theorems—like checking that data compression and communication schemes actually work the way mathematicians claim, with all the proofs machine-checkable.
Problem solvedQuantum information theory proofs are hard to verify and scattered across papers using different notation. This gives researchers a shared, checked foundation so they can build new quantum protocols and theorems without re-proving basics.
💤 Quiet · 2607.09629 · Jul 10, 2026 · ~10 min · cs.CV, cs.AI
4DR360: State Reasoning for Joint 3D Detection and Occupancy Prediction in 4D Radar-Camera Full-Scene Perception
Xiaokai Bai, Lianqing Zheng, Runwei Guan, Songkai Wang, +2
⭐ 0 stars / 0 repos📚 0 cites
ELI5A system that fuses 4D radar and camera data to simultaneously detect cars/objects AND create a dense map of what's occupying the scene in all directions, treating the occupancy map as an evolving state that gets refined over time rather than computed from scratch each frame.
Problem solvedSelf-driving cars need both object detection (where are the cars?) and occupancy prediction (what space is free to drive in). Radar is cheap but sparse, so it has to be fused with camera data, yet existing methods either ignore one task or treat the two separately with little interaction.
💤 Quiet · 2607.09623 · Jul 10, 2026 · ~11 min · cs.CL, cs.AI
Task-Specific Multimodal Question Answering Agents via Confidence Calibration and Incremental Reasoning for QANTA 2026
Nirjhar Das, Md. Al-Mamun Provath
⭐ 0 stars / 0 repos📚 0 cites
ELI5A system that answers trivia questions from partial clues (text + images) by using two specialized AI agents—one decides when to buzz in on tossup questions, the other carefully selects answers on bonus questions—using confidence scoring and reasoning rules instead of brute-force retrieval.
Problem solvedMultimodal trivia systems need to work fast with limited compute while handling two different question types with opposite constraints: tossup requires risk-aware timing (answer too soon = wrong, too late = someone else wins), bonus requires accuracy. This system wins the QANTA competition by building task-specific strategies rather than one generic approach.
💤 Quiet · 2607.09616 · Jul 10, 2026 · ~9 min · cs.ET, cs.AR, cs.LG
LLM for EDA in Front-End Design: Challenges and Opportunities
Kangwei Xu, Bing Li, Ulf Schlichtmann
⭐ 0 stars / 0 repos📚 0 cites
ELI5Large language models can help automate chip design by writing hardware code, creating test plans, and exploring design options—turning what used to be manual engineering work into something an AI can do end-to-end.
Problem solvedChip design is slow and expensive because engineers manually write thousands of lines of hardware code and tests. LLMs can draft this code automatically, letting teams move faster and explore more design possibilities without hiring more people.
💤 Quiet · 2607.09611 · Jul 10, 2026 · ~9 min · cs.CL
Toward Real-Time Sentence-Level Sign Language Translation
Thanh-Hoang Nguyen Doan
⭐ 0 stars / 0 repos📚 0 cites
ELI5A system that translates sign language videos to text in real-time by splitting the work between a cheap camera device and a more powerful backend computer, then cleverly batches and reorders the data to cut response time by over a quarter.
Problem solvedSign language translation systems were too slow for real conversation and required expensive hardware on-device. This builds a practical end-to-end system that runs on a Raspberry Pi client, cutting latency from 1.9 to 1.4 seconds so users can have actual back-and-forth communication.
💤 Quiet · 2607.09600 · Jul 10, 2026 · ~8 min · cs.AI, cs.CL
Agora: Enhancing LLM Agent Reasoning Via Auction-Based Task Allocation
Kaiji Zhou, Ales Leonardis, Yue Feng
⭐ 0 stars / 0 repos📚 0 cites
ELI5This paper builds a smarter task dispatcher for AI agents that works like an auction—instead of just picking the first tool that matches a job, it has multiple expert models bid on each reasoning step, and the most genuinely capable one wins the work. This prevents overconfident models from taking on tasks they'll bungle.
Problem solvedLLM agents often waste time and money by routing tasks to the first available tool that sounds relevant, or picking overconfident models that fail. You need a way to dynamically match each reasoning step to whichever expert is actually best at it, accounting for both performance and cost.
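The auction idea described in this card can be illustrated with a toy allocator. This is a minimal sketch under stated assumptions, not the paper's mechanism: the `Bid` fields, the `track_record` discount used to temper overconfident bids, and the `cost_weight` parameter are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bid:
    expert: str
    claimed_quality: float  # self-reported chance of solving the step, in [0, 1]
    track_record: float     # hypothetical: fraction of past claims that held up, in [0, 1]
    cost: float             # estimated cost of running this expert on the step


def allocate(bids: List[Bid], cost_weight: float = 0.1) -> Bid:
    """Pick the bid with the highest discounted quality minus weighted cost."""
    if not bids:
        raise ValueError("at least one bid is required")

    def utility(bid: Bid) -> float:
        # Discount the claim by the expert's track record so that an
        # overconfident expert cannot win on a high claim alone.
        return bid.claimed_quality * bid.track_record - cost_weight * bid.cost

    return max(bids, key=utility)


bids = [
    Bid("overconfident", claimed_quality=0.95, track_record=0.5, cost=1.0),
    Bid("calibrated", claimed_quality=0.80, track_record=0.9, cost=1.0),
    Bid("expensive", claimed_quality=0.90, track_record=0.9, cost=5.0),
]
winner = allocate(bids)
```

With these made-up numbers the calibrated expert wins: the overconfident bid is discounted by its poor track record, and the expensive bid loses on cost.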
💤 Quiet · 2607.09598 · Jul 10, 2026 · ~8 min · cs.CL
Tokenizer Transplantation: Mitigating Autoregressive Collapse in Edge-Efficient Bengali ASR
Sanjid Hasan, Md. Abdur Rahman
⭐ 0 stars / 0 repos📚 0 cites
ELI5A lightweight speech recognition model built for English breaks when you try to use it for Bengali because it splits Bengali words into tiny fragments. The fix: swap out the English word-breaking system for a Bengali-specific one, then shrink the model to fit. This makes the model work reliably and fast on edge devices.
Problem solvedCompact speech models for phones and edge devices work great in English but fail completely on morphologically rich languages like Bengali. Teams had to either retrain from scratch (expensive) or accept broken outputs. This swap-and-resize approach fixes it without retraining.
💤 Quiet · 2607.09590 · Jul 10, 2026 · ~9 min · cs.RO, cs.AI
PAC-ACT: Post-training Actor-Critic for Action Chunking Transformers
Yujie Pang, Zudong Li
⭐ 0 stars / 0 repos📚 0 cites
ELI5A method to improve robot policies that predict multiple action steps at once by using reinforcement learning instead of just copying human demonstrations, while keeping the model fast and memory-efficient for real factory work.
Problem solvedIndustrial robots trained on human examples alone fail when conditions change slightly or when they need to apply precise force (like assembly tasks); this method lets them learn safer, more reliable behaviors through trial-and-error while staying practical for real-time control.
💤 Quiet · 2607.09586 · Jul 10, 2026 · ~10 min · cs.AI
TrustX Agent Risk Classification Framework (ARC): Risk-Tiering Internally Created Agentic AI Systems
Hannah M. Liu, Rhea Saxena, Shiv Asthana
⭐ 0 stars / 0 repos📚 0 cites
ELI5A structured checklist system that helps organizations figure out how risky their AI agents are and what controls they need. It scores AI systems across 12 dimensions and spits out a risk tier (low/medium/high) with recommended safeguards.
Problem solvedCompanies building AI agents have no standard way to assess and manage their risks. Existing AI governance frameworks don't fit agentic systems specifically, so teams either over-regulate or under-protect. This gives them a concrete, repeatable tool to classify risk and decide what guardrails to implement.
💤 Quiet · 2607.09582 · Jul 10, 2026 · ~9 min · physics.flu-dyn, cs.LG
Entropy-Constrained Machine Learning with Residual Data Augmentation for Modeling Chemical Kinetics
Okezzi Ukorigho, Opeoluwa Owoyele
⭐ 0 stars / 0 repos📚 0 cites
ELI5A machine learning model learns to predict chemical reaction rates in flames much faster than computing them from scratch, and stays physically realistic by enforcing the second law of thermodynamics as a hard constraint during training.
Problem solvedSimulating turbulent flames with detailed chemistry is extremely slow because calculating reaction rates at every grid point is expensive. This replaces that bottleneck with a fast neural network that respects physics, cutting computation time 10x+ while staying accurate.
💤 Quiet · 2607.09578 · Jul 10, 2026 · ~10 min · cs.AI
Knowledge Graphs and Explainable AI as Complementary Resources for Urban Mining
Jan Gronewald, Andreas Emrich, Nijat Mehdiyev
⭐ 0 stars / 0 repos📚 0 cites
ELI5When buildings are demolished, auditors need to decide what materials are valuable and safe to recover. This work shows how combining knowledge graphs (structured databases of facts) with explainable AI creates better audit reports that regulators will actually trust and approve.
Problem solvedPre-demolition auditors need defensible decisions they can explain to regulators—not just accurate predictions. Neither knowledge graphs nor explainable AI alone provides the full audit trail, sourcing, and contestability required by law.
💤 Quiet · 2607.09576 · Jul 10, 2026 · ~10 min · cs.CL, cs.AI, cs.ET
Conceptual Networks for Cross-Linguistic Idiomatic Expressions: A Feature-Based Graph Approach
Kiran Pala, Punam Silu, Lixun Yu
⭐ 0 stars / 0 repos📚 0 cites
ELI5Instead of treating idioms as black-box text, this work maps them as a network of shared conceptual patterns—like 'spill the beans' and 'let the cat out of the bag' cluster together because they both involve revealing secrets. It works across 8 languages and outperforms standard embedding models.
Problem solvedIdioms are hard for AI to understand because their meaning doesn't come from word definitions—models trained on raw text statistics miss the conceptual patterns humans use to grasp and translate them. This gives AI a structured, interpretable way to handle idioms across languages.
💤 Quiet · 2607.09566 · Jul 10, 2026 · ~12 min · cs.CE, cs.AI, math.OC
Large-Scale Portfolio Optimization Problem Under Cardinality Constraint With Enhanced Multi-Objective Evolutionary Algorithms
Danial Ramezani, Mostafa Abouei Ardakan
⭐ 0 stars / 0 repos📚 0 cites
ELI5Investors need to pick which stocks to buy and how much of each—but with thousands of options and limits on how many they can hold, finding the best mix is nearly impossible. This paper builds better algorithms that quickly find good portfolios by using specialized techniques borrowed from nature-inspired optimization methods.
Problem solvedPortfolio managers waste time and computing power trying to balance risk and return while respecting real constraints like "only hold 20–50 stocks." Existing tools either are too slow or miss good solutions. This work makes those tools faster and more effective at scale.
💤 Quiet · 2607.09562 · Jul 10, 2026 · ~7 min · cs.CV, cs.AI
TCLA: Training-Free Class-wise Logit Adaptation for Medical Vision-Language Models
Tianyou Jiang, Ziyu Zhou
⭐ 0 stars / 0 repos📚 0 cites
ELI5A method that tweaks the final predictions of medical image AI models using just a few examples, without retraining anything. It adjusts the confidence scores to account for new data patterns the model hasn't seen before.
Problem solvedMedical AI models trained on general internet data perform poorly on actual hospital images due to domain shift. Retraining is slow, unstable with tiny datasets (1-2 examples), and risky in clinical settings. This fixes predictions on-the-fly with minimal data and zero retraining.
💤 Quiet · 2607.09560 · Jul 10, 2026 · ~14 min · cs.AI, cs.LG
Beyond Fixed Representations: The Vocabulary and Verifier Gaps in Open-Ended AI
Yuan Cao, Haiqian Yang
⭐ 0 stars / 0 repos📚 0 cites
ELI5Today's AI systems are stuck working within a fixed rulebook—they can reason and solve problems really well, but can't invent new concepts or tools that would let them tackle fundamentally different kinds of problems. This paper says true innovation requires AI to create and stabilize new building blocks that change the game itself.
Problem solvedCurrent AI hits a wall on open-ended tasks because it can only remix existing ideas, not invent new ones that unlock whole classes of solutions. Without the ability to create and trust new conceptual primitives, AI systems can't do the kind of foundational innovation humans do.
💤 Quiet · 2607.09546 · Jul 10, 2026 · ~5 min · cs.LG, math.NA, math.OC
Graph-Regularized Low-Rank Matrix Completion by Variable Projection
Benoît Loucheur, P. -A. Absil, Michel Journée
⭐ 0 stars / 0 repos📚 0 cites
ELI5When you have a matrix with missing values, this method fills them in by assuming the data is low-rank (simple) and by using the graph structure of how rows and columns relate to each other—like knowing which items are similar helps you guess missing ratings better.
Problem solvedMatrix completion (filling in missing data) often ignores relationships between rows/columns. By incorporating graph structure, this approach recovers missing values more accurately when data has natural groupings or correlations, useful for recommender systems and sensor networks.
💤 Quiet · 2607.09544 · Jul 10, 2026 · ~10 min · cs.CV, cs.LG
The Count Is There, but Misaligned: Understanding and Correcting Counting Failures in VLMs
Ahmed Oumar El-Shangiti, Abzal Nurgazy, Hilal AlQuabeh, Nikolai Rozanov, +1
⭐ 0 stars / 0 repos📚 0 cites
ELI5Vision-language models know how to count but give wrong answers anyway. Researchers found they can detect when the model will mess up by watching its internal brain activity, then ask it to try again—boosting accuracy by 15% without retraining.
Problem solvedVLMs fail at basic counting tasks despite having the ability internally. This breaks real applications like inventory management and visual inspection. Now you can catch these failures at inference time and fix them automatically.
💤 Quiet · 2607.09543 · Jul 10, 2026 · ~8 min · cs.LG, q-bio.NC
CoCoT-EEG: Contrastive-Pretrained Multiscale Convolutional Transformer for EEG Decoding
Gabriel Mahuas, Victoria Shevchenko, Ugo Tanielian, Yassir Bendou, +1
⭐ 0 stars / 0 repos📚 0 cites
ELI5A new AI model learns from raw brain wave (EEG) recordings by comparing similar and different patterns, rather than trying to reconstruct missing data like previous models do. It then uses these learned patterns to decode what someone is thinking or doing from their brain signals.
Problem solvedEEG data is noisy and its useful information is scattered across specific frequencies and time patterns, making standard pretraining methods inefficient. This model decodes brain activity more accurately with less data, enabling better brain-computer interfaces and neuroscience applications.
💤 Quiet · 2607.09537 · Jul 10, 2026 · ~12 min · cs.LG
GatedLinear: Adaptive Routing of Complementary Linear Bases for Time Series Forecasting
Qitai Tan, Ruiwen Gu, Yilin Su, Mo Li, +2
⭐ 0 stars / 0 repos📚 0 cites
Time series forecasting requires models to capture diverse, often mutually exclusive, temporal dynamics, from smooth trend continuation to nonstationary drift and strict phase-aligned recurrence. While recent deep learning models have improved accuracy, they typically force these diverse patterns through a single compu…
💤 Quiet · 2607.09532 · Jul 10, 2026 · ~6 min · cs.LG, cs.CR, stat.ML
Statistically Undetectable Backdoors in Deep Neural Networks
Andrej Bogdanov, Alon Rosen, Neekon Vafa
⭐ 0 stars / 0 repos📚 0 cites
We show how an adversarial model trainer can plant backdoors in a large class of deep, feedforward neural networks. These backdoors are statistically undetectable in the white-box setting, meaning that the backdoored and honestly trained models are close in total variation distance, even given the full descriptions of…
💤 Quiet · 2607.09530 · Jul 10, 2026 · ~11 min · cs.CL
FreyaTTS Technical Report
Ahmet Erdem Pamuk, Ömer Yentür, Ahmet Tunga Bayrak, Yavuz Alp Sencer Öztürk, +1
⭐ 0 stars / 0 repos📚 0 cites
We introduce Freya-TTS, a compact, tokenizer-free, Turkish-first text-to-speech model designed for highly reliable and efficient conversational synthesis. Freya-TTS is a 183.2M-parameter non-autoregressive conditional flow-matching Diffusion Transformer (DiT) that operates in the frozen continuous latent space of Audio…
💤 Quiet · 2607.09528 · Jul 10, 2026 · ~8 min · cs.LG, cs.CR, cs.DB
TSAI-MetaFraud: A Benchmark Dataset for Financial Fraud Transaction and Behavioral Risk Detection in Metaverse Ecosystems
Refat Ishrak Hemel, Ehsan Hallaji, Roozbeh Razavi-Far
⭐ 0 stars / 0 repos📚 0 cites
The emergence of metaverse platforms has created virtual economies that introduce new challenges related to fraud, bot activity, and illicit financial behavior. Despite growing interest in trustworthy metaverse analytics, existing datasets typically focus on user behavior, authentication, or financial transactions in i…
💤 Quiet · 2607.09526 · Jul 10, 2026 · ~7 min · cs.CV, cs.AI
ALICE: Learning a General-Purpose Pathology Foundation Model from Vision, Vision-Language, and Slide-Level Experts
Jiawen Li, Tian Guan, Huijuan Shi, Xitong Ling, +4
⭐ 0 stars / 0 repos📚 0 cites
Foundation models are reshaping computational pathology, yet their capabilities remain shaped by pretraining objectives, data sources, and spatial scales, fragmenting complementary expertise across separate backbones. Here we present ALICE, a unified foundation model trained through multi-stage agglomerative distillati…
💤 Quiet · 2607.09521 · Jul 10, 2026 · ~10 min · cs.AI
SAGEAgent: A Self-Evolving Agent for Cost-Aware Modality Acquisition in Multimodal Survival Prediction
Chongyu Qu, Can Cui, Zhengyi Lu, Junchao Zhu, +7
⭐ 0 stars / 0 repos📚 0 cites
Does every cancer patient truly need a complete diagnostic workup for accurate survival prediction? In multimodal clinical oncology, diagnostic modalities follow a clinically mandated order of escalating burden -- from demographics collected at intake to genomic profiling requiring specialized tissue analysis. Current…
💤 Quiet · 2607.09520 · Jul 10, 2026 · ~15 min · cs.CV, cs.AI
Seeing is Free, Speaking is Not: Uncovering the True Energy Bottleneck in Edge VLM Inference
Junfei Zhan, Haoxun Shen, Mingang Guo, Zixuan Huang, +1
⭐ 0 stars / 0 repos📚 0 cites
Vision-Language Models (VLMs) are the perceptual backbone of embodied AI, but their energy footprint on edge hardware remains poorly understood. Existing efficiency efforts focus predominantly on reducing visual tokens, implicitly treating visual processing as the dominant energy cost. We overturn this implicit assumpt…
2607.09510 · Jul 10, 2026 · ~10 min · cs.SE, cs.AI
Failure as a Process: An Anatomy of CLI Coding Agent Trajectories
Xiangxin Zhao, Han Li, Shuaiting Li, Tianyi Zhao, +3
Large language model (LLM) coding agents are increasingly deployed to autonomously perform software engineering tasks in terminal-based environments, making their reliability a growing concern. Existing empirical studies investigate why coding agents fail, yet they largely treat failure as a final outcome rather than a…
2607.09503 · Jul 10, 2026 · ~11 min · cs.CV, cs.AI
What VGGT Knows About Overlap: Probing Geometric Foundation Models for Co-Visibility
Filippo Ziliotto, Luciano Serafini, Lamberto Ballan, Tommaso Campari
A fundamental challenge in 3D reconstruction and robotic localization is co-visibility: determining which image pairs share overlapping visible surfaces, particularly in scenarios with minimal overlap. We demonstrate that VGGT implicitly encodes co-visibility as an emergent behavior: without any supervision for this ta…
2607.09502 · Jul 10, 2026 · ~11 min · cs.LG, cs.AI, cs.IR
All Explanations are Wrong, But Many Are Useful: Exploring the Rashomon Explanation Set with Large Language Models
Pan Li
Explaining machine-learning models is increasingly important for decision-making and consumer trust, yet it is widely believed to come at a cost: existing Explainable AI (XAI) methods suffer from a persistent accuracy-explainability trade-off. We argue that this trade-off is not fundamental, but an artifact of treating…
2607.09501 · Jul 10, 2026 · ~13 min · cs.CL, stat.AP
Normalisation-Based Likelihood Ratio Estimation for Forensic Authorship Verification
Sadie Barlow, Andrea Nini, Edoardo Manino
Authorship verification (AV) is the task of determining whether two texts were written by the same author. In a forensic context, the strength of AV evidence can be quantified using likelihood ratios. Most AV methods are score-based and deriving well-calibrated likelihood ratios from these scores requires a separate ca…
2607.09493 · Jul 10, 2026 · ~13 min · cs.AI, cs.MA, cs.SE
Shared Selective Persistent Memory for Agentic LLM Systems
Sanjana Pedada, Aditya Dhavala, Neelraj Patil
Agentic LLM systems that generate code through multi-turn tool use face a fundamental context problem: each session starts from zero, discarding the configuration choices, domain constraints, data schemas, and tool-use patterns that made previous sessions productive. Naively persisting entire conversation histories is…
2607.09492 · Jul 10, 2026 · ~11 min · cs.AI
Multimodal Reward Hacking in Reinforcement Learning
Jiayu Yao, Yiwei Wang, Anmeng Zhang, Zhe Sun, +4
Reinforcement learning (RL) is increasingly used to align multimodal large language models (MLLMs), but higher rewards do not always imply better task performance. This risk is amplified when visual evidence is evaluated by text-only or weakly grounded rewards. We study reward hacking in MLLM RL across safety VQA, char…
2607.09490 · Jul 10, 2026 · ~13 min · cs.DS, cs.CG, cs.LG
Terminal Dimension Reduction for Time Series with Applications
Alexander Munteanu, Matteo Russo, David Saulpic, Chris Schwiegelshohn
Terminal embeddings have emerged as a powerful tool for dimension reduction. Given a set of points $P\subset \mathbb{R}^d$, a terminal embedding is a mapping $f:\mathbb{R}^d\rightarrow \mathbb{R}^t$ that preserves the pairwise distance between any pair of points $p\in P$ and $q\in \mathbb{R}^d$ up to small distortion u…
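The abstract above is cut off before it states the guarantee. As a hedged aid, one common formulation of a terminal embedding from the literature is written out below; the symbol $\varepsilon$ for the distortion and the choice of the Euclidean norm are assumptions, and the paper's own statement may differ in its details.

```latex
% One common formulation (assumed, not quoted from this paper):
% f is a terminal embedding of P with distortion 1 + \varepsilon if
\[
  \|p - q\|_2 \;\le\; \|f(p) - f(q)\|_2 \;\le\; (1 + \varepsilon)\,\|p - q\|_2
  \qquad \text{for all } p \in P,\ q \in \mathbb{R}^d .
\]
```

The point of the definition is that only one endpoint, $p$, must come from the fixed set $P$; the other endpoint $q$ ranges over the whole space, which is stronger than a guarantee covering only pairs inside $P$.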
2607.09489 · Jul 10, 2026 · ~7 min · cs.AI, cs.PL
Ceci n'est pas une pipe: AI systems as semantic abstractions
Jade Alglave, Patrick Cousot
An AI system's output is not the fact or world state it appears to describe, but rather an engineered representation. We propose a semantic framework to describe AI systems, to be able to examine the correctness of such representations. To do so, we distinguish what is justified by accepted domain knowledge, what refer…
2607.09487 · Jul 10, 2026 · ~10 min · cs.LG, cs.CL, stat.ML
Neural Collapse Is Forbidden: Information Floors in Language Models
Bruno Abrahao
Within-class variance in language-model representations is commonly read as incomplete neural collapse. We argue it is allocated information storage, and that the allocation obeys a law. A one-line centering identity voids a family of simplex equiangular-tight-frame claims, including our own earlier ones; in dimensionl…
2607.09481 · Jul 10, 2026 · ~11 min · cs.CV, cs.AI
Decoupling Language Guidance from Backbones for Text-Guided Medical Segmentation
Yungeng Liu, Xuanzi Fang, Haijin Zeng, Qi Dai, +1
Text-guided medical image segmentation leverages clinical semantics to improve lesion delineation, yet many existing models bind cross-modal fusion, supervision, and decoder design into a task-specific architecture. Such tight coupling makes it difficult to reuse language guidance modules across heterogeneous vision an…
2607.09480 · Jul 10, 2026 · ~8 min · cs.CV, cs.LG, cs.NE
Foveation-Guided Dynamic Token Selection for Robust and Efficient Vision Transformers
Ibrahim Batuhan Akkaya, Kishaan Jeeveswaran, Bahram Zonooz, Elahe Arani
The human visual system (HVS) employs foveated sampling and eye movements to achieve efficient perception, conserving both metabolic energy and computational resources. Drawing inspiration from this robustness and adaptability, we introduce the Foveated Dynamic Transformer (FDT), a foveation-guided dynamic token-select…
2607.09474 · Jul 10, 2026 · ~9 min · cs.AI
ProofCouncil: An LLM Agent for Solving Open Mathematical Problems
Johannes Schmitt, Tim Gehrunger, Jasper Dekoninck, Gergely Bérczi, +3
Large language models (LLMs) have shown increasing promise in solving open problems in mathematics. However, their performance can be further improved through agentic workflows tailored to real-world mathematical practice. To this end, we introduce ProofCouncil, a mathematical agent that is designed to tackle open prob…
2607.09456 · Jul 10, 2026 · ~11 min · cs.LG
Active rejection enables reliable generalization of universal machine-learning interatomic potentials
Mingxiang Luo, Xinnan Mao, Lu Wang, Lei Bai, +2
Universal machine learning interatomic potentials (uMLIPs) bridge quantum-mechanical accuracy and large-scale molecular dynamics, but the cost of high-accuracy calculations such as r$^2$SCAN limits training to datasets that remain small relative to the open materials space. Strong average benchmark performance also doe…
2607.09452·Jul 10, 2026·~9 min·cs.SE, cs.AI
Practical Source Code Recovery from Binary Functions Using Anchor-Based Retrieval and LLM Reasoning
Charles Edward Gagnon, Steven H. H. Ding, Philippe Charland, Benjamin C. M. Fung
We present a practical pipeline for recovering source code from stripped binary functions by combining reverse engineering, anchor-based source code retrieval, and large language model reasoning. Our binary-to-source-code retrieval method attempts to identify the source function from a source code database, rather than…
2607.09450·Jul 10, 2026·~9 min·cs.CV, cs.LG
Robustifying Vision-Language Models via Test-Time Prompt Adaptation
Xingyu Zhu, Huanshen Wu, Shuo Wang, Beier Zhu, +3
Pre-trained Vision-Language Models (VLMs) such as CLIP achieve strong zero-shot generalization, but their performance degrades sharply under adversarial perturbations. Existing test-time adaptation methods typically rely on sample-level confidence heuristics, overlooking the intrinsic distributional structure of the da…
2607.09449·Jul 10, 2026·~8 min·cs.AI
How Does Bayesian Causal Discovery Fail? Characterising Structural Consequences in Linear Gaussian Networks under Latent Confounding
Debargha Ghosh, Silja Renooij, Anna Kononova
Bayesian causal discovery is widely used for its ability to quantify epistemic uncertainty over directed acyclic graphs (DAGs) through posterior inference. However, its behaviour under latent confounding remains poorly understood, as existing work typically notes that confounding breaks identifiability without characte…
2607.09443·Jul 10, 2026·~10 min·cs.CV, cs.AI
Parameter-Efficient Vision-Language Adaptation with Continuous Metadata Conditioning for Animal Re-Identification
Anil Osman Tur, Tonje Knutsen Sordalen, Kim Tallaksen Halvorsen, Cigdem Beyan
Long-term animal re-identification (ReID) must remain robust to gradual morphological evolution and seasonal appearance shifts. Although recent vision-language models provide strong pretrained visual representations, adapting them to longitudinal ecological settings remains challenging, particularly under identity and…
2607.09438·Jul 10, 2026·~11 min·cs.CL, cs.AI, cs.LG
Test-Time Scaling for Small VLMs on Multilingual Visual MCQ
Spiros Baxevanakis, Peng-Jian Yang
Test-time scaling (TTS) reliably improves reasoning in large language models, but whether it transfers to small open vision-language models remains unclear. We examine this on EXAMS-V, a multilingual visual multiple-choice benchmark, comparing self-consistency, describe-then-reason with PRM-guided beam search, and two…
