🚀 Shipping · score 76.6 · May 15, 2026 · 2605.16107 · cs.CL

Multi-Level Contextual Token Relation Modeling for Machine-Generated Text Detection

Chenwang Wu, Yiuming Cheung, Bo Han, Shuhai Zhang, Defu Lian

Narrative

Token-level detection scores in machine-generated text (MGT) detectors are noisy because LLM outputs are inherently stochastic: the same prompt can produce text whose per-token log-probabilities vary widely. This paper addresses that noise by modeling how token-level scores relate to each other across a sequence. A Markov-informed calibration module smooths local transitions, while a rule-support reasoning module applies logical rules derived from global score statistics. The combined framework sits on top of existing metric-based detectors (such as DetectGPT and Fast-DetectGPT) rather than replacing them, and the authors claim broad gains on cross-LLM and cross-domain benchmarks with low added compute.
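To make the two levels concrete, here is a minimal sketch, not the authors' implementation. It assumes per-token scores are log-probabilities where higher values point toward machine-generated text. Simple first-order exponential smoothing stands in for the Markov-informed calibration, and two threshold rules on the mean and standard deviation stand in for the rule-support module. Every function name, threshold, and weight below is hypothetical.

```python
from statistics import mean, pstdev


def smooth_scores(scores, alpha=0.5):
    """First-order smoothing: blend each score with the previous smoothed value.

    A stand-in for local calibration; alpha is a hypothetical blending weight.
    """
    if not scores:
        return []
    smoothed = [scores[0]]
    for s in scores[1:]:
        smoothed.append(alpha * s + (1 - alpha) * smoothed[-1])
    return smoothed


def rule_support(scores, mean_threshold=-2.0, std_threshold=1.0):
    """Fraction of (hypothetical) global rules that the sequence satisfies."""
    rules = [
        mean(scores) > mean_threshold,   # rule 1: high average score
        pstdev(scores) < std_threshold,  # rule 2: low dispersion across tokens
    ]
    return sum(rules) / len(rules)


def detect(scores, alpha=0.5, weight=0.5):
    """Combine the locally smoothed mean score with the global rule signal."""
    local = mean(smooth_scores(scores, alpha))
    return local + weight * rule_support(scores)
```

For example, `detect([-1.0, -1.0, -1.0])` combines a smoothed mean of `-1.0` with full rule support. The token scores themselves would come from whichever base detector is in use, which is what lets a layer like this sit on top of an existing detector.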

No production traction yet. The GitHub references are all arXiv feed aggregators, not implementations, and the paper has zero citations at the time of writing. The work is recent. The underlying idea, stacking a lightweight inference layer on existing zero-shot detectors, is practical enough to ship, but so far nothing has been deployed or even open-sourced by the authors.

Abstract

Machine-generated texts (MGTs) pose risks such as disinformation and phishing, underscoring the need for reliable detection. Metric-based methods, which extract statistically distinguishable features of MGTs, are often more practical than complex model-based methods that are prone to overfitting. Given their diverse designs, we first place representative metric-based methods within a unified framework, enabling a clear assessment of their advantages and limitations. Our analysis identifies a core challenge across these methods: the token-level detection score is easily biased by the inherent randomness of the MGT generation process. Then, we theoretically derive the multi-hop transitions of the token-level detection score and explore their local and global relations. Based on these findings, we propose a multi-level contextual token relation modeling framework for MGT detection. Specifically, for local relations, we model them through a lightweight Markov-informed calibration module that refines token-level evidence before aggregation. For global relations, we introduce a rule-support reasoning module that uses explicit logical rules derived from contextual score statistics. Finally, we combine the local calibrated score and the global rule-support reasoning signal in a joint multi-level inference framework. Extensive experiments show broad and substantial improvements across various real-world scenarios, including cross-LLM and cross-domain settings, with low computational overhead.
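One way to read the unified view of metric-based methods is as a per-token score function followed by an aggregation step. The sketch below illustrates that reading with three classic token-level scores (log-likelihood, negative entropy, negative log-rank) and a plain mean as the aggregator. It is an assumption for illustration, not the paper's formulation. The toy next-token distributions are plain dictionaries, and all function names are hypothetical.

```python
import math


def log_likelihood(dist, token):
    """Log-probability the scoring model assigns to the observed token."""
    return math.log(dist[token])


def neg_entropy(dist, token):
    """Negative entropy of the predicted distribution (ignores the token)."""
    return sum(p * math.log(p) for p in dist.values() if p > 0)


def neg_log_rank(dist, token):
    """Negative log of the observed token's rank (1 = most probable)."""
    ranked = sorted(dist, key=dist.get, reverse=True)
    return -math.log(ranked.index(token) + 1)


def sequence_score(dists, tokens, token_score):
    """Aggregate token-level scores into one sequence-level score by averaging."""
    scores = [token_score(d, t) for d, t in zip(dists, tokens)]
    return sum(scores) / len(scores)


# Toy usage: two positions, each with a next-token distribution.
dists = [{"a": 0.7, "b": 0.2, "c": 0.1}, {"a": 0.5, "b": 0.5}]
tokens = ["b", "a"]
for fn in (log_likelihood, neg_entropy, neg_log_rank):
    print(fn.__name__, sequence_score(dists, tokens, fn))
```

Under this reading, a single unusually low or high token score shifts the average directly, which is the kind of sensitivity to sampling randomness that the abstract identifies as the core challenge.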
