Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
arXiv (quant-ph) 2026-06-17

Approximately Decoding the Colour Code

Authors:

arXiv:2606.18035v1 Announce Type: new Abstract: Recently we showed that minimum weight decoding in the (6.6.6 planar) colour code is NP-hard. However, it remained an open question as to whether it was possible to approximate the minimum weight decoding arbitrarily closely in polynomial time. In this paper we prove that it is possible: for any $\varepsilon>0$ there is an polynomial time algorithm that, given a syndrome, can find an error-set generating that syndrome whose weight is at most $1+\varepsilon$ times the weight of the minimum weight decoding. As a consequence we see that, for any $\varepsilon>0$, there is a polynomial time algorithm that can correct all errors of weight up to $(1-\varepsilon)d/2$ in the distance $d$ colour code (so almost up to the theoretical $d/2$ limit). The polynomial we give is impractically large, but it does open the door for sensible polynomial time algorithms that approximate minimum weight decoding and, in particular, shows that approximate decoding is not NP-hard.

02.
arXiv (quant-ph) 2026-06-17

A Quantum Approach to Stochastic Optimization in Insurance Underwriting

arXiv:2605.01169v2 Announce Type: replace Abstract: The presence of stochastic elements in combinatorial optimization problems makes them particularly challenging, as such problems quickly become intractable for classical computers even at relatively small sizes. In this work, we propose a novel quantum-classical hybrid scheme for solving a class of stochastic optimization problems known as chance-constrained knapsack problems, in which item weights follow probability distributions and constraints may be violated within a specified risk tolerance. Our method employs knapsack-specific QAOA-based circuits to generate samples which, when combined with a new self-consistent classical recovery scheme introduced in this work, produce high-quality solutions. Experiments carried out on IBM Heron processors, using circuits with depths up to 177 and comprising 3443 gates acting on as many as 150 qubits, yield solutions that indicate performance comparable to classical optimization schemes. The proposed quantum-classical scheme paves the way to tackling such problems, with the potential to outperform approaches that rely solely on classical computation.

03.
arXiv (quant-ph) 2026-06-16

Noise-Adaptive Predictive Dynamical Decoupling

arXiv:2606.15769v1 Announce Type: new Abstract: Protecting quantum coherence against realistic environmental noise remains one of the fundamental obstacles to scalable quantum technologies. We develop a noise-adaptive dynamical decoupling framework that combines analytical open-quantum-system modeling with machine-learning-based forecasting for a qubit interacting with random telegraph noise. Unlike conventional dynamical decoupling protocols based on fixed pulse schedules, the proposed approach continuously forecasts short-time coherence evolution and adaptively applies control pulses according to the instantaneous noise dynamics. We investigate stationary and non-stationary environments spanning both Markovian and non-Markovian regimes. Numerical simulations demonstrate that the machine-learning-assisted adaptive control strategy substantially outperforms conventional periodic dynamical decoupling while using a comparable number of control pulses. The improvement becomes particularly pronounced in non-Markovian and non-stationary regimes, where memory effects, coherence revivals, and temporally evolving noise strongly limit the effectiveness of static pulse protocols. These results establish predictive machine-learning-assisted dynamical decoupling as a promising and scalable framework for adaptive quantum control in realistic noisy quantum devices.

04.
arXiv (CS.LG) 2026-06-18

Effects of sparsity and superposition on loss in simple autoencoders

arXiv:2606.18538v1 Announce Type: new Abstract: One of the major difficulties in the mechanistic interpretability of neural networks is the occurrence of polysemanticity, which suggests that each neuron is typically responsible for multiple different tasks, impeding a clean interpretation of their function. The seminal paper of Elhage et al. (2022) argues that this occurs due to superposition, a phenomenon where the neural network represents distinct features as non-orthogonal directions in a lower-dimensional space, a strategy that allows much greater compression of the data without sacrificing fidelity due to the feature sparsity of input vectors. Elhage et al. (2022) empirically validates these hypotheses in a rather natural and simple autoencoder with sparse inputs. The contribution of the present work is to analyze the mathematical basis for the occurrence and optimality of superposition, while rigorously corroborating some of their findings. In particular, we provide upper and lower bounds for the L2 reconstruction loss, tight in the very sparse regime, for power activation functions. A short list of interesting open problems are also included at the end.

05.
arXiv (CS.CL) 2026-06-12

Beyond the Commitment Boundary: Probing Epiphenomenal Chain-of-Thought in Large Reasoning Models

Chain-of-thought (CoT) reasoning is the dominant paradigm for inference-time scaling in language models, yet the causal influence of individual steps on the final answer poorly understood. We estimate each step's causal importance via early exit and use this measure to study how answers form across the reasoning traces of several model families. Across diverse tasks, we find that reasoning typically crosses a commitment boundary – a sharp transition from transient intermediate guesses to a stable, high-confidence answer. This transition often happens in a single step, well before the model's reasoning block ends, and is followed by epiphenomenal CoT steps that leave the final answer probability unaltered. Using attention probes, we show that answer-formation stages can be linearly decoded from intermediate reasoning steps with high accuracy and generalize robustly to unseen reasoning tasks. We exploit this signal to early-exit reasoning blocks at the commitment boundary, reducing the length of CoTs up to 55\% on average with negligible impact on model performance.

06.
arXiv (CS.LG) 2026-06-19

SEAGAN: domain-Specific and Edge-Aware Graph Attention Network for Dynamic Plant Processes

arXiv:2606.19623v1 Announce Type: new Abstract: Graph neural networks (GNNs) provide a flexible framework for learning from scientific data linked through physical, biological, or functional relationships. One promising domain is plant physiology, where measured responses often arise from multiple interacting processes whose exact separation remains difficult even with manual intervention. In plant physiology, a key example is the A-Ci curve, which relates net CO2 assimilation rate (Anet) to leaf intercellular CO2 concentration (Ci) and is used to estimate photosynthetic parameters in leaf and crop-canopy models. However, reliable estimation requires identifying the active biochemical limitation state at each curve point, which remains a major source of uncertainty. Here, we formulate limitation-state identification along A-Ci curves as a graph-based node classification problem, with curve points as nodes. Domain-specific graph representations are created using distance-based k-nearest-neighbor (kNN) and auxiliary-signal-guided (ASG) connectivity, with edge attributes encoding pairwise relations. The framework was evaluated against conventional learning baselines, graph-based architectures, and an automated fitting-based benchmark. Results on a large synthetic dataset with known ground-truth limitation states show that graph-based models improve classification, particularly near biochemical transition regions. The best-performing configuration, SEAGAN (domain-Specific and Edge-Aware Graph Attention Network for Dynamic Plant Processes), integrates process-aware node features, edge attributes, kNN connectivity, and graph attention with weighted cross-entropy loss, achieving an F1-score of 0.857 and an accuracy of 0.882. The results show that representing A-Ci curves as graphs improves biochemical limitation-state analysis, with edge-aware attention over local kNN neighborhoods providing the most effective strategy.

07.
arXiv (CS.AI) 2026-06-16

ChatPlanner: A Large Language Model Framework for Personalized Public Transit Routing

arXiv:2606.15315v1 Announce Type: new Abstract: Personalized public transit routing in public transit systems remains challenging due to the difficulty of capturing and integrating diverse user preferences into routing algorithms. This paper presents ChatPlanner, a novel framework that leverages Large Language Models (LLMs) to enable preference aware public transit routing. Our approach employs fine-tuned LLMs with Retrieval-Augmented Generation (RAG) to extract routing parameters and interpret nuanced user preferences from natural language queries, subsequently integrating these preferences into the objective function of a public transit routing algorithm. This study designs preference aware datasets incorporating eight personas and five contexts to establish scoring standards for both fine-tuning and RAG. This work conducted three experiments to validate the solutions' feasibility, extraction of routing information and preferences, and solution set quality and completeness. Results demonstrate that ChatPlanner generates feasible solutions reliably. Fine-tuning enforces the required output structure and learns general preference patterns, while RAG provides query-specific context to resolve imprecise or conversational expressions and calibrate continuous scores. The combination of both achieves the highest accuracy in routing information extraction and user preference interpretation. Results based on selected case studies show that by capturing user preferences, ChatPlanner identifies valuable solutions across different dimensions that existing route planners overlook, generating more valuable route alternatives. This research establishes a new paradigm for integrating natural language understanding into transportation optimization.

08.
Nature (Science) 2026-06-10

Molecular glue degraders of HuR suppress BRAF-mutant colorectal cancer

Authors:

BRAF gain-of-function mutations, particularly BRAF(V600E), affect roughly 10% of all patients with colorectal cancer (CRC), and portend poor prognosis with limited therapeutic interventions. BRAF inhibitors such as encorafenib are ineffective due to MAPK pathway reactivation driven by BRAF dimerization. Combined inhibition of BRAF and EGFR, although approved therapies, results in short survival benefits and frequent treatment resistance and relapse1–3. Here, through rational chemical library design coupled with parallel proteomic screening, we identified dHuR as a molecular glue degrader of human antigen R (HuR), an RNA-binding protein that drives tumour growth, invasion and therapy resistance. dHuR binds to the CRBN ubiquitin ligase to create a unique benzofuran-tethered composite surface to recruit HuR as a neosubstrate by engaging its β-hairpin G-loop degron, as revealed by the cryo-electron microscopy structure of the ternary complex. dHuR abrogated BRAF expression by inducing its exon 18 skipping, and demonstrated superior suppression of BRAF-mutant CRC tumours including those gaining resistance to BRAF inhibitors. Finally, we performed kinome library CRISPR screening and revealed that inactivation of EGFR or MEK enhanced dHuR cytotoxicity, thus establishing a combinatorial strategy to treat patients with refractory BRAF-mutant CRC. Molecular glue degraders of the RNA-binding protein HuR have therapeutic potential for BRAF-mutant cancers.

09.
arXiv (CS.AI) 2026-06-18

SkillRevise: Improving LLM-Authored Agent Skills via Trace-Conditioned Skill Revision

arXiv:2606.01139v3 Announce Type: replace Abstract: Agent skills are procedural artifacts that enable LLM agents to execute workflows, verify constraints, and recover from failures. Existing self-evolving methods refine skills using accumulated trajectories. However, they struggle in cold-start settings, where only an initial, imperfect skill is available. Consequently, skill construction defaults to expert authoring or one-shot LLM generation. Expert-authored skills are costly and may not align with how LLM agents actually execute tasks, while one-shot generated skills can be syntactically well formed yet behaviorally weak. To bridge this gap, we propose SkillRevise, an execution-grounded framework designed to iteratively refine these initial skills. SkillRevise diagnoses skill defects from execution evidence, retrieves relevant repair principles from a general memory, and applies execution-anchored edits. By re-executing candidates, it retains the first verifier-passing skill within the revision budget and falls back to empirical utility only when no candidate succeeds. Evaluated across three benchmarks and five LLMs, SkillRevise substantially outperforms one-shot baselines, improving the base agent's success rate on SkillsBench from 36.05% to 61.63%. Furthermore, the revised skills transfer across both executors and task environments, suggesting that SkillRevise captures reusable procedural knowledge beyond any single executor.

10.
arXiv (CS.CL) 2026-06-12

Quickest Detection of Hallucination Onset: Delay Bounds and Learned CUSUM Statistics

Authors:

Token-level hallucination detectors are evaluated as classifiers, by AUC over all tokens, yet a streaming monitor is judged by its reaction time: the number of tokens that pass between the onset of a hallucination and the alarm. We formulate hallucination onset detection as a quickest change detection problem. A first-order Markov model of the latent faithful/hallucinated state, validated on RAGTruth, places the task inside classical change-point theory and yields Lorden's lower bound on detection delay: about 1.3 tokens at a false-alarm rate of 0.01. We then show that a causal recurrent labeler acts as a CUSUM with a learned increment; at a matched false-alarm rate it detects in 11-13 tokens, against 31 for a linear per-token baseline, and a controlled decomposition attributes most of this advantage to a better per-token score rather than to temporal accumulation. An information-rate optimality theorem of Donsker-Varadhan type explains the remaining order-of-magnitude gap: the learned score realizes only 1/4.5 of the divergence the features carry, a deficit that recalibration cannot remove, with the remainder a finite-horizon effect. Classification metrics conceal this delay structure; sequential analysis makes it measurable

11.
arXiv (CS.CL) 2026-06-18

ForecastBench-Sim: A Simulated-World Forecasting Benchmark

Forecasting benchmarks for general-purpose AI systems usually inherit the constraints of the real world: outcomes resolve slowly, tail events are rare, and counterfactual questions are difficult to score. We introduce ForecastBench-Sim, a simulated-world forecasting benchmark built on game rollouts from Freeciv, a turn-based strategy game modelled on the Civilization series. Forecasters receive a fixed world report (a structured snapshot of the current game state) and answer questions about hidden future states; the benchmark then continues the simulation and scores forecasts. Because the world is simulated, the same setup can generate continuous or binary forecasting questions at arbitrary time horizons, paired intervention worlds for conditional or causal questions, and resolved examples of rare or disruptive outcomes. We describe the benchmark pipeline, question families, scoring protocol, and release artifacts, and report validation slices from model evaluations and an anonymized human pilot. ForecastBench-Sim is intended to complement real-world forecasting benchmarks by providing controlled, immediately resolvable tasks for studying probabilistic reasoning under dynamic world states.

12.
arXiv (CS.LG) 2026-06-15

IntSeqBERT: Learning Arithmetic Structure in OEIS via Modulo-Spectrum Embeddings

arXiv:2603.05556v2 Announce Type: replace Abstract: Integer sequences in the OEIS span values from single-digit constants to astronomical factorials and exponentials, making prediction challenging for standard tokenised models that cannot handle out-of-vocabulary values or exploit periodic arithmetic structure. We present IntSeqBERT, a dual-stream Transformer encoder for masked integer-sequence modelling on OEIS. Each sequence element is encoded along two complementary axes: a continuous log-scale magnitude embedding and sin/cos modulo embeddings for 100 residues (moduli $2$–$101$), fused via FiLM. Three prediction heads (magnitude regression, sign classification, and modulo prediction for 100 moduli) are trained jointly on 274,705 OEIS sequences. At the Large scale (91.5M parameters), IntSeqBERT achieves 95.85% magnitude accuracy and 50.38% Mean Modulo Accuracy (MMA) on the test set, outperforming a standard tokenised Transformer baseline by $+8.9$ pt and $+4.5$ pt, respectively. An ablation removing the modulo stream confirms it accounts for $+15.2$ pt of the MMA gain and contributes an additional $+6.2$ pt to magnitude accuracy. A probabilistic Chinese Remainder Theorem (CRT)-based Solver converts the model's predictions into concrete integers, yielding a 7.4-fold improvement in next-term prediction over the tokenised-Transformer baseline (Top-1: 19.09% vs. 2.59%). Modulo spectrum analysis reveals a strong negative correlation between Normalised Information Gain (NIG) and Euler's totient ratio $\varphi(m)/m$ ($r = -0.851$, $p < 10^{-28}$), providing empirical evidence that composite moduli capture OEIS arithmetic structure more efficiently via CRT aggregation.

13.
arXiv (math.PR) 2026-06-15

Stability of the $k$-Plane Transform on Measures and Hölder-Type Comparisons of Wasserstein Metrics

arXiv:2605.00375v2 Announce Type: replace-cross Abstract: We establish stability estimates for the $k$-plane transform on finite positive Radon measures, with emphasis on Fourier and Wasserstein metrics. We first introduce a metric on $k$-plane transform data and prove a bi-Lipschitz stability estimate showing that this metric is equivalent to a generalized Fourier metric obtained by augmenting the Fourier distance between centered normalized measures with separate barycenter and total mass difference terms. Building on a Hölder-type comparison between Fourier and Wasserstein metrics due to Carrillo and Toscani, we extend this comparison to positive Radon measures under uniform bounds on centered moments of order slightly larger than $2$. This yields Hölder-type stability for the $k$-plane transform in a generalized $2$-Wasserstein metric and, in particular, a $W_2$-stability estimate for centered probability measures. We also compare the $2$-Wasserstein distance with its max-sliced analogue. For centered probability measures with uniformly bounded moments of order slightly larger than $2$, we prove a two-sided Hölder-type comparison between these distances. We then extend the result to positive Radon measures by applying it to centered normalized measures and adding separate barycenter and mass terms. Finally, for absolutely continuous compactly supported probability measures with bounded densities, we prove a strong equivalence between the $2$-Wasserstein distance of the measures and the $(k/2-1)$-order Sobolev norm of the $k$-plane transform data of the difference of their densities.

14.
arXiv (CS.CL) 2026-06-16

TMASC: Transmasculine Attitude and Speech Corpus

Authors:

We introduce the Transmasculine Attitudes and Speech Corpus (TMASC), a multimodal corpus of 196 transmasculine individuals, including questionnaire responses and 66 audio recordings. The questionnaire includes items exploring the vocal health of transmasculine individuals. The audio recordings include cough and throat-clearing samples, a reading passage, and additional session-specific questions. This paper outlines the development of this corpus and the data collection procedures. To illustrate the utility of this corpus, we present three case studies demonstrating how this crowd-sourced multimodal corpus can be used to support transmasculine individuals. These include the integration of perceptual and acoustic data, the identification of group-level characteristics, and the calibration of acoustic measurements.

15.
arXiv (CS.LG) 2026-06-16

Next-Latent Prediction Transformers Learn Compact World Models

arXiv:2511.05963v4 Announce Type: replace Abstract: Transformers replace recurrence with a memory that grows with sequence length and self-attention that enables ad-hoc lookups over past tokens. Consequently, they lack an inherent incentive to compress history into compact latent states with consistent transition rules. This often leads to learning solutions that generalize poorly. We introduce Next-Latent Prediction (NextLat), which extends standard next-token training with self-supervised predictions in the latent space. Specifically, NextLat trains a transformer to learn latent representations that are predictive of its next latent state given the next token. Theoretically, we show that these latents provably converge towards belief states, compressed information about the history necessary to predict the future. This simple auxiliary objective injects a recurrent inductive bias into transformers while leaving their architecture, parallel training efficiency, and inference unchanged. NextLat effectively encourages transformers to form compact internal world models with coherent belief states and transition dynamics – crucial properties not guaranteed by standard next-token prediction alone. Empirically, across benchmarks in world modeling, reasoning, planning, and language modeling, NextLat demonstrates significant gains over standard next-token prediction and other baselines in downstream accuracy, representation compression, and lookahead planning. Furthermore, NextLat enables variable-length self-speculative decoding, accelerating inference by up to 3.3x in language modeling. NextLat offers a simple yet effective paradigm for learning compact, predictive representations in transformers that generalize better. Our code is available at https://github.com/JaydenTeoh/NextLat.

16.
arXiv (CS.AI) 2026-06-15

From Self-Supervised Speech Models to Mixture-of-Experts for Robust Anti-Spoofing

arXiv:2606.14639v1 Announce Type: cross Abstract: Recent advances in speech generation have significantly improved the naturalness of synthetic speech, making spoofing detection increasingly challenging. A key limitation of current anti-spoofing systems is their limited robustness to unseen synthesis methods. In this work, we transform a self-supervised speech representation model into a Mixture-of-Experts (MoE) architecture to improve generalization. Feed-forward blocks in selected encoder layers are replaced by multiple expert networks controlled by a layer-wise gating mechanism, allowing experts to capture complementary acoustic patterns while preserving the representations learned during self-supervised pretraining. We further analyze the architectural choices affecting the performance of this MoE conversion and investigate the activation behavior of the experts. The proposed approach is evaluated on 14 spoofing datasets and reduces the macro EER from 5.46% to 4.81%, corresponding to 11.9% relative improvement over the baseline.

17.
arXiv (math.PR) 2026-06-17

Killed resolvents and measure-valued stopping gains for reflected optimal stopping with max-type rewards

arXiv:2606.17517v1 Announce Type: new Abstract: We study an infinite-horizon optimal stopping problem for a normally reflected two-dimensional diffusion in the positive quadrant with nonsmooth max-type reward \(G(x_1,x_2)=x_1\vee \alpha x_2\). The paper develops a conditional measure-theoretic framework for the associated reflected obstacle problem. The main innovation is to show that the stopping gain \(\Gamma=c+rG-\mathcal LG\) is a signed measure, not a function: the kink of \(G\) generates an explicit negative surface measure on \(\Delta=\{x_1=\alpha x_2\}\). We then prove that the correct potential representation uses the resolvent of the reflected diffusion killed on first entry into the stopping set, rather than the unrestricted reflected resolvent. Under explicit monotonicity, regularity, and measure-superharmonicity assumptions, we derive an epigraph representation, a continuation-side boundary-trace condition, and a candidate verification theorem. The framework clarifies hidden regularity and uniqueness assumptions in multidimensional nonsmooth optimal stopping.

18.
arXiv (CS.AI) 2026-06-15

A Multi-Agent AI System for Automated High School Transcript Processing: Collaborative Document Analysis at Scale

arXiv:2606.13916v1 Announce Type: new Abstract: Each year, college admissions offices face an overwhelming challenge: processing millions of high school transcripts, each with unique formats, grading systems, and layouts. This manual process creates operational bottlenecks that delay admissions decisions and consume valuable resources. We present a transformative solution through a multi-agent AI system where specialized agents collaborate to automatically process diverse transcript formats through intelligent coordination and communication. Our multi-agent architecture consists of three specialized agents-a Pattern Recognition Agent for format-specific parsing, a Semantic Analysis Agent for natural language understanding, and a Vision Intelligence Agent for multimodal document analysis-coordinated by an Orchestration Agent that manages agent communication and result reconciliation. Our key innovation lies in agent-based quality control using GPA extraction as a coordination signal, ensuring reliable agent collaboration and preventing critical information loss. When evaluated on 40 real world transcripts from high schools across 13 U.S. states, our agent system successfully processed every document, achieving 96.7% accuracy compared to expert manual review while maintaining practical processing speeds of 45 seconds per transcript. This work demonstrates how multi-agent coordination can solve complex document processing challenges, offering institutions a scalable, collaborative AI solution that preserves accuracy while dramatically reducing processing time.

19.
arXiv (CS.CV) 2026-06-18

The Market in the Model: Latent Diffusion as Neural Economy

Valuable critique of generative image models within visual culture and the humanities has emphasized the role of datasets in shaping the images they produce. Yet, close studies of the ideological positions embedded into the mechanism of the models have been neglected, leaving them imagined as "black boxes." In a bid to expand, rather than replace, dataset critique, this paper examines the mechanisms of the latent diffusion model in terms of the problems they were brought in to solve on behalf of computer vision engineers, and the decisions each component was tasked with automating. I interpret that ensemble through the histories of its parts and the theory of vision the system inscribes into every generated image. Drawing on Impett and Offert's notion of neural exchange value, I offer this analysis to argue that the model operates as a neural economy: a contained symbolic system that abstracts social communication into commensurable vectors as it transfers the social sphere into parcels for sale. Tracing the training and generation pipelines component by component reveals what each operation displaces, and how it further entrenches the logics of platform and attention economies over social communication. The paper warns that any critique fixated exclusively on copyright and commodity defenses risks reaffirming the very fetishism the model produces, and argues instead for centering social exchange.

20.
arXiv (CS.LG) 2026-06-16

PHINN: Persistent Homology Inspired Neural Network for Rare-Event Time Series Generation

arXiv:2606.15452v1 Announce Type: new Abstract: Rare events in time series are critical to model but hard to learn due to data scarcity. Current generative models struggle with extreme values. We observe that rare events leave distinct topological fingerprints - transitions in Betti numbers from point-cloud embeddings - that are more stable and discriminative than statistical moments. We introduce PHINN, a flow-matching framework using dynamic Betti curves as conditioning signals and a persistence landscape loss for homology consistency. It scales to multivariate data, includes a natural-language interface to set Betti targets, supports cross-domain meta-learning and few-shot generation, and provides certified adversarial robustness. On financial, epidemiological, and multi-modal benchmarks, PHINN outperforms statistical and diffusion baselines in topological fidelity (beta-RMSE down 41-63%, transition accuracy up 84%) and matches jump-diffusion models in tail coverage while exceeding them in shape fidelity. All results have 95% confidence intervals.

21.
Nature (Science) 2026-06-22

Will AI spark a scientific renaissance — or a diffuse monoculture?

Authors:

Artificial intelligence’s ability to enrich science will depend not only on model capability, but also on whether researchers, reviewers and funders reward originality over speed. Artificial intelligence’s ability to enrich science will depend not only on model capability, but also on whether researchers, reviewers and funders reward originality over speed.

22.
arXiv (math.PR) 2026-06-15

Boltzmann-Like Occupation of Nonequilibrium Steady States on Dense Networks

Authors:

arXiv:2606.14542v1 Announce Type: cross Abstract: A central problem in statistical physics is to extend the Boltzmann distribution to nonequilibrium steady states (NESS). We prove that NESS on large dense networks have Boltzmann-like occupation despite extensive entropy production. We further show that the active-matter heuristic of "low rattling" is asymptotically exact. Intuitively, these NESS spend a greater fraction of their time in states they leave more slowly. This explanation extends to the broader class of "equiaccessible" steady states, which play a role in our analysis akin to that of equilibrium in linear response.

23.
arXiv (CS.CV) 2026-06-19

Benchmarking Vision Foundation Models for Domain-Generalizable Face Anti-Spoofing

Face Anti-Spoofing (FAS) remains challenging due to the requirement for robust domain generalization across unseen environments. While recent trends leverage Vision-Language Models (VLMs) for semantic supervision, these multimodal approaches often demand prohibitive computational resources and exhibit high inference latency. Furthermore, their efficacy is inherently limited by the quality of the underlying visual features. This paper revisits the potential of vision-only foundation models to establish a highly efficient and robust baseline for FAS. We conduct a systematic benchmarking of 15 pre-trained models, such as supervised CNNs, supervised ViTs, and self-supervised ViTs, under severe cross-domain scenarios including the MICO and Limited Source Domains (LSD) protocols. Our comprehensive analysis reveals that self-supervised vision models, particularly DINOv2 with Registers, significantly suppress attention artifacts and capture critical, fine-grained spoofing cues. Combined with Face Anti-Spoofing Data Augmentation (FAS-Aug), Patch-wise Data Augmentation (PDA) and Attention-weighted Patch Loss (APL), our proposed vision-only baseline achieves state-of-the-art performance in the MICO protocol. This baseline outperforms existing methods under the data-constrained LSD protocol while maintaining superior computational efficiency. This work provides a definitive vision-only baseline for FAS, demonstrating that optimized self-supervised vision transformers can serve as a backbone for both vision-only and future multimodal FAS systems. The project page is available at: https://gsisaoki.github.io/FAS-VFMbenchmark-CVPRW2026/ .

24.
arXiv (CS.CV) 2026-06-17

Million-scale multimodal pollen microscopy with expert-guided foundation models

Automated pollen identification from microscopy remains a bottleneck in aerobiology, palaeoecology and biodiversity monitoring, because scalable systems must generalise across specimen preparation, scanner settings and geographic origins while retaining palynological interpretability. To address this gap, we present a million-scale multimodal pollen microscopy resource, Pollen AI Atlas, assembled from pure-species whole-slide bright-field images spanning four geographic origins, four scanner settings and 46 taxon labels across 31 botanical families. Seeded by one manually selected exemplar per source slide, token-level mining and filtering produced 1,511,390 released grain detections with 99.6\% proposal precision in expert-curated test regions. Each detection was paired with machine-generated grain-level morphological captions from five open-weight vision-language models, guided by expert-verified palynological anchors, yielding structured descriptions of aperture systems, wall ornamentation, shape and size. Among the evaluated models, Gemma4 provided the most controlled primary caption set, combining tight length control, no leakage and the strongest text-retrieval performance. Baseline benchmarks with frozen visual features reached 88.16\% top-1 accuracy, while cross-regional retrieval showed that caption-derived text embeddings remained robust when image similarity degraded (mAP@20 0.811 versus 0.262). Released data, annotations, captions, splits, code, and weights provide a benchmark for pollen recognition, cross-regional domain adaptation and domain-specific multimodal microscopy learning.

25.
arXiv (CS.AI) 2026-06-15

FreoStream:Enhancing Stream Guardrails via Future-Aware Reasoning and Safety-Aligned Optimization

arXiv:2606.13737v1 Announce Type: cross Abstract: Stream guardrails enable token-level safety detection before full responses are generated. However, they often make overly conservative judgements and block those sensitive but safe tokens, which is known as over-refusal. Due to lack of full context, they also fail to detect implicitly harmful content from jailbreaking. To address these challenges, we propose FreoStream, a novel streaming guardrail framework. Specifically, FreoStream fine-tunes a LoRA module to perform Future-Aware Reasoning when the base guardrail detects unsafe tokens. The reasoning process follows a Future-Reason-Judge paradigm: predict the future, reason about the full context and give the final judgement. This design can effectively reduce over-refusal by incorporating the future information. Moreover, we introduce the Safety-Aligned Optimization module that extracts the safety-aligned component from the reasoning gradients to update the base guardrail model, thereby enhancing streaming safety detection. Extensive experiments on various safety benchmarks demonstrate that FreoStream achieves lower over-refusal rates and better jailbreak defense compared to existing streaming guardrails.