Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
arXiv (math.PR) 2026-06-15

Boltzmann-Like Occupation of Nonequilibrium Steady States on Dense Networks

Authors:

arXiv:2606.14542v1 Announce Type: cross Abstract: A central problem in statistical physics is to extend the Boltzmann distribution to nonequilibrium steady states (NESS). We prove that NESS on large dense networks have Boltzmann-like occupation despite extensive entropy production. We further show that the active-matter heuristic of "low rattling" is asymptotically exact. Intuitively, these NESS spend a greater fraction of their time in states they leave more slowly. This explanation extends to the broader class of "equiaccessible" steady states, which play a role in our analysis akin to that of equilibrium in linear response.

02.
arXiv (CS.AI) 2026-06-19

Class-Incremental Motion Forecasting

arXiv:2603.09420v3 Announce Type: replace-cross Abstract: Motion forecasting enables autonomous vehicles to anticipate scene evolution by predicting the future trajectories of dynamic agents. However, existing approaches typically assume a closed-world setting with a fixed object taxonomy and access to high-quality perception, limiting their applicability in the real world where perception is imperfect, and new object classes may emerge over time. In this work, we introduce class-incremental motion forecasting, a novel setting in which new object classes are sequentially introduced over time and future object trajectories are predicted directly from camera images. We propose the first end-to-end framework for this setting, which adapts to newly introduced classes while mitigating catastrophic forgetting of previously learned ones. Our method generates motion forecasting pseudo-labels for known classes and matches them with 2D instance masks from an open-vocabulary segmentation model. This 3D-to-2D keypoint voting mechanism filters inconsistent and overconfident predictions, while a query feature variance-based replay strategy samples informative past sequences to preserve prior knowledge. Extensive evaluations on nuScenes and Argoverse 2 show that our approach successfully preserves performance on known classes while effectively adapting to novel ones. We further demonstrate zero-shot transfer to real-world driving and show that the framework extends naturally to open- and closed-loop end-to-end class-incremental planning on nuScenes and NeuroNCAP. Code and models will be made publicly available at https://omen.cs.uni-freiburg.de.

03.
arXiv (math.PR) 2026-06-11

Asymptotic analysis of the finite predictor for fractional Gaussian noise

arXiv:2504.01562v2 Announce Type: replace-cross Abstract: This paper proposes a new approach to the asymptotic analysis of the finite predictor for stationary sequences. Our method yields the exact asymptotics of both the relative prediction error and the partial correlation coefficients. The underlying assumptions are analytic in nature, making the approach applicable to processes with long-range dependence. The ARMA-type process driven by fractional Gaussian noise (fGn), which had previously remained elusive, is used as a case study.

04.
arXiv (CS.CV) 2026-06-16

An Open-Source Monitoring Framework for Data Exploration and Progress Tracking in Multi-Center Radiology Studies

Multi-center studies are crucial for advancing medical and radiological research. Data exploration, collaboration discovery, and study progress monitoring are essential for maximizing their potential. However, in practice these processes often rely on manual communication and shared tables, which quickly become outdated and hinder efficient coordination in large distributed studies. This highlights the need for dedicated monitoring solutions that provide transparent and up-to-date insights into study progress. We propose a lightweight, open-source monitoring architecture for multi-center studies based on the widely used Grafana-Prometheus stack. The framework collects aggregated monitoring metrics from distributed study sites and visualizes them through configurable dashboards. As a real-world deployment example, the framework is integrated into the medical imaging platform Kaapana and evaluated within a large multi-center research network. By deploying our solution within the Germany-wide RACOON consortium, we demonstrate its ability to enable privacy-preserving data exploration and study progress monitoring across all 38 German university clinics. The monitoring framework supports transparent coordination of distributed research activities and can facilitate more efficient management of large-scale multi-center studies. The source code and Kaapana integration are publicly available at https://github.com/MIC-DKFZ/study-monitoring-kaapana.

05.
arXiv (CS.LG) 2026-06-17

Recursive Learning Without Collapse: A Weighting-Based Stabilization Framework

arXiv:2502.18049v5 Announce Type: replace-cross Abstract: Recent studies identified an intriguing phenomenon in recursive generative model training known as model collapse, where models trained on data generated by previous models exhibit severe performance degradation. Addressing this issue and developing more effective training strategies have become central challenges in generative model research. In this paper, we investigate this phenomenon within a novel framework, where generative models are iteratively trained on a combination of newly collected real data and synthetic data from the previous training step. To develop an optimal training strategy for integrating real and synthetic data, we evaluate the performance of a weighted training scheme in various scenarios, including Gaussian distribution estimation, generalized linear models, and nonparametric estimation. We theoretically characterize the impact of the mixing proportion and weighting scheme of synthetic data on the final model's performance. Our key finding is that, across different settings, the optimal weighting scheme under different proportions of synthetic data asymptotically follows a unified expression, revealing a fundamental trade-off between leveraging synthetic data and model performance. In some cases, the optimal weight assigned to real data corresponds to the reciprocal of the golden ratio. Finally, we validate our theoretical results on extensive simulated datasets and a real tabular dataset.

06.
arXiv (math.PR) 2026-06-16

Geometry of critical discrete structures: long-range percolation on the hierarchical lattice and the discrete torus

arXiv:2509.09589v2 Announce Type: replace Abstract: Consider (a) balls $\Lambda_n$ of growing volumes in the $d$-dimensional hierarchical lattice, and (b) the $d$-dimensional discrete torus $\mathbb{T}_n^d$ on $n^d$ vertices. Place edges independently between each pair of vertices $x\neq y\in\Lambda_n$ or $\mathbb{T}_n^d$ with probability $1-\exp(-\beta J(x, y) )$ where $J(x, y) \asymp \| x-y \|^{-\alpha}$ for some $0

07.
arXiv (CS.CL) 2026-06-12

The Tone of Awareness: Topic, Sentiment, and Toxicity Maps During Mental Health Month on TikTok

Despite raising concerns about the mental health effects associated with the usage of TikTok, little is known about how related content is framed by creators and received by audiences. We collect the content of 28,341 TikTok videos and 80,130 comments from Mental Health Awareness Month (May) in 2023 and 2024 via the TikTok Research API, and study how the tone of awareness varies across topics and years. We characterize "tone" as the emotional and interpersonal framing of mental health discourse, operationalized through sentiment and toxicity measures. We extract topics from video text using BERTopic and log-odds keywords, then quantify topic-conditioned sentiment (XLM-T) and toxicity (Detoxify) separately for video transcriptions and comments. Sentiment captures the affective valence of content, while toxicity reflects the presence of harmful or abusive language. We find a stable set of recurring themes across years, spanning clinical conditions, emotional disclosure, self-care, and campaign-oriented content, with engagement highly skewed toward a small subset of topics. All sentiment and toxicity analyses are computed separately for video content and comments, allowing us to distinguish between content production and audience reception. Sentiment in videos is often negative for emotionally charged topics, while comments tend to shift toward more mixed or positive polarity, especially for suicide prevention. Toxicity is low in median overall, but exhibits longer-tailed outliers in comments than in videos that are more pronounced in comments and concentrated in specific topics (e.g., "Duet", "Suicide Prevention", and "Psychisch"). Overall, our results provide a topic-level decomposition of mental health discourse on TikTok during awareness-month campaigns.

08.
arXiv (quant-ph) 2026-06-11

Isotropic random walks and Brownian diffusion on complex projective space

arXiv:2606.11438v1 Announce Type: new Abstract: We show that isotropic random walks on the complex projective space provide a canonical and analytically tractable stochastic-geometric framework for the exploration of quantum-state space. The approach combines harmonic analysis on compact rank-one symmetric spaces with stochastic pure-state evolution and yields explicit analytical expressions for transition kernels, fidelity statistics, and geometric observables associated with the Fubini–Study metric. In particular, the framework provides a solvable reference model for isotropic depolarization and Haar equilibration, reproducing Haar-random fidelity statistics and the invariant measure on projective Hilbert space without specifying a microscopic Lindblad generator. In the short-time regime, the stochastic evolution converges to Brownian diffusion generated by the Fubini–Study Laplace–Beltrami operator, while the long-time limit exhibits concentration-of-measure behaviour characteristic of high-dimensional random quantum states. We further derive analytical and asymptotic results for the first-passage-time problem, including closed-form expressions in the Brownian limit for the mean first passage time and the long-time tail of the first-passage-time distribution. For high-fidelity target states, the mean first passage time exhibits a strong dimension-dependent divergence originating from the concentration properties of the Fubini–Study geometry.

09.
arXiv (math.PR) 2026-06-18

Delayed blow-up by transport noise for the 3D Navier-Stokes equation with Navier-slip boundary conditions

Authors:

arXiv:2606.19060v1 Announce Type: cross Abstract: We study the vorticity formulation of the 3D Navier-Stokes equation driven by transport noise in a periodic channel with Navier-slip boundary conditions. We consider both non-degenerate transport noise and degenerate tangential transport noise. For any prescribed $T>0$ and $\epsilon>0$, we prove that, by choosing the noise intensity sufficiently large and concentrating the noise on sufficiently high modes, the solution exists up to $T$ with probability at least $1-\epsilon$. A main contribution of this work is to identify and analyze the interaction between enhanced dissipation induced by transport noise and physical boundary effects. The no-flux condition breaks the isotropy of the noise and changes the scaling limit of the Itô-Stratonovich corrector. In the non-degenerate case, a boundary feedback term appears in the limiting effective operator; in the degenerate case, the limiting operator is a nonlocal anisotropic tangential dissipation. The proof is based on a combination of a boundary correction operator, a Meyers-type estimate, a scaling-limit analysis of the Itô-Stratonovich corrector, and resolvent estimates for the deterministic limiting equations.

10.
arXiv (CS.AI) 2026-06-17

SP-GCRL: Influence Maximization on Incomplete Social Graphs

arXiv:2605.12513v2 Announce Type: replace-cross Abstract: Influence maximization (IM) in real platforms is challenged by incomplete, noisy social graphs and non-stationary diffusion dynamics. We propose SP-GCRL, a social-propagation-aware graph contrastive reinforcement learning framework that learns end-to-end seed selection under partial observability.We first introduce a social-propagation-aware nonlinear diffusion function to model reinforcement/diminishing effects and probability drift under repeated exposure; we then construct dual structural views and perform contrastive learning to obtain node representations robust to missing edges and weak ties, while replacing expensive strategy metrics with a GAT-based regression surrogate to improve efficiency and scalability; finally, we use DDQN to learn an end-to-end seed selection policy on top of these representations. Experiments on multiple real-world networks show that SP-GCRL achieves significant gains over heuristic and learning-based baselines across budgets and topologies, while maintaining strong large-scale scalability.

11.
PLOS Computational Biology 2026-06-01

Challenges and progress in RNA velocity: Comparative analysis across multiple biological contexts

Authors:

by Sarah Ancheta, Leah Dorman, Guillaume Le Treut, Abel Gurung, Greg Huber, Loïc A. Royer, Alejandro Granados, Merlin Lange Single-cell RNA sequencing is revolutionizing our understanding of cell state dynamics, allowing researchers to capture and quantify the transcriptomic profile of a single cell at a specific timepoint. Among the computational techniques used to predict cellular trajectories, RNA velocity has emerged as a predominant tool for modeling transcriptional dynamics. RNA velocity leverages the mRNA maturation process to generate velocity vectors that predict the likely future state of a cell, offering insights into cellular differentiation, aging, and disease progression. Although this technique has shown promise across biological fields, the performance accuracy varies depending on the RNA velocity method and dataset. We established a comparative pipeline and analyzed the performance of five RNA velocity methods on three datasets based on local consistency, method agreement, identification of driver genes, and robustness to sequencing depth. This benchmark provides a resource for scientists to understand the strengths and limitations of different RNA velocity methods.

12.
arXiv (CS.CV) 2026-06-11

Contour Field based Elliptical Shape Prior for the Segment Anything Model

The elliptical shape prior information plays a vital role in improving the accuracy of image segmentation for specific tasks in medical and natural images. Existing deep learning-based segmentation methods, including the Segment Anything Model (SAM), often struggle to produce segmentation results with elliptical shapes efficiently. This paper proposes a new approach to integrate the prior of elliptical shapes into the deep learning-based SAM image segmentation techniques using variational methods. The proposed method establishes a parameterized elliptical contour field, which constrains the segmentation results to align with predefined elliptical contours. Utilizing the dual algorithm, the model seamlessly integrates image features with elliptical priors and spatial regularization priors, thereby greatly enhancing segmentation accuracy. By decomposing SAM into four mathematical sub-problems, we integrate the variational ellipse prior to design a new SAM network structure, ensuring that the segmentation output of SAM consists of elliptical regions. Experimental results on some specific image datasets demonstrate an improvement over the original SAM.

13.
arXiv (CS.LG) 2026-06-17

Performance-Driven Environment Abstraction with Multi-Timescale Learning

arXiv:2606.17377v1 Announce Type: new Abstract: We study performance-driven environment abstraction for decision-making in large Markov decision processes. Rather than preserving geometric or topological structure, we seek abstractions that directly optimize decision quality. We model abstraction as a controlled approximation obtained by aggregating the state space and enforcing a shared action distribution within each aggregated state. For a fixed partition, we establish a performance guarantee that separates value-function approximation error from the loss introduced by action sharing. Guided by this analysis, we develop a multi-timescale reinforcement learning framework that jointly adapts the policy and a tree-structured environment abstraction. The resulting algorithm refines and coarsens regions of the state space based on Q-value discrepancies, balancing performance against abstraction size and complexity. Empirical results demonstrate substantial state compression, improved sample efficiency, and faster replanning compared to actor-critic baselines.

14.
arXiv (CS.AI) 2026-06-19

SL-S4Wave: Self-Supervised Learning of Physiological Waveforms with Structured State Space Models

arXiv:2606.19888v1 Announce Type: cross Abstract: Modeling long-sequence medical time series data, such as electrocardiograms (ECG), poses significant challenges due to high sampling rates, multichannel signal complexity, inherent noise, and limited labeled data. While recent self-supervised learning (SSL) methods, based on various encoder architectures such as convolutional neural networks, have been proposed to learn representations from unlabeled data, they often fall short in capturing long-range dependencies and noise-invariant features. Structured state space models (S4) excel at long-sequence modeling, but existing S4 architectures fail to capture the unique characteristics of multichannel physiological waveforms. In this work, we propose SL-S4Wave, a self-supervised learning framework that combines contrastive learning with a tailored encoder built on structured state space models. The encoder incorporates multi-layer global convolution using multiscale subkernels, enabling the capture of both fine-grained local patterns and long-range temporal dependencies in noisy, high-resolution multichannel waveforms. Extensive experiments on real-world datasets demonstrate that SL-S4Wave (1) consistently outperforms state-of-the-art supervised and self-supervised baselines in a challenging arrhythmia detection task, (2) achieves high performance with significantly fewer labeled examples, showcasing strong label efficiency, and (3) maintains robust performance on long waveform segments, highlighting its capacity to model complex temporal dynamics in long sequences that most existing approaches fail to efficiently model, and (4) transfers effectively to unseen arrhythmia types, underscoring its robust cross-domain generalization. We additionally evaluate SL-S4Wave on multiple EEG tasks, achieving superior performance over strong baselines, demonstrating generalizability of our approach beyond cardiac waveforms.

15.
arXiv (CS.LG) 2026-06-12

GF-DiT: Scheduling Parallelism for Diffusion Transformer Serving

arXiv:2606.13501v1 Announce Type: cross Abstract: Diffusion Transformers (DiTs) have become the dominant architecture for image and video generation, creating growing demand for efficient DiT serving. Existing systems assign each request a fixed parallel configuration throughout its lifetime. However, DiT workloads exhibit substantial heterogeneity across requests, execution stages, and system conditions, making static parallelism inefficient and often leading to poor GPU utilization and degraded service quality. This paper argues that DiT serving should treat GPU parallelism as a first-class schedulable resource. We present GF-DiT, a policy-programmable runtime for elastic DiT serving that dynamically adapts the parallelism of running requests according to workload demands and service objectives. GF-DiT introduces an asynchronous execution abstraction that decomposes requests into independently schedulable trajectory tasks and enables online GPU reallocation. To make elastic parallelism practical, GF-DiT further proposes group-free collectives, a lightweight communication abstraction that supports low-overhead online formation and reconfiguration of arbitrary execution groups. We implement GF-DiT in vLLM-Omni and evaluate it on representative image and video diffusion workloads. Compared with fixed-pipeline execution with static parallelism, GF-DiT improves throughput by up to 6.01$\times$, reduces mean latency by up to 95%, lowers SLO violation rates by up to 90%, and reduces communication-group setup overhead from 778 ms to approximately 60 $\mu$s.

16.
PLOS Medicine 2026-05-20

Associations between hematologic dynamics during pregnancy and obstetric complications: A retrospective observational study

by Veronica Tozzo, Rachel Petherbridge, Kaitlyn James, Sarah Hsu, Deepti Pant, Chloe Michalopoulos, Brody H. Foy, Tanayott Thaweethai, Christopher Mow, Jacqueline Maya, Carolina Batlle Camero, Lydia Shook, Kathryn J. Gray, Logan Mauney, John M. Higgins, Camille E. Powe Background Pregnancy alters hematologic state as measured by complete blood count (CBC), but the longitudinal changes in CBC indices that define healthy pregnancies are not well established. In a large cohort based at an academic health system in the United States, we aimed to define reference intervals and typical longitudinal changes in CBC indices during pregnancy. We then tested for associations between extreme CBC values for gestational age or extreme longitudinal changes in CBC indices and obstetric complications. Methods and findings We studied nine CBC indices in individuals with singleton pregnancies who delivered after 30 weeks’ gestation and presented for prenatal care prior to 20 weeks. The electronic health record (EHR)-based Maternal Health Cohort (Massachusetts General Hospital; 1998–2016) formed our discovery cohort of 45,992 pregnancies, 18% of which had relevant complications. We developed a validation cohort of 48,868, 27% with complications from EHR data in the Mass General Brigham healthcare system from 2016 to 2024. In pregnancies without complications in the discovery cohort, we derived gestational-age-specific reference intervals (2.5th–97.5th percentile) and established typical intra-pregnancy longitudinal changes. In the validation cohort, we then tested CBC values outside of the 26–29 weeks’ gestation reference interval and CBC rare changes (uncommon changes in magnitude and direction) between 7–14 and 26–29 weeks’ gestation for association with a composite outcome (hypertensive disorders of pregnancy, small for gestational age birthweight, preterm birth) and its individual components using generalized estimating equations. Derived reference intervals differed from those in the literature for mean red cell volume, mean red cell hemoglobin, red cell count, and mean red cell hemoglobin concentration; reference intervals for other indices were similar to those previously published. In validation, hematocrit, hemoglobin, and red cell count values above their gestational-age specific reference intervals were associated with increased risk of the composite obstetric outcome: odds ratios (ORs) of 1.4 (95% CI [1.2, 1.5] p 

17.
arXiv (quant-ph) 2026-06-17

Quantum statistical enhancement of collective behaviour in a bosonic active Ising model

arXiv:2606.18091v1 Announce Type: new Abstract: Collective behaviour such as flocking (the collective motion of a spontaneously formed group along a common direction) or aster formation (the binding of opposing flocks, inhibiting each others motion) are intriguing emergent phenomena in active systems with local alignment rules. Until recently, their occurrence was mainly studied for classical systems, a prime example being the active Ising model (AIM), which translates the main ingredients of flocking and aster formation (i.e., alignment and self-propulsion) to a lattice framework. Here we introduce and study a one-dimensional (1D) quantum lattice variant of the AIM, based on ideal bosons with a spin degree of freedom. We find that both the collective behaviours of the 1D classical model, flocking and aster formation, are markedly enhanced by the bosonic quantum statistics. This contrasts with a recent quantum generalization of the AIM based onto hard-core bosons [Khasseh et al., Phys. Rev. Lett. 135, 248302 (2025)], where flocking, but neither its quantum-statistical stabilization nor aster states were observed as a consequence of interactions. Moreover, we investigate the competition of this quantum statistical stabilization of collective phases with their suppression by the quantum fluctuations induced by a transverse external magnetic field.

18.
arXiv (quant-ph) 2026-06-11

Lowest order Carleman linearization for low Reynolds long-term behaviour of fluid flow simulations

arXiv:2605.23380v2 Announce Type: replace Abstract: It is shown that the lowest (second) order truncation of the Carleman linearization of the fluid equations (C2) recovers the late stage of the evolution, namely the steady-state solution, although to a decreasing degree of accuracy at increasing Reynolds number. This asymptotic property is first proved analytically for the decaying logistic with external forcing and then shown to hold to a significant degree of accuracy also for the more complex case of two-dimensional Kolmogorov-like fluid flow at low Reynolds numbers, below $Re \sim 10$. This time-asymptotic property may open interesting prospects for the quantum simulation of low-Reynolds steady-state fluid flows.

19.
arXiv (CS.CV) 2026-06-11

Real-Time Neural Hair Denoising

We propose a lightweight real-time method for reconstructing strand-based hair G-Buffers from severely undersampled rasterized inputs. Our pipeline first applies neural spatial reconstruction and temporal accumulation to recover hair coverage, i.e., fractional hair visibility within a pixel, and tangent. It then uses a tangent-guided reconstruction step to complete the position, which is subsequently used for physically based deferred hair shading. We evaluate our method across a diverse set of hairstyles, including straight, wavy, afro, and ponytail styles, under both static and dynamic scenarios. Our method achieves higher hair reconstruction quality than existing hair-specific denoising techniques and general industrial neural reconstruction solutions such as DLSS and FSR.

20.
arXiv (CS.LG) 2026-06-18

Modeling Doppler Shifts in Radial-Velocity Data with Deep Learning toward Earth-mass Exoplanet Detection

arXiv:2606.18464v1 Announce Type: cross Abstract: Detecting the tiny Doppler shifts induced by Earth-mass planets in stellar radial-velocity measurements remains extremely challenging due to stellar activity. Many deep-learning methods performing well on simulated data remain difficult to apply reliably on real stellar spectra. The aim of this work is to develop a deep-learning framework that generalizes to real, unseen spectra and improves the detectability of Earth-mass planets in radial-velocity data. We train artificial neural networks on HARPS-N solar spectra with injected planetary signals, using physics-motivated spectral representations based on flux and line-formation temperature, together with their velocity gradients. Two training strategies are explored: hold-out testing and cross-validation. Model robustness is enhanced through genetic-algorithm-based hyperparameter optimization, and predictive uncertainty is quantified using Monte Carlo dropout. Our most precise neural network model reliably retrieves, under the cross-validation strategy, the amplitudes, phases, and orbital periods of planetary signals with amplitudes greater than or equal to 25 cm/s and periods between 10 and 550 days. In addition, in all cases tested here, the successfully recovered signals correspond to the most significant peaks in the periodograms of the Doppler-shift predictions. Temperature-based spectral-shell representations consistently outperform flux-based shells. We also release doppleriann, a Python package implementing the proposed framework. Our results demonstrate that combining physically motivated spectral representations with deep learning provides a promising pathway toward the detection of Earth-mass planets in radial-velocity data from real observations, supported by a modeling framework that is both physically grounded and statistically rigorous, incorporating uncertainty quantification and optimized training strategies.

21.
arXiv (CS.LG) 2026-06-12

How Reliable are Fairness Audits with Unreliable Data?

arXiv:2506.23033v3 Announce Type: replace Abstract: Fairness audits are a key component of responsible machine-learning deployment. Yet, audit-recommendation reliability under incomplete protected-label access is still poorly understood. In this work, we focused on protected-label missingness in fairness mitigation audits. We introduced a seed-calibrated stress test to separate missingness effects from seed-to-seed movement already present under complete labels. Across ACS/Folktables tasks, missingness settings that retain some protected labels usually do not move selected mitigation methods beyond a complete-label seed-to-seed baseline. At $0%$ protected-label access, candidates collapse to an empirical-risk-minimization baseline and deterministic tie-breaking rather than revealing a broad missingness effect. We also found that threshold optimization can turn fairness gains on a single protected axis into intersectional harm above a seed baseline, and this threshold-optimizer finding persists under random-forest validation. Overall, our results highlight that protected-label missingness should be reported with seed-null calibration, candidate-set context, and intersectional consequences before it is treated as evidence of audit fragility.

22.
arXiv (CS.CL) 2026-06-16

Evaluating the Robustness of Proof Autoformalization in Lean 4

Proof autoformalization aims to translate a mathematical informal proof written in natural language into a formal proof in a formal language such as Lean~4. Several works have developed LLM-based models for proof autoformalization. However, existing evaluations have typically focused on translating well-formed informal proofs from curated datasets. We argue that a robust proof autoformalizer must remain faithful even for informal proofs that diverge from these idealized ones, and we present the first study on the robustness of proof autoformalization models. We formulate two categories of perturbations and evaluate robustness under each: a global perturbation paraphrases the informal proof in a different style, under which the formalization should remain consistent; a local perturbation alters a value, symbol, or proof step, possibly in a counterfactual way, and a robust formalization should faithfully reflect the perturbation rather than reverting to the original one or inferring a different one on its own. We build a benchmark with both perturbations on miniF2F and MATH-500, and automatically measure how stable a proof autoformalization's correctness is under global perturbations and how faithfully its output reflects local perturbations. We evaluate seven recent models, all of which are sensitive to global perturbations and mostly fail to remain faithful under local perturbations. Code and data are available via https://github.com/ucr-rai/robust-proof-autoformalization.

23.
arXiv (math.PR) 2026-06-15

Upper tails for irregular graphs beyond the mean-field regime

arXiv:2606.14564v1 Announce Type: new Abstract: Let $G_{n,p}$ be the binomial random graph of density $p$ and let $X_H$ be the number of copies of a fixed graph $H$ in $G_{n,p}$. We prove asymptotically tight bounds on the logarithmic upper-tail probability of $X_H$ whenever $H$ is a connected, irregular graph with maximum degree $\Delta \ge 2$ and $p \ge n^{-1/\Delta - \varepsilon_H} (\log n)^{\omega(1)}$ for an explicit $\varepsilon_H >0$. These bounds are expressed in terms of a new variational problem that generalises the combinatorial optimisation problem arising from the naïve mean-field approximation. This new variational problem includes an entropy term that corresponds to the large number of embeddings of certain highly structured graphs in $K_n$. For a certain class of irregular graphs $H$ that we call stable, we show that this description of the upper-tail probability is valid in a range of densities that is optimal up to a poly($\log\log n$) factor. For a further subclass of stable graphs, which includes all irregular complete bipartite graphs, we show that this range of densities is optimal up to a multiplicative constant.

24.
arXiv (CS.CV) 2026-06-15

Mirage Probes: How Vision Models Fake Visual Understanding

Vision-language models (VLMs) can answer image-based questions confidently, and often correctly, even when no image is provided. This mirage behavior inflates benchmark scores without reflecting visual grounding. Prior work treats this as a single failure mode. We argue it is two. Using Mirage Probes, a contrastive probing framework that pairs paraphrased question variants with matched mirage and non-mirage labels on the same image, we show that mirage behavior is linearly decodable from internal activations across residual stream, MLP, post-attention, and attention-head sites in two open-source VLMs. We demonstrate that a Naive Bayes text baseline cannot recover this signal, ruling out surface lexical confounds. Cross-benchmark separability patterns, together with a novel Prior Harnessing Index (PHI) measuring how much a model can answer from text alone, expose two distinct regimes: textual biases, where the model answers from language priors without engaging visual representations, and spurious images, where it constructs false visual content in latent space and answers as if grounded. The distinction has direct mitigation consequences: text-distribution cleaning can address the first regime but cannot reach the second, since spurious-image mirages live in the model's visual representations rather than its text. Faithful visual grounding will require interventions at the representational level.

25.
Nature (Science) 2026-06-17

Emergent decadal predictability in Antarctic contribution to sea-level rise

Despite large uncertainties associated with future mass loss from the Antarctic Ice Sheet, ice-sheet models show that the rate of sea-level rise from Antarctic ice loss in 2025 is strongly predictive of the rate for the next several decades, regardless of emission pathway or model complexity. This finding is robust across all models that were considered in the Intergovernmental Panel on Climate Change Sixth Assessment Report global mean sea-level projections, including the low-likelihood, high-impact scenarios of sea-level rise. Given this strong near-term decadal predictability, ice-sheet models that can accurately reproduce present-day ice-mass loss provide a reliable basis for near-term sea-level planning and adaptation through to mid-century. The predictability breaks down by the end of the twenty-first century as feedbacks, such as those related to marine ice-sheet retreat, begin to emerge, leading to accelerating ice loss. Drawing on these results, we identify key feedback mechanisms that can account for the transition between near-term decadal predictability and the longer-term, feedback-driven evolution, and suggest priorities for ice-sheet model development aimed at resolving long-term sea-level rise uncertainty. Although Antarctic ice loss projections diverge widely by 2100, this Perspective shows that present-day rates robustly predict mid-century sea level rise, providing a firm basis for near-term planning, while highlighting priorities for model development aimed at resolving longer-term sea level rise uncertainty.