Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
arXiv (CS.AI) 2026-06-18

Synthetic Resonance: A Framework for Growth-Oriented Human-AI Relationships

arXiv:2606.18265v1 Announce Type: cross Abstract: As human relationships with artificial intelligence systems become increasingly frequent and sustained, existing language and theory fail to accurately capture the nature of these affiliations. Common descriptors such as mutual understanding, connection, or friendship risk anthropomorphizing systems that lack subjective experience, while dominant frameworks tend to reduce AI to either a tool or a threat. In this paper, I introduce the concept of synthetic resonance as an integrative framework for understanding human-AI relationships. Synthetic resonance describes how relationships humans define as meaningful can emerge between a human and an AI system without the need to attribute shared feelings or mutual awareness. I argue that synthetic resonance is best understood as a structured, dynamic pattern of interaction that can produce a sense of relationship without the presence of a second experiencing subject. By clarifying this distinction, the concept of synthetic resonance offers a more precise way of conceptualizing human-AI relationships and highlights their potential value and ethical implications. I also call for more research that tests the processes and outcomes of synthetic resonance.

03.
arXiv (CS.CV) 2026-06-11

MentisOculi: Revealing the Limits of Reasoning with Mental Imagery

Frontier models are transitioning from multimodal large language models (MLLMs) that merely ingest visual information to unified multimodal models (UMMs) capable of native interleaved generation. This shift has sparked interest in using intermediate visualizations as a reasoning aid, akin to human mental imagery. Central to this idea is the ability to form, maintain, and manipulate visual representations in a goal-oriented manner. To evaluate and probe this capability, we develop MentisOculi, a procedural, stratified suite of multi-step reasoning problems amenable to visual solution, tuned to challenge frontier models. Evaluating visual strategies ranging from latent tokens to explicit generated imagery, we find they generally fail to improve performance. Analysis of UMMs specifically exposes a critical limitation: While they possess the textual reasoning capacity to solve a task and can sometimes generate correct visuals, they suffer from compounding generation errors and fail to leverage even ground-truth visualizations. Our findings suggest that despite their inherent appeal, visual thoughts do not yet benefit model reasoning. MentisOculi establishes the necessary foundation to analyze and close this gap across diverse model families.

04.
arXiv (CS.LG) 2026-06-19

Stochastic Linear Contextual Bandits with Bounded Noise: A Set-Membership Approach

arXiv:2606.20022v1 Announce Type: cross Abstract: This paper considers stochastic linear contextual bandits (SLCB) with bounded reward noise. Existing works typically assume sub-Gaussian reward noise and bounded expected rewards, under which the optimal regret bound scales as $\tilde{O}(\sqrt{T})$ in terms of horizon $T$. However, in many applications, realized/observed rewards are also naturally bounded, implying bounded reward noise. Bounded noise is more informative than the sub-Gaussian condition but has not been leveraged explicitly in the SLCB literature. In this paper, we propose a novel algorithm SME-OFU by utilizing an uncertainty quantification method called set-membership estimation (SME) and applying the principle of optimism in the face of uncertainty (OFU). Our algorithm enjoys an improved regret bound $O(\log T)$. Notice that this does not contradict the existing optimal bound $\tilde{O}(\sqrt{T})$ for sub-Gaussian noise because bounded noise is a stronger condition. Finally, simulations show empirical improvements of SME-OFU over a benchmark algorithm designed for sub-Gaussian noise when the reward noise is bounded.

05.
arXiv (CS.CV) 2026-06-16

You Don't Need Strong Assumptions: Visual Representation Learning via Temporal Differences

Progress in AI has largely been driven by methods that assume less. As compute and data increase, approaches with weaker inductive biases generally outperform those with stronger assumptions. This is particularly characteristic of the field of Visual Representation Learning, where approaches have gone from being dominated by Supervised Learning, to Weakly Supervised Learning, to the now widespread success of Self-Supervised Learning without human labels. Yet, even modern Self-Supervised Learning approaches still depend on strong inductive biases such as augmentations, masking, or cropping. If this trend holds, even these remaining biases should become bottlenecks at scale – and our experiments confirm this: the optimal strength of inductive biases decreases as data grows. This motivates the search for approaches that rely on fewer assumptions. To this end, we introduce Temporal Difference in Vision (TDV), a new paradigm for self-supervised learning from video that avoids existing inductive biases, relying instead on a causal assumption that the past causes the future. TDV functions by jointly training an image encoder and a motion encoder so that the current frame's representation plus the encoded motion equals the next frame's representation. Despite not leveraging any strong inductive biases, TDV matches state-of-the-art recipes on dense spatial tasks, laying the foundation for representation learning without strong assumptions.

06.
PLOS Medicine 2026-05-12

Social contact patterns in the United Kingdom following the COVID-19 pandemic: The Reconnect cross-sectional survey

by Lucy Goodfellow, Billy J. Quilty, Kevin van Zandvoort, W. John Edmunds Background Close-contact and respiratory infectious diseases are spread through social interactions. Measuring these interactions has transformed our ability to understand transmission and control these infections. Social contact patterns were disrupted during the COVID-19 pandemic and have been affected by wider demographic, cultural, and workplace changes since then. Methods and findings To estimate post-pandemic social contact patterns in the United Kingdom, we conducted a cross-sectional social contact survey from November 2024 to March 2025 on a nationally representative sample of participants. Interactions were captured by age, gender, and across socioeconomic status (SES) and ethnic groups. We calculated the mean number of daily contacts and contact matrices, stratified by variables of interest, using a negative binomial regression model weighted by age, gender, ethnic group, and weekday/weekend. 13,238 participants were recruited, 3,019 of whom were aged under 18 years old; survey response rates were 36% and 27% for adults and children, respectively. The mean number of daily contacts was 9.1 (95% confidence interval (CI): 8.7, 9.5); this figure was 13.8 (95% CI: 12.8, 14.9) for children, and 7.8 (95% CI: 7.4, 8.2) for adults. Higher numbers of contacts were positively associated with employment, household income, and educational qualifications held. Contact matrices showed high levels of age-assortativity, as well as inter-generational contacts in the home. Contacts were assortative between ethnic groups and SES in all settings; this effect was strongest between ethnic groups in the home, and between SES in the workplace. We constructed socially-stratified next-generation matrices for a novel respiratory pathogen, projecting that the majority White ethnic group would account for the largest share of new infections (76.7% (95% CI: 75.5, 77.9) of cases), but that per-capita infection risk would disproportionately affect minority ethnic groups, with the risk for the Black population being 2.27 (95% CI: 2.06, 2.51) times that of the White population. This study may be limited by the inherent recall biases and reporting fatigue involved with self-reporting contacts. Conclusions This study provides crucial data to inform post-pandemic mathematical models of infectious disease transmission, and allows ethnicity and SES to be incorporated in such models.

07.
medRxiv (Medicine) 2026-06-22

Level of Physical Activity and ApoE Status - Effects on Alzheimer's Disease and on Mortality

Background: Alzheimer's disease and related dementias (ADRD) affect over 7.2 million Americans aged 65 and older, with the APOE-4 allele representing the strongest known genetic risk factor. Physical activity (PA) has been associated with reduced dementia risk, but its interaction with APOE genotype remains poorly characterized in large, genomically informed cohorts. Methods: We conducted a retrospective cohort analysis using linked genomic, survey, and longitudinal electronic health record data from the VA Million Veteran Program (MVP). Veterans aged

08.
arXiv (CS.AI) 2026-06-11

Forecasting Future Behavior as a Learning Task

arXiv:2606.11445v1 Announce Type: new Abstract: Trust in an AI system is often anchored by explanations of how it works, which one then uses to forecast its behavior on new inputs. For large reasoning models (LRMs), this conventional route is particularly difficult to follow: explanation methods for single token generations do not naturally generalize to long trajectories, and the trajectories themselves are often not faithful when read as natural language. We propose an alternative that bypasses the explanation step: treat behavior forecasting as a learnable task and train Behavior Forecasters that operates on a single reasoning trajectory to make the same forecasts one would typically seek from an explanation. The forecaster's training data is obtained by querying the LRM with no human annotation, and its inference is done in a single forward pass. We instantiate this approach on two tasks: how likely the LRM is to repeat its answer on re-runs, and how removing parts of the input changes its answer. We evaluate this approach on both tasks across three diverse reasoning datasets and find that trained Behavior Forecasters are more accurate than GPT-5.4 and Claude Opus-4.6 reading the same trajectories as naive readers, at a small fraction of their inference cost. We find that fine-tuning the backbone end-to-end and initializing it from the target LRM are each necessary for strong performance. These results show that the reasoning trajectory carries information about the LRM's future behavior that goes beyond what naive reading conveys.

09.
arXiv (math.PR) 2026-06-12

Counterintuitive problems in discrete probability

arXiv:2606.07516v2 Announce Type: replace Abstract: This manuscript contains a collection of counterintuitive problems in discrete probability, together with detailed solutions. The dataset was constructed as part of a broader research project investigating the capabilities of the latest-generation Large Language Models (LLMs) in solving discrete probability problems, in order to assess whether LLMs tend to make systematic reasoning errors associated with known cognitive biases. The problems collected here are specifically designed to challenge heuristic reasoning strategies that often lead to intuitively appealing but mathematically incorrect conclusions. The dataset combines several types of problems. Some are adapted from classical probabilistic paradoxes and cognitive-bias literature, while others originate from recreational mathematics sources or were developed by ourselves following similar principles. The primary purpose of this document is to provide a transparent and publicly accessible reference for the problems used in our experimental evaluation of language models, as well as providing detailed human-made solutions. At the same time, we believe that this collection may also prove useful for future research on probabilistic reasoning, cognitive biases, and the evaluation of reasoning capabilities in artificial intelligence systems.

10.
arXiv (CS.LG) 2026-06-19

Learning to Emulate Chaos: Adversarial Optimal Transport Regularization

arXiv:2604.21097v2 Announce Type: replace-cross Abstract: Chaos arises in many complex dynamical systems, from weather to power grids, but is difficult to accurately model with data-driven methods such as machine learning emulators. While emulators are promising tools for accelerating simulations and solving inverse problems, they still struggle to learn chaotic dynamics, where sensitivity to initial conditions renders exact long-term forecasts infeasible, especially given noisy data. Recent work instead trains emulators to match the statistical properties of chaotic attractors, but these approaches often rely on handcrafted summary statistics or large, diverse multi-environment datasets. In this work, we propose a family of adversarial optimal transport objectives that can jointly learn high-quality summary statistics and a physically consistent emulator from a single noisy trajectory. We theoretically analyze and experimentally validate a Sinkhorn divergence formulation (2-Wasserstein) and a WGAN-style dual formulation (1-Wasserstein) of our approach. Numerical experiments across a variety of chaotic systems, including ones with high-dimensional spatiotemporal chaos, show that emulators trained using our proposed objectives have significantly improved long-term statistical fidelity.

11.
arXiv (CS.LG) 2026-06-19

Alzheimer's Disease Diagnosis using a Multimodal Approach with 3D MRI and PET

arXiv:2606.20037v1 Announce Type: new Abstract: Alzheimer's disease (AD) is an irreversible neurodegenerative disorder and a leading cause of death worldwide. Early diagnosis plays an important part especially at the Mild Cognitive Impairment stage, where timely intervention can help slow its progression before it advances to AD. Neuroimaging data, like Magnetic Resonance Imaging (MRI) and Positron Emission Tomography (PET) scans, can help detect brain changes early by providing structural and functional brain changes related to the disease. Yet, many multimodal models still fuse MRI and PET with static concatenation and apply identical computation to all subjects, which limits robustness to patient/site heterogeneity and can waste computation. To address these limitations, we present the first study of combining 3D convolutional feature extractors with three fusion strategies - concatenation, Gated Multimodal Unit (GMU), and gated self-attention - and a sparsely gated Mixture-of-Experts (MoE) classifier that performs input-adaptive routing, activating only the most informative experts per case. Finally, we utilize Grad-CAM to visualize disease-related regions, ensuring model interpretability. Experiments are performed across three binary classification tasks (NC vs. MCI, MCI vs. AD, and NC vs. AD). Results show that GMU achieves accuracies of 80.46 % (NC vs. MCI) and 95.47 % (NC vs. AD), while gated self-attention attains 82.08 % on MCI vs. AD. Ablations show that removing the MoE consistently degrades accuracy across all tasks. These findings underscore the value of input-adaptive, multimodal modeling for AD diagnosis by leveraging the complementary nature of MRI and PET.

12.
arXiv (quant-ph) 2026-06-17

Quantum Resources and Wigner Symmetry in Nucleon-Nucleon Scattering from Effective Field Theory

arXiv:2606.17148v1 Announce Type: cross Abstract: We study quantum resources in the spin degrees of freedom, such as entanglement, stabilizer magic, and non-local magic, in low-energy nucleon-nucleon scattering through next-to-leading order in pionless effective field theory. Treating each nucleon spin as a qubit, we calculate the corresponding resource-generating powers of the scattering operator at generic center-of-mass momentum and scattering angle $\Theta$. The analysis retains $S$- and $P$-wave channels generated by two-derivative contact interactions. When the microscopic physics exhibits Wigner's $SU(4)$ spin-flavor symmetry, the neutron-proton amplitude becomes proportional to the spin-space identity operator and therefore generates no new resources after scattering, extending an observation previously made for leading-order $S$-wave scattering. The same-nucleon channel remains resource-generating because constraints from identical particles project out part of the Hilbert space. These results show how enhanced symmetries, partial-wave structure, and resource generation are intertwined in low-energy two-body scattering.

13.
bioRxiv (Bioinfo) 2026-06-12

The Geometry of Allostery: A Laplacian Minor Hierarchy for Many-Body Protein Communication

Quantifying how cooperative, many-body relationships drive allostery in protein networks remains a major challenge. To address this, we develop the Laplacian minor hierarchy, a mathematical framework that characterizes the geometric invariants of a protein network. Lower-order minors yield standard metrics including the partition function and effective distances, whereas higher-order minors define novel topological measures: cooperation indices, each bounded between zero and one, that characterize pathway correlations at increasing levels of complexity, the third-order minor determines whether allosteric pathways are correlated or uncorrelated, and the fourth-order minor quantifies how distinct pathways communicate through intermediary residues. We apply this framework to analyze the evolutionary adaptation of the PSD95pdz3 domain from Class I to Class II ligand specificity via mutations G330T and H372A. The cooperation index demonstrates a distinct evolutionary hierarchy: the G330T mutation establishes distributed pathway couplings that the H372A mutation subsequently exploits, whereas H372A alone produces minimal global changes. Furthermore, the fourth-order analysis identifies His317 as a critical intermediary node bridging the class-switching (330-372) and class-bridging (330-400) allosteric pathways. These results demonstrate that allosteric dependencies emerge only when mutations accumulate in specific combinations, with a hierarchical organization of pathways structured around position 330 and intermediary nodes His317 and Phe400. Rather than predicting allosteric mechanisms, this framework provides a mechanistic explanation for why and how allostery emerges during protein evolution.

14.
arXiv (CS.AI) 2026-06-11

Learning to Inject: Automated Prompt Injection via Reinforcement Learning

arXiv:2602.05746v2 Announce Type: replace-cross Abstract: Prompt injection is a critical vulnerability in LLM agents, yet the strongest methods still rely on human red-teamers and hand-crafted prompts. Adapting automated jailbreak optimizers does not close this gap: jailbreaks shape models toward generic compliance, while prompt injection requires emitting specific tool calls with correct parameters. The success signal is binary, and randomly sampled suffixes almost never trigger it, so standard optimizers have no gradient to follow. We present AutoInject, a black-box reinforcement learning (RL) framework that learns adversarial suffixes for prompt injection. A learned comparison-based reward scores each candidate against the best suffix seen so far, turning the binary signal into a dense reward suitable for RL optimization. The framework supports both online query-based attacks and offline-trained transferable suffixes that need no utility access at deployment, and incorporates a utility objective when task-completion feedback is available. On AgentDojo, AutoInject outperforms template attacks, GCG, TAP, and adaptive attack across production models, with statistically significant improvements under McNemar's test with p

15.
medRxiv (Medicine) 2026-06-17

Performance of five risk stratification tools for paediatric pneumonia against WHO scores using data from the PediCAP trial in sub-Saharan Africa

Background Risk stratification tools for childhood pneumonia have been proposed to improve identification of children at highest risk of death, particularly in low-resource settings. However, their added value over the WHO Integrated Management of Childhood Illness (IMCI) criteria and danger signs remains uncertain. Methods We conducted a secondary analysis of a multi-country randomised controlled trial of children without HIV hospitalised with pneumonia in Mozambique, South Africa, Uganda, Zambia, and Zimbabwe. We evaluated the performance of five published risk scores alongside WHO IMCI severity classification and danger signs. Discrimination for (1) in-hospital mortality, (2) 28-day mortality, and (3) 28-day readmission or death was assessed using area under the receiver operating characteristic curve (AUC). Comparative performance and clinical utility were examined. Results Of the 1010 participants, 18 (1.8%) died in hospital, 22 (2.2%) died in hospital or in the 7 days post-discharge, and 63 (6.2%) died or were readmitted by day 28. Univariate case-fatality rates were highest for variables associated with malnutrition, convulsions, and hypoxaemia. All risk scores demonstrated moderate discrimination for in-hospital and in-hospital+7-day mortality (AUC range approximately 0.75-0.84), with no meaningful differences between models, and performed similarly to the WHO danger signs and IMCI severity classification. In contrast, all approaches performed poorly in predicting 28-day readmission or death (AUC approximately 0.54-0.58). No risk score consistently outperformed simple clinical criteria. Conclusions In this multi-country dataset, we found no evidence that published paediatric pneumonia risk scores meaningfully outperform WHO IMCI-based clinical assessment for predicting mortality. The relatively small number of mortality events limits precision, and modest differences cannot be excluded. These findings suggest that, in low-resource settings, strengthening implementation of existing WHO clinical criteria may be more effective than adopting more complex prediction tools.

16.
arXiv (CS.LG) 2026-06-11

Mitigating Disparate Impact of Differentially Private Learning through Bounded Adaptive Clipping

arXiv:2506.01396v2 Announce Type: replace Abstract: Differential privacy (DP) has become an essential framework for privacy-preserving machine learning. Existing DP learning methods, however, often have disparate impacts on model predictions, e.g., for minority groups. Gradient clipping, which is often used in DP learning, can suppress larger gradients from challenging samples. We show that this problem is amplified by adaptive clipping, which will often shrink the clipping bound to tiny values to match a well-fitting majority, while significantly reducing the accuracy for others. We propose bounded adaptive clipping, which introduces a tunable lower bound to prevent excessive gradient suppression. Our method improves worst-class accuracy by over 10 percentage points on Skewed and Fashion MNIST compared to unbounded adaptive clipping, 7 points compared to Automatic clipping, and 5 points compared to constant clipping. The code is available at https://github.com/TrustworthyMLHelsinki/adaptive-clipping-fairness.

17.
arXiv (quant-ph) 2026-06-12

Block algebra for morphing circuits

作者:

arXiv:2606.12724v1 Announce Type: new Abstract: Morphing circuits are a new paradigm for quantum error correction that relaxes hardware requirements. We present four constructions for CNOT-based CSS morphing circuits with explicit qubit connectivity degrees. All four constructions are specified in block algebra notation, with entries in algebras generated by permutation matrices. The first three are obtained by rewriting existing surface- and color-code morphing circuits; the fourth is a new three-round construction modeled on the 6.6.6 color code. The surface-code construction recovers the morphing circuit of Ref. [ST25] for two-block group algebra codes. Numerical search then instantiates these permutation matrices using regular representations of finite groups. [ST25] M. H. Shaw and B. M. Terhal, Phys. Rev. Lett. 134(9), 090602 (2025).

18.
arXiv (CS.CV) 2026-06-11

AnchorEdit: Maintaining Temporal Consistency in Multi-turn Image Editing via Causal Memory

Multi-turn image editing is essential for iterative design, yet current models often struggle with identity drift and error accumulation over successive steps. While existing research leverages video priors for consistency, their reliance on bidirectional attention is fundamentally misaligned with the causal, sequential nature of interactive editing. In this paper, we propose AnchorEdit, the first autoregressive (AR) diffusion-based framework designed specifically for high-resolution, long-term multi-turn editing. AnchorEdit bridges the gap between video priors and causal inference through a three-stage training curriculum: identity-preserving sing-turn pretraining, causal AR forcing fine-tuning with a novel self-rollout strategy to mitigate exposure bias, and consistency distillation for efficient 4-step generation. During inference, we introduce a memory mechanism to anchor the initial subject identity and ensure stable extrapolation across extended editing trajectories. To evaluate performance, we provide a new high-resolution multi-turn editing benchmark designed to stress-test long-horizon stability. Extensive experiments demonstrate that AnchorEdit achieves state-of-the-art results, maintaining exceptional subject fidelity and instruction following even over 10+ interaction rounds.

19.
arXiv (CS.LG) 2026-06-19

Enhancing Graph Neural Networks Using Proximity Graphs for Dust Source Emission Forecasting

arXiv:2606.19825v1 Announce Type: new Abstract: Accurate prediction of dust source emissions is critical for mitigating the significant environmental and health hazards posed by dust storms. Traditional forecasting methods often struggle to capture the complex spatiotemporal dynamics of these phenomena. In this paper, we demonstrate that proximity graphs enable Graph Neural Networks (GNNs) to effectively model the intricate spatial and temporal relationships between data points. Specifically, we use proximity graphs–such as Delaunay triangulation, Gabriel graph, k-Nearest Neighbor graph, and Yao graph–as the input for GNNs (including GraphSAGE, Graph Convolutional Networks, and Graph Attention Networks) to perform message passing. Our approach highlights the effectiveness of integrating proximity graphs with GNNs for robust and accurate dust source forecasting. To emphasize the importance of proximity graph representations, we compare our method against GNNs using random graphs for message passing. The results show that GNNs with proximity graphs significantly outperform those with random graphs and are also far superior to Long Short-Term Memory (LSTM) model in dust source emission forecasting.

20.
medRxiv (Medicine) 2026-06-15

Midwifery Practice in Conflict Contexts: Lived Experiences from Somalia and Nigeria

Background: Midwives are a central cadre in the health system, particularly in conflict-affected settings where they are sometimes the primary or even only skilled providers available. Yet, despite their critical role, there is limited qualitative evidence capturing their lived experiences and how these shape workforce entry, retention, and overall well-being. Methods: Drawing on a phenomenological research methodology, this qualitative study was embedded within a larger prospective longitudinal cohort of midwifery students and graduates in Somalia and Nigeria. We conducted focus group discussions with graduate midwives (n=48 in Nigeria; n=63 in Somalia) to explore their experiences transitioning into the workforce and their realities working in health systems impacted by conflict and violent insecurity. Data were analysed using inductive thematic analysis. Results: Five themes emerged from the data: (1) job search and workforce entry, which was described as fraught with challenges and shaped by a set of formal systems in Nigeria but informal networks and structural barriers in Somalia (2) working conditions that were marked by resource scarcity, infrastructural challenges, and heavy and unreasonable workloads, (3) safety, security and coping strategies that differed across the two contexts but reflected persistent exposure to violence and a reliance on ad hoc and personal coping in lieu of systematic protection, (4) community perceptions of midwives, shaped and constrained by social and gender norms and (5) mental health and emotional wellbeing, highlighting stress, burnout and moral injury experienced by this cadre. Conclusion: Our findings highlight the profound challenges faced by midwives working in conflict-affected settings, and they shine a light on the urgent need to support and invest in this critical and predominantly female health workforce.

21.
arXiv (CS.AI) 2026-06-19

Exploring Feature Extraction Technique Parameters for Acoustic Gunshot Classification

arXiv:2606.19568v1 Announce Type: cross Abstract: Acoustic gunshot detection is a problem with applications across civilian public safety, military operations, and wildlife conservation, yet the field lacks a rigorous exploration of feature extraction techniques with a focus on generalization to realistic data. The mixed effectiveness of commercial gunshot detection and classification systems indicates an open problem that is not adequately addressed by the current literature. In this paper, we present a systematic investigation of common feature extraction techniques using a dataset of 23,000 gunshot recordings across 85 firearms and 21 calibers. We benchmark three feature extraction techniques with 12 total unique parameter sets using ResNet-18. Our results demonstrate that using the correct feature extraction technique can improve top-1 accuracy by up to 20%, and utilizing the correct parameters for a given feature extraction technique can improve that value by up to 4.7%.

22.
arXiv (CS.CV) 2026-06-16

Question-Aware Evidence Ledgers for Video Relational Reasoning

The VRR-QA challenge evaluates visual relational reasoning in videos, where answers often depend on implicit spatial relations, event boundaries, target identity, and dialogue context rather than a single salient frame. We present a test-time reasoning pipeline built around a strong GPT-5.5 video QA solver and a set of question-aware evidence ledgers. The initial solver answers each question from a uniform video representation, while routed ledgers are prompted to make the required targets, count units, reference frames, and temporal or spatial scope explicit for counting, spatial, endpoint, viewpoint, and dialogue reasoning. External tools such as open-vocabulary detection, depth cues, pair crops, ASR, and scene-graph ledgers are used only as evidence sources. A conservative gate keeps the current answer unless independent evidence uniquely supports a different option. The final evidence-gated pipeline achieves 92.95% overall accuracy and 93.79% macro accuracy on the challenge test split.

23.
arXiv (math.PR) 2026-06-11

Numerical simulations of the spread from the mean of the SLE and Multiple SLE dynamics

arXiv:2606.11254v1 Announce Type: cross Abstract: The Schramm-Loewner Evolution (SLE) describes a family of fractal curves that arise in the study of the scaling limits of many planar Statistical Physics models. These curves are modeled using the Loewner Differential Equation for the conformal maps $g_t(z)$ with a Brownian motion driver. Using Euler's Method, in the current work we performed numerical experiments to study at a fixed time the quantities $|g_t(z) - \overline{g_t(z)}|$ and $Re(g_t(z)) - Re(\overline{g_t(z)})$, where $Re$ denotes the real part and $\overline{g_t(z)}$ refers to the sample average. These random variables measure the 'spread' of the dynamics from the average behavior at fixed time. One of the scopes of this work is to give numerical predictions for future theoretical investigations on these quantities. When investigating these quantities in the SLE case our experiments predict that the distribution is bimodal when the dynamics started close to the origin, and it can become bell-shaped if the dynamics is started further from the origin. In the second part, we performed experiments for a Multiple SLE model whose driver is Dyson Brownian Motion. Due to singularity in the dynamics of the drivers and the many data points needed, this part is challenging from a computational perspective. In the multiple SLE case, our experiments predict that the distribution is bell-shaped in all cases. In addition, we check the changes in the distributions as we vary the parameter $\kappa$ in the SLE case and $\beta$ in the Multiple SLE case.

24.
arXiv (CS.AI) 2026-06-12

The Query Channel: Information-Theoretic Limits of Masking-Based Explanations

arXiv:2604.16689v2 Announce Type: replace Abstract: Masking-based post-hoc explanation methods, such as KernelSHAP and LIME, estimate local feature importance by querying a black-box model under randomized perturbations. This paper formulates this procedure as communication over a query channel, where the latent explanation acts as a message and each masked evaluation is a channel use. Within this framework, the complexity of the explanation is captured by the entropy of the hypothesis class, while the query interface supplies information at a rate determined by an identification capacity per query. We derive a strong converse showing that, if the explanation rate exceeds this capacity, the probability of exact recovery necessarily converges to one in error for any sequence of explainers and decoders. We also prove an achievability result establishing that a sparse maximum-likelihood decoder attains reliable recovery when the rate lies below capacity. A Monte Carlo estimator of mutual information yields a non-asymptotic query benchmark that we use to compare optimal decoding with Lasso- and OLS-based procedures that mirror LIME and KernelSHAP. Experiments reveal a range of query budgets where information theory permits reliable explanations but standard convex surrogates still fail. Finally, we interpret super-pixel resolution and tokenization for neural language models as a source-coding choice that sets the entropy of the explanation and show how Gaussian noise and nonlinear curvature degrade the query channel, induce waterfall and error-floor behavior, and render high-resolution explanations unattainable.

25.
arXiv (math.PR) 2026-06-16

The Ornstein$-$Uhlenbeck process on $\mathscr P_2$ with a volatility operator

arXiv:2606.14917v1 Announce Type: new Abstract: We analyze a diffusion ${(\mu_t)}_{t\geq 0}$ on the $2$-Wasserstein space $\mathscr P_2$ over $\mathbb R^d$ for which \begin{equation*} |\mu_t|_2^2-|\mu_0|_2^2-2ct+2\int_0 ^t|\mu_s|_2^2\,d s,\qquad t\geq 0, \end{equation*} is a martingale, where the constant $c\in(0,\infty)$ equals the trace of a volatility operator on a Hilbert space and $|\mu_t|_2:=(\int_{\mathbb R^d}x^T x\mu_t(d x ))^{1/2}$. The invariant measure of ${(\mu_t)}_{t\geq 0}$ is a Gaussian on $\mathscr P_2$, as introduced by P. Ren and F.-Y. Wang. Moreover, the Dirichlet form and its generator are given explicitly on a dense subspace of $L^2$.