Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
arXiv (math.PR) 2026-06-17

Absolute continuity, supports and idempotent splitting in categorical probability

arXiv:2308.00651v5 Announce Type: replace Abstract: Markov categories have recently turned out to be a powerful high-level framework for probability and statistics. They accommodate purely categorical definitions of notions like conditional probability and almost sure equality, as well as proofs of fundamental results such as the Hewitt–Savage 0/1 Law, the de Finetti Theorem and the Ergodic Decomposition Theorem. In this work, we develop additional relevant notions from probability theory in the setting of Markov categories. This comprises improved versions of previously introduced definitions of absolute continuity and supports, as well as a detailed study of idempotents and idempotent splitting in Markov categories. Our main result on idempotent splitting is that every idempotent measurable Markov kernel between standard Borel spaces splits across another standard Borel space, and we derive this as an instance of a general categorical criterion for idempotent splitting in Markov categories.

02.
arXiv (CS.AI) 2026-06-18

Explaining Attention with Program Synthesis

arXiv:2606.19317v1 Announce Type: cross Abstract: A longstanding goal of research on interpretable deep learning is to replace opaque neural computations with human-meaningful symbolic descriptions. In this paper, we propose an approach for approximating the behavior of components of deep networks with executable programs. We focus on attention heads in transformer language models. For a given head, we first compute its associated attention matrices on a collection of randomly selected training examples. Next, we prompt a pre-trained language model with a summary of these matrices, and instruct it to generate a set of Python programs that can reproduce the associated attention patterns given only text from the input sentence. Finally, we re-rank programs according to how well our final set of programs predicts behavior on held-out inputs. We demonstrate that a set of fewer than 1,000 such generated programs can reproduce the attention patterns of heads in GPT-2, TinyLlama-1.1B, and Llama-3B, achieving an average Intersection-over-Union similarity above 75% on TinyStories. Moreover, the best-fit programs can replace neural attention heads without substantially affecting model behavior: replacing 25% of attention heads with programmatic surrogates across the three models incurs only a 16% average perplexity increase, while maintaining performance on a variety of downstream question answering benchmarks. This work contributes a scalable pipeline for reverse-engineering attention heads in transformer models using human-readable, executable code, advancing a path toward symbolic transparency in neural models.

03.
arXiv (quant-ph) 2026-06-12

Robust Pretty Good Measurement via Hybrid Classical-Quantum Pseudoinverse Approximation and Circuit-Level Realization

arXiv:2606.13150v1 Announce Type: new Abstract: Pretty Good Measurement (PGM) is a near-optimal strategy for quantum state discrimination, but its practical realization becomes unstable when the ensemble operator is singular or ill-conditioned. We introduce a numerically robust PGM formulation based on the Moore-Penrose pseudoinverse, replacing the standard inverse square root with a threshold-regularized variant that remains well-defined across different spectral regimes. We develop a hybrid classical-quantum framework that combines pseudoinverse-based spectral preprocessing with quantum circuit realizations using block-encoding and spectral-transformation techniques. The framework incorporates support awareness, yielding physically meaningful measurement operators even in rank-deficient cases, and employs oblivious amplitude amplification to improve circuit-level success probabilities. Extensive numerical and circuit-level simulations show close agreement between theoretical predictions and quantum circuit outputs. Experiments on synthetic and real datasets, including ill-conditioned and degenerate scenarios, demonstrate stable discrimination performance where standard PGM becomes numerically unstable. The results establish a practical hybrid classical-quantum framework for robust quantum state discrimination and extend previous circuit-based implementations of the PGM testing stage toward pseudoinverse-aware measurement design.

04.
arXiv (CS.CL) 2026-06-17

When AI Says "I have been in similar situations": Synthetic Lived Experience in Peer-Like Caregiver Support

Caregivers often turn to online communities for informational and emotional support. In these spaces, peer supporters frequently draw on personal narratives to respond to emotionally complex caregiving situations. As LLMs are increasingly designed as peer-like sources of support, they introduce a critical tension: AI can provide immediate, private, and nonjudgmental support, but it cannot authentically possess the lived experiences that make human peer support meaningful. Yet, when prompted to sound peer-like, LLMs may generate language that implies lived experience. This creates a synthetic lived experience paradox: the same experiential language that may make AI support feel warm, relatable, and peer-like can also falsely position the system as someone with lived experience. We examine this paradox in the context of family caregivers of people living with Alzheimer's Disease and Related Dementias (ADRD). Drawing on caregiver support exchanges from online communities and prompted peer-like responses from three LLMs – LLaMA, GPT-4o-mini, and MedGemma – we analyze how human peers use personal narratives and how AI incorporates similar narrative forms. Psycholinguistic analysis shows that peer responses used significantly more first-person and past-focused language than peer-like AI responses. Qualitatively, we identify seven types of personal narratives in human peer support and show that AI often captures their emotional work, but can fabricate experiential grounding. These findings reveal a narrative authenticity gap: peer-like AI can generate synthetic lived experience without the real experience that makes peer support meaningful. We argue that caregiver-support AI systems need mechanisms to distinguish supportive peer-like framing from fabricated lived experience, ensuring that models can offer warmth and validation without falsely positioning themselves as experiential peers.

05.
arXiv (CS.AI) 2026-06-16

Mojo: A Promising Tool for Scalable Financial AI Efficiency

arXiv:2606.16059v1 Announce Type: cross Abstract: For thirty years, quantitative finance has paid a costly two-language tax: models researched in Python are rewritten in C++ for production, often introducing numerical discrepancies. GPU-accelerated deep learning exacerbates this problem, as nondeterministic floating-point reductions can produce drift in long backtests, challenging regulatory reproducibility and auditability expectations. This article surveys Mojo, Modular's 2026 Python-like systems language, as a structural response for capital markets engineering. While closing the Python-to-C++ performance gap, Mojo uniquely combines native interoperability with the low-level systems control required to construct bit-exact deterministic kernels. Its MLIR compilation infrastructure further allows a single codebase to target scalar, SIMD, multicore, and GPU execution, reducing the translation bottleneck between research and production. We benchmark four core financial AI workloads: Monte Carlo option pricing, LLM sentiment inference, multi-asset backtesting, and portfolio Value at Risk. On Apple Silicon, Mojo demonstrates 20x to 180x speedups over pure Python on directly measured kernels; larger-scale GPU workload results are projections calibrated from published benchmarks. Alongside transparent performance data, we introduce mojo-deterministic, an open-source library of reproducible reduction kernels, and provide a candid assessment of the problems Mojo does and does not yet solve.

06.
arXiv (CS.LG) 2026-06-19

Model soups need only one ingredient

arXiv:2602.09689v2 Announce Type: replace Abstract: Fine-tuning large pre-trained models on a target distribution often improves in-distribution (ID) accuracy, but at the cost of out-of-distribution (OOD) robustness as representations specialize to the fine-tuning data. Weight-space ensembling methods, such as Model Soups, mitigate this effect by averaging multiple checkpoints, but they are computationally prohibitive, requiring the training and storage of dozens of fine-tuned models. In this paper, we introduce MonoSoup, a simple, data-free, hyperparameter-free, post-hoc method that achieves a strong ID-OOD balance using only a single checkpoint. Our method applies Singular Value Decomposition (SVD) to each layer's update and decomposes it into high-energy directions that capture task-specific adaptation and low-energy directions that introduce noise but may still encode residual signals useful for robustness. MonoSoup then uses entropy-based effective rank to automatically re-weigh these components with layer-wise coefficients that account for the spectral and geometric structure of the model. Experiments on CLIP models fine-tuned on ImageNet and evaluated under natural distribution shifts, as well as on Qwen language models tested on mathematical reasoning and multiple-choice benchmarks, show that this plug-and-play approach is a practical and effective alternative to multi-checkpoint methods, retaining much of their benefits without their computational overhead.
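As a rough illustration of the "entropy-based effective rank" mentioned in the abstract, the sketch below uses one common definition: the exponential of the Shannon entropy of the normalized singular values. Whether MonoSoup uses exactly this definition, and how it turns the value into layer-wise coefficients, is not stated here, so the function name and the input values are hypothetical.

```python
import math


def effective_rank(singular_values):
    """Entropy-based effective rank of a spectrum.

    Assumed definition (not confirmed by the abstract): normalize the
    singular values to a probability vector, take its Shannon entropy,
    and exponentiate. Zero singular values are ignored.
    """
    positive = [s for s in singular_values if s > 0]
    total = sum(positive)
    if total == 0:
        return 0.0
    probs = [s / total for s in positive]
    entropy = -sum(p * math.log(p) for p in probs)
    return math.exp(entropy)


# Hypothetical spectra of a layer update (made-up numbers).
flat_spectrum = [1.0, 1.0, 1.0, 1.0]      # energy spread evenly
peaked_spectrum = [10.0, 0.1, 0.1, 0.1]   # one dominant direction

print(effective_rank(flat_spectrum))
print(effective_rank(peaked_spectrum))
```

Under this definition a flat spectrum has an effective rank equal to the number of nonzero singular values, while a spectrum dominated by one direction has an effective rank close to 1.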

07.
medRxiv (Medicine) 2026-06-22

The Unsteady Return of Command-Following: Recovery and Instability of Bedside Motor Command-Following After Acute Brain Injury

Background/Objective: Following a verbal command marks the bedside transition from unresponsiveness to overt recovery of consciousness after acute brain injury. Its timing across phenotypes, stability once present, and dependence on sedation are uncharacterized at scale. Methods: Retrospective cohort of adults with acute brain injury, first intensive care unit stay, MIMIC-IV. Command-following was the Glasgow Coma Scale motor response "Obeys Commands." Among patients not following commands at admission, cumulative incidence was estimated with death or hospice and discharge without recovery as competing events. Instability was quantified as transient first recovery and threshold crossings; examinations were tagged for concurrent sedation. Principal findings were externally validated in the multicenter eICU Collaborative Research Database. Results: Of 13,900 brain-injured patients with three or more motor examinations, 5,498 (39.6%) were not following commands at admission. The cumulative incidence of first command-following was 43.5% by 24 hours and 65.0% by 14 days, ranging at 14 days from 36.9% in anoxic injury to 77.2% in ischemic stroke (anoxic versus ischemic stroke at 72 hours, difference 0.41; adjusted P = .002). Among 3,573 patients who recovered, the first recovery was transient in 22.2%, and 62.4% crossed the threshold repeatedly. Non-following was strongly associated with sedation, consistent with an arousal-dependent examination. In eICU, the 14-day incidence was 64.8%, and transient first recovery was 22.7%, closely matching the primary cohort. Conclusions: After acute brain injury, overt bedside command-following returns early but unsteadily, with phenotype-dependent timing, threshold fluctuation, and strong dependence on sedation. A single charted observation is an unreliable index of the underlying state.

08.
arXiv (CS.CV) 2026-06-17

NTIRE 2025 Challenge on Image Super-Resolution (x4): Methods and Results

This paper presents the NTIRE 2025 image super-resolution ($\times$4) challenge, one of the associated competitions of the 10th NTIRE Workshop at CVPR 2025. The challenge aims to recover high-resolution (HR) images from low-resolution (LR) counterparts generated through bicubic downsampling with a $\times$4 scaling factor. The objective is to develop effective network designs or solutions that achieve state-of-the-art SR performance. To reflect the dual objectives of image SR research, the challenge includes two sub-tracks: (1) a restoration track, which emphasizes pixel-wise accuracy and ranks submissions based on PSNR; (2) a perceptual track, which focuses on visual realism and ranks results by a perceptual score. A total of 286 participants registered for the competition, with 25 teams submitting valid entries. This report summarizes the challenge design, datasets, evaluation protocol, the main results, and methods of each team. The challenge serves as a benchmark to advance the state of the art and foster progress in image SR.

09.
arXiv (CS.LG) 2026-06-16

Greedy Coordinate Diffusion: Effective and Semantically Coherent Adversarial Attacks via Diffusion Guidance

arXiv:2606.15531v1 Announce Type: new Abstract: Fine-tuning aligned language models on benign tasks (e.g. math tutoring) systematically breaks safety guardrails, even when training data contains no harmful content. While mechanistic approaches have shed light on where alignment resides in model weights, they do not provide a general formal framework for deriving guarantees about when fine-tuning degrades it – leaving the field without principled tools for predicting or preventing alignment collapse. We develop a local geometric framework through geometric analysis of parameter-space trajectories and apply it to understand the fragility of alignment in fine-tuning. While first-order analysis suggests orthogonal updates are safe, we prove this is illusory: the curvature of the fine-tuning loss induces second-order acceleration that can induce second-order drift into alignment-sensitive regions. We formalize a construct of our framework as the Alignment Instability Condition (AIC), three geometric properties that, when present, are sufficient to guarantee degradation. Our main result proves quartic onset of alignment degradation along gradient-flow trajectories, determined by how sharply alignment depends on specific parameters and how strongly tasks couple to these parameters. These findings yield formal sufficient conditions under which static first-order protection can fail under gradient descent. We further empirically validate the framework's foundations, showing that the Fisher Information Matrix provides a proxy for the degree of safety degradation across diverse fine-tuning.

10.
arXiv (CS.AI) 2026-06-16

Cognitive Debt: AI as Intellectual Leverage and the Dynamics of Systemic Fragility

arXiv:2606.15078v1 Announce Type: new Abstract: We develop a formal theory of cognitive debt: the stock of unverified reasoning obligations that accumulates when individuals use AI as a substitute rather than a complement for first-principles cognition. The model features two state variables per agent, cognitive capital and cognitive debt, and a multiplicative production technology in which cognitive capital functions as collateral that determines the return to AI adoption. We establish six propositions. Rational agents incur positive cognitive debt because the costs are deferred, partially external, and masked by short-run productivity gains. Tranquil periods lower subjective risk assessments, raise AI substitution intensity, and compound leverage, generating a cognitive Minsky moment in which subjective risk falls while true systemic fragility rises. Expected crisis losses are convex in aggregate leverage. Post-crisis, output-target pressure can produce a false-correction loop in which agents patch AI failures with more AI. The decentralised equilibrium over-adopts substitutive AI relative to the social optimum because of systemic risk, cognitive public goods, and arms-race externalities. In a two-type heterogeneous-agent economy, high-cognitive-capital agents adopt AI more intensively and may eventually erode their unaided cognitive capital below that of initially lower-skilled agents.

11.
arXiv (CS.CV) 2026-06-17

GASE: Gaussian Splatting-Based Automated System for Reconstructing Embodied-Simulation Environments

Training embodied agents in the real world requires skilled operators and expensive hardware. Simulation environments offer a compelling alternative by enabling large-scale, cost-effective data augmentation. Consequently, rapidly constructing high-fidelity simulation scenes with a minimal sim-to-real gap has become a critical objective in robot learning. While reconstruction-based methods provide superior visual quality, current workflows are hindered by inefficient data acquisition and subpar foreground object extraction. We thus propose GASE, a highly automated system for simulation scene construction. GASE leverages multi-view video streams from panoramic camera arrays to enable rapid environment scanning. To ensure high-quality asset generation, our pipeline introduces a camera-pose-based strategy that robustly extracts objects across frames in the 2D domain, followed by high-fidelity scene inpainting. Foreground objects and the static background are then reconstructed independently and seamlessly imported into physics simulators for policy training. Extensive experiments demonstrate that GASE outperforms existing 3D Gaussian-based methods in segmentation accuracy by over 10% while achieving state-of-the-art inpainting quality. Furthermore, real-robot deployments across manipulation and navigation tasks maintain a performance gap of less than 10% compared to policies trained purely on real-world data. These results confirm that GASE provides an efficient and highly effective solution for bridging the sim-to-real gap. Code will be released.

12.
arXiv (CS.CL) 2026-06-25

Dustin: Draft-Augmented Sparse Verification for Efficient Long-Context Generation with Speculative Decoding

While speculative decoding improves inference throughput for multi-batch long-context Large Language Models (LLMs), its efficiency is often limited by a verification bottleneck where Key-Value (KV) cache loading dominates latency. Existing compression methods fail in this regime: static eviction incurs accuracy loss due to saliency shift, while dynamic selection introduces prohibitive computational overhead during the verification path. We propose Dustin, a sparse verification framework designed for long-context speculative decoding. Dustin integrates lookahead signals from the draft model with historical attention from the target model to identify critical tokens with high fidelity across multi-step verification windows. To reduce recomputation latency, this approach further employs a sparse estimation scheme that restricts importance scoring to a minimal subset of attention heads. Evaluations on PG-19 and LongBench with Qwen2.5-72B demonstrate that Dustin achieves a 27.85x speedup in self-attention and a 9.17x end-to-end decoding speedup at a 32k sequence length, all with negligible accuracy degradation.

15.
arXiv (CS.AI) 2026-06-25

Variable Bound Tightening for Nash Equilibrium Computation in Multiplayer Imperfect-Information Games

arXiv:2606.25997v1 Announce Type: cross Abstract: There has been significant recent progress in algorithms for approximation of Nash equilibrium in large two-player zero-sum imperfect-information games and exact computation of Nash equilibrium in multiplayer strategic-form games. While counterfactual regret minimization and fictitious play are scalable to large games and have convergence guarantees in two-player zero-sum games, they do not guarantee convergence to Nash equilibrium in multiplayer games. Recently, an approach has been presented for exact computation of Nash equilibrium in multiplayer imperfect-information games that solves a quadratically constrained program based on a nonlinear complementarity problem formulation derived from the sequence-form game representation. This formulation was solved using Gurobi's nonconvex quadratic solver, which employs spatial branch-and-bound to iteratively refine variable bounds by solving convex relaxations of bilinear terms via McCormick envelopes. During presolve, Gurobi introduces auxiliary variables and, in some cases, binary variables, leading to an internal MIQCP reformulation. This approach was demonstrated to outperform prior algorithms from the Gambit software suite and quickly solve three-player Kuhn poker after removal of dominated actions; however, the algorithm was not able to solve the full version of the game within 24 hours. In this paper, we derive finite bounds on slack and multiplier variables in the nonlinear complementarity formulation. These bounds strengthen the convex relaxations used within spatial branch-and-bound and lead to substantial computational improvements. We demonstrate the impact of the proposed bounds on exact Nash equilibrium computation in three-player Kuhn poker.
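To make the role of variable bounds concrete, here is a minimal sketch of the standard McCormick envelope for a single bilinear term w = x*y. It is not the paper's formulation or Gurobi's internal code; the function name and the numeric bounds are illustrative assumptions. It only shows why tighter bounds on x and y shrink the relaxation gap that spatial branch-and-bound has to close.

```python
def mccormick_bounds(x, y, x_lo, x_hi, y_lo, y_hi):
    """Bounds on w = x*y implied by the McCormick envelope at the point (x, y).

    Requires x_lo <= x <= x_hi and y_lo <= y <= y_hi.
    """
    # Convex underestimators of x*y.
    lower = max(
        x_lo * y + x * y_lo - x_lo * y_lo,
        x_hi * y + x * y_hi - x_hi * y_hi,
    )
    # Concave overestimators of x*y.
    upper = min(
        x_hi * y + x * y_lo - x_hi * y_lo,
        x_lo * y + x * y_hi - x_lo * y_hi,
    )
    return lower, upper


# Same point, loose versus tightened variable bounds (hypothetical numbers).
x, y = 0.5, 0.5
loose = mccormick_bounds(x, y, 0.0, 1.0, 0.0, 1.0)
tight = mccormick_bounds(x, y, 0.25, 0.75, 0.25, 0.75)

print("true product:", x * y)
print("loose envelope:", loose, "gap:", loose[1] - loose[0])
print("tight envelope:", tight, "gap:", tight[1] - tight[0])
```

The true product always lies inside the envelope, and the envelope narrows as the box around (x, y) shrinks, which is the mechanism by which finite bounds on slack and multiplier variables can strengthen the relaxations.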

16.
arXiv (quant-ph) 2026-06-24

Auxiliary Schmidt Rank as a Resource for Photonic Bell Measurements

arXiv:2606.24591v1 Announce Type: new Abstract: In quantum communication and fusion-based quantum computation, photonic Bell measurements are fundamentally limited when only passive linear optics is employed. While for qubits, some Bell states can be unambiguously identified with static beam splitters and no extra photons or entanglement, additional auxiliary photons or at least additional auxiliary degrees of freedom with a certain level of additional entanglement are needed to approach or attain a complete, deterministic Bell measurement. Here, we prove an exact resource threshold when the same two photons carry system qudits of dimension $d$ and a fixed auxiliary entangled state $\Phi$, possibly distributed over several additional degrees of freedom, with total Schmidt rank $r_\Phi$. We show that a single conclusive Bell-label functional can occur for $r_\Phi\geqslant\lceil d/2\rceil$, but deterministic discrimination of all $d^2$ Bell-state labels requires $r_\Phi\geqslant d$. A maximally entangled rank-$d$ auxiliary state achieves the bound by local Bell-basis sorting between each photon's system and auxiliary degrees of freedom. Thus, the auxiliary Schmidt rank is a certified resource for ancilla-photon-free, embedded photonic Bell measurements.

17.
arXiv (CS.LG) 2026-06-15

Neural ARFIMA model for forecasting BRIC exchange rates with long memory

arXiv:2509.06697v3 Announce Type: replace-cross Abstract: Exchange rate forecasting remains a challenging problem, particularly for emerging economies, where the observed time series exhibit pronounced long-memory dependence, nonlinear dynamics, and sensitivity to macro-financial drivers. Classical models such as ARFIMA capture long-range persistence but fail to adequately represent nonlinear relationships, while modern machine learning approaches often neglect the underlying long-memory structure in macroeconomic series. To address this gap, we propose a Neural AutoRegressive Fractionally Integrated Moving Average (NARFIMA) model that integrates ARFIMA-based long-memory modeling with neural networks for nonlinear function approximation, while incorporating exogenous macroeconomic and uncertainty indicators. The framework provides a unified approach for capturing persistence, nonlinear dynamics, and external shocks. We establish asymptotic stationarity of the NARFIMA process and develop conformal prediction intervals for distribution-free uncertainty quantification. Empirical results for BRIC exchange rates show that NARFIMA consistently outperforms a broad range of forecasting benchmarks across multiple horizons, underscoring the importance of explicitly modeling long-memory dependence in exchange rate dynamics. The 'narfima' R package provides an implementation of our approach.
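The long-memory component of ARFIMA-type models rests on the fractional differencing operator (1 - B)^d, where B is the backshift operator and d may be non-integer. The sketch below computes its truncated weights and applies them to a series. It illustrates only this classical building block, not the NARFIMA model or the narfima package; the function names and the toy series are assumptions made for illustration.

```python
def frac_diff_weights(d, n_terms):
    """First n_terms coefficients of the expansion of (1 - B)^d.

    Uses the recursion w_0 = 1, w_k = w_{k-1} * (k - 1 - d) / k.
    """
    weights = [1.0]
    for k in range(1, n_terms):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights


def frac_diff(series, d):
    """Apply (1 - B)^d to a series, treating values before the start as zero."""
    weights = frac_diff_weights(d, len(series))
    return [
        sum(weights[k] * series[t - k] for k in range(t + 1))
        for t in range(len(series))
    ]


# d = 1 recovers the ordinary first difference (with the first value kept).
print(frac_diff([1.0, 3.0, 6.0, 10.0], 1))

# A fractional d gives slowly decaying weights, the signature of long memory.
print(frac_diff_weights(0.4, 6))
```

For integer d the weights vanish after d + 1 terms, whereas for fractional d they decay slowly and never reach zero, which is how distant observations keep influencing the current value.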

18.
arXiv (CS.CL) 2026-06-16

SkillWiki: A Living Knowledge Infrastructure for Agent Skills

While knowledge is managed through Wikipedia and software through GitHub, agent skills still lack an infrastructure for large-scale production, governance, and evolution. SkillWiki is a living knowledge infrastructure that supports the organization, grounding, and continuous evolution of agent skills by transforming heterogeneous knowledge into reusable skill assets linked to their originating evidence. Our demonstration presents the complete skill lifecycle, from knowledge ingestion and skill production to provenance-aware exploration, governance, and execution-driven evolution. SkillWiki highlights a future in which knowledge, skills, and execution experience co-evolve within a shared infrastructure. The live demonstration and source code are publicly available at https://github.com/Huangdingcheng/SkillWiki.

19.
arXiv (math.PR) 2026-06-16

Layerwise Terminal Discrepancy in Chen's Reverse-Heat Coupling on the Boolean Cube

arXiv:2606.04573v2 Announce Type: replace-cross Abstract: Recently, Chen [Chen2026] proved that Talagrand's Boolean convolution conjecture holds up to the dimension-free factor \((\log\log\eta)^{3/2}\), namely for every fixed \(\tau>0\), \[ \mu\{P_\tau f>\eta\|f\|_1\} \le C_\tau \frac{(\log\log\eta)^{3/2}}{\eta\sqrt{\log\eta}}, \qquad \eta>e^3. \] We revisit the terminal testing-discrepancy step in Chen's perturbed reverse-heat coupling. Chen estimates this discrepancy globally in terms of the remaining gap to the terminal level. We keep the same coupling and the same reverse-heat formulations, but localize the terminal discrepancy on each remaining-gap layer before summing the layers. This changes the fixed-time anti-concentration cost from order \((\log L)^{3/2}/\sqrt L\) to order \((\log L)/\sqrt L\), where \(L=\log\eta\). Consequently, we obtain a \((\log\log\eta)^{1/2}\) improvement as \[ \mu\{P_\tau f>\eta\|f\|_1\} \le C_\tau \frac{\log\log\eta}{\eta\sqrt{\log\eta}}, \qquad \eta>e^3. \]
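The size of the stated gain can be checked directly from the two anti-concentration costs quoted in the abstract, using only \(L=\log\eta\): \[ \frac{(\log L)^{3/2}/\sqrt L}{(\log L)/\sqrt L} = (\log L)^{1/2} = (\log\log\eta)^{1/2}, \] which agrees with the ratio of the two tail bounds, \((\log\log\eta)^{3/2}/(\log\log\eta) = (\log\log\eta)^{1/2}\).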

20.
medRxiv (Medicine) 2026-06-24

MedGenesis: Toward a World Model for Autonomous Clinical and Translational Research

Clinical research advances slowly because its core tasks, from evidence synthesis to mechanistic validation, remain fragmented. We present MedGenesis, a clinical artificial intelligence (AI) scientist built on a world-model reasoning loop that jointly updates a Latent Hypothesis Space and a Latent Action Space under expected information gain (EIG), uncertainty reduction (UR), and a safety prior P(safe), and integrates longitudinal electronic health records (EHRs) via the Virtual Clinical Trajectory and Observation Representation (ViCTOR) for cohort retrieval, trajectory stratification, and time-to-event analysis. On two benchmarks - ClinicalResBench (1,697 expert-curated questions) and ClinicalRepBench (40 paper-reproduction tasks) - MedGenesis outperformed frontier language models and biomedical AI systems while reducing hallucination. Across 1 million patient observations spanning five clinical evidence formats, it generated traceable outputs across meta-analysis, randomized controlled trials, real-world trajectories, case-control studies, and case reports, with one wet-lab-coupled run nominating a 3-hydroxybutyrate - neutrophil axis modulating antitumor immunity. These results compress hypothesis-to-evidence cycles from years to hours, creating a continuous clinical discovery process.

21.
arXiv (CS.AI) 2026-06-15

Hierarchical ODE: Learning Continuous-Time Physical Prototypes for Early Link Failure Detection

arXiv:2606.14284v1 Announce Type: cross Abstract: Time series prototype learning is fundamentally challenged by observational ambiguity. Discrete architectures fail to resolve this, as they lack the capacity to decouple stochastic noise from continuous dynamics. Furthermore, rigid closed-set assumptions fail to capture unseen diversity. To address these limitations, we propose a hierarchical ordinary differential equation clustering network, which utilizes neural ordinary differential equations to model latent state evolution as a continuous integral curve. This formulation enforces temporal continuity to effectively disentangle smooth feature trends from stochastic noise, while our adaptive hierarchical mechanism autonomously determines the appropriate number of prototypes without rigid prior constraints. Validated on the early link failure detection task with irregularly sampled time series, the proposed method effectively extracts underlying physical prototypes, thereby enabling robust failure detection. Our code is available at https://github.com/NJ-LNN/Hierarchical-ODE.

22.
arXiv (CS.CV) 2026-06-25

Steering Vision-Language Models with Joint Sparse Autoencoders

Sparse Autoencoders (SAEs) have shown promise for analyzing language models, but applying them to vision-language models (VLMs) often yields representations that are difficult to use as controllable cross-modal steering directions. We introduce the Joint Sparse Autoencoder (JSAE), which uses an explicit alignment constraint to jointly factorize sequence-pooled vision and language activations into shared, interpretable image/caption-level features. Applied to LLaVA, JSAE recovers cross-modal features for recognizable concepts (e.g., food and animals). Through bidirectional interventions (additive steering and suppression), we observe a layer-dependent asymmetry under our protocol: additive steering peaks at mid-to-late (pre-output) layers and weakens at both ends, whereas suppression scores remain within a comparable range across all probed layers within statistical noise. Experiments on three VLMs, namely LLaVA-v1.6-Mistral-7B, Llama3-LLaVA-8B, and the MoE-based Qwen3-VL-30B, show related layer-localized effects across architectures. Together, these results suggest that explicitly aligned sparse representations support more controllable intervention-based analysis of multimodal features, within an identifiable layer range, than the unconstrained alternatives tested here.

23.
arXiv (CS.CV) 2026-06-11

CoCoSI: Collaborative Cognitive Map Construction for Spatial Intelligence

Spatial intelligence is a key frontier for multimodal large language models (MLLMs), enabling them to reason about the physical world from visual experience. Inspired by human spatial cognition, recent approaches construct grid-based cognitive maps from multi-frame visual inputs to maintain coherent spatial representations over time. However, limited context lengths still challenge spatial understanding, while existing methods, such as long-context modeling and external memory, often require architectural changes, memory modules, or finetuning, limiting their applicability to off-the-shelf pretrained MLLMs. This motivates a lightweight, model-agnostic method for preserving spatial information beyond the native context window. To this end, we propose a plug-and-play multi-agent framework that collaboratively constructs cognitive maps as structured spatial memory, enhancing the spatial understanding of arbitrary pretrained MLLMs without architectural modification or additional training. Our framework features local-global agent coordination, cognitive map construction with atomic commits, and cross-agent verification. Extensive experiments demonstrate that our method achieves superior performance on spatial understanding tasks while remaining fully training-free. Code will be released.

24.
medRxiv (Medicine) 2026-06-24

Digital exclusion and mental health in UK Armed Forces veterans: findings from the Veterans Digital Needs Study

Background: Public services are increasingly delivered through digital platforms. Although digital health may improve access and scalability, it may also widen inequalities for people who lack reliable access, confidence, skills, affordability or trust. Objective: This study examined the prevalence of self-reported digital exclusion among UK veterans and assessed its association with depression, anxiety and loneliness. Methods: A cross-sectional online survey was conducted between July 2025 and March 2026. Participants were UK Armed Forces veterans resident in the UK. The survey collected sociodemographic, military service, digital access and health data. Self-reported digital exclusion was defined as reporting feeling excluded or disadvantaged due to lack of digital access or skills. Probable depression, anxiety and loneliness were assessed using the PHQ-2, GAD-2 and three-item UCLA Loneliness Scale, respectively. Associations between digital exclusion and each outcome were examined using adjusted multivariable logistic regression. Results: Of 1,911 responses received, 1,607 were included after data quality exclusions. Among participants with valid responses to the primary digital exclusion item, 553 (41.7%) reported digital exclusion. Digital exclusion was more common among females, younger veterans and those with lower household income. Probable depression, anxiety and loneliness were more prevalent among digitally excluded participants than among non-excluded participants. In adjusted models, self-reported digital exclusion was associated with higher odds of probable depression (AOR 1.38; 95% CI 1.04 to 1.83; p=0.028), probable anxiety (AOR 1.63, 95% CI 1.23 to 2.16; p

25.
arXiv (quant-ph) 2026-06-12

Coarse-grained quantum thermodynamics: Observation-dependent quantities, observation-independent laws

arXiv:2507.15918v2 Announce Type: replace Abstract: In both classical and quantum thermodynamics, physical quantities are typically assigned objective values defined independently of our observations. We then refer to the 'work performed by a gas', or the 'entropy of the gas', regardless of how they are evaluated. Here, we question this conception in the context of quantum thermodynamics, estimating how the definition of pivotal thermodynamic quantities is affected by experimental instruments of limited precision. We find that the coarse-grained thermodynamic quantities frequently lead to different conclusions from those drawn in fine-grained scenarios. For instance, the irreversibility of a process, or its work payoff, can significantly vary with the instrument precision. We show nonetheless that coarse-grained thermodynamic quantities satisfy the same relations (i.e., the second law inequality, the relation between dissipation and distinguishability of a process from its time-reverse, and the quantum work fluctuation theorems) as their fine-grained counterparts. These results highlight the observation-independence of relations linking thermodynamic quantities which are themselves observation-dependent.