Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
bioRxiv (Bioinfo) 2026-06-16

MetaPilot: genome-aware adaptive search-space refinement for unified DDA and DIA metaproteomics

Metaproteomic peptide identification is constrained by the structure and size of the protein search space. Pooled gene catalogues provide coverage but obscure genome-level evidence, and current workflows for data-dependent (DDA) and data-independent (DIA) acquisition diverge in their database strategies. We present MetaPilot, a genome-aware workflow that uses conserved marker-protein evidence to rank candidate genomes from MGnify catalogues and construct adaptive, sample-specific search spaces. Applied to paired DDA/DIA datasets of defined mixtures and fecal samples, MetaPilot adapted genome selection to community complexity and reproduced published peptide evidence while expanding the detectable peptide space. In DDA-independent reanalysis of Orbitrap human gut DIA data, MetaPilot identified 24.4% more peptides than the published DDA-derived library and 2.06-fold more than the matched DDA-assisted DIA search. On timsTOF DIA-PASEF mouse intestinal data, it outperformed uMetaP by 41.8% to 119.7%, enabling genome-resolved functional interpretation without DDA-PASEF input.

02.
arXiv (CS.AI) 2026-06-15

Learning High Coverage Discriminative Parsimonious Rulesets

arXiv:2606.14156v1 Announce Type: cross Abstract: Learning systems based on IF-THEN rule representations readily offer interpretability, making them a crucial focus in contemporary AI research. A key objective for such rule sets is to achieve both high discriminative power and interpretability. While existing state-of-the-art algorithms implicitly prioritize predictive accuracy, they often fall short on one or more quality metrics that ensure interpretability, such as coverage and parsimony of rule sets. Motivated by this, this paper proposes the development of CDPR, which aims to create highly accurate and interpretable rule sets for classification problems. To the best of our knowledge, this represents the first attempt to establish such an approach. In this study, we introduce two algorithms rooted in submodular maximization, which not only provide provable guarantees on coverage but also yield rule sets that are both discriminative and parsimonious. We empirically demonstrate that rule sets learned through our approaches achieve higher accuracy and interpretability and have more than a 2.5-fold improvement in average coverage rates when compared to the next best algorithm.
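To give a feel for the submodular-maximization idea the abstract mentions, here is a minimal sketch of the classic greedy heuristic for maximum coverage, applied to a toy rule set. It is not the CDPR algorithm, whose details the abstract does not give; the `greedy_cover` helper, the rule names, and the data are all hypothetical.

```python
from typing import Dict, List, Set


def greedy_cover(rules: Dict[str, Set[int]], budget: int) -> List[str]:
    """Pick up to `budget` rules, each time taking the rule that covers
    the most still-uncovered examples (classic greedy max coverage)."""
    covered: Set[int] = set()
    chosen: List[str] = []
    for _ in range(budget):
        # Marginal gain of a rule = number of newly covered examples.
        best = max(
            (name for name in rules if name not in chosen),
            key=lambda name: len(rules[name] - covered),
            default=None,
        )
        if best is None or not rules[best] - covered:
            break  # no remaining rule adds coverage
        chosen.append(best)
        covered |= rules[best]
    return chosen


# Toy data (hypothetical): rule name -> indices of examples the rule fires on.
toy_rules = {
    "r1": {0, 1, 2, 3},
    "r2": {3, 4},
    "r3": {4, 5, 6},
    "r4": {0, 1},
}
selected = greedy_cover(toy_rules, budget=2)
```

Coverage is a monotone submodular set function, which is why a greedy choice with a small budget tends to give compact rule sets with good coverage.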

03.
arXiv (CS.CV) 2026-06-18

AMALIA-VL: A Native European Portuguese Open-Source Vision and Language Model

Large Vision and Language Models (LVLMs) have advanced rapidly, yet European Portuguese (pt-PT) remains systematically underserved by existing open-source multimodal models, which either conflate it with Brazilian Portuguese or severely under-represent it in their training data mixes. We introduce AMALIA-VL, the first open-source instruction-tuned LVLM built natively for pt-PT, pairing a high-resolution vision encoder with dynamic image tiling and a fully open pt-PT-optimized language model via a learned connector. We contribute a purposefully designed three-stage training process - vision-language alignment, general visual instruction tuning, and preference optimization - together with a pt-PT-centric multimodal data mix combining curated and translated public datasets with novel datasets that address the near-total absence of European Portuguese multimodal resources. Our evaluation shows that AMALIA-VL establishes a strong baseline for open-source pt-PT LVLMs. We will release model weights, training data, and construction pipelines along with machine-translated pt-PT evaluation benchmarks to help democratize pt-PT LVLM development.

04.
arXiv (CS.LG) 2026-06-25

Knowledge Cascade: Reverse Knowledge Distillation on Nonparametric Multivariate Functional Estimation

arXiv:2606.25927v1 Announce Type: cross Abstract: As machine learning models and datasets continue to grow, developing complex models has become increasingly computationally demanding. Knowledge distillation reduces deployment cost by compressing a large, well-trained teacher model into a compact student model, but it does not address settings where constructing the teacher itself is the bottleneck. Motivated by this challenge, we introduce Knowledge Cascade (KCas), a reverse knowledge distillation framework that uses information from a small, inexpensive student model to guide the development of a more complex teacher model. Although this direction is counterintuitive because the teacher typically has greater representational capacity, we show that student-to-teacher transfer can be principled when supported by statistical scaling relationships. We first develop KCas for nonparametric multivariate functional estimation in reproducing kernel Hilbert spaces via smoothing splines, where selecting multiple smoothing parameters is a major computational bottleneck. KCas transfers student-selected smoothing parameters to the full-sample regime through asymptotic scaling laws, substantially reducing computational cost for high-dimensional and large-scale datasets while retaining theoretical guarantees. Beyond smoothing splines, we illustrate the same principle through kernel density estimation and deep learning hyperparameter transfer. Simulations and real-data experiments show that KCas achieves substantial computational savings while maintaining strong statistical performance, and can sometimes outperform the corresponding full-sample procedure.

05.
arXiv (CS.CV) 2026-06-25

Entropy-Based Observability for AI Agent Behavior

AI agents are typically instrumented through outcome-oriented indicators such as task success, reward, latency, and cost. Although these indicators are operationally important, they provide limited visibility into the internal structure of agent behavior, such as the degree of exploration, the rigidity or diversity of action selection, the concentration of tool use, the reduction of uncertainty across a run, and the stability of behavior across repeated executions. This paper proposes Entropy-Based Observability for AI Agents (EOA), a lightweight framework for deriving behavioral telemetry from agent traces.
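One way to picture this kind of behavioral telemetry is the Shannon entropy of the tools an agent calls during a run. The sketch below is illustrative only: the excerpt does not define EOA's actual metrics, and the trace, the `shannon_entropy` helper, and the `concentration` score are assumptions made for this example.

```python
import math
from collections import Counter
from typing import Iterable


def shannon_entropy(events: Iterable[str]) -> float:
    """Shannon entropy (in bits) of the empirical distribution of events."""
    counts = Counter(events)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical trace: the tool invoked at each step of one agent run.
trace = ["search", "search", "read", "search", "write", "read", "search", "search"]
tool_entropy = shannon_entropy(trace)

# Normalise by the maximum entropy for the observed tool set, so that
# 0 means evenly spread tool use and 1 means a single tool dominates fully.
n_tools = len(set(trace))
concentration = 1.0 - tool_entropy / math.log2(n_tools) if n_tools > 1 else 1.0
```

Low entropy points to rigid, concentrated tool use, and high entropy points to more exploratory behavior.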

06.
arXiv (CS.LG) 2026-06-16

CREST: Deployment-Realistic Hardware-in-the-Loop NAS for Embedded Sensing Systems

arXiv:2606.15004v1 Announce Type: cross Abstract: Deploying neural networks on low-power microcontrollers (MCUs) requires selecting model architectures under tight memory, latency, and energy constraints. Existing workflows often simplify this process along one or more axes: static proxy costs such as FLOPs or parameters, treating one MCU as representative, and continuous-inference tests instead of deployed sensing schedules. These assumptions can mis-rank Pareto-front candidates, miss infeasible deployments, and obscure schedule-dependent energy. We present CREST (Cross-platform Runtime Evaluation and Search Tool), a deployment-realistic hardware-in-the-loop (HIL) neural architecture search (NAS) framework for MCU sensing systems. CREST keeps the optimizer, HIL measurement boundary, logging, and replay workflow fixed while exposing workload, model family, target backend, schedule, quantization, and scoring policy as configurable axes. This makes deployment effects experimentally separable within one reusable workflow. We evaluate CREST on inertial odometry and audio classification across three Arm Cortex-M targets. For inertial odometry, measured-energy HIL search reduces median per-inference energy by 41.7% versus FLOPs-based selection and 40.8% versus memory-traffic-based selection at similar error. FLOPs-based selection also chooses infeasible deployments on memory-constrained targets. On the STM32 N657 target, continuous-inference and duty-cycled searches produce different Pareto frontiers. For audio classification, the same application-level policy selects different DS-CNN architectures on different boards, and cross-board replay changes deployment cost substantially. Overall, CREST shows that deployment-realistic MCU NAS must jointly optimize model architecture, target platform, runtime schedule, and deployment policy rather than relying only on static proxy costs or continuous-inference measurements.
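The Pareto-front comparison the abstract refers to can be sketched as a simple dominance filter over measured candidates. This is a generic illustration, not CREST's scoring policy; the `Candidate` fields, the model names, and all numbers are hypothetical.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str
    error: float      # task error, lower is better
    energy_uj: float  # measured energy per inference, lower is better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep candidates that no other candidate dominates, i.e. no other is
    at least as good on both objectives and strictly better on one."""
    front = []
    for c in cands:
        dominated = any(
            o.error <= c.error
            and o.energy_uj <= c.energy_uj
            and (o.error < c.error or o.energy_uj < c.energy_uj)
            for o in cands
        )
        if not dominated:
            front.append(c)
    return front


# Hypothetical measurements for five architectures on one board.
measured = [
    Candidate("A", 0.10, 50.0),
    Candidate("B", 0.12, 30.0),
    Candidate("C", 0.12, 45.0),
    Candidate("D", 0.20, 29.0),
    Candidate("E", 0.11, 60.0),
]
front_names = [c.name for c in pareto_front(measured)]
```

Swapping the energy column for a proxy such as FLOPs, or for measurements from a different board or schedule, can change which candidates survive the filter.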

07.
arXiv (CS.CV) 2026-06-18

Fuzzy-Geometric Branch-Point Modeling for Structure-Aware Augmentation of Handwritten Chinese Characters

Data scarcity and structural distortion significantly limit handwriting recognition in high-security authentication. Existing augmentation methods often cause topological and morphological damage, particularly when processing complex Chinese characters where stroke intersections, ligatures, and sharp turns render traditional branch-point detection unreliable. To address this, this paper proposes a fuzzy geometry-driven structure-aware (FGSA) augmentation framework. We model branch points as fuzzy sets within the skeleton space, constructing a continuous branch-point membership field by integrating topological neighborhood evidence with direction field divergence. This membership field is adaptively optimized via an unsupervised surrogate objective, enabling robust stroke decoupling without manual annotation. Finally, kinematically-aligned samples are synthesized through parameterized cubic Bézier reconstruction and multi-strategy perturbations, ensuring a balance between structural fidelity and sample diversity. Moreover, we establish LZUSig, a large-scale, highly challenging dataset specifically dedicated to fine-grained structural degradation in Chinese handwritten signatures. Extensive experiments on CASIA-HWDB1.1, ChiSig, and LZUSig demonstrate that FGSA significantly reduces the word-level error rate ($\Delta$WER), achieving optimal recognition gains over the compared baselines. More importantly, it strikes a robust trade-off among task gain, structural fidelity, and discriminative feature preservation, offering a highly controllable solution for handwriting augmentation.

08.
arXiv (math.PR) 2026-06-24

Scheduling jobs with unknown size distribution in a M/G/1 queue: the shifted empirical Gittins

arXiv:2606.24703v1 Announce Type: new Abstract: In this paper we consider an M/G/1 queue for which we want to minimize the expected response time. We show how to compute indices from $n$ samples of the job size distribution such that the corresponding index policy is asymptotically optimal as $n$ grows. This construction is based on a discretization of the bounded support of the job size distribution and a shift of the samples to their nearest discrete point to the right. We show that the Gittins index of the empirical distribution of these shifted samples is close to the Gittins index of the original distribution. This translates to the asymptotic optimality of the corresponding index policy for minimizing the expected response time. Numerical comparisons with other approaches further confirm the efficiency of our approach.
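The construction in the abstract can be sketched in two steps: shift each sample to the grid point on its right, then compute the Gittins index of the resulting empirical distribution. The code below uses the standard M/G/1 Gittins index definition. The uniform grid, the `step` parameter, and the sample values are assumptions for illustration, and the paper's exact discretization may differ.

```python
import math
from typing import List, Sequence


def shift_right(samples: Sequence[float], step: float) -> List[float]:
    """Move each sample to the nearest grid point k*step at or to its right."""
    return [math.ceil(x / step) * step for x in samples]


def empirical_gittins(samples: Sequence[float], age: float) -> float:
    """Gittins index at attained service `age` for the empirical
    distribution of `samples`:
        max over b > age of  P(age < S <= b) / E[min(S, b) - age ; S > age].
    For a discrete distribution the maximum is attained at an atom b."""
    remaining = [s for s in samples if s > age]
    if not remaining:
        return 0.0  # no sample exceeds this age; return 0 by convention here
    best = 0.0
    for b in sorted(set(remaining)):
        done = sum(1 for s in remaining if s <= b)      # completions by b
        work = sum(min(s, b) - age for s in remaining)  # service spent by b
        best = max(best, done / work)
    return best


# Hypothetical job-size samples on a bounded support, grid step 1.0.
raw = [0.4, 0.9, 1.0, 3.2, 3.7, 4.0]
shifted = shift_right(raw, step=1.0)
index_at_zero = empirical_gittins(shifted, age=0.0)
```

An index policy would then serve, at each instant, the job whose current age has the highest index.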

09.
arXiv (CS.AI) 2026-06-17

CausalT5k: Diagnosing Refusal and Failure Modes in Trustworthy Causal Reasoning Across Causal Rungs

arXiv:2602.08939v2 Announce Type: replace Abstract: Large language models increasingly produce fluent causal explanations, yet they often fail in ways aggregate accuracy cannot diagnose: confusing association with intervention, abandoning correct judgments under pressure, over-refusing valid claims, or answering when evidence is underdetermined. We introduce CTK, a diagnostic benchmark of 5,147 cases and growing, across 10 domains and all three levels of Pearl's Ladder of Causation. Unlike benchmarks that only score correctness, CTK reveals why a model failed by annotating causal rung, trap type, pressure sensitivity, refusal quality, and Utility-Safety tradeoffs. Its Sheep/Wolf taxonomy separates valid causal designs from inferential traps; paired neutral/pressure variants measure sycophantic drift through Bad Flip Rate; and Wise Refusal fields test whether a model identifies the missing information needed before endorsing a claim. CTK exposes failure modes hidden by aggregate accuracy: the Skepticism Trap, Rung Collapse under scaling, pressure-induced drift, Detection-Correction gaps, and counterfactual error modes. Rather than prescribing a correction method, it provides the diagnostic substrate for studying causal-reasoning failure profiles.

10.
arXiv (CS.CL) 2026-06-24

Are LLM Evaluators Really Narcissists? Sanity Checking Self-Preference Evaluations

Recent research has shown that large language models (LLMs) favor their own outputs when acting as judges, undermining the integrity of automated post-training and evaluation workflows. However, it is difficult to disentangle which behaviors are explained by narcissism versus experimental confounds. Specifically, LLM evaluators may deliver self-preferring verdicts when comparing responses to questions they fail on; these verdicts may not depend on the identity of the author, but on evaluator quality. We correct this by directly comparing the judge's voting distribution in cases where it evaluates itself versus another model. This evaluator quality baseline reveals that only 51% of examples in previous findings retain statistical significance against this null hypothesis, covering 89.6% of total self-preference probability mass. Finally, we compare the entropy of voting distributions, suggesting uncertainty-driven overlap, and show that our procedure enables more careful documentation against the backdrop of judge-bias research.

11.
arXiv (quant-ph) 2026-06-25

Note About Koopman-von Neumann Theory and Density Matrix

arXiv:2606.25085v1 Announce Type: new Abstract: In this short note we study Koopman-von Neumann theory for an N-particle system. We argue that it is natural to identify the classical N-particle distribution function as the diagonal form of the density matrix operator in the coordinate representation. We also determine the generalized BBGKY hierarchy for the reduced density matrix in the coordinate representation.

12.
Nature Medicine 2026-06-25

HPV vaccination linked to dramatic reduction in cervical cancer deaths

A new study examines the impact of the national HPV vaccination program in England, initiated almost two decades ago, showing huge declines in cervical cancer mortality among young women.

13.
arXiv (quant-ph) 2026-06-25

Analytic Approach to Quantum Control Using Quantum Signal Processing

arXiv:2606.26085v1 Announce Type: new Abstract: Realizing coherent quantum computation requires precise and robust manipulation of quantum systems through quantum control protocols. Most quantum control techniques rely on heuristic methods for designing the driving pulses that steer the system towards a target state. Such methods are often based on brute-force optimization and offer limited understanding of the solution landscape. In contrast, quantum algorithms offer a rich body of analytical methods with rigorous error guarantees for implementing unitary and non-unitary transformations, which suggests a promising direction for developing new approaches to quantum control. Among various such algorithms, quantum signal processing (QSP) has emerged as a powerful framework for quantum algorithm design, implementation, and optimization. However, its potential for quantum control remains largely unexplored. In this work, we establish QSP-Control, an analytical framework for quantum control of qubit-oscillator dynamics. We focus on dispersively coupled qubit-oscillator systems and employ the QSP formalism to mitigate unwanted nonlinear effects arising from cross-Kerr interactions. In addition, we develop constructions for precise manipulation of Fock states by designing Fock-state-selective operators, based on structural parallels between the Jaynes-Cummings interaction and QSP. These findings demonstrate how several practically relevant problems in quantum control can be mapped to forms amenable to QSP, offering both a systematic design framework and an interpretable perspective on quantum control.

14.
arXiv (CS.CV) 2026-06-25

HaineiFRDM: Structure-Preserving Diffusion for Film Restoration under Fast Motion and Diverse Defects

Existing film-restoration methods frequently fail under fast motion, producing limb disappearance and structural distortion due to inaccurate motion modeling. Moreover, high-resolution restoration under spatially-persistent and mixed defects remains insufficiently studied. We propose HaineiFRDM, a Film Restoration Diffusion Model that leverages the content modeling capability of diffusion models for content-aware restoration, removing defects while preserving scene structure. To enable scalable high-resolution restoration, we adopt a patch-wise strategy with position-aware global fusion modules to maintain cross-patch coherence. We further introduce a frequency-based module to enhance texture consistency and a patch-consistent inference framework to alleviate blocking artifacts introduced by patch-based processing. We also construct a film restoration dataset comprising categorized defect templates, professionally restored films, and realistic synthetic degradations. Extensive experiments demonstrate our superior restoration quality with strong structural consistency. Our design also reduces memory requirements, enabling high-resolution restoration on a single 24GB-VRAM GPU. Code and the dataset will be released at https://anonymous.4open.science/r/HaineiFRDM.

15.
arXiv (math.PR) 2026-06-16

A non-asymptotic bound on the TV distance between a Wishart matrix and an appropriately scaled GOE matrix

arXiv:2606.16018v1 Announce Type: new Abstract: In this note, we prove a non-asymptotic version of a theorem by Bubeck, Ding, Eldan, and Rácz, showing that a Wishart matrix is close in total variation to an affine transformation of a GOE matrix. The proof mirrors the proof given by Bubeck et al., with some changes made to make it non-asymptotic.

16.
arXiv (quant-ph) 2026-06-24

Dynamical low-rank methods for the Wigner equation I: separable difference potential

arXiv:2606.24190v1 Announce Type: cross Abstract: Recent advances in dynamical low-rank approximation (DLRA) have demonstrated its effectiveness in high-dimensional simulations. However, existing DLRA algorithms still face significant challenges when handling systems that involve complex collision terms, including the pseudo-differential operator ($\Psi$) in the Wigner equation, a representative operator characterized by nonlocality. Developing DLRA algorithms for the Wigner equation therefore merits a series of works. As the first step in this series, we propose an efficient DLRA algorithm for the Wigner equation, using a separable decomposition of the difference potential. We combine this separable assumption with two often-used truncations of $\Psi$, namely $\mathcal{K}$-truncation and $\mathcal{Y}$-truncation, to obtain a separated representation of $\Psi$. Complexity analysis and several challenging experiments, including harmonic oscillators, Gaussian barrier scattering, electron-electron scattering, and a Helium-like system, all of which satisfy the separable assumption, confirm that the proposed DLRA algorithm has significant advantages, reducing both runtime and memory requirements by one to two orders of magnitude compared to the full-grid approach. It is worth noting that, even in the absence of a predetermined low-rank structure for the solution, DLRA can still serve as a numerical scheme that balances efficiency and accuracy.

17.
arXiv (CS.CL) 2026-06-25

Fault of Our Stars: Behavioral Drivers of Rating-Sentiment Incongruence

When people share experiences online, they often express thoughts in two ways: a star rating and a written review. In sentiment analysis, ratings are widely used as convenient weak labels for textual sentiment, yet whether the two actually agree is rarely questioned. This study investigates sentiment-rating incongruence, where the sentiment expressed in review text differs from the sentiment implied by the assigned star rating, in Sri Lankan tourism attraction reviews. A dataset of 16,156 reviews from 2010 to 2023 is analyzed using a transformer-based sentiment pipeline that derives textual sentiment independently of assigned ratings. Incongruence occurs in 18.6% of reviews and falls into six directional patterns, with Conservative Rater and Obligatory 5-Star behaviors accounting for the majority of mismatches. Prevalence also varies across venue types, with museums showing the highest rates. Statistical tests, logistic regression, Random Forest, and SHAP analysis identify venue type, reviewer expertise, review length, and temporal factors as contributors to rating-text divergence. Overall, this study demonstrates that star ratings are not interchangeable with textual sentiment and should be validated before being treated as ground-truth labels in NLP.

18.
arXiv (CS.AI) 2026-06-25

The Unfireable Safety Kernel: Execution-Time AI Alignment for AI Agents and Other Escapable AI Systems

arXiv:2606.26057v1 Announce Type: new Abstract: AI agents are granted access to tools, APIs, and other infrastructure, making them active principals in those systems. The dominant approach places controls inside the agent's own runtime: system prompts, output filters, and guardrail libraries. Any control in the agent's address space is reachable by inputs that influence it; this generalizes to any AI system with sufficient reach into its own runtime, a class we term escapable AI systems. We identify four properties that an authorization mechanism must satisfy for architectural control rather than for cooperative requests: process separation, pre-action enforcement on a structurally only path, fail-closed at both the request and system levels, and externalized signed evidence verifiable outside the controlled system's trust boundary. We position this layer as execution-time AI alignment, complementing training-time alignment (RLHF, Constitutional AI) and inference-time alignment. We present the Unfireable Safety Kernel, a Rust reference implementation realizing all four. Its fail-closed invariant is machine-checked at two levels: an SMT theorem (Z3) and an exhaustive bounded-model-checking proof of the production decision function (Kani, 4/4 harnesses). A Python-to-Rust migration was gated on byte-equivalence (1000/1000 fixtures; 17/17 adversarial classes). We evaluate the kernel governing a live, escapable AI system, a deterministic, self-improving world model, against an escape-seeking adversary driving its real self-modification seam: across 1,000 self-modifications, all 704 attempts on the safety-critical core are refused, with no escape; a further 300, under the operator kill switch, are also refused. A separate campaign of 6,240 authorization round-trips had no successful bypass. Against 3 contemporary systems claiming the agent control plane, the agent invokes control; here, it lacks that choice.

19.
arXiv (CS.AI) 2026-06-24

FlowPipe: LLM-Enhanced Conditional Generative Flow Networks for Data Preparation Pipeline Construction

arXiv:2606.24679v1 Announce Type: cross Abstract: Data preparation pipelines improve data quality in machine learning by transforming raw tables into learning-ready data through sequential cleaning and feature transformation operators. However, automatically constructing such pipelines is computationally difficult because operator sequences are combinatorial and end-to-end evaluation is expensive. Existing state-of-the-art (SOTA) Multi-DQN methods still face three key limitations: decoupled value estimators weaken long-horizon credit assignment, dataset context is only weakly injected into the policy, and exploration is inefficient in a sparse search space with many invalid states. To address these issues, we propose FlowPipe, a unified framework that formulates pipeline synthesis as conditional probabilistic flow generation over a directed acyclic graph. FlowPipe uses Conditional Generative Flow Networks (C-GFlowNets) with a Trajectory Balance objective to connect terminal validation rewards with early pipeline decisions. It further introduces Deep Semantic Modulation through Feature-wise Linear Modulation (FiLM), allowing LLM-derived logical priors to condition the policy's internal activations according to dataset semantics. In addition, FlowPipe incorporates failure awareness into the flow objective to avoid invalid states and concentrate search on high-potential regions. Experiments on two benchmark suites with 74 real-world datasets show that FlowPipe outperforms SOTA baselines, improving accuracy by 11.96% on average and achieving 12.5x faster training convergence. Source code is available at https://github.com/KunyuNi/FlowPipe.
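The Trajectory Balance objective named in the abstract has a compact standard form: for one complete trajectory, it penalises the squared log-ratio between the forward flow and the reward-weighted backward flow. The sketch below computes that loss for plain floats. It does not reproduce FlowPipe's conditioning on dataset context, and the function name and example numbers are hypothetical.

```python
import math
from typing import Sequence


def trajectory_balance_loss(
    log_z: float,
    log_pf: Sequence[float],
    log_pb: Sequence[float],
    reward: float,
) -> float:
    """Squared log-ratio between the forward flow Z * prod(P_F) and the
    backward flow R(x) * prod(P_B) along one complete trajectory.
    `reward` must be strictly positive."""
    forward = log_z + sum(log_pf)
    backward = math.log(reward) + sum(log_pb)
    return (forward - backward) ** 2


# Hypothetical two-step pipeline: Z = 2, each forward step has probability 0.5,
# backward steps are deterministic, terminal reward is 0.5.
balanced = trajectory_balance_loss(
    log_z=math.log(2.0),
    log_pf=[math.log(0.5), math.log(0.5)],
    log_pb=[0.0, 0.0],
    reward=0.5,
)
# Same trajectory, but the terminal reward is higher than the flow predicts.
unbalanced = trajectory_balance_loss(
    log_z=math.log(2.0),
    log_pf=[math.log(0.5), math.log(0.5)],
    log_pb=[0.0, 0.0],
    reward=1.0,
)
```

Because the terminal reward enters the same residual as every forward step, a single loss term ties early operator choices to the final validation outcome.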

20.
arXiv (CS.AI) 2026-06-25

TopoCast: A Topological Fidelity Framework for Evaluating Transformer-Based Time Series Forecasting

arXiv:2606.25439v1 Announce Type: cross Abstract: Deep learning-based models have achieved state-of-the-art performance in Time Series Forecasting (TSF), yet their evaluation remains dominated by pointwise error metrics such as Mean Squared Error (MSE), which quantify numerical accuracy but overlook structural properties of the forecast signal, including recurrent dynamics, oscillatory behavior, and phase alignment. As a result, forecasts exhibiting over-smoothing, phase shifts, or frequency distortions may achieve favorable error scores despite substantial structural degradation. To address this limitation, we propose TopoCast, a topology-driven framework for evaluating structural fidelity in TSF. TopoCast reconstructs phase-space representations of forecast and ground-truth sequences using Takens delay embedding and applies persistent homology to characterize their intrinsic dynamics. We derive four complementary topological fidelity measures from persistence diagrams and aggregate them into a Topological Fidelity Score (TFS). We further introduce dominant cycle overlap, a novel metric that maps persistent topological features to the temporal domain to assess whether dominant oscillatory patterns occur at the correct time points. Combined with TFS, this yields the Localized Topological Fidelity Score (LTFS), a phase-aware measure that captures temporal localization errors invisible to existing evaluation metrics. Experiments on five Transformer architectures across three real-world benchmark datasets demonstrate that models with similar forecasting errors can exhibit markedly different structural fidelity profiles, revealing failure modes overlooked by conventional evaluation and highlighting the value of topology-aware forecast assessment.
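The phase-space reconstruction step can be illustrated with a minimal Takens delay embedding. The sketch covers only the embedding; persistent homology would be computed on the resulting point cloud with a topological data analysis library, which is not shown. The `delay_embed` helper and its parameter names are hypothetical.

```python
from typing import List, Sequence, Tuple


def delay_embed(
    series: Sequence[float], dim: int, tau: int
) -> List[Tuple[float, ...]]:
    """Takens delay embedding: map x_t to
    (x_t, x_{t+tau}, ..., x_{t+(dim-1)*tau})."""
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    n_points = len(series) - (dim - 1) * tau
    return [
        tuple(series[t + k * tau] for k in range(dim))
        for t in range(max(n_points, 0))
    ]


# A short toy series embedded in 3 dimensions with delay 2.
points = delay_embed([0, 1, 2, 3, 4, 5], dim=3, tau=2)
```

Applying the same embedding to a forecast and to its ground truth gives two point clouds whose topological summaries can then be compared.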

21.
arXiv (quant-ph) 2026-06-12

Explicit Quantum Circuit Simulation of Nonlinear 1-Dimensional Fluid with Carleman-linearized Boltzmann Method

arXiv:2606.12770v1 Announce Type: new Abstract: Quantum computation of fluid dynamics has attracted growing attention as a key application of fault-tolerant quantum computers anticipated in the coming decade, with lattice Boltzmann methods emerging as a particularly promising approach. Explicit and efficient elementary-gate-level circuit simulations, however, have so far been demonstrated only in the linear case. Here we include the leading nonlinearity through second-order Carleman linearization of the one-dimensional Boltzmann equation, and demonstrate, via explicit quantum-circuit simulation, the preparation of the final-time state using a Taylor-expansion-based ODE solver based on the quantum singular value transformation. With this construction, we analyze the gate and qubit complexities, which scale logarithmically with the grid size, the nonlinearity captured by the higher-order Carleman linearization, and the practical utility of higher-order expansions in the Taylor ODE solver. The construction provides a concrete baseline for computational cost reduction and further developments such as extensions to higher dimensions, complex geometries, and the extraction of physical quantities, towards industrially useful quantum CFD.

22.
arXiv (CS.AI) 2026-06-12

Decoding the Multimodal Maze: A Systematic Review on the Adoption of Explainability in Multimodal Attention-based Models

arXiv:2508.04427v2 Announce Type: replace-cross Abstract: Multimodal learning has witnessed remarkable advancements in recent years, particularly with the integration of attention-based models, leading to significant performance gains across a variety of tasks. Parallel to this progress, the demand for explainable artificial intelligence (XAI) has spurred a growing body of research aimed at interpreting the complex decision-making processes of these models. This systematic literature review analyzes research published between January 2020 and early 2024 that focuses on the explainability of multimodal models. Framed within the broader goals of XAI, we examine the literature across multiple dimensions, including model architecture, modalities involved, explanation algorithms and evaluation methodologies. Our analysis reveals that most studies are concentrated on vision-language and language-only models, with attention-based techniques being the most commonly employed for explanation. However, these methods often fall short in capturing the full spectrum of interactions between modalities, a challenge further compounded by the architectural heterogeneity across domains. Importantly, we find that evaluation methods for XAI in multimodal settings are largely non-systematic, lacking consistency, robustness, and consideration for modality-specific cognitive and contextual factors. To address these gaps, we not only synthesize findings from the surveyed works but also incorporate a complementary analysis that integrates recent and emerging advances driving multimodal explainability. Based on these insights, we provide a comprehensive set of recommendations aimed at promoting rigorous, transparent, and standardized evaluation and reporting practices in multimodal XAI research. Our goal is to support future research in more interpretable, accountable, and responsible multimodal AI systems, with explainability at their core.

23.
arXiv (CS.CV) 2026-06-17

Adversarial Attacks Leverage Interference Between Features in Superposition

Why do adversarial examples exist, and why do they transfer between models? Existing explanations appeal to high-dimensional geometry, non-robust patterns in the input, and decision boundary structure, but none provides a representation-level mechanism that explains why specific perturbations succeed and why attacks transfer between models. In this paper, we show that adversarial vulnerability can stem from efficient information encoding in neural networks. Specifically, vulnerability can arise from superposition - the phenomenon where networks represent more concepts than they have dimensions, forcing non-orthogonal representation and thus interference. This interference causes perturbations targeting one representation to affect others, creating vulnerabilities determined by interference patterns. In synthetic settings with precisely controlled superposition, we establish that superposition suffices to create adversarial vulnerability. The resulting attacks are predictable: PGD-discovered perturbations align with theoretically optimal perturbations derived from the interference geometry. Models trained on similar data develop similar interference patterns, explaining attack transferability. We then show that successful attacks on image classifiers exhibit the structure predicted by our proposed mechanism. These findings reveal that adversarial vulnerability can be a byproduct of networks' representational compression, complementing existing explanations based on data properties or architectural factors.

24.
arXiv (CS.AI) 2026-06-24

MyoInteract: A Framework for Fast Prototyping of Biomechanical HCI Tasks using Reinforcement Learning

arXiv:2602.15245v2 Announce Type: replace-cross Abstract: Reinforcement learning (RL)-based biomechanical simulations have the potential to revolutionise HCI research and interaction design, but currently lack usability and interpretability. Using the Human Action Cycle as a design lens, we identify key limitations of biomechanical RL frameworks and develop MyoInteract, a novel framework for fast prototyping of biomechanical HCI tasks. MyoInteract allows designers to set up tasks, user models, and training parameters from an easy-to-use GUI within minutes. It trains and evaluates muscle-actuated simulated users within minutes, reducing training times by up to 98%. A workshop study with 12 interaction designers revealed that MyoInteract allowed novices in biomechanical RL to successfully set up, train, and assess goal-directed user movements within a single session. By transforming biomechanical RL from a days-long expert task into an accessible hour-long workflow, this work significantly lowers barriers to entry and accelerates iteration cycles in HCI biomechanics research.

25.
arXiv (quant-ph) 2026-06-16

A short proof of the modified Kretschmann-Schlingemann-Werner conjecture

arXiv:2606.16418v1 Announce Type: new Abstract: Let $\Phi_1, \Phi_2 : \mathbb{M}_d(\mathbb{C})\to \mathbb{M}_n(\mathbb{C})$ be two quantum channels with respective Stinespring isometries $V_1, V_2 : \mathbb{C}^{d}\to \mathbb{C}^{n} \otimes \mathbb{C}^{m}$ on any common dilation space $\mathbb{C}^{m}$. We prove that there exists a unitary $U$ on $\mathbb{C}^{m}$ such that $\|V_1-({\bf1}\otimes U)V_2\|_\infty\leq\sqrt{2\|\Phi_1-\Phi_2\|_\diamond},$ thus resolving vom Ende's modification of the Kretschmann-Schlingemann-Werner conjecture in the affirmative.