Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
arXiv (CS.AI) 2026-06-11

Engineering Robustness into Personal Agents with the AI Workflow Store

arXiv:2605.10907v3 Announce Type: replace-cross Abstract: The dominant paradigm for AI agents is an "on-the-fly" loop in which agents synthesize plans and execute actions within seconds or minutes in response to user prompts. We argue that this paradigm short-circuits disciplined software engineering (SE) processes – iterative design, rigorous testing, adversarial evaluation, staged deployment, and more – that have delivered the (relatively) reliable and secure systems we use today. By focusing on rapid, real-time synthesis, are AI agents effectively delivering users improvised prototypes rather than systems fit for high-stakes scenarios in which users may unwittingly apply them? This paper argues for the need to integrate rigorous SE processes into the agentic loop to produce production-grade, hardened, and deterministically-constrained agent *workflows* that substantially outperform the potentially brittle and vulnerable results of on-the-fly synthesis. Doing so may require extra compute and time, and if so, we must amortize the cost of rigor through reuse across a broad user community. We envision an *AI Workflow Store* that consists of hardened and reusable workflows that agents can invoke with far greater reliability and security than improvised tool chains. We outline the research challenges of this vision, which stem from a broader flexibility-robustness tension that we argue requires moving beyond the ``on-the-fly'' paradigm to navigate effectively.

02.
arXiv (CS.LG) 2026-06-16

Representation Costs in Data Science: Foundations and the Quasi-Banach Spaces of Deep Neural Networks

arXiv:2606.14954v1 Announce Type: cross Abstract: We develop a general framework for analyzing representation costs of parametric data-fitting methods through their parameter-space regularizers. From this abstract perspective, we define representation costs for arbitrary parametric models and reveal their induced (native) function spaces. This unifies recent function-space views of data-fitting methods. We also prove that many natural results hold in this abstract setting, including representer theorems for parametric methods on their native spaces. The framework also rigorously connects parametric methods with their equivalent nonparametric descriptions under sufficient overparameterization. Classical methods and their native spaces, such as kernel methods / reproducing kernel Hilbert spaces, wavelets / Besov spaces, and shallow neural networks / variation spaces emerge as special cases of our abstract framework. A byproduct of "axiomatizing" the study of representation costs is that we also immediately obtain new results for deep neural networks: For depth-$L$ feedforward ReLU networks, their induced native spaces are $p$-normable quasi-Banach spaces with $p = 2/L$. This reveals that the inductive bias of deep neural networks (as given by the representation cost) cannot be captured by norms for depths $L > 2$.

03.
arXiv (CS.CL) 2026-06-11

FlowBank: Query-Adaptive Agentic Workflows Optimization through Precompute-and-Reuse

Large Language Model (LLM)-based multi-agent systems are increasingly powerful, but current agentic workflow optimization paradigms make an unsatisfying trade-off. Task-level methods spend substantial offline compute yet deploy only a single workflow, leaving complementary candidates unused, while query-level methods synthesize a new workflow per query at substantial inference cost. Our motivating analysis shows these paradigms are more complementary than competing: workflows discovered during offline search often solve different subsets of queries, and many queries handled by expensive query-level generation can already be solved by cheaper precomputed workflows. This suggests a different objective: rather than searching for one universally best workflow or regenerating one per instance, we should build a compact bank of reusable, complementary workflows and select among them adaptively at inference time. Doing so requires solving three coupled problems: generating complementary rather than redundant candidates, compressing them into a small deployable portfolio, and assigning each query to the right workflow under a performance-cost trade-off. To this end, we present FlowBank, a three-stage framework for portfolio-based agentic workflow optimization. Diversifying proposes DiverseFlow to steer search toward under-covered queries and produce a high-coverage candidate pool. Curating proposes CuraFlow to compress this pool into a compact portfolio with minimal redundancy. Matching casts deployment as edge-value prediction on a query-workflow bipartite graph and routes each incoming query to the portfolio member with the best predicted utility. Across five benchmarks, FlowBank achieves the highest average score among the evaluated methods while remaining cost-competitive, improving over the strongest automated and handcrafted baselines by 4.26% and 14.92% relative, respectively.

04.
arXiv (CS.AI) 2026-06-17

Mental Health AI Safety Claims Must Preserve Temporal Evidence

arXiv:2605.08827v2 Announce Type: replace Abstract: The safety of mental health AI is often judged at the wrong temporal scale. Current evaluations typically score isolated responses, endpoint outcomes, or aggregate dialogue quality, while clinically consequential failures may arise from the order and accumulation of interactions themselves, including delayed escalation, repeated reinforcement, dependency formation, failed repair, and gradual deterioration across turns. This paper argues that this mismatch is not merely a limitation of evaluation coverage but a source of invalid safety conclusions. We introduce Temporal Safety Non-Identifiability, a formal account of why safety properties that depend on sequence, timing, accumulation, or recovery cannot be certified by protocols that discard those features. From this formalization, we develop SCOPE (Safety Claims Over Preserved Evidence) as a general principle for aligning safety claims with the evidence an evaluation actually retains, and instantiate it as SCOPE-MH, a mental-health instantiation of this reporting standard. We operationalize SCOPE-MH through a proof-of-concept on the AnnoMI dataset of expert-annotated motivational interviewing conversations, which reveals mechanisms of failure that per-turn behavior scoring does not represent. We propose SCOPE-MH as a diagnostic complement to existing evaluation infrastructure and argue that evaluation preserving temporal evidence is necessary, not optional, for safety-critical mental health AI deployment.

05.
arXiv (CS.CV) 2026-06-16

When RAG Hurts: Diagnosing and Mitigating Attention Distraction in Retrieval-Augmented LVLMs

While Retrieval-Augmented Generation (RAG) is one of the dominant paradigms for enhancing Large Vision-Language Models (LVLMs) on knowledge-based VQA tasks, recent work attributes RAG failures to insufficient attention towards the retrieved context, proposing to reduce the attention allocated to image tokens. In this work, we identify a distinct failure mode that previous study overlooked: Attention Distraction (AD). When the retrieved context is sufficient (highly relevant or including the correct answer), the retrieved text suppresses the visual attention globally, and the attention on image tokens shifts away from question-relevant regions. This leads to failures on questions the model could originally answer correctly without the retrieved text. To mitigate this issue, we propose MAD-RAG, a training-free intervention that decouples visual grounding from context integration through a dual-question formulation, combined with attention mixing to preserve image-conditioned evidence. Extensive experiments on OK-VQA, E-VQA, and InfoSeek demonstrate that MAD-RAG consistently outperforms existing baselines across different model families, yielding absolute gains of up to 4.76%, 9.20%, and 6.18% over the vanilla RAG baseline. Notably, MAD-RAG rectifies up to 74.68% of failure cases with negligible computational overhead.

06.
arXiv (CS.AI) 2026-06-16

Learning aligned EEG representations with subject-specific encoders

arXiv:2606.16462v1 Announce Type: cross Abstract: Cross-subject EEG decoding promises more training data, but it also exposes neural networks to strong inter-subject distribution shifts. We study whether task supervision and architecture alone can learn subject-aligned representations. We replace a shared EEG encoder with subject-specific encoders followed by a common classifier, and compare this hybrid model with standard EEGNet, AttentionBaseNet, and CTNet baselines with Euclidean Alignment (EA) on four motor-imagery datasets. EA improves shared encoders by recentering subject covariances, but the hybrid encoder largely internalises this role: validation-loss curves and latent-distance analyses change little when EA is removed. Subject-specific heads increase class distinctiveness and place each subject close to its own latent manifold, improving most subjects while leaving a method-sensitive subset. These results support subject-specific encoders as a learned alignment mechanism for EEG decoding and identify head selection for unseen subjects as the remaining bottleneck.

07.
arXiv (CS.LG) 2026-06-19

Adaptive Distance-Aware Trunk Deep Operator Learning for Long-Span Roadway Bridges

arXiv:2606.20015v1 Announce Type: new Abstract: Long-span roadway bridges exhibit highly localized structural responses under vehicular loading, making repeated FE analysis computationally expensive for applications such as influence surface generation and structural digital twins. Existing SciML approaches struggle to accurately capture these localized responses. To address this challenge, this study proposes an adaptive-trunk DeepONet for localized structural response prediction in large-scale bridge systems. The framework dynamically constructs a load-dependent learning domain using a KNN strategy, allowing the network to focus on structural influence zones. The trunk network is further enhanced using distance-aware features that encode the geometric relationship between the load and structural nodes. A physics-based full-field reconstruction is incorporated through a stiffness-informed Schur complement formulation, enabling predictions at adaptive nodes to be extended to the entire structural domain. To enable scalable training, response data are generated using a reduced-order equivalent shell model that preserves the dominant global behavior while significantly reducing computational cost. The proposed framework is validated on both a benchmark bridge model and the real-world Mussafah Bridge. Results show that the method achieves FEM-level accuracy with relative errors below 5%, while reducing the total response evaluation time (including full-field reconstruction) by approximately 60x; excluding the post-processing reconstruction step, the AD-DeepONet inference is up to four orders of magnitude faster than FEM. In addition, the framework enables rapid generation of full-field responses, influence lines, and influence surfaces under arbitrary vehicular loading configurations, demonstrating strong potential for large-scale bridge analysis and digital twin applications.

08.
arXiv (CS.CV) 2026-06-17

Neural Tree Reconstruction for the Open Forest Observatory

The Open Forest Observatory (OFO) is a collaboration across universities and other partners to make low-cost forest mapping accessible to ecologists, land managers, and the general public. The OFO is building both a database of geospatial forest data as well as open-source methods and tools for forest mapping by uncrewed aerial vehicle. Such data are useful for a variety of climate applications including prioritizing reforestation efforts, informing wildfire hazard reduction, and monitoring carbon sequestration. In the current iteration of the OFO's forest map database, 3D tree maps are created using classical structure-from-motion techniques. This approach is prone to artifacts, lacks detail, and has particular difficulty on the forest floor where the input data (overhead imagery) has limited visibility. These reconstruction errors can potentially propagate to the downstream scientific tasks (e.g. a wildfire simulation.) Advances in 3D reconstruction, including methods like Neural Radiance Fields (NeRF), produce higher quality results that are more robust to sparse views and support data-driven priors. We explore ways to incorporate NeRFs into the OFO dataset, outline future work to support even more state-of-the-art 3D vision models, and describe the importance of high-quality 3D reconstructions for forestry applications.

09.
arXiv (CS.CV) 2026-06-16

Ultra Flash: Scaling Real-Time Streaming Video Generation to High Resolutions

While recent autoregressive video diffusion models achieve remarkable streaming quality, they remain confined to low resolutions (e.g., 480P), leaving efficient, scalable, real-time high-resolution video generation a fundamental open challenge. To bridge this gap, we present Ultra Flash, a cascaded streaming framework capable of real-time high-resolution video generation. Ultra Flash achieves ~30 FPS at 1K resolution and ~18 FPS at 2K resolution on a single GPU through three key contributions: (1) an architecture-preserving T2V-to-TV2V super-resolution training paradigm coupled with an AIGC-oriented data degradation pipeline that effectively preserves the generative capability of the base model, enabling enhanced high-resolution detail when cascaded after mainstream low-resolution generative models; (2) a causal streaming latent upsampler paired with a high-resolution decoder, which enhances spatiotemporal coherence while enabling efficient latent spatial scaling and precise high-resolution decoding with negligible computational overhead; and (3) a cascade high-resolution streaming video generation optimization scheme that first performs hybrid-reward-enhanced sparse causalization and single-step distillation of the super-resolution model, then introduces cascaded streaming self-forcing preference optimization with dynamic cache management, jointly enhancing overall coherence, improving quality, and enabling real-time high-resolution streaming video generation. Extensive experiments demonstrate that Ultra Flash reliably produces ultra-high-resolution streaming video while maintaining state-of-the-art visual quality and superior efficiency. Project Page: https://xin1u.github.io/UltraFlash/

10.
arXiv (quant-ph) 2026-06-17

Engineering entanglement and transport in interacting quantum walks with tailored potentials

arXiv:2606.17825v1 Announce Type: new Abstract: Controlling the interplay between particle propagation and quantum correlation generation is a central challenge in quantum transport. Here, we investigate two distinguishable continuous-time quantum walkers evolving on parallel one-dimensional lattices, interacting via distance-dependent potentials. While on-site interactions reproduce the typical bosonic behaviour, extending the interaction to a linear potential over multiple neighbors introduces controlled Bloch-like oscillations and shifts the bound-pair regime to stronger couplings. More generally, we explore a Coulomb-like interaction parameterized by strength, spatial scaling, and decay rate. This reveals a rich phase diagram including four distinct dynamical regimes: (i) a high-entropy, oscillatory regime akin to a linear potential; (ii) a strongly localized, bound-pair regime; (iii) a novel intermediate regime combining near-ballistic spreading with strong correlations; and (iv) a weakly interacting, free-propagation regime. Notably, regime (iii) achieves concurrent optimization of transport efficiency and entanglement, offering a sweet spot for correlated quantum dynamics. Our results provide a tool for designing interaction-engineered quantum walks with potential applications in quantum information processing and simulations.

11.
arXiv (quant-ph) 2026-06-11

Holographic Complexity, Extremality, and Cosmic Censorship

arXiv:2604.20170v2 Announce Type: replace-cross Abstract: We propose a holographic complexity origin for the third law of black-hole mechanics and weak cosmic censorship. In both complexity equals action and complexity equals volume prescriptions, the relative complexity between subextremal and extremal AdS black holes diverges logarithmically. For overcharged RN-AdS, explicit calculations in both prescriptions show that the near-singularity action terms are power-law divergent or finite, while the maximal-volume contribution is finite. Thus, the extremal-to-naked relative complexity also diverges, obstructing finite-time transitions.

12.
arXiv (CS.CL) 2026-06-16

Utility-Diversity Aware Online Batch Selection for LLM Supervised Fine-tuning

Supervised fine-tuning (SFT) is a commonly used technique to adapt large language models (LLMs) to downstream tasks. In practice, SFT on a full dataset is computationally expensive and sometimes suffers from overfitting or bias amplification. This facilitates the rise of data curation in SFT, which prioritizes the most valuable data to optimze. This work studies the online batch selection family that dynamically scores and filters samples during the training process. However, existing popular methods often (i) rely merely on the utility of data to select a subset while neglecting other crucial factors like diversity, (ii) rely on external resources such as reference models or validation sets, and (iii) incur extra training time over full-dataset training. To address these limitations, this work develops UDS (Utility-Diversity Sampling), a framework for efficient online batch selection in SFT. UDS leverages the nuclear norm of the logits matrix to capture both data utility and intra-sample diversity, while estimating inter-sample diversity through efficient low-dimensional embedding comparisons with a lightweight memory buffer of historical samples. Such a design eliminates the need for external resources and unnecessary backpropagation, securing computational efficiency. Experiments on multiple benchmarks demonstrate that UDS consistently outperforms state-of-the-art online batch selection methods under varying data budgets, and significantly reduces training time compared to full-dataset fine-tuning. Code is available at https://github.com/gfyddha/UDS.

13.
arXiv (CS.LG) 2026-06-16

Learning Topological Representations for Molecular Dynamics

arXiv:2606.14737v1 Announce Type: cross Abstract: Molecular dynamics (MD) simulations generate trajectories in a high-dimensional configuration space whose analysis critically depends on molecular descriptors, typically handcrafted observables or learned kinetic embeddings. Designing descriptors that are both expressive and broadly applicable, however, remains challenging. We study persistent homology (PH) as a general-purpose representation for MD and introduce the masked Flood complex, a protein-tailored modification of a recently introduced simplicial complex construction that emphasizes inter-residue structure at low computational cost. Vectorized persistence diagrams then provide information-rich, geometry-aware summaries of protein conformations, which we evaluate on protein class prediction, frame-level observable regression, and Markov state model (MSM) estimation from learned low-dimensional coordinates in a single shared representation space. Results on the mdCATH dataset show that PH-based descriptors are competitive across tasks, with masked Flood PH yielding the most consistent overall performance. Further, when using topologically-informed MSMs as a drop-in replacement within the recent MarS-FM framework for generative modeling of protein conformations, we obtain consistently better ensemble statistics than MSMs based on physical observables. Finally, we explore the transferability of the generative model to qualitatively different, fast folding, proteins.

14.
arXiv (CS.CL) 2026-06-16

Exploring Extrinsic and Intrinsic Properties for Effective Reasoning with Code Interpreter

Reasoning with a Code Interpreter (CI) has emerged as an effective paradigm for enhancing the reasoning capabilities of large language models (LLMs) through executable computation and iterative verification. Despite its growing adoption, the behavioral properties underlying effective code reasoning remain largely underexplored. In this work, we investigate code reasoning from two distinct perspectives inspired by prior studies of natural language reasoning: extrinsic properties, represented by crucial tokens, and intrinsic properties, represented by code-specific cognitive behaviors. Across multiple LLMs, we find that stronger CI reasoning models consistently exhibit a higher prevalence of crucial tokens and cognitive behaviors, particularly verification, backtracking, and backward chaining. Building on these observations, we examine how these properties can be leveraged during both inference and training. At inference time, appending code-specific crucial tokens improves performance on several reasoning capabilities, including mathematical, ordering, and optimization, while yielding limited benefits elsewhere. At training time, augmenting a state-of-the-art framework with code-specific cognitive behaviors improves supervised fine-tuning and reinforcement learning performance in two of three evaluated models. Further analysis shows that these behaviors reduce overthinking in incorrect responses and improve token efficiency, while also revealing factors that limit gains in a certain model. Our findings provide the first systematic characterization of effective reasoning with CI and demonstrate both the potential and limitations of leveraging key properties to improve CI-based reasoning.

15.
arXiv (CS.CL) 2026-06-17

PseudoBench: Measuring How Agentic Auto-Research Fuels Pseudoscience

As Large Language Model based agents enter autonomous scientific research, their ability to resist pseudoscience becomes increasingly important. Otherwise, such systems may rapidly generate plausible yet misleading studies that contaminate academic literature and erode trust in science. We present PseudoBench, an adversarial benchmark for evaluating whether agentic auto-research systems can identify and resist pseudoscientific narratives. PseudoBench contains 200 curated pseudoscientific claim-evidence pairs across five domains and evaluates agents through an end-to-end research pipeline from experiments to writing. Testing seven state-of-the-art agents, we find that current systems readily produce persuasive reports that align with pseudoscientific premises with near-zero refusal rates and the highest resistance of only 27.4%. Stronger agents risk packaging pseudoscience in more sophisticated scientific language, increasing its apparent credibility. These findings reveal an alarming capacity to fuel pseudoscience, calling for scientific alignment before widespread deployment.

16.
arXiv (quant-ph) 2026-06-12

Geometric Algebra Quantum Gate Decomposition

arXiv:2606.12480v1 Announce Type: new Abstract: Quantum gates are usually described through matrix and tensor-product formalisms that often obscure their geometric structure. In this work, we formulate the Pauli and Clifford groups within the complex Geometric Algebra (GA) framework. We show that the Pauli group is naturally identified with the group of blades up to a global phase, thereby providing a geometric interpretation of Pauli operators and their commutation relations in terms of oriented subspaces. We further prove that Clifford operators are generated by products of {\pi}/4-Pauli rotors and introduce a greedy Pauli rotor decomposition algorithm whose empirical behavior suggests unexpectedly compact decompositions for Clifford operators. Finally, we show that Clifford+T universality admits a natural geometric interpretation through {\pi}/8-rotors within this framework.

17.
arXiv (CS.LG) 2026-06-24

Decentralized SGD with Controlled Disagreement Finds Flatter Minima

arXiv:2602.02899v2 Announce Type: replace Abstract: Decentralized training is often regarded as inferior to centralized training because the consensus errors between workers are thought to undermine convergence and generalization. This work challenges this view by introducing decentralized SGD with Adaptive Consensus (DSGD-AC), which uses a time-dependent scaling mechanism to maintain consensus errors throughout the training. We show that adaptive consensus changes the stationary variance of disagreement modes by balancing two effects: it preserves consensus-error magnitude through weaker graph damping while still allowing curvature-dependent damping to shape the disagreement directions. This balance can produce a stronger Hessian-weighted loss-envelope penalty around the deployed model, even when normalized Hessian alignment is weaker than in standard DSGD. Empirical results on image classification show that DSGD-AC reaches flatter solutions and higher test accuracy than standard DSGD and even centralized SGD. Together, these results support consensus errors as a useful implicit regularizer and open a new perspective on the design of decentralized learning algorithms.

18.
arXiv (quant-ph) 2026-06-16

Neural network inverse design of nanophotonic scintillators

arXiv:2606.16309v1 Announce Type: cross Abstract: Scintillators are materials converting high-energy radiation into optical light, essential in a range of technologies such as medical imaging systems and security scanners. Scintillator development and optimization have remained limited by the complexity of their underlying physics, involving stochastic cascades of electron-electron, electron-phonon, and electron-photon interactions. Such processes are typically modeled by non-differentiable Monte Carlo simulations, limiting the applicability of machine learning for scintillator development. Here we present a physics-informed neural network that learns the scintillation cascade process from the incident high-energy particle to photon emission, substantially accelerating scintillator design and optimization. Combining this neural network with photonic simulations enables end-to-end differentiable optimization of the scintillator geometry. This allows us to optimize for arbitrary figures of merit, such as specific target emission patterns.. We demonstrate the concept and characterize it relative to previous approaches by inverse design of nanophotonic scintillators for X-ray imaging.

19.
arXiv (quant-ph) 2026-06-15

New Identity for Cayley's First Hyperdeterminant with Applications to Symmetric Tensors and Entanglement

Authors:

arXiv:2512.03093v3 Announce Type: replace Abstract: In this article, a new formula for computing Cayley's first hyperdeterminant in terms of the Levi-Civita symbol is given. It is then shown that this formula can be used to compute the hyperdeterminant of symmetric tensors in polynomial time with respect to their order (assuming fixed side length). Applications to quantifying the entanglement of states of bosonic quantum systems are then discussed. Additionally, in order to obtain the fast calculation of the hyperdeterminant on symmetric tensors, generalized elimination and duplication matrices are defined and their explicit formulas are derived.

20.
arXiv (CS.CL) 2026-06-16

HK-LegiCoST: Leveraging Non-Verbatim Transcripts for Speech Translation

We introduce HK-LegiCoST, a new three-way parallel corpus of Cantonese-English translations, containing 600+ hours of Cantonese audio, its standard traditional Chinese transcript, and English translation, segmented and aligned at the sentence level. We describe the notable challenges in corpus preparation: segmentation, alignment of long audio recordings, and sentence-level alignment with non-verbatim transcripts. Such transcripts make the corpus suitable for speech translation research when there are significant differences between the spoken and written forms of the source language. Due to its large size, we are able to demonstrate competitive speech translation baselines on HK-LegiCoST and extend them to promising cross-corpus results on the FLEURS Cantonese subset. These results deliver insights into speech recognition and translation research in languages for which non-verbatim or ``noisy'' transcription is common due to various factors, including vernacular and dialectal speech.

21.
arXiv (quant-ph) 2026-06-17

Tensor network compression using fluid dynamics as a testbed: Analytical foundations in one dimension

arXiv:2606.17064v1 Announce Type: cross Abstract: High performance computers produce extreme-scale data sets that require sampling or compression if they are to be used to their full potential. Existing data compression techniques typically exploit features such as sparsity in the data, homogeneity in the data, or {\it a priori} knowledge of what subsets of data are of most interest. Fluid dynamics data in general do not exhibit these features and so are attractive test beds for generic compression techniques that are objective, robust, and tuneable with respect to information lost due to compression. Presented here is a method based on tensor networks, specifically matrix product states or tensor trains, that meets these requirements. The method is demonstrated for compression in one-dimension and is extensible to higher dimensionality. Lossless compression is demonstrated for random Fourier series for sufficiently high bond dimension of the tensor network, with the memory required to store the tensor network scaling directly proportional to the bond dimension. The lossy compression exhibited at lower bond dimension can be well within the relative error of many fluid simulations. The compression algorithm is tested for the time evolution of Burger's equation with excellent results. We additionally demonstrate the capability to perform computations in the compressed form through a tensor network periodic convolution that can be orders of magnitude faster than using fast Fourier transforms and the convolution theorem. In addition to being an attractive method for working with data sets generated by existing computers, the tensor network methods utilised are directly translatable to the emerging paradigm of quantum computing.

22.
arXiv (CS.AI) 2026-06-16

MimicIK: Real-Time Generative Inverse Kinematics from Teleoperation with FK Consistency

arXiv:2606.15148v1 Announce Type: cross Abstract: Inverse kinematics (IK) remains a critical bottleneck for real-time robot manipulation. Classical numerical solvers achieve high geometric precision but often suffer from discontinuous branch switching and unstable behavior near kinematic singularities during closed-loop deployment. Meanwhile, learned IK approaches frequently struggle to balance spatial accuracy, motion smoothness, and real-time efficiency, particularly when trained on noisy human teleoperation data. We present MimicIK, a real-time generative inverse kinematics framework that learns smooth and robust joint-space motion priors from teleoperation demonstrations through conditional flow matching. Given the current joint configuration and a target end-effector pose, MimicIK predicts continuous delta-joint commands using an efficient two-step iterative refinement process based on a Minimal Iterative Policy (MIP) backbone. To enforce physical consistency, we further introduce an FK consistency loss, a differentiable forward-kinematics regularization that penalizes task-space deviations from the target pose during training. We evaluate MimicIK on a real-world 6-DOF robot dataset containing 8,848 teleoperation demonstrations. MimicIK achieves a mean position error of 4.65 mm, a 10 mm success rate of 92.01\%, and a trajectory spike rate of only 7.99\%. Compared with a UNet diffusion baseline, our method improves both spatial accuracy and motion smoothness while reducing inference latency from 21.66 ms to 6.74 ms. Furthermore, unlike deterministic MLP baselines that catastrophically diverge under out-of-distribution deployment, MimicIK remains stable near singular configurations and enables robust 20 Hz real-time control on deployment hardware.

23.
arXiv (CS.LG) 2026-06-15

A Complexity Measure for Active Learning in Multi-group Mean Estimation

arXiv:2606.14690v1 Announce Type: new Abstract: We study a max-risk objective for active learning in a multi-group mean estimation $d$-armed bandits: a learner adaptively allocates a budget of $T$ samples across $d$ groups to minimize the worst-case uncertainty index $\max_{k\in[d]}\sigma_k^2/n_k$, where $\sigma_k$ is the standard deviation of the distribution of arm $d$, and $n_k$ is the number of times arm $d$ is sampled. We develop a local minimax framework and prove the first general lower bound for this objective, valid for any finite-variance hypothesis class. The bound separates difficulty into three orthogonal factors: a budget term, a heteroscedasticity index measuring how unevenly the uncertainty is spread across arms, and a model-dependent complexity measure, the Variance Local Curvature ($\mathrm{VLC}$), which captures how much information a local change of variance creates inside the hypothesis class. For smooth classes, the $\mathrm{VLC}$ is a reparametrization of a variance–Fisher information, with closed-form values for common families. Benchmarking against the strongest available upper bound shows near-optimality up to logarithmic factors in broad regimes, and pinpoints a systematic gap in highly heterogeneous instances. Our proof introduces two key ingredients: a loss-induced $\ell_1$ geometry on the decision space, and a representation-based instance generator that reduces hard-instance construction to an explicit random matrix calculation.

24.
arXiv (CS.LG) 2026-06-12

Hölder++: Improving the Quality-Coherence Trade-off in Multimodal VAEs

arXiv:2606.13381v1 Announce Type: new Abstract: Existing approaches for multimodal variational autoencoders (VAEs) face a trade-off between generative quality and coherence-i.e., they struggle to generate realistic and diverse samples that, at the same time, are semantically consistent across modalities. A recent work shows that using a simple approximation to Hölder pooling as an aggregation method improves coherence over the SOTA MMVAE+, despite assuming a single shared representation across all modalities. Yet, it slightly compromises sample diversity. Inspired by this insight, we propose Hölder++, a novel multimodal VAE that improves the generative quality-coherence trade-off through: (i) the first implementation of Hölder pooling without any approximation for multimodal VAEs; (ii) an extended architecture that models distinct shared and private (i.e., modality-specific) representations (Hölder+); and (iii) hierarchical inference that further enhances the disentanglement between the shared and private representations (Hölder++). Our experiments corroborate that Hölder++ consistently improves the generative quality-coherence trade-off, yields more structured latent spaces, and learns shared representations that are informative for downstream tasks.

25.
PLOS Medicine 2026-06-18

Association between initial benzodiazepine prescribing patterns and time to benzodiazepine discontinuation: A population-based retrospective cohort study

by Nikki Bozinoff, Tanya S. Hauck, Robert A. Kleinman, Matthew E. Sloan, Beth A. Sproule, Simone N. Vigod, Jennifer Wyman, Priscila Pequeno, Tara Gomes Background Long-term benzodiazepine use has been associated with increased risk of morbidity and mortality. Preventing long-term use through safer prescribing practices has received little attention to date. We sought to better understand associations between initial prescription characteristics and duration of benzodiazepine use. Methods and findings This was a retrospective population-based cohort study of 1,820,808 adults in Ontario with incident benzodiazepine prescriptions between January 1, 2013 and December 31, 2020, with follow-up to December 31, 2021. The primary exposure was duration of the index prescription (≤7 days—referent group, 8–14 days, 15–30 days, or >30 days). Secondary exposures were: (a) duration of action of index benzodiazepine(s) prescription (short-acting, long-acting or both); (b) number of benzodiazepine dispensed on index (1 or 2+); and (c) mean daily dose of the index prescription in Diazepam Milligram Equivalents (DMEs). The primary outcome was time to benzodiazepine discontinuation in days. Multivariable models were adjusted for age, sex, anxiety, insomnia, and substance use disorders as well as other important comorbidities and socio-demographic characteristics. The median age at index was 53 years (Interquartile Range (IQR) 38–67), and 62.6% were women. The median time to discontinuation in women was 16 days (IQR: 6–29) while the median time to discontinuation in men was 19 days (IQR: 6–29). Lorazepam was the most commonly prescribed benzodiazepine on index (63.9%), followed by clonazepam (17.3%) and diazepam (5.8%). In multivariable Cox Proportional Hazards Models, longer index prescriptions were associated with a lower likelihood of benzodiazepine discontinuation (adjusted Hazard Ratio (aHR) 0.54 (95% Confidence Interval (CI) [0.54,0.54]) for 8–14 days; aHR 0.26 (95% CI [0.25,0.26] for 15–30 days and aHR 0.14 (95% CI [0.14,0.14]) for >30 days, compared to ≤7 days, respectively). Being prescribed two or more benzodiazepines versus 1 was also associated with a reduced likelihood of discontinuation (aHR 0.59 (95% CI [0.57,0.61])), as was being prescribed long-acting benzodiazepines (aHR 0.80 (95% CI [0.80,0.80])) or a combination of short and long acting benzodiazepine (aHR 0.84 (95% CI [0.80,0.88])) versus short-acting benzodiazepines alone. Mean daily doses of >5 to ≤10 DME and >10 to ≤20 DME were associated with an increased likelihood of discontinuation (aHR 1.03 (95% CI [1.03,1.03]); aHR: 1.03 (95% CI [1.03,1.04])), whereas doses >20 DME were associated with a reduced likelihood of discontinuation (aHR 0.98 (95% CI [0.97,0.98])) compared with ≤5 DME. Findings may be subject to bias from unmeasured confounding. Conclusion This large population-based cohort study found that prescribing shorter courses of benzodiazepines, use of a single benzodiazepine, use of a short-acting agent, were associated with reduced likelihood of long-term benzodiazepine use. Findings suggest that simple changes to prescribing practices could reduce prolonged benzodiazepine use and the morbidity and mortality associated with long-term use of these medications.