Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
arXiv (CS.CV) 2026-06-18

Physics in 2-Steps: Locking Motion Priors Before Visual Refinement Erases Them

Image-to-Video diffusion models leverage input images to generate visually stunning content, yet frequently produce motion that violates physical laws. We reveal a surprising finding: a 2-step generation often exhibits better physical consistency than a 50-step output from the same model. Through spectral analysis, we trace this to phase erosion during denoising; the phase degrades significantly (dropping by $\approx 18\%$ from step 2 to step 50), whereas the magnitude remains relatively stable. Building on this insight, we propose PhaseLock, a training-free framework that preserves the valid motion priors from few-step inference throughout the denoising trajectory. Rather than relying on full-step inference for physical consistency, PhaseLock extracts a motion prior from just 2 steps and enforces it onto high-fidelity generation via Latent Delta Guidance. Our approach effectively mitigates phase degradation, improving physical consistency by an average of 6.2 points across diverse models while largely maintaining visual fidelity, with negligible overhead ($1.06\times$ time, $1.02\times$ memory) and reduced reliance on expensive external guidance methods ($\sim5\times$ time). Project Page: https://dnwjddl.github.io/phaselock

02.
arXiv (CS.AI) 2026-06-11

WeaveBench: A Long-Horizon, Real-World Benchmark for Computer-Use Agents with Hybrid Interfaces

arXiv:2606.09426v2 Announce Type: replace Abstract: Computer-use agents (CUAs) increasingly operate in runtimes that combine visual desktop control, command-line execution, code editing, browsers, and external tools. Existing benchmarks, however, often evaluate these interfaces as separable capabilities, leaving long-horizon cross-interface orchestration under-tested. Thus, we introduce WeaveBench, a long-horizon hybrid-interface benchmark with 114 tasks across 8 real-world work domains, grounded in real user requests and publicly verifiable artifacts. Each task requires agents to combine GUI observations/actions with CLI/code operations within a single trajectory. We evaluate these tasks on a real Ubuntu desktop inside deployed CLI-agent runtimes, augmented with a minimal desktop-control plugin. We also propose a companion trajectory-aware judge that inspects deliverables, files, screenshots, logs, and action traces, while detecting shortcut behaviors such as fabricated visual evidence or hard-coded metrics. Across frontier model-runtime pairings, the best PassRate reaches only 41.2%, showing the benchmark remains far from saturated. The trajectory-aware judge further reveals that outcome-only grading substantially overestimates agent performance. Overall, WeaveBench exposes a critical gap in CUA evaluation and provides an effective testbed to measure whether agents can orchestrate GUI, CLI, and code operations across long-horizon real-world tasks.

03.
arXiv (CS.LG) 2026-06-12

Hölder++: Improving the Quality-Coherence Trade-off in Multimodal VAEs

arXiv:2606.13381v1 Announce Type: new Abstract: Existing approaches for multimodal variational autoencoders (VAEs) face a trade-off between generative quality and coherence-i.e., they struggle to generate realistic and diverse samples that, at the same time, are semantically consistent across modalities. A recent work shows that using a simple approximation to Hölder pooling as an aggregation method improves coherence over the SOTA MMVAE+, despite assuming a single shared representation across all modalities. Yet, it slightly compromises sample diversity. Inspired by this insight, we propose Hölder++, a novel multimodal VAE that improves the generative quality-coherence trade-off through: (i) the first implementation of Hölder pooling without any approximation for multimodal VAEs; (ii) an extended architecture that models distinct shared and private (i.e., modality-specific) representations (Hölder+); and (iii) hierarchical inference that further enhances the disentanglement between the shared and private representations (Hölder++). Our experiments corroborate that Hölder++ consistently improves the generative quality-coherence trade-off, yields more structured latent spaces, and learns shared representations that are informative for downstream tasks.

04.
arXiv (quant-ph) 2026-06-24

Altermagnet-Superconductor Heterostructure: a Scalable Platform for Braiding of Majorana Modes

arXiv:2506.08095v2 Announce Type: replace-cross Abstract: Topological quantum computation, featuring qubits built out of anyonic excitations known as Majorana zero modes (MZMs), have long presented an exciting pathway towards scalable quantum computation. Recently, the advent of altermagnetic materials has presented a new pathway towards localized MZMs on the boundary of two-dimensional materials, consisting of an altermagnetic film, subject to a superconducting proximity effect from a superconducting substrate. In this work, we demonstrate the possibility for an altermagnet-superconductor heterostructure, to not only harbor MZMs, but also freely manipulate their position along the topological boundary of the material, via rotation of the Néel vector. Using this mechanism, on a square platform, we utilize a time-dependent method to simulate the Z-gate via braiding, and then extend this to a larger H-junction, where we implement the $\sqrt{X}$ and $\sqrt{Z}$ gate on a single-qubit system. Further, this structure is eminently scalable to many-qubit systems, thus providing the essential ingredients towards universal quantum computation.

05.
arXiv (CS.AI) 2026-06-12

Once-for-All: Scalable Simultaneous Forecasting via Equilibrium State Estimation

arXiv:2606.13285v1 Announce Type: cross Abstract: We introduce Equilibrium State Estimation (ESE), a novel paradigm for simultaneous prediction, where multiple interacting systems require separate yet coordinated forecasts. Such scenarios often arise in real-world settings such as economics and healthcare modeling. Unlike existing approaches that predict one system at a time, ESE forecasts all systems in a single pass. It first estimates the equilibrium state across systems, then generates holistic forecasts based on the difference between the current state and the estimated equilibrium. Extensive experiments on synthetic and real-world datasets, including currency exchange and COVID-19 spread modeling, demonstrate that ESE is at least as accurate as state-of-the-art (SOTA) methods while being significantly faster. In addition, ESE integrates seamlessly with conventional predictors, combining their accuracy with its exceptional efficiency and delivering a 10-70x speedup. With linear-time complexity, ESE scales far better than SOTA methods as the number of systems increases. Moreover, it remains accurate under diverse perturbations, establishing ESE as a fast, generalizable, robust, and scalable multi-prediction method.

06.
arXiv (CS.LG) 2026-06-11

Space-sampled Value Decay: Forgetting Mechanisms for Non-stationary Deep Reinforcement Learning

arXiv:2606.11797v1 Announce Type: new Abstract: Studies on rodents such as mice have shown the capabilities to adapt their behavior when dealing with changing parameters (``drift'') of the environment even if no information about change is provided (uncertainty) – a behavior that can be modeled by forgetting mechanisms. Non-stationary Reinforcement Learning (NSRL) deals with adapting state-of-the-art RL methods to deal with changing environments: these however usually require (partially) perfect information about the drift such as ``task IDs'' or ``context''. To mitigate the effects of drift, this work develops Space-sampled Value Decay as an explicit forgetting mechanism for value-based deep RL architectures as a simple yet effective approach. In particular we demonstrate and discuss positive effects but also limitations in achieved returns for modifications of Deep Q-networks (DQN) and Soft Actor-Critic (SAC) when evaluated on non-stationary environments.

07.
arXiv (CS.LG) 2026-06-16

M-CTX: Exact and Scalable Spatial Context Retrieval for Trajectory Analytics

arXiv:2606.15244v1 Announce Type: new Abstract: Modern trajectory predictors increasingly condition on external spatial context, such as map geometry, signed distance fields (SDFs), and nearby moving agents. While this context improves prediction quality, constructing it for every training anchor has become a hidden systems bottleneck. In a representative maritime AIS pipeline, spatial context construction requires roughly 17 CPU-days for a 5.48M-anchor corpus, dominating the cost of the downstream predictor. We present M-CTX, an exact and scalable spatial context-retrieval framework for trajectory analytics. M-CTX recasts context construction as an ingest-once, query-many spatial database workload and replaces three brute-force stages – OSM range retrieval, SDF computation, and moving-vessel neighbour lookup – with composable, index-backed operators. Its learned range-index backend, BR-LZ, provides recall-complete MBR-overlap range retrieval and reduces candidate amplification by 1.1x–2.7x relative to global-expansion one-curve baselines. Across four maritime regions, eight baseline systems, synthetic workloads with up to 40M spatial features, and 10^7-record AIS streams, M-CTX reproduces the reference context exactly. On the 5.48M-anchor corpus, it reduces context construction from about 17 CPU-days to 1.8 hours, a measured 226x end-to-end speed-up. An optional storage mode further compresses SDF context by 64x with only a 0.04 m ADE change. These results establish exact spatial context retrieval as a first-class database problem in modern trajectory analytics. Code and datasets are publicly available at https://github.com/mark000071/M-CTX-Traj.

08.
arXiv (quant-ph) 2026-06-16

Retrocausal capacity of a quantum channel: Communicating through noisy closed timelike curves

arXiv:2509.08965v3 Announce Type: replace Abstract: We study the capacity of a quantum channel for retrocausal communication, where messages are transmitted backward in time, from a sender in the future to a receiver in the past, through a noisy postselected closed timelike curve mathematically represented by the channel. We completely characterize the one-shot retrocausal quantum and classical capacities, and we show that the corresponding asymptotic capacities are equal to the average and sum, respectively, of the channel's max-information and its regularized Doeblin information. This endows these information measures with a novel operational interpretation. Furthermore, our characterization can be generalized beyond quantum channels to all completely positive maps. This imposes information-theoretic limits on transmitting messages via postselected-teleportation-like mechanisms with arbitrary initial- and final-state boundary conditions, including those considered in various black-hole final-state models.

09.
arXiv (CS.LG) 2026-06-24

Closing the Loop: Formally Verified Law as a Reward Signal for Self-Improving Legal AI

arXiv:2606.23913v1 Announce Type: new Abstract: This article develops an architecture that creates a formally verifiable reward signal to train legal AI, adapting the LLM proposes, verifier disposes paradigm from mathematical AI to the distinctive demands of law. We present an architecture comprising LLM-driven autoformalization into a formal legal calculus extending Catala, a verification kernel, and explanation generation grounded in formal proof traces. For the computational components of law, the architecture provides provable correctness. For open-textured legal analysis, it provides structural guarantees: every required stage of the legal argument is addressed, argumentation is exercised at the correct stages and not omitted, and the deductive links between steps are valid. We demonstrate the architecture on procedural deadline calculations in German law, Commerce Clause analysis in U.S. constitutional law, and cross-jurisdictional sanction proportionality. We further show that the same architecture has a structural advantage for legal AI training: a deterministic external verifier supplies verifiable outcomes for legal problems and thereby closes the traditional reinforcement-learning loop gap in law.

10.
arXiv (CS.AI) 2026-06-24

Page image classifier fine-tuned on century-spanning archives of scanned documents for further content-specific processing

arXiv:2606.07558v2 Announce Type: replace-cross Abstract: Purpose: Digitization projects in the humanities produce vast, heterogeneous archives of historical documents, making manual sorting impractical at scale. This work addresses the need for an automated system to classify scanned page images based on visual content type - text, tables, and graphics - enabling content-specific downstream processing such as Optical Character Recognition (OCR) or structured data extraction. Methods: An image classification system was developed and evaluated on a dataset of over 48,000 annotated historical page images from century-old Czech archaeological archives, refined through four successive annotation stages with domain-expert review. A Random Forest Classifier baseline was established using hand-crafted image features. Subsequently, deep learning architectures were fine-tuned and compared: Convolutional Neural Networks (EfficientNetV2, RegNetY), Vision and Document Image Transformers (ViT, DiT), and multimodal CLIP models. An 11-category label scheme was designed collaboratively with domain experts and evaluated via five-fold cross-validation. Results: The feature-based baseline achieved approximately 75% accuracy. Fine-tuned CNNs and Transformers substantially outperformed it, with RegNetY-16GF achieving 99.16% and ViT-large 99.12% Top-1 accuracy on the held-out test set. CLIP ViT-B/16 reached 99.14% with optimized text descriptions. Conclusion: Image-only models, particularly RegNetY-16GF, deliver near-perfect classification accuracy and produce consistent labels across 649,508 unlabeled archival pages with over 90% inter-model agreement. Fine-tuned CLIP, despite competitive test-set accuracy, showed under 65% agreement with image-only models on unlabeled data, making it less suitable for deployment. The final models, annotated dataset, and software are publicly available under open-source licenses.

11.
arXiv (CS.AI) 2026-06-16

Evaluation of Alternative-Based Information Systems for Deliberative Polling using an Agentic Simulator

arXiv:2606.11692v1 Announce Type: cross Abstract: Deliberative polling promises to improve collective decision-making by exposing shareholders to a broad range of arguments before they vote. Yet ensuring that every voter encounters a representative sample of the reason space, the coverage problem, remains an open challenge, particularly at scale and in adversarial or strategically motivated electorates. This paper introduces a way of evaluating solutions using the LLM-based Agentic Bipolar Argumentation Simulator, grounded in a framework which formalises a poll as a six-tuple of endorsing and opposing justifications, attack and enhance relations, and shareholder- and relation-weights. ABAS simulates N autonomous shareholder agents, each assigned a latent opinion according to desired distributions in [-1, 1], who sequentially vote, choose or author justifications, and optionally submit argumentation-graph links. The simulator implements recommendations that rank existing justifications by their observable endorsement mass. It evaluates the mechanism's success by coverage, namely the fraction of the corpus reason-tag set represented in the K recommendations presented to each shareholder, as a solution to the NP-hard Subsuming Justification Problem. Reported experiments characterise how creativity rate (pown), recommendation size (K), argumentation density (plinks), and population size (N) affect coverage and corpus diversity. In an authenticated electorate where Sybil attacks are impossible and only the relation graph is gameable, we stress-test the scoring with coordinated strategic voting attacks: a tag-flood attack collapses coverage, while author-count relation weighting through a reversed-PageRank rule resists the flood markedly better than uniform weights.

12.
arXiv (CS.LG) 2026-06-24

An Agnostic Machine Learning Model of Photosynthetic Habitability

arXiv:2606.24458v1 Announce Type: cross Abstract: The search for exoplanet biosignatures is guided by whether planetary environments can sustain photosynthesis. As such, the Photosynthetic Habitable Zone (PHZ) was recently proposed, as the overlap between the canonical habitable zone and the orbital range where stellar irradiance is sufficient to drive photosynthesis. Existing PHZ estimates rely on empirical light-response curves from Earth phytoplankton, and thus include implicit Earth-centric biases. We introduce an agnostic PHZ derived from a generalized model of photosynthesis grounded in thermodynamics and redox chemistry, without reference to model organisms. The model is built on a generic photochemical reaction in which photon capture couples oxidation of a donor molecule to the reduction of CO2. The optical properties and CO2 reduction rate are optimized against irradiance spectra for exoplanets orbiting main-sequence stars, using a genetic algorithm that mimics evolution by natural selection. Our simulations predict that photosynthetic organisms compensate for reduced flux by evolving larger light-harvesting structures. As a result, photosynthetic viability declines only linearly with orbital distance, despite stellar flux falling off quadratically. As such, the agnostic PHZ expands well beyond previous Earth-based estimates. Earth-like (visible light) oxygenic photosynthesis is flux-limited at the outer habitable zone for cool M-dwarf stars; however, both anoxygenic photosynthesis and a hypothetical, NIR-driven oxygenic photosynthesis are viable across the entire habitable zone for M, K, and G stars. This implies that M-dwarf exoplanets could sustain robust oxygenic photosynthesis, though it would be different to that found on Earth, presenting reflectance biosignatures in the NIR band rather than the visible.

13.
arXiv (CS.CL) 2026-06-11

AI4SLT: Empirical Processes in Lean 4 for Formal Statistical Learning Theory

We present the first comprehensive Lean 4 formalization of statistical learning theory (SLT) grounded in empirical process theory. Our en-to-end formal infrastructure implement the missing contents in latest Lean library, including a complete development of Gaussian Lipschitz concentration, Dudley's entropy integral theorem for sub-Gaussian processes, and an application to least-squares (sparse) regression with a sharp rate. The project was carried out using a human-AI collaborative workflow, in which humans design proof strategies and AI agents execute tactical proof construction, leading to the human-verified Lean 4 toolbox for SLT. Beyond implementation, the formalization process exposes and resolves implicit assumptions and missing details in standard SLT textbooks, enforcing a granular, line-by-line understanding of the theory. This work establishes a reusable formal foundation and opens the door for future developments in machine learning theory. The code is provided in https://github.com/YuanheZ/lean-stat-learning-theory.

14.
arXiv (CS.CL) 2026-06-18

Written by AI, Managed by AI: Semantic Space Control and Index Sickness Elimination Across 391 Consecutive Sessions

The prevailing engineering intuition for addressing conceptual drift in long-horizon LLM collaboration is to trade more formal constraints for more reliable outputs – designing symbolic identifier systems, accumulating defensive rules in System Prompts, expanding context windows. Our engineering record shows that in long-horizon settings, this direction may produce effects contrary to design intent. Using action research methods in a real software project (Bang-v3) spanning approximately one month and 391 collaborative sessions, we document and analyze the failure process of these strategies. When the symbolic system exceeds a complexity threshold, LLMs do not become more accurate – instead, they abandon genuine understanding of business semantics, retreat to self-referential reasoning within the symbolic layer, and generate outputs that appear internally consistent but are physically disconnected from reality. We name this failure pattern "Index Sickness," and its canonical manifestation "Phantom Legislation." We name the underlying principle the "Pang Principle (Semantic Vitality Law)": natural language carrying explicit purpose conveys far greater information quality than symbolic expression. From this, we design and validate its physical engineering mechanism: "Baseline-Log Physical Separation." In the same project, this mechanism reduced AI Instructions volume by ~75%, and across the subsequent ~150 sessions, no recurrence of Index Sickness was observed. A bilingual companion version (Chinese) is included as supplementary material.

15.
arXiv (CS.AI) 2026-06-16

From Privacy to Workflow Integrity: Communication-Graph Metadata in Autonomous Agent Interoperability

Authors:

arXiv:2606.07150v2 Announce Type: replace-cross Abstract: Agent-interoperability protocols such as A2A and MCP standardize what agents say to one another but assume address-based transport. Whether over HTTP(S) or a content-protecting binding such as MLS-based SLIM, these transports protect message content yet leave the communication graph exposed: which agent contacts which, when, and how often. In agent systems this graph is more consequential than a privacy framing suggests. Endpoints are capability-labeled, workflows are structured and chained, and interactions are coupled to real actions, so an observer recovers more than past relationships: it can infer the pending workflow and, at machine speed, act on that inference before the workflow completes. The threat is therefore one of workflow integrity, not privacy alone. We formalize a threat model for the communication graph and locate what makes its metadata distinctively consequential: not stronger fingerprinting, which we measure to be comparable to other machine traffic, but exposure across independent trust domains, coupled to autonomous action. We define transport- and bootstrap-layer privacy properties, evaluate candidate transports, and give an A2A case study where a metadata-protecting binding surfaces the protocol's implicit identity assumptions. On a generative model anchored to a real capture and over a live A2A binding, a label-blind classifier recovers a task's class from passive metadata well above chance, and from only its opening; a defense-aware adversary does not overturn this, and only the full set of properties drives recovery toward chance. The leverage of acting on the leak is distinct from recoverability: under a fixed budget an adversary realizes most of a clairvoyant attacker's advantage from a workflow's opening, governed by precision over the top-ranked workflows rather than overall accuracy, so a defense suppresses it even while recovery stays above chance.

16.
arXiv (quant-ph) 2026-06-24

Rapid Cavity-Based Mid-Circuit Measurement and Feedforward in a Neutral Atom Array

arXiv:2606.24869v1 Announce Type: new Abstract: Measuring part of a quantum system in the midst of its evolution and acting on the result in real time is essential for numerous quantum information protocols. Neutral-atom arrays are a leading platform for quantum information processing, but their mid-circuit measurement-and-feedforward cycle times have remained slow, typically exceeding 1 ms. Here we demonstrate fast mid-circuit measurement and real-time feedforward in an array of atomic qubits coupled to a high-finesse optical cavity. Local light shifts tune individual data qubits out of resonance with the cavity, shielding their coherence, while a near-resonant probe drives a selected qubit whose emission is collected with Purcell enhancement. Mid-circuit measurements of four qubits with sub percent infidelity reduce the coherence of a fifth unmeasured data qubit by less than 2%. We implement real-time feedforward to correct measurement-induced phase shifts and to realize an adaptive circuit for optimal quantum state discrimination and conditional state preparation. Our approach reduces the measurement-and-feedforward cycle time to below 100 $\mu$s and establishes optical cavities as a route to fast control of neutral-atom quantum systems.

17.
arXiv (CS.LG) 2026-06-19

A Model-Driven Approach for Developing Families of Reinforcement Learning Environments

arXiv:2606.20324v1 Announce Type: cross Abstract: Virtual training environments are software-intensive systems in which reinforcement learning (RL) agents learn, adapt, and demonstrate meaningful behavior. Virtual training environments offer a safe and cost-efficient alternative to training agents in real-world settings. However, to converge, most realistic RL problems require training in multiple, mostly similar but slightly different environments - i.e., families of environment variants. The typical development process of environment families is a labor-intensive and error-prone manual endeavor that does not scale well. To alleviate these issues, in this paper, we propose a model-driven approach for developing families of RL training environments. To obtain the family of environments, we develop an approach and prototype tool. In our approach, a hybrid genetic algorithm - a combination of population-based global search and heuristic local search - generates environment families. Mutations and constraints are expressed as model transformations and are operationalized into a search process by a state-of-the-art model transformation engine. We demonstrate the soundness of our approach in a wildfire mitigation scenario and curriculum learning - a particular learning paradigm that relies on environment families.

18.
arXiv (quant-ph) 2026-06-16

Certified Finite-Shot Operating Windows for Virtual Distillation and Symmetry Verification

arXiv:2606.15464v1 Announce Type: new Abstract: Quantum error mitigation methods are usually compared through their infinite-shot bias, but on real devices the comparison is decided by finite sampling budgets, estimator instabilities, and per-shot resource costs. We develop a finite-shot operating-window theory that makes this comparison certifiable for virtual distillation (VD) and symmetry verification (SV): for each method we derive a mean-squared-error law with explicit, non-asymptotic remainder constants. For VD, the law captures the statistical bias and denominator instability of its quotient estimator, with a concentration certificate locating the sample size beyond which the quotient is trustworthy; for SV, it isolates the bias floor left by undetectable errors and the sampling penalty set by the acceptance probability. A selection trichotomy classifies any two-method comparison into a tie, uniform dominance, or a genuine tradeoff with a certified crossing window, including a self-consistency test that rejects spurious crossings. The theory makes falsifiable predictions – operating-window locations scaling as $p^{-2}$ or $p^{-1}$ in the noise rate, and the sign pattern of all pairwise comparisons – which exact white-box experiments confirm with fitted exponent $-1.97$ against the predicted $-2$ and with $300/300$ sign agreement, within a pre-registered analysis whose single failed gate, an over-strict all-instance criterion, is reported and audited in full. Gate-level simulation and archived runs on two IBM backends then test the windows under device conditions: idealized VD windows exist, but realistic interferometry overhead and denominator instability erase them, and calibrated SV is the practical winner in the tested QAOA instances. This absence of a universal winner is not a failure of mitigation; it is the regime structure that certified operating windows predict.

19.
arXiv (CS.LG) 2026-06-12

Data-driven Lake Water Quality Forecasting for Time Series with Missing Data using Machine Learning

arXiv:2601.15503v2 Announce Type: replace Abstract: Volunteer-led lake monitoring yields irregular, seasonal time series with many gaps arising from ice cover, weather-related access constraints, and occasional human errors, complicating forecasting and early warning of harmful algal blooms. We study Secchi Disk Depth (SDD) forecasting on a 30-lake, data-rich subset drawn from three decades of in-situ records collected across Maine lakes. Missingness is handled via Multiple Imputation by Chained Equations (MICE), and we evaluate performance with a normalized Mean Absolute Error (nMAE) metric for cross-lake comparability. Among six candidates, ridge regression provides the best mean test performance. Using ridge regression, we then quantify the minimal sample size, showing that under a backward, recent-history protocol, the model reaches within 5% of full-history accuracy with approximately 176 training samples per lake on average. We also identify a minimal feature set, where a compact four-feature subset matches the thirteen-feature baseline within the same 5% tolerance. Bringing these results together, we introduce a joint feasibility function that identifies the minimal training history and fewest predictors sufficient to achieve the target of staying within 5% of the complete-history, full-feature baseline. In our study, meeting the 5% accuracy target required about 64 recent samples and just one predictor per lake, highlighting the practicality of targeted monitoring. Hence, our joint feasibility strategy unifies recent-history length and feature choice under a fixed accuracy target, yielding a simple, efficient rule for setting sampling effort and measurement priorities for lake researchers.

20.
arXiv (CS.AI) 2026-06-17

RLRC: Reinforcement Learning-based Recovery for Compressed Vision-Language-Action Models

arXiv:2506.17639v2 Announce Type: replace-cross Abstract: Vision-Language-Action models (VLA) have demonstrated remarkable capabilities and strong potential in complex robotic manipulation. However, their large parameter sizes and high inference latency hinder real-world deployment, especially on resource-constrained platforms. To address this, we conduct a systematic empirical study of model compression for VLAs. Building on these insights, we present RLRC, a three-stage compression and recovery pipeline consisting of structured pruning, performance recovery via SFT and RL, and subsequent quantization. The RL stage incorporates a critic warm-up strategy and BC loss regularization to stabilize training and preserve policy behavior. RLRC achieves up to an 8 times memory reduction and 2.3 times inference speedup while maintaining the original task success rate. Extensive experiments across multiple VLA backbones show that RLRC consistently outperforms existing compression baselines, highlighting its effectiveness for on-device deployment. Project website: https://rlrc-vla.github.io

21.
medRxiv (Medicine) 2026-06-16

Usability testing with a prototype user interface of an Artificial Intelligence driven air-Safety Tool (AISaT)

Involving end-users in the development of an AI tool is an important facilitator to its implementation. Usability testing was therefore conducted with a prototype user interface of an Artificial Intelligence driven air-Safety Tool (AISaT) to capture the perspectives and user experiences of AISaT from 10 staff members across two hospitals working within estates, infection prevention and control, and clinical areas, to inform the development of next iterations of AISaT. The perspectives shared could be grouped under improvements to the understand-ability; content; navigation; visibility; usability; workflow; ownership; and frequency of use of the tool. There were key areas that can and will be easily improved within AISaT, however there were areas that required a deeper level of critical reflection, such as incorporating data on more existing variables in a room (i.e., existing ventilation) and whether all patients should be assumed as infectious and breathing heavily. The research team must consider if the target audience of end users and recommended frequency of AISaT use will be pre-defined by the tool developers, or whether this level of detail should be left to each individual hospital to decide.

22.
arXiv (quant-ph) 2026-06-17

Manipulation of Topological Corner States via Subchiral Symmetry

arXiv:2606.17975v1 Announce Type: new Abstract: Higher-order topological phases provide robust corner modes, but their use requires controllable creation, isolation, and transfer of individual modes and their superpositions. Here we demonstrate, using the two-dimensional Benalcazar-Bernevig-Hughes model as an example, that subchiral symmetry provides a general control principle for manipulating topological corner modes. The conventional chiral symmetry decomposes into four subchiral symmetries, each associated with one zero-energy corner mode. By selectively breaking these subsymmetries with controlled intercell hoppings, we reduce the fourfold corner-state manifold step by step to single isolated modes. We further design adiabatic protocols that transfer either a single corner state or a superposition of two corner states between selected corners, while preserving the relative phase in the latter case. Both numerical simulations and IBM quantum-processor implementations show that the proposed protocols can be executed with high fidelity, establishing subchiral symmetry as a route to programmable higher-order topological state manipulation.

23.
Nature Medicine 2026-06-09

Adjuvanted inactivated rabies virus-vectored Lassa virus vaccine in healthy adults: a phase 1 trial

Lassa fever causes substantial morbidity and mortality in West Africa, and no licensed vaccine is available. We evaluated LASSARAB, an inactivated rabies virus-vectored Lassa virus (Josiah strain) glycoprotein complex vaccine. We conducted a randomized, controlled, dose-escalation phase 1 trial. Participants (total n = 54) received two intramuscular doses of LASSARAB containing 700 (n = 15), 1,400 (n = 15) or 2,800 (n = 14) relative units of antigen formulated with the TLR-4 agonist 3D-6-acyl PHAD-SE adjuvant, or licensed rabies vaccine control (n = 10), administered 28 days apart. This protocol-defined interim analysis reports the primary safety evaluation and secondary immunogenicity assessments through day 61. There were no prespecified hypotheses or formal power calculations. All primary safety end points demonstrated an acceptable safety profile. After dose 1, local solicited adverse events occurred in 86.7–100.0% of LASSARAB groups and 80% of controls; systemic events in 33.3–71.4% and 60.0% of controls. After dose 2, local solicited adverse events occurred in 66.7–86.7% of LASSARAB groups and 55.6% of controls; systemic events in 53.3–71.4% of LASSARAB groups and 55.6% of controls. Events were predominantly mild and self-limited. Unsolicited adverse events occurred in 28.6–60.0% of LASSARAB groups and 20.0% of controls. No serious adverse event, immune-mediated condition or sensorineural hearing loss occurred. Safety laboratory abnormalities occurred in 13.3–66.7% of LASSARAB groups and 30.0% of controls (14 mild, 6 moderate and none severe). After two doses, Lassa virus GPC IgG ELISA seroconversion (≥fourfold rise) was achieved in 100.0% (44 of 44) of LASSARAB recipients and 0.0% (0 of 10) of controls. Rabies glycoprotein IgG ELISA seroconversion (≥fourfold rise) and neutralizing antibody by rapid fluorescent focus inhibition test (RFFIT) seroprotection (≥0.5 IU ml−1) were also 100% across all groups, including controls. LASSARAB + 3D-6-acyl phosphorylated hexaacyl disaccharide (PHAD)-SE demonstrated a favorable safety profile and immunogenicity against Lassa and rabies viruses. The per-protocol final study report will include safety and durability through day 394. ClinicalTrials.gov identifier NCT06546709 . An interim report of a first-in-human phase 1 trial found an adjuvanted, combination inactivated rabies-vectored, Lassa fever vaccine (LASSARAB + 3D-6-acyl PHAD-SE) to be safe and induced immunogenicity to both Lassa and rabies viruses in healthy participants.

24.
arXiv (CS.CL) 2026-06-24

When Retrieval Metrics Mislead: Measuring Policy Signal in Long-Horizon Tool-Use Agents

Exact-match retrieval recall is often used as a proxy for whether a retriever supplies useful policy context to a downstream decision model. We test this proxy for pre-action policy classification in tau-bench using Qwen2.5-3B/7B classifiers. Under gold-policy conditioning, a compact structured state improves macro-F1 over raw trajectories by 0.13-0.17 after tuning. We then replace the benchmark-designated policy clause with the top-ranked clause retrieved from decision-time context. Although the exact governing clause is retrieved at rank 1 for only 7% of airline states, the primary 3B classifier obtains macro-F1 0.58 with retrieved clauses versus 0.60 with gold clauses (Delta=-0.02, task-cluster 95% CI [-0.23,+0.21]); mismatched-policy and no-policy controls score 0.32 and 0.21. We do not detect a macro-F1 difference between retrieved and gold clauses in this configuration, although the interval remains too wide to establish non-inferiority. The same qualitative pattern appears with a second retriever and at 7B, while varying across fine-tuning configurations. These results indicate that exact-match clause recall can underestimate downstream policy utility in this benchmark setting, motivating evaluation with retrieved policies in the classification loop rather than recall alone.

25.
medRxiv (Medicine) 2026-06-10

Global and local genetic overlap among ME/CFS, irritable bowel syndrome and psychiatric traits: a hypothesis-generating analysis

Authors:

Background. Myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) and irritable bowel syndrome (IBS) frequently co-occur following infection, yet shared genetic architecture at the locus level has not been systematically characterised. Aims. To estimate global and local genetic correlations between ME/CFS (including infection-onset subgroup), IBS, major depressive disorder (MDD) and loneliness/isolation, and characterise ME/CFS cell-type heritability enrichment. Method. GWAS summary statistics: DecodeME (15,579 ME/CFS; 9,738 infection-onset), FinnGen R9 (9,296 IBS), PGC MDD Wave 2 (45,396) and UK Biobank loneliness (N=455,364). LDSC for global correlations; LAVA for local correlations across 2,495 loci; MAGMA for cell-type enrichment (Descartes Human atlas); coloc.abf for colocalisation. Results. All pairwise global correlations were significant after Bonferroni correction, including ME/CFS-all-MDD (rg=0.598, 95% CI 0.46-0.74) and ME/CFS-all-IBS (rg=0.573, 0.39-0.75). Of 4,232 local tests, 16 reached FDR