Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
arXiv (math.PR) 2026-06-19

Theory of uncertain probability: can we derive the probability density function of uncertain random experiments with continuously changing conditions?

作者:

arXiv:2606.20169v1 Announce Type: new Abstract: This paper aims to explore the formation mechanism of probability distribution in situations where the differences among random experiments are distinguishable, and these differences continue to evolve along with the dynamic changes in conditions and their mechanisms of action. To this end, we are motivated to devise a new theoretical system – theory of uncertain probability (TUP) with Kolmogorov's system and nonlinear theories as special cases. TUP develops a novel model that integrates probability and uncertainty as well as the known and unknown to more accurately depict numerous typical random phenomena under more realistic assumptions, and thus provides appropriate tools for greater variety of real needs. It also allows for pioneering interpretation of the causal mechanisms underlying many important distributional characteristics and incorporation of pathwise property to distribution model.

02.
medRxiv (Medicine) 2026-06-18

Cost-effectiveness of a virtual fracture clinic versus traditional in-person fracture clinic care for adults with acute simple fractures: a protocol for a health economic evaluation within the RECITAL trial

ABSTRACT Introduction Traditional in-person fracture clinics are often overcrowded and inconvenient for patients. Virtual fracture clinics aim to address some of these concerns by improving the efficiency of the orthopaedic service and reducing unnecessary interventions while maintaining safety and quality of care. The RECITAL trial is a non-inferiority randomised controlled trial comparing follow-up care provided at a virtual fracture clinic for people with acute simple fractures to follow-up care provided at an in-person fracture clinic. This study describes the protocol for an economic evaluation of RECITAL where the primary aim is to investigate the cost-effectiveness of a virtual fracture clinic compared with traditional in-person fracture clinic care from a health system perspective. Methods and analysis The RECITAL trial recruited 312 participants with acute simple fractures and randomised them to receive follow-up care provided at a virtual fracture clinic or follow-up care provided at an in-person fracture clinic. We will conduct a within-trial analysis from a health system perspective (primary analysis), as well as a health service, patient and societal perspective. The economic evaluation will estimate the difference in the cost of resource inputs on an intention to treat basis used by participants in the two arms of the trial, allowing comparisons to be made between the in-person and virtual fracture clinics. Data for intervention costs and healthcare utilisation will be collected from trial records, hospital electronic medical records and district performance units. The results of the economic evaluation will be expressed in terms of incremental cost per utility weight gained at 12 weeks and will be plotted on a cost-effectiveness plane. Bootstrapping by resampling will be used to estimate 95% confidence intervals around costs and outcomes, and to calculate the confidence intervals around the incremental cost-effectiveness ratio. A cost-effectiveness acceptability curve (CEAC) will be plotted, which will provide information about the probability that an intervention is cost-effective, given the level of a decision makers willingness to pay for each additional outcome. Ethics and Dissemination The trail was approved by the SLHD Ethics Review Committee (RPAH Zone) (X23-0200 and 2023/ETH01038). The findings will be disseminated through a peer-reviewed journal and conference presentations. Trial registration number The trial was prospectively registered on the Australian New Zealand Clinical Trials Registry (ANZCTR; 12623000934640)

03.
arXiv (math.PR) 2026-06-25

On the L{é}vy concentration function of Gaussian quadratic forms with applications to second order U-statistics

arXiv:2606.25441v1 Announce Type: new Abstract: We provide an upper-bound for the L{é}vy concentration function: $$ Q_{S}(\varepsilon):= \sup_{x \in\mathbb{R}}\mathbb{P} (x < S \leq x+\varepsilon) $$ where $S$ is a weighted sum of noncentral chi-square random variables: $$ S:= \sum_{k=1}^\infty \lambda_k (Z_k^2 - 1) + \mu_kZ_k $$ Here, $\{Z_k\}_{k=1}^\infty$ is a sequence of independent standard Gaussian random variables and $\{\lambda_k\}_{k=1}^\infty, \{\mu_k\}_{k=1}^\infty$ are real valued, square summable sequences. Random variables of this type often appear as limiting distributions of second order U-statistics. Our bound is adaptive, in that it recovers (up to constant factors) Gaussian type concentration function estimates if $\|\lambda\|_2$ is negligible compared to $\|\mu\|_2$ and chi-square estimates if $\|\mu\|_{2}$ is negligible compared to $\|\lambda\|_2$. Our bound generalizes existing bounds in various ways. In particular, we make no assumptions regarding the number of nonzero $|\lambda_k|$ or the size of the minimal $|\lambda_k|$, nor do we make any assumptions on the signs of $\lambda_k$. Finally, we apply our bound to some examples of interest, specifically quadratic forms that arise in limit theorems for second-order U-statistics.

04.
medRxiv (Medicine) 2026-06-17

Macrophage-targeted glucocorticoid prodrug resolves acute inflammation while preserving HPA axis function: mechanistic, preclinical, and Phase II/III clinical evidence

Glucocorticoids (GCs) remain the fastest-acting anti-inflammatory agents but are constrained by systemic exposure that suppresses the hypothalamic pituitary adrenal (HPA) axis, silences adaptive immunity, and drives chronic toxicities. Chronic inflammatory diseases are sustained by long-lived CD206+ macrophages containing immune-resistant pathogenic material not cleared physiologically. We developed 101-PGC-005 ('005), a macrophage-targeted type 1a dexamethasone prodrug engineered for low-affinity, recycling-compatible uptake via CD206, with intracellular release triggered by acidic endosomes. We evaluated '005 in mechanistic assays, pathogen-diverse preclinical models, three human pharmacokinetic (PK) studies, and an adaptive-design randomized Phase II/III trial in 309 hospitalized patients with moderate COVID-19. In two completed Phase I human studies, a first-in-human dose-escalation and repeated-dose study and a dedicated single/multiple-dose PK and safety study; '005 circulated as intact prodrug with rapid systemic clearance (Tmax ~0.5 h; terminal half-life ~1.9 h), with no measurable free dexamethasone after single dosing and only low, clinically non-significant free dexamethasone after repeated dosing, and intact prodrug recovered unchanged in urine. Morning cortisol and ACTH were preserved after 30 mg once daily for three consecutive days (1.5 times the intended therapeutic dose). A cerebrospinal fluid PK study is evaluating central-compartment penetration. In the Phase II/III trial, powered for non-inferiority, conducted across six sites in India under GCP with Ministry of Health approval and independent DSMB oversight; '005 (20 mg IV daily for 3 days) was superior to dexamethasone (6 mg IV daily for 3 -10 days) on the primary endpoint of time to > a 2-point improvement on the WHO ordinal scale (HR 2.31; 95% CI 1.83-2.93; p < 0.0001; median 3 vs. 4 days). '005 was also superior on viral clearance (HR 1.47; 95% CI 1.17-1.84; p = 0.0001), hospital discharge rate, SpO2; recovery, and fever resolution. Zero patients in the '005 arm received investigator-initiated corticosteroid supplementation despite protocol allowance. All 309 randomized patients completed the study (ITT = per-protocol). Safety profiles were equivalent (TEAEs 54.8% vs 54.5%; p = 0.958), with no Grade 3+ events, SAEs, deaths, or discontinuations in either arm. Mechanistically, '005 delivered dual benefit: acute debulking of inflammatory macrophages and selective depletion of chronically activated pathology-sustaining macrophages, while preserving CXCL10 antiviral signaling and physiologic HPA control. Critically, HPA preservation is not merely a safety feature, it is a core efficacy mechanism: by clearing the pathogenic macrophage burden that was overriding HPA regulation, '005 restores the conditions for endogenous cortisol to resume its pulsatile, demand-responsive anti-inflammatory role across all GR-expressing cells, lymphocytes, endothelial cells, neurons, and newly differentiated macrophages, that '005 itself cannot reach. These findings support regulatory-grade evidence for macrophage-targeted corticosteroid therapy and provide the foundation for further development across acute inflammatory indications (sepsis, viral pneumonia, cytokine-release syndromes) and chronic macrophage-driven diseases (atherosclerosis, metabolic steatohepatitis, neurodegeneration, tumor-associated macrophages).

05.
arXiv (quant-ph) 2026-06-16

Entanglement as a Witness of Quantum Coherence: A Bipartite Monty-Hall Protocol

arXiv:2604.25953v3 Announce Type: replace Abstract: We present a bipartite protocol inspired by the Monty Hall puzzle that operationally distinguishes quantum coherence from classical ignorance. A principal qutrit is entangled with an ancillary qutrit via a controlled unitary, preparing $|\Psi\rangle = \frac{1}{\sqrt{3}}(|A,0\rangle + |B,1\rangle + |C,2\rangle)$. A rank-1 projective discard then eliminates one basis state, leaving a coherent superposition of the two remaining states. Finally, the ancilla and qutrit are measured, yielding joint probabilities that encode the interplay between superposition and measurement back-action. We show that the conditional probability $P(B|anc=0)$ takes the value $1/4$ in both quantum mechanics and the classical ignorant-host model, making it unsuitable as a witness. The true quantum-classical separation emerges in conditional joint probabilities that correlate ancilla outcomes with specific discard operations. We define witnesses $\mathcal{W}_{i,j} = P(anc=i, qutrit=j \mid discard k)$ where $j$ differs from the ancilla-implied state. Quantum mechanics predicts $\mathcal{W} = 1/4$, while any classical epistemic model with perfect initial correlations yields $\mathcal{W} = 0$. We provide the explicit $9 \times 9$ unitary matrix, a complete analysis of all measurement outcomes, and a detailed proof of the violation. The witness is fully immune to white noise and robust against moderate dephasing. The protocol requires only a single pair of entangled qutrits and sequential measurements – no spatial separation, no multiple copies, and no complex sets of incompatible observables. This makes it suitable for advanced undergraduate laboratories and provides a pedagogically accessible test of the ontic-epistemic distinction in quantum foundations.

06.
arXiv (quant-ph) 2026-06-15

Collision models for open quantum systems coupled to finite environments

arXiv:2606.14163v1 Announce Type: new Abstract: We study a system qubit repeatedly interacting with the same environmental qubit, with a reservoir acting on the environment between collisions via a completely positive, trace-preserving map. We show that complete suppression of system–environment correlations uniquely requires a full environmental reset, recovering a semi group dynamics with a time-independent Gorini–Kossakowski–Sudarshan–Lindblad generator, whereas a partial reset yields a continuous transition between Markovian and non-Markovian regimes governed by a single dimensionless relaxation parameter. For a resonant excitation-exchange interaction, we obtain exact closed-form expressions for the Bloch-vector dynamics for both a generalized depolarizing channel and a generalized amplitude-damping channel acting as the reservoir-induced map. Using the Breuer–Laine–Piilo measure and a Choi-matrix CP-divisibility witness, we identify three distinct dynamical regimes across the parameter space: CP-divisible Markovian dynamics, CP-indivisible but P-divisible dynamics, and non-P-divisible non-Markovian dynamics. The boundaries between these regimes, and the structural differences between uniform and anisotropic environmental relaxation, are characterized numerically.

07.
arXiv (CS.CL) 2026-06-17

Priors Persist Through Suppression: A Stroop Paradigm for Lexical Override

作者:

Glossaries, technical specifications, and system prompts routinely ask language models to use familiar words in unfamiliar ways. When this works, the local rule does not install the new meaning on top of the old one; the pretrained prior keeps operating underneath, and its strength still shows through. We test this with a Stroop-style paradigm: a remapping rule (doctor means forest) pitted against the query word's lexical-prior distractor (hospital), with matched neutral controls. Across 11 open-weight models spanning four families and 1B-9B parameters, lexical-prior strength predicts interference even after item-level controls for answer prior, frequency, tokenization, and prompt wording. Activation patching on five aligned models locates a source-position triplet (definition subject, definition target, query word) that nearly fully recovers the conflict effect (aggregate $R \in [0.92, 1.06]$); a definition-target swap shows the triplet performs binding rather than identity matching. Dissociation experiments isolate target preservation as the binding-specific signature: distractor suppression occurs under matched, swap, and item-mismatched conditions alike, whereas target logit collapse occurs only when the definition-target position is corrupted. Behavior and mechanism converge on the same channel: the prior's strength both predicts which overrides fail and marks where the causal repair lands.

08.
bioRxiv (Bioinfo) 2026-06-10

GEOAgent: An AI-driven Autonomous Framework for Intelligent GEO Data Retrieval and Standardized Preprocessing

Datasets in the Gene Expression Omnibus (GEO) remain difficult to reuse at scale because sample annotations are heterogeneous and raw sequencing data require assay-specific preprocessing. We present GEOAgent, an AI-driven autonomous framework designed for intelligent dataset retrieval and standardized preprocessing by coupling autonomous semantic governance with an automated Nextflow pipeline named bioStream. Metadata from 181,760 sequencing series and 84,756 associated PubMed records were organized in a relational database and semantic index to support natural-language dataset retrieval. The framework automatically determines assay modalities, resolves experimental design pairings, and standardizes sample naming to minimize manual curation overhead. Based on these parsed attributes, the framework generates deployment-ready manifests to automatically execute containerized workflows across bulk and single-cell omics modalities. In expert-curated benchmarks, the workflow achieved 96% retrieval precision alongside 100% accuracy in assay classification and sample relationship resolution. The web platform is publicly accessible, while the source code and associated databases are openly available via GitHub and Zenodo.

09.
arXiv (CS.LG) 2026-06-24

Target-Aware Linear Regression Under Distribution Shift

arXiv:2606.22775v2 Announce Type: replace-cross Abstract: Distribution shift between training and deployment is a pervasive challenge for modern AI systems. In many cases, the target marginals of covariates and response are known or specified through population-level observations, boundary conditions, properties of simulator configurations, or alignment-time distributional constraints. Such knowledge may provide valuable side information for regression estimation. We study this problem in the multivariate linear regression setting with a stable conditional mean $E[Y\mid X]$ across source and target, and identify the hybrid-loss estimator, which jointly incorporates both target marginals, as a benchmark target-aware estimator. Its direct computation, however, requires solving a coupled nonlinear optimization that is expensive at scale. Our main contribution is to develop and evaluate two computationally tractable alternatives: a constrained moment-matching estimator and a two-stage estimator that augments ordinary least squares with a calibration step. For all three estimators, we derive and compare closed-form asymptotic mean squared errors, yielding conditions under which the tractable alternatives match or closely approximate the hybrid benchmark, and regimes in which they do not. Monte Carlo experiments across three controlled shift regimes validate the theoretical results, investigate the accuracy-runtime tradeoffs among the three estimators, and translate into guidance on estimator choice. In particular, the two-stage estimator nearly matches the hybrid benchmark in the high signal-to-noise regime at essentially no additional cost, providing theoretical grounding for empirical observations in nonlinear settings.

10.
arXiv (CS.AI) 2026-06-19

Boundary Embedding Shaping with Adaptive Contrastive Learning for Graph Structural Disentanglement

arXiv:2606.20283v1 Announce Type: cross Abstract: Graph neural networks (GNNs) excel at aggregating neighbor information for classification, yet their performance is hindered by graph structural entanglement, where spurious correlations from semantically irrelevant neighbors contaminate node embeddings. This challenge is most acute for nodes near class boundaries in the embedding space, where amplified structural noise blurs decision boundaries and destabilizes predictions. Existing robust GNN methods largely treat all nodes uniformly, ignoring boundary vulnerabilities. In this paper, to improve classification performance, we tackle graph structural disentanglement by identifying boundary-region entanglement as the primary bottleneck and propose Boundary Embedding Shaping (BES), an adaptive contrastive learning GNN plug-in module that selectively suppresses spurious structural noise at decision boundaries with minimal model parameter perturbation. Extensive experiments demonstrate that BES consistently improves boundary discrimination and outperforms existing leading methods. Notably, BES boosts GCN performance by an average of 3.3% in node classification (up to 5.0% on WikiCS) and achieves superior accuracy in link prediction.

11.
arXiv (CS.CV) 2026-06-17

OmniDrive: An LLM-Choreographed Multi-Agent World Model with Unified Latent Co-Compression for Multi-View Driving Video Generation

Generative world models for autonomous driving face two unresolved tensions: heterogeneous control injection, where free-form language, HD-maps, trajectories, and camera poses reside in incompatible representational spaces, and post-hoc cross-view fusion, where per-camera latents fail to encode global 3-D geometry. We trace both to a single root cause: the absence of a shared symbolic interlingua aligning language, geometry, and pixels at the latent-token level. We present DRIVE-CHOREO, an LLM-choreographed multi-agent world model that recasts controllable multi-view video generation as latent choreography. Three Qwen2.5-VL agents - a Director parsing user intent into a structured WorldScript, a Cartographer grounding it into spatially-anchored layout tokens, and an Auditor feeding cross-view critiques back as auxiliary supervision - jointly author a single position-aware token sequence. This sequence is co-compressed with the multi-view video via a view-time permutation that enforces inter-camera geometry within the convolutional receptive field of a 3-D VAE. On nuScenes, DRIVE-CHOREO sets new state-of-the-art multi-view consistency and BEV mAP (21.6) with competitive FVD (45.7); a detector trained purely on our synthetic data gains +2.4 NDS on the real validation split, validating downstream utility.

12.
PLOS Medicine 2026-05-14

First-trimester nonsteroidal anti-inflammatory drugs exposure and risk of major congenital malformations: A retrospective register-based cohort study

by Ariel Avraham Hasidim, Itamar Ben Shitrit, Daphna Idan, Tal Michael, Amalia Levy, Gali Pariente, Eitan Lunenfeld, Sharon Daniel Background Pain and fever are common in early pregnancy, yet their management poses a major clinical dilemma. Although not confirmed, recent studies have raised safety concerns regarding acetaminophen. Evidence on the use of nonsteroidal anti-inflammatory drugs (NSAID) in the first trimester remains inconclusive. This uncertainty has left clinicians with limited evidence to guide treatment decisions. This study evaluated the association between first-trimester NSAID exposure and the risk of major congenital malformations (MCMs) in a large, population-based cohort of pregnancies. Methods and findings We conducted a population-based retrospective cohort study within the Southern Israeli Pregnancy Registry (siPREG) project, including all singleton pregnancies of women aged 15–45 years resulting in live births, stillbirths, or elective terminations for fetal malformations at a Soroka University Medical Center between 1998 and 2018. Pregnancies exposed to established teratogens, multiple gestations, and those with documented genetic or chromosomal anomalies were excluded. First-trimester NSAID exposure was defined by pharmacy dispensations (overall and by specific agents). MCMs were identified from linked clinical, hospitalization, and termination records through the first postnatal year.Propensity scores were estimated using covariates selected via a directed acyclic graph, including maternal age, ethnicity, diabetes, medical indication for NSAID use, exposure to other antipyretics, obesity, smoking, folic-acid use, gravidity, perinatal care, and year of pregnancy. Generalized full matching was used to balance covariates. Adjusted risk ratios were derived using weighted Poisson regression with G-computation, and two-way cluster-robust standard errors, jointly clustering by maternal identifier and matching subclass. Sensitivity analyses included a dose–response assessment across defined-daily-dose (DDD) categories and a tipping-point analysis evaluating the impact of potential misclassification from unrecorded over-the-counter NSAID use.A total of 264,858 singleton pregnancies were included in the final cohort; 20,202 (7.6%) were exposed to NSAID, most commonly ibuprofen (5.1%), diclofenac (1.6%), and naproxen (1.2%). NSAID exposure, in total and as individual agents, was not associated with MCMs overall (8.2% versus 7.0%; matched-adjusted-Relative Risk (aRR) = 0.99 (95% CI [0.90,1.10])) or with organ-system-specific MCMs, including cardiovascular (matched-aRR = 1.05 (95% CI [0.92,1.20]), musculoskeletal (matched-aRR = 1.03 (95% CI [0.77,1.39])), central nervous system (matched-aRR = 0.77 (95% CI [0.53,1.11])), cleft palate (matched-aRR = 0.95 (95% CI [0.47–1.91])), gastrointestinal (matched-aRR = 1.03 (95% CI [0.64–1.63])), and genitourinary (matched-aRR = 0.99 (95% CI [0.72,1.35])) malformations. Dose–response analyses showed no significant association with MCMs across cumulative NSAID exposure: short-term (1–7 DDD, matched-aRR = 1.06 (95% CI [0.97,1.15]), medium-term (8–21 DDD, matched-aRR = 1.10 (95% CI [0.99,1.22]), and long-term (>21 DDD, matched-aRR = 1.24 (95% CI [0.94,1.63])). The main limitation was the potential for minor exposure misclassification due to over-the-counter availability of ibuprofen, although sensitivity analyses simulating such misclassification suggested minimal impact on the risk estimates. Conclusion In this large, population-based cohort, we found no evidence supporting an association between first-trimester exposure to NSAID and MCMs, providing reassuring evidence regarding their fetal safety in early pregnancy.

13.
arXiv (quant-ph) 2026-06-25

Majorana-Pauli stabilizer codes and duality webs of fermionic topological phases

arXiv:2606.25048v1 Announce Type: new Abstract: Stabilizer codes provide exact lattice realizations of bosonic topological orders. In contrast, systematic stabilizer descriptions of intrinsically fermionic topological phases remain much less developed. In this work, we introduce Majorana-Pauli stabilizer codes, a class of exactly solvable fermionic lattice models whose stabilizers are built from both generalized Pauli operators and Majorana operators. As a main example, we construct an exactly solvable stabilizer realization of the fermionic toric code: an intrinsically fermionic $\mathbb Z_2$ topological order in $(2{+}1)$ dimensions, using $\mathbb Z_8$ Pauli operators coupled to Majorana modes. Within this stabilizer framework, the anyons, string operators, fusion rules, and braiding statistics all follow naturally from the stabilizer algebra. More broadly, we show that the fermionic toric code belongs to a duality web generated by anyon condensation and by gauging bosonic or fermion-parity symmetries. This web connects bosonic topological orders, symmetry-enriched topological phases, and both bosonic and fermionic symmetry-protected topological phases, all within a common stabilizer description. We further show that the construction extends to all Abelian fermionic topological orders with gapped boundaries and to all supercohomology fermionic SPT phases in $(2{+}1)$ dimensions. Going beyond Majorana operators, we introduce fermionic versions of the clock and shift operators and use them to construct an exact bosonization map for $\mathbb Z_D^F$ symmetries for $D$ even. Using this, we realize a stabilizer model for a nontrivial $\mathbb Z_8^F$ fermionic SPT phase with no free-fermion analog. Altogether, these results extend the stabilizer-code paradigm to a broad class of intrinsically fermionic phases bridging fermionic quantum many-body physics to quantum error correction.

14.
arXiv (CS.LG) 2026-06-11

Quantum Occam Learning: Sample-Supported Expressibility for Circuit-Based Quantum Learning

arXiv:2606.12211v1 Announce Type: cross Abstract: A central principle in quantum machine learning is that an ansatz should be expressive enough to represent the quantum data of interest. Yet, the expressibility is statistically meaningful only insofar as it can be learned from finitely many copies of an unknown quantum state. In this work, we develop an information-theoretic Occam theory for quantum data generated by finite-size quantum circuits. For the class $S_{n,G}$ of $n$-qubit pure states preparable with at most $G$ two-qubit gates, a metric-entropy argument gives the realizable sample law $\widetilde{\Theta}(G/\epsilon^2)$ in the circuit-limited regime. For an arbitrary source $\hat{\rho}$, we introduce the best $G$-gate approximation error $d_G(\hat{\rho})$ and the approximate circuit complexity $C_\eta(\hat{\rho})$. We prove an agnostic quantum Occam theorem: with $M$ copies, one can learn up to the best $G$-gate approximation error plus a statistical penalty $\widetilde{O}(\sqrt{G/M})$. We then remove the need to know $G$ in advance through an adaptive model-selection theorem whose oracle inequality selects the circuit complexity justified by the data. Matching lower bounds yield a sample-supported expressibility law: at trace-distance accuracy $\epsilon$, $M$ samples can support only $G_supported \simeq M\epsilon^2$ gates, up to logarithmic factors and tomography saturation at $2^n$. Thus, the circuit complexity becomes an adaptive statistical resource rather than a static promise. Our framework turns bounded circuit complexity into a model-selection principle for quantum machine learning.

15.
bioRxiv (Bioinfo) 2026-06-16

Physics-Driven Zero-Shot Reconstruction of Isotropic 3D Fluorescence Microscopy under Undersampled Acquisition

Three-dimensional (3D) imaging represents the development of next generation of fluorescence microscopy. However, routine axial down-sampling makes isotropic resolution unrealistic. Here, we propose DeepUI, a physical zero-shot framework designed to achieve isotropic 3D fluorescence images from a low axial sampling rate. DeepUI fully leverages the intrinsic characteristics of 3D images through physics-guided degradation, which incorporates spatial-frequency joint learning to generate a scaled optical transfer function, combined with noise degradation and an up-sampling branch. Typically requiring just 5 minutes for training and 0.5 minutes for high-throughput and fast prediction, we demonstrate the superior performance of DeepUI to get isotropic results, and the exclusivity to axial down-sampling conditions, even in more challenging conditions, including defocused background, noise, and resolution blur.

16.
arXiv (CS.CV) 2026-06-16

CLAP: Contrastive Latent Action Pretraining for Learning Vision-Language-Action Models from Human Videos

Generalist Vision-Language-Action models remain constrained by the scarcity of robotic data relative to the abundance of human video demonstrations. Existing Latent Action Models attempt to use video data but often suffer from visual entanglement, encoding noise rather than manipulation skills. To address this limitation, we propose Contrastive Latent Action Pretraining (CLAP), a framework that first uses Act-VAE to learn an executable action-token vocabulary from robot trajectories and then aligns human visual transitions with this vocabulary through contrastive learning. This alignment maps unlabeled human videos into a physically grounded latent action space rather than reconstructing appearance. Building on the aligned tokens, we train CLAP-NTP as an autoregressive VLA using robot demonstrations and pseudo-labeled human videos, preserving instruction following and object generalization. For deployment and target-domain adaptation, we further introduce a post-training strategy that combines CLAP-RF, a Rectified Flow action head for low-latency continuous action chunk prediction, with Knowledge Matching regularization to preserve pretrained semantic knowledge during fine-tuning. Extensive experiments show that CLAP achieves strong performance against competitive baselines while enabling effective skill transfer from human videos to robotic execution.

17.
arXiv (quant-ph) 2026-06-16

Communication Complexity of Distributed Unitary Synthesis

arXiv:2511.04250v2 Announce Type: replace Abstract: We study space-bounded communication complexity for unitary implementation in distributed quantum processors, where we restrict the number of qubits per processor to ensure practical relevance and technical non-triviality. We model distributed quantum processors using distributed quantum circuits with nonlocal two-qubit gates, defining the distributed communication complexity of a unitary as the minimum number of such nonlocal gates required for its realization, up to permutations of data qubit positions. Our contributions are twofold. First, for general $n$-qubit unitaries, we improve upon the trivial $O(4^n)$ communication bound. Considering $k$ pairwise-connected processors (each with $n/k$ data qubits and $m$ ancillas), we prove the communication complexity satisfies $O\left(\max\{4^{(1-1/k)n - m}, n\}\right)$ – for example, $O(2^n)$ when $m=0$ and $k=2$ – and establish the tightness of this upper bound. We further extend the analysis to approximation models and general network topologies. Second, for special unitaries, we show that both the Quantum Fourier Transform (QFT) and Clifford circuits admit linear upper bounds on communication complexity in the exact model, outperforming the trivial quadratic bounds applicable to these cases. In the approximation model, QFT's communication complexity reduces drastically from linear to logarithmic, while Clifford circuits retain a linear lower bound. These results offer fundamental insights for optimizing communication in distributed quantum unitary implementation, advancing the feasibility of large-scale DQC systems.

18.
arXiv (CS.CV) 2026-06-16

MeshLoom: Feed-Forward Non-Rigid Registration of Mesh Sequences

We present MeshLoom, a feed-forward registration network that directly reconstructs vertex deformations across mesh sequences. Our approach advances non-rigid registration beyond existing models, which are typically constrained by costly per-instance optimization, narrow object categories, pairwise-only inputs, or merely intermediate outputs. The network is simple and efficient, registering multiple meshes within seconds. At its core lies a topology-aware encoder–decoder design. Specifically, we first introduce a topology-aware point representation that encodes the anchor (reference) mesh's topology into its per-vertex features. This representation strengthens the network's understanding of the anchor-mesh geometry and disambiguates points that are Euclidean-close yet geodesically distant. We then propose a multi-modal encoder that fuses this anchor-mesh representation with complementary cues from each frame, such as shape latents and image features. These multi-source signals are compressed into a compact global motion embedding that captures dense inter-frame correspondence. A lightweight decoder then queries this global embedding with the anchor-mesh point representation, retrieving per-vertex deformations at target timestamps. Through extensive experiments across diverse motions and object categories, we show that MeshLoom achieves state-of-the-art results on non-rigid registration. In addition, we find that our global embedding-then-query paradigm naturally enables the network to generate deformations at intermediate timestamps, which extends MeshLoom to motion interpolation and mesh morphing. Project page: https://meshloom.github.io/ .

19.
arXiv (CS.CV) 2026-06-12

Towards Effective Waste Segmentation for Automated Waste Recycling in Cluttered Background

Rapid expansion of urban areas and population growth is causing an immense increase in waste production, which demands the need for efficient and automated waste management. In this scenario, automated waste recycling (AWR) using deep learning methods can assist humans in optimal waste management. Recent deep learning approaches for AWR provide promising waste segmentation performance, however, these methods rely on large backbone networks that are inefficient for AWR systems and suffer from performance deterioration in cluttered scenes. To this end, an optimal waste segmentation network is introduced which effectively utilizes the spatial domain to capture localized structural dependencies and the spectral domain to efficiently extract global contextual relationships. This cascaded design allows the network to progressively leverage both local and global representations across complementary domains to highlight the semantic information necessary for effective segmentation of various waste objects. Furthermore, auxiliary feature enhancement module (AFEM) is introduced to enhance the target objects' boundaries and blob amplification for better segmentation in cluttered scenarios. Extensive experimentation on ZeroWaste-aug, ZeroWaste-f and SpectralWaste datasets reveals the merits of the proposed method.

20.
arXiv (CS.AI) 2026-06-16

SorryDB: Can AI Provers Complete Real-World Lean Theorems?

arXiv:2603.02668v2 Announce Type: replace Abstract: We present SorryDB, a dynamically-updating benchmark of open Lean tasks drawn from 78 real world formalization projects on GitHub. Unlike existing static benchmarks, often composed of competition problems, hillclimbing the SorryDB benchmark will yield tools that are aligned to the community needs, more usable by mathematicians, and more capable of understanding complex dependencies. Moreover, by providing a continuously updated stream of tasks, SorryDB mitigates test-set contamination and offers a robust metric for an agent's ability to contribute to novel formal mathematics projects. We evaluate a collection of approaches, including generalist large language models, agentic approaches, and specialized symbolic provers, over a selected snapshot of 1000 tasks from SorryDB. We show that current approaches are complementary: even though an agentic approach based on Gemini Flash is the most performant, it is not strictly better than other off-the-shelf large-language models, specialized provers, or even a curated list of Lean tactics.

21.
arXiv (CS.LG) 2026-06-25

A Spectral Phase Diagram for Binary Few-Shot Classification: Intrinsic Dimensionality, Geometric Saturation, and Representational Diagnosis

作者:

arXiv:2606.24903v1 Announce Type: new Abstract: Deciding when to stop collecting labeled examples is a fundamental but undertheorized problem in applied machine learning. The saturation index $S(K) = \operatorname{erank}(\widehat{\Sigma}_W^{(K)}) / K$ measures the ratio of the effective rank of the pooled within-class sample covariance to the shot count; we prove it falls below a threshold precisely when the covariance estimator is well-concentrated around the population covariance and the linear discriminant has stabilized. The index is computable in $O(d^3)$ time from support features alone, requiring no test labels or trained classifier. Evaluated across $N = 246$ doubling-pair observations from seventeen binary tasks and six datasets, sixteen of seventeen tasks have a positive within-task Spearman correlation between $S(K)$ and marginal accuracy gain (median $\rho = 0.811$). The pooled Spearman correlation is $\rho = 0.548$ ($p = 1.1 \times 10^{-20}$, $N = 246$). A three-phase diagram (exploration, transition, saturation) with mean marginal gains of $3.48\%$, $2.40\%$, and $0.82\%$ is supported by all pairwise significance tests ($p \leq 0.008$). As a binary stopping rule, the index achieves AUC $= 0.752$, providing meaningful probabilistic guidance for annotation decisions. Asymptotic effective rank and peak accuracy show no significant monotone relationship across tasks (Spearman $r_s = 0.380$, $p = 0.133$, $N = 17$). A small saturation index paired with low accuracy diagnoses representational inadequacy. All results are for binary classification with a fixed linear classifier; extensions to $N$-way settings and pretrained backbone representations are discussed as future work.

22.
arXiv (CS.CV) 2026-06-25

SurgAtlas: A Large-Scale Surgical Video-Language Dataset with 2,391 Hours of Open and Minimally Invasive Surgery

We introduce SurgAtlas, the largest surgical video-language dataset to date, comprising 15,291 videos (2,391 hours) spanning 18 surgical specialties and over 5,000 procedure types, sourced entirely from publicly available YouTube content. SurgAtlas is also the first surgical video-language dataset to include open surgery at scale, with 6,182 open procedure videos alongside over 9,000 minimally invasive recordings, and the first to establish standardized benchmarks for open-surgery video understanding. We additionally provide an expert-validated subset with verified visual question-answer pairs across diverse open and minimally invasive procedures, serving as a clinically grounded benchmark for surgical reasoning. Compared with existing surgical video-language datasets, SurgAtlas provides one of the most diverse annotation schemas, combining segment-level captions, step- and phase-level descriptions, video-level surgical descriptions, and reasoning-oriented question-answer pairs organized within a hierarchical taxonomy. These annotations are constructed through an automated multi-tier pipeline with LLM-based enrichment and a staged VQA generation framework with explicit groundedness verification. The scale and diversity of SurgAtlas enable training surgical foundation models with broad procedural coverage: we finetune Qwen3-VL-8B through a two-stage captioning-then-instruction pipeline and achieve competitive or state-of-the-art results on multiple established surgical benchmarks, including phase recognition, triplet detection, and reasoning question answering. More broadly, SurgAtlas provides a large native public video corpus that can support future large-scale pretraining of multimodal surgical AI systems and contribute to the development of next-generation foundation models for surgery.

23.
arXiv (CS.AI) 2026-06-15

The Perceived Fragility of Explanations in Audio Models: Manipulation of Attribution with Unchanged Predictions

arXiv:2606.14466v1 Announce Type: cross Abstract: This paper investigates the fragility of post-hoc explanation methods in audio deepfake detection. While previous work on explanation manipulation focused on images using standard $L_p$ metrics, we introduce a psychoacoustic framework that optimizes inaudible perturbations to decouple model attributions from final classifications. We evaluate this vulnerability across state-of-the-art architectures under strict prediction-preserving constraints. By evaluating the manipulation cost through domain-specific perceptual audio quality metrics alongside explanation alignment criteria, our framework demonstrates that an adversary can systematically distort automated explanation heatmaps while preserving the predicted deepfake label. Full code available at: https://github.com/cncPomper/Audio-XAI

24.
arXiv (CS.CV) 2026-06-17

Qwen-RobotManip Technical Report: Alignment Unlocks Scale for Robotic Manipulation Foundation Models

Foundation models in language and multimodality achieve strong generalization by aligning heterogeneous data under a unified formulation and training at scale. In this report, we investigate whether this scaling recipe can be applied to robotic manipulation to achieve genuine generalization. This is challenging because, unlike text, manipulation data is heterogeneous by nature, expensive to collect, and narrow in diversity, making alignment and scale simultaneously difficult. We present Qwen-RobotManip, a generalizable Vision-Language-Action foundation model built on Qwen-VL. Qwen-RobotManip introduces a unified alignment framework across the representation, motion, and behavioral dimensions of manipulation, making large-scale multi-source training coherent rather than conflicting. This alignment capability in turn enables Qwen-RobotManip to absorb manipulation data at a scale that prior training regimes could not sustain. A human-to-robot synthesis pipeline converts egocentric hand demonstrations into robot trajectories across 15 platforms, and a rigorous curation pipeline harmonizes heterogeneous datasets. Using only open-source datasets and human videos without proprietary data collection, Qwen-RobotManip constructs a ~38,100-hour pretraining corpus and exhibits emergent generalization capabilities, including zero-shot instruction following, robustness to perturbations, reactive error recovery, and cross-embodiment transfer. We find that standard benchmarks fail to capture pretraining quality and instead adopt OOD settings including RoboCasa365, LIBERO-Plus, EBench, RoboTwin-Clean2Rand, RoboTwin-IF, and RoboTwin-XE. Qwen-RobotManip substantially outperforms prior state-of-the-art models, including $\pi$0.5, across all OOD settings, ranks 1st in RoboChallenge with a 20% relative improvement, and is validated on real-robot platforms including AgileX ALOHA, Franka, UR, and ARX.

25.
Nature (Science) 2026-06-24

Daily briefing: Sperm whales have different dialects

作者:

Whales in different areas of the Mediterranean use varying patterns of clicks and pauses. Plus, a technique to make protein samples one billion times bigger and the science of grief. Whales in different areas of the Mediterranean use varying patterns of clicks and pauses. Plus, a technique to make protein samples one billion times bigger and the science of grief.