Academic Intelligence · Curated Daily

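Explore the Frontiers of Global Research

AcademicHub aggregates real-time literature from top journals and preprint platforms. Customize your own research radar and use large language models to automatically generate cross-disciplinary literature analysis briefs.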

01.
medRxiv (Medicine) 2026-06-15

Multidimensional nutritional assessment in Crohn's disease: cross-sectional comparison of active disease and remission

Malnutrition is common in Crohn's disease (CD), and its assessment requires multiple tools. Comprehensive evaluation of nutritional status in a population with CD, predominantly characterized by metabolic phenotype, has been inadequately reported. This study evaluated the nutritional status of CD patients using anthropometric, clinical, and biochemical measures and compared patients with active disease with those in remission. This cross-sectional study included 127 adults with CD: 63 with active disease and 64 in remission. Disease activity was classified using the Crohn's Disease Activity Index, the Simple Endoscopic Score for Crohn's Disease, and magnetic resonance enterography. Nutritional assessment included body mass index (BMI), mid-upper arm circumference, calf circumference, triceps skinfold thickness, mid-arm muscle circumference, Mini Nutritional Assessment-Short Form (MNA-SF), and biochemical markers including hemoglobin, serum iron, folate, vitamin B12, albumin, and zinc. Malnutrition was defined using the Global Leadership Initiative on Malnutrition criteria. Overall, 47.2% of participants were malnourished. Malnutrition was significantly more frequent in active disease than in remission (81.0% vs. 14.1%, P
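The reported percentages can be cross-checked against the group sizes. A minimal sketch, assuming each percentage is a rounded share of whole patients in its group (the variable names are illustrative, not from the study):

```python
# Group sizes and malnutrition rates as reported in the abstract.
n_active, n_remission = 63, 64
rate_active, rate_remission = 0.810, 0.141

# Convert the reported percentages back to whole-patient counts
# (assumption: the percentages are rounded shares of each group).
cases_active = round(n_active * rate_active)
cases_remission = round(n_remission * rate_remission)

# Pool both groups to recover the overall malnutrition rate.
overall = (cases_active + cases_remission) / (n_active + n_remission)
print(cases_active, cases_remission, f"{overall:.1%}")  # 51 9 47.2%
```

The implied counts (51 and 9 patients) pool to 60 of 127, which matches the overall 47.2% figure.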

02.
arXiv (CS.CV) 2026-06-12

Efficient, Robust, and Anti-Collusion Fingerprinting of Image Diffusion Models

Model fingerprinting, embedding user-specific identifiers (fingerprints) into generated outputs, has recently emerged as a popular solution to protect the intellectual property rights (IPR) of generative text-to-image (T2I) models and prevent unauthorized redistribution. In this work, we reveal a previously unexplored systematic vulnerability in existing generative model fingerprinting methods: they lack robustness against collusion attacks, where multiple attackers combine their models to remove or obscure the fingerprints. To address this issue, we take the first step towards a robust fingerprinting method for T2I models with anti-collusion capabilities. The proposed method encodes strings of bits, namely fingerprints, into the coefficients of a personalized normalization module (PNM) incorporated into T2I models, so that fingerprints can be reliably recovered from any generated image. To defend against collusion attacks and prevent unauthorized model redistribution, we introduce an anti-collusion mechanism based on lossless function-invariant parameter transformations. This mechanism significantly degrades the image generation quality of colluded models, making them effectively unusable. Moreover, our method allows developers to efficiently create multiple copies of fingerprinted T2I models by reparameterizing the PNM without the need for retraining. We also introduce a worst-case optimization strategy to improve robustness against model-level attacks. Our experiments demonstrate that the proposed method achieves high fidelity and robustness across multiple T2I image generation and editing tasks, with fingerprint extraction accuracy exceeding 99.5%. Compared with existing methods, our method demonstrates, for the first time, a notable proactive robustness to collusion attacks by significantly increasing the FID of colluded models.
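The abstract does not say which function-invariant transformation is used. As a hedged illustration only, the sketch below takes hidden-unit permutation, one well-known transformation that leaves a network's function unchanged, and shows why naive weight averaging of two differently permuted copies no longer reproduces the original model. The tiny MLP, its shapes, and the averaging "attack" are all assumptions for illustration, not the paper's PNM or its mechanism.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 4)), rng.normal(size=8)  # hidden layer
W2 = rng.normal(size=(3, 8))                          # output layer

def mlp(x, W1, b1, W2):
    """Two-layer ReLU network."""
    return W2 @ np.maximum(W1 @ x + b1, 0.0)

# Permute the hidden units: rows of W1/b1 and matching columns of W2.
perm = np.roll(np.arange(8), 1)  # a fixed non-identity permutation
W1p, b1p, W2p = W1[perm], b1[perm], W2[:, perm]

x = rng.normal(size=4)
same = np.allclose(mlp(x, W1, b1, W2), mlp(x, W1p, b1p, W2p))

# Naive collusion (assumed attack): average the weights of the two copies.
W1a, b1a, W2a = (W1 + W1p) / 2, (b1 + b1p) / 2, (W2 + W2p) / 2
changed = not np.allclose(mlp(x, W1, b1, W2), mlp(x, W1a, b1a, W2a))

print(same, changed)
```

Each copy computes the same function on its own, but the averaged weights mix unrelated hidden units, so the colluded model's output drifts away from the original.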

03.
arXiv (CS.CL) 2026-06-16

Pretraining A Large Language Model using Distributed GPUs: A Memory-Efficient Decentralized Paradigm

Pretraining large language models (LLMs) typically requires centralized clusters with thousands of high-memory GPUs (e.g., H100/A100). Recent decentralized training methods reduce communication overhead by employing federated optimization; however, they still need to train the entire model on each node, remaining constrained by GPU memory limitations. In this work, we propose SParse Expert Synchronization (SPES), a memory-efficient decentralized framework for pretraining mixture-of-experts (MoE) LLMs. SPES trains only a subset of experts per node, substantially lowering the memory footprint. Each node updates its local experts and periodically synchronizes with other nodes, eliminating full-parameter transmission while ensuring efficient knowledge sharing. To mitigate limited per-expert data utilization under sparse expert updates, we introduce an expert-merging warm-up strategy, where experts exchange knowledge early in training, to rapidly establish foundational capabilities. With SPES, we train a 2B-parameter MoE LLM using 16 standalone 48GB GPUs over internet connections, which achieves competitive performance with centrally trained LLMs under similar computational budgets. We further demonstrate scalability by training a 7B model from scratch and a 9B model upcycled from a dense checkpoint, both of which match prior centralized baselines. Our code is available at https://github.com/zjr2000/SPES.

04.
arXiv (CS.AI) 2026-06-18

Caring Without Feeling: Affective Dynamics as the Control Layer of Human-AI Agent Collaboration

arXiv:2606.18259v1. AI agents that plan, retain memory across sessions, invoke external tools and act with partial autonomy are transforming human–AI collaboration. Research on affective computing, simulated empathy in large language models, trust in automation and AI safety has illuminated important design principles, yet these literatures remain fragmented. No integrated account explains how affective cues operate within agentic collaboration – settings in which humans delegate, monitor and correct consequential tasks. This Review synthesises computational and interactional mechanisms of affective dynamics: the processes through which affective cues, emotion-like behaviour and perceived agent affect shape trust calibration, delegation decisions, error correction, dependence and governance. We trace how model-generated affective signals enter interaction loops that govern reliance, repair and oversight, and propose a framework that treats affect not as an internal property of AI but as a coordination layer through which humans and agents negotiate capability, uncertainty and responsibility. The framework provides a foundation for calibrated measurement, purposeful design and informed governance.

05.
Nature (Science) 2026-06-17

Analysis of 173,303 exomes and genomes in the Pakistan Genome Resource

Naturally occurring loss-of-function variants in human genes enable drug target discovery because they mimic pharmacological inhibition of proteins. However, the study of these genetic variants is constrained by their rarity. Sequencing of diverse populations, particularly those enriched in familial relatedness, has been postulated to promote discovery of rare genetic variants [1–3]. Here we present the Pakistan Genome Resource, a South Asian biobank with high familial relatedness comprising 173,303 participants, who collectively carry naturally occurring homozygous loss-of-function variants in 6,476 genes. We describe the genetic architecture of this population, associations between genes and biomarkers, the distribution of loss-of-function variants across molecular pathways, and recall-by-genotype studies of therapeutically relevant genes. The Pakistan Genome Resource expands the catalogue of human genetic variants, provides a comprehensive genetic reference resource for the Pakistani population, and demonstrates the value of studying diverse cohorts to advance human health. The Pakistan Genome Resource compiles biobank data from 173,303 individuals with high familial relatedness, broadening the catalogue of human genetic variation and establishing a population-specific genomic reference for Pakistan.

06.
arXiv (math.PR) 2026-06-11

Hierarchical Random Measures without Tables

arXiv:2505.02653v2. The hierarchical Dirichlet process is the cornerstone of Bayesian nonparametric multilevel models. Its generative model can be described through a set of latent variables, commonly referred to as tables within the popular restaurant franchise metaphor. The latent tables simplify the expression of the posterior and allow for the implementation of Gibbs sampling algorithms to approximately draw posterior samples. However, managing their assignments can become computationally expensive, especially as the size of the dataset and the number of levels increase. In this work, we identify a prior for the concentration parameter of the hierarchical Dirichlet process that (i) induces a quasi-conjugate posterior distribution, and (ii) removes the need for tables, leading to more interpretable expressions for the posterior, with both a scalable and an exact algorithm to sample from it. Remarkably, this construction extends beyond the Dirichlet process, leading to a new framework for defining normalized hierarchical random measures and a new class of algorithms to sample from their posteriors. The key analytical tool is the independence of multivariate increments, that is, their representation as completely random vectors.

07.
arXiv (quant-ph) 2026-06-16

Trainable Quantum Channels as Computational Primitives for Quantum Learning

arXiv:2606.15808v1. Variational quantum learning is traditionally constrained to unitary dynamics, often treating quantum channels as detrimental noise. In this work, we reformulate the quantum channels as trainable computational primitives and establish a non-unitary quantum machine learning framework grounded in open-system dynamics. We demonstrate that the outputs of channel-enhanced quantum models form a structured superposition of multiple functional components. Each component is governed by an effective observable whose spectrum can be adaptively modulated during training, a significant departure from the spectral invariance in unitary transformations. Moreover, the proposed framework generalizes conventional unitary quantum models by retaining them as a special case while introducing additional non-unitary degrees of freedom. Furthermore, we reveal that trainable quantum channels enrich the optimization geometry through ensemble-averaged gradient and additional optimization directions induced by the Kraus operators. Empirical evaluations on classification tasks using trainable amplitude-damping and phase-damping channels confirm enhanced optimization dynamics and predictive performance. Our work provides a principled approach for leveraging quantum channels as trainable resources and advances the design of high-performance quantum learning architectures.
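For readers unfamiliar with the channels named above, here is a minimal sketch of the standard single-qubit amplitude-damping channel in Kraus form. The parameter name `gamma` and the idea of treating it as the trainable quantity are assumptions for illustration; the paper's actual parameterization is not given in the abstract.

```python
import numpy as np

def amplitude_damping(rho, gamma):
    """Apply the amplitude-damping channel with decay probability gamma."""
    k0 = np.array([[1.0, 0.0], [0.0, np.sqrt(1.0 - gamma)]])
    k1 = np.array([[0.0, np.sqrt(gamma)], [0.0, 0.0]])
    # Kraus completeness K0^T K0 + K1^T K1 = I (operators are real),
    # which guarantees that the trace of rho is preserved.
    assert np.allclose(k0.T @ k0 + k1.T @ k1, np.eye(2))
    return k0 @ rho @ k0.T + k1 @ rho @ k1.T

plus = np.full((2, 2), 0.5)  # density matrix of the pure state |+>
out = amplitude_damping(plus, gamma=0.3)

purity_in = np.trace(plus @ plus)
purity_out = np.trace(out @ out)
print(np.trace(out), purity_in, purity_out)
```

A unitary map would leave the purity at 1; the channel lowers it (to about 0.895 here) while keeping the trace at 1, a simple instance of the non-unitary freedom the abstract refers to.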

08.
arXiv (CS.LG) 2026-06-15

The Risk Shadow of Principal Component Analysis: When 99.9999% Variance Preservation Causes Catastrophic Decision Errors

arXiv:2606.14533v1. Principal Component Analysis (PCA) preserves variance, not the information needed to detect rare catastrophic events. This paper proves the existence of a "Risk Shadow": PCA can retain over 99.9999 percent of total variance while completely erasing all signal about rare, high-impact failures. When this happens, even the best possible classifier operating on the PCA representation reduces to a constant predictor. The root cause is a fundamental mismatch between variance maximization and tail risk awareness. To break the shadow, we introduce Expectile PCA (ExPCA) and Tail-Preserving PCA (TP-PCA), two methods that reweight the data covariance toward high-impact events. We prove theoretically that ExPCA strictly outperforms PCA in retaining rare-event information, and we validate our claims on synthetic data and a real-world credit card fraud detection benchmark. Our results call for a fundamental rethinking of variance-based dimensionality reduction in high-stakes decisions.
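The effect described above can be reproduced on toy data. The sketch below is an assumed construction, not the paper's proof or its benchmark: one high-variance feature that is independent of the label, and one low-variance feature that fully encodes a rare event. It does not implement ExPCA or TP-PCA.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
y = rng.random(n) < 0.001                    # rare events, about 0.1% of rows
nuisance = rng.normal(scale=1000.0, size=n)  # huge variance, label-independent
signal = y.astype(float)                     # tiny variance, fully informative
X = np.column_stack([nuisance, signal])

# Plain PCA via eigendecomposition of the sample covariance.
Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))  # ascending order
retained = eigvals[-1] / eigvals.sum()       # variance kept by one component
z = Xc @ eigvecs[:, -1]                      # 1-D PCA representation

# How much of the rare-event label survives in the PCA score?
corr = abs(np.corrcoef(z, y.astype(float))[0, 1])
print(f"variance retained: {retained:.8f}, |corr(z, y)|: {corr:.4f}")
```

The single retained component keeps more than 99.9999% of the variance yet is essentially uncorrelated with the rare-event label, while the discarded direction separates the events perfectly.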

09.
arXiv (CS.AI) 2026-06-11

Learning to Inject: Automated Prompt Injection via Reinforcement Learning

arXiv:2602.05746v2. Prompt injection is a critical vulnerability in LLM agents, yet the strongest methods still rely on human red-teamers and hand-crafted prompts. Adapting automated jailbreak optimizers does not close this gap: jailbreaks shape models toward generic compliance, while prompt injection requires emitting specific tool calls with correct parameters. The success signal is binary, and randomly sampled suffixes almost never trigger it, so standard optimizers have no gradient to follow. We present AutoInject, a black-box reinforcement learning (RL) framework that learns adversarial suffixes for prompt injection. A learned comparison-based reward scores each candidate against the best suffix seen so far, turning the binary signal into a dense reward suitable for RL optimization. The framework supports both online query-based attacks and offline-trained transferable suffixes that need no utility access at deployment, and incorporates a utility objective when task-completion feedback is available. On AgentDojo, AutoInject outperforms template attacks, GCG, TAP, and adaptive attack across production models, with statistically significant improvements under McNemar's test with p

10.
arXiv (CS.CV) 2026-06-19

SSD: Spatially Speculative Decoding Accelerates Autoregressive Image Generation

Autoregressive models excel in visual generation by treating images as 1D sequences of discrete tokens, mirroring language modeling. However, this flattening discards the intrinsic 2D spatial locality of visual signals, creating severe computational bottlenecks during inference. We introduce Spatially Speculative Decoding (SSD), a framework that aligns the predictive objective with the natural geometry of images. Rather than predicting only the immediate next token in a 1D sequence, our model simultaneously predicts the adjacent horizontal token and the token directly below it. By capitalizing on this 2D spatial correlation, spatially speculative decoding overcomes the memory wall in visual inference. Our approach accelerates autoregressive image generation by up to 13.3x while maintaining high fidelity on DPG-Bench and GenEval. Our results suggest that respecting the underlying geometry of vision unlocks massive computational efficiencies, paving the way for real-time, high-resolution autoregressive generative models.

11.
arXiv (quant-ph) 2026-06-16

Controlled Quantum Metrology with Anisotropic Heisenberg Spin Interactions under Intrinsic Decoherence

arXiv:2606.16918v1. We theoretically investigate quantum parameter estimation in a two-qubit anisotropic Heisenberg spin system with Dzyaloshinskii-Moriya (DM) interaction in the presence of intrinsic decoherence described by the Milburn model. Using the Quantum Fisher Information (QFI), we study the estimation of both the uniform magnetic field and the DM interaction strength. Analytical expressions for the time-evolved density matrix are obtained and used to explore the effects of exchange anisotropy, intrinsic decoherence, and probe-state preparation on the achievable estimation precision. Our results show that suitable tuning of the anisotropic exchange coupling and the initial entangled state can considerably enhance the estimation performance, with different optimal parameter regimes emerging for magnetic-field and DM-interaction sensing. To better understand the role of quantum resources in metrology, we also examine the behaviour of concurrence, quantum coherence, and von Neumann entropy. Overall, our findings demonstrate that anisotropic Heisenberg spin systems with DM interaction provide a promising and flexible platform for high-precision quantum metrology even in the presence of intrinsic decoherence.

12.
arXiv (CS.CL) 2026-06-16

Evaluating and Preserving Lexical Stress in English-to-Chinese Speech-to-Speech Translation

Speech-to-speech translation (S2ST) systems have achieved impressive progress in semantic accuracy and speech naturalness. However, the cross-lingual transfer of lexical stress, a vital cue for emphasis and speaker intent, remains heavily underexplored, compounded by a lack of reliable automatic evaluation metrics for tonal languages like Chinese. We investigate English-to-Chinese S2ST stress transfer by constructing a stress-annotated Chinese dataset and an XLS-R-based Mandarin stress detector. Integrating this with the English EmphAssess system, we propose a novel objective metric for cross-lingual stress evaluation. Furthermore, we fine-tune CosyVoice3 to build a stress-aware S2ST system. Experiments demonstrate that our proposed S2ST architecture significantly outperforms existing systems in stress translation capability while maintaining competitive translation quality. Furthermore, our evaluation metric exhibits a strong correlation with human subjective judgments.

13.
arXiv (CS.LG) 2026-06-19

Semantic-Anchored Evidential Fusion for Domain-Robust Whole-Slide Survival Analysis

arXiv:2606.19966v1. Whole-slide images (WSIs) are widely used for computational cancer prognosis. However, most existing methods primarily focus on in-domain performance and fail to generalize across clinical centers. This limitation stems from their reliance on pixel-derived representations that are highly susceptible to domain-specific artifacts caused by staining protocols and scanner hardware. We hypothesize that high-level pathology semantics, such as tumor grade and micro-environmental architecture, provide a domain-invariant semantic representation that mirrors the robust diagnostic logic of human pathologists. Therefore, we propose a Semantic-Anchored Evidential Fusion Survival (SAEFS) framework, where SAEFS derives semantic anchors from WSIs via Visual Question Answering (VQA), employs a dual-stream WSI evidence extraction architecture, uses Dirichlet-based Subjective Logic to model uncertainty, and fuses semantic and visual evidence through a cautious conjunction rule to avoid overconfident fusion from correlated sources. Trained exclusively on one source domain and evaluated zero-shot across four unseen domains, SAEFS consistently outperforms state-of-the-art models both in prediction accuracy and reliability, improving the average C-index by 10.2%. Quantitative analyses further show that VQA-derived semantic features exhibit significantly lower cross-center divergence than pixel-derived features, highlighting their robustness for cross-center clinical applications.
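The C-index quoted above is the usual ranking metric for survival models. A minimal sketch of Harrell's concordance index on made-up data follows; the toy values are illustrative, and the paper's exact evaluation protocol (for example, whether the 10.2% gain is relative or absolute) is not stated in the abstract.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when the subject with the shorter time
    had an observed event. It is concordant when that subject also has
    the higher predicted risk. Ties in predicted risk count as 0.5.
    """
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        for j in range(len(times)):
            if times[i] < times[j] and events[i]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable

times = [5, 8, 12, 20]          # follow-up time per patient (toy values)
events = [1, 0, 1, 1]           # 1 = event observed, 0 = censored
risks = [0.9, 0.4, 0.1, 0.7]    # model-predicted risk scores
c = concordance_index(times, events, risks)
print(c)  # 0.75
```

Four pairs are comparable here and three are ordered correctly, giving 0.75; a value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.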

14.
arXiv (quant-ph) 2026-06-19

Subsystem Quantum Error Correction for Noisy Quantum Metrology

arXiv:2606.19628v1. Quantum error correction has been successfully applied to enhance the precision of parameter estimation in the presence of noise. Nonetheless, existing methods require a number of noiseless, controllable ancillae and lack efficient encoding and decoding procedures. In this Letter, we demonstrate that subsystem error correction provides a new direction that can substantially simplify the metrological protocol. We derive general conditions under which subsystem stabilizer codes achieve the Heisenberg limit and show that, for broad classes of noise, this can be realized by syndrome-free protocols using at most a single ancilla qubit. Furthermore, we extend this framework to dynamical error correction and show that Floquet codes can protect time-dependent metrological signals in reaching the Heisenberg limit.

15.
arXiv (CS.CL) 2026-06-17

PARSE: Provenance-Aware Retrieval Sanitization for Professional Domain LLM Agents

Prompt injection defenses evaluated on synthetic benchmarks do not generalize to real enterprise documents, which are longer, denser, and interleave legitimate authority language with factual content. We demonstrate this gap with a real-document benchmark of 122 tasks across five professional domains (financial, legal, medical, scientific, DevOps) using actual SEC filings, Federal Register rules, PubMed abstracts, arXiv papers, and GitHub postmortems. Paraphrasing, the strongest defense on synthetic benchmarks, shows no statistically significant attack success rate reduction on real documents (p=0.500) while degrading utility from 91.8% to 82.8%. We introduce PARSE (Provenance-Aware Retrieval Sanitization), a domain-aware, fact-preserving sanitization pipeline that classifies each sentence by injection likelihood, extracts structured facts before rewriting, and verifies fact preservation via a consistency-checking loop. A directiveness gate routes 59% of real enterprise documents to a lightweight path, concentrating computational cost on high-risk documents. PARSE achieves 15.6% attack success rate – a 38% reduction versus the 25.4% baseline – at 86.9% utility, the only condition that is both statistically significant (p=0.014, adequately powered) and maintains near-baseline utility. Practitioners should evaluate defenses on domain-matched real documents, not synthetic proxies.
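The headline reduction can be re-derived from the two attack success rates. A minimal sketch, where the conversion to whole-task counts assumes one attack attempt per task (an assumption not stated in the abstract):

```python
baseline_asr, parse_asr = 0.254, 0.156  # attack success rates from the abstract
n_tasks = 122

# Relative reduction in attack success rate versus the baseline.
relative_reduction = (baseline_asr - parse_asr) / baseline_asr

# Implied whole-task counts (assumption: one attack attempt per task).
baseline_hits = round(baseline_asr * n_tasks)
parse_hits = round(parse_asr * n_tasks)

print(f"{relative_reduction:.1%}", baseline_hits, parse_hits)  # 38.6% 31 19
```

Under that assumption the rates correspond to 31 and 19 successful attacks out of 122, and the relative reduction of about 38.6% is in line with the roughly 38% quoted.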

16.
arXiv (CS.LG) 2026-06-11

Probabilistic Contrastive Pretraining for Multi-task ADME Property Prediction

arXiv:2606.11508v1. Accurate prediction of absorption, distribution, metabolism, and excretion (ADME) properties is critical to drug discovery, but remains challenging because ADME endpoints are noisy, interdependent, and often data-limited. We propose a molecular graph-transformer pretraining framework that combines chemistry-specific self-supervision with contrastive mutual information machine learning (cMIM). Our method encodes molecular graphs into latent variables, reconstructs SMILES strings from the graph-derived latent codes, and augments the contrastive objective with domain-specific self-supervised chemistry tasks. Rather than treating these tasks as auxiliary regularizers with separately tuned loss weights, we formulate reconstruction, contrastive discrimination, and chemistry-specific supervision as unit-weighted log-probability factors in a single probabilistic latent-variable objective. For fine-tuning, we propose a multi-task GNN readout architecture with task-specific multilayer perceptron heads, preserving shared representation learning while mitigating negative transfer and improving the modeling of heterogeneous, nonlinear task relationships. Across Biogen, ExpansionRX, and ChEMBL-MT, the resulting Contrastive KERMT pretraining improves over the KERMT baseline by 7.6%, 9.9%, and 9.5% respectively (averaged over significantly-improved endpoints). Adding ADME-adjacent molecules to the pretraining corpus further improves transfer, and the contrastive component sharpens chemically meaningful latent neighborhoods.

17.
arXiv (CS.LG) 2026-06-19

Light Interaction: Training-Free Inference Acceleration for Interactive Video World Models

arXiv:2605.31158v3. Interactive video world models generate video chunk by chunk in response to user-controlled camera movements, enabling applications such as real-time game simulation, virtual scene navigation, and embodied AI training. However, scaling to long interactive trajectories is prohibitively expensive due to growing context memory, quadratic attention complexity, and repeated denoising steps. We present Light Interaction, a training-free inference acceleration framework for interactive video world models. Our key insight is that interaction naturally enables trajectory-dependent adaptive computation: retrieved spatial memory can be discarded during novel exploration, temporal context can be adjusted according to local latent dynamics, and early-step model outputs can be reused when the camera revisits familiar regions. Based on this insight, Light Interaction combines adaptive context management, denoising cache acceleration, and hardware-software co-designed 3D block sparse attention with fused Triton kernels. Evaluated on HY-WorldPlay and Matrix-Game-3.0, Light Interaction achieves up to 2.59x speedup without model retraining while maintaining competitive visual quality.

18.
arXiv (quant-ph) 2026-06-11

Observable signatures of exceptional points from left-right eigenstate distinction

arXiv:2606.11333v1. Non-Hermitian quantum systems exhibit qualitatively distinct physical behavior compared to Hermitian systems, a prime example being spectral singularities known as exceptional points. Their relevance in, e.g., quantum sensing, unidirectional transport, and robust lasing makes it important to be able to identify exceptional points through observable features of a many-body system. Here, using as an example a one-dimensional complex XY spin chain realizing both rotation-time RT- and parity-time PT-symmetric regimes, we develop a framework for detecting exceptional points based on the distinction between left and right eigenvectors of the Hamiltonian, which in a non-Hermitian system are no longer the adjoint of each other. We first show that a global measure constructed from the difference between the Hamiltonian and its adjoint locates exceptional points via distinct non-analytic behavior. At the level of observables, differences in local spin correlations evaluated on the right and left eigenstates provide a reliable static detection scheme. In contrast, static bipartite entanglement measures fail to capture this distinction, urging us to study the quantum dynamics of the model. Following a sudden quench, we demonstrate that the time-averaged right-left entanglement entropy difference directly encodes signatures of the exceptional point. In the RT-symmetric regime, it exhibits a pronounced peak at the exceptional point, whereas in the PT-symmetric regime it behaves as an order-parameter-like quantity, remaining finite in one phase and vanishing at the transition. Our results establish a direct link between the structure of non-Hermitian eigenstates and observable signatures of exceptional points, providing a practical route to identify them in existing quantum simulators.

19.
arXiv (CS.LG) 2026-06-17

Another Look at Log-PCA for Probability Measures: A Dynamical Formulation and Statistical Convergence

arXiv:2606.17196v1. This paper is concerned with learning principal variations of random probability measures on $\mathbb{R}^m$ under the Wasserstein geometry. We introduce a new dynamical formulation to interpret the log-PCA, a linearized principal geodesic analysis, as a variational approach. Our differentiable version, termed as the Wasserstein Tangential PCA (WT-PCA), captures the local principal modes of geodesic variations of a (weighted) probability measure on the Wasserstein space via its covariance operator at barycenter. Based on the dynamical perspective and leveraging parallel transport structure of the optimal transport problems, we derive a general statistical convergence rate of the empirical WT-PCA when estimated from data in terms of the 2-Wasserstein distance between the population and empirical barycenter reference measures.

20.
arXiv (quant-ph) 2026-06-16

A New Definition of Quantum Superposition

arXiv:2606.15607v1. The usual description of the superposition of two (pure quantum) states is ambiguous, since the binary operation of summation in a Hilbert space does not pass down to the quotient projective space. Even though Dirac noted this as early as 1930, it is often asserted that the superposition is a binary operation acting on two states with a value that is a unique state. The goal for this note is to motivate a rigorous, geometrical definition of the superposition of states in the setting of complex projective space, which has been argued elsewhere to be the natural geometric phase space for quantum theory. The upshot is that the new definition of the superposition of two pure states, viewed as two distinct points in the projective space, is the unique (complex) line on which those two points lie. Finally, a comparison is given between superposition and expansion in an orthonormal basis.
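In symbols, the definition described above can be sketched as follows. The notation is an assumption for illustration (the abstract gives none): $\mathbb{P}(\mathcal{H})$ is the projective space of rays of the Hilbert space $\mathcal{H}$, and $[\psi]$ is the ray through a nonzero vector $\psi$.

```latex
\[
  [\psi] \neq [\varphi] \in \mathbb{P}(\mathcal{H})
  \quad\longmapsto\quad
  L_{\psi,\varphi}
  = \bigl\{\, [a\psi + b\varphi] \;:\; (a,b) \in \mathbb{C}^2 \setminus \{(0,0)\} \,\bigr\}
  \cong \mathbb{CP}^1 .
\]
```

The ambiguity of the naive sum shows up directly: $\psi + \varphi$ and $\psi + e^{i\theta}\varphi$ represent the same pair of rays but, for $e^{i\theta} \neq 1$, generally different points on $L_{\psi,\varphi}$, so only the line itself is determined by the two states.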

21.
arXiv (CS.CV) 2026-06-18

How far have we gone in Generative Image Restoration? A study on its capability, limitations and evaluation practices

Generative Image Restoration (GIR) has achieved impressive perceptual realism, but how far have its practical capabilities truly advanced compared with previous methods? To answer this, we present a large-scale study grounded in a new multi-dimensional evaluation pipeline that assesses models on detail, sharpness, semantic correctness, and overall quality. Our analysis covers diverse architectures, including diffusion-based, GAN-based, PSNR-oriented, and general-purpose generation models, revealing critical performance disparities. Furthermore, our analysis uncovers a key evolution in failure modes that signifies a paradigm shift for the perception-oriented low-level vision field. The central challenge is evolving from the previous problem of detail scarcity (under-generation) to the new frontier of detail quality and semantic control (preventing over-generation). We also leverage our benchmark to train a new IQA model that better aligns with human perceptual judgments. Ultimately, this work provides a systematic study of modern generative image restoration models, offering crucial insights that redefine our understanding of their true state and chart a course for future development.

22.
arXiv (CS.AI) 2026-06-18

Skill-MAS: Evolving Meta-Skill for Automatic Multi-Agent Systems

arXiv:2606.18837v1. Large Language Model (LLM)-based automatic Multi-Agent Systems (MAS) generation has become a crucial frontier for tackling complex tasks. However, existing methods face a dilemma between model capability and experience retention. Inference-time MAS leverages frozen frontier LLMs but repeats identical searches without learning from past experience. Conversely, Training-time MAS internalizes experience via gradient updates but is constrained by the low capability ceiling of smaller models, and is hard to scale to large frontier LLMs. To bridge this gap, we propose Skill-MAS, a novel third path that decouples experience retention from parametric updates by conceptualizing the high-level orchestration capability as an evolvable Meta-Skill. Skill-MAS refines this architectural knowledge through a closed optimization loop: (1) Multi-Trajectory Rollout samples a behavioral distribution for each task under the current Meta-Skill; and (2) Selective Reflection adaptively selects priority tasks and applies hierarchical contrastive analysis to distill systemic experience into generalizable, strategy-level principles. Extensive experiments across four complex benchmarks and four distinct LLMs demonstrate that Skill-MAS not only achieves remarkable performance gains but also maintains a favorable cost-performance trade-off. Further analysis reveals that the evolved Meta-Skills are highly robust and exhibit strong transferability across unseen tasks and different LLMs.

23.
bioRxiv (Bioinfo) 2026-06-13

PertDiffBench: Benchmarking Diffusion Models for Single-Cell Perturbation Response Prediction

Diffusion models are increasingly used to predict transcriptional responses to perturbations, but whether they improve on simpler generative and representation-based baselines remains unclear. Existing evaluations often do not separate the effects of model architecture, input representation, biological context and metric choice, making it difficult to determine where diffusion-based methods are useful. Here we introduce PertDiffBench, a standardized benchmark for diffusion-based transcriptomic perturbation prediction across single-cell and bulk RNA-seq datasets. PertDiffBench evaluates diffusion-based models across three complementary evaluation settings: standard prediction in known single-cell contexts and bulk perturbation conditions, generalization to unseen cell types, species, drugs and intermediate time points, and stress tests of feature dimensionality, input representation, noise type and gene ordering. Across these settings, diffusion models did not show a consistent advantage. scGen remained a strong baseline in common prediction tasks, whereas scDiffusion was the most competitive diffusion-based method in several generalization settings. Temporal imputation showed a different pattern, with a simple DDPM operating directly in expression space outperforming more specialized models. Stress tests showed that performance was model dependent and sensitive to feature dimensionality, encoder choice, noise type and gene ordering. Pretrained encoders did not consistently improve performance, with the classical scVI representation slightly exceeding STATE in seen-condition and unseen-cell-type settings. These results indicate that diffusion-model performance in perturbation response prediction depends strongly on task design and representation choice. PertDiffBench provides a practical framework for evaluating these models under biologically varied and stress-tested conditions.

24.
arXiv (CS.AI) 2026-06-11

DiffCold: A Diffusion-based Generative Model for Cold-Start Item Recommendation

arXiv:2606.12245v1

Cold-start item recommendation remains a persistent challenge in real-world systems due to the absence of interaction histories. While prior models attempt to bridge this gap using item content features, they universally suffer from the seesaw dilemma: enhancing performance for cold items inevitably degrades performance for warm items, and vice versa. We identify that this dilemma stems from a fundamental distributional disparity: warm item embeddings occupy a complex "behavioral manifold" shaped by rich interaction signals, whereas cold item embeddings are constrained to a "semantic manifold" derived solely from auxiliary content. Existing methods often force a rigid mapping between these inconsistent spaces, causing the model to sacrifice the precision of warm representations to accommodate cold ones. To address this, we propose DiffCold, a diffusion-based generative model that unifies warm and cold representations. Unlike GANs or VAEs, DiffCold leverages conditional diffusion to reconstruct warm item embeddings from content, preserving the underlying manifold structure without degradation. We further tailor this paradigm with two specific designs: a Retrieval-enhanced Aggregator that initializes generation using semantically similar warm items to bypass inefficient noise, and a Simulation-based Representation Alignment module that enforces distribution consistency between generated and real embeddings via contrastive learning. Experiments on three benchmarks confirm that DiffCold resolves the seesaw dilemma, consistently outperforming state-of-the-art methods across all metrics.
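The abstract does not say how the Retrieval-enhanced Aggregator combines its neighbors, so the following is only a rough sketch of the general idea of starting generation from semantically similar warm items instead of pure noise. It assumes cosine similarity over content vectors and a plain mean over the top-k warm embeddings; the function names `cosine` and `retrieval_init` and the parameter `k` are hypothetical and do not come from the paper.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieval_init(
    cold_content: Sequence[float],
    warm_content: List[Sequence[float]],
    warm_embeddings: List[Sequence[float]],
    k: int = 2,
) -> List[float]:
    """Assumed aggregator: average the behavioral embeddings of the k warm
    items whose content vectors are most similar to the cold item's content."""
    ranked = sorted(
        range(len(warm_content)),
        key=lambda i: -cosine(cold_content, warm_content[i]),
    )
    top = ranked[:k]
    dim = len(warm_embeddings[0])
    return [sum(warm_embeddings[i][d] for i in top) / len(top) for d in range(dim)]


# Toy data: two warm items resemble the cold item, one does not.
warm_content = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
warm_embeddings = [[2.0, 0.0], [4.0, 2.0], [-10.0, -10.0]]
start = retrieval_init([1.0, 0.0], warm_content, warm_embeddings, k=2)
print(start)
```

In a diffusion setting, a vector like `start` would replace the pure-noise starting point of the reverse process; how much noise is still added on top of it is a design choice this sketch leaves open.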

25.
arXiv (CS.CL) 2026-06-15

ADORE: Iterative Query Expansion with Retrieval-Grounded Relevance Feedback

LLM-based query expansion improves retrieval by enriching the original query with additional context. Yet most methods remain generation-driven, producing plausible pseudo-documents or expansions without checking how the target corpus responds. This can introduce retrieval drift, amplify misleading vocabulary, or miss terms that distinguish relevant from non-relevant documents. We argue that effective expansion requires retrieval-grounded feedback, not just single-pass generation or unverified iteration. We introduce ADORE (ADapt, Observe, Relevance Evaluate), an iterative framework that turns retrieval outcomes into feedback for the next expansion. At each round, an LLM generates pseudo-passages, a retriever exposes the corpus response, and a relevance assessor evaluates retrieved documents against the original query. These judgments identify what to reinforce, what remains undercovered, and what to suppress. Across TREC Deep Learning, BEIR, and BRIGHT, ADORE consistently outperforms strong query expansion baselines with notable improvements across nearly all evaluation settings, improving average nDCG@10 by 24.5% over BM25 and 3.6% over the strongest prior query expansion method on BEIR, and by 122.9% over BM25 and 9.2% over the best query expansion baseline on BRIGHT. Our code and data are publicly available.
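The round structure described above (generate pseudo-passages, retrieve, assess, then feed the judgments into the next round) can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the authors' implementation: the term-overlap retriever, the `generate` and `assess` callables, and the `reinforce`/`suppress` term sets are all hypothetical stand-ins for the LLM, the retriever, and the relevance assessor.

```python
from typing import Callable, List, Set


def tokenize(text: str) -> List[str]:
    return text.lower().split()


def retrieve(terms: List[str], corpus: List[str], k: int) -> List[int]:
    """Toy retriever: rank documents by the number of distinct query terms they contain."""
    wanted = set(terms)
    scores = [len(wanted & set(tokenize(doc))) for doc in corpus]
    order = sorted(range(len(corpus)), key=lambda i: (-scores[i], i))
    return order[:k]


def feedback_loop(
    query: str,
    corpus: List[str],
    generate: Callable[[str, Set[str], Set[str]], str],
    assess: Callable[[str, str], bool],
    rounds: int = 2,
    k: int = 2,
) -> List[int]:
    """Expand the query, retrieve, judge the results, and reuse the judgments next round."""
    reinforce: Set[str] = set()  # terms seen in documents judged relevant
    suppress: Set[str] = set()   # terms seen only in documents judged non-relevant
    ranked: List[int] = []
    for _ in range(rounds):
        pseudo = generate(query, reinforce, suppress)
        terms = [t for t in tokenize(query) + tokenize(pseudo) if t not in suppress]
        ranked = retrieve(terms, corpus, k)
        for doc_id in ranked:
            doc_terms = set(tokenize(corpus[doc_id]))
            if assess(query, corpus[doc_id]):
                reinforce |= doc_terms
            else:
                suppress |= doc_terms
        # Never suppress the original query terms or anything already reinforced.
        suppress -= reinforce | set(tokenize(query))
    return ranked


def toy_generate(query: str, reinforce: Set[str], suppress: Set[str]) -> str:
    # Stand-in for an LLM: simply echoes the reinforced terms as a pseudo-passage.
    return " ".join(sorted(reinforce - suppress))


def toy_assess(query: str, doc: str) -> bool:
    # Stand-in for a relevance assessor: a hard-coded rule for this toy corpus.
    return "rainforest" in tokenize(doc)


corpus = [
    "jaguar car speed review",
    "jaguar cat habitat rainforest",
    "rainforest habitat conservation for big cats",
]
print(feedback_loop("jaguar habitat", corpus, toy_generate, toy_assess, rounds=1))
print(feedback_loop("jaguar habitat", corpus, toy_generate, toy_assess, rounds=2))
```

With one round, the ambiguous car review still reaches the top two; after a second round, terms taken from the document judged relevant pull the conservation document ahead of it. In this toy version the suppressed terms only filter the expansion and do not penalize documents directly.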