Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
arXiv (CS.AI) 2026-06-17

Adaptive Domain Models: Bayesian Evolution, Warm Rotation, and Principled Training for Geometric and Neuromorphic AI

arXiv:2603.18104v5 Announce Type: replace Abstract: Prevailing AI training assumes reverse-mode automatic differentiation over IEEE-754 arithmetic. The memory overhead of training relative to inference, optimizer complexity, and structural degradation of geometric properties through training are consequences of this arithmetic substrate. This paper develops an alternative training architecture grounded in three prior results: the Dimensional Type System and Deterministic Memory Management framework (Haynes 2026), which establishes stack-eligible gradient allocation and exact quire accumulation as design-time verifiable properties; the Program Hypergraph (Haynes 2026), which establishes grade preservation through geometric algebra computations as a type-level invariant; and the b-posit bounded-regime design (Jonnalagadda et al. 2025), which makes posit arithmetic tractable across hardware targets conventionally considered inference-only. Their composition enables depth-independent training memory bounded to approximately twice the inference footprint, grade-preserving weight updates, and exact gradient accumulation, applicable uniformly to loss-function-optimized and spike-timing-dependent neuromorphic models. We introduce *Bayesian distillation*, a mechanism by which the latent prior structure of a general-purpose model is extracted through the ADM training regime, resolving the data-scarcity bootstrapping problem for domain-specific training. For deployment, we introduce *warm rotation*, an operational pattern in which an updated model transitions into an active inference pathway without service interruption, with correctness formalized through PHG certificates and signed version records. The result is a class of domain-specific AI systems that are smaller and more precise than general-purpose models, continuously adaptive, verifiably correct with respect to the physical structure of their domains, and initializable from existing models.

02.
arXiv (CS.LG) 2026-06-15

Minimum Distance Summaries for Robust Neural Posterior Estimation

arXiv:2602.09161v2 Announce Type: replace-cross Abstract: Simulation-based inference (SBI) enables amortized Bayesian inference by first training a neural posterior estimator (NPE) on prior-simulator pairs, typically through low-dimensional summary statistics, which can then be cheaply reused for fast inference by querying it on new test observations. Because NPE is estimated under the training data distribution, it is susceptible to misspecification when observations deviate from the training distribution. Many robust SBI approaches address this by modifying NPE training or introducing error models, coupling robustness to the inference network and compromising amortization and modularity. We introduce minimum-distance summaries, a plug-in robust NPE method that adapts queried test-time summaries independently of the pretrained NPE. Leveraging the maximum mean discrepancy (MMD) as a distance between observed data and a summary-conditional predictive distribution, the adapted summary inherits strong robustness properties from the MMD. We demonstrate that the algorithm can be implemented efficiently with random Fourier feature approximations, yielding a lightweight, model-free test-time adaptation procedure. We provide theoretical guarantees for the robustness of our algorithm and empirically evaluate it on a range of synthetic and real-world tasks, demonstrating substantial robustness gains with minimal additional overhead.

03.
arXiv (CS.LG) 2026-06-16

Can Neural Networks Achieve Optimal Computational-statistical Tradeoff? An Analysis on Single-Index Model

arXiv:2606.15219v1 Announce Type: new Abstract: In this work, we tackle the following question: Can neural networks trained with gradient-based methods achieve the optimal computational-statistical tradeoff in learning Gaussian single-index models? Prior research has shown that any polynomial-time algorithm under the statistical query (SQ) framework requires $\Omega(d^{s^\star/2}\lor d)$ samples, where $s^\star$ is the generative exponent representing the intrinsic difficulty of learning the underlying model. However, it remains unknown whether neural networks can achieve this sample complexity. Inspired by prior techniques such as label transformation and landscape smoothing for learning single-index models, we propose a unified gradient-based algorithm for training a two-layer neural network in polynomial time. Our method is adaptable to a variety of loss and activation functions, covering a broad class of existing approaches. We show that our algorithm learns a feature representation that strongly aligns with the unknown signal $\theta^\star$, with sample complexity $\widetilde{O} (d^{s^\star/2} \lor d)$, matching the SQ lower bound up to a polylogarithmic factor for all generative exponents $s^\star\geq 1$. Furthermore, we extend our approach to the setting where $\theta^\star$ is $k$-sparse for $k = o(\sqrt{d})$ by introducing a novel weight perturbation technique that leverages the sparsity structure. We derive a corresponding SQ lower bound of order $\widetilde{\Omega}(k^{s^\star})$, matched by our method up to a polylogarithmic factor. Our framework, especially the weight perturbation technique, is of independent interest, and suggests potential gradient-based solutions to other problems such as sparse tensor PCA.

04.
arXiv (CS.LG) 2026-06-17

Multi-Adapter PPO: A Cross-Attention Enhanced Wavelength Selection Framework for LIBS Quantitative Analysis

arXiv:2606.17476v1 Announce Type: new Abstract: Laser-induced breakdown spectroscopy (LIBS) quantitative analysis faces critical challenges in wavelength selection due to high-dimensional spectral data and the fundamental trade-off between prediction accuracy and feature efficiency. This paper presents a novel Multi-Adapter PPO framework that transforms wavelength selection into a reinforcement learning problem, leveraging cross-attention mechanisms and multiple specialized adapters to capture complex spectral relationships. Our approach outperforms traditional Particle Swarm Optimization (PSO) by an average of 28.4\% in comprehensive score and 45.2\% in prediction accuracy across steel and coal datasets. The proposed method demonstrates superior performance in balancing prediction accuracy with feature efficiency, achieving state-of-the-art results in LIBS quantitative analysis while maintaining interpretability and computational efficiency. We released our code and dataset here: https://github.com/Hflying/MAPPO

05.
arXiv (CS.CV) 2026-06-12

Modality-Aware Feature Matching in Visual and Vision-Language Applications: A Comprehensive Survey

Feature matching is a cornerstone task in computer vision, essential for applications such as image retrieval, stereo matching, 3D reconstruction, and SLAM. This survey comprehensively reviews modality-based feature matching, exploring traditional handcrafted methods and emphasizing contemporary deep learning approaches across various modalities, including RGB images, depth images, 3D point clouds, LiDAR scans, medical images, and vision-language interactions. Traditional methods, leveraging detectors like Harris corners and descriptors such as SIFT and ORB, demonstrate robustness under moderate intra-modality variations but struggle with significant modality gaps. Contemporary deep learning-based methods, exemplified by detector-free strategies like CNN-based SuperPoint and transformer-based LoFTR, substantially improve robustness and adaptability across modalities. We highlight modality-aware advancements, such as geometric and depth-specific descriptors for depth images, sparse and dense learning methods for 3D point clouds, attention-enhanced neural networks for LiDAR scans, and specialized solutions like the MIND descriptor for complex medical image matching. Cross-modal applications, particularly in medical image registration and vision-language tasks, underscore the evolution of feature matching to handle increasingly diverse data interactions.

06.
arXiv (CS.LG) 2026-06-19

Sparsity, Superposition, and Forgetting: A Mechanistic Study of Representation Retention in Continual Learning

arXiv:2606.20431v1 Announce Type: new Abstract: Continual learning (CL) systems often forget previously acquired knowledge, yet the mechanisms driving forgetting remain hard to isolate in practice because real datasets entangle many factors. We present a controlled, toy-world framework that makes these mechanisms observable and testable. Using a synthetic generator-separator pipeline, we define ground-truth latent features, build tasks with tunable sparsity and overlap, and introduce measurable quantities for representation strength and superposition (directional overlap among features). We then study retention dynamics-the temporal change of representation strength by fitting sparse dynamical relations (via SINDy) between retention, superposition, and exposure history. A complementary task-level analysis based on effective rank characterizes how representational capacity is allocated across tasks. Our controlled experiments yield three takeaways. (1) Superposition tends to increase over time with transient dips at task boundaries, suggesting boundary-specific interference rather than steady drift. (2) Higher feature sparsity induces more superposition yet does not inevitably cause forgetting; when representations remain strong, forgetting can be reduced despite overlap. (3) Task-level effective rank grows with sparsity, indicating broader capacity usage under sparse regimes. Together, these results nuance the common intuition that more superposition leads to more forgetting by showing that overlap interacts with representation strength and capacity allocation. Our toy analysis provides falsifiable hypotheses and diagnostic tools for CL.

07.
arXiv (CS.LG) 2026-06-16

Conformal Candidate Certification for Offline Model-Based Optimization

arXiv:2606.15217v1 Announce Type: cross Abstract: Offline model-based optimization (MBO) proposes candidates by optimizing a surrogate trained on a fixed historical dataset. Because candidates are deliberately out-of-distribution, surrogate rankings are least reliable exactly where the optimizer is most aggressive, yet existing methods provide no per-candidate statistical certificate that a design meets a target threshold. We propose Conformal Candidate Certification (CCC), a post-hoc wrapper that attaches a calibrated one-sided lower bound to each candidate and advances only those whose bound exceeds the target. We show that entropy-regularized surrogate maximization induces a Gibbs-tilted proposal, so the same surrogate supplies importance weights for weighted conformal prediction without a separate density-ratio estimation step. In a controlled synthetic study, CCC certifies $16.7\%$ of an aggressive proposal pool with empirical coverage 0.990 at nominal 0.90, while standard conformal prediction ignoring the covariate shift collapses to 0.416 coverage.

08.
arXiv (CS.AI) 2026-06-15

TwinBI: An Agentic Digital Twin for Efficient Augmented Interactions with Business Intelligence Dashboards

arXiv:2606.13731v1 Announce Type: new Abstract: Business intelligence (BI) increasingly combines dashboard interaction with LLM-based assistance, but these two modes often fall out of sync during multi-step analysis. As users switch between direct dashboard manipulation and natural-language queries, it becomes difficult to preserve a consistent analytical state across filters, hierarchies, metrics, and chart context. We present TwinBI, an agentic digital-twin framework that couples an LLM-based agent system with an executable BI dashboard state. TwinBI unifies conversational interaction, dashboard manipulation, semantic grounding, and provenance tracking through a shared analytical state reconstructed from a unified interaction log. It also exposes artifacts such as schema views, SQL, logs, and an /insights command for state-grounded analytical summaries. We evaluate TwinBI in two complementary ways. In a controlled A/B benchmark with the same backbone agent, TwinBI improves exact-match accuracy from 43.3% to 63.3%, partial-credit accuracy from 48.3% to 70.8%, and substantially reduces timeout rate from 40.0% to 10.0% relative to Dashboard alone. In a usability study, participants benefited from the integrated dashboard-and-chat workflow, with high task accuracy, moderate workload, and favorable ratings for state-aware interaction mechanisms. These results suggest that TwinBI improves both agent-level analytical reliability and user-facing analytical support by turning visible dashboard state into richer actionable context. Our dataset and source code are available at: https://github.com/simonjisu/TwinBI

09.
arXiv (CS.AI) 2026-06-16

CoAgent: Concurrency Control for Multi-Agent Systems

arXiv:2606.15376v1 Announce Type: cross Abstract: Multi-agent LLM systems – coding agents, devops agents, document agents – now routinely run several agents in parallel against the same git tree, Kubernetes cluster, or document. As soon as two of them mutate shared state, they enter the regime classical concurrency control has studied for decades, but classical mechanisms fit LLM agents poorly. A single agent transaction spans minutes of inference, read sets are broad and opaque rather than statically inferable, and the live state agents act on admits neither fork nor buffer, so writes take effect the moment they execute. Locks block long inference intervals; OCC abort-and-retry discards minutes of work on every conflict. This paper builds concurrency control on a capability classical transactions lack: the LLM inside each agent can judge whether a conflicting write invalidates its plan, and can repair exactly the operations that depended on it. Control therefore turns advisory: the runtime informs, the agent repairs. Our protocol, MTPO (Monotonic Trajectory Pre-Order), fixes a serialization order at launch, serves each read the order-filtered value, and applies writes speculatively in place; a one-way notification asks an affected reader to re-judge and patch its plan, while the framework mechanically undoes and reorders misplaced writes through the saga-style inverse each tool registers in advance. At quiescence the run is serializable in the pre-decided order. We realize MTPO as CoAgent, toolcall middleware whose privileged ToolSmith grows footprint-declared, undoable tools online. On ten contended workloads, CoAgent stays within 5\% of serial correctness at a $1.4\times$ speedup and near-serial token cost, where 2PL and OCC surrender nearly all concurrency gains; on a bash-only target system, it grows a 25-tool library online and lifts the task pass rate from 45/71 to 63/71 at $0.80\times$ the time and $0.86\times$ the cost.

10.
arXiv (CS.LG) 2026-06-15

Shuttling Compiler for Trapped-Ion Quantum Computers Based on Large Language Models

arXiv:2512.18021v3 Announce Type: replace-cross Abstract: We present the first shuttling compiler based on large language models (LLMs) for trapped-ion quantum computers, where qubits are shuttled between segments for gate execution and qubit storage. We fine-tune pre-trained LLMs on examples from linear and branched one-dimensional shuttling architectures. Thus, we obtain a layout-independent compilation strategy that learns the required shuttling operations directly from data. Using benchmark circuits with up to 16 qubits, such fine-tuned LLMs can now generate valid schedules for shuttling architectures. Notably, we also obtain a valid schedule for a previously unseen four-way junction layout. This demonstrates that trained LLMs can generalize to layouts not encountered during training. For various architectures, LLM-based schedules improve upon state-of-the-art baseline compiler results, reducing the shuttling effort by up to 15%.

11.
arXiv (CS.AI) 2026-06-19

Global Ease of Living Index: a machine learning framework for longitudinal analysis of major economies

arXiv:2502.06866v3 Announce Type: replace-cross Abstract: The drastic changes in the global economy, geopolitical conditions, and disruptions such as the COVID-19 pandemic have impacted the cost of living and quality of life. It is essential to comprehend the long-term implications of the cost of living and quality of life in major economies. A transparent and comprehensive living index must include multiple dimensions of living conditions. In this study, we present an approach to quantifying the quality of life through the Global Ease of Living Index that combines various socio-economic and infrastructural factors into a single composite score. Our index utilises economic indicators that define living standards, which could help in targeted interventions to improve specific areas. We present a machine learning framework to address missing data for certain economic indicators in specific countries. We then curate and update the data and use a dimensionality reduction approach (Principal Component Analysis and Factor Analysis) to create the Ease of Living Index for major economies since 1970. Our work significantly adds to the literature by offering a practical tool for policymakers to identify areas needing improvement, such as healthcare systems, employment opportunities, and public safety. Our approach with open data and code can be easily reproduced and applied to various contexts, providing transparency and accessibility for ongoing research and policy development in quality-of-life assessment.

12.
arXiv (CS.AI) 2026-06-19

AAPA: Adversarially Anchored Preference Alignment for Post-Training of Large Language Models

arXiv:2509.25148v2 Announce Type: replace Abstract: Post-training alignment of large language models often combines supervised fine-tuning (SFT) on expert demonstrations with reinforcement learning (RL) from preference or verifiable feedback. SFT provides a useful behavioral anchor but can overfit to static demonstrations, whereas RL encourages exploration but may drift from expert behavior or exploit imperfect rewards. We propose AAPA (Adversarially Anchored Preference Alignment), a plug-in framework that augments existing post-training objectives with a sentence-level adversarial anchoring signal. AAPA compares policy rollouts with offline, pre-collected expert responses using a fixed lightweight discriminator, and therefore requires neither online teacher inference nor discriminator co-training during policy optimization. The same anchoring term can be added to SFT, GRPO, and CHORD while preserving their original training pipelines. Experiments on instruction-following benchmarks show that AAPA consistently improves the corresponding base objectives across model scales. In particular, the staged AAPA configuration improves over a strong GRPO baseline by 5.77\% on \texttt{Qwen3-0.6B} and 3.75\% on \texttt{Qwen3-4B}. Further analyses on response length, log-probability distributions, and discriminator variants suggest that adversarial anchoring provides a stable semantic grounding signal for preference optimization. Code is available at \url{https://github.com/IsFaqq/AAPA}.

14.
arXiv (CS.AI) 2026-06-24

Variational Model Merging for Pareto Front Estimation in Multitask Finetuning

arXiv:2412.08147v2 Announce Type: replace-cross Abstract: Pareto fronts are useful to find good task-mixing strategies for multitask finetuning, but they are also costly to compute. To reduce costs, recent works have used existing model merging methods to help train cheap surrogate models to estimate the Pareto fronts. However, no work has yet considered designing new model-merging methods to directly, and provably, improve the quality of Pareto fronts. Here, we fill this gap by proposing a new Bayesian approach called Variational Model Merging. In this approach, existing model-merging methods are obtained as special cases of "posterior-merging" when Gaussian posteriors are used and new model-merging strategies can be derived by using non-Gaussian posteriors. Our main theoretical result is to show that more flexible posteriors necessarily yield better estimates of Pareto fronts. For instance, a Pareto front estimate obtained by merging full-Gaussian posteriors is expected to be better than that obtained by using isotropic Gaussian posteriors. We validate the theory through extensive empirical results on vision and language transformers where better Gaussian families consistently yields better or comparable Pareto fronts. Our work is a rare instance where Bayesian ideas are used to improve Pareto analysis.

15.
arXiv (CS.AI) 2026-06-18

WorldLines: Benchmarking and Modeling Long-Horizon Stateful Embodied Agents

arXiv:2606.18847v1 Announce Type: new Abstract: To assist humans over extended periods in real homes, embodied agents must remember user routines, world states, and past interactions. Existing long-term memory benchmarks mainly evaluate language-centric retrieval and question answering, while embodied benchmarks often focus on short-horizon task execution without testing long-term memory use in dynamic environments. We introduce WorldLines, a project-driven benchmark for long-horizon embodied household assistance. It constructs temporally extended household traces with dialogues, actions, execution feedback, object and device state changes, and converts them into evidence-linked samples for Memory QA and Embodied Task Planning. We further propose ObsMem, an observer-grounded memory framework that maintains visibility-aware memories and action-native state trails for state-aware decisions. Experiments reveal persistent challenges in partial observability, overwritten world states, and translating long-term memory into embodied plans, while ObsMem offers a stronger reference architecture for this setting.

16.
arXiv (CS.CV) 2026-06-17

GeneralVLA-2: Geometry-Aware Reconstruction and Governed Memory for Robot Planning

Generalist vision-language-action systems need object-centric 3D evidence and reusable manipulation experience to plan reliable robot trajectories. GeneralVLA provides a hierarchical interface for converting language and RGB-D observations into 3D end-effector paths, but two bottlenecks remain. First, monocular SAM3D-style object reconstruction can hallucinate pose and unseen geometry, while manipulation benefits from stable object shape when calibrated multi-view observations are available. Second, the original KnowledgeBank mainly retrieves semantically similar snippets and appends new knowledge, which makes it difficult to control memory quality, conflicts, confidence, and geometric relevance. To address the first challenge, we introduce GeoFuse-MV3D, a geometry-prior-guided MV-SAM3D reconstruction branch that verifies external geometry cues with input-view masks, applies soft visual-hull support, performs axis-wise refinement, and fuses only geometry while preserving appearance. To address the second challenge, we upgrade KnowledgeBank into a governed long-term memory system with explicit quality, confidence, lifecycle, verifier, and conflict metadata, together with precision-oriented retrieval. Finally, we evaluate the reconstruction branch on GSO-30 and the memory module on Terminal-Bench 2.0 and SWE-Bench Verified; GeoFuse-MV3D improves over the MV-SAM3D baseline by reducing CD and LPIPS by 2.20% and 2.02% while increasing PSNR and SSIM by 2.36% and 1.03%, and KnowledgeBank improves over ReasoningBank by 4.53% on Terminal-Bench SR and 3.73% on SWE-Bench resolve rate, while reducing AS by 4.95% and 5.65%, respectively. Code: https://github.com/AIGeeksGroup/GeneralVLA-2. Website: https://aigeeksgroup.github.io/GeneralVLA-2.

17.
arXiv (CS.LG) 2026-06-24

Federated Survival Analysis in Healthcare: A Multi-Model Evaluation on Cross-Institutional Heterogeneous Breast Cancer Data

arXiv:2606.23871v1 Announce Type: new Abstract: Survival analysis is central to clinical decision-making, yet reliable time-to-event models require large, diverse cohorts that are rarely available at a single institution, while privacy regulations restrict the centralization of patient data. Federated learning (FL) offers a privacy-preserving alternative by training shared models without exchanging raw data, but its effectiveness for survival modeling under realistic, heterogeneous conditions remains insufficiently understood. This paper presents a systematic, multi-model evaluation of federated survival analysis on a cross-institutional breast cancer cohort with naturally heterogeneous distributed clients. Three representative survival models, the Cox Proportional Hazards model, DeepSurv, and Random Survival Forest (RSF), are compared across centralized, local, and federated training, and three federated optimization strategies (FedAvg, FedProx, and FedAdam) are assessed for the gradient-based models. Results show that FL consistently outperforms local training and approaches, and occasionally exceeds, centralized performance, while RSF offers the best overall balance of discrimination, calibration, and robustness across heterogeneous clients. We further find that performance depends on the diversity of client distributions, and that FedAvg and FedProx are stronger and more stable than FedAdam. Based on these findings, we derive practical, decision-oriented guidelines mapping data, privacy, interpretability, and resource constraints to recommended model and training-paradigm choices for federated survival modeling in healthcare.

18.
arXiv (quant-ph) 2026-06-16

Entanglement-Rank Duality in Quadratic Phase Quantum States

arXiv:2605.05167v2 Announce Type: replace Abstract: Absolutely maximally entangled (AME) states are fundamental resources in quantum information theory, yet their construction and certification remain a nontrivial problem. Within the family of quadratic phase quantum states, defined by symmetric matrices $P$ over finite fields $\mathbb{F}_{p^m}$, we show that the Rank-Purity Duality $\operatorname{Tr}(\rho_S^2) = |\mathbb{F}|^{-\operatorname{rk}_{\mathbb{F}}(P_{S,\bar{S}})}$ follows from additive character orthogonality and holds over all $\mathbb{F}_{p^m}$, yielding a polynomial-time AME certification criterion. For square-free dimensions $d = p_1\cdots p_r$, the Chinese Remainder Theorem induces a prime-field factorisation. This implies additivity of Rényi-2 entropy and yields sharp obstruction criteria that rule out cases such as $\operatorname{AME}(4,6)$ and constrain the open case $\operatorname{AME}(8,6)$. As a proof of concept, we construct an explicit $\operatorname{AME}(17,10001)$ state, certified across all $65{,}535$ bipartitions, demonstrating that the framework scales to large systems and previously inaccessible local dimensions.

19.
arXiv (CS.AI) 2026-06-16

TrustedARI: Towards Trust-Native Agentic Routing Infrastructure for Agentic AI

arXiv:2606.15822v1 Announce Type: new Abstract: AI agents increasingly access external models, tools, and services through Agentic Routing Infrastructure (ARI) to manage the overhead of heterogeneous interfaces and fragmented subscriptions. Yet, the architecture of ARI introduces fundamental trust risks: it obtains plaintext access to agent queries and service responses, while leaving agents unable to verify that their queries are routed to intended service providers or that requests and responses remain untampered. To address this problem, we present TrustedARI, the first trust-native agentic routing infrastructure for agentic AI. Architecturally, TrustedARI is built upon three core innovations: (i) an ARI-adapted three-party TLS handshake that enables the agent and ARI to jointly authenticate the service provider through role-specific distribution of TLS key materials; (ii) a privacy-preserving query-construction protocol that allows the agent and ARI to collaboratively construct well-formed queries without exposing their respective private inputs; and (iii) a verifiable billing protocol that supports fair usage-based settlement while preserving the integrity and confidentiality of service responses. We implemented and extensively evaluated a prototype of TrustedARI to validate its performance. Experiments confirm that TrustedARI is highly efficient: our ARI-adapted handshake protocol reduces communication overhead by 39.34% compared to the existing three-party TLS handshake. Furthermore, the privacy-preserving query-construction protocol imposes negligible overhead-averaging 0.19 seconds in computation time and 0.58 MB in communication costs-while the verifiable billing protocol speeds up proof generation by 28.20x. Crucially, TrustedARI is readily deployable without any modification to the service providers.

20.
arXiv (CS.AI) 2026-06-17

When Rules Learn: A Self-Evolving Agent for Legal Case Retrieval

arXiv:2606.17220v1 Announce Type: new Abstract: Legal case retrieval remains challenging due to the complexity of legal language and the need for precise lexical alignment between queries and relevant cases. Although dense retrieval models have achieved notable progress, empirical studies show that BM25 continues to serve as a strong baseline in this domain. It motivates us to propose a self-evolving framework for rule-driven query rewriting that enhances BM25 without any parameter training. The framework equips an LLM-based agent with an automatic evaluation environment, enabling it to iteratively create rewriting rules, plan validation experiments over rule combinations, and eliminate ineffective rules based on historical feedbacks. We evaluate our method on the Chinese legal case retrieval benchmark LeCaRD-v2. Experimental results demonstrate that the proposed framework outperforms non-evolutionary baselines, including human-designed rules and greedy rule selection, particularly when powered by a highcapacity core LLM. We also conduct detailed analyses to investigate the mechanisms underlying self-evolution. Our findings reveal that LLM's capabilities to leverage previous experimental results and its intrinsic knowledge of rule elimination play critical roles in refining the rule set via self-evolution.

21.
arXiv (CS.LG) 2026-06-19

Alzheimer's Disease Diagnosis using a Multimodal Approach with 3D MRI and PET

arXiv:2606.20037v1 Announce Type: new Abstract: Alzheimer's disease (AD) is an irreversible neurodegenerative disorder and a leading cause of death worldwide. Early diagnosis plays an important part especially at the Mild Cognitive Impairment stage, where timely intervention can help slow its progression before it advances to AD. Neuroimaging data, like Magnetic Resonance Imaging (MRI) and Positron Emission Tomography (PET) scans, can help detect brain changes early by providing structural and functional brain changes related to the disease. Yet, many multimodal models still fuse MRI and PET with static concatenation and apply identical computation to all subjects, which limits robustness to patient/site heterogeneity and can waste computation. To address these limitations, we present the first study of combining 3D convolutional feature extractors with three fusion strategies - concatenation, Gated Multimodal Unit (GMU), and gated self-attention - and a sparsely gated Mixture-of-Experts (MoE) classifier that performs input-adaptive routing, activating only the most informative experts per case. Finally, we utilize Grad-CAM to visualize disease-related regions, ensuring model interpretability. Experiments are performed across three binary classification tasks (NC vs. MCI, MCI vs. AD, and NC vs. AD). Results show that GMU achieves accuracies of 80.46 % (NC vs. MCI) and 95.47 % (NC vs. AD), while gated self-attention attains 82.08 % on MCI vs. AD. Ablations show that removing the MoE consistently degrades accuracy across all tasks. These findings underscore the value of input-adaptive, multimodal modeling for AD diagnosis by leveraging the complementary nature of MRI and PET.

22.
arXiv (CS.LG) 2026-06-17

Beyond IGO-Flow: Toward Convergence Analysis of IGO in Continuous Spaces

arXiv:2606.17523v1 Announce Type: cross Abstract: Information-Geometric Optimization (IGO) provides a unified framework for black-box optimization by interpreting the adaptation of a search distribution as a natural gradient update. Despite its conceptual importance, the convergence theory of IGO remains limited: most existing results concern continuous-time idealizations such as the IGO flow, rather than discrete-time updates with non-infinitesimal learning rates. In this paper, we study discrete-time IGO in continuous spaces, formulated as natural gradient updates in the expectation-parameter coordinates of an exponential family. In particular, we analyze IGO over the multivariate Gaussian family on strongly convex quadratic objective functions. Our analysis covers a setting that simultaneously incorporates full covariance adaptation, a fixed positive learning rate, and quantile-based weights. In this setting, we prove that the covariance matrix converges to the zero matrix. We further show that the mean vector converges to the global optimum, provided that the condition number of the appropriately scaled covariance matrix is bounded at sufficiently frequent iterations. These results advance the convergence theory of IGO and help bridge the gap between the mathematical theory of IGO and practical covariance-adaptive search methods such as CMA-ES.

23.
arXiv (quant-ph) 2026-06-16

Quantum coherence and Leggett-Garg inequality

arXiv:2606.15717v1 Announce Type: new Abstract: In this paper, we attempt to establish the relationship between quantum coherence and the violation of the Leggett-Garg inequality. In particular, employing the Lindblad equation, we obtain the pseudo-density matrix for a damping system to study the effect of environment interaction on the violation of this inequality in a two-state quantum system. It is shown that the violation of the Leggett-Garg inequality can be observed as long as temporal evolution does not induce decoherence. This statement is independent of the initial state of the system. Furthermore, similar to the Horodecki criterion for the CHSH inequality (R. Horodecki et al. Phys. Lett. {\bf A200}, 340), we study necessary and sufficient conditions for violating the Leggett-Garg inequality. Hereby, under the circumstance that the inequality violation occurs, an upper bound for the time interval between consecutive measurements with respect to the time scale of interaction with the environment (the relaxation time) is obtained.

24.
medRxiv (Medicine) 2026-06-11

Dissecting the functional landscape of rare diseases through genomic variation in a heterogeneous cohort of 11,000 patients

Rare diseases (RDs) remain a major diagnostic challenge. Genetic and phenotypic heterogeneity, incomplete knowledge of disease mechanisms, and limitations in variant clinical interpretation leave many patients without a molecular diagnosis. Meanwhile, the growing volume of genomic data generated in clinical practice offers an opportunity to develop data-driven methodologies for exploring disease mechanisms and improving the reanalysis of unsolved cases. We aggregated real-world genomic data from 11,084 unrelated patients with suspected RD. Patients were clinically classified into 122 diseases. We built a multi-disease genomic variant frequency database (FJD-DB), which enabled the development of variant and gene-disease association scores by means of case-control subcohort comparisons across 32 disease groups. Functional enrichment analyses were then used to highlight disease-associated protein domains, pathways, biological processes, and phenotypes. Finally, the resulting knowledge was integrated into a data-driven framework for the guided reanalysis of unsolved RD patients applied to Inherited Retinal Dystrophies (IRD) patients as first use case. FJD-DB contained more than 45 million unique variants, including ~185,000 potentially pathogenic variants. Disease-specific analyses identified disease-associated pathogenic variants and highlighted both established and candidate disease genes. We detected 179 significantly enriched protein domains across 23 diseases, 124 Human Phenotype Ontology terms across 13 diseases, 79 Reactome pathways across 10 diseases, and 72 Gene Ontology biological processes across 8 diseases, revealing highly disease-specific functional signatures. Integration of disease-specific variant, gene, and functional association signals enabled the development of a data-driven framework for guided reanalysis of unsolved RD cases. Applied to more than 1,100 unsolved IRD cases, the framework generated clinically relevant findings in 26 patients, including four molecular diagnoses, seven candidate diagnoses, and 15 cases upgraded from non-informative findings to variants of uncertain significance. Aggregated real-world genomic data can be leveraged to identify disease-associated molecular signals generating novel biological hypotheses. A unified analytical framework provides a scalable strategy for knowledge discovery and guided reanalysis, facilitating the identification of overlooked and potentially novel genetic causes of RDs.

25.
arXiv (CS.CV) 2026-06-11

Mitigating Content Shift and Hallucination in GenAI Image Editing via Structural Refinement

Generative AI (GenAI) image editors, such as Nano Banana, produce visually compelling results for retouching tasks, enabling non-experts to edit images through text prompts alone. However, the generative nature of these models often introduces spatial misalignment, texture distortion, and content hallucination, all of which are detrimental to downstream workflows that require pixel-level fidelity. We identify a problem setting we call "structure-preserving GenAI fusion" for black-box GenAI image retouching: retain the perceptual enhancements of a GenAI output while enforcing structural faithfulness to the original input image. To address this problem, we propose a post-processing framework that fuses an input image with its GenAI-enhanced counterpart by first establishing coarse spatial and photometric correspondences, then performing a fusion stage that transfers desired enhancements while suppressing hallucinated content. In the absence of direct prior work in this setting, we evaluate our framework against representative methods from photorealistic style transfer and image fusion. Our experiments demonstrate that our method better preserves aesthetic quality while maintaining pixel-level structural consistency and the input resolution.