Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
arXiv (CS.AI) 2026-06-11

CredibleDFGO: Differentiable Factor Graph Optimization with Credibility Supervision

arXiv:2605.06100v2 Announce Type: replace-cross Abstract: Global navigation satellite system (GNSS) positioning is widely used for urban navigation, but the covariance reported by the GNSS solver is often unreliable in urban canyons. Existing differentiable factor graph optimization (DFGO) methods learn measurement weighting through the solver, but they still use position-only objectives. As a result, the position estimate may improve while the reported covariance remains too small, too large, or incorrectly oriented. We propose CredibleDFGO (CDFGO), a differentiable GNSS factor graph framework that makes covariance credibility an explicit training target. A Weighting Generation Network (WGN) predicts per-satellite reliability weights, and a differentiable Gauss-Newton solver maps these weights to a position estimate and a Hessian-derived posterior covariance. We use proper scoring rules to supervise the East-North predictive distribution end to end. We study negative log-likelihood (NLL), the energy score (ES), and their combination. Results on three UrbanNav test scenes show consistent gains in covariance credibility. Positioning accuracy also improves on the medium-urban and harsh-urban scenes; on the deep-urban scene, both the mean horizontal error and the 95th-percentile error improve. On the harsh-urban Mong Kok (MK) scene, CDFGO-Combined reduces the mean horizontal error from 13.77 m to 11.68 m, reduces NLL from 40.63 to 6.59, and reduces ES from 12.31 to 9.05 relative to DFGO (MAE). Case studies link the MK improvement to better axis-wise consistency, more credible local covariance ellipses, and satellite-level reweighting.

02.
arXiv (CS.CV) 2026-06-17

Evaluating Synthetic Data Generation for Domain Generalization in Fetal Brain MRI Segmentation

Fetal brain tissue segmentation from magnetic resonance imaging (MRI) is crucial for studying neurodevelopment, but remains challenging due to data heterogeneity and limited annotations. Domain randomization (DR) has recently emerged as a promising strategy for single-source domain generalization by synthesizing training images with randomized artifacts, contrast, and resolution. In this work, we investigate how to maximize the out-of-domain (OOD) generalization of DR-based methods. We evaluate several synthetic data generation strategies for DR, with a particular focus on our recently proposed framework, FetalSynthSeg. We show that simple Gaussian mixture-based intensity modeling outperforms more complex physics-based simulations, and that intensity clustering (subdividing tissue classes based on intensity) improves OOD robustness. Evaluated on 348 fetal subjects from four sites spanning 0.55-3T and both T1w and T2w contrasts, FetalSynthSeg reaches state-of-the-art performance on several FeTA 2024 testing datasets (80-85 Dice score) and, for the first time, offers robust segmentation on modalities other than T2w for fetal brain segmentation (80 Dice on dHCP-T1w dataset). Compared with state-of-the-art methods such as BOUNTI, nnU-Net ensemble, and the FeTA 2024 winner, FetalSynthSeg delivers comparable or superior accuracy while maintaining strong robustness across domain shifts. Our code, model weights, and Docker image ready for easy inference are available at https://hub.docker.com/r/vzalevskyi/fetalsynthseg.

03.
arXiv (quant-ph) 2026-06-16

High-performance gates on trapped ion qubits using counterpropagating pulse-shaped laser beams

arXiv:2606.15672v1 Announce Type: new Abstract: Highly-localized light-matter interactions are necessary for scaling trapped-ion architectures. In hyperfine qubits, counterpropagating beams generate entangling gates by coupling with motion, but this effect is undesirable during single-qubit operations. For that reason, single-qubit gates are traditionally implemented with copropagating beams, and the coexistence of two beam geometries adds hardware and computational overhead. In an effort towards collective performance improvement with minimal overhead, we design and implement pulse-amplitude and dephasing robust dynamically corrected gates using Space Curve Quantum Control (SCQC) and compare them against the constant-amplitude gate implementation. We perform gate set tomography on a four-qubit trapped-ion register, and we discover more than 50% error reduction when robust pulses are used. We find that counterpropagating robust gates often outperform their copropagating counterparts and reach error rates as low as $(3.59 \pm 1.25)\cdot 10^{-3}$, using diamond distance as a metric. This value establishes a laser-driven-gate error reference and is merely an order of magnitude higher than the best reported $microwave$ gate on a $single$ ion. Additional experiments reveal that robust pulses can effectively suppress non-Markovian errors that grow during runtime. Our work challenges the widely accepted belief that copropagating gates should be preferred for their weak motional coupling and invites the adoption of high-performance robust pulses that suppress multiple noise sources of the trapped-ion error budget.

04.
arXiv (CS.LG) 2026-06-19

Flow Matching for Efficient and Scalable Data Assimilation

arXiv:2508.13313v4 Announce Type: replace-cross Abstract: Data assimilation (DA) estimates a dynamical system's state from noisy observations. Recent generative models like the ensemble score filter (EnSF) improve DA in high-dimensional nonlinear settings but are computationally expensive. We introduce the ensemble flow filter (EnFF), a training-free, flow matching (FM)-based framework that accelerates sampling and offers flexibility in flow design. EnFF uses Monte Carlo estimators for the marginal flow field, localized guidance for observation assimilation, and utilizes a novel flow path that exploits the Bayesian DA formulation. It generalizes classical filters such as the bootstrap particle filter and ensemble Kalman filter. Experiments on high-dimensional benchmarks demonstrate EnFF's improved cost-accuracy tradeoffs and scalability, highlighting FM's potential for efficient, scalable DA. Code is available at https://github.com/Utah-Math-Data-Science/Data-Assimilation-Flow-Matching.

05.
arXiv (CS.CL) 2026-06-15

Flood and Harvest: The Provable Necessity of Trivia for Generating Valuable Mathematics via the Lens of Language Generation in the Limit

AI systems coupled to proof assistants now generate formal mathematics at scale, and the gap between what a checker can verify and what a mathematician would value has become the binding constraint. We model the generation of valuable mathematics as nested language generation in the limit: a verifiable formal language $F$, accessed through a membership oracle (the proof checker), contains an unknown valuable language $H \in \mathcal{H}$ revealed only through an adversarial enumeration of a core $C \subseteq H$ of exact density $\alpha$ (the literature). Every output is valuable ($\in H$), trivial ($\in F \setminus H$), or a hallucination ($\notin F$). We settle four questions. First, the verifier is not taste: the collections admitting generation with breadth are exactly those of the oracle-free model, characterized fiber-wise by Angluin's condition. Second, the verifier does buy sound coverage, covering all unseen valuable statements while asserting only valid ones: possible with it, impossible without it; it relocates unavoidable errors from false to trivial. Third, and centrally, a sharp dichotomy on the tight family: generators emitting finitely many trivia achieve optimal coverage $\alpha/2$, while any infinite trivia allowance, even at vanishing rate, jumps the optimum to $1-\alpha/2$ (both tight, for cores presented as the candidate intersection), and one generator attains both ends. The transition is in trivia count, not rate; the gap $1-\alpha$ is the unrecorded mass. Fourth, both regimes instantiate in a compression model of mathematics. A perfect verifier cannot substitute for taste: the unbounded stream of correct-but-worthless statements is not an engineering accident but a provable necessity, since covering unrecorded valuable mathematics requires an infinite, but asymptotically negligible, stream of certified trivia.

06.
PLOS Computational Biology 2026-06-09

Evolution of phenocopying in a dynamical model of developmental trajectories

by Yuuki Matsushita, Archishman Raju Developmental trajectories are known to be canalized, or robust to both environmental and genetic perturbations. However, even when these trajectories are decanalized by an environmental perturbation outside the range of conditions to which they are robust, they often produce phenotypes similar to known mutants, called phenocopies. This correspondence between the effects of environmental and genetic perturbations has received little theoretical attention. Here, we study an abstract regulatory model that is evolved to follow a specific trajectory. We then study the effects of small and large perturbations to the trajectory, both by changing parameters and by perturbing the state at specific times. We find that the phenomenon of phenocopying emerges in evolved trajectories and is not present in a null model of randomly sampled trajectories. Our results suggest that, in this class of dynamic models, evolution can allow high-dimensional phenotypic landscapes to simultaneously exhibit robustness and phenocopying.

07.
arXiv (quant-ph) 2026-06-16

Entangled states are typically incomparable

arXiv:2406.03335v2 Announce Type: replace Abstract: Consider a bipartite quantum system, where Alice and Bob jointly possess a pure state $|\psi\rangle$. Using local quantum operations on their respective subsystems, and unlimited classical communication, Alice and Bob may be able to transform $|\psi\rangle$ into another state $|\phi\rangle$. Famously, Nielsen's theorem [Phys. Rev. Lett., 1999] provides a necessary and sufficient algebraic criterion for such a transformation to be possible (namely, the local spectrum of $|\phi\rangle$ should majorise the local spectrum of $|\psi\rangle$). In the paper where Nielsen proved this theorem, he conjectured that in the limit of large dimensionality, for almost all pairs of states $|\psi\rangle, |\phi\rangle$ (according to the natural unitary invariant measure) such a transformation is not possible. That is to say, typical pairs of quantum states $|\psi\rangle, |\phi\rangle$ are entangled in fundamentally different ways, that cannot be converted to each other via local operations and classical communication. Via Nielsen's theorem, this conjecture can be equivalently stated as a conjecture about majorisation of spectra of random matrices from the so-called trace-normalised complex Wishart-Laguerre ensemble. Concretely, let $X$ and $Y$ be independent $n \times m$ random matrices whose entries are i.i.d. standard complex Gaussians; then Nielsen's conjecture says that the probability that the spectrum of $X X^\dagger / \operatorname{tr}(X X^\dagger)$ majorises the spectrum of $Y Y^\dagger / \operatorname{tr}(Y Y^\dagger)$ tends to zero as both $n$ and $m$ grow large. We prove this conjecture, and we also confirm some related predictions of Cunden, Facchi, Florio and Gramegna [J. Phys. A., 2020; Phys. Rev. A., 2021].

08.
medRxiv (Medicine) 2026-06-19

The Impact of Pregnant Womens Dietary Behavior on the Physiological Adaptation Paradox and Maternal-Fetal Resource Conflict in Conflict Settings: A Predictive Analytical Study

This scientific study aims to assess the level of awareness, nutritional knowledge, and actual behavioral practices among pregnant women in the Capital District of Sanaa, Republic of Yemen, and to determine their impact on the health and clinical indicators of the mother and fetus under complex conflict conditions. The study employed a descriptive-analytical approach based on a simple random sample of 200 pregnant women attending government-run hospitals and specialized medical centers in the Capital District. Field data were collected during December 2025 using a structured and validated questionnaire consisting of 42 items measuring demographic variables, awareness, practices, barriers, and health outcomes. The results of the statistical analysis using SPSS software showed a high level of nutritional awareness (87%) and healthy dietary practices (80%) among the sample participants. Simple and multiple linear regression tests revealed a statistically significant effect of awareness and practices in explaining 20.2% of the variance in the health status of the mother and fetus (R{superscript 2}= 0.204, p < 0.001). The study demonstrated that actual behavioral practices have greater predictive power ({beta}=0.316, p=0.001) compared to theoretical cognitive awareness ({beta}=0.232, p=0.005) in determining clinical outcomes for the mother and fetus, highlighting the widening gap between knowledge and behavior under structural pressures. "Morning sickness" (80%) and the deterioration of "family economic status" (71%) emerged as the greatest physiological and material barriers to proper nutrition. With their inferential impact established as an extension of the maternal-fetal resource allocation conflict in a physiologically and economically challenging environment, the study also identified significant differences in nutritional behavior and health outcomes in favor of housewives and mothers who are more educated and have higher incomes, while no significant differences were recorded attributable to obstetric variables such as stage or order of pregnancy. The study offers a unique theoretical and practical contribution by formulating an integrated causal model that demonstrates that the fetus acts as a biological drain on the mothers cellular and mineral reserves in a war environment, which necessitates directing antenatal care and support programs toward effective behavioral empowerment and nutritional support to overcome the structural and material barriers faced by pregnant women.

09.
PLOS Computational Biology 2026-06-02

Linking reduced prefrontal microcircuit inhibition in schizophrenia to EEG biomarkers in silico

by Sana Rosanally, Frank Mazza, Heng Kang Yao, Faraz Moghbel, Hannah Seo, Etay Hay Reduced cortical inhibition by parvalbumin-expressing (PV) interneurons in schizophrenia is thought to be associated with impaired processing in the prefrontal cortex and altered EEG signals such as oddball mismatch negativity (MMN). Recent studies also suggest loss of somatostatin (SST) interneuron inhibition. However, establishing the link between reduced interneuron inhibition and reduced MMN experimentally in humans is currently not possible. To overcome these challenges, we simulated spiking activity and EEG during baseline and oddball response in detailed models of human prefrontal microcircuits in health and schizophrenia, with reduced PV and SST interneuron inhibition as constrained by postmortem patient data. We showed that reduced PV interneuron inhibition can account for the decreased MMN amplitude seen in schizophrenia, with a threshold below which the amplitude effect was low as seen in at-risk patients. In contrast, reduced SST interneuron inhibition did not affect the MMN amplitude. We further showed that both types of inhibition loss were necessary to account for changes in resting EEG in schizophrenia, with reduced SST interneuron inhibition increasing broadband power, and reduced PV and SST interneuron inhibition both leading to a right shift from alpha to beta frequencies. Our study thus links reduced PV and SST interneuron inhibition in schizophrenia to distinct EEG biomarkers that can serve to improve stratification and early detection using non-invasive brain signals.

10.
arXiv (CS.CL) 2026-06-19

Telenor Nordics Customer Service self-help corpus

作者:

This paper presents a multilingual customer service self-help corpus comprising 1,122 manually validated documents in Finnish, Danish, Norwegian, and Swedish, totaling 274,599 words and 1,884,833 characters. The documents have been sourced from the public self-help pages of four Nordic telecommunications operators and subsequently filtered for person-identifiable information and relevance through a combined LLM and human annotation pipeline. Domain-specific datasets for Nordic languages remain scarce, particularly in customer service: a domain of growing importance for retrieval-augmented generation, cross-lingual transfer learning, and emerging agent-based service architectures. An analysis of the corpus reveals substantial variation in document length and structure across operators, reflecting distinct editorial strategies, as well as broad topical coverage spanning network hardware, mobile services, TV and streaming, billing, and account management. The dataset is publicly available under a CC-BY-NC-SA-4.0 license at https://zenodo.org/records/20732652, intended to support reproducible research in Nordic NLP and information retrieval.

11.
arXiv (quant-ph) 2026-06-15

Jones-matrix analysis of phase accumulation in a linear-optical multi-pass interferometer

arXiv:2606.14422v1 Announce Type: new Abstract: Quantum information science has traditionally relied on nonclassical resources, such as entangled photon pairs and squeezed states, to achieve measurement performance beyond classical limits. Here, we revisit the multi-pass photonic scheme reported in Nature 450, 393 (2007) to clarify the physical origin of the observed superresolution and the associated claim of supersensitivity. Using a rigorous Jones-matrix formalism, we show that the round-trip evolution of the HQMQ linear optics unit is equivalent to the product of two reflections in polarization space, resulting in an effective rotation operator. This equivalence reveals that the accumulated phase arises from coherent polarization-state rotation on the Poincare'e sphere. The resulting phase accumulation is interpreted geometrically as a progressive realignment of the polarization state during successive forward and backward propagations. To validate the theoretical model, a classical-wave implementation is experimentally conducted, analyzed, and compared with the corresponding Jones-matrix solution. Finally, the scaling behavior of the Fisher information is analyzed to examine the origin of the claimed supersensitivity. The results are further compared with a recently developed coherence de Broglie wavelength framework, which achieves identical superresolution through repeated coherent interactions in a cascaded interferometeric architecture.

12.
arXiv (CS.LG) 2026-06-16

Multi-Agent Framework for Audit Risk Assessment with Explicit Uncertainty and Evidence Conflict Modeling

arXiv:2606.15640v1 Announce Type: new Abstract: Audit risk assessment increasingly benefits from combining heterogeneous evidence sources, yet existing approaches typically produce point predictions without quantifying how well different evidence streams agree. We propose UMAR (Uncertainty-Aware Multi-Agent Risk Assessment), a framework that employs three specialized agents: an MD&A Text Agent, a Financial Ratio Agent, and a CAM Agent, each producing independent risk scores with calibrated uncertainty estimates. An Uncertainty Aggregator based on Dempster-Shafer evidence theory fuses these scores while explicitly measuring inter-agent conflict. We evaluate UMAR on a U.S. dataset of 3,200 firm-year observations from SEC 10-K filings (2019-2023), with financial restatement as the target label. Experimental results show that UMAR achieves an AUROC of 0.782 and a PR-AUC of 0.341, outperforming logistic regression, XGBoost, FinBERT, and single-agent and dual-agent LLM baselines. UMAR attains the lowest expected calibration error (ECE = 0.052) among all methods and identifies evidence-conflict patterns that correlate with actual restatement risk, offering auditors potentially actionable and interpretable risk signals.

13.
medRxiv (Medicine) 2026-06-16

Utilising Artificial Intelligence to Identify Ventricular Tachycardia Ablation Targets in Sinus Rhythm

Background and Aims: Machine learning has shown potential in predicting ablation targets for ventricular tachycardia (VT) in an animal model. This study progresses to externally validating deep learning approaches for human data. Methods: The development and external validation dataset included 21 and 13 patients, respectively, with structural VT undergoing catheter ablation. In the development datasets, electrophysiological studies were conducted using the AdvisorTM HD grid (EnsiteTM X), while both CARTO and Ensite Precision were used in the validation dataset. In each patient, VT ablation targets were defined as mapping points within 8 mm of VT isthmuses. Three advanced machine learning models were trained using cardiac mapping data acquired in both omnipolar and unipolar configurations during sinus rhythm and ventricular pacing. Discrimination was evaluated using nested leave-one-out cross-validation at patient level. Results: Overall, graph convolutional networks (GCNs), which integrate intracardiac signal waveforms with three-dimensional electroanatomical geometries, achieved the highest performance, with optimal results obtained from unipolar electrograms acquired in sinus rhythm (median AUC 0.793, sensitivity 83.6%, specificity 69.0%). This may be partly explained by the inclusion of repolarization dynamics in unipolar electrograms and the higher point density of sinus rhythm maps. Comparable performance was observed in the external dataset. Conclusion: This study demonstrates that graph convolutional networks applied to sinus rhythm EGM waveforms collected during substrate mapping can localise critical components of VT re-entry circuits. This approach has potential to provide fast and accurate ablation guidance without the need to induce and map VT, improving safety and efficacy of VT catheter ablation.

14.
arXiv (CS.CV) 2026-06-17

Qwen-RobotNav Technical Report: A Scalable Navigation Model Designed for an Agentic Navigation System

Agentic navigation systems require a base navigation model whose observation strategy can be externally reconfigured at inference time, because instruction following, object search, target tracking, and autonomous driving share the same perception-planning backbone yet demand fundamentally different strategies for consuming the visual stream. We present Qwen-RobotNav, a scalable navigation model built on Qwen-RobotNav that addresses it through a parameterised interface with two complementary dimensions: multiple task modes that select the navigation behaviour, and controllable observation parameters (e.g., token budget, per-camera weights) that govern how visual history is encoded. With training-time randomization over all parameters, Qwen-RobotNav is robust to any inference-time configuration requiring zero architectural modification to the Qwen-RobotNav backbone. We train Qwen-RobotNav on 15.6M samples; co-training with vision-language data prevents the collapse into reactive action-sequence mappers observed in trajectory-only training. The parameterised interface also makes Qwen-RobotNav a natural building block for agentic systems: for long-horizon scenarios, an upper-level planner decomposes goals into sub-tasks and dynamically switches Qwen-RobotNav's task mode and context strategy mid-episode, composing complex behaviours from repeated calls to the same model. Extensive experiments show that Qwen-RobotNav sets new state-of-the-art results across major navigation benchmarks. The model exhibits favourable scaling from 2B to 8B parameters, with joint multi-task training developing a shared spatial-planning substrate that transfers across task families, and demonstrates strong zero-shot generalisation to real-world robots across diverse environments.

15.
arXiv (CS.CV) 2026-06-16

Redirecting the Flow: Image Customization through Attention Distribution Shift

Subject-driven image customization aims to generate images that not only follow textual instructions but also preserve the identity of a given reference subject. Existing approaches, including test-time fine-tuning, encoder-based methods, and token competition in shared attention spaces, suffer from limited efficiency, misalignment between extracted reference features and the generative process, and interference from irrelevant information. To address these limitations, we formulate the customization task as a distribution shift induced by incorporating reference images into text-to-image generation, and derive a Conditional Attention Distribution Shift formulation grounded in maximum entropy theory. Building on this formulation, we propose CustomShift, a dual-branch architecture based on Stable Diffusion 3. The Reference-Alignment Branch leverages self-attention between reference images and subject names to achieve layer-wise alignment with latent representations, while the Cross-Guidance Branch integrates textual and reference cues to guide generation. Experiments on the DreamBooth and Custom101 benchmarks demonstrate that our method consistently outperforms state-of-the-art approaches, achieving a better balance between semantic fidelity and subject consistency.

16.
arXiv (math.PR) 2026-06-17

The Loss of Tension in an Infinite Membrane with Holes of Decaying Spatial Density

arXiv:2606.17792v1 Announce Type: new Abstract: What is the effect of randomly removing material from an infinite stretched membrane? Under what conditions can the membrane still sustain tension? This problem was introduced by Robert Connelly in connection with applications of rigidity theory in the natural sciences, and was later studied in M. V. Menshikov, K. A. Rybnikov, and S. E. Volkov, "The loss of tension in an infinite membrane with holes distributed according to a Poisson law" (2002); a discrete version was also considered in Robert Connelly, Konstantin Rybnikov, and Stanislav Volkov, "Percolation and the Loss of Tension in an Infinite Triangular Lattice" (2001). We study a mathematical framework based on a non-homogeneous Poisson point process whose intensity $\lambda$ tends to zero at infinity. The hole shapes are i.i.d.\ and independent of their locations. We show that if the intensity does not decay too quickly, then tension is still lost throughout the whole plane, as in the homogeneous model studied in 2002. Conversely, we give sufficient conditions under which complete loss of tension does not occur. Thus, both destruction and non-destruction regimes are possible even when the intensity tends to zero, indicating a phase transition in the model. The processes studied here are closely related to bootstrap percolation.

17.
arXiv (CS.LG) 2026-06-16

Self-Supervised Learning of Iterative Solvers for Constrained Optimization

arXiv:2409.08066v3 Announce Type: replace Abstract: The real-time solution of parametric optimization problems is critical for applications that demand high accuracy under tight real-time constraints, such as model predictive control. To this end, this work presents a learning-based iterative solver for constrained optimization, comprising a neural network predictor that generates initial primal-dual solution estimates, followed by a learned iterative solver that refines these estimates to reach high accuracy. We introduce a novel loss function based on Karush-Kuhn-Tucker (KKT) optimality conditions, enabling fully self-supervised training without pre-solved optimizer solutions. Theoretical guarantees ensure that the training loss function attains minima exclusively at KKT points. A convexification procedure enables application to nonconvex problems while preserving these guarantees. Experiments on two nonconvex case studies demonstrate speedups of up to one order of magnitude compared to state-of-the-art solvers such as IPOPT, while achieving orders of magnitude higher accuracy than competing learning-based approaches.

18.
arXiv (quant-ph) 2026-06-17

Optimizing bias-tailored quantum error correction beyond code-capacity noise

arXiv:2606.17709v1 Announce Type: new Abstract: We find that the substantial advantages predicted for bias-tailored quantum error correction (QEC) under code-capacity noise are strongly reduced once realistic syndrome extraction and circuit-level noise models are considered. We start by comparing XZZX codes to rectangular surface codes with a bias-dependent optimised anisotropy. Although code-capacity simulations predict an advantage of rectangular surface codes in the limit of high noise bias, this actually disappears under circuit-level noise, making the XZZX codes the preferred and simplest choice even for platforms that allow for a flexible variation of the code layout adapted to changes in noise calibration. Our results identify bias degradation during syndrome extraction under circuit-level noise as the central limitation of biased-tailored QEC. To partially mitigate this effect, we introduce a bias-filtering CNOT gadget that temporarily encodes the ancillary target qubit during syndrome extraction in a repetition code and, upon measurement and feed forward, manages to reduce the bias degradation. In a regime of high-bias and low-idle errors, this bias-filtering gadget yields a few-percent relative improvement of the XZZX code error threshold, demonstrating that lightweight bias-filtering strategies can recover part of the lost bias-tailoring advantage for realistic circuit-level noise.

19.
arXiv (CS.CL) 2026-06-11

Compatibility-Aware Dynamic Fine-Tuning for Large Language Models

Supervised Fine-Tuning (SFT) is the predominant paradigm for aligning large language models (LLMs), yet it suffers from optimization instability and limited generalization. Recent work attributes this issue to pathological gradient scaling and proposes Dynamic Fine-Tuning (DFT) to correct it at the token level. However, DFT assumes all demonstrations are equally suitable learning targets, an assumption violated by the strong heterogeneity of large-scale instruction data, where demonstration-policy mismatch induces high-variance updates at the sample level. We introduce Compatibility-Aware Dynamic Fine-Tuning (CADFT), a principled extension of DFT that controls sample-level optimization variance. CADFT derives a dynamic, policy-dependent compatibility signal from model likelihoods to modulate supervised updates, suppressing high-variance gradients from incompatible demonstrations. We further propose a delayed, low-frequency compatibility-guided rewriting strategy to transform persistently incompatible demonstrations into learnable targets. We show that CADFT can be interpreted as a variance-controlled estimator that generalizes token-level stabilization in DFT to the sample level. Extensive experiments demonstrate improved stability, generalization, and cold-start reinforcement learning initialization, while remaining fully supervised and independent of explicit reward modeling.

20.
arXiv (CS.AI) 2026-06-11

EKF-Based Depth Camera and Deep Learning Fusion for UAV-Person Distance Estimation and Following in SAR Operations

arXiv:2602.20958v2 Announce Type: replace-cross Abstract: Vision-based Unmanned Aerial Vehicles (UAVs) frameworks aid human search tasks by detecting and recognizing specific individuals, then tracking and following them while maintaining a safe distance. A key safety requirement for UAV following is the accurate estimation of the distance between camera and target object under real-world conditions, achieved by fusing multiple image modalities. As part of the system for automatic people detection and face recognition using deep learning, in this paper we present the fusion of depth camera measurements and monocular camera-to-body distance estimation for robust tracking and following. Deep learning based filtering of depth camera data and estimation of camera-to-body distance from a monocular camera are achieved with YOLO-pose, enabling real-time fusion of depth information using the Extended Kalman Filter (EKF) algorithm. The proposed subsystem, designed for use in drones, estimates and measures the distance between the depth camera and the human body keypoints, to maintain the safe distance between the drone and the human target. Our system provides an accurate estimated distance, which has been validated against motion capture ground truth data. The system has been tested in real time indoors, where it reduces the average errors, RMSE and standard deviations of distance estimation up to 15,3% in three tested scenarios. Based on the test results, the EKF fusion-based approach increases the depth detection range by reducing the errors outside the optimal depth camera working range. It also shows improved robustness and precision in challenging conditions, such as reflections and poor visibility, making it suitable for SAR.

21.
medRxiv (Medicine) 2026-06-10

Impact of Early Treatment on Symptom Improvement and Procedural Events among Men with BPH and Bothersome Lower Urinary Tract Symptoms: A Contemporary Analysis of the American Urological Association Quality (AQUA) Registry

PURPOSE: As the armamentarium of BPH therapies continues to expand, it remains imperative to maximize patient satisfaction and minimize decisional regret. We sought to determine the impact of time from BPH diagnosis to index treatment on symptom improvement and subsequent procedural events. MATERIALS AND METHODS: We queried the American Urological Association Quality Registry for men [&ge;] 40 years old with BPH, available IPSS data, and no receipt of prior BPH treatment. Index treatment included medication, surgery, or minimally invasive surgical therapy (MIST). Outcomes included IPSS over 3 years of follow-up, change in percentage of mild lower urinary tract symptoms (LUTS) by 3 months, and time to procedural event. Patients were stratified by time from index diagnosis to treatment by 3 years. Outcomes were compared across time-to-treatment cohorts with appropriate statistical tests with p < 0.05 as significant. RESULTS: 43,919 patients met criteria with 19,642 pursuing treatments. Patients pursued treatment at comparably lower baseline IPSS compared to prior prospective series. Patients undergoing surgery and MIST had significantly higher baseline IPSS, while medical comorbidities were significantly more common among men initiating pharmacotherapy. Early surgery and MIST were associated with significant improvement in IPSS within 6-12 months and an increase in mild LUTS by 3 months. All forms of early treatment were associated with delayed time to procedural events, including catheterization and fulguration. CONCLUSIONS: Early procedural intervention for BPH is associated with early symptom improvement and delayed time to procedural events among real-world, contemporary practice.

22.
PLOS Computational Biology 2026-06-04

CIPHER: An end-to-end framework for designing optimized aggregated spatial transcriptomics experiments

by Zachary Hemminger, Haley De Ocampo, Fangming Xie, Zhiqian Zhai, Jingyi Jessica Li, Roy Wollman Motivation Most imaging-based spatial transcriptomics methods measure individual genes, which limits scalability and typically requires integration with scRNA-seq to recover full cellular states. Recent approaches such as CISI, FISHnCHIPs, and ATLAS address this limitation by measuring aggregate transcriptional signatures, where multiple genes are pooled into each channel to increase throughput. While aggregate measurements improve scalability, they shift the problem from gene selection to feature design. For effective integration with scRNA-seq, these signatures must be not only discriminative in transcriptional space but also straightforward to measure, with balanced signal, sufficient dynamic range, and robustness to experimental noise. By optimizing decoding accuracy in isolation, existing methods leave substantial performance on the table. Results We present CIPHER (Cell Identity Projection using Hybridization Encoding Rules), a neural-network framework that jointly optimizes the experimental encoding matrix, i.e., the way that genes are aggregated to signatures, and the downstream cell embedding. CIPHER integrates the physical limits of imaging assays directly into its loss function, shaping the latent space to maximize discriminability while maintaining robustness to measurement noise and signal constraints. Using a large-scale mouse brain scRNA-seq reference, we show that CIPHER-designed encodings yield latent spaces with improved cell-type separability, uniform signal utilization, and greater resilience to hybridization variability, resulting in higher decoding accuracy from both simulated and experimental data. Conclusion CIPHER formulates aggregate signature design as a joint optimization problem over decoding accuracy and experimental measurability. This enables systematic, scRNA-seq-aligned feature design for scalable spatial transcriptomics based on aggregate measurements. Availability Code and documentation are available at https://github.com/wollmanlab/Design/.

23.
arXiv (CS.CL) 2026-06-12

Zero-source LLM Hallucination Detection with Human-like Criteria Probing

Large language models (LLMs) often hallucinate by generating factually incorrect or unfaithful content, posing significant risks to their safe use. Detecting such hallucinations is particularly challenging under the zero-source constraint, where no model internals or external references are available, and detection must rely solely on the textual query-answer pair. In this paper, we propose Human-like Criteria Probing for Hallucination Detection (HCPD), a paradigm that emulates the multi-faceted reasoning of human evaluators. Its core is a Human-like Criteria Probing (HCP) mechanism, in which a LLM agent adaptively decomposes its judgment into a weighted set of interpretable criteria and aggregates criterion-specific scores into a final truthfulness measure. To achieve this adaptive capability, we introduce a reward-based alignment scheme using only weak supervision from semantic consistency. At inference, we employ a multi-sampling aggregation strategy to ensure robust decisions while preserving full interpretability. We further provide theoretical analysis supporting the reliability of our approach. Extensive experiments show that HCPD consistently outperforms state-of-the-art baselines, offering an effective and explainable solution for zero-source hallucination detection. Code is available at https://github.com/TRISKEL10N/HCPD.

24.
arXiv (CS.AI) 2026-06-12

Arbor: Tree Search as a Cognition Layer for Autonomous Agents

arXiv:2606.12563v1 Announce Type: new Abstract: Arbor is a multi-agent framework that introduces structured tree search as a cognition layer for autonomous agents operating in large, stateful action spaces. Prior autonomous optimization systems operate on isolated targets with stateless evaluation. Arbor instead maintains an explicit search tree of scored hypotheses that serves as the shared working memory across agents, evolving with every measurement, treating failures as diagnostic signal that reshapes subsequent exploration, and expanding as prior successes shift the bottleneck distribution. We validate Arbor on full-stack LLM inference optimization, a domain where achieving peak performance has historically required coordinated effort from engineering teams across the application, framework, compiler, kernel, and hardware stack. Arbor pairs an Orchestrator agent, which drives optimization by delegating to Domain Specialists across the inference stack, with a Critic agent that safeguards stability through root-cause analysis, introspection, and measurement validation – a checks-and-balances architecture where neither agent can unilaterally drive the system. Agent capabilities are decomposed into hard skills (domain expertise) and soft skills (coordination protocols that determine how contributions compose), enabling fully autonomous multi-day campaigns. Arbor achieves up to 193% inference throughput-latency Pareto improvement over vendor-optimized baselines, while a single agent without the harness plateaus at +33% throughput improvement and crashes irrecoverably within hours. Arbor generalizes to multiple generations of hardware platform, and run-to-run variance is within 2 percentage points demonstrating that the method is hardware-agnostic and reproducible.

25.
arXiv (math.PR) 2026-06-16

Risk-averse mean field games: exploitability and non-asymptotic analysis

arXiv:2301.06930v5 Announce Type: replace-cross Abstract: In this paper, we use mean field games (MFGs) to investigate approximations of $N$-player games ($N$pGs) with uniformly symmetrically continuous heterogeneous closed-loop actions. To incorporate agents' risk aversion (beyond the classical expected utility of total costs), we use an abstract evaluation functional for their performance criteria. Centered around the notion of exploitability, we conduct non-asymptotic analysis on the approximation capability of MFGs from the perspective of state-action distributions without requiring the uniqueness of equilibria. Under suitable assumptions, we first show that scenarios in the $N$pGs with large $N$ and small average exploitabilities can be well approximated by approximate solutions of MFGs with relatively small exploitabilities. We then show that $\delta$-mean field equilibria can be used to construct $\varepsilon$-equilibria in $N$pGs. Furthermore, in this general setting, we prove the existence of mean field equilibria. This proof reveals a possible avenue for incorporating penalization for randomized action into MFGs.