Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
arXiv (math.PR) 2026-06-15

Lehner's operator norm formulas, semidefinite programming, and spiked matrix models

arXiv:2606.14687v1 Announce Type: new Abstract: Lehner (1999) derived elegant formulas for the operator norm $\|\mathfrak{X}\|$ of operators of the form $\mathfrak{X} = \mathbf{A}_0 \otimes \mathfrak{1} + \sum_{i = 1}^n \mathbf{A}_i \otimes \mathfrak{m}_i$, also easily generalized to the spectral edge $\lambda_{\max}(\mathfrak{X})$, in terms of nonlinear optimization problems over positive definite matrices. Here the $\mathbf{A}_i$ are finite-dimensional Hermitian matrices, the $\mathfrak{m}_i$ are either free semicircular or free Rademacher families of operators, and $\mathfrak{1}$ is the identity operator. We first show that both of Lehner's nonlinear optimizations can be rewritten as linear semidefinite programs (SDPs), even in the Rademacher case where Lehner's optimization is not itself convex. We give the primal and dual forms of these SDPs, derive the complementary slackness relations and consequences thereof, and propose that the SDPs are more stable and accurate than the iterative numerical scheme proposed in Lehner's original work. We then apply the SDPs from the semicircular case to spiked matrix models, studied recently via Lehner's formula by Bandeira, Cipolloni, Schröder, and van Handel (2024). We give a new proof of the Baik–Ben Arous–Péché (BBP) transition they establish in models with isotropic (but possibly correlated) Gaussian noise by constructing feasible variables for the associated primal and dual SDPs. Combining our construction with a sensitivity interpretation of optimal dual variables, we study the fluctuations of leading eigenvectors of such models. We conjecture and give numerical evidence that these fluctuations are Gaussian but anisotropic and non-universal, and that their covariance may be computed in terms of the optimizer of the dual of Lehner's formula, which in turn is approximately the leading eigenmatrix of a completely positive operator associated to the covariance of the noise model.

02.
arXiv (CS.LG) 2026-06-19

Distributionally Robust Set Representation Learning Under Inference-Time Element Corruption

arXiv:2605.30089v2 Announce Type: replace Abstract: Standard Set Representation Learning methods typically excel on curated data but often overlook the challenge of inference-time element corruption. This refers to scenarios where deployed models encounter element-level degradations, such as outliers or missing components, that may distort set representation and degrade performance. We propose SW-DRSO, a distributionally robust optimization framework tailored for sets. Rather than minimizing loss solely on observed training data, SW-DRSO optimizes a tractable surrogate of the worst-case expected loss over a family of plausible inference-time variations. We introduce a barycentric adversary that approximates the intractable search over corrupted sets by a differentiable training-time optimization over simplex weights. Extensive experiments across four tasks demonstrate that SW-DRSO effectively enhances robustness against corruption while maintaining high overall performance.

04.
arXiv (CS.AI) 2026-06-18

Self-Evolving Multi-Agent Systems via Textual Backpropagation

arXiv:2506.09046v3 Announce Type: replace-cross Abstract: Leveraging multiple Large Language Models (LLMs) has proven effective for addressing complex, high-dimensional tasks, but current approaches often rely on static, manually engineered multi-agent configurations. To overcome these constraints, we present the Agentic Neural Network (ANN), a framework that conceptualizes multi-agent collaboration as a layered neural network architecture. In this design, each agent operates as a node, and each layer forms a cooperative team focused on a specific subtask. Our framework follows a two-phase optimization strategy: (1) Forward Phase - Drawing inspiration from neural network forward passes, tasks are dynamically decomposed into subtasks, and cooperative agent teams with suitable aggregation methods are constructed layer by layer. (2) Backward Phase - Mirroring backpropagation, we refine both global and local collaboration through iterative feedback, allowing agents to self-evolve their roles, prompts, and coordination. This neuro-symbolic approach enables our framework to create new or specialized agent teams post-training, delivering notable gains in accuracy and adaptability. Across seven benchmark datasets, our work surpasses leading multi-agent baselines under the same configurations, showing consistent performance improvements.

05.
arXiv (quant-ph) 2026-06-12

Where a Quantum Reservoir Works: A Transferable Operating Band

arXiv:2606.13284v1 Announce Type: new Abstract: In quantum reservoir computing, a fixed quantum system transforms an input signal, while learning reduces to training a simple linear readout on its measured outputs. Since the quantum dynamics themselves are never optimized, the method is well suited to today's hardware. Yet these dynamics must still be chosen carefully, because their settings remain fixed throughout training and inference. It therefore remains an open question where, in its control space, a fixed quantum system learns well. We address this question for a dissipative reservoir by mapping performance over three central physical controls: the strength of the input drive, the coupling between neighboring qubits, and the rate of dissipation. Good performance concentrates in a single, well-defined operating region of this control space. This region transfers across tasks and reservoir initializations, and the same memory-defined regime persists under architectural changes. It is also mechanistically grounded, since it disappears whenever any of the mechanisms that create it is removed. Finally, the region can be located cheaply before any task is run, using a simple memory diagnostic.

06.
arXiv (CS.AI) 2026-06-17

Using Cognitive Models to Improve Language Model Simulation of Human Persuasion Games

arXiv:2606.17657v1 Announce Type: new Abstract: People make decisions differently in strategic interactions. Some update beliefs like a Bayesian; others exhibit biases like motivated reasoning. Although creators of large language models use simulated humans for safety evaluations and training, they often fail to cover this breadth of human behavior. We argue that cognitive science and economics provide a convenient tool for doing so, making use of mathematical models of human decision-making. We propose an approach that we call Equation-to-Behavior Prompting for guiding large language models to match cognitive models, and evaluate this approach on persuasion games based on legal decision-making. We find that large models can approximate equation-based specifications – Bayesian updating, affine distortion, motivated updating, and Grether's $\alpha$-$\beta$ model – using prompting, but small models fail to do so. However, training small models with reinforcement learning to adhere to mathematical rules, Equation-to-Behavior RL, reduces belief error by 26.5% in out-of-distribution parameterizations. We show that these simulations can help create diverse training environments; training small models to consider different kinds of decision-makers improves average belief change by 2.5%–12% over Bayesian-only training, even when persuading GPT-5-mini. Our work could improve human simulations for training and evaluation in increasingly realistic settings, and could also enable novel research into more complicated mathematical models of human decision-making.

07.
arXiv (CS.AI) 2026-06-11

SAGE: Scalable AI Governance & Evaluation

arXiv:2602.07840v4 Announce Type: replace-cross Abstract: Evaluating relevance in large-scale search systems is fundamentally constrained by the governance gap between nuanced, resource-constrained human oversight and the high-throughput requirements of production systems. While traditional approaches rely on engagement proxies or sparse manual review, these methods often fail to capture the full scope of high-impact relevance failures. We present SAGE (Scalable AI Governance \& Evaluation), a framework that operationalizes high-quality human product judgment as a scalable evaluation signal. At the core of SAGE is a bidirectional calibration loop where natural-language Policy, curated Precedent, and an LLM Surrogate Judge co-evolve. SAGE systematically resolves semantic ambiguities and misalignments, transforming subjective relevance judgment into an executable, multi-dimensional rubric with near human-level agreement. To bridge the gap between frontier model reasoning and industrial-scale inference, we apply teacher-student distillation to transfer high-fidelity judgments into compact student surrogates at 92$\times$ lower cost. Deployed within LinkedIn Search ecosystems, SAGE guided model iteration through simulation-driven development, distilling policy-aligned models for online serving and enabling rapid offline evaluation. In production, it powered policy oversight that measured ramped model variants and detected regressions invisible to engagement metrics. Collectively, these drove a 0.25\% lift in LinkedIn daily active users.

08.
arXiv (quant-ph) 2026-06-17

Demultiplexing Generalized Information via Quantum Transmission Lines

arXiv:2606.17894v1 Announce Type: new Abstract: Demultiplexers are the fundamental primitives of network architecture, enabling perfect routing of an input classical signal to a designated one, among multiple output ports. Quantum transmission lines, having access to the quantum systems directly, are able to transmit both the classical and quantum information encoded in quantum systems. A natural question therefore emerges that whether the scrambled classical and quantum information in a quantum system can be perfectly demultiplexed in the designated classical and quantum output ports? Here we answer this question by introducing a quantum to quantum-classical device, namely the quantum demultiplexer (Q-DEMUX). We characterize the class of Q-DEMUXs enabling perfect routing of both the classical and the quantum information along with their simple circuit realizations. Our results highlight an explicit connection between the strength of a Q-DEMUX with the incompatibility of quantum instruments. Finally, we extend the notion in a stronger variant where the sender is oblivious regarding the nature of the data to be transmitted through the Q-DEMUX.

09.
Nature (Science) 2026-06-22

Stereoretentive decarbonylative C(sp<sup>3</sup>)-C(sp<sup>3</sup>) cross-coupling

作者:

While C(sp3)–C(sp3) bond-forming cross-coupling methods have become more common, stereocontrolled bond-formation remains a challenge,1 despite its importance for drug discovery, where there is a emerging demand for molecules with increased sp3 character.2-4 Enantiospecific cross-coupling approaches would complement advances in enantioselective coupling,5-8 but have been limited to specialized substrates with lower availability5,9 because stereospecific oxidative addition of more abundant chiral alkyl electrophiles is unknown.10 Inspired by the classic, stereoretentive Curtius rearrangement,11 herein we disclose a catalytic strategy that proceeds by an analogous stereoretentive decarbonylation step to form a versatile chiral alkylnickel intermediate from easily-available chiral amino-acid and α-hydroxy-acid derivatives. The chiral alkylnickel intermediates decompose and/or racemize on the order of minutes, but are sufficiently stable to enable stereoretentive cross-electrophile coupling12 with alkyl radicals (derived from alkyl iodides) at relatively low temperature (22-40 °C). This mechanistic strategy provides a straightforward approach to stereocontrolled C(sp3)–C(sp3) bond formation, including diastereomers that are inaccessible by stereoselective radical mechanisms. The “metallo-Curtius” strategy described in this study lays a mechanistic foundation for the development many new stereospecific cross-coupling reactions.

10.
arXiv (CS.AI) 2026-06-16

Discovering Lattice Reduction Strategies via Self-Play

arXiv:2606.15301v1 Announce Type: cross Abstract: The Lenstra-Lenstra-Lovász (LLL) algorithm is a seminal contribution to computer science used for lattice basis reduction, yet its polynomial-time outputs produce bases that are far from optimal as the dimension grows. We show that deep reinforcement learning can discover strictly superior, generalizable reduction strategies by interacting with the primitive action space of LLL. We formulate lattice reduction as a single-player Markov Decision Process (MDP) and train a deep residual network using an AlphaZero-style self-play pipeline augmented with adaptive-horizon MCTS (Monte Carlo Tree Search), which couples multi-step network predictions with an entropy-gated expansion mechanism. The resulting policy, DeltaStar, is trained exclusively on small $8$-dimensional $q$-ary lattices and requires fewer primitive row operations than LLL. Crucially, it generalizes zero-shot to unseen moduli and higher dimensions up to $n=32$ without retraining.

11.
arXiv (CS.CV) 2026-06-17

Phenotyping TPF via Self-Supervised Learning: A Label-Agnostic Framework with Expert Validation

The full potential of artificial intelligence in tibial plateau fracture characterisation remains unrealised, constrained by a fundamental dependency on labelled datasets whose consistency cannot be guaranteed: conventional classification schemes such as Schatzker and AO/OTA suffer from inter-observer variability, causing supervised models to learn human disagreement rather than stable fracture morphology. We design, implement, and validate a label-agnostic framework that eliminates this constraint by learning fracture representations directly from imaging data without observer-assigned labels. A RadImageNet-pretrained ResNet-50 encoder is fine-tuned on 154 cleaned knee radiographs using the SimCLR contrastive objective, preceded by a data cleaning protocol and followed by UMAP dimensionality reduction and k-means clustering to discover four imaging-derived phenotypes. Phenotype validity is assessed through a blinded expert review protocol administered to two independent clinicians. The four phenotypes demonstrate robust stability (bootstrap ARI = 0.319 +/- 0.041), strong internal cohesion (silhouette = 0.511), and coherence ratings of 3-5/5 from both reviewers under blinded conditions; one phenotype was unanimously identified as exhibiting comminution – a high-complexity feature isolated without any supervisory signal. Inter-partition comparison against Schatzker labels yields ARI = 0.013, confirming orthogonality to conventional classification boundaries. Notably, expert reviewers anchored to established classification vocabularies perceived imaging-derived groups as heterogeneous precisely where Schatzker alignment was lowest, suggesting that Schatzker-trained perception and label-agnostic embedding geometry measure orthogonal dimensions. These findings establish label-agnostic SSL phenotyping as a reproducible and clinically interpretable complement to conventional classification.

12.
arXiv (CS.LG) 2026-06-17

MGUP: A Momentum-Gradient Alignment Update Policy for Stochastic Optimization

arXiv:2606.17526v1 Announce Type: new Abstract: Efficient optimization is essential for training large language models. Although intra-layer selective updates have been explored, a general mechanism that enables fine-grained control while ensuring convergence guarantees is still lacking. To bridge this gap, we propose MGUP, a novel mechanism for selective updates. MGUP augments standard momentum-based optimizers by applying larger step-sizes to a selected fixed proportion of parameters in each iteration, while applying smaller, non-zero step-sizes to the rest. As a nearly {plug-and-play} module, MGUP seamlessly integrates with optimizers such as AdamW, Lion, and Muon. This yields powerful variants such as MGUP-AdamW, MGUP-Lion, and MGUP-Muon. Under standard assumptions, we provide theoretical convergence guarantees for MGUP-AdamW (without weight decay) in stochastic optimization. Extensive experiments across diverse tasks, including MAE pretraining, LLM pretraining, and downstream fine-tuning, demonstrate that our MGUP-enhanced optimizers achieve superior or more stable performance compared to their original base optimizers. We offer a principled, versatile, and theoretically grounded strategy for efficient intra-layer selective updates, accelerating and stabilizing the training of large-scale models. The code is publicly available at https://github.com/MaeChd/MGUP.

13.
arXiv (CS.AI) 2026-06-19

Class-Incremental Motion Forecasting

arXiv:2603.09420v3 Announce Type: replace-cross Abstract: Motion forecasting enables autonomous vehicles to anticipate scene evolution by predicting the future trajectories of dynamic agents. However, existing approaches typically assume a closed-world setting with a fixed object taxonomy and access to high-quality perception, limiting their applicability in the real world where perception is imperfect, and new object classes may emerge over time. In this work, we introduce class-incremental motion forecasting, a novel setting in which new object classes are sequentially introduced over time and future object trajectories are predicted directly from camera images. We propose the first end-to-end framework for this setting, which adapts to newly introduced classes while mitigating catastrophic forgetting of previously learned ones. Our method generates motion forecasting pseudo-labels for known classes and matches them with 2D instance masks from an open-vocabulary segmentation model. This 3D-to-2D keypoint voting mechanism filters inconsistent and overconfident predictions, while a query feature variance-based replay strategy samples informative past sequences to preserve prior knowledge. Extensive evaluations on nuScenes and Argoverse 2 show that our approach successfully preserves performance on known classes while effectively adapting to novel ones. We further demonstrate zero-shot transfer to real-world driving and show that the framework extends naturally to open- and closed-loop end-to-end class-incremental planning on nuScenes and NeuroNCAP. Code and models will be made publicly available at https://omen.cs.uni-freiburg.de.

14.
arXiv (quant-ph) 2026-06-15

Extensible Fluxonium Architecture Using Tunable Couplers with Low Shunt Capacitance

arXiv:2606.01647v2 Announce Type: replace Abstract: Fluxonium qubits have demonstrated high-fidelity operations and long coherence times in small-scale systems, highlighting their promise for quantum computing. However, large-scale integration into a high-performance two-dimensional (2D) qubit array remains the central challenge for practical applications. In this work, we introduce an extensible architecture for scaling up fluxonium qubits in 2D grids. To address the key challenges, namely achieving controllable strong interaction and high connectivity for qubits featuring small shunting capacitors (footprints), we propose using low-shunt-capacitance couplers to enable tunable interactions between fluxonium qubits. When embedded into 2D square lattices, large couplings can be achieved even with relatively small coupling capacitances, thus enabling multiple connections with sufficient capacitance budget. We further propose coupler realizations based on generalized flux qubit circuits, specifically the quarton and the fluxonium, and demonstrate that both enable fast, high-fidelity gates with low spectator errors, while supporting multiple connections on 2D grids.

15.
arXiv (CS.CV) 2026-06-12

Reinforcement Learning for Neural Model Editing

作者:

Editing pretrained neural networks requires specialized algorithms tailored to specific objectives. Designing such algorithms is often time-consuming and demands significant effort. We present an exploratory framework that formulates neural model editing as a reinforcement learning problem, where agents modify models using reward feedback. We introduce two environments: MaskWorld, where agents scale weights multiplicatively, and ShiftWorld, where agents apply additive weight updates. The reward function combines a utility-preservation objective with a task-specific editing objective, enabling agents to learn targeted modifications while maintaining overall model performance. We evaluate the framework on bias mitigation in text classification and machine unlearning in image classification, both of which traditionally rely on specialized algorithms. Our results show that the learned policies reduce forget set accuracy to nearly 0% while preserving over 90% retain set accuracy on the unlearning task. In the bias mitigation setting, the learned policies improve bias-related performance by more than 5% while maintaining general classification utility. Our findings show that neural model editing can be cast as a reinforcement learning problem, allowing editing policies to be learned from reward feedback rather than manually engineered for each task.

16.
arXiv (CS.CV) 2026-06-18

Moving Beyond Diversity: Visual Token Pruning as Subspace Reconstruction for Efficient VLMs

Despite their remarkable performance, Vision Language Models (VLMs) incur substantial computational overhead due to the large number of visual tokens. While diversity maximization has become a dominant strategy for token reduction, existing methods rely on cosine-based normalized similarity that discards magnitude information, failing to faithfully approximate the original feature representation and leading to suboptimal performance, particularly on compositional multi-skill reasoning tasks. In this paper, we introduce SPARE, a subspace reconstruction method that reformulates token pruning as a column subset selection problem and explicitly minimizes reconstruction error. By iteratively selecting tokens with large projection residuals, SPARE performs reconstruction-driven pruning beyond angular diversity. Moreover, we reveal a counterintuitive anti-relevance phenomenon: tokens with lower image-text relevance score can better preserve contextual information. Based on this finding, we incorporate anti-relevance into SPARE as an additional selection criterion to promote context-aware token selection. Extensive experiments across multiple VLMs and benchmarks demonstrate that SPARE consistently achieves state-of-the-art performance, with strong gains on compositional tasks. When applied to LLaVA, SPARE removes up to 94% of visual tokens while retaining 95% of the baseline performance, all in a fully training-free manner.

17.
arXiv (CS.AI) 2026-06-16

Parallel Test-Time Scaling with Multi-Sequence Verifiers

arXiv:2603.03417v2 Announce Type: replace-cross Abstract: Parallel test-time scaling, which generates multiple candidate solutions for a single problem, is a powerful technique for improving large language model performance. However, it is hindered by two key bottlenecks: accurately selecting the correct solution from the candidate pool, and the high inference latency from generating many full solutions. We argue that both challenges are fundamentally linked to verifier calibration, as a well-calibrated verifier improves answer selection and enables early-stopping strategies to reduce latency. However, existing non-generative verifiers are limited as they score each candidate in isolation, overlooking rich contextual information across the set of candidates. To address this, we introduce the Multi-Sequence Verifier (MSV), a lightweight verifier that predicts each candidate's correctness conditioned on the full sampled set. MSV achieves improved calibration, which directly enhances best-of-N selection performance and empowers a novel early-stopping framework. Across challenging mathematical reasoning benchmarks, MSV improves best-of-64 accuracy by up to 6\% relative to strong baselines, and in the early-stopping setting reaches the same accuracy as baselines with less than half the latency.

18.
arXiv (CS.LG) 2026-06-12

$\alpha$-fair heterogeneous agent reinforcement learning

arXiv:2606.13076v1 Announce Type: cross Abstract: Cooperation in multi-agent systems is typically optimized through utilitarian objectives that maximize overall efficiency but fail to account for reward distribution, often resulting in inequitable "leader-follower" dynamics. While fairness-based approaches encourage pro-social behaviors where every agent benefits from cooperation, many current algorithms - including those utilizing reward shaping - break the stationarity of Markov Games or lack rigorous theoretical guarantees. This creates a critical gap between fair objective methods and theoretically safe learning frameworks. We propose a novel framework that bridges $\alpha$-fairness with Heterogeneous-Agent Trust Region Learning (HATRL), ensuring monotonic improvement and convergence toward Nash Equilibria. Our approach leverages a fair advantage function that dynamically weights agent utilities based on their expected returns, allowing the global objective to transition from purely utilitarian efficiency to $\alpha$-fairness welfare based on the parameter $\alpha$. We introduce two practical algorithms, $\alpha$-fair HATRPO and $\alpha$-fair HAPPO, and demonstrate through experiments in sequential social dilemmas like CleanUp and CommonHarvest that they perform better than HATRL's algorithms from a utilitarian point of view while achieving socially higher outcomes.

19.
arXiv (CS.AI) 2026-06-11

ConsistencyPlanner: Real-time Planning with Fast-Sampling Consistency Models

arXiv:2606.11569v1 Announce Type: cross Abstract: Closed-loop planning in complex, real-world driving scenarios presents a critical challenge for autonomous driving systems. While traditional rule-based methods are interpretable, their predefined heuristics lack the adaptability for dynamic traffic environments. Learning-based approaches have shown considerable promise. Conversely, learning-based approaches, despite their promise, struggle to balance the modeling diverse and multimodal driving behaviors and real-time planning, often leading to indecisive or unsafe actions. To address this limitation, we propose Consistency Planner, a real-time planning framework with fast-sampling consistency models. Our approach is built upon two key technical contributions. Efficient Multimodal Sampling: We employ fast-sampling consistency models to generate a diverse set of plausible future trajectories. This enables efficient, real-time exploration of multimodal actions, overcoming the computational bottlenecks of previous iterative generative methods. Heterogeneous Feature Fusion: We introduce an attention-enhanced decoder that dynamically integrates heterogeneous input features (including scene feature and action token) into a cohesive representation for robust planning. Extensive evaluation in the Waymax simulator demonstrates superior performance in safety metrics compared to existing methods, with particularly strong results in challenging dynamic scenarios.

20.
arXiv (CS.CV) 2026-06-16

CRIS: Cross-Plane Self-Supervised Isotropic Restoration for Anisotropic Volumetric Imaging Across Modalities

Anisotropic volumetric acquisitions are common in clinical MRI and volume electron microscopy (vEM), where sparse through-plane sampling creates thick slices or sections that degrade orthogonal reformats and downstream analysis. We present CRIS, a cross-plane self-supervised framework for isotropic restoration without paired isotropic ground truth. CRIS casts 3D restoration as 2D stripe completion on orthogonal reformats of an isotropic grid: high-resolution in-plane slices are synthetically degraded and periodically masked for training, while at inference blank slices define the isotropic grid, two orthogonal reformats are restored, and predictions are fused by multi-view averaging. We evaluate CRIS on two MRI cohorts and two microscopy benchmarks up to 8x anisotropy. On brain MRI, CRIS achieves 32.921 +/- 0.436 dB PSNR and 0.9631 +/- 0.0027 SSIM, outperforming interpolation, SMORE4, SIMPLE, SA-INR, and ATME, and gives the best segmentation consistency (Dice 0.940 +/- 0.004, ASSD 0.245 +/- 0.014 mm, HD99 1.275 +/- 0.061 mm). On reference-free abdominal MRI, CRIS reduces FID/KID to 48.714/0.023. On vEM, CRIS outperforms interpolation, NIIV, and vEMINR, reaching 29.133 dB/0.834 3D PSNR/SSIM at 4x, 27.123 dB/0.734 on EPFL at 8x, and 21.915 dB/0.699 on noisy hemibrain data. In a robustness experiment, one variable-gap CRIS model evaluated across gap factors 3–7 and coronal, axial, and sagittal degradations maintained higher PSNR/SSIM than interpolation (36.36–31.14 dB and 0.977–0.932 vs. 33.07–27.85 dB and 0.951–0.853). These results support CRIS as a modality-flexible route to isotropic restoration without paired isotropic targets or configuration-specific retraining. Code is available at https://github.com/adi-hatav/CRIS.

21.
arXiv (math.PR) 2026-06-12

Storage and Transport Capacity Design for a Self-Reliable Two-Node Stochastic Resource System

arXiv:2606.12707v1 Announce Type: cross Abstract: We study a two-node stochastic resource system operating over a finite horizon. Each node experiences uncertain supply and demand and is equipped with finite storage. The objective is to ensure that resource levels remain within prescribed limits with high probability. To this end, we formulate a chance-constrained capacity-design problem in which resources can be exchanged through a capacity-limited transport link. We characterize the minimum storage required at each node, derive the optimal transport policy, and quantify the trade-off between storage and transport capacities. Our results show the existence of a critical transport-capacity threshold that enables full risk pooling between the nodes. Moreover, this threshold decreases with the operating horizon, implying that full-pooling performance can be achieved with progressively smaller transport capacity over longer horizons.

22.
arXiv (CS.LG) 2026-06-11

Provable Recovery of Locally Important Signed Features and Interactions from Random Forest

arXiv:2512.11081v2 Announce Type: replace-cross Abstract: Feature and Interaction Importance (FII) methods are essential in supervised learning for assessing the relevance of input variables and their interactions in complex prediction models. In many domains, such as personalized medicine, local interpretations for individual predictions are often required, rather than global scores summarizing overall feature importance. Random Forests (RFs) are widely used in these settings, and existing interpretability methods typically exploit tree structures and split statistics to provide model-specific insights. However, theoretical understanding of local FII methods for RF remains limited, making it unclear how to interpret high importance scores for individual predictions. We propose a novel, local, model-specific FII method that identifies frequent co-occurrences of features along decision paths, combining global patterns with those observed on paths specific to a given test point. We prove that our method consistently recovers the true local signal features and their interactions under a Locally Spike Sparse (LSS) model and also identifies whether large or small feature values drive a prediction. We illustrate the usefulness of our method and theoretical results through simulation studies and a real-world data example.

23.
arXiv (CS.CL) 2026-06-16

Scaling Human and G2P Supervision for Robust Phonetic Transcription

Expert phonetic annotation is costly, especially for non-standard dialects and atypical speech. A common alternative is using Grapheme-to-Phoneme (G2P) models to auto-generate phonetic labels from text transcripts at scale. We study how automatic phonetic transcription performance scales with human and G2P supervision in English. Using a curated 80-hour benchmark spanning native, non-native and post-stroke speech, we identify a supervision quality threshold: G2P supervision helps only when fewer than 20-30 hours of human annotation are available. Beyond this threshold, it provides no significant benefit and can reduce cross-dialect robustness. What is effective after this threshold is ASR pretraining which we use to achieve a 2.3x reduction in weighted phone feature error rate over prior systems, with strong gains on non-native and aphasic speech. These results suggest that quantity-driven G2P scaling may yield diminishing returns for robust generalization.

24.
arXiv (CS.CL) 2026-06-11

PRInTS: Reward Modeling for Long-Horizon Information Seeking

Information-seeking is a core capability for AI agents, requiring them to gather and reason over tool-generated information across long trajectories. However, such multi-step information-seeking tasks remain challenging for agents backed by language models. While process reward models (PRMs) can guide agents by ranking candidate steps at test-time, existing PRMs - designed for short reasoning with binary judgment - cannot capture richer dimensions of information-seeking steps, such as tool interactions and reasoning over tool outputs, nor handle the rapidly growing context in long-horizon tasks. To address these limitations, we introduce PRInTS, a generative PRM trained with dual capabilities: (1) dense scoring based on the PRM's reasoning across multiple dimensions of step quality (e.g., interpretation of tool outputs, tool call informativeness) and (2) trajectory summarization that compresses the growing context while preserving essential information for step evaluation. Extensive evaluations across FRAMES, GAIA (levels 1-3), and WebWalkerQA (easy-hard) benchmarks on multiple models reveal that best-of-n sampling with PRInTS enhances information-seeking in open-source models as well as specialized agents, matching or surpassing frontier models with a much smaller backbone agent and outperforming other strong reward modeling baselines.

25.
medRxiv (Medicine) 2026-06-12

Metastatic Patterns and Treatment Characteristics of Triple-Negative Breast Cancer in Nigeria: A Retrospective Cohort Study

Background: Triple-negative breast cancer (TNBC) is an aggressive breast cancer subtype characterized by the absence of estrogen receptor, progesterone receptor, and human epidermal growth factor receptor 2 expression. It is associated with limited targeted treatment options, early relapse, and a high propensity for visceral metastasis. Data describing metastatic patterns and treatment characteristics of TNBC in Nigeria remain limited. Methods: This retrospective descriptive cohort study included 869 patients with TNBC managed at the Medserve-LUTH Cancer Center, Lagos University Teaching Hospital, Nigeria between June 2019 and June 2024. Demographic, clinicopathologic, metastatic, and treatment-related data were extracted from electronic medical records. Descriptive statistics were used to summarize patient characteristics, metastatic patterns, and treatment profiles. Associations between metastatic disease and selected clinicopathologic and treatment variables were explored using Pearsons chi-square test. Complete-case analysis was applied throughout. Results: The mean age at presentation was 52.09 {+/-} 12.26 years. Most patients were married (79.1%), postmenopausal (64.3%), and of Yoruba ethnicity (56.8%). Advanced disease predominated, with Stage III and Stage IV disease accounting for 42.9% and 35.6% of cases, respectively. Invasive ductal carcinoma was the most common histologic subtype (77.0%), while Grade II tumours constituted 51.3% of graded cases. Surgery was performed in 73.1% of patients, predominantly mastectomy (70.9% of surgical procedures). Chemotherapy was administered to 83.2% of patients, most commonly anthracycline-based regimens (41.8%), while radiotherapy was delivered to 63.5% of patients, with hypofractionated schedules of 42-43 Gy in 15-16 fractions accounting for 47.2% of radiotherapy courses. Metastatic disease was documented in 32.9% of evaluable patients. Lung metastasis was the most frequent site (62.5%), followed by bone (46.3%), regional lymph node invasion (38.5%), liver (23.0%), and brain (22.6%). Tumour grade and histologic subtype were not significantly associated with metastatic disease, whereas radiotherapy exposure demonstrated a significant association with metastatic status ({chi}{superscript 2} = 10.35, p = 0.001). Conclusion: TNBC in this Nigerian cohort was characterized by advanced-stage presentation, invasive ductal predominance, extensive use of multimodality treatment, and substantial visceral metastatic burden. Lung metastasis was the most common metastatic site. These findings provide contemporary real-world data on TNBC in Nigeria and highlight the continuing need for earlier diagnosis, timely referral, and sustained investment in comprehensive cancer care services.