Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
medRxiv (Medicine) 2026-06-15

Fanconi Anemia as a Window into Premalignant Field Cancerization of the Oral Mucosa

Head and neck squamous cell carcinoma (HNSCC) evolves through stepwise clonal expansion within genetically altered mucosa fields, yet actionable biomarkers remain undefined. Leveraging Fanconi anemia (FA), a cancer predisposition syndrome with extreme HNSCC risk due to defective DNA interstrand crosslink repair, we profiled premalignant changes in the oral cavity using noninvasive brush biopsies. Consistent with our prior demonstration of genomic instability in FA-associated SCCs, we detected pathogenic TP53 variants in 26% and copy number alterations in 60.5% in clinically normal-appearing oral mucosa of individuals with FA. These subclinical clonal expansions define candidate biomarkers of early clonal evolution amenable to serial sampling for risk stratification and prevention studies. Since FA-associated SCCs share genomic features with sporadic HNSCC, these findings may extend to the broader population. We also identify somatic reversion of a pathogenic FANCB variant, providing evidence of genomic self-correction and suggesting a potential avenue for gene-based cancer prevention in FA.

02.
arXiv (CS.LG) 2026-06-17

Finite-Time Queue Peak Laws in Stochastic Networks: Logarithmic Scaling After Geometric Thresholds

arXiv:2606.18218v1 Announce Type: cross Abstract: We study finite-horizon queue peaks in generalized switches, a standard stochastic-network model in which many queues share constrained service resources. Arrivals may be dependent, time-varying, and adapted to the past; the standing load condition is uniform interior slack, meaning the conditional mean arrival vector stays in a fixed contraction of the capacity region. We show that this slack reshapes the finite-time peak law for drift-minimizing scheduling policies such as MaxWeight. The square-root envelope that is sharp without slack persists only up to a geometry-dependent threshold; beyond that threshold, the running maximum grows only logarithmically with the horizon, both with high probability and in expectation. The mechanism is self-normalization: in the current queue direction, the projected fluctuation scale is normalized by the stabilizing drift scale. This removes capacity geometry from the logarithmic coefficient, while geometry remains in the threshold. Matching lower bounds show that both the logarithmic term and a geometric threshold are unavoidable. When finite-time state-space collapse is available, the threshold can be sharpened using local bottleneck geometry. For generalized input-queued switches, we obtain finite-time peak bounds with tight logarithmic coefficients. Simulations illustrate the two-phase envelope, local geometric refinements, and variance-sensitive improvements predicted by the theory.

03.
bioRxiv (Bioinfo) 2026-06-23

VCBench: A Multi-Dimensional Benchmark for Single-Cell Foundation Models

Single-cell foundation models are increasingly positioned as virtual cells, yet their capabilities are assessed by fragmented, largely single-task benchmarks that obscure where these models improve on simple baselines. VCBench addresses this by synthesizing four independent virtual-cell frameworks into seven capability dimensions: perturbation response prediction, cross-species universality, gene regulatory network (GRN) inference, modality integration, temporal dynamics, multi-scale integration, and in silico experimentation. Each dimension is assessed for operational testability under current architectures and datasets: five admit direct or proxy evaluation, while multi-scale integration and in silico experimentation are structurally untestable as end-to-end tasks. We evaluate five foundation models (Geneformer, scGPT, UCE, TranscriptFormer, Arc State) against pre-registered linear and nearest-neighbor baselines across the five testable dimensions, and report three findings. First, the baselines match or exceed every foundation model on four of the five scored dimensions, replicating the reported competitiveness of linear baselines on perturbation prediction and extending it to cross-species transfer, GRN inference, and temporal ordering. Second, TranscriptFormer alone exceeds the strongest baseline on cross-modal RNA-to-protein prediction (53% Pearson improvement, with a documented contamination caveat) and is the only model to reach Level 2 in the pre-registered Virtual Cell (VC) Level rubric; the architectural choice behind this advantage simultaneously causes a spectral collapse that destroys its temporal-ordering performance, a tradeoff invisible to single-task benchmarks. Third, no foundation model publishes a complete cell-level training manifest, leaving data contamination undetectable to users. 
Alongside the benchmark, VCBench releases a Contamination Reporting Schema and contributes two further methodological tools: a common-label-set protocol that controls for class-count confounds in cross-species transfer, and a spread-error correlation probe for epistemic calibration.

04.
arXiv (quant-ph) 2026-06-12

Quantum Error Correction Codes for Truncated SU(2) Lattice Gauge Theories

arXiv:2511.13721v2 Announce Type: replace Abstract: We construct two quantum error correction codes for pure SU(2) lattice gauge theory in the electric basis truncated at the electric flux $j_{\max}=1/2$, which are applicable on quasi-1D plaquette chains, 2D honeycomb and 3D triamond and hyperhoneycomb lattices. The first code converts Gauss's law at each vertex into a stabilizer while the second only uses half of the vertices and is locally the carbon code. Both codes are able to correct single-qubit errors. The electric and magnetic terms in the SU(2) Hamiltonian are expressed in terms of logical gates in both codes. The logical-gate Hamiltonian in the first code exactly matches the spin Hamiltonian for gauge singlet states found in previous work.

05.
arXiv (CS.CL) 2026-06-12

MaxProof: Scaling Mathematical Proof with Generative-Verifier RL and Population-Level Test-Time Scaling

We present MaxProof, a population-level test-time scaling framework for competition-level mathematical proof in the MiniMax-M3 series. M3 first trains three proof-oriented capabilities – proof generation, proof verification, and critique-conditioned proof repair – using a defense-in-depth generative verifier engineered for low false-positive rate. These capabilities are merged into a single released M3 model. At test time, MaxProof treats the model as a generator, verifier, refiner, and ranker, searches over a population of candidate proofs, and returns one final proof through tournament selection. With MaxProof test-time scaling, the M3 model reaches 35/42 on IMO 2025 and 36/42 on USAMO 2026, exceeding the human gold-medal threshold on both.

06.
arXiv (CS.LG) 2026-06-12

Simultaneous Latent Budget Trees for Stratified Classification

arXiv:2606.13295v1 Announce Type: cross Abstract: In the era of Explainable Artificial Intelligence, there is a renewed focus on single trees for their ease of interpretation. This paper introduces Simultaneous Latent Budget Trees, a probabilistic machine learning framework for classification trees in the presence of a stratification factor such as a temporal, spatial, or demographic variable, acting as a control variable or potential confounder. Standard tree growth procedures are not designed to optimize a conditional split rule. A model-based split rule is proposed in which child nodes are interpreted as latent components of a simultaneous mixture model, such as the Simultaneous Latent Budget Model and its constrained versions, fitted to the parent node. Mixing parameters drive the observations, differently for each group, to the child nodes, whereas latent budget parameters update the response-class profile of each level of the control variable. Parameters are estimated by least squares considering a neural network perspective of the model. An informative tree structure can be interactively visualized with interpretation aids on the nodes and the paths, including visual pruning and a decision tree selection procedure. Suitable measures are proposed to handle an unbalanced response class distribution. The proposed methodology is applied to investigate gender-related differences in disease progression of Amyotrophic Lateral Sclerosis. The SLBT library with the various tree-based algorithms is available in the linked GitHub repository.

07.
arXiv (quant-ph) 2026-06-11

Emergent Bell Phase in an Electro-Nanomechanical Quantum Simulator

arXiv:2511.02613v2 Announce Type: replace Abstract: Suspended carbon nanotubes hosting electrostatically defined quantum dots allow for exceptionally strong and tunable electromechanical coupling as well as mechanical modes that can reach the quantum ground state of motion simply by cryogenic cooling. This makes them a unique platform for quantum simulation of electron-phonon coupling. Here, we propose an experimentally realisable setup with two such carbon nanotubes in parallel, each hosting four quantum dots. Our system not only exhibits phonon-mediated electron-electron attraction, but also supports a robust, maximally entangled Bell phase at mesoscopic scales shared across the subsystems. These features highlight its potential as a simulator of strongly correlated quantum systems.

08.
arXiv (CS.CL) 2026-06-25

Beyond Function Calling: Benchmarking Tool-Using Agents under Tool-Environment Unreliability

Large language models are increasingly deployed as agents that solve tasks by interacting with external tool environments. Although recent tool-use benchmarks increasingly cover complex task settings, they still largely assume clean, stable, and trustworthy tool environments, leaving tool-environment unreliability insufficiently examined. We introduce ToolBench-X, a benchmark for evaluating agents under recoverable reliability hazards. ToolBench-X contains executable multi-step tasks across diverse domains and sequential, parallel, and mixed workflows, each paired with deterministic tools and a canonical final answer for automatic evaluation. Starting from clean tool environments, ToolBench-X injects five structured hazard types: Specification Drift, Invocation Error, Execution Failure, Output Drift, and Cross-source Conflict. Crucially, each injected instance remains solvable through at least one valid recovery path, such as retrying, fallback, verification, or cross-checking. Experiments reveal a substantial reliability gap: agents that perform well with reliable tools often fail under recoverable hazards. Further analysis shows that failures are driven less by tool-use volume or inference budget than by limited hazard diagnosis and ineffective recovery. Targeted recovery hints recover many failed tasks, while test-time scaling yields more limited gains. These results suggest that tool-use evaluation should move beyond function-call accuracy toward task completion under unreliable tool environments. The code and data are available at https://github.com/Foreverskyou/ToolBench-X.

09.
arXiv (CS.LG) 2026-06-16

An RRAM-based Hardware Implementation of a Radial Basis Function Neuron for Edge Classifiers

arXiv:2606.14739v1 Announce Type: cross Abstract: The deployment of modern machine learning (ML) solutions on resource-constrained edge devices highlights implementation challenges. This is especially true for extreme edge applications that include safety-critical components, such as autonomous navigation tasks. This paper demonstrates an artificial neural network (ANN) design leveraging Metal-Oxide Resistive RAM (RRAM)-based Analogue Content Addressable Memory (ACAM) as an efficient hardware substrate for performing metric-based classification and online adaptation on the edge. The proposed design is based on a custom Template piXeL (TXL) cell used for building the ACAM module, where each TXL cell acts as a configurable receptive field neuron. These cells employ a Radial Basis activation function to calculate the distance of an input from the programmed receptive field. The TXL can be organised into dense arrays for calculating the distance of a high-dimensional input against all stored prototypes, effectively performing fast and energy-efficient similarity search. This hardware engine enables on-the-fly learning, where the receptive field parameters can be tuned to track domain shift. Through simulation of the proposed TXL-RBF classifier we can achieve 89.1% accuracy on the MNIST dataset while consuming 185 fJ per cell per operation when operating at 100 MHz.

10.
arXiv (CS.CV) 2026-06-16

BadWorld: Adversarial Attacks on World Models

Visual world models (VWMs) synthesize interactive, action-conditioned rollouts from a single context image. However, it remains an open question how robust these models are to adversarial perturbations. Standard adversarial attacks fail to assess this vulnerability because attackers lack ground-truth future videos and cannot predict subsequent user controls. We introduce BadWorld, a label-free adversarial framework tailored for autoregressive VWMs that systematically overcomes both constraints. First, to bypass the need for future supervision, we propose a self-supervised velocity attack that directly disrupts the early denoising dynamics of the model. Second, to ensure the attack generalizes across unpredictable user actions, we formulate a trajectory-adaptive bi-level optimization that actively mines hard control sequences to forge control-agnostic perturbations. Evaluated on representative VWMs with continuous and discrete controls, BadWorld exposes severe structural fragility. Visually indistinguishable adversarial images reliably trigger catastrophic degradation in future rollouts, leading to incomplete denoising, structural collapse, and control inconsistency. These findings reveal critical risks for deploying VWMs in safety-critical systems while highlighting a practical mechanism for privacy protection.

11.
arXiv (CS.LG) 2026-06-24

KLip-PPO: A per-sample KL perspective on PPO-Clip

arXiv:2606.23932v1 Announce Type: new Abstract: Proximal Policy Optimization (PPO) is the standard policy-gradient algorithm for on-policy reinforcement learning. The literature presents it in two forms, a clipped surrogate that bounds the importance ratio between successive policies and a Kullback-Leibler penalty between them. These forms are treated as separate algorithms with their own gradients, their own hyperparameters, and their own reference implementations, and a sizeable body of empirical work compares them. We show that the gradient of the clipped surrogate is reproduced exactly by a Kullback-Leibler surrogate whose coefficient varies per sample, with closed-form dependence on the importance ratio and the advantage. The identity holds at every minibatch step and across the entire inner loop, and on five MuJoCo continuous-control benchmarks the two losses produce indistinguishable training curves. The reformulation exposes a structural feature of the clipped surrogate that the min notation hides. PPO-Clip's implicit per-sample penalty is a step function at the boundary of the trust region, and the shape of this coefficient is the natural design axis for generalising the algorithm. We sketch the resulting follow-up directions in the discussion.
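Reader's note: the "step function at the boundary of the trust region" can be seen directly from the standard clipped surrogate, without reference to the paper's KL reformulation. The sketch below is not taken from the paper; the function names and the default `eps=0.2` are illustrative assumptions. It differentiates the per-sample objective min(r·A, clip(r, 1−ε, 1+ε)·A) with respect to the importance ratio r.

```python
def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def clipped_surrogate(ratio, advantage, eps=0.2):
    """Per-sample PPO-Clip objective: min(r*A, clip(r, 1-eps, 1+eps)*A)."""
    return min(ratio * advantage, clip(ratio, 1.0 - eps, 1.0 + eps) * advantage)


def surrogate_grad_wrt_ratio(ratio, advantage, eps=0.2):
    """Derivative of the per-sample objective with respect to the ratio.

    The derivative is A until the ratio leaves the trust region in the
    direction the advantage favours, and 0 after that: a step function.
    (Exactly at the boundary the objective is not differentiable; this
    sketch returns A there by convention.)
    """
    if advantage > 0 and ratio > 1.0 + eps:
        return 0.0
    if advantage < 0 and ratio < 1.0 - eps:
        return 0.0
    return advantage


if __name__ == "__main__":
    for r in (0.7, 1.0, 1.3):
        for a in (2.0, -1.0):
            print(f"r={r:.1f} A={a:+.1f} grad={surrogate_grad_wrt_ratio(r, a):+.1f}")
```

Note the asymmetry: a sample with positive advantage still receives gradient when its ratio is below 1−ε, because the update would move it back towards the trust region.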

12.
arXiv (CS.AI) 2026-06-11

MLaGA: Multimodal Large Language and Graph Assistant

arXiv:2506.02568v2 Announce Type: replace Abstract: Large Language Models (LLMs) have demonstrated substantial efficacy in advancing graph-structured data analysis. Prevailing LLM-based graph methods excel in adapting LLMs to text-rich graphs, wherein node attributes are text descriptions. However, their applications to multimodal graphs–where nodes are associated with diverse attribute types, such as texts and images–remain underexplored, despite their ubiquity in real-world scenarios. To bridge the gap, we introduce the Multimodal Large Language and Graph Assistant (MLaGA), an innovative model that adeptly extends LLM capabilities to facilitate reasoning over complex graph structures and multimodal attributes. We first design a structure-aware multimodal encoder to align textual and visual attributes within a unified space through a joint graph pre-training objective. Subsequently, we implement a multimodal instruction-tuning approach to seamlessly integrate multimodal features and graph structures into the LLM through lightweight projectors. Extensive experiments across multiple datasets demonstrate the effectiveness of MLaGA compared to leading baseline methods, achieving superior performance in diverse graph learning tasks under both supervised and transfer learning scenarios.

13.
arXiv (CS.AI) 2026-06-25

ExTra: Exploratory Trajectory Optimization for Language Model Reinforcement Learning

arXiv:2606.24994v1 Announce Type: cross Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) for language-model reasoning can fail at both extremes of task difficulty: easy prompts often produce all-correct, low-diversity rollout groups with little gradient signal, while hard prompts can produce all-incorrect groups with no positive reward. We introduce ExTra (Exploratory Trajectory Optimization), a GRPO-compatible framework that extracts exploration signals from the model's own rollouts. ExTra combines two mechanisms: (i) a novelty reward that adds embedding-based diversity bonuses after GRPO normalization, rewarding diverse correct solutions; and (ii) entropy-guided prefix regeneration, which scores partial trajectories using entropy signals and continues exploration from promising intermediate steps. Across six mathematical reasoning benchmarks, ExTra improves Qwen3-1.7B over GRPO by about +5 points on pass@1 and +7 points on pass@16, showing that trajectory-level exploration signals can improve both single-sample accuracy and inference-time coverage.
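Reader's note: the failure mode at both difficulty extremes follows from group-relative normalization. The sketch below is not the ExTra implementation; it assumes the common GRPO-style advantage (reward minus group mean, divided by group standard deviation), with a population standard deviation and a small `eps` chosen here for illustration. When every rollout in a group receives the same reward, every advantage is zero and the group contributes no gradient.

```python
from statistics import mean, pstdev


def group_advantages(rewards, eps=1e-6):
    """Group-relative advantages: (r - mean) / (std + eps).

    Assumption: population standard deviation; implementations differ
    on this detail and on the value of eps.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    print("easy prompt, all correct :", group_advantages([1, 1, 1, 1]))
    print("hard prompt, all wrong   :", group_advantages([0, 0, 0, 0]))
    print("mixed group              :", group_advantages([1, 0, 0, 1]))
```

Only the mixed group yields non-zero advantages, which is why an additional exploration signal is needed for the two degenerate cases.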

14.
arXiv (CS.LG) 2026-06-25

$DT^2$: Decision-Targeted Digital Twins

arXiv:2606.25923v1 Announce Type: new Abstract: A digital twin (DT) is a virtual model of a real-world system that can assist decision-making by simulating scenarios induced by different policies. However, typical machine learning-based DTs do not optimise for this use case. We prove that, when model capacity is limited, training DTs to minimise one-step transition errors can produce suboptimal models for ranking sets of policies according to a reward function. We further show that this holds empirically, even with expressive model classes. To address this, we introduce $DT^2$, a decision-targeted DT training paradigm. Firstly, $DT^2$ uses fitted Q-evaluation to estimate values of candidate policies from offline data. A DT is then trained to generate rollouts that preserve pairwise policy rankings derived from these proxy ground-truth values with an architecture-agnostic loss function. We empirically demonstrate the efficacy of our method across a range of settings and architectures. $DT^2$ consistently improves policy ranking and reduces decision regret during policy selection relative to conventional DT training, both for policies used during training and for unseen policies, while maintaining a good level of raw simulation fidelity.

15.
arXiv (CS.CL) 2026-06-15

Large Language Model Agents Are Not Always Faithful Self-Evolvers

Self-evolving large language model (LLM) agents continually improve by accumulating and reusing past experience, yet it remains unclear whether they faithfully rely on that experience to guide their behavior. We present the first systematic investigation of experience faithfulness, the causal dependence of an agent's decisions on the experience it is given, in self-evolving LLM agents. Using controlled causal interventions on both raw and condensed forms of experience, we comprehensively evaluate four representative frameworks across 13 LLM backbones and 9 environments. Our analysis uncovers a striking asymmetry: while agents consistently depend on raw experience, they often disregard or misinterpret condensed experience, even when it is the only experience provided. This gap persists across single- and multi-agent configurations and across backbone scales. We trace its underlying causes to three factors: the semantic limitations of condensed content, internal processing biases that suppress experience, and task regimes where pretrained priors already suffice. These findings challenge prevailing assumptions about self-evolving methods and underscore the need for more faithful and reliable approaches to experience integration.

16.
arXiv (CS.AI) 2026-06-19

QueryGaussian: Scalable and Training-Free Open-Vocabulary 3D Instance Retrieval

arXiv:2606.19733v1 Announce Type: cross Abstract: Efficiently retrieving specific 3D instances from large-scale scenes via natural language prompts remains a formidable challenge in multimedia analysis. Existing approaches predominantly follow a "scene-level embedding" paradigm, which requires distilling high-dimensional semantic features into every 3D primitive. This strategy suffers from a fundamental architectural bottleneck: memory and computational costs scale linearly with scene complexity, inevitably triggering out-of-memory (OOM) failures in city-scale environments. To address this barrier, we propose QueryGaussian, a training-free framework for expeditious and scalable open-vocabulary 3D instance retrieval. Unlike holistic semantic distillation, QueryGaussian employs an instance-level query mechanism that decouples semantic understanding from geometric representation. Specifically, we leverage pre-trained 2D vision models to interpret user prompts and lift segmentation masks into 3D via a concurrent maximum-weight association strategy, ensuring semantic-visual consistency. To mitigate projection ambiguity, we introduce a temporal fusion module with multi-stage adaptive density clustering. Experimental results demonstrate that QueryGaussian not only matches the accuracy of state-of-the-art methods but also delivers a decisive efficiency leap, reducing GPU memory usage by over 70% and accelerating inference by 180x. Crucially, QueryGaussian enables expeditious instance retrieval on city-scale scenes containing tens of millions of Gaussians using consumer-grade hardware.

17.
arXiv (CS.AI) 2026-06-24

THEIA: Learning Complete Kleene Three-Valued Logic in a Pure-Neural Modular Architecture

arXiv:2604.11284v5 Announce Type: replace-cross Abstract: We present THEIA, a 2.75M-parameter modular neural architecture that learns the complete Kleene three-valued logic (K3) truth table from task data without external symbolic inference or hand-encoded K3 gate primitives. Across 5 seeds it passes all 39 K3 rules at >99% per-rule accuracy. K3 learnability is not the central finding: Transformer baselines also pass all 39 rules, and flat MLPs match THEIA on Phase-1 accuracy within 0.04pp. The contributions are two properties of the learned system. (1) Uncertainty-verdict asymmetric propagation. THEIA preserves Has-Unknown at every upstream boundary (80.0/91.1/90.8/99.7% across Arith/Order/Set/Logic vs. ~52% majority) while final-verdict decodability stays at or below a 73.4% U-vs-non-U oracle reference under linear and nonlinear probes. Activation patching on non-absorbent T→U cases flips 4,898/4,898 OR and 4,719/4,719 AND pairs across 5 seeds, ruling out residual shortcuts. (2) Reliability spectrum under discretized end-to-end training, on tasks decomposable along the engine boundaries. A mod-3 sequential composition task generalizes from 5- to 500-step evaluation at 99.96±0.04% (5 seeds). Under identical Gumbel-softmax training, flat MLPs collapse to chance by 50 steps; a 2x2 ResMLP grid reaches ≥99% on only 3/20 (config, seed) trials; a pre-LN Transformer reaches 99.24±0.34%. Straight-through discretization prevents 0.999^500 compounding; the architectural separator is sustaining Phase-1 accuracy under Phase-3 training, where flat MLPs fail. Auxiliary: under per-architecture development defaults (not optimizer-controlled), THEIA reaches 12/12 Kleene coverage 6.5x faster than a parameter-comparable 8L Transformer; this narrows to ~3.6x with Transformer-standard tuning and 4.93x with the same recipe on both. Ratios are config-specific, not asymptotic.
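Reader's note: for orientation, the strong Kleene connectives that the model is trained to reproduce can be written in a few lines. This is the textbook definition, not code from the paper; the integer encoding F < U < T and the function names are assumptions made for this sketch, and the paper's 39-rule set is not reproduced here.

```python
# Encode the three truth values so that F < U < T.
F, U, T = 0, 1, 2
NAMES = {F: "F", U: "U", T: "T"}


def k3_not(a):
    """Negation swaps T and F and leaves U fixed."""
    return 2 - a


def k3_and(a, b):
    """Strong Kleene conjunction is the minimum under F < U < T."""
    return min(a, b)


def k3_or(a, b):
    """Strong Kleene disjunction is the maximum under F < U < T."""
    return max(a, b)


if __name__ == "__main__":
    values = (F, U, T)
    print("a b | AND OR")
    for a in values:
        for b in values:
            print(f"{NAMES[a]} {NAMES[b]} |  {NAMES[k3_and(a, b)]}   {NAMES[k3_or(a, b)]}")
```

The table shows why an unknown input does not always produce an unknown verdict: T absorbs U under OR and F absorbs U under AND, whereas `k3_and(T, U)` and `k3_or(F, U)` remain U.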

18.
arXiv (CS.AI) 2026-06-16

Bayesian 3D Steerable CNNs: Enabling Equivariance and Uncertainty Quantification Simultaneously

arXiv:2606.15479v1 Announce Type: cross Abstract: Steerable convolutional neural networks (Steerable-CNNs) guarantee SE(3)-equivariance by parameterizing kernels as linear combinations of steerable basis functions, but their deterministic nature precludes uncertainty quantification - limiting their use in settings where confidence estimates are essential. We propose a Bayesian Steerable-CNN that places posterior distributions over the basis coefficients, yielding stochastic kernels while preserving equivariance exactly. The loss function of the model is obtained via variational inference and minimized by Bayes-by-Backpropagation. The framework admits a decomposition of predictive uncertainty into epistemic and aleatoric components. Empirically, the model attains competitive classification accuracy alongside an expected calibration error of 0.0263 and outperforms its deterministic counterpart by up to 6.17% under distributional shift induced by additive Gaussian noise. Furthermore, we leverage the model's uncertainty estimates to enhance its performance significantly, achieving a notable gain - approximately 4% higher accuracy across 84% of the test dataset. A statistically significant negative correlation between epistemic uncertainty and prediction error confirms that the learned posterior variance is semantically meaningful. The framework unifies Bayesian uncertainty quantification with the inductive bias of equivariant CNNs.
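Reader's note: the expected calibration error quoted above is conventionally computed by binning predictions by confidence and averaging the gap between accuracy and mean confidence per bin. The sketch below shows that standard estimator; the number of bins, the equal-width binning, and the function name are assumptions, since the abstract does not state the paper's exact protocol.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Standard binned ECE: sum over bins of (bin weight) * |accuracy - mean confidence|.

    confidences: predicted probability of the chosen class, each in [0, 1].
    correct:     booleans, True where the prediction was right.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Equal-width bins; a confidence of exactly 1.0 goes in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


if __name__ == "__main__":
    # Ten predictions at 90% confidence, nine of them right: well calibrated.
    print(expected_calibration_error([0.9] * 10, [True] * 9 + [False]))
    # Four predictions at 100% confidence, half of them right: overconfident.
    print(expected_calibration_error([1.0] * 4, [True, True, False, False]))
```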

19.
arXiv (CS.AI) 2026-06-17

No-Free-Fairness: Fundamental Limits and Trade-offs in Learning Systems

arXiv:2606.17810v1 Announce Type: cross Abstract: In this paper, we establish a set of theoretical impossibility results, termed the No-Free-Fairness theorems, that identify three fundamental sources of disparity in learning systems. First, we show that when a task exhibits irreducible cost on a subgroup, any decision rule must trade off overall performance with disparity, yielding an inherent fairness–cost frontier. Second, we prove that even in ideal, noise-free settings where a perfectly fair and accurate solution exists, finite-sample learning alone induces nontrivial subgroup disparity, ruling out distribution-free fairness guarantees. More seriously, enforcing strict relative fairness creates a statistical bottleneck: achieving low cost may require exponentially many samples. Third, we show that limitations of the model class can independently induce disparity: if the model cannot represent accurate solutions for a subgroup, fairness remains unattainable regardless of data or training procedure. Overall, these results demonstrate that unfairness is not solely a consequence of biased data or suboptimal optimization, but arises from the intrinsic structure of decision problems, the constraints of finite data, and the expressivity of models. Our framework applies broadly beyond standard supervised learning, and suggests that achieving fairness requires explicit trade-offs and should be treated as a core design consideration.

20.
arXiv (CS.LG) 2026-06-16

Diversity-Driven Offline Multi-Objective Optimization via Nested Pareto Set Learning

arXiv:2606.15115v1 Announce Type: new Abstract: Multi-objective optimization (MOO) has emerged as a powerful approach to solving complex optimization problems involving multiple objectives. In many practical scenarios, function evaluations are unavailable or prohibitively expensive, necessitating optimization solely based on a fixed offline dataset. In this setting, known as offline MOO, the goal is to find the Pareto set without access to the true objective functions. This setting suffers from the out-of-distribution (OOD) issue, where the surrogate model is not accurate for unseen designs. Due to the OOD issue, surrogate errors may cause the optimizer to select solutions that do not lie on the true Pareto front and are biased toward its extremes. To address this, this paper proposes Diversity-driven Offline Multi-Objective Optimization (DOMOO), which aims to find a diverse and high-quality set of solutions. First, DOMOO incorporates an accumulative risk control module that estimates the potential risk of candidate solutions and alleviates the OOD issue between the training data and the generated solutions. In addition, a nested Pareto set learning (PSL) strategy is proposed to jointly learn preference and PSL parameters, then optimize them, enabling adaptation to diverse Pareto front geometries. To further enhance solution quality, we design a diversity-driven selection strategy that extracts a representative and well-distributed set of final solutions. To achieve this diversity-driven selection strategy, we propose $\mathrm{IGD}_{\mathrm{offline}}$, a tailored indicator for the offline setting that considers both diversity and convergence, and avoids the bias of the hypervolume indicator. Extensive experiments on synthetic and real-world benchmarks show that DOMOO achieves the best average rank across tasks in both convergence and diversity among the compared methods.
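Reader's note: the paper's offline indicator is not specified in the abstract, so the sketch below shows only the classical inverted generational distance (IGD) that its name refers to: the mean distance from each reference-front point to its nearest solution. The reference front, the sample points, and the function name are illustrative assumptions. The classical form needs a known reference front, which is exactly what the offline setting lacks.

```python
import math


def igd(reference_front, solutions):
    """Classical IGD: average, over reference points, of the Euclidean
    distance to the closest solution. Lower is better."""
    total = 0.0
    for ref in reference_front:
        total += min(math.dist(ref, sol) for sol in solutions)
    return total / len(reference_front)


if __name__ == "__main__":
    # Hypothetical two-objective reference front.
    reference = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
    print("full coverage  :", igd(reference, reference))
    print("both extremes  :", igd(reference, [(0.0, 1.0), (1.0, 0.0)]))
    print("one extreme    :", igd(reference, [(0.0, 1.0)]))
```

A solution set clustered at one extreme of the front scores worse than one that covers it, even though every solution lies on the front; this is how IGD accounts for diversity as well as convergence.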

21.
arXiv (CS.AI) 2026-06-19

Augmenting Game AI with Deep Reinforcement Learning

arXiv:2606.20210v1 Announce Type: new Abstract: Immersion in video games depends not only on graphics, audio, and game mechanics, but also on the quality of in-game characters. Producing believable characters, or game AI, remains a significant challenge as behavioral complexity is hard to capture with hand-coded systems. Game AI is a source of immersion and engagement; however, the limitations stemming from the challenges of creating game AI often lead to frustration and the breaking of the illusion of realism within the game. The introduction of machine learning models opens the door to creating more believable, authentic, and relatable characters in games. The promise is that they either learn from interacting with the game, or from player data, to develop true human-like behavior. In this paper, we envision more applications of reinforcement learning for game AI in the future. For this to materialize, current research limitations are prohibitive to broad deployment across game genres. Therefore, we propose a framework for training reinforcement learning models with a set of requirements in mind that are suited towards game AI and game development. We present examples of games with reinforcement learning-augmented game AI and describe the practicalities of deploying player-facing machine learning agents in modern games. Furthermore, we identify bottlenecks and hard problems in these areas, which we believe offer promising research directions to accelerate the adoption of machine learning in game AI for the video game industry.

22.
arXiv (quant-ph) 2026-06-15

Multi-entropy in random tensor networks

arXiv:2606.04470v2 Announce Type: replace-cross Abstract: We study the evaluation of Rényi multi-entropies $S^{(q)}_n$ in Random Tensor Network (RTN) states in the large bond-dimension limit. For the case of Rényi index $n=2$ and arbitrary number of parties $q$, we prove that multi-entropies are determined by minimal multiway cuts through the network. When the minimal multiway cut is degenerate, we characterize the full minimizer set via compatible families of minimal cuts and give a criterion for all minimizers to come from ordinary cut partitions. For $n=2$, this gives a natural generalization of the minimal cut description of bipartite entanglement to multipartite systems with arbitrarily many parties. For the case of integer $n>2$, we show that the minimal multiway cut conjecture is in general not true by providing explicit counterexamples for both the single random tensor and for the network built from isometric tilings. We discuss the implications of our results for the multipartite entanglement structures in RTN and holography.

23.
arXiv (CS.AI) 2026-06-18

Improving Scientific Document Retrieval with Academic Concept Index

arXiv:2601.00567v2 Announce Type: replace-cross Abstract: Adapting general-domain retrievers to scientific domains is challenging due to the scarcity of large-scale domain-specific relevance annotations and the substantial mismatch in vocabulary and information needs. Recent approaches address these issues through two independent directions that leverage large language models (LLMs): (1) generating synthetic queries for fine-tuning, and (2) generating auxiliary contexts to support relevance matching. However, both directions overlook the diverse academic concepts embedded within scientific documents, often producing redundant or conceptually narrow queries and contexts. To address this limitation, we introduce an academic concept index, which extracts key concepts from papers and organizes them guided by an academic taxonomy. This structured index serves as a foundation for improving both directions. First, we enhance the synthetic query generation with concept coverage-based generation (CCQGen), which adaptively conditions LLMs on uncovered concepts to generate complementary queries with broader concept coverage. Second, we strengthen the context augmentation with concept-focused auxiliary contexts (CCExpand), which leverages a set of document snippets that serve as concise responses to the concept-aware CCQGen queries. Extensive experiments show that incorporating the academic concept index into both query generation and context augmentation leads to higher-quality queries, better conceptual alignment, and improved retrieval performance.

24.
medRxiv (Medicine) 2026-06-18

Biomedical Capacity, Governance, and Health Security: A Dominican Republic Research Analysis of Stakeholder Perspectives

The COVID-19 pandemic exposed critical vulnerabilities in globally concentrated biomedical supply chains and accelerated interest in nearshoring and hemispheric health-security strategies. The Dominican Republic, already the third-largest medical device exporter in Latin America, occupies a strategically significant but institutionally constrained position within this realignment. This study evaluates stakeholder perceptions of the principal opportunities and barriers affecting biomedical ecosystem development in the Dominican Republic, with particular attention to governance, workforce capacity, and value-chain upgrading pathways. Methods. A concurrent mixed-methods design was employed, integrating a cross-sectional electronic survey of 142 purposively sampled domain experts (administered September-December 2025) with a qualitative executive consultation with senior government and industry leaders. Survey analyses combined descriptive statistics, one-sample t-tests against the scale neutral midpoint, chi-square goodness-of-fit tests, Friedman non-parametric ranking, Spearman rank correlations, and exploratory linear and logistic multivariable regression. Qualitative responses were analyzed using a framework approach grounded in the Triple Helix model of innovation systems. Results. Perceived government support was significantly below neutral (mean = 2.67, SD = 1.12; p = 0.034). Workforce shortages (83.3%) and weak academia-industry collaboration (71.4%) were the most frequently endorsed barriers (χ²(5) = 18.7, p = 0.002). Regulatory modernization (88.1%) and workforce development (85.7%) ranked as the highest-priority policy levers (Friedman p = 0.005). Clinical trials and contract research organization services were the dominant sub-sector priority (76.2%, binomial p < 0.001).
In multivariable analysis, perceived government support, talent availability, and confidence in IP protection jointly explained 46% of the variance in sector competitiveness (R² = 0.46, p < 0.001). Strong majority support existed for a formal public-private biomedical coordination authority (73.8%, p < 0.001). Conclusion. Institutional credibility and advanced human capital, rather than geography or market access, are the perceived binding constraints on the Dominican Republic's biomedical trajectory. Regulatory modernization, targeted workforce investment, and the establishment of a national biomedical coordination authority represent the highest-leverage interventions for positioning the country as a hemispheric hub for biomedical manufacturing, clinical research, and health security.

25.
arXiv (CS.AI) 2026-06-15

Unsupervised Learning of Efficient Exploration: Pre-training Adaptive Policies via Self-Imposed Goals

arXiv:2601.19810v2 Announce Type: replace-cross Abstract: Unsupervised pre-training can equip reinforcement learning agents with prior knowledge and accelerate learning in downstream tasks. A promising direction, grounded in human development, investigates agents that learn by setting and pursuing their own goals. The core challenge lies in how to effectively generate, select, and learn from such goals. Our focus is on broad distributions of downstream tasks where solving every task zero-shot is infeasible. Such settings naturally arise when the target tasks lie outside of the pre-training distribution or when their identities are unknown to the agent. In this work, we (i) optimize for efficient multi-episode exploration and adaptation within a meta-learning framework, and (ii) guide the training curriculum with evolving estimates of the agent's post-adaptation performance. We present ULEE, an unsupervised meta-learning method that combines an in-context learner with an adversarial goal-generation strategy that maintains training at the frontier of the agent's capabilities. On XLand-MiniGrid benchmarks, ULEE pre-training yields improved exploration and adaptation abilities that generalize to novel objectives, environment dynamics, and map structures. The resulting policy attains improved zero-shot and few-shot performance, and provides a strong initialization for longer fine-tuning processes. It outperforms learning from scratch, DIAYN pre-training, and alternative curricula. Code is available at: https://github.com/Octavio-Pappalardo/ulee-jax