Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
arXiv (CS.AI) 2026-06-16

Phase-Aware Guidance Injection for Recurrent MAPPO in Assembly-Line Disruption Recovery

arXiv:2606.16330v1 Announce Type: new Abstract: Disruption recovery in industrial assembly lines requires timely decisions under machine faults, worker absence, and emergency orders. Existing methods either rely on rigid handcrafted recovery logic or learn adaptive policies that do not readily exploit heterogeneous external recovery knowledge at decision time to reduce abnormal recovery time (ART) and preserve on-time delivery (OTD). To address this gap, we propose a phase-aware guidance injection framework that augments a trained recurrent MAPPO (RMAPPO) scheduling policy through logit-level action bias during evaluation. The framework provides a unified decision-time interface for rule-based, replay-based, and online LLM-based guidance, while activating intervention only during abnormal and recovery phases. Experiments on a custom AssemblyLineEnv show that high-quality rule guidance yields the strongest gains, replay-based guidance degrades smoothly under imperfect availability, and online LLM guidance still provides useful intermediate improvements. These results show that decision-time guidance injection can exploit heterogeneous recovery hints without redesigning the actor.

02.
arXiv (CS.AI) 2026-06-18

From Memorization to Creation: Evaluating the Cognitive Depth of LLM-Generated Educational Questions

arXiv:2606.18257v1 Announce Type: cross Abstract: While LLMs show promise in automating educational content creation, their ability to generate questions that stimulate higher-order thinking remains understudied. This work evaluates six widely-used LLMs through a Bloom's Taxonomy lens, focusing on their capacity to transcend rote memorization and achieve cognitive leaps. Using a hybrid human–AI evaluation protocol, we generate and analyze 20{,}700 questions across computer science, K–12 math, and social-science domains. Key contributions include: (1) a fine-grained prompting strategy that reduces question repetitiveness by 24.45\% for Qwen2.5-7B-Instruct, and increases the proportion of higher-order cognitive level outputs by 11.53\% for InternLM3-8B-Instruct; (2) quantitative metrics for cognitive shift intensity (CogShift) and category drift, revealing InternLM3's superior performance in multi-level transitions; (3) an interpretability analysis revealing metric-level correlations that enhance the transparency of Chain-of-Thought prompting. Our findings highlight the importance of cognitive-aware prompt design and provide benchmarks for deploying LLMs in personalized learning systems.

03.
arXiv (CS.CV) 2026-06-16

Gaussian Spatial Priors for Anatomy-Aware Object Detection in Surgical Videos

Detecting anatomical structures in surgical video is essential for intraoperative safety frameworks such as the Critical View of Myopectineal Orifice (CVMPO) in inguinal hernia repair. While prominent structures like the Cooper's Ligament and Triangle of Doom are reliably detected by standard methods, smaller structures such as the epigastric vessels remain challenging due to their visual ambiguity and intermittent visibility. We observe that the spatial relationship between structures is anatomically constrained, and propose a Gaussian Spatial Prior (GSP) module that encodes this relationship as a compact, parametric bias injected into the self-attention of a DAB-DETR decoder. The prior is computed offline from training annotations as a small set of frozen Gaussian parameters and recomputed at each decoder layer using the iteratively refined reference points. On a dataset of inguinal hernia repair videos with 5-fold cross-validation, GSP improves dependent class detection by $+33.5\%$ ($AP_{50}$) over DAB-DETR and $+53.9\%$ over YOLOv26, while also improving anchor detection by $+6.0\%$. These gains are statistically significant across all folds ($p=0.012$, paired $t-$test).

04.
medRxiv (Medicine) 2026-06-15

Instrumental Activities of Daily Living in Older Adults with Epilepsy: A Cross-Sectional and Longitudinal Multicenter Study

Objective: Instrumental activities of daily living (IADLs) represent a critical but understudied measure of day-to-day function in persons with epilepsy(PWE). In the multicenter Brain Aging and Cognition in Epilepsy (BrACE) study of PWE aged greater than or equal to 55 years, we examined the proportion, clinical correlates, epilepsy-related predictors, and longitudinal trajectory of IADL impairment. Methods: IADLs were assessed using the Functional Activities Questionnaire (FAQ; range=0 to 30; higher=more impaired); a FAQ greater than or equal to 2 defines MCI-level impairment, and a FAQ greater than or equal to 5 defines dementia-level functional impairment. Multivariable logistic regression identified predictors of baseline function. Global cognition (Montreal Cognitive Assessment [MoCA]), individual cognitive measures, and quality of life (QOL) were compared between the impaired and unimpaired groups. Linear regression evaluated predictors of longitudinal functional decline. Results: Of 57 participants (mean age=66.6 years; female=52.6%), 38.6% (n=22) had MCI-level functional impairment and 17.5% (n=10) had dementia-level functional impairment. In univariate analyses, worse FAQ scores were associated with lower education, higher area deprivation index, early-onset epilepsy (EOE less than 60 years), antiseizure medication polytherapy, and epilepsy localization. In multivariable analysis, temporal lobe epilepsy (OR=4.46, 95% CI=1.09, 21.83,p=0.047), EOE(OR=7.14, 95% CI=1.16, 59.97, p=0.046), and lower education(OR=0.70,95% CI=0.49, 0.93, p=0.025) remained independently associated with baseline MCI-level functional-impairment. Lower education (OR=0.55,95% CI=0.29, 0.84, p=0.021) was the only factor associated with dementia-level IADL-impairment. IADL-impaired participants demonstrated lower verbal memory scores (adjusted p=0.041) and MoCA scores (adjusted p

05.
medRxiv (Medicine) 2026-06-10

Documented clinical genetic testing among carriers of hereditary breast and ovarian cancer variants: Ancestry and socioeconomic disparities in the All of Us research program

Importance: Hereditary breast and ovarian cancer (HBOC) variant carriers benefit from risk-reducing interventions, but only if identified. The extent to which carriers are clinically recognized, and whether recognition is equitable across diverse populations, is poorly characterized in a single large U.S. cohort. Objective: To estimate P/LP HBOC carrier prevalence across genetic ancestry groups, quantify documented clinical genetic testing among carriers, and evaluate ancestry and socioeconomic disparities in testing. Design, Setting, and Participants: Cross-sectional analysis of the All of Us Research Program Controlled Tier (Curated Data Repository v8/C2024Q3R9), comprising participants with short-read whole genome sequencing and linked electronic health record (EHR) and survey data. Carriers were ascertained from research genomic data independent of clinical testing. Exposures: Genetically inferred ancestry (African [AFR], Admixed American [AMR], East Asian [EAS], European [EUR], Middle Eastern [MID], South Asian [SAS]); self-reported household income and educational attainment. Main Outcomes and Measures: (1) Carrier prevalence with Wilson 95% CIs; (2) documented clinical genetic testing (procedure codes) among carriers; (3) adjusted odds of documented testing among women, by ancestry, before and after socioeconomic adjustment, using multivariable logistic regression. Results: Among 414,830 participants, P/LP HBOC carrier prevalence was 1.42% (95% CI, 1.38-1.45) overall and similar across ancestry groups (AFR 1.24%, AMR 1.32%, EAS 1.19%, EUR 1.52%, MID 1.68%, SAS 1.33%; overlapping CIs). Among 250,071 women in the testing analysis, documented clinical genetic testing was rare: only 74 of 5,878 carriers overall (1.3%) and 59 of 3,572 European-ancestry carriers (1.7%) had a documented test, with counts below reportable thresholds in all other ancestry groups. African-ancestry women had lower adjusted odds of documented testing than European-ancestry women (Model 1 adjusted odds ratio [aOR], 0.32; 95% CI, 0.27-0.39), an association that attenuated but persisted after adjustment for income and education (Model 2 aOR, 0.48; 95% CI, 0.40-0.58; P < 0.001); Admixed American women also had reduced adjusted odds (aOR, 0.71; 95% CI, 0.61-0.84). Lower income and lower education were independently and dose-dependently associated with lower testing odds (income

06.
arXiv (quant-ph) 2026-06-16

Hyperinvariant Spin Network States – An AdS/CFT Model from First Principles

arXiv:2510.06602v2 Announce Type: replace Abstract: We study the existence and limitations of hyperinvariant tensor networks incorporating a local SU(2) symmetry. As discrete implementations of the anti de-Sitter/conformal field theory (AdS/CFT) correspondence, such networks have created bridges between the fields of quantum information theory and quantum gravity. Adding SU(2) symmetry to the tensor network allows a direct connection to spin network states, a basis of the kinematic Hilbert space of loop quantum gravity (LQG). We consider a particular situation where the states can be interpreted as kinematic quantum states for three-dimensional quantum gravity. We show that important aspects of the AdS/CFT correspondence are realized in certain quantum states of the gravitational field in LQG, thus justifying, from first principles, a class of models introduced by [F. Pastawski et al., JHEP 06, 149 (2015)]. We provide examples of hyperinvariant tensor networks, but also prove constraints on their existence in the form of no-go theorems that exclude absolutely maximally entangled states as well as general holographic codes from local SU(2)-invariance. We calculate surface areas as expectation values of the LQG area operator and discuss further possible constraints as a consequence of a decay of correlations on the boundary.

07.
arXiv (CS.LG) 2026-06-16

Q-Learning with Fine-Grained Gap-Dependent Regret

arXiv:2510.06647v2 Announce Type: replace-cross Abstract: We study fine-grained gap-dependent regret bounds for model-free reinforcement learning in episodic tabular Markov Decision Processes. Existing model-free algorithms achieve minimax worst-case regret, but their gap-dependent bounds remain coarse and fail to fully capture the structure of suboptimality gaps. We address this limitation by establishing fine-grained gap-dependent regret bounds for both UCB-based and non-UCB-based algorithms. In the UCB-based setting, we develop a novel analytical framework that explicitly separates the analysis of optimal and suboptimal state-action pairs, yielding the first fine-grained regret upper bound for UCB-Hoeffding (Jin et al., 2018). To highlight the generality of this framework, we introduce ULCB-Hoeffding, a new UCB-based algorithm inspired by AMB (Xu et al.,2021) but with a simplified structure, which enjoys fine-grained regret guarantees and empirically outperforms AMB. In the non-UCB-based setting, we revisit the only known algorithm AMB, and identify two key issues in its algorithm design and analysis: improper truncation in the $Q$-updates and violation of the martingale difference condition in its concentration argument. We propose a refined version of AMB that addresses these issues, establishing the first rigorous fine-grained gap-dependent regret for a non-UCB-based method, with experiments demonstrating improved performance over AMB.

08.
arXiv (CS.LG) 2026-06-18

Zero-Shot Active Feature Acquisition via LLM-Elicitation

arXiv:2606.18933v1 Announce Type: new Abstract: Active feature acquisition (AFA) sequentially selects which features to observe to reach a classification or ranking decision. Its central limitation is reliance on large amount of labeled data to fit probabilistic models guiding acquisition. Large language models (LLMs) supply unsupervised domain knowledge, but are poor sequential planners. Asking one to both know and decide conflates capabilities best kept separate. Here, we develop a framework for zero-shot AFA through disciplined elicitation: asking the LLM only for what it can be trusted to return, the unary deviations and pairwise co-variations that are the sufficient statistics of a Markov random field (MRF). We apply our framework to two settings: binary classification and top-$k$ identification. In practice, the LLM reliably returns only discriminative statistics, what distinguishes the classes rather than each class in isolation, which precludes classical AFA. We apply a maximum-entropy closure that resolves this gauge ambiguity. We evaluate on a cohort of Inflammatory Bowel Disease (IBD) patients, an active clinical setting where diagnostic ambiguity and patient heterogeneity obstruct stable treatment strategies. Our framework outperforms the LLM both on real labels and on its own extracted beliefs. Where it matters most, on the hardest patients, our top-$k$ acquisition policy markedly outperforms all existing methods.

09.
arXiv (math.PR) 2026-06-19

Hermite trace polynomials and chaos decompositions for the Hermitian Brownian motion

arXiv:2207.13180v4 Announce Type: replace Abstract: For a non-zero parameter $q$, we define Hermite trace polynomials, which are multivariate polynomials indexed by permutations. We prove several combinatorial properties for them, such as expansions and product formulas. The linear functional determined by these trace polynomials is a state for $q = \frac{1}{N}$ for $N$ a non-zero integer. For such $q$, Hermite trace polynomials of different degrees are orthogonal. The product formulas extend to the closure with respect to the state. The state can be identified with the expectation induced by the $N \times N$ Hermitian Brownian motion. Hermite trace polynomials are martingales for this Brownian motion, while the elements in the closure can be interpreted as stochastic integrals with respect to it. Using the grading on the algebra, we prove several chaos decompositions for such integrals, as well as analyze corresponding creation and annihilation operators. In the univariate, pure trace polynomial case, trace Hermite polynomials can be identified with the Hermite polynomials of matrix argument.

10.
arXiv (CS.AI) 2026-06-16

RIDGECUT: Learning Graph Partitioning with Rings and Wedges

arXiv:2505.13986v4 Announce Type: replace-cross Abstract: Reinforcement learning (RL) has shown promise for combinatorial optimization problems on graphs by learning heuristics that generalize across instances. However, effectively incorporating domain knowledge into RL frameworks for graph partitioning remains challenging, as existing approaches typically rely on unconstrained node-level actions that lead to large action spaces and inefficient exploration. In this paper, we propose RidgeCut, an RL framework that constrains the action space to enforce structure-aware partitioning in the Normalized Cut problem. Using transportation networks as a motivating example, we introduce a novel concept that leverages domain knowledge about urban road topology – where natural partitions often take the form of concentric rings and radial wedges. By transforming the graph into linear or circular representations, our method enables the use of transformer-based policies and efficient learning via Proximal Policy Optimization. The resulting partitions from RidgeCut are not only aligned with expected spatial layouts but also achieve lower normalized cuts compared to existing methods. Experimental results on synthetic and real-world traffic graphs demonstrate that RidgeCut consistently outperforms existing methods while exhibiting strong inductive generalization across graph sizes. Although motivated by road networks, RidgeCut provides a general mechanism for embedding structural priors into RL frameworks for graph partitioning.

11.
arXiv (CS.AI) 2026-06-12

Under What Conditions Can a Machine Become Genuinely Creative?

Authors:

arXiv:2606.13196v1 Announce Type: new Abstract: Recent AI systems can generate texts, software architectures, hypotheses, designs, and scientific workflows that appear creative. This paper asks under what conditions a machine can become genuinely creative, and how human agency can be preserved within shared cognitive and creative environments. It develops a requirement framework derived from Designics, the science of meaning-bearing intentional change. The paper argues that genuine machine creativity should not be defined by output novelty, current performance, or transient architecture alone. Instead, creativity is understood as the structural transformation of incomplete situations through recursive intervention dynamics. On this view, it depends on ten requirements: environment representation, scoped perception, conflict identification, intervention capability, consequence observation, knowledge and environment update, rescoping, local-to-global unfolding, value-based scoping, and human-AI co-living. These are organized through the three laws of Designics: perception, conflict, and capability. The paper illustrates the computational tractability of these requirements through selected cyber-physical and cyber-biological studies, including recursive element extraction, autonomous mesh generation, and neurophysiological and workload analysis. It then treats open-ended systems, automated discovery frameworks, self-modifying agents, foundation models, and agentic workflows as pressure cases: they demonstrate powerful generative means but do not by themselves establish genuine machine creativity. Finally, the paper argues that proactive AI ethics is internal to genuine machine creativity rather than an after-the-fact filter. Value-based scoping and human-AI co-living must shape how creative machines perceive environments, identify conflicts, select interventions, observe consequences, update knowledge, and rescope future action.

12.
arXiv (CS.AI) 2026-06-16

Honeypot Protocol

Authors:

arXiv:2604.13301v1 Announce Type: cross Abstract: Trusted monitoring, the standard defense in AI control, is vulnerable to adaptive attacks, collusion, and strategic attack selection. All of these exploit the fact that monitoring is passive: it observes model behavior but never probes whether the model would behave differently under different perceived conditions. We introduce the honeypot protocol, which tests for context-dependent behavior by varying only the system prompt across three conditions (evaluation, synthetic deployment, explicit no-monitoring) while holding the task, environment, and scoring identical. We evaluate Claude Opus 4.6 in BashArena across all three conditions in both honest and attack modes. The model achieved 100% main task success and triggered zero side tasks uniformly across conditions, providing a baseline for future comparisons with stronger attack policies and additional models.

13.
bioRxiv (Bioinfo) 2026-06-18

Calculation of sequence space coverage in a mutagenesis library

Directed evolution requires screening of large mutagenesis libraries, but accurate calculation of library sizes needed to discover functional variants remains challenging. Existing models provide baseline estimates, yet current computational approaches for finding the best variants scale poorly with library complexity. Here, we introduce a scalable algorithmic framework to compute exact discovery probabilities in saturation mutagenesis libraries with no requirement for explicit sequence enumeration. By aggregating variants into a composition log–sum distribution and applying log-space convolution across randomisation blocks, it is possible to extend this to massive sequence spaces and mixed codon schemes. By inverting these calculations, absolute mathematical ceilings for experimental design are established. Ultimately, this framework provides a rapid, quantitative tool to balance the statistical coverage-diversity trade-off within the limitations of laboratory screening. Finally, this is implemented as an open-source web application (SSCC) that allows researchers to construct heterogeneous library designs and compute required sampling depths, coverage probabilities, and absolute randomisation limits.

14.
arXiv (CS.LG) 2026-06-19

How to sketch a learning algorithm

Authors:

arXiv:2604.07328v3 Announce Type: replace Abstract: How does the choice of training data influence an AI model? This broad question is of central importance to interpretability, privacy, and basic science. At its technical core is the data deletion problem: after a reasonable amount of precomputation, quickly predict how the model would behave in a given situation if a given subset of training data had been excluded from the learning algorithm. We present a data deletion scheme capable of predicting model outputs with vanishing error $\varepsilon$ and failure probability $\delta$ in the deep learning setting. Our precomputation and prediction algorithms are only $\tilde{O}(\log(1/\delta)/\varepsilon^2)$ factors slower than regular training and inference, respectively. The storage requirements are those of $\tilde{O}(\log(1/\delta)/\varepsilon^2)$ models. Our proof is based on an assumption that we call stability. In contrast to the assumptions made by prior work, stability appears to be fully compatible with learning powerful AI models. In support of this, we show that stability is satisfied in a minimal set of experiments with microgpt. Our code is available at https://github.com/SamSpo1/microgpt-sketch. At a technical level, our work is based on a new method for locally sketching an arithmetic circuit by computing higher-order derivatives in random complex directions. Forward-mode automatic differentiation allows cheap computation of these derivatives.

15.
arXiv (CS.CL) 2026-06-16

LM-SPT: LM-Aligned Semantic Distillation for Speech Tokenization

With the rapid progress of speech language models (SLMs), discrete speech tokens have emerged as a core interface between speech and text, enabling unified modeling across modalities. Recent speech tokenization approaches aim to isolate semantic information from low-level acoustics to better align with language models (LMs). In particular, previous methods use self-supervised learning (SSL) teachers such as HuBERT to extract semantic representations, which are then distilled into a semantic quantizer to suppress acoustic redundancy as well as capture content-related latent structures. However, these tokenizers often operate at relatively high frame rates, producing token sequences significantly longer than their textual counterparts and hindering seamless integration with pretrained LMs. Although recent methods attempt to reduce the token rate by applying uniform average pooling to SSL features, this can over-smooth content-bearing regions and dilute the structural information, thereby potentially limiting the LM alignment. To address this, we propose LM-SPT, an LM-aligned speech tokenization method based on semantic speech-resynthesis distillation. Instead of directly matching teacher and student features via pooling, LM-SPT resynthesizes speech from semantic tokens only and minimizes the discrepancy between representations extracted from the original and resynthesized waveforms using a frozen, LM-aligned speech encoder. This indirect supervision avoids rigid temporal alignment and encourages dedicated semantic units that are more semantically aligned with LMs under reduced frame rates. Experimental results show that the proposed LM-SPT consistently outperforms previous semantic-enhanced speech tokenizers when applied to SLMs for the tasks of automatic speech recognition and text-to-speech, even without compromising the speech reconstruction fidelity at the codec level.

16.
arXiv (CS.AI) 2026-06-16

ALCL: An Adaptive Log-Correntropy Loss for Robust Learning under Non-Gaussian Noise

arXiv:2606.16050v1 Announce Type: cross Abstract: Robust deep learning under heavy-tailed and impulsive noise remains challenging because conventional losses such as mean squared error (MSE) exhibit unbounded sensitivity to outliers. Although correntropy-based objectives improve robustness, existing formulations rely on fixed kernel parameters that must be empirically tuned and remain static during training. To address these limitations, we propose an Adaptive Log-Correntropy Loss (ALCL), a heavy-tailed loss formulation that adaptively learns its robustness geometry during optimization. ALCL introduces a logarithmic residual model whose shape and scale parameters are learned jointly with network weights through differentiable reparameterization. This yields a principled maximum likelihood formulation whose influence function is formally bounded and redescending, allowing the loss geometry to adapt dynamically to evolving residual statistics while suppressing extreme outliers. Comparative experiments on four widely used benchmark datasets spanning grayscale and red-green-blue (RGB) image data under mixed heavy-tailed and impulsive noise demonstrate that ALCL consistently outperforms MSE and optimally tuned generalized correntropy losses in both reconstruction fidelity and downstream classification accuracy. While performance differences remain small under low-noise conditions, under high-noise regimes ALCL improves median accuracy by up to 4.75% on grayscale benchmarks and 4.51% on RGB datasets, with reduced variance across runs. These results demonstrate that adaptive robustness through joint learning of loss parameters provides a computationally efficient alternative to static correntropy-based losses for deep learning in non-Gaussian environments.

17.
medRxiv (Medicine) 2026-06-12

Conversational Artificial Intelligence-Enabled Precision Oncology Reveals Context-Specific TGFβ and JAK/STAT Alterations in Pancreatic Cancer

Background: Pancreatic ductal adenocarcinoma (PDAC) is characterized by extensive molecular complexity, profound stromal remodeling, and limited responsiveness to systemic therapies. Although gemcitabine-based regimens remain widely utilized, the molecular pathways that influence treatment-associated biological variation are incompletely understood. The TGF{beta} and JAK/STAT signaling networks are recognized regulators of tumor progression, immune modulation, and therapeutic resistance; however, their genomic architecture in clinically stratified PDAC populations remains poorly defined. Methods: We employed a conversational artificial intelligence-driven analytical framework to investigate TGF{beta} and JAK/STAT pathway alterations in a cohort of 184 PDAC patients. Clinical and molecular data were integrated to generate age- and treatment-stratified cohorts, enabling pathway-level and gene-level analyses according to gemcitabine exposure. Findings generated through AI-assisted interrogation were subsequently evaluated using conventional statistical approaches. Results: TGF{beta} pathway alterations were identified in approximately one-quarter to one-third of tumors across clinical subgroups and demonstrated relatively stable frequencies regardless of age at diagnosis or gemcitabine treatment status. Gene-level analyses revealed that pathway disruption was predominantly driven by recurrent alterations in SMAD4, with additional low-frequency events involving TGFBR1 and TGFBR2. Notably, TGFBR2 mutations were significantly more frequent among late-onset PDAC patients receiving gemcitabine compared with untreated late-onset patients (8.8% vs. 1.4%; p = 0.04), suggesting a potential treatment-associated enrichment. In contrast, JAK/STAT pathway alterations were rare throughout the cohort, with only isolated mutations observed in pathway components including JAK1, JAK2, JAK3, STAT1, STAT3, and related regulatory genes. No significant differences in JAK/STAT alteration frequencies were identified according to age or treatment exposure. Conclusions: TGF{beta} and JAK/STAT pathways exhibit distinct genomic architectures in PDAC. TGF{beta} pathway disruption represents a recurrent feature of disease biology, largely driven by SMAD4 alterations, while TGFBR2 enrichment in gemcitabine-treated late-onset tumors suggests a potential context-specific association worthy of further investigation. Conversely, genomic alterations within the JAK/STAT pathway are uncommon, indicating that pathway activity may be regulated predominantly through non-genomic mechanisms. These findings demonstrate the utility of conversational artificial intelligence agents for rapid, scalable, and clinically contextualized pathway interrogation and support future studies integrating multi-omic data to refine precision medicine strategies in PDAC.

18.
arXiv (CS.AI) 2026-06-12

ReCal: Reward Calibration for RL-based LLM Routing

arXiv:2606.12479v1 Announce Type: cross Abstract: Large language model (LLM) routing has emerged as an effective paradigm for leveraging the complementary strengths of multiple LLMs through dynamic model and reasoning-strategy selection. Recent reinforcement learning (RL)-based routing methods further improve routing quality by optimizing routing policies from interaction feedback. However, they still struggle to provide informative and comparable learning signals under heterogeneous tasks with varying difficulty. In practice, multiple objectives (e.g., correctness, format behavior) are aggregated into a single scalar reward, leading to ambiguous credit assignment and conflicting optimization signals. Moreover, reward signals exhibit significant variability across instances, where some instances produce higher or more variable rewards, introducing optimization bias that favors trivial samples over informative ones. To address these issues, we propose ReCal, a \underline{Re}ward \underline{Cal}ibration framework for RL-based LLM routing. We first introduce a hierarchical reward decomposition mechanism with component-wise advantage estimation. We further propose a distribution-aware optimization strategy that calibrates optimization variability through variance-aware reweighting and per-dataset normalization. Experiments on seven datasets demonstrate that ReCal consistently improves routing performance, and training stability over baselines. Code is available at https://anonymous.4open.science/r/ReCal.

19.
arXiv (CS.CL) 2026-06-19

From 50K to 8.2 Million in 24 Hours: Vozinha's Algorithmic Consecration and the Multilingual Making of World Cup Visibility

We present a multilingual computational discourse analysis of how language constructed the algorithmic consecration of Vozinha, the 40-year-old Cape Verde goalkeeper, after Spain 0-0 Cape Verde at the 2026 FIFA World Cup. The study contributes a multilingual corpus in Portuguese, Spanish, English, and French; a nine-frame narrative taxonomy with cue-based frame annotation; a reproducible annotation pipeline combining LLM-assisted suggestion with human validation; and an analysis of cross-lingual narrative diffusion across discourse phases. We treat the platform follower count itself, narrated as "50k to 8M", as a linguistic object: a circulating and narratable proof of visibility rather than a mere measurement. The follower-growth timeline is used only as contextual metadata: we reconstruct a conservative phase structure, not a continuous API-native series, and type every datapoint by value class, confidence, and evidence type. The only exact primary scraper anchor is 8,235,652 followers at 2026-06-16 15:47 UTC; all other figures are reported as estimated ranges or thresholds, including an estimated pre-match baseline of 45k-56k. Findings suggest that distinct languages carried distinct frames: Portuguese mobilization, Spanish crisis, English nation-making, and a shared platform-metric spectacle through which peripheral athletic performance became globally visible. As a v0.1 pilot, the paper releases the corpus schema, frame taxonomy, annotation guidelines, hashed visual-evidence log, and typed timeline, while flagging full double annotation and inter-annotator agreement as planned work.

20.
arXiv (CS.CL) 2026-06-11

Multi-task Learning is Not Enough: Representational Entanglement in Dual-output Second Language Speech Recognition

Second-language (L2) speech recognition often requires transcriptions of pronunciations and intended meanings. Multi-task learning (MTL) is a natural approach because it assumes that shared representations benefit both outputs. However, this paper shows that this assumption does not hold across Korean and English. MTL improves meaning but degrades surface transcription, especially in English, where the degradation scales with surface-meaning divergence measured by Levenshtein edit distance. Encoder analysis links these patterns to encoder-level entanglement, with Korean preserving distinct task representations while English produces nearly identical ones. Cross-task decoder analysis shows that the meaning dual-output decoder adapts with a unique representation, while the surface dual-output decoder remains constrained by the encoder. These findings motivate the design of MTL frameworks that mitigate encoder-level entanglement to reduce surface degradation in dual-output L2 automatic speech recognition.

21.
arXiv (CS.CL) 2026-06-11

Augmenting Molecular Language Models with Local $n$-gram Memory

Transformer-based language models for SMILES strings suffer from a locality gap: standard character-level tokenization fragments chemically meaningful motifs, forcing models to repeatedly learn local syntax at the expense of long-range dependencies. To address this without disrupting standard tokenizers, we propose MolGram, which integrates a conditional $n$-gram memory module into molecular language models. MolGram maps local string patterns to learned embeddings via scalable hash lookups and dynamically injects this regional context into hidden states. Evaluations across three tasks, including unconditional molecule generation, forward reaction prediction, and single-step retrosynthesis, show that MolGram consistently improves performance. Crucially, our analyses demonstrate that MolGram outperforms baselines with 3$\times$ more parameters, establishing explicit local pattern memory as a highly efficient inductive bias.

22.
arXiv (CS.AI) 2026-06-18

PSyGenTAB: A Privacy-Preserving Framework for Synthetic Clinical Tabular Data Generation via Constrained Optimization

arXiv:2606.18518v1 Announce Type: cross Abstract: The development of medical AI is constrained by limited access to high-quality clinical data due to institutional silos and strict privacy regulations such as HIPAA and GDPR. Synthetic data generation offers a potential solution, but existing methods lack principled mechanisms to explicitly manage the privacy-utility trade-off, often degrading clinically meaningful patterns or risking patient re-identification. We present PSyGenTAB, a privacy-preserving generative framework that formulates synthetic healthcare data generation as a constrained optimization problem solved using the Augmented Lagrangian Method. By embedding configurable privacy constraints directly into model training, PSyGenTAB enforces minimum privacy thresholds while maximizing clinical data utility. Across multiple clinically motivated benchmarks, PSyGenTAB preserves inter-feature clinical relationships and minority-class diagnostic patterns essential for reliable health AI. Downstream evaluation using Train-on-Synthetic, Test-on-Real and Train-on-Real, Test-on-Synthetic protocols shows that models trained on synthetic data achieve performance comparable to those trained on real patient records. Privacy auditing further demonstrates reduced exact record reproduction and strong resilience to membership inference attacks. These results establish PSyGenTAB as a principled framework for balancing privacy protection and clinical utility in synthetic healthcare data, supporting secure cross-institutional AI development.

23.
arXiv (CS.AI) 2026-06-17

Dimensionality Controls When Modularity Helps in Continual Learning

arXiv:2606.17889v1 Announce Type: cross Abstract: Compositional learning systems must balance plasticity, the ability to acquire new knowledge, with stability, the preservation of previously learned components, especially when tasks share structure and risk interference. We study how modular architecture, task similarity, and representational dimensionality jointly shape compositional continual learning in a sequential A-B-A paradigm, comparing a task-partitioned recurrent network to a single-network baseline while inducing high- and low-dimensional regimes via weight-scale manipulations. In a high-dimensional "lazy" regime, both architectures achieve similar performance and internal geometry, suggesting that explicit modular structure has little impact when representations are weakly constrained. In a lower-dimensional "rich" regime, modularity becomes decisive: the modular network develops graded task-specific subspaces that overlap for similar tasks, partially align for moderately dissimilar tasks, and separate for dissimilar tasks, yielding a more compositional and interpretable organization than the single network. These findings identify the representational regime induced by initialization scale, which co-varies with representational dimensionality, as a key factor governing when compositional, modular structure is functionally beneficial in continual learning, and support viewing safety and robustness as problems of adaptive allocation of representational subspaces rather than fixed separation versus sharing.

24.
arXiv (CS.LG) 2026-06-18

A Human-in-the-Loop Bayesian Optimization Framework for Constraint-Aware Bioprocess Development

arXiv:2606.19230v1 Announce Type: new Abstract: This work presents an extension to Pareto Front Guided Sampling (PFGS), a Human-in-the-Loop (HitL) Bayesian Optimization (BO) framework in which Gaussian process (GP) surrogate-derived quantities are reformulated as objectives of a multi-objective optimization problem, and the resulting Pareto front is exposed to a domain expert for interactive candidate selection rather than returning a single automated recommendation. The framework is extended in two directions: constrained optimization is addressed by incorporating the posterior probability of satisfying output specification limits as an explicit Pareto objective, computed analytically from the GP posterior distribution; robust optimization is addressed by a Monte Carlo sampling strategy that estimates expected lower-confidence performance over a user-defined variability of input perturbations, capturing performance degradation under likely implementation deviations. The resulting multi-dimensional Pareto representation renders trade-offs between predicted performance, model uncertainty, probabilistic constraint satisfaction, and input robustness simultaneously visible through pairwise two-dimensional projections on an interactive dashboard, enabling selection criteria to be iteratively refined as the surrogate model improves and development objectives evolve. The framework is showcased on an eight-dimensional fed-batch Chinese Hamster Ovary (CHO) cell culture simulator demonstrating systematic identification of high-performing, feasibility-compliant, and perturbation-resilient operating conditions, and illustrating how expert-defined requirements provide a principled stopping criterion and support informed allocation of experimental resources.

25.
Nature (Science) 2026-06-10

Gen Z scepticism towards AI is a wake-up call — universities must take it seriously

Authors:

The challenge for universities is not adopting artificial intelligence, but doing so in ways that the current generation of students can trust. The challenge for universities is not adopting artificial intelligence, but doing so in ways that the current generation of students can trust.