Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
arXiv (CS.LG) 2026-06-11

Mechanisms of Introspective Awareness

arXiv:2603.21396v5 Announce Type: replace Abstract: Recent work has shown that LLMs can sometimes detect when steering vectors are injected into their residual stream and identify the injected concept – a phenomenon termed "introspective awareness." We investigate the mechanisms underlying this capability in open-weights models. First, we find that it is behaviorally robust: models detect injected steering vectors at moderate rates with 0% false positives across diverse prompts and dialogue formats. Notably, this capability emerges specifically from post-training; we show that preference optimization algorithms like DPO can elicit it, but standard supervised finetuning does not. We provide evidence that detection cannot be explained by simple linear association between certain steering vectors and directions promoting affirmative responses. We trace the detection mechanism to a two-stage circuit in which "evidence carrier" features in early post-injection layers detect perturbations monotonically along diverse directions, suppressing downstream "gate" features that implement a default negative response. This circuit is absent in base models and robust to refusal ablation. Identification of injected concepts relies on largely distinct later-layer mechanisms that only weakly overlap with those involved in detection. Finally, we show that introspective capability is substantially underelicited: ablating refusal directions improves detection by +53%, and a trained bias vector improves it by +75% on held-out concepts, both without meaningfully increasing false positives. Our results suggest that this introspective awareness of injected concepts is robust and mechanistically nontrivial, and could be substantially amplified in future models. Code: https://github.com/safety-research/introspection-mechanisms.

02.
arXiv (CS.CL) 2026-06-16

CoBit: Language Modeling with Bitstream Diffusion

Diffusion language models (DLMs) promise parallel, order-agnostic generation, but on standard benchmarks they have historically lagged behind autoregressive models in sample quality and diversity. Recent continuous flow and diffusion approaches have narrowed this gap. In this work, we further close the autoregressive gap by modeling text as a continuous diffusion process over fixed-width binary bitstreams. We refer to the resulting model as CoBit (Continuous Bitstream Diffusion). Our approach represents semantic tokens as analog bit sequences and uses a matched-filter residual parameterization to isolate contextual learning from analytic independent-bit posteriors. Crucially, we adopt a stochastic sampler that applies Langevin-type corrections gated by the entropy-rate profile, concentrating stochasticity in high-information regions while remaining nearly deterministic elsewhere. On LM1B, our 130M-parameter model reaches a generative perplexity (GenPPL) of 59.76 at matched real-data entropy (4.31) using 256 neural function evaluations (NFEs), outperforming prior DLM baselines and reaching the autoregressive reference. On OpenWebText (OWT), our sampler establishes a new continuous-DLM Pareto frontier, achieving GenPPL 27.06 at entropy 5.26 using 4x fewer steps than previous 1024-NFE baselines. Scaling the same recipe to a 462M-parameter model (CoBit-M) further improves the OWT GenPPL-entropy frontier over the 130M model (CoBit-S) and over medium-scale continuous and discrete DLM baselines, reaching GenPPL 19.5 at entropy 5.40, near real-data entropy (5.44), and approaching pretrained GPT-2 Medium over the high-quality region. As an additional benefit, bitstream diffusion removes the O(V) vocabulary scaling bottleneck of standard DLMs: by predicting O(log V) bitwise logits via semantic bit-patching, it lowers memory and raises throughput, a scalable paradigm as vocabulary sizes grow.

03.
arXiv (CS.CV) 2026-06-11

Spatially Coupled Phase-to-Depth Calibration for Fringe Projection Profilometry

In fringe projection profilometry (FPP), depth is commonly recovered by fitting a phase-to-depth relation independently at each camera pixel. Although such pixel-wise calibration achieves high local accuracy, neighboring pixels can acquire markedly different calibration functions even when they observe the same smooth surface, producing spatially inconsistent geometry and structured surface artifacts. We propose a spatially coupled phase-depth transformation in which all pixels share a single low-dimensional mapping-global phase scalars combined with affine spatial terms on the undistorted reference-camera grid-rather than independent per-pixel fits, optionally augmented by a bounded, spatially smooth correction field. We further introduce a native-grid pairing scheme that constructs phase-depth calibration pairs directly on the reference-camera grid: when depth supervision comes from a rectified active-stereo pipeline, planes are fitted in stereo 3D and sampled back onto the camera grid along native rays, so the phase maps are never rectified. On a dental target with high-resolution scanner ground truth, the proposed model attains point-to-surface RMSE comparable to an active-stereo reference (about 12{\mu}m aggregate) while substantially improving spatial coherence over pixel-wise polynomial and rational calibration, and reduces the runtime mapping to a few element-wise operations per pixel with negligible parameter storage.

04.
arXiv (CS.AI) 2026-06-24

Real-Time Interactive Music Generation via Data-Free Streaming Consistency Distillation

arXiv:2606.24307v1 Announce Type: cross Abstract: Interactive music and live performance relies on real-time human expression, but modern generative music AI remains largely absent from this domain due to its prohibitive inference latency and offline rendering paradigm. To provide pioneer musicians with a novel medium for interactive composition, we should fundamentally change these static models into dynamic, playable instruments. In this paper, we propose a framework that bridges this gap. To achieve the low latency required for live interaction without sacrificing structural coherence, we formulate distillation within a streaming autoregressive latent space. Our approach gets rid of the need for expensive paired audio-latent datasets by utilizing prompt-only inputs to synthesize teacher-guided, chunk-wise trajectories on the fly. Because live instruments require high acoustic fidelity, we introduce music-aware consistency objectives, which combine latent, spectral, and temporal-difference losses, to preserve crucial qualities like timbre, transients, and rhythmic stability during accelerated single-step streaming generation. Implemented via parameter-efficient adaptation, our distillation reduces generation steps to achieve a low real-time factor. Crucially, by operating as a continuous autoregressive stream, the system can seamlessly assimilate dynamic human inputs on the fly, allowing users to instantly steer the musical trajectory without interrupting the audio flow. Ultimately, this work recontextualizes generative text-to-music models not as passive prompt-and-wait systems, but as responsive instruments, opening new frontiers for live human-AI musical co-creation.

05.
PLOS Computational Biology 2026-06-22

CoDaLoMic: An R package for modeling microbiome compositional and longitudinal data

by Irene Creus-Martí, Andrés Moya, Francisco J. Santonja In this paper we present CoDaLoMic, an R package for analyzing longitudinal and compositional microbiome datasets. The CoDaLoMic package implements three models specifically designed for the analysis of microbiome data that are both compositional and longitudinal. Unlike many existing methods that focus solely on pairwise interactions, CoDaLoMic also captures interactions among groups of bacteria, providing a more robust methodological framework for studying microbial relationships at the community level. In addition, the package facilitates the analysis of microbiome variability in relation to host health status and allows for the identification of groups of taxa that exhibit similar temporal dynamics. Working with time series data makes it possible to understand not only the current state of a microbial community but also its dynamics over time, which is essential for identifying patterns of ecological succession, detecting events of dysbiosis or recovery, and inferring potential causal relationships between taxa. On the other hand, focusing on interactions among groups of bacteria, rather than analyzing only pairwise relationships, enables a more integrated and functionally meaningful view of the microbiome. Many key ecological functions are the result of the collective behavior of functionally related groups of taxa. Two datasets have been considered in CoDaLoMic, one real and one simulated. The real dataset contains the information of the genera present in the microbiome of the Blatella germanica cockroach at 105 time points. The simulated dataset is defined taking Lotka-Volterra structure into account. CoDaLoMic is available at CRAN.

06.
arXiv (CS.AI) 2026-06-11

MLaGA: Multimodal Large Language and Graph Assistant

arXiv:2506.02568v2 Announce Type: replace Abstract: Large Language Models (LLMs) have demonstrated substantial efficacy in advancing graph-structured data analysis. Prevailing LLM-based graph methods excel in adapting LLMs to text-rich graphs, wherein node attributes are text descriptions. However, their applications to multimodal graphs–where nodes are associated with diverse attribute types, such as texts and images–remain underexplored, despite their ubiquity in real-world scenarios. To bridge the gap, we introduce the Multimodal Large Language and Graph Assistant (MLaGA), an innovative model that adeptly extends LLM capabilities to facilitate reasoning over complex graph structures and multimodal attributes. We first design a structure-aware multimodal encoder to align textual and visual attributes within a unified space through a joint graph pre-training objective. Subsequently, we implement a multimodal instruction-tuning approach to seamlessly integrate multimodal features and graph structures into the LLM through lightweight projectors. Extensive experiments across multiple datasets demonstrate the effectiveness of MLaGA compared to leading baseline methods, achieving superior performance in diverse graph learning tasks under both supervised and transfer learning scenarios.

07.
Nature (Science) 2026-06-24

Alternate RNA decoding results in stable and abundant proteins in mammals

Authors:

Amino acid substitutions may substantially alter protein stability and function1,2. However, the contribution of substitutions that arise from alternate translation (deviations from the genetic code) is unknown. Here to address this issue, we analysed deep proteomic, transcriptomic and genomic data from more than 1,000 human samples, including 6 cancer types and 26 healthy human tissues. This global analysis identified 60,803 fragmentation spectra corresponding to 8,746 unique substitutions in proteins derived from 1,767 genes, including 1,955 confidently localized sites. Some substitutions were shared across samples, whereas others exhibited strong tissue-type and cancer specificity. Notably, products of alternate translation were more abundant than their canonical counterparts for hundreds of proteins, which suggests that there is sense-codon recoding. Recoded proteins included transcription factors, proteases, signalling proteins and proteins associated with neurodegeneration. Mechanisms that contribute to substitution abundance included protein stability, codon frequency, codon–anticodon mismatches and RNA modifications. We also characterized how alternatively translated proteoform ratios vary across protein domains, tissue types and cancers. These ratios were positively associated with intrinsically disordered regions and genetic polymorphisms in the gnomAD database, although the polymorphisms could not account for the substitutions. The sequence, relative abundance and the tissue specificity of alternatively translated proteins were conserved between humans and mice. These results demonstrate the contribution of alternate translation to the diversification of mammalian proteomes and its association with protein stability, tissue-specific proteomes and disease. Alternate RNA decoding, an understudied process, leads to peptide sequence modifications that can have substantial functional effects on protein stability, tissue-specific proteomes and disease.

08.
arXiv (CS.AI) 2026-06-15

PRISM: Perception Reasoning Interleaved for Sequential Decision Making

arXiv:2605.05407v2 Announce Type: replace Abstract: Scaling LLM-based embodied agents from text-only environments to complex multimodal settings remains a major challenge. Recent work identifies a perception-reasoning-decision gap in standalone Vision-Language Models (VLMs), which often overlook task-critical information. In this paper, we introduce PRISM, a framework that tightly couples perception (VLM) and decision (LLM) through a dynamic question-answer (DQA) pipeline. Instead of passively accepting the VLM's description, the LLM critiques it, probes the VLM with goal-oriented questions, and synthesizes a compact image description. This closed-loop interaction yields a sharp, task-driven understanding of the scene. We evaluate PRISM on the ALFWorld and Room-to-Room (R2R) benchmarks. We show that: (1) PRISM significantly outperforms state-of-the-art image-based models, (2) our Interactive goal-oriented perception pipeline yields systematic and substantial gains, and (3) PRISM is fully automatic, eliminating the need for handcrafted questions or answers.

09.
arXiv (quant-ph) 2026-06-16

Lattice surgery for near-term experimental logical qubit entanglement creation in planar architectures

arXiv:2606.15190v1 Announce Type: new Abstract: In the era of early fault-tolerant quantum computing, basic demonstrations of entanglement operations between a few logical qubits are at the frontier of recent developments in quantum computing. In this work, we describe in detail, at both the logical and physical qubit levels, a logical teleportation protocol between two surface code logical qubits based on lattice surgery. We address several aspects of the teleportation protocol pertinent to superconducting qubit architectures. We explore the modularity constraints in the number and location of stabilizer readouts and compare variants of the teleportation protocol in this regard. Additionally, we investigate potential performance improvements related to in-sequence decision logic and the optimal size of the interface region between two surface code patches on a superconducting chip. Based on our simulations, we show possible near-term improvements in lattice surgery protocols that facilitate fault-tolerant quantum computing in superconducting circuit architectures.

10.
arXiv (quant-ph) 2026-06-15

Extensible Fluxonium Architecture Using Tunable Couplers with Low Shunt Capacitance

arXiv:2606.01647v2 Announce Type: replace Abstract: Fluxonium qubits have demonstrated high-fidelity operations and long coherence times in small-scale systems, highlighting their promise for quantum computing. However, large-scale integration into a high-performance two-dimensional (2D) qubit array remains the central challenge for practical applications. In this work, we introduce an extensible architecture for scaling up fluxonium qubits in 2D grids. To address the key challenges, namely achieving controllable strong interaction and high connectivity for qubits featuring small shunting capacitors (footprints), we propose using low-shunt-capacitance couplers to enable tunable interactions between fluxonium qubits. When embedded into 2D square lattices, large couplings can be achieved even with relatively small coupling capacitances, thus enabling multiple connections with sufficient capacitance budget. We further propose coupler realizations based on generalized flux qubit circuits, specifically the quarton and the fluxonium, and demonstrate that both enable fast, high-fidelity gates with low spectator errors, while supporting multiple connections on 2D grids.

11.
arXiv (CS.AI) 2026-06-18

Beyond Similarity: Temporal Operator Attention for Time Series Analysis

arXiv:2605.11287v2 Announce Type: replace-cross Abstract: A persistent paradox in time-series forecasting is that structurally simple MLP and linear models often outperform high-capacity Transformers. We argue that this gap arises from a mismatch in the sequence-modeling primitive: while many time-series dynamics are governed by global temporal operators (e.g., filtering and harmonic structure), standard attention forms each output as a convex combination of inputs. This restricts its ability to represent signed and oscillatory transformations that are fundamental to temporal signal processing. We formalize this limitation as a simplex-constrained mixing bottleneck in softmax attention, which becomes especially restrictive for operator-driven time-series tasks. To address this, we propose $Temporal Operator Attention (TOA)$, a framework that augments attention with explicit, learnable sequence-space operators, enabling direct signed mixing across time while preserving input-dependent adaptivity. To make dense $N \times N$ operators practical, we introduce Stochastic Operator Regularization, a high-variance dropout mechanism that stabilizes training and prevents trivial memorization. Across forecasting, anomaly detection, and classification benchmarks, TOA consistently improves performance when integrated into standard backbones such as PatchTST and iTransformer, with particularly strong gains in reconstruction-heavy tasks. These results suggest that explicit operator learning is a key ingredient for effective time-series modeling.

12.
arXiv (CS.AI) 2026-06-25

AI-Assisted Computational Reproducibility on the FABRIC Testbed

arXiv:2606.25879v1 Announce Type: cross Abstract: Computational reproducibility remains difficult despite being central to scientific research. In this paper, we show how the international FABRIC testbed, combined with large language model (LLM) coding assistants through LoomAI, can simplify reproducing published experiments across multiple domains. We reproduced three case studies on FABRIC, covering BBR-family congestion-control evaluations, LAMMPS molecular dynamics scaling benchmarks on a CPU-only MPI cluster, and stress protein homeostasis genomics pipelines. Rather than focusing only on matching numerical outputs, we evaluate whether the reproduced experiments support the same scientific conclusions as the original studies. The AI assistant was effective in setting up the environment, adapting code, and debugging, but struggled with the analysis stages that lacked clearly defined workflows, which required human guidance to establish execution order and data dependencies. Across the case studies, the AI-assisted workflow reduced reproduction effort by roughly 4–6 times. We conclude with practical recommendations for improving AI-assisted reproducibility on research testbeds.

13.
arXiv (CS.LG) 2026-06-18

Modeling Doppler Shifts in Radial-Velocity Data with Deep Learning toward Earth-mass Exoplanet Detection

arXiv:2606.18464v1 Announce Type: cross Abstract: Detecting the tiny Doppler shifts induced by Earth-mass planets in stellar radial-velocity measurements remains extremely challenging due to stellar activity. Many deep-learning methods performing well on simulated data remain difficult to apply reliably on real stellar spectra. The aim of this work is to develop a deep-learning framework that generalizes to real, unseen spectra and improves the detectability of Earth-mass planets in radial-velocity data. We train artificial neural networks on HARPS-N solar spectra with injected planetary signals, using physics-motivated spectral representations based on flux and line-formation temperature, together with their velocity gradients. Two training strategies are explored: hold-out testing and cross-validation. Model robustness is enhanced through genetic-algorithm-based hyperparameter optimization, and predictive uncertainty is quantified using Monte Carlo dropout. Our most precise neural network model reliably retrieves, under the cross-validation strategy, the amplitudes, phases, and orbital periods of planetary signals with amplitudes greater than or equal to 25 cm/s and periods between 10 and 550 days. In addition, in all cases tested here, the successfully recovered signals correspond to the most significant peaks in the periodograms of the Doppler-shift predictions. Temperature-based spectral-shell representations consistently outperform flux-based shells. We also release doppleriann, a Python package implementing the proposed framework. Our results demonstrate that combining physically motivated spectral representations with deep learning provides a promising pathway toward the detection of Earth-mass planets in radial-velocity data from real observations, supported by a modeling framework that is both physically grounded and statistically rigorous, incorporating uncertainty quantification and optimized training strategies.

14.
medRxiv (Medicine) 2026-06-19

Specific epigenetic age acceleration measures are associated with oral health outcomes in U.S. adults

Objectives: Oral health conditions impact a significant proportion of the global population. Chronological age is a known risk factor; however, characterization of epigenetic age remains limited and is expected to provide additional insight into biological mechanisms. Materials and Methods: The National Health and Nutrition Examination Survey (NHANES) was used to analyze the effect of epigenetic age measures of DunedinPoAm, and epigenetic age acceleration (EAA) of Horvath, Hannum, Weidner, Lin, VidalBralo, PhenoAge, GrimAge, and GrimAge2, on various oral health outcomes from survey and examination results. Univariable and multivariable logistic regression were performed, adjusting for sex, race-ethnicity, education, poverty income ratio categories, and dental insurance coverage status. Results: DunedinPoAm was associated with the last dental appointment being for an existing issue (p=0.0093), poor general oral condition (p=0.0226), limiting food due to teeth problems (p=0.0031), and recommendation to see a dentist within the next two weeks (p=0.0171). EAAs for PhenoAge, GrimAge, and GrimAge2, were associated with a smaller number of oral health outcomes, whereas EAAs for Horvath, Hannum, Weidner, Lin, and Vidal-Bralo showed no associations. Conclusions: In a representative U.S. population, DunedinPoAm was most consistently positively associated with different adverse oral health outcomes compared with other epigenetic aging measures. Tracking specific epigenetic ages such as DunedinPoAm, EAA GrimAge, EAA GrimAge2, and PhenoAge, may aid in additional monitoring of oral health outcomes. Understanding specific aging-related CpGs associated with oral health may aid in elucidating underlying molecular mechanisms.

15.
arXiv (CS.LG) 2026-06-11

SpaTeoGL: Spatiotemporal Graph Learning for Interpretable Seizure Onset Zone Analysis from Intracranial EEG

arXiv:2602.11801v2 Announce Type: replace Abstract: Accurate localization of the seizure onset zone (SOZ) from intracranial EEG (iEEG) is essential for epilepsy surgery but is challenged by complex spatiotemporal seizure dynamics. We propose SpaTeoGL, a spatiotemporal graph learning framework for interpretable seizure network analysis. SpaTeoGL jointly learns window-level spatial graphs capturing interactions among iEEG electrodes and a temporal graph linking time windows based on similarity of their spatial structure. The method is formulated within a smooth graph signal processing framework and solved via an alternating block coordinate descent algorithm with convergence guarantees. Experiments on a multicenter iEEG dataset with successful surgical outcomes show that SpaTeoGL is competitive with a baseline based on horizontal visibility graphs and logistic regression, while improving non-SOZ identification and providing interpretable insights into seizure onset and propagation dynamics.

16.
arXiv (CS.LG) 2026-06-18

A physical adaptive material motor unit neural network: a hygromorph composite material machine

arXiv:2606.18275v1 Announce Type: cross Abstract: Advances in novel materials science enable structures to function as intelligent machines by embedding memory and learning capabilities directly into materials. Our work introduces a physical adaptive material motor unit neural network,leveraging a new generation of controllable actuators composed of wood- and carbon black-based composites, sensitive to temperature and relative humidity. These material actuators are assembled into a motor unit-like structure inspired by muscle contraction trigger, forming an intelligent machine capable of dynamic shading control that can be used, for example, in buildings. The machine is governed by a neural network trained on over 350 experimental data points collected under diverse environmental conditions. By establishing a new data-aware backpropagation training, we show that the machine predicts shading responses and learns to predict appropriate behaviour incrementally as the database expands. We also demonstrate the ability of the machine to optimise configurations to achieve similar shading outputs under two distinct conditions.

17.
arXiv (math.PR) 2026-06-16

Asymptotic behavior of some strongly critical decomposable 3-type Galton–Watson processes with immigration

arXiv:2406.09852v2 Announce Type: replace Abstract: We study the asymptotic behavior of a critical decomposable 3-type Galton-Watson process with immigration when its offspring mean matrix is triangular with diagonal entries 1. It is proved that, under second or fourth order moment assumptions on the offspring and immigration distributions, a sequence of appropriately scaled random step processes formed from such a Galton-Watson process converges weakly. The limit process can be described using independent squared Bessel processes $({\mathcal X}_{t,1})_{t\geq0}$, $({\mathcal X}_{t,2})_{t\geq0}$, and $({\mathcal X}_{t,3})_{t\geq0}$, the linear combinations of the integral processes of $({\mathcal X}_{t,1})_{t\geq0}$ and $({\mathcal X}_{t,2})_{t\geq0}$, and possibly the 2-fold iterated integral process of $({\mathcal X}_{t,1})_{t\geq0}$. The presence of the 2-fold iterated integral process in the limit distribution is a new phenomenon in the description of asymptotic behavior of critical multi-type Galton-Watson processes with immigration. Our results complete and extend some results of Foster and Ney (1978) for some strongly critical decomposable 3-type Galton-Watson processes with immigration.

18.
medRxiv (Medicine) 2026-06-17

Treatment of Multi-Drug-Resistant Tuberculosis with Second-Line All-Oral Drugs in Ghana: Incidence of Adverse Events.

Introduction: The treatment of multidrug-resistant tuberculosis (MDR-TB) remains challenging due to the toxicity of second-line medications and suboptimal treatment outcomes. This study aimed to determine the incidence of adverse events and identify factors associated with these events in patients undergoing treatment for MDR-TB with second-line all-oral drugs in Ghana. Methods: This retrospective cohort study reviewed the medical records of 384 MDR-TB patients treated with second-line all-oral drugs at selected health facilities in Ghana, including the Greater Accra Regional Hospital, Eastern Regional Hospital, and Kumasi South Hospital. Data were extracted using the Kobo Collect tool, capturing patient demographics, baseline clinical and laboratory characteristics, treatment regimens, and adverse events. The study period spanned from 2020 to August 2024. Results: The study included a total of 384 MDR-TB patients, with a mean age of 45 years (SD = 15). The majority of patients were male (65.78%), and most were within the 45-64 years age group (33.85%), followed by those aged 25-44 years (31.25%). Regionally, the highest number of cases were reported from the Greater Accra Region (39.06%), followed by the Eastern Region (31.25%) and Kumasi South Hospital (29.69%). Approximately one in four patients (25%) presented with comorbidities, with HIV being the most common (19.5%). The most frequently reported adverse events were diarrhea (14%), dizziness (13.7%), and vomiting (12.3%). Most of these were mild to moderate in severity and tended to decrease as treatment progressed. Severe adverse events, such as leukopenia and acute kidney injury, were rare, occurring in less than 5% of patients. Over the course of treatment, gastrointestinal adverse events such as vomiting and nausea showed a significant decline, indicating possible patient adaptation or improved clinical management. Results from the multivariate Poisson regression analysis revealed that age and comorbidities were significant predictors of adverse events. Patients aged 65 years and above had a 56% lower risk of developing adverse events compared to younger patients (Adjusted Risk Ratio [aRR] = 0.44, 95% CI: 0.25-0.79, p = 0.005). Conversely, patients with comorbid conditions such as diabetes or hypertension were approximately 2.6 times more likely to experience adverse events compared to those without comorbidities (aRR = 2.65, 95% CI: 1.58-4.43, p < 0.001). The effect of sex was not statistically significant after adjustment (aRR = 1.03, 95% CI: 0.70-1.50, p = 0.86). At the end of the treatment period, 74.9% of patients achieved successful outcomes, including both those who were cured and those who completed treatment without being classified as cured. However, 25.1% had unsuccessful outcomes, which included treatment failure, relapse, or death. Conclusion: In conclusion, adverse events are common in the treatment of MDR-TB with second-line All-Oral drugs, with gastrointestinal adverse events being the most prevalent. These findings highlight the importance of monitoring and managing adverse events to optimize treatment outcomes for MDR-TB patients in Ghana.

19.
arXiv (CS.CV) 2026-06-24

Latent Visual States for Efficient Multimodal Reasoning

The integration of visual evidence has significantly enhanced the capabilities of large multimodal models. However, this integration predominantly relies on generating discrete outputs (etc., code or box coordinates) to invoke external tools, a process that introduces rigid dependencies and substantial latency. To overcome these limitations, we propose {EVA} (LatEnt Visual StAtes), a novel framework that natively generates continuous latent visual representations. These internal representations manifest as an adaptive sequence of Latent\_slot tokens, serving as intermediate visual thoughts during the reasoning process. These Latent\_slot tokens are then trained end-to-end with the discrete text tokens. This co-optimization, notably, causes extreme policy deviation in the 'transition window' following the Latent\_slot tokens. We develop D-GSPO (Decouple-GSPO) to target this root cause by decoupling the optimization of latent and discrete components. To support SFT, we construct EVA-230K, a high-quality text-image interleaved CoT dataset encompassing a diverse range of real-world scenes, documents, charts and OCR tasks. Extensive experiments across multiple benchmarks confirm that EVA achieves significant performance gains while enhancing inference efficiency.

20.
arXiv (CS.CL) 2026-06-24

MMed-Bench-IR: A Heterogeneous Benchmark for Multilingual Medical Information Retrieval

Retrieval-augmented generation (RAG) in clinical settings increasingly requires multilingual retrieval against predominantly English evidence corpora. Multilingual medical retrieval demands three capabilities: cross-lingual alignment, concept discrimination, and evidence retrieval. However, existing benchmarks evaluate these only in isolation, leaving the interaction between biomedical expertise and multilingual coverage unmeasured. We introduce MMed-Bench-IR, a benchmark designed to disentangle these axes across 6 languages and three structurally heterogeneous tasks: (1) cross-lingual medical QA retrieval with 6,127 queries grounded in the Unified Medical Language System (UMLS), (2) concept discrimination over 4,975 confusion sets at three difficulty tiers, and (3) multilingual evidence retrieval for RAG with 2,040 quality-assured queries. The three tasks share zero concept and query overlap by design, ensuring that aggregate scores reflect genuine capability breadth. Evaluation of ten systems across six paradigm families reveals severe cross-lingual failure: biomedical encoders that score 0.818 nDCG@10 in English drop to 0.056 in Japanese, a gap that English-only benchmarks cannot detect.

21.
bioRxiv (Bioinfo) 2026-06-24

trAIt: Species-by-Trait Data Retrieval using Large Language Models

Biological research often requires information about species' traits. Manual literature collation can be time-consuming and miss parts of the literature. To address this gap, we developed trAIt, a publicly available software for the retrieval of characteristics of species from scientific literature catalogued in the Europe PubMed Central (PubMed) database. trAIt provides a graphical user interface in which users specify species and characteristics of interest. Leveraging a large language model (LLM), trAIt retrieves relevant papers, combines their content through a consensus-based summarization model, and outputs a species-by-characteristic table. For a case study involving frog species, trAIt recovered 47.1% of trait-species combinations in 2.75 hours, while an expert curator independently recovered 62.4% over months. The consensus-based summarization substantially aids accuracy compared to single-source extraction. Across three case studies of vertebrate taxa, an expert confirmed the accuracy of 70.9% of trait-species entries recovered by trAIt. We observed considerable variation across taxa in trAIt's accuracy, which is possibly due to heterogeneity in open-access literature availability and inconsistencies in species and trait terminology. In sum, our analysis suggests that LLM-based tools can accelerate biological data synthesis but should be used to support domain experts' research, rather than replace their judgment.

22.
arXiv (math.PR) 2026-06-24

Typical geometry of self-repelling polymers in a constant force field

arXiv:2606.24352v1 Announce Type: cross Abstract: We study a general class of self-repelling polymers on $\mathbb Z^2$, including the simple random walk, the self-avoiding walk and the repulsive Domb-Joyce model, in the presence of a constant force field acting on each monomer. Conditioning the polymer to have fixed length and fixed endpoints, we identify the limiting free energy and prove that typical trajectories concentrate exponentially near a deterministic macroscopic shape. This shape is characterized as the unique minimizer of a variational problem and can be interpreted as a geodesic of a height-dependent Finsler metric. We also analyze two limiting regimes with universal features: for small field strength, in the symmetric case, the geodesic is close to a classical catenary, while for large field strength it converges to a universal polygonal shape governed by the nearest-neighbor lattice constraint.

23.
arXiv (math.PR) 2026-06-11

Convergence of a Critical Multitype Bellman–Harris Process with One Infinite-Mean Lifetime

arXiv:2606.11511v1 Announce Type: new Abstract: We study a critical multitype Bellman–Harris branching particle system in $\mathbb R^N$ with a finite type space $\mathbb K=\{1,\dots,K\}$. Particles of type $i$ move according to a symmetric $\alpha_i$-stable process and reproduce according to a critical offspring law whose mean matrix is irreducible and stochastic. The lifetime distribution of type $1$ is assumed to have infinite mean with regularly varying tail $$ 1-F_1(t)\sim c_1t^{-\gamma},\, 0 \frac{\gamma}{\beta}, $$ and a local increment condition on the heavy lifetime distribution, we prove convergence of the system to a Poisson random measure concentrated on the infinite-mean type.

24.
arXiv (CS.LG) 2026-06-17

Provably Efficient Regularized Online RLHF with Generalized Bilinear Preferences

arXiv:2602.23116v3 Announce Type: replace Abstract: We consider the problem of regularized best-response max-regret minimization in online RLHF under general preferences and bandit feedback. While various regularizers are utilized to robustify alignment, known polylogarithmic regret guarantees remain heavily specific to KL. To investigate whether such fast rates extend beyond KL, we adopt the Generalized Bilinear Preference Model (GBPM) – capturing intransitive preferences over $d$-dimensional item-wise features via a rank-$2r$ skew-symmetric matrix – to isolate the impact of generic regularization. Crucially, under GBPM, we prove that the dual gap of any greedy policy is bounded by the squared estimation error, derived using only strong convexity and skew-symmetry. Under a feature coverage assumption, we establish a generic polylogarithmic regret of $\tilde{\mathcal{O}}(\eta d^4 C_{\min}^{-1} (\log T)^2 \wedge d^2 C_{\min}^{-1/2} \sqrt{T})$ with Greedy Sampling, and a dimension-wise improved regret (for well-conditioned arm-sets) of $\tilde{\mathcal{O}}(C_{\min}^{-2} \sqrt{\eta r T} \wedge r^{1/3} C_{\min}^{-4/3} T^{2/3})$ with Explore-Then-Commit, where $\eta^{-1}$ is the regularization coefficient, $T$ is the time horizon, and $C_{\min}$ is an arm-set dependent quantity. This demonstrates that ``fast'' regrets are not KL-specific, but rather a fundamental consequence of generic strongly convex geometry.

25.
Nature (Science) 2026-06-24

Small-molecule modulation of β-arrestins

β-Arrestins are multifunctional regulators of G-protein-coupled receptor (GPCR) signalling and orchestrate diverse downstream signalling events and physiological responses across the GPCR superfamily1–3. Although GPCR pharmacology has advanced to target orthosteric and allosteric sites, as well as G proteins and GPCR kinases, direct chemical tools to modulate β-arrestin activities have remained conspicuously absent. Here we report the identification of small-molecule inhibitors that selectively target β-arrestins and delineate their mechanism of action through integrated pharmacological, biochemical, biophysical and structural analyses. These inhibitors disrupt β-arrestin engagement with agonist-activated GPCRs, impairing desensitization, internalization and β-arrestin-dependent physiological functions while sparing G protein–receptor coupling. Cryo-electron microscopy, molecular dynamics simulations and structure-guided mutagenesis reveal that one modulator, Cmpd-5, engages a pocket within the central crest of β-arrestin1 formed by the middle, C and lariat loops, a critical receptor-binding interface, stabilizing a distinct conformation that is incompatible with full β-arrestin–receptor engagement. Together, these findings establish a mechanistic framework for β-arrestin modulation, reveal&nbsp;a novel allosteric site for structure-based drug design, and open new avenues for transducer-targeted, pathway-specific GPCR therapeutic agents. Integrated pharmacological, biochemical, biophysical and structural analyses of small-molecule β-arrestin inhibitors show how they block β-arrestin engagement with activated GPCRs, revealing their mechanism of action and uncovering a previously unrecognized allosteric regulatory site.