Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
arXiv (math.PR) 2026-06-12

Non-commutative Law of iterated logarithm

arXiv:2509.22037v2 Announce Type: replace-cross Abstract: We prove optimal non-commutative analogues of the classical Law of Iterated Logarithm (LIL) for both martingales and sequences of independent (non-commutative) random variables. The classical martingale version was established by Stout [Sto70b] and the independent case by Hartman-Wintner [HW41]. Our approach relies on a key exponential inequality essentially due to Randrianantoanina [Ran24] that improves on the one from Junge and Zeng [JZ15]. It allows us to derive an optimal non-commutative Stout-type LIL just as in [Zen15]; from that martingale result we then deduce a non-commutative Hartman-Wintner type LIL for independent sequences of random variables.

02.
arXiv (quant-ph) 2026-06-17

When Renormalisation Remembers: UV/IR Mixing as an Entanglement Bridge

arXiv:2606.17147v1 Announce Type: cross Abstract: Renormalisation is traditionally understood to be a Wilsonian memoryless process in which ultraviolet (UV) degrees of freedom gradually decouple, leaving an autonomous infrared (IR) description. However this need not be the case: in UV/IR mixed theories correlations between widely separated scales can persist. In this work I recast UV/IR mixing as a Hilbert-space phenomenon, realised as correlations across renormalisation scales. This formulation is implemented using the Born-Reciprocal Tensor Network (BRTN), a new configuration of tensor network that is globally symmetric under phase-space reciprocity. On this network I prepare the vacuum and reproduce the expected radiative corrections. The resulting renormalisation geometry exhibits memory, with a bridge linking reciprocal representations of IR physics, whose cross-bridge entanglement provides a precise criterion for the viability of an effective description. I analyse when this criterion is met, and show that there is a large-volume limit, with the fundamental scale held fixed, in which the obstruction to a local description scales away: Wilsonian behaviour is restored and renormalisation forgets. The BRTN therefore provides a concrete and calculable platform for UV/IR mixing.

03.
arXiv (quant-ph) 2026-06-11

Residual-Squeezing Mechanism of Mismatch in Inverse-Squeezing Kennedy Receivers

arXiv:2601.19093v4 Announce Type: replace Abstract: The discrimination of quantum states is fundamental to quantum information processing. Inverse-squeezing Kennedy (IS-Kennedy) receivers can outperform the coherent-state BPSK Helstrom benchmark at the same energy by converting transmitter-side squeezing into an effective coherent-state separation gain, without violating the Helstrom bound for the squeezed-state alphabet. This work investigates how squeezing mismatch degrades this mechanism. We show that imperfect inverse squeezing transforms the ideally nulled output into a residually squeezed state, thereby altering the photon-number statistics before detection. This residual-squeezing picture reveals a strong physical asymmetry between squeezing-magnitude and squeezing-phase mismatches. Magnitude mismatch produces an energy-independent error floor in the high-signal-energy regime, whereas phase mismatch generates a residual squeezing term that grows with signal energy. In the small-residual-squeezing regime, this leads to a polynomial growth of the leading error contribution and a rapid collapse of the SQL advantage. We also identify a parity-step effect in photon-number-resolving detection: because the nulled residual squeezed vacuum contains only even photon numbers, increasing detector resolution improves the high-energy robustness only when the effective saturation threshold crosses the next even photon number. These results identify phase locking as the dominant bottleneck for IS-Kennedy-type non-Gaussian receivers under unitary squeezing mismatch and provide design guidelines for robust squeezed-state quantum receivers.

04.
arXiv (quant-ph) 2026-06-12

Instabilities in a Non-KAM System via Information Scrambling: A Note

arXiv:2606.12761v1 Announce Type: new Abstract: We study operator growth in quantized non-KAM systems using out-of-time-ordered correlators (OTOCs), focusing on the kicked harmonic oscillator as a representative example. Since the classical harmonic oscillator is degenerate, the dynamics fall outside the usual Kolmogorov-Arnold-Moser (KAM) framework, and resonances play a central role in shaping the phase space. We examine the system near resonances, where the ratio between the oscillator and driving frequencies takes integer values. Even though the classical Lyapunov exponent remains small at these points, and hence there is no conventional chaos, the phase space still undergoes strong structural changes. The OTOCs are particularly sensitive to these resonances, with a quadratic-in-time growth at resonance compared to linear growth away from it. Within a perturbative treatment, we derive closed-form expressions for the OTOCs and uncover a number-theoretic structure emerging in the behavior of OTOCs, governed by the Euler totient function of the frequency ratio. Overall, the results we present in this short note imply that resonant structures can play an important role in controlling information spreading.
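
For readers unfamiliar with the Euler totient function mentioned in this abstract, a minimal sketch of its definition follows. It only illustrates the function itself; how it enters the OTOC expressions is not stated in the abstract, and the name `euler_totient` is chosen here for illustration.

```python
from math import gcd

def euler_totient(n: int) -> int:
    """Count the integers k in 1..n that are coprime to n."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

# Totient values for a few small integers.
for n in (1, 6, 7, 12):
    print(n, euler_totient(n))
```

This brute-force version is linear in `n` per call, which is fine for small values; a factorization-based formula would be preferable for large arguments.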

05.
arXiv (CS.CL) 2026-06-16

Entropy-Aware On-Policy Distillation of Language Models

On-policy distillation is a promising approach for transferring knowledge between language models, where a student learns from dense token-level signals along its own trajectories. This framework typically uses reverse KL divergence, encouraging the student to match the teacher's high-confidence predictions. However, we show that the mode-seeking property of reverse KL reduces generation diversity and yields unstable learning signals when the teacher distribution has high entropy. To address this, we introduce Entropy-Aware On-Policy Distillation. Our key idea is augmenting the standard reverse KL objective with forward KL when teacher entropy is high, capturing the full range of plausible outputs while retaining precise imitation elsewhere. It balances mode-seeking precision with mode-covering robustness without sacrificing on-policy training efficiency. Experiments show that our method maintains generation diversity (sustained token-level entropy) and improves student-teacher alignment (lower forward KL on high-entropy tokens). Across six math reasoning benchmarks, this yields Pass@8 accuracy gains of +1.37 for Qwen3-0.6B-Base, +2.39 for Qwen3-1.7B-Base, and +5.05 for Qwen3-4B-Base compared to baseline on-policy distillation methods. These results demonstrate that accounting for teacher uncertainty is essential for maintaining diversity and achieving effective knowledge transfer.
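
As a rough illustration of the objective described in this abstract, the sketch below computes a per-token loss that uses reverse KL everywhere and adds forward KL when the teacher's entropy is high. The hard threshold `tau`, the weight `lam`, and all function names are assumptions made for illustration; the abstract does not say how the two terms are combined or gated.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def kl(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

def entropy_aware_loss(student, teacher, tau=1.0, lam=1.0):
    """Reverse KL, plus forward KL on tokens where teacher entropy exceeds tau.

    tau and lam are hypothetical hyperparameters, not values from the paper.
    """
    reverse = kl(student, teacher)       # mode-seeking term
    if entropy(teacher) > tau:
        forward = kl(teacher, student)   # mode-covering term
        return reverse + lam * forward
    return reverse

student = [0.7, 0.1, 0.1, 0.1]
confident_teacher = [0.97, 0.01, 0.01, 0.01]   # low entropy: reverse KL only
uncertain_teacher = [0.25, 0.25, 0.25, 0.25]   # high entropy: both terms
print(entropy_aware_loss(student, confident_teacher))
print(entropy_aware_loss(student, uncertain_teacher))
```

In practice the distributions would come from model logits along student-sampled trajectories; plain lists are used here to keep the sketch self-contained.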

06.
arXiv (CS.LG) 2026-06-18

Beyond Algorithms: Conceptual Innovation in Medical Imaging AI

arXiv:2606.19270v1 Announce Type: cross Abstract: Artificial intelligence has driven rapid progress in medical imaging research, producing increasingly sophisticated algorithms and steady improvements on benchmark tasks. However, this algorithm-centric trajectory has also revealed a growing imbalance: while computational methods advance rapidly, the conceptual foundations that define imaging tasks, evaluation metrics, and clinical meaning sometimes remain underexamined. In this Perspective, we distinguish algorithmic innovation, which focuses on improving computational implementations and performance within a fixed problem definition, from conceptual innovation, which reframes what problems are posed, how success is measured, and why an approach is clinically relevant. We argue that prevailing incentive structures, training pathways, and publication norms disproportionately reward algorithmic novelty, particularly for early-career researchers, while at times undervaluing conceptual contributions that are essential for scientific maturation and clinical translation. Through representative examples from medical imaging AI, we show how insufficient conceptual grounding can lead to misaligned objectives, fragile generalization, and limited real-world impact. We conclude with actionable recommendations for researchers, mentors, reviewers, and journals to better recognize, support, and integrate conceptual innovation alongside algorithmic advances.

07.
arXiv (math.PR) 2026-06-11

Continuous stochastic flows driven by white noise and their duals

arXiv:2606.12143v1 Announce Type: new Abstract: We study a class of continuous stochastic flows driven by a space-time white noise and characterize their dual flows by explicit stochastic differential equations. A key ingredient of the proof is the convergence of solutions under coefficient approximations. As an application, we derive the dual flows in two illustrative examples, the squared Bessel flow and the Jacobi flow. We also introduce a new model of polynomially self-repelling (PSR) flow and show that it enjoys a self-duality property.

08.
arXiv (CS.CV) 2026-06-17

SceneCompleter: Dense 3D Scene Completion for Generative Novel View Synthesis

Generative models have shown great promise for novel view synthesis (NVS) by leveraging strong image generation priors. However, existing approaches typically follow a 2D inpainting paradigm, first completing missing image regions and then performing 3D reconstruction. This strategy often causes geometry distortion and appearance drift, as 2D inpainting models cannot reliably infer the underlying 3D structure required for cross-view consistent generation. In this paper, we propose SceneCompleter, a geometry-aware framework that reformulates generative NVS as dense 3D scene completion. Instead of hallucinating isolated 2D views, SceneCompleter jointly completes geometry and appearance through a geometry-appearance dual-stream diffusion model in a spatially aligned RGBD latent space. To provide holistic scene context, we further introduce a Scene Embedder that conditions generation on global semantic and stylistic information from reference images. The completed RGBD predictions are then aligned and integrated into an expandable 3D scene representation, enabling iterative and coherent scene completion. Extensive experiments on in-domain and out-of-distribution datasets demonstrate that SceneCompleter produces visually plausible and geometrically consistent novel views across diverse scenarios. Project Page: https://chen-wl20.github.io/SceneCompleter

09.
arXiv (CS.AI) 2026-06-16

Policy Regret for Embedding Model Routing: Contextual Bandits with Low-Rank Experts

arXiv:2606.14929v1 Announce Type: cross Abstract: Modern recommendation systems increasingly rely on dynamically routing diverse queries to multiple embedding models. Despite its practical significance, this problem remains poorly understood under realistic conditions like adversarial queries, bandit feedback, and limited observability of models. We formalize embedding model routing as an adversarial contextual linear bandit with low-rank experts, where contexts are queries, actions are items, and experts are the embedding models working on low-rank latent representation spaces. We first establish that standard regret notions suffer from structural misspecification or statistical intractability, and we identify a log-quadratic policy class that is expressive enough to capture query-dependent model routing, yet structured enough to allow efficient online learning. Second, we propose a policy gradient algorithm called Hypentropy Policy Gradient (HPG). It provably adapts to the unknown low-rank structure under incomplete information and attains $\tilde{\mathcal O}(s\sqrt{M T})$ linearized policy regret – where $s$, $M$, and $T$ are the intrinsic rank of the experts, the number of models, and the number of rounds – thus avoiding a curse of dimensionality. Finally, we also provide a computationally efficient and parameter-free implementation of HPG.

10.
arXiv (CS.AI) 2026-06-16

SMEPilot: Characterizing and Optimizing LLM Inference with Scalable Matrix Extensions

arXiv:2606.16332v1 Announce Type: cross Abstract: Modern CPUs increasingly integrate matrix extensions, such as Arm Scalable Matrix Extension (SME), that provide high-throughput matrix execution within the CPU. For LLM inference, however, these units are not a universal replacement for conventional CPU cores: prefill, decode, attention, and KV-cache operations expose different arithmetic intensities, vector behavior, and layout requirements, while SME units and CPU cores still compete for shared memory bandwidth. This paper studies this mismatch through a roofline-based characterization of SME-enabled CPUs and uses the resulting model to guide operator-level execution choices. We present SMEPilot, an LLM inference engine that selects CPU-only, SME-only, or cooperative SME+CPU execution for each operator shape. SMEPilot partitions matrix work across SME and CPU cores at tile granularity, overlaps SME-suitable matrix stages with CPU-suitable vector stages in attention, and maintains layout state so packed tensor representations are reused rather than repeatedly rebuilt on critical paths. Across Llama-3.2-3B, Qwen3-4B, and Qwen3-30B-A3B on phone, PC, and server platforms, SMEPilot improves end-to-end inference performance by up to 3.94$\times$.
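
The roofline characterization mentioned in this abstract rests on a simple bound: attainable throughput is the smaller of peak compute and memory bandwidth times arithmetic intensity. The sketch below shows that generic bound only; the hardware numbers and operator intensities are made up for illustration and are not taken from the paper.

```python
def attainable_gflops(peak_gflops: float, bandwidth_gbs: float,
                      intensity_flop_per_byte: float) -> float:
    """Classic roofline bound: min(peak compute, bandwidth * arithmetic intensity)."""
    return min(peak_gflops, bandwidth_gbs * intensity_flop_per_byte)

# Hypothetical device: 2000 GFLOP/s matrix unit, 100 GB/s shared memory bandwidth.
PEAK, BW = 2000.0, 100.0

# Hypothetical operators: a decode-style GEMV (low intensity) and a
# prefill-style GEMM (high intensity).
for name, intensity in (("decode GEMV", 0.5), ("prefill GEMM", 64.0)):
    bound = attainable_gflops(PEAK, BW, intensity)
    regime = "memory-bound" if bound < PEAK else "compute-bound"
    print(f"{name}: {bound:.0f} GFLOP/s ({regime})")
```

Under these assumed numbers the low-intensity operator is limited by bandwidth regardless of how fast the matrix unit is, which is the kind of mismatch the abstract describes.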

11.
medRxiv (Medicine) 2026-06-15

Beyond the Apnea-Hypopnea Index: Physiological and Demographic Predictors of Excessive Daytime Sleepiness in Obstructive Sleep Apnea

Excessive daytime sleepiness (EDS) is a common but inconsistently predicted symptom of obstructive sleep apnea (OSA). OSA is typically diagnosed with polysomnography (PSG), and the current standard for severity assessment is the apnea-hypopnea index (AHI). AHI has many limitations, including its inability to explain physiological mechanisms or reflect variability in patient symptoms, such as EDS. This retrospective study aims to find physiological and demographic parameters that better predict EDS in patients with OSA and to evaluate whether these parameters outperform AHI using PSG data from the Mount Sinai Integrative Sleep Center. Clinical variables used to predict EDS included arousal index (AI), average oxygen desaturation during sleep, average heart rate during sleep, and AHI, along with demographic variables including age, sex, and BMI. Hypothesis tests, logistic regression models, and decision tree classifier models were applied to the data to discriminate sleepy from nonsleepy patients as determined by an Epworth Sleepiness Scale (ESS) score ≥ 10. AI and oxygen desaturation were found to be the most predictive physiological variables, and sex and BMI were found to be the most predictive demographic variables. The final decision tree model with these four variables outperformed the AHI in predicting EDS. These findings suggest that daytime sleepiness in OSA can be better explained by measures of apnea burden, oxygenation impairment, and patient demographics than by AHI alone, although these remain only modestly predictive. Future studies should focus on investigating more comprehensive physiological markers, multi-night sleep data, and more objective assessments of sleepiness.

12.
arXiv (CS.AI) 2026-06-15

Listening with Attention: Entropy-Guided Explainability for Transformer-Based Audio Models

arXiv:2606.14647v1 Announce Type: cross Abstract: Transformer-based automatic speech recognition (ASR) models such as Whisper are highly accurate, but their predictions remain difficult to interpret. Existing explainable AI (XAI) methods often lack faithfulness and precise temporal grounding. We propose Listening with Entropy-guided Attention for Faithful explainability (LEAF-X), a model-intrinsic XAI framework for transformer-based ASR. LEAF-X combines entropy-guided attention weighting, multi-layer attention rollout, and optional causal ablations to identify low-entropy, high-impact heads and layers, producing sparse token-to-frame attributions. Unlike perturbation-based explainers or raw attention maps, LEAF-X exploits the internal structure of encoder-decoder and speech-augmented decoder-only models to generate explanations that better reflect model computation. Results show 32% improved faithfulness, 35-39% stronger locality/sparsity, and the most stable attributions, supporting more transparent and auditable ASR.

13.
arXiv (CS.CV) 2026-06-16

EyeMVP: OCT-Informed Fundus Representation Learning via Paired CFP–OCT Pretraining

Color fundus photography (CFP) is the mainstay for large-scale retinal screening, yet its diagnostic capacity is constrained by the lack of depth-resolved structural information. Optical coherence tomography (OCT) provides cross-sectional retinal anatomy, but is less accessible in population-level screening. Here, we present EyeMVP, a cross-modal retinal foundation model that uses paired CFP–OCT pretraining to learn OCT-informed CFP representations. EyeMVP is pretrained on 674,893 strict same-eye same-day paired CFP–OCT image triples from 112,642 patients across eight hospitals in China. The model uses cross-modal masked reconstruction to enrich CFP representations with OCT-associated supervision, while requiring only CFP images at inference. To accommodate the non-aligned imaging geometry between en-face CFP and cross-sectional OCT, EyeMVP combines source-constrained cross-attention with CFP-derived structural masks. Across 16 downstream tasks, including classification, segmentation, few-shot adaptation, and cross-modal retrieval, EyeMVP outperforms representative retinal foundation models and shows consistent gains on tasks involving macular and optic nerve structure. For CFP-challenging macular diseases, EyeMVP achieves an AUROC of 0.948 for macular edema (vs. 0.852 for EyeCLIP) and 0.825 for myopic macular schisis. In an exploratory reader study, EyeMVP exceeds junior and intermediate ophthalmologist groups but does not reach senior ophthalmologist performance on macular edema, while showing numerically higher balanced accuracy than all reader groups on myopic macular schisis. These results suggest that pixel-level cross-modal reconstruction can enrich CFP representations with OCT-associated supervision, providing a practical route toward stronger CFP-based retinal analysis in screening settings.

14.
bioRxiv (Bioinfo) 2026-06-20

SAbDab2: The structural antibody database in the age of machine learning

The Structural Antibody Database (SAbDab) is a publicly available repository of experimentally determined antibody structures, first released in 2013. Explicit support for single-domain antibodies was added in 2021, with SAbDab-nano. Recently, increasing interest in antibodies has led to a proliferation of novel antibody formats, while simultaneous advances in machine learning have increased demand for standardised, high-quality structure data. Here, we present SAbDab2, re-engineered for the machine-learning age. It introduces support for a variety of new formats, and makes it easy to retrieve and compare all known structures of a given antibody. In addition, SAbDab2 provides ready access to ML-grade structures of antibodies and antibody–antigen complexes, with standardised, versioned train/test splits. These will be updated every six months going forward, and are available at https://zenodo.org/records/20083995. SAbDab2 itself is updated weekly and is freely available at https://sabdab2.opig.stats.ox.ac.uk.

15.
arXiv (CS.AI) 2026-06-16

The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers

arXiv:2606.16974v1 Announce Type: new Abstract: The reproducibility crisis has directed the AI research community toward improving documentation practices. Several studies have identified methodological issues, and in response, the most impactful venues in the field have introduced reproducibility checklists. We seek to understand whether documentation practices have changed over time by assessing all published papers at five leading AI conferences over the past decade. Seven reproducibility variables were identified, quality-assured and used to analyse 56 800 publications. Our analysis reveals that in the period 2014 to 2024, documentation practices have improved; papers sharing both code and data increased nearly sixfold, from 11% to 64%. Building on empirical reproducibility rates from a prior study, we estimate (inferred from documentation practices, not direct testing) that reproducibility increased from 28% in 2014 to 64% in 2024. Improvements in documentation practices predate the introduction of reproducibility checklists, suggesting these changes reflect a broader movement toward open science rather than a direct response to formal requirements.

16.
medRxiv (Medicine) 2026-06-11

The impact of pre-stroke statin use on baseline corrected infarct volume and collateral perfusion

Stroke is a leading cause of disability and mortality worldwide, with ischaemic stroke the most prevalent type. Statins, used for cholesterol management, have demonstrated benefits in reducing stroke risk and improving outcomes in preclinical studies. However, the impact of pre-stroke statin use on stroke outcomes remains inconsistent. In this study, we aim to evaluate whether pre-stroke statin use is associated with greater volume of salvaged tissue and improved cerebral collateral perfusion. A retrospective analysis was conducted using data from 281 patients presenting with acute ischaemic stroke to the John Hunter Hospital between May 2015 and May 2020. Patients were grouped based on pre-stroke statin use, and clinical variables, including infarct volume and collateral perfusion, were assessed. The primary outcome was salvage volume, derived from baseline perfusion lesion volume minus infarct volume at follow-up. Collateral perfusion was measured by the hypoperfusion volume defined by delay time (DT) >6 seconds divided by the hypoperfusion volume defined by DT >2 seconds. Patients on statins at admission were significantly older and had more comorbidities. No significant association was found between pre-stroke statin use and salvage volume or collateral perfusion after adjusting for covariates. Larger initial infarct core was a significant predictor of salvage volume due to larger salvageable tissue volume at baseline. These findings indicate that pre-morbid statin use is not associated with larger salvage volume or improved cerebral collateral perfusion.
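
The two derived measures defined in this abstract are simple arithmetic on imaging volumes. The sketch below restates them in code; the function names, the use of millilitres, and the example values are assumptions for illustration and do not come from the study data.

```python
def salvage_volume(baseline_perfusion_lesion_ml: float,
                   followup_infarct_ml: float) -> float:
    """Salvage volume = baseline perfusion lesion volume - follow-up infarct volume."""
    return baseline_perfusion_lesion_ml - followup_infarct_ml

def collateral_ratio(volume_dt_gt6_ml: float, volume_dt_gt2_ml: float) -> float:
    """Hypoperfusion volume with DT > 6 s divided by hypoperfusion volume with DT > 2 s."""
    if volume_dt_gt2_ml <= 0:
        raise ValueError("DT > 2 s volume must be positive")
    return volume_dt_gt6_ml / volume_dt_gt2_ml

# Hypothetical patient values, in millilitres.
print(salvage_volume(80.0, 30.0))    # 50.0
print(collateral_ratio(20.0, 80.0))  # 0.25
```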

17.
arXiv (CS.LG) 2026-06-16

Greedy Coordinate Diffusion: Effective and Semantically Coherent Adversarial Attacks via Diffusion Guidance

arXiv:2606.15531v1 Announce Type: new Abstract: Fine-tuning aligned language models on benign tasks (e.g. math tutoring) systematically breaks safety guardrails, even when training data contains no harmful content. While mechanistic approaches have shed light on where alignment resides in model weights, they do not provide a general formal framework for deriving guarantees about when fine-tuning degrades it – leaving the field without principled tools for predicting or preventing alignment collapse. We develop a local geometric framework through geometric analysis of parameter-space trajectories and apply it to understand the fragility of alignment in fine-tuning. While first-order analysis suggests orthogonal updates are safe, we prove this is illusory: the curvature of the fine-tuning loss induces second-order acceleration that can induce second-order drift into alignment-sensitive regions. We formalize a construct of our framework as the Alignment Instability Condition (AIC), three geometric properties that, when present, are sufficient to guarantee degradation. Our main result proves quartic onset of alignment degradation along gradient-flow trajectories, determined by how sharply alignment depends on specific parameters and how strongly tasks couple to these parameters. These findings yield formal sufficient conditions under which static first-order protection can fail under gradient descent. We further empirically validate the framework's foundations, showing that the Fisher Information Matrix provides a proxy for the degree of safety degradation across diverse fine-tuning.

18.
arXiv (math.PR) 2026-06-15

On the Poisson Follower Model

arXiv:2309.04864v5 Announce Type: replace Abstract: We introduce a stochastic geometry dynamics inspired by opinion dynamics that captures the essence of modern asymmetric social networks with leaders and followers. Points in the Euclidean space represent opinions, and the leader of an agent is the one with the closest opinion. In this dynamics, each follower updates its opinion by halving the distance to its leader. We demonstrate that this simple dynamics and its iterations exhibit several interesting purely geometric phenomena related to the evolution of leadership and opinion clusters, which resemble those observed in social networks. We also show that when the initial opinions are randomly distributed as a stationary Poisson point process, the spatial frequency of each of these phenomena can be expressed through an integral geometry formula involving semi-algebraic domains. Finally, we analyze numerically the limiting behavior of this follower dynamics. In the Poisson case, the agents fall into two categories: ultimate followers, who continue updating their opinions indefinitely, and ultimate leaders, who adopt a fixed opinion after a finite time. Spatial discrete event simulations support all our findings.
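
The update rule in this abstract (each follower halves the distance to the agent with the closest opinion) can be sketched on a small finite point set. This is only an illustration under stated assumptions: updates are applied synchronously, every point is treated as having a leader, ties are broken by list order, and the points are a fixed toy configuration rather than a Poisson point process. The function name `follower_step` is hypothetical.

```python
import math

def follower_step(points):
    """One synchronous update: each point moves halfway toward its nearest neighbour."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    updated = []
    for i, p in enumerate(points):
        # The leader is the closest other point (ties resolved by list order).
        leader = min((q for j, q in enumerate(points) if j != i),
                     key=lambda q: math.dist(p, q))
        updated.append(tuple((a + b) / 2 for a, b in zip(p, leader)))
    return updated

opinions = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
print(follower_step(opinions))
```

In this toy configuration the first two points are each other's nearest neighbour and meet at their midpoint, while the third moves halfway toward the second point's original position.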

19.
PLOS Computational Biology 2026-06-15

A multilevel hierarchical framework for quantification of experimental heterogeneity in population snapshot data

by David J. Warne, Xiangrun Zhu, Thomas P. Steele, Stuart T. Johnston, Scott A. Sisson, Matthew Faria, Ryan J. Murphy, Alexander P. Browning

Biological systems exhibit substantial heterogeneity: that is, variation in specific characteristics of individuals within a population. As a result, it is of critical importance to appropriately account for biological heterogeneity when calibrating mathematical models to infer cellular processes and predict behaviour. Recent approaches consider ordinary differential equations with random parameters to quantify heterogeneity in dynamical processes of cells. In this setting, statistical inference is performed to characterise the distribution of these random parameters within a cell population. One significant limitation of this approach is the tacit assumption that there are no substantial deviations in these distributions across experimental replicates. In this work, we propose a flexible Bayesian hierarchical differential equation modelling framework that quantifies and distinguishes both inter-experimental heterogeneity (heterogeneity between experimental replicates) and intra-experimental heterogeneity (biological heterogeneity within replicate populations). We consider two recent studies that employ mathematical models to interpret flow cytometry snapshot data and quantify heterogeneity in nano-particle cell interactions and cell internalisation processes. Using simulation data, we demonstrate that substantial inaccuracy in the inferred dynamics can arise when experimental heterogeneity is not accounted for. By contrast, our hierarchical approach is robust to variability in inter-experimental and intra-experimental heterogeneity and our method simplifies to previous methods when inter-experimental heterogeneity is negligible. Our approach is flexible and widely applicable to applications involving replicate populations and snapshot data. We provide open-source implementations of our methods on GitHub.

20.
medRxiv (Medicine) 2026-06-15

Pulmonary extracellular vesicles drive alveolar macrophage dysfunction via microRNA transfer in Acute Respiratory Distress Syndrome

Background: Alveolar macrophage (AM) dysfunction contributes to Acute Respiratory Distress Syndrome (ARDS) pathogenesis. We investigated the role of extracellular vesicles (EVs) in mediating this dysfunction. Methods: Pulmonary EVs were isolated from broncho-alveolar lavage and non-directed bronchial lavage samples of ventilated sepsis patients with and without ARDS, and post-operative control patients via ultracentrifugation. AMs were isolated from lung tissue resections of lobectomy patients. AMs were treated with pooled EVs for 24 hours prior to functional, metabolic and autophagy profiling. EV cargo was profiled via small RNA transcriptomics and proteomics. Mechanistic role of EV microRNAs was assessed via mimic / antagomir transfection. Results: Pulmonary EVs from sepsis patients with ARDS impaired AM efferocytosis, and control EVs had no effect. ARDS EV treatment enhanced AM mitochondrial-linked respiration, but not glycolysis. ARDS EV treatment impaired LC3B-II and LAMP1 expression, indicating dysregulated AM autophagy-lysosomal machinery. Proteomics revealed downregulation of innate immune pathways in ARDS EVs. Transcriptomics revealed enrichment of 24 microRNAs in ARDS EVs; miR-652-3p was the most enriched, validated by RT-qPCR. EV miR-652-3p was associated with 90-day mortality (9.20 vs 0.59 RQ, p=0.0295) and inversely correlated with oxygenation (PaO2/FiO2). AM transfection with miR-652-3p mimic induced similar dysregulation of function and autophagy as ARDS EVs. Transfection of ARDS EVs with antagomirs to miR-652-3p prior to AM treatment partially rescued efferocytosis and autophagy. Conclusions: Targeting EV miR-652-3p may restore alveolar macrophage function and reduce excessive inflammation, thus offering a novel therapeutic strategy for patients with ARDS.

21.
arXiv (CS.LG) 2026-06-17

Evaluating Open-Source LLMs for Multi-Label ATT&CK Technique Classification on CTI Reports

arXiv:2606.18166v1 Announce Type: cross Abstract: Classifying Cyber Threat Intelligence (CTI) using MITRE Adversarial Tactics, Techniques, and Common Knowledge (ATT&CK) is essential for proactive defense, but historically required extensive human effort. Pre-Large Language Model (LLM) automation sped up this process, but could not resolve the complex language and multi-step attack patterns found in unstructured CTI reports. LLMs addressed previous limitations by using contextual reasoning to understand unstructured text. However, current evaluations rely on simplified, single-technique sentences that ignore the complexity of real-world CTI reports, which often leads to inflated performance results. Consequently, the baseline performance of open-source LLMs on complex unstructured CTI reports remains unevaluated. To address this gap, we constructed a ground-truth dataset of 2,076 human-annotated sentences (1,281 technique-positive, 795 negative) from 83 complex unstructured CTI reports. These sentences were mapped to 114 unique ATT&CK techniques using a six-phase annotation process, achieving $\kappa = 0.68$ inter-annotator agreement. Using this dataset, we evaluated seven open-source LLMs ranging from 8B to 236B parameters across prompt strategy and temperature configurations. The highest-performing LLM achieved a micro-averaged F1 score of 0.22, establishing the empirical baseline for multi-label ATT&CK classification on complex unstructured CTI. Parameter size showed a statistically significant positive correlation with F1 score. Prompt strategy and temperature produced no statistically significant gains across model configurations. These results indicate that current open-source LLMs are insufficient for production-grade ATT&CK classification. The dataset, benchmark, and findings provide a reproducible foundation for future CTI research.
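
For context on the headline metric, the sketch below shows the standard way a micro-averaged F1 score is computed for multi-label predictions: true positives, false positives, and false negatives are pooled over all sentences before the score is formed. The label sets are invented for illustration and the paper's exact evaluation protocol may differ.

```python
def micro_f1(gold, pred):
    """Micro-averaged F1 over per-sentence label sets (counts pooled across sentences)."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        g, p = set(g), set(p)
        tp += len(g & p)   # techniques correctly predicted
        fp += len(p - g)   # predicted but not annotated
        fn += len(g - p)   # annotated but missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Hypothetical annotations: one set of technique IDs per sentence
# (an empty set marks a technique-negative sentence).
gold = [{"T1059", "T1566"}, set(), {"T1027"}]
pred = [{"T1059"}, {"T1003"}, {"T1027", "T1105"}]
print(round(micro_f1(gold, pred), 4))  # 0.5714
```

Because counts are pooled, predictions on technique-negative sentences count as false positives, which is one reason scores on full reports can fall well below those on curated single-technique sentences.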

22.
PLOS Medicine 2026-05-21

U = U for all: Advancing equity in HIV prevention

by Thiago S. Torres, Paula M. Luz

Suppression of HIV with antiretrovirals eliminates HIV transmission risk, summarized as Undetectable = Untransmittable (U = U). However, U = U literacy remains unevenly understood and shared, and stigmas persist. Equitable and accurate awareness of U = U requires culturally tailored interventions, improved provider education, and supportive policy environments beyond biomedical evidence alone. In this Perspective, Thiago Torres and Paula Luz outline what is needed to improve equity and accuracy in global awareness and education of U = U.

23.
arXiv (CS.LG) 2026-06-16

Neural Bayesian Anomaly Mitigation: A Robust Loss that Doubles as an Unsupervised Contamination Classifier

arXiv:2606.16524v1 Announce Type: new Abstract: Engineered robust losses such as Huber, Student-$t$, and generalised cross-entropy make supervised models tolerant of contamination but cannot answer which observations are corrupted. We introduce Neural Bayesian Anomaly Mitigation (NBAM), a general-purpose drop-in loss derived from a Bayesian latent-switch mixture model: the marginal likelihood defines a robust supervised loss, and the associated posterior defines an unsupervised contamination classifier. Like Huber or Student-$t$, NBAM can replace the standard training loss in any supervised pipeline; unlike them, it additionally learns a structured contamination model and returns a calibrated per-sample contamination posterior. A learned input-dependent prior $\pi_\phi(x)$ captures the spatial locality of contamination, so that samples near known corruptions are more likely to be flagged, while an Occam penalty emerges automatically and regularises against over-flagging. On CIFAR-10 with asymmetric label contamination, NBAM recovers the structure of the corruption process without supervision: the contamination posterior separates clean from corrupted samples, and the learned anomaly head identifies the direction of every label-flip pair. Alongside these capabilities, NBAM outperforms the four robust-loss baselines considered here at contamination rates 0.2-0.6.
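
The two roles described in this abstract (marginal likelihood as a robust loss, posterior as a contamination classifier) can be illustrated with a generic two-component mixture. The sketch below is not NBAM itself: it assumes a scalar regression target, a narrow Gaussian for clean samples, a broad Gaussian for contaminated ones, and a constant prior `pi` standing in for the learned input-dependent prior. All names and default values are hypothetical.

```python
import math

def normal_pdf(y, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at y."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def mixture_loss_and_posterior(y, mu, pi, sigma_clean=1.0, sigma_contam=10.0):
    """Return (negative log marginal likelihood, posterior P(contaminated | y)).

    pi is the prior contamination probability; the Gaussian components are
    illustrative assumptions, not the likelihoods used in the paper.
    """
    clean = (1.0 - pi) * normal_pdf(y, mu, sigma_clean)
    contam = pi * normal_pdf(y, mu, sigma_contam)
    marginal = clean + contam
    return -math.log(marginal), contam / marginal

# A target near the prediction versus one far from it, with a 10% prior.
for y in (0.0, 8.0):
    loss, posterior = mixture_loss_and_posterior(y, mu=0.0, pi=0.1)
    print(f"y={y}: loss={loss:.3f}, P(contaminated)={posterior:.3f}")
```

The loss grows far more slowly for the distant target than a squared error would, and the same computation yields a per-sample contamination probability at no extra cost.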

25.
arXiv (quant-ph) 2026-06-12

A Quantum Algorithm for Random Number Generation

arXiv:2606.13034v1 Announce Type: new Abstract: We present a quantum algorithm for random number generation that achieves a provable quadratic speedup over classical Markov chain mixing, building on the Diaconis-Shahshahani Fourier analysis of the top-to-random card shuffle. The algorithm integrates three quantum primitives into a unified mixing circuit: the Quantum Fourier Transform (QFT), which diagonalizes the Markov transition operator; controlled phase rotations, which encode the shuffle eigenvalue spectrum; and the Grover diffusion operator, which acts as a quantum analogue of the Aldous-Diaconis strong uniform stopping time by reflecting amplitudes about their mean at each iteration. For an $n$-qubit register, the mixing time is $O(\sqrt{n \log n})$ iterations. Extending to $m$ qudits of local dimension $d$ reduces this to $O(\sqrt{\log_d N})$ iterations, where $N = d^m$, compared to the classical $O(n \log n)$ bound. The qudit formulation further reduces QFT circuit depth from $O(\log^2 N)$ to $O(\log_d^2 N)$ gates per layer by encoding the same $N$-state space using $m = \log_d N$ subsystems instead of $\log_2 N$ qubits. We validate both variants on IBM superconducting hardware.
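
The phrase "reflecting amplitudes about their mean" refers to the standard Grover diffusion step, which maps each amplitude a to 2·mean − a. The sketch below applies that map to a classical list of amplitudes; it illustrates only this one primitive, not the paper's full mixing circuit, and the function name is chosen for illustration.

```python
def grover_diffusion(amplitudes):
    """Reflect every amplitude about the mean: a -> 2 * mean - a."""
    mean = sum(amplitudes) / len(amplitudes)
    return [2 * mean - a for a in amplitudes]

# Four basis states with one amplitude sign-flipped (as after an oracle call).
state = [0.5, 0.5, 0.5, -0.5]
reflected = grover_diffusion(state)
print(reflected)                          # [0.0, 0.0, 0.0, 1.0]
print(sum(abs(a) ** 2 for a in reflected))  # 1.0 (norm is preserved)
```

The reflection is unitary, so the squared amplitudes still sum to one; here it concentrates all the weight on the marked state in a single step.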