Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
medRxiv (Medicine) 2026-06-19

Performance of family history-based colorectal cancer screening criteria by race and age at diagnosis in the Disparities and Cancer Epidemiology (DANCE) study

Importance: Family history (FH) and age are the primary criteria employed for early colorectal cancer (CRC) risk stratification. We evaluated how well these criteria identify individuals diagnosed with CRC across age and racial groups. Objective: To evaluate the performance of FH and age based screening criteria for identifying individuals with CRC, with attention to differences by race and age at diagnosis. Design, Setting, and Participants: This case control and case only analysis used data from the Disparities and Cancer Epidemiology (DANCE) cohort, a population based study of invasive CRC cases diagnosed from 2013 to 2022, recruited through the Metropolitan Detroit Cancer Surveillance System and the Louisiana Tumor Registry. Analyses included 1,158 non-Hispanic Black (NHB) and non-Hispanic White (NHW) CRC cases and 1,434 cancer-free controls from the Inflammation Health and Lung Epidemiology (INHALE) study, enrolled from the same Detroit catchment area. Data were analyzed in 2025. Exposures: Self reported cancer FH among first-degree (FD) relatives and grandparents, summarized into three FH-based screening criteria: at least one FD relative with CRC (colon early-screening criterion), any FH of Lynch syndrome related cancers, and meeting NCCN criteria for Lynch syndrome genetic testing. Main Outcomes and Measures: Proportion of cases meeting each FH based screening criterion stratified by race and age at diagnosis (

02.
arXiv (CS.CL) 2026-06-19

PASQA: Pitch-Accent-Focused Speech Quality Assessment Model Trained on Synthetic Speech with Accent Errors

Existing mean opinion score (MOS) prediction models typically predict utterance-level naturalness MOS and can be insensitive to localized pitch-accent errors. We propose Pitch-Accent-focused Speech Quality Assessment (PASQA), which explicitly targets pitch-accent correctness. To train our model, we construct a controlled Japanese accent-error dataset by changing accent patterns using an accent-controllable text-to-speech system, and compute a pseudo accent-quality score from the accent-error rate. PASQA builds on self-supervised representations and employs mora-conditioned fusion, ranking loss, an auxiliary accent-error localization task, and speaker-invariant training. Experiments show that conventional models fail to preserve the ordering by accent-error severity, whereas PASQA achieves high ordering accuracy on both seen and unseen speakers. Further, PASQA shows stronger agreement with human accent-correctness judgments. The code is available at https://github.com/lycorp-jp/PASQA.

03.
arXiv (CS.LG) 2026-06-19

Unsupervised Causal Abstractions Discovery

arXiv:2606.19594v1 Announce Type: new Abstract: Causal abstractions formalize when a high-level structural causal model (SCM) captures the interventional behavior of a lower-level SCM. Existing applications of this notion largely follow a hypothesis-testing paradigm: an expert proposes a candidate high-level model and then evaluates if the low-level system implements it. We study the complementary problem of learning a high-level model directly from low-level measurements. Our contributions leverage hypotheses from low-rank causal discovery, and can be summarized as follows: (1) we show that observations generated by a low-rank graph induce latents that form a causal abstraction, (2) we provide identifiability results about these latents, and (3) we propose a practical objective to learn this high-level SCM.

04.
arXiv (CS.CV) 2026-06-12

Distributional Loss for Robust Classification

This paper proposes a novel loss concept for supervised classification tasks. Rather than enforcing a direct mapping from each input sample to a single assigned label, we define an optimization objective over all classifier outputs as a bimodal Gaussian distribution. This softer target formulation implicitly captures class ambiguity, mitigates overfitting, and encourages the learning of more robust decision boundaries, all without requiring additional label information. Experimental results demonstrate consistent improvements in robustness, with particularly pronounced gains in low-data regimes, while requiring only minimal modifications to standard training pipelines.

05.
arXiv (quant-ph) 2026-06-24

Unitary Designs from Doped Matchgate Circuits

arXiv:2606.23800v1 Announce Type: new Abstract: Matchgate circuits realize free-fermion dynamics: they are efficiently classically simulable, yet cannot on their own generate the generic randomness required for universal computation or unitary design formation. We study a controlled route beyond this integrable limit by doping matchgate circuits with non-Gaussian gates (physically, the injection of fermionic interactions into an otherwise free system). Using the matchgate commutant framework, we obtain analytic control over unitary $2$-design formation. For globally scrambled dynamics, the design problem maps exactly onto a classical birth-death Markov chain with an Ornstein-Uhlenbeck continuum limit, recasting the emergence of quantum randomness in terms of spectral gaps and mixing times and yielding rigorous bounds on the number of non-Gaussian gates needed for approximate $2$-designs. These bounds hold for a broad class of parity-preserving non-Gaussian gates, independently of microscopic details, with numerics indicating that the same mechanism governs higher-order designs. Used as local building blocks in a glued-circuit architecture, they yield approximate parity-preserving $2$-designs in polylogarithmic depth with a sparse non-Gaussian gate count, with implications for Page-like entanglement growth and fermionic classical-shadow protocols. Finally, locality reshapes this picture: in local brickwork dynamics, design formation is diffusion-limited and far slower. Our results establish doped matchgate circuits as a controlled, analytically tractable route from free fermions to interaction-generated quantum designs.

06.
arXiv (math.PR) 2026-06-11

Marked random graphs with given degree sequence: large deviations on the local topology

arXiv:2401.00351v2 Announce Type: replace Abstract: We investigate the behavior of the empirical neighborhood distribution of marked graphs in the framework of local weak convergence. Here we extend known results by considering uniform random graphs with given degree sequences and i.i.d. marks on half-edges and vertices. We establish a large deviation principle for such families of empirical measures. The proof builds on Bordenave and Caputo's seminal 2015 paper, and Delgosha and Anantharam's 2019 introduction of BC entropy, relying on combinatorial lemmas that allow one to construct suitable approximations of measures supported on marked trees. Possible applications of these results are in the study of interacting diffusions on top of random graphs.

07.
arXiv (quant-ph) 2026-06-16

Towards Quantum Limited Spatial Resolution of NV-Diamond Magnetometry

arXiv:2508.13438v2 Announce Type: replace Abstract: Optically addressable ensembles of solid-state defects, such as nitrogen vacancy (NV) centers, are a leading modality for imaging-based magnetometry, thermometry and strain sensing. However, monitoring the fluorescence of individual defects within a sub-diffraction ensemble remains an outstanding challenge that currently limits access to atomic-scale features and dynamics. For compact clusters of NVs, we formulate imaging-based atomic sensing as a low-dimensional multiparameter estimation task in which one seeks to localize each defect and quantify the field strength in its immediate vicinity. In this work, we employ optical spatial mode demultiplexing (SPADE) to enhance localization and brightness estimation accuracy at sub-diffraction scales. Specifically, we develop a two-stage sensing protocol that augments direct imaging by projecting the incoming optical field onto point spread function (PSF)-adapted, i.e., PAD spatial modes and Yuen-Kennedy-Lax (YKL) spatial modes enabling efficient extraction of emitter positions and brightnesses. The YKL-SPADE measurement employed for brightness estimation is shown to be quantum-optimal in the case of two emitters and establishes a new connection between quantum detection and estimation theories. We numerically evaluate the statistical performance of our protocol for sub-diffraction optically detected magnetic resonance (ODMR) and Rabi sensing experiments. Compared to conventional focal plane intensity measurements, our protocol improves emitter localization accuracy by 6$\times$ and brightness estimation accuracy by 2$\times$ for tightly confined ensembles, residing well below the diffraction limit.

08.
arXiv (CS.AI) 2026-06-17

Offline Preference-Based Trajectory Evaluation

arXiv:2606.17541v1 Announce Type: cross Abstract: Offline evaluation of agentic systems often collapses trajectories to terminal success, discarding information about partial progress and inducing widespread ties, creating substantial statistical inefficiency by reducing effective sample size and weakening the ability to distinguish systems. We propose preference-based trajectory evaluation, which compares trajectories directly through temporal preferences over progress and time-to-return profiles. We find that, across diverse agentic and interactive benchmarks, standard success-based metrics produce tied comparisons on roughly 75% of instances, whereas trajectory-aware preferences reduce ties to roughly 35%, improving discriminative power, ranking stability, and data efficiency. Our results suggest that benchmark saturation, often attributed to poor data collection or problem difficulty, may also be explained by the choice of evaluation measure.

09.
arXiv (quant-ph) 2026-06-11

Quantum ergodicity and semiclassical measures: mathematical results

arXiv:2606.12098v1 Announce Type: new Abstract: In this chapter we review some results describing the high-frequency eigenmodes of the Laplacian on compact manifolds, or Euclidean domains, for which the geodesic flow is chaotic. We focus on the macroscopic distribution of these eigenmodes, which is described by the concept of semiclassical measure. The main result on the question is the Quantum Ergodicity theorem, originally due to Schnirelman. We provide the detailed proof of this theorem, including the adjustments necessary to treat the case of manifolds with boundary. We also discuss the Quantum Unique Ergodicity conjecture, and some progress towards this conjecture for strongly chaotic (Anosov) systems. In particular, we describe the constraints on admissible semiclassical measures, in terms of their Kolmogorov-Sinai entropy, as well as more recent delocalization results.

10.
arXiv (CS.AI) 2026-06-24

Data Scale, Not Latency, Shapes Cross-Lingual Encoder Transfer in Streaming ASR

arXiv:2606.24169v1 Announce Type: new Abstract: Adapting a streaming speech recognition model to a new language requires choosing between two plausible warm starts: a multilingual (ML) encoder or an English-only (EN) encoder. The common intuition is that the multilingual encoder should help most at low data, but it is unclear how long that advantage persists, whether tight streaming latency amplifies it, and whether it survives deployment quantization. We answer these questions with a controlled sweep of a 0.6 B-parameter cache-aware FastConformer transducer across eight European languages, up to five target-language data scales (100 h to 2500 h), three streaming tiers plus offline decoding, and up to four public test sets. The main result is that multilingual initialization is a data-limited advantage, not a latency-limited one. On FLEURS at 160 ms, the mean EN-ML word error rate (WER) gap falls from +4.21 percentage points (pp) at 100 h to +0.20 pp at 2500 h; a power-law fit summarizes this decay, with each doubling of target-language data roughly halving the remaining advantage. Across the three streaming tiers, the across-language mean EN-ML gap is approximately stable at each scale from 100 to 1000 h, and is near zero by 2500 h. Finally, 4-bit weight-only encoder quantization at the matched 560 ms streaming tier reduces the encoder footprint by about 3x, with an average FLEURS WER increase of about 0.5 pp. The resulting guideline is simple: use multilingual initialization in low-data regimes, treat the choice as effectively irrelevant at large data, and make latency and quantization decisions independently.
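
The "roughly halving per doubling" statement can be sanity-checked from the two endpoint values quoted in the abstract. The sketch below assumes a pure power law, gap(h) = c * h^(-alpha), pinned to only those two points (100 h and 2500 h on FLEURS at 160 ms). This is an illustrative simplification, not the authors' fit, which presumably uses all data scales; the names `alpha`, `c`, and `gap_at` are introduced here for illustration.

```python
import math

# Endpoint values quoted in the abstract (FLEURS, 160 ms tier).
hours_lo, gap_lo = 100.0, 4.21    # target-language hours, EN-ML WER gap in pp
hours_hi, gap_hi = 2500.0, 0.20

# Assumption: gap(h) = c * h**(-alpha). Solve for alpha from the two endpoints.
alpha = math.log(gap_lo / gap_hi) / math.log(hours_hi / hours_lo)
c = gap_lo * hours_lo ** alpha


def gap_at(hours: float) -> float:
    """EN-ML gap (pp) predicted by the two-point power law."""
    return c * hours ** (-alpha)


# Factor by which the gap shrinks each time the data is doubled.
per_doubling = 2.0 ** (-alpha)

print(f"alpha = {alpha:.3f}")
print(f"gap multiplier per doubling = {per_doubling:.3f}")
```

Under this two-point assumption the exponent comes out just below 1, so each doubling multiplies the gap by a factor slightly above one half, which is consistent with the abstract's wording.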

11.
arXiv (CS.CV) 2026-06-17

DriveJudge: Rethinking Autonomous Driving Evaluation with Vision-Language Models

Autonomous driving has shifted towards end-to-end policy learning, where reliable, interpretable policy evaluation is a fundamental challenge as driving quality is highly context-dependent. Commonly used rule-based driving metrics like EPDMS are interpretable but lack context-awareness, while recent VLM-based evaluations are context-aware but limited by ambiguous VLM outputs and weak physical grounding. To evaluate driving in a manner that is both interpretable and context-aware, we introduce DriveJudge. DriveJudge is a driving evaluation agent that combines rule-grounded evaluation with Vision-Language Model (VLM) reasoning and selectively invokes physically-grounded deterministic rule functions after interpreting the environmental context. To train and evaluate DriveJudge, we curate a large-scale dataset of 33,577 challenging driving samples with human annotations on whether the driving behavior is reasonable in the given scenario. With this dataset, we address the underexplored problem of driving metric evaluation, and introduce two human-aligned benchmark tasks: Driving Quality Classification and Trajectory Preference Selection. DriveJudge outperforms EPDMS for driving quality classification by 21.23 AUC, and the recent VLM-based DriveCritic for trajectory preference selection by 6.5%, setting a new standard for interpretable and precise driving evaluation.

12.
medRxiv (Medicine) 2026-06-15

International Consensus Guideline on Management of Genitourinary Adverse Events Associated with Prostate Cancer Radiotherapy

Purpose/Objective: Genitourinary (GU) adverse events (AEs) are common during and after pelvic radiation therapy (RT) for prostate cancer and can substantially impact quality of life. We convened an international committee to establish consensus in the prevention, mitigation, and management of radiation-related acute and late GU AEs, as there are no relevant evidence-based consensus guidelines to inform treating providers. Materials/Methods: A systematic evidence review focused on mitigation and management of radiation-related acute and late GU AEs was performed in PubMed, Embase and Cochrane. The following topics were addressed: management of acute GU AEs in the intact and post-operative settings; RT techniques; bladder outlet obstruction procedures; and indications for urology referral or hyperbaric oxygen therapy (HBO). Evidence-based consensus recommendations were developed using a Delphi process. We highlight the current state of evidence and evidence gaps worthy of future study. Results: Consensus was reached for 31 key questions. For management of lower urinary tract symptoms (LUTS), most evidence comes from trials in patients without cancer and not undergoing RT. A consensus algorithm for medical management of acute GU AEs was developed with the following highlights: (a) alpha blockers as 1st-line for obstructive symptoms in the intact setting, (b) anti-spasmodics as 1st-line for irritative symptoms in the intact setting, and (c) anti-spasmodics as 1st-line in the post-operative setting. The consensus algorithm provides an ordered list of medications to offer if 1st-line options afford inadequate relief. For RT fractionation, randomized clinical trial (RCT) data are available. 40% of panelists rarely or never use standard fractionation over moderate hypofractionation for patients with baseline LUTS, but most consider moderate hypofractionation over SBRT for AUA IPSS > 15. For patients with severe obstructive LUTS (most commonly AUA IPSS > 20), the panel recommends a prophylactic bladder outlet obstruction procedure and, if obstructive symptoms improve, consideration of moderate hypofractionation or SBRT, based on retrospective data. There is one RCT supporting use of HBO for late radiation cystitis. Conclusions: The consensus guideline synthesizes available evidence and expert opinion across key clinical decision points to provide practical guidance in the prevention, mitigation, and management of radiation-related acute and late GU AEs in prostate cancer RT. Envisioned as a living document with periodic updates, this guideline serves as a resource for practicing radiation oncologists by outlining expert-derived consensus recommendations of evidence-based care in areas where high-quality data is limited.

13.
arXiv (CS.LG) 2026-06-19

Minimal Filling Architectures of Polynomial Neural Networks: Counterexamples, Frontier Search, and Defects

arXiv:2605.09609v2 Announce Type: replace Abstract: We provide counterexamples to the unimodal minimal filling architecture conjecture for polynomial neural networks (PNNs) with power activation functions. Fixing the input and output widths, the conjecture states that any minimal filling architecture has unimodal widths for the hidden layers. We found counterexamples via a frontier search, recursive dimension bounds on neurovarieties, and symbolic computation. Notably, several subarchitectures of our main example exhibit large defect, in contrast with the predominantly small-defect behavior observed in prior literature.

15.
arXiv (CS.CV) 2026-06-12

Context-Aware Feature-Fusion for Co-occurring Object Detection in Autonomous Driving

Object detection in autonomous driving requires precise localization and an inherent understanding of the relational context between co-occurring objects. In extremely complex, heterogeneous environments, rare classes, small-scale objects, and frequently appearing objects are difficult for standard object detection frameworks to handle. In this paper, we propose a novel framework called Context-Centric Feature Fusion (CCFF), which utilizes two attention-based modules. The Local Context Fusion Module (LCFM) uses an RoI-to-RoI self-attention mechanism to resolve spatial interactions, mainly for small and partially obscured objects, while the Global Context Attention Module (GCAM) converts object co-occurrence priors into a global context attention token by pooling top-K RoI features, avoiding the computational overhead of pixel-level global pooling. This fusion of local and object-centric global features yields contextualized embeddings that enhance classification results and co-occurring object detection. Our method is evaluated on two datasets, Cityscapes and BDD100K, where it demonstrates significant improvement in relational consistency, achieving a Category-level Consistency Strategy (CCS) of 0.973 and 0.969, respectively. Furthermore, our approach produces substantial gains in small object detection (AP_S: 14.1%) and successfully recovers rare classes such as "Train" that are typically lost in large distributions. Our efficiency report shows that the framework processes images in real time with a 0.2 FPS overhead. The code is available at https://github.com/BinayKSingh/CCFF.

16.
PLOS Medicine 2026-05-20

Brain morphology in Anorexia Nervosa and its subtypes: A multi-cohort study of individual participant data

by Fabio Bernardoni, Dominic Arold, Luis Schoppik, Klaas Bahnsen, Ruiyang Ge, Clara Moreau, Lasse Bang, Federico D’Agata, Giovanni Abbate-Daga, Christian K. Tamnes, Iain Campbell, Owen O’Daly, Ulrike Schmidt, Guido Frank, Stefanie Horndasch, Andreas Hess, Arnd Dörfler, Hans-Christoph Friederich, Joe Simon, Angela Favaro, Luca Lavagnino, Christina E. Wierenga, Amanda Bischoff-Grethe, Amy E. Miles, Allan Kaplan, Aristotle Voineskos, Paul A. M. Smeets, Annemarie A. van Elburg, Unna Danner, Sophia I. Thomopoulos, Laura Berner, Neda Jahanshad, Sophia Frangou, Joseph A. King, Paul Thompson, Stefan Ehrlich Background In a recent coordinated meta-analysis of neuroimaging data, we reported gray matter (GM) alterations in acutely underweight patients with anorexia nervosa (AN). Here, we extend these findings by examining individual variation in brain structure within AN, individual-level differentiation between AN and healthy controls (HC), and differences between AN subtypes, with potential relevance for understanding clinical heterogeneity. Methods and findings We analyzed individual-level data from 11 international sites in the ENIGMA Eating Disorders Working Group, including 570 female participants with AN and 739 HC. We examined cortical thickness, cortical surface area and subcortical volumes in AN versus HC using three complementary approaches: (i) group-level differences in a mega-analysis correcting for age effects, (ii) frequencies of extreme deviations (infra-/supranormal; |z| > 1.96) based on normative reference models by the CentileBrain Initiative, and (iii) individual-level classification performance using machine learning. The same analytic framework was applied to compare AN restricting versus binge-eating/purging subtype, additionally correcting for BMI effects. Mega-analyses reinforced previous meta-analytic findings of pronounced and widespread GM deficits in AN compared to HC. 
Normative modelling revealed that the frequency of infranormal z-scores (23/68 cortical thickness, 13/14 subcortical volume metrics) and supranormal z-scores (35/68 cortical thickness, 17/68 cortical surface area metrics) was significantly higher in AN than expected based on reference data. Individuals with AN could be reliably differentiated from HC using machine-learning classifiers (ROC–AUC = 0.75–0.81). In contrast, neither group-level differences nor frequency of extreme z-scores differed between AN subtypes, and individuals with different subtypes could not be reliably differentiated from each other. Importantly, the observational design cannot distinguish neurobiological differences related to AN from the effects of starvation or low BMI in the AN versus HC analyses. The lack of differences between subtypes does not exclude brain structural differences between AN subtypes that might be detectable with other modalities or analytic approaches. Conclusion Using a mega-analytic approach, we confirm widespread GM deficits in AN, show that these alterations are (in some patients) extreme, and demonstrate that they enable robust classification with superior performance compared to most MRI-based psychiatric classification studies. The absence of differences between AN subtypes may reflect shared neurobiology, though other imaging modalities may reveal distinctions beyond brain structure.

17.
arXiv (quant-ph) 2026-06-16

Boson Sampling as a Probe of Chaotic and Integrable Quantum Dynamics in a Photonic Chip

arXiv:2605.25398v2 Announce Type: replace Abstract: Quantum chaos plays a key role in understanding complex quantum dynamics, while integrated photonics offers unique advantages for quantum applications, including high-speed operation, scalability, and programmable unitary transformations. However, integrated photonic approaches to probing quantum chaos remain largely unexplored, owing to the absence of a clear connection between programmable photonic dynamics and established chaos diagnostics. In this work, we establish Fock-state boson sampling as a practical probe of quantum chaos by exploiting the sensitivity of multiphoton interference to the random-matrix properties of underlying single-particle unitary dynamics. More importantly, we design and fabricate a programmable quantum photonic chip to experimentally implement this framework, achieving the first integrated-photonic demonstration of quantum-chaos probes based on boson sampling. Experimental results show that the three complementary probes proposed in this work, namely the distance to Porter–Thomas statistics, Shannon entropy, and Out-of-Time-Ordered-Correlator-equivalent observables, exhibit close agreement with theoretical predictions and consistently distinguish chaotic and integrable dynamics. Our work provides a scalable route for investigating complex quantum dynamics on programmable photonic platforms while leveraging the intrinsic advantages of boson sampling through multiphoton interference and complex output statistics.

18.
arXiv (CS.AI) 2026-06-16

Controlled Dynamics Attractor Transformer

arXiv:2606.15207v1 Announce Type: cross Abstract: Transformer architectures have dramatically advanced representation learning and inference in deep models through self-attention mechanisms. In parallel, associative memory (AM) frameworks map representations onto energy landscapes, offering interpretable retrieval mechanisms. However, their continuous-time inference dynamics lack the biological plausibility of classical Continuous Attractor Neural Networks (CANNs). To bridge this gap, we propose Controlled Dynamics Attractor Transformer (CDAT), which couples a mixture von Mises-Fisher (Mo-vMF) attention energy with a Hopfield refinement energy, while augmenting energy descent with a CANN-inspired excitation-inhibition modulation. CDAT instantiates a topology-constrained dynamical system whose couplings encode relational structure among tokens, thereby linking attractor-style dynamics to modern energy-based attention. We further provide a constructive dissipation analysis to formally establish their controlled inference dynamics. Benefiting from these robust and structured dynamics, CDAT achieves state-of-the-art performance across multiple benchmarks in graph anomaly detection and graph classification.

19.
arXiv (CS.CV) 2026-06-16

Proact-VL: A Proactive VideoLLM for Real-Time AI Companions

Proactive and real-time interactive experiences are essential for human-like AI companions, yet face three key challenges: (1) achieving low-latency inference under continuous streaming inputs, (2) autonomously deciding when to respond, and (3) controlling both quality and quantity of generated content to meet real-time constraints. In this work, we instantiate AI companions through two gaming scenarios, commentator and guide, selected for their suitability for automatic evaluation. We introduce the Live Gaming Benchmark, a large-scale dataset with three representative scenarios: solo commentary, co-commentary, and user guidance, and present Proact-VL, a general framework that shapes multimodal language models into proactive, real-time interactive agents capable of human-like environment perception and interaction. Extensive experiments show Proact-VL achieves superior response latency and quality while maintaining strong video understanding capabilities, demonstrating its practicality for real-time interactive applications.

20.
arXiv (CS.CL) 2026-06-18

Freeing the Law with LOCUS: A Local Ordinance Corpus for the United States

Progress in legal AI increasingly depends on access to authoritative legal text at scale. Yet one of the most consequential layers of American law remains largely absent from existing machine-readable corpora: local ordinances. Local codes govern zoning, housing, business licensing, public health, noise, animal control, and many other domains of everyday regulation, but they are fragmented across vendor platforms designed for human browsing rather than bulk research access. We introduce LOCUS (the Local Ordinance Corpus for the United States), a comprehensive corpus and county-harmonized access layer for U.S. municipal and county ordinance codes. The raw corpus, available for release to researchers, represents nearly all publicly available municipal and county ordinance codes and contains codes from 9,239 cities and counties. A smaller county-harmonized LOCUS access layer provides coverage for the largest 2,309 of 3,144 U.S. counties, accounting for a majority of the population. We use OCR to handle the myriad document formats that have kept the law from being a public resource. We release the corpus with coverage metadata to support reproducibility, downstream legal AI research, and the incremental expansion of machine-readable access to local law. We train a collection of ModernBERT-based classifiers and scorers to facilitate analyzing U.S. local law along several dimensions, such as opacity and paternalism, that have not previously been studied at this scale. LOCUS-v1 and its derivative models are available at: https://huggingface.co/datasets/LocalLaws/LOCUS-v1

21.
arXiv (CS.AI) 2026-06-17

SketchXplain: Intuitive Visual Explanations of Image Classifiers with Sketches

arXiv:2606.17646v1 Announce Type: cross Abstract: Saliency map visualizations explain image-based AI predictions by pointing to regions, but these are often unintuitive and semantically unclear, leaving an interpretability gap. We argue that AI explanations should be intuitive – coherent to user knowledge, yet simple and selective to accelerate interpretation. Inspired by artistic drawings, we propose SketchXplain to generate sketch-based visual explanations for intuitive image-based explainable AI (XAI). Combining techniques in saliency maps, concept-bottleneck models, and sketch optimization, SketchXplain integrates saliency to select coherent observation artifacts, concepts for knowledge coherence, cues to represent them, and abstraction for simplicity. Evaluating on face expression recognition, modeling and user studies showed that SketchXplain supported quicker interpretation with more aligned visualizations than saliency maps or simple drawings. Further evaluation on skin lesion diagnosis found that SketchXplain more coherently visualized disease symptoms, better supporting lay diagnosis. Thus, this work illustrates the value of sketches for intuitive, simple, coherent, and quick image-based XAI visualizations.

22.
arXiv (CS.AI) 2026-06-18

Mechanism-Guided Selective Unlearning for RLVR-Induced Reasoning

arXiv:2606.19222v1 Announce Type: cross Abstract: We propose MAST (Mechanism-Aligned Selective Targeting), a mechanism-guided method for unlearning RLVR-induced reasoning with substantially lower collateral damage than standard full-parameter updates. In matched SFT/RLVR checkpoints on Qwen2.5-Math-1.5B and Qwen3-1.7B-Base, the SFT-to-RLVR increment differs sharply from the SFT update in token-level delta-log-probability, and full-parameter gradient ascent forgets only by damaging retain MATH and GSM8K. MAST ranks attention-projection tensors by off-principal energy, update magnitude, and forget-gradient coupling magnitude, then updates only the top-ranked subset. On the primary model, MAST induces statistically significant target forgetting (MATH forget 45/150 to 37/150; McNemar p=0.0078) while preserving GSM8K (+0.8 pp) and MATH retain (-0.5 pp). The advantage reproduces across seeds, NPO/SimNPO objectives, and Qwen3, where MAST preserves GSM8K while full-parameter unlearning collapses it.
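
The reported McNemar p-value can be reproduced with a short exact-test sketch. The abstract gives only the marginal counts (45/150 to 37/150), not the discordant pairs, so the split used below (8 items correct before and wrong after, 0 in the other direction) is an assumption: it is the simplest split consistent with a net change of 8, and it happens to give the quoted p-value. The function name `mcnemar_exact` is introduced here for illustration.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the discordant pair counts.

    b: items correct before and wrong after; c: the reverse.
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Probability of a split at least as lopsided as the observed one, one tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant split consistent with 45/150 -> 37/150.
p_value = mcnemar_exact(8, 0)
print(round(p_value, 4))  # 0.0078
```

With 8 discordant pairs all in one direction, the two-sided p-value is 2 / 2^8 = 0.0078125. Other splits with the same net change, such as 9 versus 1, would give a larger value.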

23.
arXiv (CS.CV) 2026-06-15

Value-order Decomposition for Generalist Anomaly Detection

Industrial anomaly detection suffers from limited data, making cross-domain generalization particularly challenging. Generalist Anomaly Detection (GAD) aims to train a unified model on a source domain that can effectively detect anomalies in unseen target domains. In the initial semantic feature space, strong entanglement between anomalies and object categories or defect types hinders effective generalization across domains. Recent works address this issue by projecting features into a residual space; however, such methods primarily increase cross-domain overlap for normal features, while anomalous features remain specific to object categories, defect types and data domains, leading to poor alignment and generalization. To address this limitation, we propose Value-order Decomposition (VOD), a simple yet effective technique that bridges three types of generalization gaps across object categories, defect types (including real and synthetic defects), and data domains. VOD disentangles and suppresses object-category-, defect-type-, and domain-specific information, promoting alignment within normal and abnormal samples while preserving their separability, thereby enabling robust generalization across the three gaps. Leveraging the strong alignment between real and synthetic defects within the same object, we perform anomaly detection using only normal and synthetic-abnormal reference, and effectively generalize to unseen real defect types. Experiments on diverse industrial and medical benchmarks demonstrate that our method, using a simple cut-and-paste anomaly simulation strategy, achieves strong generalization across the three gaps.

24.
medRxiv (Medicine) 2026-06-12

Opportunistic CKD Screening in Hospitalized Patients

Background. Chronic kidney disease (CKD) affects 10-13% of adults worldwide but remains largely undiagnosed until advanced stages. Hospitalization provides an opportunity for early detection through opportunistic urine albumin-to-creatinine ratio (UACR) measurement. Methods. We conducted a prospective three-arm study of opportunistic CKD screening in general internal medicine wards at Hadassah Mt. Scopus (MS), Hadassah Ein Kerem (EK), and Shaare Zedek Medical Center (SZMC) in Jerusalem (Protocol HMO-23-0300). Adult inpatients without known CKD or recent UACR were enrolled. Pathological UACR was defined as ≥30 mg/g. Confirmed CKD required two pathological measurements ≥90 days apart (KDIGO-compatible). eGFR was computed using the 2021 CKD-EPI race-free equation. Pooled proportions were estimated by fixed-effects logit meta-analysis; odds ratios by DerSimonian-Laird random-effects models. Results. A total of 158 patients were enrolled (MS n=50, EK n=57, SZMC n=51). Pathological first UACR was identified in 43/158 patients (27.2%; 95% CI 21.3-34.1%; I²=0% across centers). Of 24 patients with a second UACR available, 14 (58%) confirmed CKD, yielding a pooled confirmed-CKD rate of 8.9% of all screened patients. In-hospital mortality was significantly higher among patients with pathological UACR (9.3% vs ~2%; Fisher's exact p=0.012). In per-center multivariate logistic regression, three predictors reached pooled significance: BUN (OR 1.10 per mg/dL, 95% CI 1.04-1.17, p=0.002, I²=0%), heart failure (OR 3.21, 95% CI 1.34-7.70, p=0.009, I²=0%), and diabetes mellitus (OR 2.54, 95% CI 1.11-5.82, p=0.028, I²=17%). Cardiac/vascular admissions had the highest pathological UACR rate (~42%); GI/hepatic admissions had 0%. Conclusions. Opportunistic inpatient UACR screening identifies previously unrecognized CKD in approximately 9% of general internal medicine patients, with consistent results across three independent centers.
BUN elevation, heart failure, and diabetes are the strongest independent predictors. Pathological UACR carries significant short-term mortality risk, supporting integration of routine screening into inpatient care pathways.
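
The two screening quantities named in the Methods can be sketched in a few lines of Python. The eGFR function follows the published 2021 CKD-EPI creatinine equation (race-free), and the UACR check uses the 30 mg/g threshold stated in the abstract. The function names and the example patient values are illustrative assumptions, not taken from the study.

```python
def egfr_ckd_epi_2021(scr_mg_dl: float, age_years: float, female: bool) -> float:
    """eGFR in mL/min/1.73 m^2 from serum creatinine (2021 CKD-EPI, race-free)."""
    kappa = 0.7 if female else 0.9          # sex-specific creatinine scaling
    alpha = -0.241 if female else -0.302    # exponent used when Scr/kappa < 1
    ratio = scr_mg_dl / kappa
    egfr = (142.0
            * min(ratio, 1.0) ** alpha
            * max(ratio, 1.0) ** -1.200
            * 0.9938 ** age_years)
    if female:
        egfr *= 1.012
    return egfr


def is_pathological_uacr(uacr_mg_g: float) -> bool:
    """Threshold stated in the abstract: UACR of 30 mg/g or higher."""
    return uacr_mg_g >= 30.0


# Hypothetical patient: 50-year-old man, creatinine 0.9 mg/dL, UACR 45 mg/g
print(round(egfr_ckd_epi_2021(0.9, 50, female=False), 1))
print(is_pathological_uacr(45.0))
```

A single pathological UACR is only a screening flag here: the study required a second pathological measurement at least 90 days later before counting a case as confirmed CKD.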

25.
bioRxiv (Bioinfo) 2026-06-23

Comorbidity structure as an inductive bias: Comparing output-head designs for multi-label prediction of diabetes and myocardial infarction complications

Background: Clinical complications are often predicted with separate sigmoid outputs, even when the target labels arise from related pathophysiological processes. This paper asks whether output-layer choice should reflect both predictive convenience and the biological structure assumed among complications. The central premise is that label-dependence mechanisms are explicit hypotheses about comorbidity, not generic modelling additions. Methods: Output-head assumptions were compared across two clinically distinct multi-label prediction tasks. In Type 2 diabetes (T2D), six heads were evaluated for nephropathy, neuropathy, and retinopathy: independent baseline, linear additive, multiplicative, symmetric conditional random field (CRF), residual multilayer perceptron (MLP), and combined additive-multiplicative. In myocardial infarction (MI), four heads were evaluated for ventricular tachycardia, ventricular fibrillation, and atrioventricular block: independent baseline, linear additive, multiplicative, and symmetric CRF. All experiments used five training data fractions and seven independent seeds, with the same shared-backbone protocol within each disease setting. Results: In T2D, the symmetric CRF gave the most consistent improvement pattern, ranking highest at full data and at the two lowest data fractions while adding only three interaction parameters. At 20% training data, it was the only interaction head whose aggregate mean exceeded the independent baseline. The residual MLP, despite 123 interaction parameters, remained below the baseline across all T2D fractions. In MI, rankings changed across fractions: the multiplicative head led at 80% and 60%, the CRF led at 100% and 20%, and the baseline led at 40%. The combined additive-multiplicative head did not improve robustness in T2D and showed the largest negative baseline-relative deviations at lower fractions. Conclusions: The findings support a biology-guided view of output-layer design. 
A small constrained mechanism was most useful when its symmetry matched the shared microvascular structure of T2D, whereas the heterogeneous electrophysiology of MI produced no stable winner. Output-layer choice should therefore be reported and defended as an assumption about disease structure instead of a routine hyperparameter decision.
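
To make the "three interaction parameters" of the symmetric CRF head concrete, here is a minimal Python sketch of one common pairwise form for three binary labels. The abstract does not give the parameterization, so the energy function, the name crf_marginals, and the example weights are all assumptions: each label gets a unary logit from the shared backbone, each unordered label pair gets one symmetric weight, and the joint distribution is normalized by enumerating all 8 label configurations.

```python
import math
from itertools import product


def crf_marginals(logits, pair_weights):
    """Per-label marginals under an assumed pairwise CRF over binary labels.

    logits: one unary score per label (e.g. from a shared backbone).
    pair_weights: {(i, j): w} with i < j, one symmetric weight per label pair.
    """
    configs = list(product((0, 1), repeat=len(logits)))
    scores = []
    for y in configs:
        unary = sum(l * yi for l, yi in zip(logits, y))
        pairwise = sum(w * y[i] * y[j] for (i, j), w in pair_weights.items())
        scores.append(unary + pairwise)
    top = max(scores)                       # subtract max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    z = sum(weights)
    return [sum(wt for y, wt in zip(configs, weights) if y[k]) / z
            for k in range(len(logits))]


# Hypothetical logits for three complications and three pairwise weights
logits = [0.0, 2.0, -1.0]
coupled = crf_marginals(logits, {(0, 1): 1.0, (0, 2): 0.5, (1, 2): 0.5})
independent = crf_marginals(logits, {(0, 1): 0.0, (0, 2): 0.0, (1, 2): 0.0})
print(coupled, independent)
```

With all pairwise weights set to zero, the marginals reduce to independent sigmoids of the logits, which corresponds to the independent baseline. Positive weights raise the probability of one complication when a coupled one is likely, and it is this shared-structure assumption that the abstract argues should be stated and defended.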