Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
arXiv (CS.LG) 2026-06-24

FAIRVAR: Fair Federated Learning via Variance Regularization

arXiv:2508.12042v3 Announce Type: replace Abstract: Federated learning (FL) allows collaborative training of machine learning models across multiple parties without sharing raw data. However, heterogeneous data can cause some clients to have disproportionate influence on the global model, leading to disparities in their performance. Fairness, understood as reducing these disparities, is therefore a crucial concern in FL and has been addressed in various ways. We study performance-equitable fairness in FL, where the goal is to minimize performance disparities across clients. We evaluate several existing fairness-aware methods and introduce a new gradient-variance-regularized method, implemented in two variants: FairGrad (approximate) and FairGrad* (exact). We theoretically characterize the connections between these methods and show empirically, on heterogeneous benchmarks, that FairGrad and FairGrad* consistently improve fairness by reducing variance in client accuracies, while maintaining competitive or improved mean performance compared to existing fairness-aware baselines.

02.
arXiv (CS.AI) 2026-06-12

Otters++: A Time-to-first-spike Based Energy Efficient Optical Spiking Transformer

arXiv:2606.13016v1 Announce Type: new Abstract: Spiking neural networks (SNNs) are promising for energy-efficient inference, and time-to-first-spike (TTFS) coding is especially attractive because each neuron fires at most once. In practice, however, this benefit is often reduced by the cost of computing a temporal decay term and multiplying it by the synaptic weight. We address this issue by turning a physical hardware "bug," the natural signal decay in optoelectronic devices, into the main computation of TTFS, in a design named Otters++. Specifically, we use the measured decay of a custom In$_2$O$_3$ optoelectronic synapse to directly realize the TTFS temporal term, removing the need for explicit digital decay computation. To scale this idea to Transformer models, we establish a layer-wise functional equivalence between Otters++ and a quantized neural network (QNN), and develop a hybrid training method that uses device-faithful SNN computation in the forward pass and straight-through gradients through the equivalent QNN path in the backward pass, together with model distillation. This avoids differentiation through discrete first-spike events and reduces the over-sparsity problem in direct TTFS-SNN training. We further make training aware of measured device noise by sampling run-to-run variation, and refine the system-level energy model by accounting for device sharing and multi-hop communication. On the GLUE benchmark, Otters++ improves the average score to 84.17% while maintaining a clear energy advantage over prior spiking Transformer baselines. These results show that physically grounded TTFS computing can be efficient, trainable, and robust under realistic hardware effects.

03.
arXiv (CS.CL) 2026-06-16

QK-Normed MLA: QK normalization without full key caching

Query-key (QK) normalization stabilizes attention by controlling the scale of queries and keys before the dot product, but is not immediately compatible with Multi-head Latent Attention (MLA). MLA achieves efficient decoding by caching low-dimensional latent states instead of full keys, whereas post-projection QK RMSNorm appears to require the fully projected key for every cached token. We show this apparent incompatibility is an implementation artifact, not an architectural constraint. RMSNorm decomposes into a static affine weight and a dynamic scalar RMS statistic. The static key-side weight can be absorbed into the MLA query-side projection; the dynamic key statistic reduces to one inverse-RMS scalar per token and KV group. The resulting formulation is exactly equivalent to explicit post-projection QK RMSNorm in exact arithmetic and preserves MLA's latent decode path. In our 400M runs trained for up to 100B tokens, QK-Normed MLA achieves lower training loss and better downstream accuracy than QK clipping, while H800 decode benchmarks show less than 2% latency overhead up to 256k context. These results make QK normalization a practical stabilization option for MLA models without requiring full-key caching.
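
The absorption argument in this abstract can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes a single head, omits rotary embeddings and query-side normalization, and uses hypothetical names (`W_up`, `g`, `EPS`, `absorbed_query`). It compares an explicit path (project the latent to a full key, apply RMSNorm, take the dot product) with a latent path that folds the static weight into the query and keeps only one inverse-RMS scalar per cached token.

```python
import math
import random

EPS = 1e-6  # assumed RMSNorm epsilon; not specified in the abstract


def matvec(W, x):
    """Multiply a row-major matrix W (list of rows) by a vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def inv_rms(v):
    """Dynamic part of RMSNorm: one scalar per vector."""
    return 1.0 / math.sqrt(sum(x * x for x in v) / len(v) + EPS)


def explicit_score(q, W_up, g, c):
    """Reference path: materialize the full key, normalize it, then score."""
    k = matvec(W_up, c)
    r = inv_rms(k)
    return dot(q, [gi * ki * r for gi, ki in zip(g, k)])


def absorbed_query(q, W_up, g):
    """Fold the static key-side weight g and the up-projection into the query."""
    qg = [qi * gi for qi, gi in zip(q, g)]
    d_latent = len(W_up[0])
    return [sum(W_up[i][j] * qg[i] for i in range(len(qg))) for j in range(d_latent)]


def latent_score(q_lat, c, r):
    """Latent path: score against the cached latent, then apply the cached scalar."""
    return dot(q_lat, c) * r


random.seed(0)
d_head, d_latent = 8, 4  # hypothetical toy sizes
W_up = [[random.gauss(0, 1) for _ in range(d_latent)] for _ in range(d_head)]
g = [random.uniform(0.5, 1.5) for _ in range(d_head)]
q = [random.gauss(0, 1) for _ in range(d_head)]
c = [random.gauss(0, 1) for _ in range(d_latent)]

r = inv_rms(matvec(W_up, c))  # computed once, when the token is cached
q_lat = absorbed_query(q, W_up, g)
gap = abs(explicit_score(q, W_up, g, c) - latent_score(q_lat, c, r))
```

In this toy setting the two scores agree up to floating-point rounding, which is the sense in which the full key never needs to be cached: only the latent `c` and the scalar `r` are stored per token.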

04.
arXiv (CS.LG) 2026-06-19

Multimodal Concept Bottleneck Models

arXiv:2606.19882v1 Announce Type: cross Abstract: Concept Bottleneck Models (CBMs) enhance the interpretability of deep learning networks by aligning the features extracted from images with natural concepts. However, existing CBMs are limited by their inability to generalize beyond a fixed set of predefined classes and by the risk of non-concept information leakage, where predictive signals outside the intended concepts are inadvertently exploited. In this paper, we propose the Multimodal Concept Bottleneck Model (MM-CBM) to address these issues and extend CBMs to CLIP. MM-CBM utilizes dual Concept Bottleneck Layers (CBLs) to align both the image and text embeddings into interpretable features. This allows us to perform new vision tasks like zero-shot classification or image retrieval in an interpretable way. Compared to existing methods, MM-CBM achieves up to 51.26% accuracy improvement on average across four standard benchmarks. Our method maintains high accuracy, staying within ~5% of black-box performance while offering greater interpretability.

05.
arXiv (CS.AI) 2026-06-19

Execution-bound advisory automation for agentic AI: a reproducible AIBOM-driven CSAF-VEX framework

arXiv:2606.19390v1 Announce Type: cross Abstract: A protocol-driven framework is presented that binds SBOM and AIBOM artefacts to deterministic environment capture and structured runtime telemetry. Exploitability is computed from declared artefacts, observed activation conditions, and enforced execution policies. CSAF VEX advisories are generated from combined static and runtime evidence, cryptographically signed, and validated through deterministic replay. Evaluation uses approximately 10000 component entries across synthetic agentic AI workloads of 50 to 5000 components, incorporating OSV, GitHub Advisory, KEV, and EPSS datasets.

06.
arXiv (CS.AI) 2026-06-19

Charting the Future of Scholarly Knowledge with AI: A Community Perspective

arXiv:2509.02581v2 Announce Type: replace-cross Abstract: Despite the growing availability of tools designed to support scholarly knowledge extraction and organization, many researchers still rely on manual methods, sometimes due to unfamiliarity with existing technologies or limited access to domain-adapted solutions. Meanwhile, the rapid increase in scholarly publications across disciplines has made it increasingly difficult to stay current, further underscoring the need for scalable, AI-enabled approaches to structuring and synthesizing scholarly knowledge. Various research communities have begun addressing this challenge independently, developing tools and frameworks aimed at building reliable, dynamic, and queryable scholarly knowledge bases. However, limited interaction across these communities has hindered the exchange of methods, models, and best practices, slowing progress toward more integrated solutions. This manuscript identifies ways to foster cross-disciplinary dialogue, surface shared challenges, categorize new forms of collaboration, and shape future research directions in scholarly knowledge extraction and organization.

07.
arXiv (CS.CL) 2026-06-16

ACCORD: Action-Conditioned Contextual Grounding for Language Agents

User instructions are often underspecified because humans rely on implicit assumptions about the surrounding environment. For large language model (LLM) agents operating in information-rich digital and physical environments, these assumptions cannot be inferred from the instruction alone; they must be recovered from the current state of tools, data, interfaces, and observations. Effective execution therefore requires agents to identify missing context, ground it in observed evidence, and carry it forward into subsequent actions. We show that current agents often fail to do so. They act from assumed rather than observed specifics, overlook information they could have gathered, and fail to incorporate evidence that has already been returned. Building on this insight, we propose ACCORD (Action-Conditioned Contextual Grounding), a simple and effective agent framework for adaptive grounding. Before each action, ACCORD actively probes the environment for missing information and integrates relevant context from the agent's trajectory that would otherwise be overlooked. Requiring no additional training or task-success signals, ACCORD improves task-goal completion on AppWorld by up to +20.6 points with GPT-5-mini, from 42.0% to 62.6%, compared to strong baselines. These gains persist with a substantially stronger base model (+10.8 with Claude-4.5-sonnet), an open-weight model (+10.1 with Qwen3.5-27B-FP8), and on the embodied AlfWorld benchmark (+7.4 success rate with GPT-5-mini).

08.
arXiv (CS.CV) 2026-06-12

ShowFlow: From Robust Single Concept to Condition-Free Multi-Concept Generation

Customizing image generation remains a core challenge in controllable image synthesis. For single-concept generation, maintaining both identity preservation and prompt alignment is challenging. In multi-concept scenarios, relying solely on a prompt without additional conditions like layout boxes or semantic masks often leads to identity loss and concept omission. In this paper, we introduce ShowFlow, a comprehensive framework designed to tackle these challenges. We propose ShowFlow-S for single-concept image generation, and ShowFlow-M for handling multiple concepts. ShowFlow-S introduces a KronA-WED adapter, which integrates a Kronecker adapter with weight and embedding decomposition, together with a novel Semantic-Aware Attention Regularization (SAR) training objective, to enhance single-concept generation. Building on this foundation, ShowFlow-M directly reuses robust models learned by ShowFlow-S to support multi-concept generation without extra conditions, incorporating Subject-Adaptive Matching Attention (SAMA) and Layout Consistency guidance as plug-and-play modules. Extensive experiments and user studies validate ShowFlow's effectiveness, highlighting its potential in real-world applications like advertising and virtual dressing. Our source code will be publicly available at: https://htrvu.github.io/showflow.

09.
arXiv (CS.LG) 2026-06-24

Computational references are not experiments: pre-registered validation of machine-learned sodium-cathode voltages

arXiv:2606.23725v1 Announce Type: cross Abstract: Machine-learning screens for battery materials are trained and judged almost entirely against computed reference voltages, and those references carry their own systematic errors. We report a case in which this matters quantitatively: our own screening stack (a graph-network voltage screen, a prior-art triage layer, and a local PBE+U bench) fails pre-registered validation against experiment-anchored literature values. Verdict thresholds, failure modes, and the primary metric were committed before analysis. On an operator-audited set of known Na-ion cathodes (n = 6 after one documented exclusion; verdict unchanged at n = 7), the raw held-out mean absolute error was 0.67 V, the pre-registered conservative metric, the upper 95% confidence bound of the cross-validated bias-corrected error, was 1.09 V, and the residual was strongly voltage-dependent (r = -0.94), so no additive calibration is valid. On the two compounds where prediction, database reference, and experiment could all be compared, the Materials Project PBE+U reference sat about 0.54 V below measurement: the reference, not the model, dominated the error. A prior-art screen found at least 70% of the targeted Na substitution space already published. We retire the screen, bound what "verified" means for our DFT ledger, and pre-register a calibration audit of it against four benchmark Li couples.

10.
medRxiv (Medicine) 2026-06-24

Computational Decomposition of New Memory Failure in Alzheimer's Disease Through a Hippocampal Cortical Consolidation Bottleneck Model

Alzheimer's disease (AD) is clinically marked by difficulty retaining newly learned information, yet routine memory scores often conflate poor initial encoding with failure to stabilise information after encoding. This ambiguity limits the mechanistic interpretability of cognitive assessment during the transition from mild cognitive impairment to AD. Here we propose a Hippocampal Cortical Consolidation Bottleneck (HCCB) model to computationally separate these two components of new memory failure. The model represents newly presented information as a rapidly formed hippocampal trace and a slowly stabilised cortical trace, predicting a residual bottleneck when delayed recall falls below the level expected from immediate recall. We operationalised this prediction as the Consolidation Bottleneck Index* (CBI*), a residual index normalised to a cognitively normal reference, and evaluated it using Alzheimer's Disease Neuroimaging Initiative (ADNI) cognitive and MRI data, with independent dynamical support from OpenNeuro EEG. Simulations showed recent memory vulnerability when hippocampal vulnerability exceeded cortical vulnerability. In ADNI, CBI* increased from cognitively normal participants to mild cognitive impairment nonconverters, reached Alzheimer-like levels in mild cognitive impairment converters, and was associated with hippocampal atrophy. CBI* added minimal discrimination beyond established clinical and structural predictors, supporting its role as a mechanistic phenotype rather than a replacement prognostic model. OpenNeuro EEG further showed increased neurodynamic rigidity in AD. Our findings provide a computational framework for quantifying failed stabilisation of newly encoded information in AD progression.

12.
arXiv (CS.AI) 2026-06-17

Knowledge Reutilization in Meta-Reinforcement Learning

arXiv:2606.18132v1 Announce Type: new Abstract: Meta-reinforcement learning enables fast adaptation by extracting shared structure from related tasks, but existing end-to-end methods often couple task inference with embodiment-specific control. This coupling can obscure non-parametric task semantics, reduce sample efficiency, and limit cross-agent reuse. We propose a meta-knowledge reutilization framework that learns task-level knowledge on a dynamics-simplified agent and transfers it to heterogeneous agents. The framework uses a Bayesian non-parametric prior to organize latent task modes and a high-level policy to generate task-level magnitude guidance. To bridge reusable task knowledge with different embodiments, we introduce a semantic-magnitude interface and a lightweight temporal adaptor, which convert frozen meta-knowledge into temporally aligned subgoals for embodiment-specific low-level controllers. Experiments on multiple locomotion agents show that our framework reduces final-step tracking error by 94.75% – 99.79% compared with recent state-of-the-art baselines and achieves comparable deployment performance with about 23.8% of their interaction data.

13.
arXiv (CS.LG) 2026-06-11

Spectrally Regularized Latent Flow Matching for Turbulence Generation

arXiv:2606.11691v1 Announce Type: new Abstract: Latent diffusion and flow matching have emerged as leading approaches for synthetic turbulence generation, yet they systematically under-represent dissipation-range amplitudes. We introduce a latent flow matching framework with a spectrally regularized compression stage that directly targets this failure mode. On a 256^2 DNS dataset at Re_f \approx 2250, replacing an MSE-trained VAE with a zone-weighted log-spectral objective raises deep-dissipation retained spectral power from 25% to 94% in reconstruction and from 20% to 79% in unconditional generation. The improved latent representation also yields a substantially better sampling cost-fidelity tradeoff: the MSE-trained latent space imposes a fundamental quality ceiling near DD bias -0.70 that no integrator or step-count can overcome, while the spectrally regularized latent space reaches DD bias -0.117 at just 20 function evaluations. Mechanistically, encoder-decoder swap experiments show that the improvement is driven primarily by encoder-induced latent reorganization rather than decoder capacity, while a support-amplitude decomposition reveals that MSE-trained models behave as conservative suppression models, minimizing pointwise error by attenuating intermittent high-wavenumber structure. Both pipelines recover the second-order structure function and the correct sign of S_3, indicating the correct cascade direction without explicit supervision. A small residual gap in the magnitude of S_3 suggests that phase-coherent triadic organization remains a complementary axis to amplitude fidelity for future generative turbulence models.

14.
arXiv (CS.LG) 2026-06-25

A 3D-Printable Dataset for Fair Testing and Comparisons of Tactile Sensors

arXiv:2606.25886v1 Announce Type: cross Abstract: Existing texture datasets for tactile sensing primarily consist of sensor readings from a specific sensor interacting with available surfaces/objects rather than describing the textures themselves, limiting fair comparison between tactile sensors and hindering reproducible research. In this work, we introduce a 3D-printable dataset of mathematically defined textures designed to be fabricated reliably across different printers and filament types. The dataset consists of six parametrically generated surface patterns derived from combinations of sine-wave and Fourier-based functions, giving controlled variation in spatial frequency, amplitude, and directional structure. We evaluate the reproducibility of these textures across three popular 3D printers and multiple filament types by measuring variance in images captured using an optical TacTip sensor under controlled contact conditions. Our results show that print quality, particularly peak sharpness and stringing, affects tactile variance, with higher-end printers producing significantly more consistent signatures. Classification experiments using neural networks and PCA-based models further demonstrate that high-quality prints support strong within-printer generalisation, while cross-printer generalisation remains challenging due to geometric inconsistencies. This work establishes the first openly available, physically reproducible 3D-printed texture benchmark, providing a foundation for fair comparison of tactile sensors.

15.
medRxiv (Medicine) 2026-06-24

Atlas of glomerular disease-specific genetic effects on blood transcriptome

IgA nephropathy (IgAN), IgA vasculitis (IgAV), focal segmental glomerulosclerosis (FSGS), membranous nephropathy (MN), and minimal change disease (MCD) account for the majority of idiopathic glomerulo-nephropathies (GN). These disorders involve immune system dysregulation and have a complex genetic architecture. Currently, there are no adequately powered blood transcriptomic datasets coupled to genetic data from patients with GN that can delineate disease-context specific genetic effects on blood immune cell transcriptome. We performed whole genome sequencing coupled with bulk blood transcriptome sequencing on 1,822 participants from the CureGN study, a prospective cohort of participants with a kidney biopsy diagnosis of primary GN. We generated disease-context specific transcriptome-wide maps of gene expression QTL (eQTL), splicing QTL (sQTL), and double strand RNA-editing QTL (edQTL) for FSGS (N=447), IgAN (N=403), IgAV (N=123), MCD (N=408), and MN (N=441), as well as cross-disease maps for all 1,822 participants. Our QTL mapping identified 16,068 eGenes, 4,644 sGenes and 4,611 edQTLs with an FDR

16.
arXiv (CS.CL) 2026-06-17

Your AI Travel Agent Would Book You a Bullfight: An Agentic Benchmark for Implicit Animal Welfare in Frontier AI Models

AI agents are moving from advisors to actors, booking travel, planning menus, and running procurement on behalf of users. Existing benchmarks for AI and animal welfare evaluate model text responses to question-answer prompts, leaving open whether the welfare reasoning surfaced in those responses transfers to agentic deployment where the model must take actions with tools. We introduce TAC (Travel Agent Compassion), the first agentic benchmark measuring whether AI agents avoid options involving animal exploitation when acting on behalf of users. TAC presents an AI agent with twelve hand-authored travel booking scenarios across six categories of animal exploitation, augmented to forty-eight samples to control for price, rating, and position confounds. We evaluate seven frontier models from four labs. Every model scores below the chance level of sixty-four percent, with the best performer (Claude Opus 4.7) at fifty-three percent. A single welfare-aware sentence in the system prompt yields gains of forty-seven to sixty-three percentage points in Claude and GPT-5.5, twenty-six points in GPT-5.2, and under twelve points in DeepSeek and Gemini. An auxiliary Inspect Scout audit of 288 base-condition transcripts from the top two performers, using Gemini 2.5 Flash Lite as judge, flags zero transcripts for evaluation awareness, suggesting the below-chance rates do not stem from the models recognising the evaluation. We discuss implications for category-level variation across cultural domains, the limits of text-response welfare benchmarks, and the EU General-Purpose AI Code of Practice systemic risk framework.

17.
arXiv (CS.LG) 2026-06-17

Constrained Diffusion Models with Primal-Dual Inference

arXiv:2606.17192v1 Announce Type: new Abstract: This paper develops constrained diffusion models with primal-dual inference (PDI) to sample from optimal distributions of entropy-regularized optimization problems with average constraints. We formalize constrained sampling in the Lagrangian dual domain, where the optimal distribution takes the form of a Gibbs distribution indexed by the optimal dual variable. Rather than estimating this dual multiplier before sampling and freezing it throughout generation, PDI jointly infers the optimal primal distribution and its parametrizing dual variable. Each reverse diffusion step denoises using the score field associated with the current multiplier and then updates the multiplier through dual ascent using the estimated constraint violation of the denoised samples. To enable this conditional score field, we train a single dual-conditioned score network over the family of Gibbs distributions induced by the dual variables encountered during inference. We prove that the time average of the dual variables generated along the inference trajectory converges to a neighborhood of the dual optimum and bound the effect of residual dual mismatch on the terminal distribution through schedule-dependent stability factors. We evaluate PDI on constrained sampling from a mixture of Gaussians, wireless resource allocation, and portfolio management.
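
The dual-ascent loop described in this abstract can be illustrated on a toy problem with no diffusion model at all. The sketch below is an assumption-laden stand-in: a four-state discrete reference distribution replaces the generative model, the Gibbs distribution is computed exactly instead of through a learned score network, and the names `gibbs`, `primal_dual`, `eta`, and `b` are hypothetical. It only shows how a multiplier updated from the observed constraint violation settles where the average constraint is met.

```python
import math


def gibbs(p0, g, lam):
    """Gibbs tilt of the reference p0 by the constraint cost g: p ∝ p0 * exp(-lam * g)."""
    weights = [p * math.exp(-lam * gi) for p, gi in zip(p0, g)]
    z = sum(weights)
    return [w / z for w in weights]


def mean_cost(p, g):
    """Average constraint value E_p[g]."""
    return sum(pi * gi for pi, gi in zip(p, g))


def primal_dual(p0, g, b, eta=0.5, steps=200):
    """Alternate a primal step (exact Gibbs here) with projected dual ascent."""
    lam = 0.0
    for _ in range(steps):
        p = gibbs(p0, g, lam)                  # primal: distribution for current multiplier
        violation = mean_cost(p, g) - b        # positive when E_p[g] exceeds the budget b
        lam = max(0.0, lam + eta * violation)  # dual ascent, projected onto lam >= 0
    return lam, gibbs(p0, g, lam)


p0 = [0.25, 0.25, 0.25, 0.25]  # uniform reference over four states
g = [0.0, 1.0, 2.0, 3.0]       # per-state constraint cost
lam, p = primal_dual(p0, g, b=1.0)
```

With the uniform reference the unconstrained mean cost is 1.5, so the budget `b=1.0` is active and the multiplier ends up strictly positive; with a looser budget such as `b=2.0` the multiplier stays at zero and the reference is returned unchanged.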

18.
medRxiv (Medicine) 2026-06-24

TCIA Radiology Image Processing for AI and Radiomics

We developed a standardized, reproducible preprocessing framework for computed tomography (CT) imaging data from multi-institutional repositories such as The Cancer Imaging Archive (TCIA), enabling consistent radiomics and artificial intelligence (AI) analyses. Imaging data from TCGA-KIRC patients available on TCIA were used as a representative heterogeneous dataset characterized by variation in acquisition protocols, inconsistent metadata, and differing image quality. The proposed modular pipeline includes series filtering, DICOM-to-NIfTI conversion, orientation harmonization to a canonical coordinate system, voxel spacing normalization, intensity clipping and normalization, segmentation integration, and metadata validation, and is implemented in a reproducible, notebook-based framework compatible with common radiomics and deep learning workflows. This pipeline standardizes imaging data into analysis-ready volumes with consistent geometry, intensity distributions, and spatial alignment, reducing non-biological variability that can adversely affect radiomic feature stability and model performance. The modular design enables task-specific adaptation of individual preprocessing steps while maintaining overall consistency. Although demonstrated on TCIA, this framework is generalizable to other heterogeneous imaging datasets and provides a foundation for robust, large-scale computational imaging studies.
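
Of the pipeline stages listed, intensity clipping and normalization is the simplest to show in isolation. The sketch below is illustrative only: the function name and the window bounds of -1000 to 400 Hounsfield units are assumptions chosen for the example, not values taken from the described framework, and a real pipeline would operate on image volumes, not flat lists.

```python
def clip_and_normalize(values, lo, hi):
    """Clip intensities to the window [lo, hi], then rescale linearly to [0, 1]."""
    if hi <= lo:
        raise ValueError("window upper bound must exceed lower bound")
    span = hi - lo
    return [(min(max(v, lo), hi) - lo) / span for v in values]


# Hypothetical CT window in Hounsfield units; the bounds are an assumption.
voxels = [-2000, -1000, -300, 400, 3000]
normalized = clip_and_normalize(voxels, lo=-1000, hi=400)
```

Values outside the window saturate at 0 or 1, so scanner-specific extremes (padding values, metal artefacts) no longer stretch the intensity range used downstream.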

19.
medRxiv (Medicine) 2026-06-24

Model-based Detection of Spatial Disease Boundaries Using Amortized Bayesian Inference

Disease boundary analysis identifies abrupt changes in health outcomes across geographic boundaries, guiding targeted public health interventions and outbreak surveillance. Current implementations often adopt a Bayesian "wombling" approach and largely rely on Markov Chain Monte Carlo (MCMC) posterior sampling, presenting scalability issues for large-scale disease surveillance. We leverage amortized Bayesian inference (ABI) to accelerate the detection of spatial health disparities between neighboring US counties by embedding neural posterior estimation within a Bayesian areal wombling framework. Exploiting the computational efficiency of ABI, we further introduce the Residual Disparity Elimination Target, a metric for the required reduction in mortality or prevalence for a region to eliminate a significant disparity with its neighbor. We analyze tracheal, bronchus, and lung cancer mortality rates across mainland US counties and achieve results concordant with MCMC analysis while scaling areal wombling to hundreds of outcomes and translating disparity detection into interpretable policy objectives.

20.
arXiv (CS.LG) 2026-06-16

Filtered ANN as a Phase Transition: When Selectivity-Estimation Error Causes Plan Regret

arXiv:2606.16341v1 Announce Type: new Abstract: A filtered approximate-nearest-neighbor (ANN) query returns the k nearest vectors among those satisfying an attribute predicate P of selectivity s. The best execution strategy – pre-filter, post-filter, or in-filter – changes with s, so a system must estimate s and choose. We model this as an argmax over a landscape with phases (regions where each strategy wins) separated by boundaries, and show that selectivity-estimation error produces plan regret – recall lost versus the oracle strategy – only in the critical regions around those boundaries. The regret is a wedge of log-width equal to the multiplicative estimation error epsilon and height equal to the local cliff |V'(s*)| epsilon; the flip-margin 1/|V'(s*)| is the condition number of a sibling cardinality-estimation study reappearing as the local boundary theory. The two phase boundaries follow from independent mathematics: order statistics place the post-filter cliff at s ~ k/K, and site percolation places the in-filter cliff at s_c ~ 0.83/M for graph degree M (corpus-size independent). Criticality exists only under a constrained budget B < sqrt(k n). Under pre-registered decision rules we confirm, on synthetic sweeps and real SIFT1M, that regret concentrates ~290x at the boundary and that the regret curves obey a finite-size scaling collapse onto one universal wedge across two decades of corpus size. A real approximate index does not mis-locate the boundary, but a biased cost model opens a persistent miscalibration band that estimation-error robustness cannot fix. The contribution is a characterization, not a new index. Code and the full pre-registration are public.
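
The post-filter boundary at s ~ k/K can be made concrete with a small probability calculation. The sketch below rests on assumptions not stated in the abstract: it reads K as the number of candidates fetched before the predicate is applied, and it treats each candidate as passing independently with probability s. The function names are hypothetical, and the 0.83/M constant is simply copied from the abstract.

```python
import math


def prob_enough_survivors(k, K, s):
    """P[Binomial(K, s) >= k]: chance that at least k of K fetched candidates pass the filter."""
    return sum(math.comb(K, j) * s**j * (1 - s) ** (K - j) for j in range(k, K + 1))


def post_filter_boundary(k, K):
    """Selectivity at which the expected number of survivors equals k."""
    return k / K


def in_filter_boundary(M):
    """In-filter cliff for graph degree M, using the constant quoted in the abstract."""
    return 0.83 / M


k, K = 10, 100  # hypothetical query: want 10 results, fetch 100 candidates
well_above = prob_enough_survivors(k, K, 0.30)
at_boundary = prob_enough_survivors(k, K, post_filter_boundary(k, K))
well_below = prob_enough_survivors(k, K, 0.03)
```

Under these assumptions the success probability is close to one well above the boundary, roughly a coin flip at it, and close to zero well below it, which is the cliff-like behaviour that makes estimation error costly only near the boundary.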

21.
PLOS Medicine 2026-06-23

Multi-omics biomarkers of endothelial dysregulation preceding chronic lung allograft dysfunction: A prospective cohort study

Authors: Giulia Iacono, Christina Begka, Bailey Cardwell, Carmel Daunt, Roxanne Chatzis, Celine Pattaroni, Alana Butler, Matthew Macowan, Bronwyn Levvey, Gregory I. Snell, Glen P. Westall, Benjamin J. Marsland

Background: Long-term survival of lung transplant recipients remains limited by chronic lung allograft dysfunction (CLAD). CLAD is only diagnosed following a persistent and substantial decline in lung function, after which irreversible damage to the lungs has occurred, limiting opportunities to effectively intervene at an early stage. There is a critical need for earlier detection prior to its clinical manifestation. The immunological drivers of CLAD remain unclear, limiting the development of predictive biomarkers and new therapies.

Methods and findings: In this hypothesis-generating, prospective cohort study, we profiled the microbial, metabolic, lipidomic, and gene expression dynamics of longitudinally collected broncho-alveolar lavages (BALs) from 56 CLAD-free lung transplant recipients up to 30 months post-transplant, and compared BALs from 13 CLAD-free patients to BALs from 13 patients who developed CLAD. In CLAD-free patients, the first 6 months post-transplant were hallmarked by diminished microbial diversity and increased abundance of Staphylococcus and Candida, coupled with upregulated innate and adaptive immune responses, and elevated nitric oxide metabolism (FDR

22.
arXiv (CS.CV) 2026-06-15

HumP-KD: A Hybrid Uncertainty-Aware Multi-Stage Progressive Knowledge Distillation Framework for Efficient Fire Classification

Real-time fire classification systems require models that are simultaneously accurate, computationally efficient, and deployable on resource-constrained hardware. This work proposes HumP-KD, a Hybrid Uncertainty-aware Multi-stage Progressive Knowledge Distillation framework for efficient fire classification. Two datasets, FlameVision and Dataset-II, containing 8,600 and 31,309 images, are used. Various CNN and transformer baselines are applied under standard preprocessing, online augmentation, Gaussian noise and motion blur robustness conditions. The proposed HumP-KD model distills knowledge from two frozen heterogeneous transformer teachers, Swin-Tiny and ViT-Base, along with their Meta-MLP ensemble, into a lightweight MobileViT-S student via three tightly integrated components. Hierarchical Progressive Knowledge Distillation employs a Hierarchical Feature Builder. It generates a fused spatial attention mask to guide distillation toward discriminative regions selectively. Multi-Stage Knowledge Distillation progressively activates three distillation stages across training. On Dataset-II, HumP-KD achieves a mean F1 score of $0.9876 \pm 0.0063$ across 10 independent trials, significantly outperforming the MobileViT-S baseline trained without distillation ($0.9537 \pm 0.0351$), with statistical significance confirmed by both independent t-test ($p = 0.0195$) and Wilcoxon signed-rank test ($W = 1$, $p = 0.0039$). The proposed method also demonstrates strong generalization across datasets and robustness under degraded visual conditions. The student model retains only 4.94M parameters and 19.01Mb model size, representing a $5.7\times$ parameter reduction over Swin-Tiny and a $17.5\times$ reduction over ViT-Base, while achieving 37.72 CPU FPS, making it suitable for real-time deployment.

23.
medRxiv (Medicine) 2026-06-12

An integrative multi-omics framework identifies epigenetic dysregulation of HAND2 as a potential primary driver of impaired enteric neural crest cell differentiation in Hirschsprung Disease

Hirschsprung disease (HSCR) is a congenital neurodevelopmental disorder characterized by segmental aganglionosis due to impaired developmental processes of enteric neural crest cells (NCCs). Despite being the leading genetic cause of functional intestinal obstruction in early childhood, HSCR represents a paradigmatic challenge in precision medicine: its multifactorial etiology, complex gene-environment interactions and limited resolution of single-modality analyses have long hindered mechanistic understanding and therapeutic translation. Here, we applied an integrative multi-omics approach combining genetic, phenotypic, epigenomic and transcriptomic analyses of matched ganglionic and aganglionic formalin-fixed paraffin-embedded (FFPE) patient tissues, complemented by patient-specific in vitro models. Beyond established genetic contributors, our integrative approach reveals novel regulatory pathways predominantly affecting enteric NCC differentiation, with convergent evidence pointing to epigenetic dysregulation as a primary disease mechanism. Notably, we identified over 1,300 differentially methylated positions between ganglionic and aganglionic FFPE samples, with HAND2 emerging as a key candidate due to multiple hypermethylated sites and consistently reduced expression levels in aganglionic tissues and in vitro models, suggesting a potential role in HSCR pathophysiology. We propose that our multi-omics approach offers a powerful and comprehensive framework for dissecting disease mechanisms. Beyond advancing biological understanding, this strategy holds promise for paving the way for molecularly informed patient stratification and supporting the development of personalized treatment and postoperative management strategies.

24.
arXiv (quant-ph) 2026-06-11

Global vs. Local Discrimination of Locally Implementable Multipartite Unitaries

arXiv:2509.10430v2 Announce Type: replace Abstract: We study single-shot distinguishability of locally implementable multipartite unitaries under Local Operations and Classical Communication (LOCC) and global operations. As unitary discrimination depends on both the choice of probing states and the measurements on the evolved states, we classify LOCC and global distinguishability into two categories: adaptive strategies, where probing states are chosen based on measurement outcomes from other subsystems, and restricted strategies, where probing states remain fixed. Our findings uncover three surprising features in the bipartite setting and establish new structural limits for unitary discrimination: (i) Certain pairs of unitaries are globally distinguishable with restricted strategies but indistinguishable under LOCC, even with adaptive strategies. (ii) There exist sets of four unitaries that are distinguishable via LOCC, yet remain globally indistinguishable with restricted strategies. (iii) Some sets of unitaries are globally indistinguishable under adaptive strategies, when probed with separable states, but become distinguishable via LOCC.
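For orientation, the global (unrestricted) single-shot case for a pair of unitaries $U$ and $V$ has a standard textbook criterion: a probe state $\psi$ discriminates them perfectly when $\langle\psi|U^\dagger V|\psi\rangle = 0$, which is possible exactly when the origin lies in the convex hull of the eigenvalues of $U^\dagger V$, that is, when the eigenphases cannot be covered by an arc shorter than $\pi$. The sketch below is not taken from the paper and says nothing about LOCC. It assumes diagonal unitaries specified directly by their eigenphases, and the function names are hypothetical.

```python
import math

def covering_arc(phases):
    """Length of the shortest arc of the unit circle containing all eigenphases."""
    pts = sorted(p % (2 * math.pi) for p in phases)
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    gaps.append(2 * math.pi - pts[-1] + pts[0])  # wrap-around gap
    return 2 * math.pi - max(gaps)

def globally_distinguishable(phases, tol=1e-9):
    """True if 0 lies in the convex hull of exp(i*phase), i.e. arc >= pi."""
    return covering_arc(phases) >= math.pi - tol

def tensor_phases(a, b):
    """Eigenphases of a tensor product of two diagonal unitaries."""
    return [x + y for x in a for y in b]

# Assumed example: compare the identity with S = diag(1, i), so U^dagger V = S.
s_gate = [0.0, math.pi / 2]

single = globally_distinguishable(s_gate)                        # one copy of S
pair = globally_distinguishable(tensor_phases(s_gate, s_gate))   # local unitary S (x) S

print(single, pair)
```

In this assumed example, S alone is not perfectly distinguishable from the identity, while the locally implementable unitary S ⊗ S is, using the entangled probe $(|00\rangle + |11\rangle)/\sqrt{2}$. Whether a given discrimination task can also be carried out under LOCC is the question the abstract addresses, and this check does not cover it.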

25.
arXiv (CS.CV) 2026-06-19

CARE: Competence-Aware Reward Shaping for Adaptive Reasoning Length in Video-MLLMs

In multimodal video reasoning, reinforcement learning-based methods typically rely on simplistic and inflexible reasoning-length control strategies that fail to adapt to the model's evolving competence. This mismatch may suppress necessary exploration at early stages, while encouraging redundant reasoning and inefficient decoding once the model becomes more competent. In this paper, we propose CARE, a competence-aware reward shaping framework for adaptive reasoning length optimization in multimodal reasoning. Specifically, CARE maintains a smoothed competence estimate via an exponential moving average of pass rates, and uses it to route training into progressive stages that shift the reward preference from exploration-oriented long-form reasoning to efficiency-oriented concise reasoning. To avoid conflating verbosity with intrinsic task complexity, CARE further normalizes reasoning effort with batch-level statistics, and introduces a posterior amplifier to strengthen reward signals for unexpectedly strong performance on historically difficult samples. The proposed mechanism is seamlessly integrated into the GRPO training pipeline and incurs no additional inference-time overhead. Extensive experiments on multiple video reasoning and general video understanding benchmarks demonstrate that CARE consistently improves reasoning accuracy, stabilizes reinforcement learning, and significantly enhances token efficiency. Moreover, CARE exhibits a characteristic inverted-U trajectory of reasoning length during training, and yields shorter yet more informative reasoning traces at convergence, indicating effective adaptive allocation of reasoning budget. We provide the source code for our proposed CARE framework and experiments at https://github.com/1Pansy/Video-CARE.
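The competence-tracking idea described above can be illustrated with a minimal sketch. It assumes a simple exponential moving average of batch pass rates, two hypothetical thresholds for routing between stages, and a z-score as the batch-level normalization of reasoning length. The decay value, thresholds, stage names and function names are all illustrative assumptions, not the authors' implementation; the linked repository is the reference for the actual method.

```python
from dataclasses import dataclass
from statistics import mean, pstdev

@dataclass
class CompetenceTracker:
    """Smoothed competence estimate: an EMA over observed pass rates."""
    decay: float = 0.9      # assumed smoothing factor
    estimate: float = 0.0   # assumed initial competence

    def update(self, pass_rate: float) -> float:
        self.estimate = self.decay * self.estimate + (1 - self.decay) * pass_rate
        return self.estimate

def route_stage(estimate: float, thresholds=(0.3, 0.6)) -> str:
    """Map competence to a training stage (thresholds are hypothetical)."""
    low, high = thresholds
    if estimate < low:
        return "explore"      # reward favors long-form reasoning
    if estimate < high:
        return "transition"
    return "concise"          # reward favors short, efficient reasoning

def normalize_lengths(lengths):
    """Z-score reasoning lengths within a batch (one possible normalization)."""
    mu, sigma = mean(lengths), pstdev(lengths)
    if sigma == 0:
        return [0.0 for _ in lengths]
    return [(n - mu) / sigma for n in lengths]

tracker = CompetenceTracker(decay=0.5)
for batch_pass_rate in (1.0, 1.0):
    tracker.update(batch_pass_rate)

print(tracker.estimate, route_stage(tracker.estimate))
print(normalize_lengths([100, 200, 300]))
```

Normalizing within the batch means a response is judged long or short relative to its peers, which is one way to avoid penalizing length that reflects task difficulty and not verbosity.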