Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
arXiv (CS.LG) 2026-06-16

Simulation-Augmented Multi-Step Split Conformal Prediction for Aggregated Forecasts

arXiv:2606.16356v1 Announce Type: new Abstract: We study uncertainty quantification for aggregated forecasting tasks such as annual totals and year-over-year growth rates. We propose SA-MSCP, a simulation-augmented multi-step split conformal method that generates future paths from cross-validated residuals using a block bootstrap and constructs prediction intervals from empirical quantiles. Experiments show that SA-MSCP improves empirical coverage over a simulated-path baseline for aggregated and growth-rate targets. Our results demonstrate that simulation-enhanced conformal calibration is an effective and general framework for uncertainty quantification in aggregated time-series forecasting.

02.
arXiv (CS.CV) 2026-06-19

MMD-SLAM: Structure-Enhanced Multi-Meta Gaussian Distribution-Guided Visual SLAM

3D Gaussian Splatting (3DGS) has significantly boosted novel view synthesis and high-fidelity scene reconstruction, expanding the potential of 3DGS-based Visual Simultaneous Localization and Mapping (SLAM) methods. However, most existing systems fail to fully exploit the underlying structural information, which limits rendering quality and often leads to inconsistent maps. To address these limitations, we propose MMD-SLAM, a structure-enhanced Visual SLAM framework that leverages the Atlanta World (AW) assumption to guide a Multi-Meta Gaussian representation for photorealistic mapping. First, we introduce a point-line fusion strategy for pose optimization, where 3D line segments are incorporated to improve tracking robustness and provide additional constraints for mapping. Second, we design a Multi-Meta Gaussian representation with dominant directions, explicitly encoding structural priors from the AW hypothesis. Finally, we propose a Gaussian evolution strategy that adapts to scene geometry and incorporates structural cues into global optimization. Extensive experiments demonstrate that these innovations enable MMD-SLAM to achieve state-of-the-art performance in both tracking accuracy and mapping quality. e.g., our method achieves a 48.56% reduction in ATE RMSE on ScanNet and a 5.71% improvement in PSNR on Replica, compared with MonoGS.

03.
arXiv (CS.CV) 2026-06-19

PCFootprint: A Large-Scale Dataset and Benchmark for Vectorized Building Footprint Extraction from Aerial LiDAR Point Clouds

Building footprint extraction is a fundamental task in photogrammetry, remote sensing, and computer vision. Recent image-based methods have achieved remarkable progress in extracting vectorized footprints from high-resolution optical imagery. However, optical imagery inherently susceptible to occlusions, perspective distortions, and residual relief displacement, yielding incomplete or misaligned footprint extraction. Furthermore, the lack of explicit elevation information limits its direct applicability to Level of Detail building modeling. In this paper, we present PCFootprint, the first large-scale public dataset for footprint extraction from airborne laser scanning point clouds. PCFootprint comprises \num{33000} tiles derived from the Estonian Land and Spatial Development Board, covering diverse urban and rural landscapes. Each tile spans \qtyproduct{128 x 128}{\m} with systematically aligned vectorized footprints aligned to point clouds. The dataset includes a \num{3000} tiles cross-domain test set for evaluating generalization across geographic regions. We establish comprehensive benchmarks by evaluating mainstream methods. Experimental results reveal significant challenges including high intra-class variance, data imbalance, and noise across complex geospatial environments. We believe PCFootprint will advance future research in building modeling, urban scene understanding, and geospatial analysis. The PCFootprint dataset is publicly available at \url{https://huggingface.co/datasets/Haoyuan-Shen/PCFootprint}.

04.
arXiv (CS.AI) 2026-06-15

A Multi-Agent AI System for Automated High School Transcript Processing: Collaborative Document Analysis at Scale

arXiv:2606.13916v1 Announce Type: new Abstract: Each year, college admissions offices face an overwhelming challenge: processing millions of high school transcripts, each with unique formats, grading systems, and layouts. This manual process creates operational bottlenecks that delay admissions decisions and consume valuable resources. We present a transformative solution through a multi-agent AI system where specialized agents collaborate to automatically process diverse transcript formats through intelligent coordination and communication. Our multi-agent architecture consists of three specialized agents-a Pattern Recognition Agent for format-specific parsing, a Semantic Analysis Agent for natural language understanding, and a Vision Intelligence Agent for multimodal document analysis-coordinated by an Orchestration Agent that manages agent communication and result reconciliation. Our key innovation lies in agent-based quality control using GPA extraction as a coordination signal, ensuring reliable agent collaboration and preventing critical information loss. When evaluated on 40 real world transcripts from high schools across 13 U.S. states, our agent system successfully processed every document, achieving 96.7% accuracy compared to expert manual review while maintaining practical processing speeds of 45 seconds per transcript. This work demonstrates how multi-agent coordination can solve complex document processing challenges, offering institutions a scalable, collaborative AI solution that preserves accuracy while dramatically reducing processing time.

05.
arXiv (CS.CL) 2026-06-12

Two Wrongs, No Right: Auditing Social-Desirability Bias in LLM Annotators for Computational Social Science

作者:

LLM annotators are increasingly used in computational social science (CSS), but it is unclear whether their alignment-shaped errors preserve the empirical conclusions a researcher would report. We audit three open-source 7B instruction-tuned models (Zephyr, Mistral-Instruct, Qwen2.5-Instruct) across six TweetEval tasks under four prompt conditions (72 cells) and find that social-desirability failures do not run in a single direction. Zephyr exhibits leniency bias, systematically under-applying harmful labels (offensive language: false benign rate 0.729, false alarm rate 0.031). Mistral and Qwen exhibit overcorrection, over-applying the same labels (Mistral hate-speech FAR = 0.604). All three models exhibit neutrality bias on abortion stance, underestimating opposition prevalence by 24 to 40 percentage points and inflating the neutral label. None of the four prompting interventions we test (neutral, safety framing, depersonalized, chain-of-thought) corrects these failures across models; safety framing can worsen stance distortion. Strikingly, Zephyr's hate-speech prevalence estimate matches the gold rate exactly while its class-conditional errors are large in both directions, an accidental cancellation that misleads aggregate validation. We translate these patterns into a three-part taxonomy with diagnostic FBR/FAR signatures and a lightweight gold-sample validation protocol. The headline for trustworthy CSS: a model that looks calibrated on aggregate metrics can still flip the substantive empirical conclusion a researcher would report.

06.
arXiv (math.PR) 2026-06-17

A Tanaka-Type Formula for Compact Sets and Equilibrium Measures of L\'{e}vy Processes

arXiv:2606.17472v1 Announce Type: new Abstract: Tanaka's formula is a classical identity for Brownian motion, and Tsukada (2018) extended it to L\'{e}vy processes not necessarily symmetric. From a potential-theoretic point of view, this formula shows that the invariant function for the process killed upon hitting a singleton can be decomposed into the sum of a martingale part and a local time. In this paper, we generalize this singleton setting and derive a Tanaka-type formula for a compact set $B$. To this end, we introduce the equilibrium measure, defined as the rescaled limit of the $q$-capacity measures, and show that the invariant function for the process killed upon hitting $B$ can be represented as the integral, with respect to the equilibrium measure, of the invariant functions associated with processes killed upon hitting singletons, up to an additive constant called the Robin constant. Moreover, when $B$ is an interval, we obtain explicit representations of the equilibrium measure, the Robin constant, and the martingale part for recurrent stable processes as well as for recurrent spectrally negative L\'{e}vy processes. Finally, we discuss how an analogous Tanaka-type formula can also be established for transient L\'{e}vy processes.

07.
arXiv (CS.CL) 2026-06-16

RoTRAG: Rule of Thumb Reasoning for Conversation Harm Detection with Retrieval-Augmented Generation

Detecting harmful content in multi turn dialogue requires reasoning over the full conversational context rather than isolated utterances. However, most existing methods rely mainly on models internal parametric knowledge, without explicit grounding in external normative principles. This often leads to inconsistent judgments in socially nuanced contexts, limited interpretability, and redundant reasoning across turns. To address this, we propose RoTRAG, a retrieval augmented framework that incorporates concise human written moral norms, called Rules of Thumb (RoTs), into LLM based harm assessment. For each turn, RoTRAG retrieves relevant RoTs from an external corpus and uses them as explicit normative evidence for turn level reasoning and final severity classification. To improve efficiency, we further introduce a lightweight binary routing classifier that decides whether a new turn requires retrieval grounded reasoning or can reuse existing context. Experiments on ProsocialDialog and Safety Reasoning Multi Turn Dialogue show that RoTRAG consistently improves both harm classification and severity estimation over competitive baselines, with an average relative gain of around 40% in F1 across benchmark datasets and an average relative reduction of 8.4% in distributional error, while reducing redundant computation without sacrificing performance.

08.
arXiv (CS.AI) 2026-06-16

Distilling Drifting Transformers with Representation Autoencoders

arXiv:2606.15553v1 Announce Type: cross Abstract: Representation Autoencoders (RAEs) have improved diffusion and flow models by semantically richer latent space owing to the strongly label-wise clustered DINO features in the pretrained encoders. Yet in the distillation stage, the severe anisotropy and large curvatures caused by the rich semantic representations would hinder the convergence and performance, making the trajectory-based distillation unstable. In this work, we argue that the RAE latent space is compatible with distillation via the newly proposed Drifting Models. We first quantitatively study the curvatures and isotropy statistics across different autoencoders, and theoretically reveal that Drifting Model itself is highly likely to fail on extremely scattered spaces like reconstruction-based VAEs. These motivate us to apply the drifting paradigm directly to representation autoencoders. Our proposed method, Drift-RAE, distills pretrained flow models in RAE latent spaces using Drifting, together with insightful modifications that improve training stability by thereotically aligning drifting fields with other frameworks. Regarding the experimental evidences, we achieve 1.77 FID on ImageNet 256 dataset using only 10k distillation steps, surpassing state-of-the-art RAE distillation methods and appearing comparative with the original Drifting Model without requiring an auxiliary MAE feature extractor. The code will be made publicly available.

09.
PLOS Computational Biology 2026-06-17

Machine learning-driven identification of virulence determinants in <i>Borrelia burgdorferi</i> associated with human dissemination

by Hoa Thanh Nguyen, Catherine A. Brissette Lyme disease, the most common tick-borne infectious disease in the United States, presents with highly variable clinical outcomes, ranging from localized erythema migrans to severe disseminated complications affecting the heart, joints, and nervous system. The bacterial determinants underlying this phenotypic variation remain largely unknown, limiting our ability to predict disease progression and optimize treatment strategies. Here, we applied machine learning (ML) approaches to identify specific amino acid residues within surface-exposed virulence factors that predict human dissemination phenotypes. Utilizing the published whole genome sequences from 299 clinical Borrelia burgdorferi isolates collected from the United States and Slovenia over a 30-year period (1992–2021), we extracted and characterized translated amino acid sequences (variants) of seven known virulence factors (BB_0406, BBK32, DbpA, OspA, OspC, P66, and RevA). Protein variants were classified based on their association with disseminated versus localized infections using clinical metadata. Cramér’s V analysis revealed possible strong associations between dissemination phenotypes and five adhesins: BBK32, DbpA, OspC, P66, and RevA. We developed ML models using five algorithms with multiple feature selection strategies, achieving robust predictive performance for DbpA, OspC, and RevA variants (all performance metrics > 0.7). Feature importance analysis identified 57, 29, and 42 key predictive residues for DbpA, OspC, and RevA, respectively. Notably, B-cell epitope prediction revealed significant enrichment of ML-identified residues within predicted epitope regions for OspC (11 overlapping residues, OR = 3.57, p = 0.006) and RevA (12 overlapping residues, OR = 2.37, p = 0.048), suggesting these residues may influence immune recognition and bacterial persistence. This study establishes the first computational framework linking Borrelia protein sequence variants to clinical dissemination phenotypes, providing molecular insights into Lyme disease pathogenesis that may inform the development of improved diagnostics and therapeutic targets.

10.
bioRxiv (Bioinfo) 2026-06-19

Simulation-based Bayesian deep learning enables uncertainty-aware tumor fraction estimation in cell-free DNA

Background: Estimating tumor fraction from whole-genome cell-free DNA sequencing is critical for liquid biopsy, but is hampered by weak signals and baseline noise at low tumor fractions. Existing computational methods often require matched controls or large labeled datasets for training and lack uncertainty quantification. To address these gaps, we developed purNPE, a Bayesian deep-learning framework trained without labeled cancer cell-free DNA samples. Specifically, purNPE leverages a two-part generative model: one component simulates diverse tumor copy-number profiles based on evolutionary genealogies, while a second, data-driven component learns and replicates realistic sequencing background patterns from cancer-free cell-free DNA. By training a Neural Posterior Estimator on synthetic tumor profiles augmented with learned noise, purNPE performs amortized inference in milliseconds without needing a reference sample set at inference. Results: In a real-world pan-cancer cohort, purNPE achieved comparable performance with existing methods against orthogonal mutant-allele-fraction validation (MAE = 0.066). In silico and semi-synthetic experiments suggested analytical sensitivity around 1% tumor fraction under the evaluated conditions and showed strong classification accuracy in low tumor fractions (AUC = 0.98 for TF [&le;] 3% versus controls). Conclusions: This work provides a framework for using simulation-based inference to derive calibrated, uncertainty-aware TF estimates, offering a potential alternative to traditional data-dependent methods.

11.
arXiv (CS.LG) 2026-06-11

CP4SBI: Local Conformal Calibration of Credible Sets in Simulation-Based Inference

arXiv:2508.17077v3 Announce Type: replace-cross Abstract: Current experimental scientists have been increasingly relying on simulation-based inference (SBI) to invert complex non-linear models with intractable likelihoods. However, posterior approximations obtained with SBI are often miscalibrated, causing credible regions to undercover true parameters. We develop $\texttt{CP4SBI}$, a model-agnostic conformal calibration framework that constructs credible sets with local Bayesian coverage. Our two proposed variants, namely local calibration via regression trees and CDF-based calibration, enable finite-sample local coverage guarantees for any scoring function, including HPD, symmetric, and quantile-based regions. Experiments on widely used SBI benchmarks demonstrate that our approach improves the quality of uncertainty quantification for neural posterior estimators using both normalizing flows and score-diffusion modeling.

12.
arXiv (CS.LG) 2026-06-16

Learning Hybrid Biophysical Neuron Models with Neural ODEs

arXiv:2606.16693v1 Announce Type: cross Abstract: Biophysical neuron models link measurements of neural activity to underlying cellular mechanisms. Yet, a central challenge is that the kinetics of many ion channels are poorly characterized, and practical simplifications – omitting channels or reducing morphological detail – introduce systematic gaps between model and biology. Bridging these gaps requires approaches that can flexibly discover unmodeled dynamics while preserving mechanistic interpretability. Here, we introduce a hybrid modeling framework that embeds neural ordinary differential equations into conductance-based biophysical models to capture unknown currents or mis-specified channel kinetics. By parameterizing the neural ODE in terms of voltage-dependent steady-state and time-constant functions, we recover interpretable gating dynamics directly from voltage recordings without assuming a functional form. We show that the hybrid model fits the gating kinetics of 2400 ion channel models and recovers unknown gating dynamics from single current-clamp recordings, generalizing to out-of-distribution stimulus regimes under realistic inputs and parameter misspecification. We also use our method to reduce a multicompartment model of a cortical neuron into a single-compartment hybrid model with a learned axial current, yielding up to an order of magnitude lower computational cost. Together, our results establish a plug-and-play framework for selectively replacing unknown components of conductance-based models with neural ODEs while preserving their mechanistic structure.

13.
medRxiv (Medicine) 2026-06-15

Beyond the Apnea-Hypopnea Index: Physiological and Demographic Predictors of Excessive Daytime Sleepiness in Obstructive Sleep Apnea

Excessive daytime sleepiness (EDS) is a common but inconsistently predicted symptom of obstructive sleep apnea (OSA). OSA is typically diagnosed with polysomnography (PSG), and the current standard for severity assessment is the apnea-hypopnea index (AHI). AHI has many limitations, including its inability to explain physiological mechanisms or reflect variability in patient symptoms, such as EDS. This retrospective study aims to find physiological and demographic parameters that better predict EDS in patients with OSA and to evaluate whether these parameters outperform AHI using PSG data from the Mount Sinai Integrative Sleep Center. Clinical variables used to predict EDS included arousal index (AI), average oxygen desaturation during sleep, average heart rate during sleep, and AHI, along with demographic variables including age, sex, and BMI. Hypothesis tests, logistic regression models, and decision tree classifier models were performed on the data to discriminate sleepy from nonsleepy patients as determined by an Epworth Sleepiness Scale (ESS) score [&ge;] 10. AI and oxygen desaturation were found to be the most predictive physiological variables, and sex and BMI were found to be the most predictive demographic variables. The final decision tree model with these four variables outperformed the AHI in predicting EDS. These findings suggest that daytime sleepiness in OSA can be better explained by measures of apnea burden, oxygenation impairment, and patient demographics than by AHI alone, although these remain only modestly predictive. Future studies should focus on investigating more comprehensive physiological markers, multi-night sleep data, and more objective assessments of sleepiness.

14.
arXiv (CS.AI) 2026-06-12

Rarity-Gated Context Conditioning for Offline Imitation Learning-Based Maritime Anomaly Detection

arXiv:2606.13311v1 Announce Type: cross Abstract: Contextual anomaly detection aims to identify abnormal behavior conditional on context variables, but practical deployments often face highly imbalanced context distributions where rare regimes can be critical information. Under such frequency bias, context-conditioned models can produce unstable decisions and excessive false alarms in rare contexts. We propose Rarity-Gated Feature-wise Linear Modulation (RGFiLM), a rarity-aware conditioning module that combines feature-wise modulation (i.e., context-conditioned scaling and shifting of hidden features) with a gate controlled by a data-driven rarity score. The rarity score is estimated from the empirical distribution of context variables and regulates how strongly context modulates intermediate representations: the gate becomes more decisive under rare contexts while remaining conservative under frequent contexts. We evaluate RGFiLM on maritime trajectory anomaly detection using AIS motion sequences with ERA5 environmental context in an environment-sensitive detour scenario. When instantiated in a sequential anomaly scoring pipeline, RGFiLM achieves the best mean F1–False Positive Rate (FPR) trade-off among the compared context-agnostic and context-conditioned methods. These results suggest that explicitly accounting for context rarity is an effective approach for reducing false alarms in context-sensitive anomaly detection.

15.
bioRxiv (Bioinfo) 2026-06-11

DivQuant: Estimation of Species Richness and Entropy from Small Samples

Estimating diversity properties of discrete distributions from a small observed sample is a fundamental problem in algorithmic statistics that has applications in many fields, in particular bioinformatics, but also in ecology or linguistics. The two most common diversity measures are the number of distinct elements in a multiset, also referred to as species richness in ecology or alpha diversity in microbial analysis, and the Shannon entropy, also referred to as evenness. Estimating these properties from a small sample is particularly challenging for distributions with many rare elements. Thus, many estimators have been proposed in the past that, in practice, work well for different types of distributions. We present DivQuant, an optimization-based, extrapolating richness and entropy estimator with three contributions. First, we formulate the upsampling problem as a convex quadratic program with a Neyman {chi}2 objective. Unlike the linear program of its predecessor RichnEst, DivQuant admits confidence intervals via {chi}2 test inversion that are empirically well-calibrated. Second, we replace RichnEst's fixed-threshold fingerprint truncation with the rare/abundant fingerprint split of Valiant and Valiant, which strongly reduces problem size and preserves enough degrees of freedom for the confidence-interval program to remain valid and feasible. Third, we plug the optimal population fingerprint returned by the program into Shannon's entropy formula to obtain an entropy estimate. DivQuant attains close-to-nominal 95% confidence intervals in essentially all tested regimes, including six simulated distribution families, Tara Oceans microbiome data, and 10X Genomics scRNA-seq data, while competing state-of-the-art methods (RichnEst, iNext, PreSeq) miss the true richness in up to 80% of instances, well above the nominal 5%. In addition, DivQuant outperforms classical asymptotic entropy estimators (Miller-Madow, CAE) and the extrapolating iNext estimator. Running times remain competitive, with DivQuant typically completing in seconds. DivQuant is available as a command-line tool at https://gitlab.com/rahmannlab/divquant.

16.
PLOS Computational Biology 2026-06-17

Combining machine learning and iterative experiments to keep pace with emerging viral variants of concern

by Thomas Sheffield, Ryan C. Bruneau, Stephen Won, Kenneth L. Sale, Brooke Harmon, Le Thanh Mai Pham Modeling and predicting viral mutations before they emerge plays a crucial role in pandemic preparedness, enabling the early identification of emerging variants of concern (VOCs) and guiding timely updates to vaccines, diagnostic tests, and therapeutic strategies. However, existing machine learning models and large-scale experiments lose their predictive power as viral variants evolve further from the original strains in sequence space. Here, we present a scalable framework that integrates random forest and neural network machine learning models with targeted high-throughput experimentation to anticipate and evaluate emerging SARS-CoV-2 receptor-binding domain (RBD) variants. Using public datasets, we trained predictive models for binding to human Angiotensin-converting enzyme 2 (ACE2), RBD expression, and antibody escape, and refined these models through iterative integration of experimental data focused on over 200 variants derived from wild-type (WT) and Omicron strains. Through an indirect transfer learning approach, our machine learning models achieved high accuracy having correlation coefficients of up to 0.79 for antibody binding. The models were also generalizable across diverse antibody types including heavy-chain-only antibodies (HCAbs) by encoding complementarity-determining regions (CDRs) as input features. This dynamic approach enables rapid assessment of emerging variants, facilities prioritization of the therapeutic strategies, and supports a proactive, data-driven response to evolving viral threats.

17.
arXiv (quant-ph) 2026-06-12

Quantum Otto engine powered by an anisotropic Heisenberg XYZ model under independent local magnetic fields

arXiv:2606.12877v1 Announce Type: new Abstract: We study a quantum Otto heat engine whose working substance is an anisotropic two-qubit Heisenberg XYZ model. Independent local magnetic fields are used to control each spin individually. The influence of the longitudinal coupling, anisotropy, transverse coupling, and local fields on the net work output and efficiency is systematically examined. Reducing the longitudinal coupling is found to markedly improve both the maximum work and the peak efficiency. The engine performance reaches an optimum at a particular value of the anisotropy parameter. A local work analysis clarifies how work is produced during the cycle. Because of the asymmetric local fields and the intrinsic spin-spin interaction, the two qubits play markedly different thermodynamic roles; the interaction term itself contributes crucially to the total work. We further analyze the variation of quantum entanglement, quantified by concurrence, along the cycle. The results indicate that a pronounced change in entanglement between the hot and cold isomagnetic strokes is closely correlated with the efficiency enhancement. This work offers new insight into the operating principles and control of quantum Otto heat engines.

18.
arXiv (CS.LG) 2026-06-12

When to Align, When to Predict: A Phase Diagram for Multimodal Learning

arXiv:2606.11190v2 Announce Type: replace Abstract: Cross-modal alignment (CA) and cross-modal prediction (CP) are the dominant paradigms for multimodal representation learning, yet there is no systematic understanding of when each succeeds, when each fails, and when cross-modal training helps at all – a gap that leaves practitioners, especially in scientific domains like biomedicine or astrophysics, with heterogeneous instruments and multiple levels of organization and measurement, unable to diagnose why standard methods underperform the best single modality. We develop a unified linear framework that addresses both questions. Under a spiked signal-plus-noise model with structured cross-modal nuisance correlation, we derive separation ratios for both objectives that expose complementary failure modes: alignment whitens each modality and fails when nuisance is strongly correlated across views; prediction encodes whatever is cross-predictable through a one-sided whitening, with recovery governed by source-modality quality. The resulting phase diagram partitions multimodal problems into four regimes: Both, CA only, CP only, and Neither. We present a data-driven procedure to locate real-world datasets in this diagram using a small labeled subsample, identifying the preferred objective and prediction direction before any cross-modal training. Experiments on synthetic data, stereo-vision benchmarks, image-caption pairs, and real astrophysical data validate the predictions in the nonlinear regime, including the Neither regime where cross-modal training is actively harmful. Our framework lets practitioners diagnose their multimodal problem and choose the right objective before committing to training. Code to reproduce the results is available at https://github.com/IlayMalinyak/mm_align_vs_pred.

19.
arXiv (CS.LG) 2026-06-12

How Useful is Causal Invariance for Domain Adaptation in Finite-Sample Settings?

arXiv:2606.12680v1 Announce Type: new Abstract: Machine learning models often degrade when they are deployed on a target distribution that differs from the source distributions they were trained on. Recent work in causality-based domain generalization has shown how shared causal structure between domains can induce invariant predictors, e.g., models on a subset of features which have stable risk across structured domain shifts. However, the extent to which such population-level causal invariances can lead to gains in finite-sample settings remains underexplored. In particular, in practice we often have access to a few labeled target samples, a setting called supervised domain adaptation (sDA). In this paper, we explore when (full or partial) causal knowledge can provably improve supervised domain adaptation. As a first step, we study linear regression, where full or partial causal knowledge specifies a collection of invariant or possibly invariant feature subsets, each yielding a source-trained candidate predictor. We derive matching upper and lower bounds showing that finite-sample gains are governed by the target-risk margins separating the candidates, together with the finite-source estimation error. When these margins are sufficiently large relative to $n_Q$, an adaptive aggregation procedure can match the best candidate predictor while avoiding negative transfer relative to target-only learning. On the other hand, when the margins are too small, no algorithm can reliably exploit the candidate collection to obtain faster finite-sample rates. We further connect these margins to structural shift magnitude in linear SCMs and validate the theory on real-world causal benchmarks.

20.
arXiv (CS.AI) 2026-06-19

Latent Confounded Causal Discovery via Lie Bracket Geometry

arXiv:2606.19610v1 Announce Type: cross Abstract: Recent work on Kan-Do-Calculus (KDC) has established that the boundary between passive observation and active intervention in causal inference is a category-theoretic bi-adjunction, with interventions modeled by left Kan extensions and conditioning by right Kan extensions. This paper introduces two causal discovery algorithms under latent confounding, building on the information-geometric and categorical consequences of KDC. In smooth statistical settings, Radon-Nikodym derivatives between observational and interventional measures induce local causal vector fields; failures of these fields to close under Lie brackets become computable Frobenius residuals, which we interpret as witnesses of failed visible integrability and possible latent or unmodeled structure. Our first algorithm, BRIDGE (Bracket Residuals for Interventional Discovery and Geometric Estimation), combines an interventional density or Radon-Nikodym-ratio engine with a geometric screen that proposes a high-recall family of admissible arrows, identifies non-closing visible pairs as latent-obstruction candidates, and passes the reduced family to downstream score-based or differentiable discovery routines. The second algorithmic contribution, Spectral Kan-Do Flow Matching (SKFM), learns amortized intervention fields and factors latent curvature spectrally, exposing the direct Lie-space endpoint toward which BRIDGE points. A detailed set of experiments show that both algorithms are capable of discovering causal models with latent confounders while collapsing the super-exponential space of possible DAGs by many orders of magnitude. This paper introduces a new paradigm in causal discovery, where latent structure is inferred directly from the geometry of intervention-induced flows.

21.
arXiv (CS.CV) 2026-06-12

An Improved Generative Adversarial Network for Micro-Resistivity Imaging Logging Restoration

An improved GAN-based imaging logging image restoration method is presented in this paper for solving the problem of partially missing micro-resistivity imaging logging images. The method uses FCN as the generative network infrastructure and adds a depth-separable convolutional residual block to learn and retain more effective pixel and semantic information; an Inception module is added to increase the multi-scale perceptual field of the network and reduce the number of parameters in the network; and a multi-scale feature extraction module and a spatial attention residual block are added to combine the channel attention. The multi-scale module adds a multi-scale feature extraction module and a spatial attention residual block, which combine the channel attention mechanism and the residual block to achieve multi-scale feature extraction. The global discriminative network and the local discriminative network are designed to gradually improve the content and semantic structure coherence between the restored parts and the whole image by playing off each other and the generative network. According to the experimental results, the average structural similarity measure of the five sets of imaged logging images with different sizes of missing regions in the test set is 0.903, which is an improvement of about 0.3 compared with other similar methods. It is shown that the method in this study can be used for the restoration of micro-resistivity imaging log images with good improvement in semantic structural coherence and texture details, thus providing a new deep learning method to ensure the smooth advancement of the subsequent interpretation of micro-resistivity imaging log images.

22.
arXiv (CS.AI) 2026-06-18

Vibe Coding Ate My Homework: An evaluation of AI approaches to greenfield software engineering and programming

arXiv:2606.18293v1 Announce Type: cross Abstract: Thanks to rapid developments in generative AI, we are in the midst of a paradigm shift that may change how we interact with computers forever. We have observed a growth in the use of natural language prompts to build applications and coding infrastructures without underlying knowledge of the field, and this practice has been dubbed `vibe coding.' It arguably represents what the field of programming has been building towards since the beginning, with every higher level of abstraction that is conceived. Vibe coding promises to be the endpoint for the meta of high-level programming as far as method of input is concerned: eliminating a human's use of code syntax entirely in favour of programming in their mother tongue. This paper aims to evaluate the viability of vibe coding for greenfield software engineering tasks, as well as analyse the benchmarks that have been used to measure its software engineering prowess. To this end, we have developed an evaluation suite for analysing an LLM's proficiency in carrying out simple, isolated greenfield programming tasks in Python to provide scoped insight on the matter.

23.
arXiv (CS.AI) 2026-06-12

What Type of Inference is Active Inference?

arXiv:2606.04935v2 Announce Type: replace Abstract: Active inference casts decision-making as inference, with the Expected Free Energy (EFE) unifying goal-directed and information-seeking behavior. Recent work showed that EFE minimization can be written as Variational Free Energy (VFE) minimization on a generative model augmented with epistemic priors. We prove that the VFE of the augmented model can be rewritten as the VFE of the predictive model plus explicit entropy-correction terms, making the EFE contribution transparent. We then show that proper EFE-based planning requires combining these epistemic corrections with a planning correction that turns marginal inference into policy optimization, yielding a full variational characterization of EFE-based planning. This clarifies which corrections are needed for cross-entropy planning and for full EFE-based planning. The same entropy-corrected formulation leads to a detailed message-passing scheme for EFE-based planning together with simpler ablations. Experiments on three grid-world environments show that full EFE-based planning outperforms ablations that omit either the planning correction or the epistemic corrections.

24.
arXiv (CS.LG) 2026-06-16

Semi-Supervised Speech Confidence Detection using Pseudo-Labelling and Whisper Embeddings

arXiv:2606.16505v1 Announce Type: cross Abstract: Understanding speaker confidence is crucial in educational settings, as it can enhance personalised feedback and improve learning outcomes. This study introduces a novel framework for detecting speaker confidence by integrating human-engineered features with embeddings from the Whisper encoder. To address data limitations, a pseudo-labelling technique is employed to expand the labelled dataset, allowing the model to learn from both human-annotated and model-generated labels. The framework combines traditional speech features including pitch, volume, rate of speech, and the presence of disfluencies and stress, with Whisper embeddings, and uses a co-attention mechanism to fuse these representations and achieve an overall accuracy of 75%. This study contributes to advancing speech analysis, enabling applications that support personalised learning and speaking skill development.

25.
arXiv (CS.CV) 2026-06-19

HypOProto: Hyperbolic Ordinal Prototypes for Left Ventricular Filling Pressure Classification

Echocardiography (echo) is a widely used imaging modality for assessing cardiac function, with Left Ventricular Filling Pressure (LVFP) serving as a critical physiological marker for conditions such as heart failure. Standard LVFP classification into normal vs elevated categories relies on the Doppler-derived $E/e'$ ratio, which is operator-dependent and often unavailable in resource-limited settings, motivating methods that infer LVFP directly from B-mode echo. Existing deep learning approaches achieve high performance but remain largely black-box, limiting clinical interpretability. We propose HypOProto, a hyperbolic, ordinal prototype-based framework for interpretable LVFP classification using a frozen, explainable foundation model backbone. HypOProto arranges prototypes along the physiological $E/e'$ scale, placing borderline cases near the hyperboloid root where small angular differences separate similar cases, while normal and elevated cases occupy outward positions reflecting increasing diagnostic certainty. This hyperbolic geometry encodes clinically meaningful ordinal relationships and improves interpretability. We also introduce a novel Hyperbolic Prototype Angular Separation (HyperPAS) loss, enforcing inter-class prototype separation in hyperbolic space. HypOProto achieves SOTA performance while maintaining transparency, and highlights clinically relevant regions in visualizations. This work represents the first prototype-based framework for LVFP classification in echo. Our code can be found at https://github.com/DeepRCL/HypOProto.