Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
arXiv (CS.CL) 2026-06-18

ScholarSum: Student-Teacher Abstractive Summarization via Knowledge Graph Reasoning and Reflective Refinement

Abstractive summarization plays a crucial role in enabling efficient understanding of scientific literature, yet it inherently demands both linguistic fluency and factual faithfulness. Existing approaches often fail to reconcile these two requirements. Extractive methods rely on rigid sentence splicing that disrupts macro-level logical coherence, while large language model (LLM)-based generative approaches, despite mastering linguistic fluency, exhibit limited factual consistency. In this work, we propose ScholarSum, a hierarchical reflective graph-based framework that emulates a student-teacher writing process for fluent and faithful scientific summarization. ScholarSum first organizes the document into a hierarchical knowledge graph by segmenting it into semantically coherent units, whose multi-layered community structure captures global logic and macro-level themes. Guided by this global structure, the student generates an initial draft, which is subsequently refined through fine-grained evidence retrieval. To ensure factual consistency, a teacher-like reviewer then iteratively examines the draft, identifies unsupported content, and prompts targeted re-retrieval and rewriting until the summary meets rigorous quality standards. Extensive experiments demonstrate that ScholarSum significantly outperforms previous baselines in terms of both completeness and faithfulness. Our code is available at https://github.com/Xiaoyu-Tao/ScholarSum.

02.
arXiv (CS.AI) 2026-06-17

FlowRAG: Synergizing Explicit Reasoning via Frequency-Aware Multi-Granularity Graph Flow

arXiv:2606.17856v1 Announce Type: new Abstract: Graph-based retrieval-augmented generation (GraphRAG) is effective for knowledge-intensive and multi-hop query tasks; however, many existing methods primarily seed entity-based graphs and rely on implicit semantic relevance propagation. This often (i) under-retrieves when user queries are abstract and semantically sparse at the entity level, and (ii) suffers from brittle multi-hop reasoning, where noisy activations can derail entity-to-entity transitions and corrupt the inferred relation chain, yielding unreliable conclusions. To this end, we propose \texttt{FlowRAG}, a semantic-aware retrieval framework that improves both semantic recall and explicit reasoning. Specifically, \texttt{FlowRAG} constructs a quad-level heterogeneous graph over passages, summaries, sentences, and entities, where summary nodes serve as a coarse semantic hub. At retrieval time, a dual-granularity activation module combines summary–query alignment with sentence-level matching to activate relevant entities under paraphrase and abstraction robustly. We then introduce a frequency-aware weighted flow module that routes relevance through entity–passage links weighted by within-passage term frequency, pruning noisy connections and extracting high-confidence reasoning paths as an explicit logic skeleton for generation. Extensive experiments show that \texttt{FlowRAG} obtains state-of-the-art performance on complex reasoning benchmarks.

03.
medRxiv (Medicine) 2026-06-18

Intra-arterial recombinant human TNK tissue-type plasminogen activator (rhTNK-tPA) thrombolysis for acute medium vessel occlusion (MeVO-TNK): Study rationale and design

Background The optimal management of acute ischemic stroke caused by medium vessel occlusion (MeVO) remains uncertain. Recent randomized trials have failed to demonstrate a clear benefit of endovascular therapy in this population, whereas intra-arterial thrombolysis (IAT) has emerged as a biologically plausible alternative. However, prospective evidence supporting IAT in MeVO is lacking, and the optimal dosing strategy for stand-alone IAT remains undefined. Aim To preliminarily evaluate the efficacy and safety of intra-arterial tenecteplase (IA-TNK) plus standard medical therapy (SMT) compared with SMT alone in patients with acute MeVO stroke, and to explore a stepwise IA-TNK dosing strategy. Design The MeVO-TNK trial is a multicenter, prospective, randomized, open-label, blinded-endpoint (PROBE), exploratory phase II study. A total of 60 participants with imaging-confirmed MeVO will be randomized 1:1 to receive either IA-TNK plus SMT or SMT alone. Participants presenting beyond 6 hours from symptom onset must demonstrate salvageable penumbral tissue on advanced imaging. Those assigned to the intervention group will receive up to two intra-arterial boluses of tenecteplase (0.0625 mg/kg per bolus), with the second bolus administered based on angiographic assessment of reperfusion and safety. Outcomes The primary efficacy outcome is final infarct volume measured at 72{+/-}24 hours after randomization. Secondary efficacy outcomes include the proportions of patients achieving modified Rankin Scale (mRS) scores of 0-1, 0-2 and 0-3 at 90 days, a shift analysis of the mRS distribution at 90 days, early neurological deterioration, and National Institutes of Health Stroke Scale score at 7 days or discharge. The primary safety outcome is symptomatic intracranial hemorrhage within 24 hours. Conclusions This trial will provide preliminary evidence on the biological efficacy, reperfusion potential and safety of stand-alone IA-TNK for acute MeVO stroke, helping to address an important evidence gap and inform the design of future confirmatory studies.

04.
arXiv (CS.AI) 2026-06-19

Execution-bound advisory automation for agentic AI: a reproducible AIBOM-driven CSAF-VEX framework

arXiv:2606.19390v1 Announce Type: cross Abstract: A protocol driven framework is presented that binds SBOM and AIBOM artefacts to deterministic environment capture and structured runtime telemetry. Exploitability is computed from declared artefacts, observed activation conditions, and enforced execution policies. CSAF VEX advisories are generated from combined static and runtime evidence, cryptographically signed, and validated through deterministic replay. Evaluation uses approximately 10000 component entries across synthetic Agentic AI workloads 50 to 5000 components, incorporating OSV, GitHub Advisory, KEV, and EPSS datasets.

05.
arXiv (CS.CL) 2026-06-16

Hidden Ghost Hand: Unveiling Backdoor Vulnerabilities in MLLM-Powered Mobile GUI Agents

Graphical user interface (GUI) agents powered by multimodal large language models (MLLMs) have shown greater promise for human-interaction. However, due to the high fine-tuning cost, users often rely on open-source GUI agents or APIs offered by AI providers, which introduces a critical but underexplored supply chain threat: backdoor attacks. In this work, we first unveil that MLLM-powered GUI agents naturally expose multiple interaction-level triggers, such as historical steps, environment states, and task progress. Based on this observation, we introduce AgentGhost, an effective and stealthy framework for red-teaming backdoor attacks. Specifically, we first construct composite triggers by combining goal and interaction levels, allowing GUI agents to unintentionally activate backdoors while ensuring task utility. Then, we formulate backdoor injection as a Min-Max optimization problem that uses supervised contrastive learning to maximize the feature difference across sample classes at the representation space, improving flexibility of the backdoor. Meanwhile, it adopts supervised fine-tuning to minimize the discrepancy between backdoor and clean behavior generation, enhancing effectiveness and utility. Extensive evaluations of various agent models in two established mobile benchmarks show that AgentGhost is effective and generic, with attack accuracy that reaches 99.7\% on three attack objectives, and shows stealthiness with only 1\% utility degradation. Furthermore, we tailor a defense method against AgentGhost that reduces the attack accuracy to 22.1\%. Our code is available at \texttt{anonymous}.

06.
arXiv (CS.LG) 2026-06-11

Persistent Homology as a Theory of Emergent Structure

Authors:

arXiv:2507.03065v2 Announce Type: replace Abstract: Why do some macroscopic structures remain identifiable even though their microscopic constituents continually change? Vortices persist while fluid parcels turn over, neural memories persist while spikes and synapses fluctuate, and institutions persist while individuals enter and leave. We propose a scale-relative answer: an emergent property is a persistent nontrivial homology class $[z]\in H_p=\ker\partial_p/\im\partial_{p+1}$, a macro-feature that is closed but not exact across a filtration of descriptions. This identification turns emergence into a measurement problem. Persistent bars detect stable macro-features, and we introduce a contractive-similarity (CS) graph operator to supply scaffold spectral gaps that predict robustness. Hodge decomposition separates harmonic macro-scaffold from exact and co-exact micro-flow; and functorial condensation explains when one level's emergent class becomes a unit for the next. The resulting scaffold-flow framework expresses six familiar signatures of emergence (i.e., inevitability, coherence, irreducibility, complementarity, robustness, and hierarchy) within one mathematical language. It also yields falsifiable predictions across atmospheric, neural, and social systems: genuine emergent structures should persist across filtrations, remain spectrally stable, respond disproportionately to harmonic interventions, and require timescale separation for hierarchical autonomy.

07.
arXiv (CS.CV) 2026-06-17

DVD: Discrete Voxel Diffusion for 3D Generation and Editing

We introduce Discrete Voxel Diffusion (DVD), a discrete diffusion framework to generate, assess, and edit sparse voxels for SLat (Structured LATent) based 3D generative pipelines. Although discrete diffusion has not generally displaced continuous diffusion in image-like generation, we show that it can be an effective first-stage prior for sparse voxel scaffolds. By treating voxel occupancy as a native discrete variable, DVD avoids continuous-to-discrete thresholding and provides a simple framework for voxel generation, uncertainty estimation, and editing. Beyond quality gains, DVD provides more interpretable generation dynamics through explicit categorical modeling. Furthermore, we leverage the predictive entropy as a robust uncertainty metric to identify ambiguous voxel regions and complicated samples, facilitating tasks such as data filtering and quality assessment. Finally, we propose a lightweight fine-tuning strategy using block-structured perturbation patterns. This approach empowers the model to inpaint and edit voxels within a single sampling round, requiring negligible auxiliary computation and no additional model evaluations. Code is available at https://github.com/TeCai/DVD.

08.
arXiv (quant-ph) 2026-06-15

Universal Speed Limit in a Far-from-Equilibrium Bose Gas: Symmetry and Dynamical Decoherence

arXiv:2605.11895v2 Announce Type: replace-cross Abstract: Predicting universal transport coefficients in far-from-equilibrium quantum systems remains a fundamental challenge. A paradigmatic example is the non-thermal fixed point (NTFP) of isolated Bose gases, where coherence spreads as $\ell^2(t) = C\hbar t/m$ with a universal constant $C$. While the scaling exponent $z=2$ is well established, the amplitude $C$ has remained elusive because the underlying particle cascade $n(k)\sim k^{-4}$ leads to a divergent kinetic energy, threatening the very existence of a constant speed limit. Here we resolve this paradox and present the first analytical, parameter-free prediction of a universal amplitude $C$. A deep interplay between symmetry and dissipation is uncovered. The emergent weak U(1) symmetry at the NTFP enforces a conserved total current, forcing the low-energy phase dynamics to obey a diffusive Langevin equation with noise entering as the divergence of a stochastic current. This structure, combined with dynamical decoherence of high-momentum modes, yields a universal power-law momentum distribution $\tilde{f}(v)\sim(1+v^2)^{-3}$ (with $v=k\ell$) that naturally regularizes the ultraviolet divergence. From this, a parameter-free geometric baseline $C=3$ is obtained, independent of microscopic details. The experimental value $C=3.4(3)$ [Martirosyan et al., Nature 647, 608 (2025)] is then shown to be quantitatively consistent with universal logarithmic corrections arising from a marginally irrelevant coupling at the fixed point. A new paradigm is thus established for predicting transport coefficients in strongly correlated non-equilibrium systems: symmetry constraints determine the low-energy effective theory, dynamical decoherence provides a natural ultraviolet completion, and scaling analysis delivers testable predictions moving beyond scaling exponents to quantitative amplitude prediction.

09.
Nature (Science) 2026-06-24

Daily briefing: Sperm whales have different dialects

Authors:

Whales in different areas of the Mediterranean use varying patterns of clicks and pauses. Plus, a technique to make protein samples one billion times bigger and the science of grief. Whales in different areas of the Mediterranean use varying patterns of clicks and pauses. Plus, a technique to make protein samples one billion times bigger and the science of grief.

10.
arXiv (quant-ph) 2026-06-19

Smooth time-dependent control of dipolar Bose-Einstein condensates

arXiv:2606.20507v1 Announce Type: cross Abstract: We consider protocols for control of dipolar Bose-Einstein condensates where the critical role is played by the long-range anisotropic interatomic magnetic dipole-dipole interaction. The phase diagram of such a condensate has been explored theoretically and experimentally with certain values of the interatomic scattering length corresponding to superfluid and supersolid phases, where supersolidity appears as a modulation in the ground state density. Preparation of this modulated ground state is challenging, since excitations appear as a result of a finite-time evolution required to produce qualitative changes in the wavefunction density. To solve this problem we consider the time-dependent control of a dipolar Bose-Einstein condensate using shortcuts to adiabaticity techniques, concentrating on design of the time-dependent scattering length, a parameter of the system easily tunable by contemporary experiments. The first technique is the variational approach based on the Euler-Lagrange equations for a separable ansatz describing the evolution of the superfluid state. Secondly, we study the transition from superfluid to supersolid using a direct optimization protocol. We discuss the fidelity of the developed protocols in terms of the evolution time.

11.
arXiv (CS.CV) 2026-06-19

MeshPad: Interactive Sketch-Conditioned Artist-Reminiscent Mesh Generation and Editing

We introduce MeshPad, a generative approach that creates 3D meshes from sketch inputs. Building on recent advances in artist-reminiscent triangle mesh generation, our approach addresses the need for interactive mesh creation. To this end, we focus on enabling consistent edits by decomposing editing into 'deletion' of regions of a mesh, followed by 'addition' of new mesh geometry. Both operations are invoked by simple user edits of a sketch image, facilitating an iterative content creation process and enabling the construction of complex 3D meshes. Our approach is based on a triangle sequence-based mesh representation, exploiting a large Transformer model for mesh triangle addition and deletion. In order to perform edits interactively, we introduce a vertex-aligned speculative prediction strategy on top of our additive mesh generator. This speculator predicts multiple output tokens corresponding to a vertex, thus significantly reducing the computational cost of inference and accelerating the editing process, making it possible to execute each editing step in only a few seconds. Comprehensive experiments demonstrate that MeshPad outperforms state-of-the-art sketch-conditioned mesh generation methods, achieving more than 22% mesh quality improvement in Chamfer distance, and being preferred by 90% of participants in perceptual evaluations.

12.
arXiv (CS.CL) 2026-06-12

How reliable are LLMs when it comes to playing dice?

We investigate the probabilistic reasoning capabilities of large language models through a controlled benchmarking study on discrete probability problems. We constructed two datasets, respectively a set of standard exercises and a set of counterintuitive exercises, designed to trigger heuristic reasoning, and evaluated 8 state-of-the-art models, each tested with and without Chain-of-Thought prompting. Models achieve an average accuracy of 0.96 on standard problems but only 0.59 on counterintuitive ones. We further provide empirical evidence of token bias: performance drops by over 20% when canonical formulations are replaced by disguised variants. Embedding misleading suggestions in the prompt reduces performance by up to 34%, with no model proving immune. Taken together, the reported findings suggest that current LLMs are not yet genuine probabilistic reasoners, despite their success in advanced mathematical problems.

13.
arXiv (CS.AI) 2026-06-25

AI Coaching for Accelerating Human Skill Development with Reinforcement Learning

arXiv:2606.25337v1 Announce Type: cross Abstract: AI copilots can substantially boost human performance through shared control, but excessive assistance can induce over-reliance and skill atrophy. This paper studies how an embodied AI agent can act as a coach that accelerates human motor-skill development. We argue that effective coaching requires strategic scaffolding and stepping back that are aligned with the learner's capability, allowing productive failures that drive learning. We formalize the interactive AI coaching process as a non-cooperative dynamic game in which the learner optimizes task performance while the coach targets the learner's independent competence. Building on this formalism, we develop a reinforcement learning framework combining adaptive shared control with probabilistic models of the coach's causal influence on skill evolution, enabling tractable training of coaching policies. A comprehensive user study (N=33) on first-person-view drone racing shows significant gains in human learning outcomes over state-of-the-art AI coaching baselines.

14.
arXiv (CS.AI) 2026-06-12

Evaluation Sovereignty in Metadata-Driven Classification: A Multi-Track Framework for Weakly Supervised Information Systems

arXiv:2606.13436v1 Announce Type: new Abstract: Evaluation in machine learning is typically treated as a neutral measurement process. However, in operational information systems, evaluation outcomes are often conditioned by the processes used to generate labels. This paper does not seek to improve classification performance. Instead, it examines the validity of performance measurement under differing label-authority regimes. This issue is particularly relevant in large-scale metadata-driven systems, where labels are often incomplete, inconsistent, or weakly supervised. We introduce evaluation sovereignty, defined as the degree to which performance metrics are independent of label authority and supervision regime, and propose a multi-track evaluation framework that systematically varies training and evaluation label sources. Using hierarchical multi-label classification on large-scale scientific metadata, we demonstrate that models exhibiting strong performance under operational ("silver") evaluation degrade substantially under independent ("gold") evaluation, particularly for fine-grained classification. For example, Micro-F1 decreases from approximately 0.54 to 0.03. Notably, ranking-based metrics remain above baseline, revealing a divergence between latent model signal and classification validity. These findings suggest that commonly reported performance metrics may reflect alignment with labeling processes rather than true predictive capability. We therefore reconceptualize evaluation validity as a system-level property shaped by label governance and provide a practical methodology for auditing intelligent systems operating under weak supervision.

15.
arXiv (CS.CV) 2026-06-15

FEMOT: Multi-Object Tracking using Frame and Event Cameras

Conventional RGB cameras have been widely used in multi-object tracking due to their ability to capture rich appearance and semantic information. However, their performance is often degraded under complex real-world challenges, such as motion blur, low illumination, and overexposure. Bio-inspired event cameras offer high temporal resolution and high dynamic range, providing complementary cues under extreme scenarios. Nevertheless, RGB-event multi-object tracking remains underexplored due to the lack of large-scale and well-annotated datasets. To address this issue, we propose FEMOT, a large-scale RGB-event multi-object tracking dataset that covers diverse real-world scenarios and 14 challenging attributes. With both RGB and event data as well as high-quality annotations, FEMOT provides a reliable platform for systematically evaluating RGB-event multi-object tracking methods. Based on FEMOT, we retrain and evaluate over ten strong trackers, thereby establishing a comprehensive benchmark for future research. Furthermore, we propose FEMOTR, a multimodal tracking framework that decouples RGB and event features and fuses them in the frequency domain, thereby effectively exploiting their complementary characteristics for robust object localization and identity association. Extensive experiments on FEMOT and DSEC-MOT datasets demonstrate the effectiveness of the proposed method. The source code and benchmark dataset have been released on https://github.com/Event-AHU/FEMOT.

16.
arXiv (CS.AI) 2026-06-11

An XAI View on Explainable ASP: Methods, Systems, and Perspectives

arXiv:2601.14764v2 Announce Type: replace Abstract: Answer Set Programming (ASP) is a popular declarative reasoning and problem solving approach in symbolic AI. Its rule-based formalism makes it inherently attractive for explainable and interpretive reasoning, which is gaining importance with the surge of Explainable AI (XAI). A number of explanation approaches and tools for ASP have been developed, which often tackle specific explanatory settings and may not cover all scenarios that ASP users encounter. In this survey, we provide, guided by an XAI perspective, an overview of types of ASP explanations in connection with user questions for explanation, and describe their coverage by current theory and tools. Furthermore, we pinpoint gaps in existing ASP explanations approaches and identify research directions for future work.

17.
arXiv (CS.AI) 2026-06-16

Multi-Sensor Fusion for UAV Classification Based on Feature Maps of Image and Radar Data

arXiv:2410.16089v2 Announce Type: replace Abstract: The unique cost, flexibility, speed, and efficiency of modern UAVs make them an attractive choice in many applications in contemporary society. This, however, causes an ever-increasing number of reported malicious or accidental incidents, rendering the need for the development of UAV detection and classification mechanisms essential. We propose a methodology for developing a system that fuses already processed multi-sensor data into a new Deep Neural Network to increase its classification accuracy towards UAV detection. The DNN model fuses high-level features extracted from individual object detection and classification models associated with thermal, optronic, and radar data. Additionally, emphasis is given to the model's Convolutional Neural Network (CNN) based architecture that combines the features of the three sensor modalities by stacking the extracted image features of the thermal and optronic sensor achieving higher classification accuracy than each sensor alone.

18.
arXiv (CS.CV) 2026-06-12

Amnesia: A Stealthy Replay Attack on Continual Learning Dreams

Continual learning (CL) models often use experience replay to reduce catastrophic forgetting, but their robustness to replay sampling interference remains underexplored. Existing CL attacks alter inputs or training pipelines (poisoning/backdoors) and rarely include explicit auditable constraints, limiting realism. Here, auditability means a monitor can verify compliance from sampler-visible telemetry - e.g., logged replay index/label statistics - by checking that the realized replay class histogram stays close to a nominal baseline and that replay rate is unchanged per batch and/or over a rolling window. We study a limited-privilege insider who controls only replay index selection, not pixels, labels, or model parameters, while staying within auditable limits such as queue priorities. We introduce Amnesia, a replay composition attack that maximizes degradation under two budgets: a visibility budget delta bounding the TV/KL divergence from a nominal class histogram p0, and a mass budget f fixing the replay rate. Amnesia has two steps: (i) compute lightweight class utilities, such as EMA loss or confidence, to tilt p0 toward harmful classes; and (ii) project the tilt back into the delta-ball using efficient KL (exponential tilt) or TV (balanced mass redistribution) optimizers. A windowed scheduler enforces rolling audits. Across challenging CL benchmarks and strong replay baselines, Amnesia consistently lowers final accuracy (ACC) and worsens backward transfer (-BWT). The KL variant delivers high impact while remaining largely undetected under multiple audit schemes, including per-batch and rolling-window checks. The TV variant is more damaging but easier to detect, especially under tight per-class constraints. These results expose index-only replay control as a practical, auditable threat surface in CL systems and establish a principled impact-visibility trade-off.

19.
arXiv (CS.LG) 2026-06-16

Imbalanced Classification under Capacity Constraints

arXiv:2605.03289v2 Announce Type: replace-cross Abstract: Detecting observations from a minority class under severe class imbalance is a central challenge in applications such as fraud detection, medical screening, and industrial quality control. In these settings, each positive prediction triggers a costly follow-up action, an MRI scan, a transaction audit, whose execution is subject to real operational constraints. This paper proposes a formal classification framework under capacity constraints: given a user-defined bound limit $b$ on the proportion of observations that can be labeled as belonging to the minority class, the goal is to find the classifier that maximizes sensitivity on that class. We characterize the optimal classifier under this constraint and establish its equivalence with the classical Bayes classifier under a reweighting of the prior probabilities. We also introduce a capacity-adjusted performance metric $M$ that accounts for the effective detection rate when the capacity constraint is binding. The framework is implemented on top of standard learning methods, k-NN, SVM, random forests, and neural networks, and statistical consistency is established for each. We further show that these methods reduce to post-hoc thresholding when no hyperparameters are oriented toward the capacity-constrained objective, and introduce a capacity-aware support vector machine that exploits the constraint during training and achieves the strongest empirical performance. Experiments on the Taiwanese credit card default dataset confirm that capacity-constrained classifiers substantially outperform both classical approaches and SMOTE under high imbalance regimes. The framework extends naturally to multiclass settings and online environments.

20.
medRxiv (Medicine) 2026-06-22

Pump-Free Patient-Derived Human Proximal Tubule Microphysiological System for Modeling Flow-Dependent Epithelial Maturation and Cisplatin Injury

Recent initiatives by the U.S. Food and Drug Administration and the National Institutes of Health to reduce animal testing in drug development have highlighted the need for in vitro platforms that better recapitulate human biology for preclinical safety assessment. Drug-induced nephrotoxicity remains a major cause of drug attrition, underscoring the need for human-relevant kidney models. To address this, a pump-free human patient-derived proximal tubule microphysiological system was developed by integrating human renal proximal tubular epithelial cells (hRPTECs), isolated from non-tumorous nephrectomy cortex, with a porous membrane-based microfluidic device. Expanded hRPTECs were cultured for 10 days under static conditions or rocker-driven shear stress approximating physiological proximal tubular flow. Shear stress increased epithelial density, enhanced proximal tubule marker expression (Na+/K+-ATPase and aquaporin-1), and improved Zonula occludens-1 and occludin localization. Bulk RNA sequencing demonstrated transcriptomic changes associated with enhanced apical maturation and epithelial signature. In cisplatin-induced injury assays, shear-conditioned epithelia exhibited reduced cell density and increased {gamma}H2AX staining, indicating greater sensitivity to nephrotoxicity. These findings demonstrate that rocker-driven shear stress promotes epithelial maturation in patient-derived hRPTECs. The pump-free human patient-derived proximal tubule microphysiological system offers a practical, scalable, and physiologically relevant platform for modeling flow-dependent proximal tubule biology and assessing human-relevant nephrotoxicity.

21.
arXiv (CS.CV) 2026-06-25

A welding penetration prediction model for laser welding process based on self-supervised learning using physics-informed neural networks

The laser welding full-penetration is of critical importance, as it constitutes one of the fundamental factors in achieving defect-free welded joints. Accurate prediction of the penetration state is therefore essential for ensuring weld quality. To this end, this paper introduces SimPhysNet, a novel algorithm that achieves high classification accuracy in laser welding penetration prediction using only a limited number of labelled images. This approach effectively overcomes the limitations of supervised learning classification algorithms, which are hindered in industrial applications by their dependence on extensive, high-quality labelled data. The core of SimPhysNet is a unique self-supervised learning paradigm that embeds physical priors into a contrastive learning framework. By incorporating a physics-informed neural network (PINN), the model is guided to extract physically meaningful features of the molten pool and keyhole from a large set of unlabelled data, while three image augmentation tasks further enhance its generalization capabilities. Subsequently, a few-shot learning strategy, based on prototypical networks, enables robust classification by constructing class representations from a minimal set of labelled images. Experimental results demonstrate that SimPhysNet achieves a classification accuracy of 96.06% using only 200 labelled images (approximately 5% of the total labelled dataset), which is comparable to the performance of conventional supervised learning algorithms that utilize the entire labelled dataset. This work presents a new, efficient, and highly accurate method, providing the way for the intelligent automation of laser welding.

22.
arXiv (CS.AI) 2026-06-19

Modeling Day-Long ECG Signals to Predict Heart Failure Risk with Explainable AI

arXiv:2601.00014v2 Announce Type: replace-cross Abstract: Heart failure (HF) affects 11.8% of adults aged 65 and older, reducing quality of life and longevity. Preventing HF can reduce morbidity and mortality. We hypothesized that artificial intelligence (AI) applied to 24-hour single-lead electrocardiogram (ECG) data could predict the risk of HF within five years. To research this, the Technion-Leumit Holter ECG (TLHE) dataset, including 69,663 recordings from 47,729 patients, collected over 20 years was used. Our deep learning model, DeepHHF, trained on 24-hour ECG recordings, achieved an area under the receiver operating characteristic curve of 0.80 that outperformed a model using 30-second segments and a clinical score. High-risk individuals identified by DeepHHF had a two-fold chance of hospitalization or death incidents. Explainability analysis showed DeepHHF focused on arrhythmias and heart abnormalities. This study highlights the feasibility of deep learning to model 24-hour continuous ECG data, capturing paroxysmal events essential for reliable risk prediction. Artificial intelligence applied to single-lead Holter ECG is non-invasive, inexpensive, and widely accessible, making it a promising tool for HF risk prediction.

23.
arXiv (CS.CL) 2026-06-12

Epistemic Constitutionalism Or: how to avoid coherence bias

Authors:

Large language models increasingly function as artificial reasoners: they evaluate arguments, assign credibility, and express confidence. Yet their belief-forming behavior is governed by implicit, uninspected epistemic policies. This paper argues for an epistemic constitution for AI: explicit, contestable meta-norms that regulate how systems form and express beliefs. Source attribution bias provides the motivating case: I show that frontier models enforce identity-stance coherence, penalizing arguments attributed to sources whose expected ideological position conflicts with the argument's content. When models detect systematic testing, these effects collapse, revealing that systems treat source-sensitivity as bias to suppress rather than as a capacity to execute well. I distinguish two constitutional approaches: the Platonic, which mandates formal correctness and default source-independence from a privileged standpoint, and the Liberal, which refuses such privilege, specifying procedural norms that protect conditions for collective inquiry while allowing principled source-attending grounded in epistemic vigilance. I argue for the Liberal approach, sketch a constitutional core of eight principles and four orientations, and propose that AI epistemic governance requires the same explicit, contestable structure we now expect for AI ethics.

24.
arXiv (CS.CV) 2026-06-15

Fusion of Pervasive RF Data with Spatial Images via Vision Transformers for Enhanced Mapping in Smart Cities

In this paper, we present a deep learning-based approach that integrates the DINOv2 architecture to improve building mapping by combining (possibly erroneous) maps from open-source platforms with pervasive radio frequency (RF) data collected from multiple wireless user equipments and base stations. Unlike prior methods, our approach leverages a vision transformer-based architecture to jointly process both RF and map modalities within a unified framework, effectively capturing spatial dependencies and structural priors for enhanced mapping accuracy. For the evaluation purposes, we employ a synthetic dataset co-produced by Huawei. To address the challenges associated with real-world data imperfections, we introduce controlled noise to its RF data so as to simulate real-world conditions. Additionally, we develop and train a model that leverages only aggregated path loss information to tackle the mapping problem. We measure the results according to three performance metrics: the Jaccard index (intersection over union, IoU), the Hausdorff distance, and the Chamfer distance. Our design achieves a macro IoU of 65.3%, significantly surpassing (i) the erroneous maps baseline, which yields 40.1%, (ii) an RF-only method from the literature, which yields 37.3%, and (iii) a non-AI fusion baseline that we designed which yields 42.2%. The comparative evaluation highlights the limitations of relying solely on RF data or on spatial data, as well as the effectiveness that AI can have on fusing data towards enhancing smart city mapping accuracy. We further validate our method on real-world data from the Oslo region, complementing the synthetic evaluation with a real deployment setting, where our best fusion model reaches 64.9% macro IoU. We additionally outline a strategy for deploying the model over larger areas by tiling the region with overlapping windows.

25.
arXiv (CS.LG) 2026-06-24

Dirac-Frenkel dynamics with inertia for nonlinearly parametrized solutions of evolution problems

arXiv:2606.24769v1 Announce Type: cross Abstract: Even when Dirac-Frenkel dynamics determine a well-defined evolution in function space, the corresponding parameter dynamics can be non-unique or ill-conditioned for redundant nonlinear parametrizations such as neural networks or mixture models. We propose to add inertia to the Dirac-Frenkel dynamics and show that this allows useful parameter velocity information to persist from the past trajectory in directions that are weakly informed, while well-informed parameter velocity directions continue to follow the Dirac-Frenkel dynamics. We prove that the inertial formulation yields well-posed parameter dynamics and provide a posteriori error bounds. After time discretization, the method requires the solution of the same type of regularized linear least-squares problem as standard Dirac-Frenkel dynamics, but with the previous velocity appearing as an anchor. Numerical experiments demonstrate the increased robustness obtained with inertia.