Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
medRxiv (Medicine) 2026-06-18

Early-life Urban Environment, Nutrition, and Pubertal Timing in Southern Europe: An Exposome Analysis

Background: Urban environmental and lifestyle factors during early life may influence pubertal timing, but the combined effects of multiple environmental exposures within an exposome analytical framework remain poorly understood. Objective: To examine the association between early-life urban environmental exposures and pubertal timing, and to explore whether these exposures interact with early-life nutritional factors, namely breastfeeding duration and childhood diet quality. Methods: Data from two European population-based birth cohorts were analysed: Generation XXI (G21, Portugal; n=5263; 51.5% girls) and INfancia y Medio Ambiente (INMA, Spain; n=1019; 50.1% girls). Urban environmental exposures including indicators of air pollution, traffic, built environment, and natural spaces were estimated at 4 early-life stages at both cohorts: pregnancy (INMA only), birth, 1 year, and 4-5 years of age. Pubertal development timing was assessed using Tanner staging and/or the Pubertal Development Scale (PDS), and age at menarche was self-reported. Exposome-Wide Association Study (ExWAS) models and unsupervised clustering followed by ordinal logistic regression models were used to examine single- and multi-exposure associations, respectively. Regression models were fitted adjusting for relevant child characteristics, maternal factors, and household socioeconomic conditions, and corrected for multiple testing. Results: Individuals living in more unfavourable urban environments characterised by higher building density, air pollution, and lower access to natural spaces showed earlier pubertal timing according to multiple outcomes, across multiple early-life exposure periods, and in both cohorts. In the G21 cohort, these environmental profiles were associated with earlier age at menarche, particularly for exposures at 1-1.5 and 4-5 years (e.g., 1-1.5y: {beta}=-0.172, FDR-adjusted p-value=0.041), while in the INMA cohort, boys exposed to more unfavourable environmental profiles showed more advanced pubertal development, also particularly for exposures at 1-1.5 and 4-5 years of age (e.g., 1-1.5y; {beta}=0.572, FDR-adjusted p-value=0.008). Among environmental domains, air pollution and traffic were the factors most consistently associated with pubertal timing. Regarding early-life nutritional factors, longer duration of exclusive breastfeeding was associated with a lower Tanner stage among girls in G21. No significant interactions between breastfeeding duration and environmental exposure clusters were observed. Conclusion: Early-life urban environmental exposures, particularly air pollution and traffic, may influence pubertal timing. Exclusive breastfeeding may have a protective role against earlier pubertal development. These findings highlight the importance of improving urban environmental conditions and promoting breastfeeding to support healthy developmental trajectories.

02.
arXiv (CS.CL) 2026-06-18

ToolGrad: Efficient Tool-use Dataset Generation with Textual "Gradients"

Prior work synthesizes tool-use LLM datasets by first generating a user query, followed by complex tool-use annotations like depth-first search (DFS). This leads to inevitable annotation failures and low efficiency in data generation. We introduce ToolGrad, an agentic framework that inverts this paradigm. ToolGrad first constructs valid tool-use chains through an iterative process guided by textual "gradients", and then synthesizes corresponding user queries. This "answer-first" approach led to ToolGrad-500, a dataset generated with more complex tool use, lower cost, and almost 100% pass rate. Experiments show that ToolGrad models outperform those trained on expensive baseline datasets and proprietary LLMs. The ToolGrad source code, dataset, and models are available at https://github.com/zhongyi-zhou/toolgrad.

03.
arXiv (math.PR) 2026-06-16

Risk or Replace: Efficient Asymptotics for Data-Driven Maintenance

arXiv:2606.14706v1 Announce Type: cross Abstract: Condition-based maintenance (CBM) is an approach that plans interventions for deteriorating systems according to their observed operational state. CBM reduces unplanned downtime and extends usable lifetime. We study a heterogeneous population of components that degrade over time according to a stochastic processes with non-negative and i.i.d. increments that are characterized by component-specific parameters that remain unobservable to the decision maker. We rely on degradation data to estimate these parameters and determine replacement actions at equidistant epochs. The goal is to minimize the long-run average cost, which incorporates fixed replacement costs, failure costs, and operating costs. This problem can be formulated as a high-dimensional partially observable Markov decision process (POMDP), which is generally intractable. We develop a tractable, data-driven CBM policy that estimates the optimal policy of a hypothetical Oracle that has full information of the underlying degradation parameters and call this policy the Estimated Oracle's Optimal Policy (EOP). We introduce a scaling regime where both the failure thresholds and cost parameters increase proportionally, reflecting practical settings in which component lifetimes and maintenance costs are large relative to the time between two consecutive CBM decision moments. We show that the regret of the EOP, defined as the difference between its long-run average cost and that of the Oracle, converges to zero in the scaling regime when the parameter estimator is consistent. Across extensive experiments using both real and simulated data, the EOP achieves very low regret and, whenever the optimal POMDP policy can be computed exactly, a negligible optimality gap.

04.
arXiv (CS.AI) 2026-06-11

LSTM based IoT Device Identification

arXiv:2304.13905v2 Announce Type: replace-cross Abstract: While the use of the Internet of Things is becoming more and more popular, many security vulnerabilities are emerging with the large number of devices being introduced to the market. In this environment, IoT device identification methods provide a preventive security measure as an important factor in identifying these devices and detecting the vulnerabilities they suffer from. In this study, we present an end-to-end machine learning pipeline that identifies IoT devices in the Aalto university dataset (IoT devices captures) using Long Short-Term Memory (LSTM) networks. Raw network packet captures (PCAP) are processed into 25 engineered features, which are then arranged as sliding-window time-series sequences. We systematically evaluate sequence lengths from 2 to 20, reporting that performance improves approximately linearly up to length 6 and thereafter in a wave-like pattern, reaching its peak at length 18. On the final held-out test set with the optimal configuration, the model achieves an accuracy of 79.85% and a macro-averaged F1-score of 75.70% across 27 device classes.

05.
arXiv (quant-ph) 2026-06-19

Generating function and Bloch representation for quantum Fisher tensor

arXiv:2511.05260v2 Announce Type: replace Abstract: The Uhlmann relative amplitude between two density matrices is shown to be a generating function, through which the quantum Fisher tensor that contains both the quantum Fisher information matrix and the mean Uhlmann curvature can be obtained via differentiation over system parameters. In the pure state limit, our generating function recovers that of the quantum geometric tensor proposed by Het\'{e}nyi and L\'{e}vay, and also clarifies the fidelity and phase between two quantum states as the generating functions of the quantum metric and Berry curvature, respectively. A generic expression for the quantum Fisher tensor in terms of the Bloch representation of density matrices is derived, which facilitates the calculation of the tensor, mean Uhlmann curvature, and geometric properties derived from the quantum Fisher information matrix. Canonical ensembles of spins are adopted to demonstrate our formalism, which reveals a constant Ricci scalar, a vacuum Einstein equation, and a cosmological constant on the 3D Euclidean manifold of the magnetic field

06.
arXiv (CS.CV) 2026-06-16

AME: A Multi-Type Contributor Attribution Framework in Generative AI Markets

Generative AI enables value creation through multi-stage collaboration among heterogeneous contributors, including training data, base models, fine-tuning behaviors, and prompts. However, how to fairly allocate the data value remains largely unexplored. This paper formulates multi-stage generative AI value allocation as a new research problem and identifies three core challenges: heterogeneous data contribution valuation, data rights mapping, and trustworthy execution. We propose AME (Attribution-Mapping-Execution) framework, a unified framework that integrates data contribution valuation, data rights mapping, and trustworthy execution into a single workflow. Experimental results demonstrate that AME framework achieves data value allocation outcomes more consistent with human reference judgments while maintaining low-cost trustworthy execution. Our work provides an initial foundation for value assessment and revenue allocation in generative AI data markets.

07.
arXiv (CS.CV) 2026-06-17

GSPan: A Continuous Gaussian Primitive Representation for Arbitrary-Scale Pansharpening

Pansharpening aims to generate high-resolution multispectral (HRMS) images by fusing low-resolution multispectral (LRMS) and panchromatic (PAN) observations. Most existing deep learning methods treat pansharpening as fixed-grid prediction, which limits scale adaptation. To address this, we propose GSPan, a framework that introduces 2D Gaussian Splatting (GS) into pansharpening. Instead of directly predicting pixels, GSPan represents band-wise residual details as continuous and learnable 2D Gaussian primitives. We design a Dual-Stream Hierarchical Interaction (DSHI) architecture with a Spatial-Spectral Interactive Attention (SSIA) module to estimate these primitives from complementary PAN and MS observations. The predicted primitives are rendered as a residual detail field and injected into the upsampled MS image. This continuous representation allows GSPan to render fused images on arbitrary target sampling grids without scale-specific retraining. It further enables a Scale-Decoupled Asymmetric Inference (SDAI) strategy, which estimates primitives at a reduced resolution and renders the fused image at the target resolution for efficient large-scene pansharpening. Experiments on QuickBird, GaoFen-2, WorldView-3, and WorldView-3-4K datasets show that GSPan delivers state-of-the-art fusion performance. Moreover, SDAI markedly accelerates inference, achieving a favorable trade-off between computational efficiency and fusion quality. Our results demonstrate the potential of continuous Gaussian residual representations as a flexible and scale-decoupled alternative to fixed-grid prediction.

08.
arXiv (CS.AI) 2026-06-16

Medical world models: representing medical states, modelling clinical dynamics and guiding intervention policies

arXiv:2606.16721v1 Announce Type: new Abstract: Medical diagnosis and treatment are dynamic processes in which patient states evolve over time and clinical interventions alter future outcomes. Although current medical AI can detect disease, estimate risk and generate reports, many systems still return static labels or scores, offering limited insight into how illness may progress or how alternative interventions may reshape its trajectory. Medical world models adapt the world-model idea from artificial intelligence to healthcare by learning internal simulators of patient-state dynamics. Their long-term goal is to help clinicians anticipate deterioration, compare treatment-conditioned futures and tailor care to individual patients. Yet relevant work remains scattered across foundation models, longitudinal modelling, disease simulation, treatment-effect estimation, reinforcement learning and digital twins. To bridge this gap, this review outlines a roadmap for advancing medical AI from isolated diagnosis and prediction toward medical world models that simulate disease evolution and support intervention decisions. This roadmap is organized around three coupled capabilities: patient-state construction, clinical dynamics modelling and intervention decision support. Across representative systems, the comparison highlights what each capability contributes and how partial components can be integrated into more mature perception–dynamics–planning systems. Finally, we identify the challenges involved in turning plausible rollouts into clinically useful simulators. Related literature is available at https://github.com/1999kevin/awesome_medical_world_models.

09.
arXiv (CS.CV) 2026-06-11

CellNet – Localizing Cells using Sparse and Noisy Point Annotations

Counting living cells is an important step in many biological research workflows. Our collaborators at the Wellcome Sanger Institute study vital genes in humans via large scale saturation genome editing screening, which requires repeatedly counting cells a great number of times. Computer Vision based automation is crucial for high throughput and resource efficiency. In this work, we develop a regression-based deep learning computer vision algorithm to detect and count cells in phase-contrast microscopy images. To reduce annotation effort, which in practice often becomes a bottleneck, we focus on counting cells only using sparse point annotations, which are fast and easy to acquire. By comparison to state-of-the-art 0-shot methods, we show that regression-based counting is a promising alternative in low data regimes. Through developing methods to automatically count living cells in microscopy images, we contribute to valuable research on the human genome. The code is available at https://github.com/beijn/cellnet.

10.
arXiv (CS.CL) 2026-06-12

Recursive Agent Harnesses

Recursive language models (RLMs) showed that recursion over model calls is an effective strategy for long-context reasoning, and production coding agents have begun to write code that spawns subagents at scale, most recently in Anthropic's dynamic workflows. We name and study the pattern between these two lines of work, where the recursive unit is a full agent harness with filesystem tools, code execution, and planning rather than a model call with no tools. We call this the Recursive Agent Harness (RAH) and frame it as harness recursion, the code-first extension to the model recursion of RLMs. A parent agent generates and runs an executable script that spawns subagent harnesses in parallel for fine-grained workloads and uses structured function calls for small subtasks. We provide a controlled evaluation on long-context reasoning. With the backbone held fixed at GPT-5 to match the published Codex and RLM baselines, RAH improves the Codex coding-agent baseline from 71.75% to 81.36% on Oolong-Synthetic (199 samples, 13 context-length buckets up to 4M tokens), a gain attributable to the harness rather than the model. With a stronger backbone, Claude Sonnet 4.5, the same design reaches 89.77%.

11.
arXiv (CS.AI) 2026-06-18

Efficient Zeroth-Order Federated Finetuning of Language Models on Resource-Constrained Devices

arXiv:2502.10239v3 Announce Type: replace-cross Abstract: Federated Learning (FL) is a promising paradigm for finetuning Large Language Models (LLMs) across distributed data sources while preserving data privacy. However, finetuning such large models is challenging on edge devices due to its high resource demand. Zeroth-order Optimization (ZO) estimates gradients through finite-difference approximations, which rely on function evaluations under random perturbations of the model parameters. Consequently, ZO with task alignment provides a potential solution, allowing finetuning using only forward passes with inference-level memory requirements and low communication overhead, but it suffers from slow convergence and higher computational demand. In this paper, we propose a new ZO-based method that applies a more efficient technique to reduce the computational demand associated with using a large number of perturbations while preserving their convergence benefits. This is achieved by splitting the model into consecutive blocks and allocating a higher number of perturbations to the second block, enabling efficient reuse of intermediate activations to update the full network with fewer forward evaluations. Our evaluation on RoBERTa-large, OPT1.3B, LLaMa-3-3.2B models shows up to $3\times$ reduction in computation compared to the other ZO-based techniques, while retaining the memory and communication benefits over first-order federated learning techniques.

13.
arXiv (math.PR) 2026-06-17

Limit theorems for descents and inversions of shelf-shuffles

arXiv:2510.00343v2 Announce Type: replace Abstract: We prove central limit theorems for the number of descents and inversions of permutations produced by shelf-shuffles. These are a model for casino card shuffling machines. We show the asymptotic normality of the number of descents in two limiting regimes depending on the ratio of cards to shelves. On the other hand, we study the inversions by employing a modification of the techniques from Islak's analysis of the statistics of riffle shuffles. In particular, we obtain a bound for the rate of convergence for inversions that is independent of the number of shelves.

14.
arXiv (CS.LG) 2026-06-19

The Representational Limit of Scalar Interactions: An Interventional Decomposition

arXiv:2606.19410v1 Announce Type: cross Abstract: Signed pairwise interaction scores fundamentally conflate uniqueness (U), redundancy (R), and synergy (S). We prove this on a minimal 3-way XOR structural causal model: faithful indices such as Shapley-Taylor return zero per pair, whereas projective indices such as Shapley Interaction spread the third-order effect into pair scalars that conflate the three mechanisms. We introduce Stochastic Hi-Fi, a post-hoc, retraining-free predictability decomposition that estimates per-feature U/R/S profiles by interventional masked inference. The estimator provides exact interventional semantics, finite-sample Monte Carlo bounds, strict variance reduction from coupled diamond sampling, and uniform finite-vocabulary convergence. Across tabular SCMs, Stochastic Hi-Fi recovers structure missed by scalar baselines (up to 411x larger interaction-magnitude recovery ratios). It also separates redundant and synergistic heads in the GPT-2 IOI circuit. On NIH ChestX-ray14, Stochastic Hi-Fi matches GradCAM on Pointing Game and improves substantially on Deletion AUC.

15.
arXiv (CS.CL) 2026-06-15

Small LLMs: Pruning vs. Training from Scratch

Pruning promises a shortcut to strong small language models. In this work, we examine this promise by pruning Llama-3.1-8B at pruning ratios of 0.5–0.8 with six methods spanning depth, width, and sparse granularities, under two controlled token-matched settings. (1) With the same training token budget, pruned initialization consistently outperforms random initialization. This shows that the parent model provides a strong starting point, although the advantage narrows as the training token budget grows and as the pruning ratio rises, nearly vanishing at the highest pruning ratio we study. (2) When training from scratch is instead given the full token budget consumed by the whole pipeline, pruning at finer granularities still retains an advantage, while coarser structured pruning can be matched or surpassed. This suggests that the parent model transfers knowledge that additional training tokens alone cannot fully recover, but only at fine granularity. Taken together, our results yield a clear recommendation: with a large pretrained model in hand and a limited training token budget, pruning is better than training from scratch; when the training budget is not limited, training from scratch can be competitive for coarser pruning, so a large pretrained parent is not always necessary.

16.
arXiv (CS.CV) 2026-06-15

CausalMotion: Structured Physical Reasoning as Keyframe and Trajectory Guidance for Training-Free Video Generation

Recent advances in diffusion-based video generation have significantly improved visual quality and short-term temporal coherence. However, existing methods still struggle to produce videos with physically consistent and causally plausible dynamics, especially in scenarios involving long-horizon interactions. This limitation arises from the fact that video diffusion models primarily learn physical consistency implicitly, while vision-language models can directly model physical laws. Based on this idea, in this work, we propose CausalMotion, a training-free framework that injects explicit physical reasoning into video generation through structured intermediate representations. Our key idea is to decouple reasoning from generation by leveraging a vision-language model to decompose a text prompt into a sequence of causally consistent keyframes and object-centric motion trajectories. These representations are then aligned and integrated as soft constraints to guide a pretrained video diffusion model during inference. This design enables explicit modeling of object dynamics and causal transitions without requiring additional training or supervision. Extensive experiments show that our method consistently improves physical plausibility and temporal coherence, particularly in dynamics-intensive scenarios, while maintaining high perceptual video quality.

17.
arXiv (CS.AI) 2026-06-15

TwinBI: An Agentic Digital Twin for Efficient Augmented Interactions with Business Intelligence Dashboards

arXiv:2606.13731v1 Announce Type: new Abstract: Business intelligence (BI) increasingly combines dashboard interaction with LLM-based assistance, but these two modes often fall out of sync during multi-step analysis. As users switch between direct dashboard manipulation and natural-language queries, it becomes difficult to preserve a consistent analytical state across filters, hierarchies, metrics, and chart context. We present TwinBI, an agentic digital-twin framework that couples an LLM-based agent system with an executable BI dashboard state. TwinBI unifies conversational interaction, dashboard manipulation, semantic grounding, and provenance tracking through a shared analytical state reconstructed from a unified interaction log. It also exposes artifacts such as schema views, SQL, logs, and an /insights command for state-grounded analytical summaries. We evaluate TwinBI in two complementary ways. In a controlled A/B benchmark with the same backbone agent, TwinBI improves exact-match accuracy from 43.3% to 63.3%, partial-credit accuracy from 48.3% to 70.8%, and substantially reduces timeout rate from 40.0% to 10.0% relative to Dashboard alone. In a usability study, participants benefited from the integrated dashboard-and-chat workflow, with high task accuracy, moderate workload, and favorable ratings for state-aware interaction mechanisms. These results suggest that TwinBI improves both agent-level analytical reliability and user-facing analytical support by turning visible dashboard state into richer actionable context. Our dataset and source code are available at: https://github.com/simonjisu/TwinBI

18.
arXiv (CS.CL) 2026-06-11

Can AI Reason Like an Urban Planner? Benchmarking Large Language Models Against Professional Judgment

Problem, Research Strategy, and Findings: The rise of large language models (LLMs) raises a key question for urban planning: which forms of professional planning knowledge can AI replicate, and which still require human judgment? Although AI tools are increasingly used in planning practice, there is still no systematic framework for testing whether they can reason with the contextual sensitivity, value awareness, and institutional literacy central to planning expertise. This paper introduces Urban Planning Bench (UPBench), a domain-specific evaluation framework that assesses LLM reasoning through a 4x5 matrix of four knowledge pillars and five cognitive levels adapted from Bloom's revised taxonomy. Evaluating 25 LLMs with automated scoring and expert review, we find a non-monotonic cognitive curve: models perform better on higher-order analytical tasks than on factual recall and integrative judgment. This suggests that planning knowledge often treated as lower-order is deeply shaped by institutional, jurisdictional, and temporal context, making it hard for LLMs to generalize. We summarize these limits as four epistemic diagnostics: regulatory hallucination, conceptual conflation, wickedness paralysis, and phronetic deficit. Takeaway for Practice: The findings support differential delegation in planning. LLMs can assist with cross-disciplinary synthesis, literature review, scenario generation, and preliminary policy analysis. However, they remain unreliable for jurisdiction-specific regulation, normative conflict resolution, and context-sensitive procedure. Agencies should require verification for AI-assisted regulatory analysis, while planning education should emphasize institutional literacy, normative judgment, and contextual sensitivity.

19.
arXiv (CS.CV) 2026-06-16

City landscape in sight: A crowdsourced framework for unlocking urban-scale window view perceptions from real estate imagery

City landscapes viewed through home windows influence quality of life, yet perceptions of actual window views at the urban scale remain understudied. This study presents an approach for large-scale mapping of perceptions using 12,334 window view images (WVIs) collected from actual residential properties listed on real estate platforms in Wuhan, China, representing a rarely explored form of urban view imagery that offers advantages over the rendered or simulated window views commonly examined in previous studies. Through a non-immersive virtual reality platform, we collected 27,477 pairwise comparisons across six perceptual dimensions (e.g.\ Vivid) from 304 participants based on 499 WVIs. A hybrid neural network model was trained to predict human perceptions of all crowdsourced WVIs and map their spatial distribution. Results reveal significant spatial autocorrelation with distinct hot and cold spots across the whole city. Floor level strongly influences human perceptions: while higher floors offer more preferred and extensive window views, lower-floor windows provide residents with quiet and vivid views. An inference model further shows that window view composition matters considerably: high ratios of sky, trees, and low-rise buildings enhance people's preferences and perceptions of vividness, whereas high ratios of high-rise buildings increase perceptions of monotony and oppression. Importantly, these effects are non-linear: the excessive presence of certain elements can alter their impact on human perception. This work advances urban-scale understanding of residents' visual experiences and provides evidence-based guidance for human-centric urban planning and real estate to optimise visual landscapes from windows.

20.
arXiv (CS.CV) 2026-06-16

Contrastive Learning for Seismic Horizon Tracking with Domain-Specific Priors

Unsupervised 3D seismic horizon tracking faces a key limitation: signal-based propagators provide accurate trace-level alignment but often fail near faults, whereas texture-driven deep models are more robust to discontinuities, typically at the cost of labeled data requirements and reduced trace-level precision. We propose a self-supervised fusion of both paradigms in which signal-derived local horizon correspondences act as domain-specific priors to train a texture-based deep learning model. Specifically, we estimate reliable trace-to-trace flows from reflector slopes and use them to form positive pairs in a contrastive objective, while restricting training to high-confidence neighborhoods, optionally augmented with a fault mask. The objective is not to infer ambiguous correspondences close to discontinuities, but to preserve horizon identity across them. As a result, the network learns voxel-wise embeddings that preserve local signal continuity while enabling horizon propagation beyond discontinuities through similarity search. Experiments on the public F3 dataset and a faulted synthetic dataset achieve lower mean absolute error (MAE) than unsupervised baselines and competitive performance against a semi-supervised method using a single labeled slice.

21.
arXiv (CS.AI) 2026-06-19

ScholarQuest: A Taxonomy-Guided Benchmark for Agentic Academic Paper Search in Open Literature Environments

arXiv:2606.20235v1 Announce Type: cross Abstract: Academic paper search is a core step in scientific research, and LLM-based search agents are emerging as a promising paradigm for iterative, intent-driven literature exploration. However, existing benchmarks are insufficient for systematically evaluating agentic academic search under realistic open literature environments. We propose ScholarQuest, a large-scale, taxonomy-guided benchmark for agentic academic paper search. ScholarQuest is constructed from over 1,000 computer science topics and four representative research intents, including method-oriented, setting-anchored, comparison-based, and scope-controlled queries. It further provides scalable answer construction and a shared retrieval backend ScholarBase for reproducible evaluation. Benchmarking results show that agentic methods outperform single-shot retrieval baselines, yet the best-performing agent only achieves 0.314 Recall@100 and 0.355 Recall@All, indicating substantial room for improvement. In addition, analyses of search efficiency, intent-level robustness, and failure cases further highlight the benchmark's ability to provide multi-dimensional evaluation signals for academic paper search agents.

22.
arXiv (CS.AI) 2026-06-18

Information-Theoretic Measures in AI: A Practical Decision Guide

arXiv:2604.23716v2 Announce Type: replace Abstract: Information-theoretic (IT) measures are ubiquitous in artificial intelligence: entropy drives decision-tree splits and uncertainty quantification, cross-entropy is the default classification loss, mutual information underpins representation learning and feature selection, and transfer entropy reveals directed influence in dynamical systems. A second, less consolidated family of measures, integrated information (Phi), effective information (EI), and autonomy, has emerged for characterizing agent complexity. Despite wide adoption, measure selection is often decoupled from estimator assumptions, failure modes, and safe inferential claims. This paper provides a practical decision framework for all seven measures, organized around three prescriptive questions for each: (i) what question does the measure answer and in which AI context; (ii) which estimator is appropriate for the data type and dimensionality; and (iii) what is the most dangerous misuse. The framework is operationalized in two complementary artifacts: a measure-selection flowchart and a master decision table. We cover both AI/ML and decision-making agent application domains per measure, with standardized Bridge Boxes linking IT quantities to cognitive constructs. Three worked examples illustrate the framework on concrete practitioner scenarios spanning representation learning, temporal influence analysis, and evolved agent complexity.

23.
arXiv (CS.AI) 2026-06-17

Kolmogorov Regression for Robust Diffusion Policies

作者:

arXiv:2606.18186v1 Announce Type: cross Abstract: Finite-dimensional (FD) diffusion policies exhibit temporal drift owing to discretization artifacts that degrade long-horizon performance (when deployed on physical systems). We introduce a backward Kolmogorov equation that lifts diffusion policies to a Cameron-Martin space – a subset of the Hilbert space. Essentially, replacing stochastic score matching with a deterministic boundary-value PDE problem. Our core innovation thrives on Gaussian measure theory whereupon the diffusion noise covariance operator is realized from a colored noise distribution which prescribes a notion of regularity on samples from the model at inference time. We train the diffusion model with a derived precision-weighted Cameron- Martin loss and a Kolmogorov residual is introduced as a PDE diagnostic during inference. These substitutions yield (i) convergence guarantees where the bound's constants depend on the effective rank of the kernel rather than action dimension, (ii) improved trajectory regularity via spectral weighting, and (iii) a deterministic failure detector without reward signals. Validation across two application domains demonstrates substantial improvements: on the PushT manipulation benchmark, the Cameron-Martin loss achieves a 17% improvement in maximum episode reward (0.95 vs. 0.78 for MSE) and 67.6% reduction in inter-step drifts during inference via the introduced residual magnitude. Similarly, on a 6-station manufacturing line with constant work-in-process (CONWIP) flow control, we achieve 28.4% lower RMSE than classical LSTM baselines; a high starvation-event recall (1.0 in test cycles), and effective bottleneck identification (Precision@1 = 1.0 in test set, 13x signal-to-noise ratio). We then certify the dispatch policies with Hamilton-Jacobi reachability theory which reduces deadlock events by 96% compared to uncontrolled dispatch over 100 simulated runs (351 events prevented).

24.
arXiv (CS.AI) 2026-06-17

Brick-DICL: Dynamic In-Context Learning for Automated Brick Schema Classification

arXiv:2606.17637v1 Announce Type: new Abstract: Building Management Systems (BMS) are essential for optimizing energy efficiency and operational performance in modern buildings. However, the lack of standardization across BMS points from different manufacturers creates significant barriers to integration and data utilization. While the Brick schema offers a standardized ontology for building systems, mapping BMS points to appropriate Brick classes presents three critical challenges: (i) the extensive number of Brick classes (936 in the latest version), (ii) limited domain-specific knowledge in large language models (LLMs), and (iii) substantial manual effort required for verification. To address these challenges, we propose Brick-DICL, a two-stage dynamic in-context learning framework for automated Brick schema classification. Brick-DICL consists of two primary components: metadata-RAG, which retrieves relevant examples to enhance LLMs' domain knowledge, and class-RAG, which narrows down potential Brick classes to address the large classification space. Additionally, we implement a multi-LLM filtering mechanism that compares predictions across multiple models, flagging low-confidence classifications for human review. As a result: (i) General: Brick-DICL is applicable to any building management system regardless of manufacturer or metadata format; (ii) Novel and Powerful: as the first dynamic in-context learning approach for Brick schema classification, Brick-DICL achieves significant classification accuracy improvements on building datasets, outperforming existing methods; (iii) Efficient: our multi-LLM filtering strategy reduces manual verification effort, enabling rapid digital building onboarding. Extensive experiments demonstrate Brick-DICL's effectiveness across diverse building datasets, accelerating the path toward standardized, interoperable building management systems.

25.
arXiv (quant-ph) 2026-06-19

Efficient upsampling for tensor-network and quantum-state encoded functions

arXiv:2601.03885v2 Announce Type: cross Abstract: Both tensor trains (TTs) and quantum states provide compressed representations of grid-structured data with potentially exponential compression power. We present a unified framework for upsampling data encoded in vector amplitudes, with efficient realizations in both classical TT and quantum settings. Starting from an \(n\)-core TT or an \(n\)-qubit state on a coarse grid with \(2^n\) points, the construction produces an \((n+m)\)-core TT or \((n+m)\)-qubit state on a finer grid with \(2^{n+m}\) points. In the TT setting, it supports interpolation, quasi-interpolation, augmentation, and synthesis through efficient low-rank contractions, with the added \(m\) cores retaining constant rank. For function-value encodings, the resulting interpolation satisfies an \(\ell^2\)-error bound independent of the number of added grid points, achieves exponential compression at fixed accuracy, and has a logarithmic complexity in the number of grid points. In the quantum setting, the refined state is prepared by a \(\mathrm{poly}(n,m)\)-size circuit using \(\log(p+1)\) ancillas, where \(p\) controls the smoothness of the quasi-interpolant; the corresponding error scales quadratically with the initial grid spacing. We validate our framework for tensor networks in one-, two-, and three-dimensional examples, including functions, derivatives, airfoil masks, and synthetic random fields such as three-dimensional turbulence. In particular, fractal fields can be generated directly in TT format with logarithmic memory and runtime. These results open a practical route to multiscale solvers, generative models, and geometry-aware algorithms on tensor-network and quantum platforms, with potential applications in scientific simulation, imaging, and real-time graphics.