Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
arXiv (quant-ph) 2026-06-16

High-performance gates on trapped ion qubits using counterpropagating pulse-shaped laser beams

arXiv:2606.15672v1 Announce Type: new Abstract: Highly-localized light-matter interactions are necessary for scaling trapped-ion architectures. In hyperfine qubits, counterpropagating beams generate entangling gates by coupling with motion, but this effect is undesirable during single-qubit operations. For that reason, single-qubit gates are traditionally implemented with copropagating beams, and the coexistence of two beam geometries adds hardware and computational overhead. In an effort towards collective performance improvement with minimal overhead, we design and implement pulse-amplitude and dephasing robust dynamically corrected gates using Space Curve Quantum Control (SCQC) and compare them against the constant-amplitude gate implementation. We perform gate set tomography on a four-qubit trapped-ion register, and we discover more than 50% error reduction when robust pulses are used. We find that counterpropagating robust gates often outperform their copropagating counterparts and reach error rates as low as $(3.59 \pm 1.25)\cdot 10^{-3}$, using diamond distance as a metric. This value establishes a laser-driven-gate error reference and is merely an order of magnitude higher than the best reported $microwave$ gate on a $single$ ion. Additional experiments reveal that robust pulses can effectively suppress non-Markovian errors that grow during runtime. Our work challenges the widely accepted belief that copropagating gates should be preferred for their weak motional coupling and invites the adoption of high-performance robust pulses that suppress multiple noise sources of the trapped-ion error budget.

02.
arXiv (CS.AI) 2026-06-19

Data Standards for Humanoid Robotics: The Missing Infrastructure for Physical AI

arXiv:2606.19769v1 Announce Type: cross Abstract: The scalability of humanoid robots will depend not only on models and hardware, but also on whether physical experience can accumulate across robots, tasks, organizations, and time. Drawing on the authors' work in developing ISO/WD 26264-1, Humanoid robot datasets – Part 1: General requirements, within ISO/TC 299/WG 16, this article argues that data standards are becoming foundational infrastructure for Physical AI. We develop three insights. First, humanoid robot data is embodied interaction data, not a collection of isolated digital samples; a useful dataset must preserve the relationship among robot body, action, task, scene, execution trace, and outcome. Second, its value depends on physical coherence: multimodal streams are reusable only when timing, coordinate frames, calibration, kinematics, units, and synchronization assumptions remain inspectable. Third, the main bottleneck is not only data scarcity, but non-cumulative data caused by high collection costs, data silos, and inconsistent evaluation. We argue that humanoid robot data standards address these bottlenecks by making embodied experience interpretable, shareable, traceable, and reusable. A general standard should provide horizontal infrastructure for lifecycle management, metadata, provenance, quality, versioning, and traceability, while capability-specific parts should define domain grammar for manipulation, locomotion, human-robot interaction, cognition, and future humanoid capabilities. As AI moves from screens into bodies, data standards must evolve from organizing digital information to structuring physical interaction.

03.
arXiv (CS.LG) 2026-06-18

How fast can you find a good hypothesis?

arXiv:2509.03734v3 Announce Type: replace-cross Abstract: In the hypothesis selection problem, we are given sample and query access to finite set of candidate distributions (hypotheses), $\mathcal{H} = \{H_1, \ldots, H_n\}$, and samples from an unknown distribution $P$, both over a domain $\mathcal{X}$. The goal is to output a distribution $Q$ whose distance to $P$ is comparable to that of the nearest hypothesis in $\mathcal{H}$. Specifically, if the minimum distance is $\mathsf{OPT}$, we aim to output $Q$ such that, with probability at least $1-\delta$, its total variation distance to $P$ is at most $C \cdot \mathsf{OPT} + \varepsilon$. The optimal approximation for proper algorithms (where $Q \in \mathcal{H}$) is $C=3$ using $\Theta(\log(n/\delta)/\varepsilon^2)$ samples from $P$ and for improper algorithms (where $Q$ is not necessarily in $\mathcal{H}$) is $C=2$ using $\tilde{\Theta}(\log(n/\delta)/\varepsilon^2)$ samples from $P$. In the improper setting, the algorithm achieving $C=2$ [Bousquet, Braverman, Kol, Efremenko, Moran, FOCS 2021] runs in time which grows polynomially with $|\mathcal{X}|$ – it does not run in finite time for real-valued distributions. A promising path towards improved runtime is to consider improper algorithms which output a mixture $Q$ of the hypotheses as such a distribution can be represented in $n$ words of memory. We show (1) a lower bound that no algorithm which outputs a mixture can achieve approximation better than $C = 3-2/n$ unless the number of samples is polynomial in $|\mathcal{X}|$, as well as (2) an algorithm which runs in time $poly(n)$ and achieves the same approximation guarantee. In the proper setting, [Aliakbarpour, Bun, Smith, NeurIPS 2024] provided an algorithm with $C=3$ running in $\tilde{O}(n/(\delta^3\varepsilon^3))$ time. We improve this time complexity to $\tilde{O}(n/(\delta \varepsilon^2))$, significantly reducing the dependence on the confidence and error parameters.

04.
arXiv (CS.CL) 2026-06-24

Societal Alignment Frameworks Can Improve LLM Alignment

Recent progress in large language models (LLMs) has focused on producing responses that meet human expectations and align with shared values - a process coined alignment. However, aligning LLMs remains challenging due to the inherent disconnect between the complexity of human values and the narrow nature of the technological approaches designed to address them. Current alignment methods often lead to misspecified objectives, reflecting the broader issue of incomplete contracts, the impracticality of specifying a contract between a model developer, and the model that accounts for every scenario in LLM alignment. In this paper, we argue that improving LLM alignment requires incorporating insights from societal alignment frameworks, including social, economic, and contractual alignment, and discuss potential solutions drawn from these domains. Given the role of uncertainty within societal alignment frameworks, we then investigate how it manifests in LLM alignment. We end our discussion by offering an alternative view on LLM alignment, framing the underspecified nature of its objectives as an opportunity rather than perfect their specification. Beyond technical improvements in LLM alignment, we discuss the need for participatory alignment interface designs.

05.
arXiv (CS.LG) 2026-06-16

Leveraging Physiological Signals to Predict Exam Outcomes with Machine Learning

arXiv:2606.14960v1 Announce Type: new Abstract: This study investigates the application of machine learning models to predict exam outcomes using physiological data collected during examination sessions. Physiological stress indicators, including electrodermal activity, heart rate, and skin temperature, were analyzed to uncover their association with academic performance. A variety of machine learning approaches were employed, ranging from standard models like logistic regression, random forest, and support vector machines to more advanced architectures, including transformers, long short-term memory (LSTM), and gated recurrent unit (GRU) models. This diversity aimed to capture the complex interactions within the data effectively. A key focus was assessing the adaptability of transformers in processing numerical data and evaluating their performance in this novel context. Standard performance metrics, such as accuracy, precision, recall, and F1-score, were used to compare model efficacy. The experimental results demonstrate that while deep learning models generally excel at capturing complex relationships in physiological data, simpler models like random forests can sometimes achieve superior performance while offering computational efficiency and interpretability. Furthermore, transformers demonstrated notable versatility, showcasing performances comparable to those of the LSTM and GRU models. This research underscores the importance of experimenting with a broad class of models that align with the objectives of the problem at hand, balancing precision, efficiency, and interpretability. By elucidating the relationships between physiological signals and academic performance, this study contributes to understanding stressors affecting students' mental health. It further promotes leveraging physiological data to enhance student well-being and academic outcomes.

06.
arXiv (CS.AI) 2026-06-11

Noise-Guided Transport for Imitation Learning

arXiv:2509.26294v2 Announce Type: replace-cross Abstract: We consider imitation learning in the low-data regime, where only a limited number of expert demonstrations are available. In this setting, methods that rely on large-scale pretraining or high-capacity architectures can be difficult to apply, and efficiency with respect to demonstration data becomes critical. We introduce Noise-Guided Transport (NGT), a lightweight off-policy method that casts imitation as an optimal transport problem solved via adversarial training. NGT requires no pretraining or specialized architectures, incorporates uncertainty estimation by design, and is easy to implement and tune. Despite its simplicity, NGT achieves strong performance on challenging continuous control tasks, including high-dimensional Humanoid tasks, under ultra-low data regimes with as few as 20 transitions.

07.
arXiv (CS.AI) 2026-06-11

MobilityBench: A Benchmark for Evaluating Route-Planning Agents in Real-World Mobility Scenarios

arXiv:2602.22638v2 Announce Type: replace Abstract: Route-planning agents powered by large language models (LLMs) have emerged as a promising paradigm for supporting everyday human mobility through natural language interaction and tool-mediated decision making. However, systematic evaluation in real-world mobility settings is hindered by diverse routing demands, non-deterministic mapping services, and limited reproducibility. In this study, we introduce MobilityBench, a scalable benchmark for evaluating LLM-based route-planning agents in real-world mobility scenarios. MobilityBench is constructed from large-scale, anonymized real user queries collected from Amap and covers a broad spectrum of route-planning intents across multiple cities worldwide. To enable reproducible, end-to-end evaluation, we design a deterministic API-replay sandbox that eliminates environmental variance from live services. We further propose a multi-dimensional evaluation protocol centered on outcome validity, complemented by assessments of instruction understanding, planning, tool use, and efficiency. Using MobilityBench, we evaluate multiple LLM-based route-planning agents across diverse real-world mobility scenarios and provide an in-depth analysis of their behaviors and performance. Our findings reveal that current models perform competently on Basic information retrieval and Route Planning tasks, yet struggle considerably with Preference-Constrained Route Planning, underscoring significant room for improvement in personalized mobility applications. We publicly release the benchmark data, evaluation toolkit, and documentation at https://github.com/AMAP-ML/MobilityBench.

08.
arXiv (CS.CL) 2026-06-11

GraspLLM: Towards Zero-Shot Generalization on Text-Attributed Graphs with LLMs

Research on Text-Attributed Graphs (TAGs) has gained significant attention recently due to its broad applications across various real-world data scenarios, such as citation networks, e-commerce platforms, social media, and web pages. Inspired by the remarkable semantic understanding ability of Large Language Models (LLMs), there have been numerous attempts to integrate LLMs into TAGs. However, existing methods still struggle to generalize across diverse graphs and tasks, and their ability to capture transferable graph structural patterns remains limited. To address this, we introduce the GraspLLM, a framework that combines Graph structural comprehension with semantic understanding prowess of LLMs to enhance the cross-dataset and cross-task generalizability. Specifically, we represent node texts from different graphs in a unified semantic space with a frozen general embedding model, on top of which we perform motif-aware contrastive learning across multiple motif-induced adjacency matrices to extract dataset-agnostic structural information. Then, with our proposed optimal contextual subgraph, we extract the most contextually relevant subgraph for each target node and align these subgraphs to the token space of LLM via an alignment projector. Extensive experiments on TAG benchmark datasets spanning diverse domains reveal that GraspLLM consistently outperforms previous LLM-based methods for TAGs, especially in zero-shot scenarios, highlighting its strong generalizability across different datasets and tasks. Our code is available at https://github.com/Heinz217/GraspLLM.

09.
arXiv (CS.CV) 2026-06-15

Improving Lunar Topography with Deep Learning Schrödinger Bridges

Increasing the resolution of planetary topography models can enable a better understanding of surface processes and geomorphology; however, existing analytical super-resolution methods are expensive and difficult to apply at large scales. Generative models provide the tools to learn complex relationships within data and can be applied at scale due to hardware accelerators and parallelization. We present a diffusion-based Schrödinger Bridge (SB) generative modeling approach for lunar topography super-resolution, connecting the distribution of low-resolution topography to that of high-resolution topography, incorporating physically-constraining optical imagery. Our approach is inspired by existing Shape-from-Shading methods, which improve a priori low-resolution topography by using optical images at the target resolution. We train SBs on a novel dataset of rendered lunar topography, emulating optical imagery from the Lunar Reconnaissance Orbiter Narrow Angle Camera. The result is a flexible approach for topography super-resolution which can provide pixel-level uncertainties in the reconstruction.

10.
arXiv (quant-ph) 2026-06-16

Scalable generation of heralded single photons via active feed-forward switching of a fiber delay line

arXiv:2606.16741v1 Announce Type: new Abstract: Quasi-deterministic single-photon generation is a key requirement for many photonic quantum technologies. Photon sources based on spontaneous parametric down-conversion (SPDC) are widely used for producing high-quality photons; however, the probabilistic nature of the process limits the generation of synchronized multi-photon states. Here, we demonstrate temporal synchronization of multiple photon-generation events using a free-space-fiber hybrid delay line with feed-forward control, enabling fast and efficient switching and scalable operation. Narrow-band, telecom-wavelength photons compatible for fiber transmission are heralded from a monolithic cavity SPDC source and synchronized across 20 time bins. This yields a sixfold enhancement in synchronized rates and enables multi-photon synchronization, with only a marginal increase of higher-order photon-number contributions.

11.
arXiv (CS.AI) 2026-06-24

VoltanaLLM: Energy-Efficient and SLO-Aware Disaggregated LLM Serving via Adaptive Frequency Control and State-Space Routing

arXiv:2509.04827v3 Announce Type: replace-cross Abstract: The energy cost of Large Language Model (LLM) inference is rapidly becoming a barrier to sustainable and scalable deployment. Although modern serving architectures expose distinct prefill and decode behaviors, existing systems fail to exploit these phase differences for energy-efficient serving under strict latency SLOs. This paper introduces VoltanaLLM, the first system that explicitly targets and reduces the energy bloat in modern prefill-decode (P/D) disaggregated LLM serving. Guided by a control-theory perspective, VoltanaLLM separates two levers: per-instance operating-point selection (GPU frequency per iteration) and system-level state-space routing of requests. We empirically observe that LLM inference exhibits a U-shaped energy-frequency curve creating "sweet spots" that depend on phase behavior and load. VoltanaLLM exploits this by combining phase-specific, iteration-level frequency selection driven by a lightweight, online-adaptive latency predictor, with a decode state-space guided router that avoids architectural granularity-induced inefficiencies, all while meeting desired SLOs. We implement VoltanaLLM using SGLang and evaluate it across multiple models and real-world workloads. Our results show VoltanaLLM reduces end-to-end energy by up to 36.3% versus a static max-frequency baseline while maintaining high SLO attainment, and generalizes to newer GPUs. These results point to sustainable LLM serving via phase-aware, iteration-level frequency selection coupled with architecture-aware routing. Source code is available in https://github.com/Supercomputing-System-AI-Lab/VoltanaLLM.

12.
arXiv (quant-ph) 2026-06-15

Inhomogeneous Light-Matter Coupling as a Resource for Noiseless Quantum Memories

arXiv:2605.26783v3 Announce Type: replace Abstract: Inhomogeneous ensembles of two-level systems are central to both fundamental light-matter physics and quantum-network applications. Understanding and optimizing ensemble-based quantum memories and entanglement protocols requires a unified framework that describes how to store quantum states of light as collective matter excitations and retrieve them on demand. Here we develop such a framework, the waveguide model, by mapping the dark collective modes of the ensemble onto an effective waveguide with well-defined input-output relations, valid in both the weak-excitation regime and near population inversion. This model reveals that inhomogeneous coupling – often regarded as a limitation – is instead the physical origin of noisy-echo suppression by adiabatic pulses, a key ingredient for realizing noiseless quantum memories. For entanglement generation, the same mechanism exposes a previously unexplored shortcoming of robust control pulses and leads to a new composite-pulse protocol that overcomes it. These results establish the waveguide model as a practical bridge between fundamental collective physics and quantum-network protocol design, recasting inhomogeneous coupling from an obstacle into a control knob for collective emission.

13.
arXiv (CS.AI) 2026-06-19

Sensorimotor World Models: Perception for Action via Inverse Dynamics

arXiv:2606.20104v1 Announce Type: cross Abstract: Perception for action suggests that representations of the world should be shaped not by visual fidelity alone, but by their relevance for actions. At the same time, latent JEPA-style world models advocate learning compact predictive states from high-dimensional observations to facilitate the prediction of future states, but end-to-end training of these models is nontrivial because representations may collapse if our only goal is to construct a latent state that is easy to predict. We introduce a sensorimotor world model (SMWM): a latent world model trained end-to-end with inverse dynamics regularization. This single regularizer addresses both issues: it prevents representation collapse and induces action-aligned representations. By forcing latent states to preserve information about the action underlying a transition, it biases the model toward the controllable degrees of freedom of the environment while discarding uncontrollable distractors. This yields stable latent world models trained from offline, reward-free trajectories, without frozen encoders, exponential moving averages, or complex latent regularizers. Empirically, SMWM learns compact, interpretable latent spaces and enables competitive planning performance across simple 2D and 3D control tasks.

14.
arXiv (CS.CV) 2026-06-19

One-Shot Novel View and Pose Human Image Synthesis via 3D Prior Guided Diffusion Model

This paper addresses the challenge of one-shot novel view and pose human image synthesis. The existing methods transfer the reference human image to a target pose using a set of 2D pose keypoints or synthesize human images based on generalizable human NeRF which uses human model priors to extract point-wise features. However, pose transfer based methods can not handle complex human pose using ambiguous 2D pose as the condition, while generalizable human NeRFs may be inaccurate to recover occluded/invisiable human parts without extracted reliable features. To solve these problems, we propose a novel approach for novel view and pose synthesis from a singe human image via conditional denoising diffusion model. Our diffusion model divides the novel view and pose synthesis problem into a sequence of conditional denoising steps. Specifically, to generate humans with complex and arbitrary poses, we introduce 3D human priors, i.e., 3D normal map and color prompt, as geometry and color conditions into the generation process. By transferring the reference human into the target human with a series of diffusion steps, our diffusion model enables high-quality synthesis including the occluded/invisible parts. Further, we propose a self-reconstruction based customized refinement to enhance fine details when tested on novel persons.Experimental results on different public datasets demonstrate that our approach significantly outperforms previous methods and also shows better generalization ability across datasets. The code will be made publicly available at https://github.com/Yankeegsj/3DPGDM.

15.
PLOS Computational Biology 2026-06-22

pyhgf: A neural network library for predictive coding

by Nicolas Legrand, Lilian Weber, Peter Thestrup Waade, Anna Hedvig Møller Daugaard, Mojtaba Khodadadi, Nace Mikuš, Christoph Mathys Bayesian models of cognition have gained considerable traction in computational neuroscience and psychiatry. Their scopes are now expected to expand rapidly to artificial intelligence, providing general inference frameworks to support embodied, adaptable, and energy-efficient autonomous agents. A central theory in this domain is predictive coding, which posits that learning and behaviour are driven by hierarchical probabilistic inferences about the causes of sensory inputs. Biological realism constrains these networks to rely on simple local computations in the form of precision-weighted predictions and prediction errors. This can make this framework highly efficient, but its implementation comes with unique challenges on the software development side. Embedding such models in standard neural network libraries often becomes limiting, as these libraries’ compilation and differentiation backends can force a conceptual separation between optimization algorithms and the systems being optimized. This critically departs from other biological principles such as self-monitoring, self-organisation, cellular growth, and functional plasticity. In this paper, we introduce pyhgf: a Python package backed by JAX and Rust for creating, manipulating, and sampling dynamic networks for predictive coding. We improve over other frameworks by enclosing the network components as transparent, modular, and malleable variables in the message-passing steps. The resulting graphs can implement arbitrary algorithms as belief propagation. Moreover, the transparency of core variables can also translate into inference processes that leverage self-organisation principles and express structure learning, meta-learning, or causal discovery as the consequence of network structural adaptation to surprising inputs. The main functions of the library are differentiable and seamlessly integrate into sampling or optimization workflows. Additionally, we offer generalized Bayesian filtering and the hierarchical Gaussian filter as key examples of dynamic networks implemented in our library. The source code, tutorials, and documentation are hosted under the main repository at https://github.com/ComputationalPsychiatry/pyhgf.

16.
arXiv (CS.AI) 2026-06-12

Definitional alignment before capability alignment: a Design-Science framework for adjudicating claims about AGI

arXiv:2606.12713v1 Announce Type: new Abstract: Claims that artificial general intelligence has already arrived and claims that it remains decades away are often defended from overlapping evidence. "AGI" lacks a single shared and stable referent and competing operationalizations can return different verdicts on the same system. This article treats that under-specification as a design and governance problem. Following Design Science Research Methodology, it develops DAF-AGI, a second-order conceptual artifact with two coupled components: five ordinal criteria for assessing the adjudicative fitness of candidate definitions and a structured governance audit of authorship, interest, certification, external verification and revision authority. The artifact is demonstrated on five prominent measurement families and one deflationary boundary position in a documented corpus and then stress-tested against a stylized strong arrival claim: that current generative systems constitute AGI because they outperform a well-educated adult on many cognitive tasks. On evidence from the cited 2024-2025 sources, the claim was certifiable only under a performance-based operationalization; capability-ontology, psychometric and skill-acquisition approaches did not certify it, the economic family remains indeterminate and the deflationary position refuses binary adjudication. The contribution is a novel integration and operationalization, not an empirical validation: independent application, inter-rater testing and author-external cases remain necessary. The paper further proposes definitional sovereignty as an enabling component of algorithmic sovereignty: the institutional capacity to contest, certify and revise imported technological categories under public accountability.

17.
medRxiv (Medicine) 2026-06-23

Antibodies against influenza A/H1N1pdm2009 and B/Victoria strains but not A/H3N2 are increased in recent onset type 1 narcolepsy versus matched controls

Study Objectives: Onsets of Narcolepsy type-1 (NT1) increased following A/H1N1 vaccination with PandemrixTM in Europe and with A/H1N1pdm2009 infections in China and other countries. To test if other strains could trigger narcolepsy, we measured strain-specific antibodies in patients with recent onset NT1 compared to controls. Methods: Antibodies against hemagglutinin (HA) and neuraminidase (NA) were tested in 62 patients with very recent onset (onset and blood collection following a single flu season, mean +/- SEM: 0.44 +/- 0.06 years since onset) and 100 controls matched by age, sex, season and year of collection (2000-2025). Results were next extended to 181 recent onset patients (mean +/- SEM: 1.00 +/- 0.05 years) versus 260 controls, matched by sex, season and year, but having a slightly higher mean age. HA inhibition (HAI) and NA inhibition (NAI) assays were conducted using flu strains known to circulate during the corresponding flu seasons. HAI results are shown as % positive (titers >= 40) and NAI results as geometric mean titers. Odds ratio (OR) and coefficient were used to compare antibody titers in NT1 versus controls. The contribution of each assay to prediction was finally quantified in the larger sample set using Shapley decomposition. Results: NT1 patients had increased anti-HA and anti-NA antibodies against A/H1N1pdm2009 (anti-HA OR = 3.86, anti-NA coefficient = 0.35) and B/Victoria (anti-HA OR =1.90, anti-NA coefficient = 0.22), but not A/H1N1pre2009, A/H3N2, or B/Yamagata, independent of HLA-DQB1*06:02 status, age, sex, and flu season. Correlations between anti-HA and anti-NA antibodies titers were weak to moderate but significant (r2=-0.10 to 0.34). Multivariable model outperformed age-only baseline (McFadden R2 = 0.19 vs. 0.03; AUC = 0.79 vs. 0.64; likelihood-ratio test X2 = 51, p

18.
arXiv (CS.CV) 2026-06-19

Benchmarking Vision Foundation Models for Domain-Generalizable Face Anti-Spoofing

Face Anti-Spoofing (FAS) remains challenging due to the requirement for robust domain generalization across unseen environments. While recent trends leverage Vision-Language Models (VLMs) for semantic supervision, these multimodal approaches often demand prohibitive computational resources and exhibit high inference latency. Furthermore, their efficacy is inherently limited by the quality of the underlying visual features. This paper revisits the potential of vision-only foundation models to establish a highly efficient and robust baseline for FAS. We conduct a systematic benchmarking of 15 pre-trained models, such as supervised CNNs, supervised ViTs, and self-supervised ViTs, under severe cross-domain scenarios including the MICO and Limited Source Domains (LSD) protocols. Our comprehensive analysis reveals that self-supervised vision models, particularly DINOv2 with Registers, significantly suppress attention artifacts and capture critical, fine-grained spoofing cues. Combined with Face Anti-Spoofing Data Augmentation (FAS-Aug), Patch-wise Data Augmentation (PDA) and Attention-weighted Patch Loss (APL), our proposed vision-only baseline achieves state-of-the-art performance in the MICO protocol. This baseline outperforms existing methods under the data-constrained LSD protocol while maintaining superior computational efficiency. This work provides a definitive vision-only baseline for FAS, demonstrating that optimized self-supervised vision transformers can serve as a backbone for both vision-only and future multimodal FAS systems. The project page is available at: https://gsisaoki.github.io/FAS-VFMbenchmark-CVPRW2026/ .

19.
arXiv (CS.AI) 2026-06-24

When AI Meets Finance (StockAgent): Large Language Model-based Stock Trading in Simulated Real-world Environments

arXiv:2407.18957v5 Announce Type: replace-cross Abstract: Can AI Agents simulate real-world trading environments to investigate the impact of external factors on stock trading activities (e.g., macroeconomics, policy changes, company fundamentals, and global events)? These factors, which frequently influence trading behaviors, are critical elements in the quest for maximizing investors' profits. Our work attempts to solve this problem through large language model based agents. We have developed a multi-agent AI system called StockAgent, driven by LLMs, designed to simulate investors' trading behaviors in response to the real stock market. The StockAgent allows users to evaluate the impact of different external factors on investor trading and to analyze trading behavior and profitability effects. Additionally, StockAgent avoids the test set leakage issue present in existing trading simulation systems based on AI Agents. Specifically, it prevents the model from leveraging prior knowledge it may have acquired related to the test data. We evaluate different LLMs under the framework of StockAgent in a stock trading environment that closely resembles real-world conditions. The experimental results demonstrate the impact of key external factors on stock market trading, including trading behavior and stock price fluctuation rules. This research explores the study of agents' free trading gaps in the context of no prior knowledge related to market data. The patterns identified through StockAgent simulations provide valuable insights for LLM-based investment advice and stock recommendation. The code is available at https://github.com/MingyuJ666/Stockagent.

20.
arXiv (CS.CL) 2026-06-11

Notes2Skills: From Lab Notebooks to Certainty-Aware Scientific Agent Skills

Scientific discovery workflows usually contain and rely heavily on lab notes, where researchers record observations, interpret uncertain results, and plan follow-up experiments. Such informative lab notes preserve evolving scientific reasoning and author uncertainty, rather than polished final results exhibited in publications, providing a valuable opportunity for AI to engage in scientific exploration at a more comprehensive and deeper level. However, most prior work on scientific text focuses on papers, protocols, or structured databases, leaving informal laboratory notes underexplored as inputs to AI agents for science. This gap matters because lab notes often intermingle validated observations, tentative judgments, and possible experimental next steps within the same passage. If these signals are conflated, an AI agent may mistake uncertain scientific judgments for confirmed conclusions or executable actions. To this end, we present Notes2Skills, a two-stage framework for turning lab notebooks into verifiable skills for scientific AI agents while preserving the author's certainty. Across seven conditions and three wet-lab sessions, Notes2Skills is the only configuration that neither mistakes uncertain notes for firm instructions nor discards firm ones. We show that certainty preservation is the missing piece between lab notebooks and reliable agent skills, opening a path toward safer AI co-scientist systems.

21.
medRxiv (Medicine) 2026-06-16

Sleep regularity outweighs sleep duration as a predictor of disease

Sleep regularity, the consistency of sleep-wake timing from one day to the next, is more strongly associated with longevity than adequate sleep duration. Whether this relationship persists across common diseases is unknown. We compared sleep regularity vs. sleep duration as risk factors for 199 diseases and disorders, using ten million hours of objective sleep-wake data (N=60,998, age[mean{+/-}SD]=62.8{+/-}7.8, 55% female). Multivariable-adjusted risks of incident diseases/disorders for regular/irregular and short/adequate sleepers were compared across 9.5 years of follow-up. Irregular sleep predicted risks for 131 diseases/disorders, more than double the number predicted by short sleep duration (63). Irregular sleep was a superior predictor than short sleep duration for 90 diseases/disorders, including circulatory, metabolic, digestive, renal, infectious, neurological, and musculoskeletal conditions, and mental disorders, whereas short sleep duration was the superior predictor for only 9 diseases/disorders. For models where short sleep duration explained disease risks, 83% were improved by adding sleep regularity. Sleep regularity was a stronger predictor of diseases/disorders than sleep duration in this cohort and should be considered an essential dimension of sleep health.

22.
arXiv (CS.CV) 2026-06-12

Flex4DHuman: Flexible Multi-view Video Diffusion for 4D Human Reconstruction

We present Flex4DHuman, a multi-view video diffusion model that transforms a monocular or sparse multi-view video of a dynamic subject into synchronized dense multi-view videos using only relative camera-pose conditioning. Unlike prior human-centric methods that rely on skeletons, depth maps, normals, or rendered target-view geometry, Flex4DHuman requires no explicit geometry priors and instead conditions generation through relative camera-pose positional encoding. The generated videos can be directly ingested by downstream reconstruction pipelines to create dynamic 4D Gaussian splats. Built on the Wan 2.1 1.3B text-to-video model, Flex4DHuman preserves the backbone architecture and encodes camera and view information through a five-axis positional encoding that extends spatio-temporal RoPE with view indices and continuous SE(3) relative camera geometry. A three-stage curriculum progressively trains the model for pose following, flexible reference-to-target view generation, and temporal rollout. To support temporal rollout, we train with clean historical target-view tokens. We also add multi-view captions to enable test-time text control. Combined with an off-the-shelf 4D Gaussian Splatting stage, our framework lifts monocular static-camera videos into dynamic 4D Gaussian splats. Experiments on DNA-Rendering and ActorsHQ show that Flex4DHuman surpasses prior state-of-the-art methods, while the same formulation generalizes to animal categories after mixed human-animal training. These capabilities make Flex4DHuman a practical step toward scalable 4D content creation from casual monocular videos for simulation, gaming, AR/VR, and video re-shooting.

23.
arXiv (math.PR) 2026-06-12

Scaling limit of additive functionals for reversible non-gradient exclusion process: critical cases

arXiv:2606.13442v1 Announce Type: new Abstract: For the reversible speed-change exclusion process $(\eta_t)_{t \geq 0}$ in $\mathbb{Z}^d$, we study the scaling limit of additive functionals ${\Gamma_t(f) = \int_0^t f(\eta_s)\, \mathrm{d} s}$. Concerning the local centered function $f$, the previous work [Commun. Math. Phys. 104, 1-19, 1986] by Kipnis and Varadhan and [Comm. Pure Appl. Math., 66: 649-677, 2013] by Gon{ç}alves and Jara respectively covered the cases $d \geq 3$ and $d=1$. The present paper completes the missing part $d=2$, and also develops the theory for functions with higher degree. The novelty is a quantitative homogenization of the resolvent, which allows to overcome the obstacle of correlation function in non-gradient models.

24.
arXiv (CS.AI) 2026-06-16

Gender Differences in AI Literacy Workshop Outcomes and Deepfake Engagement

arXiv:2606.14718v1 Announce Type: cross Abstract: As Artificial Intelligence (AI) literacy initiatives expand in K-12 settings, understanding how gender shapes student baseline perceptions, tool-use, and responsiveness to interventions is essential for equitable curriculum design. This study examines gender differences in AI literacy, safety awareness, and STEM career aspirations among Australian secondary students (Years 7, 8, and 10; N(pre) = 199, n(post) = 136) from two co-educational government schools who participated in a one-day AI literacy workshop. Using statistical regression methods controlling for year level and school, we found that pre-workshop, male students reported significantly higher STEM career interest across all three domains (AI, computer science, and engineering), while female students were significantly more likely to use AI for schoolwork and to seek advice from AI tools. Gender-differentiated patterns also emerged in deepfake behaviours: males were significantly more likely to have created or shared deepfake content. Both genders improved in AI knowledge post-intervention, yet females showed a richer profile of gains: wider conceptual understanding, greater confidence, and meaningful increases in AI and CS career interest that partially narrowed the gender STEM gap. These findings highlight the need for gender-responsive AI curricula, particularly deepfake safety education for male students, and demonstrate that even single-day workshops can narrow gender gaps in STEM aspirations and AI confidence.

25.
arXiv (CS.LG) 2026-06-15

A General Framework for Decision Trees via Bregman Divergences

arXiv:2606.13984v1 Announce Type: cross Abstract: Decision trees are one of the fundamental tools in statistical learning due to their interpretability, flexibility, and their ability to adapt to nonlinear structures. Among them, the Classification and Regression Trees, introduced by Breiman, Friedman, Olshen, and Stone in 1984, became one of the most influential algorithms and remains one of the most widely used methods for classification and regression problems. On the other hand, Bregman divergences, introduced by Lev Bregman in 1967 in the context of convex optimization, provide a broad family of loss functions that naturally generalize the squared Euclidean distance. This family includes, among others, the Kullback-Leibler divergence, the Poisson divergence, and the Itakura-Saito divergence, as well as several losses associated with distributions belonging to the exponential family. Moreover, Bregman divergences possess a rich geometric structure and deep connections with convex analysis and information geometry. In this work, we propose a generalization of the CART paradigm based on Bregman divergences, thereby obtaining a broader family of decision trees adapted to different statistical models and underlying geometries. Although algorithms such as CART or classical implementations such as rpart incorporate different impurity criteria, these are usually introduced in an ad hoc manner for each specific model. In contrast, the Bregman divergence approach provides a unified framework that allows these criteria to be derived and interpreted from common convex and geometric principles. Beyond the algorithmic construction, we also investigate theoretical properties of these trees. In particular, we study how properties of the generating convex function – such as strong convexity or smoothness – influence impurity gains between parent and child nodes, as well as stability and consistency properties of the estimator.