Academic Intelligence · Curated Daily

Explore the Frontiers of Global Research

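AcademicHub brings together real-time literature from top journals and preprint platforms. Customize your own research radar and use large language models to automatically generate cross-disciplinary literature analysis briefs.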

01.
PLOS Computational Biology 2026-06-02

Assessing the importance of sex and disease-specific anatomy in electrophysiology and mechanical simulations with a newly developed public virtual cohort of four-chamber heart models

by José Alonso Solís-Lemus, Rosie K. Barrows, Cristobal Rodero, Marina Strocchi, Natalie Montarello, Nishant Lahoti, Cesare Corrado, Abdul Qayyum, Shahrokh Rahmani, Caroline Roney, Gernot Plank, Christoph Augustin, Hao Xu, Alistair Young, Pras Pathmanathan, Ronak Rajani, Steven A. Niederer

This work presents a study on how differences in cardiac anatomy attributed to sex and disease can influence cardiac electrophysiology and mechanics using a virtual cohort of four-chamber heart models. Patient anatomy varies across sex and disease. However, this variation remains poorly accounted for in in-silico studies, which often use either single representative cases or imbalanced virtual cohorts. Whole-heart electromechanics models incorporate the patient’s anatomy, electrophysiology and mechanics across scales, from the molecular and tissue levels to the whole-heart and circulatory-system levels. However, cardiac models are typically built from one or a small number of anatomies, with sex rarely reported and the effects of anatomical variability, which include those due to sex or disease, largely unexplored. This limits clinical translation and reduces regulatory credibility. We developed fifty patient-specific anatomical models of 25 male and 25 female hearts in heart failure and control cases. We ran benchmark passive inflation and paced activation simulations with consistent parameters and boundary conditions across cases to isolate the impact of anatomical variations with sex and disease. Heart failure models exhibited increased chamber volumes, larger volume changes during inflation, and delayed activation times relative to controls. These trends were consistent across sexes, although right ventricular activation showed a significant sex-based difference. Variations in anatomy with sex and disease have a significant impact on cardiac simulations, which supports the inclusion of multiple heart anatomical models in in-silico trials. The resulting virtual cohort captures key anatomical variability and is publicly available, along with the underlying code (see Data Availability statement).

02.
arXiv (CS.AI) 2026-06-17

An AI Security Agent for Banking: Multi-Vector Fraud and AML Detection Across Retail and Corporate Accounts

arXiv:2606.17555v1 Announce Type: cross Abstract: Banks simultaneously face signature-based fraud (card-not-present attacks, account takeover, ATM cloning) and behavioural financial crime (structuring, layering, mule networks, business email compromise) – two threat families with fundamentally different detection requirements. Static rule engines that reliably catch brute-force and high-velocity events are structurally blind to business-email-compromise (BEC) payment redirection, session hijacking, and money-laundering layering, which are engineered to appear indistinguishable from legitimate activity at the individual transaction or session level. This paper presents an AI security agent for retail and corporate banking that addresses this gap through a three-component fusion architecture operating on two parallel event streams: a transaction stream (card fraud, ACH/wire fraud, AML categories) and a session stream (account takeover, session hijacking, SIM-swap, insider abuse). Each stream combines an LSTM sequence model capturing per-account behavioural history, a statistical velocity/threshold monitor, and a graph/network module capturing account-counterparty relationship patterns (fan-in, fan-out, pass-through ratio) for money-laundering detection. Experiments on a synthetic event log of 237,669 transactions and 113,508 sessions across 13 threat categories and 3,470 simulated accounts demonstrate overall F1 of 0.787 (transaction stream) and 0.867 (session stream) for the proposed model, versus 0.562/0.733 for a rule-based baseline and 0.655/0.713 for an LSTM-only baseline. The agent includes a customer-facing transaction-verification chatbot (96.6% identity verification accuracy, 86.8% mass-reset attack detection) and an analyst case-summary assistant (99.3% action-recommendation F1), with Critical-tier automated response latency under 0.43 ms at the 95th percentile.
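The graph/network module is described only by the features it uses (fan-in, fan-out, pass-through ratio). The following is a minimal sketch of how such per-account features might be computed from a transfer list. The function name `counterparty_features`, the input layout, and in particular the pass-through formula (smaller of inflow and outflow divided by the larger) are assumptions made for illustration; the abstract does not define them.

```python
from collections import defaultdict


def counterparty_features(transfers):
    """Compute per-account graph features from (sender, receiver, amount) tuples.

    Hypothetical helper: the feature definitions are assumptions, not the paper's.
    """
    senders = defaultdict(set)     # account -> distinct accounts that paid it
    receivers = defaultdict(set)   # account -> distinct accounts it paid
    inflow = defaultdict(float)
    outflow = defaultdict(float)

    for src, dst, amount in transfers:
        receivers[src].add(dst)
        senders[dst].add(src)
        outflow[src] += amount
        inflow[dst] += amount

    features = {}
    for acct in set(inflow) | set(outflow):
        total_in, total_out = inflow[acct], outflow[acct]
        larger = max(total_in, total_out)
        # Assumed definition: close to 1.0 when money leaves almost as fast
        # as it arrives, which is typical of a pass-through (mule) account.
        pass_through = min(total_in, total_out) / larger if larger > 0 else 0.0
        features[acct] = {
            "fan_in": len(senders[acct]),
            "fan_out": len(receivers[acct]),
            "pass_through": pass_through,
        }
    return features


if __name__ == "__main__":
    demo = [("a", "m", 100.0), ("b", "m", 50.0), ("m", "x", 140.0)]
    print(counterparty_features(demo)["m"])
```

In the demo, account `m` receives from two senders and forwards most of the money to a single receiver, so it has high fan-in and a pass-through ratio near 1.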

03.
arXiv (CS.CL) 2026-06-17

An expressivity analysis of hierarchical modelling in deep transformers via bounded-depth grammars

Deep neural networks are widely believed to derive their expressive power from their ability to form hierarchical representations, capturing progressively more abstract and compositional features across layers. In language modeling, transformers have emerged as the dominant architecture, with early layers capturing local syntactic patterns and later layers encoding more complex clause-level dependencies. While this intuition has shaped model design, there remains a lack of rigorous theoretical work demonstrating how deep transformers represent such hierarchical structures. In this work, we analyze the expressiveness of deep transformer models through the formal lens of bounded-depth, non-recursive context-free grammars. For this class of grammars, we explicitly construct transformers with positional attention whose depth grows linearly with grammar depth, while the neuron count scales with the number of derivation-tree shapes and quadratically with the number of production rules. Our theoretical results support the linear representation hypothesis by demonstrating that these architectures possess the structural capacity to encode abstract grammatical states into low-dimensional, linearly separable subspaces within the residual stream.

04.
arXiv (math.PR) 2026-06-16

The optimal sub-Gaussian normalisation for randomised monotone functions

arXiv:2312.01265v5 Announce Type: replace Abstract: Let $\mathcal{M}$ denote the class of randomised monotone functions on $\mathbb{R}$ with values in $[0,1]$, and let $U_{\mathcal{M}}\colon \mathbb{R}_+\to \mathbb{R}_+$ be the minimal function for which $$ \mathbb{P}\left\{ \sqrt{\eta_f}\, \sup_{t\in\mathbb{R}} \left| f_Z(t) - \Exf{f_Z(t)} \right| \ge \varepsilon\sqrt{U_{\mathcal{M}}(\eta_f)} \right\} \le 2\e^{-2\varepsilon^2} $$ holds for every member $f_Z$ of $\mathcal{M}$ with finite effective sample size $\eta_f$ and every positive $\varepsilon$. We prove that for every $x> 1$, $$ \left| \sqrt{U_{\mathcal{M}}(x)} - \sqrt{\log_4 x} \right| \le 2 \min\!\left\{ 1,\, \frac{2 \ln(\e + \ln x)}{\sqrt{\ln x}} \right\}\,. $$ The optimal adjustment $\sqrt{U_{\mathcal{M}}(x)}$ matches $\frac{1}{\sqrt{2\ln 2}}\sqrt{\ln x}$ for all $x>1$, with residuals bounded as above.
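The constant $\frac{1}{\sqrt{2\ln 2}}$ in the last sentence is just the change of base from $\log_4$ to $\ln$. Spelled out, using only the quantities in the abstract:

```latex
\log_4 x \;=\; \frac{\ln x}{\ln 4} \;=\; \frac{\ln x}{2\ln 2}
\qquad\Longrightarrow\qquad
\sqrt{\log_4 x} \;=\; \frac{1}{\sqrt{2\ln 2}}\,\sqrt{\ln x}
\;\approx\; 0.849\,\sqrt{\ln x}, \qquad x > 1 .
```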

05.
arXiv (CS.CL) 2026-06-12

MaxProof: Scaling Mathematical Proof with Generative-Verifier RL and Population-Level Test-Time Scaling

We present MaxProof, a population-level test-time scaling framework for competition-level mathematical proof in the MiniMax-M3 series. M3 first trains three proof-oriented capabilities – proof generation, proof verification, and critique-conditioned proof repair – using a defense-in-depth generative verifier engineered for low false-positive rate. These capabilities are merged into a single released M3 model. At test time, MaxProof treats the model as a generator, verifier, refiner, and ranker, searches over a population of candidate proofs, and returns one final proof through tournament selection. With MaxProof test-time scaling, the M3 model reaches 35/42 on IMO 2025 and 36/42 on USAMO 2026, exceeding the human gold-medal threshold on both.
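The abstract does not say how the tournament is organised. As a rough illustration of tournament selection over a population of candidates, here is a single-elimination sketch. The pairwise comparator `prefer` stands in for the model acting as ranker and is hypothetical, as are the function name and the bye rule for odd-sized rounds.

```python
import random


def tournament_select(candidates, prefer, rng=None):
    """Return one winner from `candidates` by single-elimination tournament.

    `prefer(a, b)` is a hypothetical comparator returning True when `a`
    should advance over `b`. This is an illustrative sketch, not MaxProof.
    """
    pool = list(candidates)
    if not pool:
        raise ValueError("need at least one candidate")
    rng = rng or random.Random(0)
    rng.shuffle(pool)  # random bracket seeding

    while len(pool) > 1:
        next_round = []
        # Pair up neighbours; the winner of each pair advances.
        for i in range(0, len(pool) - 1, 2):
            a, b = pool[i], pool[i + 1]
            next_round.append(a if prefer(a, b) else b)
        if len(pool) % 2 == 1:
            next_round.append(pool[-1])  # unpaired candidate gets a bye
        pool = next_round
    return pool[0]


if __name__ == "__main__":
    # Toy population: (proof_id, verifier_score) pairs with made-up scores.
    population = [("p1", 0.2), ("p2", 0.9), ("p3", 0.5), ("p4", 0.7), ("p5", 0.1)]
    print(tournament_select(population, lambda a, b: a[1] >= b[1]))
```

With a consistent (transitive) comparator the winner is the best candidate regardless of seeding; with a noisy model-based comparator the result depends on the bracket, which is one reason a real system might repeat or vary the tournament.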

06.
arXiv (CS.AI) 2026-06-16

Optimising Temporary Accommodation Placement Across London with AI-Powered SaaS in E-Governance Systems

arXiv:2606.16652v1 Announce Type: cross Abstract: Temporary accommodation has become a major fiscal and administrative pressure for English local authorities, particularly in London, where demand and costs have risen sharply. This paper documents the creation and use of DOMUS, a cloud-based, AI-enabled decision-support system built from scratch at the University of East London and customised for the needs of the London Borough of Newham to support statutory temporary accommodation (TA) placement. DOMUS integrates household case records, policy-constrained affordability and suitability rules, and live private-rental listings within a single governance-aligned workflow. The system combines transparent, rule-based filtering with large language model-assisted search to standardise the application of bedroom need, affordability thresholds, geographic preferences, and accessibility requirements, while preserving officer discretion and auditability. Household and property attributes are encoded into policy-consistent representations prior to AI-assisted ranking and explanation. A pilot deployment in Newham's secure environment evaluated operational performance relative to manual workflows. Results indicate substantial reductions in search time, improved adherence to key placement constraints, and high staff satisfaction, while maintaining statutory compliance and role-based accountability. Beyond TA, the paper frames DOMUS as replicable digital public infrastructure: a modular, cloud-native Software-as-a-Service architecture that can be deployed across other UK boroughs and adapted to other public administration tasks characterised by scarcity, rule-bound eligibility, and high stakes. The findings demonstrate the feasibility of scalable, ethically governed AI deployment in local government and contribute to debates on AI-enabled public value creation in e-governance.

07.
arXiv (CS.AI) 2026-06-16

Greed Is Learned: Visible Incentives as Reward-Hacking Triggers

arXiv:2606.16914v1 Announce Type: new Abstract: Deployed agents increasingly act with their reward proxy in view, such as a balance, score, or KPI dashboard. We show that reinforcement learning can make a policy addicted to such a visible self-benefit channel. It chases the displayed payoff across held-out domains, sacrifices the true task to do so, and follows the channel wherever we rewrite it, while policies that never saw the channel stay honest. We call this reward-channel addiction and study it in MoneyWorld, a synthetic sandbox. The addiction can flip a model's safety alignment: trained only on innocuous money tasks with no safety content, the model abandons the safe action it otherwise always takes whenever a dashboard pays for an unsafe one, and reverts to safe once the channel is hidden. This learned bribe replicates across model scales and families. Blindly optimizing super-capable, next-generation AI on KPIs or P&L can be dangerous for alignment. Greed is learned when following such a channel pays.

08.
arXiv (math.PR) 2026-06-17

Optional Stopping for Superhedging Supermartingales

arXiv:2606.17452v1 Announce Type: new Abstract: Superhedging supermartingales, introduced by the authors in previous work, are non-probabilistic processes defined via subadditive outer integrals that carry a purely financial interpretation in terms of superhedging cost. Building on the Leinert-König theory of non-lattice integration, the present paper establishes several results that are classical in probability theory but whose non-probabilistic proofs require fundamentally new arguments: (i) a tower inequality for the conditional outer integral $\overline{\sigma}_j$ applied at stopping times, reducing to equality when the integrand is conditionally integrable; (ii) three versions of Doob's optional stopping theorem, organised by the class of supermartingale and the range of the stopping times; and (iii) Dubins' upcrossing inequality in both finite- and infinite-time horizons. A key structural result, property (K)-a.e., identifies conditions under which the two superhedging operators $\overline{\sigma}_j$ and $\overline{I}_j$ coincide on non-negative functions, extending the scope of all preceding results to the positive operator $\overline{I}_j$. None of the proofs invoke classical measure-theoretic tools; in particular, (classical) integrability and measurability are not assumed. The analogues of classical stochastic results acquire a purely financial interpretation and, in this way, gain depth and generality by providing a context that is independent of any a priori probabilistic structure.

09.
medRxiv (Medicine) 2026-06-15

Primary care practitioners preconception health literacy and information-seeking: A cross-sectional survey.

Background: Parental health before pregnancy influences maternal and child outcomes. Primary care professionals, including general practitioners (GPs), midwives, and naturopaths, can provide preconception care, yet many report limited knowledge and difficulty accessing relevant information. This study described the preconception health literacy of Australian GPs, midwives, and naturopaths, including their knowledge and ability to access information. Methods: Between July and September 2022, Australian GPs, midwives, and naturopaths completed a 32-item online cross-sectional survey. Participants were recruited through professional associations, and data were analysed using descriptive and inferential statistics. Results: Participants (N=373) included naturopaths (40.7%), GPs (32.4%), and midwives (26.8%). Reported barriers to clinician health literacy included a lack of preconception care resources (25.5%) and limited clinician knowledge (23.6%). The proportion identifying limited clinician knowledge differed significantly between professions (GP: 31.4%; midwives: 23.0%; naturopaths: 17.8%; p=0.030). The highest level of accurate knowledge regarding preconception exposures was for pre-pregnancy obesity (82.7%), while low birth weight was the most accurately identified preconception outcome (83.7%). Incorrect responses were most common for maternal multivitamin use as an exposure (28.3%) and childhood leukaemia as an outcome (26.3%). Differences between professions were strongest for infant outcomes, with moderate associations observed for shoulder dystocia (V=.2355), precipitous labour (V=.2173), macrosomia (V=.2060), labour dystocia (V=.2018) and cryptorchidism (V=.2018). Discussion: Preconception health literacy varies across primary care professions. Clinicians require greater access to targeted resources and education tailored to their differing scopes of practice and experience. Improving clinician preconception health literacy may strengthen consistent evidence-based care and support better maternal, child, and long-term family health outcomes.

10.
Nature (Science) 2026-06-24

GW250114 reveals signatures of post-merger black-hole horizon

The horizon of a black hole, the ‘surface of no return’, is characterized by its rotation frequency $\Omega_H$ and surface gravity $\kappa$. A striking signature is that any infalling object appears to orbit at $\Omega_H$ owing to frame dragging, while its emitted signals decay exponentially at a rate set by $\kappa$ as a consequence of gravitational redshift. Recent theoretical work [1] predicts that gravitational waves from binary black-hole mergers carry direct imprints of the properties of the merger remnant in the form of a ‘direct wave’. This gravitational-wave component oscillates near $2\Omega_H$, reflecting the horizon’s frame dragging, and decays at an increasing rate characterized by $\kappa$, with additional screening from the black hole’s spacetime. Here we report observational evidence of a direct wave in GW250114 [2], with a 90% credible matched-filter signal-to-noise ratio of $15.8_{-0.5}^{+0.1}$ ($17.1_{-0.4}^{+0.1}$) in the LIGO Hanford (Livingston) detector. The measured properties are in full agreement with theoretical predictions for a Kerr black hole. These findings establish an observational channel to directly measure frame-dragging effects in black-hole ergospheres and explore (near-)horizon physics in dynamical, strong-gravity regimes. The observation of a direct wave after the merger of two black holes reveals signatures associated with the remnant black-hole horizon, establishing an observational channel to directly measure frame-dragging effects in black-hole ergospheres and probe the horizon surface gravity.

11.
medRxiv (Medicine) 2026-06-18

Maternal and fetal HLA heterozygosity in preeclampsia: Insights from a large multi-ancestry pregnancy cohort

Preeclampsia (PE) is a leading cause of maternal and neonatal morbidity, with immune dysregulation at the maternal-fetal interface central to its pathogenesis. The highly polymorphic human leukocyte antigen (HLA) region mediates maternal immune tolerance of the semi-allogeneic fetus, yet the contribution of HLA diversity to PE risk remains poorly defined. Whether the HLA heterozygote advantage observed in other immune disorders is relevant to PE has not been systematically evaluated. Using data from the multi-ancestry TOPMed Boston-Colombia Collaborative for Adverse Pregnancy Outcomes (n = 12,790; 4,770 PE, 8,020 controls; 10,808 maternal, 1,982 fetal, including 1,848 pairs), we evaluated associations between heterozygosity across eight classical HLA loci and PE and four sub-phenotypes, adjusting for genetic ancestry. HLA heterozygosity was common across most loci (>80%). No individual maternal HLA locus was associated with overall PE; however, heterozygosity across class I loci showed a protective effect in preterm PE (OR=0.82, 95% CI: 0.69-0.97), with a similar pattern for HLA-A heterozygosity (OR=0.78, 95% CI: 0.64-0.96). In contrast, fetal heterozygosity at HLA-DQB1 was nominally associated with increased risk of PE (OR=1.36, 95% CI: 1.03-1.79) and preterm PE (OR=1.73, 95% CI: 1.13-2.73). No individual maternal or fetal HLA alleles were associated with PE. Maternal-fetal mismatch analysis demonstrated locus-specific associations with preterm PE, including increased risk with HLA-DQA1 mismatch and reduced risk with HLA-C mismatch. These findings highlight distinct maternal and fetal immunogenetic contributions to PE risk and underscore the importance of considering HLA diversity, rather than individual alleles alone, in studies of PE etiology.

12.
arXiv (quant-ph) 2026-06-12

Trading symmetry for Hilbert-space dimension in Bell-inequality violation

arXiv:2601.02893v3 Announce Type: replace Abstract: In quantum information, asymmetry, i.e., the lack of symmetry, is a resource allowing one to accomplish certain tasks that are otherwise impossible. Similarly, in a Bell test using any given Bell inequality, the maximum violation achievable using quantum strategies respecting or disregarding a certain symmetry can be different. In this work, we focus on the symmetry involved in the exchange of parties and explore when we have to trade this symmetry for a lower-dimensional quantum strategy in achieving the maximal violation of given Bell inequalities. For the family of symmetric Collins-Gisin-Linden-Massar-Popescu inequalities, we provide evidence showing that there is no such trade-off. However, for several other Bell inequalities with a small number of dichotomic measurement settings, we show that symmetric quantum strategies in the minimal Hilbert space dimension can only lead to a suboptimal Bell violation. In other words, there exist symmetric Bell inequalities that can only be maximally violated by asymmetric quantum strategies of minimal dimension. In contrast, one can also find examples of asymmetric Bell inequalities that are maximally violated by symmetric correlations. The implications of these findings on the geometry of the set of quantum correlations and the possibility of performing self-testing therefrom are briefly discussed.

13.
arXiv (quant-ph) 2026-06-24

Logical qubits with erasure conversion using metastable neutral atoms

arXiv:2506.13724v2 Announce Type: replace Abstract: Implementing large-scale quantum algorithms with practical advantage will require fault-tolerance achieved through quantum error correction, but the associated overhead is prohibitive. This overhead can be reduced by engineering physical qubits with fewer errors, and by shaping the residual errors to be more easily correctable. In this work, we demonstrate quantum error correcting codes and logical qubit circuits in a metastable ytterbium-171 nuclear spin qubit with a noise bias towards erasure errors. These errors can be located separately from any syndrome information diagnosing the error, and we demonstrate adaptive circuit execution based on erasure information. We show that dephasing errors on the qubit during coherent transport can be strongly suppressed, and implement entangling gates that maintain a high fidelity in the presence of gate beam inhomogeneity or pointing errors. Furthermore, we demonstrate logical qubit encoding in the [[4, 2, 2]] code, with error correction during decoding based on mid-circuit erasure measurements despite the fact that the code is too small to correct any Pauli errors. Finally, we demonstrate logical qubit teleportation between multiple code blocks with conditionally selected ancillas based on mid-circuit erasure checks, a key part of leakage-robust error correction schemes using neutral atoms.
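The remark that the [[4, 2, 2]] code can use erasure information even though it cannot correct Pauli errors can be illustrated with its two standard stabilizers, XXXX and ZZZZ. The sketch below is an assumption-laden toy, not the experiment's decoder: it works with Pauli strings only, and the helper names (`syndrome`, `decode_erasure`) are made up for illustration. It shows that two different single-qubit errors can share a syndrome, while a known error location makes the syndrome unambiguous.

```python
STABILIZERS = ("XXXX", "ZZZZ")  # standard stabilizer generators of the [[4,2,2]] code


def anticommutes(p, q):
    """Pauli strings anticommute iff they clash (both non-identity and
    different) on an odd number of qubits."""
    clashes = sum(1 for a, b in zip(p, q) if a != "I" and b != "I" and a != b)
    return clashes % 2 == 1


def syndrome(error):
    """One bit per stabilizer: 1 if the error flips that stabilizer's sign."""
    return tuple(int(anticommutes(error, s)) for s in STABILIZERS)


def single_qubit_error(pauli, qubit, n=4):
    return "I" * qubit + pauli + "I" * (n - qubit - 1)


def decode_erasure(erased_qubit, observed_syndrome):
    """Given the flagged location, find the Pauli on that qubit that
    explains the syndrome (hypothetical helper for illustration)."""
    for pauli in "IXYZ":
        if syndrome(single_qubit_error(pauli, erased_qubit)) == observed_syndrome:
            return pauli
    return None


if __name__ == "__main__":
    # Without location information, X on qubit 0 and X on qubit 1 look identical.
    print(syndrome("XIII"), syndrome("IXII"))
    # With the erasure flag on qubit 2, the syndrome pins down the Pauli.
    print(decode_erasure(2, (1, 1)))
```

Every single-qubit Pauli error gives a non-zero syndrome (detection), but the syndrome alone cannot say which qubit was hit; the erasure flag supplies exactly that missing information.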

14.
arXiv (quant-ph) 2026-06-17

Quantum Routers: A Switching-Fabric Framework for Quantum-Native Forwarding

arXiv:2606.17773v1 Announce Type: new Abstract: Forwarding in quantum networks cannot be realized by directly transposing classical switching fabrics, since the no-cloning theorem and the quantum measurement postulate constrain the direct relay of quantum information while ruling out copy-based buffering and inspection. In this paper, we propose a switching-fabric framework for quantum routers based on multipartite entanglement. Specifically, we formalize the notion of an entanglement-based switching fabric, in which a graph state acts as the forwarding resource and entanglement forwarding is realized through local Pauli measurements. We translate the classical notions of blocking and non-blocking operation into structural conditions for entanglement-based fabrics, by deriving the edge-controlled (EC) design principle for non-blocking operation. We instantiate this principle through a monolithic EC crossbar and a modular Clos-type EC fabric, for which we characterize resource scaling and identify the regime where the modular design becomes more resource-efficient than the monolithic one. Finally, a forwarding-latency analysis establishes a fundamental distinction between matching-oblivious and matching-driven forwarding: the proposed EC fabrics realize all requested input-output entanglement links with constant forwarding depth under sufficient measurement parallelism, whereas matching-driven EPR-based fabrics exhibit latency that scales with the number of requested connections. The proposed framework provides a hardware-agnostic foundation for quantum-router switching fabrics.
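How local Pauli measurements on a graph state can create an input-output link is easiest to see with the standard graph rules: a Z measurement deletes the measured vertex, and a Y measurement applies a local complementation at that vertex before deleting it (both up to local corrections on the neighbours, which are ignored here). The sketch below tracks only the graph. It is not the paper's EC crossbar or Clos construction, and the vertex names are hypothetical.

```python
from itertools import combinations


def remove_vertex(graph, v):
    """Return a copy of the graph without vertex v."""
    return {u: set(nbrs) - {v} for u, nbrs in graph.items() if u != v}


def local_complement(graph, v):
    """Toggle every edge between pairs of neighbours of v."""
    out = {u: set(nbrs) for u, nbrs in graph.items()}
    for a, b in combinations(sorted(graph[v]), 2):
        if b in out[a]:
            out[a].discard(b)
            out[b].discard(a)
        else:
            out[a].add(b)
            out[b].add(a)
    return out


def measure_z(graph, v):
    """Pauli-Z measurement: v is cut out of the graph state."""
    return remove_vertex(graph, v)


def measure_y(graph, v):
    """Pauli-Y measurement: local complementation at v, then remove v."""
    return remove_vertex(local_complement(graph, v), v)


if __name__ == "__main__":
    # Three-vertex path: in1 - relay - out1
    path = {"in1": {"relay"}, "relay": {"in1", "out1"}, "out1": {"relay"}}
    print(measure_y(path, "relay"))  # in1 and out1 become directly linked
    print(measure_z(path, "relay"))  # in1 and out1 are left disconnected
```

Measuring the relay qubit in Y forwards the entanglement to the endpoints, while measuring it in Z isolates them, which is the basic switching primitive a graph-state fabric can build on.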

15.
arXiv (quant-ph) 2026-06-11

Quantum repeater segment with free-space coupled co-trapped ions using telecom photon interference

arXiv:2606.12313v1 Announce Type: new Abstract: A quantum repeater segment is a basic building block of a quantum repeater, generating buffered entanglement of quantum memories to connect quantum repeater cells. It also enables the connection between quantum computers. In the implementation we present here, photons emitted from two co-trapped free-space coupled $^{40}$Ca$^+$ ions are converted to the telecom-C band and interfered after transmission over 440$\,$m of optical fiber (220$\,$m per arm), where a photonic Bell measurement is performed to create entanglement between the memories. With this scheme we generate an entangled $\left|\Psi^+\right\rangle$ Bell state with $\ge 68(8)\,$% fidelity, highlighting trapped $^{40}$Ca$^+$ ions as a promising quantum repeater hardware platform.

16.
arXiv (CS.AI) 2026-06-12

A Mathematical Theory of Value: a synthesis on goal-directed agency under resource constraints

arXiv:2606.12502v1 Announce Type: cross Abstract: We propose that value – the quantity goal-directed agents create, destroy, and exchange – is a lawful structural quantity in the same category as information. Following Shannon's method, we make one ruthless abstraction: value is the rate at which an agent converts a resource into goal-progress, relative to a frame fixed by its goal. A scale-invariance axiom forces a logarithmic measure, $V=\sum_i k_i \ln e_i$; compounding of a reinvested resource forces the same form via the ergodicity argument of Peters (2019). The two routes are kin rather than independent; their agreement is a consistency check, not an over-determination. We derive a coding theorem of value: $\Delta G \le I(X;Y)$, achieved by Bayes-proportional allocation; realized value decomposes as $G=D(q\|r)-D(q\|p)$, identifying misalignment with measurable waste. For populations, value is frame-relative while price is frame-independent; a fleet that pools its resource and fuses its perception inherits the ceiling $G_{\mathrm{fleet}} \le I(X;Y_{1:m}) \le H(X)$ (a corollary; an earlier sum-form claim was wrong and is corrected in v5). A dynamical layer yields an is/ought asymmetry from which alignment emerges as a control-stability condition with a closed-form residual. We test the single-frame laws on live language models in a pre-registered scale-up: perception mutual information tracks realized capability rather than parameter count (Spearman $\rho = 0.977$ pooled over 30 model$\times$domain points), out-of-sample $\Delta G$ tracks $I(X;Y)$, and over-confidence is measurable dissipation; a further pre-registered test shows the bridge is shape-invariant across four task shapes ($n=42$, slope 0.953). None of the mechanisms is individually new – generalized Kelly, Armstrong & Mindermann (2018), classical control; the contribution is their unification and the governance mapping (incentive design over oversight) that follows.
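The decomposition $G=D(q\|r)-D(q\|p)$ has the same form as the classical Kelly-betting identity, which the abstract cites as one of its ingredients. The check below assumes that reading: $q$ is the true outcome distribution, $p$ the agent's allocation, and $r$ a reference (for example, odds-implied) distribution, with $G=\sum_i q_i \ln(p_i/r_i)$. These roles are an assumption for illustration; the abstract does not define the symbols.

```python
import math


def kl(a, b):
    """Kullback-Leibler divergence D(a || b) in nats."""
    return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)


def log_growth(q, p, r):
    """Expected log growth when allocating p against reference r under truth q
    (assumed Kelly-style reading of G)."""
    return sum(qi * math.log(pi / ri) for qi, pi, ri in zip(q, p, r) if qi > 0)


if __name__ == "__main__":
    q = [0.5, 0.3, 0.2]   # true distribution (made-up numbers)
    p = [0.4, 0.4, 0.2]   # agent's allocation
    r = [1 / 3] * 3       # reference distribution

    g = log_growth(q, p, r)
    print(g, kl(q, r) - kl(q, p))   # the two agree
    print(log_growth(q, q, r))      # p = q attains the ceiling D(q || r)
```

Under this reading, $D(q\|p)$ is the shortfall caused by allocating with the wrong beliefs, which matches the abstract's description of misalignment as measurable waste.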

17.
arXiv (CS.AI) 2026-06-15

AudioDER: A Deduplication-Enhanced Reasoning Dataset for Post-Training Large Audio-Language Models

arXiv:2606.14591v1 Announce Type: cross Abstract: Large Audio-Language Models (LALMs) have shown strong performance on a wide range of audio understanding tasks, yet they still struggle with complex audio reasoning. A practical way to improve such capabilities is post-training, whose effectiveness critically depends on the quality and diversity of training data. However, existing audio-language datasets often contain substantial redundancy, where many samples are highly similar in acoustic content and thus provide overlapping supervisory signals. Such redundancy not only increases annotation cost, but also limits corpus diversity and reduces the effectiveness of post-training. To address this issue, we propose a redundancy-aware data construction pipeline for building reasoning-oriented supervision for LALMs. Specifically, we first perform acoustic similarity-based deduplication across raw audio datasets to improve corpus diversity. We then integrate existing audio captions and question-answer pairs into a unified multiple-choice format. Based on these unified annotations, we leverage Qwen3-30B to generate chain-of-thought (CoT) rationales for reasoning-oriented supervision. Based on this pipeline, we construct AudioDER, a reasoning-oriented post-training dataset containing approximately 191k samples spanning sound, speech, and music. Each sample consists of an audio clip, a multiple-choice question, four answer candidates, an audio caption, and a CoT rationale. Extensive experiments show that post-training on AudioDER consistently improves the performance of Qwen2-Audio-7B-Instruct on multiple audio reasoning benchmarks, including MMAU-mini, MMSU, and MMAR. We hope AudioDER can serve as a valuable resource for advancing audio reasoning research and the development of more capable LALMs.

18.
arXiv (CS.CV) 2026-06-18

URDF Synthesis from RGB-D Sequences via Differentiable Joint Inference and Energy-Consistent Verification

Reconstructing simulation-ready digital twins of articulated objects from sensor observations remains constrained by two persistent gaps: (i) part-level geometric reconstruction is decoupled from kinematic-parameter estimation, and (ii) the recovered models often violate basic dynamic invariants such as energy conservation, leading to drift when the URDF is replayed in physics simulators. We present KinemaForge, a constraint-driven pipeline that jointly infers part-level shape, joint topology, and joint parameters from short RGB-D sequences and validates the result against an energy-consistent verifier built on differentiable rigid-body dynamics. The pipeline introduces three components: a kinematic constraint graph that encodes joint-part incidences as soft edges; a differentiable screw-axis solver that backpropagates from rendered observations through Featherstone's articulated-body algorithm to joint parameters; and an energy residual loss that penalises non-physical free responses of the reconstructed model. Across five PartNet-Mobility categories and an internal RGB-D benchmark, KinemaForge reduces the average joint-axis error from 4.52 degrees to 2.83 degrees (-37.4%) over the strongest geometric baseline (PARIS) and from 5.30 degrees to 2.83 degrees (-46.6%) over the interaction-based Ditto baseline, lowers long-horizon simulation drift by 64% (vs. PARIS) over 50 s rollouts, and yields URDFs whose closed-loop manipulation success rate improves by 14.6 percentage points over Ditto in our preliminary evaluation. Code and reconstruction data will be released upon acceptance.
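The two percentage reductions in joint-axis error can be reproduced from the quoted error values alone. A small check, with a hypothetical helper name:

```python
def relative_reduction_pct(baseline, ours):
    """Percentage by which `ours` is lower than `baseline`."""
    return 100.0 * (baseline - ours) / baseline


if __name__ == "__main__":
    # Joint-axis errors in degrees, as quoted in the abstract.
    print(round(relative_reduction_pct(4.52, 2.83), 1))  # vs. PARIS
    print(round(relative_reduction_pct(5.30, 2.83), 1))  # vs. Ditto
```

Both values round to the figures stated in the abstract (37.4% and 46.6%).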

19.
arXiv (CS.AI) 2026-06-16

IoT-Zoo: A Container-Based Framework for Heterogeneous IoT Device Profiles and Reproducible Traffic Capture

arXiv:2606.15653v1 Announce Type: cross Abstract: The validation of networking and security solutions for the Internet of Things (IoT) requires realistic and reproducible experimental data. However, existing platforms often achieve scalability by replicating a limited set of device types, which restricts profile diversity and fails to capture the heterogeneity of real-world IoT environments. In this paper, we present IoT-Zoo, a container-based testbed designed to support reproducible experimentation through heterogeneous, dataset-driven IoT device profiles. Built upon Containernet, IoT-Zoo automates the deployment of multi-domain scenarios and supports real application protocols such as MQTT and RTSP. The platform provides a single-command interface for environment provisioning and automated traffic capture (PCAP), enabling the generation of consistent traffic baselines and reducing the operational effort required to evaluate networking and security solutions.

20.
arXiv (CS.AI) 2026-06-16

Latent Thought Flow: Efficient Latent Reasoning in Large Language Models

arXiv:2606.16222v1 Announce Type: new Abstract: Large Language Models (LLMs) increasingly rely on intermediate reasoning, yet explicit Chain-of-Thought (CoT) suffers from a linguistic space bottleneck: each thought must be decoded into tokens, causing high inference overhead. Latent reasoning moves deliberation into continuous space, but existing methods mostly learn deterministic or reward-maximizing paths, lacking a principled way to allocate probability across trajectories with different correctness and costs. We propose Latent Thought Flow (LTF), which models reasoning as variable-length continuous trajectories and trains a sampler to match a reward-induced posterior over answer quality and computation cost. We instantiate this with a continuous GFlowNet using stochastic latent transitions. To handle sparse answer supervision, we introduce an Entropy-Weighted Subtrajectory Balance objective for intermediate rewards and a reference-prior regularizer to anchor exploration. Experiments under finetuning and transfer learning settings show that LTF outperforms explicit CoT and latent reasoning baselines, improving accuracy by 9.5% while reducing reasoning length by 27.2% on average compared with strong latent reasoning baselines.

21.
arXiv (CS.LG) 2026-06-16

A spectral audit framework reveals task-dependent aperiodic reliance across EEG and ECG deep learning

arXiv:2606.08583v2 (replacement). Deep learning on physiological time series is interpreted through domain-specific features – oscillatory rhythms in EEG, morphological complexes in ECG – yet these signals sit atop a broadband aperiodic 1/f-like envelope that covaries with arousal, age, and pathology. We introduce a spectral audit framework combining aperiodic/periodic decomposition, phase-preserving Fourier interventions, sham controls, and simulation validation. Aperiodic reliance was task-dependent and architecture-general: across six neural architectures, flattening drops exceeded 0.42 balanced-accuracy points for sleep-wake classification, reached 0.07–0.13 for clinical abnormality detection, and remained minimal for motor imagery. Six of seven EEG foundation models showed FDR-significant aperiodic reliance on clinical EEG; age/sex and recording-era controls reduced but did not eliminate the effect. Applying the audit to PTB-XL ECG revealed neural drops of 0.32–0.36 persisting after demographic matching, confirming this confound class extends beyond EEG. Aperiodic controls should become standard for interpretable physiological time-series deep learning.

22.
arXiv (CS.CL) 2026-06-16

CoRA: Confidence-Rationale Alignment for Reliable Chain-of-Thought Reasoning

Chain-of-thought (CoT) reasoning can improve LLM performance, but high answer confidence may be misleading when the accompanying CoT rationale is plausible yet incomplete or poorly supported. We study confidence–rationale alignment: whether a model's confidence in its committed answer is justified by its generated rationale. We introduce a GRPO-based reinforcement learning framework that jointly rewards answer correctness, committed-answer probability, and rubric-based rationale support, where the rubric assesses grounding, coherence, task match, and connection to the selected answer without revealing the gold answer to the judge. Across MedQA, MathQA, and OpenBookQA using three open-weight LLMs, our method reduces the confidence–rationale alignment error by up to 26.51% compared with untuned checkpoints, SFT, and correctness-only GRPO, while maintaining competitive accuracy and often improving calibration. These results show that reliable CoT reasoning requires not only confident answers, but rationales that substantively support them.

23.
arXiv (CS.AI) 2026-06-18

An In-depth Study of LLM Contributions to the Bin Packing Problem

arXiv:2510.27353v2 (replacement). Recent studies have suggested that Large Language Models (LLMs) could provide interesting ideas contributing to mathematical discovery. This claim was motivated by reports that LLM-based genetic algorithms produced heuristics offering new insights into the online bin packing problem under uniform and Weibull distributions. In this work, we reassess this claim through a detailed analysis of the heuristics produced by LLMs, examining both their behavior and interpretability. Despite being human-readable, these heuristics remain largely opaque even to domain experts. Building on this analysis, we propose a new class of algorithms tailored to these specific bin packing instances. The derived algorithms are significantly simpler, more efficient, more interpretable, and more generalizable, suggesting that the considered instances are themselves relatively simple. We then discuss the limitations of the claim regarding LLMs' contribution to this problem, which appears to rest on the mistaken assumption that the instances had previously been studied. Our findings instead emphasize the need for rigorous validation and contextualization when assessing the scientific value of LLM-generated outputs.
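
For readers unfamiliar with the online bin packing setting discussed above, the following minimal sketch shows the classical Best Fit heuristic, a standard textbook baseline. It is neither one of the LLM-generated heuristics nor the algorithm class proposed in the paper; the function name `best_fit` and the toy item sizes are illustrative assumptions.

```python
def best_fit(items, capacity):
    """Online Best Fit: put each arriving item into the fullest bin that still fits it.

    Returns the remaining free capacity of every opened bin.
    Classical baseline only; not the method from the abstract above.
    """
    remaining = []  # free capacity per open bin, in opening order
    for size in items:
        if size > capacity:
            raise ValueError("item larger than bin capacity")
        best = None
        for i, free in enumerate(remaining):
            # Candidate bins must fit the item; prefer the one with least free space.
            if size <= free and (best is None or free < remaining[best]):
                best = i
        if best is None:
            remaining.append(capacity - size)  # open a new bin
        else:
            remaining[best] -= size
    return remaining


# Items arrive one at a time and must be placed immediately (online setting).
bins = best_fit([4, 8, 1, 4, 2, 1], capacity=10)
print(len(bins), "bins used; free space per bin:", bins)
```

Integer sizes are used here to keep the arithmetic exact; the uniform and Weibull instance distributions mentioned in the abstract would be drawn by a separate sampler.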

24.
arXiv (CS.CL) 2026-06-19

Light-weight Pronunciation Assessment via Discrete Speech Token Surprisal

Training automated pronunciation assessment often relies on labeled learner errors or non-native corpora that are costly to collect. We propose a lightweight framework trained only on native speech resources, operating unsupervised or lightly calibrated with a small set of scored utterances. At inference, learner speech is discretized with an SSL encoder and a K-means codebook. A token language model trained on native sequences computes surprisal, where higher surprisal indicates phonotactic deviation. We add a transcript-guided Text2DUnit–DTW module that predicts native token sequences from reference text and aligns them to acoustic tokens to derive error-sensitive features. Surprisal and alignment features are fused via simple regression. On SpeechOcean762, PCC improves from 0.60 to 0.66 with transcript guidance, near supervised baselines. Cross-dataset evaluation on L2-ARCTIC shows consistent gains.
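
The surprisal idea above can be illustrated with a small sketch. It assumes the speech has already been discretized into integer unit IDs, and it swaps in an add-one-smoothed bigram model for the token language model, whose architecture the abstract does not specify. The names `train_bigram` and `mean_surprisal` and the toy sequences are hypothetical.

```python
import math
from collections import Counter


def train_bigram(sequences, vocab_size):
    """Count contexts and bigrams over native discrete-unit sequences."""
    context_counts = Counter()
    bigram_counts = Counter()
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, vocab_size


def mean_surprisal(seq, model):
    """Average surprisal in bits per transition, using add-one smoothing."""
    if len(seq) < 2:
        raise ValueError("need at least two tokens")
    context_counts, bigram_counts, vocab_size = model
    total = 0.0
    for prev, cur in zip(seq, seq[1:]):
        p = (bigram_counts[(prev, cur)] + 1) / (context_counts[prev] + vocab_size)
        total += -math.log2(p)
    return total / (len(seq) - 1)


# Toy "native" unit sequences (stand-ins for K-means codebook indices).
native = [[0, 1, 2, 3], [0, 1, 2, 3], [0, 1, 2, 3]]
model = train_bigram(native, vocab_size=4)

native_like = mean_surprisal([0, 1, 2, 3], model)  # follows native transitions
deviant = mean_surprisal([3, 2, 1, 0], model)      # transitions never seen in training
print(native_like < deviant)
```

A sequence whose transitions never occur in the native data receives higher average surprisal, which is the signal the abstract describes as indicating phonotactic deviation.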

25.
arXiv (CS.CL) 2026-06-11

Hubs or Fringes: Pretraining Data Selection via Web Graph Centrality

The performance of modern language models depends critically on pretraining data composition. Yet existing data selection methods rely on auxiliary classifiers for document scoring or mixture optimization, adding computational overhead and dependence on labeled data. We propose WebGraphMix, a lightweight data selection framework that computes structural centrality scores over the Common Crawl host-level web graph and uses them to vary the proportion of central versus peripheral documents in the pretraining mixture. We hypothesize that central hosts expose models to reusable abstractions, while peripheral hosts encode specialized, long-tail knowledge. WebGraphMix computes centrality scores efficiently at web scale, requiring no model training, labeled data, or downstream supervision. We integrate WebGraphMix into the DataComp-LM pipeline and train models at 400M and 1B parameter scales with 8B and 28B tokens, respectively, evaluating on 23 tasks ranging from factual knowledge to symbolic reasoning. Our experiments show that central and peripheral web regions encode complementary capabilities. A mixture combining both at a 1:1 ratio achieves 41.4% on average, compared to 39.8% for uniform sampling. Combining structural scores with document-level quality classifier scores further improves performance to 43.8%. These findings demonstrate that web graph topology is a meaningful axis for pretraining data curation, capturing information that is largely orthogonal to existing content-based approaches.
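
As a rough illustration of centrality-based mixing, the sketch below scores hosts on a toy link graph and samples central and peripheral hosts at a 1:1 ratio. The abstract does not name the centrality measure, so PageRank is used here purely as an assumption; the median split, the function names `pagerank` and `mix_hosts`, and the example hosts are likewise hypothetical and not taken from WebGraphMix.

```python
import random


def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank on a dict mapping host -> list of linked hosts.

    Every link target must also appear as a key of `graph`.
    """
    hosts = list(graph)
    n = len(hosts)
    rank = {h: 1.0 / n for h in hosts}
    for _ in range(iterations):
        new_rank = {h: (1.0 - damping) / n for h in hosts}
        for host, out_links in graph.items():
            if out_links:
                share = damping * rank[host] / len(out_links)
                for target in out_links:
                    new_rank[target] += share
            else:
                # Dangling host: spread its rank uniformly over all hosts.
                for target in hosts:
                    new_rank[target] += damping * rank[host] / n
        rank = new_rank
    return rank


def mix_hosts(rank, n_samples, central_fraction=0.5, seed=0):
    """Sample hosts so that `central_fraction` of picks come from the top half by rank."""
    ordered = sorted(rank, key=rank.get, reverse=True)
    half = len(ordered) // 2
    central, peripheral = ordered[:half], ordered[half:]
    rng = random.Random(seed)
    n_central = round(n_samples * central_fraction)
    picks = [rng.choice(central) for _ in range(n_central)]
    picks += [rng.choice(peripheral) for _ in range(n_samples - n_central)]
    return picks


# Toy host-level graph: three hosts link to the hub, nothing links to the fringe.
toy_graph = {
    "hub.example": ["a.example", "b.example"],
    "a.example": ["hub.example"],
    "b.example": ["hub.example"],
    "fringe.example": ["hub.example"],
}
scores = pagerank(toy_graph)
sample = mix_hosts(scores, n_samples=10)  # central_fraction=0.5 gives the 1:1 mix
```

Changing `central_fraction` shifts the mixture toward hubs or fringes, which mirrors the proportion being varied in the abstract; in a real pipeline the sampled unit would be documents from those hosts, not the hosts themselves.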