Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
arXiv (CS.CV) 2026-06-16

InfoGeo: Information-Theoretic Object-Centric Learning for Cross-View Generalizable UAV Geo-Localization

Cross-view geo-localization (CVGL) is fundamental for precise localization and navigation in GPS-denied environments, aiming to match ground or UAV imagery with satellite views. Existing approaches often rely on global feature alignment, but they suffer from substantial domain shifts induced by varying regional textures and weather conditions. This issue becomes even more pronounced in UAV-based scenarios, where the broader perspective inevitably introduces dense, fine-grained objects, creating significant visual clutter. To address this, we draw inspiration from Object-Centric Learning (OCL) and propose InfoGeo, an information-theoretic framework designed to enhance robustness and generalization. InfoGeo reformulates the optimization as an information bottleneck process with two core objectives: (i) maximizing view-invariant information by aligning the object-centric structural relations across views, and (ii) minimizing view-specific noisy signals through cross-view knowledge constraints. Extensive evaluations across diverse benchmarks and challenging scenarios demonstrate that InfoGeo significantly outperforms state-of-the-art methods.

02.
arXiv (CS.CL) 2026-06-16

Extending Item Response Theory for Efficient and Meaningful Multilingual Evaluation

Multilingual benchmarks are central to evaluating large language models (LLMs) across languages, but they suffer from three issues: exhaustive evaluation scales linearly with the number of languages, automatic translation introduces errors that are easily missed at scale, and some items conflate general and culture-specific knowledge. We address all three with a unified statistical framework, Multilingual-IRT, which extends Item Response Theory with per-language difficulty deviations, split discriminability separating content from language effects, and per-language ability residuals. Fitting Multilingual-IRT on 25 LLMs across 29 languages of MMLU-Pro-X, we show that its fitted parameters support three practical applications: predicting unobserved (item, LLM, language) instances with 11-16% lower binary cross-entropy than the strongest accuracy-based baseline, surfacing candidate translation errors distributed across all 28 non-English languages, whereas accuracy-based baselines concentrate detections in a few languages, and recovering culture-specific items that accuracy-based baselines miss.

03.
arXiv (CS.AI) 2026-06-19

Techniques for Peak Memory Reduction for LoRA Fine-tuning of LLMs on Edge Devices

arXiv:2606.19528v1 Announce Type: cross Abstract: Fine-tuning of Large Language Models (LLMs) using Low-Rank Adaptation (LoRA) on an end-user's data offers personalized experiences while keeping data private, but faces severe memory constraints on consumer hardware. Peak memory during fine-tuning often exceeds device limits, especially for models with billions of parameters and long-context training data. This paper introduces a suite of complementary techniques to reduce memory footprint without sacrificing model quality: (1) base model quantization with on-the-fly dequantization, (2) memory-efficient checkpointing combining selective activation caching and disk offloading, (3) softmax approximation using semantically relevant token subsets, and (4) logits masking. Experiments on Llama-3.2 3B and Qwen-2.5 3B demonstrate up to $26\times$ and $28\times$ reduction in peak memory, enabling fine-tuning on resource-constrained devices.

04.
Nature (Science) 2026-06-17

A 98-qubit trapped-ion quantum computer with all-to-all connectivity

Quantum computers require both high-fidelity operations and large qubit numbers to surpass classical capabilities1. Trapped-ion platforms have demonstrated the highest gate fidelities of any modality2–6 but scaling to larger qubit numbers while preserving performance has remained a central challenge. We report on Quantinuum Helios, a 98-qubit trapped-ion quantum processor based on the quantum charge-coupled device (QCCD) architecture7. Helios features 137Ba+ hyperfine qubits8,9, all-to-all connectivity enabled by a rotatable ion storage ring connecting two quantum operation regions by a junction10,11, speed improvements from parallelized operations12 and a new software stack with real-time compilation of dynamic programs13. Averaged over all operational zones in the system, we achieve average infidelities of 2.5(1) × 10−5 for single-qubit (1Q) gates, 7.9(2) × 10−4 for two-qubit (2Q) gates and 3.3(5) × 10−4 for state preparation and measurement (SPAM), none of which are fundamentally limited and probably able to be improved. These component infidelities are predictive of system-level performance in both random Clifford circuits and random circuit sampling (RCS), the latter demonstrating that Helios operates well beyond the reach of classical simulation and establishes a new frontier of fidelity and complexity for quantum computers14. A new quantum computer, Quantinuum Helios, which is a 98-qubit trapped-ion quantum processor built on the QCCD architecture, demonstrates performance well beyond classical capabilities and provides a path for scaling up quantum computing.

05.
arXiv (CS.LG) 2026-06-15

Testing For Distribution Shifts with Conditional Conformal Test Martingales

arXiv:2602.13848v2 Announce Type: replace Abstract: We propose a sequential test for detecting arbitrary distribution shifts that allows conformal test martingales (CTMs) to work under a fixed, reference-conditional setting. Existing CTM detectors construct test martingales by continually growing a reference set with each incoming sample, using it to assess how atypical the new sample is relative to past observations. While this design yields anytime-valid type-I error control, it suffers from test-time contamination: after a change, post-shift observations enter the reference set and dilute the evidence for distribution shift, increasing detection delay and reducing power. In contrast, our method avoids contamination by design by comparing each new sample to a fixed null reference dataset. Our main technical contribution is a robust martingale construction that remains valid conditional on the null reference data, achieved by explicitly accounting for the estimation error in the reference distribution induced by the finite reference set. This yields anytime-valid type-I error control together with guarantees of asymptotic power one and bounded expected detection delay. Empirically, our method detects shifts faster than standard CTMs, providing a powerful and reliable distribution-shift detector.

06.
medRxiv (Medicine) 2026-06-22

GCH1 p.Ser80Asn Confers Risk for Parkinson's Disease in East Asian Populations

Introduction: GCH1 has been implicated in Parkinson's disease (PD), but its risks variants and associations are not well defined. Objectives: To investigate the clinical relevance and PD risk associated with the GCH1 p.Ser80Asn variant. Methods: We first identified a segregating GCH1 p.Ser80Asn variant in a Malaysian Chinese PD family via whole genome sequencing (WGS). We assessed its risk association using multi-ancestry WGS data from the Global Parkinson's Genetics Program (GP2) (n=22,372PD vs n=8,826Controls) and meta-analysis of East Asian (EAS) cohorts (n=4,712PD vs 38,733Controls). Clinico-demographic details of affected variant carriers were collated. Results: The GCH1 p.Ser80Asn variant was enriched in GP2 EAS PD populations (n=9/2,757; 0.33%) but not detected in other ancestries. Meta-analysis revealed increased PD risk in EAS populations (odds ratio:5.1; 95%CI:2.3-10.7; p=2.89x10-5). Affected carriers (mean age at onset:56.3+-12.5 years) had additional occurrence of dystonia, while dementia was rare. Conclusions: The GCH1 p.Ser80Asn variant is a rare, EAS-enriched risk variant for PD.

07.
arXiv (CS.AI) 2026-06-16

Optimal Transport for Machine Learners

arXiv:2505.06589v2 Announce Type: replace-cross Abstract: Modern machine learning repeatedly manipulates probability measures: empirical datasets, generated samples, latent distributions, class-conditional laws, particle systems, weights of wide networks and attention patterns. Optimal transport is useful in this setting because it compares such objects by asking how mass should move. It therefore combines a statistically meaningful notion of discrepancy with a geometry of interpolation, dual certificates and variational dynamics. This makes OT a common language for losses, generative modeling, domain adaptation, robust learning, barycenters, gradient flows and mean-field descriptions of learning algorithms. This book presents the main OT techniques with these machine-learning uses in mind. It starts from finite assignment and the Monge map viewpoint, passes to Kantorovich couplings and dual potentials, and then explains the algorithmic ideas that make transport usable: linear programming, semi-discrete cells, Sinkhorn scaling and low-dimensional projections. The same objects are then reused as a geometry of measures, giving Wasserstein distances, barycenters, gradient flows, dynamic formulations and Gaussian/Bures formulas. The final chapters emphasize the variants most relevant to modern ML: divergences and adversarial losses, entropic and unbalanced relaxations, robust or spectral ground geometries, Gromov and quantum extensions, and transport-based views of generative models, mean-field networks and attention dynamics. The goal is to keep the mathematics explicit while exposing the computational and geometric intuitions needed to turn OT into a working toolbox for machine learners.

08.
arXiv (CS.AI) 2026-06-19

Multi-LCB: Extending LiveCodeBench to Multiple Programming Languages

arXiv:2606.20517v1 Announce Type: new Abstract: LiveCodeBench (LCB) has recently become a widely adopted benchmark for evaluating large language models (LLMs) on code-generation tasks. By curating competitive programming problems, constantly adding fresh problems to the set, and filtering them by release dates, LCB provides contamination-aware evaluation and offers a holistic view of coding capability. However, LCB remains restricted to Python, leaving open the question of whether LLMs can generalize across the diverse programming languages required in real-world software engineering. We introduce Multi-LCB, a benchmark for evaluating LLMs across twelve programming languages, including Python. Multi-LCB transforms Python tasks from the LCB dataset into equivalent tasks in other languages while preserving LCB's contamination controls and evaluation protocol. Because it is fully compatible with the original LCB format, Multi-LCB will automatically track future LCB updates, enabling systematic assessment of cross-language code generation competence and requiring models to sustain performance well beyond Python. We evaluated 24 LLMs for instruction and reasoning on Multi-LCB, uncovering evidence of Python overfitting, language-specific contamination, and substantial disparities in multilingual performance. Our results establish Multi-LCB as a rigorous new benchmark for multi-programming-language code evaluation, directly addressing LCB's primary limitation and exposing critical gaps in current LLM capabilities.

09.
Nature (Science) 2026-06-22

Cancer cells adopt unprecedented strategies to produce a molecule that protects them from iron-dependent death

The finding that spermine molecules in cells bind to iron to prevent it unleashing ferroptosis, a type of cell death, opens up strategies for treating tissue damage and cancer. The finding that spermine molecules in cells bind to iron to prevent it unleashing ferroptosis, a type of cell death, opens up strategies for treating tissue damage and cancer.

10.
arXiv (CS.AI) 2026-06-15

Safety-Contract Graph Multi-Agent Reinforcement Learning for Autonomous Network Security Response

arXiv:2606.13832v1 Announce Type: cross Abstract: Autonomous network-security response systems promise to reduce Security Operations Centre (SOC) reaction latency, but reward-only multi-agent reinforcement learning (MARL) can improve security reward while remaining non-deployable. We present a safety-contract graph MARL framework and instantiate it as ACD$^3$-GAT (Adaptive Constrained Counterfactual Decisioning with a Graph Attention Network encoder), an architecture that separates simulator observations from reusable operational budgets, constrained optimization, graph state encoding, and counterfactual action screening. We evaluate the method in CAGE Challenge 4, where agents operate under budgets for Mean Time to Recover (MTTR), false-positive response, and firewall change-management disruption. Across the benchmark, every unconstrained method violates the SOC downtime budget in 100% of evaluated episodes, with mean downtime proxy costs of 311-430 against a budget of 50. This complements prior CAGE Challenge 4 findings by showing that reward-only learning lacks operational discipline. Constrained MAPPO-GAT (C-MAPPO-GAT) isolates Lagrangian operational-cost control and budget-aware screening, while ACD$^3$-GAT adds budget context, CVaR tail-risk estimation, opponent-belief state, and Graph Counterfactual Risk Propagation (G-CRP). The replicated comparison includes three 200-episode seeds for IPPO, MAPPO-GAT, C-MAPPO-GAT, and ACD$^3$-GAT. C-MAPPO-GAT reduces downtime violation from 100% to 0.3% and mean downtime cost from 355.4 to 15.5 relative to MAPPO-GAT. ACD$^3$-GAT reduces mean downtime cost to 48.2 with a 13.8% violation rate, placing it on the safety-contract frontier rather than at the most conservative compliance point. Topology-seed and coupled adaptive Red-process stress tests preserve this contrast and show lower worst adaptive degradation for safety-constrained policies than reward-only MAPPO-GAT.

11.
arXiv (CS.CV) 2026-06-18

Spatially Stratified Distillation for Heterogeneous Radar Place Recognition

Scalable, all-weather place recognition increasingly relies on heterogeneous radar place recognition to bridge diverse hardware platforms. A notable application is matching queries from cost-effective 4D automotive radars against high-fidelity reference maps built by dense spinning radars. This process is fundamentally limited by the extreme sparsity (and narrow field-of-view) of the 4D sensor, which captures only a fraction of the structural density present in the spinning radar database. Prior efforts address this issue by unifying different radar signals. That is, projecting both signals into a common representational space. Yet, they suffer performance degradation in multi-session environments. In this paper, we propose spatially-stratified distillation (SSD); a strategy that replaces standard uniform distillation with an asymmetric spatial alignment derived directly from physical radar returns. In regions where both radars exhibit overlapping returns, SSD enforces strong feature alignment. Crucially, in sparse regions where the 4D student lacks returns but the teacher contains valid structure within the shared field of view, SSD applies heavily discounted distillation weights. Extensive evaluations of the recent HeRCULES dataset demonstrate that SSD significantly outperforms prior place recognition methods, achieving state-of-the-art results on its challenging dynamic sequences.

12.
arXiv (CS.CL) 2026-06-15

Deja Vu at Scale: Paraphrase-Robust Detection of Duplicate Gherkin Steps in Behaviour-Driven Software Testing with Sentence-Transformer Embeddings and a 1.1M-Step Open Benchmark

Context. Behaviour-Driven Development (BDD) suites in Gherkin accumulate step-text duplication with documented maintenance cost. Prior detectors either require runnable tests or are single-organisation, leaving a gap: a static, paraphrase-robust, step-level detector and a public benchmark to calibrate it. Objective. We release (i) the largest cross-organisational BDD step corpus to date, (ii) a labelled pair-level calibration benchmark, and (iii) a four-strategy detector with a consolidation-savings model linking clusters to ISO/IEC 25010 maintainability sub-characteristics. Method. The corpus contains 347 public GitHub repositories, 23,667 .feature files, and 1,113,616 Gherkin steps, SPDX-tagged. The detector layers exact hashing, normalised Levenshtein, sentence-transformer cosine, and a Levenshtein-banded hybrid. Calibration uses 1,020 manually labelled step pairs under a released rubric (60-pair overlap, Fleiss kappa = 0.84). We report precision, recall, and F1 with bootstrap 95% CIs under the primary rubric and a score-free relabelling, and benchmark against SourcererCC-style and NiCad-style lexical baselines. Results. Step-weighted exact-duplicate rate is 80.2%; median-repository rate is 58.6% (Spearman rho = 0.51). The top hybrid cluster has 20,737 occurrences across 2,245 files. Near-exact reaches F1 = 0.822 on score-free labels; semantic F1 = 0.906 under the primary rubric reflects a disclosed stratification artefact. Lexical baselines reach F1 = 0.761 and 0.799. The savings model estimates 893,357 corpus-wide eliminable step occurrences; on the median repository 62.5% of step lines are eliminable.

13.
arXiv (CS.CL) 2026-06-18

PragReST: Self-Reinforcing Counterfactual Reasoning for Pragmatic Language Understanding

Natural language understanding often depends on meanings that are implied rather than explicitly stated, requiring pragmatic reasoning. Despite strong performance on math and logical reasoning, large language models (LLMs) still struggle with making pragmatic inferences, often choosing literal interpretations. To improve LLM pragmatic reasoning, we introduce PragReST, a self-supervised framework that constructs pragmatic QA data, generates counterfactual reasoning traces, and trains models to internalize them through supervised fine-tuning and reinforcement learning, without human-labeled training data or distillation from a stronger teacher. Across four pragmatic benchmarks (PragMega, Ludwig, MetoQA, and AltPrag), PragReST improves over backbone models, task-specific pragmatic tuning baselines, and non-counterfactual variants of the same pipeline. On accuracy-based benchmarks, PragReST improves over the instruct backbone by 5.37 and 5.50% (absolute) for Qwen3-8B and Qwen3-14B, respectively. Our error analysis and ablations underscore the importance of counterfactual reasoning: PragReST primarily reduces errors caused by failures to contrast observed utterances with plausible alternatives, and removing counterfactual reasoning substantially reduces performance. Moreover, our training preserves out-of-domain performance on general-knowledge and mathematical reasoning benchmarks.

14.
arXiv (CS.CL) 2026-06-18

PatchWorld: Gradient-Free Optimization of Executable World Models

Text-agent environments are typically modeled as partially observable Markov decision processes (POMDPs), assuming that the simulator's latent state and transition dynamics are hidden from the agent. Yet little work has examined whether executable code can be induced to serve as a world model for prediction and planning under partial observability. We introduce PatchWorld, a gradient-free framework that turns offline trajectories into executable Python world models through counterexample-guided code repair. Instead of predicting the next observation with a black-box model, PatchWorld induces symbolic belief-state programs whose action updates can be inspected, replayed, and locally patched. Across seven AgentGym environments, PatchWorld-Simple achieves the highest code-based planning score among evaluated methods, reaching 76.4\% macro success in live one-step lookahead while invoking no LLM calls inside the world-model prediction module itself. We further find that a human-specified residual-memory bias improves surface observation fidelity but weakens decision utility. This exposes a tradeoff in executable world models, since improving observation fidelity can come at the expense of action-discriminative dynamics, and vice versa. Code is available at https://github.com/HKBU-KnowComp/PatchWorld.

15.
arXiv (CS.AI) 2026-06-19

Rethinking Shrinkage Bias in LLM FP4 Pretraining: Geometric Origin, Systemic Impact, and UFP4 Recipe

arXiv:2606.20381v1 Announce Type: new Abstract: FP4 training promises substantial reductions in memory and computation cost for LLM pretraining, yet current FP4 hardware paths and recipes, including NVIDIA Blackwell/Rubin-class systems and AMD MI350-series GPUs, remain centered on E2M1 data elements. In this study, we identify a fundamental limitation of that choice: non-uniform formats such as E2M1 inherently suffer from Shrinkage Bias, a systematic negative rounding error caused by the geometric asymmetry of their representable bins. We show that this bias accumulates multiplicatively across layers and is amplified by the Random Hadamard Transform (RHT), providing a unified explanation for the training instability observed in existing E2M1-based FP4 recipes. In contrast, uniform grids (E1M2/INT4) bypass this grid-geometry error and better convert the improved bucket utilization from RHT into higher quantization quality. Based on this finding, we propose UFP4, a uniform 4-bit training recipe that applies RHT to all three training GEMMs while restricting stochastic rounding to dY alone. On Dense 1.5B, MoE 7.9B, and MoE 124B long-run pretraining, UFP4 consistently achieves lower BF16-relative loss degradation than strong E2M1-based baselines, supported by scaling-law analysis and ablation studies. Our results suggest that future accelerators should support E1M2/INT4-style uniform 4-bit grids as first-class training primitives alongside E2M1.

16.
arXiv (CS.CL) 2026-06-18

Aligning Implied Statements for Implicit Hate Speech Generalizability with Context-Bounded Semi-hard Negative Mining

Classifying implicit hate speech remains a challenge, as intent is often masked through insinuation and context rather than explicit slurs. Prior supervised contrastive approaches improve in-domain detection but can overfit surface cues and struggle to transfer across datasets. We propose ImpSH, a triplet-based framework that aligns posts with implied statements when available and uses context-bounded semi-hard negatives to focus learning on near confusions. We also examine AugSH, which forms positives via data augmentation. In controlled evaluations on IHC, SBIC, and DynaHate with BERT and HateBERT, ImpSH is a viable alternative to standard supervised contrastive baselines and often improves cross-domain performance under matched preprocessing and tuning budgets. Representation analysis using alignment and uniformity indicates tighter positive pairs with balanced global spread, and qualitative nearest-neighbor case studies illustrate typical false negatives under domain shift. These results demonstrate that aligning posts with their implied statements via context-bounded mining provides a more stable, bijective-like mapping to related insinuations, overcoming the volatility inherent in traditional clustering-based representation learning.

17.
arXiv (CS.LG) 2026-06-12

Ride, Track, and Recover: Pilot Randomized Trial of a Wearable Digital Self-Management Intervention During a Veteran Endurance-Cycling Program

arXiv:2606.13529v1 Announce Type: cross Abstract: Post-traumatic stress disorder (PTSD) in veterans is characterized by persistent hyperarousal and comorbid anxiety and depressive symptoms that are difficult to monitor and manage outside clinical settings. Thirteen veterans participating in a Project Hero cycling event in Texas were randomized by computer-generated sequence in a naturalistic setting to two arms: (1) digital intervention plus physical activity, or (2) physical activity only, plus a third at-home monitoring control cohort consisting of 7 veterans selected from the broader Project Hero veteran community. Continuous smartwatch sensing combined heart rate and accelerometer features to detect hyperarousal events, which were confirmed in real time by participants. Weekly self-report measures of anxiety, depression, and PTSD severity were collected. Generalized additive mixed models characterized nonlinear trajectories over time. Baseline-normalized hyperarousal trajectories differed significantly across conditions, with the digital intervention group (n=7) showing structured stabilization compared to late-study escalation in the physical-only group (n=3). Both cycling groups exhibited acute symptom improvements during the endurance event; however, the digital intervention group demonstrated a higher overall maintenance of gains. The at-home control group (n=4) showed gradual symptom declines. Perceived precision of ML detections varied substantially across individuals and was positively associated with symptom severity, with higher-severity participants confirming a greater proportion of detected events. These results suggest that coupling wearable detection with digital self-management tools may support stabilization of hyperarousal and symptom improvement while emphasizing the importance of personalization and human-centered design in wearable mental health systems.

18.
arXiv (CS.CL) 2026-06-19

How Linear Is a Transformer Feed-Forward Block? Per-Block Linear Recoverability Is Learned, Not Architectural

作者:

Transformer feed-forward networks (FFNs) are often treated as nonlinear stores of computation, yet how nonlinear a trained FFN block actually is has rarely been measured. We treat each FFN as a position-wise input-to-output map and split it into the exact least-squares linear approximation plus a residual. The held-out variance the closed-form linear map explains defines a block's linear recoverability (R^2_lin), an optimiser-free measure of its linearity. Across all twelve blocks of GPT-2, Pythia-160m, and llama-160m, R^2_lin is highly heterogeneous and non-monotone with depth, ranging from near-linear (>0.99) to strongly nonlinear (

19.
arXiv (CS.AI) 2026-06-12

GeoNatureAgent Benchmark: Benchmarking LLM Agents for Environmental Geospatial Analysis Across Frontier and Open-Weight Foundation Models

arXiv:2606.12821v1 Announce Type: new Abstract: Environmental scientists spend disproportionate effort on data wrangling rather than analysis, and AI agents that automate geospatial workflows remain unvalidated: no benchmark evaluates agents operating through structured tool calling against real APIs. We introduce the GeoNatureAgent Benchmark, the first benchmark for environmental analysis agents that operate via structured tool calls to a production-style geospatial API. It comprises 93 tasks across 18 categories, covering municipality analysis, multi-turn conversation, spatial reasoning, cross-indicator synthesis, error handling and recovery, ranking, comparison, multilingual understanding, habitat analysis, and task rejection. Tasks are evaluated against an open, self-hostable API serving three environmental indicators across Spain and Portugal via sixteen tools. We evaluate seven LLMs (Claude Sonnet 4, DeepSeek V3.2, GLM-5, Gemini 2.5 Pro, Qwen3-235B, GPT-OSS-120B, Llama 4 Scout) under three temperature-1.0 seeds, reporting capability and per-case cost as orthogonal axes. We find: (1) Claude Sonnet 4 leads at 60.8% +/- 0.8%, followed by DeepSeek V3.2 at 56.3% +/- 3.1%, with no other model above 51%; (2) the cost-accuracy Pareto frontier is occupied mostly by open-weight models, with DeepSeek V3.2 offering 93% of Claude's capability at 11x lower cost ($0.011/case); (3) comparison tasks remain universally unsolved (0% on close-value comparisons), exposing systematic reasoning limits; and (4) structured tool calling against a real API is more discriminative than general-purpose GIS benchmarks, with accuracies 25-35 points lower. We further show extensibility by integrating BigEarthNet V2 land cover for Portugal alongside Spanish CO2 and erosion indicators. The benchmark, harness, and self-hostable API are publicly available.

20.
arXiv (CS.AI) 2026-06-15

Exact Linear Attention

作者:

arXiv:2605.18848v4 Announce Type: replace-cross Abstract: This paper introduces Exact Linear Attention (ELA), a mechanism that achieves linear computational complexity for Transformer attention by exploiting the exact decomposition property of kernel functions, thereby eliminating approximation error. We identify and address two key limitations of prior linear attention – gradient explosion and token attention dilution – by imposing kernel constraints that ensure non-negativity, discriminability, and geometric interpretability. Several kernel functions are proposed, including the Hadamard Exp Kernel, Summation Squared Euclidean Distance Kernel, and Subtraction Squared Euclidean Distance Kernel, each tailored for specific attention behaviors. Beyond the core attention formulation, the paper presents three engineering innovations: (1) a Hyper-Link structure that replaces traditional residual connections to mitigate gradient degradation; (2) a Memory Lobe module based on bidirectional linear attention, which captures "transformation flow" across layers to implement qualitative memory and an implicit reinforcement learning paradigm; and (3) a routing-score-based bias mechanism for Mixture-of-Experts (MoE) to improve interpretability and semantic alignment. Experimental results demonstrate that ELA achieves up to 6x faster decoding speed and 75% reduction in KV cache memory usage compared to full attention, while maintaining comparable or superior training performance. The proposed memory module accelerates convergence and enhances generalization. Furthermore, we extend the linear attention principle to vision models, yielding YOLO-LAT, which attains up to 4.3x GPU inference speedup and 7.9x parameter reduction with competitive detection accuracy. These results underline the broad applicability of exact linear attention for scaling Transformer models to ultra-long sequences and efficient visual tasks.

21.
arXiv (CS.CV) 2026-06-19

LaTtE-Flow: Layerwise Timestep-Expert Flow-based Transformer

Recent advances in multimodal foundation models unifying image understanding and generation have opened exciting avenues for tackling a wide range of vision-language tasks within a single framework. Despite progress, existing unified models typically require extensive pretraining and struggle to achieve the same level of performance compared to models dedicated to each task. Additionally, many of these models suffer from slow image generation speeds, limiting their practical deployment in real-time or resource-constrained settings. In this work, we propose Layerwise Timestep-Expert Flow-based Transformer (LaTtE-Flow), a novel and efficient architecture that unifies image understanding and generation within a single multimodal model. LaTtE-Flow builds upon powerful pretrained Vision-Language Models (VLMs) to inherit strong multimodal understanding capabilities, and extends them with a novel Layerwise Timestep Experts flow-based architecture for efficient image generation. LaTtE-Flow distributes the flow-matching process across specialized groups of Transformer layers, each responsible for a distinct subset of timesteps. This design significantly improves sampling efficiency by activating only a small subset of layers at each sampling timestep. To further enhance performance, we propose a Timestep-Conditioned Residual Attention mechanism for efficient information reuse across layers. Experiments demonstrate that LaTtE-Flow achieves strong performance on multimodal understanding tasks, while achieving competitive image generation quality with around 6x faster inference speed compared to recent unified multimodal models.

22.
arXiv (quant-ph) 2026-06-19

Random Local Stabilizer Codes in Three Dimensions without String or Self-Similar Fractal Logical Operators

作者:

arXiv:2606.19873v1 Announce Type: new Abstract: Quantum error-correcting codes (QECs) are essential components quantum computation and have deep connections to quantum phases of matter. A key obstruction to passive self-correcting QECs is the presence of string logical operators, which can generate logical errors through constant-energy-barrier processes. Haah's Codes (fracton codes) showed that three-dimensional stabilizer codes can forbid such string logical operators, but their translation-invariant structure supports self-similar fractal logical operators with a logarithmic energy barrier. We introduce the qutrit random cubic codes, a family of local qutrit Calderbank-Shor-Steane stabilizer Hamiltonians with similar cube-check structure as Haah's Code 1 but built from spatially varying stabilizers. We prove that these models retain the no-string property and numerically observe that they have properties distinct from translation-invariant fracton codes: the smallest ground-state degeneracy exponent is $k=2$ for odd $L$ and $k=4$ for even $L$; noncontractible plane-logical operators span the entire logical space; and charge-push diagnostics show that the self-similar fractal operators are absent. These results demonstrate that constrained randomness can fundamentally change the nature of stabilizer codes and improve their self-correction properties. They further point to broader families of quantum error-correcting codes and quantum phases beyond canonical topological and fracton orders.

23.
arXiv (CS.LG) 2026-06-18

KANEL\'E: Kolmogorov-Arnold Networks for Efficient LUT-based Evaluation

arXiv:2512.12850v3 Announce Type: replace-cross Abstract: Low-latency, resource-efficient neural network inference on FPGAs is essential for applications demanding real-time capability and low power. Lookup table (LUT)-based neural networks are a common solution, combining strong representational power with efficient FPGA implementation. In this work, we introduce KANEL\'E, a framework that exploits the unique properties of Kolmogorov-Arnold Networks (KANs) for FPGA deployment. Unlike traditional multilayer perceptrons (MLPs), KANs employ learnable one-dimensional splines with fixed domains as edge activations, a structure naturally suited to discretization and efficient LUT mapping. We present the first systematic design flow for implementing KANs on FPGAs, co-optimizing training with quantization and pruning to enable compact, high-throughput, and low-latency KAN architectures. Our results demonstrate up to a 2700x speedup and orders of magnitude resource savings compared to prior KAN-on-FPGA approaches. Moreover, KANEL\'E matches or surpasses other LUT-based architectures on widely used benchmarks, particularly for tasks involving symbolic or physical formulas, while balancing resource usage across FPGA hardware. Finally, we showcase the versatility of the framework by extending it to real-time, power-efficient control systems.

24.
arXiv (CS.AI) 2026-06-11

Internet of Everything in the 6G Era: Paradigms, Enablers, Potentials and Future Directions

arXiv:2604.25018v2 Announce Type: replace-cross Abstract: The Internet of Everything (IoE) represents an evolution of the Internet of Things (IoT) by integrating people, data, processes, and things into a unified intelligent ecosystem. IoE aims to enhance automation, decision-making, and service efficiency across multiple application domains such as smart cities, healthcare, industry, and next-generation wireless networks. This paper provides a structured overview of the IoE concept, its core components, architectural foundations, enabling technologies, and major research challenges. Finally, open research directions toward 6G-enabled intelligent IoE systems are discussed, with emphasis on scalability, security, privacy, and energy efficiency.

25.
arXiv (CS.LG) 2026-06-16

Beyond Accuracy: Measuring Bias Acknowledgment in Chain-of-Thought Reasoning for Responsible AI Evaluation

arXiv:2606.15127v1 Announce Type: new Abstract: Reasoning models are increasingly used in settings where the final answer is not the only object of review: educational tools may show students intermediate steps, decision-support systems may require human oversight, and audit workflows may inspect traces for misleading or biased input. In such settings, two responses can receive the same final-answer score while differing in whether the trace explicitly flags injected biasing content. Accuracy-only evaluation collapses these cases. We study this gap as a measurement blind spot for responsible evaluation and introduce a minimal trace-level diagnostic with two axes: susceptibility (whether the bias breaks a previously correct answer) and acknowledgment (whether the trace contains a rubric-defined surface reference to the injected content). Across thousands of biased GSM8K trials, GPT-4o and Claude Sonnet~4 have similar susceptibility rates ($1.3\%$ vs.\ $1.2\%$) but substantially different acknowledgment rates ($13.0\%$ vs.\ $75.0\%$) under the same rubric.