Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
arXiv (CS.AI) 2026-06-17

EvolveNav: Proactive Preflection and Self-Evolving Memory for Zero-Shot Object Goal Navigation

arXiv:2606.18235v1 Announce Type: new Abstract: Zero-Shot Object-Goal Navigation (ZS-OGN) requires embodied agents to explore and locate target objects without any prior training. To this end, recent methods leverage foundation models. But they typically rely on static priors and lack adaptation, which leads to repeated errors and costly trial and error. In this paper, we propose a self-evolving ZS-OGN framework that enables continuous test-time improvement. Specifically, we build an agentic rule memory by extracting actionable knowledge from past trajectories. Then, we propose a retrieval strategy based on upper confidence bound, selecting effective rules by balancing semantic relevance and historical success. In addition, we introduce a memory-guided preflection module that forecasts potential outcomes before action, reducing inefficient exploration. Extensive experiments show that our method outperforms existing zero-shot baselines, achieving a 10.1\% improvement in success rate with fewer unnecessary steps.

02.
arXiv (CS.CL) 2026-06-19

The ACUTE Protocol: Operationalizing Language Model Activations for Better Calibration, Utility, and Trust

As language models improve and become increasingly deployed to solve a variety of tasks, trustworthiness becomes essential. Calibration is a good proxy for trust: well-calibrated confidence estimates help inform the risk versus reward tradeoff when trusting a specific model output. Unfortunately, even as models improve, they remain poorly calibrated, often biasing towards overconfidence. Additionally, calibration can be gamed: a policy that always predicts the base rate is perfectly calibrated, but completely uninformative. To resolve this, we develop a new metric, expected utility renormalized by the oracle (EURO), that balances calibration and informativeness. We also propose a general-purpose activation-based confidence, utility, and trust estimation protocol (ACUTE) to appropriately adjudicate uncertainty. The ACUTE protocol provides flexible, sample-efficient, and compute-efficient confidence estimators for 3 tasks including multiple choice question answering, tool-calling, and scientific document summarization across 6 models from 4 model families. ACUTE outperforms strong baselines on EURO, while maintaining low calibration error. Taken together, our work shows that equipping LLMs with the ACUTE protocol can improve calibration, utility, and trustworthiness in numerous settings.

03.
arXiv (quant-ph) 2026-06-12

Symmetry-Accelerated Classical Simulation of Clifford-Dominated Circuits

arXiv:2510.18977v2 Announce Type: replace Abstract: Classical simulation of quantum circuits plays a crucial role in validating quantum hardware and delineating the boundaries of quantum advantage. Among the most effective simulation techniques are those based on the stabilizer extent, which quantifies the overhead of representing non-Clifford operations as linear combinations of Clifford unitaries. However, finding optimal decompositions rapidly becomes intractable as it constitutes a superexponentially large optimization problem. In this work, we exploit symmetries in the computation of the stabilizer extent, proving that for real, diagonal, and real-diagonal unitaries, the optimization can be restricted to the corresponding subgroups of the Clifford group without loss of optimality. This ``strong symmetry reduction'' drastically reduces computational cost, enabling optimal decompositions of unitaries on up to seven qubits using a standard laptop – far beyond previous two-qubit limits. Additionally, we employ a ``weak symmetry reduction'' method that leverages additional invariances to shrink the search space further. Applying these results, we demonstrate exponential runtime improvements in classical simulations of quantum Fourier transform circuits and measurement-based quantum computations on the Union Jack lattice, as well as new insights into the nonstabilizer properties of multicontrolled phase gates and unitaries generating hypergraph states. Our findings establish symmetry exploitation as a powerful route to scale classical simulation techniques and deepen the resource-theoretic understanding of quantum advantage.

04.
arXiv (CS.AI) 2026-06-17

LineageMark: Multi-user White-box Watermarking for Contribution Tracing in Model Derivation Chains

arXiv:2606.17123v1 Announce Type: cross Abstract: In open large language model (LLM) ecosystems, models are frequently adapted across multiple domains and applications, forming multi-stage derivation chains. Consequently, tracking and verifying historical contributions is essential for model provenance and intellectual property protection. However, existing watermarking methods are mainly designed for single-user, one-time embeddings, often fail under repeated model derivation and incremental updates. To address this problem, we propose LineageMark, a multi-user white-box watermarking framework for model derivation chains. The framework encodes watermarks in model parameters using a projection-based approach. Stable carriers are first selected to reduce sensitivity to model changes, each watermark bit is then represented as a projection statistic over these carriers. Additional watermark insertions introduce only bounded perturbations in the projection space, and margin constraints are used to maintain signal integrity. We evaluate the effectiveness of LineageMark in multi-stage model derivation chains. Experimental results show that LineageMark preserves contributor watermarks across multi-stage derivation and supports incremental multi-user watermark insertion. Furthermore, it exhibits robustness against perturbations such as re-watermarking, fine-tuning, quantization, and pruning.

05.
arXiv (quant-ph) 2026-06-19

Quantum Computing Applications for Flight Trajectory Optimization

arXiv:2304.14445v2 Announce Type: replace Abstract: Major players in the global aerospace industry are shifting their focus toward achieving net carbon-neutral operations by 2050. A considerable portion of the overall carbon emission reduction is expected to come from new aircraft technologies, such as flight path optimization. In pursuing these sustainability objectives, we delve into the capacity of quantum computing to tackle computational challenges associated with flight path optimization, an essential operation within the aerospace engineering domain with important ecological and economic considerations. In recent years, the quantum computing field has made significant strides, paving the way for improved performance over classical algorithms. In order to effectively apply quantum algorithms in real-world scenarios, it is crucial to thoroughly examine and tackle the intrinsic overheads and constraints that exist in the present implementations of these algorithms. Our study delves into the application of quantum computers in flight path optimization problems and introduces a customizable modular framework designed to accommodate specific simulation requirements. We examine the running time of a hybrid quantum-classical algorithm across various quantum architectures and their simulations on CPUs and GPUs. A temporal comparison between the conventional classical algorithm and its quantum-improved counterpart indicates that achieving the theoretical speedup in practice may necessitate further innovation. We present our results from running the quantum algorithms on IBM hardware and discuss potential approaches to accelerate the incorporation of quantum algorithms within the problem domain.

06.
arXiv (CS.AI) 2026-06-15

AFFORDANCE20Q: Evaluating Affordance Reasoning from Physical Properties

arXiv:2606.14240v1 Announce Type: new Abstract: Affordance reasoning, the inference of an object's action possibilities from its physical properties (e.g., shape and material), is fundamental to human physical understanding and increasingly critical for Large Language Models (LLMs). However, existing affordance benchmarks largely expose explicit object identities in the evaluation setup, allowing models to rely on memorized object-affordance mappings rather than reasoning over physical properties. To address this gap, we introduce Affordance20Q, a novel affordance reasoning benchmark formulated as a 20-Questions game without exposing the object's identity. In each game, the model identifies a hidden object's affordance from a candidate set by asking yes/no questions about its physical properties. Affordance20Q comprises 1,009 games over 454 objects and 59 affordances, all manually filtered, refined, and annotated. We conduct comprehensive experiments with 15 state-of-the-art LLMs and find a substantial gap (~20 points) compared to human performance. A KL-based information-gain (IG) analysis further shows that models fail to ask discriminating questions as the game progresses. To close the gap, we develop KB-Anchored Rule Induction (KARI), a pipeline based on LLMs that generates affordance rules grounded in evidence from knowledge bases (KBs). KARI improves open-source LLMs by up to 15.2 points, while the limited coverage of KBs hinders further gains. We release all our code and data at https://github.com/1171-jpg/Affordance20Q.git

07.
PLOS Computational Biology 2026-06-10

A mean-field model of neural networks with PV and SOM interneurons reveals connectivity-based mechanisms of gamma oscillations

by Farzin Tahvili, Martin Vinck, Matteo Di Volo Classic theoretical models of cortical oscillations are based on the interactions between two populations of excitatory and inhibitory neurons. Nevertheless, experimental studies and network simulations suggest that interneuron subclasses such as parvalbumin (PV) and somatostatin (SOM) exert distinct control over oscillatory dynamics. Yet, we lack a theoretical understanding of the mechanisms underlying oscillations in E-PV-SOM circuits and of the differences with respect to the classical mechanisms for oscillations in simpler E–I networks. Here, we derive a biologically realistic mean-field model of a canonical three-population E-PV-SOM circuit. This model robustly generates oscillations whose features are consistent with experimental observations, including the relative timing of PV and SOM activity and the effects of optogenetic perturbations. By reducing the model to a linear analytical form, we demonstrate that gamma oscillations emerge directly from the cell-specific connectivity of the three-population circuit. This connectivity motif alone accounts for experimentally observed phase relationships, with PV activity consistently leading that of SOM neurons. Together, this mean field model identifies a distinct structural mechanism giving rise to oscillations in canonical E–PV–SOM circuits and provides theoretical primitives for constructing large-scale, cell-type-specific models of cortical dynamics.

08.
arXiv (math.PR) 2026-06-12

Sub-Riemannian spectral distance

arXiv:2606.12804v1 Announce Type: cross Abstract: We study eigenvalues and eigenfunctions of the ``div-grad type" sub-Laplacian with respect to Popp's volume on a compact equiregular sub-Riemannian manifold $M$. Since Popp's volume is canonically determined by the sub-Riemannian structure of $M$, the spetra of the sub-Laplacian carry geometric meanings. In this paper, we first embed $M$ into the Hilbert space of square-summable sequences using eigenfunctions and then define a spectral distance between two compact equiregular sub-Riemannian manifolds. Our result is a sub-Riemannian analogue of Berard-Besson-Gallot's classical work in the Riemannian case.

09.
arXiv (CS.CV) 2026-06-11

P3D-Bench: Benchmarking MLLMs for Parametric 3D Generation and Structural Reasoning

Multimodal large language models can write code to produce complex programs as well as use programs to do 3D modeling, which opens up a new avenue for 3D generation powered by their priors, world knowledge and reasoning. Yet existing benchmarks rarely evaluate 3D modeling through code. Such modeling demands more than runnable code: from a text or visual specification, a model must generate a parametric 3D program that is geometrically precise, semantically aligned and assembly-consistent. We introduce P3D-Bench, a benchmark for parametric 3D generation. Unlike a 3D mesh, a parametric 3D program exposes explicit dimensions, construction operations and part relations, revealing whether a model recovers a design's structure, not just its appearance. Under a unified protocol, P3D-Bench covers three task families (Text-to-3D, Image-to-3D and Assembly-3D) and scores each output for executability, geometric fidelity, topology, text-grounded constraints, multiview semantic alignment and part-level structure. We evaluate frontier MLLMs and text-only LLMs on 400 text cases, 400 image cases and 203 annotated assemblies, with domain-specific models as reference points. Our extensive evaluation yields three findings. First, assemblies are the hardest setting, where models still fail to compose multiple parts into a coherent structure. Second, models can often recover the global shape and semantic identity of the target object, yet fail to reproduce the precise parametric geometry specified by the input. Third, part-level modeling remains weak on assemblies, where models recover neither the geometry of each part nor the right number of parts. These results position P3D-Bench as a benchmark for evaluating precise parametric geometry and part-level structure in parametric 3D generation.

10.
arXiv (CS.CV) 2026-06-17

Visual Retrieval-Augmented Generation for Silhouette-Guided Animal Art

Generative AI has advanced the ability to render photorealistic or artistic images, yet it remains limited in a key aspect of human creativity: interpreting ambiguous shapes. This phenomenon, rooted in pareidolia, allows humans to perceive meaningful forms in random patterns such as clouds, stones, or leaves. To computationally replicate this imaginative process, we introduce Visual Retrieval-Augmented Generation (Visual-RAG), a framework that generates animal art directly from natural silhouettes. Our method retrieves structurally similar animal shapes from a curated corpus of 28,586 high-quality silhouettes and uses them as reference exemplars to guide diffusion-based generation with ControlNet and IP-Adapter. Ablation studies confirm that shape Context with RANSAC provides the most accurate alignment, while removing shape standardization reduces the inlier ratio to just 13.4\%, underscoring the importance of structural fidelity in Visual-RAG. A user study with 12 participants evaluated the outputs in terms of aesthetics, silhouette fidelity, and overall impression. Results reveal that while Visual-RAG provides plausible interpretations, challenges remain in achieving high perceptual impact. This work lays the foundation for computational pareidolia, showing how machines can contribute to the early stages of imaginative discovery.

11.
arXiv (CS.AI) 2026-06-19

Efficient and Sound Probabilistic Verification for AI Agents

arXiv:2606.20510v1 Announce Type: cross Abstract: Securing AI agents that operate in complex digital environments has become a critical need, and runtime monitoring approaches that formulate and enforce policies expressed in a formal language like Datalog offer a promising solution. However, existing approaches are restricted to deterministic policies. In many practical applications of AI agents, there is a need to enforce security policies in the face of ambiguity, leading to probabilistic predicates or state transitions (for example, a declassifier or Personally Identifiable Information (PII) detector that has some failure probability on each invocation). Furthermore, in many such applications, one cannot easily make the independence assumptions necessary to invoke prior work on probabilistic inference in Datalog. We address this by introducing a sound and efficient framework for such verification based on distributionally robust optimization, computing sound upper bounds on the probability of policy violation regardless of possible correlations between predicates. On standard benchmarks for terminal and tool calling agents, we demonstrate that our approach outperforms prior art and improves the security-utility trade-off while ensuring rigorous bounds on the probability of policy violation.

12.
arXiv (CS.AI) 2026-06-16

Benign in Isolation, Harmful in Composition: Security Risks in Agent Skill Ecosystems

arXiv:2606.15242v1 Announce Type: cross Abstract: Skills are becoming the capability layer through which LLM agents turn plans into actions, but their use introduces security risks such as data leakage, unauthorized operations, and tool misuse. Existing vetting usually evaluates each skill in isolation, while real agent tasks often invoke multiple skills in a shared execution context. This creates Skill Composition Risk (SCR): a skill that appears benign alone can become harmful when its outputs, trust signals, authorization cues, or side effects influence later invocations along an activated path. We introduce SCR-Bench to evaluate this risk in controlled, sandboxed skill environments. Rather than relying only on textual intent or surface behavior, SCR-Bench records downstream state changes and path-level outcomes across composed skill executions. It contains three sub-benchmarks: SCR-CapFlow for capability-flow composition, SCR-TrustLift for trust-transfer composition, and SCR-AuthBlur for authorization-confusion composition. Across SCR-Bench, composed paths expose risks that are largely absent under isolated evaluation. In SCR-CapFlow, attack success rate reaches 33.6 percent under composition, compared with near-zero isolated baselines. In SCR-TrustLift, attack success rate exceeds 96.5 percent on four of five backends. In SCR-AuthBlur, the risky-approval rate increases by 71.8 percent relative to the L0 isolated baseline under the L1 context setting. These results show that agent skill security should be assessed at the level of activated paths rather than isolated artifacts. SCR and SCR-Bench provide a foundation for path-aware risk evaluation and defense in LLM agent skill ecosystems. Benchmark: https://github.com/saint-viperx/SCR_Bench.

13.
Nature (Science) 2026-06-12

An innovative technology boosts image quality for protein structures

After years of effort, two research teams have developed ‘laser phase plate’ systems that could help cryo-electron-microscopy users to generate high-quality structures for a broad range of proteins. After years of effort, two research teams have developed ‘laser phase plate’ systems that could help cryo-electron-microscopy users to generate high-quality structures for a broad range of proteins.

14.
arXiv (CS.CV) 2026-06-11

ERN-Net : Evolving Reason Node-Net for Document Binarization

This paper presents ERN-Net, an Evolving Reason Node-Net for efficient document image binarization. ERN-Net enhances degradation-sensitive regions, such as faint strokes, broken characters, and noisy backgrounds, through evolving reason nodes and multi-scale reasoning. We further compare ResNet-101, ConvNeXt-Tiny, and ConvNeXt-Base, and find that ConvNeXt-Tiny provides the best practical trade-off between accuracy and memory usage. In addition, DIBCO-based pretraining improves binarization performance without increasing model memory consumption, requiring only about 1.5 additional training hours. Experiments on DIBCO-style benchmarks show that ERN-Net is effective under low-data and low-memory settings.

15.
arXiv (CS.CV) 2026-06-19

Fast Human Attention Prediction for Fixation-guided Active Perception in Autonomous Navigation

Human visual attention relies on structured scanpaths to efficiently process scenes, yet instilling this behavior into robot autonomy is in its infancy and hindered by the high,computational costs of existing predictive models. To address this, we introduce GazeLNN, a computationally lightweight,scanpath prediction model that leverages Liquid Neural Networks as its recurrent engine and employs MobileNetV3 for feature extraction. Operating auto-regressively, the architecture predicts sequential fixation heatmaps conditioned on the current visual stimulus and fixation history. Despite requiring only 0.61 GFLOPs, GazeLNN achieves state-of-the-art performance on the MIT Low Resolution dataset achieving 0.47 ScanMatch score. It outperforms existing recurrent baselines across diverse evaluation metrics, while reducing computational costs by 99.40% and accelerating inference by up to six times. To investigate the role of human attention modeling in robot autonomy and demonstrate the practical utility of this highly efficient architecture, we integrate GazeLNN into an active camera-robot control policy trained via Reinforcement Learning. This integration enables human-fixation-guided perception during autonomous navigation, validated through successful real-world deployments on an aerial robot.

16.
arXiv (CS.AI) 2026-06-16

LLMs on Tabular Data with Limited Semantics: Evidence from Industrial Car Retrofit Prediction

arXiv:2606.15314v1 Announce Type: cross Abstract: Industrial retrofit planning depends on structured operational data rather than free text: planners must estimate whether a newly registered prototype will require a retrofit, which retrofit package it will need, and how long the work will take. We study an industrial dataset linking a prototype-registration system (284,271 vehicles) with a retrofit-management system (48,716 cleaned visits), and compare strong tabular machine learning baselines with three LLM-based strategies on row-serialized inputs: embedding features (Amazon Titan), direct prompted classification (Claude Sonnet 4), and an ML+LLM stacking approach. Across binary occurrence prediction, 15-way retrofit-type classification, per-visit duration regression, and an aggregated monthly benchmark, classical tree ensembles remain the strongest standalone models. However, the LLM results reveal a consistent pattern: embeddings remain useful on tables (binary AUC = 0.982), direct prompting collapses once semantic signal is stripped by hashing (binary AUC = 0.500; multiclass weighted F1 = 0.018), and hybrid stacking yields the best manually built multiclass model (weighted F1 = 0.626). On the monthly benchmark, lag-based machine learning outperforms time-series foundation models, though Chronos-small remains competitive in zero-shot forecasting. The results suggest that on privacy-constrained industrial tables, LLMs are more effective as complementary components than as replacements for strong tabular baselines.

17.
arXiv (CS.CV) 2026-06-18

Motion-Focused Latent Action Enables Cross-Embodiment VLA Training from Human EgoVideos

Training generalist Vision-Language-Action(VLA) models typically requires massive, diverse robotic datasets with high-fidelity action annotations. While egocentric human manipulation videos are abundant and capture significant environmental diversity, the absence of action labels makes them difficult to use in conventional training paradigms. To address this, we propose a latent-action-based framework designed to extract general action priors from unlabeled human videos. The architecture features a Hybrid Disentangled VQ-VAE that decouples motion dynamics from environmental backgrounds through physical masks, enabling the construction of a cross-embodiment action codebook. By pre-training on human videos with the codebook, the VLM backbone learns deep representations of action intent. For adaptation to specific embodiments, we introduce an intent-perception decoupling strategy where the VLM predicts the action intent while a separate frozen visual encoder provides state-specific features to the action expert, thereby reducing action hallucinations. Results in simulation and real-world environments show that our method, pre-trained exclusively on unlabeled human videos, performs competitively with state-of-the-art VLA models trained on massive annotated datasets, requiring only 50 trajectories for downstream adaptation.

18.
arXiv (quant-ph) 2026-06-16

Adaptively secure unitary designs with constant non-Clifford cost

arXiv:2510.08129v2 Announce Type: replace Abstract: Randomness is a fundamental resource in quantum information, with crucial applications in cryptography, algorithms, and error correction. A central challenge is to construct unitary $k$-designs that closely approximate Haar-random unitaries while minimizing the costly use of non-Clifford operations. In this work, we present a protocol able to generate unitary $k$-designs on $n$ qubits, secure against any adversarial quantum measurement, with a system-size-independent number of non-Clifford gates. Our construction applies a $k$-design only to a subsystem of size $\Theta(k)$, independent of $n$. This ``seed'' design is then ``diluted'' across the entire $n$-qubit system by sandwiching it between two random Clifford operators. The resulting ensemble forms an $\varepsilon$-approximate unitary $k$-design on $n$ qubits. We prove that this construction achieves full quantum security against adaptive adversaries using only $\tilde{O}(k^2 \log\varepsilon^{-1})$ non-Clifford gates. If one requires security only against polynomial-time adaptive adversaries, the non-Clifford cost decreases to $\tilde{O}(k + \log^{1+c} \varepsilon^{-1})$. This is optimal, since we show that at least $\Omega(k)$ non-Clifford gates are required in this setting. Compared to existing approaches, our method significantly reduces non-Clifford overhead while strengthening security guarantees to adaptive security as well as removing artificial assumptions between $n$ and $k$. These results make high-order unitary designs practically attainable in near-term fault-tolerant quantum architectures.

19.
arXiv (quant-ph) 2026-06-11

Honest-binding quantum bit commitment from separable operations

arXiv:2501.07351v3 Announce Type: replace Abstract: Bit commitment is a fundamental cryptographic primitive and a cornerstone for numerous two-party cryptographic protocols, including zero-knowledge proofs. However, it has been proven that unconditionally secure bit commitment, both classical and quantum, is impossible. In this work, we demonstrate that imposing a restriction on the committing party to perform only separable operations enables secure quantum bit commitment schemes. Specifically, we prove that in any perfectly hiding bit commitment protocol, an honestly-committing party limited to separable operations will be detected with high probability if they attempt to alter their commitment. To illustrate our findings, we present an example protocol.

20.
arXiv (CS.AI) 2026-06-15

From Shield to Target: Denial-of-Service Attacks on LLM-Based Agent Guardrails

arXiv:2606.14517v1 Announce Type: cross Abstract: LLM-based guardrails have emerged as a highly effective defense against prompt injection and jailbreak attacks in autonomous agents. However, we reveal that the very reasoning and task-following capabilities enabling this protection introduce a novel vulnerability: attackers can inject crafted data to trap the guardrail in extended reasoning loops, effectuating a systematic denial-of-service (DoS) attack. To systematically expose this threat, we design a beam-search optimization framework that crafts natural-language payloads to maximize guardrail reasoning length, utilizing an LLM proposer guided by a strategy bank. Based on the observation of guardrail's schema-following nature, we also provide another attack framework driven by mechanism-aware structural mutations with less computational load. The attack efficacy is systematically evaluated in two parts. First, in standalone evaluations, the attack generalizes across diverse guardrail architectures, safety templates, and agent benchmarks. Payloads optimized on a single open-source surrogate successfully transfer to eight leading model backbones (e.g., Claude, GPT, Gemini, DeepSeek, and Qwen), achieving a 13–63$\times$ token amplification. Second, in end-to-end real-world agent deployments (web, desktop, code, and multi-agent systems), the attack reveals up to a 148$\times$ latency amplification. We show that a single poisoned document can saturate shared guardrail infrastructures, effectively starving co-located agents and paralyzing the entire system. By uncovering this availability flaw, our work underscores the urgent need to develop cost-bounded, reasoning-robust guardrails.

21.
arXiv (CS.CL) 2026-06-16

Ling and Ring 2.6 Technical Report: Efficient and Instant Agentic Intelligence at Trillion-Parameter Scale

Efficient and scalable agentic intelligence requires models that can deliver both low-latency responses and strong reasoning capabilities while remaining practical to train, serve, and deploy. In this report, we present Ling-2.6 and Ring-2.6, a family of models designed to address this challenge at scale. Ling-2.6 is optimized for instant response generation and high capability per output token, whereas Ring-2.6 is tailored for deeper reasoning and more advanced agentic workflows. Instead of training from scratch, we upgrade the Ling-2.0 base model through architectural migration pre-training and large-scale post-training. This upgrade is guided by a unified co-design of model architecture, optimization objectives, serving systems, and agent training environments, enabling improvements in both model capability and deployment efficiency. At the architectural level, we introduce a hybrid linear attention design that integrates Lightning Attention with MLA, improving the efficiency of long-context training and decoding. To further enhance token efficiency, we optimize capability per output token through Evolutionary Chain-of-Thought, Linguistic Unit Policy Optimization, bidirectional preference alignment, and shortest-correct-response distillation. For agentic capabilities, we propose KPop, a reinforcement learning framework designed to support stable training of Ring-2.6-1T on large-scale environment-grounded data. KPop improves training efficiency through asynchronous scheduling across coding, search, tool use, and workflow execution, enabling scalable learning from complex agent-environment interactions. Together, Ling-2.6 and Ring-2.6 provide a practical pathway toward efficient, scalable, and open agentic systems. We open-source all checkpoints in the 2.6 family to support further research and development in practical agentic intelligence.

22.
arXiv (CS.LG) 2026-06-16

NanoQuant: Efficient Sub-1-Bit Quantization of Large Language Models

arXiv:2602.06694v3 Announce Type: replace Abstract: Weight-only quantization has become a standard approach for efficiently serving large language models (LLMs). However, existing methods fail to efficiently compress models to binary (1-bit) levels, as they either require large amounts of data and compute or incur additional storage. In this work, we propose NanoQuant, the first post-training quantization (PTQ) method to compress LLMs to both binary and sub-1-bit levels. NanoQuant formulates quantization as a low-rank binary factorization problem, and compresses full-precision weights to low-rank binary matrices and scales. Specifically, it utilizes an efficient alternating direction method of multipliers (ADMM) solver to precisely initialize latent binary matrices and scales, and then tunes the initialized parameters through a block and model reconstruction process. Consequently, NanoQuant establishes a new Pareto frontier in low-memory post-training quantization, and enables sub-1-bit compression. NanoQuant makes large-scale deployment feasible on consumer hardware. For example, it compresses Llama2-70B by 25.8$\times$ in just 13 hours on a single H100, enabling a 70B model to operate on a consumer 8 GB GPU. Code is available at https://github.com/SamsungLabs/NanoQuant.

23.
arXiv (CS.CL) 2026-06-12

More Context, Larger Models, or Moral Knowledge? A Systematic Study of Schwartz Value Detection in Political Texts

Detecting Schwartz values in political text is difficult because implicit cues often depend on surrounding arguments and fine-grained distinctions between neighboring values. We study when context and explicit moral knowledge help sentence-level value detection. Using the ValuesML/Touché ValueEval format, we compare sentence, window, and full-document inputs; no-RAG and retrieval-augmented settings with a curated moral knowledge base; supervised DeBERTa-v3-base/large encoders; and zero-shot LLMs from 12B to 123B parameters. The results show that more context is not uniformly better: full-document context improves supervised DeBERTa encoders by 3.8-4.8 macro-F1 points over sentence-only input, but does not consistently help zero-shot LLMs. Retrieved moral knowledge is more consistently useful in matched comparisons, improving each tested model family and context condition under early fusion. However, scaling from DeBERTa-v3-base to large and from 12B to larger LLMs does not guarantee gains, and simple early fusion outperforms the tested late-fusion and cross-attention RAG variants for encoders. Per-value analyses show that context and retrieval help most for socially situated or conceptually confusable values. These findings suggest that value-sensitive NLP should evaluate context, knowledge, and model family jointly rather than treating longer inputs or larger models as universal improvements.

24.
arXiv (CS.AI) 2026-06-11

PROJECTMEM: A Local-First, Event-Sourced Memory and Judgment Layer for AI Coding Agents

arXiv:2606.12329v1 Announce Type: new Abstract: AI coding assistants now support a growing share of software work, from quick scripts to production applications. Yet these agents remain largely stateless: each new session re-reads project files, re-derives prior decisions, and - most costly - may repeat debugging attempts that already failed. Reconstructing this context can consume an estimated 5,000-20,000 tokens per session; the bottleneck is often not model capability but missing project memory. We present projectmem, an open-source, local-first memory and judgment layer for AI coding agents. projectmem records development as an append-only, plain-text event log of typed events - issues, attempts, fixes, decisions, and notes - and deterministically projects that log into compact, AI-readable summaries served through the Model Context Protocol (MCP). Beyond storage, projectmem adds a deterministic pre-action gate that warns an agent before it repeats a previously failed fix or edits a known-fragile file. We frame this as Memory-as-Governance: memory that does not merely answer the agent but acts on its next action. The system runs fully offline with no telemetry; its immutable log also serves as a provenance trail for reproducible, auditable AI-assisted development. projectmem ships as a three-dependency Python package (14 MCP tools, 19 CLI commands, 37 automated tests) and is evaluated through a two-month self-study across 10 projects comprising 207 logged events. Source code: https://github.com/riponcm/projectmem.

25.
arXiv (quant-ph) 2026-06-15

Fourier analysis of quantum neural network with non-linear data embedding

arXiv:2606.14206v1 Announce Type: new Abstract: Fourier analysis has become a crucial tool for understanding the expressivity of Variational Quantum Circuit (VQC) models, as well as an important indicator of barren plateaus (BP). While existing literature has only studied angle-embedded VQCs in a noiseless environment, here we develop the Fourier analysis of VQCs with non-linear data embedding, with particular focus on amplitude embedding, which provides a naturally compact encoding scheme. We first investigate a subtle difference in the domain of input features within amplitude embedding that leads to a distinct expressivity of the zero-frequency Fourier coefficient. By assuming that the ensemble of unitaries generated from the parameter space forms at least a 2-design with respect to the unitary group, we derive, via Weingarten calculus, that the mean of the Fourier coefficients is concentrated at zero, and the variance scales at an exponentially decaying order with respect to the multi-dimensional frequency magnitude. When a noise channel with unitary Kraus operators and probabilities $\{p_k\}$ is taken into account, the variance is further suppressed by a factor $\left(\sum_k p_k^2\right)^{Q}