Academic Intelligence · Curated Daily

探索全球前沿学术脉络

AcademicHub 汇聚顶级期刊与预印本平台的实时文献。定制您的专属科研雷达,利用大语言模型自动生成交叉领域文献分析简报。

01.
PLOS Computational Biology 2026-06-22

Integrative modelling of innate immune response dynamics during virus infection

by Ramya Boddepalli, Harsh Chhajera, Rahul Roya Positive-sense RNA viruses that constitute a large class of human pathogens employ various strategies to suppress and evade host immune defenses. Understanding the dynamic interaction between the viral life cycle and immune signaling is crucial to designing effective antiviral strategies. Although significant progress has been made, quantitative models that can accurately capture the intricate interactions and the intertwined dynamics during viral infection of cells remain missing. In this study, we develop a comprehensive mathematical model that integrates the intracellular viral life cycle with key cellular innate immune pathways, including RIG-I-mediated detection and JAK-STAT signaling. The model provides mechanistic insights into long-standing observations, capturing both virus-specific dynamics and innate immune response, and the key components driving their coupled dynamics. For example, a comparison of viruses shows how the Japanese Encephalitis virus undergoes a dramatic reduction in viral load in cells, due to its rapid replication that robustly activates the RIG-I pathway, in contrast to the poor immune control of Hepatitis C virus. More importantly, our model demonstrates how virus-host interactions exhibit a sharp transition boundary behavior, where minor differences in immune strength or viral suppression capacity can determine whether infections resolve or persist. We propose that ISG mRNA translation and viral replication predominantly dictate these bimodal infection outcomes. Additionally, the model not only recapitulates IFN desensitization but also identifies the molecular players involved. We demonstrate how our model’s ability to capture IFN dynamics allows us to predict optimal timing and dosing strategies for interferon-based prophylactic therapies. Together, our approach reveals fundamental features that govern the delicate balance between the establishment of infection and immune control in RNA virus infections.

02.
arXiv (CS.CL) 2026-06-18

Enhancing Decision-Making with Large Language Models through Multi-Agent Fictitious Play

Large language model (LLM)-based multi-agent systems (MAS) have demonstrated great potential in solving tasks with execution complexity, by distributing subtasks across cooperative agents. However, this divide-and-conquer paradigm falls short on decision-making tasks that are also prevalent in the real world. These tasks require simultaneous reasoning from the stances of all involved stakeholders whose decisions are mutually dependent and thus cannot be solved in isolation. We characterize this challenge as stance entanglement, a form of decision complexity distinct from execution complexity. To address it, we propose Multi-Agent Fictitious Play (MAFP), a novel MAS paradigm that represents stakeholder stances as agents and formulates decision-making as an equilibrium-seeking process. Built on the game-theoretic principle of fictitious play, MAFP iteratively updates each agent's decision by best responding to the empirical mixture of other agents' past decisions. This enables agents to expose and address one another's weaknesses, progressively improving decision quality and robustness. We evaluate MAFP on challenging decision-making tasks that test the capability of deciding strategies for competitive scenarios prior to acting. MAFP outperforms both single-round and multi-round baselines on two complementary metrics, tournament strength and robustness, demonstrating its effectiveness in addressing stance entanglement.

03.
bioRxiv (Bioinfo) 2026-06-22

EMAlign: accurate alignment of cryo-EM maps through main-chain probability using deep learning

Accurate alignment of cryo-EM density maps is essential for comparing conformational states, searching map libraries, and guiding atomic model building, but remains challenging for noisy experimental maps and partially overlapping structures. Existing alignment methods are often based on raw maps, which may result in reduced accuracy due to the density noise, or require manual intervention for local alignment, which suffers from limited general applicability. Addressing the limitations, we present EMAlign, an automatic global and local cryo-EM map alignment with predicted main-chain probability using deep learning. First, EMAlign predicts main-chain prob ability maps from raw cryo-EM density maps using a BiMCUNet network. Then, a fast Fourier transform (FFT)-based search strategy is used to globally search the accurate alignment between cryo-EM maps based on predicted main-chain probability maps. As such, the main-chain prob ability map overcomes the noisy raw map problem, and the FFT-based exhaustive global search ensures the general applicability of alignment. EMAlign is evaluated on 64 global map pairs, 195 local map pairs, and 60 structure-to-map pairs at 3-10 [A] resolution and compared with gmfit, fitmap, VESPER, and CryoAlign. It is shown that EMAlign outperforms the other methods in both global and local alignment, achieving mean RMSDs of 1.03 [A] (global), 2.56 [A] (local), and 0.82 [A] (structure-to-map), with success rates of 100.0%, 100.0%, and 98.3% under the criterion of RMSD < 10 [A]. The EMAlign package is freely available at https://github.com/huang-laboratory/EMAlign/.

04.
arXiv (CS.LG) 2026-06-18

The Illusion of Improvement: Reject Inference Strategies in Credit Scoring

arXiv:2606.18479v1 Announce Type: new Abstract: Reject inference methods are widely used to mitigate survival bias in credit scoring, yet their effectiveness remains poorly understood. We systematically evaluate several such methods and uncover a structural failure mode: in a natural retraining cycle, models whose accuracy improves while recall collapses create an illusion of improvement that leads practitioners to believe the system is getting better when, in fact, its rejection quality – the ability to correctly screen out defaulters – is deteriorating. We then propose a controlled exploration strategy that breaks the feedback loop without statistical assumptions: the lender deliberately approves a fraction of rejected applicants and observes their true outcomes. We show that accuracy and rejection quality give opposite recommendations on whether to explore: accuracy favors no exploration, while rejection quality improves with it, confirming that standard evaluation metrics are misleading under selection bias. Even minimal exploration rates (2–5\%) prove sufficient in our experiments to diagnose the severity of the feedback loop at near-zero cost. Our findings are consistent across two machine learning methods and three real-world datasets, and suggest that standard evaluation protocols are inadequate for assessing models trained under survival bias.

05.
arXiv (CS.LG) 2026-06-17

AnomalyMatch: Discovering Rare Objects of Interest with Semi-supervised and Active Learning

arXiv:2505.03509v3 Announce Type: replace Abstract: Anomaly detection in large datasets is essential in astronomy and computer vision. However, due to a scarcity of labelled data, it is often infeasible to apply supervised methods to anomaly detection. We present AnomalyMatch, an anomaly detection framework combining the semi-supervised FixMatch algorithm using EfficientNet classifiers with active learning. AnomalyMatch is tailored for large-scale applications and integrated into the ESA Datalabs science platform. In this method, we treat anomaly detection as a binary classification problem and efficiently utilise limited labelled and abundant unlabelled images for training. We enable active learning via a user interface for verification of high-confidence anomalies and correction of false positives. Evaluations on the GalaxyMNIST astronomical dataset and the miniImageNet natural-image benchmark under severe class imbalance display strong performance. Starting from five to ten labelled anomalies, we achieve an average AUROC of 0.96 (miniImageNet) and 0.89 (GalaxyMNIST), with respective AUPRC of 0.82 and 0.77. After three active learning cycles, anomalies are ranked with 76% (miniImageNet) to 94% (GalaxyMNIST) precision in the top 1% of the highest-ranking images by score. We compare to the established Astronomaly software on selected 'odd' galaxies from the 'Galaxy Zoo- The Galaxy Challenge' dataset, achieving comparable performance with an average AUROC of 0.83. Our results underscore the exceptional utility and scalability of this approach for anomaly discovery, highlighting the value of specialised approaches for domains characterised by severe label scarcity

06.
arXiv (CS.CV) 2026-06-18

Grids Often Outperform Implicit Neural Representations at Compressing Dense Signals

Implicit Neural Representations (INRs) have recently shown impressive results, but their fundamental capacity, implicit biases, and scaling behavior remain poorly understood. We investigate the performance of diverse INRs across a suite of 2D and 3D real and synthetic signals with varying effective bandwidth, as well as both overfitting and generalization tasks including tomography, super-resolution, and denoising. By stratifying performance according to model size as well as signal type and bandwidth, our results shed light on how different INR and grid representations allocate their capacity. We find that, for many tasks involving dense signals, a simple regularized grid with interpolation trains faster and to higher or comparable quality than any INR with the same number of parameters. We also find limited settings – namely fitting binary signals such as shape contours – where INRs outperform grids, to guide future development and use of INRs towards the most advantageous applications.

07.
arXiv (CS.LG) 2026-06-18

Wasserstein Policy Learning for Distributional Outcomes

arXiv:2606.19117v1 Announce Type: cross Abstract: Offline policy learning has received growing attention in causal inference. The primary objective is to learn a policy (individualized treatment rule) as a mapping from covariates to treatment that maximizes the empirical welfare defined as the mean of scalar-valued potential outcomes. In this paper, we study offline policy learning with distribution-valued outcomes, where each potential outcome is a probability measure on $\mathbb{R}$ and the reward is defined through a utility functional applied to the Wasserstein barycenter of induced outcome distributions. We establish statistical guarantees for the policy learning framework based on both Inverse Probability Weighting (IPW) and Doubly Robust (DR) estimators. By handling the challenging uniform deviation over the product of the combinatorial policy class and the infinite-dimensional quantile domain, we prove that the finite-sample regret has leading dependence $\widetilde{\mathcal{O}}(\sqrt{\mathrm{N-dim}(\Pi)/N})$. In the one-dimensional Wasserstein setting and under the stated regularity conditions, the leading regret rate is still governed by the policy-class complexity. Moreover, we provide a minimax lower bound establishing the sharpness of the leading dependence on $N$ and $\mathrm{N-dim}(\Pi)$.

08.
arXiv (CS.AI) 2026-06-17

AI Adoption Across a Multinational Workforce: Sociotechnical Conditions for GenAI Acceptance in Human Resources

arXiv:2606.17887v1 Announce Type: cross Abstract: Generative AI (GenAI) deployment in the workplace is accelerating rapidly. Nevertheless, questions of who adopts, who benefits, and who is left behind and why are still understudied. In this paper, we investigate these dynamics in the context of a multinational tech company transitioning from a legacy Human Resources (HR) search system to a GenAI-supported system, analyzing search log data, survey data (n=25), and ten semi-structured interviews. Our findings show that adoption depended on the fit between the GenAI system's design assumptions and employees' work positionalities (role, spoken language, tenure). Further, we find that employees' trust in GenAI answers was built through source-checking, comparison among systems, and seeking input from colleagues or HR when in doubt. Our contribution is twofold. First, we provide empirical evidence of workplace GenAI adoption during a live organizational transition, showing that adoption is influenced by factors such as situational fit, search literacy, and trust calibration. It is also further shaped by knowledge conditions such as the system's content quality, employee training, and guidance. Second, we translate these findings into design considerations for inclusive deployment and adoption in high-stakes environments such as HR. We argue that organizations should design systems considering the role and context-sensitive benefits they yield to different social groups. They also need to treat the organizational knowledge infrastructure as AI infrastructure to improve the accountability and usability of GenAI systems

09.
medRxiv (Medicine) 2026-06-22

Leishmaniasis on YouTube: a critical appraisal of the quality, reliability, and transparency of educational content

Background: Leishmaniasis is a neglected tropical disease of significant global public health importance, for which accurate information is essential to support prevention and early care-seeking, particularly in endemic, resource-limited settings. YouTube is a widely used source of health information, but the quality and reliability of leishmaniasis-related content have not been evaluated. We aimed to assess the quality, reliability, and transparency of English-language YouTube videos on leishmaniasis. Methods: We conducted a cross-sectional analysis of YouTube videos retrieved via the YouTube Data API on 15 June 2026 using the terms "leishmaniasis," "cutaneous leishmaniasis," and "visceral leishmaniasis." After applying eligibility criteria and screening the 150 most-viewed eligible videos, 48 videos were included. Two reviewers independently assessed each video using the modified DISCERN (mDISCERN) tool, the Global Quality Score (GQS), and the JAMA benchmark criteria, with disagreements resolved by consensus. Inter-rater agreement was assessed using the intraclass correlation coefficient (ICC), and associations were examined using Spearman's rank correlation. Results: Of 402 videos retrieved, 48 met the inclusion criteria. The median GQS was 3.00 (IQR 2.00-4.00) and median mDISCERN was 3.00 (IQR 2.38-4.50), indicating moderate quality and reliability, while the median JAMA score was 2.00 (IQR 1.00-2.00), reflecting limited transparency; no video met all four JAMA criteria. The overwhelming majority of videos (47/48, 97.9%) were of professional or institutional origin. Inter-rater agreement was good to excellent (ICC 0.883 for GQS, 0.896 for mDISCERN, 1.000 for JAMA). The instruments were strongly inter-correlated (mDISCERN-GQS rho = 0.841, p < 0.001). Quality scores did not correlate positively with views, likes, or video duration; comments correlated weakly and negatively with mDISCERN (rho = -0.337, p = 0.031) and JAMA (rho = -0.381, p = 0.014). Conclusions: YouTube videos on leishmaniasis are of moderate quality and reliability but limited transparency, and are produced almost exclusively by professional sources. Video popularity, length, and age were not indicators of quality. There is a need for experts and institutions to produce clearly authored, well-sourced, and transparent educational content on this neglected tropical disease.

10.
arXiv (CS.LG) 2026-06-15

Curvature-Informed Potential Energy Surface for Protein-Ligand Binding Affinity Prediction

arXiv:2606.14217v1 Announce Type: new Abstract: Accurate prediction of protein-ligand binding affinity is essential for structure-based drug discovery. Recent geometric deep learning methods have achieved promising performance by representing protein-ligand complexes as three-dimensional graphs. However, most existing approaches mainly rely on static interaction geometry from a single bound conformation, while neglecting molecular flexibility and binding-induced conformational changes. To address this limitation, we propose a curvature-informed potential energy surface (CPES) graph neural network for protein-ligand binding affinity prediction, which incorporates physics-informed curvature representations to model conformational flexibility. CPES first derives curvature spectral descriptors from the Hessian of the potential energy surface evaluated at equilibrium configurations, whose eigenvalues define the local principal curvatures of the potential energy surface. It then uses spectral cross-attention to compare the unbound ligand and protein with the bound complex, thereby capturing binding-induced changes in conformational dynamics. In parallel, hierarchical protein-ligand interaction representations are learned from static structural features through geometry-aware message passing, soft clustering, and bidirectional cross-attention. Finally, CPES fuses the curvature-informed dynamic representations with static interaction representations for affinity regression. Extensive evaluations on multiple benchmark datasets demonstrate that CPES achieves improved predictive performance and offers physical interpretability.

11.
arXiv (CS.LG) 2026-06-11

Mitigating Disparate Impact of Differentially Private Learning through Bounded Adaptive Clipping

arXiv:2506.01396v2 Announce Type: replace Abstract: Differential privacy (DP) has become an essential framework for privacy-preserving machine learning. Existing DP learning methods, however, often have disparate impacts on model predictions, e.g., for minority groups. Gradient clipping, which is often used in DP learning, can suppress larger gradients from challenging samples. We show that this problem is amplified by adaptive clipping, which will often shrink the clipping bound to tiny values to match a well-fitting majority, while significantly reducing the accuracy for others. We propose bounded adaptive clipping, which introduces a tunable lower bound to prevent excessive gradient suppression. Our method improves worst-class accuracy by over 10 percentage points on Skewed and Fashion MNIST compared to unbounded adaptive clipping, 7 points compared to Automatic clipping, and 5 points compared to constant clipping. The code is available at https://github.com/TrustworthyMLHelsinki/adaptive-clipping-fairness.

12.
arXiv (CS.AI) 2026-06-15

The Weight Norm Sets the Grokking Timescale: A Causal Delay Law

arXiv:2606.13753v1 Announce Type: cross Abstract: Grokking is the delayed onset of generalization in neural networks, arising long after they fit the training data. Whether the weight norm causes this delay is disputed: some studies report a critical norm at the transition, others observe grokking with no fixed norm at all. We settle this by intervening on the norm during training rather than only observing it. Under free training with weight decay, networks grok when the weight norm reaches a value Wc that varies little across seeds and learning rates (CV 1 to 2 percent) and grows with the modular base as a power law. When we instead clamp the norm to a fixed multiple rho of Wc and hold it there, the network still groks, but the delay follows T_grok proportional to exp(alpha rho). One exponent, alpha near 7.5, fits this delay across four moduli (R^2 = 0.996). Over the swept ranges the held norm moves the delay by about 19x and the learning rate by only about 2x, and holding the norm above Wc slows grokking rather than preventing it. A final LayerNorm removes the dependence by decoupling weight scale from the network function; without it the exponential law returns. This pinned-norm delay is the exponential counterpart to the logarithmic delay predicted for a freely contracting norm.

13.
medRxiv (Medicine) 2026-06-10

Development of a Novel Blood-Based Assay for Brain-Derived Tau and Its Validation in Traumatic Brain Injury

Brain-derived tau (BD-tau) is an emerging blood-based biomarker for neurodegeneration, yet there are currently limited well validated BD-tau assays available for research and clinical use. To enhance access to this vital biomarker for neurological disorders including traumatic brain injury (TBI), we developed a novel blood-based immunoassay for BD-tau on the ultra-sensitive Quanterix HD-X platform using Single Molecule Array technology. Analytical validation assessed dilution linearity, specificity, precision, detection limits, and spike recovery, each recording robust metrics in agreement with international expert recommendations. The assay demonstrated robust validation metrics, achieving between-run stability of 95% when analyzing aliquots from six independent plasma and serum samples across five analytical runs. It also showed strong dilution linearity when diluted four-fold and achieved over 90% recovery when spiked with cerebrospinal fluid. Next, we evaluated the clinical utility of the assay in cohorts of individuals with traumatic brain injury (TBI), where strong performances were recorded whether using the 2-step or 3-step assay formats ({rho}= 0.94; p < 0.0001). Furthermore, plasma BD-tau distinguished samples from TBI patients based on time from injury and severity (AUC=0.93). Plasma BD-tau differentiated between favorable and unfavorable functional outcomes in the acute-severe group. Our findings underscore the significant potential of the BD-tau assay as a biomarker for TBI in the severe phase.

14.
arXiv (CS.CL) 2026-06-11

Claw-SWE-Bench: A Benchmark for Evaluating OpenClaw-style Agent Harnesses on Coding Tasks

General-purpose agents such as OpenClaw are increasingly used as autonomous tool users, but their coding ability is difficult to measure under SWE-bench: a generic agent does not by itself satisfy the clean Docker workspace, patch, and prediction contract required for scoring. We introduce Claw-SWE-Bench, a multilingual SWE-bench-style benchmark and adapter protocol that makes heterogeneous agent harnesses, or claws, comparable under fair settings including a fixed prompt, runtime budget, workspace contract, patch extraction procedure, and evaluator. The full benchmark contains 350 GitHub issue-resolution instances across 8 languages and 43 repositories, drawn from SWE-bench-Multilingual and SWE-bench-Verified-Mini after future-commit cleanup. We also release Claw-SWE-Bench Lite for faster validation, which is an 80-instance subset selected by a cost-aware, rank-aware procedure over 17 calibration columns. On the full benchmark, OpenClaw with a minimal direct-diff adapter scores only $19.1\%$ Pass@1, whereas the full adapter reaches $73.4\%$ with the same GLM 5.1 backbone, showing that adapter design is essential for enabling OpenClaw-style harnesses to perform coding tasks effectively. Across an OpenClaw $\times$ nine-model sweep and a five-claw $\times$ two-model sweep, model choice changes Pass@1 by $29.4$ pp and harness choice by $27.4$ pp under fixed models; systems with similar accuracy can differ substantially in total API cost. Claw-SWE-Bench therefore treats harness and cost accounting as first-class axes of SWE-style coding-agent evaluation, providing both a full benchmark and a low-cost reference set for reproducible comparison. The data is available at https://github.com/opensquilla/claw-swe-bench and https://huggingface.co/datasets/TokenRhythm/Claw-SWE-Bench.

15.
arXiv (CS.CL) 2026-06-16

The Truth Stays in the Family: Enhancing Contextual Grounding via Inherited Truthful Heads in Model Lineages

Recent advances in large language models (LLMs) have produced many specialized multimodal LLMs (MLLMs) that share common foundational LLMs, forming distinct model lineages. It remains unclear whether a fundamental behavioral link exists between the foundational LLMs and downstream variants. We investigate this question by quantifying head-level context-truthfulness scores. Across diverse LLM and MLLM lineages, including Vicuna-, Qwen2.5-, LLaMA2-, and Mistral-based models, we find that Truth Scores are strongly preserved within model families, even after instruction tuning or multimodal adaptation. We further show that this inheritance is consistent with attention-head weight preservation, and that context-truthful heads attend to query-relevant evidence. Building on this finding, we propose TruthProbe, a soft-gating strategy that amplifies context-truthful heads while preserving other head contributions. TruthProbe improves contextual truthfulness on HaluEval and reduces multimodal hallucination on POPE and CHAIR, with base-LLM Truth Scores transferring effectively to their fine-tuned LLM and MLLM descendants. Code is available at https://github.com/miso-choi/TruthProbe.

16.
arXiv (CS.AI) 2026-06-16

Interpretation as Linear Transformation: A Cognitive-Geometric Model of Concepts and Meaning

arXiv:2512.09831v2 Announce Type: replace Abstract: This paper develops a geometric framework for modeling concepts, motivation, and influence across cognitively heterogeneous agents. Each agent is represented by a personalized value space, a vector space encoding the internal dimensions through which the agent interprets and evaluates meaning. Evaluative concepts are formalized as structured vectors, abstract beings, whose transmission is mediated by linear interpretation maps. An abstract being survives communication only if it avoids the null spaces of these maps, yielding a structural criterion for intelligibility, miscommunication, and concept death. Within this framework, I show how conceptual distortion, motivational drift, and the limits of mutual understanding arise from purely algebraic constraints. A central result, the No-Null-Space Leadership Condition, characterizes leadership as a property of representational reachability rather than persuasion or authority. More broadly, the model explains how abstract beings can propagate, mutate, or disappear as they traverse diverse cognitive geometries. The account unifies insights from conceptual spaces, social epistemology, and AI value alignment by grounding meaning preservation in structural compatibility rather than shared information or rationality. I argue that this cognitive-geometric perspective clarifies the epistemic boundaries of influence in both human and artificial systems, and offers a general foundation for analyzing conceptual dynamics across heterogeneous agents.

17.
arXiv (CS.CV) 2026-06-16

Wavelength-Multiplexed 2D Beam Steering via a Passive Diffractive Network

We introduce a wavelength-addressable diffractive optical network that transforms illumination wavelength into a high-dimensional control parameter for arbitrarily programmable 2D beam steering. The proposed passive architecture comprises cascaded spatially optimized diffractive layers, jointly designed using deep learning, to rapidly map distinct wavelengths to predefined/desired output angles. Unlike conventional single-layer dispersive optical elements, which are physically restricted to 1D linear mapping, this framework harnesses complex wavefront transformations to utilize the illumination wavelength as an intrinsic addressing key for arbitrary 2D beam steering, eliminating the need for mechanical scanning or electronic phase control. We numerically demonstrate wavelength-controlled beam steering across 625 wavelength channels spanning 400-750 nm, realizing a 25 x 25 array of independently addressable beam positions with subwavelength positioning accuracy and high channel fidelity. Unlike conventional gratings, which constrain wavelength routing to a linear trajectory, the proposed diffractive network performs nonlocal wavefront transformations, enabling arbitrary wavelength-to-angle mappings across a 2D field of view. We further validate the proposed framework experimentally in both the terahertz and visible spectral regimes, demonstrating wavelength-multiplexed beam steering using 3D fabricated passive diffractive layers at terahertz frequencies and phase-only spatial light modulators in the visible spectrum. This wavelength-addressable diffractive architecture establishes a compact and scalable paradigm for high-speed programmable beam steering, with potential applications in optical communications, routing, imaging, sensing, and emerging photonic information-processing systems.

18.
arXiv (quant-ph) 2026-06-17

Tungsten Germanide Superconducting Nanowire Single-Photon Detectors with Saturated Internal Detection Efficiency at Wavelengths up to 29 {\mu}m

arXiv:2511.20868v2 Announce Type: replace-cross Abstract: Superconducting nanowire single-photon detectors (SNSPDs) are among the most sensitive single-photon detectors available and have the potential to transform fields ranging from infrared astrophysics to molecular spectroscopy. However, extending their performance into the mid-infrared spectral region - crucial for applications such as exoplanet transit spectroscopy and vibrational fingerprinting of molecules - has remained a major challenge, primarily due to material limitations and scalability constraints. Here, we report on the development of SNSPDs based on tungsten germanide, a novel material system that combines high mid-infrared sensitivity with compatibility for large-scale fabrication. Our detectors exhibit saturated internal detection efficiency at wavelengths up to 29 {\mu}m, while using 2.7x thicker films (8 nm vs 3 nm) and up to 4.5x wider nanowires (360 nm vs 80 nm) compared to mid-infrared-optimized SNSPDs fabricated from tungsten silicide. This advance will enable scalable, high-performance single-photon detection in a spectral region that was previously inaccessible, opening new frontiers in remote sensing, thermal imaging, environmental monitoring, molecular physics, and astronomy.

19.
arXiv (CS.CV) 2026-06-16

A biological vision inspired framework for machine perception of abutting grating illusory contours

Higher levels of machine intelligence demand alignment with human perception and cognition. Deep neural networks (DNN) dominated machine intelligence have demonstrated exceptional performance across various real-world tasks. Nevertheless, recent evidence suggests that DNNs fail to perceive illusory contours like the abutting grating, a discrepancy that misaligns with human perception patterns. Departing from previous works, we propose a novel deep network called illusory contour perception network (ICPNet) inspired by the circuits of the visual cortex. In ICPNet, a multi-scale feature projection (MFP) module is designed to extract multi-scale representations. To boost the interaction between feedforward and feedback features, a feature interaction attention module (FIAM) is introduced. Moreover, drawing inspiration from the shape bias observed in human perception, an edge detection task conducted via the edge fusion module (EFM) injects shape constraints that guide the network to concentrate on the foreground. We assess our method on the existing AG-MNIST test set and the AG-Fashion-MNIST test sets constructed by this work. Comprehensive experimental results reveal that ICPNet is significantly more sensitive to abutting grating illusory contours than state-of-the-art models, with notable improvements in top-1 accuracy across various subsets. This work is expected to make a step towards human-level intelligence for DNN-based models.

20.
arXiv (CS.AI) 2026-06-19

A Neuromorphic Reinforcement Learning Framework for Efficient Pathfinding in Robotic Mobile Fulfillment Systems

arXiv:2606.20031v1 Announce Type: cross Abstract: Dynamic environmental changes, confined workspaces, and stringent real-time constraints make pathfinding in Robotic Mobile Fulfillment Systems (RMFS) a challenging problem for conventional search- and rule-based methods, which typically suffer from high computational complexity and long decision latency. While reinforcement learning (RL) has emerged as a powerful alternative, deploying learned policies with extreme energy efficiency on resource-constrained hardware remains an open challenge. We present SDQN-RMFS, an end-to-end framework that achieves high-fidelity deployment of an RL-trained policy from a full-precision artificial neural network (ANN) through to a neuromorphic chip. By computing only when triggered by sparse events, this framework unlocks ultra-low-power RMFS pathfinding. Our full-stack pipeline operates as follows: an ANN policy is first efficiently trained via a collision-allowing strategy to densify informative trajectories, and then converted into a spiking neural network (SNN) via a hard-label knowledge distillation approach. This effectively addresses the output distribution mismatch, preserving policy capability across the ANN-to-SNN pipeline while substantially reducing inference latency. Hardware experiments demonstrate up to 11,281$\times$ energy savings and a nearly two-fold reduction in latency compared to a high-performance GPU baseline, while maintaining decision quality on par with the original trained policy. These results establish physical neuromorphic inference as a practical and energy-sustainable pathway for large-scale RMFS operations.

21.
arXiv (CS.CV) 2026-06-15

Prompt2Effect: Training-Free Image-to-Video Model Specialization via LoRA Generation

Personalizing Image-to-Video (I2V) diffusion models with specific visual effects is increasingly demanded for high-end video generation. Current practice requires training a separate Low-Rank Adaptation (LoRA) module for each effect, incurring substantial data curation and iterative optimization costs that hinder interactive control. We present Prompt2Effect, a weight-driven hypernetwork that amortizes per-effect training by directly synthesizing effect-specific LoRA weights in a single forward pass. Unlike prior hypernetworks that regress adapter weights purely from semantics, Prompt2Effect is explicitly conditioned on the frozen base model weights, grounding weight prediction in the structural geometry of each layer. Furthermore, instead of predicting raw LoRA matrices, we introduce an SVD-canonicalized parameterization that resolves factorization ambiguity and stabilizes large-scale weight synthesis. Together, these design principles enable accurate and scalable LoRA prediction for high-dimensional I2V diffusion models. Extensive experiments demonstrate that Prompt2Effect achieves on-par or superior video quality and effect alignment compared to conventional LoRA fine-tuning, while reducing the computational cost from 56 GPU training hours to 3.3 seconds of hypernetwork inference. When used as initialization for subsequent fine-tuning, our predicted weights further improve final performance and accelerate optimization by approximately 10x.

22.
arXiv (CS.AI) 2026-06-11

SPEAR: A System for Post-Quantization Error-Adaptive Recovery Enabling Efficient Low-Bit LLM Serving

arXiv:2606.11244v1 Announce Type: cross Abstract: Efficient large language model (LLM) serving is increasingly constrained by deployment cost. Quantization is a key technique for reducing serving cost, yet even state-of-the-art 4-bit quantizers exhibit a noticeable quality gap from FP16, particularly for smaller models where low-bit serving is most beneficial. We identify a fundamental cause of this gap: quantization error is highly input-dependent and varies substantially across tokens, while existing post-quantization compensation methods are static and apply identical corrections to all inputs. As a result, easy tokens are over-corrected while hard tokens remain under-corrected. We present SPEAR, a system for post-quantization error-adaptive recovery that improves low-bit LLM serving. SPEAR introduces lightweight Error Compensators (ECs) modulated by per-token gates and places them only at the most error-sensitive layers identified through a CKA-guided entropy-aware diagnostic. This focuses a small parameter budget where it is most effective. Efficient deployment of ECs presents several systems challenges, including additional computation, tensor-parallel synchronization caused by input-dependent gating, and latency instability across configurations. SPEAR addresses these issues through adaptive kernel-fusion dispatch, combining an epilogue-integrated peer-reduction kernel with P2P dual-write to fuse the post-EC computation into low-bit GEMMs, and an SLO-constrained EC-aware scheduler for predictable serving performance. Across challenging per-channel quantization settings, SPEAR recovers 56-75% of the perplexity gap between W4 and FP16 while adding less than 1% model memory overhead and maintaining latency comparable to a widely used 4-bit serving deployment.

23.
arXiv (CS.LG) 2026-06-18

Towards a future space-based, highly scalable AI infrastructure system design

arXiv:2511.19468v2 Announce Type: replace-cross Abstract: If AI is a foundational general-purpose technology, we should anticipate that demand for AI compute – and energy – will continue to grow. The Sun is by far the largest energy source in our solar system, and thus it warrants consideration how future AI infrastructure could most efficiently tap into that power. This work explores a scalable compute system for machine learning in space, using fleets of satellites equipped with solar arrays, inter-satellite links using free-space optics, and Google tensor processing unit (TPU) accelerator chips. To facilitate high-bandwidth, low-latency inter-satellite communication, the satellites would be flown in close proximity. We illustrate the basic approach to formation flight via an 81-satellite cluster of 1 km radius, and describe an approach for using high-precision ML-based models to control large-scale constellations. Trillium TPUs are radiation tested. They survive a total ionizing dose equivalent to a 5 year mission life without permanent failures, and are characterized for bit-flip errors. Launch costs are a critical part of overall system cost; a learning curve analysis suggests launch to low-Earth orbit (LEO) may reach $\lesssim$\$200/kg by the mid-2030s.

24.
arXiv (CS.AI) 2026-06-16

FlowMPC: Improving Flow Matching policies with World Models

arXiv:2606.16286v1 Announce Type: cross Abstract: Flow Matching (FM) is a powerful approach for behavior cloning in multimodal action spaces [Jiang et al., 2025], but because it is not trained to directly maximize expected return, there is still room to improve how FM policies act at test time. This work investigates whether a learned world model can improve FM policies by enabling Model Predictive Path Integral (MPPI) planning over candidate action sequences proposed by the policy. Building on TD-MPC2 [Hansen et al., 2024], I introduce FlowMPC, a framework that combines an imitation-learned FM policy with a learned world model for test-time planning in ManiSkill manipulation tasks [Tao et al., 2025]. Across PickCube and PickSingleYCB, adding the world model improved performance over the FM policy alone, with especially clear gains in end-of-episode success. These results suggest that world-model-based planning can effectively complement flow-based imitation policies without modifying the FM training objective.

25.
arXiv (math.PR) 2026-06-15

The 1/4-phenomenon of placement probabilities of tilings in the Aztec diamond

arXiv:2512.08377v2 Announce Type: replace-cross Abstract: We consider domino tilings of the Aztec diamond. Using the Domino Shuffling algorithm introduced by Elkies, Kuperberg, Larsen, and Propp in arXiv:math/9201305, we are able to generate domino tilings uniformly at random. In this paper, we investigate the probability of finding a domino at a specific position in such a random tiling. We prove that this placement probability is always equal to $1/4$ plus a rational function, whose shape depends on the location of the domino, multiplied by a position-independent factor that involves only the size of the diamond. This result leads to significantly more compact explicit counting formulas compared to previous findings. As a direct application, we derive explicit counting formulas for the domino tilings of Aztec diamonds with $2\times 2$-square holes at arbitrary positions.