Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
arXiv (CS.LG) 2026-06-12

ShapeBench: A Scalable Benchmark and Diagnostic Suite for Standardized Evaluation in Aerodynamic Shape Optimization

arXiv:2605.20763v2 Announce Type: replace Abstract: Rapid progress in aerodynamic shape optimization (ASO) has outpaced currently-available standardized evaluation frameworks. Fair comparison requires a unified benchmark spanning diverse shape classes, objective formulations, and matched-budget state-of-the-art baselines. We introduce ShapeBench, an open-source ASO benchmark with a unified API spanning 103 tasks across eight shape categories and multiple optimization regimes. Each ShapeBench task includes a validated surrogate for fast search; when feasible, a high-fidelity Computational Fluid Dynamics (CFD) pipeline for final verification is available, enabling systematic fidelity-gap analysis. ShapeBench provides a reproducible protocol with well-configured baselines to compare fairly using a consistent budget metric, allowing for comparison among both classical and LLM-driven methods, including general-purpose optimizers and a new domain-specialized evolutionary LLM baseline, ShapeEvolve. Results on ShapeBench demonstrate substantial variance in optimizer rankings across shape categories and problem formulations, with mean pairwise Spearman $\rho = 0.013$, so single-task conclusions do not reliably generalize across problem classes. The benchmark is also far from saturation; classical methods are rarely applicable across all shape categories and tasks, further highlighting the need for more general-purpose approaches.
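The rank-agreement statistic quoted in this abstract can be illustrated with a short sketch. This is not ShapeBench code: it assumes every task yields a strict, tie-free ranking of the same set of optimizers, and the function names are hypothetical.

```python
from itertools import combinations

def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two tie-free rankings of the same items."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

def mean_pairwise_rho(rankings):
    """Average Spearman rho over all unordered pairs of per-task rankings."""
    pairs = list(combinations(rankings, 2))
    return sum(spearman_rho(a, b) for a, b in pairs) / len(pairs)
```

A mean near zero, as reported, would indicate that knowing an optimizer's rank on one task says almost nothing about its rank on another.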

02.
bioRxiv (Bioinfo) 2026-06-08

DipSkmer: Reference-free population genomics with diploid genome skims

Ecologists and conservation biologists rely on genetic diversity as a key essential biodiversity variable (EBV) used to track population health and dynamics, and utilize the population parameter $\theta$ (estimated by the average pairwise genomic distance) as a key metric of diversity. While whole-genome sequencing (WGS) is increasingly affordable, it will be considerable time before the full diversity of life is represented by high-quality assembled genomes; even then, constant monitoring will still require repeated sampling of populations. In contrast, genome skimming (low-coverage, short-read WGS) is highly cost-effective but challenging to analyze because the coverage is too low for assembly and reliable error correction. Mature methods, such as Mash, exist for estimating pairwise genomic distances based on the Jaccard similarity of k-mer sets computed using sketching techniques. Some, such as Skmer, additionally model the impacts of low coverage. These methods have been successfully applied to assembly-free species identification and phylogenetics; however, their use in population genetics has been limited. This is because these methods implicitly treat genomes as haploid and heterozygosity confounds true estimates of genomic distance for diploid organisms. In this paper, we address this problem through a number of technical advances. First, we use coalescent theory to mathematically derive how the Jaccard index between two diploid samples changes with the scaled population size parameter ($\theta$). Next, we derive an estimator that computes $\theta$ from the Jaccard index, in addition to several auxiliary variables, which we also estimate from the genome skims. The resulting method, DipSkmer, enables more accurate estimates of coverage, sequencing error, and pairwise nucleotide distance for diploid samples. Analyses of both simulated and empirical datasets show that for diploids and low distances (e.g.,

03.
arXiv (CS.LG) 2026-06-17

Recursive Learning Without Collapse: A Weighting-Based Stabilization Framework

arXiv:2502.18049v5 Announce Type: replace-cross Abstract: Recent studies identified an intriguing phenomenon in recursive generative model training known as model collapse, where models trained on data generated by previous models exhibit severe performance degradation. Addressing this issue and developing more effective training strategies have become central challenges in generative model research. In this paper, we investigate this phenomenon within a novel framework, where generative models are iteratively trained on a combination of newly collected real data and synthetic data from the previous training step. To develop an optimal training strategy for integrating real and synthetic data, we evaluate the performance of a weighted training scheme in various scenarios, including Gaussian distribution estimation, generalized linear models, and nonparametric estimation. We theoretically characterize the impact of the mixing proportion and weighting scheme of synthetic data on the final model's performance. Our key finding is that, across different settings, the optimal weighting scheme under different proportions of synthetic data asymptotically follows a unified expression, revealing a fundamental trade-off between leveraging synthetic data and model performance. In some cases, the optimal weight assigned to real data corresponds to the reciprocal of the golden ratio. Finally, we validate our theoretical results on extensive simulated datasets and a real tabular dataset.
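As a rough illustration of the weighting idea in the Gaussian mean setting, the sketch below combines a real-data mean and a synthetic-data mean with a weight on the real data. It is an assumption-laden toy, not the paper's estimator: the function name is hypothetical, and applying the weight at the level of the two sample means is a simplification chosen here. The constant is the reciprocal of the golden ratio mentioned in the abstract.

```python
import math

# Reciprocal of the golden ratio: 1/phi = (sqrt(5) - 1) / 2, roughly 0.618.
INV_GOLDEN = (math.sqrt(5) - 1) / 2

def weighted_mean(real, synthetic, w_real):
    """Weighted combination of the real-data mean and the synthetic-data mean.

    w_real is the weight on real data; 1 - w_real goes to synthetic data.
    """
    mean_real = sum(real) / len(real)
    mean_syn = sum(synthetic) / len(synthetic)
    return w_real * mean_real + (1 - w_real) * mean_syn
```

The constant satisfies w^2 + w = 1, which is the defining identity of the golden ratio's reciprocal.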

04.
arXiv (CS.CV) 2026-06-16

CropTrack: A Tracking with Re-Identification Framework for Precision Agriculture

Multiple-object tracking (MOT) in agricultural environments presents major challenges due to repetitive patterns, similar object appearances, sudden illumination changes, and frequent occlusions. Contemporary trackers in this domain rely on the motion of objects rather than appearance for association. Nevertheless, they struggle to maintain object identities when targets undergo frequent and strong occlusions. The high similarity of object appearances makes integrating appearance-based association nontrivial for agricultural scenarios. To solve this problem we propose CropTrack, a novel MOT framework based on the combination of appearance and motion information. CropTrack integrates a reranking-enhanced appearance association, a one-to-many association with appearance-based conflict resolution strategy, and an exponential moving average prototype feature bank to improve appearance-based association. Evaluated on publicly available agricultural MOT datasets, CropTrack demonstrates consistent identity preservation, outperforming traditional motion-based tracking methods. Compared to the state of the art, CropTrack achieves significant gains in association accuracy and identification precision scores with a lower number of identity switches.
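An exponential moving average prototype, one of the components named in this abstract, can be sketched generically as follows. This is not the CropTrack implementation: the momentum value, the L2 normalisation step, and the function name are assumptions made for illustration.

```python
import math

def update_prototype(prototype, feature, momentum=0.9):
    """Blend a track's stored prototype with a new appearance feature.

    The result is L2-normalised so cosine similarity reduces to a dot product.
    """
    blended = [momentum * p + (1 - momentum) * f
               for p, f in zip(prototype, feature)]
    norm = math.sqrt(sum(x * x for x in blended)) or 1.0  # guard against zero vectors
    return [x / norm for x in blended]
```

A higher momentum makes the prototype slower to change, which can help when a single occluded frame yields an unreliable feature.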

05.
arXiv (CS.CV) 2026-06-12

Allure of Craquelure: A Variational-Generative Approach to Crack Detection in Paintings

Recent advances in imaging technologies, deep learning and numerical performance have enabled non-invasive detailed analysis of artworks, supporting their documentation and conservation. In particular, automated detection of craquelure in digitized paintings is crucial for assessing degradation and guiding restoration, yet remains challenging due to the possibly complex scenery and the visual similarity between cracks and crack-like artistic features such as brush strokes or hair. We propose a hybrid approach that models crack detection as an inverse problem, decomposing an observed image into a crack-free painting and a crack component. A deep generative model is employed as a powerful prior for the underlying artwork, while crack structures are captured using a Mumford–Shah-type variational functional together with a crack prior. Joint optimization yields a pixel-level map of crack localizations in the painting.

06.
arXiv (CS.AI) 2026-06-16

Learning aligned EEG representations with subject-specific encoders

arXiv:2606.16462v1 Announce Type: cross Abstract: Cross-subject EEG decoding promises more training data, but it also exposes neural networks to strong inter-subject distribution shifts. We study whether task supervision and architecture alone can learn subject-aligned representations. We replace a shared EEG encoder with subject-specific encoders followed by a common classifier, and compare this hybrid model with standard EEGNet, AttentionBaseNet, and CTNet baselines with Euclidean Alignment (EA) on four motor-imagery datasets. EA improves shared encoders by recentering subject covariances, but the hybrid encoder largely internalises this role: validation-loss curves and latent-distance analyses change little when EA is removed. Subject-specific heads increase class distinctiveness and place each subject close to its own latent manifold, improving most subjects while leaving a method-sensitive subset. These results support subject-specific encoders as a learned alignment mechanism for EEG decoding and identify head selection for unseen subjects as the remaining bottleneck.

07.
arXiv (CS.AI) 2026-06-18

CaVe-VLM-CoT: An Interpretable Vision-Language Model Framework

arXiv:2606.18385v1 Announce Type: new Abstract: Vision-Language Models (VLMs) remain prone to hallucinations, producing fluent but visually unfaithful outputs. Existing chain-of-thought and retrieval-augmented methods only partially address this, as they neither enforce step-level citation grounding nor route verification failures back to retrieval for correction. We present CaVe-VLM-CoT, a modular reflection-based agentic-RAG framework that enforces evidence-grounded reasoning through a five-stage closed-loop pipeline: Extractor, Retriever, Solver, Citation Injector, and Verifier, in which detected ungrounded claims trigger structured feedback to the Extractor for targeted re-retrieval. Since no existing framework jointly measures retrieval quality, step-wise citation faithfulness, and cross-modal grounding, we propose a suite of 23 component-wise metrics across all stages, anchored by CaVeScore, a composite metric weighting accuracy, citation precision and recall, attribution, and evidence grounding. Without any architectural or prompt modifications, CaVe-VLM-CoT achieves 87.1% accuracy and 56.6% CaVeScore on ScienceQA, and 55.2% accuracy and 35.7% CaVeScore on MMMU (30 subjects).

08.
arXiv (CS.LG) 2026-06-11

SwiftCTS: Fast Cross-Design Prediction and Pareto Optimization of Clock Tree Metrics via Few-Shot Calibration

arXiv:2606.11348v1 Announce Type: new Abstract: Clock Tree Synthesis (CTS) is a computationally expensive stage in the physical design flow, requiring iterative EDA tool invocations to navigate a vast configuration space for optimal power, wirelength, and timing skew. Existing machine learning approaches require computationally expensive retraining or fine-tuning cycles to adapt to unseen macro architectures and are architecturally mismatched to the millions of evaluations demanded by exhaustive combinatorial search. We present SwiftCTS, a physics-informed surrogate framework that addresses both limitations simultaneously. By coupling lightweight, physics-grounded statistical features with gradient-boosted ensembles, SwiftCTS trains in under five seconds on a CPU and delivers sub-millisecond inference without GPU support. To handle out-of-distribution (OOD) designs without retraining or fine-tuning, we introduce a K-shot multiplicative calibration mechanism that anchors predictions to just one or two physical reference runs, reducing power prediction error from 24.5% to 3.3% and wirelength error from 56.6% to under 1% on unseen macros. Integrating this engine with an evolutionary optimizer, SwiftCTS evaluates 100,000 CTS configurations in under ten seconds, yielding Pareto-optimal frontiers that are physically validated within the OpenROAD flow. Closed-loop validation confirms prediction errors below 0.5% for power and wirelength, and timing skew predictions within five picoseconds on an OOD benchmark, consistently outperforming default tool heuristics across all target metrics. Code publicly available at: https://github.com/BarsatKhadka/SwiftCTS (linked as https://anonymous.4open.science/r/SwiftCTS-7E6E)
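One plausible reading of a K-shot multiplicative calibration is a single scale factor fitted from a handful of reference runs and applied to all surrogate predictions for that design. The sketch below follows that reading only; the abstract does not give the formula, so the mean-of-ratios rule and both function names are assumptions, not the SwiftCTS code.

```python
def kshot_scale(reference_true, reference_pred):
    """Scale factor from K reference runs: mean ratio of measured to predicted values."""
    ratios = [t / p for t, p in zip(reference_true, reference_pred)]
    return sum(ratios) / len(ratios)

def calibrate(predictions, scale):
    """Apply the multiplicative correction to surrogate predictions."""
    return [scale * p for p in predictions]
```

With K equal to one or two, the only cost beyond the surrogate itself is those few physical runs on the unseen design.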

09.
arXiv (CS.LG) 2026-06-11

Deep Learning of Solver-Aware Turbulence Closures from Nudged LES Dynamics

arXiv:2604.23874v3 Announce Type: replace-cross Abstract: The differentiable physics paradigm may be leveraged as an a-posteriori approach for discovering turbulence closure models by embedding a neural network parameterization directly inside the solver and optimizing it given potentially sparse target data. This addresses a key limitation of a-priori learning where direct numerical simulation (DNS) data is used to approximate the subgrid stress with the assumption of a low-pass filter. Closures trained in this a-priori manner frequently lead to unstable deployments due to the mismatch between the assumed filter and the effect of numerical discretizations and coarse-graining. In comparison, while typically stable during deployment, a-posteriori learning incurs high computational costs due to the need to backpropagate through a large eddy simulation (LES) solver. Furthermore, a-posteriori methods are challenging to apply broadly since they require significant modification of existing solvers. Finally, both approaches are limited when generalization is desired across different numerical schemes with their implicit filtering characteristics. In this work, we present a deep-learning approach for turbulence closure modeling built on the continuous data assimilation framework. Our approach enables the a-priori training of closures using sparsely observed DNS data without modifying or differentiating through the LES solver, while preserving stability during deployment for the recovery of invariant statistics. We focus on the model's ability to adapt to different discretizations by explicitly conditioning it on the numerical scheme. We use two- and three-dimensional canonical cases to test our framework and show that the learned correction systematically tracks the discretization error of the coarse solver.

10.
arXiv (CS.AI) 2026-06-15

Exact Linear Attention

arXiv:2605.18848v4 Announce Type: replace-cross Abstract: This paper introduces Exact Linear Attention (ELA), a mechanism that achieves linear computational complexity for Transformer attention by exploiting the exact decomposition property of kernel functions, thereby eliminating approximation error. We identify and address two key limitations of prior linear attention – gradient explosion and token attention dilution – by imposing kernel constraints that ensure non-negativity, discriminability, and geometric interpretability. Several kernel functions are proposed, including the Hadamard Exp Kernel, Summation Squared Euclidean Distance Kernel, and Subtraction Squared Euclidean Distance Kernel, each tailored for specific attention behaviors. Beyond the core attention formulation, the paper presents three engineering innovations: (1) a Hyper-Link structure that replaces traditional residual connections to mitigate gradient degradation; (2) a Memory Lobe module based on bidirectional linear attention, which captures "transformation flow" across layers to implement qualitative memory and an implicit reinforcement learning paradigm; and (3) a routing-score-based bias mechanism for Mixture-of-Experts (MoE) to improve interpretability and semantic alignment. Experimental results demonstrate that ELA achieves up to 6x faster decoding speed and 75% reduction in KV cache memory usage compared to full attention, while maintaining comparable or superior training performance. The proposed memory module accelerates convergence and enhances generalization. Furthermore, we extend the linear attention principle to vision models, yielding YOLO-LAT, which attains up to 4.3x GPU inference speedup and 7.9x parameter reduction with competitive detection accuracy. These results underline the broad applicability of exact linear attention for scaling Transformer models to ultra-long sequences and efficient visual tasks.

11.
arXiv (CS.LG) 2026-06-19

Advances in Scientific Machine Learning for Coupled Fluid Flow and Transport

arXiv:2606.19562v1 Announce Type: new Abstract: This chapter reviews recent advances in Scientific Machine Learning (SciML) for modeling coupled fluid flow and transport phenomena governed by the incompressible Navier-Stokes and scalar transport equations. Such systems, found in applications like turbidity currents and thermal convection, feature strong nonlinear coupling and multiscale behavior that make high-fidelity simulations computationally expensive. To address this, the chapter surveys state-of-the-art SciML methods for building efficient surrogate models, including linear reduced-order techniques based on Singular Value Decomposition (such as Dynamic Mode Decomposition) and nonlinear neural network approaches like Physics-Informed Neural Networks (PINNs) and $\beta$-Variational Autoencoders ($\beta$-VAEs). It first covers the authors' work combining these models with High Performance Computing strategies, including Adaptive Mesh Refinement/Coarsening (AMR/C) and scientific floating-point data compression. It then presents two new contributions: surrogate modeling of turbidity currents via PINNs, and the extraction of disentangled nonlinear modes from thermal flows using $\beta$-VAEs. Governing equations and representative benchmarks, including lock-exchange flows and Rayleigh-Bénard convection, illustrate these methodologies. The chapter is intentionally long, covering both the mathematical and physical foundations of coupled fluid flow and the computational aspects of state-of-the-art modeling. Overall, it demonstrates how SciML enables fast, accurate approximations of complex coupled systems within the specific data regimes and modeling assumptions considered, while substantially reducing computational cost relative to full-order simulations. Broader capabilities such as real-time prediction and uncertainty quantification remain active research directions whose feasibility depends strongly on the problem at hand.

12.
arXiv (CS.LG) 2026-06-15

Utility-Constrained Policy Optimization

arXiv:2606.14029v1 Announce Type: new Abstract: Constrained MDPs (CMDPs) are a widely adopted framework for incorporating safety into RL agents; however, the framework does not support risk-sensitive constraints. This can be problematic: For example, CMDPs allow for optimal solutions that, in order to satisfy the risk-neutral constraints, mix infrequent catastrophic behaviors and frequent, overly conservative ones. Moreover, prior empirical results suggest that enforcing stricter, risk-sensitive constraints can improve performance even under risk-neutral evaluation. The natural framework to incorporate risk-sensitive constraints is utility-constrained MDPs (UCMDPs), but no practical solutions for this problem existed. In this work, we introduce a simple yet powerful methodology for UCMDPs and constrained RL. Besides allowing for risk-sensitive constraints, our framework does not require us to fix constraint limits in advance of training the agent, provided that a sensible range is known. This increases policy flexibility and, in practice, allows for adjustments to these limits at no extra training cost. Besides benefiting from the generality of the framework, our agent shows strong performance in practice, consistently matching or outperforming existing baselines in several Safety Gymnasium benchmark tasks.

13.
arXiv (CS.LG) 2026-06-11

From inverse problems to neural operators: prediction, mechanism, and generalization of data-driven models

arXiv:2606.08956v2 Announce Type: replace Abstract: Scientists have historically relied on mathematical models based on differential equations to relate system inputs – forces, fluxes, or heat sources – to outputs, such as displacement, velocity, concentration, and temperature. These models rely on deep domain knowledge to determine the form of the governing differential equation, which is then calibrated with data by solving an inverse problem. In recent years, the field of Scientific Machine Learning has introduced a variety of alternative modeling strategies for physical systems. A method called Sparse Identification of Nonlinear Dynamics learns the governing equation as a sparse linear combination of terms in a user-defined library. Neural Ordinary Differential Equations construct the governing equation by taking in the state and its derivatives at the input layer of a neural network. Entirely foregoing the modeling framework of differential equations, neural operators directly learn a non-linear mapping between the system inputs and outputs. From inverse problems to neural operators, all of these modeling strategies can be conceptualized as data-driven machinery to predict a system's response over a range of inputs. It is then natural to wonder how exactly these various strategies relate to each other, and whether they can be neatly taxonomized. Drawing from the philosophical literature on scientific models, we argue that many model types have a common structure, differing only in the assumed model class of the input-output relation they define. Connecting to philosophical ideas on mechanism, and arguing that data from physical systems arises from solutions to parsimonious differential equations, we propose that only certain models are capable of mechanism discovery, and thus generalization. Our analysis is intended to unite apparently disparate modeling strategies and provide insight into their appropriate use cases.

14.
arXiv (CS.LG) 2026-06-16

Circuit Tracing in Autoregressive Protein Language Models

arXiv:2606.16044v1 Announce Type: new Abstract: Protein language models (pLMs) can generate novel protein sequences with properties beyond those observed in nature, yet the mechanisms underlying protein generation remain poorly understood. Existing mechanistic interpretability methods based on sparse autoencoders and transcoders primarily focus on protein representation learning models and do not capture the computation required for autoregressive generation. Here, we introduce ProGenMech, a mechanistic interpretability framework for generative protein language models that extends cross-layer transcoders (CLTs) to ProGen3, a sparse Mixture-of-Experts model trained for both causal generation and span infilling. Unlike per-layer approaches, CLTs reconstruct each layer using sparse latent variables from all preceding layers, enabling faithful recovery of inter-layer generative computation. We further develop a zero-shot circuit discovery framework to identify sparse latent circuits responsible for protein generation and fitness prediction. In causal generation and zero-shot fitness estimation tasks, ProGenMech outperforms local transcoder baselines in recovering ProGen3's probability distribution and functional scoring behavior, while matching the original model's generative distribution in span infilling tasks. Moreover, the recovered circuits reveal biologically meaningful motifs and functional regions associated with conserved sequence patterns and protein fitness landscapes, establishing a foundation for interpretable and steerable protein generation.

15.
PLOS Computational Biology 2026-06-15

Fung-AI: An AI/ML-driven pipeline for antifungal peptide discovery

by Daniel S. Berman, Libby M. Lewis, Tom D. Curtis, Olivia N. Tiburzi, Daniel F. Q. Smith, Arturo Casadevall, Laura J. Dunphy

Emerging fungal pathogens represent a concerning threat to both global health and food security. In this study, we aimed to address our rising vulnerability to fungal pathogens through the development of the Fung-AI pipeline: an AI/ML-driven approach for antifungal discovery. A generative adversarial network (GAN) was trained to generate novel candidate antifungal peptide sequences. Next, in silico antifungal and hemolytic classifiers were built to further prioritize AI-generated peptides for experimental validation. From a pool of ~10,000 candidates, thirteen peptides were selected for testing over two stages of experimentation. Five peptides were found to display mild antifungal activity against the wheat pathogen, Fusarium graminearum, with minimal inhibitory concentrations (MICs) ranging from 250 µg/mL to 500 µg/mL. Four of the five peptides also showed activity against the human pathogen, Candida albicans (MIC: 500 µg/mL). Two of our AI-generated antifungal peptides additionally demonstrated low cytotoxicity in HepG2 human liver carcinoma cells (LC50 > 704.2 µg/mL), indicating that they may be useful as scaffolds for future optimization for therapeutic applications. None of our peptides were found to considerably inhibit the emerging pathogen C. auris, suggesting the need for pathogen-specific down-selection of candidate peptides. Overall, we present a proof-of-principle, generative-AI-based approach for the rapid design of de novo antifungal peptides.

16.
arXiv (CS.CL) 2026-06-16

PACUTE: Phonology-, Affix-, and Character-level Understanding of Tokens for Filipino

Large language models (LLMs) process text as sequences of subword tokens, which can obscure the character-level and morphological structure that underlies word formation. This limitation is most acute for languages with non-concatenative morphology, where standard tokenizers systematically misalign token boundaries with morpheme boundaries. We introduce PACUTE, a diagnostic benchmark of 4,600 tasks designed to evaluate morphological understanding in Filipino, a language characterized by productive infixation, reduplication, and diacritic-driven lexical distinctions that are typically absent from written text. PACUTE includes a hierarchical diagnostic framework of six compositional levels that localizes where morphological understanding breaks down. Evaluating open-weight LLMs and frontier commercial models, we find that open-weight models perform near chance on morpheme decomposition regardless of scale. Frontier models perform much better, often recovering individual affixes under contains-match scoring, but remain far below their character-level ceilings on compositional tasks of morpheme transformations and syllabification. These results identify productive morphological composition, rather than character access alone, as the persistent bottleneck for Filipino word-structure understanding.

17.
arXiv (quant-ph) 2026-06-19

Ultrafast nonadiabatic dynamics of tetraphenylsubstituted nitrogen-based heterocycles

arXiv:2604.16897v2 Announce Type: replace-cross Abstract: Tetraphenylpyrazine (TPP) and 2,3,4,5-tetraphenyl-1H-pyrrole (TePP) are closely related heterocycles bearing four phenyl substituents, whose structural similarity makes them a useful pair for comparing how intramolecular flexibility influences excited-state relaxation and emission in the gas phase and in the solid state. TPP is a prototypical solid-state luminescence enhancement (SLE) emitter, exhibiting a markedly increased quantum yield upon molecular aggregation. In contrast, TePP displays similar quantum yields in solution and solid state, characteristic of dual-state emission (DSE). This behaviour indicates that intramolecular rotations are already significantly hindered in the isolated-molecule regime, consistent with our previous observations for TPP and other solid-state emitters (Hernández-Rodríguez et al., ChemPhysChem, 2024, 25, e202400563). To unravel the excited-state dynamics underlying this contrasting behaviour, we performed mixed quantum-classical trajectory simulations on a single molecule of TPP and TePP employing the surface-hopping method. Twelve singlet states were included at the TD-B3LYP-D3/def2-SVP level, which were previously benchmarked against coupled cluster methods. Simulated observables such as gas phase ultrafast electron diffraction (GUED) and time-resolved fluorescence (TR-FL) signals allow us to dissect the distinct deactivation pathways operating in both systems in the gas phase, while also providing mechanistic insight into how these pathways are expected to evolve in solution and solid-state environments.

18.
arXiv (quant-ph) 2026-06-15

Spin counting via projection noise measurement of mesoscopic solid-state spin ensemble

arXiv:2606.14437v1 Announce Type: new Abstract: Quantum projection noise is the fundamental noise source for the population measurement of spin ensembles. While projection-noise-limited measurements have been extensively studied in atomic systems, corresponding experiments on solid-state spin ensembles remain challenging due to dominant classical readout noise. Here, we report direct measurement of the quantum projection noise of mesoscopic ensembles of nitrogen-vacancy (NV) spin defects at room temperature. Our experiment is enabled by a high optically-detected magnetic resonance (ODMR) contrast of over 20% for a single crystallographic orientation of the defect spins, obtained by combining polarization-selective optical excitation with spin-to-charge conversion. We use our protocol to demonstrate projection noise measurements and spin counting from nanoscale NV ensembles of up to 43 spins. We further demonstrate that the protocol allows for significant gains in sensitivity for magnetometry applications without need for cryogenic operation or high bias magnetic fields.

19.
arXiv (CS.CL) 2026-06-11

Fanar-Sadiq: A Multi-Agent Architecture for Grounded Islamic QA

Large language models (LLMs) can answer religious knowledge queries fluently, yet they often hallucinate and misattribute sources, which is especially consequential in Islamic settings where users expect grounding in canonical texts (Qur'an and Hadith) and jurisprudential (fiqh) nuance. Retrieval-augmented generation (RAG) improves grounding; however, a single retrieve-then-generate pipeline is insufficient for diverse Islamic queries, including verbatim scripture, citation-grounded guidance, and rule-constrained computations such as zakat and inheritance. To address these challenges, we present Fanar-Sadiq, a bilingual Arabic-English Islamic QA system built on a multi-agent, tool-augmented architecture. It is a core component of the Fanar AI platform. Fanar-Sadiq routes Islamic queries to specialized modules within an agentic tool architecture. It supports intent-aware routing, retrieval-grounded fiqh answers with normalized citations and verification traces, exact verse lookup with quotation validation, and deterministic Sunni zakat and inheritance calculators with madhhab-sensitive branching. We evaluate the end-to-end system on public Islamic QA benchmarks and show strong effectiveness and efficiency. It is publicly accessible through an API and Web application and has received over 1.9M accesses in less than a year (https://api.fanar.qa/docs).

20.
arXiv (CS.LG) 2026-06-19

Comparative Study on Agility, Efficiency, and Impact Absorption of Bipedal Robots with Active Toes

arXiv:2606.19699v1 Announce Type: cross Abstract: Human legs exhibit high efficiency, agility, and impact absorption, with toes playing a crucial role in these capabilities. While many attempts have been made to implement human-like toes in robots, they have not fully replicated human characteristics nor rigorously validated their benefits. We propose a 14-DOF biped robot emulating human toes' lightweight, high-torque, robust nature. To quantitatively analyze the effectiveness of the active toes in terms of agility, efficiency, and impact absorption, we developed a high-fidelity simulation training environment that reflects actual actuators with coupled transmissions and accurate power consumption. To ensure a fair comparison between configurations with and without active toes, we designed a minimal RL reward function and applied an identical training procedure to both. The simulation results indicate that, at 1.33 m/s walking, the toe-equipped robot reduced CoT by 17.5% and heel-strike GRF by 5.0% compared with the toe-ablation configuration. On the agility test, average and maximum path deviation decreased by 25.0% and 34.0%, respectively.

21.
arXiv (math.PR) 2026-06-15

On the Poisson Follower Model

arXiv:2309.04864v5 Announce Type: replace Abstract: We introduce a stochastic geometry dynamics inspired by opinion dynamics that captures the essence of modern asymmetric social networks with leaders and followers. Points in the Euclidean space represent opinions, and the leader of an agent is the one with the closest opinion. In this dynamics, each follower updates its opinion by halving the distance to its leader. We demonstrate that this simple dynamics and its iterations exhibit several interesting purely geometric phenomena related to the evolution of leadership and opinion clusters, which resemble those observed in social networks. We also show that when the initial opinions are randomly distributed as a stationary Poisson point process, the spatial frequency of each of these phenomena can be expressed through an integral geometry formula involving semi-algebraic domains. Finally, we analyze numerically the limiting behavior of this follower dynamics. In the Poisson case, the agents fall into two categories: ultimate followers, who continue updating their opinions indefinitely, and ultimate leaders, who adopt a fixed opinion after a finite time. Spatial discrete event simulations support all our findings.
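
The update rule described above can be illustrated with a small sketch. This is not the authors' simulation code: the function names (`nearest_index`, `follower_step`, `sample_opinions`) are hypothetical, the update is assumed to be synchronous for all agents, and a finite set of uniform points in the unit square stands in for a stationary Poisson point process.

```python
import math
import random


def nearest_index(points, i):
    """Return the index of the point closest to points[i], excluding i itself."""
    best_j, best_d = -1, math.inf
    for j, q in enumerate(points):
        if j == i:
            continue
        d = math.dist(points[i], q)
        if d < best_d:
            best_j, best_d = j, d
    return best_j


def follower_step(points):
    """One synchronous step (an assumption): every agent halves the
    distance to its leader, i.e. moves to the midpoint between its own
    opinion and the closest other opinion."""
    leaders = [nearest_index(points, i) for i in range(len(points))]
    return [
        tuple((a + b) / 2 for a, b in zip(points[i], points[leaders[i]]))
        for i in range(len(points))
    ]


def sample_opinions(n, seed=0):
    """Finite stand-in for a Poisson sample: n uniform points in [0, 1]^2."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(n)]


# Three opinions on a line: 0 and 1 are mutually closest, 2 follows 1.
print(follower_step([(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]))
# → [(0.5, 0.0), (0.5, 0.0), (2.5, 0.0)]
```

Iterating `follower_step` on the output of `sample_opinions` gives a rough way to watch opinion clusters form, although boundary effects in a finite window mean it only approximates the stationary setting studied in the paper.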

22.
arXiv (CS.LG) 2026-06-17

From Drift to Coherence: Stabilizing Beliefs in LLMs

arXiv:2606.17832v1 Announce Type: new Abstract: Large language models (LLMs) are often hypothesized to perform implicit Bayesian inference, yet a key coherence condition, the martingale property of predictive beliefs, has been shown to fail in controlled synthetic in-context learning settings. We revisit this question in a more typical usage regime: generic multiple-choice question answering. Exploiting the discrete answer space, we compute exact predictive distributions and study belief dynamics induced by autoregressive answer resampling. We introduce prompted predictive resampling (PPR), where an LLM generates a sequence of answers to the same question. Empirically, PPR reveals early-stage belief drift, indicating martingale violations. However, after sufficient resampling steps, the belief process self-stabilizes and converges to a coherent predictive distribution. Based on this observation, we further propose (i) a seed-answer prompting strategy to accelerate stabilization, and (ii) a self-consistency loss that amortizes early-stage drift into the model via fine-tuning. Experiments on multiple-choice QA benchmarks show that our methods substantially reduce belief drift and improve predictive coherence without sacrificing accuracy.
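
For readers unfamiliar with the martingale property mentioned above, the following toy sketch shows what the condition means for an exact Bayesian learner. It uses a Beta-Bernoulli model, which is an illustrative assumption and not the paper's setup; the function names are hypothetical. The point is that, for a coherent Bayesian, the expected next predictive belief equals the current one, so any systematic drift signals a violation.

```python
from fractions import Fraction


def predictive(a, b, successes, trials):
    """Posterior predictive P(next = 1) under a Beta(a, b) prior
    after observing `successes` ones in `trials` draws."""
    return Fraction(a + successes, a + b + trials)


def expected_next_predictive(a, b, successes, trials):
    """Expected value of the next-step predictive when the next
    observation is itself drawn from the current predictive."""
    p = predictive(a, b, successes, trials)
    up = predictive(a, b, successes + 1, trials + 1)    # next draw is 1
    down = predictive(a, b, successes, trials + 1)      # next draw is 0
    return p * up + (1 - p) * down


# Martingale check: the expected future belief equals the current belief.
current = predictive(1, 1, 3, 5)
print(current, expected_next_predictive(1, 1, 3, 5))  # → 4/7 4/7
```

A resampling procedure such as the one the abstract describes can be compared against this baseline: if the average belief after one more self-generated answer differs from the current belief, the process is not a martingale at that step.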

23.
arXiv (CS.CL) 2026-06-19

MENTOR: Reinforcement Learning via Flexible Teacher-Optimized Rewards for Tool-Use Distillation

Distilling the tool-use capabilities of large language models (LLMs) into small language models (SLMs) is essential for their practical application. The predominant approach, supervised fine-tuning (SFT), suffers from poor out-of-domain (OOD) generalization due to its rigid alignment with static teacher trajectories. While reinforcement learning (RL) offers an alternative, the capacity limitations of SLMs pose a severe dilemma: sparse outcome rewards provide insufficient guidance, whereas strict trajectory matching imposes overly restrictive constraints. To bridge this capacity-driven gap, we propose MENTOR, which introduces a flexible yet process-aware reward structure. Instead of enforcing rigid replication, MENTOR uses the teacher's reference to guide tool-use behavior, balancing behavioral alignment with downstream performance. Extensive experiments on controlled executable-tool benchmarks demonstrate that MENTOR improves OOD tool-use performance compared to SFT and strict RL baselines. Our findings suggest that within verifiable tool-use environments, flexible tool-use alignment offers a more effective approach than strict trajectory replication for developing adaptable small models.

25.
arXiv (CS.LG) 2026-06-18

How fast can you find a good hypothesis?

arXiv:2509.03734v3 Announce Type: replace-cross Abstract: In the hypothesis selection problem, we are given sample and query access to a finite set of candidate distributions (hypotheses), $\mathcal{H} = \{H_1, \ldots, H_n\}$, and samples from an unknown distribution $P$, both over a domain $\mathcal{X}$. The goal is to output a distribution $Q$ whose distance to $P$ is comparable to that of the nearest hypothesis in $\mathcal{H}$. Specifically, if the minimum distance is $\mathsf{OPT}$, we aim to output $Q$ such that, with probability at least $1-\delta$, its total variation distance to $P$ is at most $C \cdot \mathsf{OPT} + \varepsilon$. The optimal approximation for proper algorithms (where $Q \in \mathcal{H}$) is $C=3$ using $\Theta(\log(n/\delta)/\varepsilon^2)$ samples from $P$, and for improper algorithms (where $Q$ is not necessarily in $\mathcal{H}$) it is $C=2$ using $\tilde{\Theta}(\log(n/\delta)/\varepsilon^2)$ samples from $P$. In the improper setting, the algorithm achieving $C=2$ [Bousquet, Braverman, Kol, Efremenko, Moran, FOCS 2021] runs in time which grows polynomially with $|\mathcal{X}|$, so it does not run in finite time for real-valued distributions. A promising path towards improved runtime is to consider improper algorithms which output a mixture $Q$ of the hypotheses, as such a distribution can be represented in $n$ words of memory. We show (1) a lower bound that no algorithm which outputs a mixture can achieve approximation better than $C = 3-2/n$ unless the number of samples is polynomial in $|\mathcal{X}|$, as well as (2) an algorithm which runs in time $\mathrm{poly}(n)$ and achieves the same approximation guarantee. In the proper setting, [Aliakbarpour, Bun, Smith, NeurIPS 2024] provided an algorithm with $C=3$ running in $\tilde{O}(n/(\delta^3\varepsilon^3))$ time. We improve this time complexity to $\tilde{O}(n/(\delta \varepsilon^2))$, significantly reducing the dependence on the confidence and error parameters.
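
As a reading aid, the guarantee and the runtime improvement stated in this abstract can be written out in display form. The notation $d_{\mathrm{TV}}$ for total variation distance is an assumption introduced here; the abstract describes it in words only.

$$\mathsf{OPT} = \min_{1 \le i \le n} d_{\mathrm{TV}}(H_i, P), \qquad \Pr\left[\, d_{\mathrm{TV}}(Q, P) \le C \cdot \mathsf{OPT} + \varepsilon \,\right] \ge 1 - \delta.$$

Ignoring logarithmic factors, the ratio between the earlier and the improved running time in the proper setting is

$$\frac{n/(\delta^3 \varepsilon^3)}{n/(\delta \varepsilon^2)} = \frac{1}{\delta^2 \varepsilon},$$

so with, for example, $\delta = \varepsilon = 0.1$ the stated bounds differ by a factor of about $1000$.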