Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
arXiv (quant-ph) 2026-06-12

From 2D Yang-Mills to Calogero-Sutherland via a colored particle

arXiv:2606.13388v1 Announce Type: cross Abstract: We study Yang-Mills theory coupled to a particle on a cylinder, where gauge invariance and compactness reduce the dynamics to a finite dimensional quantum system. In the Abelian case, this yields a model equivalent to the Landau problem on a torus, with a degenerate ground state structure. We generalize this construction to non-Abelian gauge groups and show that, for SU(N), the system reduces to a one dimensional quantum many body problem with a singular Calogero-Sutherland-type interaction.

02.
arXiv (CS.LG) 2026-06-16

The Complexity of Min-Max Optimization for Quadratic Polynomials

arXiv:2606.17000v1 Announce Type: cross Abstract: We prove that computing approximate stationary points of min-max optimization over the hypercube is PPAD-hard for quadratic polynomials. This holds even when the polynomials are multilinear, each variable appears in at most three monomials, and the approximation factor is inverse polynomial. As a direct consequence, we obtain the first PPAD-hardness results for two-team zero-sum polymatrix games.

03.
arXiv (CS.CL) 2026-06-12

BOUTEF: A Multilingual Corpus for FakeNews in North Africa – Language as a Weapon

The rapid spread of fake news on social media has become a major challenge, particularly in multilingual and under-resourced contexts such as North Africa. In this paper, we introduce BOUTEF, a large-scale multilingual corpus designed to study the propagation, characteristics, and impact of fake news in Algeria and Tunisia. The corpus integrates three complementary components: fake narratives, genuine narratives, and associated user-generated comments, along with verified debunking information. It covers a wide range of languages and linguistic varieties, including MSA, Algerian and Tunisian dialects, Arabizi, French, English, and code-switched language. Building on this resource, we conduct a comprehensive empirical analysis combining quantitative and qualitative approaches. We examine thematic distributions, linguistic and rhetorical strategies, sentiment patterns, and social engagement dynamics. Statistical analyses reveal significant associations between thematic categories and message veracity, as well as strong correlations between user engagement and the visibility of fake content. Our findings show that fake news relies heavily on emotionally charged narratives, sensational framing, and hybrid linguistic practices that enhance virality and audience engagement. In contrast, debunking content adopts a more factual and verification-oriented style. Furthermore, a comparative analysis between Algeria and Tunisia highlights both shared dynamics and country-specific characteristics shaped by sociopolitical contexts. The results emphasize the role of informal language practices in the diffusion and reception of misinformation. By providing a rich, annotated, and publicly available dataset, this work contributes to advancing research on fake news detection, low-resource language processing, and the understanding of information disorders in complex linguistic environments.

04.
arXiv (CS.LG) 2026-06-11

Visualizing LLM Latent Space Geometry Through Dimensionality Reduction

arXiv:2511.21594v3 Announce Type: replace Abstract: Large language models (LLMs) achieve state-of-the-art results across many natural language tasks, but their internal mechanisms remain difficult to interpret. In this work, we extract, process, and visualize latent state geometries in Transformer-based language models through dimensionality reduction. We capture layerwise activations at multiple points within Transformer blocks and enable systematic analysis through Principal Component Analysis (PCA) and Uniform Manifold Approximation and Projection (UMAP). We demonstrate experiments on GPT-2 and LLaMa models, where we uncover interesting geometric patterns in latent space. Notably, we identify a clear separation between attention and MLP component outputs across intermediate layers, a pattern not documented in prior work to our knowledge. We also characterize the high norm of latent states at the initial sequence position and visualize the layerwise evolution of latent states. Additionally, we demonstrate the high-dimensional helical structure of GPT-2's positional embeddings and the sequence-wise geometric patterns in LLaMa. We make our code available at https://github.com/Vainateya/Feature_Geometry_Visualization. A better formatted blog-post with identical content is available at https://iclr-blogposts.github.io/2026/blog/2026/vis-llm-latent-geometry/.

05.
arXiv (CS.LG) 2026-06-16

On the Benefits of Weight Normalization for Overparameterized Matrix Sensing

arXiv:2510.01175v2 Announce Type: replace Abstract: While normalization techniques are widely used in deep learning, their theoretical understanding remains relatively limited. In this work, we establish the benefits of (generalized) weight normalization (WN) applied to the overparameterized matrix sensing problem. We prove that WN with Riemannian optimization achieves linear convergence, yielding an exponential speedup over standard methods that do not use WN. Our analysis further demonstrates that both iteration and sample complexity improve polynomially as the level of overparameterization increases. To the best of our knowledge, this work provides the first characterization of how WN leverages overparameterization for faster convergence in matrix sensing.

06.
medRxiv (Medicine) 2026-06-10

Estimating COVID-19 Cumulative Incidence from Seroprevalence Surveys accounting for Time-Varying Seroreversion: A Fully Bayesian Methodology

Seroprevalence surveys reveal the extent of humoral immunity against pathogens such as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), and under some circumstances represent cumulative incidence of prior infection. However, antibody waning - or seroreversion - biases these estimates by reducing assay sensitivity in a time-varying manner. Because assay sensitivity decays over time, naively using serosurveys can substantially bias estimates of SARS-CoV-2 cumulative incidence and fatality rates. The Bayesian assay-specific, time-varying sensitivity adjustment developed in this paper can reliably correct for this bias and account for the delay between infection and serosurvey. In seroprevalence studies conducted in the United States in 2020, adjusting for time-varying sensitivity increased cumulative incidence by up to 1.4-fold, with an adjustment of 1.08 for a national study. Our estimates contrast with a previously published 2-fold adjustment that did not account for assay design. This suggests that previous analyses overestimated cumulative incidence by applying seroreversion corrections that did not account for assay-specific effects, or underestimated cumulative incidence by not applying seroreversion corrections. These biases imply fatality rate underestimation and overestimation, respectively. Our model provides a framework for design-specific time-varying sensitivity corrections in seroprevalence surveys for other pathogens.

07.
arXiv (CS.AI) 2026-06-17

DecoSearch: Complexity-Aware Routing and Plan-Level Repair for Text-to-SQL

arXiv:2606.17821v1 Announce Type: new Abstract: Large Language Models (LLMs) have demonstrated remarkable capabilities in translating natural language to SQL, yet existing methods still falter on complex queries requiring multi-step, data-aware reasoning. We introduce DecoSearch, a training-free framework that addresses this by routing each query to the appropriate level of reasoning effort. A lightweight Schema Selector first prunes the full database schema to the relevant tables and columns. An LLM Judger then decides whether the question requires decomposition: straightforward questions follow a direct generation path and complex ones are escalated to a Directed Acyclic Graph (DAG) of atomic sub-questions, each solved by a targeted SQL generation step. A RAG component grounds the decomposer with semantically similar training examples, and a Topology Refiner restructures the reasoning plan when execution failures signal a flawed decomposition rather than a fixable SQL error. DecoSearch achieves 70.53% execution accuracy on BIRD and 88.31% on Spider with a DeepSeek backbone, surpassing all training-free baselines while consuming an order of magnitude fewer tokens than competing methods. It also functions as a model-agnostic wrapper, consistently improving fine-tuned SQL generation backbones without any modification to the pipeline.

08.
arXiv (quant-ph) 2026-06-16

Morphology-resolved scrambling in a chaotic quantum billiard

arXiv:2606.16865v1 Announce Type: new Abstract: Chaotic quantum systems can retain spatial memory through scarred eigenstates, but whether these static structures control scrambling remains unclear. This work establishes a morphology-resolved connection between scarred eigenstates and eigenstate-resolved OTOCs in a peanut-shaped quantum billiard. Scalar localisation diagnostics, including differential entropy and continuum participation ratios, detect anomalous concentration but discard spatial architecture. A scale-normalised density overlap, in contrast, directly compares probability density profiles, revealing families of orthogonal eigenstates with nearly identical spatial morphology. Comparing the complete OTOC time traces of these orthogonal eigenstates reveals that morphological recurrence has dynamical content: moderate density overlap yields no universal prediction, whereas strongly recurring morphologies exhibit nearly identical OTOC growth and saturation. Thus, scarred structures act as spatial templates for operator growth, not merely static violations of ergodicity. This morphology-resolved framework turns eigenstate shape into a quantitative predictor of scrambling and provides a scale-controlled diagnostic of weak ergodicity breaking in quantum chaos.

09.
arXiv (CS.AI) 2026-06-16

Separable Neural Architectures as Physical World Models: from Mathematical Theory to Applications

arXiv:2606.14934v1 Announce Type: cross Abstract: This work introduces the Separable Neural Architecture (SNA), a function representational class combining neural approximation with tensor decomposition. The SNA decouples localized coordinate functions (atoms) from global interactions governed by a sparse, low-rank interaction object. This architecture possesses a compact and smooth inductive bias well-suited for solving partial differential equations (PDEs). When viewed as a Galerkin trial space under the variational SNA (VSNA) framework, the formulation satisfies classical variational guarantees under Lax-Milgram: well-posedness, quasi-optimality, convergence, and stability. In high-dimensional spatiotemporal–parametric PDEs, the VSNA mitigates the curse of dimensionality by scaling algebraically rather than exponentially. Exploiting an entirely factorized, tensor-native alternating least squares (ALS) optimization framework reduces this cost to linear in dimension. The VSNA is validated across elliptic, hyperbolic, and parabolic systems, demonstrating close alignment with predicted algebraic and spectral scaling rates. We showcase the SNA as a "solve once, query anywhere" physical world model via two engineering case studies: a 7D parametric manufacturing simulation and an experimental thermal-to-property inversion pipeline for Inconel 718. The VSNA executes a 1,000,000-query Monte Carlo sweep in 102s on a standard laptop CPU, yielding a 150,000x speedup over a full-grid finite element baseline hosted on an NVIDIA A100 GPU. It further enables real-time generative inverse-mode reconstructions under 100ms. These results demonstrate that the SNA serves as a compact mathematical substrate for continuous parameter manifolds to enable real-time inversion, optimization loops, and rapid uncertainty propagation.

10.
arXiv (quant-ph) 2026-06-12

Optimal classical shadow estimation of unitary channels at Heisenberg limit

arXiv:2606.13638v1 Announce Type: new Abstract: Full tomography of an unknown quantum evolution is resource-intensive and often unnecessary when the goal is only to predict selected properties. This motivates the study of classical shadow estimation of unitary channels (CSEU), a task in which one queries an unknown $d$-dimensional unitary $U$ and stores classical data that can later be used to predict expectation values $\mathrm{tr}[O \cdot U\rho U^\dagger]$ up to additive error $\varepsilon$ for arbitrary input states $\rho$ and observables $O$. We propose a parallel, non-adaptive CSEU protocol using $\mathcal{O}(d\varepsilon^{-1})$ queries when the input states or observables have constant rank. This achieves Heisenberg scaling with respect to $\varepsilon$ and is query-optimal, as we prove a matching $\Omega(d\varepsilon^{-1})$ lower bound that remains valid even with stronger access to the unknown unitary. Our query-optimal CSEU protocol provides a versatile and powerful tool for quantum learning theory, pushing the performance limits of several fundamental learning tasks, including unitary channel tomography, Hamiltonian learning, boundary-regime quantum channel tomography, Pauli transfer matrix learning, inverse-free amplitude estimation, pure-state property estimation, and shallow-circuit learning. Remarkably, we show that optimal unitary channel tomography can be achieved using only parallel queries, closing the gap between the best achievable efficiency of parallel and sequential tomography protocols. Together, these applications establish our framework as a fundamental tool for learning properties of quantum processes, particularly for certain key tasks that require high precision.

11.
bioRxiv (Bioinfo) 2026-06-11

An AI-Powered Trisomy 21 Research Assistant

Down syndrome, caused by trisomy 21, increases the risk of diverse co-occurring conditions. With more than 34,000 related publications indexed in PubMed as of early 2026, keeping pace with this expanding literature is challenging. While general-purpose large language models are widely used for information retrieval, they often rely on broad training data rather than specific evidence. Retrieval-augmented generation (RAG) improves rigor and reliability of responses by linking model outputs to source texts. In research, source texts are peer-reviewed articles. Standard implementations treat all manuscript sections equally, allowing background text to rank as highly as experimental results. To focus model outputs on experimentally supported responses, we developed the T21 Research Assistant, a section-aware RAG system that prioritizes Results sections to ground responses in primary experimental evidence. The system draws exclusively from 1,789 open-access Down syndrome publications from PubMed Central, including 327 NIH INCLUDE-funded studies, and uses a multistage pipeline for query validation, retrieval, reranking, synthesis, and citation verification. Built on NVIDIA Nemotron models, it generates structured, cited responses. Evaluation using expert-curated questions demonstrated strong performance, achieving a BERTScore F1 of 0.712 and recall of 0.758, comparable to or exceeding leading proprietary and open-source models. T21 Research Assistant is available at: https://bioinformatics.cuanschutz.edu/t21-res-assi/

12.
arXiv (CS.CV) 2026-06-12

HairPort: In-context 3D-aware Hair Import and Transfer for Images

Transferring hairstyles between images is an important but challenging task in computer graphics, computer vision, and visual effects. It enables users to explore new looks without physically altering their hair, with applications in virtual try-on systems, augmented reality, and entertainment. Most prior works operate best under small pose gaps, and they fall short under large viewpoint and scale differences, where missing hair content must be synthesized rather than transferred. We propose HairPort, a 3D-aware hairstyle transfer framework that attempts to solve these issues by explicitly separating hair removal from transfer and enforcing geometric consistency before synthesis. We introduce a Bald Converter, which produces realistic bald versions of faces through LoRA-based in-context adaptation of FLUX.1 Kontext. To train our Bald Converter, we introduce a new dataset, Baldy, containing 6,000 paired bald and original images across diverse identities and conditions. We also use a 3D-Aware Transfer Pipeline that reconstructs and re-renders the reference hairstyle from the target viewpoint before compositing it onto the source image. Being 3D aware, our method supports large pose and scale discrepancies between the source and target. Finally, a conditional flow-matching generator synthesizes the transferred result from the bald source and geometry-aligned reference guidance. Together, our method enables accurate, pose-consistent, and identity-preserving hairstyle transfer, outperforming existing methods both qualitatively and quantitatively.

13.
arXiv (CS.CL) 2026-06-19

Omnilingual SONAR: Cross-Lingual and Cross-Modal Sentence Embeddings Bridging Massively Multilingual Text and Speech

Cross-lingual sentence encoders typically cover only a few hundred languages and often trade downstream quality for stronger alignment, limiting their adoption. We introduce OmniSONAR, a new family of omnilingual, cross-lingual and cross-modal sentence embedding models that natively embed text, speech, code, and mathematical expressions in a single semantic space, while delivering state-of-the-art downstream performance at the scale of thousands of languages, from high-resource to extremely low-resource varieties. To reach this scale without representation collapse, we use progressive training. We first learn a strong foundational space for 200 languages with an LLM-initialized encoder-decoder, combining token-level decoding with a novel split-softmax contrastive loss and synthetic hard negatives. Building on this foundation, we expand to several thousands language varieties via a two-stage teacher-student encoder distillation framework. Finally, we demonstrate the cross-modal extensibility of this space by seamlessly mapping 177 spoken languages into it. OmniSONAR halves cross-lingual similarity search error on the 200-language FLORES dataset and reduces error by a factor of 15 on the 1,560-language BIBLE benchmark. It also enables strong translation, outperforming NLLB-3B on multilingual benchmarks and exceeding prior models (including much larger LLMs) by 15 chrF++ points on 1,560 languages into English BIBLE translation. OmniSONAR also performs strongly on MTEB and XLCoST. For speech, OmniSONAR achieves a 43% lower similarity-search error and reaches 97% of SeamlessM4T speech-to-text quality, despite being zero-shot for translation (trained only on ASR data). Finally, by training an encoder-decoder LM, Spectrum, exclusively on English text processing OmniSONAR embedding sequences, we unlock high-performance transfer to thousands of languages and speech for complex downstream tasks.

14.
arXiv (quant-ph) 2026-06-16

Exactly Solvable Quantum Model with Spin-Dependent Coulomb Interaction

arXiv:2501.05103v5 Announce Type: replace Abstract: In this work, we report an exactly solvable quantum model featuring a spin-dependent Coulomb interaction, described by the spin vector potential \(\vec{\mathcal{A}} = k (\vec{r} \times \vec{S}) / r^2\) together with a Coulomb-type scalar potential \(\varphi = \kappa / r\) . The model is governed by the Schrödinger-type Hamiltonian \(\mathcal{H}_S = \vec{\Pi}^2 / (2M) + q \varphi\) in nonrelativistic quantum mechanics and by the Dirac-type Hamiltonian \(\mathcal{H}_D = c \vec{\alpha} \cdot \vec{\Pi} + \beta M c^2 + q \varphi\) in relativistic quantum mechanics, where \(\vec{\Pi} = \vec{p} - (q/c)\vec{\mathcal{A}}\) is the canonical momentum. We demonstrate two main results: (i) Just as the Coulomb-type scalar potential \(\mathcal{S}_Maxwell = \{\vec{\mathcal{A}} = 0,\ \varphi = \kappa / r\}\) is a local exact solution of Maxwell's equations on $r\neq0$, the gauge potential \(\mathcal{S}_YM = \{\vec{\mathcal{A}} = k (\vec{r} \times \vec{S}) / r^2,\ \varphi = \kappa / r\}\) constitutes a local exact solution of the Yang–Mills equations on the punctured region $r\neq0$. (ii) Both Hamiltonians \(\mathcal{H}_S\) and \(\mathcal{H}_D\) can be solved exactly in the presence of this spin-dependent Coulomb interaction. The resulting energy spectra are derived, and they naturally reduce to those of the ordinary hydrogen atom when the spin-dependent terms are neglected. Finally, we clarify the quantization conditions and the fixed-background interpretation of the model.

15.
arXiv (CS.AI) 2026-06-11

When Context Returns: Toward Robust Internalization in On-Policy Distillation

arXiv:2606.11627v1 Announce Type: cross Abstract: Recent work has shown that on-policy distillation can internalize privileged context, such as system prompts or task hints, into a student model so that the context is no longer needed at inference time. Although this approach successfully improves the student's no-context performance, we identify an interesting and previously unstudied phenomenon: in many settings, reintroducing the original privileged context to the distilled student actually degrades its performance, even on instances it already solves correctly without context. We term this context-induced degradation and argue that robust internalization demands not only matching the teacher's context-conditioned behavior, but also remaining stable when the context is reintroduced, a property we call context removability. Motivated by this observation, we propose a lightweight consistency regularizer that first anchors the student's no-context output via stop-gradient, then penalizes the context-conditioned output for deviating from it via forward KL divergence. This simple addition requires only one extra forward pass per training step, yet it effectively mitigates context-induced degradation and, in many cases, even improves no-context performance. Across 12 configurations spanning diverse domains and model families, our method improves context-conditioned accuracy in the majority of settings, reduces context-induced harm in 11 out of 12 settings, and effectively eliminates response-length inflation. A mechanistic case study further confirms that context removability is achieved at the representation level, with hidden states remaining nearly identical regardless of whether the context is present.

16.
arXiv (CS.LG) 2026-06-17

Characterizing Nash Equilibria in Zero-Sum Games: A Physics-Inspired, Parallelizable Approach with a Linear Number of Gradient Queries

arXiv:2507.11366v2 Announce Type: replace-cross Abstract: We study online optimization methods for zero-sum games, a fundamental problem in adversarial learning in machine learning, economics, and many other domains. Traditional methods approximate Nash equilibria (NE) using either regret-based methods (time-average convergence) or contraction-map-based methods (last-iterate convergence). We propose a new method based on Hamiltonian dynamics in physics and prove that it can characterize the set of NE in a finite (linear) number of iterations of alternating gradient descent in the unbounded setting, modulo degeneracy, a first in online optimization. Unlike standard methods for computing NE, our proposed approach can be parallelized and works with arbitrary learning rates, both firsts in algorithmic game theory. Experimentally, we support our results by showing our approach drastically outperforms standard methods.

17.
arXiv (math.PR) 2026-06-12

Quenched and Annealed CLTs for the one-periodic Aztec diamond in random environment

arXiv:2510.11846v2 Announce Type: replace Abstract: We study the asymptotic behavior of random dimer coverings of the one-periodic Aztec diamond in random environment. We investigate quenched limit theorems for the height function and we extend annealed limit theorems that were recently studied in [arXiv:2507.08560]. We consider more general choices of random edge weights (independence is not assumed) and we distinguish two cases where the random edge weights satisfy the Central Limit Theorem (CLT) under different scalings. For both cases, we prove convergence to the Gaussian Free Field for the quenched fluctuations. For the annealed version, it had been shown in [arXiv:2507.08560], that Gaussian Free Field fluctuations can be dominated by the much larger fluctuations of the random environment. To access quenched fluctuations we analyze the Schur process with random parameters in a way that allows to prove the annealed CLT for the height function for non i.i.d. weights. We consider specific examples where we determine the asymptotic fluctuations.

18.
arXiv (CS.AI) 2026-06-16

Discovering Symmetry Groups with Flow Matching

arXiv:2512.20043v3 Announce Type: replace Abstract: Symmetry is fundamental to understanding physical systems and can improve performance and sample efficiency in machine learning. Both pursuits require knowledge of the underlying symmetries in data, yet discovering these symmetries automatically is challenging. We propose LieFlow, a novel framework that reframes symmetry discovery as a distribution learning problem on Lie groups. Instead of searching for the symmetry generators, our approach operates directly in group space, modeling a symmetry distribution over a large hypothesis group $G$. The support of the learned distribution reveals the underlying symmetry group $H \subseteq G$. Unlike previous works, LieFlow can discover both continuous and discrete symmetries within a unified framework, without assuming a fixed Lie algebra basis or a specific distribution over the group elements. Experiments on synthetic 2D and 3D point clouds, ModelNet10 and a real-world MI-Motion dataset show that LieFlow accurately discovers continuous and discrete subgroups, significantly outperforming a state-of-the-art baseline, LieGAN, in identifying discrete symmetries.

19.
arXiv (CS.CV) 2026-06-18

DVANet: Degradation-aware Visual-prior Alignment Network for Image Restoration

All-in-One image restoration aims to develop a unified restoration framework for handling diverse degradation types. Existing end-to-end methods usually regard the restoration process as a black-box mapping, lacking an explicit optimization interpretation. Although deep unfolding provides an interpretable iterative modeling paradigm for image restoration, existing methods mostly rely on fixed degradation assumptions or predefined degradation information, making them difficult to adapt to unified restoration requirements under complex degradations and locally damaged content. This limitation restricts their performance in degradation suppression and structural detail recovery. To address these issues, this paper proposes DVANet, a deep unfolding network inspired by the half-quadratic splitting optimization algorithm, which formulates unified image restoration under complex degradations as a collaborative unfolding process between degradation-aware observation consistency and visual-prior-guided reconstruction. Specifically, in the degradation-aware observation consistency branch, a degradation representation module is employed to extract global degradation attributes and local degradation cues, and degradation-conditioned mapping is used to enhance the model's adaptability to different degradation types. In the visual-prior-guided reconstruction branch, DINOv3 is introduced to provide structural and semantic information as hierarchical visual priors, thereby complementing the missing structural information in damaged regions and improving detail recovery. Extensive experiments demonstrate that DVANet achieves superior or competitive performance on multi-scenario degradation and cross-domain image restoration tasks, showing favorable degradation adaptability and generalization ability.

20.
arXiv (CS.CV) 2026-06-16

CheXGenBench: A Unified Benchmark For Fidelity, Privacy and Utility of Synthetic Chest Radiographs

Structured benchmarks have advanced text-conditional image generation for real-world imagery, however, no such benchmark exists for synthetic radiograph generation. Despite being a highly active area of research, existing studies continue adopting inconsistent evaluation protocols and lack a unified assessment of the three most critical criteria: generative fidelity, privacy risk, and downstream utility. To address these limitations, we introduce CheXGenBench, the first unified evaluation framework for synthetic chest radiograph generation that simultaneously assesses fidelity, privacy risks, and downstream utility across frontier text-to-image (T2I) generative models. Our evaluation protocol, comprising over 20 quantitative metrics, covers 11 leading T2I architectures with plug-and-play integration for newer models. Through a rigorous and fair evaluation protocol, we establish comprehensive baseline state-of-the-art (SoTA) performances across all dimensions to guide future research. Furthermore, our results uncover several limitations of current generative models, which include first, even SoTA models struggle with long-tailed medical distributions; second, models pose high privacy risks regardless of fidelity quality; and third, while synthetic data already benefits downstream classification, it is of limited utility for downstream multimodal tasks. Drawing from these results, we propose concrete research directions to advance the field. The code is available at https://github.com/Raman1121/CheXGenBench

21.
arXiv (CS.AI) 2026-06-15

SpheriCity: Designing Trustworthy Conversational AI for Sustainability Decision Support

arXiv:2606.13854v1 Announce Type: cross Abstract: We present SpheriCity, an expert-grounded conversational prototype designed to support trustworthy knowledge sensemaking from sustainability reports. City-level circularity assessment reports contain rich information about materials, infrastructure, and policy interventions, yet their length and heterogeneous structure make cross-document synthesis and comparison difficult for practitioners and researchers working on circular economy initiatives. While large language models (LLM) promise faster knowledge access and synthesis, their opaque reasoning, hallucinations, and lack of source transparency introduce risks for trust and interpretability, and require verification in high-stakes sustainability contexts. SpheriCity addresses these challenges through a provenance-first conversational agent that foregrounds evidence traceability, structured synthesis, and interaction scaffolds to support exploratory querying and cross-document synthesis across sustainability reports. We conducted a formative expert review with six sustainability experts using representative queries spanning cross-city comparison, policy summarization, and recommendation-oriented tasks. Experts evaluated responses across dimensions and provided qualitative reflections on the system's usefulness for sustainability knowledge work. Our results reveal that transparent sourcing, contextual explanation, interpretability, and alignment with expert workflow strongly shape expert trust and judgments of system usefulness. This work contributes (1) a conversational prototype for sustainability knowledge sensemaking, (2) an expert-grounded evaluation framework for assessing AI responses in high-stakes knowledge domains, and (3) design insights into how provenance, uncertainty communication, and integration in workflow influence expert users' trust in AI assistance for sustainability decision support.

22.
arXiv (CS.CV) 2026-06-17

SkillMoV: Mixture-of-View Routing with Prototype-Conditioned Gating for Unified Multi-View Proficiency Estimation

Estimating human proficiency from video is a key challenge for automated skill assessment, with applications in sports coaching, music pedagogy, surgical training, and workplace learning. Existing approaches often focus on individual scenarios or rely on shared multi-view aggregation, limiting their ability to adapt to heterogeneous camera viewpoints and activity domains. We introduce SkillMoV, a unified, parameter-efficient framework for multi-scenario proficiency estimation from synchronized multi-view video. At its core, SkillMoV introduces a Mixture-of-View Projector (MoVP), which adapts the mixture-of-experts paradigm to camera-specific view features. MoVP is composed of four stages: (i) a Mixture-of-View soft router with twelve expert MLPs that learns view-dependent expert preferences without camera-identity supervision; (ii) cross-view attention to align synchronized cameras; (iii) learnable prototype anchoring to condition the representation on class-level reference vectors; and (iv) a prototype-conditioned gated projection that produces the final skill embedding. We evaluate SkillMoV on EgoExo4D across six skill domains and three separately trained view configurations: Ego, Exos, and Ego+Exos. SkillMoV reaches 50.17% overall accuracy in the Exos setting with a single model trained jointly across all scenarios, surpassing the strongest reported Exos result among the compared methods by 3.57 percentage points. In Ego+Exos, SkillMoV remains close to the best reported result in that setting (47.63% versus 48.20%). Ablations on the selected Exos configuration validate each component: MoV routing contributes +6.61 pp over attentive aggregation, cross-view attention +4.92 pp, prototype anchoring +4.07 pp, and stochastic view dropout +3.90 pp. Through LoRA adaptation, SkillMoV trains only 23.32% of its parameters and adds limited measured overhead relative to a LoRA-only baseline.

23.
arXiv (math.PR) 2026-06-11

Sharp log-Sobolev inequalities on finite cyclic groups

arXiv:2606.02847v2 Announce Type: replace-cross Abstract: Let $\mathbb Z_n$ be the cyclic group equipped with the uniform probability measure $\pi$, and let $A_{\psi_n}$ be the Laplacian with word length \[ \psi_n(k) = \min(k,n-k). \] We prove the sharp log-Sobolev inequality \[ Ent_{\pi}(f^2) \le 2\pi(f A_{\psi_n} f), \qquad f:\mathbb Z_n \to [0,\infty), \] for every $n \ge 4$. The proof is inspired by the recent work of Frank and Ivanisvili[FrankIvanisvili2026] on a sharp log-Sobolev inequality for nearest-neighbor simple random walk. We use their cubic-majorant reduction, which turns the problem into a 3rd moment estimate; the new point is a blockwise 3rd moment estimate adapted to the word-length multiplier. The same 3rd moment argument also recovers the log-Sobolev inequality for Poisson-semigroup on the circle, first proved by Weissler[Weissler1980]. The same sharp inequalities were also obtained recently by Yao[Yao2026] by a different method.

24.
arXiv (CS.AI) 2026-06-16

Communication-Efficient Verifiable Attention for LLM Inference

arXiv:2606.16352v1 Announce Type: cross Abstract: Computation integrity of remote large language model (LLM) serving can be questionable. For conventional deep neural networks (DNNs), the existing TEE-shielded DNN partitioning (TSDP) approach uses Trusted Execution Environment (TEE) to compute non-linear components and verify the integrity of linear components offloaded to an untrusted GPU. However, directly applying TSDP to Transformer-based LLMs incurs significant TEE computation and TEE-GPU communication overhead. This paper presents Communication-efficient TEE-GPU Attention (\textsc{VeriAttn}) for accelerating verifiable LLM inference. \textsc{VeriAttn} offloads both linear and non-linear computations of attention to the GPU, while TEE performs verification. Moreover, for prefill, \textsc{VeriAttn} uses a two-level pipeline to overlap data movement, TEE pre-/post-processing, and GPU computation. For decoding, when the key-value cache exceeds available GPU memory, \textsc{VeriAttn} partitions attention across TEE and GPU to reduce repeated key-value transfers. Evaluation on an Intel TDX platform shows that \textsc{VeriAttn} achieves 2.60-3.38$\times$ and 3.86-5.42$\times$ acceleration over TSDP for 6k-token prompts and 10k-token outputs during prefill and decoding, respectively.

25.
arXiv (CS.AI) 2026-06-15

A Temporal Planning Framework for Disruption Aware Dynamic Route Optimization in Heterogeneous Railway Systems

arXiv:2606.14582v1 Announce Type: new Abstract: Efficient route optimization play a vital role in ensuring both safety and punctuality in railway operations. It is very crucial particularly in heterogeneous multi-gauge railway networks with varying train speed, stopping pattern, infrastructure compatibility constraints increase coordination complexity. In single-track systems these challenges are further intensify due to all trains to share the same track and requires frequent track switching.Stochastic disruptions events including blocked tracks, blocked trains, engine failure and speed slowdowns introduces additional unpredictability in operations and deviate the timetable. However, existing studies predominantly focuses on high-level timetabling, omitting operational details such as track switching coordination. As a result leaving decision to human operators, increasing safety risks into railway operations. This study proposes a framework based on temporal planning for dynamic route optimization and disruption management in heterogeneous railway systems. The framework formulates railway operations as a temporal planning problem using PDDL 2.1 with explicitly modeling gauge compatibility constraints and diverse disruption scenarios. It generates conflict-free timestamped operational plans specifying both optimized schedules and executable action sequences. To evaluate the proposed framework, we developed a benchmark problem set with 200 instances using up to 1,000 track points and 120 trains. Two state-of-the-art temporal planners and a plan validator were employed to assessed the framework. The experimental results demonstrate that the framework effectively generates temporal operational plans for heterogeneous railway systems and handles multi-gauge constraints, disruptions, and reduces dependence on manual decision making.