Academic Intelligence · Curated Daily

Explore the Frontier of Global Academia

AcademicHub aggregates real-time literature from top journals and preprint platforms. Build your personal research radar and let large language models compile cross-disciplinary analysis briefings automatically.

01.
arXiv (quant-ph) 2026-06-11

Non-Hermitian Delocalization Realizes Random Dirac Criticality in One Dimension

arXiv:2606.12089v1 Announce Type: cross Abstract: Non-Hermitian systems can evade Anderson localization and exhibit delocalized states even in one dimension. Here, we show that such non-Hermitian delocalized states under periodic boundary conditions (PBC) are intrinsically critical, realizing the universality class of one-dimensional random Dirac fermions. By linking spectral winding to topological Anderson transitions via Hermitization, we demonstrate that the delocalized PBC states exhibit a Dirac-type criticality with universal algebraic correlations. In contrast to Hermitian systems, where this criticality occurs only at fine-tuned transition points, it emerges generically in non-Hermitian systems as a consequence of spectral topology. These results identify a universal mechanism by which non-Hermiticity promotes criticality, providing a unified description of non-Hermitian delocalization in one dimension.

02.
arXiv (quant-ph) 2026-06-12

Simple analytical flux-tuned iSWAP pulses for leakage suppression

arXiv:2606.13052v1 Announce Type: new Abstract: Fast, high-fidelity two-qubit gates are a key requirement for fault-tolerant quantum computation. Tunable coupler architectures provide a flexible approach for implementing entangling gates through flux control with large on-off ratios, but fast flux modulation can induce diabatic transitions and population leakage to non-computational states, limiting gate performance. Here we present an analytical flux control method enabling derivative removal by adiabatic gate ($\Phi$-DRAG) for suppressing leakage in flux tunable two-qubit gates. We show that $\Phi$-DRAG differs fundamentally from conventional microwave implementations and derive modified flux modulation protocols that suppress leakage below $10^{-4}$ for fast entangling gates. The method remains effective across a range of asymmetry between qubit anharmonicities and different circuit parameters, enabling high-fidelity two-qubit gates within the fifteen nanosecond range.

03.
arXiv (CS.AI) 2026-06-15

Metabolic cost of information processing in Poisson variational autoencoders

arXiv:2602.13421v2 Announce Type: replace-cross Abstract: Computation in biological systems is fundamentally energy-constrained, yet standard theories of computation treat energy as freely available. Here, we argue that variational free energy minimization under a Poisson assumption offers a principled path toward an energy-aware theory of computation. Our key observation is that the Kullback-Leibler (KL) divergence term in the Poisson free energy objective becomes proportional to the prior firing rates of model neurons, yielding an emergent metabolic cost term that penalizes high baseline activity. This structure couples an abstract information-theoretic quantity – the *coding rate* – to a concrete biophysical variable – the *firing rate* – which enables a trade-off between coding fidelity and energy expenditure. Such a coupling arises naturally in the Poisson variational autoencoder (P-VAE) – a brain-inspired generative model that encodes inputs as discrete spike counts and recovers a spiking form of *sparse coding* as a special case – but is absent from standard Gaussian VAEs. To demonstrate that this metabolic cost structure is unique to the Poisson formulation, we compare the P-VAE against Grelu-VAE, a Gaussian VAE with ReLU rectification applied to latent samples, which controls for the non-negativity constraint. Across a systematic sweep of the KL term weighting coefficient $\beta$ and latent dimensionality, we find that increasing $\beta$ monotonically increases sparsity and reduces average spiking activity in the P-VAE. In contrast, Grelu-VAE representations remain unchanged, confirming that the effect is specific to Poisson statistics rather than a byproduct of non-negative representations. These results establish Poisson variational inference as a promising foundation for a resource-constrained theory of computation.

04.
medRxiv (Medicine) 2026-06-22

The circulating blood proteome of childhood acute leukemia

The circulating blood proteome provides a systemic readout of disease biology and holds promise for advancing diagnostics and disease monitoring in pediatric leukemia. Here, we profiled 3072 proteins in diagnostic serum from 54 children with acute lymphoblastic leukemia (ALL), 21 with acute myeloid leukemia (AML), and 12 healthy controls using the Olink Proximity Extension Assay. We observed profound alterations in circulating protein levels in leukemia patients compared with controls and identified immunophenotype-specific proteins, including SIGLEC15 in B-cell precursor ALL (BCP-ALL), NOTCH1 in T-ALL, and CEBPA in AML, all which remained high even in patients with low (

05.
arXiv (CS.AI) 2026-06-18

Beyond Similarity: Temporal Operator Attention for Time Series Analysis

arXiv:2605.11287v2 Announce Type: replace-cross Abstract: A persistent paradox in time-series forecasting is that structurally simple MLP and linear models often outperform high-capacity Transformers. We argue that this gap arises from a mismatch in the sequence-modeling primitive: while many time-series dynamics are governed by global temporal operators (e.g., filtering and harmonic structure), standard attention forms each output as a convex combination of inputs. This restricts its ability to represent signed and oscillatory transformations that are fundamental to temporal signal processing. We formalize this limitation as a simplex-constrained mixing bottleneck in softmax attention, which becomes especially restrictive for operator-driven time-series tasks. To address this, we propose $Temporal Operator Attention (TOA)$, a framework that augments attention with explicit, learnable sequence-space operators, enabling direct signed mixing across time while preserving input-dependent adaptivity. To make dense $N \times N$ operators practical, we introduce Stochastic Operator Regularization, a high-variance dropout mechanism that stabilizes training and prevents trivial memorization. Across forecasting, anomaly detection, and classification benchmarks, TOA consistently improves performance when integrated into standard backbones such as PatchTST and iTransformer, with particularly strong gains in reconstruction-heavy tasks. These results suggest that explicit operator learning is a key ingredient for effective time-series modeling.

07.
arXiv (CS.LG) 2026-06-19

Weighted Bayesian Conformal Prediction

arXiv:2604.06464v2 Announce Type: replace Abstract: Conformal prediction provides distribution-free prediction intervals with finite-sample coverage guarantees, and recent work by Snell \& Griffiths reframes it as Bayesian Quadrature (BQ-CP), yielding powerful data-conditional guarantees via Dirichlet posteriors over thresholds. However, BQ-CP fundamentally requires the i.i.d. assumption. Meanwhile, weighted conformal prediction handles distribution shift via importance weights but remains frequentist, producing only point-estimate thresholds. We propose Weighted Bayesian Conformal Prediction (WBCP), which generalizes BQ-CP to arbitrary importance-weighted settings by replacing the uniform Dirichlet $\Dir(1,\ldots,1)$ with a weighted Dirichlet $\Dir(\neff \cdot \tilde{w}_1, \ldots, \neff \cdot \tilde{w}_n)$, where $\neff$ is Kish's effective sample size. We prove four theoretical results: (1)~$\neff$ is the unique concentration parameter matching frequentist and Bayesian variances; (2)~posterior standard deviation decays as $O(1/\sqrt{\neff})$; (3)~BQ-CP's stochastic dominance guarantee extends to per-weight-profile data-conditional guarantees; (4)~the HPD threshold provides $O(1/\sqrt{\neff})$ improvement in conditional coverage. We instantiate WBCP for spatial prediction as Geographical BQ-CP, where kernel-based spatial weights yield per-location posteriors with interpretable diagnostics. Experiments on synthetic and real-world spatial datasets demonstrate that WBCP maintains coverage guarantees while providing substantially richer uncertainty information.

08.
arXiv (CS.AI) 2026-06-11

Are LLMs Bad at Moral Reasoning?

arXiv:2606.11635v1 Announce Type: cross Abstract: For highly capable AI systems to operate safely in dynamic, open-ended environments, they must be able to identify, understand, and respond to moral reasons for action, and constrain their behaviour accordingly. A growing body of research aims to evaluate this capacity – moral competence – in today's most capable AI systems, recently reaching broadly pessimistic conclusions. One of the most ambitious such papers collects gold-standard human-authored rubrics for evaluating moral reasoning in 1,000 cases, and benchmarks frontier AI models against those rubrics, with underwhelming results. In this paper, we argue that the MoReBench dataset can be redeployed to give a much more optimistic picture of LLMs' moral reasoning (an essential part of moral competence). We show that if, instead of scoring LLMs' responses to these cases against these rubrics, we instead give the LLMs the same task given to humans – to generate scoring rubrics for the moral analysis of particular cases – the rubrics they generate are both better calibrated to the human rubrics than their open-ended responses, and, where they differ, plausibly reflect nothing more than the vast dimensionality of most moral problems, as well as highlighting some human departures from the "rubric for creating rubrics". Taking these points into consideration, the MoReBench dataset suggests that LLMs are significantly more capable at moral reasoning than was previously believed.

09.
arXiv (CS.AI) 2026-06-16

Unassigned Agents in Compilation-based Multi-agent Path Finding

Authors:

arXiv:2606.15797v1 Announce Type: new Abstract: Compilation-based techniques represent an important stream of solvers for multi-agent path finding (MAPF) due to their modularity and adaptability for non-standard variants of the problem. While in the standard MAPF the task is to navigate all agents from their initial positions to given individual goal positions without any collision, variants where a different requirement for agents is used are also relevant. Such a variant is MAPF with unassigned agents (UA-MAPF) where some agents have the same setting as in the standard MAPF with initial positions and goals while the remaining agents have the initial position but have no goal - unassigned agents. Despite unassigned agent do not need to reach any goal position they have to be moved out of the way of the standard agents if needed which represent a specific challenge. We show in this paper that UA-MAPF can be expressed in recent compilation-based techniques for MAPF based on formulating the problem as Boolean satisfiability, namely we adapt SMT-CBS and NRF-SAT, the recent solvers based on counterexample guided abstraction refinement and non-refined abstractions.

10.
arXiv (CS.AI) 2026-06-16

Model-Native Computing Architecture: Envisioning Future System Architecture Through the Lens of Computer Architecture

arXiv:2606.00288v2 Announce Type: replace Abstract: Large language models are undergoing a transition from model technology to system technology. Engineering challenges like cache reuse, context capacity, agent scheduling, and permission control resemble classical computer systems problems. This raises a question: if we treat the LLM as a CPU, KV cache as processor cache, context window as main memory, and agent framework as an operating system, can decades of computer architecture wisdom guide next generation model native systems? This paper pursues this analogy as a visionary survey. We map computer architecture concepts onto the emerging model native stack, survey literature across LLM as OS, memory management, agent frameworks, tool protocols, multi agent coordination, cognitive architectures, and safety governance, finding that each addresses a different layer without a unifying model. We propose the Intelligent Computing Architecture (ICA): six functional layers with interface contracts and design axioms. We resolve the tension over whether the LLM resembles a CPU or OS via a dual plane architecture a probabilistic execution plane (what can be computed) and a deterministic control plane (what should be computed), with every layer passing through as a graded crossover. We propose three Amdahl style design heuristics Semantic Locality, Context Budget, and Agent Speedup as organizing back of envelope models, illustrate their parameter ranges with published data, and identify predictive validation as the principal open task. We articulate analogy boundaries, note differences between silicon and model era architectures, and propose a research roadmap. This is a conceptual and survey contribution with no new experimental results.

11.
arXiv (quant-ph) 2026-06-12

Quantum Stochastic Inflation

arXiv:2606.12636v1 Announce Type: cross Abstract: We formulate stochastic inflation in an open quantum system framework. The field coarse-grained in a patch of fixed physical size, and the total momentum of that patch, form a canonical pair and act on a one-mode Fock space which we identify as the "bulk". At each time step, new comoving modes join the coarse-grained patch and the bulk has to be redefined. This redefinition produces an entangled mode that is traced over, yielding a non-unitary evolution equation for the bulk's density matrix. For a free test field in de Sitter, one obtains GKLS dynamics, generated by an effective Hamiltonian and a single non-Hermitian Lindblad operator, hence diffusion and Hubble friction originate from the same quantum channel. The Wigner-Weyl transform of the GKLS equation leads to a Fokker-Planck equation for the Wigner function, which matches the one that applies to the classical phase-space distribution of stochastic inflation. We also provide several schemes under which one can unravel the GKLS dynamics into stochastic Schrodinger equations when continuous measurements of the decoupled mode are performed, making contact with Langevin formulations of stochastic inflation. In the light-field regime, an additional overdamped reduction can be performed by integrating out the momentum variable in the Wigner distribution, leading to Starobinsky's slow-roll Fokker-Planck equation. In that regime, the purity of the patch is strongly suppressed. In contrast, for heavy fields, field diffusion is suppressed and the coarse-grained patch remains close to a pure underdamped oscillator, which prevents a classical stochastic treatment.

12.
bioRxiv (Bioinfo) 2026-06-11

An AI-Powered Trisomy 21 Research Assistant

Down syndrome, caused by trisomy 21, increases the risk of diverse co-occurring conditions. With more than 34,000 related publications indexed in PubMed as of early 2026, keeping pace with this expanding literature is challenging. While general-purpose large language models are widely used for information retrieval, they often rely on broad training data rather than specific evidence. Retrieval-augmented generation (RAG) improves rigor and reliability of responses by linking model outputs to source texts. In research, source texts are peer-reviewed articles. Standard implementations treat all manuscript sections equally, allowing background text to rank as highly as experimental results. To focus model outputs on experimentally supported responses, we developed the T21 Research Assistant, a section-aware RAG system that prioritizes Results sections to ground responses in primary experimental evidence. The system draws exclusively from 1,789 open-access Down syndrome publications from PubMed Central, including 327 NIH INCLUDE-funded studies, and uses a multistage pipeline for query validation, retrieval, reranking, synthesis, and citation verification. Built on NVIDIA Nemotron models, it generates structured, cited responses. Evaluation using expert-curated questions demonstrated strong performance, achieving a BERTScore F1 of 0.712 and recall of 0.758, comparable to or exceeding leading proprietary and open-source models. T21 Research Assistant is available at: https://bioinformatics.cuanschutz.edu/t21-res-assi/

13.
arXiv (CS.LG) 2026-06-18

How fast can you find a good hypothesis?

arXiv:2509.03734v3 Announce Type: replace-cross Abstract: In the hypothesis selection problem, we are given sample and query access to finite set of candidate distributions (hypotheses), $\mathcal{H} = \{H_1, \ldots, H_n\}$, and samples from an unknown distribution $P$, both over a domain $\mathcal{X}$. The goal is to output a distribution $Q$ whose distance to $P$ is comparable to that of the nearest hypothesis in $\mathcal{H}$. Specifically, if the minimum distance is $\mathsf{OPT}$, we aim to output $Q$ such that, with probability at least $1-\delta$, its total variation distance to $P$ is at most $C \cdot \mathsf{OPT} + \varepsilon$. The optimal approximation for proper algorithms (where $Q \in \mathcal{H}$) is $C=3$ using $\Theta(\log(n/\delta)/\varepsilon^2)$ samples from $P$ and for improper algorithms (where $Q$ is not necessarily in $\mathcal{H}$) is $C=2$ using $\tilde{\Theta}(\log(n/\delta)/\varepsilon^2)$ samples from $P$. In the improper setting, the algorithm achieving $C=2$ [Bousquet, Braverman, Kol, Efremenko, Moran, FOCS 2021] runs in time which grows polynomially with $|\mathcal{X}|$ – it does not run in finite time for real-valued distributions. A promising path towards improved runtime is to consider improper algorithms which output a mixture $Q$ of the hypotheses as such a distribution can be represented in $n$ words of memory. We show (1) a lower bound that no algorithm which outputs a mixture can achieve approximation better than $C = 3-2/n$ unless the number of samples is polynomial in $|\mathcal{X}|$, as well as (2) an algorithm which runs in time $poly(n)$ and achieves the same approximation guarantee. In the proper setting, [Aliakbarpour, Bun, Smith, NeurIPS 2024] provided an algorithm with $C=3$ running in $\tilde{O}(n/(\delta^3\varepsilon^3))$ time. We improve this time complexity to $\tilde{O}(n/(\delta \varepsilon^2))$, significantly reducing the dependence on the confidence and error parameters.

14.
arXiv (quant-ph) 2026-06-12

Where a Quantum Reservoir Works: A Transferable Operating Band

arXiv:2606.13284v1 Announce Type: new Abstract: In quantum reservoir computing, a fixed quantum system transforms an input signal, while learning reduces to training a simple linear readout on its measured outputs. Since the quantum dynamics themselves are never optimized, the method is well suited to today's hardware. Yet these dynamics must still be chosen carefully, because their settings remain fixed throughout training and inference. It therefore remains an open question where, in its control space, a fixed quantum system learns well. We address this question for a dissipative reservoir by mapping performance over three central physical controls: the strength of the input drive, the coupling between neighboring qubits, and the rate of dissipation. Good performance concentrates in a single, well-defined operating region of this control space. This region transfers across tasks and reservoir initializations, and the same memory-defined regime persists under architectural changes. It is also mechanistically grounded, since it disappears whenever any of the mechanisms that create it is removed. Finally, the region can be located cheaply before any task is run, using a simple memory diagnostic.

15.
arXiv (quant-ph) 2026-06-16

Quantum coherence and Leggett-Garg inequality

arXiv:2606.15717v1 Announce Type: new Abstract: In this paper, we attempt to establish the relationship between quantum coherence and the violation of the Leggett-Garg inequality. In particular, employing the Lindblad equation, we obtain the pseudo-density matrix for a damping system to study the effect of environment interaction on the violation of this inequality in a two-state quantum system. It is shown that the violation of the Leggett-Garg inequality can be observed as long as temporal evolution does not induce decoherence. This statement is independent of the initial state of the system. Furthermore, similar to the Horodecki criterion for the CHSH inequality (R. Horodecki et al. Phys. Lett. {\bf A200}, 340), we study necessary and sufficient conditions for violating the Leggett-Garg inequality. Hereby, under the circumstance that the inequality violation occurs, an upper bound for the time interval between consecutive measurements with respect to the time scale of interaction with the environment (the relaxation time) is obtained.

16.
arXiv (CS.CL) 2026-06-16

Transfer Learning for FHIR Questionnaire Terminology Binding

Electronic prior authorization workflows require FHIR Questionnaire items to carry LOINC codes, yet most items in the HL7 Da Vinci CDS-Library lack these bindings. We treat this as a retrieval problem: given a Questionnaire item's text, find the correct LOINC code in a pool of 97,314 active codes. We compare six methods (TF-IDF, frozen MiniLM, BioBERT, BioLORD, contrastively fine-tuned MiniLM, and a TF-IDF+GPT reranker) on a 54-item evaluation set spanning three query styles (natural question, medium, and terse). No single method wins on every metric. BioLORD, a frozen encoder pre-trained on biomedical ontology definitions, has the best top-rank accuracy (R@1 = 0.185, MRR = 0.246) despite seeing no task-specific data, while a contrastive fine-tune on raw LHC-Forms pairs takes R@5 (0.389) and R@10 (0.426). A distribution-shift ablation shows why the fine-tune in our main table is not the strongest one: adding GPT-generated paraphrases to the raw pairs drops R@5 from 0.389 to 0.296, so the augmented union underperforms raw-only training on every metric except R@1. Performance peaks at 5k training pairs. Error analysis on BioLORD's R@1 failures shows that wrong-specificity and ambiguous-text cases together account for 59% of errors.

17.
arXiv (quant-ph) 2026-06-15

Synchronization of Quasi-Particle Excitations in a Quantum Gas with Cavity-Mediated Interactions

arXiv:2504.17731v2 Announce Type: replace-cross Abstract: Driven-dissipative quantum systems can undergo transitions from stationary to dynamical phases, reflecting the emergence of collective non-equilibrium behavior. We study such a transition in a Bose-Einstein condensate coupled to an optical cavity and develop a cavity-assisted Bragg spectroscopy technique to resolve its collective modes. We observe dissipation-induced synchronization at the quasiparticle level, where two roton-like modes coalesce at an exceptional point. This reveals how dissipation microscopically drives collective dynamics and signals a precursor to a dynamical phase transition.

18.
arXiv (CS.AI) 2026-06-17

Explicit Context-Driven Neural Acoustic Modeling for High-Fidelity RIR Generation

arXiv:2509.15210v2 Announce Type: replace-cross Abstract: Realistic sound simulation plays a critical role in many applications. A key element in sound simulation is the room impulse response (RIR), which characterizes how sound propagates within a given space. Recent studies have applied neural implicit methods to learn RIR using context information collected from the environment, such as scene images. However, these approaches do not effectively leverage explicit geometric information from the environment. To further exploit neural implicit models with direct geometric features, we present MiNAF, which queries a rough room mesh at given locations and extracts distance distributions as an explicit representation of local context. Our approach demonstrates that incorporating explicit local geometric features can better guide the model in generating more accurate RIR predictions. Through comparisons with conventional and state-of-the-art methods, we show that MiNAF performs competitively across various evaluation metrics.

19.
arXiv (CS.CV) 2026-06-16

RAMS: Resource-Adaptive and Detection-Conditioned Model Switching for Embedded Edge Perception

Edge object detection on embedded hardware requires balancing inference latency and detection quality under changing resource pressure. We present RAMS, a lightweight runtime controller that monitors device pressure, calibrates switching thresholds from idle behavior, and dynamically selects among three resident YOLOv8 tiers (NANO/SMALL/MEDIUM at 320/416/640 px) without model-reload latency. RAMS defines five switching policies, including two detection-conditioned variants that prevent aggressive downgrades after recent vulnerable-road-user (VRU) detections. We further introduce the VRU-Weighted Accuracy Score (SWAS), a scalar metric for offline policy comparison without ground-truth annotations, together with an oracle-bounded variant that separates detector circularity from genuine tier-retention benefit. Across Raspberry Pi 5, x86 laptops, and Jetson Orin ONNX/TensorRT deployments, the same controller equations operate over a 37x latency range. On Jetson Orin TensorRT under heavy load, the safety2 policy achieves 3.41 ms mean latency, 5.6x faster than fixed-MEDIUM inference, while retaining 74% of its proxy accuracy through near-NANO operation with selective SMALL and MEDIUM locks during VRU-positive windows. Detection-conditioned switching improves SWAS by 25.4% under oracle scoring and 47.3% under detector-derived scoring relative to threshold-only policies under heavy load. Live KITTI evaluation reports per-tier VRU recall of 24.2%, 41.2%, and 59.0%, showing that reactive overrides are fundamentally limited by baseline detector recall.

20.
medRxiv (Medicine) 2026-06-22

Vaccine introductions in the WHO African Region, 2023-26: a country-level ecological analysis by Gavi eligibility and conflict-affected status

Background. The Immunization Agenda 2030 (IA2030) tracks new and underused vaccine introduction as an access metric, and its mid-term review calls for stronger country ownership, prioritisation, data use and tailored support in conflict-affected and resource-constrained settings; however, national launch status does not measure recurrent financing, implementation, safety or equity. We examined how recent vaccine-introduction activity was distributed across the WHO African Region. Methods. We conducted a descriptive country-level ecological analysis of all 47 Member States from January 2023 to June 2026. The country was the unit of analysis and contributed one cumulative, unweighted count of nationally endorsed vaccine-introduction and programme-change events. Counts were linked to Gavi eligibility, World Bank FY26 conflict-affected status, broader fragile and conflict-affected situation status in sensitivity analysis, and concurrent system-performance indicators, and modelled with Poisson regression using HC1 robust standard errors. Two Expanded Programme on Immunization (EPI) manager survey waves were summarised at country level. Reporting followed STROBE and RECORD. Results. Seventy-two events were recorded across 38 of 47 Member States: 48 new-antigen introductions, 20 dose or schedule expansions and four combination-vaccine introductions; malaria vaccines accounted for 21. Gavi-eligible conflict-affected countries averaged 2.50 events per country versus 1.27 in both comparison groups. Gavi-eligible conflict-affected status was associated with a higher count (incidence rate ratio [IRR] 1.97, 95% confidence interval [CI] 1.38-2.81; p

21.
PLOS Computational Biology 2026-06-22

<i>HoloBio</i>: A holographic microscopy tool for quantitative biological analysis

Authors:

by Waira Mona, Maria J. Gil-Herrera, Emanuel Mazo, Daniel Córdoba, Sofia Obando-Vasquez, Maria J. Lopera, Rene Restrepo, Carlos Trujillo, Ana Doblas, Raul Castaneda Holographic imaging in microscopy enables label-free quantitative information of biological specimens and has found applications across a wide range of biomedical studies, from cell morphology to particle dynamics; yet its widespread adoption is often limited by the lack of accessible and standardized analysis software. We present HoloBio, an open-source, Python-based graphical user interface developed to address this issue. This software offers two primary operational modes: a Real-Time mode that enables live processing of holograms at video frame rates, and an Offline mode designed for post-processing previously recorded holograms. HoloBio is compatible with holograms recorded using both lens-based and lensless systems, supporting off-axis architectures in telecentric and non-telecentric configurations, as well as slightly off-axis and in-line optical setups. The software incorporates tools for cell tracking, phase profiling, thickness estimation, and morphological analysis, including cell counting and object area quantification. HoloBio is designed to be accessible for users without coding expertise, offering a reproducible, high-throughput environment tailored for researchers in biology, biophotonics, and biomedical imaging.

22.
arXiv (CS.AI) 2026-06-12

Neuro-Symbolic Agents for Regulated Process Automation: Challenges and Research Agenda

arXiv:2606.13405v1 Announce Type: new Abstract: LLM-based agents are entering regulated industries where they automate judgment intensive quality management processes. We argue that symbolic structures already embedded in these domains, including regulations, typed process models, and compliance constraints, should be treated not merely as external monitoring mechanisms but as core architectural components that shape the agent's decision-making and behavior. We propose compliance-by-construction as a complementary paradigm to guardrail-based monitoring: a structural foundation that prevents control-flow violations, while guardrails remain essential for catching semantic errors. We identify a structured set of neuro-symbolic research challenges on foundational and capability level and show that addressing them jointly enables compliance-by-construction. We call on the neuro-symbolic community to engage with regulated process automation as a high impact research domain.

23.
arXiv (CS.CL) 2026-06-18

Montreal Forced Aligner and the state of speech-to-text alignment in 2026

The Montreal Forced Aligner (MFA) was released in 2016 and has since become the most widely used tool for forced alignment in research and industry. In the decade since, MFA has undergone substantial development, including expanded coverage across more languages and dialects using larger open-source datasets, harmonized IPA dictionaries, model adaptation, cross-language phone remapping, and support utilities. This paper documents MFA 3.0's developments since version 1.0 and evaluates MFA's performance across English, Japanese, and Korean, benchmarked against classic and neural forced aligners. MFA 3.0 achieves state-of-the-art or near state-of-the-art performance across all four benchmark datasets with mean boundary errors below 15 ms. Adaptation and cross-language remapping are effective for languages outside MFA's training distribution, and pronunciation probability modeling and phonological rules provide gains in specific conditions.

24.
arXiv (CS.CV) 2026-06-17

Structure-Aware Text Recognition for Ancient Greek Critical Editions

Recent advances in visual language models (VLMs) have transformed end-to-end document understanding. However, their ability to interpret the complex layout semantics of historical scholarly texts remains limited. This paper investigates structure-aware text recognition for Ancient Greek critical editions, which have dense reference hierarchies and extensive marginal annotations. We introduce two novel resources: (i) a large-scale synthetic corpus of 185,000 page images generated from TEI/XML sources with controlled typographic and layout variation, and (ii) a curated benchmark of real scanned editions spanning more than a century of editorial and typographic practices. Using these datasets, we evaluate three state-of-the-art VLMs under both zero-shot and fine-tuning regimes. Our experiments reveal substantial limitations in current VLM architectures when confronted with highly structured historical documents. In zero-shot settings, most models significantly underperform compared to established off-the-shelf software. Nevertheless, the Qwen3VL-8B model achieves state-of-the-art performance, reaching a median Character Error Rate of 1.0\% on real scans. These results highlight both the current shortcomings and the future potential of VLMs for structure-aware recognition of complex scholarly documents.

25.
arXiv (CS.CV) 2026-06-16

Ultra Flash: Scaling Real-Time Streaming Video Generation to High Resolutions

While recent autoregressive video diffusion models achieve remarkable streaming quality, they remain confined to low resolutions (e.g., 480P), leaving efficient, scalable, real-time high-resolution video generation a fundamental open challenge. To bridge this gap, we present Ultra Flash, a cascaded streaming framework capable of real-time high-resolution video generation. Ultra Flash achieves ~30 FPS at 1K resolution and ~18 FPS at 2K resolution on a single GPU through three key contributions: (1) an architecture-preserving T2V-to-TV2V super-resolution training paradigm coupled with an AIGC-oriented data degradation pipeline that effectively preserves the generative capability of the base model, enabling enhanced high-resolution detail when cascaded after mainstream low-resolution generative models; (2) a causal streaming latent upsampler paired with a high-resolution decoder, which enhances spatiotemporal coherence while enabling efficient latent spatial scaling and precise high-resolution decoding with negligible computational overhead; and (3) a cascade high-resolution streaming video generation optimization scheme that first performs hybrid-reward-enhanced sparse causalization and single-step distillation of the super-resolution model, then introduces cascaded streaming self-forcing preference optimization with dynamic cache management, jointly enhancing overall coherence, improving quality, and enabling real-time high-resolution streaming video generation. Extensive experiments demonstrate that Ultra Flash reliably produces ultra-high-resolution streaming video while maintaining state-of-the-art visual quality and superior efficiency. Project Page: https://xin1u.github.io/UltraFlash/