Academic Intelligence · Curated Daily

Explore the Frontiers of Global Scholarship

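AcademicHub aggregates real-time literature from top journals and preprint platforms. Customize your own research radar and use large language models to automatically generate cross-disciplinary literature analysis briefings.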

01.
arXiv (CS.CV) 2026-06-12

Ex-Omni: Enabling 3D Facial Animation Generation for Omni-modal Large Language Models

Omni-modal large language models (OLLMs) aim to unify multimodal understanding and generation, yet extending them to jointly produce speech and 3D facial animation remains largely unexplored despite its importance for natural human-computer interaction. A key challenge is the mismatch between the discrete semantic reasoning of LLMs and the dense temporal dynamics required for 3D facial motion. We propose Expressive Omni (Ex-Omni), an open-source model that augments OLLMs with native speech-accompanied 3D facial animation. Ex-Omni decouples semantic reasoning from temporal generation through a blendshape-aware speech unit generator and a blendshape decoder, where speech units provide temporal scaffolding and hidden speech representations carry facially relevant cues. We further introduce a unified token-as-query gated fusion (TQGF) mechanism for controlled semantic injection, as well as InstructS2SF-1200K, a dataset consisting of 1200K samples for pre-training. Extensive experiments show that Ex-Omni maintains competitive speech understanding and generation ability while achieving better audio-visual synchronization and lower face-generation latency than cascaded pipelines.

02.
arXiv (math.PR) 2026-06-12

Sphere Packings in Higher Dimension (after Boaz Klartag)

arXiv:2606.13313v1 Announce Type: cross Abstract: Let $\delta_n^L$ be the maximal density of a lattice sphere packing in the $n$-dimensional Euclidean space. We explain how Boaz Klartag proved the inequality $\delta_n^L \geq c n^2 2^{-n}$ where $c>0$ is a universal constant. In higher dimension, even for non-lattice sphere packings, this new lower bound is a substantial improvement. Klartag's proof uses the probabilistic method in two different ways. The first, very standard, relies on the statistical properties of a uniformly chosen random lattice. The second, completely new, studies the stochastic evolution of an ellipsoid constrained to contain no nonzero lattice points in the interior.

03.
arXiv (CS.CL) 2026-06-24

Reinforcement Learning Towards Broadly and Persistently Beneficial Models

As AI systems are deployed across increasingly diverse and high-stakes settings, model alignment must generalize beyond the tasks and domains seen during training. This is especially important for reinforcement learning (RL), which can introduce unexpected misalignment through reward hacking, deception, or other unintended strategies. We study whether RL on beneficial behavior, instantiated in realistic domains, can produce broad and persistent alignment generalization beyond the training distribution. We construct a dataset of realistic situations designed to measure and train beneficial traits, such as truthfulness, fairness, risk awareness, and corrigibility, spanning varied domains, including health, science, and education. We then train models with RL on this dataset and evaluate them on more than 50 independent benchmarks of alignment and beneficial behavior. Compared to a compute-matched baseline, beneficial trait RL improves performance on over 80% of these out-of-distribution benchmarks. We observe substantial out-of-distribution alignment transfer: a beneficial-behavior RL intervention entirely limited to one domain, health, produces broad improvements on non-health alignment evaluations, including reduced reward hacking, deception, and general misalignment. Finally, we study alignment persistence: whether behavior remains robustly aligned under attempts to steer models towards misalignment. Models trained with beneficial trait RL show improved persistence, including greater resistance to adversarial prompting and harmful finetuning; further work is required to isolate the sources of these effects. These results suggest that RL to reinforce beneficial behavior in realistic domains can produce models that are more robustly aligned with human flourishing.

04.
arXiv (math.PR) 2026-06-18

Multi-Dimensional Cohomological Phenomena in the Lower Multiparametric Model

arXiv:2402.02573v4 Announce Type: replace-cross Abstract: In the past two decades, extensive research has been conducted on the (co)homology of various models of random simplicial complexes. So far, it has always been examined merely as a list of groups. This paper expands upon this by describing both the ring structure and the Steenrod-algebra structure of the cohomology of the lower multiparametric model. We prove that the ring structure is always a.a.s. trivial, while, for certain parameters, the Steenrod algebra a.a.s. acts non-trivially. This reveals that complex multi-dimensional topological structures appear as subcomplexes of this model.

05.
Nature (Science) 2026-06-10

Whole-genome duplication shaped cell-type evolution in the vertebrate brain

The complex brains of vertebrates have more cell types than those of their closest relatives. Whole-genome duplications (WGDs) occurred during early vertebrate evolution [1], but it is unclear whether the duplicated genes (ohnologues) facilitated cell-type evolution. Here using brain single-cell transcriptomes from five chordates (human [2], mouse [3], lizard [4], lamprey [5] and amphioxus), we report that many cell-type families with conserved core transcription factors in vertebrates do not show one-to-one homology with amphioxus. Moreover, ohnologues, particularly those from the first WGD, were more important than small-scale duplication paralogues for vertebrate cell-type evolution. To explore whether ohnologues are mechanistically important for this process, we predicted ancestral cell-type states and compared them to amphioxus and experimentally investigated macroglia. The findings indicate that ohnologues had a role in early vertebrate cell-type diversification. Moreover, by examining paralogue expression across cell types and species, we show that expression changes were mainly driven by dosage selection and subfunctionalization. We also link ohnologues to cellular diversity at different anatomical and cell-type scales. Our findings demonstrate the importance of WGDs for the evolution of early vertebrate brain complexity and highlight that the resultant ohnologues continued to capacitate cell-type evolution long after they were formed. Analyses of brain single-cell transcriptomes from human, mouse, lizard, lamprey and amphioxus reveal that duplicated genes (ohnologues) played a pivotal part in early vertebrate cell-type diversification.

06.
arXiv (CS.AI) 2026-06-12

Pythagoras-Prover: Advancing Efficient Formal Proving via Augmented Lean Formalisation

arXiv:2606.12594v1 Announce Type: new Abstract: Modern Lean theorem provers achieve strong performance only with substantial training and inference compute, driven in part by scarce verified proof data and the long reasoning traces of formal proof search, making both supervised fine-tuning (SFT) and sampling expensive. We introduce Pythagoras-Prover, a compute-efficient open-source family of Lean theorem provers built for practical compute budgets. The family spans two generation paradigms: autoregressive models at 4B and 32B parameters, and a first proof-of-concept diffusion-based prover (4B) that iteratively refines Lean proofs at inference time. For training efficiency, we build a Lean-verified corpus stratified into easy, medium, and hard problems for curriculum SFT, so models acquire proof skills progressively from shorter, simpler proofs to longer, harder ones. During SFT, a dynamic proof-reasoning filtering scheme preserves informative proof traces while keeping each instance within an 8k-token context budget. We also introduce Augmented Lean Formalisation (ALF), which expands scarce verified corpora into variants of formal statements, populated via self-distillation for extra training signal without formally verifying every mutated instance. By perturbing known problems while preserving their formal character, ALF reduces reliance on any statement's surface form. Empirically, Pythagoras-Prover-4B surpasses DeepSeek-Prover-V2-671B at pass@32 on MiniF2F-Test (86.1% vs 82.4%) with ~167x fewer parameters, while Pythagoras-Prover-32B sets the open-source state of the art at 93.0% on MiniF2F-Test and solves 93 of 672 PutnamBench problems. We release MiniF2F-ALF, an ALF-mutated contamination-sensitive benchmark on which every evaluated model loses accuracy; here our 32B remains strongest and our 4B matches the prior state of the art, Goedel-Prover-V2-32B.

07.
arXiv (CS.CV) 2026-06-11

Feature extraction for plant growth estimation

Precision agriculture requires the estimation of plant growth stages in real-time. When the plant growth stage is known, the wastage of resources in cultivation, such as nutrients and water, is reduced as only the required resources need to be supplied. Plants at different growth stages, however, have similar morphological features, which can make autonomous growth stage estimation difficult. This paper presents two feature extraction methods for growth stage estimation: one that uses a bank of Gabor filters and morphological operations, and the other that uses pre-trained convolutional neural networks (CNNs) and transfer learning. We test these methods on a publicly available plant growth stage dataset ("bccr-segset") for two species, canola and radish, grown and captured under indoor conditions. The two proposed feature extraction methods are compared, using support vector machines and boosted trees as classifiers. We find that both methods are suitable for real-time applications, and that CNN features outperform the hand-crafted features, both with regard to speed and accuracy. The best system (VGG-19 features, classified with a radial basis function support vector machine) obtained an accuracy of 98.4% for both species, processing an image in 0.08 seconds.

08.
arXiv (CS.AI) 2026-06-16

On-Policy Distillation with Curriculum Turn-level Guidance for Multi-turn Agents

arXiv:2606.15912v1 Announce Type: cross Abstract: Multi-turn agents that plan, invoke tools, and interact with environments offer a promising paradigm for solving complex tasks, yet their capabilities typically rely on very large models whose inference cost is prohibitive in practice. On-Policy Distillation (OPD) is a natural recipe for transferring such capabilities to smaller students, but we find that it suffers a characteristic failure mode in this setting: small student errors compound across turns and push the trajectory out of the teacher's familiar state distribution, so the teacher's supervision becomes least reliable precisely where the student needs it most. We propose Guided On-Policy Distillation (Guided-OPD), a simple yet effective algorithm that mixes teacher- and student-generated turns within each rollout and schedules the teacher's intervention probability along a curriculum that decays to zero. Strong guidance keeps early trajectories close to the teacher distribution and is then gradually withdrawn to recover the purely on-policy regime used at inference. On ALFWorld, ScienceWorld, and WebShop, distilling Qwen3 students from a Qwen3-30B-A3B teacher, Guided-OPD improves Score by 21.1% and Success Rate by 25.5% over vanilla OPD on average, with larger gains on smaller students.
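
The turn-mixing idea in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the linear decay shape, the starting probability `p_start`, and the helper names `teacher_probability` and `choose_turn_sources` are all assumptions, since the abstract only says the intervention probability follows a curriculum that decays to zero.

```python
import random


def teacher_probability(step, total_steps, p_start=0.8):
    """Teacher-intervention probability at a given training step.

    Assumption: a linear decay from p_start to 0; the abstract does not
    specify the actual schedule.
    """
    if total_steps <= 1:
        return 0.0
    frac = min(max(step / (total_steps - 1), 0.0), 1.0)
    return p_start * (1.0 - frac)


def choose_turn_sources(num_turns, p_teacher, rng):
    """Decide, turn by turn, whether the teacher or the student acts in one rollout."""
    return ["teacher" if rng.random() < p_teacher else "student"
            for _ in range(num_turns)]


rng = random.Random(0)
for step in (0, 50, 99):
    p = teacher_probability(step, total_steps=100)
    print(step, round(p, 3), choose_turn_sources(6, p, rng))
```

At the last step the probability is zero, so every turn comes from the student, which corresponds to the purely on-policy regime the abstract describes.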

09.
arXiv (CS.CL) 2026-06-19

What Makes Effective Supervision in Latent Chain-of-Thought: An Information-Theoretic Analysis

Latent Chain-of-Thought (CoT) internalizes reasoning within continuous hidden states, offering a promising alternative to verbose discrete reasoning traces. However, robust latent reasoning remains difficult because outcome supervision provides weak learning signals and leaves latent trajectories prone to semantic drift. In this work, we analyze Latent CoT from an information-theoretic perspective and identify this failure as a dual collapse: gradient attenuation along the optimization path and representational drift in the latent space. We further decompose process supervision into two complementary dimensions: Trajectory Supervision, which injects dense stepwise reasoning signals, and Space Supervision, which preserves the semantic structure of the latent manifold. Our analysis shows that rigid geometric compression can collapse the reasoning space, whereas generative reconstruction provides a more flexible semantic anchor that better preserves information capacity. To measure these effects, we introduce the Unified Latent Probe (ULP), which quantifies the mutual information between latent trajectories and explicit reasoning steps. Experiments reveal a clear Information-Performance Binding: reasoning accuracy depends on the information fidelity preserved in the latent chain. These findings provide a principled framework for latent reasoning supervision and suggest shifting from geometric imitation toward mutual information maximization. Our code is available at https://github.com/EIT-NLP/Supervision-in-Latent-CoT.

10.
arXiv (CS.AI) 2026-06-19

Triangular Consistency as a Universal Constraint for Learning Optical Flow

arXiv:2606.19938v1 Announce Type: cross Abstract: We propose triangular consistency as a first-principled constraint for optical flow, which is agnostic to network architecture, supervision type, and dataset, and applies to both image-pair and multi-frame settings. This simple but powerful constraint is to compose two flows to induce a third flow and enforce consistency among the three. The composed flows may arise from (i) image pairs, yielding cycle consistency; (ii) multiple video frames, producing longer-range motion through temporal chaining; or (iii) image pairs combined with controlled synthetic transformations, which becomes data augmentation. This triangular consistency introduces negligible computational overhead and requires no additional annotations. Since it is derived directly from the geometry of optical flow, it does not rely on model-specific assumptions and serves as a "universal" plug-and-play component for optical flow training. Experiments show consistent improvement across supervised, unsupervised, and transfer learning settings.
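
Composing two flows to induce a third can be illustrated on toy analytic flows. The sketch below is an assumption-laden illustration, not the paper's loss: flows are modelled as plain functions from a position to a displacement, the residual is a mean L1 gap, and the names `compose` and `triangle_residual` are hypothetical.

```python
def compose(flow_ab, flow_bc):
    """Return the A->C flow induced by chaining an A->B flow and a B->C flow.

    A flow maps a position (x, y) to a displacement (dx, dy).
    """
    def flow_ac(x, y):
        dx1, dy1 = flow_ab(x, y)
        # The second flow is sampled where the first flow lands.
        dx2, dy2 = flow_bc(x + dx1, y + dy1)
        return dx1 + dx2, dy1 + dy2
    return flow_ac


def triangle_residual(flow_ab, flow_bc, flow_ac, points):
    """Mean L1 gap between the composed flow and a direct A->C estimate."""
    composed = compose(flow_ab, flow_bc)
    total = 0.0
    for x, y in points:
        cx, cy = composed(x, y)
        dx, dy = flow_ac(x, y)
        total += abs(cx - dx) + abs(cy - dy)
    return total / len(points)


# Toy flows (illustrative values only).
flow_ab = lambda x, y: (1.0, 0.0)                       # shift right by 1
flow_bc = lambda x, y: (0.5 * x, 0.0)                   # displacement grows with x
flow_ac_consistent = lambda x, y: (1.0 + 0.5 * (x + 1.0), 0.0)
flow_ac_inconsistent = lambda x, y: (1.0 + 0.5 * x, 0.0)  # ignores the first warp

points = [(0.0, 0.0), (2.0, 1.0), (4.0, 3.0)]
print(triangle_residual(flow_ab, flow_bc, flow_ac_consistent, points))
print(triangle_residual(flow_ab, flow_bc, flow_ac_inconsistent, points))
```

The consistent estimate has zero residual, while the estimate that samples the second flow at the wrong position does not. Case (i) in the abstract, cycle consistency, is the special case where the third flow is the zero flow from an image back to itself.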

11.
arXiv (CS.AI) 2026-06-19

Tri-Info: Generalizable, Interpretable Failure Prediction for VLA Models via Information Theory

arXiv:2606.19998v1 Announce Type: cross Abstract: Vision-Language-Action (VLA) models are increasingly deployed across diverse tasks, yet they remain black boxes whose physical interactions can cause irreversible harm, making generalizable and interpretable failure detection essential. We observe that successful and failed rollouts carry systematically different information-theoretic signatures. Building on this, we formalize VLA control as a closed-loop information pipeline and derive the Triple Information-theoretic (Tri-Info) signals that capture whether actions remain diverse, temporally consistent, and coupled to state transitions. Across six VLA models and three benchmark environments, Tri-Info matches the strongest baselines in-domain. Moreover, Tri-Info transfers across architectures, environments, and the sim-to-real gap without retraining, reaching 83% accuracy on real-world tasks where prior detectors collapse to chance. This establishes Tri-Info as a simple yet powerful method that not only detects failures with strong cross-domain generalization, but also delivers interpretable diagnostics of the underlying failure modes.

12.
arXiv (CS.AI) 2026-06-24

ASymPO: Asymmetric-Scale Policy Optimization for Asynchronous LLM Post-Training Without Behavior Information

arXiv:2606.03070v3 Announce Type: replace-cross Abstract: Asynchronous reinforcement learning can improve language-model post-training throughput by decoupling response generation from policy optimization, but stale responses introduce distribution drift. Standard behavior-corrected methods control this drift with behavior-policy probabilities, importance ratios, or clipping, which requires token-aligned, versioned, and numerically consistent behavior log-probabilities across rollout and learner systems. We ask whether asynchronous group-relative RL can instead be stabilized using only current-policy probabilities. We identify a scale-imbalance failure mode: when stale responses are evaluated under the current policy, positive and negative loss terms can appear at different negative log-probability scales, so zero-sum advantages no longer imply balanced loss contributions. We propose Asymmetric-Scale Policy Optimization (ASymPO), which normalizes each response's token loss by its current average token negative log-probability. ASymPO requires no behavior-policy probabilities, restores response-level zero-sum balance, and preserves a nonzero learning signal. We also introduce Scaled Policy Optimization (SPO), a fixed negative-scaling baseline, and evaluate both current-policy-only objectives in asynchronous mathematical reasoning post-training.
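
The scale-imbalance failure mode and the proposed normalization can be shown with a small numeric sketch. This is one reading of the abstract, not the paper's objective: the surrogate loss form (advantage times mean negative log-probability), the treatment of the scale as a constant, and the toy log-probabilities are all assumptions.

```python
def mean_nll(token_logprobs):
    """Average token negative log-probability of one response under the current policy."""
    return -sum(token_logprobs) / len(token_logprobs)


def plain_losses(advantages, responses):
    """Per-response surrogate loss: -A * mean(log p) = A * mean NLL."""
    return [a * mean_nll(lp) for a, lp in zip(advantages, responses)]


def scale_normalized_losses(advantages, responses):
    """Divide each response's loss by its own current mean NLL.

    Assumption: in training the scale would be held constant (no gradient
    through it), so the loss value is rebalanced while the gradient with
    respect to the log-probabilities stays nonzero.
    """
    losses = []
    for a, lp in zip(advantages, responses):
        scale = mean_nll(lp)
        losses.append(a * mean_nll(lp) / scale)
    return losses


# Toy group: one confident response and one stale, low-probability response.
responses = [[-0.1, -0.2, -0.3], [-2.0, -1.0]]
advantages = [1.0, -1.0]  # zero-sum within the group

print(sum(plain_losses(advantages, responses)))
print(sum(scale_normalized_losses(advantages, responses)))
```

With zero-sum advantages the plain losses still sum to a nonzero value because the two responses sit at different negative log-probability scales; after per-response normalization the contributions cancel.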

13.
arXiv (CS.LG) 2026-06-17

A Generalized Sinkhorn Algorithm for Mean-Field Schrödinger Bridge

arXiv:2604.06531v3 Announce Type: replace-cross Abstract: The mean-field Schrödinger bridge (MFSB) problem concerns designing a minimum-effort controller that guides a diffusion process with nonlocal interaction to reach a given distribution from another by a fixed deadline. Unlike the standard Schrödinger bridge, the dynamical constraint for MFSB is the mean-field limit of a population of interacting agents with controls. It serves as a natural model for large-scale multi-agent systems. The MFSB is computationally challenging because the nonlocal interaction makes the problem nonconvex. We propose a generalization of the Hopf-Cole transform for MFSB and, building on it, design a Sinkhorn-type recursive algorithm to solve the associated system of integro-PDEs. Under mild assumptions on the interaction potential, we discuss convergence guarantees for the proposed algorithm. We present numerical examples with repulsive and attractive interactions to illustrate the theoretical contributions.

15.
arXiv (CS.LG) 2026-06-16

The Algebra of Units: From Buckingham's Pi-grec Theorem to Latent-Variable Learning

arXiv:2606.16737v1 Announce Type: cross Abstract: Engineers often measure many quantities (speed, pressure, temperature, length) expressed in different physical units. The Buckingham Pi-grec theorem states that these variables can always be combined into a smaller set of dimensionless numbers whose values fully determine the system's behaviour. Identifying the appropriate dimensionless groups has traditionally required expert knowledge and physical insight. This paper shows that they can instead be discovered automatically from data, without prior knowledge of the governing physics. The key observation is that, after logarithmic transformation, measurements collected under different scalings of the same system lie on a low-dimensional manifold whose geometry is determined by the underlying dimensionless groups. Singular value decomposition (SVD) identifies this manifold directly from data. A subsequent search over integer-exponent combinations recovers candidate dimensionless quantities, while a repeating-variable filter retains only those constructed from the machine's characteristic scales. This procedure recovers familiar engineering groups, including the flow coefficient, head coefficient, and Mach number, while excluding equivalent but less interpretable alternatives. The method is demonstrated on a synthetic compressor dataset containing 16,000 measurements. Starting from raw dimensional variables and no physics input, it recovers the correct dimensionless groups to numerical precision and reproduces the compressor performance map with an error below 0.01%. More broadly, the work reveals a close connection between classical dimensional analysis and modern data-driven learning. Both rely on the same underlying algebraic structure, suggesting new approaches for building physical models that are simultaneously interpretable, scalable, and data-efficient.
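
For orientation, the classical (knowledge-based) side of the theorem is easy to sketch: the number of independent dimensionless groups equals the number of variables minus the rank of the dimension matrix. The example below uses a simple pendulum rather than the paper's compressor data, and it is the textbook procedure, not the data-driven SVD method the abstract describes; the helper names are hypothetical.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a small matrix by Gaussian elimination over exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank


def is_dimensionless(dim_matrix, exponents):
    """True if the product of variables raised to `exponents` has no units."""
    return all(sum(d * e for d, e in zip(row, exponents)) == 0
               for row in dim_matrix)


# Columns: period T, length L, gravity g, mass m. Rows: base dimensions.
DIM = [
    [0, 0, 0, 1],    # mass
    [0, 1, 1, 0],    # length
    [1, 0, -2, 0],   # time
]

n_groups = len(DIM[0]) - matrix_rank(DIM)
print(n_groups)
print(is_dimensionless(DIM, [2, -1, 1, 0]))   # T^2 * g / L
print(is_dimensionless(DIM, [1, 0, 0, 0]))    # T alone
```

Four variables and three independent base dimensions leave a single group, T^2 g / L, and integer exponent vectors such as `[2, -1, 1, 0]` are the kind of candidates an integer-exponent search would test.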

16.
medRxiv (Medicine) 2026-06-23

Antibodies against influenza A/H1N1pdm2009 and B/Victoria strains but not A/H3N2 are increased in recent onset type 1 narcolepsy versus matched controls

Study Objectives: Onsets of narcolepsy type 1 (NT1) increased following A/H1N1 vaccination with Pandemrix™ in Europe and with A/H1N1pdm2009 infections in China and other countries. To test if other strains could trigger narcolepsy, we measured strain-specific antibodies in patients with recent onset NT1 compared to controls. Methods: Antibodies against hemagglutinin (HA) and neuraminidase (NA) were tested in 62 patients with very recent onset (onset and blood collection following a single flu season, mean +/- SEM: 0.44 +/- 0.06 years since onset) and 100 controls matched by age, sex, season and year of collection (2000-2025). Results were next extended to 181 recent onset patients (mean +/- SEM: 1.00 +/- 0.05 years) versus 260 controls, matched by sex, season and year, but having a slightly higher mean age. HA inhibition (HAI) and NA inhibition (NAI) assays were conducted using flu strains known to circulate during the corresponding flu seasons. HAI results are shown as % positive (titers >= 40) and NAI results as geometric mean titers. Odds ratio (OR) and coefficient were used to compare antibody titers in NT1 versus controls. The contribution of each assay to prediction was finally quantified in the larger sample set using Shapley decomposition. Results: NT1 patients had increased anti-HA and anti-NA antibodies against A/H1N1pdm2009 (anti-HA OR = 3.86, anti-NA coefficient = 0.35) and B/Victoria (anti-HA OR = 1.90, anti-NA coefficient = 0.22), but not A/H1N1pre2009, A/H3N2, or B/Yamagata, independent of HLA-DQB1*06:02 status, age, sex, and flu season. Correlations between anti-HA and anti-NA antibody titers were weak to moderate but significant (r2 = -0.10 to 0.34). The multivariable model outperformed the age-only baseline (McFadden R2 = 0.19 vs. 0.03; AUC = 0.79 vs. 0.64; likelihood-ratio test X2 = 51, p

17.
arXiv (CS.AI) 2026-06-15

CoRe-MoE: Contrastive Reweighted Mixture of Experts for Multi-Terrain Humanoid Locomotion with Gait Adaptation

arXiv:2606.04718v3 Announce Type: replace-cross Abstract: Humans primarily rely on walking and running to traverse complex terrains. Similarly, humanoid robots should be able to smoothly transition between walking and running while maintaining natural and stable locomotion. However, unifying gait transition and multi-terrain adaptation within a single policy remains challenging due to gradient interference between tasks and the distribution shift caused by terrain variations. Although Mixture-of-Experts (MoE) architectures can mitigate multi-skill interference, direct joint training often fails to achieve clear expert specialization. To address these challenges, we propose CoRe-MoE, a two-stage reinforcement learning framework that decouples gait generation from terrain adaptation. In the first stage, a stable locomotion policy is learned to produce natural walking and running behaviors with smooth transitions. In the second stage, a terrain-aware MoE branch is introduced, and the gating network is trained with a contrastive objective to learn structured terrain representations and promote expert specialization. The final action is obtained through weighted fusion of the base gait policy and the terrain-aware branch, enabling the policy to preserve stable locomotion while adapting to complex terrains. Extensive simulation results demonstrate that the proposed method outperforms baseline approaches in terms of success rate, locomotion stability, and multi-terrain adaptability. Furthermore, zero-shot deployment on a Unitree G1 humanoid robot validates the effectiveness of our framework, achieving robust walking and running across stairs, slopes, steps, obstacles, and unstructured outdoor terrains while maintaining accurate foothold control and dynamic stability.

18.
arXiv (quant-ph) 2026-06-12

Quantum optical photoelectron interferometry

arXiv:2606.13447v1 Announce Type: new Abstract: We present a general theoretical framework for multiphoton processes driven by quantum light fields, establishing a direct link between photon statistics and photoelectron observables. Our results show that the autocorrelation and cross-correlation functions, which quantify the underlying photon statistics, are directly mapped onto the resulting photoelectron spectra. Although our framework is broadly applicable, we demonstrate specifically in the example of reconstruction of attosecond beating by interference of two-photon transitions (RABBIT) the influence of the light statistical properties. In this approach, the amplitude, contrast and phase of the oscillations of the sideband signal as a function of pump-probe delay reveal the quantum nature of light. We analyze these observables across several quantum configurations, including correlated infrared and harmonic modes, as well as the uncorrelated case with non-classical harmonic statistics, thereby establishing a general framework for quantum-light RABBIT spectroscopy. We compare the analytical theory with numerical simulations for the case of classical harmonics and an infrared field in a squeezed coherent state, obtaining excellent agreement. Our results reveal how the interplay between classical and quantum correlations dictates the coherence of the photoemission process, providing a new window into the quantum-optical foundations of attosecond science.

19.
arXiv (CS.CV) 2026-06-12

Heterogeneous LiDAR Early Fusion and Learned Re-Ranking Strategy for Robust Long-Term Place Recognition in Unstructured Environments

Robust localization in unstructured environments, such as agricultural fields, is a critical challenge for autonomous systems. LiDAR sensors provide detailed 3D information about the environment and are invariant to lighting conditions. For this reason, LiDAR-based place recognition methods have gained significant attention. In this paper, we propose MinkUNeXt-VINE++, a novel approach that combines early fusion of heterogeneous LiDAR data from two sensors (Livox Mid-360 and Velodyne VLP-16) and a learned re-ranking strategy at inference time. This fusion leverages the strengths of each sensor to provide a more comprehensive representation of the environment. Additionally, the re-ranking approach is particularly important in repetitive environments, such as vineyards, as finding true positives is a major challenge. We evaluated our approach using the TEMPO-VINE dataset, which provides heterogeneous LiDAR data in vineyard environments across different phenological stages. Our results demonstrate that MinkUNeXt-VINE++ significantly improves place recognition performance compared to single-sensor approaches and state-of-the-art methods. MinkUNeXt-VINE++ achieves a 20% improvement in the Recall@1 metric compared to single-sensor approaches, and +30% including re-ranking. The code of our method is publicly available for reproduction.

20.
arXiv (quant-ph) 2026-06-15

No classical particle limit for massless quanta

arXiv:2606.14632v1 Announce Type: new Abstract: We investigate whether relativistic massless classical particles may emerge as the classical limit of massless quanta. To address this question independently of any specific dynamics, environment, or pointer basis, we develop an axiomatic and purely kinematical framework for the coarse-graining approach. In this formulation, a candidate classical phase space is taken as the outcome space of a POVM subject only to minimal classicality and covariance under the relevant spacetime symmetry group. Applying this framework to the Poincaré group, we prove a no-go theorem for massless particles: the covariance requirement is incompatible with the operational conditions for classicality. The theorem leaves open field-like limits of massless quanta, for example the emergence of electromagnetic or gravitational fields, while ruling out classical massless particles, such as classical photons or gravitons.

21.
arXiv (CS.LG) 2026-06-15

Detecting Lookahead Bias in LLM Forecasts

arXiv:2512.23847v2 Announce Type: replace-cross Abstract: We develop a statistical procedure to detect lookahead bias in economic forecasts generated by large language models (LLMs). Using a date-only recall query for a firm-date pair, we estimate the probability that the LLM has internalized information about the realized outcome, a statistic we term Lookahead Propensity (LAP). LAP is materially positive throughout the in-sample period and collapses essentially to zero right after the training-data cutoff. We show that a positive interaction between LAP and the LLM forecast in an accuracy regression indicates lookahead-bias contamination, and apply the test to two forecasting tasks: news headlines predicting stock returns and earnings call transcripts predicting capital expenditures. In both applications, the LLM forecast's predictive power is amplified on high-LAP firm-date pairs, and the interaction loses significance on post-training-cutoff samples. Our test provides a cost-efficient, diagnostic tool for assessing the validity and reliability of LLM-generated forecasts.
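
The accuracy regression mentioned in this abstract can be written schematically as follows. The notation is an assumption made for illustration (the abstract gives no formula): $y_{i,t}$ is the realized outcome for firm $i$ at date $t$, $\hat{y}_{i,t}$ is the LLM forecast, and any controls or fixed effects the authors include are omitted.

```latex
y_{i,t} = \alpha
        + \beta\,\hat{y}_{i,t}
        + \gamma\,\mathrm{LAP}_{i,t}
        + \delta\,\bigl(\mathrm{LAP}_{i,t} \times \hat{y}_{i,t}\bigr)
        + \varepsilon_{i,t}
```

Under this reading, a positive and significant interaction coefficient $\delta$ is the signal of lookahead-bias contamination, and the abstract reports that the interaction loses significance on samples after the training cutoff.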

22.
arXiv (CS.LG) 2026-06-12

Extracting Governing Equations from Latent Dynamics via Multi-View Contrastive Learning

arXiv:2606.13260v1 Announce Type: new Abstract: Identifying latent dynamical systems from noisy, high-dimensional measurements is a central problem at the intersection of representation learning, system identification, and scientific discovery. We present DYSCO, a multi-view temporal contrastive learning algorithm that jointly recovers latent trajectories and the governing dynamics from such observations, by leveraging multiple independent noisy views of the same underlying process to disentangle signal from noise. By parameterizing the dynamics in a structured functional basis, our framework further enables symbolic recovery of the governing equations within an affine gauge. We offer theoretical guarantees for strong identification up to an affine indeterminacy, extending prior identifiability results to the realistic setting of noisy nonlinear observations. Empirically, we demonstrate accurate recovery of both latent trajectories and flow fields across a diverse set of dynamical regimes (e.g., chaotic, oscillatory, and metastable) under both Gaussian and Poisson observation noise, the latter being particularly relevant for neural recordings.

23.
medRxiv (Medicine) 2026-06-16

Physiological Aging of the Respiratory System (PARS): from development to application

Background: Aging has a critical role in lung changes and the outcome of lung disease. Several lung aging equations have been proposed to measure deviation from physiological aging of the respiratory system. In this study, we aimed to develop a single measure of accelerated lung aging and show its application as a measure of lung aging. Method: We used a pre-bronchodilator pulmonary function test (PFT) from NHANES adult participants recruited from 2007 to 2011. We applied the Klemera-Doubal method (KDM) to four PFT measurements, FEV1, FVC, FEF25-75, and PEF, to calculate a measure of lung biological aging. Physiological Aging of the Respiratory System (PARS) was calculated from the residual method vs. chronological age. We tested the construct validity of PARS by measuring its association with risk factors of lung health. The prognostic validity was measured using a survival analysis. Sampling weights were applied to all analyses. Results: In 14,123 adult participants, the mean (SD) of accelerated lung age (PARS) was 0 (8.2) years. Participants with a history of asthma and emphysema had 4- and 10-year higher PARS. Cigarette smoking, lower socioeconomic status, black race, higher serum cadmium, and lower serum selenium and magnesium were associated with higher PARS. During 116 months of follow-up, PARS was associated with a higher mortality (HR = 1.06, 95%CI: 1.05-1.07 per year). Females with higher PARS had a higher risk of death (P for interaction < 0.001). Results were consistent across different subgroups and sensitivity analyses. Conclusion: PARS is a noninvasive lung aging marker and can be applied as a single measure of accelerated lung aging in the adult population. Its strong construct and predictive validity support its future application among different populations with and without lung disease.
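
The "residual method" named in the Method section can be sketched as follows: regress the estimated biological age on chronological age and keep the residual as the acceleration measure. This is a generic, unweighted illustration with made-up ages; the study itself applied sampling weights and derived biological age with KDM, neither of which is reproduced here, and the function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y ~ x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residual_age(chronological, biological):
    """Residual of biological age after regressing on chronological age (years)."""
    slope, intercept = fit_line(chronological, biological)
    return [b - (intercept + slope * c)
            for c, b in zip(chronological, biological)]


# Made-up illustrative ages, not study data.
chronological = [30.0, 40.0, 50.0, 60.0, 70.0]
biological = [28.0, 45.0, 49.0, 66.0, 67.0]

acceleration = residual_age(chronological, biological)
print([round(v, 2) for v in acceleration])
```

Residuals from a least-squares fit with an intercept average to zero by construction, which is consistent with the reported mean PARS of 0 years; a positive residual indicates lungs that are "older" than expected for the person's age.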

24.
arXiv (CS.CL) 2026-06-11

ICA Lens: Interpreting Language Models Without Training Another Dictionary

Finding interpretable directions in language-model representations is critical for understanding and controlling model behavior. Sparse autoencoders (SAEs) have become the standard tool for this purpose, but using them as the default first lens often requires training, storing, and evaluating large overcomplete dictionaries. This bottleneck limits rapid exploration and raises a fundamental question: how much interpretable structure is already visible from activation geometry before training another neural dictionary? Our intuition is simple: many interpretable directions are selective on tokens, and these directions should look less Gaussian than random directions. We therefore revisit independent component analysis (ICA), a classical method for finding non-Gaussian directions, as a compact lens for language-model interpretability. We find that ICA has been underestimated for LLM interpretability, because prior uses often relied on off-the-shelf ICA implementations that are brittle on LLM activations and lacked systematic tools for inspecting and evaluating the recovered directions. To bridge these gaps, we introduce ICALens, the first practical workflow for stable, efficient, and auditable ICA analysis of LLM representations. It combines an optimized GPU-parallel FastICA pipeline with LLM-specific stability recipes and better fitting diagnostics, enabling efficient and reliable layer-wise analysis. Across GPT-2 Small, Gemma 2 2B, and Qwen 3.5 2B Base, ICALens efficiently recovers compact, human-interpretable directions without per-layer gradient-based dictionary training. On SAEBench, ICA is competitive with public SAEs in sparse probing and outperforms them in targeted probe perturbation under small-to-medium budgets. These results suggest that ICA should not be viewed as a weak baseline, but as an efficient and complementary first lens for exploring language-model representations.
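The intuition that token-selective directions look less Gaussian than random ones can be checked with a toy calculation. The sketch below uses excess kurtosis, one classical non-Gaussianity measure; the abstract does not say which contrast function ICALens uses, so this choice is an assumption, as are the 2% firing rate and the synthetic activations.

```python
import random


def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (about 0 for Gaussian data)."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((v - mu) ** 2 for v in values) / n
    m4 = sum((v - mu) ** 4 for v in values) / n
    return m4 / (m2 ** 2) - 3.0


rng = random.Random(0)
n_tokens = 20000

# Projection onto a "random" direction: roughly Gaussian across tokens.
dense = [rng.gauss(0.0, 1.0) for _ in range(n_tokens)]

# Projection onto a hypothetical "selective" direction: silent on most
# tokens, active on about 2% of them.
selective = [
    rng.gauss(0.0, 1.0) if rng.random() < 0.02 else 0.0
    for _ in range(n_tokens)
]

dense_kurt = excess_kurtosis(dense)
selective_kurt = excess_kurtosis(selective)
```

The selective projection has a heavy-tailed, spiky distribution, so its excess kurtosis is far above that of the dense projection, which stays near zero. ICA searches for directions that maximize this kind of departure from Gaussianity.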

25.
arXiv (CS.CV) 2026-06-24

Face versus Body Tracking for Human-Robot Interaction: An Egocentric Dataset

Meaningful human-robot interaction (HRI) requires a robot to continuously assess user engagement through persistent user tracking. However, state-of-the-art Multi-Object Tracking models are heavily optimized for surveillance or autonomous driving. A social robot faces distinct egocentric challenges, such as humans moving in unpredictable nonlinear patterns, obstructing each other, or leaving and reentering the scene. These dynamics trigger frequent identity switches (IDSW), causing the robot to lose its footing mid-conversation. To address this, we introduce a focused, custom-annotated egocentric dataset collected via the Furhat robot. We present a systematic evaluation isolating detection errors from tracking logic, comparing face versus body tracking, and assessing the impact of extended memory and appearance re-identification (ReID). Results indicate that increasing temporal memory mitigates prolonged occlusions but fails on complex dynamic events. Integrating ReID resolves complex switches but exhibits opposing effects: it substantially improves body tracking stability, yet causes facial IDSW to spike due to profile angle sensitivity. Ultimately, our optimized pipeline reduces IDSW by 49% compared to a standard tracking-by-detection baseline, effectively mitigating interaction breakdowns. As standard benchmarks lack dense, close-quarter occlusions, this work highlights the critical need for natively captured social dynamics to truly validate HRI perception models.
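To make the IDSW metric concrete, the sketch below counts identity switches from per-frame matches between ground-truth people and predicted track IDs. It assumes the common convention that a switch occurs when a person's matched track ID differs from the last ID they were matched to; the paper's exact counting rule is not given in the abstract, and the frame data and function names here are hypothetical.

```python
def count_id_switches(frames):
    """Count identity switches over a sequence of per-frame matches.

    Each frame maps a ground-truth identity to its matched track ID,
    or to None when the person is missed (e.g. occluded).
    """
    last_track = {}
    switches = 0
    for matches in frames:
        for gt_id, track_id in matches.items():
            if track_id is None:
                continue  # missed detection: keep the last known track ID
            prev = last_track.get(gt_id)
            if prev is not None and prev != track_id:
                switches += 1
            last_track[gt_id] = track_id
    return switches


def relative_reduction(baseline, optimized):
    """Fractional reduction of a count relative to a baseline."""
    return (baseline - optimized) / baseline


# Hypothetical four-frame sequence (not from the dataset).
frames = [
    {"alice": 1, "bob": 2},
    {"alice": 1, "bob": None},  # bob is occluded
    {"alice": 1, "bob": 3},     # bob re-enters under a new track ID
    {"alice": 2, "bob": 3},     # alice's track ID changes
]
n_switches = count_id_switches(frames)
```

Here both the re-entry after occlusion and the mid-sequence ID change count as switches. A reported 49% reduction corresponds to `relative_reduction(baseline, optimized)` returning 0.49, for example with hypothetical counts of 100 and 51.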