Our research spans two fundamental areas that are critical to a future where AI proliferates and permeates every aspect of our lives.
Model Informatics: the study of models as complex information systems. We investigate how AI models encode, process, and transform information, developing frameworks to understand their internal representations, behaviors, and capabilities. This includes analyzing model architectures, training dynamics, and emergent properties to create comprehensive profiles of AI systems.
Verifiable Computation: the development of techniques to programmatically verify the execution of complex systems. We create cryptographic and algorithmic methods that enable automated verification of AI model behaviors, capabilities, and integrity without requiring access to internal parameters or training data.
Our research in Model Informatics and Verifiable Computation has led to the development of breakthrough technologies that enable trustworthy and verifiable AI deployments. These technologies form the foundation of our products and services, providing practical solutions for real-world AI verification challenges.
Learn more about our core technologies, including Behavioral Fingerprinting for AI model identification and ZKTorch for zero-knowledge verification of large language models.
A methodology for measuring agent behavioral traits as directions in embedding space, applied to diffs of agent skill files over time. Achieves 91.2% sign classification accuracy on data-seeking trait detection. Includes an agent-to-agent protocol for continuous automated evaluation. Accepted to ICML '26.
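As a rough illustration of what "traits as directions in embedding space" can mean, the sketch below builds a direction from labelled example embeddings and scores a change by projecting it onto that direction. The difference-of-means construction, the before/after embedding subtraction, and every function name here are assumptions made for illustration; the source does not describe the paper's actual procedure.

```python
import math

def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def trait_direction(positive, negative):
    """Unit vector from the mean 'negative' embedding to the mean 'positive' one.

    Assumption: a difference of means is one common way to define a direction.
    """
    diff = [a - b for a, b in zip(mean_vector(positive), mean_vector(negative))]
    norm = math.sqrt(sum(x * x for x in diff))
    return [x / norm for x in diff]

def trait_shift(before, after, direction):
    """Signed projection of an embedding change onto the trait direction."""
    return sum((a - b) * d for a, b, d in zip(after, before, direction))

def sign_accuracy(shifts, labels):
    """Fraction of shifts whose sign matches the labelled change (+1 or -1)."""
    hits = sum(1 for s, y in zip(shifts, labels) if s * y > 0)
    return hits / len(labels)
```

Under these assumptions, a positive `trait_shift` means a skill-file revision moved toward the trait and a negative one means it moved away, and `sign_accuracy` is the kind of metric a "sign classification accuracy" figure would summarize.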
ICML '26 · AIWILD Workshop
We introduce Stability Monitor, a black-box stability monitoring system that periodically fingerprints an endpoint by sampling outputs from a fixed prompt set and comparing the resulting output distributions over time. In controlled validation, Stability Monitor detects changes to model family, version, inference stack, quantization, and behavioral parameters. Accepted to ACM CAIS '26.
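The idea of comparing output distributions over time can be sketched in a few lines. This is a minimal illustration, not Stability Monitor's implementation: it assumes outputs are compared as exact strings, uses Jensen-Shannon divergence as the distance (the source does not say which statistic is used), and all function names are hypothetical.

```python
from collections import Counter
import math

def output_distribution(samples):
    """Empirical distribution over the distinct outputs sampled for one prompt."""
    counts = Counter(samples)
    total = sum(counts.values())
    return {out: n / total for out, n in counts.items()}

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    Ranges from 0 (identical) to 1 (no shared outputs).
    """
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)

def drift_score(baseline, current):
    """Mean per-prompt divergence between two fingerprints.

    Each fingerprint maps a prompt to the list of outputs sampled for it.
    """
    scores = [
        js_divergence(output_distribution(baseline[p]),
                      output_distribution(current[p]))
        for p in baseline
    ]
    return sum(scores) / len(scores)
```

A monitor built this way would store a baseline fingerprint, resample the same prompts on a schedule, and flag the endpoint when `drift_score` exceeds a threshold calibrated on the endpoint's normal sampling noise.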
Exploring methods and frameworks for identifying AI systems developed by foreign adversaries and implementing appropriate policy responses.
White Paper
Introducing ZKTorch, the first universal zero-knowledge machine learning compiler designed for real-world AI applications.
White Paper | code
A comprehensive white paper exploring the critical importance of AI model verification, behavioral fingerprinting, and establishing trust in AI systems for enterprise deployment.
White Paper
Cryptographic verification of location, identity, and confidentiality in cloud environments for sovereign AI processing.
ICDS '25
Model endpoints with extreme levels of behavioral instability show high levels of task instability compared to peer endpoints serving the same nominal model. First evidence that fingerprint instability predicts agent tool-calling changes.
Substack
The evolving landscape of LLM agents requires a fundamental shift in how model hosting infrastructure is measured. Introducing Stability Arena, a public monitor designed to track Identity, Stability, and Fidelity: three new, essential metrics for the agent era.
Substack
Just because an AI is generating tokens reliably doesn't mean those tokens are the right ones. Exploring why standard reliability metrics fall short and why stability (behavioral consistency over time) is what actually matters for AI-native applications.
Substack
Establishing ground truth via verifiability and accuracy. Exploring how to prove the authenticity of model outputs and ensure correctness, two prerequisites for meaningful AI assurance.
Substack
Establishing clarity around the terms used to discuss AI assurances. Part I tackles explainability and interpretability: what they mean, how they differ, and why the distinction matters.
Substack
Inspired by bioinformatics, Model Informatics is the systematic study of AI models as complex information systems. Exploring the tools and frameworks needed to understand, recognize, and verify properties of AI models.
Substack
Agents coordinating to complete complex, multi-step tasks for users and using a marketplace to bid out individual tasks to specialty agents.
Substack
There is a consistent pattern of increasing the robustness of security features for core computing technologies. AI shouldn't be any different. Walking through how previous core technologies increased security as they gained adoption.
Substack
Developing on top of probabilistic compute changes how we build software, software products, and eventually anything. Exploring the challenges product leaders face around model uncertainty and planning.
Substack
While we may not know how a model generates its result, we should still know what was asked of the model to get the result. Examining the critical differences between transparency and interpretability.
Substack
Just reviewing outputs from a model won't tell you which model you're using. Even if your API provider says which model you're using, you have no way to verify it independently.
Substack
How can you prove a model passed an eval with the reported score? Exploring why benchmark results need cryptographic verification and how VAIL makes that possible.
Substack
The number of AI models is growing fast and that's good. Drawing parallels between AI proliferation and sugar adoption to understand societal impacts and dependencies.
Substack
Since AI models are stochastic machines, we can't predict their exact outputs. Exploring why verifiable computing is essential to ensure trust in probabilistic AI systems.
Substack
Stay updated with the latest research, insights, and thought leadership on AI and model informatics.
Visit our Substack →
30 minutes. We'll show you behavioral fingerprinting, stability monitoring, and agent behavior tracking on real systems.