Abstract: As AI systems become increasingly capable, the Department of War faces a critical challenge: how do we develop, rigorously evaluate, and safely deploy frontier multi-agent AI systems across domains ranging from multimodal knowledge discovery to cognitive warfare? This talk presents lessons learned from building compound AI architectures that orchestrate large language models, vision-language models, and specialized agents through retrieval-augmented generation and agentic AI workflows. I will demonstrate how these systems enable cross-disciplinary knowledge synthesis for biosecurity, cognitive warfare planning and execution, and operator-AI team optimization in wargaming and readiness applications. Finally, I will present our emerging capabilities in multi-domain wargaming, where cognitively inspired AI agents execute doctrine-based maneuvers across air, space, cyber, and information domains.
Evaluating these systems requires moving beyond traditional AI benchmarks. I will present our multi-dimensional evaluation ecosystem, which combines quantitative measures, qualitative subject-matter expert (SME) assessments scaled through simulated domain expert agents, and causal investigations using structure learning algorithms to understand "why" behaviors emerge and "how" interventions affect mission outcomes. For safety evaluation, we examine human-agent-environment interactions holistically, using counterfactual "what-if" analysis and continuous monitoring to address alignment failures, emergent capabilities under distributional shift, and systemic risks from multi-agent coordination. The era of scientifically grounded, operationally validated human-AI team optimization has begun, and this talk charts the path forward for defense applications.
Bio: Dr. Svitlana Volkova is Chief of AI at Aptima, Inc., where she sets the company's AI vision and leads a portfolio of advanced research programs in compound frontier AI systems, human-AI teaming, and AI Test and Evaluation for national defense. A recognized thought leader in AI for national security, she has shaped the technical direction of multi-million-dollar federal research initiatives, with an emphasis on transitioning AI technologies to operational use. Her pioneering work spans multimodal frontier models, agentic AI architectures, human digital twins, and causal AI/ML, with a focus on decision advantage, readiness, and cognitive warfare applications. Dr. Volkova has authored 100+ publications with 4,900+ citations, delivered keynotes and invited talks at premier venues spanning AI research (AAAI, ACL, EMNLP), defense (I/ITSEC, MODSIM, INFOPAC), academia (Stanford, CMU), and industry (Google Research, Amazon), and served as a trusted advisor to government leadership on AI strategy. Prior to Aptima, she led AI research initiatives at Pacific Northwest National Laboratory and conducted research at Microsoft Research. She holds a PhD in Computer Science from Johns Hopkins University.