Everyday physical goods like refrigerators or even chairs undergo extensive testing, so consumers can rely on their performance and know the conditions under which they will fail. The deep neural networks (DNNs) that autonomous systems run on are, by their nature, resistant to that sort of comprehensive evaluation, which reduces operational transparency. Ensuring predictable, reliable behavior from these deep learning systems remains a challenge, and the few verification algorithms in place require a level of expertise well beyond that of the average user.
The Toolbox for Verification of Autonomous Systems with Neural Components (TOOL-VAN) from Charles River Analytics enhances existing verification algorithms while making them easy to apply to deep learning models. TOOL-VAN will enable the Navy, one of the primary funders of the solution, to purchase and implement autonomous systems with greater confidence. The project was awarded up to $2M in Phase II Small Business Innovation Research (SBIR) funding from the Office of Naval Research.
Testing a neural network against every permutation and combination of inputs is physically impossible, says Dr. Jeff Druce, Senior Scientist at Charles River and Co-Principal Investigator on TOOL-VAN. “You just can’t manually perturb the system in every way and know what it’s going to do,” he says. TOOL-VAN addresses this problem and enables verification and validation (V&V) of such autonomous systems at scale.
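A back-of-the-envelope calculation shows why exhaustive testing is out of reach. The image dimensions below are a common but hypothetical example, not taken from TOOL-VAN:

```python
# Count the distinct images a vision DNN with a typical input size could
# receive. Dimensions are illustrative (224x224 RGB, 8-bit color depth).
import math

height, width, channels = 224, 224, 3   # a common input size for vision DNNs
values_per_channel = 256                # 8-bit color depth

# Total distinct inputs = 256^(224*224*3). The number itself is far too large
# to enumerate, so report how many decimal digits it has instead.
num_pixel_values = height * width * channels
digits = int(num_pixel_values * math.log10(values_per_channel)) + 1
print(f"Distinct possible inputs: a number with {digits:,} decimal digits")
```

For this input size the count runs to over 360,000 decimal digits, so even testing a vanishingly small fraction of the input space one image at a time is hopeless.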
“The toolbox accounts for a wider class of inputs, which is a step up from existing DNN V&V approaches,” says Michael Harradon, Principal Scientist at Charles River and Co-Principal Investigator on TOOL-VAN. Most V&V solutions can evaluate only a finite set of test instances, preventing analysis of environmental factors, Harradon explains. TOOL-VAN, he notes, expands individual test inputs to evaluate performance over a broad range of transformations, including factors like contrast, lighting conditions, and even weather. If an autonomous driving system has trained only on daylight images, for example, the toolbox makes it easier to find out how the system will behave at night and in the rain.
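The idea of expanding a single test input over a range of transformations can be sketched in a few lines. This is a minimal illustration, not TOOL-VAN's algorithm; the toy classifier, the brightness/contrast sweep, and all names here are assumptions introduced for the example:

```python
# Sketch: take one test input, sweep it through a grid of brightness and
# contrast transformations, and measure how often the model's prediction
# stays stable. The model and image below are toy stand-ins.
import numpy as np

def toy_model(image):
    """Stand-in classifier: predicts 1 if mean intensity exceeds 0.5."""
    return int(image.mean() > 0.5)

def transform(image, brightness, contrast):
    """Apply a brightness shift and contrast scale, clipped to [0, 1]."""
    return np.clip((image - 0.5) * contrast + 0.5 + brightness, 0.0, 1.0)

def robustness_report(image, brightness_range, contrast_range):
    """Fraction of transformed inputs on which the prediction is unchanged."""
    baseline = toy_model(image)
    outcomes = [
        toy_model(transform(image, b, c)) == baseline
        for b in brightness_range
        for c in contrast_range
    ]
    return sum(outcomes) / len(outcomes)

rng = np.random.default_rng(0)
image = rng.uniform(0.4, 0.8, size=(8, 8))        # toy 8x8 grayscale "image"
score = robustness_report(image,
                          brightness_range=np.linspace(-0.3, 0.3, 7),
                          contrast_range=np.linspace(0.5, 1.5, 7))
print(f"Prediction stable on {score:.0%} of transformed inputs")
```

A single labeled image thus stands in for a whole family of plausible operating conditions; formal verification tools of the kind TOOL-VAN packages can reason about such transformation ranges symbolically rather than by sampling a grid.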
Ease of use is an important goal for TOOL-VAN. To achieve it, the team studied how DoD analysts and everyday users would work with the system in practice and incorporated those findings into the tool. Through a drag-and-drop interface, the solution determines which V&V method from its toolkit to apply. The results are presented as a report that predicts behavior across a range of evaluation scenarios.
While TOOL-VAN started as a solution for the Navy, it can find more general application across a variety of cyber-physical autonomous systems. Phase I of the project demonstrated the feasibility of lowering the barrier to adoption of V&V systems. Among other tasks, Phase II will expand the library of formal verification methods and further refine the user interface.
In the near term, TOOL-VAN will help the Navy confidently automate sensing and control in a variety of autonomous vehicles. Commercial paths for TOOL-VAN include licensing or selling the solution for autonomous vehicles. Internally, Charles River plans to use the toolkit to enhance its proprietary explainable AI software.
The Charles River team is excited about the solution enabling V&V at scale. “TOOL-VAN is useful because it builds understanding of the situations in which you would want to use an autonomous system—and when you wouldn’t,” Harradon says.
Contact us to learn more about TOOL-VAN and our capabilities in explainable AI, user interfaces and human-machine teaming, and robotics and autonomy.
This material is based upon work supported by the Office of Naval Research under Contract No. N68335-24-C-0261. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Office of Naval Research.