Software to detect and analyze malware attacks

Malware Analysis and Attribution using Genetic Information (MAAGI)

Cyberattacks that rely on malware, such as viruses, Trojans, and worms, are a growing threat to U.S. missions and resources. To combat this threat, the Defense Advanced Research Projects Agency (DARPA) created the Cyber Genome program, which aims to develop revolutionary new cyber-forensic techniques that automate the discovery, identification, and characterization of malware variants.

Simulated image from Charles River Analytics project MAAGI.

The Charles River Analytics solution

As part of the Cyber Genome program, Charles River developed and is refining MAAGI (Malware Analysis and Attribution using Genetic Information). In its current version, MAAGI combines ideas and techniques from biological evolution, reverse engineering of computer programs, and linguistics to rapidly identify the source and intent of new malware attacks.

MAAGI exploits the fact that malware authors often reuse code from one attack to the next while trying to conceal this reuse from defenders by changing the “surface” features of the malware. By discovering the essential “genetic” properties that are preserved from one malware sample to the next, MAAGI seeks to determine the lineage of each sample and uses that lineage to help characterize the source of the malware. Furthermore, by understanding the patterns of evolution in malware, MAAGI can be used to predict future malware development, anticipating potential attacks rather than merely reacting to them, as we do today.

MAAGI also uses methods from functional linguistics to identify the functional features and potential intent of malware, aspects that are especially likely to be preserved even when surface features change. MAAGI allows an analyst to view the evolution of malware on a gene-by-gene basis, as shown in the figure above.
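The lineage idea described above can be illustrated with a minimal sketch. This is not MAAGI's actual algorithm, which this page does not specify. The sketch assumes each sample has already been reduced to a set of feature strings (hypothetical labels such as `imp:connect`) and a first-seen order, uses Jaccard similarity as a stand-in similarity measure, and treats the most similar earlier sample above a hypothetical threshold as the likely parent.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

# (name, first-seen order, feature set); all three fields are illustrative.
Sample = Tuple[str, int, FrozenSet[str]]


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Share of features common to both sets, from 0.0 to 1.0."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def infer_parents(
    samples: List[Sample], min_similarity: float = 0.5
) -> Dict[str, Optional[str]]:
    """Map each sample to its most similar earlier sample, or None.

    A sample whose best match falls below min_similarity is treated
    as the root of a new lineage.
    """
    ordered = sorted(samples, key=lambda s: s[1])
    parents: Dict[str, Optional[str]] = {}
    for i, (name, _, feats) in enumerate(ordered):
        best_name: Optional[str] = None
        best_score = min_similarity
        for prev_name, _, prev_feats in ordered[:i]:
            score = jaccard(feats, prev_feats)
            if score >= best_score:
                best_name, best_score = prev_name, score
        parents[name] = best_name
    return parents


samples = [
    ("a", 1, frozenset({"imp:CreateFileA", "imp:WriteFile", "str:cfg.dat"})),
    ("b", 2, frozenset({"imp:CreateFileA", "imp:WriteFile", "str:cfg.dat",
                        "imp:connect"})),
    ("c", 3, frozenset({"imp:CreateFileA", "imp:WriteFile", "imp:connect",
                        "imp:send"})),
    ("d", 4, frozenset({"imp:RegOpenKeyExA", "str:run.key"})),
]
lineage = infer_parents(samples)
```

Here `b` attaches to `a`, `c` attaches to `b`, and `d` starts a lineage of its own because it shares no features with the earlier samples. A real system would need features that survive obfuscation; plain set overlap is only the simplest possible stand-in.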


Some of MAAGI’s features include the following:

Visualization of malware lineages

  • Cluster similar malware
  • View local lineage for a cluster
  • See similarities and differences within a cluster
  • Sort features by type (e.g., files, header properties, imports, and traces of DLL calls)


Selection options

  • Similarity display updates based on the malware selected
  • Ability to select multiple samples
  • Selecting a whole lineage cluster displays similarity/difference characteristics


Convenient lineage views

  • View samples and descendants, but not ancestors
  • View ancestors, but not descendants
  • View local lineage compared to other samples
  • Filter features by in-group and out-group occurrence
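Filtering by in-group and out-group occurrence, the last item in the list above, can be sketched as follows. This is an illustrative assumption about what such a filter might compute, not a description of MAAGI's implementation: the function name, the thresholds, and the representation of samples as sets of feature strings are all hypothetical.

```python
from typing import Iterable, List, Set


def filter_features(
    in_group: List[Set[str]],
    out_group: List[Set[str]],
    min_in: float = 1.0,
    max_out: float = 0.0,
) -> List[str]:
    """Return features common inside a group and rare outside it.

    min_in:  minimum fraction of in-group samples that carry the feature.
    max_out: maximum fraction of out-group samples that carry the feature.
    """

    def frequency(feature: str, group: Iterable[Set[str]]) -> float:
        group = list(group)
        if not group:
            return 0.0
        return sum(feature in sample for sample in group) / len(group)

    # Only features seen in the in-group can satisfy min_in > 0.
    candidates = set().union(*in_group)
    return sorted(
        f
        for f in candidates
        if frequency(f, in_group) >= min_in
        and frequency(f, out_group) <= max_out
    )


cluster = [{"imp:connect", "imp:send", "str:cfg.dat"},
           {"imp:connect", "imp:send"}]
others = [{"imp:send", "str:run.key"}]

# Features present in every cluster sample and absent everywhere else.
distinctive = filter_features(cluster, others)
```

With the default thresholds, only `imp:connect` is returned: `imp:send` also occurs outside the cluster, and `str:cfg.dat` is missing from one cluster sample. Loosening `min_in` or `max_out` widens the result.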



Search options

  • Search for similar samples
  • Search for similar samples in other lineages
  • Substring search

(U.S. Air Force photo by Tech. Sgt. Cecilio Ricardo.
Photo used with permission from the U.S. Air Force.)

The benefit

MAAGI is an innovative approach to the Cyber Genome challenge of characterizing and predicting the evolution of malware. It supports detection and attribution of cyberattacks for both the defense and law enforcement communities.

By recognizing code and techniques from previous attacks, MAAGI enables quicker responses to cyberattacks. MAAGI is proactive in that it not only assesses attacks but also anticipates and predicts the properties of future attacks. Finally, MAAGI changes the economics of malware by making it harder for malware authors to reuse their code simply by changing its superficial features.

Contact us to learn more about MAAGI and our other cybersecurity capabilities.

The views expressed are those of the author and do not reflect the official policy or position of the Department of Defense or the U.S. Government.
Distribution Statement “A” – Approved for Public Release, Distribution Unlimited.
