The 6th USENIX Workshop on Large-Scale Exploits and Emergent Threats (LEET), Washington, DC (August 2013)
Malware code has forensic value, as evidenced by recent studies that link the creators of Duqu and Stuxnet through similarities in their code. We present FuncTracker, a system built on top of Palantir to discover, visualize, and explore relationships between malware code, with the intent of drawing connections across very large corpora of malware: millions of binaries comprising terabytes of data. To address this scale, we forgo the classic data-mining methods that require pairwise comparison of feature vectors and instead represent each malware sample as a set of hashes over carefully selected features. To ensure that a hash match implies a strong match, we represent individual functions using hashes of semantic features, in lieu of the syntactic features commonly used in the literature. A graph representing a collection of malware is formed with function hashes as nodes, making it possible to explore the collection using the classic graph operations supported by Palantir. By annotating the nodes with additional information, such as the location and time at which the malware was discovered, one can use the relationships among malware samples to make connections between otherwise unrelated clues.
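The set-of-hashes idea described in the abstract can be illustrated with a minimal sketch. It is not FuncTracker's implementation: the feature strings, the choice of SHA-256, the order-independent canonicalization, and the names `function_hash`, `build_index`, and `related_samples` are all assumptions made for illustration. The sketch shows only how indexing samples by function hash lets related samples be found by lookup, without comparing every pair of feature vectors.

```python
import hashlib
from collections import defaultdict


def function_hash(semantic_features):
    """Hash one function's features (hypothetical strings).

    Assumption: features are treated as an unordered collection, so they are
    sorted before hashing to make the digest independent of their order.
    """
    canonical = "\n".join(sorted(semantic_features))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_index(corpus):
    """Map each function hash (a graph node) to the samples that contain it."""
    index = defaultdict(set)
    for sample_name, functions in corpus.items():
        for features in functions:
            index[function_hash(features)].add(sample_name)
    return index


def related_samples(index):
    """Count shared function hashes for each pair of samples that share any."""
    shared = defaultdict(int)
    for samples in index.values():
        ordered = sorted(samples)
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                shared[(first, second)] += 1
    return dict(shared)


if __name__ == "__main__":
    # Hypothetical corpus: each sample is a list of functions,
    # and each function is a list of made-up semantic feature strings.
    corpus = {
        "sample_a": [["eax = eax + ebx", "mem[esp] = eax"], ["ecx = 0"]],
        "sample_b": [["mem[esp] = eax", "eax = eax + ebx"], ["edx = 1"]],
        "sample_c": [["esi = edi"]],
    }
    print(related_samples(build_index(corpus)))
```

Only samples that land in the same hash bucket are ever paired, so the cost depends on how many samples share a function rather than on the square of the corpus size. Annotations such as discovery time or location could be attached to the entries of the index in the same way.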