Bot Detection: Will Focusing on Recall Cause Overall Performance Deterioration?

Nazer, T.H.1, Davis, M.1, Karami, M.1, Akoglu, L.2, Koelle, D.3, and Liu, H.1

Presented at the 2019 International Conference on Social Computing, Behavioral-Cultural Modeling, & Prediction and Behavior Representation in Modeling and Simulation (SBP-BRiMS), Washington, DC (July 2019)

Social bots are an effective tool in the arsenal of malicious actors who manipulate discussions on social media. Bots help spread misinformation, promote political propaganda, and inflate the popularity of users and content. Hence, it is necessary to differentiate bot accounts from human users. Several bot detection methods approach this problem. Conventional methods either focus on precision regardless of overall performance or optimize overall performance, say F1, without monitoring the effect on precision or recall. Focusing on precision means that accounts marked as bots are more likely than not to be bots, but a large portion of the bots could remain undetected. From a user’s perspective, however, it is more desirable to have less interaction with bots, even at the cost of some precision. This can be achieved by a detection method with higher recall. A trivial but useless solution for high recall is to classify every account (human or bot) as a bot, which results in poor overall performance. In this work, we investigate whether a method can focus on recall without considerable loss in overall performance. Extensive experiments on the trade-off between recall and precision suggest that high recall can be achieved without much deterioration in overall performance. This research leads to a recall-focused approach to bot detection, REFOCUS, along with some lessons learned and future directions.
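The trade-off described in the abstract can be illustrated with a minimal sketch. This is not the REFOCUS method: the labels, scores, thresholds, and helper names (`precision_recall_f1`, `classify`) below are hypothetical and invented only to show how precision, recall, and F1 respond when a classifier marks more accounts as bots.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for binary labels where 1 means bot."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # bots caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # humans flagged as bots
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # bots missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def classify(scores, threshold):
    """Mark an account as a bot (1) when its score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


# Hypothetical toy data (assumption, not from the paper): 1 = bot, 0 = human.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.5, 0.3, 0.2, 0.2, 0.1, 0.1]

# Trivial solution: call every account a bot.
all_bot = precision_recall_f1(y_true, [1] * len(y_true))
# Precision-focused: a high threshold flags only the most obvious bots.
strict = precision_recall_f1(y_true, classify(scores, 0.7))
# Recall-focused: a lower threshold catches more bots at some cost in precision.
relaxed = precision_recall_f1(y_true, classify(scores, 0.4))

for name, (p, r, f1) in [("all-bot", all_bot), ("strict", strict), ("relaxed", relaxed)]:
    print(f"{name:8s} precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

On this invented data, labeling everything as a bot reaches full recall but poor precision and F1, while the lower threshold raises recall without lowering F1. The numbers only illustrate the metrics and say nothing about results on real accounts.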

1 Arizona State University
2 Carnegie Mellon University
3 Charles River Analytics

For More Information

To learn more or request a copy of a paper (if available), contact David Koelle.

(Please include your name, address, organization, and the paper reference. Requests without this information will not be honored.)