Mal-Netminer: Malware Classification Approach based on Social Network Analysis of System Call Graph
📝 Abstract
As the security landscape evolves, with thousands of species of malicious code seen every day, antivirus vendors strive to detect and classify malware families so that they can respond to malware campaigns efficiently and effectively. To support this effort, and drawing on ideas from social network analysis, we build a tool that helps classify malware families using features derived from the graph structure of their system calls. We first construct a system call graph from the system calls observed during the execution of the individual malware families. To explore distinguishing features of the various malware species, we study social network properties of the call graph, including the degree distribution, degree centrality, average distance, clustering coefficient, network density, and component ratio. We use features derived from those properties to build a classifier for malware families. Our experimental results show that influence-based graph metrics such as degree centrality are effective for classifying malware, whereas general structural metrics are less effective. Our experiments demonstrate that the proposed system performs well in detecting and classifying malware families within each malware class, with accuracy greater than 96%.
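To make the graph properties listed in the abstract concrete, the following is a minimal sketch of how they might be computed from a single system call trace. It is not the authors' implementation. The graph construction (an undirected edge between consecutive calls), the function names `build_call_graph` and `graph_features`, the sample trace, and the component ratio formula (c - 1)/(n - 1) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict, deque
from itertools import combinations


def build_call_graph(trace):
    """Assumed construction: link each system call to the call that follows it.

    The graph is undirected and ignores self-loops and repeated edges.
    """
    adj = defaultdict(set)
    for call in trace:
        adj[call]  # register every call as a node, even if it has no edges
    for a, b in zip(trace, trace[1:]):
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    return dict(adj)


def bfs_distances(adj, source):
    """Shortest-path hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def local_clustering(adj, v):
    """Fraction of pairs of v's neighbours that are themselves linked."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))


def graph_features(adj):
    """Compute the six properties named in the abstract for one graph."""
    n = len(adj)
    if n < 2:
        raise ValueError("need at least two distinct system calls")
    m = sum(len(nbrs) for nbrs in adj.values()) // 2  # undirected edge count

    total, pairs, seen, components = 0, 0, set(), 0
    for v in adj:
        dist = bfs_distances(adj, v)
        total += sum(dist.values())
        pairs += len(dist) - 1  # reachable nodes other than v itself
        if v not in seen:       # first visit to this connected component
            components += 1
            seen.update(dist)

    return {
        "degree_distribution": Counter(len(nbrs) for nbrs in adj.values()),
        "degree_centrality": {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()},
        "average_distance": total / pairs if pairs else 0.0,
        "clustering_coefficient": sum(local_clustering(adj, v) for v in adj) / n,
        "network_density": 2 * m / (n * (n - 1)),
        # Assumed definition: 0 when the graph is connected, 1 when every node is isolated.
        "component_ratio": (components - 1) / (n - 1),
    }


if __name__ == "__main__":
    # Hypothetical trace; real traces would come from dynamic analysis.
    trace = ["open", "read", "write", "read", "close"]
    for name, value in graph_features(build_call_graph(trace)).items():
        print(name, value)
```

In this toy trace every call is adjacent to `read`, so `read` has the highest degree centrality. That mirrors the intuition behind the influence-based metrics the abstract reports as most useful, although a per-family feature vector in practice would be built from many traces.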
📄 Content
Research Article
Jae-wook Jang,1 Jiyoung Woo,1 Aziz Mohaisen,2 Jaesung Yun,1 and Huy Kang Kim1
1 Graduate School of Information Security, Korea University, Seoul 136-713, Republic of Korea
2 Computer Science and Engineering Department, State University of New York at Buffalo (SUNY Buffalo), Buffalo, NY 14260-2500, USA
Correspondence should be addressed to Huy Kang Kim; cenda@korea.ac.kr
Received 19 May 2015; Accepted 3 August 2015
Academic Editor: Michael Small
Copyright © 2015 Jae-wook Jang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Hindawi Publishing Corporation, Mathematical Problems in Engineering, Volume 2015, Article ID 769624, 20 pages, http://dx.doi.org/10.1155/2015/769624

1. Introduction

Despite the increasing efforts and investments antivirus (AV) vendors are making to defend against the spread of malware families, malware infection is still one of today's most serious threats in the global security landscape. Malware infection is considered the first step in many attacks launched by cyber criminals, and the ever-increasing number of malware families has made defense against those criminals a difficult task. According to recent reports by AV-TEST [1], approximately 60 million new pieces of malware were reported for the period from January 2013 to December 2014. The techniques used to create those malware pieces have evolved over time, and malware authors create new variants that employ various circumvention techniques, such as encryption, polymorphism, and obfuscation. To defend against malware, AV vendors analyze tens of thousands of pieces of malware every day and prevent them from spreading, which puts them in an endless arms race with cyber criminals.

Cyber criminals can easily create malware variants with the same semantics by reusing the same core code. Although they generate many variants of the same malware family, the base malware families retain invariant characteristics and patterns of malicious behavior. Those invariant characteristics can be derived from the instructions or binary code of the malware. Using signature-based techniques to capture the similarity between the base family and its variants has several shortcomings. With polymorphism or metamorphism techniques, malware can evade signature-based detection while keeping its behavior unchanged.

Cyber defenders, including AV vendors, are reactive to malware and generate signatures to partly or wholly address the obfuscation and encryption circumvention techniques [2]. However, signature-based methods require human intervention to construct signatures based on domain knowledge, and defenders must continuously update the signature databases with new signatures. While these approaches are effective for known malware, they cannot detect unknown malware, particularly zero-day attacks. To overcome those shortcomings, the research community has established behavior-based methods as an alternative for malware detection, using dynamic analysis of malicious binaries. Using dynamic analysis, various prior studies proposed malware analysis methods that exploit one or more behavioral aspects of the malware execution, including statistical methods leveraging the system or API call sets [3–5], instruction pattern sets [6], or call graph matching [7, 8].