“Without change there is no innovation, creativity, or incentive for improvement. Those who initiate change will have a better opportunity to manage the change that is inevitable.”
– William Pollard
Mr. Pollard surely had the positive side of change in mind when he said this, but we do not live in a fairytale. The real world can be harsh, and it is certainly not black and white: hackers produce more than enough novelty, too. One of their most important innovations still in use today is undoubtedly the Domain Generation Algorithm (DGA) technique, valued for its ability to help malware avoid detection.
Once modern malware infects a computer, it usually needs to establish a connection with Command and Control (C&C) servers, which instruct the malware to perform various actions. Modern malware can run seamlessly without any major impact on the performance of the infected device. To decrease the risk of malware losing contact with its C&C, hackers invented DGA and started using it on a massive scale, evading traditional security solutions. DGA was first used in 2008 by the Conficker botnet. Instead of communicating with command centers on pre-defined domains or IP addresses, Conficker contacted a practically unlimited number of pseudo-randomly generated domains. Because the algorithm is deterministic, the attacker can compute the same list in advance and register only a few of the names. Conficker started by rotating about 250 domains per day across several top-level domains (TLDs, such as .com, .net, or .io). Nowadays, malware can generate tens of thousands of different domains per day, or more; there is nearly no upper limit. Because the form of the domain name depends on the individual or group that created the malware, each malware family has its own approach to generating these domains. Some create domains using MD5 hashes, while others simply alternate letters and numbers, or combine words from dictionaries. If a domain gets blocked, it does not matter: in the next time window the malware switches to a new domain and establishes a new connection.
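To make the idea concrete, here is a minimal sketch in Python of the hash-based approach mentioned above. It is a toy illustration, not the algorithm of Conficker or of any real malware family: the function name `generate_domains`, the seed string, the label length of 12 characters, and the TLD list are all assumptions chosen for the example. The point is only that a shared seed plus the current date yields the same list of names for anyone who runs the algorithm.

```python
import hashlib
from datetime import date


def generate_domains(seed, day, count=5, tlds=(".com", ".net", ".org")):
    """Derive `count` domain names from a shared seed and a calendar day.

    Hypothetical toy scheme: hash "seed:date:index" with MD5 and use the
    first 12 hex characters of the digest as the domain label.
    """
    domains = []
    for index in range(count):
        material = f"{seed}:{day.isoformat()}:{index}".encode("utf-8")
        digest = hashlib.md5(material).hexdigest()
        label = digest[:12]                 # 12 hex characters as the label
        tld = tlds[index % len(tlds)]       # rotate through the TLD list
        domains.append(label + tld)
    return domains


if __name__ == "__main__":
    # Same seed and same day always give the same list;
    # a different day gives a different list.
    for name in generate_domains("demo-seed", date(2024, 1, 1)):
        print(name)
```

Blocking one of the generated names changes nothing for the attacker, because the next day (or the next index) simply produces another one.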
So why is this problematic for mitigation? Traditional security solutions detect known communication patterns. When known malware that relies on DGA tries to communicate with a C&C server, it is blocked based on the request it sends, because the request matches a pattern already known to be dangerous; it does not matter which domain the request goes to. But what if the malware is not known? What if its communication has not yet been examined thoroughly and there are no well-documented patterns that can be used to identify it? Traditional security solutions will not be able to detect and block such malware, which is usually labeled Zero-Day malware.
To properly defend against malware that uses DGA-created domains to communicate with C&C servers, you need a product that can recognize them even without knowing the exact communication pattern. Whalebone does this. Our solutions use multiple indicators to determine whether a domain is malicious or not, and some of them were implemented specifically to identify domains created by DGA. We parse every single DNS request, and based on the structure of the data our algorithm tries to determine whether the domain in the query was created by a machine. Many of the domains that an infected device tries to reach may not exist for various reasons, and we keep track of that, too. In ambiguous cases, detection is reinforced by a neural network that imitates a human analyst’s decision-making process and estimates whether the analyst would conclude that something malicious is going on.
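Two of the indicators described above, the structure of the queried name and the share of lookups that hit non-existent domains, can be illustrated with a short Python sketch. This is not Whalebone's implementation: the names `looks_generated` and `NxdomainTracker`, the use of Shannon entropy and digit ratio as a stand-in for "structure", and every threshold are assumptions picked for the example. The label extraction is also simplified and ignores multi-part suffixes such as .co.uk.

```python
import math
from collections import Counter, defaultdict


def shannon_entropy(label):
    """Shannon entropy of the characters in `label`, in bits per character."""
    total = len(label)
    if total == 0:
        return 0.0
    counts = Counter(label)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_generated(domain, entropy_threshold=3.5, digit_threshold=0.3, min_length=10):
    """Rough guess whether the label left of the TLD was machine-generated."""
    parts = domain.lower().rstrip(".").split(".")
    label = parts[-2] if len(parts) >= 2 else parts[0]
    if len(label) < min_length:
        return False  # too short to judge from structure alone
    digit_ratio = sum(ch.isdigit() for ch in label) / len(label)
    return shannon_entropy(label) >= entropy_threshold or digit_ratio >= digit_threshold


class NxdomainTracker:
    """Counts, per client, how many DNS lookups ended in a non-existent domain."""

    def __init__(self, min_queries=20, ratio_threshold=0.5):
        self.min_queries = min_queries
        self.ratio_threshold = ratio_threshold
        self.total = defaultdict(int)
        self.failed = defaultdict(int)

    def record(self, client, resolved):
        self.total[client] += 1
        if not resolved:
            self.failed[client] += 1

    def is_suspicious(self, client):
        total = self.total.get(client, 0)
        if total < self.min_queries:
            return False  # not enough data for this client yet
        return self.failed.get(client, 0) / total >= self.ratio_threshold
```

A simple threshold like this will misfire on some long legitimate names and miss dictionary-based DGAs, which is one reason to combine several indicators rather than rely on a single score.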
As hackers push innovations to avoid detection, security vendors must innovate as well in order to offer protection that can match hackers’ tools. To return to part of William Pollard’s quote: “those who initiate change will have a better opportunity to manage the change that is inevitable.” This is especially true for defenders and security vendors, for whom it is even more important to implement novel approaches in an ever-changing environment.