Most Popular Subdomains and MX Records on the Internet

Simply put, today's internet runs on DNS. The concept is laced with hierarchical overtones, attributable to the structured nature of the protocol itself and to its pivotal role in the proper functioning of the network of networks. A quick survey of today's visible internet points to nearly 364 million domain registrations across the most popular top-level domains. That's a testament to the rapidly growing need for coherent DNS and IP intelligence solutions that can quickly and effectively sort through the resulting complexity.

DNS records can reveal a plethora of important information, including perimeter protection mechanisms and technologies in use, inconsistencies in canonical entities that carry specific security implications, and similar flaws leading to potential DNS takeover scenarios, so the value is definitely there. At the heart of this blog post lies yet another attempt at recognizing the importance of information gathering and asset discovery for security researchers and bug bounty hunters alike, as they look for a suitable mix of passive DNS enumeration capabilities and techniques.

Our goal is to showcase the most commonly used subdomains and MX record types as they complement and enrich the asset discovery ecosystem. If you're in the business of network reconnaissance or asset discovery, mastering these techniques can go a long way toward ensuring flexibility when examining potential areas of exposure and validating legitimate targets of opportunity prior to any engagement. Let's take a quick look.

Most popular subdomains on the internet

In the recent past, we've articulated that finding associated domains linked to a specific target is central to the idea of extending the attack surface.
This long-standing argument reflects the possibility of both horizontal and vertical domain correlation, where the intent is to search for any available subdomains and siblings corresponding to the apex. Success in this area is always measured in terms of forgotten or mishandled domain records, which give miscreants an additional target of opportunity to capitalize on.

As a refresher, domain names consist of human-readable character strings with a one-to-one correspondence to a specific web resource. In turn, DNS leverages a subordinate arrangement starting with TLDs, or top-level domains, composed of prominent extensions such as .com or .net, followed by second- and third-level domains which consumers can acquire and control at will. This form of domain administration allows for further specialization whereby domains can be scaled to generate the desired aggregates. For instance, third-level domains, or subdomains as they are normally referred to, can identify an FTP server simply by prepending ftp to domain.com; the resulting unique identifier, ftp.domain.com, is otherwise known as a fully qualified domain name, or FQDN.

Playing a role often attributed to hostnames within organizational boundaries, subdomains typically exhibit the greatest flexibility when it comes to naming conventions. Thus, large-scale DNS intelligence dictates that keeping an eye on the fluidity within domain names offers a critical view of the threat landscape. This is also the case where subdomain knowledge is leveraged at high-level stages of the recon process, targeting institutional privacy via recursive DNS data and any resulting bidirectional activity in the process. So, what are the most popular subdomains, and how can we identify them?
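Before answering, the naming hierarchy described above can be made concrete with a minimal Python sketch. The function name split_fqdn is hypothetical, and the code assumes a single-label TLD such as .com or .net; multi-label public suffixes (for example .co.uk) would need a public suffix list instead.

```python
def split_fqdn(fqdn: str) -> dict:
    """Split a fully qualified domain name into TLD, apex, and subdomain.

    Assumption: the TLD is exactly one label (.com, .net, ...). This is a
    simplification for illustration, not a general-purpose parser.
    """
    # Drop the optional trailing root dot and normalize case.
    labels = fqdn.rstrip(".").lower().split(".")
    if len(labels) < 2 or not all(labels):
        raise ValueError(f"not a usable domain name: {fqdn!r}")
    return {
        "tld": labels[-1],                          # top-level domain
        "apex": ".".join(labels[-2:]),              # second-level domain plus TLD
        "subdomain": ".".join(labels[:-2]) or None, # everything left of the apex
    }


parts = split_fqdn("ftp.domain.com")
print(parts)
```

Under that single-label assumption, ftp.domain.com breaks down into the TLD com, the apex domain.com, and the subdomain ftp.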
From a corpus of over 17 billion records of crawled web data for the .com TLD, and associated URLs, hosted at Common Crawl, we set out to investigate the feasibility of pulling a subset of these records using a common programming language like Python, some commodity hardware, and supporting tools like the CDX Toolkit. Wor...
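As a rough illustration of the counting step only, the sketch below tallies subdomain labels from a list of URLs using the Python standard library. It is not the pipeline used in the post: the function name count_subdomains and the sample URLs are hypothetical, the input is a small in-memory list standing in for URLs pulled from the Common Crawl index, and the apex is assumed to be the last two labels of each hostname.

```python
from collections import Counter
from urllib.parse import urlsplit


def count_subdomains(urls, apex_labels=2):
    """Count how many distinct hostnames use each subdomain prefix.

    Assumption: the apex domain is the last `apex_labels` labels of the
    hostname (two labels fits .com names such as example.com).
    """
    seen_hosts = set()
    counts = Counter()
    for url in urls:
        host = urlsplit(url).hostname  # None when the URL has no network location
        if not host:
            continue
        host = host.rstrip(".")
        if host in seen_hosts:
            continue  # count each hostname once, however many URLs it has
        seen_hosts.add(host)
        prefix = host.split(".")[:-apex_labels]
        if prefix:
            counts[".".join(prefix)] += 1
    return counts


# Hypothetical sample input for illustration.
sample_urls = [
    "https://www.example.com/",
    "https://www.example.com/about",
    "https://mail.example.com/inbox",
    "http://www.example.net/a",
    "https://example.com/",
    "https://blog.example.com/post",
]
top = count_subdomains(sample_urls).most_common(1)
print(top)
```

Deduplicating by hostname keeps a heavily crawled site from inflating the tally; counting per URL instead would measure crawl volume rather than how widely a subdomain name is used.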

About the podcast

Listen to all the articles we release on our blog while commuting, while working or in bed.