Figuring Out How Spammers Get E-Mail Addresses
Spammers employ a variety of methods to acquire e-mail addresses. Some methods take advantage of the e-mail addresses readily available on the Internet, whereas others employ different levels of trickery, from harvesting to outright stealing.
Harvesting from the Internet
Spammers (and their assistants) use a technique called harvesting to acquire e-mail addresses. Harvesting requires a lot of bandwidth, but the idea is simple: Download the right pages from select Web sites and extract the e-mail addresses that are there for the picking. Some of the tools and sources employed in harvesting e-mail addresses from the Web include the following:
- Web spiders: Spammers employ Web crawlers and spiders that harvest e-mail addresses from Web sites. It's common for Web sites to include mailto: URLs as well as unlinked user@domain addresses. Put your e-mail address on a Web site, and you're spam bait.
  These spiders are much like the spiders and Web crawlers that Yahoo, Google, and others use to scan the Internet's Web sites and keep their search indexes fresh, except that e-mail address harvesting spiders are up to no good. And where do these spiders get domain names? With over 90 million .com domains in existence, it's easy enough to just guess domain names and come up with quite a few.
- Newsgroups: It's a straightforward task to harvest e-mail addresses from Usenet newsgroups: Just pull in a big newsfeed and extract the e-mail addresses with a simple shell or Perl script. Newsgroup traffic is still growing and already runs to at least several gigabytes per day, which means lots of e-mail addresses are there for the taking. Any spammer with enough bandwidth can slurp up all those bits and just sift out the e-mail addresses.
- Groups, blogs, and discussion boards: Yahoo! and Google have their groups and mailing lists, many of which make their members' e-mail addresses available. There are thousands of blogs and discussion boards out there, too, that contain easily acquired e-mail addresses.
- Test messages: In this method, spammers send test e-mails to addresses they simply guess, such as service@, info@, test@, marketing@, and security@ at a given domain.
  At one time, spammers could reliably conclude that if no "bounce-o-gram" came back from the domain, the e-mail address must be legit. This is because e-mail servers used to routinely send nondelivery receipts (NDRs) back to the sender of any message addressed to a nonexistent mailbox. But that ain't necessarily so anymore: More servers are opting to stop sending NDRs, so silence no longer proves that an address exists.
- Unsubscribe links: Many spam messages include an opt-out or unsubscribe link so that the recipient can request not to receive more spam. However, often the real purpose of unsubscribe links is to confirm a valid, active e-mail address.
- Malware: Spammers sometimes use Trojan horses, viruses, and worms to extract e-mail addresses from individual users' computers. If a mass-mailing worm can read the contents of a user's e-mail address book in order to propagate, then it's just as easy to perform the same extraction and simply send the list back to the hacker's lair. In fact, that is probably the easier job, because quietly sending one list home is far less likely to be detected than a mass mailing.
- Unsubscribe requests: A good number of spam messages contain "unsubscribe me" links that a user clicks to opt out. However, many spam operators keep right on sending spam to the addresses submitted through those links. When a user submits such a request, the spammer learns that the submitted address is valid and that someone reads its mail. Do you think they'll actually stop sending spam to a known valid address? Not on your life!
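To see how little effort the Web spider approach takes, and to check how exposed your own pages are, you can run the same kind of pattern match a harvester would. The following is a minimal Python sketch, not any particular spammer's tool: the function name exposed_addresses and the sample page are made up for illustration, and the address pattern is deliberately simplified (real e-mail address syntax is much broader). It looks for both mailto: URLs and unlinked user@domain text, and it decodes HTML entities and percent-encoding first, because those simple disguises don't stop a determined spider.

```python
import re
from html import unescape
from urllib.parse import unquote

# Simplified pattern for illustration; it does not cover every valid address.
ADDRESS_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")
# Captures the target of a mailto: URL up to a quote, "?", ">", or whitespace.
MAILTO_RE = re.compile(r"mailto:([^\s\"'?>]+)", re.IGNORECASE)


def exposed_addresses(html_text):
    """Return a sorted list of addresses that a simple spider could lift."""
    # Undo HTML entity obfuscation such as joe&#64;example.net.
    text = unescape(html_text)
    found = set()
    # mailto: targets may be percent-encoded (for example, %40 for "@").
    for target in MAILTO_RE.findall(text):
        found.update(a.lower() for a in ADDRESS_RE.findall(unquote(target)))
    # Unlinked user@domain addresses anywhere in the page text.
    found.update(a.lower() for a in ADDRESS_RE.findall(text))
    return sorted(found)


if __name__ == "__main__":
    page = ('<a href="mailto:Sales@example.com?subject=Hi">Sales</a> '
            'or write to info@example.org.')
    print(exposed_addresses(page))
    # prints ['info@example.org', 'sales@example.com']
```

If a scan like this turns up an address on one of your own pages, assume a harvesting spider can find it, too.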
Buying and stealing addresses
Among spammers and e-mail address brokers, e-mail addresses are a traded and sold commodity. If you know where to look, you can purchase CDs and downloads containing e-mail addresses by the hundreds of thousands or millions.
Business and service provider e-mail lists are also stolen and sold to spammers. In mid-2004, a former AOL employee was charged with stealing 90 million screen names and 30 million e-mail addresses from AOL and selling them to a spammer for $100,000. This is not an isolated case, but it is a noteworthy one because of the size of the heist. So much for privacy, eh?