SmartData Collective

Dr Gates was right, or how I learned to stop worrying and love the spam

DavidMSmith

In 2004, Microsoft founder (and honorary doctorate recipient) Bill Gates confidently stated that "Spam will soon be a thing of the past." It's now five years later (Gates suggested the problem would be solved in two), and spam now accounts for 95% of all email sent. Nonetheless, I think Gates was mostly right in principle, even if the timeline was optimistic.

A decade ago, when email spam was a real problem, I took care not to let my email address be displayed in public. Spammers had a habit of scraping email addresses from websites, with automated robots crawling the web looking for any text containing the @ symbol. Despite my efforts, I had to abandon a couple of email addresses after they were added to the mailing lists traded between spammers and the noise overwhelmed the signal in my inbox.

That was before the advent of good spam filters, though, and filters have improved greatly in the last couple of years. I now use Google Mail for all my mail, which has excellent spam-filtering technology. Even my non-Google addresses are forwarded to a Gmail account, which I can rely on to filter the crap so that I can see the emails I actually care about.

I started my current job about 9 months ago now, and I made a conscious decision to stop worrying about spam and let my email address — david@revolution-computing.com — be free. It's linked directly on every page of this blog and on the REvolution Computing website, and I don't hesitate to include it in other public venues. It's been out there long enough to be picked up by robots and web searches, so it's probably time to evaluate the results. I'd say it's a success, and I'm very glad I took the plunge. I maybe get 2 spam emails a week in my Google Mail account (faithfully tucked away in my Spam folder), and better yet I don't think I've lost any legitimate mail to the spam filter. (So if you've emailed me and I haven't replied, I have only myself to blame. My apologies – I do get a lot of legitimate email.) I don't use any other email services so I can't speak to the performance of their spam filters, but I'm happy with my results.

So what changed between 2004 and now? My guess is that it's mainly been the transition to web-based email services. Statisticians have attempted to solve the spam problem before with predictive models, but the results were never that great at the time. The difficulty was likely twofold. First, it's a highly asymmetric problem: a false positive (legitimate mail flagged as spam) is far more costly than a false negative (spam reaching the inbox), yet too many false negatives mean the filter isn't really useful in practice. Second, I think the corpus was simply too small: a few hundred thousand emails, or even all the emails for all the employees of a largish company with a central email server, simply isn't going to result in a filter that gives a clean inbox while not trashing any legitimate mail sent to a broad community of users.
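
To make that asymmetry concrete, here is a minimal Python sketch. It assumes a model that outputs a spam probability for each message, and the 100:1 cost ratio is purely illustrative rather than a measured figure. The names `spam_threshold` and `should_flag` are hypothetical and not taken from any real filter.

```python
def spam_threshold(cost_false_positive: float, cost_false_negative: float) -> float:
    """Spam probability above which flagging has the lower expected cost.

    Flagging costs (1 - p) * cost_false_positive on average, delivering
    costs p * cost_false_negative. Flagging wins once
    p > cost_fp / (cost_fp + cost_fn).
    """
    return cost_false_positive / (cost_false_positive + cost_false_negative)


def should_flag(p_spam: float,
                cost_false_positive: float = 100.0,  # assumed: losing real mail hurts a lot
                cost_false_negative: float = 1.0) -> bool:  # assumed: one stray spam is cheap
    """Flag a message only when the model is confident enough given the costs."""
    return p_spam > spam_threshold(cost_false_positive, cost_false_negative)
```

With a 100:1 ratio the threshold is 100/101, or about 0.99, so a message the model scores at 0.95 still lands in the inbox. That is exactly why a mediocre model lets so much spam through: it rarely becomes confident enough to clear the bar.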

Web-based email certainly solves the corpus-size problem, but there's one additional detail that I expect makes it work. The defining feature of spam is that the same email is sent to lots and lots of people, and a web-based email service like Google Mail can easily see when a duplicate email arrives for lots and lots of users at the same time. Spammers have attempted various tricks to make that detection more difficult (converting text to images, or adding random text to each mail so that the copies no longer match exactly), but Google seems to have largely overcome these hurdles.
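
I don't know how Google actually does this, but the general idea of spotting near-duplicates despite random padding can be sketched in a few lines of Python. The sketch below compares messages by their overlapping three-word sequences ("shingles") and Jaccard similarity. The function names and the 0.6 similarity cutoff are assumptions chosen for illustration only.

```python
import re


def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word sequences in the text, ignoring case and punctuation."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Share of shingles the two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def count_near_duplicates(message: str, other_messages: list,
                          min_similarity: float = 0.6) -> int:
    """Count how many other messages are near-copies of this one."""
    target = shingles(message)
    return sum(1 for other in other_messages
               if jaccard(target, shingles(other)) >= min_similarity)
```

A few random words tacked onto the end of a message change only a handful of shingles, so the padded copies still score well above the cutoff, while unrelated mail shares essentially none. A service seeing millions of mailboxes would need something far more scalable than pairwise comparison (MinHash is one well-known option), but the principle is the same: a message with many near-copies across many users' mailboxes is probably spam.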

So then, is the spam problem solved? At a technical level, clearly not — spam still consumes a tremendous amount of bandwidth and costs billions of dollars to contain — but at the personal level it's hardly more than a minor irritant these days. (And if it's not for you, consider a new email service.) For individuals, the real spam problem these days lies in other venues: social networking spam, blog spam, link farms, and so on. Mr Gates, when can we expect solutions to those problems? 
