Please Stop Quoting Alexa Data

Far too often I hear people quoting Alexa data. Just last week, at the 2007 Omniture Summit, I witnessed Tim O’Reilly using Alexa charts to prove Web 2.0 success in front of 1,000 smart web analytics professionals. I know I couldn’t have been the only person in the crowd to notice. For the benefit of Tim and anyone else who uses Alexa data, please take note:


I touched on this in a competitive intelligence metrics post back in October, showing that Alexa’s data is less accurate at determining true site traffic than the number of characters in the domain name, but now I’d like to really illustrate how far off Alexa’s data is.

Many people have pointed out Alexa’s data is biased towards a certain crowd and can be manipulated (see the links at the bottom of this post), but none have illustrated the margin of error that I’m about to. Below I take a look at two very different sites with very different web traffic stats.

Site 1: Allrecipes – Allrecipes is a leading food site – as you might expect, Allrecipes users are similar to what you might see on the Internet as a whole, though slightly more female.

Site 2: SEOMoz – SEOMoz is a site that caters to the SEO and online marketing community – a crowd more likely to install the Alexa toolbar.

Using Alexa, you might conclude that SEOMoz receives more traffic than Allrecipes:

[Chart: Alexa Reach]

[Chart: Alexa Rank]

Both sites are very popular within their target audience, but despite what Alexa may show, Allrecipes has much more traffic. Let’s face it: more people cook food than perform SEO! In fact, if you were to populate the above charts with actual data, SEOMoz would be a flat sliver near the x-axis. Here’s some real data from Dec. ’06:

Allrecipes Unique Visitors: 11,023,187
SEOMoz Unique Visitors: 102,523
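
As a quick sanity check on the two figures above, here is a minimal Python sketch that computes how much larger the Allrecipes audience is. The variable names are only illustrative, and the calculation uses nothing beyond the two unique-visitor counts quoted in this post. It does not reproduce any Alexa values, since the charts' underlying numbers are not given here.

```python
# Unique visitors for Dec. '06, as quoted in the post.
allrecipes_uv = 11_023_187
seomoz_uv = 102_523

# How many times larger the Allrecipes audience is.
ratio = allrecipes_uv / seomoz_uv

# The same gap expressed as "percent more than SEOMoz".
percent_more = (ratio - 1) * 100

print(f"Allrecipes has about {ratio:.1f}x the unique visitors of SEOMoz")
print(f"That is roughly {percent_more:,.0f}% more")
```

On real numbers alone, Allrecipes comes out at roughly 107.5 times the audience of SEOMoz, before accounting for the fact that the Alexa charts rank the two sites the other way around.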

If you were to use Alexa charts to draw conclusions about either site based off real numbers for one site, your traffic estimates would be off by approximately 11,842%. Numbers that big are often difficult to grasp, so I like to put it in perspective. A mistake of that magnitude is the equivalent of:

  • The CIA mistaking the population of Ohio for that of China.
  • Your accountant saying you owe $1,000 to the IRS, when you really owe $119,417.
  • A cop pulling you over for doing 60 in a 30, when you were really going half-a-mile-per-hour.
  • Telling your spouse you’ll be home in three hours, then showing up 15 days later.
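
For anyone who wants to check the scale of these comparisons, here is a small Python sketch. It assumes that "off by 11,842%" means the true value is 11,842% larger than the estimate (a multiplier of about 119x), which is my reading of the figure rather than a stated definition. The helper name is hypothetical, and the population example is left out because the post does not quote those numbers.

```python
def percent_off_to_multiplier(percent_off: float) -> float:
    """Convert 'X% larger' into a plain multiplier (assumed interpretation)."""
    return 1 + percent_off / 100

multiplier = percent_off_to_multiplier(11_842)

# Ratio of the true value to the mistaken value in each analogy above.
analogies = {
    "IRS bill ($119,417 vs. $1,000)": 119_417 / 1_000,
    "speed (60 mph vs. 0.5 mph)": 60 / 0.5,
    "time away (15 days vs. 3 hours)": (15 * 24) / 3,
}

print(f"Error multiplier: about {multiplier:.1f}x")
for label, factor in analogies.items():
    print(f"{label}: {factor:.1f}x")
```

Each analogy lands at a factor of about 119 to 120, which matches the multiplier implied by the percentage under that assumption.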

These are mistakes that none of us could get away with, so why should we let Alexa?

I’m not the first to prove Alexa data is flawed. Here are links to other Alexa skeptics:
Peter Norvig, Paul Stamatiou, Josh Pigford, Matt Cutts, Rand Fishkin (thanks for the data!), Greg Linden, Bruce Stewart, Alex Iskold, John Chow, and Markus Frind.


21 Replies to “Please Stop Quoting Alexa Data”

  1. Thank you for writing this! I’m sending a link to this post to my boss. He frequently uses Alexa data, but I’ve never been able to prove to him that the data is messed up.

  2. Great observations, Dustin! Thanks for sharing them.

    I’ve used Alexa data for competitive trend analysis but I’ve always provided the data with a couple caveats: first, the data is not representative of the general internet population, but only of a small subset: Alexa toolbar users. What that really means is anybody’s guess because I’m not aware of a sociological study of Alexa users. :: grin ::

    And second, any traffic trend analysis is only as good as the aggregation of the trends themselves, and the trends say nothing about individual user behavior. Watching server logs and traffic analysis charts is about as effective at determining what users want from your site as measuring shopping behavior by analyzing the wear on the tiles in your shopping mall. It shows trends, not much more than that.

    But people can’t wrap their minds around that — they see numbers, and numbers mean reality, so the data must mean something relevant, right?

    Problem is, trend analysis only goes so far in telling us what we need to know: what promotion strategy has the best long-tail impact? What strategy produces the greatest conversion of visitor to loyal reader? Why did they leave the site? Did they bookmark the site, and will they return? How are the numbers impacted by corporate and ISP proxies? So on, and so on. Some of this can be gotten at with more complicated user interaction / logging, but that’s not cheap, and it could be “intrusive.”

    Again, great article. Write more!


  3. Even worse is when people try to use Alexa to analyze any non-US or, worse still, non-English website. Alexa in Japan, for example, is just a joke. No one is using it, so the data is completely non-representative.

    I’ve got a blog and a podcast (mainly in German), and I know that my podcast’s site has more visitors than my blog by orders of magnitude. Still, on Alexa the numbers are reversed.

  4. Just like I mentioned earlier today about Alexa’s numbers on sites: do not try to read the numbers as-is; they are meaningless. You can do trending and comparison of like sites when both are in the top 100K sites. Anything else, like your example of comparing SEOMoz and Allrecipes on Alexa, is like trying to compare a freeway to a country: they are different animals and thus cannot be compared using Alexa data. You could, however, draw some conclusions between two SEO sites or two recipe sites. But you can never mix the two: because the population using the Alexa toolbar is skewed, comparisons between sites with different subjects are meaningless.

  5. Well, it’s well known that Alexa is biased toward webmasters, because they are more likely to install the Alexa toolbar. That’s why SEOMoz got a much higher rating: it got more attention from webmasters. That said, it still provides a good comparison between similar web sites. Besides, is there a better way to rank web sites?

  6. Interesting topic! I was about to use Alexa data when I saw this and stopped. However, if we should not use Alexa, what is an alternative to get some estimate of traffic?

    Thanks in advance,

  7. Sudip,
    Alexa has gotten better, but so has its competition. Here is what seem to be the best sources in terms of data quality and reliability, from highest to lowest:
    1) Your own analytics data (doesn’t work for competitors)
    2) Comscore or Nielsen (very expensive and not very granular)
    3) Hitwise or Compete (somewhat expensive, but much more granular) and Quantcast (mostly free, and quantified sites should be accurate)
    5) Google (limited information, but free)
    6) Alexa
    7) Other sources

    This list is different

  8. Alexa should put a disclaimer on its website warning people that its data is “For Amusement / Entertainment Use Only,” as Alexa rankings CAN be easily manipulated.
