An Overview of the Weblog Tools Market

By Elise Bauer
August 6, 2004

Weblogs, although often described as online diaries, are a much more interesting trend than that label would imply. Yes, weblogs are personal journals on the web, and as such they represent the breadth and depth of human interest and knowledge. Not only do blogs allow millions of people to easily and instantaneously publish ideas to websites, most weblogs incorporate interactive features that let others easily comment on those sites, thus transforming the static web into millions of dynamic conversations. Weblogs are increasingly making their way into the professional communications arena, as evidenced by FCC Chairman Michael Powell’s blog, which he recently started to help generate public discussion of often-controversial FCC policy. Companies are beginning to use weblogs as an internal tool for knowledge sharing. Intuit has created a weblog to open lines of communication with its QuickBooks customers. Technorati tracks over 3 million weblogs, a number that appears to be doubling every 6 months, giving weblogs growth rates like those we saw in the early days of the web. Weblogs cannot be dismissed as a fad; they will change the very nature of how we connect and communicate.

Weblog Tools

What does the blog tool market look like today? Weblog tools can be distinguished along two dimensions: fee vs. free and hosted services vs. standalone software (although the second distinction is blurring as standalone application companies begin to offer hosted versions). The vast majority of blogs are hosted on services that offer weblog building tools and server space for free, for a small fee, or as a feature of a more comprehensive service. Blogger.com, an early-to-market free ad-supported service bought by Google in 2003, is the big gorilla in the market and most likely has the largest market share. Live Journal is a hosted blog community with over a million active accounts; 90% of its users are 25 years old or younger, and two thirds are female. Live Journal weblogs often look more like forums or chat sessions than web pages with structured content. DiaryLand has a large base of teen webloggers and the look of its website suggests a female skew as well. AOL launched its AOL Journals in 2003 as a feature of the AOL service. Typepad, a fee-only service offered by Six Apart, is the most professionally oriented of the hosted consumer services and attracts a broader demographic than the other services.

Hosted services probably represent between 70 and 85% of the weblogs published. The rest are built using standalone applications that are hosted on a user’s individual server or web host. The standalone applications tend to be aimed at a more technically sophisticated audience, offer many more features and more design flexibility, and are thus better suited to professional applications than the hosted services.

The key differentiator between the various standalone applications is whether they are open source and free or require a license fee. Of the standalone applications, Movable Type, also from Six Apart (the makers of Typepad), is the dominant one. Movable Type began its life as shareware, establishing a large user base and developer community. It has since moved to a fee-based licensing model. Expression Engine from pMachine, like Movable Type, requires a fee-based license, as does Radio Userland. Expression Engine and Radio offer hosting packages of their software. Movable Type and Expression Engine are considered the most robust of the standalone applications in terms of features and extensibility.

There are several open-source GPL or BSD licensed weblog packages available, including WordPress, Drupal, Greymatter, Textpattern, and Blosxom. Of these, WordPress appears to have the highest distribution to date. A benefit of open source solutions is that they tend to have active developer communities contributing to the code base and to user-maintained support forums. The downside is that there aren’t the funds to do market expansion or offer more comprehensive support. One benefit of a fee-based licensed product, not lost on the professional market, is that companies that charge for their products can offer better support and resources for product improvement.

Market Share

Trying to draw a picture of the emerging weblog tools market is difficult at best. The industry is still very young, rapidly changing, and for the most part highly fragmented, with new entrants coming on the scene every day. Coupled with the rise in weblog tools are syndication tools, such as RSS and Atom, which help proliferate weblog content across the Internet. Standards are getting established, the tools are evolving, and the market is growing quickly. Most of the tools are created by private companies that do not reveal their use statistics. A weblog tool can be a standalone software product, a paid service, or a feature of a product or service, which muddies the waters even further. Many of the tools are free, and although the companies can track downloads and accounts, people can have multiple accounts and download multiple times, leading to an overestimation of actual active users.

So which tools have the greatest share of the market? Without solid numbers, which even the tool providers would have a hard time providing, that is a difficult question to answer precisely. One way to look at it may be to consider which tools have the most influence, or are getting the most use. To try to get an answer to this question I’ve turned to Google. By typing in the domain name of a tool you can find the number of web pages that link to the domain name and the number of pages that contain the search term of that domain name. For example, AOL Journals are all hosted at journals.aol.com. By searching this domain name in Google we find that about 42,300 pages link to journals.aol.com and 159,000 pages contain the domain name journals.aol.com. This compares with 291,000 pages that link to blogspot.com (Blogger’s domain name) and 1.2 million pages that contain the domain name blogspot.com.

[Figure: weblog_tools2.gif]

For a reality check on these numbers, we know that Technorati tracks over 3 million blogs and that the number is doubling every six months. The scale of the Google links, at least, is in line with the Technorati numbers. Live Journal reports over 1.8 million weblogs, which is within range of the 1.1 million URLs that Google has picked up. I would not have expected as many URLs for Live Journal or DiaryLand, but then again these tools are targeted at a much younger audience, whose weblogs I would be less likely to come across, and who tend to keep their linking activity within their own community.
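To make the six-month doubling rate concrete, here is a minimal sketch of the arithmetic. The 3 million starting point and the six-month doubling period are the Technorati figures cited above; the function name `projected_blogs` is hypothetical, and the projection simply assumes the doubling rate continues to hold, which it may not.

```python
# Illustrative projection only: assumes the reported six-month doubling
# rate keeps holding. The function name is hypothetical.

def projected_blogs(current: float, months_ahead: float,
                    doubling_months: float = 6.0) -> float:
    """Extrapolate a count that doubles every `doubling_months` months."""
    return current * 2 ** (months_ahead / doubling_months)

tracked_now = 3_000_000  # weblogs Technorati reports tracking

for months in (6, 12, 18):
    estimate = projected_blogs(tracked_now, months)
    print(f"In {months} months: about {estimate:,.0f} weblogs")
```

Under that assumption the count would quadruple within a year, which is the kind of growth rate that makes precise share estimates go stale quickly.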

One anomaly is Typepad. Google finds 55,000 pages that link to typepad.com and 724,000 pages that contain the URL. I would expect a fee-only service to have nowhere near as many weblogs as one of the free services. Live Journal reports that only 2% of its users are paying for its relatively inexpensive premium service. What I can believe is that because Typepad is the most expensive consumer hosted blogging service, and the most feature-rich, it is attracting customers who are more serious about their blogs. I would expect that Typepad customers write more often and more thoughtfully about subjects that would interest a broader market than customers of a teen-oriented site like Live Journal or DiaryLand. Each blog entry in Typepad is stored on the Typepad server as a separate page and URL, so perhaps the Google numbers reflect not only the mentions of Typepad blogs on other weblogs, but also a larger number of entries than one would find on average with a free service or product. Free services, by their nature of being free, would have a larger proportion of rarely used weblogs.

Weblog Use Index

Since actual share numbers are impossible to come by, I have combined the Google Link To and Contain URL numbers to come up with what I am calling the Weblog Use Index, an index of market influence based solely on Google results. Clearly a problem with this approach is that it weights hosted services more heavily, because each weblog created on a hosted service contains the URL of the service. Weblogs that use the standalone tools may not cite the tool used and therefore would not get counted with this method. However, when we look at the overall results, they seem to fit what we would expect in general. Blogger, Google’s free service, has the lion’s share, followed by Live Journal, the most active weblog online community.
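As a rough sketch of the arithmetic, the index can be computed by adding the two Google counts for a domain. The function name `use_index` is hypothetical, and expressing the total in rounded thousands is an assumption inferred from index values quoted in the discussion below (for example, 6,550 link to’s plus 29,300 domain mentions reported as a Use Index of 36). The counts are the ones quoted in this article; the 1.2 million figure for blogspot.com is itself approximate.

```python
# Sketch of the Weblog Use Index arithmetic. The function name and the
# rounding to thousands are assumptions, not a published formula.

def use_index(link_to: int, contains: int) -> int:
    """Add the two Google counts and express the total in thousands."""
    return round((link_to + contains) / 1000)

# (pages linking to the domain, pages containing the domain name)
google_counts = {
    "journals.aol.com": (42_300, 159_000),
    "blogspot.com": (291_000, 1_200_000),
    "typepad.com": (55_000, 724_000),
}

indices = {domain: use_index(*counts) for domain, counts in google_counts.items()}

# Print the domains from highest to lowest index.
for domain, index in sorted(indices.items(), key=lambda item: item[1], reverse=True):
    print(f"{domain}: {index}")
```

Because the index is a plain sum, a service whose pages all carry its own domain name will score high on the "contains" count alone, which is the weighting problem noted above.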

[Figure: weblog_tool_index2.gif]

[Figure: weblog_pie_chart2.gif]

This view of the blog market is certain to generate criticism. I am very open to suggestions for an improved methodology using publicly available information. I am surprised by Typepad’s importance on the Use Index given that it is a fee-based service. However, considering that I know of and read countless Typepad blogs and virtually no Live Journal or DiaryLand blogs, maybe I shouldn’t be so surprised.

The Emerging Business Market

As more businesses find valuable uses for weblog technology, there will be increasing demand for professionally oriented tools, hosted services, and professional support services. Six Apart and pMachine serve this market now with their Movable Type and Expression Engine applications, but they have barely made a dent given the size of the opportunity. Expect a whole new wave of products, services, and companies to be created over the next 12 to 18 months to cater specifically to the business market.

Blogging Tools

Here are some links to weblog tools mentioned in this article and a few others as well:

Hosted Blog Services
Free:
Blogger.com
Blogdrive
A Hosted Blog Platform Test Writeup

Fee:
Typepad

Hosted Blog Communities
Live Journal
AOL Journals
DiaryLand
Xanga
AlwaysOn Network

Blog Software
Fee:
Movable Type
Expression Engine
Radio Userland

Open source and free:
WordPress
b2evolution
Drupal
Greymatter
Textpattern
Blosxom
Nucleuscms
Roller Weblogger
Pivot

Blog Indices and Search Resources
Technorati
Feedster
Blogdex
Blogpulse
Blogwise
Blogstreet

Websites Focused on Blog Market Sizing
Blogcount
bsentinel
BlogCensus

For a continuation of this analysis, see Weblog Tools Market Update February 2005.

29 thoughts on “An Overview of the Weblog Tools Market”

  1. Definitely an interesting analysis, and I hope the inevitable critiques offer some suggestions for how to improve the methodology.

  2. Hi DK – Thanks for the heads up. Squarespace has 246 inbound links and 767 domain mentions on Google as of today, giving it a Use Index of 1. Let’s see if and how it takes off in the market. Given the various unheard-of blog tools I see listed on Google ads, I would suspect that there are dozens of players out there who are not showing up on the radar screen.

  3. Elise,

    A very thoughtful article. Some comments: What the definition of “blogging” really is these days can be hard to pin down. Some tools are flexible enough to be used as simple content-management systems (CMS) and aren’t really “blogs.” Also, Drupal, in my view, is more of a CMS than it is a blogging tool. However, these are just small arguments, really.

    One consideration of your method of obtaining numbers from Google is this: Many open-source programs do not, and in some instances, cannot force a return link on the pages of those who use the tools. The commercial blogging packages certainly can, and do, require that a link be on your site. A tool like WordPress, for example, is released under the GPL, so users only link back to wordpress.org if they wish. For this reason, I hesitate to accept Google’s numbers as being a true indication of use of any particular product.

    Thanks for an informative article.

  4. Regarding the stand-alone tools: is that really Expression Engine that’s being counted, or is it mainly pMachine (especially pMachineFree)?

    Also, I’m a bit surprised that b2evolution isn’t listed. Like WordPress, it’s an offshoot of the old b2/cafelog. Where WordPress seems to be emphasizing a simple single-weblog package that’s easy to get started with, b2evo emphasizes a more complete (and complex) feature set.

  5. Hi Craig – Yes it is indeed hard to get one’s hands around what constitutes blogging software and I suspect the problem will get more difficult as time goes on.

    Regarding the Google numbers and the GPL tools, I agree that they may not be fully reflected in the numbers. My experience to date, however, is that most bloggers, including those who use WordPress, seem very willing to let the world know which tool they are using. Where this will be increasingly difficult to track is in commercial deployments where there is more reason to not place a link to the blog tool.

    Doug – Thanks for the pointer to b2evolution. According to their Google listings, they have 6,550 link to’s and 29,300 domain mentions, giving it a Use Index of 36.

    Good point on Expression Engine. What is being counted is anything that links to pmachine.com or contains that domain name which would include Expression Engine and other pMachine products that would link to pMachine. This is most likely boosting the Use Index for Expression Engine beyond what it might deserve.

    To all – I think the bigger story here is the sheer dominance of Blogger and Live Journal, and to some extent Typepad, in the Google Index. These tools are all much, much easier for a normal person to use than Movable Type or the GPL tools. You don’t need to know how to write HTML tags or even how to use FTP to blog with these services. As blogging goes more mainstream I think we can expect this to be a more prominent trend with the consumer tools. The next horizon will be tools for commercial use, which I can assure you will not be dominated by Live Journal. :-)

    Update: I’ve added b2evolution to the list of tools.

  6. Here’s a good write-up on Blogcount of the factors that may be contributing to making some tools more Google friendly than others. Also lists some additional share questions for which I too would love to know the answers.

  7. Elise,

    This is a well written and thought out article, but I do have one thing I take issue with…

    One benefit to using a fee-based licensed product, not lost on the professional market, is that companies that charge for their products can offer better support and resources for product improvement

    This is not necessarily true, and in many ways totally false. There are numerous open source products that provide better support than their closed/proprietary competitors. MySQL, Apache, Red Hat, Perl, etc. all have extensive support offerings, free and fee-based, from internal developers, third parties, etc.

    I chose to use Movable Type in spite of its license, because I felt it was the best non-hosted solution now and for the foreseeable future. If there were another product on equal footing with MT, and it was GPL’d, I would have gone with the GPL software, not to save money, but because I have found better support from the OS community in general than from proprietary product support teams.

    Other than that … wow! Great article and stats!

    -=j=-

  8. Hi Elise!
    Like Ross said: great work! However, I’m a little disappointed not to see 20six on here. 20six is the only international *European* weblogging service, with 5 platforms in four countries:
    20six.de, links: 234,000 (contain)
    20six.co.uk, links: 357,000 (contain)
    20six.fr, links: 378,000 (contain)
    20six.nl, links: 242,000 (contain)
    myblog.de, links: 35,900 (contain)
    How would that rate in your use index and how would we measure up in terms of market share?
    Would look forward to updates, if you have any planned.
    Max Niederhofer

  9. May I point out that Nucleus has been around for some years? It may not have attracted a MT or WP like following, but it is a full-featured weblog tool, with an active community.

    Two quick Google queries reveal that 21,700 pages link to nucleuscms.org, while 32,300 contain “nucleus cms”.

  10. John Hoke – thank you for your comment. I agree that there are many open source products for which there are companies that offer excellent support. I should have been more precise in my language. I think the point I am trying to make is that for a company to succeed with a complex product in the enterprise space it needs to offer for-fee support services. Both MySQL and Red Hat do this quite well from what I understand. Companies that have a business model with either paid licenses or paid support may succeed where those who don’t won’t in the enterprise market.

    John Beimler – Rollerweblogger.org has 2,460 link to’s in Google and 16,200 domain contains, giving it a Use Index of 19. Thank you for pointing out this tool.

    Max – 20six is clearly a market leader in Europe. Thank you for bringing the company to my attention. It would be quite interesting to do a market analysis of the prominent European webtools; unfortunately, I do not have much visibility into that market right now. Perhaps someone reading this thread who understands the European market, and which tools and services are used there, could take a pass at this. If you want to see how 20six compares to the tools I have mentioned, add the “link to” and the “contains” results from a Google domain search on the domains you listed to come up with a total Use Index. By just adding up the contain numbers you have mentioned, it looks like 20six is right up there with Blogger and Live Journal. If, however, the pages in the service are all cross linked, that would mean some major double counting. Like I said, it takes someone more familiar than I to do a first pass at analysing the European market.

    Roel – nucleuscms.org has 21,700 link to’s and 12,300 contains, giving it a Use Index of 34. Thank you for bringing it to our attention.

  11. I don’t know what a “Use Index” is exactly. But if it’s related to how many times it’s linked on pages throughout the net, how would a netizen know if most of those links weren’t bashing xxx product?

    If I’m sorta close in my understanding of “Use Index”, then my question might make a difference: Did you get MT’s “Use Index” after that big announcement that miffed off many users? Ok MT’ers, I’m aware that most of that has been cleared up, so it’s not an MT bash as much as it is a question about linkage. I’m sure during that week there was a super big increase in MT linkage. But most of it was pretty bad linkage for that week or so, till things got straightened out.

    So how would an event like that affect a “Use Index”?

    I’ll stick to choosing my software, or services, by what I read from people who have actually tried it: signing up, installing, ease of use/additions/upgrades, reliability if it’s a service, whether there’s a forum, etc.

  12. Regarding hosted weblog services, it is actually possible to gather statistics on the numbers of hosted weblogs. Here is a list of the top 10 hosted weblog services compiled by monitoring blogrolling’s recently updated list and other similar pages:

    Top 10 hosted weblog services — # blogs posting in June 2004

    • livejournal.com — 1,001,508
    • xanga.com — 468,679
    • blogspot.com — 290,876
    • persianblog.com — 18,731
    • blogdrive.com — 18,363
    • 20six.fr — 8,050
    • diaryland.com — 6,682
    • 20six.co.uk — 6,348
    • bravejournal.com — 5,631
    • myblog.de — 5,010

    Notice that livejournal.com and xanga.com top the list. What’s interesting is that livejournal and xanga blogs are currently almost invisible to the rest of the blogosphere because most weblog search portals index very few of them. (Shameless plug: BlogPulse is an exception.)

  13. Hello Elise,
    I have started an analysis of your method and I would like to have some more specific and detailed information, if you are willing to share it:
    - two samples of the requests made to Google (one each for Link To and Contain URL; easy to reproduce, but a clear statement would be nice)
    - the formula used to calculate the Weblog Use Index, and maybe the rationale that led to its definition.

    Thank you.

  14. Sherri – The “Use Index” as I am defining it in this case is an index based on using Google search results as a proxy for use or popularity. It is an approach that clearly has shortcomings, some of which I’ve already discussed.

    Regarding the event that you mentioned, I suspect most of the people who wrote about MT were already MT users and already linking to MT, and those who weren’t MT users most likely were not spelling out the whole domain name in their rants. I would bet that in this case it made very little difference at all. This isn’t to say that a public event couldn’t throw the numbers off (imagine if you were the Hilton Hotel in Paris looking up your keywords in Google!).

    As a Mac user, I agree with you whole-heartedly regarding how one should choose one’s tools. This article is in no way meant to encourage people to choose tools based upon the tool’s popularity.

    Jason – Basecamp appears to be a project management solution, not specifically marketed as a blogging tool. I believe blogging will become a standard feature in many collaboration-oriented products and services.

    Elja – Pivot has practically no Google link to’s but does have 21,500 contains. Thanks for bringing it to our attention.

    Brad – As you mentioned, Pwain is not marketed as a blog tool.

    Natalie – LiveJournal and Xanga are showing up pretty loudly on the Google index. I don’t know about deriving market share numbers from blogrolling data. Isn’t that an opt-in service? That would really skew the numbers.

    Bumble – The numbers counted for Expression Engine are really for both Expression Engine and its sister pMachine, as I used pMachine.org to look up the Google numbers.

    Sue Boettcher – Webcrossing is corporate collaboration software and not specifically marketed as a blog tool.

    Oldcola – The methodology is easy. Search for any URL in Google and Google will prompt you with links to list the pages that “link to” your URL or “contain the domain” of your URL. This will give you numbers that reflect the extent to which that URL is getting picked up by Google. Add the two numbers together and you have the Use Index. Use the URL that would most likely appear on the page of someone using a particular tool – journals.aol.com for example, or movabletype.org.

    To all – thank you for your comments and for bringing new weblog tools to my attention. Please note that I may remove any comment that in my opinion does not add to the discussion. Also any comment that smells of spam will be deleted as soon as I can get to it.

  15. Oldcola – You are right. This methodology does not do a good job determining the share of “users” of blogging tools. However, it never attempts to do so. What it does do is give the “Google share” or the degree to which Google is picking up the URLs in their spider. This I am asserting gives some indication, though clearly not precise, and clearly with flaws, of the degree to which the tools are being “used”. What the results seem to point to is that more searchable pages are being created by certain tools than by others.

    As I have mentioned in the article, I am particularly interested in critiques that include suggestions for market mapping methodologies that use publicly available data to help all of us get a better picture of the blog tools market. Several suggestions have been made so far, and I refer you to some of the links mentioned in the previous comments and in the above listed Websites Focused On Blog Market Sizing.