The docx file format after 10 years

If you are designing or architecting a mission-critical, document-intensive system, then a key question is “what document format should I use?”

If you need to edit the documents, then PDF is out.

Docx is the obvious choice, and after 10 years in the market, it’s interesting to review the events that led to its dominance.

Microsoft Office itself has reigned for 20 years.  Or is it 30 years?

There is room for debate about when its reign started.  Word for Windows was introduced in 1989, and it steadily gained market share until, by 1997, it had 90% of the US market.

It has seen off various threats along the way, and it is Microsoft’s responses to these that have shaped the Office we know today.

The four threats are (or rather, were):

  • the web (take 1)
  • OpenOffice
  • Google Docs
  • the iPhone-led and web-based (take 2) shift away from Wintel PCs

Microsoft’s response to these threats has been so effective that:

  • Office is how Fortune 1000 knowledge workers have edited documents and spreadsheets for 20 years, and although this may be changing, the change is very slow
  • if you open a Word document in anything else, it should “look like it does in Word”
  • Office has become a platform, with all the usual characteristics of an ecosystem.  Or rather, the Office file formats have become a platform.
  • All sorts of systems generate documents from templates: from contracts to reports to invoices to HR documents. So much so that more Word documents are probably created by programs than by people.

After reading the rest of this post, you’ll see why we here at Native Documents have “bet the farm” on a high-fidelity, Word-compatible editor designed for modern HTML5 browsers.

The Web (Take 1)

In the late ’90s, Microsoft (specifically, Bill Gates) was worried that HTML as a document format might take over from Office.

I don’t want to focus on this too much here, but I mention it for completeness.

Suffice to say that we now live in a world where HTML5 is ubiquitous, and used as the basis of an ever-increasing range of new webapps.  It is no exaggeration to say that HTML is the key to all of the apps we enjoy today, in our web browsers, and often on our mobile devices and even desktops.

That said, the Office document formats have remained firmly entrenched in the areas they were designed for:

  • complex business documents, that are often shared and modified by multiple authors, often across organizational boundaries
  • documents generated in business processes, pulling in data from corporate systems (eg SAP and Oracle databases)
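
To make the second point concrete, here is a minimal sketch in Python of the kind of template merge such a business process performs.  Everything in it is an assumption for illustration: the `{customer}` style placeholders, the `TEMPLATE_XML` fragment and the `fill_template` helper are hypothetical, and the record is hard-coded rather than fetched from a real corporate system.  Real Word templates are messier (Word often splits a placeholder across several runs), so treat this as the idea rather than a recipe.

```python
import re
from xml.sax.saxutils import escape

# The WordprocessingML namespace used by the main part of a docx file.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical template: a stripped-down main document part with
# {name} placeholders sitting inside text runs.
TEMPLATE_XML = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    '<w:p><w:r><w:t>Invoice for {customer}</w:t></w:r></w:p>'
    '<w:p><w:r><w:t>Amount due: {amount}</w:t></w:r></w:p>'
    '</w:body></w:document>'
)


def fill_template(template_xml, record):
    """Replace each {key} placeholder with the XML-escaped value from record.

    Raises KeyError if the template names a field the record lacks.
    """
    def substitute(match):
        return escape(str(record[match.group(1)]))

    # A single pass, so text inside a substituted value is never re-scanned.
    return re.sub(r"\{(\w+)\}", substitute, template_xml)


# Stand-in for a row pulled from a corporate system.
record = {"customer": "Smith & Sons", "amount": "1,200.00"}
filled_xml = fill_template(TEMPLATE_XML, record)
```

Escaping matters here: a customer name containing “&” or “<” would otherwise produce XML that Word refuses to open.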

OpenOffice

Let me start by saying OpenOffice is important to this story for personal reasons.  Two of our founders (Gary and Jason) met at the first Open Office XML Format technical committee meeting in Dec 2002.  And Florian was the engineer at Sun and Novell responsible for interop with Word.

OpenOffice is important because it forced Microsoft to open its formats, and move to XML.  It is thanks to OpenOffice that the docx file format is an ISO standard, and open to anyone with unzip and XML editing technology.
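
To show what “unzip and XML” means in practice, here is a minimal sketch using only the Python standard library.  It assumes nothing beyond two well-known facts about docx: the file is a zip archive, and the body text lives in the part named word/document.xml.  The sample package it builds is a stripped-down stand-in (a file Word will open also needs parts such as [Content_Types].xml), and the helper names `build_sample_package` and `paragraph_texts` are ours, for illustration only.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# The WordprocessingML namespace used by word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

SAMPLE_DOCUMENT_XML = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    '<w:p><w:r><w:t>Hello</w:t></w:r></w:p>'
    '<w:p><w:r><w:t>docx is </w:t></w:r><w:r><w:t>zipped XML</w:t></w:r></w:p>'
    '</w:body></w:document>'
)


def build_sample_package():
    """Return the bytes of a zip archive holding only the main document part."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("word/document.xml", SAMPLE_DOCUMENT_XML)
    return buffer.getvalue()


def paragraph_texts(docx_bytes):
    """Unzip the package and return the plain text of each paragraph."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    ns = {"w": W_NS}
    # A paragraph (w:p) holds runs (w:r), whose text sits in w:t elements.
    return [
        "".join(t.text or "" for t in p.iterfind(".//w:t", ns))
        for p in root.iterfind(".//w:p", ns)
    ]


print(paragraph_texts(build_sample_package()))
```

The same `paragraph_texts` function should work on the bytes of a real docx file, although it ignores tabs, line breaks, tables and everything else that makes fidelity hard.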

It is also important because it reinforced to Florian that visual fidelity is important, and must be baked in from the start.  It was this that led eventually to us starting Native Documents, Inc. to solve the problem of WYSIWYG for Word documents in web browsers.

Suffice to say that with the introduction of docx in Word 2007, Microsoft saw off the OpenOffice threat, and changed everything.  With the opening up of the docx file format, the momentum behind all attempts to establish a real alternative quite simply disappeared.

The transition to docx was a great success, and by re-energizing the Office ecosystem with a new-found openness, Microsoft made the dominance of Office and its file formats even stronger.

Google Docs

At the same time, another threat came from out of nowhere.  Google Docs.

The history is well known.  Back in 2005, Sam Schillace developed a new webapp called Writely, exploiting then-new Ajax technology and the “content editable” function in browsers.  Google bought the company Upstartle, and soon, Google Docs was born.

Google Docs introduced collaborative, real-time co-editing, and allowed anyone with a web browser to edit.  No need to have Windows or install anything.  And it was free.

It made Microsoft sit up and take notice, and led eventually to Word Online.

Google Docs has never been able to crack Microsoft’s dominance in enterprise accounts, and with Microsoft’s pivot to the cloud under Nadella (especially hosted Exchange), its hopes of doing so have long since faded.

Still, the Google Docs story underscores how advances in web technology make new things possible. A decade later, it is advances like HTML5, React, and WebAssembly that have made our Word file editor possible. More on this in later posts.

So long, Wintel...

It’s not that nobody uses Windows PCs anymore, but rather, that they also use Apple (iPhones, iPads, desktops) or Android or Chrome or whatever else might be handy and running a web browser.  Linux?  I’m writing this post using KDE Neon.

And wisely, Microsoft has freed Office to run on these other platforms.  It has always worked (mostly) on Mac OS X, but these days it also runs natively on iOS and Android.  And for platforms which aren’t specifically supported (eg Linux), you can use Word Online.

By embracing a “Word everywhere” strategy, Microsoft has done what is necessary to help ensure the long term dominance of Office as a platform.

Where does this leave us?

It leaves us in a world where Microsoft’s Office file formats are dominant, stronger than ever, and therefore a given, wherever people across organizations need to work on a “serious” business document.

It leaves us in a world where developers continue to build enterprise systems which manipulate documents in the Office Open XML formats.  It is easy and safe to do so.  And practical alternatives (eg XML-based single-source publishing) remain niche solutions.

It leaves us in a world where docx, HTML and PDF are the three main document formats, none of which is going away, but only one of which is suitable for editable business documents.

However, working with Word documents in a browser is not so easy, particularly if you want to integrate this seamlessly into your application so that your users can view, review or modify a document.  Our next post explores this critical challenge.