Why We Need Multiple Standards

The Need for Multiple Document Standards

Balancing practical reality and potential benefit

Many factors are shaping the discussion around standardized document formats. It takes effort to sort through these issues, understand how they affect document formats, and decide what can be done about them. Some of these factors include:

  • Freedom to choose a format that suits the needs of the task at hand.
  • Document formats that can be easily exchanged by many applications and systems.
  • Freedom from dependence on specific applications, vendors, or platforms to exchange documents.
  • Maximum compatibility with existing documents.
  • Preserving documents for records management and archival purposes.
  • Document formats that support the breadth of language and assistive technology requirements.
  • Accounting for the incredible variety of software applications, usage, and functionality.
  • Protecting information stored in documents from unwanted usage.

These goals represent a strong desire for independence, choice, innovation, and freedom in software applications. They also reflect a strong desire among organizations to get more out of the software they already have and to do a better job of integrating their desktops with back-end systems. Many other factors could be considered, but this list alone presents a formidable challenge for organizations seeking to comprehend what standardized document formats mean for their computing ecosystems.

Many of these goals might also be in conflict. For example, does a document format suitable for archives need to protect its content from unwanted usage? Should it also support the real-time updates and information exchange required to integrate with other applications and systems? The many worthy goals for document format standards reflect the diversity of software usage, and it would seem that no single format can reasonably accommodate all of them.

Users in public and private sector agencies want to achieve the benefits of open, standardized file formats, but also want to preserve their ability to work with the content in their existing documents. As in any technology area, backward compatibility is a desirable feature, and Microsoft worked with others in the industry to design and document the Ecma Office Open XML standard ("Open XML") to achieve this goal.



Open Document Format (ODF)

In 2006, another document format standard was ratified by the International Organization for Standardization (ISO). Open Document Format (ODF) has its origins as the "Open Office XML Format," ratified by OASIS as a document format standard in 2005. Designed originally as a document format for OpenOffice.org, ODF was intended as a specification that enabled the exchange of documents between OpenOffice.org and other supporting applications. Interest in OpenOffice.org accelerated when the Commonwealth of Massachusetts declared that open file formats were to be used in the exchange of documents between government agencies. This policy raised a number of important questions for document standards, including the role of PDF, accessibility and assistive technology, and even the programs required to read and write ODF.

ODF is designed to represent functionality in OpenOffice.org products. Unlike Open XML, which is specifically designed to carry forward information in legacy files, ODF is not optimized to represent content in documents that have already been created; it was designed only to reflect the information created by one application. For example, the ODF technical committee within the OASIS standards body declared that a standardized markup language for spreadsheet formulas (such as "SUM" and "AVERAGE") was "outside of the scope" of its charter.

ODF is also supported in most major business productivity suites today. Microsoft Office users can download and install a free plug-in from the open source software community to convert documents to ODF. OpenOffice.org also uses ODF. Corel has announced that it will support ODF in 2007. Other productivity applications, such as KOffice and AbiWord, also support ODF.



Aren't ODF and Open XML designed to do the same thing?

The rhetoric surrounding the standardization of ODF and Open XML has generated debate over the merits of each format. Strong ODF proponents claim that Open XML and ODF are designed for the same purpose, and that only one format should exist. Open XML advocates, along with many mainstream users, hold that Open XML and ODF are designed for different purposes and exist to address different user needs, much like PDF, RTF, HTML, and the countless other text and document formats that exist.

Many arguments are made about the process by which ODF and Open XML were standardized. Some argue over minute technical details within the specifications; others argue about the terms under which the intellectual property in the file formats should be made available. In truth, the standardization processes for the Ecma Office Open XML Formats and Open Document Format share many similarities. Both standards underwent a lengthy technical review with participation from multiple companies and other interested parties. Both formats originated from software products: ODF originated from OpenOffice.org (and was originally named "Open Office XML Format"), and Open XML is a reflection of the proprietary .doc, .xls, and .ppt file formats of past Office versions.

The real differences between ODF, Open XML, and other formats are not in the politics and rhetoric. Compared with Open XML, the ODF specification is short and simple, but it is not optimized for representing the content in existing documents. The Open XML specification is optimized for the level of precision and detail required to carry forward billions of existing files, including a complete specification for spreadsheet formulas and many other features that are lacking in the ODF specification. Open XML also offers the unique capability of hosting custom-defined data languages within the document format. Organizations can use Open XML to report information from other applications and systems without having to translate it first. This capability is a key innovation for developers who want to incorporate real-time business information into their documents, or who want to "tag" documents with their own categorization system to improve their understanding of the documents' contents.
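As a rough illustration of what hosting custom data can look like at the file level, the following Python sketch lists the custom XML parts found inside an Open XML package. It assumes that the package is a ZIP archive and that custom data is stored in parts named customXml/item*.xml, which is the layout Microsoft Office typically produces. The function names, the toy package, and the urn:example:invoice vocabulary are hypothetical and serve only as an illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def list_custom_xml_parts(package_bytes):
    """Return {part_name: root_tag} for custom XML parts in a package.

    Assumption: custom data lives in customXml/item*.xml. The
    customXml/itemProps*.xml parts next to them hold metadata and are skipped.
    """
    parts = {}
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        for name in package.namelist():
            is_item = name.startswith("customXml/item")
            is_props = name.startswith("customXml/itemProps")
            if is_item and not is_props and name.endswith(".xml"):
                root = ET.fromstring(package.read(name))
                parts[name] = root.tag
    return parts


def build_demo_package():
    """Build a toy package in memory. It is NOT a complete, valid .docx file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("word/document.xml", "<document/>")
        # Hypothetical business vocabulary carried inside the document.
        package.writestr(
            "customXml/item1.xml",
            '<invoice xmlns="urn:example:invoice"><total>125.00</total></invoice>',
        )
        package.writestr("customXml/itemProps1.xml", "<props/>")
    return buffer.getvalue()
```

Calling list_custom_xml_parts(build_demo_package()) returns the name of each custom part together with its namespace-qualified root element, which is enough for a back-end system to decide whether it recognizes the embedded vocabulary.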

Support for ODF and Open XML, as well as PDF, RTF, and HTML, in productivity suites suggests an acknowledgement that customers need many file formats to accomplish the work they do. To support this variety of needs, many projects are under development to facilitate translation between Open XML, ODF, UOF, PDF, and other formats. Indeed, customers who want to use multiple formats are having their needs met by developers who design products to support multiple formats and by translators that exchange data between them.
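An application that supports several formats first has to work out which one it has been handed. The Python sketch below shows one plausible way to tell ODF and Open XML packages apart. It relies on two conventions: both formats are ZIP archives, an ODF package carries a "mimetype" entry beginning with application/vnd.oasis.opendocument., and an Open XML package carries a "[Content_Types].xml" entry. The function names are hypothetical, and real products will apply stricter checks than this.

```python
import io
import zipfile

ODF_MIMETYPE_PREFIX = "application/vnd.oasis.opendocument."


def detect_document_format(package_bytes):
    """Guess whether a byte string is an ODF package, an Open XML package, or neither."""
    try:
        with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
            names = set(package.namelist())
            # ODF convention: a plain-text "mimetype" entry names the document type.
            if "mimetype" in names:
                mimetype = package.read("mimetype").decode("ascii", "replace").strip()
                if mimetype.startswith(ODF_MIMETYPE_PREFIX):
                    return "ODF"
            # Open XML convention: the package declares its parts in [Content_Types].xml.
            if "[Content_Types].xml" in names:
                return "Open XML"
    except zipfile.BadZipFile:
        # Not a ZIP archive at all (for example a PDF or RTF file).
        pass
    return "unknown"


def make_package(files):
    """Build a toy ZIP package in memory from a {name: text} mapping (demo helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        for name, text in files.items():
            package.writestr(name, text)
    return buffer.getvalue()
```

A translator or multi-format application could use a check like this to route each incoming file to the appropriate reader before any conversion takes place.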



Can't we have just one document format?

When thinking about this question, it helps to compare it to a more familiar real-world example. Most governments need vehicles to carry out government business. Whether the need is for fire engines, ambulances, police cruisers, prisoner transport vehicles, mass transit vehicles, snow plows, or something else, the sheer variety of tasks and public needs requires a government to have the flexibility to use the right vehicle for the job. Similarly, open file formats mean many things to many people, and one document format cannot address the list of needs that arise in the multitude of scenarios in which documents are created and used. Just as an ambulance isn't the optimal choice to clean streets and a snow plow isn't useful to transport commuters to work, the reality of software usage today is that many file formats exist to satisfy the incredible diversity of needs in software applications.

Image file formats, editable document formats, fixed document formats, archive formats, spreadsheet formats, page layout formats, email formats, diagramming formats, and countless others exist to satisfy the needs of software usage. Some document formats are optimized to present a fixed, static representation of information so that it cannot be changed. Editable document formats are designed to maximize editability. Specific formats, such as spreadsheet or page layout formats, are designed to suit the specific needs of software applications and systems.

Imagine a common scenario involving PDF, Microsoft Office Excel, and Microsoft Project. All of these formats share information, and each may represent information from a specific project at a given point in time. But it makes little sense to merge them into a single format, because the data represented in each is intended for a different purpose. The PDF documents in this scenario present final versions of the information. Excel may be used to analyze data, which PDF is not suited to do. The Project file contains information about tasks and resources that is editable by the project owner; it is neither suitable for analysis nor appropriate for broad distribution in a final format. Combining these document formats would make little sense.

In fact, the very tenets of document format exchange, which include use in multiple applications and maximum compatibility with existing documents, demand the ability to choose those formats which best suit the task at hand. Legislating or mandating the use of a single document format is an arbitrary measure that doesn't reflect the reality of software usage today.



Examples of Overlapping Standards that Enhance Consumer Choice and Innovation

It is quite common to have standards (including multiple ISO/IEC standards) whose scopes overlap. Overlap is warranted, and fosters innovation, when the standards address distinct user requirements.

1. Digital media formats

  • Image Data. There are multiple standards for storing digital image data, e.g., CGM (an ISO/IEC standard), ASCII drawing interchange, DPX (an ANSI/SMPTE standard), GIF, JPEG (an ISO/IEC standard), and PNG (an ISO/IEC standard), to name just a few. Each of these formats addresses distinct but overlapping requirements for drawings, still images, scanned images, animations, graphic designs, etc.
  • Video. Many overlapping standards exist to encode and compress digital video, such as: MPEG-1 (an ISO/IEC standard), used for video CDs; MPEG-2 (an ISO/IEC standard), used for DVDs and Super-VCDs, as well as for digital television signals distributed by broadcasters, cable operators, and direct broadcast satellites; MPEG-4 (an ISO/IEC standard), good for online distribution of large videos; and H.264 (jointly developed by ISO/IEC and ITU-T), created to provide high video quality at substantially lower bit rates than previous standards. There are likewise a large number of overlapping digital interface standards used to transfer digital video at high speed, including FireWire (an IEEE standard), HDMI, SDI (an ITU-R and SMPTE standard), DVI, UDI, DisplayPort (a VESA standard), and USB.

2. Existing document formats

  • We have today (and will continue to need) multiple overlapping document format standards to meet the needs of various users, and several of them are existing ISO/IEC standards, including HTML, ODF, and PDF/A. Indeed, the JTC1 Directives themselves include a list of the different types of standard formats that may be used with JTC1 documents distributed for different purposes (see JTC1 Directives, 5th Edition, Version 2.0, Annex H). For example, the JTC1 policy references six different formats (HTML, TXT, DOC, PDF, WP, and RTF) and ranks them from "highly recommended" to "not recommended" for different purposes, such as for use in standards, web browsing, or in complex documents. Several of the formats are ranked as "highly recommended" or "possible" for the same document use, underscoring the value of multiple document formats that address the same need.

3. Digital TV formats

  • In 1996, when the FCC adopted the ATSC DTV Standard, it declined to mandate a specific supported video format based on the conclusion that it would "result in greater choice and diversity of equipment, allow computer equipment and software firms more opportunity to compete by promoting interoperability, and result in greater consumer benefits by allowing an increase in the availability of new products and services." Further, the FCC noted its preference for "allowing consumers to choose which formats are most important to them," which would hasten the adoption of digital broadcasting.
  • In allowing transmissions using interlace or progressive scan, in 480, 720, or 1080 lines of resolution, and in a 16-by-9 or other aspect ratio, the FCC sought to "foster competition among those aspects of the technology where we are least able to predict the outcome, choosing instead to rely upon the market and consumer demand." It also concluded that "allow[ing] video formats to be tested and decided by the market avoid[s] the risk of a mistaken government intervention in the market."

4. Wireless standards

  • Wi-Fi (based on the IEEE 802.11 family of wireless standards) and Bluetooth (standardized by IEEE as 802.15.1) were once commonly believed to be in direct competition with, and mutually exclusive of, one another. In time, however, Wi-Fi and Bluetooth were properly understood as largely targeting different market segments: the former, with greater range, best served home and office networking needs; the latter, with much more limited range, became the better choice for hand-held devices and other small consumer electronics. Still other overlapping standards are those adopted by the Infrared Data Association (IrDA), whose standards are for the short-range exchange of data over infrared light, for uses such as personal area networks (PANs).


Key to Interoperability: Ecma Office Open XML-ODF Translator

  • Microsoft funded the development of the Open XML-ODF Translator project as an open source project on SourceForge.net. The Open XML-ODF Translator is available for anyone to download and use at no cost.
  • On January 31, 2007, the Open XML-ODF Translator project announced the availability of a translator for word processing documents. Available as a plug-in for Microsoft Word XP, 2003, and 2007, the Translator enables document conversion between the Ecma Office Open XML File Formats ("Open XML") and Open Document Format ("ODF") text formats. When plugged into Microsoft Office Word, for example, the Translator provides customers with the choice of opening and saving documents in ODF rather than the native Open XML format.
  • Developers of word processing programs that use ODF as the default format may also integrate this Translator into their products and enable users to open and save documents in Open XML.
  • Novell has already made the Translator available with its version of OpenOffice. This enables OpenOffice users to open and save Open XML documents on the Windows and Linux platforms. Other organizations also have translators under development.
  • The second phase of the translator project, slated for release in fall 2007, will convert spreadsheet and presentation documents between the Open XML and ODF spreadsheet and presentation formats.

Other Key Benefits of the Open XML-ODF Translator

  • The Translator addresses the needs of customers such as governments that must support multiple formats. The Open XML-ODF Translator will enable these customers to achieve interoperability between the two file formats and also allow them to use a wider range of applications.
  • By enabling conversion of documents from one file format to the other, this free technology not only enhances interoperability, but also brings greater choice and flexibility to the market for document creation, management, and archiving.
  • The success of the Translator (more than 150,000 downloads to date) demonstrates how proprietary and open source organizations can work together to meet the needs of customers and how Open XML and ODF can co-exist as open standards in actual products to provide more choice to customers and developers.
  • Translation technology promises to enhance choice and accessibility options for technology users, including people with disabilities.

The Development Process of the Open XML-ODF Translator

  • The Microsoft-funded open source Open XML-ODF Translator project is being developed by Clever Age of France and Sonata Software Ltd. of India, and tested by Dialogika of Germany and India-based Aztecsoft Ltd. The project will continue to be open source software on SourceForge.net, and freely available to all customers for development or use.
  • The open source software community has shown strong interest in the Translator project. Since the project was launched, it has become one of the 30 most active projects on SourceForge.net, which hosts more than 100,000 open source projects.
