(Free) Read the New Office 2007 XML File Format (.docx)
Working with the new Office 2007 for Windows documents under Mac OS X can be a challenge until Microsoft finally gets around to releasing the free converters it promised. Office 2007 for Windows now saves documents by default in a new format dubbed "Open XML." These files are actually ZIP packages that contain various XML files as well as the images and other data. For instance, a .docx document (created by Word 2007) contains a directory named word that holds the XML documents with the actual text. For an Excel 2007 workbook (.xlsx), the equivalent directory is xl; for PowerPoint (.pptx), it is ppt. Office 2007 users can choose "Save As" to write the old standard formats (.doc, .xls, .ppt) instead. But many users will never realize this, and you may end up receiving a new .docx file that you cannot open.
This will also be a problem for Windows users who have not upgraded yet and for people using OpenOffice.org, as they cannot open these files yet either. OpenOffice.org plans a release in January that supports the new file format. Until these upgrades and converters arrive, we will have to deal with this incompatibility with the new Office 2007.
Here are some solutions that will help you read these files in the meantime:
docx-converter.com is a web site that can translate a new Microsoft Word 2007 .docx file into a simple HTML file. According to its creators, the tool "strips out some of the formatting, but now supports bold, italic, and underlined text. Left, right, center, and justified alignment. Unicode characters, and more!" This is a great interim solution with the key advantage of retaining some formatting, but the site might buckle under heavy load. An Automator script that does the same thing is available at www.jfalcon.org.
Manual Method - BBEdit and TextMate both have a "strip all tags" function you can use on the Word XML file. To get at the XML file, though, you first need to change the .docx extension to .zip, then expand that archive in the Finder. Open the resulting folder, go into the word folder, and open the document.xml file in BBEdit or TextMate, then use the app's strip tags function to pull out the text.
I feel confident that some enterprising programmer will soon release a desktop conversion tool that might beat Microsoft's promised solution. If you come across one, drop me a line.
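In the meantime, the manual method described above can be roughly automated with a few lines of Python. This is only a minimal sketch, not a real converter: it assumes the text lives in word/document.xml, as described earlier, and that the file uses the WordprocessingML namespace shipped with the final Office 2007 release. The function name docx_to_text is made up for this illustration. Rather than blindly stripping every tag, it collects the text one paragraph at a time so that paragraphs do not run together; all formatting is discarded.

```python
import zipfile
from xml.etree import ElementTree

# Assumption: the WordprocessingML namespace used by final Office 2007 files.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_to_text(path):
    """Return the plain text of a .docx file, one line per paragraph.

    Hypothetical helper for illustration only; it ignores formatting,
    images, headers, footers, and footnotes.
    """
    # A .docx file is a ZIP package, so no renaming to .zip is needed here.
    with zipfile.ZipFile(path) as package:
        xml_bytes = package.read("word/document.xml")

    root = ElementTree.fromstring(xml_bytes)
    paragraphs = []
    # <w:p> elements are paragraphs; <w:t> elements hold the visible text.
    for para in root.iter("{" + W_NS + "}p"):
        pieces = [node.text or "" for node in para.iter("{" + W_NS + "}t")]
        paragraphs.append("".join(pieces))
    return "\n".join(paragraphs)
```

To try it, call print(docx_to_text("report.docx")), where report.docx stands in for whatever file you received.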