Beyond My Mind

March 22, 2007

Deciphering Microsoft Office 2007 Bibliography Format

Filed under: Research — mahbub @ 12:56 am

I am about to write a module for JabRef, an open-source bibliography manager, to export bibliographic information to Microsoft Office 2007.

Some references that might be helpful:

  1. How to use Office 2007 bibliographic tool
  2. OpenXML Developer
  3. Blog of Brian Jones, the person behind the Office 2007 open XML
  4. ECMA Open XML Standard Elaborated Schemas (all documents)
  5. MSDN article showing how to work with Bibliography (updated March 23, 2007)

But after searching for a day, I could not find a single web page describing the exact or near exact format for bibliographic information in Microsoft Office 2007. So I started digging in myself.

I started by adding some sources in the Microsoft Office bibliography editor. The very first thing I noticed is that if you add some references and don’t use them in the document, they are not going to be saved. If you use one or more of them in your document, all of them will be saved in “C:\Documents and Settings\<USER>\Application Data\Microsoft\Bibliography\Sources.xml”. I opened the XML file and here’s what I got (figure 1).

Figure 1: Microsoft Office 2007 Bibliographic Database Format

Content of Microsoft Office 2007 Bibliographic Source XML
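To make the file layout a little more concrete, here is a minimal Python sketch that builds the path of the master list and lists the sources in a bibliography XML string. The sample record is made up for illustration, and the element names (Sources, Source, Tag, SourceType, Title, Year) are assumptions taken from the Office Open XML bibliography schema rather than a transcription of figure 1; `master_list_path` and `list_sources` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET
from pathlib import PureWindowsPath

# Namespace used by Office 2007 bibliography files (quoted later in this post).
NS = {"b": "http://schemas.openxmlformats.org/officeDocument/2006/bibliography"}

# Illustrative sample only: not the actual content of figure 1.
SAMPLE = """<b:Sources
    xmlns:b="http://schemas.openxmlformats.org/officeDocument/2006/bibliography">
  <b:Source>
    <b:Tag>Doe07</b:Tag>
    <b:SourceType>Book</b:SourceType>
    <b:Guid>{F3BEFB3B-FC0D-47AB-970A-F4003FF99F9F}</b:Guid>
    <b:LCID>1033</b:LCID>
    <b:Title>An Example Book</b:Title>
    <b:Year>2007</b:Year>
  </b:Source>
</b:Sources>"""


def master_list_path(appdata):
    """Build the Sources.xml path below a given Application Data folder."""
    return PureWindowsPath(appdata) / "Microsoft" / "Bibliography" / "Sources.xml"


def list_sources(xml_text):
    """Return (tag, title) pairs for every Source element in the document."""
    root = ET.fromstring(xml_text)
    return [
        (s.findtext("b:Tag", namespaces=NS), s.findtext("b:Title", namespaces=NS))
        for s in root.findall("b:Source", NS)
    ]


print(master_list_path(r"C:\Documents and Settings\me\Application Data"))
print(list_sources(SAMPLE))
```

On a real machine the Application Data folder would normally come from the APPDATA environment variable instead of a hard-coded string.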

Obviously I had only one bibliographic source in “Sources.xml”. I was almost certain that Office would import a copy of this file without any problem. Indeed, a copy of the file with the information altered and the GUID and LCID elements deleted worked as an imported bibliography. But wait, where are my previous bibliographic sources?
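The "copy with GUID and LCID deleted" experiment can also be done programmatically. The following is a rough sketch, assuming the elements are called Guid and LCID and sit directly under each Source element (as in the Office Open XML bibliography schema); `strip_guid_and_lcid` and the EXAMPLE record are hypothetical and only for illustration.

```python
import xml.etree.ElementTree as ET

NS_URI = "http://schemas.openxmlformats.org/officeDocument/2006/bibliography"

# Made-up record used only to demonstrate the function.
EXAMPLE = (
    '<b:Sources xmlns:b="' + NS_URI + '"><b:Source>'
    "<b:Tag>Doe07</b:Tag>"
    "<b:Guid>{F3BEFB3B-FC0D-47AB-970A-F4003FF99F9F}</b:Guid>"
    "<b:LCID>1033</b:LCID>"
    "<b:Title>An Example Book</b:Title>"
    "</b:Source></b:Sources>"
)


def strip_guid_and_lcid(xml_text):
    """Return the document with the optional Guid and LCID elements removed."""
    root = ET.fromstring(xml_text)
    for source in root.iter(f"{{{NS_URI}}}Source"):
        for name in ("Guid", "LCID"):
            # findall returns a list, so removing while looping is safe.
            for child in source.findall(f"{{{NS_URI}}}{name}"):
                source.remove(child)
    # Keep the familiar "b:" prefix when writing the result back out.
    ET.register_namespace("b", NS_URI)
    return ET.tostring(root, encoding="unicode")


cleaned = strip_guid_and_lcid(EXAMPLE)
print(cleaned)
```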

So I tried to discover what happened and found that Office does NOT really import a bibliography into “Sources.xml”; it only lets you work on the currently opened XML file. All the bibliographic sources in the currently opened bibliographic XML file are displayed in the ‘master list’. You have to “copy” them into your ‘current list’ to work with them. If you want to merge information from an external XML file into your “C:\Documents and Settings\<USER>\Application Data\Microsoft\Bibliography\Sources.xml”, you have to open the external XML file, copy the information into your ‘current list’, open “Sources.xml” again and then copy the sources back into the ‘master list’, which now points to “Sources.xml”.
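The same merge could in principle be done outside Word by editing the XML directly. This is only a sketch under assumptions: it treats the Tag element as a unique key for each source (my assumption, not something Word documents here), it works on XML strings rather than on the real file, and `merge_sources` is a hypothetical helper. Back up “Sources.xml” before trying anything like this on the real master list.

```python
import xml.etree.ElementTree as ET

NS_URI = "http://schemas.openxmlformats.org/officeDocument/2006/bibliography"


def q(name):
    """Qualify an element name with the bibliography namespace."""
    return f"{{{NS_URI}}}{name}"


def merge_sources(master_xml, external_xml):
    """Append sources from external_xml whose Tag is not yet in master_xml."""
    master = ET.fromstring(master_xml)
    external = ET.fromstring(external_xml)
    seen = {s.findtext(q("Tag")) for s in master.findall(q("Source"))}
    for source in external.findall(q("Source")):
        tag = source.findtext(q("Tag"))
        if tag not in seen:  # assumption: equal Tag means same source
            master.append(source)
            seen.add(tag)
    ET.register_namespace("b", NS_URI)
    return ET.tostring(master, encoding="unicode")


def _doc(*tags):
    """Build a tiny made-up bibliography document for the demo below."""
    body = "".join(f"<b:Source><b:Tag>{t}</b:Tag></b:Source>" for t in tags)
    return f'<b:Sources xmlns:b="{NS_URI}">{body}</b:Sources>'


MASTER = _doc("Doe07")
EXTERNAL = _doc("Doe07", "Roe05")
merged = merge_sources(MASTER, EXTERNAL)
print(merged)
```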

I wanted to find out the least possible information required for the XML file to be recognized as a valid bibliographic source by Office 2007. The bare minimum is:

<sources xmlns="http://schemas.openxmlformats.org/officeDocument/2006/bibliography"/>

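If you want to add information to this bare-minimum XML, don’t use the “b:” prefix on the element names: the namespace is declared here as the default namespace (plain xmlns, not xmlns:b), so no “b” prefix is bound.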
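As an illustration of writing such a prefix-free file, here is a small Python sketch. It registers the bibliography namespace as the default namespace so that no “b:” prefix appears in the output. Note the assumptions: it uses the capitalised element names Sources and Source from the Office Open XML bibliography schema, the child element names are whatever keys you pass in, and `build_sources` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

NS_URI = "http://schemas.openxmlformats.org/officeDocument/2006/bibliography"

# An empty prefix makes ElementTree serialise NS_URI as the default namespace.
ET.register_namespace("", NS_URI)


def build_sources(entries):
    """Serialise a list of {element name: text} dicts as a bibliography file."""
    root = ET.Element(f"{{{NS_URI}}}Sources")
    for entry in entries:
        source = ET.SubElement(root, f"{{{NS_URI}}}Source")
        for name, value in entry.items():
            ET.SubElement(source, f"{{{NS_URI}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


# Made-up entry, for illustration only.
xml_out = build_sources(
    [{"Tag": "Doe07", "SourceType": "Book", "Title": "An Example Book"}]
)
print(xml_out)
```

Calling `build_sources([])` gives just the empty root element with its namespace declaration, which corresponds to the bare minimum shown above.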

From MSDN (update):

The Guid and LCID elements are optional, but you can provide values for them if you want. The Guid element value should be a valid GUID, which you can generate programmatically outside the Word object model. (See the Microsoft Visual Studio documentation or the Microsoft Windows documentation on MSDN for information about programmatically generating ID.) Word generates GUIDs when users add or edit a source. If you do not add a GUID to the XML and a user then edits a source, Word generates a GUID. This enables Word to determine which source is most recent, based on the value of the GUID, and to prompt whether the user wants Word to update the outdated source to maintain continuity between the master list and the current list.

The LCID specifies the language for the source. (See MSDN for valid language identification values.) Word uses the LCID to know how to display a cited source in a document’s bibliography. For example, one source may be written in French, one in English, and one in Japanese. From the LCID, Word determines how to display names (for example, Last, First for English), what punctuation to use (for example, using comma in one language and a semicolon in another), and what strings to use (for example, whether to use “et al” or another localized form).
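For an export module, both optional values are easy to produce. The sketch below is an assumption-laden illustration: it formats the GUID in braces and upper case because that is how Word-generated GUIDs look (for example {F3BEFB3B-FC0D-47AB-970A-F4003FF99F9F}), and it maps just three locale names to their standard Windows LCIDs. `new_source_guid`, `lcid_for` and the LCIDS table are hypothetical names, not part of any Word API.

```python
import uuid

# Standard Windows locale identifiers for three example languages.
LCIDS = {"en-US": 1033, "fr-FR": 1036, "ja-JP": 1041}


def new_source_guid():
    """Return a random GUID formatted like the ones Word writes."""
    return "{" + str(uuid.uuid4()).upper() + "}"


def lcid_for(locale):
    """Return the LCID for a locale name, or None to leave the element out."""
    return LCIDS.get(locale)


print(new_source_guid())
print(lcid_for("fr-FR"))
```

Returning None for an unknown locale fits the MSDN note that LCID is optional: the exporter can simply skip the element.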

Now that I have deciphered how bibliographic information must be presented in XML for Office 2007 to recognize it as a bibliographic source, I can list all the bits and pieces that can go inside it. Please follow my next post on it.


14 Comments »

  1. I posted this already under another post of your blog but it may be useful here as well: Bibutils has added initial support for conversion of bibliographic formats to Word 2007 bibliographic XML format, which may be useful in figuring out your own export module (or might be even used instead of writing your own converter).

    http://www.scripps.edu/~cdputnam/software/bibutils/

    Comment by Matthias — March 26, 2007 @ 5:27 am | Reply

  2. […] GUID: Global ID. This enables Word to determine which source is most recent, based on the value of the GUID, and to prompt whether the user wants Word to update the outdated source to maintain continuity between the master list and the current list. Example: {F3BEFB3B-FC0D-47AB-970A-F4003FF99F9F} (more) […]

    Pingback by Details of Microsoft Office 2007 Bibliographic Format Compared to BibTex « Beyond My Mind — March 30, 2007 @ 2:09 am | Reply

  3. Thanks Matt. I will take a look at your source.

    Comment by mahbub — March 30, 2007 @ 2:17 am | Reply

  4. Is there a possibility to include the chapter number in the bibliographic numbering scheme?
    E.g. [1.1] Author, “Title of work”, etc… resp. the reference [1.1] in the text.

    Another question… probably wrong to place it here, but anyhow: I have the German version of Office 2007. How can I make sure that Words uses the translated word “Bibliography” in the title of the bibliography instead of the German “Literaturverzeichnis”?

    Thanks!

    Andrej

    Comment by Andrej — August 18, 2008 @ 1:26 pm | Reply

  5. Is this the plugin that’s now shipped with JabRef 2.4? If so, could you post a usage HOWTO?

    Thanks,

    Scott

    Comment by scotto — September 18, 2008 @ 5:13 pm | Reply

  6. Is there any way to export Microsoft word sources.xml file as a bibtex file?!

    Comment by masoud moshref — January 6, 2009 @ 5:45 am | Reply

  7. I am looking for a way to convert a really large (200+ pages) list of references that are currently in a regular word file into the “Source” file to use with Word reference system. Is there a way to do this (other than do the manual copy/paste into the source builder)? I am at my wit’s end… Thank you!

    Comment by paperbased — July 6, 2009 @ 12:26 pm | Reply

  8. […] Deciphering Microsoft Office 2007 Bibliography Format « Beyond My Mind (tags: openoffice documentnumerique xml metadonnees bibliographie reference citation office microsoft standards microformats windows) […]

    Pingback by PabloG » Blog Archive » links for 2010-03-05 — March 5, 2010 @ 6:01 pm | Reply

  9. I’m using MS 2007 and was really enthusiastic about the bibliography tool since I do a lot of writing and must cite my sources.

    I do find it somewhat cumbersome and work-intensive and, if I were to want to add information or edit it, I would like to have the ability to go, say, to Access, open a form and do my thing.

    Can this be done with the MS Word entries or are they bound to MS Word indefinitely? BTW where is the master database for bibliographical entries?

    Thanks very much!

    Harold

    Comment by Harold Vadney — March 18, 2010 @ 11:24 am | Reply

  10. Hello….we just rebuilt a machine with vista OS to windows 7…and now my wife tells me about all these references that she had in a library that would come up when she was doing her masters thesis. Can someone please tell me the name of that file or extension so I can go look on the HD external back up I did of her machine. Thanks.

    Comment by John Knecht — August 11, 2010 @ 6:36 pm | Reply

  11. i use both open office and microsoft office and i would say that microsoft office is more responsive and user friendly “,`

    Comment by Chinese Girls — November 17, 2010 @ 1:46 pm | Reply

  12. Could you explain the difference between JabRef modules and the JabRef export filters? I thought that export filters were the adequate tool for exporting any BibTeX database to any other format?
    After downloading the JabRef source code you can look at the .layout files in \src\resource\layout for an example of export filters.

    Comment by Matthis — January 20, 2011 @ 7:34 am | Reply

  13. Hi guys,
    I have a straightforward question.
    After creating my database in JabRef and exporting it as an MS Office 2007 (*.xml) file, I tried to import the file into Excel (via Developer-XML ‘tools’ etc.) but I had the following issue:
    – having 376 items in JabRef, after importing into Excel there were more than 1000…
    – actually, for an item with multiple authors (let’s say 4), 4 rows were created in Excel, one for each author, and the other cells (title, year etc.) were copy/pasted.

    Does anybody know how to manage this issue?
    Thanks in advance,
    Kind regards
    ZB

    Comment by XML, XSD — September 23, 2011 @ 7:24 am | Reply

