The cold war between Endnote, the bibliographic software owned by Thomson Reuters that has long had a virtual monopoly on the academic market, and Zotero, the open source alternative created by the incredibly resourceful and innovative Center for History and New Media at George Mason University, has finally broken out into open conflict.
Endnote clearly saw its grip on the academic market coming to a swift end as a new generation of graduate students embraces the free and powerful Firefox browser-based alternative that has rapidly caught up to its rival in features. It responded with a huge gamble and an ancient weapon: the lawsuit. It has sued George Mason University for violating its site license for Endnote. GMU has paid for a site license for the Endnote software, much like other universities (I can confirm, for example, that Columbia’s and Harvard’s internal university software sites also offer it for download to their university communities), and the CHNM at GMU is listed as the creator of Zotero in the software’s about information. The Endnote site license is said to have explicitly forbidden the license holder from engaging in the “reverse engineering, de-compiling, translation, modification, distribution, broadcasting, dissemination, or creation of derivative works from the [EndNote] Software.”
Let’s look a bit closer at the players and the issues.
What is Endnote?
Endnote is a piece of software which allows researchers in any field to compile a list of bibliographic entries. This might mostly include lists of books or articles they have come across for use in their publications.
At its core, the software is simply a database client for research sources. However, it eventually developed three killer features that created a reluctant customer base out of virtually the entire academic world:
1) Z39.50 – In Endnote, the user doesn’t have to type in all their sources by hand. If, for example, they want to include a book which was found in the Library of Congress or any one of thousands of libraries which have an online database which supports something called the Z39.50 protocol they can use Endnote to directly import the info in question. Endnote ships with dozens of “.enz” connection files which allow it to connect to most of the important libraries in the United States and search their holdings for the source required. Endnote will then add the bibliographical information directly into the user’s own database. If you can’t find your library in the default list of connections, very often the Z39.50 .enz file can be downloaded directly from your favorite library’s homepage, usually hidden somewhere deep in the geekier sections of the website. The .enz files simply contain connection information, openly available through various library websites, that has been put into a special format readable to Endnote. Interestingly for this lawsuit, I don’t know of any case in which Endnote has sued libraries for distributing (which is a violation of the license) these .enz files which are, like .ens files (see below), a “component part” of the software.
2) Styles – Endnote provides the ability to convert one’s source entries into any bibliographical style, so that your footnotes, endnotes, and bibliographies can be easily formatted according to the many different styles required by various journals and publishers. These styles are created by the publications themselves and are openly available to anyone who consults the website of the given publication. In addition to providing the ability to create your own output style, Endnote has simply taken these publicly available style formats, many based on well known formats like the Chicago citation style (see instructions for citation styles for American Historical Review, for example, here), reduced them to their most basic components and created an “.ens” file which saves the formatting requirements in a digital format. If you have Endnote installed, you can see the huge list of style files available in the Styles sub-folder of your Endnote folder.
If you open any of these files in a text editor you will get mostly gibberish, as the information is stored in a format readable only (until recently) by the Endnote software. However, if you open Endnote’s style manager and inspect, for example, the style for the American Historical Review, under Bibliography templates, you will see some of the kind of information stored by the .ens file. For example, under book template you will see something like this:
Author. Title|. Translated by Translator|. Edited by Series Editor|. Edition ed|. Number of Volumes vols|. Vol. Volume|, Series Title|. City|: Publisher|, Year|.
Each of those words corresponds to a variable, or a kind of an empty box, into which Endnote will drop your bibliographical information, in accordance with what you have entered into the database with your sources. It is important to understand, for the purposes of this first battle of the E vs. Z war, that the styles themselves are not proprietary, but Endnote lawyers are arguing that the way they have translated these styles into a digital format, that is the “.ens” file, is protected by the Endnote license.
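To make the “empty box” idea concrete, here is a minimal sketch in Python of how a template like the one above could be filled in. It is not Endnote’s actual code, and it assumes simplified semantics: the “|” character is treated as a boundary between segments, and a segment is dropped whenever every field it names is empty. The field list and the sample record are made up for illustration.

```python
import re

# The book template quoted above, split across lines only for readability.
TEMPLATE = (
    "Author. Title|. Translated by Translator|. Edited by Series Editor|"
    ". Edition ed|. Number of Volumes vols|. Vol. Volume|, Series Title|"
    ". City|: Publisher|, Year|."
)

# Field names that appear in the template (an assumption for this sketch).
FIELD_NAMES = [
    "Author", "Title", "Translator", "Series Editor", "Edition",
    "Number of Volumes", "Volume", "Series Title", "City", "Publisher", "Year",
]

# Longest names first, so "Series Title" is matched before "Title".
_FIELD_RE = re.compile(
    r"\b("
    + "|".join(re.escape(n) for n in sorted(FIELD_NAMES, key=len, reverse=True))
    + r")\b"
)


def format_entry(record, template=TEMPLATE):
    """Fill the template from a dict of field name -> value."""
    parts = []
    for segment in template.split("|"):
        names = _FIELD_RE.findall(segment)
        # Drop a segment when it names fields but none of them has a value.
        if names and not any(record.get(n) for n in names):
            continue
        parts.append(_FIELD_RE.sub(lambda m: record.get(m.group(1), ""), segment))
    return "".join(parts)


# A placeholder record, not a real book.
book = {
    "Author": "Doe, Jane",
    "Title": "An Example Book",
    "City": "Springfield",
    "Publisher": "Example Press",
    "Year": "2008",
}

print(format_entry(book))
# Doe, Jane. An Example Book. Springfield: Example Press, 2008.
```

The point of the sketch is that the style itself is nothing more than punctuation and ordering rules. The only thing specific to Endnote is how those rules are packed into the .ens file.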
3) Word Integration – The final killer feature of Endnote is that the software can take your list of formatted footnotes, endnotes, or bibliography and directly interface with the most popular word processor out there: Microsoft Word. If a scholar is writing a paper in Word, they can prepare an Endnote document with all the sources they need for the publication, and directly in word they could assign certain sources to certain footnotes or the bibliography using a Word plugin provided by Endnote. They can then, with a few clicks, format all of those footnotes, endnotes, and the bibliography to the style appropriate for whatever publication they are submitting the paper to.
For thousands of scholars this ability has saved hundreds of hours they might otherwise have spent typing up their references and making sure they conform to the requirements of their publisher.
However, as a side note, this hasn’t been all good. I can share from my own experience and the experience of my friends some of the most problematic issues:
a) Garbage in – Garbage out: The library databases that most users of Endnote interface with don’t always have perfect information. Sometimes information is in the wrong place, lacks capitals where it needs them, or contains a lot of superfluous information that one doesn’t want to include in every footnote. Users must often spend a lot of time cleaning up imported information before having Endnote (or Zotero for that matter) do its magic. This is a problem of data integrity, not the fault of the software.
b) Endnote sucks. We used it because, until the rise of alternatives like RefWorks and Zotero, that is all there was. I’m sorry, but from the earliest version I started using years ago to the most recent version, Endnote seems to have thrived in an environment of safety and lack of competition. For many years Endnote could not deal with any sources that used non-Roman scripts, mangling the Chinese, Japanese, and Korean sources I need. To this day, I have encoding issues with Endnote that make it a pain to use. Endnote has a user interface that seems to have been designed by programmers who have never written a paper in their lives, let alone studied user interface techniques. It is ugly, clunky, and unintuitive at every step. Finally, Endnote has long had serious stability and performance issues when it interfaces with Word. Though I haven’t personally had any major disasters, only minor hiccups caught early in the process, during my tech support days at Columbia University’s Faculty Desktop Support I had to deal with many panicking professors who showed me their book or article manuscript Word files with completely mangled footnotes. “All my references suddenly disappeared!” and “No matter what I click in Endnote, nothing converts or changes in my Word file anymore!” were two of the most common complaints I heard. Sometimes the tenuous connection between Endnote and Word just seems to break down, with disastrous consequences.
c) Endnote only works with Microsoft Word. At least as far as I know in the versions I have used. This created a vicious circle within academia. At FDS I watched more and more professors who loved their ancient alternatives to Word like WordPerfect and Notabene (I had never heard of this until I saw its grip on Classics and English departments), or who stubbornly resisted Microsoft’s power by using OpenOffice or Apple’s AppleWorks having to switch to Word not only because .doc was the dominant format but sometimes because they watched with envy as others used the power of Endnote for large scale pieces.
The Rise of Zotero
Zotero will go down as one of the great open source legends. Unlike many other wonderful pieces of open source software, I believe Zotero is poised to completely topple its commercial rival, Endnote, and do so in record time. Zotero has and will continue to have other powerful competitors who eschew the browser-based approach or embed a browser into the software, but the rule of Endnote is nearing its end. I have played with Zotero since its buggy early beta days and watched it grow into the powerful alternative to Endnote that it is today. Developed by and for the browser generation, it took a radically different starting point. Endnote users started their bibliography creation process within the Endnote software, typing up sources or using Z39.50 connections to add them to their bibliography. Zotero users start on the net, because hey, guess what, we all do.
Zotero assumes we find the majority of our sources while, for example, using a library’s search engine, a list of books on Amazon.com, an article at JSTOR or other academic databases, or when reading a blog entry. Zotero has gradually added a huge list of “site translators” which scrape a web page and extract the useful bibliographical information from the page in question. There are plugins to add metadata readable by Zotero in popular blog engines like WordPress. Whether it is a library book entry or a bookstore listing, Zotero can instantly add information from hundreds of websites and databases available online by simply clicking an icon in the address bar. You can also instantly add bibliographic entries from any static web page, and save offline snapshots of these websites from the time you accessed them for future reference. This all meant that Zotero very quickly far overtook Endnote’s main killer feature #1. It was an instant feature smack-down.
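For a sense of what “metadata readable by Zotero” can look like, here is a minimal Python sketch that pulls bibliographic fields out of a page. It is not Zotero’s translator code. It assumes the page uses the COinS convention, in which an empty span with the class Z3988 carries OpenURL key/value pairs in its title attribute. The sample page and its contents are invented for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs


class CoinsParser(HTMLParser):
    """Collect the key/value pairs from every COinS span on a page."""

    def __init__(self):
        super().__init__()
        self.records = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "span" and "Z3988" in classes and attrs.get("title"):
            # The title attribute is a URL-encoded query string.
            fields = parse_qs(attrs["title"])
            # Keep the first value of each key to keep the sketch simple.
            self.records.append({key: values[0] for key, values in fields.items()})


def extract_coins(html):
    """Return one dict of metadata per COinS span found in the HTML."""
    parser = CoinsParser()
    parser.feed(html)
    return parser.records


# A made-up blog post fragment with one embedded citation.
SAMPLE_PAGE = """
<p>A post that mentions a book.</p>
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.btitle=An+Example+Book&amp;rft.au=Jane+Doe&amp;rft.date=2008"></span>
"""

for record in extract_coins(SAMPLE_PAGE):
    print(record["rft.au"], "|", record["rft.btitle"], "|", record["rft.date"])
```

Sites that don’t embed anything like this need a custom scraper written for their particular page layout, which is presumably why the list of site translators has had to grow one site at a time.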
Because the project is free and open source, it quickly gained a huge following even when it lacked some of Endnote’s power. Those without access to a university site license were loath to dish out the ridiculous $300 for Endnote ($210 for an academic license) or face its steep learning curve and were willing to accept cheaper alternatives like Bookends (Mac, $100, $70 for students) or the increasingly powerful Sente (Mac, $130 or $90 for students). Zotero, of course, is completely free. Plugins and site translators for Zotero have spread fast as a result. It also offered powerful tagging capabilities and the easy organization of sources into folders, which is way ahead of the incredibly limited organizational possibilities of Endnote’s file-based bibliography system. The only major weakness in Zotero’s general approach is the fact it is wed to the Firefox browser so researchers may have to do their source hunting in something other than their favorite internet browser.
I think the most powerful attack on Endnote’s market came, however, when Zotero added support for Word, OpenOffice, and NeoOffice integration. Although I think the results have been somewhat mixed in the early stages (I haven’t tried in the newest release) this will eventually eliminate the advantage of Endnote’s killer feature #3.
All that remained before Endnote became an expensive 175MB waste of space on one’s hard drive was for Zotero to catch up with Endnote’s killer feature #2. Now Zotero’s 1.5 Sync Preview, which is available for download as a beta, includes (though this has been temporarily disabled, perhaps because of the lawsuit) the ability to export Zotero database entries using Endnote .ens style files. I’m not 100% sure how this works on a technical level, since I haven’t played with a functioning version that includes the feature, but the text of the Thomson Reuters lawsuit against GMU claims that Zotero now also provides a way for .ens files to be converted into the .csl style files that Zotero uses. I have seen some comments on blogs claiming that the new version of Zotero never provided this ability directly, but merely provides a way to output bibliographic data via existing .ens files should the user be in possession of such Endnote files. Either way, the developers of Zotero must have engaged in some kind of reverse engineering (which is where the lawsuit claims there is a license violation) of the gibberish we otherwise see in the .ens files in order to understand how Endnote has digitally represented the publicly available output styles. They are therefore now in possession of the ability to, for example, convert Zotero database data, through these .ens files, into a readable bibliographical entry or, if they wanted to, save such style formatting data into .csl files if that feature were ever included.
The War Was Over Before It Began
I think we have to await the official Zotero announcement regarding the lawsuit to help us determine the accuracy of the technical claims being made by Thomson Reuters. An entirely separate question, which has received the attention of various technology-oriented law bloggers, is the strength of the legal attack itself and of its separate and bizarre claim that GMU is responsible for a misuse of Endnote’s trademark.
What isn’t in dispute, however, is the fact that Endnote should be very, very scared. Whatever features are included in 1.5 or later versions, the developers of Zotero have clearly made sense of the .ens files, and suddenly the thousands of output styles provided by Endnote might become importable, exportable, or more likely, simply accessible and readable by the Zotero software. Once these publicly available style formats become digitally understood by Zotero’s database, by whatever means, Endnote loses its last advantage over Zotero. This will, in my mind, undoubtedly be followed by the slow death of Endnote, already begun, as new users see no advantage to using the flawed, aging piece of software with its huge price tag.
The outcome of this lawsuit, even if it goes in favor of Endnote, cannot really do much to stop this trend. Zotero isn’t going to disappear. Even if, and I find this to be extremely unlikely, GMU were to take the radical step of completely shutting down its support for Zotero development, the user base is already huge. Other programmers will pick up where GMU’s team left off, with the code already in their hands. The reverse engineering of the .ens format, if it has been done successfully, can probably be explained in the space of a few paragraphs or represented by means of a few pages of code, perhaps encapsulated as a plugin that can be distributed separately from the Zotero software itself. The knowledge of a file format’s structure, once in the wild, can’t be put back in the proverbial bottle, a reality faced by dozens of software applications in the past and something we have seen with everything from Microsoft’s .doc to various proprietary image, sound, or movie file formats. Once the .ens output style files, which are all under 50k in size, can be interpreted, it is a simple matter, though of dubious legality, for scholars and students to email each other the dozen or so .ens files of the journals or institutions most important for their field, either in the original format or, if the feature is eventually made available, converted into .csl files.
I believe that, whatever the outcome of the lawsuit, Endnote’s owner has shot itself in the foot. Users like myself do not like to be locked into one solution, and when we see a free and open source alternative under attack, it is an easy matter for all of us to jump in and identify the “good guys” and the “bad guys,” to paraphrase one recent politician. Endnote is in an unenviable position. It saw Zotero’s latest move as the final straw in its attack on the Endnote user base and decided the legal move was its last chance to halt the bleeding by protecting one of the most important components of its legacy code: the .ens output styles. Strategically, they have made the wrong move, and I think all of us who agree should make our voices heard. It would have been far better for Endnote’s developers to at least attempt to out-innovate Zotero, something very hard to do when your opponent’s staff of supporting developers includes the wider community of open source developers along with solid university and foundation funding. Instead they have given Zotero a brilliant publicity moment.
Update: The official response by Zotero and GMU about the case. Nature magazine editorial on the issue.
Further Reading
Text of the Lawsuit (PDF)
Chronicle of Higher Education Wired Campus article on the Lawsuit
Outline on Disruptive Library Technology Jester
More Extracts and Discussion at Disruptive Library Technology Jester
Crooked Timber entry by Henry on the Lawsuit
James Grimmelmann Legal Commentary
More Legal Comments at Discourse.net
Mike Madison at Madisonian Offers a Legal Take
Mention and Comments at Slashdot
The Open Source CSL Format