Tell Me Why This Couldn’t Work

I found lots of interesting book offerings in the Routledge Asian Studies catalog I got in the mail today. Government and Politics in Taiwan is out in paperback, I’d love to learn a bit more about that. Oh, $43 seems a little much for a paperback. Legacies of the Asia-Pacific War looks interesting. Hmm, $125 seems a little unreasonable for a 240 pager, even if it is hardback and all. Ooh, Debating Culture in Interwar China, ah but, this 176 page book is $130. The Third Chinese Revolutionary Civil War, 1945-49 seems right down my alley, but $160 for that 224 page book is out of my range and is probably not where your average library would want to invest. But don’t worry, you can buy a Kindle version of the book for only $127 at Amazon! Hey, a four volume set on Imperial Japan and the World 1931-1945 looks fantastic, and looks to include a collection of influential historical essays on the topic. Oh, these four books will set you back $1295.

It is true Routledge is worse than many publishers, but this is beyond ridiculous. I’m fortunate enough to have access to Harvard libraries until I graduate (fingers crossed) next year, but the chances are very good that whatever libraries I can find nearby throughout the rest of my career are the kind who cringe at these prices. I don’t really blame the publishers, though. They are just trying to make a buck in a tough industry with books that have very low chances of selling more than a few copies here and there.

However, I do blame academia for making book publishing such a central part of career advancement. I really wish they would support a wider range of formats and a completely digital, open access, but peer-reviewed world of scholarly interaction, given the increased potential it offers for informed readers outside our small academic world to participate more actively in the process.

Perhaps my expectations are too high, but even if monograph-length publications of the traditional variety are here to stay, can someone tell me why we can’t do something like this:

1. Scholar gets an annual personal publication fund from department, its size based on multiple variables, including perhaps, evaluation of past publications, a department’s commitment to support research in a tough field that is poorly funded by grants and professional associations.

2. Scholar writes a manuscript (a book, an article, but also other multi-media or film projects etc. ought to be included).

3. Scholar submits manuscript to a professional association along with small administration fee for free distribution of work to readers (or viewers, etc.).

4. Professional association finds some qualified unpaid anonymous readers for the work to evaluate its quality and distributes copies to them (the way publishers do now).

5. Readers return an evaluation that concludes refuse, revise, or publish with some indication of what relative importance the work has in terms of its contribution to the field from their perspective.

6. If it passes peer review, the professional association gives the scholar back the evaluation reports, an official endorsement (which can be used to promote the work, once “published”), and if funding is available, makes an offer of some amount of money towards publication of the work, in relation to the relative importance of the work attributed to it by its readers, its own further evaluation, and its budget for the year.

7. If the work passes peer review and the money offered by the professional association is sufficient for publication, proceed to step (9). Otherwise,

8. If the offered money is insufficient for publication costs or the professional association refuses to endorse it, and the scholar does not wish to make up the difference from her/his personal publication fund, they then repeat steps (3) to (6) seeking help from other professional associations whose evaluation of quality will add to the prestige and funding of the work, or other funding sources (departmental, university, other institutions) until they get enough money in offers or they revise or abandon the research project.

Once the scholar has decided that they have enough support from professional associations, grants, further departmental support, or contribution from their annual personal publication fund they proceed with publication and spend their funds in the following manner:

9. (Optional) Pay lump sum to a publisher-consultant who handles the administrative tasks and payment in below steps (10) to (13) if the scholar doesn’t want to deal with it personally or through someone at their own institution hired specifically for this task. There is to be no transferral of copyright away from the scholar either way and this publisher-consultant does not have any role in determining whether or not something gets published. In this model the publisher is an administrator who has contacts for managing the below steps.

10. Pay for X hours of labor to hire an editor-consultant to help improve the language and writing of the manuscript beyond the quality of its academic content.

11. Pay for Y hours of labor to hire a designer-consultant to create the print and digital presentation for the work (for desktop/mobile web browsers and e-reader applications).

12. Pay $Z for the fees to have the metadata for the work permanently indexed and its files hosted in multiple online depositories, including important information on its peer-reviewed endorsements and positive/negative evaluation reports.

13. (If you really want to make a paper version) submit the print formatted version of the work to all the major online print-on-demand services where anyone can order a cheap paper copy, including both libraries and average readers.

Here are some of the strengths of a system like this:

-It leaves the copyright in the hands of the author, who will hopefully release the text with a Creative Commons license for maximum distribution and use.

-It imagines a new and powerful role for professional associations, or at least a transformation of traditional journal editorial boards/networks into more broadly defined associations who continue to have, among their primary duties, the evaluation of scholarly work in their field.

-It recognizes that publishing, even digital or print-on-demand works, can be a costly process involving many hours of labor beyond that of the author and the anonymous readers.

-It leaves peer review intact, but shifts it from publishers to professional associations which should themselves proliferate in number and each will naturally develop differing perceived standards of quality and funding sources. With the decline of traditional academic publishing, these organizations should receive funding from universities and outside grant institutions or at least provide them with recommendations of where their funding should go.

-It allows for multiple sources of funding both from professional associations that participate in the peer review process but also allows scholars to use their own annual publishing funds, and further grants from university or other institutions.

-Since personal or departmental funds may end up partly or completely funding the publication of works that were poorly evaluated in the peer-review process and couldn’t get financial support from sources based on their quality, it does little to stop bad research from getting published. It does, however, prevent such works from creating a burden on traditional publishers, who currently pass that cost on to the consumers of information. Publishers now play no part in the selection process and have no stake in the success of a publication: the publisher, editor, designer, and digital index/content hosts are all paid for their work regardless. Also, since such poor quality publications will not be able to promote themselves by showing that they have the endorsements of, and positive evaluations from, reputable professional associations, they will simply get cited less and can get filtered out in various ways during the source search process. However, even bad works or ones on extremely obscure topics can sometimes be useful, if only for a footnote or two that turns us on to a good source.

In this system what is the role for traditional academic publishing companies as they exist now?

None. Universities that support many of them should eventually dissolve them but support them long enough to allow a relatively smooth transition for their employees to find niches in the businesses that should grow from providing services in step (9) to step (12). Paper book printing should all be done through print-on-demand services as the print medium slowly declines. Marketing/promotion of the traditional kind will ideally become a minimal part of the equation as association endorsements and evaluations become the dominant stamp of quality and citation networking power comes to rule the day. Of course, you can add a “marketing” budget for promotion and advertising between steps (11) and (12) above if such funds are available but hopefully this will be seen as a practice resorted to mostly by those who failed to receive strong endorsement from professional associations. No one promotes our journal articles, why should we treat our academic books and other projects differently? If a work gets cited, read, and referenced, is that not enough to ensure its spread, especially if the works are openly available and thus offer no barrier to access?

Now, tell me why can’t this work? Why won’t something similar to this emerge from the ridiculous state of academic publishing today when it really wakes up? Let me know what you think.

Time to Walk the Walk

I am deeply frustrated with the sometimes closed atmosphere in academic life. I feel a profound discomfort when I encounter students and scholars who are paranoid that their research ideas will be stolen, that their sources will be discovered and, shock and horror, will be used by someone else. I’m simply incapable of sympathizing with them. I don’t like it when scholars pass around papers with bold warnings commanding me, “Do not circulate,” and I’m even less happy when I have been given handouts at a presentation only to have the speaker collect them again following the talk as if I was looking over instructor comments on a graded final exam. I feel my stomach churn as, to give a recent example, a professor opens up a database file of archival information and, smiling mischievously to the audience, declares that this is his “secret” source.

Such is life, people say to me, or else quote me some snotty French equivalent. That is the reality of this harsh academic world we live in. Well, perhaps I’m suffering from an early onset of old-age grumpiness, but I just don’t want to play that game. I don’t care that I’m still a graduate student, that job committees will look over everything they can find by me in search of sub-standard material, or that publishing firms will want me to explain why an earlier version of something I have submitted to them is available for download somewhere online. I don’t care if someone else finds some topic I have done some preliminary work on interesting, runs with it, and ends up publishing something on it. I may feel a momentary pang of regret that I didn’t get my own butt in gear and finish the project myself, but if they did a good job, then I really have no cause for complaint.

I’ve decided to just go ahead and start posting everything I produce academically, including short conference presentations and other research works in progress. You can find this material on a new research page here at Muninn.

Well Written History

The majority of the research is done. The sources have been found. The books and documents have been photographed or photocopied. Some of them have even been read.

I’ve got ideas. I’ve got outlines. I’ve got hundreds of pages of notes.

I have years of training in the destruction and dismissal of other people’s arguments. They call it grad school.

Now the time has come when I too must write – and not one of those research papers churned out in the day or two before the deadline arrives. I must write the dissertation. I am to write chapters that connect to each other in some logical fashion. Chapters. Even the word itself sounds like so many heavy links of metal to be hung around the necks of PhD students back from those green pastures they call “the field.”

I have seen them. They wander the campus with a pale look; the clank and rattle of their invisible burden almost audible as they walk. Nearby a third year history grad student might be seen skipping away, “I’m off to the archives!”

I forge my first link this fall. Getting a summer head start on my procrastination, this week I sat down to read a few books on the craft of writing, including a simple but handy book of “writing tools” aimed mostly at journalists and fiction writers. Reading through the short examples of good writing, I realized that I didn’t really know what good writing looked like in history.

Don’t get me wrong. In historiography classes, I have read plenty of “classic” works, from a full range of “schools” of historical inquiry and their most radical theoretical rivals. A year spent mostly reading in preparation for oral examinations brought me in close contact – “reading” wasn’t always the best description of what that contact consisted of – with hundreds of history books, but in all cases my eyes were trained on the content, not the form. The only times I really paid much attention to form were when some theoretically ambitious works were so frustratingly obtuse that one wondered how these historians who claim sensitivity to the subtleties of discourse could have nurtured such talent for linguistic slaughter.

I can think of plenty of works of history that took an approach I liked, had an argument that persuaded me, or simply benefited me in my own research. However, I am embarrassed to admit, I can’t name any history books that I thought were well written. That is to say, I have apparently paid so little attention to the writing of history at the level of phrase, sentence, and paragraph, and so much to the arguments and their support instead, that I now feel particularly naked as I go forward in my own writing.

Of course, I suspect good writing in history resembles good writing everywhere else. Surely many of the lessons of good writing taught in a journalism class, at a college writing center, or in Mrs. Gould’s seventh grade English class back in Aberdeen, Scotland are applicable to the writing of one’s history dissertation. I am also doubtlessly influenced by the rhetorical strategies and sentence structures of at least some of the hundreds of works that I have read in the past few years. Hopefully that influence is partly born of an intuitive recognition of quality. Even if that assumption is flawed, it is too late for me to revisit those blissful days of wide secondary source reading now. But if I get a chance to speak to incoming grad students in my last two years in the program, perhaps in the form of a wailing spirit in the night, I think I will advise them to pay closer attention to the language of historical works; to occasionally wield the eyeglass, and not merely the sword when they confront the works both in their own fields and the broader historiography.

Triage in the Archives

I’m working on my last batch of documents in the provincial archives in Shandong. There are two challenges to doing my historical research here which I often think about. The first is the problem of access to both the archive and much of its contents. I have been very fortunate but I regret that it is more of a result of good fortune than anything else. This posting will focus on the other problem, the need for a kind of triage in the archives and the constant awareness of my own personal limits as a reader. It is a humbling experience, and I suspect many, if not most, historians, come to face it if they have spent much time doing archival research, especially dealing with documents not in a language they speak and read natively.

Language and Detailed Local Knowledge

I enter the archives here with a topic in mind, a relatively good understanding of the regional and chronological context for my topic of study, and a working knowledge of the terminology often used in the kinds of documents I will be looking at, in part thanks to the existence of a published collection of documents from the same archive (山东革命历史档案资料选编). However, I have two major disadvantages that I feel very acutely every day I come to the archives. One relates to my language ability; the other to the limits of my local knowledge.

Though I can read Chinese, especially when it comes to the materials in my particular field of study, I have two huge linguistic disadvantages compared to any native speaker of Chinese (and, to a lesser degree, native speakers of Japanese): 1) I read Chinese much slower, and more importantly, skim Chinese slower, than native speakers. I still have to occasionally look up words that cannot either be understood by context or safely ignored due to probable irrelevancy. 2) I do not have a lifetime of practice reading handwritten documents using cursive or radically simplified Chinese characters, which compose over half of the materials I’m looking at. This means that some of the many handwritten documents I look at here, where I do not have permission to photocopy or take photographs of the materials I am looking at, are partially or in a few cases completely impossible for me to read.

The second major kind of disadvantage I have relates to the fact that, as one archivist here put it to me sympathetically, “This must be overwhelming, since you have only had time to study Chinese history for a year or two before you came.” This makes it seem like every Chinese historian has studied Chinese history for decades and is thus many years ahead in terms of knowledge of the specifics of Communist party anti-treason campaigns in Shandong province, which is simply not the case. However, all other things being equal, I must come to terms with an obvious fact that lies at the heart of what the archivist was trying to point out to me: It is physically impossible for me to have found time to read more than a subset of the Chinese language secondary works or document collections that are related to my field in the short time I have worked on my dissertation, let alone read, as some graduate students and scholars here undoubtedly have, the many other peripheral works that help one understand the context surrounding my topic. This is even more true since I am doing a transnational and comparative project that also incorporates Korea.

The only way people in my position can walk into the archive each day with some degree of self-respect is to convince ourselves that we have something unique to offer the study of our historical topic that gives us some kind of advantage relative to other scholars and students who might be working on a similar field here. Whatever this might be, our critical question, our comparative approach, our sensitivity to patterns etc. that might not be apparent to those working in other scholarly contexts, and so on, it gives us the confidence to go in and struggle through the historical materials and accept our weaknesses. In my case, I try to tell myself the contribution I can make is largely to be found in the way I “slice” the range of my inquiry and attempt to use that slice to answer particular questions. I remain open to the idea, however, that the “uniqueness of approach” claim may ultimately be an illusion, and as the quality of academic research here in China improves rapidly (I was really impressed with the breadth of reading and fresh approaches taken by some graduate students I have met here), some of the other advantages that foreign scholars coming to study might once have dared to claim are disappearing.

Even if one does avoid falling into complete despair, it remains an incredibly humbling experience to walk into the archive each day and be faced time and time again with one’s own all-so-apparent inadequacies. Below, let me share some aspects of that experience with some examples and the unfortunate but necessary steps I have to take in order to maximize the number of historical gemstones I can mine in the ocean of archival material available to me, despite my weaknesses.

A Proposal for a Powerful New Research Tool – Organizing Information for Dissertation Writing – Part 3 of 3

In the first and second postings on this topic, I described my approach to a lack of connections between my notes on my sources and my broader dissertation outline. I explained how I organized my material and how I’m trying to use my task management software as a way to create a link between the increasingly large number of note files and sections of note files on individual sources and the broader outline of the dissertation I will begin writing this year.

In this posting I will describe a kind of outlining software that could largely resolve the organizational problem I have described in my previous two postings without having to navigate between several applications. The features I have in mind could be easily added as a mode or layer to existing outlining software. I’m thinking here of OmniOutliner, which is what I use, but I think the kinds of modifications I am suggesting could be added to most other outlining software, or serve as the foundation of a new solution based on the organizing principles described here. The result, I hope, will be an environment which will allow researchers to adopt a smooth workflow that unites the highest level of a research outline with the tiniest fragments of notes on sources, or the sources themselves.

Organizing Information for Dissertation Writing – Part 2 of 3

In the first of three postings on this topic I explained that I have become increasingly concerned that there exists a vast and empty middle layer of organization between the various primary sources, notes, and ‘notes on notes’ I have on the one hand, and my dissertation outline on the other. I have felt the need to develop some way, while I’m still out here in the field conducting my research, of better tying up the many individual fragments of information I find in the sources with the arguments I want to make in the written dissertation.

I’d be very interested in hearing about how other graduate students have sought to resolve the problem of connecting the large quantity of notes, outlines, and unprocessed raw sources with the grand outline of a huge writing project like a dissertation. Below I describe briefly how I have essentially integrated this process into my own task management routine.

First, let me describe how I have been organizing the historical materials I have been collecting in the field and while back at university. Read on for the details.

Organizing Information for Dissertation Writing – Part 1 of 3

I’m coming into the home stretch of my two academic years of field work for my dissertation on treason and political retribution against accused collaborators with Japan in Korea and China from 1937-1951. I spent the first academic year in Korea, a summer in Taiwan, and I’ve just begun my last month of research in Jinan, China. I’ll try to wrap up some unfinished research in Korea and Taiwan this spring and then begin the actual writing of my dissertation this coming summer back in my hometown in Norway and while staying with family in the US. My goal is to wrap things up and hopefully complete my history PhD program by the spring of 2011.

I had always hoped I would have at least one chapter written up by the time I returned from the field, but at this I have failed. My primary excuse has been the fact that I have never had all the materials I have collected in various places in one place. In honesty, however, it is probably more due to the fact that I have never been able to combine the “research mode” and the “writing mode” into a single daily routine. I have deep admiration for graduate students and scholars who can do this effectively: spending their days at the archives and libraries, then shifting to chapter writing in the evenings. I haven’t even done what some professors have suggested: write a few disconnected pages here and there as you get enough material to weave a few tight threads. I confess cowardice, having not overcome the fear of composing such fragile and isolated pages.

Since I’m not, like those model students, immediately converting my daily discoveries into chunks of narrative and analysis, I am increasingly concerned about the fact that the hundreds of note files, outlines, and references to various archive images or PDFs themselves have become a considerable corpus that will require a nontrivial amount of processing and mining to reconstruct the argument and narrative of what will become my PhD dissertation.

To put it another way, I have two rich layers that form the foundation and roof of my research. The former is the dense web of primary source materials, notes taken from these source materials, and other timelines or “notes on notes” which organize some conceptually related materials. This is where the truffle hunter can happily prance about. The latter is the dissertation outline. This is an increasingly detailed macroscopic view of my planned chapters and arguments which has taken concrete form in a dozen different formats and lengths as it gets distributed as a dissertation prospectus, various fellowship application essays, emails to professors, and, in its most detailed form, a hierarchical outline document full of barely intelligible bullet points. This overarching top-down view is born of that creative destruction that is the clash between the starting assumptions that feed the “fire in my belly” which brought me to the study of history and my chosen topic, and my intuitive understanding of what my research in the sources permits me to argue in good faith as a historian. It is, of course, at exactly this point where many of the historiographical crises of our time find their point of entry but this is not the issue I wish to address in these postings.

While in the field, the gradual thickening of the web of notes and sources on the one hand and the increasingly detailed and structured outline on the other might suggest progress, but I can already feel the heavy weight of a void that lies between them. PhD students I have talked to who have returned from their research in the field give me the impression that the greatest frustrations that lie ahead for me are to be found in two areas. One is the challenge of writing itself, of synthesis and analysis on a scope never before attempted in our long career as students. The other, however, seems to be found in bridging the vast and dangerously incomplete “middle zone” between the above described layers: Exactly what evidence and what sources will be deployed for precisely which points we think we can persuasively make? Which book, newspaper or archival document was it that demonstrated this or that phenomenon? For every argument I wish to make, must I be reduced to searching through a large subset of my notes and notes on notes, which now number many hundred pages?

I’m very much open to the advice of graduate students and professors who have developed successful strategies for this but in my next two postings, I’ll share a strategy that I’m attempting now that I hope will help me overcome some of the worst of the middle zone nightmare I have described above. I don’t think it is very original, as I suspect many, if not most PhD students may have attempted or used something similar themselves. In fact, some may accuse me of describing the obvious common sense approach. If, however, it indeed is an effective approach – and this remains to be shown in the coming two years of writing I have ahead of me – then I wish it had been explained to me before I launched into my lonely existence as a student roaming the archives of East Asia.

In the next posting, I’ll explain how I’m using my task planning software (OmniFocus) as a bridge between my notes and my dissertation outline, creating a kind of index that links sections of my notes on specific sources to certain arguments I think I can and will make in my dissertation chapters. While what I’m doing doesn’t require any kind of specific software, this process has integrated relatively smoothly into my existing methods for organizing tasks on my Mac and my iPod Touch. The third posting will probably only be interesting to a more technical audience who are familiar with various specific software solutions. In that posting, I will suggest how, if my current experimental approach is sound, an even more ideal software-based organizational system might work, one which I have yet to find fully or satisfactorily implemented in any existing solution I have seen out there. I’m sure there will be dissenters who believe they have found the perfect solution for their needs, but I will attempt to articulate what I have found lacking in what is out there.
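For readers who think more easily in code, the kind of index described here can be sketched in a few lines of Python. This is only an illustration of the idea, not the OmniFocus setup itself: the class names, argument labels, and file names below are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class NoteRef:
    """A pointer to one section of one note file (names are hypothetical)."""
    note_file: str
    section: str


class ArgumentIndex:
    """Maps each argument in the outline to the note sections that support it."""

    def __init__(self):
        self._by_argument = defaultdict(list)

    def link(self, argument, note_file, section):
        # Record that this section of the notes is evidence for the argument.
        self._by_argument[argument].append(NoteRef(note_file, section))

    def evidence_for(self, argument):
        # Return a copy so callers cannot alter the index by accident.
        return list(self._by_argument.get(argument, []))

    def unsupported(self, outline):
        # Arguments in the outline that have no linked notes yet.
        return [a for a in outline if not self._by_argument.get(a)]


index = ArgumentIndex()
index.link("Ch. 2: argument A", "notes/source-one.txt", "pp. 10-14")
index.link("Ch. 2: argument A", "notes/source-two.txt", "section 3")
print(index.evidence_for("Ch. 2: argument A"))
print(index.unsupported(["Ch. 2: argument A", "Ch. 3: argument B"]))
```

The `unsupported` check is the part that speaks to the "middle zone" worry: it lists the points in the outline for which no evidence has been tied down yet.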

Endnote Takes A Shot at Zotero

The cold war between Endnote, the bibliographic software owned by Thomson Reuters that has long had a virtual monopoly on the academic market, and Zotero, the open source alternative created by the incredibly resourceful and innovative Center for History and New Media at George Mason University, has finally broken out into open conflict.

Endnote clearly saw its grip on the academic market coming to a swift end as a new generation of graduate students embrace the free and powerful Firefox browser-based alternative that has rapidly caught up to its rival in features. It responded with a huge gamble and an ancient weapon: the lawsuit. It has sued George Mason University for being in violation of its site license for Endnote. GMU has paid for a site license for the Endnote software, much like other universities (I can confirm, for example, Columbia and Harvard’s internal university software sites also provide its download for their university community) and the CHNM at GMU is listed as the creator of Zotero in the software’s about information. The Endnote site license is said to have explicitly forbidden the license holder from engaging in the “reverse engineering, de-compiling, translation, modification, distribution, broadcasting, dissemination, or creation of derivative works from the [EndNote] Software.”

Let’s look a bit closer at the players and the issues.

What is Endnote?

Endnote is a piece of software which allows researchers in any field to compile a list of bibliographic entries. This might mostly include lists of books or articles they have come across for use in their publications.

At its core, the software is simply a database client for research sources. However, it eventually developed three killer features that created a reluctant customer base out of virtually the entire academic world:

1) Z39.50 – In Endnote, the user doesn’t have to type in all their sources by hand. If, for example, they want to include a book which was found in the Library of Congress or any one of thousands of libraries which have an online database which supports something called the Z39.50 protocol they can use Endnote to directly import the info in question. Endnote ships with dozens of “.enz” connection files which allow it to connect to most of the important libraries in the United States and search their holdings for the source required. Endnote will then add the bibliographical information directly into the user’s own database. If you can’t find your library in the default list of connections, very often the Z39.50 .enz file can be downloaded directly from your favorite library’s homepage, usually hidden somewhere deep in the geekier sections of the website. The .enz files simply contain connection information, openly available through various library websites, that has been put into a special format readable to Endnote. Interestingly for this lawsuit, I don’t know of any case in which Endnote has sued libraries for distributing (which is a violation of the license) these .enz files which are, like .ens files (see below), a “component part” of the software.

2) Styles – Endnote provides the ability to convert one’s source entries into any bibliographical style, so that your footnotes, endnotes, and bibliographies can be easily formatted according to the many different styles used by various journals and publisher needs. These styles are created and openly available to anyone who consults the website of the given publication. In addition to providing the ability to create your own output style, Endnote has simply taken these publicly available style formats, many based on well known formats like the Chicago citation style (see instructions for citation styles for American Historical Review, for example, here), reduced them to their most basic components and created an “.ens” file which saves the formatting requirements in a digital format. If you have Endnote installed, you can see the huge list of style files available in your Endnote folder in the Styles sub-folder:

[Screenshot: the list of .ens style files in the Endnote Styles sub-folder]

If you open any of these files in a text editor you will get mostly gibberish, as the information is stored in a format readable only (until recently) by the Endnote software. However, if you open Endnote’s style manager and inspect, for example, the style for the American Historical Review, under Bibliography templates, you will see some of the kind of information stored by the .ens file. For example, under the book template you will see something like this:

Author. Title|. Translated by Translator|. Edited by Series Editor|. Edition ed|. Number of Volumes vols|. Vol. Volume|, Series Title|. City|: Publisher|, Year|.

Each of those words corresponds to a variable, or a kind of an empty box, into which Endnote will drop your bibliographical information, in accordance with what you have entered into the database with your sources. It is important to understand, for the purposes of this first battle of the E vs. Z war, that the styles themselves are not proprietary, but Endnote lawyers are arguing that the way they have translated these styles into a digital format, that is the “.ens” file, is protected by the Endnote license.
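
To make the “empty box” idea concrete, here is a minimal Python sketch of how a template like the one above could be filled in. It is only my guess at the general behavior, not Endnote’s actual code and certainly not the .ens format: I assume each vertical bar marks off a chunk that is dropped whenever a field it mentions is empty, and the `render` function, the `FIELDS` list, and the sample record are all made up for illustration.

```python
import re

# Field names that appear in the book template quoted above.
FIELDS = [
    "Author", "Title", "Translator", "Series Editor", "Edition",
    "Number of Volumes", "Volume", "Series Title", "City", "Publisher", "Year",
]

# Try longer names first so "Series Title" is not mistaken for "Title".
_FIELD_RE = re.compile(
    r"\b(" + "|".join(re.escape(f) for f in sorted(FIELDS, key=len, reverse=True)) + r")\b"
)

TEMPLATE = (
    "Author. Title|. Translated by Translator|. Edited by Series Editor|"
    ". Edition ed|. Number of Volumes vols|. Vol. Volume|, Series Title|"
    ". City|: Publisher|, Year|."
)


def render(template, record):
    """Fill a template from a dict of field values (assumed, simplified rule).

    The template is split on "|". A chunk is kept only if every field
    it names has a non-empty value in the record.
    """
    pieces = []
    for chunk in template.split("|"):
        names = _FIELD_RE.findall(chunk)
        if any(not record.get(name) for name in names):
            continue  # an empty box: drop the chunk and its punctuation
        pieces.append(_FIELD_RE.sub(lambda m: record[m.group(1)], chunk))
    return "".join(pieces)


# An invented record, not a real book.
record = {
    "Author": "Doe, Jane",
    "Title": "An Example Book",
    "City": "Boston",
    "Publisher": "Example Press",
    "Year": "2008",
}

print(render(TEMPLATE, record))
```

With the invented record above, `render` produces `Doe, Jane. An Example Book. Boston: Example Press, 2008.` The translator, edition, and volume chunks vanish because those boxes are empty.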

3) Word Integration – The final killer feature of Endnote is that the software can take your list of formatted footnotes, endnotes, or bibliography and directly interface with the most popular word processor out there: Microsoft Word. If a scholar is writing a paper in Word, they can prepare an Endnote document with all the sources they need for the publication, and directly in Word they can assign certain sources to certain footnotes or the bibliography using a Word plugin provided by Endnote. They can then, with a few clicks, format all of those footnotes, endnotes, and the bibliography to the style appropriate for whatever publication they are submitting the paper to.

For thousands of scholars this ability has saved hundreds of hours they might otherwise spend typing up their references and making sure they conform to the requirements of their publisher.

However, as a side note, this hasn’t been all good. I can share from my own experience and the experience of my friends some of the most problematic issues:

a) Garbage in – Garbage out: The library databases that most users of Endnote interface with don’t always have perfect information. Sometimes information is in the wrong place, lacks capitals where it needs them, or contains a lot of superfluous information that one doesn’t want to include in every footnote. Users must often spend a lot of time cleaning up imported information before having Endnote (or Zotero for that matter) do its magic. This is a problem of data integrity, not the fault of the software.

b) Endnote sucks. We used it because, until the rise of alternatives like RefWorks and Zotero, that is all there was. I’m sorry, but from the earliest version I started using years ago to the most recent one, Endnote seems to have thrived in an environment of safety and lack of competition. For many years Endnote could not deal with any sources that used non-Roman scripts, mangling Chinese, Japanese, and Korean sources such as those I need. To this day, I have encoding issues with Endnote that make it a pain to use. Endnote has a user interface that seems to have been designed by programmers who have never written a paper in their life, let alone studied user interface techniques. It is ugly, clunky, and unintuitive at every step. Finally, Endnote has long had serious stability and performance issues when it interfaces with Word. Though I haven’t personally had any major disasters, only minor hiccups caught early in the process, during my tech support days at Columbia University’s Faculty Desktop Support I had to deal with many panicking professors who showed me their book or article manuscript Word files with completely mangled footnotes. “All my references suddenly disappeared!” or “No matter what I click in Endnote, nothing converts or changes in my Word file anymore!” were two of the most common complaints I heard. Sometimes the tenuous connection between Endnote and Word just seems to break down, with disastrous consequences.

c) Endnote only works with Microsoft Word, at least as far as I know in the versions I have used. This created a vicious circle within academia. At FDS I watched more and more professors who loved their ancient alternatives to Word like WordPerfect and Notabene (I had never heard of this until I saw its grip on Classics and English departments), or who stubbornly resisted Microsoft’s power by using OpenOffice or Apple’s AppleWorks, give in and switch to Word, not only because .doc was the dominant format but sometimes because they watched with envy as others used the power of Endnote for large scale pieces.

The Rise of Zotero

Zotero will go down as one of the great open source legends. Unlike many other wonderful pieces of open source software, I believe Zotero is poised to completely topple its commercial rival, Endnote, and do so in record time. Zotero has and will continue to have other powerful competitors who eschew the browser-based approach or embed a browser into the software, but the rule of Endnote is nearing its end. I have played with Zotero since its buggy early beta days and watched it grow into the powerful alternative to Endnote that it is today. Developed by and for the browser generation, it took a radically different starting point: Endnote users started their bibliography creation process within the Endnote software, typing up sources or using Z39.50 connections to add them to their bibliography. Zotero users start on the net, because hey, guess what, we all do.

Zotero assumes we find the majority of our sources while, for example, using a library’s search engine, a list of books on Amazon.com, an article at JSTOR or other academic databases, or when reading a blog entry. Zotero has gradually added a huge list of “site translators” which scrape a web page and extract the useful bibliographical information from the page in question. There are plugins to add metadata readable by Zotero in popular blog engines like WordPress. Whether it is a library book entry or a bookstore listing, Zotero can instantly add information from hundreds of websites and databases available online by simply clicking an icon in the address bar. You can also instantly add bibliographic entries from any static web page, and save offline snapshots of these websites from the time you accessed them for future reference. This all meant that Zotero very quickly far overtook Endnote’s main killer feature #1. It was an instant feature smack-down.
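
For the curious, the basic idea behind scraping a page for bibliographic information can be sketched in a few lines of Python. This is not how Zotero’s site translators are actually written (as far as I know they are JavaScript and tailored to individual sites); it is a toy that assumes the page happens to expose Dublin Core `<meta>` tags such as `DC.title` and `DC.creator`, and the `MetaScraper` class, the `scrape` function, and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class MetaScraper(HTMLParser):
    """Collect Dublin Core <meta name="DC.xxx" content="..."> tags from a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name = attrs.get("name") or ""
        content = attrs.get("content")
        # Keep only tags whose name starts with "DC." and that carry a value.
        if name.lower().startswith("dc.") and content:
            key = name[3:].lower()
            # A field may repeat (several creators), so store lists.
            self.fields.setdefault(key, []).append(content)


def scrape(html):
    """Return a dict mapping Dublin Core field names to lists of values."""
    parser = MetaScraper()
    parser.feed(html)
    return parser.fields


# An invented page, standing in for a library catalog entry.
sample_page = """
<html><head>
  <meta name="DC.title" content="An Example Book">
  <meta name="DC.creator" content="Doe, Jane">
  <meta name="DC.creator" content="Roe, Richard">
  <meta name="viewport" content="width=device-width">
</head><body>Catalog entry</body></html>
"""

print(scrape(sample_page))
```

The real translators have a much harder job, since most sites do not label their data this politely, which is why each one has to be written against a particular site’s page layout.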

Because the project is free and open source, it quickly gained a huge following even when it lacked some of Endnote’s power. Those without access to a university site license were loath to dish out the ridiculous $300 for Endnote ($210 for an academic license) or face its steep learning curve and were willing to accept cheaper alternatives like Bookends (Mac, $100, $70 for students) or the increasingly powerful Sente (Mac, $130 or $90 for students). Zotero, of course, is completely free. Plugins and site translators for Zotero have spread fast as a result. It also offered powerful tagging capabilities and the easy organization of sources into folders, which is way ahead of the incredibly limited organizational possibilities of Endnote’s file-based bibliography system. The only major weakness in Zotero’s general approach is the fact it is wed to the Firefox browser so researchers may have to do their source hunting in something other than their favorite internet browser.

I think the most powerful attack on Endnote’s market came, however, when Zotero added support for Word, OpenOffice, and NeoOffice integration. Although I think the results have been somewhat mixed in the early stages (I haven’t tried in the newest release) this will eventually eliminate the advantage of Endnote’s killer feature #3.

All that remained before Endnote became an expensive 175MB waste of space on one’s hard drive was for Zotero to catch up with Endnote’s killer feature #2. Now Zotero’s 1.5 Sync Preview, which is available for download as a beta, includes (though this has been temporarily disabled, perhaps because of the lawsuit) the ability to export Zotero database entries using Endnote .ens style files. I’m not 100% sure how this works on a technical basis since I haven’t played with a functioning version including the feature, but the text of the Thomson Reuters lawsuit against GMU claims that Zotero now also provides a way for .ens files to be converted into the .csl style files that Zotero uses. I have seen some comments on blogs claiming that the new version of Zotero never provided this ability directly but merely provides a way to output bibliographic data via existing .ens files, should the user be in possession of such Endnote files. Either way, the developers of Zotero must have engaged in some kind of reverse engineering (which is where the lawsuit claims there is a license violation) of the gibberish we otherwise see in the .ens files in order to understand how Endnote has digitally represented the publicly available output styles. They are therefore now in possession of the ability to, for example, convert Zotero database data, through these .ens files, into a readable bibliographical entry or, if they wanted to, save such style formatting data into .csl files if that feature were ever included.

The War Was Over Before It Began

I think we have to await the official Zotero announcement regarding the lawsuit to help us determine the accuracy of the technical claims being made by Thomson Reuters. An entirely separate question, which has received the attention of various technology oriented law bloggers, is the strength of the legal attack itself and of its separate and bizarre claim that GMU is responsible for a misuse of Endnote’s trademark.

What isn’t in dispute, however, is the fact that Endnote should be very very scared. Whatever features are included in 1.5 or later versions, the developers of Zotero have clearly made sense of the .ens files and suddenly the thousands of output styles provided by Endnote might potentially become importable, exportable, or more likely, simply accessible and readable by the Zotero software. Once these publicly available style formats become digitally understood by Zotero’s database, by whatever means, Endnote loses its last and final advantage over Zotero. This will, in my mind, undoubtedly be followed by the slow death of Endnote, already begun, as new users see no advantage to using the flawed aging piece of software with its huge price tag.

The outcome of this lawsuit, even if it goes in favor of Endnote, cannot really do much to stop this trend. Zotero isn’t going to disappear. Even if, and I find this to be extremely unlikely, GMU were to take the radical step of completely shutting down its support for Zotero development, the user base is already huge. Other programmers will pick up where GMU’s team left off, with the code already in their hands. The reverse engineering of the .ens format, if it has been done successfully, can probably be explained in the space of a few paragraphs or represented by means of a few pages of code, perhaps encapsulated as a plugin that can be distributed separately from the Zotero software itself. The knowledge of a file format’s structure, once in the wild, can’t be put back in the proverbial bottle, a reality faced by dozens of software applications in the past and something we have seen with everything from Microsoft’s .doc to various proprietary image, sound, or movie file formats. Once the .ens output style files, which are all under 50k in size, can be interpreted, it is a simple matter, though of dubious legality, for scholars and students to email each other the dozen or so .ens files of the journals or institutions most important for their field, either in the original format or, if the feature is eventually made available, converted into .csl files.

I believe that, whatever the outcome of the lawsuit, Endnote’s owner has shot itself in the foot. Users like myself do not like to be locked into one solution, and when we see a free and open source alternative under attack, it is an easy matter for all of us to jump in and identify the “good guys” and the “bad guys,” to paraphrase one recent politician. Endnote is in an unenviable position. It saw Zotero’s latest move as the final straw in its attack on the Endnote user base and decided the legal move was its last chance to halt the bleeding by protecting one of the most important components of its legacy code: the .ens output styles. Strategically, they have made the wrong move, and I think all of us who agree should make our voices heard. It would have been far better for Endnote developers to at least attempt to out-innovate Zotero, something very hard to do when your opponent’s staff of supporting developers includes the wider community of open source developers along with solid university and foundation funding. Instead they have given Zotero a brilliant publicity moment.

Update: The official response by Zotero and GMU about the case. Nature magazine editorial on the issue.

Further Reading

Text of the Lawsuit (PDF)

Chronicle of Higher Education Wired Campus article on the Lawsuit
Outline on Disruptive Library Technology Jester
More Extracts and Discussion at Disruptive Library Technology Jester
Crooked Timber entry by Henry on the Lawsuit
James Grimmelmann Legal Commentary
More Legal Comments at Discourse.net
Mike Madison at Madisonian Offers a Legal Take
Mention and Comments at Slashdot

The Open Source CSL Format