Muninn » Academia
But I fear more for Muninn...

Tell Me Why This Couldn’t Work
Sun, 02 May 2010

I found lots of interesting book offerings in the Routledge Asian Studies catalog I got in the mail today. Government and Politics in Taiwan is out in paperback, I’d love to learn a bit more about that. Oh, $43 seems a little much for a paperback. Legacies of the Asia-Pacific War looks interesting. Hmm, $125 seems a little unreasonable for a 240 pager, even if it is hardback and all. Ooh, Debating Culture in Interwar China, ah but, this 176 page book is $130. The Third Chinese Revolutionary Civil War, 1945-49 seems right down my alley, but $160 for that 224 page book is out of my range and is probably not where your average library would want to invest. But don’t worry, you can buy a Kindle version of the book for only $127 at Amazon! Hey, a four volume set on Imperial Japan and the World 1931-1945 looks fantastic, and looks to include a collection of influential historical essays on the topic. Oh, these four books will set you back $1295.

It is true Routledge is worse than many publishers, but this is beyond ridiculous. I’m fortunate enough to have access to Harvard libraries until I graduate (fingers crossed) next year, but the chances are very good that whatever libraries I can find nearby throughout the rest of my career are the kind who cringe at these prices. I don’t really blame the publishers, though. They are just trying to make a buck in a tough industry with books that have very low chances of selling more than a few copies here and there.

However, I do blame academia for making book publishing such a central part of career advancement. I really wish it would support a wider range of formats and a completely digital, open access but peer reviewed world of scholarly interaction, given the increased potential this offers for informed readers outside our small academic world to participate more actively in the process.

Perhaps my expectations are too high, but even if monograph-length publications of the traditional variety are here to stay, can someone tell me why we can’t do something like this:

1. Scholar gets an annual personal publication fund from department, its size based on multiple variables, including, perhaps, evaluation of past publications and a department’s commitment to support research in a tough field that is poorly funded by grants and professional associations.

2. Scholar writes a manuscript (a book, an article, but also other multi-media or film projects etc. ought to be included).

3. Scholar submits manuscript to a professional association along with small administration fee for free distribution of work to readers (or viewers, etc.).

4. Professional association finds some qualified unpaid anonymous readers for the work to evaluate its quality and distributes copies to them (the way publishers do now).

5. Readers return an evaluation that concludes refuse, revise, or publish with some indication of what relative importance the work has in terms of its contribution to the field from their perspective.

6. If it passes peer review, the professional association gives the scholar back the evaluation reports, an official endorsement (which can be used to promote the work, once “published”), and if funding is available, makes an offer of some amount of money towards publication of the work, in relation to the relative importance of the work attributed to it by its readers, its own further evaluation, and its budget for the year.

7. If the work passes peer review and the money offered by the professional association is sufficient for publication, proceed to step (9). Otherwise,

8. If the offered money is insufficient for publication costs or the professional association refuses to endorse it, and the scholar does not wish to make up the difference from her/his personal publication fund, they then repeat steps (3) to (6) seeking help from other professional associations whose evaluation of quality will add to the prestige and funding of the work, or other funding sources (departmental, university, other institutions) until they get enough money in offers or they revise or abandon the research project.

Once the scholar has decided that they have enough support from professional associations, grants, further departmental support, or contribution from their annual personal publication fund they proceed with publication and spend their funds in the following manner:

9. (Optional) Pay lump sum to a publisher-consultant who handles the administrative tasks and payment in steps (10) to (13) below if the scholar doesn’t want to deal with it personally or through someone at their own institution hired specifically for this task. There is to be no transferral of copyright away from the scholar either way and this publisher-consultant does not have any role in determining whether or not something gets published. In this model the publisher is an administrator who has contacts for managing the steps below.

10. Pay for X hours of labor to hire an editor-consultant to help improve the language and writing of the manuscript beyond the quality of its academic content.

11. Pay for Y hours of labor to hire a designer-consultant to create the print and digital presentation for the work (for desktop/mobile web browsers and e-reader applications).

12. Pay $Z for the fees to have the metadata for the work permanently indexed and its files hosted in multiple online depositories, including important information on its peer-reviewed endorsements and positive/negative evaluation reports.

13. (If you really want to make a paper version) submit the print formatted version of the work to all the major online print-on-demand services where anyone can order a cheap paper copy, including both libraries and average readers.
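Read as a procedure, steps (3) to (8) amount to a loop: keep submitting to associations, collect endorsements and offers, and stop once the offers (plus whatever the scholar is willing to add from the personal fund) cover the cost. Here is a minimal Python sketch of that loop. The names `Review` and `seek_funding` are hypothetical, and it assumes, for simplicity, that the scholar always tops up from the personal fund as soon as the fund can cover the shortfall, which the proposal leaves to the scholar's choice.

```python
from dataclasses import dataclass


@dataclass
class Review:
    """Outcome of one association's peer review (hypothetical structure)."""
    association: str
    endorsed: bool   # step 6: did the work pass peer review?
    offer: float     # step 6: money offered toward publication (0 if none)


def seek_funding(cost, personal_fund, reviews):
    """Walk through steps (3)-(8) for a sequence of association reviews.

    Returns (funded, endorsements, top_up), where top_up is the amount
    taken from the scholar's personal publication fund.
    """
    raised = 0.0
    endorsements = []
    for review in reviews:               # step 8: try associations in turn
        if review.endorsed:
            endorsements.append(review.association)
            raised += review.offer
        if raised >= cost:               # step 7: offers alone are sufficient
            return True, endorsements, 0.0
        shortfall = cost - raised
        if shortfall <= personal_fund:   # step 8: scholar makes up the difference
            return True, endorsements, shortfall
    return False, endorsements, 0.0      # revise or abandon the project
```

Note that, as in the proposal, a work with no endorsements at all can still be published if the personal fund covers the full cost; it simply goes out without an association's stamp of quality.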

Here are some of the strengths of a system like this:

-It leaves the copyright in the hands of the author, who will hopefully release the text with a Creative Commons license for maximum distribution and use.

-It imagines a new and powerful role for professional associations, or at least a transformation of traditional journal editorial boards/networks into more broadly defined associations who continue to have, among their primary duties, the evaluation of scholarly work in their field.

-It recognizes that publishing, even of digital or print-on-demand works, can be a costly process involving many hours of labor beyond that of the author and the anonymous readers.

-It leaves peer review intact, but shifts it from publishers to professional associations which should themselves proliferate in number and each will naturally develop differing perceived standards of quality and funding sources. With the decline of traditional academic publishing, these organizations should receive funding from universities and outside grant institutions or at least provide them with recommendations of where their funding should go.

-It allows for multiple sources of funding: professional associations that participate in the peer review process, scholars’ own annual publishing funds, and further grants from university or other institutions.

-Since personal or departmental funds may end up partly or completely funding the publication of works that were poorly evaluated in the peer-review process and couldn’t get financial support from sources based on their quality, it does little to stop bad research from getting published. It does, however, prevent such works from creating a burden on traditional publishers, who currently pass that cost on to the consumers of information. Publishers now play no part in the selection process and have no stake in the success of a publication: the publisher, editor, designer, and digital index/content hosts are all paid for their work regardless. Also, since such poor quality publications will not be able to promote themselves by showing that they have the endorsements and positive evaluations of reputable professional associations, they will simply get cited less and can get filtered out in various ways during the source search process. However, even bad works or ones on extremely obscure topics can sometimes be useful, if only for a footnote or two that turns us on to a good source.

In this system what is the role for traditional academic publishing companies as they exist now?

None. Universities that support many of them should eventually dissolve them but support them long enough to allow a relatively smooth transition for their employees to find niches in the businesses that should grow from providing services in step (9) to step (12). Book paper printing should be all done through print-on-demand services as the print medium slowly declines. Marketing/promotion of the traditional kind will ideally become a minimal part of the equation as association endorsements and evaluations become the dominant stamp of quality and citation networking power comes to rule the day. Of course, you can add a “marketing” budget for promotion and advertising between steps (11) and (12) above if such funds are available but hopefully this will be seen as a practice resorted to mostly by those who failed to receive strong endorsement from professional associations. No one promotes our journal articles, why should we treat our academic books and other projects differently? If it gets cited, read, and referenced, is that not enough to ensure its spread, especially if the works are openly available and thus offer no barrier to access?

Now, tell me why can’t this work? Why won’t something similar to this emerge from the ridiculous state of academic publishing today when it really wakes up? Let me know what you think.

Time to Walk the Walk
Wed, 28 Apr 2010

I am deeply frustrated with the sometimes closed atmosphere in academic life. I feel a profound discomfort when I encounter students and scholars who are paranoid that their research ideas will be stolen, that their sources will be discovered and, shock and horror, will be used by someone else. I’m simply incapable of sympathizing with them. I don’t like it when scholars pass around papers with bold warnings commanding me, “Do not circulate,” and I’m even less happy when I have been given handouts at a presentation only to have the speaker collect them again following the talk as if I was looking over instructor comments on a graded final exam. I feel my stomach churn as, to give a recent example, a professor opens up a database file of archival information and, smiling mischievously to the audience, declares that this is his “secret” source.

Such is life, people say to me, or else quote me some snotty French equivalent. That is the reality of this harsh academic world we live in. Well, perhaps I’m suffering from an early onset of old-age grumpiness, but I just don’t want to play that game. I don’t care that I’m still a graduate student, that job committees will look over everything they can find by me in search of sub-standard material, or that publishing firms will want me to explain why an earlier version of something I have submitted to them is available for download somewhere online. I don’t care if someone else finds some topic I have done some preliminary work on interesting, runs with it, and ends up publishing something on it. I may feel a momentary pang of regret that I didn’t get my own butt in gear and finish the project myself, but if they did a good job, then I really have no cause for complaint.

I’ve decided to just go ahead and start posting everything I produce academically, including short conference presentations and other research works in progress. You can find this material on a new research page here at Muninn.

Well Written History
Tue, 18 Aug 2009

The majority of the research is done. The sources have been found. The books and documents have been photographed or photocopied. Some of them have even been read.

I’ve got ideas. I’ve got outlines. I’ve got hundreds of pages of notes.

I have years of training in the destruction and dismissal of other people’s arguments. They call it grad school.

Now the time has come when I too must write – and not one of those research papers churned out in the day or two before the deadline arrives. I must write the dissertation. I am to write chapters that connect to each other in some logical fashion. Chapters. Even the word itself sounds like so many heavy links of metal to be hung around the necks of PhD students back from those green pastures they call “the field.”

I have seen them. They wander the campus with a pale look; the clank and rattle of their invisible burden almost audible as they walk. Nearby a third year history grad student might be seen skipping away, “I’m off to the archives!”

I forge my first link this fall. Getting a summer head start on my procrastination, this week I sat down to read a few books on the craft of writing, including a simple but handy book of “writing tools” aimed mostly at journalists and fiction writers. Reading through the short examples of good writing, I realized that I didn’t really know what good writing looked like in history.

Don’t get me wrong. In historiography classes, I have read plenty of “classic” works, from a full range of “schools” of historical inquiry and their most radical theoretical rivals. A year spent mostly reading in preparation for oral examinations brought me in close contact – “reading” wasn’t always the best description of what that contact consisted of – with hundreds of history books, but in all cases my eyes were trained on the content, not the form. The only times I really paid much attention to form were when some theoretically ambitious works were so frustratingly obtuse that one wondered how these historians who claim sensitivity to the subtleties of discourse could have nurtured such talent for linguistic slaughter.

I can think of plenty of works of history that took an approach I liked, had an argument that persuaded me, or simply benefited me in my own research. However, I am embarrassed to admit, I can’t name any history books that I thought were well written. That is to say, I have apparently paid so little attention to the writing of history at the level of phrase, sentence, and paragraph, and so much to the arguments and their support instead, that I now feel particularly naked as I go forward in my own writing.

Of course, I suspect good writing in history resembles good writing everywhere else. Surely many of the lessons of good writing taught in a journalism class, at a college writing center, or in Mrs. Gould’s seventh grade English class back in Aberdeen, Scotland are applicable to the writing of one’s history dissertation. I am also doubtlessly influenced by the rhetorical strategies and sentence structures of at least some of the hundreds of works that I have read in the past few years. Hopefully that influence is partly born of an intuitive recognition of quality. Even if that assumption is flawed, it is too late for me to revisit those blissful days of wide secondary source reading now. But if I get a chance to speak to incoming grad students in my last two years in the program, perhaps in the form of a wailing spirit in the night, I think I will advise them to pay closer attention to the language of historical works; to occasionally wield the eyeglass, and not merely the sword when they confront the works both in their own fields and the broader historiography.

Triage in the Archives
Wed, 01 Apr 2009

I’m working on my last batch of documents in the provincial archives in Shandong. There are two challenges to doing my historical research here which I often think about. The first is the problem of access to both the archive and much of its contents. I have been very fortunate but I regret that it is more of a result of good fortune than anything else. This posting will focus on the other problem, the need for a kind of triage in the archives and the constant awareness of my own personal limits as a reader. It is a humbling experience, and I suspect many, if not most, historians, come to face it if they have spent much time doing archival research, especially dealing with documents not in a language they speak and read natively.

Language and Detailed Local Knowledge

I enter the archives here with a topic in mind, a relatively good understanding of the regional and chronological context for my topic of study, and a working knowledge of the terminology often used in the kinds of documents I will be looking at, in part thanks to the existence of a published collection of documents from the same archive (山东革命历史档案资料选编). However, I have two major disadvantages that I feel very acutely every day I come to the archives. One relates to my language ability; the other to the limits of my local knowledge.

Though I can read Chinese, especially when it comes to the materials in my particular field of study, I have two huge linguistic disadvantages compared to any native speaker of Chinese (and, to a lesser degree, native speakers of Japanese): 1) I read Chinese much slower, and more importantly, skim Chinese slower, than native speakers. I still have to occasionally look up words that cannot either be understood by context or safely ignored due to probable irrelevancy. 2) I do not have a lifetime of practice reading handwritten documents using cursive or radically simplified Chinese characters, which compose over half of the materials I’m looking at. This means that some of the many handwritten documents I look at here, where I do not have permission to photocopy or take photographs of the materials I am looking at, are partially or in a few cases completely impossible for me to read.

The second major kind of disadvantage I have relates to the fact that, as one archivist here put it to me sympathetically, “This must be overwhelming, since you have only had time to study Chinese history for a year or two before you came.” This makes it seem like every Chinese historian has studied Chinese history for decades and is thus many years ahead in terms of knowledge of the specifics of Communist party anti-treason campaigns in Shandong province, which is simply not the case. However, all other things being equal, I must come to terms with an obvious fact that lies at the heart of what the archivist was trying to point out to me: It is physically impossible for me to have found time to read more than a subset of the Chinese language secondary works or document collections that are related to my field in the short time I have worked on my dissertation, let alone read, as some graduate students and scholars here undoubtedly have, the many other peripheral works that help one understand the context surrounding my topic. This is even more true since I am doing a transnational and comparative project that also incorporates Korea.

The only way people in my position can walk into the archive each day with some degree of self-respect is to convince ourselves that we have something unique to offer the study of our historical topic that gives us some kind of advantage relative to other scholars and students who might be working on a similar field here. Whatever this might be, our critical question, our comparative approach, our sensitivity to patterns etc. that might not be apparent to those working in other scholarly contexts, and so on, it gives us the confidence to go in and struggle through the historical materials and accept our weaknesses. In my case, I try to tell myself the contribution I can make is largely to be found in the way I “slice” the range of my inquiry and attempt to use that slice to answer particular questions. I remain open to the idea, however, that the “uniqueness of approach” claim may ultimately be an illusion, and as the quality of academic research here in China improves rapidly (I was really impressed with the breadth of reading and fresh approaches taken by some graduate students I have met here), some of the other advantages that foreign scholars coming to study might once have dared to claim are disappearing.

Even if one does avoid falling into complete despair, it remains an incredibly humbling experience to walk into the archive each day and be faced time and time again with one’s own all-so-apparent inadequacies. Below, let me share some aspects of that experience with some examples and the unfortunate but necessary steps I have to take in order to maximize the number of historical gemstones I can mine in the ocean of archival material available to me, despite my weaknesses.

Archive Triage

A man walked into the provincial archive here a few weeks ago and asked to see proof of his father’s selection as a “model worker” in 1952. He unrolled a crumbling certificate glued onto some old newspapers that he said was the original certificate. An archivist looked through some kind of a list they have for that year and found no mention of the man’s father. The man left dejected, “Everyone told me it was fake, but I still can’t believe it.”

Though this is a sad story, this shows a kind of ideal situation for a historian: to be able to walk into an archive with a detailed question, to find an authoritative source that can answer the question, and walk out with a relatively firm answer.

A policewoman walked into the archive last week and said she wanted to know more about her father’s case. Apparently, sometime during the 1940s (I can’t remember the exact year) he was accused of being a “traitor” and a “reactionary” for being a Nationalist party member in some Communist base area and had various trouble in the many anti-reactionary campaigns that followed in the decades thereafter. The archivists helped her look for information but found nothing that could be of help to her. She was offered several other places that she could go and look into things but still left disappointed.

Here we have a case where someone has a somewhat broader question (anything about her father’s case would have been helpful to her), but the archive was completely silent. This is similar to what many historians face, and I think they often change their topic, their sources, the archive in question, or the way they frame their questions in response.

However, there is another common problem, which I face here along with, surely, many of my fellow PhD cohort now camped out in various dusty reading rooms the world over: The challenge of what to do when the archive offers many hundreds of documents that each have a small possibility of offering a nugget or two that may be of use.

One simple strategy that a historian can then take is to immediately limit the scope of inquiry. That isn’t always the best first approach, however, and should probably only be attempted after getting a good sample of the whole range. Having a huge potential source base doesn’t guarantee that selecting any subsection of it, based on region (limiting my study to treason elimination squads in the Jiaodong district), period (for example, the early formative period 1939-1941), or narrower topic (focusing just on how the squads attempted to get the ‘masses’ involved) will yield enough to be interesting.

It seems like the good results from archival research come in fits and starts. I can go for days without finding anything really useful, but then come across several fantastic finds in the course of a few hours. However, even in these cases, these fantastic finds may still only translate into a single paragraph of text or a footnote within the mammoth that is one’s dissertation. Depending on the kinds of source materials, you often have no idea if the next thing you pick up will be a total waste of time or will yield something wonderful.

Learning not to read. One of the skills that has been quite painful for me to learn is to overcome the urge to read everything. A Weihai police report from late 1945 that I looked at yesterday, for example, was over 80 pages long. Of those 80 pages, perhaps half a dozen distinct paragraphs, often separated by a dozen pages, are remotely useful to me. If I really read the full 80 pages of handwritten text, that document would take a whole day. I would probably have a much better understanding of Weihai in 1945 and might have found up to twice as much useful information, but very quickly one has to make a call about whether the potential gains are worth the time. Fortunately, the year long preparation for one’s oral exams in a PhD program, which involves the ‘reading’ of hundreds of books, helps teach the lesson of not reading everything but instead effectively combining selective skimming with close reading of some sections. Unlike preparing for orals, however, the key here is not to extract the ‘main arguments’ of a report by a treason elimination squad in the Binhai district in 1944 or a Shandong police journal from 1947. The key information is very often in precisely the minute factual details and anecdotes that orals preparation teaches you to give only enough attention that you can evaluate whether they support or contradict the argument being made by the author of a work.

So what to do? Well, when the source base is quite large, the most useful strategy I have found is to quickly identify patterns in the structure of texts and calibrate your reading speed to locations most likely to yield results. Reading everything would, of course, yield more, but time is a very scarce resource. Early on, I found that many (but by no means all) treason elimination squad reports are divided roughly into sections, not always clearly identified, and that the kinds of meaty anecdotes I have found useful in the past are usually located in two of these sections as instructive examples. Village petitions to have certain people punished as traitors usually have long introductions and conclusions which are highly formulaic and can be skipped. North Korean trial records have extremely rigid structures that, while also not clearly marked, can be located easily by finding certain key phrases in the first sentence of paragraphs, and so on. One strategy I adopted was to familiarize myself quickly with document structures when looking them over as a whole before skimming them. To facilitate this process, try requesting similar kinds of documents in groups, even when they are separated by region and time, because similarity in the structure of these texts can significantly reduce the time it takes to process them.

This has risks too, however. If I request lots of different kinds of documents from the Tai’an district in 1942, for example, I will quickly come to understand the importance and power of something known as the “Tai’an incident” which ripples across other regions in that year and others that follow. Taking the document group approach, however, the importance and power of that event only becomes apparent when reviewing my notes from several weeks of reading. This teaches another lesson though: despite the extra time it takes, frequently review the notes one takes in order to identify new patterns, new keywords or documents to search for, and deepen one’s understanding of the chronology and institutional or regional context of the material.

The last and most painful thing I have had to do, which is the best proof I have of the sad reality of my personal limits as a researcher, is the kind of triage which is based purely on a linguistic evaluation: Last week I had at one point a dozen or so documents. One of these was the detailed minutes from a public security bureau meeting held in a Communist controlled but nominally Japanese occupied area. Given the fact it was “close to the ground” in terms of being a very “local” text, and clearly not edited before being bound and submitted, there probably would have been some good unfiltered information about what was going on in the area. Thus, the chance of finding “gemstones” of information in the source was relatively high. However, the handwriting was about 90% illegible to me at first glance, and even if I slowly worked through it, I doubt I would be able to determine more than 50% of the content with careful reading. If I was a Chinese native speaker with more experience working in these documents, I could probably do much better. However, since I’m not, and my time is scarce, I decided to use the several hours I would have spent on that document on two or three other documents with a lower chance of yielding good material but which I could read much more easily.

This is the kind of decision that has to be made all the time, and it is sad and frustrating. It is especially frustrating when one is looking directly at the gemstone in question. To take one example of many, I found an anecdote filled with rich detail in one report on an exchange between an accused traitor, some women who attended the mass trial that were yelling from the audience, and a man who got on the stage to confront the accused. I could make out bits and pieces of it, and have a theory about what transpired (I believe the accused was thrown into a well), but several key phrases were illegible to me—not because the text was smudged or the paper burnt, but because the handwriting was too difficult for me to read in a few sentences. Thus, I did not record the anecdote at all in my notes. Of course, native speakers also have a great deal of trouble with some of these texts but, all other things equal, have a huge advantage when trying to decipher things. I will still be able to write my chapters and have found great material to support my arguments, but I often lament the fact that I had to leave so many bright gemstones embedded in the rock because I couldn’t take the risk of having misunderstood a text based on a mere partial reading.

I have tried to share some of the humbling realities of doing research here and some of the triage I have had to perform while in the archive. As a closing comment: I often wish that historical research encouraged something akin to the practice of “pair programming” wherein two researchers work together on the same materials, side by side, checking for accuracy, misinterpretation, poor selection of material, etc. I know there are many good pair translators out there, but I think it is less common for historians to collaborate, especially at the research stage as opposed to the writing stage, and it reminds me of the debates we had in seminars over whether history can ever be a discipline that truly encourages collaborative work. [1]

  1. During our discussions in seminar, the historians of the Annales School were seen as the major exception to this observation.
A Proposal for a Powerful New Research Tool – Organizing Information for Dissertation Writing – Part 3 of 3
Fri, 27 Mar 2009

In the first and second postings on this topic, I described my approach to a lack of connections between my notes on my sources and my broader dissertation outline. I explained how I organized my material and how I’m trying to use my task management software as a way to create a link between the increasingly large number of note files and sections of note files on individual sources and the broader outline of the dissertation I will begin writing this year.

In this posting I will describe a kind of outlining software that could largely resolve the organizational problem I have described in my previous two postings without having to navigate between several applications. The features I describe could be added as a mode or layer on top of existing outlining software. I am thinking of OmniOutliner, which is what I use, but I think the modifications I am suggesting could be easily added to most other outlining software, or serve as the foundation of a new solution based on the organizing principles described here. The result, I hope, will be an environment which allows researchers to adopt a smooth workflow that unites the highest level of a research outline with the tiniest fragments of notes on sources, or the sources themselves.

My own favorite note taking application, OmniOutliner, like most outlining software, allows me to record notes hierarchically, in bullet point fashion, on the materials I come across, ranging from historical sources to books and articles on any topic. I am always impressed with the great attention to detail and flexibility that software created by Omni Group shows, so if you are using a Mac, I recommend you give their offerings a look. When we make use of the information we find, we will usually mark the deployment of such information in our academic writing with citations and sometimes with direct quotations we have recorded. This means that an important part of historical research, as well as research in many other fields, is keeping careful track of exactly what information comes from what source. Two approaches in note-taking spring to my mind:

1) Somehow keep notes on sources separated by the source, whether in different sections of a single file, or different files. Make note of page numbers as one records notes from different parts of the source.

2) Organize all one’s notes by an arbitrarily determined collection of ideas (or chapters, or chapter sections, or themes, etc.). Each time one reads a new source directly input information worth remembering into the note file or section of a note file dedicated to the given idea, chapter, or theme.

The problem with the first approach, which I presume to be the more common, is that if one has a very large quantity of notes on many different sources, when one shifts into the writing mode, one has to hunt through one’s notes looking for the information relevant to the claims one wishes to make in that particular section of the dissertation. This is the problem of the lack of the “middle layer” of organization that I referred to in my first posting and for which I presented a temporary and imperfect solution in the second posting.

For example, I have somewhere between three and four hundred note files on various sources related or potentially useful to my PhD dissertation. Some of them are less than a page long with a few brief points of interest. Others, like my note files on various newspapers and archival collections, are extremely long files of several dozen pages divided into sections by year, issue, or specific document in a collection. Hunting through them all, or more likely, a quickly chosen sub section of these files in search of useful information I have recorded will be highly time consuming and risks missing some great gems that I have since forgotten about.

The second approach seems to provide a much faster transition from research to writing since the researcher can sit down and immediately begin writing a chapter based on the notes collected together under certain idea or chapter headings. The problem with the second approach is twofold: 1) When fragments are completely extracted from their original context you lose the important sense of connection between that information and other fragments or notes you have recorded from the same source. Call this the “problem of context.” 2) Each time you enter a fragment from a source into this chapter/idea/theme based note file you will have to also make note of the exact source and location in that source. This means the time required to record any fragment can potentially double. Call this the “problem of reference.”

Tagging and the Problem of Context

Those familiar with the dramatic rise of “tagging” of information online might have already thought of a way to resolve the problem of context. If information is tagged, you get the best of both worlds: a tagged object can be linked to many different ideas or multiple chapters while still remaining in whatever structure it was found in. There is no need to take the second approach above because we can dynamically create such an idea/theme list of fragments based on particular tags given to those fragments within the note file for the original source. When a picture in a Flickr set entitled, say, “trip to the lake” is tagged “bird” then the addition of the tag does not remove it from its initial context, that is to say, the fact that it is a picture posted to the Flickr collection of kmlawson and that I have placed it in that set.1

An organizational application which supports tagging, such as Yojimbo, Sente, Evernote, Leap, Yep, etc. allows you to easily tag files and display a list of files by tag. iPhoto allows you to tag your images, and I use an AppleScript to add tags to my songs in iTunes to support my dynamically generated ‘smart’ playlists. However, from the perspective of the historian writing a book or a PhD student writing their dissertation, these tagging applications aren’t quite enough. All these applications tag at the level of files, which is not really the level of detail we need in creating a rich web of connections for our research.

Tags must go beyond the level of the file and down to the level of a bullet point within one’s notes. We need a way to easily and quickly tag individual fragments of information within the sources we find so we can easily deploy them in our academic writing.

Let me give a concrete example. This is a fragment from one of my note files from a local police report from my trip to the archives yesterday. I also list where this fragment is within my system of organization:

In my folder “Dissertation”:
In my subfolder: “Related Notes”
In the file “Shandong Provincial Archives”:
Under the heading for file G042-01-0283-007 全县反奸诉苦大会总结 1946.4.3 日照县公安局
I have the following fragment from a local trial of a traitor:
“12/107 a woman’s father was executed by shooting by the accused traitor during the J occupation. When she 控訴ed the traitor she cried the whole time. She 上台 and using a stone, beat the 犯人. This scene made the masses 感動 and 落淚”

This is a very brief but rich piece of historical detail from a police report showing how a woman mounted the stage and beat an accused pro-Japanese collaborator who had executed her father when the town was under Japanese occupation, and it records the subsequent emotional impact this scene had upon the ‘masses’ in attendance. In a report that deals mostly with generalities and statistics this is a powerful little anecdote that might potentially make an appearance in my dissertation. This fragment might help me in one or two ways: 1) It can help me describe the way the ‘masses’ got directly involved during the local treason trials and carried out acts of direct violence against the accused during the course of the trials without interference, despite party directives that no violence or beatings should be carried out upon the accused, especially prior to conviction, and that any eventual executions be carried out only by a bullet fired by ‘special’ public security officers. 2) It can also help me make the argument, which I hope to make, that Communist cadres were very interested in recording the way that these treason trials aroused the masses, sparked emotion in them, and helped organize them in the context of movements carried out under party control. In this case, and in many others I have found, the reaction of the masses is carefully recorded.

Now, how shall I preserve this fragment for easy access later when I begin writing? I might want to give several tags to this fragment. Perhaps I want to tag it for things like ‘Treason and Social Reform’ (the chapter this might be used in), ‘local trials’, ‘reaction of the masses’, ‘women’, ‘beatings’, etc. Of course, to reduce the problem of tagging things with slightly different tags, auto-complete should be available to guess tags as I begin typing them.

It would also be nice if there were cascading tags. That is to say, I would like to assign certain tags to the entire file, Shandong Provincial Archives, such as “China” and “Shandong”, which trickle down to every fragment bullet point in the file, and also tag the sub-section of that file for the document G042-01-0283-007 全县反奸诉苦大会总结 1946.4.3 日照县公安局 so that all fragments of notes taken from that document have the tags ‘日照’, ‘公安局’, ‘反奸訴苦’, ‘1946’, etc.

It does me little good to use Leap, or Yojimbo etc. to tag the whole file, now several dozen pages long, with all my notes from the Shandong Provincial Archives, or even a file specifically for G042-01-0283-007 which included other useful information that I might want to tag in other ways (like a table of statistics of how many ‘masses’ were ‘organized’ as a result of carrying out the anti-treason campaign). An ideal outline software solution for academic researchers would allow tagging at the level of the bullet point.
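To make the idea of bullet-level and cascading tags a little more concrete, here is a minimal Python sketch. It is not how OmniOutliner or any existing application works; the `Node` class and its method names are hypothetical, and the only assumption is that a note file is a tree in which a fragment inherits the tags of every section and file that contains it.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class Node:
    """One note file, section heading, or bullet point (hypothetical model)."""
    text: str
    tags: set[str] = field(default_factory=set)      # tags assigned directly to this node
    children: list["Node"] = field(default_factory=list)
    parent: Optional["Node"] = None

    def add(self, child: "Node") -> "Node":
        """Attach a child node beneath this one and return the child."""
        child.parent = self
        self.children.append(child)
        return child

    def effective_tags(self) -> set[str]:
        """Own tags plus all tags cascading down from enclosing sections and the file."""
        inherited = self.parent.effective_tags() if self.parent else set()
        return self.tags | inherited

    def walk(self) -> Iterator["Node"]:
        """Yield this node and every descendant, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


# The example from the text: file -> archival document -> fragment.
archive = Node("Shandong Provincial Archives", tags={"China", "Shandong"})
report = archive.add(Node("G042-01-0283-007", tags={"日照", "公安局", "反奸訴苦", "1946"}))
fragment = report.add(Node("12/107 a woman's father was executed ...", tags={"women", "beatings"}))
table = report.add(Node("table of how many 'masses' were 'organized'", tags={"statistics"}))

# Listing by tag touches only the bullet points that carry the tag, not the whole file.
tagged_women = [node.text for node in archive.walk() if "women" in node.effective_tags()]
```

In this sketch the fragment about the beating ends up with its own two tags plus the six it inherits, while the statistics table under the same document can be tagged in a different way.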

The Problem of Reference and Creating Smart or Dynamic Outlines

Of course, this system would have to work both ways. Let us say I have finished my research in the various archives, libraries, and online databases and completed the taking of all those note files. Let us say my software has allowed me to tag all the more useful bullet points, and allowed these bullet points to receive the cascading tags of their section headers and the note files themselves.

Ideally, I should now be given some kind of clean view of all fragments of information that match certain tags. Besides the fact that these have now been dynamically collected for me and displayed in a list, the most important thing I need to know now is what source they came from. Thus, every fragment listed in this way should be able to display a column or otherwise make apparent the source. This system would ideally account for the fact that the source does not always correspond to the name of the originating file, or the header for the section from which the fragment was taken.

To accommodate the fact that a fragment’s source is not necessarily reflected in the file name or section name of its origin, I suggest the outlining software allow the user to designate certain blocks in a note file, whether it is the whole file (I have many files dedicated to a single book or article), or merely a section of a file (I have a single file for all the documents I viewed from some archives such as Shandong Provincial Archive, Korean National Archive, RG242 of the US National Archives) as belonging to a source. The actual citation for this source might be kept internally within the application. However, since there are many great tools out there for managing academic resources and their citations that a student or academic might already have a preference for (Zotero, Sente, Endnote, etc.) I believe the best solution would be to provide some form of a link to a source entry in these external resources, whether they be within an offline application or an online format.
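As a rough sketch of what designating blocks as sources could look like, assume each file, section, or fragment may optionally carry a key pointing at an entry in some citation manager, and that a fragment's source is the nearest enclosing block that declares one. The `Block` class and the key strings below are hypothetical, and no real Zotero, Sente, or Endnote interface is implied.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Block:
    """A note file, a section of a file, or a single fragment (hypothetical model)."""
    title: str
    source_key: Optional[str] = None   # hypothetical key of an entry in a citation manager
    parent: Optional["Block"] = None

    def source(self) -> Optional[str]:
        """Return the source declared by the nearest enclosing block, if any."""
        block: Optional[Block] = self
        while block is not None:
            if block.source_key is not None:
                return block.source_key
            block = block.parent
        return None


# One file holding notes on many archival documents: each section is its own source.
archive_file = Block("Shandong Provincial Archives")
section = Block("G042-01-0283-007", source_key="archive:G042-01-0283-007", parent=archive_file)
fragment = Block("12/107 a woman's father was executed ...", parent=section)

# A file dedicated to a single book: the whole file is the source.
book_file = Block("Notes on a single book", source_key="book:example-monograph")
book_note = Block("p. 45 ...", parent=book_file)
```

Here the fragment resolves to the archival document rather than to the name of the file it lives in, which is the distinction the paragraph above is after.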

Finally, viewing a list of fragments by a single tag or combination of tags merely gives you an overview of one idea (or a chapter if you have tags for chapter themes). The application should allow you to create ‘smart outline’ files which are essentially dynamically created mega-outlines, ‘notes on notes’ or a kind of complex ‘smart playlist’ of points to be made for each argument or chapter. Here is what I am thinking of: The user could create a ‘smart’ note file called, say, “Dissertation Outline” and then write out their broad outline divided into chapters and the major arguments they wish to make. Then, they could non-exclusively assign certain tags to chapters or arguments within those chapters in a special way that allowed “and/or/not” constructions to limit the hits. This is similar to the process of tagging fragments within note files described above, with one important exception: in this case, tagging these chapters or arguments allows the user to list all fragments associated with those tags under those sections in this mega outline. Thus, at a glance, the researcher can view a dynamic and self-updating version of their dissertation outline with the major sections and arguments directly inputted, but with each of these chapters or arguments containing within them a smart list of all fragments that contain certain tags associated with these chapters or arguments.

A few more features I believe this smart outline view ought to support: this view would also include a feature that could show a list of “orphans” which are tagged fragments which have not yet been assigned a location in the smart outline. It is very likely that we have tagged many fragments in ways that at a later date turn out not to be the most obvious when we enter the writing process. This orphan view can rescue important fragments from obscurity.
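The orphan view amounts to a set difference. A minimal Python sketch, under the simplifying assumption that an outline section picks up a fragment whenever they share at least one tag (the function name, fragment ids, and the tags assigned to sections here are hypothetical sample data):

```python
def find_orphans(fragments: dict[str, set[str]], section_tags: dict[str, set[str]]) -> list[str]:
    """Return ids of tagged fragments that no section of the smart outline picks up."""
    claimed = set().union(*section_tags.values())   # every tag used somewhere in the outline
    return sorted(
        fragment_id
        for fragment_id, tags in fragments.items()
        if tags and not (tags & claimed)             # tagged, but shares no tag with any section
    )


# Hypothetical fragments (id -> tags) and outline sections (heading -> tags).
fragments = {
    "f1": {"women", "beatings"},
    "f2": {"反奸訴苦", "1946"},
    "f3": {"land reform"},
    "f4": set(),                 # never tagged, so not an orphan in the sense used here
}
sections = {
    "Turning to the Masses": {"群眾化", "beatings"},
    "Liberation and the Anti-Treason Campaign": {"反奸訴苦", "1946"},
}

orphans = find_orphans(fragments, sections)
```

With this sample data only `f3` is reported, since its one tag is not used by any section of the outline.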

Also, since such powerful and probably huge smart lists will probably result in a large number of duplicate or less than useful fragments getting listed, the user should be able to easily hide single fragments or groups of fragments which, despite their promising tags, are irrelevant to a chapter/argument. There might also be check marks given so that a researcher can check off fragments as they include them in the written work.

This view should also account for the very likely possibility that the researcher anticipates using material from a source for which they have no notes. Just as fragments are associated with sources, in this mega-outline or smart outline view the researcher should be able to easily drag and drop in references to sources they think will be relevant to specific arguments or chapters but for which they have no notes or fragments at all. Again, these can come from an internally managed list of sources within the application but would more ideally be compatible with existing solutions like Zotero, Sente, etc.

Let me give an example of the ‘Smart Outline’ feature of the software as I imagine it:

Let us say I’m writing a book just on the Treason Elimination Squads in China (rather than divided between two chapters as I currently plan to). After tagging hundreds or several thousand bullet points in dozens or hundreds of note files managed by my outlining application, I write up a (very boring) book outline using this ‘smart outline’ feature:

Introduction
Formation of the Treason Elimination Bureau in Shandong
Early Excesses and the Anti-Trotskyist Movement
Balancing the Three Treasonous Enemies
Reckless Arrests, Reckless Killings, and Attempts at Reform
Turning to the Masses
Liberation and the Anti-Treason Campaign
Conclusion: Continuity and Change in the Civil War

While I won’t list them here, let us say I also add some sections for each of those chapters with major arguments I want to make in the book under the headings for each chapter. Now, I set about attaching relevant tags for each chapter or argument/section within a chapter (and some of these tags might be tags I have created specifically for chapters):

Introduction
Formation of the Treason Elimination Bureau in Shandong
-([not 1940 or 1941 or … 1949],’Treason Elimination Bureau’,Shandong)
Early Excesses and the Anti-Trotskyist Movement
-(托匪,’Treason Elimination Bureau’,torture,executions,excesses,湖西錯誤,[泰山 and 1942],[濱海 and 1942])
Balancing the Three Treasonous Enemies
-(托匪,國特,敵偽,’Treason Elimination Bureau’ etc.)
Reckless Arrests, Reckless Killings, and Attempts at Reform
-(亂捕亂殺,’Treason Elimination Bureau’ etc.)
Turning to the Masses
-(群眾化, ‘Treason Elimination Bureau’ etc.)
Liberation and the Anti-Treason Campaign
-(反奸訴苦, 1946, etc.)
Conclusion: Continuity and Change in the Civil War
-(反奸防匪,[Shandong and [1947 or 1948 or 1949]])

Having thus assigned certain tags (in some cases the same tags are listed in multiple chapters) I should, in the software I am imagining, be able to view a dynamic list of all fragments with those tags (or in some cases, complex combinations of tags like Taishan and 1942 so I get only fragments that are likely to refer to the Taishan killings of that year) under the relevant headings. I should be able to independently re-order the displayed fragments and hide those that I determine are irrelevant in preparation for writing. These smart lists should be ‘live’ so if I go back to the library or archive and add more fragments in some of my note files with the relevant tags, they should appear in the ‘smart outline’ which lists these tags.
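The “and/or/not” constructions could be evaluated along the following lines. This is only a sketch in Python: the helper names (`tag`, `all_of`, `any_of`, `not_`, `smart_section`) and the sample fragments are hypothetical, and it deliberately uses only bracketed combinations whose meaning is unambiguous, such as [泰山 and 1942], rather than guessing how a comma-separated list should be read.

```python
from typing import Callable

Query = Callable[[set[str]], bool]   # answers: does this fragment's tag set match?


def tag(name: str) -> Query:
    return lambda tags: name in tags


def all_of(*queries: Query) -> Query:    # "and"
    return lambda tags: all(q(tags) for q in queries)


def any_of(*queries: Query) -> Query:    # "or"
    return lambda tags: any(q(tags) for q in queries)


def not_(query: Query) -> Query:         # "not"
    return lambda tags: not query(tags)


def smart_section(query: Query, fragments: list[tuple[str, set[str]]]) -> list[str]:
    """Recompute a section's contents from the current fragments on every call."""
    return [text for text, tags in fragments if query(tags)]


# Hypothetical fragments as (note text, effective tags), for illustration only.
fragments = [
    ("note A", {"泰山", "1942", "executions"}),
    ("note B", {"泰山", "1943"}),
    ("note C", {"濱海", "1942"}),
    ("note D", {"Shandong", "1948", "反奸防匪"}),
]

# [泰山 and 1942] or [濱海 and 1942]
killings_1942 = any_of(all_of(tag("泰山"), tag("1942")), all_of(tag("濱海"), tag("1942")))

# Shandong, but not 1947, 1948, or 1949
early_shandong = all_of(tag("Shandong"), not_(any_of(tag("1947"), tag("1948"), tag("1949"))))
```

With the sample data, `smart_section(killings_1942, fragments)` picks up notes A and C. Because the list is recomputed on each call rather than stored, a fragment appended after another trip to the archive shows up the next time the section is displayed, which is the ‘live’ behavior described above.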

I believe that what I have described above can serve as a rough blueprint for a very powerful application that will allow researchers to have a fully integrated web of information between different levels of organization – at the highest outline level and the lowest level of note taking on sources.

Putting it All Together

So, putting it together, here is what I am imagining as a powerful note taking and organizational solution for academic research:

A powerful and flexible hierarchical bullet point outlining application, such as OmniOutliner

Which allows the ability to add multiple and autocompleting tags to any fragment of information within a file represented by a bullet point (and any bullet points it contains below its level)

Which allows cascading tags, so that note files and sections can be tagged and fragments within it inherit those tags

Which allows whole files or sections of files to be designated as coming from specific sources so that all fragments within those files/sections know what source they come from

Which allows sources that are associated with files or sections of files to either be managed within the application, or ideally, be linked to the entries for these files in external citation software (Zotero, Sente, Endnote, etc.) or some online equivalent (Refworks, Zotero, etc.).

Which allows the convenient listing of all fragments of information corresponding to certain tags

Which provides the means of easily viewing the source for all such fragments listed by certain tags

Which allows the creation of dynamic ‘smart outline files’ which are partially composed by the user. The sections composed by the user can be assigned a collection of tags (that might include logical boolean constructions of multiple tags)

Each section of a ‘Smart Outline’ can expand to show all fragments from the tags assigned to that section

These displayed fragments in ‘Smart outlines’ are live so that fragments added with the given tags are dynamically added, can be arbitrarily re-ordered by the user, and hidden if they are determined to be irrelevant by the user.

Fragments displayed in ‘Smart outlines’ should optionally display check marks so they can be marked off when they are incorporated into the written work.

‘Smart Outlines’ should offer the ability to open a window displaying ‘orphaned fragments’ (all fragments minus tagged fragments already present in the smart outline) listed by tags to prevent important fragments that are badly tagged from being left out.

In the ‘smart outline’ the user should be able to drop in references to specific sources under certain sections to account for useful sources for which there are no note files and fragments.

In short, this application is an outline or note-taking application which supports sub-file level tagging of bullet points along with a powerful ‘smart outline’ view that allows users to create high-level and dynamic outlines that list all possibly useful fragments of supporting evidence, what sources they come from, or simply references to sources for which there are no notes. It should ideally interface with existing mature citation software solutions, either on or offline, which already have wide adoption within the academic world.

  1. Of course, in the case of such digital resources, we can imagine these initial contextual traits as really just being two more special kinds of tags: one exclusive tag to indicate the owner, and a list of tags, one for each set I have put the picture into.
]]>
/blog/2009/03/a-proposal-for-a-powerful-new-research-tool-organizing-information-for-dissertation-writing-part-3-of-3/feed/ 6
Organizing Information for Dissertation Writing – Part 2 of 3 /blog/2009/03/organizing-information-for-dissertation-writing-part-2-of-3/ /blog/2009/03/organizing-information-for-dissertation-writing-part-2-of-3/#comments Sun, 22 Mar 2009 09:16:38 +0000 http://muninn.net/blog/?p=718 Continue reading Organizing Information for Dissertation Writing – Part 2 of 3]]> In the first of three postings on this topic I explained that I have become increasingly concerned that there exists a vast and empty middle layer of organization between the various primary sources, notes, and ‘notes on notes’ I have on the one hand, and my dissertation outline on the other. I have felt the need to develop some way, while I’m still out here in the field conducting my research, of better tying up the many individual fragments of information I find in the sources with the arguments I want to make in the written dissertation.

I’d be very interested in hearing about how other graduate students have sought to resolve the problem of connecting the large quantity of notes, outlines, and unprocessed raw sources with the grand outline of a huge writing project like a dissertation. Below I describe briefly how I have essentially integrated this process into my own task management routine.

First, let me describe how I have been organizing the historical materials I have been collecting in the field and while back at university. Read on for the details.

The primary historical sources I have been working on can be divided into four kinds:

1) Sources I have read but of which I have no hard or digital copy. These are represented on my computer only in the form of the OmniOutliner notes I have taken on these files, along with the appropriate information necessary for citations.

2) Sources of which I have only a photocopy. Usually I write any important citation information directly on the first page of the document. Then, to each of these sources (which may be primary materials or photocopied books, articles, etc.) I have given an index number composed of a letter (I have used the language the text is in, which in retrospect was not the best way to do things) and an incrementally increasing number so that I can easily refer to my documents in various other files and relocate them in my boxes of files. I have a single OmniOutliner ‘document index’ which lists the name or description of the document, its index number, and the date I found the document. (This index number is also noted under the entry for the appropriate day in a separate chronological dissertation log I keep, which describes what work I have been doing on my dissertation and the context in which I came across the material.) Two examples from my document index (description | index number | date found):

汪偽政府所屬各機關部隊學校團體重要人員名錄 C1010 2008.7.22

BA0155460 사법경찰지도교양재료송부의건 K1003 2008.5.16

3) Sources which I have taken a photographic image of. Some archives and libraries in China, Korea, Taiwan, and the United States allow me to (or at least don’t stop me from) bringing my own camera and taking pictures of the sources I’m looking at. For each source, I usually create a separate OmniOutliner document. In it I take notes on the content of the source, noting any useful information I find in the source. For each photograph of a page of material I take, I record the photograph’s ID number (there is a different numbering scheme for every camera). Sometimes, the notes are very simple, especially if I don’t know if a piece of material will ever be of any value. However, hard disk space is cheap so I take photos of a lot of material I may never look at in any great detail. When I import my photographs, I keep them separated, by source, in folders within my ‘Images’ folder in my ‘Dissertation’ folder. The file name of imported images is the ID number. That way, if I ever need to look at the original image, I can see the ID number in the notes for a source and then search for that number.

For example, in my notes from my reading of the newspaper 青島公報, which is divided into microfilm reels and years, I have an entry like this in my note file:

青島公報 Reel 1 From 1946.3 to 1946.10

9.1:3 three 漢奸, including a woman ‘cultural hanjian’ 焦墨筠 who gets 3 years and 6 months with other two geting 10 and 15 years each #0508

In this case, this is a short note summarizing an article on page 3 of the September 1st issue of 青島公報 on the sentencing of three Chinese traitors. I will probably never use this information in my dissertation but if I decide at some point to talk more about female traitors or ‘cultural traitors’ then I know I have this short newspaper article on the sentencing of one such woman at the conclusion of her treason trial. In my dissertation log, I find that I was reading through this reel on the afternoon of 2008.12.2 at the Shandong Provincial Library microfilm room, where they let me take somewhat readable photographs of the screen of the microfilm machine. The numeric portion of “#0508” can be searched for on my hard drive; that search will yield an image in the 青島公報 subfolder of my images folder named DSCF0508.JPG that contains the full text of the original article. I use the prefix ‘#’ because I have already taken over 10,000 pictures during the course of my dissertation research and the camera has begun to recycle the numbers. I used to use ‘$’ but, to keep these numbers unique in my notes, I now use ‘#’ for the second 9,999 pictures. When I search for 0508 I now get two pictures, but this hasn’t been too much of a hassle to sort out.
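For anyone who wants to process entries like this automatically, here is a small Python sketch of how such a line could be pulled apart. It assumes every entry follows the month.day:page pattern shown above and ends with a ‘#’ or ‘$’ prefix and a four digit photo number, and that images follow the DSCFnnnn.JPG naming of this one camera; the `Entry` type and `parse_entry` function are hypothetical names, and other cameras would need a different file name pattern.

```python
import re
from typing import NamedTuple, Optional

# month.day:page, free text, then a '#' or '$' prefix and a four digit photo number
ENTRY = re.compile(
    r"^(?P<month>\d{1,2})\.(?P<day>\d{1,2}):(?P<page>\d+)\s+"
    r"(?P<summary>.*?)\s*(?P<prefix>[#$])(?P<photo>\d{4})\s*$"
)


class Entry(NamedTuple):
    month: int
    day: int
    page: int
    summary: str
    prefix: str   # '$' for the first run of camera numbers, '#' for the second
    photo: str    # kept as a string so leading zeros survive

    @property
    def image_name(self) -> str:
        """File name to search for, assuming the DSCFnnnn.JPG scheme of this camera."""
        return f"DSCF{self.photo}.JPG"


def parse_entry(line: str) -> Optional[Entry]:
    """Return the parsed entry, or None if the line does not follow the pattern."""
    match = ENTRY.match(line)
    if match is None:
        return None
    return Entry(
        int(match["month"]), int(match["day"]), int(match["page"]),
        match["summary"], match["prefix"], match["photo"],
    )


# The example entry from the note file, shortened.
entry = parse_entry("9.1:3 three 漢奸, including a woman 焦墨筠 who gets 3 years and 6 months #0508")
```

Keeping the prefix alongside the number matters because the file name alone is ambiguous once the camera recycles its numbers; the prefix records which of the two pictures named DSCF0508.JPG the note refers to.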

4) Sources which I have a PDF or other digital document for. Many secondary sources or scanned materials I have are available in PDF format, which I have stored among my dissertation files. When I have read these, I record notes for them in an OmniOutliner file. Sometimes, when I have read several articles on a particular topic, I will have a separate OmniOutliner file summarizing the points and arguments from a collection of articles. For example, I have read a series of articles and book chapters on the history of the Shandong column of the 8th route army during wartime China and have compiled notes on this into a separate file I have for historical background on wartime Shandong province. I also have a note file compiling various notes on my notes related to Korea’s treason trials in 1949.

Making the Connections

So the challenge remains, how do I link all these notes, notes on notes, to the actual dissertation without constantly going on a long hunt through all my files?

The way I have approached this problem is to simply use my existing task management software, or ‘to-do’ software or ‘GTD’ (‘Getting Things Done’) software.

I use a program called OmniFocus both on my MacBook Pro laptop computer and on my iPod Touch to manage my tasks (they synch their content with each other). Like many other ‘GTD’ programs out there, such as Things or iGTD in the case of Mac software, OmniFocus allows you to organize your tasks by projects and contexts. Projects allow you to easily find tasks related to specific things you are working on, while contexts allow you to find tasks that can be done in certain ‘contexts’ of your life. You might first drop tasks into a general ‘inbox’ of unprocessed tasks and then add a project and context for later review if you don’t have time to perform a given task when reviewing the inbox. So, for example, I have a ‘web’ and an ‘email’ context which are sub-contexts under ‘comp’ for when I’m working on my computer, but also a ‘harvard-yenching’, a ‘shandong provincial library’, and a ‘taiwan national library’ context, all three of which are sub-contexts under the ‘library’ context.

The key efficiency move in this process of connection creation is, however, the combination of project organization and really simple adding of items to that project in the task management software. Many applications like OmniFocus allow you to very quickly and painlessly add tasks to your inbox or specific projects through a ‘quick entry window’ that is accessible in response to a keyboard shortcut. It is with this feature that I have found a simple and easy way to connect my notes with my final chapter writing: In OmniFocus I have created a folder of projects for my dissertation and created separate projects for each of my planned chapters.

Then, as I am taking notes on my sources, each time I finish looking through a source, I reread my notes and try to estimate what chapter the various information I have found can potentially contribute to. I then copy either the name of the source, its file number, etc. (if the notes are relatively few) or a unique phrase or date+phrase from the relevant bullet point in my notes and activate the ‘quick entry window’ of OmniFocus. I assign it the context of ‘writing’ for when I actually get to writing my dissertation, briefly describe what I think the source contributes, and then assign the item to the project corresponding to the chapter in which I believe I will likely use the information. If I don’t really know what chapter some material will be useful for, I might add it to one of the more general topic projects I have added to the dissertation folder of OmniFocus projects, or, for the time being, drop it into the inbox for later consideration.

[Image: examplequickentry.gif]

After adding many such items, not all of which will necessarily make it into the dissertation, I return to OmniFocus and review the various items I have added, eliminating those I don’t think will actually be useful when I begin writing, and grouping the various items into hierarchical categories within the chapter projects corresponding to sections of my chapter as I currently imagine it.

What I hope will result from this is a smoother writing process. As I write the various portions of each chapter I can find exact references to individual note files or parts of files where I can find the relevant source material and notes on that source material that will help me make the arguments I am planning to make in the final project. All I have to do is look at the OmniFocus project for a given chapter and I will see a full list of references, grouped by the various points or arguments I hope to make, ready for incorporation into the writing.

As I said in my first posting, I don’t think this is all that original a method, and it is probably just a variation of a similar process (though perhaps without using software) that many graduate students use when preparing for a large writing project. What I think is particularly useful with this approach is that I can very quickly add these little references, or pointers to sources, whenever I finish typing up notes on a given source without ever actually leaving the outlining software. It is very fast and simple and hardly interrupts the note taking process.

In a shorter final posting on this topic, I want to suggest how I think this process could be made even faster and better integrated if the more powerful outlining applications, such as OmniOutliner for Mac (or competing outlining applications on Windows or Linux), specifically targeted this kind of workflow.

]]>
/blog/2009/03/organizing-information-for-dissertation-writing-part-2-of-3/feed/ 3
Organizing Information for Dissertation Writing – Part 1 of 3 /blog/2009/03/organizing-information-for-dissertation-writing-part-1-of-3/ /blog/2009/03/organizing-information-for-dissertation-writing-part-1-of-3/#comments Sat, 21 Mar 2009 15:38:22 +0000 http://muninn.net/blog/?p=715 Continue reading Organizing Information for Dissertation Writing – Part 1 of 3]]> I’m coming into the home stretch of my two academic years of field work for my dissertation on treason and political retribution against accused collaborators with Japan in Korea and China from 1937-1951. I spent the first academic year in Korea, a summer in Taiwan, and I’ve just begun my last month of research in Jinan, China. I’ll try to wrap up some unfinished research in Korea and Taiwan this spring and then begin the actual writing of my dissertation this coming summer back in my hometown in Norway and while staying with family in the US. My goal is to wrap things up and hopefully complete my history PhD program by the spring of 2011.

I had always hoped I would have at least one chapter written up by the time I returned from the field, but at this I have failed. My primary excuse has been the fact that I have never had all the materials I have collected in various places in one place. In honesty, however, it is probably more due to the fact that I have never been able to combine the “research mode” and the “writing mode” into a single daily routine. I have deep admiration for graduate students and scholars who can do this effectively: spending their days at the archives and libraries, then shifting to chapter writing in the evenings. I haven’t even done what some professors have suggested: write a few disconnected pages here and there as you get enough material to weave a few tight threads. I confess cowardice, having not overcome the fear of composing such fragile and isolated pages.

Since I’m not, like those model students, immediately converting my daily discoveries into chunks of narrative and analysis, I am increasingly concerned about the fact that the hundreds of note files, outlines, and references to various archive images or PDFs themselves have become a considerable corpus that will require a nontrivial amount of processing and mining to reconstruct the argument and narrative of what will become my PhD dissertation.

To put it another way, I have two rich layers that form the foundation and roof of my research. The former is the dense web of primary source materials, notes taken from these source materials, and other timelines or “notes on notes” which organize some conceptually related materials. This is where the truffle hunter can happily prance about. The latter is the dissertation outline. This is an increasingly detailed macroscopic view of my planned chapters and arguments which has taken concrete form in a dozen different formats and lengths as it gets distributed as a dissertation prospectus, various fellowship application essays, emails to professors, and, in its most detailed form, a hierarchical outline document full of barely intelligible bullet points. This overarching top-down view is born of that creative destruction that is the clash between the starting assumptions that feed the “fire in my belly” which brought me to the study of history and my chosen topic, and my intuitive understanding of what my research in the sources permits me to argue in good faith as a historian. It is, of course, at exactly this point where many of the historiographical crises of our time find their point of entry but this is not the issue I wish to address in these postings.

While in the field, the gradual thickening of the web of notes and sources on the one hand and the increasingly detailed and structured outline on the other might suggest progress, but I can already feel the heavy weight of a void that lies between them. PhD students I have talked to who have returned from their research in the field give me the impression that the greatest frustrations that lie ahead for me are to be found in two areas. One is the challenge of writing itself, of synthesis and analysis on a scope never before attempted in our long career as students. The other, however, seems to be found in bridging the vast and dangerously incomplete “middle zone” between the above described layers: Exactly what evidence and what sources will be deployed for precisely which points we think we can persuasively make? Which book, newspaper or archival document was it that demonstrated this or that phenomenon? For every argument I wish to make, must I be reduced to searching through a large subset of my notes and notes on notes, which now number many hundred pages?

I’m very much open to the advice of graduate students and professors who have developed successful strategies for this, but in my next two postings I’ll share a strategy that I’m attempting now and that I hope will help me overcome some of the worst of the middle zone nightmare I have described above. I don’t think it is very original, as I suspect many, if not most, PhD students may have attempted or used something similar themselves. In fact, some may accuse me of describing the obvious common sense approach. If, however, it indeed is an effective approach – and this remains to be shown in the coming two years of writing I have ahead of me – then I wish it had been explained to me before I launched into my lonely existence as a student roaming the archives of East Asia.

In the next posting, I’ll explain how I’m using my task planning software (OmniFocus) as a bridge between my notes and my dissertation outline, creating a kind of index that links sections of my notes on specific sources to certain arguments I think I can and will make in my dissertation chapters. While what I’m doing doesn’t require any kind of specific software, this process has integrated relatively smoothly into my existing methods for organizing tasks on my Mac and my iPod Touch. The third posting will probably only be interesting to a more technical audience who are familiar with various specific software solutions. In that posting, I will suggest how, if my current experimental approach is sound, an even more ideal software-based organizational system might work, one which I have yet to find fully or satisfactorily implemented in any existing solution I have seen out there. I’m sure there will be dissenters who believe they have found the perfect solution for their needs, but I will attempt to articulate what I have found lacking in what is out there.
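
To make the idea of such an index concrete, here is a minimal sketch in Python. It is only an illustration and not the OmniFocus setup itself: the class name `EvidenceIndex`, the outline labels, and the note file names are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Pointer:
    """A pointer from an outline point to one place in the notes."""
    note_file: str  # hypothetical file name of the notes on one source
    section: str    # heading or page range within that note file


class EvidenceIndex:
    """Maps each argument in the outline to the notes that support it."""

    def __init__(self):
        self._links = defaultdict(list)

    def link(self, outline_point, note_file, section):
        """Record that a section of the notes is evidence for an outline point."""
        self._links[outline_point].append(Pointer(note_file, section))

    def evidence_for(self, outline_point):
        """Return every pointer filed under the given outline point."""
        return list(self._links.get(outline_point, []))


# Made-up example entries, not taken from any real set of notes.
index = EvidenceIndex()
index.link("Ch. 2, argument 1", "notes-newspaper-a.txt", "pp. 3-5")
index.link("Ch. 2, argument 1", "notes-archive-b.txt", "section 4")
index.link("Ch. 3, argument 2", "notes-archive-b.txt", "section 7")

for pointer in index.evidence_for("Ch. 2, argument 1"):
    print(pointer.note_file, pointer.section)
```

The point of the structure is that the lookup runs from argument to source: when writing a given section, one asks the index which notes to reopen rather than searching through all of them.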

]]>
/blog/2009/03/organizing-information-for-dissertation-writing-part-1-of-3/feed/ 4
Endnote Takes A Shot at Zotero /blog/2008/10/endnote-takes-a-shot-at-zotero/ /blog/2008/10/endnote-takes-a-shot-at-zotero/#comments Fri, 03 Oct 2008 13:12:05 +0000 http://muninn.net/blog/?p=669 Continue reading Endnote Takes A Shot at Zotero]]> The cold war between Endnote, the bibliographic software owned by Thomson Reuters that has long had a virtual monopoly on the academic market, and Zotero, the open source alternative created by the incredibly resourceful and innovative Center for History and New Media at George Mason University, has finally broken out into open conflict.

Endnote clearly saw its grip on the academic market coming to a swift end as a new generation of graduate students embrace the free and powerful Firefox browser-based alternative that has rapidly caught up to its rival in features. It responded with a huge gamble and an ancient weapon: the lawsuit. It has sued George Mason University for being in violation of its site license for Endnote. GMU has paid for a site license for the Endnote software, much like other universities (I can confirm, for example, Columbia and Harvard’s internal university software sites also provide its download for their university community) and the CHNM at GMU is listed as the creator of Zotero in the software’s about information. The Endnote site license is said to have explicitly forbidden the license holder from engaging in the “reverse engineering, de-compiling, translation, modification, distribution, broadcasting, dissemination, or creation of derivative works from the [EndNote] Software.”

Let’s look a bit closer at the players and the issues.

What is Endnote?

Endnote is a piece of software which allows researchers in any field to compile a list of bibliographic entries. This might mostly include lists of books or articles they have come across for use in their publications.

At its core, the software is simply a database client for research sources. However, it eventually developed three killer features that created a reluctant customer base out of virtually the entire academic world:

1) Z39.50 – In Endnote, the user doesn’t have to type in all their sources by hand. If, for example, they want to include a book which was found in the Library of Congress or any one of thousands of libraries which have an online database which supports something called the Z39.50 protocol they can use Endnote to directly import the info in question. Endnote ships with dozens of “.enz” connection files which allow it to connect to most of the important libraries in the United States and search their holdings for the source required. Endnote will then add the bibliographical information directly into the user’s own database. If you can’t find your library in the default list of connections, very often the Z39.50 .enz file can be downloaded directly from your favorite library’s homepage, usually hidden somewhere deep in the geekier sections of the website. The .enz files simply contain connection information, openly available through various library websites, that has been put into a special format readable to Endnote. Interestingly for this lawsuit, I don’t know of any case in which Endnote has sued libraries for distributing (which is a violation of the license) these .enz files which are, like .ens files (see below), a “component part” of the software.

2) Styles – Endnote provides the ability to convert one’s source entries into any bibliographical style, so that your footnotes, endnotes, and bibliographies can be easily formatted according to the many different styles used by various journals and publisher needs. These styles are created and openly available to anyone who consults the website of the given publication. In addition to providing the ability to create your own output style, Endnote has simply taken these publicly available style formats, many based on well known formats like the Chicago citation style (see instructions for citation styles for American Historical Review, for example, here), reduced them to their most basic components and created an “.ens” file which saves the formatting requirements in a digital format. If you have Endnote installed, you can see the huge list of style files available in your Endnote folder in the Styles sub-folder:

[Screenshot: ens files.gif, the list of .ens style files in the Styles sub-folder]

If you open any of these files in a text editor you will get mostly gibberish, as the information is stored in a format readable only (until recently) by the Endnote software. However, if you open Endnote’s style manager and inspect, for example, the style for the American Historical Review, under Bibliography templates, you will see some of the kind of information stored by the .ens file. For example, under book template you will see something like this:

Author. Title|. Translated by Translator|. Edited by Series Editor|. Edition ed|. Number of Volumes vols|. Vol. Volume|, Series Title|. City|: Publisher|, Year|.

Each of those words corresponds to a variable, or a kind of an empty box, into which Endnote will drop your bibliographical information, in accordance with what you have entered into the database with your sources. It is important to understand, for the purposes of this first battle of the E vs. Z war, that the styles themselves are not proprietary, but Endnote lawyers are arguing that the way they have translated these styles into a digital format, that is the “.ens” file, is protected by the Endnote license.
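
The “empty box” idea can be sketched in a few lines of Python using the book template quoted above. This is a simplified illustration and not Endnote’s actual algorithm or anything read out of an .ens file: it assumes that each “|” marks off a chunk of the template, and that a chunk is dropped when none of its fields has a value. The field list, the `render` function, and the sample record are all made up for the example.

```python
import re

# Field names inferred from the book template text itself (an assumption,
# not taken from Endnote documentation).
FIELDS = [
    "Author", "Title", "Translator", "Series Editor", "Edition",
    "Number of Volumes", "Volume", "Series Title", "City", "Publisher", "Year",
]

# Longest names first, as a precaution in case two names share a prefix.
FIELD_RE = re.compile(
    r"\b(" + "|".join(re.escape(f) for f in sorted(FIELDS, key=len, reverse=True)) + r")\b"
)

BOOK_TEMPLATE = (
    "Author. Title|. Translated by Translator|. Edited by Series Editor|"
    ". Edition ed|. Number of Volumes vols|. Vol. Volume|, Series Title|"
    ". City|: Publisher|, Year|."
)


def render(template, record):
    """Fill the template from a record, skipping chunks whose fields are empty."""
    pieces = []
    for segment in template.split("|"):
        names = FIELD_RE.findall(segment)
        # A chunk with fields but no values (e.g. no translator) is left out,
        # together with its surrounding punctuation and literal text.
        if names and not any(record.get(name) for name in names):
            continue
        pieces.append(FIELD_RE.sub(lambda m: record.get(m.group(1), ""), segment))
    return "".join(pieces)


# A made-up source entry.
record = {
    "Author": "Smith, Jane",
    "Title": "A History",
    "City": "Boston",
    "Publisher": "Example Press",
    "Year": "2001",
}

print(render(BOOK_TEMPLATE, record))
# prints: Smith, Jane. A History. Boston: Example Press, 2001.
```

Because the chunks for translator, editor, edition, and volumes are empty in this record, their literal text (“Translated by”, “ed”, “vols”) disappears along with them, which is the behavior that makes a single template usable for very different sources.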

3) Word Integration – The final killer feature of Endnote is that the software can take your list of formatted footnotes, endnotes, or bibliography and directly interface with the most popular word processor out there: Microsoft Word. If a scholar is writing a paper in Word, they can prepare an Endnote document with all the sources they need for the publication, and directly in Word they can assign certain sources to certain footnotes or the bibliography using a Word plugin provided by Endnote. They can then, with a few clicks, format all of those footnotes, endnotes, and the bibliography to the style appropriate for whatever publication they are submitting the paper to.

For thousands of scholars this ability has saved hundreds of hours they might otherwise spend typing up their references and making sure they conform to the requirements of their publisher.

However, as a side note, this hasn’t been all good. I can share from my own experience and the experience of my friends some of the most problematic issues:

a) Garbage in – Garbage out: The library databases that most users of Endnote interface with don’t always have perfect information. Sometimes information is in the wrong place, lacking capitals where it needs them, or contains a lot of superfluous information that one doesn’t want to include in every footnote. Users must often spend a lot of time cleaning up imported information before having Endnote (or Zotero for that matter) do its magic. This is a problem of data integrity, not the fault of the software.

b) Endnote sucks. We used it because, until the rise of alternatives like RefWorks and Zotero, that is all there was. I’m sorry, but from the earliest version I started using years ago until the most recent version, Endnote seems to have thrived in an environment of safety and lack of competition. For many years Endnote could not deal with any sources that used non-Roman scripts, mangling any Chinese, Japanese, or Korean sources such as those I have need for. To this day, I have encoding issues with Endnote that make it a pain to use. Endnote has a user interface that seems to have been designed by programmers that have never written a paper in their life, let alone studied user interface techniques. It is ugly, clunky, and unintuitive at every step. Finally, Endnote has long had serious stability and performance issues when it interfaces with Word. Though I haven’t personally had any major disasters, only minor hiccups caught early in the process, during my tech support days at Columbia University’s Faculty Desktop Support, I have had to deal with many panicking professors who showed me their book or article manuscript Word files with completely mangled footnotes. “All my references suddenly disappeared!” or “No matter what I click in Endnote, nothing converts or changes in my Word file anymore!” were two of the most common complaints I had. Sometimes the tenuous connection between Endnote and Word just seems to break down, with disastrous consequences.

c) Endnote only works with Microsoft Word, at least as far as I know in the versions I have used. This created a vicious circle within academia. At FDS I watched more and more professors who loved their ancient alternatives to Word like WordPerfect and Notabene (I had never heard of this until I saw its grip on Classics and English departments), or who stubbornly resisted Microsoft’s power by using OpenOffice or Apple’s AppleWorks, having to switch to Word, not only because .doc was the dominant format but sometimes because they watched with envy as others used the power of Endnote for large scale pieces.

The Rise of Zotero

Zotero will go down as one of the great open source legends. Unlike many other wonderful pieces of open source software, I believe Zotero is poised to completely topple its commercial rival, Endnote, and do so in record time. Zotero has and will continue to have other powerful competitors who eschew the browser-based approach or embed a browser into the software, but the rule of Endnote is soon at its end. I have played with Zotero since its buggy early beta days and watched it grow to the powerful alternative to Endnote that it is today. Developed by and for the browser generation, it took a radically different starting point: Endnote users started their bibliography creation process within the Endnote software, typing up or using Z39.50 connections to add sources to their bibliography. Zotero users start on the net, because hey, guess what, we all do.

Zotero assumes we find the majority of our sources while, for example, using a library’s search engine, a list of books on Amazon.com, an article at JSTOR or other academic databases, or when reading a blog entry. Zotero has gradually added a huge list of “site translators” which scrape a web page and extract the useful bibliographical information from the page in question. There are plugins to add metadata readable by Zotero in popular blog engines like WordPress. Whether it is a library book entry or a bookstore listing, Zotero can instantly add information from hundreds of websites and databases available online by simply clicking an icon in the address bar. You can also instantly add bibliographic entries from any static web page, and save offline snapshots of these websites from the time you accessed them for future reference. This all meant that Zotero very quickly far overtook Endnote’s main killer feature #1. It was an instant feature smack-down.
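
As a rough illustration of what scraping bibliographical information from a page involves, here is a minimal sketch in Python. It is not Zotero’s translator code (Zotero’s translators are written in JavaScript and are far more elaborate). It assumes a page that exposes its metadata in `citation_*` meta tags, a convention some academic sites use; the class name `MetaScraper`, the `scrape` function, and the sample page are made up for the example.

```python
from html.parser import HTMLParser


class MetaScraper(HTMLParser):
    """Collects <meta name="citation_..." content="..."> tags from a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name = attrs.get("name") or ""
        content = attrs.get("content")
        if name.startswith("citation_") and content:
            # Keep a list per field, since a source can have several authors.
            key = name[len("citation_"):]
            self.fields.setdefault(key, []).append(content)


def scrape(html):
    """Return the bibliographic fields found in an HTML document."""
    parser = MetaScraper()
    parser.feed(html)
    return parser.fields


# A made-up page, standing in for a journal article's landing page.
page = """
<html><head>
  <meta name="viewport" content="width=device-width">
  <meta name="citation_title" content="An Example Article">
  <meta name="citation_author" content="Doe, A.">
  <meta name="citation_author" content="Roe, B.">
  <meta name="citation_publication_date" content="2008">
</head><body>...</body></html>
"""

print(scrape(page))
```

Pages that carry no such tags are the harder case, and that is where per-site scraping logic is needed; this sketch only covers the easy case where the page already describes itself.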

Because the project is free and open source, it quickly gained a huge following even when it lacked some of Endnote’s power. Those without access to a university site license were loath to dish out the ridiculous $300 for Endnote ($210 for an academic license) or face its steep learning curve and were willing to accept cheaper alternatives like Bookends (Mac, $100, $70 for students) or the increasingly powerful Sente (Mac, $130 or $90 for students). Zotero, of course, is completely free. Plugins and site translators for Zotero have spread fast as a result. It also offered powerful tagging capabilities and the easy organization of sources into folders, which is way ahead of the incredibly limited organizational possibilities of Endnote’s file-based bibliography system. The only major weakness in Zotero’s general approach is the fact it is wed to the Firefox browser so researchers may have to do their source hunting in something other than their favorite internet browser.

I think the most powerful attack on Endnote’s market came, however, when Zotero added support for Word, OpenOffice, and NeoOffice integration. Although I think the results have been somewhat mixed in the early stages (I haven’t tried in the newest release) this will eventually eliminate the advantage of Endnote’s killer feature #3.

All that remained before Endnote became an expensive 175MB waste of space on one’s hard drive was for Zotero to catch up with Endnote’s killer feature #2. Now, Zotero’s 1.5 Sync Preview, which is available for download as a beta, includes (though this has been temporarily disabled, perhaps because of the lawsuit) the ability to export Zotero database entries using Endnote .ens style files. I’m not 100% sure how this works on a technical basis since I haven’t played with a functioning version including the feature, but the text of the Thomson Reuters lawsuit against GMU claims that Zotero now also provides a way for .ens files to be converted into the .csl style files that Zotero has. I have seen some comments on blogs that claim that the new version of Zotero never provided this ability directly but merely provides a way to output bibliographic data exported via existing .ens files should the user be in possession of such Endnote files. Either way, the developers of Zotero must have engaged in some kind of reverse engineering (which is where the lawsuit claims there is a license violation) of the gibberish we otherwise see in the .ens files in order to understand how Endnote has digitally represented the publicly available output styles. They are therefore now in possession of the ability to, for example, convert the Zotero database data, through these .ens files, into a readable bibliographical entry, or, if they wanted to, save such style formatting data into .csl files if that feature were ever included.

The War Was Over Before It Began

I think we have to await the official Zotero announcement regarding the lawsuit to help us determine the accuracy of the technical claims being made by Thomson Reuters. An entirely separate question, which has received the attention of various technology-oriented law bloggers, is the strength of the approach of the legal attack itself and its separate and bizarre claim that GMU is responsible for a misuse of Endnote’s trademark.

What isn’t in dispute, however, is the fact that Endnote should be very, very scared. Whatever features are included in 1.5 or later versions, the developers of Zotero have clearly made sense of the .ens files and suddenly the thousands of output styles provided by Endnote might potentially become importable, exportable, or more likely, simply accessible and readable by the Zotero software. Once these publicly available style formats become digitally understood by Zotero’s database, by whatever means, Endnote loses its last advantage over Zotero. This will, in my mind, undoubtedly be followed by the slow death of Endnote, already begun, as new users see no advantage to using the flawed, aging piece of software with its huge price tag.

The outcome of this lawsuit, even if it goes in favor of Endnote, cannot really do much to stop this trend. Zotero isn’t going to disappear. Even if, and I find this to be extremely unlikely, GMU were to take the radical step of completely shutting down its support for Zotero development, the user base is already huge. Other programmers will pick up where GMU’s team began with the code already in their hands. The reverse engineering of the .ens format, if it has been done successfully, can probably be explained in the space of a few paragraphs or represented by means of a few pages of code, perhaps encapsulated as a plugin that can be distributed separately from the Zotero software itself. The knowledge of a file format’s structure, once in the wild, can’t be put back in the proverbial bottle, a reality faced by dozens of software applications in the past and something we have seen with everything from Microsoft’s .doc to various proprietary image, sound, or movie file formats. Once the .ens output style files, which are all under 50k in size, can be interpreted, it is a simple matter, though of dubious legality, for scholars and students to email each other the dozen or so .ens files of journals or institutions most important for their field, either in the original format or, if the feature is eventually made available, converted into .csl files.

I believe that, whatever the outcome of the lawsuit, Endnote’s owner has shot itself in the foot. Users like myself do not like to be locked into one solution, and when we see a free and open source alternative under attack, it is an easy matter for all of us to jump in and identify the “good guys” and the “bad guys,” to paraphrase one recent politician. Endnote is in an unenviable position. It saw Zotero’s latest move as the final straw in its attack on the Endnote user base and decided the legal move was its last chance to halt the bleeding by protecting one of the most important components of its legacy code: the .ens output styles. Strategically, they have made the wrong move and I think all of us who agree should make our voices heard. It would have been far better for Endnote developers to at least attempt to out-innovate Zotero, something very hard to do when your opponent’s staff of supporting developers includes the wider community of open source developers along with solid university and foundation funding. Instead they have given Zotero a brilliant publicity moment.

Update: The official response by Zotero and GMU about the case. Nature magazine editorial on the issue.

Further Reading

Text of the Lawsuit (PDF)

Chronicle of Higher Education Wired Campus article on the Lawsuit
Outline on Disruptive Library Technology Jester
More Extracts and Discussion at Disruptive Library Technology Jester
Crooked Timber entry by Henry on the Lawsuit
James Grimmelmann Legal Commentary
More Legal Comments at Discourse.net
Mike Madison at Madisonian Offers a Legal Take
Mention and Comments at Slashdot

The Open Source CSL Format

]]>
/blog/2008/10/endnote-takes-a-shot-at-zotero/feed/ 10