May 2010


Workshop, 06 May 2010 08:37 pm

During my field research in Korea, Taiwan, and China I carried a hefty camera with me to archives and libraries. On those fortunate occasions when I was allowed to use it, I snapped nice high-contrast “text mode” photos of everything from handwritten documents and mimeographed newspapers to pages of books, along with thousands of pictures of microfilm reader screens zoomed in on a particular item. I also developed my own coding system to connect the image numbers in the digital camera to items in my notes, so that I can easily find the images again when I need them for my dissertation.

On other occasions I carried a smaller camera in my backpack for emergencies when I wanted to copy some pages out of books, but the pictures were often blurry. I recently discovered, however, that the camera on my iPhone 3GS is good enough to take decent pictures of books and documents if you have moderate indoor lighting.

The Pics Need Processing

To get optimal results, however, pictures of books and documents taken with an iPhone 3GS need to be processed: the contrast and brightness need to be turned way up, the file size can be significantly reduced (from about 1.1MB to 0.25MB each), and if you are making copies of an article or part of a book, ideally you want the result to be a PDF, not a folder full of pictures. Indeed, it is for this purpose that I have logged dozens of hours standing in front of the various PDF scanners in the libraries here at Harvard that I wrote about here.

Processing these pictures is time-consuming and begs for a hack. iPhone applications like JotNot and PocketScan are a nice idea, but I find them to be incredibly slow and awkward to use.

So I spent a few hours last night and came up with an inelegant but effective solution that, once set up, makes the whole process of getting iPhone pictures processed and into a readable PDF fast and painless. A real hacker would create a script that does all this for the user in a single step, and I would love to get my hands on such a script but in the meantime, in case there is someone out there who would find this useful, here is my current solution using OS X 10.6 and Adobe Photoshop CS3.

Preparations

You only need to do these steps once to get your computer set up, but they are kind of convoluted. I’m sure someone out there has a more efficient method:

1. Create a folder somewhere easy to get to on your hard drive and call it “Convert”

2. Create a folder (in the same folder as Convert for example) and call it “Converted”

3. Open “Automator” in your Applications folder and create a new Automator workflow that looks like this:

workflow.png

Save this as a workflow that we can attach to the “Convert” folder as a folder action. In the top pop-up menu select “Other…” and choose the “Convert” folder, which will hold the iPhone photos you drop in to have converted into a PDF. The AppleScript commands Photoshop to run an action I have called “CreatePDF”, which processes the images one at a time (see below) and saves the results into the “Converted” folder. The Automator workflow then grabs all the files from the “Converted” folder, which you should indicate in the workflow, and creates a PDF from them. The final step cleans up by deleting the images in both the Convert and Converted folders. You can remove this step if you don’t want the images deleted, but I usually drop in copies or exported images, so I don’t need them once the PDF has been created. If you like, you can download my Automator application version of this workflow here, modify it for your own use and folder locations, and save it as a workflow. Keep in mind that you need to change the paths in the “rm” commands to point to your own Convert and Converted folders.

4. Now we need to open Photoshop in order to create two actions. You can see what my actions look like below and create your own version, or download mine here, import them into Photoshop, and modify them for your own needs. In the picture below you can see that I have one action called PrepPDF, which actually processes a single image by a) changing it from color to grayscale, b) increasing the brightness and contrast, c) reducing the size of the image, and d) saving the image as a significantly compressed JPEG. You may find that you want to process it in some different way. The second action, CreatePDF, runs Photoshop’s batch command, which performs the PrepPDF action on every image it finds in the Convert folder and saves each processed image in the Converted folder.

adobeactions.png

5. Finally, in the Finder, right click on the “Convert” folder and choose “Folder Actions Setup…” and attach the workflow you created in Automator.
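For anyone wondering what the PrepPDF action in step 4 is doing to each picture, here is a rough per-pixel sketch in Python. It is not Photoshop's actual algorithm: the luminance weights are the common Rec. 601 ones, the contrast is a simple linear stretch around mid-gray, and the `brightness=40` and `contrast=1.8` values are made-up numbers for illustration, not the settings in my action.

```python
def to_gray(r, g, b):
    """Luminance of an RGB pixel using the common Rec. 601 weights."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def adjust(value, brightness=0, contrast=1.0):
    """Stretch a 0..255 gray value around mid-gray (128), add brightness, clamp."""
    out = (value - 128) * contrast + 128 + brightness
    return max(0, min(255, round(out)))


def prep_pixel(rgb, brightness=40, contrast=1.8):
    """Grayscale first, then brightness/contrast (hypothetical settings)."""
    return adjust(to_gray(*rgb), brightness, contrast)


if __name__ == "__main__":
    paper = (200, 200, 190)   # dull, slightly yellow page under indoor light
    ink = (60, 60, 70)        # dark gray text
    print("paper:", to_gray(*paper), "->", prep_pixel(paper))
    print("ink:  ", to_gray(*ink), "->", prep_pixel(ink))
```

The point of turning the contrast "way up" is visible here: the dull page gets pushed to pure white while the text gets darker, which is also what lets the JPEG compress so well afterwards.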

Now things are set up and you will be able to convert your pictures to PDF whenever you like by the means below. You won’t have to repeat the steps above.

If things don’t go right when setting up, make sure the files are all pointing to the right locations, the correct folders, and the correct names for the actions in Photoshop and the action set they are saved in.
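The clean-up step at the end of the workflow is the one most likely to bite you if a path is wrong, since it deletes files. As a more cautious alternative to a bare “rm”, here is a minimal Python sketch that only removes image files sitting directly inside a folder. The function name and the list of extensions are my own assumptions, not part of the Automator workflow.

```python
from pathlib import Path

# Extensions treated as disposable images (an assumption; adjust to taste).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def clean_folder(folder):
    """Delete image files directly inside `folder` and return their names.

    Subfolders and non-image files are left alone.
    """
    removed = []
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            path.unlink()
            removed.append(path.name)
    return removed
```

You would call it once for each folder, for example `clean_folder(Path.home() / "Convert")`, where the location is whatever you chose in steps 1 and 2.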

Going from iPhone Pictures to Readable PDF

1. Take pictures of the document or book in decent lighting. Tap on the screen to focus if it is not focusing properly. It would be nice to put together a copy stand to hold the iPhone up while you take pictures, but I’m not that kind of hacker.

2. Import your pictures of the documents/books from your iPhone into iPhoto or, via Image Capture, into your computer somewhere. I don’t recommend importing the pictures from the iPhone directly into the “Convert” folder, as the copying process is slow and the script seems to speed ahead of the copying and ends up producing incomplete PDFs.

3. Open Photoshop. The script should launch it, but I find it works better when it is already open.

4. With Photoshop open, drag and drop your images (or a copy of them, by holding down the option key) into the “Convert” folder. This triggers the Automator workflow, which runs the Photoshop action CreatePDF, which in turn runs PrepPDF on each picture found in the Convert folder and dumps the processed results into the Converted folder. When that is done, the Automator workflow takes the processed images in the Converted folder, creates a PDF out of them, and deletes all the images in both folders so that they are clean and ready for the next job. The PDF will be found on the Desktop (this old Automator action seems to be broken in Snow Leopard and I can’t get it to save the PDF anywhere else).
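The incomplete PDF problem mentioned in step 2 is a race between the copy and the script. My workflow does not guard against it, but one common way to do so is to wait until the files in the folder stop changing size before processing anything. Here is a minimal sketch in Python using only the standard library; the function names, the polling interval, and the number of checks are all my own guesses.

```python
import os
import time


def snapshot(folder):
    """Map each file name in `folder` to its current size in bytes."""
    return {
        entry.name: entry.stat().st_size
        for entry in os.scandir(folder)
        if entry.is_file()
    }


def wait_until_stable(folder, interval=1.0, checks=3, timeout=120.0):
    """Return True once the folder is non-empty and its file names and sizes
    have been identical for `checks` consecutive polls; False on timeout."""
    deadline = time.monotonic() + timeout
    previous, unchanged = snapshot(folder), 0
    while time.monotonic() < deadline:
        time.sleep(interval)
        current = snapshot(folder)
        if current and current == previous:
            unchanged += 1
            if unchanged >= checks:
                return True
        else:
            unchanged = 0          # something is still being copied
        previous = current
    return False
```

A script would call `wait_until_stable(...)` on the Convert folder first and only hand things over to Photoshop once it returns True. It is a heuristic: a copy that stalls for longer than `interval * checks` seconds would still fool it.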

With this I have been able to, even while standing in the stacks of my library, whip out my iPhone and, holding the book open, snap pictures of an interesting chapter etc. and process them quickly and easily into PDFs once I get home. Here is one short example of a PDF created from some pictures taken in the stacks with my iPhone.

Update: If you are converting a lot of pictures into a single PDF, the AppleScript in the first step of the workflow can time out. I added two lines to my workflow to increase the timeout from the default two minutes to 10 minutes:

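-- Substitute the names of your own Photoshop action and action set below.
tell application "Adobe Photoshop CS3"
  -- Allow up to 600 seconds (10 minutes) instead of the default 2 minutes.
  with timeout of 600 seconds
    do action "PrepSave" from "Default Actions"
  end timeout
end tell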
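If you would rather not hard-code the timeout, here is a hedged Python sketch that builds the same AppleScript text with the timeout scaled to the number of pictures. The five seconds per image is purely a guess on my part, and the function names are hypothetical. The script text is only generated here; `run_script` hands it to the `osascript` command-line tool and will only work on a Mac with Photoshop installed.

```python
import subprocess


def timeout_for(n_images, seconds_per_image=5, floor=120):
    """Guess a timeout in seconds; never go below AppleScript's 2-minute default."""
    return max(floor, n_images * seconds_per_image)


def build_script(action, action_set, app="Adobe Photoshop CS3", timeout_s=600):
    """Return AppleScript text that runs one Photoshop action with a timeout."""
    return "\n".join([
        f'tell application "{app}"',
        f"  with timeout of {timeout_s} seconds",
        f'    do action "{action}" from "{action_set}"',
        "  end timeout",
        "end tell",
    ])


def run_script(script):
    """Run the AppleScript via osascript (macOS only)."""
    return subprocess.run(["osascript", "-e", script], check=True)


if __name__ == "__main__":
    # Print the script for a 200-picture job instead of running it.
    print(build_script("PrepSave", "Default Actions",
                       timeout_s=timeout_for(200)))
```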
Workshop, 01 May 2010 10:49 pm

I recently broke down and got an iPad. I use it mostly for reading PDFs on the run, watching movies, taking notes (with an external Bluetooth keyboard), and studying my daily flashcards.

After trying (and writing reviews of) many different flashcard programs over the years, and even designing some of my own many years ago, I have become a loyal daily user of an open source project called Anki (read my review here). It is, in my opinion, the best program around that uses “spaced repetition” or interval study to prompt you only to review information that you are on the verge of forgetting. It helps me keep up on vocabulary in various languages and even serves as a kind of daily “meditation of repetitive action” for me.

I can use Anki on my iPhone/iPad through a browser based script called iAnki but there were some things about the layout of the iAnki plug-in which I didn’t think worked well for the big screen of the iPad, which is now my primary way of studying vocab decks when I’m out of the house.

I made some changes to the HTML in the plug-in that I think work better for me. These include:

1. Increasing the font sizes of several fields.

2. Removing the “Show Answer” button and making most of the screen function as a “Show Answer” button so you don’t need to reach and hit the button.

3. Moving the 1 and 3 buttons to the left edge where I can easily reach them while holding my iPad.

4. Moving the 2 and 4 buttons to the right edge where I can easily reach them.

For anyone out there also using iAnki on an iPad who wants to try my hack, here is what you do:

1. Download the hacked template here.

2. Unzip it and use it to replace the existing ianki.html file that is in the iAnki plugin folder. For example, on my Mac the old ianki file is found:

~/Library/Application Support/Anki/plugins/ianki_ext/templates/ianki.html

Replace that file with the new one you downloaded.

3. Open up Anki, launch the iAnki plug-in, and install it on your iPad (you’ll need to install and bookmark it again if you had it installed already).
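If you want to be able to undo step 2, it is worth keeping a copy of the original template before overwriting it. Here is a small Python sketch of that swap. The function name and the “.bak” suffix are my own choices, and the templates folder is whatever the path above is on your machine.

```python
import shutil
from pathlib import Path


def replace_template(new_file, templates_dir, name="ianki.html"):
    """Back up the existing template (once), then copy the new one into place."""
    target = Path(templates_dir) / name
    backup = target.with_name(target.name + ".bak")
    if target.exists() and not backup.exists():
        shutil.copy2(target, backup)   # keep the untouched original
    shutil.copy2(new_file, target)
    return target
```

On my Mac the call would look something like `replace_template("ianki.html", Path.home() / "Library/Application Support/Anki/plugins/ianki_ext/templates")`, run from wherever you unzipped the download. Because the backup is only made when no “.bak” file exists yet, running it twice will not overwrite the original with the hacked version.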

If you use Anki, please support Damien’s programming efforts in Japan with a donation and congratulate him on his recent marriage.

Academia and Thoughts, 01 May 2010 07:36 pm

I found lots of interesting book offerings in the Routledge Asian Studies catalog I got in the mail today. Government and Politics in Taiwan is out in paperback; I’d love to learn a bit more about that. Oh, $43 seems a little much for a paperback. Legacies of the Asia-Pacific War looks interesting. Hmm, $125 seems a little unreasonable for a 240 pager, even if it is hardback and all. Ooh, Debating Culture in Interwar China, ah but, this 176 page book is $130. The Third Chinese Revolutionary Civil War, 1945-49 seems right up my alley, but $160 for that 224 page book is out of my range and is probably not where your average library would want to invest. But don’t worry, you can buy a Kindle version of the book for only $127 at Amazon! Hey, a four volume set on Imperial Japan and the World 1931-1945 looks fantastic, and looks to include a collection of influential historical essays on the topic. Oh, these four books will set you back $1295.

It is true Routledge is worse than many publishers, but this is beyond ridiculous. I’m fortunate enough to have access to Harvard libraries until I graduate (fingers crossed) next year, but the chances are very good that whatever libraries I can find nearby throughout the rest of my career are the kind who cringe at these prices. I don’t really blame the publishers, though. They are just trying to make a buck in a tough industry with books that have very low chances of selling more than a few copies here and there.

However, I do blame academia for making book publishing such a central part of career advancement. I really wish it would support a wider range of formats and a completely digital, open access, but peer-reviewed world of scholarly interaction, given the increased potential it offers for informed readers outside our small academic world to participate more actively in the process.

Perhaps my expectations are too high, but even if monograph-length publications of the traditional variety are here to stay, can someone tell me why we can’t do something like this:

1. Scholar gets an annual personal publication fund from the department, its size based on multiple variables, including, perhaps, an evaluation of past publications and the department’s commitment to supporting research in a tough field that is poorly funded by grants and professional associations.

2. Scholar writes a manuscript (a book, an article, but also other multi-media or film projects etc. ought to be included).

3. Scholar submits manuscript to a professional association along with small administration fee for free distribution of work to readers (or viewers, etc.).

4. Professional association finds some qualified unpaid anonymous readers for the work to evaluate its quality and distributes copies to them (the way publishers do now).

5. Readers return an evaluation that concludes refuse, revise, or publish with some indication of what relative importance the work has in terms of its contribution to the field from their perspective.

6. If it passes peer review, the professional association gives the scholar back the evaluation reports and an official endorsement (which can be used to promote the work once “published”) and, if funding is available, offers some amount of money towards publication of the work. The size of the offer depends on the relative importance its readers attribute to the work, the association’s own further evaluation, and its budget for the year.

7. If the work passes peer review and the money offered by the professional association is sufficient for publication, proceed to step (9). Otherwise,

8. If the offered money is insufficient for publication costs or the professional association refuses to endorse it, and the scholar does not wish to make up the difference from her/his personal publication fund, they then repeat steps (3) to (6) seeking help from other professional associations whose evaluation of quality will add to the prestige and funding of the work, or other funding sources (departmental, university, other institutions) until they get enough money in offers or they revise or abandon the research project.

Once the scholar has decided that they have enough support from professional associations, grants, further departmental support, or contribution from their annual personal publication fund they proceed with publication and spend their funds in the following manner:

9. (Optional) Pay a lump sum to a publisher-consultant who handles the administrative tasks and payments in steps (10) to (13) below, if the scholar doesn’t want to deal with them personally or through someone at their own institution hired specifically for this task. There is to be no transfer of copyright away from the scholar either way, and this publisher-consultant does not have any role in determining whether or not something gets published. In this model the publisher is an administrator who has contacts for managing the steps below.

10. Pay for X hours of labor to hire an editor-consultant to help improve the language and writing of the manuscript beyond the quality of its academic content.

11. Pay for Y hours of labor to hire a designer-consultant to create the print and digital presentation for the work (for desktop/mobile web browsers and e-reader applications).

12. Pay $Z for the fees to have the metadata for the work permanently indexed and its files hosted in multiple online depositories, including important information on its peer-reviewed endorsements and positive/negative evaluation reports.

13. (If you really want to make a paper version) submit the print formatted version of the work to all the major online print-on-demand services where anyone can order a cheap paper copy, including both libraries and average readers.

Here are some of the strengths of a system like this:

-It leaves the copyright in the hands of the author, who will hopefully release the text with a Creative Commons license for maximum distribution and use.

-It imagines a new and powerful role for professional associations, or at least a transformation of traditional journal editorial boards/networks into more broadly defined associations who continue to have, among their primary duties, the evaluation of scholarly work in their field.

-It recognizes that publishing, even of digital or print-on-demand works, can be a costly process involving many hours of labor beyond that of the author and the anonymous readers.

-It leaves peer review intact, but shifts it from publishers to professional associations which should themselves proliferate in number and each will naturally develop differing perceived standards of quality and funding sources. With the decline of traditional academic publishing, these organizations should receive funding from universities and outside grant institutions or at least provide them with recommendations of where their funding should go.

-It allows for multiple sources of funding: professional associations that participate in the peer review process, scholars’ own annual publishing funds, and further grants from universities or other institutions.

-Since personal or departmental funds may end up partly or completely funding the publication of works that were poorly evaluated in the peer-review process and couldn’t get financial support based on their quality, it does little to stop bad research from getting published. It does, however, prevent such works from creating a burden on traditional publishers, who currently pass that cost on to the consumers of information. Publishers now play no part in the selection process and have no stake in the success of a publication; the publisher, editor, designer, and digital index/content hosts are all paid for their work regardless. Also, since such poor quality publications will not be able to promote themselves by showing that they have the endorsements and positive evaluations of reputable professional associations, they will simply get cited less and can get filtered out in various ways during the source search process. However, even bad works or ones on extremely obscure topics can sometimes be useful, if only for a footnote or two that turns us on to a good source.

In this system what is the role for traditional academic publishing companies as they exist now?

None. Universities that support many of them should eventually dissolve them, but support them long enough to allow a relatively smooth transition for their employees to find niches in the businesses that should grow from providing the services in steps (9) to (12). Paper printing of books should all be done through print-on-demand services as the print medium slowly declines. Marketing/promotion of the traditional kind will ideally become a minimal part of the equation as association endorsements and evaluations become the dominant stamp of quality and citation networking power comes to rule the day. Of course, you can add a “marketing” budget for promotion and advertising between steps (11) and (12) above if such funds are available, but hopefully this will be seen as a practice resorted to mostly by those who failed to receive strong endorsement from professional associations. No one promotes our journal articles, so why should we treat our academic books and other projects differently? If a work gets cited, read, and referenced, is that not enough to ensure its spread, especially if the works are openly available and thus offer no barrier to access?

Now, tell me why can’t this work? Why won’t something similar to this emerge from the ridiculous state of academic publishing today when it really wakes up? Let me know what you think.
