
4.3 Lei ZHANG: A Bridge of Understanding: the importance of the Internet in facilitating the most important translation community in China

Lei Zhang discusses Yeeyan, the largest social translation community on the Internet, which he co-founded. The Yeeyan community translates more than 150 articles daily, ranging from news stories and opinion pieces to scientific papers, from various languages into Chinese.

What kind of content is available on the Internet in China? According to Wikipedia, more than 80% of web content is in English (German follows at 4.5% and Japanese at 3.1%). LZ highly doubts the accuracy of these numbers, but remarks that they still point to the dominance of certain languages in the Internet world. To answer this question, LZ ran queries in search engines and examined the languages of the results. “Breast cancer” returns 38 million results in English, but only 6 million in Chinese. So there is a huge amount of content that needs to be translated into Chinese, and machine translation is inadequate to solve this language barrier problem.

Yeeyan takes a community translation approach to this problem; it is essentially a “Wikipedia for translation,” with over 40,000 translations, 8,000 translators, and 80,000 members. LZ saw the problem as an “understanding divide,” e.g. the 2008 issues around Tibet and nationalism, or conservatism in economic policy (which, in China, in fact means supporting anti-free-market positions). Translation is not a full solution, but it is a necessary first step in bridging the understanding divide.

The Guardian Chinese Edition is another Yeeyan project powered by community translation.

4.2 Roger DINGLEDINE: Circumvention technology and its role in China

Tor is free software that you can use to connect to other sites on the network, even if the network doesn’t want you to. It comes with a specification and full documentation.

Tor has about 1,500 volunteers serving as active “Tor relays,” allowing others to reroute their traffic through them. There are more than 200,000 active users with more than 1 Gbit/s of traffic, and funding has come from the US Dept. of Defense, Electronic Frontier Foundation, Voice of America, Human Rights Watch, Google, and NLnet.

Tor deals with “anonymity,” “privacy,” “network security,” and “reachability,” serving different interests for different user groups (government, private citizens, businesses, and blocked users). Few circumvention tools provide privacy, security, and anonymity in addition to circumvention; “circumvention” here addresses Internet filtering (as opposed to Rebecca MacKinnon’s concerns about web site censorship). RD will focus on these functionalities that Tor provides.

The goal for Tor is to distribute the relays over multiple hops, decentralizing trust so that no one intermediate hop knows who is talking to whom over a sustained connection. Tor provides three anonymity properties: (1) a LAN attacker can’t learn or influence your destination (useful for blocking resistance); (2) no single router can link you to your destination (no signing up relays to trace users); and (3) the destination can’t learn your location (so it can’t reveal you or treat you differently).
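The multi-hop idea can be illustrated with a toy model. The sketch below is not Tor's protocol or code: the names `Layer`, `wrap`, and `route` are hypothetical, and plain nesting of Python objects stands in for the per-relay encryption a real system would use. It only shows the property described above, that each relay learns its previous and next hop and nothing more.

```python
from dataclasses import dataclass

@dataclass
class Layer:
    """One onion layer (hypothetical). In a real system `inner` would be
    encrypted so that only the relay holding this layer could read it."""
    next_hop: str   # where the relay should forward the inner payload
    inner: object   # another Layer, or the final message

def wrap(path, destination, message):
    """Build nested layers from the inside out, one layer per relay in `path`."""
    hops = path[1:] + [destination]      # what each relay forwards to
    payload = message
    for next_hop in reversed(hops):
        payload = Layer(next_hop=next_hop, inner=payload)
    return payload

def route(client, path, onion):
    """Peel one layer per relay and record what each relay could observe."""
    views = {}
    previous, payload = client, onion
    for relay in path:
        views[relay] = (previous, payload.next_hop)   # all this relay learns
        previous, payload = relay, payload.inner
    return views, payload

path = ["relay1", "relay2", "relay3"]    # made-up relay names
views, delivered = route("alice", path, wrap(path, "example.org", "hello"))
for relay, (came_from, goes_to) in views.items():
    print(f"{relay}: sees {came_from} -> {goes_to}")
```

In this toy run the first relay sees the client but not the destination, and the last relay sees the destination but not the client, which is the "no single router can link you to your destination" property in miniature.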

Tor can’t solve all problems, but it is sustainable: it is based on a community of volunteers and developers and on an open design. On using Tor in oppressed areas, RD claims that as the firewall cracks down more and more, there will in fact be more Tor users, and more of them will be “ordinary” people who want to keep doing what they used to do.

Another note: publicity attracts attention. The publicity attracted by censors threatens the impression of control, which is arguably as important to censors as the actual control, if not more so. We therefore control the pace of the arms race: we are not “doing against China,” but instead writing software that others can use for their own purposes.

Next steps: Tell the right people, keep working on the details.  Again, technical solutions will not solve the whole censorship problem (especially in countries where firewalls are socially very successful).  But a strong technical solution is still a critical piece of the puzzle.

4.1 Bruce ETLING, John KELLY & Rob FARIS: Mapping the Chinese Blogosphere

RF: Will frame the study.  Visualizes a first-generation 3D rotating map of the Chinese blogosphere.

JK: Visualizes social network diagrams of 12 different languages.  Some visual representations of languages (for Russia)–represented by concentrated, polarized areas of color–are platform-specific; others (for Arabic, Persian) are much more distributed.  

The Chinese blogosphere is a mix: it is concentrated in some areas (i.e. it is still platform-specific) but over a spread of “trading zones” (e.g. business bloggers, patriotic bloggers, bloggers based on Sohu.com or ycool.com).  Different cuts of the visualization can be via layers of traditional or simplified characters. Cluster focus index graphs show how proportionately terms are used in a given cluster relative to everyone else. The visualization also shows links and tags (e.g. tags for technology, social, news, and politics). 
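The talk did not give a formula for the cluster focus index, so the following is only one plausible reading of "how proportionately terms are used in a given cluster relative to everyone else": the term's share of words inside the cluster divided by its share across all blogs. The function name `focus_index` and the toy token lists are invented for illustration, and posts are assumed to be already split into tokens (Chinese text would need a word segmenter first).

```python
from collections import Counter

def focus_index(term, cluster_posts, all_posts):
    """Share of `term` among cluster tokens divided by its share overall.

    A value above 1 means the cluster uses the term more than the
    blogosphere as a whole; below 1 means less. This ratio is an
    assumed definition, not the one used in the study.
    """
    cluster_counts = Counter(tok for post in cluster_posts for tok in post)
    all_counts = Counter(tok for post in all_posts for tok in post)
    if all_counts[term] == 0 or not cluster_counts:
        return 0.0
    cluster_share = cluster_counts[term] / sum(cluster_counts.values())
    overall_share = all_counts[term] / sum(all_counts.values())
    return cluster_share / overall_share

# Toy, invented data: one "business" cluster and two other posts.
business = [["stocks", "market"], ["market", "policy"]]
others = [["film", "music"], ["music", "market"]]
everyone = business + others

print(focus_index("market", business, everyone))  # above 1: over-represented
print(focus_index("film", business, everyone))    # 0.0: absent from cluster
```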

Larger zones are in business and culture; interestingly, on one side are pro-state bloggers, and on the other side are overseas communities. In the middle is critical discourse, and those are the blogs that get blocked.

Panel 4 – Civil Society in China: Challenges and Opportunities

We’re on break now, but will soon be hearing from our Panel 4 participants:

  • Bruce ETLING, John KELLY & Rob FARIS, Harvard University 
  • Roger DINGLEDINE 
  • Lei ZHANG 
  • Isaac MAO
  • Respondent: Amy E. GADSDEN, University of Pennsylvania
  • Moderator: Kenneth FARRALL, University of Pennsylvania

Liveblogging is scheduled to begin at 3:30pm (you can also watch the webcast live, among other ways of staying connected with the conference’s proceedings). Full biographies of panelists are still available here.

Panel 3 Q&A

Q: Others have mentioned that the haphazardness of blogging censorship seems to be one of its strengths. What do the panelists think of this haphazardness: is it a weakness or in fact a strength?

RM: In short, yes. The fact that the lines are not clear certainly does make censorship more effective: it can have a chilling effect on users trying to post elsewhere, or cause companies to overcompensate because the directives they receive from the State Council and other departments are generally vague. Companies therefore err on the side of caution, leading users to do the same.

On the other hand, some bloggers are very active (posting to multiple blogs) and determined, using these inconsistencies to their advantage.  Clever and educated bloggers know the weak points or incompetencies that they can use to their advantage, but the general public does not.

Another audience member concurs, saying that he or his friends also post to multiple blogs.


Panel 3 Respondent – Carolyn Marvin

Prof. Marvin thanks the panelists for a fascinating and informative set of presentations, and for the innovative tools and research methods used to analyze aspects of the Chinese internet.

She wants to step back from the speech zones or Golden Shield, and look at a more civilizational level to understand the implications of these discussions. While the presentations represent only a fraction of the papers, they signify important issues concerning the role that the political will to expand digitization plays in Chinese societies. The presentations also point us to the dynamism of the technological sphere, and the importance of creating metaphors and meaningful theoretical frameworks to understand the future flow of information and communication.

3.4 Dave LYONS: China’s Golden Shield Project: Myths, Realities and Context

Lyons will examine the role of the police (rather than companies) in censoring material in China. The Golden Shield Project began its exploratory phase around 1998, and has been called “the great firewall” by many.

Lyons’s point is that the Golden Shield Project (GSP) is not the Great Firewall.  His interest is in the development of technology in bureaucratic processes; applied here, he examines Jon Agar’s The Government Machine to study what computers and IT do as a tool of government. 

The Golden Shield is a project conducted from roughly 1998 to 2008 to bring computers to the police at all levels of security in China. Many of the roles of the computer were not related to the internet (e.g. population management). The idea in the population management area was to have a national database record for every single Chinese individual in the system, by registering individuals at the local, then provincial, then national level, with correspondingly less granularity at each level. The 1st-generation ID card, launched in the 1980s, sometimes had names written by hand; the 2nd-generation ID card is now digitized with RFID chips to be more secure.

Lyons specifically focuses on forgery, which has been a serious problem in China.  Forgers readily and publicly advertise their services.  The largest project of GSP has a lot to do with how to accurately identify citizens.  It will help in tracking dissidents who are targeted, but the forgery industry is so robust that when forgers become hackers, the government may find no small amount of resistance.

Another project of GSP is building surveillance technologies through computer technologies; given that other countries have similar surveillance technologies, Lyons sees China as having caught up in aggregating data for surveillance and security purposes.  China has been a decade behind other countries in police computing, but it is catching up quickly.

3.3 Rebecca MACKINNON: China’s Censorship 2.0

MacKinnon will examine one particular type of censorship in China.

Censorship in China is usually categorized in two ways: (1) censorship outside the “great firewall” – filtering of websites outside of China; (2) censorship inside the “great firewall” – deletion of content on domestic or commercial sites; takedown of domestically hosted sites; shut-down of data centers.  Circumvention tools that Hal and Ethan just discussed mainly address the former type of censorship.  

The difficulty, however, comes with the latter type, when internal sites and servers are shut down. MacKinnon’s work looks at this type of censorship. Research on censorship inside the great firewall is slim; a 2006 study (with MacKinnon) compared which search engines seemed to be removing more or fewer results, and discovered a surprising variety. Search engines were not censoring uniformly, in the same way or for the same things, indicating that the companies were making internal decisions in reaction to government demands and implementing those demands differently. Nart Villeneuve at the University of Toronto followed up on this study, studying search engine transparency in China.

The study: Given the wide variety of how search engines censored search results, she and her co-researchers hypothesized that blogging services would show a similar variance in their censorship decisions.  The study posted content that ranged in sensitivity across 15 different blog services.  

Results of study: Censorship varied even more than expected in the blogosphere. A great deal of sensitive content was getting through, but in very different amounts depending on the blogging service. The most heavily censoring service censored 60 out of 108 blog posts; the least censoring service censored only one.

The results also revealed different types of censorship behavior: (1) The blogger is prevented from posting at all; (2) post is “held for moderation” (in which case it sometimes would and sometimes wouldn’t appear); (3) post is not visible to public, but only to the author when logged in; (4) published, but then removed within 24 hours; and (5) geo-filtering of sensitive posts (MSN only). 
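As a rough illustration of how such test results could be coded and compared, the sketch below encodes the five behaviors listed above as an enum and computes the censored share for the two extremes reported in the talk (60 of 108 and 1 of 108). The names `Outcome`, `tally`, and `censored_percent` are hypothetical, the `PUBLISHED` baseline is an added assumption, and counting "held for moderation" as censored is a simplification, since such posts sometimes did appear.

```python
from enum import Enum

class Outcome(Enum):
    PUBLISHED = 0                # assumption: baseline for posts left untouched
    BLOCKED_AT_POSTING = 1       # (1) blogger prevented from posting at all
    HELD_FOR_MODERATION = 2      # (2) post held for moderation
    VISIBLE_TO_AUTHOR_ONLY = 3   # (3) visible only to the logged-in author
    REMOVED_WITHIN_24_HOURS = 4  # (4) published, then removed within 24 hours
    GEO_FILTERED = 5             # (5) geo-filtering of sensitive posts

def tally(outcomes):
    """Count posts whose outcome is anything other than PUBLISHED."""
    return sum(1 for o in outcomes if o is not Outcome.PUBLISHED)

def censored_percent(censored, total):
    """Censored share of test posts, as a percentage with one decimal."""
    return round(100 * censored / total, 1)

most = censored_percent(60, 108)   # most heavily censoring service
least = censored_percent(1, 108)   # least censoring service
print(most, least)
```

On these figures the gap is roughly 56% versus under 1% of test posts, which is the variance the study highlights.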

Very sensitive posts would sometimes get through, yet news agency articles that included the names of leaders (e.g. Hu Jintao’s pep talk to Olympic athletes) would be censored. Why the variation? Potential theories: the relationship of local city, provincial, or state officials with blog editors; different methods of implementation.

Conclusions. Domestic censorship is not centralized: it is often outsourced by the government to the private sector, which in turn makes its own choices. The system of “managing” user-generated web content in China follows a similar logic and approach to the system for controlling professional news media. While the survey should be improved and applied more systematically to a broader range of subjects (e.g. web service company employees), it is a helpful starting point for additional research in China and in other countries, and has potential activist implications as well. For example, perhaps we should fund more than circumvention tools, and also support other efforts such as raising bloggers’ awareness of varying censorship practices.

3.2 Hal ROBERTS & Ethan ZUCKERMAN: Circumvention Landscape Report: Methods, Tools, and Uses

Roberts and Zuckerman presented their research on the use of circumvention tools for internet filtering.

Roberts first notes that the circumvention tools addressed here will not focus on filtering at the local-publisher level in the way that Rebecca MacKinnon addressed in her talk. Instead, they will address circumvention at an international and IP network level, through DNS (Domain Name System) queries, and content-based circumvention through keywords on the online network.

All circumvention tools use similar mechanisms, be it relaying traffic between the user and server via a proxy (for IP and DNS filtering) or encryption (for keyword filtering). Circumvention also faces several challenges, including ensuring performance, developing sophisticated ways of keeping the proxies themselves from getting blocked, and building trust among users of the tools.
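The two mechanisms can be seen in a toy simulation. Everything in the sketch below is hypothetical: the blocklists, the host names, and the helper functions are invented, and `toy_encrypt` is a trivial XOR obfuscation standing in for real encryption such as TLS, not a secure cipher. The filter only sees the host the user connects to and whatever bytes are readable on the wire.

```python
from dataclasses import dataclass

BLOCKED_HOSTS = {"blocked.example"}   # hypothetical IP/DNS blocklist
BLOCKED_KEYWORDS = {"forbidden"}      # hypothetical keyword list

@dataclass
class Packet:
    visible_host: str   # the host the filter sees the user talking to
    visible_body: str   # the content the filter can read on the wire

def filter_allows(packet):
    """Toy filter: block by destination host, then by keyword in the body."""
    if packet.visible_host in BLOCKED_HOSTS:
        return False
    return not any(k in packet.visible_body for k in BLOCKED_KEYWORDS)

def toy_encrypt(text):
    """Stand-in for real encryption: XOR each byte and hex-encode."""
    return bytes(b ^ 0x5A for b in text.encode()).hex()

def direct(host, body):
    """The user connects straight to the destination."""
    return Packet(host, body)

def via_proxy(proxy_host, host, body, encrypt=False):
    """The filter sees only the proxy's address; the real host rides inside."""
    inner = f"GET {host} {body}"
    return Packet(proxy_host, toy_encrypt(inner) if encrypt else inner)

print(filter_allows(direct("blocked.example", "hello")))
print(filter_allows(via_proxy("proxy.example", "blocked.example", "hello")))
print(filter_allows(via_proxy("proxy.example", "blocked.example", "forbidden topic")))
print(filter_allows(via_proxy("proxy.example", "blocked.example", "forbidden topic", encrypt=True)))
```

The proxy alone defeats the host blocklist, but a keyword in readable content is still caught until the traffic is encrypted. The sketch also hints at the challenge mentioned above: adding `proxy.example` to the blocklist would stop every proxied request.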

Roberts highlights three different models of hosting: (1) centralized hosting (e.g. UltraSurf), which has high trust; (2) p2p hosting (e.g. Psiphon, although the software has evolved since the time of testing); and (3) algorithm-based routing (e.g. Tor, also discussed in more detail in a later panel), which allows hosting more servers, but makes it harder to find people on the servers.

Zuckerman then pointed to a few lessons learned based on their initial study.  First, user models proved to be a complex and underemphasized aspect of their research.  The funders of the study approached the project from the perspective of human rights activists (how can Burmese journalists stay secure as they report?), so they were looking for the most secure version of circumvention tools.  Other users, however (high school students trying to get into Facebook), have a demand for less secure tools with lower stakes.  So one weakness of the report is failing to reflect the mix of different user models for these circumvention tools.  The current six criteria are utility, usability, security, promotion, sustainability, and openness.  Zuckerman suggests that the report can be improved by including better criteria based on different circumvention needs.

One interesting finding was that the usage of single-hop proxies was increasing, and many users were looking for single-hop proxies. There were enough proxies that the right to be listed at the top of a proxy directory page could be sold. So the next round of work should be more egalitarian, looking at a much wider range of tools and proxies.

The next step in the project is to study what people are actually doing with their proxy servers by examining data from proxy servers or cyber cafes, analyzing search data for proxy servers, or conducting surveys within human rights and blogger communities.  Zuckerman and the research team plan to collect behavioral data to determine user models, and apply their analysis within those models and a new set of tools.

3.1 LIAO Han Teng: Special Speech Zones in the Chinese-Written Internet

Han-Teng Liao will be focusing on theories of user-generated content, and specifically two major cases of user-generated content: (1) Baidu, and (2) Wikipedia. William Chang, chief scientist of Baidu, stated at the WWW2008 conference in Beijing that there is “no reason for China to use Wikipedia.”

He first argues that the debate over freedom v. control (“zhi” v. “luan”) has been neutralized by the government to some extent. Beijing has successfully replaced control with freedom. According to a 2007 survey, over 80% of the Chinese people prefer government control, based on the idea that freedom leads to chaos or “luan” (e.g. Tiananmen Square protests; Taiwan’s democracy and freedom that is perceived as chaotic).

So we have reached a deadlock of sorts about these theories; Liao is therefore trying to modify the conceptual framework of control and freedom, looking at it instead as “zoning tech” v. “dynamic order.” Free speech zones involve actual walls (e.g. in property zoning cases), and the government treats freedoms as exceptions. In “zoning technologies,” some free actions are allowed, but the mutual adjustment is among states, market players, and individuals. In contrast, a dynamic order emerges from individual free actions and mutual adjustments to one another based on a diverse set of principles. Order can thus emerge from free mutual adjustment online. Instead of the “great firewall,” perhaps we can modify the metaphor to be a “great canal + a great dam.”

Reframing the question: How do we read the order online?  Beijing’s involvement has an impact on the dynamic order; perhaps the question is less about the dynamic order and more about the market order.  Perhaps the relationship is more divisive and less integrative.

Why Baidu Baike and Chinese Wikipedia?  Baidu is a rare and favoured-by-Beijing competitor to Wikipedia.