Lost in translation

Language serves as a measure of culture and inclusion in the world of Wikipedia. Yet this is trickier than we think.

On its website, the Wikimedia Foundation states that it ‘provides the essential infrastructure for free knowledge’. A ticker runs across the screen, espousing this sentiment in four languages, including English and Hindi. At the bottom of the PC page-view, as well as at the top right-hand corner, are five language options. In her presentation at SOAS in mid-February, Miriam Redi, a researcher at the foundation, spoke about how they were looking at driving greater diversity. Understanding the variegated ways in which we perceive beauty in images, and incorporating them into the Wikipedia algorithms, was one project. Providing content in different languages was another. For Wikipedia, then, language appears to be an epistemological axis of both access and, by extension, inclusion.

Do different Wikipedias translate to different cultures?

In a 2015 study, Gloor et al. considered whether there were cultural differences between the English, Chinese, Japanese, and German Wikipedias. In their own words, the research analyzed the ‘historical networks of the World’s leaders since the beginning of written history, comparing them in the four different Wikipedias’. They did this by considering all people who made it into Wikipedia by fulfilling its notability criteria, and then establishing their social networks by considering links between people who were alive at the same time. Thus for a given time period, they arrived at those ‘notable’ people who were most ‘networked’. As an example, an English Wikihistory of 0 BC lists Augustus, Paul the Apostle, Tiberius, and Mary (Mother of Jesus) as having the largest network. Gloor et al. then ranked all such leaders across all times, for each of the four Wikipedias. Two key findings emerged.

First, the Chinese and Japanese Wikipedias mostly had famous warriors and politicians in the top ten, while the English and German ones were more balanced, with around half of the top ten, as well as half of the top fifty, being religious leaders, artists, or scientists. Second, and perhaps more strikingly, 80% of the top 50 leaders in the English Wikipedia were not English, while just two non-Chinese leaders made it to the top 50 in the Chinese Wikipedia. The German and Japanese Wikipedias were slightly more balanced, with about 40% of the top 50 not being German or Japanese, respectively. The bottom line, then, was that language could indeed be taken as representative of culture, at least in the Wikiworld. Read the detailed study here.

But if language is a metaphor for culture, then the natural route to greater cultural inclusivity would be to have a great many Wikipedias, in a great many languages.

The fatality of languages and imagined communities.

Writing about how print capitalism helped shape the modern nation-state, Benedict Anderson highlighted that ‘almost all modern self-conceived nations … have national print-languages … many of them have these languages in common, and in others only a tiny fraction of the population uses the national language in conversation or on paper’ (1991: 46). In other words, the ability to connect with hundreds, thousands, and millions of others through ‘unified fields of exchange and communication’ or simply, a common, interstitial language situated above the variegated spoken vernaculars and below the high tongue, allowed a unified ‘consciousness’ to be imagined (1991: 44). This was the nation-state, the imagined community.

Anderson argued that there was always an element of ‘fatality’ to language, which he explained as the ‘general condition of irremediable linguistic diversity’ (1991: 43). He posited that this also led to Foucauldian languages-of-power. For there were dialects closer to the print-language which influenced its final form, and there were those which lost out. And once thus established, languages as means of domination were and still are, consciously exploited.

Source: Graham et al. 2014, cited by vox.com

Anderson’s point is not too difficult to see or (allowing for a play on the word) imagine. The map above represents how inclusive Wikipedia is (or is not). Articles about most of the European countries, for example, are written in their (primary) languages. Articles about Mongolia, though, are written in English. But if pluralism is to be driven by language, the question is simply: how many?

Mind your language. And mine. And hers. And his. And everybody else’s.

Perhaps before trying to address the question of how many languages, there is merit in a quick infographic and map-based look at our linguistic (and if we can therefore extrapolate, cultural) diversity.

A Twitter map of New York City in 2012-13
Source: vox.com

The map above depicts tweets in different languages in New York City in 2012-13. The most common language for tweets originating from New York City was English, represented by grey dots. What is interesting, however, are the other languages, and what they reveal about linguistic diversity and its pockets.

Front (obverse: top) and Back (reverse: bottom) views of an Indian currency note, equivalent to INR 50
Source: shutterstock.com.

On the Indian currency note here, two languages feature prominently: English and Hindi. Yet if you look closely enough, a boxed column in the middle of the reverse of the note (the bottom image) carries the denomination of the legal tender in 15 additional languages. vox.com estimates that India lost over 220 languages in the last 50 years, and that today it speaks about 780.

Into a different Wikiverse: an alternative ontology of cultural inclusion

In 2014-15, Siobhan Senier got her students to engage in the process of curating, debating, and adding content on Wikipedia about Native American authors from New England, US. Writing about the experience, she says:

Crowdsourced knowledge presents itself as contingent, as always subject to further input and revision. Wikipedia changes to reflect not only changing facts, like shifting national borders; it has the potential, at least, to reflect shifting intellectual paradigms. In this respect, wikis are not unlike oral traditions.

Senier 2015: 42

If wikis are like oral traditions, then what does this say about how the same Wiki content is perceived in different cultures? This is where Eduardo Viveiros de Castro’s account of Amerindian thought comes into play. Castro (1998) contrasts Western cosmologies of ‘multiculturalism’, which predicate one natural world and a multitude of cultures, with the Amerindian (as well as Indian) conception, which supposes a plurality of bodily existence but a unifying culture. He writes:

…an ethnographically-based reshuffling of our conceptual schemes leads me to suggest the expression, ‘multi-naturalism’, to designate one of the contrastive features of Amerindian thought in relation to Western ‘multiculturalist’ cosmologies. Where the latter are founded on the usual implication of the unity of nature and the plurality of cultures – the first guaranteed by the objective universality of the body and substance, the second generated by the subjective particularity of spirit and meaning – the Amerindian conception would suppose a spiritual unity and a corporeal diversity. Here, culture or the subject would be the form of the universal, whilst nature or the object would be the form of the particular.

Castro 1998: 470

Castro explains this further by positing a humanistic condition or cosmology, where animals are humans in their own perception, but specifically non-human in form. They have their own societies, food, and culture, as ‘jaguars see blood as manioc beer … fur, feathers, claws, beaks etc. as body decorations or cultural instruments’ (1998: 470). Multi-naturalism may at first appear difficult to grasp, yet examples of it abound in contemporary cultural constructs. Think of The Jungle Book, Winnie the Pooh, and the Panchatantra (a series of ancient Indian fables).

What this suggests for our question of how many languages Wikipedia needs in order to be more culturally inclusive is, in fact, a restating of the problematic itself. The question becomes not one of language as the nodal representation of culture, but of what the content implies or means for different cultures, even if culture is represented by language. Thus an article on a particular topic on the English Wikipedia could have (for example) stubs for what it could mean for users of the different dialects of English, in England, the United States, and even India or Australia. In a sense then, Wikipedias in different principal print-languages could remain the first method of being more inclusive, but the contextually and culturally different meanings which content carries, as appropriations of Bourdieuesque habitus, could be further elaborated along the lines of the close and not-so-close ‘less-powerful’ languages and dialects which contribute to the principal print-language.

Of course, it may not be as straightforward as I imply above. As an example, and to begin with, the number of Wikipedias could be expanded to include Mongolian, and Wiki contributions encouraged therein. In this way, articles about Mongolia could in fact be made available in Mongolian. A second order of elaboration in Oirat, Buryat, and Mongolic Khamnigan could then provide a contextually relevant meaning to these articles. Elements of language-as-power would still remain, of course, in the form of the Mongolian Wikipedia as the dominant, print-language equivalent. Additionally, there may be a need to hybridise the second order of elaboration with factors such as geography, religion, or tribe. Yet I argue that the question of inclusion on a platform such as Wikipedia is beset with the same element of fatality that Anderson (1991) argued for language. Its irremediable diversity is but its nature, and this stands culturally irreconcilable with the objective of including one and all.


This post is inspired significantly by a conversation during the Q&A session following Miriam Redi’s presentation at SOAS in mid-February. Specifically, I would like to thank Gitika Saksena (@gitikasoas) from the seminar cohort for suggesting, during the session, an exploration of Wikipedia inclusivity through Amerindian Perspectivism as opposed to language alone.

Additional resources

Ever wondered how many languages exist out there? And where? Langscape is probably your best bet. Try out their interactive map(s) to zoom in on different parts of the world.

Most Indian children have grown up knowing one or more of the animal-fables told in the Panchatantra. Watch an animated version here. The animation may look dated, but the essence of the stories is very much there.

Language, of course, is not the only source of bias as far as Wikipedia goes. Read ‘Is Wikipedia Biased?’ by Shane Greenstein and Feng Zhu in The American Economic Review, Vol. 102, No. 3, 2012 for a political take on bias in Wikipedia.

