Engines of Collaboration: A Look Under the Hood of Wikimedia
usinfo | 2013-01-04 16:51
For many people, Wikipedia is a black box. You put in a query and get out information. Those who understand Wikipedia as a volunteer-driven project might put it differently: You put in a bunch of smart people, and you get an encyclopedia. But what’s the black box, and how does it work?
 
A large part of Wikipedia’s success can be attributed to its social policies and principles, and perhaps we’ll explore those in a future post. Today, I’d like to take a look at the key technical mechanisms underlying the encyclopedia and its sister projects. Wikipedia is a wiki, a database open to revision by anyone. The “edit” link gives you instant write access to the contents of almost any article. How can this fundamental openness result in anything useful? I’d say the following technical mechanisms are critical:
 
Wiki syntax. This is the code wikis are written in. It’s simpler than HTML, but more complex than plain text. If you wanted to substantially contribute to Wikipedia, you’d have to learn at least the basics of wiki syntax, but there are plenty of help pages and tutorials to get you started.
 
Eternal memory. Wikipedia preserves a record of every change ever made to an article, allowing editors to instantly revert changes if they want. (Jon Udell’s Heavy Metal Umlaut video is a good visual explanation of this principle.) Beyond content changes, even administrative actions like deletions or user blocks can always be undone and are fully logged.
 
Total surveillance. OK, I’m exaggerating a bit for dramatic effect; we take privacy very seriously. But all changes to Wikipedia can be directly linked to the user account or IP address that made them, and we have tons of tools that help us in the day-to-day patrolling of the content that goes into Wikimedia projects.
 
Discussion pages. A social tool, the discussion page associated with every article is critically important for developing consensus in decision making.
 
Users as toolmakers. One of the cool things about wikis is that users can create their own processes. For example, one of our key quality assurance processes, “Featured article candidates”, is nothing but a wiki page where users nominate articles as high quality, and discuss these nominations. More on empowering users below.
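
To make the “eternal memory” and “total surveillance” mechanisms a little more concrete, here is a minimal sketch in Python. It is not how MediaWiki is implemented; the class names, fields, and sample edits are all hypothetical. It only illustrates the idea that every change is stored with its author, and that a revert is just another logged edit.

```python
from dataclasses import dataclass, field


@dataclass
class Revision:
    text: str      # full article text at this point in time
    author: str    # account name or IP address that made the change
    comment: str   # edit summary


@dataclass
class Page:
    title: str
    history: list[Revision] = field(default_factory=list)

    def edit(self, text: str, author: str, comment: str = "") -> None:
        # Append a new revision; earlier revisions are never overwritten.
        self.history.append(Revision(text, author, comment))

    def current_text(self) -> str:
        # The newest revision is what readers see.
        return self.history[-1].text if self.history else ""

    def revert_to(self, index: int, author: str) -> None:
        # A revert is itself a new edit that restores an older text,
        # so the unwanted change stays visible in the history.
        old = self.history[index]
        self.edit(old.text, author, f"Revert to revision {index}")


# Hypothetical usage: an anonymous edit is undone by another user.
page = Page("Example article")
page.edit("A sourced sentence.", "Alice", "Start article")
page.edit("VANDALISM", "192.0.2.1")
page.revert_to(0, "Bob")
```

After the revert, the page shows the original sentence again, while the history still holds all three revisions together with who made them.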
 
It’s instructive to compare Wikipedia to its predecessor. Nupedia was Jimmy Wales’ first encyclopedia project, and it failed dramatically. Unlike Wikipedia, Nupedia implemented a rigorous, top-down peer review process, and when the project was quietly discontinued in 2003, it had produced a mere 24 articles. Wikipedia’s openness is the key to its success, and it is counterbalanced by the tools and policies regulating all changes to the content. But who controls the engine that makes it all work?
 
Code is Law
I’ve always found the word “software” to be somewhat ridiculous: there’s nothing particularly soft about it, nor is it any kind of “ware”. Computer programs dominate so many aspects of our daily life today, yet we hide them in artificial obscurity. They are tools, sure, but they also have a regulatory function, especially in social spaces. The possible interactions of any online community are deeply affected by the computer code that underpins it. I prefer the word “code” to “software”. As scholar Larry Lessig observed, computer code is comparable to legal code in its effects on (networked) society.
 
That makes it doubly important that code can be inspected, looked at. This is the core code that runs Wikipedia today. It is known as “MediaWiki”, a deeply misguided play on the word “Wikimedia”. The code is available under a free software / open source license, known as the GNU General Public License which, once again, allows anyone to share and modify it, provided they make all their own changes freely available.
 
The code is written in a programming language called PHP, which is also free and open. It’s also free to learn how to use it. This means anyone with the time and inclination can contribute to making the Wikipedia code better — browse around the MediaWiki website for more information.
 
And this is exactly what’s happened. For most of its history, Wikipedia has had no paid employees. Recently, the Wikimedia Foundation hired its two most prolific volunteer programmers, Brion Vibber and Tim Starling. Their contributions are immense, and there are countless other individuals and companies working on the code as well. Perhaps I’m exaggerating, but I often say that the MediaWiki software is as important to the future of free knowledge and open learning as the Linux kernel is to the future of computing.
 
Donating to the Wikimedia Foundation will allow us to hire more developers to systematically improve the MediaWiki codebase in key areas which, in turn, will improve the encyclopedia and its sister projects. But before I elaborate on some of the things we could do in the future, it might help to explain how a few key technological changes shaped our projects in the past.
 
Milestones of MediaWiki
Wikipedia, today, has plenty of multimedia. Images in particular adorn hundreds of thousands of pages, some of them of truly brilliant quality. This wasn’t always the case, and there were a few key improvements to our codebase that led to an explosion in the use of images on the site. For example, in March 2004, it became possible to automatically generate small and large versions of images, and features for galleries as well as vector graphics support followed.
 
In September 2004, we created a multimedia repository called Wikimedia Commons, which now hosts more than 2 million freely usable pictures, sound files, and videos. Technically, one of the keys to its success was the ability to instantly embed any image from Commons on any Wikimedia project in any language. Very recently, Tim Starling implemented an embedded video and audio player, and the number of videos and sounds embedded into articles has grown substantially since.
 
Another critical change was the implementation of a new categorization system in summer 2004, led by Magnus Manske and Brion Vibber. Today, we have a gigantic categorical index. When the category system was first implemented, it was fascinating how a single feature change led to an explosion in content: in just a few days, thousands of categories were created out of nothing.
 
In order to make Wikipedia available in many languages, and to improve its usability, an undeniably critical feature was the ability to edit all user interface texts (like the links in the left-hand sidebar on Wikipedia) through the wiki itself. But we take this principle of openness to revision even further: Our software can be reprogrammed by anyone, directly through the wiki! Don’t believe me? Take a look at Lupin’s navigation popups tool, which fundamentally changes the way you browse Wikipedia.
 
How is it done? Essentially, our software allows you to tell your web browser (Firefox, Internet Explorer, or whatever) to execute a little script whenever you visit a Wikipedia page. These programs can be enormously complex, and make Wikipedia much friendlier to use. Of course, for security reasons, none of these scripts will be run unless you follow the explicit instructions to activate them.
 
Once again, code is law: If we had not given our users the ability to write these scripts, they would never have been created, and consequently, Wikipedia would be a different place than it is today. These are just a few examples, and you can read more about the evolution of MediaWiki in its Wikipedia article. Now imagine what we could do if we employed not two, but 10 software developers. I’ll help.
 
The Future of Collaboration
Mind you, I’m not suggesting that our codebase should not continue to be improved through massive volunteer collaboration. In fact, I believe much of our effort should be focused on integrating and improving the work of others. After reading the above, it should not come as a surprise that MediaWiki can be heavily customized with plug-ins that add additional functionality. They are different from the browser-side scripts I referenced above, and potentially much more powerful still.
 
Take a look at the vast number of extensions out there. Some have enormous potential: The Semantic MediaWiki extension, for example, alters the way wikis handle structured data like the numerical information in infoboxes you find in Wikipedia articles. Imagine if you could use Wikipedia not just as an encyclopedia, but as a giant database, searchable in every conceivable way: “Show me countries with a population smaller than 10,000.” “Show me the latest albums by punk rock bands.” “Generate a graphical timeline of all Roman emperors.”
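
As a rough illustration of what “Wikipedia as a database” could mean, here is a small Python sketch of the first query. It does not use Semantic MediaWiki or its query syntax; the `query` helper and the country records are invented placeholders, standing in for values that would really come from article infoboxes.

```python
# Hypothetical infobox data; names and numbers are invented placeholders.
countries = [
    {"name": "Exampleland", "population": 8_500},
    {"name": "Samplestan", "population": 1_200_000},
    {"name": "Testonia", "population": 950},
]


def query(records, **conditions):
    """Return the records for which every named condition holds.

    Each keyword argument maps a field name to a test function
    that receives the field's value and returns True or False.
    """
    return [
        record
        for record in records
        if all(
            key in record and test(record[key])
            for key, test in conditions.items()
        )
    ]


# "Show me countries with a population smaller than 10,000."
small = query(countries, population=lambda p: p < 10_000)
names = [country["name"] for country in small]
```

The point is only that once the numbers in an article are stored as typed fields rather than as free text, questions like this become a simple filter.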
 
Or, if that doesn’t excite you, how about making Wikipedia more user-friendly? LiquidThreads, a project I am involved in, reinvents discussion pages to make them much simpler to use. There have also been many attempts to build rich-text editors for Wikipedia. Personally, I think (due to the complexity of everything that we can do with our current wiki syntax) it will take a very substantial investment of resources to really push usability a large step forward, but there are always incremental improvements we can make with less effort.
 
There are other cool extensions which have been lingering, sadly, unused for years. For security reasons, we have never deployed WikiTeX, which would make it easier for our editors to add musical scores, graphs and plots, chemical formulae, and similar content to Wikipedia articles.
 
In many of these cases, what is needed is a final push: security and scalability work, integration, testing, documentation. In other words, the parts of the work that are least exciting. Frequently, authors of MediaWiki plug-ins only seek to satisfy their own personal needs, by getting the extension to run in an independent wiki environment they have created. That’s why the Wikimedia Foundation needs to be able to put some money into adapting and implementing the best and most significant tools.
 
There are also internal strategic priorities, projects that are so important we can’t necessarily depend on volunteers to make them happen. Here are a few:
 
Flagged Revisions. This toolset will allow us to empower contributors to identify the versions of Wikipedia articles that are known to be of high quality. Readers can then choose whether they want to see the very latest version of an article (which might contain vandalism), or only the most recently reviewed one. Finalizing the implementation of FlaggedRevs is part of our quality initiative. But to give you an idea how limited our resources are, we had to pull our developers off this project just to make sure that we could get the technical work for this fundraiser done! Truly, every donation would help us in our ability to execute key initiatives like this, making Wikipedia more useful and better for you.
 
Cross-project integration. Right now, every single Wikimedia project has a separate user account database. Want to fix an article from the German Wikipedia? If you only have an account in the English one, you’ll have to create a new one! This is not an easy problem to solve: thousands of account names exist in multiple projects, so we need to merge identical accounts and split non-identical ones. Fortunately, much work on this has already been done, but more remains. And once the account databases are integrated, there is potential for many more exciting features, like the ability to change content in Wikinews from Wikipedia, to upload pictures to Commons directly from Wikibooks, etc. In this way, we can bring the family of Wikimedia projects much closer together.
 
Wiki-to-print and export technology. Right now, we’re not offering a lot of tools to make it easy for you to print or download collections of articles. This will soon change, through an exciting collaboration that will be announced within the next few weeks. It will make it easy to download high quality PDFs of selected articles. We’re aiming to also support export to word processor formats. But even this is only the beginning: there’s a whole bunch of tools that would make it easier for the Wikibooks project to create high quality, open access textbooks. This technology is key for the developing world, so that we can distribute free knowledge in whatever formats are most helpful.
 
Mix & Burn Wikipedia. Related to the above, we want to make it possible to easily create your own Wikipedia/Wikimedia DVD or USB stick, either including all articles, or a selection. This would require a reader application that runs without Internet access. Fortunately, there are already many projects in this space that, once again, just need a final push. Now, imagine that such an application would not only make it possible to read articles, but also to change them and to synchronize the changes back once you have an Internet connection: this would enable us to make participatory Wikipedia terminals anywhere in the world.
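
Of the priorities listed above, Flagged Revisions is perhaps the easiest to picture in code. The following Python sketch is not the FlaggedRevs extension itself; the names are hypothetical, and falling back to the latest revision when nothing has been reviewed yet is my own assumption. It only shows the reader's choice between the newest version and the most recently reviewed one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Revision:
    rev_id: int
    text: str
    reviewed: bool = False  # set by a trusted contributor (hypothetical flag)


def revision_to_show(
    history: list[Revision], prefer_reviewed: bool
) -> Optional[Revision]:
    """Pick the revision a reader sees; history is ordered oldest to newest."""
    if not history:
        return None
    if prefer_reviewed:
        # Walk backwards to find the most recently reviewed revision.
        for revision in reversed(history):
            if revision.reviewed:
                return revision
    # Assumption: with no reviewed revision (or no preference),
    # show the very latest one.
    return history[-1]


# Hypothetical history: the newest edit has not been reviewed.
history = [
    Revision(1, "Good text", reviewed=True),
    Revision(2, "Better text", reviewed=True),
    Revision(3, "VANDALISM"),
]
stable = revision_to_show(history, prefer_reviewed=True)
latest = revision_to_show(history, prefer_reviewed=False)
```

Here a reader who prefers reviewed content gets revision 2, while a reader who wants the very latest version gets revision 3.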
 
Once again, these are just a few examples. I believe that the future of collaboration is much greater still: there will be real-time collaboration on articles, even on images and video. Wikipedians will talk to each other via Voice over IP while editing articles, and Wikiversity could become a global free institution of learning using the same tools for global teacher/learner interaction, connecting people who have knowledge with those who seek it. Wikinews could turn into a global virtual newsroom, making it possible to instantly record any event as a “citizen journalist”, and to collaborate with others to tell the full story.
 
Our donation banner proclaims: “You can help Wikimedia to change the world”. Indeed, by supporting us in this fundraising drive, you will allow us to do more than just keep Wikipedia running. A donation to the non-profit Wikimedia Foundation is a donation for the future of learning. Every donation helps, and if you want to make a major gift, please contact us at: majordonors AT wikimedia DOT org
 
Erik Möller has been a Wikipedia contributor since 2001, and was elected to the Wikimedia Foundation Board of Trustees in 2006. This post is a personal opinion, and does not represent an official statement from the Wikimedia Foundation.
 
This entry was posted on Sunday, November 25th, 2007 at 2:38 am and is filed under MediaWiki, Wikimedia, Wikipedia.
 