[Web4lib] German Wikipedia to be published in book form?
Lars Aronsson
lars at aronsson.se
Wed Apr 23 00:42:39 EDT 2008
B.G. Sloan wrote:
> Editors will distil 50,000 of the most popular entries in the
> German version of Wikipedia into the 1,000-page volume to go on
> sale in September.
The German Wikipedia contains more than 700,000 articles, many of
which are quite long. What they're doing here is picking the
50,000 most visited articles, based on available visitor
statistics, and extracting the first paragraph or sentence from
each article. Even though 1,000 pages makes quite a thick volume,
every page needs to fit 50 articles (50,000 / 1,000), so they
can't be very long. You learn that the Titanic was a ship that
sank, but not much more.
This sounds like something you could do with a Perl script in an
afternoon. In that respect, it's a neat hack. The hard part is to
weed out the articles that happened to contain vandalism at that
point in time, and to list all the authors in a way that satisfies
the GNU Free Documentation License (GNU FDL).
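
A rough sketch of the "afternoon script" idea, in Python rather than
Perl. The input shapes (one dict of article texts, one dict of visit
counts), the function names, and the naive sentence-splitting rule are
all assumptions made for illustration. It does not attempt the hard
parts mentioned above: vandalism checks and GNU FDL author lists.

```python
import re


def first_sentence(text):
    """Return the first sentence of the first paragraph of an article.

    Naive rule (an assumption): a sentence ends at '.', '!' or '?'
    followed by whitespace. Abbreviations such as "z. B." will cut
    the sentence short, so real data would need a better splitter.
    """
    first_paragraph = text.strip().split("\n\n", 1)[0]
    parts = re.split(r"(?<=[.!?])\s+", first_paragraph, maxsplit=1)
    return parts[0]


def build_digest(articles, visits, top_n):
    """Pick the top_n most visited titles and keep one sentence each.

    articles: dict mapping title -> full article text (hypothetical input)
    visits:   dict mapping title -> visit count (hypothetical input)
    Returns a list of (title, first sentence) pairs in alphabetical order.
    """
    ranked = sorted(visits, key=visits.get, reverse=True)[:top_n]
    return [
        (title, first_sentence(articles[title]))
        for title in sorted(ranked)
        if title in articles
    ]


if __name__ == "__main__":
    # Made-up sample data, only to show the call.
    sample_articles = {
        "Titanic": "The Titanic was a ship that sank in 1912. It was large.",
        "Berlin": "Berlin is the capital of Germany.\n\nHistory follows.",
        "Obscure": "An article almost nobody reads. It goes on.",
    }
    sample_visits = {"Titanic": 900, "Berlin": 1200, "Obscure": 3}
    for title, summary in build_digest(sample_articles, sample_visits, 2):
        print(f"{title}: {summary}")
```

For the real book, top_n would be 50,000, and the visit counts would
come from whatever visitor statistics are available.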
The exciting part is that Bertelsmann (the German media giant that
owns Random House), already a major publisher of encyclopedias, is
putting its name behind this. As a surprise move, it parallels the
deal that BMG, the Bertelsmann Music Group (now part of Sony BMG),
made with Napster some years ago.
--
Lars Aronsson (lars at aronsson.se)
Aronsson Datateknik - http://aronsson.se