[Web4lib] Web4lib: Wikipedia

Dan Brickley danbri at danbri.org
Thu Mar 18 07:38:27 EDT 2010


On Thu, Mar 18, 2010 at 3:30 AM, Tim Spalding <tim at librarything.com> wrote:


> > Wikipedia. It's a flaw in the page ranking algorithm, in that in general,
> > numbers of sheer links will overwhelm any measure of "authority."  WHO is
> > linking to that link matters; the algorithm does not.
>
> It's repeatedly misstated, but the PageRank algorithm does not measure
> the *number* of links. It measures the authority of pages, as defined
> by the authority of pages that link to it, recursively, with each
> page given a trivial starting PR. So, a link from the New York Times
> is probably a hundred million times more valuable to a site than a
> link from Joe's blog.(2) There is no Turk in the machine, checking
> credentials, but to describe it as a game of sheer numbers isn't to
> describe it accurately.
>

..ooOO('perhaps if folk could lower themselves to reading Wikipedia ....?')

From http://en.wikipedia.org/wiki/PageRank:
"Page C has a higher PageRank than Page E, even though it has fewer links to
it; *the link* it has is of a much higher value"

(there's also some text in the page that doesn't strike me as clearly
written, I must say)

The Wikipedia entry also quotes from Google:

"Google describes PageRank:

PageRank relies on the uniquely democratic nature of the web by using its
vast link structure as an indicator of an individual page's value. In
essence, Google interprets a link from page A to page B as a vote, by page
A, for page B. But, Google looks at more than the sheer volume of votes, or
links a page receives; it also analyzes the page that casts the vote. Votes
cast by pages that are themselves "important" weigh more heavily and help to
make other pages "important"."
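
To make the "votes from important pages weigh more" point concrete, here is
a minimal sketch of the recursive idea as a plain power iteration. It is not
Google's actual implementation: the damping factor of 0.85, the fixed
iteration count, the handling of pages with no outgoing links, and the toy
link graph (pages "b", "c", "e", "p1".."p4") are all assumptions made up for
illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Approximate PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    The damping value and iteration count are illustrative assumptions.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Every page starts with the same trivial rank.
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its own rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Page with no outgoing links: spread its rank over all pages.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: "c" has a single inbound link, from the heavily
# linked page "b"; "e" has four inbound links, all from obscure pages.
links = {
    "b": ["c"],
    "c": ["b"],
    "e": ["b"],
    "p1": ["b", "e"],
    "p2": ["b", "e"],
    "p3": ["b", "e"],
    "p4": ["b", "e"],
}

ranks = pagerank(links)

# Count raw inbound links for comparison with the computed ranks.
inbound = {p: 0 for p in ranks}
for targets in links.values():
    for t in targets:
        inbound[t] += 1

for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, inbound[page], round(ranks[page], 3))
```

In this toy graph "c" ends up ranked well above "e" despite having one
inbound link against four, which is the same situation as Page C and Page E
in the Wikipedia example.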


>
> Tim
>
> 1.  Google sometimes mixes some Google Books data in, but not so much
> for the simple reason that Google Books isn't a place to read books.
> It's a place to search for snippets and a place to buy the book and
> have it shipped to you—activities most people aren't looking for when
> they make a search.
> 2. A standard personal blog will have a PR of maybe 2. PageRank is a
> log score. Assuming it's a log 10—and it's now much higher—then the
> NYT's PageRank is 10 million times that of a personal blog.
>
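
The arithmetic in footnote 2 can be checked with a short sketch, under the
footnote's own assumption that the published score is a base-10 log. The
blog score of 2 comes from the footnote; the score of 9 for the NYT is my
assumption, chosen only because it is what the "10 million" figure implies.
The function name rank_ratio is hypothetical.

```python
def rank_ratio(score_high, score_low, base=10):
    """Ratio of underlying rank implied by two log-scale scores.

    Assumes the published score is log_base of the underlying rank,
    which is the footnote's working assumption, not a documented fact.
    """
    return base ** (score_high - score_low)


# 9 for the NYT is an assumed value; 2 for a personal blog is from the footnote.
ratio = rank_ratio(9, 2)
print(ratio)
```

A larger log base would make the gap wider still.
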
>
> _______________________________________________
> Web4lib mailing list
> Web4lib at webjunction.org
> http://lists.webjunction.org/web4lib/
>
>

