Site Scraping Gets Rough

March 2009

Why this matters:  traffic equals ad revenue, and scraping is getting riskier.

Last month, the New York Times Company settled a suit brought by Gatehouse Media Inc., which runs websites for 125 Massachusetts newspapers.  The NYT's Boston Globe was essentially scraping the Gatehouse sites.

Technically (and this distinction is important), the Globe site was sending readers TO the article's own page on the Gatehouse site.  Gatehouse complained that those readers were bypassing the ads on its home page.  This is interesting.

Intuitively, one would think that a large number of readers who were returned to the Gatehouse site (albeit at a subsidiary page) would in fact go on to the Gatehouse homepage.  But no, Gatehouse wanted more (rightly or wrongly).  It also turns out that Gatehouse could figure out how to block this process, which probably led to the NYT offering to settle, so as to avoid case law that goes against it.

So what?

Sites regularly scrape or otherwise link to other sites, usually to the subsidiary pages.  We get a lot of people asking us if they can do it.  Well, this case (though settled, and therefore not an opinion for purposes of precedent) suggests, to no one's surprise, that doing so will subject you to a legal challenge that will cost a lot to defend.

It makes sense, too.  Again, no opinion as to whether it is right or wrong, legal or not, but common sense should tell us that people who own the rights and go to the trouble of posting content where they want it posted should be able to control access to it.

Of course, Google is another matter.

And, by the way, there is already case law in Europe in support of Gatehouse's position, so where it is done will also make a difference.


