ISBN.nu becomes lost and found in Google gaffe

Tips on how to deal constructively with Web site issues

I thought last week that I’d completed my series on ISBN.nu, an online book-price comparison service that gets 135,000 of its pages into Google.com’s index. But that was before the relationship between the Web service and Google became breaking news.

One day after my Feb. 21 issue revealed that ISBN.nu stores fewer than 4,000 pages on its server — the other 131,000 pages are dynamically generated whenever a search-engine spider (or a human) follows a link — almost all of the site’s pages suddenly disappeared from Google’s index. When I checked the link (reproduced below) that shows the number of ISBN.nu pages that are in Google’s index, the total had dropped to a mere nine.

ISBN.nu Webmaster Glenn Fleishman initially thought that a lower-level functionary at Google had heard about my story and decided to ban the site. That didn’t sit right with Fleishman. He says he’s spoken personally with Google’s top executives over the years. As he describes it, Google has no problem indexing dynamically generated pages, as long as the content a spider sees is exactly the same as what a human would see. Many database-driven sites legitimately generate pages upon demand rather than storing every conceivable page on a server’s hard disk.
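For readers who build database-driven sites, here is a minimal sketch of that principle in Python. It is not ISBN.nu's actual code: the BOOKS catalog, the sample ISBN, the store names, and the render_page function are all hypothetical, made up for illustration. The point it shows is that the page is built on request from stored data, and that the visitor's user-agent plays no part in what comes back.

```python
# Hypothetical in-memory catalog; a real site would query its database here.
BOOKS = {
    "0000000000": {
        "title": "Example Title",
        "prices": {"Store A": 19.95, "Store B": 18.50},
    },
}


def render_page(isbn, user_agent=""):
    """Build a price-comparison page on demand.

    user_agent is accepted but deliberately ignored, so a search-engine
    spider and a human visitor receive identical content.
    """
    book = BOOKS.get(isbn)
    if book is None:
        return "Not found: " + isbn
    lines = [book["title"]]
    for store, price in sorted(book["prices"].items()):
        lines.append(f"{store}: ${price:.2f}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_page("0000000000", user_agent="ExampleSpider/1.0"))
```

Calling render_page with a spider's user-agent string or a browser's returns the same text, which is the condition Fleishman describes.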

Fortunately, the outage was caused by an error in a software routine at Google. The search engine company assures me that ISBN.nu’s pages will return to the Google index within days, if they haven’t by the time you read this.

How Fleishman handled the problem, however, gives us a valuable tutorial. At the same time, the outage reveals how a relatively new “banning” algorithm at Google works. Here’s the story:

1. OPEN COMMUNICATION. When Fleishman noticed on March 6 that his pages were missing from the Google index, he sent polite but concerned e-mails to his contacts at the search engine, whom he’d previously met.

2. PRESS RELATIONS. Fleishman notified me of the problem, after which I sent a separate e-mail to my contacts at Google (on whom I’d paid a courtesy call by coincidence early in February) asking for clarification.

3. ANALYSIS. When a Google spokesman replied that the missing pages were merely caused by a technical glitch, not a political decision, Fleishman analyzed the situation and found that no changes were needed in his database design.

The problem? Each price-comparison page at ISBN.nu includes links to as many as nine different bookstores. With 135,000 pages indexed, that adds up to a lot of links: potentially more than a million. And each link contains essential affiliate code-strings so ISBN.nu can earn a commission if a user winds up buying a book.

As Google’s Nate Tyler puts it, “The problem appears to have something to do with the large number of affiliate redirects, which set off some of our automated technology.” That means a Google software routine guessed that ISBN.nu was a “link farm.” This is a bogus Web ring in which hundreds of sites create hundreds of links to each other, trying to fool Google’s well-known “link popularity” system.

Fleishman reports that the Google blackout caused a slump to 5,000 visitors per day from 9,000 (a 45 percent decline) and a 30 percent to 40 percent drop in his affiliate revenue. He adds that Yahoo recently omitted ISBN.nu for a few days, cutting into a couple thousand referrals per month from that source. This underlines the importance of search engine traffic to some e-business sites, while others are far more reliant upon their own advertising and marketing efforts.
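To check the arithmetic on the traffic figures, here is a short Python sketch. The percent_decline function is my own helper name, not anything from Fleishman's logs. It shows that a fall from 9,000 to 5,000 daily visitors works out to a little over 44 percent, which rounds to the 45 percent cited above.

```python
def percent_decline(before, after):
    """Return the drop from `before` to `after` as a percentage of `before`."""
    return (before - after) / before * 100


if __name__ == "__main__":
    drop = percent_decline(9000, 5000)
    print(f"Visitor decline: {drop:.1f} percent")  # prints "Visitor decline: 44.4 percent"
```

The same function works for any before-and-after pair, so you can run it against your own referral logs the next time a search engine drops your pages.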

If you tried my link last month to the number of ISBN.nu pages at Google, my apologies if it didn’t reveal the 135,000 pages I promised. You might try the link below for a few days to see how many of the site’s pages return to the index as Google’s spider gradually re-crawls the links.

GLENN FLEISHMAN’S COMMENTARY ON THE GOOGLE OUTAGE:

A SEARCH SHOWING THE NUMBER OF ISBN.NU PAGES AT GOOGLE

– – – – – – – – – – – – – – – – – – – – – – – – – – – –

E-BUSINESS TECH REVIEW: EMPTY PRINGLES CAN WIRELESS ANTENNA

Using an empty Pringles potato chip can as an external antenna, malicious hackers can easily find and break into many corporate wireless networks, according to an analysis by I-sec, a security consulting group.

The company drove a car around London’s financial district, using an empty Pringles can to magnify wireless network signals. More than two-thirds of the companies using wireless, or Wi-Fi, had not implemented any encryption features, the security group said. Such networks are vulnerable to bandwidth theft or data intrusion.

The tubular Pringles container makes an effective directional antenna, a design often described as a Yagi antenna. Plans to use this and other devices to identify Wi-Fi signals began circulating on the Internet last year.

BBC News Online says it witnessed I-sec detecting almost 60 unprotected wireless networks in a single 30-minute journey through the canyons of the city. Its report describes the problem and suggests simple solutions.

EMPTY PRINGLES CAN HELPS HACKERS FIND UNPROTECTED WI-FI:

– – – – – – – – – – – – – – – – – – – – – – – – – – – –

LIVINGSTON’S TOP 10 NEWS PICKS O’ THE WEEK

1. Netscape Navigator 6 reads searches, Newsbytes says

2. Thumbnails of online images are OK, court rules

3. Streaming music sites decry arbitrated royalty rates

4. How Miller Freeman’s paper-buying b-to-b makes money

5. Offering free shipping over $99 increases order size

6. Supreme Court may reverse copyright extension

7. Cool: How to build a rotating gallery in ColdFusion

8. Fraud is 19 times more likely online than offline

9. HTML tips: The right way to make rollovers fast

10. Stop hackers from using PayPal to steal your content

– – – – – – – – – – – – – – – – – – – – – – – – – – – –

WACKY WEB WEEK: TABLOID POP BAND SPLIT-UP GENERATOR

Perhaps this could satisfy your site’s need for content. Type a few keywords into a Web form and presto: Popjustice, an irreverent music-indie site, spits out a perfect tabloid article about the impending breakup of any pop band of your choice.

A little cut-and-paste to copy the familiar-sounding rumor and you, too, can look like a music insider. Popjustice’s U.K.-based site is sort of a cross between a fanzine and one of the most elaborate blogs you’ll ever see. Try it out, but be warned: naughty language and juvenile humor, blokes.

POPJUSTICE’S SPURIOUS BAND-SPLIT WEB ENGINE:

– – – – – – – – – – – – – – – – – – – – – – – – – – – –

E-BUSINESS SECRETS: Our mission is to bring you such useful and thought-provoking information about the Web that you actually look forward to reading your e-mail.

ABOUT THE AUTHOR: E-Business Secrets is written by InfoWorld Contributing Editor Brian Livingston. Research director is Ben Livingston (no relation). Brian has published 10 books, including:

Windows Me Secrets:

Windows 2000 Secrets:

Win a gift certificate good for a book, CD, or DVD of your choice if you’re the first to send a tip Brian prints.

Source: www.infoworld.com