LinkedIn’s business model is to charge money for information that users give them for free.
To advertise the availability of such information, they rely on Google for search indexing. Google’s web crawlers only index what they see, so LinkedIn’s servers disable login checks when a page request appears to come from Google.
Digital publications do this too. NY Times, Wall St Journal, Economist. Paywalls come down when Google crawlers arrive. The sites even disable ads to give Googlebot faster page downloads.
If you optimize your site experience for search engine bots, don’t be surprised when your website attracts a lot of bots.
Last week, LinkedIn filed a lawsuit against a hundred anonymous bots. Apparently, people were renting cloud computing services from Google and then running bots to collect LinkedIn user profiles. Because the bots made requests from Google’s servers, LinkedIn mistook them for Google’s web crawler. [1]
Don’t you hate it when you leave the backdoor open for a trusted third party, only to have unwelcome guests invite themselves inside?
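For what it's worth, "the request came from a Google server" and "the request came from Google's crawler" are different claims, and a server can tell them apart. Here is a minimal Python sketch of the usual check: reverse-resolve the client IP, confirm the hostname belongs to a crawler domain, then resolve that hostname forward and make sure it maps back to the same IP. The function name, the injectable lookup parameters, and the suffix list are my own assumptions for illustration; nothing here is LinkedIn's actual code, and the sketch only handles IPv4.

```python
import socket

# Assumption: hostname suffixes that Google's crawlers resolve to.
# Other machines on Google's network would be expected to resolve elsewhere.
GOOGLE_CRAWLER_SUFFIXES = (".googlebot.com", ".google.com")


def _reverse_dns(ip):
    """Return the primary hostname for an IP (PTR lookup)."""
    return socket.gethostbyaddr(ip)[0]


def _forward_dns(hostname):
    """Return the list of IPv4 addresses a hostname resolves to."""
    return socket.gethostbyname_ex(hostname)[2]


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True only if `ip` passes a reverse-then-forward DNS check.

    The lookup functions are injectable (hypothetical parameters) so the
    logic can be exercised without touching the network.
    """
    reverse_lookup = reverse_lookup or _reverse_dns
    forward_lookup = forward_lookup or _forward_dns

    # Step 1: reverse lookup. No PTR record means no trust.
    try:
        hostname = reverse_lookup(ip)
    except OSError:
        return False

    # Step 2: the hostname must sit under a known crawler domain.
    # The leading dot in each suffix rejects look-alikes such as
    # "evilgooglebot.com".
    if not hostname.lower().rstrip(".").endswith(GOOGLE_CRAWLER_SUFFIXES):
        return False

    # Step 3: forward lookup must map the hostname back to the same IP,
    # otherwise anyone controlling their own PTR record could lie.
    try:
        return ip in forward_lookup(hostname)
    except OSError:
        return False
```

A request handler would call `is_verified_googlebot(client_ip)` before lowering the login wall, rather than trusting the User-Agent header or an IP range alone. It costs two DNS lookups per new visitor, so in practice the result would probably be cached.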
This isn’t the first time LinkedIn has tried to stop profile-pulling bots. They filed a similar lawsuit two years ago, against bots running on Amazon’s cloud. The result was that LinkedIn identified a single defendant and settled the case for $40,000, an amount that doesn’t even begin to cover the “hundreds of hours of employee time” that LinkedIn spent investigating the bot activity. [2]
The lawsuits are a losing battle. Anyone can run a bot to pull LinkedIn data. A GitHub search for “LinkedIn scraper” returns sixty open-source tools advertising this exact functionality. Each repository has dozens of contributors and followers, and I bet the closed-source tools do even better.
This is what happens when you make exceptions for a “whitelisted partner”. On the internet, any loophole inevitably turns into open access. Remember that time LinkedIn suffered a data leak, and 117 million users had their passwords and personal information stolen and sold on the black market?
So, LinkedIn, how does unauthorized access feel now?
1. LinkedIn Corporation v. Does, 1 through 100 inclusive, No. 5:16-cv-4463 (US District Court, Aug. 8, 2016)
2. LinkedIn Corporation v. Robocog Inc, No. C14-00068 (US District Court, Mar. 27, 2014)