The “privacy” model has become a default business strategy for new search engines. Why? Because it sells. Given the typically inferior quality of their results, and their often uninspiring user-experience, “private search engines” routinely over-achieve in terms of growth. So unless they have a revolutionary new concept, anyone entering today’s search market is almost compelled to trade on privacy.
WHY “PRIVATE SEARCH ENGINES” NEED THE ENEMY
Creating a system of crawling and indexing an unthinkable volume of web pages, and meaningfully ranking them to prioritise quality, takes an enormous amount of time, money and talent. A level of time, money and talent that new market entrants are not, realistically, going to have.
Even with the groundbreaking genius of Google’s original PageRank concept, it’s still taken the world’s leading search engine two full decades to achieve the technological power it has today. Go back just five years, and you’ll find a noticeably inferior Google Search as compared with today’s equivalent. Go back ten years, and you’ll be tearing your hair out at the amount of useless spam in the results.
So whilst some “private search engines” are quick to headline that they have their own crawlers, and imply that they can index the web themselves, they’re so far behind the vastly rich market leaders technologically and in terms of connectable real-estate, that any search results of their own are uncompetitive in the extreme. More typically, non-existent.
The only solution is for the “private search engines” to strike deals with major search providers, in which the major search provider delivers competitive results to the “private search engine”. It’s great. The “private search engine” now has the benefit of all the investment and talent that went into Google or Microsoft’s development.
The problem? The major search providers are obsessive, irrepressible trackers. And worse, “private search engines” need to make money, so they have to use advertising programmes, which are also provided by – yes, you’ve guessed – obsessive, irrepressible trackers.
WITHOUT TRACKERS, THERE IS NO “PRIVATE SEARCH”
This is where it starts to become difficult for “private search engines” to be dismissive about the relationships they have with those irrepressible trackers. These are not casual relationships – ships passing in the night. The “private search engines” depend on irrepressible trackers for their very existence. Check this link for an insight into what happened to DuckDuckGo when Bing had a service outage. DuckDuckGo likes to pretend Microsoft is just an extra contributor to its results. But the reality is that DuckDuckGo is not fit for purpose without Microsoft. The same goes for Qwant, which again relies on Microsoft. StartPage, meanwhile, has a critical dependency on Google.
Leading “private search engine” DuckDuckGo has also admitted interfacing with other irrepressible trackers such as Yandex and Yahoo, but it’s very difficult to trust what DDG says on the actual operational detail, because its message changes according to what it thinks people want to hear. For example, after telling the public that they use their own crawler (‘cos that sounds reassuring), DDG told a webmaster who wanted to know how to be de-listed from DuckDuckGo’s results, that their crawler doesn’t actually do any crawling, and therefore he’d have to take the matter up with someone else.
THE ENEMY HAS TRACKING IN ITS DNA
“You must not:
xii. Mask or obscure the user agent or IP address of a user requesting Search Results through a Property;”
Oops. There goes the anonymity of the “private search” user. It’s just Bing being Bing through the back door. Unless the likes of DuckDuckGo and Qwant have separate deals, in which Microsoft has agreed to forgo access to user agent and IP address information, that is. Whether or not such separate agreements have been made, we know that as standard, Microsoft insists on collecting user data through its search API – EVEN WHEN THE CUSTOMER IS PAYING.
Not that this is in any way unexpected. We know that Microsoft is a data-obsessed control freak which has turned Windows into a monster of spyware. Windows 10’s data-collecting regime is so deeply rooted into the system that some elements of it can’t be disengaged at all. Microsoft is not asking for data from your PRIVATE space. It’s telling you it’s fucking taking the data whether you like it or not. These companies have the same vision of consent as a rapist.
With Microsoft we’re talking about a company with a totalitarian mentality, which aggressively and threateningly mandates itself access to the computer systems of any businesses who use Microsoft software, to perform what it calls “software audits”. Microsoft is Big Brother. If you’re going for a piss, Microsoft wants to know about it, become involved in it, and ultimately, control it.
So for us, the outsiders, there are blatant red flags in deals between companies who trade on not tracking us, and privacy-rapists whose contempt for public discretion is so great that they actually pretend they don’t understand what “Do Not Track” means. The question arises: if Microsoft, by its own admission, doesn’t even understand what “Do Not Track” means, how can its “private search engine” partners expect it to abide by a “Do Not Track” policy?
Raising those red flags higher up the mast, the “private search engines” don’t want to talk about the deals they have with irrepressible trackers. Some don’t even mention their associations with mega-trackers at all. Oscobo, for example, had no search capability of its own upon launch in 2016, and was cited as exclusively obtaining its results through Bing/Yahoo.
WHAT DATA DO “PRIVATE SEARCH ENGINES” COLLECT?
And by default, DuckDuckGo’s PrivacyPlus browser extension sends the contents of the user’s URL bar to DDG. This is a classic example of a tool marketed as a form of privacy protection actually being a front for under-the-table data harvesting. You can disable the feature and still use the tool, but how many people even know it does that in the first place? Not the majority, I suspect. Accidentally paste the contents of your clipboard into the URL bar (which I’ve done numerous times) and you could be sending DuckDuckGo some very personal information. The same applies to the search box on any search engine. So if they’re storing search data, even without an associated IP address, they’re not really private.
Qwant also admits to storing various data, and even says it shares job application info with its recruitment company ‘partners’ on an opt-OUT basis. I mean, recruitment companies – of all people… If you ever want an inbox full of unsolicited crap that you can’t stop, give your email address to a recruitment company. And I say that as someone who worked in recruitment.
I also found that in ‘503’ redirects, Qwant stores the make and model of your browser in the actual redirect URL. And it wants us to believe it’s not tracking? Umm, yeah, okay then. Another nice trick of Qwant’s is to block proxied searches…
The message openly states that the block was initiated on the basis of location. In other words, IP address. It’s therefore not a question of whether Qwant does or doesn’t grab your IP address. It does. The only question is how long it holds onto it. One second? Ten seconds? Ten minutes? Half an hour? A week? Six months? See the problem?
The message says it’s a spot check, and doesn’t mention it being location-based. But this NEVER, EVER happens unless you’re using a proxied, multi-user IP. So blatantly, they’re monitoring your IP address.
The “private search engines” can say that since they’re paying for their search results essentially per query, they have a right to block what looks like a bot – so that they’re not wasting their money on fake searches. But the fact that they can’t produce something that’s fit for purpose without paying Microsoft or Google really isn’t our problem. If they’re discriminating between bots and humans, they’re profiling the user. If they’re monitoring user locations, they’re creeping. All “private search engines” serve location-tailored results, but the only one I’ve yet found that actually asks the user for their location is the open-source option, Searx.
So now you’ve got the problem of how Microsoft, or Google, or Yandex, can serve location-tailored results through an API if they don’t know the location of the user. If the user didn’t set their location manually, then the only raw basis for setting the correct location will be the IP address. Do we believe that all “private search engines” take the trouble to convert our IP addresses into broad locations, using routines NOT built by trackers, so as to stop the “baddies” having our IP addresses? Or do we think they simply do nothing at all, let the likes of Microsoft have our IP addresses, as MS in any case demands in its Terms, and get our location-tailored results the easy way?
StartPage states that it uses an anonymised country code to obtain location-tailored results. But the question of how StartPage converts the raw IP to a country code still remains. Do they convert using their own software, or use a third party service? This is the problem. They don’t say. And what about other “private search engines”? How do they get their location-tailored results?
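To make the question concrete, here is a minimal sketch of what an in-house conversion could look like, assuming the search engine keeps its own offline prefix table and forwards only a country code upstream. The table, the function names and the parameter names are all hypothetical illustrations, not anything StartPage or any other provider has documented.

```python
import ipaddress

# Hypothetical prefix-to-country table. A real deployment would load a full
# offline database; these prefixes are reserved documentation ranges.
PREFIX_TO_COUNTRY = {
    ipaddress.ip_network("192.0.2.0/24"): "GB",
    ipaddress.ip_network("198.51.100.0/24"): "FR",
    ipaddress.ip_network("2001:db8::/32"): "DE",
}


def country_for_ip(raw_ip, default="ZZ"):
    """Map a raw IP address to a broad country code using only local data."""
    addr = ipaddress.ip_address(raw_ip)
    for network, country in PREFIX_TO_COUNTRY.items():
        # Membership is simply False when IPv4 and IPv6 are mixed.
        if addr in network:
            return country
    return default


def build_upstream_params(query, raw_ip):
    """Build the parameters sent upstream: the query and a country code only.

    The raw IP is used for the local lookup and then dropped. The parameter
    names "q" and "country" are assumptions for illustration.
    """
    return {"q": query, "country": country_for_ip(raw_ip)}


params = build_upstream_params("pizza delivery", "192.0.2.17")
print(params)
```

If the lookup is instead handed to a third-party geolocation service, that service sees the raw IP address, which is exactly the detail the “private search engines” leave unexplained.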
AD CLICKS *MUST* BE TRACKED
But we enter an altogether higher level of questionability when we start to factor in the issue of advertising, and how it works.
Let’s first remind ourselves that the ads on “private search engines” are served by irrepressible trackers and data miners.
Now, let’s accept that if a pay per click ad has no tracking mechanism, neither the publisher nor the ad service provider can be paid. It’s as simple as that. Without tracking, how would it be possible to know where the user came from (i.e. which referrer to pay), whether they’re real or just a bot, whether they bought, what they spent? There has to be an audit trail and verification process, because otherwise advertisers would potentially be giving money to Microsoft or Google for fraudulent clicks, or an uncheckable number of referrals. Anyone who understands online advertising knows that’s not an option. Microsoft and Google have a responsibility to their customer to ensure that ad clicks are not fraudulent. They can’t make that assertion without tracking and monitoring the click-through process.
You might be able to have a private search engine, but you can’t have a private pay per click ad.
WHAT WE KNOW IN SUMMARY
So in summary, what do we know about “private search engines”?
Most if not all of them collect data of one sort or another.
Some collect personally identifying data.
They can’t compete without striking deals with irrepressible trackers, and most can’t exist at all without Microsoft or Google.
In most cases we only have the “private search engine’s” own word on their processes, and there’s no mandatory audit to verify their integrity. I’d suggest that the body most likely to audit search engines that use the Microsoft API, is Microsoft.
The people upon whom the “private search engines” depend for survival are hard-line data miners, who we KNOW do not like being cut out of the data loop. And those data miners do not, it should be noted, use open source tools. Do you think Google and Microsoft share the code of their API routines with “private search” providers? Of course not. The “private search engines” don’t really know what those APIs are doing.
We can always be tracked through unique referral. For example, let’s say we search DuckDuckGo for something unusual, and DuckDuckGo refers us to a page that only gets a few hits per day. There’s Microsoft tracking code on that page. So what Microsoft now has is…
- All the usual info it would collect about you, including IP address and perhaps a personally identifying ID – from your destination site.
- The knowledge that the page only gets around two hits per day.
- The knowledge that the page got a hit at 15:15.
- The knowledge that you hit that page from DuckDuckGo.
- The knowledge that you searched DuckDuckGo for the exact unusual thing on that page at 15:15.
You might as well have used Bing.
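The correlation described above needs nothing cleverer than a join on URL and time. The following is a rough sketch under stated assumptions: both logs, their field names and the one-minute window are hypothetical, invented purely to illustrate the reasoning, and do not represent any real Microsoft data or system.

```python
from datetime import datetime, timedelta


def correlate(search_log, page_hits, window=timedelta(minutes=1)):
    """Pair each page hit with the single search that plausibly led to it.

    A hit is attributed only when exactly one search pointed at the same URL
    within the time window, which is the normal case for a rarely visited page.
    """
    matches = []
    for hit in page_hits:
        candidates = [
            s for s in search_log
            if s["result_url"] == hit["url"]
            and abs(hit["time"] - s["time"]) <= window
        ]
        if len(candidates) == 1:
            matches.append((candidates[0]["query"], hit["ip"]))
    return matches


# Hypothetical records: a query arriving through the search API, and a hit
# recorded by tracking code on the destination page twenty seconds later.
search_log = [{
    "time": datetime(2020, 1, 1, 15, 15, 0),
    "query": "rare widget manual",
    "result_url": "https://example.org/rare-widget",
}]
page_hits = [{
    "time": datetime(2020, 1, 1, 15, 15, 20),
    "url": "https://example.org/rare-widget",
    "ip": "203.0.113.9",
}]

linked = correlate(search_log, page_hits)
print(linked)
```

The quieter the page, the more certain the match, because there is only ever one candidate search inside the window.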
The “private search engines” do not like talking about the detail of the deals they have with oppressive trackers, or the mechanics of the click through process. They may, in fact, be contractually prohibited from talking about the minutiae of such matters.
Criticism of trackers by “private search engines” is often biased in favour of their own ‘life support machine’. For example, DuckDuckGo heavily lays into Google, but is not critical of Microsoft, and ignores the obvious futility of using a “private search engine” within a hell hole of privacy-rape like Windows 10. In fact, DDG even posts instructions on how to add its search engine to Microsoft Edge. It’s happy to promote blissful ignorance in its quest to gain market share.
Because “private search engines” depend wholly on the notion of privacy in order to exist, they CANNOT admit we’re being tracked without going out of business. In other words, they CANNOT admit we’re being tracked. We already know they can be irresponsibly economical with the truth, and that some of them have lied. Would it be that much of a surprise to find others had lied too? In business, a lie that gets found out is never a lie. It’s a “We need to launch an investigation into this, but clearly it’s an error, and should never have been happening”. I would not be at all surprised to find that trackers have access to the IP addresses of “private search” users, and are storing those IP addresses.
I admit, I use StartPage. Not because I trust the company any more than one should trust people one doesn’t know, but because I can use it via a de-centralised proxy without too much hassle, and the results are usable, if not as deeply-trawled as Google’s own. If I need deeper results than those I can find via StartPage, I use Google.
I still, however, believe that the whole business of “private search” needs proper, genre-specific regulation to ensure that the level of privacy the businesses promise their users is not misleading people. At present, I believe that in general it is misleading people.