Bad News For Balderdash

A recent story in New Scientist gives a glimmer of hope for those of us who bemoan the swelling tsunami of claptrap and codswallop that fills the internet:

THE internet is stuffed with garbage. Anti-vaccination websites make the front page of Google, and fact-free “news” stories spread like wildfire. Google has devised a fix – rank websites according to their truthfulness.

What a relief that will be. Of course it may spell doom for the popularity of pseudoscience, conspiracy theories, fad diets, racists, anti-vaccination fearmongers, yellow journalism, Fox News, celebrity wingnuts, psychics and some local bloggers – all of whose sites have an astronomically distant relationship with fact and truth, and depend instead on the ignorance and gullibility of their readers.

Page ranking has historically been based on a complex relationship of mostly superficial factors: links, keywords, page views, page loading speed, etc. About 200 different factors determine relevance and where a page appears in a search. It’s in part a worldwide popularity contest that doesn’t measure content.

Fact ranking – knowledge-based trust – would certainly make it more difficult for the scam artists who thrive because their sites pop up at the top of a search – which many people assume means credibility. But people would actually have to pay attention to trust rankings for them to have any effect. If you’re determined to have your aura read, or communicate with your dead aunt, or arrange your furniture with feng shui, you’ve already crossed the truth threshold into fantasy with your wallet open. Fact ranking won’t help you.

A Google research team is adapting that model to measure the trustworthiness of a page, rather than its reputation across the web. Instead of counting incoming links, the system – which is not yet live – counts the number of incorrect facts within a page…its Knowledge-Based Trust score. The software works by tapping into the Knowledge Vault, the vast store of facts that Google has pulled off the internet. Facts the web unanimously agrees on are considered a reasonable proxy for truth. Web pages that contain contradictory information are bumped down the rankings.
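For the curious, here is a toy sketch of the idea as the article describes it: check the facts a page states against a store of reference facts, then rank pages by how few of them are wrong. This is only an illustration under loud assumptions, not Google's actual system. The names `REFERENCE_FACTS`, `trust_score` and `rank_pages` are hypothetical, the tiny dictionary stands in for the Knowledge Vault, and the neutral score of 0.5 for pages with nothing checkable is my own arbitrary choice.

```python
from typing import Dict, List, Tuple

# A fact is a (subject, predicate, value) triple, e.g. ("Earth", "shape", "flat").
Fact = Tuple[str, str, str]

# Hypothetical stand-in for the Knowledge Vault: a small lookup table
# mapping (subject, predicate) to the value treated as true.
REFERENCE_FACTS: Dict[Tuple[str, str], str] = {
    ("Albert Einstein", "born_in"): "Ulm",
    ("Earth", "shape"): "oblate spheroid",
    ("Water", "boils_at_sea_level_c"): "100",
}


def trust_score(page_facts: List[Fact]) -> float:
    """Return the share of checkable facts on a page that match the reference.

    Facts the reference store knows nothing about are skipped.
    A page with no checkable facts gets a neutral 0.5 (an assumption).
    """
    checked = 0
    incorrect = 0
    for subject, predicate, value in page_facts:
        known = REFERENCE_FACTS.get((subject, predicate))
        if known is None:
            continue  # nothing to compare against, so ignore this fact
        checked += 1
        if value != known:
            incorrect += 1
    if checked == 0:
        return 0.5
    return 1 - incorrect / checked


def rank_pages(pages: Dict[str, List[Fact]]) -> List[str]:
    """Order page names from most to least trustworthy."""
    return sorted(pages, key=lambda name: trust_score(pages[name]), reverse=True)


if __name__ == "__main__":
    pages = {
        "science-site": [
            ("Earth", "shape", "oblate spheroid"),
            ("Water", "boils_at_sea_level_c", "100"),
        ],
        "balderdash-site": [
            ("Earth", "shape", "flat"),
            ("Albert Einstein", "born_in", "Ulm"),
        ],
    }
    for name in rank_pages(pages):
        print(name, trust_score(pages[name]))
```

Even this toy version shows the catch raised later in this post: the score is only as good as the reference store, and anything the store doesn't cover (opinion, satire, poetry) simply can't be checked.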

Garbage online affecting our decisions, our lifestyles, our opinions, our ability to make appropriate judgments, our voting and our critical thinking? Not news. Back in 2004, the Columbia Journalism Review ran a story on the ‘toxic tidal wave’ of lies and deceit affecting the US presidential campaign. One of the points it makes is that we’re awash in digital content, so much so that our ability to sort it out has been hampered by the sheer volume.

More recently, a piece on this new search engine approach appeared in the Washington Post. It opened:

The Internet, we know all too well, is a cesspool of rumor and chicanery.

Don’t we know that! That cesspool was brimming over in the last municipal election campaign. But it’s of course a much wider problem than a few local sycophants spewing their ad hominem trash. It’s ubiquitous online.

…in a research paper published by Google in February… that could, at least hypothetically, change. A team of computer scientists at Google has proposed a way to rank search results not by how popular Web pages are, but by their factual accuracy… At some point, perhaps even Google’s hotly debated and much-studied ranking algorithm — the creator and destroyer of a million Web sites! — could begin including accuracy among the factors it uses to choose the search results you see.

Accuracy and fact as relevant to a website’s ranking? My, that will hurt the purveyors of balderdash: chemtrails, homeopathy, UFOs, alien pyramids in the Antarctic, angels, Bigfoot, Nessie, creationism, psychics, ghosts, politicians… all headed to the bottom of the factuality heap, one hopes.

Google, by the way, has been quietly tweaking its search and auto-complete algorithms for some time now.

One of the most recent changes implemented in Google’s search algorithm targeted the ranking of piracy websites around the world. “We’ve now refined the signal in ways we expect to visibly affect the rankings of some of the most notorious sites,” Google’s senior copyright counsel, Katherine Oyama, said in October.

Back in 2012, Google launched its “knowledge graph” sidebar to provide searchers information and perhaps even answers, not simply links to other pages. Did you notice? It was a subtle – it only shows when Google’s software deems it relevant – but dramatic way to direct your attention away from the simplistic list of links to some actual content.

Hinted at for months, Google formally launched its “Knowledge Graph” today. The new technology is being used to provide popular facts about people, places and things alongside Google’s traditional results. It also allows Google to move toward a new way of searching not for pages that match query terms but for “entities” or concepts that the words describe… Google says it has compiled over 3.5 billion facts, which include information about and relationships between 500 million objects or “entities,” as it sometimes calls them.

Try this: type “Albert Einstein” into Google and you’ll see the sidebar on the top right. Type “vaccination hoax” and none appears. But type “vaccination” alone and a small box appears.

[youtube=https://www.youtube.com/watch?v=mmQl6VGvX-c]

It’s not just Google that is tired of the misconceptions, conspiracy theories, incorrect information and outright lies. In January, the Washington Post also reported Facebook was in on the act:

…Facebook dealt a blow to the fake-news industry that could, in all likelihood, wipe it out: Starting this week, the site will cut down on the number of fake news stories that circulate in its News Feed — and add a warning on hoax stories indicating that they’re fake.

Well, so far I haven’t seen any of FB’s notification or blocking, but then maybe my feed doesn’t get a lot of that particular form of claptrap – it gets games, mis-attributed quotes, pictures of kittens, videos of pigs, New Age angels, idiot quizzes, and faux medical advice, but seldom any of the fake/spoof news.

Maybe my ‘friends’ know more than the average FB user, so they ignore and don’t share that stuff. Or maybe it’s been happening in the background and I never noticed some of the crapola had vanished. The story has a troubling statistic buried in it (emphasis added):

How to explain the peculiar symbiosis of Facebook and the fake-news industry? It derives, to some extent, from the power that Facebook plays in modern Web media more generally: According to Pew, nearly a third of Americans get their news from Facebook… Facebook’s relying in some part on users to spot and report fakes; past experience would suggest users aren’t always very good at that.

All this seems like good news, if truth and fact are what you want in your browser. But there are already sites that specialize in debunking claptrap – PolitiFact, FactCheck.org and Snopes come to mind – and there are numerous science-based and skeptic sites, too. All it has ever taken is for someone to make the effort to check the facts first on these sites before clicking the share button. And millions of us click it anyway, helping spread the nonsense.

How else do you think the anti-vaccination movement grew from a handful of gullible, poorly-educated celebrities to tens of thousands of parents putting their children’s lives at risk? No one would believe this anti-vax codswallop if they actually checked into the lack of science and medicine behind the scare tactics.

Maybe search-engine fact checking will inject some common sense into the system.

Google’s idea does raise some troubling questions about who decides what is or isn’t truth, however. A parody or satire isn’t the same as fake news, and it may play an important role in the social conversation or political debate. But it may also not be factual. Can software determine what is deliberately fake and what is parody – i.e. can software alone determine human intent? Can it determine the difference between a for-profit fake news site and the Onion?

What about opinion? Not all opinion is based on fact or truth (as we learned locally, just from reading newspaper columns…) and is sometimes solely based on rumour, allegation and innuendo. Does it always get relegated to the bottom of the pile as a result?

Will Google’s move censor valid dissent and differences of opinion? Does Google’s software distinguish between opinion and statements of fact? Is it tough love or Big Brother they’re bringing in?

What about faith? There’s a thin line here – faith is seldom backed by fact. Many people believe in miracles – are their beliefs subject to the same fact checking system? If so, most religions will find themselves at the bottom of the page ranking. Sure, it’s fine if the creationists get kicked downstream, but what about people who believe the Shroud of Turin is real?

What about fiction? Poetry? Artwork? Music? Not all online content is about fact. A lot of artistic, creative material is online. What happens to it in the page rankings?

And what about sites that question authority, question the official government or party line? Will, for example, Putin’s version of Russia’s aggressive assault on Ukraine’s sovereignty become the search engine “truth,” and those who question it be tumbled to the bottom of the ranking because their concerns, counterpoints and questions aren’t considered “factual”?

We all depend on Wikipedia as our go-to source of information – but Wikipedia is user-created and driven. It is not always factual, not always objective or neutral, and its content is always changing: being added to, revised and edited. What happens to page ranking if Google refers to Wikipedia for a fact that later gets changed? Do pages that had the correct data but were demoted as “untruthful” then get bumped back up?

I’m torn here. My logical-reasoning brain wants some way to stop the clutter, reduce the noise, downgrade the nonsense. But emotionally, it seems a lot like censorship and control. I think it’s a good direction to go in, but I’ll have to wait like everyone else and see how it gets implemented.

Author: Ian Chadwick
