
How SEO can be explained through Dataism

Originally published in Unpopular Opinion.

[Image: cubes in a data matrix]

The idea that Big Data makes the world go round has been around for several years now. Yuval Noah Harari spoke about Dataism in his book “Homo Deus”, and the term has been popping up in the media ever since. In short, Dataism is a view of the universe in which the processes and entities with the freest-flowing data win the evolutionary race. David Deutsch, in “The Beginning of Infinity”, makes a similar point from a different angle: well-packaged, easily replicated information builds the foundation of our universe. Think DNA chains, music, memes.


One good piece of evidence is how the Internet was built and developed. It was the early open-source projects that fueled the fast and robust growth of the Internet capabilities and online culture we experience today. The Internet itself was a wormhole leading to tons and tons of data. If we follow this thread, we see that the top companies that emerged in this economy were Google, Amazon, and Facebook. Google helped organize and access the data: there is not much use in heaps of information buried in the World Wide Web if there is no good way to parse and extract it. Amazon’s niche is the plethora of products, connecting millions of sellers around the world to millions of shoppers. Enough has been said about Facebook’s data - we all know how much of it there is and all the ways it can be used.


And we can assume that large algorithms, once sufficiently developed and iterated upon, will start favoring the same big-data patterns described above.


Let’s take a look at SEO now. One of the key phrases firmly residing within the SEO industry is “Content is King”. Content - or data. Each website is carefully “gardening” its content, as content is the foundation of organic traffic growth. This is where most website owners and digital marketing professionals “get in the weeds”: What to write about? What content do we need? Should we write more or less about xyz? Is it duplicate content? Are we optimizing for crawl budget?.. Time to hire an SEO professional.


If we apply the same free-flowing data concept to search engine optimization, it means one thing: more data will win.


Google’s algorithm is smart and complex at this point in our history. And if the big-data concept is correct, then at some point it has to start favoring websites that resemble free-flowing, efficiently packaged data.


I am writing this because I’ve seen it. E-commerce sites where SEO takes off as the number of products on the website grows. These can be different product categories, too - it truly seems that all boats rise with the tide. UGC websites where the more content there is on the site, the higher all of it ranks. Crawl budget, you ask? No, haven’t heard. Joking - the point is that crawl budget is adjustable. Maybe not on day one, but website owners should not limit the amount of content based on crawl budget considerations alone. If the content is needed and read by someone other than bots, and it’s not a duplicate-page glitch, then bots will adjust their crawl budget to the amount of useful content on the site.
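One rough way to sanity-check the “crawl budget adjusts” claim on your own site is to watch Googlebot activity in your server logs over time. Below is a minimal Python sketch, assuming a standard combined-format Apache/Nginx access log at a hypothetical access.log path; the Googlebot check is a naive user-agent substring match, not a verified reverse-DNS lookup.

```python
import re
from collections import Counter
from datetime import datetime

# Hypothetical path to a combined-format Apache/Nginx access log.
LOG_PATH = "access.log"

# Matches the date portion of a timestamp like [02/Jan/2024:10:15:32 +0000].
date_re = re.compile(r"\[(\d{2}/\w{3}/\d{4})")

# Count Googlebot requests per day; a rising trend alongside growing
# content suggests the crawl budget is adjusting upward.
daily_hits = Counter()
with open(LOG_PATH) as log:
    for line in log:
        if "Googlebot" in line:  # naive check; production code should verify via reverse DNS
            match = date_re.search(line)
            if match:
                day = datetime.strptime(match.group(1), "%d/%b/%Y").date()
                daily_hits[day] += 1

for day, hits in sorted(daily_hits.items()):
    print(f"{day}: {hits} Googlebot requests")
```

If crawl volume keeps climbing as useful pages are added, that is the “tide rising” pattern described above; a hard plateau is the point where crawl budget is worth a closer look.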


Bottom line - have more content and you shall rank higher.


How about fake news and false information, then? Does that count? That is a very good question, although I doubt anyone, literally anyone, has an answer to it yet. Twitter’s troll and fake-account problem is real, and so is that of any large media platform, especially one with user-generated content. One thing that is true at a high level is that any successful system or society needs an effective error-correction mechanism. Again - going to David Deutsch here. Democracy has error correction “built in”: the “errors” can be corrected every four years. Freedom of publishing has error correction where the best ideas win. We just don’t know yet what error correction looks like for our current system, where data is created and consumed very fast and at scale. What’s for sure is that this error correction cannot be manual: it cannot be Joe banning tweets because they violate a policy, or a narrative, or because Joe doesn’t like someone.


If we look at history, one thing is clear: it is best at correcting itself. So, a crazy idea: maybe it’s time to let the child walk on their own, time to stop helicopter-parenting our data and let it find its own path to what’s right?


