The Crowd Mine

Eventually, data mining technologies will converge and synchronize their operations into one giant mine, processing raw global data through ‘industrial-scale exploitation’ into a highly refined global intelligence...

Two of the largest mining operations in the world happen to lie right next to each other, just south of Salt Lake City, Utah. The resources mined in each location are so deep and scattered on so vast a scale that their extraction requires the use of extremely large and powerful machinery.

The Bingham Canyon Mine is so large it can be seen from space with the naked eye. Some judge it to be the largest human excavation on the planet. It is so large that it was declared a National Historic Landmark and attracts visitors from around the globe. Yet, even though it is deeper than many mountains are tall and the machines working it are larger than buildings, in terms of scale it pales in comparison to the other mining operation just a short distance away: the NSA's Utah Data Center.

Whether it's copper, gold, oil, food, or data, the production of resources over time has been greatly improved with the application of technology. Today, powerful machinery and advanced mining techniques are allowing for the extraction of resources at scales and depths that were previously thought impossible. Though we can understand and visualize this in terms of massive open-pit mines, deepwater drilling, or even fracking, this is no less true when it comes to our personal data—a resource as valuable as gold.

Society has long been treated as a resource to be mined. The first and most basic method for determining its value goes back to one of the oldest large-scale data-gathering exercises known: the census. In Big Data, the authors trace the history of data mining and note the biblical account of Caesar Augustus imposing a census so “that all the world should be taxed.”

The technology used then amounted to little more than counting and proper record-keeping. However, as populations swelled, mining crowds for their numerical size and basic characteristics became increasingly difficult. New technologies had to be applied as the scale of operations grew. Eventually, a breakthrough occurred when the U.S. Census Bureau contracted the inventor Herman Hollerith to use his punch cards and tabulation machines for the 1890 census. With much effort, he reduced the time needed to tabulate the results from eight years to less than one. As the authors of Big Data write, “It was an amazing feat, which marked the beginning of automated data processing (and provided the foundation for what later became IBM).”

Fast forward to today, and another major technological breakthrough has led to the massive gathering of resources at unimaginable scales and depths. Like fracking, big data has pushed the data mining process into previously inaccessible reservoirs of human thought, emotion, and behavior, which collectively form the bedrock of our society, economy, and markets.

There are, however, a number of extremely important distinctions between data mining and fracking or the mining of other non-renewable resources. Consider the Bingham Canyon Mine. Eventually, it will run dry and shut down because there is a limit to the amount of copper concentrated in that deposit. Generally speaking, physical resources come in limited supply: there are input constraints. This is not true of digital data, which continues to grow at exponential rates.

Also, because supplies are limited within any geographical area, using larger and more advanced machinery to mine deeper, or at greater scales, often does not improve the quantity or quality of production. Thus, there is a natural constraint on the size and effectiveness of mining technology. This is less true with data mining. In fact, the more data a technology is able to acquire and process, the higher the quality of its results tends to be.

In the case of mining, man creates and uses technology to mine resources, which are typically consumed and depleted. In the case of data mining, man creates and uses technology to mine man, which forms a curious sort of feedback loop. In the first case, both resources and technology face clear physical constraints. In the second, data and data mining technologies are not subject to the same constraints; instead, they do just the opposite, feeding one another in a self-reinforcing cycle.

So, you may be asking, what does all this mean? Well, this is where things start to get interesting. Since the parallels between mining and data mining are pretty clear, think again of the massive open-pit mine introduced at the beginning, but now visualize it in terms of a massive data mining operation. Instead of people, there are large automated machines doing all the work, digging into deeper and deeper layers of societal data. Zoom out and now look at the entire world. What do you see? Data mines everywhere, with governments and corporations funneling billions of dollars to build bigger and ever more advanced machinery to mine at greater depths.

Eventually, data mining technologies will converge in the center and synchronize their operations as one giant mine, processing raw global data through "industrial-scale exploitation" into a highly refined global intelligence. Is this the event people refer to as the Singularity? Whatever it is, once it takes place (if it hasn't already), we should expect it to alter society, the global economy, and the financial markets in ways that were previously thought impossible.

About the Author

Program Manager, Webmaster, Senior Editor, & Co-Host
cris [at] financialsense [dot] com