Big Data: The next big thing?

After the cloud, mobile apps and social media, Big Data is the most recent buzzword. Big Data is not only about the enormous and still rapidly growing amount of data; it is mainly about the ability to combine information from various sources and to enrich and analyze it. This leads to new knowledge and makes highly accurate predictions possible.

The term Big Data is hot; it is the subject of an increasing number of blogs, books and meetings. For many, the notion means huge data centers with endless rows of servers, but that is just a part of the story. The main point is not so much the availability of data, but what is being done with it.

Multiple sources
According to ICT research agency Gartner, Big Data must entail volume, variety and velocity. Data must come from multiple sources and must be suitable for enrichment with other data, structured as well as unstructured. Moreover, its added value is mainly due to its ready availability. A database of client data, no matter how large, does not meet the definition. It does qualify when information on earlier purchases is combined with data on web browsing behavior, for example to present someone online with a tailor-made offer.

Flu
To shed some light on the scope of Big Data, Viktor Mayer-Schönberger and Kenneth Cukier give a number of clarifying examples in their book The Big Data Revolution. When a new influenza virus rears its ugly head, information on the spread of the virus is almost always out of date. Usually, people have already been ill for some time before they call a doctor. Also, data is not gathered continuously. Google investigated whether the spread of a virus could be predicted at an earlier stage. To do so, it compared online search behavior in the US with the official flu data.
This had been tried before, but without Google’s computing capacity and statistical expertise. The key was not, as had been expected, search terms like ‘flu’ or ‘cough medication’. Google instead designed a system that did not consider the content of search terms. It looked only for a match between the frequency of certain search terms and the spread of the flu. Processing 450 million different mathematical models eventually yielded a combination of 45 terms that showed a strong correlation with the national flu figures. Using this method, it proved possible to predict the geographical spread of flu in almost real time. During the 2009 flu epidemic in the US, the Google system was more practical and faster than the after-the-fact statistics.
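
The idea of matching search-term frequency against flu figures, without looking at what the terms mean, can be illustrated with a small sketch. This is not Google's system: the term names, the weekly numbers and the function names below are all made up for illustration, and the plain Pearson correlation is an assumption standing in for the far larger set of models described above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no signal
    return cov / sqrt(var_x * var_y)


def rank_terms(term_counts, flu_cases, top_n):
    """Rank search terms by how closely their frequency tracks flu cases.

    The content of each term is ignored; only the numbers matter.
    """
    scored = [
        (term, pearson(counts, flu_cases))
        for term, counts in term_counts.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical weekly figures (not real data).
flu_cases = [10, 20, 40, 80, 60, 30]
term_counts = {
    "term_a": [100, 210, 390, 820, 590, 310],  # rises and falls with the flu
    "term_b": [500, 480, 510, 495, 505, 490],  # roughly constant
    "term_c": [80, 70, 50, 20, 35, 60],        # moves in the opposite direction
}

top_terms = rank_terms(term_counts, flu_cases, top_n=2)
for term, score in top_terms:
    print(f"{term}: {score:.3f}")
```

In this toy data set, the term whose frequency follows the flu curve ends up at the top of the ranking, regardless of what the term actually says.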


Cheap tickets
The Big Data Revolution also discusses an initiative by Big Data pioneer Oren Etzioni. He developed a system to predict the price of airplane tickets, based on an analysis of 12,000 price observations from a travel website. This led to a small-scale model that performed well in practice and was further perfected by Farecast, a website that predicts airfares. Farecast additionally gathered large amounts of data from an airline industry reservation database, processing almost 200 million flight price records to support its fare predictions. If a drop in price is expected, the advice is to wait before purchasing the tickets. If the average fare is predicted to go up, it is wise to buy straight away. Microsoft bought the company in 2008 and integrated its service into its search engine Bing.
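
The buy-or-wait advice can be sketched in a few lines. The article does not describe how Farecast's model works, so the example below swaps in a simple assumption: a straight-line trend fitted to recent price observations. The function names and the prices are hypothetical.

```python
def linear_slope(prices):
    """Least-squares slope of prices against their observation index."""
    n = len(prices)
    mean_t = (n - 1) / 2
    mean_p = sum(prices) / n
    num = sum((t - mean_t) * (p - mean_p) for t, p in enumerate(prices))
    den = sum((t - mean_t) ** 2 for t in range(n))
    return num / den


def advise(prices):
    """Return 'wait' if the fitted trend points down, otherwise 'buy'."""
    if len(prices) < 2:
        raise ValueError("need at least two price observations")
    # A flat trend counts as 'buy': waiting is not expected to pay off.
    return "wait" if linear_slope(prices) < 0 else "buy"


# Hypothetical daily fares for one route (not real data).
print(advise([300, 290, 285, 270]))  # falling fares
print(advise([200, 210, 215, 230]))  # rising fares
```

A real system would use far more observations and a richer model; the sketch only shows how a predicted direction translates into advice.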

Right turn
Professor Erik Brynjolfsson is director of the MIT Center for Digital Business and a popular speaker on, among other subjects, the impact of data. His Big Data research covered 330 companies, and he elaborated on some of the findings during a conference in Amsterdam last year. A winemaker, for instance, can predict beforehand how his wine will taste by carefully analyzing the composition of his product. Parcel deliverer UPS used Big Data to determine that a route in which the vans make only right turns is faster and cheaper than the ‘traditional’ shortest route. An analysis of Google searches in the United States proved to be a better predictor of people's readiness to buy a house than statistical models based on historical data and economic indicators.

Revolution
In The Big Data Revolution, Viktor Mayer-Schönberger and Kenneth Cukier outline the consequences of being able to connect and analyze an almost endless stream of data. It is a very accessible book that will appeal to a wide audience. Conclusion: Big Data will permanently change the way we think, work and live. (Maven Publishing, € 22, in Dutch)

Microscope
When evaluating Big Data projects, Brynjolfsson saw an average 4 percent increase in productivity; profit was up by 6 percent. Of course, not all projects are success stories, he stresses. But if a project does not yield results, according to the professor, it is not because of a lack of relevant data, but because of the quality of the analyses. Brynjolfsson regards Big Data not only as a technological breakthrough, but also thinks that it will lead to a revolution in the minds of managers. He even compares Big Data to the invention of the microscope. Future decisions will no longer be based on the suppositions and choices of a limited number of leaders. They can be taken on the basis of data, of factual information.

Cause and effect
Going further, Cukier and Mayer-Schönberger point out another important change. To predict developments and to substantiate decisions, common practice at the moment is mainly to use random samples. Because only a limited amount of data is involved, these samples must be composed with extreme accuracy in order to have any value. With the huge amount of information now available for analysis, Big Data research does not have to be nearly as exact to provide reliable conclusions. Even data that, at face value, does not appear to have any relevance can be included in the research. Big is beautiful, in this case: quantity leads to quality. Another essential difference from the world as we know it is that it is no longer crucial to know why something is the way it is. Data analysis makes it possible to predict that something is highly likely to happen, without explaining why. Causality becomes less important. We do not necessarily have to go looking for the cause when we know the effect.

4.4 million jobs
The amount of data is increasing explosively. Consider sources like online search behavior, smartphones, applications, and profiles and messages on social media. From a technological point of view, storing and accessing all this data hardly poses a problem. The key lies in the ability to create links by means of tools like analytical software and algorithms. There will therefore be a huge demand for researchers and analysts. Gartner predicts that in 2015, 4.4 million jobs related to Big Data will be created worldwide. Europe will account for almost a third of these jobs. However, the research agency expects that only one in three of these jobs can actually be filled. Privacy is another bottleneck. Linking and analyzing data can lead to discrimination. It may also cause conflicts with existing legislation and future European privacy regulations.

Strike gold
These restrictions hardly affect the potential of Big Data. Opportunities abound not only from a social point of view, as in health care or the fight against crime, but also in the corporate world. The most obvious example is Google, where Big Data is the heart of the business. Google started out with data on search behavior only. Later it added input from Gmail, Android (bought in 2005) and YouTube (bought in 2006). All these sources supply Google with an enormous amount of data, which can be analyzed and turned into cash. Currently, revenues are generated primarily through the sale of advertisements, but the applications go a lot further. Other Big Data collectors have also struck gold.

Competitive edge
But what is the impact of Big Data on “common” enterprises? A number of examples show the diversity of the potential added value. In some cases, the ability to apply data strategically will even be decisive for the survival of an organization. In any case, Big Data can supply a competitive edge in almost all industries. Bear in mind, however, that it is still a new subject in ICT. Companies that are still busy exploring the possibilities of social media and mobile applications must prepare themselves for the next challenge. Experts are already announcing the step after that: nanodata, in which information from the huge flow of data is used to zoom in on the individual customer. It never stops.
