Big Data Challenges

redazione / 15 May 2015

I hope you have already heard about “big data challenges” – if not, my friend, you have a problem. Big data is the next big thing right here, right now. In this post I would like to offer an overview of the topic from a business perspective.

Big data, from a technical perspective, is the challenge of building and applying advanced tools (software and hardware) to process the huge and diversified quantities of data produced daily.

Meanwhile, from a business perspective, big data is the challenge to obtain a competitive advantage by driving decision-making processes based on huge and diversified quantities of data.

To achieve this target, managers and entrepreneurs have to redesign their organizations so that they can, as efficiently and effectively as possible, (1) define, (2) collect and (3) process big data.

Scale of impact

The McKinsey Global Institute (MGI) report “Big data: The next frontier for innovation, competition, and productivity” gives an idea of the scale of the impact that big data will have on our future society:

  • $300 billion potential annual value to US health care, more than double the total annual health care spending in Spain
  • €250 billion potential annual value to Europe’s public sector administration—more than the GDP of Greece
  • $600 billion potential annual consumer surplus from using personal location data globally
  • 60% potential increase in retailers’ operating margins
  • 140,000–190,000 more deep analytical talent positions, and 1.5 million more data-savvy managers

Big data characteristics

Big data is usually described using three dimensions, universally known as the 3Vs (volume, variety and velocity):

  • Volume refers to the amount of data.
  • Variety refers to the number of types of data (video, photo, tweet, text, etc.).
  • Velocity refers to the speed of data processing.

According to the 3Vs model, the challenges of big data management result from the expansion of all three properties, not just from volume, the sheer amount of data to be managed.

How big is big data (Volume)

The starting point is the huge amount of data produced daily. In 2012, 2.5 quintillion bytes of data were created every day (a quintillion is 1 followed by 18 zeros), and 90% of the world’s data had been created in the previous two years alone. As a society, we’re producing and capturing more data each day than had been seen by everyone since the beginning of the earth.

This vast amount of digital data would fill a DVD stack reaching from the Earth to the moon and back. The world’s technological per-capita capacity to store information has roughly doubled every 40 months since the 1980s according to Martin Hilbert and Priscila López.
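To put these figures in perspective, the arithmetic can be sketched in a few lines of Python. The only inputs are the numbers quoted above (2.5 quintillion bytes per day and a doubling every 40 months); the function names are illustrative, and decimal units (1 exabyte = 10^18 bytes) are an assumption.

```python
# Figures quoted in the text above.
BYTES_PER_DAY = 2.5e18   # 2.5 quintillion bytes created per day (2012)
DOUBLING_MONTHS = 40     # storage capacity doubles roughly every 40 months


def bytes_to_exabytes(n_bytes):
    """Convert bytes to exabytes, assuming decimal units (1 EB = 10**18 bytes)."""
    return n_bytes / 1e18


def annual_growth_rate(doubling_months):
    """Yearly growth rate implied by a given doubling time in months."""
    return 2 ** (12 / doubling_months) - 1


def growth_factor(years, doubling_months):
    """How many times larger a quantity is after `years` of steady doubling."""
    return 2 ** (years * 12 / doubling_months)


if __name__ == "__main__":
    print(f"Daily data: {bytes_to_exabytes(BYTES_PER_DAY):.1f} exabytes")
    print(f"Implied annual growth: {annual_growth_rate(DOUBLING_MONTHS):.1%}")
    print(f"Growth over 10 years: x{growth_factor(10, DOUBLING_MONTHS):.0f}")
```

In other words, a doubling every 40 months corresponds to growth of roughly 23% a year, or an eightfold increase in a decade.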

Big data producers (Variety)

Data is produced daily by many different sources, in both structured and unstructured forms:

  • Retailers and distribution chains, which started to build consumer databases in the 80s
  • Organisations (logistics, financial services, health care, transportation)
  • Web users through social media (blogs, Twitter, Facebook, LinkedIn, Instagram)
  • Sensing and recognition devices (voice, audio, face, visual and muscle recognition)
  • Smartphones and “The Internet of Things” (every object becoming smarter and connected to the internet, e.g. cars, home devices, health devices)
  • Scientists (new forms of science, DNA databases, nanotechnology, CERN experiments, etc.)

Big data technology (Velocity)

Big data requires exceptional technologies to efficiently process large quantities of data within tolerable elapsed times. The three main ingredients are:

  • Storage space: storage capacity has never been so cheap. You can buy a disk drive that can store all of the world’s music for less than 300 dollars. Cloud solutions such as Amazon Web Services are the most likely future standards.
  • Parallel processing software: Hadoop is the main open-source software at the moment. It allows you to split data into blocks distributed across a cluster of machines, process those blocks in parallel and then combine the partial results.
  • Powerful hardware to cope with big data processing: in a possible future, quantum computers could be the way. The hardware has to support not only established techniques such as regression, A/B testing, classification or association rule learning, but also machine learning, natural language processing, neural networks or genetic algorithms.
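The split, map and reduce idea behind Hadoop-style parallel processing can be illustrated with a toy word count. This is a single-machine sketch in plain Python, not Hadoop's actual API: the function names and the sample text are made up for illustration, and on a real cluster the map step would run on many machines at once.

```python
from collections import defaultdict


def split_into_chunks(records, chunk_size):
    """Break the full dataset into smaller chunks, one per (imaginary) worker."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def map_chunk(chunk):
    """Map step: emit a (word, 1) pair for every word in the chunk."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]


def reduce_pairs(mapped_chunks):
    """Reduce step: add up the counts emitted for each word across all chunks."""
    totals = defaultdict(int)
    for pairs in mapped_chunks:
        for word, count in pairs:
            totals[word] += count
    return dict(totals)


def word_count(records, chunk_size=2):
    """Run the three steps in sequence on one machine."""
    chunks = split_into_chunks(records, chunk_size)
    mapped = [map_chunk(chunk) for chunk in chunks]  # parallel on a real cluster
    return reduce_pairs(mapped)


if __name__ == "__main__":
    sample = ["big data is big", "data is everywhere", "big challenges"]
    print(word_count(sample))
```

Because each chunk is mapped independently, adding more machines lets the same job handle more data in the same elapsed time, which is the property that matters for velocity.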

The big data challenge for Companies

Enterprise resource planning (ERP) systems have been the standard in almost every corporation since the 80s and 90s. ERP is all about structured data describing the internal organization. The goal of the system is to track every piece of data classified as a Transaction. Transactions are all the financial and economic operations on both the supply and demand sides.

An example of transaction data that can be recorded is the daily turnover of an ice cream shop, derived from the sum of the records of every ice cream sold on that specific day.
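As a rough illustration of how such transaction records add up to a daily turnover, here is a minimal Python sketch. The sales records, prices and the `daily_turnover` function are hypothetical, invented only for this example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sales records: (day of sale, price in euros). Made up for illustration.
sales = [
    (date(2015, 5, 14), 2.50),
    (date(2015, 5, 14), 3.00),
    (date(2015, 5, 15), 2.50),
    (date(2015, 5, 15), 4.00),
    (date(2015, 5, 15), 3.00),
]


def daily_turnover(records):
    """Sum the value of every sale, grouped by the day it was recorded."""
    totals = defaultdict(float)
    for day, amount in records:
        totals[day] += amount
    return dict(totals)


if __name__ == "__main__":
    for day, total in sorted(daily_turnover(sales).items()):
        print(day.isoformat(), f"{total:.2f}")
```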

New customer relationship management (CRM) systems have expanded companies’ focus from inside the company to outside it. The goal of a CRM is to monitor every piece of data classified as an Interaction. An Interaction is every possible point of contact between a Company or a product and the external world (clients and potential clients, defined as leads). The explosion of the internet and social networks has produced even more interaction data: posts and micro posts on products, services and companies’ reputations, search results, website traffic. New-generation CRM systems can help to monitor this wave of data.

An example of interaction data that can be recorded is the segmentation of the ice cream shop’s clients by salary range. To collect the data, the shop could, for example, run a survey.

The Internet of Things, the new generation of smartphones and all kinds of smart products have started to produce a huge amount of Observation data. Observation data is data not directly related to the Company but which can have some sort of direct or indirect correlation with some elements of the Company.

An example of observation data that can be recorded is the daily temperature in the ice cream shop’s city. It is likely that a direct correlation can be found between temperature and turnover (the hotter the weather, the more ice creams sold).

Big data is all about observation. Which observation data should we monitor? How do we do it? How do we process those observations? How do we understand the correlation between observation and interaction or transaction data? How do we create a predictive model that will lead to the transaction data goals we set?
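For the ice cream shop, the last two questions could be approached, in the simplest case, with a correlation coefficient and a straight-line fit. The sketch below uses hypothetical temperature and turnover figures invented for illustration; the Pearson correlation and least-squares line are standard techniques chosen here as an example, not a method prescribed by this article.

```python
from math import sqrt

# Hypothetical observations for one week: daily temperature (°C) and turnover (euros).
temperature = [18, 21, 24, 27, 30, 33, 35]
turnover = [210, 260, 300, 370, 420, 480, 510]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation: +1 means a perfect positive linear relationship."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    slope = cov / var_x
    return slope, my - slope * mx


def predict(x, slope, intercept):
    """Predicted turnover for a given temperature."""
    return slope * x + intercept


if __name__ == "__main__":
    slope, intercept = fit_line(temperature, turnover)
    print(f"Correlation: {pearson(temperature, turnover):.3f}")
    print(f"Predicted turnover at 29 °C: {predict(29, slope, intercept):.0f} euros")
```

A correlation close to +1 would support the intuition that hotter days mean more sales, and the fitted line turns a weather forecast (observation data) into a turnover forecast (transaction data). A real model would need far more data and would have to account for other factors, such as weekends or holidays.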

Future challenge

It is clear that big data will greatly impact everyday life. How that impact takes form is in our hands: entrepreneurs and managers in both public and private sectors have the opportunity to reshape companies, and even society. Are you ready?

Written by Simone Cimminelli