How big data is reshaping business
You can spend a holiday in the tropics with relative ease and safety – provided, of course, that you have sufficient funds to cover your travel expenses. Airlines can now schedule flights effectively, offer cut-throat ticket pricing, and even provide state-of-the-art inflight intelligence. The hospitality industry draws on data accumulated from years of operations, which is why the hotel you chose can offer you a differentiated service.
The restaurant ad you saw on your way to the hotel was so appealing that it made you drool, and you headed to that restaurant right after checking in.
It looks like a well-orchestrated scenario, but it is all made possible by the vast amounts of data lurking in the industry being put to use.
The concept of processing huge amounts of data has existed since the 1940s, which is widely considered the birth of big data. The first attempt back then to measure the growing volume of information was dubbed the “information explosion”. Since then, the concept has evolved and developed into what is now known as “Big Data”.
No one can say for certain who coined the term big data, but a quick web search generally surfaces two names: American computer scientist John Mashey, considered the “father of big data”, and Roger Mougalas, director of market research at O’Reilly Media.
Since the term was first coined, the scale, volume, and speed at which data is gathered and generated have exploded to mind-boggling numbers. For context, close to 59 petabytes of data were generated and processed daily in 2021.
What exactly is Big Data and how does it work?
In an increasingly digital world, you generate data whenever you use your smartphone, computer, or other internet-connected devices. That accumulated data is stored somewhere on physical or cloud storage servers, comes from many different sources, and arrives in a wide array of formats with a wide array of uses. In layman’s terms, it is complex. So complex, in fact, that a traditional data processing engine cannot handle it. This is Big Data.
Big data’s guiding principle is dubbed the three V’s: volume, velocity, and variety. Sometimes the list balloons to five, with the addition of veracity and value. Among hardcore industry insiders, the count can run as high as 10 or even 17.
Volume refers to the amount of data generated through various sources, ready to be accessed and assessed for productive use. Imagine the amount of data generated through TikTok, currently the number one short-form video platform. Do you think you could import it into an Excel file and analyze it?
Velocity is the speed at which data accumulates and grows across different sources. Most of the data arrives in real time, while some comes in batches and increments. Big data processing allows an organization to ingest data and process it in real time at the same time, so that it does not create congestion.
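One common way to reconcile real-time arrivals with batch-style processing is micro-batching: grouping an unbounded stream of events into small fixed-size batches that are handled as they fill up. The sketch below is a minimal stdlib-only illustration of the idea; the event names are hypothetical, and real streaming engines add time-based triggers, back-pressure, and fault tolerance on top of this.

```python
from itertools import islice

def micro_batches(stream, size):
    """Group an unbounded event stream into fixed-size micro-batches,
    so events are processed as they arrive instead of piling up."""
    it = iter(stream)
    while True:
        batch = list(islice(it, size))
        if not batch:
            break
        yield batch

# Hypothetical clickstream events arriving one by one.
events = (f"event-{i}" for i in range(7))
batches = list(micro_batches(events, size=3))
# The last batch may be smaller than `size` when the stream pauses or ends.
```

In practice the batch size (or a time window) is tuned so each micro-batch finishes processing before the next one fills, which is what keeps congestion from building up.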
Variety refers to the type, format, and possible uses of the harvested data. It can be structured or unstructured, and can arrive in encrypted or plain form. Imagine all the data generated through Google: it can be a location, a decrypted communication file, or even a lead form from an ad – and a whole lot more.
Considering the three V’s above, such data cannot be processed or analyzed using traditional data analytics software. Try loading a million rows into Excel and you will be faced with an unresponsive, overheating machine. Tackling Big Data requires a specialized platform that can uncover the complex nature of the gathered data.
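Part of what those specialized platforms do is avoid holding everything in memory at once. As a minimal sketch of the principle, the snippet below streams through a simulated CSV of 100,000 fares one row at a time, keeping only a running aggregate; the data and column name are made up for illustration, but the same pattern scales to files far larger than a spreadsheet could open.

```python
import csv
import io
import random

# Simulate a "large" CSV of ticket fares that would choke a spreadsheet.
random.seed(0)
rows = "fare\n" + "\n".join(str(random.randint(50, 500)) for _ in range(100_000))

# Stream row by row: at no point is the whole dataset held as parsed records.
total = count = 0
for record in csv.DictReader(io.StringIO(rows)):
    total += int(record["fare"])  # running aggregate, one row at a time
    count += 1

average_fare = total / count
```

Distributed engines generalize this same idea: split the data, aggregate each piece independently, and combine the partial results.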
The best way to illustrate how it works is through a scenario. Suppose your organization is an airline; your big data process might look like the steps below:
1. Data Collection: Data is collected from a myriad of sources throughout your business. If you have a fleet of 100 planes, you can gather data from cockpit recordings and logs or inflight records, among many other sources. You can also collect data from your mobile app, from a partner reseller, or from the leads generated by the ad you placed on YouTube. This data, structured or unstructured, can then be processed to provide your existing and future customers with better overall service.
2. Data Processing: Complex data should be organized in a system that is not only visually consistent but also aligned with your organization’s goals and intentions. One question that will pop up is “How should I use all of this data?”. The answer determines how you process the data so that it serves your original intentions. Once you have defined your goals, you proceed to “clean” the data.
3. Data Sanitation: You should scan for duplicate entries, correct and unify formats, and remove unusable data. Skipping this step will yield inconsistencies and increase the risk of inaccurate reporting and analysis.
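The sanitation step above can be sketched in a few lines. The booking records, field names, and date formats below are hypothetical, but the three operations are exactly the ones described: unify formats, drop unusable rows, and remove duplicates.

```python
from datetime import datetime

# Hypothetical raw airline booking records: a duplicate, a mixed date
# format, and a row with a missing passenger name.
raw_records = [
    {"booking_id": "A1", "passenger": "Kim", "date": "2021-03-05"},
    {"booking_id": "A1", "passenger": "Kim", "date": "2021-03-05"},  # duplicate
    {"booking_id": "B2", "passenger": "Lee", "date": "05/03/2021"},  # odd format
    {"booking_id": "C3", "passenger": "",    "date": "2021-03-07"},  # unusable
]

def normalize_date(value):
    """Unify dates to ISO format (YYYY-MM-DD); None if unparseable."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(value, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None

def sanitize(records):
    seen, clean = set(), []
    for rec in records:
        date = normalize_date(rec["date"])
        if not rec["passenger"] or date is None:
            continue  # remove unusable rows
        key = (rec["booking_id"], rec["passenger"], date)
        if key in seen:
            continue  # drop duplicates
        seen.add(key)
        clean.append({**rec, "date": date})
    return clean

cleaned = sanitize(raw_records)  # 4 raw rows -> 2 clean rows
```

Note that deduplication only works reliably after formats are unified, which is why normalization happens before the duplicate check.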
4. Data Analysis and Visualization: Your data is now clean, well-formatted, and consistent. You are ready to turn a once-messy medley of data into powerful insights that can improve or even save your business. You can identify patterns and relationships, spot outliers, or employ artificial intelligence and machine learning.
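As a taste of that analysis step, here is a minimal stdlib-only sketch of outlier detection: flag any value more than two standard deviations from the mean. The daily ticket-sale figures are invented for illustration; real pipelines would use more robust methods, but the idea is the same.

```python
import statistics

# Hypothetical daily ticket sales for one route; the spike on the last
# day is the kind of outlier a quick first pass can surface.
daily_sales = [120, 118, 125, 130, 122, 119, 127, 410]

mean = statistics.mean(daily_sales)
stdev = statistics.stdev(daily_sales)

# Flag values more than two standard deviations from the mean.
outliers = [x for x in daily_sales if abs(x - mean) > 2 * stdev]
# outliers -> [410]
```

An outlier like this is a prompt for investigation, not a conclusion: it could be a promotion, a holiday, or a data-entry error that slipped past sanitation.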
One powerful platform is DataStreams’ flagship service “TeraONE”. It offers a complete set of functions spanning data integration, data governance, and analysis. From data gathering to augmented ETL, storage, data virtualization, and analysis, TeraONE can unleash the power hiding in your data through effective real-time data streaming, batch data processing, machine learning, and visualization.