Understanding Big Data Analytics
Published on: Sunday 01-05-2022
Experts debate how Big Data analytics uncovers hidden patterns and offers deeper insights that lead to better decision making.
The convergence of technologies like the internet, sensors and other IoT devices is revolutionising manufacturing. A vast amount of data is now available for analysis, and the insights gained are leading to better efficiencies and fewer pain points. Analysis of data that was earlier not accessible allows businesses to make better decisions much faster. Several advanced analytics techniques are now available, thanks to data mining, artificial intelligence, machine learning, predictive analytics, etc. The gradual rollout of 5G services will accelerate this process, paving the way for the smart factories envisioned in the Fourth Industrial Revolution. Data analytics will be a vital part of this revolution. In the future, everything will be connected and traceable. But what exactly is Big Data? What is the difference between Big Data and good old data?
What is Big Data?
“In the domain of Artificial Intelligence, Analytics and Machine Learning, Big Data is understood to be a collection of complex data sets, stored in a specific manner on large servers to achieve defined business goals. The data is retrieved using defined access mechanisms. The procedures, protocols and tools for managing the complex data sets are collectively called Big Data,” says PV Sivaram, Evangelist for Digital Transformation and Industrial Automation. A veteran of the automation industry, PV Sivaram retired as the Non-Executive Chairman of B&R Industrial Automation and was earlier the Managing Director. He is a past President of the Automation Industries Association (AIA). “Big Data is data which is being collected at a furious pace, and is continuously growing in volume. Legacy mechanisms for searching and sorting data are not adequate for Big Data. The authenticity and reliability of prognostics made by analytical engines processing Big Data are far superior to what can be achieved with lower volumes of data,” he elaborates.
“Data or plain data as mentioned here is nothing but a set of quantitative and qualitative information stored digitally or otherwise. This information is processed by analysis tools and software to convert the information into actionable insights. Big data is also data after all, but the major difference comes from the 3 Vs, i.e., Volume, Variety and Velocity, and the data volume is in petabytes, exabytes, or zettabytes, instead of gigabytes and terabytes. So it is a lot of data by size, has a mixed variety of information and, due to the sheer nature of it, accumulates extremely rapidly,” says Mamta Aggarwal Rajnayak, Head – AiDa (Enterprise wide AI-ML platform), American Express. Mamta works with a community of enthusiastic and talented product managers, tech specialists, data scientists, ML engineers, visualisation experts and others, who leverage industry-leading technologies to keep American Express ahead of its competition. To illustrate the nature of the huge volume of data, Mamta gives the example of ecommerce giant Amazon. “The company ships 1.6 million packages every day and, with a conversion rate of 9.87%, that means traffic of about 16 million customers every day. Imagine capturing the entire clickstream journey of each and every one of these 16 million customers; that results in petabytes of data every day. Hence traditional data handling techniques stop working on this data and we need big data tools and technologies like Hive, Spark, Hadoop, etc.,” she explains.
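The traffic figure in the Amazon example follows from simple arithmetic. The short Python sketch below reproduces it, assuming (for illustration only) that each shipped package corresponds to one converting visitor; the package count and conversion rate are the figures quoted above and have not been independently verified.

```python
# Figures as quoted in the article; not independently verified.
packages_per_day = 1_600_000
conversion_rate = 0.0987  # 9.87% of visitors place an order

# Assumption for illustration: one shipped package per converting visitor,
# so daily traffic is packages divided by the conversion rate.
daily_visitors = packages_per_day / conversion_rate

print(f"Implied daily visitors: {daily_visitors:,.0f}")
```

The result comes to a little over 16 million visitors a day, which is consistent with the figure cited in the quote.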
“Big Data refers to the rise of digital enterprises and how organisations can leverage their digital assets to their advantage. These assets are commonly described as large amounts of data and can be seen as a sign that something has been going on that hasn't been articulated and analysed fully,” states Abhilash Shukla, Independent Consultant, Digital Transformation Projects. Through innovative use cases of Digital Technologies, AI/ML, and IoT, Abhilash has contributed to many SaaS and PaaS projects in his 13 years of industrial experience. “For me, the significant difference between big data and plain data, a.k.a. traditional data, is the size and the way they are stored. Plain data is relative to the size and of a type defined for a purpose, whereas big data is data that is fast-changing, large in both size and breadth of information, and comes from seemingly disparate data sources. The term ‘Big Data’ refers to datasets that are just too large to be analysed using traditional data processing techniques. Traditional data was always presented in a way that could be easily understood by looking at it or by relating one item to another, but Big Data is something that cannot be understood at a human glance,” he adds.
According to Jasbir Singh, Automation Expert, Consultant & Implementation Strategist, data is mainly generated by the internet, through social networking, web search requests, text messages, media files, IoT devices and digital wireless sensors. Jasbir Singh has a long association with business houses and large production houses, helping to improve factory automation in their production lines and productivity in factories in India and overseas, and advising on and designing the transformation of units into digital platforms through the use of Artificial Intelligence. “The world generates nearly 2.5 quintillion bytes of data daily and, as per media reports, the majority of global data has been produced in the last couple of years alone. The exponential growth of data generated globally in the past decade has caused worries for organisations about structured storage and meaningful use for analytics and future organisational benefits. The need for Big Data technologies arises when the data gathered from multiple sources/systems cannot be stored systematically or processed manually or by traditional means for future use. Many types of big data technologies are available for users to implement, and they are linked to one of two major domains, operational and analytical,” he elucidates.
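To put the quoted daily figure on the same scale as the units mentioned earlier, the sketch below converts it using decimal (SI) prefixes. It assumes the short-scale meaning of quintillion (10^18) and simply extrapolates the daily number over a 365-day year; it makes no claim about the accuracy of the figure itself.

```python
# Decimal (SI) byte units, in bytes.
UNITS = {
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
}

# 2.5 quintillion bytes per day (short scale: 1 quintillion = 10**18),
# as quoted in the article.
daily_bytes = 25 * 10**17

daily_in_exabytes = daily_bytes / UNITS["exabyte"]
# Naive extrapolation: the same volume every day for 365 days.
yearly_in_zettabytes = daily_bytes * 365 / UNITS["zettabyte"]

print(f"Per day:  {daily_in_exabytes} exabytes")
print(f"Per year: {yearly_in_zettabytes:.4f} zettabytes")
```

On these assumptions, 2.5 quintillion bytes a day is 2.5 exabytes, or a little under one zettabyte a year.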
Big Data and Digital Transformation
Organisations today are at various stages of their digital transformation journey, in order to stay competitive in an increasingly digital world. Big data is an important element of this transformation process, one that provides insights that otherwise remain buried in the process.
So what is the relationship between Big Data and Digital Transformation?
PV Sivaram takes the view that Digital Transformation, as a catch-all term, is the journey of an enterprise to become more adept in three major areas. First is to become more agile in serving customers, which means being quicker to service expressed demands. Second is to become more efficient in its processes, thereby gaining competitive advantage. Third is to innovate its products as well as business models continuously by quickly becoming aware of changing customer demands, which are not always articulated. These are activities which need a response in real time. “The inputs come from customer interaction, from the manufacturing field, and from social media. These inputs are so diverse in nature, and the arrival rate of new data is random, hence only techniques of Big Data are adequate for the purpose,” he asserts.
Mamta Aggarwal Rajnayak, who has been instrumental in helping clients in their digital transformation journey, believes the majority of companies are signing up for enterprise wide Digital Transformation. But in order to do that they need a strong data strategy: they need to create a bedrock of data which is clean, with well documented lineages, the right governance processes, established data marketplaces, consistent data definitions, a high reusability quotient, one single source of truth for the entire organisation and, above all, a data-driven mindset. “The need to go for digital transformation is rapidly increasing with the ever increasing big data generated in the organisation. Organisations are realising that data can really help them in making no-regret decisions. With this increasing demand for digital transformation, there is an increasing shortage of data and AI transformation skills such as Data Engineers, DevOps specialists, Testers, ML Engineers, Data Modellers, ML Modellers, MLOps specialists, ML testers, visualisation experts, etc.,” she maintains.
“With the advent of digital transformation, organisations now have more access to data analytics solutions that help them rapidly analyse a large amount of data and information, allowing them to make better and faster decisions. Having said that, big data sourced from various places like legacy systems, smart devices, IoT transmissions, etc., is melding into an ecosystem, which powers predictive analytics, bringing a whole new perspective to an organisation’s capability,” says Abhilash Shukla, who cites the pandemic as an example. “Governments around the world are quickly adapting to handle the effects of the pandemic by increasing their technological readiness to handle on-ground needs. And to derive decisions, they need data from varied sources like hospitals, vaccination centres, third-party testing labs, processed results of the tests, etc. Now all this data is required to be plugged into one place, i.e., ‘Big Data’, so that a 360° view can be achieved, resulting in decision making. Think how powerful the knowledge of data can become if all data can be designed into a piece of meaningful information,” he says.
Gold or garbage?
Data has been compared variously to oil (black gold) and to garbage. How does one make sense of these two extremes?
“Data analysis is a process of inspecting, cleansing, transforming, and modelling data with the goal of discovering useful information, informing conclusions, and supporting decision-making. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, and is used in different business, science, and social science domains. In today's business world, data analysis plays a role in making decisions more scientifically and helping businesses operate more effectively,” opines Jasbir Singh.
“Data interpretation is not easy, hence you need people who understand your data, you need people who have the skills to convert data into insights, and above all you need scalable solutions for enterprise wide applicability and hence the skills to scale,” says Mamta Aggarwal Rajnayak, as she points out how much goes on behind the scenes in order to create value from data. “Without this, your data becomes useless or rather a liability, which just takes up server space and maintenance without adding much value. If the data is not organised well, if users don’t understand how to interpret the data and convert it into insight, then you won’t be able to take any data-driven decision,” she affirms.
Abhilash Shukla agrees that to make data logical and valuable it has to be processed in an understandable way. “The data sources should be free-flowing so that the derivatives they create improve with time. The more data you have, the more opportunity you get to become more logically and statistically accurate. The value lies in the quantity of quality data available for us to evaluate,” he says. Working on a massive dataset also produces data that has traditionally been looked at as, or even deemed, trash. Basically, data in an incomplete form is garbage; similarly, processing structured data can leave incomplete data as a by-product, which is also called garbage. “So if you have data being sourced and flowing in properly, it can turn into a meaningful result that can be compared to oil or black gold,” he explains.
Tools and platforms
There are various tools and platforms claiming to provide the ideal fit for purpose. How should enterprises evaluate and select the right solution?
To Mamta Aggarwal Rajnayak, there is no one solution that fits all, and organisations need to do their own due diligence to find out what best suits their purpose. “There are 3 major cloud platforms but I don’t think any one of them has all the functionalities that an organisation would need. For instance, Google’s GCP clubbed with Google Analytics can provide a solid platform to use cloud services and web analytics but Adobe’s SiteCatalyst still has best in class web journey analytics features,” she opines. According to her, it is important for organisations to first chart out their priority areas, see who the best in class service/platform providers are in each area, perform integration tests to understand the compatibility status, and then take a conscious decision, even if they need to compromise on a few features and functionalities for any reason.
According to Abhilash Shukla, the fundamental aspect of evaluating a platform is actually getting to the bottom of defining the ‘fit for purpose’ statement. Most companies simply prefer opt-in platforms without detailing their core requirements. “For instance, a requirement could just be storage or simply data cleaning, so enterprises should evaluate their real requirements. In most cases, the five key areas enterprises should identify are related to data cleaning, storage, processing, orchestration, and visualisation,” he elaborates. Also, it is important to answer specific questions in conjunction with the platform selection, such as: is the data statistically accurate enough, from the quality standpoint, to serve the purpose? If yes, then how far back does it go, and how recent is it? Finally, the results of the data should be purposeful for users so that actions can be taken.
Big Data and Service Industry
Whereas manufacturing companies and process industries have well defined benefit statements, how do service industries benefit from Big Data Analytics?
“Big Data technologies are connected to data storage, data science, data mining, data visualisation, cloud computing, data analytics, machine learning and deep learning, and on top of these are linked to business intelligence by handling large amounts of data from multiple sources,” says Jasbir Singh, who believes analytical big data technology is used when performance criteria have some target and rapid business decisions are required to be taken based on operational, real-time data/information. This is a common requirement for all, including service industries. “It provides a highly secured environment for numerous applications of Big Data management in sectors like banking, insurance, finance, medical, retail and many more,” he avers.
For the services industry, Mamta Aggarwal Rajnayak cites the example of Uber, which collects information about each driver's driving skills and behaviour, customers' booking patterns and their behaviour towards the driver, routes, toll roads and more. “Using all of this information, Uber can customise the experience both for the driver and customer to define the optimised safe route and also use the existing routes information to multitask and use Uber Delivery Services on a real-time basis to create additional sources of income,” she states.
Abhilash Shukla too believes that service industries have realised the potential of big data. Everything from optimising internal operations to identifying customers and selling the right products has changed the way traditional businesses work. “Today big data analytics helps us to predict and identify customers’ interests with already available data points. We are able to take action on the basis of behaviour patterns and design sales, marketing, and customer servicing in a much more mature manner. Service industries are able to close deals in less time with better customer acquisition costs and are able to optimise resourcing with better resource management,” he elaborates.
Impact of 5G
How does the 5G rollout change the landscape of Digital Transformation?
“There is speculation about how 5G technology will impact Digital Transformation. At the outset, the IIoT model allows many technologies to be deployed at the communication layer, starting from traditional plant field buses, wireless field buses, GPS, Bluetooth and so on. 5G is one more addition to this array of media,” says PV Sivaram, who agrees there is a significant difference as 5G is quite fast. “The impact of 5G will be proportional to the impact of the applications which can leverage this speed advantage. This area is a work-in-progress. We are unable to say that 5G will make sweeping changes,” he adds, cautiously.
Jasbir Singh is of the view that the rollout of 5G networks will facilitate faster and improved digital transformation in the manufacturing and services sectors by virtue of its speed. “5G shall be the driving force behind the digital transformation in enterprise performance. 5G technology is critical in the adoption of a digital transformation strategy for improved digitalisation in the process to reap the benefits of cloud computing using artificial intelligence. Companies invest less time from planning to implementation by using 5G for reaching out to the market,” he says.
According to Mamta Aggarwal Rajnayak, any new technology which can help in better connectivity would help digital transformation. “5G is going to bring in unique features like multi-Gbps data speeds, massive network capacity, and ultra-low latency but this all will only make sense when organisations have invested in technology equipment that enables the convergence of 5G connectivity with distributed computing, and the Internet of Things (IoT) services,” she explains. Thus, for the organisation which has this entire ecosystem established, 5G would help in reducing latency to a great extent. “With 5G technology cutting the latency and providing faster processing, one may say that it is a step in the right direction to come a step closer to the dream of real time analytics, alerts, monitoring, etc.,” she affirms.
Summing up, Abhilash Shukla firmly believes 5G will revolutionise the coming decade. One of the biggest challenges with data is sourcing, which is affected mainly by the bandwidth issues that currently exist. 5G will have a definite impact on IoT when it comes to digital transformation from an overall perspective. Think about the consumer market, which is already demanding 3D gaming and augmented reality. Similarly, in industrial communication, typical industrial applications connected to control systems and proprietary protocols would see a dramatic increase in communication, security, and privacy capabilities because of 5G.
“5G will enable us to exchange data faster, it will help the healthcare industries with vital tracking, and will help in meeting real-time specifications enabling remote streaming consultations and addressing emergencies with no delay. It will help the agriculture industry to analyse and control the equipment which is sending the massive number of datasets from interconnected sensors collecting precision data of crops, plants, and livestock. It can help the automotive industry with live data processing for improved autonomous driving and can help solar and wind power plants by enhancing their smart capabilities,” concludes Abhilash Shukla.
(Note: The responses of various experts featured in this story are their personal views and not necessarily of the companies or organisations they represent. The full interviews are hosted online at https://www.iedcommunications.com/interviews)