A lot goes on behind the scenes in order to create value from data
Published on: Wednesday 04-05-2022
Mamta Aggarwal Rajnayak, Head – AiDa (Enterprise wide AI-ML platform), American Express.
What exactly is Big Data? What actually is the difference between Big Data and Plain Data?
Data, or plain data as mentioned here, is nothing but a set of quantitative and qualitative information stored digitally or otherwise. This information is processed by analysis tools and software to convert it into actionable insights. Big data is also data after all, but the major difference comes from the 3 Vs, i.e., Volume, Variety and Velocity. The data volume is in petabytes, exabytes, or zettabytes, instead of gigabytes and terabytes. So it is a lot of data by size, it holds a mixed variety of information, and by its very nature it accumulates extremely rapidly.
For instance, the clickstream data of a renowned retailer’s ecommerce site would capture every click that a customer makes, how much time they spend on each page or item, and so on. Imagine the level of information collected. Think about Amazon: it ships 1.6 million packages every day, and with a conversion rate of 9.87%, that implies traffic of about 16 million customers every day. Imagine capturing the entire clickstream journey of each of these 16 million customers, which results in petabytes of data every day. Traditional data handling techniques stop working on data at this scale, and we need big data tools and technologies like Hive, Spark, Hadoop, etc.
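The traffic figure quoted above can be checked with a short back-of-the-envelope calculation. The Python sketch below is illustrative only: it uses the two numbers from the answer and assumes, for simplicity, that each shipped package corresponds to one converted visit. The function name `implied_daily_visitors` is hypothetical.

```python
# Figures as quoted in the interview; they are not independently verified here.
PACKAGES_PER_DAY = 1_600_000
CONVERSION_RATE = 0.0987  # 9.87% of visits end in a purchase


def implied_daily_visitors(orders: int, rate: float) -> float:
    """Visits needed to produce `orders` purchases at the given conversion rate.

    Assumption: one shipped package equals one converted visit.
    """
    if not 0 < rate <= 1:
        raise ValueError("rate must be in the interval (0, 1]")
    return orders / rate


visitors = implied_daily_visitors(PACKAGES_PER_DAY, CONVERSION_RATE)
print(f"Implied daily visitors: {visitors / 1e6:.1f} million")
```

The result is a little over 16 million, which is consistent with the rounded figure in the answer.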
What is the relationship between Big Data and Digital Transformation?
The majority of companies are signing up for enterprise-wide Digital Transformation, which means the adoption of digital technologies in everything that they do. They now want to take data-backed decisions instead of going by experience and hunch. In my previous job, I was instrumental in helping our clients on their digital transformation journey. But in order to do that, they need a strong data strategy. They need to create a bedrock of data that is clean, with well-documented lineage, the right governance processes, established data marketplaces, consistent data definitions, a high reusability quotient, one single source of truth for the entire organisation and, above all, a data-driven mindset. The need for digital transformation is increasing rapidly as organisations generate ever more big data. Organisations are realising that data can really help them make the right, no-regret decisions. With this increasing demand for digital transformation, there is a growing shortage of data and AI transformation skills: Data Engineers, DevOps specialists, Testers, ML Engineers, Data Modellers, ML Modellers, MLOps specialists, ML testers, visualisation experts, etc. The need for digital transformation and these skills is recognised at the C-Suite level. CEOs are giving their personal commitment to developing an AI and ML strategy for their organisations and putting data at the centre of all decision making.
Data has been compared variously to oil (black gold) and also garbage. How does one make sense of the two extremes?
Data contains a variety of information which, if used wisely, can create tonnes of value for the organisation. There is a famous saying in the retail industry about customers: You are what you buy! So by merely looking at the kind of items customers buy from my stores, I would know a great deal about them and could give them personalised services and offers. Hence data is at times referred to as the ‘New Oil’ or ‘Black Gold’. However, a lot goes on behind the scenes in order to create such value from data. You need to organise your data in such a way that it is easy to extract the information whenever required, and there should be consistency across the organisation on data usage so that everyone is speaking the same data language.
Data interpretation is not easy. You need people who understand your data, you need people who have the skills to convert data into insights, and above all you need scalable solutions for enterprise-wide applicability, and hence the skills to scale. Without any one of these, your data becomes useless, or in fact a liability that just takes up server space and maintenance without adding much value. If the data is not organised well, or if users don’t understand how to interpret the data and convert it into insight, then you won’t be able to take any data-driven decision. Even if you do happen to use the data without really understanding it, the result will be junk in, junk out. Thus, some people refer to data as ‘Garbage’ as well.
There are various tools and platforms claiming to provide the ideal fit for purpose. How should enterprises evaluate and select the right solution?
In my opinion there is no one-size-fits-all solution; organisations need to do their own due diligence to find out what best suits their purpose. For instance, there are 3 major cloud platforms, but I don’t think any one of them has all the functionalities that an organisation would need. Google’s GCP combined with Google Analytics can provide a solid platform for cloud services and web analytics, but Adobe’s SiteCatalyst still has best-in-class web journey analytics features. So it is important for organisations to first chart out their priority areas, see who the best-in-class service and platform providers in those areas are, perform integration tests to understand compatibility, and then take a conscious decision, even if they need to compromise on a few features and functionalities for any reason.
Platform and solution providers are also offering small PoCs to test the waters. In my previous organisation, I did a lot of digital transformation PoCs with a few of my clients to help them understand what solution would work for them. It usually becomes a tripartite PoC, where the platform/solution provider assigns someone from their organisation to assist and educate on the use of their product during the PoC, and a third party like Accenture helps the client with an unbiased fit-for-purpose recommendation.
Whereas manufacturing companies and process industries have well defined benefit statements, how do service industries benefit from Big Data Analytics?
I don’t see very different benefit statements for different industries. The idea is to capture data about anything and everything happening in the organisation in an organised way and use that data to come up with a data-driven strategy. Take a services organisation like Uber: they collect information about each driver's driving skills and behaviour, their customers' booking patterns, how respectfully customers treat the driver, routes, toll roads, and much more. Using all of this information, Uber can customise the experience for both the driver and the customer, define the optimised safe route, and also use the existing route information to multitask and run Uber Delivery Services in real time to create additional sources of income. Similarly, consulting organisations may use information about their clients: what kind of services they really seek, how much time it takes for a project to convert, whether there are any payment-related issues with a client, and so on. All of this data can be used to create a well-rounded pitch strategy. When I was at Adobe, I supported their DMS marketing and sales organisation in using the data collected on their B2B clients to help with retention, cross-sell, upsell, etc.
How does the 5G rollout change the landscape of Digital Transformation?
Any new technology that improves connectivity would help digital transformation. 5G is going to bring in unique features like multi-Gbps data speeds, massive network capacity, and ultra-low latency, but all of this will only make sense when organisations have invested in technology equipment that enables the convergence of 5G connectivity with distributed computing and Internet of Things (IoT) services. Thus, for an organisation that has this entire ecosystem established, 5G would help reduce latency to a great extent. Everyone talks about real-time analytics and alert monitoring systems, but when you look in depth, these turn out to be near-real-time rather than actual real-time solutions. With 5G technology cutting latency and enabling faster processing, it is fair to say that it brings us a step closer to the dream of real-time analytics, alerts, monitoring, etc.
With more than 15 years of experience, Mamta Aggarwal Rajnayak heads AiDa (Enterprise wide AI-ML platform) at American Express, which caters to thousands of users in the organisation for their daily AI-ML requirements. She works with a community of enthusiastic and talented product managers, tech specialists, data scientists, ML engineers, visualisation experts and others who leverage industry-leading technologies to keep American Express ahead of its competition. She has several patents, offerings, and research papers to her name. Mamta loves sharing what she has learnt and developed over all these years in the analytics industry.
Mamta has featured on the 40 Under 40 Data Scientists 2022 list by AIM, the 3AI President's Honour 2022, and The AI Maker 150 - 3AI 2021. She also figured on the Top 11 Women in AI Leadership 2020 - AIM and the Top AI & Analytics Customer Experience Leader 2020 - 3AI lists.
In her free time, Mamta likes to explore new places, solve mathematics problems with her son and rejuvenate herself by practising yoga.
(The views expressed in interviews are personal and not necessarily those of the organisations represented.)