Gartner says that Artificial Intelligence (AI) and Machine Learning (ML) have reached a critical tipping point and will increasingly augment and extend virtually every technology-enabled service, thing, or application; so much so that AI/ML will become one of the top five priorities for CIOs. While the cloud offers agility, advances in AI/ML technologies can jointly transform business outcomes.
That said, machine learning can be challenging for both developers and data scientists as they advance on their journeys to build, train, and deploy models into production.
You need to do a number of things to accomplish this mission: collect and prepare historical training data, select an algorithm that fits the data well enough to produce a reasonably accurate model, and then tune the model so that it delivers the right predictions. It is a complex process that requires a great deal of effort.
At AWS re:Invent 2019, I spoke to Bratin Saha, Vice President and General Manager, Machine Learning Services and Engines, AWS, on a variety of issues ranging from the evolution of machine learning to how AWS has all but commoditised it for developers.
Below are excerpts from the conversation:
DynamicCIO (DCIO): What would be your two key messages on Machine Learning as it exists today? And how would you explain the evolution of Machine Learning?
Bratin Saha (BS): If I have to divide my response into two key messages, they would be as follows:
- Machine Learning, as we see it today, is perhaps the biggest contributor to the transformational activities businesses are carrying out in the digital transformation era.
- Amazon SageMaker is the best way to do Machine Learning, both in terms of technology maturity and variety of functionality. It gives you:
  - The lowest Total Cost of Ownership (TCO)
  - A set of features that makes ML practitioners highly productive
That’s true for all the CIOs and technology leaders who are experimenting with the technology to drive their customer experience initiatives. Just as organisations embedded IT in every aspect of business 20 years ago, it is now time to embed ML into all operations, across the company.
In terms of its evolution, a trifecta of factors has made innovation in ML technologies grow exponentially and made it an appealing technology.
To harness the true potential of ML, the first thing AWS did was to make powerful compute widely available. This paved the way for accessing and analysing more data than one could imagine. What took days, or maybe weeks, five or six years ago can now be done in hours.
With the rise of the AWS cloud, data that was previously just generated (and never used) can now be stored, secured, analysed, and used to derive insights.
The number of patents being filed and research papers being published shows the popularity of ML technologies and their practical uses. Amazon (and now AWS) has been investing in Machine Learning for decades. Today, thousands of engineers work on ML tools used to make recommendations to customers, improve the supply chain, build robots for fulfilment centres, and more. Amazon has made this transition successfully, and we are convinced that ML can transform the way businesses operate. That should be an important lesson for CIOs as well. Further, we are now spreading the know-how of deploying ML at scale, which links back to the SageMaker story in my second key message above.
DCIO: Let’s speak of some specific industries, like retail and financial services, where the deployment of ML has resulted in a transformation similar to Amazon’s.
BS: Let me first clarify that AWS provides a set of ML services and tools, which users then use to build and deploy solutions. For example, the US-based business and financial software company Intuit uses AWS’ ML services. Using Amazon SageMaker, the company has reduced model deployment time from 180 days to just a few weeks. By deploying ML models, Intuit can analyse a customer’s transactions for the past year and identify which of them would be tax-deductible.
Similarly, in the field of medicine, GE Healthcare is a large-scale user of SageMaker in the area of radiology.
There are many other customers using ML for personalisation. The digital currency exchange Coinbase uses SageMaker for fraud detection. Change Healthcare, a provider of revenue and payment cycle management and clinical information exchange solutions in the U.S. healthcare system, uses SageMaker for claims processing.
Across every segment, from media, software, and financial services to healthcare and entertainment, you’ll find numerous examples of how ML has helped transform business operations in areas such as fraud detection, personalisation, and customer churn analysis.
DCIO: Let’s isolate one particular use case, customer churn analysis, and discuss how ML tools and solutions apply to it and how they can change the way it has historically been handled (or not handled).
BS: Digitisation has generated a lot of data, and to convert that data into insights we need machine learning tools. In any industry, you have to analyse the full journey: customers are onboarded, look for services or goods, and then (for some reason) leave. As a business, you’d want to know the characteristics or drivers behind this churn.
- Does the customer belong to a particular segment?
- Is it a particular product?
- Did those customers use the product for some time before leaving?
ML-based analysis can help you find whether there is friction in the product, or in a part of the product.
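The kind of churn breakdown described above can be sketched before any model is trained. The following is a minimal illustration in plain Python with hypothetical customer records and field names (not any AWS API); it simply computes churn rates per segment and per product, which is the sort of signal a model would later learn from.

```python
from collections import defaultdict

# Hypothetical customer records: segment, product, tenure, and whether they churned.
customers = [
    {"segment": "SMB", "product": "basic", "tenure_months": 2, "churned": True},
    {"segment": "SMB", "product": "basic", "tenure_months": 3, "churned": True},
    {"segment": "SMB", "product": "pro", "tenure_months": 14, "churned": False},
    {"segment": "Enterprise", "product": "pro", "tenure_months": 24, "churned": False},
    {"segment": "Enterprise", "product": "basic", "tenure_months": 18, "churned": False},
]

def churn_rate_by(records, key):
    """Churn rate grouped by a single attribute (segment, product, ...)."""
    totals, churned = defaultdict(int), defaultdict(int)
    for r in records:
        totals[r[key]] += 1
        churned[r[key]] += r["churned"]  # True counts as 1
    return {k: churned[k] / totals[k] for k in totals}

by_segment = churn_rate_by(customers, "segment")
by_product = churn_rate_by(customers, "product")
```

With this toy data, the SMB segment and the basic product stand out, answering exactly the questions in the list above: which segment, and which product, the churn is concentrated in.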
DCIO: More often than not we see technology being blamed for faulty outcomes. Are organisations really able to use ML as systematically as the product companies tell them to?
BS: I will again go back to the example of Amazon.com. The way it transformed itself, and the way it used machine learning, set the right precedent for the industry.
Apart from that, there are many other use cases for Amazon SageMaker and AWS AI/ML services. Over the last two years, the number of ML developers using our services has increased 7x. That means a lot of customers are able to derive value from machine learning. We have created tools that make machine learning more accessible. Doing ML today, using something like SageMaker, is far easier than it was a couple of years ago. But you still need the right skill sets, ML practitioners, and the right kind of data to succeed.
Companies need to get into the mindset of organising their data in the right way, and they should have some ML practitioners to get going. The most important thing is to get started. To begin experimenting with ML, you need a use case to go after. Since ML is an iterative process, you need to be patient with it and be ready to reach many milestones before seeing success.
DCIO: On those three points, the right data, the right skill sets, and use case iteration, let’s start with the first: what is the right data?
BS: It is important to have data on the use case you want to pursue. For example, if you are doing churn analysis, you need data on which customers are coming in, which parts of the product they are using, how long they are using them, which segments of customers are going away, and so on. At the end of the day, ML will only draw insights from the data that’s fed into the system.
Similarly, if you want to do fraud prevention, you need data on the key characteristics of the customers involved in fraud, and you should be able to collect data around those key parameters.
You need to work backwards: first determine a use case, and then collect the data that supports it so you can establish patterns.
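Working backwards from a hypothetical fraud use case might look like the sketch below: decide the prediction target first, then derive only the features that support it from raw events. All field names here are illustrative assumptions, not any real schema.

```python
# Step 1: define the prediction target for the chosen use case.
TARGET = "is_fraud"

# Step 2: raw events as they might arrive; only some fields matter to this use case.
raw_events = [
    {"amount": 25.0, "bill_country": "US", "ip_country": "US",
     "txns_last_hour": 1, "is_fraud": False},
    {"amount": 900.0, "bill_country": "US", "ip_country": "RO",
     "txns_last_hour": 7, "is_fraud": True},
]

def to_training_row(event):
    """Derive the feature vector and label for one raw event."""
    features = {
        "amount": event["amount"],
        # A derived feature: billing country differing from IP country.
        "country_mismatch": event["bill_country"] != event["ip_country"],
        "txns_last_hour": event["txns_last_hour"],
    }
    return features, event[TARGET]

dataset = [to_training_row(e) for e in raw_events]
```

The point is the ordering: the target and features were chosen for the use case before any collection or modelling began, which is the "start backwards" discipline described above.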
DCIO: How do you experiment with the application of machine learning?
BS: The ideal way for an organisation to experiment with machine learning is offline. You’d typically take the data offline, train the machine learning model on a training set, check it against a validation set, and probably do some A/B testing as well. Only after training and validation, once you have gained confidence, do you send more data into the production environment and achieve scale. One of the services AWS released recently is Augmented AI, which lets you put a human in the loop: as you are testing, you can have humans review some of the predictions so that you gain confidence in how the model behaves. Deploying machine learning is no different from deploying any other technology; it follows the key steps of training, trial, and validation.
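The offline workflow described above, hold out a validation set, compare candidate models on it, and only promote a winner, can be sketched in a few lines of plain Python. The "models" here are deliberately trivial threshold rules; this is an illustration of the process, not of any particular algorithm or AWS service.

```python
import random

# Hypothetical labelled data: (feature value, label). The true rule is x > 50.
data = [(x, x > 50) for x in range(100)]

random.seed(0)
random.shuffle(data)
split = int(len(data) * 0.8)
train, validation = data[:split], data[split:]  # hold out a validation set

def make_threshold_model(threshold):
    """A toy 'model': predicts True when the feature exceeds the threshold."""
    return lambda x: x > threshold

def accuracy(model, rows):
    return sum(model(x) == y for x, y in rows) / len(rows)

# Offline comparison of two candidates on held-out data; only the winner
# would go on to see production traffic.
model_a = make_threshold_model(40)
model_b = make_threshold_model(50)
best = max([model_a, model_b], key=lambda m: accuracy(m, validation))
```

Only after this offline comparison (and, as the answer notes, possibly a human review of sample predictions) would the chosen model be exposed to production data.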
DCIO: If we take the example of fraud detection, how will machine learning models help when they have no information on new methods of fraud?
BS: One of two things happens: either you have already trained your model so that it is able to correlate the new patterns, or you’ll have to retrain your models. An exciting feature added to SageMaker this year is model monitoring, which enables an organisation to guard against these cases. Machine learning doesn’t work on a one-time training of models. Once you have deployed a model, you also have to monitor it to see whether it is behaving according to your needs. With SageMaker Model Monitor, you can easily detect whether your model is drifting away from its intended behaviour. If the drift goes beyond a particular threshold, you know that the data used for prediction has become different from the data used for training. That’s when you retrain your models. That’s why I said earlier that machine learning used to be difficult and hard to adopt; with tools like SageMaker, there is now a lot of infrastructure that makes it easy for users to train, deploy, and continuously monitor their machine learning models.
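The drift idea, compare live traffic against the training baseline and alert past a threshold, can be illustrated with a very simple statistic. This sketch (plain Python, not the Model Monitor API) measures how far the mean of a live feature has shifted from the training mean, in units of the training standard deviation; real monitors use richer distribution comparisons.

```python
from statistics import mean, stdev

def drift_score(baseline, live):
    """Shift of the live mean from the baseline mean, in baseline std units."""
    return abs(mean(live) - mean(baseline)) / stdev(baseline)

THRESHOLD = 3.0  # illustrative; in practice chosen per feature

# Hypothetical transaction amounts: the training baseline vs. recent live traffic.
training_amounts = [10, 12, 11, 13, 9, 10, 12, 11]
live_amounts = [52, 48, 50, 49]  # live traffic looks very different

needs_retraining = drift_score(training_amounts, live_amounts) > THRESHOLD
```

When `needs_retraining` flips to true, the prediction data has become different from the training data, which is exactly the retraining trigger described in the answer.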
DCIO: How are products like Amazon SageMaker Studio and SageMaker Experiments going to help enterprises?
BS: SageMaker in general has evolved a lot from the time it was launched. Now we have started adding many more tools to help developers become more productive. When organisations start experimenting with machine learning, there is an infrastructure aspect to consider: things like auto-scaling, setting up clusters, taking clusters down, and working with serverless. Now we are adding more aspects of developer productivity. For example, SageMaker Studio gives you a lot of data and metrics in a dashboard, a single pane of glass, so you don’t have to switch between a variety of tools. The same goes for SageMaker Experiments: it helps you organise, track, compare, and evaluate machine learning (ML) experiments and model versions, and it integrates well with Model Monitor to show which models are throwing errors. Both tools are extremely helpful in reducing undifferentiated heavy lifting.
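The organise-track-compare workflow that SageMaker Experiments provides can be sketched in miniature: log each trial with its parameters and metric, then sort to compare. This is an illustrative stand-in in plain Python, with made-up trial names and numbers, not the SageMaker Experiments API.

```python
# Illustrative experiment log: each trial records its parameters and metric.
trials = []

def log_trial(name, params, metric):
    trials.append({"name": name, "params": params, "metric": metric})

# Hypothetical trials from a model-tuning session.
log_trial("xgb-depth-3", {"max_depth": 3}, metric=0.89)
log_trial("xgb-depth-6", {"max_depth": 6}, metric=0.93)
log_trial("linear", {"l2": 0.1}, metric=0.85)

# Compare and evaluate: sort trials by metric, best first.
leaderboard = sorted(trials, key=lambda t: t["metric"], reverse=True)
best_trial = leaderboard[0]
```

The value of a real experiment tracker is that this bookkeeping, which teams otherwise improvise in spreadsheets, is captured automatically alongside the model artefacts.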
DCIO: There could be a scenario where a customer is willing to experiment, has identified a use case, and has the right data sets, but is lacking in skills. How does AWS help in this case?
BS: Amazon SageMaker Autopilot is the answer here. Even if an organisation doesn’t have experienced data scientists, Autopilot can generate a model on its own. You bring your data and specify what kind of prediction you want to make, and Autopilot takes care of the rest, automatically exploring different solutions to find the best model. It covers regression and classification, which spans many use cases. For enterprises, SageMaker Autopilot is a good starting point; it significantly reduces the barrier to entry.
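The core loop behind automated model search, try several candidates and keep whichever scores best on held-out data, can be shown in a toy form. The candidates below are trivial rules in plain Python; Autopilot explores real algorithms, feature pipelines, and hyperparameters, so treat this purely as an illustration of the idea.

```python
# Hypothetical held-out data: (feature value, label). The true rule is x < 3.
validation = [(1, True), (2, True), (3, False), (4, False)]

# Candidate "models": stand-ins for the algorithms an AutoML system would try.
candidates = {
    "always_true": lambda x: True,
    "lt_3": lambda x: x < 3,
    "even": lambda x: x % 2 == 0,
}

def score(model):
    """Fraction of validation examples the model predicts correctly."""
    return sum(model(x) == y for x, y in validation) / len(validation)

# Automated search: evaluate every candidate and keep the best scorer.
best_name = max(candidates, key=lambda name: score(candidates[name]))
```

The user supplies only the data and the prediction target; the search over candidates is what the tooling automates, which is why it lowers the barrier to entry for teams without deep ML expertise.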