Businesses today face an existential question: how do you increase the quality of products and services while driving down costs? Increasingly, industry captains and senior executives realize that the answer lies in advanced technology, in particular AI's machine learning (ML). ML creates digital labor that can:

· Automate business processes,

· Reduce the cost of those processes, and

· Enhance quality through consistency of outcomes.

Those businesses that don’t adopt digital labor will become uncompetitive.

Little wonder, then, that ML is being seriously considered across many industry segments. While small, focused projects are benefiting from ML, there are instances where it hasn't delivered on its promise, especially for larger, more ambitious business process transformations. Why? Because a couple of prominent issues are holding it back. The rest of this article discusses those issues and how to overcome them to realize the full potential of intelligent machines.

First and foremost, ML has to overcome a major challenge: how to make the intelligent machine transparent. ML has gotten a bad rap for being incredibly complex and opaque. It does not explain its predictions: what were the important reasons the machine predicted the way it did? Without transparency, the subject matter experts who own the business process will not trust the machine's execution of that process. Lack of trust means limited automation. Lack of transparency also means the experts never get to appreciate the insights the machine has mined from the data, and it damages executives' confidence in the ability of ML projects to deliver the ROI promised in the business case. Subsequent ML projects don't get funded. Unfortunate!

How do we overcome this challenge? Make ML more transparent. Efforts have been made recently in this regard; explainable AI comes to mind immediately. Prediction explanations are being implemented by vendors large and small, most making progress based on preliminary research in academia and R&D labs. Much more remains to be done. A few startups have shown promise, making their ML explain predictions well, especially for classification problems.

For example, some ML vendors, from the established Google Analytics to newer entrants like H2O, base transparency on academic research called LIME; EazyML takes this a step further with explainable AI (an intuitive format with increased sensitivity to the prediction) and traceable AI (a detailed account of how ML processes your data).
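To make the idea behind LIME concrete, here is a minimal pure-Python sketch (not any vendor's implementation): perturb the input around one instance, weight the perturbed samples by their proximity to that instance, and fit a weighted linear slope per feature as its local importance. The `black_box` function is a hypothetical opaque model invented for illustration.

```python
import math
import random

def black_box(x):
    # Hypothetical opaque model: a logistic score over two features.
    return 1 / (1 + math.exp(-(0.8 * x[0] - 0.5 * x[1])))

def local_importance(f, instance, n=400, width=0.5, seed=7):
    """LIME-style sketch: perturb one feature at a time, weight the samples
    by proximity to the instance, and fit a weighted linear slope per feature."""
    rng = random.Random(seed)
    slopes = []
    for j in range(len(instance)):
        xs, ys, ws = [], [], []
        for _ in range(n):
            x = list(instance)
            x[j] += rng.gauss(0, width)
            w = math.exp(-((x[j] - instance[j]) ** 2) / (2 * width ** 2))
            xs.append(x[j]); ys.append(f(x)); ws.append(w)
        # Weighted least-squares slope of prediction vs. the perturbed feature.
        total = sum(ws)
        mx = sum(w * v for w, v in zip(ws, xs)) / total
        my = sum(w * v for w, v in zip(ws, ys)) / total
        num = sum(w * (v - mx) * (y - my) for w, v, y in zip(ws, xs, ys))
        den = sum(w * (v - mx) ** 2 for w, v in zip(ws, xs))
        slopes.append(num / den)
    return slopes

slopes = local_importance(black_box, [0.0, 0.0])
# A positive slope means the feature pushes this prediction up; negative, down.
```

Real LIME fits a joint sparse linear surrogate over all features at once; this per-feature variant only illustrates the perturb, weight, and fit loop that makes an opaque model locally interpretable.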

Another major challenge is ML's inability to reason over and derive intelligence from textual data, which handicaps the accuracy of predictions. Let's explain. Suppose a bank is trying to stem attrition of its premium customers. It has a ton of data about those customers: demographics, types of financial products, volume of transactions, amounts transacted, and so on. A particular customer has an issue, calls Customer Care, doesn't get the issue resolved, is frustrated, and decides to do more banking at a competitor (soft churn). The bank is stumped: why has the customer's account activity gone down so considerably? The ML model wasn't able to predict the churn proactively; its alert came late. The model failed because it didn't consider the Customer Care records when analyzing the customer. ML didn't have the capability to analyze the natural language (English, for a US bank) exchanges between the customer and the contact center agent.

ML has to consider all information, numeric or textual, to predict correctly. ML practitioners dump all types of data on the ML platform, let the data objectively tell ML which variables matter and which don't, and ML accordingly constructs the final model. Unfortunately, most ML platforms can't process unstructured (textual) data. How can we overcome this limitation? There's hope as natural language processing (NLP) technology matures, though only a small, select group of vendors have ML platforms that can process both numeric and textual data. Research, GloVe and LDA in particular, has given us some facility with NLP semantics; EazyML, thanks to its IPsoft lineage, has invented Concept Extraction to harness the predictive power of textual data.
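To illustrate how textual records could feed the bank's churn model, here is a minimal sketch, assuming a hypothetical hand-picked list of frustration keywords: it converts a free-text Customer Care note into numeric features via simple keyword counts, a crude stand-in for the richer semantics that GloVe embeddings or LDA topics provide.

```python
import re

# Hypothetical keyword list for illustration; a real system would learn
# semantics (e.g., GloVe embeddings or LDA topics) rather than hand-pick terms.
FRUSTRATION_TERMS = {"unresolved", "frustrated", "escalate", "complaint", "cancel"}

def text_features(note):
    """Turn a free-text Customer Care note into numeric model features."""
    tokens = re.findall(r"[a-z']+", note.lower())
    hits = sum(1 for t in tokens if t in FRUSTRATION_TERMS)
    return {
        "n_tokens": len(tokens),
        "frustration_hits": hits,
        "frustration_rate": hits / max(len(tokens), 1),
    }

row = text_features("Customer frustrated; issue unresolved after two calls, may cancel.")
```

A churn model trained on the numeric columns plus features like `frustration_rate` would have seen this customer's dissatisfaction in the Customer Care records, instead of waiting for account activity to drop.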

Last, but not least, is the challenge of inadequate training data. Most ML is supervised, which means the intelligent machine uses data to learn about your problem and become an expert in your domain, ready to predict. The machine's training determines how well it will execute your business process, and that depends on the quality of the training data. As ML practitioners have painfully discovered, training data is seldom adequate, suffering from a variety of issues, chief among them:

· Incompleteness — The machine is learning to predict an outcome in your process, but not all the input variables that influence that outcome are compiled in the training data. The machine's learning is incomplete, its predictions suspect. To add to the woes, the data is often skinny, with nowhere near enough records for the machine to learn well.

· Bias — The training data is skewed, capturing one set of events in many records at the expense of others, compromising the machine's all-round learning. Sometimes unfortunate coincidences (for instance, correlations among supposedly independent variables) that don't reflect reality inadvertently creep in.
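Both issues above can be checked cheaply before training. Here is a minimal sketch (illustrative, not a full data-quality pipeline): measure class balance to spot skew, and compute the Pearson correlation between pairs of supposedly independent variables to flag coincidental relationships.

```python
import math

def class_balance(labels):
    """Fraction of records per outcome class; flags skewed training data."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return {y: c / len(labels) for y, c in counts.items()}

def pearson(xs, ys):
    """Correlation between two feature columns; values near +/-1 between
    supposedly independent variables deserve a closer look."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    var_x = sum((a - mx) ** 2 for a in xs)
    var_y = sum((b - my) ** 2 for b in ys)
    return cov / math.sqrt(var_x * var_y)

balance = class_balance(["stay"] * 95 + ["churn"] * 5)  # heavily skewed: 5% churn
r = pearson([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])          # perfectly correlated pair
```

Simple audits like these won't fix incompleteness or bias on their own, but they tell you before training whether the machine is about to learn from skewed or coincidental data.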

This continues to confound ML experts. Businesses looking to benefit from ML must build the infrastructure to store data, tons of it, about their operations, and make it accessible under sound data governance. Discipline around ML best practices is required. Research in this area is hot.

To conclude, ML's digital labor is here. These intelligent machines will be our colleagues in the enterprise, working collaboratively with their human counterparts. The kinks that stand in the way of this vision will be ironed out as ML platforms become increasingly sophisticated. Start using ML in your enterprise, in small ways at first, to mitigate risk and see how effective it is for your environment.