This article is intended for users who rely on machine learning solutions offered by third-party vendors. It applies to platforms, dashboards, traditional software, or even external pieces of code that are too time-consuming to modify. One of the goals is to turn such systems into explainable AI.
The Problem I Faced
I frequently write code to solve complicated problems, such as in my recent shape-fitting article. This involves integrating libraries or external code written by various authors. Many times, the third-party functions deal with optimization problems. I call this my “back end”. Typically, it consists of well-known algorithms that work more or less decently depending on the input data. I don’t have the time to write such algorithms from scratch. First, existing solutions are good enough and have stood the test of time. Second, I would likely reinvent the wheel and come up with a solution inferior to what already exists. Finally, the code in question is usually not simple. Sometimes, I don’t even have real access to it: this happens when it is buried in some library.
In short, my “back end” code is essentially an external black box. Here, I describe how to get the most out of such systems. In particular, I discuss how to better understand how they work and how to test them on synthetic data to identify their limits and their strengths. In some ways, this is about reverse-engineering your black box. Of course, the first step is to get the black box that best fits your needs. You want one that can handle your future projects as well. This requires testing different products offered by different vendors. Work with your short list of vendors to perform blind tests and a proof of concept before purchasing a solution. Watch out for the quality of customer support and the length of the contract.
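To make the idea of testing on synthetic data concrete, here is a minimal sketch in Python. The function `black_box_fit` is a hypothetical stand-in for a vendor routine (here, a plain least-squares line fit), and `probe`, the noise levels, and the true parameters are all assumptions chosen for illustration. The point is only that when you generate the data yourself, you know the correct answer and can measure how far the black box drifts from it.

```python
import random

def black_box_fit(xs, ys):
    """Hypothetical stand-in for a vendor routine: least-squares line fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def probe(fit, true_slope, true_intercept, noise_levels, n=200, seed=0):
    """Run `fit` on synthetic data with known parameters.

    Returns a dict mapping each noise level to the absolute slope error.
    """
    rng = random.Random(seed)
    report = {}
    for sigma in noise_levels:
        xs = [rng.uniform(0, 10) for _ in range(n)]
        ys = [true_slope * x + true_intercept + rng.gauss(0, sigma) for x in xs]
        slope, _ = fit(xs, ys)
        report[sigma] = abs(slope - true_slope)
    return report

report = probe(black_box_fit, 2.0, -1.0, noise_levels=[0.0, 0.5, 5.0])
for sigma, err in report.items():
    print(f"noise sd={sigma}: slope error={err:.4f}")
```

The same harness can be pointed at any callable with the same signature, so several candidate black boxes can be compared on identical synthetic inputs.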
Case Studies
In one case, I used a black box despite its known limitations. It is good at what it does, but I wish it were not limited to one special kind of data. I stuck with it for lack of time. In another situation, I was able to put a wrapper around the black box, but I left the core of the algorithm unchanged. Again, it is good at what it does. In a third case, because of the peculiarities of my data, the classic method implemented in the black box resulted in poor performance, total failure or, on occasion, excellent results. I had to explore an alternative. I discuss the last example first.
When the Black Box Does Not Work on Your Data
I selected classic least squares to estimate the parameters of a non-periodic time series consisting of a sum of periodic terms. I discuss the problem in section 3.3 of my article “Machine Learning Cloud Regression: The Swiss Army Knife of Optimization”. My problem is known to be ill-conditioned: solutions are numerically unstable, and the objective function has many local minima. So I can’t blame the black box for failing. Yet I still need a solution because one does exist.
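A toy illustration of why this kind of problem has many local minima, sketched in Python. The model `cos(t) + cos(w * t)` with a single unknown frequency `w`, the sampling times, and the search grid are assumptions made for this example; they are not the model from the article cited above. The script scans the least-squares error over a grid of `w` values and lists every grid point that is lower than both of its neighbors.

```python
import math

def sse(w, ts, ys):
    """Sum of squared errors for the toy model cos(t) + cos(w * t)."""
    return sum((math.cos(t) + math.cos(w * t) - y) ** 2 for t, y in zip(ts, ys))

# Synthetic, noise-free observations with a known frequency (assumed value).
true_w = math.sqrt(2)
ts = [0.5 * k for k in range(100)]
ys = [math.cos(t) + math.cos(true_w * t) for t in ts]

# Scan the error over a grid of candidate frequencies from 0.5 to 3.0.
grid = [0.5 + 0.005 * k for k in range(501)]
errors = [sse(w, ts, ys) for w in grid]

# A grid point is a local minimum if it beats both neighbors.
local_minima = [
    grid[i]
    for i in range(1, len(grid) - 1)
    if errors[i] < errors[i - 1] and errors[i] < errors[i + 1]
]
best_w = grid[errors.index(min(errors))]

print(f"local minima found on the grid: {len(local_minima)}")
print(f"grid point with the lowest error: w = {best_w:.3f}")
```

A local optimizer started near any of these dips other than the one at the true frequency would settle there, which is one way a standard least-squares routine can fail on such data.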
As a temporary fix, I abandoned the black box and ran Monte Carlo simulations instead. This is very slow, but it leads to the solution in all cases. The method recommended in the black box documentation performed worse than the outdated, non-optimized black box version that you should avoid. The …….
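For readers unfamiliar with the approach, a Monte Carlo search can be sketched as follows in Python. This is a deliberately simple random search on an assumed one-parameter toy model, `cos(t) + cos(w * t)`, and not the procedure actually used for the problem described here; the function `monte_carlo_fit`, the search interval, and the number of trials are all hypothetical choices. It draws candidate frequencies at random and keeps the one with the lowest squared error, so it does not get trapped in a local minimum, at the cost of many function evaluations.

```python
import math
import random

def sse(w, ts, ys):
    """Sum of squared errors for the toy model cos(t) + cos(w * t)."""
    return sum((math.cos(t) + math.cos(w * t) - y) ** 2 for t, y in zip(ts, ys))

def monte_carlo_fit(ts, ys, low, high, n_trials, seed=0):
    """Sample candidate frequencies uniformly in [low, high]; keep the best."""
    rng = random.Random(seed)
    best_w, best_err = None, float("inf")
    for _ in range(n_trials):
        w = rng.uniform(low, high)
        err = sse(w, ts, ys)
        if err < best_err:
            best_w, best_err = w, err
    return best_w, best_err

# Synthetic, noise-free observations with a known frequency (assumed value).
true_w = math.sqrt(2)
ts = [0.5 * k for k in range(100)]
ys = [math.cos(t) + math.cos(true_w * t) for t in ts]

best_w, best_err = monte_carlo_fit(ts, ys, low=0.5, high=3.0, n_trials=5000)
print(f"true w = {true_w:.4f}, estimated w = {best_w:.4f}, error = {best_err:.6f}")
```

The number of trials needed grows quickly with the number of unknown parameters, which is consistent with the slowness noted above.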
Source: https://www.datasciencecentral.com/how-to-make-black-box-systems-more-transparent/