Supervised Machine Learning for the Healthcare Industry: Bacteria Identification & Classification
Did you know that AI can identify and categorize bacteria? The results can be striking: we trained a model that detects bacteria with 99% accuracy.
How does a successful Machine Learning project start?
Any successful Machine Learning project starts with clearly defined business goals. The more information is gathered at this stage, and the more of the right questions are asked and answered, the better the chances of reaching the goal the client has set.
After all, software development starts with effective business analysis, not with datasets or algorithms, and our most recent case of AI in healthcare proves that. Before launching a project with the Belgian company MicroTechnix, we conducted multiple interviews with the client to understand this non-trivial request in detail.
The overall goal of the project was to create software that automates bacteria detection and reduces the number of errors. Developing ML algorithms for bacteria detection and categorization would eventually take human judgment out of routine decision making in laboratories, where the cost of a mistake is too high to ignore. One clear advantage of using ML for tasks as complex as bacteria detection is that, once a mistake has been found and corrected in the training data, the model can be retrained so that it is much less likely to repeat it.
Another business goal was to minimize false positive and false negative results, which are very time-consuming to verify, and so free the client's time for more creative and demanding tasks.
A more precise task was to make the program recognize specific strains of bacteria in Petri dishes on the basis of the provided dataset. To make this happen, the Softengi team had to label the existing datasets and train the model accordingly. Did we reach the results the customer wanted? Let’s find out.
Among the numerous factors that influence the effectiveness of a new algorithm, a comprehensive dataset is probably the most important. Labeling objects was an exhausting and time-consuming task that demanded the full attention of the junior data scientists assigned to it. Many bacteria in the photos looked very similar or almost identical to them, and sometimes it was difficult to tell a bacterium from a speck of dust.
We managed to label as many as 7,000 objects in the photos, giving the neural network a dataset large enough for detailed categorization. The labeling was based on several key criteria: the size of the bacteria, their color, and their location in the photo.
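One way to picture a single labeled object is as a small record that holds the three criteria named above. The sketch below is only an illustration: the field names, the pixel units, the bounding-box representation, and the `area` helper are all assumptions, not the annotation format actually used in the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledObject:
    """One hand-labeled object on a Petri dish photo (hypothetical schema)."""
    x_min: int          # bounding box in pixels: location in the photo
    y_min: int
    x_max: int
    y_max: int
    mean_color: tuple   # average (R, G, B) inside the box
    label: str          # class name assigned by the annotator

    def area(self) -> int:
        """Size of the object, approximated by its bounding-box area."""
        return (self.x_max - self.x_min) * (self.y_max - self.y_min)


# Made-up example values, for illustration only.
example = LabeledObject(120, 80, 150, 112, (201, 187, 140), "bacteria")
print(example.area())
```

Keeping size, color, and location in one record makes it straightforward to export the annotations later into whatever format a detection framework expects.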
Using the labeled data, we built a model that classifies bacteria automatically. We used Faster R-CNN with a ResNet-101 backbone, an architecture that has proved to be one of the most effective in this domain.
To make the model work, we chose TensorFlow, OpenCV, and Keras, as they are robust and effective tools for image recognition. Image recognition is just one of the many applications of Machine Learning, and we used it extensively to reach our goal. AI in healthcare promises a lot of benefits, and AI-powered medical equipment is just one of them.
Achieving Business Goals. Did We Manage?
When analyzing photos with bacteria, we created three classes of results: false, true, and unknown. This categorization turned out to be very efficient in terms of time, because we managed to identify unknown results accurately, and that was the only category that required further research. It covered only 5% of the photos: those in which bacteria merged with the dish margins or a piece of dust was present.
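The article does not say how a photo ended up in one of the three classes. One common way to get such a split is to threshold the model's confidence score, so that only ambiguous cases are sent to a person. The sketch below assumes that approach; the `triage` function, both thresholds, and the sample scores are hypothetical.

```python
def triage(score: float, low: float = 0.2, high: float = 0.8) -> str:
    """Map a model confidence score in [0, 1] to one of three classes.

    The thresholds are illustrative assumptions, not project values.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in [0, 1]")
    if score >= high:
        return "true"
    if score <= low:
        return "false"
    return "unknown"  # ambiguous: needs further research by a human


# Made-up confidence scores for four photos.
scores = [0.97, 0.03, 0.55, 0.91]
results = [triage(s) for s in scores]
needs_review = [s for s, r in zip(scores, results) if r == "unknown"]
print(results, needs_review)
```

Moving the two thresholds closer together shrinks the "unknown" share, at the cost of more confident mistakes, so in practice they would be tuned on validation data.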
As a result, we achieved 99% accuracy in bacteria identification on this project, along with the ability to differentiate between three distinct classes.
From Supervised To Unsupervised Machine Learning
In the end, the team decided that the best outcome for the client would be a fully functional unsupervised Machine Learning solution, so that the model would be applicable to all sorts of bacteria and would not require lengthy data labeling.
The customer received the source code together with a ready-to-use convolutional neural network, delivered as a model that can be retrained on new data.
Unsupervised machine learning looks like magic to those who don’t know how it works. With its help, it is possible to find patterns in a dataset without any reference to known or labeled outcomes. This type of machine learning can be an excellent tool for determining the underlying structure of the data.
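To make the idea of finding patterns without labels concrete, here is a minimal k-means clustering sketch in plain Python. K-means is used purely as a familiar example of unsupervised learning; the article does not name the algorithm chosen for the project, and the two-dimensional (size, brightness) features below are invented for illustration.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means on a list of (x, y) tuples; returns (centroids, labels)."""
    centroids = list(points[:k])  # naive deterministic start: first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centroids, labels


# Hypothetical (size, brightness) features for six unlabeled objects.
features = [(1.0, 1.1), (5.0, 5.2), (0.9, 1.0), (5.1, 4.9), (1.2, 0.8), (4.8, 5.0)]
centroids, labels = kmeans(features, k=2)
print(labels)
```

No class names are supplied anywhere: the algorithm only groups objects that look alike, and a person still has to decide what each group means.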