Supervised Machine Learning for the Healthcare Industry: Bacteria Identification & Classification

Client: Belgian lab automation company MicroTechnix.

Challenge: Inaccurate medical image analysis, human error, and a large amount of time lost by lab personnel when assessing medical images.

Solution: ML algorithms for automatic medical image analysis (in our case, photos of Petri dishes) with 99% accuracy.

How does a successful Machine Learning project start? With clearly defined business goals. The more information is gathered at this stage, and the more of the right questions are asked and answered, the better the chances of reaching the goal set by the client.

After all, software development starts with effective business analysis, not with datasets or algorithms, and our most recent AI-in-healthcare case proves it. Before launching the project with the Belgian company MicroTechnix, we conducted multiple interviews with the client to understand this non-trivial request in detail.

The overall task of the project was to create software that automates bacteria detection and reduces the number of errors. ML algorithms for bacteria detection and categorization would eventually remove humans from decision-making in laboratories, where the cost of a mistake is too high to ignore. One clear advantage of ML for tasks as complex as bacteria detection is that, once a mistake has been identified and the model retrained on it, the same mistake is far less likely to be made twice.

Another business goal was to minimize false positive and false negative results, which are very time-consuming to verify, and so free the client's staff to work on more creative and demanding tasks.

A more specific task was to make the program recognize particular strains of bacteria in Petri dishes on the basis of the provided dataset. To make that happen, the Softengi team had to label the existing datasets and train the model accordingly. Did we achieve the results the customer wanted? Let’s find out.

Dataset Labeling

Among the many factors that influence the effectiveness of a new algorithm, a comprehensive dataset is probably the most important. Labeling objects was an exhausting and time-consuming task that demanded the full attention of the junior data scientists working on it. Many bacteria in the photos looked almost identical to them, and sometimes it was difficult to tell a bacterium from a speck of dust.

We labeled as many as 7,000 objects in the photos so that the neural network would have a dataset sufficient for detailed categorization. Labeling was based on several key criteria: the size of the bacteria, their color, and their location in the photo.
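To make the labeling criteria more concrete, here is a minimal sketch of what a single label record might look like. The schema is an assumption for illustration only: the class name `ColonyLabel`, its fields, and the `validate` helper are hypothetical and are not taken from the project's actual tooling. Size and location are derived from a bounding box, and color is stored as an average RGB value.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ColonyLabel:
    """One labeled object in a Petri dish photo (hypothetical schema)."""
    image_id: str
    x_min: int
    y_min: int
    x_max: int
    y_max: int
    mean_rgb: Tuple[int, int, int]  # average color inside the box
    class_name: str                 # e.g. "bacteria" or "dust"

    @property
    def area(self) -> int:
        """Size criterion: bounding-box area in pixels."""
        return (self.x_max - self.x_min) * (self.y_max - self.y_min)

    @property
    def center(self) -> Tuple[float, float]:
        """Location criterion: center of the bounding box."""
        return ((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2)


def validate(label: ColonyLabel, image_width: int, image_height: int) -> bool:
    """Return True if the box is non-empty and lies inside the image."""
    return (0 <= label.x_min < label.x_max <= image_width
            and 0 <= label.y_min < label.y_max <= image_height)


example = ColonyLabel("dish_001.jpg", 10, 20, 30, 50, (200, 180, 90), "bacteria")
print(example.area, example.center, validate(example, 100, 100))
```

A simple validity check like this one can catch boxes drawn outside the image before they reach training.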

Model Building

Using the labeled data, we built a model to classify bacteria automatically. We used Faster R-CNN with a ResNet-101 backbone, a combination that proved to be one of the most effective in this domain.

To build the model, we chose TensorFlow, OpenCV, and Keras, as these are among the most robust and effective tools for image recognition. Image recognition is just one of the many Machine Learning capabilities that we used extensively to reach our goal. AI in healthcare promises many benefits, and AI-powered medical equipment is just one of them.

Achieving Business Goals – Did We Manage?

When analyzing photos of bacteria, we sorted results into three classes: false, true, and unknown. This categorization turned out to be very time-efficient because we managed to identify unknown results accurately, and that was the only category that required further research. It covered only 5% of the photos: those in which the margins and bacteria merged, or in which there was a speck of dust.
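One common way to obtain a three-way split like this is to threshold a detection confidence score. The sketch below illustrates that idea only: the article does not say how the classes were actually assigned, and the function names and the `low` / `high` thresholds are hypothetical.

```python
from typing import Iterable


def triage(score: float, low: float = 0.2, high: float = 0.8) -> str:
    """Map a confidence score in [0, 1] to 'true', 'false', or 'unknown'.

    The thresholds are illustrative assumptions, not project values.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= high:
        return "true"
    if score <= low:
        return "false"
    return "unknown"  # only these results go to a human for review


def unknown_share(scores: Iterable[float], low: float = 0.2, high: float = 0.8) -> float:
    """Fraction of results that still need manual review."""
    labels = [triage(s, low, high) for s in scores]
    return labels.count("unknown") / len(labels)


print(triage(0.95), triage(0.05), triage(0.5))
print(unknown_share([0.9, 0.1, 0.5, 0.95]))
```

Widening the gap between the two thresholds sends more photos to manual review but lowers the risk of a confident wrong answer, so the choice is a trade-off between lab time and error cost.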

As a result, we achieved 99% accuracy in bacteria identification on this project, along with the ability to differentiate between three distinct classes.
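For readers unfamiliar with the metric, the sketch below shows how accuracy relates to the false positives and false negatives mentioned among the business goals. It assumes a simple binary setting (1 for bacteria present, 0 for absent) and uses made-up toy labels; the article does not describe the evaluation protocol behind the 99% figure, so this is not a reconstruction of it.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count true/false positives and negatives for binary labels (1 or 0)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == 1:
            counts["tp" if truth == 1 else "fp"] += 1
        else:
            counts["tn" if truth == 0 else "fn"] += 1
    return counts


def accuracy(counts: Dict[str, int]) -> float:
    """Share of all predictions that were correct."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total


# Toy data for illustration only, not project results.
truth = [1, 1, 0, 0, 1]
predicted = [1, 0, 0, 1, 1]
counts = confusion_counts(truth, predicted)
print(counts, accuracy(counts))
```

Reporting the four counts alongside accuracy shows whether the remaining errors are missed bacteria or false alarms, which matter differently in a laboratory.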

From Supervised to Unsupervised Machine Learning

In the end, the team decided that the best outcome for the client would be a fully functional unsupervised Machine Learning solution, so that the model could be applied to all sorts of bacteria and would eliminate long-term data labeling.

The customer received the source code together with a ready-to-use convolutional neural network model, which they can retrain on new data. Unsupervised machine learning looks like magic to those who don’t know how it works. It makes it possible to find patterns in a dataset without any reference to known or labeled outcomes, which makes it an excellent tool for determining the underlying structure of the data.
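As an illustration of finding structure without labels, here is a minimal k-means clustering sketch in plain Python. K-means is a standard unsupervised technique chosen here as an example; the article does not state which unsupervised method the team had in mind. The two-dimensional points (imagine a size and a color feature per object) and the starting centroids are made up.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def assign(points: Sequence[Point], centroids: Sequence[Point]) -> List[int]:
    """Give each point the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            for p in points]


def update(points: Sequence[Point], labels: Sequence[int],
           centroids: Sequence[Point]) -> List[Point]:
    """Move each centroid to the mean of its members (keep it if empty)."""
    new_centroids = []
    for i, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == i]
        if not members:
            new_centroids.append(old)
            continue
        new_centroids.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
    return new_centroids


def kmeans(points: Sequence[Point], centroids: Sequence[Point], max_iter: int = 100):
    """Alternate assignment and update steps until the centroids stop moving."""
    centroids = [tuple(c) for c in centroids]
    labels = assign(points, centroids)
    for _ in range(max_iter):
        updated = update(points, labels, centroids)
        if updated == centroids:
            break
        centroids = updated
        labels = assign(points, centroids)
    return labels, centroids


# Made-up features: no class labels are supplied anywhere.
data = [(1, 1), (1, 2), (2, 1), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans(data, centroids=[(1, 1), (10, 10)])
print(labels, centers)
```

The algorithm groups the points into two clusters purely from their positions, which is the sense in which unsupervised learning reveals the underlying structure of the data.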
