This article was written by Yulia Shcherbachova and originally appeared on the DataRobot Blog here: https://www.datarobot.com/blog/get-maximum-value-from-your-visual-data/
The value of AI these days is undeniable. However, in a fast-changing environment, making a decision at the right time is critical. We collect more and more diverse data types, and we’re not always sure how to turn this data into real value. Sometimes it takes hours or even days of experimenting to get valuable insights. And even when we understand the problem well, there may not be enough data to run a successful project and deliver impact back to the business.
Image recognition is one of the most relevant areas of machine learning, and deep learning makes the process efficient. With frameworks like TensorFlow, Keras, and PyTorch, it’s possible to build a robust image recognition algorithm with high accuracy. However, not everyone has the deep learning skills or the budget to spend on GPUs before demonstrating any value to the business.
Who Can Benefit from the Visual Data?
The short answer is anyone: E-commerce, Security, Medical Image Analysis, Industrial Automation, and more. Image recognition has a lot of applications in industries and businesses. AI technology is playing a massive part in the 4th industrial revolution and has already spread across most organizations.
DataRobot Visual AI
In 2020, our team launched DataRobot Visual AI. We embedded best practices and various deep learning models to support image data. Our first step was to include images in the supervised machine learning pipeline. Similar to other DataRobot projects, Visual AI projects delivered deployable models and associated model insights. With built-in insights, you can see which aspects of the input images the model focuses on when attempting to discriminate between classes.
What’s New In Visual AI
Our team worked hard to take Visual AI to the next level. As a result, we have shipped several new features over the past few releases that I’m excited to share:
1. Image Augmentation
Don’t have enough images for your dataset? No longer an issue. With Image Augmentation, you can create new training images by randomly transforming existing ones, which increases the size of the training data.
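To make the idea of random transformations concrete, here is a minimal sketch in plain Python. It is not how DataRobot implements Image Augmentation; it assumes an image can be represented as a small grid of pixel values, and the function names (`horizontal_flip`, `rotate_90`, `augment`) are hypothetical helpers written for this illustration.

```python
import random

def horizontal_flip(image):
    """Mirror each row of a 2D grid of pixel values."""
    return [list(reversed(row)) for row in image]

def rotate_90(image):
    """Rotate a 2D grid of pixel values 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def augment(image, n_copies, seed=0):
    """Return n_copies randomly transformed variants of one image."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    variants = []
    for _ in range(n_copies):
        out = [list(row) for row in image]  # copy, so the input is untouched
        if rng.random() < 0.5:
            out = horizontal_flip(out)
        for _ in range(rng.randrange(4)):  # 0 to 3 quarter turns
            out = rotate_90(out)
        variants.append(out)
    return variants

# A tiny 2x3 "image" standing in for real pixel data.
image = [[1, 2, 3],
         [4, 5, 6]]
augmented = augment(image, n_copies=4)
```

Every variant contains the same pixels as the original, only rearranged, so the label of the source image still applies to each new training example.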
2. Multimodal Clustering
Multimodal Clustering gives users a one-click, one-line-of-code experience to build and deploy clustering models on any data, including images. In addition, with the new Cluster Insights visualization, you can easily combine images with any other data type to understand, name, and explain each cluster for any model.
3. Visual AI Anomaly Detection
This is one of the most exciting features presented in the 7.3 release. With Visual AI Anomaly Detection, you can now address more use cases out-of-the-box with one click and one line of code end-to-end. So drag and drop and let the DataRobot AI Cloud platform get you started.
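For readers new to anomaly detection, the sketch below shows one simple way to think about it: turn each image into a numeric feature vector, then score each vector by how far it sits from the rest. This is an illustrative stand-in, not the algorithm Visual AI uses; the feature values are made up, and `centroid` and `anomaly_scores` are hypothetical helper names.

```python
import math

def centroid(vectors):
    """Per-dimension mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def anomaly_scores(vectors):
    """Score each vector by its Euclidean distance from the centroid."""
    center = centroid(vectors)
    return [math.dist(v, center) for v in vectors]

# Made-up 2D features; a real featurizer would emit many more dimensions.
features = [
    [0.9, 1.1],
    [1.0, 1.0],
    [1.1, 0.9],
    [1.0, 1.2],
    [5.0, 5.0],  # deliberately far from the other four
]
scores = anomaly_scores(features)
most_anomalous = max(range(len(scores)), key=scores.__getitem__)
```

A higher score means the image looks less like the bulk of the data, which is the intuition behind flagging unusual images without any labels.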
We’ve also improved the user experience and added new, more efficient image featurizers such as darknetpruned, efficientnet-b0-pruned, mobilenetv3-small-pruned, and more.
Quick Start with Visual AI
When it comes to a new type of data, we’re not always sure where to start. Let’s take a closer look at a project with images, and you’ll see how simple it is. For my project, I took a public data set with images of surface cracks (public source).
Concrete surface cracks are a major defect in civil structures. Building inspection is done to evaluate rigidity and tensile strength. Crack detection plays a major role in building inspection, finding the cracks and determining building health.
Step 1. Submit Data
As with any other project, you can just drag and drop a folder with images or use a pre-loaded file that is added or shared within AI Catalog. After Exploratory Data Analysis is completed, you can look at your data.
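If you want to check a folder of images before uploading it, a small script can list every image with its class. The sketch below assumes a hypothetical layout with one subfolder per class (for example `positive/` and `negative/`); the column names `image_path` and `label` and both function names are illustrative choices, not a format the platform requires.

```python
import csv
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}

def build_manifest(root):
    """Return sorted (relative_path, label) rows for images in root/<label>/."""
    root = Path(root)
    rows = []
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for image_path in sorted(class_dir.iterdir()):
            # Skip anything that is not an image, such as notes or thumbnails.
            if image_path.suffix.lower() in IMAGE_SUFFIXES:
                relative = image_path.relative_to(root).as_posix()
                rows.append((relative, class_dir.name))
    return rows

def write_manifest(rows, csv_path):
    """Write the rows to a CSV file with a header line."""
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["image_path", "label"])
        writer.writerows(rows)
```

Calling `build_manifest("cracks")` on such a folder returns one row per image, which makes it easy to count images per class before submitting the data.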
As you can see, I have a set of 38,402 images that are divided into two classes, positive and negative.
Step 2. Configure Settings You Need
For a supervised project, pick a target variable and select “Start” to automate the preparation, selection, and training of a huge variety of cutting-edge deep learning models. For this project, I decided to run unsupervised mode instead, which we introduced in the 7.3 release.
Step 3. Run Autopilot
Select “Start” and let DataRobot AI Cloud Platform do the work for you. Just like for any other project, DataRobot will generate training pipelines and models with validation and cross-validation scores and rate them based on performance metrics.
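The sketch below illustrates the general idea behind cross-validation scores and a ranked list of models. It is not the platform’s implementation: the fold splitter is a simple contiguous split, the model names are placeholders, and the per-fold LogLoss values are invented purely for illustration.

```python
def k_fold_indices(n_rows, k):
    """Split range(n_rows) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_rows, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # spread the remainder evenly
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def rank_models(cv_scores, lower_is_better=True):
    """Sort model names by their mean cross-validation score."""
    means = {name: sum(s) / len(s) for name, s in cv_scores.items()}
    return sorted(means, key=means.get, reverse=not lower_is_better)

folds = k_fold_indices(n_rows=10, k=3)

# Hypothetical per-fold LogLoss values for three made-up models.
cv_scores = {
    "model_a": [0.30, 0.32, 0.31],
    "model_b": [0.25, 0.27, 0.26],
    "model_c": [0.40, 0.38, 0.39],
}
leaderboard = rank_models(cv_scores)
```

Each fold is held out once while the model trains on the others, and averaging the per-fold scores gives a steadier basis for ranking than a single validation split.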
When the modeling process is finished, you can also go to advanced tuning and adjust the featurizer and image augmentation settings to tune the model further.
Step 4. Get Insights
Visual AI has extra tools that are specific to the image data type, which were created to enhance model insights.
- Image Activation Maps allow you to see sample locations in the image that the model is using to make decisions (notice the color highlights showing areas of high and low activation).
- Image Embeddings allow you to visualize a sample of images projected from their original N-dimensional feature space to a new two-dimensional feature space. This feature makes it easy to see what images are considered similar.
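To illustrate what projecting from an N-dimensional feature space to two dimensions means, here is a minimal sketch using principal component analysis computed with power iteration. The article does not state which projection method Image Embeddings uses, so PCA is a stand-in chosen for simplicity; the embedding values are made up and all function names are hypothetical.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def center(rows):
    """Subtract the per-column mean from every row."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n = len(centered)
    columns = list(zip(*centered))
    return [[dot(ci, cj) / (n - 1) for cj in columns] for ci in columns]

def dominant_eigenvector(matrix, iterations=200, seed=0):
    """Power iteration for the leading eigenvector of a symmetric matrix."""
    rng = random.Random(seed)
    vec = [rng.uniform(-1.0, 1.0) for _ in matrix]
    for _ in range(iterations):
        nxt = [dot(row, vec) for row in matrix]
        norm = math.sqrt(dot(nxt, nxt))
        if norm == 0.0:  # degenerate case: no variance left to capture
            break
        vec = [x / norm for x in nxt]
    return vec

def project_to_2d(rows):
    """Project rows onto their first two principal components."""
    centered = center(rows)
    cov = covariance(centered)
    first = dominant_eigenvector(cov)
    eigenvalue = dot(first, [dot(row, first) for row in cov])
    # Deflation: remove the first component so the second can surface.
    size = len(cov)
    deflated = [[cov[i][j] - eigenvalue * first[i] * first[j]
                 for j in range(size)] for i in range(size)]
    second = dominant_eigenvector(deflated, seed=1)
    return [(dot(row, first), dot(row, second)) for row in centered]

# Made-up 4D embeddings: the first three rows resemble each other,
# and so do the last three.
embeddings = [
    [1.0, 0.1, 0.0, 0.2],
    [0.9, 0.0, 0.1, 0.1],
    [1.1, 0.2, 0.0, 0.0],
    [0.0, 1.0, 0.9, 0.1],
    [0.1, 0.9, 1.0, 0.0],
    [0.0, 1.1, 1.0, 0.2],
]
points = project_to_2d(embeddings)
```

Plotting `points` as a scatter chart places the two groups on opposite sides of the first axis, which is the kind of visual grouping that makes similar images easy to spot.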
Step 5. Deploy Your Model and Make Predictions
As with any other model, you have full flexibility to make predictions and deploy your model to the environment of your choice. Operationalize image models with just a few clicks. Deploy to an API endpoint or a Portable Prediction Server (a Docker container that can host one or more production models). Use the platform UI to get batch predictions. Monitor the service health and accuracy of all your models and update them with no service interruption.