This article was written by Jon Chang and originally appeared on the DataRobot Blog here: https://www.datarobot.com/blog/introducing-the-next-generation-of-text-ai-for-ai-cloud-platform/
Taking Text AI to the Next Level
An estimated 80% of all organizational information is held in text. While it is easy to accumulate text data, it can be extremely difficult to analyze because of the ambiguity of human language. It is precisely because of the large volume of unstructured data and the complexity of navigating it that DataRobot has focused on helping our users unlock insights from text. Use DataRobot’s intelligent AutoML in either supervised or unsupervised mode with your text data (and combine it with other types of data!) and train models with a single click. Advanced users will appreciate tunable parameters and full control over how DataRobot processes data and builds models with composable ML. Explanations of data, models, and blueprints are available throughout the platform, so you’ll always understand your results. Access the full potential of your models by using DataRobot with your text data.
Unlocking the Value of Text Data: What’s New in 7.3
Diverse Languages and Data Types
We at DataRobot don’t believe in placing limits or caveats on the languages your text is written in. That’s why we’ve made it a point to use language-agnostic techniques throughout the platform, so you’ll always have its full power behind you. No matter what language you bring to DataRobot, you can expect to see the same great results.
We understand that text often doesn’t work alone; you may want to use it with other types of data. Text AI allows you to combine text data with as many other feature types in your dataset as you like. All feature types supported by DataRobot can be combined with text data, whether that’s numeric, categorical, date, image, geospatial, time series, or relational. By using diverse feature types, you give your AI models a much broader perspective.
More Value with Less Effort
Take advantage of DataRobot’s wide range of options for experimentation. Use DataRobot’s AutoML and AutoTS to tackle data science problems such as classification, regression, and forecasting. Not sure where to start with your massive trove of text data? Simply fire up DataRobot’s unsupervised mode and use clustering or anomaly detection to discover patterns and insights in your data. Best of all, these techniques all work out of the box with text, whether you’re pursuing a no-code, low-code, or full-code experience. The platform lets you focus on solving your organization’s business problem instead of slogging through the technical details of text featurization. Let the platform handle infrastructure and deep learning techniques so that you can focus on bringing value to your organization.
Take Your Experiments to the Next Level
DataRobot’s Text AI clears the way for you to test various text and NLP techniques and tools (such as “bag-of-words” models, tf-idf, cosine similarity, fastText, TinyBERT, NLTK, spaCy, stop word removal, stemming, lemmatization, and many more). All of these are built into the platform and are easily accessible and configurable to your specific needs.
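To make a few of the techniques named above concrete, here is a minimal, standard-library-only Python sketch of stop word removal, bag-of-words counting, tf-idf weighting, and cosine similarity. It is an illustration of the general ideas only, not DataRobot’s implementation: the function names, the tiny stop word list, the smoothed idf formula, and the sample sentences are all assumptions made for this example.

```python
import math
from collections import Counter

# Tiny illustrative stop word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "is", "to", "my", "on"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def tfidf_vectors(docs):
    """Return one sparse {term: weight} tf-idf vector per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    # Smoothed inverse document frequency (one common variant).
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)  # the bag-of-words term counts
        vectors.append({t: c * idf[t] for t, c in counts.items()})
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


docs = [
    "my invoice shows the wrong billing amount",
    "billing amount on invoice is wrong",
    "cannot reset password for login",
]
vectors = tfidf_vectors(docs)
sim_01 = cosine_similarity(vectors[0], vectors[1])  # two billing complaints
sim_02 = cosine_similarity(vectors[0], vectors[2])  # billing vs. login issue
print(f"billing vs billing: {sim_01:.2f}, billing vs login: {sim_02:.2f}")
```

The two billing sentences share most of their vocabulary and score high, while the billing and login sentences share no terms and score zero. A platform that automates text featurization spares you from writing and tuning this kind of plumbing by hand.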
More Explainability and Trust
With Text AI, we’ve made it easy for you to understand how the DataRobot platform has used your text data and what insights resulted. Text explanations are embedded throughout the platform and throughout the model building and assessment process, including the following:
- Exploratory Data Analysis – Frequent Values and Feature Values
- Modeling – Text Featurizer techniques and parameters, manual options and parameters for preprocessors under Advanced Tuning, Feature Impacts, Feature Effects, Cluster Insights, and Word Clouds
DataRobot is there every step of the way to help you understand your text features and how they work in conjunction with your other features. We’re constantly updating our text explanations (and explainability in the platform in general), so expect to see more exciting explainability features in the future.
Even More Value Across Different Industries
DataRobot works with many industries on various use cases. Here are a few use cases I personally found very interesting:
Candidate Hiring Recommendations – Using labeled resumes (hire or no hire) as text data, DataRobot models can learn from historical hiring trends to predict and recommend candidates for hire. Through a combination of tf-idf and named entity recognition techniques, DataRobot can build models that learn how an organization’s recruiters evaluate candidates and replicate those evaluations through predictions.
Support Ticket Routing – DataRobot models built on labeled support tickets can help an organization identify each ticket’s topic and route it to the right support assistant for resolution. Topic modeling (via clustering with text) on support ticket descriptions can also be applied to unlabeled support tickets to discover general trends, helping the support team identify novel issues and emerging themes.
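As a rough illustration of grouping unlabeled tickets by topic, the Python sketch below uses a simple greedy clustering on word overlap (Jaccard similarity). This is a deliberately simplified stand-in for the idea of clustering text, not the clustering DataRobot performs: the function names, the stop word list, the 0.25 threshold, and the sample tickets are all hypothetical.

```python
# Small illustrative stop word list (an assumption for this example).
STOP_WORDS = {"the", "a", "is", "to", "my", "i", "on", "for", "not"}


def tokens(text):
    """Return the set of lowercase words in a ticket, minus stop words."""
    return {w for w in text.lower().split() if w not in STOP_WORDS}


def jaccard(a, b):
    """Share of words two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def greedy_cluster(texts, threshold=0.25):
    """Assign each text to the most similar existing cluster, or start a new one.

    Returns a list of clusters, each a list of indices into `texts`.
    """
    clusters = []  # each cluster: {"vocab": set of words, "members": [indices]}
    for i, text in enumerate(texts):
        toks = tokens(text)
        best, best_score = None, 0.0
        for cluster in clusters:
            score = jaccard(toks, cluster["vocab"])
            if score > best_score:
                best, best_score = cluster, score
        if best is not None and best_score >= threshold:
            best["members"].append(i)
            best["vocab"] |= toks  # grow the cluster's vocabulary
        else:
            clusters.append({"vocab": set(toks), "members": [i]})
    return [c["members"] for c in clusters]


tickets = [
    "cannot reset my password",
    "invoice shows wrong billing amount",
    "password reset email not arriving",
    "billing amount is wrong on invoice",
]
groups = greedy_cluster(tickets)
for members in groups:
    print([tickets[i] for i in members])
```

On this toy data the password tickets land in one group and the billing tickets in another. Real ticket text is far noisier, which is where richer featurization and purpose-built clustering models earn their keep.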