How Data Enrichment Improves Predictive Modeling

Predictive modeling, an application of machine learning and artificial intelligence, is the use of data and algorithms to forecast outcomes. While this field has been around for decades, the current data explosion coupled with modern computing power brings predictive modeling to the forefront of business operations.

Predictive models can help identify fraud, improve inventory and pricing operations, reduce risk, optimize marketing campaigns, and more. The artificial intelligence software market generated about $15 billion in worldwide revenue in 2019. By 2025, that number is expected to reach $119 billion.[1]

But before going all in, it’s important to understand the foundations of successful predictive models. No matter which algorithm you utilize, you need the right data to fuel it. Data enrichment is the key to getting the most out of your predictive modeling investment.

Elements of a Predictive Model

Predictive models reach a conclusion about how likely a subject (typically a customer or prospect) is to perform a desired action (such as making a purchase, or renewing a service contract).

When leveraged in marketing, one of the main goals of predictive modeling is to identify the attributes, or “features” (which may include demographic information, purchase history, or other behavioral markers) that are most likely to be predictive of the target outcome, so that people who share these features can be targeted with relevant campaigns.

Here’s an example: If a predictive modeling exercise shows that individuals who visit high-end malls and frequently travel by air are more likely to purchase luxury smartphones, then a phone provider looking to grow their customer base knows that targeting high-end shoppers and frequent flyers with their marketing campaigns will likely result in higher ROI. In this case, the target outcome is the purchase of a luxury smartphone and the features that predict it are high-end mall visits and frequent air travel.
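
One simple way to check whether a feature is predictive of the target outcome is to compare the purchase rate among people who have the feature with the overall purchase rate, a ratio often called lift. The sketch below uses made-up records and hypothetical field names (visits_high_end_mall, frequent_flyer, bought_luxury_phone); it illustrates the idea and is not a production model.

```python
# Made-up records for illustration only: one row per person, two binary
# features, and the target outcome (bought a luxury smartphone or not).
records = [
    {"visits_high_end_mall": True,  "frequent_flyer": True,  "bought_luxury_phone": True},
    {"visits_high_end_mall": True,  "frequent_flyer": True,  "bought_luxury_phone": True},
    {"visits_high_end_mall": True,  "frequent_flyer": False, "bought_luxury_phone": True},
    {"visits_high_end_mall": True,  "frequent_flyer": False, "bought_luxury_phone": False},
    {"visits_high_end_mall": False, "frequent_flyer": True,  "bought_luxury_phone": True},
    {"visits_high_end_mall": False, "frequent_flyer": True,  "bought_luxury_phone": False},
    {"visits_high_end_mall": False, "frequent_flyer": False, "bought_luxury_phone": False},
    {"visits_high_end_mall": False, "frequent_flyer": False, "bought_luxury_phone": False},
    {"visits_high_end_mall": False, "frequent_flyer": False, "bought_luxury_phone": False},
    {"visits_high_end_mall": False, "frequent_flyer": False, "bought_luxury_phone": False},
]

TARGET = "bought_luxury_phone"


def purchase_rate(rows):
    """Share of rows where the target outcome occurred (0.0 for no rows)."""
    if not rows:
        return 0.0
    return sum(1 for r in rows if r[TARGET]) / len(rows)


def lift(rows, feature):
    """Purchase rate among rows with the feature, divided by the overall rate."""
    baseline = purchase_rate(rows)
    if baseline == 0.0:
        return 0.0
    with_feature = [r for r in rows if r[feature]]
    return purchase_rate(with_feature) / baseline


for feature in ("visits_high_end_mall", "frequent_flyer"):
    print(f"{feature}: lift = {lift(records, feature):.3f}")
```

A lift above 1 means people with the feature buy more often than the average person in the data set, which is the case for both features in this toy data. A real model would weigh many features together rather than one at a time.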

Where Do Predictive Models Go Wrong?

Insufficient Data

In data science, there’s a general belief that algorithm sophistication is the single most important factor in predictive modeling success. In reality, the breadth and depth of the data used to train the algorithm have a bigger impact on model quality than the algorithmic technique.

If your approach is thorough and your methods are by the book, yet you still can’t achieve the predictivity you need, then inadequate data is likely the source of your problem.

Feature Selection

Feature selection – the identification of which features to use for modeling – is a pivotal task. When building a predictive model, data scientists must evaluate and refine each feature until an actionable high-probability model is reached.

In order to be actionable, the final version of a predictive model must include features that are easily projected onto the larger population. Teams working exclusively with first-party data often generate insights that can’t be applied to the general public.

The feature selection process is often where predictive models go wrong, and insufficient data is the leading cause of suboptimal feature selection. After all, you can only conduct statistical analysis on the data that’s available to you. A limited scope of data cripples your model’s ability to project probability statements onto the population at large.
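
The two requirements described in this section, that a feature carries signal and that it can be projected onto people outside the customer base, can be sketched as a filter followed by a ranking. Everything here is an assumption for illustration: the rows are made up, the feature names and the OBSERVABLE_FOR_NON_CUSTOMERS flags are hypothetical, and the score (the gap in outcome rate between people with and without a feature) is a deliberately simple stand-in for the statistical tests a data science team would actually use.

```python
# Made-up rows: three binary candidate features plus the target outcome.
rows = [
    {"opened_last_newsletter": 1, "visits_high_end_mall": 1, "frequent_flyer": 1, "bought": 1},
    {"opened_last_newsletter": 1, "visits_high_end_mall": 1, "frequent_flyer": 0, "bought": 1},
    {"opened_last_newsletter": 1, "visits_high_end_mall": 1, "frequent_flyer": 1, "bought": 1},
    {"opened_last_newsletter": 0, "visits_high_end_mall": 1, "frequent_flyer": 0, "bought": 0},
    {"opened_last_newsletter": 0, "visits_high_end_mall": 0, "frequent_flyer": 0, "bought": 0},
    {"opened_last_newsletter": 0, "visits_high_end_mall": 0, "frequent_flyer": 0, "bought": 0},
    {"opened_last_newsletter": 1, "visits_high_end_mall": 0, "frequent_flyer": 0, "bought": 0},
    {"opened_last_newsletter": 0, "visits_high_end_mall": 0, "frequent_flyer": 1, "bought": 1},
]

# Hypothetical flags: can this feature be observed for people who are not
# customers yet? First-party-only signals cannot.
OBSERVABLE_FOR_NON_CUSTOMERS = {
    "opened_last_newsletter": False,
    "visits_high_end_mall": True,
    "frequent_flyer": True,
}


def outcome_rate(subset, target):
    """Share of rows in the subset where the target occurred."""
    if not subset:
        return 0.0
    return sum(r[target] for r in subset) / len(subset)


def rate_gap(data, feature, target):
    """Absolute gap in outcome rate between rows with and without the feature."""
    with_feature = [r for r in data if r[feature]]
    without_feature = [r for r in data if not r[feature]]
    return abs(outcome_rate(with_feature, target) - outcome_rate(without_feature, target))


def rank_actionable_features(data, target, observable):
    """Keep only features observable for non-customers, strongest signal first."""
    candidates = [name for name, ok in observable.items() if ok]
    scored = [(name, rate_gap(data, name, target)) for name in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranked = rank_actionable_features(rows, "bought", OBSERVABLE_FOR_NON_CUSTOMERS)
for name, gap in ranked:
    print(f"{name}: rate gap = {gap:.2f}")
```

The newsletter feature is dropped before scoring even though it may correlate with purchases, because it cannot be used to find people who are not already in the database.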

Better Data = Higher Value Predictive Models

To effectively identify and market to new prospects, and to better understand, retain, and grow an existing customer base, you will need to build your predictive models using data that reaches far beyond what you have in-house.

No matter how sophisticated your algorithms are, if you are leveraging only first-party data to inform your predictive models, they’ll be limited to generating insights based on your current customers. They won’t provide a comprehensive look at all of the attributes that might be relevant to your desired outcome, and the features that are available may not apply to consumers who are not customers.

Case Study: Overcoming Limited Data

When a global food delivery company found themselves in the situation we just described, they turned to Mobilewalla for additional consumer insights.

The company’s first-party data revealed the cuisine preferences, order frequency, and order timing of its highest-value customers. However, they couldn’t use these insights to grow their customer base because there was no way for them to identify non-customers based on food preference and order frequency. That means they didn’t have the information they needed to be able to target this group with their campaigns.

The solution to this problem was data enrichment. After bolstering their database with comprehensive third-party data that painted a more detailed picture of customer habits and behaviors, subsequent analysis found that their high-value customers are likely to be married, aged 25-34, have children, a working spouse, and a home-to-work commute greater than 15 km. Almost none of these features were natively available in their first-party customer data.
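
Mechanically, enrichment of this kind amounts to joining third-party attributes onto first-party records and then profiling the high-value segment on the new attributes. The sketch below is a minimal illustration with made-up data. The shared customer_id key, the field names, and the values are all assumptions; the case study does not describe how records were actually matched.

```python
# Made-up first-party records: what the company already knows.
first_party = [
    {"customer_id": "c1", "orders_per_month": 9, "high_value": True},
    {"customer_id": "c2", "orders_per_month": 7, "high_value": True},
    {"customer_id": "c3", "orders_per_month": 1, "high_value": False},
    {"customer_id": "c4", "orders_per_month": 8, "high_value": True},
    {"customer_id": "c5", "orders_per_month": 2, "high_value": False},
]

# Made-up third-party attributes, keyed by the same hypothetical identifier.
# Customer c4 has no match, which is common in practice.
third_party = {
    "c1": {"married": True, "age_band": "25-34", "commute_km": 22},
    "c2": {"married": True, "age_band": "25-34", "commute_km": 18},
    "c3": {"married": False, "age_band": "18-24", "commute_km": 5},
    "c5": {"married": False, "age_band": "35-44", "commute_km": 30},
}


def enrich(first_party_rows, third_party_by_id):
    """Left-join third-party attributes; unmatched rows are kept unchanged."""
    enriched_rows = []
    for row in first_party_rows:
        extra = third_party_by_id.get(row["customer_id"], {})
        enriched_rows.append({**row, **extra})
    return enriched_rows


def share(rows, predicate):
    """Fraction of rows for which the predicate is true (0.0 for no rows)."""
    if not rows:
        return 0.0
    return sum(1 for r in rows if predicate(r)) / len(rows)


enriched = enrich(first_party, third_party)
high_value = [r for r in enriched if r["high_value"]]

# Profile the high-value segment on attributes that only exist after enrichment.
long_commute_share = share(high_value, lambda r: r.get("commute_km", 0) > 15)
married_share = share(high_value, lambda r: r.get("married", False))
print(f"high-value customers with commute > 15 km: {long_commute_share:.0%}")
print(f"high-value customers who are married: {married_share:.0%}")
```

Unmatched customers count as "not having" an attribute in this sketch, which understates the true shares; a fuller analysis would track match rates separately.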

This information empowered the food delivery company to run acquisition campaigns targeting audiences likely to become high-value customers. In doing so, the number of high-value customers in their overall user base increased, driving revenue up substantially.

To learn more about data enrichment and predictive modeling, read Mobilewalla’s latest white paper, “Third-Party Data: The Missing Ingredient to Predictive Modeling Success.”

About the Author: Laurie is responsible for all aspects of Mobilewalla’s marketing strategy including messaging and positioning, brand awareness, demand generation and sales enablement. Ready to learn more about how data enrichment can work for your brand? Contact a Mobilewalla expert to discuss your challenges and take a data-driven leap into growing your ROI.

