Centralized customer data storage: why it’s important and how we do it

15 Oct ‘19

In 2018, Digital Intelligence Briefing studied Asian, European, and American companies. The results were quite interesting:

  • 65% of companies believe that analytics and data usage are key for improving customer experience.
  • Personalized customer experience in real-time is recognized as a No. 1 priority for the next three years.
  • Only 10% of companies use marketing technology solutions to their maximum potential.

The study shines a light on the gap between desires and opportunities. In Russia, we see an identical difference between the desire to improve marketing metrics and the percentage of companies using centralized data storage systems.

One of the possible reasons that we see is complexity. We at Mindbox spent 15 team years of development to create a centralized customer data storage system.

The second possible reason is that the connection between personalization, analytics, and centralized data storage might not be obvious. Using ourselves as an example, we will go into depth about how a customer data platform (CDP) works and how it can help your business.

Gathering data on customers

For many companies, customer data is scattered: they use different systems to send mass mailings and triggered emails, store information about goods and orders in 1C, and keep online and offline databases that are not connected to one another. This creates business challenges: if a client buys something offline, the online database never receives information about it. As a result, the client may land in the churn segment and receive a triggered “Long time, no buy” email, although a purchase was literally just made.
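The churn example can be sketched in a few lines. This is only an illustration: the function name `in_churn_segment`, the 90-day threshold, and the sample dates are assumptions, not Mindbox settings.

```python
from datetime import date, timedelta


def in_churn_segment(order_dates, today, inactivity_days=90):
    """Return True if the most recent known order is older than the threshold."""
    if not order_dates:
        return False  # no purchases at all: not a lapsed buyer
    return today - max(order_dates) > timedelta(days=inactivity_days)


today = date(2019, 10, 15)
online_orders = [date(2019, 5, 2)]
offline_orders = [date(2019, 10, 14)]  # bought in a store yesterday

# Online database alone: the customer looks lapsed and gets the triggered email.
online_only = in_churn_segment(online_orders, today)
# Unified storage: the offline purchase is visible, so no email is triggered.
unified = in_churn_segment(online_orders + offline_orders, today)
```

With only the online history the check puts the customer in the churn segment; with both histories combined it does not.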

Another classic example: a customer buys an item offline and the next day receives a promotional code with a discount for it or a recommendation to buy the exact same thing. On the one hand, the client receives an irrelevant message; on the other, the company loses money on pushing unnecessary discounts.

To prevent this from occurring, it’s better to store data in one place: this ensures that information is updated in real time. Mindbox integrates with all company systems: CRM, websites, mobile apps, call centers, and, of course, cash registers. The platform combines all received data into a unified customer profile and assigns it a unique ID, so all information is tied to a specific person rather than to a particular account or order.
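As a rough sketch of what “one profile, one ID” means in practice, the toy store below routes events from different systems to the same profile whenever a known contact matches. The classes `UnifiedProfile` and `ProfileStore` and all field names are hypothetical and do not describe Mindbox’s actual data model or API.

```python
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass
class UnifiedProfile:
    """Hypothetical profile: one ID per person, not per account or order."""
    customer_id: str = field(default_factory=lambda: uuid4().hex)
    contacts: dict = field(default_factory=dict)  # e.g. {"email": ..., "phone": ...}
    events: list = field(default_factory=list)    # actions and orders from every source


class ProfileStore:
    """Toy in-memory store; ignores the case where contacts match two profiles."""

    def __init__(self):
        self._profiles = {}    # customer_id -> UnifiedProfile
        self._by_contact = {}  # ("email", value) -> customer_id

    def ingest(self, source, contacts, event):
        # Look for an existing profile by any contact we already know.
        customer_id = next(
            (self._by_contact[item] for item in contacts.items()
             if item in self._by_contact),
            None,
        )
        profile = self._profiles.get(customer_id)
        if profile is None:
            profile = UnifiedProfile()
            self._profiles[profile.customer_id] = profile
        profile.contacts.update(contacts)
        for item in contacts.items():
            self._by_contact[item] = profile.customer_id
        profile.events.append({"source": source, **event})
        return profile


store = ProfileStore()
online = store.ingest("website", {"email": "ann@example.com"},
                      {"type": "order", "total": 120})
offline = store.ingest("cash_register",
                       {"email": "ann@example.com", "phone": "+70000000001"},
                       {"type": "order", "total": 45})
```

Both calls return the same profile object, so the offline purchase is visible next to the online one.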

Mindbox stores not only customer data, such as email, phone, gender, and date of birth, but also customers’ actions and orders, in other words, their behavior. Depending on the type of business, the content of a customer’s profile may vary: a children’s store collects information about the age of children, a shoe store needs data on shoe sizes, and so forth. This allows us to see the client’s stage within the purchase funnel and personalize the business’s interactions with that customer.

Data sources for the Petrovich.ru project

Actions within Mindbox logic fall into two categories:

  1. Everything that the client did: viewed a product or category, added items to the cart, subscribed to a newsletter, opened an email.
  2. Everything that has been done to the client: emails sent, inclusion in segments, an admin editing his or her profile.

An order, in terms of Mindbox logic, is also an action, one that is carried out by the client. An order contains products, which, as a rule, are loaded from the YML feed. The order comes with statuses: pending, paid, delivered, canceled, and any others, if need be. Additional fields can also be added, for example, delivery times or methods.
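One way to picture an order with configurable statuses and additional fields is the sketch below. The `Order` and `OrderStatusRegistry` names, the field names, and the sample custom status are assumptions made for illustration only.

```python
from dataclasses import dataclass, field

# The four statuses named in the text; a project may register its own on top.
DEFAULT_STATUSES = {"pending", "paid", "delivered", "canceled"}


@dataclass
class Order:
    order_id: str
    product_ids: list
    status: str = "pending"
    custom_fields: dict = field(default_factory=dict)  # e.g. delivery method
    history: list = field(default_factory=list)        # previous statuses


class OrderStatusRegistry:
    """Holds the statuses a given project allows and applies status changes."""

    def __init__(self, extra_statuses=()):
        self.allowed = DEFAULT_STATUSES | set(extra_statuses)

    def set_status(self, order, new_status):
        if new_status not in self.allowed:
            raise ValueError(f"Unknown order status: {new_status!r}")
        order.history.append(order.status)
        order.status = new_status


registry = OrderStatusRegistry(extra_statuses=["apartment tour scheduled"])
order = Order("A-1", ["apt-42"], custom_fields={"delivery_method": "courier"})
registry.set_status(order, "apartment tour scheduled")
registry.set_status(order, "paid")
```

Keeping the allowed statuses in data rather than in code is what lets a project add its own without a developer changing the order logic.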

One of our clients, PIK, has more than 40 statuses for a retail order, for example, «demonstrated an interest in PIK» or «schedule an apartment tour». Initially, there were no such entities in Mindbox, but the system allows for flexible status settings.


A product is not necessarily a typical e-commerce product: it can be an apartment, a tour to Hawaii, laser vision correction, and so on. For a publisher that builds newsletters out of magazine articles, the product is an article, and for a SaaS business, it’s a service.

The system is universal; that is, it can adapt to any business characteristics without the need to involve a developer. The flexible configuration of additional fields allows us to store any information: for example, dog breeds or a client’s clothing size.

Cleaning data from duplicates

If a company has many databases, the same client can be erroneously recorded multiple times: they were assigned one ID offline and another one online. As a result, marketers do not know the actual number of customers they have and cannot define the interval between purchases, get reliable data from an RFM report, or set segments correctly.

A classic example: a customer with 5 online and 5 offline orders is offered a discount on every sixth order, a promotion meant to increase purchase frequency. Because the two order histories are not linked, the discount goes to someone who already buys often, and such promotions lead to direct monetary losses. Duplicates also create inconveniences for customers: for example, if a client has two profiles and two bonus point accounts within a loyalty program, they cannot use their full bonus balance.

Mindbox identifies duplicate contacts and merges them, together with customer action histories and IDs from scattered databases, into one profile. This process is called deduplication. Email addresses and phone numbers get tied to a specific client, so marketers understand who they send messages to and how each recipient responds to them.
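The general idea of deduplication can be sketched with a union-find structure that groups records sharing an email or phone number. This is a generic technique shown under simple assumptions (exact match after trimming and lowercasing, hypothetical record layout); it is not a description of Mindbox’s actual matching rules.

```python
from collections import defaultdict


def deduplicate(records):
    """Group records that share an email or phone number, directly or via a chain.

    `records` is a list of dicts with an "id" and optional "email" / "phone".
    Returns a list of sets of record IDs, one set per presumed person.
    """
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    first_seen = {}  # normalized contact -> first record ID that used it
    for r in records:
        for key in ("email", "phone"):
            value = r.get(key)
            if not value:
                continue
            contact = (key, value.strip().lower())
            if contact in first_seen:
                union(r["id"], first_seen[contact])
            else:
                first_seen[contact] = r["id"]

    groups = defaultdict(set)
    for r in records:
        groups[find(r["id"])].add(r["id"])
    return list(groups.values())


records = [
    {"id": "online-1", "email": "ann@example.com"},
    {"id": "offline-7", "email": " Ann@Example.com", "phone": "+70000000001"},
    {"id": "app-3", "phone": "+70000000001"},
    {"id": "online-2", "email": "bob@example.com"},
]
groups = deduplicate(records)
```

Here the first three records end up in one group: the online and offline records share an email, and the app record shares a phone number with the offline one.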

Another problem duplicates cause is database burnout. When a client is listed in different databases, he or she may receive the same offer over and over. And even after hitting the unsubscribe button, the client will continue to receive newsletters, because different types of emails are sent from different mailing lists.

Mindbox aggregates information about the channels a client is subscribed to and unsubscribed from, so they will not receive a mailing they have already opted out of. One project can contain several brands, channels, and newsletter topics, and these are also taken into account during deduplication. Moreover, the platform supports double opt-in, that is, double confirmation of a subscription, and does not send messages to those who have not followed the confirmation link.

What the mailing tree looks like on a project

If a client is subscribed only to the Discounts topic via SMS, they will not receive other messages on this channel: the system will simply not let you send them. If they are subscribed to the channel as a whole and there is no information on specific topic preferences in this channel, the client is, by default, considered ready to receive any messages via this channel.
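The subscription rules described above, including double opt-in, can be expressed as a small check. The `ChannelSubscription` class, the `can_send` function, and the channel and topic names are hypothetical; this is a sketch of the logic, not of Mindbox’s implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ChannelSubscription:
    subscribed: bool = False
    confirmed: bool = True  # False until a double opt-in link has been followed
    topics: set = field(default_factory=set)  # empty = no topic preferences recorded


def can_send(subscriptions, channel, topic):
    """Decide whether a message on `topic` may go out via `channel`."""
    sub = subscriptions.get(channel)
    if sub is None or not sub.subscribed or not sub.confirmed:
        return False
    if not sub.topics:
        return True  # subscribed to the whole channel, no topic restrictions
    return topic in sub.topics


subscriptions = {
    "sms": ChannelSubscription(subscribed=True, topics={"Discounts"}),
    "email": ChannelSubscription(subscribed=True),
    "push": ChannelSubscription(subscribed=True, confirmed=False),
}

sms_discounts = can_send(subscriptions, "sms", "Discounts")
sms_news = can_send(subscriptions, "sms", "News")
email_news = can_send(subscriptions, "email", "News")
push_news = can_send(subscriptions, "push", "News")
```

A channel with no recorded subscription is treated as “do not send”, which is the conservative default for this sketch.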

Cleaning and enriching data

Users do not always enter contact information correctly; this can happen by accident or on purpose. These types of contacts are harmful to the company for two reasons:

  1. The company spends money on mailings, but the messages are never delivered.
  2. A large percentage of messages to invalid addresses can harm the sender’s reputation.

Mindbox cleans data at the customer sign-up stage: it corrects typos in common names and surnames, for example, Boob will be saved as Bob. It also identifies popular fake contacts that people often use instead of real ones and does not add them to the database. Examples of such contacts: mail@gmail.com or +7 (999) 9999999.
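A sign-up check for fake contacts might look roughly like the sketch below. The fake-email list contains only the example from the text, and both the email pattern and the “all subscriber digits identical” phone rule are simplifying assumptions rather than Mindbox’s real rules.

```python
import re

FAKE_EMAILS = {"mail@gmail.com"}  # illustrative; a real list would be far longer
EMAIL_SHAPE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # deliberately loose


def classify_email(raw):
    email = raw.strip().lower()
    if not EMAIL_SHAPE.match(email):
        return "malformed"
    if email in FAKE_EMAILS:
        return "fake"
    return "ok"


def classify_phone(raw):
    digits = re.sub(r"\D", "", raw)  # keep digits only
    if not 10 <= len(digits) <= 15:
        return "malformed"
    if len(set(digits[-10:])) == 1:  # e.g. 9999999999
        return "fake"
    return "ok"


results = {
    "fake_email": classify_email(" Mail@Gmail.com"),
    "good_email": classify_email("ann@example.com"),
    "bad_email": classify_email("not-an-email"),
    "fake_phone": classify_phone("+7 (999) 9999999"),
    "good_phone": classify_phone("+7 (912) 345-67-89"),
}
```

Contacts classified as fake or malformed would simply not be added to the database at sign-up.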

Invalid email addresses and phone numbers look just like real ones, but emails and SMS messages cannot be delivered to them. Such contacts are marked as invalid, and messages are no longer sent to them.

Mindbox is constantly enriching customer data. For example, in the beginning, nothing is known about a person except for an email address. He or she receives an email with a unique link (that is, one generated specifically for him or her), opens the email on a home computer, and goes to the site. At this point, the Mindbox JS tracker identifies the client, records information about the device used to access the website, and records all of the actions on the site in a single client profile. If later on the client opens the email again on a phone and follows the link to the site, this device will also be recorded under the same profile. The advantage here is that client actions will still be stored in Mindbox even if the client is not logged in on the website.

Even if the client is not yet in the database, the system still records all of the client’s actions on the website, including pages and categories viewed, additions to the cart, and so forth. If during the session, a person performs any action that allows us to identify him or her, for example, signs up via a pop-up or leaves a phone number for a return call, the person’s anonymity will be lost, and all actions of their session will be recorded in the profile.
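The way anonymous actions are later attached to a profile can be illustrated with a toy tracker. The `Tracker` class, the device and customer IDs, and the action strings are all hypothetical, and the real Mindbox JS tracker is not shown here.

```python
class Tracker:
    """Toy tracker: buffers anonymous actions per device, then attaches them."""

    def __init__(self):
        self.anonymous = {}     # device_id -> actions recorded before identification
        self.device_owner = {}  # device_id -> customer_id
        self.profiles = {}      # customer_id -> full action history

    def track(self, device_id, action):
        owner = self.device_owner.get(device_id)
        if owner is None:
            self.anonymous.setdefault(device_id, []).append(action)
        else:
            self.profiles[owner].append(action)

    def identify(self, device_id, customer_id):
        """Called when the visitor signs up, follows a unique link, and so on."""
        self.device_owner[device_id] = customer_id
        history = self.profiles.setdefault(customer_id, [])
        history.extend(self.anonymous.pop(device_id, []))


tracker = Tracker()
tracker.track("laptop", "viewed category: sofas")
tracker.track("laptop", "added to cart: sofa-17")
tracker.identify("laptop", "customer-1")  # signed up via a pop-up
tracker.track("laptop", "placed order")
tracker.identify("phone", "customer-1")   # followed a unique link on a phone
tracker.track("phone", "viewed product: sofa-17")
```

After identification, the pre-registration actions and the actions from both devices sit in one history.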

Example of a profile with actions recorded prior to registering

Becoming a backend

If the company launches a loyalty program, the client needs to be identified at every interaction point: on websites, offline, in the mobile app, and so on. This requires end-to-end authorization, and for this, you have to create a single database and synchronize data transfers from different channels so that up-to-date data on orders, discounts, and bonus point accounts is available in the online store, at the cash desk, and in the mobile app.

An easier path is to use Mindbox as a backend, that is, as the main customer data store that all company systems query. This speeds up operations, as all requests are sent to just one place, answers arrive in a split second, and there are no data discrepancies.

Before Mindbox
After Mindbox

Mindbox as a back office for the Fran furniture chain personal profile

Phone numbers and email addresses of loyalty program participants receive an enhanced security status. When someone attempts to re-register such a contact, a message is shown that it is already in the database, and the client is asked to either log in or reset the associated password.

To safeguard customer bonus points, special deduplication rules are employed: members of the loyalty program cannot lose access to their privileges. In practice, this means that scammers or bad actors can’t get access to an active loyalty program account even if they know the client’s associated email address and phone number.


To build out a smart marketing strategy, you need to know your clients and your history of interactions with them. A CDP provides companies with exactly this data: it combines information from different databases and stores information about actions, orders, and products. A CDP can also become a master database, that is, a back office for all company systems. The result is end-to-end customer identification across all channels and a foundation for an omnichannel approach.

This material was prepared by

Marianna Lyubarova, Editor

Gleb Efanov, CDP Product Owner
