Today on the Demand Gen Pod, Episode 14, Ryan discusses duplicate records and why they occur in CRM tools like
Salesforce. He gives examples of duplicates, such as parents and children
sharing the same email address, or multiple purchase history records for one
contact. Ryan explains how different marketing tools handle duplicates and the
importance of managing them to avoid wasted resources and poor reporting. He
suggests ways to prevent duplicates, including implementing data entry
guidelines and using automation tools. Managing duplicates effectively is
crucial for maintaining data accuracy and improving the customer experience in
marketing campaigns.
Summary notes from Episode 14:
Ryan introduces the topic of duplicate records and their impact on data
quality and messaging consistency. Duplicate records can occur within the same
object or across multiple objects. One example of duplicate records is in the
medical industry, where a company may have records for both a parent and a
child who share one email address. In email marketing, such records collide in tools like
Eloqua, where the email address is the unique identifier. Duplicate records
can lead to wasted resources, poor reporting, inconsistent messaging, and
potential customer dissatisfaction. Manual review and data analysis can help
identify duplicate records, as can comparing common data fields and
attributes like email, name, phone number, and unique identifiers. Automated
tools with duplicate detection algorithms can also assist in identifying
duplicates based on similarities in names, addresses, etc. Duplicate
records can be merged or removed to consolidate data and
avoid confusion. Following up with individuals to clarify duplicate records
can be time-consuming but is an option. Merging duplicates prioritizes data
quality and ensures accurate information is retained and communicated to
important teams like sales.
Full Transcript:
00:01
Hello and welcome to the Demand Gen Pod. My name is Ryan. Today we’re doing something a little differently. If you’re listening on Spotify or Apple Podcasts or wherever you may get your podcasts, you may notice a difference, but today we’re going no video because I’ve spent the first half of my day trying to make the video work and it is just not going to happen. With that said, we’re talking today about duplicate records and why you might have duplicate records and what you can be doing with them to either clean them up if you need to clean them up, and why you may not need to clean them up if you have them. So let’s just start really quick with an example here.
00:39
So if we have a CRM tool like Salesforce, okay, Salesforce allows you to specify what you’re going to use as a unique identifier for each record. And generally speaking, it’s going to be the object id. Now, without getting too far off track so quickly, you can group objects together and connect them by the object id. Say, for example, a contact id can be mapped to a contact id in a different object, but that object’s id is going to be its own unique identifier, say a customer records object or a device object or something like that, for a history of device purchases. So what that means is that you could have multiple records, not only in multiple objects necessarily, but even multiple records within a single object.
01:38
And I can give you two examples of why this might be the case. The first is, I guess they overlap a little bit, but the first is that you could have a parent and a child, and the parent could be purchasing something on behalf of the child and therefore be using their own email address. But it may not be for the parent, it may be for the child. This might happen in the medical industry. We work with a lot of medtech companies and I think that one of the things that we see a lot is that you have a parent who might have a medical need and their child has the same need. And so a company might have a record for both the parent and the child.
02:20
The reality is that the parent is using their email address for both themselves and for the child. That creates a duplicate record, even though technically speaking it’s not a duplicate. They’re two different people with the same email address. As far as a platform like Salesforce is concerned, it doesn’t really care that you have different email or, sorry, the same email address. What it really cares about is that they have two different, say contact ids. So you might have a family as an account and then multiple contacts going into that account like a parent and a child. And those two individuals may have the same email address, all fine and dandy. On a Salesforce side. What about when you start going over to email marketing though?
03:04
So when you start getting over into the email marketing side, you start to run into problems really quickly where, depending on which platform you’re using, you may have to use email address as the unique identifier on the email tool, even though email address is not a unique identifier as far as Salesforce is concerned, which may be where your source of truth is and where all of your data is coming from. So you go to do a sync, you sync records over, you have two records with the exact same email address, and they may get merged or one of them may be just completely ignored and not brought over into your email tool. An example of this is Eloqua, which is a tool that’s made by Oracle. It’s a pretty big kind of like Fortune 500 kind of company level tool.
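The sync problem described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names, and the `sync` helper are hypothetical, and real tools may merge or skip colliding records rather than overwrite them.

```python
# Hypothetical CRM export: two distinct contacts (parent and child)
# that share one email address.
crm_contacts = [
    {"contact_id": "C-001", "name": "Pat Parent", "email": "pat@example.com"},
    {"contact_id": "C-002", "name": "Chris Child", "email": "pat@example.com"},
]


def sync(records, key):
    """Load records into a store keyed on `key`.

    Assumption for this sketch: a later record with the same key value
    overwrites the earlier one.
    """
    store = {}
    for record in records:
        store[record[key]] = record
    return store


by_contact_id = sync(crm_contacts, "contact_id")  # CRM-style unique identifier
by_email = sync(crm_contacts, "email")            # email-tool-style unique identifier

print(len(by_contact_id))  # 2
print(len(by_email))       # 1
```

Keyed on contact id, both people survive the sync. Keyed on email address, only one record is left, which is why you need to know which one the email tool will keep.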
03:47
It’s pretty expensive and we work in it all the time. And in Eloqua the email address is the unique identifier. This is relevant because when you go to send an email and you need to send an email to a specific person where you have a duplicate record, you need to be able to make sure that you’re specifying which record to send to. Do you send it to the parent or do you send it to the child? The second example where you might have duplicate records in Salesforce is going to be with, say, purchases. So you may have a new record for each purchase that happens. And this could be more relevant in a custom object in Salesforce, like say if you had a purchase history object.
04:33
Okay, so you go and you make a purchase with a company and then whatever, six months later you make another purchase. Well they might have a single contact record, but they might have multiple device records and this is important or, sorry, multiple purchase history records. So let’s just say that you’re purchasing some sort of medical device. This is important because if you are going to then email somebody about a particular purchase that they made, you need to be able to ensure that you’re emailing them about the correct purchase. And if there’s any relevant data with that purchase that you need to show, say in field merges you need to be able to pull that over and you need to know that you’re pulling the right record over.
05:11
If they’ve made two purchases and you have a record of two purchases, then you need to make sure that you have some sort of criteria defining which one is which. All of that aside, there are lots of different marketing tools for email marketing out there. And another one that lots of companies are using and starting to use more is Salesforce Marketing Cloud. And Marketing Cloud handles it a little bit differently. It requires that you define what the unique identifier is for every single, let’s call it like the database that you’re pulling over. So every time that you pull over a new list of people into Marketing Cloud, you need to define what the unique identifier is for that group of people. So you could define that it’s the contact id, or you could define that it’s the purchase history id.
06:00
So it doesn’t really matter how many purchases that person may have made, or it doesn’t matter how many contact records have the same email address, it will pull over every single one and will email every single one. Obviously there’s a catch there too where you might not want to do that, right? You might not want to send a marketing email to somebody ten times because they’ve made ten purchases, you might just want to send them one email. And Marketing Cloud allows you to manage that too. Even if you are specifying that the unique identifier is contact id, let’s say, you can still also specify to dedupe on email or dedupe on id. So it’s entirely up to you.
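The idea of keeping every row in the list but deduplicating at send time can be sketched like this. It is not Marketing Cloud's actual API. The `dedupe_for_send` function and the sample rows are hypothetical, and keeping the first row seen per key is an assumption made for the sketch.

```python
def dedupe_for_send(rows, key_func):
    """Keep the first row seen for each dedupe key, preserving list order."""
    seen = set()
    recipients = []
    for row in rows:
        key = key_func(row)
        if key not in seen:
            seen.add(key)
            recipients.append(row)
    return recipients


# Hypothetical audience keyed on purchase id: one row per purchase.
audience = [
    {"purchase_id": "P-1", "contact_id": "C-001", "email": "pat@example.com"},
    {"purchase_id": "P-2", "contact_id": "C-001", "email": "pat@example.com"},
    {"purchase_id": "P-3", "contact_id": "C-002", "email": "Pat@Example.com"},
    {"purchase_id": "P-4", "contact_id": "C-003", "email": "sam@example.com"},
]

# Dedupe on email (case-insensitive): one message per inbox.
by_email = dedupe_for_send(audience, lambda row: row["email"].strip().lower())
# Dedupe on contact id: one message per person, even if they share an inbox.
by_contact = dedupe_for_send(audience, lambda row: row["contact_id"])

print([row["purchase_id"] for row in by_email])    # ['P-1', 'P-4']
print([row["purchase_id"] for row in by_contact])  # ['P-1', 'P-3', 'P-4']
```

Deduping on email sends one message to the shared inbox. Deduping on contact id sends one per person, so the parent and the child each get their own.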
06:39
With that said though, those are kind of good duplicates, but you also have bad duplicates, and it’s really important that you try to keep up on managing that stuff. I think the biggest reason, beyond things like data governance and just good cleansing and having a good healthy database, is cost, because the vast majority of, particularly larger, actually it doesn’t even matter. I mean, let’s be honest, the vast majority of marketing tools that you have are priced on a contact basis. They’re paid for on a contact basis. So you purchase ActiveCampaign to be able to have a thousand contacts in there. You purchase Salesforce to be able to have so many thousands of records. You purchase Eloqua, you pay per record sometimes for each person that’s in there. So you don’t want to have a bunch of duplicate records.
07:35
This goes beyond just email address, right? This could be different email addresses because somebody’s filled out a form twice with two different email addresses or multiple forms with multiple email addresses. Maybe they’ve moved so they have a record with multiple physical addresses, or changed their phone number. These are all things that you might get duplicate records over depending on how your data structure is set up. So managing those duplicates is really important for effective marketing because it helps maintain that data accuracy and improves the overall customer experience. I mean, it would look really bad if you’re sending them an email with the wrong information from an address that they had six years ago, right? But like I mentioned, duplicate records can also lead to wasted resources, poor reporting, inconsistent messaging, and then therefore maybe some angry customers.
08:22
To identify them, sometimes it really comes down to a manual review and analysis of data, and that can help you to identify duplicate records. Also, there are common data fields and attributes to look for. They include email, name, phone number, other unique identifiers, physical addresses, whatever other things that you may take into account when you’re thinking about a unique person and how that’s actually mapped into your data tool. But there are also automated tools like duplicate detection algorithms, and they can be used to streamline the process of identifying duplicates. And they’ll be looking for things like similar names, similar addresses, and then call out those similarities and then let a person decide, is that really a duplicate or are we just fine? In order to deal with them though, you can merge them together, or you can also remove duplicate records to consolidate and avoid any confusion.
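A duplicate detection pass of the kind described here might look like the sketch below. It uses Python's standard-library `difflib.SequenceMatcher` for name similarity. The records, the `candidate_duplicates` helper, and the 0.85 threshold are assumptions chosen for illustration, not settings from any particular tool.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Return a 0..1 similarity ratio between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_duplicates(records, threshold=0.85):
    """Flag pairs that share an email or have very similar names.

    The result is a list of candidates for a person to review,
    not a list of confirmed duplicates.
    """
    pairs = []
    for left, right in combinations(records, 2):
        same_email = left["email"].lower() == right["email"].lower()
        name_score = similarity(left["name"], right["name"])
        if same_email or name_score >= threshold:
            pairs.append((left["id"], right["id"], round(name_score, 2)))
    return pairs


# Hypothetical records for illustration.
records = [
    {"id": 1, "name": "Jonathan Smith", "email": "jon.smith@example.com"},
    {"id": 2, "name": "Jonathon Smith", "email": "jsmith@example.org"},
    {"id": 3, "name": "Maria Garcia", "email": "maria@example.com"},
    {"id": 4, "name": "Pat Parent", "email": "family@example.com"},
    {"id": 5, "name": "Chris Child", "email": "family@example.com"},
]

for left_id, right_id, score in candidate_duplicates(records):
    print(left_id, right_id, score)
```

Records 1 and 2 are flagged because the names are nearly identical, and records 4 and 5 because they share an email address. The second pair is the parent and child case, a good duplicate, which is why the last step is a person deciding rather than an automatic merge.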
09:18
If you’re not sure, you can also always follow up with the individual and say, hey, we have two addresses for you, we have two phone numbers for you, which is the right one? And try to get a response, although with a response rate of, say, maybe 10% to 15%, that can be a time consuming process, but it’s certainly an option. But by merging those duplicates together, you can prioritize your data quality, retain the most complete and accurate information, and then you can also be communicating any changes to important teams like, say, sales. If marketing is managing this, sales can certainly also manage this too. Larger companies probably have data teams that are managing this on behalf of marketing and sales, but regular data audits and ongoing data validation processes are really important for maintaining data integrity.
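One way to retain the most complete and accurate information in a merge is sketched below. The rule used here, the newest non-blank value wins for each field, is an assumption for illustration, as are the `merge_records` helper, the field names, and the `updated` date field.

```python
def merge_records(records):
    """Merge duplicate records into one.

    Assumed rule for this sketch: process records oldest first, so that
    for each field the most recently updated non-blank value wins.
    """
    ordered = sorted(records, key=lambda record: record["updated"])
    merged = {}
    for record in ordered:
        for field, value in record.items():
            if value not in (None, ""):
                merged[field] = value
    return merged


# Hypothetical duplicates: the newer record has the current address
# but is missing the phone number that the older record has.
older = {"email": "sam@example.com", "phone": "555-0100",
         "address": "1 Old Road", "updated": "2018-03-01"}
newer = {"email": "sam@example.com", "phone": "",
         "address": "9 New Street", "updated": "2024-06-15"}

merged = merge_records([newer, older])
print(merged["address"])  # 9 New Street
print(merged["phone"])    # 555-0100
```

The merged record keeps the current address from the newer record and the phone number that only the older record had, so nothing useful is lost when the duplicate is removed.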
10:03
Beyond handling duplicates, some ways that you can prevent them are implementing really strict data entry guidelines and protocols to minimize duplicates. You can leverage automation tools like form validation and real time duplicate checks to prevent a new record from being created. You can also regularly monitor data hygiene, updating records, and you can enforce some data governance policies that can prevent duplicate records from occurring, so that when data entry is happening, a duplicate search is part of the process of updating records. Another good way that you can prevent it is with that form validation. And so for example, you can also reduce the potential issue of it happening simply by adding things like, again, whatever the unique identifier is, to make it as easy as possible to prevent somebody from putting in something different.
11:02
So you can do that with field merges inside of forms. So basically how that works is that when somebody goes and submits a form, unless they explicitly opt out, a cookie will be tracked on their browser. And then when they go to submit the form, say another form on the website somewhere else, you can pre populate the form with certain fields. So say first name, last name, phone number and email or whatever they’ve already provided you, and it’s entirely up to you on whether or not you want to even ask for some of those things. So one of the nice things I think about the form side is that the other thing you can do is to have dynamic forms where they’re asking only for data that you haven’t gotten yet. So let’s say on the first form you only ask for email address.
11:52
Email address will stick as the unique identifier. On the second form you populate the email address, and then you ask for first and last name. On the third form you ask for first, last and phone number and company, let’s say, okay, if it’s maybe b to b. So in this scenario, you’re getting all of that data in progressive steps, but you can also pre populate the data that you already have to reduce the chance of having some sort of dirty data being populated. And it also gives them the opportunity to fix something if it’s wrong, obviously, with the exception of email address, which would be the unique identifier. But if they had spelt their name wrong, and you’re showing them their name as having been spelt wrong, they may correct it this time.
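The progressive form idea above can be sketched as a small function that pre populates what is already known and asks only for the next missing fields. The field order, the `build_form` helper, and the `max_new_fields` parameter are hypothetical and not taken from any specific form tool.

```python
# Assumed order in which fields are collected across successive forms.
FIELD_ORDER = ["email", "first_name", "last_name", "phone", "company"]


def build_form(known, max_new_fields=2):
    """Decide what to pre populate and what to ask for next.

    `known` holds the values already captured for this visitor.
    """
    prefilled = {field: known[field] for field in FIELD_ORDER if known.get(field)}
    missing = [field for field in FIELD_ORDER if not known.get(field)]
    return {"prefilled": prefilled, "ask_for": missing[:max_new_fields]}


# First visit: nothing known, so ask for email only.
print(build_form({}, max_new_fields=1)["ask_for"])
# ['email']

# Second visit: email is pre populated, ask for first and last name.
print(build_form({"email": "a@example.com"})["ask_for"])
# ['first_name', 'last_name']

# Third visit: names are known too, ask for phone and company.
known = {"email": "a@example.com", "first_name": "Ana", "last_name": "Lee"}
print(build_form(known)["ask_for"])
# ['phone', 'company']
```

Because the known values come back in `prefilled`, the visitor sees what is on file and can correct it, while the email address stays fixed as the identifier.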
12:32
And then you can make sure that on the processing side of the form, it’s updating the form field values when something has changed. Okay. And normally you can do something like update the form field data as long as the new value is not blank. So as long as they’ve typed something in there, we’re going to update it with whatever they put in again, even if it’s the same thing. Some challenges that you can find here are obviously data in different formats and data quality variations, and some data mapping complexities.
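The update rule just described, overwrite a field only when the new value is not blank, can be sketched as follows. The `apply_submission` helper and the field names are hypothetical, and leaving the email address untouched reflects the assumption that it is the unique identifier.

```python
def apply_submission(record, submission, key_field="email"):
    """Return a copy of `record` updated from a form submission.

    A field is overwritten only when the submitted value is not blank.
    The key field is never changed because it identifies the record.
    """
    updated = dict(record)
    for field, value in submission.items():
        if field == key_field:
            continue
        if value is not None and value.strip() != "":
            updated[field] = value.strip()
    return updated


# Hypothetical existing record with a misspelt first name.
record = {"email": "a@example.com", "first_name": "Jonh", "phone": "555-0100"}
# The visitor corrects the name and leaves the phone field empty.
submission = {"email": "a@example.com", "first_name": "John", "phone": ""}

result = apply_submission(record, submission)
print(result["first_name"])  # John
print(result["phone"])       # 555-0100
```

The corrected name is saved, and the blank phone field does not wipe out the number already on file.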
13:06
We were on a call with a client just a couple of days ago and we were just kind of kicking off a new client and we said, well, we’d love to see all the data that you have access to because as we go to build out marketing campaigns, we want to think about what data can we leverage in order to send campaigns. And the data team said, well, why don’t you tell us what you want? I said, well, I don’t really know what I want if I don’t know what I can have. So I can say, well, it’d be nice to have this or be nice to have that, but what if you have something else that I don’t even know that you have? And they said, well, okay, well, the reason is that we have 3000 fields. Oh, okay.
13:45
Okay, so I don’t want 3000 fields. That doesn’t do anybody any good. Having to think about what all 3000 fields are is not very helpful in the grand scheme of things. So instead, sometimes the way that we want to go about that is to talk about what on a high level you might like to accomplish and then work with a data science team for you as a marketer to say, I would like to accomplish this. It would be nice if we could accomplish that. And hopefully a conversation comes out of that to say, oh, okay, well, if you want to send to people based on the last time that they purchased, we do have that data. We also have data which is the average length of time between purchases, if you’d like to use that. Oh, okay, great.
14:31
Well, I hadn’t thought about leveraging that, so we could definitely use that as well. So conversations like that need to come out. And that is one of the big things with data mapping that can be kind of challenging. And then also why it’s just so important to manage duplicates effectively. So managing duplicates in marketing automation platforms and CRM is just so important. And it’s not something that has to be done really frequently. Because depending on, again, how many submissions you’re getting, depending on a number of factors, it may not necessarily be that big of a deal. But at the end of the day, it’s important to know that the people who you have in your marketing list are active, ready to receive communications from you and also willing. So all those are really important.
15:17
I hope that you’ve enjoyed our episode today and I do hope that we see you next week and maybe you’ll actually see me if you do watch this on YouTube. Sorry about that and catch you soon.