Shopify + Salesforce + HubSpot: A Practical Guide to Unified Customer Data

William Flaiz • January 14, 2026

You've got three platforms. Each one holds a piece of your customer puzzle. Shopify knows what they bought. Salesforce tracks the sales conversations. HubSpot manages the marketing touches.

And none of them agree on who "John Smith" actually is.

This is the reality for most growing businesses. The tools work great individually. But getting them to share a single, accurate view of your customer? That's where things get messy.

Here's a practical guide to unifying customer data across these three platforms—without writing custom code or hiring a data engineer.

Why These Three Systems Fight Each Other

Before diving into solutions, it helps to understand why the problem exists in the first place.

Each platform was built with different priorities.

Shopify cares about transactions. A customer is defined by their email at checkout. Maybe their shipping address. It doesn't care much about company hierarchies or lead scoring. Someone buys something, Shopify captures the sale.

Salesforce lives in a different world entirely. Contacts belong to Accounts. Accounts have hierarchies. Opportunities tie to specific people who influence purchase decisions. The whole structure assumes complex B2B sales cycles.

HubSpot sits somewhere in between. Contacts have properties. Those contacts can belong to companies. Marketing campaigns create new contacts constantly—webinar signups, ebook downloads, demo requests. Volume matters here.

Three different philosophies. Three different data models. One customer trying to exist in all of them simultaneously.

The Schema Conflicts You'll Actually Hit

Let's get specific about what goes wrong.

Email as identifier (sounds simple, isn't)

Shopify: customer.email
Salesforce: Contact.Email
HubSpot: email

Easy match, right? Until someone uses their work email in HubSpot, personal email in Shopify, and their assistant's email got entered in Salesforce. Same person. Three different identities.

Name field variations

Shopify stores first_name and last_name separately. Clean, predictable.
Salesforce has FirstName, LastName, plus Suffix, MiddleName, and Salutation. More fields mean more opportunities for inconsistency.
HubSpot uses firstname and lastname (lowercase, no underscore). It also has hs_full_name that sometimes gets populated, sometimes doesn't.

Phone number formatting

Shopify: +1 (555) 123-4567
Salesforce: 555.123.4567
HubSpot: 5551234567

Same number. Completely different strings. A naive merge will create three records for one customer.

Address components

Shopify breaks addresses into address1, address2, city, province, country, zip.
Salesforce has MailingStreet, MailingCity, MailingState, MailingCountry, MailingPostalCode on the Contact, plus a separate set of "Other Address" fields. Billing and shipping addresses live on the Account.
HubSpot stores address, city, state, country, zip. Similar to Shopify, but the field names don't match.

Merging address data manually means mapping every field. Miss one, lose data.
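If you do want to script the address mapping, a minimal sketch might look like this. The source field names are the ones listed above. The unified names on the right (street, region, postal_code) and the unify_address helper are my own assumptions, not anything the platforms define. The helper also reports any field it couldn't map, so nothing disappears silently.

```python
# Hypothetical per-platform field maps. Source names come from the lists
# above; the unified names on the right are an assumption for illustration.
ADDRESS_MAPS = {
    "shopify": {
        "address1": "street", "address2": "street2", "city": "city",
        "province": "region", "country": "country", "zip": "postal_code",
    },
    "salesforce": {
        "MailingStreet": "street", "MailingCity": "city",
        "MailingState": "region", "MailingCountry": "country",
        "MailingPostalCode": "postal_code",
    },
    "hubspot": {
        "address": "street", "city": "city", "state": "region",
        "country": "country", "zip": "postal_code",
    },
}


def unify_address(platform, record):
    """Rename one platform's address fields and list anything left unmapped."""
    field_map = ADDRESS_MAPS[platform]
    unified = {}
    unmapped = []
    for key, value in record.items():
        if key in field_map:
            unified[field_map[key]] = value
        else:
            unmapped.append(key)
    # Fold Shopify's second address line into the single street value.
    street2 = unified.pop("street2", "")
    if street2:
        unified["street"] = f"{unified.get('street', '')}, {street2}".strip(", ")
    return unified, unmapped
```

Anything that comes back in the unmapped list is a field you either need to add to the map or consciously decide to drop.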

Identity Resolution: Finding the Same Person Across Platforms

This is the core challenge. You have records from three systems. Some represent the same person. Some don't. How do you figure out which is which?

Method 1: Exact email matching

The simplest approach. Match records where emails are identical.

Works well when:

Customers use consistent emails everywhere
B2B contexts where corporate emails are standard
Clean, maintained databases

Falls apart when:

People use multiple email addresses
Personal vs. work email situations
Typos in email entry
Partner or assistant emails entered instead of the actual contact

Email matching will find maybe 60-70% of your true duplicates if you're lucky. It's a start, not a solution.
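For readers who prefer a script to a spreadsheet, here is a rough sketch of exact email matching. It assumes each record is a dict with source, id, and email keys (hypothetical names), and it treats emails as identical after trimming whitespace and lowercasing.

```python
from collections import defaultdict


def group_by_email(records):
    """Group records whose email is identical after trimming and lowercasing."""
    groups = defaultdict(list)
    for rec in records:
        email = (rec.get("email") or "").strip().lower()
        if email:
            groups[email].append(rec)
    # Keep only emails that appear on more than one record.
    return {email: recs for email, recs in groups.items() if len(recs) > 1}


# Illustrative records; the IDs are made up.
records = [
    {"source": "shopify", "id": "S1", "email": "john@gmail.com"},
    {"source": "salesforce", "id": "C7", "email": "jsmith@acme.com"},
    {"source": "hubspot", "id": "H3", "email": "John@Gmail.com "},
]
matches = group_by_email(records)
```

Here the Shopify and HubSpot records group together, and the Salesforce record is left out because its email is different. That leftover record is exactly the kind of miss the next two methods try to catch.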

Method 2: Fuzzy name + company matching

When emails don't match, look at name and company combinations.

"Jon Smith at Acme Corp" and "Jonathan Smith at ACME Corporation" are probably the same person. Exact string matching won't catch this. You need fuzzy matching that understands "Jon" and "Jonathan" are related, and that "Corp" and "Corporation" mean the same thing.

This approach catches another 15-20% of duplicates that exact matching misses. But it also introduces false positives. "John Smith at Acme" and "John Smith at Acme Tools" might be different people entirely.
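One way to sketch this in Python uses the standard library's difflib for string similarity, plus two small lookup tables. The nickname table, the suffix list, and the 0.9 threshold are all assumptions for illustration. A real setup would need much longer lists and a threshold tuned on your own data.

```python
import re
from difflib import SequenceMatcher

# Tiny illustrative lookup tables; real lists would be much longer.
NICKNAMES = {"jon": "jonathan", "bill": "william", "liz": "elizabeth"}
COMPANY_SUFFIXES = {"corp", "corporation", "inc", "incorporated", "llc", "ltd", "co"}


def normalize_name(name):
    """Lowercase, drop punctuation, and expand known nicknames."""
    parts = re.sub(r"[^a-z\s]", "", name.lower()).split()
    return " ".join(NICKNAMES.get(p, p) for p in parts)


def normalize_company(company):
    """Lowercase, drop punctuation, and remove legal suffixes like Corp."""
    parts = re.sub(r"[^a-z0-9\s]", "", company.lower()).split()
    return " ".join(p for p in parts if p not in COMPANY_SUFFIXES)


def similarity(a, b):
    return SequenceMatcher(None, a, b).ratio()


def fuzzy_match(rec_a, rec_b, threshold=0.9):
    """Both name and company must clear the threshold to count as a match."""
    name_score = similarity(normalize_name(rec_a["name"]),
                            normalize_name(rec_b["name"]))
    company_score = similarity(normalize_company(rec_a["company"]),
                               normalize_company(rec_b["company"]))
    return min(name_score, company_score) >= threshold
```

With these settings, "Jon Smith at Acme Corp" matches "Jonathan Smith at ACME Corporation", while "John Smith at Acme" and "John Smith at Acme Tools" stay separate because the company names differ too much. Lower the threshold and that second pair starts to merge, which is the false-positive risk in practice.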

Method 3: Semantic similarity

The most sophisticated approach uses AI to understand meaning, not just strings.

Instead of comparing characters, semantic matching compares the overall meaning of records. It considers multiple fields together—name, company, email domain, phone area code, location. A record that matches on three of five fields might score higher than one that matches perfectly on just email.

This is how modern data cleaning tools find the duplicates humans miss. And it's the only reliable method when dealing with messy, real-world data from multiple sources.
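Real semantic matching typically relies on learned models, which won't fit in a short snippet. The sketch below swaps in something much simpler, a weighted field-agreement score, purely to illustrate the "several fields together" idea. The weights, the field names, and the US-style area code logic are all assumptions.

```python
# Hypothetical weights in points out of 100. This is a plain field-agreement
# score standing in for a real semantic model, not an implementation of one.
WEIGHTS = {"name": 30, "company": 25, "email_domain": 20,
           "area_code": 10, "city": 15}


def features(rec):
    """Pull the five comparison fields out of a raw record."""
    email = rec.get("email", "").lower()
    digits = "".join(ch for ch in rec.get("phone", "") if ch.isdigit())
    return {
        "name": rec.get("name", "").lower(),
        "company": rec.get("company", "").lower(),
        "email_domain": email.split("@")[-1] if "@" in email else "",
        # Assumes 10-digit US numbers, optionally prefixed with country code 1.
        "area_code": digits[-10:-7] if len(digits) >= 10 else "",
        "city": rec.get("city", "").lower(),
    }


def match_score(rec_a, rec_b):
    """Sum the weights of every non-empty field on which both records agree."""
    fa, fb = features(rec_a), features(rec_b)
    return sum(weight for field, weight in WEIGHTS.items()
               if fa[field] and fa[field] == fb[field])
```

Under these weights, two records that agree on name, company, and city score 70, while two that share only an email domain score 20. That's the intuition behind three out of five fields beating a single strong signal.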

A Practical Merge Workflow

Here's a step-by-step process that works without custom code.

Step 1: Export your data

Pull customer/contact data from all three platforms.

From Shopify: Admin → Customers → Export → CSV

From Salesforce: Reports → Contacts → Export (or Data Export if you have bulk access)

From HubSpot: Contacts → Export → All contacts

You'll end up with three files. Different columns, different formats, same underlying people (hopefully).

Step 2: Decide on your master source

Before merging, choose which system wins when data conflicts. This matters more than you'd think.

If Salesforce is your CRM of record for the sales team, make it the master for company and contact relationship data.

If HubSpot is running your marketing, it should be authoritative for email preferences and subscription status.

If Shopify tracks purchases, it's the master for transaction history and lifetime value.

You can't have three masters for the same field. Pick one primary source per field type.
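If it helps to see the rule written down, here is a small sketch of a per-field precedence list. The field names and the ordering are assumptions based on the examples above, and resolve_field is a hypothetical helper. Your own hierarchy may well differ.

```python
# Hypothetical precedence: the first platform listed wins if it has a value.
FIELD_MASTERS = {
    "company": ["salesforce", "hubspot"],
    "name": ["salesforce", "hubspot", "shopify"],
    "email_opt_in": ["hubspot", "shopify", "salesforce"],
    "lifetime_value": ["shopify"],
}


def resolve_field(field, values_by_source):
    """Return (value, winning_source) for one field, or (None, None)."""
    for source in FIELD_MASTERS[field]:
        value = values_by_source.get(source)
        # False and 0 are real values, so only skip missing or blank entries.
        if value not in (None, ""):
            return value, source
    return None, None
```

For example, resolve_field("name", {"shopify": "John Smith", "salesforce": "Jonathan Smith"}) returns ("Jonathan Smith", "salesforce"). Returning the winning source alongside the value makes later review easier, because you can see why a value was chosen.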

Step 3: Map your fields

Create a mapping document. Here's what it might look like for basic contact info:

Unified Field	Shopify Source	Salesforce Source	HubSpot Source
email	customer.email	Contact.Email	email
first_name	first_name	FirstName	firstname
last_name	last_name	LastName	lastname
phone	phone	Phone	phone
company	—	Account.Name	company

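Notice Shopify has no company field on the customer itself. The closest thing is the optional company line on a saved address, which is often blank. That's a gap you'll usually need to fill from another source.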
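If you'd rather keep the mapping in code than in a document, it could look something like this. The column names are taken from the table above. Real export headers will probably differ, so treat them as placeholders. The to_unified helper is hypothetical.

```python
# Unified field -> source column, per platform. None marks a known gap.
# Column names follow the mapping table; actual export headers may differ.
FIELD_MAP = {
    "shopify": {"email": "email", "first_name": "first_name",
                "last_name": "last_name", "phone": "phone", "company": None},
    "salesforce": {"email": "Email", "first_name": "FirstName",
                   "last_name": "LastName", "phone": "Phone",
                   "company": "Account.Name"},
    "hubspot": {"email": "email", "first_name": "firstname",
                "last_name": "lastname", "phone": "phone",
                "company": "company"},
}


def to_unified(platform, row):
    """Return a row keyed by unified field names, with None for gaps."""
    unified = {"source": platform}
    for unified_name, source_name in FIELD_MAP[platform].items():
        unified[unified_name] = row.get(source_name) if source_name else None
    return unified
```

Keeping the source platform on every unified row pays off later, when you need to decide which system wins a conflict.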

Step 4: Standardize formats before matching

This step gets skipped way too often. Before trying to find duplicates, normalize your data.

Phone numbers should all follow the same format. E.164 international format (+15551234567) works across all three platforms and eliminates formatting discrepancies.
Email addresses should be lowercase. "John@Company.com" and "john@company.com" should match.
Names should have consistent capitalization. "JOHN SMITH" and "John Smith" should merge, not create duplicates.
Dates need a standard format. Shopify uses ISO dates. Salesforce might have MM/DD/YYYY. Pick one, convert everything.
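The four rules above can be sketched in a few small functions. This is a minimal version under some loud assumptions: phone numbers are 10-digit national numbers defaulting to country code 1, and dates arrive as either YYYY-MM-DD or MM/DD/YYYY. The function names are my own.

```python
import re
from datetime import datetime


def normalize_phone(raw, default_country_code="1"):
    """Convert a phone string to E.164, assuming 10-digit national numbers."""
    digits = re.sub(r"\D", "", raw or "")
    if len(digits) == 10:
        digits = default_country_code + digits
    # Other lengths pass through with a plus sign and should be reviewed.
    return "+" + digits if digits else ""


def normalize_email(raw):
    return (raw or "").strip().lower()


def normalize_name(raw):
    # Simple capitalization; names like "McDonald" or "Mary-Jane" need more care.
    return " ".join(part.capitalize() for part in (raw or "").split())


def normalize_date(raw, formats=("%Y-%m-%d", "%m/%d/%Y")):
    """Try each known input format and return an ISO 8601 date string."""
    for fmt in formats:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {raw!r}")
```

With this in place, "+1 (555) 123-4567", "555.123.4567", and "5551234567" all come out as +15551234567. Raising an error on unknown date formats is deliberate: a loud failure beats a silently wrong date.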

Step 5: Run duplicate detection

With standardized data, find your matches.

Start with exact email matching. That catches the obvious duplicates.
Then run fuzzy matching on name + company for records without email matches.
Finally, use semantic similarity for the remaining unmatched records.
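Matches come in pairs, but people can span all three systems, so the pairs need to be chained into groups. Here is one possible sketch using a small union-find structure. It runs an exact email pass, then a fuzzy pass over records that aren't already linked to each other, which is slightly broader than "records without email matches" so that a leftover record can still join an existing group. The fuzzy_match argument is a caller-supplied function, and the third (semantic) pass is left out.

```python
from collections import defaultdict


def cluster_records(records, fuzzy_match):
    """Link records in passes, then return clusters of record indexes.

    `fuzzy_match(a, b)` is a caller-supplied (hypothetical) function that
    returns True when two records look like the same person.
    """
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps lookups short
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    # Pass 1: exact email matches (emails assumed already normalized).
    by_email = defaultdict(list)
    for i, rec in enumerate(records):
        if rec.get("email"):
            by_email[rec["email"]].append(i)
    for indexes in by_email.values():
        for i in indexes[1:]:
            union(indexes[0], i)

    # Pass 2: fuzzy comparison between records not already linked.
    # This is quadratic, so large datasets would need blocking first.
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if find(i) != find(j) and fuzzy_match(records[i], records[j]):
                union(i, j)

    clusters = defaultdict(list)
    for i in range(len(records)):
        clusters[find(i)].append(i)
    return sorted(clusters.values())
```

Each returned cluster is a list of record positions that should be reviewed as one candidate customer.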

Review the suggested matches before merging. Automated matching is smart, but human review catches edge cases.

Step 6: Merge and resolve conflicts

When you find a match, combine the records using your master source hierarchy.

Record A (Shopify): email = john@gmail.com, name = John Smith
Record B (Salesforce): email = jsmith@acme.com, name = Jonathan Smith, company = Acme Corp
Record C (HubSpot): email = john@gmail.com, name = Jon Smith

Merged record (using Salesforce as name master, preserving all emails):

Primary email: jsmith@acme.com (work)
Secondary email: john@gmail.com (personal)
Name: Jonathan Smith
Company: Acme Corp

All three platform identities now point to one unified customer record.
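As a rough sketch, the merge above could be scripted like this. The record IDs are made up, the dict keys are assumptions, and treating the name master's email as the primary one is my simplification of the work-versus-personal labeling shown in the example.

```python
def merge_cluster(records, name_master="salesforce"):
    """Merge matched records into one, keeping every distinct email."""
    by_source = {rec["source"]: rec for rec in records}
    master = by_source.get(name_master, records[0])

    # Collect distinct emails in first-seen order, master's email first.
    emails = []
    for rec in [master] + records:
        email = rec.get("email")
        if email and email not in emails:
            emails.append(email)

    company = next((r["company"] for r in records if r.get("company")), None)
    return {
        "name": master["name"],
        "company": company,
        "primary_email": emails[0] if emails else None,
        "secondary_emails": emails[1:],
        # Keep a pointer back to each platform's own record.
        "source_ids": {rec["source"]: rec["id"] for rec in records},
    }


# The three example records, with made-up IDs.
record_a = {"source": "shopify", "id": "S1",
            "email": "john@gmail.com", "name": "John Smith"}
record_b = {"source": "salesforce", "id": "C7", "email": "jsmith@acme.com",
            "name": "Jonathan Smith", "company": "Acme Corp"}
record_c = {"source": "hubspot", "id": "H3",
            "email": "john@gmail.com", "name": "Jon Smith"}
merged = merge_cluster([record_a, record_b, record_c])
```

The source_ids entry is what lets each platform identity point back to the same unified record.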

Field Mapping Examples That Actually Work

Here are practical mappings for common scenarios.

B2B SaaS company mapping:

Purpose	Shopify	Salesforce	HubSpot	Notes
Primary identifier	email	Email	email	Match on all three
Revenue data	total_spent	Total Revenue (formula)	hs_lifecyclestage_customer_date	Shopify is authoritative; the HubSpot property is a became-a-customer date, not an amount
Engagement score	—	Lead Score	HubSpot Score	HubSpot is authoritative
Sales stage	—	Opportunity.Stage	lifecyclestage	Salesforce is authoritative

E-commerce company mapping:

Purpose	Shopify	Salesforce	HubSpot	Notes
Order count	orders_count	—	number_of_orders	Shopify is authoritative
Last purchase	last_order_date	—	recent_conversion_date	Shopify is authoritative
Email opt-in	accepts_marketing	HasOptedOutOfEmail (inverse)	hs_email_optout (inverse)	HubSpot is authoritative
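The "(inverse)" notes in that last row are easy to get backwards, so here is a small sketch of the flip. It assumes the flags have already been parsed into real booleans (CSV exports usually give you strings), and unified_opt_in is a hypothetical helper.

```python
def unified_opt_in(platform, row):
    """Return True if the contact may be emailed, None if the flag is missing."""
    if platform == "shopify":
        value = row.get("accepts_marketing")
        return None if value is None else bool(value)
    if platform == "salesforce":
        # Opt-out flag, so the meaning is inverted.
        value = row.get("HasOptedOutOfEmail")
        return None if value is None else not value
    if platform == "hubspot":
        # Also an opt-out flag.
        value = row.get("hs_email_optout")
        return None if value is None else not value
    raise ValueError(f"Unknown platform: {platform}")
```

Returning None for a missing flag, instead of guessing, keeps "unknown" separate from "opted out". For consent data that distinction matters.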

Testing Your Merged Data

Before pushing unified data back to any system, validate it.

Sample check (quick validation)

Pull 50 random merged records. Manually verify 10 of them against the source systems. If more than 1 has errors, your process needs adjustment.
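Picking the sample by hand tends to favor records near the top of the file. A sketch like the one below draws it at random instead. The function names and the seed parameter are my own, and the pass rule simply encodes the "more than 1 has errors" threshold above.

```python
import random


def pick_review_sample(merged_records, pool_size=50, verify_count=10, seed=None):
    """Draw a random pool, then the subset to verify by hand."""
    rng = random.Random(seed)  # pass a seed to make the draw repeatable
    pool = rng.sample(merged_records, min(pool_size, len(merged_records)))
    to_verify = rng.sample(pool, min(verify_count, len(pool)))
    return pool, to_verify


def sample_passes(error_count, max_errors=1):
    """The rule of thumb above: more than one bad record means rework."""
    return error_count <= max_errors
```

Here merged_records needs to be a list. Fixing the seed lets a colleague reproduce the same sample when they double-check your review.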

Edge case review

Look specifically at:

Records that matched on fuzzy criteria (not exact email)
Customers with multiple email addresses
High-value customers where errors cost more
Recently created records (most likely to have issues)

Duplicate count comparison

If you started with 10,000 records across three systems and ended with 8,500 unified records, that's a 15% deduplication rate. Reasonable for moderately clean data.

If you're seeing 40%+ deduplication, either your data was really messy or your matching is too aggressive. Review the matches before proceeding.
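The arithmetic is simple enough to check in a couple of lines. The 40% cutoff comes from the rule of thumb above, and the function names are just for illustration.

```python
def dedup_rate(records_before, records_after):
    """Fraction of records removed by merging."""
    return (records_before - records_after) / records_before


def needs_review(rate, threshold=0.40):
    """Flag runs where the merge removed a suspiciously large share."""
    return rate >= threshold


rate = dedup_rate(10_000, 8_500)  # 1,500 of 10,000 records merged away
```

For the example above, the rate works out to 0.15 and needs_review(rate) is False.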

When Things Go Wrong: Rollback Planning

Always keep your original exports. Don't delete them after merging.

Before pushing merged data back to any platform, document:

What data existed before the merge
What changes you're making
How to reverse those changes if needed

Most platforms don't have a true "undo" for bulk data changes. Your rollback plan is reimporting the original data and manually fixing any records that got touched.
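One way to make that documentation concrete is a field-level change log built from a before and after snapshot. The sketch below assumes both snapshots are dicts keyed by record ID (a hypothetical structure), and the two helper names are my own.

```python
def diff_records(before, after):
    """List field-level changes between two snapshots keyed by record ID."""
    changes = []
    for record_id in sorted(set(before) | set(after)):
        old = before.get(record_id, {})
        new = after.get(record_id, {})  # empty if the record was removed
        for field in sorted(set(old) | set(new)):
            if old.get(field) != new.get(field):
                changes.append({"id": record_id, "field": field,
                                "old": old.get(field), "new": new.get(field)})
    return changes


def rollback_rows(changes):
    """Turn a change log back into the original values, grouped by record."""
    restore = {}
    for change in changes:
        restore.setdefault(change["id"], {})[change["field"]] = change["old"]
    return restore
```

The change log doubles as your "what changes you're making" document, and rollback_rows gives you the values to reimport if something goes wrong.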

This is tedious. Which is why you validate before pushing.

The Faster Path

Everything I've described works. It's also time-consuming. Manual exports, spreadsheet mapping, careful review—it adds up to hours of work for a few thousand records. Days for larger datasets.

That's exactly why we built CleanSmart.

Upload your exports. CleanSmart handles the standardization, runs semantic duplicate detection across all three files, and shows you proposed matches before anything changes. You review, approve, and download a unified dataset.

The manual process takes 4-8 hours for a mid-sized dataset. CleanSmart does it in minutes.

Ready to unify your customer data?

Upload your Shopify, Salesforce, and HubSpot exports with our Business plan and see exactly how many duplicates are hiding in your data.

Start cleaning for free →


William Flaiz is a digital transformation executive and former Novartis Executive Director who has led consolidation initiatives saving enterprises over $200M in operational costs. He holds MIT's Applied Generative AI certification and specializes in helping pharmaceutical and healthcare companies align MarTech with customer-centric objectives. Connect with him on LinkedIn or at williamflaiz.com.