The HubSpot Data Hygiene Cheat Sheet: Everything RevOps Needs in One Place


One-page HubSpot data hygiene reference: 6 quality dimensions, object-level best practices, maturity model, 30-day quick wins, and governance playbook.

Peter Sterkenburg | February 27, 2026 | 11 min read

HubSpot Solutions Architect & Revenue Operations expert. 20+ years B2B SaaS experience. Founder of HubHorizon.

Why data hygiene matters

You already know it matters. Here's what it costs when you ignore it:

Stat | Source
53% of sales teams report poor CRM data quality | Gartner, State of Sales Operations
550 hours wasted per sales rep annually chasing missing information | IndustrySelect
79% of CRM data entered by reps is inaccurate or incomplete | ESNA

Those numbers compound. Poor data quality wastes time and leaks revenue through orphan records nobody works, broken associations that drop handoffs, and reports nobody trusts. A CRM health score gives you the composite number. This cheat sheet gives you the breakdown.

The 6 data quality dimensions

Every data hygiene problem maps to one of six measurable dimensions. This framework replaces "our data is bad" with specific, fixable categories:

Dimension | What it measures | HubSpot example | Benchmark
Accuracy | Does data match reality? | Job titles that reflect current roles, deal amounts that match proposals | <5% cross-field conflicts
Completeness | Are critical fields populated? | Contacts with email, job title; deals with amount and close date | >85% fill rate on business-critical fields
Consistency | Does the same data match across records? | Lifecycle stage aligns with deal stage; contact company matches associated company | <5% cross-object contradictions
Validity | Does data follow defined formats? | Phone numbers in consistent format; numbers stored as numbers, not text | >95% format compliance
Uniqueness | Are there duplicates? | One record per real-world entity across contacts, companies, deals | <3% duplicate rate
Timeliness | Is data current? | Records updated within 90 days; no deals with close dates six months past | >60% updated quarterly

Governance sits above these dimensions: the policies and processes that keep them healthy. Integrity sits between the two: the validation rules and automation that enforce governance. The dimensions are the measurable result.
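To make the Completeness benchmark concrete, here is a minimal sketch of a fill-rate check against the >85% target. It assumes you have exported records as a list of dictionaries; the property names (email, jobtitle) and the sample rows are illustrative, not pulled from a real portal.

```python
from typing import Iterable, Mapping


def fill_rate(records: Iterable[Mapping[str, object]], field: str) -> float:
    """Share of records where `field` holds a non-blank value (0.0 to 1.0)."""
    rows = list(records)
    if not rows:
        return 0.0
    filled = 0
    for row in rows:
        value = row.get(field)
        # None and whitespace-only strings count as empty; 0 counts as filled.
        if value is not None and str(value).strip() != "":
            filled += 1
    return filled / len(rows)


# Hypothetical export rows (assumption: one dict per contact).
contacts = [
    {"email": "ana@example.com", "jobtitle": "CFO"},
    {"email": "ben@example.com", "jobtitle": ""},
    {"email": "", "jobtitle": "VP Sales"},
    {"email": "dee@example.com", "jobtitle": None},
]

for field in ("email", "jobtitle"):
    rate = fill_rate(contacts, field)
    status = "ok" if rate > 0.85 else "below benchmark"
    print(f"{field}: {rate:.0%} ({status})")
```

The same function works for any object once you decide which fields are business-critical for it.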

Key objects: hygiene best practices

HubSpot's core data model does not depend on a separate "Leads" object (the optional Leads object in Sales Hub aside). Contacts carry lifecycle stages instead. That simplifies some things and complicates others. Here's what hygiene looks like per object:

Object | Description | Hygiene best practices
Contacts | People: prospects, leads, customers, partners. Lifecycle stage tracks progression. | Enforce email format validation. Require lifecycle stage on creation. Schedule periodic audits to merge duplicates. Use progressive profiling to collect data over time, not all upfront. Auto-associate to companies.
Companies | Organisations linked to contacts and deals. | Use consistent naming conventions (decide on "Acme Inc." vs "ACME Corporation" and enforce it). Standardise industry codes with dropdowns. Integrate enrichment tools to keep firmographics current.
Deals | Revenue opportunities tracked through pipeline stages. | Define clear stage definitions and required fields per stage (e.g., "Closed Won" requires amount and close date). Use automation to prompt updates based on activity gaps. Associate every deal with at least one contact and one company.
Tickets | Support interactions tracking service issues and resolutions. | Use structured templates for ticket creation. Automate assignment and follow-up. Monitor resolution times. Associate tickets with contacts to build customer history.
Custom objects | Data structures for unique needs (partner management, product usage, subscriptions). | Define strict field requirements tailored to the object's purpose. Review quarterly to ensure alignment with evolving processes. Document thoroughly: custom objects are the first thing new team members misunderstand.
Activities | Emails, calls, meetings, notes, tasks logged against records. | Use connected inbox and meetings tool for automatic logging. Review activity records periodically for duplicates from overlapping integrations. Ensure activities associate to the right records: a logged email should link to the deal, not just the contact.
Campaigns | Marketing initiatives for engagement tracking and attribution. | Standardise campaign naming conventions (e.g., YYYY-MM_Channel_Name). Automate contact and deal association with campaigns. Periodically archive inactive campaigns.

For a deeper look at how broken associations fracture these objects, see Your HubSpot Portal Is Not a Source of Truth.
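The YYYY-MM_Channel_Name campaign convention from the objects overview is easy to check mechanically. The sketch below is one possible interpretation: it assumes the channel is alphanumeric and the name may also contain hyphens. Adjust the pattern to whatever your team actually agrees on.

```python
import re

# Assumed reading of YYYY-MM_Channel_Name: 4-digit year, month 01-12,
# an alphanumeric channel, then an alphanumeric (or hyphenated) name.
CAMPAIGN_NAME = re.compile(r"\d{4}-(0[1-9]|1[0-2])_[A-Za-z0-9]+_[A-Za-z0-9-]+")


def is_valid_campaign_name(name: str) -> bool:
    """True when the whole string follows the assumed naming pattern."""
    return CAMPAIGN_NAME.fullmatch(name) is not None


for candidate in ("2026-02_Email_SpringLaunch", "2026-13_Email_Launch", "Spring launch email"):
    print(candidate, "->", is_valid_campaign_name(candidate))
```

Run it against an export of campaign names before a quarterly audit to list the ones that need renaming.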

8 common challenges and HubSpot-native solutions

# | Challenge | What goes wrong | HubSpot solution
1 | Missing activity tracking | Emails, meetings, and calls go unlogged. Sales visibility has gaps. | Enable connected inbox and meetings tool for automatic capture. Use the HubSpot Sales Extension (Chrome/Outlook) for one-click logging. Configure meeting links to auto-create activities.
2 | Data decay | Contacts change roles, companies get acquired, phone numbers go stale. | Use Breeze Intelligence or third-party enrichment (Clearbit, ZoomInfo) for periodic refresh. Set up workflows to flag contacts with no activity in 12+ months. Run re-engagement campaigns before archiving.
3 | Inconsistent qualification | Reps interpret lifecycle stages differently. MQL means different things to different people. | Document lifecycle stage definitions. Use lead scoring with explicit thresholds. Set required properties per stage. Build workflows that enforce progression logic: no skipping stages without the right fields populated.
4 | Duplicates | Multiple records for the same person or company. Reporting overcounts. Automation fires twice. | Use HubSpot's native duplicate management. Set up Operations Hub data quality automation for format standardisation. Establish import rules that check for existing records before creating new ones. Train teams to search before creating.
5 | Reps not updating CRM | Manual entry is time-consuming. Records stay incomplete. | Minimise required fields to what actually matters. Use HubSpot Playbooks to guide reps through structured updates. Enable mobile app for quick field updates. Automate what you can: if a meeting is booked, the deal stage should update without manual input.
6 | Unstructured data entry | Free-text fields produce 50 variations of "Technology" as an industry. Reporting and automation break. | Replace free-text with dropdowns and radio selects wherever possible. Use Operations Hub data quality automation to format and standardise values on entry. Apply consistent picklist values across all forms and imports.
7 | Property sprawl | Too many custom properties, too few documented. Reps can't find the right field. Fill rates drop. | Enforce naming conventions and quarterly audits. Implement a property creation approval process. Archive properties below 5% fill rate that aren't used in workflows or reports. The average portal has 300-500 custom properties; only 30-40% are actively used. That's addition bias at work.
8 | Multiple systems of record | Data scattered across platforms. Each team trusts a different tool. Numbers don't reconcile. | Treat HubSpot as the lens, not the source: the place where data converges into a unified view. Integrate essential tools with clear field mapping. Consolidate overlapping integrations that create duplicate records. Document which system is authoritative for each data type.

Most of these challenges compound as technical debt if left unaddressed. Fix the process that creates the problem, not just the symptoms.
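Challenge 6 is a good candidate for a small clean-up script before you switch a free-text field to a dropdown. This is a rough sketch, not a HubSpot feature: the alias table is hypothetical and you would build your own from an export of the values actually in the field. Unmapped values come back as None so a human can review them.

```python
from typing import Optional

# Hypothetical alias table: free-text variants -> the single picklist value.
INDUSTRY_ALIASES = {
    "tech": "Technology",
    "technology": "Technology",
    "information technology": "Technology",
    "saas": "Technology",
}


def normalise_industry(raw: str) -> Optional[str]:
    """Map a free-text industry to a picklist value, or None if unknown."""
    # Collapse whitespace, ignore case, drop a trailing full stop.
    key = " ".join(raw.split()).casefold().rstrip(".")
    return INDUSTRY_ALIASES.get(key)


for value in ("  TECH ", "Technology.", "Information   Technology", "Farming"):
    print(repr(value), "->", normalise_industry(value))
```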

Data capture maturity model

Where does your portal sit? This four-level model adapts the RevOps maturity framework to data capture specifically:

Level | Name | Characteristics | HubSpot features in use
1 | Basic | Manual data entry. Minimal required fields. Limited validation. No automation. Reps enter what they feel like entering. | Forms, basic contact/deal creation
2 | Structured | Mandatory field implementation. Basic validation rules. Standardised dropdown values. Initial workflow automation. | Required fields, dropdowns, simple workflows, connected inbox
3 | Intelligent | Enrichment-powered data completion. Advanced validation and formatting. Multi-source integration. Operations Hub automating data quality. | Operations Hub, Breeze Intelligence, data quality automation, progressive profiling
4 | Optimised | Real-time data scoring. Continuous automation. Predictive analytics. Advanced compliance. Data quality is monitored, not manually maintained. | Predictive lead scoring, Breeze Copilot, automated health monitoring, AI readiness above 70%

Most portals sit at Level 2. The jump from Level 2 to Level 3 has the highest ROI — it's where automation replaces human discipline for data quality, and where Breeze AI features start working reliably.

Quick wins: your first 30 days

If you're starting from scratch or restarting after neglect, do these in order. Each fix makes the next one easier:

  1. Fix contact-to-company associations. Pull a list of contacts with no company association. In every portal I analyse, 15-40% of contacts are orphaned. These contacts are invisible to any account-based process. Fix this first — it has the highest revenue impact per fix.

  2. Run HubSpot's duplicate management tool. Merge obvious duplicates across contacts and companies. Set a baseline duplicate rate so you can track improvement.

  3. Audit custom properties. Export your property list. Flag everything below 10% fill rate that isn't used in an active workflow or report. Move to a "Deprecated" group. Don't delete yet — just separate.

  4. Enforce naming conventions. Pick a standard (snake_case, category prefixes, whatever works for your team). Apply it to new properties immediately. Retrofit existing properties over time. Details in the property hygiene guide.

  5. Set required fields at key lifecycle transitions. "Closed Won" needs an amount and close date. "SQL" needs a company association. Don't over-require — pick 3-5 fields per stage that actually matter for downstream processes.

  6. Enable connected inbox. If your team isn't using it, you're missing activity data. Set it up, train on it, make it the default.

  7. Build a data quality dashboard. Track: duplicate rate, property fill rates on critical fields, orphan contact percentage, records updated in last 90 days. Review monthly.

  8. Schedule a quarterly audit. Put it in the calendar. Assign an owner. Use the 10-area audit checklist or automate it.

Don't try to fix everything at once. Start with foundations before optimisation — associations, then properties, then lifecycle stages, then AI readiness.
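The dashboard in quick win 7 boils down to a few ratios. Here is a minimal sketch of three of them, computed over exported contacts. The field names (email, company_id, last_modified) are placeholders for whatever your export contains, and "duplicate rate" is defined here as surplus records sharing an email divided by all contacts, which is one reasonable definition among several.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def dashboard_metrics(contacts, now, window_days=90):
    """Return duplicate, orphan, and freshness rates as fractions of all contacts."""
    total = len(contacts)
    if total == 0:
        return {"duplicate_rate": 0.0, "orphan_rate": 0.0, "fresh_rate": 0.0}
    # Group by case-insensitive email; every record beyond the first is surplus.
    emails = Counter(c["email"].strip().lower() for c in contacts if c.get("email"))
    surplus = sum(count - 1 for count in emails.values())
    # Orphans: contacts with no associated company.
    orphans = sum(1 for c in contacts if not c.get("company_id"))
    # Fresh: modified inside the window.
    cutoff = now - timedelta(days=window_days)
    fresh = sum(1 for c in contacts if c["last_modified"] >= cutoff)
    return {
        "duplicate_rate": surplus / total,
        "orphan_rate": orphans / total,
        "fresh_rate": fresh / total,
    }


# Hypothetical sample data.
now = datetime(2026, 2, 27, tzinfo=timezone.utc)
contacts = [
    {"email": "ana@example.com", "company_id": "c1", "last_modified": now - timedelta(days=10)},
    {"email": "ANA@example.com", "company_id": "c1", "last_modified": now - timedelta(days=200)},
    {"email": "ben@example.com", "company_id": None, "last_modified": now - timedelta(days=30)},
    {"email": "cy@example.com", "company_id": "c2", "last_modified": now - timedelta(days=89)},
]
print(dashboard_metrics(contacts, now))
```

Store the output each month and you have the baseline that quick win 2 asks for.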

Technical implementation tips

Workflow automation

  • Build lifecycle stage progression workflows that enforce logical transitions (Subscriber → Lead → MQL → SQL → Opportunity → Customer)
  • Create alert workflows for overdue updates: deals stuck in a stage for 30+ days, contacts in MQL for 14+ days with no activity
  • Auto-assign leads based on company properties (territory, size, industry)
  • Trigger renewal tasks automatically when a deal reaches "Closed Won"
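The progression rule in the first bullet can be expressed as a simple ordering check. This sketch assumes the strictest reading, where a record may only move one stage forward at a time; your own rules may allow documented exceptions. The stage labels are the ones listed above, not HubSpot's internal values.

```python
# Stage order as listed in the workflow bullet above.
STAGES = ["Subscriber", "Lead", "MQL", "SQL", "Opportunity", "Customer"]


def is_logical_transition(current: str, proposed: str) -> bool:
    """True only when `proposed` is the stage directly after `current`.

    Raises ValueError if either label is not a known stage.
    """
    return STAGES.index(proposed) - STAGES.index(current) == 1


print(is_logical_transition("Lead", "MQL"))   # one step forward
print(is_logical_transition("Lead", "SQL"))   # skips MQL
print(is_logical_transition("SQL", "Lead"))   # moves backwards
```

Running this over a change log of lifecycle stage updates gives you a list of skipped or reversed stages to investigate.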

Validation and required fields

  • Require critical fields at stage gates, not at record creation (e.g., Stage = "Closed Won" requires Amount and Close Date)
  • Use format validation on email and phone fields
  • Set minimum and maximum values on numerical fields
  • Implement de-duplication checks during imports (match on email, domain, or company name)
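Stage-gate requirements are easiest to audit when they live in one lookup table. The sketch below checks a deal against that table and reports what is missing. The keys (dealstage, amount, closedate) and the "Closed Won" label are illustrative; in a real portal, stage values are usually internal identifiers.

```python
# Assumed stage-gate rules, mirroring the "Closed Won" example above.
REQUIRED_BY_STAGE = {
    "Closed Won": ("amount", "closedate"),
}


def missing_fields(deal: dict) -> list:
    """List required fields that are absent or blank for the deal's stage."""
    required = REQUIRED_BY_STAGE.get(deal.get("dealstage"), ())
    # None and "" count as missing; an amount of 0 is treated as present.
    return [field for field in required if deal.get(field) in (None, "")]


deal = {"dealstage": "Closed Won", "amount": 12000, "closedate": ""}
print(missing_fields(deal))
```

Stages without an entry in the table return an empty list, which keeps requirements at the gates and out of record creation.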

Standardised inputs

  • Use dropdowns and radio selects instead of free text for any field you'll filter, segment, or report on
  • Apply Operations Hub data quality automation to standardise capitalisation, trim whitespace, and format phone numbers on entry
  • Enforce naming conventions for campaign names, deal names, and company names
  • Document standards in a central location accessible to all HubSpot users
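If you need the same standardisation outside HubSpot, for example on an import file, it can be approximated in a few lines. This is a rough sketch and not how Operations Hub implements it. The phone helper only strips punctuation and keeps a leading plus sign; it does not validate country codes.

```python
import re


def clean_name(value: str) -> str:
    """Trim, collapse inner whitespace, and title-case a personal name."""
    # Caveat: str.title() mangles names like "McDonald" and acronyms like "ACME".
    return " ".join(value.split()).title()


def clean_phone(value: str) -> str:
    """Keep digits only, preserving a leading '+' if the input had one."""
    digits = re.sub(r"\D", "", value)
    prefix = "+" if value.strip().startswith("+") else ""
    return prefix + digits


print(clean_name("  jOHN   smith "))
print(clean_phone("+1 (415) 555-0100"))
print(clean_phone("415.555.0100"))
```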

Integration hygiene

  • Map fields explicitly when connecting tools — don't rely on "auto-mapping" defaults
  • Schedule regular sync checks to catch mapping drift
  • Monitor integration logs for errors and data conflicts
  • Document which system is authoritative for each shared field
  • Consolidate overlapping tools that create duplicate records (e.g., two tools both logging the same meeting)
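Documenting the authoritative system per field pays off when you can check it automatically. The sketch below compares one record as seen by several systems and reports fields where they disagree, along with the value from the system you have declared authoritative. System names, field names, and values are all hypothetical.

```python
# Assumed ownership map: which system wins for each shared field.
AUTHORITATIVE = {"email": "hubspot", "mrr": "billing"}


def find_conflicts(record_by_system: dict) -> dict:
    """Return {field: authoritative value} for fields whose values differ."""
    conflicts = {}
    for field, owner in AUTHORITATIVE.items():
        # Collect the field's value from every system that has it.
        values = {
            system: record[field]
            for system, record in record_by_system.items()
            if field in record
        }
        if len(set(values.values())) > 1:
            # None here means the authoritative system lacks the field.
            conflicts[field] = values.get(owner)
    return conflicts


record_by_system = {
    "hubspot": {"email": "ana@example.com", "mrr": 900},
    "billing": {"email": "ana@example.com", "mrr": 1200},
}
print(find_conflicts(record_by_system))
```

A recurring run of this check is one way to catch the mapping drift mentioned above before it reaches reports.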

Governance and audit rhythm

Data quality degrades by default. Governance is the counterforce.

Quarterly health check

  • Completeness audit: fill rates on critical properties across objects
  • Accuracy verification: cross-field validation (lifecycle stage vs deal status, contact company vs deal company)
  • Duplicate identification: run dedup tool, calculate duplicate rate, compare to last quarter
  • Outdated record removal: flag contacts with no activity in 12+ months
  • Orphan records: contacts without companies, deals without contacts
  • Property audit: review new properties created this quarter, check descriptions, verify naming compliance
  • Remove or archive unused fields
  • Annotate properties with descriptions and owner information
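For the property audit, a short filter over your exported property list surfaces the candidates. The sketch assumes each row already carries a fill rate and two usage flags (names are hypothetical). The 10% default matches the quick-win threshold; pass 0.05 for the stricter archive rule.

```python
def deprecation_candidates(properties, threshold=0.10):
    """Names of low-fill properties not used in any workflow or report."""
    return sorted(
        p["name"]
        for p in properties
        if p["fill_rate"] < threshold
        and not p["used_in_workflow"]
        and not p["used_in_report"]
    )


# Hypothetical audit rows.
properties = [
    {"name": "legacy_score", "fill_rate": 0.02, "used_in_workflow": False, "used_in_report": False},
    {"name": "partner_tier", "fill_rate": 0.04, "used_in_workflow": True, "used_in_report": False},
    {"name": "industry", "fill_rate": 0.91, "used_in_workflow": False, "used_in_report": True},
    {"name": "fax_number", "fill_rate": 0.08, "used_in_workflow": False, "used_in_report": False},
]
print(deprecation_candidates(properties))
print(deprecation_candidates(properties, threshold=0.05))
```

Treat the output as a review list for the data steward, not an automatic delete list.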

Ownership model

  • Data steward: One person (RevOps manager, HubSpot admin) owns the governance process. They review property creation requests, conduct quarterly audits, maintain documentation, and train new team members.
  • Property request process: Require approval before creating custom properties. A 30-second form (purpose, data source, expected usage) prevents 80% of future sprawl.
  • Team accountability: Track data quality metrics per team. Share results. Recognise improvement. Data hygiene is everyone's job, but someone has to keep score.

Data compliance

Data hygiene and data compliance overlap but aren't identical. Clean data makes compliance easier. The basics:

  • GDPR/CCPA: Consent management for marketing communications. Data retention policies that delete records you're not legally entitled to keep. Right-to-erasure processes that actually work.
  • Privacy-first design: Collect only what you need. Provide clear opt-out mechanisms. Store data securely. Audit access permissions regularly.
  • Documentation: Record what data you collect, why, and how long you keep it. This is a governance artefact — if your data steward maintains it, compliance reviews get simpler.

Automate the audit

Everything above can be done manually. Export properties, pull reports, cross-reference spreadsheets, calculate fill rates by hand. It works. It takes 40-60 hours per quarter for a portal with 200+ custom properties.

HubHorizon automates the diagnostic layer. Connect your portal and get:

  • Property health scoring — naming compliance, documentation coverage, fill rates, duplicate detection across all objects
  • Association coverage analysis — orphan records, contact-to-company mapping, deal association quality
  • Data decay monitoring — staleness tracking, zombie property detection, data freshness scores
  • AI readiness assessment — whether your data structure supports Breeze AI features
  • Prioritised recommendations — problems ranked by business impact, not alphabetical order

The analysis runs in minutes. You spend your time fixing problems, not finding them.

Get your free HubSpot data hygiene analysis at hubhorizon.io — connect your portal in 30 seconds, see per-dimension scores in under 5 minutes. No credit card required. View pricing plans for continuous monitoring, quarterly trend tracking, and exportable audit reports.


Peter Sterkenburg is the founder of HubHorizon, a HubSpot portal health and optimisation platform. He's spent years in scale-up RevOps — building the systems, fighting the fires, and eventually building the tool he wished he'd had.