
Data integrity checks, filter validation, and the 20-point audit that ensures your analytics are reliable.
Why You Audit the Data Before You Touch the Marketing
Every marketing decision downstream of analytics inherits the quality of the analytics. If your conversion count is inflated by duplicate tags, your cost per lead looks better than it is and you scale a campaign that loses money. If your form tracking quietly broke during the last site update, your best channel looks dead and you cut it. We have seen both happen, and in both cases the marketing wasn’t the problem — the measurement was.
That’s why an analytics audit comes before any optimization work, not after. There is no point A/B testing landing pages against a conversion event that double-fires, and no point reallocating budget using attribution data that assigns half your revenue to “direct” because UTM tags are missing.
The audit described here is the one we run on every new account. It’s roughly twenty checks grouped into seven passes, ordered the way errors actually propagate: tags have to fire before events exist, events have to be defined correctly before conversions mean anything, and conversions have to be attributed correctly before channel reports can be trusted. Audit in that order and each pass validates the foundation the next one stands on.
One framing rule before we start: the goal is not perfect data. Perfect data doesn’t exist in a world of ad blockers, consent banners, and cross-device journeys. The goal is data that is consistently wrong in known ways — because a measurement that undercounts by a stable margin is still useful for decisions, while a measurement that’s wrong in unknown, shifting ways is worse than no measurement at all.
Pass One: Tag Coverage and Duplicate Firing
The first pass answers two questions: does the tag fire on every page, and does it fire exactly once?
Coverage first. Crawl the site, or at minimum walk the templates — homepage, service or product pages, blog, checkout or contact flow, thank-you pages — with browser developer tools or a tag inspection extension open, and confirm the GA4 tag fires on each. The gaps are predictable: landing pages built in a separate tool that never got the container, subdomains added after the original setup, a checkout hosted on a third-party platform, or a new template that shipped without the tag manager snippet. Every untagged page is a hole in the journey, and holes in the middle of a journey do double damage — they break sessions in two and inflate your entrance and exit numbers on the surrounding pages.
Then duplicates, which are the more expensive failure. The classic cause is a migration that left the old hardcoded gtag snippet in the theme while the new Google Tag Manager container loads the same measurement ID — every pageview and every conversion counted twice. The check is direct: open the network tab, filter for collect requests, load a page, and count. One page_view request per pageview. Then submit a test form or complete a test purchase and count again.
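Counting by eye works for one page, but it gets tedious across a dozen templates. As a rough sketch of the same check in Python, assume you have copied the request URLs out of the network tab (or a HAR export) into a list. The sketch also assumes each hit carries its event name in an en query parameter and its measurement ID in tid, so events batched into a request body would be missed. The URLs and the G-EXAMPLE1 ID are made up for illustration.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs


def count_ga4_events(request_urls):
    """Count hits per (measurement ID, event name) from captured collect URLs."""
    counts = Counter()
    for url in request_urls:
        parsed = urlparse(url)
        # Assumption: GA4 hits are the requests whose path ends in /collect.
        if not parsed.path.endswith("/collect"):
            continue
        params = parse_qs(parsed.query)
        measurement_id = params.get("tid", ["unknown"])[0]
        event_name = params.get("en", ["unknown"])[0]
        counts[(measurement_id, event_name)] += 1
    return counts


def duplicated_events(counts, expected=1):
    """Return the events seen more often than expected for one page load."""
    return {key: n for key, n in counts.items() if n > expected}


# Hypothetical capture from a single page load.
captured = [
    "https://www.google-analytics.com/g/collect?v=2&tid=G-EXAMPLE1&en=page_view",
    "https://www.google-analytics.com/g/collect?v=2&tid=G-EXAMPLE1&en=page_view",
    "https://example.com/api/collect-nothing",
]
print(duplicated_events(count_ga4_events(captured)))
```

Anything the second function returns for a single page load is a candidate double-fire and worth tracing back to the tag that sent it.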
While you’re in there, check the tag manager container itself for redundancy: multiple tags firing on overlapping triggers, paused tags someone is afraid to delete, and old Universal Analytics tags still firing into a property that stopped processing data years ago. A clean container is auditable; a container with ninety tags and no naming convention is where the next tracking bug is already hiding.
Pass Two: Do Your Conversions Match Business Reality?
This is the pass that most often changes how a business sees its own marketing, because it’s not technical — it’s definitional. Open the list of events marked as key events or conversions and ask, for each one: does this represent money, or motion?
The failure modes are worth naming. Counting a button click as a conversion instead of the successful form submission behind it — the click fires whether or not the form validates, so failed submissions count as wins. Counting a pageview of the contact page as a lead. Counting newsletter signups, PDF downloads, and qualified sales inquiries as the same conversion event, so a campaign that generates a hundred freebie-seekers outperforms one that generates ten real prospects in every report. Or the opposite problem: a phone-heavy business with no call tracking at all, where the majority of actual leads are invisible and the data says the site doesn’t convert.
The checklist items here: every conversion event corresponds to a completed action, not an attempted one. Primary conversions (leads, purchases, booked calls) are separated from secondary ones (signups, downloads) rather than pooled. Conversion values are set where revenue or lead value is known, even as estimates, because value-based bidding and revenue reporting both depend on them. Thank-you pages that define conversions can’t be reached directly — if a bookmarked or crawled confirmation URL fires the event, your count is polluted.
Then verify each definition end to end: submit the form, place the test order, make the test call, and watch the event arrive with the parameters you expect. Conversion tracking that has never been tested with a real submission should be treated as broken until proven otherwise.
Pass Three: Internal Traffic, Bots, and Referral Spam
Your own team is a traffic source, and on smaller sites a significant one. Staff checking the site daily, developers reloading templates, the owner admiring the homepage from three devices — none of it behaves like a customer, and all of it lands in your data unless you filter it.
GA4 handles this through internal traffic rules: you define your office and home IP addresses as internal, then activate the corresponding data filter. The audit checks three things. First, that the rules exist at all — on most accounts we inherit, they don’t. Second, that the filter is actually set to active, not still sitting in testing mode, which marks the traffic but doesn’t exclude it. Third, that the IP list is current — offices move, teams go remote, and a filter built around an address nobody uses anymore filters nothing. Agencies and contractors belong on the list too; your developer’s test purchases should not appear in revenue.
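The third check, whether the IP list is current, is easy to script once you have both lists in hand. The sketch below is one way to do it under stated assumptions: the rules have been copied out of the GA4 admin by hand as plain addresses or CIDR ranges, and the team has supplied its current public IPs. The function name and the sample addresses (taken from reserved documentation ranges) are illustrative, not part of any GA4 API.

```python
import ipaddress


def uncovered_addresses(internal_rules, current_addresses):
    """Return the current team addresses that no internal-traffic rule covers."""
    # strict=False lets a rule like "203.0.113.7/24" parse as its network.
    networks = [ipaddress.ip_network(rule, strict=False) for rule in internal_rules]
    missing = []
    for address in current_addresses:
        ip = ipaddress.ip_address(address)
        if not any(ip in network for network in networks):
            missing.append(address)
    return missing


# Hypothetical rules copied from the property, and IPs reported by the team.
rules = ["203.0.113.0/24", "198.51.100.7"]
team_ips = ["203.0.113.45", "192.0.2.10"]
print(uncovered_addresses(rules, team_ips))
```

Any address it returns belongs to someone whose visits are still landing in the data.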
Next, the unwanted referrals list. Payment processors are the common offender: a customer goes to the payment provider and comes back, and without that domain listed as unwanted, the return trip starts a new session credited to the processor — which is how a payment gateway becomes your top converting referral channel. Any domain that’s a step inside your own funnel belongs on that list.
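A quick way to spot this in an export is to match referral sources against the domains you know sit inside your funnel. The following is a minimal sketch, assuming you have exported conversions per referral source into a dictionary. The domain names are placeholders, and the list of funnel domains is something you would have to write down yourself.

```python
def is_funnel_domain(referrer, funnel_domains):
    """True if the referrer is a funnel domain or one of its subdomains."""
    referrer = referrer.lower().rstrip(".")
    return any(
        referrer == domain or referrer.endswith("." + domain)
        for domain in funnel_domains
    )


def self_referral_sources(conversions_by_referrer, funnel_domains):
    """Return the referral sources that are really steps in your own funnel."""
    return {
        referrer: conversions
        for referrer, conversions in conversions_by_referrer.items()
        if is_funnel_domain(referrer, funnel_domains)
    }


# Hypothetical export: conversions credited to each referral source.
report = {"checkout.payments.example": 40, "google.com": 12}
print(self_referral_sources(report, ["payments.example"]))
```

Every source it flags is a candidate for the unwanted referrals list.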
Finally, scan for noise: referral sources you don’t recognize, traffic spikes from one city you don’t serve, sessions with near-zero engagement time arriving in bursts. GA4 excludes known bots automatically, but known is the operative word. You can’t retroactively clean historical data in GA4 — filters only apply forward — so the sooner this pass happens, the more of your history stays usable.
Pass Four: Attribution Settings and Channel Definitions
Two properties can record identical user behavior and report different channel performance, purely because of settings. This pass makes those settings deliberate instead of default.
Start with the attribution model in the property settings. GA4’s default is data-driven attribution, which distributes conversion credit across touchpoints using Google’s modeling. Whatever the model, the audit item is the same: know which one is active, know when it was last changed, and annotate the change date — because a model switch makes before-and-after channel comparisons invalid, and someone will make them anyway if nothing is written down.
Check the conversion windows next, which GA4 labels lookback windows: how many days after a click a conversion can still be credited. The defaults are sensible for most businesses, but a company with a six-month sales cycle and a thirty-day window is systematically stripping credit from the campaigns that start its longest, largest deals.
Then look at where your traffic actually lands in the channel groupings. The two numbers that diagnose most attribution problems are the share of traffic in “direct” and the share in “unassigned.” Direct is where analytics puts visits it can’t attribute. Some direct traffic is genuinely people typing your URL, but an outsized direct share usually means missing UTM tags on email and paid social, or links from apps that strip referrer data. Unassigned means traffic carrying source data that matches no channel definition, which is almost always malformed manual tagging. Neither bucket can be fixed in this pass, because both are symptoms, but measuring them here tells you how much pass seven (UTM hygiene) matters, and gives you a baseline to verify against after the fixes.
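The baseline itself is simple arithmetic: each bucket’s sessions divided by total sessions. A small sketch, assuming you have exported sessions per default channel group into a dictionary (the figures below are invented):

```python
def channel_share(sessions_by_channel, channel):
    """Return the fraction of all sessions that landed in one channel."""
    total = sum(sessions_by_channel.values())
    if total == 0:
        return 0.0
    return sessions_by_channel.get(channel, 0) / total


# Hypothetical export of sessions by channel group.
sessions = {
    "Organic Search": 500,
    "Direct": 300,
    "Paid Search": 150,
    "Unassigned": 50,
}
for bucket in ("Direct", "Unassigned"):
    print(f"{bucket}: {channel_share(sessions, bucket):.1%}")
```

Record both percentages with the date, so the same calculation after the tagging fixes shows whether the buckets actually shrank.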
Pass Five: Cross-Domain Tracking and Broken Journeys
Any time a customer journey crosses from one domain to another — main site to a booking platform, marketing site to a checkout on a different domain, microsite to main site — analytics sees the arrival on the second domain as a brand new visitor unless cross-domain tracking is configured. The original campaign source dies at the boundary, and the conversion gets credited to a referral from your own other domain.
The audit starts with an inventory: list every domain a customer can touch between first click and conversion. Main site, landing page tools, booking and scheduling platforms, payment pages, support portals. For each boundary a user crosses, decide whether it needs cross-domain configuration (you control both domains and they share the GA4 property), an unwanted-referral entry (the domain is a pass-through like a payment processor), or nothing (the journey legitimately ends there).
Then test the boundary that matters most. Click from the main site to the second domain and inspect the URL: cross-domain tracking in GA4 works by appending a linker parameter (it starts with _gl) to the destination URL. If it’s present and the second domain runs the same tag, the session survives the crossing. If it’s absent, every conversion on that second domain is being attributed to the wrong source.
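If you are checking several boundaries, the URL inspection can be scripted. This is a minimal sketch that only looks for a non-empty _gl parameter in the landing URL you copy from the address bar. It does not verify that the second domain runs the same tag, and the parameter value shown is a made-up placeholder rather than a real linker value.

```python
from urllib.parse import urlparse, parse_qs


def has_linker_parameter(url):
    """True if the URL carries a non-empty _gl query parameter."""
    query = parse_qs(urlparse(url).query)
    return bool(query.get("_gl"))


# Hypothetical landing URLs copied after clicking through from the main site.
with_linker = "https://booking.example.com/start?_gl=1*abc123*_ga*MTIzNDU2"
without_linker = "https://booking.example.com/start?ref=main-site"

print(has_linker_parameter(with_linker))
print(has_linker_parameter(without_linker))
```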
Subdomains deserve a check too, in the opposite direction: GA4 tracks across subdomains automatically when the same tag is used, so the failure mode there is different tags or different properties on different subdomains — the blog on a subdomain with its own container, reporting into nowhere. One journey, one property, one tag configuration is the standard this pass enforces.
Pass Six: Consent Mode — The Gap Between Visitors and Recorded Visitors
If your site shows a cookie banner, there is now a gap between the people who visit and the people your analytics records, and this pass measures whether that gap is handled or just ignored.
First, verify the consent banner and the tags actually talk to each other. The common failure is a banner that displays, collects choices, and changes nothing — tags fire identically whether the visitor accepts or declines. That’s a compliance problem under PIPEDA, Quebec’s Law 25, and GDPR for anyone with European visitors, and it also means your banner is suppressing nothing, so your numbers look complete while your legal exposure grows. Test it directly: open a fresh private window, decline everything, and watch the network tab. Then repeat and accept. The two sessions should produce visibly different tag behavior.
Second, check how decline is handled. With Google’s consent mode implemented properly, declined visitors trigger cookieless pings that let GA4 model the missing conversions; without it, declined visitors simply vanish. Either approach can be legitimate — the audit item is knowing which one you’re running, because it changes how to read the numbers. A property receiving modeled data will show different figures than raw collection, and the reporting identity setting determines whether modeling is even applied.
Third, quantify the gap where you can. Compare server-side ground truth — actual form submissions in your CRM, actual orders in your store admin — against what GA4 recorded over the same period. The shortfall is your combined consent, ad-blocker, and tracking-loss rate. Knowing it’s a stable percentage turns an invisible distortion into a known correction factor you can apply when judging real performance.
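As a worked example of that correction factor, the sketch below computes the loss rate from one period and uses it to scale a later GA4 count. The function names and numbers are illustrative, and the approach assumes the loss rate really is stable between the two periods, which is worth re-checking rather than taking for granted.

```python
def tracking_loss_rate(ground_truth, recorded):
    """Fraction of real conversions that analytics failed to record."""
    if ground_truth <= 0:
        raise ValueError("ground_truth must be a positive count")
    return (ground_truth - recorded) / ground_truth


def corrected_estimate(recorded, loss_rate):
    """Scale a recorded count up by a previously measured loss rate."""
    if not 0 <= loss_rate < 1:
        raise ValueError("loss_rate must be in the range [0, 1)")
    return recorded / (1 - loss_rate)


# Hypothetical period: 200 form submissions in the CRM, 150 recorded in GA4.
rate = tracking_loss_rate(ground_truth=200, recorded=150)
print(f"loss rate: {rate:.0%}")

# A later period where GA4 shows 90 conversions.
print(f"estimated real conversions: {corrected_estimate(90, rate):.0f}")
```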
Pass Seven: UTM Hygiene and the GA4-vs-Platform Reconciliation
UTM parameters are the only attribution signal you fully control, and on most accounts they’re a mess. The audit pulls every session source and medium from the last ninety days and looks for the telltale duplication: email, Email, and e-mail as three separate mediums; facebook, fb, and meta as three sources; campaign names that are sometimes dates, sometimes product names, sometimes someone’s initials. Each variant fragments reporting — the channel’s real performance is scattered across rows nobody adds up.
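The duplication hunt can be sketched in a few lines: normalize each source and medium, then report any canonical pair that appears under more than one spelling. The alias tables here are hypothetical examples. Whether fb and meta should roll up into facebook is a decision for your own convention, not something the code can know.

```python
from collections import defaultdict

# Hypothetical alias tables; replace with your own written convention.
SOURCE_ALIASES = {"fb": "facebook", "meta": "facebook"}
MEDIUM_ALIASES = {"e-mail": "email"}


def canonical(value, aliases):
    """Lowercase and trim a value, then map known aliases to one spelling."""
    cleaned = value.strip().lower()
    return aliases.get(cleaned, cleaned)


def find_fragmented(rows):
    """rows: (source, medium, sessions) tuples from a source/medium export.

    Returns canonical pairs reported under more than one spelling, with the
    spellings found and the combined session count.
    """
    variants = defaultdict(set)
    totals = defaultdict(int)
    for source, medium, sessions in rows:
        key = (canonical(source, SOURCE_ALIASES), canonical(medium, MEDIUM_ALIASES))
        variants[key].add((source, medium))
        totals[key] += sessions
    return {
        key: (sorted(variants[key]), totals[key])
        for key in variants
        if len(variants[key]) > 1
    }


# Hypothetical ninety-day export.
rows = [
    ("newsletter", "email", 120),
    ("newsletter", "Email", 40),
    ("newsletter", "e-mail", 15),
    ("google", "cpc", 300),
]
for key, (spellings, total) in find_fragmented(rows).items():
    print(key, spellings, total)
```

The combined total is the channel’s real volume, the figure that was previously spread across rows.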
The fixes are procedural, not technical: a written convention (lowercase everything, fixed vocabulary for medium, a campaign naming pattern), a shared link builder or spreadsheet so nobody hand-types parameters, and a rule that internal links never carry UTMs — tagging a link from your own homepage to your own services page overwrites the visitor’s real source with a fake one, destroying the attribution you spent the other six passes protecting. Auto-tagging from Google Ads should be on, and if manual tags coexist with it, the override setting needs to be a deliberate choice.
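A shared link builder can be as small as one function that enforces the convention. The sketch below is one possible shape, not a prescribed tool: the allowed medium vocabulary, the hyphenated campaign pattern, and the placement_host argument used to refuse internal links are all assumptions made for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

# Hypothetical fixed vocabulary for utm_medium.
ALLOWED_MEDIUMS = {"email", "cpc", "paid_social", "social", "referral"}


def build_tagged_url(destination, source, medium, campaign, placement_host=None):
    """Return the destination URL with lowercase, convention-checked UTM tags."""
    parts = urlparse(destination)
    # Rule: a link placed on our own site must never carry UTM parameters.
    if placement_host and placement_host.lower() == (parts.hostname or ""):
        raise ValueError("internal links must not carry UTM parameters")
    medium = medium.strip().lower()
    if medium not in ALLOWED_MEDIUMS:
        raise ValueError(f"medium {medium!r} is not in the agreed vocabulary")
    params = parse_qsl(parts.query)  # keep any existing query parameters
    params += [
        ("utm_source", source.strip().lower()),
        ("utm_medium", medium),
        ("utm_campaign", campaign.strip().lower().replace(" ", "-")),
    ]
    return urlunparse(parts._replace(query=urlencode(params)))


print(build_tagged_url("https://example.com/services", "Newsletter", "Email", "Spring Launch"))
```

Rejecting an unknown medium outright, rather than silently accepting it, is the design choice that keeps new variants from creeping back in.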
Then the reconciliation every client asks about: why Google Ads, Meta, and GA4 report different conversion counts for the same campaigns. The honest answer is that they’re measuring different things and always will be. Ad platforms credit themselves for view-through conversions GA4 never sees, attribute on their own windows, count on click time rather than conversion time, and each platform claims full credit for shared conversions — so platform numbers added together will exceed reality. GA4 applies its own attribution across all channels and misses what consent and blockers hide. The audit item is not making the numbers match — they won’t — but documenting which number is used for which decision: platform data for optimizing within a platform, GA4 for comparing across channels, and CRM or sales data as the final word on what’s real.
The Fix Order: What to Repair First and What Can Wait
An audit that ends with twenty undifferentiated findings gets filed, not fixed. The findings need an order, and the order follows one principle: fix the things that corrupt data going forward before the things that merely obscure it, because GA4’s filters and settings don’t apply retroactively — every week a corruption stays live is a week of history you never get back.
First tier, fix immediately: duplicate tags and double-firing conversions, conversion events that fire on attempts instead of completions, and any consent implementation that isn’t doing what the banner claims. These poison the record and, if conversions feed Google Ads bidding, they’re actively training your campaigns on false signals — the damage compounds daily.
Second tier, fix this week: missing tags on key templates, internal traffic filters, unwanted referrals like payment processors, and cross-domain tracking on any boundary that sits inside the conversion path. These are configuration work measured in hours, and each one starts paying back the moment it’s live.
Third tier, fix this month: UTM conventions and the cleanup of go-forward tagging, conversion values, attribution window review, and the documented reconciliation between GA4, the ad platforms, and the CRM. These improve interpretation rather than collection, so they matter enormously — but they can follow the plumbing.
Then put a date in the calendar, because analytics setups decay. Sites get redesigned, tags get added, banners get swapped, and platforms change their defaults under you. A quarterly re-run of the first two passes — tag coverage and conversion verification, the twenty-minute version — catches most regressions before they cost a quarter of data. The full audit is annual maintenance, and at SearchPod it’s the first deliverable of every engagement, because every recommendation we make afterward is only as good as the data underneath it.