Event duplication is one of the hardest GA4 data quality problems to detect because duplicated events look identical to legitimate ones in standard reports. Revenue appears inflated by an amount that does not correspond to any obvious error. Conversion rates look suspiciously high. BigQuery queries reveal multiple identical events with the same user, timestamp, and parameters. This guide covers three main causes of GA4 event duplication, how to detect duplicates in BigQuery, and deduplication strategies for both prevention and post-hoc correction.
Cause 1: Multiple GTM Tags Firing the Same Event
The most common source of GA4 event duplication is a GTM container that sends the same event twice: either two GA4 Event tags configured for the same event, or a single tag attached to two triggers that both fire on the same page. For example, a GA4 Event tag for purchase might fire on a Page View trigger scoped to the order confirmation URL and on a Custom Event trigger for the purchase dataLayer event, both of which evaluate to true on the order confirmation page. GTM Preview makes this easy to diagnose: look for your event name appearing twice in the same page load in the Preview panel. If it appears twice with the same parameters, you have a duplicate tag configuration. The fix is to audit your trigger logic and ensure each event fires from exactly one tag with non-overlapping triggers.
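A quick look at the dataLayer itself can help separate this cause from the next one. The sketch below counts pushes per event name: if purchase was pushed once but GTM Preview shows it sent twice, the duplication sits in the tag and trigger configuration rather than in the page code. The helper name countDataLayerEvents is hypothetical, not part of GTM.

```javascript
// Hypothetical helper: count how many times each event name was pushed.
// In the browser console, pass window.dataLayer. Entries without a string
// `event` key (for example gtag() argument lists) are skipped.
function countDataLayerEvents(dataLayer) {
  const counts = {};
  for (const entry of dataLayer) {
    if (entry && typeof entry.event === 'string') {
      counts[entry.event] = (counts[entry.event] || 0) + 1;
    }
  }
  return counts;
}
```

Calling countDataLayerEvents(window.dataLayer) on the order confirmation page returns an object keyed by event name; a count above 1 for purchase points to repeated pushes rather than overlapping triggers.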
Cause 2: dataLayer.push Firing Multiple Times
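SPAs and dynamic pages can run the JavaScript that pushes to the dataLayer more than once. A React component that pushes a purchase event in a useEffect hook without a dependency array runs that effect after every render, not just the first; with dependencies listed, it runs again whenever one of them changes. If the order confirmation component re-renders (due to a state update or a parent re-render) or is unmounted and mounted again (for example after a React Router navigation), the purchase event can fire again. The fix is to use a ref or a sessionStorage flag so the push happens only once per order. A sessionStorage flag also survives remounts and page reloads; a ref, shown here, guards against repeat runs within a single mount: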
// Prevent duplicate pushes in React
import { useEffect, useRef } from 'react';

function OrderConfirmation({ order }) {
  // Persists across re-renders of the same mounted component
  const tracked = useRef(false);

  useEffect(() => {
    if (tracked.current) return; // Already fired
    tracked.current = true;
    window.dataLayer = window.dataLayer || [];
    window.dataLayer.push({
      event: 'purchase',
      ecommerce: {
        transaction_id: order.id,
        value: order.total,
        currency: 'USD'
      }
    });
  }, []); // Empty dependency array: the effect runs once per mount

  return null; // Placeholder: render the confirmation UI here
}
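The sessionStorage variant mentioned above can be sketched as follows. It keys the flag on the order ID so that a remount or page reload does not push the same order again. The function name pushPurchaseOnce, the key format, and the order fields (id, total, currency) are assumptions for illustration; storage is passed in as a parameter so the logic can be exercised outside a browser.

```javascript
// Hypothetical helper: push a purchase event at most once per order ID.
// `storage` is any object with getItem/setItem (e.g. window.sessionStorage);
// `dataLayer` is the GTM dataLayer array.
function pushPurchaseOnce(order, storage, dataLayer) {
  const key = 'purchase_tracked_' + order.id; // key format is an assumption
  if (storage.getItem(key) !== null) {
    return false; // already pushed for this order in this browser session
  }
  storage.setItem(key, '1');
  dataLayer.push({
    event: 'purchase',
    ecommerce: {
      transaction_id: order.id,
      value: order.total,
      currency: order.currency
    }
  });
  return true;
}
```

In the browser this would be called as pushPurchaseOnce(order, window.sessionStorage, window.dataLayer) after making sure window.dataLayer exists. The flag is scoped to the tab's session, so it does not protect against the same confirmation URL being opened in a different tab or browser.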
Cause 3: Server-Side and Client-Side Both Firing
When the same event is sent both from the browser and from the server through the Measurement Protocol, both hits can reach GA4 and be counted separately. For purchase events, the safeguard GA4 documents is transaction_id: include the same transaction_id on both sides, and GA4 deduplicates purchase events that share a transaction_id for the same user. A generic event_id-based deduplication, as used by some advertising platforms' server-side APIs, is not a documented GA4 feature, so do not rely on an event_id parameter to collapse duplicates in GA4 reports. For non-purchase events, the dependable options are to choose a single source per event, or to attach a shared custom identifier (for example an event_id string that is unique per event occurrence) to both copies so they can be matched and deduplicated in BigQuery.
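Whichever deduplication path applies, the browser event and the server event need to carry identical identifiers. The sketch below builds both payloads from the same order object so they cannot drift apart. The function name buildPurchasePair and the order fields are hypothetical, and event_id is treated here as a custom parameter chosen for this example; the server body follows the Measurement Protocol shape of a client_id plus an events array.

```javascript
// Build the browser dataLayer push and the Measurement Protocol request body
// from one order, so both copies share transaction_id and event_id.
// `event_id` is a custom parameter chosen for this sketch (an assumption).
function buildPurchasePair(order, clientId) {
  const params = {
    transaction_id: order.id,
    value: order.total,
    currency: order.currency,
    event_id: 'purchase_' + order.id // unique per purchase occurrence
  };
  const clientPush = {
    event: 'purchase',
    event_id: params.event_id,
    ecommerce: {
      transaction_id: params.transaction_id,
      value: params.value,
      currency: params.currency
    }
  };
  const serverBody = {
    client_id: clientId,
    events: [{ name: 'purchase', params: params }]
  };
  return { clientPush: clientPush, serverBody: serverBody };
}
```

The clientPush object goes to window.dataLayer, and serverBody would be serialized as JSON and posted to the Measurement Protocol collection endpoint with your measurement ID and API secret. Because both carry the same identifiers, any pair that still lands twice in the export can be matched in BigQuery.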

Detecting Duplicates in BigQuery
-- Find duplicate events within same session
SELECT
  user_pseudo_id,
  (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS session_id,
  event_name,
  event_timestamp,
  COUNT(*) AS occurrences
FROM `your_project.analytics_XXXXXX.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20240101' AND '20240131'
  AND event_name IN ('purchase', 'generate_lead', 'sign_up')
GROUP BY user_pseudo_id, session_id, event_name, event_timestamp
HAVING COUNT(*) > 1
LIMIT 100
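Events with identical user_pseudo_id, session_id, event_name, and event_timestamp are almost certainly duplicates. Legitimate repeated events (a user adding multiple items to cart) will have different timestamps. Note that this query only catches copies recorded at the same microsecond; duplicates caused by re-renders or by parallel client and server tracking often arrive milliseconds or seconds apart, so for purchases also check for transaction_id values that appear more than once. If you find a large number of duplicate rows, calculate the duplication rate: divide the number of surplus rows (occurrences minus one for each duplicated group) by the total event count for the affected event name. If 10% of recorded purchase events are surplus copies, recorded revenue is roughly 11% above the true figure (0.10 / 0.90), assuming duplicated orders have a similar average value to the rest.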
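The duplication rate arithmetic can be sketched as a small helper that takes the rows returned by the detection query. It reports two figures that are easy to confuse: the share of recorded events that are surplus copies, and the inflation relative to the true count. The function name duplicationStats and the totalEvents input (the total row count for the same event names and date range, duplicates included) are assumptions for this example.

```javascript
// rows: one object per duplicated group from the detection query,
//       each with an `occurrences` count greater than 1.
// totalEvents: all recorded rows for the same events and dates (assumed input).
function duplicationStats(rows, totalEvents) {
  // A group of N identical rows contains N - 1 surplus copies.
  const surplus = rows.reduce((sum, r) => sum + (r.occurrences - 1), 0);
  const legitimate = totalEvents - surplus;
  return {
    surplus: surplus,
    shareOfRecorded: surplus / totalEvents,  // surplus copies as a share of recorded rows
    inflationOverTrue: surplus / legitimate  // how far recorded counts exceed the true count
  };
}
```

For example, 3 surplus rows out of 30 recorded events give a shareOfRecorded of 0.10 and an inflationOverTrue of about 0.111. The revenue interpretation holds only if duplicated orders are similar in value to the others.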
Post-Hoc Deduplication in BigQuery Queries
If historical data already contains duplicates, you can deduplicate in your BigQuery queries using ROW_NUMBER() to keep only the first occurrence of each transaction per user and session:
-- Deduplicated purchase revenue
WITH deduped AS (
  SELECT
    *,
    ROW_NUMBER() OVER (
      PARTITION BY
        user_pseudo_id,
        (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id'),
        (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'transaction_id')
      ORDER BY event_timestamp
    ) AS rn
  FROM `your_project.analytics_XXXXXX.events_*`
  WHERE _TABLE_SUFFIX BETWEEN '20240101' AND '20240131'
    AND event_name = 'purchase'
)
SELECT
  -- 'value' may be stored as double, float, or int depending on how it was sent
  SUM((
    SELECT COALESCE(value.double_value, value.float_value, CAST(value.int_value AS FLOAT64))
    FROM UNNEST(event_params)
    WHERE key = 'value'
  )) AS deduplicated_revenue,
  COUNT(*) AS deduplicated_purchases
FROM deduped
WHERE rn = 1
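Partitioning by transaction_id is the most accurate deduplication method for purchase events because it preserves the intent of one revenue record per order. Purchases recorded without a transaction_id share a NULL partition key, so within a session they collapse into a single row; check how many such rows exist before trusting the deduplicated totals. For events without a transaction_id, partition by user, session, and event_name, and treat events that fall within a short timestamp window (chosen from your own data) as duplicates while keeping events further apart as intentional repeated actions.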
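The timestamp-window idea can be sketched on exported rows as follows. This is a minimal illustration rather than a tuned rule: the function name dedupeWithinWindow and the window size are assumptions, and event_timestamp is taken to be in microseconds, as in the GA4 BigQuery export. Each event is compared against the last kept event for the same user, session, and event name, so a burst of copies is anchored to its first occurrence.

```javascript
// events: [{ user_pseudo_id, session_id, event_name, event_timestamp }]
// windowMicros: events closer than this to the last kept event with the
//               same key are treated as duplicates (threshold is an assumption).
function dedupeWithinWindow(events, windowMicros) {
  const sorted = [...events].sort((a, b) => a.event_timestamp - b.event_timestamp);
  const lastKept = new Map(); // key -> timestamp of the last event kept
  const kept = [];
  for (const e of sorted) {
    const key = [e.user_pseudo_id, e.session_id, e.event_name].join('|');
    const prev = lastKept.get(key);
    if (prev !== undefined && e.event_timestamp - prev <= windowMicros) {
      continue; // inside the window: treat as a duplicate of the kept event
    }
    lastKept.set(key, e.event_timestamp);
    kept.push(e);
  }
  return kept;
}
```

With a window of 1,000,000 microseconds, two sign_up events half a second apart in the same session collapse into one, while a third event five seconds later is kept as a separate action. A window that is too wide will discard genuine repeats, so it is worth checking the gap distribution in your own data first.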
