GA4’s built-in cohort report shows 7-day or 30-day retention for acquisition cohorts, but you can’t customize the cohort definition, change the time granularity, or export the underlying data. BigQuery gives you full control: define cohorts by any event, any time window, and calculate day-N return rates for any number of days. This guide builds a complete cohort retention analysis from GA4 BigQuery data.

The Cohort Analysis Approach

A cohort is a group of users who share a common characteristic at a specific time — usually their first visit date. Retention analysis answers: “Of users who first visited in January, what percentage returned on Day 1, Day 7, Day 14, Day 30?”

-- Step 1: Define cohorts by first visit date
WITH first_visits AS (
  SELECT
    user_pseudo_id,
    MIN(PARSE_DATE('%Y%m%d', event_date)) AS cohort_date
  FROM `project.dataset.events_*`
  WHERE event_name = 'first_visit'
  AND _TABLE_SUFFIX >= FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAYS))
  GROUP BY user_pseudo_id
),

-- Step 2: Get all subsequent visits for each user
return_visits AS (
  SELECT
    e.user_pseudo_id,
    fv.cohort_date,
    PARSE_DATE('%Y%m%d', e.event_date) AS visit_date,
    DATE_DIFF(PARSE_DATE('%Y%m%d', e.event_date), fv.cohort_date, DAY) AS days_since_first_visit
  FROM `project.dataset.events_*` e
  INNER JOIN first_visits fv ON e.user_pseudo_id = fv.user_pseudo_id
  WHERE e.event_name = 'session_start'
  AND e._TABLE_SUFFIX >= FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAYS))
),

-- Step 3: Count cohort size
cohort_sizes AS (
  SELECT cohort_date, COUNT(DISTINCT user_pseudo_id) AS cohort_size
  FROM first_visits
  GROUP BY cohort_date
)

-- Step 4: Calculate retention rates
SELECT
  rv.cohort_date,
  cs.cohort_size,
  rv.days_since_first_visit AS day_n,
  COUNT(DISTINCT rv.user_pseudo_id) AS returned_users,
  ROUND(SAFE_DIVIDE(COUNT(DISTINCT rv.user_pseudo_id), cs.cohort_size) * 100, 2) AS retention_rate
FROM return_visits rv
INNER JOIN cohort_sizes cs ON rv.cohort_date = cs.cohort_date
WHERE rv.days_since_first_visit BETWEEN 1 AND 30  -- Days 1-30
GROUP BY rv.cohort_date, cs.cohort_size, rv.days_since_first_visit
ORDER BY rv.cohort_date, rv.days_since_first_visit;

Monthly Cohort Aggregation

BigQuery GA4 Cohort Analysis: Building Retention Curves and Day-N Return Rates
-- Aggregate to monthly cohorts for easier visualization
SELECT
  FORMAT_DATE('%Y-%m', cohort_date) AS cohort_month,
  SUM(cohort_size) AS total_users,
  day_n,
  SUM(returned_users) AS total_returned,
  ROUND(SAFE_DIVIDE(SUM(returned_users), SUM(cohort_size)) * 100, 2) AS avg_retention_rate
FROM (
  -- Previous detailed query results
  SELECT * FROM cohort_retention_detail
)
GROUP BY cohort_month, day_n
ORDER BY cohort_month, day_n;

Retention by Acquisition Channel

-- Compare retention by acquisition channel
WITH first_visits_with_source AS (
  SELECT
    user_pseudo_id,
    MIN(PARSE_DATE('%Y%m%d', event_date)) AS cohort_date,
    -- Get source from first visit
    (SELECT traffic_source.source FROM `project.dataset.events_*`
     WHERE user_pseudo_id = e.user_pseudo_id
     AND event_name = 'first_visit'
     LIMIT 1) as acquisition_source
  FROM `project.dataset.events_*` e
  WHERE event_name = 'first_visit'
  GROUP BY user_pseudo_id
)
-- Then join with return visits and group by acquisition_source
-- Compares: do organic users retain better than paid users?

Visualizing Cohort Curves in Looker Studio

Save your cohort query as a BigQuery view. In Looker Studio, connect to this view and create a heatmap (table with conditional formatting) or line chart with cohort_month as series and day_n as the X-axis. Color scale from red (low retention) to green (high retention) makes patterns immediately visible.

What to look for: Is Day-1 retention improving month over month? Do certain acquisition channels drive users with higher Day-30 retention? Which months show sudden retention drops (may indicate product issues)?

Related guides: BigQuery Funnel Analysis, BigQuery Attribution Models, BigQuery Revenue Reconciliation.

Guide: BigQuery GA4 Cohort Analysis: Building Retention C

Leave a Comment