GA4’s built-in cohort report shows 7-day or 30-day retention for acquisition cohorts, but you can’t customize the cohort definition, change the time granularity, or export the underlying data. BigQuery gives you full control: define cohorts by any event, any time window, and calculate day-N return rates for any number of days. This guide builds a complete cohort retention analysis from GA4 BigQuery data.
The Cohort Analysis Approach
A cohort is a group of users who share a common characteristic at a specific time — usually their first visit date. Retention analysis answers: “Of users who first visited in January, what percentage returned on Day 1, Day 7, Day 14, Day 30?”
-- Step 1: Define cohorts by first visit date
WITH first_visits AS (
SELECT
user_pseudo_id,
MIN(PARSE_DATE('%Y%m%d', event_date)) AS cohort_date
FROM `project.dataset.events_*`
WHERE event_name = 'first_visit'
AND _TABLE_SUFFIX >= FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAYS))
GROUP BY user_pseudo_id
),
-- Step 2: Get all subsequent visits for each user
return_visits AS (
SELECT
e.user_pseudo_id,
fv.cohort_date,
PARSE_DATE('%Y%m%d', e.event_date) AS visit_date,
DATE_DIFF(PARSE_DATE('%Y%m%d', e.event_date), fv.cohort_date, DAY) AS days_since_first_visit
FROM `project.dataset.events_*` e
INNER JOIN first_visits fv ON e.user_pseudo_id = fv.user_pseudo_id
WHERE e.event_name = 'session_start'
AND e._TABLE_SUFFIX >= FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAYS))
),
-- Step 3: Count cohort size
cohort_sizes AS (
SELECT cohort_date, COUNT(DISTINCT user_pseudo_id) AS cohort_size
FROM first_visits
GROUP BY cohort_date
)
-- Step 4: Calculate retention rates
SELECT
rv.cohort_date,
cs.cohort_size,
rv.days_since_first_visit AS day_n,
COUNT(DISTINCT rv.user_pseudo_id) AS returned_users,
ROUND(SAFE_DIVIDE(COUNT(DISTINCT rv.user_pseudo_id), cs.cohort_size) * 100, 2) AS retention_rate
FROM return_visits rv
INNER JOIN cohort_sizes cs ON rv.cohort_date = cs.cohort_date
WHERE rv.days_since_first_visit BETWEEN 1 AND 30 -- Days 1-30
GROUP BY rv.cohort_date, cs.cohort_size, rv.days_since_first_visit
ORDER BY rv.cohort_date, rv.days_since_first_visit;
Monthly Cohort Aggregation

-- Aggregate to monthly cohorts for easier visualization
SELECT
FORMAT_DATE('%Y-%m', cohort_date) AS cohort_month,
SUM(cohort_size) AS total_users,
day_n,
SUM(returned_users) AS total_returned,
ROUND(SAFE_DIVIDE(SUM(returned_users), SUM(cohort_size)) * 100, 2) AS avg_retention_rate
FROM (
-- Previous detailed query results
SELECT * FROM cohort_retention_detail
)
GROUP BY cohort_month, day_n
ORDER BY cohort_month, day_n;
Retention by Acquisition Channel
-- Compare retention by acquisition channel
WITH first_visits_with_source AS (
SELECT
user_pseudo_id,
MIN(PARSE_DATE('%Y%m%d', event_date)) AS cohort_date,
-- Get source from first visit
(SELECT traffic_source.source FROM `project.dataset.events_*`
WHERE user_pseudo_id = e.user_pseudo_id
AND event_name = 'first_visit'
LIMIT 1) as acquisition_source
FROM `project.dataset.events_*` e
WHERE event_name = 'first_visit'
GROUP BY user_pseudo_id
)
-- Then join with return visits and group by acquisition_source
-- Compares: do organic users retain better than paid users?
Visualizing Cohort Curves in Looker Studio
Save your cohort query as a BigQuery view. In Looker Studio, connect to this view and create a heatmap (table with conditional formatting) or line chart with cohort_month as series and day_n as the X-axis. Color scale from red (low retention) to green (high retention) makes patterns immediately visible.
What to look for: Is Day-1 retention improving month over month? Do certain acquisition channels drive users with higher Day-30 retention? Which months show sudden retention drops (may indicate product issues)?
Related guides: BigQuery Funnel Analysis, BigQuery Attribution Models, BigQuery Revenue Reconciliation.
