Statistics and Analytics APIs: Events, Aggregates, Retention
1) Why an external analytics API
Partners/providers: content SLA monitoring, RTP, engagement.
Marketing/CRM: trigger campaigns based on metrics (DAU, deposit funnel).
Operations/Finance: near-real-time GGR/NGR, payment success, webhook lags.
Product: in-app statistics widgets, A/B panels.
The goal is to expose events and aggregates safely and predictably, with clear semantics and SLAs.
2) Architecture at a glance
Producers (PAM/Wallet/RGS/Payments/Kafka/CDC)
      │
Ingestion API ── Stream (Kafka/Pulsar) ── Lakehouse (Delta/Iceberg)
      │                                        └── OLAP (ClickHouse/BigQuery/Trino)
      └──────────────────────────────────────────── Aggregation/Query API
                                                     (cache, RBAC/RLS, rate limits)
Events: at-least-once delivery, dedup by `event_id`/`idempotency_key` (see the sketch after this list).
Aggregates: pre-calculated rollups (1m/5m/1h/1d) + on-the-fly queries.
Retention: cohort engine on top of the Gold marts.
Cache: CDN/edge + ETag/`Cache-Control`, server-side TTL.
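A minimal sketch of the `event_id` dedup performed at the Silver layer, assuming an in-process store of recently seen ids (in practice this would be Redis or a MERGE into the Silver table; names and TTL are illustrative):

```python
import time

# Hypothetical "seen ids" store; stands in for Redis or a MERGE keyed by event_id.
_seen: dict[str, float] = {}
DEDUP_TTL_SEC = 24 * 3600  # keep ids long enough to cover producer retries


def accept_event(event: dict) -> bool:
    """Return True if the event is new, False if it is an at-least-once duplicate."""
    now = time.time()
    # evict expired ids so the store does not grow without bound
    for eid, ts in list(_seen.items()):
        if now - ts > DEDUP_TTL_SEC:
            del _seen[eid]
    event_id = event["event_id"]
    if event_id in _seen:
        return False
    _seen[event_id] = now
    return True
```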
3) Event model: minimum standard
3.1 Common fields
```json
{
  "event_id": "uuid",
  "event_type": "bet.settled",
  "occurred_at": "2025-10-23T16:21:05Z",
  "ingested_at": "2025-10-23T16:21:06Z",
  "tenant_id": "brand-7",
  "region": "EU",
  "player_id": "p_19f3",        // pseudonymous ID
  "trace_id": "tr_a1b2c3",
  "schema_version": "1.3.0",
  "payload": { ... }
}
```
Rules: UTC timestamps, `player_id` is a pseudonymous alias, money in minor units.
3.2 Key types
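For reference, these are the event types exercised in the examples throughout this article; the grouping by domain is an illustrative assumption:

```python
# Event types used in this article's examples, grouped by domain (grouping is illustrative).
KEY_EVENT_TYPES = {
    "betting":  ["bet.placed", "bet.settled"],
    "payments": ["payment.intent", "payment.authorized", "payment.captured"],
    "wallet":   ["wallet.credit"],       # with reason "deposit" for deposit credits
    "sessions": ["session.started"],
}
```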
4) Ingestion API (for third-party sources)
Sending a batch of events
POST /v1/events:batch
Headers: X-Idempotency-Key: ev_20251023_001
[
  {"event_id":"...","event_type":"bet.placed", ...},
  {"event_id":"...","event_type":"bet.settled", ...}
]
→ 202 { "accepted": 2, "duplicates": 0, "trace_id": "tr_a1b2" }
Guarantees: at-least-once; duplicates are filtered in the Silver layer by `event_id`.
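A minimal client sketch for this endpoint, assuming the base URL and OAuth2 token come from configuration (both are illustrative); retries must reuse the same `X-Idempotency-Key` so duplicates stay detectable:

```python
import json
import urllib.request

BASE_URL = "https://analytics.example.com"  # illustrative
TOKEN = "..."                               # short-lived OAuth2 access token


def send_batch(events: list[dict], idempotency_key: str) -> dict:
    """POST a batch of events; safe to retry with the same idempotency key."""
    req = urllib.request.Request(
        f"{BASE_URL}/v1/events:batch",
        data=json.dumps(events).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {TOKEN}",
            "X-Idempotency-Key": idempotency_key,
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:  # expect 202 Accepted
        return json.load(resp)
```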
5) Aggregation API: time-series and slices
5.1 Timeseries (time-based metrics)
GET /v1/analytics/timeseries
?metric=ggr                // ggr, ngr, dau, deposits_success, rtp
&granularity=5m            // 1m/5m/1h/1d
&from=2025-10-22T00:00:00Z&to=2025-10-23T00:00:00Z
&filters=region:EU,brand_id:brand-7,provider_id:studio_x
&group_by=brand_id
→ 200 {
  "metric": "ggr",
  "granularity": "5m",
  "series": [
    {"ts":"2025-10-22T00:00:00Z","brand_id":"brand-7","value_minor":120030},
    {"ts":"2025-10-22T00:05:00Z","brand_id":"brand-7","value_minor":98020}
  ],
  "next_cursor": null
}
5.2 Slices/tops (group-by)
GET /v1/analytics/slice
?metric=rtp&dim=game_id&from=2025-10-22&to=2025-10-23
&limit=50&order=-value
→ 200 { "items": [{"game_id":"g_01","value":0.956}, ...] }
5.3 Funnels
POST /v1/analytics/funnel
{
  "steps": [
    {"event":"payment.intent"},
    {"event":"payment.authorized"},
    {"event":"payment.captured"},
    {"event":"wallet.credit", "reason":"deposit"}
  ],
  "window_sec": 3600,
  "filters": {"region":"EU","brand_id":"brand-7"}
}
→ 200 {
  "total": 12450,
  "steps": [
    {"name":"intent",     "count":12450, "rate":1.0},
    {"name":"authorized", "count":11020, "rate":0.885},
    {"name":"captured",   "count":10110, "rate":0.811},
    {"name":"credited",   "count":10050, "rate":0.807}
  ]
}
5.4 Limits and cache
Rate limits per token/brand/region.
`ETag` on responses; `If-None-Match` support (see the conditional-request sketch after this list).
Cache TTL depends on granularity (for example, 5m → TTL 60-120 s).
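A minimal client sketch of that conditional-request flow (URL and token handling are illustrative); a 304 response means the cached body is still valid:

```python
import urllib.error
import urllib.request


def conditional_get(url: str, token: str, cached_etag, cached_body):
    """GET with If-None-Match: reuse the cached body when the server answers 304."""
    headers = {"Authorization": f"Bearer {token}"}
    if cached_etag:
        headers["If-None-Match"] = cached_etag
    req = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.headers.get("ETag"), resp.read()   # fresh body + new ETag
    except urllib.error.HTTPError as err:
        if err.code == 304:
            return cached_etag, cached_body                 # not modified: keep the cache
        raise
```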
6) Retention and cohorts: rules and API
6.1 Definitions (conventions)
DAU/WAU/MAU: a player is active if there was a `bet.placed`, a `wallet.credit` (deposit), or a `session.started` lasting ≥ N minutes.
Cohorts by first deposit (typically for LTV) or by registration (for engagement).
Retention D1/D7/D30: the share of the cohort that returned within the day window, +/- a tolerance in the brand's time zone.
Return visits are counted by unique `player_id` within the window (see the sketch after this list).
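A minimal sketch of the D1/D7/D30 calculation under these conventions, assuming cohort membership and return activity have already been extracted from the Gold marts (field names are illustrative):

```python
from datetime import date, timedelta


def retention(cohort: dict[str, date], returns: list[tuple[str, date]], days=(1, 7, 30)):
    """cohort: player_id -> cohort date (e.g. first-deposit date);
    returns: (player_id, activity date) pairs, e.g. days with bet.placed.
    Returns {d: share of the cohort seen in the D+d day window}."""
    size = len(cohort)
    seen = {d: set() for d in days}
    for player_id, day in returns:
        start = cohort.get(player_id)
        if start is None:
            continue
        for d in days:
            if day == start + timedelta(days=d):  # exact-day window, per the convention above
                seen[d].add(player_id)
    return {d: (len(seen[d]) / size if size else 0.0) for d in days}
```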
6.2 Cohort API
POST /v1/analytics/retention
{
  "cohort": "first_deposit",
  "start_date": "2025-09-01",
  "end_date": "2025-09-30",
  "return_event": "bet.placed",
  "days": [1, 7, 14, 30],
  "filters": {"region":"EU","brand_id":"brand-7"}
}
→ 200 {
  "cohort": "first_deposit",
  "rows": [
    {"cohort_date":"2025-09-01","size":1820,"d1":0.36,"d7":0.22,"d14":0.18,"d30":0.12},
    {"cohort_date":"2025-09-02","size":1714,"d1":0.35,"d7":0.23,"d14":0.19,"d30":0.13}
  ]
}
6.3 LTV/Cumulative
GET /v1/analytics/ltv?cohort=first_deposit&currency=EUR&horizon=90d
→ 200 { "cohorts":[{"date":"2025-09-01","ltv_minor":[0,150,230,280,...]}] }
7) Metric semantics (so nobody argues)
Everything in UTC, with currency and minor units; multi-currency is handled by converting with an FX rate fixed in the Data Lake (see the sketch below).
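A minimal sketch of that convention, assuming each event carries the FX rate that was valid at `occurred_at` (field names are illustrative):

```python
def to_reporting_minor(amount_minor: int, fx_rate_at_event: float) -> int:
    """Convert a minor-unit amount into the reporting currency using the FX rate
    fixed at event time, so re-running a report never rewrites history."""
    return round(amount_minor * fx_rate_at_event)


# e.g. a 1000-cent (10.00 USD) deposit with USD→EUR fixed at 0.92 at event time
deposit_eur_minor = to_reporting_minor(1000, 0.92)  # -> 920 (9.20 EUR)
```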
8) Version, filters and compatibility
Path: `/v1/...`; new metrics/fields are additive and optional.
Filters: `brand_id, region, provider_id, game_id, method, currency, device, geo`.
Pagination: cursor-based (`next_cursor`); see the sketch after this list.
Breaking changes → only via `/v2`, plus Deprecation/Sunset headers and a changelog.
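A minimal sketch of consuming a cursor-paginated endpoint; `fetch_page` is an assumed helper standing in for the HTTP call that returns items plus `next_cursor`:

```python
def fetch_all(fetch_page):
    """fetch_page(cursor) -> (items, next_cursor); iterate until the cursor runs out."""
    cursor = None
    while True:
        items, cursor = fetch_page(cursor)
        yield from items
        if cursor is None:  # next_cursor == null marks the last page
            break
```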
9) Security and access
OAuth2 Client Credentials (short-lived tokens), mTLS for B2B.
RBAC/ABAC: permissions per metric/slice; RLS filtering by brand/region (see the sketch after this list).
PII: the API returns no PII, only aggregates (and pseudo-IDs where strictly necessary).
Residency: requests are routed to their region; cross-region data access is not allowed.
Rate limits and quotas, anti-abuse; WORM audit of access.
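A minimal sketch of server-side RLS, assuming the token claims carry the brands and regions the client is scoped to (claim and parameter names are illustrative):

```python
def apply_rls(requested_filters: dict, token_claims: dict) -> dict:
    """Force brand/region filters from the token: a request may narrow its scope,
    never widen it beyond what the token allows."""
    allowed_brands = set(token_claims["brands"])
    allowed_regions = set(token_claims["regions"])

    brand = requested_filters.get("brand_id")
    region = requested_filters.get("region")
    if brand and brand not in allowed_brands:
        raise PermissionError("brand_id outside token scope")
    if region and region not in allowed_regions:
        raise PermissionError("region outside token scope")

    return {
        **requested_filters,
        "brand_id": brand or sorted(allowed_brands),
        "region": region or sorted(allowed_regions),
    }
```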
10) SLO and observability
SLO benchmarks:
- `GET /timeseries` (granularity 5m): p95 ≤ 500-800 ms.
- `GET /slice`: p95 ≤ 1-2 s (tops up to 50-100 positions).
- `POST /retention` (a month of cohorts): p95 ≤ 3-5 s.
- Rollup freshness: p95 ≤ 2-5 min from `occurred_at`.
- Metrics: latency p50/p95/p99, error rate (4xx/5xx), cache hit rate, requests/scanned bytes (OLAP), freshness lag per rollup.
- Logs: structured, with `trace_id`, query filters (without PII), scan counters.
11) Caching, precomputation, cost
Rollup tables: 1m/5m/1h/1d for key metrics → fast `timeseries` (see the TTL sketch after this list).
Materialized views for heavy slices/cohorts.
ETag + max-age; invalidation on late events happens incrementally.
Hot/cold strategy: hot queries hit the OLAP warehouse; the archive stays in the Lake.
Per-request limit on scanned bytes; hints to the scheduler.
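A minimal sketch of deriving the server-side cache TTL from the requested granularity, following the 5m → 60-120 s guideline from section 5.4; the values for the other granularities are illustrative assumptions:

```python
# TTL per granularity; 5m follows the 60-120 s guideline, the rest are scaled assumptions.
CACHE_TTL_SEC = {"1m": 30, "5m": 90, "1h": 600, "1d": 3600}


def ttl_for(granularity: str) -> int:
    """Coarser rollups change less often, so they can be cached longer."""
    return CACHE_TTL_SEC.get(granularity, 60)  # conservative default for unknown values
```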
12) Embedding and export
Embedded widgets via signed URL/iframe with RLS tokens (see the signing sketch after this list).
CSV/Parquet export as a job (job API) with size limits and temporary download links.
Webhook notifications when the export is ready.
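A minimal sketch of signing such a widget or export URL with an expiry, assuming an HMAC shared secret; the secret handling and parameter names are illustrative:

```python
import hashlib
import hmac
import time
from urllib.parse import urlencode

SECRET = b"..."  # shared signing secret, kept in a secrets manager


def signed_url(base: str, params: dict, ttl_sec: int = 300) -> str:
    """Append an expiry and an HMAC signature so the link can be neither altered nor reused late."""
    params = {**params, "expires": int(time.time()) + ttl_sec}
    query = urlencode(sorted(params.items()))
    sig = hmac.new(SECRET, query.encode(), hashlib.sha256).hexdigest()
    return f"{base}?{query}&sig={sig}"
```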
13) Checklists
Architecture
- Unified event schema, semver, schema registry; dedup by `event_id`.
- Rollups and materialized views for the top use cases.
- RLS/RBAC/ABAC, residency, short-lived tokens.
- Cache (ETag/TTL), rate limits, quotas.
Semantics
- GGR/NGR/RTP/DAU/retention definitions are documented.
- Currencies in minor units; FX fixed at event time.
- Retention computed in UTC, with the brand time zone applied for display.
Operations
- SLO/freshness and latency dashboards.
- WORM audit of accesses/exports.
- DR/chaos exercises: rollup lag, request surges, late events.
14) Anti-patterns (red flags)
"Raw" OLTP tables are given directly to the API.
Inconsistent metric definitions between commands.
No deduplication and watermarks → double/lost events.
Unlimited on-the-fly aggregations without cache/quotas → expensive and slow requests.
Cross-region aggregation without residency policies.
Return PII/player details to public responses.
Quiet breaking-changes without '/v2 'and Deprecation.
15) Mini-spec (TL;DR)
Events: `/v1/events:batch` (at-least-once, dedup by `event_id`).
Timeseries: `/v1/analytics/timeseries?metric=...&granularity=...` (rollups + cache).
Slices: `/v1/analytics/slice?metric=...&dim=...`.
Funnels: `/v1/analytics/funnel` (window, steps, filters).
Retention/cohorts: `/v1/analytics/retention` (+ LTV).
Security: OAuth2 + mTLS, RLS, per-brand/region tokens, WORM audit.
SLO: p95 ≤ 0.5-2 s; freshness ≤ 2-5 min.
A statistics and analytics API is not a "SELECT * FROM big_table" but a contract around metrics: stable events, precomputed and cached aggregates, strictly defined retention and cohorts, security (RLS/RBAC), residency and clear SLOs. That is how you serve data quickly, cheaply and predictably to partners, product and BI, without contested interpretations and without the risk of leaks or of overloading the storage.