# Analyzer
The analyzer reverse-engineers Synode configs from existing event data. Point it at your CSV, JSONL, or JSON event logs and it produces a complete configuration with journeys, adventures, actions, datasets, and personas.
## Getting Started

- Create a Synode project: `synode init`
- Place your event data in the `input/` directory
- Run `synode analyze`
The analyzer will:
- Auto-detect your event schema (Segment, Mixpanel, GA4, or custom)
- Split events into sessions
- Discover event patterns and field distributions
- Build a complete Synode config
## Supported Formats
| Format | Extensions | Auto-detection |
|---|---|---|
| CSV | .csv | Column mapping wizard |
| JSONL | .jsonl | Segment, Mixpanel, GA4 |
| JSON | .json | Segment, Mixpanel, GA4 |
For GA4 BigQuery exports, the analyzer automatically flattens `event_params` arrays and nested `device`/`geo` objects into a flat payload.
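Conceptually, the flattening works like this. The row and parameter shapes below are illustrative simplifications of the GA4 BigQuery export schema, and `flattenGa4Row` is a hypothetical name, not part of the analyzer API:

```ts
// Illustrative shapes: a GA4 BigQuery export row carries event_params as an
// array of { key, value } pairs, where the value sits in exactly one typed slot.
type Ga4Param = {
  key: string;
  value: { string_value?: string; int_value?: number; double_value?: number };
};

type Ga4Row = {
  event_name: string;
  event_params: Ga4Param[];
  device?: Record<string, unknown>;
  geo?: Record<string, unknown>;
};

// Flatten event_params plus nested device/geo objects into one flat payload.
function flattenGa4Row(row: Ga4Row): Record<string, unknown> {
  const payload: Record<string, unknown> = { event_name: row.event_name };
  for (const p of row.event_params) {
    // Take whichever typed slot is populated for this parameter.
    payload[p.key] = p.value.string_value ?? p.value.int_value ?? p.value.double_value;
  }
  for (const [prefix, obj] of [['device', row.device], ['geo', row.geo]] as const) {
    for (const [k, v] of Object.entries(obj ?? {})) payload[`${prefix}.${k}`] = v;
  }
  return payload;
}
```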
## How It Works
The analyzer runs a 9-stage pipeline:
1. Ingestion — parse file, detect schema, normalize events
2. Session detection — group events by session ID or time gap (default 30 min)
3. Action discovery — find unique event names and analyze field distributions
4. Sequence mining — discover frequent contiguous event sequences (PrefixSpan)
5. Adventure assembly — group sequences into adventures with bounce, wait, and timing stats
6. Journey assembly — group adventures into journeys with weights and prerequisites
7. Dataset extraction — find entity tables (products, categories) from payload ID patterns
8. Persona extraction — detect user-level attribute distributions (locale, device, country)
9. Compilation — generate TypeScript config files
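The time-gap mode of session detection (stage 2) can be sketched roughly as follows. The event shape and the `splitSessions` name are illustrative, not the analyzer's actual internals:

```ts
type RawEvent = { userId: string; timestamp: number };

// Sketch of time-gap session splitting with the default 30-minute gap:
// an event starts a new session for its user whenever the gap to that
// user's previous event exceeds gapMs.
function splitSessions(events: RawEvent[], gapMs = 30 * 60 * 1000): RawEvent[][] {
  const sessions: RawEvent[][] = [];
  const open = new Map<string, RawEvent[]>(); // userId -> currently open session
  for (const ev of [...events].sort((a, b) => a.timestamp - b.timestamp)) {
    const current = open.get(ev.userId);
    const prev = current?.[current.length - 1];
    if (current && prev && ev.timestamp - prev.timestamp <= gapMs) {
      current.push(ev); // within the gap: same session continues
    } else {
      const fresh = [ev]; // gap exceeded (or first event): new session
      sessions.push(fresh);
      open.set(ev.userId, fresh);
    }
  }
  return sessions;
}
```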
## What Gets Extracted
The analyzer computes all of these from your data:
| Config element | How it's detected |
|---|---|
| Journey weights | Relative session frequency per journey |
| Bounce rates | Drop-off rates at journey, adventure, and action level |
| Wait times | Median time gaps between consecutive actions |
| Prerequisites | Journey ordering patterns (>80% co-occurrence) |
| Cooloffs | Minimum repeat interval per user |
| `weighted()` fields | Categorical values with observed probabilities |
| `oneOf()` fields | Uniform categorical distributions |
| `range()` fields | Numeric min/max bounds |
| `chance()` fields | Boolean true rates |
| Datasets | Entity-like data identified by `_id` fields with attributes |
| Persona attributes | User-level fields (locale, device, country) with distributions |
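As an example of the kind of computation behind the `weighted()` row, here is a minimal sketch that turns observed categorical values into probabilities (`deriveWeights` is an illustrative name, not part of the analyzer API):

```ts
// Count each observed categorical value, then normalize counts to
// probabilities, the shape a weighted() field expects.
function deriveWeights(values: string[]): Record<string, number> {
  const counts = new Map<string, number>();
  for (const v of values) counts.set(v, (counts.get(v) ?? 0) + 1);
  const weights: Record<string, number> = {};
  for (const [v, c] of counts) weights[v] = c / values.length;
  return weights;
}
```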
## Intermediate Format
The analyzer produces an intermediate JSON file (`input/analysis.json`) before compiling to TypeScript. This file is a complete, serializable representation of the detected config:
```json
{
  "version": "1.0",
  "meta": { "sourceFiles": ["events.csv"], "eventCount": 50000, "uniqueUsers": 1200 },
  "journeys": [{ "id": "...", "weight": 0.7, "bounceChance": 0.15, "adventures": [...] }],
  "datasets": [{ "id": "products", "count": 50, "fields": { "name": { "generator": "oneOf", "options": [...] } } }],
  "persona": { "attributes": { "locale": { "generator": "weighted", "options": { "en-US": 0.6, "de-DE": 0.25 } } } },
  "simulation": { "users": 1200, "lanes": 4, "tick": "1m" }
}
```

You can edit this file manually and re-compile with `synode analyze --compile-only`.
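For manual editing, a rough TypeScript shape for this file, inferred only from the example above (the real format may carry additional fields):

```ts
// Inferred shape of input/analysis.json; fields not visible in the example
// are typed loosely, and the real schema may differ.
interface Analysis {
  version: string;
  meta: { sourceFiles: string[]; eventCount: number; uniqueUsers: number };
  journeys: Array<{ id: string; weight: number; bounceChance: number; adventures: unknown[] }>;
  datasets: Array<{ id: string; count: number; fields: Record<string, { generator: string }> }>;
  persona: { attributes: Record<string, { generator: string }> };
  simulation: { users: number; lanes: number; tick: string };
}
```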
## AI Enhancement (Optional)
The analyzer can optionally use AI (Claude or Gemini) to refine the generated config:
- Name refinement — suggest human-readable journey/adventure/action names
- Condition inference — detect logical prerequisites between flows
- Dataset identification — find entity patterns in payloads
- Config review — holistic quality check
Each AI call requires explicit user approval. A security notice is shown before any data is sent. No raw event data is transmitted — only aggregated structure and statistics.
Supported providers: Anthropic (Claude), Google (Gemini). Install the provider SDK you need:
```sh
# For Anthropic
npm install ai @ai-sdk/anthropic

# For Google
npm install ai @ai-sdk/google
```

Set the API key via environment variable or enter it when prompted:

```sh
export ANTHROPIC_API_KEY=sk-...
# or
export GOOGLE_API_KEY=...
```

## CLI Reference
```sh
# Full analysis with interactive wizard
synode analyze

# Re-compile from existing analysis.json
synode analyze --compile-only
```

| Flag | Description |
|---|---|
| `--compile-only` | Skip analysis, re-compile from existing `input/analysis.json` |
## Programmatic API
All pipeline stages are available as standalone functions:
```ts
import {
  parseFile,
  detectSchema,
  normalizeEvents,
  detectSessions,
  discoverActions,
  mineSequences,
  assembleAdventures,
  assembleJourneys,
  extractDatasets,
  extractPersona,
  compileToTypeScript,
} from '@synode/analyzer';

// Parse and normalize
const rows = await parseFile('input/events.csv');
const schema = detectSchema(rows);
const events = normalizeEvents(rows, schema.mapping);

// Detect sessions and analyze
const sessions = detectSessions(events, { mode: 'timeGap', gapMs: 30 * 60 * 1000 });
const actions = discoverActions(sessions);
const sequences = mineSequences(sessions, { minSupport: 0.1 });

// Assemble structure
const adventures = assembleAdventures(sequences, sessions, actions);
const journeys = assembleJourneys(adventures, sessions);
const datasets = extractDatasets(sessions);
const persona = extractPersona(sessions);

// Compile to TypeScript files
const files = compileToTypeScript({ version: '1.0', meta: { ... }, journeys, datasets, persona, ... });
```

See the API Reference for detailed type information.
