Exporting structure and example data (Firebase & BigQuery)
Use this when populating `internal/ai-context/**/firebase` and `internal/ai-context/**/bigquery`. Goal: the schema plus one real, redacted example per resource, for agents and humans.
General rules
- Never commit secrets: strip API keys, OAuth tokens, emails, phone numbers, street addresses, and raw auth payloads.
- Minimize volume: one account / one sponsor / one creator slice is enough; avoid full collection dumps in git.
- Record metadata: in `SOURCE.md` (optional), note the date, environment, export tool, and which document/row IDs were used (internal IDs only if safe).
- Stable ordering: pretty-print JSON; for BigQuery use column order matching the console or `SELECT *` from a narrowed query.
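The redaction and ordering rules above can be sketched as a small helper. This is a minimal illustration, not a complete PII filter: the key pattern is a hypothetical deny-list matched on field names only, and sorting keys alphabetically is one assumed way to get stable ordering. Sensitive values stored under neutral names (for example raw auth payloads) still need manual review.

```javascript
// Hypothetical deny-list of field-name fragments; extend it from your data catalog.
const SENSITIVE_KEY = /(api[_-]?key|token|secret|password|email|phone|address)/i;

// Recursively copy a JSON-like value, replacing sensitive fields with a marker
// and sorting object keys so repeated exports diff cleanly.
function redact(value) {
  if (Array.isArray(value)) return value.map(redact);
  if (value !== null && typeof value === "object") {
    const out = {};
    for (const key of Object.keys(value).sort()) {
      out[key] = SENSITIVE_KEY.test(key) ? "[REDACTED]" : redact(value[key]);
    }
    return out;
  }
  return value;
}

// Pretty-print the redacted value with two-space indentation and a trailing newline.
function toStableJson(value) {
  return JSON.stringify(redact(value), null, 2) + "\n";
}
```

Running `toStableJson` on a document before saving it keeps the committed file both redacted and diff-friendly.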
Firebase
What to capture
- Document path (full path from root), e.g. `creators/{id}` or `instagramProfiles/{id}`.
- Field names and types as JSON values show them (strings, numbers, maps, arrays).
- Subcollections (list paths only if you do not export every doc): note existence in `README.md` under that area.
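To record field names and types without reading them off by hand, a sketch like the following can summarize one exported document. The function names and the sample field names are hypothetical; the type labels follow the list above (string, number, map, array).

```javascript
// Map a JSON-like value to the type names used in this guide.
function typeOf(value) {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  if (typeof value === "object") return "map";
  return typeof value; // "string", "number", "boolean"
}

// Flatten a document into "field.path -> type" pairs. Nested maps are walked;
// arrays are reported once without descending into their items.
function describeFields(doc, prefix = "") {
  const rows = {};
  for (const [key, value] of Object.entries(doc)) {
    const path = prefix ? `${prefix}.${key}` : key;
    rows[path] = typeOf(value);
    if (rows[path] === "map") Object.assign(rows, describeFields(value, path));
  }
  return rows;
}

// Example with made-up fields:
const summary = describeFields({ handle: "x", followers: 10, tags: ["a"], stats: { likes: 1 } });
```

The resulting object can be pasted into the area's notes next to the example JSON.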
Recommended export methods
A. Firebase Console (quick, small docs)
- Open Firestore → navigate to the document for your single example account.
- Copy relevant fields into a local file, or use ⋮ → Export if your project supports JSON export for that view.
- Save as `firebase/<resource>-example.json` and add a sibling `firebase/paths.md` listing the exact path(s).
B. gcloud / Firestore REST (scriptable)
- Authenticate: `gcloud auth application-default login` (or use a service account with least privilege).
- Use the Firestore REST API `documents.get` or the Firebase Admin SDK in a one-off script:
  - Node (Admin SDK): use the repo script `internal/ai-context/scripts/export-firestore-doc.mjs` (see `scripts/README.md`), or call `admin.firestore().doc('path/to/doc').get()` and stringify after converting Timestamps to ISO strings.
- For subcollections: `listDocuments` / collection group queries scoped to your test ID only.
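A one-off Admin SDK export might look roughly like this. It is a sketch, not the repo script: `exportDoc` and `timestampsToIso` are hypothetical names, and `db` is assumed to behave like `admin.firestore()`. It is passed in as a parameter so the function can be tried against a stub without credentials. Only Timestamps are converted; other special Firestore types (references, geopoints) would need their own handling.

```javascript
// Convert Timestamp-like values (anything exposing toDate()) and Date objects
// to ISO-8601 strings, walking maps and arrays.
function timestampsToIso(value) {
  if (value instanceof Date) return value.toISOString();
  if (value !== null && typeof value === "object") {
    if (typeof value.toDate === "function") return value.toDate().toISOString();
    if (Array.isArray(value)) return value.map(timestampsToIso);
    return Object.fromEntries(
      Object.entries(value).map(([k, v]) => [k, timestampsToIso(v)])
    );
  }
  return value;
}

// Fetch one document and list its direct subcollection paths.
// `db` is assumed to expose the Admin SDK Firestore interface.
async function exportDoc(db, path) {
  const ref = db.doc(path);
  const snap = await ref.get();
  if (!snap.exists) throw new Error(`No document at ${path}`);
  const subcollections = await ref.listCollections();
  return {
    path,
    data: timestampsToIso(snap.data()),
    subcollections: subcollections.map((c) => c.path),
  };
}
```

The returned object still needs the redaction pass before it is written to disk.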
C. Emulator snapshot (if you use emulators)
- Seed one user; export emulator data if your workflow supports it; same redaction applies.
Post-export cleanup
- Replace `Timestamp` objects with ISO-8601 strings in saved JSON.
- Truncate long arrays (e.g. keep first 2 items, add `"_truncated": true`).
- Remove or hash fields flagged as PII in your data catalog.
BigQuery
What to capture
- Table reference: `project.dataset.table`.
- Schema: column name, type, mode (NULLABLE/REQUIRED/REPEATED), and short description if you use BQ column descriptions.
- One row (or a few) for the same logical entity as Firebase (same internal creator/sponsor id where applicable).
Recommended export methods
A. Schema only (DDL or JSON schema)
In BigQuery Console:
- Open the table → Details → Schema tab.
- Export schema:
  - Option 1: Run `bq show --format=prettyjson --schema creator-rank:dataset.table > bigquery/<table>-schema.json`
  - Option 2: Use Open in → Dataprep / copy schema manually into `bigquery/<table>-schema.md` (table in Markdown).
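If you already have the JSON schema from Option 1, the Markdown variant can be generated instead of copied by hand. The sketch below assumes the schema file is an array of objects with `name`, `type`, and optional `mode`, `description`, and nested `fields`; `schemaToMarkdown` is a hypothetical helper name.

```javascript
// Render a BigQuery JSON schema (array of column objects) as a Markdown table.
// Nested RECORD columns are flattened to dotted names.
function schemaToMarkdown(fields, prefix = "") {
  const lines = prefix
    ? []
    : ["| Column | Type | Mode | Description |", "|---|---|---|---|"];
  for (const f of fields) {
    const name = prefix + f.name;
    // Escape pipes so descriptions cannot break the table layout.
    const description = (f.description || "").replace(/\|/g, "\\|");
    // NULLABLE is BigQuery's default mode when none is given.
    lines.push(`| ${name} | ${f.type} | ${f.mode || "NULLABLE"} | ${description} |`);
    if (Array.isArray(f.fields) && f.fields.length > 0) {
      lines.push(schemaToMarkdown(f.fields, name + "."));
    }
  }
  return lines.join("\n");
}
```

Feeding it the parsed contents of `bigquery/<table>-schema.json` yields text suitable for `bigquery/<table>-schema.md`.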
B. One example row (CSV or JSON)
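Once a row has been fetched (for example as JSON from a narrowed query), the two output shapes can be produced as follows. This is a sketch with hypothetical helper names; it formats rows that are already plain JSON objects and does no redaction itself.

```javascript
// One JSON object per line (NDJSON); diffs stay line-oriented.
function toNdjson(rows) {
  return rows.map((row) => JSON.stringify(row)).join("\n") + "\n";
}

// A single row as a pretty-printed one-element JSON array.
function toSingleRowArray(row) {
  return JSON.stringify([row], null, 2) + "\n";
}
```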
Prefer NDJSON or a single-row JSON array for readability in repos.
C. INFORMATION_SCHEMA (structure of many tables)
To document all tables in a dataset without row data, query the dataset's `INFORMATION_SCHEMA` and save the result as `bigquery/information_schema-<dataset>.csv` or as a `.md` table.
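One way to produce that listing is a query over the dataset's `INFORMATION_SCHEMA.COLUMNS` view. The helper below only builds the SQL text; the function name and the choice of columns are assumptions, and the project and dataset names are placeholders to be supplied by you.

```javascript
// Build a query over INFORMATION_SCHEMA.COLUMNS for one dataset.
// Identifiers cannot be bound as query parameters, so they are validated here.
function informationSchemaQuery(project, dataset) {
  const safe = /^[A-Za-z0-9_-]+$/;
  if (!safe.test(project) || !safe.test(dataset)) {
    throw new Error("Unexpected characters in project or dataset name");
  }
  return [
    "SELECT table_name, column_name, ordinal_position, data_type, is_nullable",
    `FROM \`${project}.${dataset}.INFORMATION_SCHEMA.COLUMNS\``,
    "ORDER BY table_name, ordinal_position",
  ].join("\n");
}
```

The query returns structure only, so its output needs no row-level redaction, though column names themselves may still be sensitive.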
Post-export cleanup
- Drop or mask columns that are PII or contractual secrets.
- If the row is wide, split into logical JSON files (e.g. `profile.json`, `metrics_daily.json`) and reference the shared key in `README.md`.
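The masking and splitting steps above can be sketched together. `splitRow`, the column names, and the `[MASKED]` marker are hypothetical; the grouping of columns into files is something you define per table.

```javascript
// Split one wide row into named groups of columns, repeating the shared key
// in every part and masking any column listed in `piiColumns`.
function splitRow(row, keyColumn, groups, piiColumns = []) {
  const parts = {};
  for (const [fileName, columns] of Object.entries(groups)) {
    const part = { [keyColumn]: row[keyColumn] };
    for (const column of columns) {
      if (!(column in row)) continue; // skip columns the row does not have
      part[column] = piiColumns.includes(column) ? "[MASKED]" : row[column];
    }
    parts[fileName] = part;
  }
  return parts;
}

// Example with made-up columns:
const parts = splitRow(
  { creator_id: "c1", display_name: "Ada", contact_email: "a@b.c", likes_7d: 10 },
  "creator_id",
  { "profile.json": ["display_name", "contact_email"], "metrics_daily.json": ["likes_7d"] },
  ["contact_email"]
);
```

Each entry of `parts` maps a target filename to the object to write there, and every part carries the shared key.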
Suggested filenames (per area)
| Type | Example filename |
|---|---|
| Firebase doc body | firebase/firestore-<collection>-example.json |
| Firebase path index | firebase/paths.md |
| BQ schema | bigquery/<table>-schema.json or -schema.md |
| BQ sample row | bigquery/<table>-example-row.json |
| Lineage note | SOURCE.md (at area root or next to samples) |
Aligning Firebase and BigQuery in one area
- Pick one stable key (e.g. internal creator id, sponsor `brandId`).
- Export Firebase doc(s) for that key.
- Run BigQuery `SELECT * ... WHERE <key> = ... LIMIT 1` for tables that power the same API surface.
- In that area’s `README.md`, add a short Mapping section: which API fields come from which Firestore path vs which BQ column(s).
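The last two steps can be sketched as two small helpers. Both are illustrations under assumptions: the query binds the key value through a named parameter `@key` (to be supplied by whatever client runs the query), and the Mapping table layout and the entry shape (`apiField`, `firestore`, `bigquery`) are hypothetical.

```javascript
// Build the single-row lookup for one table. The key value is bound as the
// named parameter @key; table and column names are validated because
// identifiers cannot be parameterized.
function exampleRowQuery(table, keyColumn) {
  if (!/^[A-Za-z0-9_-]+(\.[A-Za-z0-9_]+){2}$/.test(table)) {
    throw new Error("Expected table as project.dataset.table");
  }
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(keyColumn)) {
    throw new Error("Unexpected characters in key column");
  }
  return `SELECT * FROM \`${table}\` WHERE ${keyColumn} = @key LIMIT 1`;
}

// Render the README Mapping section as a Markdown table.
function mappingTable(entries) {
  const header = ["| API field | Firestore path | BigQuery column |", "|---|---|---|"];
  const rows = entries.map(
    (e) => `| ${e.apiField} | ${e.firestore || ""} | ${e.bigquery || ""} |`
  );
  return header.concat(rows).join("\n");
}
```

Leaving one side of a mapping row empty makes it visible when an API field is served from only one of the two stores.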