
Exporting structure and example data (Firebase & BigQuery)

Use this when populating internal/ai-context/**/firebase and internal/ai-context/**/bigquery. Goal: schema + one real example per resource, redacted, for agents and humans.

General rules

  1. Never commit secrets — strip API keys, OAuth tokens, emails, phone numbers, street addresses, and raw auth payloads.
  2. Minimize volume — one account / one sponsor / one creator slice is enough; avoid full collection dumps in git.
  3. Record metadata — in SOURCE.md (optional): date, environment, export tool, and which document/row IDs were used (internal IDs only if safe).
  4. Stable ordering — pretty-print JSON; for BigQuery use column order matching the console or SELECT * from a narrowed query.
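
Rules 1 and 4 can be sketched as a small pre-commit helper. This is a minimal sketch, not a complete PII scrubber: the key list in SECRET_KEYS and the function names are hypothetical, and the email pattern is deliberately simple. Take the real deny-list from your own data catalog.

```python
import json
import re

# Hypothetical deny-list of field names (compared case-insensitively).
SECRET_KEYS = {"apikey", "api_key", "token", "access_token", "refresh_token",
               "email", "phone", "address", "auth"}
# Simple pattern for emails embedded in free text; not RFC-complete.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact(value, key=""):
    """Return a copy of value with secret-looking fields and emails masked."""
    if key.lower() in SECRET_KEYS:
        return "<redacted>"
    if isinstance(value, dict):
        return {k: redact(v, k) for k, v in value.items()}
    if isinstance(value, list):
        return [redact(item) for item in value]
    if isinstance(value, str):
        return EMAIL_RE.sub("<redacted-email>", value)
    return value


def dump_stable(doc, sort_keys=True):
    """Pretty-print redacted JSON. Pass sort_keys=False for BigQuery rows
    so the column order from the query is preserved."""
    return json.dumps(redact(doc), indent=2, sort_keys=sort_keys, ensure_ascii=False)
```

Review the output by hand before committing; a key-name deny-list will miss secrets stored under unexpected field names.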

Firebase

What to capture

  • Document path (full path from root), e.g. creators/{id} or instagramProfiles/{id}.
  • Field names and types as JSON values show them (strings, numbers, maps, arrays).
  • Subcollections (list paths only if you do not export every doc): note existence in README.md under that area.

A. Firebase Console (quick, small docs)

  1. Open Firestore → navigate to the document for your single example account.
  2. Copy relevant fields into a local file, or use ⋮ → Export if your project supports JSON export for that view.
  3. Save as firebase/<resource>-example.json and add a sibling firebase/paths.md listing the exact path(s).

B. gcloud / Firestore REST (scriptable)

  1. Authenticate: gcloud auth application-default login (or service account with least privilege).
  2. Use the Firestore REST API documents.get or the Firebase Admin SDK in a one-off script to fetch the single example document.
  3. For subcollections: use listDocuments or collection group queries, scoped to your test ID only.
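
A minimal sketch of the REST route, standard library only. It assumes gcloud is on PATH and application-default credentials are already set up (step 1). The function names are hypothetical, and depending on project setup the request may also need a quota-project header. The REST API returns typed values (stringValue, integerValue, mapValue, ...), so the sketch also flattens them into plain JSON.

```python
import json
import subprocess
import urllib.request

BASE = "https://firestore.googleapis.com/v1"


def document_url(project_id, doc_path, database="(default)"):
    """Build the documents.get URL for a path such as creators/abc123."""
    return f"{BASE}/projects/{project_id}/databases/{database}/documents/{doc_path}"


def decode_value(v):
    """Convert one Firestore REST typed value into a plain JSON value."""
    if "mapValue" in v:
        return decode_fields(v["mapValue"].get("fields", {}))
    if "arrayValue" in v:
        return [decode_value(x) for x in v["arrayValue"].get("values", [])]
    if "integerValue" in v:
        return int(v["integerValue"])  # REST encodes int64 as a string
    if "nullValue" in v:
        return None
    # stringValue, booleanValue, doubleValue, timestampValue, referenceValue, ...
    return next(iter(v.values()))


def decode_fields(fields):
    """Decode the 'fields' map of a document or nested mapValue."""
    return {name: decode_value(v) for name, v in fields.items()}


def fetch_document(project_id, doc_path):
    """Fetch one document using an ADC access token (requires gcloud)."""
    token = subprocess.run(
        ["gcloud", "auth", "application-default", "print-access-token"],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    req = urllib.request.Request(
        document_url(project_id, doc_path),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return decode_fields(json.load(resp).get("fields", {}))
```

Timestamps arrive from the REST API as RFC 3339 strings (timestampValue), so they need no further conversion on this route.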

C. Emulator snapshot (if you use emulators)

  • Seed one user; export emulator data if your workflow supports it; same redaction applies.

Post-export cleanup

  • Replace Timestamp objects with ISO-8601 strings in saved JSON.
  • Truncate long arrays (e.g. keep first 2 items, add "_truncated": true).
  • Remove or hash fields flagged as PII in your data catalog.
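
The three cleanup steps above can be sketched in one pass. Assumptions: timestamps arrive as Python datetime objects (as the Admin SDK for Python returns them), the "_truncated" flag is placed on the map that owns the shortened array, and PII_FIELDS is a hypothetical stand-in for your data catalog.

```python
import hashlib
from datetime import datetime, timezone

MAX_ITEMS = 2                    # keep the first two array items
PII_FIELDS = {"displayName"}     # hypothetical; use your data catalog


def clean(value, key=""):
    """Return a JSON-safe copy: hashed PII, ISO timestamps, short arrays."""
    if key in PII_FIELDS and isinstance(value, str):
        digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
        return "sha256:" + digest[:16]
    if isinstance(value, datetime):
        if value.tzinfo is None:
            # Assumption: naive datetimes are UTC.
            value = value.replace(tzinfo=timezone.utc)
        return value.isoformat()
    if isinstance(value, dict):
        out = {}
        for k, v in value.items():
            if isinstance(v, list) and len(v) > MAX_ITEMS:
                v = v[:MAX_ITEMS]
                out["_truncated"] = True  # flag sits next to the shortened array
            out[k] = clean(v, k)
        return out
    if isinstance(value, list):
        return [clean(v) for v in value]
    return value
```

An unsalted hash of a low-entropy value (a short name, a phone number) can be guessed by brute force, so prefer removing such fields outright when the example does not need them.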

BigQuery

What to capture

  • Table reference: project.dataset.table.
  • Schema: column name, type, mode (NULLABLE/REQUIRED/REPEATED), and short description if you use BQ column descriptions.
  • One row (or a few) for the same logical entity as Firebase (same internal creator/sponsor id where applicable).

A. Schema only (DDL or JSON schema)

In BigQuery Console:
  1. Open the table → Details / Schema tab.
  2. Export schema:
    • Option 1: Run
      bq show --format=prettyjson --schema creator-rank:dataset.table > bigquery/<table>-schema.json
    • Option 2: Use Open in Dataprep, or copy the schema manually into bigquery/<table>-schema.md (as a Markdown table).
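
If you already have the JSON from Option 1, the Markdown table for Option 2 can be generated instead of copied by hand. A minimal sketch, assuming the usual bq schema shape (a list of objects with name, type, optional mode, description, and nested fields for RECORD columns). The function names are hypothetical.

```python
import json


def schema_rows(fields, prefix=""):
    """Flatten bq schema JSON, expanding RECORD fields into dotted names."""
    for f in fields:
        name = prefix + f["name"]
        yield name, f["type"], f.get("mode", "NULLABLE"), f.get("description", "")
        if "fields" in f:
            yield from schema_rows(f["fields"], name + ".")


def schema_to_markdown(fields):
    """Render the flattened schema as a Markdown table."""
    lines = ["| Column | Type | Mode | Description |",
             "| --- | --- | --- | --- |"]
    for name, typ, mode, desc in schema_rows(fields):
        lines.append(f"| {name} | {typ} | {mode} | {desc} |")
    return "\n".join(lines)


# Usage (path is a placeholder for your own schema file):
# with open("bigquery/<table>-schema.json") as fh:
#     print(schema_to_markdown(json.load(fh)))
```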

B. One example row (CSV or JSON)

Prefer NDJSON or a single-row JSON array for readability in repos.
# Single row, pretty JSON (replace dataset/table/filters as needed)
bq query --use_legacy_sql=false --format=prettyjson \
  'SELECT * FROM `creator-rank.dataset.table`
   WHERE creator_id = "YOUR_TEST_ID"
   LIMIT 1' > bigquery/<table>-example-row.json
Or export to GCS then download (good for wide tables):
bq extract --destination_format=NEWLINE_DELIMITED_JSON \
  'creator-rank:dataset.table$partition' \
  'gs://your-bucket/temp/export-*.json'
# Then gsutil cp one shard and trim to one line/object
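
The trimming step can be sketched as follows. bq extract exports the whole table or partition rather than a filtered row, so the sketch optionally picks the record for your test ID. The function name is hypothetical; creator_id is the column used in the query above.

```python
import json


def first_record(lines, key=None, value=None):
    """Return the first NDJSON record, or the first whose row[key] == value.

    Returns None when nothing matches. Blank lines are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        row = json.loads(line)
        if key is None or row.get(key) == value:
            return row
    return None


# Usage (file names are placeholders):
# with open("export-000000000000.json") as shard:
#     row = first_record(shard, "creator_id", "YOUR_TEST_ID")
# with open("bigquery/<table>-example-row.json", "w") as out:
#     json.dump(row, out, indent=2)
```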

C. INFORMATION_SCHEMA (structure of many tables)

To document tables in a dataset without row data (drop the WHERE clause to cover every table):
SELECT table_name, column_name, data_type, is_nullable
FROM `creator-rank.dataset.INFORMATION_SCHEMA.COLUMNS`
WHERE table_name IN ('your_table')
ORDER BY table_name, ordinal_position;
Save results as bigquery/information_schema-<dataset>.csv or .md table.

Post-export cleanup

  • Drop or mask columns that are PII or contractual secrets.
  • If the row is wide, split into logical JSON files (e.g. profile.json, metrics_daily.json) and reference the shared key in README.md.
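
Splitting a wide row can be sketched like this. The group names follow the example above; the column names in GROUPS are placeholders for your own table, and the function name is hypothetical. Each part repeats the shared key so the files can be joined back together.

```python
import json

# Hypothetical grouping: output file stem -> columns that belong in it.
GROUPS = {
    "profile": ["display_name", "country"],
    "metrics_daily": ["followers", "engagement_rate"],
}


def split_row(row, key, groups=GROUPS):
    """Split one wide row into {group_name: subset}, repeating the shared key."""
    parts = {}
    for group, columns in groups.items():
        subset = {key: row[key]}
        subset.update({c: row[c] for c in columns if c in row})
        parts[group] = subset
    return parts


# Usage: write each part to <group>.json
# for name, part in split_row(row, "creator_id").items():
#     with open(f"{name}.json", "w") as fh:
#         json.dump(part, fh, indent=2)
```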

Suggested filenames (per area)

Type                  Example filename
Firebase doc body     firebase/firestore-<collection>-example.json
Firebase path index   firebase/paths.md
BQ schema             bigquery/<table>-schema.json or -schema.md
BQ sample row         bigquery/<table>-example-row.json
Lineage note          SOURCE.md (at area root or next to samples)

Aligning Firebase and BigQuery in one area

  1. Pick one stable key (e.g. internal creator id, sponsor brandId).
  2. Export Firebase doc(s) for that key.
  3. Run BigQuery SELECT * ... WHERE <key> = ... LIMIT 1 for tables that power the same API surface.
  4. In that area’s README.md, add a short Mapping section: which API fields come from which Firestore path vs which BQ column(s).
This makes “intended vs actual” comparisons tractable for agents and code reviews.
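
For fields that are expected to hold the same value in both stores, the Mapping section can double as a quick consistency check. A rough sketch, assuming the mapping is a dict from dotted Firestore field path to BQ column name. The function names are hypothetical, and values are compared as text because bq's JSON output commonly returns scalars as strings.

```python
def get_path(doc, dotted):
    """Look up a dotted path such as 'stats.followers' in nested dicts."""
    cur = doc
    for part in dotted.split("."):
        if not isinstance(cur, dict) or part not in cur:
            return None
        cur = cur[part]
    return cur


def compare(firestore_doc, bq_row, mapping):
    """Return (firestore_field, bq_column, fs_value, bq_value) per mismatch."""
    mismatches = []
    for fs_field, bq_col in mapping.items():
        fs_val = get_path(firestore_doc, fs_field)
        bq_val = bq_row.get(bq_col)
        # Loose comparison: 10 and "10" count as equal.
        if str(fs_val) != str(bq_val):
            mismatches.append((fs_field, bq_col, fs_val, bq_val))
    return mismatches
```

The text comparison is deliberately loose; booleans, floats and timestamps may need per-type normalization before the result can be trusted.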
Last modified on March 20, 2026