Exporting structure and example data (Firebase & BigQuery)

Use this when populating internal/ai-context/**/firebase and internal/ai-context/**/bigquery. Goal: schema + one real example per resource, redacted, for agents and humans.

General rules

Never commit secrets — strip API keys, OAuth tokens, emails, phone numbers, street addresses, and raw auth payloads.
Minimize volume — one account / one sponsor / one creator slice is enough; avoid full collection dumps in git.
Record metadata — in SOURCE.md (optional): date, environment, export tool, and which document/row IDs were used (internal IDs only if safe).
Stable ordering — pretty-print JSON; for BigQuery use column order matching the console or SELECT * from a narrowed query.
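
The redaction and stable-ordering rules above can be sketched as two small helpers. This is a minimal illustration, not a repo script: the function names, the key pattern, and the email regex are assumptions and do not amount to a complete PII list, so adapt them to your data catalog.

```javascript
// Hypothetical helpers: key names and patterns are assumptions, not a complete PII list.
const SECRET_KEY = /(api[_-]?key|token|secret|password|email|phone|address)/i;
const EMAIL = /[^\s@]+@[^\s@]+\.[^\s@]+/g;

// Recursively replace values under secret-looking keys and mask emails inside strings.
function redact(value, key = "") {
  if (SECRET_KEY.test(key)) return "[REDACTED]";
  if (typeof value === "string") return value.replace(EMAIL, "[EMAIL]");
  if (Array.isArray(value)) return value.map((item) => redact(item));
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value).map(([k, v]) => [k, redact(v, k)])
    );
  }
  return value;
}

// Pretty-print with sorted keys so repeated exports produce stable diffs.
function stableStringify(value) {
  const sort = (v) => {
    if (Array.isArray(v)) return v.map(sort);
    if (v !== null && typeof v === "object") {
      return Object.fromEntries(
        Object.keys(v).sort().map((k) => [k, sort(v[k])])
      );
    }
    return v;
  };
  return JSON.stringify(sort(value), null, 2);
}
```

Running `stableStringify(redact(doc))` before saving keeps the committed file both redacted and diff-friendly. A manual review is still needed, since pattern matching can miss free-text PII.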

Firebase

What to capture

Document path (full path from root), e.g. creators/{id} or instagramProfiles/{id}.
Field names and their types as they appear in the exported JSON (strings, numbers, maps, arrays).
Subcollections: note their existence in README.md under that area; if you do not export every doc, list the paths only.
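
To record field names and types without reading the example by hand, a type summary can be derived from the exported JSON. The sketch below is an assumption-laden illustration: `describeTypes` is a hypothetical name, it reports JSON-level types only, and it inspects only the first element of each array.

```javascript
// Hypothetical helper: summarizes field paths and JSON types for a sample document.
// Only the first element of each array is inspected.
function describeTypes(value, prefix = "", out = {}) {
  if (Array.isArray(value)) {
    out[prefix] = "array";
    if (value.length > 0) describeTypes(value[0], `${prefix}[]`, out);
  } else if (value !== null && typeof value === "object") {
    if (prefix) out[prefix] = "map";
    for (const [key, child] of Object.entries(value)) {
      describeTypes(child, prefix ? `${prefix}.${key}` : key, out);
    }
  } else {
    out[prefix] = value === null ? "null" : typeof value;
  }
  return out;
}
```

The resulting path-to-type map can be pasted into the area's README.md next to the document path.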

Recommended export methods

A. Firebase Console (quick, small docs)

Open Firestore → navigate to the document for your single example account.
Copy relevant fields into a local file, or use ⋮ → Export if your project supports JSON export for that view.
Save as firebase/firestore-<collection>-example.json and add a sibling firebase/paths.md listing the exact path(s).

B. `gcloud` / Firestore REST (scriptable)

Authenticate: gcloud auth application-default login (or service account with least privilege).
Use the Firestore REST API documents.get or Firebase Admin SDK in a one-off script:
- Node (Admin SDK): use the repo script internal/ai-context/scripts/export-firestore-doc.mjs (see scripts/README.md) — or call admin.firestore().doc('path/to/doc').get() and stringify after converting Timestamps to ISO strings.
For subcollections: call listCollections() on the document reference to discover them, then use listDocuments() or collection group queries scoped to your test ID only.
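
If the repo script is not available, the Admin SDK route might look roughly like the sketch below. It assumes `db` is a Firestore instance obtained from the Firebase Admin SDK (for example `admin.firestore()`); the `exportDoc` name and its return shape are hypothetical and not taken from the repo script.

```javascript
// Sketch: `db` is assumed to be a Firestore instance from the Firebase Admin SDK.
// The function name and return shape are hypothetical.
async function exportDoc(db, docPath) {
  const ref = db.doc(docPath);
  const snap = await ref.get();
  if (!snap.exists) throw new Error(`No document at ${docPath}`);

  // List subcollection paths only; their documents are not exported here.
  const subcollections = await ref.listCollections();
  return {
    path: docPath,
    data: snap.data(),
    subcollections: subcollections.map((col) => col.path),
  };
}
```

Passing `db` in as a parameter keeps the function easy to exercise against the emulator or a stub. The returned `data` still contains Timestamp objects, so it needs the post-export cleanup before it is saved.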

C. Emulator snapshot (if you use emulators)

Seed one user; export emulator data if your workflow supports it; same redaction applies.

Post-export cleanup

Replace Timestamp objects with ISO-8601 strings in saved JSON.
Truncate long arrays (e.g. keep the first 2 items and add "_truncated": true to the parent object).
Remove or hash fields flagged as PII in your data catalog.
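
The first two cleanup steps can be sketched as one recursive pass. This is an illustration under stated assumptions: `cleanForExport` is a hypothetical name, Timestamps are detected by the presence of a `toDate()` method, and the `_truncated` marker is placed on the object that owns the shortened array.

```javascript
// Hypothetical cleanup helper. Firestore Timestamps are detected by their toDate() method.
function cleanForExport(value, maxItems = 2) {
  if (value instanceof Date) return value.toISOString();
  if (value !== null && typeof value === "object" && typeof value.toDate === "function") {
    return value.toDate().toISOString();
  }
  if (Array.isArray(value)) {
    return value.slice(0, maxItems).map((item) => cleanForExport(item, maxItems));
  }
  if (value !== null && typeof value === "object") {
    const out = {};
    for (const [key, child] of Object.entries(value)) {
      out[key] = cleanForExport(child, maxItems);
      // Mark the parent object when one of its arrays was shortened.
      if (Array.isArray(child) && child.length > maxItems) out._truncated = true;
    }
    return out;
  }
  return value;
}
```

Arrays nested directly inside other arrays are shortened without a marker in this sketch. PII removal is deliberately left out, because it depends on what your data catalog flags.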

BigQuery

What to capture

Table reference: project.dataset.table.
Schema: column name, type, mode (NULLABLE/REQUIRED/REPEATED), and short description if you use BQ column descriptions.
One row (or a few) for the same logical entity as Firebase (same internal creator/sponsor id where applicable).

Recommended export methods

A. Schema only (DDL or JSON schema)

In the BigQuery Console, open the table and select the Schema tab to review the schema.
Export schema:
- Option 1: Run
  bq show --format=prettyjson --schema creator-rank:dataset.table > bigquery/<table>-schema.json
- Option 2: Use Open in → Dataprep / copy schema manually into bigquery/<table>-schema.md (table in Markdown).
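
For the Markdown variant, the JSON from Option 1 can be converted instead of copied by hand. The sketch below assumes `fields` is the array printed by `bq show --schema` (objects with `name`, `type`, `mode`, and optionally `description` and nested `fields`); `schemaToMarkdown` is a hypothetical name.

```javascript
// Sketch: `fields` is assumed to be the array printed by `bq show --schema`.
function schemaToMarkdown(fields) {
  const rows = [];
  const walk = (list, prefix) => {
    for (const f of list) {
      const name = prefix + f.name;
      rows.push(`| ${name} | ${f.type} | ${f.mode || "NULLABLE"} | ${f.description || ""} |`);
      // Nested RECORD columns are flattened into dotted names.
      if (Array.isArray(f.fields)) walk(f.fields, `${name}.`);
    }
  };
  walk(fields, "");
  return [
    "| Column | Type | Mode | Description |",
    "| --- | --- | --- | --- |",
    ...rows,
  ].join("\n");
}
```

The output can be saved as bigquery/<table>-schema.md alongside the JSON schema.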

B. One example row (CSV or JSON)

Prefer NDJSON or a single-row JSON array for readability in repos.

# Single row, pretty JSON (replace dataset/table/filters as needed)
bq query --use_legacy_sql=false --format=prettyjson \
  'SELECT * FROM `creator-rank.dataset.table`
   WHERE creator_id = "YOUR_TEST_ID"
   LIMIT 1' > bigquery/<table>-example-row.json

Or export to GCS then download (good for wide tables):

bq extract --destination_format=NEWLINE_DELIMITED_JSON \
  'creator-rank:dataset.table$partition' \
  'gs://your-bucket/temp/export-*.json'
# Then gsutil cp one shard and trim to one line/object
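
Trimming a downloaded shard to a single row can be done with a few lines of code. The sketch below assumes the shard has already been read into a string; `firstNdjsonRow` is a hypothetical helper name.

```javascript
// Hypothetical helper: keep only the first non-empty line of an NDJSON shard
// and return it as pretty-printed JSON.
function firstNdjsonRow(text) {
  const line = text.split(/\r?\n/).find((l) => l.trim() !== "");
  if (line === undefined) throw new Error("Shard contains no rows");
  return JSON.stringify(JSON.parse(line), null, 2);
}
```

The first row of a shard is not guaranteed to belong to your test entity, so confirm the key before committing it.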

C. INFORMATION_SCHEMA (structure of many tables)

To document the structure of one or more tables in a dataset without row data (drop the WHERE clause to cover every table):

SELECT table_name, column_name, data_type, is_nullable
FROM `creator-rank.dataset.INFORMATION_SCHEMA.COLUMNS`
WHERE table_name IN ('your_table')
ORDER BY table_name, ordinal_position;

Save results as bigquery/information_schema-<dataset>.csv or .md table.

Post-export cleanup

Drop or mask columns that are PII or contractual secrets.
If the row is wide, split into logical JSON files (e.g. profile.json, metrics_daily.json) and reference the shared key in README.md.
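
Splitting a wide row might look like the sketch below. The `splitRow` name and the shape of `groups` (output filename mapped to a list of column names) are assumptions for illustration; the shared key is copied into every part so the files can be joined again.

```javascript
// Hypothetical helper: `groups` maps an output filename to the columns it should hold.
function splitRow(row, sharedKey, groups) {
  const files = {};
  for (const [filename, columns] of Object.entries(groups)) {
    // Every part carries the shared key so the files can be re-joined.
    const part = { [sharedKey]: row[sharedKey] };
    for (const col of columns) {
      if (col in row) part[col] = row[col];
    }
    files[filename] = part;
  }
  return files;
}
```

For example, `splitRow(row, "creator_id", { "profile.json": ["handle"], "metrics_daily.json": ["views"] })` returns one object per file, each including `creator_id`. The column names in this example are placeholders.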

Suggested filenames (per area)

Type	Example filename
Firebase doc body	`firebase/firestore-<collection>-example.json`
Firebase path index	`firebase/paths.md`
BQ schema	`bigquery/<table>-schema.json` or `-schema.md`
BQ sample row	`bigquery/<table>-example-row.json`
Lineage note	`SOURCE.md` (at area root or next to samples)

Aligning Firebase and BigQuery in one area

Pick one stable key (e.g. internal creator id, sponsor brandId).
Export Firebase doc(s) for that key.
Run BigQuery SELECT * ... WHERE <key> = ... LIMIT 1 for tables that power the same API surface.
In that area’s README.md, add a short Mapping section: which API fields come from which Firestore path vs which BQ column(s).

This makes “intended vs actual” comparisons tractable for agents and code reviews.
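
As a rough sketch of how the Mapping section could be checked against the exported samples, the helper below compares the two sources field by field. The `compareSources` name and the `mapping` entry shape (`apiField`, `firestoreField`, `bqColumn`) are hypothetical. The comparison is deliberately naive, so values that differ only in type between the two exports (for example a number versus its string form) are reported as mismatches.

```javascript
// Hypothetical helper: `mapping` has one entry per API field, naming the
// Firestore field and BigQuery column expected to hold the same value.
function compareSources(firestoreDoc, bqRow, mapping) {
  return mapping.map(({ apiField, firestoreField, bqColumn }) => {
    const fromFirestore = firestoreDoc[firestoreField];
    const fromBigQuery = bqRow[bqColumn];
    return {
      apiField,
      firestoreField,
      bqColumn,
      // JSON comparison is a simplification; it is sensitive to types and key order.
      matches: JSON.stringify(fromFirestore) === JSON.stringify(fromBigQuery),
    };
  });
}
```

Rows where `matches` is false are the ones worth explaining in the README.md Mapping section.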
