Sponsor Data Coverage Matrix (Discovery Only)

This document maps sponsor PRD requirements to currently available data sources. Scope:
  • Primary sponsor table: v1_brandReport
  • Cross-table dependencies checked: v1_Channel, v1_IGUser, v2_Creator
  • Status definitions:
    • Have: field is directly present in current storage
    • Derivable: field can be reliably computed from stored fields
    • Missing: not present and not safely derivable from current sources
Naming/normalization policy for this mapping:
  • YouTube creator identifier terminology: channelId
  • Instagram/TikTok creator identifier terminology: userId
  • Internal fields may still be cId, igId, etc.; this matrix focuses on API contract naming.

Source Evidence Snapshot

  • v1_brandReport includes brand identity and sponsor aggregates:
    • brandId, name, desc, logo, alias, companySize, competitors, industries, country, location, website, socialMedia
    • YouTube sponsor aggregates: channelNumber, videoNumber, videoNumber30d, countryCounts, langCounts, demographic, growth/stat objects
    • Instagram sponsor aggregates under instagram: sponsoredIgIds, channelNumber, postNumber, postNumber30d, countryCounts, langCounts, summary stat objects
    • Linkage lists: cIds (YouTube creators), brandIgIds and instagram.sponsoredIgIds (Instagram handles)
  • v1_Channel includes YouTube creator profile/performance metadata (cId, cName, countryCode, lang*, nicheIds, topics, demographic blocks, etc.).
  • v1_IGUser includes Instagram creator profile metadata (igId, igName, countryCode, lang, creatorId, profile/contact structures, etc.).
  • v2_Creator includes cross-platform creator identity/linking (creatorId, youtubePrimaryId, youtubeId, instagramPrimaryId, instagramId, tiktok*, profile summary).
Note:
  • The captured responses under internal/ai-context/sponsors/api-v3/*.json are 404 placeholders (the sponsor endpoints were not yet live when they were captured), so this pass relies on the PRD and table data only.

1) sponsor/search

Request Filters

PRD Filter | Status | Source / Reason
name, industry, country | Have | v1_brandReport.name, industries, country
sponsoringRegion | Have | v1_brandReport.country1st (and related country ratios/counts)
channelId (YouTube creator association) | Have | v1_brandReport.cIds
userId (Instagram creator association) | Have | v1_brandReport.brandIgIds + v1_brandReport.instagram.sponsoredIgIds
sponsoringPlatforms | Derivable | from presence of videoNumber > 0 and instagram.postNumber > 0
totalSponsoredContent | Derivable | videoNumber + instagram.postNumber
estimatedTotalSpend7d, estimatedTotalSpend30d, estimatedTotalSpend90d | Missing | PRD plan: daily rollups on v1_brandReport computed from per-video misc.calculated.ytVideoPrice.priceRaw (YouTube); Instagram pricing is not yet in production; filters aggregate per-platform totals once stored
hasActiveCampaign | Derivable | from videoNumber30d and/or instagram.postNumber30d
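
The three Derivable filters above can be computed from a single sponsor record. The following is a minimal sketch, not the service implementation: it assumes the record is available as a plain dict shaped like v1_brandReport, that missing counters should be treated as 0, and that "active" means any sponsored content in the last 30 days. The function name and the platform labels "youtube" and "instagram" are hypothetical.

```python
from typing import Any


def derive_search_fields(report: dict[str, Any]) -> dict[str, Any]:
    """Compute the Derivable search fields from one v1_brandReport-like dict."""
    instagram = report.get("instagram") or {}
    # Assumption: absent or null counters are treated as zero.
    yt_total = report.get("videoNumber") or 0
    ig_total = instagram.get("postNumber") or 0
    yt_30d = report.get("videoNumber30d") or 0
    ig_30d = instagram.get("postNumber30d") or 0

    platforms = []
    if yt_total > 0:
        platforms.append("youtube")  # hypothetical label
    if ig_total > 0:
        platforms.append("instagram")  # hypothetical label

    return {
        "sponsoringPlatforms": platforms,
        "totalSponsoredContent": yt_total + ig_total,
        # Assumption: any sponsored content in the 30d window counts as active.
        "hasActiveCampaign": yt_30d > 0 or ig_30d > 0,
    }
```

Keeping all three derivations in one function makes it easier to apply the same zero-handling rule to sponsor/search, sponsor/list, and sponsor/information.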

Response Fields

PRD Field | Status | Source / Reason
results[].brandId/name/logo/industries/country | Have | v1_brandReport direct fields
totalResults, offset, pageSize | Derivable | query/result metadata from service layer

2) sponsor/list

PRD Field | Status | Source / Reason
brands[].brandId/name/logo/industries/country | Have | v1_brandReport direct fields
brands[].sponsoringPlatforms | Derivable | derive from YouTube/Instagram sponsor counts
brands[].totalSponsoredContent | Derivable | videoNumber + instagram.postNumber
totalBrands | Derivable | count query over sponsor table
Pagination envelope (offset, pageSize) | Derivable | service layer metadata
Primary gap:
  • None for core list payload, if derivations are accepted as API-layer computations.

3) sponsor/information

PRD Field | Status | Source / Reason
brandId, name, alias, logo | Have | v1_brandReport direct
description | Derivable | from v1_brandReport.desc (rename)
companySize, keyPeople, industries, country, location, website, competitors | Have | v1_brandReport direct
socialMedia[] {platform,url} | Derivable | source is URL array in v1_brandReport.socialMedia; needs parser/normalizer
sponsoredContentYoutube, sponsoredContentYoutube30d | Have | videoNumber, videoNumber30d
sponsoredContentInstagram, sponsoredContentInstagram30d | Have | instagram.postNumber, instagram.postNumber30d
totalSponsoredContent | Derivable | sum of YouTube + Instagram totals
activePlatforms | Derivable | based on non-zero platform totals
Primary gap:
  • No blocker; mostly naming/shape normalization.
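
The socialMedia normalization noted in the table (URL array to { platform, url }) could look roughly like the sketch below. The host-to-platform table, the "other" fallback label, and the function name are all assumptions made for illustration; the real platform list would come from the parser spec once it is defined.

```python
from urllib.parse import urlparse

# Hypothetical host-to-platform table; not taken from the PRD.
_HOST_TO_PLATFORM = {
    "youtube.com": "youtube",
    "youtu.be": "youtube",
    "instagram.com": "instagram",
    "tiktok.com": "tiktok",
    "twitter.com": "twitter",
    "x.com": "twitter",
    "facebook.com": "facebook",
    "linkedin.com": "linkedin",
}


def normalize_social_media(urls: list[str]) -> list[dict[str, str]]:
    """Turn a raw URL array into a list of {platform, url} objects."""
    out = []
    for raw in urls:
        url = raw.strip()
        if not url:
            continue  # skip empty entries
        # urlparse only fills the host when a scheme is present.
        parsed = urlparse(url if "://" in url else "https://" + url)
        host = (parsed.hostname or "").lower()
        for prefix in ("www.", "m."):
            if host.startswith(prefix):
                host = host[len(prefix):]
        # Assumption: unknown hosts are kept and labeled "other".
        out.append({"platform": _HOST_TO_PLATFORM.get(host, "other"), "url": url})
    return out
```

A lookup table keyed on the normalized host keeps the parser deterministic: the same URL always maps to the same platform, regardless of path or query string.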

4) sponsor/creators

Summary Block

PRD Field | Status | Source / Reason
totalSponsoredCreators | Derivable | channelNumber + instagram.channelNumber
sponsoredCreatorsYoutube | Have | channelNumber
sponsoredCreatorsInstagram | Have | instagram.channelNumber
totalSponsoredContent, platform content totals | Have/Derivable | direct counts + sums from v1_brandReport
creatorLocationBreakdown (youtube/instagram top countries) | Have/Derivable | countryCounts and instagram.countryCounts
creatorLanguageBreakdown (youtube/instagram top languages) | Have/Derivable | langCounts and instagram.langCounts
estimatedTotalSpend30d (summary) | Missing | same rollup plan as search filters; not present in sampled v1_brandReport checks
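
As an illustration of the top-N country/language formatting, here is a small sketch. It assumes countryCounts and langCounts are stored as a mapping from code to count, which this matrix does not confirm, and the output shape (code, count, share) is hypothetical rather than the PRD contract.

```python
def top_n_breakdown(counts: dict[str, int], n: int = 5) -> list[dict[str, object]]:
    """Return the n largest entries with their share of the total.

    Assumption: `counts` maps a country or language code to a creator count.
    """
    total = sum(counts.values())
    if total <= 0:
        return []
    # Sort by count descending, then by code so that ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [
        {"code": code, "count": count, "share": round(count / total, 4)}
        for code, count in ranked[:n]
    ]
```

The same helper would serve both creatorLocationBreakdown and creatorLanguageBreakdown, once per platform.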

Creators Array

PRD Field | Status | Source / Reason
creatorId + platform | Have/Derivable | from sponsor link lists (cIds, sponsoredIgIds) with platform tagging
displayName, avatar, country, followers, engagementRate, topics, niches | Have (cross-table) | YouTube from v1_Channel; Instagram from v1_IGUser + potentially v2_Creator harmonization
sponsoredContent URLs | Missing (in checked scope) | requires sponsor-attributed content linkage table(s), not present in sampled files
sponsoredCount, lastSponsoredDate | Missing (in checked scope) | require content-level sponsorship attribution and timestamps
Primary gaps:
  • Sponsor-attributed per-creator content linkage and spend attribution.

5) sponsor/performance

Aggregated Platform Blocks

PRD Field Family | Status | Source / Reason
Content counts by window (sponsoredContent*) | Have/Partial | some direct (videoNumber, videoNumber30d, instagram.postNumber, instagram.postNumber30d); additional windows depend on whether corresponding stored fields are consistently available across data
Totals/statistics (views.totalViews / views30d, likes.totalLikes / likes30d, comments.totalComments / comments30d, avg/median/min/max, growth30d, estimatedTotalSpend*, platform-level estimatedCPM30d / estimatedCPE30d) | Have/Partial | YouTube: rich stat/growth in v1_brandReport; spend rollups are PRD/daily-worker outputs. Instagram: engagement stats Have/Partial; spend fields reserved (null) until pricing pipeline exists

sponsoredContent creator-grouped detail

PRD Field Family | Status | Source / Reason
creatorId, creatorDisplayName, platform | Have/Derivable | via join from sponsor creator IDs to v1_Channel / v1_IGUser
creatorSponsoredStats, creatorTotalStats, estimatedSpend | Missing/Partial | require sponsor-attributed content performance rollups and spend model outputs
content[] per-piece fields (contentId, publishTime, views windows, likes, comments, hashtags, etc.) | Missing (in checked scope) | requires sponsor-attributed content tables and historical windows not present in sampled sponsor/creator snapshots
Primary gaps:
  • Per-sponsor per-content attribution dataset and computed rollups.

6) sponsor/audience

PRD Field | Status | Source / Reason
YouTube audience demographics (audienceLocations, audienceGender, audienceAvgAge, audienceAgeBreakdown) | Have/Derivable | from v1_brandReport.demographic (country, gender, age distribution)
Instagram audience demographics (same shape) | Missing | no equivalent Instagram audience demographic block found in current sponsor sources
Primary gap:
  • Instagram audience aggregation pipeline for sponsored creator pool.

7) sponsor/submit

PRD Requirement | Status | Source / Reason
Persisted submissions + lifecycle (accepted / processing / done / rejected) | Missing | requires dedicated submission store
Duplicate detection by brand root domain | Derivable + Missing infra | derivable logic exists conceptually using brandId domain norms, but needs submit-path implementation and index
Abuse controls (disallowed, per-key daily limit) | Missing | requires policy/config + state store

Cross-Endpoint Naming Alignment Notes

  • For sponsor creator-association filters, keep published platform-native terminology:
    • YouTube: channelId
    • Instagram/TikTok: userId
  • Internal storage mappings:
    • YouTube channelId -> v1_Channel.cId and sponsor-side v1_brandReport.cIds[]
    • Instagram userId (handle) -> v1_IGUser.igId and sponsor-side v1_brandReport.brandIgIds[] / v1_brandReport.instagram.sponsoredIgIds[]
    • Cross-platform identity bridge -> v2_Creator (youtubePrimaryId, instagramId, etc.)
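
The storage mappings above can be captured as a small lookup table. This is only a sketch: the table and function names are hypothetical, and TikTok is deliberately left unmapped because this matrix names no TikTok storage fields.

```python
# Public (platform, filter name) -> internal storage paths, as listed above.
STORAGE_MAP = {
    ("youtube", "channelId"): {
        "creatorKey": "v1_Channel.cId",
        "sponsorLinks": ["v1_brandReport.cIds"],
    },
    ("instagram", "userId"): {
        "creatorKey": "v1_IGUser.igId",
        "sponsorLinks": [
            "v1_brandReport.brandIgIds",
            "v1_brandReport.instagram.sponsoredIgIds",
        ],
    },
}


def resolve_filter(platform: str, name: str) -> dict:
    """Return the storage paths behind a public creator-association filter."""
    try:
        return STORAGE_MAP[(platform, name)]
    except KeyError:
        # Unknown combinations fail loudly instead of silently matching nothing.
        raise ValueError(
            f"unsupported filter {name!r} for platform {platform!r}"
        ) from None
```

Keying on the (platform, name) pair means a request that sends channelId for Instagram, or userId for YouTube, is rejected rather than mapped to the wrong list.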

Prioritized Engineering Discovery Backlog (No Coding)

P0 — Contract + Mapping Clarity

  1. Finalize sponsor filter contract names in docs and implementation mapping:
    • accept canonical: channelId (YT), userId (IG/TikTok)
    • decide whether legacy aliases remain accepted and for how long.
  2. Publish canonical source map table:
    • each sponsor API field -> source field path(s) -> transform rule.

P1 — Derived Field Standardization

  1. Standardize socialMedia normalization:
    • URL array -> { platform, url } with deterministic platform parser.
  2. Standardize platform activity derivation:
    • activePlatforms, sponsoringPlatforms, hasActiveCampaign.
  3. Standardize aggregate derivations:
    • totalSponsoredContent, summary top-N country/language formatting.

P1 — Cross-Table Dependency Validation

  1. Validate join keys and quality:
    • sponsor (cIds, sponsoredIgIds) -> v1_Channel / v1_IGUser -> optional harmonization via v2_Creator.
  2. Define fallback rules when linked creator records are missing or stale.
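
To make the join and fallback question concrete, the sketch below resolves sponsor link lists against in-memory creator lookups. It assumes the creator tables have already been loaded into dicts keyed by cId and igId, and it uses a placeholder fallback (keep the row, mark it unlinked) since the actual fallback rules are still to be defined. The function name and the "linked" flag are hypothetical.

```python
from typing import Any

# Display-name field per platform, per the source evidence snapshot.
NAME_FIELD = {"youtube": "cName", "instagram": "igName"}


def join_sponsor_creators(
    report: dict[str, Any],
    channels_by_id: dict[str, dict[str, Any]],
    ig_users_by_id: dict[str, dict[str, Any]],
) -> list[dict[str, Any]]:
    """Join sponsor link lists to creator records, flagging missing ones."""
    instagram = report.get("instagram") or {}
    sources = (
        ("youtube", report.get("cIds") or [], channels_by_id),
        ("instagram", instagram.get("sponsoredIgIds") or [], ig_users_by_id),
    )
    rows = []
    for platform, ids, table in sources:
        for creator_id in ids:
            record = table.get(creator_id)
            rows.append({
                "platform": platform,
                "creatorId": creator_id,
                # Placeholder fallback: unlinked creators keep a null name.
                "displayName": record.get(NAME_FIELD[platform]) if record else None,
                "linked": record is not None,
            })
    return rows
```

Counting rows where linked is False gives a quick join-quality metric for the validation step.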

P2 — Missing Data for Creators/Performance Detail

  1. Identify authoritative sponsor-attributed content table(s) for:
    • per-creator sponsored content URLs/counts
    • per-content performance windows (7d/30d/90d)
    • latest sponsored timestamps
  2. Define spend attribution source for:
    • estimatedTotalSpend7d / estimatedTotalSpend30d / estimatedTotalSpend90d (platform summary + search)
    • per-creator estimatedSpend in performance (sum of priceRaw-family per video, rolling 30d).
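
For discussion purposes, a rolling-window spend rollup could be sketched as follows. This is not the planned daily worker: it assumes each sponsored video is available as a flat dict with a timezone-aware publishTime and a numeric priceRaw (flattened from misc.calculated.ytVideoPrice.priceRaw), and that a video belongs to a window based on its publish time. Those record shapes and the function name are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Any

WINDOWS_DAYS = (7, 30, 90)


def rollup_spend(videos: list[dict[str, Any]], now: datetime) -> dict[str, float]:
    """Sum priceRaw over videos published within each rolling window."""
    totals = {f"estimatedTotalSpend{d}d": 0.0 for d in WINDOWS_DAYS}
    for video in videos:
        price = video.get("priceRaw")
        published = video.get("publishTime")
        if price is None or published is None:
            continue  # unpriced or undated videos do not contribute
        age = now - published
        if age < timedelta(0):
            continue  # ignore timestamps in the future
        for days in WINDOWS_DAYS:
            if age <= timedelta(days=days):
                totals[f"estimatedTotalSpend{days}d"] += price
    return totals
```

Because the windows are nested, a video published three days ago contributes to all three totals, so the 7d value never exceeds the 30d value when prices are non-negative.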

P2 — Audience Parity

  1. Confirm current YouTube audience derivation contract from sponsor demographic.
  2. Design Instagram audience aggregation source and validation checks to reach schema parity with YouTube block.

P3 — Sponsor Submit Data Model

  1. Define submission entity schema (submissionId, status lifecycle, timestamps, actor key).
  2. Define dedupe index strategy (root-domain normalization).
  3. Define abuse controls and observability fields (per-key daily counter, disallowed flag, moderation notes).
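
A starting point for the root-domain dedupe key is sketched below. It uses a deliberately naive rule (keep the last two host labels), which is an assumption and is wrong for multi-part suffixes such as co.uk; a production version would need a Public Suffix List lookup. Both function names are hypothetical.

```python
from urllib.parse import urlparse


def root_domain_key(website: str) -> str:
    """Reduce a submitted website to a naive root-domain dedupe key."""
    url = website.strip().lower()
    # urlparse only fills the host when a scheme is present.
    parsed = urlparse(url if "://" in url else "https://" + url)
    host = (parsed.hostname or "").rstrip(".")
    labels = [part for part in host.split(".") if part]
    # Naive assumption: the root domain is the last two labels.
    return ".".join(labels[-2:])


def is_duplicate(website: str, existing_keys: set[str]) -> bool:
    """Check a submission against the set of already-known root-domain keys."""
    key = root_domain_key(website)
    return bool(key) and key in existing_keys
```

For example, root_domain_key("https://www.Acme.com/about") and root_domain_key("shop.acme.com") both reduce to "acme.com", so the second submission would be flagged as a duplicate of the first.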

Immediate “Have vs Need” Summary

Already strong today:
  • Sponsor identity/profile data (sponsor/information core)
  • Sponsor list/search base attributes
  • High-level YouTube + Instagram sponsorship aggregate counts
  • YouTube audience demographics
  • Cross-table creator identity foundations (v1_Channel, v1_IGUser, v2_Creator)
Need to add/confirm before full PRD parity:
  • Sponsor-attributed per-content dataset for sponsor/creators + sponsor/performance detailed sections
  • Spend attribution dataset/logic (estimatedTotalSpend* / per-creator estimatedSpend; YouTube from misc.calculated.ytVideoPrice.priceRaw; Instagram TBD)
  • Instagram audience demographics at sponsor-aggregate level
  • Sponsor submit persistence + dedupe + anti-abuse state model
Last modified on March 20, 2026