Schema Mapping
Configuring how Firestore data maps to BigQuery columns
Schema mapping defines how Firestore documents become BigQuery rows. Each column in your BigQuery table has a source that determines where its data comes from.
Column Sources
When adding a column to your schema, you choose a source type that determines what data populates that column.
Document ID
Source: $documentId
The Firestore document’s ID (the last segment of the document path).
/users/abc123 → "abc123"
/orders/2024/items/xyz → "xyz"
Use this when you need to reference the original document or join with other data.
Document Path
Source: $documentPath
The full path to the Firestore document.
/users/abc123 → "users/abc123"
/orders/2024/items/xyz → "orders/2024/items/xyz"
Useful when you need the complete document reference, especially for documents in subcollections.
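The relationship between the two path-based sources can be sketched in a few lines of Python. This is only an illustration of the examples above, not Fireconduit's implementation; `document_path` and `document_id` are hypothetical helper names.

```python
def document_path(raw_path: str) -> str:
    """Return the document path without leading or trailing slashes."""
    return raw_path.strip("/")


def document_id(raw_path: str) -> str:
    """Return the last path segment, which is the document's ID."""
    return document_path(raw_path).split("/")[-1]


# The two example paths used in this section.
for raw in ["/users/abc123", "/orders/2024/items/xyz"]:
    print(document_path(raw), "->", document_id(raw))
```

Because the ID is only the last segment, two documents in different subcollections can share an ID; the full path is what distinguishes them.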
Document Data (JSON)
Source: $documentData
The entire Firestore document serialized as a JSON string.
{"name": "John", "email": "john@example.com", "age": 30}
Use this when you want to preserve all document data and parse it in BigQuery queries using JSON functions.
Update Time
Source: $documentUpdateTime
The Firestore document’s last update timestamp.
Create Time
Source: $documentCreateTime
The Firestore document’s creation timestamp.
Backfill Time
Source: $backfillTime
The timestamp when the backfill job processed this document. Useful for tracking when data was synced.
Generated (UUID)
Source: $generated
A randomly generated UUID for each row. Useful for creating unique row identifiers in BigQuery.
Null
Source: $constant
A constant null value. Used for placeholder columns (like old_data in the Firebase Extension schema).
Document Field
Source: $documentField
A specific field from the Firestore document, mapped to a typed BigQuery column. This is the most powerful source type, allowing you to extract and type individual fields.
Document Field Mapping
When mapping document fields, you specify:
- Field Path: The path to the field in the document (e.g., name, address.city, items[0].price)
- BigQuery Type: The data type for the column
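To make the field path syntax concrete, here is a minimal sketch of how a path such as items[0].price could be resolved against a document held as a Python dict. The function name `resolve_field_path` and the choice to return None for any missing step are assumptions made for illustration; Fireconduit's actual resolver may behave differently.

```python
import re
from typing import Any

# Matches either a plain key ("address") or an array index ("[0]").
_TOKEN = re.compile(r"([^.\[\]]+)|\[(\d+)\]")


def resolve_field_path(doc: dict, path: str) -> Any:
    """Walk `doc` along `path`; return None if any step is missing."""
    current: Any = doc
    for key, index in _TOKEN.findall(path):
        if key:
            # Map lookup: the current value must be a dict containing the key.
            if not isinstance(current, dict) or key not in current:
                return None
            current = current[key]
        else:
            # Array lookup: the current value must be a list long enough.
            i = int(index)
            if not isinstance(current, list) or i >= len(current):
                return None
            current = current[i]
    return current


doc = {
    "name": "John",
    "address": {"city": "San Francisco"},
    "items": [{"price": 9.99}],
}
print(resolve_field_path(doc, "address.city"))
print(resolve_field_path(doc, "items[0].price"))
print(resolve_field_path(doc, "address.zip"))
```

Returning None rather than raising mirrors the idea that an absent field ends up as NULL in the destination column.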
Supported BigQuery Types
| Type | Description | Firestore Source |
|---|---|---|
| STRING | Text data | String fields |
| INTEGER | Whole numbers | Number fields |
| FLOAT | Decimal numbers | Number fields |
| BOOLEAN | True/false | Boolean fields |
| TIMESTAMP | Date and time | Timestamp fields |
| DATE | Date only | Timestamp or string fields |
| BYTES | Binary data | Bytes fields |
| ARRAY_STRING | Array of strings | Array fields |
| ARRAY_INTEGER | Array of integers | Array fields |
| ARRAY_FLOAT | Array of floats | Array fields |
| ARRAY_BOOLEAN | Array of booleans | Array fields |
| STRUCT | Nested object | Map/object fields |
| ARRAY_STRUCT | Array of objects | Array of maps |
Nested Fields
For STRUCT and ARRAY_STRUCT types, you can define nested field mappings so that complex Firestore structures keep their shape while each nested field gets its own typed BigQuery column.
Example: a Firestore document with an address map:
{
  "address": {
    "street": "123 Main St",
    "city": "San Francisco",
    "zip": "94102"
  }
}
This document can be mapped to a BigQuery STRUCT column with typed nested fields for street, city, and zip.
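For reference, the address example corresponds roughly to the following column definition, written here as a Python dict in the shape of BigQuery's JSON schema format (where a STRUCT column is declared with type RECORD and a nested fields list). How Fireconduit itself stores this mapping is not shown in this guide, so treat the snippet as a sketch of the resulting table schema only.

```python
import json

# One STRUCT column with three typed nested fields.
address_column = {
    "name": "address",
    "type": "RECORD",  # BigQuery's schema-file name for a STRUCT column
    "mode": "NULLABLE",
    "fields": [
        {"name": "street", "type": "STRING", "mode": "NULLABLE"},
        {"name": "city", "type": "STRING", "mode": "NULLABLE"},
        {"name": "zip", "type": "STRING", "mode": "NULLABLE"},
    ],
}

# A table schema is a list of such column definitions.
print(json.dumps([address_column], indent=2))
```

Typing zip as STRING rather than INTEGER is deliberate: postal codes can start with zeros, which a numeric column would drop.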
Schema Design Best Practices
Choose the Right Approach
JSON Column (Document Data source):
- Pros: Simple, captures everything, flexible
- Cons: Requires JSON parsing in queries, no type safety
Typed Columns (Document Field source):
- Pros: Type safety, easier queries, better performance
- Cons: More setup, may miss unexpected fields
Hybrid:
- Include both a JSON column for completeness and typed columns for frequently queried fields
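A hybrid row can be pictured as one JSON string column next to a handful of typed columns. The sketch below assumes hypothetical column names (`document_id`, `data`) and a hypothetical helper `build_hybrid_row`; it only illustrates the shape of the row, not how Fireconduit builds it.

```python
import json


def build_hybrid_row(doc_id: str, doc: dict, typed_fields: list[str]) -> dict:
    """Combine a full JSON copy of the document with selected typed columns."""
    row = {
        "document_id": doc_id,
        # Full document as a JSON string, so nothing is lost.
        "data": json.dumps(doc, sort_keys=True),
    }
    for field in typed_fields:
        # Fields missing from the document become None (NULL in BigQuery).
        row[field] = doc.get(field)
    return row


doc = {"name": "John", "email": "john@example.com", "age": 30}
row = build_hybrid_row("abc123", doc, ["email", "age", "plan"])
print(row)
```

The typed columns serve everyday filters and joins, while the JSON column remains available for fields nobody planned for.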
Plan for Schema Evolution
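BigQuery tables have strict schemas. Keep in mind:
- Adding nullable columns is safe
- Removing columns is disruptive: expect to alter or recreate the table and to update any queries that reference the column
- Changing a column’s type is mostly unsupported; BigQuery allows only a few widening conversions in place (for example, INTEGER to FLOAT)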
Start with a flexible schema and add typed columns as your query patterns become clear.
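Before changing a mapping, it can help to compare the old and new column definitions and see which kind of change you are about to make. The following sketch uses a hypothetical helper, `classify_schema_changes`, that works on plain column-to-type dicts: added columns are the low-risk case, while removed or retyped columns need manual attention.

```python
def classify_schema_changes(
    old: dict[str, str], new: dict[str, str]
) -> dict[str, list[str]]:
    """Compare {column: type} maps and group the differences by kind."""
    return {
        "added": sorted(c for c in new if c not in old),
        "removed": sorted(c for c in old if c not in new),
        "retyped": sorted(c for c in old if c in new and old[c] != new[c]),
    }


old_schema = {"id": "STRING", "age": "INTEGER", "legacy": "STRING"}
new_schema = {"id": "STRING", "age": "FLOAT", "email": "STRING"}
print(classify_schema_changes(old_schema, new_schema))
```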
Handle Missing Fields
When a Firestore document doesn’t have a mapped field, BigQuery receives a NULL value. Ensure your queries handle NULLs appropriately.
Consider Query Patterns
Design your schema based on how you’ll query the data:
- Frequently filtered fields → typed columns with appropriate types
- Rarely accessed fields → JSON column, extracted at query time with JSON_EXTRACT or JSON_VALUE
- Join keys → typed STRING or INTEGER columns
Use Consistent Naming
Match BigQuery column names to Firestore field names when possible. If using different names, document the mapping for your team.
Validation
When configuring a schema, Fireconduit shows a validation query you can run against an existing BigQuery table to verify compatibility:
SELECT column1, column2, column3
FROM `project.dataset.table`
LIMIT 10
Run this query in the BigQuery Console before your first backfill to catch schema mismatches early.
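The idea behind the validation query is simple: selecting every mapped column by name fails immediately if any of them is missing from the table. A query of the same shape could be assembled as follows; `validation_query` is a hypothetical helper, and the exact query Fireconduit generates may differ.

```python
def validation_query(table: str, columns: list[str], limit: int = 10) -> str:
    """Build a SELECT that names every mapped column explicitly."""
    column_list = ", ".join(columns)
    return f"SELECT {column_list}\nFROM `{table}`\nLIMIT {limit}"


print(validation_query("project.dataset.table", ["column1", "column2", "column3"]))
```

The small LIMIT keeps the number of returned rows low, which is enough here because a missing column makes the query fail regardless of how many rows it would return.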