Why We Built Fireconduit
After years of working with Firebase extensions, we realized batch data pipelines need a different architecture. Here's the story behind Fireconduit.
If you’ve used the official Firebase BigQuery Extension, you know it’s a fantastic tool for streaming real-time changes from Firestore to BigQuery. But if you’ve ever needed to backfill historical data, you’ve probably hit some walls.
We did too. And after years of working around those limitations, we decided to build something better.
The Problem with Extensions for Batch Jobs
Firebase extensions are designed for event-driven workloads. They trigger on document writes and process changes one at a time. This architecture works beautifully for real-time sync, but it’s fundamentally mismatched for batch operations like backfilling millions of documents.
Here’s what we kept running into:
Scale Limitations
When you need to backfill a collection with millions (or billions) of documents, you can't simply fire off millions of Cloud Function invocations. The extension approach of processing documents one at a time doesn't scale for batch workloads: you hit invocation rate limits, pay cold-start overhead on every burst, and end up with unpredictable completion times.
Cloud Tasks Payload Limits
Under the hood, many extensions use Cloud Tasks to coordinate work. But Cloud Tasks caps each task payload at 1 MB. When you're dealing with large documents or complex nested structures, you hit that ceiling quickly. We've seen customers with perfectly reasonable Firestore documents that simply couldn't be processed through the extension pipeline.
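For a back-of-the-envelope feel for the problem, here's a small sketch (the document shape is hypothetical, not one of the customer cases above) showing how quickly a serialized Firestore document can blow past the 1 MB task ceiling:

```python
import json

# Cloud Tasks caps the task body at 1 MB (1,048,576 bytes).
CLOUD_TASKS_MAX_BYTES = 1_048_576

def payload_size(document: dict) -> int:
    """Size of the document once serialized into a task payload."""
    return len(json.dumps(document).encode("utf-8"))

# A hypothetical but plausible document: an array of 2,000 log
# entries at roughly 600 bytes each is already past the ceiling.
doc = {"events": [{"id": i, "note": "x" * 600} for i in range(2000)]}

size = payload_size(doc)
print(f"{size} bytes, over limit: {size > CLOUD_TASKS_MAX_BYTES}")
```

Nothing about that document violates Firestore's own limits (documents can be up to 1 MiB), yet once it's wrapped in a task payload with metadata, the coordination layer rejects it.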
IAM Complexity
Extensions run with their own service accounts and IAM bindings. For straightforward setups, this is fine. But enterprise customers with strict security requirements (VPCs, private networks, custom IAM policies) often find that extensions don’t fit their security model. The extension’s service account needs broad permissions that security teams are reluctant to grant.
What Developers Actually Need
After talking to dozens of teams struggling with these issues, a clear picture emerged. Developers need:
Confidence in their analytics pipelines. When you’re making business decisions based on BigQuery data, you need to trust that your backfills completed successfully, that no documents were missed, and that the data is accurate.
Predictable performance at scale. Whether you’re backfilling 10,000 documents or 10 billion, you need to know roughly how long it will take and have confidence it will complete.
Control over where data flows. For companies with data sovereignty requirements or strict compliance needs, knowing exactly where your data travels, and keeping it within your infrastructure, isn’t optional.
A Different Architecture
Fireconduit takes a fundamentally different approach. Instead of event-driven functions, we use Google Cloud Dataflow, a fully managed service built specifically for batch and streaming data processing at scale.
Dataflow jobs run entirely within your GCP project. Your Firestore data flows directly to BigQuery without passing through any third-party infrastructure. You get:
- Horizontal scaling: Dataflow automatically scales workers based on your data volume
- Exactly-once processing: No duplicate records, no missed documents
- Full observability: Monitor progress, see exactly how many documents were processed, and get detailed error reporting
- VPC support: Run jobs within your private network if needed
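To make the horizontal-scaling point concrete, here's a simplified sketch of the underlying idea: partition the collection's document IDs into disjoint ranges and process each range independently. This is an illustration of the technique, not Fireconduit's actual implementation; Dataflow handles the partitioning and work rebalancing for you, and the in-memory "collection" here stands in for Firestore.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for a Firestore collection:
# document IDs mapped to document data.
COLLECTION = {f"doc-{i:05d}": {"value": i} for i in range(10_000)}

def split_into_ranges(doc_ids: list[str], num_workers: int) -> list[list[str]]:
    """Partition sorted document IDs into disjoint, contiguous ranges.

    Each worker owns exactly one slice -- no overlap, no gaps -- which
    is the basis for processing every document exactly once.
    """
    doc_ids = sorted(doc_ids)
    chunk = -(-len(doc_ids) // num_workers)  # ceiling division
    return [doc_ids[i:i + chunk] for i in range(0, len(doc_ids), chunk)]

def process_range(doc_ids: list[str]) -> int:
    """Stand-in for 'read this ID range from Firestore, write to BigQuery'."""
    return sum(1 for doc_id in doc_ids if COLLECTION[doc_id] is not None)

ranges = split_into_ranges(list(COLLECTION), num_workers=8)
with ThreadPoolExecutor(max_workers=8) as pool:
    processed = sum(pool.map(process_range, ranges))

print(f"processed {processed} of {len(COLLECTION)} documents")
```

Because the ranges are disjoint and cover the whole ID space, adding workers shrinks wall-clock time without introducing duplicates or gaps; that same property is what makes progress reporting ("N of M documents processed") straightforward.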
Built for Enterprise Requirements
We’ve designed Fireconduit with enterprise needs in mind from day one:
Data sovereignty: Your data never leaves your GCP project. Fireconduit orchestrates the pipeline, but the actual data movement happens entirely within infrastructure you control.
Scale without limits: We’ve tested with collections containing billions of documents. Dataflow handles it. That’s what it was built for.
Security model flexibility: You configure the IAM permissions. You control the service accounts. Fireconduit works with your security requirements, not against them.
The Path Forward
We’re not trying to replace the Firebase BigQuery Extension. It’s excellent for what it does. But batch backfills are a different problem that deserves a purpose-built solution.
If you’ve been struggling with backfills, hitting scale limits, or looking for more control over your data pipeline, give Fireconduit a try. We built it because we needed it ourselves, and we think you might need it too.