Coming Soon — This page describes an architecture that is currently in development and not yet generally available. Contact us to learn more.
How It Works
Query Execution Paths
Query execution follows three paths depending on what the task touches:

| Path | When | What Happens |
|---|---|---|
| A — Direct | Single warehouse | Dialect-specific SQL executes on source compute; only the result set crosses the boundary. |
| B — Extract | Single API | Call API with user’s token, transform response into tabular format in an ephemeral workspace, query, destroy. |
| C — Federate | Multiple sources | Push filtered queries to each source in parallel, load partial results into an ephemeral workspace, join locally, destroy. |
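
The path selection in the table above can be sketched as a small routing function. This is a minimal illustration, not AstroBee's implementation: the `Path` enum, the `"warehouse"` and `"api"` source kinds, and `choose_path` are all hypothetical names introduced here.

```python
from enum import Enum


class Path(Enum):
    DIRECT = "A"    # single warehouse: run SQL on source compute
    EXTRACT = "B"   # single API: extract into an ephemeral workspace
    FEDERATE = "C"  # multiple sources: parallel pushdown, local join


# Hypothetical source kinds; the real system may model sources differently.
WAREHOUSE = "warehouse"
API = "api"


def choose_path(source_kinds: list[str]) -> Path:
    """Pick an execution path from the kinds of sources a task touches."""
    if not source_kinds:
        raise ValueError("a task must touch at least one source")
    if len(source_kinds) > 1:
        # Any task spanning more than one source is federated.
        return Path.FEDERATE
    kind = source_kinds[0]
    if kind == WAREHOUSE:
        return Path.DIRECT
    if kind == API:
        return Path.EXTRACT
    raise ValueError(f"unknown source kind: {kind!r}")
```

For example, a task touching one warehouse and one API would be routed to Path C under these assumptions, because it spans more than one source.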
Security Model
The architecture is zero-trust by design: AstroBee assumes no inherent rights to customer data.

- No service accounts. Every query runs as a specific user with their permissions. If a user can’t access a table in Snowflake, they can’t access it through AstroBee, because the warehouse rejects the query.
- No permission duplication. No parallel ACL system to maintain or drift out of sync.
- Blast radius is per-user. A compromised token exposes only that user’s scope.
- Full audit trail. Every query is logged with user identity, source, SQL/API call, and timestamp.
- Data residency preserved. Customer data never leaves their cloud region.
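
The delegation and audit properties listed above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the product's API: `AuditRecord`, `run_as_user`, and `fake_warehouse` are hypothetical names, and the `allowed` field is an extra added here for illustration beyond the four logged fields named above. The key point is that the coordinator holds no grants of its own; the source decides, based on the user's token.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable


@dataclass(frozen=True)
class AuditRecord:
    user: str       # identity the query ran as
    source: str     # e.g. "snowflake"
    statement: str  # SQL text or API call
    timestamp: str  # ISO 8601, UTC
    allowed: bool   # assumption: extra field recording the source's decision


def run_as_user(user: str, token: str, source: str, statement: str,
                execute: Callable[[str, str], list],
                audit_log: list) -> list:
    """Run a statement with the user's own token and log the attempt."""
    allowed = False
    try:
        # The source enforces permissions; no check is duplicated here.
        result = execute(token, statement)
        allowed = True
        return result
    finally:
        # Logged whether the source accepted or rejected the query.
        audit_log.append(AuditRecord(
            user=user,
            source=source,
            statement=statement,
            timestamp=datetime.now(timezone.utc).isoformat(),
            allowed=allowed,
        ))


def fake_warehouse(token: str, statement: str) -> list:
    # Stand-in for a warehouse that enforces its own per-token grants.
    grants = {"alice-token": {"orders"}, "bob-token": set()}
    table = statement.split()[-1]
    if table not in grants.get(token, set()):
        raise PermissionError(f"access denied to {table}")
    return [("row",)]
```

In this sketch, a rejected query raises `PermissionError` from the source itself and still leaves an audit record, which mirrors the "warehouse rejects the query" behavior described above.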
Federation
Tasks that span systems are unanswerable by any single source. Take “Correlate Gong call sentiment with Salesforce win rates for deals closing this quarter”: Salesforce knows deal stages, Gong knows conversation quality, neither can answer alone. AstroBee extracts filtered subsets into an ephemeral workspace, joins on shared keys, computes the answer, and discards everything. Each source independently validates credentials, so there is no permission bypass through federation. AstroBee acts as a translator and coordinator, never as a database. Customer data stays where it is, governed by the systems that already protect it.
The Vision: A Unified Data Layer
The ultimate goal is to create a semantic data layer that spans multiple systems (warehouses, lakehouses, CRMs, and SaaS APIs), enabling organizations to ask questions across their entire data ecosystem without moving or duplicating data. This vision is achieved through a phased approach:

| Phase | Capability | Value |
|---|---|---|
| Phase 1 | Single structured data source | Foundation: Prove the credential delegation model with warehouse queries |
| Phase 2 | API data sources | Expansion: Extend to semi-structured data from SaaS applications |
| Phase 3 | Federated queries across systems | Vision: Unified analytics across all data sources |
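
The federated queries targeted in Phase 3 (Path C above) can be sketched with an in-memory SQLite database standing in for the ephemeral workspace. Everything here is an assumption for illustration: `fetch_deals` and `fetch_calls` are hypothetical stand-ins that return made-up sample rows in place of filtered Salesforce and Gong queries, and SQLite is simply a convenient local engine, not necessarily what AstroBee uses.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def fetch_deals(token: str) -> list:
    # Hypothetical stand-in for a filtered CRM query run with the user's token.
    return [("D1", "won"), ("D2", "lost"), ("D3", "won")]


def fetch_calls(token: str) -> list:
    # Hypothetical stand-in for a filtered call-analytics query.
    return [("D1", 0.5), ("D2", 0.25), ("D3", 1.0)]


def federate(token: str) -> dict:
    """Average call sentiment per deal outcome, joined in a throwaway workspace."""
    # Push the filtered queries to each source in parallel.
    with ThreadPoolExecutor() as pool:
        deals_future = pool.submit(fetch_deals, token)
        calls_future = pool.submit(fetch_calls, token)
        deals, calls = deals_future.result(), calls_future.result()

    # Ephemeral workspace: exists only in memory for the life of this call.
    workspace = sqlite3.connect(":memory:")
    try:
        workspace.execute("CREATE TABLE deals (deal_id TEXT, outcome TEXT)")
        workspace.execute("CREATE TABLE calls (deal_id TEXT, sentiment REAL)")
        workspace.executemany("INSERT INTO deals VALUES (?, ?)", deals)
        workspace.executemany("INSERT INTO calls VALUES (?, ?)", calls)
        # Join on the shared key and compute the answer locally.
        rows = workspace.execute(
            "SELECT d.outcome, AVG(c.sentiment) "
            "FROM deals d JOIN calls c ON c.deal_id = d.deal_id "
            "GROUP BY d.outcome"
        ).fetchall()
        return dict(rows)
    finally:
        # Destroy the workspace; nothing is persisted.
        workspace.close()
```

Closing the in-memory connection discards both partial result sets, so only the aggregated answer leaves the function.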
Deep Dives
Federated Query Layer
Core architecture, credential delegation, and the virtual semantic layer
API Data Sources
Dynamic extraction from Gong, Google Calendar, and other SaaS APIs
Cross-System Federation
Ephemeral joins across warehouses, lakehouses, and APIs
Security & Access Control
Multi-layer security model, zero-trust design, and compliance

