Coming Soon — This page describes an architecture that is currently in development and not yet generally available. Contact us to learn more.
Phase 2 extends the zero-ingestion model beyond structured warehouse data to semi-structured API sources. Building on the credential delegation foundation from Phase 1, this phase applies the same security principles to SaaS applications like Gong, Google Calendar, and Zendesk — enabling natural language queries against data that doesn’t exist in traditional tables.

Beyond Tables: Dynamic API Extraction

Not all valuable business data lives in warehouses. Critical insights often reside in SaaS applications accessible only via APIs, such as sales call recordings in Gong, meeting schedules in Google Calendar, and support tickets in Zendesk. AstroBee can extract data from these sources on the fly, transform API responses into ephemeral tables, and include them in the semantic layer for unified analytics.

How API Extraction Works

1. Schema Discovery: AstroBee’s AI agent analyzes API documentation and sample responses to understand available data structures.
2. Extraction Logic Generation: For each user query, the agent generates appropriate API calls with filters, pagination, and field selection.
3. Response Transformation: JSON/XML responses are flattened into tabular format with inferred column types.
4. Semantic Mapping: Extracted tables are mapped to business entities (e.g., Gong calls → SalesCall entity with properties like duration, sentiment_score, participants).
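
To make the Response Transformation step concrete, here is a minimal sketch of flattening nested JSON records into rows and inferring a coarse type per column. It is an illustration under assumptions, not AstroBee's implementation: the function names, the underscore-joined column naming, the fallback-to-text rule, and the sample payload are all hypothetical.

```python
from typing import Any


def flatten(record: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts into one level with underscore-joined keys."""
    row: dict[str, Any] = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            row.update(flatten(value, prefix=f"{name}_"))
        else:
            # Lists are kept as-is; a fuller pipeline might split them into child tables.
            row[name] = value
    return row


def infer_column_types(rows: list[dict[str, Any]]) -> dict[str, str]:
    """Infer a coarse type per column from the non-null values seen."""
    types: dict[str, str] = {}
    for row in rows:
        for column, value in row.items():
            if value is None:
                continue
            seen = type(value).__name__
            previous = types.get(column)
            if previous is None or previous == seen:
                types[column] = seen
            elif {previous, seen} == {"int", "float"}:
                types[column] = "float"  # widen mixed numeric columns
            else:
                types[column] = "str"  # assumed rule: fall back to text on conflict
    return types


# Hypothetical API payload; the field names are illustrative only.
payload = [
    {"id": "c_123", "duration": 45, "analysis": {"sentiment_score": -0.3}},
    {"id": "c_456", "duration": 30.5, "analysis": {"sentiment_score": -0.5}},
]
rows = [flatten(item) for item in payload]
column_types = infer_column_types(rows)
```

In this sketch the `duration` column is widened to `float` because it holds both integer and fractional values, which is one simple way to keep a column's type consistent across rows.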

Example: Gong Integration

Gong provides conversation intelligence — recordings, transcripts, and AI analysis of sales calls. AstroBee can extract:
| Gong Data | Semantic Entity | Use Cases |
| --- | --- | --- |
| Call recordings | SalesCall | Call volume, duration trends |
| Transcripts | CallTranscript | Keyword analysis, talk ratio |
| Sentiment scores | CallSentiment | Deal health indicators |
| Deal intelligence | DealSignal | Risk identification |

Sample Queries Enabled

  • “Which reps have the highest talk-to-listen ratio?”
  • “Show deals where competitor X was mentioned in calls”
  • “Compare call sentiment trend vs. pipeline progression”

Walkthrough: “Show calls where sentiment was negative”

1. Agent identifies data needs
   Gong API: Call recordings with sentiment analysis
2. Extraction logic generated
   GET /v2/calls?fromDateTime=2024-01-01&toDateTime=2024-03-31
   Fields extracted: call_id, duration, sentiment_score, participants, deal_id
3. Results transformed to table

   | call_id | duration | sentiment | deal_id |
   | --- | --- | --- | --- |
   | c_123 | 45min | -0.3 | opp_abc |
   | c_456 | 30min | -0.5 | opp_def |

4. Final results displayed
   Results displayed with call links and context
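
The walkthrough can be sketched in a few lines: build the date-filtered request from step 2, then keep only rows with negative sentiment from the step 3 table. This is a hedged illustration. The host is a placeholder, the helper names are hypothetical, "negative" is assumed to mean a score below zero, and the third row is invented purely to show the filter excluding something.

```python
from urllib.parse import urlencode

# Placeholder host; the real base URL is not given on this page.
BASE_URL = "https://api.example.com"


def build_calls_url(from_date: str, to_date: str) -> str:
    """Build the step 2 request URL with the date-range filter."""
    query = urlencode({"fromDateTime": from_date, "toDateTime": to_date})
    return f"{BASE_URL}/v2/calls?{query}"


def negative_calls(rows: list[dict], threshold: float = 0.0) -> list[dict]:
    """Keep rows whose sentiment is below the threshold (assumed: negative means < 0)."""
    return [row for row in rows if row["sentiment"] < threshold]


# Rows mirroring the step 3 table, plus one hypothetical positive call for contrast.
table = [
    {"call_id": "c_123", "duration_min": 45, "sentiment": -0.3, "deal_id": "opp_abc"},
    {"call_id": "c_456", "duration_min": 30, "sentiment": -0.5, "deal_id": "opp_def"},
    {"call_id": "c_789", "duration_min": 20, "sentiment": 0.4, "deal_id": "opp_ghi"},
]

url = build_calls_url("2024-01-01", "2024-03-31")
result = negative_calls(table)
```

Filtering after extraction, as done here, is the simplest option; whether the source API can filter by sentiment server-side is not stated on this page.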

Example: Google Calendar Integration

Google Calendar contains meeting data that reveals collaboration patterns and time allocation. AstroBee can extract:
| Calendar Data | Semantic Entity | Use Cases |
| --- | --- | --- |
| Events | Meeting | Meeting load analysis |
| Attendees | MeetingParticipant | Collaboration patterns |
| Response status | MeetingResponse | Engagement metrics |
| Recurring events | MeetingSeries | Time commitment analysis |

Sample Queries Enabled

  • “How many hours per week do sales reps spend in meetings?”
  • “Which team members have the most external meetings?”
  • “Show my meeting load trend over the past quarter”
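
As an illustration of how a query like "hours per week in meetings" could be answered once events are in tabular form, the sketch below totals meeting durations per ISO week. The event shape (ISO 8601 `start` and `end` strings) and the function name are assumptions for this example, not the Google Calendar response format or AstroBee's schema.

```python
from collections import defaultdict
from datetime import datetime


def weekly_meeting_hours(events: list[dict]) -> dict[str, float]:
    """Sum meeting durations in hours, grouped by the ISO week of the start time."""
    totals: dict[str, float] = defaultdict(float)
    for event in events:
        start = datetime.fromisoformat(event["start"])
        end = datetime.fromisoformat(event["end"])
        iso = start.isocalendar()  # (ISO year, ISO week, ISO weekday)
        week = f"{iso[0]}-W{iso[1]:02d}"
        totals[week] += (end - start).total_seconds() / 3600
    return dict(totals)


# Hypothetical extracted events; only start and end times are needed here.
events = [
    {"start": "2024-03-04T09:00:00", "end": "2024-03-04T10:00:00"},
    {"start": "2024-03-06T14:00:00", "end": "2024-03-06T14:30:00"},
    {"start": "2024-03-12T11:00:00", "end": "2024-03-12T12:30:00"},
]
hours_by_week = weekly_meeting_hours(events)
```

A per-rep breakdown would add the attendee as a second grouping key; that is omitted to keep the sketch short.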

API Source Security Model

API sources follow the same credential delegation model as warehouses:
  • Per-user OAuth tokens: Each user authenticates with Gong/Google using their own account
  • Native permissions enforced: If a user can’t access certain calls in Gong, AstroBee can’t extract them on that user’s behalf
  • Scoped access: OAuth scopes limit AstroBee to read-only data access
  • Token encryption: API tokens are stored with the same AES-256 encryption as warehouse credentials
  • Audit trail: All API extractions are logged with user identity and data accessed
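
The sketch below shows one way the per-user token and audit trail principles could fit together: every extraction runs with the requesting user's own token, and each call appends an audit record. All names here (`TokenStore`, `extract_for_user`, `MissingTokenError`, the audit fields) are hypothetical, and token encryption at rest is deliberately left out.

```python
import json
from datetime import datetime, timezone
from typing import Any, Callable

# Every name in this sketch is hypothetical; it is not AstroBee's internal API.
TokenStore = dict[tuple[str, str], str]  # (user_id, source) -> OAuth access token


class MissingTokenError(Exception):
    """Raised when a user has not connected their own account for a source."""


def extract_for_user(
    user_id: str,
    source: str,
    resource: str,
    tokens: TokenStore,
    fetch: Callable[[str, str], list[dict[str, Any]]],
    audit_log: list[str],
) -> list[dict[str, Any]]:
    """Fetch a resource with the requesting user's token and record an audit entry."""
    token = tokens.get((user_id, source))
    if token is None:
        # No shared service-account fallback: without a personal token nothing is extracted.
        raise MissingTokenError(f"{user_id} has not authorized {source}")

    # The source API applies its native permissions to this token.
    records = fetch(token, resource)

    audit_log.append(json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user_id,
        "source": source,
        "resource": resource,
        "record_count": len(records),
    }))
    return records
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP client and makes the permission boundary explicit: the function can return only what the source API returns for that user's token.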

Rate Limiting & Caching

  • AstroBee respects API rate limits (e.g., Gong: 10 requests/second)
  • Extracted data is cached briefly (5–15 minutes) to avoid redundant API calls
  • The cache is invalidated on user request or when a query requires fresher data
  • Large extractions are paginated automatically to avoid timeouts
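
The caching and rate-limiting behavior above can be sketched with a small time-to-live cache and a limiter that spaces requests evenly. This is a minimal illustration under assumptions: the class names are hypothetical, the limiter uses simple fixed spacing rather than whatever strategy AstroBee actually uses, and the clock and sleep functions are injectable only to make the sketch easy to test.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Cache values for a fixed number of seconds, with explicit invalidation."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries: dict[str, tuple[float, Any]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[], Any]) -> Any:
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]  # still fresh: skip the API call
        value = fetch()
        self._entries[key] = (now, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop an entry, e.g. when the user asks for fresh data."""
        self._entries.pop(key, None)


class RateLimiter:
    """Space calls at least 1 / max_per_second seconds apart."""

    def __init__(
        self,
        max_per_second: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ):
        self.min_interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> None:
        """Block until the next request is allowed."""
        now = self.clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self._last = now


# Example configuration: a 5-minute cache and the 10 requests/second limit cited above.
cache = TTLCache(ttl_seconds=300)
limiter = RateLimiter(max_per_second=10)
```

A caller would invoke `limiter.wait()` before each API request and wrap the request in `cache.get_or_fetch(key, ...)`, using a key derived from the user and the query so that cached results are never shared across users with different permissions.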

Next Steps