Core Concepts
Data Layer
A map of how your business concepts connect to your raw data and to each other. When you connect data sources to AstroBee, it automatically builds a data layer that lets you ask questions in plain language without knowing where the data lives or how tables join together.
You may also see “semantic layer” or “ontology” used in technical contexts. These all refer to the same thing.
Entity
A business concept represented as a table in your data layer. Entities are the “nouns” of your data: things like Customer, Order, Campaign, or Product. Each entity has properties that describe it and relationships that connect it to other entities.
Relationship
A connection between two entities that defines how they join. For example, a Customer entity might have a relationship to an Order entity, letting you ask questions like “which customers placed the most orders last month?” Relationships have a direction and a cardinality:
- One-to-Many: One customer can have many orders
- Many-to-One: Many orders belong to one customer
- Many-to-Many: Many campaigns can target many audiences
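To make the cardinality idea concrete, here is a minimal Python sketch of a One-to-Many relationship. The sample rows and field names are hypothetical and are not AstroBee's internal schema; the sketch only shows how following a shared key lets you answer a question like “which customers placed the most orders?”

```python
from collections import Counter

# Hypothetical sample rows; field names are illustrative only.
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Grace"},
]
orders = [
    {"order_id": 100, "customer_id": 1},
    {"order_id": 101, "customer_id": 1},
    {"order_id": 102, "customer_id": 2},
]

# One-to-Many: each order points back to exactly one customer,
# so counting orders per customer_id follows the relationship.
orders_per_customer = Counter(order["customer_id"] for order in orders)

# Look up customer names so the ranking is readable.
names = {c["customer_id"]: c["name"] for c in customers}
ranking = [(names[cid], count) for cid, count in orders_per_customer.most_common()]

print(ranking)
```

Reading the same link from the order side gives the Many-to-One view: every order resolves to a single customer.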
Properties
Every entity has properties that describe its characteristics. Properties come in two types:
Identifier
Properties that uniquely identify an instance of an entity. Think of these as the “who” or “which one” of your data. For example, a Customer entity might have customer_id and email as identifiers.
Identifiers form the primary key of an entity. When you group or filter data, you typically use identifiers.
Technical synonym: Domain Property
Descriptive Field
Properties that describe characteristics or quantifiable values of an entity. These are the “what” of your data: things like name, total_spend, created_date, or status.
Descriptive fields are what you typically measure, sum, average, or display in reports.
Technical synonym: Measure Property
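The split between the two property types can be sketched in a few lines of Python: you group by an identifier and aggregate a descriptive field. The rows and the `amount` field below are hypothetical, and `sum_by_identifier` is an illustrative helper, not an AstroBee function.

```python
from collections import defaultdict

# Hypothetical order rows: customer_id plays the identifier role,
# amount plays the descriptive-field role.
rows = [
    {"customer_id": "c1", "amount": 120.0},
    {"customer_id": "c2", "amount": 80.0},
    {"customer_id": "c1", "amount": 30.0},
]

def sum_by_identifier(rows, identifier, field):
    """Group rows by an identifier and sum a descriptive field."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[identifier]] += row[field]
    return dict(totals)

spend = sum_by_identifier(rows, "customer_id", "amount")
print(spend)
```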
Data Sources
Data Source
Raw source data before it becomes part of your data layer. This includes:
- CSV uploads: Files you upload directly
- Connected sources: HubSpot, Salesforce, Google Ads, etc.
- Data warehouses: BigQuery, PostgreSQL, Snowflake
Pattern
A discovered join rule between tables. When AstroBee builds your data layer, the Pattern Extraction Agent looks for ways to connect tables from different sources, like matching customer_id in one table to cust_id in another.
Patterns can be:
- Value overlap: Direct matches between columns
- Keyword context: Values embedded in text fields
- Regex: Complex extraction patterns
- Composite: Multiple fields combined
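As a rough illustration of two of these pattern types, the sketch below measures value overlap between two columns and pulls an ID out of free text with a regex. How AstroBee actually detects and scores patterns is not described here, so the function names, the `ORD-` ID format, and the sample values are all assumptions.

```python
import re

def overlap_fraction(source_values, target_values):
    """Fraction of distinct source values that also appear in the target column."""
    source, target = set(source_values), set(target_values)
    if not source:
        return 0.0
    return len(source & target) / len(source)

def extract_order_id(text):
    """Regex pattern: find a hypothetical 'ORD-<digits>' ID embedded in text."""
    match = re.search(r"ORD-\d+", text)
    return match.group(0) if match else None

# Value overlap: customer_id in one table vs. cust_id in another (sample data).
crm_customer_id = ["A1", "A2", "A3", "A4"]
billing_cust_id = ["A2", "A3", "A4", "B9"]
overlap = overlap_fraction(crm_customer_id, billing_cust_id)

# Regex: an ID buried in a free-text note.
order_id = extract_order_id("Refund issued for ORD-1042 yesterday")

print(overlap, order_id)
```

A high overlap fraction suggests the two columns hold the same kind of value, which is the intuition behind a value-overlap join.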
Pattern Extraction Agent
An AI agent that automatically discovers how your tables connect by analyzing column names, values, and patterns. It runs as a sub-agent when you first connect data, looking for relationships you might not have explicitly defined. You can review discovered patterns in the Discovered Joins tab when building your data layer.
Entity Resolution
The process of matching records that refer to the same real-world thing across different systems. For example, matching a contact's email address in HubSpot to the same address in Stripe, or recognizing that “Acme Corp” and “ACME Corporation” are the same company.
AstroBee’s AI applies resolution techniques like email matching, ID matching, and fuzzy name matching when building your data layer. This happens automatically during pattern extraction.
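Fuzzy name matching can be approximated with Python's standard `difflib` module. This is only a sketch of the general technique, not AstroBee's resolution logic: the `is_same_company` helper and the 0.7 threshold are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

def name_similarity(a: str, b: str) -> float:
    """Similarity between two names, ignoring case and surrounding whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()

def is_same_company(a: str, b: str, threshold: float = 0.7) -> bool:
    """Treat two names as the same company above a hypothetical threshold."""
    return name_similarity(a, b) >= threshold

def is_same_email(a: str, b: str) -> bool:
    """Email matching: compare addresses after trimming and lowercasing."""
    return a.strip().lower() == b.strip().lower()

print(is_same_company("Acme Corp", "ACME Corporation"))
print(is_same_company("Acme Corp", "Globex Inc"))
```

A lower threshold catches more variants but also produces more false matches, so any real threshold would need tuning against your own data.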
Normalization
Cleaning and standardizing data formats so they can be compared and joined. Examples include:
- Converting phone numbers to a standard format (+1-555-1234 → 5551234)
- Normalizing email casing (mixed-case addresses become lowercase)
- Standardizing date formats
- Trimming whitespace and special characters
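Each of these cleanup steps is small on its own. The following Python sketch shows one plausible version of each; the function names, the choice to drop a leading `+1`, and the accepted date formats are assumptions, not AstroBee's actual rules.

```python
import re
from datetime import datetime

def normalize_phone(raw: str) -> str:
    """Drop a leading '+1' country code (assumption), then keep digits only."""
    without_country_code = re.sub(r"^\s*\+1", "", raw)
    return re.sub(r"\D", "", without_country_code)

def normalize_email(raw: str) -> str:
    """Trim surrounding whitespace and lowercase the address."""
    return raw.strip().lower()

def normalize_date(raw: str) -> str:
    """Convert a few assumed input formats to ISO 8601 (YYYY-MM-DD)."""
    for fmt in ("%Y-%m-%d", "%m/%d/%Y"):
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {raw!r}")

def normalize_whitespace(raw: str) -> str:
    """Collapse runs of whitespace and trim both ends."""
    return " ".join(raw.split())

print(normalize_phone("+1-555-1234"))
print(normalize_email("  Jane.Doe@Example.COM "))
print(normalize_date("03/15/2024"))
```

Once both sides of a join pass through the same normalizers, values that looked different on the surface can be compared directly.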
Discovered Joins
The tab shown during data layer generation that displays relationships the Pattern Extraction Agent found between your tables. Each discovered join shows:
- Source and target columns being connected
- The pattern type (exact match, fuzzy match, etc.)
- Confidence and coverage scores
- Sample matched values
Unmatched Fields
Fields from your source data that couldn’t be automatically joined to other tables during data layer generation. These appear in the Unmatched Fields tab. Unmatched fields aren’t necessarily a problem. They might be:
- Fields unique to one system with no counterpart elsewhere
- Internal IDs not meant for joining
- Fields that need manual mapping in the data layer editor
Interface
Data Map
A visual representation of how your data sources connect to each other. Access it via the Data Map tab in the left panel. The map shows:
- Each source as a node
- Lines connecting sources that share relationships
- Icons indicating the source type (HubSpot, Salesforce, CSV, etc.)
Answer Report
The panel that appears on the right side of the chat interface after AstroBee answers your question. It contains four tabs:
| Tab | What it shows |
|---|---|
| Results | Charts, tables, and business insights answering your question |
| Sources & Logic | How AstroBee interpreted your question and which entities it used |
| SQL | The exact SQL query generated from your question |
| Pipeline | Data processing steps from raw tables to final results |

