Troubleshooting and limitations
AstroBee is powerful, but like any tool, it has limitations and sometimes needs guidance. Here’s an honest look at common issues and how to resolve them.
What AstroBee doesn’t do
Data limitations
- Real-time streaming data: AstroBee works with batch data refreshed at 15-minute intervals, not live streams
- Complex statistical modeling: Use R, Python, or specialized tools for advanced statistics, machine learning, or predictive modeling
- Data transformation: Connect AstroBee to your cleaned, processed data—it doesn’t replace your ETL pipeline
- Very large datasets: Performance may degrade with tables over 10M rows without proper indexing
Query limitations
- Highly technical queries: AstroBee does not handle complex statistical functions, advanced window operations, or database-specific optimizations
- Real-time alerts: AstroBee analyzes data on demand; it does not monitor continuously
- Cross-database joins: Cannot join data across different database connections in a single query
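One possible workaround for the cross-database limit is to ask one question per connection and combine the results outside AstroBee. The following is a minimal sketch, assuming you can get each result set out as rows (for example, from a CSV export of the source data). The helper `join_on_key`, the `customer_id` key, and the sample rows are all hypothetical and are not AstroBee output or part of any AstroBee API.

```python
def join_on_key(left_rows, right_rows, key):
    """Inner-join two lists of dicts on a shared key (hypothetical helper)."""
    # Index the right-hand rows by key so each lookup is a dictionary access.
    right_by_key = {}
    for row in right_rows:
        right_by_key.setdefault(row[key], []).append(row)

    joined = []
    for left in left_rows:
        for right in right_by_key.get(left[key], []):
            # Merge the two rows; right-hand values win on name clashes.
            joined.append({**left, **right})
    return joined


# Illustrative rows standing in for results from two separate connections.
orders = [
    {"customer_id": 1, "revenue": 120.0},
    {"customer_id": 2, "revenue": 80.0},
    {"customer_id": 3, "revenue": 45.0},
]
accounts = [
    {"customer_id": 1, "segment": "enterprise"},
    {"customer_id": 2, "segment": "smb"},
]

combined = join_on_key(orders, accounts, "customer_id")
```

Rows without a match on the other side (customer 3 here) are dropped, which is the usual inner-join behavior. If you need to keep them, a left join would be the variant to write instead.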
Business context limitations
- Industry-specific terminology: May need guidance on domain-specific terms and metrics
- Complex business rules: Custom calculations with multiple exceptions require explicit definition
- Historical context: Doesn’t understand organizational changes, market events, or business context without being told
Common issues and solutions
“I don’t understand this question”
What this means: AstroBee couldn’t parse your question or map it to available data.
How to fix it:
- Be more specific: Instead of “How are we doing?” try “What was our revenue last month?”
- Use data terms: Reference actual table/column names if you know them
- Break it down: Split complex questions into simpler parts
- Check data availability: Ensure the data you’re asking about actually exists
Results seem wrong or unexpected
Common causes:
- Data quality issues: Missing, duplicate, or inconsistent source data
- Misunderstood context: AstroBee interpreted your question differently than intended
- Incomplete data: Query ran on partial dataset due to filters or date ranges
- Business logic differences: Your mental model differs from how data is structured
How to fix it:
- Check the SQL tab: Review the generated query to understand what AstroBee actually ran
- Verify data sources: Confirm you’re looking at the right tables and time periods
- Ask follow-up questions: “Why did you calculate it this way?” or “What data did you use?”
- Cross-reference: Compare with known results from other tools
- Check data freshness: Verify when data was last updated
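Cross-referencing is easier with an explicit tolerance, because small differences often come from rounding or refresh timing while larger ones point to a real discrepancy. This is a minimal sketch: the function name, the 1% tolerance, and both revenue figures are assumptions chosen for illustration.

```python
import math


def matches_reference(value, reference, rel_tol=0.01):
    """Return True if value is within rel_tol (default 1%) of reference."""
    return math.isclose(value, reference, rel_tol=rel_tol)


# Hypothetical figures: one read from an AstroBee answer,
# one taken from a finance report covering the same period.
astrobee_revenue = 104_250.00
finance_revenue = 104_900.00

consistent = matches_reference(astrobee_revenue, finance_revenue)
```

Pick the tolerance per metric. A revenue total may warrant a tighter bound than an estimated conversion rate.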
Slow query performance
Common causes:
- Large datasets: Queries spanning millions of rows without proper filtering
- Complex joins: Multiple table relationships without indexes
- Broad time ranges: Querying years of data when months would suffice
- Database load: High concurrent usage on source databases
How to fix it:
- Be more specific: Add time ranges, filters, or limits to your questions
- Start small: Begin with recent data, then expand if needed
- Check database performance: Ensure source systems aren’t overloaded
- Use aggregated data: Ask for summaries rather than detailed breakdowns when possible
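The difference between a broad and a narrow question shows up in the SQL that runs against your database. The sketch below uses Python’s built-in `sqlite3` module with a hypothetical `orders` table to contrast the two shapes. The table, columns, and index name are assumptions, and the SQL AstroBee generates for your schema will differ.

```python
import sqlite3

# In-memory database with a small, made-up orders table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_date TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("2023-11-02", "EU", 100.0),
        ("2024-01-15", "EU", 250.0),
        ("2024-01-20", "US", 300.0),
        ("2024-02-03", "US", 150.0),
    ],
)
# An index on the filter column lets the database skip rows outside the range.
conn.execute("CREATE INDEX idx_orders_date ON orders (order_date)")

# Broad: every row and every column, with no time range.
broad = conn.execute("SELECT * FROM orders").fetchall()

# Narrow: one month of data, aggregated to one row per region.
narrow = conn.execute(
    """
    SELECT region, SUM(amount) AS revenue
    FROM orders
    WHERE order_date >= ? AND order_date < ?
    GROUP BY region
    ORDER BY region
    """,
    ("2024-01-01", "2024-02-01"),
).fetchall()
```

The narrow query returns one row per region instead of every order, which is roughly what a question such as “What was revenue by region in January 2024?” should produce.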
Charts don’t display properly
Common causes:
- Too many data points: Charts with 1000+ categories become unreadable
- Mixed data types: Combining text and numbers inappropriately
- Missing data: Null values or gaps in time series
- Scale issues: Very large or very small numbers
How to fix it:
- Limit results: Ask for “top 10” or “last 30 days” instead of all data
- Group data: Request summaries by month/quarter instead of daily
- Specify chart type: “Show as a bar chart” or “create a line graph”
- Handle nulls: Ask AstroBee to “exclude missing values” or “show only complete data”
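If you are charting exported data yourself, the same three ideas (limit, group, drop nulls) can be applied before plotting. This is a small sketch using only the standard library. The helper names and sample values are hypothetical, and it assumes dates arrive as ISO `YYYY-MM-DD` strings.

```python
from collections import defaultdict


def top_n(totals, n=10):
    """Keep the n largest categories, largest first."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


def by_month(daily):
    """Sum (ISO date string, value) pairs into 'YYYY-MM' buckets, skipping nulls."""
    monthly = defaultdict(float)
    for day, value in daily:
        if value is None:  # exclude missing values instead of plotting gaps
            continue
        monthly[day[:7]] += value  # first 7 characters are 'YYYY-MM'
    return dict(sorted(monthly.items()))


# Made-up daily series with one missing value.
daily_sales = [
    ("2024-01-05", 10.0),
    ("2024-01-19", None),
    ("2024-01-28", 5.0),
    ("2024-02-02", 7.5),
]
monthly_sales = by_month(daily_sales)

# Made-up category totals, trimmed to the two largest.
category_totals = {"A": 40, "B": 75, "C": 12}
top_two = top_n(category_totals, n=2)
```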
Performance expectations
Response times
- Simple queries (single table, basic filters): 2-5 seconds
- Medium complexity (joins, aggregations): 5-15 seconds
- Complex analysis (multiple dimensions, large datasets): 15-45 seconds
- Very complex (advanced analytics, huge datasets): 1-3 minutes
Data size recommendations
- Optimal: Tables with 100K - 1M rows
- Good: Up to 10M rows with proper indexing
- Possible but slower: 10M+ rows (consider data sampling)
- Not recommended: 100M+ rows without aggregation
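The tiers above can be turned into a quick check to run against a table’s row count before connecting it. This is only a sketch of the thresholds listed here. The function name is hypothetical, and treating tables under 100K rows as “optimal” is an assumption, since this guide does not address them.

```python
def size_tier(row_count):
    """Map a table's row count to the tiers listed in this guide."""
    if row_count >= 100_000_000:
        return "not recommended without aggregation"
    if row_count > 10_000_000:
        return "possible but slower"
    if row_count > 1_000_000:
        return "good with proper indexing"
    # Assumption: anything up to 1M rows, including small tables, is optimal.
    return "optimal"


# Example: a 50M-row table falls in the "consider sampling" range.
tier = size_tier(50_000_000)
```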
Concurrent usage
- Single user: Full performance
- 2-5 users: Minimal impact
- 5-10 users: Possible slight delays
- 10+ users: May need upgraded infrastructure
Getting better results
Writing effective questions
Be specific about what you want: name the metric, the time range, and any filters.
Teaching AstroBee your business
- Define key metrics: Explain how you calculate important KPIs
- Provide context: Share information about business events, seasonality, or market changes
- Correct misunderstandings: When results seem off, explain what you expected and why
- Set expectations: Clarify what “good” or “normal” means for your metrics
Iterative analysis approach
- Start broad: Get an overview with simple questions
- Drill down: Ask follow-ups about interesting patterns
- Cross-validate: Check surprising results with additional queries
- Synthesize: Pull insights together across multiple questions
When to contact support
Definitely contact support for:
- System errors: Database connection failures, timeouts, or crashes
- Data security concerns: Questions about data access, storage, or privacy
- Account issues: Billing, user management, or access problems
- Feature requests: Capabilities you need that aren’t currently available
Try the troubleshooting steps in this guide first for:
- Unexpected results: Use the debugging steps above
- Slow performance: Check data size and query complexity
- Question parsing issues: Rephrase and try different approaches
Limitations by data source
BigQuery
- Query complexity: Subject to BigQuery’s SQL limitations and cost controls
- Data freshness: Depends on your BigQuery update schedule
- Partitioning: Large tables should be properly partitioned for performance
PostgreSQL
- Connection limits: Shared connection pool may limit concurrent queries
- Database load: Performance depends on PostgreSQL server capacity
- Custom functions: May not understand database-specific functions
CSV files
- File size: Uploads limited to 100MB per file
- Data types: Limited automatic type detection
- No relationships: Cannot infer connections between different CSV files
- No real-time updates: Must re-upload for fresh data
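Because type detection is limited and uploads are capped, a quick local check before uploading can save a round trip. The sketch below flags files over the size limit and columns that mix numbers with text. The function name is hypothetical, the 100MB limit is read as 100 × 1024 × 1024 bytes (an assumption), and the file is assumed to be UTF-8 with a header row.

```python
import csv
import os

# 100MB limit from this guide; the exact byte count is an assumption.
MAX_UPLOAD_BYTES = 100 * 1024 * 1024


def check_csv(path):
    """Return a list of human-readable problems found in the CSV at path."""
    problems = []
    if os.path.getsize(path) > MAX_UPLOAD_BYTES:
        problems.append("file exceeds 100MB")

    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # Track which kinds of values ("number" or "text") each column holds.
        kinds = {name: set() for name in (reader.fieldnames or [])}
        for row in reader:
            for name in kinds:
                value = row.get(name)
                if value is None or value.strip() == "":
                    continue  # blanks do not count toward a column's type
                try:
                    float(value)
                    kinds[name].add("number")
                except ValueError:
                    kinds[name].add("text")

    for name, seen in kinds.items():
        if len(seen) > 1:
            problems.append(f"column '{name}' mixes numbers and text")
    return problems
```

An empty list means no problems were found by these two checks. It does not guarantee that the upload will be typed the way you expect.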
Best practices summary
- Start simple: Begin with basic questions before complex analysis
- Be specific: Include time ranges, filters, and clear metrics
- Validate results: Cross-check surprising findings
- Teach the system: Provide feedback and business context
- Use appropriate data sources: Ensure data is clean and properly structured
- Monitor performance: Be mindful of query complexity and data size
- Ask for help: Use support when you’re stuck