Getting Started with XO-Data¶
This guide will help you set up your development environment and get started with the XO-Data platform.
Prerequisites¶
Before you begin, ensure you have:
- Python 3.12+ installed
- uv package manager
- Git for version control
- Access to required credentials (Snowflake, AWS, Gladly)
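You can quickly confirm the interpreter and Git are available before proceeding (uv is installed in the next step):

```shell
# Confirm prerequisite tools are on PATH
python3 --version   # expect Python 3.12 or newer
git --version
```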
Installation¶
1. Install uv Package Manager¶
# macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
# Windows
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
2. Clone the Repository¶
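A typical clone step looks like the following; the repository URL is a placeholder, so substitute your organization's actual remote:

```shell
git clone <repository-url>
cd xo-data
```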
3. Install Dependencies¶
# Install all workspace dependencies
uv sync
# This installs all packages:
# - xo-core (foundation utilities)
# - xo-foundry (orchestration)
# - xo-lens (analytics)
# - xo-bosun (monorepo CLI)
4. Verify Installation¶
# Check that packages are installed
uv tree
# Run type checking
uv run ty check --project packages/xo-core
# Run linting
uv run ruff check .
Environment Variables¶
Create a .env file in the repository root with required credentials:
# Snowflake
SNOWFLAKE_ACCOUNT=your_account
SNOWFLAKE_USER=your_user
SNOWFLAKE_PASSWORD=your_password
SNOWFLAKE_WAREHOUSE=your_warehouse
SNOWFLAKE_ROLE=your_role
# AWS (for S3 operations)
AWS_ACCESS_KEY_ID=your_key
AWS_SECRET_ACCESS_KEY=your_secret
AWS_DEFAULT_REGION=us-east-1
# Gladly API
GLADLY_API_USER=your_user
GLADLY_API_TOKEN=your_token
GLADLY_ORG=your_org
Important: Never commit the .env file. It's already in .gitignore.
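When reading these variables in Python, it helps to fail fast on anything missing rather than hit a confusing auth error later. The helper below is an illustrative sketch, not part of xo-core:

```python
import os

def require_env(name: str) -> str:
    """Return the value of a required environment variable, failing loudly if unset."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value

# The Snowflake settings listed above, for a fail-fast startup check.
SNOWFLAKE_VARS = [
    "SNOWFLAKE_ACCOUNT",
    "SNOWFLAKE_USER",
    "SNOWFLAKE_PASSWORD",
    "SNOWFLAKE_WAREHOUSE",
    "SNOWFLAKE_ROLE",
]
```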
Development Workflow¶
Working with Packages¶
# Install a specific package
uv sync --package xo-core
# Add a new dependency to a package
uv add --package xo-foundry pandas
# Run code from a specific package
uv run --package xo-core python -m xo_core.extractors.gladly_extractor
Code Quality Tools¶
# Format code (run before committing)
uv run ruff format .
# Check for linting issues
uv run ruff check .
# Fix auto-fixable linting issues
uv run ruff check --fix .
# Type checking (must pass with zero errors, run per package)
uv run ty check --project packages/xo-core
uv run ty check --project packages/xo-foundry
uv run ty check --project packages/xo-lens
uv run ty check --project packages/xo-bosun
Local Airflow Development¶
# Navigate to Airflow environment
cd apps/airflow/xo-pipelines
# Start local Airflow
astro dev start
# Access Airflow UI
open http://localhost:8080
# Username: admin
# Password: admin
# Stop Airflow
astro dev stop
Repository Structure Overview¶
xo-data/
├── packages/ # Reusable Python packages
│ ├── xo-core/ # Foundation utilities
│ ├── xo-foundry/ # Airflow orchestration & DAG Factory
│ ├── xo-lens/ # Analytics tools
│ └── xo-bosun/ # Monorepo navigation CLI
│
├── apps/ # Deployment targets
│ ├── airflow/xo-pipelines/ # Airflow deployment (DAGs + configs)
│ ├── snowflake-schema/ # Snowflake schema migrations
│ └── material-mkdocs/ # Documentation site
│
├── .claude/ # Project documentation
│ └── ongoing/ # Active project docs & ADRs
│
├── pyproject.toml # Workspace configuration
└── .env # Local credentials (not committed)
Common Tasks¶
Creating a New Pipeline¶
- Define a YAML config in apps/airflow/xo-pipelines/dags/configs/
- Validate it: uv run xo-foundry validate-config --config <path>
- Generate the DAG: uv run xo-foundry generate-dag --config <path> --output <dags-dir>
- Test locally with astro dev start
- Deploy to production
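As a rough illustration, a pipeline config might look like the sketch below. The field names here are assumptions, not the real DAG Factory schema; consult the DAG Factory documentation for the actual format:

```yaml
# Illustrative only -- these keys are guesses, not the real schema
dag_id: example_pipeline
schedule: "0 6 * * *"
tasks:
  - name: extract
  - name: load
```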
Working with Snowflake¶
from xo_core.snowflake_manager import SnowflakeManager
# Initialize manager
manager = SnowflakeManager()
# Upload DataFrame with deduplication
prepped_df = manager.prep_dataframe_for_table(
    df, "TABLE_NAME", filter_existing=True
)
manager.upload_dataframe(prepped_df, "TABLE_NAME")
Working with S3¶
from xo_core.s3_manager import S3Manager
# Initialize manager
s3 = S3Manager()
# Upload DataFrame to S3
s3.upload_dataframe(
    df,
    bucket="xo-ingest-bucket",
    key="client/report/2026-01-15/data.csv",
)
Next Steps¶
Now that you have the platform set up:
- Understand the Architecture -- Learn how components fit together
- Explore xo-core -- Foundation utilities and extractors
- Learn about Medallion Architecture -- BRONZE/SILVER/GOLD layers
- Explore the DAG Factory -- YAML-driven pipeline generation
Troubleshooting¶
uv sync fails¶
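A common first remedy (generic uv advice, not project-specific) is to clear the cache and retry:

```shell
uv cache clean
uv sync
```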
Type checking errors¶
Make sure you're using Python 3.12+:
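One way to check is to ask uv which interpreter it resolves for this workspace:

```shell
# Version of the interpreter uv uses for this workspace
uv run python --version
```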
Airflow won't start¶
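A generic Astro CLI recovery sequence (not project-specific advice) is to tear down the local containers and start fresh:

```shell
astro dev kill    # stop and remove the local Airflow containers
astro dev start
```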
Import errors¶
Ensure packages are installed:
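Re-syncing the workspace and inspecting the dependency tree usually resolves this:

```shell
uv sync   # reinstall all workspace packages
uv tree   # confirm the packages are present
```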
Getting Help¶
- Review the Architecture Documentation
- Check Architecture Decisions for design rationale
- Ask in the team Slack channel
- Open an issue on GitHub
Next: Architecture Overview →