Architecture for Contributors
Repository Layout
batho/
├── batho/ # Core Python package
│ ├── core/ # Schemas and configuration models
│ ├── cli/ # CLI subcommand handlers
│ ├── orchestrator/ # Orchestrators (build, patch, export, gc, load)
│ ├── modules/ # Modules (extraction, graph, compression, dependency, integrity, storage)
│ └── utils/ # General utilities (hashes, locks, file I/O)
├── tests/ # Test suite (pytest)
├── docs-site/ # Docusaurus documentation site
├── cicd/ # GitHub Actions + CI templates
├── batho_cli.py # Main CLI entrypoint
├── batho.yaml.example # Example configuration
└── pyproject.toml # Project metadata + dependencies
Key Modules
| Module | Responsibility |
|---|---|
| batho/modules/extraction/ | tree-sitter based AST extraction and parsing caches |
| batho/modules/graph/ | InMemoryGraph engine and node-level diff tracking |
| batho/modules/compression/ | BSGMap mapping and plugin rules engine |
| batho/modules/dependency/ | Consolidated Dependency Extraction Utility (stdlib + venv) |
| batho/modules/integrity/ | Database verification, repair engine, and report generator |
| batho/modules/storage/ | Arrow IPC Bundle registry and views |
| batho/orchestrator/ | High-level subcommand orchestrators (build, patch, export, gc, load) |
Adding a New Language
- Add a detector in batho/context/languages/
- Implement the extractor subclass
- Register in the language registry
- Add tests in tests/
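The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual API: `BaseExtractor`, `LANGUAGE_REGISTRY`, `register_language`, `detect_language`, and the `toylang` language are all hypothetical names invented for the example, and the real extractors use tree-sitter rather than the line scan shown here.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from pathlib import Path
from typing import Optional


class BaseExtractor(ABC):
    """Hypothetical stand-in for the project's extractor base class."""

    # File suffixes this extractor claims (assumed detection mechanism).
    extensions: tuple[str, ...] = ()

    @abstractmethod
    def extract(self, source: str) -> list[dict]:
        """Return a list of symbol records found in the source text."""


# Hypothetical language registry: language name -> extractor class.
LANGUAGE_REGISTRY: dict[str, type[BaseExtractor]] = {}


def register_language(name: str):
    """Class decorator that records an extractor under a language name."""
    def decorator(cls: type[BaseExtractor]) -> type[BaseExtractor]:
        LANGUAGE_REGISTRY[name] = cls
        return cls
    return decorator


def detect_language(path: str) -> Optional[str]:
    """Detector: map a file path to a registered language by suffix."""
    suffix = Path(path).suffix
    for name, cls in LANGUAGE_REGISTRY.items():
        if suffix in cls.extensions:
            return name
    return None


@register_language("toylang")
class ToyLangExtractor(BaseExtractor):
    """Extractor subclass for an invented language with `func name(...)` lines."""

    extensions = (".toy",)

    def extract(self, source: str) -> list[dict]:
        symbols = []
        for lineno, line in enumerate(source.splitlines(), start=1):
            stripped = line.strip()
            if stripped.startswith("func "):
                name = stripped[len("func "):].split("(", 1)[0].strip()
                symbols.append({"kind": "function", "name": name, "line": lineno})
        return symbols


def test_toylang_extractor():
    """pytest-style test of the kind that would live under tests/."""
    assert detect_language("example.toy") == "toylang"
    symbols = ToyLangExtractor().extract("func add(a, b)\n")
    assert symbols == [{"kind": "function", "name": "add", "line": 1}]
```

Registering through a decorator keeps the detector, the extractor, and the registry entry in one file, which may or may not match how the real registry is wired.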
Adding a BSG Plugin
- Define the plugin YAML in batho/bsg/plugins/
- Register in batho/bsg/plugins/registry
- Add schema validation in batho/bsg/schemas/
- Add integration tests
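As a rough illustration of the validation and registration steps, the sketch below checks a plugin definition that has already been parsed from YAML into a dict (for example with PyYAML's `yaml.safe_load`). The field names (`name`, `version`, `rules`), `PluginSchemaError`, `validate_plugin`, `register_plugin`, and `PLUGIN_REGISTRY` are assumptions made for the example; the actual plugin schema and registry format are not described here.

```python
from __future__ import annotations

# Hypothetical required fields and their expected Python types.
REQUIRED_FIELDS: dict[str, type] = {
    "name": str,
    "version": str,
    "rules": list,
}

# Hypothetical in-memory registry: plugin name -> validated definition.
PLUGIN_REGISTRY: dict[str, dict] = {}


class PluginSchemaError(ValueError):
    """Raised when a plugin definition does not match the assumed schema."""


def validate_plugin(doc: dict) -> dict:
    """Check required fields and types; collect all problems before raising."""
    errors = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in doc:
            errors.append(f"missing field: {field}")
        elif not isinstance(doc[field], expected):
            actual = type(doc[field]).__name__
            errors.append(f"{field}: expected {expected.__name__}, got {actual}")
    if errors:
        raise PluginSchemaError("; ".join(errors))
    return doc


def register_plugin(doc: dict) -> dict:
    """Validate a plugin definition, then add it to the registry by name."""
    plugin = validate_plugin(doc)
    if plugin["name"] in PLUGIN_REGISTRY:
        raise ValueError(f"plugin already registered: {plugin['name']}")
    PLUGIN_REGISTRY[plugin["name"]] = plugin
    return plugin
```

An integration test could then load each YAML file in the plugin directory, pass the result to `register_plugin`, and assert that no `PluginSchemaError` is raised.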