Katachi

Katachi is a Python package for validating, processing, and parsing directory structures against defined schemas.
!!! warning "Work in Progress"
    Katachi is currently under active development and should be considered a work in progress. APIs may change in future releases.

Overview
Katachi helps you define, validate, and process structured directory trees. It's particularly useful for:
- Data validation: Ensure datasets follow a consistent structure
- Processing pipelines: Process files based on their position in a directory tree
- Schema enforcement: Validate project structures against conventions
- Relationship validation: Verify relationships between files (such as paired files)
Features
- 📐 Schema-based validation - Define expected directory structures using YAML
- 🧩 Extensible architecture - Create custom validators and actions
- 🔄 Relationship validation - Validate relationships between files
- 🚀 Command-line interface - Easy-to-use CLI with rich formatting
- 📋 Detailed reports - Get comprehensive validation reports
Installation
Quick Start
1. Define a schema (schema.yaml)
    semantical_name: data
    type: directory
    pattern_name: data
    children:
      - semantical_name: image
        pattern_name: "img\\d+"
        type: file
        extension: .jpg
        description: "Image files with numeric identifiers"
      - semantical_name: metadata
        pattern_name: "img\\d+"
        type: file
        extension: .json
        description: "Metadata for image files"
      - semantical_name: file_pairs_check
        type: predicate
        predicate_type: pair_comparison
        description: "Check if images have matching metadata files"
        elements:
          - image
          - metadata
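
To make the two file entries concrete, here is a minimal Python sketch of how a `pattern_name` and `extension` pair could be checked against a file name. It uses only the standard library and is not Katachi's API: the helpers `matches_node` and `classify` are hypothetical, and the sketch assumes the pattern must match the whole file stem, which may differ from how Katachi applies it.

```python
import re
from pathlib import PurePath

# Hypothetical mirror of the two file entries in schema.yaml.
# The YAML string "img\\d+" decodes to the regex img\d+.
FILE_NODES = {
    "image": {"pattern_name": r"img\d+", "extension": ".jpg"},
    "metadata": {"pattern_name": r"img\d+", "extension": ".json"},
}


def matches_node(filename: str, node: dict) -> bool:
    """Return True if the suffix equals the extension and the stem fully matches the pattern."""
    path = PurePath(filename)
    return (
        path.suffix == node["extension"]
        and re.fullmatch(node["pattern_name"], path.stem) is not None
    )


def classify(filename: str):
    """Return the semantical_name of the first matching node, or None."""
    for name, node in FILE_NODES.items():
        if matches_node(filename, node):
            return name
    return None
```

Under these assumptions, `img001.jpg` is classified as `image`, `img001.json` as `metadata`, and a file such as `notes.txt` matches neither entry.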
2. Validate a directory structure
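
As a rough illustration of what a paired-file check involves, the following standard-library sketch scans a directory for `img<digits>.jpg` and `img<digits>.json` files and reports stems that lack a partner. It does not call Katachi: the function `find_unpaired` is hypothetical, and it assumes that a pair comparison means every image stem has a metadata file with the same stem, and vice versa.

```python
import re
import tempfile
from pathlib import Path

PATTERN = re.compile(r"img\d+")  # same pattern as the file entries in schema.yaml


def find_unpaired(directory: Path) -> dict:
    """Return stems that have an image without metadata, and vice versa."""
    images = {p.stem for p in directory.glob("*.jpg") if PATTERN.fullmatch(p.stem)}
    metadata = {p.stem for p in directory.glob("*.json") if PATTERN.fullmatch(p.stem)}
    return {
        "missing_metadata": sorted(images - metadata),
        "missing_image": sorted(metadata - images),
    }


def demo() -> dict:
    """Build a throwaway data directory and run the check on it."""
    with tempfile.TemporaryDirectory() as tmp:
        data = Path(tmp) / "data"
        data.mkdir()
        # img002.jpg deliberately has no matching .json file.
        for name in ("img001.jpg", "img001.json", "img002.jpg"):
            (data / name).touch()
        return find_unpaired(data)


if __name__ == "__main__":
    print(demo())
```

In the demo directory, `img002` is reported under `missing_metadata` because only its `.jpg` file exists.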
3. Process the results
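
One way to turn per-file check outcomes into a readable summary is sketched below. The `CheckResult` record and the `summarize` function are hypothetical stand-ins invented for this illustration; Katachi's actual report objects may be shaped differently, so consult the API documentation for the real names.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    """Hypothetical record of one check; not a Katachi class."""

    path: str
    is_valid: bool
    message: str


def summarize(results: list) -> str:
    """Return a short text summary listing only the failed checks."""
    failures = [r for r in results if not r.is_valid]
    if not failures:
        return "Validation successful"
    lines = [f"Validation failed with {len(failures)} issue(s):"]
    lines += [f"  {r.path}: {r.message}" for r in failures]
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [
        CheckResult("data/img001.jpg", True, "ok"),
        CheckResult("data/img002.jpg", False, "no matching metadata file"),
    ]
    print(summarize(sample))
```

Keeping the summary as a plain string makes it easy to print, log, or attach to a CI job output.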
Next Steps
- Read the API documentation to learn about the available modules
- Explore examples to see more use cases
- Learn how to extend Katachi with custom validators and actions