
Katachi



Katachi is a Python package for validating, processing, and parsing directory structures against defined schemas.

!!! warning "Work in Progress"
    Katachi is currently under active development and should be considered a work in progress. APIs may change in future releases.

Overview

Katachi helps you define, validate, and process structured directory trees. It's particularly useful for:

  • Data validation: Ensure datasets follow a consistent structure
  • Processing pipelines: Process files based on their position in a directory tree
  • Schema enforcement: Validate project structures against conventions
  • Relationship validation: Verify relationships between files (such as paired files)

Features

  • 📐 Schema-based validation - Define expected directory structures using YAML
  • 🧩 Extensible architecture - Create custom validators and actions
  • 🔄 Relationship validation - Validate relationships between files
  • 🚀 Command-line interface - Easy-to-use CLI with rich formatting
  • 📋 Detailed reports - Get comprehensive validation reports

Installation

pip install katachi

Quick Start

1. Define a schema (schema.yaml)

semantical_name: data
type: directory
pattern_name: data
children:
  - semantical_name: image
    pattern_name: "img\\d+"
    type: file
    extension: .jpg
    description: "Image files with numeric identifiers"
  - semantical_name: metadata
    pattern_name: "img\\d+"
    type: file
    extension: .json
    description: "Metadata for image files"
  - semantical_name: file_pairs_check
    type: predicate
    predicate_type: pair_comparison
    description: "Check if images have matching metadata files"
    elements:
      - image
      - metadata
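
To make the `pair_comparison` predicate in the schema above more concrete, here is a minimal sketch in plain Python of the kind of check it describes: every file whose name matches the image pattern should have a metadata file with the same stem, and vice versa. This is not Katachi's implementation or API. The function name `find_unpaired` and its parameters are hypothetical, and the sketch uses only the standard library.

```python
import re
import tempfile
from pathlib import Path


def find_unpaired(directory, pattern=r"img\d+", first_ext=".jpg", second_ext=".json"):
    """Return the stems that appear with only one of the two extensions.

    Hypothetical helper for illustration; not part of Katachi.
    """
    name_re = re.compile(pattern)
    stems = {first_ext: set(), second_ext: set()}
    for path in Path(directory).iterdir():
        # Only consider regular files whose stem fully matches the pattern.
        if path.is_file() and path.suffix in stems and name_re.fullmatch(path.stem):
            stems[path.suffix].add(path.stem)
    # Symmetric difference: stems present for one extension but not the other.
    return sorted(stems[first_ext] ^ stems[second_ext])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        data = Path(tmp) / "data"
        data.mkdir()
        for name in ("img001.jpg", "img001.json", "img002.jpg"):
            (data / name).touch()
        # img002 has an image but no metadata file, so it is reported.
        print(find_unpaired(data))
```

An empty result means every image has matching metadata. Katachi's own predicate may report mismatches differently.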

2. Validate a directory structure

katachi validate schema.yaml target_directory
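
To try the command above without real data, a small script can create a directory tree shaped like the example schema. The sketch below is a hypothetical helper, not part of Katachi. It assumes the schema's root node corresponds to a directory named `data`. Whether `target_directory` should be that directory or its parent is not stated here, so adjust the path you pass to the CLI accordingly.

```python
import tempfile
from pathlib import Path


def make_sample_tree(root, count=2):
    """Create root/data with `count` image files and matching metadata files.

    Hypothetical helper for trying out the schema; not part of Katachi.
    """
    data = Path(root) / "data"
    data.mkdir(parents=True, exist_ok=True)
    for i in range(1, count + 1):
        stem = f"img{i:03d}"
        (data / f"{stem}.jpg").touch()            # empty placeholder image
        (data / f"{stem}.json").write_text("{}")  # empty JSON metadata
    return data


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        data_dir = make_sample_tree(tmp)
        # List the generated files so the layout can be checked by eye.
        print(sorted(p.name for p in data_dir.iterdir()))
```

The placeholder files are empty, which is enough for a check based only on names and extensions.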

3. Process the results

✅ Validation passed!
  - Found 2 image files
  - All image files have matching metadata files

Next Steps