
Katachi API Reference

Katachi is a Python package for validating, processing, and parsing directory structures against defined schemas.

Core Concepts

Schema Nodes

Schema nodes represent the elements in your directory structure:

  • SchemaNode: Abstract base class for all schema elements
  • SchemaDirectory: Represents a directory in the schema
  • SchemaFile: Represents a file in the schema
  • SchemaPredicateNode: Represents a validation rule between elements

Two-Phase Validation

Katachi uses a two-phase validation approach:

  1. Structural Validation: Validates the existence and properties of files and directories
  2. Predicate Evaluation: Validates relationships between elements that passed structural validation
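
The two phases can be illustrated with a minimal, self-contained sketch. It does not use Katachi's API: the helpers `structural_check` and `pair_check` are hypothetical names introduced here, and the rule that paired files share a stem is an assumption about what a pair comparison checks.

```python
import re
import tempfile
from pathlib import Path


def structural_check(directory: Path, extension: str, pattern: str) -> list[Path]:
    """Phase 1: keep files whose extension and stem fit the template."""
    regex = re.compile(pattern)
    return sorted(
        p for p in directory.iterdir()
        if p.is_file() and p.suffix == extension and regex.fullmatch(p.stem)
    )


def pair_check(first: list[Path], second: list[Path]) -> set[str]:
    """Phase 2: return stems that appear in only one of the two groups."""
    return {p.stem for p in first} ^ {p.stem for p in second}


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("img1.jpg", "img1.json", "img2.jpg", "notes.txt"):
        (root / name).touch()

    # Phase 1 runs per template; phase 2 only sees what phase 1 accepted
    images = structural_check(root, ".jpg", r"img\d+")
    metadata = structural_check(root, ".json", r"img\d+")
    unpaired = pair_check(images, metadata)

print([p.name for p in images])
print(unpaired)
```

Here `notes.txt` is dropped in the first phase, and `img2` is reported in the second phase because it has no matching `.json` file.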

Modules

Schema Node (katachi.schema.schema_node)

The foundation of Katachi is the schema node system, which defines how directory structures should be organized.

from katachi.schema.schema_node import SchemaDirectory, SchemaFile, SchemaPredicateNode
from pathlib import Path

# Create a schema hierarchy
root = SchemaDirectory(path=Path("data"), semantical_name="data", description="Data directory")

# Add file templates
root.add_child(SchemaFile(
    path=Path("data/image.jpg"),
    semantical_name="image",
    extension=".jpg",
    pattern_validation=r"img\d+"
))

root.add_child(SchemaFile(
    path=Path("data/metadata.json"),
    semantical_name="metadata",
    extension=".json",
    pattern_validation=r"img\d+"
))

# Add a predicate to validate relationships
root.add_child(SchemaPredicateNode(
    path=Path("data"),
    semantical_name="file_pairs_check",
    predicate_type="pair_comparison",
    elements=["image", "metadata"],
    description="Check if images have corresponding metadata files"
))

SchemaDirectory

Bases: SchemaNode

Represents a directory in the schema. Can contain children nodes (files or other directories).

Source code in src/katachi/schema/schema_node.py
class SchemaDirectory(SchemaNode):
    """
    Represents a directory in the schema.
    Can contain children nodes (files or other directories).
    """

    def __init__(
        self,
        path: Path,
        semantical_name: str,
        description: Optional[str] = None,
        pattern_validation: Optional[str] = None,
        metadata: Optional[dict[str, Any]] = None,
    ):
        """
        Initialize a schema directory node.

        Args:
            path: Path to this directory
            semantical_name: The semantic name of this directory in the schema
            description: Optional description of the directory
            pattern_validation: Optional regex pattern for name validation
            metadata: Optional metadata for custom validations
        """
        super().__init__(path, semantical_name, description, pattern_validation, metadata)
        self.children: list[SchemaNode] = []

    def get_type(self) -> str:
        return "directory"

    def add_child(self, child: SchemaNode) -> None:
        """
        Add a child node to this directory.

        Args:
            child: The child node (file or directory) to add
        """
        self.children.append(child)

    def get_child_by_name(self, name: str) -> Optional[SchemaNode]:
        """
        Get a child node by its semantical name.

        Args:
            name: The semantical name of the child to find

        Returns:
            The child node if found, None otherwise
        """
        for child in self.children:
            if child.semantical_name == name:
                return child
        return None

    def __repr__(self) -> str:
        """Detailed string representation of the directory node."""
        return f"{self.__class__.__name__}(path='{self.path}', semantical_name='{self.semantical_name}', children={len(self.children)})"

__init__(path, semantical_name, description=None, pattern_validation=None, metadata=None)

Initialize a schema directory node.

Parameters:

  • path (Path, required): Path to this directory
  • semantical_name (str, required): The semantic name of this directory in the schema
  • description (Optional[str], default None): Optional description of the directory
  • pattern_validation (Optional[str], default None): Optional regex pattern for name validation
  • metadata (Optional[dict[str, Any]], default None): Optional metadata for custom validations

Source code in src/katachi/schema/schema_node.py
def __init__(
    self,
    path: Path,
    semantical_name: str,
    description: Optional[str] = None,
    pattern_validation: Optional[str] = None,
    metadata: Optional[dict[str, Any]] = None,
):
    """
    Initialize a schema directory node.

    Args:
        path: Path to this directory
        semantical_name: The semantic name of this directory in the schema
        description: Optional description of the directory
        pattern_validation: Optional regex pattern for name validation
        metadata: Optional metadata for custom validations
    """
    super().__init__(path, semantical_name, description, pattern_validation, metadata)
    self.children: list[SchemaNode] = []

__repr__()

Detailed string representation of the directory node.

Source code in src/katachi/schema/schema_node.py
def __repr__(self) -> str:
    """Detailed string representation of the directory node."""
    return f"{self.__class__.__name__}(path='{self.path}', semantical_name='{self.semantical_name}', children={len(self.children)})"

add_child(child)

Add a child node to this directory.

Parameters:

  • child (SchemaNode, required): The child node (file or directory) to add

Source code in src/katachi/schema/schema_node.py
def add_child(self, child: SchemaNode) -> None:
    """
    Add a child node to this directory.

    Args:
        child: The child node (file or directory) to add
    """
    self.children.append(child)

get_child_by_name(name)

Get a child node by its semantical name.

Parameters:

  • name (str, required): The semantical name of the child to find

Returns:

  • Optional[SchemaNode]: The child node if found, None otherwise

Source code in src/katachi/schema/schema_node.py
def get_child_by_name(self, name: str) -> Optional[SchemaNode]:
    """
    Get a child node by its semantical name.

    Args:
        name: The semantical name of the child to find

    Returns:
        The child node if found, None otherwise
    """
    for child in self.children:
        if child.semantical_name == name:
            return child
    return None
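
As the source above shows, `get_child_by_name` scans the direct children in order and returns the first match. The sketch below mirrors that logic with stand-in classes (`Node` and `Directory` are simplified substitutes, not Katachi's classes) so that it runs without the package installed. The helper `find_nested` is hypothetical and only illustrates chaining lookups through nested directories.

```python
from typing import Optional


class Node:
    """Stand-in for SchemaNode with only the attribute used here."""

    def __init__(self, semantical_name: str):
        self.semantical_name = semantical_name


class Directory(Node):
    """Stand-in for SchemaDirectory with the same child-lookup logic."""

    def __init__(self, semantical_name: str):
        super().__init__(semantical_name)
        self.children: list[Node] = []

    def add_child(self, child: Node) -> None:
        self.children.append(child)

    def get_child_by_name(self, name: str) -> Optional[Node]:
        for child in self.children:
            if child.semantical_name == name:
                return child
        return None


def find_nested(root: Directory, names: list[str]) -> Optional[Node]:
    """Hypothetical helper: follow a chain of semantical names downward."""
    current: Optional[Node] = root
    for name in names:
        if not isinstance(current, Directory):
            return None  # cannot descend into a file or a missing node
        current = current.get_child_by_name(name)
    return current


data = Directory("data")
day = Directory("timestamp")
data.add_child(day)
day.add_child(Node("image"))
day.add_child(Node("image"))  # duplicate name: lookup returns the first one

found = find_nested(data, ["timestamp", "image"])
missing = find_nested(data, ["timestamp", "metadata"])
```

Because the lookup is a linear scan over direct children, duplicate semantical names within one directory are not reported; only the first is reachable this way.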

SchemaFile

Bases: SchemaNode

Represents a file in the schema.

Source code in src/katachi/schema/schema_node.py
class SchemaFile(SchemaNode):
    """
    Represents a file in the schema.
    """

    def __init__(
        self,
        path: Path,
        semantical_name: str,
        extension: str,
        description: Optional[str] = None,
        pattern_validation: Optional[str] = None,
        metadata: Optional[dict[str, Any]] = None,
    ):
        """
        Initialize a schema file node.

        Args:
            path: Path to this file
            semantical_name: The semantic name of this file in the schema
            extension: The file extension
            description: Optional description of the file
            pattern_validation: Optional regex pattern for name validation
            metadata: Optional metadata for custom validations
        """
        super().__init__(path, semantical_name, description, pattern_validation, metadata)
        self.extension: str = extension

    def get_type(self) -> str:
        return "file"

    def __repr__(self) -> str:
        """Detailed string representation of the file node."""
        return f"{self.__class__.__name__}(path='{self.path}', semantical_name='{self.semantical_name}', extension='{self.extension}')"

__init__(path, semantical_name, extension, description=None, pattern_validation=None, metadata=None)

Initialize a schema file node.

Parameters:

  • path (Path, required): Path to this file
  • semantical_name (str, required): The semantic name of this file in the schema
  • extension (str, required): The file extension
  • description (Optional[str], default None): Optional description of the file
  • pattern_validation (Optional[str], default None): Optional regex pattern for name validation
  • metadata (Optional[dict[str, Any]], default None): Optional metadata for custom validations

Source code in src/katachi/schema/schema_node.py
def __init__(
    self,
    path: Path,
    semantical_name: str,
    extension: str,
    description: Optional[str] = None,
    pattern_validation: Optional[str] = None,
    metadata: Optional[dict[str, Any]] = None,
):
    """
    Initialize a schema file node.

    Args:
        path: Path to this file
        semantical_name: The semantic name of this file in the schema
        extension: The file extension
        description: Optional description of the file
        pattern_validation: Optional regex pattern for name validation
        metadata: Optional metadata for custom validations
    """
    super().__init__(path, semantical_name, description, pattern_validation, metadata)
    self.extension: str = extension

__repr__()

Detailed string representation of the file node.

Source code in src/katachi/schema/schema_node.py
def __repr__(self) -> str:
    """Detailed string representation of the file node."""
    return f"{self.__class__.__name__}(path='{self.path}', semantical_name='{self.semantical_name}', extension='{self.extension}')"

SchemaNode

Bases: ABC

Base abstract class for all schema nodes.

SchemaNode represents any node in the file/directory structure schema. It contains common properties and methods that all nodes should implement.

Source code in src/katachi/schema/schema_node.py
class SchemaNode(ABC):
    """
    Base abstract class for all schema nodes.

    SchemaNode represents any node in the file/directory structure schema.
    It contains common properties and methods that all nodes should implement.
    """

    def __init__(
        self,
        path: Path,
        semantical_name: str,
        description: Optional[str] = None,
        pattern_validation: Optional[str] = None,
        metadata: Optional[dict[str, Any]] = None,
    ):
        """
        Initialize a schema node.

        Args:
            path: Path to this node
            semantical_name: The semantic name of this node in the schema
            description: Optional description of the node
            pattern_validation: Optional regex pattern for name validation
            metadata: Optional metadata for custom validations
        """
        self.path: Path = path
        self.semantical_name: str = semantical_name
        self.description: Optional[str] = description
        self.pattern_validation: Optional[Pattern] = None
        self.metadata: dict[str, Any] = metadata or {}

        if pattern_validation:
            self.pattern_validation = re_compile(pattern_validation)

    @abstractmethod
    def get_type(self) -> str:
        """
        Get the type of this node.

        Returns:
            String representing the node type ("file" or "directory").
        """
        pass

    def __str__(self) -> str:
        """String representation of the node."""
        return f"{self.get_type()}: {self.semantical_name} at {self.path}"

    def __repr__(self) -> str:
        """Detailed string representation of the node."""
        return f"{self.__class__.__name__}(path='{self.path}', semantical_name='{self.semantical_name}')"

__init__(path, semantical_name, description=None, pattern_validation=None, metadata=None)

Initialize a schema node.

Parameters:

  • path (Path, required): Path to this node
  • semantical_name (str, required): The semantic name of this node in the schema
  • description (Optional[str], default None): Optional description of the node
  • pattern_validation (Optional[str], default None): Optional regex pattern for name validation
  • metadata (Optional[dict[str, Any]], default None): Optional metadata for custom validations

Source code in src/katachi/schema/schema_node.py
def __init__(
    self,
    path: Path,
    semantical_name: str,
    description: Optional[str] = None,
    pattern_validation: Optional[str] = None,
    metadata: Optional[dict[str, Any]] = None,
):
    """
    Initialize a schema node.

    Args:
        path: Path to this node
        semantical_name: The semantic name of this node in the schema
        description: Optional description of the node
        pattern_validation: Optional regex pattern for name validation
        metadata: Optional metadata for custom validations
    """
    self.path: Path = path
    self.semantical_name: str = semantical_name
    self.description: Optional[str] = description
    self.pattern_validation: Optional[Pattern] = None
    self.metadata: dict[str, Any] = metadata or {}

    if pattern_validation:
        self.pattern_validation = re_compile(pattern_validation)
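
The constructor stores `pattern_validation` as a compiled regular expression. How Katachi applies that compiled pattern to a name is not shown in this section, so the sketch below only contrasts the two most likely options in Python's standard `re` module, using the `img\d+` pattern from the earlier example. The sample names are made up for illustration.

```python
import re

pattern = re.compile(r"img\d+")

names = ["img001", "img001_backup", "photo_img7"]

# match() anchors only at the start of the string
prefix_hits = [n for n in names if pattern.match(n)]

# fullmatch() requires the whole name to fit the pattern
exact_hits = [n for n in names if pattern.fullmatch(n)]

print(prefix_hits)
print(exact_hits)
```

If stricter behavior is needed regardless of which method is used, anchoring the pattern explicitly (for example `^img\d+$`) removes the ambiguity.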

__repr__()

Detailed string representation of the node.

Source code in src/katachi/schema/schema_node.py
def __repr__(self) -> str:
    """Detailed string representation of the node."""
    return f"{self.__class__.__name__}(path='{self.path}', semantical_name='{self.semantical_name}')"

__str__()

String representation of the node.

Source code in src/katachi/schema/schema_node.py
def __str__(self) -> str:
    """String representation of the node."""
    return f"{self.get_type()}: {self.semantical_name} at {self.path}"

get_type() abstractmethod

Get the type of this node.

Returns:

  • str: String representing the node type ("file", "directory", or "predicate").

Source code in src/katachi/schema/schema_node.py
@abstractmethod
def get_type(self) -> str:
    """
    Get the type of this node.

    Returns:
        String representing the node type ("file" or "directory").
    """
    pass

SchemaPredicateNode

Bases: SchemaNode

Represents a predicate node in the schema. Used for validating relationships between other schema nodes.

Source code in src/katachi/schema/schema_node.py
class SchemaPredicateNode(SchemaNode):
    """
    Represents a predicate node in the schema.
    Used for validating relationships between other schema nodes.
    """

    def __init__(
        self,
        path: Path,
        semantical_name: str,
        predicate_type: str,
        elements: list[str],
        description: Optional[str] = None,
        metadata: Optional[dict[str, Any]] = None,
    ):
        """
        Initialize a schema predicate node.

        Args:
            path: Path to this node
            semantical_name: The semantic name of this node in the schema
            predicate_type: Type of predicate (e.g., 'pair_comparison')
            elements: List of semantical names of nodes this predicate operates on
            description: Optional description of the predicate
            metadata: Optional metadata for custom validations
        """
        super().__init__(path, semantical_name, description, None, metadata)
        self.predicate_type: str = predicate_type
        self.elements: list[str] = elements

    def get_type(self) -> str:
        return "predicate"

    def __repr__(self) -> str:
        """Detailed string representation of the predicate node."""
        return (
            f"{self.__class__.__name__}(path='{self.path}', "
            f"semantical_name='{self.semantical_name}', "
            f"predicate_type='{self.predicate_type}', "
            f"elements={self.elements})"
        )

__init__(path, semantical_name, predicate_type, elements, description=None, metadata=None)

Initialize a schema predicate node.

Parameters:

  • path (Path, required): Path to this node
  • semantical_name (str, required): The semantic name of this node in the schema
  • predicate_type (str, required): Type of predicate (e.g., 'pair_comparison')
  • elements (list[str], required): List of semantical names of nodes this predicate operates on
  • description (Optional[str], default None): Optional description of the predicate
  • metadata (Optional[dict[str, Any]], default None): Optional metadata for custom validations

Source code in src/katachi/schema/schema_node.py
def __init__(
    self,
    path: Path,
    semantical_name: str,
    predicate_type: str,
    elements: list[str],
    description: Optional[str] = None,
    metadata: Optional[dict[str, Any]] = None,
):
    """
    Initialize a schema predicate node.

    Args:
        path: Path to this node
        semantical_name: The semantic name of this node in the schema
        predicate_type: Type of predicate (e.g., 'pair_comparison')
        elements: List of semantical names of nodes this predicate operates on
        description: Optional description of the predicate
        metadata: Optional metadata for custom validations
    """
    super().__init__(path, semantical_name, description, None, metadata)
    self.predicate_type: str = predicate_type
    self.elements: list[str] = elements

__repr__()

Detailed string representation of the predicate node.

Source code in src/katachi/schema/schema_node.py
def __repr__(self) -> str:
    """Detailed string representation of the predicate node."""
    return (
        f"{self.__class__.__name__}(path='{self.path}', "
        f"semantical_name='{self.semantical_name}', "
        f"predicate_type='{self.predicate_type}', "
        f"elements={self.elements})"
    )

Schema Importer (katachi.schema.importer)

Load schema definitions from YAML files to create SchemaNode structures.

from katachi.schema.importer import load_yaml
from pathlib import Path

# Load schema from YAML file
schema = load_yaml(Path("schema.yaml"), Path("target_directory"))

# Now schema contains a fully constructed schema hierarchy
if schema:
    print(f"Loaded schema for {schema.semantical_name}")
else:
    print("Failed to load schema")

load_yaml(schema_path, target_path)

Load a YAML schema file and return a SchemaNode tree structure.

Parameters:

  • schema_path (Path, required): Path to the YAML schema file
  • target_path (Path, required): Path to the directory that will be validated against the schema

Returns:

  • Optional[SchemaNode]: The root SchemaNode representing the schema hierarchy, or None if the schema could not be loaded

Raises (as documented; the source listed below logs each of these conditions and returns None instead of raising):

  • SchemaFileNotFoundError: If the schema file does not exist
  • EmptySchemaFileError: If the schema file is empty
  • InvalidYAMLContentError: If the YAML content cannot be parsed
  • FailedToLoadYAMLFileError: If there are other errors loading the YAML file

Source code in src/katachi/schema/importer.py
def load_yaml(schema_path: Path, target_path: Path) -> Optional[SchemaNode]:
    """
    Load a YAML schema file and return a SchemaNode tree structure.

    Args:
        schema_path: Path to the YAML schema file
        target_path: Path to the directory that will be validated against the schema

    Returns:
        The root SchemaNode representing the schema hierarchy

    Raises:
        SchemaFileNotFoundError: If the schema file does not exist
        EmptySchemaFileError: If the schema file is empty
        InvalidYAMLContentError: If the YAML content cannot be parsed
        FailedToLoadYAMLFileError: If there are other errors loading the YAML file
    """
    if not schema_path.exists():
        logging.error(f"Schema file not found: {schema_path}")
        return None

    try:
        with open(schema_path) as file:
            file_content = file.read()
            if not file_content.strip():
                logging.error(f"Schema file is empty: {schema_path}")
                return None

            data = yaml.safe_load(file_content)
            if data is None:
                logging.error(f"Invalid YAML content in file: {schema_path}")
                return None

            # Important: For the root node, we use the target_path directly
            # instead of constructing a path based on the schema node name
            return _parse_node(data, target_path, is_root=True)
    except yaml.YAMLError:
        logging.exception(f"Failed to load YAML file {schema_path}")
        return None
    except Exception:
        logging.exception(f"An error occurred while loading the YAML file {schema_path}")
        return None
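
Since the source above returns None on every failure path, callers that prefer an exception can wrap the call. The following is a minimal sketch under that assumption: `load_or_raise` and `SchemaLoadError` are hypothetical names, and the two stand-in loaders replace `load_yaml` so the example runs without Katachi installed.

```python
from pathlib import Path
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class SchemaLoadError(RuntimeError):
    """Hypothetical exception raised by the wrapper below."""


def load_or_raise(
    loader: Callable[[Path, Path], Optional[T]],
    schema_path: Path,
    target_path: Path,
) -> T:
    """Call a load_yaml-style function and turn a None result into an error."""
    schema = loader(schema_path, target_path)
    if schema is None:
        raise SchemaLoadError(f"Could not load schema from {schema_path}")
    return schema


# Stand-in loaders with the same (schema_path, target_path) signature
def failing_loader(schema_path: Path, target_path: Path) -> Optional[str]:
    return None


def working_loader(schema_path: Path, target_path: Path) -> Optional[str]:
    return "root-node"


loaded = load_or_raise(working_loader, Path("schema.yaml"), Path("data"))

try:
    load_or_raise(failing_loader, Path("missing.yaml"), Path("data"))
    error_message = ""
except SchemaLoadError as exc:
    error_message = str(exc)
```

In real use, `load_yaml` would be passed as the `loader` argument; the reason for the failure is then available only in the log output, not in the exception.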

Schema Validator (katachi.schema.validate)

Validate directory structures against schema definitions.

from katachi.schema.validate import validate_schema, format_validation_results
from pathlib import Path

# Validate target directory against schema
report = validate_schema(schema, Path("directory_to_validate"))

# Check if validation was successful
if report.is_valid():
    print("Validation successful!")
else:
    # Print formatted validation results
    print(format_validation_results(report))

Actions module for Katachi.

This module provides functionality for registering and executing callbacks when traversing the file system according to a schema.

ActionRegistration dataclass

Action registration details.

Source code in src/katachi/schema/actions.py
@dataclass
class ActionRegistration:
    """Action registration details."""

    callback: ActionCallback
    timing: ActionTiming
    description: str

ActionRegistry

Registry for file and directory actions.

Source code in src/katachi/schema/actions.py
class ActionRegistry:
    """Registry for file and directory actions."""

    # Registry of callbacks by semantic name
    _registry: ClassVar[dict[str, ActionRegistration]] = {}

    @classmethod
    def register(
        cls,
        semantical_name: str,
        callback: ActionCallback,
        timing: ActionTiming = ActionTiming.AFTER_VALIDATION,
        description: str = "",
    ) -> None:
        """
        Register a callback for a specific schema node semantic name.

        Args:
            semantical_name: The semantic name to trigger the callback for
            callback: Function to call when traversing a node with this semantic name
            timing: When the action should be executed
            description: Human-readable description of what the action does
        """
        cls._registry[semantical_name] = ActionRegistration(
            callback=callback, timing=timing, description=description or f"Action for {semantical_name}"
        )

    @classmethod
    def get(cls, semantical_name: str) -> Optional[ActionRegistration]:
        """Get a registered action by semantical name."""
        return cls._registry.get(semantical_name)

    @classmethod
    def execute_actions(
        cls,
        registry: NodeRegistry,
        context: Optional[dict[str, Any]] = None,
        timing: ActionTiming = ActionTiming.AFTER_VALIDATION,
    ) -> list[ActionResult]:
        """
        Execute all registered actions on validated nodes.

        Args:
            registry: Registry of validated nodes
            context: Additional context data
            timing: Which set of actions to execute based on timing

        Returns:
            List of action results
        """
        results = []
        context = context or {}

        # Get all semantical names from the registry
        for semantical_name, registration in cls._registry.items():
            # Skip actions that don't match the requested timing
            if registration.timing != timing:
                continue

            # Get all nodes with this semantical name
            node_contexts = registry.get_nodes_by_name(semantical_name)
            for node_ctx in node_contexts:
                try:
                    # Get parent contexts
                    parent_contexts = []
                    for parent_path in node_ctx.parent_paths:
                        parent_node_ctx = registry.get_node_by_path(parent_path)
                        if parent_node_ctx:
                            parent_contexts.append((parent_node_ctx.node, parent_node_ctx.path))

                    # Execute the action
                    registration.callback(node_ctx.node, node_ctx.path, parent_contexts, context)
                    results.append(
                        ActionResult(
                            success=True,
                            message=f"Executed {registration.description}",
                            path=node_ctx.path,
                            action_name=semantical_name,
                        )
                    )
                except Exception as e:
                    results.append(
                        ActionResult(
                            success=False,
                            message=f"Action failed: {e!s}",
                            path=node_ctx.path,
                            action_name=semantical_name,
                        )
                    )

        return results

execute_actions(registry, context=None, timing=ActionTiming.AFTER_VALIDATION) classmethod

Execute all registered actions on validated nodes.

Parameters:

  • registry (NodeRegistry, required): Registry of validated nodes
  • context (Optional[dict[str, Any]], default None): Additional context data
  • timing (ActionTiming, default AFTER_VALIDATION): Which set of actions to execute based on timing

Returns:

  • list[ActionResult]: List of action results

Source code in src/katachi/schema/actions.py
@classmethod
def execute_actions(
    cls,
    registry: NodeRegistry,
    context: Optional[dict[str, Any]] = None,
    timing: ActionTiming = ActionTiming.AFTER_VALIDATION,
) -> list[ActionResult]:
    """
    Execute all registered actions on validated nodes.

    Args:
        registry: Registry of validated nodes
        context: Additional context data
        timing: Which set of actions to execute based on timing

    Returns:
        List of action results
    """
    results = []
    context = context or {}

    # Get all semantical names from the registry
    for semantical_name, registration in cls._registry.items():
        # Skip actions that don't match the requested timing
        if registration.timing != timing:
            continue

        # Get all nodes with this semantical name
        node_contexts = registry.get_nodes_by_name(semantical_name)
        for node_ctx in node_contexts:
            try:
                # Get parent contexts
                parent_contexts = []
                for parent_path in node_ctx.parent_paths:
                    parent_node_ctx = registry.get_node_by_path(parent_path)
                    if parent_node_ctx:
                        parent_contexts.append((parent_node_ctx.node, parent_node_ctx.path))

                # Execute the action
                registration.callback(node_ctx.node, node_ctx.path, parent_contexts, context)
                results.append(
                    ActionResult(
                        success=True,
                        message=f"Executed {registration.description}",
                        path=node_ctx.path,
                        action_name=semantical_name,
                    )
                )
            except Exception as e:
                results.append(
                    ActionResult(
                        success=False,
                        message=f"Action failed: {e!s}",
                        path=node_ctx.path,
                        action_name=semantical_name,
                    )
                )

    return results
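
The control flow of `execute_actions` (filter by timing, run the callback for every node with the matching name, record failures instead of propagating them) can be sketched in isolation. Everything below is a simplified stand-in: `Timing`, `Registration`, the `validated` dictionary, and `run` are illustrative substitutes for `ActionTiming`, `ActionRegistration`, `NodeRegistry`, and `execute_actions`, and callbacks take a single path string rather than Katachi's four arguments.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable


class Timing(Enum):
    DURING_VALIDATION = auto()
    AFTER_VALIDATION = auto()


@dataclass
class Registration:
    callback: Callable[[str], None]
    timing: Timing


# Stand-in for NodeRegistry: semantical name -> validated paths
validated = {
    "image": ["data/img1.jpg", "data/img2.jpg"],
    "metadata": ["data/img1.json"],
}

seen: list[str] = []


def fail(path: str) -> None:
    raise ValueError(f"cannot read {path}")


registry: dict[str, Registration] = {
    "image": Registration(seen.append, Timing.AFTER_VALIDATION),
    "metadata": Registration(fail, Timing.AFTER_VALIDATION),
    "report": Registration(seen.append, Timing.DURING_VALIDATION),
}


def run(timing: Timing) -> list[tuple[str, bool]]:
    """Run callbacks whose timing matches; collect (path, success) pairs."""
    results = []
    for name, reg in registry.items():
        if reg.timing != timing:
            continue  # skip actions registered for the other phase
        for path in validated.get(name, []):
            try:
                reg.callback(path)
                results.append((path, True))
            except Exception:
                results.append((path, False))  # one failure does not stop the rest
    return results


results = run(Timing.AFTER_VALIDATION)
```

The failing `metadata` callback yields a failed result entry while the `image` callbacks still run, which matches the per-node `try`/`except` in the source above.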

get(semantical_name) classmethod

Get a registered action by semantical name.

Source code in src/katachi/schema/actions.py
@classmethod
def get(cls, semantical_name: str) -> Optional[ActionRegistration]:
    """Get a registered action by semantical name."""
    return cls._registry.get(semantical_name)

register(semantical_name, callback, timing=ActionTiming.AFTER_VALIDATION, description='') classmethod

Register a callback for a specific schema node semantic name.

Parameters:

  • semantical_name (str, required): The semantic name to trigger the callback for
  • callback (ActionCallback, required): Function to call when traversing a node with this semantic name
  • timing (ActionTiming, default AFTER_VALIDATION): When the action should be executed
  • description (str, default ''): Human-readable description of what the action does

Source code in src/katachi/schema/actions.py
@classmethod
def register(
    cls,
    semantical_name: str,
    callback: ActionCallback,
    timing: ActionTiming = ActionTiming.AFTER_VALIDATION,
    description: str = "",
) -> None:
    """
    Register a callback for a specific schema node semantic name.

    Args:
        semantical_name: The semantic name to trigger the callback for
        callback: Function to call when traversing a node with this semantic name
        timing: When the action should be executed
        description: Human-readable description of what the action does
    """
    cls._registry[semantical_name] = ActionRegistration(
        callback=callback, timing=timing, description=description or f"Action for {semantical_name}"
    )

ActionResult

Represents the result of an action execution.

Source code in src/katachi/schema/actions.py
class ActionResult:
    """Represents the result of an action execution."""

    def __init__(self, success: bool, message: str, path: Path, action_name: str):
        """
        Initialize an action result.

        Args:
            success: Whether the action succeeded
            message: Description of what happened
            path: The path the action was performed on
            action_name: Name of the action that was performed
        """
        self.success = success
        self.message = message
        self.path = path
        self.action_name = action_name

    def __str__(self) -> str:
        status = "Success" if self.success else "Failed"
        return f"{status} - {self.action_name} on {self.path}: {self.message}"

__init__(success, message, path, action_name)

Initialize an action result.

Parameters:

  • success (bool, required): Whether the action succeeded
  • message (str, required): Description of what happened
  • path (Path, required): The path the action was performed on
  • action_name (str, required): Name of the action that was performed

Source code in src/katachi/schema/actions.py
def __init__(self, success: bool, message: str, path: Path, action_name: str):
    """
    Initialize an action result.

    Args:
        success: Whether the action succeeded
        message: Description of what happened
        path: The path the action was performed on
        action_name: Name of the action that was performed
    """
    self.success = success
    self.message = message
    self.path = path
    self.action_name = action_name

ActionTiming

Bases: Enum

When an action should be executed.

Source code in src/katachi/schema/actions.py
class ActionTiming(Enum):
    """When an action should be executed."""

    DURING_VALIDATION = auto()  # Run during structure validation (old behavior)
    AFTER_VALIDATION = auto()  # Run after all validation is complete (default new behavior)

process_node(node, path, parent_contexts, context=None)

Process a node by running any registered callbacks for it.

Parameters:

  • node (SchemaNode, required): Current schema node being processed
  • path (Path, required): Path being validated
  • parent_contexts (list[NodeContext], required): List of parent (node, path) tuples
  • context (Optional[dict[str, Any]], default None): Additional context data

Source code in src/katachi/schema/actions.py
def process_node(
    node: SchemaNode,
    path: Path,
    parent_contexts: list[NodeContext],
    context: Optional[dict[str, Any]] = None,
) -> None:
    """
    Process a node by running any registered callbacks for it.

    Args:
        node: Current schema node being processed
        path: Path being validated
        parent_contexts: List of parent (node, path) tuples
        context: Additional context data
    """
    context = context or {}

    # Check if there's a callback registered for this node's semantic name
    registration = ActionRegistry.get(node.semantical_name)
    if registration and registration.timing == ActionTiming.DURING_VALIDATION:
        registration.callback(node, path, parent_contexts, context)

register_action(semantical_name, callback)

Register a callback for a specific schema node semantic name.

Parameters:

Name Type Description Default
semantical_name str

The semantic name to trigger the callback for

required
callback ActionCallback

Function to call when traversing a node with this semantic name

required
Source code in src/katachi/schema/actions.py
def register_action(semantical_name: str, callback: ActionCallback) -> None:
    """
    Register a callback for a specific schema node semantic name.

    Args:
        semantical_name: The semantic name to trigger the callback for
        callback: Function to call when traversing a node with this semantic name
    """
    ActionRegistry.register(
        semantical_name,
        callback,
        timing=ActionTiming.DURING_VALIDATION,
        description=f"Legacy action for {semantical_name}",
    )

Schema Actions (katachi.schema.actions)

Register and execute actions to process files during schema traversal.

from katachi.schema.actions import register_action, NodeContext
from katachi.schema.schema_node import SchemaNode
from pathlib import Path
from typing import Any

# Define a custom action function
def process_image(
    node: SchemaNode,
    path: Path,
    parent_contexts: list[NodeContext],
    context: dict[str, Any]
) -> None:
    """Process an image file during schema traversal."""
    print(f"Processing image: {path}")

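    # Find parent timestamp directory
    # (loop names are chosen so they do not shadow the `node` and `path` arguments)
    timestamp_path = None
    for parent_node, parent_path in parent_contexts:
        if parent_node.semantical_name == "timestamp":
            timestamp_path = parent_path
            break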

    if timestamp_path:
        print(f"Image from date: {timestamp_path.name}")

    # Use context data if provided (wrap in Path so a plain string also works)
    if "target_dir" in context:
        target_path = Path(context["target_dir"]) / path.name
        print(f"Would copy to: {target_path}")

# Register the action with a semantical name
register_action("image", process_image)

Actions module for Katachi.

This module provides functionality for registering and executing callbacks when traversing the file system according to a schema.

ActionRegistration dataclass

Action registration details.

Source code in src/katachi/schema/actions.py
@dataclass
class ActionRegistration:
    """Action registration details."""

    callback: ActionCallback
    timing: ActionTiming
    description: str

ActionRegistry

Registry for file and directory actions.

Source code in src/katachi/schema/actions.py
class ActionRegistry:
    """Registry for file and directory actions."""

    # Registry of callbacks by semantic name
    _registry: ClassVar[dict[str, ActionRegistration]] = {}

    @classmethod
    def register(
        cls,
        semantical_name: str,
        callback: ActionCallback,
        timing: ActionTiming = ActionTiming.AFTER_VALIDATION,
        description: str = "",
    ) -> None:
        """
        Register a callback for a specific schema node semantic name.

        Args:
            semantical_name: The semantic name to trigger the callback for
            callback: Function to call when traversing a node with this semantic name
            timing: When the action should be executed
            description: Human-readable description of what the action does
        """
        cls._registry[semantical_name] = ActionRegistration(
            callback=callback, timing=timing, description=description or f"Action for {semantical_name}"
        )

    @classmethod
    def get(cls, semantical_name: str) -> Optional[ActionRegistration]:
        """Get a registered action by semantical name."""
        return cls._registry.get(semantical_name)

    @classmethod
    def execute_actions(
        cls,
        registry: NodeRegistry,
        context: Optional[dict[str, Any]] = None,
        timing: ActionTiming = ActionTiming.AFTER_VALIDATION,
    ) -> list[ActionResult]:
        """
        Execute all registered actions on validated nodes.

        Args:
            registry: Registry of validated nodes
            context: Additional context data
            timing: Which set of actions to execute based on timing

        Returns:
            List of action results
        """
        results = []
        context = context or {}

        # Get all semantical names from the registry
        for semantical_name, registration in cls._registry.items():
            # Skip actions that don't match the requested timing
            if registration.timing != timing:
                continue

            # Get all nodes with this semantical name
            node_contexts = registry.get_nodes_by_name(semantical_name)
            for node_ctx in node_contexts:
                try:
                    # Get parent contexts
                    parent_contexts = []
                    for parent_path in node_ctx.parent_paths:
                        parent_node_ctx = registry.get_node_by_path(parent_path)
                        if parent_node_ctx:
                            parent_contexts.append((parent_node_ctx.node, parent_node_ctx.path))

                    # Execute the action
                    registration.callback(node_ctx.node, node_ctx.path, parent_contexts, context)
                    results.append(
                        ActionResult(
                            success=True,
                            message=f"Executed {registration.description}",
                            path=node_ctx.path,
                            action_name=semantical_name,
                        )
                    )
                except Exception as e:
                    results.append(
                        ActionResult(
                            success=False,
                            message=f"Action failed: {e!s}",
                            path=node_ctx.path,
                            action_name=semantical_name,
                        )
                    )

        return results

execute_actions(registry, context=None, timing=ActionTiming.AFTER_VALIDATION) classmethod

Execute all registered actions on validated nodes.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| registry | NodeRegistry | Registry of validated nodes | required |
| context | Optional[dict[str, Any]] | Additional context data | None |
| timing | ActionTiming | Which set of actions to execute based on timing | AFTER_VALIDATION |

Returns:

| Type | Description |
| --- | --- |
| list[ActionResult] | List of action results |

Source code in src/katachi/schema/actions.py
@classmethod
def execute_actions(
    cls,
    registry: NodeRegistry,
    context: Optional[dict[str, Any]] = None,
    timing: ActionTiming = ActionTiming.AFTER_VALIDATION,
) -> list[ActionResult]:
    """
    Execute all registered actions on validated nodes.

    Args:
        registry: Registry of validated nodes
        context: Additional context data
        timing: Which set of actions to execute based on timing

    Returns:
        List of action results
    """
    results = []
    context = context or {}

    # Get all semantical names from the registry
    for semantical_name, registration in cls._registry.items():
        # Skip actions that don't match the requested timing
        if registration.timing != timing:
            continue

        # Get all nodes with this semantical name
        node_contexts = registry.get_nodes_by_name(semantical_name)
        for node_ctx in node_contexts:
            try:
                # Get parent contexts
                parent_contexts = []
                for parent_path in node_ctx.parent_paths:
                    parent_node_ctx = registry.get_node_by_path(parent_path)
                    if parent_node_ctx:
                        parent_contexts.append((parent_node_ctx.node, parent_node_ctx.path))

                # Execute the action
                registration.callback(node_ctx.node, node_ctx.path, parent_contexts, context)
                results.append(
                    ActionResult(
                        success=True,
                        message=f"Executed {registration.description}",
                        path=node_ctx.path,
                        action_name=semantical_name,
                    )
                )
            except Exception as e:
                results.append(
                    ActionResult(
                        success=False,
                        message=f"Action failed: {e!s}",
                        path=node_ctx.path,
                        action_name=semantical_name,
                    )
                )

    return results
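
The try/except around each callback means one failing action does not stop the others. The following is a minimal sketch of that pattern only: `Result`, `run_all`, and `flaky` are hypothetical stand-ins written for this illustration, not part of Katachi.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Result:
    """Hypothetical stand-in for ActionResult."""
    success: bool
    message: str
    action_name: str


def run_all(callbacks: dict[str, Callable[[str], None]], paths: list[str]) -> list[Result]:
    """Run every callback on every path, turning exceptions into failed results."""
    results = []
    for name, callback in callbacks.items():
        for path in paths:
            try:
                callback(path)
                results.append(Result(True, f"Executed {name}", name))
            except Exception as e:
                # Same idea as execute_actions: record the failure and keep going
                results.append(Result(False, f"Action failed: {e!s}", name))
    return results


def flaky(path: str) -> None:
    """Illustrative action that rejects temporary files."""
    if path.endswith(".tmp"):
        raise ValueError(f"unsupported file: {path}")


results = run_all({"image": flaky}, ["a.jpg", "b.tmp", "c.jpg"])
failures = [r for r in results if not r.success]
print(len(results), len(failures))  # 3 1
```

Callers that need to react to failures therefore have to inspect the returned list; no exception propagates out of the loop.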

get(semantical_name) classmethod

Get a registered action by semantical name.

Source code in src/katachi/schema/actions.py
@classmethod
def get(cls, semantical_name: str) -> Optional[ActionRegistration]:
    """Get a registered action by semantical name."""
    return cls._registry.get(semantical_name)

register(semantical_name, callback, timing=ActionTiming.AFTER_VALIDATION, description='') classmethod

Register a callback for a specific schema node semantic name.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| semantical_name | str | The semantic name to trigger the callback for | required |
| callback | ActionCallback | Function to call when traversing a node with this semantic name | required |
| timing | ActionTiming | When the action should be executed | AFTER_VALIDATION |
| description | str | Human-readable description of what the action does | '' |

Source code in src/katachi/schema/actions.py
@classmethod
def register(
    cls,
    semantical_name: str,
    callback: ActionCallback,
    timing: ActionTiming = ActionTiming.AFTER_VALIDATION,
    description: str = "",
) -> None:
    """
    Register a callback for a specific schema node semantic name.

    Args:
        semantical_name: The semantic name to trigger the callback for
        callback: Function to call when traversing a node with this semantic name
        timing: When the action should be executed
        description: Human-readable description of what the action does
    """
    cls._registry[semantical_name] = ActionRegistration(
        callback=callback, timing=timing, description=description or f"Action for {semantical_name}"
    )

ActionResult

Represents the result of an action execution.

Source code in src/katachi/schema/actions.py
class ActionResult:
    """Represents the result of an action execution."""

    def __init__(self, success: bool, message: str, path: Path, action_name: str):
        """
        Initialize an action result.

        Args:
            success: Whether the action succeeded
            message: Description of what happened
            path: The path the action was performed on
            action_name: Name of the action that was performed
        """
        self.success = success
        self.message = message
        self.path = path
        self.action_name = action_name

    def __str__(self) -> str:
        status = "Success" if self.success else "Failed"
        return f"{status} - {self.action_name} on {self.path}: {self.message}"
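
To illustrate the string form, this sketch re-declares a stand-in class with the same `__str__` as above rather than importing Katachi. It uses `PurePosixPath` so the separator is identical on every platform; the file names and messages are made up.

```python
from pathlib import PurePosixPath


class ActionResult:
    """Stand-in mirroring the ActionResult class shown above."""

    def __init__(self, success: bool, message: str, path: PurePosixPath, action_name: str):
        self.success = success
        self.message = message
        self.path = path
        self.action_name = action_name

    def __str__(self) -> str:
        status = "Success" if self.success else "Failed"
        return f"{status} - {self.action_name} on {self.path}: {self.message}"


ok = ActionResult(True, "Executed Action for image", PurePosixPath("data/img001.jpg"), "image")
failed = ActionResult(False, "Action failed: disk full", PurePosixPath("data/img002.jpg"), "image")

print(ok)      # Success - image on data/img001.jpg: Executed Action for image
print(failed)  # Failed - image on data/img002.jpg: Action failed: disk full
```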

__init__(success, message, path, action_name)

Initialize an action result.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| success | bool | Whether the action succeeded | required |
| message | str | Description of what happened | required |
| path | Path | The path the action was performed on | required |
| action_name | str | Name of the action that was performed | required |

Source code in src/katachi/schema/actions.py
def __init__(self, success: bool, message: str, path: Path, action_name: str):
    """
    Initialize an action result.

    Args:
        success: Whether the action succeeded
        message: Description of what happened
        path: The path the action was performed on
        action_name: Name of the action that was performed
    """
    self.success = success
    self.message = message
    self.path = path
    self.action_name = action_name

ActionTiming

Bases: Enum

When an action should be executed.

Source code in src/katachi/schema/actions.py
class ActionTiming(Enum):
    """When an action should be executed."""

    DURING_VALIDATION = auto()  # Run during structure validation (old behavior)
    AFTER_VALIDATION = auto()  # Run after all validation is complete (default new behavior)

process_node(node, path, parent_contexts, context=None)

Process a node by running any registered callbacks for it.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| node | SchemaNode | Current schema node being processed | required |
| path | Path | Path being validated | required |
| parent_contexts | list[NodeContext] | List of parent (node, path) tuples | required |
| context | Optional[dict[str, Any]] | Additional context data | None |

Source code in src/katachi/schema/actions.py
def process_node(
    node: SchemaNode,
    path: Path,
    parent_contexts: list[NodeContext],
    context: Optional[dict[str, Any]] = None,
) -> None:
    """
    Process a node by running any registered callbacks for it.

    Args:
        node: Current schema node being processed
        path: Path being validated
        parent_contexts: List of parent (node, path) tuples
        context: Additional context data
    """
    context = context or {}

    # Check if there's a callback registered for this node's semantic name
    registration = ActionRegistry.get(node.semantical_name)
    if registration and registration.timing == ActionTiming.DURING_VALIDATION:
        registration.callback(node, path, parent_contexts, context)

register_action(semantical_name, callback)

Register a callback for a specific schema node semantic name.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| semantical_name | str | The semantic name to trigger the callback for | required |
| callback | ActionCallback | Function to call when traversing a node with this semantic name | required |

Source code in src/katachi/schema/actions.py
def register_action(semantical_name: str, callback: ActionCallback) -> None:
    """
    Register a callback for a specific schema node semantic name.

    Args:
        semantical_name: The semantic name to trigger the callback for
        callback: Function to call when traversing a node with this semantic name
    """
    ActionRegistry.register(
        semantical_name,
        callback,
        timing=ActionTiming.DURING_VALIDATION,
        description=f"Legacy action for {semantical_name}",
    )

Validation Registry (katachi.validation.registry)

Track and query validated nodes across the schema hierarchy.

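from katachi.validation.registry import NodeRegistry
from katachi.schema.schema_node import SchemaFile
from pathlib import Path

# Create a registry
registry = NodeRegistry()

# Illustrative node and paths; in practice these come from the validation run
path = Path("data/01.01.2023/img001.jpg")
schema_node = SchemaFile(path=path, semantical_name="image", extension=".jpg", pattern_validation=r"img\d+")
parent_paths = [Path("data"), Path("data/01.01.2023")]

# Register nodes as they're validated
registry.register_node(schema_node, path, parent_paths)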

# Query nodes by semantical name
image_paths = registry.get_paths_by_name("image")

# Get all nodes under a specific directory
nodes = list(registry.get_nodes_under_path(Path("data/01.01.2023")))

Registry module for tracking validated nodes.

This module provides functionality for registering and querying nodes that have passed validation, to support cross-level predicate evaluation.

NodeContext

Context information about a validated node.

Source code in src/katachi/validation/registry.py
class NodeContext:
    """Context information about a validated node."""

    def __init__(self, node: SchemaNode, path: Path, parent_paths: Optional[list[Path]] = None):
        """
        Initialize a node context.

        Args:
            node: The schema node
            path: The path that was validated
            parent_paths: List of parent paths in the hierarchy
        """
        self.node = node
        self.path = path
        self.parent_paths = parent_paths or []

    def __repr__(self) -> str:
        return f"NodeContext({self.node.semantical_name}, {self.path})"

__init__(node, path, parent_paths=None)

Initialize a node context.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| node | SchemaNode | The schema node | required |
| path | Path | The path that was validated | required |
| parent_paths | Optional[list[Path]] | List of parent paths in the hierarchy | None |

Source code in src/katachi/validation/registry.py
def __init__(self, node: SchemaNode, path: Path, parent_paths: Optional[list[Path]] = None):
    """
    Initialize a node context.

    Args:
        node: The schema node
        path: The path that was validated
        parent_paths: List of parent paths in the hierarchy
    """
    self.node = node
    self.path = path
    self.parent_paths = parent_paths or []

NodeRegistry

Registry for tracking nodes that passed validation.

Source code in src/katachi/validation/registry.py
class NodeRegistry:
    """Registry for tracking nodes that passed validation."""

    def __init__(self) -> None:
        """Initialize the node registry."""
        # Dictionary mapping semantical names to lists of node contexts
        self._nodes_by_name: dict[str, list[NodeContext]] = {}
        # Dictionary mapping paths to node contexts
        self._nodes_by_path: dict[Path, NodeContext] = {}
        # Set of directories that have been processed
        self._processed_dirs: set[Path] = set()

    def register_node(self, node: SchemaNode, path: Path, parent_paths: Optional[list[Path]] = None) -> None:
        """
        Register a node that passed validation.

        Args:
            node: Schema node that was validated
            path: Path that was validated
            parent_paths: List of parent paths in the hierarchy
        """
        context = NodeContext(node, path, parent_paths)

        # Register by semantical name
        if node.semantical_name not in self._nodes_by_name:
            self._nodes_by_name[node.semantical_name] = []
        self._nodes_by_name[node.semantical_name].append(context)

        # Register by path
        self._nodes_by_path[path] = context

    def register_processed_dir(self, dir_path: Path) -> None:
        """
        Register a directory as processed.

        Args:
            dir_path: Path to the processed directory
        """
        self._processed_dirs.add(dir_path)

    def is_dir_processed(self, dir_path: Path) -> bool:
        """
        Check if a directory has been processed.

        Args:
            dir_path: Path to check

        Returns:
            True if the directory has been processed, False otherwise
        """
        return dir_path in self._processed_dirs

    def get_nodes_by_name(self, semantical_name: str) -> list[NodeContext]:
        """
        Get all nodes with a given semantical name.

        Args:
            semantical_name: The semantical name to look up

        Returns:
            List of node contexts with the given semantical name
        """
        return self._nodes_by_name.get(semantical_name, [])

    def get_node_by_path(self, path: Path) -> Optional[NodeContext]:
        """
        Get a node by its path.

        Args:
            path: The path to look up

        Returns:
            Node context for the path, or None if not found
        """
        return self._nodes_by_path.get(path)

    def get_nodes_under_path(self, base_path: Path) -> Iterator[NodeContext]:
        """
        Get all nodes under a given path.

        Args:
            base_path: The base path to filter by

        Returns:
            Iterator of node contexts under the given path
        """
        for path, context in self._nodes_by_path.items():
            try:
                if base_path in path.parents or path == base_path:
                    yield context
            except ValueError:
                # This happens if paths are on different drives
                continue

    def get_paths_by_name(self, semantical_name: str) -> list[Path]:
        """
        Get all paths with a given semantical name.

        Args:
            semantical_name: The semantical name to look up

        Returns:
            List of paths with the given semantical name
        """
        return [context.path for context in self.get_nodes_by_name(semantical_name)]

    def clear(self) -> None:
        """Clear the registry."""
        self._nodes_by_name.clear()
        self._nodes_by_path.clear()
        self._processed_dirs.clear()

    def __str__(self) -> str:
        return f"NodeRegistry with {len(self._nodes_by_path)} nodes of {len(self._nodes_by_name)} types"

__init__()

Initialize the node registry.

Source code in src/katachi/validation/registry.py
def __init__(self) -> None:
    """Initialize the node registry."""
    # Dictionary mapping semantical names to lists of node contexts
    self._nodes_by_name: dict[str, list[NodeContext]] = {}
    # Dictionary mapping paths to node contexts
    self._nodes_by_path: dict[Path, NodeContext] = {}
    # Set of directories that have been processed
    self._processed_dirs: set[Path] = set()

clear()

Clear the registry.

Source code in src/katachi/validation/registry.py
def clear(self) -> None:
    """Clear the registry."""
    self._nodes_by_name.clear()
    self._nodes_by_path.clear()
    self._processed_dirs.clear()

get_node_by_path(path)

Get a node by its path.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| path | Path | The path to look up | required |

Returns:

| Type | Description |
| --- | --- |
| Optional[NodeContext] | Node context for the path, or None if not found |

Source code in src/katachi/validation/registry.py
def get_node_by_path(self, path: Path) -> Optional[NodeContext]:
    """
    Get a node by its path.

    Args:
        path: The path to look up

    Returns:
        Node context for the path, or None if not found
    """
    return self._nodes_by_path.get(path)

get_nodes_by_name(semantical_name)

Get all nodes with a given semantical name.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| semantical_name | str | The semantical name to look up | required |

Returns:

| Type | Description |
| --- | --- |
| list[NodeContext] | List of node contexts with the given semantical name |

Source code in src/katachi/validation/registry.py
def get_nodes_by_name(self, semantical_name: str) -> list[NodeContext]:
    """
    Get all nodes with a given semantical name.

    Args:
        semantical_name: The semantical name to look up

    Returns:
        List of node contexts with the given semantical name
    """
    return self._nodes_by_name.get(semantical_name, [])

get_nodes_under_path(base_path)

Get all nodes under a given path.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| base_path | Path | The base path to filter by | required |

Returns:

| Type | Description |
| --- | --- |
| Iterator[NodeContext] | Iterator of node contexts under the given path |

Source code in src/katachi/validation/registry.py
def get_nodes_under_path(self, base_path: Path) -> Iterator[NodeContext]:
    """
    Get all nodes under a given path.

    Args:
        base_path: The base path to filter by

    Returns:
        Iterator of node contexts under the given path
    """
    for path, context in self._nodes_by_path.items():
        try:
            if base_path in path.parents or path == base_path:
                yield context
        except ValueError:
            # This happens if paths are on different drives
            continue
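
The membership test `base_path in path.parents or path == base_path` compares whole path components, not string prefixes. A small standalone check of that expression, using `PurePosixPath` and made-up file names so it behaves the same on every platform:

```python
from pathlib import PurePosixPath

base = PurePosixPath("data/01.01.2023")
candidates = [
    PurePosixPath("data/01.01.2023"),                     # the base itself
    PurePosixPath("data/01.01.2023/img001.jpg"),          # direct child
    PurePosixPath("data/01.01.2023/raw/img002.jpg"),      # nested descendant
    PurePosixPath("data/01.01.2023_backup/img003.jpg"),   # shares a string prefix only
    PurePosixPath("data"),                                # ancestor of the base
    PurePosixPath("/srv/data/01.01.2023/img004.jpg"),     # absolute form of a similar path
]

# Same condition as in get_nodes_under_path
under = [p for p in candidates if base in p.parents or p == base]
print([str(p) for p in under])
```

Only the first three candidates match. The last one shows that a relative base does not match absolute paths, so the registry should be queried with paths in the same form they were registered in.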

get_paths_by_name(semantical_name)

Get all paths with a given semantical name.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| semantical_name | str | The semantical name to look up | required |

Returns:

| Type | Description |
| --- | --- |
| list[Path] | List of paths with the given semantical name |

Source code in src/katachi/validation/registry.py
def get_paths_by_name(self, semantical_name: str) -> list[Path]:
    """
    Get all paths with a given semantical name.

    Args:
        semantical_name: The semantical name to look up

    Returns:
        List of paths with the given semantical name
    """
    return [context.path for context in self.get_nodes_by_name(semantical_name)]

is_dir_processed(dir_path)

Check if a directory has been processed.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| dir_path | Path | Path to check | required |

Returns:

| Type | Description |
| --- | --- |
| bool | True if the directory has been processed, False otherwise |

Source code in src/katachi/validation/registry.py
def is_dir_processed(self, dir_path: Path) -> bool:
    """
    Check if a directory has been processed.

    Args:
        dir_path: Path to check

    Returns:
        True if the directory has been processed, False otherwise
    """
    return dir_path in self._processed_dirs

register_node(node, path, parent_paths=None)

Register a node that passed validation.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| node | SchemaNode | Schema node that was validated | required |
| path | Path | Path that was validated | required |
| parent_paths | Optional[list[Path]] | List of parent paths in the hierarchy | None |

Source code in src/katachi/validation/registry.py
def register_node(self, node: SchemaNode, path: Path, parent_paths: Optional[list[Path]] = None) -> None:
    """
    Register a node that passed validation.

    Args:
        node: Schema node that was validated
        path: Path that was validated
        parent_paths: List of parent paths in the hierarchy
    """
    context = NodeContext(node, path, parent_paths)

    # Register by semantical name
    if node.semantical_name not in self._nodes_by_name:
        self._nodes_by_name[node.semantical_name] = []
    self._nodes_by_name[node.semantical_name].append(context)

    # Register by path
    self._nodes_by_path[path] = context
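
Note that the two indexes behave differently: the by-name index appends, while the by-path index assigns. The sketch below reproduces just that bookkeeping with plain dictionaries and tuples (stand-ins for the registry's internals and for `NodeContext`; the names and paths are illustrative).

```python
from pathlib import PurePosixPath

# Stand-ins for NodeRegistry._nodes_by_name and NodeRegistry._nodes_by_path
nodes_by_name: dict[str, list[tuple[str, PurePosixPath]]] = {}
nodes_by_path: dict[PurePosixPath, tuple[str, PurePosixPath]] = {}


def register_node(semantical_name: str, path: PurePosixPath) -> None:
    """Append to the by-name index, overwrite in the by-path index."""
    context = (semantical_name, path)
    nodes_by_name.setdefault(semantical_name, []).append(context)
    nodes_by_path[path] = context


register_node("image", PurePosixPath("data/img001.jpg"))
register_node("image", PurePosixPath("data/img002.jpg"))
register_node("thumbnail", PurePosixPath("data/img001.jpg"))  # same path, different name

print(len(nodes_by_name["image"]))                        # 2
print(len(nodes_by_path))                                 # 2
print(nodes_by_path[PurePosixPath("data/img001.jpg")][0])  # thumbnail
```

If two schema nodes validate the same path, a lookup by path returns only the most recently registered one, while lookups by name still return both.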

register_processed_dir(dir_path)

Register a directory as processed.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| dir_path | Path | Path to the processed directory | required |

Source code in src/katachi/validation/registry.py
def register_processed_dir(self, dir_path: Path) -> None:
    """
    Register a directory as processed.

    Args:
        dir_path: Path to the processed directory
    """
    self._processed_dirs.add(dir_path)

Validation Core (katachi.validation.core)

Core validation components for creating custom validators.

from katachi.validation.core import ValidationResult, ValidationReport, ValidatorRegistry
from katachi.schema.schema_node import SchemaNode
from pathlib import Path

# Define a custom validator
def image_dimensions_validator(node: SchemaNode, path: Path):
    """Check if image dimensions meet requirements."""
    from PIL import Image

    try:
        with Image.open(path) as img:
            width, height = img.size

            # Check if size meets requirements
            min_width = node.metadata.get("min_width", 0)
            min_height = node.metadata.get("min_height", 0)

            if width < min_width:
                return ValidationResult(
                    is_valid=False,
                    message=f"Image width ({width}px) is less than minimum ({min_width}px)",
                    path=path,
                    validator_name="image_dimensions"
                )

            if height < min_height:
                return ValidationResult(
                    is_valid=False,
                    message=f"Image height ({height}px) is less than minimum ({min_height}px)",
                    path=path,
                    validator_name="image_dimensions"
                )

            return ValidationResult(
                is_valid=True,
                message="Image dimensions are valid",
                path=path,
                validator_name="image_dimensions"
            )
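    except Exception as e:
        # Catch Exception rather than using a bare except, so that
        # KeyboardInterrupt and SystemExit are not swallowed
        return ValidationResult(
            is_valid=False,
            message=f"Failed to open image file: {e}",
            path=path,
            validator_name="image_dimensions"
        )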

# Register the custom validator
ValidatorRegistry.register("image_dimensions", image_dimensions_validator)

ValidationReport

Collection of validation results with formatted output.

Source code in src/katachi/validation/core.py
class ValidationReport:
    """Collection of validation results with formatted output."""

    def __init__(self) -> None:
        self.results: list[ValidationResult] = []
        self.context: dict[str, Any] = {}

    def add_result(self, result: ValidationResult) -> None:
        self.results.append(result)

    def add_results(self, results: list[ValidationResult]) -> None:
        self.results.extend(results)

    def is_valid(self) -> bool:
        return all(result.is_valid for result in self.results)

    def format_report(self) -> str:
        """Format validation results into human-readable output."""
        if self.is_valid():
            return "All validations passed successfully!"

        failures = [r for r in self.results if not r.is_valid]
        report_lines = ["Validation failed with the following issues:"]

        for failure in failures:
            report_lines.append(f"❌ {failure.path}: {failure.message}")

        return "\n".join(report_lines)

format_report()

Format validation results into human-readable output.

Source code in src/katachi/validation/core.py
def format_report(self) -> str:
    """Format validation results into human-readable output."""
    if self.is_valid():
        return "All validations passed successfully!"

    failures = [r for r in self.results if not r.is_valid]
    report_lines = ["Validation failed with the following issues:"]

    for failure in failures:
        report_lines.append(f"❌ {failure.path}: {failure.message}")

    return "\n".join(report_lines)
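
As a rough illustration of the output, this sketch applies the same formatting logic to a hand-built list of results. The `ValidationResult` stand-in carries only the fields the formatter reads, and the messages and paths are invented.

```python
from dataclasses import dataclass


@dataclass
class ValidationResult:
    """Stand-in with only the fields format_report reads."""
    is_valid: bool
    message: str
    path: str


def format_report(results: list[ValidationResult]) -> str:
    """Same formatting logic as ValidationReport.format_report above."""
    if all(r.is_valid for r in results):
        return "All validations passed successfully!"
    lines = ["Validation failed with the following issues:"]
    lines.extend(f"❌ {r.path}: {r.message}" for r in results if not r.is_valid)
    return "\n".join(lines)


results = [
    ValidationResult(True, "Extension matches", "data/img001.jpg"),
    ValidationResult(False, "Missing metadata file", "data/img002.jpg"),
]
report = format_report(results)
print(report)
# Validation failed with the following issues:
# ❌ data/img002.jpg: Missing metadata file
```

Passing results are left out of the failure listing, and because `all()` of an empty sequence is true, a report with no results at all is reported as successful.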

ValidationResult dataclass

Result of a validation check with detailed information.

Source code in src/katachi/validation/core.py
@dataclass
class ValidationResult:
    """Result of a validation check with detailed information."""

    is_valid: bool
    message: str
    path: Path
    validator_name: str
    context: Optional[dict[str, Any]] = None

    def __bool__(self) -> bool:
        """Allow using validation result in boolean contexts."""
        return self.is_valid

__bool__()

Allow using validation result in boolean contexts.

Source code in src/katachi/validation/core.py
def __bool__(self) -> bool:
    """Allow using validation result in boolean contexts."""
    return self.is_valid
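
Because of `__bool__`, results can be passed straight to `all()` or used in a filter. A minimal sketch with a stand-in dataclass that copies the fields shown above (the paths and validator names are illustrative):

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Any, Optional


@dataclass
class ValidationResult:
    """Stand-in mirroring the dataclass shown above."""
    is_valid: bool
    message: str
    path: PurePosixPath
    validator_name: str
    context: Optional[dict[str, Any]] = None

    def __bool__(self) -> bool:
        return self.is_valid


results = [
    ValidationResult(True, "ok", PurePosixPath("data/img001.jpg"), "extension"),
    ValidationResult(False, "too small", PurePosixPath("data/img002.jpg"), "image_dimensions"),
]

all_passed = all(results)                                # truthiness follows is_valid
failed_paths = [str(r.path) for r in results if not r]   # keep only failing results
print(all_passed, failed_paths)  # False ['data/img002.jpg']
```

One caveat: `if not result` is also true when `result` is `None`, so code that may receive "no result" should test `result is None` separately.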

ValidatorRegistry

Registry for custom validators.

Source code in src/katachi/validation/core.py
class ValidatorRegistry:
    """Registry for custom validators."""

    # Dictionary of validator functions by name
    _validators: ClassVar[dict[str, Callable]] = {}

    @classmethod
    def register(cls, name: str, validator_func: Callable) -> None:
        """Register a new validator function."""
        cls._validators[name] = validator_func

    @classmethod
    def get_validator(cls, name: str) -> Optional[Callable]:
        """Get a registered validator function by name."""
        return cls._validators.get(name)

    @classmethod
    def run_validators(cls, node: SchemaNode, path: Path) -> list[ValidationResult]:
        """Run all registered validators for a given node and path."""
        results = []

        for name, validator_func in cls._validators.items():
            try:
                result = validator_func(node, path)
                if isinstance(result, ValidationResult):
                    results.append(result)
                elif isinstance(result, list):
                    results.extend([r for r in result if isinstance(r, ValidationResult)])
            except Exception as e:
                # Ensure validator failures don't crash the entire validation
                results.append(
                    ValidationResult(
                        is_valid=False,
                        message=f"Validator '{name}' failed with error: {e!s}",
                        path=path,
                        validator_name=name,
                    )
                )

        return results

get_validator(name) classmethod

Get a registered validator function by name.

Source code in src/katachi/validation/core.py
@classmethod
def get_validator(cls, name: str) -> Optional[Callable]:
    """Get a registered validator function by name."""
    return cls._validators.get(name)

register(name, validator_func) classmethod

Register a new validator function.

Source code in src/katachi/validation/core.py
@classmethod
def register(cls, name: str, validator_func: Callable) -> None:
    """Register a new validator function."""
    cls._validators[name] = validator_func

run_validators(node, path) classmethod

Run all registered validators for a given node and path.

Source code in src/katachi/validation/core.py
@classmethod
def run_validators(cls, node: SchemaNode, path: Path) -> list[ValidationResult]:
    """Run all registered validators for a given node and path."""
    results = []

    for name, validator_func in cls._validators.items():
        try:
            result = validator_func(node, path)
            if isinstance(result, ValidationResult):
                results.append(result)
            elif isinstance(result, list):
                results.extend([r for r in result if isinstance(r, ValidationResult)])
        except Exception as e:
            # Ensure validator failures don't crash the entire validation
            results.append(
                ValidationResult(
                    is_valid=False,
                    message=f"Validator '{name}' failed with error: {e!s}",
                    path=path,
                    validator_name=name,
                )
            )

    return results
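The dispatch rules in run_validators can be easier to follow with a small experiment. The sketch below does not import Katachi: it uses stand-in ValidationResult and ValidatorRegistry classes that copy only the logic shown in the listing above, and the four validator functions are hypothetical. It illustrates three points: a single result is appended, a list is filtered down to its ValidationResult items, and an exception is turned into a failed result instead of aborting the run.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Callable, ClassVar


@dataclass
class ValidationResult:
    """Stand-in holding only the fields used in the listing above."""

    is_valid: bool
    message: str
    path: Path
    validator_name: str


class ValidatorRegistry:
    """Stand-in that copies the dispatch logic of run_validators."""

    _validators: ClassVar[dict[str, Callable]] = {}

    @classmethod
    def register(cls, name: str, validator_func: Callable) -> None:
        cls._validators[name] = validator_func

    @classmethod
    def run_validators(cls, node: Any, path: Path) -> list[ValidationResult]:
        results = []
        for name, validator_func in cls._validators.items():
            try:
                result = validator_func(node, path)
                if isinstance(result, ValidationResult):
                    results.append(result)
                elif isinstance(result, list):
                    results.extend(r for r in result if isinstance(r, ValidationResult))
            except Exception as e:
                # A crashing validator becomes a failed result
                results.append(
                    ValidationResult(False, f"Validator '{name}' failed with error: {e!s}", path, name)
                )
        return results


# Hypothetical validators covering each branch
def single(node, path):
    return ValidationResult(True, "ok", path, "single")


def listed(node, path):
    # The string is dropped because it is not a ValidationResult
    return [ValidationResult(True, "first", path, "listed"), "not a result"]


def crashes(node, path):
    raise RuntimeError("boom")


def returns_none(node, path):
    return None  # neither a result nor a list, so it contributes nothing


for func in (single, listed, crashes, returns_none):
    ValidatorRegistry.register(func.__name__, func)

results = ValidatorRegistry.run_validators(node=None, path=Path("data/img001.jpg"))
for r in results:
    print(r.validator_name, r.is_valid, r.message)
```

Because validators are stored in a dictionary keyed by name, registering a second function under an existing name replaces the first one.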

Command Line Interface (katachi.cli)

Katachi provides a convenient command-line interface for validating directory structures.

# Basic validation
katachi validate schema.yaml target_directory

# Detailed reporting
katachi validate schema.yaml target_directory --detail-report

# Execute actions during validation
katachi validate schema.yaml target_directory --execute-actions

# Provide custom context for actions
katachi validate schema.yaml target_directory --execute-actions --context '{"target_dir": "output"}'
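When the context is built by a script, quoting the JSON by hand is error-prone. The following sketch uses only the standard library to assemble the same command as the last example above; the context dictionary is the one from that example, and nothing here calls Katachi itself.

```python
import json
import shlex

# Context data for actions, as in the CLI example above
context = {"target_dir": "output"}

# Argument vector: one list item per argument, no manual quoting needed
argv = [
    "katachi",
    "validate",
    "schema.yaml",
    "target_directory",
    "--execute-actions",
    "--context",
    json.dumps(context),
]

# shlex.join quotes each argument for a POSIX shell
command_line = shlex.join(argv)
print(command_line)
# katachi validate schema.yaml target_directory --execute-actions --context '{"target_dir": "output"}'
```

With Katachi installed, the argv list could be passed directly to subprocess.run, which avoids shell quoting altogether.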

describe(schema_path, target_path)

Describes the schema of a directory structure.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| schema_path | Path | Path to the schema.yaml file | required |
| target_path | Path | Path to the directory to describe | required |

Returns:

| Type | Description |
| --- | --- |
| None | None |

Source code in src/katachi/cli.py
@app.command()
def describe(schema_path: Path, target_path: Path) -> None:
    """
    Describes the schema of a directory structure.

    Args:
        schema_path: Path to the schema.yaml file
        target_path: Path to the directory to describe

    Returns:
        None
    """
    console.print(f"Describing schema [bold cyan]{schema_path}[/] for directory [bold cyan]{target_path}[/]")

    try:
        # Load the schema
        schema = load_yaml(schema_path, target_path)
        console.print(Panel(str(schema), title="Schema Description", border_style="blue", expand=False))
    except Exception as e:
        console.print(Panel(f"Failed to describe schema: {e!s}", title="Error", border_style="red", expand=False))

validate(schema_path, target_path, detail_report=typer.Option(False, '--detail-report', help='Show detailed validation report'), execute_actions=typer.Option(False, '--execute-actions', help='Execute actions during/after validation'), context_json=typer.Option(None, '--context', help='JSON string with context data for actions'))

Validates a directory structure against a schema.yaml file.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| schema_path | Path | Path to the schema.yaml file | required |
| target_path | Path | Path to the directory to validate | required |
| detail_report | bool | Whether to show a detailed validation report | Option(False, '--detail-report', help='Show detailed validation report') |
| execute_actions | bool | Whether to execute registered actions | Option(False, '--execute-actions', help='Execute actions during/after validation') |
| context_json | str | JSON string with context data for actions | Option(None, '--context', help='JSON string with context data for actions') |

Returns:

| Type | Description |
| --- | --- |
| None | None |

Source code in src/katachi/cli.py
@app.command()
def validate(
    schema_path: Path,
    target_path: Path,
    detail_report: bool = typer.Option(False, "--detail-report", help="Show detailed validation report"),
    execute_actions: bool = typer.Option(False, "--execute-actions", help="Execute actions during/after validation"),
    context_json: str = typer.Option(None, "--context", help="JSON string with context data for actions"),
) -> None:
    """
    Validates a directory structure against a schema.yaml file.

    Args:
        schema_path: Path to the schema.yaml file
        target_path: Path to the directory to validate
        detail_report: Whether to show a detailed validation report
        execute_actions: Whether to execute registered actions
        context_json: JSON string with context data for actions

    Returns:
        None
    """
    console.print(f"Validating schema [bold cyan]{schema_path}[/] against directory [bold cyan]{target_path}[/]")

    # Load the schema
    schema = _load_schema(schema_path, target_path)
    if schema is None:
        console.print(Panel("Failed to load schema!", title="Error", border_style="red", expand=False))
        return

    # Parse context JSON if provided
    context = None
    if context_json:
        try:
            import json

            context = json.loads(context_json)
        except json.JSONDecodeError:
            console.print(Panel("Invalid JSON in context parameter", title="Error", border_style="red", expand=False))
            return

    # Validate the directory structure against the schema
    validation_report = SchemaValidator.validate_schema(
        schema, target_path, execute_actions=execute_actions, context=context
    )

    # Display the results
    _display_validation_results(validation_report, detail_report)

    # Stop here if validation failed (this returns normally; no non-zero exit code is set)
    if not validation_report.is_valid():
        return
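The context handling in validate boils down to three cases: no context, valid JSON, and invalid JSON. The sketch below isolates that step in a hypothetical helper named parse_context, which is not part of Katachi; it raises the decoding error where the CLI listing prints an error panel and returns.

```python
import json
from typing import Any, Optional


def parse_context(context_json: Optional[str]) -> Optional[Any]:
    """Hypothetical helper mirroring the --context parsing step."""
    # None or an empty string means no context was supplied
    if not context_json:
        return None
    # Raises json.JSONDecodeError on malformed input
    return json.loads(context_json)


print(parse_context(None))                         # None
print(parse_context('{"target_dir": "output"}'))   # {'target_dir': 'output'}

try:
    parse_context("{target_dir: output}")
except json.JSONDecodeError as e:
    print(f"Invalid JSON in context parameter: {e.msg}")
```

Any valid JSON value parses successfully, including a list or a number, so a caller that expects a mapping may want to check the type of the result.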

Extending Katachi

Custom Validators

You can extend Katachi with custom validators to handle specific validation requirements.

import re
from pathlib import Path
from typing import Union

from katachi.schema.schema_node import SchemaNode
from katachi.validation.core import ValidationResult, ValidatorRegistry

# Define a custom validator
def file_content_validator(node: SchemaNode, path: Path) -> Union[ValidationResult, list[ValidationResult]]:
    """Check file content against the regex stored in the node's metadata."""
    # Only apply to files whose node carries a content_pattern in its metadata;
    # getattr guards against nodes that have no metadata attribute
    content_pattern = (getattr(node, "metadata", None) or {}).get("content_pattern")
    if not content_pattern or not path.is_file():
        return []

    try:
        # Read the file content and search it for the pattern
        content = path.read_text(encoding="utf-8")
        matched = re.search(content_pattern, content) is not None
    except (OSError, UnicodeDecodeError, re.error) as e:
        # Unreadable file or invalid regex: report a failed result
        return ValidationResult(
            is_valid=False,
            message=f"Error validating file content: {e!s}",
            path=path,
            validator_name="content_pattern",
        )

    return ValidationResult(
        is_valid=matched,
        message=(
            "File content matches pattern"
            if matched
            else f"File content doesn't match pattern: {content_pattern}"
        ),
        path=path,
        validator_name="content_pattern",
    )

# Register the validator
ValidatorRegistry.register("content_pattern", file_content_validator)

Custom Predicates

The predicate system can be extended with new types of relationship validation.

Types of predicates currently supported:

| Predicate Type | Description |
| --- | --- |
| pair_comparison | Ensures files with the same base names exist across different element types |

To implement other predicate types, extend the validate_predicate method in the SchemaValidator class.
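To make the pair_comparison idea concrete, here is a minimal sketch of the underlying check, assuming that "base name" means the file name without its extension. The function unmatched_stems and the sample paths are hypothetical and are not taken from Katachi's implementation; a real predicate would operate on the elements collected during structural validation.

```python
from pathlib import Path


def unmatched_stems(first: list[Path], second: list[Path]) -> dict[str, set[str]]:
    """Return base names present in one group but missing from the other."""
    first_stems = {p.stem for p in first}
    second_stems = {p.stem for p in second}
    return {
        "missing_from_second": first_stems - second_stems,
        "missing_from_first": second_stems - first_stems,
    }


# Two element types, as in the image/metadata schema example
images = [Path("data/img001.jpg"), Path("data/img002.jpg")]
metadata = [Path("data/img001.json"), Path("data/img003.json")]

report = unmatched_stems(images, metadata)
print(report["missing_from_second"])  # {'img002'}: image without metadata
print(report["missing_from_first"])   # {'img003'}: metadata without image

# The predicate passes only when both sets are empty
is_valid = not any(report.values())
print(is_valid)  # False
```

A new predicate type would follow the same pattern: gather the paths for each named element, compute the relationship, and report the paths that break it.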