Create a custom ontology

Introduction

Solidipes uses Pydantic models to define ontologies for datasets and their entities (files and groups of files). It reads the RO-Crate metadata file to build the ontological representation of the dataset and to validate it. Pydantic attribute validation (such as the presence of required fields) determines the ontology class of each entity. Custom validators then check more specific constraints on the detected entities (e.g. value ranges or the presence of specific data).
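
As an illustration of class detection by required fields, here is a minimal sketch that uses plain Pydantic only. The `Table` and `Image` models and the `classify` helper are hypothetical stand-ins, not part of the Solidipes API; the actual matching logic in Solidipes may differ.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class Table(BaseModel):
    """Hypothetical entity class: requires a list of column names."""

    column_names: list[str]


class Image(BaseModel):
    """Hypothetical entity class: requires pixel dimensions."""

    width: int
    height: int


def classify(
    metadata: dict, candidates: list[type[BaseModel]]
) -> Optional[type[BaseModel]]:
    """Return the first candidate class that accepts the metadata (assumed strategy)."""
    for cls in candidates:
        try:
            cls.model_validate(metadata)
        except ValidationError:
            # A required field is missing or has the wrong type: try the next class.
            continue
        return cls
    return None


matched = classify({"column_names": ["time", "force"]}, [Image, Table])
print(matched.__name__ if matched else None)  # prints "Table"
```

The metadata lacks `width` and `height`, so `Image` is rejected and `Table` is the first class whose required fields are all present.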

Custom metadata can be written to the RO-Crate metadata file by applying the @File.rocrate_metadata decorator to DataContainer loadable attributes. This metadata is then used to classify entities into specific ontology classes. See the Table loader and the corresponding Table ontology class for an example of writing custom metadata and using it for ontology classification and validation.

Example

Below is an example of a custom ontology, taken from the solid mechanics Solidipes plugin. See the inline comments for explanations.

from pydantic import Field, model_validator
from solidipes.ontologies.solidipes import File, build_ontology_model
from solidipes.ontologies.solidipes import entity_classes as base_entity_classes
from solidipes.utils.utils import string_to_regex_pattern as regexify


# Custom ontology class derived from the generic "File" ontology entity

class Mesh(File):
    """Mesh file entity."""

    # An entity will match this class if the ontology_class field matches
    # the following pattern, or matches the name of the class directly
    # ("Mesh" in this case).
    ontology_class: str | None = Field(
        default=None,
        pattern=regexify(
            "solidipes_solid_mech_plugin.loaders.pyvista_mesh.PyvistaMesh"
        )
    )

    # It will match only if all the following fields are present:
    cell_data_names: list[str]
    point_data_names: list[str]


# Two subclasses of Mesh, with different validation rules

class InputMesh(Mesh):
    """Mesh file entity with no fields."""

    cell_data_names: list[str] = Field(default_factory=list, min_length=0, max_length=0)
    point_data_names: list[str] = Field(default_factory=list, min_length=0, max_length=0)


class OutputMesh(Mesh):
    """Mesh file entity with at least one field."""

    cell_data_names: list[str]
    point_data_names: list[str]

    # To implement custom validation checks that run after the ontology
    # class has been assigned, use the `@model_validator(mode="after")`
    # decorator.
    @model_validator(mode="after")
    def _check_at_least_one_field(self):
        if not self.cell_data_names and not self.point_data_names:
            raise ValueError(
                "OutputMesh requires at least one field in cell_data_names or point_data_names."
            )
        return self


custom_entity_classes = [
    InputMesh,
    OutputMesh,
]

entity_classes = custom_entity_classes + base_entity_classes

# This line is needed to build the ontology model used by Solidipes.
# Write your own `build_ontology_model` function if you need to customize
# the structure of a dataset for your field of study.
ROCrateMetadata = build_ontology_model(entity_classes)
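
To see how these constraints separate the two subclasses, the sketch below reproduces them on a plain Pydantic `BaseModel` instead of the Solidipes `File` class, so that it runs without Solidipes installed. The `accepts` helper is hypothetical, and the `ontology_class` pattern is omitted for brevity.

```python
from pydantic import BaseModel, Field, ValidationError, model_validator


class Mesh(BaseModel):  # stand-in for the File-derived class (assumption)
    cell_data_names: list[str]
    point_data_names: list[str]


class InputMesh(Mesh):
    # max_length=0 only accepts empty lists
    cell_data_names: list[str] = Field(default_factory=list, max_length=0)
    point_data_names: list[str] = Field(default_factory=list, max_length=0)


class OutputMesh(Mesh):
    @model_validator(mode="after")
    def _check_at_least_one_field(self):
        if not self.cell_data_names and not self.point_data_names:
            raise ValueError("at least one field name is required")
        return self


def accepts(cls: type[BaseModel], metadata: dict) -> bool:
    """Hypothetical helper: True if the metadata validates against cls."""
    try:
        cls.model_validate(metadata)
    except ValidationError:
        return False
    return True


empty = {"cell_data_names": [], "point_data_names": []}
with_fields = {"cell_data_names": ["stress"], "point_data_names": []}

for name, metadata in [("empty", empty), ("with_fields", with_fields)]:
    print(name, accepts(InputMesh, metadata), accepts(OutputMesh, metadata))
```

Metadata with two empty lists validates only as `InputMesh`, while metadata naming at least one field validates only as `OutputMesh`, so each mesh entity matches one of the two classes.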

The corresponding loader must write the appropriate RO-Crate metadata fields using the @File.rocrate_metadata decorator:

class PyvistaMesh(File):
    ...

    @File.rocrate_metadata
    def point_data_names(self):
        return self.pyvista_mesh.point_data.keys()

    @File.rocrate_metadata
    def cell_data_names(self):
        return self.pyvista_mesh.cell_data.keys()