Code Overview
The Solidipes curation process relies on the following classes:
Scanner: Defines a set of methods that scan directories, build a tree of files (or groups of files), and allow functions to be applied to the elements of the tree. One typical use case is to try to load each element of the tree using one of the DataContainer classes.
DataContainer: Also referred to as Loader. Defines a container for data that loads it on demand and applies checks to validate the data during the curation process. One important subclass is the File class, which is used to load files from disk (identified by their MIME type and extension) and allows caching of computed information about the files. A DataContainer also lists a set of compatible Viewer classes.
Viewer: Defines a viewer for compatible DataContainer classes. It is used to display data in various backends (e.g. terminal, Jupyter notebook, Streamlit).
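The relationship between these three roles can be illustrated with a minimal, self-contained sketch. It is not the Solidipes implementation: the class names mirror the ones above, but every method, attribute, and signature here (`get_tree`, `apply`, `is_valid`, `compatible_viewers`, `render`, `TextViewer`, `demo`) is an assumption made for illustration only.

```python
import mimetypes
import os
import tempfile


class DataContainer:
    """Toy container: loads its data on first access and can validate it."""

    compatible_viewers = []  # hypothetical: viewer classes able to display this data

    def __init__(self):
        self._data = None

    def _load(self):
        raise NotImplementedError

    @property
    def data(self):
        # Load lazily, then keep the result so later accesses are free.
        if self._data is None:
            self._data = self._load()
        return self._data

    def is_valid(self):
        # Simplest possible check: the data is valid if it can be loaded.
        try:
            self.data
        except Exception:
            return False
        return True


class TextViewer:
    """Toy viewer: a real one would target a terminal, notebook, or web backend."""

    def __init__(self, container):
        self.container = container

    def render(self):
        return self.container.data


class File(DataContainer):
    """Toy file container, described by its path and guessed MIME type."""

    compatible_viewers = [TextViewer]

    def __init__(self, path):
        super().__init__()
        self.path = path
        self.mime_type = mimetypes.guess_type(path)[0]

    def _load(self):
        with open(self.path, encoding="utf-8") as f:
            return f.read()


class Scanner:
    """Toy scanner: builds a nested dict of File objects and maps functions over it."""

    def __init__(self, root):
        self.root = root

    def get_tree(self):
        tree = {}
        for entry in sorted(os.listdir(self.root)):
            path = os.path.join(self.root, entry)
            tree[entry] = Scanner(path).get_tree() if os.path.isdir(path) else File(path)
        return tree

    def apply(self, func, tree=None):
        # Apply func to every leaf, preserving the directory structure.
        tree = self.get_tree() if tree is None else tree
        return {
            name: self.apply(func, node) if isinstance(node, dict) else func(node)
            for name, node in tree.items()
        }


def demo():
    """Scan a temporary directory and try to load every file in it."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "sub"))
        with open(os.path.join(root, "a.txt"), "w", encoding="utf-8") as f:
            f.write("hello")
        with open(os.path.join(root, "sub", "b.bin"), "wb") as f:
            f.write(b"\xff\xfe")  # not valid UTF-8, so loading fails
        return Scanner(root).apply(lambda container: container.is_valid())
```

Calling `demo()` yields a nested dictionary that mirrors the directory layout and records, for each file, whether it could be loaded. This is the "try to load each element of the tree" use case in miniature.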
Other dataset management features of Solidipes rely on the following classes:
Downloader: Defines a script that is callable from the command solidipes download.
Uploader: Defines a script that is callable from the command solidipes upload.
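One common way to expose classes like these as subcommands is sketched below using the standard library's argparse. This is an assumption about the general pattern, not a description of how Solidipes wires its command line: the `command` attribute, the `run` method, the `target` argument, and the returned strings are all hypothetical.

```python
import argparse


class Downloader:
    command = "download"  # hypothetical: subcommand name this class answers to

    def run(self, args):
        # Placeholder action; a real downloader would fetch a dataset here.
        return f"would download: {args.target}"


class Uploader:
    command = "upload"  # hypothetical: subcommand name this class answers to

    def run(self, args):
        # Placeholder action; a real uploader would publish a dataset here.
        return f"would upload: {args.target}"


def build_parser():
    parser = argparse.ArgumentParser(prog="solidipes")
    subparsers = parser.add_subparsers(dest="command", required=True)
    for cls in (Downloader, Uploader):
        sub = subparsers.add_parser(cls.command)
        sub.add_argument("target")  # hypothetical positional argument
        sub.set_defaults(handler=cls().run)
    return parser


def main(argv):
    # Parse the arguments and dispatch to the class bound to the subcommand.
    args = build_parser().parse_args(argv)
    return args.handler(args)
```

With this layout, `main(["download", "some-id"])` dispatches to `Downloader.run`, and adding a new subcommand only requires adding another class to the tuple in `build_parser`.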