Documentation tidy-up.

dgbowl · Mar 27, 2024 · c2aa25d
1 parent 6fb4685

Showing 9 changed files with 45 additions and 44 deletions.
diff --git a/docs/source/conf.py b/docs/source/conf.py
@@ -20,7 +20,7 @@
 # -- Project information -----------------------------------------------------
 
 project = "yadg"
-copyright = "2021 - 2023, yadg authors"
+copyright = "2021 - 2024, yadg authors"
 author = "Peter Kraus"
 release = version
 
@@ -39,6 +39,7 @@
     "sphinx_autodoc_typehints",
     "sphinx_rtd_theme",
     "sphinxcontrib.autodoc_pydantic",
+    "sphinxcontrib.mermaid",
 ]
 
 # Add any paths that contain templates here, relative to this directory.

diff --git a/docs/source/extractors.rst b/docs/source/extractors.rst
@@ -2,10 +2,14 @@
    :maxdepth: 1
    :caption: yadg extractors
    :hidden:
+   :glob:
 
-   apidoc/yadg.extractors.agilentch
-   apidoc/yadg.extractors.agilentdx
-   apidoc/yadg.extractors.eclabmpr
-   apidoc/yadg.extractors.eclabmpt
-   apidoc/yadg.extractors.panalyticalxrdml
-   apidoc/yadg.extractors.phispe
+   apidoc/yadg.extractors.public.*
+
+.. toctree::
+   :maxdepth: 1
+   :caption: yadg custom extractors
+   :hidden:
+   :glob:
+
+   apidoc/yadg.extractors.custom.*
diff --git a/docs/source/index.rst b/docs/source/index.rst
@@ -7,20 +7,11 @@
 .. image:: https://badgen.net/github/tag/dgbowl/yadg/?icon=github
    :target: https://github.com/dgbowl/yadg
 
-**yadg** is a set of tools and parsers aimed to process raw instrument data. Given an experiment represented by a `dataschema`, **yadg** will process the files and folders specified in this `dataschema`, and produce a `datagram`, which is a unified data structure containing all measured ("raw") data in a given experiment. The `parsers` available in **yadg** are shown in the sidebar. As of ``yadg-5.0``, the `datagram` is stored as a |NetCDF|_ file. The produced `datagram` is associated with full provenance info, and the data within the `datagram` contain instrumental error estimates and are annotated with units. You can read more about **yadg** in our paper: [Kraus2022b]_.
+**yadg** is a set of tools and parsers aimed to :ref:`extract<extractor mode>` and standardise data from raw files generated by scientific instruments. The supported types of files that can be extracted are listed in the sidebar. The data (or metadata) extracted from the supplied file is returned as a :class:`xarray.Dataset` or a |NetCDF|_ file.
 
+For extracting and combining data from multiple files, **yadg** can be used to :ref:`process<parser mode>` a special configuration file called :mod:`~dgbowl_schemas.yadg.dataschema`. The combined data is returned as a :class:`datatree.DataTree` or a |NetCDF|_ file. This allows reproducible processing of structured experimental data, and takes care of issues such as timezone resolution, unit annotation, uncertainty determination, and keeps track of provenance.
 
-.. image:: images/schema_yadg_datagram.png
-   :width: 600
-   :alt: yadg is used to process raw data files using a datadchema into a NetCDF datagram.
-
-
-Some of the **yadg** parsers are exposed via an `extractor` interface, allowing the user to extract (meta)-data from individual files without requiring a `dataschema`. Several file formats are supported, as shown in the sidebar. You can read more about this `extractor` interface on the |marda_extractors|_ website, as well as in the :ref:`Usage: Extractor mode<extractor mode>` section of this documentation.
-
-.. warning::
-
-   All of the post-processing features within **yadg** have been removed in ``yadg-5.0``, following their deprecation in ``yadg-4.2``. If you are looking for a post-processing library, have a look at |dgpost|_ instead.
-
+For more details about **yadg** usage, see :ref:`the usage instructions<usage>`. You can read more about **yadg** in our paper: [Kraus2022b]_.
 
 Contributors
 ````````````
@@ -46,8 +37,6 @@ The project is also part of BATTERY 2030+, the large-scale European research ini
    features
    citing
 
-.. include:: parsers.rst
-
 .. include:: extractors.rst
 
 .. toctree::

diff --git a/docs/source/parsers.rst b/docs/source/parsers.rst
diff --git a/docs/source/usage.rst b/docs/source/usage.rst
@@ -1,3 +1,5 @@
+.. _usage:
+
 How to use **yadg**
 ===================
 We have prepared an interactive, Binder-compatible Jupyter notebook, showing the installation and example usage of **yadg**. The latest version of the notebook and the direct link to Binder are:

diff --git a/docs/source/version.5_1.rst b/docs/source/version.5_1.rst
@@ -0,0 +1,22 @@
+**yadg** version 5.1
+``````````````````````
+.. image:: https://img.shields.io/static/v1?label=yadg&message=v5.1&color=blue&logo=github
+  :target: https://github.com/PeterKraus/yadg/tree/5.1
+.. image:: https://img.shields.io/static/v1?label=yadg&message=v5.1&color=blue&logo=pypi
+  :target: https://pypi.org/project/yadg/5.1/
+.. image:: https://img.shields.io/static/v1?label=release%20date&message=2024-XX-YY&color=red&logo=pypi
+
+
+Developed in the |concat_lab|_ at Technische Universität Berlin (Berlin, DE).
+
+New features since ``yadg-5.0`` are:
+
+Other changes in ``yadg-5.1`` are:
+
+  - The dataschema has been simplified, eliminating parsers in favour of extractors.
+  - The code has been reorganised to highlight the extractor functionality in favour of parsers.
+
+
+.. _concat_lab: https://tu.berlin/en/concat
+
+.. |concat_lab| replace:: ConCat Lab
diff --git a/docs/source/version.rst b/docs/source/version.rst
@@ -1,6 +1,8 @@
 **yadg** version history
 ------------------------
 
+.. include:: version.5_1.rst
+
 .. include:: version.5_0.rst
 
 .. include:: version.4_2.rst

diff --git a/setup.py b/setup.py
@@ -55,6 +55,7 @@
             "sphinx-rtd-theme~=1.3.0",
             "sphinx-autodoc-typehints < 1.20.0",
             "autodoc-pydantic>=2.0.0",
+            "sphinxcontrib-mermaid~=0.9.2",
         ],
     },
     entry_points={"console_scripts": ["yadg=yadg:run_with_arguments"]},

diff --git a/src/yadg/extractors/__init__.py b/src/yadg/extractors/__init__.py
@@ -1,18 +1,19 @@
 import importlib
 import logging
-from dgbowl_schemas.yadg.dataschema import ExtractorFactory
 from pathlib import Path
 from datatree import DataTree
 from xarray import Dataset
 from typing import Union
 from yadg import dgutils, core
+from dgbowl_schemas.yadg.dataschema import ExtractorFactory
+
 
 logger = logging.getLogger(__name__)
 
 
 def extract(filetype: str, path: Path) -> Union[Dataset, DataTree]:
     """
-    The extract functionality of yadg is implemented here.
+    The individual extractor functionality of yadg is called from here.
 
     Extracts data from provided ``path``, assuming it is the specified ``filetype``. The
     data is either returned as a :class:`DataTree` or a :class:`Dataset`. In either case
@@ -28,11 +29,6 @@ def extract(filetype: str, path: Path) -> Union[Dataset, DataTree]:
     path:
         A :class:`pathlib.Path` object pointing to the file to be extracted.
 
-    Returns
-    -------
-    Union[Dataset, DataTree]
-        The extracted data and metadata.
-
     """
     extractor = ExtractorFactory(extractor={"filetype": filetype}).extractor