This Node.js script parses DCAT metadata serialized in Turtle (TTL) format and generates an HTML report containing information about a specified dataset and its distributions.
- Ensure you have Node.js installed on your system. You can download it from the official Node.js website (https://nodejs.org/).
- Clone or download the repository to your local machine.
- Install the required dependencies by running the following command in the terminal:
npm install
- Run the script with the following command:
node dcat-to-htmltable.js <metadataFilePath> <datasetSubjectURI>
Replace <metadataFilePath> with the path to the metadata TTL file and <datasetSubjectURI> with the URI of the dataset subject you want to extract information about.
- The script will generate an HTML file named dataset_info.html containing the dataset information and distributions.
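As a rough illustration of how a Node.js script can read two positional arguments like these, the sketch below works on a `process.argv`-style array. It is not taken from dcat-to-htmltable.js: the helper name `parseCliArgs`, the usage message, and the URI check are assumptions made for this example.

```javascript
// Hypothetical helper, not the script's actual code.
function parseCliArgs(argv) {
  // argv[0] is the node binary and argv[1] is the script path,
  // so the user-supplied arguments start at index 2.
  const [metadataFilePath, datasetSubjectURI] = argv.slice(2);

  if (!metadataFilePath || !datasetSubjectURI) {
    throw new Error(
      'Usage: node dcat-to-htmltable.js <metadataFilePath> <datasetSubjectURI>'
    );
  }

  // new URL() throws a TypeError if the string cannot be parsed
  // as an absolute URL, which catches obviously malformed input early.
  new URL(datasetSubjectURI);

  return { metadataFilePath, datasetSubjectURI };
}

// Simulated argument vector; a real script would pass process.argv instead.
const args = parseCliArgs([
  'node',
  'dcat-to-htmltable.js',
  '../metadata.ttl',
  'https://example.org/dataset/1',
]);
console.log(args.metadataFilePath);
console.log(args.datasetSubjectURI);
```

Failing fast with a usage message when an argument is missing tends to be friendlier than letting a later file read or lookup fail with a less obvious error.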
Suppose your metadata.ttl file is located in the parent directory and you want to extract information about the dataset with the URI https://www.unitedutilities.com/corporate/responsibility/environment/natural-environment/bowland-winep/. You would run the following command:
node dcat-to-htmltable.js ../metadata.ttl https://www.unitedutilities.com/corporate/responsibility/environment/natural-environment/bowland-winep/
The script will then generate the dataset_info.html file containing the dataset information and distributions.
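To make "dataset information and distributions" more concrete, here is a minimal sketch of how a dataset and its distributions could be gathered once the Turtle file has been parsed into triples. It assumes the triples are already available as plain `{ subject, predicate, object }` objects; the function names `objectsOf` and `collectDataset`, the sample data, and the choice of properties (dct:title, dct:description, dcat:downloadURL) are illustrative assumptions, not the script's actual implementation.

```javascript
// Namespace prefixes from the DCAT and Dublin Core Terms vocabularies.
const DCT = 'http://purl.org/dc/terms/';
const DCAT = 'http://www.w3.org/ns/dcat#';

// Return every object value recorded for a given subject and predicate.
function objectsOf(triples, subject, predicate) {
  return triples
    .filter((t) => t.subject === subject && t.predicate === predicate)
    .map((t) => t.object);
}

// Hypothetical helper: build a plain object describing one dataset.
function collectDataset(triples, datasetURI) {
  // First matching value, or null when the property is absent.
  const first = (subject, predicate) => {
    const values = objectsOf(triples, subject, predicate);
    return values.length > 0 ? values[0] : null;
  };

  return {
    uri: datasetURI,
    title: first(datasetURI, DCT + 'title'),
    description: first(datasetURI, DCT + 'description'),
    // Each dcat:distribution object is itself a subject with its own properties.
    distributions: objectsOf(triples, datasetURI, DCAT + 'distribution').map(
      (distURI) => ({
        uri: distURI,
        title: first(distURI, DCT + 'title'),
        downloadURL: first(distURI, DCAT + 'downloadURL'),
      })
    ),
  };
}

// Made-up sample triples standing in for a parsed metadata.ttl.
const sampleTriples = [
  { subject: 'https://example.org/dataset/1', predicate: DCT + 'title', object: 'Example dataset' },
  { subject: 'https://example.org/dataset/1', predicate: DCAT + 'distribution', object: 'https://example.org/dist/csv' },
  { subject: 'https://example.org/dist/csv', predicate: DCT + 'title', object: 'CSV export' },
  { subject: 'https://example.org/dist/csv', predicate: DCAT + 'downloadURL', object: 'https://example.org/files/data.csv' },
];

const dataset = collectDataset(sampleTriples, 'https://example.org/dataset/1');
console.log(JSON.stringify(dataset, null, 2));
```

Missing properties come back as `null` here, so a report generator built on this shape can decide whether to skip a row or show a placeholder.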
Microdata Integration: The HTML output includes microdata annotations. Microdata attributes are added to the HTML tables so that search engines and other applications can read the dataset and distribution details as structured data rather than plain text.
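The sketch below shows one way microdata attributes can be attached to an HTML table. It assumes the schema.org `Dataset` and `DataDownload` types, which this README does not confirm as the vocabulary the script uses; the function names `escapeHtml` and `renderDatasetTable` and the sample data are likewise hypothetical.

```javascript
// Escape characters that are unsafe in HTML text and attribute values.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical renderer: one table per dataset, one row per distribution.
function renderDatasetTable(dataset) {
  const rows = [
    `<tr><th>Title</th><td itemprop="name">${escapeHtml(dataset.title)}</td></tr>`,
    `<tr><th>Description</th><td itemprop="description">${escapeHtml(dataset.description)}</td></tr>`,
  ];

  for (const dist of dataset.distributions) {
    const url = escapeHtml(dist.downloadURL);
    // Each distribution row is a nested item linked via itemprop="distribution".
    rows.push(
      '<tr itemprop="distribution" itemscope itemtype="https://schema.org/DataDownload">' +
        `<th itemprop="name">${escapeHtml(dist.title)}</th>` +
        `<td><a itemprop="contentUrl" href="${url}">${url}</a></td>` +
        '</tr>'
    );
  }

  // itemscope/itemtype on the table marks everything inside as one Dataset item.
  return `<table itemscope itemtype="https://schema.org/Dataset">\n${rows.join('\n')}\n</table>`;
}

// Made-up sample data for illustration only.
const html = renderDatasetTable({
  title: 'Rainfall & river levels',
  description: 'Example description',
  distributions: [
    { title: 'CSV export', downloadURL: 'https://example.org/files/data.csv' },
  ],
});
console.log(html);
```

Escaping every value before it is interpolated keeps literal characters such as `&` or `<` in the metadata from breaking the generated markup.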