Procedure for Getting and Publishing Metadata for LOGD International OGD Catalog (IOGDS)

Summary
For a given catalog, the metadata of each dataset is extracted and saved in a CSV file. csv2rdf4lod-automation is then used to convert the CSV file into RDF. For each catalog, four outputs are publicly accessible under https://scm.escience.rpi.edu/svn/public/logd-csv2rdf4lod/data/source/:
 Example 1: data.gouv.fr
    • Metadata in CSV format: https://scm.escience.rpi.edu/svn/public/logd-csv2rdf4lod/data/source/data-gouv-fr/catalog/version/2011-Dec-17/manual/data.gouv.fr.csv
    • Source code (if any) to get the metadata: https://scm.escience.rpi.edu/svn/public/logd-csv2rdf4lod/data/source/data-gouv-fr/catalog/version/2011-Dec-17/manual/src/data_gouv_fr.java
    • Conversion trigger: https://scm.escience.rpi.edu/svn/public/logd-csv2rdf4lod/data/source/data-gouv-fr/catalog/version/2011-Dec-17/convert-catalog.sh
    • Enhancement file: https://scm.escience.rpi.edu/svn/public/logd-csv2rdf4lod/data/source/data-gouv-fr/catalog/version/2011-Dec-17/manual/data.gouv.fr.csv.e1.params.ttl
 Example 2: data.gov.gh
    • Metadata in CSV format: https://scm.escience.rpi.edu/svn/public/logd-csv2rdf4lod/data/source/data-gov-gh/catalog/version/2013-May-06/manual/ghana.csv
    • Source code (if any) to get the metadata: T.B.D.
    • Conversion trigger: https://scm.escience.rpi.edu/svn/public/logd-csv2rdf4lod/data/source/data-gov-gh/catalog/version/2013-May-06/convert-catalog.sh
    • Enhancement file: https://scm.escience.rpi.edu/svn/public/logd-csv2rdf4lod/data/source/data-gov-gh/catalog/version/2013-May-06/manual/ghana.csv.e1.params.ttl
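
The two examples follow the same directory layout, which can be sketched in Python as below. This is only an illustration inferred from the URLs listed above; the function name and its parameters are hypothetical, and other catalogs may deviate from the pattern. The source-code URL is left out because its file name is specific to each extraction program.

```python
# Base URL taken from this page; everything else is inferred from the two examples.
BASE = "https://scm.escience.rpi.edu/svn/public/logd-csv2rdf4lod/data/source/"


def catalog_output_urls(source_id, version, csv_name):
    """Build the URLs of three of the four outputs for one catalog version.

    source_id, version and csv_name are hypothetical parameter names, e.g.
    "data-gov-gh", "2013-May-06" and "ghana.csv".
    """
    version_root = f"{BASE}{source_id}/catalog/version/{version}/"
    return {
        "metadata_csv": f"{version_root}manual/{csv_name}",
        "conversion_trigger": f"{version_root}convert-catalog.sh",
        "enhancement_file": f"{version_root}manual/{csv_name}.e1.params.ttl",
    }


urls = catalog_output_urls("data-gov-gh", "2013-May-06", "ghana.csv")
for name, url in urls.items():
    print(name, url)
```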

Workflow for getting metadata
  1. Select a catalog
  2. List all datasets in the catalog and collect the metadata for each dataset in a single table; save the table in CSV format
    • Some catalogs, such as data.gov.uk, provide a data dump; use the provided dump files
    • For catalogs that do not provide a data dump, write a custom program (e.g. in Java or Python) to extract the metadata
  3. Commit the CSV files and any corresponding source code to https://scm.escience.rpi.edu/svn/public/logd-csv2rdf4lod/data/source/
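
As a rough illustration of step 2, the Python sketch below writes a list of per-dataset metadata records into one CSV table. It assumes the records have already been obtained, whether from a dump or from a custom extractor; the field names and example records are hypothetical and do not come from any real catalog.

```python
import csv
import os
import tempfile


def write_catalog_csv(datasets, path):
    """Write one row per dataset; the columns are the union of all metadata keys."""
    # Collect column names in first-seen order so the header is stable.
    fieldnames = []
    for record in datasets:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    with open(path, "w", newline="", encoding="utf-8") as f:
        # restval fills in an empty cell when a dataset lacks a field.
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(datasets)
    return fieldnames


# Hypothetical records standing in for whatever a dump or extractor yields.
datasets = [
    {"title": "Example dataset A", "url": "http://example.org/a", "agency": "Example Agency"},
    {"title": "Example dataset B", "url": "http://example.org/b", "format": "CSV"},
]

out_path = os.path.join(tempfile.mkdtemp(), "catalog.csv")
columns = write_catalog_csv(datasets, out_path)
```

Taking the union of keys keeps every field that any dataset provides, so nothing is dropped before the mapping is decided during the enhancement step.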

Workflow for publishing a dataset catalog
  1. Follow the csv2rdf4lod conversion/enhancement process to convert the CSV files into RDF; for details, see https://github.com/timrdf/csv2rdf4lod-automation/wiki/Conversion-process-phases
    1. Run the converter to generate the default enhancement configuration
    2. Edit the enhancement configuration files to map the original metadata onto the metadata designed for LOGDC
    3. Re-run the converter to generate the RDF graph
  2. Commit the enhancement configuration files and conversion triggers to https://scm.escience.rpi.edu/svn/public/logd-csv2rdf4lod/data/source/
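
The mapping idea behind step 1.2 can be pictured with the small Python sketch below. It is not csv2rdf4lod's enhancement syntax (the real mapping is written in the Turtle file ending in .e1.params.ttl); the column headers and target property names here are hypothetical and serve only to show how source columns are renamed to a common set of metadata terms.

```python
# Hypothetical mapping from a catalog's own column headers to target terms.
COLUMN_MAP = {
    "Titre": "dcterms:title",
    "Description": "dcterms:description",
    "URL": "foaf:homepage",
}


def map_row(row, column_map):
    """Rename the keys of one CSV row; unmapped columns keep their original name."""
    return {column_map.get(col, col): value for col, value in row.items()}


row = {"Titre": "Budget 2011", "URL": "http://example.org/budget", "Licence": "open"}
mapped = map_row(row, COLUMN_MAP)
print(mapped)
```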