IOGDS Data Analytics

What to Expect

IOGDS data analytics gives users an overview of the data they are working with. The IOGDS catalog data is in Turtle format, which poses a challenge to new users trying to understand what the data contains. The analytics work covers basic statistics on catalogs, categories, and geographic information, along with some visualizations.

What You Need to Know

This analytics page assumes familiarity with the following:
  • Resource Description Framework (RDF), a standard model for data interchange on the Web. See [1].
  • SPARQL, a query language for RDF. See [2].
  • Terse RDF Triple Language (Turtle), a syntax for serializing RDF. It is used throughout IOGDS. See [3].

The Analytics Page

  • How many datasets in total?

    Answer:1028054
    PREFIX dcterms: <http://purl.org/dc/terms/>
    PREFIX conversion: <http://purl.org/twc/vocab/conversion/>
    PREFIX void: <http://rdfs.org/ns/void#>
    PREFIX dgtwc: <http://data-gov.tw.rpi.edu/2009/data-gov-twc.rdf#>
    SELECT  (count(?dataset) as ?count) 
    WHERE {
       {
           SELECT ?abstract (MAX(?m) AS ?modified)
           WHERE {
               GRAPH conversion:MetaDataset {
                   ?abstract void:subset [ void:subset [ a conversion:LayerDataset, conversion:MetaDataset ];
                                           dcterms:modified ?m ]
               }
           }
           GROUP BY ?abstract
       }
       GRAPH conversion:MetaDataset {
           ?abstract  void:subset    ?versioned .
           ?versioned dcterms:modified ?modified .
           ?dataset a conversion:CatalogedDataset; 
                    void:inDataset ?versioned.  
       }
    } 
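The queries on this page can be sent to a SPARQL endpoint over HTTP and the answer read back from the JSON results. The following is a minimal sketch, assuming an endpoint that follows the SPARQL 1.1 Protocol and returns the standard JSON results format when asked for `application/sparql-results+json`. The endpoint URL, the function names, and the sample response are illustrative assumptions; the page does not say which endpoint or client was used.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint URL; the page does not name the endpoint it used.
ENDPOINT = "http://example.org/sparql"

# Hand-written sample in the SPARQL 1.1 Query Results JSON Format,
# using the dataset total reported above as the value.
SAMPLE_RESPONSE = """
{
  "head": {"vars": ["count"]},
  "results": {"bindings": [
    {"count": {"type": "literal",
               "datatype": "http://www.w3.org/2001/XMLSchema#integer",
               "value": "1028054"}}
  ]}
}
"""


def build_request_url(endpoint, query):
    """Return a GET URL that carries the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


def read_single_count(results_json, variable="count"):
    """Read one integer binding from a SPARQL JSON results document."""
    bindings = json.loads(results_json)["results"]["bindings"]
    return int(bindings[0][variable]["value"])


if __name__ == "__main__":
    print(build_request_url(ENDPOINT, "SELECT * WHERE { ?s ?p ?o } LIMIT 1"))
    print(read_single_count(SAMPLE_RESPONSE))
```

The same two helpers would apply to every count query below, as long as the projected variable is named `?count`.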
    
  • How many countries?

    Answer:43
    PREFIX dgtwc: <http://data-gov.tw.rpi.edu/2009/data-gov-twc.rdf#>
    PREFIX conversion: <http://purl.org/twc/vocab/conversion/>
    SELECT (COUNT(DISTINCT ?country) AS ?count)
    WHERE { 
      GRAPH <http://purl.org/twc/vocab/conversion/MetaDataset> {
        [] a conversion:DatasetCatalog;
           dgtwc:catalog_country ?country .
      }
    }
    
  • How many catalogs?

    Answer: 192 (4 of these catalogs are deprecated, though)
    PREFIX conversion: <http://purl.org/twc/vocab/conversion/>
     
    SELECT (COUNT(DISTINCT ?catalog) AS ?count)
    WHERE { 
      GRAPH <http://purl.org/twc/vocab/conversion/MetaDataset> {
        ?catalog a conversion:DatasetCatalog 
      }
    }
    
  • How many categories?

    Answer: 2460
    PREFIX conversion: <http://purl.org/twc/vocab/conversion/>
    PREFIX dgtwc:<http://data-gov.tw.rpi.edu/2009/data-gov-twc.rdf#> 
    SELECT (COUNT(DISTINCT ?category) AS ?count)
    WHERE { 
      GRAPH <http://purl.org/twc/vocab/conversion/MetaDataset> {
        [] a conversion:CatalogedDataset;
           dgtwc:category ?category .
      }
    }
    
  • How many languages?

    Answer:24
    PREFIX dgtwc: <http://data-gov.tw.rpi.edu/2009/data-gov-twc.rdf#>
    PREFIX conversion: <http://purl.org/twc/vocab/conversion/>
    SELECT (COUNT(DISTINCT ?language) AS ?count)
    WHERE { 
      GRAPH <http://purl.org/twc/vocab/conversion/MetaDataset> {
        [] a conversion:DatasetCatalog;
           dgtwc:catalog_language ?language .
      }
    }
    
  • How many catalogs are from the United States? The United Kingdom? Canada?

    Answer:36, 18, 29
    Note: This question is superseded by more complete results below (JSE)
    PREFIX dgtwc:<http://data-gov.tw.rpi.edu/2009/data-gov-twc.rdf#>
    PREFIX conversion:<http://purl.org/twc/vocab/conversion/>
    PREFIX dcterms:<http://purl.org/dc/terms/>
    SELECT (COUNT(?catalog) AS ?count)
    WHERE { 
      GRAPH <http://purl.org/twc/vocab/conversion/MetaDataset> {
        ?catalog a conversion:DatasetCatalog;
                 dcterms:identifier ?identifier;
                 # To count another country, swap in one of the commented lines below.
                 dgtwc:catalog_country <http://dbpedia.org/resource/United_States>.
                 #dgtwc:catalog_country <http://dbpedia.org/resource/United_Kingdom>.
                 #dgtwc:catalog_country <http://dbpedia.org/resource/Canada>.
      }
    }
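Instead of commenting and uncommenting the country line by hand, the per-country query text could be generated from a template. This is only a sketch under that assumption: the template mirrors the query above, and `QUERY_TEMPLATE`, `COUNTRIES`, and `queries_by_country` are names introduced here for illustration.

```python
from string import Template

# Mirrors the catalog-count query above; $country is the only placeholder.
QUERY_TEMPLATE = Template("""\
PREFIX dgtwc: <http://data-gov.tw.rpi.edu/2009/data-gov-twc.rdf#>
PREFIX conversion: <http://purl.org/twc/vocab/conversion/>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT (COUNT(?catalog) AS ?count)
WHERE {
  GRAPH <http://purl.org/twc/vocab/conversion/MetaDataset> {
    ?catalog a conversion:DatasetCatalog ;
             dcterms:identifier ?identifier ;
             dgtwc:catalog_country <$country> .
  }
}
""")

# The three country URIs used in the query above.
COUNTRIES = [
    "http://dbpedia.org/resource/United_States",
    "http://dbpedia.org/resource/United_Kingdom",
    "http://dbpedia.org/resource/Canada",
]


def queries_by_country(countries=COUNTRIES):
    """Return {country URI: query text} with the URI substituted in."""
    return {uri: QUERY_TEMPLATE.substitute(country=uri) for uri in countries}


if __name__ == "__main__":
    for uri, query in queries_by_country().items():
        print(uri)
        print(query)
```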
    
  • How many catalogs from each country?


    Note: Bar charts may be more useful (JSE)
    Note: List all the datasets for United States (JSE)
    Note: List all the datasets for France (talk to Dominic or just use Excel if API is a problem) (JSE)
    Note: Please use plural correctly, i.e. "Catalogs per..." instead of "Catalog per..." (JSE)
    PREFIX conversion:<http://purl.org/twc/vocab/conversion/>
    PREFIX dgtwc:<http://data-gov.tw.rpi.edu/2009/data-gov-twc.rdf#>
    PREFIX dcterms:<http://purl.org/dc/terms/>
     
    SELECT (count(?catalog) as ?count) ?country
    WHERE {
     GRAPH <http://purl.org/twc/vocab/conversion/MetaDataset> {
       ?catalog a conversion:DatasetCatalog;
             dcterms:identifier ?identifier.
        OPTIONAL {?catalog dgtwc:catalog_country ?country}
     }
    }
    group BY ?country order by ?count
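Since bar charts are suggested for this result, here is a rough sketch of a plain-text bar chart built from (country, count) rows. It assumes the rows have already been read from the query results. Because the country is matched with OPTIONAL, a row may have no country, which is represented here as `None`. The function name and the placeholder rows are illustrative and are not results from IOGDS.

```python
def text_bar_chart(rows, width=40):
    """Return chart lines for (label, count) rows, largest count first.

    A label of None stands for a catalog with no country recorded.
    """
    ranked = sorted(rows, key=lambda row: row[1], reverse=True)
    if not ranked:
        return []
    largest = max(ranked[0][1], 1)  # avoid dividing by zero
    names = [label or "(no country)" for label, _ in ranked]
    label_width = max(len(name) for name in names)
    lines = []
    for name, (_, count) in zip(names, ranked):
        bar = "#" * round(width * count / largest)
        lines.append(f"{name:<{label_width}} {bar} {count}")
    return lines


if __name__ == "__main__":
    # Placeholder rows only; substitute the real query results.
    placeholder_rows = [("Country A", 4), (None, 1), ("Country B", 2)]
    for line in text_bar_chart(placeholder_rows, width=8):
        print(line)
```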
    
  • For each of the top five contributing countries, how many datasets come from each catalog?

    PREFIX dcterms: <http://purl.org/dc/terms/>
    PREFIX conversion: <http://purl.org/twc/vocab/conversion/>
    PREFIX void: <http://rdfs.org/ns/void#>
    PREFIX dgtwc: <http://data-gov.tw.rpi.edu/2009/data-gov-twc.rdf#>
    
    SELECT  (count(?dataset) as ?count) ?catalog 
    WHERE {
       {
           SELECT ?abstract (MAX(?m) AS ?modified)
           WHERE {
               GRAPH conversion:MetaDataset {
                   ?abstract void:subset [ void:subset [ a conversion:LayerDataset, conversion:MetaDataset ]; dcterms:modified ?m ]
               }
           }
           GROUP BY ?abstract
       }
       GRAPH conversion:MetaDataset {
           ?abstract  void:subset      ?versioned .
           ?versioned dcterms:modified ?modified .
           ?dataset a conversion:CatalogedDataset; 
                    void:inDataset ?versioned ;
                    dgtwc:catalog_title  ?catalog;
                    dgtwc:catalog_country <http://dbpedia.org/resource/United_States>.
                    #dgtwc:catalog_country <http://dbpedia.org/resource/France>.
                    #dgtwc:catalog_country <http://dbpedia.org/resource/Canada>.
                    #dgtwc:catalog_country <http://dbpedia.org/resource/United_Kingdom>.
                    #dgtwc:catalog_country <http://dbpedia.org/resource/Spain>.
       }
    } group by ?catalog order by ?count
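The nested SELECT in this query appears to pick, for each abstract dataset, the most recent `dcterms:modified` value, so that only datasets in the latest version are counted. The sketch below restates that selection in Python as an aid to reading the query. It assumes the timestamps are ISO 8601 strings in one consistent format, so that string comparison matches time order. The function name and row layout are illustrative.

```python
def latest_versions(version_rows):
    """Map each abstract dataset to its most recently modified version.

    version_rows: iterable of (abstract, versioned, modified) tuples,
    one per ?abstract void:subset ?versioned pair.
    """
    latest = {}
    for abstract, versioned, modified in version_rows:
        current = latest.get(abstract)
        # Keep this version if it is the first seen or newer than the kept one.
        if current is None or modified > current[1]:
            latest[abstract] = (versioned, modified)
    return {abstract: pair[0] for abstract, pair in latest.items()}


if __name__ == "__main__":
    # Placeholder identifiers and dates, for illustration only.
    rows = [
        ("abstract-a", "a-version-1", "2011-01-01T00:00:00Z"),
        ("abstract-a", "a-version-2", "2012-06-01T00:00:00Z"),
        ("abstract-b", "b-version-1", "2011-03-01T00:00:00Z"),
    ]
    print(latest_versions(rows))
```

Datasets attached to an older version would be left out of the count, which is what joining `?versioned dcterms:modified ?modified` against the MAX value achieves in the query.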
    
  • Across all datasets, what are the top 100 categories by dataset count?

    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    PREFIX dcterms: <http://purl.org/dc/terms/>
    PREFIX dgtwc: <http://data-gov.tw.rpi.edu/2009/data-gov-twc.rdf#>
    PREFIX conversion: <http://purl.org/twc/vocab/conversion/>
    
    SELECT (count(?dataset) as ?count) ?category
    WHERE {
      GRAPH <http://purl.org/twc/vocab/conversion/MetaDataset> {  
         ?dataset 
           a conversion:CatalogedDataset;
           dgtwc:category ?category;
           dgtwc:catalog_country <http://dbpedia.org/resource/United_States>. 
           #dgtwc:catalog_country <http://dbpedia.org/resource/France>.
           #dgtwc:catalog_country <http://dbpedia.org/resource/Canada>.
           #dgtwc:catalog_country <http://dbpedia.org/resource/United_Kingdom>.
           #dgtwc:catalog_country <http://dbpedia.org/resource/Spain>.
      }
    } group by ?category order by ?count
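The query returns every category in ascending order of count, so the top 100 still have to be picked out afterwards. A minimal sketch of that step follows, assuming the (category, count) rows have been read from the results. Splitting the ranking into groups of 20 is an assumption based on the attached chart names (for example "top 1 to 20 categories.png"). The function name is illustrative.

```python
def ranked_chunks(rows, top_n=100, chunk_size=20):
    """Rank (category, count) rows by count, keep the top_n, split into chunks."""
    ranked = sorted(rows, key=lambda row: row[1], reverse=True)[:top_n]
    return [ranked[i:i + chunk_size] for i in range(0, len(ranked), chunk_size)]


if __name__ == "__main__":
    # Placeholder rows only; substitute the real query results.
    placeholder_rows = [(f"category-{n}", n) for n in range(1, 131)]
    for number, chunk in enumerate(ranked_chunks(placeholder_rows), start=1):
        print(number, chunk[0], chunk[-1])
```

An alternative would be to let the endpoint do the ranking with `ORDER BY DESC(?count) LIMIT 100` in the query itself.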
    
  • For each of the top five contributing countries, how many datasets fall into each of the top 20 categories?

    PREFIX dcterms: <http://purl.org/dc/terms/>
    PREFIX conversion: <http://purl.org/twc/vocab/conversion/>
    PREFIX void: <http://rdfs.org/ns/void#>
    PREFIX dgtwc: <http://data-gov.tw.rpi.edu/2009/data-gov-twc.rdf#>
    
    SELECT  (count(?dataset) as ?count) ?category 
    WHERE {
       {
           SELECT ?abstract (MAX(?m) AS ?modified)
           WHERE {
               GRAPH conversion:MetaDataset {
                   ?abstract void:subset [ void:subset [ a conversion:LayerDataset, conversion:MetaDataset ]; dcterms:modified ?m ]
               }
           }
           GROUP BY ?abstract
       }
       GRAPH conversion:MetaDataset {
           ?abstract  void:subset      ?versioned .
           ?versioned dcterms:modified ?modified .
           ?dataset a conversion:CatalogedDataset; 
                    void:inDataset ?versioned ;
                    dgtwc:category  ?category;
                    dgtwc:catalog_country <http://dbpedia.org/resource/United_States>.
                    #dgtwc:catalog_country <http://dbpedia.org/resource/France>.
                    #dgtwc:catalog_country <http://dbpedia.org/resource/Canada>.
                    #dgtwc:catalog_country <http://dbpedia.org/resource/United_Kingdom>.
                    #dgtwc:catalog_country <http://dbpedia.org/resource/Spain>.
       }
    } group by ?category order by ?count
    
  • What percentage of the total datasets does each country contribute?

    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> 
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    PREFIX dcterms: <http://purl.org/dc/terms/>
    PREFIX dgtwc: <http://data-gov.tw.rpi.edu/2009/data-gov-twc.rdf#>
    PREFIX conversion: <http://purl.org/twc/vocab/conversion/>
    SELECT (count(?dataset) as ?count) ?country
    WHERE {
      GRAPH <http://purl.org/twc/vocab/conversion/MetaDataset> {  
         ?dataset 
           a conversion:CatalogedDataset;
           dgtwc:catalog_country ?country   
      }
    } group by ?country order by ?count
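The query above returns a dataset count per country rather than a percentage, so the shares have to be computed from those counts. A minimal sketch, assuming the counts have been collected into a dictionary keyed by country; the function name and the placeholder counts are illustrative, not IOGDS results.

```python
def percentage_shares(counts):
    """Convert {country: dataset count} into {country: percent of total}."""
    total = sum(counts.values())
    if total == 0:
        return {country: 0.0 for country in counts}
    return {country: 100.0 * count / total for country, count in counts.items()}


if __name__ == "__main__":
    # Placeholder counts only; substitute the real query results.
    placeholder_counts = {"Country A": 3, "Country B": 1}
    for country, share in percentage_shares(placeholder_counts).items():
        print(f"{country}: {share:.1f}%")
```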
    
  • Who are some lesser-known contributors?

  • Geographic map of dataset distribution around the world

  • Top 10 agencies in IOGDS?

Attachment                                                Size
Geo.PNG                                                   140.89 KB
top 1 to 20 categories.png                                350.01 KB
top 1 to 20 categories1.png                               42.56 KB
top21 to 40 categories1.png                               38.15 KB
top 41 to 60 categories.png                               387.87 KB
top 41 to 60 categories1.png                              34.89 KB
top 61 to 80 categories.png                               282.2 KB
top 61 to 80 categories1.png                              38.68 KB
top 81 to 100 categories1.png                             301.17 KB
top21 to 40 categories.png                                445.88 KB
Dataset From Each Catalog From Canada.PNG                 28.3 KB
top 81 to 100 categories.png                              590.08 KB
Some lesser known contributors.png                        16.51 KB
Dataset From Each Catalog From The United States.PNG      511.35 KB
Dataset From Each Catalog From Canada1.PNG                465.81 KB
Dataset From Each Catalog From United Kingdom1.PNG        214.38 KB
Dataset From Each Catalog From Spain1.PNG                 211.75 KB
Dataset From Each Catalog From France1.PNG                423.35 KB
Dataset By Top 20 Categories From The United States.PNG   41.06 KB
Dataset By Top 20 Categories From France.png              38.56 KB
Dataset By Top 20 Categories From Canada.png              34.54 KB
Dataset By Top 20 Categories From United Kingdom.png      33.02 KB
Dataset By Top 20 Categories From Spain.png               400.36 KB
Datasets from Each Country.PNG                            35.2 KB
Catalogs Per Country.PNG                                  55.44 KB
top 10 agency.png                                         31.07 KB