RDF Extension for Google Refine

While Google Refine (http://code.google.com/p/google-refine/) helps people clean tabular data, the RDF Extension for Google Refine (http://lab.linkeddata.deri.ie/2010/grefine-rdf-extension/) adds a GUI for exporting the results in RDF.

Purpose: Creating lod-link files for csv2rdf4lod

This technology description accumulates notes on LOGD's use and evaluation of the tool. Google Refine can "reconcile" strings such as "Thailand" to Freebase identifiers such as "/m/07f1x" (see http://code.google.com/p/google-refine/wiki/Reconciliation). This provides a straightforward way to create explicit links from ambiguous strings to existing entities within Linked Open Data. With the addition of DERI's RDF Extension for Google Refine, we can express these connections in RDF and apply them to infer owl:sameAs relationships during LOGD's csv2rdf4lod conversion process. The "create-once" lod-link file from Refine can be reused when enhancing any number of datasets, so a fixed amount of initial human effort yields a growing number of owl:sameAs assertions.
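The reuse idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not csv2rdf4lod's actual implementation: the `LOD_LINKS` table, the `freebase_uri` and `infer_same_as` functions, the example.org subject URIs, and the `http://rdf.freebase.com/ns/` URI form for Freebase identifiers are all hypothetical choices made for the example.

```python
# Hypothetical lod-link table: reconciled label -> Freebase identifier.
# Only the "Thailand" entry comes from the text above.
LOD_LINKS = {"Thailand": "/m/07f1x"}

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def freebase_uri(mid):
    """Turn '/m/07f1x' into a URI (assumed form: .../ns/m.07f1x)."""
    return "http://rdf.freebase.com/ns/" + mid.strip("/").replace("/", ".")


def infer_same_as(cells, links=LOD_LINKS):
    """Return (subject, owl:sameAs, object) triples for cells found in links.

    cells maps a dataset-local subject URI to the string found in the CSV.
    Strings without a lod-link entry are skipped, not guessed at.
    """
    triples = []
    for subject, label in cells.items():
        mid = links.get(label.strip())
        if mid is not None:
            triples.append((subject, OWL_SAME_AS, freebase_uri(mid)))
    return triples


if __name__ == "__main__":
    # Two made-up datasets reusing the same lod-link table.
    dataset_a = {"http://example.org/a/country/1": "Thailand"}
    dataset_b = {
        "http://example.org/b/nation/th": "Thailand",
        "http://example.org/b/nation/xx": "Atlantis",  # no link, so skipped
    }
    for dataset in (dataset_a, dataset_b):
        for s, p, o in infer_same_as(dataset):
            print("<%s> <%s> <%s> ." % (s, p, o))
```

Each additional dataset only adds entries to `cells`; the link table itself is built once, which is the point of the "create-once" file.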

Reconciling Countries before adding RDF extension

BTW, Google Refine "sets up shop" within a directory it creates for each project, e.g. on the Mac,

~/Library/Application\ Support/Google/Refine/1932739037761.project

@@@ snapshots of non-RDF extension csv reconciliation for countries.
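To see which project directories Refine has created, something like the following Python sketch may help. It assumes only what is noted above (a workspace directory containing entries whose names end in ".project"); the function name `list_refine_projects` is made up for this example, and the workspace location on other platforms will differ.

```python
import os


def list_refine_projects(workspace):
    """Return sorted names of '*.project' directories in a Refine workspace."""
    if not os.path.isdir(workspace):
        return []
    return sorted(
        name
        for name in os.listdir(workspace)
        if name.endswith(".project")
        and os.path.isdir(os.path.join(workspace, name))
    )


if __name__ == "__main__":
    # Mac location mentioned above; adjust for other platforms.
    workspace = os.path.expanduser("~/Library/Application Support/Google/Refine")
    for project in list_refine_projects(workspace):
        print(project)
```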

Installing the RDF extension

Installing the RDF Extension for Google Refine starts with creating Refine's extensions directory, e.g. on the Mac:

mkdir -p ~/Library/Application\ Support/Google/Refine/extensions

Creating the lod-link RDF Skeleton

"Bundling" properties was a bit unintuitive. The tutorial at http://lab.linkeddata.deri.ie/2010/grefine-rdf-extension/#example should elaborate a bit more on how to do this; its instruction "We created the skeleton by repeatedly using add rdf:type and add property" is correct, but not obvious (see https://github.com/fadmaa/grefine-rdf-extension/issues#issue/7). For the csv2rdf4lod (http://logd.tw.rpi.edu/technology/csv2rdf4lod) lod-link file, the bundled properties look like:

dcterms:identifier "Thailand" ; rdfs:seeAlso "/m/07f1x" .
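For comparison with what the skeleton should export, the bundled pair above can be written out by hand. The Python sketch below is an assumption-laden illustration, not the extension's output format: the subject is not shown in the fragment above, so a blank node (`[]`) stands in for it, and the function names `turtle_literal` and `lod_link_turtle` are hypothetical.

```python
def turtle_literal(text):
    """Quote a string as a Turtle literal, escaping backslashes, quotes, and line breaks."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return '"%s"' % escaped


def lod_link_turtle(pairs):
    """Serialize (label, identifier) pairs as Turtle, one bundled resource each.

    ASSUMPTION: a blank node is used as the subject because the source
    fragment does not show one.
    """
    lines = [
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .",
        "",
    ]
    for label, identifier in pairs:
        lines.append("[] dcterms:identifier %s ;" % turtle_literal(label))
        lines.append("   rdfs:seeAlso %s ." % turtle_literal(identifier))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(lod_link_turtle([("Thailand", "/m/07f1x")]))
```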

Exporting the Refined CSV to RDF

