Previous title: "LOGD Instance Hub URI Design: Unique URIs for LOGD instances"
URI Design Goals
These principles should produce:
- URIs that are easily re-hosted. A generated URI should be easy to transform from one BASE URI space (e.g. logd) to another, allowing easier buy-in from government agencies.
- For example, the pattern http://logd.tw.rpi.edu/id/epa-gov/XXXXXX is easily (syntactically) transformed to http://epa.gov/id/XXXXXX when/if the EPA buys in to this scheme.
- Concise URIs with as little "cruft" as possible
- URIs that span many domains including:
- National identifiers (e.g. governmental agencies, states, zip codes)
- State-level identifiers (e.g. counties, congressional districts)
- Agency-level identifiers (e.g. EPA facilities)
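The re-hosting goal above amounts to a purely syntactic rewrite of the URI. A minimal sketch of that rewrite, assuming the ORG token is dropped once the organization hosts the identifiers itself (as in the EPA example); the function name `rehost` is hypothetical and not part of the design:

```python
from urllib.parse import urlsplit, urlunsplit


def rehost(uri: str, org: str, new_base: str) -> str:
    """Rewrite http://BASE/id/ORG/rest to http://new_base/id/rest."""
    parts = urlsplit(uri)
    prefix = f"/id/{org}/"
    if not parts.path.startswith(prefix):
        raise ValueError(f"{uri!r} is not under /id/{org}/")
    # Keep everything after the ORG token and re-attach it under /id/.
    new_path = "/id/" + parts.path[len(prefix):]
    return urlunsplit((parts.scheme, new_base, new_path, parts.query, parts.fragment))


print(rehost("http://logd.tw.rpi.edu/id/epa-gov/XXXXXX", "epa-gov", "epa.gov"))
# prints http://epa.gov/id/XXXXXX
```

Because the transformation touches only the host and the ORG path segment, it can be applied in bulk to existing data without looking anything up.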
URI Design Overview
'http://' BASE '/' 'id' '/' ORG '/' CATEGORY ( '/' TOKEN )+
- BASE
- For the TWC RPI Instance Hub, BASE is logd.tw.rpi.edu, as in the example pattern above.
- id
- This is required because we don't want to pollute the top namespace of BASE with identifiers.
- We use id rather than instance-hub because we want as short a token as possible; the id token doesn't add any semantics, it's just a syntactic way of distinguishing these URIs from others.
- Also, consistency with data.gov.uk URIs here is a good thing.
- ORG
- This is a short token representing the agency, government, or organization that controls the identifier space.
- For US identifiers, this token will start with
us/, and be followed by a designation of either federal or state-level (e.g.
- Identifiers relating to data.gov will all fall under the federal
- For identifiers that aren't directly governmental, the ORG token should be suitably unique; for example, we use usps-com below for USPS-controlled zip code URIs.
- CATEGORY and TOKEN
- These are ORG-specific values that identify the specific instance.
- Use as many TOKENs as necessary to distinguish the instance.
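The pattern and component rules above can be sketched as a small builder. This is an illustration under assumptions, not the hub's actual implementation: the function name `build_uri`, the `zip` category, and the sample ZIP code are all hypothetical, and percent-encoding of unusual characters is a choice made here rather than something the design specifies.

```python
from urllib.parse import quote


def build_uri(base: str, org: str, category: str, *tokens: str) -> str:
    """Assemble 'http://' BASE '/id/' ORG '/' CATEGORY ( '/' TOKEN )+ ."""
    if not tokens:
        raise ValueError("at least one TOKEN is required")
    # ORG may itself contain '/' (e.g. tokens starting with 'us/'),
    # so '/' is left unescaped there but escaped everywhere else.
    segments = [quote(org, safe="/"), quote(category, safe="")]
    segments += [quote(token, safe="") for token in tokens]
    return f"http://{base}/id/" + "/".join(segments)


# Hypothetical category and token, for illustration only.
print(build_uri("logd.tw.rpi.edu", "usps-com", "zip", "12180"))
# prints http://logd.tw.rpi.edu/id/usps-com/zip/12180
```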
US Government Agencies
States and Territories
- FIPS code, two-letter code, name, dbpedia/geonames/govtrack sameAs
- States and territories are identified by FIPS 5-2 codes, two-letter abbreviations, and names. Not all states/territories have two-letter abbreviations. FIPS 5-2 has been withdrawn as a FIPS standard (2008). Names are probably the most stable.
Counties
- Just like states, counties are identified by FIPS codes (FIPS 6-4), but these were withdrawn as a standard in 2008. Names of counties seem stable, though two states don't refer to them as "counties": Alaska (borough) and Louisiana (parish). A hierarchy built on the state/territory URIs seems like the best design.
ZIP Codes
- code, link to state
- The Census Bureau uses ZIP Code Tabulation Areas (ZCTA) based on ZIP codes.
Congressional Districts
- link to state, dbpedia sameAs
- STATE here can be a two-letter code because (at least for present-day districts and non-voting delegations) we only have data for places with two-letter codes: the 50 states, DC, AS, GU, MP, PR, and VI.
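The set of acceptable STATE values listed above is small enough to check directly. A minimal sketch, assuming upper-case two-letter codes; the names `DISTRICT_STATE_CODES` and `check_state_code` are hypothetical, and the design does not say whether the URI token itself is upper or lower case:

```python
# Two-letter codes for the 50 states.
STATES = set(
    "AL AK AZ AR CA CO CT DE FL GA HI ID IL IN IA KS KY LA ME MD "
    "MA MI MN MS MO MT NE NV NH NJ NM NY NC ND OH OK OR PA RI SC "
    "SD TN TX UT VT VA WA WV WI WY".split()
)
# DC and the territories with non-voting delegations.
NON_STATE_DELEGATIONS = {"DC", "AS", "GU", "MP", "PR", "VI"}
DISTRICT_STATE_CODES = STATES | NON_STATE_DELEGATIONS


def check_state_code(code: str) -> str:
    """Return the normalized code, or raise if it cannot appear as STATE."""
    normalized = code.strip().upper()
    if normalized not in DISTRICT_STATE_CODES:
        raise ValueError(f"{code!r} is not a recognized two-letter code")
    return normalized


print(len(DISTRICT_STATE_CODES))  # prints 56
print(check_state_code("pr"))     # prints PR
```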
EPA Facilities
- link to state, link to facility detail report (e.g. http://iaspub.epa.gov/enviro/fii_query_detail.disp_program_facility?p_registry_id=110007995027)
- EPA facility IDs are used in EPA Facilities Registry System (FRS) datasets (e.g. Dataset 1008)
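The facility detail report link can be derived from the facility ID alone. A minimal sketch based only on the example URL given here; the function name `facility_report_url` is hypothetical, and the all-digits check is an assumption drawn from that single example rather than from the FRS documentation:

```python
from urllib.parse import urlencode

# Endpoint taken from the example detail report URL.
FRS_DETAIL_REPORT = "http://iaspub.epa.gov/enviro/fii_query_detail.disp_program_facility"


def facility_report_url(registry_id: str) -> str:
    """Build the detail report link for one EPA facility registry ID."""
    if not registry_id.isdigit():
        # Assumption: registry IDs are all digits, as in the example.
        raise ValueError(f"unexpected registry ID: {registry_id!r}")
    return f"{FRS_DETAIL_REPORT}?{urlencode({'p_registry_id': registry_id})}"


print(facility_report_url("110007995027"))
```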
 legislation.gov.uk URIs
 Creating Linked Data - Part II: Defining URIs
 The Real Deal: data.gov.uk