LOGD External Demo

These demos are not hosted on a Drupal page at http://logd.tw.rpi.edu

Data.gov Mashathon 2010: an Energy Mashup

Description: 
This mashup was created at the first Data.gov mashathon event, August 24-25, 2010. It was further refined from October 6-8, 2010 by the National Renewable Energy Laboratory. This mashup profiles 7 cities in different parts of the United States that have a population of roughly 600,000 according to the 2000 census data and 2006 census-estimated population.
Contributor:
Contributor:
Contributor:
Contributor:
Ever wonder how residential energy use varies across the U.S.? By combining data from the Energy Information Administration (EIA) on Data.gov with data from OpenEI.org, the U.S. Census and SmartGrid.gov, this mashup compares 7 cities with populations of roughly half a million. With differing electricity rates, median income levels, energy-related incentives and types of Smart Grid programs being introduced, cities across the country are transitioning to a new energy marketplace in unique ways.
Click on a city on the map to view basic statistics about the city, the local electric utility organization, rebates and financial incentive programs and up-to-date information about local Smart Grid projects. Compare and contrast the cities to see how local utility rates, median income and other regional characteristics relate to average annual electricity use.
This mashup was made possible by Data.gov data sources and the National Renewable Energy Laboratory's website OpenEI.org, sponsored by the U.S. Department of Energy.
Uses Dataset: 
Uses Technology: 
Uses Technology: 
Thumbnail: 

Comparison of crimes in different Local Government Areas in New South Wales, Australia

Description: 
This small visualization compares trends in a specific crime for two different Local Government Areas (LGAs) in New South Wales. It uses the New South Wales crime data.
Contributor:
http://data.gov.au was officially released in March 2011; however, a "beta" version of the website had been available for some time. This was one of my first attempts to create a visualization of government data when it first became available. It allows the user to select a specific crime and compare the trends over time in two different Local Government Areas (LGAs).
Uses Technology: 
Uses Technology: 
Thumbnail: 

Public Safety in Troy, NY

Description: 
This demo integrates public safety information from the Troy Police Department and RPI's Department of Public Safety and displays it using geographical data taken from Google Maps.
Contributor:

Introduction

This project aims to help residents of Troy, New York visualize different events related to public safety. The original objective was to cover only crimes such as robberies. However, the reports from RPI's Department of Public Safety and the Troy Police Department also gave us several other types of incidents, including fires and medical events. You can regularly check the events of interest. You can also use the RDF link in the results to mash that data up with other datasets. An RSS feed (so you can subscribe with an RSS reader) will be available soon. When searching, you may choose from two different data sources: the first is the events reported by the Department of Public Safety; the second is the events reported by the Troy Police Department. You can choose to search either or both data sources.

Technology

The implementation of this experiment is based on several different technologies. For the backend I used semantic technologies such as RDF and SPARQL. JSON and AJAX are used to register new events. For visualization I used Exhibit and Google Maps. Finally, jQuery is used to wrap the interface.
Uses Technology: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Thumbnail: 

DutchessCounty

Web Science Project 3 - Dutchess County

Dutchess County

About the county:

Related NY Times Articles


Community funding appropriations from the NYS Senate to Dutchess County:

Dataset: LOGD - Senator Funding to Community Projects, 2009-2010

NYS Senate expenditures related to Dutchess County:

Dataset: LOGD - NYS Senate Expenditures for 2009, 3/31/2010

1998 - 2008 ASEAN Education Statistics

Description: 
This demonstration uses Microsoft Pivot to show an image-driven visualization of the education statistics (1998 - 2008) from the ASEAN region based on the World Bank's World Development Indicators.
Contributor:

Data set

Some of the World Bank's World Development Indicators were used for this demo. Our focus was primarily on educational statistics for the ASEAN region during the decade from 1998 to 2008. Specifically, the following indicators are included:
  • Expenditure per Primary Student as a percentage of Gross Domestic Product
  • Expenditure per Secondary Student as a percentage of Gross Domestic Product
  • Expenditure per Tertiary Student as a percentage of Gross Domestic Product
  • Total Public Spending on Education as a percentage of total government expenditure
  • Total Public Spending on Education as a percentage of Gross Domestic Product
The data was then compiled into a spreadsheet and converted into a Pivot collection.

1998-2008 ASEAN Education Statistics Screenshot
Uses Technology: 
Thumbnail: 

2010 World Economic Forum Interlinkage Survey Matrix

Description: 
Members of 72 World Economic Forum councils were handed a survey that asked them to select 5 Global, 3 Regional, and 3 Industrial councils that their council would most benefit from interacting with. This demo shows the resulting scores from this survey.
Contributor:
Contributor:

About the Challenge

The World Economic Forum and Visualizing.org created the challenge with the aim of elucidating the interconnectedness between councils based on a survey that was given to the council members. A description of the survey is included below:

2010 GAC Interlinkage Survey

Survey Questions:
  • "Please select a maximum of 5 Global Agenda Councils that your Council would benefit from interacting with by order of priority."
  • "Please select a maximum of 3 Industry/Regional Agenda Councils that your Council would benefit from interacting with by order of priority."
  • "Please describe how it interlinks with your Council."
The results were compiled and a weighted score was derived based on the criteria found here. The matrix displays the results of the survey using colors and opacity. Hover over a cell to find out more information. The darker the color of the cell, the higher the score.
Councils Voted For
Voting Council
Uses Technology: 
Uses Technology: 
Uses Technology: 
Thumbnail: 

US Global Foreign Aid

Description: 
This application presents historical foreign aid data (ranging from 1951 to 2008) from the United States Agency for International Development (USAID), the Department of Agriculture and the Department of State, mashed up with information from the New York Times API and CIA World Factbook.
Contributor:
Contributor:
Uses Dataset: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Thumbnail: 

Demo: Comparing Types of Campaign Money by State

Description: 
Compare the disbursements, receipts, and loans of Democratic and Republican candidates by state. See how these variables are or aren't correlated.
Contributor:
How to use

Pick two variables of the six provided to compare. You can do this by:
  • Using the pull-out lists on the X or Y axis of the scatterplot to the left.
  • Clicking on the appropriate cell in the correlation matrix.
The results are shown in four places:
  • On the left is a scatterplot of your chosen variables. The amount of money for one variable is on the Y axis, and the amount of money for the other variable is on the X axis. Each point represents a state.
  • On the top-right are two maps, one per variable. The darker a state is, the higher the quantity of money.
  • In the middle is a gauge showing the correlation between the two variables (1: strong positive correlation; -1: strong negative correlation; 0: no correlation).
  • On the bottom-right is a table showing the correlation between every pair of variables.
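As a rough illustration of the value the gauge reports, the sketch below computes a correlation coefficient for two variables. It assumes the gauge shows the Pearson coefficient, which the demo does not state explicitly; the function name and the sample numbers are hypothetical and not taken from the campaign dataset.

```python
import math

def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-state totals, one value per state.
receipts = [1.0, 2.0, 3.0, 4.0]
disbursements = [2.0, 4.0, 6.0, 8.0]
print(pearson(receipts, disbursements))
```

A result close to 1 or -1 corresponds to the ends of the gauge, and a result close to 0 to its middle.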
Please note that this demo currently does not work properly in Internet Explorer.

Interesting Observations

Disbursements and receipts are highly correlated within the same party (e.g. Democratic receipts and Democratic disbursements), and both variables are fairly strongly correlated across parties (e.g. Republican receipts and Democratic receipts). Loans do not seem to have a strong correlation with other variables or across parties. While state population is not included in this demo, it too seems to be related fairly strongly to disbursements and receipts.

Technology Highlights

This demo uses the State Variable Comparison JavaScript API to allow comparison of the various campaign money variables. It creates a Google datatable via a SPARQL query and then passes the table on to the API for processing. Dataset 10011 is used to connect state abbreviations to state names.
Uses Dataset: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Thumbnail: 

Dynamic RSS generation using SPARQL and Yahoo Pipes

Description: 
This demo shows how to create an RSS 2.0 feed for RDF data using SPARQL and Yahoo! Pipes.
Contributor:
The following is a step-by-step explanation of the workflow of this demo.
Step 1 locate RDF data (TWC LOGD in the News)
The RDF data used in this demo is collected by parsing RDFa metadata embedded in web pages on http://data-gov.tw.rpi.edu, and here is a link to an example page. The RDFa annotation of the example page can be extracted by any RDFa parser, and here is a link to the parse result.
Step 2 compose a SPARQL query to prepare the content of RSS
The following SPARQL query (using the SELECT form) is used in this demo. The variable names are carefully chosen to match the terms used in RSS. Therefore, users should keep the same variable names (title, image, description, link, and pubDate) when composing RSS content from other RDF data.
PREFIX rss: 
PREFIX dcterms: 
PREFIX foaf: 

SELECT *
WHERE{
 GRAPH ?g {
 _:x a 
    ;foaf:name ?title
    ;foaf:depiction ?image
    ;dcterms:description  ?description
    ;dcterms:source ?link
    ;dcterms:created ?pubDate
 .
}}

This SPARQL query also has some limitations, as users cannot control the title and description of the RSS channel. Although the "CONSTRUCT" form looks promising for solving this issue, it has limitations too. First, SPARQL results use xsd:dateTime to encode dates (e.g. "2010-05-21T00:00:00"), while RSS 2.0 requires dates formatted per RFC 822 (e.g. "Thu, 20 May 2010 17:00:00 -0700"). Second, the output RDF/XML syntax uses "rdf:Resource" as the element tag, while RSS 2.0 demands a specific XML format (it expects "<channel>" and "<item>" elements instead). Therefore, we stick to "SELECT"-based data construction and generate RSS using a web service, Yahoo! Pipes in this demo.
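The date mismatch described above can be bridged with a small conversion step. The following is a minimal sketch in Python rather than the Yahoo! Pipes logic used by the demo; the function name is hypothetical, and treating values without a time zone as UTC is an assumption of this sketch.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

def xsd_to_rfc822(value, tz=timezone.utc):
    """Convert an xsd:dateTime string to an RFC 822 style date string.

    Values without a time zone are assumed to be in `tz` (UTC by default);
    that default is an assumption of this sketch, not something the demo states.
    """
    dt = datetime.fromisoformat(value)
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=tz)
    return format_datetime(dt)

print(xsd_to_rfc822("2010-05-21T00:00:00"))        # Fri, 21 May 2010 00:00:00 +0000
print(xsd_to_rfc822("2010-05-20T17:00:00-07:00"))  # Thu, 20 May 2010 17:00:00 -0700
```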
Step 3 query RDF data in a SPARQL endpoint using SparqlProxy and then get JSON back
The data in this demo are all loaded into a triple store with the SPARQL endpoint http://data-gov.tw.rpi.edu/ws/lodc.php. For each page, its RDF metadata is stored in one named graph in the triple store. The SparqlProxy query interface and choices of parameters can be viewed via this link.
  •  check "Query URI" and paste Query URI after
  •  check "sparqljson" as output format
  •  click "query" button

The query results can be accessed by application via this link and viewed in text format via another link.  
Step 4 build a Yahoo! Pipes web service that converts the SPARQL query results (in SPARQL/JSON format) into RSS 2.0 data
The code is available at http://pipes.yahoo.com/pipes/pipe.info?_id=5d8842906152a7d3152a128666970f13 . Following are some notable points:
  • you need a Yahoo ID to access Yahoo! Pipes
  • SPARQL/JSON is used to provide access to the data structure; Yahoo! Pipes has a hard time dealing with the XML version
  • the representation of dates is not compatible between RDF (ISO 8601) and RSS (which requires RFC 822)
Finally, the resulting RSS feed is generated, and Yahoo! also provides a human-readable page for the RSS feed (with images). The RSS feed is also imported by the RSS aggregator on the TWC LOGD front page.
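For readers without access to Yahoo! Pipes, the conversion in Step 4 can be approximated locally. The sketch below is not the pipe's actual logic: it assumes the query results follow the standard SPARQL JSON results layout and uses the variable names from Step 2. The function name, channel values, and sample item are hypothetical.

```python
import json
import xml.etree.ElementTree as ET

def sparql_json_to_rss(results_json, channel_title, channel_link, channel_description):
    """Build an RSS 2.0 document (as a string) from SPARQL JSON results."""
    data = json.loads(results_json)
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = channel_description
    for binding in data["results"]["bindings"]:
        item = ET.SubElement(channel, "item")
        # pubDate is copied as-is; it must already be in RFC 822 form.
        for name in ("title", "link", "description", "pubDate"):
            if name in binding:
                ET.SubElement(item, name).text = binding[name]["value"]
    return ET.tostring(rss, encoding="unicode")

# Hypothetical query result with a single row.
sample = json.dumps({
    "head": {"vars": ["title", "link", "description", "pubDate"]},
    "results": {"bindings": [{
        "title": {"type": "literal", "value": "Example story"},
        "link": {"type": "uri", "value": "http://example.org/story"},
        "description": {"type": "literal", "value": "A hypothetical news item."},
        "pubDate": {"type": "literal", "value": "Fri, 21 May 2010 00:00:00 +0000"},
    }]},
})
feed = sparql_json_to_rss(sample, "Example feed", "http://example.org/", "Hypothetical channel")
print(feed)
```

Unlike the SELECT-only approach, the channel title and description are passed in explicitly here, which is one way around the first limitation noted in Step 2.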
Uses Technology: 
Uses Technology: 
Uses Technology: 
Thumbnail: 

White House Visitors (mobile version)

Description: 
The White House Visitors/Visitees Demo for the iOS platform shows how RDF data can be consumed on a mobile device.
Contributor:

About

List of top White House Visitees
List of the Top 25 White House Visitees
The White House Visitors/Visitees Demo for the iOS platform shows how an RDF version of the White House Visitor Records can be consumed on a mobile device. The application queries two separate datasets, Data.gov dataset 10025 and the White House Visitees dataset. There are two views over the underlying data: visitors to the White House, and the visitees who received them (depicted right: top 25 visitees). This data is then transformed into a table by the software.
Visitor data visualized using a pie chart
Because the data are easily interpreted by machines, they can be presented in alternative forms, e.g. as a pie chart (see left). Visualization methods such as this can also make it easier to identify useful patterns in data and identify possible irregularities, e.g. if there is an extremely large portion in the pie chart there may be issues in the collected data.    

Linked Data

Information gleaned about the President of the United States using the semantic web
Information obtained by querying linked data sources
The White House Visitors Demo takes advantage of a concept called linked data: data sets hosted across the World Wide Web that provide links to entities in other data sets. The White House Visitees data set, mentioned earlier, links the individuals in the White House Visitors dataset with URIs representing them in DBpedia (a machine-understandable version of Wikipedia). DBpedia includes information such as an individual's office (allowing us to infer that Barack Obama is President, left) and links to many additional sites, such as the New York Times linked open data site, where machines can get the information needed to query the New York Times databases for articles (see right) and other materials.
Querying the New York Times using linked data
Ten most-recent articles published by the New York Times

Uses Technology: 
Uses Technology: 
Uses Technology: 
Thumbnail: 

Interactive Government Receipts Timeline

Description: 
Timelines for comparing government agencies' account earnings from 1962 to 2014
Contributor:

Interesting Observations

The user chooses one or more government account names from a checklist (see Figure 1), and the application displays a timeline of their earnings from 1962 to the present, with projected values through 2014 (see figure above). The data and the accounts on the checklist come from dataset 403, published by the Office of Management and Budget.
Figure 1. The checklist of accounts, the first thing that appears in this demo.
Potential uses of this app include (but of course are not limited to) seeing how much revenue a particular tax or program is generating, how effective it has been over time, comparing programs, or seeing how much money certain government-funded foundations have received.

Technology Highlights

World Earthquake Map (Yahoo Pipes)

Description: 
This demo shows a map of the world depicting the locations of all earthquakes of magnitude greater than 3.0 and depth of less than 50 miles over the previous seven days. Clicking on a location shows additional details.
Contributor:

Technical Highlights

In this demo, we show how RDF can be consumed by Yahoo! Pipes, a popular web-based tool for building mashups. The following step-by-step instructions show
  • how conventional Web data publishing and consumption can be connected with RDF
  • how RDF + SPARQL can be used to support data manipulation in mashups.

Step 1 locate earthquake data on data.gov
The earthquake dataset published at Data.gov

Step 2 convert earthquake data from CSV to RDF
Web interface of CSV2RDF that was used to convert earthquake data from CSV to RDF
* CSV data: http://www.data.gov/download/34/csv
* xmlbase: http://data-gov.tw.rpi.edu/raw/34
* property namespace: http://data-gov.tw.rpi.edu/vocab/p/32  
note: we reused the property namespace from a similar dataset, Dataset 32 (Worldwide M1+ Earthquakes, Past Hour, Department of the Interior)
Step 3 query RDF data and get the results back into a CSV file
  • we use the following query to filter earthquake data by magnitude and depth. Note that some proposed SPARQL 1.1 features are used here, including datatype casting (e.g. "xsd:float").
PREFIX dgp32:  <http://data-gov.tw.rpi.edu/vocab/p/32/>
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#> 
SELECT ?id ?label ?datetime  ?lat ?lon (xsd:float(?mag) as ?magnitude) (xsd:float(?dep) as ?depth)  ?region ?src ?uri
FROM <http://data-gov.tw.rpi.edu/raw/34/data-34.rdf>

WHERE { 
         ?uri dgp32:eqid ?id. 
         ?uri dgp32:eqid ?label. 
         ?uri dgp32:region ?region.
         ?uri dgp32:datetime ?datetime. 
         ?uri dgp32:magnitude ?mag .
         ?uri dgp32:depth ?dep. 
         ?uri dgp32:lat ?lat.
         ?uri dgp32:lon ?lon. 
         ?uri dgp32:src ?src.
         filter ( xsd:float(?mag) >= 3 && xsd:float(?dep) <= 50 )  }
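The casts in the filter matter because the values are stored as plain strings. The sketch below reproduces the same thresholds (magnitude at least 3, depth at most 50) in Python over a few hypothetical rows, and shows how a string comparison would give the wrong answer without the cast.

```python
# Hypothetical rows; values are strings, as untyped literals would be.
rows = [
    {"id": "a", "magnitude": "4.7", "depth": "9.5"},
    {"id": "b", "magnitude": "2.9", "depth": "10.0"},
    {"id": "c", "magnitude": "5.1", "depth": "120.3"},
]

def keep(row, min_magnitude=3.0, max_depth=50.0):
    """Mirror the query's filter: cast to float before comparing."""
    return float(row["magnitude"]) >= min_magnitude and float(row["depth"]) <= max_depth

selected = [row["id"] for row in rows if keep(row)]
print(selected)               # ['a']

# Without the cast, text is compared character by character.
print("9.5" <= "50")          # False, although 9.5 <= 50 numerically
print(float("9.5") <= 50.0)   # True
```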

Web interface of SparqlProxy that was used to query earthquake data in RDF and return the results back in CSV
* type or paste the above SPARQL query in the "SPARQL query" field
* check "CSV"  as output format
* click "query" button
Step 4 build a Yahoo Pipes! demo
  • you need a yahoo ID to access Yahoo Pipes!
  • build a Yahoo pipe that consumes the filtered CSV and plots the data on a map
    • use "Sources">"Fetch CSV" to retrieve the SPARQL query result
    • use "Operations">"rename" to rename columns
    • use "Operations">"Loop" together with "String">"String Builder" to compose a text "description" for each earthquake
    • use "Operations">"Loop" together with "String">"String Builder" to compose a USGS "url" for each earthquake
    • use "Location">"Location Extraction" to put the data on a map.
Web interface of SparqlProxy displaying filtered earthquake data (query results in CSV)
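The fetch, rename, and string-building steps of the pipe can be mimicked outside Yahoo! Pipes. The following sketch only illustrates the idea: the renamed column names, the description template, and the sample row are all hypothetical, and the USGS "url" step is omitted because its URL pattern is not given here.

```python
import csv
import io

# Hypothetical renames, standing in for the pipe's "rename" operation.
RENAMES = {"lat": "latitude", "lon": "longitude"}

def build_items(csv_text):
    """Parse CSV query results, rename columns, and add a text description."""
    items = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        item = {RENAMES.get(key, key): value for key, value in row.items()}
        # Stands in for the "Loop" + "String Builder" step.
        item["description"] = (
            "M{magnitude} earthquake, {region}, depth {depth}, {datetime}".format(**item)
        )
        items.append(item)
    return items

# Hypothetical CSV in the shape returned by the query in Step 3.
sample = (
    "id,datetime,lat,lon,magnitude,depth,region\n"
    "x1,2010-05-21T00:00:00,10.5,-60.2,4.7,9.5,Example Region\n"
)
items = build_items(sample)
print(items[0]["description"])
```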
Uses Technology: 
Uses Technology: 
Thumbnail: 

World Earthquake Map (MIT Exhibit)

Description: 
This demo shows a map of the world depicting the locations of all earthquakes of magnitude greater than 3.0 and depth of less than 50 miles over the previous seven days. Clicking on a location shows additional details. Click on a property facet to filter relevant earthquakes.
Contributor:

Interesting Observations

A user may use a map to view recent earthquakes, and SPARQL querying can be used to filter what is displayed on the map (e.g. only show earthquakes whose magnitude is no less than 2.0).

Technology Highlights

Simile Exhibit offers a convenient interface that accepts JSON data sources. However, Exhibit and the Google Visualization API accept different JSON formats, so we used SparqlProxy to bridge this gap. The key step is the following link element, which asks SparqlProxy for Exhibit-style output:
  
 <link rel="exhibit/data"
       href="http://data-gov.tw.rpi.edu/ws/sparqlproxy.php?query-uri=http%3A%2F%2Fdata-gov.tw.rpi.edu%2Fsparql%2Fdemo-34-exhibit.sparql&output=exhibit" 
       type="application/json"
       charset="utf-8"/>
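The href above is a SparqlProxy call with a URL-encoded query location. A minimal sketch of assembling such a URL is shown below; the parameter names "query-uri" and "output" are taken from the href itself, while the helper name is hypothetical.

```python
from urllib.parse import urlencode

PROXY = "http://data-gov.tw.rpi.edu/ws/sparqlproxy.php"

def proxy_url(query_uri, output):
    """Return a SparqlProxy URL for a stored query and an output format."""
    return PROXY + "?" + urlencode({"query-uri": query_uri, "output": output})

url = proxy_url("http://data-gov.tw.rpi.edu/sparql/demo-34-exhibit.sparql", "exhibit")
print(url)
```

Changing the output argument is how the same stored query could be served in another format, assuming the proxy supports it.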
The SPARQL query is at http://data-gov.tw.rpi.edu/sparql/demo-34-exhibit.sparql, and the complete source code is at http://code.google.com/p/data-gov-wiki/source/browse/trunk/www/demo/exhibit/demo-34-exhibit-sparql.html. Below is some documentation for using Exhibit, which is spread around the web, and some useful tips:

More

Currently, the depth property in the RDF data is an untyped literal. It should be converted to the float datatype so that SPARQL filters can be applied to it. There is also a way to show the entire RDF/XML data directly in MIT Exhibit: replace the data link with the following line (see this link)
 <link rel="exhibit/data"
       href="http://data-gov.tw.rpi.edu/raw/34/data-34.rdf" 
       type="application/rdf+xml"
       charset="utf-8"/>
The complete source code is at http://code.google.com/p/data-gov-wiki/source/browse/trunk/www/demo/exhibit/demo-34-exhibit-rdf.html. See http://simile.mit.edu/wiki/Exhibit/How_to_import_RDF_on_the_fly for more information.
Uses Technology: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Thumbnail: 

White House Visitation Patterns

Description: 
Compute the social network of White House visitor-visitee, and find the most popular visitee.
Contributor:

More Demos

White House Visitor-Visitee Network
White House Visitor-Visitee Network

Treemap view
Popular White House Visitees (TreeMap)
Uses Technology: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Thumbnail: 

US and UK Crime Data

Description: 
This demo uses Dataset 311 from data.gov for US data and http://PoliceAPI.rkh.co.uk for UK data to compare the number of crimes in cities in the US and UK.
Contributor:

Description

We compare crime in cities, and several interesting things stand out. First, the classification of crimes is different in each country. There are certain similarities, such as 'Burglary' and 'Robbery'; however, other categories are not easy to match (for example, 'Vehicle theft' and 'Vehicle crimes' are similar but not the same). Another point is the granularity of the data available in the UK: it is possible to select a whole police force area, a city or a particular subarea. For the US, we considered only cities. We also allow users to divide the number of events by the location's population. In this way we can compare events per person instead of considering only totals.
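The per-person normalization is a simple division, sketched below. The function name, the optional scaling factor, and the figures are hypothetical and are not taken from either crime dataset.

```python
def per_capita(events, population, per=1):
    """Return events per `per` residents (per=1 gives events per person)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return events * per / population

# Hypothetical cities: more total events does not imply a higher rate.
city_a = per_capita(events=5000, population=1_000_000, per=1000)
city_b = per_capita(events=900, population=100_000, per=1000)
print(city_a, city_b)  # 5.0 9.0
```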
Uses Dataset: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Thumbnail: 

US Government Agencies' Contributions at Data.gov

Description: 
This demo shows how many raw datasets have been contributed to Data.gov by each US government agency.
Contributor:
Contributor:
Interesting Observations

This demo queries the data.gov catalog to list all the agencies that submitted data to data.gov, then counts the number of datasets each submitted. From the visualization, web users are presented with the following observations:

Technical Highlights

In this demo, SPARQL query results are displayed using various Google Visualization APIs, including pie chart, bar chart, line chart and table.
This demo also shows the use of an aggregation function in SPARQL (we used Virtuoso syntax).
# NOTE:
#  this query used SPARQL 1.1 feature, and its SELECT clause is customized to virtuoso syntax
#  we only query raw data catalog
#
PREFIX dgp92:  <http://data-gov.tw.rpi.edu/vocab/p/92/>
SELECT  ?agency , count(*) AS ?cnt
FROM NAMED <http://data-gov.tw.rpi.edu/raw/92/data-92.rdf>
WHERE { 
GRAPH <http://data-gov.tw.rpi.edu/raw/92/data-92.rdf>
{
 ?entry dgp92:agency ?agency. 
 ?entry dgp92:data_gov_data_category_type "Raw Data Catalog" .
}
}
GROUP BY ?agency
ORDER BY DESC (?cnt )
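For readers less familiar with SPARQL aggregation, the sketch below shows in Python what the query above computes: filter to raw data catalog entries, count entries per agency, and sort by count in descending order. The sample entries are hypothetical.

```python
from collections import Counter

# Hypothetical catalog entries (agency, category type).
entries = [
    {"agency": "Agency A", "category": "Raw Data Catalog"},
    {"agency": "Agency B", "category": "Raw Data Catalog"},
    {"agency": "Agency A", "category": "Tool Catalog"},
    {"agency": "Agency A", "category": "Raw Data Catalog"},
]

# GROUP BY ?agency with count(*), restricted to the raw data catalog.
counts = Counter(e["agency"] for e in entries if e["category"] == "Raw Data Catalog")

# ORDER BY DESC(?cnt)
ranked = counts.most_common()
print(ranked)  # [('Agency A', 2), ('Agency B', 1)]
```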

Uses Dataset: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Thumbnail: 

US Agency Budget Browser

Description: 
Browse an agency's budget data (summary and account details) using three Public Budget datasets.
Contributor:

Interesting Observations

This demo uses SPARQL to combine data from dataset 401 budget authority (money allocated by Congress), dataset 402 budget outlays (actual expenses) and dataset 403 receipts by various government accounts from 1976 to 2014 (values from the present to 2014 are projected).
  • Not all agencies show up in all datasets, and many of them are not mentioned in Dataset 403 (Public Budget Database - Governmental receipts 1962-2015, Executive Office of the President).
  • There are both positive and negative figures in the budget; further investigation is needed.

Example 1
Reforms and Recoveries for Elementary and Secondary Education

Technology Highlights

SPARQL queries were used to aggregate data from three datasets
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?p  sum(xsd:integer (?o)) ?g 
WHERE 
{GRAPH ?g
{
 # match the specific BGP first, then filter based on account_name. 
 # only join with the completely unbound triple pattern after the filter so that the intermediate result size isn't large.
 {
  ?s <http://data-gov.tw.rpi.edu/vocab/p/401/agency_name> "Department of the Interior" . 
  ?s <http://data-gov.tw.rpi.edu/vocab/p/401/bureau_name> "Departmental Offices" . 
  ?s <http://data-gov.tw.rpi.edu/vocab/p/401/account_name> ?account_name.
  filter (regex(?p,"num"))
 }
 ?s ?p ?o.
}
}
group by ?g ?p 

Uses Dataset: 
Uses Dataset: 
Uses Dataset: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Thumbnail: 

Tracking Changes of data.gov Catalog via RSS

Description: 
RSS feeds for recently updated datasets on data.gov
Contributor:

Interesting Observations

The datasets published at data.gov change on a weekly basis. In order to find out (1) what datasets are available on data.gov, and (2) which datasets were recently updated, we generate RSS feeds daily using semantic web technologies. The following are some examples:
  • today-raw-ping.rss: lists all raw datasets published at data.gov as of today, and updates their dates by pinging their data access points.
  • diff-today.rss: lists datasets (raw dataset and tool catalog) that were added, deleted or updated since the last catalog update (see figure).
  • http://twitter.com/datagovwiki Twitter updates
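The comparison behind a feed such as diff-today.rss can be sketched as a difference between two catalog snapshots. This is an illustration only, since the demo's actual implementation is not described here; the snapshot structure (dataset id mapped to a last-modified value) and the sample ids are hypothetical.

```python
def diff_catalog(previous, current):
    """Compare two snapshots mapping dataset id -> last-modified value."""
    added = sorted(current.keys() - previous.keys())
    deleted = sorted(previous.keys() - current.keys())
    updated = sorted(
        key for key in current.keys() & previous.keys()
        if current[key] != previous[key]
    )
    return {"added": added, "deleted": deleted, "updated": updated}

# Hypothetical snapshots from two consecutive catalog crawls.
yesterday = {"32": "2010-05-20", "34": "2010-05-20", "92": "2010-05-01"}
today = {"34": "2010-05-21", "92": "2010-05-01", "401": "2010-05-21"}
changes = diff_catalog(yesterday, today)
print(changes)
```

Each of the three lists could then be turned into feed items, one per dataset.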

Technology Highlights

Medicare Claims versus Interstate Migration

Description: 
This demo traces population migration in contrast to Medicare claims.
Contributor:
Uses Dataset: 
Uses Dataset: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Thumbnail: 

Semantic Search of the Data-gov Catalog

Description: 
Search the data-gov catalog entries using RDFa and Yahoo! Boss.
Contributor:
Contributor:

Motivating Problem

Keyword based search engines return results based on keyword matches, where result descriptions typically consist of page abstracts with the query keywords highlighted. However, these search result descriptions do not always present useful information for a user to understand the content of the page.

Implementation

Before implementing this search service, we first wrote an RDFa extension for Semantic MediaWiki (SMW). This extension extracts the semantic data of SMW pages, converts it into RDFa, and embeds it within the SMW page. The extension is applied to the Data.gov Catalog datasets and generates RDFa data about them. The search service accesses the RDFa data through queries to our triple store. Upon receiving a user query from the web interface, we send the query to the Yahoo! Boss Search Application to get the search results in XML format. We then parse this XML to fetch the URLs of the results. Using these URLs, we form SPARQL queries to retrieve the RDFa triples we want. Finally, we present these triples to the user. Read more on How to Build Data-gov Semantic Search to Consume RDFa
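The parse-then-query step described above can be sketched as follows. This is not the service's actual code: the XML layout (a "url" element per result) is a hypothetical stand-in for the search engine's response format, and the query shape is one simple way to fetch all triples about a page.

```python
import xml.etree.ElementTree as ET

def result_urls(xml_text):
    """Collect the text of every <url> element (hypothetical response layout)."""
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iter("url") if el.text]

def describe_query(url):
    """Return a SPARQL query selecting all properties and values of `url`."""
    if any(ch in url for ch in '<>"{}|^`\\ '):
        raise ValueError("URL contains characters not allowed in a SPARQL IRI")
    return "SELECT ?p ?o WHERE { <%s> ?p ?o }" % url

# Hypothetical search response with one result.
sample = "<results><result><url>http://example.org/Dataset_34</url></result></results>"
queries = [describe_query(u) for u in result_urls(sample)]
print(queries[0])
```

The check in describe_query is a basic guard against building a malformed query from untrusted search results.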

Benefit

Using this semantic search application on the Data.gov catalog, we can enhance a user's search experience by giving them convenient summary information about their search results (allowing them to more easily find relevant pages).

Technology Highlights

  • We use the RDFa extension of SMW to generate RDFa data for the data-gov catalog.
  • We use ARC2 to load the RDFa triples into an ARC2 triple store.
  • We use the Yahoo! Boss Application to search for related web content based on user input.
  • The Yahoo! Boss Application returns an XML document to our server.
  • We parse the XML results to get URL information, and form SPARQL queries to query the RDFa data in the ARC2 triple store.
  • We parse the RDFa data and present the enhanced results to the user.
Uses Technology: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Thumbnail: 

Public Housing Stimulus Funds versus Section 8 Units by State, 2009

Description: 
This demo takes fund allocation data for public housing as granted by the 2009 American Recovery and Reinvestment Act and compares it to the number of section 8 public housing units currently established (as reported by the Department of Housing and Urban Development), for each state.
Contributor:
The map displays low-income housing awards and number of public housing units as a ratio, normalized by the population of each state. The column charts provide comparisons of the state funding and unit number values to the national averages.

Interesting Observations

Despite having the second largest population in the United States and a large number of section 8 units available, Texas did not receive any low-income housing awards through the 2009 stimulus package. One possibility that may explain this is that Texas may already receive funding from other sources/agencies such that it was not determined necessary to grant public housing funds to Texas through the 2009 stimulus. In the future, it may be interesting to examine whether the poverty level of each state has some effect on these results. Incorporation of more data regarding non-stimulus funding for public housing may also be beneficial. Florida is also of interest as it had the highest funding-per-unit ratio out of all the states. The state has a very large elderly population, which may factor into this.
Uses Dataset: 
Uses Dataset: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Thumbnail: 

Public Company Bankruptcy Cases, Fiscal Year 2009

Description: 
This demo is built upon SEC data about bankruptcy cases of public companies in fiscal year 2009. The map visualizes the total debt over asset ratio at the state level. Upon clicking on a state, a bar chart shows the details of each company that went bankrupt.
Contributor:

Interesting Observations

  • Most of the largest bankruptcies involve financial companies with very large amounts of debt, which is consistent with public discussion about regulation of the financial industry.
  • Combining this data with the tax rate of each state, to look at the payoff from the debt, would give more insight into the situation.
  • More data about the stock prices of those public companies would shed light on the changes in equity assets.

Technology Highlights

Make SEC data linkable by querying the 'state' graph (Dataset 10011)

We created Dataset 10011 for US states; it covers many alternative labels for each state. Each such label uniquely identifies a location that is known to be a US state. Therefore, given that the context is US states, we can use skos:altLabel as a key to determine which US states are mentioned in dataset 1580.
SPARQL endpoint: http://data-gov.tw.rpi.edu/sparql
SPARQL query:
PREFIX odg1580: <http://data-gov.tw.rpi.edu/vocab/p/1580/>
prefix skos:  <http://www.w3.org/2004/02/skos/core#> 
prefix dgtwc: <http://data-gov.tw.rpi.edu/2009/data-gov-twc.rdf#> 

prefix foaf:  <http://xmlns.com/foaf/0.1/> 

SELECT ?state_name ?company ?asset ?liabilities
WHERE {
graph <http://data-gov.tw.rpi.edu/vocab/Dataset_1580>
{
  ?s odg1580:state ?state.
  ?s odg1580:company_name ?company.
  ?s odg1580:assets_millions ?asset.
  ?s odg1580:liabilities_millions ?liabilities.
}
graph <http://data-gov.tw.rpi.edu/vocab/Dataset_10011>

{        
         ?uri_dgtwc skos:altLabel ?state .
         ?uri_dgtwc dgtwc:abbreviation ?state_abbrev .
         ?uri_dgtwc foaf:name ?state_name .
}
}
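The join on alternative labels, followed by the state-level debt over asset ratio shown on the map, can be sketched in Python. Several things here are assumptions: the label table is a tiny hypothetical subset, the lower-casing of labels is an addition (the query above matches literals exactly), and computing the ratio as total liabilities divided by total assets per state is a guess at how the map aggregates.

```python
from collections import defaultdict

# Hypothetical subset of alternative labels -> canonical state name.
ALT_LABELS = {
    "ny": "New York", "new york": "New York",
    "tx": "Texas", "texas": "Texas",
}

# Hypothetical cases: (state label, company, assets, liabilities) in millions.
cases = [
    ("NY", "Company A", 100.0, 250.0),
    ("New York", "Company B", 300.0, 350.0),
    ("Texas", "Company C", 50.0, 25.0),
    ("Unknown", "Company D", 1.0, 1.0),
]

def debt_to_asset_by_state(cases, alt_labels):
    """Resolve state labels, then return liabilities/assets per state."""
    totals = defaultdict(lambda: [0.0, 0.0])  # state -> [assets, liabilities]
    for label, _company, assets, liabilities in cases:
        state = alt_labels.get(label.strip().lower())
        if state is None:
            continue  # label does not identify a known state
        totals[state][0] += assets
        totals[state][1] += liabilities
    return {
        state: liabilities / assets
        for state, (assets, liabilities) in totals.items()
        if assets > 0
    }

ratios = debt_to_asset_by_state(cases, ALT_LABELS)
print(ratios)  # {'New York': 1.5, 'Texas': 0.5}
```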
Uses Dataset: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Thumbnail: 

Multifamily Housing Physical Inspection Scores

Description: 
This demo allows a user to view the scores of physical inspections of properties that are owned, insured or subsidized by HUD, including public housing and multifamily assisted housing.
Contributor:
Uses Dataset: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Thumbnail: 

Multi-word Tag Cloud from Government Dataset Titles

Description: 
This demo shows a tag cloud consisting of multi-word phrases extracted from data.gov datasets' titles.
Contributor:

Exhibit

For the first time, we have a real multi-word tag cloud! My blog post for this demo, "Multi-Word TagCloud on Web N-gram Now", is out, so please leave your comments there. To learn how to build something like this demo, please read How to use Microsoft Web N-gram service.
A single-word tag cloud is OK; see the raw data in media:datagov-dataset-single.txt.
A multi-word tag cloud is more meaningful; see the raw data in media:datagov-dataset-mix.txt. Also see a different rendering of the same data: http://www.wordle.net/show/wrdl/2061064/Multi-word_Tag_Cloud_from_data.gov_dataset_titiles

A tag cloud with multi-word tags only: let's check the magic words. See the raw data in media:datagov-dataset-multi.txt.
We are still improving this demo to extract terms better.

Acknowledgment

This demo is powered by Microsoft Web N-gram Services, http://research.microsoft.com/web-ngram
Uses Dataset: 
Uses Technology: 
Uses Technology: 
Thumbnail: 

Linking Wildland Fire and Government Budget

Description: 
The US government spends billions of dollars on fighting wildland fires, and this demo shows how the budget correlates with fire activity.
Contributor:

Interesting Observations

  • Billions of dollars are spent on fighting wildland fires.
  • The big drop in wildfires in 1985 is strange. Can we find an explanation?
  • While the number of fires has been fairly stable over the past 20 years, the amount of burned land has been growing over the past five years. Meanwhile, the budget is also growing (almost non-linearly in recent years). It would be useful to explain which department, the Department of the Interior or the Department of Agriculture, takes the primary role in fighting wildland fires and should receive more of the budget allocation.

Technology Highlights

Find relevant data in Budget Dataset

We use SPARQL to list the relevant budget accounts:
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

# SPARQL 1.1 requires the aggregate to be wrapped as (SUM(...) AS ?total)
SELECT ?p (SUM(xsd:integer(?o)) AS ?total) ?agency
WHERE {
  GRAPH <http://data-gov.tw.rpi.edu/vocab/Dataset_401> {
    # Match the specific BGP first, then filter on account_name. Only join with the
    # completely unbound triple pattern after the filter, so that the intermediate
    # result size isn't large.
    {
      ?s <http://data-gov.tw.rpi.edu/vocab/p/401/account_name> ?account_name .
      ?s <http://data-gov.tw.rpi.edu/vocab/p/401/bureau_name> ?bureau .
      ?s <http://data-gov.tw.rpi.edu/vocab/p/401/agency_name> ?agency .
      FILTER (regex(?account_name, "Wildland Fire"))
    }
    ?s ?p ?o .
  }
}
GROUP BY ?p ?agency

Collect Annotations from Users

We use a semantic wiki to let users collaboratively contribute news to Dataset_WildfireNews. The news is then published on the fly via the Wildfire News RSS feed. The RSS data is not loaded into the triple store, so it is fetched again every time we reload the live demo. The following is a sample SPARQL query (with a FROM clause):
SELECT ?date ?title ?link
FROM <http://data-gov.tw.rpi.edu/wiki/Special:Ask/-5B-5BCategory:Wildfire-20News-20Item-5D-5D/-3FDcterms:created%3Ddate/sort%3DDcterms:created/order%3DDESC/format%3Drss/title%3DWildfire-20News/description%3DEvents-20important-20to-20Wildland-20Fire-20fighting-20and-20budgeting/limit%3D10>

WHERE {
?s <http://purl.org/rss/1.0/title> ?title .
?s <http://purl.org/rss/1.0/link> ?link .
?s <http://purl.org/rss/1.0/description> ?description .
?s <http://purl.org/dc/elements/1.1/date> ?date.
}
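For readers who want to see what the query above pulls out of the feed, this Python sketch reads the same three fields (date, title, link) directly from an RSS 1.0 document using the standard library. The sample feed is a hand-written placeholder, not real wiki output, and the sort order is an assumption that mirrors the DESC ordering requested in the feed URL.

```python
import xml.etree.ElementTree as ET

# Namespaces used by RSS 1.0 feeds; these match the property IRIs in the query above.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Placeholder feed for illustration only.
SAMPLE_RSS = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <item rdf:about="http://example.org/news/a">
    <title>Placeholder item A</title>
    <link>http://example.org/news/a</link>
    <dc:date>2009-08-20</dc:date>
  </item>
  <item rdf:about="http://example.org/news/b">
    <title>Placeholder item B</title>
    <link>http://example.org/news/b</link>
    <dc:date>2009-09-22</dc:date>
  </item>
</rdf:RDF>
""".strip()


def read_items(rss_xml):
    """Return a list of {date, title, link} dicts, newest first."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.findall("rss:item", NS):
        items.append({
            "date": item.findtext("dc:date", default="", namespaces=NS),
            "title": item.findtext("rss:title", default="", namespaces=NS),
            "link": item.findtext("rss:link", default="", namespaces=NS),
        })
    # ISO-style dates sort correctly as plain strings.
    return sorted(items, key=lambda it: it["date"], reverse=True)


items = read_items(SAMPLE_RSS)
for it in items:
    print(it["date"], it["title"], it["link"])
```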

Connect to Dbpedia/Wikipedia

We can query DBpedia for wildland fires in the US using the category yago-class:WildfiresInTheUnitedStates. Note that DBpedia provides a SPARQL endpoint at http://dbpedia.org/sparql.
SELECT DISTINCT ?subject ?label ?comment ?page ?image ?acres
WHERE {
  {
    {
      ?s a <http://dbpedia.org/class/yago/WildfiresInTheUnitedStates> .
      ?s <http://www.w3.org/2000/01/rdf-schema#label> ?label .
      FILTER (lang(?label) = "en")
    }
    ?s <http://www.w3.org/2004/02/skos/core#subject> ?subject .
    # str() lets regex() work when ?subject is bound to an IRI
    FILTER (regex(str(?subject), "[1-2][0-9][0-9][0-9]_in_the_United_States"))
  }
  ?s <http://www.w3.org/2000/01/rdf-schema#comment> ?comment .
  ?s <http://xmlns.com/foaf/0.1/page> ?page .

  OPTIONAL { ?s <http://xmlns.com/foaf/0.1/depiction> ?image . }
  OPTIONAL { ?s <http://dbpedia.org/property/acres> ?acres . }

  FILTER (lang(?comment) = "en")
}

Uses Dataset: 
Uses Dataset: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Thumbnail: 

Linking USPS Spending and News

Description: 
US Postal Service's financial history by: (i) net income (income - expenses), budget authority (money allocated by Congress) and outlays (actual expenses) on a timeline (ranging from 1976 to 2008), and (ii) historical events dynamically loaded from an RSS feed at semantic wiki
Contributor:
Contributor:

Interesting Observations

This demo shows the US Postal Service's financial history by: (i) plotting net income (income - expenses), budget authority (money allocated by Congress) and outlays (actual expenses) on a timeline (ranging from 1976 to 2008), and (ii) annotating the timeline with historical events, dynamically loaded from an RSS feed of articles on this wiki.
  • As might be expected, outlays and net income almost always go in opposite directions. When net income is increasing, outlays decrease, and vice versa.
  • Fiscal year 2003 appears to have been a great year for the USPS. Outlays went lower than ever, into negatives (meaning the USPS was earning money), and net income reached a record high. The next year it seems that Congress responded by not granting the Postal Service nearly as much budget authority as before. Following goals laid out in the 2002 Transformation Plan may have done much to cut costs.
  • Conversely, fiscal year 2007 looks like it was a terrible year, cost-wise, for the USPS. Outlays shot to record highs and net income to record (negative) lows, and more budget authority appears to have been granted to help the Postal Service out. Ironically, the Postal Accountability and Enhancement Act, meant to liberalize the USPS and make it more flexible as a business, had just been passed in 2006.
  • In 2008, net income was still negative and outlays remained high, but they had reversed direction from 2006 (income rising, outlays decreasing). Perhaps the result of selling many new Forever stamps? Or the Postal Accountability Act finally having a positive influence?
  • When outlays rose, budget authority had increased the year before, except for between 2003 and 2006.

Technical Highlights

We used the following technology
  • RDF - this demo uses four RDF datasets, two from Data.gov and two directly from the US Postal Service.
  • RSS - this demo uses one RSS feed published on this wiki.
  • Semantic MediaWiki - this demo lets users dynamically edit the list of USPS history events (and the RSS feed) using the semantic forms and template features of Semantic MediaWiki.
  • SPARQL - this demo obtains the display data via SPARQL queries on three RDF datasets. We used two SPARQL queries in this demo: http://data-gov.tw.rpi.edu/sparql/postalmoney.sparql, http://data-gov.tw.rpi.edu/sparql/uspsrss.sparql
  • Google Visualization API - we used a collection of RESTful web services to execute the SPARQL queries, convert the SPARQL query results into JSON (a format Google Visualization can process), and then visualize the data using an annotated timeline. Some additional JavaScript code post-processes the date/time column in the JSON file and merges the JSON data with the RSS feed.
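The conversion step in the last item can be sketched as follows. This is not the demo's actual web service; it is a minimal Python illustration that turns a result in the W3C SPARQL JSON results format into the `cols`/`rows` structure used by Google Visualization data tables. The function name, the `numeric_vars` parameter, and the sample values are assumptions, and date handling (which the demo does in JavaScript) is left out.

```python
import json


def sparql_json_to_datatable(result, numeric_vars=()):
    """Convert SPARQL JSON results into a Google Visualization style table.

    `numeric_vars` names the variables to emit as numbers; every other
    variable is emitted as a string. Unbound variables become null cells.
    """
    variables = result["head"]["vars"]
    cols = [
        {"id": v, "label": v, "type": "number" if v in numeric_vars else "string"}
        for v in variables
    ]
    rows = []
    for binding in result["results"]["bindings"]:
        cells = []
        for v in variables:
            cell = binding.get(v)
            if cell is None:
                cells.append({"v": None})
            elif v in numeric_vars:
                cells.append({"v": float(cell["value"])})
            else:
                cells.append({"v": cell["value"]})
        rows.append({"c": cells})
    return {"cols": cols, "rows": rows}


# Placeholder result; the variable names and values are made up.
sample = {
    "head": {"vars": ["year", "amount"]},
    "results": {"bindings": [
        {"year": {"type": "literal", "value": "2007"},
         "amount": {"type": "literal", "value": "12.5"}},
        {"year": {"type": "literal", "value": "2008"}},
    ]},
}

table = sparql_json_to_datatable(sample, numeric_vars={"amount"})
print(json.dumps(table, indent=2))
```

Passing the numeric variables explicitly keeps the sketch short; a fuller version might instead inspect each binding's `datatype` field.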
Worth Noting
  • Some of the data in this demo does not come from data.gov, but from the USPS webpage. The original source for the data can be found here, while our converted RDF is here.
  • Historical information was gathered from various sites around the web. These sources are listed underneath the visualization and in the wiki pages for the articles.
  • Rendering is not perfect in Google Chrome and IE 7: neither browser limits the width of the annotations area, so the text runs off the right side of the screen.

Linked Data

We used two datasets from http://data.gov. In addition, we used two non-Data.gov datasets.

Live Demo - Add More News

We maintain a user-contributed RSS feed (also in RDF/XML format): USPS News RSS. Users can edit or add an event via forms (type the title of your article).
   
Below is a selection of USPS News contributed by Users (sorted by last modification date).
Article Title | Dc:date | Modification date
USPS: Forever Stamp Issued | 12 April 2007 | 22 September 2009 16:26:50
USPS: Government Performance and Results Act | 5 January 1993 | 20 August 2009 21:47:32
USPS: Civil Service Funding Reform | 23 April 2003 | 20 August 2009 21:41:25
USPS: ZIP + 4 code introduced | 1 January 1983 | 20 August 2009 20:51:41
USPS: National Postal Museum opens | 30 July 1993 | 20 August 2009 20:51:21
USPS: Last public service appropriation | 30 September 1982 | 20 August 2009 20:50:57
USPS: John E. Potter becomes postmaster | 1 June 2001 | 20 August 2009 20:50:17
USPS: First use of OCR | 1 January 1982 | 20 August 2009 20:49:29
USPS: 2002 Transformation Plan released | 1 April 2002 | 20 August 2009 20:48:20
Uses Dataset: 
Uses Dataset: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Thumbnail: 

Linking Agency Budgets and New York Times News

Description: 
Annotated timeline of an agency's budget and the New York Times news related to it, 1976-2008
Contributor:

Interesting Observations

This demo plots an agency's budget data from 1976 to 2014 (where 2010-2014 are projected values) and then associates the agency with news from the New York Times. All data are aggregated on a Google Visualization Annotated Timeline. This demo uses the following data sources:
  • From http://data.gov: Dataset 401 (Public Budget Database - Budget Authority and offsetting receipts 1976-2015, Executive Office of the President)
  • From the New York Times News API: http://developer.nytimes.com/
Uses Dataset: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Thumbnail: 

Library Books Per Capita, by State

Description: 
This demo displays the number of library books per capita by state.
Contributor:

Interesting Observations

This demo displays book volumes (adjusted by state population) for each US state library in Dataset 353 on a map of the US. Darker colors indicate more books available to library users. Readers can easily see the following:
  • The New York state library has the largest number of books, but the state also has a large population, so its average number of books per person is low.
  • State libraries on the East Coast and West Coast, plus Texas, hold more books, perhaps because of their larger populations.
  • Some states, such as Maryland, reported 0 book volumes. Further investigation of the State Libraries statistical data (http://harvester.census.gov/imls/data/stla/index.asp) would be useful.
v1 - just books v2 - books per person
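The difference between the two versions comes down to one division. The Python sketch below shows it under stated assumptions: the function name is hypothetical, the figures are invented round numbers rather than values from Dataset 353, and treating a reported 0 as missing data is a design choice made for this example, not something the demo is known to do.

```python
def books_per_capita(volumes, population):
    """Return book volumes per person, or None when no volumes were reported."""
    if population <= 0:
        raise ValueError("population must be positive")
    if volumes == 0:
        # Assumption: a reported 0 is treated as missing rather than a true zero.
        return None
    return volumes / population


# Invented figures: a large collection in a populous state can still rank low.
large_state = books_per_capita(2_000_000, 20_000_000)
small_state = books_per_capita(500_000, 1_000_000)
unreported = books_per_capita(0, 5_000_000)

print(large_state, small_state, unreported)
```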

Technology Highlights

This is an easy demo and you may copy its code to build your first government data visualization.
Uses Dataset: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Thumbnail: 

Growth of Datasets Available on Data.gov

Description: 
This demo shows the growth in the number of datasets (not including those from the geodata catalog) available on data.gov from July 2009 through today
Contributor:

Interesting Observations

This demo queries different revisions of Dataset 92 (Data.gov Catalog, Executive Office of the President) since July 2009 and compares the number of entries found (each entry corresponds to one of the non-geo datasets published at Data.gov at the time cached). Using this demo, we can easily see that Data.gov is steadily publishing more datasets.
Uses Dataset: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Thumbnail: 

Government Revenue Timelines

Description: 
This demo shows amounts of money received by two government accounts (individual income taxes and corporate income taxes) from 1962 to the present, as well as projected values through 2014.
Contributor:

Interesting Observations

This demo uses data from Dataset 403 (Public Budget Database - Governmental receipts 1962-2015, Executive Office of the President), which shows the money received by two government accounts from 1962 to the present, as well as projected values through 2014. The two accounts ("government receipts") were selected and plotted side by side on a timeline. Several observations follow from this simple graphic:
  • Personal income tax receipts are much larger than corporate income tax receipts.
  • In fiscal year 2008, receipts from both taxes drop sharply.
  • Starting from 2010, the projection is optimistic: tax income is expected to grow in the future.

Technical Highlights

This demonstration uses Google Visualization's annotated timeline.
Uses Dataset: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Uses Technology: 
Thumbnail: 