French/Canadian
Environmental Informatics Workshop

Session II - October 16, 2015

Data Mining and Visualization on Oil and Gas Data

Part I - Theoretical Session

1. Brief introduction of data mining

2. Data preparation techniques

3. Some common data mining techniques

a. clustering
b. classification
c. association rule mining

4. Some application examples:

a. Cold Production application
b. SAGD visualization
c. Clustering on Shale Gas data

5. Tutorial on Weka

In this session we will introduce Linked Data, the Linked Data principles, and the models and main technologies of Linked Data and the Semantic Web.

1. Introduction to Linked Data

We present the evolution of the web from the web of documents to the web of data and introduce the basic principles of linked data.

2. Distributing Data on the Web with RDF

The use of URIs and RDF (Resource Description Framework) is at the heart of Linked Data. This session will present the graph-based RDF model for distributing data on the web: URIs for naming things, RDF triples (subject-predicate-object statements) to provide information about these things and create links between them, and RDF serialization formats (RDF/XML, RDFa, Turtle, N-Triples).
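
As an illustration of the triple model, here is a minimal Python sketch that holds a tiny graph as (subject, predicate, object) tuples and writes it out in N-Triples. It is not a full RDF library: the `URI` and `Literal` classes and the `to_ntriples` helper are hypothetical names made up for this example, the `example.org` URIs are placeholders, and only plain string literals are handled (no language tags or datatypes). The two predicates are taken from the FOAF vocabulary.

```python
# Sketch only: a graph is a list of (subject, predicate, object) triples.
# URI, Literal and to_ntriples are illustrative names, not a real library API.

FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"


class URI(str):
    """A node or predicate identified by a URI."""


class Literal(str):
    """A plain string value (no language tag or datatype in this sketch)."""


def term_to_ntriples(term):
    """Serialize one term: <...> for URIs, "..." for plain literals."""
    if isinstance(term, URI):
        return f"<{term}>"
    escaped = (
        term.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialize triples as N-Triples: one statement per line, ending in ' .'."""
    return "\n".join(
        f"{term_to_ntriples(s)} {term_to_ntriples(p)} {term_to_ntriples(o)} ."
        for s, p, o in triples
    )


# Hypothetical example data: the second triple links one resource to another.
graph = [
    (URI("http://example.org/alice"), URI(FOAF_NAME), Literal("Alice")),
    (URI("http://example.org/alice"), URI(FOAF_KNOWS), URI("http://example.org/bob")),
]

if __name__ == "__main__":
    print(to_ntriples(graph))
```

The second triple has a URI in object position, which is what turns isolated descriptions into linked data.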

Coffee Break

Querying Linked Data with SPARQL

SPARQL builds on top of RDF and provides:

  1. a query language for accessing RDF graphs;
  2. an XML format for representing the results of a query;
  3. a protocol to submit a query to a remote server and receive the results through HTTP.

Linked Data applications typically rely on SPARQL for consuming linked open data. In this session we will introduce the main functionalities of SPARQL: SELECT, DESCRIBE, CONSTRUCT and ASK queries, filters, named graphs, and SPARQL 1.1.
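
To show how the three pieces fit together, the following Python sketch prepares (but does not send) a SPARQL protocol request: a SELECT query with a filter is passed in the `query` parameter of an HTTP GET, and the `Accept` header asks for the XML results format. The endpoint URL is a hypothetical placeholder and `build_sparql_request` is a name invented for this example; only the Python standard library is used.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint: replace with a real SPARQL endpoint before sending.
ENDPOINT = "http://example.org/sparql"

# A SELECT query with a FILTER; STRSTARTS is a SPARQL 1.1 function.
QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person ?name
WHERE {
  ?person foaf:name ?name .
  FILTER (STRSTARTS(?name, "A"))
}
LIMIT 10
"""


def build_sparql_request(endpoint, query):
    """Prepare an HTTP GET request carrying the query in the 'query' parameter."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask the server for the SPARQL XML results format.
    return Request(url, headers={"Accept": "application/sparql-results+xml"})


request = build_sparql_request(ENDPOINT, QUERY)

if __name__ == "__main__":
    print(request.get_method())
    print(request.full_url)
```

Passing `request` to `urllib.request.urlopen` would submit the query; that step is left out here because the endpoint above does not exist.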

Semantic modeling with RDFS and OWL

Predicates in RDF triples come from vocabularies. Although Linked Data advocates reusing terms from existing, widely deployed vocabularies, Linked Data publishers may need to create new terms and use their own vocabularies. We will briefly present RDFS and OWL, the ontology languages recommended by the W3C for this purpose.

From Open Data to Linked Open Data

We will present the 5-star model proposed by Tim Berners-Lee, which leads from open data to linked open data.


Lunch


Part II - Hands-on Session

From a CSV file to linked data

Participants will learn how to convert a CSV file to linked data using the OpenRefine tool, and how to set links to external datasets. We will also outline guidelines for publishing linked data.
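
The hands-on session uses OpenRefine. For readers who want to see the idea behind the conversion beforehand, here is a rough Python sketch of the same kind of mapping, not the OpenRefine workflow itself. Everything in it is assumed for illustration: the column names (`id`, `name`, `same_as`), the base URI, the sample rows, and the `csv_to_ntriples` helper. Each row becomes a resource with an `rdfs:label`, and a non-empty `same_as` cell becomes an `owl:sameAs` link to an external dataset.

```python
import csv
import io

# Hypothetical sample data; the second row has no external link.
CSV_TEXT = """id,name,same_as
s1,Station One,http://example.org/external/station-1
s2,Station Two,
"""

BASE = "http://example.org/station/"  # assumed base URI for minted resources
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
OWL_SAMEAS = "http://www.w3.org/2002/07/owl#sameAs"


def csv_to_ntriples(text):
    """Map each CSV row to N-Triples lines: a label, plus an optional link."""
    lines = []
    for row in csv.DictReader(io.StringIO(text)):
        # Mint a URI for the row from its identifier column.
        subject = f"<{BASE}{row['id']}>"
        label = row["name"].replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'{subject} <{RDFS_LABEL}> "{label}" .')
        # Link to an external dataset only when the cell is filled in.
        if row["same_as"]:
            lines.append(f"{subject} <{OWL_SAMEAS}> <{row['same_as']}> .")
    return lines


if __name__ == "__main__":
    print("\n".join(csv_to_ntriples(CSV_TEXT)))
```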

Coffee Break

Querying linked data

Participants will learn how to query linked data using Apache Jena ARQ. We will outline guidelines for building Linked Data applications.