Bruno Pinheiro

O Pio da Coruja

Semtech 2008 - 1st day

May 19, 2008 12:13 am

As I wrote before, I’m at the Semantic Technology Conference, or SemTech2008 for short. The first day was mostly about aligning concepts and bringing everyone up to date on where we are.

Dave McComb, from Semantic Arts, opened with a presentation highlighting the basic concepts that would be discussed throughout the event. Some are pretty old, like false positives and false negatives in search results, but the angle was how semantic apps could help with these problems. The shift away from the web as we know it is inevitable: the huge amount of unstructured data is generating noise, and it’s getting hard to work with it in the relational data model. Data must become information, because information means knowledge. Some of the most recent efforts are in entity extraction, with lots of tools for finding entities in text and associating them with concepts in ontologies. In the end, this information would allow systems to make inferences and discoveries that weren’t initially declared.
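To make that last point concrete, here is a minimal sketch of what "inferring facts that weren't declared" can look like. It is a toy in plain Python, not any tool shown at the conference: the triples, the class names and the two rules are all assumptions picked for illustration.

```python
# Toy forward-chaining over (subject, predicate, object) triples.
# All names below are hypothetical examples, not a real ontology.

def infer(triples):
    """Return the input triples plus everything derivable from two rules:
    1. subClassOf is transitive
    2. an instance of a class is also an instance of its superclasses
    """
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        for (a, p1, b) in facts:
            for (c, p2, d) in facts:
                # Only chain through "b subClassOf d" statements.
                if b != c or p2 != "subClassOf":
                    continue
                if p1 == "subClassOf":
                    new.add((a, "subClassOf", d))
                elif p1 == "type":
                    new.add((a, "type", d))
        if not new <= facts:
            facts |= new
            changed = True
    return facts


declared = {
    ("Rio de Janeiro", "type", "City"),
    ("City", "subClassOf", "PopulatedPlace"),
    ("PopulatedPlace", "subClassOf", "Place"),
}

# Facts the system worked out on its own.
derived = infer(declared) - declared
for triple in sorted(derived):
    print(triple)
```

Nobody declared that Rio de Janeiro is a Place, yet the statement shows up in `derived` because it follows from the class hierarchy.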

Ivan Herman, the W3C’s semantic web specialist, gave a broad presentation of what they’re focusing on at the W3C and which discussions are burning in the community, and talked about some of the technologies they are betting on. As far as I could see, Dublin Core and FOAF are the consensus choices at the vocabulary level, as they appeared as good examples in both presentations and in every book about the semantic web. SPARQL is the query language that, together with RDF and OWL, seems to be in the spotlight now.

Ivan talked a little about an interesting project called the ‘Linking Open Data Project’, whose goal is to ‘expose open databases in RDF’, setting RDF links among data items from different databases and setting up SPARQL endpoints to query the data. One of the projects of this initiative is DBpedia: by extracting data from the “infobox” on Wikipedia pages (the right column) for a city, for example, and integrating it with the city information in the US Census database, they can build a stronger and richer knowledge of that city.
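The linking idea can be sketched in a few lines of plain Python. This is only an illustration under loud assumptions: the identifiers, property names and values are made-up placeholders (not real DBpedia or Census data), the triples are plain tuples instead of real RDF, and no SPARQL endpoint is involved.

```python
# Toy illustration of linking two datasets through a sameAs statement.
# URIs and values are made-up placeholders, not real DBpedia or Census data.

SAME_AS = "owl:sameAs"

# Facts that could have come from an infobox.
encyclopedia = [
    ("ex-wiki:Springfield", "ex:mayor", "Jane Doe"),
    ("ex-wiki:Springfield", "ex:nickname", "Example City"),
]

# Facts from a separate statistical database with its own identifiers.
census = [
    ("ex-census:place/0001", "ex:population", 12345),
    ("ex-census:place/0001", "ex:medianAge", 30),
]

# The link that says both identifiers denote the same city.
links = [
    ("ex-wiki:Springfield", SAME_AS, "ex-census:place/0001"),
]


def describe(resource, *datasets):
    """Collect every property known about `resource`, following sameAs links."""
    triples = [t for ds in datasets for t in ds]
    aliases = {resource}
    changed = True
    while changed:
        changed = False
        for s, p, o in triples:
            if p != SAME_AS:
                continue
            if s in aliases and o not in aliases:
                aliases.add(o)
                changed = True
            elif o in aliases and s not in aliases:
                aliases.add(s)
                changed = True
    return {p: o for s, p, o in triples if s in aliases and p != SAME_AS}


city = describe("ex-wiki:Springfield", encyclopedia, census, links)
print(city)
```

Without the single triple in `links`, the two datasets stay isolated; with it, asking about either identifier returns the combined description.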

At this stage there are still lots of open issues, but these were the ones Ivan talked about: security, trust and provenance; ontology merging, alignment and term equivalences; and uncertainty. The most important for me were ontology merging and uncertainty. The web as we know it was built on sharing and linking documents. Now, in the semantic wave, the same concept must be applied. There’s no need to build a completely new ontology on geonames, for example. Just link to an existing one and build one only for your own knowledge domain. I firmly agree with this vision.

But as we already know, documents published on the web are hard to control, and there’s no guarantee that they’ll be there forever. A 404 result for a document search is no big deal, but building an application on an external ontology maintained by a third party you have no relationship with is a very different matter. I personally asked Ivan about this, and he said everybody is asking the same question. It’s a big problem that the W3C itself is worried about, but unfortunately there’s no light at the end of the tunnel yet.

Who’s gonna take care of the integrity of all these dependencies?
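Just to show the shape of the problem (this is not a solution, and not something from Ivan's talk), here is a small sketch that lists which terms an application borrows from vocabularies it does not control, and which of them a newer copy of the third-party vocabulary no longer defines. The namespaces, the terms and the idea of keeping a `latest_snapshot` set are all hypothetical.

```python
# Sketch: find which terms an application borrows from vocabularies it does
# not control. Namespaces and terms are hypothetical.

LOCAL_NS = "http://example.org/myapp#"


def external_terms(triples, local_ns=LOCAL_NS):
    """Return URIs used in the triples that live outside our own namespace."""
    used = set()
    for triple in triples:
        for term in triple:
            if (
                isinstance(term, str)
                and term.startswith("http")
                and not term.startswith(local_ns)
            ):
                used.add(term)
    return used


def missing_terms(triples, snapshot, local_ns=LOCAL_NS):
    """External terms we use that the third-party snapshot no longer defines."""
    return external_terms(triples, local_ns) - set(snapshot)


app_triples = [
    (LOCAL_NS + "office1", "http://example.org/geo#locatedIn", "http://example.org/geo#Rio"),
    (LOCAL_NS + "office1", LOCAL_NS + "headcount", 12),
]

# Pretend the third party dropped geo#Rio in its latest release.
latest_snapshot = {"http://example.org/geo#locatedIn"}

broken = sorted(missing_terms(app_triples, latest_snapshot))
print(broken)
```

A check like this only tells you that something broke. It says nothing about who should fix it, which is exactly the open question.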

Ivan’s presentation for the SemTech2008 is available for download at the W3C website.

Update: Daniela Barbosa shared some links from this first day on her delicious.
