Geonames.org

● My name: Marc Wick
● Geonames.org: free global gazetteer
● Creative Commons licence
● Daily dump (8,000-10,000 downloads/month)
● Web services (peaking at over 3,000,000 requests/day)
● Semantic Web ontology
8 March 2007, geonames.org
European GeoInformatics Workshop
Data Sources
● Official sources wherever possible
● NGA, USGS, geobase.ca, LINZ, Brazil, ...
● Accuracy
● Geonames users
  – Help find data
  – Wiki-style edit interface
● gtopo30, srtm3, timezone, polygon, postalcodes, tiger, ...
● Wikipedia
Geonames Feature Density
Source Update - NGA (National Geospatial-Intelligence Agency)
● Every two or three months
● Compare updates to current geonames data
● Decide for every modification whether to apply it or not (most are applied)
● Consistency checks (elevation, feature code)
● Who changed what?
● How reliable is the source?
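The compare-and-decide step above can be sketched as a field-by-field diff between the current data and an incoming source update. This is only an illustration: the record layout, the `diff_update` and `is_consistent` helpers, the elevation bounds, and the sample values are all hypothetical and are not taken from the Geonames code base.

```python
from typing import Dict, List, Tuple

Record = Dict[str, object]
Change = Tuple[int, str, object, object]  # (id, field, old value, new value)

def diff_update(current: Dict[int, Record], update: Dict[int, Record]) -> List[Change]:
    """List every field whose value in the update differs from the current data."""
    changes: List[Change] = []
    for rec_id, new_rec in update.items():
        old_rec = current.get(rec_id, {})  # unknown ids count as entirely new
        for field, new_value in new_rec.items():
            old_value = old_rec.get(field)
            if old_value != new_value:
                changes.append((rec_id, field, old_value, new_value))
    return changes

def is_consistent(field: str, value: object) -> bool:
    """Hypothetical consistency check: reject implausible elevations (metres)."""
    if field == "elevation":
        return isinstance(value, (int, float)) and -500 <= value <= 9000
    return True

# Made-up sample data, for illustration only.
current = {1: {"name": "Embrun", "elevation": 900}}
update = {
    1: {"name": "Embrun", "elevation": 870},
    2: {"name": "Sample", "elevation": 99999},
}

changes = diff_update(current, update)
accepted = [c for c in changes if is_consistent(c[1], c[3])]
```

Keeping the old and new value in each change makes it possible to review a modification individually before it is applied.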
NGA Update August 2006
● 43244 Iran
● 18272 Lebanon
● 18247 Afghanistan
● 10724 North Korea
● 2476 Russia
● 1431 South Korea
● 783 Qatar
● 729 Kuwait
● 723 Pakistan
Wikipedia Geodata
Access to Geonames Data
● HTML
● CSV: daily dump
● XML (REST web services)
● JSON
● RDF
● RSS/GeoRSS
● KML (Google Earth)
Geonames – Semantic Web
● Maintainer: Bernard Vatant, Mondeca
● Ontology, codes
● Interlinking
● OWL Lite
● OWL Full (importing OWL Lite)
  – Name rdfs:subPropertyOf rdfs:label
Categorization
● 9 feature classes
● 650 feature codes
● SKOS (Simple Knowledge Organization System)
● gn:Class subClassOf skos:ConceptScheme
● gn:Code subClassOf skos:Concept
● Example:
  – Class A = country, state, region
  – #A.ADM1 skos:inScheme rdf:resource="#A"
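In the example above, the feature code identifier carries its feature class before the dot, so the `skos:inScheme` link can be derived from the identifier itself. A minimal sketch, assuming every code fragment has the form `#<class>.<code>`; the helper names `split_feature_code` and `in_scheme_triple` are hypothetical.

```python
def split_feature_code(fragment: str) -> tuple:
    """Split '#A.ADM1' into ('A', 'ADM1'): feature class (scheme) and feature code."""
    identifier = fragment.lstrip("#")
    feature_class, _, code = identifier.partition(".")
    if not feature_class or not code:
        raise ValueError(f"not a feature code: {fragment!r}")
    return feature_class, code

def in_scheme_triple(fragment: str) -> tuple:
    """Build the (subject, predicate, object) statement shown on the slide."""
    feature_class, _ = split_feature_code(fragment)
    return (fragment, "skos:inScheme", "#" + feature_class)

triple = in_scheme_triple("#A.ADM1")
```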
RDF representation
<Feature rdf:about="http://sws.geonames.org/3020251/">
<name xml:lang="fr">Embrun</name>
<alternateName xml:lang="fr">Embrun, Hautes-Alpes</alternateName>
<featureClass rdf:resource="http://www.geonames.org/ontology#P"/>
<featureCode rdf:resource="http://www.geonames.org/ontology#P.PPL"/>
<inCountry rdf:resource="http://www.geonames.org/countries/#FR"/>
<population>7069</population>
<wgs84_pos:alt>900</wgs84_pos:alt>
<wgs84_pos:lat>44.5667</wgs84_pos:lat>
<wgs84_pos:long>6.5000</wgs84_pos:long>
<parentFeature rdf:resource="http://sws.geonames.org/3013738/"/>
<nearbyFeatures rdf:resource="http://sws.geonames.org/3020251/nearby.rdf"/>
<locationMap>http://www.geonames.org/3020251/embrun.html</locationMap>
<wikipediaArticle rdf:resource="http://fr.wikipedia.org/wiki/Embrun_%28Hautes-Alpes%29"/>
<wikipediaArticle rdf:resource="http://pl.wikipedia.org/wiki/Embrun"/>
<wikipediaArticle rdf:resource="http://de.wikipedia.org/wiki/Embrun"/>
<wikipediaArticle rdf:resource="http://en.wikipedia.org/wiki/Embrun%2C_Hautes-Alpes"/>
<wikipediaArticle rdf:resource="http://it.wikipedia.org/wiki/Embrun"/>
<wikipediaArticle rdf:resource="http://nl.wikipedia.org/wiki/Embrun"/>
<owl:sameAs rdf:resource="http://rdf.insee.fr/geo/COM_05046"/>
</Feature>
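A description like the one above can be read with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree` on a shortened copy of the Embrun record. The slide omits the namespace declarations, so the `rdf:RDF` wrapper and the choice of the Geonames ontology as default namespace are assumptions, and `parse_feature` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

GN = "http://www.geonames.org/ontology#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
WGS84 = "http://www.w3.org/2003/01/geo/wgs84_pos#"

# Assumption: the wrapper element and namespace bindings are not on the slide.
DOC = f"""<rdf:RDF xmlns="{GN}" xmlns:rdf="{RDF}" xmlns:wgs84_pos="{WGS84}">
<Feature rdf:about="http://sws.geonames.org/3020251/">
<name xml:lang="fr">Embrun</name>
<featureCode rdf:resource="http://www.geonames.org/ontology#P.PPL"/>
<population>7069</population>
<wgs84_pos:lat>44.5667</wgs84_pos:lat>
<wgs84_pos:long>6.5000</wgs84_pos:long>
</Feature>
</rdf:RDF>"""

def parse_feature(xml_text: str) -> dict:
    """Extract a few properties of the first Feature element."""
    root = ET.fromstring(xml_text)
    feature = root.find(f"{{{GN}}}Feature")
    code_uri = feature.find(f"{{{GN}}}featureCode").get(f"{{{RDF}}}resource")
    return {
        "uri": feature.get(f"{{{RDF}}}about"),
        "name": feature.findtext(f"{{{GN}}}name"),
        "feature_code": code_uri.split("#")[-1],  # keep only the fragment
        "population": int(feature.findtext(f"{{{GN}}}population")),
        "lat": float(feature.findtext(f"{{{WGS84}}}lat")),
        "long": float(feature.findtext(f"{{{WGS84}}}long")),
    }

feature = parse_feature(DOC)
```

A dedicated RDF library would be the more robust choice for real data; plain XML parsing only works here because the serialization is regular.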
Linked Data
● ChildrenFeatures
● ParentFeature
● NeighbouringFeatures
● NearbyFeatures (same FeatureClass)
Alternate Names
<name>Edinburgh</name>
<alternateName xml:lang="ko">에든버러</alternateName>
<alternateName xml:lang="ja">エディンバラ</alternateName>
<alternateName xml:lang="th">เอดนบะระ</alternateName>
<alternateName xml:lang="cy">Caeredin</alternateName>
<alternateName xml:lang="br">Dinedin</alternateName>
<alternateName xml:lang="ga">Dún Éideann</alternateName>
<alternateName xml:lang="gd">Dùn Èideann</alternateName>
<alternateName xml:lang="oc">Edimborg</alternateName>
<alternateName xml:lang="fr">Édimbourg</alternateName>
<alternateName xml:lang="ca">Edimburg</alternateName>
<alternateName xml:lang="ast">Edimburgo</alternateName>
<alternateName xml:lang="es">Edimburgo</alternateName>
<alternateName xml:lang="gl">Edimburgo - Dùn Èideann</alternateName>
<alternateName xml:lang="de">Edinburgh</alternateName>
....
About 60 names for Edinburgh, in nearly as many languages (up to 235).
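One typical use of the language-tagged alternate names is choosing a display name for a given language, falling back to the main name when no match exists. A minimal sketch using a few of the Edinburgh names above; the bare `<Feature>` wrapper without namespaces and the `display_name` helper are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

SNIPPET = """<Feature>
<name>Edinburgh</name>
<alternateName xml:lang="cy">Caeredin</alternateName>
<alternateName xml:lang="fr">Édimbourg</alternateName>
<alternateName xml:lang="es">Edimburgo</alternateName>
</Feature>"""

def display_name(feature: ET.Element, lang: str) -> str:
    """Return the first alternate name tagged with `lang`, else the main name."""
    for alt in feature.findall("alternateName"):
        if alt.get(XML_LANG) == lang:
            return alt.text.strip()
    return feature.findtext("name")

edinburgh = ET.fromstring(SNIPPET)
french = display_name(edinburgh, "fr")
fallback = display_name(edinburgh, "xx")  # no such language tag in the snippet
```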
Concept vs Document
● Concept: http://sws.geonames.org/3020251/
● 303 (See Other) redirection
  – http://sws.geonames.org/3020251/about.rdf
  – http://www.geonames.org/3020251/embrun.html
● Accept header
  – RewriteCond %{HTTP_ACCEPT} application/rdf
  – RewriteRule ^/([0-9]*)/$ http://sws.geonames.org/$1/about.rdf [R=303,L]
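The redirection logic expressed by the rewrite rule can be restated in a few lines of Python. This sketch is not the server's actual implementation: `redirect_target` is a hypothetical function, the HTML page location is passed in because its name part cannot be derived from the id, and the pattern requires at least one digit, which is slightly stricter than the rule on the slide.

```python
import re

# Concept URIs look like /3020251/ on sws.geonames.org.
CONCEPT_PATH = re.compile(r"^/([0-9]+)/$")

def redirect_target(path: str, accept: str, html_page: str):
    """Return (status, location) for a concept URI, or None for any other path."""
    match = CONCEPT_PATH.match(path)
    if match is None:
        return None
    geoname_id = match.group(1)
    # Same substring test as the RewriteCond: any Accept value mentioning application/rdf.
    if "application/rdf" in accept:
        return 303, f"http://sws.geonames.org/{geoname_id}/about.rdf"
    # Assumption: every other client is sent to the human-readable page.
    return 303, html_page

HTML_PAGE = "http://www.geonames.org/3020251/embrun.html"
rdf_client = redirect_target("/3020251/", "application/rdf+xml", HTML_PAGE)
browser = redirect_target("/3020251/", "text/html", HTML_PAGE)
```

Either way the concept URI itself never returns a document; it only points to one with a 303 (See Other) response.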
[Architecture diagram; labels: Apache (mod_rewrite), ROME (RSS), jdom.org (XML), JSON, Tomcat (Java), JMS, JDBC, full-text index, TF-IDF, Gtopo30, Lucene, SRTM3, ActiveMQ, database: Postgres (PostGIS)]
Data exchange
[Diagram; nodes: www.PingTheSemanticWeb.org, download.geonames.org, dev.geonames.org, www.geonames.org, ws.geonames.org, sws.geonames.org, www.dbpedia.org, Semantic Web crawlers; connection labels: "jms", "Webservice call", "Rdf dump"]
Are SW Crawlers obsolete?
● 2 years to crawl 6.4 million records with a 10-second wait time
● How should SW search work?
● Directories -> crawlers -> what is next?
● Structured data, synchronization
● E.g.: MS Hailstorm, Google Base, ping blogs, sitemap ...
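The two-year figure follows directly from the numbers on the slide. A short worked calculation, assuming one record per request and counting only the wait time (transfer time is ignored); the function name `crawl_days` is made up for the example.

```python
RECORDS = 6_400_000          # records to crawl (from the slide)
WAIT_SECONDS = 10            # polite wait between two requests (from the slide)
SECONDS_PER_DAY = 24 * 60 * 60

def crawl_days(records: int, wait_seconds: float) -> float:
    """Days needed when each record costs one request plus one wait."""
    return records * wait_seconds / SECONDS_PER_DAY

days = crawl_days(RECORDS, WAIT_SECONDS)   # roughly 741 days
years = days / 365                         # roughly 2 years
```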
What is next for geonames?
● Administrative divisions
● Integrate Wikipedia landmarks in geonames
● Hotel data
● Continuously update and improve
● New gazetteer datasets
● Natural language geocoding