Experience with EC FP5, FP6, FP7 and Cultural Heritage projects

Vladimir Alexiev, PhD, PMP

Cultural Heritage Digitization meeting, Sofia, 30 Jan 2012

Ontotext

• Ontotext is a Bulgarian company with 65 staff: Sofia, Varna, Innsbruck, London, Connecticut, New Zealand

• Started in 2000 as a research lab in Sirma Group. Spun off in 2008 with investment from NEVEQ

• World leader in semantic technologies. 360-degree semtech: repository (OWLIM), text mining (KIM, GATE), web mining (WMF), Ontology and Linked Data Management

• Revenue grew 210% in the last 3 years: 5M BGL in 2011, over 7M expected in 2012

• Commercial revenue grew 10x in the last 3 years

Ontotext experience with FP and CH 30 Jan 2012 #2

Completed FP5, FP6, FP7 Projects

• OMM, BOR, OntoMap, On-To-Knowledge, SWWS, OntoWeb, VISION, DIP, SEKT, INFRAWEBS, PrestoSpace, MediaCampaign, RASCALLI, SemanticGov, SUPER, TAO, TripCom, LarKC, SOA4ALL

• Typically Ontotext is a core technology partner

• Typical project size is 2-4M EUR (STREP) or 10-15M EUR (IP); typical duration is 3 years

• Typical Ontotext share is 200-500k EUR

• Ontotext is the most successful Bulgarian participant in EU FP research projects and received the prestigious Pythagoras award

• Topics range from core semtech to web services, SOA, business processes, eGovernment, media, TV, life sciences…

Current EC FP7 Projects

Project cycle and continuity: (F)inishing, (M)iddle, (S)tarting

• NoTube : Personalized creation, distribution, consumption of TV content (F)

• Insemtives : Incentives for Semantics (F)

• Cubist : Combining and Uniting Business Intelligence and Semantic Technologies (M)

• Khresmoi : Knowledge Helper for Medical and Other Information (M)

• Molto : Multilingual Online Translation (M)

• Render : Reflecting Knowledge Diversity (M)

• TrendMiner : Trend Mining & Summarisation of Real-time Media Streams (S)

• AnnoMarket : Marketplace for Semantic Annotation Services (S)

• Euclid : Educational Curriculum for Linked Data (S)

Ontotext Statistics from One Call

• FP7 SME DCL was a call targeting SMEs

• Topic: Digital Content and Languages

• Purpose: work towards a linked data economy

• Two-phase call: short proposal (5 pages), then full proposal (28.9.2011)

Short proposals: 15 (initiated 1, active in 13, rode along in 1)

Full proposals: 6

Accepted: 2

Collaboration With Academia and Research

• Ontotext collaborates extensively with universities and research centers all over Europe on EU FP projects

• Ontotext has a long-standing collaboration with the University of Sheffield on text analysis and semantic technologies

• 2 professors work part-time at Ontotext: Kiril Simov (BAS IICT), Maurice Grinberg (NBU Cognitive Science)

• 2 PhDs working at Ontotext teach at university: Mariana Damova (NBU Semantic Technologies), Vladimir Alexiev (NBU and BAS IMI: IT Project Management)

• Ontotext hires interns and doctoral students and offers opportunities for doctoral research abroad

Commercial Projects

• UK 59%, US 18%, Global 9%, BG 7%, IT 3%, KR 2%, MX 2%, now DE

• Data providers 27% (jobs, food, cars), Publishing 26%, Government 18%, Life Sciences 11%, Cultural Heritage 10%, Telecom 4%

• Regular SemTech training courses in London

• Commercial revenue is close to 2/3 of total

EC projects are somewhat de-emphasized because they dilute our focus

We see great potential in Cultural Heritage so we want to focus on that

Cultural Heritage Experience

• OWLIM has a following in CH: Molto FP7 and Gothenburg Museum, Charisma FP7, 3D COFORM FP7, Dutch Public Library POC, Polish Digital National Museum, LODAC (JP)

• KR-BG ITCC: semantic publishing to Europeana

• British Museum: ResearchSpace project, funded by the Andrew Mellon Foundation. Collaborative web-based research for the cultural heritage scholarly community. Based on the CIDOC CRM ontology

• The National Archives: Semantic Knowledge Base for the Government Web Archive. 780M documents (150M after deduplication), annotated with over 10B facts

Semantic Technologies and Applicability to CH

• Many CH institutions have a data integration problem: data about the same artifacts is scattered across separate silos (cataloging, acquisition, conservation, research/scientific…)

• If the Web (1.0) is a giant hyper-linked document, then the Semantic Web (3.0) is a giant linked database

• Semantic Technologies are the best way to interconnect:

Ontologies and schemas ensure metadata interoperability (ESE, EDM, LIDO, CIDOC CRM, EAD, MODS…)

Linked Open Data provides additional context (DBpedia, GeoNames, Freebase, WordNet, …)

Thesauri ensure consistent vocabulary (Getty ULAN, AAT, TGN; IconClass, RKD People, Concepts, etc)
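
The silo problem above can be illustrated with a minimal sketch in plain Python. It assumes each silo can export its records as (subject, property, value) triples that share one identifier per artifact. All URIs, property names, and values below are invented for illustration; they do not come from any real collection, ontology, or Ontotext product.

```python
from collections import defaultdict

# Hypothetical identifier shared by every silo for the same artifact.
ARTIFACT = "http://example.org/object/123"

# Hypothetical exports from three separate silos (invented data).
cataloging = [
    (ARTIFACT, "title", "Bronze helmet"),
    (ARTIFACT, "creator", "http://example.org/person/42"),
]
conservation = [
    (ARTIFACT, "condition", "corroded, stabilised"),
]
acquisition = [
    (ARTIFACT, "acquiredIn", "1978"),
]


def merge_silos(*silos):
    """Merge triple lists into one graph: subject -> property -> set of values."""
    graph = defaultdict(lambda: defaultdict(set))
    for silo in silos:
        for subject, prop, value in silo:
            graph[subject][prop].add(value)
    return graph


def describe(graph, subject):
    """Return every known fact about a subject as sorted (property, value) pairs."""
    return sorted(
        (prop, value)
        for prop, values in graph[subject].items()
        for value in values
    )


graph = merge_silos(cataloging, conservation, acquisition)
facts = describe(graph, ARTIFACT)
for prop, value in facts:
    print(prop, "=", value)
```

Because all three silos use the same identifier, a single lookup returns facts from cataloging, conservation, and acquisition together. A real deployment would use an RDF repository and shared ontologies such as those listed above rather than in-memory dictionaries.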
