Vladimir Alexiev, PhD, PMP
Cultural Heritage Digitization meeting, Sofia, 30 Jan 2012
• Ontotext is a Bulgarian company with 65 staff: Sofia, Varna,
Innsbruck, London, Connecticut, New Zealand
• Started in 2000 as a research lab in Sirma Group. Spun off in
2008 with investment from NEVEQ
• World leader in semantic technologies. 360-degree semtech: repository (OWLIM), text mining (KIM, GATE), web mining
(WMF), Ontology and Linked Data Management
• Revenue grew 210% in the last 3 years: 5M BGL in 2011, over
7M expected in 2012
• Commercial revenue grew 10x in the last 3 years
Ontotext experience with FP and CH 30 Jan 2012 #2
• OMM, BOR, OntoMap, On-To-Knowledge, SWWS, OntoWeb,
VISION, DIP, SEKT, INFRAWEBS, PrestoSpace, MediaCampaign,
RASCALLI, SemanticGov, SUPER, TAO, TripCom, LarKC,
SOA4ALL
• Typically Ontotext is a core technology partner
• Typical size is 2-4M EUR (STREP); 10-15M EUR (IP). 3 years
• Typical Ontotext share is 200-500k EUR
• Ontotext is the most successful Bulgarian participant in EU FP research projects and received the prestigious Pythagoras award
• Topics range from core semtech to web services, SOA, business processes, eGovernment, media, TV, life sciences…
Project cycle and continuity: (F)inishing, (M)iddle, (S)tarting
• NoTube : Personalized creation, distribution, consumption of TV content (F)
• Insemtives : Incentives for Semantics (F)
• Cubist : Combining and Uniting Business Intelligence and Semantic
Technologies (M)
• Khresmoi : Knowledge Helper for Medical and Other Information (M)
• Molto : Multilingual Online Translation (M)
• Render : Reflecting Knowledge Diversity (M)
• TrendMiner : Trend Mining & Summarisation of Real-time Media Streams (S)
• AnnoMarket: Marketplace for Semantic Annotation Services (S)
• Euclid: Educational Curriculum for Linked Data (S)
• FP7 SME DCL was a call targeting SMEs
• Topic: Digital Content and Languages
• Purpose: work towards a linked data economy
• 2-phase call: short proposal (5p), full proposal (28.9.2011)
• Short proposals: 15 (initiated 1, active in 13, rode along in 1)
• Full proposals: 6
• Accepted: 2
• Ontotext collaborates extensively with universities and research centers all over Europe on EU FP projects
• Ontotext has a long-standing collaboration with the University of Sheffield on text analysis and semantic technologies
• 2 professors work part-time at Ontotext: Kiril Simov (BAS IICT),
Maurice Grinberg (NBU Cognitive Science)
• 2 PhDs working at Ontotext teach at university: Mariana
Damova (NBU Semantic Technologies), Vladimir Alexiev (NBU and BAS IMI: IT Project Management)
• Ontotext hires interns and doctoral students and offers opportunities for doctoral research abroad
• UK 59%, US 18%, Global 9%, BG 7%, IT 3%, KR 2%, MX 2%, now DE
• Data providers 27% (jobs, food, cars), Publishing 26%, Government
18%, Life Sciences 11%, Cultural Heritage 10%, Telecom 4%
• Regular SemTech training courses in London
• Commercial revenue is close to 2/3 of total
– EC projects get somewhat lower priority because they dilute our focus
– We see great potential in Cultural Heritage, so we want to focus on that
• OWLIM has a following in CH: Molto FP7 and Gothenburg
Museum, Charisma FP7, 3D COFORM FP7, Dutch Public Library
POC, Polish Digital National Museum, LODAC (JP)
• KR-BG ITCC: semantic publishing to Europeana
• British Museum: ResearchSpace project, funded by the
Andrew Mellon Foundation. Collaborative web-based research for the cultural heritage scholarly community.
Based on the CIDOC CRM ontology
• The National Archives: Semantic Knowledge Base for the
Government Web Archive. 780M documents (150M after deduplication), annotated with over 10B facts
• Many cultural heritage institutions have a data integration problem: data about the same artifacts is scattered across separate silos: cataloging, acquisition, conservation, research/scientific…
• If the Web (1.0) is a giant hyper-linked document, then the Semantic Web (3.0) is a giant linked database
• Semantic Technologies are the best way to interconnect:
– Ontologies and schemas ensure metadata interoperability (ESE, EDM, LIDO, CIDOC CRM, EAD, MODS…)
– Linked Open Data provides additional context (DBpedia, GeoNames, Freebase, WordNet, …)
– Thesauri ensure consistent vocabulary (Getty ULAN, AAT, TGN; IconClass, RKD People, Concepts, etc.)
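The silo problem above can be illustrated with a minimal sketch, assuming each silo exports its facts as (subject, predicate, object) triples that share one identifier per artifact. The URIs, property names, and sample values below are hypothetical, and plain Python sets stand in for a real semantic repository such as OWLIM.

```python
from collections import defaultdict

# Hypothetical artifact identifier shared by all silos (illustrative URI).
OBJ = "http://example.org/object/123"

# Each silo holds facts as (subject, predicate, object) triples.
# Property names and values are made up for illustration.
cataloging = {
    (OBJ, "title", "Bronze helmet"),
    (OBJ, "creator", "http://example.org/person/unknown"),
}
conservation = {
    (OBJ, "condition", "stable"),
    (OBJ, "title", "Bronze helmet"),  # same fact also held by cataloging
}
acquisition = {
    (OBJ, "acquired", "1975"),
}


def merge(*silos):
    """Union the triples of all silos; identical facts collapse into one."""
    merged = set()
    for silo in silos:
        merged |= silo
    return merged


def describe(graph, subject):
    """Collect every property value known about one subject."""
    description = defaultdict(set)
    for s, p, o in graph:
        if s == subject:
            description[p].add(o)
    return dict(description)


graph = merge(cataloging, conservation, acquisition)
record = describe(graph, OBJ)
```

Because every silo refers to the artifact by the same identifier, merging is a plain set union, and a single lookup returns the cataloging, conservation, and acquisition facts together. In practice the shared identifiers and property names would come from the ontologies and thesauri listed above.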