“Achieving Information Resources Empowerment: A Digital Library and Knowledge Management Perspective”

Hsinchun Chen, Ph.D.
McClelland Professor of MIS
University of Arizona
University of Arizona, USA; Dr. Hsinchun Chen (陳炘鈞)
• PI, NSF DLI-1, DLI-2, NSDL; Director, Artificial Intelligence Lab
• PI, NSF Digital Government Program, ITR
• Director, Hoffman E-Commerce Lab; PI, SAP, HP research programs;
Founder, Knowledge Computing Corp.
The University of Arizona, Tucson, Arizona
Digital Library: Overview
Introduction
• The Internet is changing the way we live and
do business.
• Opportunities for libraries, governments, and businesses:
to better deliver their contents and services and interact with
their many constituents – citizens, patrons, businesses, and
other government partners.
• Exciting and innovative transformation could occur with the
new technologies and practices: in addition to providing
information, communication, and transaction services.
• Review and comparison: but with more focus on digital library
+ some examples/case studies
Digital Library: Characteristics
• No need to leave the home or office:
information now readily available on-line
via digital gateways furnished by a wide
variety of information providers.
• Information is multimedia:
electronically available in a wide variety
of formats, many of which are large,
complex (e.g., video and audio),
and often integrated.
Digital Library: Characteristics (continued)
• Interface to the Web has evolved from browsing to searching:
but the commercial technology has remained largely unchanged
from its roots in the 1960s. New research presents new
opportunities.
• Social impact matters as much as technological advancement:
DL projects need to examine the broad social, economic, legal,
ethical, and cross-cultural contexts and impacts.
DLI-1, DLI-2, NSDL, JCDL, ECDL, and ICADL:
Towards Building A Global Digital Library
• NSF Digital Library Initiative Phase 1 (DLI-1), 1994-1998
• NSF CISE/IIS Special Program, $24M, NSF,
DARPA, NASA funding; Six projects: Stanford,
Berkeley, UCSB, Michigan, CMU, UIUC.
• Technology focus, new and rich library content; Bi-annual site
visits and project meetings. Special activities: IEEE Computer,
CACM, JASIS special issues, and many books and book
chapters.
DLI-1, DLI-2, NSDL, JCDL, ECDL, and ICADL:
Towards Building A Global Digital Library
(continued)
• NSF Digital Library Initiative
Phase 2 (DLI-2), 1998-2003
• NSF CISE/IIS Special Program, $60M, 1998-;
NSF, DARPA, NLM, LoC, NASA, NEH; 20+ projects: Stanford,
Berkeley, UCSB, CMU, Arizona, and many others.
• Strong focus on integration of technologies, contents, and
service. Annual NSF all-PI meeting with JCDL.
DLI-1, DLI-2, NSDL, JCDL, ECDL, and ICADL:
Towards Building A Global Digital Library
(continued)
• National Science Digital Library (NSDL), 2000-
• NSF CISE/IIS Special Program, $45M, 60+ projects:
Strong education focus in many different application domains.
• Annual NSF all-PI meeting in DC. Core Integration effort:
Cornell (Open Archives Initiative), UCAR, U. Mass., etc.
• Joint Conference on Digital Libraries (JCDL), 1996-
• ACM DL Conferences and IEEE DL Conferences, 1996-2000.
• JCDL 2001, Virginia, E. Fox; JCDL 2002, Oregon, G.
Marchionini; NSF DLI-2 all-PI meeting held after JCDL.
DLI-1, DLI-2, NSDL, JCDL, ECDL, and ICADL:
Towards Building A Global Digital Library
(continued)
• European Conference on Digital Libraries (ECDL), 1997-
• Many working group meetings held in different
DL sub-areas.
• International Conference of Asian Digital Libraries
(ICADL), 1998-
• ICADL 1998, Hong Kong, J. Yen; ICADL 1999, Taipei,
Taiwan, Hsueh-hua Chen; ICADL 2000, Seoul, Korea, Key-Sun Choi;
ICADL 2001, Bangalore, India, Shalini R. Urs (600 people); ICADL
2002, Singapore, S. Foo and E. Lim (450 people); ICADL 2003, KL,
Malaysia, Baba
• Local content, cultural heritage, education and deployment,
multilingual retrieval, other new technologies; Many other national
programs: China, India, Russia, Japan, etc.
Digital Library: Challenges
• Cultural and historical heritage:
Many digital library and museum
collections contain artifacts that
are fragile, precious, and of historical significance.
• Heterogeneity of content and media types:
Digital library collections have the widest
range of content and media types,
ranging from 3D chemical structures
to tornado simulation models, from the
statue of David to paintings of Van Gogh.
Digital Library: Challenges (continued)
• Intellectual property issues:
Unlike digital government or e-commerce applications
that often derive their own content, digital libraries
provide content management and retrieval services
to many different information creators.
• Cost and sustainability issues:
Many patrons often would like library services
to be “free” or at least extremely affordable.
• Universal access and international collaboration:
Digital library content is often of interest to not just
people in one region, but possibly all over the world.
Digital Government and E-Commerce:
Overview
Digital Government: Characteristics
• Multi-faceted roles of Federal Government:
Government as a major user of information
technologies, a collector and maintainer of
very large data sets, and a provider of critical
and often unique information services to
individuals, states, businesses, and other customers.
• Potential for nearly ubiquitous access:
to government information services by citizen/customers
• Re-inventing the government:
Enhancements derived from new information technology-based
services can be expected to contribute to reinvented and
economical government services, and more productive
government employees.
Digital Government: US Government Goes
Electronic
• 1986 Brooks Act amended:
reducing government costs through
volume buying, including IT purchases.
• 1996 Information Technology Management Reform Act:
Establishing the CIO position to manage IT resources.
• 1998 WebGov portal: announced in August 1998, failed and
replaced by FirstGov portal after technology donation from
Inktomi.
• 2000 Federal Rehabilitation Act:
Requiring all IT products be accessible to the disabled.
Digital Government: US Government
Goes Electronic (continued)
• 2000 FirstGov portal unveiled in June 2000.
• 2001 National Security Telecommunications and Information
Systems Security Policy No. 11:
Mandating off-the-shelf software used in defense be evaluated
by an approved third party (NSA).
• 2001 Health Insurance Portability and Accountability Act
(HIPAA):
Requiring that health care information comply with privacy
regulations.
• 2002 E-Government Act:
Funding additional e-government initiatives and creating Office
of Electronic Government.
Digital Government: Research Programs
• NSF DG Program, 1998- : areas such as: law enforcement
information sharing, citizen access to government statistical
data, and comprehensive emergency management; Digital
Government Research Center (DGRC) and annual NSF-sponsored
Digital Government Conference (DG.O)
• EU: areas such as: online public service for information content;
politics, e-democracy, e-voting; transactions, security, and digital
signatures for e-government.
• Other regions: Many ongoing e-government (G2C) initiatives have also
emerged in Asia and Pan-Pacific countries such as: China, Singapore,
Japan, Korea, India, New Zealand, Australia, etc. E-government
projects in Latin American countries have also been reported.
Digital Government: Challenges
• E-commerce is not at the heart of e-government:
The core task of government
is governance, the job of regulating
society, not marketing and sales.
• Organizational and cultural inertia:
Most government entities are not known for their
efficiency or willingness to adopt changes.
• Government laws and legal regulations:
Although well-intended, such laws and regulations
often inhibit innovation or thinking “out-of-the-box.”
Digital Government: Challenges (continued)
• Security and privacy issues:
Government-provided services have an extra burden of
guaranteeing security and privacy for citizens.
• Disparate and out-dated information infrastructure and systems:
Many government departments at all levels often face budget
shortfalls for years.
• Lack of IT funding and personnel:
Some government units (local, state, and federal) are
affluent, but most are not. IT spending often is not a priority.
E-Commerce: Characteristics
• Business/commercial initiatives: From Fortune 500 companies
to Internet start-ups, from self-funded dotcoms to ventures
funded by influential VCs (unlike digital library or digital
government research).
• Quick evolution and extensive coverage in many magazines and
newspapers: Business Process Re-Engineering (BPR),
Total Quality Management (TQM), Enterprise Resource
Planning (ERP), Supply-Chain Management (SCM),
Knowledge Management (KM), Customer
Relationship Management (CRM), etc.
E-Commerce: Challenges
• Internet time or library/government time: In a
competitive business environment, “Internet time”
often demands that a business act on instinct and
take risks.
• Build it, but will they come: With the intense business pressure
to perform and significant injection of funding (at least before the
Internet bubble burst), many companies invest significantly in
major Internet-based e-commerce infrastructure and product
initiatives.
• True innovations or marketing hype: With fast-moving and
sometimes impulsive business behavior, marketing hype is
often disguised as true innovation.
The Information-Communication-Transaction-Transformation (ICTT)
Continuum: The Path to Innovation
• Information: content (e-library)
• Communication: interaction (e-government)
• Transaction: process and rule (e-commerce)
• Transformation: innovation (all)
ICTT: Information
• Definition: Library, government or business
“information” is created, categorized, and
indexed and delivered to its target audiences
through the Internet.
• Core competency of digital library research and services:
metadata generation, data creation and management, content
management, interoperability, system interfaces, etc.
• Many early G2C (government-to-citizen), G2B (government-to-business),
and B2C services deliver information only:
governments and business portals act as information (about
regulations and products) providers.
ICTT: Communication
• Definition:
E-services support two-way “communication,” whereby
customers or citizens can communicate their needs or
requests through web forms, email, or other Internet media.
• Core function for e-government:
by providing effective communication channels to citizens.
• Many early B2C, G2C and G2B applications quickly evolved into
such communication services:
by adding simple web-based groupware functionalities such as
web forms, email, bulletin boards, chat rooms, etc.
• Computer-Supported Collaborative Systems (or groupware) and
recommender systems:
can significantly improve communication services for all digital
library, digital government, and e-commerce applications.
ICTT: Transaction
• Definition:
Citizens and businesses are supported in conducting
“transactions.”
• Transaction is the essence of e-commerce:
“You are not successful unless they buy.” Many businesses
support transactions among their suppliers (B2B) or customers
(B2C) through ERP, SCM, and CRM systems.
• Digital government could support “citizen transactions:”
such as income tax filing & returns, municipal service requests
and tracking, business license applications and payments, etc.
• Significant adaptation needed for e-government and digital
library:
to be cost-effective for non-commercial applications. (Most
governments and libraries cannot afford SAP R/3!)
ICTT Continuum: Transformation
• Definition:
There is an opportunity for “transformation”
for libraries, government agencies, and businesses
through new technologies.
• Digital libraries:
Traditional libraries need to re-examine their content
management and service delivery assumptions and practices.
• E-Commerce:
Business consulting professionals are creating new
methodology and best practices to take advantage of the new
business opportunities.
• E-government:
New information technologies and innovative processes could
significantly enhance many facets of government, e.g.,
e-politics and e-voting, law enforcement and litigation support, etc.
Knowledge Management:
Overview
Unit of Analysis
• Data: 1980s
– Factual
– Structured, numeric
– Examples: Oracle, Sybase, DB2
• Information: 1990s
– Factual
– Unstructured, textual
– Examples: Yahoo!, Excalibur, Verity, Documentum
• Knowledge: 2000s
– Inferential, sensemaking, decision making
– Multimedia
– Examples: ???
Data, Information and Knowledge:
• According to Alter (1996), Tobin (1996),
and Beckman (1999):
– Data: Facts, images, or sounds
(+interpretation+meaning =)
– Information: Formatted, filtered, and
summarized data (+action+application =)
– Knowledge: Instincts, ideas, rules, and
procedures that guide actions and
decisions
Application and Societal Relevance :
• Ontologies, hierarchies, and subject headings
• Knowledge management systems and
practices: knowledge maps
• Digital libraries, search engines, web mining,
text mining, data mining, CRM, eCommerce
• Semantic web, multilingual web, multimedia
web, and wireless web
The Third Wave of Net Evolution
(Timeline axis: 1965, 1975, 1985, 1995, 2000, 2010)
            ARPANET         Internet                  “Semantic Web”
Function    Server Access   Info Access               Knowledge Access
Unit        Server          File/Homepage             Concepts
Example     Email           WWW: “World Wide Wait”    Concept Protocols
Company     IBM             Microsoft/Netscape        ???
Knowledge Management
Definition
“The system and managerial approach to
collecting, processing, and organizing
enterprise-specific knowledge assets for
business functions and decision making.”
Knowledge Management Challenges
• “… making high-value corporate
information and knowledge easily
available to support decision making at
the lowest, broadest possible levels …”
– Personnel Turn-over
– Organizational Resistance
– Manual Top-down Knowledge Creation
– Information Overload
Knowledge Management Landscape
• Research Community
– NSF / DARPA / NASA, Digital Library Initiative I &
II, NSDL ($120M)
– NSF, Digital Government Initiative ($60M)
– NSF, Knowledge Networking Initiative ($50M)
– NSF, Information Technology Research ($300M)
• Business Community
– Intellectual Capital, Corporate Memory,
– Knowledge Chain, Competitive Intelligence
Knowledge Management
Foundations
• Enabling Technologies:
– Information Retrieval (Excalibur, Verity, Oracle Context)
– Electronic Document Management (Documentum, PC
DOCS)
– Internet/Intranet (Yahoo!, Excite)
– Groupware (Lotus Notes, MS Exchange, Ventana)
• Consulting and System Integration:
– Best practices, human resources, organizational
development, performance metrics, methodology,
framework, ontology (Delphi, E&Y, Arthur Andersen, AMS,
KPMG)
Knowledge Management Perspectives:
• Process perspective (management and behavior):
consulting practices, methodology, best practices,
e-learning, culture/reward, existing IT → new
information, old IT, new but manual process
• Information perspective (information and library
sciences): content management, manual
ontologies → new information, manual process
• Knowledge Computing perspective (text mining,
artificial intelligence): automated knowledge
extraction, thesauri, knowledge maps → new IT,
new knowledge, automated process
KM Perspectives
[Figure: KM perspectives diagram. Recoverable labels: Cultural; Human
Resources; Best Practices; Learning/Education; Consulting Methodology;
Tech Foundation; Databases; ePortals; Email; Notes; Infrastructure;
Content/Info; Content Mgmt; Structure; Ontology; KMS; Analysis; User
Modeling; Search Engine; Text Mining; Data Mining]
• Dataware Technologies
(1) Identify the Business Problem
(2) Prepare for Change
(3) Create a KM Team
(4) Perform the Knowledge Audit and Analysis
(5) Define the Key Features of the Solution
(6) Implement the Building Blocks for KM
(7) Link Knowledge to People
• Andersen Consulting
(1) Acquire
(2) Create
(3) Synthesize
(4) Share
(5) Use to Achieve Organizational Goals
(6) Environment Conducive to Knowledge Sharing
• Ernst & Young
(1) Knowledge Generation
(2) Knowledge Representation
(3) Knowledge Codification
(4) Knowledge Application
KM Architecture (Source: GartnerGroup)
[Figure: layered KM architecture. Recoverable labels: Web UI; Web
Browser; Knowledge Maps; Knowledge Retrieval; Conceptual; Enterprise
Knowledge Architecture; Physical; KR Functions; Text and Database
Drivers; Application Index; Text Indexes; Database Indexes;
Applications; “Workgroup” Applications; Databases; Intranet and
Extranet; Distributed Object Models; Network Services; Platform
Services]
Knowledge Retrieval Level
(Source: GartnerGroup)
[Figure: KR Functions producing Retrieved Knowledge. Grouping below is
inferred from the order of the extracted labels.]
• Concept (“Yellow Pages”): clustering and categorization (“table of
contents”); semantic networks (“index”)
• Semantic: dictionaries; thesauri; linguistic analysis; data extraction
• Collaboration: collaborative filters; communities
• Value (“Recommendation”): trusted advisor; expert identification
Knowledge Retrieval Vendor Direction
(Source: GartnerGroup)
[Figure: vendor positioning chart. Market target: Knowledge Retrieval.
Axes: Technology Innovation and Content Experience. * = not yet
marketed.]
• Newbies: grapeVINE, Sovereign Hill, CompassWare, Intraspect,
KnowledgeX, WiseWire, Lycos, Autonomy, Perspecta
• IR Leaders: Verity, Fulcrum, Excalibur, Dataware
• Niche Players: IDI, Oracle, Open Text, Folio, IBM, InText, PCDOCS,
Documentum
• Also plotted: Microsoft, Netscape*, Lotus
KM Software Vendors
[Figure: quadrant chart. Vertical axis: Ability to Execute. Horizontal
axis: Completeness of Vision. Quadrants: Challengers, Leaders, Niche
Players, Visionaries. Vendors plotted: Lotus, Microsoft, Netscape,
Documentum, IBM, PCDOCS/Fulcrum, IDI, Inference, Lycos/InMagic,
CompassWare, KnowledgeX, SovereignHill, Semio, Dataware, Autonomy,
Verity, Excalibur, OpenText, GrapeVINE, InXight, WiseWire, Intraspect.
Quadrant placement is not recoverable from the extracted text.]
From Federal Research to
Commercial Start-ups
• U. Mass: Sovereign Hill
• MIT Media Lab: Perspecta
• Xerox PARC: InXight
• Battelle: ThemeMedia
• U. Waterloo: OpenText
• Cambridge U.: Autonomy
• U. Arizona: Knowledge Computing Corporation (KCC)
Two Approaches to Codifying Knowledge
• Top-Down Approach
– Structured
– Manual
– Human-driven
• Bottom-Up Approach
– Unstructured
– System-aided
– Data/Info-driven
Information Resources Empowerment:
DG and KM as Catalyst
Examples and Case Studies
Medical Portal and Informatics:
• Goal:
– A “knowledge” portal for medical researchers in US
and the world.
• Content/Information:
– Comprehensive, high quality medical-related content:
NLM databases, evidence-based medical databases
• Key Features:
– Comprehensive medical resources and ontologies
– Automatic medical thesaurus (48.5M terms) and
medical knowledge map (MED Map and Cancer Map)
– Scalable for multilingual support: English, Chinese,
Spanish, Arabic
• Funding:
– NSF DLI2 Program + NIH NLM Medical Informatics
Program (S. Griffin + A. McCray)
Consulting HelpfulMED Cancer Space (Thesaurus)
Enter search term
Select relevant search terms
New terms are posted
Search again...
Or find relevant content
Browsing HelpfulMED Cancer Map
[Screenshot: Visual Site Browser, five numbered drill-down steps from
the top-level map through Diagnosis, Differential and Brain Neoplasms
to Brain Tumors]
Browsing Taiwan Health Map
Chinese Medical Intelligence Portal
• Chinese folder display
• Chinese visualization with SOM
• Simplified Chinese summary
• Traditional Chinese summary
• Simplified/Traditional Chinese summarization
• Results are from both Simplified and Traditional Chinese
• Select websites from mainland China, Hong Kong and Taiwan
• Select search engines from mainland China, Hong Kong and Taiwan
• Original encoding of the result
• Traditional Chinese results have been converted into Simplified
Chinese
Spanish Business Intelligence Portal
Search Page
• Keyword: comercio electronico
• Keyword suggestion from Scirus and Concept Space
• Detailed directory of Spanish business resources on the Web
• Search, Organize, or Visualize results
• Meta searches 7 major sources and provides searching of its own
collection (PIN)
• Supports boolean searching and allows the display of 10, 20, 30, 50,
or 100 results per meta searcher
Result Page
• Results organized by meta searchers
• Automatic keyword suggestion
Summarizer
• Summarize in 3 or 5 sentences
• A three-sentence summary on left; original page shown on right
Categorizer
• Web pages grouped by key phrases extracted by mutual information
algorithm (non-exclusive categorization)
Visualizer
• Web pages visualized by self-organizing map (SOM) algorithm
Spanish Business Taxonomy (Search Page)
• Web sites about the topic “Electronic Commerce” in Spanish
speaking countries
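The Categorizer above relies on key phrases extracted by a mutual information algorithm. The slides do not give the formula, so the following is only a minimal sketch, assuming the standard pointwise mutual information (PMI) between adjacent words. The function name `pmi_bigrams`, the `min_count` threshold, and the toy token list are illustrative assumptions, not part of the portal.

```python
import math
from collections import Counter


def pmi_bigrams(tokens, min_count=2):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    PMI(x, y) = log2( P(x, y) / (P(x) * P(y)) ), estimated from raw counts.
    Pairs seen fewer than min_count times are skipped (assumed threshold).
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (x, y), count in bigrams.items():
        if count < min_count:
            continue
        p_xy = count / n_bi
        p_x = unigrams[x] / n_uni
        p_y = unigrams[y] / n_uni
        scores[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Toy corpus (hypothetical), already tokenized and lowercased.
tokens = ("comercio electronico en mexico . el comercio electronico crece . "
          "el mercado crece en mexico").split()
scores = pmi_bigrams(tokens)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(" ".join(pair), round(score, 2))
```

Pairs that co-occur more often than their individual frequencies predict score above zero and become candidate key phrases; a page can carry several such phrases, which is consistent with the non-exclusive categorization mentioned above.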
Arabic Medical Intelligence Portal
• Search Page: provides a virtual Arabic keyboard to facilitate input
• Result Page
• Categorizer
• Visualizer
NanoPort:
• Goal:
– A “knowledge” portal for nano researchers in US and
the world.
• Content:
– Comprehensive, high quality nano-related web content:
4 nano-related search engines, 5 online databases, and 3
online journals
• Key Features:
– Comprehensive nano resources
– Post-retrieval analysis: AZ SUM, AZ NP, AZ SOM
– AZ Web Weaver (WW) toolkit: “weaving” your own web
– Alerting and communication among researchers
• Funding:
– NSF Nano Science and Engineering Program (M.
Roco)
• Input keywords
• Select search engines
• Select online databases
• Select online journals
• Folder display
• Visualization with SOM
• Summarize result dynamically
• Summary shown alongside the original page
• Highlight the summary in the original page with corresponding color
• Click on the summary sentence and jump to its position in the
original page
Communication Garden:
• Goal:
– Visualizing communication patterns and identifying
experts in email/newsgroups.
• Content:
– Any email/newsgroups contents, in any languages
• Key Features:
– Linguistic analysis: AZ NP, MI
– Topic clustering: AZ SOM
– Glyph-based visualization: garden metaphor
• Funding:
– NSF Information and Data Management Program
Thread
Disadvantages:
•No sub-topic identification
•Difficult to identify experts
•Difficult to learn participants’ attitude toward the community
[Figure: Thread Representation (labels: Time, Message, Length of Time,
Person) and People Representation (labels: Time, Message, Length of
Time, Thread)]
Proposed Interface (Interaction Summary)
Visual Effects:
• Healthy subgarden with many blooming high flowers = popular active
sub-topic
• A long, blooming flower is a healthy thread
Proposed Interface (Expert Indicator)
Visual Effects:
• Healthy subgarden with many blooming high flowers = popular
sub-topic
• A long, blooming people flower is a recognized expert.
GeneScene: Transforming
Biomedical Research
• Correctly extract gene pathway information
from millions of abstracts
• Expedite comprehension of the literature
• Position results relative to others in the
blink of an eye
• New hypotheses discovery
– Magnesium and migraines (Hearst,99)
Genescene
Overview
Knowledge Base
Integrate gene relations from
literature and outside databases
and provide knowledge for
learning and evaluation in data
mining
Text Mining
Process Medline abstracts
and extract gene relations
automatically from the text
Data Mining
Process gene expression data
(and existing knowledge) and
use different algorithms to
extract regulatory networks
Interface & Visualization
Allow searching for keywords, display a map of the
relations extracted from the text and/or from the
microarray
[Figure: GeneScene architecture. Recoverable labels:]
• Inputs: Publications (Medline, XML Parser, Titles & Abstracts,
Publications & Meta Information, JIF); Ontologies and External
Databases (HUGO, GO, UMLS); Micro Array Data
• Text Mining: Lexical lookup (UMLS), AZ Noun Phraser, POS Tagging,
Adjuster & Tagger, Full Parser, FSA Relation Grammar, Relation Parsers,
Feature Structures, Concept Space (co-occurrence relations), relations
in flat files
• Data Mining: Bayesian Networks, Association Rule Mining
• Stores: Knowledge Base, GeneScene Text Mart, GeneScene Data Mart
• GeneScene: Information Retrieval, Visualization (Spring Algorithm)
Problem (PBG)
• Title: Key roles for E2F1 in signaling p53-dependent apoptosis and in
cell division within developing tumors.
• Abstract: Apoptosis induced by the p53 tumor
suppressor can attenuate cancer growth in
preclinical animal models. Inactivation of the
pRb proteins in mouse brain epithelium by the
T121 oncogene induces aberrant proliferation
and p53-dependent apoptosis. p53 inactivation
causes aggressive tumor growth due to an
85% reduction in apoptosis. Here, we show
that E2F1 signals p53-dependent apoptosis
since E2F1 deficiency causes an 80% apoptosis
reduction. E2F1 acts upstream of p53 since
transcriptional activation of p53 target genes is
also impaired. Yet, E2F1 deficiency does not
accelerate tumor growth. Unlike normal cells,
tumor cell proliferation is impaired without
E2F1, counterbalancing the effect of apoptosis
reduction. These studies may explain the
apparent paradox that E2F1 can act as both an
oncogene and a tumor suppressor in
experimental systems
[Figure: an expert's reading protocol, shown as Action, Protocol, and
Graphic Representation columns; graph nodes: E2F1, p53, apoptosis,
tumor growth]
• Reads: "E2F1 signals p53-dependent apoptosis"
• Infers: "So, I'm assuming... a straight line pathway..."
• Errs and corrects
• Reads: "E2F1 acts upstream of p53"
• Reads: "E2F1 deficiency does not accelerate tumor growth"
• Final graph
Example: Combination
Inactivation of the pRb proteins in mouse brain epithelium
by the T121 oncogene induces aberrant proliferation and
p53-dependent apoptosis
Template      Agent           Action       Theme
Of-template   null            inactivate   pRb proteins
By-template   T121 oncogene   null         null
Combo         T121 oncogene   inactivate   pRb proteins
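The Of-template, By-template, and Combo values in this example can be read as a slot-by-slot merge. The slides do not describe the merge rule, so this is a minimal sketch under stated assumptions: templates are dictionaries with `agent`, `action`, and `theme` slots, `None` stands for null, and a filled slot wins over a null one. The name `combine` and the conflict handling are hypothetical.

```python
from typing import Dict, Optional

Template = Dict[str, Optional[str]]
SLOTS = ("agent", "action", "theme")


def combine(of_template: Template, by_template: Template) -> Template:
    """Merge two partial templates slot by slot.

    Assumed rule: a filled slot wins over a null (None) slot. If both
    templates fill the same slot with different values, raise an error,
    since the slides do not say how such conflicts are resolved.
    """
    combo: Template = {}
    for slot in SLOTS:
        a, b = of_template.get(slot), by_template.get(slot)
        if a is not None and b is not None and a != b:
            raise ValueError(f"conflicting values for {slot!r}: {a!r} vs {b!r}")
        combo[slot] = a if a is not None else b
    return combo


# "Inactivation of the pRb proteins ... by the T121 oncogene"
of_t = {"agent": None, "action": "inactivate", "theme": "pRb proteins"}
by_t = {"agent": "T121 oncogene", "action": None, "theme": None}
combo = combine(of_t, by_t)
print(combo)
```

Raising on a conflict rather than silently picking one value keeps the sketch honest about what the source leaves unspecified.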
Preposition: OF
[Figure: finite-state automaton (FSA) for the OF template, states q0 to
q10. Transition labels: Negation; Adjective, noun, verb (-ed);
Nominalization (-ion); OF; NP only.]
Examples (state path: matched phrase):
• q0 – q5 – q6 – q2 – q3: Dfp1/Him1 protein OF fission yeast
• q0 – q1 – q2 – q3: mRNA expression OF genes
• q0 – q6 – q2 – q3 – q9 – q10: the determination OF the biological
characteristics OF human cancers
• q0 – q5 – q6 – q2 – q7 – q8 – q2 – q3: Time-dependent induction OF mRNA
expression OF Wip1
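The slide's automaton is not fully recoverable from the extracted text, so the following is a deliberately simplified sketch of how an FSA can recognize an OF pattern such as "mRNA expression OF genes". The states, the tag names (`MOD`, `NOM`, `OF`, `NP`), and the transition table are all assumptions made for illustration; they do not reproduce states q0 to q10, and the input is assumed to be tagged already.

```python
# Hypothetical, simplified transition table: (state, tag) -> next state.
# State names are illustrative and do not correspond to q0..q10 on the slide.
TRANSITIONS = {
    ("start", "MOD"): "start",  # adjective, noun, or verb (-ed) before the head
    ("start", "NOM"): "head",   # nominalization (-ion), e.g. "expression"
    ("head", "OF"): "of",       # the preposition itself
    ("of", "NP"): "done",       # noun phrase that follows OF
}
ACCEPTING = {"done"}


def run_fsa(tags):
    """Return (accepted, path) for a sequence of pre-assigned tags."""
    state = "start"
    path = [state]
    for tag in tags:
        next_state = TRANSITIONS.get((state, tag))
        if next_state is None:
            return False, path  # no transition defined: reject
        state = next_state
        path.append(state)
    return state in ACCEPTING, path


# "mRNA expression OF genes", tagged by hand for this sketch.
accepted, path = run_fsa(["MOD", "NOM", "OF", "NP"])
print(accepted, path)

# A phrase with no OF after the nominalization is rejected.
rejected, partial = run_fsa(["NOM", "NP"])
print(rejected, partial)
```

Nested patterns such as "induction OF mRNA expression OF Wip1" would need extra states that loop back to the OF transition; the sketch leaves them out to stay short.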
Visualization
Preferences to limit the knowledge map, e.g.
only abstracts with research on human cells
Contradictory finding
Line thickness indicates
frequency of findings
All abstracts related to the search, or abstracts related
to a term highlighted in the map are displayed
Select interesting
relations to
visualize
Overview
Double click to
expand
Expanded node
Finding the truth: p38
acts as a negative
feedback for Ras
signaling
COPLINK: From Transaction to Transformation
• Goal:
– Supporting law enforcement information sharing and
crime analysis.
• Content/Information:
– Police incident records, mug shots, gang information.
• Key Features:
– COPLINK Connect: linking legacy databases
– COPLINK Detect: detecting crime associations
(“criminal thesaurus”)
– COPLINK Agent: wireless alerting
– COPLINK Visualization: revealing criminal networks
• Funding:
– NSF Digital Government Program (L. Brandt)
Finding criminals: English and Chinese interface
A narcotic network example
• Switch between narcotic network and gang network
• A bubble represents a subgroup labeled by its leader's name
• The size of a bubble is proportional to the number of individuals
in the group
• A line between bubbles implies that some individuals in one group
interact with some individuals in the other group. The thicker the
link, the more individual interactions between the two groups
• A point represents an individual labeled by his name
• A line between points represents a link between two persons
• The rankings of the members of a selected group (green)
• Show network and reset network
• Adjust level of details
A gang network example
The leader
The reduced
network structure
A clique
A gatekeeper
Criminal Patterns Found
• The chain structure of
the narcotic network
• Implications: disrupt the
network by breaking the
chain
• The star structure of the
gang network
• Implications: disrupt the
network by removing
the leader
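The two disruption strategies above can be illustrated on toy graphs. This is a minimal sketch, not COPLINK's analysis: the node names, the helper functions `build_graph`, `count_components`, and `most_connected`, and the use of node degree to pick out the leader are all assumptions made for illustration.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (a, b) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def count_components(graph, removed=frozenset()):
    """Count connected components after deleting the nodes in `removed`."""
    seen = set(removed)
    components = 0
    for start in graph:
        if start in seen:
            continue
        components += 1
        queue = deque([start])
        seen.add(start)
        while queue:  # breadth-first search over the remaining nodes
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
    return components


def most_connected(graph):
    """Return the node with the most direct links (highest degree)."""
    return max(graph, key=lambda node: len(graph[node]))


# Star structure (gang network): everyone links to one leader.
star = build_graph([("leader", m) for m in ("m1", "m2", "m3", "m4")])
# Chain structure (narcotic network): each person links to the next.
chain = build_graph([("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")])

print(most_connected(star))                   # candidate leader
print(count_components(star, {"leader"}))     # fragments once the leader is removed
print(count_components(chain, {"c"}))         # fragments once the chain is broken
```

In the star graph, removing the highest-degree node isolates every remaining member; in the chain, removing any interior node splits the network in two.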
Expert Validation
• White gangs involved in murders and shootings
• A group of black gangs
• White gangs who sold crack cocaine
Expert Validation
• “(211) and (173) are best friends”
• “Yes, these two groups are together very often”
The Future
• Many active and high-impact research
opportunities for researchers in
information science, library science,
computer science, public policy,
and management information systems.
• Digital library researchers are well positioned to
become the “agents of transformation” for the new
Net of the 21st century.
The Questions
• Who/what is a “librarian”?
• How to transform data and information
into knowledge?
• How to balance between technology,
policy, users, and services?
For more information
• “Knowledge Management
Systems,” H. Chen, 2002
• “Trailblazing a Path Towards
Knowledge and Transformation,” H. Chen, 2003
• International Conference of Asian Digital
Libraries, December 8-11, 2003, KL, Malaysia
• ACM/IEEE Joint Conference on Digital Libraries,
June 7-11, 2004, Tucson, Arizona
• NSF International Digital Library Workshop, June
10-11, 2004, Tucson, Arizona (successful national
DL projects)
For Project Information at AI Lab:
http://ai.bpa.arizona.edu
hchen@bpa.arizona.edu