A Metadata Search Interface for
RNS File Catalog
GFS-WG, OGF31 Taipei
Hideo Matsuda
Osaka University
© 2008 Open Grid Forum
OGF IPR Policies Apply
“I acknowledge that participation in this meeting is subject to the OGF Intellectual Property
Policy.”
Intellectual Property Notices Note Well: All statements related to the activities of the OGF and
addressed to the OGF are subject to all provisions of Appendix B of GFD-C.1, which grants to
the OGF and its participants certain licenses and rights in such statements. Such statements
include verbal statements in OGF meetings, as well as written and electronic communications
made at any time or place, which are addressed to:
• the OGF plenary session,
• any OGF working group or portion thereof,
• the OGF Board of Directors, the GFSG, or any member thereof on behalf of the OGF,
• the ADCOM, or any member thereof on behalf of the ADCOM,
• any OGF mailing list, including any group list, or any other list functioning under OGF auspices,
• the OGF Editor or the document authoring and review process
Statements made outside of a OGF meeting, mailing list or other function, that are clearly not
intended to be input to an OGF activity, group or function, are not subject to these provisions.
Excerpt from Appendix B of GFD-C.1: ”Where the OGF knows of rights, or claimed rights, the
OGF secretariat shall attempt to obtain from the claimant of such rights, a written assurance
that upon approval by the GFSG of the relevant OGF document(s), any party will be able to
obtain the right to implement, use and distribute the technology or works when implementing,
using or distributing technology based upon the specific specification(s) under openly specified,
reasonable, non-discriminatory terms. The working group or research group proposing the use
of the technology with respect to which the proprietary rights are claimed may assist the OGF
secretariat in this effort. The results of this procedure shall not affect advancement of
document, except that the GFSG may defer approval where a delay may facilitate the obtaining
of such assurances. The results will, however, be recorded by the OGF Secretariat, and made
available. The GFSG may also direct that a summary of the results be included in any GFD
published containing the specification.”
OGF Intellectual Property Policies are adapted from the IETF Intellectual Property Policies that
support the Internet Standards Process.
Outline
• Brief Introduction to RNS (Resource Namespace Service)
• File Catalog and Its Usecase
• Metadata Search Interface
Resource Namespace Service (RNS)
• Hierarchical namespace management that provides name-to-resource mapping
• RNS 1.1 specification was published as OGF documents GFD.171 and GFD.172.
• Basic Namespace Component
• RNS directory entry: non-leaf node in hierarchical namespace tree
• RNS non-directory entry: name-to-resource mapping that interconnects a reference to any existing resource into hierarchical namespace
[Figure: example namespace tree rooted at /grid with directories ogf, jp, data, gfs and files file1 to file4; file entries point to EPR1 and EPR2. EPR: Endpoint Reference]
Operations in RNS Specification 1.1
add ( entryName: String, [entryEndpoint: EPR], [entryMetadata: XML] ): RNSEntry
lookup ( [ entryName: String ] ): LookupResults
remove ( entryName: String ): RNSEntry
rename ( oldEntryName: String, newEntryName: String ): RNSEntry
setMetadata ( entryName: String, newMetadata: XML ): RNSEntry
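To make the shape of the five operations concrete, here is a minimal in-memory sketch in Python. It is not the RNS implementation or its wire protocol: the class names `RNSEntry` and `RNSDirectory` are hypothetical, endpoints and metadata are held as plain strings in place of EPR and XML documents, and the error behaviour (raising `KeyError`) is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RNSEntry:
    """One named entry: an optional endpoint plus optional metadata (both as strings)."""
    name: str
    endpoint: Optional[str] = None   # stand-in for an EPR document
    metadata: Optional[str] = None   # stand-in for an XML metadata fragment


@dataclass
class RNSDirectory:
    """Hypothetical in-memory directory mirroring the five listed operations."""
    entries: Dict[str, RNSEntry] = field(default_factory=dict)

    def add(self, entry_name, entry_endpoint=None, entry_metadata=None) -> RNSEntry:
        if entry_name in self.entries:
            raise KeyError(f"entry already exists: {entry_name}")
        entry = RNSEntry(entry_name, entry_endpoint, entry_metadata)
        self.entries[entry_name] = entry
        return entry

    def lookup(self, entry_name=None) -> List[RNSEntry]:
        # Without a name, every entry in the directory is returned.
        if entry_name is None:
            return list(self.entries.values())
        return [self.entries[entry_name]] if entry_name in self.entries else []

    def remove(self, entry_name) -> RNSEntry:
        return self.entries.pop(entry_name)

    def rename(self, old_entry_name, new_entry_name) -> RNSEntry:
        if new_entry_name in self.entries:
            raise KeyError(f"entry already exists: {new_entry_name}")
        entry = self.entries.pop(old_entry_name)
        entry.name = new_entry_name
        self.entries[new_entry_name] = entry
        return entry

    def set_metadata(self, entry_name, new_metadata) -> RNSEntry:
        entry = self.entries[entry_name]
        entry.metadata = new_metadata
        return entry
```

Note that `lookup` with no argument returns every entry, which is the behaviour the metadata search interface later in this deck aims to restrict.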
RNS add operation (request)
Request:
<rns:add>
<rns:entry-name name="EntryNameType">
<rns:endpoint> EndPointReferenceType
</rns:endpoint>
<rns:metadata> RNSMetadataType
</rns:metadata>
</rns:entry-name>
</rns:add>
RNS add operation (response)
Success and Failure Response:
<rns:addResponse>
<rns:entry-response name="EntryNameType">
<rns:endpoint> EndPointReferenceType
</rns:endpoint>
<rns:metadata> RNSMetadataType
</rns:metadata>
[ <rns:fault> wsbf:BaseFaultType </rns:fault> ]
</rns:entry-response>
</rns:addResponse>
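As an illustration of the request and response shapes above, the following Python sketch builds an add request and splits a response into succeeded and failed entries using the standard `xml.etree.ElementTree` module. The element names follow the simplified pseudo-schema on these slides and may differ from the published specification; the namespace URI is the one quoted in the XQuery example later in this deck. The helper names `build_add_request` and `split_add_response` are hypothetical, and the endpoint is passed as plain text rather than a full EPR document.

```python
import xml.etree.ElementTree as ET

# Namespace URI as quoted in the XQuery example of this deck.
RNS_NS = "http://schemas.ogf.org/rns/2009/12/rns"
ET.register_namespace("rns", RNS_NS)


def q(tag: str) -> str:
    """Qualify a tag name with the RNS namespace (ElementTree's {uri}tag form)."""
    return f"{{{RNS_NS}}}{tag}"


def build_add_request(name: str, endpoint_text: str, metadata_text: str) -> str:
    """Serialize an <rns:add> request following the slide's pseudo-schema."""
    root = ET.Element(q("add"))
    entry = ET.SubElement(root, q("entry-name"), {"name": name})
    ET.SubElement(entry, q("endpoint")).text = endpoint_text
    ET.SubElement(entry, q("metadata")).text = metadata_text
    return ET.tostring(root, encoding="unicode")


def split_add_response(xml_text: str):
    """Return (succeeded, failed) entry names; an entry failed if it carries <rns:fault>."""
    root = ET.fromstring(xml_text)
    ok, failed = [], []
    for entry in root.findall(q("entry-response")):
        target = failed if entry.find(q("fault")) is not None else ok
        target.append(entry.get("name"))
    return ok, failed
```

Since the response lists one `entry-response` per entry with an optional fault, a client can report partial failures per entry rather than treating the whole call as failed.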
Current Implementation at Osaka U.
(RNS Client Operations)
add operation:
rns-mkdir directory-path (Directory Entry)
rns-add ur URL RNS-path / rns-add er EPR_file RNS-path (Another RNS (Directory) Entry)
rns-add u URL RNS-path / rns-add e EPR_file RNS-path (NonDirectory Entry)
rns-gridftp-put local-file-path physical-location-URL RNS-path (NonDirectory Entry, file transfer is under construction)
lookup operation:
rns-ls directory-path (Directory Entry)
rns-getepr RNS-path (NonDirectory Entry)
rns-gridftp-get RNS-path (NonDirectory Entry, file transfer is under construction)
remove operation:
rns-rmdir directory-path (Directory Entry)
rns-rm RNS-path (NonDirectory Entry)
Example of RNS operations
$ rns-mkdir /dir1
$ rns-mkdir /dir1/dir2
$ rns-add u gsiftp://host/file /dir1/dir2/file
$ rns-ls /dir1/dir2/file
/dir1/dir2/file -> gsiftp://host/file
$ rns-rm /dir1/dir2
/dir1/dir2: Is a directory
$ rns-rm-f /dir1/dir2
RNS as a File Catalog Service
• DataGrid often manages widely distributed data (e.g., High Energy Physics, Astronomy, Biology, etc.)
• File Catalog provides functionality of logical-to-physical mapping (e.g., gLite LFC).
• RNS can be used as a File Catalog Service.
[Figure: a Client performs registration and query of endpoint references (EPR) with logical names and metadata at the File Catalog Server, which holds the namespace tree (/grid, ogf, jp, data, gfs, file1 to file4); the Client then uses an EPR to access each file on Filesystem 1, Filesystem 2, or Filesystem 3. EPR: Endpoint Reference]
File Catalog in e-Science
• File Catalog can be used not only for file-location management but also for metadata in e-Science, since metadata is often described with a hierarchical representation in many sciences.
[Figure: two example hierarchies. High Energy Physics: ATLAS, CMS; 20071003, 20080110; run1, run2; track1, track2. Molecular Biology: Genome, Proteome; Bacterial Genome, Human Genome, Plant Genome; Functional Analysis, Structure Analysis; gb|AY157024, sp|P37231, pdb|1FM6]
High Energy Physics Usecase:
ILC VO File Catalog
(Callouts on the listing: the second column is the number of entries; the last column is the directory.)
[watase@kek2-uidev rpc]$ LFC_HOST=grid-lfc.desy.de lfc-ls -l /grid/ilc/mc-2008_2/s
drwxrwxr-x  3449 44318 3454 0 Sep 28 2009 CMS_250_IDAG-ppr004
drwxrwxr-x   111 44263 3454 0 Jun 17 2009 CMS_250_ppr003
drwxrwxr-x 93894 44290 3454 0 Mar 09 2009 CMS_250_ppr004
drwxrwxr-x    60 44263 3454 0 Feb 23 2009 CMS_250_pre002
drwxrwxr-x  1385 44318 3454 0 Jun 05 2009 CMS_500_Presel_IDAG_p
drwxrwxr-x  8926 44290 3454 0 Mar 19 2009 CMS_500_kek-ppr004
drwxrwxr-x 81146 44290 3454 0 Jul 30 2009 CMS_500_ppr004
drwxrwxr-x   200 44263 3454 0 Nov 09 2008 CMS_500_pre002
drwxrwxr-x  4767 44290 3454 0 Mar 09 2009 DESY_SM_500_ppr004
drwxrwxr-x  1113 44290 3454 0 Mar 06 2009 Desy_point5_ppr004
drwxrwxr-x   540 44263 3454 0 Nov 09 2008 Single_Particles_pre002
drwxrwxr-x  3567 44288 3454 0 Apr 08 2009 Slac_point5_ppr004
drwxrwxr-x   167 44377 3454 0 Feb 21 2009 pair_bkgs_LowPparams_c
drwxrwxr-x   100 44377 3454 0 May 13 2009 pair_bkgs_nominalparams
drwxrwxr-x  1997 44377 3454 0 Feb 28 2009 pair_bkgs_nominalparams
drwxrwxr-x  1650 44290 3454 0 Jun 17 2009 pythiaZPole_ppr004
drwxrwxr-x    22 44290 3454 0 Dec 19 2008 ucam_uds
Metadata Search for RNS File Catalog
• A lookup operation returns all entry information (which can be a very large number of entries).
• A basic idea (restricting output by XQuery against metadata) was proposed by Tatebe at OGF28.
• RNS entries can spread over multiple servers
→ Metadata search is done against a directory. The search starts from a root directory and recursively issues an XQuery against each sub-directory (like the “find” command in Unix).
[Figure: directory tree /grid with dir1, dir2, dir3; labels “XQuery against metadata” and “Hit to Query”]
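The recursive, find-like traversal described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the proposed interface: the `Directory` class, the `find` function, and the sample tree are hypothetical, metadata is modelled as a plain dictionary, and a Python predicate stands in for the XQuery. In a real deployment each sub-directory could reside on a different server, so each recursive step would be a remote call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterator, Tuple

Metadata = Dict[str, str]


@dataclass
class Directory:
    """Hypothetical directory node holding file metadata and sub-directories."""
    files: Dict[str, Metadata] = field(default_factory=dict)
    subdirs: Dict[str, "Directory"] = field(default_factory=dict)


def find(path: str, directory: Directory,
         matches: Callable[[Metadata], bool]) -> Iterator[Tuple[str, Metadata]]:
    # Apply the query to the entries of this directory first ...
    for name, metadata in directory.files.items():
        if matches(metadata):
            yield f"{path}/{name}", metadata
    # ... then recurse into each sub-directory, as Unix find does.
    for name, sub in directory.subdirs.items():
        yield from find(f"{path}/{name}", sub, matches)


# Invented sample data; the nesting of dir3 under dir2 is an assumption.
root = Directory(subdirs={
    "dir1": Directory(files={"a": {"run": "1"}}),
    "dir2": Directory(files={"b": {"run": "2"}},
                      subdirs={"dir3": Directory(files={"c": {"run": "1"}})}),
})

hits = [p for p, _ in find("/grid", root, lambda m: m.get("run") == "1")]
print(hits)
```

Because `find` is a generator, matching entries can be streamed back as each directory answers, instead of collecting the full listing before filtering.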
RNS Metadata Search Interface
• Additional operation based on the lookup operation.
lookup ( [ entryName: String ] ): LookupResults
search ( [ entryName: String, ] query: String ): SearchResults
• SearchResults include only the entries whose metadata matches a given query.
• Submit its specification to the GFS working group mailing list.
Example: Key-Value Metadata
• An example of an RNS entry:
<entry name="EntryNameType">
<endpoint> EndPointReferenceType
</endpoint>
<metadata>
<rnskv key="Key1"> Value1 </rnskv>
<rnskv key="Key2"> Value2 </rnskv>
…
</metadata>
</entry>
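A client receiving an entry of this shape could flatten the `rnskv` elements into a dictionary. The sketch below uses Python's standard `xml.etree.ElementTree`; the sample document, its values, and the helper name `metadata_as_dict` are invented for illustration, and namespaces are omitted as on the slide.

```python
import xml.etree.ElementTree as ET

# Invented sample entry following the key-value layout shown above.
ENTRY_XML = """
<entry name="file1">
  <endpoint>gsiftp://host/file</endpoint>
  <metadata>
    <rnskv key="Key1"> Value1 </rnskv>
    <rnskv key="Key2"> Value2 </rnskv>
  </metadata>
</entry>
"""


def metadata_as_dict(entry_xml: str) -> dict:
    """Map each rnskv key attribute to its whitespace-trimmed text value."""
    entry = ET.fromstring(entry_xml)
    return {kv.get("key"): (kv.text or "").strip()
            for kv in entry.findall("./metadata/rnskv")}


pairs = metadata_as_dict(ENTRY_XML)
print(pairs)
```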
An Example of XQuery
declare namespace ns1 = "http://schemas.ogf.org/rns/2009/12/rns";
let $ent := /ns1:RNSEntryResponseType
let $rnskv := $ent/ns1:metadata/rnskv
where exists($rnskv) and $rnskv/@key = "key1"
return
<ns1:RNSEntryResponseType entry-name="{string($ent/@entry-name)}"
xmlns:ns1="http://schemas.ogf.org/rns/2009/12/rns"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:type="ns1:RNSEntryResponseType">
{$ent/ns1:endpoint}
<ns1:metadata xsi:type="ns1:RNSMetadataType">
{$ent/ns1:metadata/ns1:supports-rns}
<rnskv key="key1">{$rnskv[@key = "key1"]/text()}</rnskv>
</ns1:metadata>
</ns1:RNSEntryResponseType>
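For readers without an XQuery engine at hand, the selection logic of the query above can be approximated in Python with `xml.etree.ElementTree`. This is a stand-in for illustration, not how an RNS server evaluates queries: the function name `filter_entry` and the sample document are hypothetical, and the result is returned as a dictionary rather than a reconstructed `RNSEntryResponseType` element.

```python
import xml.etree.ElementTree as ET
from typing import Optional

NS = {"ns1": "http://schemas.ogf.org/rns/2009/12/rns"}


def filter_entry(entry_xml: str, key: str) -> Optional[dict]:
    """Return the entry name and the value stored under `key`, or None if the key is absent."""
    ent = ET.fromstring(entry_xml)
    # rnskv is unqualified in the query above, so it is matched without a namespace here too.
    for kv in ent.findall("ns1:metadata/rnskv", NS):
        if kv.get("key") == key:
            return {"entry-name": ent.get("entry-name"), key: (kv.text or "").strip()}
    return None


# Invented sample response entry with two key-value pairs.
SAMPLE = (
    '<ns1:RNSEntryResponseType xmlns:ns1="http://schemas.ogf.org/rns/2009/12/rns" '
    'entry-name="file1">'
    '<ns1:metadata><rnskv key="key1">v1</rnskv><rnskv key="key2">v2</rnskv></ns1:metadata>'
    '</ns1:RNSEntryResponseType>'
)

print(filter_entry(SAMPLE, "key1"))
```

An entry without the requested key yields `None`, which corresponds to the query's `where` clause producing no result for that entry.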
Summary
• RNS can be used as a File Catalog Service.
• Moreover, XML metadata and its search interface using XQuery provide flexible access to a large amount of data and reduce the amount of output returned.
• We want to standardize the interface specification.