Internet Evolution, Governance and the Digital Object Architecture

Workshop on SCORM Sequencing and Navigation
Gaithersburg, Maryland
February 23, 2005

Robert E. Kahn
Corporation for National Research Initiatives
Reston, Virginia

Three Initial Networks
- About 30 – 35 years ago, DARPA funded the creation of three seminal packet networks: ARPANET, Packet Radio, Packet Satellite
- The Internet came about from a desire to link the three of them
- Ethernet occurred in parallel, led by Xerox PARC researchers, and other network types followed
- The resulting architecture was independent of the number and type of networks or who ran them

Key Decisions
- The Internet would be a global information system
- An open architecture would be used to combine different networks based on open and well-known interfaces, protocols & objects
- A new communications-oriented host protocol (TCP/IP) would be created to replace the original ARPANET host protocol (NCP)
- The concept of global addressing and IP addresses would be introduced to identify individual machines anywhere on the global Internet

Comments on the Key Decisions
- The architecture is robust in the presence of many different network types and many outages
- Gateways provided IP routing and network “impedance matching”
- TCP accommodated end-to-end protocols, different packet sizes, duplicates, error detection, and losses due to tunnels, mountains, jamming, etc.
- Separate network administrations were permitted, which allowed the Net to grow
- DNS was not technically critical, but helped users

Understanding the Big Picture
- Many things were done well from the outset; with 20/20 hindsight, some could have been done better
- The context was critical
  - Mostly mainframes, few time-sharing systems
  - No PCs, workstations, LANs
  - One dominant carrier in the US
  - Government facility initially
- What is important at the time may only be apparent with hindsight; but also what seems important at the time may not turn out to be so important later on

Key Management Structures
- Internet Configuration Control Board (ICCB)
- Internet Activities Board (IAB)
- Internet Engineering Task Force (IETF)
- Internet Architecture Board (IAB)
- Internet Society (ISOC)
- Domain Name System (DNS)
- Internet Corporation for Assigned Names and Numbers (ICANN)

Defining the Internet
- Logical architecture for internetworking
- Independent of the underlying networks
- Open architecture at the network level
- Not the routers, switches, lines, computers
- Not any one service provider on the net
- Reference to the FNC definition from 1995: www.hpcc.gov/fnc/Internet_res.html

Updating the “1995 Definition”
"Internet" refers to the global information system that:
(i) is logically linked together by a globally unique address space based on the Internet Protocol (IP) or its subsequent extensions/follow-ons;
(ii) is able to support communications using the Transmission Control Protocol/Internet Protocol (TCP/IP) suite or its subsequent extensions/follow-ons, and/or other IP-compatible protocols; and
(iii) provides, uses or makes accessible, either publicly or privately, high level services layered on or integrated with the communications and related infrastructure described herein.

Social and Political Dimensions
- Most nations had committed to ISO protocols
- Yet TCP/IP won out in the final analysis
- Many reasons why: critical mass, many organizations helped, no significant benefit to changing, etc.
- Formation of ICCB, IAB, IETF, etc. enabled the net to evolve
- NSF strategy of fostering independent networks expanded participation without central control
- Boucher Bill allowed commercialization

The World Embraces the Internet
- Who is in Charge of the Internet?
- Who is in Charge of the World Economy?
- World Summit in Geneva – December, 2003
- And a host of follow-on activities:
  - ITU Workshop – February, 2004
  - UN ICT Task Force Global Forum – March, 2004
  - Working Group on Internet Governance (WGIG)
  - Phase II of the World Summit in Tunis – Nov. ’05

Infrastructure Development
What is so hard about it?
- Getting buy-in
- Making it scalable over platforms, size and time
- Achieving critical mass
- Pleasing many essential participants
- Displacing prior capabilities
- Structuring matters to deal with concerns about empire building
It’s a lot easier to create brand new capabilities than to affect existing means of operation

Infrastructure Creation is a Subtractive Process
- Infrastructure reduces a common, shared capability to its basic and essential attributes
- These attributes are not always recognized or understood up front
- Upon further scrutiny, capabilities are usually deleted from a well-conceived architecture over time
- Consensus develops when no more can be removed without disabling the infrastructure

What is the Problem?
- Managing information in the Net over very long periods of time, e.g. centuries or more
- Dealing with very large amounts of information in the Net over time
- When information, its location(s) and even the underlying systems may change dramatically over time
- Respecting and protecting rights, interests and value

A Meta-level Architecture
- Allows for arbitrary types of information systems
- Allows for dynamic formatting and data typing
- Can accommodate interoperability between multiple different information systems
- Allows metadata schema to be identified and typed

Digital Object Architecture: Motivation
- To reformulate the Internet architecture around the notion of uniquely identifiable data structures
- Enabling existing and new types of information to be reliably managed and accessed in the Internet environment over long periods of time
- Providing mechanisms to stimulate innovation and the creation of dynamic new forms of expression, and to manifest older forms
- While supporting intellectual property protection and fine-grained access control, and enabling well-formed business practices to emerge

Digital Object Architecture: Technical Components
- Digital Objects (DOs)
  - Structured data, independent of the platform on which it was created
  - Consisting of “elements” of the form <type,value>
  - One of which is its unique, persistent identifier
- Resolution of Unique Identifiers
  - Maps an identifier into “state information” about the DO
  - Handle System is a general purpose resolution system
- Repositories from which DOs may be accessed
  - And into which they may be deposited
- Metadata Registries
  - Repositories that contain general information about DOs
  - Supports multiple metadata schemes
  - Can map queries into unique DO specifications (via handles)

What is a Digital Object
- Defined data structure, machine independent
- Consisting of a set of elements
  - Each of the form <type,value>
  - One of which is the unique identifier
- Identifiers are known as “Handles”
  - Format is “prefix/suffix”
  - Prefix is unique to a naming authority
  - Suffix can be any string of bits assigned by that authority
- Data structure can be parsed; types can be resolved within the architecture
- Associated properties record and transaction record containing metadata and usage information

Interoperability & Federated Repositories
- Create a cohesive interoperable collection of repository-based systems
- Initially, perhaps, around a core set of projects, content, applications and/or organizations as in ADL
- Demonstrate interoperability between different repository collections
- Develop procedures to ensure continued accessibility to key archival information

Repository Notion
[Diagram labels: Logical External Interface; RAP (Repository Access Protocol); Any Hardware & Software Configuration]

Repositories & Digital Objects
- Each Digital Object has its own unique & persistent ID
- Content Providers want to assign IDs
- Objects may be replicated in multiple repositories
- Could be upwards of trillions of DOs per repository
[Diagram labels: IPv6; REPOSITORY]

Handle System
- Distributed Identifier Service on the Internet
- First general purpose resolution system
- Can be used to locate repositories that contain digital objects given their handles, and more!
  - Other indirect references
  - Public keys, authentication information for DOs
- Accommodates interoperability between many different information systems; for example, DNS was demonstrated on the Handle System in preparation for Y2K
- Can support ENUM, RFID, and more

Attributes of the Handle System
- The basic architecture of the Handle System is flat, scalable, and extensible
- Logically central, but physically decentralized
- Supports Local Handle Services, if desired
- Handle resolutions return entire “Handle Records” or portions thereof
- Handle Records are also digital objects, signed by the servers and doubly certificated by the system

Resolution Mechanism
[Diagram labels: Handle; Handle Record; Multiple Sites; Multiple Servers]

Handle System <www.handle.net>
- System is non-nodal
- Scalable & distributed
- Supports global (and local) resolution
- With backup for reliability, mirroring for efficiency

Managing Rights & Interests
- Not just about copyright
- Terms and Conditions (T&Cs) for use may be contained within each DO; also information about intrinsic value, such as monetary value
- T&Cs are intended to indicate clearly what one can and/or cannot do with a given DO, where such clarity is intended by the owner of the DO
- Not an enforcement means, although it may be used by an enforcement system
- Mobile programs that are Digital Objects may apply such terms to themselves and to any digital objects they contain

Handle Format
2304.40/1234
- Prefix (2304.40): Prefix Authority
- Suffix (1234): Item ID (any format)
In use, a Handle is an opaque string.

Other examples of Handles
- 2304/general info
- 2304/1
- 2304.HQ/staff
- 2304.1/memo123
- 2304.22.Pub/2004

Direct Access and Proxies
[Diagram labels: Direct Access; Indirect Access; One or more Proxy Servers; Redirection of Handle Requests; General Registry of all Naming Authorities; Redirection Information; One or more Local Handle Services]

Conclusions
- Managing Digital Objects for long-term access is a key challenge
- Initial technology components are available; industry is expected to generate more over time
- Third-party value-added providers in the private sector will ultimately shape the long-term evolution
- Interoperability and reliable information access are critical objectives
- A diversity of applications (with user-friendly interfaces) needs to be developed & deployed
- The ADL Project can play a central role in demonstrating the technology and using it effectively
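
Illustrative sketch: Handle format and resolution

The Handle Format slide describes a Handle as "prefix/suffix", where the prefix identifies a naming authority and the suffix is any string that authority assigns. The following is a minimal Python sketch of that split and of resolution as a lookup from a Handle to a record of <type,value> elements. It is not the Handle System's actual API: the names `Handle`, `parse_handle`, `resolve`, the in-memory `_records` table, and the example URL are all hypothetical, and splitting on the first "/" is an assumption made so that suffixes may contain any characters.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Handle:
    """A handle split into its two parts (hypothetical helper type)."""
    prefix: str  # naming authority, e.g. "2304.40"
    suffix: str  # item ID in any format, e.g. "1234"

    def __str__(self) -> str:
        return f"{self.prefix}/{self.suffix}"


def parse_handle(text: str) -> Handle:
    """Split 'prefix/suffix' on the first '/' (assumed separator rule)."""
    prefix, sep, suffix = text.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a prefix/suffix handle: {text!r}")
    return Handle(prefix, suffix)


# A handle record modeled as a list of <type, value> elements.
HandleRecord = List[Tuple[str, str]]

# Hypothetical in-memory stand-in for a resolution service; the URL is made up.
_records: Dict[Handle, HandleRecord] = {
    parse_handle("2304.40/1234"): [
        ("URL", "https://repository.example/objects/1234"),
    ],
}


def resolve(text: str) -> HandleRecord:
    """Map a handle string to its record; raises KeyError if unknown."""
    return _records[parse_handle(text)]


if __name__ == "__main__":
    for example in ["2304.40/1234", "2304/general info", "2304.22.Pub/2004"]:
        h = parse_handle(example)
        print(f"{example!r}: prefix={h.prefix!r} suffix={h.suffix!r}")
    print(resolve("2304.40/1234"))
```

Callers in this sketch treat the Handle as an opaque string, as the slide notes; only the resolver looks inside it. A real deployment would replace the dictionary with queries to distributed handle servers.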