Legal Data Markup Software
CS501 Requirements Presentation
October 4th, 2000

Project Team
  Sponsors:
    - Professor William Arms
    - Professor Thomas Bruce
  Reviewer:
    - Charles Shagong
  Developers:
    - Ju Joh
    - Sylvia Kwakye
    - Jason Lee
    - Nidhi Loyalka
    - Omar Mehmood
    - Amy Siu
    - Brian Williams

Introduction
  - Objective: convert the US Code (ASCII) into well-formed, valid XML output
  - XML output used as input to other applications
  - Goal of end use: making law available for general public use

References
  - Current version of code: US Code HTML
  - XML tutorials and FAQs
  - Tasmanian SGML DTDs (EnAct)
  - W3C XML draft specification
  - The Perl CD Bookshelf

Overview
  - Functional Requirements
  - Usability Requirements
  - Minimum Performance Requirements
  - Design Constraints
  - Supportability Requirements

US Code
  - Acts of Congress (Law)
  - 50 Titles (e.g. Armed Forces, Bankruptcy, Copyrights, Labor, Patents, Transportation)
  - Constantly updated by Congress
  - Each Title posted online with revisions in ASCII format

Legal Information Institute
  - Associated with Cornell's Law School
  - Founded in part by Thomas Bruce
  - Goal: publish the US Code on the Web in a presentable format for the general public

Problems
  - Current version has difficulties with:
    - Overall US Code structure variations
    - Tables, footnotes, appendices
  - HTML lacks the archival qualities of XML, since it fails to show structural relationships.
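The point about structural relationships can be illustrated with a small sketch. It is written in Python purely for illustration (the project itself targets Perl), and the element names `title`, `chapter`, `section`, `catchline`, and `text` are hypothetical, not taken from the LDMS DTD. HTML would typically show the same content as flat headings and paragraphs. Here the nesting is explicit, so a program can recover which chapter a section belongs to.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names; the real LDMS DTD may differ.
title = ET.Element("title", {"num": "1"})
chapter = ET.SubElement(title, "chapter", {"num": "1"})
section = ET.SubElement(chapter, "section", {"num": "1"})
ET.SubElement(section, "catchline").text = "Example catchline"
ET.SubElement(section, "text").text = "Example section text."

# Serialize the tree to an XML string.
xml_output = ET.tostring(title, encoding="unicode")

# Because the hierarchy is explicit, it can be read back from the output.
reparsed = ET.fromstring(xml_output)
section_paths = [
    (ch.get("num"), sec.get("num"))
    for ch in reparsed.findall("chapter")
    for sec in ch.findall("section")
]

print(xml_output)
print(section_paths)
```

The list of (chapter, section) pairs is the kind of structural query that a flat HTML rendering cannot answer without guessing from layout.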
Title 1 (LII HTML)
Title 1 (ASCII from Congress)
Title 26 (LII HTML)
Title 26 (ASCII from Congress)
Title 50 (LII HTML)
Title 50 (ASCII from Congress)

Solution
  LDMS will have to:
  - Maintain structural layout of US Code
  - Generate cascading table of contents
  - Allow title or full text search
  - Mark up and preserve notes
  - Link cross-references
  - Preserve catchlines
  - Generate appendices
  - Highlight reserved words

Functionality
  Directly follows from client-specified qualities

  Functional Requirements
  - Table of Contents Generation
    - Direct representation of the hierarchy inherent to the structure of the US Code
  - Appendices Generation
    - LDMS will recognize appendix sections and mark up their constituent elements
  - Catchline Handling
    - LDMS will recognize short headers in the US Code and mark them appropriately
  - Preservation of Cross-references
    - LDMS will recognize self-referential links by establishing anchors and links between text sections
  - Table Handling
    - LDMS will recognize tabular data in the US Code, marking up and organizing data elements into proper dimensions and indices
  - Preservation of Notes
    - Critical for references, background information, and sources
    - LDMS will recognize notes
  - Reserved Words Recognition
    - Critical attributes to entire subdivisions of text
    - LDMS will mark up applicable text
  - Graceful Failures
    - LDMS will mark up unrecognizable variations in US Code titles as such
    - If at all possible, LDMS will maintain readability despite the graceful failure
  - Special Character Handling
    - Non-standard characters have different meanings
    - LDMS will recognize, mark up, and represent nonconventional characters
  - Navigational Aids
    - LDMS will facilitate next/previous reference links
  - Known Data Input Path
    - Raw ASCII US Code input located in a known directory

House of Quality (HOQ)
  Engineering requirements:
    - XSL
    - ASCII -> Unicode
    - Word Pattern Matching
    - Special DTD Tags
    - White Space Pattern Matching
    - State Machine
  Client requirements:
    - Appendices
    - Special Characters
    - Cross-references
    - Structural Layout
    - Tables
    - TOC
    - Catchline
    - Notes
    - Next/Prev
    - Graceful Failure
    - Magic Word
  Each engineering requirement is rated for Difficulty and Importance, least to most (1 to 6).
  "+" marks a positive correlation between two requirements.
  [Figure: House of Quality matrix]

Usability
  Development and Application Environment
  - Red Hat Linux running on Leda
  - Cron daemon will execute the software at client-specified intervals
  - Two levels of users for human operation of LDMS:
    - Normal users
    - Power users

  Normal Users
  - Computer literacy assumed
  - Familiarity with the Linux operating system
  - Required to start and/or stop the program from a Linux command line window
  - Application manual provided for training
  - 30-60 minutes expected training time

  Power Users
  - Familiarity with:
    - Standard development directory within the Linux operating system
    - Perl programming language
    - XML, DTDs, and the US Code
  - LDMS source code, source code documentation, help files, and manual page will be provided
  - One week expected training time

  Time Estimation for Measurable Tasks
  - Given the specifications of Leda, estimates for conversion of all fifty titles of the US Code to XML:
    - 30 minutes to read the US Code in its entirety
    - 12-24 hours for conversion processing

  Status Messages
  - During execution, LDMS will display status messages at client-specified intervals, notifying the user of the progress within the current title.
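The status message requirement can be sketched as follows. This is an illustrative Python sketch, not the LDMS implementation (which is specified to be in Perl). It assumes the client-specified interval is a count of input lines, which the slides do not state, and the function name `convert_with_status` and its parameters are hypothetical.

```python
from typing import Callable, Iterable, List


def convert_with_status(
    lines: Iterable[str],
    total_lines: int,
    interval: int,
    report: Callable[[str], None] = print,
) -> int:
    """Process the lines of one title, reporting progress every `interval` lines."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    processed = 0
    for _line in lines:
        # Placeholder for the real markup step, which is outside this sketch.
        processed += 1
        # Report at each interval boundary and once more at the end of the title.
        if processed % interval == 0 or processed == total_lines:
            percent = 100 * processed // total_lines
            report(f"processed {processed}/{total_lines} lines ({percent}%)")
    return processed


# Usage: collect the messages in a list instead of printing them directly.
messages: List[str] = []
sample = [f"line {i}" for i in range(10)]
count = convert_with_status(sample, len(sample), interval=4, report=messages.append)
print("\n".join(messages))
```

Passing the reporting function as a parameter keeps the sketch usable both interactively and under cron, where messages would more likely go to a log file.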
Reliability
  Availability
  - Available for use 100% of the time

  Mean Time Between Failures (MTBF)
  - Product designed to fail gracefully
  - Exceptional errors should not occur within the useful lifetime of 3 years

  Mean Time To Repair (MTTR)
  In case of product failure, MTTR depends on the nature of the fault:
  - Cause: Transient error in underlying platform
    MTTR: Time taken for the job to be restarted
  - Cause: Fatal error in underlying platform
    MTTR: Time taken to restart the system
  - Cause: Semantic error within program
    MTTR: Requires repair by reprogramming the offending part of the product
  - Cause: Error in input
    MTTR: Time required to correct input and/or output manually

  Accuracy
  - Paramount to success of project
  - Must generate XML that reproduces the original structure within defined tolerances
  - Validation and integrity testing performed using an XSL stylesheet to view the generated XML
  - Components and their accuracy tolerance levels:
    - Structure represented by XML output: 95%
    - Table of Contents: 95%
    - Reserved Words: 95%
    - Cross-references: 75%
    - Appendices: 75%
    - Catchlines: 95%
    - Preservation of Notes: 75%
    - Handling Tables: 75%
    - Handling Special Characters: 75%

  Acceptable Bugs
  - Delivering a perfect program is impossible
  - Bugs and defects not directly affecting usability of the program or accuracy of the output will be deemed tolerable

Supportability
  - Output file naming convention
    - Take input filename, attach “.xml” extension
  - Source Level Documentation
    - All code, Use Peer Review
  - Standard Unix Manual (Man) Page
  - Program Design Document (PDD)
  - DTD Design Document (DDD)

Performance
  - Transaction Response Time
    - Average per US Code Title: 30 min ± 10 min
  - Capacity: 1 transaction at a time
  - Resource Utilization: 12 MB system memory
    - 2 MB: interpreted Perl code
    - 5 MB: input data buffer
    - 5 MB: output data buffer

Design Constraints
  - OS: Red Hat Linux on Leda
  - Development Language: Perl
  - File Input: ASCII
  - File Output: XML

  Development System
  - leda.law.cornell.edu
  - 233 MHz Pentium II
  - 128 MB RAM
  - 28 GB HDD

  Software Interfaces
  - ASCII
  - LDMS
  - DTD
  - XML

Licensing Requirements
  - Extendable by Client
  - Possible Future Revenues
  - Might use downloaded Library Code
  - Joint Authorship Agreement written to address Licensing

Joint Authorship Agreement
  The undersigned agree to the following:
  1. That all code, documentation and other copyright-protected material produced in the course of this CS501 project (PROJECT MATERIAL) shall be understood by all to be the work of joint authors and not as a work made for hire;
  2. That the joint authors shall include all the undersigned, the CS501 students working on the project and Thomas R. Bruce;
  3. That despite joint authorship there will be no duty on the part of the student authors, individually or as a group, to account for any return on subsequent commercial use or development of the PROJECT MATERIAL;
  4. That, in contrast, should Thomas R. Bruce or the Legal Information Institute realize royalties or other direct financial return from licensing any of the PROJECT MATERIAL there will be a duty to account to the other joint authors for any such revenue net of costs; and
  5. That the undersigned will use care to assure that the PROJECT MATERIAL does not incorporate code covered by copyright and licensed on terms that are inconsistent with unlimited noncommercial distribution.

Legal, Copyright, and other Notices
  - No Warranty; however
  - Developers will do their best to fulfill requirements, but have no legal duties to do so

Applicable Standards
  - XML
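As a rough illustration of the ASCII-in, XML-out constraint and the XML standard named above, the following Python sketch wraps ASCII text in a single element and checks that the result is well-formed. It is not the LDMS design: the helper names `wrap_ascii` and `is_well_formed` and the `section` element are hypothetical, and the check covers well-formedness only, since the standard-library parser used here does not validate against a DTD.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def wrap_ascii(raw: bytes) -> str:
    """Decode ASCII input and wrap it in a hypothetical <section> element."""
    text = raw.decode("ascii")  # raises UnicodeDecodeError on non-ASCII bytes
    # escape() replaces &, <, and > so the text cannot break the markup.
    return f"<section>{escape(text)}</section>"


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML (well-formedness only, no DTD validation)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


sample_xml = wrap_ascii(b"Rates & limits for amounts < 100")
print(sample_xml)
print(is_well_formed(sample_xml))
print(is_well_formed("<section>unclosed"))
```

Validity against the project DTD would need a validating parser, which is outside what this sketch shows.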