XML Standards and Strategy

IRS XML Standards & Tax Return Data
Strategy
For External Discussion
June 30, 2010
Why has the IRS decided to make the change now?
1. When MeF was first deployed, industry standards and
best practices were in the early stages of development.
Over time, stable and well-adopted common practices
for XML reference models and Naming and Design Rules (NDR)
have evolved across numerous industry sectors and
government agencies, based on the Core Components
Technical Specification (ISO 15000-5).
2. The IRS standards are aligned with the National
Information Exchange Model (NIEM), which is now
recognized and promoted by the Federal CIO Council
as an excellent framework for XML data exchange.
These standards will allow the IRS to ensure
consistency and rapidly integrate published forms,
business services, and compliance systems across all
data sources, including MeF.
What we hope to accomplish
TODAY
Large vocabulary with numerous redundant
terms, message components, messages,
and interfaces. A small set of business
applications results in a large distributed
vocabulary
TOMORROW
Each layer is assembled from components at the lower
levels by following the Naming and Design Rules and
discovering re-usable components.
Vocabulary layers, from highest to lowest:
• Applications
• Interfaces (Services, API, GUI, batch, etc.)
• Messages (Request/Response/Error)
• Components (aggregated Terms): logical chunks of data (e.g., Address); Class/Complex Data Type
• Terms (e.g., City): Attribute/Element
TODAY: The vocabulary is distributed across project models.
TOMORROW: The vocabulary is shared across project models.
WHY: A controlled vocabulary supports a broad portfolio of business
applications with a shorter time to market.
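The layering above can be sketched as a small data model. This is only an illustration of how messages might be assembled from reusable components and terms; the class names and the sample `FilingRequest` and `FilingResponse` messages are assumptions made for the example, not part of the IRS XML Vocabulary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    name: str  # lowest layer, e.g., City


@dataclass(frozen=True)
class Component:
    name: str  # aggregated Terms, e.g., Address
    terms: tuple


@dataclass(frozen=True)
class Message:
    name: str  # hypothetical message name
    components: tuple


def vocabulary(messages):
    """Collect the distinct term names used by a set of messages."""
    return {t.name for m in messages for c in m.components for t in c.terms}


# One shared Address component is reused by both messages,
# so the combined vocabulary does not grow with each new message.
city, state = Term("City"), Term("State")
address = Component("Address", (city, state))
request = Message("FilingRequest", (address,))
response = Message("FilingResponse", (address,))

shared = vocabulary([request, response])
```

Because both messages reuse the same `Address` component, `shared` contains only the two term names, which is the effect a controlled vocabulary aims for.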
How this is accomplished
[Diagram: IRS XML Registry and vocabulary workflow. Components shown: IRS XML Vocabulary; IRS XML Registry (Registry stored in Repository); Metadata Repository Service; Registry User Interface; XML Editor; Quality of Design tool; IRS XML NDR; Industry Schema; Project XML Development; EDMO Governance & Harmonization Process. Activities shown: Extension & Revision, Register, Import, Guidance, Edit/Compose. Actor labels: EDMO Action, Project Action.]
Scope
There are two significant topics:
1. IRS XML Standards: Conformance to IRS XML standards in the
development and deployment of the Phase II Form 4868 and Phase III
1040 schema
2. Integration of IRS Forms and Schema
This briefing is intended to provide a synopsis of planned changes:
• MeF schema (XSD) and instance documents (XML)
• Use of an IRS Common XML Vocabulary
• Documentation
• The IRS Tax Return Data Consolidation Strategy
• XML Publishing (integration of IRS forms and schema)
• Integration of MeF schema with IRS back end systems
Topic #1: IRS XML Standards
• The good news: MeF Vocabulary and XML document structure are
aligned with IRS XML Standards in current Phase II development
– 95% of MeF Phase II 1040 XML elements will be in the IRS
Common Vocabulary. Resolving the remaining 5% would be
beneficial but is not necessary to move forward.
– Prior year and shared form schema are interoperable with IRS XML
Standards compliant schema allowing for transition through annual
maintenance (eFile Types were promoted into the common
vocabulary)
– The IRS has worked to minimize impact by building from existing
MeF practices (e.g. single namespace, version management at the
file directory level, and use of eFile Types)
• So, what changes? Schema will be composed of global elements
defined in a reference library referred to as the IRS XML Common
Vocabulary. As MeF expands across multiple form families and tax
years, these standards become essential to our ability to perform
data analysis:
– Produce a single data dictionary across all form schema
– Search, discover, and reuse authoritative terms
– Integrate published form components with XML vocabulary
components
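As a rough illustration of the "single data dictionary" idea, the sketch below collects the globally declared elements from several schema documents into one lookup table. The two schema fragments and every element and type name in them (`CityNm`, `TotalTaxAmt`, and so on) are hypothetical, and this is not the IRS tooling, only a minimal sketch using the Python standard library.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema fragments; names are illustrative only.
COMMON_VOCABULARY_XSD = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="CityNm" type="CityType"/>
  <xsd:element name="StateAbbreviationCd" type="StateType"/>
</xsd:schema>
"""

FORM_XSD = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="CityNm" type="CityType"/>
  <xsd:element name="TotalTaxAmt" type="USAmountType"/>
</xsd:schema>
"""


def global_elements(xsd_text):
    """Return {name: type} for elements declared directly under xsd:schema."""
    root = ET.fromstring(xsd_text)
    return {
        el.get("name"): el.get("type")
        for el in root.findall(f"{{{XSD_NS}}}element")
    }


def build_data_dictionary(*schemas):
    """Merge global elements, rejecting a name declared with two different types."""
    dictionary = {}
    for xsd_text in schemas:
        for name, type_name in global_elements(xsd_text).items():
            existing = dictionary.setdefault(name, type_name)
            if existing != type_name:
                raise ValueError(f"conflicting definitions for {name}")
    return dictionary


data_dictionary = build_data_dictionary(COMMON_VOCABULARY_XSD, FORM_XSD)
```

Only global declarations can be merged this way; locally defined elements have no schema-wide name to look up, which is one reason the move to global elements matters for search and reuse.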
Summary of Changes
• Schema (XSD)
– Locally defined elements convert to globally defined elements
and types (eFile Types on steroids)
– Statement and attachment attributes convert to elements
– Some element names change for harmonization into IRS XML
Common Vocabulary and consistency across forms
• Instance Documents (XML)
– Some element names change for harmonization into IRS XML
Common Vocabulary and consistency across forms
– Statement and attachment attributes convert to elements
• Documentation
– Form-level documentation will need to be maintained separately
from the form schema to allow for the reuse of global elements
– The integration of the schema and PDF will provide an
authoritative mapping of schema elements to the published Tax
Form
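The attribute-to-element change in instance documents might look like the following sketch. The sample fragment, the `OtherExpenses` tag, and the renamed `ReferenceDocumentId` element are assumptions chosen for illustration; the actual planned names are defined by the IRS schema, not by this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance fragment; tag and attribute names are illustrative only.
CURRENT = '<OtherExpenses referenceDocumentId="STMT01"><Amt>250</Amt></OtherExpenses>'


def attributes_to_elements(element, rename=None):
    """Move each attribute of `element` into a leading child element, in place."""
    rename = rename or {}
    for position, (attr, value) in enumerate(list(element.attrib.items())):
        child = ET.Element(rename.get(attr, attr))
        child.text = value
        element.insert(position, child)
        del element.attrib[attr]
    return element


root = attributes_to_elements(
    ET.fromstring(CURRENT),
    rename={"referenceDocumentId": "ReferenceDocumentId"},
)
planned = ET.tostring(root, encoding="unicode")
# planned == '<OtherExpenses><ReferenceDocumentId>STMT01</ReferenceDocumentId><Amt>250</Amt></OtherExpenses>'
```

The data carried by the document is unchanged; only its position moves from an attribute to a child element, so consumers that read the attribute would need a corresponding update.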
What Are The Benefits?
• Integrated form and schema design
• Consistent use of terms
• Quicker deployment of schema
• Improved accuracy for data (form) requirements
Example: Form 4868 (Current Practice vs Planned)
Form 4868 Schema Composition (Current Practice vs Planned)
The composition of the schema will change with the implementation of the IRS XML Standards.
Example: Form 8812 (Current Practice vs Planned)
In many cases the IRS XML Standards will result in few, if any, changes to the instance
documents.
Example: Schedule C (Current Practice vs Planned)
Some instance documents differ because the IRS XML Standards restrict the use of attributes
and require consistent use of terms. This may lead to the addition of complex types or name
changes for harmonization with the IRS Common XML Vocabulary.
Example: Form 2106 (Current Practice vs Planned)
In this example, the terms for Form 2106 were harmonized with Schedule C. The “planned” schemas for
Schedule C and Form 2106 now use the same tag names.
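Harmonization of this kind can be pictured as a mapping from form-specific tag names to one shared vocabulary name. The sketch below is a minimal illustration only: the tags `VehicleMiles`, `TotMiles`, and `TotalMileageCnt` are invented for the example and are not taken from the actual Schedule C or Form 2106 schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from form-specific tags to a shared vocabulary tag.
HARMONIZED_NAMES = {
    "VehicleMiles": "TotalMileageCnt",
    "TotMiles": "TotalMileageCnt",
}


def harmonize(xml_text, mapping):
    """Rename every element whose tag appears in `mapping`; leave others as-is."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        element.tag = mapping.get(element.tag, element.tag)
    return ET.tostring(root, encoding="unicode")


schedule_c = harmonize(
    "<ScheduleC><VehicleMiles>1200</VehicleMiles></ScheduleC>", HARMONIZED_NAMES
)
form_2106 = harmonize(
    "<Form2106><TotMiles>800</TotMiles></Form2106>", HARMONIZED_NAMES
)
```

After harmonization both documents carry the same tag for the same concept, so a single query or data dictionary entry covers both forms.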
Topic #2: XML Publishing Integration
Current Practice
• Tax law specialists (TLS) embed data requirements in forms, instructions & publications
• The published form is designed for a paper filer
• Data requirements are mined from forms, instructions and publications then translated to a record layout
• The record layout is translated to an XML schema
• The XML is translated back to the published form
• ETA publishes schema for eFile stakeholders
• Internal and external systems custom design presentation
• MeF stylesheet development is designed based on the published form
Planned XML Publishing Practice
• Tax law specialists (TLS) document data requirements
• Data requirements mapped to common vocabulary and the schema is composed
• Form updated and schema bound to form
• IRS publishes schema and form
• Internal and external systems reuse published forms
• MeF stylesheet development is streamlined due to the integration of the form and schema with authoritative binding of data elements