Larry W. Bryant larry.w.bryant@jpl.nasa.gov
Jet Propulsion Laboratory / California Institute of Technology
August 20, 2009
The research and development resulting in this publication was carried out at the Jet
Propulsion Laboratory, California Institute of Technology, under a contract with the
National Aeronautics and Space Administration.
FOREWORD
The Mission Operations Assurance (MOA) discipline is a continuation of the pre-launch
Mission Assurance program as the project transitions from development to flight operations and continues through the end of the mission. The impetus for MOA for spacecraft operations is in the recommendations of the Mars Climate Orbiter Special
Review Board report and the NASA Public Lessons Learned Entries: 0641 and 0886.
This document provides best practices guidance based on successful implementation of
MOA programs at NASA’s Jet Propulsion Laboratory. It can easily be used as a template for a Mission Operations Assurance Plan (MOAP) that may be tailored and republished for individual programs/projects to provide a MOAP specific to each unique mission.
NOTE: Items in brackets [ . . . ] are intended to be replaced with program/project specific information when this document is used as a MOAP template.
1 INTRODUCTION
This document specifies the MOA activities for the [Project Name] [managed / operated] by [Program Office, or other management organization]. The [Project Name] MOA program is based on the use of existing and tailored procedures and processes that have been proven effective for previous space missions and incorporate both lessons learned from these missions and space operations best practices.
1.1 Objectives
1.1.1 Mission Operations Assurance objective:
Enhance the likelihood of mission success by proactively contributing to the identification, assessment and mitigation of risks through the implementation of MOA processes for the [Project Name] project.
1.1.2 Mission Operations Assurance Plan objective:
Clearly document the approach for satisfying MOA requirements and the application of sound mission assurance practices in implementing MOA processes to ensure that mission objectives are achieved.
1.2 Requirements
MOA requirements are levied upon the MOA function for each flight project or flight instrument managed by [PROGRAM OFFICE]. The MOA Manager (MOAM) is responsible for validating the requirements as applicable to meeting [Project Name] needs. The MOAM is responsible for implementing these requirements throughout the project lifecycle. In the requirements below, the term MOA means the MOA function or
MOAM as appropriate.
1) MOA shall independently assess project risks throughout mission operations.
2) MOA shall independently assess the project’s operational readiness to support nominal and contingency mission scenarios.
3) MOA shall implement the project’s problem/failure reporting system (P/FRS) to comply with [Program Office] standards and requirements.
4) MOA shall provide training on problem reporting for the flight team.
1.3 Scope
This plan describes the responsibilities and approach of the MOAM in satisfying MOA requirements and contributing to the reduction of risks to mission success. Of particular importance to the success of this plan is the transition from development mission assurance to operations mission assurance and the involvement of the entire flight operations team in the MOA process. This plan discusses how the MOAM implements the process elements by providing leadership to the management, payload, spacecraft, ground system, and support teams in the MOA process.
1.3.1 Project Participant Definitions
1.3.1.1 Contractor:
A participant in the project that is under a formal contract for products or services, as evidenced by a statement of work (SOW), a Contract Data Requirements List (CDRL) and Data Requirements Documents (DRDs). Usually contractors are selected for large portions of the project work. Examples include aerospace industry companies.
1.3.1.2 Partner:
A participant in the project that is under formal contract to provide products or services to the project but is not considered a contractor in the classical sense. Business may be conducted via contracts or memorandums. Typical examples include universities, research laboratories and NASA centers.
1.3.1.3 Contributor:
A participant in the project that is providing their products or services free of charge to the project. They typically conduct work with the project under a Memorandum of
Agreement. Examples include foreign governments or partners that are providing "free" hardware or software for a mission.
1.3.1.4 Supplier:
A participant in the project that is providing their products or services through a Purchase
Order (PO), blanket purchase agreements (BPAs) and P-Cards. Suppliers do not have formal contracts, SOWs, CDRLs or DRDs. The appropriate safety and mission assurance requirements are invoked via the purchase order.
1.3.2 Plan Applicability:
This plan is applicable to all in-house [PROGRAM OFFICE] work as well as work with contractors, partners and suppliers for the [Project Name] project.
This plan applies to "contributors" on a best effort basis or as stipulated in other [Project
Name] project "memorandums of agreement" with the contributor.
1.3.3 Plan Implementation
The MOAM manages implementation of the MOA program for [Project Name] in accordance with this plan.
2 MISSION OPERATIONS ASSURANCE OVERVIEW
2.1 Organization
The MOAM reports directly to the [Program Office/Institutional Safety and Mission
Assurance Office or equivalent organization] and the [Project Name] project manager.
He/she is also responsible for providing an independent assessment of the project’s risk
posture (risks, mitigations, system failures, failure reports, corrective actions, and lessons learned) directly to the project manager and higher level safety and mission assurance offices.
2.2 Objectives
1. Improve operational reliability of projects during mission operations.
2. Assess the mission objectives, spacecraft/payload design & capabilities, and flight operations planning and implementation for compatibility and consistency.
3. Facilitate integration of mission operations assurance into the project so that all team members share responsibility for mission success.
4. Assess the design, implementation, integration, validation & execution of robust processes to successfully accomplish mission objectives.
5. Facilitate, when appropriate, Software Quality Assurance (SQA) support during post launch software development, Flight Software modifications, and resolution of software related problem reports.
6. Provide project manager with visibility into mission operations assurance processes and recommend actions as appropriate.
7. Provide independent risk assessment to project manager & higher level safety and mission assurance offices.
8. Provide direct transfer of knowledge and experience to existing and future projects.
2.3 Implementation Approach
Teamwork is the fundamental approach to mission operations assurance for space flight projects. The MOAM provides leadership and guidance to foster a cohesive effort in achieving mission success. The flight team, the End-to-End Information System (EEIS) teams, contractor teams, and university teams which conduct operations, respond to ground and flight problems, maintain and update systems, and in general keep the mission on track, all have a role in the mission operations assurance processes. They must be involved in activities which contribute to the management and mitigation of risk to mission success, including extended missions.
The sections that follow describe how mission operations assurance satisfies its requirements and implements MOA practices in contributing to the project’s success.
3 RISK ASSESSMENT
When assigned to the project, the MOAM works with the development Mission
Assurance Manager (MAM) to ensure the project’s risk management plan extends, or is updated to extend, throughout the entire mission. The MOAM works with the Project
System Engineer (PSE) and MAM in continuing the risk management effort into the operational phase of the mission. The MOAM’s initial step in this regard is
to gain a thorough understanding of the risks identified during development.
Specific areas for information exchange from the MAM to the MOAM include open
Problem Failure Reports (PFRs), Red Flag (mission critical) PFRs, Waivers to [Program
Office] standards and required practices, test as you fly (TAYF) exceptions, the Project’s risk list, and any risks identified by the MAM that are not on the Project’s risk list. As part of the information exchange process, the MAM and MOAM also look to future operational events to identify any residual risk that may be applicable to operations, particularly for critical and first time events.
During mission operations, the MOAM provides independent risk assessments to the
Project Manager and to safety and mission assurance offices. The MOAM presents an independent assessment of the project’s significant risks (red or yellow rated) at regular status and management reviews. The MOAM independently captures residual risks throughout the post-launch risk review process and integrates them into his/her overall risk assessment. The MOAM does this primarily by participating actively in the day-to-day risk review process as an integral part of the flight team. The MOAM’s effort is coordinated with project system engineering as a sanity check and to ensure no identified residual risks are overlooked. The goal of this process is to reach project consensus on the overall risk posture. The MOAM assesses planned operational scenarios and mission timelines for consistency with key project documents such as the Mission and Navigation Plans as well as [Program Office]’s standards and required practices. The MOAM identifies inconsistencies that indicate adverse changes in the project’s risk posture for mission success. On a continual basis, within the Problem Reporting Process, the MOAM assesses the residual risk of problems for future operations, particularly first-time and critical events.
3.1 Critical Events
Post-launch events such as orbit insertion, probe release, or entry, descent, and landing (EDL) are considered critical events and trigger a special project review process that culminates in a Critical Events Readiness Review (CERR). The MOAM performs an independent review and assessment of the Project’s risk posture and operational readiness to facilitate the mitigation of risks to critical event operations.
The MOAM reports the residual risk posture at the Project’s CERR as well as at higher headquarters Safety and Mission Success Reviews (SMSR). The requirement for an
SMSR is frequently reserved for a project’s preparation for launch but may be included prior to critical events with human safety implications. The Stardust return of a sample return capsule (SRC) over the continental United States for landing in Utah is such an event. In that instance, because of the possible risk to the general population, an independent safety and mission success residual risk assessment was required to be communicated up through the NASA HQ Safety and Mission Assurance organization.
For critical events, the MOAM conducts independent reviews and assessments which are shared with the project in preparation for the critical event and associated reviews. The
MOAM reviews the [Program Office] standards and required practices for deviations representing residual risks and applicability to mission operations. He/she provides recommendations to the project for an incompressible test list specific to the critical event
operations. The MOAM reviews lessons learned and advises the project of applicable safety and mission success considerations. The MOAM assesses the project’s pre-launch residual risk items with implications for the critical event. Specific items include single point failures, spacecraft design risks, mission design risks, red flag P/FRs, unverified failures, and major waivers. The MOAM also conducts a review and assessment of the project’s post-launch problem reports and operational waivers. He/she advises the project manager on evolving institutional requirements and expectations in preparation for the event.
NOTE: In the case of Stardust, there was an interval of nearly seven years between launch and Earth return, making retrieval of pre-launch data quite challenging. Projects should ensure pre-launch development information is maintained in an organized and easily accessible format throughout the operations phase of the mission.
Mission trades are also independently assessed for potential risk by MOA in preparation for a critical event. The assessment is completed from a safety and mission success perspective. While initial assessments of trades might point to a particular conclusion, the
MOA recommendation must be based on risk balance. The risk drivers are categorized as major and minor and are typically summarized in chart form. The results are coordinated with and concurred upon by the organization’s safety and mission assurance office and presented to the Project for its review and action.
3.2 In-flight Anomalies
As an active participant on the Project’s Anomaly Resolution Team (ART), the MOAM provides independent assessment of the Project’s implementation of their anomaly resolution process as well as the risks relevant to any recovery action or decision not to take action. The MOAM may also provide data on similar problems which occurred previously on flight projects through a search of applicable problem reporting system data bases. The MOAM works with the ART lead to ensure proper and adequate documentation of the anomaly with particular focus on identifying root cause and recovery action addressing the root cause to the extent possible. The MOAM is responsible for ensuring the risk to the mission introduced by the anomaly is dealt with in an appropriate manner by the Project and communicated with the institutional stakeholders.
4 PROJECT OPERATIONAL READINESS
The MOAM works with the Ground Data System (GDS) Engineer, Mission Operations
System (MOS) Verification & Validation (V&V) Engineer, and Training Engineer to assure that plans are developed for adequate demonstrations of EEIS functionality as well as flight team and EEIS operational readiness. The MOAM has the responsibility to review plans/procedures/scripts for these demonstrations to assess if the objectives are sufficient to fully validate functionality and operational readiness. The MOAM proactively provides recommendations to the appropriate lead for additions or enhancements to demonstration objectives. The MOAM participates in pre- and post- briefings to confirm that objectives are understood and that residual liens against achieving objectives are identified, tracked, and resolved. The MOAM is responsible for
independently assessing the success of these demonstrations and providing those assessments, including risks associated with residual liens, to project management and organizational safety and mission assurance offices.
NOTE: Operational readiness of the flight team and EEIS encompasses processes, procedures, software tools, personnel proficiency (see Section 6.2 below on Position and
Flight Team Training Oversight), ground system reliability, flight/ground system compatibility, uplink products, uplink product validation, spacecraft constraints and flight rules, and simulation tools.
5 PROBLEM REPORTING
The MOAM is responsible for assuring that:
1. Institutional problem reporting standards and guidelines are implemented throughout flight operations,
2. The flight team has access to and is properly trained in the use of the system,
3. The flight team understands the need for initiating a problem report whenever an anomaly, surprise, or unexpected event occurs,
4. Problem reports are initiated in a timely manner,
5. Problem reports are analyzed and resolved prior to impacted mission events, and
6. Corrective actions are reviewed prior to the problem report closure for adequacy to preclude the recurrence of the anomaly.
An important element of precluding recurrence of a problem is identification and correction of causative elements. The following definitions are used within the anomaly resolution process:
Proximate Cause: The immediate (direct) cause for the anomaly. Synonymous with Fault. The event(s) that occurred, including any condition(s) that existed immediately before the anomaly, directly resulted in its occurrence and, if eliminated or modified, would have prevented the anomaly.
Root Cause(s): One of multiple factors (events, conditions or organizational factors) that contributed to or created the proximate cause and subsequent anomaly and, if eliminated or modified, would have prevented the anomaly.
Typically multiple root causes contribute to an anomaly.
Contributing Factor(s): The event(s) or condition(s) that may have contributed to the proximate cause and subsequent anomaly but, if eliminated or modified, would not by itself have prevented it.
5.1 Problem Reporting Overview for Operations
A Problem/Failure Reporting System (P/FRS) is implemented for flight operations no later than the start of operational rehearsals and continues throughout the mission operations phase of the project.
5.1.1 Reporting Incidents, Surprises, and Anomalies
Problems encountered during development typically relate either to hardware or software problems while those encountered during operations may also relate to personnel, processes, and procedures. Consequently, it is not uncommon to maintain the operations related problem reports in a separate data base from the development problem reports.
For example, the Jet Propulsion Laboratory maintains their operational reports in an
Incident, Surprise, Anomaly (ISA) system which functions similarly to their development system but segregates the ISAs from the P/FRs (development reports). The problem reporting system covers all incidents, surprises, and anomalies observed by the flight team on flight/ground hardware, flight/ground software, and test/operational processes/procedures. Problem reports are initiated on events that indicate unexpected performance of the ground system, flight system, or flight team.
5.1.2 Problem/Failure Reports
For anomalies documented in an operational problem report which are determined to be a flight hardware/software design problem or failure, a corresponding hardware problem/failure report (P/FR) is generated. The P/FR generated in operations, unlike those in development, requires no additional signatures. The P/FR is administratively closed with reference to the corresponding operational problem report that documents the closure information. The purpose for creating a P/FR is to archive the hardware/software failure information in the data base researched by other (normally future) projects which plan to use the same/similar hardware or software. (If the institution uses a single data base for development and operational problems, then a separate report may not be necessary.)
5.2 Problem Reporting Implementation for Operations
Problems identified during operations are documented in the institutional P/FRS
(optionally, an industry partner may use an equivalent system that has prior MOAM approval). When a partner uses its own system for failure reporting during operations, the MOAM works with the industry partner to establish a method for MOAM review of the partner's problem reports. Partner problem reports meeting the criticality 2 or 1 definitions below are also documented in the institutional P/FRS.
5.2.1 Timeliness of Reporting
All incidents, surprises, and anomalies are documented as soon as practical into the
P/FRS. In most cases, the problem reports are initiated within 24 hours of the incident.
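The 24-hour guideline above can be checked mechanically when problem reports carry timestamps. The following is a minimal sketch only; the function name, the constant, and the use of timestamps are assumptions for illustration and are not part of any institutional P/FRS.

```python
from datetime import datetime, timedelta

# Assumed threshold: the text says reports are initiated within
# 24 hours "in most cases", so this is a target, not a hard rule.
REPORTING_TARGET = timedelta(hours=24)


def initiated_within_target(incident_time: datetime,
                            report_time: datetime,
                            target: timedelta = REPORTING_TARGET) -> bool:
    """Return True if the report was initiated at or after the incident
    and no later than `target` after it."""
    delay = report_time - incident_time
    return timedelta(0) <= delay <= target
```

A report that misses the target is not invalid; the check is only useful for flagging late reports for discussion.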
5.2.2 Responsibility for Reporting
Any individual observing a reportable incident is responsible for originating a problem report. Additionally, the individual in charge of the activity when the incident occurred is responsible for ensuring a problem report is written. The overarching philosophy is, when in doubt, write a problem report.
5.2.3 Form for Reporting
Problem events are documented by entering the appropriate information (particularly a clear description of the problem event) into the institutional P/FRS or an industry partner system that has prior MOAM approval.
5.2.4 Problem Report Assignment
The MOAM reviews all newly generated problem reports in a timely manner and assigns them to the appropriate team lead. The team lead then assigns an individual the responsibility of resolving and closing the problem report.
5.2.5 Criticality Assessment
The initiator of the problem report assigns an initial criticality assessment when the problem report is written (this capability may be P/FRS implementation dependent). The
MOAM, in coordination with the mission manager or designated representative, assigns a final criticality assessment. The criticality definitions used for this assessment are as follows:
Criticality 1: Represents major impact or threat to achieving mission success as characterized by the occurrence or potential occurrence of one or more of the following conditions:
a) Major degradation of or loss of required functional capability of spacecraft, instrument, or ground data system element
b) Major reduction in lifetime
c) Major delay in test, launch, or mission activities
d) Major degradation of or loss of capability to control spacecraft or instrument
e) Major degradation of or permanent loss of essential engineering or science telemetry or navigation radiometric data
f) Loss of primary engineering redundancy
g) Major increase in difficulty of operations or loss of capability to conduct essential mission operations functions
h) Error in the command uplink process indicating a vulnerability to sending a command with major mission degradation effects
Criticality 2: Represents significant impact or threat to achieving mission success as characterized by the occurrence or potential occurrence of one or more of the following conditions:
a) Significant degradation of required functional capability of spacecraft, instrument, or ground data system element
b) Significant reduction in lifetime
c) Significant delays in test, launch or scheduled mission events
d) Significant impact from delays in acquisition of test or mission data
e) Significant degradation of capability to control spacecraft or instrument
f) Significant degradation of engineering or science telemetry or navigation radiometric data
g) Loss of minor spacecraft or payload function
h) Significant increase in difficulty of operations
Criticality 3: Represents negligible impact or threat to achieving mission success as characterized by the occurrence or potential occurrence of one or more of the following conditions:
a) Negligible degradation of required functional capability of spacecraft, instrument, or ground data system element
b) Negligible delays in test, launch or scheduled mission events
c) Negligible impact from delays in acquisition of test or mission data
d) Negligible degradation of capability to control spacecraft or instrument
e) Negligible degradation of engineering or science data or navigation radiometric data
f) Negligible reduction in lifetime
g) Negligible increase in difficulty of operations
Criticality 4: Represents no impact or threat to achieving mission success as characterized by the occurrence or potential occurrence of one or more of the following conditions:
a) Spacecraft idiosyncrasy requiring no corrective action
b) Surprise which after analysis indicates no risk, adverse effect, or corrective action
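For projects that track problem reports in software, the four criticality levels map naturally onto an ordered enumeration. The sketch below is illustrative only: the member names and the helper function are assumptions, not part of the plan. The helper encodes the Section 5.2 rule that partner-system reports of criticality 1 or 2 are also documented in the institutional P/FRS.

```python
from enum import IntEnum


class Criticality(IntEnum):
    """Criticality levels from Section 5.2.5 (1 is most severe).

    Member names are illustrative; the plan only numbers the levels.
    """
    MAJOR = 1        # major impact or threat to mission success
    SIGNIFICANT = 2  # significant impact or threat
    NEGLIGIBLE = 3   # negligible impact or threat
    NONE = 4         # no impact or threat


def mirror_to_institutional_pfrs(criticality: Criticality) -> bool:
    """Return True if a report held in a partner system must also be
    documented in the institutional P/FRS (criticality 1 or 2)."""
    return criticality <= Criticality.SIGNIFICANT
```

Using an ordered enumeration keeps comparisons such as "criticality 2 or more severe" explicit, with the caveat that a lower number means higher severity.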
5.2.6 Problem Report Closure Meeting
The project conducts a regularly scheduled problem report closure meeting chaired by the
MOAM. The agenda includes review of the criticality ratings and team assignments of the problem reports generated since the last meeting, negotiation of closure dates with the responsible team lead, and status of problem reports due and past due for closure.
5.2.7 Analysis of the Problem
Analysis is conducted and documented in the problem report to clearly define the problem, determine the proximate cause, address the effect of the event on associated elements of the subsystem and system, and determine the necessary corrective actions.
NOTE: For criticality 1 and 2 problem reports, it is desirable to conduct a practical root cause analysis. It is normally sufficient to document that analysis with a Cause Chain
(attached to the report). Fault Tree or Fishbone analysis may be directed by the Mission
Manager or Project Manager when the problem is severe enough to justify the expenditure of the necessary resources.
5.2.8 Corrective Action
Actions taken to correct the problem are documented in the Corrective Action section of the problem report. Corrective actions that include changes to the project configuration and/or documentation should include change requests processed in accordance with the
project’s configuration management plan and referenced on the problem report prior to closeout review and approval. Verification of corrective action may include analyses, testing, and/or demonstration. After completion of corrective action, when feasible, the item is subjected to the conditions under which the problem occurred to verify the effectiveness of the corrective action.
5.2.9 Lessons Learned
Each problem report is reviewed to determine if it raises an issue that may affect the success of other current or future projects. Where the initiator or reviewer of a problem report indicates that it may raise a “cross-institution issue,” the problem report is assessed to see if an input to the institution’s lessons learned system is warranted.
5.2.10 Problem Report Review, Approval, and Closure
Following the completion of the analysis, verification, and corrective action steps, the assignee normally signs the problem report to initiate the closure process. The team lead then signs after reviewing and concurring on the closure information. System engineering reviews and signs the problem report. Configuration management, particularly for reports with associated change requests, then signs verifying that the appropriate change requests have been processed. Additionally, if the project has a project or chief engineer, they and/or the mission manager should sign criticality 1 and 2 problem reports, and the project manager should sign criticality 1 problem reports. With responsibility for providing an independent assessment of risk and operations readiness, the MOAM then reviews and signs the problem report when they concur with the closure action.
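The signature sequence described above depends on the criticality of the report. A minimal sketch of that dependency follows. The role labels and the function are hypothetical, and the text's "they and/or the mission manager" is interpreted here as listing both signers when a project or chief engineer exists, which is an assumption a project would tailor.

```python
def closure_signers(criticality: int,
                    has_project_engineer: bool = True) -> list[str]:
    """Return the ordered list of closure signers for a problem report.

    Assumes configuration management signs every report; the text
    emphasizes this for reports with associated change requests.
    """
    signers = ["assignee", "team lead", "system engineering",
               "configuration management"]
    if criticality in (1, 2):
        if has_project_engineer:
            signers.append("project or chief engineer")
        signers.append("mission manager")
    if criticality == 1:
        signers.append("project manager")
    # The MOAM signs last, after concurring with the closure action.
    signers.append("MOAM")
    return signers
```

For example, `closure_signers(3)` returns the four routine signers followed by the MOAM, while a criticality 1 report adds the project manager immediately before the MOAM.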
5.3 Command File Errors
A problem event is categorized as a command file error if it describes an error in a command file that was sent to the spacecraft; or an error in the approval, processing, or uplinking of a command file that was sent to the spacecraft; or the omission of a needed command file that was not sent to the spacecraft; regardless of the consequence on the spacecraft.
5.3.1 Command file error locations include:
a) Initiation
b) Generation
c) Testing
d) Review
e) Approval
f) Processing for radiation
g) Uplinking
5.3.2 Command file error causes include:
a) Human errors
b) Uplink Process deficiency
c) Testbed/simulation/modeling setup errors
d) Post-launch flight software error
e) Ground software error
f) Configuration management deficiency
g) Tracking configuration errors (e.g. DSN, TDRS)
5.3.3 Command file error corrective actions include:
a) Flight team training
b) Procedural command uplink process modification
c) Automated command uplink process modification
d) Flight software modification
e) Ground software modification
f) Configuration management modification
5.3.4 Command file categories include:
a) Interactive command process
b) Non-interactive command process
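The location, cause, and category lists above form a small classification scheme that can be captured as enumerations so that command file errors are tagged consistently and tallied for trending. This is a sketch under stated assumptions: the class names, field names, and report identifiers are hypothetical, and the corrective action list is omitted for brevity.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class ErrorLocation(Enum):
    INITIATION = "Initiation"
    GENERATION = "Generation"
    TESTING = "Testing"
    REVIEW = "Review"
    APPROVAL = "Approval"
    PROCESSING_FOR_RADIATION = "Processing for radiation"
    UPLINKING = "Uplinking"


class ErrorCause(Enum):
    HUMAN_ERROR = "Human errors"
    UPLINK_PROCESS = "Uplink process deficiency"
    TESTBED_SETUP = "Testbed/simulation/modeling setup errors"
    FLIGHT_SOFTWARE = "Post-launch flight software error"
    GROUND_SOFTWARE = "Ground software error"
    CONFIG_MANAGEMENT = "Configuration management deficiency"
    TRACKING_CONFIG = "Tracking configuration errors"


class CommandProcess(Enum):
    INTERACTIVE = "Interactive command process"
    NON_INTERACTIVE = "Non-interactive command process"


@dataclass(frozen=True)
class CommandFileError:
    report_id: str  # hypothetical problem report identifier
    location: ErrorLocation
    cause: ErrorCause
    process: CommandProcess


def count_by_cause(errors):
    """Tally command file errors by cause, e.g. for trend reporting."""
    return Counter(error.cause for error in errors)
```

A tally by cause gives the MOAM a quick view of whether corrective actions should focus on training, process, software, or configuration management.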
5.4 Tracking and Reporting
Mission operations assurance performs anomaly tracking and reporting. This process includes reporting on all open criticality 1 and 2 problem reports at the NASA Quarterly reviews (for NASA sponsored projects) and monthly institutional safety and mission assurance organization reviews. Additionally, at a minimum, all open criticality 1 and 2 problem reports are assessed for risk with respect to upcoming critical events and presented at the project’s Critical Events Readiness Reviews.
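Selecting the reports to present at these reviews amounts to filtering the problem report list for open items of criticality 1 or 2. The following sketch assumes a simple record type; the `ProblemReport` class, its fields, and the identifier format are hypothetical and do not describe the institutional P/FRS schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProblemReport:
    report_id: str    # hypothetical identifier format
    criticality: int  # 1 (most severe) through 4, per Section 5.2.5
    is_open: bool


def reports_for_review(reports):
    """Return open criticality 1 and 2 reports, most severe first."""
    selected = [r for r in reports if r.is_open and r.criticality in (1, 2)]
    return sorted(selected, key=lambda r: r.criticality)
```

Closed reports and open criticality 3 and 4 reports are excluded, since the text sets criticality 1 and 2 as the minimum reporting set.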
6 OPERATIONS TRAINING
MOA is involved in operations training from two perspectives. First is providing training to the flight team to prepare them to participate in the MOA contribution to achieving mission success. Second is the responsibility to assess if the training of the flight team, at both the individual position and the flight team levels, is sufficient to prepare the team to execute the nominal mission, respond to contingency situations, and minimize the risk of errors that could threaten mission success.
6.1 Operations Assurance Training
The MOAM assures the Problem/Failure Reporting function is addressed in the flight team training curriculum. Lesson plans are included as Appendix B to this document.
6.2 Position and Flight Team Training Oversight
A key element of operational readiness is the Project’s training program. As part of fulfilling his/her responsibility for assessing operational readiness, the MOAM reviews the plans developed for each team’s operations position training as well as the project’s
plan for overall, or system level, flight team training. There are four key ingredients to effective position training (typically referred to as training and certification at the operations position level) which the MOAM looks for to be convinced of the effectiveness of position training. The Team Training and Certification Plan:
1. Defines each operational role (aka position description).
2. Identifies the experience, knowledge, and training necessary to develop competence to execute the operational role.
3. States the criterion an individual must satisfy to demonstrate qualification and competence to perform the operational role (aka certification criteria).
4. Establishes a reporting structure to provide project management sufficient information to realistically evaluate an individual's readiness to support mission operations.
Key ingredients the MOAM looks for in system level training are exercising interfaces, diligence in using operational tools and procedures, addressing contingency situations, meeting established objectives, and training under flight-like conditions.
7 OPERATIONAL REQUIREMENTS
The MOAM has a responsibility to work closely with the MAM, the PSE and the MOS
Engineer (MOSE) to assure that operations requirements are implemented into the flight hardware, software and operations designs. He/she also works with the MAM, the PSE, and the MOS V&V Engineer to assure that a process is in place to verify the adequacy, completeness, and compliance with operational requirements. The MOAM participates in operations peer reviews and the Operations Readiness Reviews (ORR) to assure integration issues between development and operations are addressed with achievable plans for resolution prior to launch. He/she is responsible for providing project management and the institutional safety and mission assurance organization with an independent assessment of the MOS V&V program’s completeness and capability to demonstrate that operations requirements are met by the MOS and the Flight System.
8 PROJECT PLANNING
Key mission documents, specifically the Mission Plan, Navigation Plan, and Operations
Plan are continually assessed by the flight team during Mission Planning meetings, which include MOAM participation. As potential changes to these documents, which are maintained under configuration control throughout the mission, are identified, they are discussed at the Mission Planning meetings. A Mission Change Request (MCR) must be processed and dispositioned to implement any changes to these documents. The MOAM and the project’s configuration management (CM) personnel are responsible for ensuring rigor in following the change management process. A key ingredient of this process is the adequate review and impact assessment by knowledgeable members of the project and line organizations. The MOAM assesses MCRs to ensure the appropriate peer and project review is documented. He/she also assesses the risk associated with making these changes and provides that assessment to project management.
9 FLIGHT RULES
Flight rules provide constraints to operations to ensure that the flight system is not damaged by planned operational activities. Every uplink product is specifically checked for flight rule compliance as required by system and subsystem procedures and checklists. The MOAM reviews official releases of team procedures and checklists associated with the uplink process for inclusion of specific (manual or automated) checks for compliance with applicable flight rules. An element of this review process is review of official releases of the flight rules document to correlate each flight rule with an appropriate (manual or automated) check in team documentation.
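The correlation of flight rules with documented checks can be pictured as a simple traceability audit. The following is a minimal sketch only, assuming hypothetical flight rule identifiers (such as FR-001) and a hypothetical record of which checklist step verifies which rule; none of these names or structures come from an actual project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    """One step in a team procedure or checklist (hypothetical structure)."""
    document: str    # procedure or checklist name
    step: str        # step identifier within that document
    rule_id: str     # flight rule the step verifies
    automated: bool  # True for a tool check, False for a manual check


def uncovered_rules(rule_ids, checks):
    """Return the flight rules that no documented check covers."""
    covered = {c.rule_id for c in checks}
    return sorted(set(rule_ids) - covered)


# Illustrative data only: rule IDs and document names are made up.
rules = ["FR-001", "FR-002", "FR-003"]
checks = [
    Check("Uplink Checklist", "4.2", "FR-001", automated=True),
    Check("Sequence Review Procedure", "7", "FR-003", automated=False),
]

print(uncovered_rules(rules, checks))  # ['FR-002']
```

Any rule reported by such an audit would be a candidate finding for the MOAM's review of the released procedures and checklists.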
Flight Rules are maintained under configuration control and require that an MCR be processed and approved by the Mission Manager or Project Manager (depending on criticality/category) prior to implementation of the change. Prior to submittal of the MCR, flight rules are reviewed and verified for correctness, including wording, criticality rating, applicability by mission phase, references, implementation strategy, and traceability to their source.
Event specific deviations to Flight Rules require that a Waiver be processed and approved by the MOAM and the Mission Manager and/or the Project Manager prior to uplink of the products for which the deviation is needed. Waivers to Flight Rules identified as criticality/category “A” require the Project Manager’s approval.
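The waiver approval conditions can be expressed as a small decision function. This is a sketch under one reading of the paragraph above: MOAM approval is always required, category “A” additionally requires the Project Manager, and for any other category either the Mission Manager or the Project Manager is assumed to suffice. The role labels and the category "B" used in the example are hypothetical.

```python
def waiver_cleared_for_uplink(category, approvers):
    """Return True if the recorded approvals permit uplink under this reading.

    category  -- criticality/category of the waived flight rule, e.g. "A"
    approvers -- set of roles that have approved: "MOAM", "MM", "PM"
    """
    if "MOAM" not in approvers:
        return False
    if category == "A":
        return "PM" in approvers           # category "A" needs the Project Manager
    return bool({"MM", "PM"} & approvers)  # assumption: either manager suffices


print(waiver_cleared_for_uplink("A", {"MOAM", "MM"}))  # False
print(waiver_cleared_for_uplink("B", {"MOAM", "MM"}))  # True
```

A project tailoring this plan would replace the assumed rule for non-“A” categories with its own approval matrix.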
The MOAM includes an assessment of the project’s comprehensiveness and rigor in implementing and documenting flight rule checking in his/her monthly project and institutional safety and mission assurance organization review presentations.
10 MOA REPORTING
The MOAM provides monthly risk and anomaly resolution status, as well as other significant operations assurance items at Mission Management Reviews (MMR), Project
Status Reviews (PSR), Quarterly Reviews and institutional safety and mission assurance organization reviews. He/she also presents assessments of risk, project readiness, anomaly resolution, and other issues germane to mission success at Critical Events
Readiness Reviews (CERR). A receivables/deliverables list should be negotiated between the MOAM and the project as an inherent element of developing the MOA program to assure a positive and adequate interchange of information to enhance the risk reduction and mission success objectives of the project.
11 PROJECT OPERATIONS CONFIGURATION MANAGEMENT
The MOAM is an active participant in the change control process and continually evaluates compliance with the process. Specifically, the MOAM:
1. Participates in the project Change Board,
2. Participates in command conferences for planned flight software uploads and critical events,
3. Reviews change requests for risk to elements of the flight system and the mission,
4. Assesses the potential risk to the mission and flight system of waivers to mission or flight rules and other constraints or requirements documents,
5. Approves those waivers that have an acceptable risk level,
6. Assures all approved changes for the project are implemented, and
7. Assures, via spot-check, that the implementation of applicable changes complies with the change documentation.
12 INTERFACE WITH OTHER QUALITY/OPERATIONS ASSURANCE FUNCTIONS
12.1 Software Quality Assurance
The MOAM works with the Project to facilitate the appropriate support from the institutional safety and mission assurance organization SQA office for in-flight software development, flight software modifications, and the resolution of flight software anomalies. The participation of SQA is important for ensuring that good practices are followed and that the risks inherent in changing previously validated software are identified and adequately addressed.
12.2 Partner Operations Assurance
The MAM and the MOAM coordinate with industry and academic partners to develop a plan(s) that assures adequate partner operations assurance and interfacing with institutional mission operations assurance. The MOAM oversees the implementation of the plan(s). The MOAM monitors the anomaly reporting and closure process of the partner(s). The MOAM solicits partner inputs for CERRs, ORRs, MMRs, and peer reviews.
13 LESSONS LEARNED ASSESSMENT
The mission operations assurance activity includes investigating issues, anomalies and idiosyncrasies discovered on one mission for applicability to other missions. The
MOAM works with the other project MAMs/MOAMs as part of the regular institutional safety and mission assurance organization review process to discuss the applicability of one project’s problems to other projects. The appropriate problem reports are then generated on other projects where applicable.
APPENDIX A – ACRONYMS & ABBREVIATIONS
ATLO ...................Assembly, Test and Launch Operations
BATC ...................Ball Aerospace & Technologies Corp
CCB......................Change Control Board
CDR .....................Critical Design Review
CERR ...................Critical Event Readiness Review
CM .......................Configuration Management
CoFR ....................Certificate of Flight Readiness
DR ........................Discrepancy Report
DSN......................Deep Space Network
EEIST ...................End-to-End Information System Test
EOSDT .................Engineering Operations System Design Team
FR ........................Failure Report
FSW .....................Flight Software
FTA ......................Fault Tree Analysis
GDS......................Ground Data System
IPAC ....................Infrared Processing and Analysis Center
ISA .......................Incident Surprise Anomaly report
JPL .......................Jet Propulsion Laboratory
LMA .....................Lockheed Martin Astronautics
MCCB ..................Multi-mission Change Control Board
MCO ...................Mars Climate Orbiter
MCR .....................Mission Change Request
MM ......................Mission Manager
MMO....................Mission Management Office
MMR ....................Monthly Management Report
MOA ....................Mission Operations Assurance
MOAM .................Mission Operations Assurance Manager
MOAP ..................Mission Operations Assurance Plan
MOS .....................Mission Operations System
MPIAT ................Mars Program Independent Assessment Team
MSOO ..................Mars Surveyor Operations Office
NAV .....................Navigation Team
OET ......................Observatory Engineering Team
ORR .....................Operations Readiness Review
ORT......................Operational Readiness Test
OSC ......................Orbital Sciences Corporation
PIRS .....................Product Integrity Reporting System
P/FR .....................Problem/Failure Report
PRS ......................Problem Reporting System
RFP ......................Request For Proposal
RTO......................Real-Time Operations
S/C........................Spacecraft
SCT ......................SpaceCraft Team
SDL ......................Space Dynamics Laboratory
SDP ......................Software Development Plan
SES .......................Space Exploration Systems
SESMO ................Space Exploration Systems Mission Operations
SIRTF ...................Space Infrared Telescope Facility
SPR ......................Software Problem Report
STL ......................Spacecraft Test Laboratory
STRATCOM ........Strategic Command
TDRSS .................Tracking and Data Relay Satellite System
TRR ......................Test Readiness Review
V&V .....................Verification & Validation
WISE ....................Wide Field Infrared Survey Experiment
WSGT ..................White Sands Ground Terminal
APPENDIX B – MOA LESSON PLANS
LESSON ONE
Problem Report Initiation
Duration: One (1) hour
Goals: Enable participants to properly use the appropriate reporting system to document problems/failures encountered during mission operations.
Objectives:
1. Participants can identify the available problem/failure reporting systems and the types of problems/failures for which they are applicable.
2. Participants can describe when an apparent problem/failure should be documented in a reporting system.
3. Participants can log in and properly initiate a problem report.
Prerequisites:
1. Participants should have an institutional username and password that allows access to the Problem Reporting System.
2. A generic project exists within the current Problem Reporting System.
3. Access to meeting place (TBD – optional)
Materials:
1. Laptop with wireless internet access and browser software
2. Copy of the institution’s Anomaly Resolution Standard
Lesson Description: Introduce participants to the need for the problem reporting system and when the problem reports are generated. Teach participants the use of the tools to initiate a problem report by allowing them to generate a report using the operational reporting system.
Procedure:
1. Lead discussion of PRS process from problem identification to closure documentation.
2. Demonstrate and facilitate participant initiating a problem report using the “Test” project.
3. Highlight MOA areas of responsibility for problem report form completion.
4. Review signature and closure process using a sample “Test” project problem report.
Assessment:
Verify required items are completed properly by participants for initiated problem reports.
1. Title
2. Project
3. Date of Incident
4. Observed Location
5. Mission Phase
6. Priority for resolution
7. Initiator’s Criticality Rating
8. Clear description of observed issue (expected outcome vs observed outcome) and (sub-)system(s) involved/affected
Provide feedback to participant on any areas of improvement and notable excellence.
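An instructor could pre-screen the required items from the assessment list above before reviewing each report by hand. The sketch below is illustrative only: the field names are hypothetical keys derived from that list, not the schema of any actual Problem Reporting System, and the sample report is invented.

```python
REQUIRED_FIELDS = [
    "title",
    "project",
    "date_of_incident",
    "observed_location",
    "mission_phase",
    "priority_for_resolution",
    "initiator_criticality_rating",
    "description",
]


def missing_fields(report):
    """Return required fields that are absent or blank in a report dict."""
    return [f for f in REQUIRED_FIELDS if not str(report.get(f) or "").strip()]


# Invented sample report for the "Test" project; values are placeholders.
sample = {
    "title": "Unexpected telemetry dropout",
    "project": "Test",
    "date_of_incident": "2009-08-20",
    "mission_phase": "Training",
    "description": "Expected continuous telemetry; observed a gap.",
}

print(missing_fields(sample))
# ['observed_location', 'priority_for_resolution', 'initiator_criticality_rating']
```

A check like this only confirms that fields are filled in; judging whether the description is clear remains a manual part of the assessment.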
LESSON TWO
Problem Report Processing and Closure
Duration: One (1) hour
Goals: Enable participants to properly document and close reports for problems/failures encountered during mission operations.
Objectives:
1. Participants can identify the available problem/failure reporting systems and the types of problems/failures for which they are applicable.
2. Participants can describe when an apparent problem/failure should be documented in a reporting system.
3. Participants can log in and properly update a problem report.
Prerequisites:
1. Participants should have an institutional username and password that allows access to the Problem Reporting System.
2. A generic project exists within the current Problem Reporting System.
3. Access to meeting place (TBD – optional)
Materials:
1. Laptop with wireless internet access and browser software
2. Copy of institutional Anomaly Resolution Standard
Lesson Description: Teach participants the use of the tools to update a problem report by allowing them to complete a report using the operational reporting system.
Procedure:
1. Lead discussion of PRS process from problem identification to closure documentation.
2. Demonstrate and facilitate participant updating a problem report using the “Test” project.
3. Highlight MOA areas of responsibility for problem report form completion.
4. Demonstrate signature and closure process using a sample “Test” project problem report.
Assessment:
Verify required items are completed properly by participants for initiated ISAs.
1. Title
2. Project
3. Date of Incident
4. Observed Location
5. Mission Phase
6. Priority for resolution
7. Initiator’s Criticality Rating
8. Clear description of analysis of cause of problem and any immediate responses appropriate to initially restore system capability.
9. Clear description of corrective action to resolve immediate problem and preclude recurrence in the future.
Provide feedback to participant on any areas of improvement and notable excellence.
APPENDIX C – FLIGHT PRACTICES AND DESIGN PRINCIPLES
This appendix contains a select set of good flight practices and design principles which may be used as focus items for Mission Operations Assurance.
Flight Practices
1. Each project independently checks each process involved in spacecraft control (e.g., command and sequence generation, navigation orbit determination, and maneuver design).
   Note: “Independent check” means that someone other than the originator or originating team (e.g., another person, a peer review, or software automation) checks the process.
2. Projects plan continuous coverage for tracking and data services and appropriate staffing by operations personnel as follows:
   - For at least the initial 7-day period of flight operations starting at launch
   - For at least the final 7-day period of flight operations leading up to and through mission critical events
3. Each project validates sequences for mission-critical events* prior to uplink. Validation includes all of the following:
   - Testing on a testbed (both nominal and off-nominal sequence execution)
   - Project internal sequence review
   - Peer (project external) review (walkthrough) of the sequence and the testing performed on the testbed
   * Note: Mission-critical events are those that if not executed properly and in a timely manner could result in failure to achieve mission success (e.g., orbit insertion, EDL).
4. Each project validates sequences for first-time events (e.g., first use of propulsion system in flight) prior to uplink. This validation includes:
   - Testing on a testbed (at least nominal sequence execution)
   - Project internal sequence review
   - Identifying what could go wrong and developing contingencies if warranted.
   Uplink commands that are included in any contingency response are also validated (e.g. by testing on a testbed) prior to their use.
5. Prior to mission-critical event activation, projects ensure the integrity of the on-board memory contents, e.g. via memory or checksum readout.
   Note: This action ensures memory contents have not been corrupted (e.g. as a result of single event effects) after command receipt, validation, and on-board storage.
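The checksum readout in practice 5 amounts to comparing a checksum computed over the stored memory contents with the value recorded when the contents were validated on the ground. The sketch below illustrates the idea with CRC-32 from the Python standard library; the choice of CRC-32 and the byte values are assumptions for illustration, since the actual checksum algorithm is defined by each project's flight software.

```python
import zlib


def memory_intact(readout: bytes, expected_crc: int) -> bool:
    """Compare the CRC-32 of a memory readout with the value recorded on the ground."""
    return zlib.crc32(readout) == expected_crc


# Illustrative only: this image stands in for the ground copy of a validated sequence.
validated_image = bytes(range(16))
expected = zlib.crc32(validated_image)

corrupted = bytearray(validated_image)
corrupted[5] ^= 0x04  # flip one bit, as a single event upset might

print(memory_intact(validated_image, expected))   # True
print(memory_intact(bytes(corrupted), expected))  # False
```

A mismatch would indicate that the stored contents differ from what was validated and should be resolved before the mission-critical event is activated.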
Design Principles for development and operations
1. Uplink/Downlink capabilities - The mission design shall provide for a real time downlink capability during mission critical events.
2. Critical Sequence Telemetry Monitoring - Mission critical event data and visibility of mission-critical deployments shall be available via real time telemetry.
3. Power Cycling Of Mission-Critical Hardware - Power cycling of mission-critical hardware shall be avoided unless required to meet mission requirements, and is within the known capability of the hardware being cycled.
   Rationale: Unnecessary power cycling represents an unnecessary mission risk.
4. Powering off the RF Downlink - After in-flight turn on, the spacecraft downlink RF transmitter hardware (e.g., exciters, power amp) shall not be turned off during nominal flight operations; a downlink signal shall be transmitted continuously during the entire mission. An exception is the momentary cycling of power to the transmitter chain hardware that may result via system autonomous fault protection responses.
   Rationale: In the absence of a downlink signal, ground-initiated recovery actions must consider a much broader set of spacecraft anomaly scenarios.
5. Thermal Cycling - Flight hardware thermal cycling shall be limited to that required to meet mission requirements, and within established limits of design capability.
   Rationale: Thermal cycling can fatigue the hardware, e.g. solder joints, printed wiring board plated thru holes, and flex cable connections, and thus is to be avoided unless required by the mission, and consistent with the hardware design capability.
6. Use of Prime Units - Prime selected hardware elements shall remain in use for all operations.
   Note: Switching to back-up units, e.g. for aliveness check, is not done as long as the prime unit continues to serve mission needs.
   Note: This operating tenet should be considered in the design of the flight system, e.g. where unit performance may change with extended storage life.
7. Swapping to Redundant Hardware - Swapping mission-critical hardware to a redundant element shall be limited to fault recovery actions taken to assure health/safety and/or meet mission requirements, unless there is observed via telemetry unacceptable degradation of the primary unit, and the risk trade, subject to institutional review, indicates pre-empting the on-board fault protection by ground control to be a prudent approach.
8. Simultaneous use of Prime and Redundant Hardware - Simultaneous use of selected prime and redundant hardware to enhance reliability/performance for accomplishing mission critical activities shall be considered only after the operation has been verified and validated, and shall be approved at the Critical Events Readiness Review.
9. Critical Hardware Power Cycling for Power Management - In-flight routine power cycling of critical hardware for power margin management purposes shall be avoided, unless cycling is essential to mission viability and the risk is demonstrated to be acceptable.
   Note: At the outset, the spacecraft power source should be sized to accommodate all critical loads in the required mission modes, rather than rely on power cycling critical components to manage power margins.
   Rationale: Cycling of critical hardware introduces failure modes that are to be avoided if possible.