1. Introduction:

The goal of workflow is to abstract a series of working steps so that administrators are free to modify the step sequence. It also provides a means of tracking work in progress, which makes transactions of different kinds smoother and more efficient. To help users organize their work, workflow can also sort each user's task list by importance, timeliness, or other criteria specified by the user.

2. Problem Statement:

One of the challenges is finding a generic model that can represent working processes that do not yet exist. In a manufacturing chain, processes can be very sophisticated, with many branches and conditions. A system that handles them would take years to build and would require expertise to operate. DSpace does not need this. Yet we also do not want a system so simple that it cannot handle the regular processes in DSpace. Our target is therefore a reasonably capable system that can be extended.

The three main features we would like to deliver in the workflow engine are:

- the ability to modify a working process by adding, removing, or resequencing the working steps in a workflow;
- a work task list for each end user, so that they can prioritize their work;
- a tracking system to trace the progress of a workflow.

3. Model: (The big picture)

We use a Petri net as the model for workflow instead of a finite state machine because the former can capture a more complex problem domain. A Petri net, as described in one of the publications on the web, can be likened to an old-fashioned office with many tables, each with one or more inboxes on top. Workflow is then the process of passing documents from an inbox on one table to an inbox on another table. When a document is available in the inbox of a table, the table is ready to process it (i.e. the transition is enabled). There are three main components in a Petri net: transition (table), place (inbox), and token (document).
Transitions correspond to the actual activities being carried out and are linked to places in the workflow engine. Each transition has at least one input place and one output place. A place is where tokens are put. To start the workflow engine, a sufficient number of tokens has to be present in all the start places. Each transition has a consumption rate (a number of tokens) for each of its input places. When enough tokens are present in every input place of a transition, the transition is said to be enabled. An enabled transition is ready to be triggered; when it is triggered, the tokens in its input places are consumed and new tokens are then generated in its output places. This in turn enables and triggers the transitions that follow. The process keeps running until the tokens arrive at the end places, where the workflow engine terminates.

3.1. Transition:

Transitions, as described above, are activities such as sending email, filling out approval forms, or making phone calls. There are three possible states for each transition: enabled, being processed, and canceled. A transition is enabled if enough tokens are present in each of its input places. When workflow administrators create a new transition, they need to specify the consumption rate (i.e. the number of tokens) for each of its input places as well as the output rate for each of its output places. In other words, each input place of a transition is weighted, and the transition is enabled only if every consumption rate is met.

When a transition is enabled, it is ready to be triggered. At the moment it becomes enabled, the transition owners are notified and the transition appears on their task lists. When an owner claims the task, the transition is triggered and the tokens are transferred from its input places to its consumption basket. At this point the state changes to being processed.
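
The enabling rule and the consumption basket can be illustrated with a minimal Python sketch. It is not the DSpace implementation: the class layout, the method names (is_enabled, claim, cancel, complete), and the state labels "waiting" and "done" are assumptions made for this example. It covers claiming a task as well as the two possible outcomes, cancellation and successful completion.

```python
from dataclasses import dataclass, field


@dataclass
class Place:
    name: str
    tokens: list = field(default_factory=list)


@dataclass
class Transition:
    name: str
    inputs: list    # (Place, consumption rate) pairs
    outputs: list   # (Place, output rate) pairs
    basket: list = field(default_factory=list)  # (origin Place, token) pairs
    state: str = "waiting"  # "waiting" and "done" are added labels (assumptions)

    def is_enabled(self):
        # Enabled only if every input place meets its consumption rate.
        return all(len(place.tokens) >= rate for place, rate in self.inputs)

    def claim(self):
        # An owner claims the task: tokens move into the consumption basket.
        if not self.is_enabled():
            raise RuntimeError(f"{self.name} is not enabled")
        for place, rate in self.inputs:
            for _ in range(rate):
                self.basket.append((place, place.tokens.pop(0)))
        self.state = "being processed"

    def cancel(self):
        # Give every token back to the place it was taken from.
        for place, token in self.basket:
            place.tokens.append(token)
        self.basket.clear()
        self.state = "canceled"

    def complete(self):
        # Consume the basket, then produce new tokens per the output rates.
        self.basket.clear()
        for place, rate in self.outputs:
            place.tokens.extend(f"from {self.name}" for _ in range(rate))
        self.state = "done"


a, b = Place("A", tokens=["doc"]), Place("B")
submit = Transition("submit info", inputs=[(a, 1)], outputs=[(b, 1)])
submit.claim()      # token moves from A into the basket
submit.cancel()     # the same token returns to A
submit.claim()
submit.complete()   # A is empty, B holds one new token
```

Remembering the origin place alongside each token in the basket is one simple way to make sure a canceled transition returns every token to where it came from.
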
The transition owners can then carry out their designated activities. If for some reason they cancel the task, the tokens are transferred back from the consumption basket to the input places. Otherwise, if the transition completes successfully, the tokens are consumed (i.e. removed from the consumption basket) and new ones are generated and put in the output places according to the output rate of the transition.

3.1.1. Transition trigger method

There are four different ways to trigger a transition after it has been enabled:

3.1.1.1. Message triggered: The transition is triggered by an email message (represented by an envelope in the figure).

3.1.1.2. User triggered: The transition is triggered by a user; an example is a phone call (a fat arrow in the diagram).

3.1.1.3. Automatic triggered: The transition is triggered by the system itself (a thin arrow).

3.1.1.4. Time triggered: The transition is triggered after a certain time period (a clock in the diagram).

3.2. Place:

A place is where tokens are put. Each place is linked to at least one input transition and one output transition, except for two special types of place: the starting place(s) and the ending place(s). Each workflow engine has a list of starting places and a list of ending places, along with the number of tokens required in each. The starting places are where the specified number of tokens must be present in order to start a workflow. The ending places define the ending condition of the workflow (i.e. the specified number of tokens has to be present in all the ending places for the workflow to terminate).

3.3. Token:

There are two types of tokens: generic ones and ones carrying attributes. Generic tokens are not transition specific and are indistinguishable from one another. Tokens carrying attributes, however, are transition specific.
If the transition is canceled and the tokens need to be transferred back from its consumption basket to the input places, it is important to put exactly the same tokens back in the places they came from.

3.3.1. Example:

Take a look at the attached diagrams for the various use cases. The rectangles and circles are transitions and places respectively. Note that there can be times when two transitions have to fight for a token. Cases of or-split, or-join, and-split, and and-join are illustrated.

3.3.1.1. Tokens carrying attributes

Figure 1 (places A to J; transitions: Submit info, Select approver, four Notify approver, Approve/reject)

This case shows a situation in which tokens need to carry attributes. Each attribute is a name-value pair, and there is no limit on the number of attributes a token can carry. The workflow starts with the user entering information about the publication being submitted. Then the user selects four approvers from the committee (which presumably has four or more members) to review the publication. Depending on whom the user selected, the system notifies the corresponding approvers. This is where the tokens need to carry attributes. After the user selects the four approvers, four tokens, each carrying one approver's email address, are put in places C, D, E, and F respectively. After the notifications are sent, the approvers can make their decisions concurrently, and approval is granted if the predefined threshold is passed. (Figure 1)

3.3.1.2. Fight for token

Figure 2

Here we illustrate the case where only some transitions fire because of a limited supply of tokens. Again the workflow starts with the user entering the publication information. Next, the user is asked how many approvers are required, and that number of tokens is put in place C.
The entire committee of approvers is now notified (i.e. all the approval transitions for the committee are enabled), but only those who get hold of a token in place C can fire. Suppose only 4 tokens are put in place C and the committee has 10 members: the 10 members have to fight for the tokens, and only 4 of them will win and carry out their tasks. The decision is made after all the approvers' comments have been collected. (Figure 2)

3.3.1.3. Explicit-OR split versus OR-split

Figure 3 (places A to D; transitions: Submit info, Notify approver, Approve/reject, Resubmit, Cancel submission)

In this use case, we demonstrate the explicit-OR split and the OR-split. Once again, the workflow starts with the user submitting publication information. Then, depending on whether the submission transition succeeds, the workflow branches to either place B or place C (the explicit-OR split). If the submission succeeds, a token is put in place B and the workflow moves on to the notification transition and eventually to the approval transition. If the submission transition fails, a token is put in place C, and both the resubmission transition and the cancellation transition are enabled. Note that only one token is available in their shared input place, so only one of them can be triggered (the OR-split). Which one is triggered is decided on a first-come, first-served basis. Resubmission is triggered by message and cancellation by time. In other words, if the specific triggering message arrives before the set period of time expires, the resubmission transition is triggered; otherwise the cancellation transition is. (Figure 3)

3.3.1.4. Complete example

Figure 4 (places A to L; transitions: Submit info about publication, Select approver, four Notify approver, Approve/reject, Spam user, Update submission info, Cancel submission)

The process starts when a token is put in place A. The transition 'submit info about publication', represented by a rectangle in the diagram, is then enabled. The submitter is presented with an HTML form for the relevant details of the submission. If the submission succeeds, a token is placed in place B, which enables the 'approver selection' transition. For each approver selected, a token is generated, which enables the corresponding notification transition (this is known as the and-split). After each notification is sent, a token is produced, and the decision whether to grant approval is made once all four tokens at places I, J, K, and L have been collected.

On the other hand, if the submission of information fails, the workflow follows the path through place C and the system proceeds to the 'spam user' transition, which messages the user. After that, the two transitions 'update submission info' and 'cancel submission' have to fight for the token at place D (i.e. an OR-split). The envelope next to the 'update submission info' transition indicates that it is message triggered, whereas the clock at the 'cancel submission' box shows that it is time triggered. That means that if a message is received within a certain period of time, 'update submission info' is triggered. Otherwise, when time is up, the submission is canceled. (Figure 4)

4. Implementation Notes:

Implementation focuses on two parts: the XML schema for workflow design, and workflow management. For workflow design, we would like the system to take in any XML schema, parse it, and generate a workflow object. Future effort is expected in this area. Workflow management deals with the handling of the workflow object once it has started.
The system needs to keep the workflow object running by ensuring proper user interaction when transitions are triggered.

4.1. XML schema input for workflow design:

The system should have an XSD file available to interpret the XML files that workflow administrators supply for different workflow objects. The XSD file defines all the attributes of the components in the workflow model, and each XML file provides the specific values. In other words, each XML file generates one workflow object. A parser, based on the XSD file, then processes the XML file and generates the workflow object for later use in workflow management.

4.2. Workflow management:

This part of the program deals primarily with how to operate a workflow object. The six major classes are Transition, Place, Token, Workflow, Workflow manager, and Callback. The first three are components of the Workflow object. The Workflow manager, as its name suggests, manages each workflow object: it keeps the workflow running by placing tokens in the appropriate places before, during, and after transitions, and by enabling transitions whenever possible. When a transition is enabled, the Workflow manager notifies the transition owner or the system, depending on the triggering method of that transition. Lastly, the Callback class handles user interaction when a transition is triggered. It can be viewed as a black box whose input is the workflow manager's indication of which transition was triggered and whose output is the user's result when they are done with their task.

5. Future issues for research:

The next topic to look at is how to select which transition to trigger from a list of enabled system-triggered transitions. Often multiple transitions are enabled simultaneously. If they are all triggered by the system, the order in which the system considers them matters. There are three possible ways to determine this: programmatic (e.g. always pick the 2/3 of the transition owners with fewer tasks on their work lists), manual (a UI in which administrators make the choice), and static (fixed assignment). The system should also be able to prioritize among the three methods of assignment when more than one is available to a transition.

Lastly, it is essential to build a user-friendly UI for the three main parts of the workflow:

5.1. For end users (e.g. job task list, 'things to do' list for today, completed tasks, etc.)

5.2. For admins/managers to track the progress of a workflow

5.3. For workflow design (may involve automatic generation of HTML forms). This is by far the most challenging part because of its complex functionality.
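
As an illustration of the programmatic option mentioned at the start of section 5, the following Python sketch picks the two thirds of the transition owners who have the fewest tasks on their lists. The function name pick_owners, the dictionary input, rounding the count up, and breaking ties by name are all assumptions made for this example; the text above does not specify them.

```python
import math
from fractions import Fraction


def pick_owners(task_counts, fraction=Fraction(2, 3)):
    """Return the least-loaded owners, keeping the given fraction of them.

    task_counts maps an owner's name to the number of tasks on their list.
    The count to keep is rounded up (an assumption of this sketch).
    """
    keep = math.ceil(len(task_counts) * fraction)
    # Fewest tasks first; ties are broken alphabetically for repeatability.
    ranked = sorted(task_counts, key=lambda owner: (task_counts[owner], owner))
    return ranked[:keep]


# Hypothetical owners and task counts.
chosen = pick_owners({"ann": 5, "bob": 1, "cy": 3})
```

A manual or static assignment would replace this function with an administrator's choice or a fixed table, so the selection step is a natural place for an extension point.
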