Estimation of timings for RCASE algorithm to optimise distribution calculation strategy

Background

Warwick Analytics provides ground-breaking software that identifies root causes of faults and inefficiencies in manufacturing industries. The main algorithm (RCASE) was developed from over 10 years of academic research and spun out of Warwick University. RCASE requires a failure marker (usually from a warranty claim or test case) and some data from the life cycle of the product; this could include tolerance data, production data, testing data and user data. It does not require a hypothesis and provides results even with dirty or incomplete data sets. In 2013, Warwick Analytics won Demo God at Demo Fall and was named SAP's most innovative start-up worldwide, as well as raising initial investment from Jensons Solutions.

Project

RCASE calculations can take anywhere from seconds to days. The algorithm is implemented in seven distinct segments, each of which can be parallelised. RCASE can be deployed on the Windows Azure platform, where it can spin up as many machines as required to perform the calculations.

The bottlenecks within the algorithm have not been analysed. However, it is confirmed that bottlenecks occur throughout the algorithm and that they depend on the makeup of the data. There is also a performance hit when distributing an algorithm segment (spinning up machines, splitting the data, setting up queues, merging the results).

The project is to find a concurrency strategy for the RCASE algorithm based solely on the characteristics of the input data. By analysing various input data sets and developing an understanding of the algorithm, it should be possible to model the relative timings of each segment of the RCASE algorithm. Using this model, the optimal concurrency strategy can then be inferred.

Deliverables

WA will provide access to the RCASE implementations, example data sets and the algorithm developers. The student is not expected to program, but to take metrics from the algorithms and devise a method for optimisation.
The deliverables are expected to be:

1. Data analysis strategy and categorisation
2. Relative timings for each segment of RCASE based on Deliverable 1
3. Optimal distribution strategy of RCASE based on Deliverable 2
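
To illustrate how modelled segment timings might feed a distribution decision, the Python sketch below compares a segment's predicted single-machine time with a predicted distributed time that includes the distribution overhead described in the Project section. It is not part of RCASE. The cost formula, the segment names, the numbers and all identifiers (SegmentEstimate, distributed_seconds, best_machine_count, plan) are assumptions made purely for illustration; a real model would be fitted to measured timings.

```python
from dataclasses import dataclass


@dataclass
class SegmentEstimate:
    """Hypothetical timing estimate for one algorithm segment on one data set."""
    name: str                    # illustrative label, not a real RCASE segment name
    serial_seconds: float        # predicted time on a single machine
    fixed_overhead: float        # assumed one-off cost: queues, splitting, merging
    per_machine_overhead: float  # assumed extra cost for each machine spun up


def distributed_seconds(seg: SegmentEstimate, machines: int) -> float:
    """Predicted time if the work divides evenly across `machines` (an assumption)."""
    return (seg.fixed_overhead
            + seg.per_machine_overhead * machines
            + seg.serial_seconds / machines)


def best_machine_count(seg: SegmentEstimate, max_machines: int = 64):
    """Return (machines, seconds); machines == 1 means do not distribute."""
    best = (1, seg.serial_seconds)
    for machines in range(2, max_machines + 1):
        seconds = distributed_seconds(seg, machines)
        if seconds < best[1]:
            best = (machines, seconds)
    return best


def plan(segments, max_machines: int = 64):
    """Map each segment name to its cheapest predicted (machines, seconds)."""
    return {seg.name: best_machine_count(seg, max_machines) for seg in segments}


if __name__ == "__main__":
    # Made-up estimates: one short segment and one long segment.
    estimates = [
        SegmentEstimate("segment_a", 5.0, 60.0, 1.0),
        SegmentEstimate("segment_b", 3600.0, 60.0, 1.0),
    ]
    for name, (machines, seconds) in plan(estimates).items():
        print(f"{name}: {machines} machine(s), about {seconds:.0f} s predicted")
```

Under these assumptions, a segment whose predicted single-machine time is small relative to the distribution overhead stays on one machine, while a long-running segment is spread over the number of machines that minimises its predicted time.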