PART 2 – SECTION C
REQUIREMENT SPECIFICATIONS FOR DATA SCIENCE PLATFORM

CONTENTS
1. INTRODUCTION
2. SCOPE OF WORK
3. SYSTEM DEVELOPMENT AND IMPLEMENTATION FOR DATA SCIENCE PLATFORM
4. SYSTEM ADMINISTRATION AND MAINTENANCE FOR DATA SCIENCE PLATFORM
5. SYSTEM INTEGRATION FOR DATA SCIENCE PLATFORM
6. HARDWARE MAINTENANCE FOR DATA SCIENCE PLATFORM (OPTIONAL)
7. SYSTEM PERFORMANCE TEST FOR DATA SCIENCE PLATFORM
8. TRAINING REQUIREMENTS FOR DATA SCIENCE PLATFORM

1. INTRODUCTION

1.1 Background

(a) The Data Science Platform (DSP) will provide data science capabilities such as data analytics, machine learning, an integrated web-based development environment, distributed version control, and model development and deployment to meet users' business and analytics needs. It shall integrate with the Enterprise Data Management Platform (xDASH), covered under Part 2 Section B1, which supports data transformation, sharing and discovery.

(b) The Data Science Platform shall be a scalable, multi-tenant architecture with high availability. The Tenderer shall describe in detail how multi-tenancy is implemented in its proposal. The proposed solution shall address details such as isolation of data, compute, code and content, accounts for administrators, the different administrative accounts, and the level of controls for accounts.

2. SCOPE OF WORK

2.1 The Supplier shall be responsible for the following:

(a) Design, development and implementation of the Data Science Platform;
(b) Maintenance of the Data Science Platform, including all system interfaces and data integrations within the system;
(c) Integration with the Enterprise Data Management Platform (xDASH)'s Data Access Tier (covered under Part 2 Section B2);
(d) Supply, setup, installation and maintenance of hardware equipment supporting the Data Science Platform (optional); and
(e) Training for Power Users and End Users on the Data Science Platform capabilities and end-to-end data science workflow.

2.2 The Supplier shall provide the schedule of rates for Service Request Man-Days as specified in Part 3 Annex I, Cost Schedule Table 11. These Service Request man-days shall cater for the following throughout the entire project lifecycle:

(a) Modifications, enhancements and sustainment of the Data Science Platform covered under Part 2 Section C; and
(b) Security review and testing services covered under Part 2 Section D.

2.3 The high-level architecture of the Data Science Platform after implementation on the On-Premise Cloud (OPC) is as follows. The diagram also shows the integration with the Enterprise Data Management Platform (xDASH)'s Data Access Tier (covered under Part 2 Section B2).

[Figure: High-level architecture of the Data Science Platform and its integration with the xDASH Data Access Tier]
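For illustration only, the sketch below shows how a Data Science Platform user might query data exposed through the xDASH Data Access Tier from a Python notebook. It assumes the Data Access Tier exposes a PostgreSQL-compatible SQL endpoint; the host name, database, credentials and table name are hypothetical placeholders and are not mandated by this specification.

    # Illustrative only: assumes a PostgreSQL-compatible SQL endpoint on the
    # xDASH Data Access Tier; host, database, credentials and table names are
    # hypothetical placeholders.
    import os
    import pandas as pd
    from sqlalchemy import create_engine, text

    engine = create_engine(
        "postgresql+psycopg2://dsp_user:"
        + os.environ["XDASH_PASSWORD"]
        + "@xdash-data-access-tier:5432/workload_b"
    )

    with engine.connect() as conn:
        df = pd.read_sql(text("SELECT * FROM sample_table LIMIT 1000"), conn)

    print(df.describe())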
2.4 The Supplier shall design a multi-tenanted Data Science Platform that provides data science capabilities such as data analytics, machine learning, an integrated web-based development environment, distributed version control, and model development and deployment. There shall be two (2) tenants at system commissioning.

3. SYSTEM DEVELOPMENT AND IMPLEMENTATION FOR DATA SCIENCE PLATFORM

3.1 The Supplier shall be responsible for the design, development and implementation of a multi-tenanted Data Science Platform for data science capabilities.

3.2 The Data Science Platform shall provision data analytics, machine learning, integrated web-based development environment, distributed version control, and model development and deployment tools to facilitate data science functionalities with capabilities as follows:

3.2.1 General Requirements

(a) Allow role-based authorisation control for access management and SSO (e.g., able to integrate with Active Directory / SAML (Security Assertion Markup Language)) through integration with the Authority's Identity & Access Management System (IAMS), which provides key functions such as Single Sign-On (SSO) and two-factor authentication (2FA), for the Data Science Platform and its subcomponents; and
(b) Allow integration with databases (e.g., RDBMS, NoSQL databases, Object Store) and data virtualisation tools (e.g., Denodo, Tibco) to allow merging, joins, transformation and querying of data for data management.

3.2.2 Data Storage and Management Capabilities

(a) Allow parallel queries and workloads to be executed across multiple nodes;
(b) Support multi-tenancy – support segregation of data among multiple teams of users, with each team able to have fine-grained access to their data;
(c) Provide access control for System and/or database administration so that administrator access and rights are not available to non-administrators (e.g., via permissions, masking);
(d) Leverage distributed processing frameworks to run queries (e.g., Hadoop MapReduce, Apache Spark, etc.); and
(e) Allow queries to be scheduled or batched.

3.2.3 Data Privacy-Preservation

(a) Allow policies for encryption, anonymisation, masking and tokenisation, using techniques such as homomorphic encryption, K-anonymity, differential privacy and others, to be applied to data;
(b) Allow data to be ingested from files, databases or big data sources and privacy measures to be applied to the data without the need for a temporary staging area;
(c) Allow users to define their own sets of privacy settings (e.g., masking of columns and rows) (see the illustrative sketch following this list);
(d) Able to run defined sets of privacy settings on similar datasets; and
(e) Allow imported data to be stored in an encrypted format (please refer to Part 2 Section L for more details).
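For illustration only, the sketch below shows one way the user-defined, reusable privacy settings described in items (c) and (d) above could be expressed in Python using pandas. The column names, masking rule and tokenisation approach are hypothetical and do not prescribe any particular product feature.

    # Minimal sketch of user-defined privacy settings applied to a dataset.
    # Column names and rules are hypothetical and for illustration only.
    import hashlib
    import pandas as pd

    def mask_column(series: pd.Series, keep_last: int = 4) -> pd.Series:
        """Mask all but the last few characters of each value."""
        return series.astype(str).str.replace(
            r".(?=.{%d})" % keep_last, "*", regex=True
        )

    def tokenise_column(series: pd.Series) -> pd.Series:
        """Replace values with a one-way SHA-256 token."""
        return series.astype(str).map(
            lambda v: hashlib.sha256(v.encode("utf-8")).hexdigest()[:16]
        )

    privacy_settings = {            # reusable rule set, cf. item (d) above
        "nric": tokenise_column,
        "phone": mask_column,
    }

    df = pd.DataFrame({
        "nric": ["S1234567A", "T7654321B"],
        "phone": ["91234567", "98765432"],
        "amount": [120.50, 88.00],
    })

    for column, rule in privacy_settings.items():
        df[column] = rule(df[column])

    print(df)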
3.2.4 Data Collection, Preparation, and Query

(a) Facilitate data collection from various sources including but not limited to structured and unstructured data, files, databases, Application Programming Interfaces (APIs, which provide a channel to send information between two applications), and streams;
(b) Perform data transformation and cleansing with R and Python code;
(c) Support scheduling of workflow jobs;
(d) Allow import of Data Definition Language (DDL) scripts for the creation of tables, indexes, constraints and foreign keys to perform data modelling (DDL is a computer language used to create and modify the structure of database objects, e.g., views, schemas, tables and indexes, in a database);
(e) Allow organisation of data as data assets that can be separately access controlled;
(f) Allow search and query of data fields;
(g) Allow data to be imported and exported in one or more of the following formats: csv, json, xml, SQL, binary, xls, xlsx, etc.;
(h) Allow data governance workflows to be built to allow prior review and approval for data objects before they are created, modified or removed;
(i) Provide fine-grained user access and control; and
(j) Allow checking of data quality, including but not limited to showing minimum and maximum values for each field, the number of empty cells or inconsistently encoded values (nulls, blanks, NAs and zeros), invalid date formats, invalid geospatial coordinates, inconsistent data types, abnormally long or short fields, and statistical checks (mean, standard deviation, etc.).

3.2.5 Distributed Version Control System

(a) Provide code sharing and version control using Git repositories;
(b) Support multi-tenancy – support segregation of code among multiple teams of users, with each team able to have fine-grained access to their code;
(c) Allow collaboration on code development by multiple teams of users;
(d) Allow code to be exported as R, Python or Jupyter notebooks or equivalent; and
(e) Allow users to be authenticated via IAMS.

3.2.6 Machine Learning and Model Development

(a) Allow development using R and Python code;
(b) Allow R, Python, and other open-sourced libraries to be used;
(c) Allow code to be developed as Jupyter notebooks or equivalent;
(d) Allow interactive execution of code with intermediate results shown visually on the same interface as text, or in the form of tables, graphs or charts;
(e) Allow auto-completion, syntax highlighting and checking;
(f) Allow concurrent execution of code segments;
(g) Allow separation of data into training and test sets, with support for different sampling methods (see the illustrative sketch following this list);
(h) Allow integration with the Distributed Version Control System for code sharing and version control;
(i) Allow integration of product principal approved plugins and extensions for added functionalities;
(j) Support multi-tenancy – support segregation of code among multiple teams of users, with each team able to have fine-grained access to their code;
(k) Support unit-testing of models either natively or through integration with Git repositories; and
(l) Allow batch execution of code with results shown, collected, or exported at the end of the run.
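For illustration only, the sketch below shows one way the separation of data into training and test sets with different sampling methods (item (g) above) could be carried out in Python, using scikit-learn as an example library. The dataset and column names are hypothetical.

    # Minimal sketch: separating data into training and test sets with
    # different sampling methods; the dataset and column names are hypothetical.
    import pandas as pd
    from sklearn.model_selection import train_test_split

    df = pd.DataFrame({
        "tenure_months": [1, 3, 5, 8, 12, 18, 24, 30, 36, 48],
        "monthly_spend": [20, 35, 28, 50, 44, 60, 55, 70, 65, 80],
        "churned":       [1, 1, 1, 0, 1, 0, 1, 0, 0, 0],
    })
    X, y = df.drop(columns=["churned"]), df["churned"]

    # Simple random sampling
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42)

    # Stratified sampling: preserves the class distribution of the target
    X_train_s, X_test_s, y_train_s, y_test_s = train_test_split(
        X, y, test_size=0.2, random_state=42, stratify=y)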
3.2.7 Automated Machine Learning Tools

(a) Allow model development using R and Python code;
(b) Allow R and Python libraries and other open-sourced libraries to be used;
(c) Allow code to be developed as Jupyter notebooks or equivalent;
(d) Support multi-tenancy – support segregation of code among multiple teams of users, with each team able to have fine-grained access to their code;
(e) Allow separation of data into training, test and validation sets, with support for different sampling methods;
(f) Allow rapid iteration of machine learning algorithms and models;
(g) Able to propose recommended or default model parameters;
(h) Allow model parameters to be tracked across runs;
(i) Allow open-sourced algorithms to be imported and run;
(j) Allow models to be scored and ranked;
(k) Recommend the best model based on model evaluation metrics (e.g., F1 score, RMSE, ROC-AUC, etc.);
(l) Allow previous results of model runs to be saved;
(m) Support model ensembles;
(n) Allow multiple jobs to be scheduled, run and tracked;
(o) Support open-source analytics frameworks such as Keras, PyTorch, TensorFlow and H2O;
(p) Allow data to be ingested from data sources such as MS SQL, Oracle, Hive, HDFS, Postgres, MySQL and CSV; and
(q) Able to automatically generate explanations of the models and analytics results, and export the detailed explanations as text documents.

3.2.8 Model Deployment and Operationalisation

(a) Support monitoring of model performance;
(b) Allow machine learning and predictive models to be deployed as APIs (see the illustrative sketch following this list);
(c) Support fine-grained access control for users to deploy models and APIs;
(d) Support multi-tenancy – support segregation of APIs among multiple teams of users, with each team able to have fine-grained access to their APIs;
(e) Support secure encrypted transmission of results and data;
(f) Allow APIs to be pushed and deployed on a Kubernetes cluster as Docker containers;
(g) Allow APIs to be tested before deployment;
(h) Allow newer APIs to be deployed on-the-fly, or older versions of APIs to be replaced on-the-fly, while minimising disruption to users;
(i) Allow newer and older APIs to be deployed concurrently based on weights or priority, for A/B testing or benchmarking;
(j) Support model routing based on specific parameters (e.g., hybrid switching recommenders);
(k) Monitor the status of the deployed APIs;
(l) Log and track all API calls;
(m) Track the performance and load of the APIs;
(n) Send alerts or notifications when APIs fail;
(o) Support high availability for APIs; and
(p) Allow access control for APIs.
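For illustration only, the sketch below shows one way a predictive model could be exposed as a REST API, as described in items (b) and (f) above. FastAPI is used purely as an example framework; the model file, endpoint path and feature names are hypothetical and not mandated by this specification.

    # Minimal sketch: serving a trained model as a REST API.
    # FastAPI is one possible framework; model path and features are hypothetical.
    import joblib
    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI(title="DSP model scoring API (illustrative)")
    model = joblib.load("churn_model.joblib")   # hypothetical trained classifier

    class ScoreRequest(BaseModel):
        tenure_months: float
        monthly_spend: float

    @app.post("/api/v1/score")
    def score(req: ScoreRequest) -> dict:
        # predict_proba assumes a scikit-learn style classifier
        proba = model.predict_proba([[req.tenure_months, req.monthly_spend]])[0][1]
        return {"churn_probability": float(proba)}

    # Such a service can be packaged as a Docker container and deployed on a
    # Kubernetes cluster; locally it could be run with, for example:
    #   uvicorn main:app --host 0.0.0.0 --port 8080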
3.2.9 Hardware Requirements (optional)

The Supplier shall propose hardware with Graphics Processing Units (GPUs) that is compatible with the Data Science Platform solution. The platform, together with the GPU hardware, shall allow the following:

(a) GPU-based model inference and model training;
(b) Support for horizontal scaling of GPUs; and
(c) Support for Natural Language Processing (NLP), Speech-to-Text (STT) and Large Language Model (LLM) workloads.

4. SYSTEM ADMINISTRATION AND MAINTENANCE FOR DATA SCIENCE PLATFORM

4.1 The Supplier shall be responsible for the system administration of the Data Science Platform, and the maintenance of all system interfaces and data integrations within the system.

4.2 The Supplier shall provide administrative support for the data science tools, integrated web-based development environment and distributed version control system, including:

(a) Security review and permissions settings;
(b) Account creation, deletion and regular review;
(c) Server connection reconfiguration (e.g., password change for system accounts); and
(d) Tenant support and management.

4.3 The Supplier shall provide administrative support for, and updating of, open-sourced and product-specific libraries every quarter.

4.4 The Supplier shall provide workflows and tools to review open-sourced and product-specific libraries (i.e., for vulnerabilities and malicious code) every quarter (see the illustrative sketch at the end of this section).

4.5 The Supplier shall manage and ensure smooth operations of all system interfaces and data integrations deployed within the Data Science Platform.

4.6 The Supplier shall be responsible for investigating and troubleshooting any platform-related issue (i.e., failed interfaces, failed APIs deployed as models, failed data science workflows, etc.) and for identifying the possible causes. The Supplier shall re-run and rectify/modify all platform-related issues within the scope of work where necessary to resolve the issue.

4.7 The Supplier shall monitor and log the failed platform-related issues and their associated causes. The Supplier should also highlight consistent failures and/or causes to the Authority for the necessary follow-up actions. The Supplier should propose solutions to prevent recurrence of such issues.
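For illustration only, the sketch below shows one possible starting point for the quarterly library review in clauses 4.3 and 4.4: listing outdated Python packages as input to the update plan. It does not prescribe any particular tooling; a dedicated scanner (e.g., pip-audit or an equivalent product-approved tool) would additionally be needed for the vulnerability and malicious-code review.

    # Illustrative sketch for the quarterly library review (clauses 4.3 and 4.4):
    # list outdated Python packages as input to the update plan. A dedicated
    # vulnerability scanner would be used in addition for the security review.
    import json
    import subprocess

    result = subprocess.run(
        ["pip", "list", "--outdated", "--format=json"],
        capture_output=True, text=True, check=True,
    )

    for pkg in json.loads(result.stdout):
        print(f"{pkg['name']}: {pkg['version']} -> {pkg['latest_version']}")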
5. SYSTEM INTEGRATION FOR DATA SCIENCE PLATFORM

5.1 The Supplier shall configure, test and provide the necessary support for the system integration and maintenance of the Data Science Platform with the Enterprise Data Management Platform (xDASH)'s Data Access Tier (covered under Part 2 Section B2).

5.2 The Supplier shall work with the Supplier of the Enterprise Data Management Platform (xDASH) to integrate the Data Science Platform with the Enterprise Data Management Platform (xDASH)'s Data Access Tier.

5.3 The Supplier shall ensure that Workload B (covered under Part 2 Section B1) is accessible to users of the Data Science Platform so that they are able to conduct data science and data analytics work.

6. HARDWARE MAINTENANCE FOR DATA SCIENCE PLATFORM (OPTIONAL)

6.1 This section defines the scope of services for the following:

(a) General Requirements
(b) Support Hours
(c) Service Levels

6.2 General Requirements

6.2.1 The Supplier shall provide hardware maintenance services for the Data Science Platform. Detailed configurations for the server equipment shall be provided upon the award of the Contract. The server equipment shall include the equipment rack that houses the servers.

6.2.2 The Supplier shall propose the Server Equipment Maintenance charges, which shall be quoted as a percentage of the latest list price of the said equipment. The list price shall be obtained from the equipment manufacturer. The Supplier shall provide any update of the price list by the equipment manufacturer within one (1) month of its published date. The Supplier shall make the price list available and accessible to the Authority. Prices and updates shall not be modified. The Authority reserves the right to verify the accuracy of the price list with the equipment manufacturer.

6.2.3 In cases where server equipment maintenance services are required for an obsolete item, the latest list price of its replacement item endorsed by the equipment manufacturer shall be used as the reference to calculate annual maintenance costs.

6.2.4 The Supplier shall be responsible for the co-ordination of all activities, including maintenance, installation, testing and commissioning activities, with other third-party vendors. In the event of any dispute between the Supplier and relevant parties on co-ordination and interface activities, the decision of the Authority shall be final and binding.

6.2.5 The Supplier shall ensure an adequate supply of spare parts for the System (including servers) so that faulty components can be replaced promptly when needed. The spare parts supplied shall be of original specifications, or of compatible but higher specifications, such that no direct or indirect side effects whatsoever are caused.

6.2.6 Any defective parts removed from the equipment shall become the property of the Supplier. In the case of faulty storage media such as hard disks or tapes, these shall remain the property of the Authority.

6.2.7 The Supplier shall ensure that all replacement parts are in working condition before using them to replace the defective ones.

6.2.8 The Supplier shall ensure that all equipment upgrades, firmware and software patches/new releases are tested in an environment similar to that of the Authority before implementing them in the Authority's environment.

6.2.9 Upon receipt of notification from the Authority that the equipment has failed or is malfunctioning, the Supplier shall dispatch suitably qualified personnel to arrive at the Authority Site within the time stipulated in the Service Levels section, to make such repairs and adjustments to, and replace such parts of, the equipment as necessary to restore it to its original functional state.

6.2.10 Where the Supplier is unable to restore any component part of the equipment to its original functional state, the Supplier shall, without any cost to the Authority, provide the Authority with substitute equipment which is functionally equivalent to the defective one until the failure or malfunction is rectified.

6.2.11 The Supplier shall maintain a log of all its activities at the Authority Site. The Supplier shall propose a format for the log. The log shall include the following:

(a) Date and time when the Supplier is notified of any defect or malfunction.
(b) Date and time of arrival of the Supplier's personnel at the Authority Site.
(c) Date and time when the faulty equipment or component was successfully replaced.
(d) Description of the faulty equipment or component and the causes of the failure.
(e) Corrective actions taken, including temporary corrections, bypasses, etc.
(f) Preventive actions to be taken.
(g) Results of tests performed to verify the correct functioning of the substitute equipment.

6.2.12 The Supplier shall, at its own expense and within a reasonable period of time, clear away and remove from the Authority Site all surplus material and rubbish after the completion of each visit.

6.2.13 The Supplier shall provide assistance to Authority-appointed auditors and consultants where applicable.
6.3 Support Hours

6.3.1 The Support Hours for the Server Equipment shall be twenty-four (24) hours every day, including Saturdays, Sundays and Public Holidays.

6.4 Service Levels

6.4.1 The Supplier shall restore the faulty equipment to its original functional state within four (4) hours from the time of failure, failing which the Supplier shall provide a temporary loan set within eight (8) hours from the time of failure.

6.4.2 All storage media installed in the temporary loan set, such as hard disks or tapes, shall be handled according to the "Singapore Government Security Instructions for the Handling and Custody of Classified Information" before being taken out of the Authority's premises. The Supplier shall refer to Part 2, Section L – Security Requirements for compliance with the Authority's policies and standards.

6.4.3 The Supplier shall conduct preventive maintenance services on all hardware equipment maintained once every six (6) months per equipment. The preventive maintenance works shall include the following:

(a) Cleaning of equipment and rack ventilation fans.
(b) Running diagnostic programs on the hardware.
(c) Replacing, without cost to the Authority, the equipment where replacement is necessary for the normal functioning of the System.
(d) Performing tests or adjustments necessary to keep the equipment in its original functional state.
(e) Updating documentation on the rack information, such as the position of equipment in the rack.

6.4.4 The Supplier shall submit the preventive maintenance work plan to the Authority for acceptance at least one (1) month before the works commence. The preventive maintenance work plan shall indicate the work activity, schedule, duration, and any other relevant information.

6.4.5 The Supplier shall submit the preventive maintenance report to the Authority within one (1) week after the inspection.

7. SYSTEM PERFORMANCE TEST FOR DATA SCIENCE PLATFORM

7.1 The Supplier shall refer to Part 2 Section A for the general requirements for performance tests.

7.2 The Tenderer shall note that the System Performance Test comprises the following:

(a) Benchmark test;
(b) Stress test;
(c) Endurance test;
(d) Breakpoint test; and
(e) Gate-keeping test.

7.3 The Benchmark Test shall validate system performance at the normal concurrent load factor (50% of the peak concurrent load factor).

7.4 The Stress Test shall validate system performance at the peak concurrent load factor (100% of the peak concurrent load factor).

7.5 The Endurance Test shall establish system consistency over a prolonged, sustained peak concurrent load factor.

7.6 The Breakpoint Test shall establish system capacity through gradual increments of user concurrency and determine the maximum user load factor which the system is able to sustain prior to reaching the breakpoint criteria defined by the Authority (see the illustrative sketch following clause 7.7).

7.7 The Gate-keeping Test shall validate the effectiveness of the gate-keeping measures to gracefully handle peak load or breakpoint load.
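For illustration only, one way to realise the gradual increment of user concurrency described in clause 7.6 is a step-load profile in a load-testing tool. The sketch below uses Locust as an example tool (not mandated by this specification); the endpoint, step size, step duration and user cap are hypothetical.

    # Illustrative step-load profile for a Breakpoint Test (clause 7.6).
    # Locust is used only as an example load-testing tool; the endpoint,
    # step size, step duration and user cap are hypothetical.
    from locust import HttpUser, LoadTestShape, between, task

    class DspUser(HttpUser):
        host = "http://dsp-test.example"   # hypothetical test endpoint
        wait_time = between(1, 20)         # think time of up to 20 seconds

        @task
        def score_model(self):
            # hypothetical deployed model API
            self.client.post("/api/v1/score",
                             json={"tenure_months": 12, "monthly_spend": 80.0})

    class BreakpointShape(LoadTestShape):
        """Add users in steps until the Authority-defined breakpoint criteria
        (e.g., response-time or error-rate limits) are observed."""
        step_users = 5        # users added per step (hypothetical)
        step_duration = 300   # seconds per step (hypothetical)
        max_users = 100       # safety cap for the test run (hypothetical)

        def tick(self):
            run_time = self.get_run_time()
            users = min(self.max_users,
                        self.step_users * (int(run_time // self.step_duration) + 1))
            return users, self.step_users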
7.8 The detailed System Performance Test Plan shall be reviewed and approved by the Authority.

7.9 The Supplier shall engage an independent Test Consultant to perform the application performance testing services before each system release implementation phase or major upgrade.

7.10 The Supplier shall note that the appointment of the independent Test Consultant is subject to the approval of the Authority.

7.11 The Supplier shall bear the cost of all the related consultancy services.

7.12 The Test Consultant shall provide the System Performance Test Plan using the Authority's Performance Test Plan template, which shall minimally document the background (i.e., system overview), objectives, roles and responsibilities of the project team, testing methodology, key business processes, measurement of success, areas requiring refinement, and fine-tuning recommendations.

7.13 The Test Consultant shall propose in the System Performance Test Plan how to generate the transaction load to test the Systems as though operating under live conditions, such as similar data size, usage scenarios and user base. The System Performance Test shall be carried out by the Test Consultant, and the results of the test will be verified by the users.

7.14 The Test Consultant shall note that the System Performance Test Plan shall be established according to the available resources in the System Performance Test environment. The Supplier shall provide the results to show that the application meets the performance targets for the proposed Systems sizing.

7.15 The Supplier shall submit the System Performance Test Plan to the Authority for review and approval minimally ONE (1) month before the commencement of the System Performance testing.

7.16 The Tenderer shall note that the Authority's System Performance Test Plan template will be provided to the Supplier upon award of this Contract.

7.17 The Test Consultant shall generate sufficient test coverage and a realistic amount of load, together with batch processing, on the application and system under test. The proposed test coverage and suggested amount of test data to be generated are subject to the Authority's review and approval.

7.18 The Test Consultant shall note that the test coverage, the type of testing services and the amount of load generated on the system and application, which are subject to the approval of the Authority, are dependent on the business requirements of the Systems, the System application architecture, the size of the user base, the number of concurrent users, the number of concurrent accesses, the number of concurrent requests, the system configurations, etc.

7.19 The Supplier shall provide the system performance testing service to conduct the performance testing for applications, subject to the Authority's approval.

7.20 The Supplier shall notify and obtain the Authority's approval on the recommended testing tools and the number of software licenses.

7.21 The Supplier shall be responsible for setting up the System Performance Test environment, preparing all the data required for the System Performance Test, refreshing the data where required, and planning, executing and monitoring all related batch job runs and other necessary work required by the System Performance Test, even if the System Performance Test scope is not awarded to the Tenderer.

7.22 The Supplier shall carry out the performance testing as well as application tuning to optimise the application performance in the System Performance Test environment.

7.23 The Supplier shall also note that the performance of the Systems shall be baselined under the required load for the user base and concurrent users, within the specified System Response Time.
7.24 The Supplier shall ensure that performance testing is conducted in conjunction with application tuning to optimise the System performance in the System Performance Test environment, regardless of whether the System Performance Test is performed onsite or offsite.

7.25 The Supplier shall ensure that the system performance testing is conducted without impact to other applications, their schedules or resource requirements, by executing the tests under different mixes of scenarios (e.g., low load versus high load), during off-peak periods, and by scheduling the tests such that the shared environment does not affect the test results.

7.26 Upon completion of the application performance tests, the Supplier shall submit and present the System Performance Test Report to the Authority for review and approval. The System Performance Test Report shall document the test cases with results (expected and actual), statistics as evidence that the system performance tests have been carried out and that the Systems are ready for review by the Authority, the problems identified, and recommendations for application and system fine-tuning.

7.27 The Authority shall review and confirm whether the Systems meet the performance requirements of the Benchmark Test, Stress Test and Endurance Test as defined in this Tender Specifications, and whether the corrective actions to be undertaken by the Supplier to meet the requirements are acceptable to the Authority.

7.28 The Supplier shall note that if the Systems are unable to meet the performance standards required by the Authority, resulting in the need to conduct additional rounds of performance testing, the Supplier shall be held fully accountable for any additional resources, inclusive of software and hardware, that may be required to conduct the subsequent round(s) of performance tests.

7.29 In the event of non-compatibilities or system degradation, the Supplier shall bear the responsibility to propose and implement solution(s), including adding system resources, to meet the Systems' requirements. The Supplier shall bear all the costs of implementing the solution(s) agreed by the Authority.

7.30 The Supplier shall comply with the system performance test parameters as stated in the table below (see also the illustrative sketch following clause 7.32).

Parameters: Value
- Number of users at normal concurrent load factor: 15 users
- Number of users at peak concurrent load factor: 30 users
- CPU Utilisation: Shall not exceed 50% for all benchmark test scenarios; shall not exceed 80% for all stress test scenarios
- Memory Utilisation: Shall not exceed 70% for stress test scenarios after reaching peak load
- Maximum Think Time*: Not more than 20 seconds for all transactions
- Duration for Benchmark and Stress Tests: At least 60 minutes
- Duration for Endurance Test: At least 8 hours

*The think time varies based on the business processes defined in the system performance tests and shall be determined during the development of the system performance test plan.

7.31 The System shall be fault tolerant. An example of fault tolerance is the System's ability to handle graceful degradation of system performance beyond the minimal performance requirements.

7.32 Unless otherwise approved by the Authority, the Supplier shall work within the network and security policy and guidelines stipulated by the Authority.
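For illustration only, the sketch below shows how the clause 7.30 parameters could be captured programmatically when compiling the System Performance Test Report, by checking measured utilisation and duration figures against the stated limits. The measurement values in the example are hypothetical sample data.

    # Illustrative check of measured results against the clause 7.30 parameters.
    # The measurement values passed in below are hypothetical sample data.
    THRESHOLDS = {
        "benchmark": {"users": 15, "cpu_max_pct": 50, "min_duration_min": 60},
        "stress":    {"users": 30, "cpu_max_pct": 80, "mem_max_pct": 70,
                      "min_duration_min": 60},
        "endurance": {"users": 30, "min_duration_min": 8 * 60},
    }

    def evaluate(test_type: str, cpu_pct: float, mem_pct: float,
                 duration_min: float) -> list[str]:
        """Return a list of threshold breaches for one test run."""
        t = THRESHOLDS[test_type]
        breaches = []
        if "cpu_max_pct" in t and cpu_pct > t["cpu_max_pct"]:
            breaches.append(f"CPU {cpu_pct}% exceeds {t['cpu_max_pct']}%")
        if "mem_max_pct" in t and mem_pct > t["mem_max_pct"]:
            breaches.append(f"Memory {mem_pct}% exceeds {t['mem_max_pct']}%")
        if duration_min < t["min_duration_min"]:
            breaches.append(f"Duration {duration_min} min below "
                            f"{t['min_duration_min']} min")
        return breaches

    print(evaluate("stress", cpu_pct=76.0, mem_pct=72.5, duration_min=65))
    # -> ['Memory 72.5% exceeds 70%']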
8. TRAINING REQUIREMENTS FOR DATA SCIENCE PLATFORM

8.1 The Supplier shall refer to Part 2 Section K for the general requirements for training.

8.2 The Supplier shall provide at least one (1) session of training (either physical or virtual, subject to approval by the Authority) for up to 10 user representatives (comprising Power Users and End Users) from the tenants onboarded onto the Data Science Platform as part of System Commissioning.

8.3 The scope of training shall minimally include:

Both Power Users and End Users
(a) The concept of the governance model for the data, code and contents.

Power Users
(b) How to create and set up new tenants.
(c) How to manage user groups and users in the Data Science Platform.
(d) How to manage and configure access controls in the Data Science Platform.
(e) How to monitor virtual machine performance, site status and user activities.
(f) How to monitor the system performance of models.

End Users
(g) How to connect to and query a data source.
(h) How to import and export data.
(i) How to perform row and column masking.
(j) How to set user access and control for data assets.
(k) How to collaborate on code development with multiple teams of users.
(l) How to export code out of the platform.
(m) How to perform data cleaning with R and Python code.
(n) How to conduct model deployment, model performance tracking and machine learning operations.
(o) How to track the performance and load of the APIs or Apps (ML models).
(p) How to send alerts or notifications when APIs or Apps fail.