08 Part 2 Section C - Requirement Specification for Data Science Platform Corrigendum 2

PART 2
REQUIREMENT SPECIFICATIONS FOR DATA SCIENCE PLATFORM
SECTION C
CONTENTS
1. INTRODUCTION
2. SCOPE OF WORK
3. SYSTEM DEVELOPMENT AND IMPLEMENTATION FOR DATA SCIENCE PLATFORM
4. SYSTEM ADMINISTRATION AND MAINTENANCE FOR DATA SCIENCE PLATFORM
5. SYSTEM INTEGRATION FOR DATA SCIENCE PLATFORM
6. HARDWARE MAINTENANCE FOR DATA SCIENCE PLATFORM (OPTIONAL)
7. SYSTEM PERFORMANCE TEST FOR DATA SCIENCE PLATFORM
8. TRAINING REQUIREMENTS FOR DATA SCIENCE PLATFORM
1. INTRODUCTION
1.1 Background
(a) The Data Science Platform (DSP) will provide data science capabilities such as data
analytics, machine learning, integrated web-based development environment,
distributed version control, model development and deployment to meet users’
business and analytics needs. It shall integrate with an Enterprise Data Management
Platform (xDASH), covered under Part 2 Section B1, which supports data
transformation, sharing and discovery.
(b) The Data Science Platform shall have a scalable, multi-tenant architecture with
high availability. The tenderer shall describe in detail in their proposal how
multi-tenancy is implemented. The proposed solution shall address details such as
isolation of data, compute, code and content, accounts for administrators, the
different administrative accounts, and the level of controls for accounts.
2. SCOPE OF WORK
2.1 The Supplier shall be responsible for the following:
(a) Design, development, and implementation of the Data Science Platform;
(b) Maintenance of the Data Science Platform including all system interfaces and data
integrations within the system;
(c) Integration with the Enterprise Data Management Platform (xDASH)’s Data
Access Tier (covered under Part 2 Section B2);
(d) Supply, setup, installation and maintenance of hardware equipment supporting the Data
Science Platform (optional); and
(e) Training for Power Users and End Users on the Data Science Platform capabilities
and end-to-end data science workflow.
2.2 The Supplier shall provide the schedule of rates for Service Request Man-Days as
specified in Part 3 Annex I on Cost Schedule Table 11. These Service Request Man-Days
shall cater for the following throughout the entire project lifecycle:
(a) Modifications, enhancements and sustainment of the Data Science Platform covered
under Part 2 Section C; and
(b) Security review and testing services covered under Part 2 Section D.
2.3 The high-level architecture of the Data Science Platform after implementation at On-Premise
Cloud (OPC) is as follows. The diagram also shows the integration with the
Enterprise Data Management Platform (xDASH)’s Data Access Tier (covered under
Part 2 Section B2).
2.4 The Supplier shall design a multi-tenanted Data Science Platform that provides data
science capabilities such as data analytics, machine learning, integrated web-based
development environment, distributed version control, model development and
deployment. There shall be two (2) tenants at system commissioning.
3. SYSTEM DEVELOPMENT AND IMPLEMENTATION FOR DATA SCIENCE PLATFORM
3.1 The Supplier shall be responsible for the design, development, and implementation of
a multi-tenanted Data Science Platform for data science capabilities.
3.2 The Data Science Platform shall provision data analytics, machine learning, integrated
web-based development environment, distributed version control, model development
and deployment tools to facilitate data science functionalities with capabilities as
follows:
3.2.1 General Requirements
(a) Allow role-based authorisation control for access management and SSO (e.g., able
to integrate with Active Directory / SAML1) through integration with the Authority’s
IAMS2 for the Data Science Platform and its subcomponents; and
(b) Allow integration with databases (e.g., RDBMS, NoSQL databases, Object Store)
and data virtualisation tools (e.g., Denodo, Tibco) to allow merging, joins,
transformation and querying of data for data management.
1 Security Assertion Markup Language.
2 Identity & Access Management System (IAMS) provides key functions such as Single Sign-On (SSO) and 2-factor authentication (2FA).
3.2.2 Data Storage and Management Capabilities
(a) Allow parallel queries and workloads to be executed across multiple nodes;
(b) Support multi-tenancy – support segregation of data among multiple teams of users,
with each team able to have fine-grained access to their data;
(c) Provide access control of System and/or database administration to disallow
administrator’s access and rights to non-administrators (e.g., via permissions,
masking);
(d) Leverage distributed processing frameworks to run queries (e.g., Hadoop
MapReduce, Apache Spark, etc.); and
(e) Allow queries to be scheduled or batched.
3.2.3 Data Privacy-Preservation
(a) Allow policies for encryption, anonymisation, masking, and tokenisation using
techniques like homomorphic encryption, k-anonymity, differential privacy and
others to be applied to data;
(b) Allow data to be ingested from files, database or big data sources and application
of privacy measures to the data without the need for a temporary staging area;
(c) Allow users to define their own sets of privacy settings (e.g., masking of columns
and rows);
(d) Able to run defined sets of privacy settings on similar datasets; and
(e) Allow imported data to be stored in an encrypted format (please refer to Part 2
Section L for more details).
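By way of illustration only, the sketch below shows one way a reusable privacy setting could mask a column and then check k-anonymity over a set of quasi-identifier fields. The record layout, the field names and the functions `mask_column` and `is_k_anonymous` are hypothetical and are not part of this specification or of any proposed product.

```python
from collections import Counter

def mask_column(rows, column, mask="***"):
    """Return copies of the rows with one column replaced by a mask value."""
    return [{**row, column: mask} for row in rows]

def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs at least k times."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())

# Hypothetical sample records (illustrative only).
records = [
    {"name": "A", "age_band": "30-39", "postal_sector": "12", "diagnosis": "X"},
    {"name": "B", "age_band": "30-39", "postal_sector": "12", "diagnosis": "Y"},
    {"name": "C", "age_band": "40-49", "postal_sector": "34", "diagnosis": "X"},
]

masked = mask_column(records, "name")
# The third record is alone in its group, so the data is not 2-anonymous.
two_anonymous = is_k_anonymous(masked, ["age_band", "postal_sector"], k=2)
```

Because the setting is expressed as plain functions with the column names passed in, the same setting can be run on other datasets that share those fields.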
3.2.4 Data Collection, Preparation, and Query
(a) Facilitate data collection from various sources including but not limited to
structured, unstructured, files, databases, Application Programming Interfaces
(APIs), streams3;
(b) Perform data transformation and cleansing with R and Python code;
(c) Support scheduling of workflow jobs;
3 API stands for Application Programming Interface. It provides a channel to send information between two applications.
(d) Allow import of Data Definition Language (DDL) for the creation of tables, indexes,
constraints and foreign keys to perform data modelling4;
(e) Allow organisation of data as data assets that can be separately access controlled;
(f) Allow search and query of data fields;
(g) Allow data to be imported and exported in one or more of the following formats:
csv, json, xml, SQL, binary, xls, xlsx, etc.;
(h) Allow data governance workflows to be built to allow prior review and approval
for data objects before they are created, modified or removed;
(i) Provide fine-grained user access and control; and
(j) Allow checking of data quality by including but not limited to showing minimum
and maximum values for each field, number of empty cells or inconsistent encoding
(nulls, blanks, NAs and zeros), invalid date formats, invalid geospatial coordinates,
inconsistent data types, abnormally long or short fields and statistical checks (mean,
standard deviation, etc.).
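By way of illustration only, the data quality checks in item (j) could be sketched for a single field as follows. The function `profile_field` and the set of values treated as "empty" are assumptions made for this example; the actual encodings and checks would be agreed with the Authority.

```python
import statistics

# Assumed encodings of an "empty" cell; adjust to the dataset at hand.
EMPTY_MARKERS = {None, "", "NA", "N/A", "null"}

def profile_field(values):
    """Summarise one field: empty count, min, max, mean and standard deviation."""
    present = [v for v in values if v not in EMPTY_MARKERS]
    numeric = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    summary = {
        "count": len(values),
        "empty": len(values) - len(present),
        "non_numeric": len(present) - len(numeric),
    }
    if numeric:
        summary["min"] = min(numeric)
        summary["max"] = max(numeric)
        summary["mean"] = statistics.mean(numeric)
        # Sample standard deviation needs at least two data points.
        summary["stdev"] = statistics.stdev(numeric) if len(numeric) > 1 else 0.0
    return summary

# Hypothetical column with two empty cells and one inconsistent data type.
summary = profile_field([10, 20, None, "NA", 30, "abc"])
```

A zero is deliberately not treated as empty here, so that genuine zero readings are still counted in the statistics.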
3.2.5 Distributed Version Control System
(a) Provide code sharing and version control using Git repositories;
(b) Support multi-tenancy – support segregation of code among multiple teams of
users, with each team able to have fine-grained access to their code;
(c) Allow collaboration of code development with multiple teams of users;
(d) Allow code to be exported out as R, Python or Jupyter notebooks or equivalent;
and
(e) Allow users to be authenticated via IAMS.
3.2.6 Machine Learning and Model Development
(a) Allow development using R and Python code;
(b) Allow R, Python, and other open-sourced libraries to be used;
(c) Allow code to be developed as Jupyter notebooks or equivalent;
(d) Allow interactive execution of code with intermediate results to be shown visually
on the same interface as text, or in the form of tables, graphs or charts;
4 DDL stands for Data Definition Language. It is a computer language used to create and modify the structure of database objects (e.g., views, schemas, tables, indexes etc.) in a database.
(e) Allow auto-completion, syntax highlighting and checking;
(f) Allow concurrent execution of code segments;
(g) Allow separation of data into training and test sets, with support for different
sampling methods;
(h) Allow integration with Distributed Version Control System for code sharing and
version control;
(i) Allow integration of product principal approved plugins and extensions for added
functionalities;
(j) Support multi-tenancy – support segregation of code among multiple teams of
users, with each team able to have fine-grained access to their code;
(k) Support unit-testing of models either natively or with integration with Git
repositories; and
(l) Allow batch execution of code with results shown, collected, or exported at the end
of the run.
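By way of illustration only, item (g) could be met with a stratified sampling method such as the one sketched below, which keeps the proportion of each label similar in the training and test sets. The function `stratified_split`, the row layout and the 20% test fraction are assumptions made for this example.

```python
import random
from collections import defaultdict

def stratified_split(rows, label_key, test_fraction=0.2, seed=0):
    """Split rows into (train, test), keeping label proportions similar in both."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)
    train, test = [], []
    for group in by_label.values():
        shuffled = group[:]
        rng.shuffle(shuffled)
        n_test = round(len(shuffled) * test_fraction)
        test.extend(shuffled[:n_test])
        train.extend(shuffled[n_test:])
    return train, test

# Hypothetical dataset: 10 positive and 30 negative rows.
rows = [{"id": i, "label": "pos" if i % 4 == 0 else "neg"} for i in range(40)]
train, test = stratified_split(rows, "label", test_fraction=0.2, seed=42)
```

Simple random sampling would shuffle all rows together instead of per label; the stratified variant matters most when one class is rare.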
3.2.7 Automated Machine Learning Tools
(a) Allow model development using R and Python code;
(b) Allow R and Python libraries and other open-sourced libraries to be used;
(c) Allow code to be developed as Jupyter notebooks or equivalent;
(d) Support multi-tenancy – support segregation of code among multiple teams of
users, with each team able to have fine-grained access to their code;
(e) Allow separation of data into training, test and validation sets, with support
for different sampling methods;
(f) Allow rapid iteration of machine learning algorithms and models;
(g) Able to propose recommended or default model parameters;
(h) Allow model parameters to be tracked across runs;
(i) Allow open-sourced algorithms to be imported and run;
(j) Allow models to be scored and ranked;
(k) Recommend the best model based on model evaluation metrics (e.g., F1 score,
RMSE, ROC-AUC, etc.);
(l) Allow previous results of model runs to be saved;
(m) Support model ensembles;
(n) Allow multiple jobs to be scheduled, run and tracked;
(o) Support open-source analytics frameworks like Keras, PyTorch, TensorFlow, H2O;
(p) Allow data to be ingested from data sources like MS SQL, Oracle, Hive, HDFS,
Postgres, MySQL, CSV; and
(q) Able to automatically generate explanations of the models and analytics results, and
export the detailed explanations as a text document.
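By way of illustration only, items (j) and (k) could be sketched as scoring each candidate model with one evaluation metric and ranking the results. The example uses the F1 score for binary labels; the model names, the predictions and the functions `f1_score` and `rank_models` are hypothetical.

```python
def f1_score(y_true, y_pred):
    """F1 score for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def rank_models(y_true, predictions_by_model):
    """Return (model_name, F1) pairs sorted from best to worst."""
    scores = {
        name: f1_score(y_true, preds)
        for name, preds in predictions_by_model.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical held-out labels and predictions from three candidate models.
y_true = [1, 0, 1, 1, 0, 0]
candidates = {
    "model_a": [1, 0, 0, 1, 0, 1],
    "model_b": [1, 0, 1, 1, 0, 0],
    "model_c": [0, 0, 0, 1, 1, 1],
}
ranking = rank_models(y_true, candidates)
best_model = ranking[0][0]
```

For regression models a lower-is-better metric such as RMSE would be ranked in ascending order instead.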
3.2.8 Model Deployment and Operationalisation
(a) Support monitoring of model performance;
(b) Allow machine learning and predictive models to be deployed as APIs;
(c) Support fine-grained access control for users to deploy models and APIs;
(d) Support multi-tenancy – support segregation of APIs among multiple teams of
users, with each team able to have fine-grained access to their APIs;
(e) Support secure encrypted transmission of results and data;
(f) Allow APIs to be pushed and deployed on Kubernetes cluster as Docker containers;
(g) Allow APIs to be tested before deployment;
(h) Allow newer APIs to be deployed on-the-fly, or older versions of APIs to be
replaced on-the-fly, while minimising the disruption to users;
(i) Allow newer and older APIs to be deployed concurrently based on weights or
priority, for A-B testing or benchmarking;
(j) Support model routing based on specific parameters, e.g., Hybrid switching
recommenders;
(k) Monitor the status of the deployed APIs;
(l) Log and track all API calls;
(m) Track the performance and load of the APIs;
(n) Send alerts or notifications when APIs fail;
(o) Support high-availability for APIs; and
(p) Allow access control for APIs.
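By way of illustration only, the weight-based concurrent deployment in item (i) could be sketched as a router that sends each call to one API version with a probability proportional to its weight. The version names, the 90/10 split and the function `choose_version` are assumptions made for this example; in practice this routing would normally be handled by the platform's gateway or ingress layer.

```python
import random

def choose_version(weights, rng=random):
    """Pick an API version name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]

# Hypothetical split: 90% of calls to the current version, 10% to the candidate.
traffic_weights = {"v1": 0.9, "v2": 0.1}

rng = random.Random(7)  # seeded so the simulation is repeatable
sample = [choose_version(traffic_weights, rng) for _ in range(10_000)]
share_v2 = sample.count("v2") / len(sample)  # close to 0.1 for a large sample
```

Changing the weights gradually (for example from 10% to 50% to 100%) is one way to replace an older version while minimising disruption to users.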
3.2.9 Hardware Requirements (optional)
The Supplier shall propose hardware with Graphics Processing Units (GPUs) that is
compatible with the Data Science Platform solution. The platform together with the
GPU hardware shall:
(a) Allow GPU-based model inference and model training;
(b) Support horizontal scaling of GPUs; and
(c) Support Natural Language Processing (NLP), Speech-to-Text (STT) and Large
Language Model (LLM) workloads.
4. SYSTEM ADMINISTRATION AND MAINTENANCE FOR DATA SCIENCE PLATFORM
4.1 The Supplier shall be responsible for the system administration of the Data Science
Platform, and the maintenance of all system interfaces and data integrations within the
system.
4.2 The Supplier shall provide administrative support for the data science tool, integrated
web-based development environment and distributed version control system, including:
(a) Security review and permissions settings;
(b) Account creation, deletion and regular review;
(c) Server connection reconfiguration (e.g., password change for system accounts); and
(d) Tenant support and management.
4.3
The Supplier shall provide administrative support and updating of open-sourced and
product specific libraries every quarter.
4.4 The Supplier shall provide workflow and tools to review open-sourced and
product-specific libraries (e.g., for vulnerabilities and malicious code) every quarter.
4.5
The Supplier shall manage and ensure smooth operations of all system interfaces and
data integrations deployed within the Data Science Platform.
4.6 The Supplier shall be responsible for investigating and troubleshooting any platform-related
issue (e.g., failed interfaces, failed models deployed as APIs, failed data science
workflows) and for identifying the possible causes. The Supplier shall re-run and
rectify/modify all platform-related issues within the scope of work where necessary to
resolve the issue.
4.7 The Supplier shall monitor and log platform-related failures and the associated
causes. The Supplier should also highlight recurring failures and/or causes to the
Authority for the necessary follow-up actions. The Supplier should propose solutions
to prevent recurrence of the interface issues.
5. SYSTEM INTEGRATION FOR DATA SCIENCE PLATFORM
5.1
The Supplier shall configure, test and provide the necessary support for the system
integration and maintenance of the Data Science Platform with Enterprise Data
Management Platform (xDASH)’s Data Access Tier (covered under Part 2 Section B2).
5.2
The Supplier shall work with the Supplier of the Enterprise Data Management Platform
(xDASH) to integrate the Data Science Platform with Enterprise Data Management
Platform (xDASH)’s Data Access Tier.
5.3 The Supplier shall ensure that Workload B (covered under Part 2 Section B1) is
accessible to users of the Data Science Platform and that these users are able to conduct
data science/data analytics work on it.
6. HARDWARE MAINTENANCE FOR DATA SCIENCE PLATFORM (OPTIONAL)
6.1 The section defines the scope of services for the following:
(a) General Requirements
(b) Support Hours
(c) Service Levels
6.2 General Requirements
6.2.1 The Supplier shall provide hardware maintenance services for the Data Science
Platform. Detailed configurations for the server equipment shall be provided upon the
award of the contract. The server equipment shall consist of the equipment rack that
houses the servers.
6.2.2 The Supplier shall propose the Server Equipment Maintenance charges which shall be
quoted as a percentage of the latest list price of the said equipment. The list price shall
be obtained from the equipment manufacturer. The Supplier shall provide an update of
the price list by the equipment manufacturer within one (1) month of the published date.
The Supplier shall make the price list available and accessible to the Authority. All
prices and updates shall not be modified. The Authority reserves the right to verify the
accuracy of the price list with the equipment manufacturer.
6.2.3 In cases where server equipment maintenance services are required for an obsolete item,
the latest list price of its replacement item endorsed by the equipment manufacturer
shall be used as reference to calculate annual maintenance costs.
6.2.4 The Supplier shall be responsible for the co-ordination of all activities, including
maintenance, installation, testing and commissioning activities with other third-party
vendors. In the event of any dispute between the Supplier and relevant parties on coordination and interface activities, the decision of the Authority shall be final and
binding.
6.2.5 The Supplier shall ensure an adequate supply of spare parts for the System (including
servers) so that faulty components can be replaced promptly when needed. The spare
parts supplied shall be of original or compatible, but higher, specifications such that no
direct or indirect side-effects whatsoever shall be caused.
6.2.6 Any defective parts removed from the equipment shall become the property of the
Supplier. In the case of faulty storage media such as hard disks or tapes, they shall
remain property of the Authority.
6.2.7 The Supplier shall ensure that all replacement parts are in working condition before
using them to replace the defective ones.
6.2.8 The Supplier shall ensure that all equipment upgrades, firmware and software
patches/new releases are tested in an environment similar to that of the Authority before
implementing them in the Authority’s environment.
6.2.9 Upon receipt of notification from the Authority that the equipment has failed or is
malfunctioning, the Supplier shall dispatch suitably qualified personnel to arrive at the
Authority Site as stipulated in the Service Levels section, to make such repairs and
adjustments, and to replace such parts, as are necessary to restore the equipment to its
original functional state.
6.2.10 Where the Supplier is unable to restore any component part of the equipment to its
original functional state, the Supplier shall, without any cost to the Authority, provide
the Authority with substitute equipment which is functionally equivalent to the
defective one until the failure or malfunction is rectified.
6.2.11 The Supplier shall maintain a log of all their activities at the Authority Site. The
Supplier shall propose a format for the log. The log shall include the following:
(a) Date and time when the Supplier is notified of any defect or malfunction.
(b) Date and time of arrival of the Supplier's personnel at the Authority Site.
(c) Date and time when the faulty equipment or component was successfully replaced.
(d) Description of the faulty equipment or component and the causes for their failure.
(e) Corrective actions taken, including temporary corrections, bypasses, etc.
(f) Preventive actions to be taken.
(g) Result of tests performed to verify the correct functioning of the substitute
equipment.
6.2.12 The Supplier shall, at its own expense and within a reasonable period of time, clear away
and remove from the Authority Site all surplus material and rubbish after the
completion of each visit.
6.2.13 The Supplier shall provide assistance to Authority-appointed auditors and consultants
where applicable.
6.3 Support Hours
6.3.1 The Support Hours for Server Equipment shall be twenty-four (24) hours every day,
including Saturdays, Sundays and Public Holidays.
6.4 Service Levels
6.4.1 The Supplier shall restore the faulty equipment to its original functional state within
four (4) hours from the time of failure, failing which, the Supplier shall provide a
temporary loan set within eight (8) hours from the time of failure.
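By way of illustration only, the service level above could be checked against the maintenance log timestamps as sketched below. This reflects one possible reading of the clause (restoration within four hours, or otherwise a loan set within eight hours, both measured from the time of failure); the function `service_level_met` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta

RESTORE_LIMIT = timedelta(hours=4)   # restoration limit from the clause above
LOAN_SET_LIMIT = timedelta(hours=8)  # loan set limit from the clause above

def service_level_met(failed_at, restored_at=None, loan_set_at=None):
    """True if restoration took at most 4 h, or a loan set arrived within 8 h."""
    if restored_at is not None and restored_at - failed_at <= RESTORE_LIMIT:
        return True
    if loan_set_at is not None and loan_set_at - failed_at <= LOAN_SET_LIMIT:
        return True
    return False

# Hypothetical incident: failure at 09:00, repair after 6 h, loan set after 7 h.
failure = datetime(2024, 1, 1, 9, 0)
met = service_level_met(
    failure,
    restored_at=failure + timedelta(hours=6),
    loan_set_at=failure + timedelta(hours=7),
)
```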
6.4.2 All storage media installed in the temporary loan set, such as hard disks or tapes, shall
be handled according to the “Singapore Government Security Instructions for the
Handling and Custody of Classified Information” before being taken out of the Authority’s
premises. The Supplier shall refer to Part 2, Section L – Security Requirements for
compliance with the Authority’s policies and standards.
6.4.3 The Supplier shall conduct preventive maintenance services on all hardware equipment
maintained once every six (6) months per equipment. The preventive maintenance
works shall include the following:
(a) Cleaning of equipment and rack ventilation fans.
(b) Running diagnostic programs on the hardware.
(c) Replacing, without cost to the Authority, the equipment where replacement is
necessary for the normal functioning of the System.
(d) Performing tests or adjustments necessary to keep the equipment in original
functional state.
(e) Updating documentation on the rack information such as position of equipment in
the rack.
6.4.4 The Supplier shall submit the preventive maintenance work plan to the Authority for
acceptance at least one (1) month before the commencement of the works. The preventive
maintenance work plan shall indicate the work activity, schedule, duration, and any
other relevant information.
6.4.5 The Supplier shall submit the preventive maintenance report to the Authority within
one (1) week after the inspection.
7. SYSTEM PERFORMANCE TEST FOR DATA SCIENCE PLATFORM
7.1
The Supplier shall refer to Part 2 Section A for the general requirements for
performance test.
7.2
The Tenderer shall note that System Performance Test comprises the following:
(a) Benchmark test;
(b) Stress test;
(c) Endurance test;
(d) Breakpoint test; and
(e) Gate-keeping test.
7.3
The Benchmark Test shall validate system performance with the normal concurrent
load factor (50% of peak concurrent load factor).
7.4
The Stress Test shall validate system performance with peak concurrent load factor
(100% of peak concurrent load factor).
7.5
The Endurance Test shall establish system consistency over prolonged sustained peak
concurrent load factor.
7.6 The Breakpoint Test shall establish system capacity with gradual increment of user
concurrency and determine the maximum user load factor which the system is able
to sustain before reaching the breakpoint criteria defined by the Authority.
7.7
The Gate-keeping test shall validate the effectiveness of the gate-keeping measures to
gracefully handle peak load or breakpoint load.
7.8
The detailed System Performance Test Plan shall be reviewed and approved by the
Authority.
7.9
The Supplier shall engage an independent Test Consultant to perform the application
performance testing services before each system release implementation phase or major
upgrades.
7.10 The Supplier shall note that the appointment of the independent Test Consultant is
subject to the approval of the Authority.
7.11
The Supplier shall bear the cost of all the related consultancy services.
7.12
The Test Consultant shall provide the System Performance Test Plan using the
Authority’s Performance Test Plan template which documents minimally the
background i.e., system overview, objectives, roles & responsibilities of the project
team, testing methodology, key business processes, measurement of success, areas
required for refinement and fine-tuning recommendations.
7.13
The Test Consultant shall propose in the System Performance Test Plan how to generate
the transaction load to test the Systems as though operating under live conditions such
as similar data size, usage scenarios, and user base. The System Performance Test shall
be carried out by the Test Consultant and the results of the test will be verified by the
users.
7.14
The Test Consultant shall note that the System Performance Test Plan shall be
established according to the available resource in the System Performance Test
environment. The Supplier shall provide the results to show that the application meets
the performance targets for the Systems sizing proposed.
7.15
The Supplier shall submit the System Performance Test Plan to the Authority for review
and approval minimally ONE (1) month before the commencement of the System
Performance testing.
7.16
The Tenderer shall note that the Authority’s System Performance Test Plan template
will be provided to the Supplier upon award of this Contract.
7.17 The Test Consultant shall generate sufficient test coverage and a realistic amount of
load together with batch processing on the application and system under test. The
proposed test coverage and suggested amount of test data to be generated are subject
to the Authority’s review and approval.
7.18 The Test Consultant shall note that the test coverage, the type of testing services and
the amount of load generated on the system and application, which are subject to the
approval of the Authority, are dependent on the business requirements of the Systems,
the System application architecture, the size of the user base, the number of concurrent
users, the number of concurrent accesses, the number of concurrent requests, the system
configurations, etc.
7.19
The Supplier shall provide the system performance testing service to conduct the
performance testing for applications, subject to the Authority’s approval.
7.20
The Supplier shall notify and obtain the Authority’s approval on the recommended
testing tools and number of software licenses.
7.21 The Supplier shall be responsible for setting up the System Performance Test
environment, preparing all the data required for the System Performance Test,
refreshing the data where required, and planning, executing and monitoring all related
batch job runs and other necessary work required by the System Performance Test even if
the System Performance Test scope is not awarded to the Tenderer.
7.22
The Supplier shall carry out the performance testing as well as application tuning to
optimise the application performance in the System Performance Test environment.
7.23 The Supplier shall also note that the performance of the Systems shall be baselined
under the required load for user base and concurrent users within the specified System
Response Time.
7.24 The Supplier shall ensure that performance testing is conducted in conjunction
with applications’ tuning to optimise the System performance in the System
Performance Test environment, regardless of whether the System Performance Test is
performed onsite or offsite.
7.25 The Supplier shall ensure that the system performance testing is conducted
without impact on other applications, their schedules or resource requirements, by
executing the tests in a mix of scenarios (such as low load versus high load, and off-peak
periods) and scheduling them such that the environment can be shared while other
activity does not affect the test results.
7.26
Upon completion of application performance tests, the Supplier shall submit and
present the System Performance Test Report to the Authority for review and approval.
The System Performance Test Report shall document test cases with results (expected
and actual), statistics as evidence that system performance tests have been carried out
and the Systems are ready for review by the Authority, problems identified and
recommendations for the application and system fine-tuning.
7.27 The Authority shall review and confirm that the Systems meet the performance
requirements in Benchmark Test, Stress Test and Endurance Test as defined in these
Tender Specifications and whether the corrective actions to be undertaken by the
Supplier to meet requirements are acceptable to the Authority.
7.28
The Supplier shall note that if the Systems are unable to meet the performance standards
required by Authority, resulting in the need to conduct more than additional rounds of
performance test, the Supplier shall be held fully accountable for any additional
resources, inclusive of software and hardware, that may be required to conduct
subsequent round(s) of performance test.
7.29 In the event of incompatibilities or system degradation, the Supplier shall bear the
responsibility to propose and implement solution(s) including adding system resources
to meet the Systems’ requirements. The Supplier shall bear all the costs of
implementing the solution(s) agreed by the Authority.
7.30
The Supplier shall comply with system performance test parameters as stated in the
table below.
Parameters                                         Value
Number of users at normal concurrent load factor   15 users
Number of users at peak concurrent load factor     30 users
CPU Utilisation                                    Shall not exceed 50% for all benchmark test scenarios.
                                                   Shall not exceed 80% for all stress test scenarios.
Memory Utilisation                                 Shall not exceed 70% for stress test scenarios after reaching peak load.
Maximum Think Time*                                Not more than 20 seconds for all transactions
Duration for Benchmark and Stress Tests            At least 60 minutes
Duration for Endurance Test                        At least 8 hours
*The think time varies based on the business process defined in system performance tests and
shall be determined during the development of the system performance test plan.
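By way of illustration only, the performance test parameters above could be encoded and checked against measured results as sketched below. The structure `LIMITS`, the function `check_run` and the sample measurements are assumptions made for this example; the memory reading is taken to be the value measured after peak load is reached.

```python
PEAK_USERS = 30                        # users at peak concurrent load factor
NORMAL_USERS = PEAK_USERS * 50 // 100  # normal load is 50% of peak, i.e. 15 users

# Limits per test type: (max CPU %, max memory % or None, min duration in minutes).
LIMITS = {
    "benchmark": (50, None, 60),
    "stress": (80, 70, 60),
}

def check_run(test_type, cpu_pct, mem_pct, duration_min):
    """Return the list of limit violations for one test run (empty means pass)."""
    max_cpu, max_mem, min_duration = LIMITS[test_type]
    violations = []
    if cpu_pct > max_cpu:
        violations.append(f"CPU {cpu_pct}% exceeds {max_cpu}%")
    if max_mem is not None and mem_pct > max_mem:
        violations.append(f"memory {mem_pct}% exceeds {max_mem}%")
    if duration_min < min_duration:
        violations.append(f"duration {duration_min} min is below {min_duration} min")
    return violations

# Hypothetical measurements from one stress test run.
stress_result = check_run("stress", cpu_pct=85, mem_pct=65, duration_min=60)
```

The Endurance Test duration (at least 8 hours) and the think time limit are not covered by this sketch and would need their own checks.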
7.31
The System shall be fault tolerant. An example of fault tolerance is the System’s ability
to handle graceful degradation of the system performance beyond the minimal
performance requirements.
7.32
Unless otherwise approved by the Authority, the Supplier shall work within the network
and security policy and guidelines stipulated by the Authority.
8. TRAINING REQUIREMENTS FOR DATA SCIENCE PLATFORM
8.1
The Supplier shall refer to Part 2 Section K for general requirements for training.
8.2 The Supplier shall provide at least one (1) session of training (either physical or virtual,
subject to approval by the Authority) for up to 10 user representatives (comprising
power users and end users) from the tenants onboarded to the Data Science Platform
as part of System Commissioning.
8.3 The scope of training shall minimally include:
Both Power Users and End Users
(a) Concept of governance model for the data, code, and contents
Power Users
(b) How to create and set up new tenants?
(c) How to manage user groups and users in Data Science Platform?
(d) How to manage and configure access controls in Data Science Platform?
(e) How to monitor virtual machine performance, site status and user activities?
(f) How to monitor the system performance of models?
End Users
(g) How to connect and query a data source?
(h) How to import and export data?
(i) How to do row and column masking?
(j) How to set user access and control for data assets?
(k) How to collaborate on code development with multiple teams of users?
(l) How to export code out?
(m) How to perform data cleaning with R and Python code?
(n) How to conduct model deployment, model performance tracking and machine
learning operations?
(o) How to track the performance and load of the APIs or Apps (ML models)?
(p) How to send alerts or notifications when APIs or Apps fail?