Competency framework for team working on Big Data
All team members
Every team member must be accomplished in analytical methods and have an appreciation and
understanding of information presented in mathematical terms; be able to extract the key messages
or underlying trends present within data; and be able to present statistical results and concepts,
both orally and in writing, in a confident and professional manner. All team members should
understand and have some knowledge of:
- Data science and data science methods, including those used in machine learning, data mining,
  artificial intelligence and computational statistics.
- Statistics and statistical theory (e.g. representativeness: target population, sampling frame,
  sampling, weighting, inference; measurement: validity, reliability; modelling).
- High Performance Computing platforms such as Hadoop, Spark, Hive, Pig and others.
- Data visualization methods.
- Potential risks to data security and the data security standards to be observed at all times.
All team members should also possess the following behavioural skills:
- A high level of situational awareness and the ability to exploit it in the work environment.
- A strong results orientation.
- Innovative, with the ability to take creative action.
- Excellent judgement, with the ability to weigh alternatives.
Specific team members
ICT skills
- Ability to code algorithms for data processing in suitable technologies/environments.
- Strong ability to work with programming/scripting languages, e.g. bash, R and Python, for
  data preparation.
- Ability to develop software for more advanced data processing tasks (in a programming
  language of choice, e.g. Java, Python, R).
- Can use High Performance Computing platforms efficiently.
- Can operate on structured and unstructured data with a wide range of tools (from software
  development to computational packages).
- Knowledge of Linux operating systems.
Data analytical skills
- Can assimilate information from a range of sources.
- Organizes (complex) information to make it accessible.
- Can (quickly) analyse complex data, identifying what is relevant and seeing similarities between
  data and/or information.
- Can combine various data processing techniques to achieve a given analytical task, and can
  analyse issues/problems from different angles.
Methodologist skills
- Ability to apply methods to assess representativeness of data given a target population.
- Ability to apply methods to assess the validity and reliability of a measured variable.
- Ability to apply noise reduction methods and techniques.
- Ability to determine the disclosure risks of data.
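As an illustration of the first methodologist skill, the sketch below shows one simple way a representativeness check might look: compare the share of each stratum in a data source with its known share in the target population, then derive post-stratification weights. This is only a minimal example under stated assumptions. The age bands, population shares, sample and function names are all hypothetical and are not part of the framework.

```python
from collections import Counter


def stratum_shares(labels):
    """Return the share of each stratum label in a list of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {stratum: count / total for stratum, count in counts.items()}


def poststratification_weights(sample_shares, population_shares):
    """Weight for each observed stratum = population share / sample share."""
    return {
        stratum: population_shares[stratum] / share
        for stratum, share in sample_shares.items()
    }


# Hypothetical target population shares by age band (assumed known,
# for example from a census or register).
population_shares = {"15-34": 0.30, "35-64": 0.50, "65+": 0.20}

# Hypothetical big data source that over-covers younger people.
sample = ["15-34"] * 50 + ["35-64"] * 40 + ["65+"] * 10

sample_shares = stratum_shares(sample)
weights = poststratification_weights(sample_shares, population_shares)

# Largest absolute gap between sample and population shares:
# a crude indicator of how unrepresentative the source is.
max_gap = max(
    abs(sample_shares.get(stratum, 0.0) - share)
    for stratum, share in population_shares.items()
)

for stratum in population_shares:
    print(stratum, round(sample_shares.get(stratum, 0.0), 2), round(weights[stratum], 2))
print("largest share gap:", round(max_gap, 2))
```

Weights above 1 mark strata that the source under-covers, and weights below 1 mark over-covered strata. In practice a methodologist would also need to consider strata that are missing from the source entirely, which this sketch does not handle.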