Invited lecture to the Conference on Computers in Sport, Israel, Netanya, Wingate Institute, 1992
SYSTEMS THEORY APPROACH TO USING STATISTICS IN
SPORT PERFORMANCE ANALYSIS
Petr Blahuš
Faculty of Physical Education and Sport
Charles University, Prague
1. Introduction
The systems theory paradigm is considered here as a general methodological approach to
using statistical methods for the analysis of sport performance. There are two main
motivations for this:
(i) Everyday experience shows that statistical methods are often applied
without a clear methodological conception. The systems theory approach helps to
formulate the problem to be solved more clearly, typically as a problem of an
input-output (training-performance) statistical relationship or as a network of
statistical relationships.
(ii) There are also more practical aspects, given the huge number of different
statistical methods and statistical software packages. Here the systems theory
approach helps to choose a proper design for a future investigation, or to select
appropriate statistical methods for the analysis of given sport training data with
respect to the desired aim.
The systems theory conception stresses that a researcher or analyst, working jointly
with the coach, should formulate the problem to be solved in terms of the systems
theory paradigm of input, output and further types of system variables.
2. An athlete - statistical unit or dynamic system?
The very first assumption behind any application of statistics is an explicit
definition of the mutual correspondence between the pairs of notions:
observed data --- values of a variable
athlete's attribute --- statistical random variable
This seemingly obvious correspondence very often relies on intuition alone or is
taken for granted. This may become a source of blunders, for instance if the data were
represented by fallible pseudo-categories that are not actually disjoint and
exhaustive in the sense of random events.
If a researcher wishing to use a statistical method avoids mistakes of this kind, he
usually starts to deal with purely statistical problems such as normality,
linearity etc. We find such an approach too formalistic, and we assume that
some more general questions of scientific inquiry have to be answered first.
From the viewpoint of applications, the expression "random variable x takes on a
value" means that the value is somehow connected with a statistical unit E. In our
case the unit, sometimes also called the "unit of observation", is typically an
investigated athlete. The idea of joint observation of several different variables
x, y, z, ... that take on their values over a sample of athletes seems to follow the
old Aristotelian paradigm of attributes connected to entities; see Fig. 1 (a). It is
typical of this conception that attributes are considered as "labels" that are
indifferent to the possible dynamic behavior of the corresponding entity E.
-------------------------
insert Fig. 1 around here
-------------------------
Applied to performance analysis, the above-mentioned considerations mean that from
now on
(1) an athlete should no longer be regarded as a "statistical unit" with observed
"attributes" but as a system with possibly dynamic behavior;
(2) the investigated attributes - such as performance, amount of training load, level
of abilities etc. - cannot be taken as indifferent qualities or quantities but have to
be divided into several types: at least input variables and output variables, and
possibly further types of system variables such as internal state variables,
depending on the kind of problem to be solved.
3. Problems in monitoring performance as system tasks
A typical problem is the input-output relationship, i.e. the training-performance
relationship. In the simplest case we can treat it as a direct relationship,
i.e. one that is not transmitted through variables inside the system. If,
moreover, the relationship is assumed to be linear, we can describe this statistical
problem in terms of a multivariable linear system by a multiple regression equation,
as is the case in Fig. 2.
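As an illustration of the direct linear input-output case, the following is a minimal sketch of fitting a multiple regression of performance on several training inputs by least squares. The data, the two hypothetical training-load indices and the coefficient values are assumptions made up for the example; they do not come from Fig. 2.

```python
import random


def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so that row operations act on both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_multiple_regression(inputs, output):
    """Least-squares coefficients [b0, b1, ...] of output = b0 + sum(bj * xj)."""
    rows = [[1.0] + list(x) for x in inputs]  # prepend the intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, output)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data: two training-load indices per observation.
rng = random.Random(1)
true_b = [10.0, 0.5, -0.2]  # assumed coefficients, for illustration only
inputs = [(rng.uniform(0, 20), rng.uniform(0, 20)) for _ in range(200)]
output = [true_b[0] + true_b[1] * x1 + true_b[2] * x2 + rng.gauss(0, 0.5)
          for x1, x2 in inputs]

coefficients = fit_multiple_regression(inputs, output)
print([round(b, 2) for b in coefficients])
```

The estimated coefficients should lie close to the assumed ones. With real training data the inputs would be the recorded load indices and the output the measured performance.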
-------------------------
insert Fig. 2 around here
-------------------------
In modeling or representing a practical problem in the systems theory way, it is
necessary to be aware of what kind of variability is inherent in the data we use to
estimate the relationships and the parameters of an equation such as the regression
equation in Fig. 2. It makes a fundamental difference whether, say, the input-output
relationship is studied on the basis of the intraindividual variability of one
athlete, investigated diachronically over time, or whether the same relationship is
investigated interindividually (among different athletes) and synchronically (in one
time "slice"). The latter kind of investigation is known as correlational
research. It is more frequent and more convenient, but it tells us less about
monitoring and planning individual training, because this type of relationship is
too far from the ideal of the desired cause-effect influence.
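The difference between the two kinds of variability can be made concrete with simulated data. The sketch below assumes, purely for illustration, that training load has a positive effect within every athlete, while athletes with a higher habitual load happen to have a lower personal baseline. All numbers are invented.

```python
import random


def slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


rng = random.Random(0)
n_athletes, n_weeks = 50, 40
within_effect = 0.3  # assumed effect of load on performance inside each athlete

loads, perfs = [], []
for _ in range(n_athletes):
    mean_load = rng.gauss(100, 15)
    # Assumption: a higher habitual load goes with a lower personal baseline.
    baseline = 100 - 0.8 * mean_load + rng.gauss(0, 2)
    x = [mean_load + rng.gauss(0, 5) for _ in range(n_weeks)]
    y = [baseline + within_effect * xi + rng.gauss(0, 1) for xi in x]
    loads.append(x)
    perfs.append(y)

# Diachronic, intraindividual: one slope per athlete over time, then averaged.
intra = sum(slope(x, y) for x, y in zip(loads, perfs)) / n_athletes

# Synchronic, interindividual: all athletes compared in a single time slice.
week = 0
inter = slope([x[week] for x in loads], [y[week] for y in perfs])

print(round(intra, 2), round(inter, 2))
```

Under these assumptions the intraindividual slope recovers the positive within-athlete effect, whereas the cross-sectional slope is negative. The same data therefore support opposite conclusions depending on which variability is analyzed.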
Of course, there are other fruitful kinds of problems and relationships besides
input-output analysis. For instance, let us consider a sport discipline where output
performance cannot be directly measured or observed at sufficiently short
intervals to evaluate the effects of training for short-term feedback control. Such a
case can be found e.g. in gymnastics with its two or three competitions a year. In
some situations performance is not even defined, as in the case of individual
player performance in ice hockey, basketball and other team games. (A
counter-example is the long jump: it can be measured directly, and at least weekly,
to follow changes in performance.) In sport disciplines where the
performance is "difficult to measure" we have to solve the problem of constructing
auxiliary indirect indicators of output performance. That is, we have several
indirect criteria of performance and we have to solve tasks such as the following:
to reduce the number of indirect output indicators
to find the most relevant outputs
etc.
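One possible way to approach both tasks, offered here only as a sketch and not as the method of this paper, is to look at the first principal component of the correlation matrix of the indirect indicators. Indicators with small weights contribute little to the common component and are candidates for removal. The simulated indicators and their noise levels are assumptions.

```python
import random


def correlation_matrix(columns):
    """Pearson correlations between equally long data columns."""
    n = len(columns[0])
    standardized = []
    for col in columns:
        mean = sum(col) / n
        sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
        standardized.append([(v - mean) / sd for v in col])
    return [[sum(a * b for a, b in zip(ci, cj)) / n for cj in standardized]
            for ci in standardized]


def first_principal_component(matrix, iterations=200):
    """Dominant eigenvector of a symmetric matrix by power iteration."""
    vec = [1.0] * len(matrix)
    for _ in range(iterations):
        new = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        norm = sum(v * v for v in new) ** 0.5
        vec = [v / norm for v in new]
    return vec


# Hypothetical data: three noisy indicators of one unobserved performance
# level, plus a fourth indicator that is unrelated to it.
rng = random.Random(2)
n = 300
latent = [rng.gauss(0, 1) for _ in range(n)]
noise_sd = [0.3, 0.5, 1.0]
indicators = [[p + rng.gauss(0, sd) for p in latent] for sd in noise_sd]
indicators.append([rng.gauss(0, 1) for _ in range(n)])

weights = first_principal_component(correlation_matrix(indicators))
# Indicator indices ordered from most to least relevant.
ranking = sorted(range(len(weights)), key=lambda i: -abs(weights[i]))
print(ranking)
```

The unrelated indicator ends up last in the ranking, and the least noisy indicator receives a larger weight than the noisiest one.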
But more complicated problems arise if we take into account the more realistic
situation, namely that the system has an internal structure which intermediates the
connection between input and output, i.e. between training and performance.
4. The internal structure and latent variables
From the classical systems theory point of view, the intermediating structure is
represented by internal state variables s, as pictured in Fig. 3. The figure shows
at least two possibilities:
(a) input x influences internal state s, which in turn influences the output y
(b) the past state s (at time t-1) interacts with the past input to yield a new
present state, which produces the present output in interaction with the present input.
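For the linear case, the two possibilities can be written down as equations. The notation below is an assumption made for this sketch: the coefficient names a, c, A, B, C and D are not taken from Fig. 3.

```latex
\begin{align}
\text{(a)}\quad & s = a\,x, & y &= c\,s \\
\text{(b)}\quad & s_t = A\,s_{t-1} + B\,x_{t-1}, & y_t &= C\,s_t + D\,x_t
\end{align}
```

In statistical use each equation would also carry an error term. In the multivariable case x, s and y are vectors and the coefficients are matrices.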
-------------------------
insert Fig. 3 around here
-------------------------
From the viewpoint of sport training, the internal state s could be interpreted as
the "general state of preparedness" of an athlete or as the "state of development of
motor abilities" influencing the athlete's performance. This internal state interacts
with the last training load to yield a new value of performance.
If the training-state-performance paradigm works as a linear dynamic system, then its
control can be described by the linear equations indicated in Fig. 3. For the purpose
of statistical analysis of training data, the system can be described by a set of
regression equations. In these equations some of the variables - those hidden inside
the system - have the character of directly unobservable, or latent, variables.
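The behavior of such a linear dynamic system can be sketched with a scalar simulation. The coefficient values are hypothetical and were chosen only to show how a single training impulse carries over into later performance through the internal state.

```python
def simulate(loads, a=0.8, b=0.5, c=1.0, d=-0.2, s0=0.0):
    """Scalar linear dynamic system (all coefficients are assumed values):
    s[t] = a * s[t-1] + b * x[t-1]   (state update)
    y[t] = c * s[t] + d * x[t]       (output)
    """
    state, previous_load = s0, 0.0
    states, outputs = [], []
    for x in loads:
        state = a * state + b * previous_load
        states.append(state)
        outputs.append(c * state + d * x)
        previous_load = x
    return states, outputs


# One training impulse in the first period, then rest.
loads = [10.0] + [0.0] * 5
states, outputs = simulate(loads)
print([round(y, 3) for y in outputs])
```

With these assumed coefficients, the output dips in the period of the load itself and then rises and decays geometrically. In real data the state is not recorded, which is why it has to be treated as a latent variable.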
The latent variables have to be estimated by a diagnostic procedure. The procedure
includes statistical modeling in terms of latent variable models such as
factor analysis and others. For instance, the case in Fig. 3 (a) can be modeled by
the interbattery factor analysis model:
the "state of preparedness", including the level of motor abilities, can be estimated
by latent factor scores of so-called interbattery factors. These are the factors which
intermediate the connection of the input variables in battery one with the output
variables in battery two, i.e. the connection between the training load indexes and
the performance criteria.
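In matrix form, the interbattery idea can be sketched as follows. The notation is assumed for this illustration: x and y are the standardized variables of the two batteries, f are uncorrelated common factors with unit variance, and the error terms are uncorrelated with f and with each other across batteries.

```latex
\begin{align}
x &= \Lambda_x f + e_x, \\
y &= \Lambda_y f + e_y, \\
R_{xy} &= \Lambda_x \Lambda_y^{\top}
\end{align}
```

Under these assumptions the factors are determined by the cross-battery correlation matrix R_xy alone, so they capture only what the training load indexes and the performance criteria have in common.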
5. Statistical models for latent internal structure
Several sophisticated models are available for theories and hypotheses about more
complicated internal structures intermediating the connection between the input
training and the output performance. One of them is the well-known LISREL
model presented in Fig. 4. The model (cf. Jöreskog and Sörbom 1988) can be seen as
a straightforward generalization of the system of Fig. 2. Just imagine that the data x
on input training load are contaminated by errors, so that the input is observable
only indirectly through auxiliary indicators. Thus we have to estimate the "true"
input through the so-called measurement model II. The same holds for the "true"
output, which is observable only through indirect auxiliary performance criteria y
and their measurement model I. What we are interested in, however, is the
relationship between true performance and true training.
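In the usual LISREL notation, which may differ from the labeling used in Fig. 4, the three parts of the model can be written as follows. Here ξ stands for the latent "true" training and η for the latent "true" performance.

```latex
\begin{align}
\eta &= B\,\eta + \Gamma\,\xi + \zeta && \text{(structural equations)} \\
y &= \Lambda_y\,\eta + \varepsilon && \text{(measurement model for } y\text{)} \\
x &= \Lambda_x\,\xi + \delta && \text{(measurement model for } x\text{)}
\end{align}
```

The matrix Γ carries the relationship of interest, that of true training to true performance. The measurement errors ε and δ are separated from it by the two measurement models.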
-------------------------
insert Fig. 4 around here
-------------------------
In some even more complicated models we can proceed further, as in the case of Fig.
5. The "true" input-output relationship can again be intermediated by an internal
state (a), and the internal state can have its own higher-level internal structure of
latent variables connected to each other by chains of internal relationships (b).
-------------------------
insert Fig. 5 around here
-------------------------
The idea of variables that influence sport performance by acting in chains resembling
"causal chains" is more realistic than the older approach, which used multiple
regression to combine "causal variables" without any hierarchy of cause-effect
sequences. This seems to be the very reason why former correlational studies, hunting
for significant correlations of various variables with performance, were able to add
so little to the knowledge about monitoring training and performance.
The above-mentioned chains of influence on sport performance can be modeled by
different statistical tools from the family of path analytic models with latent
variables. These can also be formulated in terms of LISREL, or in the very helpful
RAM model (McArdle's so-called Reticular Action Model, cf. McArdle and McDonald 1984),
jointly with the very general COSAN model (McDonald 1978).
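As a compact reminder of the RAM formulation, the model can be sketched as follows. The vector v collects all variables, observed and latent; A holds the directed paths, S is the covariance matrix of the residual vector u, and F is a filter matrix that selects the observed variables.

```latex
\begin{align}
v &= A\,v + u, \\
\Sigma &= F\,(I - A)^{-1}\,S\,(I - A)^{-\top}F^{\top}
\end{align}
```

Here Σ is the covariance matrix of the observed variables implied by the model. A chain of influence corresponds to a sequence of nonzero entries in A.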
6. Monitoring of training as a system control process
Another question is the process of monitoring performance. From the point of view of
systems control theory, we can accept the idea that the above-mentioned conception of
a monitored athlete actually deals with a controlled system. The controlled system
is a particular subsystem within a complete system of control. Within the
system of control it is the controlling subsystem which selects appropriate stimuli
that function as controlling inputs to the controlled system. Appropriate controlling
inputs can be applied only on the basis of knowledge of three streams, or blocks,
of information, namely about input, output and internal state. The information then
proceeds through storing, processing and evaluation, as illustrated in
Fig. 6.
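The control loop can be illustrated by a toy simulation. The controlled system is assumed here to be a first-order linear response to training load, and the controlling subsystem is a simple proportional rule. Both are hypothetical stand-ins for the blocks of Fig. 6, and all numerical values are invented.

```python
def athlete_response(state, load, a=0.8, b=0.5):
    """Controlled system: assumed first-order carry-over of training load."""
    return a * state + b * load


def plan_next_load(target, observed, current_load, gain=0.2, max_load=50.0):
    """Controlling subsystem: raise the load when performance is below target."""
    proposed = current_load + gain * (target - observed)
    return min(max(proposed, 0.0), max_load)


target = 20.0
state, load = 0.0, 0.0
log = []  # stored information about input and output
for week in range(80):
    state = athlete_response(state, load)  # system reacts to the last load
    performance = state                    # output as observed by the coach
    log.append((week, load, performance))  # storing
    # Processing and evaluation lead to a new controlling input.
    load = plan_next_load(target, performance, load)

print(round(log[-1][1], 2), round(log[-1][2], 2))
```

In this sketch performance overshoots the target at first and then settles close to it, while the load settles at the level needed to maintain it. A real monitoring system would also use an estimate of the internal state, as the text notes.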
-------------------------
insert Fig. 6 around here
-------------------------
-------------------------
insert Fig. 7 around here
-------------------------
The parallel applied situation of monitoring sport training can be seen in Fig. 7.
There the athlete's monitoring is modeled as in the preceding Fig. 6. The athlete
represents the controlled system with its internal structure, and the control process
is realized through activities such as training data collection, storing, processing
and evaluation. Finally, it leads to the creation of a modified training plan and new
training load stimuli applied to the athlete's workout as controlling inputs.
References
Blahuš, P. (1974). De la causalité dans l'analyse de correlation et factorielle des
aptitudes. Scientia Paedagogica Experimentalis, 11, 1, 24-33.
Blahuš, P. (1982). Methodological problems of latent variable models. Acta
Universitatis Carolinae G., 23/1, 25-37.
Blahuš, P., Hruby, J., Kvapil, J., & Paichl, J. (1988). Systems theory approach to using
statistics in social sciences. Praha: Charles University Press.
Galtung, J. (1968). Diachronic correlation, process analysis, and causal analysis.
UNESCO Seminar on Developmental Sociology, Rio de Janeiro, July 1968.
Jöreskog, K.G., & Sörbom, D. (1988). LISREL 7: A guide to the program and
applications. Chicago: SPSS, Inc.
Klir, G.J. (1972). Trends in general systems theory. N. York: Wiley.
McArdle, J.J., & McDonald, R.P. (1984). Some algebraic properties of the RAM logic
for structural equation model specification. Brit. J. math. statist. Psychol. 37, 234-251.
McDonald, R.P. (1978). A simple comprehensive model for the analysis of
covariance structures. Brit. J. math. statist. Psychol. 31, 59-72.
McDonald, R.P. (1985). Factor analysis and related methods.
Hillsdale, NJ: Lawrence Erlbaum Ass.
Mesarovic, M.D., & Takahara, T. (1975). General systems theory: mathematical
foundations. N. York: McGraw.
Morrison, D.F. (1967). Multivariate statistical methods. N. York: McGraw.
TITLES TO FIGURES:
Fig. 1 An athlete as an object of statistical investigation
Fig. 2 Input - output, training - performance relationship described by linear regression
equation
Fig. 3 Internal state intermediating the training - performance relationship
Fig. 4 LISREL model description of internal structure
Fig. 5 Further possible conceptions of the internal structures: (a) "true" training and
"true" performance, (b) path analytic structure
Fig. 6 General control system
Fig. 7 Control system applied to monitoring of training performance