Research Report

Measuring Physical Fitness in Children Who Are 5 to 12 Years Old With a Test Battery That Is Functional and Easy to Administer

Ingunn Fjørtoft, Arve Vorland Pedersen, Hermundur Sigmundsson, Beatrix Vereijken

Background. Valid and reliable measures of children’s physical fitness are necessary for investigating the relationship between children’s physical fitness and children’s health.

Objective. The objective of this study was to estimate the feasibility, internal consistency, convergent construct validity, and test-retest reliability of a new, functional, and easily administered test battery for measuring children’s physical fitness.

Design. The study was a cross-sectional descriptive survey applying physical fitness tests across age groups 5 to 12 years.

Methods. Each of the 9 items in the test battery consists of a compound motor activity that recruits various combinations of endurance, strength (force-generating capacity), agility, balance, and motor coordination: standing broad jump, jumping a distance of 7 m on 2 feet, jumping a distance of 7 m on one foot, throwing a tennis ball with one hand, pushing a medicine ball with 2 hands, climbing wall bars, performing a 10 × 5 m shuttle run, running 20 m as fast as possible, and performing a reduced Cooper test (6 minutes). The test battery was administered to 195 children (aged 5–12 years) from 4 schools and kindergartens in Norway.

Results. Overall, the children in each age group were able to perform all of the test items, indicating the suitability of the test battery for children as young as 5 years of age. With increasing age, total scores improved linearly, indicating the adequate sensitivity of the test battery for the age range examined in this study. Furthermore, even with the modest sample size used in this study, total scores were normally distributed, thereby fulfilling the necessary assumptions of most parametric statistical procedures. For investigating the reliability of the test battery, 24 children (mean age=8.6 years) in one class were retested 1 week later. Test-retest reliability was fair to good, with intraclass correlation coefficients for individual test items and total score ranging from .54 to .92.

Limitations. The survey was limited to samples of 5- to 12-year-old Norwegian children. Larger samples in each age group are essential for establishing age- and sex-specific norms.

Conclusions. These promising results warrant further development of the test battery, including standardization and normalization based on a large, representative sample.

I. Fjørtoft, PhD, Faculty of Arts, Folk Culture and Teacher Education, Telemark University College, 3670 Notodden, Norway. Address all correspondence to Dr Fjørtoft at: ingunn.fjortoft@hit.no.
A.V. Pedersen, PT, MSc, Sør-Trøndelag University College, Trondheim, Norway.
H. Sigmundsson, PhD, Norwegian University of Science and Technology, Trondheim, Norway.
B. Vereijken, PhD, Norwegian University of Science and Technology.

[Fjørtoft I, Pedersen AV, Sigmundsson H, Vereijken B. Measuring physical fitness in children who are 5 to 12 years old with a test battery that is functional and easy to administer. Phys Ther. 2011;91:1087–1095.]

© 2011 American Physical Therapy Association

This article was published ahead of print on May 26, 2011, at ptjournal.apta.org.

Physical Therapy, Volume 91, Number 7, July 2011, p 1087. Downloaded from https://academic.oup.com/ptj/article/91/7/1087/2735048 by guest on 31 May 2021.

Physical Fitness Test Battery for Children

Over the last few decades, children’s physical activity levels have dramatically changed.1–3 Outdoor physical play is increasingly being replaced by less physical indoor activities,4–6 children are increasingly being driven to school by car or bus instead of cycling or walking, and participation in organized sports is declining.6–8 In recent years, the possible consequences of these changes for children’s overall development and health have attracted much attention from the media, scientific researchers, and policy makers. However, longitudinal research on the relationships among physical activity, physical ability, and health is scarce, and cross-sectional studies have not generated consistent results.3,9 Therefore, still in question are how the frequency, intensity, and duration of physical activity in children affect their physical fitness and how decreasing levels of physical activity may be related to possible changes in physical fitness and to ensuing health problems later in life, such as obesity, diabetes, osteoporosis, back pain, cardiovascular disease, and cancer.10,11

For investigating such relationships, reliable tests that can establish children’s physical fitness in large population samples are needed. Although several tests for determining physical fitness in adults are available, these are generally inappropriate for determining physical fitness in children.12 Extant tests of physical fitness typically focus on isolated physiological components, such as muscle strength (force-generating capacity) or aerobic endurance, that are tested with more or less advanced technological equipment in controlled laboratory settings13,14 (see also Kemper and van Mechelen12). These tests are based mostly on test batteries for adults and may be ill-suited for testing children because they place high demands on endurance and the willingness and ability of participants to follow strict instructions. These characteristics make the most reliable tests of physical fitness particularly unfeasible for testing young children. In addition, laboratory tests based on direct measures of physiological variables are expensive and require highly trained experimenters; thus, they are not feasible for use with large groups of participants.7,12,15

A further disadvantage of most existing tests is that they attempt to divide a complex attribute into constituent components and measure each of the components separately.16,17 The theoretical problem with this endeavor is that researchers do not know what the constituent components of a complex skill are or how they collectively make up the complex skill. In other words, investigators know neither the variables nor the function making up physical fitness.

In this article, we describe a new, functional test battery that aims to provide a reliable, objective quantification of children’s physical fitness. In contrast to many extant tests, it does not attempt to define and then measure constituent components. Rather, it focuses on compound activities that recruit various combinations of multiple factors, such as strength, endurance, motor coordination, balance, and agility.18–20 Furthermore, the test battery focuses on common activities that are included in most children’s everyday play activities. This design reduces the cognitive component of the test items and more easily incites and sustains children’s motivation to participate and perform as well as possible. Our final consideration was that to be applicable in larger studies, the test battery should be easy to administer and not require specialized training of experimenters or equipment beyond what is normally available in most gymnasiums. Furthermore, whereas previous tests were divided into several age bands with different test items for each age band, the test battery that we describe includes the same test items for all ages (5–12 years). This design enables the longitudinal monitoring of children’s physical fitness.

In the present study, we investigated the applicability of our test battery for children who were 5 to 12 years of age. We had 4 specific goals. First, we examined the feasibility of the test battery for children as young as 5 years of age and assessed whether the test battery could distinguish performance across all age groups. The focus was on whether the youngest children would understand and correctly perform the various tasks and whether the tasks were difficult enough to distinguish performance in the oldest age groups (11 and 12 years). Second, we estimated the internal consistency of the individual test items and the relationship between individual test item scores and the total test score. By using test items that encompassed individual components of physical fitness in different combinations and with some overlap, we aimed for a compound measure of the term “physical fitness.”21 Third, we estimated the convergent construct validity of the test battery by comparing scores of children in one class on the test battery with evaluations of physical fitness by their physical education teacher. Finally, we estimated the test-retest reliability of the test battery.

Table 1. Characteristics of Participating Children. Columns give the age group (y), the total number of children, the number of girls/boys, and the mean (X) and SD of age (y), height (cm), weight (kg), and body mass index (kg/m2).
Age        Total No.    No. of       Age (y)       Height (cm)    Weight (kg)    Body Mass Index (kg/m2)
Group (y)  of Children  Girls/Boys   X     SD      X      SD      X     SD       X     SD
5          21           11/10        5.5   0.29    114.0  5.63    19.9  2.63     15.4  1.60
6          26           13/13        6.5   0.30    120.4  5.07    24.5  4.20     16.8  2.01
7          34           17/17        7.4   0.28    128.4  6.44    27.5  4.12     16.6  1.46
8          29           14/15        8.5   0.26    132.1  5.61    31.2  4.90     17.8  1.84
9          24           11/13        9.4   0.30    138.1  6.49    33.7  4.52     17.6  1.58
10         20           13/7         10.4  0.26    140.7  7.10    33.4  6.35     16.7  1.89
11         16           7/9          11.5  0.36    150.8  6.63    42.4  10.60    18.5  3.22
12         25           15/10        12.4  0.37    156.5  5.92    47.0  7.81     19.1  2.38

Method

Participants

Information about the study and the test items and informed consent forms were distributed in 2 kindergartens (1 in southern Norway and 1 in central Norway) and 3 primary schools (1 in southern Norway and 2 in central Norway). A total of 195 children (101 girls and 94 boys) with no obvious abnormalities participated in the project; they were approximately 5 to 12 years of age (X=8.3, SD=2.21, minimum=5.0, maximum=13.1). Table 1 shows the distribution of the children across the age groups and the anthropometric measures height, weight, and body mass index.

Test Items and Materials

The test battery consisted of 9 test items that represent typical everyday activities for children, namely, jumping, throwing, climbing, and running. Most of the test items have appeared in other tests or test batteries as well, such as the EUROFIT,14 the Allgemeiner Sportmotorischer Test für Kinder,22 the Folke Bernadotte Hemmet,23 and the Fitnessgram.24 The test item “climbing wall bars” was designed specifically for our test battery.

The 9 test items were as follows:

1. Standing broad jump. The child stands with his or her feet parallel and shoulder width apart behind a starting line. Upon a signal, the child swings the arms backward and forward and jumps with both feet simultaneously as far forward as possible. The test item score (better of 2 attempts) is the distance between the starting line and the landing position (measured in centimeters).

2. Jumping a distance of 7 m on 2 feet as fast as possible. The test item score (better of 2 attempts) is the time needed to cross the distance (measured in seconds).

3. Jumping a distance of 7 m on one foot (the child is free to choose which foot) as fast as possible. The test item score (better of 2 attempts) is the time needed to cross the distance (measured in seconds).

4. Throwing a tennis ball with one hand (the child chooses which hand) as far as possible. The child stands with the contralateral foot in front of the ipsilateral foot. The test item score (better of 2 attempts) is the distance thrown (measured in meters).

5. Pushing a medicine ball (1 kg) with 2 hands as far as possible. The starting position is with the feet parallel to each other and shoulder width apart, with the ball held against the chest. The test item score (better of 2 attempts) is the distance achieved (measured in meters).

6. Climbing up wall bars, crossing over 2 columns to the right, and climbing down the fourth column as fast as possible. Each column of the wall bars is 2.55 m high and 0.75 m wide. The test item score (better of 2 attempts) is the time to completion (measured in seconds).

7. Shuttle run. The test item score is the time required to run 10 × 5 m (measured in seconds). If the child makes a procedural error, the performance is interrupted and the test item is repeated.

8. Running 20 m as fast as possible. The child starts in a standing position. The test item score is the time required to run the distance (measured in seconds). If the child makes a procedural error, then the performance is interrupted and the test item is repeated.

9. Reduced Cooper test. The child runs or walks around a marked rectangle measuring 9 × 18 m (the size of a volleyball field) for 6 minutes.
Both running and walking are allowed. The test item score is the distance covered in 6 minutes (measured in meters).

The following materials were needed for administering the test items: masking tape, ruler, stopwatch, tennis ball, medicine ball (1 kg), wall bars at least 4 columns wide, and gym mats.

Procedure

Children were tested individually. Each test item was explained and demonstrated before the child started. Except for the 3 running tests, each test item was performed twice, with the better attempt scored. If a child made a procedural error, instructions and demonstrations were repeated, and the child made a new attempt. If a second procedural error occurred or if a child could not perform the test item, the test item was scored as missing. In the present study, 40 children had a total of 51 missing scores.

In addition, 24 children in one class at a primary school in central Norway (mean age=8.6 years, SD=0.3) were tested a second time with the same test battery 1 week later to establish test-retest reliability. The class was chosen to reflect most closely the average age in the entire sample.

Data Reduction and Analysis

To express the child’s overall performance in one score, we calculated a total test score. To this end, the test item scores that were measured as the time needed to accomplish the test items were first converted to 1/score, such that higher scores always indicated better performance than lower scores. After conversion, the scores on all test items were normally distributed, as indicated by one-sample Kolmogorov-Smirnov tests. Subsequently, all test item scores were transformed into z scores on the basis of the sample mean and standard deviation so that each test item would have the same weight in the total score as other test items. The total test score for each child was then calculated as the average of z scores for all test items successfully performed by that child.

To estimate the internal consistency of the test battery items, we calculated the Cronbach alpha value for the test battery. In addition, we calculated Pearson coefficients of correlation between individual test item scores and the total test score and Pearson coefficients of correlation between scores on individual test items. When an individual test item score was correlated with the total test score, the individual test item score was excluded from the total test score to avoid statistical dependence. For example, when the score on test item 1 was correlated with the total test score, the latter was calculated as the average of z scores for test items 2 through 9.

The construct validity of a test can be established by comparing it with a prior test known to be valid, a so-called gold standard. For physical fitness in children, no such gold standard is available. Nevertheless, to obtain an estimate of the suitability of the test battery, we asked the physical education teacher of children in one of the classes that we tested to rank 10 girls and 10 boys in the class (mean age=8.7 years, SD=0.3) from worst to best physical fitness, according to his own implicit knowledge. The teacher had been trained in physical education and was experienced in grading the physical performance of his pupils. He had no knowledge about our test battery, its individual test items, or the children’s scores.

To estimate the test-retest reliability of the test battery, we tested 24 children in one class in central Norway twice, 1 week apart. We then calculated intraclass correlation coefficients (ICC [2,1]) and 95% confidence intervals for test and retest scores25 to determine relative reliability as well as standard errors of measurement and 95% confidence intervals to determine absolute reliability for both individual test items and the total test score. The standard error of measurement was calculated as the square root of the mean within-subject variance, and 95% confidence intervals were calculated as 1.96 times the standard error of measurement.26

All statistical analyses were performed with SPSS, version 16.0.1,* and consisted of Pearson correlation coefficients, Spearman rho (rank) correlations, Kolmogorov-Smirnov tests for normality of the data distribution, and linear regression analyses.

* SPSS Inc, 233 S Wacker Dr, Chicago, IL 60606.

Role of the Funding Source

This work was commissioned and supported by the Ministry of Social and Health Affairs, Oslo, Norway.

Results

Total Test Score

The total test score for each child was calculated as the average of z scores for the individual test items. Figure 1 shows a plot of the total test score against age for girls and boys separately. The total test score increased linearly with increasing age. With respect to our first goal, 2 observations are relevant. First, even the 5-year-old children were able to perform the test items, indicating that the test battery is not too difficult even for the youngest children. Second, the total test score did not level off with age, indicating that the test battery is still challenging even for the oldest children. A one-sample Kolmogorov-Smirnov test indicated that total test scores were normally distributed: Z(195)=.97, P=.302.

Figure 1. Changes in total test score (average of z scores for the 9 test items) with age (n=195). Lines represent separate linear regressions for girls (r=.85, 95% confidence interval=.81–.89) and boys (r=.84, 95% confidence interval=.79–.88).

Internal Consistency of the Test Battery

To estimate the internal consistency of the test battery, we first calculated Pearson coefficients of correlation between scores on individual test items as well as between individual test item scores and the total test score based on the other 8 test items (Tab. 2). The results indicated that all individual test item scores correlated positively with the total test score, with correlations ranging from .65 to .88. Correlations between scores on individual test items were moderate to high (.31–.85). The Cronbach alpha value for standardized items was high (.93).

Table 2. Pearson Correlation Coefficients and 95% Confidence Intervals for Individual Test Item Scores and Total Test Score(a) and Pearson Correlation Coefficients for Individual Test Items

                                 Correlation
                                 With Total   95% Confidence   Correlation With Test Item:
Test Item                        Score        Interval         1     2     3     4     5     6     7     8
1. Standing broad jump (m)       .84          0.79–0.88        1.00
2. Jumping on 2 feet (s)         .67          0.59–0.74        .58   1.00
3. Jumping on 1 foot (s)         .67          0.59–0.74        .60   .85   1.00
4. Throwing a tennis ball (m)    .72          0.65–0.78        .71   .34   .31   1.00
5. Pushing a medicine ball (m)   .80          0.74–0.85        .72   .43   .41   .80   1.00
6. Climbing wall bars (s)        .80          0.74–0.85        .76   .47   .47   .72   .83   1.00
7. Shuttle run (s)               .80          0.74–0.85        .71   .54   .55   .67   .72   .69   1.00
8. Running 20 m (s)              .88          0.84–0.91        .80   .66   .68   .67   .77   .73   .76   1.00
9. Reduced Cooper test (m)       .65          0.56–0.72        .53   .50   .49   .56   .54   .49   .57   .63

(a) On the basis of the other 8 test items.
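The scoring and internal consistency steps described under Data Reduction and Analysis can be sketched in a few lines. The following is a minimal illustration, not the SPSS procedure used in the study. It assumes that the 9 items are stored in columns in the order of the test item list, that missing scores are coded as NaN, that the sample standard deviation is used for the z scores, and that the standardized Cronbach alpha is obtained from the mean inter-item correlation after listwise deletion. All function names and the demonstration data are hypothetical.

```python
import numpy as np

# Items 2, 3, 6, 7, and 8 are timed (seconds); the column order 1-9 is an assumption.
TIMED = np.array([False, True, True, False, False, True, True, True, False])


def standardize(raw, timed=TIMED):
    """Invert timed items (1/score), then z-score each item over the sample."""
    x = np.array(raw, dtype=float)            # copy; missing scores are NaN
    x[:, timed] = 1.0 / x[:, timed]           # higher now always means better
    mean = np.nanmean(x, axis=0)
    sd = np.nanstd(x, axis=0, ddof=1)         # sample SD (assumption)
    return (x - mean) / sd


def total_score(z):
    """Average of z scores over the items a child completed."""
    return np.nanmean(z, axis=1)


def item_rest_correlations(z):
    """Pearson r between each item and the mean z score of the other items."""
    k = z.shape[1]
    r = np.empty(k)
    for j in range(k):
        rest = np.nanmean(np.delete(z, j, axis=1), axis=1)
        ok = ~np.isnan(z[:, j]) & ~np.isnan(rest)
        r[j] = np.corrcoef(z[ok, j], rest[ok])[0, 1]
    return r


def standardized_alpha(z):
    """Cronbach alpha for standardized items from the mean inter-item r."""
    complete = z[~np.isnan(z).any(axis=1)]    # listwise deletion (assumption)
    corr = np.corrcoef(complete, rowvar=False)
    k = corr.shape[0]
    mean_r = (corr.sum() - k) / (k * (k - 1))
    return k * mean_r / (1 + (k - 1) * mean_r)


# Synthetic demonstration data: 60 hypothetical children, 9 items.
rng = np.random.default_rng(0)
ability = rng.normal(size=60)
raw = np.empty((60, 9))
for j in range(9):
    noise = rng.normal(scale=0.5, size=60)
    raw[:, j] = 10 - ability + noise if TIMED[j] else 10 + ability + noise
raw[0, 5] = np.nan                            # one missing score

z = standardize(raw)
totals = total_score(z)
item_rest = item_rest_correlations(z)
alpha = standardized_alpha(z)
```

Excluding each item from the score it is correlated with, as in `item_rest_correlations`, mirrors the "other 8 test items" rule stated in the text. How SPSS handled children with missing scores in these analyses is not reported, so the NaN handling shown here is only one plausible choice.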
With respect to our second goal, these results indicated that the internal consistency of the test battery is high and that the different test items indeed tap into similar underlying components, without correlations being so high as to indicate that test items are redundant.

Construct Validity of the Test Battery

We estimated the convergent construct validity of the test battery by comparing the rankings of 20 children in one class on the basis of their total test scores with the rankings of the same children on the basis of an evaluation of their physical fitness by their physical education teacher. Figure 2 shows that there was a close association between the rankings on the basis of the teacher’s evaluation and the rankings on the basis of the total test scores. This association was confirmed by high Spearman rho correlations between the 2 rank scores, which were .93 for girls and .90 for boys.

Figure 2. Correspondence between rankings of 10 girls and 10 boys on a scale from worst (rank of 1) to best (rank of 10) physical fitness by their physical education teacher and rankings on the basis of their total test score.

Test-Retest Reliability of the Test Battery

We estimated the relative test-retest reliability of the test battery by using ICC (2,1)25 between test and retest scores for both total test scores and individual test item scores. Absolute reliability was estimated from the standard error of measurement, which was calculated as the square root of the average within-subject variance for each test item score and the total test score.26 Table 3 shows the means and standard deviations of test and retest scores and the 95% confidence intervals for the ICCs and standard errors of measurement. The results indicated fair to good reliability for individual test item scores and the total test score, with ICCs between test and retest scores ranging from .54 to .92.

Table 3. Means and Standard Deviations of Test and Retest Scores and 95% Confidence Intervals for Intraclass Correlation Coefficients (ICCs) and Standard Errors of Measurement

                                                                              95%          Standard    95%
                               Test Score         Retest Score      ICC       Confidence   Error of    Confidence
Test Item                      X        SD        X        SD       (2,1)     Interval     Measurement Interval
Standing broad jump (cm)       122.54   19.41     119.79   18.66    .88       0.74–0.95    6.68        5.28–8.65
Jumping on 2 feet (s)          3.87     0.59      3.84     0.77     .65       0.34–0.83    0.40        0.33–0.55
Jumping on 1 foot (s)          3.19     0.37      3.11     0.42     .66       0.37–0.84    0.23        0.18–0.30
Throwing a tennis ball (m)     11.97    3.56      12.04    3.51     .92       0.83–0.97    0.99        0.81–1.33
Pushing a medicine ball (m)    3.34     0.47      3.44     0.53     .54       0.18–0.77    0.34        0.28–0.45
Climbing wall bars (s)         12.10    2.83      12.04    3.08     .77       0.54–0.89    1.41        1.17–1.91
Shuttle run (s)                25.87    2.03      25.20    2.64     .69       0.41–0.86    1.32        1.02–1.67
Running 20 m (s)               4.62     0.40      4.82     0.49     .71       0.32–0.87    0.25        0.17–0.28
Reduced Cooper test (m)        984.82   133.47    942.05   108.23   .72       0.41–0.88    66.06       48.19–80.91
Total test score (z score)     0.042    0.405     −0.001   0.728    .80       0.59–0.91    0.26        0.22–0.36

Discussion

In this article, we described a new test battery aimed at quantifying physical fitness in children who were 5 to 12 years of age. Our considerations for the construction of the test battery were that the battery should be quantitative and that its test items should consist of compound activities, based on everyday activities, that recruit several constituent components of physical fitness, such as strength, endurance, motor coordination, balance, and agility.18 Furthermore, the test battery should be easy to administer and should not require specialized technical equipment or specially trained personnel. This design would allow the test battery to be used to test large groups of children, even entire populations, and reliably monitor children’s physical fitness over time. In this first round of testing, the test battery was administered to 195 children from 4 kindergartens and schools, enabling us to investigate its feasibility, internal consistency, construct validity, and test-retest reliability.

Applicability of the Test Battery

Total test scores increased linearly with increasing age, as one would expect given that the constituent components typically improve in children with increasing age.27 These results indicate that our test battery can be used across the entire age span studied here, that is, from 5 to 12 years. Our test battery was not too difficult for the youngest children and not too easy or boring for the oldest children. Regression analysis suggested that sex differences can be revealed by our test battery as well, although we did not formally evaluate this concept in the present study.

Furthermore, total test scores were normally distributed, indicating that the test battery is applicable at both ends of the scale and can be used to classify both children with extraordinarily good fitness and those with extremely poor fitness. In contrast, some other tests, such as the Movement Assessment Battery for Children,28 produce heavily skewed score distributions. Such tests appropriately identify children performing below normal but are not suited for discriminating between children performing normally. A further advantage of total test scores being normally distributed is fulfillment of the necessary assumptions of most parametric statistical procedures.

Internal Consistency of the Test Battery Items

One of our considerations in choosing the test items was that they should consist of compound activities that recruit various combinations of underlying components, such as strength, endurance, motor coordination, balance, and agility. If we were successful in this aim, the individual test items should correlate reasonably well with each other, without correlations being so high as to indicate that individual test items were redundant. The results showed that this aim was met, with correlations between the test item scores ranging from .31 to .85. Furthermore, all of the individual test item scores would be expected to contribute to the total test score, without either very high or very low correlations between individual test item scores and total test scores. This expectation was confirmed, with correlations ranging from .65 to .88. The internal consistency of the entire test battery was attested to by the high Cronbach alpha value (ie, .93). These results support the internal consistency of our test battery and provide confidence that our test battery yields a fair measure of physical fitness, a complex attribute that otherwise is not so readily measured.21,29

Construct Validity of the Test Battery

Construct validity is arguably among the most important characteristics of a test. Construct validity represents the extent to which a test measures what it is supposed to measure and how well it measures that property.30 It also is among the most difficult characteristics to determine, unless the new test can be compared with an existing test known to be valid. For physical fitness in children, no such test is available. Although the construct validity of our test battery is therefore difficult to establish, we can nevertheless discuss convergent construct validity on the basis of 2 arguments.

First, 7 of the 9 individual test items are found in existing tests that have been used for many years and have been largely standardized and scrutinized for construct validity; these tests include the BOT,31 the EUROFIT,14 the AST,22 and the FBH.23 However, this fact is insufficient to conclude that we have measured physical fitness. One reason why it is difficult to measure physical fitness is the wide range of opinions about what physical fitness really is and how it should be measured.21,29 Nevertheless, professionals in the field typically have no problem recognizing higher and lower levels of fitness and motor ability in children.

Because our aim was to measure children’s physical fitness in a way that would be understood by various professionals working to increase children’s physical activity and thereby improve their physical fitness, we sought help from such a professional. Although several groups of professionals can be considered experts on children’s physical fitness, such as sports coaches, various health care professionals, and physical education teachers, we chose a physical education teacher for several reasons. The daily experience of sports coaches might be biased toward children with higher levels of fitness and motor skills, and the daily experience of health care professionals, such as physical therapists, might be biased toward children who are less fit and those who have poor or impaired motor skills. Most physical education teachers, on the other hand, work with the entire range of children’s competence every day.
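The comparison between a teacher's ranking and a ranking derived from total test scores, as reported under Construct Validity in the Results, comes down to a Spearman rank correlation. The sketch below is a hedged illustration only: it assumes rankings without ties (rank 1 = worst, rank n = best), uses the textbook formula for untied ranks, and runs on hypothetical scores for 10 children rather than the study data. The function names are made up for this example.

```python
import numpy as np


def ranks_from_scores(scores):
    """Rank 1 = lowest score (worst fitness), rank n = highest; assumes no ties."""
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)
    ranks = np.empty(len(scores), dtype=int)
    ranks[order] = np.arange(1, len(scores) + 1)
    return ranks


def spearman_rho(rank_a, rank_b):
    """Spearman rho for two untied rankings: 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    a = np.asarray(rank_a, dtype=float)
    b = np.asarray(rank_b, dtype=float)
    n = len(a)
    return 1.0 - 6.0 * np.sum((a - b) ** 2) / (n * (n ** 2 - 1))


# Hypothetical data for 10 children (not the study's values).
teacher_rank = np.arange(1, 11)
total_scores = [-1.2, -0.9, -0.5, -0.6, -0.1, 0.0, 0.3, 0.8, 0.6, 1.1]
test_rank = ranks_from_scores(total_scores)
rho = spearman_rho(teacher_rank, test_rank)
```

In this made-up example the two rankings differ only by two swaps of neighboring children, so rho stays close to 1. With tied scores, average ranks and the Pearson correlation of the ranks would be needed instead of the shortcut formula.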
Therefore, we selected a physical education teacher to provide an experience-based test of construct validity.

When we compared the intuitive rankings assigned by the physical education teacher to 20 of his pupils with the total test scores of the same children on our test battery, we found high correlations both for girls and for boys. Even though this evaluation was not a formal test of construct validity, it provides additional face validity to our test battery and supports the further pursuit of its development.

Test-Retest Reliability of the Test Battery

A test cannot be valid unless it is reliable. That is, with repeated administration of the test to the same participants, the results should be highly comparable and should not be severely influenced by irrelevant or chance factors, such as the time of day, motivation, fatigue, or boredom. We administered our test battery twice to the children in one class, 1 week apart, and established test-retest reliability. Correlations between the test and retest scores for the separate test items were fair to good; ICCs ranged from .54 to .92. More importantly, the ICC for the test and retest scores for the total test score was .80, and the 95% confidence interval ranged from .59 to .91. With such a relatively small sample of participants, these results can be considered satisfactory.

Limitations and Future Directions

Although our test battery for physical fitness in children is clearly in an early stage, the results presented here are promising and warrant further development of the test battery. In the present study, each child was ranked with respect to the entire sample across all ages and both sexes. The next step in the further development of the test battery is to collect data on a large sample to make standardization of the test battery possible, along with the establishment of age- and sex-specific norms. Once age- and sex-specific norms are available, a child can be ranked with respect to his or her own age group and sex; in other words, a child can be classified with respect to relative physical fitness for a child of that particular age and sex.

A limitation of the present study was that both a failed attempt and a missing test item were scored as a missing value. However, it might be more correct to reserve a missing value for a test item not attempted and to assign a minimum score to a test item that a child attempted but was not able to perform correctly. For the establishment of an adequate and fair minimum score for each test item, a larger data sample is required. Only when such a sample is available will it be possible to reliably deduce the minimum score for each test item, according to sex and age category, thereby allowing the assignment of a minimum score to a “failed performance.” This scoring method would further increase the construct validity of the test battery.

Conclusions

To measure children’s physical fitness, we have developed a test battery that is based on children’s everyday activities and scored on an interval scale. On the basis of this first round of investigation, we conclude that the test battery is easy to administer, is appropriate for children who are 5 to 12 years of age, and discriminates well across the entire age range. Furthermore, the results of the initial investigation of the construct validity and test-retest reliability of the test battery are promising and warrant its further development and standardization.

Because the test battery is easy to administer and does not require specialized technical equipment or specially trained personnel, it can be used to measure physical fitness in large groups of children. This application would enable health care authorities to collect reliable longitudinal data on a population level. Such data, in turn, may provide valuable information about changes in the level of physical fitness of children over time. Furthermore, health care personnel, such as physical therapists, can use such data for planning prophylactic interventions on a group level or for evaluating the effects of such interventions.

All authors provided concept/idea/research design, writing, data analysis, project management, and facilities/equipment. Dr Fjørtoft, Mr Pedersen, and Dr Sigmundsson provided data collection. Dr Fjørtoft and Dr Sigmundsson provided participants. Dr Fjørtoft provided institutional liaisons and consultation (including review of manuscript before submission).

The authors thank the children and staff at Helgen Primary School and Kindergarten, Sætre Primary School, and Åsvang Primary School for their participation in this project; Vigdis Vedul Moen, Arne Martin Hårstad, and Ann Kristin Forseth for help in data collection; Bjørg Fallang, Jan Morten Loftesnes, Thomas Moser, and Svein Arne Pettersen for helpful discussions on test construction; and Erling J. Solberg, Kyrre Svarva, Tom Ivar L. Nilsen, and Rolf Moe-Nilssen for advice on statistical analyses.

This work was commissioned and supported by the Ministry of Social and Health Affairs, Oslo, Norway.

This article was submitted October 27, 2009, and was accepted March 14, 2011.

DOI: 10.2522/ptj.20090350

References

1 Anderssen L, Harro M, Sardinha L, et al. Physical activity and clustered cardiovascular risk in children: a cross-sectional study (The European Youth Heart Study). Lancet. 2006;368:299–304.
2 Hands B, Larkin D. Physical fitness and developmental coordination disorder. In: Cermak SA, Larkin D, eds. Developmental Co-ordination Disorder. Albany, NY: Delmar Thomson Learning; 2002:172–184.
13 Bös K, Mechling H. International Physical Performance Test Profile for Boys and Girls From 9–17 Years (IPPTP 9–17). Berlin, Germany: International Council of Sport Science and Physical Education; 1985.
14 Adam C, Klissouras V, Ravazollo M, et al. EUROFIT: European Test of Physical Fitness - Handbook. Rome, Italy: Council of Europe, Committee for the Development of Sport; 1988.
15 Safrit M. The validity and reliability of fitness tests for children. Pediatr Exerc Sci. 1990;2:9–28.
16 Fjørtoft I. Motor fitness in pre-primary school children: the EUROFIT Motor Fitness Test explored in 5- to 7-year-old children. Pediatr Exerc Sci. 2000;12:424–436.
17 van Praagh E, Franca NM. Measuring maximal short-term power output during growth. In: van Praagh E, ed. Pediatric Anaerobic Performance. Champaign, IL: Human Kinetics; 1998:155–189.
18 Fjørtoft I, Pedersen AV, Sigmundsson H, Vereijken B. Testing Children’s Physical Fitness: Developing a New Test for 4–12 Year Old Children. Oslo, Norway: The Norwegian Social and Health Ministry; 2003. Report IS-1256.
19 Haga M. The relationship between physical fitness and motor competence in children. Child Care Health Dev. 2008;34:329–334.
20 Haga M. Physical fitness in children with high motor competence is different from that in children with low motor competence. Phys Ther. 2009;89:1089–1097.
21 Hopkins WG, Walker NP. The meaning of “physical fitness.” Prev Med. 1988;17:764–773.
22 Bös K, Wohlman R. Allgemeiner Sportsmotorischer Test (AST 6–11) zur Diagnose der konditionellen und koordinativen Leistungsfähigkeit. Lehrhilfen für den Sportunterricht. 1987;36:145–160.
23 Bille B, Brieditis K, Ekström B, Esscher E.
FBH Provet: Erfarenheter från Folke Bernadottehemmet. Örebro, Sweden: Motorika; 1992. 24 The Prudential Fitnessgram: Test Administration Manual. Dallas, TX: Cooper Institute for Aerobics Research; 2001. 25 Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychol Bull. 1979;86:420 – 428. 26 Bland JM, Altman DG. Measurement error. BMJ. 1996;313:744. 27 Burton AW, Miller DE. Movement Skill Assessment. Champaign, IL: Human Kinetics; 1998. 28 Henderson SE, Sugden D. The Movement Assessment Battery for Children. Kent, United Kingdom: The Psychological Corporation; 1992. 29 Caspersen CJ, Powell KE, Christenson GM. Physical activity, exercise, and physical fitness: definitions and distinctions for health-related research. Public Health Rep. 1985;100:126 –131. 30 Anastasi A. Psychological Testing. 3rd ed. New York, NY: Macmillan; 1968. 31 Bruininks RH. Bruininks-Oseretsky Test of Motor Proficiency Examiner’s Manual. Circle Pines, MN: American Guidance Service; 1978. Volume 91 Number 7 Physical Therapy f 1095 Downloaded from https://academic.oup.com/ptj/article/91/7/1087/2735048 by guest on 31 May 2021 3 Stalsberg R, Pedersen AV. Effects of socioeconomic status on physical activity in adolescents: a systematic review of the evidence. Scand J Med Sci Sports. 2010;20: 368 –383. 4 Ekeland E, Halland B, Refsnes KA, et al. Er barn og unge mindre fysisk aktive i dag enn tidligere? Tidsskr Nor Lægeforen. 1999;199:2358 –2362. 5 Grund A, Dilba B, Forberger K, et al. Relationships between physical activity, physical fitness, muscle strength and nutritional state in 5- to 11-year-old children. Eur J Appl Physiol. 2000;82:425– 438. 6 Hoos MB, Gerver WJM, Kester AD, Westerterp KR. Physical activity levels in children and adolescents. Int J Obes. 2003;27: 605– 609. 7 Rice MH, Howell CC. Measurement of physical activity, exercise, and physical fitness in children: issues and concerns. J Pediatr Nurs. 2000;3:148 –156. 
8 Søgaard AJ, Bø K, Klungland M, Jacobsen BK. En oversikt over norske studier: hvor mye beveger vi oss i fritiden? Tidsskr Nor Lægeforen. 2000;120:3439 –3446. 9 Marshall JD, Bouffard M. The effects of quality daily physical education on movement competency in obese versus nonobese children. Adapt Phys Activ Q. 1997; 14:222–237. 10 Andersen RE, Crespo CJ, Bartlett SJ, et al. Relationship of physical activity and television watching with body weight and level of fatness among children. JAMA. 1998;279:938 –942. 11 Blair SN, Kohl HW, Gordon NF, Paffenbarger RS Jr. How much physical activity is good for health? Annu Rev Public Health. 1992;13:99 –126. 12 Kemper HCG, van Mechelen W. Physical fitness testing of children: a European perspective. Pediatr Exerc Sci. 1996;8:210 – 214.
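As an illustration of the test-retest analysis discussed under Test-Retest Reliability of the Test Battery, the following minimal sketch computes two single-measure intraclass correlation coefficients from paired test and retest scores. It rests on several assumptions that are not taken from the study: the ICC form used by the authors is not stated in this section, so the sketch shows both the two-way absolute-agreement form, ICC(2,1), and the two-way consistency form, ICC(3,1), as defined by Shrout and Fleiss; the function name icc_two_way and the example scores are hypothetical; and no confidence interval is computed.

```python
def icc_two_way(scores):
    """Return (ICC(2,1), ICC(3,1)) for a list of rows.

    Each row holds one child's scores; each column is one session
    (for example, test and retest). Formulas follow the two-way
    ANOVA mean squares described by Shrout and Fleiss.
    """
    n = len(scores)       # number of children
    k = len(scores[0])    # number of sessions
    grand = sum(sum(row) for row in scores) / (n * k)

    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for children (rows), sessions (columns), and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)               # between-children mean square
    jms = ss_cols / (k - 1)               # between-sessions mean square
    ems = ss_err / ((n - 1) * (k - 1))    # residual mean square

    icc_agreement = (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)
    icc_consistency = (bms - ems) / (bms + (k - 1) * ems)
    return icc_agreement, icc_consistency


if __name__ == "__main__":
    # Hypothetical total scores (test, retest) for six children;
    # these values are invented for illustration only.
    hypothetical_scores = [
        (21.0, 22.5),
        (30.5, 29.0),
        (18.0, 20.0),
        (26.5, 27.0),
        (33.0, 31.5),
        (24.0, 25.5),
    ]
    agreement, consistency = icc_two_way(hypothetical_scores)
    print(f"ICC(2,1) = {agreement:.2f}, ICC(3,1) = {consistency:.2f}")
```

The two forms differ in how they treat a systematic shift between sessions, such as a practice effect at retest: ICC(3,1) ignores a constant shift, whereas ICC(2,1) is lowered by it. Which form is appropriate depends on whether absolute agreement or only consistent ordering of the children is of interest.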