Supplementary material
Supplementary Tables
Supplementary Figures
Appendix S1 – Details on vegetation plots: the DIVGRASS project
Appendix S2 – Details on phylogeny reconstruction
Appendix S3 – Details on trait preparation
Appendix S4 – Distance Metrics based on Co-occurrence to study invasions
Appendix S5 – Influence of evolutionary history on invasion success
Appendix S6 – Cross Validation of model for Invasion Success
Supplementary Tables
Table S1. Introduction pathways of alien species based on the DAISIE hierarchical
classification system. Reproduced from Lambdon et al. (2008).
Pathway Level 1 | Level 2 | Level 3 | Description
Intentional | | | Introductions have been introduced deliberately by humans, for commercial or recreational reasons.
Intentional | Released | | Species have been released deliberately into the wild (e.g., for the enrichment of the native flora, landscaping, etc.).
Intentional | Escaped | | Species have escaped into the wild from cultivation.
Intentional | Escaped | Forestry | Species are cultivated for timber on a large scale, or as part of re-/afforestation programmes.
Intentional | Escaped | Amenity | Species are cultivated on a large to moderate scale in public places for landscaping purposes (e.g., for soil stabilization or aesthetic enhancement).
Intentional | Escaped | Ornamental | Species are cultivated for ornament on a small scale (especially in private gardens).
Intentional | Escaped | Agricultural | Species are cultivated on a field scale as commercial non-timber crops.
Intentional | Escaped | Horticultural | Species are cultivated for edible or other useful products on a small scale (e.g., in private gardens).
Unintentional | | | Introductions have arrived as a result of human actions but have not been introduced deliberately.
Unintentional | Unaided | | Species have spread via natural (spontaneous) means from introduced populations elsewhere in non-native range.
Unintentional | Transported | | Species have been introduced accidentally via shipping, air, road or rail freight, directly by humans or with domestic animals.
Unintentional | Transported | Seed | Have been introduced as a contaminant of crop seed or propagules.
Unintentional | Transported | Mineral | Have been introduced during the deliberate movement of soil or other minerals.
Unintentional | Transported | Commodity | Contaminants have been introduced as contaminants of non-seed crop commodities (e.g., wool, organic refuse).
Unintentional | Transported | Stowaway | Have been introduced accidentally but are not known to be associated with any particular commodity, e.g., on car tyres or in the hulls of ships.
Table S2. The alien species used in the study, with their family, biogeographic origin,
invasion success index and Rabinowitz commonness class. Anecophytes are species
that were created from their wild ancestors by plant breeding and subsequently
became alien; they therefore have no native range in the strict sense.
Alien species | Family | Origin | Freq (Nr plots) | Local abund (% cover) | Env volume | Invasion success | Rabinowitz class
Achillea crithmifolia | Asteraceae | Eurasia | 1 | 0.5 | 0.0000 | -2.117 | H
Aegilops cylindrica | Poaceae | Eurasia, Africa | 1 | 15 | 0.0000 | -1.533 | G
Agave americana | Asparagaceae | N America | 1 | 0.5 | 0.0000 | -2.117 | H
Agrostemma githago | Caryophyllaceae | Eurasia, Africa | 6 | 2.92 | 0.0323 | 0.92 | C
Allium cepa | Amaryllidaceae | Asia, S America | 5 | 9.9 | 0.0002 | -1.017 | G
Allium porrum | Amaryllidaceae | Anecophyte | 6 | 0.92 | 0.0047 | -0.279 | H
Allium sativum | Amaryllidaceae | Asia | 3 | 1.33 | 0.1814 | 0.855 | D
Amaranthus deflexus | Amaranthaceae | S America | 29 | 7.71 | 0.0393 | 1.813 | A
Amaranthus hybridus | Amaranthaceae | Americas | 4 | 4.75 | 0.0245 | 0.452 | C
Amaranthus retroflexus | Amaranthaceae | N America | 24 | 1.94 | 0.0761 | 2.135 | B
Ambrosia artemisiifolia | Asteraceae | Americas | 36 | 3.5 | 0.0060 | 1.087 | E
Anethum graveolens | Apiaceae | Asia temp | 1 | 0.5 | 0.0000 | -2.117 | H
Artemisia annua | Asteraceae | Eurasia | 24 | 2.54 | 0.0016 | 0.276 | E
Arundo donax | Poaceae | Asia, S America | 23 | 9.04 | 0.0038 | 0.517 | E
Asclepias syriaca | Apocynaceae | N America | 3 | 2.17 | 0.0004 | -0.907 | H
Avena sativa | Poaceae | Eurasia | 180 | 7.27 | 0.0353 | 3.114 | A
Avena strigosa | Poaceae | Europe | 2 | 7.75 | 0.0197 | 0.889 | C
Barbarea intermedia | Brassicaceae | Eurasia | 25 | 1.5 | 0.1929 | 2.402 | B
Barbarea stricta | Brassicaceae | Eurasia | 1 | 0.5 | 0.0000 | -2.117 | H
Berteroa incana | Brassicaceae | N America | 124 | 4.7 | 0.0040 | 2.194 | E
Bidens frondosa | Asteraceae | N America | 5 | 7.3 | 0.0001 | 0.601 | G
Brassica napus | Brassicaceae | Anecophyte | 4 | 0.5 | 0.0035 | 0.251 | H
Brassica rapa | Brassicaceae | Anecophyte | 5 | 2 | 0.0034 | 0.451 | H
Bromus catharticus | Poaceae | S America | 9 | 11.94 | 0.2385 | 2.382 | C
Bromus inermis | Poaceae | Eurasia, Americas | 21 | 6.43 | 0.0316 | 1.457 | A
Bunias orientalis | Brassicaceae | Eurasia | 21 | 5.64 | 0.0214 | 0.995 | A
Calendula officinalis | Asteraceae | Europe | 3 | 0.5 | 0.0001 | -1.209 | H
Camelina sativa | Brassicaceae | Eurasia | 2 | 0.5 | 0.0001 | -1.495 | H
Claytonia perfoliata | Montiaceae | Americas | 5 | 3.9 | 0.0000 | -1.282 | G
Collomia grandiflora | Polemoniaceae | N America | 16 | 5.34 | 0.0022 | 0.993 | E
Conringia orientalis | Brassicaceae | Eurasia, Africa | 6 | 0.92 | 0.0519 | 0.546 | D
Cortaderia selloana | Poaceae | S America | 11 | 0.95 | 0.0050 | 1.333 | H
Cotula coronopifolia | Asteraceae | Africa | 124 | 22.46 | 0.0373 | 2.928 | A
Crepis bursifolia | Asteraceae | Americas | 34 | 2.24 | 0.0037 | 0.427 | F
Cuscuta campestris | Convolvulaceae | Americas | 3 | 1.33 | 0.3516 | 1.230 | D
Cuscuta suaveolens | Convolvulaceae | Americas | 2 | 0.5 | 0.0003 | -1.517 | H
Cymbalaria muralis | Plantaginaceae | Europe | 16 | 1.44 | 0.0827 | 2.105 | B
Cyperus eragrostis | Cyperaceae | Americas | 6 | 3.75 | 0.0143 | 0.962 | C
Cyperus reflexus | Cyperaceae | Americas | 1 | 62.5 | 0.0000 | -1.288 | G
Datura stramonium | Solanaceae | N America | 6 | 0.92 | 0.0089 | 0.874 | D
Dianthus caryophyllus | Caryophyllaceae | Europe | 2646 | 1.57 | 0.0773 | 3.600 | B
Dipsacus sativus | Caprifoliaceae | Anecophyte | 1 | 3 | 0.0000 | -1.809 | G
Duchesnea indica | Rosaceae | Asia | 1 | 3 | 0.0000 | -1.809 | G
Epilobium ciliatum | Onagraceae | Asia, Americas | 7 | 0.5 | 0.0050 | 0.234 | H
Eragrostis pectinacea | Poaceae | Americas | 2 | 0.5 | 0.0001 | -1.322 | H
Erigeron annuus | Asteraceae | N America | 249 | 3.16 | 0.0178 | 2.446 | A
Erigeron karvinskianus | Asteraceae | Americas | 1 | 3 | 0.0000 | -1.809 | G
Erysimum cheiri | Brassicaceae | Anecophyte | 43 | 0.91 | 0.1532 | 2.477 | B
Eschscholzia californica | Papaveraceae | N America | 8 | 7.25 | 0.0128 | 1.230 | C
Euphorbia lathyris | Euphorbiaceae | Eurasia | 2 | 1.75 | 0.0003 | -1.076 | H
Galinsoga parviflora | Asteraceae | Americas | 1 | 0.5 | 0.0000 | -2.117 | H
Galinsoga quadriradiata | Asteraceae | N America | 3 | 0.5 | 0.0575 | 0.318 | D
Glycyrrhiza glabra | Fabaceae | Eurasia, Africa | 1 | 0.5 | 0.0000 | -2.117 | H
Helianthus tuberosus | Asteraceae | N America | 3 | 0.5 | 0.0072 | 0.002 | H
Heliotropium curassavicum | Boraginaceae | Australasia, Americas | 2 | 0.5 | 0.0074 | -1.002 | H
Hemerocallis fulva | Xanthorrhoeaceae | Asia | 1 | 0.5 | 0.0000 | -2.117 | H
Hordeum bulbosum | Poaceae | Eurasia, Africa | 1 | 0.5 | 0.0000 | -2.117 | H
Hordeum distichon | Poaceae | Asia temp | 1 | 0.5 | 0.0000 | -2.117 | H
Hordeum vulgare | Poaceae | Eurasia, Africa | 7 | 6.5 | 0.0131 | 1.568 | C
Hypericum hircinum | Hypericaceae | Eurasia, Africa | 3 | 0.5 | 0.0465 | -1.139 | D
Impatiens glandulifera | Balsaminaceae | Asia trop | 2 | 50 | 0.0035 | 0.135 | G
Iris germanica | Iridaceae | Eurasia | 11 | 10.55 | 0.0015 | 1.202 | G
Juncus tenuis | Juncaceae | Americas | 62 | 8.13 | 0.0270 | 2.380 | A
Lathyrus odoratus | Fabaceae | Europe | 1 | 0.5 | 0.0000 | -2.117 | H
Lathyrus sativus | Fabaceae | Anecophyte | 4 | 0.5 | 0.0065 | -0.108 | H
Lens culinaris | Fabaceae | Eurasia | 1 | 0.5 | 0.0000 | -2.117 | H
Linum austriacum | Linaceae | Eurasia, Africa | 36 | 4.1 | 0.0148 | 1.601 | A
Linum usitatissimum | Linaceae | Anecophyte | 560 | 3.33 | 0.0552 | 3.501 | A
Lycopersicon esculentum | Solanaceae | Anecophyte | 6 | 0.5 | 0.0025 | -0.63 | H
Matricaria discoidea | Asteraceae | Asia temp | 105 | 4.69 | 0.0424 | 2.875 | A
Medicago intertexta | Fabaceae | Europe, Africa | 1 | 3 | 0.0000 | -1.809 | G
Medicago sativa | Fabaceae | Eurasia, Africa | 1293 | 3.14 | 0.0332 | 3.655 | A
Mentha spicata | Lamiaceae | Eurasia | 8 | 7.56 | 0.0081 | 0.941 | C
Mesembryanthemum crystallinum | Aizoaceae | Eurasia, Africa | 25 | 37.26 | 0.1571 | 0.361 | A
Nothoscordum borbonicum | Amaryllidaceae | S America | 1 | 3 | 0.0000 | -1.809 | G
Oenothera biennis | Onagraceae | N America | 75 | 2.53 | 0.0500 | 2.444 | A
Oenothera glazioviana | Onagraceae | Anecophyte | 8 | 1.13 | 0.0029 | 0.458 | H
Oenothera laciniata | Onagraceae | N America | 1 | 0.5 | 0.0000 | -2.117 | H
Oenothera parviflora | Onagraceae | N America | 1 | 0.5 | 0.0000 | -2.117 | H
Onobrychis viciifolia | Fabaceae | Europe | 2364 | 6.29 | 0.0821 | 3.720 | A
Opuntia ficus-indica | Cactaceae | S America | 4 | 0.5 | 0.0614 | 0.095 | D
Opuntia monacantha | Cactaceae | S America | 1 | 0.5 | 0.0000 | -2.117 | H
Opuntia stricta | Cactaceae | Americas | 1 | 0.5 | 0.0000 | -2.117 | H
Oxalis articulata | Oxalidaceae | S America | 7 | 2.93 | 0.0113 | 1.360 | C
Oxalis pes-caprae | Oxalidaceae | Africa | 7 | 23.5 | 0.0031 | 0.609 | G
Panicum capillare | Poaceae | Americas | 1 | 3 | 0.0000 | -1.809 | G
Panicum miliaceum | Poaceae | Asia | 1 | 0.5 | 0.0000 | -2.117 | H
Papaver somniferum | Papaveraceae | Europe, Africa | 9 | 1.06 | 0.0505 | 0.462 | D
Paspalum dilatatum | Poaceae | S America | 19 | 3.32 | 0.0254 | 1.340 | A
Paspalum distichum | Poaceae | Americas | 11 | 18.91 | 0.0454 | 1.772 | C
Pennisetum villosum | Poaceae | Africa, Asia | 8 | 1.44 | 0.0014 | -0.274 | H
Petroselinum crispum | Apiaceae | Europe | 2 | 0.5 | 0.0007 | -1.657 | H
Phalaris canariensis | Poaceae | Europe, Africa | 3 | 2.17 | 0.0497 | 0.349 | D
Phytolacca americana | Phytolaccaceae | N America | 7 | 1.57 | 0.0292 | 1.169 | D
Portulaca oleracea | Portulacaceae | Eurasia | 22 | 7.36 | 0.1025 | 2.380 | A
Potentilla intermedia | Rosaceae | Eurasia | 2 | 1.75 | 0.0006 | -1.526 | H
Rubia tinctorum | Rubiaceae | Eurasia | 1 | 3 | 0.0000 | -1.809 | G
Rumex patientia | Polygonaceae | Eurasia | 31 | 2.81 | 0.0461 | 1.385 | A
Ruta graveolens | Rutaceae | Europe | 6 | 0.5 | 0.0247 | 1.050 | D
Salvia nemorosa | Lamiaceae | Eurasia | 1 | 15 | 0.0000 | -1.533 | G
Satureja hortensis | Lamiaceae | Eurasia | 2 | 0.5 | 0.0296 | -1.033 | D
Secale cereale | Poaceae | Eurasia, Africa | 4 | 1.75 | 0.0572 | 0.044 | D
Senecio squalidus | Compositae | Europe, Africa | 1 | 3 | 0.0000 | -1.809 | G
Setaria parviflora | Poaceae | Americas | 1 | 0.5 | 0.0000 | -2.117 | H
Silene dichotoma | Caryophyllaceae | Eurasia | 1 | 0.5 | 0.0000 | -2.117 | H
Sisymbrium altissimum | Brassicaceae | Eurasia | 1 | 15 | 0.0000 | -1.533 | G
Sisyrinchium montanum | Iridaceae | N America | 2 | 26.25 | 0.0001 | -0.312 | G
Solidago canadensis | Asteraceae | N America | 29 | 7.79 | 0.0048 | 1.506 | E
Solidago gigantea | Asteraceae | N America | 66 | 3.39 | 0.0068 | 1.480 | E
Solidago graminifolia | Asteraceae | N America | 1 | 0.5 | 0.0000 | -2.117 | H
Sorghum bicolor | Poaceae | Africa | 1 | 0.5 | 0.0000 | -2.117 | H
Sorghum halepense | Poaceae | Africa, Asia | 6 | 1.75 | 0.0096 | 0.059 | D
Sporobolus indicus | Poaceae | Africa, Asia, Americas | 2 | 9 | 0.0061 | -0.638 | G
Sporobolus vaginiflorus | Poaceae | N America | 1 | 62.5 | 0.0000 | -1.288 | G
Symphytum asperum | Boraginaceae | Asia temp | 1 | 0.5 | 0.0000 | -2.117 | H
Symphytum orientale | Boraginaceae | Eurasia | 9 | 6.5 | 0.0000 | -0.977 | G
Triticum aestivum | Poaceae | Asia temp | 8 | 0.81 | 0.0485 | 1.451 | D
Tropaeolum majus | Tropaeolaceae | S America | 2 | 0.5 | 0.0014 | -0.498 | H
Veronica filiformis | Plantaginaceae | Eurasia | 5 | 7.9 | 0.0856 | 0.93 | C
Veronica peregrina | Plantaginaceae | Americas | 3 | 1.33 | 0.0857 | 0.427 | D
Veronica persica | Plantaginaceae | Eurasia, Africa | 252 | 2.32 | 0.0609 | 3.170 | A
Vicia ervilia | Fabaceae | Eurasia | 2 | 0.5 | 0.0637 | -0.282 | D
Vicia faba | Fabaceae | Anecophyte | 1 | 0.5 | 0.0000 | -2.117 | H
Vicia tenuifolia | Fabaceae | Eurasia, Africa | 594 | 4.55 | 0.0362 | 3.072 | A
Xanthium spinosum | Asteraceae | S America | 14 | 2.93 | 0.0029 | 0.661 | E
Xanthium strumarium | Asteraceae | Americas | 2 | 1.75 | 0.0444 | 0.331 | D
Zea mays | Poaceae | Americas | 1 | 87.5 | 0.0000 | -1.231 | G
Supplementary Figures
[Figure S1: PCA scatterplot. Points are labelled with species names; variable labels are env_volume, X_range, Y_range, Nr_releves and Ave_Cover; grid scale d = 2; inset: eigenvalues.]
Fig. S1. Scatterplot of the PCA of invasion success measures for all alien species occurring in
the dataset (n = 203). Occasionally occurring shrub and tree species were subsequently
removed in order to focus on herbaceous species.
[Figure S2: one panel per explanatory variable, y-axis "fitted function". Panel titles with relative influence: MDMCO.pool.phylo (58.7%), MDMCO.pool.func (9%), start_year_EU (8%), SM_hier_Plot (6.1%), MDMCO.pool.plot.func (5.1%), PH_hier_Plot (4.8%), SLA_hier_Plot (2%), MDMCO.pool.plot.phylo (1.2%), logSM.pmm (1%), SLA_hier_Hab (0.9%), path.mostcommon (0.9%; levels AgriForest, OtherUnknown), PH_hier_Hab (0.8%), SM_hier_Hab (0.7%), logSLA.pmm (0.6%), logPH.pmm (0.1%), GrowthForm (0%; levels graminoid, herb, herb/shrub).]
Fig. S2. Response curves of invasion success to all variables in the BRT. All variables are
rescaled between zero and one (apart from the hierarchical indices, which are left unscaled
to highlight positive vs. negative trait differences). Invasion success is expressed by the
first axis of a PCA: negative values indicate low invasion success and positive values
indicate high invasion success.
[Figure S3: values plotted per Rabinowitz class (class labels as extracted: H, D, G, E, B, C, A); axis labels as extracted: MDMCO.pool.phylo, MDMCO.pool.func, MDMCS.hab.phylo, MDMCS.hab.func.]
Fig. S3. The two most important explanatory variables in the BRT, separated according to
Rabinowitz classes. Points show means and bars show standard errors. The
least successful aliens are in the class on the left of the graph (class H: small regional
distribution, locally not abundant, small niche breadth) and the most successful aliens
are on the right (class A: regionally widespread, locally abundant, large niche breadth).
See contingency table in Fig. 1 for a full explanation of the labels.
Appendix S1 – Details on vegetation plots: the DIVGRASS project
Permanent grasslands are broadly defined as “Land on which vegetation is composed of
perennial or self-seeding annual forage species which may persist indefinitely. It may include
either naturalized or cultivated forages” (Allen et al. 2011). According to European Union
laws, this definition is further restricted to grasslands that have been used for at least five
years to produce forage, and which have not been ploughed nor re-seeded during this period
(Plantureux et al. 2012). In France, permanent grasslands are mainly found in fodder regions,
where they account for more than 20% of the total land surface area (Fig. S4).
Here we give further details on the rationale and data structure of the DIVGRASS project
conducted at the CESAB, the French Centre for the Synthesis and Analysis of Biodiversity.
The DIVGRASS project was aimed at integrating and sharing existing knowledge on both
taxonomic and functional plant diversity, as well as on ecosystem properties and functioning
of the C3 French permanent grasslands. Specifically, it allowed plant community
data (species’ occurrences and abundances) and plant trait data to be assembled, along with
multiple environmental layers relevant to characterizing climate, soil and land use, within a
coherent platform.
The resulting DIVGRASS database comprises 51,486 vegetation plots from multiple data
sources (see Violle et al. 2015 for full data sources). Plots, of 50 to 100 m² on average, are
homogeneous with respect to the type of vegetation sampled. These plots are representative not
of a locality but of an ecological situation; in other words, each is an example of coexistence
between a flora and an environment. The data consist of the visually estimated relative cover of
all species present in each plot, using a 6-level abundance scale derived from the Braun-Blanquet
(1932) cover scale: [0%,1%], ]1%,5%], ]5%,25%], ]25%,50%], ]50%,75%] and ]75%,100%].
We used the midpoint of each class to derive a percentage cover for each species, i.e. 0.5%,
3%, 15%, 37.5%, 62.5% and 87.5%, respectively. The vegetation plots are dominated by
graminoids with ca. 20 species per plot.
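The conversion from cover classes to percentage cover can be sketched as follows. This is a minimal illustration, not the project's actual code: the names `COVER_CLASSES`, `class_midpoint` and `mean_local_abundance` are hypothetical, and averaging midpoints over the plots where a species occurs is an assumption about how a per-species local abundance could be derived.

```python
# Cover classes as (lower, upper) bounds in percent, following the six-level scale above.
COVER_CLASSES = [(0.0, 1.0), (1.0, 5.0), (5.0, 25.0),
                 (25.0, 50.0), (50.0, 75.0), (75.0, 100.0)]


def class_midpoint(class_index):
    """Return the midpoint (% cover) of a 1-based cover class."""
    lower, upper = COVER_CLASSES[class_index - 1]
    return (lower + upper) / 2.0


def mean_local_abundance(class_indices):
    """Average % cover of one species over the plots where it occurs (assumed definition)."""
    covers = [class_midpoint(i) for i in class_indices]
    return sum(covers) / len(covers)


# Midpoints of the six classes.
midpoints = [class_midpoint(i) for i in range(1, 7)]
print(midpoints)

# Hypothetical species recorded in class 1 in five plots and in class 3 in one plot.
print(round(mean_local_abundance([1, 1, 1, 1, 1, 3]), 2))
```

Under these assumptions, the six midpoints are exactly the percentages listed in the text, and a species seen mostly at very low cover keeps a low mean abundance even when it is locally dense in one plot.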
Vegetation databases such as DIVGRASS have undoubtedly many advantages and great
potential, but can also have limitations and potential biases. For example, collating historical
vegetation plots is often biased by unbalanced sampling efforts towards patrimonial plant
communities (e.g. the species-rich dry calcareous meadows) and by differences in the timing
of sampling across plots. However, while vegetation plots such as those included in
DIVGRASS are generally not resampled, in the large majority of cases they were
visited in the period of greatest species detectability (late spring to summer), i.e. the best time
for identifying flowering and vegetative parts. Moreover, the surveyors were
generally skilled botanists able to identify seedlings and dead parts, so that ephemeral and
rare species should still be adequately sampled (if not over-represented). Finally, the
great majority of plots included in this dataset were sampled after the 1990s, so that they
should reflect current ecological conditions relatively well.
Soil characteristics and land use are key to assess the drivers of vegetation changes and
functioning but there is little reliable information at large scales. However, in the context of
this project, we benefit from unique country-wide soil and land use databases available for
France.
Figure S4. (A) Spatial distribution of French permanent grasslands (FPGs) and (B) location of the
51,486 vegetation relevés collated in the DIVGRASS database. In (A), the green colour scale
represents the coverage (%) of FPGs in each 5 km x 5 km grid cell. In (B),
the heat colour scale represents the number of relevés per grid cell; red indicates ten or more
relevés. Grid cells with a permanent grassland cover below 20% are not shown (grey).
Reproduced from Violle et al. (2015).
References
Allen, V.G., Batello, C., Beretta, E.J., Hodgson, J., Kothmann, M., Li, X. et al. (2011). An
international terminology for grazing lands and grazing animals. Grass Forage Sci,
66, 2-28.
Braun-Blanquet, J. (1932). Plant sociology. McGraw-Hill, New York.
Plantureux, S., Pottier, E. & Carrere, P. (2012). Permanent grassland: new challenges, new
definitions? Fourrages, 211, 181-193.
Violle, C., Choler, P., Borgy, B., Garnier, E., Amiaud, B., Debarros, G. et al. (2015).
Vegetation ecology meets ecosystem science: Permanent grasslands as a functional
biogeography case study. Science of The Total Environment, in press.
Appendix S2 – Details on phylogeny reconstruction
We reconstructed a genus-level phylogeny for the entire species pool using the procedure
proposed by Roquet et al. (2013). We retrieved from GenBank three conserved chloroplast regions
(matK, rbcL, ndhF) for all available genera, plus two further chloroplast regions (rpl16 and trnL-F)
and the nuclear ribosomal ITS region for certain families (taxonomically clustered alignment at
the family level was performed for these three regions). Sequences were aligned with MAFFT
(Katoh et al. 2005), checked by eye, and trimmed with trimAl (Capella-Gutierrez et al.
2009). Maximum likelihood phylogenetic inference analysis was performed with RAxML
(Stamatakis et al. 2008): 100 independent searches were carried out to retrieve 100 trees
applying a supertree constraint at the family-level based on Davies et al. (2004) and Moore et
al. (2010); and node support was assessed by bootstrap (BS) analysis with 1000 replicates.
Given that the likelihoods of the 100 ML trees obtained varied only slightly (from -585751.9
to -585791.1 for the best ML tree), and that BS analysis showed a high robustness for most of
the nodes (71% of nodes obtained a BS support > 70%, and an additional 11% of nodes
obtained a moderate BS support of 50-70%), we chose to conduct all analyses only based on
the best maximum likelihood tree. Species of the same genus were included as polytomies.
The tree was dated using penalized likelihood as implemented in r8s (Sanderson 2003) based
on fossil information extracted from Smith et al. (2010) and Bell et al. (2010). To calculate
distance-based phylogenetic metrics (see below), we extracted the cophenetic distance from
the phylogenetic tree.
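As an illustration of what a cophenetic distance is, the sketch below computes the summed branch length separating two tips on a small, entirely hypothetical dated tree. The source does not state which software was used for this step; the tree, its branch lengths and the function names are assumptions made for the example only.

```python
# Hypothetical toy tree: each node maps to (parent, branch length to parent).
# Tips A and B share the internal node n1; tip C attaches directly to the root.
TREE = {
    "root": (None, 0.0),
    "n1": ("root", 2.0),
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "C": ("root", 3.0),
}


def path_to_root(node):
    """Return {ancestor: distance from node to that ancestor}, including node itself."""
    dist, path = 0.0, {}
    while node is not None:
        path[node] = dist
        parent, length = TREE[node]
        dist += length
        node = parent
    return path


def cophenetic(a, b):
    """Sum of branch lengths on the path joining tips a and b."""
    pa, pb = path_to_root(a), path_to_root(b)
    # Among shared ancestors, the most recent one minimises the summed distance.
    return min(pa[n] + pb[n] for n in pa if n in pb)


print(cophenetic("A", "B"), cophenetic("A", "C"))
```

On a dated (ultrametric) tree such as this toy example, the distance between two tips equals twice the age of their most recent common ancestor, so congeneric species placed in a polytomy would all be equidistant from one another.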
References
Bell C.D., Soltis D.E. & Soltis P.S. (2010). The age and diversification of the angiosperms
re-revisited. American Journal of Botany, 97, 1296-1303.
Capella-Gutierrez S., Silla-Martinez J.M. & Gabaldon T. (2009). trimAl: a tool for automated
alignment trimming in large-scale phylogenetic analyses. Bioinformatics, 25, 1972-1973.
Davies T.J., Barraclough T.G., Chase M.W., Soltis P.S., Soltis D.E. & Savolainen V. (2004).
Darwin's abominable mystery: Insights from a supertree of the angiosperms.
Proceedings of the National Academy of Sciences of the United States of America,
101, 1904-1909.
Katoh K., Kuma K., Toh H. & Miyata T. (2005). MAFFT version 5: improvement in accuracy
of multiple sequence alignment. Nucleic Acids Research, 33, 511-518.
Moore M.J., Soltis P.S., Bell C.D., Burleigh J.G. & Soltis D.E. (2010). Phylogenetic analysis
of 83 plastid genes further resolves the early diversification of eudicots. Proceedings
of the National Academy of Sciences of the United States of America, 107, 4623-4628.
Roquet C., Thuiller W. & Lavergne S. (2013). Building megaphylogenies for macroecology:
taking up the challenge. Ecography, 36, 13–26.
Sanderson M.J. (2003). r8s: inferring absolute rates of molecular evolution and divergence
times in the absence of a molecular clock. Bioinformatics, 19, 301-302.
Smith S.A., Beaulieu J.M. & Donoghue M.J. (2010). An uncorrelated relaxed-clock analysis
suggests an earlier origin for flowering plants. Proceedings of the National Academy
of Sciences of the United States of America, 107, 5897-5902.
Stamatakis A., Hoover P. & Rougemont J. (2008). A Rapid Bootstrap Algorithm for the
RAxML Web Servers. Systematic Biology, 57, 758-771.
Appendix S3 – Details on trait preparation
We collated information on four key functional traits for both alien and native species:
Specific Leaf Area (SLA; the ratio of leaf area to dry mass), plant maximum height at
maturity (Height), seed mass (SM) and growth form. These trait data were extracted from the
TRY database (Kattge et al. 2011), the Androsace database (see Thuiller et al. 2014) and
additional local datasets (see Violle et al. 2015 for details). We kept only species for which
we had information on at least two traits (2930 species from the initial 4280). Note that for all
the studied traits, trait availability increased with species frequency (Fig. S5, reproduced from
Violle et al. 2015). In other words, trait data were more available for frequent native species
than for rare species, so that we can be reasonably sure that community patterns were well
represented. Also, note that growth form was available for all species.
Figure S5. Trait availability in the plant traits module of the DIVGRASS platform. The x-axis
represents the species’ frequency in the plots module of the DIVGRASS platform, based on
occurrence data across vegetation plots. The y-axis represents the proportion of species for which trait
values are available in the Plant traits module of the DivGrass platform (black colour: available
values; grey colour: lack of data). Traits are ordered by decreasing data availability (n = total number
of species with available trait values). Partially reproduced from Violle et al. (2015).
For the species for which trait information was still partly missing, we used multivariate imputation
by chained equations based on predictive mean matching to estimate the missing values from
the relationships between the continuous traits, as implemented by the ‘mice’ package for
R (van Buuren & Groothuis-Oudshoorn 2011). The algorithm imputes each incomplete trait
column (the target column) by generating 'plausible' synthetic values given the other trait
columns in the data. In the predictive mean matching method, a regression model of the target
trait on the predictors (i.e. all other traits in the dataset) is simulated and used to predict a
value for every species. For each missing value, the imputed value is then randomly chosen from
the set of observed values whose predicted values are closest to the predicted value for the
missing case (van Buuren & Groothuis-Oudshoorn 2011).
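The matching step can be illustrated with a deliberately simplified, single-predictor sketch. It is not the ‘mice’ implementation: it fits one ordinary least-squares line instead of simulating the regression parameters, runs a single pass rather than chained equations, and all names (`fit_line`, `pmm_impute`, `k`, the toy data) are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def pmm_impute(x, y, k=3, seed=0):
    """Fill None entries of y by predictive mean matching on predictor x."""
    rng = random.Random(seed)
    obs = [i for i, v in enumerate(y) if v is not None]
    a, b = fit_line([x[i] for i in obs], [y[i] for i in obs])
    pred = [a + b * xi for xi in x]
    out = list(y)
    for i, v in enumerate(y):
        if v is None:
            # Donors: the k observed cases whose predictions are closest to case i.
            donors = sorted(obs, key=lambda j: abs(pred[j] - pred[i]))[:k]
            # The imputed value is an observed value borrowed from a random donor.
            out[i] = y[rng.choice(donors)]
    return out


# Toy data: a fully observed trait (x) and a trait with two gaps (y).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, None, 4.2, 5.0, None]
imputed = pmm_impute(x, y)
print(imputed)
```

Because every imputed value is copied from an observed case, imputations stay within the observed range of the trait, which is in line with imputed ranges being comparable to, or slightly narrower than, observed ones.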
This is one of the most commonly used imputation methods (e.g., Baraloto et al. 2010; Paine
et al. 2011) and preserves non-linear relationships among traits (van Buuren &
Groothuis-Oudshoorn 2011). In an evaluation study, Penone et al. (2014) artificially
removed trait values from a trait dataset on mammals to simulate percentages of missing
values ranging from 10% to 80%, and showed that ‘mice’ produced less biased results than
other imputation methods and than analyses in which cases with missing values were
simply removed. In our study, imputed values had means and ranges comparable to the
observed values (Fig. S6) for all traits (the range of imputed values was only slightly narrower
for Seed Mass). This suggests that our imputation approach did not introduce any
directional bias in the dataset. Specifically, missing values amounted to 35% for SLA, 14% for
Height and 15% for Seed Mass (on average about 20% per trait).
[Figure S6: boxplots of observed and imputed values in three panels (SLA, Height, Seed Mass); y-axes on a logarithmic scale.]
Figure S6. Boxplots for the observed (green) and imputed (orange) trait values used in the study.
Width of the boxplots for imputed values is proportional to the number of values imputed (35% for
SLA, 14% for Height and 15% for Seed Mass). Note that the y-axis is logarithmically scaled.
References
Baraloto C., Paine C.E.T., Poorter L., Beauchene J., Bonal D., Domenach A.M., Herault B.,
Patino S., Roggy J.C. & Chave J. (2010). Decoupled leaf and stem economics in rain
forest trees. Ecology Letters, 13, 1338-1347.
Kattge J., Diaz S., Lavorel S., Prentice C., Leadley P., Bonisch G., Garnier E., Westoby M.,
Reich P.B., Wright I.J., Cornelissen J.H.C., Violle C., Harrison S.P., van Bodegom
P.M., Reichstein M., Enquist B.J., Soudzilovskaia N.A., Ackerly D.D., Anand M.,
Atkin O., Bahn M., Baker T.R., Baldocchi D., Bekker R., Blanco C.C., Blonder B.,
Bond W.J., Bradstock R., Bunker D.E., Casanoves F., Cavender-Bares J., Chambers
J.Q., Chapin F.S., Chave J., Coomes D., Cornwell W.K., Craine J.M., Dobrin B.H.,
Duarte L., Durka W., Elser J., Esser G., Estiarte M., Fagan W.F., Fang J., Fernandez-Mendez F., Fidelis A., Finegan B., Flores O., Ford H., Frank D., Freschet G.T., Fyllas
N.M., Gallagher R.V., Green W.A., Gutierrez A.G., Hickler T., Higgins S.I., Hodgson
J.G., Jalili A., Jansen S., Joly C.A., Kerkhoff A.J., Kirkup D., Kitajima K., Kleyer M.,
Klotz S., Knops J.M.H., Kramer K., Kuhn I., Kurokawa H., Laughlin D., Lee T.D.,
Leishman M., Lens F., Lenz T., Lewis S.L., Lloyd J., Llusia J., Louault F., Ma S.,
Mahecha M.D., Manning P., Massad T., Medlyn B.E., Messier J., Moles A.T., Muller
S.C., Nadrowski K., Naeem S., Niinemets U., Nollert S., Nuske A., Ogaya R.,
Oleksyn J., Onipchenko V.G., Onoda Y., Ordonez J., Overbeck G., Ozinga W.A.,
Patino S., Paula S., Pausas J.G., Penuelas J., Phillips O.L., Pillar V., Poorter H.,
Poorter L., Poschlod P., Prinzing A., Proulx R., Rammig A., Reinsch S., Reu B., Sack
L., Salgado-Negre B., Sardans J., Shiodera S., Shipley B., Siefert A., Sosinski E.,
Soussana J.F., Swaine E., Swenson N., Thompson K., Thornton P., Waldram M.,
Weiher E., White M., White S., Wright S.J., Yguel B., Zaehle S., Zanne A.E. & Wirth
C. (2011). TRY - a global database of plant traits. Global Change Biology, 17, 2905-2935.
Paine C.E.T., Baraloto C., Chave J. & Herault B. (2011). Functional traits of individual trees
reveal ecological constraints on community assembly in tropical rain forests. Oikos,
120, 720-727.
Penone C., Davidson A.D., Shoemaker K.T., Di Marco M., Rondinini C., Brooks T.M.,
Young B.E., Graham C.H. & Costa G.C. (2014). Imputation of missing data in life-history trait datasets: which approach performs the best? Methods in Ecology and
Evolution, 5, 961-970.
Thuiller W., Guéguen M., Georges D., Bonet R., Chalmandrier L., Garraud L., Renaud J.,
Roquet C., Van Es J. & Zimmermann N.E. (2014). Are different facets of plant
diversity well protected against climate and land cover changes? A test study in the
French Alps. Ecography, 37, 1254-1266.
van Buuren S. & Groothuis-Oudshoorn K. (2011). mice: Multivariate Imputation by Chained
Equations in R. Journal of Statistical Software, 45, 1-67.
Violle, C., Choler, P., Borgy, B., Garnier, E., Amiaud, B., Debarros, G. et al. (2015).
Vegetation ecology meets ecosystem science: Permanent grasslands as a functional
biogeography case study. Science of The Total Environment, in press.
Appendix S4 – Distance Metrics based on Co-occurrence to study invasions
We calculated a set of functional and phylogenetic similarity metrics to the natives for each
alien species based on co-occurrence information at two scales (invasibility metrics, Thuiller
et al. 2010). We chose to calculate indices based on co-occurrence because, in our modeling
framework at the scale of France, we needed to assess a single index value for each alien
species, representing its similarity to the overall native grassland assemblages.
Classical distance metrics used in invasion ecology, such as the weighted mean distance to the
native species (WMDNS) or the distance to the nearest native species (DNNS), are generally
calculated at the level of single communities/plots, resulting in many different values for each
alien species within a region (Gallien et al. 2014, Thuiller et al. 2010). By contrast, a category
of indices in community ecology describes community-level similarity patterns
by expressing the correlation between a matrix of phylogenetic/functional distances between
species and a matrix of pairwise co-occurrence indices between species (Hardy 2008).
Many examples of these indices exist, which differ only in the way co-occurrence is estimated
(Cavender-Bares et al. 2004, 2006, Helmus et al. 2007). These indices allow a single value
to be estimated for a region by relying on the information obtained from a large number of
sampling plots or communities.
Here we built on this latter approach to derive an index that expresses only the
dissimilarity of a single invader to the rest of the species in the region, and that can be used in
the context of invasion ecology. Apart from being based on a single focal alien species, our index
differs in that, instead of relating the dissimilarity to the natives with their co-occurrence through a
correlation index, we focused only on the distance to the most often co-occurring native
species (upper right corner in correlation plot, Fig. S7). This should make the index less
dependent on the number of observations available for each focal alien species. Also,
conceptually, the species that most often co-occur with an alien are the ones that are
expected to convey the maximum amount of information for that particular alien species. This
parallels the classical index based on the distance to the most
abundant species (DMAS), which is taken to be a good indicator of overall community
resistance (Thuiller et al. 2010, Gallien et al. 2014). Note that when only one
native species had the highest co-occurrence value, distances were calculated to that single
native species, whereas when several natives shared the highest co-occurrence value, an average
distance was calculated.
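The selection rule described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `mdmcs`, the tolerance `tol` used to detect ties, and the example values are all assumptions made for the sketch.

```python
def mdmcs(cooccurrence, distance, tol=1e-12):
    """Mean distance from a focal alien to its most often co-occurring natives.

    cooccurrence: co-occurrence index of the alien with each native species
    distance:     functional or phylogenetic distance of the alien to the
                  same natives, in the same order
    tol:          hypothetical tolerance for deciding that two values tie
    """
    if len(cooccurrence) != len(distance) or not cooccurrence:
        raise ValueError("need one co-occurrence value and one distance per native")
    best = max(cooccurrence)
    # Keep every native whose co-occurrence equals the maximum (ties included).
    selected = [d for c, d in zip(cooccurrence, distance) if abs(c - best) <= tol]
    # One native selected: its distance; several natives: their average distance.
    return sum(selected) / len(selected)


# Illustrative values only (not data from the study).
single = mdmcs([0.10, 0.45, -0.20], [2.0, 5.0, 1.0])  # one top native
tied = mdmcs([0.45, 0.45, -0.20], [2.0, 5.0, 1.0])    # two natives tie
```

In the first call only the second native has the highest co-occurrence, so its distance is returned; in the second call the two tied natives are averaged.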
[Figure S7: two panels, each plotting Functional Distance against Co-occurrence with the most co-occurring species highlighted. Left panel: COMPETITION. Right panel: ENVIRONMENTAL FILTERING.]
Figure S7. Conceptual figure for the Mean Distance to the Most often Co-occurring Species
(MDMCS) index. Represented are hypothetical regression lines of pairwise functional distance of a
focal alien to the natives as a function of its pairwise co-occurrence metric with the natives
(Cavender-Bares et al. 2004, 2006). In the left panel, the most often co-occurring natives are functionally
distant from the focal alien, suggesting that the species is competitively filtered. In the right panel,
the most often co-occurring natives are functionally similar to the focal alien, suggesting
environmental filtering.
In our specific analysis, for each alien species we calculated 1) the mean distance to the most
often co-occurring species (MDMCS) and 2) whether it had higher or lower values than those
species for each trait (i.e. its hierarchical position on each trait gradient). MDMCS was
calculated based on both phylogenetic and multi-trait functional distances, whereas the
hierarchical index was calculated for each trait independently. We identified the native
species which most often co-occurred with each alien species by using the V-score as a
measure of species co-occurrence (Lepš & Šmilauer 2003), modified to account for species
abundances. For a pair of species A and B, the V-score based on presence/absence data is
calculated as
$$V = \frac{ad - bc}{\sqrt{(a + b)(c + d)(a + c)(b + d)}},$$
where a is the number of units
where both species are present, b and c are the numbers of units where only species A or only species B is
present, respectively, and d is the number of units where neither of the two species is
present. Computationally, the value of the V-score is equivalent to the value of the Pearson
correlation coefficient between the presence/absence vectors of the two species. Hence, in
order to take abundance into account, we calculated the Pearson correlation coefficient
for each pair of species using the abundance vectors of the species (normalized to a
logarithmic scale) instead of just the presence/absences. The values of this co-occurrence index
range from -1 (complete segregation) to +1 (complete positive association). V-scores (and
therefore MDMCS) were calculated at both the plot (local community) and the habitat scale.
The habitat scale was the set of plots belonging to the same grassland type (one of the four
categories defined in the main text).
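The equivalence between the V-score and the Pearson correlation of presence/absence vectors can be checked numerically with a short sketch. The function names and example vectors are illustrative assumptions. The text does not specify the exact logarithmic transformation, so `log1p` (which keeps absences at zero) is used here purely as an assumption.

```python
import math


def v_score(pres_a, pres_b):
    """V-score from the 2x2 contingency counts a, b, c, d of two species."""
    pairs = list(zip(pres_a, pres_b))
    a = sum(1 for x, y in pairs if x and y)          # both present
    b = sum(1 for x, y in pairs if x and not y)      # only species A
    c = sum(1 for x, y in pairs if not x and y)      # only species B
    d = sum(1 for x, y in pairs if not x and not y)  # neither present
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("undefined when a species is present in all or no units")
    return (a * d - b * c) / denom


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("undefined for a constant vector")
    return sxy / math.sqrt(sxx * syy)


def abundance_cooccurrence(abund_a, abund_b):
    """Abundance-based co-occurrence: Pearson r on log1p-transformed abundances.

    log1p is an assumed transformation, not one stated in the text.
    """
    return pearson([math.log1p(v) for v in abund_a],
                   [math.log1p(v) for v in abund_b])


# Illustrative data for eight sampling units (not data from the study).
pres_a = [1, 1, 1, 0, 0, 0, 1, 0]
pres_b = [1, 1, 0, 0, 0, 1, 1, 0]
abund_a = [5, 10, 2, 0, 0, 0, 8, 0]
abund_b = [4, 12, 0, 0, 0, 1, 6, 0]

v = v_score(pres_a, pres_b)                   # a=3, b=1, c=1, d=3
r = pearson(pres_a, pres_b)                   # same value as v
cooc = abundance_cooccurrence(abund_a, abund_b)
```

For these vectors the V-score and the Pearson coefficient coincide, while the abundance-based index additionally reflects how strongly the two species covary where they occur.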
Finally, in order to assess the congruence of our novel co-occurrence based index with the
more classical indices used in invasion community ecology we performed a comparison of
MDMCS with WMDNS (weighted mean distance to the natives) and DNNS (Distance to the
closest native relative) metrics. To allow a comparison we calculated both WMDNS and
DNNS based on multi-trait functional distances for each plot (and for each habitat) and then
pooled them across plots (or habitats) in order to obtain a single value for each alien species
summarizing its plot-level and habitat-level functional similarity pattern in French grasslands.
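The pooling step can be sketched as below. The function names and toy values are assumptions. Weighting WMDNS by native abundance is also an assumption made for the illustration, as is pooling by a simple average across plots.

```python
def wmdns(distances, abundances):
    """Weighted mean distance of the alien to the natives in one plot.

    Weights are native abundances (an assumption made for this sketch).
    """
    total = sum(abundances)
    if total <= 0:
        raise ValueError("need at least one native with positive abundance")
    return sum(d * w for d, w in zip(distances, abundances)) / total


def dnns(distances):
    """Distance of the alien to its nearest native in one plot."""
    return min(distances)


def pool(values):
    """Pool plot-level (or habitat-level) values into one value per alien."""
    return sum(values) / len(values)


# Two hypothetical plots where the focal alien occurs:
# each entry holds the alien-to-native distances and the native abundances.
plots = [
    {"dist": [2.0, 4.0], "abund": [1.0, 3.0]},
    {"dist": [1.0, 3.0], "abund": [1.0, 1.0]},
]

pooled_wmdns = pool([wmdns(p["dist"], p["abund"]) for p in plots])
pooled_dnns = pool([dnns(p["dist"]) for p in plots])
```

The same functions apply at the habitat scale if each entry describes a habitat instead of a plot.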
[Figure S8: scatterplots of pooled WMDNS and pooled DNNS against MDMCS, at the plot level and at the habitat level. Pearson coefficients shown in the panels range from R = 0.65 to R = 0.8.]
Figure S8. Comparison of MDMCS.func (Mean Distance to Most often Co-occurring Species) metric
used in this paper with the more classical WMDNS (weighted mean distance to the natives) and
DNNS (Distance to the closest native relative) metrics calculated per plot (or habitat) and averaged
across plots (or habitats). Significant Pearson coefficients (R) are indicated in each panel.
We found that MDMCS.func was reasonably well correlated with both WMDNS and DNNS
averaged across communities at both the habitat and plot scale (Fig. S8). This suggests that
our novel index is quite congruent with some of the more classical metrics used in invasion
studies, if these are averaged across sampling units. Nevertheless we believe that our index is
conceptually a more suitable and direct choice in the case in which patterns are of interest at a
regional extent such as the one of this study, because species that co-occur more often with
the alien are emphasized, while bias introduced by species that might just accidentally be
present in one or few plots with the alien is avoided.
References
Cavender-Bares, J., Ackerly, D.D., Baum, D.A. & Bazzaz, F.A. (2004). Phylogenetic
overdispersion in Floridian oak communities. American Naturalist, 163, 823-843.
Cavender-Bares, J., Keen, A. & Miles, B. (2006). Phylogenetic structure of Floridian plant
communities depends on taxonomic and spatial scale. Ecology, 87, S109-S122.
Gallien, L., Carboni, M. & Münkemüller, T. (2014). Identifying the signal of
environmental filtering and competition in invasion patterns - a contest of approaches
from community ecology. Methods in Ecology and Evolution, 5, 1002-1011.
Hardy, O.J. (2008). Testing the spatial phylogenetic structure of local communities: statistical
performances of different null models and test statistics on a locally neutral community.
Journal of Ecology, 96, 914-926.
Helmus, M.R., Bland, T.J., Williams, C.K. & Ives, A.R. (2007). Phylogenetic measures of
biodiversity. American Naturalist, 169, E68-E83.
Lepš, J. & Šmilauer, P. (2003). Multivariate Analysis of Ecological Data using CANOCO.
Cambridge University Press, Cambridge.
Thuiller, W., Gallien, L., Boulangeat, I., de Bello, F., Münkemüller, T., Roquet, C. et al.
(2010). Resolving Darwin's naturalization conundrum: a quest for evidence. Diversity
and Distributions, 16, 461-475.
Appendix S5 – Influence of evolutionary history on invasion success
Since species share evolutionary history to some degree (Blomberg and Garland 2002;
Felsenstein 1985), modeling species independently of each other may bias analyses if closely
related species have very similar values for the response variable of interest. We thus tested
for phylogenetic signal of our continuous indicator of invasion success (i.e. PCA1) on the
alien species phylogenetic tree using Blomberg’s K statistic. The significance test was based on
the variance of phylogenetically independent contrasts (PIC) relative to tip shuffling
randomization (Blomberg & Garland 2002).
We found no evidence that closely related alien species had similar success. Instead we found
that the most successful species were distributed quite randomly throughout the tree (Fig. 2 in
main text; K = 0.074, PIC variance p-value = 0.931). Hence, we considered the success of
alien species to be independent of phylogenetic relatedness in further analyses.
References
Felsenstein J. (1985). Confidence-limits on phylogenies - an approach using the bootstrap.
Evolution, 39, 783-791.
Blomberg S.P. & Garland T. (2002). Tempo and mode in evolution: phylogenetic inertia,
adaptation and comparative methods. Journal of Evolutionary Biology, 15, 899-910.
Appendix S6 – Cross Validation of model for Invasion Success
In order to assess whether our modeling framework also had the potential to estimate
future invasion success of newly introduced species, we attempted to validate the
predictive performance of our model. Given that we did not have any additional suitable
independent data available, we ran a cross validation procedure by following two
separate approaches: 1) Repeated sample splitting, and 2) Jackknifing.
Repeated sample splitting
For our first approach, we randomly split the data into a validation (80%) and a
calibration group (20%) of species. Then we fit the GBMs in the calibration subset of
species using exactly the same procedure used to fit the full model, and used these
models to predict invasion success in the rest of the species (validation data set). The
splitting procedure was repeated 50 times and model predictive performance was
evaluated on average across repetitions. For each split run, performance was assessed by
comparing the predicted values of invasion success with the observed values based on
several goodness-of-prediction statistics: the R-squared of the relationship, Pearson’s R
of the relationship, the Mean Square Error (MSE), and the G-statistic (Guisan &
Zimmermann 2000). The G-statistic measures how effective a prediction might be
relative to that which could have been derived using the sample mean. A G-value equal
to 100% indicates perfect prediction, a positive value indicates a more reliable model
than if one had used the sample mean, and a negative value indicates a less reliable
model than if one had used the sample mean.
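The four statistics can be sketched for a single split run as follows. This is an illustration under stated assumptions, not the authors' code: the G-statistic is written in its usual goodness-of-prediction form, G = (1 - SSE/SST) x 100, which matches the interpretation given above, and R-squared is taken as the squared Pearson coefficient. The function name and example values are hypothetical.

```python
import math


def goodness_of_prediction(observed, predicted):
    """Return Pearson r, R-squared, MSE and G-statistic (in %) for one run."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    if sst == 0:
        raise ValueError("observed values must not all be identical")
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    # Pearson r is undefined for constant predictions (e.g. the sample mean).
    r = cov / math.sqrt(sst * var_pred) if var_pred > 0 else float("nan")
    return {
        "pearson_r": r,
        "r_squared": r * r,                       # assumed: squared Pearson r
        "mse": sse / n,
        "g_statistic": 100.0 * (1.0 - sse / sst),  # assumed form of G
    }


# Illustrative values only (not data from the study).
obs = [1.0, 2.0, 3.0, 4.0]
stats_model = goodness_of_prediction(obs, [1.5, 1.5, 3.5, 3.5])
stats_mean = goodness_of_prediction(obs, [2.5, 2.5, 2.5, 2.5])
stats_perfect = goodness_of_prediction(obs, obs)
```

Predicting the sample mean for every species gives G = 0%, and a perfect prediction gives G = 100%, in line with the interpretation in the text. Averaging each statistic over the repeated splits would then give one summary value per statistic.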
Table S3. Goodness of prediction statistics for the GBM, averaged across 50 splitting runs.
         R-squared    Pearson r    G-statistic    MSE
Value    0.330        0.565        29.64 %        1.410
Jackknifing
For our second approach, we removed one species at a time from the full dataset and fit
the GBM to model invasion success based only on the remaining species. Then we used
the fitted model to predict invasion success of the single removed species, based only on
the predictors (traits, invasion history and invasibility metrics). For each prediction we
calculated the squared error compared to the true value of invasion success. We repeated
this procedure 1000 times (with replacement) in order to have at least a few predictions
for each species. Finally, we averaged within species and checked how the predictions
and the MSE values varied depending upon the Rabinowitz class and the frequency,
abundance and specialization classes.
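The jackknife loop can be sketched as below. This is a hedged illustration: a trivial mean predictor (`fit_mean`, hypothetical) stands in for the GBM, "with replacement" is read here as drawing the removed species at random with replacement across the repetitions, and all names and values are assumptions.

```python
import random


def jackknife_mse(predictors, response, fit, n_draws=1000, seed=0):
    """Per-species mean squared error from repeated leave-one-out predictions.

    fit: callable taking (train_x, train_y) and returning a predict function;
         a stand-in for refitting the model without the removed species.
    """
    rng = random.Random(seed)
    n = len(response)
    errors = {i: [] for i in range(n)}
    for _ in range(n_draws):
        i = rng.randrange(n)  # species removed in this repetition
        train_x = predictors[:i] + predictors[i + 1:]
        train_y = response[:i] + response[i + 1:]
        predict = fit(train_x, train_y)
        errors[i].append((predict(predictors[i]) - response[i]) ** 2)
    # Average within species; species never drawn are omitted.
    return {i: sum(e) / len(e) for i, e in errors.items() if e}


def fit_mean(train_x, train_y):
    """Hypothetical placeholder model: always predicts the training mean."""
    mean_y = sum(train_y) / len(train_y)
    return lambda x: mean_y


# Illustrative data for three species (not data from the study).
x = [[0.0], [1.0], [2.0]]
y = [1.0, 2.0, 3.0]
mse_by_species = jackknife_mse(x, y, fit_mean)
```

The resulting per-species values could then be grouped by any classification of the species, for example the Rabinowitz classes mentioned above.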
Figure S9. Observed invasion success and predicted invasion success for the species
"excluded" in fitting the model across classes of frequency (first row), abundance (second row)
and generalism (third row). The significance (p-value) of the difference between the
two classes, assessed with a t-test, is also indicated.
[Figure S10: first row, predicted (Pred) and observed (Obs) Invasion Success by Rabinowitz class; second row, Mean Square Error by Rabinowitz class.
Class codes: H - NarrowSparseSpecialist, D - NarrowSparseGeneralist, G - NarrowAbundantSpecialist, E - WideSparseGeneralist, B - WideAbundantSpecialist, C - NarrowAbundantGeneralist, A - WideAbundantGeneralist.
Species labelled in the Mean Square Error panel: Cotula coronopifolia, Amaranthus deflexus, Vicia tenuifolia, Oenothera biennis, Linum austriacum, Juncus tenuis, Paspalum dilatatum, Bromus inermis, Erigeron annuus, Rumex patientia.]
Figure S10. First row: observed and predicted values of invasion success for the "excluded"
species across Rabinowitz classes. Second row: MSE for the excluded species across Rabinowitz
classes. For the "WideAbundantGeneralist" category two groups of outlier species are
highlighted: 1) species that have the highest MSE and are underpredicted (invasive species that
are not well predicted), and 2) species that have the lowest MSE and have high predicted
invasiveness (correctly identified highly invasive species).
Conclusions
We found that the model had moderate predictive performance, but it was still on
average 30% better than using the sample mean to estimate invasion success of species
(Table S3). According to the Pearson R (0.56) and R-squared (0.33), performance was
reasonable (Table S3). In particular, frequent species are on average predicted as more
invasive than the narrow ones (Fig. S9, first row, second panel). The same is true for
abundant species and generalist species (Fig. S9, second and third rows).
Some of the highest prediction errors occurred for the NarrowAbundantGeneralist group
and for some of the WideAbundantGeneralist species (Fig. S10). The
NarrowAbundantGeneralist species (class C) are the ones that are most underpredicted
(Fig. S10, first row). These species are locally abundant and generalist, but they are at
the moment restricted to a narrow extent, potentially close to introduction sources,
which suggests that they are likely not to be in equilibrium. Indeed, it is intuitive that this
group of species would be the most difficult to predict.
The WideAbundantGeneralist species are mostly correctly predicted as having high
invasion success (Fig. S10, first row), but nevertheless MSE for this group of species is
quite high. However, we can note that some of the most well-known invaders in this
category (O. biennis, E. annuus, B. inermis, J. tenuis; Fig. S10, second row), which have the
highest invasion success values, are indeed very well predicted by the model (low MSE),
whereas it is the species that are generally considered less problematic which are less
well predicted. This is an encouraging finding from a conservation point of view.