Uploaded by Jing Cui

Analyze the Characteristics and Development of Music based on Network

Problem Chosen: D
2021 MCM/ICM Summary Sheet
Team Control Number: 2105383
Analyze the Characteristics and Development of Music
based on Network
The inheritance and development of music reflect humanity's collective experience. To better understand the impact of music, we use models to quantify it. This paper starts by establishing a music influence model and then examines how music evolves as society changes.
Problem 1 is to establish a model of music influence. We propose a genre influence coefficient and draw a weighted directed network graph to analyze a subset of music influence. Pop/Rock and The Beatles are found to be the most influential genre and artist, respectively.
Problem 2 is to measure musical similarity. First, a Pearson correlation coefficient model and visualization are used to compare the weights of the variables. Since music similarity can be regarded as the distance between data points, we use the Euclidean distance to calculate it. We find that music of different genres blends strongly.
Problem 3 is to find the differences between music genres. We use the formula from Problem 2 to calculate how close the genres are to each other, and track the changes of all parameters to explore the development of genres. Avant-Garde and Blues are found to have the highest similarity.
Problem 4 is to determine whether influencers actually affect their followers and which characteristics carry the influence. We cluster the followers, normalize the data, and calculate the Euclidean distance to obtain the similarity, which shows that influential artists do affect their followers. Finally, by calculating the difference between the clustered data and the musical characteristic parameters of The Beatles, we identify the influential characteristics and artists.
Problem 5 is to explore the characteristics and leaders of musical change. We calculate popularity data by matching artists' genres and music careers, and identify Ray Charles and other revolutionaries.
Problem 6 is to analyze the development and transformation of a specific music genre. We use time series and grey correlation analysis to measure the dynamic influence index of the genre's development, and obtain the dynamic influence index of R&B.
Problem 7 is to judge the influence of the external environment on music. We visualize the total number of songs, the number of songs of each genre, and the musical characteristics over time, and analyze the visualizations together to find the external factors affecting music. The development of music is often influenced by social change.
The strength of this paper lies in its vivid and diverse visualizations of the data and model results, which are combined with analysis to draw more objective and realistic conclusions.
Keywords: Music Influence Model; Euclidean distance; Clustering model; Grey correlation analysis
Team # 2105383
Page 2 of 25
Contents
1 Introduction
1.1 Problem Background
1.2 Problem Restatement
1.3 Our Work
2 Assumptions and Justifications
3 Notations
4 Music influence
4.1 Data preprocessing
4.2 directed weighted network of music influence
4.2.1 influence coefficient of the genre
4.2.2 conclusion
4.2.3 Analysis
4.3 Music similarity model
4.3.1 Pearson Correlation Coefficients
4.3.2 Visualization
4.3.3 Calculation of Euclidean Distance
4.4 Division of music genres
4.4.1 The similarity was observed using Euclidean examples
4.4.2 The variation of genres over time
5 Music Artist and Genre Revolution
5.1 Cluster model
5.1.1 Hierarchical clustering method
5.1.2 The principle of hierarchical clustering method
5.1.3 clustering process
5.1.4 Euclidean distance
5.1.5 Similarity of different parameters between influencer and follower
5.2 The analysis of music revolution
5.3 The analysis of music changers
5.4 R&B genre changes over time
5.4.1 time series model
5.4.2 Grey Relation Analysis
6 The External Factors Affecting Music
6.1 Visualization of music features
6.2 Visualizing the number of songs
6.3 Analysis
7 Sensitivity Analysis
8 Model Evaluation and Further Discussion
8.1 Strengths
8.2 Weaknesses
9 Document for ICM
References
1 Introduction
1.1 Problem Background
The inheritance and development of music are built on the collective experience of mankind. At the same time, because musicians differ in innate ingenuity, personal experience, technological context and historical background, even the new works created by a single musician are never completely alike.[1]
Over time, musical genres sometimes undergo revolutionary changes that give rise to new genres. Collaboration among musical artists, inheritance, and internal social factors are often important causes of such revolutionary change. It is therefore important to establish a model that describes the influence of music. To further understand this influence, we use models to quantify it.
1.2 Problem Restatement
Considering the complexity of the music artists and genres and the data given in the problem, we address the following questions:
➢ Create a directed network to describe musical influence among musicians and develop parameters to capture the influence. Explore and describe a subset of musical influence and the influence measures.
➢ Develop a model of music similarity and compare the similarity between artists of the same genre and artists of different genres.
➢ There is an interplay between musical genres. Describe and compare these genres. Explore the characteristics and evolution of a genre as well as the connections among multiple genres.
➢ Using the data_influence data set, determine whether the similarity data show that the identified influencers actually influence their respective followers. Do all musical characteristics play the same role in this influence?
➢ Explore the characteristics and representatives of major changes in the field of music.
➢ Explore and explain the evolution of a musical genre and the indicators that reveal its dynamic influencers.
➢ Describe the influence of different environments and cultures on music, and the impact of political, social and technological changes on music within the network.
1.3 Our Work
The structure of this paper is shown in Figure 1:
Figure 1 Structure of this Paper
2 Assumptions and Justifications
➢ We assume that all influencers have the same influence on the same follower, regardless of the genres of the influencers and the follower.
In reality, it is difficult to describe influence with specific values. To make the influence model easier to quantify, this paper assumes that all influencers of a follower have the same influence.
➢ We assume that all songs sung by an artist are of the same genre as the artist.
In the real world, an artist may sing songs of various genres, but the artist's main genre still dominates. This paper simplifies the model by ignoring the cases in which an artist's songs differ from the artist's own genre.
➢ We assume that the data given in the problem are representative and give an overview of music development.
This ensures that the model produces reliable results from reliable data.
➢ When the genres of the influencer and the follower differ, we assume that the influence relationship is the same for every pair of genres.
Evaluating a different degree of influence for each pair of genres would require a large amount of data. This assumption keeps the model simple to build and solve.
3 Notations
The key mathematical notations used in this paper are listed in Figure 2.
Symbol     Description
q          A comprehensive influence score for an artist
n          The total number of people affected by the artist
Count_i    The total number of people affected by the artist's i-th follower
ω_i        The influence coefficient of the genre of the artist's i-th follower
θ          Influencer's genre
θ_0        Follower's genre
r          Pearson correlation coefficient
d          Euclidean distance
J          Aggregation coefficient
k          Class number
           The difference between the data
           Resolution coefficient
Figure 2 Notations used in this paper
4 Music influence
In this section, we will explore the music influence of different music artists, their music
similarity and the formation of genres.
4.1 Data preprocessing
Before data analysis, we must ensure that the data are usable.
If the given data have serious problems such as missing or redundant values, the accuracy of the results will suffer. To improve the usability of the data, we preprocessed them through data classification, data cleaning, information filtering, and the creation of new attributes and data metrics.
Step 1: In data cleaning, we used a Python program to search for missing and redundant values, and found none in the four data tables.
Step 2: When browsing the text data, we found some garbled characters. Following the information in the problem statement, we re-decoded the text as UTF-8, which resolved the garbling.
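The two cleaning steps can be sketched in Python as follows. This is a minimal illustration using only the standard library, not the team's actual script; the helper name `scan_table`, the column names and the sample rows are hypothetical.

```python
import csv
import io

def scan_table(raw_bytes, encoding="utf-8"):
    """Decode CSV bytes and list rows with empty cells and exact duplicate rows."""
    text = raw_bytes.decode(encoding)  # decode explicitly instead of relying on a platform default
    rows = list(csv.DictReader(io.StringIO(text)))
    missing = [i for i, row in enumerate(rows)
               if any(not (value or "").strip() for value in row.values())]
    seen, duplicates = set(), []
    for i, row in enumerate(rows):
        key = tuple(row.items())
        if key in seen:
            duplicates.append(i)  # redundant copy of an earlier row
        seen.add(key)
    return rows, missing, duplicates

# Hypothetical sample: a non-ASCII artist name, one empty cell, one repeated row.
sample = ("artist_name,genre\n"
          "Beyoncé,R&B\n"
          "The Beatles,\n"
          "Beyoncé,R&B\n").encode("utf-8")

rows, missing, duplicates = scan_table(sample)
print(missing, duplicates)

# Decoding the same bytes with the wrong codec reproduces the garbled text.
garbled = sample.decode("latin-1")
print(garbled.splitlines()[1])
```

In this sketch a row counts as redundant only when every field matches an earlier row; a looser rule (for example, matching on an identifier column) would need a different key.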
4.2 directed weighted network of music influence
An artist is often influenced by multiple other artists. As a baseline, we assume that each influencer has the same influence on a given follower, regardless of the genres of the follower and the influencers. To account for the effect of the influencer's and the follower's genres on the influence, we then introduce the genre influence coefficient and, from it, the comprehensive influence score.
Taking the above two quantities and the final comprehensive score as parameters, we construct a directed weighted network graph to analyze and measure a subset of musical influence.
4.2.1 influence coefficient of the genre
For $i \in [1, n]$, define each artist's overall impact score as follows:
$$ q = \sum_{i=1}^{n} \frac{1}{Count_i}\,\omega_i \tag{1} $$
$$ \omega_i = \begin{cases} 1, & \theta = \theta_0 \\ 0.5, & \theta \neq \theta_0 \end{cases} \tag{2} $$
$q$ : A comprehensive influence score for an artist
$n$ : The total number of people affected by the artist
$Count_i$ : The total number of people affected by the artist's $i$-th follower
$\omega_i$ : The influence coefficient of the genre of the artist's $i$-th follower
$\theta$ : Influencer's genre
$\theta_0$ : Follower's genre
Through the above formula, we combine various factors into an intuitive comprehensive
impact score. It helps to identify artists with stronger influence and to quantify the influence of
different artists for detailed, rigorous analysis and evaluation.
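As a worked illustration of equations (1) and (2), the score of a single artist can be computed as below. This is a sketch under stated assumptions: the function names and the three sample followers are hypothetical, and each follower's count is taken as a given positive number.

```python
def genre_coefficient(influencer_genre, follower_genre):
    """Equation (2): 1 when the two genres match, 0.5 otherwise."""
    return 1.0 if influencer_genre == follower_genre else 0.5

def influence_score(influencer_genre, followers):
    """Equation (1): sum over followers of (genre coefficient / follower's count).

    `followers` is a list of (follower_genre, count) pairs with count > 0.
    """
    return sum(genre_coefficient(influencer_genre, genre) / count
               for genre, count in followers)

# Hypothetical Pop/Rock artist with three followers.
followers = [("Pop/Rock", 2), ("R&B", 4), ("Pop/Rock", 1)]
q = influence_score("Pop/Rock", followers)
print(q)  # 1/2 + 0.5/4 + 1/1 = 1.625
```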
4.2.2 conclusion
Based on the results of the above model calculations, we obtain the weight of each artist's influence. After processing the data, we draw a directed weighted network graph with Gephi, setting influencer_id as the source, follower_id as the target and influencer_main_genre as the classification criterion.
The result is Figure 3:
Figure 3 Directed Weighted Network Graph
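One way to prepare such a graph is to write an edge table that Gephi's spreadsheet importer can read, with `Source`, `Target`, `Type` and `Weight` columns. The sketch below is only an illustration: the sample ids are made up, the `follower_main_genre` field is assumed to be available alongside the fields named above, and using the genre coefficient as the edge weight is an assumption rather than the paper's documented choice.

```python
import csv
import io

def to_edge_rows(influence_rows):
    """Turn influence records into directed, weighted edge rows."""
    edges = []
    for row in influence_rows:
        same_genre = row["influencer_main_genre"] == row["follower_main_genre"]
        edges.append({
            "Source": row["influencer_id"],
            "Target": row["follower_id"],
            "Type": "Directed",
            "Weight": 1.0 if same_genre else 0.5,   # assumed weighting: genre coefficient
            "Genre": row["influencer_main_genre"],  # extra column for grouping by genre
        })
    return edges

# Hypothetical influence records.
influence_rows = [
    {"influencer_id": 101, "follower_id": 202,
     "influencer_main_genre": "Pop/Rock", "follower_main_genre": "Pop/Rock"},
    {"influencer_id": 101, "follower_id": 303,
     "influencer_main_genre": "Pop/Rock", "follower_main_genre": "R&B"},
    {"influencer_id": 404, "follower_id": 101,
     "influencer_main_genre": "Blues", "follower_main_genre": "Pop/Rock"},
]

edges = to_edge_rows(influence_rows)
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["Source", "Target", "Type", "Weight", "Genre"])
writer.writeheader()
writer.writerows(edges)
edge_csv = buffer.getvalue()
print(edge_csv)
```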
Because of the large amount of data, we cannot see how each artist's influence unfolds in Figure 3. We therefore chose Pop/Rock, which has the largest number of entries in influencer_main_genre, for further drawing and analysis, as shown in Figure 4.
Figure 4 Each Artist's Influence of Pop/Rock
To further study the development of each music genre, we build a multi-series line chart for intuitive data visualization, taking the starting time of each artist's musical career as the horizontal axis and the number of artists of each genre who began their career at that time as the vertical axis.
Figure 5 Development of each Music Genre
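The data behind such a line chart are simple counts of artists per genre and career start. A minimal sketch, with hypothetical (genre, start year) pairs:

```python
from collections import Counter

# Hypothetical (genre, career start) pairs, one per artist.
artists = [
    ("Pop/Rock", 1950), ("Pop/Rock", 1960), ("Pop/Rock", 1960),
    ("R&B", 1940), ("R&B", 1950),
]

counts = Counter(artists)  # number of artists for each (genre, start) pair

# One series per genre: a list of (start, number of artists) points in time order.
series = {}
for (genre, start), number in sorted(counts.items()):
    series.setdefault(genre, []).append((start, number))

print(series)
```

Each list in `series` corresponds to one line of the chart.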
4.2.3 Analysis
Based on the results of the above model, we find:
➢ The influence of music is related to the number of an artist's followers and the similarity of the two genres involved. With a quantitative comprehensive score covering these two aspects, we gain a clearer understanding of each artist and can further analyze the historical development of a genre according to the influence of its artists and their era.
➢ Analysis of the subnetwork shows that Pop/Rock is the most widely spread and influential genre, and that The Beatles are the most influential artist in Pop/Rock. Drawing the same kind of graph reveals the composition of the subnetworks of artists in other genres.
➢ Observing the development of the genres over time, the line chart shows that Pop/Rock developed at high speed from 1950 to 1960, setting off a wave of Pop/Rock, and then slowed down. In addition, R&B began to develop gradually in 1940 and maintained relatively stable growth. Other genres have maintained relatively stable, low-speed development.
➢ Through comprehensive analysis of the influence of different artists, the era and the development of genres, we find that some influential artists can greatly influence the development of a genre and lay a foundation for its future development.
4.3 Music similarity model
We use a Pearson correlation coefficient model to preprocess the large, multi-dimensional data set. Since similarity is more meaningful to study for songs with high popularity, we examine the correlations between the variables and drop the variables that correlate weakly with popularity, keeping the main ones. In addition, we assign different weights to the retained variables. A common way to estimate the similarity between samples is to calculate the "distance" between them, so we apply the Euclidean distance to the processed variables to measure the similarity between pieces of music.
4.3.1 Pearson Correlation Coefficients
Pearson correlation coefficients measure how closely two data sets lie along a straight line, that is, the strength of the linear relationship between two interval-scale variables. When both variables are continuous, approximately normally distributed and linearly related, the Pearson correlation coefficient is commonly used to describe the degree of correlation between them. The formula is as follows:
$$ r = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} \tag{3} $$
In the above formula, $r$ represents the correlation coefficient, $n$ is the sample size, and $x_i$ and $y_i$ represent the $i$-th values of the two sample data sets respectively, with $\bar{x}$ and $\bar{y}$ their means. The range of $r$ is $[-1, 1]$. Negative values represent negative correlations and positive values represent positive correlations. The closer the absolute value is to 1, the stronger the correlation, which gives a quantitative description.
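The Pearson coefficient can be computed directly from its definition. The sketch below uses only the standard library; the function name and the sample values are hypothetical and chosen so that the two pairs are exactly linear.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of the same length, at least 2 values each")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(spread_x * spread_y)

# Hypothetical feature values for four songs.
energy = [0.2, 0.4, 0.6, 0.8]
loudness = [-20.0, -15.0, -10.0, -5.0]   # rises linearly with energy
acousticness = [0.9, 0.7, 0.5, 0.3]      # falls linearly with energy

print(pearson(energy, loudness))       # close to +1
print(pearson(energy, acousticness))   # close to -1
```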
4.3.2 Visualization
Based on the above formula, we compute descriptive statistics of the data and obtain Table 1.
Based on this table, we visualize the Pearson correlation coefficients of the data. The redder a cell, the stronger the positive correlation between the two variables; the bluer a cell, the stronger the negative correlation.
Table 1 Descriptive Statistics
Based on the visualization and a detailed comparison of the correlation coefficients among the variables, we find that danceability, energy, loudness, acousticness, instrumentalness, explicit and popularity are the most strongly correlated and best represent the characteristics of a song. We therefore keep these variables, modify or delete the other parameters, and calculate the Euclidean distance later.
Figure 6 Comparison of the Correlation coefficients among variables
4.3.3 Calculation of Euclidean Distance
In 4.3.2 above, we obtained a few retained parameters, to which we assign weights. The weight formula is as follows:
(4)
Here the weight of each parameter is determined from the correlation coefficient between that parameter and popularity.
After obtaining the weights, we multiply the original data by them to get new data, and calculate the Euclidean distance on this basis.
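Euclidean distance is a common definition of distance: the straight-line distance between two points in m-dimensional space. Because the similarity between songs can be regarded as a distance, we use the m-dimensional Euclidean distance, summing over all retained parameters. The formula is as follows:
$$ d(x, y) = \sqrt{\sum_{i=1}^{m}(x_i - y_i)^2} \tag{5} $$
where $x_i$ and $y_i$ are the weighted values of the $i$-th parameter for the two songs being compared.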
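The weighting and distance steps can be sketched together as follows. The weighting rule used here, absolute correlation with popularity normalised to sum to one, is an assumption made for illustration and may differ from the weight formula (4); the correlation values and song features are hypothetical.

```python
import math

def correlation_weights(correlations):
    """Assumed rule: weight = |r| / sum of |r| over the retained parameters."""
    total = sum(abs(r) for r in correlations.values())
    return {name: abs(r) / total for name, r in correlations.items()}

def weighted_distance(song_a, song_b, weights):
    """Euclidean distance between two songs after scaling each parameter by its weight."""
    return math.sqrt(sum((weights[k] * song_a[k] - weights[k] * song_b[k]) ** 2
                         for k in weights))

# Hypothetical correlations of each parameter with popularity.
correlations = {"energy": 0.5, "acousticness": -0.3, "danceability": 0.2}
weights = correlation_weights(correlations)

# Hypothetical feature values of two songs.
song_a = {"energy": 0.8, "acousticness": 0.1, "danceability": 0.6}
song_b = {"energy": 0.4, "acousticness": 0.5, "danceability": 0.6}

d = weighted_distance(song_a, song_b, weights)
print(weights)
print(d)
```

A smaller value of `d` means the two songs are more similar under the chosen weights.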
Because the data set is huge and we want a more intuitive view of the similarity within and between genres, we selected three of the more influential genres: Pop/Rock, R&B and Country. The artists chosen from them are highly influential, and their music can represent the characteristics of the corresponding genre.
The Euclidean distances between the pieces of music are shown in the following table. Entries 1 to 5 are Pop/Rock, 6 to 8 are R&B and 9 to 10 are Country; the red boxes mark the Euclidean distances between music of the same genre. The smaller the Euclidean distance, the more similar the two pieces of music.
Table 2 Euclidean Distance between Genres of Music
According to the results in Table 2 above, music of the same genre is often more similar, but there are also many pieces from different genres with high similarity. This shows that a genre often covers a wide range of musical characteristics, and that the characteristics of one genre overlap to some degree with those of other genres. The boundaries between genres are often vague, and there is strong integration among them.
4.4 Division of music genres
Figure 7 percentage accumulation line chart with data (horizontal axis: variables 1 to 14; vertical axis: -100% to 100%; legend: Avant-Garde, Blues, Children's, Classical, Comedy/Spoken, Country, Easy Listening, Electronic, Folk, International, Jazz, Latin, New Age, Pop/Rock, R&B, Reggae, Religious, Stage & Screen, Vocal)
In this section, we use the known data, take artists_id as the identifier, and group the songs by genre. After grouping, we obtain the 14 variables for each genre and average them to get the final values. We then visualize the values and calculate the Euclidean distances to observe the similarity between genres. We describe the similarity between genres through the two most similar genres, and the difference between genres through the two least similar ones. In addition, we group the songs by release time into ten-year stages to observe how the genres change over time.
4.4.1 The similarity was observed using Euclidean examples
We calculated the average of the 14 variables used to measure songs and started by visualizing the results, as Fig. 7 shows.
According to the percentage accumulation line chart with data, we can clearly see the
performance of songs of various genres in various variables and the approximation between
various genres.
Based on this, we calculate the Euclidean distance of these data and finally get the results
as shown in Table 3.
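The genre comparison amounts to averaging the feature vectors of each genre and taking pairwise Euclidean distances between the averages. The sketch below illustrates this with two hypothetical features and four made-up songs; the function names are not from the paper.

```python
import math
from itertools import combinations

def genre_means(songs):
    """Average the feature vectors of all songs in each genre."""
    sums, counts = {}, {}
    for genre, vector in songs:
        acc = sums.setdefault(genre, [0.0] * len(vector))
        for i, value in enumerate(vector):
            acc[i] += value
        counts[genre] = counts.get(genre, 0) + 1
    return {genre: [total / counts[genre] for total in acc]
            for genre, acc in sums.items()}

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical (genre, [feature_1, feature_2]) records.
songs = [
    ("Jazz", [0.2, 0.9]), ("Jazz", [0.4, 0.7]),
    ("Blues", [0.3, 0.7]),
    ("Children's", [0.9, 0.1]),
]

means = genre_means(songs)
pairs = {(a, b): euclidean(means[a], means[b])
         for a, b in combinations(sorted(means), 2)}

most_similar = min(pairs, key=pairs.get)    # smallest distance
least_similar = max(pairs, key=pairs.get)   # largest distance
print(most_similar, least_similar)
```

If the features are on very different scales, the largest-scale feature dominates the distance, so normalising the features first is worth considering.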
By analyzing Fig. 7 and Table 3, we can see that Avant-Garde and Blues have the highest similarity, and Jazz and Children's have the lowest similarity. We conclude the following:
Similar genres differ little in danceability, energy, key and speechiness, indicating that similar genres are consistent in rhythm. Their perceptual characteristics are also very similar, and the enthusiasm they convey and the clarity of their lyrics are basically the same.
Genres that differ show large differences in valence, loudness, acousticness and instrumentalness. Taking Jazz and Children's as examples, we find that Jazz has higher loudness and acousticness, while Children's is the opposite in these respects and is higher in valence, instrumentalness and danceability. This also reflects that Children's music is happy, lively and suitable for singing and dancing.
Genre                     (1)      (2)      (3)      (4)      (5)      (6)      (7)      (8)      (9)     (10)     (11)     (12)     (13)     (14)     (15)     (16)     (17)     (18)     (19)
(1) Avant-Garde             0  1742.65  82774.9    76636  7129.02  17109.3  32413.6  72315.6    16353  52807.1    90830  11400.4    80138  27521.2  34572.4  29303.9  51292.7  6904.45  22113.8
(2) Blues             1742.65        0  84517.5  74893.4  5386.38  18851.9  34156.2    70573  18095.7  51064.4  89087.4   9657.8  78395.4  25778.5  32829.8  27561.3  49550.1   8647.1  23856.4
(3) Children's        82774.9  84517.5        0   159411  89903.9  65665.6  50361.3   155091  66421.9   135582   173605  94175.3   162913   110296   117347   112079   134068  75870.5  60661.2
(4) Classical           76636  74893.4   159411        0    69507  93745.3   109050  4320.56    92989  23828.9    14194  65235.6  3502.11  49114.9  42063.6  47332.1  25343.3  83540.4  98749.8
(5) Comedy/Spoken     7129.02  5386.38  89903.9    69507        0  24238.3  39542.6  65186.6    23482  45678.1    83701  4271.49    73009  20392.2  27443.4  22174.9  44163.7  14033.4  29242.8
(6) Country           17109.3  18851.9  65665.6  93745.3  24238.3        0  15304.3  89424.9  756.367  69916.4   107939  28509.7  97247.3  44630.4  51681.7  46413.2    68402  10204.9  5004.52
(7) Easy Listening    32413.6  34156.2  50361.3   109050  39542.6  15304.3        0   104729  16060.6  85220.7   123244    43814   112552  59934.7    66986  61717.5  83706.3  25509.1  10299.8
(8) Electronic        72315.6    70573   155091  4320.56  65186.6  89424.9   104729        0  88668.6  19508.6  18514.4  60915.2  7822.41  44794.5  37743.2  43011.7  21022.9  79220.1  94429.4
(9) Folk                16353  18095.7  66421.9    92989    23482  756.367  16060.6  88668.6        0  69160.1   107183  27753.4    96491  43874.2  50925.4  45656.9  67645.7  9448.59  5760.74
(10) International    52807.1  51064.4   135582  23828.9  45678.1  69916.4  85220.7  19508.6  69160.1        0  38022.9  41406.7  27330.9  25285.9  18234.7  23503.2  1514.55  59711.5  74920.8
(11) Jazz               90830  89087.4   173605    14194    83701   107939   123244  18514.4   107183  38022.9        0  79429.6    10692  63308.9  56257.6  61526.1  39537.3  97734.5   112944
(12) Latin            11400.4   9657.8  94175.3  65235.6  4271.49  28509.7    43814  60915.2  27753.4  41406.7  79429.6        0  68737.6  16120.7    23172  17903.5  39892.3  18304.9  33514.2
(13) New Age            80138  78395.4   162913  3502.11    73009  97247.3   112552  7822.41    96491  27330.9    10692  68737.6        0  52616.9  45565.6  50834.1  28845.3  87042.5   102252
(14) Pop/Rock         27521.2  25778.5   110296  49114.9  20392.2  44630.4  59934.7  44794.5  43874.2  25285.9  63308.9  16120.7  52616.9        0  7051.28  1782.77  23771.5  34425.6  49634.9
(15) R&B              34572.4  32829.8   117347  42063.6  27443.4  51681.7    66986  37743.2  50925.4  18234.7  56257.6    23172  45565.6  7051.28        0  5268.53  16720.3  41476.9  56686.2
(16) Reggae           29303.9  27561.3   112079  47332.1  22174.9  46413.2  61717.5  43011.7  45656.9  23503.2  61526.1  17903.5  50834.1  1782.77  5268.53        0  21988.8  36208.3  51417.7
(17) Religious        51292.7  49550.1   134068  25343.3  44163.7    68402  83706.3  21022.9  67645.7  1514.55  39537.3  39892.3  28845.3  23771.5  16720.3  21988.8        0  58197.1  73406.5
(18) Stage & Screen   6904.45   8647.1  75870.5  83540.4  14033.4  10204.9  25509.1  79220.1  9448.59  59711.5  97734.5  18304.9  87042.5  34425.6  41476.9  36208.3  58197.1        0  15209.3
(19) Vocal            22113.8  23856.4  60661.2  98749.8  29242.8  5004.52  10299.8  94429.4  5760.74  74920.8   112944  33514.2   102252  49634.9  56686.2  51417.7  73406.5  15209.3        0
Columns (1) to (19) follow the same genre order as the rows.
This is a dissimilarity matrix
Table 3 approximation between various genres
4.4.2 The variation of genres over time
By calculating the average value of each parameter over all music in each genre each year, we obtain the annual changes of each parameter of each genre. We take the parameters of the Pop/Rock genre that lie in the range [0,1] as an example; their changes over time are shown in Figure 8.
Figure 8 the variation diagram of each parameter of Pop/Rock with time
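The yearly averages behind such a diagram can be obtained by grouping the songs of one genre by release year. A minimal sketch with hypothetical records and field names:

```python
from collections import defaultdict
from statistics import mean

def yearly_means(songs, parameters):
    """Average each parameter over the songs released in the same year."""
    by_year = defaultdict(list)
    for song in songs:
        by_year[song["year"]].append(song)
    return {year: {p: mean(song[p] for song in group) for p in parameters}
            for year, group in sorted(by_year.items())}

# Hypothetical songs of a single genre.
songs = [
    {"year": 1960, "acousticness": 0.8, "energy": 0.4},
    {"year": 1960, "acousticness": 0.6, "energy": 0.6},
    {"year": 1961, "acousticness": 0.5, "energy": 0.7},
]

trend = yearly_means(songs, ["acousticness", "energy"])
for year, values in trend.items():
    print(year, values)
```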
From the variation diagram of each parameter of Pop/Rock over time, we find that the mode, valence and acousticness of Pop/Rock gradually decreased, with acousticness changing most drastically. After large fluctuations in the early stage, energy gradually stabilized in the later stage. Danceability and explicit increased slightly, and the other parameters remained basically stable. On the whole, the parameters of a genre change over time. These changes bring the music of the genre more in line with the trends of the times, give the genre more vitality, and lay the foundation for its further development.
5 Music Artist and Genre Revolution
In this section, we will explore the influence of musical artists on each other and the evolution of genres.
5.1 Cluster model
According to the statistics, The Beatles have 615 followers, the largest number among all influencers, so they best reflect the relationship between influencers and followers. We therefore select The Beatles as the example for analysis. We extracted all the followers of The Beatles and their corresponding variables in the data_by_artists table to obtain each follower's musical characteristics. The followers are then clustered, the data are normalized, and the Euclidean distance is calculated to obtain the similarity between the clustered followers and the music of The Beatles. We conclude that 'influencers' really do affect the music created by their followers. Finally, by calculating the difference between the clustered data and the musical feature parameters of The Beatles, we determine which musical features carry the influence on the followers.
5.1.1 Hierarchical clustering method
To facilitate subsequent data processing, we use hierarchical clustering to condense the data of the 615 followers and extract their main features.
Clustering is a data processing method that divides a collection of data into several classes of similar objects. A clustering model can simplify large data sets and reflect their characteristics.
Here we use the hierarchical clustering method to condense the data and extract its features for processing and analysis.
5.1.2 The principle of hierarchical clustering method
The hierarchical clustering method repeatedly merges the two closest classes, based on the distance between classes of data points. Through repeated iterations, the data are clustered layer by layer until all data points are merged into one class, which yields a multi-level clustering result. A suitable clustering scheme can then be chosen from the desired number of clusters together with the dendrogram.
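The merging principle can be illustrated with a small agglomerative routine. This is a sketch, not the procedure used in the paper: it measures the distance between two classes by the distance between their centres of gravity (centroid linkage), which is an assumption, and the sample points are hypothetical.

```python
import math

def centroid(cluster):
    """Centre of gravity of a list of points."""
    return [sum(column) / len(cluster) for column in zip(*cluster)]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerate(points, target):
    """Merge the two closest clusters repeatedly until `target` clusters remain."""
    clusters = [[p] for p in points]  # start with every point in its own class
    while len(clusters) > target:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = euclidean(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters

# Hypothetical two-dimensional samples.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (5.0, 5.0)]
clusters = agglomerate(points, target=3)
print(clusters)
```

Running the loop down to a single cluster and recording each merge would give the information needed to draw a dendrogram.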
5.1.3 clustering process
The hierarchical clustering process is illustrated below:
Figure 9 The process of this hierarchical clustering
The elbow method gives a rough graphical estimate of the optimal number of clusters.
Assume that n samples are divided into K classes (K ≤ n-1, that is, at least one class contains two or more elements). The distortion degree of a class equals the sum of the squared distances between the center of gravity of the class and each of its members. The distortion degree of the k-th class is:
$J_k = \sum_{x_i \in C_k} \lVert x_i - u_k \rVert^2$ (6)
Therefore, we define the total distortion degree of all classes as:
$J = \sum_{k=1}^{K} J_k = \sum_{k=1}^{K} \sum_{x_i \in C_k} \lVert x_i - u_k \rVert^2$ (7)
Among them:
$C_k$ represents the k-th class (k = 1, 2, ..., K)
$x_i$ represents a sample belonging to class $C_k$
$J$ is the aggregation coefficient
$u_k$ represents the center of gravity of the k-th class
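As a minimal sketch of formulas (6) and (7), the following Python computes the distortion of each class and the aggregation coefficient J for a given partition. The function names and the one-dimensional toy classes are hypothetical and serve only to show the arithmetic.

```python
def class_distortion(members):
    """Formula (6): sum of squared distances from members to the class center."""
    n = len(members)
    center = [sum(col) / n for col in zip(*members)]
    return sum(
        sum((x - c) ** 2 for x, c in zip(point, center)) for point in members
    )


def total_distortion(classes):
    """Formula (7): aggregation coefficient J, summed over all classes."""
    return sum(class_distortion(members) for members in classes)


# Hypothetical one-dimensional example with two classes.
classes = [[[1.0], [3.0]], [[10.0], [12.0], [14.0]]]
J = total_distortion(classes)
```

In this toy case the first class (center 2) contributes 1 + 1 = 2 and the second (center 12) contributes 4 + 0 + 4 = 8. Evaluating J for each candidate K gives the curve to which the elbow rule is applied.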
We take the number of clusters K as the abscissa and the aggregation coefficient J as the ordinate, and draw the aggregation coefficient line chart shown in Figures 10 and 11. Analyzing this line chart with the elbow rule gives the optimal number of clusters.
Figure 10 Aggregation coefficient plot
Figure 11 Local enlargement of the aggregation coefficient line chart
(Data label marked in the plots: K = 6, J = 55.905.)
According to the aggregation coefficient plot, when the number of categories is 6 the downward trend of the line chart slows, so the number of categories K can be set to 6.
With K = 6, we calculate the average value of each parameter within each of the six categories and obtain six groups of representative parameter data. Combined with the parameter data of The Beatles, seven groups of data are finally obtained, as shown in Table 4.
Table 4 representative parameter data
We test the means of the six categories on each parameter, as shown in Table 5, and find that the indicators basically reach a significant level. This indicates that the classification results are effective.
Table 5 Clustering results test
5.1.4 Euclidean distance
With SPSS, we calculate the Euclidean distance between each of the six clustered categories and The Beatles, as shown in Table 6.
Table 6 Proximity Distance
We can see that the distance between each category and The Beatles is small, indicating that each category is similar to The Beatles. Since these six categories summarize the data characteristics of the 615 followers, we conclude that influencers do affect the music created by their followers.
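The distance computation itself is simple. The paper carried it out in SPSS; the Python below is only an illustrative sketch, and the feature vectors are hypothetical placeholders rather than the values of Table 4.

```python
from math import dist  # Euclidean distance (Python 3.8+)

# Hypothetical normalized feature vectors (danceability, energy, valence);
# these numbers are placeholders, not the values from Table 4.
reference = [0.50, 0.60, 0.70]  # stands in for The Beatles
category_means = {
    "class 1": [0.52, 0.58, 0.69],
    "class 2": [0.40, 0.75, 0.60],
}

# Distance from each category mean to the reference vector.
distances = {name: dist(vec, reference) for name, vec in category_means.items()}
closest = min(distances, key=distances.get)
```

A smaller distance means the category mean lies closer to the influencer in feature space, which is how Table 6 is read.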
5.1.5 Similarity of different parameters between influencer and follower
To better characterize the similarity between influencers and followers on individual parameters, we define a degree of difference between followers and the influencer for each parameter, and compare each musical feature of the six classes of follower data with the corresponding feature of The Beatles. This shows whether the influence of the influencer on followers is comprehensive or one-sided. The formula is as follows:
(8)
The quantities in formula (8) are:
the degree of difference between the six classes of follower data and the influencer on a given parameter; the smaller the value, the greater the similarity between the follower data and the influencer on this parameter;
the value of that parameter for the class i data;
the value of that parameter for the influencer;
the number of categories.
The results of the calculation are visualized in Figure 12.
The red line in Figure 12 is the reference line. When the difference value between a parameter and that of the influencer is less than or equal to 0.05, we call the parameter greatly affected. We observe that loudness, tempo, valence, energy and danceability are the music characteristics most affected by the influencer. This suggests that some features are more contagious in the process of influence and are more readily adopted by followers in their own works. It also supports the conclusion that the influencer affects the music characteristics of the followers.
By matching the data in data_by_artist with each artist's genre and the year the artist's music career began, we calculate the popularity of each genre and visualize it. On this basis, we use the rise of a genre to define the occurrence of a revolutionary change, and we use influence to identify the agents of change among the artists active in the initial years of each new genre.
Figure 12 Analysis of Effect (degree of difference for each music feature: duration_ms, speechiness, liveness, instrumentalness, acousticness, key, mode, loudness, tempo, valence, energy, danceability; value axis from 0 to 0.045)
We use the year in which a singer began his or her music career as the independent variable. The popularity of a genre at a given time is represented by the average popularity of all singers of that genre who began their careers at that time, and this value is taken as the dependent variable. We observe how the various genres develop over time and find significant changes in music development.
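The averaging step can be sketched as a simple group-by. The rows below are hypothetical; in the paper they would come from matching data_by_artist with each artist's genre and career start year.

```python
from collections import defaultdict

# Hypothetical rows: (artist, genre, career start year, popularity).
rows = [
    ("A", "R&B", 1950, 30.0),
    ("B", "R&B", 1950, 40.0),
    ("C", "R&B", 1960, 50.0),
    ("D", "Electronic", 1970, 20.0),
]

# Collect popularity values per (genre, start year) pair.
groups = defaultdict(list)
for _artist, genre, start, popularity in rows:
    groups[(genre, start)].append(popularity)

# Mean popularity of all artists of a genre who started in the same period.
mean_popularity = {key: sum(v) / len(v) for key, v in groups.items()}
```

Plotting these means per genre against the start year gives the kind of series from which the rise of a genre can be read.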
5.2 The analysis of music revolution
We visualize the obtained data as a percentage accumulation histogram, shown in Figure 13.
By observing where the color of each genre first appears at the 0% position in Figure 13, we can see intuitively when each genre began to rise. The rise of a genre represents the beginning of a revolution in music development.
Figure 13 Percentage accumulation histogram of music development (horizontal axis: 1930 to 2010 in ten-year steps; vertical axis: 0% to 100%)
5.3 The analysis of music changers
Based on the number of artists in each new genre, we select the new genres R&B and Electronic and look for the changers in these two genres.
A network diagram is used to represent the influence of the various artists and to find the revolutionaries of these two new genres.
Using the network diagram, the career start times of singers in the influence_data table, and the influencer_active_start data (with time corrections), we identify the singers with great influence in the ten years after each new genre was founded as the revolutionaries of the music revolution.
We choose Ray Charles as the revolutionary of R&B and Giorgio Moroder as the revolutionary of Electronic. They are also revolutionaries in the development of music as a whole.
Figure 14 Network Diagram of Internal Influencers of Electronic Genres
Figure 15 Network Diagram of Internal Influencers of R&B Genres
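The selection of a leading influencer within the founding decade of a genre can be sketched as follows. This is a simplification under stated assumptions: the edge list is hypothetical, and the raw follower count is used as a stand-in for the comprehensive influence score defined earlier in the paper.

```python
from collections import Counter

# Hypothetical edges:
# (influencer, influencer genre, influencer_active_start, follower).
edges = [
    ("X", "R&B", 1950, "f1"),
    ("X", "R&B", 1950, "f2"),
    ("Y", "R&B", 1950, "f3"),
    ("Z", "R&B", 1980, "f4"),
    ("Z", "R&B", 1980, "f5"),
    ("Z", "R&B", 1980, "f6"),
]


def top_influencer(edges, genre, founding_year, window=10):
    """Most-followed influencer of a genre who started within the window."""
    counts = Counter(
        influencer
        for influencer, g, start, _follower in edges
        if g == genre and founding_year <= start < founding_year + window
    )
    return counts.most_common(1)[0] if counts else None


leader = top_influencer(edges, "R&B", 1950)
```

Here "Z" has the most followers overall but started outside the ten-year window, so "X" is returned; this is the effect of restricting the search to the founding years of the genre.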
5.4 R&B genre changes over time
In this section, time series are used to visualize the features of R&B songs, and grey relational analysis is used to calculate the strength of the relation between each music feature and music popularity, so as to obtain indicators of dynamic influence. Combining these indicators with the time series visualization, we show how the R&B genre changes over time.
5.4.1 Time series model
Time series analysis orders data by time, extracts information from it, analyzes how the characteristics of the data change with time, and explores the pattern of those changes. With time series analysis, we can analyze how a type of music evolves and obtain the characteristics of the genre over time.
According to the genres of artists and our assumptions, we classify all music performed by artists belonging to the R&B genre as R&B music. We choose the period 1950-2020 and compute, for each time period, the average of each parameter over all R&B songs in that period. Finally, all data are normalized to facilitate subsequent analysis.
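The normalization method is not specified in the text. As one common choice, the sketch below assumes min-max scaling to [0, 1]; the yearly averages are hypothetical.

```python
def min_max(values):
    """Rescale a sequence linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant series: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical averages of one feature for R&B songs in three periods.
yearly_energy = {1950: 0.30, 1960: 0.45, 1970: 0.60}
normalized = dict(zip(yearly_energy, min_max(list(yearly_energy.values()))))
```

Scaling every feature to the same range lets series with very different units, such as loudness and tempo, be drawn in one sequence diagram.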
We perform time series analysis on all normalized parameter data of R&B; the sequence diagram is shown in Figure 16.
From the sequence diagram, we find that speechiness, popularity and energy gradually increase over time, while acousticness, mode and loudness gradually decrease. Instrumentalness fluctuated severely in the early stage but eventually returned to around its initial value. The other parameters remain basically stable.
The sequence diagram thus shows that some parameters of R&B change with the times, which has allowed R&B to develop continuously and become an important genre in music.
Figure 16 Time series analysis of all normalized parameter data of R&B
5.4.2 Grey Relation Analysis
Grey Relation Analysis (GRA) is a multi-factor statistical analysis method. The basic idea is to judge how closely sequences are related according to the similarity of the geometric shapes of their curves. The closer the curves are, the greater the correlation between the corresponding sequences, and vice versa.[2]
Step1: Determination of reference sequence and comparison sequence
Considering the dynamic impact of indicators, we choose popularity as the reference sequence. Popularity not only changes over time; as a measure of visibility, it also reflects the size of an influence. The other music features are selected as the comparison sequences. The reference sequence is
$X_0 = (x_0(1), x_0(2), \ldots, x_0(n))$ (9)
The comparison sequences are denoted as:
$X_i = (x_i(1), x_i(2), \ldots, x_i(n)), \quad i = 1, 2, \ldots, m$ (10)
where n is the number of time periods and m is the number of music features used for comparison.
Step2: Standardized processing of data
Because the original data have different dimensions (units and scales), they are difficult to compare directly. It is therefore necessary to make the data dimensionless with the mean value method, namely dividing each value of a sequence by the mean of that sequence. The new data sequence obtained is the standardized sequence; the formula is as follows:
$x_i'(k) = \dfrac{x_i(k)}{\frac{1}{n} \sum_{j=1}^{n} x_i(j)}$ (11)
where k corresponds to a time period, i corresponds to a music feature, and n is the number of time periods.
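Step3: Calculate the correlation coefficient
The calculation formula of the correlation coefficient is as follows:
$\xi_i(k) = \dfrac{\min_i \min_k \Delta_i(k) + \rho \max_i \max_k \Delta_i(k)}{\Delta_i(k) + \rho \max_i \max_k \Delta_i(k)}$ (12)
where $\Delta_i(k) = |x_0'(k) - x_i'(k)|$ is the absolute difference between the standardized reference sequence and the i-th standardized comparison sequence at time period k.
$\rho$ is called the resolution coefficient. The smaller $\rho$ is, the greater the resolution. The general value range of $\rho$ is (0, 1); specific values are subject to circumstances. When $\rho \le 0.5463$, the resolution is the best; usually $\rho = 0.5$.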
Step4: Calculation of correlation
The correlation coefficient gives the degree of correlation between a comparison sequence and the reference sequence at each moment, that is, at each point of the curve, so there is more than one value per sequence and the information is too scattered for an overall comparison. It is therefore necessary to condense the correlation coefficients of all moments into a single average value, which serves as the quantitative expression of the degree of correlation between the comparison sequence and the reference sequence. The correlation degree formula is as follows:
$r_i = \dfrac{1}{n} \sum_{k=1}^{n} \xi_i(k)$ (13)
where $\xi_i(k)$ is the correlation coefficient of music feature i at time period k and n is the number of time periods.
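The four steps above can be collected into one short routine. The paper's own computation was done in Matlab; the Python below is only a sketch of the standard grey relational procedure (mean-value standardization, two-level minimum and maximum, averaging over time). The function name, the default rho = 0.5 and the toy data are assumptions.

```python
def grey_relational_degrees(reference, comparisons, rho=0.5):
    """Grey relational degree of each comparison sequence to the reference.

    reference   -- values of the reference sequence (e.g. popularity)
    comparisons -- dict mapping a feature name to its list of values
    rho         -- resolution coefficient (0.5 is the customary default)
    Assumes every sequence has a non-zero mean.
    """
    def mean_normalize(seq):
        m = sum(seq) / len(seq)
        return [v / m for v in seq]

    ref = mean_normalize(reference)
    # Absolute differences between each standardized comparison and the reference.
    deltas = {
        name: [abs(r - c) for r, c in zip(ref, mean_normalize(seq))]
        for name, seq in comparisons.items()
    }
    d_min = min(min(d) for d in deltas.values())  # two-level minimum
    d_max = max(max(d) for d in deltas.values())  # two-level maximum
    if d_max == 0:  # every sequence matches the reference exactly
        return {name: 1.0 for name in deltas}

    degrees = {}
    for name, d in deltas.items():
        coeffs = [(d_min + rho * d_max) / (x + rho * d_max) for x in d]
        degrees[name] = sum(coeffs) / len(coeffs)  # average over time
    return degrees


# Hypothetical data: "a" is proportional to the reference, "b" moves against it.
popularity = [10.0, 20.0, 30.0]
features = {"a": [1.0, 2.0, 3.0], "b": [3.0, 2.0, 1.0]}
degrees = grey_relational_degrees(popularity, features)
```

Because of the mean-value standardization, a sequence that is proportional to the reference obtains the maximum degree of 1, while a sequence with the opposite trend scores lower.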
Through Matlab programming, we can get the correlation between each music feature and
popularity.
Table 5 Correlation between each music feature and popularity
Music feature       Correlation
danceability        0.8976
energy              0.9104
instrumentalness    0.7429
valence             0.8559
tempo               0.8745
loudness            0.821
liveness            0.8475
speechiness         0.908
key                 0.8818
acousticness        0.7631
explicit            0.7134
According to the table, the correlation of energy and speechiness with popularity is relatively strong, while the correlation of instrumentalness, explicit and acousticness with popularity is relatively weak. The other music features are at a medium level. According to hypothesis 2, these values give the size of the dynamic influence indicators.
Combining the table with Figure 17, we obtain the following results.
In the period 1950-2020, the trends of energy and speechiness, which are strongly correlated with popularity, are basically the same as that of popularity, while the trends of instrumentalness, explicit and acousticness, which are weakly correlated, differ from that of popularity. The correlation and co-movement between indicators therefore reflect how the music characteristics of the R&B genre change over the years.
Figure 17 Time series analysis for data_by_year
6 The External Factors Affecting Music
In this part, we look for the external factors affecting music by visualizing the total number of songs, the number of songs in each genre, and the characteristics of music over time.
6.1 Visualization of music features
To make the chart clearer and more intuitive, we normalize the data in data_by_year and plot it as a time series, as shown in Figure 18.
Figure 18 Changes in the number of total songs
According to Figure 18, popularity began to rise around 1953. World War II had ended in 1945, and by the 1950s the social environment was relatively stable and prosperous; the music of this period was greatly changed by that social environment, and the popularity of music rose markedly.
6.2 Visualizing the number of songs
In this section, we count the total number of songs and the number of songs in each genre from 1920 to 2020. As shown in Figure 19, Pop/Rock is plotted against the right-hand axis of the coordinate system because its values are much larger than those of the other genres.
Figure 19 Changes in the number of songs in various genres
6.3 Analysis
Combining the above three graphs, we draw the following conclusions:
➢ Around 1950, combining the images of 2.1 and 2.2, we find that shortly after the end of the Second World War people had just lived through the pain and anxiety of war. With the war over, people had the energy and time to appreciate music that was relaxing, joyful or resonant with their own experience, which made music more popular overall. Rhythm and blues, jazz and rock, with their strong rhythms and unconstrained forms, suited the restless and energetic temperament of young people, and a great deal of music in these genres was produced in this period.
➢ In the 1950s and 1960s, countries around the world were largely recovering from World War II. During this period the economy gradually developed, the structure of the music market changed, and two notable phenomena emerged in the record market: 'market crossover' and 'cover versions'. These broke down the barriers between the various music markets and led to the gradual formation of pop/rock music. The popularity of this genre made the total number of songs increase year by year from around 1953. In addition, with the end of the Second World War, each genre developed to varying degrees, which is why the lines of every genre in Figure 19 change during this period.
➢ In the 1960s, the folk revival movement began, so the folk genre changed greatly in this period. Traditional folk songs have three elements: oral transmission, a lower-class audience, and unknown or unverifiable authorship. Later, with the development of the Internet, folk music often violated these three elements, and its development gradually declined. New things thus often have a great impact on old things.
➢ In the twenty-first century, although the Internet has developed enormously and information spreads more easily, the figure shows that song creation in nearly every genre began to decline. A possible reason is that many creators have started to focus on re-creating existing music, and even some radical creators have become conservative.
In general, the development of music is often affected by social change: a war, a technological revolution, a movement, or the emergence of an artist can each affect a genre, or even music as a whole, to different degrees. Changes in artists' mentality also affect the development of music. The influence of society, environment, politics and creators on music is sometimes concrete and sometimes abstract. Identifying the specific reasons for changes in music often requires more data and more detailed research.
7 Sensitivity Analysis
In the model for the first question, we define the artist's comprehensive influence score. The genre influence coefficient is set to 1 when the influencer and the follower belong to the same genre and to 0.5 when they belong to different genres. In reality, when the genres of the influencer and the follower differ, the strength of the influence also differs from one pair of genres to another.
Evaluating the degree of influence between specific genres would require a lot of data, so we put forward hypothesis 4. Here we test the genre influence coefficient used in the model for the first question.
We take The Beatles as an example. Keeping all other parameters unchanged, we change the influence coefficient used when influencers and followers are in different genres by 5% and draw a line chart for sensitivity analysis, as shown in Figure 20.
Figure 20 Sensitivity Analysis (horizontal axis: Change Rate of Genre Influence Coefficient, from -0.5 to 0.5; vertical axis: Comprehensive Influence Score)
Data labels (change rate, score): (-0.5, 68.052), (-0.4, 68.094), (-0.3, 68.135), (-0.2, 68.176), (-0.1, 68.218), (0, 68.259), (0.1, 68.300), (0.2, 68.342), (0.3, 68.383), (0.4, 68.424), (0.5, 68.466)
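The sensitivity analysis shows that the comprehensive influence score responds to the genre influence coefficient, but only slightly: as the change rate goes from -0.5 to 0.5, the score for The Beatles rises from 68.052 to 68.466, a change of about 0.6% relative to the baseline of 68.259. A change in the genre influence coefficient therefore does not change the comprehensive influence score to a large extent.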
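The size of this effect can be checked directly from the data labels of Figure 20. The short Python sketch below only re-expresses those published points; the variable names are ours.

```python
# (change rate of the genre influence coefficient, comprehensive influence
# score), read from the data labels of Figure 20.
points = [
    (-0.5, 68.052), (-0.4, 68.094), (-0.3, 68.135), (-0.2, 68.176),
    (-0.1, 68.218), (0.0, 68.259), (0.1, 68.300), (0.2, 68.342),
    (0.3, 68.383), (0.4, 68.424), (0.5, 68.466),
]

baseline = dict(points)[0.0]
scores = [score for _rate, score in points]
spread = max(scores) - min(scores)   # total movement of the score
relative_spread = spread / baseline  # as a fraction of the baseline score
# Average change in score per unit change rate (end-to-end slope).
slope = (points[-1][1] - points[0][1]) / (points[-1][0] - points[0][0])
```

The score moves by 0.414 in total, well under 1% of the baseline, across the whole tested range of change rates.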
In the actual evaluation of the comprehensive influence of an artist, we need to put forward a more rigorous scoring formula for the influence of an artist based on the relationship
between the influencers and followers.
8 Model Evaluation and Further Discussion
8.1 Strengths
➢ Multidimensional analysis: We visualize our data and model results vividly and in diverse ways, and combine them with analysis to obtain more objective and realistic conclusions.
➢ Appropriate choice of method: The methods we choose let us select the values required for similarity measurement.
➢ We combine the characteristics of the data with the actual situation, and put forward simple, intuitive models and algorithms.
➢ We make full use of the relationships between the data tables, integrating the data of an object across the four tables in many ways, and comprehensively explore the detailed information contained in the data.
8.2 Weaknesses
➢ The model design involves a certain degree of arbitrariness.
➢ Due to the limits of our professional knowledge, we have an insufficient understanding of music-related influencing factors and do not fully take into account the essence of music reflected by the various music parameters. The views obtained in the analysis are therefore subjective.
9 Document for ICM
Dear ICM Association:
Thank you for your trust in our group. As your association requested, we are pleased to explain our understanding to you. The detailed content of the report follows.
The value of understanding music influence through networks is mainly reflected in three aspects. First, a network clearly expresses the logical relationships between musicians and thus helps us better understand the relationships between genres and artists. Second, through calculation on the data, a network intuitively shows the key nodes and helps us find influential artists. Finally, a network also helps us understand the development of genres at the macro level.
If we can obtain richer data, on the one hand we can establish a scoring system for influence between genres, explore the degree of influence between different genres, and better evaluate the influence of artists. On the other hand, we can use an LDA model to analyze song titles and the words that frequently appear in lyrics, and then analyze the musical ideas expressed by a singer or even an era. With more abundant data, we can propose more targeted solutions for different data.
Music is a special language used all over the world. It has stronger appeal than spoken language and contains many thoughts and feelings that are difficult to express in words. Through its powerful expressive ability, music reflects a wide range of information about a historical period. From different perspectives, we can understand how different people, in different periods, states and frames of mind, perceived things. Besides obtaining information from music, we can also actively communicate with people through music, explore people's psychology through the appeal of music, and enrich social science knowledge.
The content of music often depends on the social environment of its time. A technological innovation or a change of social ideology will be reflected in music to varying degrees.
Creating music requires not only a perception of the environment but also a certain knowledge of music. The changes of notes and music parameters in music can often provide information for research in many social science and natural science disciplines, such as mathematics and physics. In turn, the development of social science and natural science helps us better study, create and develop music. Music and science are complementary. We firmly believe that our understanding and solutions can bring inspiration to your work on music development.
Best regards,
Team #2105383
References
[1] Mauch Matthias, MacCallum Robert M., Levy Mark, Leroi Armand M. The evolution of popular music: USA 1960–2010 [J]. R. Soc. Open Sci., 2015, 2: 150081.
[2] Jiang Shiquan. Gray correlation decision model based on general gray number and its application research [D]. Nanjing University of Aeronautics, 2018.
[3] Liu Sifeng, Xie Naming. Grey system theory and its application. 4th edition [M]. Science Press, 2008.