Problem Chosen D
2021 MCM/ICM Summary Sheet
Team Control Number 2105383

Analyze the Characteristics and Development of Music Based on Network

The inheritance and development of music reflect human collective experience. To better understand the impact of music, we use models to quantify it. This paper starts by establishing a music influence model to understand how music evolves as society changes.

Problem 1 is to establish a model of music influence. We propose a genre influence coefficient and draw a weighted directed network graph to analyze a subset of music influence. Pop/Rock and The Beatles are calculated to be the most influential genre and artist.

Problem 2 is to measure musical similarity. We first use the Pearson correlation coefficient and visualization to compare the weights of the variables. Since music similarity can be regarded as the distance between data points, we use Euclidean distance to calculate it. We find that different kinds of music blend strongly with one another.

Problem 3 is to find the differences between music genres. We use the formula from Problem 2 to calculate the similarity between genres and measure the changes in all parameters to explore how genres develop. Avant-Garde and Blues are found to have the highest similarity.

Problem 4 is to determine whether influencers affect their followers and which specific characteristics are affected. We cluster the followers, normalize the data, and calculate the Euclidean distance to obtain the similarity, which shows that influencers do affect their followers. Finally, by calculating the degree of difference between the clustered data and the musical characteristic parameters of The Beatles, we identify the influential characteristics and artists.

Problem 5 is to explore the characteristics and leaders of musical change.
This paper calculates popularity data by matching artists' genres with the start of their music careers, and identifies Ray Charles and other revolutionaries.

Problem 6 is to analyze the development and transformation of a specific music genre. We use time series and grey correlation analysis to measure the dynamic influence index of genre development, and obtain the dynamic influence index of R&B.

Problem 7 is to judge the influence of the external environment on music. We visualize the total number of songs, the number of songs in each genre, and the musical characteristics over time, and analyze the visualizations together to find the external factors affecting music. The development of music is often influenced by social change.

The advantage of this paper lies in its vivid and diverse visualization of data and model results, combined with analysis to draw more objective and realistic conclusions.

Keywords: Music Influence Model; Euclidean distance; Clustering model; Grey correlation analysis

Team # 2105383 Page 2 of 25

Contents
1 Introduction
  1.1 Problem Background
  1.2 Problem Restatement
  1.3 Our Work
2 Assumptions and Justifications
3 Notations
4 Music influence
  4.1 Data preprocessing
  4.2 Directed weighted network of music influence
    4.2.1 Influence coefficient of the genre
    4.2.2 Conclusion
    4.2.3 Analysis
  4.3 Music similarity model
    4.3.1 Pearson Correlation Coefficients
    4.3.2 Visualization
    4.3.3 Calculation of Euclidean Distance
  4.4 Division of music genres
    4.4.1 The similarity was observed using Euclidean examples
    4.4.2 The variation of genres over time
5 Music Artist and Genre Revolution
  5.1 Cluster model
    5.1.1 Hierarchical clustering method
    5.1.2 The principle of hierarchical clustering method
    5.1.3 Clustering process
    5.1.4 Euclidean distance
    5.1.5 Similarity of different parameters between influencer and follower
  5.2 The analysis of music revolution
  5.3 The analysis of music changers
  5.4 R&B genre changes over time
    5.4.1 Time series model
    5.4.2 Grey Relation Analysis
6 The External Factors Affecting Music
  6.1 Visualization of music features
  6.2 Visualizing the number of songs
  6.3 Analysis
7 Sensitivity Analysis
8 Model Evaluation and Further Discussion
  8.1 Strengths
  8.2 Weaknesses
9 Document for ICM
References

1 Introduction
1.1 Problem Background
The inheritance and development of music is based on the collective experience of mankind.
At the same time, because musicians differ in innate ingenuity, personal experience, technological development and historical background, even the new music created by one musician is never completely the same.[1] Over time, music genres sometimes undergo revolutionary changes that give rise to new genres. Collaboration among musical artists, inheritance, and internal social factors are often important causes of revolutionary change. It is therefore very important to establish a model that describes the influence of music. To further understand this influence, we use models to quantify it.

1.2 Problem Restatement
Considering the complexity of music artists and genres and the data given in the problem, we address the following questions:
➢ Create a directed network to describe musical influence among musicians and develop parameters to capture that influence. Explore and describe a subset of musical influence and its influence measures.
➢ Develop a model of music similarity and compare the similarity among artists within the same genre and across different genres.
➢ Musical genres interact with one another. Describe and compare these genres. Explore the characteristics and evolution of a genre as well as the connections among multiple genres.
➢ Determine whether the similarity data in the influence_data data set show that the identified influencers actually influence their followers. Do all musical characteristics play the same role in this influence?
➢ Explore the characteristics and representatives of major changes in the field of music.
➢ Explore and explain the evolution of a musical genre and the indicators that reveal its dynamic influencers.
➢ Describe the influence of different environments and cultures on music, and the impact of political, social and technological changes on music within the network.
1.3 Our Work
The structure of this paper is shown in Figure 1.
Figure 1 Structure of this Paper

2 Assumptions and Justifications
➢ We assume that all influencers have the same influence on the same follower, regardless of the genres of the influencers and the follower.
In reality, it is difficult to describe influence with specific values. To better quantify the influence model, this paper assumes that all influencers of a follower have the same influence.
➢ We assume that all songs sung by an artist are of the same genre as the artist.
In the real world, one artist may sing songs of various genres, but the artist's own genre still dominates. This paper simplifies the model by ignoring cases where an artist's songs differ from the artist's genre.
➢ We assume that the data given in the problem is representative and gives an overview of music development.
This ensures that the model obtains reliable results from reliable data.
➢ We assume that when the influencer and the follower belong to different genres, the influence relationship is the same regardless of which genres are involved.
If different pairs of genres had different degrees of influence, we would need a large amount of data to evaluate each pair. We propose this assumption to keep the model simple to establish and solve.

3 Notations
The key mathematical notations used in this paper are listed in Figure 2.
Symbol        Description
$q$           A comprehensive influence score for an artist
$n$           The total number of people affected by the artist
$Count_i$     The total number of people affected by the artist's $i$-th follower
$\alpha_i$    The influence coefficient of the genre of the artist's $i$-th follower
$\beta$       Influencer's genre
$\beta_0$     Follower's genre
$r$           Pearson correlation coefficient
$d$           Euclidean distance
$J$           Aggregation coefficient
$k$           Class number
$\Delta$      The difference between the data
$\rho$        Resolution coefficient
Figure 2 Notations used in this paper

4 Music influence
In this section, we explore the music influence of different music artists, their music similarity and the formation of genres.

4.1 Data preprocessing
Before data analysis, we must ensure that the data is usable. Serious problems in the given data, such as missing or redundant values, would make the answers inaccurate. To improve the usability of the data, we preprocessed it through data classification, data cleaning, information filtering, and the establishment of new attributes and data metrics.
Step 1: In data cleaning, we used a Python program to search for missing and redundant values, and found none in the four data tables.
Step 2: When browsing the text data, we found some garbled characters. Following the information given in the problem, we re-decoded the text as UTF-8, which resolved the garbled characters.

4.2 Directed weighted network of music influence
An artist is often influenced by multiple other artists. As a baseline, we assume that each influencer has the same influence on the same follower, without considering the genres of the follower and the influencers. To account for the effect of the influencer's and follower's genres on influence, we then introduce the influence coefficient of the genre and give the comprehensive influence score. We set the above two scores and the final comprehensive score as parameters.
We construct a directed weighted network graph to analyze and measure a subset of musical influence.

4.2.1 Influence coefficient of the genre
For $i \in [1, n]$, define each artist's overall impact score as follows:

$$q = \sum_{i=1}^{n} \frac{1}{Count_i}\,\alpha_i \tag{1}$$

$$\alpha_i = \begin{cases} 1, & \beta = \beta_0 \\ 0.5, & \beta \neq \beta_0 \end{cases} \tag{2}$$

$q$: A comprehensive influence score for an artist
$n$: The total number of people affected by the artist
$Count_i$: The total number of people affected by the artist's $i$-th follower
$\alpha_i$: The influence coefficient of the genre of the artist's $i$-th follower
$\beta$: Influencer's genre
$\beta_0$: Follower's genre

With this formula, we combine the various factors into one intuitive comprehensive impact score. It helps to identify artists with stronger influence and to quantify the influence of different artists for detailed, rigorous analysis and evaluation.

4.2.2 Conclusion
From the results of the model above, we obtain the weight of each artist's influence. After processing the data, we draw a directed weighted network graph in Gephi, setting influencer_id as the source, follower_id as the target and influencer_main_genre as the classification criterion. The result is Figure 3.
Figure 3 Directed Weighted Network Graph
Because of the large amount of data, Figure 3 does not show how each artist's influence unfolds. We therefore chose Pop/Rock, which has the largest number of examples in influencer_main_genre, for further drawing and analysis, as shown in Figure 4.
Figure 4 Each Artist's Influence of Pop/Rock
To further study the development of each music genre, we build a multi-series line chart for intuitive data visualization, taking the start time of each artist's musical career as the horizontal axis and the number of artists of each genre who began their musical career at that time as the vertical axis.
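The comprehensive influence score of Section 4.2.1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' program: the tuple layout of `edges`, the `counts` dictionary and the toy records are assumptions made for the example, and `Count_i` is simply passed in, since the text defines it but does not say how it is tabulated from the data. The genre coefficient uses the values 1 (same genre) and 0.5 (different genre) read from the definition above.

```python
from collections import defaultdict


def genre_coefficient(influencer_genre, follower_genre):
    """Genre influence coefficient: 1 if the genres match, 0.5 otherwise."""
    return 1.0 if influencer_genre == follower_genre else 0.5


def influence_scores(edges, counts):
    """Sum coefficient / Count_i over every follower of each influencer.

    edges  -- iterable of (influencer_id, influencer_genre,
              follower_id, follower_genre) tuples (hypothetical layout)
    counts -- dict mapping follower_id to Count_i, assumed to be > 0
    """
    scores = defaultdict(float)
    for influencer_id, inf_genre, follower_id, fol_genre in edges:
        coefficient = genre_coefficient(inf_genre, fol_genre)
        scores[influencer_id] += coefficient / counts[follower_id]
    return dict(scores)


# Toy records, invented for illustration only.
edges = [
    ("a1", "Pop/Rock", "f1", "Pop/Rock"),
    ("a1", "Pop/Rock", "f2", "R&B"),
    ("a2", "R&B", "f2", "R&B"),
]
counts = {"f1": 1, "f2": 2}
scores = influence_scores(edges, counts)
print(scores)
```

In this toy case artist a1 receives a full contribution from the same-genre follower f1 and a halved, count-divided contribution from f2, so a1 ranks above a2.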
Figure 5 Development of each Music Genre

4.2.3 Analysis
Based on the results of the above model, we conclude:
➢ The influence of music is related to the number of an artist's followers and the similarity of the two genres. With a quantitative comprehensive scoring model covering these two aspects, we gain a clearer understanding of each artist and can further analyze the historical development of a music genre according to the influence of the artist and the era.
➢ The analysis of the subnetwork shows that Pop/Rock is the most widely spread and influential genre, and that The Beatles are the most influential artist in the Pop/Rock genre. Drawing the same model for other genres reveals the composition of their artist subnetworks.
➢ The line chart of the development of the music genres over time shows that Pop/Rock developed at high speed from 1950 to 1960, setting off a wave of Pop/Rock, and then slowed down. In addition, R&B began to develop gradually in 1940 and maintained a more stable development. The other music genres maintained relatively stable, slow development.
➢ Through comprehensive analysis of the influence of different artists, the era and the development of music genres, we find that some influential artists can greatly influence the development of a genre and lay a foundation for its future development.

4.3 Music similarity model
We use the Pearson correlation coefficient to preprocess the large, multi-dimensional data set. Because similarity is more meaningful to study when a song is popular, we examine the correlations between the variables and drop the variables that correlate weakly with popularity, retaining the main variables. In addition, we give different weights to some variables to obtain more reference variables.
A common way to estimate the similarity between samples is to calculate the "distance" between them. We use Euclidean distance on the processed variable data to measure the similarity between pieces of music.

4.3.1 Pearson Correlation Coefficients
Pearson correlation coefficients measure whether two data sets lie on a line, that is, the linear relationship between interval-scale variables. When both variables are normally distributed continuous variables with a linear relationship between them, the Pearson correlation coefficient is commonly used to describe the strength of that relationship. The formula is as follows:

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} \tag{3}$$

In the formula above, $r$ represents the correlation coefficient, $n$ is the sample size, $x_i$ and $y_i$ represent the $i$-th property values of the two sample data sets respectively, and $\bar{x}$ and $\bar{y}$ are the corresponding sample means. The variation range of $r$ is $[-1, 1]$. Negative values represent negative correlations and positive values represent positive correlations. The closer the absolute value is to 1, the stronger the correlation, which gives a quantitative description.

4.3.2 Visualization
Based on the above formula, we describe the data and obtain Table 1. Based on this table, we visualize the Pearson correlation coefficients of the data. The closer the color of a coefficient is to red, the stronger the positive correlation between the two variables. The closer it is to blue, the stronger the negative correlation between the two variables.
Table 1 Descriptive Statistics
From the visualization and a detailed comparison of the correlation coefficients among the variables, we find that danceability, energy, loudness, acousticness, instrumentalness, explicit and popularity are more strongly correlated and more representative of the characteristics of a song. We therefore keep these variables, modify or delete the other parameters, and calculate the Euclidean distance later.
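The screening step described in Sections 4.3.1 and 4.3.2 can be sketched as follows. This is a minimal illustration using only the Python standard library: the function names, the cut-off value `threshold` and the toy feature values are assumptions made for the example, since the paper does not state the cut-off it used.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


def select_features(features, target, threshold):
    """Keep the features whose |r| with the target reaches the threshold.

    threshold is a hypothetical cut-off chosen by the analyst.
    """
    kept = {}
    for name, values in features.items():
        r = pearson(values, target)
        if abs(r) >= threshold:
            kept[name] = r
    return kept


# Toy values, invented for illustration only.
popularity = [10, 20, 30, 40]
features = {
    "energy": [0.2, 0.4, 0.6, 0.8],   # rises with popularity
    "tempo": [120, 90, 130, 100],     # weakly related to popularity
}
kept = select_features(features, popularity, threshold=0.5)
print(kept)
```

Here energy is perfectly linear in popularity and is kept, while tempo has a correlation of about -0.14 and is dropped.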
Figure 6 Comparison of the Correlation coefficients among variables

4.3.3 Calculation of Euclidean Distance
In Section 4.3.2 above, we obtained a few retained parameters, which we now weight. The weight formula (4) assigns each parameter a weight derived from its correlation coefficient with popularity. After we obtain the weights, we multiply the original data by them to get new data, and calculate the Euclidean distance on this basis.
Euclidean distance is a common definition of distance: it is the true distance between two points in $n$-dimensional space. Because the similarity between songs can be regarded as a distance, we use the $n$-dimensional Euclidean distance. The formula is as follows:

$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \tag{5}$$

Because the data set is huge, and we need a more intuitive view of the similarity of music within one genre and across genres, we selected the more influential music genres, namely Pop/Rock, R&B and Country. The selected singers have high influence and can represent the musical characteristics of their genres. The Euclidean distance between each pair of songs is shown in the following table. Songs 1 to 5 are Pop/Rock, 6 to 8 are R&B, and 9 to 10 are Country; the red boxes mark the Euclidean distances between music of the same genre. The smaller the Euclidean distance, the more similar the two pieces of music.
Table 2 Euclidean Distance between Genres of Music
According to the results of Table 2 above, music of the same genre is often more similar, but there are also many pairs of songs from different genres with high similarity. This shows that a music genre often covers a wide range of musical characteristics, and that the features of one genre often overlap to some degree with those of other genres. The boundaries between genres are often vague, and music is strongly integrated.
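The weighting and distance steps of Section 4.3.3 can be sketched as below. Note the assumption: the paper's weight formula is not reproduced here, so the sketch normalizes the absolute correlation coefficients so that they sum to 1. That normalization, the function names and the toy numbers are all illustrative choices, not the authors' stated method.

```python
import math


def weights_from_correlations(correlations):
    """ASSUMED weighting: |r_j| divided by the sum of all |r_k|."""
    total = sum(abs(r) for r in correlations.values())
    if total == 0:
        raise ValueError("all correlations are zero")
    return {name: abs(r) / total for name, r in correlations.items()}


def weighted_distance(song_a, song_b, weights):
    """Euclidean distance after multiplying each feature by its weight."""
    return math.sqrt(sum(
        (weights[k] * song_a[k] - weights[k] * song_b[k]) ** 2
        for k in weights
    ))


def distance_matrix(songs, weights):
    """Symmetric matrix of pairwise weighted distances."""
    return [[weighted_distance(a, b, weights) for b in songs] for a in songs]


# Toy values, invented for illustration only.
correlations = {"energy": 0.6, "danceability": 0.2, "acousticness": -0.2}
weights = weights_from_correlations(correlations)
songs = [
    {"energy": 1.0, "danceability": 0.0, "acousticness": 0.0},
    {"energy": 0.0, "danceability": 0.0, "acousticness": 0.0},
]
matrix = distance_matrix(songs, weights)
print(matrix)
```

With these toy inputs the two songs differ only in energy, whose weight is 0.6, so the off-diagonal distance is 0.6 and the diagonal is 0.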
4.4 Division of music genres
Figure 7 Percentage accumulation line chart with data (horizontal axis: variables 1 to 14; one line per genre)
In this section, we use the known data, take artists_id as the identifier, and divide the songs into groups by genre. After classification, we obtain 14 variables for each music genre and compute their final values. We then visualize the values and calculate the Euclidean distance to observe the similarity between genres. We believe that the similarity between genres can be illustrated by the two most similar genres, and the difference between genres by the two least similar ones. In addition, we group the songs by release time into 10-year stages to observe how music genres change over time.

4.4.1 The similarity was observed using Euclidean examples
We calculated the average of the 14 variables used to measure songs and began by visualizing the results, as shown in Figure 7. In the percentage accumulation line chart with data, we can clearly see how the songs of each genre perform on each variable and how close the genres are to one another. Based on this, we calculate the Euclidean distances between these data and obtain the results shown in Table 3.
By analyzing Figure 7 and Table 3, we can see that Avant-Garde and Blues have the highest similarity, and Jazz and Children's have the lowest similarity. We conclude the following.
Similar music genres differ little in danceability, energy, key and speechiness, indicating that similar genres are consistent in rhythm. Their perceptual characteristics are also very similar. Moreover, the enthusiasm they convey to listeners and the clarity of their lyrics are basically the same.
Dissimilar genres differ greatly in valence, loudness, acousticness and instrumentalness. Taking Jazz and Children's as an example, Jazz has larger loudness and acousticness, whereas Children's is the opposite in these respects and is higher in valence, instrumentalness and danceability. This is consistent with Children's music being happy, lively and suitable for singing and dancing.

                            1        2        3        4        5        6        7        8        9       10       11       12       13       14       15       16       17       18       19
1 Avant-Garde               0  1742.65  82774.9    76636  7129.02  17109.3  32413.6  72315.6    16353  52807.1    90830  11400.4    80138  27521.2  34572.4  29303.9  51292.7  6904.45  22113.8
2 Blues               1742.65        0  84517.5  74893.4  5386.38  18851.9  34156.2    70573  18095.7  51064.4  89087.4   9657.8  78395.4  25778.5  32829.8  27561.3  49550.1   8647.1  23856.4
3 Children's          82774.9  84517.5        0   159411  89903.9  65665.6  50361.3   155091  66421.9   135582   173605  94175.3   162913   110296   117347   112079   134068  75870.5  60661.2
4 Classical             76636  74893.4   159411        0    69507  93745.3   109050  4320.56    92989  23828.9    14194  65235.6  3502.11  49114.9  42063.6  47332.1  25343.3  83540.4  98749.8
5 Comedy/Spoken       7129.02  5386.38  89903.9    69507        0  24238.3  39542.6  65186.6    23482  45678.1    83701  4271.49    73009  20392.2  27443.4  22174.9  44163.7  14033.4  29242.8
6 Country             17109.3  18851.9  65665.6  93745.3  24238.3        0  15304.3  89424.9  756.367  69916.4   107939  28509.7  97247.3  44630.4  51681.7  46413.2    68402  10204.9  5004.52
7 Easy Listening      32413.6  34156.2  50361.3   109050  39542.6  15304.3        0   104729  16060.6  85220.7   123244    43814   112552  59934.7    66986  61717.5  83706.3  25509.1  10299.8
8 Electronic          72315.6    70573   155091  4320.56  65186.6  89424.9   104729        0  88668.6  19508.6  18514.4  60915.2  7822.41  44794.5  37743.2  43011.7  21022.9  79220.1  94429.4
9 Folk                  16353  18095.7  66421.9    92989    23482  756.367  16060.6  88668.6        0  69160.1   107183  27753.4    96491  43874.2  50925.4  45656.9  67645.7  9448.59  5760.74
10 International      52807.1  51064.4   135582  23828.9  45678.1  69916.4  85220.7  19508.6  69160.1        0  38022.9  41406.7  27330.9  25285.9  18234.7  23503.2  1514.55  59711.5  74920.8
11 Jazz                 90830  89087.4   173605    14194    83701   107939   123244  18514.4   107183  38022.9        0  79429.6    10692  63308.9  56257.6  61526.1  39537.3  97734.5   112944
12 Latin              11400.4   9657.8  94175.3  65235.6  4271.49  28509.7    43814  60915.2  27753.4  41406.7  79429.6        0  68737.6  16120.7    23172  17903.5  39892.3  18304.9  33514.2
13 New Age              80138  78395.4   162913  3502.11    73009  97247.3   112552  7822.41    96491  27330.9    10692  68737.6        0  52616.9  45565.6  50834.1  28845.3  87042.5   102252
14 Pop/Rock           27521.2  25778.5   110296  49114.9  20392.2  44630.4  59934.7  44794.5  43874.2  25285.9  63308.9  16120.7  52616.9        0  7051.28  1782.77  23771.5  34425.6  49634.9
15 R&B                34572.4  32829.8   117347  42063.6  27443.4  51681.7    66986  37743.2  50925.4  18234.7  56257.6    23172  45565.6  7051.28        0  5268.53  16720.3  41476.9  56686.2
16 Reggae             29303.9  27561.3   112079  47332.1  22174.9  46413.2  61717.5  43011.7  45656.9  23503.2  61526.1  17903.5  50834.1  1782.77  5268.53        0  21988.8  36208.3  51417.7
17 Religious          51292.7  49550.1   134068  25343.3  44163.7    68402  83706.3  21022.9  67645.7  1514.55  39537.3  39892.3  28845.3  23771.5  16720.3  21988.8        0  58197.1  73406.5
18 Stage & Screen     6904.45   8647.1  75870.5  83540.4  14033.4  10204.9  25509.1  79220.1  9448.59  59711.5  97734.5  18304.9  87042.5  34425.6  41476.9  36208.3  58197.1        0  15209.3
19 Vocal              22113.8  23856.4  60661.2  98749.8  29242.8  5004.52  10299.8  94429.4  5760.74  74920.8   112944  33514.2   102252  49634.9  56686.2  51417.7  73406.5  15209.3        0
This is a dissimilarity matrix. Column numbers refer to the numbered genres in the rows.
Table 3 Approximation between various genres

4.4.2 The variation of genres over time
By calculating the average value of each parameter over all music in each genre for each year, we obtain the annual changes of each parameter for each genre. We take the Pop/Rock parameters whose range is [0, 1] as an example; their changes over time are shown in Figure 8.
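The per-genre yearly averaging of Section 4.4.2 can be sketched as follows. This is a minimal standard-library illustration: the record keys `genre` and `year`, the helper names and the toy rows are assumptions made for the example and are not taken from the data files.

```python
from collections import defaultdict


def decade_of(year):
    """Map a year to the start of its 10-year stage, e.g. 1967 -> 1960."""
    return year // 10 * 10


def yearly_genre_means(rows, features):
    """Average each feature over all songs sharing a (genre, year) pair.

    rows     -- iterable of dicts with hypothetical keys 'genre', 'year'
                and one numeric value per feature name
    features -- list of feature names to average
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for row in rows:
        key = (row["genre"], row["year"])
        counts[key] += 1
        for name in features:
            sums[key][name] += row[name]
    return {
        key: {name: sums[key][name] / counts[key] for name in features}
        for key in counts
    }


# Toy rows, invented for illustration only.
rows = [
    {"genre": "Pop/Rock", "year": 1965, "energy": 0.4, "valence": 0.8},
    {"genre": "Pop/Rock", "year": 1965, "energy": 0.6, "valence": 0.6},
    {"genre": "Pop/Rock", "year": 1966, "energy": 0.7, "valence": 0.5},
]
means = yearly_genre_means(rows, ["energy", "valence"])
print(means[("Pop/Rock", 1965)])
```

Replacing `row["year"]` with `decade_of(row["year"])` in the key would give the 10-year stages mentioned in Section 4.4 instead of annual values.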
Figure 8 The variation diagram of each parameter of Pop/Rock with time
From the variation diagram of each parameter of Pop/Rock over time, we find that the mode, valence and acousticness of Pop/Rock gradually decreased, with acousticness changing relatively drastically. Energy fluctuated strongly in the early stage and gradually stabilized in the late stage. Danceability and explicit increased slightly, and the other parameters basically remained stable.
As a whole, we find that the parameters of a genre change over time. These changes keep the music in line with the trends of the times, give the genre more vitality, and lay the foundation for its development.

5 Music Artist and Genre Revolution
In this section, we explore the influence of musical artists on each other and the evolution of genres.

5.1 Cluster model
According to the statistics, The Beatles have 615 followers, the largest number among all influencers, so they best reflect the relationship between influencers and followers. Our group therefore selects The Beatles as the example for analysis in this paper. We extracted all the followers of The Beatles and their corresponding variables from the data_by_artist table to obtain the musical characteristics of each follower. The followers are then clustered and normalized, and the Euclidean distance is calculated to obtain the similarity between the clustered followers and the musical expression of The Beatles. We conclude that "influential people" really do affect the music created by their followers. Finally, by calculating the difference between the clustered data and the music feature parameters of The Beatles, we obtain the influence of the influencer on the followers' music features.
5.1.1 Hierarchical clustering method
To facilitate subsequent data processing, we use the hierarchical (system) clustering method to simplify the data of the 615 followers and extract their main features. Clustering is a data processing method that divides a collection of data into several classes composed of similar objects. A clustering model can simplify massive data and reflect the characteristics of the data. Here, we use hierarchical clustering to condense the data and extract its features for processing and analysis.

5.1.2 The principle of hierarchical clustering method
The hierarchical clustering method calculates the distance between classes of data points and merges the two closest classes. After repeated iterations, the data is clustered layer by layer until all data points are combined into one class, which yields the multi-level clustering results. A suitable clustering scheme can then be chosen from the actual number of clusters together with the dendrogram (pedigree diagram).

5.1.3 Clustering process
The process of this hierarchical clustering is shown graphically below:
Figure 9 The process of this hierarchical clustering
The elbow method can roughly estimate the optimal number of clusters graphically. Assume that n samples are divided into K classes (K ≤ n-1, so that at least one class contains two or more elements). The distortion degree of each class equals the sum of the squared distances between the center of gravity of the class and the positions of its members.
The distortion degree of the $k$-th class is:

$$J_k = \sum_{i \in C_k} \lVert x_i - u_k \rVert^2 \tag{6}$$

Therefore, we define the total distortion degree of all classes as:

$$J = \sum_{k=1}^{K} \sum_{i \in C_k} \lVert x_i - u_k \rVert^2 \tag{7}$$

Among them:
$C_k$ represents the $k$-th class ($k = 1, 2, \ldots, K$)
$J$ is the aggregation coefficient
$u_k$ represents the center of gravity of the $k$-th class
$x_i$ represents the position of the $i$-th member of the class

We take the number of clusters K as the abscissa and the aggregation coefficient J as the ordinate, and draw the aggregation coefficient line chart shown in the following figures. We then analyze the line chart with the elbow rule to obtain the optimal number of clusters.
Figure 10 Aggregation coefficient plot
Figure 11 Local amplification of the aggregation coefficient line chart (marked point: K = 6, J = 55.905)
According to the plot of the aggregation coefficient, the decline of the line slows once the number of classes reaches 6, so the number of classes K can be set to 6.
With 6 classes, we calculate the average value of each parameter within each class and obtain six groups of representative parameter data. Together with the parameter data of The Beatles, this gives seven groups of data, as shown in Table 4.
Table 4 Representative parameter data
We test the averages of the six classes, as shown in Table 5, and find that the indicators basically reach a significant level. This indicates that the classification results are effective.
Table 5 Clustering results test

5.1.4 Euclidean distance
With SPSS, we calculate the Euclidean distance between each of the six clustered classes and The Beatles, as shown in Table 6.
Table 6 Proximity Distance
We can see that the distance between each class and The Beatles is small, indicating that each class is similar to The Beatles. Since these six classes contain the data characteristics of the 615 followers, we conclude that influencers do affect the music created by their followers.
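The elbow procedure of Section 5.1.3 can be sketched in pure Python as below. Two assumptions are worth flagging: the paper ran its clustering in SPSS and does not name the linkage rule, so this sketch merges the two clusters with the closest centers of gravity (centroid linkage) purely for illustration; and the five toy points stand in for the follower data. The aggregation coefficient J follows the definition in the text, the sum of squared distances from each member to the center of gravity of its class.

```python
def centroid(points):
    """Center of gravity of a list of equal-length coordinate lists."""
    dim = len(points[0])
    return [sum(p[j] for p in points) / len(points) for j in range(dim)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def agglomerate(points, k):
    """Merge the two clusters with the closest centroids until k remain.

    Centroid linkage is an ASSUMED choice, not the paper's stated method.
    """
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = squared_distance(centroid(clusters[i]),
                                     centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def aggregation_coefficient(clusters):
    """J: total squared distance of members to their cluster centroid."""
    total = 0.0
    for members in clusters:
        center = centroid(members)
        total += sum(squared_distance(x, center) for x in members)
    return total


# Toy points, invented for illustration only.
points = [[0, 0], [0, 1], [10, 10], [10, 11], [20, 0]]
curve = [(k, aggregation_coefficient(agglomerate(points, k)))
         for k in range(1, len(points) + 1)]
for k, j in curve:
    print(k, round(j, 3))
```

Plotting J against K for this toy data shows a sharp drop up to K = 3 and almost no gain afterwards, which is the kind of bend the elbow rule looks for.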
5.1.5 Similarity of different parameters between influencer and follower

To better characterize the similarity between influencers and followers on individual parameters, we define a degree-of-difference formula between the followers and the influencer for each parameter, and compare each musical feature of the six classes of follower data with the corresponding feature of The Beatles. This lets us judge whether the influence of an influencer on followers is comprehensive or one-sided. The formula is as follows:

(8)

In formula (8), the left-hand side represents the difference between the six classes of data and the influencer on a given parameter; the smaller the value, the greater the similarity between the six classes of follower data and the influencer on this parameter. The remaining quantities are the parameter value of class i, the corresponding parameter value of the influencer, and the number of categories.

The results are visualized in Figure 12. The red line in the graph is the reference line. When the difference between a parameter and that of the influencer is less than or equal to 0.05, we call the parameter greatly affected. We observe that loudness, tempo, valence, energy and danceability are the music characteristics most affected by the influencer. This suggests that some features are more contagious in the process of influence and are more readily adopted by followers in their own works. It therefore also supports the conclusion that the influencer affects the music characteristics of the followers.

By matching data_by_artist with each artist's genre and the start time of their music career, we calculate the popularity data of each genre and visualize it. On this basis, we use the rise of a genre to mark the occurrence of a revolutionary change, and we use influence to identify the changers among the artists active in the initial years of each new genre.
Figure 12 Analysis of Effect (horizontal axis: degree of difference, 0 to 0.045; parameters shown: duration_ms, speechiness, liveness, instrumentalness, acousticness, key, mode, loudness, tempo, valence, energy, danceability)

We take the time at which a singer begins their music career as the independent variable. The popularity of each genre at a given time is represented by the average popularity of all singers of that genre who started at that time, and this value is taken as the dependent variable. Observing how the genres develop over time reveals significant changes in music development.

5.2 The analysis of music revolution

We visualize the obtained data as a percentage stacked histogram, shown in Figure 13. By observing the color at the 0% position for each genre in Figure 13, we can see directly when each genre began to rise. The rise of a genre represents the beginning of a revolution in music development.

Figure 13 Percentage accumulation histogram of music development (vertical axis: 0% to 100%; horizontal axis: 1930 to 2010)

5.3 The analysis of music changers

Based on the number of artists in each new genre, we select the new genres R&B and Electronic and look for the changers in these two genres. A network diagram is used to represent the influence of the artists and to find the revolutionaries of these two new genres. Using the network diagram, the career start times in the influence_data table and the time correction given by the influencer_active_start data, we identify the singers with great influence in the ten years in which each new genre was founded as the revolutionaries of the music revolution. We choose Ray Charles as the revolutionary of R&B and Giorgio Moroder as the revolutionary of Electronic.
They are also revolutionaries in the development of music.

Figure 14 Network Diagram of Internal Influencers of Electronic Genres

Figure 15 Network Diagram of Internal Influencers of R&B Genres

5.4 R&B genre changes over time

In this section, we use time series to visualize the songs of the R&B genre and use grey correlation analysis to calculate the strength of association between each music feature and music popularity, so as to measure the indicators of dynamic influencers. Combining these indicators with the time series visualization, we obtain how the R&B genre changes over time.

5.4.1 Time series model

Time series analysis orders data by time, extracts information from it, analyzes how the data change with time and explores the rules of change. With time series analysis, we can follow the influence of a type of music over time and obtain the characteristics of the genre over time.

According to the genres of the artists and our assumptions, we classify all music performed by artists belonging to the R&B genre as R&B music. We take the period 1950-2020 and compute the average of each parameter over all R&B songs in each time segment. Finally, all data are normalized to facilitate subsequent analysis.

We perform time series analysis on all normalized parameter data of R&B; the sequence diagram is shown in Figure 16. From the sequence diagram, speechiness, popularity and energy gradually increase over time, while acousticness, mode and loudness gradually decrease. Instrumentalness fluctuated severely in the early stage but eventually returned to around its initial value. The other parameters remain basically stable. Some parameters of R&B thus change with the times, which allows R&B to develop continuously and become an important genre in music.
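The averaging and normalization step described above can be sketched as follows. The paper does not state which normalization was used or how long a time segment is, so min-max scaling to [0, 1] and one segment per year are assumptions here; the song records and function names are hypothetical.

```python
from collections import defaultdict


def yearly_means(songs, feature, start=1950, end=2020):
    """Average `feature` over all songs of each year in [start, end]."""
    by_year = defaultdict(list)
    for song in songs:
        if start <= song["year"] <= end:
            by_year[song["year"]].append(song[feature])
    return {year: sum(v) / len(v) for year, v in sorted(by_year.items())}


def min_max_normalize(series):
    """Rescale a {year: value} series linearly so that it spans [0, 1]."""
    lo, hi = min(series.values()), max(series.values())
    if hi == lo:
        # A constant series carries no trend; map it to 0 everywhere.
        return {year: 0.0 for year in series}
    return {year: (v - lo) / (hi - lo) for year, v in series.items()}


# Illustrative records only, not taken from the contest data set.
songs = [
    {"year": 1950, "energy": 0.2},
    {"year": 1950, "energy": 0.4},
    {"year": 1985, "energy": 0.5},
    {"year": 2020, "energy": 0.7},
    {"year": 1940, "energy": 0.9},  # outside 1950-2020, ignored
]

energy_by_year = yearly_means(songs, "energy")
normalized = min_max_normalize(energy_by_year)

for year, value in normalized.items():
    print(year, round(value, 3))
```

Applying the same two functions to every music feature puts all curves on a common scale, which is what makes them comparable in a single sequence diagram.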
Figure 16 Time series analysis on all normalized parameter data of R&B

5.4.2 Grey Relation Analysis

Grey Relation Analysis (GRA) is a multi-factor statistical analysis method. Its basic idea is to judge how closely sequences are related by the similarity of the geometric shapes of their curves: the closer the curves, the greater the correlation between the corresponding sequences, and vice versa [2].

Step 1: Determination of reference sequence and comparison sequences

Considering the dynamic impact of the indicators, we choose popularity as the reference sequence. Popularity changes over time and, as a measure of visibility, also reflects the size of an influencer's impact. The other music features are selected as the comparison sequences.

The reference sequence is

$$x_0 = \{x_0(k) \mid k = 1, 2, \ldots, n\} \tag{9}$$

The comparison sequences are denoted as:

$$x_i = \{x_i(k) \mid k = 1, 2, \ldots, n\}, \quad i = 1, 2, \ldots, m \tag{10}$$

Step 2: Standardized processing of data

Because the original data have different dimensions, they are difficult to compare directly. We therefore use the mean value method to make the data dimensionless, that is, we divide the data of each column by the mean value of the corresponding sequence. The resulting standardized sequence is:

$$x_i'(k) = \frac{x_i(k)}{\frac{1}{n} \sum_{l=1}^{n} x_i(l)}, \quad i = 0, 1, \ldots, m \tag{11}$$

where k corresponds to a time period and i corresponds to a music feature.

Step 3: Calculation of the correlation coefficient

The correlation coefficient is calculated as follows:

$$\xi_i(k) = \frac{\min_i \min_k \lvert x_0'(k) - x_i'(k) \rvert + \rho \max_i \max_k \lvert x_0'(k) - x_i'(k) \rvert}{\lvert x_0'(k) - x_i'(k) \rvert + \rho \max_i \max_k \lvert x_0'(k) - x_i'(k) \rvert} \tag{12}$$

$\rho$ is called the resolution coefficient. The smaller $\rho$ is, the greater the resolution. The general value range of $\rho$ is $(0, 1)$, and the specific value depends on the circumstances. When $\rho \le 0.5463$ the resolution is the best; usually $\rho = 0.5$.

Step 4: Calculation of the correlation degree

The correlation coefficient gives the degree of correlation between a comparison sequence and the reference sequence at each moment, that is, at each point of the curve, so there is more than one value per sequence. This information is too scattered for an overall comparison. Therefore, the correlation coefficients at all moments are condensed into a single average value, which serves as the quantitative expression of the correlation degree between the comparison sequence and the reference sequence. The correlation degree formula is as follows:

$$r_i = \frac{1}{n} \sum_{k=1}^{n} \xi_i(k) \tag{13}$$

Through MATLAB programming, we obtain the correlation between each music feature and popularity.

Table 5 Each music feature and popularity

Music feature        Correlation with popularity
danceability         0.8976
energy               0.9104
instrumentalness     0.7429
valence              0.8559
tempo                0.8745
loudness             0.821
liveness             0.8475
speechiness          0.908
key                  0.8818
acousticness         0.7631
explicit             0.7134

According to the table, the correlation of energy and speechiness with popularity is relatively strong, while that of instrumentalness, explicit and acousticness is relatively weak. The other music features are at a medium level. According to Hypothesis 2, we can obtain the size of the dynamic influencer index.

Combining the table with Figure 17, we obtain the following results. In the period 1950-2020, the trends of energy and speechiness, which are strongly correlated with popularity, are basically the same as that of popularity, while the trends of instrumentalness, explicit and acousticness, which are weakly correlated, differ from it. The correlation and synergy between indicators therefore reflect how the music characteristics of the R&B genre change over the years.

Figure 17 Time series analysis for data_by_year

6 The External Factors Affecting Music

In this part, we identify the external factors affecting music by visualizing, over time, the total number of songs, the number of songs of each genre and the characteristics of music.
6.1 Visualization of music features

To make the chart clearer and more intuitive, we normalize the data of data_by_year and plot it as a time series, as shown in Figure 18.

Figure 18 Changes in the number of total songs

According to Figure 18, popularity began to rise around 1953. World War II had ended only a few years earlier, in 1945, and the social environment of the 1950s was relatively stable and prosperous. The music of this period was strongly shaped by this social environment, and the popularity of music improved greatly.

6.2 Visualizing the number of songs

In this section, we count the total number of songs and the number of songs of each genre from 1920 to 2020. As shown in Figure 19, because the values for Pop/Rock are large, we plot them against the right-hand axis of the coordinate system.

Figure 19 Changes in the number of songs in various genres

6.3 Analysis

Combining the above three graphs, we draw the following conclusions:

➢ Around 1950: combining the figures in Sections 6.1 and 6.2, we find that, in the years just after the end of the Second World War, people had experienced the pain and anxiety of the war. After the war ended, people began to have the energy and time to appreciate music that was relaxing, joyful or that resonated with them, which made music more popular overall. Rhythm and blues, jazz and rock, with their strong rhythms and unfettered forms, catered to the restless, energetic temperament of young people, and a great deal of such music was produced in this period.

➢ In the 1950s and 1960s, countries around the world were basically in the post-World War II recovery period. During this period, the economy gradually developed, the structure of the music market changed, and two obvious phenomena emerged in the record market, namely 'market intersection' and 'reprint'.
This situation broke down the barriers between the various music markets, leading to the gradual formation of pop/rock music. The popularity of this genre made the total number of songs increase year by year from around 1953. In addition, with the end of the Second World War, every genre developed to a varying degree. Therefore, in Figure 19, the curves of all genres change in this period.

➢ In the 1960s, the folk ballad revival movement began, so the folk genre changed greatly in this period. Traditional folk songs have three elements: oral transmission, a lower-class audience, and an unknown or untraceable author. The later development of the Internet often ran against these three elements, and folk music gradually declined. New things thus often have a great impact on old ones.

➢ In the twenty-first century, although the Internet has developed enormously and information spreads more easily, the figure shows that song creation in nearly all genres began to decline. We attribute this to many creators starting to focus on re-creating existing music, with even some radical creators becoming conservative.

In general, the development of music is often affected by social change: wars, technological revolutions, social movements or the emergence of a single artist can each affect a genre, or even music as a whole, to different degrees. In addition, changes in artists' mentality also affect the development of music. The influence of society, environment, politics and creators on music is sometimes concrete and sometimes abstract. Identifying the specific reasons for changes in music usually requires more data and more detailed research.
7 Sensitivity Analysis

In the model for Problem 1, we define an artist's comprehensive influence score. The genre influence coefficient is set to 1 when the influencer and the follower belong to the same genre and to 0.5 when they belong to different genres. When the genres of influencer and follower differ, the strength of the influence also differs. Evaluating the degree of influence between specific genres would require a large amount of data, so we put forward Hypothesis 4.

Here we test the genre influence coefficient used in the Problem 1 model, taking The Beatles as an example. For the case where influencer and follower are in different genres, we change the influence coefficient by 5% while keeping the other parameters unchanged, and draw a line chart for the sensitivity analysis, as shown in Figure 20.

Figure 20 Sensitivity Analysis (horizontal axis: change rate of genre influence coefficient; vertical axis: comprehensive influence score)

Plotted points (change rate, score): (-0.5, 68.052), (-0.4, 68.094), (-0.3, 68.135), (-0.2, 68.176), (-0.1, 68.218), (0, 68.259), (0.1, 68.300), (0.2, 68.342), (0.3, 68.383), (0.4, 68.424), (0.5, 68.466)

The sensitivity analysis shows that the comprehensive influence score responds to the genre influence coefficient, but only weakly: over the plotted range, the score moves from 68.052 to 68.466 around a baseline of 68.259, a change of about 0.3% in either direction. In an actual evaluation of the comprehensive influence of an artist, a more rigorous scoring formula is needed, based on the relationship between the influencers and followers.
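The size of this effect can be checked directly from the data labels of Figure 20. The sketch below uses only those plotted points; it does not re-implement the comprehensive influence score, whose formula belongs to the Problem 1 model. The "elasticity" ratio is an illustrative quantity introduced here, not one defined in the paper.

```python
# (change rate of the cross-genre influence coefficient, comprehensive
# influence score of The Beatles), as read from the labels of Figure 20.
points = [
    (-0.5, 68.052), (-0.4, 68.094), (-0.3, 68.135), (-0.2, 68.176),
    (-0.1, 68.218), (0.0, 68.259), (0.1, 68.300), (0.2, 68.342),
    (0.3, 68.383), (0.4, 68.424), (0.5, 68.466),
]

# Score with the coefficient left unchanged.
baseline = dict(points)[0.0]

# Relative change of the score with respect to the baseline.
relative_change = {
    rate: (score - baseline) / baseline for rate, score in points
}

# Illustrative elasticity: relative score change per unit of relative
# coefficient change (undefined at rate 0, so that point is skipped).
elasticity = {
    rate: rel / rate for rate, rel in relative_change.items() if rate != 0.0
}

max_abs_change = max(abs(rel) for rel in relative_change.values())

for rate, rel in relative_change.items():
    print(f"rate {rate:+.1f}: score change {100 * rel:+.3f}%")
print(f"largest score change: {100 * max_abs_change:.3f}%")
```

The ratio stays nearly constant across the plotted points, which is consistent with the almost straight line in Figure 20.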
8 Model Evaluation and Further Discussion

8.1 Strengths

➢ Multidimensional analysis: We visualize our data and model results vividly and in diverse ways, and combine them with analysis to obtain more objective and realistic conclusions.

➢ Appropriate method selection: The methods we choose let us select the values required for similarity measurement.

➢ We combine the characteristics of the data with the actual situation and put forward simple, intuitive models and algorithms.

➢ We make full use of the relationships between the data tables, integrating the data on each object across the four tables in several ways, and comprehensively explore the detailed information contained in the data.

8.2 Weaknesses

➢ The model design has a certain randomness.

➢ Owing to our limited professional knowledge of music, we have an insufficient understanding of music-related influencing factors and do not fully take into account the essence of music reflected by the various music parameters. The views obtained in the analysis are therefore somewhat subjective.

9 Document for ICM

Dear ICM Association:

Thank you for your trust in our group. As your association requested, we are pleased to explain our understanding to you. The following is the detailed content of the report.

The value of understanding music influence through networks is mainly reflected in three aspects. First, a network clearly expresses the logical relationships between musicians and thus helps us better understand the relationship between genres and artists. Second, through computation on the data, a network intuitively reveals the key nodes and helps us find influential artists. Finally, a network also helps us understand the development of a genre at the macro level.
With richer data, we could, on the one hand, establish a scoring system for influence between genres, explore the degree of influence between different genres, and better evaluate the influence of artists. On the other hand, we could use an LDA model to analyze song titles and the words that appear frequently in lyrics, and then analyze the musical thought expressed by a singer or even an era. With more abundant data, we could propose more targeted solutions for different data.

Music is a special language used all over the world. It has stronger appeal than spoken language and carries many thoughts and feelings that are difficult to express in words. In different periods, music reflects a wide range of information about its historical period through its powerful expressive ability. From different perspectives, we can understand how people in different periods, states and frames of mind perceived things. Besides obtaining information from music, we can also actively communicate with people through music, explore people's psychology through the appeal of music, and enrich social science knowledge.

The content of music often depends on the social environment of its time. A technological innovation or a change of social ideology is reflected in music to varying degrees. Creating music requires not only a perception of the environment but also a certain knowledge of music. The changes of notes and music parameters can often provide information for research in many social science and natural science disciplines, such as mathematics and physics. In turn, the development of social and natural science helps us better study, create and develop music. Music and science are complementary.
We firmly believe that our understanding and solutions can bring inspiration to your music development.

Best regards,
Team #2105383

References

[1] Mauch Matthias, MacCallum Robert M., Levy Mark, Leroi Armand M. The evolution of popular music: USA 1960–2010 [J]. R. Soc. open sci., 2015, 2: 150081.
[2] Jiang Shiquan. Gray correlation decision model based on general gray number and its application research [D]. Nanjing University of Aeronautics, 2018.
[3] Liu Sifeng, Xie Naiming. Grey system theory and its application. 4th edition [M]. Science Press, 2008.