Melody Generation using an Interactive Evolutionary Algorithm


📝 Original Info

  • Title: Melody Generation using an Interactive Evolutionary Algorithm
  • ArXiv ID: 1907.04258
  • Date: 2020-04-09
  • Authors: Researchers from original ArXiv paper

📝 Abstract

Computer-aided music generation has recently grabbed the attention of many scientists in the area of artificial intelligence. Deep learning techniques have advanced sequence production methods for this purpose. Yet a challenging problem remains: how to evaluate music generated by a machine. In this paper, a methodology is developed based on an interactive evolutionary optimization method, in which the generated melodies are scored primarily by human experts during training. This music quality scoring is modeled using a Bi-LSTM recurrent neural network. The novel melodies generated by a Genetic algorithm are then evaluated using this Bi-LSTM network. The results show that the proposed method is able to create pleasurable melodies with desired styles and pieces. The method is also quite fast compared to state-of-the-art data-oriented evolutionary systems.

📄 Full Content

Music is a ubiquitous and perhaps the most influential part of media content. It readily conveys emotions and concepts in an artistic and delicate way. This motivates accompanying all media types and human-oriented places, such as movies, theaters, games, and shops, with music in order to bring pleasure to people.

For a machine to automatically create enjoyable music, it should imitate the rules, know-how, and subtleties embedded in a pleasurable melody or a famous masterpiece of an artist. These could be extracted from a rich database of music played by famous musicians and then taught to the machine.

Several methods have been introduced so far to generate musical melodies automatically, such as Hidden Markov Models [21,5,19,7,16,20,2], models based on artificial neural networks [23,18,8,3,4,10,14,26], models based on evolutionary and population-based optimization algorithms [13,25,11,22], and models based on local search algorithms [6,9]. Recently, sequential deep neural networks, especially Long Short-Term Memory (LSTM) networks, have become prevalent and have achieved successful results in generating time-series sequences [15,1,17].

Music generation can be viewed from different aspects. One can focus on melody generation, while others work specifically on harmony and rhythm. In terms of data, music generation methods can also be divided into note-based and signal-based methods. In the former, the machine learns music from music sheets, while in the latter it learns from musical audio signals. Music generation can also be discussed in terms of the difficulty of performing [16,24] and the narrative [12].

The major questions in this regard are: What kind of music do we want a machine to generate? Do we need new styles to be created, or do we want the machine to imitate existing music? How can the quality of the generated music be evaluated by a machine? Can a machine generate new music analogous to the manuscripts of a famous musician, and how can this similarity be certified?

In this study, we assume that humans judge the quality of the generated melodies and give them scores. The human scoring is then modeled using a Bidirectional Long Short-Term Memory (Bi-LSTM) neural network. This network replaces the human scoring and serves as a standard evaluator of the melodies generated by optimization-based models. The proposed music generation system is based on a Genetic algorithm, which acts as a note-based method to create melodies.

The novelty of this paper is two-fold: first, the interaction between human and machine to generate new, meaningful, and pleasurable melodies; second, a new scoring model to evaluate the generated melodies.

The rest of the paper is arranged as follows. In Section II, the proposed method is presented in three phases. In Section III, the experimental results are provided and analyzed. The paper ends with our conclusion in Section IV, followed by the cited references.

The proposed method consists of three major phases. First, a Genetic algorithm (GA) is used to generate a wide spectrum of melodies, from bad ones up to pleasurable ones. The bad melodies are the random ones produced when the GA starts. In the second phase, the outputs of the GA are given to humans to be scored from zero to 100. Then, a Bi-LSTM neural network is trained on these generated and scored melodies. Once trained on a sufficient amount of musical data and associated scores, this network acts as an evaluator of the GA music generator. In the end, the GA melody that maximizes the Bi-LSTM output, used as the objective function, is played as the generated pleasurable melody.
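The final phase, where the GA searches for a melody that maximizes a learned score, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the note alphabet, truncation selection, one-point crossover, mutation rate, and the `toy_score` function (a stand-in for the trained Bi-LSTM evaluator) are all assumptions made for the example.

```python
import random
from typing import Callable, List

Melody = List[str]
NOTES = list("CDEFGAB")  # assumed toy alphabet, not the paper's encoding


def evolve(score: Callable[[Melody], float], length: int = 16,
           pop_size: int = 30, generations: int = 40,
           mutation_rate: float = 0.1, seed: int = 0) -> Melody:
    """Return the melody with the highest score found by a simple GA."""
    rng = random.Random(seed)
    # Random initial population: these are the "bad" starting melodies.
    population = [[rng.choice(NOTES) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection (assumed)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, length)  # one-point crossover (assumed)
            child = a[:cut] + b[cut:]
            # Replace each note with a random one with small probability.
            child = [rng.choice(NOTES) if rng.random() < mutation_rate else n
                     for n in child]
            children.append(child)
        population = parents + children  # parents survive, so the best is kept
    return max(population, key=score)


def toy_score(melody: Melody) -> float:
    """Hypothetical placeholder for the Bi-LSTM: rewards stepwise motion."""
    return float(sum(1 for x, y in zip(melody, melody[1:])
                     if abs(NOTES.index(x) - NOTES.index(y)) == 1))


best = evolve(toy_score)
print(" ".join(best), toy_score(best))
```

In the setting described above, `toy_score` would be replaced by a call to the trained scoring network; the GA loop itself does not need to know how the score is produced.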

As already mentioned, this phase provides the necessary training data (i.e., melodies) for the next phase. A Genetic algorithm has been chosen for this task because it can randomly generate a population of melodies (chromosomes) with a wide spectrum of qualities.
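As a rough illustration of what a randomly generated chromosome might look like, the sketch below draws random note tokens and writes them out as an ABC tune. The pitch set, the duration suffixes, the fixed 4/4 meter, and the function names are assumptions chosen for the example; the source does not specify the chromosome encoding, and bar lines are omitted for brevity.

```python
import random
from typing import List

# In ABC notation, lowercase letters are one octave above uppercase ones.
PITCHES = ["C", "D", "E", "F", "G", "A", "B", "c", "d", "e"]
# ABC length modifiers: "" is the unit length, "2" doubles it, "/2" halves it.
DURATIONS = ["", "2", "/2"]


def random_chromosome(length: int, rng: random.Random) -> List[str]:
    """Draw `length` random note tokens (hypothetical encoding)."""
    return [rng.choice(PITCHES) + rng.choice(DURATIONS) for _ in range(length)]


def to_abc(chromosome: List[str], title: str = "Generated melody",
           key: str = "C") -> str:
    """Wrap note tokens in a minimal ABC tune header."""
    header = ["X:1", f"T:{title}", "M:4/4", "L:1/8", f"K:{key}"]
    return "\n".join(header + [" ".join(chromosome)])


rng = random.Random(1)
population = [random_chromosome(8, rng) for _ in range(5)]
print(to_abc(population[0]))
```

A text encoding like this keeps each chromosome easy to mutate and recombine token by token, and the resulting string can be handed to any ABC player for the human scoring step.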

These melodies are generated in ABC notation. At each iteration, the best melodies in the population are selected based on their similarity to human-made melodies, which are taken from the manuscripts of the most famous musicians. Therefore, a database of melodies has to be used as part of the fitness function. The numbers of 2-gram, 3-gram, and 4-gram structures in the generated melodies that also exist in the database are computed. Then, the fitness function is calculated as,

where C_i is the i-th chromosome and N2, N3, and N4 are the numbers of 2-grams, 3-grams, and 4-grams, respectively.
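The counts N2, N3, and N4 can be computed along the following lines. This is a sketch under stated assumptions: melodies are treated as lists of note tokens, every occurrence of a matching n-gram is counted (rather than distinct n-grams only), and the function names are hypothetical. How the three counts are combined into a single fitness value is not shown here, since that depends on the formula in the paper.

```python
from typing import Dict, List, Sequence, Set, Tuple


def ngrams(seq: Sequence[str], n: int) -> List[Tuple[str, ...]]:
    """All contiguous n-grams of a token sequence, in order."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


def build_index(database: Sequence[Sequence[str]], n: int) -> Set[Tuple[str, ...]]:
    """Set of every n-gram that appears in any database melody."""
    return {g for melody in database for g in ngrams(melody, n)}


def ngram_counts(melody: Sequence[str],
                 database: Sequence[Sequence[str]]) -> Dict[int, int]:
    """Return {2: N2, 3: N3, 4: N4}: n-grams of `melody` found in `database`."""
    counts = {}
    for n in (2, 3, 4):
        index = build_index(database, n)
        counts[n] = sum(1 for g in ngrams(melody, n) if g in index)
    return counts


database = [list("CDEFG"), list("EGC")]  # toy stand-in for the melody database
melody = list("CDEG")
print(ngram_counts(melody, database))  # {2: 3, 3: 1, 4: 0}
```

In practice the database index would be built once and reused across generations, since it does not change while the GA runs.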

If the database is restricted to a specific genre, the generated music will be very similar to that genre as well. To make this str

…(Full text truncated)…

Reference

This content is AI-processed based on ArXiv data.