harmonic_analysis

Parameter Adjustment

This program has many adjustable parameters, covering analytical styles, types of machine learning models, model architectures, and hyper-parameters. All parameters and their available values are shown in the table below. Unless a row states otherwise, the first value listed is the default. The main.py script contains a few more parameters, but they are either not essential for the task or not fully tested yet. Please let me know if you want to experiment with them.

| Parameter | Values | Explanation |
| --- | --- | --- |
| --source (-s) | ISMIR2019 | The annotations used for training. Only ISMIR2019 has gone through the whole workflow, including partial manual modification and re-training, so it is the only value the code currently accepts. |
| --num_of_hidden_layer (-l) | 3, usually ranging from 2 to 5 | The number of hidden layers (has no effect with SVM). |
| --num_of_hidden_node (-n) | 300, usually ranging from 100 to 500 | The number of hidden nodes (has no effect with SVM). |
| --model (-m) | DNN (default); SVM also available | The type of model to use. |
| --pitch (-p) | pitch_class | The pitch representation used as features. pitch_class uses a 12-d pitch class vector for each sonority; pitch_class_4_voices uses a 12-d pitch class vector for each of the 4 voices. |
| --window (-w) | 1, usually ranging from 0 to 5 | The static window added as context for the DNN or SVM model. |
| --output (-o) | NCT_pitch_class | NCT_pitch_class uses a 12-d output vector specifying which pitch classes contain non-chord tones (NCTs). |
| --input (-i) | 3meter_NewOnset (default); 3meter, barebone, 2meter, and NewOnset also available | The input features used in addition to pitch classes. 2meter adds an on/off-beat feature; 3meter adds a 'strong beat, on/off beat' feature; NewOnset indicates whether the current slice has a real attack across the pitch classes/voices, and adds another 12-d vector to the input specifying which pitch classes are real attacks. |
| --predict (-pre) | Y (default); N also available | Whether to predict and output the results in musicXML. |
| --cross_validation (-c) | 1, ranging from 1 to 10 | The number of cross-validation folds. The default is 1 to keep the experiment's runtime short. |
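
To make the flags and defaults above concrete, here is a minimal sketch of a command-line parser that mirrors the table. It assumes Python's standard argparse module and is only an illustration: the function name build_parser is hypothetical, and the real main.py may define these options differently (and, as noted above, has additional ones).

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the parameter table (not the project's actual code)."""
    parser = argparse.ArgumentParser(description="Sketch of the documented options")
    # Only ISMIR2019 is currently accepted as the annotation source.
    parser.add_argument("-s", "--source", default="ISMIR2019", choices=["ISMIR2019"])
    # Network architecture; the table says these have no effect with SVM.
    parser.add_argument("-l", "--num_of_hidden_layer", type=int, default=3)
    parser.add_argument("-n", "--num_of_hidden_node", type=int, default=300)
    parser.add_argument("-m", "--model", default="DNN", choices=["DNN", "SVM"])
    parser.add_argument("-p", "--pitch", default="pitch_class",
                        choices=["pitch_class", "pitch_class_4_voices"])
    # Size of the static context window.
    parser.add_argument("-w", "--window", type=int, default=1)
    parser.add_argument("-o", "--output", default="NCT_pitch_class")
    # Left unrestricted here: the table lists several feature names for this option.
    parser.add_argument("-i", "--input", default="3meter_NewOnset")
    parser.add_argument("-pre", "--predict", default="Y", choices=["Y", "N"])
    parser.add_argument("-c", "--cross_validation", type=int, default=1)
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["-m", "SVM", "-w", "2"])
print(vars(args))
```

Any option left out of the argument list falls back to the default given in the table, so the call above changes only the model and the window size.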

Usage Example