Saturday, October 1, 2011

Components of Miyaesi Architecture

Input interfaces
Input data come into the system through the input interface. At this stage the user can input the desired music file in the common .WAV (Waveform Audio File) format. The system provides a file chooser to select the input music file. The input .WAV file may contain a stereo or mono music track. The input file size is not limited, but it is assumed that the user will input a music file of ordinary length. The user is not allowed to input multiple files into the system at the same time.
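As a rough sketch only, a Swing file chooser restricted to .WAV files could be wired up as below; the class and method names here are illustrative, not the actual Miyaesi code.

```java
import javax.swing.JFileChooser;
import javax.swing.filechooser.FileNameExtensionFilter;
import java.io.File;

public class WavFileChooser {
    /** Opens a chooser restricted to .wav files; returns the selection, or null if cancelled. */
    public static File chooseWavFile() {
        JFileChooser chooser = new JFileChooser();
        chooser.setFileFilter(new FileNameExtensionFilter("WAV audio files", "wav"));
        chooser.setMultiSelectionEnabled(false);   // only one input file at a time
        int result = chooser.showOpenDialog(null);
        return (result == JFileChooser.APPROVE_OPTION) ? chooser.getSelectedFile() : null;
    }
}
```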

WAV File Preprocessor Module

Data that come into the system via the input interface reach this module first. The input to this unit is a byte array of PCM data. The output is processed data: the signal is segmented, a window function is applied, and noise is removed to a certain degree.
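The exact preprocessing parameters are not described here; as a minimal sketch, segmentation and windowing of a mono sample array could look like the following, where the frame and hop sizes are illustrative assumptions.

```java
/** Splits a mono PCM signal into fixed-size frames and applies a Hann window to each frame. */
public class Preprocessor {
    public static double[][] segmentAndWindow(double[] samples, int frameSize, int hopSize) {
        int frameCount = samples.length < frameSize ? 0 : 1 + (samples.length - frameSize) / hopSize;
        double[][] frames = new double[frameCount][frameSize];
        for (int f = 0; f < frameCount; f++) {
            int start = f * hopSize;
            for (int i = 0; i < frameSize; i++) {
                double window = 0.5 * (1 - Math.cos(2 * Math.PI * i / (frameSize - 1))); // Hann window
                frames[f][i] = samples[start + i] * window;
            }
        }
        return frames;
    }
}
```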


Voice Remover Module

This module sits between the preprocessor module and the onset detector module in the pipeline. The input to the module is the preprocessed data, and the output is PCM data with its values changed so that the centered voice parts are removed. It uses some of the metadata that is extracted while reading the WAV file. This module is a simple one and processes the data in a single pass. The input data can skip this module if the user prefers.
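The removal algorithm itself is not spelled out here; a common way to attenuate center-panned vocals in a stereo track is to subtract one channel from the other, as in this illustrative sketch (it only works when the vocals are actually panned to the center).

```java
/** Attenuates center-panned vocals by subtracting the right channel from the left channel. */
public class VoiceRemover {
    // left and right are per-channel sample arrays of equal length
    public static double[] removeCenter(double[] left, double[] right) {
        double[] out = new double[left.length];
        for (int i = 0; i < left.length; i++) {
            out[i] = (left[i] - right[i]) * 0.5;  // material common to both channels cancels out
        }
        return out;
    }
}
```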

 

Onset detector Module

This module comes after the WAV file preprocessor module (or optionally after the voice remover module) in the pipeline, and before the frequency estimator module and the instrument classifier. The task of the onset detector is to define the boundaries of the notes by looking at the whole wave pattern represented by the PCM data. For this, it uses the meta information of the WAV file that is kept while reading it. Onsets are marked on the time line because the time line is common to all the modules. Other modules can then locate the marked onsets without sharing additional information. This keeps the module loosely coupled with the other modules and also makes processing the data at the later stages of the pipeline easy.
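The actual detection method is not described in this post; purely as an illustration, a very rough energy-based onset detector over windowed frames, reporting onsets on the common time line in seconds, might look like this.

```java
import java.util.ArrayList;
import java.util.List;

/** Rough energy-based onset detector: marks a frame as an onset when its
 *  energy rises sharply compared with the previous frame. */
public class OnsetDetector {
    public static List<Double> detectOnsets(double[][] frames, int hopSize,
                                            float sampleRate, double threshold) {
        List<Double> onsetTimes = new ArrayList<>();
        double previousEnergy = 0;
        for (int f = 0; f < frames.length; f++) {
            double energy = 0;
            for (double s : frames[f]) {
                energy += s * s;
            }
            if (energy - previousEnergy > threshold) {
                onsetTimes.add(f * hopSize / (double) sampleRate); // onset time in seconds
            }
            previousEnergy = energy;
        }
        return onsetTimes;
    }
}
```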

MIDI File Creator Module

At the end of the wave file analysis process, MIDI files are created by the Miyaesi system. The user can check the accuracy of the generated notation by playing these MIDI files, and the MIDI files can be saved for future use.
MIDI (Musical Instrument Digital Interface) is an industry-standard protocol that enables electronic musical instruments to communicate and synchronize with each other. MIDI is used in a wide range of devices such as synthesizers, drum machines, computers and other electronic equipment like MIDI controllers, sound cards and samplers. MIDI differs from analog devices in that it does not transmit an audio signal. Instead it sends event messages with data about
·         musical notation,
·         pitch and intensity,
·         control signals for parameters such as
o   volume
o   vibrato
o   panning
o   cues
·         clock signals to set the tempo.
The Java Sound API can be used for controlling audio playback, audio capture, MIDI synthesis, and basic MIDI sequencing. In our system the Java Sound API is used to create MIDI files and to play back the MIDI files created after the wave file analysis.
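A minimal example of writing a MIDI file with the Java Sound API (javax.sound.midi) is shown below; the note, timing values and file name are placeholders rather than what Miyaesi actually emits.

```java
import javax.sound.midi.*;
import java.io.File;

/** Writes a single note to a Type 1 standard MIDI file using the Java Sound API. */
public class MidiFileCreatorExample {
    public static void main(String[] args) throws Exception {
        int ppq = 480;                                   // ticks per quarter note
        Sequence sequence = new Sequence(Sequence.PPQ, ppq);
        Track track = sequence.createTrack();

        int channel = 0, note = 60, velocity = 90;       // middle C
        track.add(new MidiEvent(new ShortMessage(ShortMessage.NOTE_ON, channel, note, velocity), 0));
        track.add(new MidiEvent(new ShortMessage(ShortMessage.NOTE_OFF, channel, note, 0), ppq));

        MidiSystem.write(sequence, 1, new File("output.mid")); // file type 1 = multi-track
    }
}
```

Playback of the generated file can be done through the same API, for example with MidiSystem.getSequencer().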

 

Instrument Classifier Module

The instrument classifier comes after the onset detector module and before the notes editor module within the pipeline. This module directly uses the onsets output by the onset detector module and processes each note separately. The main task of this module is identifying the instrument played within each note. The classification is based on spectral and temporal features of the instrument. After extracting these features, the feature vector is sent to a neural network that is trained to predict the playing instrument. The identified instrument details are finally sent to the notes editor module.
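The exact feature set and network layout are not listed in this post; as an assumption-heavy sketch, two commonly used descriptors, the spectral centroid and the zero-crossing rate, could be computed per note segment before feeding a classifier.

```java
/** Two simple features that could feed an instrument classifier:
 *  spectral centroid (frequency domain) and zero-crossing rate (time domain). */
public class FeatureExtractor {
    // magnitudes[k] is the magnitude of frequency bin k for one analysed note segment
    public static double spectralCentroid(double[] magnitudes, float sampleRate) {
        double weightedSum = 0, total = 0;
        for (int k = 0; k < magnitudes.length; k++) {
            double binFreq = k * sampleRate / (2.0 * magnitudes.length); // bin index -> Hz
            weightedSum += binFreq * magnitudes[k];
            total += magnitudes[k];
        }
        return total == 0 ? 0 : weightedSum / total;
    }

    public static double zeroCrossingRate(double[] samples) {
        int crossings = 0;
        for (int i = 1; i < samples.length; i++) {
            if ((samples[i - 1] >= 0) != (samples[i] >= 0)) crossings++;
        }
        return crossings / (double) samples.length;
    }
}
```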

 Notes Editor Module

This component comes right after the instrument classifier and the frequency estimator, and the outputs of both of those modules are its inputs. The main task of the component is to provide facilities for the user to edit the detected notes and instruments if the user is not satisfied with them. This module is therefore semi-automatic and optional. The output is the data as changed by the user, from which the sheet and the MIDI file are then generated. This component uses the IO modules, which are utility modules.

Notation Sheet Generator Module

This module comes after the MIDI file creator and takes the output of the MIDI file creator as its input. The component uses the MIDI data to generate the notation sheet. The user can see the generated notation sheet in a JPanel.
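As a minimal sketch only, and assuming the generator renders the sheet into a BufferedImage, the image could be shown inside a scrollable JPanel like this.

```java
import javax.swing.*;
import java.awt.*;
import java.awt.image.BufferedImage;

/** Displays a rendered notation-sheet image inside a scrollable panel. */
public class SheetPanel extends JPanel {
    private final BufferedImage sheetImage;   // assumed output of the sheet generator

    public SheetPanel(BufferedImage sheetImage) {
        this.sheetImage = sheetImage;
        setPreferredSize(new Dimension(sheetImage.getWidth(), sheetImage.getHeight()));
    }

    @Override
    protected void paintComponent(Graphics g) {
        super.paintComponent(g);
        g.drawImage(sheetImage, 0, 0, null);
    }

    public static void showSheet(BufferedImage sheetImage) {
        JFrame frame = new JFrame("Notation Sheet");
        frame.add(new JScrollPane(new SheetPanel(sheetImage)));  // scrolling for long sheets
        frame.setSize(800, 600);
        frame.setDefaultCloseOperation(JFrame.DISPOSE_ON_CLOSE);
        frame.setVisible(true);
    }
}
```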

 

Output interfaces

There are two main outputs that come through the output interfaces of this system. The final outcome of the system is the generated notation sheet, which is printable; it is the output of the notation sheet generator module. The notation sheet also displays the time signature and the key signature. If the generated sheet is long, the user can scroll through it, and the notation sheet can be saved as an image. The user can also play back the music file along with the sheet. In addition, the MIDI file is the other output of the system and can be saved to disk; this output comes from the MIDI file creator module.
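As a hedged sketch of the image-saving and printing outputs, and again assuming the sheet is available as a BufferedImage, the standard ImageIO and PrinterJob APIs could be used roughly as follows.

```java
import javax.imageio.ImageIO;
import java.awt.Graphics;
import java.awt.image.BufferedImage;
import java.awt.print.PageFormat;
import java.awt.print.Printable;
import java.awt.print.PrinterException;
import java.awt.print.PrinterJob;
import java.io.File;
import java.io.IOException;

/** Saves the rendered sheet as a PNG and sends it to a printer (single-page sketch). */
public class SheetOutput {
    public static void saveAsImage(BufferedImage sheet, File target) throws IOException {
        ImageIO.write(sheet, "png", target);
    }

    public static void printSheet(BufferedImage sheet) throws PrinterException {
        PrinterJob job = PrinterJob.getPrinterJob();
        job.setPrintable((Graphics g, PageFormat pf, int pageIndex) -> {
            if (pageIndex > 0) return Printable.NO_SUCH_PAGE;
            g.translate((int) pf.getImageableX(), (int) pf.getImageableY());
            g.drawImage(sheet, 0, 0, null);
            return Printable.PAGE_EXISTS;
        });
        if (job.printDialog()) {
            job.print();
        }
    }
}
```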
There are many utility modules that support the processing of data stage after stage as described in previous sections. The following modules are reused throughout the system architecture.
·         Print Utilities Module
·         Sound IO module containing low level operations on PCM data
·         Custom file filter modules and table modules that support the GUI
·         Modules that support data writing to the disk          
These modules stay separate from the pipeline architecture and share information with modules in the pipeline where necessary. Some of them operate on the results of modules and some operate when inputs are received by the modules. On a few occasions they also help to process the data in the middle of a module's main function.

System Architecture of Miyaesi

Pipeline architecture was selected as the most suitable architecture for the Miyaesi system because the stages of music transcription are naturally independent of each other and, most of the time, one stage follows the other. Meta information sharing between the components can be done safely, so the loose coupling between the modules is preserved.


As the figure above illustrates, the pipeline forks into two branches after onset detection is done. After that stage the frequency estimating module and the instrument classification module are present. The frequency detector transforms time domain data into the frequency domain and processes it there. For classification of instruments, however, frequency domain data cannot be used; time domain data is again necessary. Thus, after onset detection is performed the pipeline forks: one branch is for frequency processing and the other branch is for instrument detection. The results are joined again at the MIDI file creator module because it requires the output of both processing channels as its input.
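The estimation method itself is not detailed here; purely as an illustration of the frequency-processing branch, the dominant frequency of one windowed frame could be picked from a plain DFT magnitude spectrum (a real implementation would use an FFT for speed).

```java
/** Finds the dominant frequency of one windowed frame using a naive DFT peak pick. */
public class FrequencyEstimator {
    public static double dominantFrequency(double[] frame, float sampleRate) {
        int n = frame.length;
        double bestMagnitude = -1;
        int bestBin = 0;
        for (int k = 1; k < n / 2; k++) {                // skip the DC bin
            double re = 0, im = 0;
            for (int t = 0; t < n; t++) {
                double angle = 2 * Math.PI * k * t / n;
                re += frame[t] * Math.cos(angle);
                im -= frame[t] * Math.sin(angle);
            }
            double magnitude = Math.hypot(re, im);
            if (magnitude > bestMagnitude) {
                bestMagnitude = magnitude;
                bestBin = k;
            }
        }
        return bestBin * sampleRate / (double) n;        // bin index converted to Hz
    }
}
```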

Each module is cohesive, with one assigned primary task. Sometimes data has to be shared explicitly between modules, such as the meta information about the input file. The components are independent of each other and can be developed independently. At the high level there are no modules that inherit from other modules.

There is a set of utility modules that stay separate and are reused all over the system architecture. They usually appear at the edge of the processing within a particular module, to write data to disk, send it to the printer, and so on.

The input to the overall system is a music recording in the WAV file format. This format was chosen because WAV files contain the raw data from a recording. When a recording is converted to other file formats like .MP3, the conversion generally loses some information that might be important for music transcription.
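A minimal sketch of reading such a WAV file into a raw PCM byte array with the Java Sound API, while keeping the format details as meta information for the later modules, could look like this (the logging is illustrative).

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import java.io.File;

/** Reads a WAV file into a raw PCM byte array and reports its format meta information. */
public class WavReader {
    public static byte[] readPcm(File wavFile) throws Exception {
        try (AudioInputStream in = AudioSystem.getAudioInputStream(wavFile)) {
            AudioFormat format = in.getFormat();          // sample rate, channels, bits per sample, ...
            System.out.println("Sample rate: " + format.getSampleRate()
                    + " Hz, channels: " + format.getChannels());
            return in.readAllBytes();                     // raw PCM data passed down the pipeline
        }
    }
}
```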

The final outcome of the system is the generated notation sheet, which is printable. In addition, the MIDI file can be saved to disk as well.

Saturday, April 16, 2011

Features of Automatic Music Transcriber

The following are the important features of Miyaesi, the Automatic Music Transcriber.

• Identifying a main instrumental sound for generating notation

• Capability of music score editing

• Capability of printing music sheet

• Displaying key signature
-Designating notes that are to be consistently played one semitone higher or lower than the equivalent natural notes


• Displaying time signature
-Notational convention used in Western musical notation to specify how many beats are in each measure and which note value constitutes one beat

Sunday, April 3, 2011

Introduction



For a musician, music notation is very important. Music notation is used to represent the melody of a song or any music piece on a sheet of paper, and it provides a way of communication between two musicians. A person who has created a melody can write it down on paper using a music notation system and send it to another person. That person can then play the melody on any kind of instrument just by reading the notation received.

For someone new to the music field, music notations are very valuable. If his or her talent for identifying notes by ear has not yet developed, then music notations are the easiest way for him or her to learn new songs. But the problem we face is the lack of available music notations. It is very easy to find song files, but finding the notes for a given song is not an easy task. Some web applications are currently available on the internet through which the user can search for the notes of a song. But all of these applications use some kind of storage that holds the notes of individual songs, with a search mechanism to find the requested one, and not every requested song is in those databases. Some software exists only to generate notes for a particular music instrument, and even that software is not open source or free of charge.

The summary of this problem is that there is no real-time, software-based tool that gives the musical notation of any given song. Thus the final goal of this project is to come up with a mechanism to filter out the notes of individual musical instruments from an original music track. The problem is not easy to solve, since extracting individual basic components with various notes, various harmonics and various signal patterns is very complex. A lot of statistical knowledge and artificial neural network concepts are needed to solve it.

So the purpose of this project is to develop open source software that identifies notes played by different music instruments together and generates the music notation. It would be useful for users to classify and analyze music. This will be a software application that allows users to get printouts of the notes. Users will be able to edit the notes being printed and test them back on their own. The software will also be pluggable into music libraries and media players.