Automatic Music Classification


jMIR is an open-source software suite implemented in Java for use in music information retrieval (MIR) research. It can be used to study music in both audio and symbolic formats as well as mine cultural information from the web and manage music collections. jMIR includes software for extracting features, applying machine learning algorithms and analyzing metadata.

The primary emphasis of jMIR is on providing software for general research in automatic music classification and similarity analysis. The main goals of the project are as follows:

  • Make sophisticated pattern recognition technologies accessible to music researchers with both technical and non-technical backgrounds.
  • Eliminate duplication of effort.
  • Increase cooperation and communication between research groups.
  • Facilitate iterative development and sharing of new MIR technologies.
  • Facilitate objective comparisons of algorithms.
  • Facilitate research combining high-level, low-level and cultural musical features (i.e. symbolic, audio and web-mined features).

To meet these goals, all aspects of jMIR are open source and distributed free under a GNU General Public License. The software is well documented and includes GUIs to increase general usability. Special emphasis has been placed on software architectures that facilitate extensibility for technically inclined users who wish to modify or add to the software.

jMIR was funded by a grant from the Social Sciences and Humanities Research Council of Canada.

Each of the components comprising the jMIR software suite may be used entirely separately or as an integrated whole. The components communicate with each other using files in either ACE XML or Weka ARFF formats. The components are as follows:

Standardized File Format

  • ACE XML: A standardized set of file formats for representing feature values, feature metadata, instance labels and class ontologies. Work on new and significantly extended ACE XML 2.0 versions of these file formats is also ongoing. More details are available on the ACE XML Development Page.

Data Mining and Machine Learning

  • ACE: Pattern recognition software that utilizes meta-learning. Evaluates, trains and uses a variety of classifiers, classifier ensembles and dimensionality reduction algorithms based on the needs of each particular research problem.

Feature Extraction

  • jAudio: Software for extracting low- and high-level features from audio recordings.
  • jSymbolic: Software for extracting high-level features from MIDI recordings.
  • jWebMiner: Software for extracting cultural features from web text.

Data and Metadata

  • jMusicMetaManager: Software for profiling music collections and detecting metadata errors and redundancies.
  • Codaich: A labeled database of MP3s for training, testing and evaluating MIR systems.


  • Bodhidharma: MIREX 2005-winning software for classifying MIDI recordings by genre. The ancestor of ACE and jSymbolic.
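
The components listed above exchange data through files in ACE XML or Weka ARFF format. As a rough illustration of the ARFF side of that exchange, the sketch below serializes a few numeric feature vectors and their class labels into ARFF text. It is a minimal sketch and not code from jMIR: the class name ArffSketch, the method toArff, the feature names and the genre labels are all hypothetical, and it assumes that names and labels contain no spaces or commas, so no quoting is needed.

```java
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper, not part of jMIR: builds a minimal Weka ARFF document
// from numeric feature vectors and one nominal class label per instance.
// Assumes names and labels contain no spaces or commas (no quoting is done).
class ArffSketch {

    static String toArff(String relation, List<String> featureNames,
                         List<String> classLabels, List<double[]> rows,
                         List<String> rowLabels) {
        if (rows.size() != rowLabels.size()) {
            throw new IllegalArgumentException("one label is required per row");
        }
        StringBuilder out = new StringBuilder();

        // Header: relation name, one numeric attribute per feature,
        // then a nominal attribute listing the permitted class labels.
        out.append("@relation ").append(relation).append("\n\n");
        for (String name : featureNames) {
            out.append("@attribute ").append(name).append(" numeric\n");
        }
        StringJoiner classes = new StringJoiner(",", "{", "}");
        for (String label : classLabels) {
            classes.add(label);
        }
        out.append("@attribute class ").append(classes).append("\n\n");

        // Data section: one comma-separated line per instance, label last.
        out.append("@data\n");
        for (int i = 0; i < rows.size(); i++) {
            double[] row = rows.get(i);
            if (row.length != featureNames.size()) {
                throw new IllegalArgumentException("row " + i + " has the wrong length");
            }
            StringJoiner line = new StringJoiner(",");
            for (double value : row) {
                line.add(String.valueOf(value));
            }
            line.add(rowLabels.get(i));
            out.append(line).append("\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Illustrative feature names and values only; these are not
        // necessarily features that jAudio or jSymbolic extract.
        String arff = toArff(
                "genre_demo",
                List.of("spectral_centroid", "note_density"),
                List.of("jazz", "rock"),
                List.of(new double[] {0.5, 12.0}, new double[] {0.25, 30.0}),
                List.of("jazz", "rock"));
        System.out.print(arff);
    }
}
```

A file written this way could in principle be opened by any tool that reads ARFF. The ACE XML formats instead represent feature values, feature metadata, instance labels and class ontologies as a set of separate file formats, which this sketch does not attempt to reproduce.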


The Networked Environment for Music Analysis (NEMA) project is a multinational and multidisciplinary cyber-infrastructure project for music information processing that builds upon and extends the music information retrieval research being conducted by the International Music Information Retrieval Systems Evaluation Laboratory (IMIRSEL) at the University of Illinois at Urbana-Champaign (UIUC). NEMA brings together the collective projects and the associated tools of six world leaders in the domains of music information retrieval (MIR), computational musicology (CM) and e-humanities research. The NEMA team aims to create an open and extensible web services-based resource framework that facilitates the integration of music data and analytic/evaluative tools that can be used by the global MIR and CM research and education communities on a basis independent of time or location. To help achieve this goal, the NEMA team will be working co-operatively with the UIUC-based and Mellon-funded Software Environment for the Advancement of Scholarly Research (SEASR) project to exploit SEASR’s expertise and technologies in the domains of data mining and web services-based resource framework development.

The NEMA work at McGill is currently focused on expanding the ACE XML file formats and developing software tools for parsing, writing and processing them, but jMIR tools in general are being adapted for the project as well.

NEMA is being funded through a generous grant from the Scholarly Communications program of the Andrew W. Mellon Foundation.

Available at Cory McKay’s publication page.


Ichiro Fujinaga

Graduate Students

Undergraduate Students

  • Jessica Thompson