CATALOG DESCRIPTION: Computational auditory scene analysis (CASA) is the study of how a computational system can organize sound into perceptually meaningful elements. Problems in this field include source separation (splitting audio mixtures into individual sounds), source identification (labeling a source sound), and streaming (finding which sounds belong to a single explanation/event). This course is an advanced graduate course covering current research in the field.
REQUIRED TEXTBOOK: Advanced research papers in the field.
REFERENCE TEXTBOOKS: (not required purchases) Excerpts from the following texts may be provided, but the focus will be on research papers published in the field.
DeLiang Wang and Guy J. Brown, Computational Auditory Scene Analysis: Principles, Algorithms, and Applications
Albert S. Bregman, Auditory Scene Analysis: The Perceptual Organization of Sound
COURSE COORDINATOR: Bryan Pardo
COURSE GOALS: The goal of this course is to familiarize graduate students (and advanced undergraduates) with the current state of the art in machine perception of audio. Students will read recently published papers in the field and become well informed on at least one sub-field within machine perception of audio. The class will also explore the basics of audio perception, including the relationship between pitch and frequency and the difficulties inherent in auditory scene analysis by humans and machines. Basic classification and sequence alignment techniques will also be introduced.
PREREQUISITES: An understanding of signal processing, including topics such as Fourier transforms and filter design, is required. Knowledge of machine learning techniques (Markov models, support vector machines, etc.) is helpful but not required.
DETAILED COURSE TOPICS:
What follows is an example syllabus. As topics of current interest in the field shift, course content will vary to reflect research trends.
Week 1: Perception of periodic complex sounds, auditory filters, critical bands
Week 2: Representations of audio: spectrograms, cepstrograms, atomic representations
Week 3: Audio fingerprinting
Week 4: Pitch tracking
Week 5: Melody matching
Week 6: Source identification
Week 7: Source identification
Week 8: Source separation
Week 9: Source separation
Week 10: Streaming
GRADING:
Presentation on topic (30%)
Research paper synopses (30%)
Report on research area (30%)
Class participation (10%)
COURSE OBJECTIVES: When a student completes this course, s/he should:
have a general understanding of the current state of the art in machine perception of audio.
be able to distill large amounts of research into coherent summaries.
be able to think critically about work in the field.