---
title: Music as a Material for Information Science Education and Research
tags: 
author: [Kitahara Lab](https://image.docswell.com/user/kthrlab)
site: [Docswell](https://www.docswell.com/)
thumbnail: https://bcdn.docswell.com/page/L73WV49Z75.jpg?width=480
description: Seminar at Academia Sinica
published: June 04, 26
canonical: https://image.docswell.com/s/kthrlab/ZN7VWL-2026-06-04-155201
---
# Page. 1

![Page Image](https://bcdn.docswell.com/page/L73WV49Z75.jpg)

Music as a Material for
Information Science
Education and Research
Tetsuro Kitahara
Professor, Nihon University, Japan
Specially Appointed Professor, Shiga University, Japan
Visiting Scholar, Academia Sinica, Taiwan (Apr. 2026—Jan. 2027)


# Page. 2

![Page Image](https://bcdn.docswell.com/page/87DK8QG4JG.jpg)

Self introduction
• Name:
北原 鉄朗 (きたはら てつろう / KITAHARA Tetsuro)
• Affiliation: 日本大学 (Nihon University)
文理学部 (College of Humanities and Sciences)
情報科学科 (Dept. of Information Science)
• Career:
PhD from Kyoto Univ. (Musical instrument recognition)
PostDoc at Kwansei Gakuin Univ. (Music generation)
Assist., Assoc. & Full Professor at Nihon Univ.
• Interests: All topics related to music computing
(in particular, symbolic music generation technologies)


# Page. 3

![Page Image](https://bcdn.docswell.com/page/VJPK8L3VE8.jpg)

Kitahara Lab at Nihon University
• Established in 2010
• Catch-phrase: Technology Makes Music More Fun
• The lab typically consists of:
1 faculty member, 0--2 part-time staff members
0--1 PhD students, 0--6 MSc students, 14--18 Bachelor's students
• Almost all students are engaged in music computing


# Page. 4

![Page Image](https://bcdn.docswell.com/page/2EVVNQ4REQ.jpg)

Today’s talk
• Why we focus on music
• Examples of music-related information science research
in our laboratory
• Attempts to teach machine learning through music


# Page. 5

![Page Image](https://bcdn.docswell.com/page/57GLKW16EL.jpg)

Why we focus on music


# Page. 6

![Page Image](https://bcdn.docswell.com/page/4EQYN3D2JP.jpg)

Music
Music is familiar
• A form of art/entertainment
• Large business market
• Popular as a hobby
(incl. playing instruments)
• Everyone learns it at school
(to some extent)
Everyone has personal
musical experience
Music has various aspects
• Signal
• Spectrogram
• A sequence of notes

# Page. 7

![Page Image](https://bcdn.docswell.com/page/KJ4WG1ZP71.jpg)

What we should teach
• Data representation
• How we represent various types of content digitally
• Programming
• How we implement computational processing as executable programs
• Signal processing
• How we extract meaningful information from signals
• Machine learning
• How we build intelligent systems from data
• Human-computer interaction
• How we design smooth interactions between humans and computers


# Page. 8

![Page Image](https://bcdn.docswell.com/page/LE1YDGRX7G.jpg)

Various representations of music
Signal
Symbolic time series
2D image
Event series
Hierarchical
(note_on, 60), 0.50,
(note_on, 64), 0.00,
(note_on, 67), 0.50,
(note_off, 60), 0.00,
(note_on, 62), 0.50,
We can learn programming / machine learning
for various types of data
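
As a small illustration of moving between these representations, the sketch below converts the event series shown on this slide into a list of notes. It assumes that the number after each event is the waiting time until the next event and that notes still sounding at the end are closed at the final time; `events_to_notes` is a hypothetical helper, not code from the lab.

```python
from typing import List, Tuple, Union

Event = Tuple[str, int]  # ("note_on" or "note_off", MIDI note number)


def events_to_notes(stream: List[Union[Event, float]]) -> List[Tuple[int, float, float]]:
    """Convert [event, wait, event, wait, ...] into (pitch, onset, duration) triples."""
    now = 0.0
    started = {}  # pitch -> onset time of the note that is currently sounding
    notes = []
    for item in stream:
        if isinstance(item, tuple):
            kind, pitch = item
            if kind == "note_on":
                started[pitch] = now
            elif kind == "note_off" and pitch in started:
                onset = started.pop(pitch)
                notes.append((pitch, onset, now - onset))
        else:
            now += float(item)  # a bare number advances the clock
    # Assumption: notes without a note_off end at the final time.
    for pitch, onset in started.items():
        notes.append((pitch, onset, now - onset))
    return sorted(notes, key=lambda n: (n[1], n[0]))


# The event series from the slide.
stream = [("note_on", 60), 0.50, ("note_on", 64), 0.00, ("note_on", 67), 0.50,
          ("note_off", 60), 0.00, ("note_on", 62), 0.50]
notes = events_to_notes(stream)
print(notes)
```

The same note list could then be rendered as a symbolic time series or a 2D pianoroll image, which is what makes one piece of music usable for several kinds of exercises.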


# Page. 9

![Page Image](https://bcdn.docswell.com/page/GEWGYK1KJ2.jpg)

Other reasons
Everyone has musical experience
• Listening
• Playing an instrument as a hobby
• Club activity at school
• Music classes at school
• etc.
Easy to find research topics
based on personal experience
Music is real-time interaction
• For some applications,
real-time processing is mandatory
Good material for exercising
real-time, low-latency,
multi-threaded processing


# Page. 10

![Page Image](https://bcdn.docswell.com/page/47ZLXZPNJ3.jpg)

Examples of music-related inf. sci. research
in our laboratory


# Page. 11

![Page Image](https://bcdn.docswell.com/page/YJ6W4ZM9JV.jpg)

Typical process of Bachelor thesis projects
Grade 3, Semester 1
• Join the lab
• Discuss personal interests in research
• Decide the research topic roughly
Grade 3, Semester 2
• Study basic knowledge (e.g. basics of machine learning)
Grade 4, Semester 1
• Decide the details of the research topic
• Start preliminary analysis of data related to the topic
• Start developing a system, model, etc.
Grade 4, Semester 2
• Complete developing the system, model, etc.
• Conduct experiments
• Write a thesis


# Page. 12

![Page Image](https://bcdn.docswell.com/page/GJ5MQWZDJ4.jpg)

Research topics
Music generation
• Four-part harmonization (2014) [Symbolic]
• Guitar tablature generation (2025) [Symbolic]
• Drum loop morphing (2023) [Audio]
Music analysis
• Phrase tendencies of a particular bassist (2017) [Symbolic]
Music interaction
• Drum velocity control (2024) [Symbolic]
• Drawing-based improvisation system (2023) [Symbolic]


# Page. 13

![Page Image](https://bcdn.docswell.com/page/9E29PQPM7R.jpg)

Music generation
• Four-part harmonization (2014)
• Guitar tablature generation (2025)
• Drum loop morphing (2023)


# Page. 14

![Page Image](https://bcdn.docswell.com/page/D7Y45W5PEM.jpg)

Case 1
[S. Suzuki & T. Kitahara, JNMR, 2014]
Four-part harmonization
Difficulty
We have to consider both continuity and simultaneity
Model
Learning-based
- Neural net (Hild ’91)
- HMM (Allen ’05)
- Weighted finite transducer (Buys ’12)
Non-learning-based
- Expert system (Ebcioglu ’90)
- Constraint satisfaction problem (Pachet ’98)
- GA (Phon ’99)


# Page. 15

![Page Image](https://bcdn.docswell.com/page/VENYN9NMJ8.jpg)

Problem in chord nodes
Most existing studies use nodes representing
chords or harmonic functions
(Figure: example with chord symbols C, Am, G and note names)
Practically, using chord nodes is not easy
• If chord symbols distinguish voicings (e.g. C6 vs. C6 on G): too many elements
Difficult to train models with a limited amount of data
• If not (e.g. C, Am, Am7): too ambiguous
One symbol corresponds to various sounds
Is it better not to use chord nodes?


# Page. 16

![Page Image](https://bcdn.docswell.com/page/Y79PR2RWE3.jpg)

Model
Determined before inference at time i


# Page. 17

![Page Image](https://bcdn.docswell.com/page/G78DW5WR7D.jpg)

Training data
• 254 hymnal four-part melodies
• Transposed to C major
Example
(Figure labels: Input (soprano melody), Chord model, Non-Chord model)


# Page. 18

![Page Image](https://bcdn.docswell.com/page/L7LMNYN2JR.jpg)

Case 2
Guitar tablature generation
[S. Sakai et al. SMC 2024]
Motivation
In finger-style solo guitar,
a player often plays both a melody
and chords on a single guitar
Difficult to find how to play both
(within physical restrictions)
Our goal
Automatically generate a tablature
for playing both a melody and chords
from a given lead sheet
Input: lead sheet
Output: tablature
Includes chord voicings playable with the melody


# Page. 19

![Page Image](https://bcdn.docswell.com/page/4EMYXNX9EW.jpg)

What’s the difficulty
Many possibilities of chord voicings
Must find physically playable ones
together with the melody
Example: Dm7 with A (melody)
(Figure: fretboard diagrams of a playable and a non-playable fingering)
Key idea for solution
Search the minimal cost state transitions
(with an HMM-like idea)
state = fingering form on the fretboard
cost = performing difficulty
To get easily playable tablatures
Introduce typical forms
F’s typical form
=(1, 1, 2, 3, 3, 1)
C’s typical form
=(0, 1, 0, 2, 3, -1)
States are restricted to typical forms
and their modified forms


# Page. 20

![Page Image](https://bcdn.docswell.com/page/PER9NDN9J9.jpg)

Basic formulation
Input: $\{(x_1, c_1), (x_2, c_2), \dots, (x_N, c_N)\}$
$x_n$: melody note, $c_n$: chord
Output (state): $Q = \{q_1, q_2, \dots, q_N\}$
$q_n$: fingering form (6-dim vec)
Minimize:
$C(Q) = C(q_1) + C((x_1, c_1) \mid q_1) + \dots + C(q_N \mid q_{N-1}) + C((x_N, c_N) \mid q_N)$
(initial cost, then a transition cost and an emission cost for each step)
State definition
Highest note: melody note
Lowest note: root note of the chord
Example: (0, 1, 0, 2, 3, -1)
Typical forms and their modified forms are
added to the state set
3 types of costs
Initial cost $C(q_1)$: neck-side positions are preferred
Transition cost $C(q_n \mid q_{n-1})$:
smaller position changes are better
Emission cost $C((x_n, c_n) \mid q_n)$:
melody note and chord tones must be emitted
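
The minimization above can be sketched as a Viterbi-style dynamic program over fingering forms. This is only an illustration under stated assumptions, not the implementation from the paper: the concrete cost functions (hand position taken as the lowest fretted fret, an all-or-nothing emission cost), the chord table, and the two extra forms beyond the typical C and F forms are all made up for the example. Forms are read as frets on strings 1 to 6, with -1 meaning a muted string.

```python
import math

# MIDI pitches of open strings 1 (high E) to 6 (low E), standard tuning.
OPEN_STRINGS = (64, 59, 55, 50, 45, 40)
# Pitch classes of chord tones; the first entry is the root (illustrative table).
CHORDS = {"C": (0, 4, 7), "F": (5, 9, 0)}


def sounding(form):
    """Pitches produced by a fingering form; -1 marks a muted string."""
    return [o + f for o, f in zip(OPEN_STRINGS, form) if f >= 0]


def position(form):
    """Hand position, taken here as the lowest fretted (non-open) fret."""
    fretted = [f for f in form if f > 0]
    return min(fretted) if fretted else 0


def initial_cost(q):
    return float(position(q))  # assumption: lower positions are cheaper


def transition_cost(prev, q):
    return float(abs(position(q) - position(prev)))


def emission_cost(x, c, q):
    """0 if q has melody note x on top, the chord root in the bass, and only
    chord tones otherwise; infinite if any of these fails."""
    ps = sounding(q)
    tones = CHORDS[c]
    ok = (max(ps) == x and min(ps) % 12 == tones[0]
          and all(p % 12 in tones for p in ps if p != x))
    return 0.0 if ok else math.inf


def best_fingering(lead_sheet, forms):
    """Viterbi-style search for the minimal-cost sequence of forms."""
    x, c = lead_sheet[0]
    cost = [initial_cost(q) + emission_cost(x, c, q) for q in forms]
    back = []
    for x, c in lead_sheet[1:]:
        new_cost, ptr = [], []
        for q in forms:
            j = min(range(len(forms)),
                    key=lambda k: cost[k] + transition_cost(forms[k], q))
            ptr.append(j)
            new_cost.append(cost[j] + transition_cost(forms[j], q)
                            + emission_cost(x, c, q))
        cost = new_cost
        back.append(ptr)
    end = min(range(len(forms)), key=lambda k: cost[k])
    if math.isinf(cost[end]):
        raise ValueError("no playable sequence in the given state set")
    path = [end]
    for ptr in reversed(back):  # follow back-pointers from the last step
        path.append(ptr[path[-1]])
    path.reverse()
    return [forms[k] for k in path], cost[end]


forms = [
    (0, 1, 0, 2, 3, -1),    # C's typical form (from the slide)
    (3, 1, 0, 2, 3, -1),    # hypothetical modified C form with G on top
    (1, 1, 2, 3, 3, 1),     # F's typical form (from the slide)
    (-1, 5, 5, 5, 3, -1),   # hypothetical higher-position C form with E on top
]
lead_sheet = [(64, "C"), (65, "F"), (67, "C")]  # (melody note, chord)
path, total = best_fingering(lead_sheet, forms)
print(path, total)
```

In this toy example the open C form wins over the higher-position C form for the first chord because both the initial cost and the following transition to the F form are smaller.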


# Page. 21

![Page Image](https://bcdn.docswell.com/page/P7XQN1N3EX.jpg)

Example
Evaluation
Evaluator: one professional classical guitarist
Voicing richness
should depend on
metrical positions
Many simultaneous notes


# Page. 22

![Page Image](https://bcdn.docswell.com/page/37K9N2NN7D.jpg)

Case 3
Drum loop morphing
[M. Kawahara et al. CMMR 2023 (demo)]
Motivation
Loop sequencers need many sound loops
to enable users to compose a variety of music
We focus on morphing
as a method for generating new loops
(New loop) = α × (Loop A) + (1 – α) × (Loop B)
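
The morphing formula is a plain linear interpolation, sketched below on lists of numbers. What the vectors stand for is an assumption here: with a VAE they would most plausibly be latent codes of the two loops, decoded afterwards into a spectrogram, but the slide does not spell this out. `morph` is a hypothetical helper name.

```python
def morph(loop_a, loop_b, alpha):
    """Element-wise alpha * A + (1 - alpha) * B for two equal-length vectors."""
    if len(loop_a) != len(loop_b):
        raise ValueError("both loops must have the same dimensionality")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(loop_a, loop_b)]


# Toy vectors standing in for the representations of Loop A and Loop B.
loop_a = [1.0, 0.0]
loop_b = [0.0, 1.0]
new_loop = morph(loop_a, loop_b, 0.25)
print(new_loop)
```

With α = 1 the result equals Loop A and with α = 0 it equals Loop B; values in between give gradual blends.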


# Page. 23

![Page Image](https://bcdn.docswell.com/page/LJ3WV4VZJ5.jpg)

VAE-based model
Convolution
Deconvolution
Loop A
Spectrogram
New loop
Loop B
Dataset
224 loops taken from “Techno & Trance” of “Sound PooL”
(Drums, 2-bar, BPM=135)


# Page. 24

![Page Image](https://bcdn.docswell.com/page/8JDK8Q84EG.jpg)

Example
(Audio examples: Loop A, Loop B, generated loop)
Subjective evaluation
1. Listen to Loops X & Y
(One is generated; the other is a reconstruction of an existing one)
2. Answer which is ML-generated
Ratio of correctly answered participants
Mean: 0.374
SD: 0.180
They could not distinguish generated loops from existing ones


# Page. 25

![Page Image](https://bcdn.docswell.com/page/VEPK8L8V78.jpg)

What they learned through the research
Four-part harmonization
• Basic knowledge of probabilistic models
• Designing a practical model (model complexity vs. data size)
• Dataset construction (incl. managing the data input team)
Guitar tablature generation
• Formulating the task as a mathematical optimization problem
• Implementing an HMM-like optimization algorithm
Drum loop morphing
• Basic knowledge of CNN, VAE, etc.
• Designing and conducting experiments


# Page. 26

![Page Image](https://bcdn.docswell.com/page/27VVNQNR7Q.jpg)

Music analysis & interaction
• Phrase tendencies of a particular bassist (2017)
• Drum velocity control (2024)
• Drawing-based improvisation system (2023)


# Page. 27

![Page Image](https://bcdn.docswell.com/page/5JGLKWK67L.jpg)

Case 4
Evolution of phrase tendencies in a particular bassist
[Matsuura et al. CSMC 2017]
Motivation
Musicians’ individuality often changes for various reasons
(change in personal preference, change of band members, etc.)
Analyze changes in phrase-level individuality of a particular musician
Target player
Flea (the bassist of Red Hot Chili Peppers)
“(As John returned in 1999,) Flea’s bass play drastically changed;
he plays thoroughly simply, focusing on root notes.”
(originally in Japanese; translated by us)
Year 1989: “Higher Ground”
(Score excerpt: bass part)
Year 1999: “Parallel Universe”
(Score excerpt: bass part)


# Page. 28

![Page Image](https://bcdn.docswell.com/page/47QYN3N2EP.jpg)

Research questions
• How can we confirm that the phrase tendency changed in 1999?
• What are the differences between the phrases before and after 1999?
Solution
Pattern recognition approach
Using MIDI transcriptions of Flea’s bass phrases
① before/after classification
• Split at 1999: higher accuracy
• Split at 2002: lower accuracy
→ More remarkably changed in 1999
② feature selection
• Split at 1999, using only features A, B & C: high accuracy
→ A, B & C are the main differences


# Page. 29

![Page Image](https://bcdn.docswell.com/page/KE4WG1GPJ1.jpg)

Results
① Before/after classification accuracy (10-fold cross validation)

| Split year | J48 | IBk | Bayes Net | MLP |
|---|---|---|---|---|
| 1999 | 76% | 78% | 73% | 84% |
| 2002 | 61% | 54% | 61% | 63% |
| 2006 | 65% | 55% | 62% | 50% |

Classification between pre-1999 and post-1999
phrases achieved the highest accuracy
Changed most remarkably in 1999
② Selected features
• Mean pitch
• Ratio of successive notes with a pitch difference of 0
• Number of successive notes with a pitch difference of 3
• Ratio of notes with the top 5 note numbers
Accuracy: 82% with only 4 features
Pitch and simplicity are
the main differences
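
The four selected features are simple enough to compute directly from a phrase given as a list of MIDI note numbers. The sketch below shows one possible reading of them; treating "pitch difference" as an absolute difference in semitones and "top 5 note numbers" as the five most frequent note numbers within the phrase are assumptions, and the original study used Weka rather than hand-written code.

```python
from collections import Counter


def phrase_features(pitches):
    """Compute four phrase-level features from a list of MIDI note numbers."""
    if len(pitches) < 2:
        raise ValueError("a phrase needs at least two notes")
    # Absolute pitch differences between successive notes (assumption).
    diffs = [abs(b - a) for a, b in zip(pitches, pitches[1:])]
    # The five most frequent note numbers within this phrase (assumption).
    top5 = Counter(pitches).most_common(5)
    return {
        "mean_pitch": sum(pitches) / len(pitches),
        "ratio_diff_0": diffs.count(0) / len(diffs),
        "num_diff_3": diffs.count(3),
        "ratio_top5": sum(count for _, count in top5) / len(pitches),
    }


# A made-up, root-heavy phrase (E1 repeated with one move to G1).
features = phrase_features([40, 40, 40, 43, 40, 40])
print(features)
```

A phrase that stays on the root note gets a high ratio of zero-difference successions and a high top-5 ratio, which matches the "simplicity" interpretation on the slide.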


# Page. 30

![Page Image](https://bcdn.docswell.com/page/L71YDGDXJG.jpg)

Case 5
Drum velocity control based on
human piano performance
[S. Seki et al. GCCE 2024]
Motivation
A band should share global changes in dynamics
e.g. start the intro with low dynamics, play the bridge with high dynamics
Control the drum velocity according to human piano performance
Global velocity changes are shared
(Figure: velocity over time for Piano (human) and Drums (system))

# Page. 31

![Page Image](https://bcdn.docswell.com/page/G7WGYKYKE2.jpg)

Proposed method
actual velocity = global velocity change + local velocity change
Global velocity change
measure-wise mean of velocity
(Figure: measures m-2, m-1, m, m+1)
Predict (linear regression) from Piano (human) and Drums (system),
and reflect them with weights ×α and ×(1-α)
Local velocity change
Deviation from global velocity change
$\Delta v^{(i)}_{m,n} \sim \mathcal{N}\left(\mu^{(i)}_{c}, {\sigma^{(i)}_{c}}^{2}\right)$
Randomly determined
following a normal distribution
(μ and σ² are learned from data)

# Page. 32

![Page Image](https://bcdn.docswell.com/page/4JZLXZXNE3.jpg)

Demo
Sorry, the difference between the sounds with high and low velocity is unclear


# Page. 33

![Page Image](https://bcdn.docswell.com/page/YE6W4Z49EV.jpg)

Extra case
Improvisation system based on
user-drawn melodic outlines
※ This is my own project
(not a student's one)
[Kitahara et al. ACM MM Asia 2022 (demo)]
• Improvisation is difficult because
it requires creating melodies while playing
• Once the user draws a melodic outline,
the system generates a melody in real time
(Figure: Cmaj7, Am7; Create a melody / Play it; Harmony theory, Musical scale, Learned melodies)


# Page. 34

![Page Image](https://bcdn.docswell.com/page/GE5MQWQDE4.jpg)

Key idea
• Use a dataset of symbolic transcriptions of professional improvisations
• Make pseudo outlines by smoothing pitch trajectory of melodies
• Make a model that estimates (before-smoothing) melodies from outlines
Transcribed melody (Weimar Jazz DB): a sequence of notes
→ smoothing → pseudo melodic outline
→ estimate the before-smoothing sequence (with CNN)
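
The smoothing step that turns a transcribed melody into a pseudo outline can be sketched as a centered moving average over a frame-wise pitch sequence. The choice of a moving average and the window length are assumptions for illustration; the slide only says that the pitch trajectory is smoothed.

```python
def pseudo_outline(pitches, window=5):
    """Centered moving average of a frame-wise pitch sequence.
    Near the edges the window is truncated to the available frames."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    outline = []
    for i in range(len(pitches)):
        segment = pitches[max(0, i - half): i + half + 1]
        outline.append(sum(segment) / len(segment))
    return outline


# A rising melody sampled frame by frame (MIDI note numbers).
melody = [60, 60, 64, 64, 67, 67]
outline = pseudo_outline(melody, window=3)
print(outline)
```

Pairs of (outline, melody) produced this way would then serve as training data for a model that maps a drawn outline back to a concrete note sequence.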


# Page. 35

![Page Image](https://bcdn.docswell.com/page/9729PQWMJR.jpg)

Model
Input (over time): melodic outline, chord
conv. → conv. → deconv. → deconv.
Output (over time): melody notes (onset), melody notes (cont’d), rest
Dataset
96 Blues melodies
from Weimar Jazz DB
(Half for training)
Let’s see a live demo


# Page. 36

![Page Image](https://bcdn.docswell.com/page/DJY45WLP7M.jpg)

What they learned through the research
Phrase analysis of a bassist
• Basic knowledge of pattern recognition techniques
(But he didn’t learn how to implement them; he used Weka)
• How to analyze musicians’ intuitive impressions quantitatively
Drum velocity control
• Basic knowledge of probabilistic models and statistics
• Formulating a time-series prediction problem using regression
• Implementing a real-time system


# Page. 37

![Page Image](https://bcdn.docswell.com/page/V7NYN94ME8.jpg)

Discussions: what they learned
Computational thinking
• Formulating tablature generation as an optimization problem
• Analyzing bass phrase tendencies as a pattern recognition problem
• Modeling ensemble interaction as a time-series prediction problem
Basic knowledge of specific areas in information science
• Probabilistic models & machine learning (Bayesian networks, VAE, etc.)
Programming
• How to use libraries (e.g. TensorFlow)
• Some projects involve real-time processing
They didn’t learn to implement
ML algorithms from scratch


# Page. 38

![Page Image](https://bcdn.docswell.com/page/YJ9PR2QW73.jpg)

Discussions: how they chose topics
Students who play instruments
• Most students chose topics related to instruments they play
• Guitar, bass, drums, etc.
• They tended to come up with topics from personal experience
• Wanted to play solo guitar pieces but could not find suitable tablatures
Students who do not play instruments
• They tended to choose topics not related to specific instruments
• Some of them avoided symbolic music generation
• Music knowledge (harmony, chords, scales) is required


# Page. 39

![Page Image](https://bcdn.docswell.com/page/GJ8DW5GRJD.jpg)

Attempts to teach machine learning through music


# Page. 40

![Page Image](https://bcdn.docswell.com/page/LJLMNYG2ER.jpg)

One of my classes at Nihon University
Objective
To learn “deep learning” through
exercises in music analysis and generation
Details of content
1. Let’s learn MLP through major/minor key classification
2. Let’s learn RNN through two-part harmonization
3. Let’s learn VAE through melody morphing
4. Let’s learn CNN through polyphonic melody generation
5. Let’s learn GAN through polyphonic melody generation


# Page. 41

![Page Image](https://bcdn.docswell.com/page/47MYXNQ97W.jpg)

Overview of exercises
Data
• Four-part harmonized pieces from
the Infinite Bach dataset (about 250 pieces)
• Code is provided for converting MIDI data to
4- or 8-bar pianoroll matrices (8th-note grid)
(Figures: polyphonic pianoroll, partwise pianoroll)
Environments & libraries
• Python on Google Colab
• TensorFlow
• PrettyMIDI
Basic code is provided
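
To give an idea of what a pianoroll matrix on an 8th-note grid looks like, here is a minimal sketch that builds one from plain note tuples. It is not the code distributed in the class: notes are assumed to be given as (pitch, start, end) in quarter-note beats rather than read from MIDI with PrettyMIDI, and the 4/4 meter and pitch range are arbitrary choices.

```python
def to_pianoroll(notes, n_bars=4, beats_per_bar=4, lowest=36, highest=84):
    """Return a (time steps x pitches) 0/1 matrix on an 8th-note grid.
    notes: iterable of (pitch, start_beat, end_beat) in quarter-note beats."""
    steps = n_bars * beats_per_bar * 2  # two 8th-note steps per beat
    roll = [[0] * (highest - lowest + 1) for _ in range(steps)]
    for pitch, start, end in notes:
        if not lowest <= pitch <= highest:
            continue  # ignore notes outside the chosen pitch range
        first = max(0, round(start * 2))
        last = min(steps, round(end * 2))
        for t in range(first, last):
            roll[t][pitch - lowest] = 1
    return roll


# C4 for one beat, then E4 for half a beat, in a single 4/4 bar.
roll = to_pianoroll([(60, 0.0, 1.0), (64, 1.0, 1.5)], n_bars=1)
print(len(roll), len(roll[0]))
```

A partwise pianoroll would apply the same conversion to each voice separately and keep one matrix per part.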


# Page. 42

![Page Image](https://bcdn.docswell.com/page/P7R9ND89E9.jpg)

Learn RNN through
two-part harmonization
(Figure: soprano pianoroll matrix → ? → alto pianoroll matrix)
Learn VAE through melody morphing
(Figure: Encode → latent space → Decode)
Learn GAN through polyphonic melody generation
(Figure: z → generated data; data from dataset; output 1/0)


# Page. 43

![Page Image](https://bcdn.docswell.com/page/PJXQN1837X.jpg)

Textbook
• “Learning Deep Learning with Music”
• Written by T. Kitahara (me)
• Published by Ohmsha in 2023


# Page. 44

![Page Image](https://bcdn.docswell.com/page/3JK9N2KNJD.jpg)

Results (personal observations)
Pros
• We could learn RNN, CNN, GAN, etc. with a unified type of data
• Listening to generated melodies was enjoyable
• Practically useful for students planning a music-related thesis
Cons
• Learning music-related knowledge (incl. MIDI) required overhead
for students not planning a music-related thesis
• Evaluating generated content was difficult for most students
(they lacked musical knowledge)


# Page. 45

![Page Image](https://bcdn.docswell.com/page/LE3WV4ZZE5.jpg)

Conclusion
• Music has multiple computational representations
(signals, images, event sequences, hierarchical structures)
• Music naturally involves many areas of information science
(machine learning, signal processing, optimization, HCI, etc.)
• Students learned computational thinking through music-related research
• Personal musical experience strongly motivated students’ topic selection
• Music also worked well as a teaching material for machine learning
Music provides an intuitive and engaging gateway to information science


