---
title: FIRAS@SIGMOD2026
tags: 
author: [Daichi Amagata](https://image.docswell.com/user/8828547526)
site: [Docswell](https://www.docswell.com/)
thumbnail: https://bcdn.docswell.com/page/9J29PVZVER.jpg?width=480
description: Presentation slide of FIRAS: A Framework for Interval Range Search and Sampling
published: June 03, 26
canonical: https://image.docswell.com/s/8828547526/51QX63-2026-06-03-012608
---
# Page. 1

![Page Image](https://bcdn.docswell.com/page/9J29PVZVER.jpg)

FIRAS: A Framework for Interval
Range Search and Sampling
Daichi Amagata (The University of Osaka)
Panagiotis Simatis
(University of Ioannina)
Panagiotis Bouros
---
Nikos Mamoulis
(University of Ioannina)


# Page. 2

![Page Image](https://bcdn.docswell.com/page/DEY453RQJM.jpg)

Background: Range Search on Interval Data
No. 1
•
An interval 𝑥 = [𝑥. 𝑙, 𝑥. 𝑟]
•
A set of 𝑛 intervals 𝑋 = {𝑥1 , … , 𝑥𝑛 }
•
A query interval 𝒒 = [𝒒. 𝒍, 𝒒. 𝒓] finds 𝑹 = 𝒙 ∈ 𝑿 ∶ 𝒙 ∩ 𝒒 , i.e., “all” intervals ∈ 𝑿 that overlap 𝒒.
Ex.1) Vehicle management systems:
Show the vehicles that were active
between 17:00 and 22:00 a week ago.
Ex.2) Book management systems:
Collect the books that were borrowed
in the last month.
Ex.3) Historical cryptocurrency databases:
Show when the price of Bitcoin (BTC) falls
in [30,000, 40,000] dollars.


# Page. 3

![Page Image](https://bcdn.docswell.com/page/VJNYNVD278.jpg)

Background: What if “Too Large Result Set”…
No. 2
•
An interval 𝑥 = [𝑥. 𝑙, 𝑥. 𝑟]
•
A set of 𝑛 intervals 𝑋 = {𝑥1 , … , 𝑥𝑛 }
•
A query interval 𝒒 = [𝒒. 𝒍, 𝒒. 𝒓] finds 𝑹 = 𝒙 ∈ 𝑿 ∶ 𝒙 ∩ 𝒒 , i.e., “all” intervals ∈ 𝑿 that overlap 𝒒.
Ex.1) Vehicle management systems:
Show the vehicles that were active
between 17:00 and 22:00 a week ago.
Database system view:
A range search must incur 𝛀
to obtain its result set.
= 𝑶(𝒏) time
𝒙∈𝑿∶𝒙∩𝒒
Still 10,000,000 intervals!
query
interval 𝑞
Database 𝑋
of a billion intervals
𝑥 ∈𝑋 ∶𝑥∩𝑞
Issue 1: not easy to analyze/visualize
Issue 2: slow response


# Page. 4

![Page Image](https://bcdn.docswell.com/page/YE9PR6GDJ3.jpg)

Background: Range Sampling on Interval Data
•
An interval 𝑥 = [𝑥. 𝑙, 𝑥. 𝑟]
•
A set of 𝑛 intervals 𝑋 = {𝑥1 , … , 𝑥𝑛 }
•
A query interval 𝒒 = [𝒒. 𝒍, 𝒒. 𝒓] randomly samples some intervals from 𝑹.
Ex.1) Vehicle management systems:
Show the vehicles that were active
between 17:00 and 22:00 a week ago.
Issue 1: not easy to analyze/visualize
Ex.1) Vehicle management systems:
Show randomly sampled vehicles that were
active between 17:00 and 22:00 a week ago.
Easy to analyze/visualize &amp; accelerated efficiency
(if only samples are obtained)
No. 3


# Page. 5

![Page Image](https://bcdn.docswell.com/page/GE8DWZV5ED.jpg)

Problem Definition
Problem Definition 1: Range search on interval data
Input: 𝑋 (a set of 𝑛 intervals), 𝑞 (query interval)
Output: 𝑅 = 𝑥 ∈ 𝑋 ∶ 𝑥 ∩ 𝑞
Problem Definition 2: Range counting on interval data
Input: 𝑋 (a set of 𝑛 intervals), 𝑞 (query interval)
Output: 𝑘 = |𝑅|, i.e., range search result size
Problem Definition 3: IRS (Independent Range Sampling) on interval data
Input: 𝑋 (a set of 𝑛 intervals), 𝑞 (query interval), and 𝑠 (#samples)
Output: 𝑆 (a set of 𝑠 intervals, each of which is picked from 𝑅 uniformly at random)
No. 4


# Page. 6

![Page Image](https://bcdn.docswell.com/page/LELMND537R.jpg)

Background: Seamless between Range Search &amp; Sampling
No. 5
Case 2 is useful/efficient ☺
Case 1 is bothering and not useful/efficient in practice 
Range search
Range search
𝑅
𝑅
in 𝑶(𝒌) time ☺
Run search algo.
Run algo. x
IRS
IRS/Counting
𝑅 then 𝑆 or 𝑘
(2)
𝛀 𝒌 = 𝑶(𝒏) time 
𝑆
in 𝑶(𝒔) time ☺
Run algo. x
Run search algo.
IRS
(1) Range counting
𝑆
|𝑅|
Another algorithm
implementation 
in 𝒐 𝒏 = 𝑶(𝒏𝟏−𝝐 ) time ☺
Run IRS algo.
Run algo. x


# Page. 7

![Page Image](https://bcdn.docswell.com/page/4JMYXP3MJW.jpg)

Contributions
No. 6
Challenge
How to design an algorithm that satisfies the following requirements?
•
Range search and IRS are done in 𝑜 𝑛 + 𝑂(𝑘) and 𝑜 𝑛 + 𝑂(𝑠) times, respectively.
•
Only 𝑂(𝑛) space is required (to scale well to data size).
•
Updatable, i.e., it can deal with streaming interval data.
Findings
Method
Range search time
IRS time
Space
Update time
Interval tree [1]
𝑂(log 𝑛 + 𝑘)
𝑂(𝑛)
𝑂(𝑛)
𝑂(𝑛 log 𝑛)
AIT [2]
𝑂(log 𝑛 + 𝑘)
𝑂(log 2 𝑛 + 𝑠)
𝑂(𝑛 log 𝑛)
𝑂(𝑛 log 𝑛)
FIRAS (static)
𝑂(log 𝑛 + 𝑘)
𝑂(log 2 𝑛 + 𝑠)
𝑂(𝑛)
-
FIRAS (dynamic)
𝑂( 𝑛 + 𝑘)
𝑂( 𝑛 + 𝑠)
𝑂(𝑛)
𝑂 𝑛
𝑂(log 𝑛) amortized
[1] Computational Geometry: Algorithms and Applications, 3rd Edition, 2008.
[2] “Independent Range Sampling on Interval Data,” In ICDE, 2024.
Seamless?


# Page. 8

![Page Image](https://bcdn.docswell.com/page/PJR9N1QL79.jpg)

FIRAS: Main Idea
No. 7
Query result decomposition (decomposing 𝑅 into two disjoint sets)
𝑅 = 𝑥 ∈ 𝑋 ∶ 𝑥 ∩ 𝑞 = 𝑄1 ∪ 𝑄2 , where
•
𝑸𝟏 = 𝒙 ∈ 𝑿 ∶ 𝒒. 𝒍 ≤ 𝒙. 𝒓 &lt; 𝒒. 𝒓 , -&gt; one-dimensional range search
•
𝑸𝟐 = 𝒙 ∈ 𝑿 ∶ 𝒙. 𝒍 ≤ 𝒒. 𝒓 ≤ 𝒙. 𝒓 , -&gt; stabbing query (𝑞. 𝑙 = 𝑞. 𝑟) on interval data
•
and 𝑄1 ∩ 𝑄2 .
𝑸𝟏 case: binary searches on a sorted array
𝑥𝑖 𝑥𝑗
… 𝑥𝑎 …
𝑞. 𝑙
… 𝑥𝑏 …
… 𝑥𝑛
𝑞. 𝑟
𝑸𝟐 case: stabbing query processing on an interval data structure
Merits:
•
Any structure is acceptable.
•
Guaranteed to be faster than
range search
Merits:
•
𝑂(𝑛) space
Default structure: interval tree
•
𝑂(log 𝑛) time to identify the space of 𝑄1
•
𝑂(𝑛) space.
•
𝑂(𝑘) time enumeration &amp; 𝑂(1) time sampling
•
𝑂(log 𝑛 + 𝑘) time stabbing query


# Page. 9

![Page Image](https://bcdn.docswell.com/page/PEXQNYR6JX.jpg)

FIRAS: IRS in 𝑶(log 2 𝒏 + 𝒔) time with 𝑶(𝒏) space
1. 𝑸𝟏 = 𝒙 ∈ 𝑿 ∶ 𝒒. 𝒍 ≤ 𝒙. 𝒓 &lt; 𝒒. 𝒓 case
Algorithm: Binary search of 𝑞𝑙 (𝑞𝑟 ) on the sorted array.
𝑥𝑖 𝑥𝑗
… 𝑥𝑎 …
… 𝑥𝑏 …
𝑞. 𝑙
… 𝑥𝑛
No. 8
2. 𝑸𝟐 = 𝒙 ∈ 𝑿 ∶ 𝒙. 𝒍 ≤ 𝒒. 𝒓 ≤ 𝒙. 𝒓 case
Algorithm: For each node of the interval tree,
Binary search on the corresponding sorted list
𝑞. 𝑟
Time complexity: 𝑂(log 𝑛)
3. Weighted sampling
Algorithm:
1.
Build an alias structure [3] on the index ranges in 2.
2.
Repeat the following:
• Generate a random value 𝑣 ∈ [1, 𝑘].
• If 𝑣 ≤ |𝑄1 |, random sampling from 𝑄1 .
• Otherwise, random sampling from the alias.
Time complexity: 𝑂 log 𝑛 + 𝑠
Time complexity: 𝑂 log 2 𝑛 = 𝑂 log 𝑛 × 𝑂(log 𝑛)
Sampling probability of 𝒙 ∈ 𝑹: 1/𝑘
[3] “New fast method for generating discrete random numbers with arbitrary frequency distributions,” Electronics Letters 10 (1974), 127–128.
bin. search
cost
tree height


# Page. 10

![Page Image](https://bcdn.docswell.com/page/3EK9N3RGED.jpg)

FIRAS: Range Search in 𝑶(log 𝒏 + 𝒌) time with 𝑶(𝒏) space
1. 𝑸𝟏 = 𝒙 ∈ 𝑿 ∶ 𝒒. 𝒍 ≤ 𝒙. 𝒓 &lt; 𝒒. 𝒓 case
Algorithm: Binary search of 𝑞𝑙 (𝑞𝑟 ) on the sorted array
𝑥𝑖 𝑥𝑗
… 𝑥𝑎 …
𝑞. 𝑙
… 𝑥𝑏 …
… 𝑥𝑛
No. 9
2. 𝑸𝟐 = 𝒙 ∈ 𝑿 ∶ 𝒙. 𝒍 ≤ 𝒒. 𝒓 ≤ 𝒙. 𝒓 case
Algorithm: For each node of the interval tree,
Binary search on the corresponding sorted list
𝑞. 𝑟
Time complexity: 𝑂(log 𝑛)
3. Scan the index ranges
Algorithm: For each array/list accessed, scan the index
range identified in 1 or 2.
Time complexity: 𝑂(𝑘)
Note:
•
•
In the 𝑄2 case, binary searches are not required.
Then, the time becomes 𝑂(log 𝑛 + 𝑘).
Time complexity: 𝑂 log 2 𝑛 = 𝑂 log 𝑛 × 𝑂(log 𝑛)
list size
tree height


# Page. 11

![Page Image](https://bcdn.docswell.com/page/L73WVG8575.jpg)

FIRAS: Streaming Interval Data
No. 10
Assumption: Intervals arrive in a point-wise manner.
•
An interval 𝑥 first has only 𝑥. 𝑙, i.e., it is an infinite interval [𝑥. 𝑙, ∞).
•
Sometime later, it has 𝑥. 𝑟, i.e., it becomes a closed interval 𝑥 = [𝑥, 𝑙, 𝑥. 𝑟].
𝑥2 . 𝑙
𝑥1 . 𝑙
𝑥2 . 𝑟
𝑥3 . 𝑙
𝑥3 . 𝑟
…
𝑡
Observation:
•
𝑥 = 𝑥. 𝑙, ∞ ∩ 𝑞 iff 𝑥. 𝑙 ≤ 𝑞. 𝑟.
•
One-sided one-dimensional range search
Data structure for infinite intervals:
•
•
A sequence of buffers 𝐵1 , … , 𝐵𝑏
•
𝐵𝑖 contains at most 𝑛 points.
•
𝑂( 𝑛) time for merging
𝑶 𝟏 time insertion/deletion
•
•
Hash table-based implementation
𝑶( 𝒏) time for range search/IRS
…
𝑥1
𝒒


# Page. 12

![Page Image](https://bcdn.docswell.com/page/87DK81MYJG.jpg)

FIRAS: Evolving Domain Interval Tree (EDIT)
No. 11
Basic structure: Balanced binary search tree based on the domain size 𝐷
•
Idea: The EDIT is doubled (i.e., nodes are created) only when 𝐷 &gt; 𝐷𝑒𝑑𝑖𝑡 and 𝐷 is an odd integer.
•
Merits: Nodes are added incrementally, and existing intervals do no need to move.
Theoretical properties:
Theoretical properties: Theoretical properties:
Update time (𝑫 = 𝑶 𝒏 )
Space complexity
Stabbing query time
•
•
•
𝑂(1) amortized tree update time
•
𝑂 log 𝐷 +
𝑛
𝐷
≈ 𝑂(log 𝑛) average interval insertion time
𝑂(max 𝑛, 𝐷 )
𝑂 log 𝐷 log 𝑛 &lt; 𝑂( 𝑛)


# Page. 13

![Page Image](https://bcdn.docswell.com/page/VJPK8QR2E8.jpg)

Experiment: Setting
No. 12
Algorithms for streaming interval data
Algorithms for static interval data
Method
Range search time
IRS time
Space
Method
Range search time
IRS time
Space
Update time
Interval tree [1]
𝑂(log 𝑛 + 𝑘)
𝑂(𝑛)
𝑂(𝑛)
LIT [4]
𝑂(𝑛)
𝑂(𝑛)
𝑂(𝑛)
-
AIT [2]
𝑂(log 𝑛 + 𝑘)
LIT-EDAIT
𝑂(log 𝑛 + 𝑘)
𝑂(log 2 𝑛 + 𝑠) 𝑂(𝑛 log 𝑛)
HINT [3]
𝑂(𝑛)
𝑂(𝑛)
𝑂(𝑛)
LIT-FIRAS
𝑂( 𝑛 + 𝑘)
𝑂( 𝑛 + 𝑠)
FIRAS
𝑂(log 𝑛 + 𝑘)
𝑂(log 2 𝑛 + 𝑠)
𝑂(𝑛)
𝑂(log 2 𝑛 + 𝑠) 𝑂(𝑛 log 𝑛)
[1] Computational Geometry: Algorithms and Applications, 3rd Edition, 2008.
[2] “Independent Range Sampling on Interval Data,” In ICDE, 2024.
[3] “HINT: a hierarchical interval index for Allen relationships,” VLDB J. 33, 1 (2024), 73–100.
[4] “LIT: Lightning-fast In-memory Temporal Indexing,” Proc. ACM Manag. Data 2, 1 (2024), 20:1–20:27.
𝑂(𝑛)
𝑂 𝑛
𝑂(log 𝑛) amortized


# Page. 14

![Page Image](https://bcdn.docswell.com/page/2EVVN5RXEQ.jpg)

Experiment: Result on Static Data (Range search &amp; Memory size)
No. 13


# Page. 15

![Page Image](https://bcdn.docswell.com/page/57GLK6MREL.jpg)

Experiment: Result on Static Data (Independent Range Sampling)
No. 14


# Page. 16

![Page Image](https://bcdn.docswell.com/page/4EQYN4RYJP.jpg)

Experiment: Result on Streaming Data (Range search &amp; Memory size)
No. 15


# Page. 17

![Page Image](https://bcdn.docswell.com/page/KJ4WGQ8Z71.jpg)

Experiment: Result on Streaming Data (Independent Range Sampling)
No. 16
Key takeaway: FIRAS has the best tradeoff between query time, space, and update-efficiency.


# Page. 18

![Page Image](https://bcdn.docswell.com/page/LE1YDP2D7G.jpg)

Conclusion
No. 17
Problems
•
Range search/counting and IRS on static and streaming interval data
Findings
Method
Range search time
IRS time
Space
Update time
Interval tree [1]
𝑂(log 𝑛 + 𝑘)
𝑂(𝑛)
𝑂(𝑛)
𝑂(𝑛 log 𝑛)
AIT [2]
𝑂(log 𝑛 + 𝑘)
𝑂(log 2 𝑛 + 𝑠)
𝑂(𝑛 log 𝑛)
𝑂(𝑛 log 𝑛)
FIRAS (static)
𝑂(log 𝑛 + 𝑘)
𝑂(log 2 𝑛 + 𝑠)
𝑂(𝑛)
-
FIRAS (dynamic)
𝑂( 𝑛 + 𝑘)
𝑂( 𝑛 + 𝑠)
𝑂(𝑛)
𝑂 𝑛
𝑂(log 𝑛) amortized
Experiments
•
We demonstrate that FIRAS has the best tradeoff between query time, space, and update-efficiency.
•
Our implementation is available at https://github.com/psimatis/FIRAS.
Seamless?


