BerlinMOD

BerlinMOD is a benchmark for spatio-temporal database management systems (STDBMS). It is intended as a tool for both

comparing different implementation details of STDBMS, such as moving object data types, index structures, and spatio-temporal operators;
comparing the performance of different STDBMS.

What Data is used by BerlinMOD?

BerlinMOD primarily measures performance on queries over moving point data. The moving point data are sampled from simulated cars driving on the street network of the German capital Berlin in a representative way. The simulation models the behaviour of workers commuting between their homes and workplaces, plus additional trips in their leisure time. Normally, the sampled data is mapped to the street network, but it is also possible to disturb the data. To generate the benchmark data, BerlinMOD relies on real spatial data on the roads of Berlin, imported from the tool bbbike (http://bbbike.de) written by Slaven Rezic, who kindly gave us permission to use his data for scientific purposes. The data is generated not by a dedicated data generator program, but by a script for the extensible Secondo DBMS. This makes BerlinMOD flexible, as you can easily modify the benchmark to suit your own needs. Of course, it is possible to export the generated data and use it in your favourite DBMS.

We have also generated the benchmark data for different scale factors. The datasets and their characteristics are given in Table 1, which also provides links to download the data in CSV and ESRI shape file format.

Table 1: Pregenerated BerlinMOD Data.
Scale Factor	Days	Vehicles	Trips	Units	bbbike coords, CSV old	bbbike coords, CSV new	bbbike coords, Shape	wgs84 coords, CSV	wgs84 coords, Shape
0.005	2	141	1,797	346,657	CSV, 10 MB	CSV, 5 MB	SHAPE, 15 MB	CSV, 4 MB	SHAPE, 11 MB
0.05	6	447	15,045	2,998,674	CSV, 92 MB	CSV, 46 MB	SHAPE, 132 MB	CSV, 37 MB	SHAPE, 98 MB
0.2	13	894	62,510	12,091,785	CSV, 372 MB	CSV, 183 MB	SHAPE, 534 MB	CSV, 150 MB	SHAPE, 397 MB
1.0	28	2,000	292,940	56,129,943	CSV, 1.7 GB	CSV, 857 MB	SHAPE, 2.5 GB	CSV, 706 MB	SHAPE, 1.8 GB
Indicated file sizes are those of the compressed archives. The unpacked CSV data for scale factor 1.0 is about 11 GB and the uncompressed shape data about 23 GB. A data format description for the files and the generator settings used are available.
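
The Days and Vehicles columns of Table 1 are consistent with scaling both values by the square root of the scale factor, starting from 28 days and 2,000 vehicles at scale factor 1.0. The following Python sketch reproduces those two columns under that assumption. The function name scaled_size is hypothetical and not part of the BerlinMOD scripts; the generator settings remain the authoritative source.

```python
import math

# Reference values at scale factor 1.0, taken from Table 1.
BASE_VEHICLES = 2000
BASE_DAYS = 28


def scaled_size(scale_factor: float) -> tuple[int, int]:
    """Return (vehicles, days), assuming both grow with sqrt(scale_factor)."""
    root = math.sqrt(scale_factor)
    return round(BASE_VEHICLES * root), round(BASE_DAYS * root)


# Print the two derived columns for the scale factors listed in Table 1.
for sf in (0.005, 0.05, 0.2, 1.0):
    vehicles, days = scaled_size(sf)
    print(f"scale factor {sf}: {vehicles} vehicles, {days} days")
```

Trips and Units are not covered by this sketch, since they depend on the simulated behaviour of each vehicle rather than on a closed formula.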

After unzipping one of the files into the secondo/bin/BerlinMOD directory, the file contents can be imported into an open Secondo database by executing the suitable script for CSV or SHAPE.

Another pregenerated BerlinMOD dataset can be downloaded here (8.6 GB). It consists of 750,000 trajectories in geographic coordinates with corresponding street names and elevation data. After unpacking the file into your local secondo/bin directory, you can import the data into an existing Secondo database via the command restore Trips from Trips.

What Queries are used in BerlinMOD?

The benchmark uses two different representations for the created movements: the object-based approach (one position history per vehicle, a concatenation of all the vehicle's movements during the observation period) and the trip-based approach (one position history per trip and vehicle). For each of these representations, the benchmark provides 17 range-style queries (called BerlinMOD/R) and 9 nearest-neighbour queries (called BerlinMOD/NN). The queries include predicates on standard data, but mainly use spatial, temporal, and spatio-temporal predicates.
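
The relationship between the two representations can be illustrated with a small Python sketch. It is a simplification under stated assumptions: positions are plain timestamped samples rather than Secondo's moving point type, and the names Sample, Trip, and to_object_based are hypothetical, chosen only for this illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Sample(NamedTuple):
    t: float  # timestamp (hypothetical unit)
    x: float
    y: float


class Trip(NamedTuple):
    vehicle_id: int
    trip_id: int
    samples: list  # Sample values ordered by time


def to_object_based(trips):
    """Concatenate each vehicle's trips, in time order, into one history."""
    histories = defaultdict(list)
    for trip in sorted(trips, key=lambda tr: tr.samples[0].t):
        histories[trip.vehicle_id].extend(trip.samples)
    return dict(histories)


# Trip-based representation: one position history per trip and vehicle.
trips = [
    Trip(1, 11, [Sample(20.0, 5.0, 5.0), Sample(30.0, 0.0, 0.0)]),
    Trip(1, 10, [Sample(0.0, 0.0, 0.0), Sample(10.0, 5.0, 5.0)]),
    Trip(2, 20, [Sample(0.0, 1.0, 1.0), Sample(5.0, 2.0, 2.0)]),
]

# Object-based representation: one position history per vehicle.
histories = to_object_based(trips)
for vehicle_id, history in histories.items():
    print(vehicle_id, [s.t for s in history])
```

The trip-based form keeps three records here, while the object-based form keeps two, one per vehicle.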

Where can I read more about it?

An article on BerlinMOD was published in 2009 in the VLDB Journal.

The original article is available at Springer Link. You may also download a technical report (revised version) describing BerlinMOD.

What do I need for BerlinMOD?

This depends on your intentions:

You want to benchmark your system...: Just download the pregenerated BerlinMOD data.
You want to compare your system against Secondo...: Download Secondo, the BerlinMOD base data files and the BerlinMOD script files.
You want to use the generator to create your own data...: Download Secondo, the BerlinMOD script files, and probably the BerlinMOD data files.

The mentioned elements are all available for free download here:

For pregenerated BerlinMOD data, see Table 1.
A working version of the Secondo 3.X DBMS, e.g. using the Secondo Appliance on a virtual machine
The base data files:
The generator and benchmark scripts for Secondo version 3.0 or
The generator and benchmark scripts for Secondo version 3.1 and 3.2 or
The data, generator, and benchmark script for Secondo version 3.4 (includes base data files)
The data, generator, and benchmark script for Secondo version 4.1 (includes base data files)

How do I use BerlinMOD?

You want to benchmark your DBMS or just use the data...: Unpack the downloaded data and import it into your system (the data format description may be useful). You can look up the benchmark queries from one of our articles and translate the queries into your query language.
You want to compare your system against Secondo...: Install Secondo on your test platform. Set up the BerlinMOD data generator to export the data to your preferred data format. Then start the data generator. Import the created data into your system and translate the benchmark queries. You can execute the benchmark object builder and query scripts on the same Secondo system.
You want to use the generator to create your own data...: Install Secondo and read our article to learn about the parameters and input files used. Then you can adapt the data generator to your needs.

Using the Data Generator

After installing Secondo (project website), copy the base data files streets.data, homeRegions.data, and workRegions.data to your Secondo binary directory. Extract the generator and benchmark scripts into the same directory. You may now configure the data generator as needed. For further details, please refer to our article. Change to your Secondo binary directory and type SecondoTTYNT -i BerlinMOD_DataGenerator.SEC to generate the benchmark data.
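
Before starting the generator, it can help to check that the required files are in place. The following Python sketch does this for the file names and command line quoted above. The helper names missing_inputs and generator_command are hypothetical, and the demonstration uses a temporary directory in place of a real Secondo binary directory.

```python
import tempfile
from pathlib import Path

# File names as given in the text above.
BASE_DATA_FILES = ("streets.data", "homeRegions.data", "workRegions.data")
GENERATOR_SCRIPT = "BerlinMOD_DataGenerator.SEC"


def missing_inputs(secondo_bin: Path) -> list[str]:
    """Return the names of required files that are not in secondo_bin."""
    required = BASE_DATA_FILES + (GENERATOR_SCRIPT,)
    return [name for name in required if not (secondo_bin / name).is_file()]


def generator_command() -> list[str]:
    """Command line from the text; run it inside the Secondo binary directory."""
    return ["SecondoTTYNT", "-i", GENERATOR_SCRIPT]


# Demonstration with a temporary stand-in directory holding only one file.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp)
    (demo_dir / "streets.data").touch()
    demo_missing = missing_inputs(demo_dir)

print("missing:", demo_missing)
print("command:", " ".join(generator_command()))
```

On a real installation, once missing_inputs returns an empty list for your Secondo binary directory, the command could be started with subprocess.run(generator_command(), cwd=secondo_bin, check=True).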

Running the Benchmark on Secondo

You can also create the benchmark data, create all database objects (including indexes) for running the benchmark on the Secondo DBMS and run all benchmark queries by calling: SecondoTTYNT -i BerlinMOD_Complete.SEC
By default, a scale factor of 0.05 is used in these scripts.

Further information and instructions are contained in the script files and in the technical report.

Examples and Use Cases

Assessing Representations for Moving Object Histories

Here, the BerlinMOD data and the BerlinMOD/R query set were used to assess different representation variants for moving object data. The Secondo command scripts used for the experiments are available as a zip archive. The archive contains a file ReadMe.txt with further instructions. The scripts transform the BerlinMOD data into several different representation modes and create the corresponding index structures required by the BerlinMOD/R query scripts. This example may inspire you to use BerlinMOD and Secondo as a test bed for your own experiments in MODB research. There is also a paper discussing 5 example queries from these scripts, and a technical report describing the experiments and their results.

Spatiotemporal Pattern Queries

This paper proposes an elegant way to formulate spatiotemporal pattern queries. Some experiments were carried out to establish first efficiency results. The BerlinMOD data generator was used to create the moving object data for the tests: 50,000, 100,000, 200,000, and 300,000 cars over one day.

The article via Springer Link.
The corresponding technical report (revised version).
A Secondo Plugin with the implementation.
Instructions on how to repeat the experiments.

Efficient k-Nearest Neighbor Search on Moving Object Trajectories

BerlinMOD data was used for experiments comparing a new R*-tree index access method supporting k-NN queries on moving objects with two other methods using the TB-tree index. The BerlinMOD data generator was set up to create the "Cars" test data set: 2,000 cars moving over one day.


