BerlinMOD Data Export Formats

This file describes the export table file formats used by the BerlinMOD data generator script. The information presented here describes
  1. which tables exist,
  2. which attributes and attribute types each table uses,
  3. the meaning of the data provided by each table,
  4. the format used within CSV files.
Using this information should enable you to import the BerlinMOD data into your database system. To learn more about Secondo and BerlinMOD, please see their project websites.

CSV data format

The BerlinMOD data generator can export the data as comma-separated value (CSV) files. The pregenerated benchmark data is also available in this format. Each table is represented by an ASCII text file. In this file, each tuple (row) of the table is represented by one line terminated with the END-OF-LINE character. Within each line, the tuple's attribute (column) values are separated by the comma character (,). The format for simple types is:
Table 1: CSV Type Representation.

Notation used in the Representation column:
  N     any digit (0, 1, 2, 3, 4, 5, 6, 7, 8, or 9)
  C     any ASCII character, except LF, CR and the comma (,)
  X|Y   either X or Y
  [X]   optional (may be omitted)
  Xn    0 up to n elements of type X
  X+    at least 1 element of type X
  X*    any number of elements of type X
  .     the dot character (.)
  :     the colon character (:)
  -     the minus character (-)
  E     the E character (E or e)

Type         Example                    Representation
int          -5441115                   [-]N+
real         0.56489217E-12             [-]N*.N+[E[-]N+]
varchar[n]   Sachsenring                Cn
date         2007-05-08-06:40:15.000    NNNN-NN-NN-NN:NN:NN.NNN

Comments:
  real         The period is mandatory.
  varchar[n]   The string may contain up to n characters. Any ASCII characters are allowed, except the characters signaling END-OF-LINE (CR, LF) and the comma (,).
  date         This is basically a varchar[23] with the format yyyy-mm-dd-hh:tt:ss.nnn, where yyyy=year, mm=month, dd=day, hh=hour, tt=minute, ss=second, nnn=millisecond. The fields are always fully used; leading positions are padded with zeros (0).
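
The representations in Table 1 can be checked mechanically while importing. The following Python sketch shows one possible way to do this. The helper names (parse_int, parse_real, parse_date) are illustrative and not part of BerlinMOD, and the regular expressions are simply a transcription of the Representation column, so treat them as an assumption about how strictly an importer should validate.

```python
import re
from datetime import datetime

# Regular expressions transcribed from the Representation column of Table 1.
# [0-9] is used instead of \d so that only ASCII digits are accepted.
INT_RE = re.compile(r"-?[0-9]+")
REAL_RE = re.compile(r"-?[0-9]*\.[0-9]+([Ee]-?[0-9]+)?")
DATE_RE = re.compile(r"[0-9]{4}(-[0-9]{2}){3}:[0-9]{2}:[0-9]{2}\.[0-9]{3}")

# strptime pattern for yyyy-mm-dd-hh:tt:ss.nnn; %f accepts the 3-digit fraction.
DATE_FORMAT = "%Y-%m-%d-%H:%M:%S.%f"


def _check(pattern, text, type_name):
    """Return text unchanged if it matches the pattern completely, else raise."""
    if pattern.fullmatch(text) is None:
        raise ValueError(f"not a valid {type_name} value: {text!r}")
    return text


def parse_int(text):
    return int(_check(INT_RE, text, "int"))


def parse_real(text):
    # The period is mandatory, so a plain "5" is rejected here.
    return float(_check(REAL_RE, text, "real"))


def parse_date(text):
    return datetime.strptime(_check(DATE_RE, text, "date"), DATE_FORMAT)


# The example values from Table 1.
print(parse_int("-5441115"))
print(parse_real("0.56489217E-12"))
print(parse_date("2007-05-08-06:40:15.000"))
```

Validating before converting keeps the importer from silently accepting values that Python would parse but the format does not allow, such as a real without a period.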
Table 2: CSV Table Descriptions.
Each entry gives: file name, BerlinMOD relation name, column names with attribute types, and meaning.
Notes:
[1]: Line/Polyline format: For each line (identified by an Id attribute), a set of tuples defines a set of line segments. A segment is defined by a line connecting the points (X1, Y1) and (X2, Y2).
[2]: Point format: A 2D point attribute <name> is decomposed into its X and Y components, each represented by a real value. The corresponding columns are named <name>X or X<name> for the X component, and <name>Y or Y<name> for the Y component (for example, Xstart and Ystart).
[3]: Region/polygon2D format: For each region (usually identified by an Id attribute), a sequence of tuples defines a sequence of vertices. The region's boundary is defined by the segments connecting each pair of subsequent points (see [2]), and the segment connecting the first and the last vertex of the sequence. Because the tuples represent a sequence of segments, their ordering is significant!

datamcar.csv   dataMcar{
    Moid:     int,
    Licence:  varchar[48],
    Type:     varchar[48],
    Model:    varchar[48]}.
  A table with general information on each vehicle. Moid is the key for the vehicle and is also used in table Journey.

journey.csv   Journey{
    Moid:     int,
    Licence:  varchar[48],
    Type:     varchar[48],
    Model:    varchar[48],
    Tstart:   date,
    Tend:     date,
    Xstart:   real,
    Ystart:   real,
    Xend:     real,
    Yend:     real}.
  Each tuple defines a steady linear movement of the vehicle with the given Moid, which has the given licence, vehicle type and car model. The movement starts at instant Tstart at position (Xstart, Ystart) and ends at instant Tend at position (Xend, Yend).[2]

queryinstants.csv   QueryInstants{
    Id:       int,
    Instant:  date}.
  A table with query instants. Id is a key.

queryperiods.csv   QueryPeriods{
    Id:       int,
    Begin:    date,
    End:      date}.
  Id is a key. A table with temporal intervals starting at instant Begin and ending at instant End.

querypoints.csv   QueryPoints{
    Id:       int,
    Pos_x:    real,
    Pos_y:    real}.
  Id is a key. A table with query point data.[2]

queryregions.csv   QueryRegions{
    Id:        int,
    Vertex_x:  real,
    Vertex_Y:  real}.
  Id is a key. A table with regions/polygons used in queries.[3]

streets.csv   streets{
    Id:       int,
    Vmax:     real,
    X1:       real,
    Y1:       real,
    X2:       real,
    Y2:       real}.
  Id is the key for a complete line. A table with line segment data defining the road network. Vmax is the speed limit for the corresponding street segment. The start point of the segment is (X1, Y1), the end point is (X2, Y2).[1]

trips.csv   Trips{
    Moid:     int,
    Tripid:   int,
    Tstart:   date,
    Tend:     date,
    Xstart:   real,
    Ystart:   real,
    Xend:     real,
    Yend:     real}.
  Contains the same movement data as table Journey. However, here the complete movements are decomposed into several shorter movements called trips. The Moid is the same as used in dataMcar and Journey. Each trip is composed of a set of tuples, each defining a steady linear movement.
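
Because each tuple of Journey and Trips describes a steady linear movement, the position of a vehicle at any instant between Tstart and Tend follows by linear interpolation. The Python sketch below illustrates this for a trips.csv-style line. It assumes the column order given in Table 2 and a file without a header line; neither the function names nor the sample lines come from BerlinMOD, and the sample values are invented for illustration.

```python
from datetime import datetime

# strptime pattern for the date format yyyy-mm-dd-hh:tt:ss.nnn.
DATE_FORMAT = "%Y-%m-%d-%H:%M:%S.%f"


def parse_trip_tuple(line):
    """Split one trips.csv-style line into typed fields.

    Assumes the column order of Table 2 and no header line. A plain split is
    sufficient because values never contain a comma.
    """
    moid, tripid, tstart, tend, xstart, ystart, xend, yend = (
        line.rstrip("\r\n").split(",")
    )
    return {
        "Moid": int(moid),
        "Tripid": int(tripid),
        "Tstart": datetime.strptime(tstart, DATE_FORMAT),
        "Tend": datetime.strptime(tend, DATE_FORMAT),
        "Xstart": float(xstart),
        "Ystart": float(ystart),
        "Xend": float(xend),
        "Yend": float(yend),
    }


def position_at(unit, instant):
    """Linearly interpolate the (x, y) position within one movement tuple."""
    if not unit["Tstart"] <= instant <= unit["Tend"]:
        raise ValueError("instant lies outside [Tstart, Tend]")
    duration = unit["Tend"] - unit["Tstart"]
    if duration.total_seconds() == 0:
        return (unit["Xstart"], unit["Ystart"])
    # Dividing two timedelta values yields a float in [0, 1].
    fraction = (instant - unit["Tstart"]) / duration
    x = unit["Xstart"] + fraction * (unit["Xend"] - unit["Xstart"])
    y = unit["Ystart"] + fraction * (unit["Yend"] - unit["Ystart"])
    return (x, y)


# Invented sample lines: one trip of vehicle 1 made of two movement tuples.
SAMPLE_LINES = [
    "1,1,2007-05-28-06:00:00.000,2007-05-28-06:00:10.000,0.0,0.0,100.0,50.0",
    "1,1,2007-05-28-06:00:10.000,2007-05-28-06:00:20.000,100.0,50.0,100.0,150.0",
]
units = [parse_trip_tuple(line) for line in SAMPLE_LINES]
print(position_at(units[0], datetime(2007, 5, 28, 6, 0, 5)))
```

The same interpolation applies to tuples of Journey, whose additional Licence, Type and Model columns would need to be unpacked as well.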

ESRI Shape format

The data represented is the same as in the CSV format. However, the ESRI Shape file standard requires that standard data be stored in a DBF file and that spatial data be represented using Shape types within a separate SHP file. Joining the DBF and the SHP file then provides the full tables. Dates are exported in the same format as described for CSV files. Spatio-temporal data is also exported as described for the CSV files: the start and end instants are given as varchar attributes Tstart and Tend, and the start and end points are given as coordinate pairs (Xstart, Ystart) and (Xend, Yend), respectively.
Table 3: Shape Table Descriptions.

DBF File            SHP File           Comment
cars.dbf            (none)             Attributes represented as in CSV table dataMcar.
journey.dbf         (none)             Attributes represented as in the corresponding CSV table.
queryinstants.dbf   (none)             Attributes represented as in the corresponding CSV table.
querylicences.dbf   (none)             Attributes represented as in the corresponding CSV table.
queryperiods.dbf    (none)             Attributes represented as in the corresponding CSV table.
querypoints.dbf     querypoints.shp    Attribute Pos is represented by the shape file.
queryregions.dbf    queryregions.shp   Attribute Region is represented by the shape file.
streets.dbf         streets.shp        Attribute geoData is represented by the shape file.
trips.dbf           (none)             Attributes represented as in the corresponding CSV table.