RealDB: Master's project by Jason Winnebeck
Embedded sensor monitoring systems deal with large amounts of live time-sequenced stream data. Such systems require a low-overhead data store that can work within limited resources and run reliably and unattended, even in the face of power faults.
Relational database management systems (RDBMS) are a well-understood and powerful solution capable of storing time-sequenced data; however, many have high overhead, are not sufficiently reliable and maintenance-free, or cannot maintain a hard size limit without substantial added complexity.
RealDB is a specialized solution that capitalizes on the unique attributes of the data stream storage problem to maintain maximum reliability in an unstable environment while significantly reducing overhead from indexing, space allocation, and inter-process communication when compared to a traditional RDBMS-based solution.
In summary, this work aims to provide a reliable, low-overhead, size-bounded data store for live time-sequenced stream data on embedded systems.
The RealDB project final report is tentatively finished and can be downloaded here; the project defense is scheduled for October 5, 2010. The final proposal can be found here (SVN revision 81, 7-Aug-2008).
Target | Planned | Completed | Percent Complete |
---|---|---|---|
Preproposal | 2006-10-17 | 2006-10-17 | 100% |
Preproposal Presentation | 2006-10-17 | 2006-10-17 | 100% |
Proposal Approved | 2008-07-31 | 2008-08-11 | 100% |
Milestone 1 | 2008-08-31 | 2008-08-28 | 100% |
Milestone 2 | 2008-08-31 | 2008-09-29 | 100% (r123) |
Milestone 3 | 2008-09-08 | 2009-01-29 | 100% (r199) |
Milestone 4 | 2008-09-22 | 2010-02-28 | 100% |
Milestone 5 | 2008-10-20 | 2010-04-11 | 100% |
Milestone 6 | 2008-11-10 | 2010-07-27 | 100% |
Milestone 7 | 2008-11-24 | 2010-07-27 | 100% |
Report | 2008-12-15 | 2010-09-19 | 100% |
Defense | 2009-01-15 | 2010-10-05 | presentation 100% |
Latest Release: RealDB 1.0 (SVN revision 504), released on August 22, 2010.
Please note that unfortunately the full dataset used for benchmarking in the final report cannot be published. A very small dataset usable for functionality demonstration is provided, but it is not suitable for benchmarking due to its size.
Javadoc for the realdb-core module (the "library" part) can be browsed online here. You may want to read on how to use RealDB.
RealDB is built with Maven. The outputs are published as a Maven repository at http://www.gillius.org/maven2/. You may download individual javadoc/source/binary jars there (but not the dependencies).
The final report was completed on September 19, 2010, although it may go through minor revision after the defense based on feedback.
Defense presentation: PowerPoint | PDF | Handouts (PDF)
RealDB Project - Copyright (c) 2008-2010 by Jason Winnebeck
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, see <http://www.gnu.org/licenses>.
Additional permission under GNU GPL version 3 section 7
If you modify this Program, or any covered work, by linking or combining it with JUnit (or a modified version of that library), containing parts covered by the terms of Common Public License, 1.0, the licensors of this Program grant you additional permission to convey the resulting work.
The following libraries are used; note that with the Maven build system they are downloaded for you automatically.
For running the main library, realdb-core, the only dependency is the Antlr runtime jar. For the browser and concept tools, JFreeChart is also needed. All other dependencies are used in building or unit tests.
The benchmark program should be able to work with any JDBC driver, assuming the database server and driver configurations are set up properly. The benchmark as run in the final report uses the MySQL and Apache Derby drivers.
This section is very minimalistic and will be expanded in the future.
Download the full release of RealDB; no compilation is needed. There are several ways to use RealDB, described in the sections below.
Note that the current version of RealDB may be unable to work with databases made by versions of RealDB before 1.0.
Apache Maven is used for compilation. To compile RealDB yourself, download a recent version of Maven (2.2 is known to work), and run "mvn install" to locally install the RealDB packages. The resulting source, javadoc, and binary jars for each module will be in the respective target directory.
NetBeans and IntelliJ IDEA IDEs are capable of creating projects from the Maven files. IDEA 9 community edition project files are already generated and exist in the source code.
The RealDB Definition Language (RDL) is used to define RealDB databases. Here is an example RDL file:
```
SET blockSize = 2048
SET fileSize = 204800
SET maxStreams = 3
SET dataBlockSize = 2

CREATE STREAM Test WITH ID 1 {
    value float NULL //will use SampledAlgorithm by default
}

CREATE STREAM CarSnapshots WITH ID 2 {
    rpm float WITH CODEC DeadbandAlgorithm PARAMS (deadband=50.0),
    speed float WITH CODEC DeadbandAlgorithm PARAMS (deadband=5),
    passengers uint8 WITH CODEC StepAlgorithm,
    driving boolean WITH CODEC StepAlgorithm
}
```
This section will be expanded later. Brief descriptions of the concepts used in this file:

- DataStream: a time-ordered sequence of Records with a fixed format -- an ordered list of Elements.
- Record: a single entry in a DataStream; it has a timestamp and can be a "discontinuity" marker.
- Element: a specific field in a Record, with a known name and type.
- Interval: a reconstructed continuous interval of a DataStream that comprises one or more original (possibly dropped) records.
A stream can be in a "discontinuity" state. This is different from elements individually being null, or even all elements being null at the same time. The actual meaning can be defined by the user: for example, a null could mean "not applicable" or "collected but not present", whereas a discontinuity could signify ranges where the system was turned off and not collecting data at all. When a discontinuity is written, the stream remains in that state until the next non-discontinuity record is written.
All records in RealDB are timestamped and written in ascending order of time. The timestamp is a 63-bit unsigned integer. RealDB does not assume any particular unit or meaning for time other than it being linear (the distance between times 5 and 10 is the same as between 50 and 55), leaving the user to define both the unit (e.g., seconds versus milliseconds) and the epoch (the meaning of time 0).
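For example, an application that samples in milliseconds could map wall-clock time onto RealDB timestamps as in this minimal sketch (the epoch constant and unit here are the application's choice, not anything RealDB defines):

```java
public final class Timestamps {
    // Application-defined epoch: 2008-01-01 00:00:00 UTC, in milliseconds.
    // RealDB assigns no meaning to time 0; this choice is entirely ours.
    private static final long EPOCH_MILLIS = 1199145600000L;

    /** Converts wall-clock milliseconds into a database timestamp. */
    public static long toDbTime(long wallClockMillis) {
        long t = wallClockMillis - EPOCH_MILLIS;
        if (t < 0) {
            throw new IllegalArgumentException("time precedes the chosen epoch");
        }
        return t; // always fits in RealDB's 63-bit unsigned timestamp
    }
}
```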
The current command line tool supports the following commands. Use the help command to get more information:

Command | Description |
---|---|
bulkLoad | Bulk loads data into a stream from a tab-separated values file. |
create | Creates a new RealDB Database |
describe | Shows information about a database. |
help | Prints out a list of all commands or detailed help on a single command |
intervals | Reads intervals in a given time range on an element within a stream. |
read | Reads all raw records or a range of raw records from a single stream and outputs a tab-delimited output with the first row as the column headers. |
summary | Reads intervals in a given time range on an element within a stream, and summarizes them into a single report. |
There is a graphical browser tool that can view information about a database and the streams within. See the Running RealDB section on how to start it.
Clear up some instructions about running the "realdb_raw" test in the "Running RealDB" section.
The final report has been posted in the download section.
The final code deliverables are up on the site. Milestones 6 and 7 were completed back in late June and July. The final report is completed except that it needs some proofreading and touchup on reference citations. The slides for the presentation are half complete.
After the defense presentation, the final report and the full set of results will be posted, and this site will be finalized a bit more in content.
The project deliverables are licensed under GPLv3.
Milestones 6 and 7 are mostly completed; what is done is performance testing of the five configurations.
Size management applies significantly only to the non-RealDB configurations, where on every batch of 1000 records I issue a DELETE removing all records older than the first record of the batch by some amount of time (currently 1000 seconds). With no size management on SQL, I issue no deletes. For RealDB, the difference is a 100 MB database versus a 200 MB one.
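As a sketch, the per-batch cleanup is a single portable DELETE along these lines (the samples table and second-resolution timestamps are illustrative assumptions, not the exact benchmark code):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public final class SizeManager {
    /**
     * Called after each batch of 1000 inserts: deletes every record more
     * than windowSeconds older than the first record in the batch.
     */
    public static void purgeOld(Connection conn, long firstBatchTime,
                                long windowSeconds) throws SQLException {
        try (PreparedStatement ps =
                 conn.prepareStatement("DELETE FROM samples WHERE time < ?")) {
            ps.setLong(1, firstBatchTime - windowSeconds);
            ps.executeUpdate();
        }
    }
}
```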
For the SQL databases, I implement the same StepAlgorithm that RealDB uses to eliminate duplicate data points, to put both systems on equal footing. MySQL and Derby are used as they come "out of the box", except that on MySQL I set the JDBC parameter rewriteBatchedStatements=true to turn JDBC batches into MySQL multi-inserts, which offers a huge speedup (it seems to be about 10x, but I did not measure it directly).
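Conceptually, the step filtering amounts to something like the following sketch (not the actual benchmark class): only write a value when it differs from the last one written.

```java
/**
 * Emits a value only when it differs from the previously written one,
 * mirroring the effect of RealDB's StepAlgorithm for step-wise data.
 */
final class StepFilter<T> {
    private T last;
    private boolean hasLast;

    /** Returns true if the value changed and should be inserted. */
    boolean shouldWrite(T value) {
        boolean unchanged = hasLast &&
            (last == null ? value == null : last.equals(value));
        if (unchanged) {
            return false; // duplicate point; skip the insert
        }
        last = value;
        hasLast = true;
        return true;
    }
}
```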
I will describe the dataset in more detail later, but it includes raw data, and the benchmark uses the same techniques and schema as the proof of concept to generate two boolean streams. 9,203,285 records are generated from this.
Disk statistics for the flash memory test are better than for the hard drive test because the OS and database/RealDB binaries reside on the hard drive; for the flash test I put ONLY the data store on the flash, not the programs.
I probably won't graph every metric, but these are what I collect. I am running the benchmark on Linux, where I use /proc/self/stat to get CPU statistics and /sys/block/DEVICE/stat (where DEVICE is the raw disk in use, for example sdc) to get disk statistics.
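Reading those counters is plain file I/O; a sketch (field positions follow the standard Linux documentation for these files):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

final class LinuxStats {
    /** Returns utime+stime (in clock ticks) for this process from /proc/self/stat. */
    static long processCpuTicks() throws IOException {
        String stat = new String(
            Files.readAllBytes(Paths.get("/proc/self/stat")), StandardCharsets.US_ASCII);
        // Skip past the parenthesized command name; utime and stime are
        // fields 14 and 15 overall, i.e. indexes 11 and 12 after field 3.
        String[] f = stat.substring(stat.lastIndexOf(')') + 2).split(" ");
        return Long.parseLong(f[11]) + Long.parseLong(f[12]);
    }

    /** Returns sectors written for a device (e.g. "sdc") from /sys/block/DEVICE/stat. */
    static long sectorsWritten(String device) throws IOException {
        String[] f = new String(
            Files.readAllBytes(Paths.get("/sys/block/" + device + "/stat")),
            StandardCharsets.US_ASCII).trim().split("\\s+");
        return Long.parseLong(f[6]); // 7th field: sectors written
    }
}
```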
To finish this milestone I will meet with my committee to find out what else, if anything, I need to collect. The only major missing component of the benchmark is a reliability test; I have no test right now to quantify that. For example, I know that MyISAM requires a repair after a crash, and that the repair needs free disk space equal to the largest table, but how long does it take? Do Derby and InnoDB recover in constant time? Also, I should prove that RealDB recovers as well, and in "constant" (for a given DB configuration) time.
I also may bundle up a 0.7 release when this is done.
I am looking into how best to emulate RealDB's "circular buffer" behavior in a standard JDBC data store. The problem is that there is no perfect approach, and the best choice differs by database implementation; I considered several options.
Ultimately, I think that any real-world application that needs this type of functionality and chooses a SQL database as a backend will need a database-specific implementation (because portable SQL does not give you a standard way to control size), and will also likely need to compromise on strong guarantees of performance (constant-time startup) and/or size control (even the proprietary methods I reviewed were not sufficient).
Given that there is no single solution that works everywhere and solves the whole problem, I decided that the RealDB benchmark runs on JDBC databases will use the "constant time range" method for size control. This is the simplest to implement and relies only on portable SQL. A real-world application may be able to use this method practically if it has a way of summarizing when data was recorded and can assume some maximum data rate.
The benchmark program is proceeding well. I have written a "generic JDBC" method that writes the data into a single table keyed by a stream ID column and a time column, with one column per element. This allows storage of multiple streams of data sharing the same structure in the same table. I felt that this is the most straightforward mapping of RealDB streams to SQL tables, because in typical relational database design you would not create separate tables with the same structure. I also specified an index on the time column, because in a relational database we need to delete old data by time to make room for new data.
I figured the most performant JDBC method would be to use PreparedStatement with batched execution, 1000 statements at a time. If there is a better and practical method, I am open to suggestions.
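A condensed sketch of that write path follows; the table layout and names are illustrative (the real benchmark generalizes the element columns per stream structure):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

final class JdbcStreamWriter {
    // Illustrative schema: all streams with this structure share one table,
    // with an index on time to support age-based deletes.
    static final String DDL =
        "CREATE TABLE samples (" +
        "  stream INT NOT NULL," +
        "  time   BIGINT NOT NULL," +
        "  value  DOUBLE," +
        "  PRIMARY KEY (stream, time))";
    static final String TIME_INDEX =
        "CREATE INDEX samples_time ON samples (time)";

    private static final int BATCH_SIZE = 1000;

    private final PreparedStatement insert;
    private int pending;

    JdbcStreamWriter(Connection conn) throws SQLException {
        insert = conn.prepareStatement(
            "INSERT INTO samples (stream, time, value) VALUES (?, ?, ?)");
    }

    void write(int stream, long time, double value) throws SQLException {
        insert.setInt(1, stream);
        insert.setLong(2, time);
        insert.setDouble(3, value);
        insert.addBatch();
        if (++pending == BATCH_SIZE) {
            insert.executeBatch(); // one round trip per 1000 statements
            pending = 0;
        }
    }
}
```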
The benchmarking program has a way to specify a DB-specific properties (configuration) file and an SQL script to set up the database. I have tested the program against both MySQL MyISAM and Apache Derby, and I've done preliminary runs on the three database systems with a data simulation that does not perform any deletes yet. The only metrics I have gathered so far are the time to run the simulation and the amount of disk space used. I will need to collect more, such as I/O operations, memory usage, and the ability of the database to recover from a system fault.
To complete milestones 5 and 6, several tasks remain.
I completed the new transaction system described below a few weeks back, and I have finished the proof of concept for milestone 5. While the new transaction system achieved my main goal of a constant upper bound on time and storage space for a given database configuration (linear in the number of streams), I could not get it to reduce writes as much as I wanted. In the end I decided to move on and leave the optimization as future work. Essentially, only a single transaction can be outstanding at a time, which reduces performance (although it is still linear in the data). Since a transaction is only needed when committing a data block (not a data point), the larger the data block, the lower the overhead. My eventual goal is to allow overlapping transactions, so that single writes to disk can commit multiple modifications.
The proof of concept for milestone 5 fairly closely follows what was specified in the proposal. I have provided a data file with the following streams of data: RPM, FuelRate, AccelPedalPosition, BatteryVolts, FuelEcon, and Speed. For the small file included with the release, it is just fake generated data. The VehicleDataSimulation program in the realdb-concept module also generates two additional streams: VehicleStatus and HardAccelEvent. VehicleStatus denotes, via a boolean field, when vehicle data was and was not being collected. The HardAccelEvent stream denotes an event that is set to "true" at the start of 10 or more continuous seconds of accelerator input of 90%. The analysis is to be able to examine critical vehicle parameters during hard acceleration. An interface is presented to the user that allows them to select an incident (an instance of the event) and show a graph of data from all of the other streams. A "smarter" event function than hard acceleration would certainly be possible, for example to detect anomalous voltages on the battery or abnormally low fuel economy; but since the focus of this project is on the database and not diagnostic algorithms, I chose something simple that I knew would occur in the data.
With the milestone 5 release also comes a simple read-only GUI browser tool for generic databases. The proof-of-concept reuses components of this browser.
I believe I will be able to use the basis of the proof-of-concept code and database definition that loads databases as a strong foundation for milestones 6 and 7.
I am still working towards milestone 5, but I wanted to give an update on the new transaction system I am also working on. The current system is naive and scales with the amount of data in the database. The new (and final) system scales with the number of data streams in the database, which should be relatively low and "fixed." I've finished the design (unless I find problems) and am partway through implementation.
Old transaction logging method: Recording a transfer of a data block:
The transaction log contained only the active transactions and nothing more, and it was completely rewritten every time it was updated. A transaction object in Java recorded the sequence of steps so that a flush was delayed as long as possible, and when any step in the sequence was flushed, all steps before it were flushed.
The recovery process was not appropriate, because it was linear in the size of the database. I wasn't sure how to safely examine the indexes to determine whether an operation had happened, so I scanned the entire affected indexes.
The recovery was done in a manner such that even if I/O failed during recovery, you could retry recovery later. This process works correctly, but it does not scale, so I need a new transaction recovery method.
I spent some time reading the book Database Design and Implementation by Edward Sciore, which I borrowed. I wanted to get some ideas specifically from the transaction management chapter (14). The idea in the book is much simpler than mine: basically, keep a forward-only log containing the modifications (block ID, offset, old data, new data) to physical blocks on disk. The recovery manager does not need to inspect anything; for rollbacks it writes the old data back to the blocks, and for commits it writes the new data. It might rewrite something that was already done, but it always ends up with the right result.
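A sketch of what one such log record carries in that scheme (my paraphrase of the book's idea, not RealDB code): everything needed to redo or undo the change blindly.

```java
/**
 * Physical-logging record in the style of Sciore's recovery chapter.
 * Recovery can apply it without reading anything else: writing newData
 * (redo) or oldData (undo) twice yields the same final block contents.
 */
final class SetBytesLogRecord {
    final long txId;      // owning transaction
    final int blockId;    // which disk block was modified
    final int offset;     // byte offset within the block
    final byte[] oldData; // previous contents, for rollback
    final byte[] newData; // new contents, for commit/redo

    SetBytesLogRecord(long txId, int blockId, int offset,
                      byte[] oldData, byte[] newData) {
        this.txId = txId;
        this.blockId = blockId;
        this.offset = offset;
        this.oldData = oldData.clone();
        this.newData = newData.clone();
    }
}
```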
The method from Sciore's book is very simple and elegant, but it has a few issues for my use case.
But I want to preserve some fundamental properties of the idea in Sciore's book, in particular that recovery can blindly replay the log without inspecting the database, because replaying an already-applied operation still ends with the correct result.
After much thought, I realized I can use a key ingredient of my database to ensure idempotency without huge overhead: the index sequence number. Given that all operations are ordered and that I have a reliable method of incrementing the sequence number, if I record the sequence number in my log, I can determine whether an operation has completed simply by comparing sequence numbers. This way I can log "deltas", which are smaller than logging a whole block's state, without fear of getting out of sync. The next key step is to break the transfer apart into two logged operations: remove and add.
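A sketch of the comparison at the heart of this (the names are mine, not the actual RealDB classes): because sequence numbers only ever increase, an index whose stored sequence number is at least the logged one has already applied the operation.

```java
/** Decides during recovery whether a logged index operation must be replayed. */
final class RecoveryCheck {
    /**
     * @param loggedSeq sequence number recorded in the Tx log entry
     * @param indexSeq  sequence number currently stored in the on-disk index
     * @return true if the operation never reached the index and must be redone
     */
    static boolean mustReplay(long loggedSeq, long indexSeq) {
        // indexSeq >= loggedSeq means the index absorbed this operation
        // before the crash; replaying would be redundant. Otherwise redo it.
        return indexSeq < loggedSeq;
    }
}
```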
Tx log entry types: remove block and add block (a transfer is logged as a remove from the source plus an add to the destination).
Tx log memory model:
Transaction Log Size: Given that a stream can only be writing a single block at a time, we know that at most that many transactions can be pending. Therefore we can size the log to the maximum number of streams times the size of the remove and add entries, and know that we will never truncate an open transaction (although I keep track, to assert that!).
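In code form, the bound is a one-liner (the entry sizes here are placeholders for the actual encoded sizes):

```java
final class TxLogSizing {
    /**
     * Worst case: every stream has one transfer in flight, each needing
     * one remove entry and one add entry in the log.
     */
    static long maxTxLogBytes(int maxStreams, int removeEntrySize, int addEntrySize) {
        return (long) maxStreams * (removeEntrySize + addEntrySize);
    }
}
```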
Tx free pool to index:
Step | If step fails, after recovery |
---|---|
return next data block number | |
Queue Tx: remove block | |
(Optional) flush log | block stays free |
Stream: Write data block to disk | block stays free |
Queue Tx: add block to index | |
flush log | block was written but will be counted as free |
flush index | replay will redo add to index |
I'm not totally sure whether or not I need the "remove block" entry for this operation -- I think I do if I allocate free pool blocks 2, 3, and 4 and block 4 gets committed before 2 and 3, but this situation might not be possible if I delay the allocation internally until I need to commit.
Tx index to index:
Step | If step fails, after recovery |
---|---|
Find source index to get block | |
Queue Tx: remove block | |
Queue source index: remove block | |
(A) flush log | block stays allocated to source |
(depends A) flush source index | will be replayed |
(B depends A) flush data block | block enters free queue |
(C depends B) Queue Tx: add block | |
(D depends B) Queue dest index: add block | |
(E depends C) flush log | block enters free queue |
(depends E, D) flush dest index | will be replayed |
Note that the order of the above steps can change somewhat and still have the same effect (note the dependencies).
Recovery Process: Read the Tx log from the beginning and, for each entry, compare its sequence number against the affected index's sequence number to decide whether the operation must be replayed (as in the sketch above).
Milestone 4 release. I took a little extra time since my previous report to make the import/export file formats more robust in handling discontinuities, and I realized that I had missed some API support for discontinuities in StreamIntervals.
Refer to the instructions above for how to download and run RealDB. For this release, you can run these commands:
```
create examples/test.rdl test.rdb
describe test.rdb
bulkLoad test.rdb 1 examples/test.txt
bulkLoad test.rdb 2 examples/CarSnapshots.txt
read test.rdb 1 10 20
read test.rdb 2
intervals test.rdb 2 rpm 50 70
summary test.rdb 2 passengers
```
To get the output below:
```
% java -jar RealDB-cli.jar
RealDB Tool Interactive mode
This version is extremely experimental and the commands will change in the next release.
You can also invoke this program with arguments to run a single command non-interactively.
Use 'help' command to get a list of commands, 'exit' to end
> create examples/test.rdl test.rdb
Created database at test.rdb
> describe test.rdb
Database /home/stu4/s30/jpw9607/realdb/realdb-2010-02-28-r318/test.rdb:
Block size     : 2048
Database size  : 100 blocks (204800 bytes)
Data block size: 2
Database has 2 streams:
Test (1), with 1 elements:
  0 value (Float, optional) with codec SampledAlgorithm
  Stream contains records from times -1 to -1
CarSnapshots (2), with 4 elements:
  0 rpm (Float) with codec DeadbandAlgorithm
  1 speed (Float) with codec DeadbandAlgorithm
  2 passengers (UInt8) with codec StepAlgorithm
  3 driving (Boolean) with codec StepAlgorithm
  Stream contains records from times -1 to -1
> bulkLoad test.rdb 1 examples/test.txt
Wrote 35 records into stream Test (ID 1)
> bulkLoad test.rdb 2 examples/CarSnapshots.txt
Wrote 30 records into stream CarSnapshots (ID 2)
> read test.rdb 1 10 20
stream  time  value
1       10    null
1       11    11.0
1       12    12.0
1       13    13.0
1       14    14.0
1       15    15.0
1       --    --
1       17    null
1       18    18.0
1       19    19.0
1       20    20.0
Read 11 records.
> read test.rdb 2
stream  time  rpm     speed  passengers  driving
2       5     0.0     0.0    0           false
2       20    0.0     0.0    1           false
2       35    0.0     0.0    2           false
2       45    0.0     0.0    3           false
2       50    500.0   0.0    3           true
2       65    600.0   0.0    3           true
2       70    950.0   5.0    3           true
2       75    1250.0  10.0   3           true
2       80    1500.0  20.0   3           true
2       85    700.0   25.0   2           true
2       90    850.0   30.0   2           true
2       95    1000.0  35.0   2           true
2       100   1200.0  40.0   2           true
2       105   1400.0  45.0   2           true
2       120   0.0     0.0    2           false
2       135   0.0     0.0    0           false
Read 16 records.
> intervals test.rdb 2 rpm 50 70
stream  element  start  end  min    avg    max    integral
2       rpm      50     65   500.0  500.0  500.0  7500.0
2       rpm      65     70   600.0  600.0  600.0  3000.0
2       rpm      70     75   950.0  950.0  950.0  4750.0
Read 3 intervals.
> summary test.rdb 2 passengers
stream  element     count  start  end  min  avg                  max  integral
2       passengers  16     5      135  0.0  0.25384615384615383  3.0  255.0
```
I am almost completely done with milestone 4, and a release will be coming this week. Since the last progress report, I have completed all of the "next steps" from that report.
I'm just polishing up a few things for the release.
Here is an example RDL that I will be using for this run:
```
SET blockSize = 2048
SET fileSize = 204800
SET maxStreams = 3
SET dataBlockSize = 2

CREATE STREAM Test WITH ID 1 {
    value float //will use SampledAlgorithm by default
}

CREATE STREAM CarSnapshots WITH ID 2 {
    rpm float WITH CODEC DeadbandAlgorithm PARAMS (deadband=50.0),
    speed float WITH CODEC DeadbandAlgorithm PARAMS (deadband=5),
    passengers uint8 WITH CODEC StepAlgorithm,
    driving boolean WITH CODEC StepAlgorithm
}
```
There is one caveat to the completion of milestone 4: the current transaction handling, while functioning correctly (as far as I know), does not meet the scalability requirements in my proposal, because the current implementation is naive. It is linear in the size of the database, when it should be linear in the number of uncommitted transactions, up to the maximum user-specified transaction log size.
The command line tool (org.gillius.realdb.tools.cli.RealDBTool) will provide at least a basic set of commands for creating, loading, and reading databases (see the command table in the Using RealDB section above).
See OldReports