CoGaDB: Column-oriented GPU-accelerated DBMS
Description
CoGaDB is a prototype of a column-oriented, GPU-accelerated database management system developed at the University of Magdeburg. Its purpose is to investigate advanced coprocessing techniques for effective GPU utilization during database query processing. It uses our hybrid query processing engine (HyPE) for physical optimization.
Overview
CoGaDB's main purpose is to investigate a GPU-aware database architecture that achieves optimal DBMS performance on hybrid CPU/GPU platforms. We are currently working on an architecture proposal and try to benefit from past experience with hybrid CPU/GPU DBMSs. To that end, CoGaDB provides an extensible architecture that lets researchers easily integrate their own GPU-accelerated operators, coprocessing techniques, and query optimization heuristics. Note that CoGaDB assumes that the complete database can be kept in main memory, because GPU acceleration is not beneficial for workloads where disk I/O is the dominating factor.
Features
Currently, CoGaDB implements the following features:
- Written mainly in C++ and CUDA C
- Column-oriented in-memory database management system
- SQL Interface
- CPU and GPU operators for selection, sort, and simple aggregations, using optimized parallel algorithms from the Intel® TBB (CPU) and Thrust (GPU) libraries; CPU-only operators for projections and joins
- Uses HyPE, our hybrid query processing engine, for physical optimization and query processing
- Capable of data compression:
  - Run Length Encoding
  - Bit Vector Encoding
  - Dictionary Compression
  - Delta Coding
- NEW: Single Instruction Multiple Data (SIMD) selection operator
- NEW: Supports filtering of strings on the GPU
- NEW: Support for primary key and foreign key integrity constraints
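Three of the compression schemes named in the feature list can be illustrated with a minimal C++ sketch on plain in-memory columns. This is not CoGaDB's implementation: the function names (`rle_encode`, `delta_encode`, `dict_encode`, and their decoders) and the `DictionaryColumn` struct are hypothetical and chosen only for illustration. Bit vector encoding is omitted for brevity, and the delta coder assumes that differences between neighboring values fit into 32 bits.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Run length encoding: each run of equal values becomes (value, run length).
std::vector<std::pair<int32_t, uint32_t>> rle_encode(const std::vector<int32_t>& column) {
    std::vector<std::pair<int32_t, uint32_t>> runs;
    for (int32_t v : column) {
        if (!runs.empty() && runs.back().first == v) {
            ++runs.back().second;      // extend the current run
        } else {
            runs.emplace_back(v, 1u);  // start a new run
        }
    }
    return runs;
}

std::vector<int32_t> rle_decode(const std::vector<std::pair<int32_t, uint32_t>>& runs) {
    std::vector<int32_t> column;
    for (const auto& run : runs) {
        column.insert(column.end(), run.second, run.first);  // repeat value run.second times
    }
    return column;
}

// Delta coding: store the first value, then the difference to the predecessor.
// Assumption: differences do not overflow int32_t.
std::vector<int32_t> delta_encode(const std::vector<int32_t>& column) {
    std::vector<int32_t> deltas;
    deltas.reserve(column.size());
    int32_t prev = 0;
    for (int32_t v : column) {
        deltas.push_back(v - prev);
        prev = v;
    }
    return deltas;
}

std::vector<int32_t> delta_decode(const std::vector<int32_t>& deltas) {
    std::vector<int32_t> column;
    column.reserve(deltas.size());
    int32_t current = 0;
    for (int32_t d : deltas) {
        current += d;  // running sum restores the original value
        column.push_back(current);
    }
    return column;
}

// Dictionary compression: distinct strings are stored once, rows hold integer codes.
struct DictionaryColumn {
    std::vector<std::string> dictionary;  // distinct values in order of first appearance
    std::vector<uint32_t> codes;          // one dictionary index per row
};

DictionaryColumn dict_encode(const std::vector<std::string>& column) {
    DictionaryColumn out;
    std::unordered_map<std::string, uint32_t> index;
    for (const auto& s : column) {
        auto it = index.find(s);
        if (it == index.end()) {
            it = index.emplace(s, static_cast<uint32_t>(out.dictionary.size())).first;
            out.dictionary.push_back(s);
        }
        out.codes.push_back(it->second);
    }
    return out;
}

std::vector<std::string> dict_decode(const DictionaryColumn& c) {
    std::vector<std::string> column;
    column.reserve(c.codes.size());
    for (uint32_t code : c.codes) {
        assert(code < c.dictionary.size());  // every code must reference an entry
        column.push_back(c.dictionary[code]);
    }
    return column;
}
```

Run length encoding pays off on columns with long runs of equal values, delta coding on sorted or slowly changing numeric columns, and dictionary compression on string columns with few distinct values.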
Download
We regularly release new versions of CoGaDB here. CoGaDB is released under the GPL v3 license. You can find installation instructions, tutorials, and documentation of the current version here.
Current Release
Older Releases
Contact
CoGaDB is developed mainly at the University of Magdeburg, Germany. It is open source, and we are currently working on the first release so that everybody who is interested can extend and improve it. For information about the project, technical questions, and bug reports, please contact the development team via Sebastian Breß.
Project members:
- Sebastian Breß (University of Magdeburg)
- David Broneske (University of Magdeburg)
- Tobias Lauer (Jedox AG)
- Christian Nywelt (University of Magdeburg)
- Gunter Saake (University of Magdeburg)
- Norbert Siegmund (University of Passau)
- Jens Teubner (TU Dortmund University)
- Darius Brückers (contributed Compression Technique: Run Length Encoding)
- Sebastian Krieter (contributed Compression Technique: Delta Coding)
- Steffen Schulze (contributed Compression Technique: Bit Vector Encoding)
- Ladjel Bellatreche (LIAS/ISEA-ENSMA, Futuroscope, France)
- Robin Haberkorn (University of Magdeburg)
- René Hoyer (University of Magdeburg)
- Steven Ladewig (University of Magdeburg)
- Manh Lan Nguyen (University of Magdeburg)
- Patrick Sulkowski (University of Magdeburg)
Project Publications
- Sebastian Breß. Ein selbstlernendes Entscheidungsmodell für die Verteilung von Datenbankoperationen auf CPU/GPU-Systemen. Master thesis, University of Magdeburg, Germany, March 2012. In German.
- Sebastian Breß, Siba Mohammad, and Eike Schallehn. Self-Tuning Distribution of DB-Operations on Hybrid CPU/GPU Platforms. In Proceedings of the 24th Workshop Grundlagen von Datenbanken (GvD), pages 89–94. CEUR-WS, 2012.
- Sebastian Breß, Eike Schallehn, and Ingolf Geist. Towards Optimization of Hybrid CPU/GPU Query Plans in Database Systems. In Second ADBIS workshop on GPUs In Databases (GID), pages 27–35. Springer, 2012.
- Sebastian Breß, Felix Beier, Hannes Rauhe, Eike Schallehn, Kai-Uwe Sattler, and Gunter Saake. Automatic Selection of Processing Units for Coprocessing in Databases. In 16th East-European Conference on Advances in Databases and Information Systems (ADBIS), pages 57–70. Springer, 2012.
- Sebastian Breß, Ingolf Geist, Eike Schallehn, Maik Mory, and Gunter Saake. A Framework for Cost based Optimization of Hybrid CPU/GPU Query Plans in Database Systems. Control and Cybernetics, 41(4):715–742, 2012.
- Sebastian Breß, Stefan Kiltz, and Martin Schäler. Forensics on GPU Coprocessing in Databases – Research Challenges, First Experiments, and Countermeasures. In Workshop on Databases in Biometrics, Forensics and Security Applications (DBforBFS), BTW-Workshops, pages 115–130. Köllen-Verlag, 2013.
- Sebastian Breß, Felix Beier, Hannes Rauhe, Kai-Uwe Sattler, Eike Schallehn, and Gunter Saake. Efficient Co-Processor Utilization in Database Query Processing. Information Systems, 38(8):1084–1096, 2013. http://dx.doi.org/10.1016/j.is.2013.05.004.
- Sebastian Breß. Why it is Time for a HyPE: A Hybrid Query Processing Engine for Efficient GPU Coprocessing in DBMS. In The VLDB PhD workshop. VLDB Endowment, 2013.
- Sebastian Breß, Norbert Siegmund, Ladjel Bellatreche, and Gunter Saake. An Operator-Stream-based Scheduling Engine for Effective GPU Coprocessing. In 17th East-European Conference on Advances in Databases and Information Systems (ADBIS), pages 288–301. Springer, 2013.
- Sebastian Breß, Max Heimel, Norbert Siegmund, Ladjel Bellatreche, and Gunter Saake. Exploring the Design Space of a GPU-aware Database Architecture. In ADBIS workshop on GPUs In Databases (GID), pages 225–234. Springer, 2013.
- Sebastian Breß, Max Heimel, Michael Saecker, Bastian Köcher, Volker Markl, and Gunter Saake. Ocelot/HyPE: Optimized Data Processing on Heterogeneous Hardware. PVLDB, 7(13), 2014.
- Sebastian Breß, Norbert Siegmund, Max Heimel, Michael Saecker, Tobias Lauer, Ladjel Bellatreche, and Gunter Saake. Load-Aware Inter-Co-Processor Parallelism in Database Query Processing. Data & Knowledge Engineering, 2014. doi: 10.1016/j.datak.2014.07.003.
- Sebastian Breß. The Design and Implementation of CoGaDB: A Column-oriented GPU-accelerated DBMS. Datenbank-Spektrum, 2014. To appear.
Copyright Notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.