PL/CUDA - The Fusion of HPC grade power with In-Database Analytics by KaiGai Kohei
Wednesday, November 16 at 10:20-11:15
GPUs are a key component for tackling highly compute-intensive workloads. They have more than ten years of history in the HPC field, and the latest advances in semiconductor technology now deliver several thousand processing cores and teraflops-class computing power in a consumer chip.
This talk will introduce how PostgreSQL can use GPU computing capability for in-database analytics, present two case studies in the drug-discovery and data-mining areas, and outline future technology directions.
A database system is an ideal place to process analytic workloads for several reasons: the short distance between storage and processors, flexible data handling before and after the heavy calculations, data integrity and consistency enforced by schema definitions, and so on.
Therefore, we expect that bringing HPC-grade power into database systems may become a game changer in the analytics area.
The major topic of this talk is PL/CUDA, a new feature of PG-Strom.
PG-Strom is a PostgreSQL extension that transparently off-loads various CPU-intensive SQL workloads, such as joins, aggregation, and projection, to GPU devices. The presenter has developed and improved it for four years.
PL/CUDA lets users embed specially tuned GPU code for their own calculations and algorithms, using the infrastructure of user-defined SQL functions.
We will present two case studies implemented using PL/CUDA. One is a similarity search in the drug-discovery area, and the other is unsupervised classification (k-means clustering) in the data-mining area. They show the advantages of adopting GPUs within a database system for analytics.
This talk also covers deeper integration with PostgreSQL in future versions, such as sparse-matrix support, CPU-GPU hybrid parallel execution, and so on.
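For readers unfamiliar with k-means clustering, the following is a minimal CPU-side sketch in Python of the two steps the algorithm alternates between. It is not PL/CUDA code and not the presenter's implementation; the function names (`assign_clusters`, `update_centroids`, `kmeans`) and the sample data are hypothetical, chosen only for illustration. The assignment step is computed independently for every point, which is why this kind of workload is a natural fit for the many cores of a GPU.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def assign_clusters(points, centroids):
    """Assignment step: label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
        for p in points
    ]


def update_centroids(points, labels, centroids):
    """Update step: move each centroid to the mean of the points assigned to it."""
    new_centroids = []
    for c, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == c]
        if not members:
            # Assumption: an empty cluster simply keeps its previous centroid.
            new_centroids.append(old)
            continue
        new_centroids.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(len(old)))
        )
    return new_centroids


def kmeans(points, initial_centroids, max_iter=100):
    """Alternate the two steps until the labels stop changing."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = assign_clusters(points, centroids)
        if new_labels == labels:
            break  # converged: no point changed cluster
        labels = new_labels
        centroids = update_centroids(points, labels, centroids)
    return labels, centroids


# Hypothetical sample data: two well-separated groups of 2-D points.
sample_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
sample_labels, sample_centroids = kmeans(sample_points, [(0.0, 0.0), (10.0, 10.0)])
```

In this sketch the initial centroids are passed in explicitly so the result is deterministic; real implementations usually pick them randomly or with a seeding heuristic.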
About the Speaker
KaiGai has about 10 years of experience in core PostgreSQL development and has contributed various security features, FDW enhancements, the custom-scan interface, and so on.
He also launched the PG-Strom project four years ago. This OSS extension tries to off-load CPU-intensive workloads to GPU devices, bringing the latest results of semiconductor evolution to existing database systems.
Nowadays, he leads this open source project and also works at an in-company startup that tries to leverage this technology.
The expected audience is application developers who are interested in data analytics with PostgreSQL. No special knowledge of GPUs is required.