Introduction

Kubedoop Data Platform is a modular, Kubernetes-native data platform. With Kubedoop, users can quickly and easily deploy data and machine-learning infrastructure to meet DataOps and MLOps requirements.

Kubedoop includes mainstream data processing components such as HDFS, Hive, Kafka, and Superset, and supports data lakes and real-time data warehouses, meeting the needs of migrating from traditional Hadoop platforms to Kubernetes.

Built on Kubernetes Operator technology, Kubedoop automates the lifecycle of data processing tasks, including creation, startup, monitoring, scheduling, restart, and scaling. Users only need to define tasks in simple configuration files; Kubedoop automatically deploys them to the Kubernetes cluster and manages them throughout their lifecycle.
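
To make the declarative model concrete, below is a minimal sketch of what such a configuration file could look like, assuming a hypothetical KafkaCluster custom resource. The API group `kafka.kubedoop.dev/v1alpha1`, the kind, and all spec field names are illustrative assumptions rather than the exact Kubedoop schema; consult each operator's documentation for the actual custom resource definitions.

```yaml
# Hypothetical example: the API group, kind, and spec fields below are
# assumptions for illustration, not the authoritative Kubedoop schema.
apiVersion: kafka.kubedoop.dev/v1alpha1
kind: KafkaCluster
metadata:
  name: simple-kafka
  namespace: data-platform
spec:
  image:
    productVersion: "3.7.1"          # assumed field: Kafka version to deploy
  clusterConfig:
    zookeeperConfigMapName: simple-zk # assumed field: ZooKeeper discovery ConfigMap
  brokers:
    roleGroups:
      default:
        replicas: 3                   # assumed field: number of broker pods
```

Applying such a resource (for example with `kubectl apply -f kafka-cluster.yaml`) hands it to the corresponding operator, which creates the underlying Kubernetes objects and keeps them reconciled with the declared spec, including restarts and scaling.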

Components

Kubedoop provides two groups of operators:

- Kubedoop Product Operators
- Built-in Kubedoop Operators

Contributing

If you would like to contribute to Kubedoop, please refer to our contribution guide for more information. We welcome contributions of all kinds, including but not limited to code, documentation, and use cases.