Apache Zeppelin is a web-based notebook for interactive data ingestion, discovery, analytics, visualization, and collaboration. It gives data scientists and engineers a single interactive environment in which to write, run, and share analyses.
Key Features:
- Multiple Language Backends: Zeppelin's interpreter concept allows any language or data-processing backend to be plugged in. Built-in interpreters include Apache Spark, Apache Flink, Python, R, JDBC, Markdown, and Shell, and more can be added.
- Apache Spark Integration: Built-in Spark integration provides automatic SparkContext and SQLContext injection, runtime JAR dependency loading from the local filesystem or Maven repositories, job cancellation, and job progress display.
- Data Visualization: Basic charts are included, along with a pivot chart that aggregates values and builds charts through drag-and-drop. Supported aggregations are sum, count, average, min, and max. Custom display systems and an Angular API are available for advanced visualizations.
- Dynamic Forms: Zeppelin can dynamically generate input forms inside a notebook, so that an analysis can be parameterized interactively.
- Collaboration: A notebook's URL can be shared among collaborators, and changes appear in real time, similar to collaborative editing in Google Docs. Zeppelin also provides a publishable URL that displays only the results, which can be embedded in other websites as an iframe.
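As a rough illustration of the aggregations the pivot chart offers, the sketch below groups rows by one field and applies sum, count, average, min, or max to another. It then formats the result in the tab-separated `%table` form that Zeppelin's display system renders as a table. The helper names `pivot` and `to_table` and the sample data are hypothetical; this is plain Python meant to show the idea, not Zeppelin's own implementation.

```python
from collections import defaultdict

# Aggregations named in the feature list above.
AGGREGATORS = {
    "sum": sum,
    "count": len,
    "avg": lambda xs: sum(xs) / len(xs),
    "min": min,
    "max": max,
}


def pivot(rows, key, value, agg="sum"):
    """Group `rows` by `key` and aggregate `value` (hypothetical helper)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    reduce_fn = AGGREGATORS[agg]
    # Dicts preserve insertion order, so groups appear in first-seen order.
    return {k: reduce_fn(v) for k, v in groups.items()}


def to_table(result, key_header, value_header):
    """Format a {key: value} result as %table text (tab-separated columns,
    newline-separated rows, header row first)."""
    lines = [f"{key_header}\t{value_header}"]
    lines += [f"{k}\t{v}" for k, v in result.items()]
    return "%table " + "\n".join(lines)


# Made-up sample data for illustration only.
rows = [
    {"city": "Seoul", "sales": 10},
    {"city": "Seoul", "sales": 30},
    {"city": "Busan", "sales": 5},
]

print(to_table(pivot(rows, "city", "sales", "sum"), "city", "total_sales"))
```

Printed from a `%python` paragraph, output in this form should be picked up by the display system and shown as a table rather than as raw text.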
Deployments:
- Single User: Runs against a local Spark environment and includes six built-in visualizations, a display system, dynamic forms, and support for multiple backends.
- Multi-User: Supports multiple users through LDAP integration and can be configured for YARN clusters, so that resources and access are managed securely.
What's New (Apache Zeppelin 0.11):
- Java 11: Zeppelin 0.11 is built with Java 11, which is the recommended Java version for running the application.
- Spark and Flink: Supports the latest versions of Apache Spark and Apache Flink, so notebooks can use the newest features of both frameworks.
- Python 3: Python 3.9 is set as the default version for the Python interpreter.
Apache Zeppelin is released under the Apache License 2.0 and is developed by an active community that welcomes contributions.

