Cloudera launches the Cloudera Data Platform: a Cloud platform entirely dedicated to data. It enables Data Warehousing, data management and analysis, but also data exploitation for Machine Learning. Find out everything you need to know about this platform, its different functionalities, and the main strong points that distinguish it from its competition…
In October 2018, Cloudera and Hortonworks announced their merger. A few months later, in March 2019, the two companies unveiled the fruit of their alliance: the Cloudera Data Platform, the first Enterprise Data Cloud. At the annual Cloudera Strata event in New York this week, the CDP was finally launched.
Cloudera Data Platform: What is it?
The Cloudera Data Platform enables enterprises to analyze data using self-service analytical tools in hybrid and multi-cloud environments. It allows companies to create Data Lakes on the Cloud in a matter of hours in order to enhance the value of all their data.
New Cloud services integrated into the platform also offer self-service access to data and analytical functions for analysts, data scientists, IT professionals and developers. Three new Cloud-Native services for leveraging enterprise data have already been unveiled by Cloudera.
Cloudera Data Warehouse
First of all, the Cloudera Data Warehouse service allows you to easily deploy self-service data warehouses for analysts. In particular, it will enable the rapid transfer of on-premises workloads to the Cloud, thanks to its ability to ingest large-scale data from structured or unstructured sources.
This Cloud Data Warehouse is based on powerful, enterprise-tested SQL engines. Hundreds of users will be able to access the data with a single click, both on-premises and in the Cloud. Resources can be scaled on demand, and workloads can be managed on a self-service basis using intuitive tools.
Cloudera Machine Learning
The Cloudera Machine Learning service, on the other hand, makes it easy to deploy collaborative Machine Learning workspaces so that Data Scientists can exploit corporate data. Using this tool, companies can deploy new workspaces with just a few clicks, giving self-service access to the tools and data needed for machine learning workflows.
On-premises or cloud-based data replication can be done very quickly without compromising security and governance. In addition, Data Scientists can choose their own tools while benefiting from elastic resources that can be adapted to their needs. Model training, data engineering, deployment and model management are easily performed via an all-in-one interface, avoiding the need to switch from one platform to another.
Cloudera Data Hub
The Cloudera Data Hub is a Cloud-Native data management and analytics service. It enables IT and business developers to build business applications for any scenario more quickly.
Again, self-service access to company data is at the core of this service. For all three CDP Cloud-Native services presented by Cloudera, self-service, multi-cloud and security are the key words that allow the platform to stand out from its competition…
What are the advantages of the Cloudera Data Platform over the competition?
Because of its multi-functionality, the Cloudera Data Platform competes with many services. First of all, it competes with several services of the major Cloud providers. These include several components of the AWS portfolio such as SageMaker, Redshift or EMR. The same goes for BigQuery at Google Cloud or Microsoft SQL on Azure.
Similarly, the CDP aspires to replace several services from third-party vendors such as Databricks, Snowflake, Elasticsearch, MongoDB or Confluent. According to Fred Koopmans, VP of the company, while some of these specialized services will be more effective in their field, none can boast the versatility of the Cloudera Data Platform. Cloudera offers an all-in-one platform for Big Data.
The CDP offers multiple features to facilitate and accelerate the deployment of the main types of applications. Five new self-service experiences are offered: flow & streaming, data engineering, data warehouse, operational database and Machine Learning.
In addition to this multi-function character, the CDP stands out for its multi-cloud aspect. Data can be managed, analyzed and used on-premise, in Hybrid or Private Cloud, and in the environments of leading Public Cloud providers. Currently, only Amazon Web Services Public Cloud is supported, but Microsoft Azure compatibility is expected by the end of 2019. Google Cloud compatibility is expected to be achieved in the course of 2020. This feature offers increased flexibility for businesses.
The other strong point of the Cloudera Data Platform is the emphasis on data security and governance. Its SDX (Shared Data Experience) technologies ensure data security, privacy and compliance on any Cloud. With this tool, companies can create secure data lakes in just a few hours. In addition, scripting is replaced by an intuitive configuration interface.
Finally, CDP’s other major advantage over its competition is its 100% open source approach to compute, storage and integration. Users and third-party developers will therefore be able to contribute to the improvement of the platform, ensuring rapid and continuous innovation. The platform is already being used by several large groups such as Accenture, IBM, Globe Telecom and GlaxoSmithKline (GSK), which are reportedly very satisfied with it.
Prices and availability
The Cloudera Data Platform is now available internationally. It is already compatible with Amazon Web Services, and compatibility with Microsoft Azure and Google Cloud will be added in the coming months.
The basic fee for the platform and its different variations starts at $10,000. The new Cloud-Native services will be billed on an hourly basis, with the Data Warehouse service charged at 72 cents per hour, the Machine Learning service at 68 cents per hour, and the Data Hub at 24 cents per hour. You can consult all rates on the official Cloudera website.
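To get a rough sense of what hourly billing means in practice, here is a minimal Python sketch that adds up usage charges from the three rates quoted above. The function name, the service keys and the sample hours are hypothetical, and the sketch assumes each rate applies to one billed hour of a single unit of the service; the article does not specify the billing unit, and the $10,000 base fee is left out.

```python
# Hourly rates quoted in the article, in US cents (integers avoid rounding drift).
HOURLY_RATE_CENTS = {
    "data_warehouse": 72,
    "machine_learning": 68,
    "data_hub": 24,
}


def estimate_usage_cost(hours_by_service):
    """Return the estimated usage charge in US dollars.

    hours_by_service maps a service key to the number of billed hours.
    Assumption: the rate applies per hour of one billed unit; the base
    platform fee is not included.
    """
    total_cents = 0
    for service, hours in hours_by_service.items():
        if service not in HOURLY_RATE_CENTS:
            raise KeyError(f"unknown service: {service}")
        if hours < 0:
            raise ValueError("hours must be non-negative")
        total_cents += HOURLY_RATE_CENTS[service] * hours
    return total_cents / 100


# Hypothetical month of usage, for illustration only.
example_usage = {"data_warehouse": 100, "machine_learning": 50, "data_hub": 200}
print(f"Estimated usage charge: ${estimate_usage_cost(example_usage):.2f}")
```

With these sample figures, the Data Warehouse hours account for $72, Machine Learning for $34 and the Data Hub for $48, so the usage charge comes to $154 before any base fee.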