With the new wave of data platforms, many early-adopter organizations are questioning their original direction and trying to determine how to stay current without overspending, becoming organizationally distracted, or ending up in a technological dead end. Making these decisions purely on the technical merits of the platforms, without a formal total cost of ownership (TCO) analysis, may not yield the right answer. Elevondata conducted an in-depth analysis of data platforms for a media client, which favored an enterprise-scale rollout of Databricks, although certain areas of the company were already using Cloudera.
Both Cloudera and Databricks can handle streaming real-time data science applications, such as the “tenants” listed in the document. Both can also support building a data lake platform, with the distinctions noted above. We think Cloudera could offer more flexibility in the long term across the broader set of use cases, while Databricks could reduce complexity and cost in the near term (how much cost is open to question), at the price of somewhat limited flexibility.
Please download this whitepaper for a more detailed analysis of the points summarized above.