The data warehousing and business intelligence space has undergone a huge transformation in the past several years: business users are moving away from traditional, ‘IT bottleneck’ environments toward more agile ones driven by Big Data. For example, when business users lobbied for self-service access, they got Tableau. When they pressed for data discovery, they got Endeca. What’s next? An agile, yet controlled, environment that satisfies both the business and IT communities. Pivotal Data Dispatch (Pivotal DD) fulfills the needs of all enterprise data stakeholders by giving business users on-demand access to and analysis of Big Data, all under an established system of metadata and security defined by IT.
I spoke with Todd Paoletti, Vice President of Product Marketing at Pivotal, to learn why Pivotal DD is the next Big Thing to hit the Big Data market.
1. Walk me through how Pivotal DD is used from the inception of a Big Data project. What issues does it overcome during the project lifecycle?
95% of clients starting Hadoop projects don’t have an established use case, so selecting the right distribution is likely to be a shot in the dark. You may start off with Hortonworks for a dev/test environment, then realize that Pivotal HD is a better choice for an enterprise-class deployment. The good news is that if you start your Hadoop project on EMC Isilon scale-out NAS, there is zero data migration when you move from one Hadoop distribution to another. In fact, you can run multiple Hadoop distributions against the same data, with no duplication of data required.
All of this makes sense to me: use Isilon scale-out NAS as the native storage layer for Hadoop, and the entire Hadoop environment becomes more flexible. But wait, there’s more. Using Isilon storage with Hadoop instead of a traditional DAS configuration also makes the environment easier and faster to deploy, more reliable, and, in some cases, lower in TCO than DAS.
De-coupling the Hadoop compute and storage layers may lead you to believe there is a performance hit. Not true. You can expect up to 100 GB/s of concurrent throughput on the Hadoop storage layer with Isilon. Additionally, by off-loading storage-related HDFS overhead to Isilon, Hadoop compute farms can spend their cycles running analysis jobs instead of managing local storage.
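To make the zero-migration point concrete, here is a minimal sketch using the standard Hadoop FileSystem client API. The host name isilon.example.com (standing in for an Isilon SmartConnect zone) and the path /data/clickstream are hypothetical; the idea is that any distribution’s stock Hadoop client, pointed at the same Isilon-hosted HDFS namespace, sees the same files, so switching distributions means repointing a URI rather than copying data.

    import java.net.URI;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class IsilonHdfsSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Hypothetical SmartConnect zone name for the Isilon cluster;
            // in practice this would come from fs.defaultFS in core-site.xml.
            URI isilon = new URI("hdfs://isilon.example.com:8020");
            FileSystem fs = FileSystem.get(isilon, conf);
            // List a (hypothetical) shared data set; a Hortonworks client and
            // a Pivotal HD client would both see the identical files here.
            for (FileStatus status : fs.listStatus(new Path("/data/clickstream"))) {
                System.out.println(status.getPath() + "  " + status.getLen() + " bytes");
            }
            fs.close();
        }
    }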
You may think I am biased towards Isilon because I do Big Data marketing for EMC. Not true. I genuinely believe Isilon is a better choice for Hadoop than traditional DAS, for the reasons listed in the table below and based on my interview with Ryan Peterson, Director of Solutions Architecture at Isilon.
Traditional BI makes it very difficult for the people in the business who know the story behind the data to gain direct access to that data. Instead, they submit data requirements to IT, and when IT finally delivers the data, it is typically only a subset, incomplete, or in the wrong format. When data gets lost in translation, business users become frustrated, abandon analytics altogether, and operate on hunches and guesses. Fortunately, Tableau solves this problem through its Self-Service BI paradigm, whereby any user in the organization can quickly gain direct access to the data they need, with the flexibility to create any visualization imaginable (goodbye, Excel!). But wait, there is more. Tableau has partnered with Pivotal to add a social element to these Self-Service BI capabilities, so that business people, data scientists, and IT can come together as a team to collaborate on data sets, visualizations, predictive models, and more to uncover new and better insights. The result: Big Data that is no longer lost in translation.
Click inside to watch 11 Tableau customers talk about how Self-Service BI has changed the way they do business.
I spoke with Ted Wasserman, a Product Manager at Tableau, to learn more about the value of their technology and their partnership with Pivotal.
1. Let’s first talk about Tableau. What part of the analytical process does Tableau fit into, and what problems does it solve?