Architecture Changes in a Bound vs. Unbound Data World

Thomas Henson

Unstructured Data Engineer and Hadoop Black Belt at Dell EMC
Thomas Henson is a blogger, author, and podcaster in the Big Data Analytics Community. He is an Unstructured Data Engineer and Hadoop Black Belt at Dell EMC. Previously, he helped federal-sector customers build their first Hadoop clusters. Thomas has been involved in the Hadoop Community since the early Hadoop 1.0 days. Connect with him @henson_tm.

Originally posted as Bound vs. Unbound Data in Real Time Analytics.

Breaking The World of Processing

Streaming and real-time analytics are pushing the boundaries of our analytic architecture patterns. In the big data community we now break analytics processing down into batch or streaming. If you glance at the top contributions, most of the excitement is on the streaming side (Apache Beam, Flink, and Spark).

What is causing the break in our architecture patterns?

A huge reason for the break in our existing architecture patterns is the concept of Bound vs. Unbound data. This concept is as fundamental as the Data Lake or Data Hub and we have been dealing with it long before Hadoop. Let’s break down both Bound and Unbound data.
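As a rough illustration of why the two kinds of data call for different processing patterns, here is a minimal Python sketch. It assumes the usual definitions (bound data is a finite, complete dataset; unbound data keeps arriving with no defined end). The function names and the `sensor_stream` source are hypothetical and are not taken from Beam, Flink, or Spark.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batch_average(readings: List[float]) -> float:
    """Bound data: the whole dataset exists, so one pass gives a final answer."""
    return sum(readings) / len(readings)


def windowed_averages(stream: Iterable[float], window_size: int) -> Iterator[float]:
    """Unbound data: the end is never reached, so emit one result per window."""
    window: List[float] = []
    for value in stream:
        window.append(value)
        if len(window) == window_size:
            yield sum(window) / window_size
            window = []


def sensor_stream() -> Iterator[float]:
    """Hypothetical endless source standing in for a live feed (1.0, 2.0, ...)."""
    value = 0.0
    while True:
        value += 1.0
        yield value


# Batch: a single final result over a finite list.
print(batch_average([1.0, 2.0, 3.0, 4.0]))

# Streaming: results keep coming; here we only look at the first three windows.
print(list(islice(windowed_averages(sensor_stream(), 2), 3)))
```

The batch function can return a single number because its input ends. The streaming function never sees the end of its input, so it has to report partial results per window, which is the kind of shift that streaming engines build their architecture around.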

Bound vs. Unbound Data (more…)

Distributed Analytics Meets Distributed Data with a World Wide Herd

Jean Marie Martini

Director, Data Analytics Portfolio Messaging and Strategy at Dell EMC
Jean Marie Martini is a Director of messaging and strategy across the data analytics portfolio at Dell EMC. Martini has been involved in data analytics for over ten years. Martini's focus today is on communicating the value of Dell EMC solutions that enable customers to begin and advance their data analytics journeys and transform their organizations into data-driven businesses. You can follow Martini on Twitter @martinij.

Originally posted on CIO.com by Patricia Florissi, Ph.D.

What is a World Wide Herd (WWH)?

What does it mean to have “distributed analytics meet distributed data”? In short, it means a group of industry experts, here given the title World Wide Herd, forming a global virtual computing cluster. The WWH concept creates a global network of distributed Apache™ Hadoop® instances that form a single virtual computing cluster and bring analytics capabilities to the data. In a recent CIO.com blog, Patricia Florissi, Ph.D., vice president and global CTO for sales and a distinguished engineer for Dell EMC, details how this approach enables analysis of geographically dispersed data without requiring the data to be moved to a single location before analysis. (more…)
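The idea of bringing analytics to the data, rather than moving the data, can be sketched in a few lines of Python. This is only a conceptual illustration under assumed names: the site labels, `PartialResult`, `analyze_locally`, and `combine` are hypothetical and do not represent the WWH software or any Hadoop API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PartialResult:
    """Small summary a site sends back instead of its raw records."""
    total: float
    count: int


def analyze_locally(records: List[float]) -> PartialResult:
    """Runs where the data lives; only the summary leaves the site."""
    return PartialResult(total=sum(records), count=len(records))


def combine(partials: List[PartialResult]) -> float:
    """Runs centrally; merges per-site summaries into a global mean."""
    total = sum(p.total for p in partials)
    count = sum(p.count for p in partials)
    return total / count


# Hypothetical geographically dispersed datasets.
sites: Dict[str, List[float]] = {
    "site_a": [10.0, 20.0],
    "site_b": [30.0],
    "site_c": [40.0, 50.0, 60.0],
}

partials = [analyze_locally(records) for records in sites.values()]
global_mean = combine(partials)
print(global_mean)
```

Each site ships back two numbers instead of its full dataset, yet the combined mean is the same as if all records had been collected in one place first.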

Dell EMC Takes #1 Position on TPCx-BigBench for Scale Factor 10000

Nicholas Wakou

Nicholas Wakou is a Senior Principal Performance Engineer with the Dell EMC Open Source Solutions team. His work focuses on characterizing and optimizing the performance of Dell EMC Cloud and Big Data solutions. Nicholas is engaged in industry efforts to define performance benchmark specifications. He is active on the SPEC (www.spec.org) Cloud committee and several committees of the TPC (www.tpc.org). Nicholas represents Dell Technologies on the Board of Directors of the TPC and on its Technical Advisory Board (TAB). Previously, he was Chair of the TPC Public Relations standing committee. Nicholas has an MS in Electrical Engineering from Oklahoma State University, an MS in Microelectronics Technology from Middlesex University, London, and a BSc in Electrical Engineering from Makerere University, Kampala, Uganda.

Dell EMC is focused on providing information that helps customers make the most of their big data technology investments. The failure rate for Hadoop big data projects is still too high given the maturity of the technology. Customers can’t afford to guess when designing and sizing a solution; they need to deliver optimal performance for their business use cases and to scale as needed. Dell EMC recently completed and published a new TPCx-BigBench (TPCx-BB) result that will help customers make the right choices for Hadoop performance and scalability. Today we are happy to announce that

Dell EMC is the industry-leading supplier of hyper-converged, converged and “Ready” Solutions by many standards. Dell EMC’s tested and validated Ready Bundle for Cloudera Hadoop, together with the right performance benchmark results, takes the guesswork out of Hadoop implementations.

The Transaction Processing Performance Council (TPC) is a non-profit corporation founded (more…)

Dell EMC extends its portfolio for Splunk to VxRack FLEX

Brett Roberts

Data Analytics Systems Engineer at Dell EMC
Brett is the Technical Lead for Dell EMC’s Data Analytics Technology Alliances, focused on developing solutions that help customers solve their data challenges. You can find him on social media at @Broberts2261.

Operational Intelligence and machine-generated data have been very hot topics lately as organizations begin to realize how valuable this data is for the business. For the last few years, Splunk has been the leader in this space with its all-encompassing platform for collecting, searching and analyzing machine-generated data. (Not up to speed on this yet? Check out my other blog on getting started with machine-generated data.) Dell EMC and Splunk have had a tremendous partnership over the past couple of years, based on the premise that we offer market-leading infrastructure that is optimal for Splunk’s world-class analytics platform for machine-generated data. A couple of weeks ago, we took this one step further: I’m excited to announce the release of the Solution Guide for Machine Analytics with Splunk Enterprise on VxRack Flex 1000! With this, Dell EMC now has a rack-scale, hyper-converged infrastructure solution for Splunk that has been jointly validated by Splunk and Dell EMC.

Why is this important?

Having a solution that has been jointly validated by both Splunk and Dell EMC to “meet or exceed Splunk’s performance benchmarks” gives users a higher degree of confidence in the environment. With this solution, the performance needed to run Splunk effectively, and to gain the insights behind critical IT and business decisions, will be there. Our solutions engineering team, along with Splunk, put hundreds of engineering hours into designing specific configurations for a variety of deployment scenarios and rigorously tested them to ensure performance. The solution guide gives you not only those configurations but also implementation guidelines and deployment practices. All of this adds up to lower risk, quicker time to value and validated performance. You can’t ask for anything better.

How is VxRack Optimal for Splunk?

VxRack provides flexible, rack-scale, hyper-converged infrastructure that lets you use the hypervisor of your choice or bare metal, and lets you start small but scale out to thousands of nodes. With VxRack you have the flexibility to optimize your tiering for Splunk by putting Hot and Warm buckets on SSD while using HDD or even Isilon scale-out NAS for your cold bucket needs (the solution guide shows how to use Isilon for cold tiering). You also get the benefits of Software Defined Storage and the data services that are essential in today’s data center. The best part is that VxRack gives a turnkey experience that is engineered and designed to be ready to run, giving you a quicker time to insight and value. Additionally, with single support and life-cycle management for your infrastructure, you lower complexity and reduce risk and costs. All of this adds up to great performance, an economical tiering structure, and an infrastructure that is easy to deploy and manage and is validated to run Splunk.
