Archive for the ‘Big Data’ Category

Democratizing Artificial Intelligence, Deep Learning and Machine Learning with Dell EMC Ready Solutions

Bill Schmarzo

CTO, Dell EMC Services (aka “Dean of Big Data”)
Bill Schmarzo, author of “Big Data: Understanding How Data Powers Big Business” and “Big Data MBA: Driving Business Strategies with Data Science”, is responsible for setting strategy and defining the Big Data service offerings for Dell EMC’s Big Data Practice. As a CTO within Dell EMC’s 2,000+ person consulting organization, he works with organizations to identify where and how to start their big data journeys. He’s written white papers, is an avid blogger and is a frequent speaker on the use of Big Data and data science to power an organization’s key business initiatives. He is a University of San Francisco School of Management (SOM) Executive Fellow where he teaches the “Big Data MBA” course. Bill also just completed a research paper on “Determining The Economic Value of Data”. Onalytica recently ranked Bill as the #4 Big Data Influencer worldwide. Bill has over three decades of experience in data warehousing, BI and analytics. Bill authored the Vision Workshop methodology that links an organization’s strategic business initiatives with their supporting data and analytic requirements. Bill serves on the City of San Jose’s Technology Innovation Board, and on the faculties of The Data Warehouse Institute and Strata. Previously, Bill was vice president of Analytics at Yahoo where he was responsible for the development of Yahoo’s Advertiser and Website analytics products, including the delivery of “actionable insights” through a holistic user experience. Before that, Bill oversaw the Analytic Applications business unit at Business Objects, including the development, marketing and sales of their industry-defining analytic applications. Bill holds a Master of Business Administration from the University of Iowa and a Bachelor of Science degree in Mathematics, Computer Science and Business Administration from Coe College.

Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning (DL) are at the heart of digital transformation by enabling organizations to exploit their growing wealth of big data to optimize key business and operational use cases.

• AI is the theory and development of computer systems able to perform tasks normally requiring human intelligence (e.g. visual perception, speech recognition, translation between languages, etc.).
• ML is a sub-field of AI that gives systems the ability to learn and improve from experience without being explicitly programmed.
• DL is a type of ML built on a deep hierarchy of layers, with each layer solving a different piece of a complex problem. These layers are interconnected into a “neural network.” A DL framework is software that accelerates the development and deployment of these models.
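To make the “deep hierarchy of layers” idea concrete, here is a minimal, purely illustrative sketch (not from the original post) of a tiny two-layer feedforward network. The weights are random and there is no training; a real DL framework automates exactly these mechanics, plus backpropagation and hardware acceleration.

```python
import random

random.seed(0)

def dense_layer(inputs, weights, biases):
    """One fully connected layer followed by a ReLU non-linearity."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b
        outputs.append(max(0.0, z))  # ReLU: keep positive signal only
    return outputs

def make_layer(n_in, n_out):
    """Randomly initialized weights; a framework would learn these."""
    weights = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

# A small "hierarchy of layers": 4 inputs -> 3 hidden units -> 2 outputs
layer1 = make_layer(4, 3)
layer2 = make_layer(3, 2)

x = [0.5, -0.2, 0.1, 0.9]
hidden = dense_layer(x, *layer1)   # first layer extracts intermediate features
output = dense_layer(hidden, *layer2)  # second layer combines them
print(len(hidden), len(output))
```

Each layer consumes the previous layer's output, which is what makes the network “deep”; stacking more layers lets later layers solve progressively more abstract pieces of the problem.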

See “Artificial Intelligence is not Fake Intelligence” for more details on AI | ML | DL.

And the business ramifications are staggering (see Figure 1)!

Figure 1: Source: McKinsey

And senior executives seem to have gotten the word. BusinessWeek (October 23, 2017) reported a dramatic increase in mentions of (more…)

Scientific Method: Embrace the Art of Failure

Bill Schmarzo

I use the phrase “fail fast / learn faster” to describe the iterative nature of the data science exploration, testing and validation process. To create the “right” analytic models, the data science team goes through multiple iterations, testing different variables, different data transformations, different data enrichments and different analytic algorithms until they have failed enough times to feel “comfortable” with the model they have developed.
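The iteration loop described above can be sketched in a few lines. This is a hedged illustration, not the author's method: the data, the toy “models” and “transformations” are all hypothetical stand-ins for real candidate algorithms and feature engineering, but the shape of the loop, where most combinations fail and the best survivor is kept, is the point.

```python
# Toy candidate "models": each builder fits on training data and
# returns a prediction function.
def mean_model(xs):            # always predict the training mean
    m = sum(xs) / len(xs)
    return lambda x: m

def last_value_model(xs):      # always predict the last value seen
    last = xs[-1]
    return lambda x: last

# Toy candidate "transformations" of the input data.
def identity(x):
    return x

def squash(x):                 # a stand-in for a real data transformation
    return x / (1 + abs(x))

train = [1.0, 2.0, 3.0, 4.0]
valid = [(5.0, 4.5), (6.0, 4.0)]   # (input, expected) validation pairs

# "Fail fast / learn faster": try every combination, score each on
# held-out data, and keep whichever combination fails least.
best_err, best_combo = float("inf"), None
for transform in (identity, squash):
    xs = [transform(x) for x in train]
    for builder in (mean_model, last_value_model):
        model = builder(xs)
        err = sum((model(transform(x)) - y) ** 2 for x, y in valid)
        if err < best_err:
            best_err, best_combo = err, (transform.__name__, builder.__name__)

print(best_combo, round(best_err, 3))
```

In practice the grid spans real algorithms (regression, trees, neural networks), real enrichments, and cross-validated scoring, but every discarded combination is a useful “failure” that narrows the search.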

However, an early variant of this process has been in use for a long time: it’s called the Scientific Method. The scientific method is a body of techniques for investigating phenomena, acquiring new knowledge, or correcting and integrating previous knowledge. To be termed scientific, a method of inquiry is commonly based on empirical or measurable evidence subject to specific principles of reasoning[1] (see Figure 1).

Figure 1: The Scientific Method

The Scientific Method consists of the following components: (more…)

Get even more choice with the Ready Bundle for Hortonworks with Isilon

Brett Roberts

Data Analytics Systems Engineer at Dell EMC
Brett is the Technical Lead for Dell EMC’s Data Analytics Technology Alliances, focused on developing solutions that help customers solve their data challenges. You can find him on social media at @Broberts2261

Earlier this month marked the one-year anniversary of Dell Technologies and the coming together of Dell and EMC. Looking back, it has truly been a great year with a lot of bright spots to reflect on. I am most excited about how we have been able to bring together two powerful product portfolios to create choice and value through the unique solutions we now build for our customers. This can be seen across the company, as our new portfolio drives increased opportunities to meet specific customer needs, like creating more value-add solutions for specific workloads. As a data analytics junkie, one that is near and dear to my heart is the recently released Dell EMC Ready Bundle for Hortonworks with Isilon Shared Storage.

You might ask, “Why is this so important?” First, this is a Ready Bundle and part of the Ready Solutions family, meaning you reduce your deployment risks and speed up your time to value. If you aren’t sure what Ready Solutions are, here is a white paper from IDG. Second, this new Ready Bundle with Isilon extends flexibility for the user more than ever before. As a heritage Dell offering, the Dell EMC Ready Bundles for Hadoop have been around for years, but traditionally they have been designed on PowerEdge servers. When you needed to scale your environment, you had to scale both compute and storage together; not a bad thing for many customers, and deployments of these Ready Bundles have been outstanding. Now, however, with heritage EMC’s Isilon added to the Ready Solutions cadre of technologies, we offer organizations the choice to decouple storage from compute and scale these two distinct components independently, while delivering the world-class data services that have earned Isilon the top spot in Gartner’s Magic Quadrant for Scale-Out File and Object Storage. We generally find this is a great option for Hadoop deployments where capacity requirements are growing much more rapidly than processing requirements.

In addition to the increased choice and data services that you get with Isilon, you still enjoy all of the benefits of the other Ready Solutions for Hortonworks Hadoop. This solution has been tested and validated for Hortonworks HDP by both Dell EMC and Hortonworks. Dell EMC and Hortonworks have continued to strengthen their partnership over the years, and this is yet another example of how we have come together to provide a unique, integrated solution to meet customers’ needs. Both Dell EMC and Hortonworks are excited about how this new Ready Bundle will help drive even more business outcomes, with customers achieving success with Hadoop much more quickly. Jeff Schmitt, Hortonworks’ Sr. Director of Channels and Alliances, had this to say about the Ready Bundle: “The Ready Bundle for Hortonworks is yet another example of joint Dell EMC and Hortonworks investment bringing increased value to customers. As HDP deployments continue to grow in scale, offering customers choice in their infrastructure deployments is critical. The Ready Bundle for Hortonworks provides customers a simplified deployment while allowing storage and compute to scale independently.”

This new Ready Bundle release is the epitome of the value that this merger has created. If you find yourself having to scale your Hadoop environment to meet capacity needs, or are looking for where to start on your Hadoop journey, the Dell EMC Ready Bundle for Hortonworks Hadoop with Isilon is a great fit. Here is the Ready Bundle Solution Overview for you to learn more about this great solution.


Distributed Analytics Meets Distributed Data with a World Wide Herd

Jean Marie Martini

Director, Data Analytics Portfolio Messaging and Strategy at Dell EMC
Jean Marie Martini is a Director of messaging and strategy across the data analytics portfolio at Dell EMC. Martini has been involved in data analytics for over ten years. Today, the focus is on communicating the value of Dell EMC solutions, enabling customers to begin and advance their data analytics journeys and transform their organizations into data-driven businesses. You can follow Martini on Twitter @martinij.

Originally posted on CIO.com by Patricia Florissi, Ph.D.

What is a World Wide Herd (WWH)?

What does it mean to have “distributed analytics meet distributed data?” In short, it means forming a global virtual computing cluster, in this case one given the title of World Wide Herd. The WWH concept creates a global network of distributed Apache™ Hadoop® instances that form a single virtual computing cluster, bringing analytics capabilities to the data. In a recent CIO.com blog, Patricia Florissi, Ph.D., vice president and global CTO for sales and a distinguished engineer at Dell EMC, details how this approach enables analysis of geographically dispersed data without requiring the data to be moved to a single location before analysis. (more…)
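The core idea, analyzing dispersed data in place rather than centralizing it first, can be sketched in miniature. This is an assumption-laden illustration (the site names and values are invented, and the real WWH orchestrates full Hadoop jobs, not Python functions): each site runs a local computation and returns only a tiny partial result, and the partials are merged globally, so the raw records never leave their site.

```python
# Raw data stays "at" each site; only small (count, sum) pairs travel.
sites = {
    "us-east":  [12.0, 7.5, 3.2],
    "eu-west":  [5.1, 9.9],
    "ap-south": [4.4, 6.6, 8.8, 2.2],
}

def local_partial(values):
    """Runs at each site; returns a small summary, not the data itself."""
    return (len(values), sum(values))

partials = [local_partial(v) for v in sites.values()]

# Global combine step: partial aggregates are mergeable, so a virtual
# cluster can compute a global statistic without moving raw records.
total_count = sum(c for c, _ in partials)
total_sum = sum(s for _, s in partials)
global_mean = total_sum / total_count

print(total_count, round(global_mean, 3))
```

The same decomposition (local map/aggregate, global merge) is what lets a federation of Hadoop clusters behave like one virtual cluster while respecting data locality and, often, data-sovereignty constraints.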

Follow Dell EMC

Dell EMC Big Data Portfolio

See how the Dell EMC Big Data Portfolio can make a difference for your analytics journey

Dell EMC Community Network

Participate in the Everything Big Data technical community