Posts Tagged ‘big data analytics’

How Schema On Read vs. Schema On Write Started It All

Thomas Henson

Unstructured Data Engineer and Hadoop Black Belt at Dell EMC
Thomas Henson is a blogger, author, and podcaster in the Big Data Analytics Community. He is an Unstructured Data Engineer and Hadoop Black Belt at Dell EMC. Previously he worked helping Federal sector customers build their first Hadoop clusters. Thomas has been involved in the Hadoop Community since the early Hadoop 1.0 days. Connect with him @henson_tm.

Article originally appeared as Schema On Read vs. Schema On Write Explained.

Schema On Read vs. Schema On Write

What’s the difference between schema on read and schema on write?

How did Schema on read shift the way data is stored?

Since the inception of relational databases in the 1970s, schema on write has been the de facto procedure for storing data to be analyzed. Recently, however, there has been a shift toward a schema on read approach, which has led to the exploding popularity of Big Data platforms and NoSQL databases. In this post, let’s take a deep dive into the differences between schema on read and schema on write.

What is Schema On Write?

Schema on write is defined as creating a schema for data before writing it into the database. If you have done any kind of development with a database, you understand the structured nature of a Relational Database Management System (RDBMS), because you have used Structured Query Language (SQL) to read data from the database.

One of the most time-consuming tasks in an RDBMS is Extract, Transform, Load (ETL) work. Remember, just because the data is structured doesn’t mean it starts out that way. Most data exists in an unstructured form. Not only do you have to define the schema for the data, but you must also structure the data to fit that schema.
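
The contrast can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample records, the `events` table, and both function names are hypothetical and do not come from the original article. The schema on write path defines the table first and transforms each record to fit it, while the schema on read path leaves the raw records alone and applies structure only at query time.

```python
import json
import sqlite3

# Hypothetical raw records, as they might arrive from an application log.
RAW_EVENTS = [
    '{"user": "alice", "action": "login", "ms": "120"}',
    '{"user": "bob", "action": "search", "ms": "340", "query": "hadoop"}',
]


def load_schema_on_write(raw_events):
    """Define the schema first, then transform each record to fit it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events ("
        "user TEXT NOT NULL, action TEXT NOT NULL, ms INTEGER NOT NULL)"
    )
    for line in raw_events:
        record = json.loads(line)
        # Transform step: cast types and drop fields the schema does not define.
        row = (record["user"], record["action"], int(record["ms"]))
        conn.execute("INSERT INTO events VALUES (?, ?, ?)", row)
    conn.commit()
    return conn


def query_schema_on_read(raw_events, field):
    """Keep the raw records untouched and apply structure only when reading."""
    values = []
    for line in raw_events:
        record = json.loads(line)
        if field in record:
            values.append(record[field])
    return values


conn = load_schema_on_write(RAW_EVENTS)
total_ms = conn.execute("SELECT SUM(ms) FROM events").fetchone()[0]
queries = query_schema_on_read(RAW_EVENTS, "query")
```

Note that the `query` field never reaches the table in the schema on write path because the schema did not anticipate it, whereas the schema on read path can still recover it from the raw records.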

For example (more…)

Architecture Changes in a Bound vs. Unbound Data World

Thomas Henson

Originally posted as Bound vs. Unbound Data in Real Time Analytics.

Breaking The World of Processing

Streaming and real-time analytics are pushing the boundaries of our analytic architecture patterns. In the big data community, we now break analytics processing down into batch or streaming. If you glance at the top contributions, most of the excitement is on the streaming side (Apache Beam, Flink, and Spark).
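
To make the batch and streaming split concrete, here is a minimal sketch in plain Python. It does not use Apache Beam, Flink, or Spark; the function names and sample readings are hypothetical and chosen only for illustration. The batch function needs the whole data set before it can answer, while the streaming function emits an updated answer as each reading arrives.

```python
from typing import Iterable, Iterator, List


def batch_average(readings: List[float]) -> float:
    """Batch: the whole data set is available before processing starts."""
    return sum(readings) / len(readings)


def streaming_average(readings: Iterable[float]) -> Iterator[float]:
    """Streaming: emit an updated result as each reading arrives."""
    count = 0
    total = 0.0
    for value in readings:
        count += 1
        total += value
        yield total / count


readings = [10.0, 20.0, 30.0]
batch_result = batch_average(readings)
running_results = list(streaming_average(iter(readings)))
```

The streaming version keeps only a count and a running total, so it never has to hold the full input in memory or wait for the input to end.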

What is causing the break in our architecture patterns?

A huge reason for the break in our existing architecture patterns is the concept of Bound vs. Unbound data. This concept is as fundamental as the Data Lake or Data Hub and we have been dealing with it long before Hadoop. Let’s break down both Bound and Unbound data.

Bound vs. Unbound Data (more…)

Revealing the secret to speed and flexibility for data analytics

William Geller

Data Analytics Product Marketing at Dell EMC
William Geller has been involved in new technology and data science for over 15 years, with experience launching and marketing new products for both startups and enterprises around the world. William is the Principal Product Marketing lead for Data Analytics in the Solutions Marketing division of CPSD. Prior to joining Dell EMC, he worked for numerous startups in healthcare IT, social network analytics, and cyber security. He holds a VMware VCP4.0 accreditation. William has a BS in Electrical Engineering from Drexel University and an MBA from Babson College. You can find him on Twitter at @williamgeller.

Most companies recognize that they have opportunities through data analytics to raise productivity, improve decision making, and gain competitive advantage. Unfortunately, the majority of initiatives fail to move beyond the experimental stage, or analytic insights are not operationalized back into the business as intended. The causes include lack of access to siloed data, time invested in continually gathering the data before performing analytics, and long lead times for resources from IT. Recently, Enterprise Strategy Group (ESG) reviewed Dell EMC Analytic Insights Module, which is engineered to smooth out these friction points in the data analytics lifecycle. It’s delivered on Dell EMC Native Hybrid Cloud, combining a self-service data analytics experience with cloud-native application development (more…)

Can Kids Give Us A Lesson Or Two On Big Data Analytics?

Mona Patel

Senior Manager, Big Data Solutions Marketing at EMC
Mona Patel is a Senior Manager for Big Data Marketing at EMC Corporation. With over 15 years of working with data at The Department of Water and Power, Air Touch Communications, Oracle, and MicroStrategy, Mona decided to grow her career at EMC, a leader in Big Data.

Math scores across US grade schools have dropped this year, according to newly published NAEP 2015 test scores. According to the STEM (Science, Technology, Engineering, and Mathematics) Education Coalition, U.S. 15-year-olds ranked 21st in science test scores among 34 developed nations. These are dismal stats that have many concerned, since STEM education plays a critical role in U.S. competitiveness and future economic prosperity.

The good news is that by 2020, the demand for STEM professionals will add over 1 million STEM jobs to the US workforce. STEM jobs offer higher job security and higher yearly income than other fields. Even better, in STEM occupations job postings outnumber applications by 1.9 to 1. Clearly, getting our kids interested and motivated in STEM careers makes sense from many angles.

With so many engineers, data scientists, and technologists of our own, EMC is particularly passionate and committed to the STEM movement and wanted to help.

How can we motivate kids to take an interest in STEM and measure the impact of EMC’s STEM initiative? This was the challenge the EMC Presales team took on and met with great success and personal reward. For example, in just one event connecting EMC with approximately 300 seventh and eighth graders on topics such as big data analytics, interest in math careers increased by 7%. In fact, the more kids became excited and used their imagination about the possibilities of big data analytics, the more questions they asked, giving EMC a few more lessons of our own to learn on the topic. Watch the video below to learn more.

I spoke with David Dietrich, Director of Technical Marketing for Big Data Solutions at EMC, to understand how the topic of big data analytics was able to change sentiment toward STEM and, in turn, how big data analytics was used to measure the effectiveness of the STEM initiative.

Q: David, why is STEM important and why did EMC become involved?

(more…)
