Big Data Conversation with Spotify

Erin K. Banks

Portfolio Marketing Director at Dell EMC
Erin K. Banks has been in the IT industry for almost 20 years. She is the Portfolio Marketing Director for Big Data and Data Analytics at Dell EMC. Previously she worked at Juniper Networks in Technical Marketing for the Security Business Unit. She has also worked at VMware and EMC as an SE in the Federal Division, focused on Virtualization and Security. She holds both CISSP and CISA accreditations. Erin has a BS in Electrical Engineering and is an author, blogger, and avid runner. You can find her on social media at @banksek

I spoke with Eliot Van Buskirk ( @listeningpost ), Data Storyteller at Spotify, as part of my Big Data Conversations series for Dell EMC. Note that this is not a product or technology blog and certainly there are no endorsements implied from either side but, in all fairness, I am a paying Spotify user and I am in love with what they do to bring music to the masses.

Erin K. Banks: How do you utilize big data at Spotify?

photo credit: Larisa Barita

Eliot Van Buskirk: A good way to understand how I utilize data in my role at Spotify is to look at my background. I came out of journalism. I worked at CNET and Wired, reviewed the first 50-plus digital players in the world, and have covered digital music since it came onto the scene in the 1990s. I left Wired to go to a startup called The Echo Nest, a music data company. They had more individual data points about music than any other entity on the planet. This covers everything from “which artists are similar to other artists,” which enabled clients to make radio stations that make sense, down to what key the chorus of a song is in and where the beats fall. It is incredibly detailed data. There are attributes including acousticness, which measures how acoustic a song is versus how electronic, and mechanism, which is tied to how regular the beat is. Whatever you can think of: these guys were teaching computers to listen to music and to understand what people were saying about music on the web. If a thousand people say something is dubstep, then the system says that is probably dubstep. It draws conclusions using the acoustic data and cross-references that with the semiotic data from the web to figure out how to understand music at a scale that has pretty much never been possible before. Going from journalism to The Echo Nest, I found the sweet spot in telling stories about the data and analyzing all this incredible technology that is there to solve other problems like “what someone should listen to right now” or “will someone like a certain song.” We started to realize the potential for storytelling – things like how music from various decades is listened to now. The breakout story for us was how artists are related to the individual states. We looked at which artist is listened to disproportionately in each US state, and we made a map that got a ton of traction.
Music charts are great, and they tell us a lot about what people like, but the most popular music is popular everywhere. We started looking at what music makes things distinct from each other. You can apply this to anything, like what music is distinctive to millennials versus boomers… the answer is Skrillex versus Roy Orbison. The basic idea is that the data that people view as kind of dry is capable of producing these incredibly interesting nuggets of information, and as a journalist who was shifting into a marketing and communications role, what appealed to me was that everything we did with the data is real. It is like being a journalist where you actually do have access to all of the facts. At Spotify, I help publications like the NY Times or whomever to tell data stories, and I publish my own data stories based on the massive amounts of data at Spotify. It was great doing this at The Echo Nest because we had all the data about the music itself, and now that I am a part of the largest on-demand music service in the world, there is a lot of anonymized data about how people listen to music, so we are able to see even more, and that makes my job even more interesting.
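To make the idea of "distinctive" versus "popular" concrete, here is a minimal sketch in Python. It is not Spotify's or The Echo Nest's actual method. It assumes a toy list of (region, artist) listens with made-up region codes and artist names, and it scores each artist by how much larger its share of plays is in a region than in the data as a whole. The function name `most_distinctive` is hypothetical.

```python
from collections import Counter, defaultdict


def most_distinctive(plays):
    """Return, per region, the artist whose share of regional plays is
    highest relative to its share of all plays.

    plays: iterable of (region, artist) pairs, one pair per listen.
    This is an illustrative ratio-of-shares score, not a production metric.
    """
    regional = defaultdict(Counter)  # region -> artist -> play count
    overall = Counter()              # artist -> play count across all regions
    for region, artist in plays:
        regional[region][artist] += 1
        overall[artist] += 1
    total = sum(overall.values())

    result = {}
    for region, counts in regional.items():
        region_total = sum(counts.values())

        def lift(artist):
            regional_share = counts[artist] / region_total
            overall_share = overall[artist] / total
            return regional_share / overall_share

        result[region] = max(counts, key=lift)
    return result


# Toy data with hypothetical names: "Big Hit" is the most played artist
# in both regions, so a plain chart would show it everywhere.
plays = (
    [("TN", "Big Hit")] * 50
    + [("TN", "Local Band")] * 10
    + [("CA", "Big Hit")] * 80
    + [("CA", "Surf Duo")] * 8
    + [("CA", "Local Band")] * 1
)
print(most_distinctive(plays))
```

In this toy data the chart-topper is the same in both regions, but the ratio picks out the artist each region listens to disproportionately. A real analysis would also need a minimum play threshold so that an artist with a handful of listens does not win on noise.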

Erin: You said you created a map to identify music regions. Is this map publicly available?

Eliot: This map is publicly available and went everywhere. I think USA Today almost printed it instead of the weather map. This is what I mean about it being a success: to get this information out there and to get people talking about it. When we did it originally, it was a static map and it was a snapshot of a certain time when those were the popular artists. I have gone further with the concept now and made this musical map of the world, which updates twice a month so you can see the music that is distinct for a thousand cities worldwide at any given time. People from those cities will recognize local bands because people listen to them in, say, Nashville but might not have heard of them outside of Nashville. The beautiful thing about this map is that I worked with this incredibly talented guy called Glenn McDonald ( @EveryNoise ) and he maintains these playlists that update for each city. In order to create the map I linked each spot on the map to one of these playlists. This is a demonstration of what we call at Spotify “Care at Scale”. We are lavishing attention on these cities and showing exactly what makes them special compared to the rest of the world, but it scales remarkably well. We have had a great response from people all over the world, like a radio station in Napoli, Italy where they have a bi-weekly show just dedicated to the Naples playlist on this map. I get emails from people all over the place who can tell that the care is there. It doesn’t come off as “here is the Top Ten globally.” Big data can essentially allow for a more nuanced view of the world, and again, it is important to emphasize that this is all based on anonymized and aggregated listening data.

Erin: Your background is journalism but do you consider yourself a data scientist? Are there data scientists there? Lately I have been hearing that it is not about just the data scientist. When it comes down to it, it is regular people searching through the data trying to get insights. Do you have thoughts on that?

Eliot: We definitely do have data scientists here and I do not consider myself one of them. I also don’t consider myself a data journalist, which is another title. I have been a journalist before and this is not journalism. I came up with “data storyteller” as the title, which I think is the sweet spot for describing what I do. I have a math and science background from a long time ago. I was very much into calculus and physics and became an English major because it was more interesting to me at that point. I have sort of a split early background between the humanities and math and science, and did a tiny amount of programming but really nothing after the age of ten or eleven. I taught myself SQL with the help of people at Spotify who are legendary programmers and who are helping me along. I am finally at the point where I can get direct access to data, write my own queries, and crunch my own data. It is actually empowering and I recommend it to anyone. The data can be a bottleneck, as we all know, because only certain people know how to analyze it, and without the ability to do that, I couldn’t do my job.

Erin: Do you pull from the data scientists or do you do it yourself? What is the difference between what you are doing and what data scientists are doing?

Eliot: The data scientists have PhDs and patents and I was an English major who was a former journalist. The skill level is very different, but there is a democratizing aspect to this: once you learn your way around the tools, you are looking at the same data as everyone else, so if you learn how to use the data, your results are going to be similar to theirs, and that is the great thing about technology. Regardless of whether you have a PhD or not… if you figure something out, you figure it out. I would never put myself on the same level as they are, and I think there need to be more of us who are in between various fields and data science. At Spotify, the data scientists are solving major problems and doing very difficult work. They keep this whole thing running as they optimize everything and provide the best music recommendations. For example, it would not make sense for them to be working on a data-driven story about what happens to Aerosmith listening when there is a comet in the news. It turns out that people listen to “I Don’t Want to Miss a Thing” every time there is a comet event. I consider that fun and interesting from a news and PR point of view, but that isn’t something you need a data scientist working full time on.

Erin: I think it says a lot that you know that whenever there is news about a comet, people listen to that song and that it turns into something else. How has big data impacted your business? Did Spotify start off as a streaming business and then they added analytics in order to provide different services?

Eliot: Spotify was always data driven. The Echo Nest brought a different approach. They combined what people said with what the music actually is. To have a machine ear that says this song sounds happy or this song sounds sad… this is an actual number that we have, an emotional valence. It varies between 0, which is completely unhappy, and 1, which is completely happy. I don’t think anyone else had been able to scale up like that. Spotify is scaling up massively in terms of usage and The Echo Nest scales very well for understanding music and how people listen to music. I would like to add that when people hear about big data being associated with music, they can understandably get the wrong idea that it is just algorithms: what do they know, and why would a robot make a good radio station? That is completely the wrong way to look at it. Thinking of music as data and algorithms is misleading. People make music and people listen to music and it is all about humans being involved. The comparison I often make: I used to review music. That is one person’s opinion, and you can base something on that, or you can base something on how 75 million people listen to music. I argue that 75 million people can make a data-driven approach more human than one person can. The algorithms are a way of getting at very human things through a route that is sometimes mistakenly viewed as artificial.

Erin: What is the biggest myth about big data? Do you think that it is that big data is not looking at the human aspect?

Eliot: Yes, that would be a fair answer to that question. A great example is the “Discover Weekly” feature on Spotify. Every Monday, for every single user of Spotify, Discover Weekly creates a personalized playlist of music that you have never heard before on Spotify and that we know you are going to like. It seems impossible, but when you have that many people listening to music, making playlists, adding things to their collections, it is possible to say that for every single user, here is music that you will enjoy and we know that you have never listened to it before on Spotify, so here it is. That happens every Monday, and people’s minds are blown. It is a similar feeling to a friend making a mix tape. You feel like “There is no way this thing knows me this well”. That might seem strange or magical or impossible if you are just viewing it as an algorithm, but once you understand that it is all about people, the human element, their responses and reactions, then it makes sense.
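The interview does not describe how Discover Weekly is built, so the following is only a toy sketch of the general idea Eliot describes: using other people's playlists to suggest tracks one listener has not heard. It assumes playlists are plain lists of track IDs and scores each unheard track by how much the playlists containing it overlap with what the listener already knows. The function name `recommend_unheard` and the track IDs are hypothetical.

```python
from collections import Counter


def recommend_unheard(heard, playlists, k=3):
    """Suggest up to k tracks the listener has not heard.

    heard: track IDs the listener already knows.
    playlists: lists of track IDs made by other people.
    A track gains points each time it shares a playlist with heard tracks;
    the more heard tracks on that playlist, the more points it gets.
    """
    heard = set(heard)
    scores = Counter()
    for playlist in playlists:
        tracks = set(playlist)
        overlap = len(tracks & heard)
        if overlap == 0:
            continue  # this playlist says nothing about the listener's taste
        for track in tracks - heard:
            scores[track] += overlap
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [track for track, _ in ranked[:k]]


playlists = [
    ["a", "b", "x"],
    ["a", "x", "y"],
    ["c", "z"],
    ["b", "y"],
]
print(recommend_unheard(["a", "b"], playlists))
```

Every suggestion here comes from choices other people made when building playlists, which is the sense in which the result is "all about people" rather than an abstract algorithm.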


Erin: Now on Monday you have people interested in listening to new music and you are offering it up to them based on everything they have given you before. You have a portfolio on me as a Spotify user: this is what I listen to and this is what I like. Then on Monday you give me something different based on me and others.

Eliot: Yes, that is essentially how it works. It is a great demonstration of how big data or algorithms actually provide a very human experience. Once you understand that it is about people, it makes sense.

Erin: So, big data allows your business to be more human. Instead of being a streaming music service you are now creating human interaction with us by gathering all this data around us. You are changing how I listen to music and how I discover music.

Eliot: I think that is true. People really love this feature. People return every Monday saying “give me more of this please.” I think it is one of the most powerful ways that we are using data to improve people’s listening experiences right now.

Erin: Does data allow you to create additional products and services?

Eliot: For sure, as we discussed, there is “Discover Weekly” and there is a degree of outreach being done. If we can tell that you are really into a certain band, then we can tell you if they are going to tour where you live. There is no end to where you can go with the data. I am not involved in the product here, but would guess that we don’t want to “over hit” people with offers. But, if we could help artists find people that are interested in them, only good things can come from that.

Erin: I feel people are not seeing all the power of the data and the additional products and services it can provide. We tend to get stuck in our bubble and not see outside of it.

Eliot: I think we are seeing a lot of examples where people are in bubbles more and more with technology, but that data can expand people’s spheres. Discover Weekly is a great example of that. A similar approach could be used on social networks to bring us outside of our bubbles. People can use the amount of data that we generate in everything we do these days in a number of different ways. The part that is interesting to me is using it to expand people’s exposure to things, rather than keeping them in their bubbles.

Erin: What can you leave us with that is important to note?

Eliot: Another of our offerings that we couldn’t have done without big data is “Fresh Finds,” which we release every Wednesday, and it is all about expanding people’s spheres. It introduces bands that are not in the mainstream. In one case, a local bar band in Florida was featured on the list. They were found entirely through big data, by using the data to pick up very subtle signals that this band was about to take off. They were then showcased on Fresh Finds, toured in NY, and I think scheduled shows all around the United States. There are ways to use big data to personalize things, or ways to use it to identify very early trends, in our case music trends, but it can be applied to every kind of market.

Erin: When you said that people are talking about them are you bringing in social media?

Eliot: The whole web is sitting right there if you want to know about music. If people suddenly start tweeting about bands, there is a way to do semantic analysis of social networks to see what is bubbling up. What I think is powerful is to combine lots of different signals like listening data, what regions are heating up, social media, and who is opening for what band. The number of signals is almost without limit, and that is what is so powerful about big data and the ability to analyze it: you can come close to understanding all of this, resulting in a playlist that feels magical. It is the result of lots of hard work and data processing and analysis of anonymized listening. The more signals you can factor in, the better. Then you can continue to apply new data and get more results.

Erin: Can bands receive additional services through Spotify based on big data?

Eliot: We released the Spotify for Artists site and service, where artists can access data that they couldn’t get before, at levels beyond what they can see in the Spotify app, to get an understanding of where their fans are, and so on. It is another great opportunity from big data.

Erin: It isn’t just healthcare companies or banks. It is more than that; we are talking about bands.

Eliot: Yes, and we’re still talking about data. If you are a band in Seattle and you are trying to decide if you should drive to San Francisco to play, there are ways to answer that question… even that question can be answered using data.

The ultimate tip for your data analytics journey: Get started

Anthony Dina

Director of Data Analytics across North America at Dell EMC
With twenty years in the IT industry, Anthony Dina serves as the North America Director of Data Analytics at Dell EMC. He leads a team of solutions architects who synthesize the tsunami of new data types (machine, application, person-driven) with traditional systems of record. Their expertise in big data, data warehouse modernization and analytic modeling helps customers succeed in the era of Digital Transformation. This work not only involves intellectual property from Dell EMC but also from partners like Cloudera, Hortonworks, Splunk, and SAP. Prior to this, he served as executive director of strategy and director of solutions marketing. He has earned a Master of Business Administration from the University of St. Thomas and a Master of Fine Arts from Cranbrook Academy of Art. His technical certifications include ITIL v3 Foundation and ITIL Service Strategy.

Data analytics is rapidly emerging as a key to success in the new digital economy. To capitalize on this opportunity, organizations need to rethink the status quo and invest accordingly.

When it comes to organizations and their data analytics journeys, a quote often attributed to Mark Twain seems especially appropriate: “The secret to getting ahead is getting started.”

While data analytics is rapidly emerging as a key to business success in the new digitally driven economy, many organizations are just exploring or in the early stages of their data analytics journeys. Basically, many people are just kicking the tires of the data analytics vehicle. And that’s a problem, because you can’t get anywhere with data analytics until you are actively using data analytics.

A survey by IDG Research Services suggests that just 36 percent of organizations have data analytics projects in progress. Most of the rest of the survey respondents fall into the stages of exploring, planning or pilot testing analytics projects. And then there are the laggards—the 10 percent of organizations who don’t have data analytics on their radar.[i]

These numbers need to change. Based on the business results organizations see as they gain competitive advantages through data analytics, it’s extremely important to get started down the path to data analytics with the ultimate goal of digital transformation.

About those business results: An extensive survey of IT and business decision-makers, commissioned by Dell EMC and conducted by TNS, found that organizations actively using big data, cloud and mobility technologies are growing up to 53 percent faster than the laggards of the world.[ii]

In another large survey, this one of more than 4,000 business leaders around the world, Vanson Bourne and Dell Technologies found that more than half of respondents (52 percent) have already experienced significant disruption to their industries as a result of digital technologies, and more than three-fourths of respondents (78 percent) consider digital startups a threat, either now or in the future.[iii]

Clearly, digital transformation is not just an opportunity; it’s also a threat—to those organizations that don’t get started in a timely manner. So, what’s holding organizations back?

The Vanson Bourne/Dell Technologies survey found that the top barriers to progress are:

  1. Insufficient budget and resources
  2. Lack of executive support
  3. Inadequate expertise and skills
  4. Technologies that can’t work at the speed of business
  5. Data privacy and security concerns

How can your organization overcome these barriers? The first step is to recognize that the digital transformation you make today is the key to future competitiveness. The next step is to rethink the status quo and invest accordingly.

The report from Vanson Bourne and Dell Technologies boils the advice down to a simple, clear sentence: “Organizations, at every stage, will need to couple the latest and greatest in technology with a shift in mindset, an investment in their staff and a bold approach which includes rethinking business models.”

If you follow that advice, your organization will be positioned to compete more effectively in the digital economy. That’s what happens when you put the strategies and technologies in place to capitalize fully on your digital assets—the massive amounts of data you generate every day.

As the Vanson Bourne/Dell Technologies report notes: “Businesses of every kind still have a chance to leap ahead. Change, if embraced correctly, can open a world of opportunity.”

The key is to get started.


[i] IDG Research Services. IDG TechPulse survey. Jan. 10, 2016.

[ii] “Global Technology Adoption Index 2015.” A survey commissioned by Dell EMC and conducted by TNS.

[iii] Dell Technologies. “Embracing a Digital Future.” 2016.



Revealing the secret to speed and flexibility for data analytics

William Geller

Data Analytics Product Marketing at Dell EMC
William Geller has been involved in new technology and data science for over 15 years, with experience launching and marketing new products for both startups and enterprises around the world. William is the Principal Product Marketing lead for Data Analytics in the Solutions Marketing division of CPSD. Prior to joining Dell EMC, he worked for numerous startups in Healthcare IT, Social Network Analytics, and cyber security. He holds a VMware VCP4.0 accreditation. William has a BS in Electrical Engineering from Drexel University and an MBA from Babson College. You can find him on Twitter at @williamgeller

Most companies recognize that they have opportunities through data analytics to raise productivity, improve decision making, and gain competitive advantage. Unfortunately, the majority of initiatives fail to move beyond the experimental stage, or analytic insights are not operationalized back into the business as intended. The causes range from inaccessible, siloed data, to time invested in continually gathering the data before performing analytics, to long lead times for resources from IT. Recently, Enterprise Strategy Group (ESG) reviewed Dell EMC Analytic Insights Module, which is engineered to smooth out these friction points in the data analytics lifecycle. It’s delivered on Dell EMC Native Hybrid Cloud, combining a self-service data analytics experience with cloud-native application development in a single cloud platform.

What did ESG do?

ESG reviewed Analytic Insights Module from end-to-end, reviewing data processes across the entire analytics lifecycle. Their study focused on both the technical and the user experience aspects, and the results are in.

Analytic Insights Module delivers faster time to actionable insights and continual ease of use over other solutions for data discovery, ingest, and analysis. We’re not surprised, though. Analytic Insights Module was specifically engineered to make data analytics and operationalizing insights easier, so organizations can focus on their business rather than the technology.

What were some of the results? 

ESG found incredible benefits in terms of speed and flexibility of the analytics environment, and an overall benefit over building an in-house solution. A couple of the many observations from the study include:

  • Self-service analytic workspaces with Hadoop clusters stood up in just 15 minutes.
  • Users can implement a robust data analytics solution 47% faster, on average, than building it in-house.

Why should your business care?

Faced with competitive pressures, businesses need to differentiate themselves among the sea of options in the market. Insights derived from data analytics can help businesses gain the edge they need, but technical obstacles have often stymied progress. Analytic Insights Module removes these barriers by enabling data analyst teams to get to work quickly delivering insights back into the business within quotas set by IT. With single-contact support across the platform, companies can now move faster and become more innovative by focusing on their business, not the underlying platform.

Check out the full study from ESG now.

Recap – Strata+Hadoop World 2017 San Jose

Erin K. Banks

Portfolio Marketing Director at Dell EMC

I love the Strata + Hadoop World Conference (since renamed Strata Data Conference) and once again the 2017 conference did not fail me in any way. I wanted to take this opportunity to give you a quick recap since I had the privilege of attending.

I love how the keynotes are short and impactful. Wednesday delivered great insight from a conversation with Beau Cronin and Phil Keslin, CTO and Founder of Niantic. Another great session was from Rajiv Maheswaran from Second Spectrum. Niantic created Pokémon GO, and some of the great insight that Phil brought was how they started with a strong architecture that they knew would scale. They had no idea that it would need to scale so fast or so soon. Although there were sleepless nights, they were prepared and able to solve both the compute and big data problems they encountered immediately. Rajiv’s topic was “When machines understand sports” and it was great to see how they were able to track the movement of the ball and players and how they could transform the data into facts about players, strategies and probabilities as well as changing the overall game. We watch a great deal of basketball, especially now with the NCAA Men’s and Women’s Championships going on, and to see data analytics applied to the game like never before is really cool. Is there anything that data analytics can’t impact?

On Thursday we got to see many of the good things that data analytics can help to achieve. For instance, Desiree Matel-Anderson from the Field Innovation Team talked about “Data in disasters: Saving lives and innovating in real time”. Desiree talked about Hurricane Sandy and the Boston Marathon bombing as well as other recent events. She talked about how we can use social media and data analytics to determine the impact of an event and how they make it easier to follow or respond to events in the future. For instance, looking at social media, they saw that people were tweeting for “help” after Sandy had hit the coast, when the electricity was gone and the hurricane had subsided. There was no panic while the storm was making landfall. With this historical data, agencies like FEMA can help people feel more connected in future hurricanes and can respond to them faster. Another great session was from Maya Shankar, who worked for the White House Office of Science and Technology Policy under President Barack Obama. She spoke about “Improving Public Policy with Behavioral Insights” and provided us with four conclusions as to how those insights can guide next steps…


  1. Convert interest into impact
  2. Quantify impact
  3. Celebrate small wins
  4. Generate organic buy-in

These conclusions are based on many years of work and delivered as proof to Washington DC insiders that “data tells you what people are doing” and “behavioral science tells you why.”

The rest of the day was filled with expo time and some great technical sessions. The majority of the topics for the conference focused on real-time machine learning and Artificial Intelligence (AI). In one presentation, Rob Craft from Google explained machine learning best… he said machine learning is “one branch of the field of AI”, “a way of solving problems without explicitly codifying the solution”, and, last but not least, “a way of building systems that improve themselves over time.” Machine learning and AI are clearly the future with regard to data analytics and a great reason why they changed the name of the conference to Strata Data Conference. Don’t get me wrong… Apache Hadoop, Spark, Impala, Flink, Kafka, Beam, Apex, Kudu, etc. are still being talked about. Data analytics means a lot of things to people and it varies in multiple aspects, but one thing that remains the same is the fact that data analytics drives change and impact. Whether it is changing our nation’s policies or making better video games, we were all at Strata + Hadoop World to learn more and use that information to make a difference. That’s why I love Strata + Hadoop World. So much to learn, so many people to learn from, and realizing that for once, our jobs are making an impact and what is better than that?
