>> Ryan Brush: Terrific, let’s go ahead and get started. Thank you all for coming.
Welcome to the MSR visiting speaker series. Before we get started, let me make a
few notes. First of all, thank you Amy Draves and Michelle Riggen-Ransom for
organizing this event. We’ve got a great speaker today, so I’m really looking forward
to that.
Also, I wanted to make a note. My name is Ryan Brush. I work in GFS, Global
Foundation Services for Microsoft, but I also teach at University of Washington. So,
what’s fun about this is I’m actually going to use part of the book that you see in the
back as a textbook for this upcoming semester. From October through June we
offer a data analysis certificate program through the University of Washington, and it's
basically in the evenings, Mondays and Thursdays.
If any of you are interested in learning more about the program please feel free to
reach out. I'd be happy to share details about that. So as we get started, let me read
the short blurb about John Foreman.
So John Foreman is the chief data scientist for MailChimp.com, where he leads a data
science product development effort called the Email Genome Project. As an analytics
consultant John has created data science solutions for the Coca-Cola Company, Royal
Caribbean International, Intercontinental Hotels Group, the Department of Defense,
the IRS, and the FBI.
So John's book, Data Smart: Using Data Science to Transform Information into Insight,
is now out from Wiley, and it's possibly the most legal fun you can have with
spreadsheets. So I’m looking forward to today’s session, and without further ado I’ll
turn it over to John. Thank you.
>> John Foreman: All right. Thank you. So I’m John and I appreciate the
introduction, Ryan. Like Ryan said, I am a recovering consultant; I did all those
projects he listed off when I was a consultant at a management-consulting firm.
Also, before this talk I had a huge lunch over near building 99, so I’m just going to be
burping through the entire talk, so that’ll just happen. I’ll try to keep it quiet, but the
food was delicious. I’m an operations research guy, so operations research is math
applied to decision making, and it’s kind of a piece of data science before data
science was sexy, right? Operations research has been around for a long time, since
World War II looking at how do we schedule convoys, things like that.
So it’s an analytics practice that’s been around for a very long time. But I left
management consulting and went to MailChimp. So I don’t know how many of y’all
know what MailChimp is, or use MailChimp, but it is the world’s largest -- I see a
thumbs up in the back there. Some of you use MailChimp, yes!
So it’s the world’s largest email service provider. So MailChimp is a website through
which you would upload your email list and create a newsletter to send to that list,
and we help you do that and help you track opens, clicks, and e-commerce transactions
that started in that newsletter. So we send about 10 billion emails every month, and
then we track a bunch of interactions with those emails. And so there’s another 4
billion interactions coming in, opens, clicks, subscribes, reports, all that kind of stuff.
We actually send a lot to Hotmail so y'all get a lot of email from us.
We have about 280 employees in Atlanta. So we’re all in the south, that’s why I say
y’all. You’ll get used to it by the end of this talk. So I was a math guy and was used
to working with large organizations before coming to MailChimp, and so my
experience doing analytics was always that there would be this other analytics team
I would be working with and everyone would be analytics all the time. And when I
went to a startup it was kind of weird. I met Greg. Greg is a designer at MailChimp.
And then I met Fabio, email template designer. Mardov is the designer. And I met
Dave, designer, Kayla designer, Jen designer, Justin our t-shirt designer. He designs
t-shirts like this, and like that shirt. They make things like this billboard outside of
our headquarters. This is the death metal billboard. It was eventually replaced with
an ‘80s metal billboard and these are our action figures for Freddy, our mascot. Our
vinyl figures.
And so this became immediately very confusing for me. I met Aaron, who is our
head of UX. He is a very famous designer; he wrote a book called Designing for Emotion.
And this is Ben, our CEO, a man of many talents. He also has a design background.
And very quickly I realized that most companies are not in the business of analytics,
right? When you hear about data science and analytics these days the same
companies are in the headlines. A lot of it seems to have to do with tracking
personal data and doing ad targeting. I was going to a company that did not do
that.
They had an actual product that people paid monthly fees to use.
What does analytics look like there? What I quickly realized was that all these
people, all these designers, they were after one goal, right? They want to improve
the product by improving the user’s experience of that product. So what I realized is
if I’m going to do analytics at MailChimp, I’m going to have to get behind this. This is
the goal of everyone I kept encountering. So what does that look like for me?
What does it look like for a math guy? To use my skill set, which is not Photoshop, to
improve this website. So I went online and started looking at other websites, right?
So this is one I found. This is a data science product. I thought okay, I can do
something like this. This is from a restaurant review website. I guess they review
other things besides restaurants, but they release this heat map thing where I can
pick a city and Atlanta is not on there, so I picked San Francisco. Hey Seattle made
the list, good for y’all. And then you pick a keyword, but you can’t type in any word,
they’re curated per city, right? So if you pick Paris, like, baguette comes up and if
you pick San Francisco, hipster comes up, and then you click it and it colors a heat
map thing with review data. All the reviews talking about hipsters are all in the
Mission.
Okay, so that's the Mission in San Francisco. And so looking at this you kind of
realize, okay, this actually doesn’t improve my experience of the product, right? All
it does is it sort of flexes analytics muscle that hey, we have all this data, we’ve taken
all this textual data and put it in a heat map to say okay, people talked about
hipsters, now I’m showing it back to you because we think it’s cool, right? And
they've in fact selected the words that I can play with to parrot back something to
me that I already knew to begin with. That's how they curate it. It's like, oh yeah,
that's exactly where hipsters are, isn't that cool?
So I didn’t want to do this. This doesn’t look like it improves the user experience of
the product at all; this is more about impressing maybe the press or someone before you go
public. This is another example. It's from a famous professional social network.
This was their big first data science project that sort of kicked off data science is
sexy a few years ago. So in this social network there is me down here. I can
visualize all my connections in a map, and they’ve graphed them in these clusters.
This is probably using modularity maximization, which I talk about in my book, yay!
And so I can see okay, I’ve got these people in orange, these are people I knew in
Boston. I’ve got these people down here in green. These are people I worked with
at Booz Allen. So I've got these different clusters. I already know why these clusters
exist, right? I can just look at the graph and it’s like yeah, I worked with those
people there and I worked with those people there and I went to school with those
people there. So really all you’re telling me is things I already know and I can
highlight someone’s name and it gives me their name, so now I’ve seen their name
twice. So it’s really sexy looking, right, but when you get into it it’s not improving
the product in any way. It just looks really good.
It takes the data and just sort of parrots it back to you and tells you things you
already know, but it does it in a sexy way. So if this isn’t what I should be doing, I’ll
leave this kind of stuff to the designers, what should I be doing? So I thought okay,
well if I can’t find good examples, maybe I’ll look at the toolset that’s out there.
Maybe if I look at tools or techniques I’ll get some insight into what I should be
working on. So what does the big data / data science landscape look like?
I’ll just look at all the tools. I found this slide, which is a nightmare. But then there’s
version 2.0 of the nightmare, so this is nightmare version 2.0 and this is still a couple
years old actually. I suspect it’s getting even more crowded. It’s like bubbles within
bubbles within bubbles. So what I discovered is that most analytics teams do
something like this. Right? You start by hearing about some tools, okay, we need to
use Hadoop, we have to. I don't know why we have to use Hadoop.
Choose your tools first. You know a fraction of what’s possible because you’ve seen
some other website do something that involved a graph or they wrote a blog post
that was really cool, and then you just kind of flail about looking for something to do,
right?
Trying to figure out what does this mean for me? And then in order to impress your
boss there's another step, you create an infographic, right? So this is the last step in
analytics nirvana. You collected some data and the new tool that you paid a lot of
money for to save it there, even though your data wasn't that big and you probably
could have put it in a SQL database, but now I'm going to create an infographic out of it
to prove that it was all worth it and that I should keep my job.
So what I would propose is there’s another way to go about analytics. And it goes
something like this. Know what’s possible first, so find out what data you have
available to you, both internally and externally if there’s some external data that has
some bearing on your life, but most of the time it’s going to be internal data, right?
So I got these databases, these databases, these databases, someone over there has a
spreadsheet she rarely shares with anyone but I could use that -- so kind of take a
survey of all the things you have around the company. For Microsoft that's going to
be insane, but at least in your small world what are the data sources you have at
your disposal?
What techniques could I possibly use? So a basic understanding of what’s
forecasting, how is that different from predictive modeling, how is that different
from data mining? How is that different from optimization? So just kind of a basic
idea of okay, these are the various things people do with data in order to solve
problems and then understanding some of the technology, right?
So like if I want to do predictive modeling maybe I would use [inaudible] or if I just
want to do some basic data vis maybe I would use R again or Python or
maybe I can even use Excel and just draw a graph. So understanding various
technologies. Once you have all that down then identifying problems for
opportunities, right? So talking with people, figuring out okay, what do you guys need
help with that data could possibly solve?
And actually these first two points are in conversation all the time. Right? You
should always be talking to people, learning what opportunities there are and you
should always be learning about new stuff, the new data sources you didn't know
existed. Oh, that's a finance database I didn't know about. New technologies that
are coming out, new techniques people are publishing. So that's kind of all in
conversation, in tension, all the time.
And then once you know all right, we’re going to solve this problem, this requires
data, and we can solve it, here’s the data we’re going to use. We’re going to use
these sources, here’s the technologies we’re going to use, and here’s the techniques
we’re going to use, and then you solve it.
So it’s sort of a problem centric approach to data science, right? And this is actually
why I wrote this book. So Data Smart, this book is really not sexy. Basically I take
the reader through a bunch of techniques that are sort of the bread and butter of
analytics and data science, and I do it in Excel because that's a great place to learn them.
It's a very simple sort of tool, I mean it is a tool, but the data is always in your face, right?
So you can say these are the inputs and I'm going to put them through this formula
and I'm going to get the outputs, versus using a programming language
like R or Python where you're just going to load up the library and the work is done
for you. You never actually learn anything. So this book is just meant to teach you
like a survey of all the different techniques at your disposal. And it doesn’t just
hammer on one.
A lot of books out there seem to think that artificial intelligence is going to solve
everything; it's a really sexy technique right now, but there are a bunch of other
techniques out there that are also very useful. Outlier detection is one where you
can do artificial intelligence or you could do some other kind of technique that
maybe would be unsupervised. So I try to take people through all of them.
Okay. So that’s my spiel on analytics and how it should be done within a company
or an analytics team or some part of the organization. I want to just give y’all some
examples of what this looks like at MailChimp. So I’m not just kind of blowing
smoke like we actually try to practice this basic philosophy. And the way I’m going
to do this is I’m just going to go through and we break up the data science team into
sort of four things that we do. We deliver insights and capabilities. Insights is just
basic analysis research. So these might be one-off things where the result might be
a report or spreadsheet or a blog post, right? And then we build capabilities, which
are tools.
So these could be internal tools that other teams can use, or external tools that
actually just become part of the application, right? And so a MailChimp customer
can use one of these tools. So we do this internal and external stuff. So we do it for
internal customers, which are essentially other teams. We act as an internal
consultant to other teams and then we do it for external customers, which are our
users that pay us money.
And the way it generally breaks down for us is we spend about 20 percent of our
time just doing basic insights and 80 percent of our time building tools. That’s in
part because tools are just more powerful. If I can build something for the
application that means that all of our users can do analytics versus if I just do it as a
one-off then that one report gets created and that’s it. Furthermore, we spend a lot
of time on capabilities because they are just harder to build.
Okay. So I’m just going to go left to right and top to bottom. So internal insights. I’ll
give you some examples. This is our customer support team. They’re all in Atlanta.
It’s actually about half the company. These folks answer chat and email support,
right? So you just write in and say hey I’m having trouble syncing up my [inaudible]
website with MailChimp, how do I make this happen? These folks are the people
that are going to answer that question. And we’re adding users very quickly.
So in Q1 we added 200,000 new users. And this is actually the night
shift logo. We now have a night shift, yay! We’re hiring about one new support
employee every day, which for a company like ours -- I started there less than three years
ago when we had 80 employees, and now we're at about 300 -- is pretty rapid for us.
It’s been kind of a shock.
But one of the things we’re bumping up against is okay, we’re now at a size with 6
million customers where we have to do some kind of triage or sorting of who gets
support, what does that look like, right? And I think the easy way to do it would be
to say okay, we’ve got forever free accounts for very small users, so these are people
like oh I have a fantasy football league and I want to send a newsletter through
MailChimp. They might use us. Should they get support in the same way that a paid
user does?
So that would be a very simple way to do it. Unfortunately what we’ve found in the
data is that 40 percent of eventually paid users come into customer support before
they pay us money. Right? So if you just give a certain type of support or support
faster for people who have already paid you’re missing a lot of people who would
have paid you money.
So how do we solve that problem? Well, one of the things we could do, you’ll
remember I’m in the insights section, not in the modeling and tools section of this
presentation, I could just build a big AI model to predict paid customers. That’s one
way to go about this. In fact, it's something we toyed with to see how far we could get.
This is a quote that I really like. This is from Robert Holte. He wrote a paper called
Very Simple Classification Rules Perform Well on Most Commonly Used Datasets. And
the basic premise is that you can build a big AI model to do a lot of things, but single
rules often perform very well.
So it’s a single rule. Okay, I have this data that comes in about this user. This user
looks like this. They do this, they do this, they do this. Let me check one thing. And
based on that one check I’m going to say they’re likely to pay us money. So that
would be a single rule. Those actually perform very well, right? A lot of times
people want to jump to the most complex thing they can do. But Holte's contention in
this paper is that complexity must be justified. People never really think about the
fact that they have to justify why they’re using a really complex approach, and in fact
I think Holte was probably saying you have to justify it with better accuracy, you've
got to provide an ROC curve that shows how if we use this complex model we get all
these gains.
You also need to justify the additional organizational complexity. I’ve worked at a lot
of companies, especially as a consultant where my job was just to build this really
complex thing that the client had asked for. And the sad thing about that is that
they'll often get these really complex things where they can tell, like man, we got this
space-age model, but the moment the consultants leave it dies. Right? Because no one
knows how to update it. They ask for too many buttons and knobs and then
immediately forget what they do, and no one wants to read the documentation.
There’s an organizational complexity you can introduce if your model is just
unwieldy. I had to build one forecast model that included standard deviation in it
and I had to describe to every single person who was going to use it what standard
deviation was. And the moment I left, the model was mothballed.
So when we think about these models at MailChimp I have to think okay, am I going
to stick around for this paid-free prediction or is this something I’m handing to
someone else? Who is that person? How do they keep it up-to-date? In this
particular case the customer was just a developer in support right, and they were
not going to be keeping any eyes on this, so we analyzed the data and found you
know what, there actually is not one rule, there are two rules I can hand you that do
a really good job of ordering customers.
One is when you sign up for a MailChimp account what’s your email address? Is the
domain a free mail domain or is the domain your own domain for your business? So
is it a paid domain? If it’s a free mail domain you’re less likely to ever pay us money.
Second rule, do you have a list of email addresses to import into MailChimp? If you
say no, or if you leave this question blank you’re really not likely to ever pay us
money. Those two questions combined were really extremely powerful features.
After those two questions anything else we added to models helped, but wasn’t
nearly as powerful as those two features. So we just left it at that.
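To make that concrete, here is a minimal Python sketch of what a couple of simple rules like that look like in code. The field names and the free mail domain list are made up for illustration; this is the spirit of Holte's point, not MailChimp's actual scoring code.

```python
# Hypothetical sketch of ordering new signups with two simple rules, in the
# spirit of Holte's "simple rules" result -- not MailChimp's actual code.

FREE_MAIL_DOMAINS = {"gmail.com", "hotmail.com", "yahoo.com", "aol.com"}

def likely_to_pay_score(signup):
    """Score a new signup 0-2; higher means more likely to ever pay."""
    score = 0
    domain = signup["email"].split("@")[-1].lower()
    if domain not in FREE_MAIL_DOMAINS:
        score += 1                        # rule 1: business domain, not free mail
    if signup.get("has_list_to_import"):  # rule 2: says they have a list to import
        score += 1
    return score

# Support could then just work the queue in score order, highest first.
signups = [
    {"email": "info@cornerbakery.com", "has_list_to_import": True},
    {"email": "fantasyfootball@gmail.com", "has_list_to_import": False},
]
signups.sort(key=likely_to_pay_score, reverse=True)
print(signups[0]["email"])  # info@cornerbakery.com
```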
We realized we could get a more accurate model by including a bunch of other
things, I mean you could go out to the domain of that person’s email address, look at
it and use BuiltWith to tell what technologies they built their checkout process with, you
could do all of these creative things, but who's going to keep that up-to-date? Do
we think it has a chance of survival? If the answer is no let’s stick with the simple
one. That’s fine.
People need to realize they’re free to do that. Okay, another example of the basic
insight that we created. Someone in support had to schedule everyone who’s doing
chat support, and they had to schedule when they would take their lunch break.
That’s where it says phasing. They’re phasing out of chat. It’s kind of sad, some of
these people are point five. That means they’re half a person. That’s because they
were just hired and they’re not fully trained, so they’re counted as half a person,
which is kind of depressing. But they all become full people eventually after they
take enough chats.
But this is very difficult for someone. This is actually kind of a classic operations
research problem to have demand coming along the bottom here. It's sort of a
forecast of chat demand that comes out of a forecast model, and then you've got to schedule
when are people going to take their lunch breaks. They had all these rules like if you
take a lunch break you should take it with someone else because you don’t want to
be lonely. So we thought why don’t we just do the schedule for you because we
know the math behind it? It's a classic operations research problem. It looks like
this: it's actually just a bunch of inequalities, and then you've got an objective
function; in our case the objective function is to maximize availability subject to
demand.
We code it up in LP format, which looks like this, which is frightening, don't look at that
too closely, and just return to them a schedule back in Excel just how they wanted it.
It says okay, here's when people are on point, meaning they're taking chats.
Here’s when they’re taking their lunch break. Certain days they are on email. And
that’s it.
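For a flavor of what that kind of model looks like, here is a toy lunch-scheduling version in Python using the PuLP solver. The staff names, slots, and forecast numbers are all made up, and the constraints are simplified stand-ins for the real availability-versus-demand rules.

```python
# Toy version of the lunch-break scheduling problem, using PuLP as the solver.
# Staff, slots, and the demand forecast are invented; the constraints are
# simplified stand-ins for the real "maximize availability subject to demand" LP.
import pulp

people = ["Amy", "Ben", "Cal", "Dee"]
slots = range(8)                        # half-hour slots across the lunch window
forecast = [3, 4, 4, 2, 2, 3, 4, 3]     # forecast chat demand per slot

# x[p][s] = 1 if person p takes lunch during slot s
x = pulp.LpVariable.dicts("lunch", (people, slots), cat="Binary")

prob = pulp.LpProblem("chat_lunch_schedule", pulp.LpMinimize)

# Objective: minimize forecast-weighted absences, i.e. keep the most agents
# on chat when forecast demand is highest.
prob += pulp.lpSum(forecast[s] * x[p][s] for p in people for s in slots)

for p in people:                        # everyone gets exactly one lunch slot
    prob += pulp.lpSum(x[p][s] for s in slots) == 1

for s in slots:                         # at most one person at lunch per slot
    prob += pulp.lpSum(x[p][s] for p in people) <= 1

prob.solve()
schedule = {p: next(s for s in slots if x[p][s].value() == 1) for p in people}
print(schedule)                         # e.g. {'Amy': 3, 'Ben': 4, ...}
```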
This is an example where we found that someone was doing something the manual
way, right? They were just kind of playing with people like okay, what if I slide the
lunch break, and we realized we know how to do this with math. This actually isn’t
a big data problem and we can just do it real quick and just hand you back this
artifact. This spreadsheet that’s actually optimal in terms of forecast and chat
demand.
One thing I like to keep in mind while we’re still on this topic is the [inaudible]
example from Star Trek where Kirk has a test he has to take and this particular test is
unbeatable. I think the story actually changes in the new Star Trek movie versus the
old Star Trek, but the basic idea is that there are these Klingon warbirds and they're
coming in and they’re going to destroy you and you actually can’t win. The point of
the test is to see how you do under pressure given that you will fail.
And Kirk actually goes in and changes the rules of the game, right? So he actually
goes into the system and changes it and cheats so that he can win because he refuses
to admit defeat. He’s just like that. Instead he eats an apple and wins.
But this is something to keep in mind as an analytics person because I think the
tendency is to be presented with the very complex game or system to be like oh, I
can solve it with math and data and you need people to go to lunch together so
they’re not lonely, yeah absolutely. But I think that analysts need to feel empowered
to also say, you know what? Maybe we don’t need to do that anymore. It is perfectly
acceptable to win some sort of analytics problem by suggesting to the business that
maybe you should no longer do that at all.
And the nice thing that analysts have and that data scientists have is this ability to
take in the data, describe the problem, model it and say okay what happens if we get
rid of that rule? So what happens if I get rid of this rule that people have to go to
lunch together? Does that somehow make us more available to take chats? And if so
by how much?
So it’s okay to change the rules of the game and it’s really nice when you can
quantify what that change brings the business in terms of revenue or cost savings,
so keep that in mind.
External insights. We’ll go through this one really quickly. These are basically blog
posts, guides, things like that. So a bunch of our users want to know, what are the
demographics of some of these free mail providers? So how does Hotmail compare
with Yahoo, AOL, Gmail? We've got Comcast on here because we send a ton of email
to Comcast. Interestingly enough this is just one graph I stole out of the post, this is
age distribution. So these are first quartile, median, third quartile age, for various
email providers. Comcast is way older, which is kind of interesting and it’s because
you have to have like at least a couple bucks to have a Comcast email address, right?
Because you have to have cable or Internet or something. And then you’ll see AOL
actually slides out about six or seven years past Hotmail and Gmail. So that's kind of
interesting. Hotmail is pretty young. I suspect some of this has to do with the fact
that I need an email address in order to have an Xbox Live account when I'm setting
that up. It's going to be Hotmail or one of a bunch of Microsoft-owned free mail providers
that are suggested. So we see that those age ranges for Hotmail and Gmail are actually
very close to each other. Yahoo is a little bit older.
So it’s just one sort of thing that we did and provided that to our users. We have a
bunch of blog posts like this. And our users will get concerned about a few things.
So a lot of people send email to Gmail. This is my Gmail account here. Gmail
introduced the new promotions tab, right? So they took a lot of email marketing and
put it on this upper tab. And so people were flipping out like my email is in another
tab. What does that mean?
Are people ever going to read my email again? So we were actually able to do
analysis and find out what exactly it meant because there is nothing to be gained by
hiding any of this from folks. Rather than letting the question fester, a data science
team can actually go into the data and just quantify like okay, here’s what the change
actually did. So for Gmail email addresses what we saw is the sort of typical click
rates for weekend and weekday the three weeks before the promotions tab was
introduced and the three weeks after the promotions tab was introduced, so typical
click rates during the week would be 13 percent for a newsletter. After promotions
was introduced it’s down around 12 percent. So you lost about one percent.
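As a rough illustration, the before/after comparison described here boils down to a simple grouped aggregation. The file name, column names, and the launch date used to split the periods are all hypothetical:

```python
# Sketch of the before/after click-rate comparison described above, using
# pandas. "gmail_sends.csv", "send_time", "clicks", and "delivered" are
# hypothetical, as is the launch date used to split the two periods.
import pandas as pd

TAB_LAUNCH = pd.Timestamp("2013-07-01")    # placeholder Promotions tab date

sends = pd.read_csv("gmail_sends.csv", parse_dates=["send_time"])
sends["period"] = (sends["send_time"] >= TAB_LAUNCH).map({True: "after", False: "before"})
sends["daytype"] = sends["send_time"].dt.dayofweek.map(
    lambda d: "weekend" if d >= 5 else "weekday")

# Click rate = total clicks / total delivered, per period and day type.
totals = sends.groupby(["daytype", "period"])[["clicks", "delivered"]].sum()
totals["click_rate"] = totals["clicks"] / totals["delivered"]
print(totals["click_rate"].unstack("period"))
```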
So it’s not the end of the world, but it did get affected. So we’re able to just
communicate that to the user. Here’s about what is going to happen. Another
example. During the government shutdown people wanted to know what's going to
happen to the engagement of email addresses belonging to people who work for the government, right?
So we were actually able to provide some of that analysis as soon as the government
shutdown started happening.
Okay, here's what we're seeing. If you're at the EPA or the SPA then no one is
checking their email because they're not allowed, they're forbidden by law. Some
people are still cheating, you'll notice. Like technically no one from HUD should have
been checking their email and about one-fifth were, versus the SEC, which somehow skated
by with everyone still working. And this was actually when the fall was
ramping up, so there's going to be an increase in email traffic and we should see
actually an increasing engagement, which is why, as a percentage of before the
shutdown, it's actually above 100 percent for the SEC and the State Department.
So we were able to provide this to people so they knew okay, if I’m in the mortgage
industry, if I'm sending a newsletter that people who work at HUD are going to
read, yeah I should expect that my engagement will fall off a cliff. That’s normal.
Well, as much as a government shutdown is normal it’s normal.
So these are just things that we can provide to people. Okay. So now let’s get into
some of the fun stuff, which is building capabilities or tools. I’ll just give you some
examples of these.
I’m going to give you an example from compliance. We have a team that shuts users
down. And the reason why they do this is because we have six million users. We
don't have six million IP addresses that we send email over. So a lot of our users
share an IP address. There are some really good averaging effects that come from
that, but if you get one really terrible user in there who is sending to a lot of dead
mailboxes, people who haven't checked their mail for 10 years, it's going to get
noticed and that IP address could be blocked.
So we need to shut those users down very quickly. So one of the things I looked at
when I first joined MailChimp is who is getting shut down and for what reasons. I
went through all the reasons and created this flow chart. You can get shut down for
multiple accounts, you can get shut down for sending to a bunch of dead addresses,
and then you go through various processes and either be kicked out the door permanently
or make some restitution, keep the good lists, and that's that. But one thing I noticed, I
don’t know if you can read it in the back so I’ll read it for you, about one-sixth of our
users were shut down for loading a large list of email addresses, trying to buy a
high-dollar account, or requesting high volume approval, i.e. approval to send to a
large list.
None of those things are bad, right? They’re just scary. If I’ve got a user who comes
in like, I want to send to 2 million people, we're going to shut you down to ask why do
you want to send to 2 million people? Because if we let that mail out the door and
they’re an abuser, that’s a really terrible situation. So we’re going to manually
review them. But people have this expectation when they sign up for an online
service that they shouldn’t be slowed down, that things should happen seamlessly.
We get a lot of letters like this bullshit review process, I’m sending out press
releases. Press release is in all caps. I don’t know. Yada, yada, yada this is a pain in
the ass. People get really angry. And I understand. When I sign up for any account
online I expect it to kind of work really quickly. It's email. I should just sign up and be
able to send. They don’t realize that spam is something that sort of originated with
email as a big problem, and it’s one we have to control for. And people are signing
up like the day before Black Friday, right?
So it’s high potential for abuse, but they really want to get out their email today.
People want to procrastinate. So how do we solve this problem? Well, instead of
doing a manual review process we create models that will predict when
someone’s going to abuse. And that’s what we tried to do. This is a quote from Paul
Graham about a decade ago. Basically, he says well, spammers, it’s all about the
content. They can’t change their content. They can’t get away from saying Viagra.
They have to say Viagra.
So as long as I can detect Viagra I can shut them down. This is actually wrong these
days, right? Most of our abusers look like this. This is an email that I personally got.
I did not sign up for this list. Dear Avondale Estates resident, I am a realtor, yada,
yada, yada, welcome to the neighborhood, whatever. I never signed up for this. So
there are some ways that this person might be within the law, might be within CAN-SPAM
if they provide certain information about how I can unsubscribe, etc., but from
a MailChimp perspective this is non-permission based email. We would consider
this person a spammer. They just got my email address from the city and blasted
something at me.
But there is nothing in the content that would allow me as an AI model to
necessarily know this is a spammer. There are good real estate agents out there that
only send to actual customers. So how do we tell the difference?
Well, spam is in the eye of the beholder, right? We tell the difference because it’s the
people on your list that are either going to open and click the email or they’re going
to hit the spam button, right? And that’s the signal we’re going after. So really it’s
not about content at all. It’s about your list. Who are these email addresses that
you’re about to send this to? Where have I seen them before? What types of stuff do
they receive? How old are those email addresses? What happened the last time
anyone of our millions of users sent to them?
We've sent to 3 billion email addresses at this point and we save all this data. So we're
able to build a big system that looked at that mostly list based information, as well
as user-based information, a bit of campaign information. But the thing that we
tried to focus on, and this goes back to the earlier slides at the beginning of the talk,
is eliminating risky complexity, right? Even when we build a production system and
we deem this one important enough to actually build a full-fledged AI model instead
of just a simple rule, we want to make sure that we design it in such a way that it’s
going to survive, right?
So most of the data we used internally was already structured, so rather than using
some unstructured database like Hadoop we just chose to use Postgres,
because our data is already structured, so why not just use a sharded SQL
database for it? So we made a lot of design decisions like that. We used random
forest as our AI model. I talk about that one in the book because that one is actually
particularly hard to overtrain. It's pretty robust against that. It's not very finicky.
It’s a pretty good one to use.
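Just to sketch the shape of it, a mostly list-based random forest model along these lines might look something like the following in Python with scikit-learn. The feature names and input file are invented for illustration; this is not MailChimp's production code.

```python
# Minimal sketch of a list-based abuse classifier like the one described above,
# using scikit-learn's random forest. Feature names and the file are made up.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

campaigns = pd.read_csv("campaign_features.csv")   # one row per outgoing campaign
features = [
    "pct_addresses_never_seen",    # share of the list we've never delivered to
    "pct_role_addresses",          # admin@, dba@, info@, ...
    "pct_previously_bounced",      # verifiably dead addresses
    "median_address_age_days",     # how long these addresses have been around
    "prior_complaint_rate",        # spam complaints when others sent to them
]
X_train, X_test, y_train, y_test = train_test_split(
    campaigns[features], campaigns["flagged_abusive"],
    test_size=0.2, random_state=0)

model = RandomForestClassifier(n_estimators=300, random_state=0)
model.fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))
```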
So we just made decisions that would be slightly more robust. Okay. But let’s talk
about building tools for external users, right? We’re not talking about building an AI
system. That’s something that’s internal. It can be fairly complex. I’m around to
explain it to my coworkers in person. What does it look like to do analytics where
the customer is someone on the other side of the world? And they’re paying you
maybe 15 dollars a month.
How do I build a tool for them? This is very hard not only because tools are hard to
build, but they're also hard to communicate. Data science tools especially. So we like to
keep it simple. This is probably my favorite thing I’ve gotten to work on. We have
this CAPTCHA you have to fill out when you create an account with us. And this is a
nightmare, right? I mean I don't know at what point Google decided that we were
going to start labeling all the addresses in the world for them. I assume this is a
street address, this like black blob is probably a street address, but I never know the
answer to these things.
I think on average I’m at least failing the first one. So can we just get rid of it? That
was the question. Can we use our data to get rid of it? And in fact what we found is
that for certain email addresses during the sign-up process we could use our anti-abuse
models we'd already built and all the data we were already saving in a large
database called the Email Genome Project -- we actually have data about 3 billion
email addresses that we're constantly updating based on their interactions with the
system -- we could bring that data to bear on this question.
There are certain email addresses, certain people who were signing up for an
account where I just don't know or it's a grey area. You're going to fill out a CAPTCHA. But
there is a large portion of our user base for which we never have to show this. So
we removed it from the account sign up process. We removed it from the process
where you contact customer support.
So the artifact for this project, for me -- if we think about how it's
really hard to communicate data science to people -- there's just nothing. We just
removed something from the application. And that’s it. That’s all we did. And so I
was really happy about that because I didn’t make anyone’s life any harder. All I did
was remove a barrier. And I took great joy in that.
So another example is send time optimization. A really nagging question for our
users is, when do I send my email marketing? They just get so hung up on it. They've
spent all this time creating this newsletter about some sale they're about to have.
They have maybe a million customers they’re going to send it to. And they just don’t
know when to send it and they start freaking out. Then they start going online and
reading anecdotes from people who generally have guru in their job title and they
have no idea what they’re talking about. Maybe they have some small set of data
from one client they worked with and they’re like oh, well people work from 9-5 and
they always take their lunch break at noon, and they’re always stuck in traffic at 6,
so you don't want to send it at noon or 6. And it's just always baloney.
So could we answer this question for people? Well, this is a graph from one
individual email address. We’ve got some timeslots here for sends versus clicks.
And what are we really interested in? We’re interested in when are people engaging
with their email? And that engagement is not being driven necessarily by the
notification they just got that they received the email. So where can I find
engagement that’s not tied to volume, but rather is oh, you were available and you
went back and checked your email.
So we built an optimization model off of that data. Here is actually some
distributions of optimal send times for email addresses by categories. So we’ve got
college-aged folks, folks in their 40s and folks over retirement, and if we look at the
distribution of optimal send times, we’ll actually see that college pushes out to the
right. So it peaks later -- most email addresses in college have an optimal send time
around 1 p.m. -- versus folks in their 40s and after retirement, where it's around 10 a.m.
So it’s really interesting that the data actually pointed that out and you see a similar
trend for a lot of stuff. Bartenders are closer to 1 p.m. Neonatal nurses there really
is no peak. They’re kind of checking their email throughout the day because their
schedules are just weird.
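The core idea, stripped way down, is to look at engagement rates per timeslot rather than raw engagement counts, so the answer isn't just an echo of when senders happen to send. Here is a toy Python sketch with hypothetical fields; the real system is far more involved.

```python
# Toy sketch of the send-time idea: score each hour by the rate of engagement
# given a send, not the raw count, so the answer isn't just an echo of when
# campaigns happen to go out. Field names are hypothetical.
from collections import defaultdict

def optimal_send_hour(history):
    """history: records like {"hour": 0-23, "sends": n, "clicks": m}
    aggregated over one subscriber's past campaigns."""
    sends = defaultdict(int)
    clicks = defaultdict(int)
    for rec in history:
        sends[rec["hour"]] += rec["sends"]
        clicks[rec["hour"]] += rec["clicks"]
    rates = {h: clicks[h] / sends[h] for h in sends if sends[h] > 0}
    return max(rates, key=rates.get)

history = [
    {"hour": 9,  "sends": 40, "clicks": 2},
    {"hour": 13, "sends": 10, "clicks": 3},   # fewer sends, higher engagement
    {"hour": 18, "sends": 25, "clicks": 2},
]
print(optimal_send_hour(history))             # -> 13
```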
So we built this system but it’s got tons of data in it and it looks at data in an
individual level. It’s all stored in [inaudible]. It’s really sophisticated. How are we
going to show this to our users? Well, they care about the optimal send time for
their list. When they go through the campaign set-up process they have to pick a
time to send at. If they want to schedule their campaign we just have a bubble there
that says let MailChimp pick, and then a little check box that says the optimal time to
send is 3 p.m.
So we did all this work but that doesn’t mean I have to build a big graph with lots of
colors and a heat map or anything like that to communicate that work. Right?
That’s not for our users that would be for the press.
So instead if you want to make the user experience better, let’s just make it a simple
bubble. That’s enough, right?
Here’s another example. I’m just loading up examples here. Apologies. What if I
have a list and there is a particular segment I’m really interested in, but I don’t know
all those people? Right? So let’s say MailChimp has a list of 5 million email
addresses of people that have asked to be updated on our app and new things we’re
releasing.
So let’s say I want to find the press in there, the tech press, and I know a few of the
email addresses, like I know that person works for ReadWriteWeb, I know that
person who works for some tech blog, TechCrunch. But I don't know all of the press
that’s on my list. How am I going to find them?
Well, chances are, based on their subscription data and how they interact with our
system, other people in the press live in their neighborhood, in a data neighborhood
sense. So they are subscribing to the same things, they are interacting with the
same things. So what do I really want to know? I want to know who lives as close to
the tech people, the tech writers I know, who lives as close to them as they live to
each other?
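As a toy illustration of that neighborhood idea, you can measure how far the known press people sit from each other in feature space and then pull in everyone else on the list who sits about that close to them. The names, the feature encoding, and the distance metric here are all made up for illustration:

```python
# Toy sketch of the "similar segments" neighborhood idea. Subscribers are
# vectors of invented behavioral flags; we measure how far the known press
# people are from each other, then pull in anyone who sits about that close.
import numpy as np
from sklearn.metrics.pairwise import euclidean_distances

subscribers = ["a@techblog.com", "b@rwweb.com", "c@bakery.com", "d@newsdesk.com"]
X = np.array([          # one row of feature flags per subscriber (illustrative)
    [1, 1, 0, 1, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 1, 0, 0],
    [1, 1, 0, 1, 1],
], dtype=float)

seeds = [0, 1]          # the press people we already know on the list
seed_spread = euclidean_distances(X[seeds]).max()     # how spread out the seeds are

dist_to_seeds = euclidean_distances(X, X[seeds]).min(axis=1)
similar = [subscribers[i] for i in range(len(subscribers))
           if i not in seeds and dist_to_seeds[i] <= seed_spread]
print(similar)          # -> ['d@newsdesk.com']
```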
So we can do that calculation, but how do we expose it within the system? You go in
to the create a segment tab here and you put in some email addresses of people we
know. I know this person [inaudible]. I know this person [inaudible]. These are
people on my list and I just want to find the other people on my list that I might not
be aware of.
All we did is we created a discover similar segments button. So you can either save
this segment or you can save the segment with some other people who look like
them. And that’s it. It’s just a button.
So that’s really the approach we take. We work hard to build these systems that are
very rigorous, but we try to keep things as simple as possible, right? If we can keep
the math simple we keep the math simple. If the math has to be complex we try to
make sure that the way it’s exposed to people, the way that it’s kept up is still
simple.
And so I’ll just leave you with four points really. I think anyone who is doing data
analysis at any company it’s important to align yourself with the goals of the
business and serve your colleagues. You see a lot of articles these days about how data
science is the sexiest job. Just align yourself with everyone else and serve; that
way, when all the hype dies, people see that you are valuable and want to keep you
around.
Data science products should receive no special treatment. So if a designer can
solve a problem that I can solve by just changing the color of a button or making the
app look different or feel different that is fine. There is nothing special necessarily
about using math versus something else. So if something is not working, feel free to
kill it.
Get a goal first, and then get toys. Don’t just flail about. Figure out exactly what you
want to do and then figure out, here's the toy, here are the tools that are going to
solve that problem.
Avoid complexity. That’s really what I’ve talked about for 30 minutes now. So that’s
it. If y’all have questions I think we have time for questions, right?
>>: You mentioned the email genome project [inaudible]?
>> John Foreman: It's not. That's the short answer to that. We're still trying. We're
going really slowly with it. There are a few companies out there that are sort of
charging around like a bull in a china shop with data like this and they're
getting in huge trouble.
We always err on the side of privacy, and these systems were built, first things first, to prevent abuse, right? So we have tons of data on billions of email addresses,
but I don’t know. We always try to figure out how can we share and what can we
share in a way that maintains expectations from recipients, expectations from our
users? So stuff like discover similar segments is powered by EGP; however, it’s only
about stuff on their list.
We might use that 3 billion email address dataset to perform these neighborhood
calculations, but all I’m telling you is you have these email addresses on your list,
these are email addresses like them also on your list.
So everyone is double opted in and everything is above board so we don’t really
share any of that right now.
Yes?
>>: What is your educational background?
>> John Foreman: So my background is pure math in undergrad, so I studied
abstract algebra and stuff, and then my advisor sat me down and said you’re not
going to be a great mathematician. His words were actually something like you
could go to an okay grad school but you’re going to be one of those guys that kind of
toys around with small results for the rest of your life and math will jump forward
really quickly at certain points by geniuses and you’ll be toying around, but if you’re
cool with that and you really like this keep doing it.
So I thought, that sounds like it sucks; I'm going to get a job. So I went to grad school in
applied math operations research, and did that instead and then my wife got
pregnant so I was like well, now I really need a job. So I worked for the government
for a while. My problem with the government is that some of the folks you interact
with just really want to retire.
I like things that are exciting and I just ran into problems with people just like, there
was one person I was interacting with who had a picture of a golf course above their
desk and it was like yeah, I’m going to be there next year when I retire and I’m not
doing anything until then. And it’s just really rough. I’m sure there are other
government organizations where that’s not true, but that was true for me so I got
out of there.
But that’s my background.
Yes?
>>: You talked a lot about for instance, one of your examples with the email, the
subjects or the content of the email being the primary predictor versus the
recipients, but it seemed as though the recipients were largely based on ancillary
data about the recipients, whereas the intrinsic data to the email itself, the only
thing really there is the subject plus some names that you don’t know anything
about.
So the question that I should have gotten to much quicker is how do you think about
the data neighborhood of a person? Is it more important to get the data to
understand the neighborhood or is it more important to get the toys, tools, and
analysis to analyze what you do have?
So if you were looking at emails alone with no further context, how far could you get
with analytic tools and processes versus ancillary data collection and understanding
the connections between multiple --
>> John Foreman: You're right. That becomes extremely important. So in order to
have that abuse model work the main predictors are built on the list, right? So it’s
when we think about an email sent through MailChimp there’s this huge payload
that’s the content and then the list of email addresses and there is your user
metadata. So you're located here. You say your business address is here because
you have to provide a real address for CAN-SPAM. And then your billing address is
here and your Diners Club card, we'll look at all that too.
But with the email addresses you are right. You can get some stuff just from the list,
just as it is. So why is your list 100 percent Gmail and all in caps? Why is everything
like admin@ or dba@? That would be really scary because they're all role addresses.
But usually we have to connect it up. So we want to go into EGP and we have to
have tools to do this. We want to go in and say okay, where have we seen these
email addresses before? How many of them were involved in the Sony hack? How
many of them have been made public? How many of them do we think are
journalists? Why are you emailing 80 percent journalists? That seems scary.
We’ll look at all sorts of stuff like that. How many of them are verifiably dead? So
the last time they were sent to, the inbox was dead. How many of them have soft
bounced a gazillion times in a row, meaning they’re on vacation permanently as far
as we’re concerned? In order to understand all that you’ve got to have this big
network of everything. Also subscription data. You get some lists where it’s
extremely broad, and then other lists where it’s all women in their 20s and that’s it.
So what does it mean that you're sending to a huge group of these people when your
content is kind of odd or you're geo located halfway around the world and you're
sending in a different language?
So yeah, we have to consider that whole thing and get all sorts of tools and
techniques that would help us look at that much data. Yes?
>>: It seems like you're getting good results from all the networks you have built up
and all the rules that you guys have. Just going back in time, when you first started
at MailChimp you had a smaller set of users, was there a reliance on external data
sources to address some of those problems, or how do you reconcile that as a company
group?
>> John Foreman: It never really factored in until -- so the company was quite small,
until we introduced our free plan. Then it exploded. It’s like free marketing. And
one of the great things about free users is they’re essentially a marketing cost, but
they provide all this data that can be used to facilitate building these tools, which
helps keep the system clean for everybody.
I came around after the free account because before it there wasn't really enough data to do
this kind of analysis. And because there wasn’t enough data people just didn’t focus
on it at all. Just didn’t even try.
And so once we had all the data lying around it was like well, we could just operate
as we always have, just charge for accounts, which is what we do and just keep
growing and that’s fine, but they just wanted to make use of this and abuse was
really one of the main things that catalyzed it because if you have a free plan you’re
going to get abusers who are like, yeah, let’s just write a bot to sign up for free
accounts.
So they have to bring in someone like me to solve that problem, but then you get to
do all these other fun things with the data. There's a question online.
>>: How do you design software with data first? Is there a different way to think
about designing a website if I want to collect all this data to model analyze later?
>> John Foreman: So yeah, there's a tension there. Like we try to design MailChimp
in a way that just provides a really fun experience for folks. If you’ve never used the
website before you send an email campaign a little monkey hand comes down above
this big red button that’s kind of shaking and sweat is dropping off it. It’s really fun.
If you start thinking about oh, I want to collect this data and that data, all of a sudden
the user experience is degraded.
So recently I logged into Facebook and it was like, do you work here, or here, or here?
I was like why would you even think that I work for Mary Kay? Like I don’t
understand. What did I say that made you think that? And how do I get past this
screen? So there is this tension with if I want to collect data and use it to then have
some feedback loop where I get better products, is that going to get in the way of the
thing that I want, which is a better product?
So we try to find ways to not bug people, right?
Yes?
>>: If the point is not to bug people, do you have to deal with uncertainty and
ambiguity in your models instead of knowing for certain that that person works at
Mary Kay you say I have a guess that they might?
>> John Foreman: Yeah. And I think that that's actually really good to keep that humility in mind
all the time. Like Shutterfly recently sent an email campaign to a bunch of people
saying congratulations on your new baby! And a lot of those people did not have a
new baby. They thought that their data was correct, and in fact it’s often not correct.
So we try to build tools that will never get us into that situation. And so it’s kind of
nice when you can never be in a situation where we’re like oh yeah, we know that
for sure. Because most of the time you’re wrong.
Yes.
>>: I just wanted to thank you for the simplicity of your book. I just picked it up
from the front desk on foundations for predictive analytics because it had words like
foundations and practical and simple, and on page 10 that’s the math. And this is
the second book. The first one was predict [inaudible] for dummies. It was the same
thing. Nowhere does it start by saying hey, I have two data points and I want to
predict the third. And you can do that, but here’s the limitations of it, so here’s how
you want to get around that, and then another example and that sounds like what
you’re doing.
>> John Foreman: Yeah, that’s why I chose Excel. A lot of books do choose to go like
we're going to do some mathematical notation written in [inaudible] first, then
Greek symbols like the sigmas, and we're going to do some, and just assume not only
do people know how to read that but, if they don't, that they are going to explain it
and that people would even want to read it. And I think that Excel is really nice
because it's this language that people already get because they have done it. I've got
to do the sum, then =SUM, and then drag through this range. You already
know that. And so yeah, I like that kind of stuff but you just get sick of it after a
while. Sometimes you just want something that’s a little bit different.
So I tried to --
>>: Well if you have an operations research background, but for those of us who
don't, we don't have the option.
>> John Foreman: Right, and I think if you’re in school this is like your job is to read
this stuff. If you're working another job you kind of want to cut to the chase a little
faster. Other questions? Yes.
>>: [inaudible]
>> John Foreman: The AB testing they did? The significant AB testing they did?
>>: No, the Obamacare rollout.
>> John Foreman: Oh, oh that actually, just the website falling over? I mean what I
can say about that was having worked in government and worked in consulting
when I heard that that was happening I was like yeah, of course. It was so many
contractors all working on little pieces of it and then smashing it together. The
whole government contracting apparatus is set up for failure, right?
I mean, you look at that and then you look at the F-35, that's [inaudible], it's hard. I
feel bad for the government sometimes. Working at MailChimp is so much nicer
because we can do things really quickly and not build a camel for four years and
then release it and hope it works. Yes?
>>: [inaudible]. From your presentation I think that there are at least two points of
[inaudible] to look at which is the number of emails sent and the number of clicks.
At the same time, with MailChimp if I understand correctly, people can build
arbitrary emails so there is an infinite number of content that they can produce. So
can you walk us through how do you take this fairly complex product or product
that enables people to build complex applications and distill that to a number of
[inaudible] that you can track and also clearly communicate to others.
>> John Foreman: The whole metrics question is kind of funny because I work in a
company that is almost vehemently anti-metric. So for instance, we did a marketing
campaign recently where we took out a bunch of billboards in various cities across
the US and they just had the Freddy head, our little MailChimp logo, and no text on
them whatsoever. So the only people who even understood what these billboards
were were people who were already customers.
And there was no data gathering around how did you hear about us? Did you see
the billboard when you signed up? We did nothing. It was just to make our current
users a little happier. They would take pictures. They would tweet. And so I guess
that's kind of a metric. Did we see any pictures on Twitter of the billboard today?
When we took the one down in Chicago, a local paper wrote an article about how
sad someone across the street from the billboard was that the monkey was gone. It
was all facetious. It was like an Onion article but it was great.
So one of the ways that we handle metrics is we just blatantly ignore them. And
then we do a lot of other things. We do just a lot of customer interviews. So a lot of
what we would do is, rather than looking at what's our [inaudible], are we getting all
these weird metrics that you can design around -- you think they're somehow related
to the success of your company or revenue, but then you end up gaming them.
Then you end up saying I want to increase average revenue per user so I’m going to
do that by increasing wait times in support for users who don’t pay us as much and
making them pissed. Then they leave and all of a sudden my average has gone up
because I just got rid of a bunch of users.
So we just seem to ignore stuff like that, talk to our customers very closely, and don’t
track a lot of things like that. Talk to our customers closely, understand what they
want. If we keep hearing enough of the same thing, enough of the same stories, and
that’s a lot of what the UX team does. They do a lot of in-person video stuff. They
also do a lot of surveys.
And then we’ll build things off of those. So we’ll say okay, people really -- this is the
problem they’re having. They say they want this but we’ve videoed them, watched
what they do and they’re really bumping into stuff like this.
You can’t ignore that sort of basic insight that may not actually come from data. It
comes from watching people.
We take that and we build tools off of that. That’s a lot of the end run we do around
tracking website metrics. We just don’t.
Yes?
>>: You mentioned talking to users. How does your organization go about talking to
users and who is it that actually does it?
>> John Foreman: It is the UX team that does it. So we have a large team at
MailChimp. It’s the user experience team. The data science team works very closely
with them. Essentially the way it works is that they’re sort of a small yappy dog that
can sense threats real quickly and run around and claw at the glass screen door and
then we’re like a really fat slow dog that can do things that are really frightening but
it takes us forever to get started. So they’ll start yapping and wake us up and then
we’ll wander and be like all right, what are we building for y’all?
But they do a lot of in person customer interviews all around the world. Some of
those customers would be sort of bleeding edge using us in really creative ways.
Other customers would just be more of our standard customers. Maybe it’s a mom
and pop shop trying to sell something and they’ve realized a mailing list is really
useful.
And they’ll go out and actually do video of those. They’ll watch people use the
website. So they do a lot of that. They also do a lot of surveys and that’s how they
find people who are at least a little bit engaged, right? You filled out the survey, you
filled out the comment section with something like, I really wish you guys did this! So they would find
people that way too. So they do all of that.
And then actually we’ve started just recently sending one data scientist with them.
So the data science team at MailChimp we tend to hire former consultants. I don’t
know if that’s just because I like former consultants because I was one, but we want
people who just in general can talk to folks, so there's no fear of letting the data
scientist out of the closet. It is perfectly acceptable to send someone with a math
background to talk to a customer.
And so they’ll hang out as a fly on the wall, and then once the data comes up or
science comes up they’ll be like oh yeah, can I ask you some more questions about
AB testing, how do you use it, yada, yada, yada. And they’ll come back with ideas
like we should really try this!
And that’s really nice because if you wait for other people who don’t understand
what you do to translate it sometimes stuff gets lost in translation, right? I mean,
the best translation occurs inside of one brain.
So yeah, it’s a bit about that.
Yes?
>>: So you mentioned that [inaudible].
>> John Foreman: Well, the CEO helped prioritize that, for one. Another great
technique we use is we let people submit requests and we wait two days and we see
which ones survive. It’s amazing the number of requests people give you that the
next day people are like oh, I don’t need that anymore. Okay, good.
But we generally prioritize based on just having broader conversations with the
managers, find out what are the main goals for the year for the company. Okay, this
year we’re thinking about, I can’t tell you what they are because they’re secret, but
we’re thinking about these three things. Which of these products could helps us
achieve these goals and we tackle those first. And we never have enough time so it’s
never like we get to a point where we clean off the list. Some stuff just gets too old
and dies.
Yes?
>>: So when you started the email genome project and you had no data to start
with, how did you decide what to put into your database, everything? Or how did
you go about that?
>> John Foreman: Well, we started with particular products in mind we were going
to build first. And so we knew the first product we were going to build was this
abuse prediction model, right? And so we knew we wanted mailbox interaction data
in order to build that. I need to know about bounces, dead mailboxes, things like
that, spam bounces, all that stuff.
So that’s just going to be first. And so some of those data sets, that one is actually
huge, right? Because we send 10 billion emails every month, all those emails are
going to get some response back from the server saying yeah, we accepted it or in
the case of certain email products they might say yeah, we’ve accepted it and then
two days later they’ll send you a bounce like, haha, we didn’t actually accept it.
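A minimal sketch of how such mailbox interaction events, including a bounce that arrives days after the initial acceptance, might be recorded for later modeling; the field names here are hypothetical and are not MailChimp's actual schema:

from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class MailboxEvent:
    campaign_id: str
    recipient: str
    event_type: str                      # "accepted", "hard_bounce", "spam_complaint", ...
    occurred_at: datetime
    sent_at: Optional[datetime] = None   # lets a late bounce point back to the original send

events = [
    MailboxEvent("c42", "a@example.com", "accepted", datetime(2013, 12, 1, 9, 0)),
    # The same mailbox reports a bounce two days after claiming to accept the message.
    MailboxEvent("c42", "a@example.com", "hard_bounce", datetime(2013, 12, 3, 9, 0),
                 sent_at=datetime(2013, 12, 1, 9, 0)),
]

bounce_count = sum(1 for e in events if e.event_type.endswith("bounce"))
print(bounce_count)   # 1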
So we’ve got to grab all that. So we started with that. So for us it’s always been
about what’s the next thing we’re going to build, and let’s make sure that we start
six months out planning how that data is going to enter into the email genome project. Other people
tackle it in other ways.
I’m a little bit skeptical of people who are like, let’s just get everything first
and then figure out what we’re going to build. They spend years capturing data
and have no idea what they’re going to do with it till it’s all in there. Yes?
>>: [inaudible]
>> John Foreman: The one where we’ve looked a lot at accuracy would be the anti-abuse stuff, because we really cared that we’re only shutting down bad people. And
so for that one what I can say is that there is a portion of people, probably ten
percent of users that are grey area. We can tune the model so that we are extremely
accurate for all these abusers and all these good people. And the cool thing there is if
you are predicted good, verifiably good, we just never talk to you. You just sail
straight through.
But there is this portion in the middle that’s grey. So that would be one place where
we found that. And then the cool thing is this doesn’t end at a research paper
published somewhere saying this is how accurate we got. Instead, what we did is
figure out how to handle the grey people. And the way we do that is through
taste testing. So we essentially say hey, unfortunately this model has detected that
you could be good, you could be bad, here’s why. What we’re going to do is take
your list and send out a taste test first. So we’re going to send to this many email
addresses, wait one hour, and see what kind of feedback comes in. And now we’re
not in the realm of prediction anymore, right?
Now we’re in the realm of reality. If you pass then the rest of your campaign goes
out. So we’ve essentially mitigated some of our risk. It provides a slightly
diminished user experience to that grey area, but we found that it’s necessary. We
can’t just assume the grey area is bad, we can’t assume they’re good either. So that’s
sort of the middle ground. But yes, we do analyze how well we’re doing. That’s part
of what we do.
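A minimal sketch of the triage just described, assuming a hypothetical abuse score between 0 and 1; the thresholds, feedback cutoffs, and function names are all illustrative and do not come from MailChimp:

import time

ABUSE_HI = 0.9   # hypothetical cutoff: at or above this, treat the sender as an abuser
ABUSE_LO = 0.1   # hypothetical cutoff: at or below this, treat the sender as verifiably good

def handle_campaign(abuse_score, send_full, send_sample, get_feedback, block):
    if abuse_score <= ABUSE_LO:
        send_full()                       # predicted good: sail straight through
        return "sent"
    if abuse_score >= ABUSE_HI:
        block()                           # predicted abusive: shut it down
        return "blocked"

    # Grey area: send a small taste test, wait, and judge on real feedback
    # (bounces, spam complaints) instead of the prediction.
    send_sample()
    time.sleep(60 * 60)                   # wait roughly an hour
    feedback = get_feedback()             # e.g. {"bounce_rate": 0.02, "complaint_rate": 0.0}
    if feedback["bounce_rate"] < 0.05 and feedback["complaint_rate"] < 0.001:
        send_full()
        return "sent after taste test"
    block()
    return "blocked after taste test"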
Yes?
>>: We have time for one more.
>> John Foreman: Congratulations!
>>: So you mentioned that the [inaudible] can be answered by the data
science team or the UX team, or sometimes by both. So what is the process
for deciding whether a particular question is to be answered by the data science team
or the UX [inaudible]?
>> John Foreman: I mean, sometimes it’s a matter of priorities and whether we
think there are additional gains to be made beyond just doing the basic UX stuff. If
we can just do a few interviews, look at our survey data, maybe it’s a few SQL queries
that someone on the UX team could do, then yeah, just let them handle it. If it becomes a
bigger situation than that, well, there’s no formal process.
It’s a small company, right? I mean, 280 folks, half of those are in customer support,
so if you just look at people doing operational stuff it’s actually very small. We just
talk to each other and eventually the UX team will say hey, this is above our
mathematical pay grade and it’s very important too. If everyone says this is really
important and it needs to be worked on, then it comes to us: the data science team is quite small, but you
guys handle this one because we can’t crack it with our data.
And the finance team will do that too. They have databases they look at, but
sometimes they will be like, we really have this question about international users, it’s
beyond us, we have to go to the content. We need to make sure we understand the
language the content is written in, yada, yada, yada, and all of a sudden it becomes a data
science question and so it’ll just end up in our court.
And then if the CEO thinks we should do it we just do it regardless. That always
breaks a tie. All right. Thanks y’all. Thanks for coming out.
[applause]