Mister Beacon Episode #115

The Decision Makers Handbook to Data Science

December 16, 2020

Why is it that decision makers should care about data science?

The simplest answer could be that some of the biggest businesses are leaders in data science. Think of Netflix, Amazon, Google, Facebook. Their content is all curated by artificial intelligence. Tune in this week to hear a fascinating episode with Dr. Stylianos Kampakis about data science in our world today and to dive into his book “The Decision Makers Handbook to Data Science”. We double-click on all the buzzwords (artificial intelligence, machine learning, and deep learning), demystifying these subjects and providing the context that leaders need to understand them.


  • Narration 00:07

    The Mr. Beacon podcast is sponsored by Wiliot, scaling IoT with battery-free Bluetooth.

    Steve Statler 00:16

    Welcome, everyone, to the Mr. Beacon podcast. This week, we have Dr. Stylianos Kampakis. Probably, like many British people, I didn't do your name full justice. Thank you very much for joining us. We're here to talk about your book, which just came out in a second edition, 'The Decision Makers Guide to Data Science'. Thanks very much for joining us to talk about it.

    Dr. Stylianos Kampakis 00:46

    Yeah, happy to be here.

    Steve Statler 00:48

    Excellent. Well, why is it that decision makers should care about data science?

    Dr. Stylianos Kampakis 00:59

    Well, I think data science is an important topic because it plays a larger and larger role in our lives. I think this is clear. If you look at graphs of the investment in AI, whether it's startups or funds spent by governments, you'll see in many different ways that this is a trend that just keeps going up and up. I think in 2018, AI startups received about $30 billion in funding, and this is going higher and higher. We're also really witnessing an AI arms race between the strongest countries in the world. And obviously this doesn't leave any industry unaffected. It doesn't leave any company unaffected, and it doesn't matter whether we're talking about small companies or big companies. It's there. It's like social media, right? When people were first exposed to social media, you had some early adopters, and now every company needs to be on social media, because there are so many people there. It's like this with AI: it can provide an unfair advantage to any company that's using it the right way. So sooner or later, you have to care.

    Steve Statler 02:25

    Yeah, I was talking to my boss, the CEO, about the fact that you were coming on the show. He's based in Israel, an amazing entrepreneur who has done an amazing job of many things, including getting funding for the company from some great investors. And he pointed out that in Israel, which is this incredible powerhouse in terms of startups, data science is the hot topic. It's one of the most sought-after skills. But can you point to companies where data science has been a significant factor in their success?

    Dr. Stylianos Kampakis 03:21

    Among the world leaders out there, there are countless examples, right? One of the earliest examples in the United Kingdom, I believe, was Tesco. When they started using analytics, all the other supermarket chains had to do so as well, because Tesco was so successful in using analytics. This was in the early days, so club cards and all that stuff, which we now take as a given, right? If you think about sports, the most famous example is the whole approach of Moneyball. So that's another one. Yeah, exactly. Great movie, totally, and great actors as well. So these are some, let's say, prominent examples. But then it really comes down to the industry. I guess the examples that people will resonate with most closely would be things like Netflix and Amazon. Every time you use a service like these, you see a recommender system, right? It's also Google and Facebook: everything in there has been chosen by AI. It's like all the content is curated by AI. Then again, there are cases that might not be as flashy. AI and data science can improve efficiency in many ways, but it's not always as impressive as, say, seeing a robot. Robots are pretty impressive. But AI is a bit like, let's say, operating systems. It's super useful, but it's only cool if you're a geek, really, or a data scientist. Most people just see the end result. Over the years, we take it as a given that Amazon's delivery works really well, and we don't think about what's behind it, I suppose: the whole supply chain and robotic process automation.

    Steve Statler 05:17

    Yeah, I mean, it sort of reminds me of Life of Brian: you know, what have the Romans done for us? And I think, well, what has data science done for companies? And then you look at it and say, well, apart from Google and Facebook and Netflix. I'm just trying to think about Apple. Is data science critical to Apple? I guess they have Siri, but that's probably not the center of their success. Would you say that data science has been helpful to them?

    Dr. Stylianos Kampakis 05:48

    Well, Apple, yeah, Siri. I know it is investing heavily in AI, but it's behind in the AI arms race, because it has this mentality that everything needs to be closed. Everyone else has realized that this is not how it works, because trying to replicate human intelligence is such a difficult problem. I think it has only recently started publishing some research. It's very secretive. If you compare it with Google and Microsoft, they're, you know, like two years ahead of everyone else. Amazon is also very strong.

    Steve Statler 06:23

    I mean, the platform providers are fueling it themselves, because they're providing these machine learning algorithms that you can use. I think Apple certainly has been pushing machine learning, designing it into their chips and so forth. But maybe we should take a step back and just define: what is data science? We probably should have done that before.

    Dr. Stylianos Kampakis 06:48

    Yeah, that's a great question. Sometimes, when I give presentations or run workshops, that's a question I like to ask my audience: what is data science? And it's so interesting: everyone talks about it, but not many people can define it. That's the thing with many difficult concepts, right? I think you see this a lot across the media. Sometimes there are concepts and words created in science, and then society uses them without really understanding them. I'd say that if we look 60 or 70 years into the past, physics was probably the most popular science, and its terms were abused. They are still abused: energy, for example. In physics, energy is quite well defined, but people talk about this energy, that energy, and whatever. Now computer science is the most popular science, and people abuse all kinds of its concepts. So the thing with data science is that it gets confused with, let's say, AI and machine learning. Data science is just an umbrella term, which is used to bring together many different fields that have developed over the last 200 years with similar goals, either to analyze data or to replicate human intelligence. Then people started using this term to unify all these efforts and also to make it easier to sell services. AI is actually the more academic term, because this was the original discipline, from 1956. Carnegie Mellon is where it was born, and Dartmouth College as well, and IBM. These were the three main players. Machine learning was basically one approach to AI, which has been popular since the 90s. It's by far the most popular approach, and deep learning is a subset of AI. But the term data science helps you bring everything together.

    I guess, as the industry becomes more familiar with data science, you'll hear the word AI being used more and more, simply because it's just so much cooler. Really, one will have to accept as a fact of life that this is how the world works.

    Steve Statler 09:14

    But what is your shortest definition of data science? If you were in an elevator where the cables have been cut and it's plummeting down the shaft, how would you define it before you hit the ground?

    Dr. Stylianos Kampakis 09:33

    I think I won't have enough time to answer this question. But the way I define it, at least what I say in my workshops, is simply: use data to do useful stuff. Let's just say that, because if you break it down and look at the constituent parts and all the different fields, that's basically the common thread.

    Steve Statler 09:51

    I love it. That is a fantastic definition. I would die feeling happy that I learned something today.

    Dr. Stylianos Kampakis 10:02

    In your very last moment.

    Steve Statler 10:05

    So it encompasses all those things that you mentioned. Artificial intelligence is like the Russian doll: you open up artificial intelligence, and it seems like machine learning is a subset of that, and deep learning is a subset of that. Then we have statistics as well, which I guess sits alongside it. That sort of orientates us. Back to my establishing set of questions: 'The Decision Makers Handbook to Data Science' is aimed at decision makers. Who do you think of as decision makers? Is it just CEOs?

    Dr. Stylianos Kampakis 10:49

    Decision makers can be anyone from a solo entrepreneur to a manager within a bigger organization. This could be upper management, and it can also be someone who's influential within a group within an organization. I've worked with startups and scaleups, and I've also worked with, let's say, people who were in an established team and were planning to expand, and they were just looking to understand a bit better how they could transition from simple analytics to AI and machine learning. So in this case it's the broad definition of a decision maker. Obviously you could also talk about politics, etc., but my book is not so much for the public sector; it's more for companies. So there are many, many decision makers, especially nowadays in companies, where we're supposed to be empowering people and the structure gets flatter and flatter. And hopefully almost all of us are decision makers in some way, in a company that really empowers its workers.

    Steve Statler 11:57

    I was kind of jealous when I looked at it. We have the same publisher, Apress, and my book, 'Beacon Technologies: The Hitchhiker's Guide to the Beacosystem' (I've got to plug my book), is a lot less readable than your book, because your book is approximately 150 pages, and mine started off as 700. Then they used a bigger book format, and that got it down to 500. I want to compliment you, because your book is very, very readable, and it really pulls you into this. One of the questions I had was: what changed between the first edition and the second edition that has just come out?

    Dr. Stylianos Kampakis 12:43

    Oh, there are a few more use cases and a few more tools at the end of the book, which help decision makers implement data science. I wanted to make the book very readable and not long, on purpose. If you're a decision maker, you're looking for information, or for something to read while you're on holiday. It's for the same reason that every time I was running workshops, I was trying to be efficient: rather than, like, a two-week session, it was a one-day session. My next book is going to be longer. It's about 100,000 words. But that's more for the general public, for someone who just wants to read a book to learn more about data science and also pass the time.

    Steve Statler 13:29

    Yeah, it struck me that you make it very practical. You bring it back to the nuts and bolts of businesses. You do consulting as well. How much did your consulting inform the book?

    Dr. Stylianos Kampakis 13:48

    Yeah, exactly. I'm also running a company that helps businesses and individuals design the right data strategy and handle all the soft parts of data science, like culture, hiring, etc. And this book was really based on the work I have done with the Tesseract Academy over the last two or three years or so. So it was based on my workshops and the conversations I had. It's the best parts, the most useful parts, for decision makers. I also consult, but yeah.

    Steve Statler 14:31

    It's interesting that it came from that. My book came from the same place: I was consulting and training people, and I thought, I really need to write this stuff down, and then it just ballooned. But you've kept it to a point where people can really absorb it and get a lot out of it. One of the pet peeves I have, and maybe it's because I grew up in England, is that whenever something's successful, I immediately look at it with suspicion. And data science has been very successful. One of the responses that I hear from entrepreneurs is: yes, we're going to make money on the data, we've got to accumulate as much as we can, it's all about the data. Which is fine, I guess. But it's also kind of frustrating, because there seems to be a lack of ability to answer the second question, which is: well, what is this going to be used for? Do you see the same thing, people just jumping on the data bandwagon? And maybe this is a better question: how do people get in trouble when they try to pursue data as a key part of their entrepreneurial business strategy?

    Dr. Stylianos Kampakis 16:02

    I mean, there are many ways someone could get in trouble when they try to adopt data science. One of the ways this happens is to not have any data strategy in place: just accumulating data and thinking, oh, I'm going to deal with this later, once I get a data scientist to do it for me. This is where things usually go wrong, when you realize that the data is not in the right format, the right variables aren't there, etc. As for data science itself, it's good to be skeptical, but I'd say data science has had the chance to prove itself, because statistics as a discipline has been around for 200 years or so, and then we have AI and machine learning, and now it's being used in so many different things. So it's not like data science came out of the blue. I think it has had a long time to prove itself. There are still some people who are skeptics, even in the industry. But I'd say it's different from cryptocurrencies, which have only had the chance to prove themselves very recently.

    Steve Statler 17:10

    Yeah, I'm definitely not arguing that there's nothing there. The examples you cited at the beginning are proof that there is value. The thing that I object to is the adoption of it without realistic, well-aligned goals that have been thought through: "it's all about the data". I kind of feel like blockchain is a little bit like that. It gives technology a bad name when people don't do it well and there are failures because of that. Maybe the answer is just to buy your book. One of the really interesting sections, I thought, is where you talk about how to lie with statistics. Maybe I'm not capturing that well, but you had a great example showing a correlation, and maybe you can remind me: there's a film star and their habits. Help me out here.

    Dr. Stylianos Kampakis 18:15

    Yeah, there are many ways to lie with data. The problem with, let's say, data science, the problem with statistics, is that many methods from these disciplines can be used to make up stories, because data is always interpreted or communicated, and there's some degree of arbitrariness, not in every case, but in many cases. I think you see this a lot in economics, in finance, in politics, in election polling. There are so many ways to do this, right? I'd say graphs are the most obvious way. I think graphs lie all the time. It's just the way things are. But you also see more subtle ways of doing this. In research, for example, sometimes you see tests that are maybe being used the wrong way, this sort of thing. So sometimes lying with statistics can be purely intentional, as you might see in politics a lot. But sometimes it's unintentional, as you might see in research, for example in medicine or biology, where there might be some subtle issues with the methods used. And it's very difficult to detect this unless someone is really looking for it. It's good to be a bit cautious about every new technology or new shiny thing. It's good to be optimistic, but one of the worst mistakes you can make with technology is to believe that you're immune to error because you're using some kind of new fancy technology. Data science can often give someone a false perception of being invulnerable, of it being impossible to make a mistake, because you're using this new fancy method. So that's one extreme. The other extreme is when you see some decision makers thinking that only human judgment will do the work: go with your gut, that sort of thing. I think both extremes are not very good.

    Steve Statler 20:27

    I found it in your book, so I'm going to help myself out here. What you do is correlate the number of people who drowned by falling into swimming pools with the number of films that Nicolas Cage has appeared in. And you show the graph, which shows a real correlation between Nicolas Cage films and drownings in swimming pools, which I thought was very funny. Obviously, you're not actually saying that Nicolas Cage is killing people. That's just an illustration.

    Dr. Stylianos Kampakis 21:02

    There is a website dedicated to this that finds funny correlations and presents them. For example, how much battery is consumed in, I don't know, Arizona correlates with how many people die in skiing accidents in Switzerland. This is funny, but did you realize that this sort of thing might have actually caused the financial crisis? I mean, it wasn't the only reason. But for example, when you work in finance and you're doing portfolio optimization, or you're trying to find time series which are correlated or uncorrelated, you might come up with all kinds of weird relationships, because if you have thousands and thousands of time series, eventually you're going to stumble upon some patterns which are purely accidental, right? It's just like looking at the clouds and believing you see some kind of form. So this can actually happen. There was this argument by some statisticians saying that this partly caused the crisis, because you had people who were using these models and detecting all kinds of correlations between time series which they said were there, and eventually this made the models weak. Obviously, it's a bit more complicated, because there were also many other factors in play around subprime mortgages, etc. But it just shows you that when you have domain knowledge about a problem, it's clear that suggesting that Nicolas Cage and suicides or whatever are connected is an insane proposition. But if you talk about physical phenomena or financial time series, it is very difficult to have this kind of insight, to know whether these correlations really exist. And medicine is a prime example.

    I think many people can relate to this: in medicine, and also in the things around medicine, like supplements, health and fitness, people have all these anecdotal stories, all these fads and all these weird correlations, and it's very difficult to find out what's true and what's not. Nutrition is a field I like very much, and I think it's very important, but there are many issues with the methodology followed there.
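The point about stumbling on accidental patterns among thousands of series can be checked with a small simulation. The sketch below is only an illustration: the series are synthetic random walks, not financial data, and the function names and parameters (`random_walk`, `strongest_accidental_pair`, 100 series of 50 steps) are assumptions made for the example. Every series is generated independently, so any strong correlation it finds is accidental by construction.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def random_walk(rng, length):
    """A series built from independent Gaussian steps (a toy 'price' path)."""
    level, path = 0.0, []
    for _ in range(length):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def strongest_accidental_pair(n_series=100, length=50, seed=0):
    """Generate unrelated series and return the most strongly correlated pair."""
    rng = random.Random(seed)
    series = [random_walk(rng, length) for _ in range(n_series)]
    best_pair, best_r = None, 0.0
    # Compare every pair; with many pairs, some will look related by chance.
    for i, j in combinations(range(n_series), 2):
        r = pearson(series[i], series[j])
        if abs(r) > abs(best_r):
            best_pair, best_r = (i, j), r
    return best_pair, best_r


if __name__ == "__main__":
    pair, r = strongest_accidental_pair()
    print(f"series {pair[0]} and {pair[1]} correlate at r = {r:.2f}")
```

With only 100 series there are already 4,950 pairs to compare, which is why searching a large universe of time series for correlated pairs needs domain knowledge or a correction for the number of comparisons made.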

    Steve Statler 23:28

    How do you prove the negative? It's easy to put up the graph of suicides by hanging, strangulation and suffocation against US spending on science, space and technology. I'm sure at some point there has been a congressional hearing where someone has tried to do that. But other than the fact that it's ludicrous, is there any way that you can scientifically prove that they're not related? Or is it just a matter of challenging it and saying, well, show me why they're related?

    Dr. Stylianos Kampakis 24:04

    It's not easy. I mean, it's impossible, or at least very difficult. You can't disconnect data science from domain knowledge, right? You do need to have domain knowledge when you are working with data science. You can't really automate everything and say, I'm just going to throw numbers into a machine and something will come out of it, because data science and statistics deal with models, and models are simplifications of the world, right? When we started talking, I mentioned that I realized during my PhD days, when I was working with Tottenham Hotspur, that being a data scientist quite often is much more than just handling data. You have to handle the culture. You have to have knowledge about the organization you're working for, etc. And you can't really separate these two elements, unless, of course, you're in some domain where, let's say, the problem stays stable. For example, in computer vision, you know that animals and humans more or less stay the same over the millennia. But if you're talking about societies and anything where you have multiple moving pieces, then it's just tougher. Then again, there are various techniques, statistical significance, etc., to test whether correlations exist or not, but then it gets very technical if you need to resort to super complicated arguments. And there's something that common sense says. Quite often, data follows common sense or domain knowledge. So if you have data telling you that something which seems totally crazy is happening, then you need to re-examine your assumptions, and then maybe find other ways to study it.
    I mean, one of the most common examples of this was this paradox, I don't remember the name, with mothers that are smoking and their babies. The data says that for mothers that are smoking, there is a smaller probability of their babies dying, I think in the first year or something like this, from a serious disease. A probabilistic model explains why this happens. Essentially, you have this model where you have babies that have some kind of disease, and you have some mothers that are smoking. If a mother is smoking, it's unlikely that the baby also has the disease. It was related to low birth weight. So, for example, if a baby is born with a low birth weight and the mother is smoking, then it's less likely that the baby also has the disease. It's a bit of a complicated argument, but if you look at how the model breaks down, then it kind of makes sense. But when you see this in real life, it looks as if smoking protects babies from dying from some serious diseases. So this is what I was trying to get to. It was really a side effect of how everything is organized.

    Steve Statler

    Got it. So you talk about data management in the book. I think it's worth explaining to people what data management is. Could you do that?

    Dr. Stylianos Kampakis

    Sure. So data management is about creating a strategy, then following up with the strategy, iterating upon it and improving it, as to how you can collect data, store data and manipulate data in order to extract value for your business.

    And I'm talking about data management from a decision maker's perspective. I'm not thinking about whether you should use MongoDB or some other database, because in my book I want to take a decision maker's perspective, and technologies change all the time, right? You have new databases, new trends and new versions, but the sound principles of data management do not really change. And the principles of data management are similar to principles in other areas of business. That's why I focus a lot on use cases and checklists and things that someone needs to keep an eye on, like data quality, or whether you're collecting data with some particular use case in mind which can guide you. Essentially, data management is a lot about avoiding the situation we talked about earlier, which is to collect data with no discipline and figure it out along the way, which is a bit of a recipe for disaster. It's just like going to the gym and saying, I'm going to work out, and I'll just do random stuff and eat random stuff, and I'll figure it out along the way. And then you wonder why you're not losing weight.
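The smoking and low-birth-weight paradox discussed above can be reproduced with a small probabilistic model. This is a sketch under made-up assumptions, not the model from the book and not real medical data: every probability below is hypothetical, and `p_low_weight`, `p_death` and `mortality` are names invented for the example. Low birth weight can come from smoking or from a rare serious condition, and smoking is always harmful in this model, yet among low-birth-weight babies the smokers' babies show lower mortality.

```python
# All numbers are hypothetical and chosen only to illustrate the effect.
P_CONDITION = 0.02  # share of babies with a rare, serious condition


def p_low_weight(smokes, condition):
    """Chance of low birth weight: a small baseline, raised by either cause."""
    return 1 - (1 - 0.03) * (1 - 0.30 * smokes) * (1 - 0.80 * condition)


def p_death(smokes, condition):
    """Chance of death: smoking adds a little risk, the condition adds a lot."""
    return 0.005 + 0.005 * smokes + 0.30 * condition


def mortality(smokes, low_weight_only):
    """Mortality for one smoking group, optionally among low-weight babies only."""
    total = deaths = 0.0
    for condition in (0, 1):
        weight = P_CONDITION if condition else 1 - P_CONDITION
        if low_weight_only:
            # Conditioning on low weight re-weights the two explanations for it.
            weight *= p_low_weight(smokes, condition)
        total += weight
        deaths += weight * p_death(smokes, condition)
    return deaths / total


if __name__ == "__main__":
    for low_weight_only in (False, True):
        group = "low birth weight only" if low_weight_only else "all babies"
        smokers = mortality(1, low_weight_only)
        non_smokers = mortality(0, low_weight_only)
        print(f"{group}: smokers {smokers:.3f}, non-smokers {non_smokers:.3f}")
```

In this toy model a low-birth-weight baby of a smoker most likely owes the low weight to smoking, while a low-birth-weight baby of a non-smoker is much more likely to have the serious condition. Restricting the comparison to low-birth-weight babies therefore reverses the direction of the overall effect, even though smoking never lowers any baby's risk.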

    Steve Statler 28:56

    I also want to talk a bit about data quality. Those of us who have spent time on the data warehousing and analytics side know that dirty data is a very challenging problem. I think there are a lot of tools that have been put in place to clean it up. But how do you keep data clean? Is there a formula? If a decision maker wants to keep their data tidy, how can they do that?

    Dr. Stylianos Kampakis 29:29

    For sure. I mean, there are a few different things someone can do. First of all, try to document the approach: document where the data is coming from, what the variables represent, etc. This doesn't happen that often. Second, data needs to be centralized in some manner, so you don't have different people and different groups holding different data sets. That's very important. And finally, it's important to have some kind of standard, and it's great to associate this with some kind of implementation. For example, if you know that you want this data to build, let's say, a recommender system, then you've got to start thinking about what the variables should look like, what variables you need, etc. If you're just storing data in an arbitrary way, you'll end up in a situation where you have to spend a significant amount of time fixing the problems with this data. I think these are the main points about data quality.

    Steve Statler

    Very good. I thought it was interesting that you look at this subject also from an organizational perspective, in terms of how to hire and keep good data scientists. Can you say a little bit about that?

    Dr. Stylianos Kampakis

    Yeah. So I think one key thing with data science is that it's still a high-demand skill set, pretty much like all tech work, right? And you want to make sure that you lay the foundation before you hire someone, so that you can also keep them happy. By laying the foundation, I mean that if you've taken the steps to build the right data strategy, you have the right culture, and you have some interesting problems to solve, a data scientist will find this environment much more stimulating and rewarding than if you just have a bunch of data lying around and this person has to spend six months fixing it. So that's one. And I'd say this is very important; it may be the most important factor compared to other tech workers. A developer might be called upon to build something from the ground up, but data scientists don't really enjoy, for example, data manipulation. That's why there have been some horror stories of companies hiring people, and two months later these people resigned because they had nothing to do. There was no data. And then obviously there are other things which are pretty standard in tech. Now we see a lot of flexible working, which I think COVID will help with; even post-COVID, I think this is here to stay. I guess these are the basics; beyond that, it really depends on the person. But if I could give one piece of advice to a decision maker, be it a company owner or a manager, it would be: lay the foundation first. Don't just hire some very smart, motivated, highly paid professionals and expect them to do all the difficult and boring work because you didn't take care of it earlier on. It's very easy for them to get demotivated and just find a better job.

    Steve Statler

    Yeah. So a key competency is actually finding and keeping good data scientists, and I think reading your book is going to be helpful in doing that. Well, I know you're super busy, and we're coming up to the top of the hour. Thank you very much for spending time with us. One bit of data science that is really well developed is the art of search engines. So what people need to do is search for "decision makers handbook data science", and they're going to find you and this book. So congrats, well done, and thank you.

    Dr. Stylianos Kampakis

    Thank you. Yeah, I was very happy to be here.

    Steve Statler 33:38

    I was fascinated by your PhD thesis on soccer injuries. I have to say soccer because I'm in the United States; if I say football, everyone's going to get the wrong impression. So you worked with Tottenham Hotspur on soccer injuries. How did you end up with that brilliant idea? And how did you persuade them to work with you?

    Dr. Stylianos Kampakis

    So it was quite interesting, because at that point in time there was some awareness in the world of sports of what data science is and what it can do. And there was a scholarship to study this very thing. I guess the team wanted to do this through a scholarship, through a university, because they didn't really know where to start. So it was like a research project at that time, because this is what PhDs are. It was interesting, because sports at that time were 10 to 15 years behind in terms of their culture compared to other sectors, which were way more mature in data science.

    Dr. Stylianos Kampakis 34:50

    Yeah, it was interesting, partly because I got to work with many different stakeholders. That time made me realize how important the culture is in an organization. There were many things, but the most important, I guess, was the cultural change that was required to take place within the team. The reason is that, as a PhD student, you have to publish some innovative research, you have to do new things, and blah, blah, blah. But at the same time, when you work with a sponsor, you're supposed to keep the sponsor happy. And I realized that, at least at the beginning, it wasn't so much about using sophisticated methods; it was more about setting up the foundation, the groundwork for what was to come. My PhD years actually inspired some of my later work, which has been around data strategy, data management and explaining data science to decision makers. Because if you want to use data science, you need to start from the basics, and it's not always the algorithms.

    Steve Statler 36:02

    Yeah, I noticed in your book that you leverage some of that experience. You talk about the conflict of interest between the different stakeholders: you have the coaches who want to get the players back on the field, and you have the doctors who are trying to make sure that the players get well. How does that play out? What challenges does that misalignment present for the club and for your work as a data scientist?

    Dr. Stylianos Kampakis 36:44

    Yeah, so that's a very interesting question. One of the challenges was that we had to spend lots of time understanding what the data meant in each case. In my work as a data scientist, I realized what data science was really about: it was more than just getting a data set and then working with it. It was much more complicated, in a sense, because you had to factor in the human element and the politics. And I quickly found out that these problems repeat themselves in other sectors, maybe not in the same context, because sports is competitive by definition. But you do see these kinds of issues, like politics and misconceptions around data. So this made me realize early on that, in the real world, data science is not necessarily about data.

    But how does that conflict of interest between the coach and the medical practitioners manifest itself in the data that you were looking at?

    Yeah, so the medical practitioner had an incentive to make the injuries look more severe than they were when they were being recorded, and the coach was trying to do the opposite and put players back into play. That means the suggested recovery times and the actual recovery times were not pure, in a sense, you know what I mean? If you try to train a model on data that's been labeled by human experts, and the experts didn't really use their true judgment, then you can't be sure what exactly you're modeling. In this scenario, where we're talking about medicine, doctors and physiology, there's not going to be 100% agreement even between experts. The ideal scenario would be to have three, four or five experts each give an estimate, their opinion, and then average these. But here we had a very different dynamic.
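Editor's note: the idea of averaging several expert opinions to get a less biased label can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used in the thesis; the function name `consensus_recovery_days` and the estimates are hypothetical.

```python
from statistics import mean

def consensus_recovery_days(estimates):
    """Average several independent expert estimates of recovery time (days)."""
    if not estimates:
        raise ValueError("need at least one expert estimate")
    return mean(estimates)

# Hypothetical estimates (in days) from four independent experts for one injury.
expert_estimates = [21, 28, 24, 27]

# The averaged value would be used as the training label instead of one
# person's (possibly incentive-driven) estimate.
label = consensus_recovery_days(expert_estimates)
print(label)
```

With a single labeler who has an incentive to overstate or understate severity, there is nothing to average against, which is the different dynamic described above.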

    Steve Statler 39:02

    Yeah, I get it. Your thesis is published online, but I'd be fascinated to get a summary from you. Can you reliably predict sports injuries using data science? What were the lessons learned? And how do you even do it? It seems like an almost impossible task.

    Dr. Stylianos Kampakis 39:29

    Yeah, it looks like, with the right kinds of data, you can predict overuse injuries to some extent. First, overuse injuries follow certain patterns. Second, as you record data, overuse can manifest itself in many ways. But even if you only record, let's say, training sessions, you can still get some predictive power out of the relationship between the volume of the sessions over time and the probability of injury. So it's 100% doable. But if we're talking about more acute injuries, that's obviously nearly impossible. So we focused on overuse injuries. It's a common problem in professional sport: athletes push themselves too much.

    Yeah. And I think you used GPS. Is that true?

    Yeah, it was a unit that I'm not sure is still in existence. It was called Viper. I think it was acquired by another company. Or is it still around? I'm not 100% sure, because at that point in time there was an explosion of devices for sport. I remember it was a very big trend, and then I think a few of these merged. The idea is that you wear a device around your chest, and it can record many different metrics using GPS. And yeah, I don't think Viper GPS for sports exists anymore.

    I'm sure they're putting sensors all over the place on athletes. But it seems to me one of the takeaways, which is kind of obvious when you think about it, is that people get injured when they're tired, and they get tired when they're moving around. So is there some kind of correlation between length of play, exertion and injuries?

    Oh, yeah, absolutely. Of course. I mean, that's common sense, so you expect to find this. Something else that's interesting is that some of the best players just don't get injured that much. Many people don't realize that about sports. The people who don't get injured much can train longer, train harder, participate in more games and get more experience. So sometimes they're good simply because they don't get injured. Maybe part of being talented is also being very resilient.

    That is fascinating. Malcolm Gladwell, in one of his books, I don't know whether you've read this one, talks about the correlation between the month that athletes are born and their success. It turns out that if you're older when you go to school, well, the difference between an eight-year-old and a nine-year-old, or a nine-year-old and a ten-year-old, is really significant. So they tend to get picked more, so they get more experience, and so they get better. It's this kind of virtuous cycle: you get your 10,000 hours and you become an expert because of that. But what you're saying, I think, is equally fascinating, which is that a key competence for athletes is avoiding getting injured. Obviously no one wants to be injured, but it's also not worth being brave, because you get less playing time. Amazing.

    Exactly.
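Editor's note: the relationship mentioned in this exchange, between the volume of training sessions over time and the probability of injury, can be illustrated with a small Python sketch. It assumes only daily training minutes and a daily injury flag are recorded. The data, the 3-day window and the threshold are all made up for illustration, and this is not the model from the thesis.

```python
def rolling_load(daily_minutes, window=7):
    """Sum of training minutes over the trailing `window` days (inclusive)."""
    loads = []
    for i in range(len(daily_minutes)):
        start = max(0, i - window + 1)
        loads.append(sum(daily_minutes[start:i + 1]))
    return loads

def injury_rate_by_load(loads, injured, threshold):
    """Return (rate on low-load days, rate on high-load days)."""
    high = [inj for load, inj in zip(loads, injured) if load > threshold]
    low = [inj for load, inj in zip(loads, injured) if load <= threshold]

    def rate(flags):
        return sum(flags) / len(flags) if flags else 0.0

    return rate(low), rate(high)

# Hypothetical data: minutes trained per day and whether an injury was recorded.
daily_minutes = [60, 60, 0, 90, 90, 120, 120, 120, 0, 60]
injured = [0, 0, 0, 0, 0, 0, 1, 1, 0, 0]

loads = rolling_load(daily_minutes, window=3)
low_rate, high_rate = injury_rate_by_load(loads, injured, threshold=250)
print(low_rate, high_rate)
```

A real analysis would need far more data and a proper statistical model; the sketch only shows how a volume-over-time feature can be derived from session records and compared against injury outcomes.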

    Steve Statler 43:09

    Right. In this segment of the show, where we actually talk to the experts, I always find it fascinating. I end up talking to CEOs and leaders in companies, and it's really interesting to hear a bit about them personally. Our way into that is talking about music, and the three songs you would take on a trip to Mars. I don't know if you've had a chance to think about that. But first of all, is music important to you? Is it a key part of your life?

    Dr. Stylianos Kampakis 43:39

    All the time. I really like music. I listen to music when I work as well. And I also play the piano, and I used to compose music as a hobby, but I don't have much time these days.

    Steve Statler 43:55

    So did you play classical, or was it more contemporary stuff?

    Dr. Stylianos Kampakis 44:06

    Classical and jazz piano.

    Steve Statler 44:07

    And so I've given you this really challenging task, which is choosing three songs for a very long trip to Mars. What would those three songs be, if you had to choose just three?

    Dr. Stylianos Kampakis 44:21

    Yeah, that was a tough question, and I didn't know what to choose. Probably if I give you an answer today, tomorrow I'd give you a different one. But I would probably choose Beethoven's Ninth Symphony, which doesn't classify as a song. Then maybe I would choose something by The Doors, probably something like "Riders on the Storm". And I'm trying to think of an electronic track, because I listen to a lot of electronic music when I work, so it could be pretty much anything. Let's say something which is not exactly electronic but instrumental, by an interesting band: "Kong" by Bonobo. It's sort of my favorite song to listen to when I wake up. Do you know the song?

    Steve Statler 45:22

    I don't know it, no. I know The Doors song, obviously. But why did you choose that one?

    Dr. Stylianos Kampakis 45:30

    Yeah, that's a great song to listen to when you wake up. It's literally like coffee.

    Steve Statler 45:36

    Okay. So with The Doors, I was just wondering if there was some correlation with a time in your life, or an incident, or anything.

    Dr. Stylianos Kampakis 45:45

    Not necessarily. I think it's one of those bands which is very special; people can find something in The Doors one way or another, like it's a part of their lives. It's one of those bands which I like to revisit every now and then. Maybe I don't listen to The Doors for a year, and then some day I listen only to The Doors for two weeks, for some reason. I can't really explain it. I think Nirvana is also like that, that whole scene. It just doesn't exist anymore, but many of the songs from that scene are classics.

    Steve Statler 46:18

    Yeah, I got into The Doors after Apocalypse Now. I think it was "The End", and it's amazing. Very good, very good. Well, thanks for sharing a bit about your life and a bit about music.