0:30
Hey, good morning, good afternoon, and good evening, everyone, and welcome again to the next session of AI42
0:38
And before we go into details, let me welcome my partner, Håkon. Hi, Håkon
0:43
Hello, Eve. How are you? I'm good. And hi, Heini, our beautiful presenter. Hi, how are you, Heini?
0:52
I'm doing very well, thanks. It's going well. Awesome. That's good to hear
0:56
And today we are going to show you again something awesome from the mathematics field,
1:04
which is going to be probability. And this is another one of my favorite parts of the
1:11
mathematics field, because it gets very, very close to AI and data science.
1:19
Heini, would you like to share some words about this? Exactly, so we are going to be talking about
1:25
probabilities today. And as you kind of stated, it is kind of taking us towards machine learning and being able to
1:32
predict something. So it's kind of that bridge from just having data and then going towards making really kind of new
1:40
information out of it. So it's going to be very exciting today
1:45
I'm excited for this session. And I can't believe it's already the last one
1:50
Yes, and in some ways it's sad. But in another way, it is
1:54
really cool that we have learned so much already. So that's nice
1:58
And next it's the step of really starting to implement it and use everything that has been
2:04
learned. Yeah, that's very true. But before we get there, let us give you a quick intro to AI42
2:24
Hi again. Hello. So let's say a few words about AI42. Would you like to start it, please?
2:36
Yes. So the idea behind why we started AI42 is that we couldn't really find one good
2:43
resource that covers everything from starting out with the basics over to actually doing
2:48
some real implementation with machine learning. So then Eve and I started
2:54
up this AI42 initiative. What we do here is stream two times a month on
3:00
Wednesdays, and we will invite recognized speakers and experts that will teach you about
3:06
different topics. So we started out with mathematics and probability theory, and then
3:13
we will move more into different languages like R and Python
3:18
and SQL, for example. And then we go more into actual machine learning and data science
3:24
And in addition to this more theoretical part, we also will have some practical workshop sessions
3:32
Yes. And we would like to do this by inviting a lot of experts
3:40
basically, in the different fields so they can also give you a lot of knowledge
3:45
and you can extend your network as well with the help of these people in the field of data science and AI
3:55
Yes. And we are also visible here on social media, so you can follow us on Instagram or Twitter or Facebook
4:06
And if you scan our QR code, you can get to both our YouTube page
4:10
where you can find all our previous sessions. And we will announce our next sessions on our meetup page
4:17
which you can also see on this page here. Yes. And we want to say a big, big, big thank you to our sponsors
4:27
which are Microsoft and Myles. We also want to say thank you to all the contributors
4:34
who helped us make this happen. And you can also learn some more if you join us now
4:42
and follow us on Twitter, Facebook, and Instagram. So I'm very happy that you guys are here
4:50
How about getting started with some probability then? Yes, let's go back to Heini
5:06
Hi, Heini. Welcome back. Hello. So this session is going to be about probabilities
5:17
And would you like to give us a quick intro about what are we going to cover today
5:23
So really today's goal is to understand probabilities and give some idea of the different ways to calculate those
5:30
and then at the very end I'm going to give a little
5:35
kind of an overview of how all these different things we've been covering
5:39
fit into doing certain things in machine learning. That sounds really interesting
5:45
Okay, do you have anything else? Yeah, go ahead, sorry. I just wanted to say to the audience
5:51
here that before we start, if you have any questions, just feel free to post them
5:55
in the chat and then we will have a Q&A after the session is finished
6:00
Yes, sounds good. So Heini, are you ready to take it away? Yes, I am very ready. Let's go
6:10
Good evening, everyone, and welcome also on my behalf to our fifth
6:29
mathematics session, which will be around probabilities. And before we get too deep
6:36
into probabilities, I just want to kind of set the stage again and kind of give you an overview of
6:41
what we have been covering before and how we're now ending up into the area of probabilities
6:49
So really, in our previous sessions, we've been looking at this problem of having some kind
6:56
of data set. And of course, in the beginning, we just want to find some kind of information about
7:03
the data that we have. Last time, we looked very extensively into statistics and found all those
7:08
different methods associated with understanding our data better in terms of looking at statistics
7:16
But then in the previous sessions, we've also been asking this question: how can we
7:21
find the line that best describes our data set? And then after that, what comes in is: how likely is some kind of event to occur? And that is really what takes us into probabilities. And so what we've been going through in the past sessions
7:38
is this: in the first one, we looked at basic algebra, including looking at things like
7:45
data sets, equations, and solving equations and such, really setting the base for all of the other
7:53
math to follow. Then in the second one we were looking into calculus and calculating
8:00
things such as derivatives and integrals. Then came our third session, which was around linear
8:06
algebra and might have been the heaviest of the sessions so far, and kind of the most extensive
8:15
in terms of how it's used in machine learning as well. And then last week, no, two weeks
8:21
ago, we went into statistics, looking at those central tendencies and distributions and so forth
8:32
And now in this fifth session we will be looking into probabilities
8:37
So if you have not been following along with the other sessions, I'm Heini Ilmarinen
8:42
I live in Finland and I work as a DevOps consultant at Polar Squad. I currently work with Azure
8:51
And I also do training around Azure since I am a Microsoft Certified Trainer
8:56
And my favorite topic at work is definitely architecture and figuring out architectural problems
9:04
But I used to study math and actually studied to become a math teacher
9:08
So it's really nice that I get to combine those past skills and current skills
9:13
and get to run these sessions for you around how to use mathematics for machine learning
9:18
and data science and everything in that area. And as you might see in the following slides
9:27
I am a bit of a doodler and like to draw pictures of pretty much everything. So we will be looking
9:35
today into probabilities. To start off: what we're looking at with probabilities
9:42
is really understanding the likelihood of things. We're looking at how likely is something
9:46
to occur. And really, if you think about it, that is already kind of talking about things regarding
9:53
being able to predict something, being able to determine whether something is likely to happen
9:59
or not. And to get to that point, we're first going to start looking at basic probabilities
10:06
Then we will look at, well, how do probabilities work in real life? And then that will take us
10:13
into the law of large numbers. Then we will look at conditional probability
10:19
and how can we determine independent versus dependent events. And then as the last point
10:27
we will be looking at the Bayes' theorem. And then at the very end, as I mentioned
10:32
I will be kind of tying in everything that we've been going through
10:35
with the different sessions so far and the learnings that we can take from here
10:42
for our machine learning career. So let us start with basic probabilities
10:49
So when we're talking about probabilities, we are really looking at how likely is an event to occur
10:56
And so how we would calculate that is by looking at first
11:02
the number of ways that event can occur. And then we want to divide that by the total number
11:09
of options, of outcomes that we have. And I think one of the most common examples
11:18
used in probabilities is throwing a dice, but it's used a lot because it is so efficient
11:24
to really demonstrate this point to us. So I'm not gonna abandon something that works
11:31
So we're just gonna take a quick example here using the roll of a dice
11:36
So if we think that we're rolling a dice and we want to calculate the probability of rolling a 1
11:49
what we need to think is: in what ways can we roll that number 1? Well, we can roll it
11:58
in only one way, because there is only one 1 on the dice. So the number of ways it can happen is
12:07
one. And then the total number of outcomes in a dice is, of course, six, because there are six
12:13
different numbers on that dice. So the probability is one out of six. And if we want to turn that
12:20
into a percentage, we would just do that division 1 divided by 6 and multiply by 100. So we would
12:29
get it to be about 16.7%. And in these simple situations, it is really easy to calculate these
12:40
probabilities. But of course, our situations might be way more complex, there might be
12:47
multiple events happening and things affecting one portion or the other. So we will be taking
12:54
steps into those more complex situations. But this is our starting point: just a simple event,
13:01
nothing else affecting it, and we are just able to directly calculate the outcome
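To make this concrete, here is a minimal Python sketch of the calculation just described; the variable names are our own illustration, not anything shown in the session:

```python
# Probability of an event = (ways it can occur) / (total number of outcomes).
favorable_outcomes = 1  # only one face of the dice shows a 1
total_outcomes = 6      # a standard dice has six faces

probability = favorable_outcomes / total_outcomes
print(f"P(rolling a 1) = {probability:.4f} = {probability * 100:.1f}%")
# P(rolling a 1) = 0.1667 = 16.7%
```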
13:08
But if we think about this event of rolling a dice in real life
13:12
we now know the theoretical probability that we just calculated; we just got the result:
13:20
for rolling a one, the probability is one divided by six, about 16.7%
13:27
But if you think about real life situations, there is something called experimental probability
13:35
Because in real life, if we decide to test this out, test this probability out
13:40
we might not get the same results. So let's say that we decide that we're going to roll the dice
13:48
two times. And then from those two rolls, we're going to determine what is the probability of
13:53
rolling a one. And if you've played any kind of dice game, you might sometimes have found that you've
14:01
rolled the dice twice, and both times you have gotten one. So we roll it once, we get a one
14:08
and we roll it a second time, we get another one. And so we have a set where we roll it
14:16
two times, and both of those times we get ones. In this case, when we're calculating the probability
14:26
we take the number of events where we got one, and then we divide it by the number of total events
14:37
So down here it's number of total events. And at the top, it is the number of events
14:47
where we got the desired outcome. So this is the number of times we get a one. But of course, on the other hand, in two rolls we could also get no ones at all
15:07
So, when we do this experiment and roll the dice twice,
15:14
Does that mean our probability is now 100%? Does it work that way
15:21
Well, not quite. What we should take away from this scenario is that there is a difference between the experimental probability and the theoretical probability
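As a small illustration of that difference, here is how one might compute the experimental probability from the two-ones scenario above; this is our own sketch, not code from the session:

```python
# Two observed rolls, both ones: the scenario described above.
observed_rolls = [1, 1]

# Experimental probability = (events with the desired outcome) / (total events).
experimental = observed_rolls.count(1) / len(observed_rolls)
theoretical = 1 / 6

print(f"Experimental P(1) = {experimental:.2f}")  # 1.00, i.e. 100%
print(f"Theoretical  P(1) = {theoretical:.2f}")   # 0.17, i.e. about 16.7%
```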
15:35
And the relationship between these two is what is called the law of large numbers
15:43
And what this means is that the more we repeat an experiment, the closer we should get to that calculated result, the theoretical probability
16:00
So in our example, we just rolled the dice twice. But if we keep rolling that, it is quite likely
16:07
that the next rolls will not be ones. And we roll it again, and maybe we do get
16:15
another one. But we roll it again, and we roll it again. And at this point, we already get to a
16:21
situation where we have our probability, our experimental probability is one third, which is
16:29
already much closer to our theoretical probability than our first two rolls were. So this law of large
16:38
numbers really states that the more we repeat an experiment, the closer we will get to that
16:45
theoretical probability. So if we were to repeat this experiment, for example, thousands of times
16:53
that probability of getting a one would fall very close to that probability that we got by
17:03
calculating, which was one-sixth: out of the thousand rolls of the dice, about one-sixth
17:11
would be ones. So really the probabilities very much tie together with an actual data set that
17:19
we might have. So we might gather some data, and from our data, we might be able to calculate some
17:25
probabilities. And depending on how large our data set is, that probability will be more or less
17:33
accurate. The less data we have, the less accurate that probability can be, whereas
17:40
the more data we have, the more accurate that probability will be as well.
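A quick simulation makes the law of large numbers visible; the roll counts below are our own choices for illustration, not figures from the session:

```python
import random

# Roll a fair dice repeatedly and watch the experimental probability
# of rolling a 1 approach the theoretical 1/6 as the rolls accumulate.
random.seed(42)  # fixed seed so the run is reproducible

ones = 0
for n in range(1, 10_001):
    if random.randint(1, 6) == 1:
        ones += 1
    if n in (2, 10, 100, 1_000, 10_000):
        print(f"{n:>6} rolls: experimental P(1) = {ones / n:.4f}")

print(f"theoretical P(1) = {1 / 6:.4f}")  # 0.1667
```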
17:49
Now, if we look at the situation we have been looking at here, this case where we just roll some dice:
17:55
This particular case has been a situation where we have very discrete values
18:03
So that roll of a dice cannot take just any value whatsoever; there is some discreteness to what it can be showing
18:14
So, sorry, I'm trying to change the slide here. There we go. So really, if we think about how then
18:25
those rolls of the dice will distribute if we repeat them many times: they will create this discrete
18:32
distribution, where the different numbers on the dice will each get a different number of repetitions
18:40
And those will be our probability distribution for that measurement that we're doing. For example
18:51
this could describe our distribution when we're rolling a dice, except I have five bars here
18:57
so we should actually have six to really be representing a roll of the dice
19:03
So this is one kind of measurement where we might be looking at the probabilities
19:08
But the other type that we might have is this, which is a continuous distribution
19:17
where the values that we're measuring can take any value between certain numbers, maybe some minimum and maximum
19:27
and then each of those values will have a different amount of repetitions again
19:34
So we again have the number of repetitions and the values, and this is the continuous distribution
19:42
And you might remember, we actually took a look at this in our previous session as we were looking
19:48
at statistics, because we can, of course, look at the distribution of the values from the perspective
19:54
of this is the data that we have gathered, or we can look at it from the perspective of what is
20:00
the probability of each of those values that we might get. And of course, as we looked at how to
20:07
calculate the probability, we look at the number of events per that value that we're looking at
20:14
divided by the total number of events. So it will of course be related to the number of events we
20:21
have for each specific measurement. So that is why just the general distribution and the
20:29
probability distribution will look very alike. But when we're talking about probability
20:35
distributions, then most often, instead of showing the number of repetitions,
20:43
here on this bar, we change this to the probability of a specific value
20:53
So the scale at which we represent those values will change.
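As a sketch of that rescaling from counts to probabilities (our own illustration, assuming simulated dice data):

```python
import random
from collections import Counter

# Turn a raw count distribution into a probability distribution by
# dividing each count by the total number of observations.
random.seed(7)
rolls = [random.randint(1, 6) for _ in range(1_000)]

counts = Counter(rolls)
total = len(rolls)
for value in sorted(counts):
    probability = counts[value] / total
    print(f"value {value}: count = {counts[value]:>3}, probability = {probability:.3f}")
```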
21:09
So we've been looking at how we can calculate the probability for one event. But sometimes we might have multiple events happening
21:15
And in that case, we will need something called a conditional probability
21:21
But first, to understand what conditional probability is about, we need to look at the two events that we're considering.
21:33
So we start with two events, and we need to see whether these two events are independent
21:40
or whether they are dependent. So those are our two options. And what this means is that if we look at independent events
22:00
then it means that if we have event A, then it does not affect the event B in any way
22:07
So if we say that event A is, for example, one roll of a dice
22:18
and then the event B is the second roll of a dice
22:26
is the first roll going to affect what number we're gonna get out of the second roll?
22:35
No, it's not. We're still gonna have the same probability
22:40
So if we're now going to calculate the probability of rolling a one and a one in our two-roll
22:49
situation here. So to calculate that, we need to take the probability of A
23:01
and then we need to multiply that by the probability of B. So we need the probability of what is
23:11
the probability of getting one in the first roll of dice, and we need the probability of B
23:16
of what is the probability to get one out of that second roll of the dice. And so this means
23:22
we have one sixth since we calculated that before, because we have one option out of the six to get
23:29
one. And then for the second roll, the probability is exactly the same. And we just multiply those
23:37
together. And so in this case, we get 1 out of 36. And again, you could write this as a percentage
23:47
if you would like to. We're just gonna leave it at that for now. So why do we do the multiplication
23:58
Well, because if we think about it, if you think about these two rolls of the dice, the combination of those two rolls
24:12
there are no longer six options, but six times six options total. Because if you get a one
24:19
on the first roll, you could get a one, two, three, four, five, and a six on the second roll
24:26
If you get a two, you can again get one, two, three, four, five, six on the second roll. So you
24:31
need to kind of multiply each of those options on the first roll with the options on the second roll
24:37
And that is why we use multiplication in these scenarios. So whenever you're thinking about
24:43
what is the probability of the first one and the second one, then you use multiplication there
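Here is a small sketch of both views of that rule, the direct multiplication and the brute-force count over all six-times-six outcomes; it is our own illustration, not code from the session:

```python
from itertools import product

# Independent events: P(A and B) = P(A) * P(B).
p_first_is_one = 1 / 6
p_second_is_one = 1 / 6
print(f"P(1 and 1) = {p_first_is_one * p_second_is_one:.4f}")  # 1/36 ≈ 0.0278

# Cross-check by enumerating all 6 * 6 = 36 combined outcomes.
outcomes = list(product(range(1, 7), range(1, 7)))
hits = [pair for pair in outcomes if pair == (1, 1)]
print(f"{len(hits)} out of {len(outcomes)} outcomes")  # 1 out of 36
```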
24:51
And sometimes, in a way, this can feel a little, I now have the word only in Finnish in my head
25:03
counterintuitive, that's the word, to do the multiplication. But here it shows that rolling a one once
25:14
is more probable than rolling it twice in a row. And if you think about it that way, it makes complete sense. So: multiplication when you're looking at two events. But what if we have two dependent events
25:31
In that case, we can still look at this situation of having those two events and multiplying them together
25:44
But in that case, we also need to take the other event into account
25:53
So, for example, if we have an event that is, it rains
26:07
And we have another event that says, it is sunny. But then we also know that if it's sunny, it is a bit less likely to rain
26:22
But we still do want to calculate the probability of A and B, that it is both raining and it is
26:29
sunny. So to do that we cannot just take the probability of it rains and it is sunny, but we
26:36
have to take the probability of it rains. And then we need to take the probability of
26:45
it is sunny when we already know that it's raining. So this second part here is telling us
26:58
about, well, if A has already happened, then what is the probability of B
27:05
So for example, in this scenario, we could have a probability for it rains, because right now we
27:13
don't have any numbers here yet. But let's say that the probability of it raining is 0.2
27:25
And we have a probability of it being sunny, let's say 0.4. So here we see that we want our 0.2 here
27:37
but actually we don't know what this probability here is. What is the probability of it being sunny
27:47
if we know that it rains? So in that case, we need to have possibly some more information
27:57
and maybe in this case it has been given to us what the probability is of
28:07
it being sunny when it rains. So in this case, we're gonna assume that we have been
28:16
given that, and let's just say that it is 0.2. And we're gonna return to this in just a little bit.
28:21
So we're going to go over this whole process from a different angle to make it make sense
28:28
So pretty much, in this case, we were given this probability of it being sunny when we already
28:35
know it's raining. We know that it is 0.2. All right. So what tells us whether those two events
28:45
are independent or dependent. Well, if they're independent, we can just take the two numbers
28:53
and multiply them together directly. If they are dependent, then we need to have
29:02
the probability of the second event happening when already A has happened. So in this case
29:11
we have the probability of it rains times then the probability of, we know it's already raining
29:17
so now the probability of it being sunny is 0.2 as well. And so we can then take those two numbers
29:24
together and then calculate those and get the probability for that. So 0.2 times 0.2 means
29:34
0.04 in this case. All right. So that is the first thing that we need to know here
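In code, the dependent case looks like this; the numbers are the made-up values from the rain example, not real weather data:

```python
# Dependent events: P(A and B) = P(A) * P(B | A).
p_rain = 0.2              # P(A): it rains
p_sunny_given_rain = 0.2  # P(B | A): it is sunny, given that it rains (assumed given)

p_rain_and_sunny = p_rain * p_sunny_given_rain
print(f"P(rain and sunny) = {p_rain_and_sunny:.2f}")  # 0.04
```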
29:47
But from this calculation of, oh, we actually need to know what is the probability
29:54
when one of these events has already happened. We then get to this conditional probability
29:59
So we're able to take this equation that we ended up with here. So we can write
30:09
P(A and B) = P(A) × P(B | A). So this P(B | A) part is
30:24
really saying the probability of B when A has already happened. So this is where we started from
30:49
So what if we don't have that information? What if we don't know what the probability of B is when A has already happened
30:59
Well, then in that case, we can use the information of maybe we do know what is the probability of A and B happening together
31:08
So then what we can do with this, if you remember from our basic algebra to manipulate an equation
31:20
in this case we can do a division here and divide by P(A). And that way we're able to get the
31:27
result P(B | A) = P(A and B) / P(A). So that's what we get from there. So in this case
31:48
if we have the probability of A and B together and we have the probability of A, then we can solve
31:54
for what is the probability of B when A has already happened. And this is called conditional
32:01
probability. So sometimes we, for example, might have some information that, well, we know that
32:07
for somebody to, well, let's go back to our raining example. We might know the probability of
32:18
it raining and being sunny at the same time, and we might know the probability of it just being
32:25
sunny. And then by using those, we're able to get the probability of it raining when it is already
32:33
sunny. So that's how it works together. So a lot of times, rather than being given the probability
32:41
of something when something else has already happened, we most often have the probability of the two
32:48
happening together. So that's how we're then able to use this conditional probability to our
32:54
advantage. And if we really think about it, most of the time we don't have just one event
33:01
we're looking at. So that is why it's quite important to be able to use this conditional
33:06
probability to our advantage and use it when we might need it.
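A minimal sketch of that rearrangement, reusing the illustrative rain numbers from above:

```python
# Conditional probability: P(B | A) = P(A and B) / P(A).
p_rain = 0.2             # P(A)
p_rain_and_sunny = 0.04  # P(A and B), as computed earlier

p_sunny_given_rain = p_rain_and_sunny / p_rain
print(f"P(sunny | rain) = {p_sunny_given_rain:.2f}")  # 0.20, consistent with before
```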
33:16
Then, as for where we can get from here: if we look at this calculation, another way that we could have presented this
33:24
same calculation here is that we would have taken P(A and B), but we could have looked at it from
33:35
the other direction, taking the probability of B times the probability of A when B has already
33:43
happened. And so if we take these two pieces of information that we have gotten here, we have both this
33:53
one at the top and this one at the bottom. We notice that they equal the same thing. So we're
34:02
able to set the two of these equal to each other: P(A) × P(B | A) = P(B) × P(A | B)
34:20
And from here, we get to something called Bayes' theorem. And what Bayes' theorem
34:30
states is that when we have some kind of information about our probabilities
34:35
then we can find out one other aspect based on the information that we do actually already have
34:43
So based on the information we have, we're able to solve other probabilities
34:48
that we might be missing. So in this case, we're going to take this
34:53
and, for example, we're going to solve it so that we can get P(A | B)
35:03
So again, we wanna divide by P(B), because there's a multiplication there
35:08
And so that means that the left side stays put, and then we're just gonna divide it by this P(B)
35:19
that we had here, giving P(A | B) = P(B | A) × P(A) / P(B). And so this is Bayes' theorem. What this is telling us is that if we have some of this other information, then we're able to solve for any missing parts that we might have here
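As a sketch, Bayes' theorem is a one-liner in code; the rain-and-sun numbers below are the same illustrative values as before, not anything from a real data set:

```python
def bayes(p_b_given_a: float, p_a: float, p_b: float) -> float:
    """Bayes' theorem: P(A | B) = P(B | A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

# Reversing the rain example: what is P(rain | sunny)?
p_rain = 0.2
p_sunny = 0.4
p_sunny_given_rain = 0.2

print(f"P(rain | sunny) = {bayes(p_sunny_given_rain, p_rain, p_sunny):.2f}")  # 0.10
```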
35:36
And then also by combining with this previous information here, for example, this one
35:44
then we are able to leverage any of the information we have regarding our probabilities in a more
35:54
complex scenario. And to make any of this make sense, I'm just going to pull this information
36:02
and we're going to look at a practical example. So let's just write this out here one more time
36:11
so that it's going to be there for us when we need it. Okay, so that's our Bayes' theorem
36:31
And then, on the other hand, we also have the equation for how two of these aspects fit together
36:47
So one way to leverage Bayes' theorem is when we have probability trees. So we have some
36:56
kind of event that has different probabilities at some level. So let's just do one kind of example
37:06
So we have, for example, done an algorithm that looks at photos. And then by looking at those
37:16
photos, it will, for example, determine if the patient has cancer. And we have determined that..
37:24
Oops, just a second. Let me get my pen here working again. So based on this algorithm, we have been able to find out that there is a 0.04 probability that yes,
37:42
there is cancer based on that photo and then 0.96 that no, there's actually no cancer
37:54
But then on this side where we have it saying that, yes, it is cancer, some of these will be actually true positives
38:09
So it will say that, yes, it is cancer, and it actually is cancer
38:17
So let's say that this is 0.8. And then we have, like, a false positive
38:30
So when the actual test is done, it is not actually cancer
38:36
And let's say that is 0.2. And then on the null side
38:41
we're also going to have some probabilities. So we're again going to have kind of the true positive
38:48
but in this case it will be for the no. So let's..
39:02
Well, true negative actually, that's why I got confused. Negative. And let's say that the probability for that is 0.05
39:14
And then the false negative, that one will be 0.95
39:31
So to really understand then probabilities, we can use the Bayes' theorem here
39:39
So we can, for example, calculate that, yes, the patient has cancer
39:47
when they have gotten the positive test based on these results that we have
39:56
So we need to be calculating the true positives against the true negatives plus the false
40:09
negatives as well. So we kind of have to take different parts from this graph to calculate
40:16
these all together. So we need to take the true positives and then calculate the true negatives
40:41
and then the false positives. No, I mean the false negatives. But this looks kind of tricky
40:55
So how are we actually going to go about it? So if we look at this graph on the right
41:03
of course, the situation that we're looking for the probability for is, well, this branch here
41:16
So the test has been positive and they actually do have cancer
41:23
So that is the ones that we're looking at. So those are the scenarios that we know are the ones that we want to look at
41:34
And so to calculate these wanted scenarios, we take 0.04 times 0.8
41:45
but since we're looking at all the scenarios where the test is positive, including when the patient
41:56
actually doesn't have cancer, we need to add both of these
42:04
scenarios together, so they are both part of the set
42:12
Oh, sorry, I completely miswrote this at the top. Excuse me. So this is actually the true positives plus false negatives
42:32
because the false negatives will also give a positive test result, even though the algorithm said there is no cancer
42:41
So that's what we want to have down here: all the situations where there is a positive test result, whether there actually is cancer or not
42:55
So to do that, we then take 0.96 times 0.05. So I now have to do some cleaning up, because
43:11
I actually wrote some things wrong here. So this is the... Sorry about that. So to walk through this
43:20
one more time to erase any confusion. This was meant to be so that there is actually a positive
43:27
test result. That's what I meant by this. And this was to be a negative test result
43:39
And again, we have a positive test result and a negative test result. So sorry about that
43:50
So we have the, yes, there is actually cancer, no, there is not
43:56
Plus we have the layer of what the test results have actually been,
44:01
whether it has been positive or whether it has been negative. So those test results have been then used to confirm this
44:08
So whenever we get a result of yes, so this is our algorithm here saying that yes, it is cancer
44:16
This is what the algorithm says. But then we have some data, we get some results
44:23
and we find out that the probability of a positive test result in that group is 0.8,
44:31
but there is still a 0.2 possibility of getting a negative test result
44:37
And then even when our algorithm says that no, there is no cancer
44:41
it's still possible to get a positive test result with probability 0.05, and a negative with 0.95
44:49
So that is what we're calculating here. Taking the rightmost branch, those are the desired outcomes. We want to know when these both fit together
45:05
And then the ones that we want to compare this to is those who still got a positive result
45:12
So both also from the no branch where there is a positive result as well
45:19
And then when we have these numbers, the only thing left is just to calculate these out. So that's all we need to do in this case
45:29
So for this one, we get, let's get back to our blue so we don't get too many colors
45:35
we get 0.04 times 0.8, which is 0.032, divided by 0.032 plus 0.048. And so based on this, we know that
45:59
the probability that that positive result is correct is actually 0.032 divided by 0.08, which is 0.4
46:15
here. So if we again present this as a percentage, it is 40%, quite a bit lower than the branch numbers alone might suggest. So that's how we can go about it.
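To verify the tree walk numerically, here is a short sketch of the same Bayes calculation with the example's made-up branch probabilities:

```python
# Probability tree: first the algorithm's verdict, then the actual test result.
p_algo_yes = 0.04      # algorithm says "cancer"
p_algo_no = 0.96       # algorithm says "no cancer"
p_pos_given_yes = 0.8  # positive test when the algorithm said yes
p_pos_given_no = 0.05  # positive test when the algorithm said no

# P(positive test) = sum over both branches that end in a positive test.
p_positive = p_algo_yes * p_pos_given_yes + p_algo_no * p_pos_given_no  # 0.08

# Bayes: P(algorithm said yes | positive test) = 0.032 / 0.08.
p_yes_given_positive = (p_pos_given_yes * p_algo_yes) / p_positive
print(f"P(yes | positive) = {p_yes_given_positive:.2f}")  # 0.40
```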
46:26
Probabilities are used a lot in these kinds of probability trees and similar structures
46:33
that we might be using. So for today, we are nearing the end of the probability section
46:41
And do know that there is more to know about probabilities
46:47
We don't unfortunately have the time to go through every single detail
46:51
I wanted you to get started on understanding basic probabilities, and then also on understanding these scenarios where we have two events that depend on
47:00
each other. And that is where we get to conditional probability, and then also to Bayes' theorem
47:07
as well. And so in this case, I am going to start wrapping up and just reminding you what we've been
47:16
going through in the previous sessions and how this all fits together
47:21
So because we've been going through all this math and we're not just going through it pointlessly
47:29
it's not, well, it is fun in my opinion, but for you, it might be that you're going through
47:35
the sessions to really take something to your machine learning and data science
47:40
endeavors that you have ahead. So really what I want to come to here in the very end is to
47:48
try to help you tie together all this. So as we've been going through these sessions
47:55
if you remember, as I kind of stated in the beginning as well
48:00
we have started here from basic algebra. We built on top of that with calculus
48:06
Then we went into linear algebra. And then in these two last sessions
48:10
we have been looking at statistics and probabilities. And these all fit together
48:17
with data science and machine learning. But if we look at data science and machine learning
48:22
it can be kind of a vast area and it can be very difficult to know
48:27
which of these to use where. And to help you out at least hopefully a little bit
48:33
to get a sense of that: a lot of times, when we look at data science and machine learning
48:39
or artificial intelligence, they might be described as these two overlapping spheres
48:46
where actually the overlapping part is quite large, but there is more to data science than just the things within artificial intelligence
48:55
and there's more to AI than is contained within data science. And most of the time, then machine learning is thought as a sub-area of artificial intelligence
49:06
That is how it's mostly presented in a lot of sources. So that will just kind of help you set the context of all this
49:14
So we've been going through all these areas. So really to be able to work within these two areas
49:22
we need to have basic algebra in place as the ground underneath all the other math. Without that,
49:31
it's going to get really tricky quite soon. So really try to get that foundation of basic
49:35
algebra really solid. And then since these areas overlap and a lot of these methods are used
49:45
interchangeably and kind of across both areas, this is just kind of a best attempt at approximating
49:53
So really, statistics leans more towards the data science side, and so do
50:01
probabilities. But probabilities already
50:07
start to move towards the AI and machine learning needs as well. And the more we go to the
50:15
side of artificial intelligence, the more we're going to need calculus and linear algebra as well
50:22
But if we look at even one task that you might be doing within these areas, let's say
50:30
that you're, for example, just starting to look into your data, you want to understand it, you try to find valuable insights from that data
50:38
For that, you might, for example, use statistics and probabilities. On the other hand, if you're starting to do linear regression, for you to do that, you're going to need some linear algebra
50:51
you're going to need some calculus, but you're also going to need some probabilities
50:56
So in a lot of the cases when you're working with these areas, it is not just one of these areas that you need
51:03
It is that you need pieces and portions out of many of these areas
51:09
But the level at which you need to understand it depends, I would say, purely on you
51:16
It will become easier to understand how things work if you do have that very deep knowledge
51:22
but we also have a lot of tools that do the work for you. Where it becomes really essential to
51:29
know these things behind the scenes is when something goes wrong, when we get some unexpected
51:35
results. Then we need to have the ability to really look behind the scenes
51:43
and see what has been happening, to map out what happened and why we ended up with the result
51:50
we got. And of course, it might be that, yes, the unexpected result was correct, but we need to have
51:56
some kind of way to verify and not just blindly trust, for example, the algorithms that we built
52:03
but we also need to have some understanding of the mathematics that go into making that algorithm work
52:12
So I really hope these sessions have given you a general understanding of these different topics
52:18
There is definitely more to know, and don't feel discouraged if you feel like you don't know everything yet. That is completely reasonable. There is more to know, and I hope you're inspired to find out more. But also don't be afraid to get into the actual practical implementations and get
52:37
started on those, because then looking at those practical implementations, you will get motivation
52:43
to look deeper into the math as well, once you see how things work in a practical sense
52:49
So I really hope you have enjoyed it, and I have really had a fun time going through these five different mathematics sessions with you
53:17
Heini, you did great again. Thank you a lot. Thank you
53:21
You're welcome. And yes, the Bayes theorem always gets me confused. Every single time
53:28
And now that you say that, we actually got some questions about whether you could walk through it again.
53:37
Yeah, well, let's look at that first. Okay. So could you give some example of how the Bayes theorem
53:42
is used in practice in machine learning and data science? So this is, for example, used in decision trees,
53:50
when we're trying to find the best possible option for example. So that's one place where it's used
53:58
And the two of you can feel free to add anything that comes to mind right on the spot
54:03
I do have something in my mind. You know what I was thinking about this
54:07
The confusion matrix, for example, because it's not just for decision making
54:13
It is also really good for the evaluation of models. So, yeah, that's so true
54:21
Yeah, very good. We will talk about these things in later sessions,
54:27
and that's why I don't want to give away all the cool information now
54:32
So, yeah, let's just leave it at this: it is useful in building machine learning algorithms
54:38
and in evaluating them as well. So it offers a lot of useful approaches that you can go with
54:46
So, how much statistics is it necessary to know on an everyday basis?
54:55
Hmm, very, very interesting question. It depends on what you mean by everyday basis
55:04
Yes, especially nowadays, right? Yeah, exactly. I would say just have at least some general understanding of calculating averages and the different ways to do that and what it actually tells you
55:20
And I think it is very useful to also have some basic understanding of probabilities as well
55:26
Because really, with probabilities, I think the most misunderstood aspect is that we try to rely on the probability for one particular outcome,
55:38
where actually probabilities look at the mass of events and can really give an accurate answer
55:44
for when you have multiple repetitions of that one measurement. I think also if you know some probability
55:53
it can be easier to actually set up a problem and to realize is this problem actually solvable
55:58
with the information that we have? Do we have enough information to solve the problem
56:04
And also what you mentioned, Heini, that when things go right, it's okay, but then when things go wrong,
56:12
it's even more important to be able to dig down and see what is the reason why things didn't work out
56:19
Exactly. And I think we can easily say, and I can say it for myself
56:24
that it's very often that I try to make decisions based on the information we have
56:32
And I'm not talking about the information I have only, but I'm talking about all the information I can gather all around and
56:38
and including all of it to make a decision, especially if it's something big. You don't have to look far,
56:46
just look at the situation we have right now. So, yeah. And yeah,
56:52
there were a few questions about whether we could take a look again at this
56:55
theorem. I think the most important part, that was a bit crazy, was when
57:01
we talked about the true positives and true negatives. And if the person who asked about it is still with us, please feel free to shoot more questions if you have any
57:13
So let's take a look at that again. The last calculation. That's perfect
57:18
Yes. Yes, that's a perfect clarification. Let me just roll on to the right slide here
57:30
I think this is the easiest. If I just back up a little bit
57:34
Sorry, oops, there we go. So, actually. Yeah, one more time, I'm just gonna
57:43
Yeah, so we shared the screen as well, there we go. All right, so if we just start with our problem that we're trying to solve
57:53
So our scenario is that we have, let's say, an algorithm that is able to, from a picture,
58:01
see whether somebody has cancer or not. And then we want to know whether this is
58:08
actually accurate. So we have this test data set, we have all the results from this algorithm, but we
58:15
also then have the actual test results from all the patients, whether they actually tested positive
58:21
for cancer or not. So to do a little more cleanup here. So these negative, positive, negative
58:35
positive, those are actually referring to the test results. So let's just leave them like that
58:43
We have a negative result and a positive result from both of these scenarios, whether there is
58:48
cancer or whether there isn't. And so really, if we did want to name these with the true
58:55
positives and the true negatives, this here on the right, when there is a positive test
59:02
result plus our algorithm said that it's positive, this is the true positive here, actually
59:12
So sorry, I got my terms mixed a little bit in the explanation
59:16
And it is also getting mixed because when we are talking about cancer and such things
59:21
that is, we are talking about healthcare situations, where it's the other way around
59:27
So it is positive, and that's not that good, right? Yeah, exactly
59:32
It's a good result. And that's why it is a bit confusing at this point
59:37
Yeah, exactly. So the positive is in this case now referencing that to, yes, there is cancer
59:42
The algorithm is working. So the true positive refers to: yes, the algorithm works
59:48
So then this, for example, here would be the false positive here
59:55
And then on this side, this would be a false negative. And then this would be a true negative, because they actually didn't have cancer
1:00:12
So what we're calculating here is when we want to calculate what is the probability of somebody
1:00:20
having cancer when their test result is positive, we need to look at the true positives
1:00:28
of when they... So let me just do one more step here
1:00:40
Let's take that color. So if you think about it this way
1:00:45
this is our A and this is our B from here, A and B
1:00:50
And then so for the true positives, we want to take the probability of
1:00:56
yes, there is cancer. So that is this one here, that one. So this one. And then we want to take the probability of
1:01:08
the test being positive when they have cancer and that we have here. So we get it over here
1:01:17
And then for the probability of B, of the test being positive, we do need to take all the test results where we get a positive. So we need to find these ones: here a positive test result, and here a positive test result
1:01:33
So for this one, we get first these two again, and then for this one on the left, we get these two
1:01:42
So that's how it builds up. So that's how it's using the Bayes' theorem
1:01:48
behind the scenes. We just worded it a little differently. Yeah. And I think now that you made circles and lines here and there, I really hope it made a bit more sense
1:02:01
Yes, I hope so as well. Now that the mathematics sessions are over, I think we are going to put together basically a collection of resources and some of the things Heini talked about and showed,
1:02:20
and that is going to be available at our webpage as well
1:02:25
So you can find even more details, hopefully. And maybe some practical exercises as well
1:02:35
Exactly. So you can play around with all the things we learned already
1:02:40
And I think we answered the questions, and we got a lot of nice feedback.
1:02:47
Thank you a lot, everyone, for joining. Maybe we can show the one that Jean wrote us. Yes, I hope I said his name correctly
1:03:03
And yeah, so I also did have a lot of fun today
1:03:08
And I have the feeling that I'm sort of going to miss these mathematical sessions a lot
1:03:14
Yeah. Because it is actually a lot of fun. Exactly. And it is always nice to go back to the basics a bit and refresh these things in our heads as
1:03:26
well, so I really enjoyed it. So how did you guys feel about it? I thought it was really
1:03:33
really interesting. And as Eve said, you know, we will miss you, because now we've had five sessions
1:03:39
which have been really great. Exactly, and thank you so much for having me for this part. It was
1:03:45
really fun for me to prepare for this, and I'm glad I got to go back to the Bayes theorem to
1:03:51
make some circles and make it clearer. I was hurrying through it a little bit, and I'm like,
1:03:57
never hurry, that is never the answer in math. No, not in math at least. Exactly, you put in some time
1:04:06
and make sure that you enjoy it. Also, how did you say it last time? Like having a walk in the
1:04:14
park. A walk in the park, yeah. Right, exactly. Exactly
1:04:18
Yeah, so that's how it feels every time we talk about math. And well, Heini, thank you again for your time and for the awesome sessions and all the knowledge that you shared with us
1:04:30
And I hope we can welcome you at other times as well at AI42
1:04:37
And I think it is time to say goodbye to you. And let's start and move on to our next event
1:04:46
which is going to be in two weeks on the 24th of March
1:04:52
which is March... sorry, I can't talk anymore. And that is going to be about Transact-SQL
1:04:58
So you can see we go a bit into the world of programming
1:05:02
and learn a bit more about technologies as well. And I hope you will join us that time as well
1:05:13
Yep, so with that said, we just want to thank all the people who have tuned in, and all of those of
1:05:21
you who watch this afterwards on our YouTube channel. We're looking forward to seeing you next
1:05:27
time. Yes, and don't forget to follow us on Twitter, Instagram, or Facebook, or all of them
1:05:34
Yes, yes. Thank you. See you soon. Thank you