0:03
So we're going to talk about the big ball of mud. I'm pretty sure that most people who have worked in the software industry for some time understand the concept of a big ball of mud, because that's what most people are actually working with. I also know that most people are not too happy to work on a big ball of mud. It's not always fun, and it is sometimes very dangerous to make changes, because it's hard to understand the consequences of those changes. Big balls of mud suffer from high coupling and other issues, and we're going to go into the details.
0:34
First we're going to talk about what a big ball of mud is, and we're going to look at a very nice example, an open source project called Apache Cassandra. Then we're going to discuss the characteristics of big balls of mud. Then we're going to find out how we can detect that our project is turning into a big ball of mud, how we can stop that from happening in the first place, and what we can do if we already have a big ball of mud.
1:06
So this is a screenshot of a dependency diagram. I think we can agree that it looks a little bit like a ball. We see something like 1,553 Java files that form one big cycle, which means that from each of those roughly 1,500 Java files you can reach any other one and come back another way. So basically they form one gigantic Java file, one logical big Java file that comprises most of Apache Cassandra. That means the original architecture has been lost, or simplified to the maximum, because in that case you could say it's a simple architecture diagram with one box labeled Apache Cassandra and no further subdivision.
1:48
I'm pretty sure the original developers still understand their code base a little bit, but we can also agree that somebody who comes from the outside and needs to make meaningful changes to this code base might have a hard time doing that.
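A group of files in which every file can reach every other file and come back is what graph theory calls a strongly connected component. The following is a minimal sketch of how such cycle groups could be found with Tarjan's algorithm. It is not the method of any particular tool; the function name `cycle_groups` and the file names in the example are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def cycle_groups(dependencies):
    """Return all groups of two or more nodes that form a dependency cycle.

    `dependencies` is an iterable of (source, target) pairs, meaning
    "source depends on target". Uses Tarjan's algorithm for strongly
    connected components.
    """
    graph = defaultdict(set)
    nodes = set()
    for source, target in dependencies:
        graph[source].add(target)
        nodes.update((source, target))

    index = {}      # discovery order of each visited node
    lowlink = {}    # smallest discovery index reachable from the node
    stack, on_stack = [], set()
    groups = []

    def visit(node):
        index[node] = lowlink[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for successor in graph[node]:
            if successor not in index:
                visit(successor)
                lowlink[node] = min(lowlink[node], lowlink[successor])
            elif successor in on_stack:
                lowlink[node] = min(lowlink[node], index[successor])
        if lowlink[node] == index[node]:
            # node is the root of a component: pop its members off the stack
            component = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            if len(component) > 1:
                groups.append(component)

    for node in sorted(nodes):
        if node not in index:
            visit(node)
    return groups


# Hypothetical example: A, B and C form a cycle, D is only used by C.
example = [
    ("A.java", "B.java"),
    ("B.java", "C.java"),
    ("C.java", "A.java"),
    ("C.java", "D.java"),
]
print([sorted(group) for group in cycle_groups(example)])
# [['A.java', 'B.java', 'C.java']]
```

The recursive form keeps the sketch short. For a graph with a cycle group of around 1,500 files, an iterative variant would be the safer choice in Python because of the default recursion limit.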
2:03
So what happened here? I'm trying to prove that once a project becomes a big ball of mud, it will be very difficult to come back from it.
2:16
What I did was an analysis of different open source projects, and one of them was Apache Cassandra. Here I compared five different versions of Apache Cassandra, starting with version 1.0 and going up to version 4.1, and I think by now there is an even newer version that has even worse characteristics than the previous ones. What we can see in the trend, which I'm able to show with many other projects as well, is that once you get those big cycle groups in a system, they tend to grow. That's why I call them code cancer. They're like little tumors, and as soon as they grow over a certain size they develop a dynamic growth that in the end will eat up your whole system.
3:04
And as I said, just by looking at the data we can confirm that things got worse with every new version.
3:13
This is caused by growing groups of elements involved in cyclic dependencies. Once you have one of those bigger cycles, let's assume you have 20, 30 or 40 classes in a cycle, it's very easy to add new classes to that cycle without even knowing it. The bigger those cycles become, the more they develop their own kind of gravity, attracting more and more dependencies from the outside, which in the end only means that the cycles are growing.
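How a class ends up in a cycle "without even knowing it" can be illustrated with a simple reachability check: a new dependency from X to Y closes a cycle exactly when Y can already reach X. The sketch below assumes the dependency graph is available as a dictionary of sets; the names `reaches`, `closes_cycle` and the class names are hypothetical.

```python
def reaches(graph, start, goal):
    """Depth-first search: can `goal` be reached from `start` along dependencies?"""
    seen, todo = set(), [start]
    while todo:
        node = todo.pop()
        if node == goal:
            return True
        if node not in seen:
            seen.add(node)
            todo.extend(graph.get(node, ()))
    return False


def closes_cycle(graph, source, target):
    """True if adding the dependency source -> target would create or extend a cycle."""
    return reaches(graph, target, source)


# Hypothetical situation: A, B and C already form a cycle.
# Report only uses A, so it sits outside the cycle.
graph = {
    "A": {"B"},
    "B": {"C"},
    "C": {"A"},
    "Report": {"A"},
}

# One innocent-looking new dependency from B to Report pulls Report
# into the cycle, because Report can already reach B through A.
print(closes_cycle(graph, "B", "Report"))  # True
```

The larger the existing cycle group, the more classes can already reach each other, so the more likely it is that any new dependency touching the group closes another loop. A check like this could run before a dependency is added, for example in a build step.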
3:41
So this is how it started and how it's going. In version 1.0 we had 296 classes in the big cycle, already pretty big, and now in version 4.1 we have 1,553. Because most of the files are not visible in the two graphs, the right one doesn't look so much worse than the first one. The first one was already pretty bad, but now we are at a stage where it becomes very, very difficult to put any architectural structure on top of Apache Cassandra, because everything is literally connected to everything.
4:17
By the way, the different colors in the graph represent different parent packages. Each of those little rectangles is one Java file, and all Java files with the same color are in the same package.
4:33
So here we have different charts where we compare those numbers over time. In the first chart on the top left we can see that the lines of code grew from about 50,000 in version 1.0 to over 300,000 in version 4.1. The number of Java files went from 500 to 2,200 or something like this, and the number of packages went from 40 to over 120.
5:00
Now we have two metrics here, called propagation cost and maintainability level. Blue is propagation cost. A propagation cost value of 70% means that if you change a random file in your system, 70% of the system will be affected by that change. That's quite a big number.
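The propagation cost described above can be sketched as follows. This is an illustration under one assumption: a file counts as "affected" by a change if it is the changed file itself or depends on it directly or indirectly, and the metric is the average share of affected files over all possible changed files. The exact definition used by a given tool may differ, and the function and file names here are hypothetical.

```python
def propagation_cost(dependencies):
    """Average share of files affected by changing one file, as a value in [0, 1].

    `dependencies` is an iterable of (source, target) pairs, meaning
    "source depends on target". A change to `target` therefore affects `source`.
    """
    dependents = {}  # reverse graph: file -> files that depend on it directly
    for source, target in dependencies:
        dependents.setdefault(target, set()).add(source)
        dependents.setdefault(source, set())
    if not dependents:
        return 0.0

    total_affected = 0
    for changed in dependents:
        affected, todo = set(), [changed]
        while todo:
            node = todo.pop()
            if node not in affected:
                affected.add(node)
                todo.extend(dependents[node])
        total_affected += len(affected)
    return total_affected / len(dependents) ** 2


# Hypothetical four-file system without a cycle ...
layered = [("A", "B"), ("B", "C"), ("D", "A")]
# ... and the same system after C starts depending on A, which closes a cycle.
cyclic = layered + [("C", "A")]

print(propagation_cost(layered))  # 0.625
print(propagation_cost(cyclic))   # 0.8125
```

In this toy system a single extra dependency that closes a cycle raises the value from 62.5% to 81.25%, because every file in the cycle is now affected by a change to any other file in it.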
5:22
Then maintainability level is a metric that we developed together with some of our customers, where we look at how maintainable a module in your software is. It's computed on the module level, and it takes into consideration the dependency structure, the number and size of cycle groups, and similar things. The best value is 100% and the worst value is zero, and you can see it's pretty close to zero. It started in version 1.0 a little bit over 20%, then it fell to the 10% range and never really recovered from that.
5:57
Propagation cost at the end is brutally high, almost 80%. So every time I change a file in Apache Cassandra, 80% of the Apache Cassandra code base might be affected by that change. That of course dramatically increases the chances of regression bugs coming in. It also makes it much harder to test the code, because you cannot test anything in isolation. And that is not even talking about how you would be able to understand that code base if there's so much coupling and everything is kind of related to everything else.
6:28
We can also see that the number of elements in the biggest cycle group is continuously increasing, from 296 in version 1.0 to 1,553. That's a pretty steep increase. Also, the biggest package cycle group went from about 30 to 110 packages. And relative cyclicity, another metric which I will explain further down the road in this presentation, went up to the 90% range on the package level and to the 70 to 80% range on the Java file level.
7:09
So that is what I'm trying to prove here: once you have this kind of code cancer, it usually only gets worse. That's why it makes sense to think about why it is happening in the first place and what we can do to stop it.
7:24
After all, Cassandra is quite a successful project. It's used in many production environments and is a very popular open source project. But it has no discernible architecture. It is hard to understand, especially if you're new to the project, because then you basically have to read the whole code base to understand how everything works.
7:49
It is basically impossible to test anything in isolation. It has no modularity whatsoever, and it's just one big jar of highly coupled code.
8:06
That also means you have a higher risk of vulnerabilities. Just imagine you want to harden that code base against cyber threats. The more complicated and the more highly coupled the code base is, the more difficult it becomes to harden it against threats from the outside.
8:27
And of course we talked about the chance of regression bugs. The probability increases dramatically, because every time you change a file, about 1,500 files are directly or indirectly affected by the change, due to the extremely high coupling values in that project.
8:44
So another argument is that big balls of mud are expensive, very expensive indeed. Just for the United States there was a report from CISQ about the total cost of poor software quality, and it estimated that for 2020 this cost was in the range of $2 trillion, which is roughly 10% of GDP. Quite a bit of money. There's a newer report out in the meantime, from 2022, that has even worse numbers in it. So we can assume that poor software quality costs a lot of money over time.
9:21
We have a relatively large bubble for unsuccessful IT projects, $260 billion alone. That is a little shocking, right? We have been developing software for more than 50 years now, so you would hope that we had figured out how to do it right in the meantime. But obviously not; we are still having issues finishing projects successfully. Then we have legacy systems with $520 billion and operational failures with $1.56 trillion, which is quite a big number.
9:53
And there are many famous examples of bad software quality. Let's talk about the Boeing 737 MAX, for example. It's a nice example where sloppy software development and sloppy quality control actually led to the deaths of 346 people and totally destroyed the reputation of Boeing, and they're still trying to recover from it. And obviously, if you read the news, those problems are not totally solved at Boeing yet. If doors can fly out mid-flight, that's a serious quality issue too. But it's all related: the quality of the airplane and the quality of the software go together. And there are many other examples where poor software quality in the end cost real money.
10:36
Now, there are some more interesting findings in this report. For example, failed projects are up 46% from 2018. I was wondering why that happened, but I think there's a relationship with the hype around microservices, which started in about 2015. Many organizations switched to a microservice architecture without really knowing what they were doing, so many of those microservice projects actually failed. I've seen a couple of them, because we also do software assessment services, and I can tell you there are lots of issues.
11:16
Microservices are not a bad thing in themselves, but it requires a more skilled team to write microservices compared to a monolith. It also puts a lot of burden on your end users, because now they have to manage many services that have to be orchestrated, running in a Kubernetes cluster or something like this, and all of that increases complexity. It also increases programming complexity, because as soon as one microservice has to talk to another one, you need inter-process communication mechanisms like message queues and similar things, which increase complexity rather than decrease it.
11:57
What are the key recommendations from the report? One of them says: ensure early and regular analysis of source code to detect violations, weaknesses and vulnerabilities. That is where you actually need tools, and very few people use tools for that. We're going to talk about some tools a little bit down the road.
12:18
Then: measure structural quality characteristics. I couldn't agree more. That's why we have a tool called Sonargraph that can do all those things, and there's even a free version of it called Sonargraph-Explorer, which you can download for free from our website, hello2morrow.com, and use for free even in a commercial context. It will compute most of those metrics for you and give you some basic information about cycles and similar things. There's also a commercial version of the product that can do nice visualization and other stuff, if you want a little bit more than just the basic features.
12:51
And then: recognize the inherent difficulties of developing software and use effective tools to help deal with those difficulties. Again, totally correct, but very few companies actually do something about that. I can relate to that very well, because I talk about this topic at conferences very often, and I often ask: how many of you have a formally defined architecture model for the project you're working on right now? Less than half of the hands go up. Most people don't have a formal architecture. It's something that is maybe communicated verbally or hidden in some documents, but it's not a lot: basically some basic rules about what you should do and what you shouldn't do, and it's not enforced in any way, shape or form.
13:49
Of course, that makes it much easier for the big ball of mud to grow, because if there are no boundaries and no rules that are enforced when it comes to the dependency structure, those issues can happen very easily. So you need tools to figure out what's going on with your code base, and you need rules, enforceable rules: ideally something that breaks your build when something happens that does not conform to your current architecture.
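As a rough illustration of a rule that can break a build, a build step could scan the dependencies found in the code base and report a non-zero exit code when a forbidden one shows up. This is only a sketch under stated assumptions, not how Sonargraph works: the package names, the `FORBIDDEN` rule list and the functions `in_package`, `forbidden_dependencies` and `exit_code` are all hypothetical, and the dependency list is hard-coded where a real setup would extract it from the source code.

```python
# Hypothetical rules: (source package, target package) pairs that must not occur.
FORBIDDEN = [
    ("shop.ui", "shop.persistence"),   # UI must not reach into persistence
    ("shop.persistence", "shop.ui"),   # persistence must not know the UI
]


def in_package(module, package):
    """True if module is the package itself or lives somewhere below it."""
    return module == package or module.startswith(package + ".")


def forbidden_dependencies(dependencies, forbidden=FORBIDDEN):
    """Return every (source, target) dependency that matches a forbidden rule."""
    return [
        (source, target)
        for source, target in dependencies
        if any(in_package(source, src) and in_package(target, dst)
               for src, dst in forbidden)
    ]


def exit_code(dependencies):
    """Print the violations and return 1 if there are any, otherwise 0."""
    violations = forbidden_dependencies(dependencies)
    for source, target in violations:
        print(f"architecture violation: {source} -> {target}")
    return 1 if violations else 0


# Assumed input; a real build step would extract this from the source code.
deps = [
    ("shop.ui.cart_view", "shop.service.cart"),
    ("shop.ui.cart_view", "shop.persistence.cart_table"),
]
```

A build script could finish with `sys.exit(exit_code(deps))`, so that any violation makes the build fail instead of passing silently.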
14:18
Now let's talk about the characteristics of a big ball of mud and how we can measure them.
14:30
A big ball of mud has few but large groups of cyclic elements. That's the sign of end-stage code cancer: those few big cycle groups. In Apache Cassandra there was one big cycle group, and now it has metastasized to a second one with only about 100 elements. But if you have a few large cycle groups, then you have a big ball of mud.
14:57
Once they grow over a certain size, they are basically like black holes in the universe: they develop a lot of gravitational pull and keep pulling more and more components into their orbit.
15:12
And of course we have some metrics, which I'm going to explain in the next section, that allow you to detect this problem early on.
15:23
The first good strategy to fight the big ball of mud is to monitor cyclic dependencies and not allow those large cycle groups to grow. That would be your first strategy, at least to stop the bleeding.
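Monitoring cyclic dependencies can be sketched in a few lines, assuming a cycle group corresponds to a strongly connected component with more than one element in the dependency graph. The code below uses Tarjan's algorithm for that; the function names and the small example graph are made up for illustration, and a real tool would build the graph from the code base.

```python
def cycle_groups(graph):
    """Return the strongly connected components that have more than one node.

    graph maps each node to the nodes it depends on. A node that only
    depends on itself forms a component of size one and is not reported.
    """
    index = {}                    # discovery order of each visited node
    lowlink = {}                  # smallest index reachable from the node
    stack, on_stack = [], set()
    groups = []

    def visit(node):
        index[node] = lowlink[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for dep in graph.get(node, ()):
            if dep not in index:
                visit(dep)
                lowlink[node] = min(lowlink[node], lowlink[dep])
            elif dep in on_stack:
                lowlink[node] = min(lowlink[node], index[dep])
        if lowlink[node] == index[node]:   # node is the root of a component
            component = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            if len(component) > 1:
                groups.append(component)

    for node in graph:
        if node not in index:
            visit(node)
    return sorted(groups, key=len, reverse=True)


def largest_cycle_group(graph):
    """Size of the biggest cycle group, or 0 if the graph has no cycles."""
    groups = cycle_groups(graph)
    return len(groups[0]) if groups else 0


# Made-up example: a, b, c form one cycle group; d and e form a second one.
graph = {
    "a": ["b"], "b": ["c"], "c": ["a", "d"],
    "d": ["e"], "e": ["d"], "f": ["a"],
}
print(cycle_groups(graph))
print(largest_cycle_group(graph))
```

Recording the value of `largest_cycle_group` for every build and rejecting changes that make it grow is one possible way to keep the large cycle groups from growing.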
15:42
And of course, if you can do that, an enforceable architectural model is even better. Why is an architectural model better? Because in an architectural model you actually talk about the structure of your code base. I'm a big fan of domain-driven design: basically, you cut your system by functionality first, and then you have those different domains. They should have a cycle-free dependency structure between them, and then inside the domains you can have layering, like a UI layer, a controller layer, a service layer, a persistence layer and so on. Once you have this enforceable structure, you will know which dependencies in your code base have to go anyway because they violate the structure, and many times removing those unwanted dependencies leads to fewer cycles and a better structure.
16:33
But of course, without using proper tools, fighting this problem becomes very, very difficult.
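The model described above, domains first and layers inside each domain, can be written down as data and checked mechanically. The following is a minimal sketch under several assumptions: elements are named `<domain>.<layer>.<name>`, a layer may use itself and any layer below it, and the domain names and allowed domain relations are invented for the example. It is not the notation of any particular tool.

```python
LAYERS = ["ui", "controller", "service", "persistence"]   # top to bottom

# Hypothetical domain model: which other domains a domain may use.
# The relation has to stay cycle-free for the model to make sense.
ALLOWED_DOMAINS = {
    "order": {"customer", "catalog"},
    "customer": set(),
    "catalog": set(),
}


def violates(source, target):
    """Check one dependency between elements named '<domain>.<layer>.<name>'."""
    src_domain, src_layer, _ = source.split(".", 2)
    dst_domain, dst_layer, _ = target.split(".", 2)
    if src_domain != dst_domain:
        # Across domains only the declared relations are allowed.
        return dst_domain not in ALLOWED_DOMAINS.get(src_domain, set())
    # Inside a domain a layer may only use itself or the layers below it.
    return LAYERS.index(dst_layer) < LAYERS.index(src_layer)


def layering_violations(dependencies):
    """Return the dependencies that have to go because they break the model."""
    return [(s, t) for s, t in dependencies if violates(s, t)]


deps = [
    ("order.ui.cart_page", "order.service.checkout"),          # downwards: fine
    ("order.service.checkout", "customer.service.accounts"),   # allowed domain
    ("order.persistence.order_table", "order.ui.cart_page"),   # upwards: violation
    ("customer.service.accounts", "order.service.checkout"),   # not allowed: violation
]
for source, target in layering_violations(deps):
    print(f"{source} -> {target}")
```

The two reported dependencies are exactly the kind that "have to go anyway". In this example the last one also closes a cycle between the order and customer domains, so removing it reduces cycles as well.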
16:43
So why does it happen so often? I'm pretty sure that if I could see you now and asked you, do you like working on a big ball of mud, very few hands would go up. And if I asked you, do you intentionally create a big ball of mud, nobody would agree to that, because we're not doing that intentionally. We're not trying to create the worst possible software structure that we can. I think most developers I know are striving to create good code; it is much more fun to work on a good code base anyway. But as soon as we have enough people working on the same code base, that is the usual outcome.
17:18
So you need to put active countermeasures in place, otherwise a big ball of mud is the default outcome for any non-trivial project. I've proven that a million times: I've done so many software assessments, and I can tell you that 80% of non-trivial systems end up as a big ball of mud. It's our default architecture design pattern.
17:41
Another reason is that we have no formal definition of our architectural model, and even if we have one, we lack an enforcement mechanism. It's nice to have your architecture on paper, but the paper won't tell you whether your code base conforms to the architecture.
18:01
So without proper tools and dependency visualization, those structural issues are introduced without the developers being aware of the problem. They're not doing that intentionally; developers are not stupid. But if you're working on a big code base with thousands of files, it's really hard to understand all the dependency implications of the changes you are making without tools that help to visualize them.
18:29
And as I said, the cycle groups grow like little tumors, so they create a snowballing effect until your whole code base is affected.
18:41
Also, it's not easy to communicate architectural rules. That's why a formal description that is enforced by a tool can be so helpful: there's no room for interpretation of the rules, because the tool decides what is right and what is wrong.
19:01
And of course we're always working under time pressure, and time pressure definitely encourages shortcuts. So many times we're just pragmatic and skip a couple of hoops to make sure that we are faster and that we meet our deadline. In the end we increase our technical debt by doing that, but usually we don't have the time after the deadline to fix the technical debt that we have already accumulated. And if you never fix or address your technical debt, it only grows and makes things harder and harder, until you reach a point where every change becomes very, very difficult.
19:39
So some people said: oh, microservices to the rescue, right? Just make smaller code bases and then everything becomes easy.
19:52
Of course, the major cause of the big ball of mud is that it's hard to define and enforce architectural boundaries. But that problem does not go away by simply splitting a monolith up into many smaller services, because you still have the dependencies between those services. So what many people do when they jump blindly into microservices architectures is split up that big ball of mud into a distributed big ball of mud, which is even worse than what we had before.
20:29
Of course there are many good cases for microservices, but they're not the silver bullet. The author of this famous book about microservices, Sam Newman, put it very nicely: you have to convince me that I need microservices, and you have to have good arguments, because the default architectural pattern should still be the monolith. The monolith is a valid architectural pattern. The fact that so many people have problems with monoliths is due to the fact that very few people are able to create a nicely structured monolith, because they have no tools to enforce architectural rules. If you have a well-structured monolith (we also call those moduliths, for modular monolith), then you don't have the usual problems that you normally have with monoliths.
21:15
And the argument that you need microservices for scaling or scalability is not always true. Sometimes it is true: sometimes just splitting your big monolith into two services already creates a great improvement in scalability. Sometimes scalability is not improved by just splitting a service up into smaller ones; it depends on the architecture characteristics of the different services you're creating. But using microservices just by itself creates more problems than you used to have before. They can be a good solution, but not everybody is Netflix, not everybody needs Netflix scalability, and not everybody has a rockstar developer team like the people at Netflix have, who obviously understand all these complex things and complex relationships and can make it work.
22:13
So microservices can be a good solution, but they increase overall complexity; they do not decrease it.
22:23
And we know by now that many of those microservice migration projects actually fail, because as soon as you get into the details and the nitty-gritty, things can get very nasty very quickly.
22:39
So we talked a lot about why it's happening and what the big ball of mud is. Now it would be useful if we could actually get some metrics that help us find out what is wrong with our software, and maybe detect harmful trends early by using a metric-based feedback loop.
22:59
The two metrics I'm going to explain pretty quickly are average component dependency and propagation cost; those two are related. Then we're going to talk about what cycle groups are, we're going to introduce some cycle analysis metrics, and we're going to talk about the metric maintainability level, which we developed together with some of our customers.
23:24
customers let's start with average
23:24
customers let's start with average component dependency that's a simple
23:26
component dependency that's a simple
23:26
component dependency that's a simple metric
23:28
metric on the left side so we have a couple
23:30
On the left side we have a couple of dependency graphs. The boxes are source files, the arrows are dependencies between those source files, and the numbers are the so-called depends-upon values. At the bottom, the depends-upon value is one, because this file only depends on itself. This one depends on this one, this one, and on itself; that's why we have a three in there. And this one depends on all the other ones, directly and indirectly, plus itself, so that's why we put six in there. If we add up those numbers we get the cumulated component dependency: 6 + 3 + 3 + 3 * 1 gives you 15. If you divide that by the number of boxes, we get an average component dependency of 2.5, meaning that in this dependency graph every file depends on average on 2.5 files, including itself.
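The calculation just described can be sketched in a few lines of Python. The graph below is an assumption, reconstructed from the values mentioned (6, 3, 3 and three times 1), since the slide itself is not part of the transcript; the file names and the helper `depends_upon` are made up for illustration.

```python
# Hypothetical six-file graph: each key depends directly on the files listed.
GRAPH = {
    "top": ["mid_a", "mid_b"],
    "mid_a": ["leaf_1", "leaf_2"],
    "mid_b": ["leaf_2", "leaf_3"],
    "leaf_1": [],
    "leaf_2": [],
    "leaf_3": [],
}


def depends_upon(graph, start):
    """Count the files reachable from `start`, including `start` itself."""
    seen = {start}
    stack = [start]
    while stack:
        for target in graph[stack.pop()]:
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return len(seen)


values = {name: depends_upon(GRAPH, name) for name in GRAPH}
ccd = sum(values.values())  # cumulated component dependency
acd = ccd / len(GRAPH)      # average component dependency

print(values)
print(ccd, acd)
```

For this assumed graph the sum is 15 and the average is 2.5, matching the numbers in the talk.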
24:19
Now in the middle we see what happens if we apply the dependency inversion principle, which allows us to turn around the direction of dependencies. In that case the two files in the middle become the bottom of our system, and we get a better value of 2 instead of 2.5. The minimum value would be one, which means we have six islands with no connection between each other. The maximum value would be six, which means everybody is literally connected to everybody. And here we see what happens if you just add one dependency that creates two cycles, one to the left and one to the right: we get a much worse value of 4.33, which is pretty close to the maximum of six. So cyclic dependencies always make average component dependency worse.
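A sketch of how these variations might be checked. All four graphs are assumptions, chosen so that the results match the values mentioned in the talk (2, 4.33, and the bounds 1 and 6); the actual slides may draw them differently, and the function name `acd` is just a label used here.

```python
def acd(graph):
    """Average component dependency: mean number of reachable files, self included."""
    total = 0
    for start in graph:
        seen = {start}
        stack = [start]
        while stack:
            for target in graph[stack.pop()]:
                if target not in seen:
                    seen.add(target)
                    stack.append(target)
        total += len(seen)
    return total / len(graph)


# Dependencies inverted: the former leaves now depend on the two middle files.
INVERTED = {
    "top": ["mid_a", "mid_b"],
    "mid_a": [],
    "mid_b": [],
    "leaf_1": ["mid_a"],
    "leaf_2": ["mid_a", "mid_b"],
    "leaf_3": ["mid_b"],
}

# Original layering plus one extra dependency from leaf_2 back to top.
CYCLIC = {
    "top": ["mid_a", "mid_b"],
    "mid_a": ["leaf_1", "leaf_2"],
    "mid_b": ["leaf_2", "leaf_3"],
    "leaf_1": [],
    "leaf_2": ["top"],
    "leaf_3": [],
}

# The two extremes: no dependencies at all, and everyone using everyone.
ISLANDS = {name: [] for name in INVERTED}
COMPLETE = {name: [other for other in INVERTED if other != name] for name in INVERTED}

for label, graph in [("inverted", INVERTED), ("cyclic", CYCLIC),
                     ("islands", ISLANDS), ("complete", COMPLETE)]:
    print(label, round(acd(graph), 2))
```

In the cyclic variant, the single extra arrow pulls four of the six files into one cycle group, so each of them suddenly reaches all six files.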
25:11
Now, the input metric for average component dependency is the depends-upon value, which we just explained. This is a graph of the same thing: this one only depends on itself, this one only depends on this one and on itself, and so on. That's how you get the numbers. There's a mirror metric, used-from, which just turns the question around and asks: how many people are using me, directly or indirectly? Then the one at the bottom gets the biggest number, and the smallest numbers are at the top, because those are only used by themselves. The funny thing is, and it's necessarily so for mathematical and graph-theoretical reasons: if you add up the depends-upon values and the used-from values of the same graph, you always get the same number. In our case it would be 3 + 3 is 6, 12, 14, and here we also have 14 if you add it up quickly. So it's the same number. Then we can normalize those numbers by dividing by the number of boxes, which gives us the metrics fan-out and fan-in.
26:14
Propagation cost is basically that value as a percentage, indicating coupling. We learned that Apache Cassandra is at about 70% propagation cost. We can calculate it very easily, either by taking the average value of fan-in or the average value of fan-out, or by dividing average component dependency by the number of components once more, which gives us the same value. All of them result in the same number, a percentage value that tells us how strong the coupling in our system is.
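The three ways of calculating propagation cost mentioned above can be compared in a short sketch. The six-file graph and all names are illustrative assumptions, not taken from the slides. The sums of depends-upon and used-from agree because both count the same (user, used) pairs, once from each end.

```python
GRAPH = {
    "top": ["mid_a", "mid_b"],
    "mid_a": ["leaf_1", "leaf_2"],
    "mid_b": ["leaf_2", "leaf_3"],
    "leaf_1": [],
    "leaf_2": [],
    "leaf_3": [],
}


def reach_counts(graph):
    """For every node, count the nodes it can reach, itself included."""
    counts = {}
    for start in graph:
        seen = {start}
        stack = [start]
        while stack:
            for target in graph[stack.pop()]:
                if target not in seen:
                    seen.add(target)
                    stack.append(target)
        counts[start] = len(seen)
    return counts


def reversed_graph(graph):
    """Flip every arrow, so that 'depends on' becomes 'is used by'."""
    flipped = {name: [] for name in graph}
    for source, targets in graph.items():
        for target in targets:
            flipped[target].append(source)
    return flipped


n = len(GRAPH)
depends_upon = reach_counts(GRAPH)
used_from = reach_counts(reversed_graph(GRAPH))

# Normalized values (fan-out, fan-in) averaged over all files, and ACD / n.
average_fan_out = sum(value / n for value in depends_upon.values()) / n
average_fan_in = sum(value / n for value in used_from.values()) / n
acd_over_n = (sum(depends_upon.values()) / n) / n

print(sum(depends_upon.values()), sum(used_from.values()))
print(f"{average_fan_out:.1%} {average_fan_in:.1%} {acd_over_n:.1%}")
```

With only six files the result is high, which is the small-system effect the talk warns about.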
26:51
As soon as you get values above 20%, you have bad values, except for small systems. Propagation cost calculations make sense for larger systems. If you have a very small system of 20 or 30 files, it's not really relevant, because you will naturally get high coupling values anyway when the number of files is so small.
27:13
And since it's basically a dependency percentage metric, just adding new files to a system will make this value shrink, because you have more and more files, so the density of relationships between files compared to the total number of files will sink. That's due to the fact that you have a quadratic number of potential dependencies. If you have 10 files, you can have 100 different dependencies, 10 times 10. If you have 100 files, you already have 10,000 potential dependencies, and so on. So the number of potential dependencies increases with the square of the number of nodes in your system. Usually, if you're not catastrophically bad at what you're doing, just adding files will let the value shrink a little bit. But still, higher values of propagation cost are always bad news. Never let it grow to the values that you see in Apache.
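A small sketch of this shrinking effect, under the assumption of a deliberately simple structure where every file depends on one shared base file and nothing else. The structure and the function name are invented for illustration and are not from the talk.

```python
def propagation_cost_star(n):
    """Propagation cost for n files where files 1..n-1 each depend on file 0."""
    # File 0 reaches only itself; every other file reaches itself and file 0.
    ccd = 1 + 2 * (n - 1)
    # n * n potential dependencies, counting each file's dependency on itself.
    return ccd / (n * n)


for n in (10, 100, 1000):
    print(n, n * n, f"{propagation_cost_star(n):.2%}")
```

The actual dependencies grow linearly here while the potential ones grow with the square, so the percentage drops as files are added even though the structure stays the same.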
28:14
Now, we talked about cycle groups a little bit. Here's a visualization of what that exactly means. In this graph we have two cycle groups, the gray one and the red one, and the white nodes are not involved in any cycles. With Cassandra before, we saw that giant cycle with 53 elements; you can imagine that this looks a lot more ...
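One way to find cycle groups in code, assuming a cycle group corresponds to a strongly connected component with more than one node. The sketch uses Tarjan's algorithm on a made-up graph; the node names are placeholders, and the talk does not say which algorithm the tool actually uses.

```python
def cycle_groups(graph):
    """Return the strongly connected components that have more than one node."""
    index = {}       # discovery order of each node
    lowlink = {}     # smallest index reachable from the node's subtree
    on_stack = set()
    stack = []
    groups = []

    def visit(node):
        index[node] = lowlink[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for target in graph[node]:
            if target not in index:
                visit(target)
                lowlink[node] = min(lowlink[node], lowlink[target])
            elif target in on_stack:
                lowlink[node] = min(lowlink[node], index[target])
        if lowlink[node] == index[node]:
            # `node` is the root of a component: pop its members off the stack.
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            if len(component) > 1:
                groups.append(sorted(component))

    for node in graph:
        if node not in index:
            visit(node)
    return sorted(groups)


# Hypothetical graph with two cycle groups; "d" and "g" are in no cycle.
GRAPH = {
    "a": ["b"],
    "b": ["c"],
    "c": ["a", "d"],
    "d": ["e"],
    "e": ["f"],
    "f": ["e"],
    "g": ["a"],
}

print(cycle_groups(GRAPH))
```

The recursive form is fine for small graphs; for a real code base with thousands of files an iterative variant would avoid Python's recursion limit.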
28:47
Now, more about cycle groups. Dependencies can be analyzed between compilation units, packages or namespaces, or arbitrary elements defined in your code. So I can do a source-file-based cycle analysis: I take all the C files or Java files from my system and find out if there are cycles between them. Or I can do that analysis by making a namespace or package dependency graph and analyzing on that level. Or I could do an analysis based on directories or something like this. There are different levels of cycles I can calculate, but the most popular levels to calculate cyclic dependencies between are either compilation units themselves, or packages or namespaces.
29:33
In our case we're mostly interested in compilation unit cycles and package-level cycles. Smaller cycle groups with five elements or less within a package or namespace are usually not too problematic, so you can have smaller cycles. Some design patterns basically imply cyclic dependencies between different classes. But keep those cycle groups small: fewer than five elements is a good idea, fewer than three elements is even better.
30:01
Try to avoid package or namespace cycles altogether. If you have cycles, try to isolate them within a namespace or package. But as soon as you get to the package and namespace level, it's always a good idea to keep that structure completely cycle free, because then you can also use your package and namespace structure to connect it to your architecture in some way. For example, if you think about domain-driven design, you can have a namespace naming strategy where you start with your company name, hello2morrow, then the project name, Project X, then the name of the domain, and then the name of the layer. So just by looking at a package or namespace name you would know where you are in your architecture.
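A sketch of how such a naming convention could be used to check the package level for cycles. The namespace names and the file-level dependencies are invented, and the helper names are hypothetical. The check relies on Python's standard `graphlib` module (Python 3.9 or later), whose `TopologicalSorter.prepare()` raises `CycleError` when the graph contains a cycle.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical file-level dependencies, named company.project.domain.layer.File
FILE_DEPENDENCIES = {
    "acme.projectx.orders.ui.OrderView": ["acme.projectx.orders.service.OrderService"],
    "acme.projectx.orders.service.OrderService": ["acme.projectx.orders.model.Order"],
    "acme.projectx.orders.model.Order": [],
}


def package_of(qualified_name):
    """Strip the file name, keeping company.project.domain.layer."""
    return qualified_name.rsplit(".", 1)[0]


def package_graph(file_dependencies):
    """Lift file-level dependencies to the package level, ignoring self-references."""
    graph = {}
    for source, targets in file_dependencies.items():
        dependencies = graph.setdefault(package_of(source), set())
        for target in targets:
            graph.setdefault(package_of(target), set())
            if package_of(target) != package_of(source):
                dependencies.add(package_of(target))
    return graph


def is_cycle_free(graph):
    """True if the package graph can be sorted topologically."""
    try:
        TopologicalSorter(graph).prepare()
    except CycleError:
        return False
    return True


print(is_cycle_free(package_graph(FILE_DEPENDENCIES)))
```

Cycles inside a single package are deliberately ignored here, which mirrors the advice to isolate cycles within a package while keeping the package structure itself cycle free.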
30:49
The good news: all cycles can be broken. The bad news is that most people don't learn how to do that. So that's something you probably need to teach your developers in some kind of boot camp situation, or send them through some good architecture training so they learn how to do that.
31:07
I'll give you an example. The dependency inversion principle is a very famous way to break cycles. Here is a simple cycle between two classes: class A and class B are using each other. By just introducing an interface I can break that cycle. Instead of class A pointing to B directly, it points to an interface for B, and that interface is implemented by B. Now we have a cycle-free dependency graph. That's why it's called the dependency inversion principle: interfaces are really good for inverting the direction of dependencies.
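The interface trick described above might look like this in Python. The class names `A` and `B` follow the talk; the interface name `Notifier` and the methods are invented for illustration. The resulting dependencies are A to Notifier, B to Notifier and B to A, so no cycle remains.

```python
from abc import ABC, abstractmethod


class Notifier(ABC):
    """Interface for B: the only thing A needs to know about B."""

    @abstractmethod
    def notify(self, message):
        ...


class A:
    """Depends on the interface only, no longer on B."""

    def __init__(self, notifier):
        self._notifier = notifier

    def run(self):
        self._notifier.notify("A finished")


class B(Notifier):
    """Implements the interface and may still use A directly."""

    def __init__(self):
        self.messages = []

    def notify(self, message):
        self.messages.append(message)

    def start(self):
        A(self).run()


b = B()
b.start()
print(b.messages)
```

In a language with separate source files per class, `Notifier` would live next to `A` or below both classes, so that the file-level graph is cycle free as well.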
31:45
Another cycle breakup technique could be a reorganization of your code. Here's a classic example: we have a customer class and an order class. The order knows its customer, and the customer has a convenience function: find the order with number ID that belongs to me. Of course the correct ordering would be that this find-order function belongs to the order class and not to the customer class, and then you again get a cycle-free structure.
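A minimal sketch of that reorganization, assuming a simple in-memory list of orders. The field names and the signature of `find_order` are made up; the only point is that `Customer` no longer refers to `Order`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Knows nothing about orders."""
    name: str


@dataclass(frozen=True)
class Order:
    """Knows its customer, and owns the lookup that used to sit on Customer."""
    number: int
    customer: Customer

    @staticmethod
    def find_order(orders, customer, number):
        """Return the customer's order with the given number, or None."""
        for order in orders:
            if order.customer == customer and order.number == number:
                return order
        return None


alice = Customer("Alice")
orders = [Order(1, alice), Order(2, Customer("Bob"))]
print(Order.find_order(orders, alice, 1))
```

The dependency now points in one direction only, from `Order` to `Customer`.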
32:15
There are other techniques too, like promoting and demoting dependencies up and down. For example, if you have two classes that use each other, you can have a class above them that knows those two classes and internalize the cycle into its inner structure. Or you can do this with a class that is known by both of the classes in the cycle, and demote it to that lower class.
32:39
Now let's look at some cycle analysis metrics. Let's start with cyclicity and relative cyclicity. Biggest cycle group is also a good indicator; it's a very simple metric that tells you how many elements are in the biggest cycle group. Structural debt index is a metric that tells us how difficult it would be to break all these cycles up.
33:02
Let's start with cyclicity. Cyclicity is a very simple metric: if you have a cycle group of three elements, its cyclicity is nine; if you have a cycle group of 10 elements, its cyclicity is 100. So it's the square of its size. Very simple. Now we can add up the cyclicity for all cycles inside a module, or inside the whole system, and then you get the sum of cyclicity. Then you can calculate relative cyclicity for that container, which can be the module, the whole system, or even a package, as the square root of the sum of cyclicity, divided by the number of elements, multiplied by 100.
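The two definitions above translate directly into code. The function names are chosen here for illustration, and the 20-element container in the example is hypothetical.

```python
import math


def cyclicity(group_size):
    """Cyclicity of one cycle group: the square of its size."""
    return group_size ** 2


def relative_cyclicity(group_sizes, element_count):
    """100 * sqrt(sum of cyclicity) / number of elements in the container."""
    total = sum(cyclicity(size) for size in group_sizes)
    return 100 * math.sqrt(total) / element_count


print(cyclicity(3), cyclicity(10))

# Hypothetical container: 20 elements with one cycle group of 3 and one of 10.
print(round(relative_cyclicity([3, 10], 20), 1))
```

Because each group size is squared before summing, one large cycle group weighs far more than several small ones with the same total number of elements.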
33:46
So why is this metric useful, you might think? I can give you an example. Let's do a little thought experiment here; please follow me in your head. Let's assume we have a system with 100 source files. The first one is using the second one, the second one is using the third one, and so on, until we get to number 100, which is again using the first one. So in our head we should have one big cycle with 100 nodes. In that case relative cyclicity is calculated from the sum of cyclicity: we only have one cycle of 100 elements, so the cyclicity would be 10,000. The square root of 10,000 is 100, divided by 100 is one, and then we get 100% relative cyclicity. So that's the worst-case value: one big cycle comprising all elements, and we get the worst value of relative cyclicity.
34:35
Now let's assume we have the same system, but instead of one big cycle we have 50 small cycles of two elements each. If we do that exercise again, each small cycle has a cyclicity of four (2 times 2), and times 50 that is 200. The square root of 200 is about 14, so in that case our relative cyclicity would only be 14%. In both cases 100% of all source files are involved in some kind of cycle, but in the first case it was a much bigger cycle. Basically, this metric tells us how bad it is. Having lots of smaller cycles is always better than having one big cycle, and that can be measured very nicely with relative cyclicity. That's why this metric is useful.
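The two worked examples above can be replayed in a few lines. This is only a sketch of the arithmetic as described in the talk (sum of squared cycle group sizes, square root, divided by the number of elements); the function and variable names are made up for illustration and are not taken from any tool.

```python
import math

def relative_cyclicity(cycle_group_sizes, total_elements):
    """Relative cyclicity in percent, following the arithmetic in the talk.

    The cyclicity of one cycle group is its size squared; the values are
    summed, the square root is taken and divided by the element count.
    """
    total_cyclicity = sum(size * size for size in cycle_group_sizes)
    return 100.0 * math.sqrt(total_cyclicity) / total_elements

# One big cycle containing all 100 source files
one_big_cycle = relative_cyclicity([100], 100)

# Fifty cycles of two files each, same 100 source files
fifty_small_cycles = relative_cyclicity([2] * 50, 100)

print(one_big_cycle, round(fifty_small_cycles, 1))
```

In both inputs every file sits in a cycle, yet the second value is far lower, which is the point of the metric.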
35:26
Structural debt index is kind of the other side of the coin. There we try to answer the question: how difficult would it be to break up a cycle? So we run a graph algorithm over the cycle groups, and for each cycle group we compute two values. First, we find out how many links we have to cut to break the cycle group, and then for each link we look at the weight of the link. If you have a dependency between class A and class B, and class A is using class B in three different ways, then the weight of that link would be three. If that's the only link we had to cut, our structural debt index value would be 13, because we calculate structural debt index as 10 times the number of links to break plus the total weight of those links. So in our example, one link with a weight of three gives 10 plus 3, which is 13.
36:21
And of course we then add this up for the whole module or the whole system. For each cycle group we get a value, and we add up these values for different scopes. That's a value you might want to track to make sure that it's not growing, because if this value is growing all the time, it means your cycles are getting denser and denser and harder and harder to break up.
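As a small sketch of that formula: the hard part, finding which links to cut, is the graph algorithm mentioned above and is not shown here. The sketch simply assumes the weights of the links to cut are already known, and all names are hypothetical.

```python
def structural_debt_index(weights_of_links_to_cut):
    """SDI of one cycle group: 10 per link to cut plus the total link weight."""
    return 10 * len(weights_of_links_to_cut) + sum(weights_of_links_to_cut)

def total_structural_debt_index(cycle_groups):
    """SDI of a module or system: the per-group values added up."""
    return sum(structural_debt_index(group) for group in cycle_groups)

# The example from the talk: one link to cut, used in three different ways
single_link_sdi = structural_debt_index([3])

# Made-up second group with two links of weight 1 and 2, to show the summing
system_sdi = total_structural_debt_index([[3], [1, 2]])

print(single_link_sdi, system_sdi)
```

Tracking the summed value per build is one way to notice when cycles are getting harder to break up.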
36:50
Now, maintainability level is a way more complicated metric. We implemented it together with a customer. It is a metric we calculate per module, and for each module we wanted to get a value between 0 and 100%: 100% would be perfect, 0% would be very bad. The value should basically correlate with the gut feeling of developers about the quality of modules. We let different inputs go into this metric. It should be stable when there are no major changes to the architecture or dependency structure, and it measures decoupling and successful verticalization.
37:32
What is verticalization? Verticalization is basically domain-driven design: you first divide your system vertically by organizing it into different domains, and the dependencies between the domains should be minimized. So you have a successful verticalization if you can clearly see those domains in your code structure and they don't have too many dependencies between each other. And of course, reducing cycles and reducing coupling will improve the metric. It's one of several indicators of design quality. The recommended value is 75% or more. You definitely want to keep it above 50%; 75% is already pretty safe, so if you can keep it at 75 or higher, that's reasonable.
38:19
Now let's see how we calculate that metric. This is a simple dependency graph again, up there, with 12 compilation units. We first calculate the fan-in value for maintainability level, fan-in ML: that's the percentage of higher-level components influenced by a given component. If you look at this graph, you see we levelized it. Level one has no outgoing dependencies; level two has only incoming dependencies from higher levels and outgoing dependencies to lower levels, and so on. But before we can levelize the graph, we need to condense cycle groups into their own logical nodes. We have a little cycle group between F, G and H here, so we form one logical node called FGH out of that cycle group, and we get a cycle-free graph structure.
39:12
So let's calculate that fan-in, for example for node A. A is used by E directly and by I and J indirectly, which means three nodes are using us. Three out of the eight nodes in higher levels is three eighths, which gives a fan-in ML value of 37.5% for node A. For node E the value is 50%, because E is used by I and J, and that is two out of the four nodes in higher levels. For all the nodes on level three we have a value of zero, because they have no incoming dependencies: nobody is using them. So I, J, K and L are the files that we can change without affecting the rest of the system, and that is where we should put our complicated logic. Put the complicated logic in files that have as few incoming dependencies as possible, ideally none.
40:10
Now you could argue that if it doesn't have any incoming dependencies, it's not used by anybody, so it's useless. Not true. If your class implements an interface, you can call that interface without calling the class directly. The class itself might have no incoming dependency, but it's still used through the interface. The interface allows you a large degree of decoupling, and as long as you don't change the interface, you can change the code in I or J as much as you want without affecting anything in the rest of your system.
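To make the interface argument concrete, here is a minimal sketch. The talk is about Java classes and interfaces; this uses a Python abstract base class to play the role of the interface, and every name in it is invented for illustration.

```python
from abc import ABC, abstractmethod

class PriceCalculator(ABC):
    """The stable interface. Callers depend on this, not on an implementation."""

    @abstractmethod
    def price(self, quantity: int) -> float:
        ...

class TieredPriceCalculator(PriceCalculator):
    """The complicated logic lives here and can change freely
    as long as the interface stays the same."""

    def price(self, quantity: int) -> float:
        unit_price = 10.0 if quantity < 100 else 8.0
        return unit_price * quantity

def invoice_total(calculator: PriceCalculator, quantity: int) -> float:
    """A caller that only knows the interface."""
    return calculator.price(quantity)

# Only this wiring line names the concrete class.
small_order_total = invoice_total(TieredPriceCalculator(), 5)
print(small_order_total)
```

Apart from the single wiring line, nothing refers to `TieredPriceCalculator` by name, so its pricing rules can be rewritten without touching `invoice_total`.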
40:53
Now we have different calculations here. For example, the fan-in of this node FGH would be 75%, because it's used by J, K and L, which is 75% of the higher-level nodes.
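The fan-in numbers for A, E and FGH can be reproduced with a short sketch. The slide's graph is not in the transcript, so the edge list below is an assumption: only E using A, I and J using E, the F/G/H cycle, and J, K and L using FGH come from the talk; the remaining edges are invented to complete a 12-node graph. This is a simplified reading of the first calculation step, not the full metric from the blog article.

```python
# "X uses Y" edges. Partly hypothetical, see the note above.
USES = {
    "A": [], "B": [], "C": [], "D": [],
    "E": ["A"],
    "F": ["G", "B"], "G": ["H", "C"], "H": ["F", "D"],
    "I": ["E"], "J": ["E", "F"], "K": ["G"], "L": ["H"],
}

def reachable(graph, start):
    """All nodes reachable from start over one or more edges."""
    seen, stack = set(), list(graph[start])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node])
    return seen

def fan_in_ml(graph):
    """Percentage of higher-level nodes that depend on each node."""
    reach = {n: reachable(graph, n) for n in graph}
    # Cycle group of n: n plus every node that is mutually reachable with n.
    group = {n: {n} | {m for m in reach[n] if n in reach[m]} for n in graph}
    level = {}

    def level_of(n):
        # Level 1 has no outgoing dependencies outside its own cycle group.
        if n not in level:
            below = [level_of(m) for m in reach[n] if m not in group[n]]
            level[n] = 1 + max(below, default=0)
        return level[n]

    for n in graph:
        level_of(n)
    result = {}
    for n in graph:
        higher = [m for m in graph if level[m] > level[n]]
        users = [m for m in higher if n in reach[m]]
        result[n] = 100.0 * len(users) / len(higher) if higher else 0.0
    return result

fan_in = fan_in_ml(USES)
print(fan_in["A"], fan_in["E"], fan_in["F"], fan_in["I"])
```

F, G and H share one level and one value because they form a single cycle group, which is the effect of condensing them into the logical node FGH.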
41:12
Now there are a lot more details to this metric, because there are also alternative ways to calculate it. There's a blog article which contains all the details: just go to blog.hello2morrow.com and look for the keyword "promising". You will find the article about the metric, which contains all the formulas and everything in detail.
41:32
In reality, what we found out when we developed the metric was that this first step wasn't enough. We had an example from our customer Porsche Informatik in Salzburg, Austria, where they said: we have two modules here, and both of them score in the high 90s regarding maintainability. One module was clearly well structured; the other one was hated by most of the developers. So we tried to figure out what happened there, and it turned out that in the second module the structure between the compilation units was pretty nice and almost cycle-free, but the package structure was completely random and chaotic. The assignment of files to packages was more or less random; there was no system in there, which means it was very hard to find the code. That also created a lot of package cycles that were just caused by the fact that classes were in the wrong package, and that makes it hard to understand and maintain a code base.
42:22
So we also added an alternative calculation which is based on relative cyclicity on the package level, and ML ended up being the minimum of those two values. Once we did that, we got the desired result: the good module still scored in the 90s, while the bad module suddenly scored in the 40s and not in the 90s anymore.
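Combining the two calculations is just a minimum. The sketch below uses invented scores chosen only to match the "90s" and "40s" ranges mentioned above; they are not the customer's real numbers, and the parameter names are assumptions.

```python
def maintainability_level(component_based_score, package_cyclicity_score):
    """ML as the weaker of the two scores, both given in percent."""
    return min(component_based_score, package_cyclicity_score)

# Illustrative values only. Both modules look fine at the compilation
# unit level, but the second one has a chaotic package structure.
good_module_ml = maintainability_level(96.0, 94.0)
bad_module_ml = maintainability_level(97.0, 43.0)

print(good_module_ml, bad_module_ml)
```

Taking the minimum means a module cannot hide a messy package structure behind a clean file-level structure.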
42:40
So developing metrics can sometimes be tricky, but I would say that by now we have gotten to a good point: maintainability level is a pretty good metric for assessing the maintainability of code modules.
42:57
Now, how do you get those metrics in the first place? You need tools for that, and there's a free tool I already mentioned, Sonargraph-Explorer. It is completely free to use, also in a commercial setting. The only thing you need to do is register on hello2morrow.com and get your free license; then you can download it directly from the website and run it on your own code. If I have a little time, I'm going to show you how to use it. I'm not sure, let's see how far we get. A short usage demo? Yeah, let's do that quickly.
43:38
Unfortunately my computer is a little slow with the software here, but I hope it will recover. Okay, so this is Sonargraph here. Let's open a system, and I'm going to go to an open source system called Gradle. Many of you are probably familiar with Gradle.
44:29
It takes a little longer because the streaming software is using some part of my CPU. Now we have the data. Some analyzers are still running, and I should soon get to my dashboard here, which gives me some overall information.
44:57
So this is basically what you also get with Sonargraph-Explorer. You get a structure dashboard here which tells you what structural issues you have. It looks at entangled code and cycles, and obviously almost 78% of the code base is entangled in some way, shape or form. But the relative entanglement, which is based on relative cyclicity, is not as bad: the second red bar is much smaller than the first one, which means there's still hope regarding the overall structure.
45:25
You can also see, if I click down here, the metric propagation cost. It's at 4.5%, so that's a good value. Average component dependency is 396, which is basically 4.5% of the 8,800 Java files that are in this system here.
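The relation between the two dashboard numbers is a single division. A quick check of the arithmetic, using the figures read out in the demo (the function name is made up):

```python
def propagation_cost(average_component_dependency, component_count):
    """Propagation cost in percent: average component dependency
    relative to the total number of components."""
    return 100.0 * average_component_dependency / component_count

# Figures from the demo: ACD of 396 across 8,800 Java files
gradle_propagation_cost = propagation_cost(396, 8800)
print(gradle_propagation_cost)
```

Read this way, a change to one file touches about 4.5% of the system on average.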
45:50
So I can see propagation cost here, and I can see that maintainability level is near the border where I start to feel uncomfortable. What I would do next is go down into the metrics view and find out which are my problematic modules.
46:09
So I go here, click on module, and click on maintainability level, if I can find it. No, not propagation cost; maintainability level, that's the one. I'm going to sort in reverse order. So we have one module called dependency management which has a very low level of maintainability. Let's see how big that module is; I can add lines of code as a second metric. In this case we have 77,000 lines of code, so unfortunately it's also our biggest module. Maybe we can also add propagation cost, just for fun. Propagation cost is not too bad here in dependency management, but maintainability level is low. Let's see where that comes from.
46:50
low let's see where it comes from so the
46:50
So the next step, what we usually do, is look at the cycle groups. And if I look at the basic compilation unit cycle groups, I see... oh man, I didn't clean up from my last demo, I'm sorry. Give me a second, it's now recalculating everything.
47:20
And here we have the component cycles again, and now we see that dependency management has a pretty big Java file cycle group. This you can still get with Sonargraph Explorer, the free tool, and if I open that I can see which files are part of that cycle.
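A cycle group can be read as a set of files that all depend on each other, directly or indirectly, which corresponds to a strongly connected component with more than one element. The following is a minimal sketch under that assumption, using Tarjan's algorithm on a hypothetical list of file-to-file dependencies. It is not how Sonargraph computes its cycle groups, and the function name and file names are made up for illustration.

```python
from collections import defaultdict


def cycle_groups(dependencies):
    """Return cycle groups (strongly connected components with more than
    one element) of a dependency graph, largest group first.

    dependencies: iterable of (source_file, target_file) pairs.
    """
    graph = defaultdict(set)
    for source, target in dependencies:
        graph[source].add(target)
        graph[target]  # make sure every node has an entry

    index = {}      # discovery order of each node
    lowlink = {}    # smallest index reachable from the node
    on_stack = set()
    stack = []
    groups = []
    counter = 0

    def visit(node):
        # Recursive Tarjan step; fine for small graphs, a large code base
        # would need an iterative version to avoid the recursion limit.
        nonlocal counter
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for succ in graph[node]:
            if succ not in index:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index[succ])
        if lowlink[node] == index[node]:
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            if len(component) > 1:  # single files are not a cycle group
                groups.append(sorted(component))

    for node in list(graph):
        if node not in index:
            visit(node)
    return sorted(groups, key=len, reverse=True)


if __name__ == "__main__":
    deps = [
        ("A.java", "B.java"), ("B.java", "C.java"), ("C.java", "A.java"),
        ("C.java", "D.java"), ("D.java", "E.java"), ("E.java", "D.java"),
        ("F.java", "A.java"),
    ]
    for group in cycle_groups(deps):
        print(len(group), group)
```

The size of the first group returned is the "biggest cycle group" number that is worth tracking over time.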
47:40
Now, if you want to see the cycle itself visualized, you would need the commercial product for that. So if I can show that in a cycle view, it would visualize that whole thing for me in a way that you can see where the cycle is coming from. Different colors again represent different packages. Now let's go back to our slides. Let's hope I can go back to presentation mode without waiting five...
48:28
Now, a book recommendation if you want to learn more about those metrics. Shameless plug: I'm a co-author of this book, together with nine other people. So the book has 10 chapters, everybody wrote their own chapter, and all the metrics I described here, plus some extra, are described in this book here.
48:50
Now, how do we detect if a project turns into a big ball of mud? Collect metrics in your nightly build and track them; that would be the first step. And check the size of the biggest cycle group: it should be five or below, it becomes dangerous when it's over 30, and it becomes really bad when it's over 100.
49:14
Check your relative cyclicity: values should ideally be below 10%, and as soon as it goes over 20% it becomes dangerous.
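The thresholds mentioned here could be turned into a simple check, sketched below. The limits (5, 30, 100 elements; 10% and 20%) are the ones from the talk. The function names and the "watch" label for values between the stated limits are assumptions made for this illustration.

```python
def rate_biggest_cycle_group(size):
    """Classify the size of the biggest cycle group."""
    if size <= 5:
        return "ok"
    if size > 100:
        return "really bad"
    if size > 30:
        return "dangerous"
    return "watch"  # 6..30: not rated explicitly in the talk (assumption)


def rate_relative_cyclicity(percent):
    """Classify relative cyclicity, given as a percentage."""
    if percent < 10:
        return "ideal"
    if percent > 20:
        return "dangerous"
    return "watch"  # 10..20: not rated explicitly in the talk (assumption)


if __name__ == "__main__":
    print(rate_biggest_cycle_group(42))
    print(rate_relative_cyclicity(8.5))
```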
49:25
And the structural debt index: how difficult would it be to break up all the cycles you have? If that metric grows over time, all the time, then it points to a problem, which again tells me that you need a metric-based feedback loop. You need some mechanism to integrate those metric calculations into your nightly build.
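One way such a feedback loop might look, as a rough sketch: keep the metric value from each nightly build and warn when it has grown in every one of the most recent builds. The function name, the window of five builds and the sample numbers are assumptions for illustration; how the values are stored and reported depends on your build setup.

```python
def grows_all_the_time(history, window=5):
    """Return True if the metric increased in every step of the last
    `window` recorded builds (oldest value first)."""
    recent = history[-window:]
    if len(recent) < 2:
        return False  # not enough data to talk about a trend
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


if __name__ == "__main__":
    # Hypothetical structural debt index values from consecutive nightly builds.
    nightly_values = [310, 318, 325, 340, 352]
    if grows_all_the_time(nightly_values):
        print("Warning: the metric has grown in each of the last builds")
```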
49:41
That can be done completely free. Sonargraph Explorer also comes with a build integration; you can integrate that in your build to calculate metric reports. And then you can use, for example, Jenkins to display those metrics over time, or you can use our commercial product here, Sonargraph Enterprise, which is basically a metrics database where you can upload metrics on a daily basis and then see how the metrics are changing over time.
50:16
How do we stop the big ball of mud? By just following some golden rules. The first one is: define architectural boundaries. Ideally you use a domain-specific language, like the one that comes with Sonargraph; Sonargraph comes with a domain-specific language to define architectural boundaries. But you can also use tools like ArchUnit or similar, and enforce them in your CI build. Make sure that your build breaks if something is not kosher. Do not allow package or namespace cycles, and keep other cycles under six elements.
50:52
Avoid god classes. Those big classes, more than a thousand lines of code, are usually problematic because they also have their own gravitational pull: the bigger the class, the more dependencies it will attract.
51:06
And also limit local complexity. Don't write those god methods that go over hundreds of lines and are very complex to understand and read. The effect of the golden rules is that your modularity will always be preserved. Coupling is minimized, and that increases code readability and testability.
51:27
And you also end up with fewer potential vulnerabilities. If you do that, your system will be better off than 90% of systems with comparable size and complexity.
51:42
The most important thing to remember is that developers spend most of their time reading code. We know that in most development organizations developers spend between 80 and 90% of their time reading code and very little time actually writing code. If you want to make your developers more productive, make your code more readable, more understandable, less coupled, and then they have more time for writing code.
52:03
This is a little example of an ArchUnit test. ArchUnit is an open source system where you can basically define certain rules, for example the rule: no classes that reside in a package presentation should access classes that reside in the package persistence, and so on. And that will break your build if that rule is broken in some way.
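ArchUnit itself is a Java library, and the slide shown in the talk is not reproduced here. The sketch below only imitates the idea of such a rule in Python, on a hypothetical list of class-to-class dependencies: it reports every access from a class in a "presentation" package to a class in a "persistence" package. All function, class and package names are made up for illustration and are not part of ArchUnit.

```python
def resides_in(class_name, package):
    """True if `package` is one of the package segments of a fully
    qualified class name such as 'com.shop.presentation.OrderView'."""
    return package in class_name.split(".")[:-1]


def find_violations(dependencies, source_package, forbidden_package):
    """Return all (from_class, to_class) pairs that break the rule
    'no class in source_package may access a class in forbidden_package'."""
    return [
        (source, target)
        for source, target in dependencies
        if resides_in(source, source_package)
        and resides_in(target, forbidden_package)
    ]


if __name__ == "__main__":
    deps = [
        ("com.shop.presentation.OrderView", "com.shop.service.OrderService"),
        ("com.shop.presentation.OrderView", "com.shop.persistence.OrderDao"),
        ("com.shop.service.OrderService", "com.shop.persistence.OrderDao"),
    ]
    broken = find_violations(deps, "presentation", "persistence")
    for source, target in broken:
        print(f"Rule violated: {source} -> {target}")
    # A non-zero exit code is what makes a CI build fail.
    raise SystemExit(1 if broken else 0)
```

Returning a non-zero exit code when the list is not empty is the part that makes the build break.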
52:27
If you use the Sonargraph DSL, it looks like this. On the left side you see the domain-specific language that comes with Sonargraph. Basically, an artifact is a box in an architecture diagram, the include statement tells us what is in the box, and the connect statement tells me what it can talk to. And then here you also get a very nice description of your architecture in the form of a model that is basically a UML component diagram in text form, which is what you see there.
52:53
The main difference is that Sonargraph uses a model-based approach. Basically you create a model for your whole application, and models have the big advantage that as soon as you cover all your code with a model, everything that is not explicitly allowed is forbidden and will be found. So you can basically check for completeness of your model by making sure that your architecture covers all your code.
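The model-based idea can be sketched in a few lines of Python. This is not the Sonargraph DSL and not its implementation; it only imitates the three concepts from the talk with plain dictionaries: an artifact is a named box, the include part says which packages are in the box, and the connect part says which other boxes it may talk to. Everything not listed is forbidden, and classes that fall into no box show that the model is incomplete. All names are hypothetical.

```python
# Hypothetical model: artifact name -> included package prefixes.
ARTIFACTS = {
    "UI": ["com.shop.ui"],
    "Business": ["com.shop.business"],
    "Persistence": ["com.shop.persistence"],
}

# Hypothetical connections: artifact name -> artifacts it may use.
ALLOWED = {
    "UI": {"Business"},
    "Business": {"Persistence"},
    "Persistence": set(),
}


def artifact_of(class_name, artifacts):
    """Return the artifact a class belongs to, or None if uncovered."""
    for name, prefixes in artifacts.items():
        if any(class_name.startswith(prefix + ".") for prefix in prefixes):
            return name
    return None


def check_model(dependencies, artifacts, allowed):
    """Return (uncovered_classes, violations) for a dependency list."""
    uncovered = set()
    violations = []
    for source, target in dependencies:
        source_artifact = artifact_of(source, artifacts)
        target_artifact = artifact_of(target, artifacts)
        if source_artifact is None:
            uncovered.add(source)
        if target_artifact is None:
            uncovered.add(target)
        if source_artifact is None or target_artifact is None:
            continue
        # Whitelist semantics: anything not explicitly allowed is forbidden.
        if (source_artifact != target_artifact
                and target_artifact not in allowed.get(source_artifact, set())):
            violations.append((source, target))
    return sorted(uncovered), violations


if __name__ == "__main__":
    deps = [
        ("com.shop.ui.View", "com.shop.business.Service"),
        ("com.shop.ui.View", "com.shop.persistence.Dao"),
        ("com.shop.persistence.Dao", "com.shop.business.Service"),
        ("com.shop.business.Service", "com.shop.util.Strings"),
    ]
    uncovered, violations = check_model(deps, ARTIFACTS, ALLOWED)
    print("Not covered by the model:", uncovered)
    print("Forbidden dependencies:", violations)
```

Note that no rule had to be written for the dependency from persistence back to business: it is reported simply because nothing allows it.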
53:14
With ArchUnit that's a lot harder, because it's just a set of independent rules that can be checked, so checking for completeness is difficult. But on the other hand, by all means, using ArchUnit is much better than not using ArchUnit, because it allows you to define architectural boundaries. It's just not as comfortable and luxurious as what you can do with a DSL.
53:42
Okay, how do we improve an existing big ball of mud? First of all, it's a good idea to find out which parts of your code base are actively worked on. There's no need to resolve cyclic dependencies in a code base that hasn't been touched for the last two years, you know, because then it's stable and nothing needs to be done. But if you have code that is actively changed and very complex, then this is where you should focus your improvements.
54:10
And the first thing would be to find out if you can reduce package cycles; package cycles are worse than component cycles. And then, if you have those big compilation unit cycles, also try to make them smaller. Breaking them up into smaller cycles is already a good gain, and we have some tutorial videos on our website that explain this process.
54:31
I see I'm running out of time; we're almost at the end, so I have to go a little faster here. Having an architectural model is always good, and once you have an architectural model you can track your progress: just by looking at the metrics you can see if things are moving in the right direction or not.
54:55
There are some source code management metrics that are really interesting, basically metrics derived from a version control system. Change frequency: how many file changes did we have, how many commits, how many lines have been changed? What is the code churn rate, that is, the lines that have been changed in relation to the total number of lines in your code base? And how many different authors have worked on a piece of code?
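As a rough illustration of these version-control metrics, the sketch below computes change frequency, changed lines and number of authors per file, plus the code churn rate as changed lines divided by the total lines of the code base. The input format (one record per changed file per commit) and all names are assumptions; in practice the records would come from your version control history.

```python
from collections import defaultdict


def scm_metrics(changes, total_lines):
    """Compute simple version-control metrics.

    changes: iterable of (commit_id, author, file_path, lines_changed).
    total_lines: total number of lines in the code base.
    Returns (per_file_metrics, code_churn_rate).
    """
    if total_lines <= 0:
        raise ValueError("total_lines must be positive")

    commits = defaultdict(set)   # file -> commits that touched it
    authors = defaultdict(set)   # file -> authors who changed it
    lines = defaultdict(int)     # file -> lines changed in total

    for commit_id, author, path, changed in changes:
        commits[path].add(commit_id)
        authors[path].add(author)
        lines[path] += changed

    per_file = {
        path: {
            "change_frequency": len(commits[path]),
            "lines_changed": lines[path],
            "authors": len(authors[path]),
        }
        for path in commits
    }
    churn_rate = sum(lines.values()) / total_lines
    return per_file, churn_rate


if __name__ == "__main__":
    history = [
        ("c1", "ann", "A.java", 10),
        ("c2", "bob", "A.java", 30),
        ("c2", "bob", "B.java", 10),
    ]
    per_file, churn = scm_metrics(history, total_lines=1000)
    print(per_file)
    print(f"code churn rate: {churn:.1%}")
```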
55:19
And that can, in the end, lead to this hotspot map here. That's a three-dimensional visualization of a code base. Each of those little blocks is a building, and each building is a source file. The ground area of the building is proportional to the lines of code, and the color and height of the building are assigned some arbitrary metrics. In this case we have assigned complexity to the color, and height is the change frequency.
55:49
So what we're looking for would be red skyscrapers. This one is based on Apache Cassandra. We don't have a red skyscraper, but we have a pretty big dark red building. That's basically the core class of Apache Cassandra, which is changed frequently, is of medium complexity and is pretty big. So that is something we may be focusing refactorings on. Okay.