0:00
Good morning to all of you
0:09
I am Dr. Vishal Sharma, and I would like to welcome you all to this first session of
0:19
this virtual conference in support of UNICEF, a 24-hour Microsoft learning technologies
0:27
and fundraising event to fight COVID-19. Today we have with us our first speaker, Claudio Silva
0:38
He works for Siemens as a SQL Server Database Administrator, and he loves PowerShell
0:47
Today he is going to talk on "dbatools and Power BI walked into a bar"
0:54
And let me have the privilege of inviting Claudio Silva to give his presentation
1:01
Thank you, Vishal. Thank you. I'll just share your presentation and then you can start
1:09
Yes, please. I've shared your presentation. I think you can start now
1:24
So, OK. So let's start. First of all, a big thank you for having me and for being able to support this cause helping
1:39
children in these tough times. So today I decided to bring a session which will talk about how we can use the dbatools
1:52
PowerShell module to gather data from our SQL Server instances, then save
1:58
it in a SQL Server database, and, by using Power BI, give some meaning to our data
2:07
to answer a couple of questions that we may have. My name is Claudio Silva
2:12
I'm a SQL Server DBA working for Siemens, and I love PowerShell as you will see
2:18
Again, if you have the chance, even if it is just $1, please donate to this cause
2:26
It will help UNICEF help children. So if you haven't done it yet
2:31
just take a picture of the QR code, and please donate. Thank you to all the sponsors that helped this conference
2:44
be made and be available for everyone. We are talking about 24 hours of technology sharing
2:53
sharing all the knowledge that the speakers have with you, for free, OK
2:59
Going to our agenda: we will see what the pre-requirements and the pre-
3:07
configurations are that we will need to have this kind of solution running
3:13
We will see which questions we want to have answered and how we can do that
3:20
I will share the scripts and some caveats along the way, but everything is documented
3:26
and you can leverage it afterwards. We will see what my implementation approach was —
3:33
of course, this can vary in your environment, but at least it is a starting point and will give
3:38
you some ideas to work on — how we can save the data to SQL Server, and we will see the database
3:47
structure, the code, and the Power BI dashboards to answer these questions. And after that, we have
3:54
an FAQ to answer some of the questions, and if you have more, we can have a talk, of
4:00
course. OK, so first of all, regarding the pre-requirements. For this solution, we will need to
4:09
decide where the solution will be running. And I would say — and it's the way I have it configured —
4:16
for instance, a centralized server that can reach all your other servers, all your SQL Server
4:22
instances. We will need the dbatools module installed. So if you don't have internet access
4:31
to download the module from the PowerShell Gallery, you can connect to a server or to
4:36
a machine that has that connection. You can save the module using the Save-Module cmdlet
4:41
and then you can copy the module to your server which doesn't have access to the internet
4:46
In case you have internet access, it is as easy as using Install-Module with the name dbatools
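As a sketch, both install paths might look like this (the local path is a placeholder):

    # With internet access on the central server:
    Install-Module -Name dbatools

    # Without internet access: save the module on a connected machine...
    Save-Module -Name dbatools -Path 'C:\Temp\Modules'
    # ...then copy C:\Temp\Modules\dbatools into a folder listed in
    # $env:PSModulePath on the offline server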
4:52
For this particular solution, you will also need the PoshRSJob PowerShell module
5:02
This is a module written by Boe Prox that makes it possible to run our scripts in parallel, with multithreading
5:12
And that will be a plus whenever you want to work with a lot of instances, and not just
5:17
a couple of them. And the other part of the solution is having Power BI
5:23
I'm saying Power BI Desktop here because it is free, so anyone can download and install
5:29
it on their laptop or computer, as long as they have a connection to the SQL Server to
5:36
connect to. However, if you want, you can also install a Power BI Report Server. If your
5:45
organization already has it, you can publish your Power BI dashboards there, and everyone can
5:52
leverage the same dashboard in a centralized place to access this data
6:01
A quick overview of the kind of architecture that I'm using for this. So I
6:10
have a centralized server — a central server running SQL Server — where I have my PowerShell
6:16
scripts and the dbatools and PoshRSJob modules to run them
6:21
And then I have my environments across the company, which can be development
6:29
environments, QA, and prod, where each one of them has a couple of instances
6:36
When I run the scripts, I connect to each one of these instances, collect the data, save it on the
6:42
central server in a specific database, and then, using Power BI, I query the data from the central
6:49
server, and I have my dashboards to answer my questions. The pre-configurations: to make
6:59
sure that you can use all the good things that dbatools has — because dbatools, if you
7:05
don't know it, is an open-source PowerShell module being built by data
7:13
professionals for data professionals, and currently it has more than 500 commands, which
7:20
can leverage SQL Server, PowerShell remoting, WMI, the registry — it is very important that you do a test and check that, from your central server, you have everything configured to access
7:35
the instances. As an example, I work in a very restricted environment where, by default, when a new server is born — a SQL
7:43
Server is installed there — my central instance will not have access to that server by default, which means that I need to
7:52
request the firewall team to put the new IP as a destination, and my central server as the source
8:01
in the firewall rules, which will make it possible to connect. And this goes by having the SQL Server
8:10
instance port — which by default is 1433, but if you use a different port number, of course, you
8:15
should open that one — the UDP ports, the Internet Control Message Protocol (ICMP)
8:20
DCOM, and the WinRM ports for PowerShell remoting, because some commands will connect and
8:28
run things on the server side to be able to retrieve the data. And in this case
8:36
again, in my environment, just to give an example: if I do not use the fully qualified domain name
8:43
for the servers, along with the UseSSL and the IncludePortInSPN settings, I will not be able to
8:50
connect to the instances. So it's a very restricted system with a lot of security, but it's still possible
8:57
to use dbatools anyway. So it's not a problem, but you need the right configurations
9:03
If you have this kind of situation, or if you need some more configurable settings
9:11
regarding the PowerShell remoting coverage in dbatools, you can take a look at these blog
9:15
posts where I explain how I have added these last two — and, of course, if you need, we can add even more
9:24
So the next question: the firewall team says "I have everything already configured; can you
9:31
please test?" How will we test that we have access to, let's say, our 100 instances or more? Test-DbaConnection
9:39
So all the commands that we will see here that have "Dba" in the name are commands
9:45
that belong to the dbatools PowerShell module. Test-DbaConnection is a handy command that
9:51
makes it possible to connect to one instance, and whenever you have at least these three settings as
9:58
true — you can connect to the instance, you can ping the server, you can remotely
10:05
access the server — it should be OK for you to run the commands from that server to the remote servers
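A minimal sketch of such a connectivity test, with hypothetical instance names (output property names may vary slightly by dbatools version):

    # Test connectivity from the central server to each instance
    Test-DbaConnection -SqlInstance 'SQL01', 'SQL02\INST1' |
        Select-Object SqlInstance, ConnectSuccess, IsPingable, PSRemotingAccessible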
10:11
Also, I suggest that you select a credential — a
10:23
service account, something like that — that will be running the jobs, and you should test again. So the
10:29
best way to do it, if you have a specific service account and not your nominal account, is to open
10:36
a PowerShell session with "Run as", set that account, and then you can test. Of course
10:42
you will need an account that actually has access to the SQL Server instances, and not only one that
10:50
gets past the firewall rules. The next part is the questions that I want answered —
10:56
I will show just a couple of them here, to make the point
11:00
and to show what is possible; everyone can have different ones
11:05
and want to collect different data to answer their own questions. So I will say: OK, if I want some server data —
11:14
not the instance, but the server — let's say that I want to know how much
11:19
CPU and how much memory I have, I can use the Get-DbaComputerSystem
11:25
dbatools PowerShell command. If I want some SQL Server data — let's say the version, the edition —
11:31
I can use Get-DbaInstanceProperty. But what if I want table sizes, or the number of records that
11:37
a table has? We can use Get-DbaDbTable. If we need a command that accesses both the server itself
11:45
and the instance, we have, for instance, Get-DbaDiskSpace, which will gather the disk space
11:51
and will also tell you whether you have SQL Server files on those disks or not — that's why it accesses both the server and the instance
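Hedged examples of the collection commands just mentioned (server and database names are placeholders):

    Get-DbaComputerSystem -ComputerName SQL01        # CPU and memory of the host
    Get-DbaInstanceProperty -SqlInstance SQL01       # version, edition, and other properties
    Get-DbaDbTable -SqlInstance SQL01 -Database DB1  # table sizes and row counts
    Get-DbaDiskSpace -ComputerName SQL01             # disk space, plus a flag for SQL Server files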
11:58
If you do not find a command suitable to get the
12:06
data that you want, you can always visit our website, docs.dbatools.io, which has
12:14
a search — you can look there for a specific word or a specific command. And if you are
12:23
without internet access but you already have the module, you can use Find-DbaCommand
12:27
to search for a pattern, for a tag, for something that is related to what you are trying to get
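For example — a couple of hypothetical searches:

    Find-DbaCommand -Pattern 'disk'    # free-text search across command help
    Find-DbaCommand -Tag Backup        # search by tag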
12:36
Regarding the script development — what I have done, and what the caveats are that
12:41
were already overcome, but I will talk about them anyway. So the script pattern is nothing more than a dbatools
12:51
command, like the ones we saw before, running through some instances, collecting the data, and outputting PowerShell objects
12:59
to which we then add a collection-time column — because if we want to keep this data over time, and be able to show
13:07
it in Power BI in a way that lets us see the evolution of
13:11
things, we need a collection-time column where we add the current
13:16
date/time of the execution. And finally, we pipe these results to Write-
13:22
DbaDataTable, which will save the data to a SQL Server table that we specify
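A minimal, single-instance sketch of that pattern (server, database, and table names are assumptions):

    Get-DbaComputerSystem -ComputerName SQL01 |
        Add-Member -NotePropertyName CollectionTime -NotePropertyValue (Get-Date) -PassThru |
        Write-DbaDataTable -SqlInstance CentralSrv -Database dbatools `
            -Table 'dbo.ComputerSystem' -AutoCreateTable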
13:27
The caveat — or one of the caveats that we have here — is that some commands — in
13:32
this specific case, dbatools has Get-DbaInstanceProperty — these commands do
13:37
not output the results as a table with a lot of columns
13:44
but instead as, more or less, key/value pairs. So what I have written is a function called Invoke-TransposeDataTable
13:52
that will do the rotation of this table and make every single key a column
13:58
and every single value the value of that column. So it's something like this
14:03
These columns currently have this pattern. So we call Get-DbaInstanceProperty
14:11
and the output comes in a vertical way, where we have
14:15
all the names of the properties and their values. But after running the function that does the rotation of the table
14:22
we will have something like this: all the key names became column names
14:31
and all the values are now in the same row, and that is much easier to save. So we can
14:39
save it to the table, and then we can keep tracking the evolution — whether we have more
14:43
or fewer CPUs, and so on
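A minimal sketch of the transpose idea — the speaker's actual function differs; this hypothetical helper just folds the Name/Value pairs into one wide object:

    function ConvertTo-WideObject {
        param ([Parameter(ValueFromPipeline)] $InputObject)
        begin   { $row = [ordered]@{} }
        process { $row[$InputObject.Name] = $InputObject.Value }  # each key becomes a column
        end     { [pscustomobject]$row }                          # one row with all the values
    }

    Get-DbaInstanceProperty -SqlInstance SQL01 | ConvertTo-WideObject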
14:51
The implementation — so I'm answering the question of how this always runs. OK, we have the scripts, we have everything set up to run, but how should we put it running on a daily or weekly basis, to collect the data without the need to go there and run the scripts ourselves? For this, I have decided to go with
15:06
SQL Server Agent jobs on my central server. I suggest that you request a dedicated domain
15:14
account, or a service account, with the permissions on the instances. It should be the same account
15:19
that you decided on before, to test the dbatools connection. And you can put it as the owner of
15:26
the job, because that way it will run the job as that account; or you can create a SQL credential
15:34
and, using a proxy, you can use it to run the job steps
15:41
The suggestion is to have one job with multiple steps; each step will invoke one of these scripts
15:47
And of course, you should find the best schedule to run it
15:53
If you need help setting up the Agent job to run PowerShell scripts, please go to
15:58
dbatools.io/agent, and there you will find the instructions on how to do it
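A hedged sketch of creating such a job with dbatools (the job name, step name, and script path are assumptions):

    New-DbaAgentJob -SqlInstance CentralSrv -Job 'dbatools - CollectInfo'
    New-DbaAgentJobStep -SqlInstance CentralSrv -Job 'dbatools - CollectInfo' `
        -StepName 'ComputerSystem' -Subsystem CmdExec `
        -Command 'powershell.exe -NoProfile -File "D:\Scripts\ComputerSystem.ps1"'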
16:05
As a suggestion: if you have a lot of instances and you are running them, let's
16:16
say, not in parallel — or if you have even more scripts that you want to run
16:21
and, by running one script after the other, even if they run for multiple instances at a time
16:27
it starts to take a long time — you can split your job and create
16:32
different jobs with different steps. Probably most of the steps can even run in parallel
16:37
So if you have two jobs, each one with two or three scripts to run
16:42
it will be OK for you, and the whole run will finish much faster. Regarding the data that is saved to SQL Server
16:54
as I said before, we are using Write-DbaDataTable. And why? First, we have a
17:01
parameter called -AutoCreateTable, which will generate the table. It will try to infer
17:07
the data types of the objects and create that table structure on the SQL Server
17:14
However — and it's important to call this out — whenever we have strings
17:22
you will mostly get nvarchar(max), which I should call out and say: do not forget to change them
17:29
So if you pick a new table, or a new script to run, and you create the table
17:35
you can leverage this to create the table — but then, please, script out the table
17:40
alter it, and save it in your source control, to have better data types and data sizes instead of
17:46
nvarchar(max). Also, as a suggestion, create a clustered index and use compression. It's available
17:53
in any edition since SQL Server 2016 Service Pack 1, so even if you are running Standard
18:00
Edition or something like that, you can use it. And you will save a lot of space, because if you run
18:06
this every day or every week, most of the data will be more or less the same, which means that you can leverage a lot from compression
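A hedged sketch of that suggestion, run from PowerShell (the index, table, and column names are assumptions):

    $sql = 'CREATE CLUSTERED INDEX CIX_ComputerSystem_CollectionTime
                ON dbo.ComputerSystem (CollectionTime)
                WITH (DATA_COMPRESSION = PAGE);'
    Invoke-DbaQuery -SqlInstance CentralSrv -Database dbatools -Query $sql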
18:11
OK, so now let's see some code and some visuals
18:22
regarding this solution, and for that I will switch my presentation to Azure Data Studio. So I'm using Azure Data Studio with a notebook
18:40
to make it easier to follow along with these steps. And first of all, I want to show you here
18:49
the table structure and what we have. So currently — let's pick one — we have one table
18:57
called "instance", which is the table that will have all the instance names that you have in your
19:04
environment, which port they are running on, and which environment they belong to, because this will be
19:11
helpful later in Power BI, to filter and to search for some information. And, as I said
19:18
before, here I have created the table with more proper data types and data sizes, and also I have
19:26
added, in this case, a constraint, because I want my SQL Server instance name
19:31
to be the default one by default. Then, if we go to a different table — all the other tables are related to
19:38
the data that we are getting using dbatools. Here, as you can see, I'm creating a table with all
19:47
the data that is output by the dbatools command, and I'm also creating a clustered
19:52
index, OK, using compression — this way I can live with more data over time. And the same applies
20:02
to every single table that we have here. So, if you want to start with these ones
20:07
you can grab these scripts from my GitHub repository; you can just compile and run them there, and you will
20:15
be OK to start. If you use different dbatools commands to get
20:22
different data, you will need to create the table the first time, fill in the
20:27
right data types, compress it with a clustered index, and then you are good to go to
20:34
get the data. What I have here is a view — a view that is building my
20:40
fully qualified domain name, so my host name with my domain — and also, if I have a
20:45
non-default SQL Server instance, it will get that instance name from here, and
20:50
also the port. So this way I can have my FQDN for the instances and use that to connect to
20:59
them. Finally, the one Agent job that will collect the dbatools data. So here, what
21:07
happens: we have just one job, and on each step we call a PowerShell command with the
21:15
script that we have. And this script — I will show it to you in a minute — runs with "Run
21:21
as", so with higher privileges, and will collect the data and save the data in the
21:29
tables. So before showing you the scripts, I want to show you here — I'm connecting to my
21:36
local — my dbatools... it was a table, sorry, database — and here I'm just trying to show that, for all
21:44
the tables I have here, you will not find the nvarchar(-1), which becomes nvarchar(max)
21:49
So the table structure is there, and, I'd say, with much more correct data types
21:57
Here, just to show what the instance table would look like in the end. So if you have
22:04
a lot of different host names, you can put them here, you configure them, and these are the ones
22:10
that we will use to start exploring the data. The job itself — as I said before, I showed the scripts, but here you can see I already have
22:23
a "dbatools CollectInfo" job, with one step for each script, which calls the command line. And then here, if I can grab this
22:37
you will see that we are running, from a specific folder, a specific file to collect this data
22:44
Now, for testing purposes, with the empty database — so, let's say, what you get if you want to install and run
22:54
this kind of solution. I will change my kernel here to PowerShell, because now I will run PowerShell commands
23:01
What I will do is pick a backup file, which is a template database without any data in it
23:08
but already with the tables. I will use dbatools, with Restore-DbaDatabase, to restore
23:16
my template database, replacing the existing one. The first time, it will take a little bit longer because it's loading the module
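A hedged sketch of that restore step (the path and names are assumptions):

    Restore-DbaDatabase -SqlInstance localhost -Path 'C:\Backups\dbatools2_template.bak' `
        -DatabaseName dbatools2 -WithReplace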
23:28
Then, what we will check is that the database is there
23:36
Just one more second. OK, so it's done
23:47
So, as you can see here: my computer name; my backup file, where it was;
23:51
the backup size; the database name that I have created, which is dbatools2; how much time it took
23:57
to restore — six seconds; the files where it restored. So everything is here. I have done it with
24:04
-WithReplace, which means that right now, if I go here to my dbatools2 database and I want to select the data
24:13
from my instance table, it is empty. OK, so now, with this in place: again
24:21
Get-DbaDatabase — it's one of the commands that we will use to collect data from the instance
24:26
but here, just to show that the database exists, and some data about it
24:31
And now I will use Invoke-DbaQuery, just to insert one entry into my dbo.Instance table
24:39
which will be my localhost, and I will output what is there
24:44
So, just to show: we have added one entry. So if I come here and select, now I have one entry in my dbatools2 database
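A hedged sketch of that seeding step (the table and column names are assumptions):

    Invoke-DbaQuery -SqlInstance localhost -Database dbatools2 `
        -Query "INSERT INTO dbo.Instance (HostName, Port, Environment) VALUES ('localhost', 1433, 'DEV')"
    Invoke-DbaQuery -SqlInstance localhost -Database dbatools2 -Query 'SELECT * FROM dbo.Instance'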
24:56
And now we will put the job running, using dbatools. So we have Get-DbaAgentJob: we can connect to our local instance and we
25:10
will get our job. And then, because this outputs a Job object from SQL Server
25:17
Agent, we can use the Start method to initiate the job, and we can do this because
25:24
it will run asynchronously. So now, if I use Get-DbaAgentJob and check for my job, we will see that
25:35
the current status is "running". So it will be running the five or six scripts that I have there
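A hedged sketch of starting the job and polling its status (the job name is an assumption):

    $job = Get-DbaAgentJob -SqlInstance localhost -Job 'dbatools - CollectInfo'
    $job.Start()   # SMO method; the job runs asynchronously

    Get-DbaAgentJob -SqlInstance localhost -Job 'dbatools - CollectInfo' |
        Select-Object Name, CurrentRunStatus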
25:44
So here, as you can see, it is executing, OK. And of course, if we run this right now, it doesn't have any data yet, because this table is not
25:55
yet filled, but in a couple of minutes we will have it here
26:00
While we wait for this, let's take a look at the scripts — at how the data collection
26:11
script is built and how the data is being collected. So let's pick the computer
26:18
system one, for instance, and let me briefly explain the code here. So: we are importing the module
26:24
we are setting some configurations — in this case, dbatools configurations. I'm saying
26:31
that whenever you run dbatools commands and they need to use PowerShell remoting, please, for this
26:37
session, UseSSL should be true, and my IncludePortInSPN should also be true
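A hedged sketch of that session-level configuration — the exact configuration names may vary by dbatools version (check Get-DbatoolsConfig -Name *remoting*):

    Import-Module dbatools
    Set-DbatoolsConfig -FullName 'psremoting.pssession.usessl' -Value $true
    Set-DbatoolsConfig -FullName 'psremoting.pssession.includeportinspn' -Value $true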
26:44
Here, I'm saying: I need to define what my central server is, and what my central
26:51
database will be, where I will save the data. And here I decide my throttling — the number of sessions
26:57
that I want to have running in parallel for each one of these scripts. And here my
27:03
suggestion is that you should test first and take a look, because depending on the horsepower
27:09
that you have on your server — the number of CPUs and so on — this number may change. What
27:15
is important to understand is that if you have, let's say, 100 instances to run, if you
27:20
put the throttle at 10, you may end with less than half of the time compared to needing to
27:29
run them one by one in a serial way. To run this, we first get our server list. For our server list,
27:38
as I said before, I'm using my view to get my FQDNs. Here I can, or not, filter by a domain —
27:46
let's say, in my case, I have multiple domains. So, if I want, I will have
27:51
different service accounts for different domains, and by using this environment variable, the script will know that I'm running for the environment —
28:00
sorry, for domain A or domain B — and then I can get from the database just
28:07
those instances. Picking the server list: Invoke-DbaQuery will invoke
28:14
this query, getting just the FQDN, and will create a list of servers that will be used later
28:21
Then, what we have here is related to the PoshRSJob PowerShell module that I
28:28
mentioned before, which we will be using to have multiple threads. So, to run Start-RSJob, we
28:35
should have a script block. In this case, we have a couple of commands to run:
28:40
as said before, we have a dbatools command — in this case, Get-DbaComputerSystem
28:46
Then we add just one member property with the current date/time
28:52
And finally, we pipe this to Write-DbaDataTable, where I say:
28:56
please insert the data on this instance, in this specific table
29:02
And if I do not have the table created, it will auto-create it
29:07
The other thing is that you have a batch size: whenever you are running something that will output
29:14
multiple rows — let's say, the table information for one database — you can change the batch
29:20
size here, to try to fine-tune your throughput to the database. This script block takes
29:27
two parameters — the central server and the central database — which will be passed here in
29:31
the argument list. So, with this, I can say: OK, for each server on
29:37
my server list, please pipe the values to Start-RSJob
29:41
Use this script block; the throttle will be 10, in this case;
29:45
use these arguments to pass the parameters inside; and then wait for the jobs to finish
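A minimal end-to-end sketch of this parallel pattern, under the assumptions above (the server, view, and table names are placeholders):

    Import-Module dbatools, PoshRSJob

    $centralServer   = 'CentralSrv'
    $centralDatabase = 'dbatools'

    # Server list from the FQDN view on the central server
    $serverList = (Invoke-DbaQuery -SqlInstance $centralServer -Database $centralDatabase `
                       -Query 'SELECT fqdn FROM dbo.vwInstance').fqdn

    $scriptBlock = {
        param ($CentralServer, $CentralDatabase)
        # $_ is the current instance piped in by Start-RSJob
        Get-DbaComputerSystem -ComputerName $_ |
            Add-Member -NotePropertyName CollectionTime -NotePropertyValue (Get-Date) -PassThru |
            Write-DbaDataTable -SqlInstance $CentralServer -Database $CentralDatabase `
                -Table 'dbo.ComputerSystem' -AutoCreateTable
    }

    $serverList |
        Start-RSJob -ScriptBlock $scriptBlock -Throttle 10 `
            -ArgumentList $centralServer, $centralDatabase -ModulesToImport dbatools |
        Wait-RSJob | Receive-RSJob

    # Clean up the runspace jobs
    Get-RSJob | Remove-RSJob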
29:51
Behind the hood, what will happen is that, from, let's say, 100 instances, we will start to have 10
29:59
parallel threads running. When one of these finishes, it will immediately start another one
30:06
with a different instance, and so on — which means that we will start with 10, we will keep at 10
30:11
until we get over 90, and then the run will finish. Once that's finished, we receive the
30:19
jobs. This is something just to clear out — to get what was outputted here. In this case, it
30:25
will not output anything, because we are just writing to the database. And finally, we can
30:31
clean up: get the jobs that were created and remove them, just to clear our cache. And this
30:38
is pretty much the same for every single command that we have. What can change is something like this:
30:45
Get-DbaDatabase — if we get all the properties, this returns even more
30:52
properties than these ones — so, what I have done here is pick the ones that I decided
30:58
were important to me, to keep tracking over time, and I'm just selecting those
31:03
adding again my collection date/time, and saving the data to a specific table in the database
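A short sketch of that variation, as it might appear inside the same script block (the property list here is an assumption):

    Get-DbaDatabase -SqlInstance $_ |
        Select-Object ComputerName, SqlInstance, Name, Status, RecoveryModel,
                      CompatibilityLevel, SizeMB, LastFullBackup |
        Add-Member -NotePropertyName CollectionTime -NotePropertyValue (Get-Date) -PassThru |
        Write-DbaDataTable -SqlInstance $CentralServer -Database $CentralDatabase `
            -Table 'dbo.Database' -AutoCreateTable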
31:09
The same thing — to show you the caveat that I mentioned before, related to the Get-DbaInstanceProperty
31:19
command. So here you can find the function that I have adapted — I found it on the internet,
31:26
someone wrote it, and I adapted it — which will do the rotation of the table. So, in this case, because this
31:34
is a very quick command that will output just, let's say, one row at the end, what we are doing
31:41
here is slightly different from the others. We are saying: for each one of the servers in the server list,
31:46
please get the data, do the transpose, and add it to a final data table; and then, at the end
31:53
we pick up our new data table, and that is the one that we write to the database — and it's there,
32:00
OK. Now I will show an example of a dashboard with data in it. Let me just double-check
32:10
here if we already... So, as you can see, we already have some data in our database tables — these are my
32:18
local databases on my localhost instance — and if we go here, we can double-check whether the job
32:24
is still executing or whether it's already finished. I will open my Power BI dashboard
32:39
So I will show first the one I have with more data, to give you the feeling of what we
32:46
will have after a couple of days. And then I will show you the empty one — what happens when we refresh it to get the data
33:45
OK, so... do not look at my great visual dashboard skills
34:00
So the idea here is just to show you that, if you have the correct tables and the correct
34:09
relationships and so on, you can drag and drop a couple of things and just get some numbers
34:15
And here, as an example, I'm showing all my environments, and when I click here
34:24
we can see all of this filtering out. So it's quite easy to understand how many databases, or how many servers, and so on
34:32
we have per environment. The other example that we have here is
34:38
related to the question: am I running on a virtual server, or am I running on bare metal?
34:45
Because of the data that we have from the server properties, we can know that, and
34:51
here, in this case, all of my instances are running virtual, for this example
34:57
But again, we can go here, and from the list of servers that I have here, I can
35:01
just get some data for a specific instance or a set of instances
35:10
Regarding disk space, we can have an idea, for an instance — and even for a specific mount point or
35:21
a specific drive — of how it looks over time. So let's say that I want to see my data drive, where the SQL
35:29
Server MDF files are: I can see that I have two terabytes, and almost one
35:34
terabyte in use, OK. So this is the other thing that we can use, and of course, changing here to a different
35:42
instance will bring different values — and that's the idea: to try to understand where we are
35:48
and where we are going. Regarding the instances: here you can see which version or which edition we are running
35:58
to have that idea. If you have a lot of instances with those
36:04
values or not, you can use the filters here. And here, to show
36:09
one thing that I have filtered here — but you haven't seen this one filter; I'm still showing the 8 instances. The thing
36:16
here, and this was on purpose, is to show you that we have
36:21
a couple of relationships created here between our views and our tables, and of course, we need to have these correct in order to have the data being filtered all of the
36:33
time. So now, if I tick the filter, I will get the SQL Server 2016 ones, and if I take it off, it will
36:42
show everything again. Regarding databases: here you can answer questions
36:52
like, are my tables growing a lot, or is my database growing a lot, or not? And I will
37:01
pick this example here: I will pick one database with "archiving" in the name, and this is, as you may be thinking, let's say a housekeeping database that is getting a lot of data from
37:15
the other databases over time, which means that it's almost always
37:21
increasing over time in the space occupied. And, with Power BI
37:27
we can, for instance, try to put here a visual
37:33
that will show us, for the database, a forecast, and say: OK, so if right now my database has
37:44
X amount of space in use — and we need to put here our collection date/time; and now, if I say:
37:53
please use here, for our selection, our data space usage in gigabytes
38:02
So, as you can see, from the 10th of September to the end of the month —
38:10
so about 15 to 20 days — it increased about 30 gigabytes. So, what can happen in 30 days if this rate of increase in space continues?
38:24
You can use Power BI for that. You can use the forecast here, and you can say: please tell me what we can
38:35
expect in 30 days. And by doing this, we can say: OK, in 30 days, based on these data points
38:44
that we have from before, it is expected that your database will reach between 132
38:49
gigabytes and 139. So this can be quite useful if you want to have a
38:56
prediction of how much the data will grow, and when you will probably need to think
39:03
about more disk space, before hitting the threshold of your disks, let's say
39:09
And here, regarding tables: one of the things that I usually use here, to have
39:21
an idea, is when the clients ask: OK, I know that my database is growing a lot
39:32
I know that it has a lot of rows there, but what is the pattern?
39:36
Is the number of rows increasing on the weekdays, and on the weekend is it
39:43
much lower — or otherwise? Can we see a pattern? And here you can have that answer
39:49
So, this has a custom SQL query written, pointing to our
39:58
database table — but a different one — that will get the day before and the current
40:05
day and do the math to understand: OK, so, between, let's say,
40:13
Friday and Saturday, my table grew five hundred and seven thousand rows on this day; and
40:24
between Saturday and Sunday — because these are the two weekend days —
40:32
it grew another three hundred something; and then we have 1 million over the week. And again, we can see and compare how this growth goes along the way with the evolution of the whole table. OK, so it's a nice way to understand this: when
40:52
sometimes, all of a sudden, a table grows a few hundred gigabytes and you are not
40:59
expecting that, you can see on which day that happened, and maybe that
41:03
will help you understand whether it was something related to a new
41:08
deployment, or an upgrade, or something like that. Now I will show you the empty dashboards
41:17
and what you will need to do whenever you want to get this solution, as a
41:25
starting point, after running the job for your first time. And here I will use my other Power BI file, which points to the dbatools2
41:40
database, which is the one that we have used to collect the data
41:52
So right now, if we come here and — well, it's opening — if we come here and take
42:00
a look at our databases, we can see here all the databases and all the sizes, the compatibility
42:15
level, the last full backup, and so on. And whenever you open the Power BI file for the first time, you will get something with a lot of blanks
42:24
because it was saved pointing to a database without any data. Here, the only thing that you need to do —
42:34
because you may not want to name your database "dbatools" or "dbatools2" —
42:39
is you can go here, and we have parameters. If you just change your instance name here, and your database name here
42:48
you are good to go, and you can just hit the Refresh button
42:51
And this will connect to the data model, run the queries on the database
42:57
bring the data here, and refresh our dashboard with the data that we were collecting
43:05
And of course, after a couple of days, weeks, months running this, getting data every day
43:11
you will have a very good idea about the evolution of things
43:16
So let's see here. I will see just one computer — my domain, development environment
43:23
which was the one that I selected. This is my computer name, and here I have a Lenovo
43:29
OK, with the number of CPUs, the number of processors, and so on. And again, disk space: you can
43:35
have a look at the C drive. Regarding the instance, we can see what the version is
43:44
and the current version of the databases. And here, for all the databases that I have on my instance
43:54
I can take a look and see how much each is occupying. In this case, I just have data from today
44:00
but of course, if you have more data over time, you will see the evolution in the sizes, the forecasts, and so on
44:09
OK, so let's go back to the presentation