Semantic Link Power BI Dataset and Data Science in Microsoft Fabric
19K views
Oct 30, 2023
This video shows how to propagate semantic information from Power BI dataset to unlock new capabilities in Microsoft Fabric Data Science projects.
View Video Transcript
0:00
Hello everyone, welcome to this Microsoft Fabric video
0:08
On the 5th of October 2023, the Fabric team announced the arrival of Semantic Link, which
0:15
enable whether you are data ysts, data engineer or data scientists to propagate semantic
0:22
information from the Power BI platform to access new capabilities of the Microsoft
0:27
fabric to perform things like data augmentation, validation, exploration and extendable set of
0:34
thematic functions. So in this video, I'm just going to get to started on how to use the
0:40
semantic link in the Microsoft Fabric notebook objects. So without much talking, let's get started. Now
0:49
I'm going to swipe to my Power BI desktop and show the datasets as well as they already
0:54
established explicit dax measures. So I'm going to come to the power of the desktop and of course
1:00
we can see all the dimension tables as well as the fact table. Now under the fact table, I have so
1:06
many explicit dachs measures. Now I'm going to come to the model view so we can see the many to one
1:12
relationship across the dimension tables and the fact table. So this is basically a star schema
1:20
okay? So let's come back to the cloud. Now the first thing I
1:24
is to create what's called a workspace. So I actually created this semantic link in a workspace
1:31
Let me just click on that. After creating the workspace, I proceeded to create this semantic data leak house and then I used the upload button to upload the file
1:43
onto the workspace. And of course we can see the file here. It is called Sales Report for Microsoft Fabric
1:51
So this is the report. And of course, this is the under- data set which you can use in the notebook and of course we have the dashboard which is
1:59
automatically created which is quite amazing and then i just go ahead and click on the notebook and i
2:05
give meaningful name to the notebook called data science project and we can even see the
2:11
dataset here anyway so these are all the tables we have the dim tables as well as the you know
2:18
fact table so now this is the notebook now there are so many components of this
2:24
book now we can see the different kind of ribbons the home tab the edits run data and view now we have
2:31
different kind of command like you want to save if you want to download you want to apply in a different
2:36
kind of settings and you want to run all this and of course we can see the language which is
2:40
pi spark which is basically the property code item for developing apaches spark jobs and
2:45
machine learning modeling in Microsoft fabric notebook so the first thing i did is to install the semantic
2:53
link package and dependencies. Now to do that we have to use the percentage PIP installed
2:59
sematic link. Now that basically installed a Python package called semantic link with the necessary
3:05
dependencies. So let's go ahead and run this cell. Click on this to run the cell and there we go
3:10
We can see that the command was executed in two seconds for six seven milliseconds and of course you can see the name of the person who executed the command myself and of course you can see the time and the date So the package is already installed and of course the necessary dependencies are also installed
3:31
The next thing I did is to use this input, sempi, that fabric as in a fact
3:37
Now the name of the model is fabric and it is located in the Sempai package
3:42
So I'm just going to click on this to execute or run the cell. All right, so you can see it's working pretty fine
3:52
The next thing I did is to passing the name of the PowerBer file inside
3:56
global code, which is in this case sales report for Microsoft Fabric
4:01
Now it's very important the name correspond to exactly what we have here
4:05
which is sales report for Microsoft Fabric, okay? And then I defined another variable called
4:12
df underscore d underscore product and then we use the fab. dot read underscore table to read the dataset and then want to access the theme product
4:23
column so this has to be inside double code also and then we just go ahead and print
4:28
or see what this variable is delivering so let's go ahead and run the cell
4:36
amazing so you can see the command executed in 10 seconds and 67
4:42
milliseconds so we can see the name of the column product and we can see the product ID
4:48
column now let's scroll down I did the same thing here just passing the name of the file
4:53
and then another variable now in this code print or let's want to see the head the 14 rows
4:59
so go ahead and click on the cell to run and then we can see the list of the account managers name
5:06
we can see the index as well as the corresponding account managers ID so let's scroll down now in this case
5:12
we actually want to do the same thing to access or see all the columns within the fact table
5:18
so go ahead and click on execute or run and here we go so we can see all the columns within the
5:27
fact table which is quite amazing and of course we can see that we have 1,200 rows across
5:35
the three columns now we can even generate sample 10 sample record from the fact
5:42
table that we just executed at the top so just go ahead and click on this and then that's going
5:47
to give us a randomized records 10 record from the df underscore facts table
5:54
just like you do in the normal python and then you can even perform the described
6:00
can see different kind of things like the main accounts and so on and so forth the standard
6:06
deviation so and you can even check whether there are any no values within the dataset so let's go ahead and
6:12
check so we can see that everything is looking purely nice we have zero auto simply means
6:19
everything is well filled okay so let's go ahead now let's want to count number of empty record
6:26
if there is any of course this is actually delivering zero so we expect these to deliver an
6:32
empty record okay so you can see number of empty record is zero so basically i just use another variable here and then i specify the df underscore fact table dot is no and then we use the sum and across
6:48
so that's going to give us zero and maybe you just want to check using the c bone library to check
6:55
whether there is any you know no value you can just go ahead with this snsdd.it map and then we're
7:01
passing this bullion of operation and then we specify the white tick labels as false the coloring bar we
7:09
specify that to be as false and the coloring map as viridase so let's go ahead and
7:14
execute or run the cell so we can see it's going to deliver a purely clean
7:20
slate so you can see everything is looking good and let's scroll down now there's
7:25
some other thing we can do let's see the shape of course we can see we know that we have 1,200 rows and 233
7:33
columns now another thing is that we can actually list or see all the relationship that
7:39
exists inside our power bi data okay which is quite amazing so i just define another in
7:46
variable and then we use the class which is fab. list underscore relationship and they just
7:52
passing the dataset we defined earlier so let's run this and see what it's going to be
7:59
delivery you can see the multiplicity is many to one relationship so you can see it's
8:05
span from the front table from the fact table and then from column account manager to the
8:12
d account manager's table and to the account manager's column so you can see we have purely a one
8:20
for many to one relationship across so let's scroll down see other things and aside from the
8:27
let's see we want to visualize we want to see the diagram of the relationship that exists so we can use the class from from
8:36
same pilot relationship we import plots underscore relationship underscore metadata and then we just pass in the relationship that we defined that we created at
8:49
top here okay just pass it in as the compulsory argument and then let's go
8:55
ahead and run the cell and I mean this is quite amazing
8:59
I love this so you can see the diagram of the relationship that exists
9:04
So we have many to one relationship. And of course we can see the fact table and we can see the dimension table
9:12
Let's take this for instance. I can see this is the dim date and of course this is the other date in the fact table
9:19
So we have many to one relationship which is basically a star schema
9:24
Okay, in data modeling. And of course we can even list all the measures
9:29
that are already established inside the RBI file so we can just use the Fab
9:36
List measures and then when I execute these I have so many measures in the file so we
9:44
can see we have the name of the table where the measures were created which is
9:49
credited under the fact table and of course we can see the names of the measures and then we can see the measure expression and then we can see the measure type data type so let see for the first one that we created we have the total sales amount and we can see that
10:05
we use the sum x to iterate through the fact table and then we aggregate the sales amount column
10:12
in the fact table so that's the way the old thing works and it is quite amazing now not only that
10:20
Now, we also have the Fab underscore Evalute underscore Dax. Now in this case, we're just passing the dataset and then inside the double code I use Evalid FACT table
10:33
Now I'm going to run this and show you something. Now that's going to display just like what we have at the top
10:40
Okay, there we go. Now you can see this actually, you know, returning all the columns
10:45
within the fact table. However, in this case, it's actually showing the name
10:50
of the fact table itself the name of the table which is fact table now the difference is that
10:57
the one we did at the top here let's go to the top okay this very one do not have the name of the
11:04
table just the name of the columns okay so i'm just going to scroll down now you can even do
11:11
amazing things like using the goodbye columns inside the fab dot evaluative measures
11:20
Now let's see this scenario, we'll actually see the account manager by total sales amount, total units and total tax amount
11:29
So I just define this card variable and then we call the fab.e
11:34
Evaluate underscore measures. Now we're passing the dataset of the first argument, comma, and for the second argument we use the measure and then we specify the measures inside
11:46
double codes separated by commas so we have the total sales amount comma total units inside
11:53
double code comma and we have the total tax amount inside the double code and then of course we have
11:59
to just include that inside a square bracket now for the third argument we use the group by
12:05
of those columns and then we are grouping by the account managers column in the
12:11
the account manager's table so that has to be inside double good also and then to see what this is going to be delivered just call the name of the variable
12:21
and then click on run to execute and let's see what's going to be delivering wow this is quite
12:27
amazing so we can see the outfield you can see the list of the account man just name we can see
12:33
the total sales amount total unit total tax amount which is quite lovely see the last thing
12:39
now we can even retrieve the name of the workspace now don't forget we actually created this
12:46
link workspace so this is the name of the worksby by using the fab dot get underscore workspace
12:53
underscore id now doesn't require any input or any argument just open and close the bracket and then
12:59
just call it back and that's is so this the name of the workspace so this is basically just an
13:05
eye-opener to the semantic link in the Microsoft fabric i hope you enjoy this video if you do
13:12
share with your friends comments and see you in the next video
#Data Management
#Data Sheets & Electronics Reference