11 videos 📅 2024-05-20 09:00:00 America/Creston

Visit the Apache Nifi GROUP 2 course recordings page

WEBVTT

00:00:07.540 --> 00:00:14.620
You may have to log in to NiFi again this morning. I know I've had to. So hopefully

00:00:14.620 --> 00:00:25.600
you copied your username and password somewhere easy. If not, we can... In case you shut

00:00:25.600 --> 00:00:31.400
everything down, you may have to go into the bin directory and run NiFi

00:00:32.000 --> 00:00:38.880
as well to start it back up. Hopefully you just left it running, so you just need to log back in.

00:00:48.260 --> 00:00:56.160
Looks like the only two that we are missing are Ekta and Richard. Yeah,

00:00:56.160 --> 00:01:01.760
I'm on it. I was having some menlo problems here on my side so looks like I should be up here.

00:01:01.760 --> 00:01:16.500
I see. Perfect. Now we just need Ekta. Looks like everybody's got it almost there. Give

00:01:16.500 --> 00:01:22.040
another couple of minutes so everybody can get up and running. Hopefully Ekta can join us again this

00:01:22.040 --> 00:01:40.360
morning. It's your localhost, so it's 127.0.0.1:8443. Here, I'll bring my browser up

00:01:42.040 --> 00:01:51.620
and share it. You should be able to see the IP, but yeah, 127.0.0.1:8443/nifi.

00:01:53.000 --> 00:02:00.680
Okay, yeah. We're running all these locally on the machine so we should all have the same IPs.

00:02:24.540 --> 00:02:45.700
Peter, he just started it. It might take a minute, but it's 127.0.0.1:8443/nifi.

00:02:46.180 --> 00:02:48.960
Okay, I'll do that. I'll also put it in chat.

00:02:52.460 --> 00:02:58.820
And, if you want, you can also bookmark it once you get to the login page or

00:02:58.820 --> 00:03:05.480
to the main desktop canvas. That way you'll have it for tomorrow. I don't see it running

00:03:05.480 --> 00:03:11.520
on... Derek, I don't see yours running, so yeah, go into bin. There you go. And run NiFi.

00:03:11.740 --> 00:03:17.920
Perfect. And say run. Awesome. It'll be running in just a minute. It takes a minute

00:03:17.920 --> 00:03:25.000
for everything to initialize. As I mentioned yesterday, that lib directory is full of

00:03:25.000 --> 00:03:32.000
processors so it loads all those processors, loads everything, unpacks all the content.

00:03:32.660 --> 00:03:38.660
So NiFi itself can take a couple of minutes just to get started and running. Even though

00:03:38.660 --> 00:03:45.020
it will tell you that it's running, it still takes a few minutes for it to initialize.

00:03:45.020 --> 00:03:50.040
Then just make sure you have https:// in front of the 127.0.0.1.
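
For reference, the address and the slow startup described above can be sketched in a few lines of Python. This is only an illustration, not part of the course material: the function names are made up, and the disabled certificate check assumes the local install uses a self-signed certificate, which is an assumption about this lab setup.

```python
import ssl
import time
import urllib.error
import urllib.request


def nifi_ui_url(host: str = "127.0.0.1", port: int = 8443) -> str:
    """Build the UI address read out in class; note the https scheme."""
    return f"https://{host}:{port}/nifi"


def ui_is_up(url: str, timeout: float = 5.0) -> bool:
    """Return True once the server answers at all, even with an error status."""
    # Assumption: a self-signed certificate on localhost, so certificate
    # verification is switched off for this local-only check.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=context):
            return True
    except urllib.error.HTTPError:
        return True  # the server responded, so the web UI is listening
    except (urllib.error.URLError, OSError):
        return False  # refused or timed out: probably still initializing


def wait_for_ui(url, probe=ui_is_up, attempts=30, delay=10.0, sleep=time.sleep):
    """Poll until probe(url) succeeds or the attempts run out."""
    for _ in range(attempts):
        if probe(url):
            return True
        sleep(delay)
    return False


# Example call (left as a comment so nothing is contacted on import):
#   wait_for_ui(nifi_ui_url())
```

Polling with a delay mirrors the advice to give it a minute or two: the process can report that it is running before the web page is ready to answer.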

00:03:50.100 --> 00:03:55.080
Hey, good morning, Ekta. I notice you just joined us and you are logging in. Perfect.

00:03:58.960 --> 00:04:04.220
Ekta, just a minute to get logged in. Darius and Peter, it looks like you guys are...

00:04:05.420 --> 00:04:11.260
Here, I don't know if your NiFi is running. I don't see the command.

00:04:13.980 --> 00:04:19.520
Peter, you'll have to go back into your... Go into your folder and you'll have to start...

00:04:19.520 --> 00:04:25.820
It looks like NiFi is not running. So if you'll go into the bin directory and double

00:04:25.820 --> 00:04:33.320
click on run NiFi, it's in the bin directory. Okay. Perfect. Right there, next to the last one at

00:04:33.320 --> 00:04:36.860
the bottom and run that. Give it just a minute or two and you should be able to log in.

00:04:40.220 --> 00:04:47.700
Ekta is up. I'm not sure if mine is working. I'm pulling it up right now.

00:05:01.310 --> 00:05:06.690
Let me try something real quick.

00:05:06.690 --> 00:05:22.610
I can bookmark this. It will let me bookmark it for you.

00:05:25.490 --> 00:05:30.710
I don't know if it works. You'll have to... I can't hit Ctrl+D to bookmark it.

00:05:31.370 --> 00:05:39.450
But yours is up and running. No worries. Peter, it looks like yours is coming up.

00:05:39.630 --> 00:05:42.970
I remember yesterday you said the page sometimes just gets stuck for a couple of minutes.

00:05:43.470 --> 00:05:46.270
Yeah, it takes time to initialize HTTPS colon.

00:05:49.210 --> 00:05:53.910
Okay. I'm going to give it just another second or two and it might work.

00:05:55.590 --> 00:05:58.110
Okay. It's still refusing on both browsers.

00:06:00.670 --> 00:06:06.770
Illegal character. All right. Since it's available, I see errors for some reason.

00:06:09.710 --> 00:06:13.350
Oh, there we go. All right. Peter, yours is coming up.

00:06:13.470 --> 00:06:14.630
Yeah, it looks good. Thank you.

00:06:17.810 --> 00:06:24.810
All right. Go ahead and get logged in. So I think yesterday everyone did an amazing

00:06:24.810 --> 00:06:33.830
job of getting your first data flow built. Today is a lot more hands-on.

00:06:35.710 --> 00:06:41.490
We are going to dive a little deeper now that we know how to pick a file up.

00:06:41.490 --> 00:06:49.110
We did a basic operation of unzipping the zip file and just putting it back to file system.

00:06:50.050 --> 00:06:58.770
Today's goal is to pick a file up and work with the data that it has.

00:06:59.750 --> 00:07:03.710
So what we're going to do today is learn about controller services.

00:07:04.330 --> 00:07:11.090
In the process of learning about controller services, we are going to take a CSV file

00:07:11.490 --> 00:07:22.370
and we are going to convert it to JSON. Now, this goes a little more in depth into NiFi.
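
To make the goal concrete, here is what a CSV-to-JSON conversion looks like on the data itself, sketched with the Python standard library. This is not how NiFi does it internally and it is not the exercise flow; the sample rows and the function name are made up for illustration.

```python
import csv
import io
import json


def csv_to_json(csv_text: str) -> str:
    """Turn CSV text with a header row into a JSON array of objects."""
    # Each data row becomes a dict keyed by the header names.
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, indent=2)


sample = "id,name\n1,pump\n2,valve\n"  # made-up sample data
print(csv_to_json(sample))
```

Note that every value comes out as a string here, because plain CSV carries no type information. Deciding which fields are numbers is one of the things a schema is for.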

00:07:23.090 --> 00:07:28.030
Feel free to ask questions. I'll give you a hint after we get through doing this.

00:07:28.290 --> 00:07:37.290
The hands-on is going to be picking up some JSON and CSV files and making all of them,

00:07:37.290 --> 00:07:44.810
I have a scenario where all of them need to be in the same format, and we are going to look for

00:07:45.730 --> 00:07:51.350
some patterns in the data so that we can send alerts.

00:07:51.450 --> 00:07:54.750
And I've got a whole write-up on the scenario to show.

00:07:56.150 --> 00:08:02.590
So it might be, even in the last class, it was a little tricky sometimes.

00:08:03.150 --> 00:08:07.390
And so, you know, we're going to have plenty of time to work through this.

00:08:08.670 --> 00:08:15.210
But, you know, I'll be, I'm going to sit here and watch and provide answers because

00:08:15.210 --> 00:08:19.490
I know you'll have, well, hopefully you'll have questions. If not, you can just breeze

00:08:19.490 --> 00:08:25.710
straight through it. So yesterday we went and I'm going to pull up mine.

00:08:26.810 --> 00:08:34.250
We went and we made our whole flow to get files from the file system,

00:08:34.710 --> 00:08:39.250
put those back after we unzip them and those types of things.

00:08:39.890 --> 00:08:44.990
We also looked at some of the, you know, maybe cleaning the flow up.

00:08:46.130 --> 00:08:48.430
There's still some cleaning I need to do as well.

00:08:49.390 --> 00:08:54.750
But, you know, renaming a processor, adding color to a processor, adding

00:08:57.830 --> 00:09:03.590
background labels to the processors, you know, to make them, you know,

00:09:03.770 --> 00:09:08.990
easier to navigate and those types of things. The only thing that has changed that I did

00:09:09.850 --> 00:09:13.650
was I put all this into a new process group.

00:09:14.310 --> 00:09:22.910
And so how I did that, and I'll show you, is I clicked on process group.

00:09:22.910 --> 00:09:28.910
And I dragged it down and just named it new process group or whatever.

00:09:29.510 --> 00:09:34.890
First sample flow is what I named mine. And then once you have that process group

00:09:34.890 --> 00:09:39.750
on your canvas, let me do it so I can show you.

00:09:47.710 --> 00:09:55.370
What I like to do is zoom out a little bit on my navigation.

00:09:55.890 --> 00:10:01.850
So I have my process group here. There we go.

00:10:02.230 --> 00:10:07.750
And what you can do is hold the Shift key and drag, and then

00:10:07.750 --> 00:10:14.590
it draws a box around your whole flow. And you can take that and drag it and

00:10:14.590 --> 00:10:21.170
drop it right into your process group. So if you can, on your canvas,

00:10:21.930 --> 00:10:27.870
bring down a new process group and drag and drop, like I just did,

00:10:28.250 --> 00:10:33.750
your whole flow into that process group. And the reason we do this is for many,

00:10:33.750 --> 00:10:43.630
many reasons. If you can imagine 300 processors doing all these operations

00:10:43.630 --> 00:10:49.050
running on this canvas, it would get very cluttered very quickly.

00:10:49.750 --> 00:10:54.130
It would be very hard to navigate and understand data flows.

00:10:54.910 --> 00:11:00.710
Also for security reasons, the way that NiFi likes to handle

00:11:01.550 --> 00:11:09.550
multi-tenancy and the like is with process groups. On the main canvas, if you had,

00:11:09.890 --> 00:11:15.730
say, 10 organizations all using the same NiFi, you could create a process group

00:11:15.730 --> 00:11:21.230
for each organization and lock that process group down to just them.

00:11:22.190 --> 00:11:29.950
And then that way, from the main canvas, you can only see 10 process groups, not 300 data

00:11:29.950 --> 00:11:36.590
flows. So if you can, you know, copy that in, bring it down, I am going to...

00:11:57.350 --> 00:12:01.750
So I missed how you did that. Can you go back over that?

00:12:01.750 --> 00:12:08.150
Yeah, exactly. So what I did is I brought down a process group,

00:12:10.050 --> 00:12:14.570
name it, you know, whatever you want to name it, my first data flow.

00:12:18.790 --> 00:12:26.890
Oh, we'll put it in there, how I got the whole flow into the process group.

00:12:29.030 --> 00:12:32.890
Okay. I can drop it on the canvas with my flow, but it doesn't...

00:12:34.890 --> 00:12:41.590
No worries, no worries. In fairness, the latency sometimes will mess with you.

00:12:42.130 --> 00:12:45.630
But what you do, what I like to do is zoom out a little bit on my

00:12:45.630 --> 00:12:51.050
navigation. And then I hold the shift key. And while I'm holding the shift key,

00:12:51.190 --> 00:12:56.030
I will drag and create like a box. You see the box that's being created?

00:12:56.890 --> 00:13:01.710
And let go. And it will highlight the whole data flow, the connections,

00:13:02.070 --> 00:13:06.990
the images, all of that. Then you can take it and drag it to your process group.

00:13:07.110 --> 00:13:11.730
Your process group should highlight blue, and then you drop it in.

00:13:13.810 --> 00:13:19.430
And then you should have only a process group on your canvas. And then when you

00:13:19.430 --> 00:13:25.530
double click your process group, you can go right in and see your flow.

00:13:25.530 --> 00:13:30.770
Now, again, the virtual desktop environment, sometimes it's a little

00:13:30.770 --> 00:13:36.170
difficult because of the latency. And it doesn't want to select all or

00:13:36.170 --> 00:13:38.710
something. So it may take a couple of attempts.

00:13:40.850 --> 00:13:44.650
Yeah. So I tried it a little differently. I did select the entire section.

00:13:45.090 --> 00:13:48.570
And when you right click, it lets you also create a process group.

00:13:50.090 --> 00:13:54.850
But I like to drag it down so that way we can do it. But yeah, you got the

00:13:54.850 --> 00:13:57.950
shortcut, right? Yeah. The only reason why is

00:13:57.950 --> 00:14:01.550
because I had that same problem. My browser was just not responding to

00:14:01.550 --> 00:14:04.930
that for whatever reason. So I noticed that and it seemed like it worked.

00:14:05.170 --> 00:14:10.930
Yeah. That's another way of doing it. Yeah. But also, see, you're already

00:14:10.930 --> 00:14:13.730
creating breadcrumbs. And I'm not doing that at all.

00:14:16.270 --> 00:14:17.970
No worries. I can take a look at it.

00:14:19.630 --> 00:14:24.610
Let's take a quick question. Once you click inside the flow, how do you

00:14:24.610 --> 00:14:31.070
get back out of it? Once you're in the process group.

00:14:31.170 --> 00:14:35.710
Oh, my God. Yeah. I noticed on the bottom left corner, there's a

00:14:36.310 --> 00:14:42.050
breadcrumb. Yeah. Yeah. So if you remember, I touched on it briefly,

00:14:42.190 --> 00:14:46.770
but I can go right back to my parent group. And if I go into the

00:14:46.770 --> 00:14:49.710
process group, it creates a breadcrumb and you can go right back.

00:14:50.530 --> 00:14:52.930
Also, if you're in this, you can

00:14:55.250 --> 00:14:59.370
leave the group: right-click and say leave group. That way you can also

00:14:59.370 --> 00:15:03.150
leave the group. Tom, let's take a look at yours.

00:15:03.170 --> 00:15:05.790
It's all jacked up. No worries. We can get it fixed.

00:15:06.070 --> 00:15:08.490
I don't know why my stuff's all the way to the right now.

00:15:08.630 --> 00:15:12.150
I don't know what I did. Okay. Well, watch my screen and

00:15:12.150 --> 00:15:16.750
I'll take over and get you squared away. No, I completely

00:15:17.310 --> 00:15:21.890
got lost. I lost my label. I don't know what happened.

00:15:22.370 --> 00:15:27.550
All right. Let's see what you got. So remember you have the

00:15:27.550 --> 00:15:31.310
navigation panel here. You can actually then, like, grab it and

00:15:32.050 --> 00:15:36.270
drag it around and kind of center. Okay. That's cool.

00:15:37.850 --> 00:15:40.850
Did you already bring your process group down?

00:15:42.010 --> 00:15:45.670
Nope. Okay. So I keep deleting it because I'm not doing it right.

00:15:46.230 --> 00:15:50.010
Okay. No worries. So if you hold shift and I'm holding the shift

00:15:50.010 --> 00:15:53.910
key, I'm going to start up here and I'm going to drag this box

00:15:53.910 --> 00:15:59.090
all the way over your flow. And if you already brought down

00:15:59.090 --> 00:16:02.930
a process group, you can just copy it in there. But since

00:16:02.930 --> 00:16:06.790
you've got it all selected, you can also say group, name

00:16:06.790 --> 00:16:13.130
your group, say add. And there you go.

00:16:13.130 --> 00:16:19.310
So now on the root canvas, you have your first flow group.

00:16:19.990 --> 00:16:21.850
So you just double click and you can go into it.

00:16:23.090 --> 00:16:28.530
Okay. And, you know, you can then if you click like on the

00:16:28.530 --> 00:16:32.670
canvas, you can, you know, hold and drag it around.

00:16:33.830 --> 00:16:38.790
You can also select everything and say align.

00:16:39.430 --> 00:16:45.670
And it's not the prettiest, but like you can get it aligned.

00:16:46.730 --> 00:16:49.190
Actually, that messed up even more.

00:16:52.370 --> 00:16:56.830
That aligning doesn't work right. Let me fix this for you.

00:17:00.750 --> 00:17:04.970
Undo. Undo. Yeah, unfortunately. Control Z.

00:17:07.290 --> 00:17:10.170
I wish. Let's see. Let me get this.

00:17:11.270 --> 00:17:14.330
Okay. So logically, we want this here.

00:17:15.010 --> 00:17:21.410
We would want our unpacked files. Get file from folder.

00:17:22.010 --> 00:17:24.930
Right here is the first one. Yeah, that is, you know,

00:17:25.210 --> 00:17:29.190
that's some of the nuances and it's moving because like the

00:17:29.630 --> 00:17:30.030
latencies.

00:17:32.850 --> 00:17:35.370
Yeah, that's okay. No, I'm going to clean it up real

00:17:35.370 --> 00:17:41.290
quickly. The VM, with its latency, sometimes will,

00:17:41.910 --> 00:17:44.490
well, it just doesn't want to work right.

00:17:44.630 --> 00:17:47.870
I was going to ask how you do create, you know, like save.

00:17:48.010 --> 00:17:50.030
I guess you'd save this template, but like do a new,

00:17:50.290 --> 00:17:52.830
like, like I said, get a new blank canvas to start a new

00:17:52.830 --> 00:17:55.390
thing. You know what I mean? Without getting rid of what you

00:17:56.010 --> 00:18:00.050
built. I guess, I guess this is one way of doing that,

00:18:00.490 --> 00:18:02.990
right? Yeah. And we're going to.

00:18:09.270 --> 00:18:11.150
It's me, man. It's not going to be easy.

00:18:11.590 --> 00:18:16.330
No, it's fine. No, it's, it's again, like with some

00:18:16.330 --> 00:18:19.510
software products, like this virtual desktop works great.

00:18:20.110 --> 00:18:23.310
But when you're real time clicking and dragging and

00:18:23.310 --> 00:18:25.930
that type of stuff, it can be a little bit.

00:18:26.830 --> 00:18:28.950
Okay. Let me zoom in a little bit more.

00:18:29.830 --> 00:18:31.310
Okay. So we can take this guy.

00:18:33.310 --> 00:18:35.390
Hopefully move him over here. Okay.

00:18:36.130 --> 00:18:39.190
And just like what, you know, the way we write,

00:18:39.310 --> 00:18:43.030
you know, left to right, unless you speak Arabic,

00:18:43.370 --> 00:18:46.030
then you're going the opposite way.

00:18:47.710 --> 00:18:52.150
You know, that's usually how like flows get designed

00:18:52.150 --> 00:18:57.210
is either left to right, or, you know, straight up and down.

00:18:57.690 --> 00:19:01.670
Right. And, and that way you can branch off your connections,

00:19:01.970 --> 00:19:04.890
your log messages, those types of things.

00:19:05.090 --> 00:19:09.650
So let me see if I can. That's going back to put to not

00:19:09.650 --> 00:19:15.490
go that way. There we go.

00:19:15.630 --> 00:19:18.970
That's a lot better. Okay. That's a little cleaner,

00:19:20.030 --> 00:19:24.790
I think. And then while I'm in yours, so this is,

00:19:24.810 --> 00:19:27.070
you know, this is your main canvas.

00:19:27.990 --> 00:19:29.870
You can use the breadcrumb trail.

00:19:30.330 --> 00:19:32.530
You can go into the process group.

00:19:32.750 --> 00:19:34.750
And then once you're in the process group,

00:19:35.950 --> 00:19:38.530
you can just right click and say leave group if you need.

00:19:39.590 --> 00:19:41.490
Or you can just click the breadcrumb trail.

00:19:42.030 --> 00:19:44.610
There's many ways to get out of that process group.

00:19:45.030 --> 00:19:48.430
But the reason, you know, we like to do this is,

00:19:48.430 --> 00:19:51.710
we have this flow. We're about to build another flow.

00:19:52.870 --> 00:19:56.110
And, you know, you can imagine your canvas would be just,

00:19:56.110 --> 00:19:58.270
just flows everywhere.

00:19:59.210 --> 00:20:01.870
That's also why we have input and output ports

00:20:01.870 --> 00:20:05.050
to manage connections, those types of things.

00:20:06.590 --> 00:20:09.070
So, you know, for the next exercise,

00:20:09.810 --> 00:20:12.690
you would actually probably just bring down a new process group,

00:20:13.090 --> 00:20:14.010
create a new one.

00:20:14.230 --> 00:20:16.850
And that way your main canvas stays cleaner.

00:20:17.650 --> 00:20:22.810
And then when you've set this up in a real world environment

00:20:23.330 --> 00:20:26.870
where you have multi-tenancy and some other things,

00:20:27.310 --> 00:20:31.510
you want to be able to, you know,

00:20:31.650 --> 00:20:33.270
I got mine open like 20 times.

00:20:34.490 --> 00:20:36.870
All right. You want to be able to lock this down

00:20:36.870 --> 00:20:41.250
where, you know, this may be organization A,

00:20:41.410 --> 00:20:43.090
and then this is organization B.

00:20:43.250 --> 00:20:47.310
In your policy for multi-tenancy

00:20:47.310 --> 00:20:51.550
and some of the security, you can, you know,

00:20:51.730 --> 00:20:56.170
have that process group belong to another organization.

00:20:56.590 --> 00:20:59.730
And then under that process group is basically,

00:20:59.730 --> 00:21:03.210
you know, that organization's main canvas.

00:21:03.690 --> 00:21:06.610
And so it would be blank, you know,

00:21:06.690 --> 00:21:09.630
mine's not because I put a process group

00:21:09.630 --> 00:21:12.210
within a process group, but, you know,

00:21:12.210 --> 00:21:13.750
it would be blank.

00:21:13.970 --> 00:21:15.930
And when they log into NiFi,

00:21:16.210 --> 00:21:19.050
that's the only process group that they have access to

00:21:19.050 --> 00:21:19.910
on the canvas.

00:21:20.170 --> 00:21:23.870
And they're able to go in to their process group

00:21:23.870 --> 00:21:26.910
and then, you know, build whatever data flows as needed.

00:21:27.510 --> 00:21:29.690
Another advantage of that is,

00:21:30.250 --> 00:21:33.310
and I see this quite a bit where, you know,

00:21:33.450 --> 00:21:38.930
organization A has a responsibility to import data,

00:21:39.530 --> 00:21:42.190
ETL it, you know, get it into the right format,

00:21:42.210 --> 00:21:43.090
and things like that.

00:21:44.190 --> 00:21:46.150
But they may also have a requirement

00:21:46.150 --> 00:21:47.810
to share with organization B.

00:21:48.930 --> 00:21:51.490
And so, you know, within NiFi, you can have,

00:21:52.550 --> 00:21:55.110
you know, organization A has their own section,

00:21:55.610 --> 00:21:57.430
organization B has their own section,

00:21:57.810 --> 00:22:02.030
and then you can actually do, like, an output port

00:22:02.030 --> 00:22:04.270
on the data that, you know,

00:22:04.270 --> 00:22:06.450
you need to share with organization B

00:22:07.430 --> 00:22:10.570
and it goes, you know, to their process group.

00:22:10.570 --> 00:22:15.590
And organization B has an input port that receives that data.

00:22:16.070 --> 00:22:19.690
So that way, organization B doesn't really see,

00:22:20.530 --> 00:22:21.890
you know, if you have it locked down,

00:22:21.930 --> 00:22:23.850
they don't really see how you made the sausage,

00:22:24.350 --> 00:22:26.370
just as, you know, they're just getting,

00:22:26.370 --> 00:22:29.090
you know, sausage delivered to their process group.

00:22:29.970 --> 00:22:33.430
So the sausage making can hide within that group

00:22:33.430 --> 00:22:35.470
along with whatever logic you put in.

00:22:36.090 --> 00:22:38.410
I know some organizations, they, you know,

00:22:38.470 --> 00:22:40.430
they have models and stuff like that

00:22:40.430 --> 00:22:42.790
they don't want, you know, folks to mess with.

00:22:43.110 --> 00:22:45.310
And so, you know, they'll run everything

00:22:45.310 --> 00:22:47.230
within that locked down group.

00:22:47.570 --> 00:22:50.490
And then the output of that will go, you know, elsewhere.

00:22:51.230 --> 00:22:53.050
So, you know, just keep that in mind.

00:22:53.230 --> 00:22:56.690
There's many ways to do this.

00:22:56.870 --> 00:23:01.550
But again, you know, I think on the last class,

00:23:05.630 --> 00:23:09.830
Brett and a couple of others chatted

00:23:09.830 --> 00:23:12.970
on how to set up some multi-tenancy.

00:23:13.430 --> 00:23:15.950
It's my understanding, you know, some of the folks

00:23:15.950 --> 00:23:19.370
on that call are also working to help set this up.

00:23:19.870 --> 00:23:21.890
So, you know, there's no point in everybody

00:23:21.890 --> 00:23:24.270
running their own instance unless you have,

00:23:24.310 --> 00:23:27.270
you know, that need or if you're just developing

00:23:27.270 --> 00:23:28.270
and learning.

00:23:28.970 --> 00:23:31.850
But when it comes to, like, test and prod, you know,

00:23:31.970 --> 00:23:35.350
having that multi-tenant NiFi available

00:23:35.350 --> 00:23:37.870
for everyone to use, you know,

00:23:37.870 --> 00:23:39.590
it just sounds more reasonable.

00:23:40.570 --> 00:23:43.590
So, you know, that's my understanding is, you know,

00:23:43.710 --> 00:23:45.590
potentially that's what your environment

00:23:45.590 --> 00:23:48.550
may look like in the future if that's the way,

00:23:48.550 --> 00:23:50.510
you know, that those things are set up.

00:23:51.310 --> 00:23:57.110
But anyway, so I need to move to parent group.

00:23:57.530 --> 00:24:01.550
All right, and then I can get rid of this.

00:24:06.590 --> 00:24:08.030
All right.

00:24:08.350 --> 00:24:11.090
So, if you go back to your main canvas,

00:24:11.350 --> 00:24:13.790
you should have one process group.

00:24:17.410 --> 00:24:22.290
And then when we build additional data flows

00:24:22.290 --> 00:24:25.910
today and tomorrow, if you can, bring down

00:24:25.910 --> 00:24:29.190
a new process group and work within that group.

00:24:29.190 --> 00:24:32.430
That way, you know, you don't have three,

00:24:32.470 --> 00:24:34.870
four, five data flows that we're working on

00:24:34.870 --> 00:24:39.610
building on the canvas and, you know,

00:24:40.770 --> 00:24:43.050
and just cluttered up and you can't,

00:24:43.050 --> 00:24:45.390
you know, find, you know, certain processors

00:24:45.390 --> 00:24:46.290
or anything else.

00:24:47.210 --> 00:24:49.810
Now, sometimes, you know, processors,

00:24:49.950 --> 00:24:51.970
you know, they can be a long data flow.

00:24:52.290 --> 00:24:54.990
So within a process group, you may have,

00:24:54.990 --> 00:24:58.450
you know, a couple of processors doing

00:24:58.450 --> 00:25:00.350
an input port, receiving data,

00:25:00.430 --> 00:25:01.510
those types of things.

00:25:01.910 --> 00:25:04.390
And then you may have 10 or 15 process groups

00:25:04.390 --> 00:25:08.250
within that original process group

00:25:08.250 --> 00:25:10.490
that do, you know, all the data movement,

00:25:10.690 --> 00:25:14.090
logic, ETL, and even embedded into that,

00:25:14.210 --> 00:25:17.030
you know, you will have different process groups

00:25:17.030 --> 00:25:18.310
and those types of things.

00:25:19.530 --> 00:25:23.250
So I've seen it six, seven, eight levels deep.

00:25:23.410 --> 00:25:26.150
And that's the reason we have a breadcrumb trail.

00:25:26.150 --> 00:25:29.150
So you can actually go back out, you know,

00:25:29.150 --> 00:25:33.030
very quickly and go to the process group you need.
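
The nesting and the breadcrumb trail described above can be pictured as a simple tree. The sketch below is a toy model in plain Python, not NiFi's own data model or API; the class, method, and group names are all hypothetical.

```python
class ProcessGroup:
    """Toy stand-in for a process group that may contain other groups."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []

    def add_group(self, name):
        """Create a child group inside this one and return it."""
        child = ProcessGroup(name, parent=self)
        self.children.append(child)
        return child

    def breadcrumb(self):
        """Walk up to the root and list the path from the top down."""
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return " > ".join(reversed(parts))


root = ProcessGroup("Main canvas")          # hypothetical names throughout
org_a = root.add_group("Organization A")
ingest = org_a.add_group("Ingest")
print(ingest.breadcrumb())  # Main canvas > Organization A > Ingest
```

Each level only needs to know its parent, which is why jumping back out from six or seven levels deep is a single click on the trail.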

00:25:35.370 --> 00:25:39.550
But if you get lost on a flow,

00:25:39.930 --> 00:25:41.950
you do have a search bar.

00:25:42.010 --> 00:25:52.810
So you can see if I can do.

00:26:02.530 --> 00:26:04.390
And then this one, for instance,

00:26:04.950 --> 00:26:08.590
is, you know, named connection to MIME type.

00:26:08.890 --> 00:26:12.130
I think it was GetFile.

00:26:13.190 --> 00:26:16.170
Yeah, connection from GetFile to IdentifyMimeType.

00:26:16.470 --> 00:26:19.270
So that's also, you know, why you want

00:26:19.270 --> 00:26:21.950
to kind of label these with, you know,

00:26:21.950 --> 00:26:24.430
a good readable processor name

00:26:24.430 --> 00:26:26.350
because you can easily just search

00:26:26.350 --> 00:26:30.010
and, you know, find whatever you need to find.

00:26:31.150 --> 00:26:33.530
So, you know, it's all up here on the toolbar.

00:26:36.530 --> 00:26:39.570
And then now that we have processors

00:26:39.570 --> 00:26:44.350
on our NiFi instance, you can see the status bar.

00:26:44.430 --> 00:26:47.010
You know, I have 10 stopped processors.

00:26:47.630 --> 00:26:49.970
I have two that need attention.

00:26:51.510 --> 00:26:53.090
We haven't put any data through.

00:26:53.090 --> 00:26:56.010
We don't have any other connections

00:26:56.010 --> 00:26:57.930
or disabled services yet.

00:26:59.530 --> 00:27:01.330
So they're not showing up.

00:27:04.410 --> 00:27:07.410
But yeah, everything's on the status bar.

00:27:07.570 --> 00:27:09.730
You can search and those types of things.

00:27:10.110 --> 00:27:11.930
So that should make it a little easier,

00:27:12.150 --> 00:27:15.270
a little cleaner when we start building more flows

00:27:16.050 --> 00:27:17.370
and those types of things.

00:27:18.410 --> 00:27:21.930
So for this morning, we are going to learn

00:27:21.930 --> 00:27:23.850
about controller services.

00:27:24.530 --> 00:27:32.650
I think I mentioned controller services briefly.

00:27:34.270 --> 00:27:37.730
And let's see here.

00:27:38.950 --> 00:27:39.810
There's only one of the slides

00:27:39.810 --> 00:27:41.250
where I mentioned controller services,

00:27:41.270 --> 00:27:42.230
but it doesn't matter.

00:27:43.530 --> 00:27:46.890
So controller services, I mentioned like

00:27:46.890 --> 00:27:51.530
a database connection where you can establish

00:27:51.530 --> 00:27:54.770
that database connection, that username and password,

00:27:54.970 --> 00:27:56.630
the IP address, the port.

00:27:57.070 --> 00:27:59.770
So if you're going to, you know, MariaDB

00:28:01.010 --> 00:28:04.610
or MySQL, you know, the port that

00:28:04.610 --> 00:28:07.210
that database runs on is usually 3306.

00:28:08.230 --> 00:28:09.990
It has an IP address.

00:28:10.070 --> 00:28:13.710
It has a username and password, all of those things.

00:28:14.590 --> 00:28:19.050
So, you know, that is some sensitive information

00:28:19.050 --> 00:28:22.350
that you may not want to share across the board

00:28:22.350 --> 00:28:24.090
with everybody that's using your NiFi

00:28:24.990 --> 00:28:27.910
if you're, like, the one managing this system.

00:28:28.270 --> 00:28:32.670
And so, you know, as a sysadmin or data flow developer,

00:28:33.190 --> 00:28:37.170
you can create controller services that, you know,

00:28:37.310 --> 00:28:40.310
hold the database connection.

00:28:40.670 --> 00:28:42.710
You can put in the database connection information

00:28:42.710 --> 00:28:43.810
and all of that.

00:28:44.050 --> 00:28:47.710
And then that way, when people are utilizing

00:28:47.710 --> 00:28:50.850
that database connection, they just use the one

00:28:50.850 --> 00:28:52.570
that's already set up and running.

00:28:53.570 --> 00:28:58.110
So they, you know, it saves time for other data engineers

00:28:58.110 --> 00:29:00.830
because they can just reference the connection that,

00:29:01.610 --> 00:29:04.370
you know, that controller service database connection

00:29:04.370 --> 00:29:05.750
that's already set up.

00:29:06.090 --> 00:29:08.110
You do not have to give out the username,

00:29:08.330 --> 00:29:10.490
the password, the IP address, the port,

00:29:10.870 --> 00:29:12.370
you know, those types of things.

00:29:12.970 --> 00:29:16.230
So it's really nice, you know, to have, you know,

00:29:16.230 --> 00:29:18.390
some of those shared services.

00:29:19.370 --> 00:29:20.670
And that's what it is, is, you know,

00:29:20.670 --> 00:29:23.210
controller services are shared services

00:29:23.210 --> 00:29:26.770
that can be used by processors, reporting tasks,

00:29:26.950 --> 00:29:28.910
and other controller services.
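
One way to picture a shared service is as a single object that holds the connection details, with each processor holding only a reference to it. The sketch below is a plain-Python analogy, not NiFi code: the class names, host, user, and password are invented, and only the port 3306 comes from the discussion above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DbConnectionService:
    """Keeps the connection details in one place (illustrative only)."""
    name: str
    host: str
    port: int
    username: str
    password: str = field(repr=False)  # keep the secret out of printed output

    def jdbc_url(self, database: str) -> str:
        """Build a MySQL-style JDBC URL from the stored host and port."""
        return f"jdbc:mysql://{self.host}:{self.port}/{database}"


@dataclass
class QueryProcessor:
    """A processor that knows which service to use, not its credentials."""
    name: str
    service: DbConnectionService

    def describe(self) -> str:
        return f"{self.name} -> {self.service.name}"


# Set up once by whoever manages the system (all values are made up).
shared = DbConnectionService("shared-mysql", "10.0.0.5", 3306, "etl_user", "example-only")

# Other engineers just point their processors at the existing service.
orders = QueryProcessor("Read orders", shared)
alerts = QueryProcessor("Write alerts", shared)
print(orders.describe(), "|", alerts.describe())
```

Both processors share the same object, so the username, password, address, and port live in exactly one place and never need to be handed out.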

00:29:30.410 --> 00:29:33.750
But, you know, again, the nuances

00:29:34.710 --> 00:29:37.410
like removing a processor and putting another processor

00:29:37.410 --> 00:29:39.710
in that we ran into yesterday,

00:29:40.170 --> 00:29:42.630
in order to modify a controller service,

00:29:42.870 --> 00:29:46.050
all the referencing components must be stopped.

00:29:46.230 --> 00:29:52.410
And there are ways to, like, stop all referencing processors

00:29:52.410 --> 00:29:56.170
together, but, you know, if you modify

00:29:56.170 --> 00:29:58.170
that controller service, you know,

00:29:58.290 --> 00:30:00.310
all the processors that are referencing

00:30:00.310 --> 00:30:03.730
that controller service will also need to be stopped.
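
The rule just described, that everything referencing a controller service has to be stopped before the service can be changed, can be sketched as a small guard. This is a toy model with hypothetical class and property names; it imitates the rule and does not call NiFi.

```python
class Processor:
    """Toy processor with just a name and a running flag."""

    def __init__(self, name):
        self.name = name
        self.running = False


class ControllerService:
    """Toy shared service that tracks which processors reference it."""

    def __init__(self, name, **properties):
        self.name = name
        self.properties = dict(properties)
        self.referencing = []

    def attach(self, processor):
        self.referencing.append(processor)

    def update(self, **changes):
        """Apply changes only when no referencing processor is running."""
        running = [p.name for p in self.referencing if p.running]
        if running:
            raise RuntimeError("stop these first: " + ", ".join(running))
        self.properties.update(changes)


service = ControllerService("shared-db", port=3306)
reader = Processor("Read orders")
service.attach(reader)

reader.running = True
try:
    service.update(port=3307)   # refused while the processor is running
except RuntimeError as err:
    print(err)

reader.running = False
service.update(port=3307)       # allowed once everything is stopped
print(service.properties["port"])
```

The check runs across every referencing processor at once, which matches the idea of stopping all of them together before making the change.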

00:30:04.570 --> 00:30:08.610
Luckily, you know, and that sounds, you know,

00:30:08.630 --> 00:30:09.590
you know, like a lot of work,

00:30:09.770 --> 00:30:13.750
but luckily once you establish your database connection,

00:30:13.870 --> 00:30:15.710
your database controller service,

00:30:16.270 --> 00:30:19.650
unless the IP address changes or the port changes

00:30:19.650 --> 00:30:22.330
or username and password changes,

00:30:22.390 --> 00:30:25.130
even username and passwords, you can automate that.

00:30:25.770 --> 00:30:27.630
But unless there's some major changes,

00:30:27.970 --> 00:30:31.390
you should be able to install that controller service

00:30:32.090 --> 00:30:34.750
and, you know, everyone take advantage of it.

00:30:35.370 --> 00:30:37.830
And then update, if it updates, you know,

00:30:37.890 --> 00:30:39.890
it shouldn't be that often.

00:30:40.690 --> 00:30:42.610
We see, like, in the real world,

00:30:42.890 --> 00:30:44.590
controller services running for years

00:30:44.590 --> 00:30:48.110
without any interaction, just because, you know,

00:30:48.230 --> 00:30:51.070
the database it's referencing is always there.

00:30:53.970 --> 00:30:58.570
So within the data flow, to scope a controller service,

00:30:59.730 --> 00:31:02.990
it can be created within any process group.

00:31:03.970 --> 00:31:05.790
You can create a controller service,

00:31:05.790 --> 00:31:08.010
you know, within a processor.

00:31:08.490 --> 00:31:11.230
You can create it within a process group.

00:31:11.830 --> 00:31:14.230
You can actually go to the hamburger menu

00:31:14.970 --> 00:31:19.410
and you should see some controller services as well.

00:31:19.710 --> 00:31:23.110
We don't have any installed yet, but, you know,

00:31:23.110 --> 00:31:25.350
you can, if you have it on your main canvas,

00:31:25.350 --> 00:31:26.110
it will show up here.

00:31:27.110 --> 00:31:28.990
Later, when we install registry,

00:31:29.290 --> 00:31:33.370
we're actually going to install the registry service

00:31:33.870 --> 00:31:36.830
that everyone gets to use and, you know,

00:31:36.970 --> 00:31:37.950
you install it once.

00:31:39.610 --> 00:31:42.230
So that's one way of connecting to it.

00:31:42.230 --> 00:31:47.250
So what I'm going to do is kind of go into my sample flow

00:31:47.930 --> 00:31:51.790
and kind of walk you through how we do this.

00:31:52.090 --> 00:31:54.690
You're more than welcome to follow along.

00:31:55.610 --> 00:31:58.210
It is, you know, this is a little bit more advanced

00:31:58.210 --> 00:32:02.170
because we're converting CSV to JSON,

00:32:02.890 --> 00:32:05.370
setting the name and writing the JSON out.

00:32:05.990 --> 00:32:08.970
We're going to use a controller service for that

00:32:08.970 --> 00:32:12.210
and, you know, kind of show you how that's done.

00:32:13.410 --> 00:32:15.970
So the first processor, you know, of course,

00:32:15.990 --> 00:32:18.650
is we're going to get a file from a directory

00:32:19.630 --> 00:32:21.430
and this should not be configured.

00:32:23.110 --> 00:32:27.150
The file that we're looking for is inventory.csv.

00:32:27.290 --> 00:32:28.950
It is a CSV file.

00:32:29.170 --> 00:32:31.510
Pull that up and show you what that looks like.

00:32:42.410 --> 00:32:46.590
So the goal of this is we're going to take this CSV file,

00:32:46.630 --> 00:32:49.050
which is store, item, and quantity,

00:32:49.950 --> 00:32:52.250
and make it a JSON document.

00:32:52.730 --> 00:32:56.750
So that way all the data that we're receiving as CSV,

00:32:57.110 --> 00:33:00.930
we can convert it to JSON and, you know,

00:33:00.930 --> 00:33:02.430
do further operations.
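
For readers following along, here is a minimal Python sketch of the goal described above: CSV rows with store, item, and quantity columns going in, a JSON document coming out. It is not how NiFi performs the conversion. The sample rows and the `csv_to_json` helper are hypothetical, and treating quantity as an integer is an assumption.

```python
import csv
import io
import json

# Hypothetical sample rows; the real inventory.csv contents are not reproduced here.
SAMPLE_CSV = "store,item,quantity\nNorth,widget,12\nSouth,gadget,7\n"


def csv_to_json(text: str) -> str:
    """Read CSV text whose first line is a header and return a JSON array of records."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        # Assumption: quantity is a whole number.
        row["quantity"] = int(row["quantity"])
    return json.dumps(rows, indent=2)


print(csv_to_json(SAMPLE_CSV))
```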

00:33:03.070 --> 00:33:05.430
So that's the file that we are going to work with.

00:33:05.510 --> 00:33:08.930
You all have access to this file.

00:33:09.050 --> 00:33:10.150
You should.

00:33:10.150 --> 00:33:11.430
Let me see.

00:33:14.750 --> 00:33:17.510
If you don't, I can put it on your desktop.

00:33:17.670 --> 00:33:20.090
But for now, just walk through it with me.

00:33:20.750 --> 00:33:22.910
So I have this inventory file.

00:33:23.310 --> 00:33:25.190
Again, we're using a GetFile processor.

00:33:25.330 --> 00:33:27.610
So I'm going to just copy the path

00:33:28.570 --> 00:33:30.050
and I'm going to put it in.

00:33:32.390 --> 00:33:34.970
The file filter, you know, is a little different.

00:33:35.830 --> 00:33:38.970
You can actually filter just on the file name.

00:33:39.930 --> 00:33:42.310
I put in inventory.csv.

00:33:42.370 --> 00:33:47.210
I can put whatever CSV, whatever, you know,

00:33:48.090 --> 00:33:49.970
zip file, JSON documents.

00:33:50.130 --> 00:33:50.790
It doesn't matter.

00:33:51.010 --> 00:33:53.930
It's only going to pick up inventory.csv.
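
The file filter behavior can be sketched in Python, assuming the filter is a regular expression that must match the whole file name. The candidate file names and the `picked_up` helper are made up for illustration.

```python
import re

# Assumption: the filter is a regular expression matched against the full file name.
FILE_FILTER = re.compile(r"inventory\.csv")


def picked_up(names):
    """Return only the file names that the filter would accept."""
    return [name for name in names if FILE_FILTER.fullmatch(name)]


# Hypothetical directory listing.
candidates = ["inventory.csv", "orders.csv", "inventory.json", "archive.zip"]
print(picked_up(candidates))
```

If the filter really is a regular expression, an unescaped dot in `inventory.csv` would match any single character, so escaping it as `inventory\.csv` is the stricter choice.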

00:33:55.830 --> 00:33:58.110
And then, of course, I have, you know,

00:33:58.310 --> 00:33:59.690
Keep Source File set to true,

00:34:01.990 --> 00:34:03.190
Recurse Subdirectories.

00:34:03.310 --> 00:34:04.870
There's no other subdirectories

00:34:04.870 --> 00:34:05.830
and those types of things.

00:34:05.830 --> 00:34:10.110
So I have my GetFile, you know, where I need it to be.

00:34:10.350 --> 00:34:11.730
And that's the first step, right?

00:34:11.850 --> 00:34:15.930
You know, read the CSV file, get it into a flow file

00:34:15.930 --> 00:34:18.350
so I can start operating with it.

00:34:20.450 --> 00:34:23.630
So the controller services we are using

00:34:24.710 --> 00:34:27.590
are an Avro schema service,

00:34:28.130 --> 00:34:30.370
a JSON writer service,

00:34:30.710 --> 00:34:33.070
and a CSV reader service.

00:34:33.410 --> 00:34:35.390
And we're going to go into those.

00:34:35.390 --> 00:34:39.890
But, you know, to convert CSV to JSON,

00:34:40.130 --> 00:34:42.750
there's a couple ways we can do it within NiFi.

00:34:43.230 --> 00:34:45.570
This is the most optimal way.

00:34:46.730 --> 00:34:50.030
The hands-on exercise is going to be a little different.

00:34:50.430 --> 00:34:52.230
It's very similar to this.

00:34:52.810 --> 00:34:54.550
But the hands-on exercise,

00:34:54.850 --> 00:34:57.110
you can use a controller service.

00:34:57.170 --> 00:34:59.130
You can not use a controller service.

00:34:59.470 --> 00:35:03.670
There's processors to do the scenario that's next.

00:35:03.670 --> 00:35:08.930
And so, but for this, we need to set a schema metadata.

00:35:09.430 --> 00:35:11.950
That means as soon as we get this data,

00:35:12.730 --> 00:35:16.090
we need to bring it in

00:35:16.090 --> 00:35:18.490
and we're going to set this attribute.

00:35:18.710 --> 00:35:22.310
So we use the update attribute processor.

00:35:23.430 --> 00:35:24.970
And so when we get this data,

00:35:25.010 --> 00:35:26.930
it's going to show just like we did yesterday.

00:35:27.070 --> 00:35:29.050
Here's the file name, here's the file size,

00:35:29.090 --> 00:35:30.170
those types of things.

00:35:31.190 --> 00:35:32.490
What we're doing, though,

00:35:32.490 --> 00:35:36.210
is we're adding to that metadata for the flow file

00:35:36.210 --> 00:35:39.230
an attribute called schema.name

00:35:39.230 --> 00:35:41.690
with the value of inventory.

00:35:42.170 --> 00:35:45.770
So that is going to tell our controller service

00:35:46.310 --> 00:35:50.850
which schema to use to convert the CSV to JSON.

00:35:51.250 --> 00:35:54.570
I can have another schema that's called,

00:35:54.570 --> 00:35:58.650
you know, conversion or whatever.

00:35:59.450 --> 00:36:01.970
And so, you know, as I bring data in,

00:36:01.970 --> 00:36:04.530
I could filter and sort that data

00:36:04.530 --> 00:36:06.930
and depending on the type of data,

00:36:06.930 --> 00:36:10.490
I would assign it, you know, a schema attribute

00:36:10.490 --> 00:36:13.290
that, you know, could be, you know,

00:36:13.330 --> 00:36:15.090
whatever name I need it to be to match up.

00:36:16.010 --> 00:36:19.810
But for this one, schema.name is the property name.

00:36:20.330 --> 00:36:23.590
Inventory is the actual value.

00:36:24.170 --> 00:36:26.490
Now, schema.name is important here

00:36:26.490 --> 00:36:30.250
because NiFi is going to look for that attribute

00:36:30.810 --> 00:36:33.190
and we'll see that as we start setting up

00:36:33.190 --> 00:36:34.290
our controller services.

00:36:35.030 --> 00:36:36.470
So NiFi is going to look for this

00:36:36.470 --> 00:36:38.890
and it's going to ask, you know,

00:36:38.990 --> 00:36:40.250
the attribute, you know,

00:36:40.310 --> 00:36:42.770
what schema name you want me to apply

00:36:43.370 --> 00:36:45.330
and we're going to apply the inventory.
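
To make the attribute-driven lookup concrete, here is a small Python sketch. It models flow file attributes as a dictionary and the schema registry as a name-to-schema mapping. Both helpers and the registry contents are hypothetical stand-ins, not NiFi APIs.

```python
# Hypothetical in-memory stand-in for a schema registry: schema name -> schema text.
SCHEMA_REGISTRY = {
    "inventory": '{"type": "record", "name": "inventory", "fields": []}',
}


def update_attribute(attributes: dict, name: str, value: str) -> dict:
    """Return a copy of the flow file attributes with one attribute added."""
    updated = dict(attributes)
    updated[name] = value
    return updated


def resolve_schema(attributes: dict) -> str:
    """Look up the schema named by the schema.name attribute."""
    return SCHEMA_REGISTRY[attributes["schema.name"]]


attrs = {"filename": "inventory.csv"}
attrs = update_attribute(attrs, "schema.name", "inventory")
print(resolve_schema(attrs))
```

Routing a different kind of data would then only require setting `schema.name` to another name that exists in the registry.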

00:36:49.630 --> 00:36:50.790
So this process,

00:36:50.850 --> 00:36:53.230
the next processor in line is a ConvertRecord.

00:36:54.470 --> 00:36:57.410
And again, you can,

00:36:57.410 --> 00:37:01.110
if we were converting CSV to JSON,

00:37:01.270 --> 00:37:03.450
I could use the processor.

00:37:08.290 --> 00:37:11.350
There's an extract record.

00:37:12.090 --> 00:37:14.130
There's a validate record, update record,

00:37:14.290 --> 00:37:14.890
split record.

00:37:14.990 --> 00:37:16.010
I can route text.

00:37:16.110 --> 00:37:20.310
I can use regular expression to pull that CSV out.

00:37:20.830 --> 00:37:22.110
Those types of things.

00:37:23.590 --> 00:37:26.990
I can write, well, this one's writing attributes to CSV.

00:37:27.410 --> 00:37:30.250
So I can actually extract the CSV,

00:37:30.650 --> 00:37:34.470
write everything out from attribute to a JSON document.

00:37:35.970 --> 00:37:37.730
You know, there's a few different ways

00:37:37.730 --> 00:37:38.790
to see attributes to JSON.

00:37:39.490 --> 00:37:41.650
So, you know, imagine your data flow,

00:37:41.870 --> 00:37:43.890
you know, you're importing this CSV.

00:37:44.330 --> 00:37:46.050
You can extract it.

00:37:47.150 --> 00:37:49.290
There's many ways to extract that.

00:37:49.810 --> 00:37:51.110
Once you have it extracted,

00:37:51.230 --> 00:37:54.490
you can use a processor to write it back as JSON,

00:37:54.970 --> 00:37:56.390
you know, those types of things.

00:37:56.950 --> 00:38:01.850
This is one of those nuances that I go through,

00:38:01.850 --> 00:38:03.030
you know, pretty constantly,

00:38:03.510 --> 00:38:05.810
is, you know, there's many ways to skin a cat.

00:38:06.270 --> 00:38:09.510
The most optimal way of doing this

00:38:10.190 --> 00:38:12.170
is the method I'm using right now,

00:38:12.210 --> 00:38:14.670
where we are using controller services.

00:38:15.650 --> 00:38:19.750
That, you know, I feel like it's a lot easier as well,

00:38:19.750 --> 00:38:23.350
because I don't have to do regular expression

00:38:23.350 --> 00:38:26.910
to extract text, you know, and those types of things.

00:38:28.690 --> 00:38:32.010
So I'm using the Convert Record processor.

00:38:32.850 --> 00:38:34.470
We'll go ahead and configure this,

00:38:35.750 --> 00:38:44.710
and that way we can...

00:38:44.710 --> 00:38:47.730
And you can see here the Convert Record,

00:38:48.010 --> 00:38:50.610
converts records from one data format to another,

00:38:50.890 --> 00:38:52.670
and it uses a record reader

00:38:52.670 --> 00:38:55.090
and a record writer controller service.

00:38:56.090 --> 00:38:57.850
So what, you know, like I mentioned,

00:38:58.050 --> 00:39:02.370
we are going to use the record reader CSV service,

00:39:02.690 --> 00:39:06.030
and we're going to use the record writer JSON service

00:39:06.830 --> 00:39:07.710
to do this.

00:39:09.730 --> 00:39:13.730
But any time you can, you know,

00:39:13.830 --> 00:39:16.350
pull this up and look at the documentation,

00:39:17.250 --> 00:39:19.830
the allowed values in the record reader,

00:39:20.530 --> 00:39:23.590
you know, there's CSV reader, there's a JSON,

00:39:23.890 --> 00:39:27.370
there's an Excel, there's XML, Windows event log reader.

00:39:28.290 --> 00:39:29.910
A lot of times this is used for,

00:39:31.230 --> 00:39:33.890
you know, cybersecurity use cases where you're pulling in

00:39:33.890 --> 00:39:36.910
syslog, event log, you know, those types of things.

00:39:37.550 --> 00:39:39.270
As you can see, there's a syslog reader,

00:39:39.610 --> 00:39:42.630
multiple different formats for syslog.

00:39:43.190 --> 00:39:46.710
But for us, we are going to use the CSV reader.

00:39:46.710 --> 00:39:50.090
Now the writer, we have a JSON record set writer.

00:39:50.950 --> 00:39:54.930
We could read something in CSV or JSON

00:39:54.930 --> 00:39:57.190
and convert it to CSV if you wanted to.

00:39:58.070 --> 00:40:01.670
But for this case, we are going to bring it in.

00:40:01.970 --> 00:40:05.710
We are going to bring this CSV in and convert it.

00:40:06.310 --> 00:40:10.870
So for the Convert Record, we need a record reader.

00:40:11.490 --> 00:40:14.450
So for this case, we have the CSV reader.

00:40:15.250 --> 00:40:17.650
And I've already selected it,

00:40:18.910 --> 00:40:22.150
because I know that that's a CSV file coming in.

00:40:22.450 --> 00:40:24.650
Record writer is going to be the JSON record writer.

00:40:25.190 --> 00:40:28.850
I've already got it selected just because that's the

00:40:28.850 --> 00:40:31.810
value that I'm going to need to write that out.

00:40:33.070 --> 00:40:35.770
One of the things that you'll notice with a processor

00:40:35.770 --> 00:40:39.430
that does anything with controller services,

00:40:39.890 --> 00:40:43.270
you will have a little arrow right here on the far right

00:40:43.270 --> 00:40:45.830
to actually go to that controller service.

00:40:46.710 --> 00:40:49.790
So if you were to drag and drop a Convert Record,

00:40:51.390 --> 00:40:52.430
let's do that.

00:40:54.330 --> 00:40:56.710
If you were to drag and drop a Convert Record

00:40:56.710 --> 00:40:57.930
or any kind of record,

00:40:57.950 --> 00:41:00.170
let me actually use a different one.

00:41:15.250 --> 00:41:18.350
So I can select a new service.

00:41:18.570 --> 00:41:20.410
So I've already got a JSON record set,

00:41:20.410 --> 00:41:22.110
but I can create a new service.

00:41:23.150 --> 00:41:27.430
And here are the services that I have available.

00:41:27.790 --> 00:41:33.230
So if I want to do whatever, I can set that service.

00:41:33.870 --> 00:41:38.570
But any time you use a controller service,

00:41:39.010 --> 00:41:41.330
you're going to have that arrow.

00:41:41.330 --> 00:41:45.190
So you can actually go in and configure the service,

00:41:45.410 --> 00:41:48.090
because things are not working right.

00:41:49.230 --> 00:41:53.750
On these, we don't have any kind of services,

00:41:53.770 --> 00:41:55.310
so we're not going to get the arrow.

00:41:57.210 --> 00:41:58.610
So what I want to do is,

00:41:58.610 --> 00:42:00.730
because I'm going to take that CSV in,

00:42:00.890 --> 00:42:02.270
I'm going to read in CSV.

00:42:02.490 --> 00:42:04.150
I'm going to tell it to write it out as JSON.

00:42:04.630 --> 00:42:07.690
So I want to go to my CSV record reader service.

00:42:08.570 --> 00:42:11.490
And so now I'm bringing up my controller services.

00:42:11.750 --> 00:42:13.510
I have four listed.

00:42:13.630 --> 00:42:15.670
One's the Avro reader.

00:42:16.010 --> 00:42:20.990
One's an Avro schema registry for this flow,

00:42:21.250 --> 00:42:23.970
a CSV reader, and a JSON record set writer.

00:42:24.350 --> 00:42:26.610
So let's go to the CSV reader first

00:42:27.130 --> 00:42:29.430
and click the little gear icon,

00:42:29.470 --> 00:42:32.090
and I can configure the CSV reader.

00:42:33.690 --> 00:42:37.550
First thing you notice is the use schema name property.

00:42:37.690 --> 00:42:41.030
So if you remember the previous processor,

00:42:42.110 --> 00:42:47.730
we said, you know, set the attribute schema.name to inventory,

00:42:48.170 --> 00:42:52.030
because NiFi is going to look for schema.name.

00:42:52.490 --> 00:42:55.370
So for this, the schema access strategy

00:42:55.370 --> 00:42:59.450
is to use the schema name property that we set.

00:43:00.330 --> 00:43:02.390
The registry, you know,

00:43:02.470 --> 00:43:05.430
the schema registry is an Avro schema,

00:43:05.430 --> 00:43:09.910
so it needs to, usually within NiFi,

00:43:10.050 --> 00:43:14.510
it works with Avro, you know, formats.

00:43:15.010 --> 00:43:17.770
If you're not familiar with Avro,

00:43:18.670 --> 00:43:23.470
it is a serialization format for record data.

00:43:24.550 --> 00:43:27.930
It's used, you know, quite extensively,

00:43:28.790 --> 00:43:30.470
you know, throughout the community,

00:43:30.870 --> 00:43:34.770
but it's modeled all in JSON.

00:43:34.770 --> 00:43:39.570
So, you know, that way we have a schema that says,

00:43:39.850 --> 00:43:41.910
okay, well, I'm going to extract this CSV,

00:43:42.150 --> 00:43:43.990
I'm going to extract every column,

00:43:44.210 --> 00:43:48.410
but I need to know where to put these values

00:43:48.410 --> 00:43:50.630
that is going into this data.

00:43:51.550 --> 00:43:54.190
And so Avro is what we're using here.

00:43:55.110 --> 00:43:59.690
I do realize that, you know, this is a bit technical,

00:44:00.030 --> 00:44:02.650
and so, you know, just let me know

00:44:02.650 --> 00:44:03.790
if you have any questions.

00:44:05.410 --> 00:44:07.790
But for that controller, CSV controller service,

00:44:08.270 --> 00:44:11.150
we are going to use the schema.name property.

00:44:11.590 --> 00:44:13.730
We have a schema registry,

00:44:14.170 --> 00:44:16.530
and again, because we're now referencing

00:44:16.530 --> 00:44:18.210
another controller service,

00:44:18.230 --> 00:44:21.250
you should be able to see the arrow that goes to that.

00:44:22.270 --> 00:44:24.610
The schema name is, you know,

00:44:25.630 --> 00:44:28.790
this is how in NiFi you would

00:44:28.790 --> 00:44:30.230
read the schema name using

00:44:30.230 --> 00:44:32.790
the NiFi Expression Language.

00:44:33.750 --> 00:44:36.710
If you have any questions on the NiFi

00:44:42.150 --> 00:44:43.350
expression language,

00:44:46.110 --> 00:44:48.070
there's a whole guide.

00:44:49.130 --> 00:44:54.110
So, you know, there's a lot to this,

00:44:54.590 --> 00:44:55.430
and there's a lot,

00:44:55.530 --> 00:44:58.090
and so if you're familiar with any kind of regex

00:44:58.090 --> 00:44:59.430
or those types of things,

00:45:00.710 --> 00:45:02.770
this should look very familiar.

00:45:02.790 --> 00:45:06.870
If you're not, you know, we'll work through it,

00:45:07.750 --> 00:45:09.810
but NiFi has its own.

00:45:09.970 --> 00:45:11.870
It's based off of Java, of course,

00:45:12.330 --> 00:45:14.150
regular expression engine,

00:45:14.370 --> 00:45:17.510
and so this is how you reference

00:45:17.510 --> 00:45:19.110
the file name, for instance, right?

00:45:20.250 --> 00:45:23.870
And so, you know, I can call this in a property,

00:45:24.310 --> 00:45:27.370
and it will return the file name attribute.
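
As a rough illustration of what an attribute reference such as `${filename}` or `${schema.name}` does, the Python sketch below substitutes attribute values into a property string. It is a simplified simulation that only handles plain `${name}` references. The `evaluate` function is hypothetical and is not NiFi's implementation.

```python
import re

# Matches a plain ${name} reference; functions and nesting are deliberately not handled.
_REFERENCE = re.compile(r"\$\{([^}]+)\}")


def evaluate(expression: str, attributes: dict) -> str:
    """Replace each ${name} with the matching attribute value (empty string if missing)."""
    return _REFERENCE.sub(lambda m: attributes.get(m.group(1), ""), expression)


# Hypothetical flow file attributes.
attributes = {"filename": "inventory.csv", "schema.name": "inventory"}
print(evaluate("${filename}", attributes))
print(evaluate("${schema.name}", attributes))
```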

00:45:28.150 --> 00:45:30.170
So for this use case, though,

00:45:30.170 --> 00:45:32.910
we told it that it's going to use

00:45:32.910 --> 00:45:34.150
the schema name property.

00:45:34.510 --> 00:45:36.790
The name of the schema is schema.name

00:45:36.790 --> 00:45:40.010
that will match that update attribute.

00:45:40.450 --> 00:45:41.950
So if you noticed on the update attribute,

00:45:42.170 --> 00:45:45.110
it had schema.name and then it had inventory.

00:45:45.550 --> 00:45:47.890
So this tells, you know,

00:45:48.110 --> 00:45:51.370
the controller service which schema to use

00:45:51.370 --> 00:45:55.650
and which property should it look for.

00:45:56.530 --> 00:45:59.290
So that's where we get the schema.name.

00:45:59.930 --> 00:46:02.130
Some of the other things that are required,

00:46:02.550 --> 00:46:06.410
CSV parser, we're using the Apache Commons CSV parser.

00:46:07.290 --> 00:46:11.090
I think there's a Jackson CSV, Jackson JSON.

00:46:11.750 --> 00:46:14.270
I like the Apache commons just because I know it works.

00:46:16.550 --> 00:46:19.030
CSV format, if you have a custom format,

00:46:19.330 --> 00:46:21.110
you can do that.

00:46:21.170 --> 00:46:24.330
You can use Excel format, MySQL format.

00:46:24.730 --> 00:46:27.650
There's all kinds of formats that it can read from.

00:46:28.150 --> 00:46:29.610
Here we're going to put custom

00:46:29.610 --> 00:46:32.290
because the value separator is a comma.

00:46:32.830 --> 00:46:34.110
It's a regular JSON.

00:46:42.290 --> 00:46:44.350
It's a regular CSV, I mean, file.

00:46:44.730 --> 00:46:46.410
So it's got just, you know,

00:46:46.510 --> 00:46:49.090
your value, comma, your value, comma, your value.

00:46:52.710 --> 00:46:56.190
So we're going to set it to the value separator is a comma.

00:46:56.190 --> 00:47:00.350
The record separator is \n, that's just a new line.

00:47:02.050 --> 00:47:05.110
We want to treat the first line as a header.

00:47:05.830 --> 00:47:09.630
Actually, that's one of the bigger issues I have seen,

00:47:10.230 --> 00:47:15.450
you know, for learning this is folks will always forget

00:47:15.450 --> 00:47:17.650
to treat the first line as the header.

00:47:18.050 --> 00:47:23.670
And, you know, it'll throw data off because, you know,

00:47:24.170 --> 00:47:26.050
it's expecting, you may expect it,

00:47:26.050 --> 00:47:28.270
you may not expect to pull that data in,

00:47:28.410 --> 00:47:30.910
and then you've got the header as an actual data value

00:47:30.910 --> 00:47:33.930
in the JSON or the formatting will be off,

00:47:34.430 --> 00:47:35.690
you know, those types of things.

00:47:36.370 --> 00:47:39.170
So, you know, treat the first line as a header if you have it.
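
The header pitfall described above is easy to reproduce outside NiFi. This Python sketch reads the same CSV text twice, once treating the first line as a header and once not. The sample row and the `read_records` helper are hypothetical.

```python
import csv
import io

# Hypothetical sample: one header line and one data line.
SAMPLE = "store,item,quantity\nNorth,widget,12\n"
FIELDS = ["store", "item", "quantity"]


def read_records(text: str, first_line_is_header: bool):
    """Parse CSV text, optionally treating the first line as column names."""
    if first_line_is_header:
        reader = csv.DictReader(io.StringIO(text))
    else:
        # Column names are supplied separately, so line one is read as data.
        reader = csv.DictReader(io.StringIO(text), fieldnames=FIELDS)
    return [dict(row) for row in reader]


print(read_records(SAMPLE, first_line_is_header=True))
print(read_records(SAMPLE, first_line_is_header=False))
```

In the second call the header line comes back as a record whose values are the column names, which is the stray data value the speaker warns about.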

00:47:40.830 --> 00:47:42.130
And then, of course, you know,

00:47:42.130 --> 00:47:43.870
a lot of these are already kind of filled in.

00:47:44.070 --> 00:47:46.550
The quote character, the escape character,

00:47:47.170 --> 00:47:48.290
those types of things.

00:47:50.030 --> 00:47:51.890
So, oh, I don't think we'll get ahead of these.

00:47:52.290 --> 00:47:54.890
But anyway, so the one that we're looking for, though,

00:47:54.890 --> 00:47:57.610
is just saying that it's a comma separated,

00:47:58.050 --> 00:48:00.770
and the record separator is a new line, \n.

00:48:01.130 --> 00:48:04.530
We've got our Apache Commons CSV parser,

00:48:04.910 --> 00:48:07.230
and then we've got our, you know,

00:48:07.270 --> 00:48:09.170
we're updating it and telling it to use

00:48:09.170 --> 00:48:13.810
the schema.name property that we have already set

00:48:13.810 --> 00:48:15.590
in that previous processor.

00:48:16.290 --> 00:48:21.990
And so I can apply, let me go back into that.

00:48:22.930 --> 00:48:26.990
And now I've got that configured.

00:48:27.050 --> 00:48:32.150
Now I want to go to the Avro schema.

00:48:32.650 --> 00:48:40.970
So if you notice, I had the property value

00:48:40.970 --> 00:48:43.710
for schema.name set to inventory.

00:48:44.910 --> 00:48:49.750
And so, you know, it, it knows when it pulls it in,

00:48:49.750 --> 00:48:53.590
that controller knows that it is to use the schema.name

00:48:54.330 --> 00:48:58.550
as the model to, to convert this.

00:48:59.430 --> 00:49:02.730
And then the name of that model is inventory.

00:49:03.110 --> 00:49:06.870
So I can actually have multiple different Avro schemas

00:49:07.730 --> 00:49:11.390
here on, you know, if I were bringing in

00:49:11.390 --> 00:49:14.250
CSV data from multiple different sources,

00:49:14.630 --> 00:49:17.810
I can just set this up where, okay,

00:49:17.810 --> 00:49:20.550
well, it's going to all convert to the same format

00:49:20.550 --> 00:49:23.550
and how it gets converted and recognized,

00:49:23.570 --> 00:49:25.550
I can split that schema up.

00:49:26.170 --> 00:49:27.970
So, you know, I can set the attribute to,

00:49:28.810 --> 00:49:34.850
you know, store or, or price or whatever for the inventory.

00:49:39.710 --> 00:49:40.850
Let me expand this.

00:49:46.370 --> 00:49:48.850
So a lot of times when we're working in NiFi,

00:49:49.130 --> 00:49:52.010
you have a small box.

00:49:52.450 --> 00:49:55.550
And if you want, you know, a lot of these boxes,

00:49:55.550 --> 00:49:58.370
you can just drag and make them,

00:49:58.370 --> 00:50:00.590
like, easily readable.

00:50:01.330 --> 00:50:04.150
So here I have a basic schema.

00:50:04.710 --> 00:50:07.390
Again, I want to take all of this data.

00:50:07.710 --> 00:50:10.910
I want to read it in and I want to put it as a JSON.

00:50:11.470 --> 00:50:13.610
So I have store, item, and quantity.

00:50:14.470 --> 00:50:17.390
So if you notice, the type is record,

00:50:17.550 --> 00:50:20.570
the name is inventory, and here's the fields

00:50:20.570 --> 00:50:23.330
that are going to go into that JSON document.

00:50:24.830 --> 00:50:27.090
Store is the first one.

00:50:27.630 --> 00:50:30.330
Item is the second one, and quantity is the other one.

00:50:30.770 --> 00:50:34.010
And you see, it will match my data.
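
A schema like the one on screen might look like the JSON embedded in this Python sketch. The record type, the name inventory, and the three field names come from the walkthrough. The field types (string, string, int) are an assumption, since they are not read out.

```python
import json

# Field names follow the walkthrough; the field types are assumed for illustration.
AVRO_SCHEMA = """
{
  "type": "record",
  "name": "inventory",
  "fields": [
    {"name": "store", "type": "string"},
    {"name": "item", "type": "string"},
    {"name": "quantity", "type": "int"}
  ]
}
"""

schema = json.loads(AVRO_SCHEMA)
field_names = [field["name"] for field in schema["fields"]]

# The schema fields should line up with the CSV header columns.
csv_header = "store,item,quantity".split(",")
print(field_names == csv_header)
```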

00:50:35.870 --> 00:50:39.750
And what I can, if you, I'll make sure you have

00:50:39.750 --> 00:50:42.590
access to all of this when you're working

00:50:42.590 --> 00:50:45.890
on your scenario, as you may want to use it.

00:50:46.170 --> 00:50:48.870
I understand this may be the first time

00:50:48.870 --> 00:50:50.790
you've ever seen an Avro schema,

00:50:51.810 --> 00:50:53.330
but we'll work through it.

00:50:54.130 --> 00:50:56.490
So my schema is very easy.

00:50:56.730 --> 00:50:59.230
It's very, you know, it's a very simple schema.

00:50:59.490 --> 00:51:00.390
It's three fields.

00:51:00.910 --> 00:51:02.370
So I built my schema.

00:51:02.790 --> 00:51:05.590
I put that into the Avro schema registry.

00:51:06.330 --> 00:51:07.790
So, okay, apply.

00:51:09.030 --> 00:51:13.710
And now I've got this,

00:51:13.710 --> 00:51:17.650
the schema registry, and I can add different schemas.

00:51:17.970 --> 00:51:19.670
I can, I can do all kinds of things.

00:51:20.010 --> 00:51:22.890
But the nice thing is, is now I can just reference

00:51:22.890 --> 00:51:27.050
that schema name in all of my data flows,

00:51:27.210 --> 00:51:29.810
and I only have to configure this one time.

00:51:30.090 --> 00:51:32.850
And I only have to configure the schema one time.

00:51:33.210 --> 00:51:37.770
So it makes maintenance and reuse of those controller services

00:51:37.770 --> 00:51:39.290
a lot easier.

00:51:40.370 --> 00:51:42.610
So that was my CSV reader.

00:51:43.870 --> 00:51:45.530
It's going to read the CSV.

00:51:46.150 --> 00:51:49.950
It's going to use this schema to convert it to JSON

00:51:51.270 --> 00:51:53.730
and write a JSON document out.

00:51:54.250 --> 00:51:58.190
Now, my second step was writing it out as a JSON record.

00:51:59.730 --> 00:52:05.330
So the write strategy for this is to, you know,

00:52:05.330 --> 00:52:07.870
use the schema because it's already filled out.

00:52:07.890 --> 00:52:10.650
It already knows to extract that first column,

00:52:10.890 --> 00:52:14.190
put it into the JSON document, extract the second column,

00:52:14.210 --> 00:52:16.750
put it in, so forth and so on.

00:52:18.270 --> 00:52:21.490
So we're giving it, you know, here's the schema

00:52:21.490 --> 00:52:23.570
registry that we're using, right?

00:52:23.570 --> 00:52:24.890
It's the same one that we were using

00:52:24.890 --> 00:52:26.250
from the CSV document.

00:52:26.790 --> 00:52:29.890
Here's the name of the schema, you know,

00:52:30.010 --> 00:52:32.770
and a couple of, you know, pretty JSON,

00:52:33.210 --> 00:52:34.550
just so it looks nicer.

00:52:35.550 --> 00:52:38.650
And some other, you know, is there any kind of compression

00:52:38.650 --> 00:52:42.250
or suppress null values, those types of things.

00:52:42.850 --> 00:52:45.130
So this is the controller service

00:52:45.130 --> 00:52:47.570
to write that JSON document.

00:52:48.790 --> 00:52:50.290
So what we're going to do is apply that.

00:52:51.890 --> 00:52:55.910
And so what happens is now we've got these

00:52:55.910 --> 00:52:57.670
controller services configured.

00:52:58.630 --> 00:53:03.210
We have our, you know, schema registry right here

00:53:03.210 --> 00:53:04.970
that we've already worked off of.

00:53:05.990 --> 00:53:11.950
We also have the Avro reader.

00:53:12.130 --> 00:53:15.210
So when you use the Avro schema registry,

00:53:15.670 --> 00:53:18.830
it'll automatically add an Avro reader

00:53:18.830 --> 00:53:22.870
because it just needs to read that schema

00:53:22.870 --> 00:53:26.350
in an Avro format to be able to write

00:53:26.350 --> 00:53:28.070
that JSON document.

00:53:30.170 --> 00:53:33.750
And so to get this working, though,

00:53:34.470 --> 00:53:38.430
we're still, you know, some of the controller services

00:53:38.430 --> 00:53:39.250
are disabled.

00:53:40.250 --> 00:53:42.530
So the other services are not going to work.

00:53:42.930 --> 00:53:44.570
You can see that, you know, it's just like

00:53:44.570 --> 00:53:46.130
when we're working with a processor.

00:53:46.950 --> 00:53:50.450
You can highlight over the yellow yield icon

00:53:50.450 --> 00:53:54.750
and it'll tell you why it's not working properly.

00:53:54.850 --> 00:53:58.450
So for this, it's because the schema registry

00:53:59.490 --> 00:54:01.790
is invalid because the controller service

00:54:01.790 --> 00:54:03.190
is disabled.

00:54:03.190 --> 00:54:04.430
Same here.

00:54:04.610 --> 00:54:08.050
So after you get through with working

00:54:08.050 --> 00:54:11.310
on your controller services and you've done

00:54:11.310 --> 00:54:14.490
your configuration, you need to enable them.

00:54:14.550 --> 00:54:17.610
So what I like to do is click the little lightning bolt

00:54:17.610 --> 00:54:21.210
and it starts to enable that service.

00:54:21.950 --> 00:54:25.490
I can select service only or I can select service

00:54:25.490 --> 00:54:27.310
and the referencing components.

00:54:28.010 --> 00:54:31.710
So, you know, because the, you know,

00:54:31.710 --> 00:54:34.310
processor references this service,

00:54:34.550 --> 00:54:37.550
I can enable this service and it will enable

00:54:37.550 --> 00:54:39.790
that processor as well.

00:54:41.690 --> 00:54:43.270
So I want to do service only.

00:54:43.670 --> 00:54:46.130
The reason being is I want to check my data flow

00:54:46.130 --> 00:54:47.850
to make sure everything looks good

00:54:47.850 --> 00:54:49.870
before I turn everything on.

00:54:51.530 --> 00:54:52.970
So I've got the green check box.

00:54:53.270 --> 00:54:54.710
I enabled that service.

00:54:54.910 --> 00:54:55.870
It's up and running.

00:54:56.830 --> 00:54:58.350
You see the state is enabled.

00:55:00.230 --> 00:55:00.910
Same thing here.

00:55:00.910 --> 00:55:03.290
I'm going to enable just the service here.

00:55:03.350 --> 00:55:05.510
It will let me know the referencing services.

00:55:05.970 --> 00:55:08.230
You know, it's got the CSV record reader,

00:55:08.730 --> 00:55:10.610
the JSON record writer,

00:55:10.610 --> 00:55:13.490
and you can actually see the processors too.

00:55:13.950 --> 00:55:16.450
Convert CSV to JSON, convert CSV to JSON.

00:55:17.390 --> 00:55:18.990
So it's that same processor,

00:55:19.830 --> 00:55:22.750
but, you know, you can see the services

00:55:22.750 --> 00:55:24.070
and processors.

00:55:25.230 --> 00:55:26.830
That way you can make a decision

00:55:26.830 --> 00:55:29.070
if you want to enable just the controller service

00:55:29.070 --> 00:55:30.890
or everything that goes with it.

00:55:31.950 --> 00:55:33.490
So we close that.

00:55:33.710 --> 00:55:36.270
Now our CSV record reader and writer

00:55:36.270 --> 00:55:38.870
is no longer in a, you know, yield state,

00:55:39.270 --> 00:55:40.910
but it is disabled.

00:55:41.270 --> 00:55:43.550
So now everything is good.

00:55:43.790 --> 00:55:44.790
It's good to go.

00:55:44.870 --> 00:55:46.610
We just need to enable it.

00:55:47.050 --> 00:55:49.610
So with those first two services enabled,

00:55:51.790 --> 00:55:57.490
we can enable our other two services.

00:56:00.890 --> 00:56:03.150
And now everything is enabled.

00:56:03.490 --> 00:56:05.690
So we can just exit our controller service

00:56:05.690 --> 00:56:11.070
and now our convert CSV to JSON is stopped.

00:56:11.370 --> 00:56:13.010
It's no longer yellow.

00:56:13.510 --> 00:56:15.550
It's got all the services configured

00:56:16.410 --> 00:56:17.750
and so forth and so on.

00:56:18.350 --> 00:56:20.790
So now we're getting the file,

00:56:21.830 --> 00:56:23.450
picking up from inventory.

00:56:24.210 --> 00:56:26.070
We're just updating the attribute

00:56:26.070 --> 00:56:29.970
to set a property name of schema.name

00:56:29.970 --> 00:56:31.210
to inventory.

00:56:32.510 --> 00:56:33.810
That way that attribute,

00:56:34.190 --> 00:56:35.890
when it gets to this processor,

00:56:36.450 --> 00:56:37.930
that attribute is set.

00:56:38.150 --> 00:56:40.170
And so that processor is going to look,

00:56:40.290 --> 00:56:41.670
based upon what we told it

00:56:41.670 --> 00:56:43.010
with the controller services,

00:56:43.610 --> 00:56:46.970
it's going to look for the schema.name property

00:56:46.970 --> 00:56:49.010
and then it's going to use that value

00:56:49.650 --> 00:56:51.650
to decide which schema to use.
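
The lookup described here can be sketched in Python. This is a minimal illustration, not NiFi's implementation: the `SCHEMAS` table and the `resolve_schema` helper are hypothetical stand-ins for a schema registry controller service. Only the `schema.name` attribute, its `inventory` value, and the store/item/quantity fields come from the session.

```python
# Hypothetical stand-in for a schema registry: schema name -> field names.
SCHEMAS = {
    "inventory": ["store", "item", "quantity"],
}


def resolve_schema(attributes, registry=SCHEMAS):
    """Return the schema named by the flow file's schema.name attribute."""
    name = attributes.get("schema.name")
    if name is None:
        raise KeyError("flow file has no schema.name attribute")
    if name not in registry:
        raise KeyError(f"no schema registered under {name!r}")
    return registry[name]


# Attributes as they might look after the update attribute step.
flowfile_attributes = {"filename": "inventory.csv", "schema.name": "inventory"}
print(resolve_schema(flowfile_attributes))
```

A flow file that reaches the converter without `schema.name` fails the lookup, which is why the attribute is set before the convert step.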

00:56:52.610 --> 00:56:55.990
So anyway, so we have CSV to JSON working now.

00:56:57.230 --> 00:57:00.650
We need, you know, this is a new document.

00:57:01.630 --> 00:57:05.170
So we are going to need to name this document.

00:57:05.570 --> 00:57:07.970
And all we're doing is an update attribute.

00:57:08.710 --> 00:57:10.650
For this, it's very easy.

00:57:11.450 --> 00:57:14.650
As I mentioned, the file name attribute

00:57:17.930 --> 00:57:20.750
is right here in the NiFi Expression Language.

00:57:21.830 --> 00:57:24.250
This is how we reference the file name.

00:57:24.250 --> 00:57:27.970
So in this scenario, we're just saying,

00:57:28.210 --> 00:57:32.170
okay, get the file name, add .json to the end.

00:57:32.770 --> 00:57:36.330
So it's going to look at that file name attribute

00:57:36.910 --> 00:57:38.430
and it's going to say,

00:57:38.710 --> 00:57:40.730
okay, I've got the file name attribute.

00:57:41.150 --> 00:57:42.670
All I need to do is name it

00:57:42.670 --> 00:57:45.410
with that name plus .json.

00:57:49.410 --> 00:57:50.530
Apply that.

00:57:51.970 --> 00:57:55.190
And then, of course, write the file back to a directory.

00:57:55.950 --> 00:58:00.050
So for this, I want to go back.

00:58:08.130 --> 00:58:13.150
And I want to say inventory JSON.

00:58:20.950 --> 00:58:24.010
Okay. So if we've done this right,

00:58:25.150 --> 00:58:27.730
we will be able to pick a CSV file up,

00:58:29.510 --> 00:58:32.330
set the schema, convert it to JSON,

00:58:33.330 --> 00:58:35.470
set the name of the JSON document,

00:58:35.530 --> 00:58:37.090
and then write it to file.

00:58:37.710 --> 00:58:40.130
So let's run this one time and see how it goes.

00:58:48.550 --> 00:58:49.130
All right.

00:58:49.130 --> 00:58:52.410
So we have our CSV in our queue.

00:58:54.030 --> 00:58:56.930
We can actually look at the attributes.

00:59:00.450 --> 00:59:02.230
And you see file name attribute

00:59:02.230 --> 00:59:04.230
with the file name of inventory.

00:59:04.970 --> 00:59:08.910
But what we don't see is the schema.name attribute.

00:59:10.170 --> 00:59:12.750
The reason we don't see the schema.name attribute

00:59:12.750 --> 00:59:14.970
is because it hasn't gone to that processor yet.

00:59:15.670 --> 00:59:18.650
This one has a viewer. You know,

00:59:18.650 --> 00:59:19.730
like I mentioned earlier,

00:59:20.070 --> 00:59:22.770
if it's a zip file like we were working with yesterday,

00:59:23.250 --> 00:59:25.070
you will not have a viewer.

00:59:26.090 --> 00:59:29.870
But, you know, a CSV file, JSON, XML,

00:59:30.390 --> 00:59:32.650
text-based, you know, data,

00:59:32.770 --> 00:59:33.890
you're going to have a viewer.

00:59:35.530 --> 00:59:36.810
So let's exit that.

00:59:39.370 --> 00:59:41.250
That's what the data looks like.

00:59:42.390 --> 00:59:43.950
And then I can look at the provenance,

00:59:44.250 --> 00:59:46.590
which we will get to real soon.

00:59:47.010 --> 00:59:48.630
So run that one time.

00:59:48.650 --> 00:59:50.590
I'm going to run this once.

00:59:51.010 --> 00:59:53.770
Only one thing should change.

00:59:53.770 --> 00:59:56.010
It should be exactly the same data,

00:59:58.010 --> 01:00:02.030
except I should have a new attribute called schema.name.

01:00:04.250 --> 01:00:04.490
Right.

01:00:05.050 --> 01:00:08.710
So now I have a new attribute called schema.name

01:00:08.710 --> 01:00:11.730
and with the value of inventory.

01:00:14.830 --> 01:00:19.130
So now it's ready to go into the actual convert record.

01:00:20.490 --> 01:00:25.110
And so, again, you know, it's going to read

01:00:25.910 --> 01:00:28.210
the flow file as CSV

01:00:28.830 --> 01:00:32.290
and it's going to write the flow file as JSON.

01:00:33.430 --> 01:00:34.850
Because it's reading as CSV,

01:00:35.330 --> 01:00:37.650
we have a record service, you know,

01:00:37.770 --> 01:00:39.690
you can go back to the controller services.

01:00:39.870 --> 01:00:41.650
You see there's a CSV reader.

01:00:41.730 --> 01:00:43.310
The JSON record writer.

01:00:44.370 --> 01:00:46.050
We went through and configured these.

01:00:46.450 --> 01:00:48.190
So we should be good to go.

01:00:49.130 --> 01:00:50.610
So it's going to run once.

01:00:56.970 --> 01:00:57.290
All right.

01:00:57.630 --> 01:00:58.230
Success.

01:00:59.990 --> 01:01:00.530
List the queue.

01:01:05.870 --> 01:01:07.210
Still the same attributes.

01:01:07.570 --> 01:01:10.650
File name is inventory.csv.

01:01:10.650 --> 01:01:13.770
You'll notice that it detected a MIME type of...

01:01:14.410 --> 01:01:18.670
A lot of processors will automatically try to detect MIME types.

01:01:19.570 --> 01:01:21.450
And so this one's application/json now.

01:01:23.850 --> 01:01:27.810
And, you know, same type of details with file name

01:01:27.810 --> 01:01:29.870
and modified date and those types of things.

01:01:30.470 --> 01:01:32.110
We do have a different file size

01:01:32.110 --> 01:01:34.250
because we went from CSV to JSON.

01:01:35.230 --> 01:01:39.150
And actually now we can view the JSON document.

01:01:39.150 --> 01:01:46.750
So, you know, now it took all of those CSV file records

01:01:46.750 --> 01:01:49.790
and it wrote it out as JSON.

01:01:50.370 --> 01:01:53.070
And if you remember from our Avro schema,

01:01:53.730 --> 01:01:55.710
we had store, item, and quantity.

01:01:56.130 --> 01:01:58.190
And so it just followed that pattern

01:01:58.190 --> 01:02:00.170
and started writing out the JSON.

01:02:03.090 --> 01:02:05.110
We want to say, okay, you know,

01:02:05.110 --> 01:02:07.690
the problem with this is the file name.

01:02:09.690 --> 01:02:13.290
It's probably still inventory.csv.

01:02:15.110 --> 01:02:15.430
Yep.

01:02:16.470 --> 01:02:17.470
So when it tries...

01:02:17.470 --> 01:02:20.630
If we were to try to write this right now,

01:02:20.750 --> 01:02:22.550
it would be a JSON document,

01:02:22.890 --> 01:02:26.370
but it'd be written as a file name inventory.csv.

01:02:29.330 --> 01:02:31.470
So we're going to do another update attribute

01:02:31.470 --> 01:02:37.970
where we told it I want to take the file name,

01:02:37.970 --> 01:02:39.710
which is this expression,

01:02:40.390 --> 01:02:42.710
and I want to save it, this file,

01:02:42.830 --> 01:02:45.290
as the file name .json.

01:02:49.630 --> 01:02:50.530
So you run once.

01:02:56.490 --> 01:02:57.350
Look at our queue.

01:02:59.250 --> 01:03:02.350
So our file name, you can already see here,

01:03:02.770 --> 01:03:06.610
but our file name is now inventory.csv.json.

01:03:07.970 --> 01:03:10.070
If you are really fancy,

01:03:10.490 --> 01:03:15.290
you could go in and strip the CSV name off,

01:03:15.470 --> 01:03:18.030
the .csv off, and put .json.

01:03:18.590 --> 01:03:23.330
To keep this as simple and straightforward as possible,

01:03:23.670 --> 01:03:26.670
I'm just adding a .json extension.
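
The two naming options can be sketched in Python. In NiFi Expression Language the simple version is `${filename}.json`, and the fancier one could be written with `substringBeforeLast`, for example `${filename:substringBeforeLast('.')}.json`. The helper names below are illustrative only and are not part of NiFi.

```python
def append_json(filename: str) -> str:
    """Mirror ${filename}.json: keep the old name and add .json."""
    return f"{filename}.json"


def replace_extension(filename: str) -> str:
    """Mirror ${filename:substringBeforeLast('.')}.json."""
    stem, dot, _ = filename.rpartition(".")
    # With no dot in the name, keep the whole name unchanged before adding .json.
    return f"{stem if dot else filename}.json"


print(append_json("inventory.csv"))        # inventory.csv.json
print(replace_extension("inventory.csv"))  # inventory.json
```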

01:03:31.150 --> 01:03:35.330
And then the last step is to write this JSON file to...

01:03:37.970 --> 01:03:39.190
A directory.

01:03:39.530 --> 01:03:42.610
So that should be here.

01:03:49.110 --> 01:03:50.210
And there it is.

01:03:53.930 --> 01:03:59.730
So now I'm taking this inventory from a CSV to a JSON,

01:04:00.730 --> 01:04:03.330
even the bad data row that has nulls.

01:04:08.670 --> 01:04:12.730
So anyway, you'll notice that the

01:04:13.310 --> 01:04:17.490
bad data row did not have any commas.

01:04:18.130 --> 01:04:19.970
It didn't conform.

01:04:20.490 --> 01:04:25.430
So when it wrote it, it just applied null values to the others.

01:04:26.270 --> 01:04:30.050
So if you had a process where you were checking for null values,

01:04:30.350 --> 01:04:32.370
you would throw that record out,

01:04:33.110 --> 01:04:35.910
and you may send it to another processor

01:04:36.870 --> 01:04:40.430
to do ETL steps or whatever.

01:04:41.110 --> 01:04:45.790
But we've taken our CSV and we've made it a JSON document.
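
A rough Python sketch of the same idea, under stated assumptions: it uses the standard `csv` and `json` modules rather than NiFi's record reader and writer, the sample rows are invented, and every value is kept as a string. It shows how a row with no commas ends up with null values for the remaining fields, and how a later step could separate those records out.

```python
import csv
import io
import json

FIELDS = ["store", "item", "quantity"]  # field names from the schema


def csv_to_records(text, fields=FIELDS, has_header=True):
    """Parse CSV text into dicts, padding short rows with None."""
    rows = list(csv.reader(io.StringIO(text)))
    if has_header:
        rows = rows[1:]
    records = []
    for row in rows:
        if not row:
            continue  # skip blank lines
        padded = row[:len(fields)] + [None] * (len(fields) - len(row))
        records.append(dict(zip(fields, padded)))
    return records


def split_on_nulls(records):
    """Separate complete records from those with any null field."""
    good = [r for r in records if all(v is not None for v in r.values())]
    bad = [r for r in records if any(v is None for v in r.values())]
    return good, bad


# Invented sample data: one valid row and one row with no commas.
sample = "store,item,quantity\nNorth,widget,4\nbad data row\n"
records = csv_to_records(sample)
good, bad = split_on_nulls(records)
print(json.dumps(records, indent=2))
```

In a real flow, the `bad` list corresponds to the records you would route to a separate processor for cleanup.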

01:04:46.290 --> 01:04:50.610
So I'm going to pause there because that is a lot to ingest

01:04:50.610 --> 01:04:52.750
in the last hour, a little over an hour.

01:04:53.490 --> 01:04:56.270
What questions do you all have?

01:05:00.670 --> 01:05:03.030
So are we going to go through a process of

01:05:03.810 --> 01:05:06.190
doing the controllers ourselves, right?

01:05:06.270 --> 01:05:08.970
Because obviously for us to follow along,

01:05:09.030 --> 01:05:09.970
we would have to set those up.

01:05:11.570 --> 01:05:12.970
And I saw how you configured them,

01:05:13.010 --> 01:05:14.850
but I'm not entirely sure how you added them.

01:05:16.450 --> 01:05:21.150
Yeah, so the scenario is you're going to add these

01:05:22.750 --> 01:05:26.530
and start building them in and I'll help you along the way.

01:05:26.610 --> 01:05:28.050
But it's going to be...

01:05:28.050 --> 01:05:31.330
It's very hands-on for the next part to do this.

01:05:33.710 --> 01:05:36.270
And for the scenario though,

01:05:37.110 --> 01:05:40.850
if you want to use the record service, you can.

01:05:42.370 --> 01:05:46.830
The scenario on purpose is set up to be able to be...

01:05:46.830 --> 01:05:51.750
You can use multiple different processors to do this.

01:05:52.290 --> 01:05:56.930
What I'm looking for in the scenario is that thought process

01:05:56.930 --> 01:05:59.710
of: here's what my data flow should look like.

01:05:59.710 --> 01:06:04.110
You're probably going to have technical questions,

01:06:04.550 --> 01:06:06.270
you're going to have technical issues.

01:06:06.670 --> 01:06:09.330
But what I'm looking for in the next scenario

01:06:09.330 --> 01:06:14.210
is I kind of thought this whole data flow through

01:06:14.210 --> 01:06:16.310
and I've kind of got it built out

01:06:16.310 --> 01:06:18.750
because you can build the whole data flow

01:06:18.750 --> 01:06:20.550
without turning it on.

01:06:21.070 --> 01:06:24.550
You may have missing relationships or something,

01:06:24.550 --> 01:06:27.370
but you can still build that whole data flow.

01:06:27.370 --> 01:06:30.090
And what we'll do is kind of walk through it.

01:06:30.090 --> 01:06:32.450
And then where you need technical help,

01:06:32.790 --> 01:06:37.630
I'm going to help you with whatever way

01:06:37.630 --> 01:06:39.050
you're designing your flow.

01:06:39.370 --> 01:06:41.130
Because you may say,

01:06:41.130 --> 01:06:42.950
I don't want to do it with the record service.

01:06:43.470 --> 01:06:53.510
I want to extract text, for instance.

01:06:54.070 --> 01:07:03.410
And I want to use an extract text processor.

01:07:03.910 --> 01:07:06.830
So I can manually extract the text,

01:07:07.230 --> 01:07:08.470
those types of things.

01:07:09.390 --> 01:07:12.130
You may want to use a record reader and record writer,

01:07:12.130 --> 01:07:12.830
like I did.

01:07:14.230 --> 01:07:16.690
There's a few different ways.

01:07:17.690 --> 01:07:20.450
And so what I'm looking for in the scenario

01:07:20.450 --> 01:07:24.850
is just thinking through how I want to accomplish

01:07:24.850 --> 01:07:25.950
what I'm trying to do.

01:07:27.250 --> 01:07:31.850
And then if you come back to me during the scenario

01:07:31.850 --> 01:07:34.990
and say, hey, I want to use this record writer.

01:07:35.210 --> 01:07:36.530
I want to use this record reader.

01:07:37.590 --> 01:07:40.430
I need help configuring it or help with the schema.

01:07:40.970 --> 01:07:42.410
We can do that.

01:07:42.530 --> 01:07:44.930
Or I want to use extract JSON.

01:07:45.270 --> 01:07:46.530
And how do I do that?

01:07:48.070 --> 01:07:49.610
There's a couple of different ways

01:07:49.610 --> 01:07:52.750
because the last class, for instance,

01:07:53.610 --> 01:07:56.070
they spent quite a bit of time on that scenario.

01:07:56.610 --> 01:08:00.970
And there were five or six different ways

01:08:00.970 --> 01:08:02.250
people were doing it.

01:08:03.110 --> 01:08:05.690
Now, not everyone, I think only one person

01:08:05.690 --> 01:08:07.510
actually completed this scenario.

01:08:09.010 --> 01:08:12.410
But they got all of their processors down

01:08:13.090 --> 01:08:14.190
on the canvas.

01:08:14.550 --> 01:08:16.650
They got most of them configured.

01:08:17.190 --> 01:08:19.430
They applied the labels.

01:08:19.430 --> 01:08:22.850
They applied the naming convention

01:08:22.850 --> 01:08:24.330
and those types of things.

01:08:25.630 --> 01:08:27.870
And then what I did is go through

01:08:27.870 --> 01:08:31.190
and help them finish out the building of that.

01:08:32.030 --> 01:08:36.730
So hopefully that will help in the scenario.

01:08:38.670 --> 01:08:41.510
But again, if you run into any struggles

01:08:41.510 --> 01:08:43.490
or anything else, I'm right here

01:08:43.490 --> 01:08:44.850
during the scenario as well.

01:08:45.070 --> 01:08:46.730
And we'll just talk through it.

01:08:46.770 --> 01:08:47.750
Does that make sense?

01:08:51.530 --> 01:08:53.350
I think Richard, did you ask that question?

01:08:54.270 --> 01:08:54.370
Yep.

01:08:54.410 --> 01:08:54.690
Okay.

01:08:54.910 --> 01:08:55.270
Okay.

01:08:55.270 --> 01:08:55.590
Perfect.

01:08:55.610 --> 01:08:55.910
Perfect.

01:08:59.710 --> 01:09:03.550
So again, I know that's a lot to ingest.

01:09:04.370 --> 01:09:08.490
And we can build data flows

01:09:08.490 --> 01:09:11.010
and some basic data flows.

01:09:11.330 --> 01:09:14.050
But I really wanted us to try to get

01:09:14.050 --> 01:09:17.530
through some of the more advanced

01:09:17.530 --> 01:09:18.550
ways of doing this.

01:09:20.130 --> 01:09:21.830
And it's a little...

01:09:23.430 --> 01:09:25.370
It's a lot to learn real quickly.

01:09:25.870 --> 01:09:28.750
But that's why we have a few hours now

01:09:28.750 --> 01:09:31.110
to work through a scenario

01:09:32.650 --> 01:09:34.730
and help along the way.

01:09:37.590 --> 01:09:39.490
So I think I went over

01:09:39.490 --> 01:09:42.610
what a controller service is about.

01:09:44.190 --> 01:09:47.150
I know that there are still some points

01:09:47.150 --> 01:09:48.590
to learn and things like that.

01:09:48.590 --> 01:09:50.990
But we can get through it.

01:09:51.830 --> 01:09:53.690
But that is a controller service.

01:09:54.710 --> 01:09:56.470
That controller service, again,

01:09:56.690 --> 01:09:58.210
it was just a record reader

01:09:58.210 --> 01:09:59.430
and a record writer.

01:10:00.110 --> 01:10:01.770
It was reading in CSV

01:10:01.770 --> 01:10:03.670
and writing out JSON.

01:10:04.110 --> 01:10:04.990
If I wanted to,

01:10:06.610 --> 01:10:09.570
I could bring in more CSV files.

01:10:10.330 --> 01:10:11.070
I could have set...

01:10:12.190 --> 01:10:14.390
I could set an attribute

01:10:14.390 --> 01:10:18.530
based upon a source of the file.

01:10:19.150 --> 01:10:20.430
So that way I can say,

01:10:20.790 --> 01:10:23.690
well, everything I get from direction A

01:10:23.690 --> 01:10:24.870
gets this schema.

01:10:25.430 --> 01:10:26.850
Everything from direction B

01:10:26.850 --> 01:10:27.870
gets another schema.

01:10:29.690 --> 01:10:31.270
They're all CSVs.

01:10:31.770 --> 01:10:33.410
So that's how you can handle it.
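
One way to picture this is a lookup from source directory to schema name. The Python sketch below is illustrative only: the directory paths, the `orders` schema name, and the `tag_with_schema` helper are hypothetical, and it assumes the flow file carries an `absolute.path` attribute naming the directory it was picked up from.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from source directory to schema name.
SCHEMA_BY_SOURCE = {
    "/data/source_a": "inventory",
    "/data/source_b": "orders",
}


def tag_with_schema(attributes, mapping=SCHEMA_BY_SOURCE):
    """Return a copy of the attributes with schema.name set from the source."""
    # PurePosixPath drops a trailing slash so the lookup key is normalized.
    source = str(PurePosixPath(attributes["absolute.path"]))
    tagged = dict(attributes)
    tagged["schema.name"] = mapping[source]
    return tagged


print(tag_with_schema({"filename": "a.csv", "absolute.path": "/data/source_a/"}))
```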

01:10:33.550 --> 01:10:35.170
So that's the beauty

01:10:35.170 --> 01:10:36.730
of those controller services

01:10:36.730 --> 01:10:39.930
is I've now built them and set them.

01:10:40.190 --> 01:10:43.030
And so now everyone can use that service.

01:10:43.250 --> 01:10:45.490
If that was a database, again,

01:10:45.810 --> 01:10:47.930
everyone would be able to use that service.

01:10:48.490 --> 01:10:50.830
We're going to install

01:10:50.830 --> 01:10:52.130
NiFi Registry,

01:10:52.930 --> 01:10:54.790
which handles

01:10:54.790 --> 01:10:56.810
our version control of data flows.

01:10:57.810 --> 01:10:59.570
It's got its own service.

01:11:00.430 --> 01:11:02.050
And so once we set it, though,

01:11:02.310 --> 01:11:03.450
we all get to use it.

01:11:04.110 --> 01:11:05.470
So I wanted to make sure

01:11:05.470 --> 01:11:06.950
I went over service

01:11:06.950 --> 01:11:08.570
and those types of things

01:11:08.570 --> 01:11:12.110
because I think they are very valuable

01:11:12.110 --> 01:11:15.770
and they're a very important part of NiFi.

01:11:16.410 --> 01:11:17.530
So with that being said,

01:11:19.070 --> 01:11:21.250
are there any general questions

01:11:21.250 --> 01:11:24.130
about services I can answer right now?

01:11:24.270 --> 01:11:30.350
And then we can go a little bit

01:11:30.350 --> 01:11:31.910
into provenance, take a break,

01:11:31.990 --> 01:11:33.690
and come back and work on scenarios.

01:11:35.150 --> 01:11:37.110
Any controller service questions?

01:11:37.850 --> 01:11:39.770
What exactly are the controller services

01:11:39.770 --> 01:11:42.030
for, like, specifically?

01:11:42.990 --> 01:11:46.350
Yeah, so a controller service

01:11:46.350 --> 01:11:49.790
is a shared service in NiFi.

01:11:50.410 --> 01:11:52.630
So I could have

01:11:52.630 --> 01:11:55.430
a database controller service.

01:11:56.830 --> 01:11:58.830
And if I am building

01:11:58.830 --> 01:12:03.070
and that database controller service

01:12:03.070 --> 01:12:04.310
is already installed,

01:12:04.450 --> 01:12:05.790
it's running, it's good to go.

01:12:06.150 --> 01:12:08.390
And I can be a separate data engineer.

01:12:08.730 --> 01:12:09.550
I can come in

01:12:09.550 --> 01:12:11.950
and I need to make a database connection

01:12:12.030 --> 01:12:15.770
because I need to put data

01:12:15.770 --> 01:12:17.090
to a SQL server.

01:12:17.490 --> 01:12:18.990
So instead of me building in

01:12:20.110 --> 01:12:22.250
the connection details,

01:12:23.450 --> 01:12:25.710
where the database is,

01:12:26.010 --> 01:12:27.430
the username, password,

01:12:28.110 --> 01:12:30.530
the tables, all of those things,

01:12:30.850 --> 01:12:32.950
if you have a controller service

01:12:32.950 --> 01:12:34.210
set up already,

01:12:35.050 --> 01:12:36.090
multiple different users

01:12:36.090 --> 01:12:37.950
can use that same service

01:12:39.150 --> 01:12:40.890
in their data flows.

01:12:40.890 --> 01:12:45.090
So I could then

01:12:45.090 --> 01:12:47.230
use that database connection service

01:12:47.230 --> 01:12:48.450
and I could write the data

01:12:48.450 --> 01:12:49.170
to the database.

01:12:49.690 --> 01:12:51.270
But as a sys admin,

01:12:51.570 --> 01:12:53.270
you never had to give me the username,

01:12:53.490 --> 01:12:56.990
the password, those types of things.

01:12:57.310 --> 01:12:58.810
Plus, you're able to control

01:12:59.610 --> 01:13:01.690
who gets to use that connection

01:13:01.690 --> 01:13:04.830
and who did use that connection

01:13:04.830 --> 01:13:06.530
with all the provenance information.
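
The shared-service idea can be sketched in plain Python. This is a conceptual model, not NiFi code: the `DatabaseService` class, the connection URL, the account names, and the user names are all hypothetical. The point is that the credentials live in one place, callers only ask for a connection, and the service can both restrict and record who used it.

```python
class DatabaseService:
    """Hypothetical shared service: holds credentials, hands out connections."""

    def __init__(self, url, username, password, allowed_users):
        self._url = url
        self._username = username
        self._password = password  # never exposed to the callers
        self._allowed = set(allowed_users)
        self.usage_log = []        # who used the connection, in order

    def connect(self, user):
        """Return a connection handle if the user may use this service."""
        if user not in self._allowed:
            raise PermissionError(f"{user} may not use this service")
        self.usage_log.append(user)
        # A real service would return a pooled database connection here.
        return f"connection to {self._url}"


service = DatabaseService(
    "jdbc:sqlserver://db.example:1433", "svc_account", "secret", ["alice", "bob"]
)
print(service.connect("alice"))
```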

01:13:07.930 --> 01:13:09.810
Did that kind of help answer it, or do

01:13:09.810 --> 01:13:12.050
you still have some additional questions about it?

01:13:13.590 --> 01:13:14.650
Yeah, that made it more clear.

01:13:15.410 --> 01:13:15.970
Yeah.

01:13:16.730 --> 01:13:19.490
And so that's why I led off

01:13:19.490 --> 01:13:20.690
with this this morning is

01:13:20.690 --> 01:13:22.970
I know this is probably

01:13:22.970 --> 01:13:24.130
one of the trickiest,

01:13:24.870 --> 01:13:27.230
harder-to-learn aspects of NiFi.

01:13:27.850 --> 01:13:30.930
So I wanted to give us plenty of time

01:13:30.930 --> 01:13:32.290
to kind of go through this.

01:13:34.950 --> 01:13:36.190
Any other questions?

01:13:42.010 --> 01:13:44.890
This is going to be kind of not dumb,

01:13:44.910 --> 01:13:46.690
but maybe just more aesthetics.

01:13:46.810 --> 01:13:51.690
How did you get those relationships

01:13:52.310 --> 01:13:53.750
out to the side like that

01:13:53.750 --> 01:13:54.790
where the arrow goes?

01:13:55.890 --> 01:13:56.250
Yeah.

01:13:56.250 --> 01:13:57.210
I mean, like, yeah,

01:13:57.210 --> 01:13:58.150
when I try to do it,

01:13:58.270 --> 01:13:59.790
I don't know how. I'd like to know

01:13:59.790 --> 01:14:00.250
how you did that.

01:14:00.550 --> 01:14:01.390
No, no.

01:14:01.550 --> 01:14:03.050
So if you remember yesterday,

01:14:03.290 --> 01:14:04.410
I said you can click.

01:14:04.770 --> 01:14:05.950
So let me do this.

01:14:05.950 --> 01:14:07.390
Let me take a processor

01:14:08.570 --> 01:14:09.690
and let me walk you there.

01:14:09.690 --> 01:14:14.030
Let me just do a get on something.

01:14:16.730 --> 01:14:17.670
I'm asking because I like

01:14:17.670 --> 01:14:18.950
I like the way that looks.

01:14:19.690 --> 01:14:21.610
No, having to spread the.

01:14:24.390 --> 01:14:27.230
So I'm going to do a GetSolr.

01:14:31.450 --> 01:14:33.730
So I'm going to put two flows

01:14:33.730 --> 01:14:35.770
to get two blocks together right quick.

01:14:38.230 --> 01:14:40.450
So you see this.

01:14:40.830 --> 01:14:42.250
If I double click the line.

01:14:59.750 --> 01:15:00.710
Like that.

01:15:02.050 --> 01:15:03.310
That's exactly what I was trying to do.

01:15:03.450 --> 01:15:04.350
I could not get that to work.

01:15:04.350 --> 01:15:07.190
So click right off of the box.

01:15:09.310 --> 01:15:10.530
And let me do it again.

01:15:10.890 --> 01:15:14.890
So if you know it and then

01:15:14.890 --> 01:15:15.510
all these machines,

01:15:15.630 --> 01:15:16.470
it can be a little.

01:15:17.890 --> 01:15:19.830
And then right above that box,

01:15:19.890 --> 01:15:20.590
I just double click

01:15:21.230 --> 01:15:23.690
and I get my point

01:15:23.690 --> 01:15:24.990
and then I can adjust it.

01:15:26.870 --> 01:15:29.510
OK, OK, that's what I was trying to do.

01:15:29.570 --> 01:15:30.550
OK, thank you.

01:15:30.710 --> 01:15:33.930
Yeah, no, it makes it so much more presentable.

01:15:34.490 --> 01:15:36.590
Better presentation, cleaner.

01:15:37.010 --> 01:15:37.550
Those types of things.

01:15:37.790 --> 01:15:39.230
Yeah, yeah, I like it.

01:15:39.290 --> 01:15:41.130
Thank you.

01:15:41.130 --> 01:15:41.990
Great question.

01:15:42.150 --> 01:15:43.670
Actually, I get that question.

01:15:44.350 --> 01:15:47.290
There's really not a lot

01:15:47.290 --> 01:15:51.210
of documentation on some of the

01:15:51.210 --> 01:15:54.430
beautification of flows

01:15:54.430 --> 01:15:56.870
and working, you know,

01:15:56.970 --> 01:15:58.370
clicking and doing things.

01:15:58.470 --> 01:16:00.730
And they're always introducing more,

01:16:00.730 --> 01:16:03.930
but they're minor features

01:16:03.930 --> 01:16:06.670
that don't really get publicized.

01:16:07.550 --> 01:16:10.170
So a lot of it is just trying to,

01:16:10.170 --> 01:16:11.490
you know, googling around

01:16:11.490 --> 01:16:13.750
or having the experience to work with it.

01:16:13.950 --> 01:16:15.990
And once you start getting this,

01:16:16.350 --> 01:16:18.890
then you can, you know,

01:16:18.930 --> 01:16:20.630
you see I've got adjusted lines

01:16:20.630 --> 01:16:21.850
here that look different.

01:16:21.950 --> 01:16:22.730
I've got them here.

01:16:23.710 --> 01:16:25.990
I've actually got them all color coded

01:16:26.490 --> 01:16:28.690
just because it's easier to read.

01:16:28.690 --> 01:16:31.410
I have labels on every one,

01:16:33.650 --> 01:16:34.890
you know, to kind of give you

01:16:34.890 --> 01:16:36.270
that visual explanation.

01:16:36.970 --> 01:16:39.970
When you are building this out

01:16:39.970 --> 01:16:42.030
and in your environment,

01:16:42.450 --> 01:16:44.490
you know, you may have a policy

01:16:44.490 --> 01:16:46.470
that says here's some basic design

01:16:46.470 --> 01:16:47.810
principles that you need to follow.

01:16:49.570 --> 01:16:52.050
You know, software engineering teams

01:16:52.050 --> 01:16:53.990
when I led software

01:16:53.990 --> 01:16:55.710
engineering teams previously,

01:16:56.210 --> 01:16:57.890
we had to comment our code, right?

01:16:57.890 --> 01:17:00.010
And we had to make sure we had comments

01:17:00.010 --> 01:17:01.570
in and those types of things.

01:17:02.270 --> 01:17:03.850
So, you know, same thing here.

01:17:04.430 --> 01:17:06.410
You know, you may have a policy

01:17:06.410 --> 01:17:07.630
that says, you know,

01:17:07.750 --> 01:17:08.570
all these data flows,

01:17:08.710 --> 01:17:09.610
you need to label them

01:17:09.610 --> 01:17:10.690
and they need to make sense,

01:17:11.650 --> 01:17:13.210
you know, those types of things.

01:17:13.990 --> 01:17:15.190
And, you know, beautified

01:17:15.190 --> 01:17:18.130
because you can have a spider web

01:17:18.130 --> 01:17:19.390
of processors.

01:17:19.790 --> 01:17:21.330
And today we will have

01:17:21.330 --> 01:17:22.810
a spider web of processors.

01:17:23.430 --> 01:17:24.250
Like, you know,

01:17:24.270 --> 01:17:25.610
when we're working through this scenario,

01:17:25.830 --> 01:17:27.670
it's going to be processors

01:17:27.670 --> 01:17:28.530
all over the place.

01:17:29.090 --> 01:17:30.890
So, you know, just do the best you can

01:17:30.890 --> 01:17:32.330
and then come back behind

01:17:32.330 --> 01:17:33.530
and just clean up.

01:17:33.930 --> 01:17:35.310
That's usually the best thing

01:17:35.310 --> 01:17:37.370
that I like to do.

01:17:37.810 --> 01:17:39.910
All right, I got it.

01:17:40.010 --> 01:17:41.310
But that was tricky, man.

01:17:41.430 --> 01:17:42.110
I'm like, oh, yes.

01:17:44.890 --> 01:17:47.610
It's easier if you are running this

01:17:47.610 --> 01:17:49.450
on your local machine

01:17:50.450 --> 01:17:51.970
because the latency,

01:17:52.850 --> 01:17:54.750
you know, like you can click

01:17:54.750 --> 01:17:55.890
and then it won't drag

01:17:55.890 --> 01:17:57.390
and you drag too far.

01:17:57.670 --> 01:17:59.650
It never drags at all.

01:17:59.650 --> 01:18:01.850
And I've already got a pop-up

01:18:01.850 --> 01:18:04.750
that told me about latency once already.

01:18:05.230 --> 01:18:06.330
Okay, but great question.

01:18:06.950 --> 01:18:08.170
Any other questions?

01:18:10.750 --> 01:18:11.470
Okay.

01:18:12.370 --> 01:18:14.090
We have a few minutes before break.

01:18:14.690 --> 01:18:16.610
Before we go into the scenario,

01:18:17.310 --> 01:18:19.730
now that we've sent data through,

01:18:20.430 --> 01:18:23.350
let's look at our data provenance.

01:18:24.490 --> 01:18:25.750
So if you remember

01:18:26.570 --> 01:18:27.710
from the hamburger menu,

01:18:27.770 --> 01:18:28.990
you can actually pull down

01:18:28.990 --> 01:18:31.990
your data provenance events.

01:18:32.950 --> 01:18:35.370
So, you know, the component name.

01:18:36.150 --> 01:18:37.930
Here's the actual processor.

01:18:38.230 --> 01:18:39.670
Again, another reason

01:18:39.670 --> 01:18:40.930
to name your component

01:18:40.930 --> 01:18:42.170
something easy to read.

01:18:43.110 --> 01:18:44.970
So you can see, you know,

01:18:46.070 --> 01:18:47.710
all the data provenance events.

01:18:48.510 --> 01:18:50.750
If you have a good, you know,

01:18:50.930 --> 01:18:52.650
good readable name

01:18:52.650 --> 01:18:54.550
you can sort and filter

01:18:54.550 --> 01:18:55.830
and things like that.

01:18:55.850 --> 01:18:57.070
A lot easier.

01:18:57.230 --> 01:18:58.970
You can actually search for events

01:18:59.610 --> 01:19:00.850
and those types of things.

01:19:02.210 --> 01:19:03.530
So let's look at

01:19:04.990 --> 01:19:08.830
the JSON file to a directory.

01:19:09.210 --> 01:19:10.950
So when you click this,

01:19:11.350 --> 01:19:13.090
it's going to compute the lineage.

01:19:14.230 --> 01:19:16.330
And you can actually then replay

01:19:16.330 --> 01:19:19.630
this data actually

01:19:19.630 --> 01:19:21.210
going through that data flow.

01:19:21.570 --> 01:19:25.630
So I received it from this

01:19:25.630 --> 01:19:26.470
provenance event.

01:19:27.050 --> 01:19:27.690
I received, you know,

01:19:27.810 --> 01:19:29.630
it was get CSV file from directory.

01:19:30.350 --> 01:19:32.770
I can actually look at the attributes.

01:19:33.510 --> 01:19:34.930
I can look at the content.

01:19:35.390 --> 01:19:37.770
And this is when I received it.

01:19:38.110 --> 01:19:39.510
So my next step

01:19:39.510 --> 01:19:44.350
in that provenance event was

01:19:51.090 --> 01:19:52.890
download the content

01:19:53.470 --> 01:19:55.470
from the processor.

01:19:56.270 --> 01:19:57.950
It received it.

01:19:57.950 --> 01:19:58.990
It downloaded it.

01:19:59.670 --> 01:20:01.090
Here's what the content

01:20:01.090 --> 01:20:02.730
looked like after that event.

01:20:04.850 --> 01:20:06.510
And here's what happened.

01:20:06.570 --> 01:20:07.770
Here's the attributes

01:20:07.770 --> 01:20:08.750
and those types of things.

01:20:14.910 --> 01:20:16.350
Modify the attribute.

01:20:17.070 --> 01:20:17.970
So if you remember

01:20:17.970 --> 01:20:19.550
the next step was to set

01:20:19.550 --> 01:20:20.390
the schema name.

01:20:21.370 --> 01:20:22.910
So here is that

01:20:22.910 --> 01:20:25.210
during that whole data flow

01:20:25.210 --> 01:20:27.050
here is, you know,

01:20:27.110 --> 01:20:28.570
it did an update attribute.

01:20:29.650 --> 01:20:31.250
I can look at the attributes,

01:20:31.530 --> 01:20:35.750
set the schema name as inventory.

01:20:36.650 --> 01:20:38.230
And I can look at the content.

01:20:38.350 --> 01:20:39.810
The content should still be the same

01:20:39.810 --> 01:20:41.490
because it's still a CSV file.

01:20:42.510 --> 01:20:44.170
But if anything changes,

01:20:44.670 --> 01:20:46.910
I can replay that and see it.

01:20:55.770 --> 01:20:58.530
Then the content was modified

01:20:58.530 --> 01:21:00.170
using the convert record.

01:21:01.370 --> 01:21:02.610
Convert CSV to JSON.

01:21:03.230 --> 01:21:06.330
I can see now that, you know,

01:21:07.230 --> 01:21:09.970
it came in as CSV.

01:21:10.310 --> 01:21:13.370
It comes out as a JSON document.

01:21:13.750 --> 01:21:15.630
I can look at the content now.

01:21:16.930 --> 01:21:18.390
Here's the input claim.

01:21:18.870 --> 01:21:20.150
Here's the output claim.

01:21:20.570 --> 01:21:24.030
So input was CSV.

01:21:28.170 --> 01:21:29.590
Output JSON.

01:21:30.410 --> 01:21:31.990
So that one processor

01:21:31.990 --> 01:21:33.290
took that document

01:21:33.290 --> 01:21:35.110
from a CSV to a JSON.

01:21:35.530 --> 01:21:37.050
I was able to replay this

01:21:37.990 --> 01:21:41.110
and see exactly

01:21:41.110 --> 01:21:42.990
like what changed,

01:21:43.030 --> 01:21:44.270
how it changed

01:21:44.270 --> 01:21:46.010
within that single processor.
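
A minimal sketch, in plain Python rather than NiFi, of the kind of change that conversion step makes to the content: CSV with a header row goes in, a JSON array of records comes out. The column names and rows are hypothetical, and this is not how ConvertRecord is implemented. With no schema applied, every value stays a string.

```python
import csv
import io
import json


def csv_to_json(csv_text: str) -> str:
    """Turn CSV text with a header row into a JSON array of objects."""
    reader = csv.DictReader(io.StringIO(csv_text))
    records = [dict(row) for row in reader]
    return json.dumps(records, indent=2)


# Hypothetical inventory rows; the real file from the class is not shown here.
sample = "item,quantity\nwidget,4\ngadget,7\n"
print(csv_to_json(sample))
```

Comparing the input string with the printed output is the same before-and-after view the input claim and output claim give you in the provenance event.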

01:21:53.490 --> 01:21:56.570
It received that JSON document.

01:21:58.530 --> 01:22:00.330
So it's a download event

01:22:00.330 --> 01:22:01.770
because it downloaded it

01:22:01.770 --> 01:22:02.890
from that processor.

01:22:04.070 --> 01:22:07.310
But now I have a JSON document

01:22:07.310 --> 01:22:09.030
that it has downloaded.

01:22:09.690 --> 01:22:11.630
I think of download as like

01:22:11.630 --> 01:22:13.950
the download is like the connection.

01:22:14.390 --> 01:22:15.450
So, you know,

01:22:15.450 --> 01:22:17.070
connection was to receive

01:22:17.070 --> 01:22:18.690
from processor to processor.

01:22:19.110 --> 01:22:20.610
So you have a download event.

01:22:28.430 --> 01:22:31.350
This should be the set JSON file name.

01:22:32.110 --> 01:22:34.270
So, you know, in that flow

01:22:34.270 --> 01:22:35.570
we set the file name

01:22:35.570 --> 01:22:36.790
of the JSON document.

01:22:37.850 --> 01:22:39.250
And we look at the attributes.

01:22:40.470 --> 01:22:43.470
We now have inventory.csv.json.
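
As a rough illustration of that attribute change, here is a small Python sketch that treats the attributes as a dictionary and appends ".json" to the file name. The `schema.name` key and the helper name are assumptions made for the example, not taken from the flow itself. Only the attributes change; the content is not touched.

```python
def set_json_filename(attributes: dict) -> dict:
    """Return a copy of the attributes with '.json' appended to the file name."""
    updated = dict(attributes)
    updated["filename"] = attributes["filename"] + ".json"
    return updated


# Hypothetical attribute set for the flow file before the rename.
before = {"filename": "inventory.csv", "schema.name": "inventory"}
after = set_json_filename(before)
print(after["filename"])  # inventory.csv.json
```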

01:22:48.650 --> 01:22:52.030
And then after it had that JSON

01:22:52.030 --> 01:22:53.290
it received it

01:22:53.290 --> 01:22:55.570
and wrote the JSON file to directory.

01:22:56.090 --> 01:22:57.130
Here's where it put it.

01:22:58.410 --> 01:23:00.550
As well as here's the attributes.

01:23:01.130 --> 01:23:02.310
Here's the content.

01:23:03.370 --> 01:23:06.310
You know, the input of 745 bytes.

01:23:06.410 --> 01:23:07.930
Output 745 bytes.

01:23:09.070 --> 01:23:09.890
Same identifier.

01:23:10.250 --> 01:23:10.910
All set.

01:23:11.410 --> 01:23:12.510
You know, that type.

01:23:12.550 --> 01:23:13.670
And then it dropped

01:23:13.670 --> 01:23:15.470
that data flow.

01:23:15.650 --> 01:23:16.490
So it was done.

01:23:16.990 --> 01:23:22.790
So the whole event duration was .006 seconds.

01:23:26.850 --> 01:23:28.310
And you can see, you know,

01:23:28.370 --> 01:23:31.430
the final content, you know,

01:23:31.590 --> 01:23:33.510
replay that, those types of things.

01:23:34.210 --> 01:23:35.970
So one of the other nice things

01:23:35.970 --> 01:23:37.630
is you can actually download

01:23:37.630 --> 01:23:38.910
the lineage if you want.

01:23:40.310 --> 01:23:40.890
I've really

01:23:42.170 --> 01:23:45.370
I've never seen a lot of use for this.

01:23:46.110 --> 01:23:48.250
But, you know, if you were working

01:23:49.490 --> 01:23:52.050
for the Centers for Medicare & Medicaid Services

01:23:52.050 --> 01:23:53.910
a while back, I was helping them

01:23:53.910 --> 01:23:54.770
for a while.

01:23:55.150 --> 01:23:57.550
And I worked in the fraud,

01:23:58.550 --> 01:24:01.050
you know, division of CMS

01:24:01.050 --> 01:24:03.910
where, you know, we would need

01:24:03.910 --> 01:24:06.890
to turn over a chain of custody

01:24:06.890 --> 01:24:09.690
for data that we had received.

01:24:09.690 --> 01:24:13.530
And, you know, to FBI and others

01:24:13.530 --> 01:24:15.970
to prosecute, you know,

01:24:16.330 --> 01:24:17.170
like false claims

01:24:17.170 --> 01:24:18.570
and those types of things.

01:24:19.230 --> 01:24:20.610
And in doing that, you know,

01:24:20.650 --> 01:24:21.790
we would have to download

01:24:21.790 --> 01:24:24.210
some of this lineage information.

01:24:24.730 --> 01:24:26.770
Now, when you click here

01:24:26.770 --> 01:24:27.590
to download lineage,

01:24:27.970 --> 01:24:29.210
you know, it's just basically

01:24:29.210 --> 01:24:32.050
giving you an image of what happened.

01:24:32.390 --> 01:24:35.010
I don't really find that useful.

01:24:37.030 --> 01:24:39.670
You know, but you

01:24:39.670 --> 01:24:41.710
were extracting, you can take

01:24:41.710 --> 01:24:42.970
all of these events

01:24:43.670 --> 01:24:45.350
and send them to either

01:24:45.350 --> 01:24:46.590
like corporate governance

01:24:47.190 --> 01:24:48.870
for long term storage

01:24:50.130 --> 01:24:51.970
or, you know, extract them

01:24:51.970 --> 01:24:53.950
out of the provenance events

01:24:53.950 --> 01:24:54.510
and notify.

01:24:55.210 --> 01:24:56.910
That's how we would usually

01:24:56.910 --> 01:24:57.770
handle those.

01:24:58.890 --> 01:25:00.310
But I don't know, like,

01:25:00.310 --> 01:25:01.810
they have the image here.

01:25:02.170 --> 01:25:03.530
I just don't think

01:25:03.530 --> 01:25:04.610
it's that useful.

01:25:05.390 --> 01:25:06.390
You know, you may.

01:25:07.250 --> 01:25:09.570
So, you know, you do have

01:25:09.570 --> 01:25:14.370
capability, you can, you know,

01:25:14.470 --> 01:25:16.830
you can go strictly to that event

01:25:16.830 --> 01:25:18.930
and pull all of the things

01:25:18.930 --> 01:25:21.090
that happened, you know,

01:25:21.110 --> 01:25:22.170
those types of things.

01:25:22.830 --> 01:25:24.990
I've never had a use for the image,

01:25:25.010 --> 01:25:27.210
but you may have a use,

01:25:27.350 --> 01:25:29.170
but it's there in case you need it.

01:25:29.390 --> 01:25:31.430
So, you know, the beauty of this

01:25:31.430 --> 01:25:32.370
is we went through

01:25:33.190 --> 01:25:34.770
on the data provenance from

01:25:35.810 --> 01:25:38.870
start to finish with that data flow.

01:25:38.870 --> 01:25:41.610
We can look at exactly

01:25:41.610 --> 01:25:42.870
when we received it.

01:25:43.050 --> 01:25:44.650
We can see exactly

01:25:44.650 --> 01:25:45.790
when it was converted.

01:25:46.250 --> 01:25:47.210
We can see the before

01:25:47.210 --> 01:25:48.470
and after of that.

01:25:48.810 --> 01:25:52.110
We can, you know, see it move along

01:25:52.110 --> 01:25:53.670
from processor to processor

01:25:53.670 --> 01:25:56.730
to processor until, you know,

01:25:56.970 --> 01:25:59.250
until it's out of NiFi.

01:26:00.310 --> 01:26:03.110
The nice thing is, you know,

01:26:03.530 --> 01:26:04.610
that's built in.

01:26:05.190 --> 01:26:07.030
You can go and replay these events.

01:26:07.030 --> 01:26:09.670
We use it a lot for diagnosing

01:26:10.390 --> 01:26:11.710
data flow issues,

01:26:11.950 --> 01:26:13.590
because if you can imagine,

01:26:14.110 --> 01:26:17.070
you may have a valid data flow

01:26:17.070 --> 01:26:19.590
that is, you know, picking data up,

01:26:20.270 --> 01:26:21.670
you know, putting it somewhere,

01:26:21.890 --> 01:26:23.570
but, you know, you still run

01:26:23.570 --> 01:26:24.710
into an issue with,

01:26:25.650 --> 01:26:27.350
you know, processor.

01:26:28.110 --> 01:26:29.530
I've seen a processor

01:26:29.530 --> 01:26:32.490
handle characters incorrectly before,

01:26:32.910 --> 01:26:33.950
and we just, you know,

01:26:33.970 --> 01:26:36.250
but it never, never had an error.

01:26:37.030 --> 01:26:38.350
And so when you look

01:26:38.350 --> 01:26:39.710
at the data provenance events,

01:26:40.210 --> 01:26:41.370
you can actually go through

01:26:41.370 --> 01:26:42.830
and replay that and say,

01:26:42.830 --> 01:26:43.350
hey, wait a minute,

01:26:43.470 --> 01:26:46.690
that processor is malfunctioning.

01:26:46.970 --> 01:26:48.710
It's not reporting any errors,

01:26:48.710 --> 01:26:50.430
but we're seeing weird things

01:26:50.430 --> 01:26:51.270
within the data.
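
One way to picture that kind of diagnosis is the Python sketch below. It walks a list of lineage steps, compares the content going in and coming out of each one, and flags any step that changed the content when it was not supposed to. The step names, the sample data, and the `LineageStep` type are all hypothetical; in NiFi you would do this comparison by hand in the provenance view.

```python
from dataclasses import dataclass


@dataclass
class LineageStep:
    component: str          # processor name as shown on the canvas
    input_content: bytes    # content going into the step
    output_content: bytes   # content coming out of the step


def unexpected_changes(steps, expected_modifiers):
    """Names of steps that altered content without being expected to."""
    return [
        s.component
        for s in steps
        if s.input_content != s.output_content
        and s.component not in expected_modifiers
    ]


# Hypothetical content: one record with a non-ASCII character in it.
csv_in = "name\ncafé\n".encode("utf-8")
json_out = '[{"name": "café"}]'.encode("utf-8")
mangled = json_out.replace("é".encode("utf-8"), b"?")

steps = [
    LineageStep("Set Schema Name", csv_in, csv_in),
    LineageStep("Convert CSV to JSON", csv_in, json_out),
    LineageStep("Set JSON File Name", json_out, mangled),
]
suspects = unexpected_changes(steps, expected_modifiers={"Convert CSV to JSON"})
print(suspects)
```

No step raised an error here, but the last one quietly replaced a character, which is the situation described above.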

01:26:52.290 --> 01:26:53.810
So, you know, there are some

01:26:53.810 --> 01:26:55.110
use cases for that as well.

01:26:56.250 --> 01:26:58.250
Provenance events for investigations,

01:26:58.410 --> 01:26:59.670
those types of things,

01:27:00.250 --> 01:27:02.730
you know, and, you know,

01:27:02.730 --> 01:27:04.670
just to provide that chain of custody

01:27:04.670 --> 01:27:05.910
to the data.

01:27:05.910 --> 01:27:09.370
So we went over provenance

01:27:09.370 --> 01:27:10.690
yesterday a little bit.

01:27:11.150 --> 01:27:12.070
We're going to touch on

01:27:12.070 --> 01:27:14.210
some of these things as we go along,

01:27:14.810 --> 01:27:16.710
but, you know, your flow

01:27:16.710 --> 01:27:18.510
from yesterday should have generated

01:27:18.510 --> 01:27:19.970
provenance events as well.

01:27:20.410 --> 01:27:22.190
And so if you want to pull up,

01:27:22.190 --> 01:27:24.190
you know, your flow and run it,

01:27:24.390 --> 01:27:26.890
you can see provenance events as well.

01:27:27.670 --> 01:27:29.730
So, you know, that's where you access it.

01:27:29.910 --> 01:27:31.770
We talked about it yesterday a little bit.

01:27:32.090 --> 01:27:32.970
We're going to talk about it

01:27:32.970 --> 01:27:33.990
a little bit today

01:27:34.470 --> 01:27:35.750
and those types of things.

01:27:36.370 --> 01:27:39.430
So, and then, you know, feel free to

01:27:42.210 --> 01:27:43.470
explore the menu.

01:27:43.810 --> 01:27:45.030
You break something,

01:27:45.130 --> 01:27:47.050
the beauty is I'm here to fix it.

01:27:47.270 --> 01:27:48.390
And so look at your flow

01:27:48.390 --> 01:27:49.570
configuration history.

01:27:50.650 --> 01:27:56.530
Look at the node status history.

01:27:57.030 --> 01:27:59.570
So, you know, here's how much

01:27:59.570 --> 01:28:00.870
free heap space.

01:28:01.750 --> 01:28:03.410
Again, we're on a Windows box

01:28:04.370 --> 01:28:07.350
and using Java, so it's going

01:28:07.350 --> 01:28:08.130
to be all over the place.

01:28:13.650 --> 01:28:16.290
Number of flow file repository

01:28:16.290 --> 01:28:17.250
free space, right?

01:28:18.830 --> 01:28:21.070
And so feel free to go through,

01:28:21.410 --> 01:28:23.570
look at all these, you know,

01:28:23.650 --> 01:28:26.350
things here because, you know,

01:28:26.470 --> 01:28:29.490
from I know we have some

01:28:31.190 --> 01:28:32.910
sysadmins on the call.

01:28:32.910 --> 01:28:36.150
Tom, you know, you may be interested

01:28:36.150 --> 01:28:38.410
in some of these metrics.

01:28:39.370 --> 01:28:41.430
Now, you know, there's a controller

01:28:41.430 --> 01:28:42.950
service for Prometheus.

01:28:43.370 --> 01:28:44.730
So, you can actually have all

01:28:44.730 --> 01:28:47.150
of these events going out to Prometheus

01:28:47.150 --> 01:28:49.290
using your Grafana dashboard.

01:28:49.710 --> 01:28:51.470
You can see some of the same thing.

01:28:53.170 --> 01:28:56.910
You know, you can, you know,

01:28:56.990 --> 01:28:59.710
you can send these provenance events off.

01:28:59.710 --> 01:29:01.710
You can send the status

01:29:01.710 --> 01:29:04.730
and all of those metrics off as well.

01:29:05.930 --> 01:29:07.630
So, you know, it's here if you need it.

01:29:08.410 --> 01:29:10.710
If you're processing a large

01:29:10.710 --> 01:29:13.050
amount of files, you will

01:29:13.050 --> 01:29:14.390
definitely look at this because

01:29:14.390 --> 01:29:17.470
some processors consume

01:29:17.470 --> 01:29:18.970
a lot of resources.

01:29:20.170 --> 01:29:21.290
We were working with one

01:29:21.290 --> 01:29:23.630
of the bigger resource hogs

01:29:23.630 --> 01:29:24.990
of the whole system yesterday.

01:29:25.450 --> 01:29:27.050
The unpacking and packing

01:29:27.050 --> 01:29:29.490
of zip files, you know,

01:29:29.490 --> 01:29:32.150
can be very, if you can imagine,

01:29:32.470 --> 01:29:33.890
we've had folks, I've seen folks

01:29:33.890 --> 01:29:35.730
like trying to unzip and zip

01:29:35.730 --> 01:29:37.450
5 to 10 gig files.

01:29:38.570 --> 01:29:40.150
And so, you know, it's trying

01:29:40.150 --> 01:29:42.290
to throw all of that data

01:29:42.290 --> 01:29:45.230
into memory, trying to unzip it,

01:29:45.510 --> 01:29:47.210
trying to then take, you know,

01:29:47.370 --> 01:29:48.970
5 gig that turned into 20.

01:29:49.530 --> 01:29:50.630
And, you know, their system

01:29:50.630 --> 01:29:51.690
crashes, right?

01:29:51.890 --> 01:29:54.050
And so, there's smart ways to do this.
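
As one hedged example of a smarter way, the Python sketch below unzips an archive by streaming each entry to disk in fixed-size chunks, so memory use stays near the chunk size no matter how large the entries are. This is a general technique shown with the standard library, not a description of how NiFi's processors work internally; the function name and the 1 MiB default are assumptions.

```python
import shutil
import zipfile
from pathlib import Path


def unzip_streaming(zip_path, dest_dir, chunk_size=1024 * 1024):
    """Extract every file in the archive, copying chunk_size bytes at a time."""
    dest = Path(dest_dir)
    written = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            target = dest / info.filename
            # Refuse entries whose path would land outside dest_dir.
            if not target.resolve().is_relative_to(dest.resolve()):
                raise ValueError(f"unsafe path in archive: {info.filename}")
            target.parent.mkdir(parents=True, exist_ok=True)
            # archive.open() returns a file-like object that decompresses
            # as it is read, so the entry is never fully held in memory.
            with archive.open(info) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst, chunk_size)
            written.append(target)
    return written
```

Calling `unzip_streaming("big.zip", "out")` would write the entries under `out/`; both names are placeholders. The sketch needs Python 3.9 or later for `Path.is_relative_to`.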

01:29:56.490 --> 01:29:57.930
And so, you know, you just got

01:29:57.930 --> 01:29:58.790
to work through it.

01:29:58.790 --> 01:30:01.530
As a sysadmin, I find the status

01:30:01.530 --> 01:30:04.630
history has a lot of good information.

01:30:06.090 --> 01:30:06.890
And then we went through

01:30:06.890 --> 01:30:08.950
some of the other, you know,

01:30:08.950 --> 01:30:09.230
already.

01:30:09.430 --> 01:30:11.030
But feel free to click around

01:30:13.490 --> 01:30:16.370
and, you know, just explore the UI.

01:30:17.090 --> 01:30:18.770
What we will do is take

01:30:18.770 --> 01:30:20.550
our first break of the day

01:30:20.550 --> 01:30:22.290
and then when we come back,

01:30:22.810 --> 01:30:24.050
we're going to start working

01:30:24.050 --> 01:30:25.110
on our scenario.

01:30:25.710 --> 01:30:27.590
I do expect this scenario

01:30:27.590 --> 01:30:28.710
to take a while.

01:30:29.170 --> 01:30:30.810
And you're going to think

01:30:30.810 --> 01:30:33.750
that we went from easy data flow

01:30:33.750 --> 01:30:35.650
to going over the deep end.

01:30:36.270 --> 01:30:37.350
But I'm here to help.

01:30:37.970 --> 01:30:39.270
I'm here to walk through it

01:30:39.270 --> 01:30:40.630
with you, talk through it.

01:30:42.330 --> 01:30:44.410
I can see everyone's screen.

01:30:45.010 --> 01:30:46.490
So, you know, just bear with us

01:30:46.490 --> 01:30:47.370
and we'll get through

01:30:47.370 --> 01:30:48.550
some of these other scenarios

01:30:48.550 --> 01:30:49.370
I have planned.

01:30:49.810 --> 01:30:52.290
And then we'll go into probably

01:30:52.290 --> 01:30:54.030
some registry later today.

01:30:54.130 --> 01:30:56.490
And then tomorrow we'll wrap up,

01:30:56.490 --> 01:30:57.530
have a little test.

01:30:59.130 --> 01:31:00.890
Mostly like an open book Q&A

01:31:01.910 --> 01:31:03.670
and, you know, clean up

01:31:03.670 --> 01:31:04.590
our other data flows

01:31:04.590 --> 01:31:05.490
and go from there.

01:31:05.730 --> 01:31:06.730
But again, you know,

01:31:06.730 --> 01:31:08.050
when you're building this out,

01:31:08.070 --> 01:31:09.770
the scenario, I'm looking for

01:31:11.210 --> 01:31:13.690
mainly how you want to do this.

01:31:13.810 --> 01:31:15.350
I don't necessarily want to see

01:31:15.350 --> 01:31:17.210
a functioning data flow.

01:31:17.570 --> 01:31:19.790
I want to see the thought process

01:31:19.790 --> 01:31:21.490
of here's how I plan

01:31:21.490 --> 01:31:23.350
to accomplish the task.

01:31:23.710 --> 01:31:25.470
And here's the processors

01:31:25.470 --> 01:31:27.990
I would use and, you know,

01:31:28.110 --> 01:31:29.730
the connections and things like that.

01:31:30.030 --> 01:31:30.990
So with that said,

01:31:31.130 --> 01:31:32.290
unless there's a question,

01:31:32.690 --> 01:31:34.490
I am going to get something to drink

01:31:34.490 --> 01:31:38.110
and we'll go on our first break.

01:31:38.510 --> 01:31:41.070
It is 11.

01:31:41.590 --> 01:31:43.470
So we will be back at nine.

01:31:46.850 --> 01:31:48.850
Okay, let's do that.

01:31:50.270 --> 01:31:50.770
All right.

01:31:50.770 --> 01:31:52.430
So let's take a quick 15 minute break.

01:31:52.890 --> 01:31:54.950
I will see everyone back here

01:31:54.950 --> 01:31:58.050
in 15 minutes and we will

01:31:58.050 --> 01:31:59.190
start going through some scenarios.

01:32:07.830 --> 01:32:09.610
Give everybody a few minutes to get back.

01:32:09.630 --> 01:32:11.150
I was checking to make sure

01:32:11.150 --> 01:32:12.450
you all have a scenario

01:32:12.450 --> 01:32:13.990
and it looks like it's installed.

01:32:43.990 --> 01:32:44.110
So.

01:32:46.750 --> 01:32:47.610
What happened?

01:32:48.990 --> 01:32:49.890
What happened?

01:32:53.030 --> 01:32:53.470
Okay.

01:32:57.770 --> 01:33:00.890
I'm going to start with the C first.

01:33:21.810 --> 01:33:23.970
We'll get started up here in a minute.

01:33:24.870 --> 01:33:27.610
So the data flow

01:33:27.610 --> 01:33:29.510
I was using for the controller services.

01:33:30.590 --> 01:33:33.510
I have a template of that

01:33:33.510 --> 01:33:36.090
that you can use for reference.

01:33:36.690 --> 01:33:38.010
I didn't upload it yet

01:33:38.010 --> 01:33:38.790
because, you know,

01:33:38.810 --> 01:33:39.630
I kind of wanted to walk

01:33:39.630 --> 01:33:41.110
through my flow first.

01:33:41.910 --> 01:33:44.010
But, you know, for the scenario

01:33:44.010 --> 01:33:46.630
we can upload it

01:33:46.630 --> 01:33:47.590
for some assistance

01:33:49.110 --> 01:33:50.150
once everybody gets back.

01:33:53.250 --> 01:33:54.730
Thomas, it looks like you've got

01:33:54.730 --> 01:33:55.910
one of your lines figured out.

01:33:57.130 --> 01:33:58.390
Travis got it all figured out, I think.

01:33:59.510 --> 01:34:00.230
Yeah, I see.

01:34:00.330 --> 01:34:01.110
I'm looking at Travis.

01:34:01.350 --> 01:34:03.530
I'm like freaking dude, man.

01:34:04.050 --> 01:34:06.110
Still, it's not perfect.

01:34:06.430 --> 01:34:08.090
One of my lines is a little crooked

01:34:08.090 --> 01:34:10.230
because it's getting those points

01:34:11.050 --> 01:34:11.950
to straight.

01:34:12.190 --> 01:34:13.510
It's not very easy either.

01:34:14.070 --> 01:34:15.630
But I don't know how Travis did it.

01:34:15.890 --> 01:34:16.890
He's I haven't gotten

01:34:16.890 --> 01:34:18.610
the controller aspect of it yet.

01:34:21.170 --> 01:34:22.570
No, yours looks great.

01:34:23.010 --> 01:34:23.850
I'm jelly.

01:34:24.170 --> 01:34:26.590
Yeah, the controller aspect

01:34:26.590 --> 01:34:28.970
is like I said, like that's why

01:34:28.970 --> 01:34:30.090
I wanted to kind of lead off

01:34:30.090 --> 01:34:30.990
with that this morning.

01:34:31.970 --> 01:34:35.370
This will potentially take the entire day

01:34:35.370 --> 01:34:37.110
up until after lunch, at least.

01:34:39.050 --> 01:34:42.150
But, you know, it kind of throws

01:34:42.150 --> 01:34:43.690
us over into the deep end.

01:34:44.510 --> 01:34:46.210
And once we get this figured out,

01:34:46.510 --> 01:34:50.630
you have 90% figured out.

01:34:53.010 --> 01:34:54.250
There's some other ways

01:34:54.250 --> 01:34:54.970
of doing things.

01:34:55.050 --> 01:34:56.530
I know we asked some questions

01:34:56.530 --> 01:34:58.430
about some Python stuff like that.

01:34:58.550 --> 01:35:00.070
And I'm going to probably

01:35:00.070 --> 01:35:01.550
go into that tomorrow.

01:35:02.430 --> 01:35:05.390
But once we get controller services,

01:35:05.390 --> 01:35:06.890
you know, kind of squared away,

01:35:07.630 --> 01:35:08.790
you know, you should be

01:35:09.570 --> 01:35:11.730
you should be set for building

01:35:11.730 --> 01:35:13.130
your own NiFi flows.

01:35:13.710 --> 01:35:14.590
And this time.

01:35:14.850 --> 01:35:16.870
All right, let me see.

01:35:18.710 --> 01:35:21.910
I was about to pull up a link.

01:35:24.590 --> 01:35:25.950
Hey, Maria, are you still there?

01:35:26.130 --> 01:35:26.870
No, she dropped off.

01:35:28.210 --> 01:35:29.010
Hang on.

01:35:29.190 --> 01:35:31.270
There's a scratch pad that we use.

01:35:33.590 --> 01:35:34.730
Oh, here we go.

01:35:38.970 --> 01:35:40.510
Okay, once you get

01:35:40.510 --> 01:35:42.130
the hang of moving this around

01:35:43.050 --> 01:35:44.190
with those different.

01:35:44.190 --> 01:35:45.010
I don't know what to call

01:35:45.010 --> 01:35:46.070
way points or whatever.

01:35:46.150 --> 01:35:47.730
When you double click on the line,

01:35:48.150 --> 01:35:50.430
it's a little easier once you get it.

01:35:50.430 --> 01:35:51.310
Get the hang of it.

01:35:51.550 --> 01:35:56.410
Oh, that's okay.

01:36:02.070 --> 01:36:07.190
Okay, so in Teams,

01:36:07.190 --> 01:36:09.050
I am posting a link.

01:36:11.570 --> 01:36:11.670
Hopefully.

01:36:15.250 --> 01:36:16.890
Hopefully you're able to click on it.

01:36:17.690 --> 01:36:19.410
It's a Dropbox link

01:36:19.410 --> 01:36:22.250
and it has the scenarios already

01:36:22.950 --> 01:36:25.070
uploaded to your uploads folder.

01:36:25.070 --> 01:36:27.450
But it also has the flow

01:36:27.450 --> 01:36:28.750
that I just worked on.

01:36:29.530 --> 01:36:31.070
So I saved that.

01:36:32.470 --> 01:36:38.350
So that way you can use it as reference.

01:36:38.370 --> 01:36:39.790
So if you are able to,

01:36:40.770 --> 01:36:43.530
you should be able to bring

01:36:43.530 --> 01:36:46.990
that Dropbox link up and download.

01:36:47.390 --> 01:36:48.130
It's two zip files.

01:36:48.410 --> 01:36:49.130
NiFi scenario.

01:36:49.250 --> 01:36:50.650
NiFi example me.

01:36:51.270 --> 01:36:53.430
And if you notice at the bottom

01:36:53.430 --> 01:36:56.550
of your screen.