14 videos 📅 2024-05-06 08:00:00 America/Creston

Visit the Apache Nifi - GROUP 1 course recordings page

WEBVTT

00:00:08.220 --> 00:00:16.080
So you can save your flow to the same bucket or you can create a new bucket and save your

00:00:16.080 --> 00:00:24.540
other flow to that one and I would be curious if someone would, I'm going to test this

00:00:24.540 --> 00:00:25.000
out maybe.

00:00:25.000 --> 00:00:31.100
I'm going to put, I'm going to see if I can add my registry to one of you all's

00:00:31.100 --> 00:00:31.900
NiFi instance.

00:00:32.060 --> 00:00:35.900
I don't know if the IP address will resolve in these VMs.

00:00:35.920 --> 00:00:40.180
I don't think they're sharing in between them, but I'd be curious to see.

00:00:40.740 --> 00:00:48.180
But anyway, so you should now have, you know, two flows within your, within your

00:00:48.180 --> 00:00:50.420
registry, everyone does.

00:00:51.040 --> 00:00:51.920
Okay, perfect.

00:00:52.160 --> 00:00:54.060
If you do not, let me know.

00:00:54.060 --> 00:01:03.340
So let's go back to NiFi and let's go into that CSV to JSON data flow.

00:01:04.400 --> 00:01:10.120
And what I want to do is I am going to make some changes to this.

00:01:10.120 --> 00:01:12.660
I want to add, you know, a label.

00:01:32.880 --> 00:01:37.580
I'm just adding a label, just making some changes to my data flow.

00:01:57.500 --> 00:02:18.060
Okay, so the change I made is basically I just added a label to this and if you go

00:02:18.060 --> 00:02:22.920
back, you know, you can use your breadcrumb trail to, to go back or just right

00:02:22.920 --> 00:02:28.220
click and say, Leave Group, but when you go back, you know, that green check

00:02:28.220 --> 00:02:39.020
box is now, and so what that lets you know, what's that letting you know is

00:02:39.020 --> 00:02:46.700
that you have changes that need to be committed and so you can right click

00:02:46.700 --> 00:02:51.620
again, select Version, and you want to commit local changes.

00:02:51.620 --> 00:03:03.740
And so, for the version comment, I want to put, "I added a label," and then once I

00:03:03.740 --> 00:03:09.700
make that comment, it should be a check box and I should be good to go.

00:03:14.480 --> 00:03:19.520
Hopefully, if anyone, I'm not looking at everyone's screen right now.

00:03:19.520 --> 00:03:24.100
So if anyone's having an issue, feel free to speak up if you're not this far along.

00:03:25.880 --> 00:03:32.540
Okay, so now that I've committed my changes and it's now into the registry,

00:03:33.420 --> 00:03:41.240
if I had a GitHub or GitLab or Azure DevOps platform set up, you know,

00:03:41.300 --> 00:03:47.460
Git repo there, you know, the registry would automatically write these data

00:03:47.460 --> 00:03:57.080
flows to that versioning control system and utilize that version control, you

00:03:57.080 --> 00:03:59.720
know, for storage of all these data flows.

00:04:00.200 --> 00:04:05.360
I can use them outside of registry, you know, those types of things, you know,

00:04:05.380 --> 00:04:10.700
I may, we're going to get into this when we do some MiNiFi stuff, but

00:04:10.700 --> 00:04:18.260
MiNiFi, you need to develop your flows in NiFi and save them before MiNiFi

00:04:18.260 --> 00:04:24.840
could run them, because MiNiFi does not have a UI to develop its flows.

00:04:26.520 --> 00:04:31.620
And that is the, you know, from tomorrow morning to lunch tomorrow,

00:04:31.840 --> 00:04:37.000
we're working MiNiFi, you know, I've set aside half a day just for that

00:04:37.000 --> 00:04:42.760
because I know it's a very, there's a lot of things going on once we start

00:04:42.760 --> 00:04:47.760
introducing MiNiFi, but, you know, that could be a reason you want to save your

00:04:47.760 --> 00:04:50.640
flow, you built it in NiFi, you save it, you save it to you,

00:04:50.640 --> 00:04:52.500
and it gets backed up to your Git repo.

00:04:52.920 --> 00:05:00.220
You may have a CI/CD process that will push NiFi to your, you know,

00:05:00.340 --> 00:05:05.820
MiNiFi to your Raspberry Pi edge device, it will push the correct flow

00:05:05.820 --> 00:05:11.500
to that as well, because you do not have registry installed on that Pi,

00:05:11.580 --> 00:05:17.500
you do not have NiFi installed, you do not have registry, you know,

00:05:17.600 --> 00:05:21.880
connected to that NiFi that's not installed, so, you know,

00:05:22.100 --> 00:05:27.360
MiNiFi is probably one of the key users of that version control system,

00:05:28.440 --> 00:05:31.960
and, you know, in talking with, you know, folks,

00:05:31.960 --> 00:05:37.140
MiNiFi is definitely, you know, looked at for some of you all's projects,

00:05:37.140 --> 00:05:38.920
so, you know, just keep that in mind.

00:05:40.420 --> 00:05:47.080
Now, with that being said, I can stop versioning control at any time.

00:05:47.380 --> 00:05:52.120
I can stop, you know, if I had to, you know, I don't know of a good reason

00:05:52.120 --> 00:05:57.020
to stop versioning control, but if you have, you know, your data

00:05:57.020 --> 00:05:59.760
flow at a good state, you may want to stop, you know,

00:05:59.760 --> 00:06:06.360
versioning it, or if it's going away, or you may want to mess with it locally

00:06:06.360 --> 00:06:10.160
and make a lot of changes, and then once you get to where you're at,

00:06:10.160 --> 00:06:11.680
then start committing again.

00:06:13.140 --> 00:06:18.460
I can't think of any real business use cases to stop versioning control

00:06:18.460 --> 00:06:22.000
except for like testing and training and those types of things.

00:06:23.180 --> 00:06:26.000
You can also revert back.

00:06:26.000 --> 00:06:33.320
So, what I did is I'm going to change my version back to the original.

00:06:33.900 --> 00:06:36.960
So, if you remember, I went through and added a label,

00:06:37.580 --> 00:06:42.680
and what I want to do is revert back and make that change.

00:06:43.080 --> 00:06:44.860
So, it's stopping all the processors

00:06:46.060 --> 00:06:48.760
and then updating the data flow that's in there.

00:06:49.040 --> 00:06:54.700
So, when I go back into here, my label that I created is gone.

00:07:00.540 --> 00:07:10.020
And the red arrow lets me know that there is a newer version of this flow out there.

00:07:10.140 --> 00:07:13.840
So, I was reverting my changes because I wanted to get rid of that label,

00:07:14.460 --> 00:07:19.760
and maybe I wanted to, you know, get rid of that label,

00:07:19.900 --> 00:07:22.980
and I wanted to add another label.

00:07:24.640 --> 00:07:28.020
By the way, you can Ctrl+C, Ctrl+V on labels.

00:07:37.780 --> 00:07:40.380
Maybe I wanted two labels now, right?

00:07:40.640 --> 00:07:43.300
So, what I would do is I created my new label,

00:07:43.780 --> 00:07:46.120
I have changes that needed to be made,

00:07:46.200 --> 00:07:50.300
and now I can, you know, I can revert my local changes,

00:07:50.380 --> 00:07:52.300
I can show the local changes,

00:07:52.300 --> 00:07:56.180
or I can just commit a newer version.

00:08:09.020 --> 00:08:13.260
And now we are back, you know, to a green checkbox.
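
NOTE
The badge changes in this demo (green check, uncommitted changes, red arrow) can be sketched as a small state function. This is only an illustration under stated assumptions: flow_state and its three parameters are hypothetical names, and the enum names follow the version-control states NiFi reports as best recalled, so check them against your own instance.
```python
from enum import Enum
#
class FlowState(Enum):
    UP_TO_DATE = "green check: canvas matches the newest committed version"
    LOCALLY_MODIFIED = "uncommitted local changes"
    STALE = "red arrow: a newer version exists in the registry"
    LOCALLY_MODIFIED_AND_STALE = "local changes and a newer version in the registry"
#
def flow_state(local_version: int, latest_version: int, has_local_changes: bool) -> FlowState:
    """Derive the badge from the checked-out version, the newest version, and a dirty flag."""
    stale = local_version < latest_version
    if has_local_changes and stale:
        return FlowState.LOCALLY_MODIFIED_AND_STALE
    if has_local_changes:
        return FlowState.LOCALLY_MODIFIED
    return FlowState.STALE if stale else FlowState.UP_TO_DATE
#
# Replaying three moments from the demo (version numbers are illustrative).
print(flow_state(1, 1, True).name)   # label added, not yet committed
print(flow_state(2, 2, False).name)  # right after committing the change
print(flow_state(1, 2, False).name)  # after changing back to the original version
```
The point of the sketch is that "stale" and "modified" are independent questions, which is why a process group can show both conditions at once.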

00:08:20.000 --> 00:08:29.000
Okay. So, with that being said, now that is how we are committing our changes,

00:08:29.380 --> 00:08:34.380
that's how we are working with our data flow, you know, those types of things.

00:08:35.120 --> 00:08:42.360
So, what I want to do is I am going to stop my process group.

00:08:42.360 --> 00:08:45.400
I am going to delete that process group.

00:08:47.360 --> 00:08:51.660
Well, I would have to go in and delete the controller service

00:08:51.660 --> 00:08:54.380
that's associated with it, you know, some of these other things.

00:08:54.640 --> 00:08:57.060
But just stop your process group.

00:08:58.200 --> 00:09:02.320
And now we have a version of that data flow.

00:09:02.620 --> 00:09:04.600
So, now we can actually go and say,

00:09:05.980 --> 00:09:08.520
drag and drop down a new process group.

00:09:09.200 --> 00:09:13.300
And now we have the option to import from registry.

00:09:14.380 --> 00:09:16.600
So, previously we did not have this.

00:09:17.020 --> 00:09:21.180
We could only create a name for our process group

00:09:21.180 --> 00:09:26.280
or upload like we did previously for the data flow that we were working with.

00:09:27.520 --> 00:09:30.380
You know, that JSON element that we can upload.

00:09:30.800 --> 00:09:32.940
But now you can import from registry.

00:09:33.980 --> 00:09:38.240
You can go to your buckets, and if you have many flow names,

00:09:38.240 --> 00:09:41.600
you can select the bucket that your flow is in,

00:09:42.740 --> 00:09:47.020
select the, you know, the flow name and the version.
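
NOTE
The bucket, flow, and version picker in the import dialog has a REST counterpart in NiFi Registry. The sketch below only builds the URL and picks an entry from a made-up listing, so it runs without a registry. Assumptions: the base address uses the default port 18080, the path layout is the registry API as best recalled, and versions_url, pick_version, and the sample data are hypothetical.
```python
from typing import Optional
from urllib.parse import quote
#
API_ROOT = "http://localhost:18080/nifi-registry-api"  # assumed default address
#
def versions_url(bucket_id: str, flow_id: str, version: Optional[int] = None) -> str:
    """Build the URL for a flow's version list, or for one specific version."""
    url = f"{API_ROOT}/buckets/{quote(bucket_id)}/flows/{quote(flow_id)}/versions"
    return url if version is None else f"{url}/{version}"
#
def pick_version(metadata: list, wanted: Optional[int] = None) -> dict:
    """Return the requested version entry, or the newest one when none is requested."""
    if wanted is None:
        return max(metadata, key=lambda entry: entry["version"])
    return next(entry for entry in metadata if entry["version"] == wanted)
#
# Hypothetical listing with three versions, like the CSV to JSON flow in the demo.
sample = [
    {"version": 1, "comments": "initial commit"},
    {"version": 2, "comments": "I added a label"},
    {"version": 3, "comments": "second label"},
]
print(versions_url("my-bucket-id", "my-flow-id", 1))
print(pick_version(sample, wanted=1)["comments"])
```
Importing version 1 while version 3 exists is what produces the "newer version available" indicator on the canvas.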

00:09:47.600 --> 00:09:51.400
So, for this one, I want to go back to my original version.

00:09:51.880 --> 00:09:52.960
I am going to import that.

00:09:54.340 --> 00:09:55.340
And done.

00:09:57.120 --> 00:09:59.940
So, what it's letting me know, again, is that

00:09:59.940 --> 00:10:03.300
there is a newer version of this available.

00:10:03.440 --> 00:10:05.060
I pulled in version one.

00:10:05.240 --> 00:10:07.420
I now have three different versions.

00:10:08.560 --> 00:10:10.140
And, you know, those types of things.

00:10:10.240 --> 00:10:14.160
So, what I can do as well is I can stop versioning control on

00:10:16.180 --> 00:10:17.760
this process group, for instance,

00:10:23.300 --> 00:10:25.120
I can start versioning control.

00:10:26.640 --> 00:10:29.760
And if I wanted to, I could push this to a new bucket.

00:10:30.040 --> 00:10:32.080
I can give it a new name.

00:10:35.100 --> 00:10:40.360
So, now I have, you know, two different flows.

00:10:40.440 --> 00:10:41.920
You may have more.

00:10:42.260 --> 00:10:47.200
But I have two different flows saved in my registry.

00:10:52.040 --> 00:10:52.880
Refresh.

00:10:54.260 --> 00:10:54.740
Awesome.

00:10:55.240 --> 00:10:59.320
So, in that same bucket, I now have, you know,

00:10:59.440 --> 00:11:02.080
a new name, data flow, and I can go back to my registry.

00:11:02.080 --> 00:11:08.120
I also have the CSV to JSON that has three different versions.

00:11:08.720 --> 00:11:10.360
I can refresh these.

00:11:10.980 --> 00:11:13.880
I can delete the bucket altogether.

00:11:14.480 --> 00:11:16.360
I can export a version.

00:11:16.660 --> 00:11:18.880
I can import even a new version.

00:11:19.380 --> 00:11:22.500
Or I can just delete the flow altogether.

00:11:23.660 --> 00:11:26.680
And then, you know, you can go back to your settings here

00:11:26.680 --> 00:11:29.720
and, you know, delete your buckets

00:11:29.720 --> 00:11:32.000
and things like that as well.

00:11:32.720 --> 00:11:37.040
You know, so NiFi Registry has a, you know,

00:11:37.200 --> 00:11:40.900
a ton of capabilities in managing your data flows.

00:11:41.460 --> 00:11:47.420
And, you know, now, if you were to run your import zip

00:11:47.420 --> 00:11:50.600
and, you know, it blows up, it, you know,

00:11:50.760 --> 00:11:53.760
it ingests your parent NiFi directory.

00:11:54.420 --> 00:11:56.340
And, you know, it no longer works.

00:11:56.520 --> 00:12:00.740
We can now delete that whole instance, unzip it,

00:12:00.740 --> 00:12:03.720
start NiFi, add this registry real quickly,

00:12:04.100 --> 00:12:06.460
and now we have a version of the flows.

00:12:07.540 --> 00:12:09.780
You know, so there's a lot of capabilities here.

00:12:10.340 --> 00:12:14.800
Again, we're not going to go into configuring GitHub

00:12:14.800 --> 00:12:19.760
and GitLab and Azure and those types of things.

00:12:20.000 --> 00:12:23.140
But what I will do is send out some tips and tricks

00:12:23.140 --> 00:12:26.420
and instructions on how I've done this

00:12:26.420 --> 00:12:29.540
and some reference material because, you know,

00:12:29.540 --> 00:12:32.300
you may need to reference a specific class

00:12:32.300 --> 00:12:35.260
or property in the configuration file.

00:12:36.280 --> 00:12:37.940
So I'm going to pause there.

00:12:38.320 --> 00:12:44.400
I think that just about covers, you know, the registry.

00:12:44.860 --> 00:12:47.880
The user guide and the system administrator guide

00:12:47.880 --> 00:12:48.780
are online.

00:12:49.740 --> 00:12:51.540
Those links I will send out as well.

00:12:52.900 --> 00:12:55.320
You know, it goes into a little bit more detail

00:12:55.320 --> 00:13:00.440
about, you know, managing users and things like that.

00:13:02.520 --> 00:13:04.700
If you remember when I was talking about

00:13:04.700 --> 00:13:08.100
some of the NiFi security is very fine detailed.

00:13:09.000 --> 00:13:12.280
It's, you know, it's fine-grained controls.

00:13:12.680 --> 00:13:16.340
You can, you know, set your users to only allow them

00:13:16.340 --> 00:13:18.020
to view a data flow.

00:13:18.180 --> 00:13:19.260
You can start a data flow.

00:13:19.380 --> 00:13:21.800
They can import just the data flow, things like that.

00:13:23.160 --> 00:13:26.100
And registry is set up the same.

00:13:26.480 --> 00:13:28.360
So, you know, you can set up policies

00:13:28.360 --> 00:13:31.360
where, you know, someone could come in

00:13:31.360 --> 00:13:35.140
and view the registry but not make any deletes.

00:13:35.380 --> 00:13:38.200
They cannot upload, those types of things.

00:13:39.480 --> 00:13:41.120
So for you sysadmins out there,

00:13:41.660 --> 00:13:44.240
you know, it's good to keep those things in mind

00:13:44.240 --> 00:13:47.500
when you're setting this up in a more secure environment

00:13:48.180 --> 00:13:51.720
because you want to enable those fine-grained controls.

00:13:52.580 --> 00:13:57.400
So with all that being said, did anyone have any issues

00:13:59.040 --> 00:14:02.840
creating the registry, managing the bucket,

00:14:03.260 --> 00:14:05.400
you know, any of those types of things?

00:14:05.420 --> 00:14:06.300
We're good on our end.

00:14:07.320 --> 00:14:08.740
Awesome, awesome, awesome.

00:14:09.020 --> 00:14:10.400
I didn't have any issues but I have some questions.

00:14:10.580 --> 00:14:13.300
Yeah, I was about to say, if nobody has any issues,

00:14:13.600 --> 00:14:14.560
let's hear the question.

00:14:14.800 --> 00:14:17.760
So it looks like if we had two different instances

00:14:17.760 --> 00:14:19.000
with different canvases,

00:14:19.240 --> 00:14:21.380
two people could be working on the same flow.

00:14:21.380 --> 00:14:23.520
If they're using the same registry.

00:14:23.640 --> 00:14:25.960
They can, they can.

00:14:26.840 --> 00:14:29.160
And when, if you were,

00:14:30.860 --> 00:14:33.340
if somebody is high speed enough,

00:14:33.780 --> 00:14:35.780
I don't know if this will work in this VM

00:14:39.160 --> 00:14:44.740
but you should be able to access my registry on,

00:14:44.740 --> 00:14:56.200
was it three, nine, yeah, nine.

00:14:57.900 --> 00:14:59.820
Ten dot zero dot three dot nine.

00:15:00.500 --> 00:15:00.600
I can't.

00:15:01.460 --> 00:15:01.840
Okay.

00:15:03.500 --> 00:15:06.700
Yeah, I was, I was wondering if these VMs would allow it.

00:15:06.720 --> 00:15:09.080
I don't, I don't think it would

00:15:09.620 --> 00:15:13.940
but if you wanted, you could add my registry

00:15:14.500 --> 00:15:19.220
and you could go back to your NiFi canvas,

00:15:19.480 --> 00:15:22.960
create a new process group, check that in

00:15:22.960 --> 00:15:25.440
and then while you're working on that,

00:15:25.780 --> 00:15:29.700
I can pull your latest version that you checked in.

00:15:30.040 --> 00:15:32.700
So I can't pull in real time,

00:15:32.780 --> 00:15:36.980
it's not like a Google Doc or Office 365

00:15:36.980 --> 00:15:40.580
where we can work off of the same document

00:15:40.580 --> 00:15:43.500
but if you were to check something in

00:15:43.500 --> 00:15:45.820
and then continue working on it,

00:15:45.940 --> 00:15:49.980
I can pull what you checked in onto my canvas

00:15:49.980 --> 00:15:52.660
and I can continue working on it as well.

00:15:53.080 --> 00:15:54.800
So, you know, it does, you know,

00:15:54.800 --> 00:15:57.700
it doesn't provide that in-browser collaboration

00:15:57.700 --> 00:15:59.600
like you would see with a Word document

00:15:59.600 --> 00:16:03.400
but it does provide an ease of use to,

00:16:03.400 --> 00:16:07.520
for your teammates to, to all work on the same flow,

00:16:07.540 --> 00:16:08.320
for instance.

00:16:10.300 --> 00:16:10.640
Cool.

00:16:11.180 --> 00:16:12.660
My other, my last question.

00:16:12.660 --> 00:16:14.720
This may be out of scope at this training,

00:16:14.740 --> 00:16:17.420
I'm just curious how other people do this

00:16:17.420 --> 00:16:20.020
but if the flows are being backed up in source control,

00:16:20.640 --> 00:16:22.480
we're going to be running these as containers.

00:16:23.000 --> 00:16:24.780
Do we, do we need to back up

00:16:24.780 --> 00:16:26.980
any of the, anything other than the flow?

00:16:27.000 --> 00:16:29.460
Because I know there's like the data is queued

00:16:29.460 --> 00:16:30.780
and if there's errors and stuff.

00:16:31.840 --> 00:16:33.160
Do people typically back up stuff

00:16:33.160 --> 00:16:33.900
more than just the flow

00:16:33.900 --> 00:16:35.000
when they're running NiFi in production?

00:16:35.520 --> 00:16:36.060
They do not.

00:16:36.120 --> 00:16:39.180
So that's, that's the, the beauty of this

00:16:39.180 --> 00:16:42.640
and the data flow, the queue, all of that,

00:16:42.660 --> 00:16:46.380
that's the, the actual data flow running.

00:16:47.420 --> 00:16:51.220
So when you, when you, you know, build your container,

00:16:51.960 --> 00:16:54.180
you will pull in your data flow.

00:16:54.620 --> 00:17:01.200
And so, and I can go into more of, of, of that.

00:17:01.960 --> 00:17:03.640
As a matter of fact, I'm writing it down now.

00:17:03.840 --> 00:17:05.860
Because in NiFi, there is,

00:17:08.120 --> 00:17:10.400
there is a folder

00:17:15.780 --> 00:17:16.700
that you can use.

00:17:16.700 --> 00:17:18.740
So if you were deploying this

00:17:19.580 --> 00:17:24.000
and you were putting it into a container,

00:17:24.780 --> 00:17:28.020
you want to go ahead and kickstart the container

00:17:28.020 --> 00:17:29.740
with its flows.

00:17:30.100 --> 00:17:34.500
So you can actually, you know, pull the flow

00:17:34.500 --> 00:17:38.020
from, from GitHub or Azure DevOps

00:17:38.020 --> 00:17:41.320
and package that with your container

00:17:41.320 --> 00:17:44.340
and, you know, you just want to gzip it.

00:17:44.940 --> 00:17:50.380
And so when NiFi starts up, it looks for this file.

00:17:50.940 --> 00:17:53.800
And if it finds the flow.json.gz,

00:17:54.560 --> 00:17:59.140
it will extract that file and import the data flow.
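
NOTE
The packaging step described here comes down to compressing a flow definition into a file named flow.json.gz before the container image is built. A minimal sketch of that step follows, assuming the file belongs in NiFi's conf directory; a temporary directory stands in for it here. The flow dictionary is a placeholder, not a real NiFi flow definition, and write_flow_gz and read_flow_gz are hypothetical helper names.
```python
import gzip
import json
import tempfile
from pathlib import Path
#
def write_flow_gz(flow: dict, conf_dir: Path) -> Path:
    """Serialize the flow as JSON and gzip it into conf_dir/flow.json.gz."""
    target = conf_dir / "flow.json.gz"
    with gzip.open(target, "wt", encoding="utf-8") as handle:
        json.dump(flow, handle)
    return target
#
def read_flow_gz(path: Path) -> dict:
    """Decompress and parse a gzipped flow file, as a sanity check before shipping it."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)
#
# Placeholder content only; a real file would come from your version control system.
flow = {"rootGroup": {"name": "CSV to JSON"}}
with tempfile.TemporaryDirectory() as tmp:
    path = write_flow_gz(flow, Path(tmp))
    print(path.name, read_flow_gz(path) == flow)
```
In a CI/CD job the same round-trip check is a cheap way to catch a corrupt archive before the container starts with it.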

00:17:59.780 --> 00:18:03.500
So now you've got a process set up that,

00:18:03.740 --> 00:18:06.460
you know, where you can, you know,

00:18:06.460 --> 00:18:08.600
you've got your flows, you know,

00:18:08.800 --> 00:18:11.300
managed appropriately during the version control

00:18:11.300 --> 00:18:12.400
and those types of things.

00:18:13.060 --> 00:18:15.160
And now you've got a process,

00:18:15.420 --> 00:18:17.120
you know, to build your container,

00:18:17.560 --> 00:18:19.760
that CI/CD process to build your container.

00:18:20.700 --> 00:18:23.340
You know, you will reference a data flow

00:18:23.340 --> 00:18:24.760
it needs to pull in.

00:18:25.200 --> 00:18:27.760
And so when that container is built

00:18:27.760 --> 00:18:28.840
and up and running,

00:18:29.700 --> 00:18:30.920
it's got the data flow

00:18:30.920 --> 00:18:32.780
that was in your versioning control.

00:18:33.440 --> 00:18:35.480
It's running with that information

00:18:35.480 --> 00:18:38.340
and you should be, you know, good to go.

00:18:39.180 --> 00:18:41.400
MiNiFi, we're going to actually touch

00:18:41.400 --> 00:18:43.540
on some of this when we get to MiNiFi

00:18:44.780 --> 00:18:48.480
because MiNiFi does require you

00:18:48.480 --> 00:18:50.660
to bootstrap it with the flow

00:18:50.660 --> 00:18:51.640
that you want to use.

00:18:52.400 --> 00:18:55.200
There is no UI like I mentioned with MiNiFi.

00:18:55.640 --> 00:18:58.220
So all of the flow building you will do

00:18:58.220 --> 00:19:00.260
for MiNiFi is done within NiFi.

00:19:00.780 --> 00:19:02.380
And so that's a great example

00:19:02.380 --> 00:19:05.080
where you may have a process set up

00:19:05.080 --> 00:19:07.640
where you're running MiNiFi in a container.

00:19:08.340 --> 00:19:10.480
On a Raspberry Pi, for instance.

00:19:11.400 --> 00:19:12.880
And when you built that container,

00:19:13.140 --> 00:19:14.640
you need to feed it a flow

00:19:14.640 --> 00:19:17.160
that, you know, it will execute

00:19:17.160 --> 00:19:20.560
to ingest this log and format this log

00:19:20.560 --> 00:19:22.580
and send it over to NiFi.

00:19:23.660 --> 00:19:26.460
You know, so that's how you would use that.

00:19:26.540 --> 00:19:28.440
But that's a great question.

00:19:28.660 --> 00:19:29.620
Did I answer it?

00:19:29.640 --> 00:19:31.220
Yeah, you did. Thank you.

00:19:31.420 --> 00:19:32.360
Okay, awesome.

00:19:33.220 --> 00:19:35.740
We have about five more minutes

00:19:35.740 --> 00:19:38.640
before, you know, let's break for lunch.

00:19:39.800 --> 00:19:41.720
Any other questions?

00:19:42.640 --> 00:19:45.740
Any, you know, even tips or tricks?

00:19:46.140 --> 00:19:48.860
You know, I like to, you know,

00:19:48.960 --> 00:19:51.620
do these classes where it's very casual

00:19:51.620 --> 00:19:54.360
and it's a very much conversation.

00:19:54.920 --> 00:19:56.740
You know, please pick my brain.

00:19:57.820 --> 00:19:59.240
You know, like I said, I've been

00:19:59.240 --> 00:20:02.560
through numerous huge NiFi deployments.

00:20:02.640 --> 00:20:04.780
I'm one of the committers to this.

00:20:04.780 --> 00:20:06.820
So, you know, I know the inner workings.

00:20:07.040 --> 00:20:09.780
So if there's even an out of scope question,

00:20:09.800 --> 00:20:11.080
I'll be happy to answer.

00:20:11.820 --> 00:20:13.800
So we got about five minutes left.

00:20:14.100 --> 00:20:15.280
Any additional questions

00:20:15.280 --> 00:20:17.060
or we will just go to lunch a little early.

00:20:17.060 --> 00:20:18.100
Are we still planning to go over

00:20:18.100 --> 00:20:19.080
the multi-tenancy stuff?

00:20:19.500 --> 00:20:20.860
We are, we are, we are.

00:20:23.340 --> 00:20:25.060
Yeah, after lunch, I was going to have

00:20:25.060 --> 00:20:27.160
us work on one more flow

00:20:27.780 --> 00:20:28.940
and stuff like that.

00:20:30.180 --> 00:20:33.200
And tomorrow is going to be a lot of,

00:20:33.200 --> 00:20:35.580
you know, MiNiFi, multi-tenancy

00:20:36.780 --> 00:20:38.740
and a couple other little things

00:20:38.740 --> 00:20:40.040
that we should need to cover.

00:20:40.120 --> 00:20:41.580
And that's also one of the things

00:20:41.580 --> 00:20:43.420
that you probably, you might have

00:20:43.420 --> 00:20:46.480
noticed is a Docker Compose example

00:20:46.480 --> 00:20:50.240
in the, in your zip file.

00:20:50.600 --> 00:20:51.800
But we'll get into that.

00:20:51.840 --> 00:20:53.120
We do have Docker Desktop

00:20:53.940 --> 00:20:56.140
deployed and those types of things.

00:20:58.280 --> 00:21:00.340
All right. Any other questions?

00:21:03.540 --> 00:21:07.600
Okay. If you wanted to try and like edit

00:21:08.200 --> 00:21:09.840
one of those, just to take a look at.

00:21:11.200 --> 00:21:12.080
Oh, great question.

00:21:12.280 --> 00:21:13.020
Great question.

00:21:14.020 --> 00:21:17.900
Let's go to download.

00:21:18.180 --> 00:21:19.520
And if you're looking at my screen,

00:21:22.060 --> 00:21:23.000
NiFi 1.26

00:21:23.000 --> 00:21:24.200
was released yesterday.

00:21:25.220 --> 00:21:27.460
I was wondering why I saw 1.26

00:21:27.460 --> 00:21:29.520
earlier for, for,

00:21:30.900 --> 00:21:33.580
um, uh, for Brex, I think it was.

00:21:33.760 --> 00:21:35.120
And I was like, wait a minute,

00:21:35.120 --> 00:21:35.920
what's going on?

00:21:36.300 --> 00:21:39.500
So if you go to the NiFi download,

00:21:39.840 --> 00:21:43.140
you'll see the sources and the binaries.

00:21:43.260 --> 00:21:44.960
And so what we want to look at

00:21:44.960 --> 00:21:45.760
is the source.

00:21:45.780 --> 00:21:47.940
You can actually just download

00:21:47.940 --> 00:21:50.840
the whole source release as well.

00:21:50.840 --> 00:21:52.700
And that's going to give you

00:21:52.700 --> 00:21:58.220
the source code for not only the processors,

00:21:58.220 --> 00:22:02.760
but also the, um, not only the processors,

00:22:02.960 --> 00:22:04.600
but also the NiFi engine

00:22:04.600 --> 00:22:06.300
and those other types of things.

00:22:07.280 --> 00:22:10.720
And then, um, you can actually look

00:22:10.720 --> 00:22:14.140
at Apache, you know, Apache links,

00:22:14.600 --> 00:22:15.800
their GitHub repository.

00:22:16.520 --> 00:22:18.600
You can see MiNiFi was changed

00:22:18.600 --> 00:22:19.920
an hour ago, for instance.

00:22:20.360 --> 00:22:21.660
But if you look here,

00:22:21.680 --> 00:22:24.480
you will have all of the source code

00:22:24.480 --> 00:22:26.480
for all of the processors.

00:22:28.380 --> 00:22:31.220
One of the things that, uh, I mean,

00:22:31.240 --> 00:22:32.940
I would be happy you let me know,

00:22:33.100 --> 00:22:34.820
but is we can set up a,

00:22:34.820 --> 00:22:38.860
an environment to, to do, um, uh,

00:22:38.860 --> 00:22:40.640
you know, custom processor development.

00:22:41.400 --> 00:22:43.920
I was planning to potentially do that

00:22:43.920 --> 00:22:47.100
after lunch tomorrow after we get through

00:22:47.100 --> 00:22:49.680
MiNiFi and some of the multi-tenancy questions.

00:22:50.200 --> 00:22:51.880
But yeah, you know, one of the things

00:22:51.880 --> 00:22:54.160
I like to do is, is, you know,

00:22:54.280 --> 00:22:55.520
quickly show you how to set up

00:22:55.520 --> 00:22:57.340
a dev environment for NiFi

00:22:57.340 --> 00:23:00.760
and how to develop your own custom processor.

00:23:01.560 --> 00:23:03.140
But if you're looking at the source code,

00:23:03.280 --> 00:23:04.740
it's available on GitHub.

00:23:05.720 --> 00:23:09.060
It's also available as a downloaded package.

00:23:09.560 --> 00:23:11.580
You, you know, you can see, um,

00:23:11.580 --> 00:23:13.200
if you do want to download

00:23:13.200 --> 00:23:18.260
and build it yourself, uh, the documentation

00:23:18.260 --> 00:23:21.200
actually goes into really good detail on that.

00:23:27.560 --> 00:23:28.760
Let's see here.

00:23:28.900 --> 00:23:29.680
Let's see here.

00:23:29.680 --> 00:23:30.620
Where's that documentation?

00:23:39.500 --> 00:23:42.640
So there's a whole developer guide, um,

00:23:43.120 --> 00:23:45.340
on how to, uh, you know,

00:23:45.420 --> 00:23:47.520
build a processor from scratch,

00:23:47.540 --> 00:23:50.420
um, and, and things like that.

00:23:50.740 --> 00:23:54.440
Um, it has a lot of information,

00:23:54.680 --> 00:23:56.940
uh, but you can go directly to GitHub

00:23:56.940 --> 00:23:59.940
download, uh, once you've got it downloaded,

00:24:00.420 --> 00:24:03.340
there's some things you can do to, uh,

00:24:03.340 --> 00:24:05.600
build it, uh, if you wanted to just,

00:24:05.600 --> 00:24:07.720
you know, mess around and change some things.

00:24:08.080 --> 00:24:09.960
If you want to build it yourself,

00:24:10.480 --> 00:24:13.280
building a custom distribution, uh,

00:24:13.280 --> 00:24:15.760
you can, you know, you can use Maven

00:24:15.760 --> 00:24:18.280
because Java's under the hood, um,

00:24:18.280 --> 00:24:19.560
and things like that.

00:24:19.940 --> 00:24:22.680
There is an mvn clean install,

00:24:22.780 --> 00:24:25.260
like include all, and that will include

00:24:25.260 --> 00:24:29.320
all of the, uh, NARs and all the bundles.

00:24:29.360 --> 00:24:33.600
You may want to, uh, only build, like,

00:24:34.200 --> 00:24:37.340
you know, certain bundles and things like that.

00:24:37.620 --> 00:24:40.740
There's actually another hidden, uh, option

00:24:40.740 --> 00:24:44.900
and it's the, um, rules engine.

00:24:45.160 --> 00:24:47.540
Uh, so there is a rules engine flag.

00:24:47.600 --> 00:24:48.960
They just don't list it here,

00:24:49.160 --> 00:24:51.080
uh, that you can also include.

00:24:51.460 --> 00:24:53.300
Um, but yeah, you, you,

00:24:53.300 --> 00:24:56.960
the developer's documentation and building it,

00:24:56.960 --> 00:25:00.120
um, you know, is really well documented.

00:25:01.380 --> 00:25:02.660
Um, and, and, you know,

00:25:02.660 --> 00:25:04.320
that leads me to another question.

00:25:04.640 --> 00:25:08.740
So, um, I've seen this, this quite a bit

00:25:08.740 --> 00:25:10.900
and I don't really go over it because,

00:25:11.200 --> 00:25:15.440
um, you know, um, to me, uh,

00:25:15.440 --> 00:25:17.080
you know, it's all in the install guide

00:25:17.080 --> 00:25:18.000
and things like that.

00:25:18.060 --> 00:25:20.260
But when you are installing this

00:25:20.900 --> 00:25:24.080
in and at, like, in a production type

00:25:24.080 --> 00:25:26.620
of environment, you want to make sure

00:25:26.620 --> 00:25:29.520
that you have, um, you know,

00:25:29.680 --> 00:25:32.020
all the hard file limits set,

00:25:32.320 --> 00:25:34.060
you know, the amount of files

00:25:34.060 --> 00:25:38.000
that you can have open on the, um,

00:25:38.460 --> 00:25:41.080
um, uh, system and those types of things.

00:25:41.680 --> 00:25:44.560
NiFi is, is really heavy

00:25:44.560 --> 00:25:47.080
on having a ton of files open.

00:25:47.340 --> 00:25:49.360
And so, you know, they will give you

00:25:49.360 --> 00:25:51.640
in the, um, instructions

00:25:51.640 --> 00:25:53.120
and I'm looking for it right now,

00:25:53.420 --> 00:25:55.400
they will give you in the instructions

00:25:55.400 --> 00:25:57.980
on how to set your, your hard limit,

00:25:58.380 --> 00:26:00.660
your soft limit on the open files,

00:26:00.900 --> 00:26:02.280
you know, those types of things.
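
NOTE
The open-file limits mentioned here can be checked from a script before NiFi is started. This is a hedged sketch for Linux and other Unix hosts using Python's standard resource module. The 50000 threshold is an assumption, recalled from the NiFi administration guide's best-practice section, so confirm it against the guide for your version. nofile_report is a hypothetical helper name.
```python
import resource
#
RECOMMENDED_NOFILE = 50000  # assumption: verify against the admin guide for your version
#
def nofile_report(minimum: int = RECOMMENDED_NOFILE) -> dict:
    """Report the soft and hard open-file limits for this process and whether each meets the minimum."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    def meets(limit: int) -> bool:
        # RLIM_INFINITY means no limit is enforced, which always satisfies the minimum.
        return limit == resource.RLIM_INFINITY or limit >= minimum
    return {"soft": soft, "hard": hard, "soft_ok": meets(soft), "hard_ok": meets(hard)}
#
print(nofile_report())
```
The limits are per user and per process, so run the check as the same account that will run NiFi.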

00:26:02.380 --> 00:26:02.960
So let me see.

00:26:05.780 --> 00:26:06.880
But did that answer your question

00:26:06.880 --> 00:26:08.080
on where the source was?

00:26:09.920 --> 00:26:10.920
Yeah. Yeah.

00:26:11.020 --> 00:26:11.880
I think I might have more,

00:26:12.020 --> 00:26:15.020
but, uh, I'm going to wait till we go

00:26:15.020 --> 00:26:17.440
over the processor stuff tomorrow, I guess.

00:26:18.240 --> 00:26:20.020
Um, is there anything I can answer

00:26:20.020 --> 00:26:21.040
real quickly now?

00:26:23.100 --> 00:26:25.740
Um, no, no, I think I,

00:26:25.740 --> 00:26:26.800
I can dig through that.

00:26:26.960 --> 00:26:29.180
I think I understand what I'm looking for.

00:26:29.360 --> 00:26:29.560
Thanks.

00:26:29.720 --> 00:26:32.600
Okay. Um, and, and like I said,

00:26:32.600 --> 00:26:33.700
I'll have more there.

00:26:33.920 --> 00:26:36.440
I can, I can, um, I can,

00:26:36.440 --> 00:26:38.320
I can type up some more, uh,

00:26:39.300 --> 00:26:41.720
tips and tricks on developing processors.

00:26:42.220 --> 00:26:43.300
But good question.

00:26:43.500 --> 00:26:45.140
Anybody else have any questions

00:26:45.140 --> 00:26:47.500
before we, uh, take a break?

00:26:48.920 --> 00:26:50.900
For me to eat early dinner,

00:26:50.960 --> 00:26:53.580
for you all, uh, get some lunch.

00:26:57.220 --> 00:27:01.840
Okay. Um, so let's, let's take a lunch break.

00:27:01.960 --> 00:27:04.500
It is two oh two.

00:27:06.160 --> 00:27:08.500
So let's just say it's two o'clock.

00:27:08.960 --> 00:27:10.660
Let's try to be back here.

00:27:10.900 --> 00:27:14.520
You know, let's take a, uh, a 42 minute lunch,

00:27:14.520 --> 00:27:16.080
43 minute lunch.

00:27:17.240 --> 00:27:20.940
Let's try to be back here by 12:45

00:27:21.160 --> 00:27:25.020
y'all's time, 2:45 Central time.

00:27:25.620 --> 00:27:28.520
And, you know, we'll, we'll go into another,

00:27:28.880 --> 00:27:32.660
another day of flow and, and build that out

00:27:32.660 --> 00:27:34.940
and, uh, you know, get some,

00:27:35.240 --> 00:27:37.340
get some additional hands on and then,

00:27:37.340 --> 00:27:39.080
you know, call it a day.

00:27:39.080 --> 00:27:40.180
So have a great lunch.

00:27:40.360 --> 00:27:42.860
I will see you all in 43 minutes.

00:27:43.420 --> 00:27:44.760
Uh, and if you need anything,

00:27:45.080 --> 00:27:46.560
I'll most likely be at my desk

00:27:46.560 --> 00:27:48.960
or just chat me a message.

00:28:01.380 --> 00:28:02.540
Okay. Um, give me just a minute.

00:28:05.400 --> 00:28:06.500
Yeah, if you don't mind.

00:28:06.720 --> 00:28:07.200
Thank you very much.

00:28:07.220 --> 00:28:07.960
I don't know yet.

00:28:25.320 --> 00:28:25.720
Okay.

00:28:25.720 --> 00:28:26.080
Oh, do they?

00:28:26.540 --> 00:28:26.780
Yeah.

00:28:26.900 --> 00:28:27.620
I've just got an email.

00:28:28.680 --> 00:28:31.280
But it picks up a from 499.

00:28:31.340 --> 00:28:33.500
You can have your phone in for that.

00:28:33.500 --> 00:28:48.520
Uh, but the, the nonce coming out.

00:29:07.460 --> 00:29:10.260
All right.

00:30:50.560 --> 00:30:53.400
I'm just having to slow it down.

00:31:20.920 --> 00:31:23.380
Alright, I'm glad everyone had a great lunch.

00:31:23.380 --> 00:31:27.400
And hopefully everyone has made it back.

00:31:29.180 --> 00:31:36.580
Even given 43 minutes for lunch, I had barely enough time to finish.

00:31:38.080 --> 00:31:41.560
But we've got a good next scenario.

00:31:43.760 --> 00:31:47.960
Okay, so I have 2:46, one minute after, my time.

00:31:47.960 --> 00:31:58.040
So let's get started. Our final activity for today is a scenario where you will be developing your own NiFi flow.

00:31:59.700 --> 00:32:05.280
I have put some tips, tricks, and some pointers in the scenario.

00:32:06.060 --> 00:32:12.500
But I don't usually like to give a test in any of the classes.

00:32:12.500 --> 00:32:21.320
What I like to see is just some hands-on experience, you know, applying some of the things that we went over.

00:32:21.860 --> 00:32:26.160
So for this scenario, let me pull this up.

00:32:27.600 --> 00:32:30.880
So for this scenario, you will be.

00:32:31.880 --> 00:32:43.880
If you have the Dropbox link still or the Etherpad pulled up, you can actually download the scenario information.

00:32:44.600 --> 00:32:48.360
Let me know if you cannot, but for this scenario.

00:32:48.360 --> 00:32:59.420
So for this scenario, you are a data analyst at a local government agency responsible for monitoring environmental conditions across various locations in the region.

00:32:59.420 --> 00:33:09.880
Your task is to aggregate, transform and analyze weather data collected from multiple local weather stations to provide daily summaries and alerts.

00:33:10.740 --> 00:33:16.460
So if you can, this scenario is in the Dropbox link.

00:33:17.020 --> 00:33:20.540
It's also in the Etherpad.

00:33:21.160 --> 00:33:23.880
Right here, so I can actually.

00:33:23.880 --> 00:33:30.180
I can go on to everyone's desktop and download it if you need help.

00:33:30.540 --> 00:33:32.500
But if you can.

00:33:34.100 --> 00:33:36.760
Try to grab this link if it's working.

00:33:38.740 --> 00:33:39.020
Nope.

00:33:41.780 --> 00:33:44.300
Oh, it's logging me in. Copy link address.

00:33:44.660 --> 00:33:51.200
OK, so if you can go back to the Etherpad, I will also send this link in Teams chat.

00:33:51.200 --> 00:33:57.200
There should be a zip folder that is.

00:33:58.400 --> 00:34:00.700
NiFi scenario.

00:34:01.500 --> 00:34:05.840
It has the scenario as well as three data files.

00:34:06.500 --> 00:34:09.940
We are pulling from three different weather stations.

00:34:10.880 --> 00:34:14.140
It gives a report every hour.

00:34:14.140 --> 00:34:24.500
There is a data description in this scenario to describe the fields as well as, you know, like I mentioned some tips and tricks for that.

00:34:24.940 --> 00:34:29.180
So let me see if I can download this to my desktop.

00:34:31.100 --> 00:34:33.140
Go to your Dropbox.

00:34:34.880 --> 00:34:35.960
No, it's not.

00:34:42.940 --> 00:34:43.800
Oh, perfect.

00:34:44.440 --> 00:34:51.940
So if you take a look at that link that I put in, you should be able to download the scenario.

00:34:58.080 --> 00:35:00.460
Should be pretty easy.

00:35:01.940 --> 00:35:06.000
And then during that scenario, you just want to be able to go to your downloads.

00:35:07.360 --> 00:35:11.340
I am going to copy it to my desktop.

00:35:16.440 --> 00:35:19.280
You know, go ahead and extract the zip file.

00:35:22.500 --> 00:35:23.360
And.

00:35:24.640 --> 00:35:30.680
So, again, for this scenario, I've included the PDF of the scenario.

00:35:31.480 --> 00:35:41.380
So your objective is to automate the collection, transformation and reporting of weather data from local files,

00:35:41.740 --> 00:35:44.720
simulating real time data ingestion and processing.

00:35:44.720 --> 00:35:50.120
I actually stood up a small web server to serve these files.

00:35:51.760 --> 00:35:56.220
But it requires a lot of SSL connections and everything else within NiFi.

00:35:56.300 --> 00:35:58.860
So I decided to go with just a local file.

00:36:00.480 --> 00:36:08.740
But you should see this scenario as well as three data files, two CSVs and a JSON document.

00:36:08.740 --> 00:36:18.380
So if you can extract this information, go to your NiFi canvas, create a new process group and,

00:36:18.600 --> 00:36:20.760
You know, start building your flow.

00:36:21.140 --> 00:36:29.780
What I'm looking for here is just a recap of everything that we have learned with NiFi and Registry.

00:36:30.460 --> 00:36:35.680
You know, because today is probably, you know, we'll touch on some stuff tomorrow.

00:36:35.680 --> 00:36:52.080
But tomorrow is mainly MiNiFi, you know, multi-tenancy, advanced, you know, deployment capabilities and also some custom processor how-tos that we'll go through.

00:36:52.540 --> 00:36:58.060
So for this scenario, you're collecting the data, you know, the data is the CSVs and JSON.

00:36:58.680 --> 00:37:07.080
You're going to, you know, set up NiFi to monitor a directory and ingest files as they appear.

00:37:07.560 --> 00:37:12.200
You know, those types of things. I have put like a NiFi task.

00:37:12.380 --> 00:37:14.060
I didn't spell out how to do it.
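
To make the "monitor a directory, ingest files as they appear" step concrete, here is a minimal Python sketch of the idea, not of NiFi itself. In NiFi this job is normally done by a processor such as GetFile, or ListFile followed by FetchFile. The function name `list_new_files` and the file names `station_a.csv` and `station_c.json` are made up for the illustration.

```python
import os
import tempfile


def list_new_files(directory, seen, suffixes=(".csv", ".json")):
    """Return matching files in `directory` not returned before; updates `seen`."""
    new_files = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        is_match = os.path.isfile(path) and name.lower().endswith(suffixes)
        if is_match and path not in seen:
            seen.add(path)          # remember it so it is only ingested once
            new_files.append(path)
    return new_files


# Demo: two polls of a temporary "inbox" directory.
with tempfile.TemporaryDirectory() as inbox:
    seen = set()
    open(os.path.join(inbox, "station_a.csv"), "w").close()
    first = list_new_files(inbox, seen)    # picks up the CSV
    open(os.path.join(inbox, "station_c.json"), "w").close()
    second = list_new_files(inbox, seen)   # picks up only the new JSON file
    print([os.path.basename(p) for p in first])
    print([os.path.basename(p) for p in second])
```

The `seen` set plays the role of the state a listing processor keeps so that the same file is not picked up twice.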

00:37:15.060 --> 00:37:22.200
You can, you know, use the processors that you have available to build this flow.

00:37:22.400 --> 00:37:27.420
You can make it as complicated or as simple as you would like.

00:37:27.420 --> 00:37:39.520
Again, what I'm looking for here is a completed data flow that picks the data up, provides some enrichment, some aggregation.

00:37:40.080 --> 00:37:46.840
And then also, you know, that alert generation, you can just send the alert to a log message.

00:37:46.840 --> 00:37:57.340
You know, for instance, we do not have, you know, [inaudible] or, you know, those types of things installed, you know, where you would typically see this in the real world.
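
The pick up, aggregate, and alert-to-log steps described here can be sketched outside NiFi in a few lines of Python, just to show the logic the flow has to implement. Everything specific in this sketch is an assumption: the field names (`station_id`, `date`, `temperature`), the sample readings, and the 35.0 degree threshold are invented for illustration and are not taken from the scenario PDF.

```python
import csv
import io
import json
import logging
from collections import defaultdict

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("weather")

# Hypothetical sample data standing in for one CSV station and one JSON station.
CSV_TEXT = """station_id,date,hour,temperature
S1,2024-05-07,0,14.8
S1,2024-05-07,1,15.2
"""
JSON_TEXT = json.dumps([
    {"station_id": "S3", "date": "2024-05-07", "hour": 0, "temperature": 36.0},
    {"station_id": "S3", "date": "2024-05-07", "hour": 1, "temperature": 34.0},
])


def load_records(csv_text, json_text):
    """Read both formats into one list of dict records."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows.extend(json.loads(json_text))
    return rows


def daily_summary(records):
    """Group temperatures by (station, date) and compute average and maximum."""
    temps = defaultdict(list)
    for rec in records:
        temps[(rec["station_id"], rec["date"])].append(float(rec["temperature"]))
    return {key: {"avg_temp": sum(v) / len(v), "max_temp": max(v)}
            for key, v in temps.items()}


def alerts(summary, max_temp_threshold=35.0):
    """Build one alert message per station/day whose maximum exceeds the threshold."""
    return [f"ALERT {station} {date}: max temperature {stats['max_temp']}"
            for (station, date), stats in sorted(summary.items())
            if stats["max_temp"] > max_temp_threshold]


summary = daily_summary(load_records(CSV_TEXT, JSON_TEXT))
for message in alerts(summary):
    log.warning(message)   # in the class scenario the alert simply goes to a log
```

In the NiFi flow the same three stages map onto whatever processors you choose; the alert stage only needs to write a log message.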

00:37:57.360 --> 00:38:00.480
Well, that being said, feel free to get started.

00:38:00.820 --> 00:38:04.720
Does anyone have any issues getting the scenario?

00:38:10.160 --> 00:38:11.220
Good deal.

00:38:11.220 --> 00:38:14.220
So I'm here for questions.

00:38:15.280 --> 00:38:26.960
Feel free to ask any questions you may have, you know, any tips or tricks or anything that you'd like to see.

00:38:27.620 --> 00:38:34.640
There should be information within that PDF to help you out.

00:38:34.640 --> 00:38:48.480
The data structure is pretty self-explanatory, but if you need a breakdown of that, you know, I can send over something as well.

00:38:51.680 --> 00:39:05.680
And while everyone works on that, I am going to go mute, but I'm going to kind of pop into everyone's desktops, see how things are going and just provide any commentary as needed.

00:39:18.400 --> 00:39:23.060
So we have about two hours left for today.

00:39:24.100 --> 00:39:26.520
This may take a little bit of time.

00:39:27.500 --> 00:39:34.160
So what we'll do is spend the next hour on this flow development.

00:39:35.580 --> 00:39:38.380
Again, ask any questions along the way.

00:39:38.380 --> 00:39:47.940
We'll take a quick bio break or final break of the day and, you know, come back answering questions and review your flows.

00:39:48.960 --> 00:39:54.020
Again, you know, there's many ways to skin a cat with NiFi.

00:39:54.140 --> 00:40:04.160
So I'm really looking at, you know, some of the thought process behind, you know, developing your data flow and just that story that you have.

00:40:04.160 --> 00:40:09.760
I have a question. So for step three, it says ExecuteScript. Do we have a script for that?

00:40:10.220 --> 00:40:28.340
You do not. So if you do not have a script or, you know, you may not know scripting, feel free to use another processor to extract that, to deal with the, you know, that type of data.

00:40:28.340 --> 00:40:41.860
If you look through the list of processors you have, you may want to, you know, extract that with a CSV reader and a JSON reader and then put those back together.

00:40:41.960 --> 00:40:51.900
For those who know how to execute a script, you know, it's an option, you know, so do the best you can.

00:40:51.900 --> 00:41:02.480
For this one, you can use custom scripting, like, there's also a Jolt transform.

00:41:03.600 --> 00:41:09.940
Python, I don't think we have Python installed here, so you may not be able to use a Python script.

00:41:10.260 --> 00:41:14.140
You might have to use a processor to just extract the data.

00:41:14.140 --> 00:41:24.140
And then once you have it extracted and saved as an attribute, you can then use those attributes to build your final output.

00:41:24.220 --> 00:41:38.420
And if you get hung up on that step or some of the other steps, you know, just, just ask me and we can, we can find a processor to do the function that you're looking to do most likely.

00:41:39.460 --> 00:41:49.120
Well, this scenario can definitely be accomplished, you know, utilizing all the processors that NiFi comes with out of the box.

00:41:49.220 --> 00:41:52.200
Not all the processors, but you wouldn't need any custom processors.

00:41:54.900 --> 00:42:04.580
Also, remember the documentation has a list of processors and their functions.

00:42:05.520 --> 00:42:17.780
You know, so if you are trying to find a processor to simplify what you're trying to do or to do that function, feel free to reference, you know, the documentation.

00:42:17.780 --> 00:42:21.780
A list of the processors and what it can do is there.

00:42:24.880 --> 00:42:31.540
I assume some of you will probably use the ExtractText processor as well.

00:42:31.680 --> 00:42:36.120
So, you know, that's there as well as, you know, some of those things.

00:42:54.340 --> 00:42:55.740
Sorry.

00:43:10.140 --> 00:43:11.540
Okay.

00:43:17.380 --> 00:43:18.020
Okay.

00:43:28.800 --> 00:43:32.300
So is this kind of like however we want to approach this?

00:43:32.420 --> 00:43:34.360
Yes, it's however you want to approach it.

00:43:35.220 --> 00:43:41.340
So like I said, yeah, you can learn from the previous example.

00:43:41.340 --> 00:43:45.480
And that's the reason we have both JSON and the CSV.

00:43:45.760 --> 00:43:53.560
So if you want to ingest the CSV, convert it to JSON and then look up the JSON values together, I'm fine with that as well.

00:43:54.080 --> 00:44:06.560
Yeah, again, you know, this is just to test the, you know, the skills that we've learned over the last day and a half and put those into practice.

00:44:06.900 --> 00:44:08.620
Do the best you can.

00:44:08.620 --> 00:44:11.780
You know, there's many ways to build a flow.

00:44:11.980 --> 00:44:13.820
There's many ways to do this.

00:44:14.060 --> 00:44:24.860
As you, you know, as this scenario proves that, you know, you may use, I bet when we pull this up, we're going to have folks use one set of processors.

00:44:25.120 --> 00:44:33.340
Others use another set of processors, you know, and it's all going to accomplish the same goal of getting that alert up.

00:44:33.340 --> 00:44:35.620
But yes, use any processor.

00:44:35.640 --> 00:44:37.560
You can use previous processors

00:44:37.560 --> 00:44:40.440
you've learned about, those types of things.

00:44:40.460 --> 00:44:45.440
The reason I didn't make it all CSV is so you can't just reuse the last flow.

00:44:46.840 --> 00:44:49.260
But yeah, you know, reuse what you need to.

00:44:49.620 --> 00:44:55.280
If you had Python set up properly and some of those things, you could easily do this in a script.

00:44:55.720 --> 00:45:02.080
If you knew Python, you know, so yeah, you know, use the processors that are available.

00:45:02.080 --> 00:45:07.100
Do the best you can. I put some tips and tricks in there.

00:45:07.100 --> 00:45:09.800
But if you want to use another processor, have at it.

00:45:12.140 --> 00:45:22.580
So just to see if I was on the right track with this, I was using an ExtractText and I created some properties based off of capture groups.

00:45:22.740 --> 00:45:23.320
Okay.

00:45:23.320 --> 00:45:24.300
I just called it col.

00:45:24.400 --> 00:45:31.040
So I have like col 1 through col 20 as attributes.

00:45:32.200 --> 00:45:35.760
Is that on the right track with that ExtractText?

00:45:36.100 --> 00:45:37.120
It is.

00:45:37.460 --> 00:45:43.760
So, you know, one of the things that would put you on the right track is extracting these values as attributes.

00:45:44.380 --> 00:45:52.160
Because once you have them all as attributes, you can manipulate them all day long.

00:45:52.740 --> 00:45:56.180
And you can pass those attributes to different processors.

00:45:56.520 --> 00:45:57.520
You know, those types of things.

00:45:57.640 --> 00:45:59.660
But you are definitely on the right path.
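
The capture-group approach discussed in this exchange can be illustrated with a plain Python regular expression. This is a sketch of the idea only: the sample line, the seven-column layout, and the `col.N` naming are assumptions for the example (ExtractText derives attribute names from the property name plus the capture group index, but check the processor documentation for the exact attributes it emits).

```python
import re

# Hypothetical hourly reading: station, date, hour, temperature,
# humidity, wind speed, precipitation.
LINE = "S1,2024-05-07,0,14.8,60,5.0,0.0"

# One capture group per comma-separated column.
PATTERN = re.compile(r"^([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*)$")


def extract_attributes(line, prefix="col"):
    """Return {'col.1': ..., 'col.2': ...} for a matching line, else an empty dict."""
    match = PATTERN.match(line)
    if match is None:
        return {}
    return {f"{prefix}.{i}": value
            for i, value in enumerate(match.groups(), start=1)}


attrs = extract_attributes(LINE)
print(attrs["col.1"], attrs["col.4"])
```

Once the values are attributes like this, later steps can route on them or combine them, which is the point made in the answer above.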

00:46:01.200 --> 00:46:01.560
Thank you.

00:46:01.600 --> 00:46:03.120
Yeah, no worries. Good question.

00:46:14.120 --> 00:46:15.780
Hey, Josh, I got a question.

00:46:19.120 --> 00:46:20.160
Yes, sir.

00:46:20.660 --> 00:46:25.580
I have a feeling you went over this in the part that I missed this morning.

00:46:26.020 --> 00:46:35.760
But I'm trying to use the ConvertRecord processor and I'm getting an error that it was validated against a different UUID than the one I'm using.

00:46:35.900 --> 00:46:38.500
I'm not sure what the issue is.

00:46:39.060 --> 00:46:42.220
Oh, let's look at it. Let me pull yours up right quick.

00:46:43.340 --> 00:46:50.000
And while I'm off mute, we still need to take our last break after lunch.

00:46:51.480 --> 00:46:58.360
But what I'm thinking is, is after I answer this question, I want to just go to the restroom and come right back.

00:46:58.680 --> 00:47:02.900
So, you know, just take a break in place and continue working on your data flow.

00:47:04.200 --> 00:47:08.980
Okay, "Record Reader" validated against this, the controller service is disabled, right?

00:47:09.920 --> 00:47:12.620
Okay, so I'm not sure what that means.

00:47:12.620 --> 00:47:15.260
Yeah, no, you did.

00:47:15.960 --> 00:47:26.840
So go into that processor and you see you have the CSVReader and the JsonRecordSetWriter. To the right is that arrow.

00:47:27.100 --> 00:47:27.700
There you go.

00:47:27.820 --> 00:47:29.980
And yes, you have those disabled.

00:47:30.960 --> 00:47:33.980
So you will need to enable them with the lightning bolt.

00:47:35.020 --> 00:47:40.360
For the scope, you can just do service only for now, until you enable both of those.

00:47:40.860 --> 00:47:53.580
And I think you might have missed this during the morning session as well: if you're trying to use like an Avro schema or something, you will need to put those services in there.

00:47:55.040 --> 00:48:05.060
If you want, you could if you have that zip file, well now see it's working now.

00:48:05.400 --> 00:48:17.700
But if you're using a schema, if you have that zip file that I had everyone download earlier, it actually has the working flow.

00:48:17.700 --> 00:48:21.900
Okay, so it has that working flow in it.

00:48:21.900 --> 00:48:24.220
You know, go back and go to the.

00:48:25.460 --> 00:48:27.740
Oh, yeah, that's kind of what I've been.

00:48:27.940 --> 00:48:29.040
Okay, yeah, that's it.

00:48:29.160 --> 00:48:29.480
There you go.

00:48:29.560 --> 00:48:32.080
So it has that working flow.

00:48:32.540 --> 00:48:39.700
The reason that the CSV to JSON ConvertRecord is disabled there is only because, you know, you imported it in.

00:48:40.040 --> 00:48:43.040
But nobody ever enabled the controller service.

00:48:43.040 --> 00:48:58.200
If you go into that one, for instance, and go to the CSV reader and hit that, you see that you need to enable those and then your previous flow will work.

00:49:00.340 --> 00:49:04.260
And that should help take away the warning.

00:49:04.380 --> 00:49:05.520
Yep, you're good.

00:49:05.900 --> 00:49:08.140
So the other two are disabled as well.

00:49:08.200 --> 00:49:09.860
Are you going to go back to your?

00:49:10.080 --> 00:49:10.220
Yep.

00:49:10.220 --> 00:49:17.980
The easiest way is just go into the processor, go over to your controller services and the bottom two are disabled as well.

00:49:18.200 --> 00:49:20.220
So you want to enable them and there we go.

00:49:20.820 --> 00:49:24.200
So that original data flow should work for you now.

00:49:25.680 --> 00:49:26.120
Okay.

00:49:26.400 --> 00:49:29.140
And you can actually tell the convert CSV to JSON,

00:49:29.360 --> 00:49:30.760
just tell it to run once.

00:49:30.940 --> 00:49:33.080
Right-click and tell it to run once and see if it gets success.

00:49:33.100 --> 00:49:35.320
I don't think I did any of the steps.

00:49:35.340 --> 00:49:35.820
No, no, no.

00:49:35.860 --> 00:49:37.960
You already have a file in the queue.

00:49:38.660 --> 00:49:39.620
Oh, yeah.

00:49:41.260 --> 00:49:44.020
And then, yeah, hit refresh on your canvas.

00:49:45.000 --> 00:49:46.320
Oh, you do have success.

00:49:46.620 --> 00:49:47.080
Perfect.

00:49:47.380 --> 00:49:51.040
So you took that original CSV and you made it into a JSON file.

00:49:52.320 --> 00:49:52.860
Cool.

00:49:53.180 --> 00:49:53.560
Okay.

00:49:53.840 --> 00:50:06.320
And you can use, you know, I don't want to see like an exact copy of what we did this morning, but you can use this morning's activity as reference.

00:50:07.840 --> 00:50:09.540
Yeah, I missed the whole morning.

00:50:10.000 --> 00:50:10.860
Oh, no worries.

00:50:11.160 --> 00:50:11.620
No worries.

00:50:12.440 --> 00:50:12.660
Okay.

00:50:12.860 --> 00:50:13.140
Good.

00:50:13.480 --> 00:50:14.540
Any other questions?

00:50:16.060 --> 00:50:17.680
No, I think I'm good for now then.

00:50:18.020 --> 00:50:18.320
Okay.

00:50:18.600 --> 00:50:20.700
I am actually just going to run to the restroom.

00:50:21.000 --> 00:50:23.120
I will be right back.

00:50:23.120 --> 00:50:31.080
If I miss anybody, you know, while I'm away, you know, just leave me a message in the chat, but I'll be back in a couple minutes.

00:50:38.460 --> 00:50:39.280
I'm back.

00:50:39.440 --> 00:50:48.320
If anyone has any questions, let's spend about, you know, 20 more minutes on this.

00:50:50.140 --> 00:50:55.440
Hopefully make some progress and then start going over some of the data flows.

00:50:55.480 --> 00:51:02.320
If you get done, you know, with your data flow, let me know and we can start reviewing it.

00:51:06.320 --> 00:51:11.640
Looks like some of you are getting very close to completing this task.

00:51:13.440 --> 00:51:20.320
If you get hung up or, you know, if you're just taking a while, don't feel bad.

00:51:21.660 --> 00:51:22.940
We have plenty of time.

00:51:23.160 --> 00:51:29.340
You can work on this later, you know, after the class if you want or tomorrow morning.

00:51:29.420 --> 00:51:31.700
You can finish it on your own time later.

00:51:32.080 --> 00:51:33.600
I can send you the scenario.

00:51:33.680 --> 00:51:35.160
We'll be sending the scenario.

00:51:35.160 --> 00:51:38.720
So, you know, you can practice it on your own time as well.

00:51:38.740 --> 00:51:45.740
But we'll give it another 15 minutes and then let's start going through some of these data flows.

00:52:02.560 --> 00:52:03.960
Okay.

00:52:09.560 --> 00:52:10.960
Okay.

00:52:28.240 --> 00:52:29.640
Okay.

00:52:29.680 --> 00:52:35.380
Looks like everyone's very close.

00:52:36.940 --> 00:52:40.940
So we may just finish up working on this today.

00:52:41.200 --> 00:52:49.000
And then I can go through the flow later and see how you do.

00:52:49.040 --> 00:52:51.040
And then we can talk about it in the morning as well.

00:52:52.100 --> 00:52:54.400
So we have a few minutes left.

00:52:54.400 --> 00:53:00.260
I'm going to touch on some of these and see if there's anything I can do to help.

00:53:00.700 --> 00:53:02.620
The first one I have to pull up is Cody.

00:53:03.680 --> 00:53:04.840
How are things going?

00:53:05.180 --> 00:53:06.260
Anything I can do to help?

00:53:06.420 --> 00:53:08.240
Yeah, I tried to change.

00:53:08.340 --> 00:53:12.360
I tried to reuse the one we did in the previous flow.

00:53:12.700 --> 00:53:15.620
And I was trying to change some of the schema.

00:53:15.980 --> 00:53:21.460
But it was giving me an error on some of the formatting for that.

00:53:23.480 --> 00:53:24.080
Okay.

00:53:24.140 --> 00:53:30.000
So you're getting your weather data.

00:53:30.660 --> 00:53:32.560
Are you just getting the CSV files?

00:53:33.780 --> 00:53:35.500
Yeah, I'm just grabbing just the CSVs.

00:53:35.520 --> 00:53:37.100
Okay, perfect, perfect. Oh, awesome.

00:53:38.700 --> 00:53:41.560
And then you were setting the schema name?

00:53:42.400 --> 00:53:42.460
Yes.

00:53:42.460 --> 00:53:43.920
Do I need to add?

00:53:44.240 --> 00:53:45.400
Yes, just the weather.

00:53:45.620 --> 00:53:46.600
Okay, perfect.

00:53:46.600 --> 00:53:49.620
The last one we did was inventory.

00:53:52.900 --> 00:53:57.440
And then here, this is where it errors out.

00:53:58.620 --> 00:54:00.400
Let's stop it.

00:54:00.720 --> 00:54:01.440
I get an error.

00:54:04.640 --> 00:54:05.500
Temperature.

00:54:05.500 --> 00:54:06.920
The input string.

00:54:08.000 --> 00:54:08.860
Temperature.

00:54:10.000 --> 00:54:11.580
Yeah, click here.

00:54:15.760 --> 00:54:16.340
Okay.

00:54:18.140 --> 00:54:24.420
Something I noticed already is "Weather" is capitalized on your other one.

00:54:24.780 --> 00:54:26.140
And here it's lowercase.
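
The capitalization mismatch pointed out here matters because a schema is looked up by its exact name. A tiny Python sketch of that behaviour, using a plain dictionary as a stand-in for a schema registry (the `SCHEMAS` dictionary and `lookup_schema` helper are hypothetical, not NiFi APIs):

```python
# Stand-in for a schema registry: schema text keyed by exact schema name.
SCHEMAS = {
    "weather": '{"type": "record", "name": "weather", "fields": []}',
}


def lookup_schema(schema_name, registry=SCHEMAS):
    """Return the schema text registered under exactly `schema_name`."""
    try:
        return registry[schema_name]
    except KeyError:
        known = sorted(registry)
        raise LookupError(
            f"no schema named {schema_name!r}; known names: {known}"
        ) from None


print(lookup_schema("weather") is not None)   # exact match is found

try:
    lookup_schema("Weather")                  # capital W does not match
except LookupError as err:
    print(err)
```

The fix in the flow is the same as in the sketch: make the name set on the FlowFile and the name in the registry identical, including case.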

00:54:29.480 --> 00:54:32.460
Also, you can expand on that box.

00:54:32.580 --> 00:54:39.540
If you go down just a little bit below the down arrow on the box itself,

00:54:39.540 --> 00:54:41.600
to your right, right above.

00:54:41.880 --> 00:54:42.720
Okay, right there.

00:54:42.760 --> 00:54:44.220
There you go.

00:54:44.380 --> 00:54:45.360
That would have been helpful.

00:54:47.420 --> 00:54:51.440
Okay, so type record, name weather, fields.

00:54:51.500 --> 00:54:54.500
This one I had as an integer,

00:54:54.620 --> 00:54:57.600
but it was erroring there and I changed it to string and it passed.

00:54:57.980 --> 00:54:59.000
So I'm not sure.

00:55:00.800 --> 00:55:01.560
No worries.

00:55:02.060 --> 00:55:04.260
I mean, you can have them all as string if you want to.

00:55:04.260 --> 00:55:08.960
Okay, so I'm looking at the station ID,

00:55:10.680 --> 00:55:17.240
the date, the hour, temperature, humidity, wind speed, and precipitation.

00:55:18.380 --> 00:55:19.020
Okay.

00:55:20.360 --> 00:55:21.000
Say okay there.

00:55:21.420 --> 00:55:22.700
That actually looks good.

00:55:23.340 --> 00:55:28.340
Well, I did get an error on the caps mismatch.

00:55:28.340 --> 00:55:34.080
Say apply there and hit X and just close that window out.

00:55:34.760 --> 00:55:37.140
And let's look at your set schema, the step before.

00:55:37.260 --> 00:55:38.740
That's where I saw the capital.

00:55:40.060 --> 00:55:41.460
You got to stop and configure.

00:55:41.700 --> 00:55:42.000
There you go.

00:55:42.000 --> 00:55:42.380
Apply.

00:55:42.780 --> 00:55:43.260
Perfect.

00:55:44.640 --> 00:55:45.100
Okay.

00:55:45.260 --> 00:55:47.740
You need to enable your services again.

00:55:52.560 --> 00:55:54.460
And then the other ones are invalid,

00:55:54.520 --> 00:55:56.500
so let's look at why they are invalid.

00:55:56.500 --> 00:55:59.200
They may be invalid because of that service.

00:55:59.380 --> 00:56:01.180
There we go.

00:56:02.500 --> 00:56:03.260
All of them are enabled.

00:56:03.780 --> 00:56:05.260
Still throwing an error.

00:56:08.020 --> 00:56:10.840
Still getting the temperature error.

00:56:12.320 --> 00:56:15.200
Let's go back into your convert CSV to JSON.

00:56:16.580 --> 00:56:18.660
Let's go to your CSV reader.

00:56:20.560 --> 00:56:25.540
Pull that controller service and let's look at the settings for that.

00:56:26.140 --> 00:56:27.240
Scroll down.

00:56:27.940 --> 00:56:30.020
You want to treat your first line as header.

00:56:30.340 --> 00:56:30.780
It's true.

00:56:31.820 --> 00:56:35.840
Because it doesn't know how to, there you go.

00:56:37.580 --> 00:56:42.580
So because it's trying to treat it as some of the data,

00:56:42.940 --> 00:56:44.880
so you want to set your first line,

00:56:44.880 --> 00:56:46.200
treat first line as header.
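
Why the header setting matters can be shown with Python's standard `csv` module. This is an analogy for the reader's Treat First Line as Header property, not NiFi code; the two-column sample and the function names are assumptions for the example. When the header row is read as data, the word "temperature" reaches the numeric conversion and fails, which matches the earlier error about the input string "temperature".

```python
import csv
import io

# Hypothetical two-column sample with a header row.
CSV_TEXT = "station_id,temperature\nS1,14.8\n"


def temperatures_header_as_data(text):
    """Treat every line as data, so the header row is parsed as a reading."""
    return [float(row[1]) for row in csv.reader(io.StringIO(text))]


def temperatures_with_header(text):
    """Treat the first line as the header and parse only the data rows."""
    return [float(row["temperature"])
            for row in csv.DictReader(io.StringIO(text))]


try:
    temperatures_header_as_data(CSV_TEXT)
except ValueError as err:
    print("header parsed as data:", err)

print(temperatures_with_header(CSV_TEXT))
```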

00:56:47.620 --> 00:56:48.200
Apply.

00:56:49.180 --> 00:56:50.100
Exit out of that.

00:56:50.380 --> 00:56:51.120
We'll enable it.

00:56:51.260 --> 00:56:52.680
Then next, what's our latest error?

00:56:52.960 --> 00:56:54.720
Did not parse.

00:56:56.080 --> 00:56:58.600
Failed to stream.

00:56:59.220 --> 00:57:04.260
Error while getting next record, for input string 14.8.

00:57:04.980 --> 00:57:07.160
Not sure where that input string would be.

00:57:07.160 --> 00:57:12.200
That is the temperature.

00:57:15.080 --> 00:57:19.160
So it pulled in the station ID, the date and hour,

00:57:20.160 --> 00:57:24.180
and then the temperature it was having a problem with.

00:57:25.880 --> 00:57:27.460
Change that to a string maybe.

00:57:28.100 --> 00:57:29.280
Change that to a string.

00:57:31.900 --> 00:57:37.660
We can just try string, not to get too overcomplicated here.

00:57:38.200 --> 00:57:38.640
Perfect.

00:57:38.660 --> 00:57:39.720
Put them all to string.

00:57:39.880 --> 00:57:40.600
Make it a little easier.
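
Setting every field to string works, as done here. An alternative, sketched below under assumptions, is to keep numeric types but declare decimal readings such as 14.8 as `double` rather than `int`, since an integer parse of "14.8" is what fails. The field names follow the columns read out earlier (station ID, date, hour, temperature, humidity, wind speed, precipitation), but their exact spelling in the data files is a guess, and the `coerce` helper only imitates type conversion in plain Python; it is not how NiFi's record readers are implemented.

```python
import json

# Avro-style record schema with decimal readings declared as double.
WEATHER_SCHEMA = {
    "type": "record",
    "name": "weather",
    "fields": [
        {"name": "station_id", "type": "string"},
        {"name": "date", "type": "string"},
        {"name": "hour", "type": "int"},
        {"name": "temperature", "type": "double"},
        {"name": "humidity", "type": "double"},
        {"name": "wind_speed", "type": "double"},
        {"name": "precipitation", "type": "double"},
    ],
}

# Python converters standing in for the schema's primitive types.
CONVERTERS = {"string": str, "int": int, "double": float}


def coerce(row, schema=WEATHER_SCHEMA):
    """Convert a row of strings to the types declared in the schema."""
    return {field["name"]: CONVERTERS[field["type"]](row[field["name"]])
            for field in schema["fields"]}


row = {"station_id": "S1", "date": "2024-05-07", "hour": "0",
       "temperature": "14.8", "humidity": "60", "wind_speed": "5.0",
       "precipitation": "0.0"}

print(json.dumps(WEATHER_SCHEMA, indent=2))   # text you could paste as a schema
print(coerce(row))
```

With `temperature` typed as `int`, the same row would raise an error on "14.8", which is the failure seen in class.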

00:57:40.720 --> 00:57:42.140
So when you go back to your canvas,

00:57:42.920 --> 00:57:47.720
that processor is going to be started because you started both the processor

00:57:47.720 --> 00:57:49.640
and the controller services.

00:57:50.060 --> 00:57:51.060
No, it's fine though.

00:57:52.020 --> 00:57:53.200
But go ahead and stop there.

00:57:54.320 --> 00:57:55.180
That's this.

00:57:56.140 --> 00:57:58.560
And then I haven't had time to get to more.

00:57:58.560 --> 00:57:59.820
I just laid everything out.

00:57:59.820 --> 00:58:03.020
I haven't really built out any of the configurations for anything else.

00:58:03.140 --> 00:58:04.340
Okay, no worries.

00:58:04.660 --> 00:58:11.560
Like I said, I suspect some people will get done with this.

00:58:12.120 --> 00:58:13.180
Some people may not.

00:58:14.540 --> 00:58:17.640
But I'm glad you kind of laid it out.

00:58:18.100 --> 00:58:21.460
And so what we'll do first thing in the morning is kind of go through

00:58:21.460 --> 00:58:25.420
what your thought process is on some of the ones that you didn't get finished.

00:58:26.600 --> 00:58:29.240
And we can talk through those real quickly.

00:58:31.100 --> 00:58:34.780
Again, I'm not really looking for the flow to be completed.

00:58:36.200 --> 00:58:38.200
This is interactive.

00:58:38.360 --> 00:58:41.420
Just let me know what you're thinking and how you would do this.

00:58:41.980 --> 00:58:49.100
And just to make sure we have a good grasp on the components that we figured out.

00:58:49.100 --> 00:58:52.080
But besides that, it's looking good so far.

00:58:54.320 --> 00:59:00.920
If I was doing this real quickly, I would have used some of the previous flow as well.

00:59:01.280 --> 00:59:03.380
I would have saved that as JSON.

00:59:03.560 --> 00:59:06.220
I would have brought all three JSON documents in.

00:59:06.600 --> 00:59:11.860
I would have done an EvaluateJsonPath and extracted all the data from the JSON

00:59:11.860 --> 00:59:18.480
and then moved to the next processor to combine the data or make the calculations.

00:59:19.580 --> 00:59:26.680
So my flow would have probably been like six or seven processors if I was doing it a simple way.
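
The "extract values from the JSON into attributes" step of this simpler approach can be sketched in Python. The resolver below handles only plain `$.a.b` paths, a small subset of JSONPath, and the document structure, attribute names, and `evaluate_paths` helper are all assumptions made for the illustration.

```python
import json


def evaluate_paths(document, paths):
    """Resolve simple '$.a.b' paths against parsed JSON; return string attributes."""
    attributes = {}
    for attr_name, path in paths.items():
        if not path.startswith("$."):
            raise ValueError(f"unsupported path: {path!r}")
        value = document
        for key in path[2:].split("."):
            value = value[key]               # walk one level per path segment
        attributes[attr_name] = str(value)   # attributes are stored as strings
    return attributes


# Hypothetical reading from one weather station.
doc = json.loads(
    '{"station_id": "S3", "readings": {"temperature": 36.0, "humidity": 20}}'
)
attrs = evaluate_paths(doc, {
    "station": "$.station_id",
    "temp": "$.readings.temperature",
})
print(attrs)
```

With the values held as attributes, the next processor in the flow can combine stations or make the calculations, as described above.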

00:59:27.420 --> 00:59:30.160
But I think you've got a grasp on this.

00:59:31.180 --> 00:59:32.960
Feel free to finish up.

00:59:32.960 --> 00:59:35.200
And then we'll go through it in the morning.