WEBVTT

00:00:02.280 --> 00:00:08.420
Looking good Cody, looking good.

00:00:13.380 --> 00:00:19.660
Does anyone have any issues so far or are we good to go?

00:00:21.060 --> 00:00:23.800
Pedro, were you able to copy your data flow into a new group?

00:00:23.800 --> 00:00:30.680
Yeah, I was able to download it and create a new processor group and just copy and paste the old one into this one.

00:00:31.560 --> 00:00:36.600
Yeah, so you want to bring down a new processor group. Let me pull yours up and see.

00:00:40.940 --> 00:00:48.900
Perfect, so you see on your breadcrumb trail you have NiFi Flow and then the new processor group. Click NiFi Flow. Perfect, perfect.

00:00:48.900 --> 00:00:59.220
And name it. So yeah, if you create a new processor group, we are going to ingest a CSV file.

00:00:59.400 --> 00:01:04.980
I talked a little bit about this yesterday and we are going to convert that to JSON.

00:01:06.140 --> 00:01:15.820
So, you know, if you want to name your processor group CSV to JSON or something similar, you know, to make it easy to understand.

00:01:16.880 --> 00:01:21.020
Got it. Perfect. You got it. You got it.

00:01:21.240 --> 00:01:31.500
Alright, Pedro's good. Okay, so for this exercise, we are going to pick up a CSV file.

00:01:31.780 --> 00:01:37.360
We're going to convert it to JSON. We are going to use controller services.

00:01:38.200 --> 00:01:46.460
And so we're going to need to create a record reader, a record writer and get those enabled.

00:01:47.680 --> 00:01:52.380
So this is one of the more advanced data flows.

00:01:52.440 --> 00:01:58.620
I suspect this is going to take a little bit of time, but, you know, you'll see a lot of this.

00:01:58.620 --> 00:02:09.940
It also reinforces, you know, the thing I was talking about yesterday where you can set up these controller services and they can be reused.

00:02:10.480 --> 00:02:13.780
So, for instance, this morning we're going to set up a CSV reader.

00:02:14.120 --> 00:02:20.380
That CSV reader can be used over and over. Brett, you shouldn't need to log in.

00:02:21.180 --> 00:02:29.760
It told me to download without logging in, is what my screen told me.

00:02:30.360 --> 00:02:39.220
But anyway, so, you know, we're going to set these controller services up and we're going to go through and build this out.

00:02:39.600 --> 00:02:48.600
I did include the flow itself already built and you're more than welcome to, you know,

00:02:48.600 --> 00:02:53.500
import that flow. You can use it as an example.

00:02:53.820 --> 00:03:05.320
But, you know, I really like for everyone to, you know, kind of build this from scratch if possible, just because, you know, it helps, you know, configure the controller services, those types of things.

00:03:05.500 --> 00:03:08.960
And I'll kind of walk along with it.

00:03:09.980 --> 00:03:12.340
Yeah, thanks for the laugh.

00:03:14.320 --> 00:03:17.800
All right, so let's get started on that.

00:03:19.100 --> 00:03:23.980
So, so the zip file you downloaded, did everyone get that zip file downloaded?

00:03:24.720 --> 00:03:27.640
And if you can extract that zip file.

00:03:27.900 --> 00:03:36.140
And so if you're looking at my screen, I downloaded the zip file.

00:03:36.740 --> 00:03:41.740
Perfect. And what I will do is just go to my.

00:03:47.420 --> 00:03:53.780
Downloads, and you should see example main. Right click and say extract all.

00:03:57.420 --> 00:04:06.380
Perfect. And then go into the example main and you should see some sample data.

00:04:14.540 --> 00:04:18.980
Perfect. And an inventory folder and inventory dot CSV.

00:04:25.560 --> 00:04:29.160
Let's take a look. Who was that speaking?

00:04:29.460 --> 00:04:33.120
I'm sorry. Hey, Cody, I've asked you that multiple times.

00:04:33.420 --> 00:04:38.580
All right, let's see. I should have your voice memorized now.

00:04:39.860 --> 00:04:48.280
What is? Yeah, that's very strange because I'm looking at it and you just chat me in the VM.

00:04:48.280 --> 00:04:53.480
Yeah, I can do that Dropbox link, right.

00:04:53.500 --> 00:04:58.120
Can I just send it directly to you in chat and teams chat?

00:04:58.940 --> 00:05:04.340
VM would be better if you can, because I have my teams on my government one and I'm using the VM on my personal computer.

00:05:04.460 --> 00:05:07.920
Oh, OK. Let me see.

00:05:09.360 --> 00:05:11.940
Teams is fine, too. I can I can just send it over.

00:05:15.300 --> 00:05:20.480
Let me put it in teams, but I might be able to put it directly into this.

00:05:21.320 --> 00:05:26.960
And let me see if I can take control interactive copy link.

00:05:28.700 --> 00:05:34.660
There you go. And I'll download it for you, you know.

00:05:34.800 --> 00:05:38.320
It worked. I was very surprised it worked.

00:05:38.320 --> 00:05:45.140
So because I have to go back and then tell it I want to interact with your machine and then log in your machine.

00:05:46.140 --> 00:05:49.980
OK, so everyone should have that zip file.

00:05:50.260 --> 00:05:54.660
On your machine, if you can, extract it.

00:05:55.560 --> 00:05:59.360
And, you know, it might be easier to put on your desktop,

00:05:59.400 --> 00:06:07.000
but put it in a location that you know about because we're going to need to use our get file processor.

00:06:17.660 --> 00:06:22.180
OK, so hopefully everyone has that.

00:06:23.040 --> 00:06:30.580
Let me know if you run into any issues; we can stop and get it squared away.

00:06:33.180 --> 00:06:37.820
But with that being said, you should be able to go into that zip file.

00:06:37.820 --> 00:06:44.200
There's a sample data folder and then there's an inventory folder and inventory dot CSV.

00:06:45.900 --> 00:06:52.780
So if you can on your NIFI canvas, let's do a new process group.

00:06:53.960 --> 00:06:58.080
Because we already have a data flow that we worked on yesterday,

00:06:58.160 --> 00:07:02.840
we want to bring down and do a new process group onto your canvas.

00:07:02.840 --> 00:07:07.460
And so that way we can start building on to this data flow.

00:07:08.680 --> 00:07:11.860
And if you look at my screen.

00:07:14.840 --> 00:07:19.760
Right, I have a process group called CSV to JSON demo data flow.

00:07:22.160 --> 00:07:28.960
You can name it however you want, but, you know, definitely something you can,

00:07:29.720 --> 00:07:31.740
something you can work off of.

00:07:33.840 --> 00:07:39.140
So our first step is we need to get the file from the directory.

00:07:39.960 --> 00:07:43.880
So what you want to start with is most likely a get file.

00:07:44.780 --> 00:07:47.560
There's a couple of different ways you could do that.

00:07:47.580 --> 00:07:52.900
You could list the directory, filter on it, and then turn around and do a get file.

00:07:52.900 --> 00:07:59.560
But I find just leading off with a get file as the easiest method.

00:08:00.180 --> 00:08:03.700
And the file that we're looking for is inventory dot CSV.

00:08:06.760 --> 00:08:10.000
So you can design this how you want.

00:08:10.700 --> 00:08:21.040
But, you know, for me, for instance, I put, you know,

00:08:21.040 --> 00:08:27.960
it was in my uploads directory because this desktop environment that we're working in

00:08:27.960 --> 00:08:31.840
has an uploads directory on the desktop that we can put files into.

00:08:32.700 --> 00:08:38.740
And then the file filter, I put inventory dot CSV instead of picking up everything.
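
NOTE
For reference, a minimal get file configuration along the lines described here is sketched below. The input directory shown is only an illustrative placeholder, not the path from the recording; point it at wherever you extracted the zip.
    GetFile
      Input Directory : /home/student/Desktop/uploads    (placeholder path)
      File Filter     : inventory.csv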

00:08:40.160 --> 00:08:47.580
All right. And then once we have that, we need to set the schema metadata.

00:08:48.100 --> 00:08:54.020
And so, you know, this attribute is metadata so that we can later understand

00:08:54.020 --> 00:08:56.900
which schema to use to process the data.

00:08:57.280 --> 00:09:06.120
So, you know, we want to create an update attribute just because once we get that CSV file,

00:09:06.460 --> 00:09:16.340
we want to be able to tag that metadata with a schema name that it will use to read and to write the data.

00:09:17.580 --> 00:09:26.620
And I know I have mine pulled up, but I would love to see others in your own thought process

00:09:26.620 --> 00:09:31.560
putting this in and, you know, being able to bring that up.

00:09:32.320 --> 00:09:34.740
So we get the set schema metadata.

00:09:35.560 --> 00:09:42.880
You can just add if you didn't know on the update attribute.

00:09:43.140 --> 00:09:47.220
I think a couple of you were working on that yesterday with an update attribute.

00:09:47.220 --> 00:09:51.160
You can go to the properties and just add a property.

00:09:51.740 --> 00:09:53.720
And, you know, you want to do the property name.

00:09:53.860 --> 00:09:58.080
So we could actually do schema dot type, for instance.

00:09:59.320 --> 00:10:04.160
And it needs a value. And I'll just put a row.

00:10:05.900 --> 00:10:12.880
I'm not going to use that attribute, but, you know, that's how you would add attributes, you know, to a flow.

00:10:13.360 --> 00:10:15.460
So we are getting the file.

00:10:15.460 --> 00:10:17.920
We are ingesting that.

00:10:18.320 --> 00:10:20.680
We are not even looking at the CSV yet.

00:10:21.420 --> 00:10:23.300
We are just looking at the metadata.

00:10:24.240 --> 00:10:33.200
So what we want to do is add an attribute to that which says schema.name is set to inventory.
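
NOTE
In other words, the update attribute processor in this step only adds one dynamic property to every flow file that passes through, exactly as configured in the exercise:
    UpdateAttribute (set schema metadata)
      schema.name = inventory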

00:10:33.720 --> 00:10:41.160
And so once we have that, we then need to convert CSV to JSON.

00:10:42.120 --> 00:10:48.240
So the processor I like to use for that is the convert record processor.

00:10:51.560 --> 00:10:58.200
And in your properties, you're going to have a record reader and a record writer.

00:10:59.300 --> 00:11:03.020
These are controller services that we're going to set up.

00:11:03.740 --> 00:11:06.740
So the record reader is a CSV reader.

00:11:06.740 --> 00:11:12.780
So you should be able to choose, you know, an Avro reader or CSV reader.

00:11:13.600 --> 00:11:18.200
And then on your record writer, we want to say the JSON record set writer.

00:11:19.480 --> 00:11:22.840
And let me know if you have any hiccups there.

00:11:23.660 --> 00:11:32.460
And then once you have that CSV reader, record reader set up, you can actually go to the service.

00:11:33.100 --> 00:11:38.060
And so, you know, if the service isn't there, let me know.

00:11:38.340 --> 00:11:45.240
But once you put that in and hit apply, it should take it.

00:11:45.240 --> 00:11:47.420
But if it doesn't, we can just create a new one.

00:11:56.920 --> 00:12:02.340
And you see where I have a CSV reader that I went in and created.

00:12:18.740 --> 00:12:23.440
Let me look. This is the tricky part of this flow.

00:12:23.440 --> 00:12:32.920
And so what we want to do is set up the, you know, the first record reader, or the first

00:12:32.920 --> 00:12:35.480
controller service, which is that CSV record reader.

00:12:36.260 --> 00:12:40.100
And then we want to set up a JSON record set writer.

00:12:40.260 --> 00:12:47.440
And we're going to also set up a couple of schema controller services.

00:13:01.080 --> 00:13:16.320
So when you use the convert record processor, it should convert record.

00:13:24.380 --> 00:13:25.060
Okay.

00:13:25.180 --> 00:13:25.340
Yeah.

00:13:25.340 --> 00:13:25.740
Perfect.

00:13:26.040 --> 00:13:28.840
So you should have CSV reader on the record reader.

00:13:29.200 --> 00:13:35.180
You should have JSON record set writer on the record writer.

00:13:35.720 --> 00:13:40.600
You want to change, you know, just as a tip, include zero record flow files.

00:13:40.740 --> 00:13:43.040
You want to change that to false.
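
NOTE
Put together, the convert record processor for this exercise ends up configured roughly like this; the reader and writer are the controller services created in the next steps.
    ConvertRecord
      Record Reader                 : CSVReader
      Record Writer                 : JSONRecordSetWriter
      Include Zero Record FlowFiles : false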

00:13:54.060 --> 00:13:59.700
And then once you save and say okay for that, you can go back into that

00:13:59.700 --> 00:14:06.440
configuration and then you can actually go straight to the controller services.

00:14:07.040 --> 00:14:15.940
And you should have controller services listed and you want to add a CSV reader,

00:14:16.520 --> 00:14:22.500
a JSON record set writer, an Avro reader, and an Avro schema registry.

00:14:26.060 --> 00:14:29.400
Andrew, I'm going to pull your screen up because it looks like you're in the

00:14:29.400 --> 00:14:32.100
middle of creating it.

00:14:32.100 --> 00:14:38.780
I'm trying to figure out what goes in that metadata properties.

00:14:39.840 --> 00:14:40.360
Yeah.

00:14:40.540 --> 00:14:43.040
If you can, you want to hit cancel.

00:14:43.420 --> 00:14:43.600
Okay.

00:14:43.840 --> 00:14:47.160
So we are at the get CSV file.

00:14:47.400 --> 00:14:48.180
You named it.

00:14:48.180 --> 00:14:49.120
You got it.

00:14:49.160 --> 00:14:53.420
But what we want to do then is do an update attribute because we want to

00:14:53.420 --> 00:14:56.840
tell NiFi which schema to use.

00:14:57.600 --> 00:15:04.440
So I hit cancel and bring down a new processor and the processor is an

00:15:04.440 --> 00:15:06.160
update attribute processor.

00:15:07.920 --> 00:15:11.900
So you can actually just start typing it in the update attribute.

00:15:12.420 --> 00:15:12.840
Perfect.

00:15:12.980 --> 00:15:18.020
And then you want to drag your success, you know, down to that one.

00:15:18.500 --> 00:15:19.020
Awesome.

00:15:19.280 --> 00:15:19.500
Bad.

00:15:19.800 --> 00:15:20.160
Perfect.

00:15:20.640 --> 00:15:22.400
And then go into your.

00:15:22.400 --> 00:15:23.680
You can delete that.

00:15:23.720 --> 00:15:25.280
You can delete the other processor.

00:15:25.960 --> 00:15:26.720
That's just dangling.

00:15:27.860 --> 00:15:32.440
So what we want to do is you want to go into your update attribute configuration.

00:15:33.580 --> 00:15:33.900
Okay.

00:15:33.960 --> 00:15:34.500
Perfect.

00:15:35.040 --> 00:15:41.040
And, you know, on that, you want to set, you want to do a new value.

00:15:41.220 --> 00:15:46.240
So that plus sign you see over there, you want to click on that.

00:15:48.380 --> 00:15:53.100
The plus sign on the top right in your processor configuration.

00:15:57.400 --> 00:15:57.920
Okay.

00:15:58.220 --> 00:16:03.320
And then you want to do property name is schema dot name.

00:16:03.460 --> 00:16:04.100
Say, okay.

00:16:04.260 --> 00:16:06.100
And you want to put inventory.

00:16:06.240 --> 00:16:06.760
Perfect.

00:16:07.260 --> 00:16:08.280
And say, okay.

00:16:09.580 --> 00:16:11.700
Your cache value lookup should be 100.

00:16:11.800 --> 00:16:13.120
Do not store state.

00:16:13.300 --> 00:16:15.580
And the other two fields are not required.

00:16:15.580 --> 00:16:16.560
Perfect.

00:16:17.100 --> 00:16:17.960
Say apply.

00:16:18.080 --> 00:16:18.440
Awesome.

00:16:18.640 --> 00:16:23.740
So what we're doing here, and this is a great example, is we're bringing that zip

00:16:23.740 --> 00:16:26.340
file, that CSV file in.

00:16:27.940 --> 00:16:31.940
And what we're, we're not necessarily reading the CSV file.

00:16:32.260 --> 00:16:39.580
We're just assigning as an attribute that that CSV file gets the schema name inventory.

00:16:40.080 --> 00:16:44.460
And so when we bring that file in, you're going to be able to look at the attributes

00:16:44.460 --> 00:16:50.340
and see that you've now assigned a schema name to that file.

00:16:51.660 --> 00:16:58.780
Now that you have the update attribute, we need to get a convert record processor

00:16:58.780 --> 00:17:02.680
because we're going to work on converting this from CSV to JSON.

00:17:02.820 --> 00:17:04.820
You want to just type convert record.

00:17:05.340 --> 00:17:05.680
Perfect.

00:17:07.160 --> 00:17:08.860
You want to, there you go.

00:17:08.920 --> 00:17:10.200
Drag that to there.

00:17:10.820 --> 00:17:11.480
Success.

00:17:12.120 --> 00:17:14.920
So an update attribute and get file.

00:17:15.060 --> 00:17:19.940
And some of these are only going to have success as the next termination, right?

00:17:20.040 --> 00:17:24.720
Because it's just going to apply that attribute no matter what file comes in.

00:17:25.060 --> 00:17:27.200
So there really is no failure.

00:17:28.340 --> 00:17:28.960
Okay.

00:17:29.200 --> 00:17:35.200
So now that you've got, you know, your convert record, we are reading in a CSV file

00:17:35.200 --> 00:17:38.440
and we are going to put out a JSON file.

00:17:38.440 --> 00:17:47.720
So if you can, on your record reader, instead of no value, click that and click down.

00:17:47.900 --> 00:17:48.260
There you go.

00:17:48.400 --> 00:17:50.040
Reference, create new service.

00:17:50.440 --> 00:17:53.480
And you want to use a Avro reader.

00:17:53.760 --> 00:17:54.420
Yeah, perfect.

00:17:56.100 --> 00:17:56.440
Great.

00:17:56.760 --> 00:17:58.660
On the record reader, select your dropdown.

00:17:59.800 --> 00:18:02.860
Again, no, cancel on that reader, not the writer.

00:18:02.940 --> 00:18:03.680
There you go.

00:18:04.000 --> 00:18:04.480
Reader.

00:18:05.260 --> 00:18:05.960
Select your dropdown.

00:18:10.280 --> 00:18:13.280
No, why is that not coming up?

00:18:13.480 --> 00:18:14.360
I create new service.

00:18:15.220 --> 00:18:19.120
Oh, we want to do, do your dropdown instead of an Avro reader.

00:18:20.540 --> 00:18:22.320
You want to do a CSV reader.

00:18:23.980 --> 00:18:24.460
Perfect.

00:18:24.980 --> 00:18:25.860
And say create.

00:18:28.260 --> 00:18:28.600
Perfect.

00:18:28.600 --> 00:18:28.920
Perfect.

00:18:29.060 --> 00:18:29.280
Perfect.

00:18:29.340 --> 00:18:33.880
So we are reading in a CSV. For the record writer,

00:18:34.120 --> 00:18:36.460
We are outputting JSON, right?

00:18:36.460 --> 00:18:39.840
So go ahead and click new service.

00:18:40.080 --> 00:18:42.800
Do your dropdown and create new service.

00:18:42.980 --> 00:18:43.580
Perfect.

00:18:44.180 --> 00:18:45.560
And we are doing JSON.

00:18:47.060 --> 00:18:47.960
And say create.

00:18:49.920 --> 00:18:50.960
You got it.

00:18:51.560 --> 00:18:57.320
And on the include zero record flow files, just say false.

00:18:59.260 --> 00:18:59.840
Okay.

00:19:00.740 --> 00:19:01.580
And say okay.

00:19:02.840 --> 00:19:06.020
What does that record zero flow file mean?

00:19:06.880 --> 00:19:12.320
What does the zero record flow files mean?

00:19:13.360 --> 00:19:13.760
Okay.

00:19:14.860 --> 00:19:25.380
So if include zero record flow files is usually like there's like no data in the

00:19:25.380 --> 00:19:25.840
flow file.

00:19:25.940 --> 00:19:26.780
How like that?

00:19:27.100 --> 00:19:27.600
Question mark.

00:19:27.660 --> 00:19:29.360
Hover over that question mark.

00:19:29.560 --> 00:19:30.080
Right there.

00:19:31.000 --> 00:19:34.700
When converting an incoming flow file, if the conversion results in no data.

00:19:34.700 --> 00:19:40.240
So if there's zero data, you know, it knows how to handle that.

00:19:40.380 --> 00:19:44.580
But if you hover over the question mark again, there you go.

00:19:44.680 --> 00:19:50.180
So if it comes in and if the result is no data, you know, what do you want to

00:19:50.180 --> 00:19:50.600
do with that?

00:19:50.600 --> 00:19:53.720
Do you want to send it, you know, elsewhere?

00:19:54.160 --> 00:20:00.400
You know, you want to if it comes back as zero byte, you know, if it's a zero

00:20:00.400 --> 00:20:04.480
byte flow file, for instance, you don't want it to continue down the path.

00:20:05.300 --> 00:20:08.160
You want to probably send that to a different relationship.

00:20:08.400 --> 00:20:10.200
There's some reason it didn't get converted.

00:20:11.180 --> 00:20:12.640
But yeah, leave it as false.

00:20:13.880 --> 00:20:16.480
We're not getting too advanced today.

00:20:17.860 --> 00:20:18.400
Okay.

00:20:18.520 --> 00:20:19.160
Say apply.

00:20:19.300 --> 00:20:19.700
Awesome.

00:20:19.960 --> 00:20:25.360
Now go back into your convert record and on your CSV reader, there is a little

00:20:25.360 --> 00:20:29.500
arrow to the right and the right side of it.

00:20:29.560 --> 00:20:29.840
Yep.

00:20:29.840 --> 00:20:30.880
Click that.

00:20:31.460 --> 00:20:36.580
And so what this is going to do is take you to your controller services.

00:20:37.560 --> 00:20:42.400
And so once you're there, we want to look at our CSV reader.

00:20:45.520 --> 00:20:49.700
You can go ahead and click on the gear icon.

00:20:49.980 --> 00:20:53.520
That is how we control the properties of a controller service.

00:20:54.400 --> 00:20:56.420
And you want to go to properties.

00:20:56.420 --> 00:20:57.200
Okay.

00:20:57.780 --> 00:20:59.940
The schema access strategy.

00:21:00.360 --> 00:21:02.840
We want to use the schema name property.

00:21:03.920 --> 00:21:10.340
So click that, and instead of infer schema, use the schema name property.

00:21:11.520 --> 00:21:16.200
Remember, we set that schema name in the update attribute.

00:21:16.580 --> 00:21:16.960
Okay.

00:21:17.040 --> 00:21:17.300
Perfect.

00:21:17.660 --> 00:21:19.880
So say, okay.

00:21:21.600 --> 00:21:22.560
Schema registry.

00:21:23.100 --> 00:21:23.620
All right.

00:21:23.620 --> 00:21:26.420
On the schema registry, drop down.

00:21:27.480 --> 00:21:31.680
Hit create new service and we want to use an Avro schema registry.

00:21:34.100 --> 00:21:35.720
Say create and click create.

00:21:35.780 --> 00:21:36.200
Awesome.

00:21:37.800 --> 00:21:38.080
Okay.

00:21:38.480 --> 00:21:44.380
So the schema name property, we already set it, but you want to double check

00:21:44.380 --> 00:21:52.480
here to make sure that that schema name matches the update attribute that you applied.

00:21:52.480 --> 00:21:59.120
And so schema.name is what we put in the update attribute, I believe.

00:22:00.860 --> 00:22:01.780
Schema.name.

00:22:02.240 --> 00:22:05.240
That's the same property we used in the update attribute.

00:22:06.140 --> 00:22:06.160
Yeah.

00:22:06.360 --> 00:22:06.640
Okay.

00:22:06.740 --> 00:22:07.180
Awesome.

00:22:07.200 --> 00:22:07.420
Awesome.

00:22:08.140 --> 00:22:11.580
So just say okay there and scroll down.

00:22:11.720 --> 00:22:12.840
Let's see if there's any other settings.

00:22:13.220 --> 00:22:15.860
We're going to use the Apache Commons CSV format.

00:22:17.040 --> 00:22:20.400
There's no real custom format; the value separator is a comma,

00:22:21.060 --> 00:22:23.020
backslash n is the record separator.
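
NOTE
Rough recap of the CSV reader controller service walked through here; the treat first line as header setting comes back up later in the session when the parse error appears.
    CSVReader (controller service)
      Schema Access Strategy     : Use 'Schema Name' Property
      Schema Registry            : AvroSchemaRegistry
      Schema Name                : ${schema.name}
      CSV Format                 : Apache Commons CSV
      Value Separator            : ,
      Record Separator           : \n
      Treat First Line as Header : true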

00:22:23.400 --> 00:22:23.720
Yeah.

00:22:23.720 --> 00:22:25.080
So I think we're good there.

00:22:25.320 --> 00:22:26.000
Say apply.

00:22:26.120 --> 00:22:26.580
Okay.

00:22:26.840 --> 00:22:29.640
So now it's still in an invalid state.

00:22:30.220 --> 00:22:34.180
We need to, we specified an Avro schema registry.

00:22:34.700 --> 00:22:43.120
So if you can go into the gear icon again and you see where it says Avro schema registry

00:22:43.120 --> 00:22:44.260
and it has an arrow.

00:22:45.720 --> 00:22:46.220
Okay.

00:22:46.520 --> 00:22:46.980
Perfect.

00:22:47.560 --> 00:22:50.200
So you want to go to, that's the schema

00:22:50.200 --> 00:22:51.700
registry it's using.

00:22:52.160 --> 00:22:59.580
So if you click the gear icon, you want validate field names set to true.

00:23:01.660 --> 00:23:06.620
And then here is where we are going to apply our schema.

00:23:07.740 --> 00:23:10.320
When it reads the CSV file.

00:23:10.780 --> 00:23:18.200
So instead of you writing a brand new schema, let me paste it into that.

00:23:18.200 --> 00:23:21.160
Are you able to go to the ether pad?

00:23:26.580 --> 00:23:30.320
The ether pad actually here.

00:23:31.600 --> 00:23:34.980
I'll put it in the teams chat as well to help everyone out.

00:23:35.060 --> 00:23:36.120
So you don't have to write your own.

00:23:39.260 --> 00:23:40.160
Oh, perfect.

00:23:40.220 --> 00:23:40.560
Perfect.

00:23:40.860 --> 00:23:42.040
So I'll put it in chat.

00:23:42.360 --> 00:23:43.600
So you want to copy that.

00:23:43.720 --> 00:23:46.220
So what we're doing is creating a schema.

00:23:46.220 --> 00:23:50.460
And so that way it will read that CSV in.

00:23:51.840 --> 00:23:57.180
The controller service is looking at the schema and that's what it's going to apply

00:23:57.180 --> 00:24:00.720
to that CSV file to write it as JSON.

00:24:01.220 --> 00:24:05.780
So if you can, you know, in your controller service,

00:24:06.120 --> 00:24:11.360
you're going to want to hit plus on the, to add a property.

00:24:14.140 --> 00:24:14.740
No worries.

00:24:14.740 --> 00:24:17.320
This is a very difficult hands on.

00:24:17.940 --> 00:24:24.220
So, you know, I completely understand, having to go back and forth and all the copying.

00:24:29.160 --> 00:24:29.740
Correct.

00:24:33.500 --> 00:24:36.420
What the schema would look like writing it to JSON.

00:24:37.940 --> 00:24:41.000
Pedro, I didn't write down your background.

00:24:41.540 --> 00:24:43.940
Can you real quickly, you know, what is it you do?

00:24:46.460 --> 00:24:48.400
Yeah, research management.

00:24:48.420 --> 00:24:49.080
Oh, okay.

00:24:49.080 --> 00:24:53.680
We're kind of dealing a little bit with ETL.

00:24:54.120 --> 00:24:57.300
Oh, then this is right up your alley, right?

00:24:59.000 --> 00:25:04.520
No, no, no, I have everyone's background because, you know,

00:25:04.560 --> 00:25:07.160
we have some sysadmins, we have some developers.

00:25:08.080 --> 00:25:09.860
But yeah, so you want to create,

00:25:10.760 --> 00:25:13.020
did you get that little JSON block I sent you?

00:25:13.960 --> 00:25:18.900
I am still in teams and I am on the government computer.

00:25:19.900 --> 00:25:21.120
And I can paste it in.

00:25:21.540 --> 00:25:22.240
I can paste it in.

00:25:22.260 --> 00:25:23.140
Oh, if you could, that'd be awesome.

00:25:23.420 --> 00:25:24.060
Yeah, yeah, yeah.

00:25:24.080 --> 00:25:26.240
So I don't mind pasting it all.

00:25:26.980 --> 00:25:31.540
So if you can, though, I want you to go back to your controller services

00:25:32.940 --> 00:25:38.940
and hit the plus and the property name is inventory and say, okay.

00:25:38.940 --> 00:25:42.320
And I'm going to paste.

00:25:43.260 --> 00:25:48.640
I can't let me see if it won't let me just put like the number one

00:25:48.640 --> 00:25:52.440
and I will come back in and erase it and just say, okay,

00:25:52.640 --> 00:25:57.660
okay, I'm going to come into your instance where I can modify it.

00:25:57.660 --> 00:26:03.940
Does the property name have to match the schema name from the other step?

00:26:03.980 --> 00:26:04.700
It does.

00:26:04.960 --> 00:26:05.480
It does.

00:26:05.700 --> 00:26:06.240
It does.

00:26:06.420 --> 00:26:07.340
Good question.

00:26:07.340 --> 00:26:08.460
Okay, that didn't work.

00:26:09.980 --> 00:26:10.160
Perfect.

00:26:10.300 --> 00:26:11.180
There we go.

00:26:11.780 --> 00:26:15.060
I'm going to exit back out of yours, Pedro,

00:26:15.060 --> 00:26:18.460
and I'm going back to view only because I don't want to mess

00:26:19.840 --> 00:26:21.880
with what you've got going here.

00:26:22.220 --> 00:26:27.020
Okay, so the property inventory needs to match.

00:26:27.220 --> 00:26:27.820
We've got it.

00:26:28.920 --> 00:26:31.680
And if you look at the value, so Pedro,

00:26:31.680 --> 00:26:35.120
if you can click on that inventory, look at the value,

00:26:35.120 --> 00:26:41.520
and you can see we have put our schema into this controller service.

00:26:41.900 --> 00:26:46.840
So when it reads that CSV, it's going to pull all that in

00:26:46.840 --> 00:26:50.580
and create a JSON document, you know,

00:26:50.780 --> 00:26:54.580
based on utilizing this schema based upon the CSV data.
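
NOTE
The actual schema was pasted into chat and is not reproduced in this recording. For orientation only, an Avro schema registered under the inventory property generally has the shape below; the field names are illustrative placeholders, except quantity, which shows up in the parse error later on.
    {
      "type": "record",
      "name": "inventory",
      "doc": "Illustrative placeholder only; the real schema was shared in chat, not in this transcript.",
      "fields": [
        { "name": "item", "type": "string" },
        { "name": "quantity", "type": "string" },
        { "name": "price", "type": "string" }
      ]
    }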

00:26:56.060 --> 00:26:59.440
So you can go ahead and say, okay, and say apply.

00:27:01.520 --> 00:27:02.600
I've got a question.

00:27:04.660 --> 00:27:10.160
Do we have to manually create that schema when we need one for the file,

00:27:10.280 --> 00:27:12.980
or is there like a tool that will help us with that?

00:27:13.420 --> 00:27:15.080
That is a great question.

00:27:15.800 --> 00:27:20.240
So you will have to manually create your schema.

00:27:20.420 --> 00:27:24.120
Now, the beauty is, NiFi accepts Avro schemas.

00:27:25.020 --> 00:27:27.460
And so if you already have an Avro schema,

00:27:27.460 --> 00:27:30.160
you're already, you know, ahead of the curve.

00:27:30.500 --> 00:27:33.600
But yes, you will have to create a schema

00:27:33.600 --> 00:27:36.600
when you're wanting to do things like what we're doing,

00:27:36.780 --> 00:27:40.180
what we're reading in one format, converting it to another format.

00:27:41.040 --> 00:27:47.420
You know, just because, you know, there's capabilities out there

00:27:47.420 --> 00:27:49.320
for NiFi to auto-understand.

00:27:49.740 --> 00:27:52.400
But, you know, what I'm saying here is,

00:27:52.400 --> 00:27:54.500
NiFi would not be able to understand

00:27:54.500 --> 00:27:57.480
how you want that data to be written back out.

00:27:58.120 --> 00:27:59.740
We have the CSV reader,

00:28:00.040 --> 00:28:03.980
and we're using the Apache Commons library to parse that CSV.

00:28:04.680 --> 00:28:08.340
But without that schema, we wouldn't know where to map,

00:28:08.960 --> 00:28:13.740
you know, the columns back to the JSON fields that we need.

00:28:14.300 --> 00:28:17.920
So yeah, unfortunately, you're going to have to do that.

00:28:19.320 --> 00:28:19.800
Okay.

00:28:20.360 --> 00:28:25.820
And then if you can, let's go to that Avro reader on the first row.

00:28:28.180 --> 00:28:29.340
And go to the gear.

00:28:29.920 --> 00:28:30.480
And perfect.

00:28:30.980 --> 00:28:36.240
It should be automatically set up where it uses embedded Avro schema

00:28:36.760 --> 00:28:38.740
with a cache size of 1000.
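
NOTE
As a quick reference, the Avro reader defaults described here are just these two properties, nothing else changed.
    AvroReader (controller service)
      Schema Access Strategy : Use Embedded Avro Schema
      Cache Size             : 1000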

00:28:40.560 --> 00:28:42.060
There's capabilities here.

00:28:42.100 --> 00:28:45.420
We can use external Avro schemas.

00:28:45.640 --> 00:28:49.620
We can, you know, there's all kinds of capabilities here within NiFi.

00:28:49.800 --> 00:28:52.020
Depending on your setup.

00:28:52.660 --> 00:28:56.640
But here, we just want to use the internal Avro schema we just created.

00:28:56.880 --> 00:28:58.000
So go ahead and say, okay.

00:29:00.840 --> 00:29:01.320
Okay.

00:29:01.900 --> 00:29:07.700
So now that we have that, let's see, we see, okay, we're back.

00:29:08.400 --> 00:29:12.340
I do see an invalid on your CSV reader.

00:29:12.980 --> 00:29:17.260
Hover over the invalid, the little yield sign. It's disabled, right?

00:29:17.260 --> 00:29:24.600
So the only issue that I'm seeing so far is the Avro reader, the Avro schema registry is disabled.

00:29:25.420 --> 00:29:28.900
So if you can, let's go start with the Avro reader.

00:29:29.280 --> 00:29:30.860
And let's enable that service.

00:29:31.320 --> 00:29:33.060
Use the, nope, cancel.

00:29:34.940 --> 00:29:36.400
And use a little lightning bolt.

00:29:36.500 --> 00:29:36.960
There you go.

00:29:37.160 --> 00:29:37.260
Enable.

00:29:39.460 --> 00:29:45.820
And you can say service only or service and use the drop-down box.

00:29:45.820 --> 00:29:49.740
There is service only, and service and referencing components, right there.

00:29:49.800 --> 00:29:50.180
Yep.

00:29:50.860 --> 00:29:52.900
So this option gives you that capability.

00:29:52.920 --> 00:29:57.360
If you just want to enable the service, you know, that's one thing.

00:29:57.640 --> 00:30:03.940
But you can actually enable the service and any processor, any other services,

00:30:04.480 --> 00:30:07.220
and referencing components will be enabled as well.

00:30:07.540 --> 00:30:10.020
So go ahead and say enable at the bottom right.

00:30:10.140 --> 00:30:12.800
So it's enabling this controller, enabling references.

00:30:13.400 --> 00:30:13.720
Boom.

00:30:14.000 --> 00:30:14.600
Say close.

00:30:16.240 --> 00:30:17.180
All right.

00:30:17.480 --> 00:30:21.820
So we want to go to our Avro schema registry and we want to enable that one.

00:30:22.040 --> 00:30:22.720
Same thing.

00:30:22.980 --> 00:30:25.800
So you can see how it's referencing the CSV reader.

00:30:26.200 --> 00:30:27.040
So you say enable.

00:30:27.280 --> 00:30:27.880
Say okay.

00:30:29.820 --> 00:30:33.460
And what that's talking about is the actual processor it can't enable,

00:30:33.840 --> 00:30:38.920
but it did enable the referencing controller service and your Avro schema registry.

00:30:39.120 --> 00:30:40.100
Go ahead and say close.

00:30:40.280 --> 00:30:43.300
We'll take a look at why the processor is not enabling in a second.

00:30:44.020 --> 00:30:48.560
It probably is because you need to enable your JSON record set writer.

00:30:51.140 --> 00:30:54.800
So the, you know, same thing is you want to, there you go.

00:30:54.920 --> 00:30:55.460
Say okay.

00:30:55.820 --> 00:30:56.360
Say close.

00:30:57.720 --> 00:30:58.100
Awesome.

00:30:58.640 --> 00:31:02.800
So now we have gone through, we've created our controller services.

00:31:03.580 --> 00:31:08.560
We have created that Avro reader, the Avro schema registry,

00:31:08.780 --> 00:31:11.180
CSV reader, and the JSON record set writer.

00:31:11.180 --> 00:31:12.640
So they're all enabled.

00:31:12.980 --> 00:31:14.100
The state is enabled.

00:31:14.220 --> 00:31:21.320
So go ahead and hit X and let's go into our processor again

00:31:22.700 --> 00:31:25.980
and look at the yield sign on the convert record.

00:31:26.020 --> 00:31:29.940
Oh, we need to finish our flow, right?

00:31:30.000 --> 00:31:33.880
We can't turn this on because the convert record has nowhere to go.

00:31:35.540 --> 00:31:38.760
So you want to then, you know, after we have set,

00:31:39.640 --> 00:31:41.740
you know, we have now picked this up.

00:31:41.840 --> 00:31:46.880
We have converted it to JSON.

00:31:47.340 --> 00:31:53.300
We now want to update the file name attribute so we can write that back to disk.

00:31:53.740 --> 00:31:58.600
Because right now it's creating a whole new flow file, right?

00:31:58.600 --> 00:32:00.160
It's creating a whole new document.

00:32:00.540 --> 00:32:05.300
So you go ahead and click and bring down a update attribute again.

00:32:11.180 --> 00:32:11.700
Perfect.

00:32:14.700 --> 00:32:19.120
And so on the update attribute, let's go ahead and configure it.

00:32:19.180 --> 00:32:21.660
We're going to give it a file name.

00:32:22.320 --> 00:32:26.360
So go ahead and hit plus so we can give it a new property.

00:32:28.280 --> 00:32:30.520
And the property name is filename, okay?

00:32:32.000 --> 00:32:32.440
Say okay.

00:32:32.600 --> 00:32:32.900
Okay.

00:32:33.140 --> 00:32:38.440
And, you know, we're using NiFi, you know, the expression language.

00:32:38.760 --> 00:32:43.260
So we want to do dollar open curly bracket.

00:32:43.520 --> 00:32:43.920
There you go.

00:32:44.100 --> 00:32:46.680
And it automatically puts the close in there for you.

00:32:46.880 --> 00:32:48.320
And then you just want to put filename.

00:32:48.580 --> 00:32:48.780
Awesome.

00:32:49.080 --> 00:32:54.460
And then after the closing curly bracket you want to put dot JSON.

00:32:56.180 --> 00:32:56.820
Dot JSON.

00:32:57.040 --> 00:32:57.500
There you go.

00:32:57.600 --> 00:32:58.360
Say apply.
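
NOTE
So the update attribute that renames the converted flow file simply overwrites the standard filename attribute with the expression language value shown here:
    UpdateAttribute (rename for output)
      filename = ${filename}.json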

00:33:00.680 --> 00:33:01.200
Okay.

00:33:02.000 --> 00:33:05.900
And so we want to say apply there.

00:33:06.740 --> 00:33:14.400
So for here, we do have, you know, the capability for failure, unlike the update attributes

00:33:14.400 --> 00:33:15.360
and the get file.

00:33:15.760 --> 00:33:18.980
So on success, we want to update that attribute.

00:33:19.040 --> 00:33:22.120
Go ahead and drag down your update to the update attribute.

00:33:23.440 --> 00:33:25.820
And you want that to be success.

00:33:27.260 --> 00:33:27.720
Awesome.

00:33:28.460 --> 00:33:33.420
And so the convert record also needs to know what happens on failure.

00:33:33.420 --> 00:33:42.320
And what I like to do, and, you know, this is up to you all how you like to do these

00:33:42.320 --> 00:33:42.660
things.

00:33:43.000 --> 00:33:44.960
But I like to log everything.

00:33:45.820 --> 00:33:49.320
So on a failure, let's log that error message.

00:33:50.260 --> 00:33:52.480
So you want to bring down a new processor.

00:33:52.560 --> 00:33:55.080
Oh, I like how you're going to put the failure back on itself.

00:33:57.260 --> 00:33:59.760
So you want to do a log message.

00:34:00.420 --> 00:34:00.520
Perfect.

00:34:01.660 --> 00:34:02.000
Okay.

00:34:02.000 --> 00:34:07.900
And what I like to do is, yeah, just take and you can see here, you can see I put my

00:34:07.900 --> 00:34:10.620
log messages out to the right of the flow.

00:34:11.080 --> 00:34:16.760
And if you're working straight down, then, you know, I know that that is my success

00:34:16.760 --> 00:34:17.260
path.

00:34:17.340 --> 00:34:23.460
And then I'll put my log message to the right of my flow, whatever makes sense.

00:34:23.600 --> 00:34:29.120
Because you can reuse this log message processor over and over.

00:34:29.140 --> 00:34:31.560
And we're going to do that here as well.

00:34:31.560 --> 00:34:35.020
But for failure, you want to send a log message.

00:34:36.080 --> 00:34:36.940
So I'll say add.

00:34:37.300 --> 00:34:37.560
Awesome.

00:34:40.560 --> 00:34:43.800
On log message, we now need to configure it, right?

00:34:44.060 --> 00:34:49.520
So go into your log message and go to relationships.

00:34:49.700 --> 00:34:55.340
And when you want to auto terminate on success or retry, or, you know, on

00:34:55.340 --> 00:34:56.680
success, auto terminate.

00:34:58.520 --> 00:34:59.140
There you go.

00:34:59.180 --> 00:34:59.720
Hit apply.

00:34:59.720 --> 00:35:08.820
So what's happening is that message now will be sent into the logs.

00:35:09.460 --> 00:35:13.820
And so if you were tailing, tail is a Linux command.

00:35:14.460 --> 00:35:19.960
So if you were tailing a log and an error came through, you would be able to see

00:35:20.580 --> 00:35:22.420
that error message come through.
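
NOTE
For example, on the NiFi host you could watch those messages arrive with something like the command below; the install path is an assumption and varies by environment, but nifi-app.log is the standard application log.
    tail -f /opt/nifi/logs/nifi-app.log    # adjust the path to your NiFi install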

00:35:22.600 --> 00:35:27.720
It's also going to, you can actually pull a history as well to see what that error is.

00:35:28.360 --> 00:35:34.020
So our convert record has now gone from yield to stopped.

00:35:36.240 --> 00:35:38.220
You can leave the log message out there.

00:35:38.220 --> 00:35:38.860
That's fine.

00:35:39.120 --> 00:35:41.280
Now we need to continue our flow.

00:35:41.340 --> 00:35:42.760
So we have an update attribute.

00:35:43.360 --> 00:35:51.320
For that update attribute, we went through and added a JSON file name.

00:35:53.080 --> 00:35:56.320
And so now we want to do a put file.

00:35:56.320 --> 00:35:59.840
Now we have a new JSON document.

00:36:00.680 --> 00:36:02.280
We have it named.

00:36:02.540 --> 00:36:04.760
We have our schema set.

00:36:05.040 --> 00:36:06.840
We have all these things.

00:36:07.740 --> 00:36:10.400
So, yep, update attribute for success.

00:36:11.580 --> 00:36:12.880
And we want to put the file.

00:36:13.680 --> 00:36:22.480
So on this, if you want, you can write that file right back to the CSV directory.

00:36:22.480 --> 00:36:31.060
And I say that because, you know, for mine, I'm only picking up the CSVs.

00:36:31.080 --> 00:36:40.380
So any JSON that is written back to that directory, you know, will not be read back

00:36:40.380 --> 00:36:42.780
in when it comes time to write it again.

00:36:46.480 --> 00:36:50.220
So go ahead and figure out where you want to put the output of that.

00:36:51.480 --> 00:36:57.320
Do remember to, and this is for everyone, you know, do remember to keep the source

00:36:57.320 --> 00:37:03.320
file on your get CSV file, if possible, just because you, you know, your flow may not

00:37:03.320 --> 00:37:03.940
be correct.

00:37:03.980 --> 00:37:04.840
It runs through.

00:37:04.920 --> 00:37:06.220
You want to run it through again.
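
NOTE
As a rough sketch, the end of the flow is just a put file pointed at a directory of your choosing, with keep source file turned on upstream so the flow can be re-run. Writing the JSON back into the same directory is safe here only because the get file filter is limited to inventory.csv.
    PutFile
      Directory        : <your chosen output directory>
    GetFile (revisited)
      Keep Source File : true    (NiFi defaults this to false)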

00:37:08.800 --> 00:37:10.140
Does that look great?

00:37:10.300 --> 00:37:10.720
It does.

00:37:10.720 --> 00:37:11.120
It does.

00:37:11.200 --> 00:37:12.860
Go ahead and say apply.

00:37:12.980 --> 00:37:13.560
All right.

00:37:13.560 --> 00:37:17.360
And then let's, let's see here, put file.

00:37:19.840 --> 00:37:26.160
Let's do a, you know, from your put file, let's drag another arrow to your log

00:37:26.160 --> 00:37:26.880
message.

00:37:29.540 --> 00:37:30.260
Perfect.

00:37:30.620 --> 00:37:33.220
And we want to do a failure on that one.

00:37:34.480 --> 00:37:41.680
So you have, I think, auto terminate enabled for the put file on a failure.

00:37:41.680 --> 00:37:47.920
But in case, you know, the hard disk fills up, or if this was a network share and

00:37:47.920 --> 00:37:49.220
it's not available.

00:37:49.580 --> 00:37:53.220
You know, there's a ton of different reasons why we might not be able to

00:37:53.220 --> 00:37:54.060
write that file.

00:37:54.280 --> 00:37:56.360
You want to log that message.

00:37:56.900 --> 00:37:58.000
And so go ahead and say add.

00:38:02.380 --> 00:38:02.880
Perfect.

00:38:03.240 --> 00:38:10.120
So what we've done is we've created this flow and any errors we

00:38:10.120 --> 00:38:12.720
are pushing to the log message.

00:38:13.580 --> 00:38:19.100
And so if we were to add other aspects of this and there was an error, you

00:38:19.100 --> 00:38:24.680
know, it's the same, we can reuse that same processor over and over and

00:38:24.680 --> 00:38:25.380
over again.

00:38:25.920 --> 00:38:31.660
I've seen, you know, folks set up a whole processor group on how to

00:38:31.660 --> 00:38:35.880
handle errors with advanced filtering and things like that.

00:38:35.880 --> 00:38:42.560
And then, you know, you may have 10, 15 other data flows utilizing that,

00:38:42.560 --> 00:38:44.880
you know, that error flow.

00:38:45.460 --> 00:38:49.600
And so, you know, it's just something to keep in mind as you're developing

00:38:49.600 --> 00:38:51.160
and designing your data flows.
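
NOTE
At this point the overall flow built in this exercise looks roughly like the outline below, with the one log message processor reused for every failure relationship.
    GetFile (success)
      -> UpdateAttribute: schema.name = inventory (success)
        -> ConvertRecord: CSVReader to JSONRecordSetWriter (success)
          -> UpdateAttribute: filename = ${filename}.json (success)
            -> PutFile
    ConvertRecord (failure) -> LogMessage
    PutFile (failure)       -> LogMessage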

00:38:53.420 --> 00:38:53.800
Okay.

00:38:54.180 --> 00:38:55.420
Let's see if we can run this.

00:38:56.460 --> 00:38:58.740
I don't see any errors right yet.

00:39:00.060 --> 00:39:03.400
So let's see if we can, Pedro, if you can just run through this

00:39:03.400 --> 00:39:03.900
one time.

00:39:05.320 --> 00:39:07.060
So that started at the top or?

00:39:07.200 --> 00:39:07.580
Yes.

00:39:08.020 --> 00:39:11.200
So let's make sure we configure it first and make sure that we're

00:39:11.200 --> 00:39:16.800
keeping our source file because NiFi likes to default to false.

00:39:17.080 --> 00:39:17.420
Okay.

00:39:17.560 --> 00:39:17.940
You got it.

00:39:18.040 --> 00:39:18.120
Sure.

00:39:18.280 --> 00:39:18.580
Perfect.

00:39:19.780 --> 00:39:20.060
All right.

00:39:20.140 --> 00:39:20.760
Say cancel.

00:39:22.340 --> 00:39:23.640
Just to double check.

00:39:24.220 --> 00:39:25.020
Say run once.

00:39:25.140 --> 00:39:25.540
Refresh.

00:39:25.760 --> 00:39:25.980
Awesome.

00:39:26.240 --> 00:39:26.520
Awesome.

00:39:26.880 --> 00:39:29.880
Let's take a look at your queue to make sure that that is the

00:39:29.880 --> 00:39:30.820
file that you're expecting.

00:39:30.820 --> 00:39:36.800
So here is, you know, here we are testing out our flow and making sure,

00:39:37.000 --> 00:39:37.940
you know, it looks good.

00:39:37.940 --> 00:39:39.220
So go ahead and say list queue.

00:39:39.540 --> 00:39:41.640
And you remember how to view it?

00:39:41.860 --> 00:39:42.660
No, no.

00:39:44.920 --> 00:39:45.800
Go to the record.

00:39:47.420 --> 00:39:47.980
View content.

00:39:48.540 --> 00:39:52.600
So yesterday when I did a view content, I could not do it because

00:39:52.600 --> 00:39:57.100
it was a zip file and there was no viewer built into NiFi for zip files.

00:39:57.600 --> 00:39:58.880
Luckily, it's a CSV.

00:39:58.880 --> 00:40:03.520
So NiFi understands CSV and it has a viewer built in.

00:40:03.980 --> 00:40:09.340
But the file name is inventory.csv and there's 11 lines with, well,

00:40:09.340 --> 00:40:15.640
10 lines with a header and a bad data row just to throw us off too.

00:40:15.980 --> 00:40:18.780
So go ahead and exit out of that view.

00:40:20.300 --> 00:40:21.340
So you just close that tab.

00:40:22.060 --> 00:40:22.560
Awesome.

00:40:22.960 --> 00:40:25.580
And if you want, go ahead and look at the attribute.

00:40:25.580 --> 00:40:28.380
So scroll all the way, go all the way to your left to the little I.

00:40:30.180 --> 00:40:30.720
There you go.

00:40:31.500 --> 00:40:32.420
And attributes.

00:40:33.580 --> 00:40:38.600
And so if you remember from yesterday, this is what we were picking up data

00:40:38.600 --> 00:40:40.180
and we were looking at the attributes.

00:40:40.480 --> 00:40:45.180
If you scroll down, do you see anything in there about schemas?

00:40:46.500 --> 00:40:47.060
No.

00:40:47.380 --> 00:40:47.800
Perfect.

00:40:48.200 --> 00:40:51.880
Because we have not sent it to that update attribute yet.

00:40:52.460 --> 00:40:53.520
So go ahead and say, okay.

00:40:53.520 --> 00:40:55.920
And exit out of that.

00:40:57.200 --> 00:40:57.640
Perfect.

00:40:58.060 --> 00:40:58.480
Sorry, guys.

00:40:58.520 --> 00:40:59.320
My network dropped.

00:40:59.620 --> 00:41:00.920
I was out for like 10 minutes.

00:41:01.420 --> 00:41:02.080
Oh, wow.

00:41:05.320 --> 00:41:10.980
We're, you know, I'm using Pedro as an example walking through this flow.

00:41:11.800 --> 00:41:13.820
But, you know, we're still building the flow.

00:41:13.980 --> 00:41:16.640
So if you have any questions, just let me know, Brett.

00:41:19.020 --> 00:41:19.500
Okay.

00:41:19.620 --> 00:41:20.300
Thank you.

00:41:21.060 --> 00:41:21.620
Okay.

00:41:22.380 --> 00:41:25.500
So we have now got the file.

00:41:25.740 --> 00:41:27.240
Let's do an update attribute.

00:41:27.300 --> 00:41:28.600
So we'll just run that once.

00:41:28.840 --> 00:41:29.620
Just click run once.

00:41:29.620 --> 00:41:30.740
There you go.

00:41:31.620 --> 00:41:33.660
And it should show up in success.

00:41:34.100 --> 00:41:34.560
Awesome.

00:41:34.840 --> 00:41:38.360
So let's now take a look at that queue and let's look at the attributes

00:41:38.360 --> 00:41:40.180
and see what has changed.

00:41:40.340 --> 00:41:40.860
Let's go down.

00:41:43.340 --> 00:41:44.200
Ah, perfect.

00:41:44.580 --> 00:41:49.740
So you remember in the update attribute, we added a new property

00:41:49.740 --> 00:41:54.460
that was schema.name and with the value of inventory, right?

00:41:54.540 --> 00:41:58.660
So now, you know, now it should be coming together

00:41:58.660 --> 00:42:00.960
where we created that controller service.

00:42:01.140 --> 00:42:06.000
We told the controller service we're going to use the schema.name property

00:42:06.620 --> 00:42:11.060
as what to use, because you may have hundreds of different schemas.

00:42:11.600 --> 00:42:14.640
And so we're going to use the schema.name property.

00:42:15.080 --> 00:42:17.420
And the name is inventory.

00:42:17.420 --> 00:42:21.220
And that was the inventory name we gave it in our controller service.

00:42:21.860 --> 00:42:22.480
So, all right.

00:42:22.500 --> 00:42:23.700
This looks good so far.

00:42:23.800 --> 00:42:24.380
So we'll say okay.

00:42:26.280 --> 00:42:27.260
And exit out of that.

00:42:27.280 --> 00:42:27.780
Perfect.

00:42:28.420 --> 00:42:32.280
Now let's do, here's the heavy lifting is the convert record.

00:42:32.800 --> 00:42:36.060
So let's run once to see if it actually works.

00:42:36.200 --> 00:42:37.360
Oh, we have a failure.

00:42:37.500 --> 00:42:38.340
What is our failure?

00:42:39.620 --> 00:42:45.160
Cannot parse incoming data error while getting next record for string quantity.

00:42:46.480 --> 00:42:49.160
Oh, is that the bad row that we have in there?

00:42:52.620 --> 00:42:59.060
Well, it should parse each row and create a JSON document out of each row

00:42:59.060 --> 00:43:03.620
and then ignore the thing.

00:43:03.640 --> 00:43:03.820
Hang on.

00:43:03.820 --> 00:43:05.240
Let me take a look at this real quick.

00:43:06.400 --> 00:43:07.960
Because mine's up and running.

00:43:35.440 --> 00:43:38.700
Let's go into your convert record.

00:43:39.240 --> 00:43:41.100
And let's look at the properties.

00:43:43.000 --> 00:43:45.040
And you have CSV reader.

00:43:45.160 --> 00:43:48.260
And you have JSON record set writer.

00:43:49.480 --> 00:43:50.460
You have false.

00:43:51.220 --> 00:43:52.760
Click on your CSV reader.

00:43:54.780 --> 00:43:55.460
No, hit cancel.

00:43:55.480 --> 00:43:58.000
Click on the arrow to take you to the service.

00:43:58.740 --> 00:43:59.220
Okay.

00:43:59.340 --> 00:44:00.300
That looks good.

00:44:01.680 --> 00:44:05.800
Let's view the configuration to make sure we have everything configured properly.

00:44:08.200 --> 00:44:08.840
There you go.

00:44:09.480 --> 00:44:12.000
We should have used the schema name property.

00:44:13.320 --> 00:44:17.080
The schema registry is the Avro schema registry.

00:44:17.280 --> 00:44:18.720
We do have schema name.

00:44:19.320 --> 00:44:19.860
Okay.

00:44:19.860 --> 00:44:20.680
That looks good.

00:44:21.080 --> 00:44:23.960
Let's look at our schema registry.

00:44:24.720 --> 00:44:26.480
So click on the arrow for that.

00:44:28.700 --> 00:44:29.200
Scroll back up.

00:44:30.760 --> 00:44:31.440
Scroll up.

00:44:31.640 --> 00:44:32.600
A little arrow there.

00:44:32.720 --> 00:44:33.240
Perfect.

00:44:34.960 --> 00:44:36.020
And then, okay.

00:44:36.200 --> 00:44:39.460
So let's look at the schema registry.

00:44:40.420 --> 00:44:42.060
We have inventory.

00:44:43.380 --> 00:44:45.860
And I put the schema in there.

00:44:46.020 --> 00:44:47.080
So click on that.

00:44:47.320 --> 00:44:50.060
Let me make sure that looks good.

00:44:50.880 --> 00:44:51.560
Okay.

00:44:53.240 --> 00:44:53.520
Okay.

00:44:53.520 --> 00:44:54.180
That looks good.

00:45:00.040 --> 00:45:00.740
Is that okay?

00:45:02.120 --> 00:45:02.840
Is that okay again?

00:45:03.000 --> 00:45:04.520
The Avro reader.

00:45:05.480 --> 00:45:07.700
Let's click on the gear icon for that one.

00:45:09.180 --> 00:45:10.220
Use embedded.

00:45:10.680 --> 00:45:12.080
You have that correct.

00:45:13.760 --> 00:45:14.640
And you have a thousand.

00:45:15.360 --> 00:45:15.660
Okay.

00:45:15.780 --> 00:45:16.340
Now say okay.

00:45:18.420 --> 00:45:20.800
And then on the JSON record set writer.

00:45:21.220 --> 00:45:24.480
Let's go, you know, what that configuration is.

00:45:26.460 --> 00:45:26.880
One second.

00:45:27.000 --> 00:45:29.080
I'm pulling mine up to make sure it matches.

00:45:34.740 --> 00:45:35.260
Oh.

00:45:35.480 --> 00:45:37.020
We did not configure this one.

00:45:37.800 --> 00:45:40.500
On the schema write strategy.

00:45:42.260 --> 00:45:46.880
We want to change that to set schema.name attribute.

00:45:47.020 --> 00:45:48.040
It's not letting you, right?

00:45:49.940 --> 00:45:51.600
So close that.

00:45:53.140 --> 00:45:56.020
And you see up top it says disable and configure like that.

00:45:56.720 --> 00:45:59.060
So what it's doing is disabling the services

00:45:59.060 --> 00:46:01.580
and the processors associated with it.

00:46:01.940 --> 00:46:02.160
All right.

00:46:02.160 --> 00:46:07.180
So the schema write strategy is we want to set schema.name attribute.

00:46:07.700 --> 00:46:08.820
I thought we did this.

00:46:09.900 --> 00:46:10.500
Say okay.

00:46:10.600 --> 00:46:11.000
Awesome.

00:46:11.360 --> 00:46:13.420
And then the schema access strategy.

00:46:13.720 --> 00:46:17.820
We want to use schema.name property again.

00:46:17.920 --> 00:46:18.400
Awesome.

00:46:18.700 --> 00:46:18.980
Say okay.

00:46:19.220 --> 00:46:23.740
And then schema.name is already applied in the schema registry.

00:46:24.060 --> 00:46:26.200
So we want to click schema registry.

00:46:26.600 --> 00:46:28.240
And I bet you know this one.

00:46:29.060 --> 00:46:30.680
It's the Avro schema registry.

00:46:30.900 --> 00:46:31.480
We're using perfect.

00:46:31.600 --> 00:46:31.940
Say okay.

00:46:35.600 --> 00:46:38.580
And then pretty print JSON.

00:46:38.900 --> 00:46:40.380
Let's just make it pretty.

00:46:40.840 --> 00:46:41.180
So set that to true.

00:46:41.940 --> 00:46:46.180
And then it should be never suppress, array, and none.

00:46:46.320 --> 00:46:46.700
Okay.

00:46:46.820 --> 00:46:47.320
Say apply.
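
NOTE
For reference, the JSON record set writer settings read out in this pass end up roughly as below.
    JSONRecordSetWriter (controller service)
      Schema Write Strategy  : Set 'schema.name' Attribute
      Schema Access Strategy : Use 'Schema Name' Property
      Schema Registry        : AvroSchemaRegistry
      Pretty Print JSON      : true
      Suppress Null Values   : Never Suppress
      Output Grouping        : Array of Objects
      Compression Format     : none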

00:46:48.440 --> 00:46:50.040
And now we want to enable that.

00:46:53.700 --> 00:46:55.260
We want to enable everything.

00:46:55.500 --> 00:46:55.620
Awesome.

00:46:55.800 --> 00:46:58.640
So that is actually a very quick test.

00:46:58.640 --> 00:47:02.360
If you have your processors completely configured,

00:47:02.840 --> 00:47:07.320
it's a really good test to see if everything is configured

00:47:07.320 --> 00:47:10.180
appropriately because it will enable that processor.

00:47:11.080 --> 00:47:13.460
If it rejects enabling the processor,

00:47:13.920 --> 00:47:17.680
it's usually because you're missing some of the termination

00:47:17.680 --> 00:47:19.600
or you got a misconfiguration.

00:47:19.780 --> 00:47:20.240
So say close.

00:47:22.200 --> 00:47:23.160
And then exit out of that.

00:47:23.500 --> 00:47:23.860
All right.

00:47:23.860 --> 00:47:30.080
Let's clear our, actually, let's stop the convert record

00:47:30.080 --> 00:47:33.380
because it started when you enabled that service.

00:47:34.800 --> 00:47:37.080
And let's see if we can run this one more time.

00:47:37.280 --> 00:47:39.420
So run once at the very beginning.

00:47:41.320 --> 00:47:42.120
Yes, sir.

00:47:43.680 --> 00:47:46.480
And I know I'm working with Pedro here,

00:47:46.740 --> 00:47:49.720
but I want to pause while he does the run once.

00:47:49.980 --> 00:47:51.820
Does anyone have any questions?

00:47:51.820 --> 00:47:53.520
Because this is a very difficult flow.

00:47:54.080 --> 00:47:57.120
And we've been kind of following what you guys are doing up there.

00:47:57.200 --> 00:47:58.760
Oh, and that's why we're doing it.

00:47:58.800 --> 00:47:59.200
Perfect.

00:48:00.040 --> 00:48:01.560
And hopefully we didn't lose Brett.

00:48:01.560 --> 00:48:02.520
Yeah, I'm here.

00:48:02.920 --> 00:48:03.360
Awesome.

00:48:03.440 --> 00:48:03.580
Awesome.

00:48:03.840 --> 00:48:03.880
Awesome.

00:48:08.040 --> 00:48:08.600
No worries.

00:48:09.960 --> 00:48:13.980
And like I said, this is one of the more advanced data flows

00:48:13.980 --> 00:48:15.080
that we want to do.

00:48:15.280 --> 00:48:20.020
And so this starts getting you into controller services

00:48:20.020 --> 00:48:20.940
and those types of things.

00:48:20.940 --> 00:48:23.660
So we have a lot of time allocated for this.

00:48:24.720 --> 00:48:25.000
OK.

00:48:33.220 --> 00:48:33.680
Possibly.

00:48:34.240 --> 00:48:35.900
I'll take a look at yours in just a second.

00:48:37.620 --> 00:48:39.700
Pedro, you want to continue doing a run once

00:48:39.700 --> 00:48:42.140
and let's see if we can get that to success.

00:48:42.320 --> 00:48:43.520
And then I will pull up Brett.

00:48:45.220 --> 00:48:45.660
OK.

00:48:45.900 --> 00:48:49.220
Let's see if our convert record will actually work this time.

00:48:49.800 --> 00:48:50.160
Run once.

00:48:50.340 --> 00:48:50.920
It did not.

00:48:50.940 --> 00:48:52.460
What is our error?

00:48:52.460 --> 00:48:53.680
Failed to process.

00:48:53.680 --> 00:48:54.840
Failed to process.

00:48:54.840 --> 00:48:55.820
Why are you?

00:48:59.440 --> 00:48:59.920
One second.

00:48:59.920 --> 00:49:04.780
I'm looking at mine because we are working off of the exact same.

00:49:08.560 --> 00:49:09.360
Yeah.

00:49:09.360 --> 00:49:10.600
Bad data row.

00:49:11.180 --> 00:49:11.820
Same header.

00:49:14.140 --> 00:49:14.780
Sure.

00:49:27.300 --> 00:49:29.200
So one second, Pedro.

00:49:29.300 --> 00:49:32.100
I'm looking at mine, which worked.

00:49:35.360 --> 00:49:39.080
Oh, oh, I bet I know what we did not do.

00:49:41.400 --> 00:49:44.780
Can we go into the convert record?

00:49:47.800 --> 00:49:52.280
And then the CSV reader, you want to go to that service.

00:49:53.220 --> 00:49:54.620
Click the gear icon.

00:49:55.000 --> 00:49:55.720
No, no, you cancel that.

00:49:57.060 --> 00:49:57.320
There you go.

00:49:57.360 --> 00:49:59.340
Use the arrow to go to the CSV reader

00:49:59.340 --> 00:50:01.380
and use a little gear icon.

00:50:03.400 --> 00:50:04.020
Scroll down.

00:50:04.140 --> 00:50:05.440
Allow duplicate header names.

00:50:07.000 --> 00:50:08.200
Do you think true?

00:50:09.920 --> 00:50:12.860
Oh, treat first line as header.

00:50:13.420 --> 00:50:14.100
So set that to true.

00:50:14.260 --> 00:50:16.300
But remember, we have to disable it before we configure.

00:50:16.460 --> 00:50:16.840
Awesome.

00:50:17.140 --> 00:50:17.660
OK.

00:50:17.820 --> 00:50:18.420
There we go.

00:50:18.620 --> 00:50:19.140
Apply.

00:50:21.040 --> 00:50:23.460
It couldn't parse because it doesn't understand

00:50:23.460 --> 00:50:25.000
what that header was.
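
NOTE
A quick sketch of the CSVReader property being discussed (property name per the standard NiFi CSVReader):
    Treat First Line as Header = true
Controller services have to be disabled before their properties can be edited, then re-enabled. With this left false, the header row is parsed as a data record, which is why the conversion failed.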

00:50:25.040 --> 00:50:27.160
So go ahead and enable it.

00:50:27.480 --> 00:50:30.140
And hopefully convert record will turn perfect.

00:50:30.360 --> 00:50:30.580
All right.

00:50:30.580 --> 00:50:32.280
Let's do a run once all the way through

00:50:32.280 --> 00:50:33.540
and see if it works this time.

00:50:37.640 --> 00:50:38.660
Did it refresh?

00:50:38.880 --> 00:50:39.940
Ah, success.

00:50:40.260 --> 00:50:40.880
Great job.

00:50:41.720 --> 00:50:42.120
OK.

00:50:42.240 --> 00:50:45.560
So if you can, let's go ahead and go through

00:50:45.560 --> 00:50:48.080
and clean this process flow up.

00:50:48.400 --> 00:50:50.040
Let's get the naming in.

00:50:50.300 --> 00:50:53.440
Let's get some labels, those types of things.

00:50:53.780 --> 00:50:56.800
Here is an example of mine.

00:51:02.240 --> 00:51:05.140
Here's an example of what my flow looks like.

00:51:07.680 --> 00:51:12.000
So if you can, go ahead and update yours.

00:51:12.220 --> 00:51:14.880
I need to delete this because I was using it as an example.

00:51:15.700 --> 00:51:17.560
And if you have any other questions, Pedro,

00:51:17.600 --> 00:51:18.540
just let me know.

00:51:18.880 --> 00:51:20.720
Brett, let's go back to you.

00:51:24.760 --> 00:51:25.640
I would.

00:51:26.480 --> 00:51:27.860
Did you test yours out, Brett?

00:51:30.240 --> 00:51:32.040
Yeah, it doesn't look like it's working.

00:51:32.140 --> 00:51:33.640
This first one doesn't have any in or out.

00:51:34.220 --> 00:51:34.640
OK.

00:51:34.700 --> 00:51:36.420
So I'm guessing it's just not picking up the file.

00:51:37.820 --> 00:51:40.280
Um, perfect.

00:51:42.220 --> 00:51:43.420
Let's show configuration.

00:51:44.360 --> 00:51:48.740
If you don't mind, let's stop all your processors

00:51:48.740 --> 00:51:51.900
for right now, just because we need to stop them

00:51:51.900 --> 00:51:52.940
and configure them anyway.

00:51:54.300 --> 00:51:55.340
So beautiful.

00:51:55.620 --> 00:51:55.860
Beautiful.

00:51:55.960 --> 00:51:59.560
And a great way is to use the Operate palette on the canvas.

00:52:02.160 --> 00:52:04.900
So, if you can, go to that input directory property

00:52:04.900 --> 00:52:06.080
and go to that value.

00:52:06.420 --> 00:52:10.840
No, go back to your NiFi canvas and under that value,

00:52:10.900 --> 00:52:14.020
open that and copy that location.

00:52:15.540 --> 00:52:17.840
So you just highlight it and say copy.

00:52:17.840 --> 00:52:19.160
I just use control C.

00:52:19.320 --> 00:52:20.240
OK, that works.

00:52:20.380 --> 00:52:21.460
All right, then hit cancel.

00:52:22.100 --> 00:52:22.720
But don't hit.

00:52:22.900 --> 00:52:23.420
Yeah, there you go.

00:52:23.540 --> 00:52:25.320
And then you go back to your file browser.

00:52:25.320 --> 00:52:25.800
There you go.

00:52:25.800 --> 00:52:28.000
And then in your address bar, paste that.

00:52:29.220 --> 00:52:30.820
I didn't get it.

00:52:31.080 --> 00:52:31.900
Oh, and do it again.

00:52:31.900 --> 00:52:34.380
Control A, Control C, then we'll Control V.

00:52:34.380 --> 00:52:38.560
I'll hit control C 10 times, because it doesn't get it

00:52:38.560 --> 00:52:38.980
the first time.

00:52:39.140 --> 00:52:42.800
Yeah, the desktop environment sometimes will.

00:52:42.860 --> 00:52:43.160
There you go.

00:52:43.280 --> 00:52:43.800
Hit enter.

00:52:45.220 --> 00:52:46.560
Yeah, that was the.

00:52:46.760 --> 00:52:48.500
Here, I'll go up a directory so it's obvious.

00:52:49.660 --> 00:52:49.720
Yeah.

00:52:50.900 --> 00:52:52.040
Oh, it's there.

00:52:52.160 --> 00:52:53.380
The CSV file's there.

00:52:54.440 --> 00:52:55.360
OK, say cancel.

00:52:55.680 --> 00:52:57.740
Oh, your file filter.

00:52:58.560 --> 00:53:03.100
Let's just stick with inventory.csv for now.

00:53:04.400 --> 00:53:07.120
The regex pattern may be off.

00:53:11.000 --> 00:53:13.000
So you can just change that.

00:53:16.080 --> 00:53:17.360
Inventory.csv.
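
NOTE
For reference, the File Filter property on GetFile is a Java regular expression, so a literal dot should be escaped. A minimal sketch (the exact pattern here is an assumption for this exercise):
    File Filter = inventory\.csv
A broader pattern such as .*\.csv would pick up any CSV file in the input directory.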

00:53:18.880 --> 00:53:20.180
I'm assuming that's all set.

00:53:20.320 --> 00:53:20.760
Should I run it now?

00:53:22.080 --> 00:53:24.080
Run once and hit refresh.

00:53:25.000 --> 00:53:27.040
Just right click on your canvas and hit refresh.

00:53:27.200 --> 00:53:29.700
I witnessed you set that path correctly.

00:53:30.840 --> 00:53:34.220
Um, right click on that get inventory file.

00:53:34.520 --> 00:53:35.800
And oh, it went.

00:53:35.920 --> 00:53:36.440
It went.

00:53:36.440 --> 00:53:38.280
I see success.

00:53:39.220 --> 00:53:40.680
OK, perfect.

00:53:41.000 --> 00:53:45.500
And then if you want, just do a run once.

00:53:45.560 --> 00:53:47.060
And let's see if it goes all the way through.

00:53:47.060 --> 00:53:52.000
Can I do a run once for the whole thing or do I have to do each?

00:53:52.000 --> 00:53:54.300
You have to do each individual one.

00:53:54.520 --> 00:53:57.480
Wouldn't it be cool if you just hit run once on the whole process group?

00:53:58.860 --> 00:53:59.540
Yeah, that would be nice.

00:53:59.700 --> 00:54:03.900
I'll submit it to them and see if they can get that into the next version.

00:54:05.300 --> 00:54:06.880
Like I said, I know all those guys.

00:54:07.280 --> 00:54:11.860
So like I submit feedback all the time.

00:54:13.040 --> 00:54:13.240
All right.

00:54:13.240 --> 00:54:17.320
So here's where we usually see our highest failures: on this convert record.

00:54:17.580 --> 00:54:19.040
So go ahead and run that once.

00:54:21.020 --> 00:54:27.680
If it fails, what I would recommend here is logging your failures.

00:54:28.580 --> 00:54:32.540
Just because, you know, you see that red box in the top right?

00:54:32.660 --> 00:54:38.380
So what it did is it auto-terminated the failure, and that's it.

00:54:38.600 --> 00:54:41.520
So it won't go into any logs or anything.

00:54:42.040 --> 00:54:44.120
So what I like to do is do a log message.

00:54:44.220 --> 00:54:44.920
You got it.

00:54:46.080 --> 00:54:47.000
And this is what I was.

00:54:48.220 --> 00:54:54.620
Oh, so this is what I was saying where I like to have a log message

00:54:54.620 --> 00:55:01.500
just on my data flow and any chance I have where a failure can terminate.

00:55:01.620 --> 00:55:07.460
I always drag my failures to the log message just so I can also see the file.

00:55:07.740 --> 00:55:13.780
You know, you may have a file that comes in and is not recognized and it goes to failure.

00:55:14.120 --> 00:55:17.340
Then you can actually view the queue and look and see why.

00:55:17.440 --> 00:55:24.600
Like maybe it picked up a file that was inventory dot CSV that had bad values, right?

00:55:24.620 --> 00:55:28.500
If you do an auto terminate, it takes away some of your options.

00:55:28.540 --> 00:55:31.220
That's why I always recommend a log message.
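
NOTE
A minimal sketch of the LogMessage setup this pattern describes (property names per the standard LogMessage processor; the level and message text are assumptions, and ${filename} / ${uuid} are standard FlowFile attributes):
    Log Level   = warn
    Log prefix  = csv-to-json
    Log message = Failure for ${filename} (uuid ${uuid})
Route each processor's failure relationship into this one LogMessage and auto-terminate the LogMessage's own success relationship.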

00:55:32.400 --> 00:55:36.700
Okay, so you can go ahead and terminate that log message.

00:55:38.080 --> 00:55:41.680
You want to go ahead and configure it and then relationships.

00:55:41.960 --> 00:55:42.780
I'll terminate.

00:55:42.960 --> 00:55:43.080
Perfect.

00:55:43.240 --> 00:55:43.580
And say apply.

00:55:45.520 --> 00:55:45.800
Awesome.

00:55:47.000 --> 00:55:50.780
And then your log message is fine.

00:55:50.800 --> 00:55:52.540
You can just have it sit out to the side.

00:55:53.080 --> 00:55:54.500
You should be good there.

00:55:55.060 --> 00:55:56.880
You don't want to drag it to anything.

00:55:57.460 --> 00:55:59.320
You can drag it to itself.

00:56:02.580 --> 00:56:09.720
If you have a failure, you can drag it to itself and it will reprocess that data.

00:56:10.140 --> 00:56:17.280
We see that when you have a processor that could take a little time sometimes.

00:56:18.500 --> 00:56:22.940
If you have something that is taking a lot of resources,

00:56:22.940 --> 00:56:29.940
like unzipping a large file, but you know that you're getting a bunch of small files behind it,

00:56:30.060 --> 00:56:31.960
every once in a while we'll see a large one.

00:56:32.200 --> 00:56:36.080
You can put a failure back onto itself so you can just reprocess it.

00:56:36.080 --> 00:56:41.280
So we'll take that flow file, put it back into the queue and try to reprocess it.

00:56:41.920 --> 00:56:45.960
But let's go to our convert CSV to JSON convert record.

00:56:47.180 --> 00:56:48.140
And let's configure that.

00:56:48.140 --> 00:56:48.440
Awesome.

00:56:48.460 --> 00:56:49.100
Go to properties.

00:56:49.440 --> 00:56:52.920
Okay, so we are reading in a CSV reader.

00:56:52.940 --> 00:56:56.020
So let's go to that controller service.

00:56:56.020 --> 00:56:58.020
You want to hit that arrow on the right.

00:56:58.300 --> 00:56:58.420
Awesome.

00:56:58.980 --> 00:57:04.620
And let's hit the gear and let's see what your configuration looks like.

00:57:05.840 --> 00:57:06.640
Go to properties.

00:57:07.320 --> 00:57:10.520
Scroll all the way down and treat.

00:57:10.840 --> 00:57:15.520
Okay, you do have the treat the first line as the header.

00:57:18.000 --> 00:57:22.020
Let's go all the way up to the top and let's see what that looks like.

00:57:25.620 --> 00:57:29.800
Infer schema, I don't think that's correct.

00:57:30.640 --> 00:57:34.260
So we want to, yep, use schema name property.

00:57:34.920 --> 00:57:37.860
So you remember we set the schema name in that attribute.

00:57:39.700 --> 00:57:43.720
And so we're telling, now that we've set that attribute,

00:57:43.980 --> 00:57:46.920
we're telling NiFi to use the schema name

00:57:46.920 --> 00:57:50.880
and we're going to specify the name as what schema to use.

00:57:50.880 --> 00:57:53.400
So you got that.

00:57:53.520 --> 00:57:55.020
And then schema registry.

00:57:55.200 --> 00:57:58.400
We want to set, you know, there's no value there.

00:57:58.400 --> 00:58:01.060
We want to set that as the Avro schema registry.
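
NOTE
A minimal sketch of the CSVReader settings being walked through here (property names per the standard NiFi CSVReader; the schema name "inventory" is an assumption carried over from the earlier UpdateAttribute step):
    Schema Access Strategy     = Use 'Schema Name' Property
    Schema Registry            = AvroSchemaRegistry
    Schema Name                = ${schema.name}
    Treat First Line as Header = true
The earlier UpdateAttribute sets schema.name on each FlowFile (for example, schema.name = inventory), and the reader uses that value to look the schema up in the registry.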

00:58:01.260 --> 00:58:01.600
Awesome.

00:58:01.740 --> 00:58:02.000
Say okay.

00:58:03.840 --> 00:58:05.820
And then I click apply.

00:58:05.980 --> 00:58:06.420
Awesome.

00:58:06.560 --> 00:58:10.880
And then let's go to your Avro schema registry we just set up.

00:58:12.560 --> 00:58:13.960
And click the gear icon.

00:58:15.700 --> 00:58:18.760
And you've got the model and you've got that.

00:58:18.760 --> 00:58:19.580
That's beautiful.
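
NOTE
In the AvroSchemaRegistry, each schema is a dynamic property: the property name is the schema name and the value is the Avro schema JSON. A hypothetical sketch, since the course's actual field list isn't shown here:
    inventory = {
      "type": "record",
      "name": "inventory",
      "fields": [
        {"name": "item", "type": "string"},
        {"name": "quantity", "type": "int"}
      ]
    }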

00:58:19.880 --> 00:58:20.160
All right.

00:58:20.160 --> 00:58:20.640
Say okay.

00:58:21.300 --> 00:58:25.260
I think maybe the only issue was you just didn't reference the schema.

00:58:26.100 --> 00:58:30.540
So let's look at your Avro reader.

00:58:33.060 --> 00:58:37.020
It should be just use embedded Avro schema.

00:58:37.200 --> 00:58:37.420
Yep.

00:58:37.640 --> 00:58:37.960
Perfect.

00:58:38.240 --> 00:58:38.320
Perfect.

00:58:39.620 --> 00:58:39.920
Okay.

00:58:40.340 --> 00:58:42.860
And real quickly, let's look at JSON record set writer.

00:58:43.020 --> 00:58:45.940
So you should have set schema.name attribute.

00:58:46.820 --> 00:58:47.660
No value.

00:58:47.940 --> 00:58:49.340
Use schema name property.

00:58:49.340 --> 00:58:52.100
And what is the schema name property?

00:58:52.820 --> 00:58:55.480
It's, you know, we set that in that update attribute.

00:58:55.640 --> 00:58:57.060
So there you go.

00:58:57.940 --> 00:59:00.280
And you've got the Avro schema registry.

00:59:00.340 --> 00:59:01.260
Say okay.
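
NOTE
A matching sketch for the JsonRecordSetWriter controller service, mirroring the reader so both resolve the schema the same way (the write strategy shown is an assumption):
    Schema Write Strategy  = Do Not Write Schema
    Schema Access Strategy = Use 'Schema Name' Property
    Schema Registry        = AvroSchemaRegistry
    Schema Name            = ${schema.name}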

00:59:02.060 --> 00:59:03.320
I think yours is going to work now.

00:59:06.700 --> 00:59:08.760
So just try to do a run once.

00:59:09.740 --> 00:59:11.060
And, you know, let's see how far we get.

00:59:11.840 --> 00:59:15.240
And while he's working on that, let's see.

00:59:15.580 --> 00:59:16.700
How's everyone else doing?

00:59:16.940 --> 00:59:18.180
Cody is cleaning his up.

00:59:18.180 --> 00:59:20.220
Tyler is cleaning his up.

00:59:20.560 --> 00:59:21.260
Looking good.

00:59:22.540 --> 00:59:26.480
So you were able to get the file.

00:59:27.460 --> 00:59:30.140
Let's run it once to set the update attribute.

00:59:32.600 --> 00:59:33.960
Oh, okay.

00:59:34.940 --> 00:59:35.280
Did it.

00:59:38.060 --> 00:59:38.740
Okay.

00:59:39.240 --> 00:59:39.700
Run once.

00:59:40.120 --> 00:59:40.740
Hit refresh.

00:59:41.100 --> 00:59:42.680
Right click and hit refresh on your canvas.

00:59:42.960 --> 00:59:43.060
There you go.

00:59:44.940 --> 00:59:45.500
Success.

00:59:48.180 --> 00:59:50.680
So and then you're updating the attribute.

00:59:51.000 --> 00:59:55.260
And you are then putting the file back to disk.

00:59:55.460 --> 01:00:00.100
So if you can, let's go ahead and clean this up.

01:00:01.440 --> 01:00:06.380
You know, do your names and labels and those types of things.

01:00:06.760 --> 01:00:09.280
Make sure the file is being written out.

01:00:09.620 --> 01:00:16.700
I would add another failure on put file just because you want to make sure you log

01:00:17.240 --> 01:00:18.120
that message.

01:00:18.420 --> 01:00:22.460
If it does have an issue putting the file, you know, as I mentioned earlier,

01:00:22.860 --> 01:00:25.940
when you're putting a file, that could be many things.

01:00:27.340 --> 01:00:30.440
And, you know, the disk could fill up or something like that.

01:00:30.440 --> 01:00:34.160
If you have some sort of logging mechanism set up where you're

01:00:34.160 --> 01:00:37.700
pushing all this to Prometheus and others, you know, Grafana,

01:00:37.800 --> 01:00:39.300
you can take a look at this.

01:00:40.980 --> 01:00:41.500
Okay.

01:00:42.400 --> 01:00:43.560
Brett is squared away.

01:00:44.520 --> 01:00:47.080
So it got output as a CSV.

01:00:48.960 --> 01:00:50.780
Oh, I thought we changed that.

01:00:52.800 --> 01:00:55.320
Yes, that might be another one you missed.

01:00:55.980 --> 01:00:58.520
So do you have, yeah, yeah, yeah.

01:00:58.680 --> 01:01:02.520
Do you have an update attribute after your convert record?

01:01:03.700 --> 01:01:04.920
Yeah, right here.

01:01:06.760 --> 01:01:08.720
All right, let me.

01:01:08.760 --> 01:01:11.160
But this is, this is, I think this is when it cut out.

01:01:12.220 --> 01:01:12.740
Okay.

01:01:14.140 --> 01:01:15.620
But no worries.

01:01:15.620 --> 01:01:16.440
We got you.

01:01:16.540 --> 01:01:17.920
We got you.

01:01:18.380 --> 01:01:19.400
I'm pulling yours back up.

01:01:20.440 --> 01:01:24.020
Okay, so we have the update attribute.

01:01:24.880 --> 01:01:29.740
When it's coming out, you added a new attribute called file name.

01:01:30.100 --> 01:01:34.800
And then the value should be file name like you have dot JSON.

01:01:35.580 --> 01:01:36.000
Perfect.

01:01:36.540 --> 01:01:36.980
Okay.

01:01:37.160 --> 01:01:38.020
And say apply.

01:01:38.860 --> 01:01:39.120
Okay.

01:01:39.180 --> 01:01:39.720
And apply.

01:01:44.260 --> 01:01:48.360
So is there a different attribute for like file extension?

01:01:49.180 --> 01:01:53.880
Because I didn't set that and it automatically just gave it CSV.

01:01:55.480 --> 01:02:01.260
Yeah, because it's using the file name attribute that was originally with it.

01:02:02.640 --> 01:02:08.180
And so it probably wrote that because it needs a file name to write.

01:02:08.180 --> 01:02:15.460
And so it wrote it as a, if you look at, you changed it.

01:02:20.020 --> 01:02:25.840
If you want to run once, but turn off that update attribute where we put on the file name.

01:02:26.980 --> 01:02:28.340
We'll show you where it happens.

01:02:29.340 --> 01:02:33.240
Go ahead and just stop that update attribute.

01:02:33.500 --> 01:02:36.060
And if you can just run it one more time through.

01:02:37.240 --> 01:02:39.340
Oh, you have an error on your file.

01:02:44.340 --> 01:02:47.100
Yeah, that's it's the same file.

01:02:47.220 --> 01:02:49.040
So go ahead and you've got that stop.

01:02:50.020 --> 01:02:50.740
Empty that queue.

01:02:51.520 --> 01:02:52.120
Yes, sir.

01:02:52.560 --> 01:02:57.620
And let's run it once all the way up to the update attribute.

01:02:57.740 --> 01:02:59.500
So let's let it convert.

01:02:59.980 --> 01:03:04.620
And then after it converts, let's look at the attributes.

01:03:05.300 --> 01:03:05.620
Okay.

01:03:05.620 --> 01:03:06.640
And let it convert.

01:03:06.700 --> 01:03:08.860
It already went to set name.

01:03:08.860 --> 01:03:09.700
Oh, it has.

01:03:10.380 --> 01:03:13.620
Well, your other file that was in the queue has that, but it's okay.

01:03:15.180 --> 01:03:18.120
So you should have two in the queue after set name.

01:03:19.920 --> 01:03:21.880
Okay, success.

01:03:22.420 --> 01:03:24.600
So if you can, right before, yep, there.

01:03:24.780 --> 01:03:27.060
Let's look at the attributes of that file.

01:03:28.340 --> 01:03:28.820
Perfect.

01:03:28.900 --> 01:03:30.200
Do you remember where it's at?

01:03:30.860 --> 01:03:32.320
So you see the file name?

01:03:32.400 --> 01:03:33.180
Oh, oops.

01:03:33.580 --> 01:03:34.160
No, you're fine.

01:03:35.620 --> 01:03:37.520
So you see the file name?

01:03:39.020 --> 01:03:45.060
So it's going to write that out until we update that attribute.

01:03:45.380 --> 01:03:50.120
And so what we want to do is update it to be inventory dot JSON.

01:03:51.940 --> 01:04:00.400
So it's going to use that attribute to write the data back out unless you update it.

01:04:00.840 --> 01:04:05.600
And so if you say, okay, and then exit out of that.

01:04:06.260 --> 01:04:08.660
And let's look at your update attribute properties.

01:04:08.760 --> 01:04:14.840
And you see we kept the file name and we added dot JSON, right?
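
NOTE
The UpdateAttribute step described here is one dynamic property; ${filename} is the standard FlowFile attribute, so for inventory.csv the result is inventory.csv.json:
    filename = ${filename}.json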

01:04:15.740 --> 01:04:16.360
Okay.

01:04:17.000 --> 01:04:20.120
And then let that run real quickly.

01:04:23.100 --> 01:04:26.040
That's going to be inventory dot CSV dot JSON.

01:04:26.260 --> 01:04:26.560
Correct.

01:04:28.620 --> 01:04:30.100
Yeah, we didn't strip the

01:04:40.460 --> 01:04:41.960
Yeah, let me make sure.

01:05:07.760 --> 01:05:12.860
So, you should now have file name inventory.csv.json.

01:05:13.320 --> 01:05:21.720
Now, we could do regex, you know, you can update that attribute with some regex and

01:05:21.720 --> 01:05:23.300
change the file name altogether.

01:05:24.500 --> 01:05:31.460
So if you want, you can actually go back into your update attribute and, you know,

01:05:31.460 --> 01:05:37.100
apply some regex, if you wanted to, to change that file name to whatever you want.

01:05:37.180 --> 01:05:39.740
You know, just, you know, as FYI, right?

01:05:40.200 --> 01:05:43.020
So right now it's using the file name and just adding.

01:05:43.840 --> 01:05:47.340
For the sake of this demo, could I just hard code it to inventory.json?

01:05:48.680 --> 01:05:49.460
You could.

01:05:49.580 --> 01:05:51.680
Or did you have the regex already that I can use?

01:05:51.820 --> 01:05:57.220
I do not have it handy, but yeah, just set it to inventory.json,

01:05:57.320 --> 01:05:58.140
see if that works for you.

01:05:58.340 --> 01:06:00.680
Oh, you already have a put file in your queue.

01:06:00.680 --> 01:06:08.120
So if you want to go ahead and run once, before you, unless you just did that,

01:06:08.140 --> 01:06:08.880
I don't know if you did.

01:06:09.200 --> 01:06:10.840
Oh, I did just do it, yeah.

01:06:10.960 --> 01:06:11.980
I know where it is.

01:06:13.660 --> 01:06:14.480
What's our error?

01:06:15.820 --> 01:06:19.820
Because it's going to complain that the file name already exists.

01:06:23.020 --> 01:06:24.860
But that actually is a good question.

01:06:24.860 --> 01:06:25.940
Let's see.

01:06:56.200 --> 01:07:04.080
What I usually like to do is I will actually update the file name and give it like a UUID

01:07:06.860 --> 01:07:09.620
and a new file extension.

01:07:11.380 --> 01:07:13.580
But I can actually, let me see here.

01:07:21.260 --> 01:07:28.780
You can do like a set file name as like file name equal dollar, curly bracket,

01:07:28.860 --> 01:07:34.100
new file name, right?

01:07:40.420 --> 01:07:50.100
And while you work on that, you can set a file name in the regex.

01:07:51.920 --> 01:07:57.740
So you can actually, yeah, so you can put in your update attribute, hang on,

01:07:58.080 --> 01:08:00.640
let me double check my notes here.

01:08:05.020 --> 01:08:10.420
Because if you do file name, it's going to pull in that .csv.

01:08:11.420 --> 01:08:16.360
We need to get rid of that csv or you could do.

01:08:18.320 --> 01:08:21.880
I'm looking at the regex patterns on the NiFi website.

01:08:25.040 --> 01:08:28.220
I mean, if it's like typical regex, it would be something like begin of line.

01:08:29.220 --> 01:08:40.800
Yeah, I mean something and then any amount of characters and then like something like

01:08:40.800 --> 01:08:42.600
not literal.

01:08:43.020 --> 01:08:46.020
Well, you can also do something like a date.

01:08:46.780 --> 01:08:56.940
So you can do like, yes, you know, you can do like now and then parentheses

01:08:56.940 --> 01:09:02.980
and then close your curly brackets, dot json, you know, and that would assign it.

01:09:02.980 --> 01:09:04.880
if you do like now, there you go.

01:09:05.540 --> 01:09:08.820
And then what that's going to do is just give you a date, right?

01:09:08.960 --> 01:09:09.440
Timestamp.

01:09:09.580 --> 01:09:10.920
Yeah, a timestamp dot json.
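
NOTE
A few hedged variations on that UpdateAttribute property, sketched with standard NiFi Expression Language functions, matching the options discussed here (strip the .csv, use a timestamp, or hard-code the name):
    filename = ${filename:substringBeforeLast('.csv')}.json
    filename = ${now():format('yyyyMMddHHmmss')}.json
    filename = inventory.json
The first keeps the original base name, the second avoids name collisions on repeated runs, and the third is the hard-coded value used in the session.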

01:09:11.360 --> 01:09:12.260
That's a good question.

01:09:12.400 --> 01:09:13.940
Run that once and see if it works.

01:09:15.480 --> 01:09:17.140
It looked okay, but we'll see.

01:09:18.160 --> 01:09:20.660
How do I get rid of these errors here?

01:09:21.520 --> 01:09:26.740
So the little red box in the top right, it goes away in five minutes.

01:09:29.180 --> 01:09:35.260
So it is annoying as well where, you know, you could be testing and running

01:09:35.260 --> 01:09:39.080
through these things quickly and you know, the error still exists.

01:09:40.240 --> 01:09:42.620
But I think you're off to a good start.

01:09:43.260 --> 01:09:44.760
I'll look at some regex patterns.

01:09:45.040 --> 01:09:45.880
One more quick question.

01:09:47.340 --> 01:09:49.220
Can I clear out all the queues at once?

01:09:50.920 --> 01:09:51.940
Go back.

01:09:52.540 --> 01:09:57.460
See if you can go back, use your breadcrumb and go back to the NiFi Flow main.

01:09:59.060 --> 01:10:04.320
And then right click on your... perfect, and say stop. Right click again.

01:10:06.180 --> 01:10:07.160
Empty all queues.

01:10:07.160 --> 01:10:08.080
Okay, cool.

01:10:08.380 --> 01:10:08.980
There you go.

01:10:08.980 --> 01:10:09.440
Perfect.

01:10:10.620 --> 01:10:11.000
Okay.

01:10:11.900 --> 01:10:17.840
So I know we're still working on this, but are there any additional questions?

01:10:17.940 --> 01:10:18.700
I got a quick question.

01:10:19.040 --> 01:10:19.580
Yes, sir.

01:10:21.320 --> 01:10:23.620
So say it does error out?

01:10:24.420 --> 01:10:26.300
What happens to the log?

01:10:26.660 --> 01:10:29.800
Does it become a file or does it just log it somewhere?

01:10:30.280 --> 01:10:30.880
It does.

01:10:31.640 --> 01:10:35.880
And that's the reason that I recommend you use the log message.

01:10:36.300 --> 01:10:39.820
And so what that is going to do is push that to the NiFi log.

01:10:40.200 --> 01:10:43.800
For instance, you pulled that log up to get your username and password.

01:10:44.980 --> 01:10:48.820
And so it will log that message to the NiFi logs.

01:10:50.740 --> 01:10:51.700
Yes, sir.

01:10:51.860 --> 01:11:03.680
So if you go into the log directory and you go to the nifi-app log, if it creates a log entry here, it will log it to the nifi-app log.
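
NOTE
The application log referenced here is logs/nifi-app.log under the NiFi install directory (path per a default installation; adjust if logback.xml was customized). A quick way to watch LogMessage output while testing:
    tail -f logs/nifi-app.log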

01:11:08.180 --> 01:11:09.740
No, that's a great question.

01:11:09.960 --> 01:11:17.660
And again, some of the design patterns, you know, you may not want to put, you know, a log.

01:11:17.660 --> 01:11:20.680
I log every failure.

01:11:21.060 --> 01:11:28.740
I even use the log message for success sometimes because I want to see some certain aspects of that flow file.

01:11:29.480 --> 01:11:31.740
But no, that's a great question.

01:11:32.880 --> 01:11:33.920
Any other questions?

01:11:34.880 --> 01:11:37.180
And Brett, if you can, you know.

01:11:37.360 --> 01:11:39.760
That now didn't work, by the way.

01:11:41.140 --> 01:11:43.240
I'm just going to hard code it to inventory for now.

01:11:43.520 --> 01:11:43.960
Perfect.

01:11:44.120 --> 01:11:44.500
Perfect.

01:11:46.740 --> 01:11:47.880
Go ahead.

01:11:48.380 --> 01:11:57.600
So I hard coded my update attribute to inventory.json, but it seems like it's still wanting to output as a CSV.

01:12:00.240 --> 01:12:05.940
Is it outputting the name as inventory.csv, or the contents as CSV?

01:12:06.600 --> 01:12:08.060
It's still a CSV file.

01:12:09.300 --> 01:12:09.740
Okay.

01:12:09.780 --> 01:12:11.500
So let me take a look.

01:12:12.720 --> 01:12:14.220
I'm sorry, I didn't have it pulled up.

01:12:14.220 --> 01:12:14.840
Who was speaking?

01:12:15.120 --> 01:12:16.100
Who was that?

01:12:16.340 --> 01:12:16.480
Cody.

01:12:17.180 --> 01:12:18.320
Oh, my, Cody.

01:12:18.500 --> 01:12:20.220
I should know your name by now.

01:12:23.360 --> 01:12:23.380
All right.

01:12:23.380 --> 01:12:24.040
Let's say cancel.

01:12:25.860 --> 01:12:26.540
All right.

01:12:26.680 --> 01:12:29.400
So you are, am I driving this?

01:12:29.640 --> 01:12:31.360
Let me make sure I have this on the interactive.

01:12:31.400 --> 01:12:32.760
Oh, I do have it on the interactive.

01:12:33.020 --> 01:12:33.660
So hang on one second.

01:12:33.720 --> 01:12:34.360
There you go.

01:12:34.800 --> 01:12:35.200
Perfect.

01:12:36.160 --> 01:12:37.120
All right.

01:12:37.760 --> 01:12:40.720
So you have your convert to JSON.

01:12:41.280 --> 01:12:42.400
What step is failing?

01:12:43.800 --> 01:12:46.120
Oh, you're getting failures like crazy.

01:12:47.080 --> 01:12:52.220
That was failing because it's the same file name.

01:12:52.660 --> 01:12:53.860
Oh, oh, oh.

01:12:53.860 --> 01:12:54.540
No worries.

01:12:56.460 --> 01:12:56.800
Okay.

01:12:57.200 --> 01:13:01.180
And then update file name, success, put, and then failure.

01:13:01.600 --> 01:13:01.700
Where?

01:13:01.900 --> 01:13:07.460
And so the file that's being written, can we look at it?

01:13:08.040 --> 01:13:09.260
Yeah, it's this guy right here.

01:13:09.880 --> 01:13:10.640
Can you open it up?

01:13:10.640 --> 01:13:12.320
Let's make sure it's a JSON document.

01:13:13.560 --> 01:13:14.000
Awesome.

01:13:14.340 --> 01:13:14.940
It is.

01:13:15.000 --> 01:13:16.140
Go ahead and close that out.

01:13:17.160 --> 01:13:18.140
Go ahead and close that.

01:13:18.320 --> 01:13:20.840
And then your update file name.

01:13:20.980 --> 01:13:21.560
Let's look at that.

01:13:21.620 --> 01:13:21.940
Stop it.

01:13:21.980 --> 01:13:22.900
And let's look at that property.

01:13:26.040 --> 01:13:26.320
Yeah.

01:13:26.760 --> 01:13:29.980
So you're just putting in the updated name.

01:13:30.080 --> 01:13:37.180
But we actually need to put it in the correct format for NiFi to read it.

01:13:37.180 --> 01:13:43.820
So what you want to do is, yeah, change that.

01:13:45.220 --> 01:13:46.440
And you want to use dollar.

01:13:47.860 --> 01:13:52.020
Because dollar, curly bracket, file name is part of the Expression Language.

01:13:52.460 --> 01:13:55.680
And so NiFi understands that.

01:13:55.880 --> 01:13:58.540
So file name, close your curly bracket, JSON.

01:14:00.220 --> 01:14:00.880
And say, okay.

01:14:01.000 --> 01:14:01.380
All right.

01:14:01.380 --> 01:14:04.340
I had that originally.

01:14:04.760 --> 01:14:06.400
I was still getting a CSV.

01:14:07.660 --> 01:14:12.560
It should be inventory.csv.json.

01:14:13.200 --> 01:14:16.880
I'm going to come up with a regex pattern to change that.

01:14:17.200 --> 01:14:19.680
Just because I think Brett also has the same question.

01:14:20.580 --> 01:14:22.320
But can you run that to see?

01:14:22.440 --> 01:14:23.440
And then we will take a quick.

01:14:24.320 --> 01:14:28.600
Instead, on yours, place .json.

01:14:28.840 --> 01:14:29.100
Okay.

01:14:29.100 --> 01:14:29.800
There you go.

01:14:29.900 --> 01:14:30.720
Just run once.

01:14:31.120 --> 01:14:31.500
Okay.

01:14:32.000 --> 01:14:33.220
It's still writing a CSV.

01:14:35.500 --> 01:14:35.760
All right.