14 videos 📅 2024-05-06 08:00:00 America/Creston

Visit the Apache Nifi - GROUP 1 course recordings page

WEBVTT

00:00:02.220 --> 00:00:08.580
Okay, perfect. So if you can stop all your processors, I just showed Brett a little trick

00:00:08.580 --> 00:00:13.360
on how to quickly stop those all. If you want to go right back to your main canvas, use

00:00:13.360 --> 00:00:20.260
your breadcrumb trail at the bottom. There you go. Right click on that processor

00:00:20.260 --> 00:00:28.740
group. Say stop. There you go. And right click again and say empty all queues. Okay, perfect.

00:00:29.300 --> 00:00:35.940
Perfect. Alright, let's go back into that. So let's see. Everything is empty. Let's

00:00:35.940 --> 00:00:41.780
leave everything stopped and let's run it once. And then whatever the output of

00:00:42.040 --> 00:00:47.480
convert CSV to JSON, we're going to take a look at. Don't run the update file name yet.

00:00:47.920 --> 00:00:55.380
There you go. Hit refresh. Alright, success. Let's take a look at that update file name.

00:00:55.700 --> 00:01:00.000
Let's take a look at that queue right before list the queue. Go all the way to your left

00:01:00.000 --> 00:01:05.880
and let's look at the attributes. I can already see filename there. Okay. Filename is

00:01:05.880 --> 00:01:14.180
inventory.csv. Perfect. Say okay. And exit out of that. And run the update file name

00:01:14.180 --> 00:01:25.980
one time and refresh. We're not going to put the file yet. We're just going to run

00:01:25.980 --> 00:01:33.120
it once. Yep. And hit refresh. Success. Let's look at that queue. List the queue.

00:01:33.120 --> 00:01:42.120
And why is there, okay, hit that. File name. Say okay. Exit out of that. And you're

00:01:42.120 --> 00:01:51.700
using an UpdateAttribute, right? Say configure. Oh, you capitalized filename. It's lower

00:01:51.700 --> 00:01:59.800
case. So no, no, no. Cancel. Delete that. Delete that altogether. And then add a

00:01:59.800 --> 00:02:06.860
filename in lowercase because that's the property name. There you go. And now dollar

00:02:06.860 --> 00:02:12.620
curly bracket filename. Close your curly brackets. Awesome. Awesome. Awesome. And

00:02:12.620 --> 00:02:19.220
that should work. And so you want to clear your queue on the place.json file right

00:02:19.220 --> 00:02:24.860
before it. You want to clear that queue. And then run it once and make sure. Okay.
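
The capitalization fix above matters because attribute names are case-sensitive: a property named Filename creates a new attribute instead of updating filename. A minimal Python sketch of that behavior follows. It is a simulation, not NiFi code; the evaluate and update_attribute helpers, the ".json" suffix, and the choice to resolve unknown names to an empty string are all assumptions made for illustration.

```python
import re

def evaluate(expression, attributes):
    """Replace each ${name} reference with the matching attribute value.

    Lookups are case-sensitive. Unknown names resolve to an empty string,
    which is an assumption of this sketch.
    """
    return re.sub(r"\$\{(\w+)\}",
                  lambda m: attributes.get(m.group(1), ""),
                  expression)

def update_attribute(attributes, properties):
    """Simulate an update step: each property name becomes an attribute name."""
    updated = dict(attributes)
    for name, expression in properties.items():
        updated[name] = evaluate(expression, attributes)
    return updated

flowfile = {"filename": "inventory.csv"}

# Lowercase property name: the existing attribute is overwritten.
good = update_attribute(flowfile, {"filename": "${filename}.json"})

# Capitalized property name: a second attribute appears and the
# original filename is left unchanged.
bad = update_attribute(flowfile, {"Filename": "${filename}.json"})

print(good)
print(bad)
```

In the second case the file would still be written under its original name, which matches the symptom seen when listing the queue.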

00:02:24.860 --> 00:02:39.120
So I have 12:10, 10:10 your time. Let's take a 15-minute break. Use

00:02:39.120 --> 00:02:49.140
the restroom and stuff. Let's try to get back here at 12:25 my time, 10:25 your

00:02:49.140 --> 00:02:56.720
time. If you get, you know, if you get a free minute, I'd like for you to go in and

00:02:56.720 --> 00:03:01.500
just kind of touch everything up. You know, clean it up. Those types of things.

00:03:01.640 --> 00:03:05.020
Let's take a look at. So when we get back, we're going to take a look at your

00:03:05.020 --> 00:03:11.700
data flow and go through any final questions. We still have more hands on to

00:03:11.700 --> 00:03:17.880
do. But now that we've got a couple of data flows in our canvas, we need to do

00:03:17.880 --> 00:03:23.420
version control. So let's take a break and I will see you all in 15.

00:04:03.160 --> 00:04:11.480
Give another minute for folks to return. But, you know, great job to everyone

00:04:11.480 --> 00:04:16.620
getting this flow working. There was a reason I chose this flow to go through

00:04:16.620 --> 00:04:21.160
this morning. It really kind of shows off some of the controller service

00:04:21.160 --> 00:04:28.200
capabilities. It's more than your standard NiFi flow of just picking a

00:04:28.200 --> 00:04:32.560
file up, unzipping it, and putting it somewhere else. We really started

00:04:32.560 --> 00:04:37.360
getting into some of those controller services, those types of things, as

00:04:37.360 --> 00:04:43.240
well as, you know, schemas, record readers, record writers. And you'll

00:04:43.240 --> 00:04:48.560
see a lot of this in, you know, a lot of your data flows will utilize these

00:04:48.560 --> 00:04:54.260
services. The beauty of that controller service was, you know, now we can

00:04:54.260 --> 00:04:58.740
reuse those components. We can reuse those controller services if need be.

00:04:58.960 --> 00:05:04.020
We can reuse that schema. So if we had, you know, CSV coming in from

00:05:04.020 --> 00:05:09.740
somewhere else, we can reuse a lot of this same data flow. So that makes

00:05:09.740 --> 00:05:15.920
it a lot easier for us. There's even more advanced capabilities under the

00:05:15.920 --> 00:05:20.580
hood when you start diving into this. The ability to have, you know,

00:05:20.780 --> 00:05:26.940
multiple Avro schemas, for instance, and, you know, those types of things.
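
The reuse idea described here, one schema shared by a record reader and a record writer, can be sketched in plain Python. This is only an illustration of the pattern, not how NiFi implements it; the field names (item, quantity, price), the dictionary used in place of an Avro schema, and the function names are hypothetical.

```python
import csv
import io
import json

# Hypothetical schema standing in for an Avro schema: field name -> type.
INVENTORY_SCHEMA = {"item": str, "quantity": int, "price": float}

def read_csv_records(text, schema):
    """Record reader: parse CSV text and coerce each field to its schema type."""
    reader = csv.DictReader(io.StringIO(text))
    return [{name: cast(row[name]) for name, cast in schema.items()}
            for row in reader]

def write_json_records(records):
    """Record writer: serialize the records as a JSON array."""
    return json.dumps(records, indent=2)

def convert_csv_to_json(text, schema):
    """Reader and writer chained together, both driven by the same schema."""
    return write_json_records(read_csv_records(text, schema))

sample = "item,quantity,price\nwidget,4,2.50\ngadget,10,11.00\n"
print(convert_csv_to_json(sample, INVENTORY_SCHEMA))
```

Because the schema is passed in rather than hard-coded, CSV arriving from another source with the same fields could go through the same reader and writer unchanged.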

00:05:27.760 --> 00:05:35.060
So I really like this hands-on experiment. What we'll do now that

00:05:35.060 --> 00:05:38.580
hopefully all of us are back, we'll kind of go through that flow, make

00:05:38.580 --> 00:05:43.660
sure everything looks good. We should have now two data flows on our

00:05:43.660 --> 00:05:49.620
canvas. And now, after we get through going through those, we will work on

00:05:49.620 --> 00:05:54.520
checking them into versioning control and what that looks like and how that

00:05:54.520 --> 00:06:02.260
gets configured. I know we had some DevOps and, you know, some CI/CD

00:06:02.260 --> 00:06:07.500
questions originally. So hopefully this will start fleshing some of that out

00:06:07.500 --> 00:06:13.380
and we'll go from there. So that being said, let's quickly go through.

00:06:14.040 --> 00:06:17.560
Pedro, it looks like you've got yours working. Everything looks good.

00:06:17.640 --> 00:06:22.740
Awesome. And you've started naming your processors and labeling them.

00:06:23.080 --> 00:06:27.620
It's in its own processor group. You know, those types of things.

00:06:28.340 --> 00:06:32.360
So I like how you put the error log out to the side. That, you know,

00:06:32.360 --> 00:06:38.000
I can quickly look at this data flow and I can distinguish the different parts

00:06:38.000 --> 00:06:43.140
of this. I can see where you're logging your errors. And, you know,

00:06:43.200 --> 00:06:47.640
your processors, a lot of your processors have human readable, human

00:06:47.640 --> 00:06:52.460
understandable meaning. So yours looks good. Any questions, Pedro,

00:06:52.660 --> 00:06:55.500
from yours? I'll take silence as a no.

00:06:57.200 --> 00:06:57.740
Okay.

00:07:00.640 --> 00:07:01.860
Yes, go ahead.

00:07:05.200 --> 00:07:08.360
Because I noticed you had yours going diagonally out.

00:07:09.740 --> 00:07:13.840
I was trying to figure out how to do that. I couldn't drag them,

00:07:13.980 --> 00:07:14.980
which is what I thought.

00:07:15.880 --> 00:07:16.880
No worries.

00:07:21.440 --> 00:07:25.300
Let's see here. Let me bring mine back up. That's a good question.

00:07:25.640 --> 00:07:30.020
There are a lot of hidden tips and tricks on this.

00:07:31.220 --> 00:07:34.520
So if you want, you can go to the line and just double click.

00:07:36.360 --> 00:07:37.880
You see how I did that?

00:07:41.320 --> 00:07:42.680
So you can double click

00:07:42.680 --> 00:07:44.500
where you want it to bend.

00:07:50.460 --> 00:07:51.980
And you can bend your lines.

00:07:53.440 --> 00:07:57.880
So you can...

00:07:59.860 --> 00:08:03.100
You see what I'm doing is double clicking that line.

00:08:04.960 --> 00:08:07.020
Okay, perfect. No, good question.

00:08:09.160 --> 00:08:10.040
Go ahead.

00:08:12.020 --> 00:08:18.480
So because these VMs have so much latency, I accidentally clicked one and dragged it

00:08:18.480 --> 00:08:22.960
and really messed up my canvas. Is there a Control Z in the canvas

00:08:22.960 --> 00:08:27.720
to undo the last thing you did? Or do you just have to try and deal with the fallout?

00:08:27.720 --> 00:08:30.360
You're going to have to deal with the fallout.

00:08:32.080 --> 00:08:35.320
I do like that Control Z question.

00:08:37.300 --> 00:08:40.920
I'm going to take note of that one as well and see

00:08:42.500 --> 00:08:45.360
if it's possible we can do something like that.

00:08:46.780 --> 00:08:53.160
That's a great question. We are at the mercy of the latency of these VMs

00:08:53.160 --> 00:08:58.080
as well as proxies and everything else that you have to go through.

00:08:58.760 --> 00:09:01.020
But yeah, that's how you bend them.

00:09:03.320 --> 00:09:06.620
How do I get the line back?

00:09:11.700 --> 00:09:15.140
You want to do a

00:09:15.140 --> 00:09:16.620
line back.

00:09:18.040 --> 00:09:19.640
Let me see here.

00:09:22.660 --> 00:09:25.300
I'm trying to remember not double click.

00:09:31.540 --> 00:09:32.400
Oh, it was double click.

00:09:34.140 --> 00:09:35.160
Yeah, I'm about to say it.

00:09:36.880 --> 00:09:38.480
I got rid of one by just double clicking.

00:09:49.540 --> 00:09:50.360
And then

00:09:50.360 --> 00:09:51.360
once you get off of it,

00:09:51.360 --> 00:09:53.020
then you just double click.

00:09:53.160 --> 00:09:56.560
There you go. You just double click the spot where you put the bend.

00:10:02.800 --> 00:10:03.680
That was

00:10:03.680 --> 00:10:06.580
your old Sean.

00:10:07.180 --> 00:10:12.820
Yeah, unfortunately I was tied up with work until like 15 minutes ago.

00:10:12.940 --> 00:10:15.700
So I'm pretty far behind you guys now.

00:10:17.020 --> 00:10:22.260
Did you download that zip file?

00:10:23.540 --> 00:10:24.860
Okay, perfect.

00:10:24.860 --> 00:10:26.380
So if you can,

00:10:28.720 --> 00:10:35.100
you can go to that zip file

00:10:35.100 --> 00:10:37.620
or go back to your main canvas.

00:10:40.380 --> 00:10:43.980
And then the sample data flows, you can actually import that

00:10:43.980 --> 00:10:46.200
CSV JSON demo data flow

00:10:48.100 --> 00:10:51.200
and it will give you the complete flow

00:10:51.200 --> 00:10:56.440
just for FYI. So if you want to catch up, I would go back to your

00:10:56.440 --> 00:11:01.160
main NiFi flow canvas, bring down a processor group.

00:11:01.200 --> 00:11:06.540
There you go, that one. And then say to the right, say upload. You know there's an up arrow.

00:11:06.620 --> 00:11:09.940
Click that and then go to that folder you downloaded.

00:11:12.180 --> 00:11:14.260
Right on your C drive.

00:11:16.000 --> 00:11:20.500
And click that one, say open, say add, and then double click to go in there.

00:11:22.280 --> 00:11:26.000
So I go through and build these to make sure

00:11:26.000 --> 00:11:30.640
for instances I get called away, I had work to do,

00:11:30.880 --> 00:11:35.840
I had a technical problem. You will still need to go through and configure your

00:11:35.840 --> 00:11:38.820
CSV to JSON, the record reader and record writer.

00:11:39.440 --> 00:11:45.040
But all the services are there, so you just need to go through usually and enable them.

00:11:47.560 --> 00:11:48.640
Yeah, yeah.

00:11:50.680 --> 00:11:53.720
It should be good for the rest of the day.

00:11:55.380 --> 00:11:59.000
No worries, no worries. And then when you can,

00:11:59.820 --> 00:12:04.240
you'll have to fix your CSV from directory because you'll need to know

00:12:04.240 --> 00:12:09.660
where to get that, so you'll need to update that, you'll need to update where you write the JSON,

00:12:11.320 --> 00:12:14.000
and also enable all your controller

00:12:14.000 --> 00:12:18.380
services. But you should be able to catch up pretty quickly in that flow,

00:12:18.740 --> 00:12:23.800
as well as kind of walk through it and see if you have any additional questions

00:12:23.800 --> 00:12:27.060
or anything else, you know, just feel free to interrupt me.

00:12:27.640 --> 00:12:31.640
Okay, yeah, I'll go through all these processors and try to configure them.

00:12:31.640 --> 00:12:37.180
Okay, yeah, and like I said, if you have any issues, you know, interrupt me and let me know.

00:12:37.980 --> 00:12:38.340
Alright.

00:12:43.880 --> 00:12:48.840
Any questions? I really like, you know, just a tip here,

00:12:48.840 --> 00:12:54.840
you know, you want to label processors that are, you know,

00:12:55.640 --> 00:12:58.840
like in kind. So, you know, you can,

00:13:00.760 --> 00:13:04.900
let's see, the pick up files and the set attributes,

00:13:05.360 --> 00:13:09.100
you may want to combine a label just to, you know,

00:13:09.200 --> 00:13:13.960
say this is where we're setting attributes or something, but I really like how you got all these

00:13:13.960 --> 00:13:18.460
labels labeled. I like the naming you put in.

00:13:18.840 --> 00:13:23.800
You've got multiple, you listened on the multiple logs, right? You're bringing your

00:13:23.800 --> 00:13:28.900
stored failed conversions, any failures there, any failures on the

00:13:28.900 --> 00:13:34.360
store, the JSON, all to have a log message. So everything looks good to me.

00:13:34.900 --> 00:13:37.560
I'll take silence as no, but do you have any questions?

00:13:37.560 --> 00:13:42.480
Right. Okay, Aaron, let's look at yours.

00:13:43.740 --> 00:13:45.780
If you can, can you go into your CSV to JSON?

00:13:47.980 --> 00:13:53.060
Awesome. Any issues? I see you have a file queued up.

00:13:53.720 --> 00:13:56.800
Did everything work well for you? Yeah.

00:13:57.280 --> 00:13:58.280
Perfect, perfect, perfect.

00:13:58.280 --> 00:14:02.200
All right.

00:14:02.620 --> 00:14:06.820
Cody, whose voice I should know. Very nice.

00:14:07.240 --> 00:14:11.840
If you want, you can log, you know,

00:14:11.920 --> 00:14:17.180
your log placement and your log conversion failure. You can reduce that to one

00:14:17.180 --> 00:14:20.940
processor and just log all your failures if you wanted.

00:14:20.940 --> 00:14:25.340
And then, you know, but you can,

00:14:26.860 --> 00:14:30.820
you know, you can have separate log strategies for separate processes.

00:14:30.900 --> 00:14:35.380
So, you know, totally your call. But I really like your labeling

00:14:37.280 --> 00:14:40.580
and the naming conventions, those types of things. So

00:14:42.160 --> 00:14:46.180
any questions? Let me know. Oh, Ben, you got some funnels going

00:14:46.180 --> 00:14:47.440
in. Wow.

00:14:52.820 --> 00:14:56.580
Outdoing yourself. So you even named the

00:14:56.580 --> 00:14:59.380
connections, you know, plus one for you.

00:15:00.340 --> 00:15:05.520
So that looks good. Hopefully you didn't have any issues. The only thing I would do is, you know,

00:15:05.520 --> 00:15:10.200
go through and add some labels. But, you know, the way you got this set up,

00:15:10.200 --> 00:15:14.960
I could easily read it. You may also want to consider, you know,

00:15:15.020 --> 00:15:20.020
how you align your processors. You know, you've seen already

00:15:20.020 --> 00:15:25.100
a lot of folks that will go straight down and go straight through the data flow.

00:15:25.700 --> 00:15:30.240
Or you can go left to right. If you speak Arabic, you can go right to left.

00:15:31.040 --> 00:15:34.660
You know, those types of things. So, you know, I just look at those types,

00:15:34.660 --> 00:15:38.840
you know, tips and tricks, then, you know, apply labels to make them understandable.

00:15:39.360 --> 00:15:40.100
But I don't see.