Apache NiFi - GROUP 1 course recording (sessions of 2024-05-06 to 2024-05-08)
Looking good, Cody, looking good. Does anyone have any issues so far, or are we good to go? Pedro, were you able to copy your data flow into a new group?

Yeah, I was able to download it, create a new process group, and just copy and paste the old one into this one.

Yeah, so you want to bring down a new process group. Let me pull yours up and see. Perfect. So you see on your breadcrumb trail you have NiFi Flow and then the new process group; click NiFi Flow. Perfect, perfect. And name it. So yeah, once you create a new process group, we are going to ingest a CSV file. I talked a little bit about this yesterday: we are going to convert that to JSON. So if you want to name your process group "CSV to JSON" or something similar, to make it easy to understand.

Got it. Perfect. You got it.

All right, Pedro's good. Okay, so for this exercise, we are going to pick up a CSV file and convert it to JSON. We are going to use controller services, so we're going to need to create a record reader and a record writer and get those enabled. This is one of the more advanced data flows; I suspect it's going to take a little bit of time, but you'll see a lot of this. It also reinforces the thing I was talking about yesterday, that you can set up these controller services and they can be reused. So, for instance, this morning we're going to set up a CSV reader, and that CSV reader can be used over and over.

Brett, you shouldn't need to log in.

It told me to download without logging in, is what my screen said.

But anyway, we're going to set these controller services up and go through and build this out. I did include the flow itself already built, and you're more than welcome to import that flow and use it as an example.
But I'd really like for everyone to build this from scratch if possible, just because it helps you learn to configure the controller services, those types of things. And I'll kind of walk along with it. Yeah, thanks for the laugh. All right, so let's get started on that.

So, the zip file you downloaded: did everyone get that zip file downloaded? If you can, extract that zip file. If you're looking at my screen, I downloaded the zip file, and what I'll do is just go to my Downloads. You should see example-main; right-click and say Extract All. Perfect. Then go into example-main and you should see some sample data: an inventory folder and inventory.csv.

Let's take a look. Who was that speaking? I'm sorry. Hey, Cody, I've asked you that multiple times. All right, I should have your voice memorized by now.

What is... yeah, that's very strange, because I'm looking at it and you just chatted me in the VM. Yeah, I can do that Dropbox link.

Can I just send it directly to you in Teams chat?

The VM would be better if you can, because I have my Teams on my government machine and I'm using the VM on my personal computer. Oh, okay. Let me see.

Teams is fine too; I can just send it over.

Let me put it in Teams, but I might be able to put it directly into this. Let me see if I can take control: Interact, Copy Link. There you go, and I'll download it for you. It worked; I was very surprised it worked, because I have to go back and tell it I want to interact with your machine and then log in to your machine.

Okay, so everyone should have that zip file. Once it's on your machine, if you can, extract it.
It might be easier to put it on your desktop, but put it in a location that you know about, because we're going to need it for our GetFile processor. Okay, so hopefully everyone has that. Let me know if you run into any issues; we can stop and get it squared away. With that being said, inside that zip file there's a sample-data folder, and then there's an inventory folder and inventory.csv.

So if you can, on your NiFi canvas, let's do a new process group. Because we already have a data flow that we worked on yesterday, we want to bring down a new process group onto the canvas, and that way we can start building this data flow. If you look at my screen, I have a process group called "CSV to JSON demo data flow". You can name it however you want, but definitely something you can work off of.

Our first step is to get the file from the directory, so you most likely want to start with a GetFile. There are a couple of different ways you could do that: you could list the directory, filter on it, and then turn around and fetch the file. But I find just leading off with a GetFile the easiest method. The file that we're looking for is inventory.csv. You can design this how you want, but for me, for instance, the file was in my uploads directory, because this desktop environment we're working in has an uploads directory on the desktop that we can put files into. And for the File Filter, I put inventory.csv instead of picking up everything.

All right. Once we have that, we need to set the schema metadata: attributes so that we can later understand which schema to use to process the data.
So we want to create an UpdateAttribute, because once we get that CSV file, we want to tag its metadata with the schema name that will be used to read and to write the data. I know I have mine pulled up, but I'd love to see others' thought processes putting this in.

So we get to "set schema metadata". On the UpdateAttribute, if you didn't know (I think a couple of you were working with one yesterday), you can go to the Properties tab and just add a property. You give it a property name; we could do schema.type, for instance, and it needs a value; I'll just put "row". I'm not going to use that attribute, but that's how you add attributes to a flow.

So: we are getting the file and ingesting it. We're not even looking at the CSV contents yet, just the metadata. What we want to do is add an attribute that says schema.name is "inventory".

Once we have that, we need to convert CSV to JSON. The processor I like to use for that is ConvertRecord. In its properties, you're going to have a Record Reader and a Record Writer; these are controller services we're going to set up. The Record Reader is a CSVReader, so you should be able to choose, say, an AvroReader or a CSVReader. And for the Record Writer, we want the JsonRecordSetWriter. Let me know if you have any hiccups there. Once you have that CSVReader set up as the record reader, you can actually go to the service. If the service isn't there, let me know, but once you put that in and hit Apply, it should take it. If it doesn't, we can just create a new one.
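Outside of NiFi, the transformation this ConvertRecord step performs is essentially "parse CSV rows into records, then serialize the record set as JSON." A minimal Python sketch of that idea follows; the column names (item, quantity) are made up for illustration, and the real flow does this via the CSVReader and JsonRecordSetWriter controller services, not code:

```python
import csv
import io
import json

# Hypothetical inventory.csv content; in the real flow the columns come
# from the Avro schema registered for the "inventory" schema name.
csv_text = "item,quantity\nwidget,5\ngadget,12\n"

# Read the CSV, treating the first line as a header (like the CSVReader setting).
records = list(csv.DictReader(io.StringIO(csv_text)))

# Write the record set back out as pretty-printed JSON (like the writer's
# "Pretty Print JSON" property).
json_text = json.dumps(records, indent=2)
print(json_text)
```

Note that without a schema, every value stays a string; the Avro schema set up later is what lets NiFi emit quantity as a number instead.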
You see where I have a CSVReader that I went in and created. This is the tricky part of this flow. What we want to do is set up the first controller service, the CSV record reader; then we want to set up a JsonRecordSetWriter; and we're also going to set up a couple of schema controller services.

So bring down the ConvertRecord processor; when you search, it should show ConvertRecord. Okay. Yeah. Perfect. You should have CSVReader on the Record Reader and JsonRecordSetWriter on the Record Writer. Just as a tip, you want to change Include Zero Record FlowFiles to false. Once you save and say okay, you can go back into that configuration and go straight to the controller services. You should see the controller services listed, and you want to add a CSVReader, a JsonRecordSetWriter, an AvroReader, and an AvroSchemaRegistry.

Andrew, I'm going to pull your screen up, because it looks like you're in the middle of creating it.

I'm trying to figure out what goes in those metadata properties.

Yeah. If you can, hit Cancel. Okay, so we have "get CSV file"; you named it, you got it. But what we want to do next is an UpdateAttribute, because we want to tell NiFi which schema to use. So hit Cancel and bring down a new processor: an UpdateAttribute processor. You can just start typing "update attribute". Perfect. Then drag your success relationship down to that one. Awesome. Perfect. And you can delete the other processor that's just dangling. Now go into your UpdateAttribute configuration. Okay.
Perfect. On that, you want to add a new value, so click that plus sign on the top right of your processor configuration. Okay. The property name is schema.name; say okay, and for the value, put "inventory". Perfect. Say okay. Your cache value lookup should be 100, Store State should be "Do not store state", and the other two fields are not required. Perfect. Say Apply. Awesome.

So what we're doing here, and this is a great example: we're bringing that CSV file in, and we're not necessarily reading the CSV contents. We're just assigning, as an attribute, that this CSV file gets the schema name "inventory". When we bring that file in, you'll be able to look at the attributes and see that you've now assigned a schema name to the file.

Now that you have the UpdateAttribute, we need a ConvertRecord processor, because we're going to convert this CSV to JSON. Just type "convert record". Perfect. There you go, drag that to there: success. For an UpdateAttribute and a GetFile, some of these only have success as the next relationship, right? Because it's just going to apply that attribute to whatever file comes in, so there really is no failure.

Okay. Now that you've got your ConvertRecord: we are reading in a CSV file and going to put out a JSON file. So on your Record Reader, instead of "No value", click that and click the dropdown. There you go: Reference, Create New Service. And you want to use an AvroReader... yeah, perfect. Great. On the record reader, select your dropdown. Again, no, cancel: on that reader, not the writer. There you go. Reader. Select your dropdown.
No, why is that not coming up? Create New Service. Oh, in your dropdown, instead of an AvroReader, you want a CSVReader. Perfect. And say Create. Perfect.

So we are reading in a CSV; for the Record Writer, we are outputting JSON, right? So go ahead and click the dropdown and Create New Service. Perfect. And we are doing the JsonRecordSetWriter. Say Create. You got it. And on Include Zero Record FlowFiles, just say false. Okay, and say okay.

What does "include zero record flowfiles" mean?

Okay, so that's for when there's no data in the flowfile. Hover over that question mark, right there: when converting an incoming flowfile, the conversion may result in no data. So if there's zero data, it knows how to handle that. If a file comes in and the result is no data, where do you want to send it? If it's a zero-byte flowfile, for instance, you probably don't want it to continue down the success path; you probably want to send it to a different relationship, since there's some reason it didn't get converted. But yeah, leave it as false; we're not getting too advanced today.

Okay. Say Apply. Awesome. Now go back into your ConvertRecord, and on your CSVReader there is a little arrow on the right side of it. Yep, click that. What this does is take you to your controller services. Once you're there, we want to look at our CSVReader, so go ahead and click the gear icon.
That is how we control the properties of a controller service. Go to Properties. Okay. For the Schema Access Strategy, we want to use the 'Schema Name' Property. So click that dropdown, and instead of inferring the schema, use the 'Schema Name' Property. Remember, we set that schema name in the UpdateAttribute. Okay. Perfect. Say okay.

Schema Registry. All right, on the Schema Registry dropdown, hit Create New Service, and we want an AvroSchemaRegistry. Click Create. Awesome.

Okay, the Schema Name property: we already set it, but you want to double-check here that the schema name matches the attribute you applied in the UpdateAttribute. schema.name is what we put in the UpdateAttribute, I believe; that's the same property we used. Yeah. Okay. Awesome.

So just say okay there and scroll down; let's see if there are any other settings. We're going to use the Apache Commons CSV format. There's no real custom format: the value separator is a comma and \n is the record separator. So I think we're good there. Say Apply.

Okay, so it's still in an invalid state. We specified an AvroSchemaRegistry, so go into the gear icon again; you see where it says AvroSchemaRegistry and it has an arrow. Okay, perfect. That's the schema registry it's using. Click the gear icon, and set Validate Field Names to true. And here is where we are going to apply our schema for when it reads the CSV file. Instead of having you write a brand-new schema, let me paste it in. Are you able to get to the Etherpad? Actually, here, I'll put it in the Teams chat as well to help everyone out, so you don't have to write your own. Oh, perfect. Perfect.
So I'll put it in chat, and you want to copy that. What we're doing is creating a schema so the reader can bring that CSV in; the controller service looks at the schema, and that's what it's going to apply to that CSV file to write it out as JSON. So in your controller service, you want to hit the plus to add a property. No worries; this is a very difficult hands-on, so I completely understand all the going back and forth and copying. Correct, that's what the schema looks like for writing it to JSON.

Pedro, I didn't catch your background. Real quickly, what is it you do?

Yeah, research management. We're dealing a little bit with ETL.

Oh, then this is right up your alley, right? No, I don't have everyone's background, because we have some sysadmins, we have some developers. But yeah. Did you get that little JSON block I sent you?

I am still in Teams on the government computer, and I can paste it in.

Oh, if you could, that'd be awesome. Yeah, I don't mind pasting it at all. So if you can, though, I want you to go back to your controller services, hit the plus, and the property name is "inventory"; say okay. And I'm going to paste it in. It won't let me, so let me just put the number one, and I will come back in and erase it. Okay, I'm going to come into your instance where I can modify it.

Does the property name have to match the schema name from the other step?

It does. It does. Good question.

Okay, that didn't work. Perfect, there we go. I'm going to exit back out of yours, Pedro, and go back to view-only, because I don't want to mess with what you've got going here.
Okay, so the property name "inventory" needs to match. We've got it. And if you look at the value, Pedro, if you click on that inventory property, you can see we have put our schema into this controller service. So when it reads that CSV, it's going to pull all that in and create a JSON document by applying this schema to the CSV data. So go ahead and say okay, and say Apply.

I've got a question. Do we have to manually create that schema when we need one for a file, or is there a tool that will help us with that?

That is a great question. You will have to manually create your schema. Now, the beauty is that NiFi accepts Avro schemas, so if you already have an Avro schema, you're ahead of the curve. But yes, you will have to create a schema when you're doing things like this, reading in one format and converting to another. There are capabilities out there for NiFi to auto-infer, but NiFi would not be able to understand how you want that data written back out. We have the CSVReader, and we're using the Apache Commons library to parse the CSV, but without that schema we wouldn't know how to map the columns to the JSON fields we need. So yeah, unfortunately, you're going to have to do that.

Okay. And then, if you can, let's go to that AvroReader on the first row, and go to the gear. Perfect. It should be automatically set up to use the embedded Avro schema with a cache size of 1000. There are capabilities here to use external Avro schemas; there are all kinds of capabilities within NiFi, depending on your setup.
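The actual schema pasted into chat isn't captured in the recording, but an Avro record schema along these lines would fit this flow. The field names here are hypothetical; use whatever columns your inventory.csv actually has, and note the record's "name" is "inventory" to line up with the registry property name:

```json
{
  "type": "record",
  "name": "inventory",
  "fields": [
    { "name": "item",     "type": "string" },
    { "name": "quantity", "type": "int" },
    { "name": "price",    "type": "double" }
  ]
}
```

This whole JSON block is what goes in as the value of the "inventory" property on the AvroSchemaRegistry.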
But here, we just want to use the embedded Avro schema. So go ahead and say okay.

Okay, now that we have that, we're back, and I do see an invalid state on your CSVReader. Hover over the little yield sign: it's disabled, right? The only issue I'm seeing so far is that the AvroReader and the AvroSchemaRegistry are disabled. So let's start with the AvroReader and enable that service. Use the... nope, cancel. Use the little lightning bolt. There you go: Enable. And you can say service only, or use the dropdown box: there is "Service and referencing components", right there. Yep. This option gives you that capability. If you just want to enable the service, that's one thing, but you can also enable the service plus any processors and other services referencing it. So go ahead and say Enable at the bottom right. It's enabling this controller service, enabling references. Boom. Say Close.

All right, now we want to go to our AvroSchemaRegistry and enable that one. Same thing. You can see how it's referenced by the CSVReader. Say Enable. Say okay. What that message is saying is that it can't enable the actual processor yet, but it did enable the referencing controller service and your AvroSchemaRegistry. Go ahead and say Close. We'll take a look at why the processor isn't enabling in a second; it's probably because you need to enable your JsonRecordSetWriter. Same thing: there you go. Say okay. Say Close. Awesome.

So now we've gone through and created our controller services: the AvroReader, the AvroSchemaRegistry, the CSVReader, and the JsonRecordSetWriter. They're all enabled.
The state is enabled. So go ahead and hit X, and let's go into our processor again and look at the yield sign on the ConvertRecord. Oh, we need to finish our flow, right? We can't turn this on because the ConvertRecord has nowhere to go.

So: we have picked the file up and converted it to JSON. We now want to update the filename attribute so we can write it back to disk, because right now it's creating a whole new flowfile, a whole new document. So go ahead and bring down an UpdateAttribute again. Perfect. On the UpdateAttribute, let's configure it; we're going to give it a filename. Hit plus so we can add a new property, and the property name is "filename". Say okay. We're using the NiFi Expression Language here, so you want to type a dollar sign and an open curly bracket; there you go, it automatically puts the close in there for you. Then you put "filename" inside, and after the closing curly bracket put ".json". There you go. Say Apply. Okay.

So for this one, we do have the capability for failure, unlike the UpdateAttribute and the GetFile. On success, we want to update that attribute, so drag down from the ConvertRecord to the UpdateAttribute and make that relationship success. Awesome. The ConvertRecord also needs to know what happens on failure. What I like to do, and it's up to you all how you handle these things, is log everything. So on a failure, let's log that error message. Bring down a new processor. Oh, I like how you were going to put the failure back on itself.
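The expression typed here, ${filename}.json, appends ".json" to the flowfile's existing filename attribute, so inventory.csv becomes inventory.csv.json. A quick Python sketch of the same substitution (the attribute map is a made-up example; NiFi evaluates the expression per flowfile):

```python
# Simulate NiFi Expression Language evaluation of "${filename}.json"
# against a flowfile's attribute map.
attributes = {"filename": "inventory.csv", "schema.name": "inventory"}

# "${filename}" is replaced by the attribute value, then ".json" is appended.
new_filename = f"{attributes['filename']}.json"
attributes["filename"] = new_filename

print(new_filename)  # inventory.csv.json
```

This is why the written-out file keeps the .csv in its name; the expression appends rather than replaces the extension.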
So you want a LogMessage processor. Perfect. Okay. What I like to do, and you can see it here, is put my log messages out to the right of the flow. If you're working straight down, then I know that's my success path, and I'll put my LogMessage to the right of the flow, or wherever makes sense, because you can reuse this LogMessage processor over and over, and we're going to do that here as well. But for failure, you want to send a log message. So say Add. Awesome.

We now need to configure the LogMessage, right? Go into your LogMessage, go to Relationships, and auto-terminate on success. There you go. Hit Apply. What happens is that message now gets sent into the logs. So if you were tailing a log (tail is a Linux command) and an error came through, you would be able to see that error message come through. You can also pull up the history to see what the error was.

So our ConvertRecord has now gone from yield to stopped. You can leave the LogMessage out there; that's fine. Now we need to continue our flow. We have an UpdateAttribute where we added the .json filename, and now we want to do a PutFile. We have a new JSON document, we have it named, we have our schema set, we have all these things. So yep: from the UpdateAttribute, on success, we want to put the file.

On this one, if you want, you can write the file right back to the CSV directory. I say that because, for mine, I'm only picking up CSVs, so any JSON written back to that directory will not get read back in when the GetFile runs again.
So go ahead and figure out where you want to put the output. And do remember, this is for everyone: keep the source file on your get-CSV GetFile if possible, just because your flow may not be correct the first time it runs through, and you'll want to run it through again. Does that look right? It does. Go ahead and say Apply.

All right, then for the PutFile: let's drag another arrow from your PutFile to your LogMessage. Perfect. And we want that to be failure. I think you have auto-terminate enabled for the PutFile on failure, but in case the hard disk fills up, or this were a network share and it's not available (there are a ton of different reasons why we might not be able to write that file), you want to log that message. So go ahead and say Add. Perfect.

So what we've done is create this flow, and any errors, we are pushing to the LogMessage. If we were to add other aspects to this and there was an error, we can reuse that same processor over and over. I've seen folks set up a whole process group just for handling errors, with advanced filtering and things like that, and then you may have 10 or 15 other data flows utilizing that error flow. It's just something to keep in mind as you're developing and designing your data flows.

Okay, let's see if we can run this. I don't see any errors yet. Pedro, if you can, just run through this one time.

So I start at the top?

Yes. But let's make sure we configure it first and that we're keeping our source file, because NiFi likes to default Keep Source File to false. Okay. You got it.
Sure. Perfect. All right, say Cancel; that was just to double-check. Say Run Once. Refresh. Awesome. Let's take a look at your queue to make sure that's the file you're expecting. Here we are testing out our flow and making sure it looks good. So go ahead and say List Queue. Do you remember how to view it? No, no: go to the flowfile and view the content. Yesterday when I tried to view content, I couldn't, because it was a zip file and there's no viewer built into NiFi for zip files. Luckily, this is a CSV; NiFi understands CSV and has a viewer built in. The file name is inventory.csv, and there are 11 lines: a header and 10 data rows, including a bad data row just to throw us off, too. Go ahead and exit out of that view; just close that tab. Awesome.

If you want, go ahead and look at the attributes: go all the way to your left to the little "i" icon. There you go: Attributes. If you remember from yesterday, this is how we were picking up data and looking at the attributes. If you scroll down, do you see anything in there about schemas? No. Perfect, because we have not sent it through that UpdateAttribute yet. So go ahead and say okay, and exit out of that. Perfect.

Sorry, guys, my network dropped; I was out for like 10 minutes.

Oh, wow. I'm using Pedro as an example to walk through this flow, but we're still building it, so if you have any questions, just let me know, Brett. Okay. Thank you.

Okay, so we have now got the file. Let's do the UpdateAttribute; just click Run Once. There you go, and it should show up in success. Awesome.
So let's now take a look at that queue, look at the attributes, and see what has changed. Scroll down. Ah, perfect. You remember in the UpdateAttribute we added a new property, schema.name, with the value "inventory", right? So now it should be coming together: we created that controller service and told it to use the schema.name property, because you may have hundreds of different schemas. The name here is "inventory", which matches the name we gave the schema in our controller service. All right, this looks good so far. Say okay and exit out of that. Perfect.

Now, here's the heavy lifting: the ConvertRecord. Let's Run Once and see if it actually works. Oh, we have a failure. What is our failure? "Cannot parse incoming data: error while getting next record for string 'quantity'".

Oh, is that from the bad row that we have in there?

Well, it should parse each row, create a JSON record out of each row, and ignore that one. Hang on, let me take a look at this real quick, because mine's up and running.

Let's go into your ConvertRecord and look at the Properties. You have CSVReader, you have JsonRecordSetWriter, you have false. Click on your CSVReader; no, hit Cancel, click on the arrow to take you to the service. Okay, that looks good. Let's view the configuration to make sure we have everything configured properly. There you go: we should have Use 'Schema Name' Property, the Schema Registry is the AvroSchemaRegistry, and we do have schema.name. Okay, that looks good. Let's look at our schema registry; click on the arrow for that. Scroll back up; the little arrow there.
Perfect. Okay, so let's look at the schema registry. We have "inventory", and I put the schema in there; click on that. Let me make sure it looks good. Okay, that looks good. Say okay, and okay again.

The AvroReader: let's click the gear icon for that one. Use Embedded Avro Schema: you have that correct, and you have a thousand for the cache size. Okay, now say okay. Then the JsonRecordSetWriter: let's see what that configuration is. One second, I'm pulling mine up to make sure it matches. Oh, we did not configure this one. On the Schema Write Strategy, we want to change that to Set 'schema.name' Attribute. It's not letting you, right? So close that. You see up top it says Disable and Configure; what that does is disable the service and the processors associated with it.

All right. So for the Schema Write Strategy, we want Set 'schema.name' Attribute. I thought we did this. Say okay. Awesome. Then the Schema Access Strategy: we want Use 'Schema Name' Property again. Awesome, say okay. The schema name is already applied in the schema registry, so we want to click Schema Registry, and I bet you know this one: it's the AvroSchemaRegistry we're using. Perfect, say okay. And then Pretty Print JSON: let's just make it pretty, so set that to true. The rest should be Never Suppress and None. Okay, say Apply. And now we want to enable it, enable everything. Awesome.

That is actually a very quick test: if you have your processors completely configured, enabling the service along with its referencing components is a really good check that everything is configured appropriately, because it will enable the processor too.
--> If it rejects enabling the processor, --> it's usually because you're missing some of the terminations --> or you've got a misconfiguration. --> So say close. --> And then exit out of that. --> All right. --> Actually, let's stop the convert record --> because it started it when you enabled that service. --> And let's see if we can run this one more time. --> So run once at the very beginning. --> Yes, sir. --> And I know I'm working with Pedro here, --> but I want to pause while he does the run once. --> Does anyone have any questions? --> Because this is a very difficult flow. --> And we've been kind of following what you guys are doing up there. --> Oh, and that's why we're doing it. --> Perfect. --> And hopefully we didn't lose Brett. --> Yeah, I'm here. --> Awesome. --> Awesome. --> Awesome. --> No worries. --> And like I said, this is one of the more advanced data flows --> that we want to do. --> And so this starts getting you into controller services --> and those types of things. --> So we have a lot of time allocated for this. --> OK. --> Possibly. --> I'll take a look at yours in just a second. --> Pedro, you want to continue doing a run once --> and let's see if we can get that to success. --> And then I will pull up Brett. --> OK. --> Let's see if our convert record will actually work this time. --> Run once. --> It did not. --> What is our error? --> Failed to process. --> Failed to process. --> Why is it failing? --> One second. --> I'm looking at mine because we are working off of the exact same data. --> Yeah. --> Bad data row. --> Same header. --> Sure. --> So one second, Pedro. --> I'm looking at mine, which worked. --> Oh, oh, I bet I know what we did not do. --> Can we go into the convert record? --> And then the CSV reader, you want to go to that service. --> Click the gear icon. --> No, no, you cancel that. --> There you go. --> Use the arrow to go to the CSV reader --> and use the little gear icon. --> Scroll down.
--> Allow duplicate header names. --> Do you think true? --> Oh, treat first line as header. --> So that's true. --> But you remember we have to disable and configure. --> Awesome. --> OK. --> There we go. --> Apply. --> It couldn't parse because it doesn't understand --> what that header was. --> So go ahead and enable it. --> And hopefully convert record will turn. Perfect. --> All right. --> Let's do a run once all the way through --> and see if it works this time. --> Did it refresh? --> Ah, success. --> Great job. --> OK. --> So if you can, let's go ahead and go through --> and clean this process flow up. --> Let's get the naming in. --> Let's get some labels, those types of things. --> Here is an example of mine. --> Here's an example of what my flow looks like. --> So if you can, go ahead and update yours. --> I need to delete this because I was using it as an example. --> And if you have any other questions, Pedro, --> just let me know. --> Brett, let's go back to you. --> Did you test yours out, Brett? --> Yeah, it doesn't look like it's. --> This first one doesn't have any in or out. --> OK. --> So I'm guessing it's just not picking up the file. --> Um, perfect. --> Let's show configuration. --> If you don't mind, let's stop all your processors --> for right now, just because we need to stop them --> and configure them anyway. --> So beautiful. --> Beautiful. --> And a great way to operate on the canvas. --> So, if you can, go to that input directory --> and go to that value. --> No, go back to your NiFi canvas and, under that value, --> copy that location. --> So you just highlight it and say copy. --> I just use control C. --> OK, that works. --> All right, then hit cancel. --> But don't hit. --> Yeah, there you go. --> And then you go back to your file browser. --> There you go. --> And then in your address bar, paste that. --> I didn't get it. --> Oh, and do it again. --> Control A, control C, then we'll control V.
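--> [Instructor's note: the "treat first line as header" fix above is worth seeing outside NiFi. The sketch below mimics what a record reader does with that setting: if the header row is not skipped, the reader tries to parse the literal word "quantity" as a number, which is the same class of failure as NiFi's "error while getting next record for string quantity". The column names here are illustrative, not the actual course file's schema.]

```python
import csv
import io

# Sample CSV whose first line is a header, like the inventory file in the exercise.
data = "item,quantity\nwidget,42\ngadget,7\n"

def parse_rows(text, treat_first_line_as_header):
    """Mimic a record reader: turn each CSV row into a typed record."""
    rows = list(csv.reader(io.StringIO(text)))
    if treat_first_line_as_header:
        rows = rows[1:]  # skip the header row before typing values
    # Coerce quantity to int, the way a schema-aware reader would.
    return [{"item": r[0], "quantity": int(r[1])} for r in rows]

# With the header skipped, every row parses cleanly.
print(parse_rows(data, treat_first_line_as_header=True))

# Without skipping it, the header row itself is treated as data and
# int("quantity") raises ValueError.
try:
    parse_rows(data, treat_first_line_as_header=False)
except ValueError as e:
    print("parse failed:", e)
```

--> That is why the processor could run fine on some flows and fail on this one: the failure only shows up once a non-string field meets the header row.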
--> I'll hit control C 10 times, because it doesn't get it --> the first time. --> Yeah, the desktop environment sometimes will. --> There you go. --> Hit enter. --> Yeah, that was it. --> Here, I'll go up a directory so it's obvious. --> Yeah. --> Oh, it's there. --> The CSV file's there. --> OK, say cancel. --> Oh, your file filter. --> Let's just stick with inventory.csv for now. --> The regex pattern may be off. --> So you can just change that. --> Inventory.csv. --> I'm assuming that's all set. --> Should I run it now? --> Run once and hit refresh. --> Just right click on your canvas and hit refresh. --> I witnessed you set that path correctly. --> Um, right click on that get inventory file. --> And oh, it went. --> It went. --> I see success. --> OK, perfect. --> And then if you want, just do a run once. --> And let's see if it goes all the way through. --> Can I do a run once for the whole thing or do I have to do each? --> You have to do each individual one. --> Wouldn't it be cool if you could just hit run once on the whole process group? --> Yeah, that would be nice. --> I'll submit it to them and see if they can get that into the next version. --> Like I said, I know all those guys. --> So, like, I submit feedback all the time. --> All right. --> So here's where we usually see our highest failures, on this convert record. --> So go ahead and run that once. --> If it fails, what I would recommend here is logging your failures. --> Just because, you know, you see that red box in the top right? --> So what it did is it auto terminated the failure, and that's it. --> So it won't go into any logs or anything. --> So what I like to do is do a log message. --> You got it. --> Oh, so this is what I was saying, where I like to have a log message --> just on my data flow, and any chance I have where a failure can terminate, --> I always drag my failures to the log message just so I can also see the file.
--> You know, you may have a file that comes in and is not recognized and it goes to failure. --> Then you can actually view the queue and look and see why. --> Like maybe it picked up a file that was inventory dot CSV that had bad values, right? --> If you do an auto terminate, it takes away some of your options. --> That's why I always recommend a log message. --> Okay, so you can go ahead and terminate that log message. --> You want to go ahead and configure it and then, under relationships, --> auto-terminate. --> Perfect. --> And say apply. --> Awesome. --> And then your log message is fine. --> You can just have it set out to the side. --> You should be good there. --> You don't want to drag it to anything. --> You can drag it to itself. --> If you have a failure, you can drag it to itself and it will reprocess that data. --> We see that when you have a processor that could take a little time sometimes. --> If you have something that is taking a lot of resources, --> like unzipping a large file, but you know that you're getting a bunch of small files behind it, --> every once in a while we'll see a large one. --> You can put a failure back onto yourself so you can just reprocess it. --> So it'll take that flow file, put it back into the queue, and try to reprocess it. --> But let's go to our convert CSV to JSON convert record. --> And let's configure that. --> Awesome. --> Go to properties. --> Okay, so we are reading in a CSV reader. --> So let's go to that controller service. --> You want to hit that arrow on the right. --> Awesome. --> And let's hit the gear and let's see what your configuration looks like. --> Go to properties. --> Scroll all the way down to treat. --> Okay, you do have the treat first line as header. --> Let's go all the way up to the top and let's see what that looks like. --> Infer Schema, I don't think that's correct. --> So we want to, yep, use schema name property. --> So you remember we set the schema name in that attribute.
--> And so we're telling, now that we've set that attribute, --> we're telling NiFi to use the schema name --> and we're going to specify the name of what schema to use. --> So you got that. --> And then schema registry. --> We want to set, you know, there's no value there. --> We want to set that as the Avro schema registry. --> Awesome. --> Say okay. --> And then click apply. --> Awesome. --> And then let's go to your Avro schema registry we just set up. --> And click the gear icon. --> And you've got the schema and you've got that. --> That's beautiful. --> All right. --> Say okay. --> I think maybe the only issue was that you just didn't reference the schema. --> So let's look at your Avro reader. --> It should be just use embedded Avro schema. --> Yep. --> Perfect. --> Perfect. --> Okay. --> And real quickly, let's look at JSON record set writer. --> So you should have set schema.name attribute. --> No value. --> Use schema name property. --> And what is the schema name property? --> It's, you know, we set that in that update attribute. --> So there you go. --> And you've got the Avro schema registry. --> Say okay. --> I think yours is going to work now. --> So just try to do a run once. --> And, you know, let's see how far we get. --> And while he's working on that, let's see. --> How's everyone else doing? --> Cody is cleaning his up. --> Tyler is cleaning his up. --> Looking good. --> So you were able to get the file. --> Let's run once on the update attribute. --> Oh, okay. --> Did it. --> Okay. --> Run once. --> Hit refresh. --> Right click and hit refresh on your canvas. --> There you go. --> Success. --> So then you're updating the attribute. --> And you are then putting the file back to disk. --> So if you can, let's go ahead and clean this up. --> You know, do your names and labels and those types of things. --> Make sure the file is being written out.
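--> [Instructor's note: end to end, the ConvertRecord step we just configured is "CSV in, one JSON object per row out." The sketch below is a stand-in for that step using only the standard library, so you can see the shape of the transformation outside NiFi. Column names are illustrative. One difference to note: unlike NiFi's record readers, no schema is applied here, so csv.DictReader leaves every value as a string; in NiFi, the Avro schema from the registry would coerce quantity to an int.]

```python
import csv
import io
import json

def csv_to_json(csv_text):
    """Read CSV (first line treated as header), emit pretty-printed JSON.

    Roughly what ConvertRecord does with a CSVReader and a
    JsonRecordSetWriter configured for pretty printing and array output.
    """
    records = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(records, indent=2)

csv_text = "item,quantity\nwidget,42\ngadget,7\n"
print(csv_to_json(csv_text))
```

--> The output is an array of JSON objects, one per CSV row, which matches the "array" output grouping we set on the JSON record set writer.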
--> I would add another failure on put file just because you want to make sure you log --> that message. --> If it does have an issue putting the file, you know, as I mentioned earlier, --> when you're putting a file, that could be many things. --> And, you know, the disk could fill up or something like that. --> If you have some sort of logging mechanism set up where you're --> pushing all this to Prometheus and others, you know, Grafana, --> you can take a look at this. --> Okay. --> Brett is squared away. --> So it got output as a CSV. --> Oh, I thought we changed that. --> Yes, that might be another one you missed. --> So do you have, yeah, yeah, yeah. --> Do you have an update attribute after your convert record? --> Yeah, right here. --> All right, let me. --> But this is, this is, I think this is when it cut out. --> Okay. --> But no worries. --> We got you. --> We got you. --> I'm pulling yours back up. --> Okay, so we have the update attribute. --> When it's coming out, you added a new attribute called filename. --> And then the value should be filename, like you have, dot JSON. --> Perfect. --> Okay. --> And say apply. --> Okay. --> And apply. --> So is there a different attribute for, like, file extension? --> Because I didn't set that and it automatically just gave it CSV. --> Yeah, because it's using the filename attribute that was originally with it. --> And so it probably wrote that because it needs a file name to write. --> And so it wrote it as a CSV. --> If you look at it, you changed it. --> If you want, run once, but turn off that update attribute where we put on the file name. --> That'll show you where it happens. --> Go ahead and just stop that update attribute. --> And if you can, just run it one more time through. --> Oh, you have an error on your file. --> Yeah, that's, it's the same file. --> So go ahead and, you've got that stopped. --> Empty that queue. --> Yes, sir. --> And let's run it once all the way up to the update attribute.
--> So let's let it convert. --> And then after it converts, let's look at the attributes. --> Okay. --> And let it convert. --> It already went to set name. --> Oh, it has. --> Well, your other file that was in the queue has it, but it's okay. --> So you should have two in the queue after set name. --> Okay, success. --> So if you can, right before, yep, there. --> Let's look at the attributes of that file. --> Perfect. --> Do you remember where it's at? --> So you see the file name? --> Oh, oops. --> No, you're fine. --> So you see the file name? --> So it's going to write that out until we update that attribute. --> And so what we want to do is update it to be inventory dot JSON. --> So it's going to use that attribute to write the data back out unless you update it. --> And so if you say okay, and then exit out of that. --> And let's look at your update attribute properties. --> And you see we kept the file name and we added dot JSON, right? --> Okay. --> And then let that run real quickly. --> That's going to be inventory dot CSV dot JSON. --> Correct. --> Yeah, we didn't strip the .csv. --> Yeah, let me make sure. --> So, you should now have file name inventory.csv.json. --> Now, we could do regex, you know, you can update that attribute with some regex and --> change the file name altogether. --> So if you want, you can actually go back into your update attribute and, you know, --> apply some regex, if you wanted to, to change that file name to whatever you want. --> You know, just as an FYI, right? --> So right now it's using the file name and just appending to it. --> For the sake of this demo, could I just hard code it to inventory.json? --> You could. --> Or did you have the regex already that I can use? --> I do not have it handy, but yeah, just send it to inventory.json, --> see if that works for you. --> Oh, you already have a put file in your queue.
--> So if you want to go ahead and run once, before you, unless you just did that, --> I don't know if you did. --> Oh, I did just do it, yeah. --> I know where it is. --> What's our error? --> Because it's going to complain that the file name already exists. --> But that actually is a good question. --> Let's see. --> What I usually like to do is I will actually update the file name and give it like a UUID --> and a new file extension. --> But I can actually, let me see here. --> You can do like a set file name as, like, file name equals dollar, curly bracket, --> new file name, right? --> And while you work on that, you can set a file name in the regex. --> So you can actually, yeah, so you can put in your update attribute, hang on, --> let me double check my notes here. --> Because if you do file name, it's going to pull in that .csv. --> We need to get rid of that csv, or you could do. --> I'm looking at the regex patterns on the NiFi website. --> I mean, if it's like typical regex, it would be something like beginning of line. --> Yeah, I mean, something, and then any amount of characters, and then like --> a dot literal. --> Well, you can also do something like a date. --> So you can do, like, yes, you know, you can do like now and then parentheses --> and then close your curly brackets, dot json, and that would assign, --> if you do like now(), there you go. --> And then what that's going to do is just give you a date, right? --> Timestamp. --> Yeah, a timestamp dot json. --> That's a good question. --> Run that once and see if it works. --> It looked okay, but we'll see. --> How do I get rid of these errors here? --> So the little red box in the top right, it goes away in five minutes. --> Which is annoying as well where, you know, you could be testing and running --> through these things quickly and, you know, the error still shows. --> But I think you're off to a good start. --> I'll look at some regex patterns.
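--> [Instructor's note: to collect the filename discussion in one place, here are a few NiFi Expression Language values you could set on the `filename` property in UpdateAttribute. These use standard Expression Language functions (substringBeforeLast, replaceAll, now, format); the exact expressions are a sketch, not something we verified in class.]

```
# keep the original name and append .json   ->  inventory.csv.json
${filename}.json

# strip the old extension first             ->  inventory.json
${filename:substringBeforeLast('.')}.json

# or swap the extension with a regex        ->  inventory.json
${filename:replaceAll('\.csv$', '.json')}

# timestamp-based name, avoids collisions   ->  e.g. 20240508120000.json
${now():format('yyyyMMddHHmmss')}.json
```

--> Note the difference between a plain literal and `${...}`: only text inside `${...}` is evaluated as Expression Language, which is why appending a bare ".json" works but referencing the filename needs `${filename}`.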
--> One more quick question. --> Can I clear out all the queues at once? --> Go back. --> See if you can go back, use your breadcrumb and go back to the NiFi Flow main. --> And then right click on your, perfect, and say stop. Right click again. --> Empty all queues. --> Okay, cool. --> There you go. --> Perfect. --> Okay. --> So, I know we're still working on this, but are there any additional questions? --> I got a quick question. --> Yes, sir. --> So say it does error out? --> What happens to the log? --> Does it become a file or does it just log it somewhere? --> It does. --> And that's the reason that I recommend you use the log message. --> And so what that is going to do is push that to the NiFi log. --> For instance, you pulled that log up to get your username and password. --> And so it will log that message to the NiFi logs. --> Yes, sir. --> So if you go into the log directory and you go to the nifi-app log, it will log it there, to the nifi-app.log. --> No, that's a great question. --> And again, some of the design patterns, you know, you may not want to put, you know, a log. --> I log every failure. --> I even use the log message for success sometimes because I want to see some certain aspects of that flow file. --> But no, that's a great question. --> Any other questions? --> And Brett, if you can, you know. --> That now() didn't work, by the way. --> I'm just going to hard code it to inventory for now. --> Perfect. --> Perfect. --> Go ahead. --> So I hard coded my update attribute to inventory.json, but it seems like it's still wanting to output as a CSV. --> It's outputting the name as inventory.csv, or the contents as a CSV. --> It's still a CSV file. --> Okay. --> So let me take a look. --> I'm sorry, I didn't have it pulled up. --> Who was speaking? --> Who was that? --> Cody. --> Oh, my, Cody. --> I should know your name by now. --> All right. --> Let's say cancel. --> All right. --> So you are, am I driving this?
--> Let me make sure I have this on the interactive. --> Oh, I do have it on the interactive. --> So hang on one second. --> There you go. --> Perfect. --> All right. --> So you have your convert to JSON. --> What step is failing? --> Oh, you're getting failures like crazy. --> That was failing because it's the same file name. --> Oh, oh, oh. --> No worries. --> Okay. --> And then update file name, success, put, and then failure. --> Where? --> And so the file that's being written, can we look at it? --> Yeah, it's this guy right here. --> Can you open it up? --> Let's make sure it's a JSON document. --> Awesome. --> It is. --> Go ahead and close that out. --> Go ahead and close that. --> And then your update file name. --> Let's look at that. --> Stop it. --> And let's look at that property. --> Yeah. --> So you're just putting in the updated name. --> But we actually need to put it in the correct format for NiFi to read it. --> So what you want to do is, yeah, change that. --> And you want to use dollar. --> Because dollar, curly bracket, filename is Expression Language. --> And so NiFi understands that. --> So filename, close your curly bracket, .json. --> And say okay. --> All right. --> I had that originally. --> I was still getting a CSV. --> It should be inventory.csv.json. --> I'm going to come up with a regex pattern to change that. --> Just because I think Brett also has the same question. --> But can you run that to see? --> And then we will take a quick. --> Instead of what's in yours, place .json. --> Okay. --> There you go. --> Just run once. --> Okay. --> It's still writing a CSV. --> All right.