Visit the Mongo DB for Administrators course recordings page
WEBVTT--> How's Akona doing there? --> I can see he's doing well. --> He's pasting something. --> Yeah, he's now looking at some collections. --> He's creating a collection. --> Why don't you tell him it's none of his business, please? --> How do I quit this IOS test? --> Did you try Control C? --> You see, I knew. --> You knew? --> Yeah, I was testing you. --> There's one you said we mustn't do. --> Which one was it? --> Seven? --> Or is it just a backup? --> Yeah, the backups, you don't need to do it. --> Wait, you mean there is the one that says creation of an SSL certificate on security and monitoring. --> You can leave that one. --> Okay. --> Okay. --> All right, cool. --> And then there's also the part of advanced monitoring. --> So it needs you to do a whole process of Prometheus and Grafana. --> I think that's number six and seven. --> Yes, that's where I was. --> Yeah, so you can leave that on because you need to have a new installation of Prometheus, Grafana and all that stuff. --> You can always do it at your own time because you still have access to these machines even after the course. --> Budget will be finished. --> Budget? --> Not even. --> So you can be able to come understand the loans, right? --> So it won't be part of the course anymore. --> Yeah. --> Okay. --> So are we doing six and seven or? --> Wait, hold on, hold on. --> We are going to continue just now. --> I'm going to do a theory just now. --> Everybody should just confirm that they are done with the security one and then we can continue. --> Okay, I will let us know. --> I think I'm good. --> Except when we think skipping the certain topics, do we skip the? --> What's this? --> The first one before the firewalls, before the certificate. --> The network one. --> You won't see the difference because you need to connect. --> What do you call this? --> Remotely. --> Yes, internally you can be able to do it, but you won't see the effect. --> If that makes sense. 
--> It's already connected to localhost anyway. --> Yeah. --> So you won't see the effect. --> It needs you to be remote. --> Yeah. --> And then I'm good. --> Okay. --> Who else? --> Are you good? --> Yes. --> We need the lady of the house. --> All right, cool. --> So let's talk about some indexing. --> Web console. --> I said you can skip that one. --> You can do it in your own time. --> I'm going to add a bit more information on that one, because I've realized it might need you to do a bit of work, so I'll add some notes on how to do it. --> But anyway, the web console: if you're using the Enterprise version of MongoDB, you get Ops Manager. --> But there are also open-source ones that you can use. --> Key features: real-time monitoring. --> You can view your metrics, your operations, your memory usage and network activity. --> You can set up alerts. --> You can analyze performance, you know, identify slow queries, and you can do backups and restores from the web interface. --> That's the advantage of the web console. --> Now let's talk about indexing. --> Indexing is where you structure your data in such a way that data retrieval is faster. --> In essence, you're taking a bit of the data and moving it closer for easier access. --> You use that to quickly locate documents without scanning the entire collection. --> So instead of having to go through 2,000 students, if you create an index, let's say on name, it won't need to go through the whole collection of students. --> It will just look up the specific value in the name index. --> That's how indexes work. --> And they use what you call B-trees, a type of balanced tree, to store indexes. --> So there is the database, and then there are the indexes going down.
--> And then each index entry contains a value from the indexed field, together with a pointer to the corresponding document. --> So, for example, if you index on first name, the first name from every document is stored in the index. --> So when you search for something using the first name, it's quick to retrieve. --> And when a query is executed, MongoDB uses that index to quickly locate the relevant documents. --> In essence, instead of going through the whole collection, it looks for the first name Kumbulani and then brings back whatever it needs from that Kumbulani document. --> Instead of going through the whole collection of documents looking for a Kumbulani, it goes: first name is indexed, all I need to look for is Kumbulani, Kumbulani is here. That type of setup. --> There are different types of indexes. --> The first one is a single-field index, an index on a single field in a collection. --> So, for example, you create an index on name. --> That's a single-field index. --> And then you've got a compound index, where you index multiple fields together. --> So, for example, on db.students you've got name and then age. --> In essence, if your query uses name and age, retrieval will be very, very quick. --> I'm sure banks use the ID number to retrieve information about you at any point. --> So they would index the ID number, because that's something that is unique. --> If you index that, it will quickly retrieve anything specific to that ID. --> You can also have multikey indexes, for arrays or nested documents. --> Let's say, for example, subjects, where you have a nested document or an array, because subjects has many individual subjects under it.
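The lookup idea described above (index entries that hold a field value plus a pointer to the document) can be sketched in plain Python. This is only an illustration under stated assumptions: a sorted list with binary search stands in for the B-tree, the sample documents and the helper names `collection_scan`, `build_index` and `index_lookup` are hypothetical, and none of this is MongoDB's actual implementation.

```python
from bisect import bisect_left

# Hypothetical sample data standing in for a "students" collection.
students = [
    {"_id": 1, "first_name": "Carol", "age": 22},
    {"_id": 2, "first_name": "Kumbulani", "age": 25},
    {"_id": 3, "first_name": "Alice", "age": 21},
    {"_id": 4, "first_name": "Bob", "age": 23},
]


def collection_scan(docs, field, value):
    """Check every document; return (matches, documents examined)."""
    matches = [d for d in docs if d.get(field) == value]
    return matches, len(docs)


def build_index(docs, field):
    """Sorted (field value, document position) entries.

    A sorted list is a stand-in for the B-tree: both keep keys ordered
    so a value can be found without visiting every entry.
    """
    return sorted((d[field], pos) for pos, d in enumerate(docs))


def index_lookup(docs, index, value):
    """Binary-search the index, then follow the pointers to the documents."""
    i = bisect_left(index, (value, -1))  # first entry whose key is >= value
    matches = []
    while i < len(index) and index[i][0] == value:
        matches.append(docs[index[i][1]])
        i += 1
    return matches, len(matches)


name_index = build_index(students, "first_name")
found, examined = index_lookup(students, name_index, "Kumbulani")
print(found, "documents examined:", examined)
```

With the scan, every document is examined; with the index, only the matching entries are followed back to their documents. In the shell, the corresponding index would be created with something like `db.students.createIndex({ first_name: 1 })`, where the field name is again an assumption.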
--> You can actually index that array or that nested document. --> And then you can also do a text index, which supports text search queries on string content. --> So in essence, you can index name as text. --> So it becomes easy to search first names using text search. --> And then you've got your geospatial index, which is obviously the one used by your Ubers and your Bolts. --> They index locations. --> Location is the main thing that's used by Uber and Bolt, so they index that. --> And then you've got a hashed index, where you index the hash of a field. --> Probably the email or something. --> It's also good for sharding. --> And then you've got a TTL index, where documents are automatically removed after a specified time. --> So, in essence, after a month you can have those documents removed if you want to. --> So you can index them that way. --> And then managing indexes. --> What do you do when managing indexes? --> You can create indexes, which could be a single-field index. --> You use createIndex. --> You can list indexes, to see what is indexed within your collection. --> And then you can drop indexes. --> If you no longer want an index, you can drop it. --> Or you can rebuild indexes, where you do db.students.reIndex(). --> In essence, it rebuilds all the indexes on the collection. --> So, some information about the indexing internals. --> B-tree structure. --> As I said, it uses a B-tree structure to store your indexes. --> In essence, it's something like a tree. --> And when you're speaking about Linux, the tree is where you've got the mother and then the children following below.
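The TTL behaviour mentioned above (documents removed after a specified time) can be sketched as a small simulation. This is a minimal sketch, assuming a hypothetical `sessions` collection with a `createdAt` date field; the function `expire_documents` is invented for illustration and only mimics the effect of the server's background cleanup.

```python
from datetime import datetime, timedelta, timezone


def expire_documents(docs, date_field, expire_after_seconds, now):
    """Return the documents that survive a TTL pass at time `now`.

    A document is dropped once `expire_after_seconds` have passed
    since the value stored in its date field.
    """
    cutoff = now - timedelta(seconds=expire_after_seconds)
    return [d for d in docs if d[date_field] > cutoff]


# Hypothetical sample data.
sessions = [
    {"_id": 1, "createdAt": datetime(2024, 12, 1, tzinfo=timezone.utc)},
    {"_id": 2, "createdAt": datetime(2025, 1, 20, tzinfo=timezone.utc)},
]

ONE_MONTH = 30 * 24 * 60 * 60  # 30 days expressed in seconds
now = datetime(2025, 1, 28, tzinfo=timezone.utc)
remaining = expire_documents(sessions, "createdAt", ONE_MONTH, now)
print([d["_id"] for d in remaining])
```

In the shell, a TTL index is declared by passing `expireAfterSeconds` as an option to `createIndex` on a date field, for example `db.sessions.createIndex({ createdAt: 1 }, { expireAfterSeconds: 2592000 })`; the collection and field names here are assumptions.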
--> And usually it's balanced trees, which allow efficient insertion, deletion and search operations. --> So your database or your collection will be at the top, and then your documents sit under there. --> And each node in the B-tree contains multiple keys and pointers to child nodes. --> And then index storage: indexes are stored in separate data files within the dbPath. --> So indexes create smaller files, where each index contains the indexed field value and a pointer to the corresponding document. --> Say you've created an index on ID number. --> It will store that field, the ID number, and a pointer to the correct document. --> And then you've got index selectivity, which refers to how unique the values in an indexed field are. --> You've got what you call high selectivity, where there's better performance. --> Then there's low selectivity, where you have many duplicate values, which may reduce the effectiveness of the index. --> Where there's high selectivity, the values are unique. --> They are never the same, which makes life easier. --> And then you've got cardinality, which refers to the number of unique values in an indexed field. --> You've got a high-cardinality field, e.g. email. --> There are so many unique values in that field. --> It's very, very good. --> But where you've got a low-cardinality field like gender, it's either male or female, and that's it. --> You probably won't get much benefit out of indexing that, because it's either male or female. --> And if you had to pull anything that's male, you'd get half the database or 80 percent of the database, you know.
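Cardinality and selectivity as described above can be put into numbers with a short sketch. The sample documents and the helper names `cardinality` and `selectivity` are hypothetical; selectivity is taken here as distinct values divided by total values, which is one common way to express it rather than an official MongoDB metric.

```python
def cardinality(docs, field):
    """Number of distinct values stored in `field`."""
    return len({d[field] for d in docs if field in d})


def selectivity(docs, field):
    """Distinct values divided by total values: 1.0 means every value is unique."""
    total = sum(1 for d in docs if field in d)
    return cardinality(docs, field) / total if total else 0.0


# Hypothetical sample data.
people = [
    {"email": "a@example.com", "gender": "female"},
    {"email": "b@example.com", "gender": "male"},
    {"email": "c@example.com", "gender": "female"},
    {"email": "d@example.com", "gender": "male"},
]

for field in ("email", "gender"):
    print(field, cardinality(people, field), selectivity(people, field))
```

On this sample, an equality match on email narrows the search to one document, while a match on gender still leaves half of the collection to read, which is why the low-cardinality field gains little from an index.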
--> But if it's email and it's looking for kumbulani T at Gmail dot com, that's the only one it is going to look for. --> And then indexes also consume additional storage space, because remember, an index creates files. --> What does the size of the index depend on? --> Number one, the size of the indexed field or fields. --> If it's a field that holds a lot of information, then the index will be big. --> And then the number of documents in the collection. --> The more documents that are linked to a specific indexed field, the bigger the index file becomes. --> And then some best practices. --> Index only frequently queried fields. --> As an example, as I said, in a bank the very first thing they ask you for is your ID number, so that they can pull your profile. --> You can index that. If they are going to use maybe the card number, for example, you can index that. --> Then it pulls much, much quicker, because anybody that comes in is queried using probably an ID number, a cell phone number or a card number. --> And use compound indexes wisely. --> Don't just add them. --> Make sure you use them in a very clever way, where you filter on multiple fields. --> You don't want a situation where it ends up confusing you. --> And then avoid over-indexing, because it consumes storage and it can slow down your write operations. --> And then monitor your index usage. --> Use the $indexStats aggregation stage to monitor index usage. --> It's very, very important. --> And then use covered queries. --> A query is covered if it can be satisfied entirely using the index.
--> So if you can use the index, make sure the query uses just the index and doesn't go all over the place. --> If part of the query uses an index and part of it has to go and search in the documents, it wouldn't make sense. --> So if your query can be answered entirely from indexes, make the most of the indexes. --> Single-field index: that's where you've got an index on a single field in a collection. --> That one is very easy to use. --> It speeds up queries that do things like filtering, sorting and aggregating based on that field. --> You can aggregate based on that field: your totals, your sums and all that stuff. --> Each entry in the index contains a value of the indexed field and a pointer to the corresponding document. --> For example, when you have this part... where's my pen now? --> This part, so you're creating it on name. --> That's your index, and when it's one, it's ascending order. --> When it's minus one, it's descending order. --> Use cases: filtering or sorting by a single field. --> When you want to find all students with a specific name, or sort students by age. --> Then there's the compound one, where you've got a combination of multiple fields in a collection that you want to index. --> It helps especially when you filter, sort or aggregate on multiple fields. --> Same thing: it creates a B-tree structure for the combination of fields. --> And the order of the fields in the index matters. --> Queries can use the index if they include a prefix of the indexed fields. --> An example: if you index on name and age, then you can use the index for queries on name, or on name and age. --> Right.
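The prefix rule for compound indexes can be sketched as a small check. This is a simplified model, assuming a compound index on `name` then `age` as in the example; the function `usable_prefix` is hypothetical and ignores everything else the query planner considers.

```python
def usable_prefix(index_fields, query_fields):
    """Return the leading index fields that the query filters on.

    An empty result means the query does not start from the first
    indexed field, so the compound index cannot narrow the search.
    """
    prefix = []
    for field in index_fields:
        if field not in query_fields:
            break
        prefix.append(field)
    return prefix


# Compound index on name, then age (field order matters).
index_fields = ["name", "age"]

for query_fields in ({"name"}, {"name", "age"}, {"age"}):
    prefix = usable_prefix(index_fields, query_fields)
    print(sorted(query_fields), "-> usable prefix:", prefix)
```

A query on `name` alone or on `name` and `age` starts from the first indexed field, so it gets a non-empty prefix; a query on `age` alone gets an empty one. In the shell, such an index would be created with something like `db.students.createIndex({ name: 1, age: 1 })`.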
That's how you can utilize them. --> So you can use name, or you can use name and age, but you can't use just age. --> So there is that precedence in the order of the fields. --> Name, or name and age, but not age alone. --> When you're creating a compound index, it's as simple as adding the two fields. --> And then use cases: when you want to filter or sort by multiple fields. --> For example, if you want to find all students with a specific name and age range, you can use that. --> Geospatial, as I said, is where you're using coordinates, your geospatial data: 2d for flat 2D coordinates, and 2dsphere. And I think there's 3D now. --> And how does it work? It uses a specialized data structure, which is geohashing, to index your geospatial data. Example: Uber and all that stuff. --> For example, you can create an index on location which is 2dsphere. And how can you use it? --> An example is finding all places within a certain distance from a point. --> Best practices: a single-field index is used for filtering or sorting by a single field. --> For example, where you want to index email for user authentication, because you authenticate with your email. --> That's a single field. And then compound is where you want to filter or sort by multiple fields. --> And ensure the order of the fields matches your query patterns. That's very important. --> Geospatial is for location-based queries. --> Use 2dsphere for spherical geometry, for example the Earth's surface, and then avoid over-indexing. --> It's very, very important, or else it will slow down operations and use up your storage. And then monitor. --> Always monitor.
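The "all places within a certain distance from a point" use case can be illustrated with a brute-force sketch. This is not how a 2dsphere index works internally: it checks every document with the haversine formula, which is exactly the work a geospatial index is meant to avoid. The place data, the approximate coordinates and the helper names are assumptions for illustration.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two lon/lat points."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def places_within(places, center, max_km):
    """Names of places whose location is at most `max_km` from `center`."""
    lon0, lat0 = center
    return [
        p["name"]
        for p in places
        if haversine_km(lon0, lat0, *p["location"]["coordinates"]) <= max_km
    ]


# Hypothetical documents with GeoJSON-style points: [longitude, latitude].
places = [
    {"name": "Johannesburg", "location": {"type": "Point", "coordinates": [28.0473, -26.2041]}},
    {"name": "Pretoria", "location": {"type": "Point", "coordinates": [28.1881, -25.7479]}},
    {"name": "Cape Town", "location": {"type": "Point", "coordinates": [18.4241, -33.9249]}},
]

print(places_within(places, (28.0473, -26.2041), 100))
```

In the shell, the index itself would be created with something like `db.places.createIndex({ location: "2dsphere" })`, after which distance queries can be served from the index instead of a full scan; the collection name is an assumption.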
Monitoring is going to be a word that we will hear all the time. --> Now, any questions on indexing? --> That's all good on my side. --> Everyone else? Oh, good. --> Now we can go and do number five, and then exercise day one. --> If we look at number five, it says indexing, so we can look at the indexing there. --> It still uses the same database, which is university, to do some indexing. --> You're going to create some indexes. You're going to create compound indexes, geospatial. --> But I don't think the geospatial one will really be worth it. --> You can try it, but I don't think it will work. --> Then you also look at query optimization, where you find a student by email and optimize using a single-field index. --> Or you then use a compound index. With geospatial, we're not really using any coordinates or anything, --> so it might be really tricky. You can try it, but I don't think it will work. --> It might give you an error. And then there's something called the query profiler that you can use. --> So also have a look at that. --> It also helps in terms of query optimization and all that stuff. --> And then after that, there's exercise day one. --> Exercise day one is more or less what we've spoken about the whole time. --> There's going to be some inserting, some deleting, some updating, and then some operations that you need to do. --> For example, checking stats, creating an index, and a bit of advanced work where you probably need to change one or two configurations and then see how it works. --> But you'll find that most of the stuff that we spoke about or tried to do is there. --> Yes. Are you sharing your screen, or must we just follow on the site? Oh, I wasn't sharing my screen. --> But OK, let me share my screen with you. --> Where is my machine? --> This one. OK, I think I've got it now. --> We knew.
--> Can you see my screen now? --> Yes. OK, cool. --> So I was saying that for indexing, you're more or less going to create an index using the existing database, which is the university one. --> Create your complex queries, your compound indexes. --> And then you do some query optimization, use the query profiler and all that stuff. --> And after that, there's exercise day one. --> There are some things that you have probably done already, like creating the university database. --> And then you need to insert some documents. --> Be mindful of the names. They might be names that already exist; if they exist, change them or use upsert. --> It's up to you. And then querying, finding some courses, some CRUD operations where you need to delete something, --> some intermediate operations, some collecting of stats and then some indexing also. --> And then a bit of advanced stuff, for example, where you're going to do security, storage path, system logs. --> You might not really need to worry about this one, because it's already connecting on localhost. --> But in terms of the rest of the stuff, start Mongo with the configuration file. --> If you want to create a separate configuration file and put this in it, then you can start it that way. --> And yeah, most of the stuff is just what we spoke about. --> Hardware and file system: you might not be able to do that, because obviously you can't add an SSD, but you can enable journaling. --> And then see what is happening with the journaling. --> And then there's the security aspect that we did already, and some security deployment recommendations. --> These are just recommendations. --> And then you do some monitoring, and then you're done for the day. --> So let's do number five. And after that, do exercise day one. --> Sorry, it's super clean.
Cool. --> So we're doing five and exercise day one. --> OK, OK, cool. --> So I lost you there; you were saying something when you mentioned geospatial. --> Yeah, geospatial, because we don't have any data that relates to geospatial, --> like coordinates, at any point. --> So you probably won't see any effect, --> if that makes sense, because we don't have that data, and we don't have anything that needs data like location. --> For example, we don't have an application that uses that data. --> We don't have that within our database. --> I can always create a script that creates that data if you want. --> But you might not really see the effect, besides it just showing you the information itself. --> The speed-up shows when there's an application that needs to access the location. --> But we don't have that application. --> So it would really be too much admin to have that. --> You'd just run commands for nothing. --> Yeah, you. --> It's obvious. --> Comparison starts with errors. --> What is started with errors? --> Unexpected token limit. --> Number five point one point one. --> Well, what are you getting? --> Who is this now? --> Unexpected token. --> I'm also getting the same error. --> Well, I can see the issue before I even tell you: you're not logged into your database. --> Yes, sir. --> Log into your database. --> Mongo. --> Yes. --> And then you can run those. --> You can now copy and paste. --> For anything that has a use or a db-something, you should know that you need to run it within the database. --> You need to authenticate. --> So exit and then authenticate. --> And just press the up arrow until you get to the part where it's supposed to be. --> Yeah, that one. --> Yeah. --> Yeah.
--> And then now you can try and run those. --> Those commands. --> Use investee and yeah. --> Did you see that? --> Are you sorted Winnie? --> No, I'm not sorted. --> OK, I'll come to your screen just now. --> This is going to be interesting to figure out. --> You're not authorized to do what you did, which means the user that you're using is not authorized to do that.