WEBVTT

00:00:00.720 --> 00:00:19.030
All right, lesson nine, scale and upgrade Kubernetes clusters using advanced strategy.

00:00:20.510 --> 00:00:26.110
So this will be kind of an abbreviated session, and then we'll go into testing it out.

00:00:28.870 --> 00:00:33.850
We don't have a way to really test this on a full Kubernetes cluster.

00:00:33.850 --> 00:00:40.910
So I'll explain to you the concepts of how it works, and then we'll.

00:00:41.090 --> 00:00:43.850
We'll demonstrate it on a Minikube.

00:00:43.850 --> 00:00:50.690
Typically, if you do this, you would use something like kubeadm.

00:00:50.690 --> 00:00:54.370
The problem with that is we don't use kubeadm really

00:00:54.370 --> 00:00:58.310
to manage clusters too much if we're using infrastructure as code

00:00:58.310 --> 00:00:59.910
especially.

00:00:59.910 --> 00:01:01.710
So there are two ways to do it.

00:01:01.710 --> 00:01:04.110
And so with infrastructure as code,

00:01:04.110 --> 00:01:05.910
we build the cluster how we need it.

00:01:07.750 --> 00:01:09.750
And so using Terraform or Ansible,

00:01:09.750 --> 00:01:11.070
well, we build it how we need it.

00:01:11.090 --> 00:01:15.910
If we want a new cluster with a new version, then we use IAC.

00:01:16.150 --> 00:01:21.390
The other way is you build a cluster like a cloud server, right?

00:01:22.190 --> 00:01:25.130
And then you manage it with kubeadm, which maintains your state.

00:01:27.150 --> 00:01:31.490
And again, that's an older way of doing it, but there are still practitioners who use it.

00:01:33.270 --> 00:01:34.350
Infrastructure as code?

00:01:35.490 --> 00:01:35.670
Yeah.

00:01:35.710 --> 00:01:40.470
Yeah, so there are several ways to do it, and it just depends on how you're, you know,

00:01:40.470 --> 00:01:42.770
how you're setting up your cluster to begin with.

00:01:42.770 --> 00:01:45.790
So if we're using IAC to set it up,

00:01:45.790 --> 00:01:48.210
we wouldn't use kubeadm.

00:01:48.210 --> 00:01:50.370
But if you're using kubeadm to set it up,

00:01:50.370 --> 00:01:55.410
then you could use kubeadm to upgrade your nodes

00:01:55.410 --> 00:01:58.010
or to scale your nodes up or down.

00:01:58.010 --> 00:02:00.070
So but we'll go through the concepts

00:02:00.070 --> 00:02:02.790
and then I'll demonstrate it with Minikube.

00:02:02.790 --> 00:02:05.170
Oh yeah, yeah.

00:02:05.170 --> 00:02:08.410
So remember when we looked at Cilium,

00:02:08.410 --> 00:02:10.450
when we did cilium status and it showed

00:02:10.470 --> 00:02:17.430
cluster mesh was disabled. Okay, so with Cilium you can set up two clusters and you can join them

00:02:17.430 --> 00:02:23.910
with a cluster mesh and then what you can do is cluster a is running your workloads your

00:02:23.910 --> 00:02:29.270
stateless workloads cluster b is your new cluster with the latest kubernetes version

00:02:30.470 --> 00:02:39.110
and it's connected with a cluster mesh and you can start draining your nodes and um start deploying your

00:02:39.110 --> 00:02:44.070
your stateful workloads onto the second cluster because it joins them together like it's one

00:02:44.070 --> 00:02:49.910
cluster it's a very advanced concept and it's improved through the years i mean when it first

00:02:49.910 --> 00:02:55.430
started um there were government agencies that put you know a half a million dollars into their

00:02:55.430 --> 00:03:04.790
cluster mesh um that was when it was you know beta today you can build a cluster mesh

00:03:04.790 --> 00:03:06.010
really economically.
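
For reference, a minimal sketch of joining two clusters with Cilium cluster mesh using the Cilium CLI; the context names cluster-a and cluster-b are hypothetical, and a real setup also needs unique cluster names/IDs and non-overlapping pod CIDRs configured first:

  # enable cluster mesh on both clusters (context names are examples)
  cilium clustermesh enable --context cluster-a
  cilium clustermesh enable --context cluster-b
  # connect the two clusters
  cilium clustermesh connect --context cluster-a --destination-context cluster-b
  # verify the mesh is up
  cilium clustermesh status --context cluster-a --wait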

00:03:06.570 --> 00:03:10.230
So you can see with a lot of the concepts of Kubernetes

00:03:10.230 --> 00:03:11.870
that started out very complex.

00:03:12.450 --> 00:03:15.270
Only a few engineers knew how to make it work.

00:03:15.610 --> 00:03:17.810
They were the early alpha testers, if you will.

00:03:18.730 --> 00:03:21.250
And then that piece has actually improved considerably,

00:03:21.490 --> 00:03:22.550
especially in the past year.

00:03:22.790 --> 00:03:24.630
Yes, by pointing the DNS.

00:03:25.010 --> 00:03:28.290
So the way that I recommend for my clients is

00:03:28.290 --> 00:03:31.630
just build another cluster the way you want it

00:03:31.630 --> 00:03:32.490
with the latest version.

00:03:32.490 --> 00:03:33.230
Test it.

00:03:33.790 --> 00:03:34.770
And then once you're confident

00:03:34.790 --> 00:03:41.830
that your tests have all passed, then go ahead and use declarative GitOps to bring in your

00:03:41.830 --> 00:03:48.110
stateless workloads. And then once you're satisfied with the state that they're in,

00:03:48.490 --> 00:03:51.970
then you just point the DNS from the old ingress to the new ingress.

00:03:53.150 --> 00:03:55.850
Because they're stateless workloads, so it doesn't matter, right?

00:03:55.910 --> 00:04:00.010
So that's the simplest way. Oh, no, no, that's actually the simplest way.

00:04:04.700 --> 00:04:07.720
So Kubernetes worker nodes can be scaled up or down as needed.

00:04:07.720 --> 00:04:12.380
You can have as many as you want up to 4,000.

00:04:12.880 --> 00:04:13.420
Let's see.

00:04:14.180 --> 00:04:20.740
In a high availability setup, you can add 4,997 workers.

00:04:20.780 --> 00:04:21.440
So there is a limit.

00:04:22.020 --> 00:04:23.800
The limit is 5,000 nodes per cluster.

00:04:25.020 --> 00:04:30.140
So if you have three for your control plane, you can have 4,997 worker nodes.

00:04:30.840 --> 00:04:35.740
Now, that's not going to give you any room to add a node before you delete a node, though.

00:04:35.740 --> 00:04:39.420
So you might not want to run up against that maximum.

00:04:39.420 --> 00:04:43.260
I've never run into anyone who has come close.

00:04:44.220 --> 00:04:47.580
There are government agencies that run 1,000 node clusters,

00:04:47.580 --> 00:04:51.740
but I've never run into anyone who did 5,000.

00:04:53.180 --> 00:04:56.700
Kubernetes control plane nodes must be scaled in odd numbers.

00:04:56.700 --> 00:05:00.060
Control plane nodes run as a one, three, or five node cluster.

00:05:00.060 --> 00:05:05.580
And Minikube provides the ability to add or delete

00:05:05.580 --> 00:05:12.820
nodes. Kind of a neat little feature. They abstracted away a little bit of the more difficult

00:05:12.820 --> 00:05:22.080
aspects of it. And so to scale downward, Kubernetes scaling downward involves cordoning the node.

00:05:22.080 --> 00:05:27.080
Cordoning a node prevents future pods from being scheduled on the node. While it's cordoned,

00:05:27.080 --> 00:05:35.560
the pods that are already scheduled keep running on it, but no new work can be scheduled.

00:05:35.580 --> 00:05:42.920
So this is a similar state that you would encounter if you lose your high availability on your control plane.

00:05:43.360 --> 00:05:49.720
Everything would continue to run, but you wouldn't be able to schedule, you know, new workloads.

00:05:51.220 --> 00:05:54.280
Because essentially all of your nodes would be cordoned.

00:05:55.040 --> 00:05:58.220
So cordoning a node prevents future pods from being scheduled.

00:05:59.020 --> 00:06:02.340
The next step is you drain the node of all the pods.

00:06:03.420 --> 00:06:05.560
So you drain it, which

00:06:05.580 --> 00:06:08.420
tells the scheduler, hey, find another node

00:06:08.420 --> 00:06:10.440
to put these pods on.

00:06:10.440 --> 00:06:13.880
It will remove them and will simultaneously

00:06:13.880 --> 00:06:18.360
attempt to schedule them on another node

00:06:18.360 --> 00:06:20.840
if there is availability and if not,

00:06:20.840 --> 00:06:24.600
then it will just say Pending.

00:06:24.600 --> 00:06:26.800
So you'll be able to look at the pod,

00:06:27.720 --> 00:06:29.360
but there will be no node assigned to it.

00:06:29.360 --> 00:06:32.400
If you use the -o wide option,

00:06:32.400 --> 00:06:34.960
you'll see that the node column is empty.

00:06:35.580 --> 00:06:42.780
and it'll say no nodes available to schedule to and it'll wait until the scheduler finds it a node okay

00:06:44.850 --> 00:06:52.930
so pod disruption budgets may need to be set to avoid interruption so that could cause issues with some of your auto-scaling

00:06:52.930 --> 00:06:59.330
setups so you need to enable pod disruption budgets if you're going to be doing this manually means an extra step

00:06:59.970 --> 00:07:06.850
that needs to be tested on a test server before you put that into deployment if you plan on doing this on a

00:07:06.930 --> 00:07:08.830
a production server.

00:07:08.830 --> 00:07:10.530
I don't recommend it, but

00:07:10.530 --> 00:07:12.510
there are engineers who

00:07:12.510 --> 00:07:14.770
do scale up or down

00:07:14.770 --> 00:07:18.070
in production.
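
As a hedged example, a minimal PodDisruptionBudget of the kind mentioned above, keeping at least two replicas of a hypothetical app label running while nodes are drained:

  kubectl apply -f - <<'EOF'
  apiVersion: policy/v1
  kind: PodDisruptionBudget
  metadata:
    name: web-pdb            # hypothetical name
    namespace: default
  spec:
    minAvailable: 2          # keep at least 2 pods up during voluntary disruptions
    selector:
      matchLabels:
        app: web             # hypothetical label on the workload's pods
  EOF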

00:07:18.330 --> 00:07:20.850
Draining a node forces pods to be

00:07:20.850 --> 00:07:22.250
scheduled on other nodes.

00:07:22.990 --> 00:07:24.930
So the final step after we

00:07:24.930 --> 00:07:26.830
drain the node is to remove

00:07:26.830 --> 00:07:27.310
the node.

00:07:28.630 --> 00:07:30.790
Essentially, deleting a node removes it

00:07:30.790 --> 00:07:32.470
from the Kubernetes cluster right.
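
A minimal sketch of that downscale sequence with kubectl; the node name worker-3 is hypothetical:

  kubectl cordon worker-3                                             # stop new pods landing here
  kubectl drain worker-3 --ignore-daemonsets --delete-emptydir-data   # evict the existing pods
  kubectl get pods -A -o wide                                         # watch pods move, or sit Pending with no node
  kubectl delete node worker-3                                        # remove the node from the cluster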

00:07:33.410 --> 00:07:35.230
Scaling Kubernetes nodes

00:07:35.230 --> 00:07:35.790
upward.

00:07:36.930 --> 00:07:41.930
Kubernetes scaling upward involves creating a node.

00:07:42.530 --> 00:07:47.530
So we would create a VM, put Ubuntu 24 on there,

00:07:47.530 --> 00:07:49.530
install our Kubernetes flavor.

00:07:50.730 --> 00:07:52.390
And now we're ready to,

00:07:52.390 --> 00:07:55.270
the next step would be to join the node to the cluster.

00:07:56.470 --> 00:07:59.150
So in that case, we give it the cert,

00:07:59.150 --> 00:08:01.290
so that it can connect to the cluster

00:08:01.290 --> 00:08:03.810
because it uses a certificate to connect

00:08:03.810 --> 00:08:05.010
along with the token.

00:08:05.010 --> 00:08:14.730
And it tells the API server, hey, I am a server node, or, hey, I am a worker.

00:08:14.730 --> 00:08:19.770
The next step after that is we uncordon the node, if it was cordoned during startup.

00:08:19.770 --> 00:08:23.330
And this notifies the pod scheduler to allow pods to be added to it.

00:08:23.330 --> 00:08:24.330
And we're up and running.
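
If the cluster was built with kubeadm, the join step being described looks roughly like this; the endpoint, token, hash, and node name below are placeholders you would generate yourself:

  # on an existing control plane node: print a join command with a fresh token
  kubeadm token create --print-join-command
  # on the new node (values are placeholders)
  sudo kubeadm join 10.0.0.10:6443 --token <token> \
      --discovery-token-ca-cert-hash sha256:<hash>
  # if the node came up cordoned, allow pods to schedule onto it
  kubectl uncordon <new-node-name>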

00:08:24.330 --> 00:08:27.250
All right, upgrading Kubernetes nodes.

00:08:28.770 --> 00:08:32.330
Upgrading the Kubernetes version requires cordoning the node.

00:08:32.330 --> 00:08:38.330
Next is to drain the node so that all of the pods are off of it.

00:08:38.330 --> 00:08:43.330
We then upgrade the node to the latest version that we're happy with,

00:08:43.330 --> 00:08:49.330
probably not a dot zero or maybe not even a dot one, but a dot two onward.

00:08:49.330 --> 00:08:51.330
We then restart the node.

00:08:51.330 --> 00:08:55.330
Alternatively, you can just restart the systemd (systemctl) services,

00:08:55.330 --> 00:09:01.330
but I would recommend restarting the full node, as that cleans everything out.

00:09:01.330 --> 00:09:09.590
Imagine if you had a server running for two years and all you did was upgrade the version of Kubernetes and never actually restarted the server.

00:09:11.090 --> 00:09:17.850
So generally, this is an existing VM and it's been running for probably close to a year.

00:09:18.590 --> 00:09:22.110
And so you'd want to totally upgrade the server.

00:09:22.350 --> 00:09:26.110
You restart the node and then finally you uncordon the node.

00:09:26.110 --> 00:09:29.790
And you'll run into issues probably 10% of the time when you do this.

00:09:29.790 --> 00:09:32.550
It's just not going to want to come up or join.

00:09:32.550 --> 00:09:35.290
That's why I don't recommend doing this with production clusters.
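For a kubeadm-managed Ubuntu node, the upgrade flow just described looks roughly like this; package names and version pinning vary by install method, so treat it as a sketch rather than a recipe:

  kubectl cordon worker-2
  kubectl drain worker-2 --ignore-daemonsets --delete-emptydir-data
  # on the node itself: upgrade the OS and the Kubernetes packages
  sudo apt-get update && sudo apt-get upgrade -y
  sudo apt-mark unhold kubeadm kubelet kubectl
  sudo apt-get install -y kubeadm kubelet kubectl    # or pin to the tested minor version
  sudo apt-mark hold kubeadm kubelet kubectl
  sudo kubeadm upgrade node                          # workers; 'kubeadm upgrade apply' on the first control plane
  sudo reboot
  # once the node reports Ready again
  kubectl uncordon worker-2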

00:09:38.290 --> 00:09:42.030
Okay, cluster scaling and upgrading using infrastructure as code.

00:09:43.230 --> 00:09:48.230
So, cluster scaling or upgrading involving IaC and a stateless workload.

00:09:48.230 --> 00:09:51.230
We create a new cluster with the latest tested version.

00:09:51.230 --> 00:09:57.590
The workload state can be copied using GitOps, declarative GitOps, onto the new cluster.

00:09:57.590 --> 00:10:05.750
After reaching the desired state, the DNS can be transferred to the new IP. So you could have multiple

00:10:05.750 --> 00:10:11.350
gateways coming in. You could add multiple ingresses. You would transfer your ingress,

00:10:11.350 --> 00:10:19.510
your DNS to the IP addresses of the new cluster. You then monitor stateless workloads

00:10:19.510 --> 00:10:25.750
for any issues, and you can transfer back to the original cluster if issues arise. So the alternative

00:10:25.750 --> 00:10:32.110
is in a cluster mesh then you can simply drain the nodes and if you have it set up

00:10:32.110 --> 00:10:37.570
correctly inside that cluster mesh the pods will automatically transfer over to

00:10:37.570 --> 00:10:41.530
the duplicate. It obviously takes a fair amount of work to set all that up to

00:10:41.530 --> 00:10:47.350
work in an automated fashion okay let's go right into the practical shall we

00:10:47.350 --> 00:10:55.730
Make sure we have a fresh Minikube profile, and this time we're going to create an

00:10:55.750 --> 00:11:01.810
HA Minikube cluster with containerd.

00:11:03.940 --> 00:11:09.160
Up till now, other than yesterday morning, we've been using the Docker runtime,

00:11:09.460 --> 00:11:13.080
and so now we're going to do this with containerd just to mix it up a bit.
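
The cluster being created here is roughly the following, assuming a recent Minikube release that supports the --ha flag for multiple control-plane nodes:

  minikube delete                                     # start from a fresh profile
  minikube start --ha --container-runtime=containerd  # multi-control-plane (HA) cluster
  minikube status                                     # should show three control plane nodes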

00:11:13.320 --> 00:11:13.880
All right.

00:11:14.920 --> 00:11:17.140
Let's make sure we have a happy cluster.

00:11:17.260 --> 00:11:18.560
Is it happy?

00:11:19.760 --> 00:11:24.700
Yeah, the Minikube engineers obviously had too much time on their hands and a sense of humor at some point.

00:11:24.700 --> 00:11:29.700
So, all right, we'll get the status of a cluster.

00:11:29.700 --> 00:11:32.460
Okay, and then go back to that happy.

00:11:32.460 --> 00:11:35.500
In the profile, has anyone noticed anything interesting about the port?

00:11:35.500 --> 00:11:41.940
Typically, a Kubernetes cluster in H.A. production is on one of two ports.

00:11:41.940 --> 00:11:47.380
It's going to be on 443 or 6443.

00:11:47.380 --> 00:11:53.140
But the engineers decided to put Minikube on 8443, which nothing else is going to use.

00:11:53.140 --> 00:12:03.060
My guess is somewhere along the way, someone had a conflict inside a single VM with 6443, or with 443.

00:12:04.100 --> 00:12:08.740
So they moved it to 8443, which would eliminate most conflict.

00:12:08.740 --> 00:12:15.620
Okay. So if you're looking for 6443 or 443, Minikube's going to be 8443.

00:12:15.620 --> 00:12:17.780
Let's have some fun with this cluster.

00:12:19.460 --> 00:12:23.060
Now my slides are those. All right. So we delete.

00:12:23.140 --> 00:12:30.260
a control plane node. You can see that m02 is number two. Yeah, so it knows that it's

00:12:30.260 --> 00:12:36.580
minikube-m02, because it's given 02, because that's the profile. So if we were

00:12:36.580 --> 00:12:41.940
deleting the primary node, we would call it just minikube, but it recognizes it after

00:12:41.940 --> 00:12:46.980
the hyphen. At least it does on mine. Check the status. Well, only two nodes,

00:12:46.980 --> 00:12:51.220
and notice we have minikube and m03.

00:12:52.980 --> 00:12:54.180
Is it a happy cluster?

00:12:56.180 --> 00:12:58.180
And take a look at the pods real quick.

00:12:58.180 --> 00:13:03.060
All right. So notice we just have the pods that were on the two remaining nodes

00:13:03.060 --> 00:13:06.740
and the pods that were on the node we deleted,

00:13:06.740 --> 00:13:09.220
number two, are gone. So nothing there.

00:13:10.260 --> 00:13:12.180
Okay, let's add a control plane node.
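
The Minikube commands used in this part of the demo are roughly the following; node names may differ on your machine, and --control-plane assumes a Minikube version with HA node support:

  minikube node delete m02              # delete the second control plane node
  minikube status                       # now only minikube and m03 remain
  minikube node add --control-plane     # add a new control plane node back in
  kubectl get nodes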

00:13:37.010 --> 00:13:40.530
So now we've, yeah, so now we have three so we're happy again.

00:13:41.010 --> 00:13:49.150
let's take a look at the pods um well it's not going to be able to do a leader election

00:13:50.270 --> 00:13:55.470
it can cause issues with a leader election correct well depending on how the

00:13:55.470 --> 00:14:03.870
Minikube engineers set this up, they may have only accounted for three nodes in an HA.

00:14:03.870 --> 00:14:08.910
we could definitely test that for fun though you want to go ahead and add

00:14:08.910 --> 00:14:12.510
one more and let's test it and two more all right

00:14:12.670 --> 00:14:18.390
It's happy with four nodes, but it won't be able to do a leader election, probably, because it's an even number.

00:14:18.590 --> 00:14:22.070
Well, it's a lottery, so it depends, but go ahead and add a fifth one.

00:14:24.170 --> 00:14:28.170
All right, let's see if it's happy or if anything changes there.

00:14:29.230 --> 00:14:30.610
Oh, happy with five.

00:14:30.690 --> 00:14:31.230
So there you go.

00:14:31.310 --> 00:14:33.810
Now you have a five-node control plane.

00:14:33.810 --> 00:14:36.050
Now, what is the difference between a three and a five?

00:14:36.190 --> 00:14:42.550
We haven't talked about that because that's an advanced concept, and you probably won't run into it anyway, but what is the difference?

00:14:42.670 --> 00:14:51.230
Well, if we have three control planes in an HA and we lose one, then we still have two left,

00:14:51.230 --> 00:14:55.710
but they can't elect a leader because there's two.

00:14:55.710 --> 00:14:59.070
So each one's going to elect themselves on the first round, right?

00:15:00.430 --> 00:15:03.950
And then they might elect the other one on the second round.

00:15:03.950 --> 00:15:05.550
So then they elect each other.

00:15:07.070 --> 00:15:12.430
So you can't schedule anything new, but you can continue to run your existing workloads.

00:15:12.670 --> 00:15:17.670
And if you drop down to one node, you're going to start to lose pods.

00:15:17.670 --> 00:15:26.950
Because it's going to be too much for that node to handle by itself because it went from a high availability to no high availability.

00:15:26.950 --> 00:15:32.950
But if we have five and we lose a node, we still have four.

00:15:32.950 --> 00:15:37.950
So we can still schedule and it will eventually probably elect a leader.

00:15:37.950 --> 00:15:41.950
But there is a very remote chance that it does not work with four.

00:15:41.950 --> 00:15:48.950
And then if we lose another node, we can still schedule pods because we have three.

00:15:48.950 --> 00:15:53.950
We have to lose three nodes before we can no longer schedule pods.

00:15:53.950 --> 00:15:56.950
You have to have three nodes come offline.

00:15:56.950 --> 00:16:03.950
On your control plane, if you're running a five control plane, high availability setup.

00:16:03.950 --> 00:16:08.950
Now that requires a lot of additional work to go from three to five, but it can be done.

00:16:08.950 --> 00:16:20.130
But it does give you extra resiliency, if you think that you might lose one, you know, one of your bare metal instances.

00:16:24.730 --> 00:16:25.330
Correct.

00:16:25.890 --> 00:16:26.570
Yeah, correct.

00:16:26.810 --> 00:16:34.490
So once you lose one, you can't schedule because you no longer have the high availability, but you can continue to run.

00:16:35.030 --> 00:16:38.830
But if you lose a second one, then you may start to drop pods.

00:16:39.630 --> 00:16:39.790
Yeah.

00:16:39.790 --> 00:16:46.050
So that's, yeah, that's the way you run things with your own orchestration.

00:16:47.710 --> 00:16:54.890
So typically if you're running, say, no more than 20 nodes, a three-node control plane is fine.

00:16:56.330 --> 00:17:03.790
But when you start getting above 100 nodes, that's when I start telling clients, you know, consider running five control plane nodes.

00:17:04.630 --> 00:17:09.230
Why, yeah, why they don't is because, well, I have to pay for those control planes.

00:17:09.230 --> 00:17:11.610
So I have to spin up VMs, and that's using resources.

00:17:12.270 --> 00:17:13.170
Why am I doing that?

00:17:13.210 --> 00:17:13.850
What do I gain?

00:17:14.570 --> 00:17:19.290
But you can lose three and still maintain your workloads, right?

00:17:19.470 --> 00:17:21.850
And still schedule new workloads.

00:17:22.010 --> 00:17:23.030
You can lose three of them.

00:17:23.030 --> 00:17:29.550
So you can, yeah, you can lose three and still maintain.

00:17:29.850 --> 00:17:32.170
You can lose two and still schedule.

00:17:32.970 --> 00:17:33.110
Yeah.

00:17:34.250 --> 00:17:34.950
A hundred.

00:17:35.610 --> 00:17:38.090
Yeah, once you get beyond a hundred nodes, yeah.

00:17:38.130 --> 00:17:39.210
So you probably get,

00:17:39.230 --> 00:17:47.630
into complexity when you have, you know, above 100 nodes. So I define clusters as above 100 node

00:17:47.630 --> 00:17:53.870
and below 100 nodes. And the majority of clusters are below 20 nodes total, including control

00:17:53.870 --> 00:18:01.070
plane. So that's your vast majority of clusters, probably 95% are going to be below 20. And then you

00:18:01.070 --> 00:18:07.550
have this area where you start moving up towards 100, still a 3 will work fine. Well, when you get above

00:18:07.550 --> 00:18:12.670
100, you're probably going away above 100, right? I mean, that's kind of the difference in

00:18:12.670 --> 00:18:19.150
Kubernetes. You could have 500, you could have a thousand. And so you need to be running a five-node

00:18:19.150 --> 00:18:28.690
control plane. And it's, yeah, it has to do with the leader election and how all of the services work on the

00:18:28.690 --> 00:18:36.370
control plane that once you lose one node and you just have two, an even number, there's issues

00:18:36.370 --> 00:18:42.930
that arise so it keeps it from scheduling. Now it doesn't mean that it might not resolve it for a couple

00:18:42.930 --> 00:18:47.410
of hours and allow you to schedule pods for a couple hours and then all of a sudden it goes back

00:18:47.410 --> 00:18:54.130
into the, hey, we can't communicate, we can't figure out who the leader is. But I wouldn't

00:18:54.130 --> 00:18:59.290
count on that. You know that's kind of like when you set a limit for a pod and you're able or a

00:18:59.290 --> 00:19:05.210
container and you're able to exceed that limit for a minute before the limit kicks in kind of the same

00:19:05.210 --> 00:19:11.130
principle i wouldn't count on that but yeah for sure a five control plane setup will give you more

00:19:11.130 --> 00:19:16.010
resiliency than a three. Okay, let's see. We added some, so now we need to remove two of those.

00:19:17.210 --> 00:19:22.330
Yeah, it's somewhere in there. Minikube node delete, control plane, or no, it's delete. Uh, we have to

00:19:22.330 --> 00:19:30.410
give it a name: delete m06, and then m05. Yeah, they've abstracted away some of the difficult

00:19:30.410 --> 00:19:36.170
stuff, because we'd have to cordon and then drain and then delete, and they do that for us.

00:19:37.130 --> 00:19:40.130
Part of that Minikube magic.

00:19:40.130 --> 00:19:47.130
Yeah, because daemon sets are automatically on each node, so you ignore the daemon sets.

00:19:47.130 --> 00:20:03.130
So when you start draining it, well, if you're, yeah, if you're just talking about removing a node. But if it's a node you're going to be working on, then the proper step is to cordon first and then drain.

00:20:03.130 --> 00:20:07.110
Yeah, it might not actually do anything because the scheduler might say,

00:20:07.130 --> 00:20:11.130
hey, I don't have anything on the schedule for your nodes, so it was a waste of effort.

00:20:11.130 --> 00:20:15.860
But yeah, okay.

00:20:15.860 --> 00:20:25.860
Yeah, cordon first and then give it a second to settle so it can pick another node to put pods on, and then start draining.

00:20:25.860 --> 00:20:27.860
All right.

00:20:27.860 --> 00:20:32.860
So now I'm going to do, see this works here.

00:20:32.860 --> 00:20:34.860
Let's add a worker node.

00:20:34.860 --> 00:20:37.860
All right, so now let's have some fun here.

00:20:37.860 --> 00:20:40.500
Let's deploy a Minikube cluster.

00:20:41.140 --> 00:20:47.740
Ensure a fresh Minikube environment and deploy a cluster with the Cilium CNI.

00:20:48.560 --> 00:20:50.220
Notice we have six nodes.
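
A sketch of that second cluster; the exact flag combination varies by Minikube version, so verify the control plane/worker split with minikube status:

  minikube delete
  minikube start --ha --nodes 6 --cni=cilium --container-runtime=containerd
  minikube status        # expecting three control planes and three workers here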

00:20:51.400 --> 00:20:52.940
And I'll be right back.

00:20:54.870 --> 00:20:58.570
It's Docker magic, you know, Minikube engineering magic also.

00:21:01.300 --> 00:21:02.720
Let's see where we're at here.

00:21:03.300 --> 00:21:04.720
Looks like we've settled out.

00:21:04.860 --> 00:21:06.180
Should we have a happy cluster?

00:21:06.380 --> 00:21:08.120
Hey, hey, she's happy.

00:21:09.160 --> 00:21:11.000
Yeah, let's check the status real quick.

00:21:11.060 --> 00:21:15.140
Yeah, there we've got three control planes and three workers.

00:21:15.140 --> 00:21:20.740
Now we're going to look for a control plane node with a Cilium operator on it.

00:21:20.740 --> 00:21:24.820
So we'll do get nodes -o wide.

00:21:24.820 --> 00:21:27.540
Kubectl get nodes -o wide.

00:21:29.380 --> 00:21:31.220
Oh, whoops, no pods, my bad.

00:21:31.220 --> 00:21:33.860
Minus A, uh, minus A.

00:21:33.860 --> 00:21:34.340
There we go.

00:21:34.340 --> 00:21:36.820
Right. Where's minikube? All right.

00:21:36.820 --> 00:21:38.740
Let's delete this.

00:21:38.740 --> 00:21:45.240
Oh, if you only knew how many weeks it takes to build all that out.

00:21:45.240 --> 00:21:46.240
It's amazing.

00:21:46.240 --> 00:21:50.240
Let's check the status of the cluster.

00:21:50.240 --> 00:22:09.650
Let's take a look at the nodes.

00:22:09.650 --> 00:22:10.650
And is it happy?

00:22:10.650 --> 00:22:30.090
Doesn't say degraded, does it?

00:22:30.090 --> 00:22:32.090
It just says running.

00:22:32.090 --> 00:22:34.090
Hmm, interesting.

00:22:34.090 --> 00:22:37.090
All right.

00:22:37.090 --> 00:22:44.970
So, let's have some fun here.

00:22:44.970 --> 00:22:51.210
I like breaking things so you know yeah just go ahead and control C out of that

00:22:52.730 --> 00:23:08.880
Obviously you can run minikube profile list. You have six nodes running, but something's not

00:23:08.880 --> 00:23:26.320
right um right so uh all kinds of stuff going on so you can look at the nodes let's look at the

00:23:26.320 --> 00:23:32.720
pods what we have running is running but we don't have all of our nodes all right so what happened

00:23:35.800 --> 00:23:41.320
Minikube with Cilium and kube-vip HA is not really HA.

00:23:41.320 --> 00:23:43.240
There's a lot more to configuring H.A.

00:23:43.240 --> 00:23:44.760
than just a basic install.

00:23:45.720 --> 00:23:48.200
So this is similar to what can happen in a production H.A.

00:23:48.200 --> 00:23:51.880
cluster that's not all configured correctly to handle it.

00:23:51.880 --> 00:24:01.200
Right. So let's review. Lesson 9, we learned how to scale and upgrade Kubernetes clusters.

00:24:02.560 --> 00:24:06.080
We learned how control plane nodes must be scaled in odd numbers.

00:24:06.080 --> 00:24:12.520
Not that they won't work for a few minutes with an even number, and it will work with four,

00:24:13.120 --> 00:24:14.980
while it does not work with two.

00:24:14.980 --> 00:24:23.920
Scaling downward involves cordoning, draining, and deleting, and scaling upward involves creating, joining, and then uncordoning.

00:24:23.920 --> 00:24:33.320
We learned that upgrading a node requires cordoning the node, then drain, upgrade, restart, and finally uncordon.

00:24:33.320 --> 00:24:37.680
Cluster scaling using IaC is easier with stateless workloads.

00:24:38.920 --> 00:24:44.760
It's as simple as creating a new cluster with IaC and setting workload state with declarative GitOps.

00:24:46.200 --> 00:24:49.320
After reaching desired state, transfer DNS to new cluster.

00:24:50.680 --> 00:24:55.000
to the new cluster ingress, and then monitor the new cluster for any issues.

00:24:55.000 --> 00:24:58.600
We learned how to scale a Minikube HA control plane down,

00:24:59.800 --> 00:25:03.160
how to scale a Minikube HA control plane up,

00:25:03.320 --> 00:25:11.320
how to scale a worker node, how complex HA setups may be more difficult, and what happens

00:25:11.320 --> 00:25:17.320
when the primary control plane goes offline. So what you just saw there was: minikube was our

00:25:17.320 --> 00:25:26.760
primary control plane and went offline. So that's part of what our problem is that we just saw.

00:25:26.760 --> 00:25:31.760
So kube-vip and Cilium are not configured correctly for HA.

00:25:31.760 --> 00:25:37.980
So it works great if you take out m02 or m03, I think.

00:25:37.980 --> 00:25:41.700
But when you take out m01, or take out minikube, which is m01,

00:25:41.700 --> 00:25:46.000
and then add one back in, we're missing a few things.

00:25:46.000 --> 00:25:49.800
All right.

00:25:49.800 --> 00:25:51.720
Any questions on lesson nine?

00:25:51.720 --> 00:26:02.130
All right, let's jump into lesson 10.

00:26:02.130 --> 00:26:05.890
I might... I'm not sure which environment we'll need.

00:26:05.890 --> 00:26:25.020
Oh yeah, for sure.

00:26:25.020 --> 00:26:28.040
Yeah, analyze and troubleshoot Kubernetes issues.

00:26:28.040 --> 00:26:34.780
So rather than devote a large lesson for 10,

00:26:34.780 --> 00:26:39.620
I've incorporated much of this lesson into all of the other eight lessons,

00:26:39.620 --> 00:26:43.680
not so much nine, although we did have a little bit,

00:26:43.680 --> 00:26:46.560
but lesson one through eight,

00:26:46.560 --> 00:26:49.460
I incorporated analyzing and troubleshooting so that you

00:26:49.480 --> 00:26:55.760
could see it in real time, you know.

00:26:55.760 --> 00:26:58.160
And so I think that's better than just studying the theory.

00:26:58.160 --> 00:27:00.580
So this is kind of a recap of some of it.

00:27:00.580 --> 00:27:03.240
So kubectl is the primary CLI tool

00:27:03.240 --> 00:27:05.700
for obtaining cluster information.

00:27:05.700 --> 00:27:09.620
And we know that when kube-apiserver is unavailable,

00:27:09.620 --> 00:27:12.120
then we're going to be unavailable,

00:27:12.120 --> 00:27:16.150
we're going to be unable to use kubectl, right?

00:27:16.150 --> 00:27:19.450
So we're going to be very limited in what we will have access

00:27:19.450 --> 00:27:22.330
to if API server is unavailable,

00:27:22.330 --> 00:27:26.090
kubectl has nothing to connect to.

00:27:26.090 --> 00:27:28.990
So in that case, when we're analyzing,

00:27:28.990 --> 00:27:31.930
events provide important information for one hour

00:27:31.930 --> 00:27:34.210
and then they're gone as well, right?

00:27:34.210 --> 00:27:37.930
And they're deleted in order to save space.

00:27:37.930 --> 00:27:42.250
So logs can provide important information

00:27:42.250 --> 00:27:45.010
after starting up, but we don't have logs

00:27:45.010 --> 00:27:47.730
when we're trying to start up a container, right?

00:27:47.730 --> 00:27:51.290
So container creating, there's nothing there other than events.

00:27:51.290 --> 00:27:56.290
And then the other problem is that logs are deleted if the container is deleted.

00:27:56.290 --> 00:28:02.290
So when it's running a job and it deletes a container and starts a new one up,

00:28:02.290 --> 00:28:08.290
the logs are gone. Shipping events and logs can save valuable time in troubleshooting.
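
A few of the commands this maps to in practice; the pod and namespace names are placeholders:

  kubectl get events -n my-ns --sort-by=.metadata.creationTimestamp   # recent events (kept roughly an hour by default)
  kubectl describe pod my-pod -n my-ns                                # events scoped to one pod
  kubectl logs my-pod -n my-ns                                        # current container logs
  kubectl logs my-pod -n my-ns --previous                             # logs from the last restarted container, if any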

00:28:08.290 --> 00:28:09.290
Yes.

00:28:09.290 --> 00:28:16.290
So node analysis is important to analyze resource constraints.

00:28:16.290 --> 00:28:20.290
And resource constraints can be in many forms.

00:28:20.290 --> 00:28:26.790
You can have CPU, memory, or PID, and that's the one that gets people.

00:28:27.430 --> 00:28:28.330
Inodes also.

00:28:28.610 --> 00:28:32.270
PIDs and inodes, two that developers don't even think about, right?

00:28:33.030 --> 00:28:35.870
So when your DevOps personnel are deploying their workloads

00:28:35.870 --> 00:28:39.910
and you can't figure out how it's crashing or what's going on with the database,

00:28:40.210 --> 00:28:41.690
those are the two that are hidden.

00:28:42.090 --> 00:28:45.370
They just sneak up on them, PIDs and inodes.

00:28:45.970 --> 00:28:48.130
So inodes are on your database storage;

00:28:48.130 --> 00:28:57.010
it allows only so many, it's like files. And so, I believe it's, if you have a lot of writes of very

00:28:57.010 --> 00:29:04.210
small files: you get a generous inode allocation, but if you have a lot of writes of very small files,

00:29:05.650 --> 00:29:11.890
then it'll blow up your inodes after a year, right? So everything's running along nicely

00:29:11.890 --> 00:29:16.290
on your application, and after a year, all of a sudden your database crashes. Like, what happened?

00:29:16.290 --> 00:29:46.230
I didn't do anything. I didn't update it. You know, we were at 100 meg yesterday and we're at 101 meg today. What happened? We had 200 meg available, right? Or whatever size your database is. Yeah, I don't understand it. Why did it crash? If you start looking at inodes relating to, like, a MySQL or Postgres or whatever, you'll find that your inodes have probably been exceeded, if you have that type of application. That's one of the ones that

00:29:46.290 --> 00:29:47.510
will creep up on you.
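
Some quick checks for the constraints just mentioned; the node name is a placeholder, and df -i is run on the node (or the database volume) itself:

  kubectl describe node worker-1   # Conditions show MemoryPressure, DiskPressure, PIDPressure; Allocatable shows CPU, memory, pods
  kubectl top node                 # live CPU and memory, if metrics-server is installed
  df -i                            # inode usage per filesystem; 100% IUse% means no new files even with free space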

00:29:47.910 --> 00:29:50.990
And so the other constraint can be node taints, right?

00:29:51.110 --> 00:29:51.990
So we taint a node.

00:29:53.010 --> 00:29:58.970
And so we have to tolerate it to enable a workload or not tolerate it, and it will

00:29:58.970 --> 00:29:59.670
repel it.

00:30:00.630 --> 00:30:01.970
So we need to check that.
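
A quick sketch of checking and tolerating a taint; the key and value are hypothetical:

  kubectl describe node worker-1 | grep -i taint            # see what taints are on the node
  kubectl taint nodes worker-1 dedicated=gpu:NoSchedule     # add a taint that repels pods
  # a matching toleration in the pod spec would look like:
  #   tolerations:
  #     - key: dedicated
  #       operator: Equal
  #       value: gpu
  #       effect: NoSchedule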

00:30:02.610 --> 00:30:04.430
And then node pressure, disk.

00:30:04.770 --> 00:30:10.170
So if we're not properly managing our node logs, that can creep up, depending on how

00:30:10.170 --> 00:30:16.110
much you've allocated for your node storage, or especially,

00:30:16.290 --> 00:30:20.370
you know control planes we generally don't allocate a lot of storage to control planes

00:30:20.370 --> 00:30:25.090
Control planes are an expense; they don't generate anything, because you're not running a

00:30:25.090 --> 00:30:33.090
workload on them. So we try to keep them as resource-friendly as possible. All right.

00:30:34.610 --> 00:30:41.810
Pod analysis is important to analyze pod constraints. So, is there a node to deploy on?

00:30:43.250 --> 00:30:48.370
Are there any taints that need to be tolerated? Is the image repository

00:30:48.850 --> 00:30:55.330
available, right? So, do we have a credential issue? Is the image repo down? Did the

00:30:55.330 --> 00:31:01.330
upstream repo change they've been using a repo for the last five years and all of a

00:31:01.330 --> 00:31:06.290
sudden, with no notice, they're using a totally different one and we can't pull an image. So, is the

00:31:06.290 --> 00:31:11.170
image able to be pulled, given the credentials, right? The repo is there, but we don't

00:31:11.170 --> 00:31:17.170
have credentials. Are there enough resources allocated to start up? The node ready

00:31:17.170 --> 00:31:24.430
state: it must be Ready or no pods can deploy. Our pod ready state must be Ready for

00:31:24.430 --> 00:31:30.110
workloads to function. Services need endpoints to connect to, even Headless. Headless uses DNS.

00:31:30.910 --> 00:31:36.890
So it uses DNS instead of going through the proxy; it goes through the DNS server in order

00:31:36.890 --> 00:31:42.410
to connect to the service. But it still needs endpoints. Ingress needs endpoints to connect to, right?

00:31:42.650 --> 00:31:47.150
So just like when we look through the ingress and the endpoints were empty. And even when it

00:31:47.170 --> 00:31:49.710
connected to the service, the service endpoints.

00:31:49.710 --> 00:31:53.410
Labels enable the ability to query all resources by label.

00:31:53.410 --> 00:31:54.310
That can be a time saver.

00:31:54.310 --> 00:31:56.210
And namespaces enable the ability

00:31:56.210 --> 00:31:58.710
to query all resources by namespace.
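
The corresponding queries, with placeholder names:

  kubectl get pods -o wide -n my-ns            # is a node assigned, or is the pod Pending?
  kubectl describe pod my-pod -n my-ns         # scheduling, image pull, and probe events
  kubectl get endpoints my-service -n my-ns    # empty endpoints mean the selector matches no ready pods
  kubectl get all -l app=my-app -A             # query resources everywhere by label
  kubectl get all -n my-ns                     # or everything in one namespace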

00:31:59.610 --> 00:32:03.770
So we do a practice here if we have time.

00:32:03.770 --> 00:32:05.950
So it depends.

00:32:05.950 --> 00:32:08.490
So this was designed to basically go through everything

00:32:08.490 --> 00:32:10.790
that we've learned and spin it up again

00:32:10.790 --> 00:32:12.430
and see how much you've retained.

00:32:12.430 --> 00:32:14.470
It's about 2:40.

00:32:14.470 --> 00:32:17.130
So if we take our last 15 minute

00:32:17.170 --> 00:32:25.030
it would put us at 2:55. So if we, you know, skip this, it would put us at around 3 o'clock, and

00:32:25.030 --> 00:32:31.630
then we can spend the last 45 minutes because we have to do the review before 4 o'clock

00:32:31.630 --> 00:32:38.830
so did you receive the review from from Neil okay because he wanted to make sure that

00:32:38.830 --> 00:32:46.410
was completed before 4 o'clock so they'd like to do that between 345 and 4 so

00:32:46.410 --> 00:32:50.590
Okay, so I'm Eastern.

00:32:50.970 --> 00:32:51.590
Oh, interesting.

00:32:52.030 --> 00:32:56.150
So, yeah, so for me, he wants it filled out, and I guess you email it to me.

00:32:57.470 --> 00:32:58.230
Is that how?

00:32:58.310 --> 00:33:05.150
Okay, yeah, so he says, have the student fill it out around 3:45, so 15 minutes before the end of the course.

00:33:05.870 --> 00:33:08.350
Are you central?

00:33:08.590 --> 00:33:09.430
Huh, interesting.

00:33:11.720 --> 00:33:12.400
I'll be right back.

00:33:14.000 --> 00:33:14.460
Had a little.

00:33:15.460 --> 00:33:18.640
So we can do a quick review, and then we can go on break, and then

00:33:18.640 --> 00:33:22.000
lesson 11 is where we take everything we've learned

00:33:22.000 --> 00:33:24.120
with Helm charts. We're going to learn Helm charts

00:33:24.120 --> 00:33:27.020
and then we're going to take everything we've learned with Helm charts.

00:33:27.460 --> 00:33:30.560
Okay, the Helm chart pieces. Yeah, it's involved

00:33:30.560 --> 00:33:33.720
but it's fun. So Helm is, to me, is where Kubernetes

00:33:33.720 --> 00:33:36.480
comes together. Okay, so

00:33:36.480 --> 00:33:39.020
let's skip this.

00:33:42.010 --> 00:33:44.790
So proper analysis and troubleshooting starts with

00:33:44.790 --> 00:33:49.030
events and logs. Shipping events and logs

00:33:49.030 --> 00:33:50.430
can save valuable time.

00:33:50.850 --> 00:33:55.850
Kubectl provides feedback on YAML formatting errors.

00:33:55.850 --> 00:34:03.850
That's kind of nice, but that's one of the very few pieces of feedback that you will receive on YAML formatting, and it's with kubectl.

00:34:03.850 --> 00:34:09.850
So if you're using Helm charts, you may not receive feedback that's helpful at all.

00:34:09.850 --> 00:34:16.850
And if you have a thousand line values file, shipping events and logs can save valuable time.

00:34:16.850 --> 00:34:19.850
Helm may not provide descriptive feedback on YAML errors.
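
Two checks that help when Helm's own error messages are thin; the chart path and release name are placeholders:

  helm lint ./my-chart                                    # catches many chart and values mistakes
  helm template my-release ./my-chart -f values.yaml \
    | kubectl apply --dry-run=server -f -                 # let the API server validate the rendered manifests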

00:34:19.850 --> 00:34:29.440
Start with the pod and any controllers, whether it's a ReplicaSet or

00:34:29.440 --> 00:34:37.940
deployment controller. Next, the service, if there is any. The service, ingress, or gateway may have

00:34:37.940 --> 00:34:44.140
multiple components to analyze, especially when you get into gateways, your gateway class may

00:34:44.140 --> 00:34:52.700
be incorrect. Or your HTTPRoute may not be written correctly, may not be connecting to

00:34:52.700 --> 00:34:59.340
the gateway and always remember to test on a development or production like cluster, especially

00:34:59.340 --> 00:35:05.550
when working with Helm charts. All right, let's go ahead and take our last 15-minute break and

00:35:05.550 --> 00:35:10.350
we'll come back at just before 3 o'clock. How's that? Yeah, yeah, yeah, absolutely.

00:35:12.190 --> 00:35:36.050
All right, I will see you at 3 o'clock and you're ready to continue. Oh, okay. All right.

00:35:36.050 --> 00:35:41.890
all right now we get into the fun part so this is kind of where everything that we have learned so far

00:35:41.890 --> 00:35:52.370
all kind of gels. And that's because, you know, Helm charts are the, you know, production way of deploying

00:35:52.370 --> 00:36:02.530
manifest files. And you can certainly do it the way that we did, deploying different container workloads

00:36:02.530 --> 00:36:08.970
with manifest files, but Helm templating provides a uniform method of doing that.

00:36:09.270 --> 00:36:12.190
So what were you going to say?

00:36:12.390 --> 00:36:14.550
Let's jump into it here.

00:36:15.750 --> 00:36:20.330
So Helm templating follows a specific layout for standard charts.

00:36:20.330 --> 00:36:25.490
Within the chart folder, you will find several required folders and files.

00:36:25.550 --> 00:36:30.470
The Chart.yaml file lists both the chart version and the app version,

00:36:31.230 --> 00:36:33.390
the chart version being self-descriptive.

00:36:34.130 --> 00:36:39.530
It's the version of that actual Helm chart, and the app version relates to whatever workload that you're,

00:36:39.530 --> 00:36:48.480
you know, running, controlled by that chart. The values.yaml file contains the user-defined variables.

00:36:48.480 --> 00:36:54.580
Of course, it's a lowercase v, not an uppercase; Chart.yaml, however, is uppercase.

00:36:54.580 --> 00:37:00.660
The templates directory contains manifest file templates. Templates are designed to read variables in the values

00:37:00.660 --> 00:37:11.980
YAML file. Templates rarely contain user-defined variables. If they do, generally, it's going to

00:37:11.980 --> 00:37:17.960
trigger something. Next, the charts directory. The charts directory is optional and may contain

00:37:17.960 --> 00:37:26.040
upstream charts. So it may pull in upstream charts and then those upstream charts will be

00:37:26.040 --> 00:37:28.880
updated within that charts directory. All right, Helm templating.

00:37:30.660 --> 00:37:33.340
Helm templating follows a specific design pattern.

00:37:33.340 --> 00:37:38.980
It is designed to enable DevOps personnel to modify the chart, meaning the modifications

00:37:38.980 --> 00:37:42.220
are to the values.yaml file only.

00:37:42.220 --> 00:37:46.040
And the rest of the chart is maintained by upstream chart captains.

00:37:46.040 --> 00:37:50.160
And it's the actual term, for some reason, someone chose a long time ago.

00:37:50.160 --> 00:37:54.420
I think when the word container was chosen, they decided to go full nautical with every

00:37:54.420 --> 00:37:55.420
family.

00:37:55.420 --> 00:37:57.840
So that's where that comes from the chart captain.

00:37:57.840 --> 00:38:03.080
The chart is versioned and installed by latest or the version number.

00:38:03.080 --> 00:38:09.400
Helm issues, upstream providers often do not have a full-time helm captain.

00:38:09.400 --> 00:38:14.720
This leads to upstream personnel maintaining the template and errors can arise due to lack

00:38:14.720 --> 00:38:16.400
of templating knowledge.

00:38:16.400 --> 00:38:25.680
Common errors, headless service, I see that one a lot, ingress, ports, image version.

00:38:25.680 --> 00:38:43.180
Oh, by the way, a good practice, and I did not cover this, but a good practice is to install an ironclad firewall on your Kubernetes node so that you can catch all of the ports that are being used that you have no idea.

00:38:44.180 --> 00:38:54.360
And what I mean by that is upstream maintainers will release a container with a helm chart, and it might use three different ports, metrics,

00:38:54.360 --> 00:39:03.960
ingress and maybe a communication port for high availability or it'll have a

00:39:03.960 --> 00:39:09.320
live-ness probe port right and then all of a sudden they'll release a new chart

00:39:09.320 --> 00:39:14.200
with a new version of the container and it's using two new ports what are those

00:39:14.200 --> 00:39:17.720
ports? And when you communicate with them, sometimes they say, we're not using two new

00:39:17.720 --> 00:39:22.840
ports. And, well, here, take a look. And they realize they've incorporated

00:39:22.840 --> 00:39:26.980
something in there that they didn't even realize was using two additional ports.

00:39:28.080 --> 00:39:35.340
So all of them, yeah, so I install firewalls on all of my nodes, worker, control plane,

00:39:35.340 --> 00:39:44.080
and storage because each one, yeah, yeah, so you can use UFW and install UFW with Ubuntu,

00:39:44.800 --> 00:39:49.620
and then you will find that your container doesn't, you know, finish spinning up, right?

00:39:49.680 --> 00:39:52.240
The pod doesn't become ready yet.

00:39:52.240 --> 00:39:53.240
So why is it?

00:39:53.240 --> 00:39:55.740
And you look and it says it can't reach something.

00:39:55.740 --> 00:40:02.560
So you go into your node and you query to see what the latest firewall block is.

00:40:02.560 --> 00:40:05.520
And it will show you whether it's incoming or outgoing.

00:40:05.520 --> 00:40:09.420
And then you can see what port that that container is using.

00:40:09.420 --> 00:40:14.100
You can go and try to look it up in the manifest file, but oftentimes it won't even show

00:40:14.100 --> 00:40:16.960
you that port in the manifest file.

00:40:16.960 --> 00:40:20.360
So there's no way to know unless you had a firewall that it's actually communicating

00:40:20.360 --> 00:40:24.820
through a specific port, whether it's going out through a port and then coming back into another

00:40:24.820 --> 00:40:31.740
node for that same container running on a second node in a different port. And then it allows you to

00:40:31.740 --> 00:40:37.660
just open the ports that you feel are necessary. And so if a container is shipping metrics,

00:40:37.660 --> 00:40:41.960
for example, and I didn't authorize metrics to be shipped, you know, to a foreign country,

00:40:41.960 --> 00:40:48.600
and some of these open source projects are actually maintained in a foreign country. And so you

00:40:48.600 --> 00:40:54.440
can block that port so no metrics are going to be shipped you know to their upstream data collection

00:40:54.440 --> 00:41:05.560
project. So yeah, firewalls on nodes, they're very, very important. So, that's ports.
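
A hedged sketch of that approach with UFW on an Ubuntu node; the allowed ports here are just a few common Kubernetes ones and will differ per cluster, so work out your own list before enabling this anywhere real:

  sudo ufw allow 22/tcp         # SSH
  sudo ufw allow 6443/tcp       # kube-apiserver (8443 on Minikube)
  sudo ufw allow 10250/tcp      # kubelet
  sudo ufw logging on
  sudo ufw enable
  sudo ufw status verbose
  # later, see what a misbehaving container was actually trying to reach
  sudo grep 'UFW BLOCK' /var/log/ufw.log | tail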

00:41:05.560 --> 00:41:10.600
And then image version is another common area. You'd be amazed how many times the Helm chart ships with the

00:41:10.600 --> 00:41:17.480
old version, so the Helm chart has a new version but the actual container is still the old one.

00:41:17.480 --> 00:41:23.800
And again, a lot of that has to do with the team doesn't have a Helm captain, so they're

00:41:23.800 --> 00:41:27.800
not familiar with every process for updating the Helm chart.

00:41:27.800 --> 00:41:29.960
All right, Helm solutions to issues.

00:41:29.960 --> 00:41:35.080
So it can take a while. If a Helm chart is not correct and it does not work, your choice is

00:41:35.080 --> 00:41:42.520
to roll back to an older version that worked. That may not be possible due to issues with the older

00:41:42.520 --> 00:41:49.240
version, and so you can modify the Helm templating within the chart to fix the issues.

00:41:49.240 --> 00:41:55.800
If you understand templating, you can do that. However, this leads to a requirement to maintain your

00:41:55.800 --> 00:42:03.480
own version. This requires that you have your own chart repository. Common repositories are GitLab

00:42:03.480 --> 00:42:10.040
and GitHub. There are newer ones that are out there. However, every time you upgrade to a new version,

00:42:10.040 --> 00:42:15.400
and you have to compare charts for any changes and pull them in manually.

00:42:15.400 --> 00:42:20.540
That is a very tedious process to maintain a downstream chart.

00:42:20.540 --> 00:42:21.540
All right, any questions?

00:42:21.540 --> 00:42:25.540
Okay, so we're going to ensure a fresh Minikube profile.

00:42:25.540 --> 00:42:30.540
Yeah, just a single node on this exercise.

00:42:30.540 --> 00:42:31.540
All right, okay.

00:42:31.540 --> 00:42:33.540
So now we're going to create a Helm chart.

00:42:33.540 --> 00:42:38.540
So we have Helm installed and checked the version yesterday.

00:42:38.540 --> 00:42:45.700
So the command to create a Helm chart, and we're in the root directory, is just helm create,

00:42:46.420 --> 00:42:48.660
and we'll call it test-app.

00:42:48.900 --> 00:42:50.660
So now we're going to LS and find it.

00:42:50.860 --> 00:42:55.140
We're going to CD into it, and what do we see in there?

00:42:55.780 --> 00:42:57.320
So let's take a look at values.yaml.

00:42:57.500 --> 00:42:57.960
I don't know.

00:42:57.960 --> 00:42:59.260
I see anything familiar in here?

00:42:59.340 --> 00:43:02.820
Let's scroll up to the top and then read it from the top down to the bottom.

00:43:04.180 --> 00:43:05.940
Got a few lines in there, don't we?

00:43:05.940 --> 00:43:10.940
So we have a rep, yeah, I can see that.

00:43:10.960 --> 00:43:12.760
And you also have the delay,

00:43:12.760 --> 00:43:16.500
because the data center they're running this on.

00:43:18.320 --> 00:43:21.540
We can see we have a replica count of one.

00:43:22.760 --> 00:43:24.560
All right, so that's interesting.

00:43:24.560 --> 00:43:25.700
File that away.

00:43:27.160 --> 00:43:30.700
And see we have our image, yeah, yeah.

00:43:32.540 --> 00:43:35.780
And very similar to the template we worked with before.

00:43:35.940 --> 00:43:43.940
Now there is no private registry, because we're pulling it from a public registry for NGINX, which it's going to assume is Docker Hub, probably.

00:43:45.940 --> 00:43:48.940
If it's quay.io, you generally need to put quay.io.

00:43:48.940 --> 00:43:50.940
Let's see here.

00:43:50.940 --> 00:43:54.940
Nothing in there, no secrets to pull from the private registry.

00:43:54.940 --> 00:44:01.940
So that Docker config JSON, if we had created one for private registry, that would go in the image pull secrets.

00:44:01.940 --> 00:44:04.940
The name of that, making sure that,

00:44:04.940 --> 00:44:12.300
that it's in the proper name space so you can see this doesn't have anything for

00:44:12.300 --> 00:44:17.660
service accounts and if we scroll down a little more we see pod annotations nothing in

00:44:17.660 --> 00:44:25.660
there so pod annotations pod labels nothing in there that's where we could label our

00:44:25.660 --> 00:44:34.780
pods. Pod security context, nothing there. We have a service, port 80, ClusterIP.

00:44:34.940 --> 00:44:37.140
Let's see if that service is created.

00:44:37.340 --> 00:44:38.200
Let's go a little further.

00:44:38.360 --> 00:44:40.240
It looks like it does create the service.

00:44:41.140 --> 00:44:44.340
So it's going to create pods, one pod.

00:44:44.880 --> 00:44:47.560
It's going to create a service on port 80.

00:44:48.640 --> 00:44:50.880
And is it going to create an ingress?

00:44:51.000 --> 00:44:51.420
Correct.

00:44:51.880 --> 00:44:52.820
It's false.

00:44:54.020 --> 00:44:56.220
So this leads to an interesting question.

00:44:57.760 --> 00:45:03.480
If we are using ingress and we want to use this to handle the TLS cert,

00:45:03.480 --> 00:45:12.620
into its own ecosystem, its own container, we can use the ingress.

00:45:12.720 --> 00:45:17.800
However, I generally do not use ingresses anymore because I use Gateway API.

00:45:18.540 --> 00:45:21.020
So I terminate automatically at the gateway.

00:45:21.140 --> 00:45:26.860
In fact, my Helm charts, so I'm a Helm captain, and I build Helm charts from scratch,

00:45:26.860 --> 00:45:30.560
and I don't actually have an ingress in my Helm charts anymore.

00:45:30.560 --> 00:45:36.960
I have HTTP routes and so I had to create from scratch and we'll see that in a

00:45:36.960 --> 00:45:42.080
minute, my own templating for HTTP routes for Gateway API, because that doesn't exist in the Helm ecosystem yet.
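
For context, a minimal HTTPRoute of the kind being described; the gateway name, namespace, and hostname are hypothetical:

  kubectl apply -f - <<'EOF'
  apiVersion: gateway.networking.k8s.io/v1
  kind: HTTPRoute
  metadata:
    name: test-app
    namespace: test-app
  spec:
    parentRefs:
      - name: my-gateway          # hypothetical Gateway
        namespace: gateway-system
    hostnames:
      - "app.example.com"
    rules:
      - backendRefs:
          - name: test-app        # the chart's Service
            port: 80
  EOF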

00:45:42.080 --> 00:45:48.800
So something to keep in mind is, as Kubernetes

00:45:48.800 --> 00:45:54.560
goes from Ingress to Gateway API. So you can see our resources: we don't have

00:45:54.560 --> 00:45:59.440
any constraints but it's there so you can look at it and we go down a little further

00:45:59.440 --> 00:46:07.840
we've got our probes, probably empty. Let's see, yeah, they're empty. So no, no port,

00:46:09.200 --> 00:46:15.360
or the port is http, but, oh, they're just hitting the path, I guess. Well, we'll find out if it has

00:46:15.360 --> 00:46:23.680
probes. Usually I see something else there beyond path; it's hitting a probe endpoint or something like that.

00:46:23.680 --> 00:46:29.360
After that, the autoscaling. Oh, interesting, that could be enabled too.

00:46:29.440 --> 00:46:36.640
Min replicas is one, max replicas is 100, nice, and it targets CPU utilization percentage 80. So we didn't

00:46:36.640 --> 00:46:42.320
demonstrate that, because it's more of an advanced concept, but essentially you could set up

00:46:42.320 --> 00:46:50.000
auto scaling and add those in there and test that out additional volumes if we want to attach a

00:46:50.000 --> 00:46:55.570
volume so we could attach a secret as a volume for example

00:46:55.570 --> 00:47:02.510
And volume mounts. See the node selector down there, and then right below that, tolerations.

00:47:02.910 --> 00:47:08.750
So on a lot of helm charts where they don't have a helm captain, those won't do anything.

00:47:08.990 --> 00:47:14.330
You can fill those out, and it will not translate up to the template, which we'll look at in a minute.

00:47:14.870 --> 00:47:20.870
And that's because they don't have a helm captain, and they don't know how to read what you're putting in there for the value.

00:47:20.870 --> 00:47:27.530
So there's no way to force that pod to run on a specific node.

00:47:27.530 --> 00:47:31.530
There's no way to tolerate that node for a taint.

00:47:32.650 --> 00:47:39.450
And so you're stuck running that workload on whatever node is available.

00:47:40.570 --> 00:47:46.170
And then our affinity: that's for pod affinity, pod anti-affinity,

00:47:46.170 --> 00:47:50.850
and node affinity, node anti-affinity.

00:47:51.570 --> 00:47:53.230
And you will find that some teams,

00:47:53.310 --> 00:47:54.350
some upstream teams,

00:47:54.510 --> 00:47:56.170
they don't know how to work with node selector.

00:47:57.770 --> 00:47:59.190
They know how to work with affinity.

00:47:59.490 --> 00:48:04.530
So everything they do is it's affinity or anti-affinity or no affinity.

00:48:05.970 --> 00:48:10.290
And so usually you have to propose to them how to modify their Helm chart

00:48:10.290 --> 00:48:12.330
to enable node selector and tolerations.

00:48:14.090 --> 00:48:15.150
Again, limiting factors,

00:48:15.150 --> 00:48:18.470
you'll run into in upstream Helm charts.

00:48:18.470 --> 00:48:21.570
All right, let's view the Chart.yaml file.

00:48:21.570 --> 00:48:23.690
So here we have our two versions,

00:48:23.690 --> 00:48:26.590
we have our chart version, 0.1.0.

00:48:26.590 --> 00:48:29.550
Every time you change anything in the chart,

00:48:29.550 --> 00:48:30.910
and that should increment,

00:48:32.150 --> 00:48:35.490
and then the app version should follow

00:48:35.490 --> 00:48:40.870
the container version that you're pulling in.

00:48:40.870 --> 00:48:44.270
So if your container is 1.17,

00:48:44.270 --> 00:48:50.670
So if it's NGINX 1.17, then the app version should also read 1.17.

00:48:51.550 --> 00:48:59.270
And the values file and the templating should automatically read the app version for the container image.

00:48:59.270 --> 00:49:00.630
CD into charts.

00:49:01.190 --> 00:49:01.810
Let's take a look at it.

00:49:03.010 --> 00:49:10.050
So this folder is for if you pull in upstream charts; so if you have a chart that depends on other charts to run,

00:49:10.850 --> 00:49:12.490
Then they will go into this folder.

00:49:12.490 --> 00:49:13.890
Let's CD back out of that.

00:49:14.270 --> 00:49:16.270
We'll cd into templates.

00:49:16.270 --> 00:49:18.270
Describe the contents here.

00:49:18.270 --> 00:49:19.270
So we've got a notes.

00:49:19.270 --> 00:49:21.270
Dot text tells us a little bit.

00:49:21.270 --> 00:49:23.270
Sometimes it's descriptive, sometimes it's just,

00:49:23.270 --> 00:49:25.270
there's nothing important in there.

00:49:25.270 --> 00:49:27.270
The _helpers.tpl.

00:49:27.270 --> 00:49:28.270
What's that?

00:49:28.270 --> 00:49:31.270
Uh, yeah, you can go through them one at a time, yeah?

00:49:31.270 --> 00:49:32.270
Absolutely.

00:49:32.270 --> 00:49:36.270
Okay, so this is what gets output for this chart

00:49:36.270 --> 00:49:38.270
when it loads.

00:49:38.270 --> 00:49:40.270
Okay, so you can get out of that.

00:49:40.270 --> 00:49:41.270
_helpers.tpl.

00:49:41.270 --> 00:49:43.270
I think there's an underscore there.

00:49:43.270 --> 00:49:52.030
So this is a helper which creates variables, which are then used throughout the other

00:49:52.030 --> 00:49:53.030
templates.

00:49:53.030 --> 00:49:58.770
So the helper creates a variable from something, oftentimes pulls it from the values,

00:49:58.770 --> 00:50:02.510
YAML file, and makes it available to the other templates.
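A simplified sketch of that kind of helper, assuming the chart is named test-app (the real helm create scaffold is longer):

    {{/* _helpers.tpl */}}
    {{- define "test-app.fullname" -}}
    {{- printf "%s-%s" .Release.Name .Chart.Name | trunc 63 | trimSuffix "-" -}}
    {{- end }}

Other templates then pull it in with something like name: {{ include "test-app.fullname" . }}.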

00:50:02.510 --> 00:50:07.970
It doesn't look at all like what we typed out earlier, does it?

00:50:07.970 --> 00:50:09.230
This is Helm template.

00:50:09.230 --> 00:50:15.970
So you notice, so it's designed to read the values file.

00:50:16.050 --> 00:50:18.030
This is a very simplistic Helm chart.

00:50:18.430 --> 00:50:27.180
But if you scroll down to the bottom, I think, yep, here we go.

00:50:27.340 --> 00:50:31.100
Node selector, with .Values.nodeSelector.

00:50:31.460 --> 00:50:34.720
nodeSelector, and it runs it through toYaml.

00:50:34.720 --> 00:50:42.000
So whatever you put in that node selector value, it converts it directly to YAML

00:50:42.000 --> 00:50:48.640
and indents it and puts it right there. And so that is actually missing in a lot of upstream charts,

00:50:48.640 --> 00:50:54.880
just because the team doesn't know how to put that in there and apply it. It's the same with

00:50:54.880 --> 00:50:59.760
tolerations: you might have node selector but no tolerations, so if your node is tainted,

00:51:00.960 --> 00:51:07.280
there's no way to deploy to it. All right, let's do the next one. This is your horizontal pod

00:51:07.280 --> 00:51:12.480
autoscaler. All right, let's do the next one, our ingress, and you can see that we use the

00:51:12.480 --> 00:51:24.400
Ingress API version v1, and your namespace, name, label sanitation... no namespace, interesting. A very, very

00:51:24.400 --> 00:51:32.720
basic chart. Okay, the next one. All right, so it's going to create a basic service, again

00:51:32.720 --> 00:51:37.120
with no namespace, making it very simple to deploy.

00:51:37.280 --> 00:51:40.380
All right, let's do the next one.

00:51:40.380 --> 00:51:42.520
And we don't use service accounts because we didn't.

00:51:42.940 --> 00:51:46.440
It's an advanced topic: RBAC, service accounts, et cetera.

00:51:47.720 --> 00:51:48.480
All right.

00:51:48.740 --> 00:51:53.610
So now we can change back to the root folder.

00:51:53.790 --> 00:51:56.210
And we're going to create a namespace, test-app.

00:51:56.210 --> 00:51:58.570
And we'll make sure that that's created.

00:51:59.130 --> 00:51:59.330
Okay.

00:52:00.010 --> 00:52:04.730
Now we're going to deploy the Helm chart using the Helm install command.

00:52:05.490 --> 00:52:08.510
So the format is command, which it's written there,

00:52:08.590 --> 00:52:16.710
Helm install, chart name, deployment name, and namespace with a minus N, right?
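As a sketch, run from the folder that contains the chart directory, it would look like this; cool-app here is just the release name we pick:

    helm install cool-app ./test-app -n test-app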

00:52:17.430 --> 00:52:18.590
So how would you write?

00:52:19.350 --> 00:52:24.790
I'll give you the first hint: we're going to start with two words, helm, space, install,

00:52:25.710 --> 00:52:26.910
and then there will be a space.

00:52:27.010 --> 00:52:28.810
And we can give it a cool deployment name.

00:52:29.110 --> 00:52:32.070
We can call it cool, cool name or something like that.

00:52:33.030 --> 00:52:35.010
Whatever you want to do, call it whatever you want.

00:52:36.010 --> 00:52:38.390
It won't actually work if you're in a test app.

00:52:38.590 --> 00:53:08.570
It'll give you an error. We can do it, changing into the test-app folder. Well, let's get it. I was going to say it'll be fun to see what the error is like. Yeah, we might as well see it. No, it's just test dash app. The name of the chart is test, hyphen, app. So let's run the command from here just for fun. Yeah, so we'll do helm install. Let's call it cool-app or something. There we go. Namespace test-app. And so see, it can't find the

00:53:08.590 --> 00:53:14.290
chart, because you're in the chart folder. So let's go out to the root. Let's do the same.

00:53:14.290 --> 00:53:19.610
No, let's change it. Helm install test, let's do test-app, test. My bad. We named it

00:53:19.610 --> 00:53:25.330
test-app, but it can't find test. So the first one should be the cool name. Yeah. So we're naming it

00:53:25.330 --> 00:53:31.010
first and then we're pointing at the chart. Yeah. So the first test-app is actually the

00:53:31.010 --> 00:53:35.750
deployment name. There we go. This should work. All right, remember what we saw in the notes.

00:53:35.750 --> 00:53:43.670
text file: it grabbed what it needed through templating and then output it here. So it says

00:53:43.670 --> 00:53:58.610
get the application URL by running that. You may have to run them one at a time, I'm not sure. Looks like

00:53:58.610 --> 00:54:08.980
four commands there. Yeah, they don't have the, uh, the backslash. Yeah, so you're, yeah, you're

00:54:08.980 --> 00:54:16.260
exporting these first. Export both, yeah, you did, and an echo, and then, yeah, now run kubectl. Yeah,

00:54:17.140 --> 00:54:21.140
Yeah, just the last line. Yeah, I think that's correct.
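The commands the generated NOTES.txt prints are roughly along these lines (release and namespace names from this session; the exact label selector may differ):

    export POD_NAME=$(kubectl get pods -n test-app \
      -l "app.kubernetes.io/instance=cool-app" \
      -o jsonpath="{.items[0].metadata.name}")
    export CONTAINER_PORT=$(kubectl get pod -n test-app $POD_NAME \
      -o jsonpath="{.spec.containers[0].ports[0].containerPort}")
    echo "Visit http://127.0.0.1:8080 to use your application"
    kubectl -n test-app port-forward $POD_NAME 8080:$CONTAINER_PORT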

00:54:21.140 --> 00:54:25.140
Oh, cool, it's forwarding. You could open your browser and test it.

00:54:25.140 --> 00:54:29.140
See if it works. I have no idea if that's going to work inside here, though.

00:54:29.140 --> 00:54:35.140
So it's port forwarding. If it'll let us open a browser and connect to

00:54:35.140 --> 00:54:43.140
127.0.0.1:8080. Bingo. Port forwarding magic. All right, we can cancel out of that.

00:54:43.140 --> 00:54:45.140
Let's take a look at our pods.

00:54:45.140 --> 00:54:51.500
Let's look at how to get all the pods and services in a single command for that namespace.
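That single command is simply:

    kubectl get pods,svc -n test-app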

00:54:52.940 --> 00:54:59.960
So we've got our pod, cool-app-test-app, and you can override that and call it whatever you want.

00:54:59.960 --> 00:55:08.740
It's just automatically taking your deployment name, the chart name, adding a, let's see, it's got a replica set.

00:55:09.080 --> 00:55:11.680
So adding a replica set hash and a pod hash.

00:55:11.680 --> 00:55:16.680
We can see our service is running on port 80.

00:55:16.680 --> 00:55:18.680
Let's take a look at that service.

00:55:18.680 --> 00:55:21.680
And we have an endpoint, a single endpoint.

00:55:21.680 --> 00:55:22.680
And we have no ingress.

00:55:22.680 --> 00:55:25.680
You can test it: kubectl get ingress.

00:55:25.680 --> 00:55:30.120
It should not have deployed one.

00:55:30.120 --> 00:55:32.120
I would do hyphen A.

00:55:32.120 --> 00:55:34.120
All right.

00:55:34.120 --> 00:55:38.120
Now let's use the deployments.

00:55:39.120 --> 00:55:41.120
List all of the helm deployments.

00:55:41.120 --> 00:55:42.120
There we go.

00:55:42.120 --> 00:55:44.120
That is our deployment.

00:55:44.120 --> 00:55:48.120
It tells us what version we're running, what app version, everything there.

00:55:48.120 --> 00:55:51.120
Now, how do we get rid of it?

00:55:51.120 --> 00:55:56.120
Well, Helm uninstall, which is your command,

00:55:56.120 --> 00:55:59.120
then the name of the deployment, and then the namespace.
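As a sketch, with the names used in this session:

    helm uninstall cool-app -n test-app
    helm list -n test-app     # should come back empty afterward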

00:55:59.120 --> 00:56:01.120
All right.

00:56:01.120 --> 00:56:02.120
Cool.

00:56:02.120 --> 00:56:04.120
Now let's check to see if it was removed.

00:56:04.120 --> 00:56:07.120
So we're going to do Helm List.

00:56:07.120 --> 00:56:09.120
Yeah, I believe Helm List.

00:56:09.120 --> 00:56:11.120
I don't believe so.

00:56:11.120 --> 00:56:12.120
Let's see here.

00:56:12.120 --> 00:56:13.120
Yep, there we go.

00:56:13.120 --> 00:56:14.120
Yeah.

00:56:14.120 --> 00:56:15.120
All right.

00:56:15.120 --> 00:56:16.120
All right.

00:56:16.120 --> 00:56:17.120
Everything's gone now.

00:56:17.120 --> 00:56:21.120
So now let's redeploy it, but let's adjust the values.yaml file to enable

00:56:21.120 --> 00:56:23.120
ingress within the .

00:56:23.120 --> 00:56:24.120
VEM.

00:56:24.120 --> 00:56:27.120
We have to CD into the folder.

00:56:27.120 --> 00:56:29.120
And then the values.

00:56:29.120 --> 00:56:30.120
YAML.

00:56:30.120 --> 00:56:31.120
And scroll down the ingress.

00:56:31.120 --> 00:56:35.120
We'll change it to true and we'll give it a host of nginx.

00:56:35.120 --> 00:56:36.120
example.

00:56:36.120 --> 00:56:39.120
So we'll change enabled to true.

00:56:39.120 --> 00:56:43.820
And then the host will be nginx.

00:56:43.820 --> 00:56:48.120
because that's the DNS entry that we have in our Ubuntu.
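In the default helm create values layout, that ingress section ends up looking roughly like this; the hostname is the one assumed from this session's DNS entry:

    ingress:
      enabled: true
      className: ""
      hosts:
        - host: nginx.example
          paths:
            - path: /
              pathType: ImplementationSpecific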

00:56:48.120 --> 00:56:51.120
All right, that looks pretty good.

00:56:51.120 --> 00:56:54.120
You can set resources, you can do whatever you want,

00:56:54.120 --> 00:56:58.120
but this will at least enable our ingress to pop up,

00:56:58.120 --> 00:56:59.120
I think.

00:56:59.120 --> 00:57:01.120
We'll see if there are any bugs.

00:57:01.120 --> 00:57:05.120
So this chart actually didn't work for several years.

00:57:07.120 --> 00:57:08.120
It was frustrating

00:57:08.120 --> 00:57:10.120
for individuals trying to learn Helm.

00:57:10.120 --> 00:57:14.120
They said, you know, even the tutorials don't work.

00:57:14.120 --> 00:57:19.120
And I said, yeah, there's a bunch of errors in the tutorial Helm chart.

00:57:19.120 --> 00:57:24.120
I couldn't figure out if that was by design or if they just hadn't maintained it for several years.

00:57:24.120 --> 00:57:29.690
All right, so now, we're going to redeploy.

00:57:29.690 --> 00:57:34.690
Go back to, yeah, there you go.

00:57:34.690 --> 00:57:36.690
Yeah, test that.

00:57:36.690 --> 00:57:38.690
And, yeah.

00:57:38.690 --> 00:58:10.030
Yeah. Oh, look at that. Yeah. So it still has a mistake. Yeah, it still has a mistake because, well, okay, they could have done that better. They could have provided that information to you the first time instead of having to run it yourself. Oh, well, maybe it's a little more comfortable. So let's try it. I don't think it's going to work. Did it work? No, it won't connect. Oh, yeah, take out the S.

00:58:10.190 --> 00:58:10.750
Let's see what happened.

00:58:10.770 --> 00:58:10.950
Yeah.

00:58:11.410 --> 00:58:11.690
All right.

00:58:11.690 --> 00:58:16.070
So now let's ensure a fresh Minikube environment.

00:58:17.290 --> 00:58:19.970
We're going to install a CNI-free cluster.

00:58:21.190 --> 00:58:23.570
Neal pinged me a little while ago, and he said,

00:58:23.810 --> 00:58:27.550
have him fill out the form from 4 p.m. to 4.15.

00:58:27.850 --> 00:58:28.350
I'm like, okay.

00:58:31.260 --> 00:58:32.200
Yeah, Eastern Time.

00:58:32.260 --> 00:58:33.160
Yeah, he says, do it.

00:58:33.940 --> 00:58:36.700
Yeah, at the end, fill of 15 minutes afterward.

00:58:37.220 --> 00:58:37.660
Okay.

00:58:37.840 --> 00:58:38.880
Yeah, there was your disconnect.

00:58:39.800 --> 00:58:41.500
I think you have that one figured out.

00:58:42.300 --> 00:58:43.300
Right.

00:58:43.300 --> 00:58:46.300
And oh, coming up.

00:58:46.300 --> 00:58:47.300
There's no network.

00:58:47.300 --> 00:58:48.300
There's no CNI.

00:58:48.300 --> 00:58:49.300
Yep.

00:58:49.300 --> 00:58:51.300
So CoreDNS isn't going to work.

00:58:51.300 --> 00:58:53.300
It doesn't have anything to connect to.

00:58:53.300 --> 00:58:54.300
Okay.

00:58:54.300 --> 00:58:58.300
So now we're going to connect our Helm instance to an upstream Helm repo.

00:58:58.300 --> 00:59:00.300
We're going to use the Cilium Helm repo.

00:59:00.300 --> 00:59:07.300
The command is helm repo add, then a name, and then the repository URL.

00:59:07.300 --> 00:59:11.300
So the upstream Cilium repo is usually listed on the GitHub,

00:59:11.300 --> 00:59:20.360
on the GitHub README file for that Helm chart repo. And so we're going to add

00:59:20.360 --> 00:59:27.620
Cilium. So helm repo add... note, yeah, so this is actually Cilium, so we're

00:59:27.620 --> 00:59:33.500
gonna, we're gonna install the chart for Cilium. So here's the command: helm repo

00:59:33.500 --> 00:59:40.420
add, it's going to be cilium, and then https://helm.cilium.io.

00:59:40.420 --> 00:59:44.420
So Cilium has their own self-hosted repo.

00:59:44.420 --> 00:59:45.420
Probably get left.

00:59:45.420 --> 00:59:46.420
All right.

00:59:46.420 --> 00:59:48.420
Verify the Helm repo was added.

00:59:48.420 --> 00:59:50.420
It won't be there.
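The two commands, for reference:

    helm repo add cilium https://helm.cilium.io
    helm repo list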

00:59:50.420 --> 00:59:52.420
Alright, now.

00:59:52.420 --> 00:59:57.420
And verify what is available in our Helm repo list for installation.

00:59:57.420 --> 01:00:01.420
So what do we have available to us if you want to install something?

01:00:01.420 --> 01:00:05.420
Now we've got cilium/cilium, cilium/tetragon.

01:00:05.420 --> 01:00:08.420
We've got two Helm charts.

01:00:08.420 --> 01:00:09.420
cilium/cilium

01:00:09.420 --> 01:00:12.780
is 1.17.5 for the chart.

01:00:13.660 --> 01:00:16.560
The app is also 1.17.5 because

01:00:16.560 --> 01:00:20.240
of how they do it: they keep the chart version

01:00:20.240 --> 01:00:21.700
the same as their app version.

01:00:21.700 --> 01:00:25.040
Now actually it's because they want to keep things simple.

01:00:25.040 --> 01:00:29.300
And they're differing from the usual Helm convention,

01:00:29.300 --> 01:00:32.480
where your app version tracks your container

01:00:32.480 --> 01:00:35.120
and your chart version tracks your chart.

01:00:35.120 --> 01:00:39.280
So what they do is they don't push a chart unless

01:00:39.280 --> 01:00:41.200
They have a container change,

01:00:41.200 --> 01:00:43.380
but if they have a mistake in a chart,

01:00:43.380 --> 01:00:45.000
what happens now?

01:00:45.000 --> 01:00:49.140
So in that case, they, I guess,

01:00:49.140 --> 01:00:51.840
delete the chart and re-push the chart.

01:00:54.000 --> 01:00:56.460
Anyway, this is just the Cilium team

01:00:56.460 --> 01:00:57.600
doing their own thing.

01:00:57.600 --> 01:01:00.960
All right, let's get all versions

01:01:00.960 --> 01:01:02.960
that are available to us in our repo,

01:01:02.960 --> 01:01:04.120
now that we're connected to it.

01:01:04.120 --> 01:01:06.480
And this actually goes out to the internet.

01:01:06.480 --> 01:01:11.230
All we did is grab the Cilium listing.

01:01:11.430 --> 01:01:17.030
Scroll up and see all the different versions that are available to install. So if you're running into

01:01:17.030 --> 01:01:22.950
issues and you say, hey, all of these versions don't work but this old one does, we can install

01:01:22.950 --> 01:01:29.410
it. Next, what we need to do is, every time we add something to the repo, every time we get

01:01:29.410 --> 01:01:34.850
ready to install a chart, we need to update the repo. We need to pull in all of the new

01:01:34.850 --> 01:01:40.850
information. So let's helm repo update, and so now we've pulled it in. That's what's available.

01:01:40.850 --> 01:01:42.350
Now we're going to helm repo up.

01:01:42.350 --> 01:01:45.350
So let's say we hadn't run this for two weeks, right?

01:01:45.350 --> 01:01:46.350
And they have a new version.

01:01:46.350 --> 01:01:49.350
We would need to helm repo update.

01:01:49.350 --> 01:01:51.350
So it's happy helming.

01:01:51.350 --> 01:01:52.350
All right.
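For reference, the search and update commands look like this:

    helm search repo cilium --versions   # every chart/app version in the repo index
    helm repo update                     # refresh the local copy of the index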

01:01:52.350 --> 01:01:57.350
Now let's show the values for cilium/cilium. Whoa, it scrolls all the way up.

01:01:57.350 --> 01:01:59.350
The thing is, they don't have line numbers.

01:01:59.350 --> 01:02:10.710
You see just how many lines are in this thing.

01:02:10.710 --> 01:02:13.710
Okay, so here's what we're going to do.

01:02:13.710 --> 01:02:16.710
We're going to cat the values to a values.

01:02:16.710 --> 01:02:17.710
YAML file for editing.

01:02:17.710 --> 01:02:21.010
So we're going to helm show values again.

01:02:21.010 --> 01:02:24.110
So cilium/cilium, and then cat it to values.yaml.

01:02:24.110 --> 01:02:31.140
Yeah, obviously we can only do this once.

01:02:31.140 --> 01:02:43.950
But, and now let's vim the values.
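The command being described is roughly:

    helm show values cilium/cilium > values.yaml
    vim values.yaml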

01:02:43.950 --> 01:02:46.150
Alright, you start at the top.

01:02:46.150 --> 01:02:46.950
That's nice.

01:02:46.950 --> 01:02:48.550
Okay, you don't want to edit this

01:02:48.550 --> 01:02:50.750
because of the way that they have the templating set up.

01:02:50.750 --> 01:02:53.350
So rather than editing this, what you would do

01:02:53.350 --> 01:02:55.750
is you would create your own values.yaml.

01:02:55.750 --> 01:02:57.350
And you would follow the same format.

01:02:57.350 --> 01:02:59.350
Can you see my cursor moving on your screen?

01:02:59.350 --> 01:03:05.110
moving on your screen right now? No. Oh, interesting. All right, let me try.

01:03:05.110 --> 01:03:11.270
Okay, can you see me now? Okay. So when you create your own values file, you want to keep

01:03:11.270 --> 01:03:15.430
the same formatting. So let's say you wanted to change common labels, you would

01:03:15.430 --> 01:03:20.470
create just a values.yaml file, and you might start it with just this right here,

01:03:21.350 --> 01:03:26.710
so commonLabels, and add your, you know, label in there, and make sure you maintain your

01:03:26.710 --> 01:03:33.350
indenting. Um, okay, I'm gonna change you back to view only. I don't want to mess up your, um,

01:03:35.350 --> 01:03:40.390
your terminal there. All right, take a second here and you should be back.
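A minimal override file along those lines, assuming the commonLabels key seen in that values dump; the label itself is hypothetical:

    # my-values.yaml -- only the keys you want to change
    commonLabels:
      team: platform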

01:03:40.390 --> 01:03:42.830
Did I lose you?

01:03:43.070 --> 01:03:44.890
Oh, there you go. Oh, there you go.

01:03:44.950 --> 01:03:45.810
Okay, you're back now.

01:03:45.970 --> 01:03:46.930
Okay, I can see you.

01:03:47.070 --> 01:03:49.830
Okay, so we can kind of go down through.

01:03:50.030 --> 01:03:51.790
It's got a debug mode.

01:03:51.910 --> 01:03:53.470
So if you run into issues with Cilium,

01:03:53.590 --> 01:03:56.290
turn on the debug mode, and it'll help you solve them.

01:03:56.350 --> 01:03:57.810
That's nice, nice features.

01:03:57.970 --> 01:03:58.730
Somebody thought ahead.

01:04:02.290 --> 01:04:03.890
Don't need upgrade compatibility,

01:04:03.890 --> 01:04:07.810
but there was a time where you needed upgrade compatibility.

01:04:09.130 --> 01:04:10.170
I don't need that today.

01:04:10.390 --> 01:04:19.390
Let's see, scroll on down, and it lets you turn on debug for specific containers.

01:04:19.390 --> 01:04:23.390
It's got RBAC controls, and secrets.

01:04:23.390 --> 01:04:28.390
Well, we would use secrets with Cilium, but we're not using it now.

01:04:28.390 --> 01:04:32.390
Let's see here, configure iptables, no.

01:04:32.390 --> 01:04:35.390
And scroll on down to the bottom.

01:04:35.390 --> 01:04:39.390
Cluster. See, we have cluster in there, and then... keep scrolling on down.

01:04:39.390 --> 01:05:09.230
Scroll on down, a lot of it is at the bottom. Okay, there's Hubble relay. So that's going to be your Hubble relay, create, true. UI, create, true. All right, let's take this thing and install it. Let's exit out of this. Okay, this is going to be a little different, because remember, we want version 1.17.5. That's what we catted out. So we're going to do helm install cilium, cilium slash cilium.

01:05:09.390 --> 01:05:11.810
because remember that's what it's called,

01:05:12.370 --> 01:05:14.690
we're going to call it cilium on our machine,

01:05:15.350 --> 01:05:18.050
but cilium/cilium is what it's called inside the repo,

01:05:19.010 --> 01:05:22.290
and then minus f for the file, values.yaml,

01:05:22.450 --> 01:05:23.250
because that's our values.

01:05:23.390 --> 01:05:23.790
YAML,

01:05:24.490 --> 01:05:27.470
we're going to tell it we want version 1.17.5,

01:05:28.190 --> 01:05:30.950
and then the namespace is going to be kube-system,

01:05:31.490 --> 01:05:35.330
just to make sure it can communicate with DNS proxy.

01:05:36.410 --> 01:05:39.370
to eliminate any problems on the first setup. But who knows,

01:05:39.390 --> 01:05:44.110
because it's Minikube, and it's your version of Minikube versus my version.
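Putting that together, the install command is, as a sketch:

    helm install cilium cilium/cilium -f values.yaml --version 1.17.5 -n kube-system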

01:05:44.110 --> 01:05:45.310
All right, let's check for the pods.

01:05:45.310 --> 01:05:47.790
Creating, CrashLoopBackOff.

01:05:47.790 --> 01:05:51.630
Oh, CoreDNS, because it doesn't have anything to connect to.

01:05:51.630 --> 01:05:55.790
We don't have a CNI, so it says, hey, I don't have a network address.

01:05:57.390 --> 01:05:59.790
So Envoy's up. What else is that? Nothing else.

01:06:00.990 --> 01:06:02.430
Yep, because we only have one node.

01:06:02.430 --> 01:06:07.710
So operator, remember, that's a DaemonSet. Is that a DaemonSet? Let's check. Let's check real quick.

01:06:07.710 --> 01:06:10.670
Oh, let's do, yeah, let's get DaemonSets and see if it's listed.

01:06:10.670 --> 01:06:14.670
Cilium is a DaemonSet. Interesting. Well, let's see if the operator's up yet.

01:06:16.430 --> 01:06:18.990
I wonder why it's pending. Yeah, let's look at it and see.

01:06:18.990 --> 01:06:23.630
No, the CoreDNS is running now. So we got a network. So we got a container

01:06:23.630 --> 01:06:28.430
networking interface now. Oh, that's on the same port. One node didn't have

01:06:28.430 --> 01:06:33.390
free ports for the requested pod port. So it's trying to use a specific port.

01:06:37.710 --> 01:06:40.510
9234 would be my guess.

01:06:40.510 --> 01:06:42.230
Could be 9963.

01:06:42.230 --> 01:06:44.650
I see a bunch of ports listed there.

01:06:44.650 --> 01:06:47.970
So it doesn't have enough free ports,

01:06:47.970 --> 01:06:50.210
so it can't install both on the same node.

01:06:50.210 --> 01:06:51.110
But if we had two nodes,

01:06:51.110 --> 01:06:52.270
they could install it just fine,

01:06:52.270 --> 01:06:55.870
because it would automatically bump it to the next node.

01:06:56.790 --> 01:06:57.510
Or should.

01:06:59.330 --> 01:07:00.310
So you could fix that.

01:07:00.310 --> 01:07:03.110
But this gives you an idea of, you know,

01:07:03.110 --> 01:07:06.470
how to spin up Cilium; it's a very simplified version.

01:07:06.470 --> 01:07:13.990
You can feed it a VIP from the kube-vip load balancer. If you installed kube-vip on your control plane

01:07:13.990 --> 01:07:22.150
in high availability, then you take your kube-vip VIP and feed that into the Cilium values

01:07:22.150 --> 01:07:30.230
.yaml file. It has a line in there for your VIP, and you give it the VIP. And so like that, you can

01:07:30.230 --> 01:07:35.590
create your own values file, you can install your own Helm chart. And now

01:07:36.470 --> 01:07:37.490
We're almost out of time.

01:07:37.650 --> 01:07:42.530
We have five minutes left, but if you're done with that, we can...

01:07:42.530 --> 01:07:43.670
Oh, let's look at the deployment.

01:07:44.270 --> 01:07:45.470
Let's check our deployment again.

01:07:45.950 --> 01:07:46.450
Helm.

01:07:46.530 --> 01:07:47.870
Let's see what we have installed with.

01:07:49.210 --> 01:07:50.090
So it's a helm.

01:07:52.230 --> 01:07:52.790
Let's see.

01:07:52.910 --> 01:07:52.950
Let's see.

01:07:52.950 --> 01:07:53.910
Let's let up for you.

01:07:54.010 --> 01:07:57.030
LIS, when I say, I'd have to go back 10 charts.

01:07:57.230 --> 01:07:57.430
All right.

01:08:01.060 --> 01:08:01.340
Yeah.

01:08:02.040 --> 01:08:04.580
So, you can see it's deployed.

01:08:05.700 --> 01:08:08.080
It's telling us the app version,

01:08:08.100 --> 01:08:14.640
the chart version, when it was updated. So you can change your values file, and then instead of

01:08:14.640 --> 01:08:22.320
helm install, you can do helm upgrade with a similar command, and it will apply the values

01:08:22.320 --> 01:08:28.900
file, your new values file, to it, and it will restart it. So you could install Cilium and then

01:08:28.900 --> 01:08:37.220
install kube-vip and then get your VIP and feed it to Cilium and restart Cilium. Again, chicken and egg,

01:08:37.220 --> 01:08:40.980
and we discussed that earlier.

01:08:40.980 --> 01:08:42.280
Yeah, so there's a sequence here,

01:08:42.280 --> 01:08:44.900
and you just update Cilium to give it the VIP,

01:08:44.900 --> 01:08:48.020
because it's going to install before the VIP exists,

01:08:48.020 --> 01:08:51.660
because you need a CNI in place first.
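A hedged sketch of that sequence, assuming the Cilium values keys for pointing at the API server VIP are k8sServiceHost/k8sServicePort; the VIP shown is hypothetical:

    # vip-values.yaml
    k8sServiceHost: 192.168.49.100   # hypothetical kube-vip VIP
    k8sServicePort: 6443

    # then re-apply:
    helm upgrade cilium cilium/cilium -f values.yaml -f vip-values.yaml \
      --version 1.17.5 -n kube-system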

01:08:51.660 --> 01:08:55.180
So there you go. So it looks like we don't have any errors,

01:08:55.180 --> 01:08:56.400
but let's check real quick,

01:08:56.400 --> 01:08:58.320
get all minus n kube-system.

01:08:58.320 --> 01:09:01.840
Don't think we have any errors, but yeah,

01:09:01.840 --> 01:09:03.400
Cilium operator.

01:09:03.400 --> 01:09:06.900
Yeah, so because we don't have the same ports available

01:09:06.900 --> 01:09:11.780
on the same node, it's one of one; it would need two nodes to do that.

01:09:11.780 --> 01:09:16.580
Because of how it's set, now, you could set the Cilium operator to one in the values.

01:09:16.580 --> 01:09:21.300
YAML, and then reapply that with helm upgrade,

01:09:21.300 --> 01:09:27.850
and that error would go away and you would just have one of one for your deployment.
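As a sketch, that override and re-apply would be, assuming the operator replica count lives under operator.replicas in the Cilium values:

    # operator-values.yaml
    operator:
      replicas: 1

    # then:
    helm upgrade cilium cilium/cilium -f values.yaml -f operator-values.yaml \
      --version 1.17.5 -n kube-system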

01:09:27.850 --> 01:09:30.010
Yes, you could add a worker node.

01:09:30.010 --> 01:09:34.010
I wouldn't add a control plane node, because the problem with another control plane is

01:09:34.010 --> 01:09:37.850
they won't be highly available, because you don't have kube-vip.

01:09:37.850 --> 01:09:42.370
So I would do a worker node and see if it forces that over.

01:09:42.670 --> 01:09:43.210
But yeah, definitely.

01:09:43.370 --> 01:09:45.250
I mean, you could spend a lot of time playing around with this

01:09:45.250 --> 01:09:48.690
and get to know Cilium and the nuances of Minikube.

01:09:49.050 --> 01:09:51.750
And then I would do kube-vip with the chart.

01:09:52.050 --> 01:09:53.670
I wouldn't mess with the Minikube

01:09:53.970 --> 01:09:58.010
instance. I would see if you can get kube-vip to run with a chart,

01:09:58.410 --> 01:10:01.650
get it to feed you the VIP, and then feed Cilium the VIP,

01:10:01.890 --> 01:10:03.070
and then reinstall Cilium.

01:10:04.470 --> 01:10:07.130
Then you would have a high availability cluster.

01:10:07.130 --> 01:10:10.890
But you'd have to create three of everything for Cilium, so you need three Cilium operators.

01:10:15.770 --> 01:10:19.770
And then when you do your Hubble relay and all of that, you need three of everything.

01:10:20.490 --> 01:10:25.050
So as you go in and you start adding in all of the other Cilium pieces that were in that values

01:10:25.050 --> 01:10:29.690
.yaml file, you'd want to make sure that you did high availability on all of them,

01:10:29.690 --> 01:10:32.090
and then you would force them all to the control plane nodes

01:10:34.010 --> 01:10:36.970
so that they run on the control plane because that's part of the networking.

01:10:37.130 --> 01:10:42.130
All right, we can remove this.

01:10:42.130 --> 01:10:45.130
So we can now immediately look at your pods.

01:10:45.130 --> 01:10:48.130
Scroll up and run the get command again.

01:10:48.130 --> 01:10:49.130
Oh yeah, they're already gone.

01:10:49.130 --> 01:10:51.130
Yeah, they're already gone.

01:10:51.130 --> 01:10:52.130
Yeah.

01:10:52.130 --> 01:10:55.130
That shows you how quick Helm was.

01:10:55.130 --> 01:10:57.130
It knocked them all out immediately.

01:10:57.130 --> 01:10:59.130
All right.

01:10:59.130 --> 01:11:01.130
So let's recap.

01:11:01.130 --> 01:11:02.130
If we had time,

01:11:02.130 --> 01:11:04.130
we were going to do a few more.

01:11:04.130 --> 01:11:06.130
Longhorn would have been a fun one.

01:11:06.130 --> 01:11:11.130
All right, so we learned how Helm templating works,

01:11:11.130 --> 01:11:14.130
how the Helm chart is structured,

01:11:14.130 --> 01:11:16.130
how to create a Helm chart,

01:11:16.130 --> 01:11:19.130
how to add an upstream Helm repo,

01:11:19.130 --> 01:11:23.130
how to view all repos that are in your Helm instance,

01:11:23.130 --> 01:11:28.130
how to view all versions in a Helm repository,

01:11:28.130 --> 01:11:31.130
how to install a Helm chart,

01:11:31.130 --> 01:11:35.130
how to modify a values.yaml file,

01:11:35.130 --> 01:11:42.130
catting it out and seeing what you need to do to create your own values.yaml file.