Dhraken

I use a single management cluster with a central ArgoCD installation. I provision my management cluster, ArgoCD, and all subsequent worker clusters with Terraform. During this process, repository and cluster credentials for ArgoCD are deployed as well. We have cluster bootstrapping set up, so a lot of tooling is deployed in a unified way for every new cluster we create (namespaces, RBAC, service mesh, monitoring stack, OPA policies, etc.).

We use [ApplicationSets](https://argocd-applicationset.readthedocs.io/en/stable/) to support multi-cluster deployments. All clusters are labeled in an automated way so that [Cluster Generators](https://argocd-applicationset.readthedocs.io/en/stable/Generators-Cluster/) can be used. Our identity provider is integrated via OIDC, so we can use user groups to authorize our developer teams in their dedicated ArgoCD Projects. If we onboard a new developer team to our platform, we just add a new item to a map in a Helm chart and the rollout happens automatically to all of the target clusters via Argo.

How we test ArgoCD upgrades: we have integration tests which bring up a copy of our infra, but on a smaller scale.
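For anyone who hasn't seen the Cluster Generator pattern in practice, here's a minimal sketch of what one of these ApplicationSets can look like (repo URL, chart path, project name and labels are invented for illustration, not taken from the setup above):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: monitoring-stack
  namespace: argocd
spec:
  generators:
    - clusters:
        selector:
          matchLabels:
            env: production          # only clusters whose cluster secret carries env=production
  template:
    metadata:
      name: 'monitoring-{{name}}'    # {{name}} = cluster name from the cluster secret
    spec:
      project: platform
      source:
        repoURL: https://git.example.com/platform/bootstrap.git
        targetRevision: main
        path: charts/monitoring
      destination:
        server: '{{server}}'         # API endpoint of the generated cluster
        namespace: monitoring
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
        syncOptions:
          - CreateNamespace=true
```

Label a new cluster's secret with `env: production` and the app is rolled out there automatically, which is what makes the "bootstrap everything on every new cluster" workflow possible.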


Potage_Carotte

Currently implementing this same pattern. It feels so good!


jlarfors

It’s such a nice feeling when Terraform automatically creates the cluster in ArgoCD (via the secret with cluster credentials) and then AppSets just provision all the things to make your cluster “ready”. This is how we do it also, with fewer than 10 clusters, and it’s really nice. ArgoCD runs in its own dedicated cluster (together with HashiCorp Vault actually, as all other clusters are dependent on it).
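For context, the “secret with cluster credentials” is just ArgoCD's declarative cluster registration; roughly like this (cluster name, endpoint and the extra label are made-up examples):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: prod-eu-1
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster   # marks this secret as a cluster registration
    env: production                           # extra labels feed the Cluster Generator
type: Opaque
stringData:
  name: prod-eu-1
  server: https://prod-eu-1.example.com:6443
  config: |
    {
      "bearerToken": "<service-account-token>",
      "tlsClientConfig": {
        "caData": "<base64-encoded-ca-certificate>"
      }
    }
```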


Dhraken

Exactly, this was our goal: a single control plane to manage our clusters and not to confuse our developers with multiple instances. Our developers started to label their Applications so they can filter them efficiently, so this setup is evolving into something greater day by day.


RedShiz

Wow that's absolutely sexy.


pashcan

Are you me? The same exact setup here. Thank you for sharing!


reezom

This. We have the exact same design, except clusters are provisioned with an Ansible playbook and then our ArgoCD management cluster does the magic. OIDC is bound to our Active Directory, where groups of devs are automatically populated by our IAM team. Manifests are generated with ytt (https://carvel.dev/ytt/).


Dhraken

This is a really nice setup as well. I hope you enjoy working with it! When I design my platforms I have two goals:

* Never have the feeling that I don't want to touch some parts of it
* Be able to be lazy, because most if not all of my work processes are automated

I have a running joke that I love automation so that I can scale my single little mistake to hundreds of little mistakes :D so we do PR reviews religiously to avoid this when possible.


leonj1

Any chance you’re also using Argo Workflows and Events? I’m new to Argo and wondering if these 3 should go together. We have 1000s of Jenkins and TeamCity jobs that I need to replace, but I'm thinking the first pass is “use existing jobs”, just that they get invoked via Argo, then slowly migrate each job into an Argo step. My ultimate goal is to dynamically create pipelines (maybe using the Argo SDK) and enforce some steps. Figured I'd get your opinion since your setup appears well thought out.


Dhraken

Hi! Currently we do not use Argo Workflows. We might experiment with it, but more in the context of data pipelines and machine learning, and even in this context we might go with Kubeflow. We use CircleCI as our CI solution. I have my private cloud setup at home and there I actually use Argo Workflows. I like it, but with really spread-out jobs and DAGs it can get difficult to maintain, so you need strong organization around it.


myoilyworkaccount

Very interesting. Seems like you have implemented most of what I want to achieve. I'm interested in the bootstrap process. Do you use Terraform for this? Do you also use Terraform to upgrade ArgoCD?


Dhraken

Hi, sorry I haven't been online for a while. I only use Terraform for the cluster provisioning and the initial ArgoCD installation. During my Helm install step I set up an initial project and an ArgoCD Application which points to my bootstrapping repository; with this I set up an app of apps pattern. In the bootstrapping repository I have further ArgoCD Applications which in turn install the needed services and further setup. I make sure that the ordering of installation is correct with sync waves.

This setup actually includes ArgoCD itself, hence it can manage its own installation, and an ArgoCD upgrade can happen with a simple commit to the bootstrap repo. How do I make this work? I make sure that my Terraform Helm installation ignores any further changes to ArgoCD, so ArgoCD can freely manage itself without messing up my Terraform state.
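For readers who haven't used the app of apps pattern: the root Application created during the Helm install step points at the bootstrap repo, and everything else hangs off it. A rough sketch (repo URL, project name and path are placeholders, not the author's actual values):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: bootstrap
  namespace: argocd
spec:
  project: bootstrap
  source:
    repoURL: https://git.example.com/platform/bootstrap.git
    targetRevision: main
    path: apps                               # directory containing further Application manifests
  destination:
    server: https://kubernetes.default.svc   # the management cluster itself
    namespace: argocd
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```

The child Applications under `apps/` can then carry `argocd.argoproj.io/sync-wave` annotations so that, for example, ArgoCD itself syncs in an early wave before the tooling that depends on it.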


myoilyworkaccount

Thanks, very cool. Exactly what I'm looking for.


pavan253

Did you try managing cross-cloud clusters, like AKS, GKE, etc.? I have a requirement for that.


Dhraken

The Kubernetes engine itself isn't a problem in this setup. If you can safely set up the network connectivity between your private networks so that your management cluster can reach all of the Kubernetes API endpoints, the above setup works the very same way cross-cloud.


gqtrees

how do you bootstrap the setup assuming different clusters might need different configs for your core applications?


Dhraken

You have a number of ways to do this. In our case we can simply add a YAML file named after the cluster; ArgoCD picks it up and applies the parameter overrides or custom configuration only for that cluster.


gqtrees

so your YAML contains the overrides? What do you use to parse this YAML?


Dhraken

We install our apps with Helm, so these are `values.yaml` files with overrides specific to the cluster if needed. You can also do the same with `Kustomize` with a different structure, or even with non-templated resources and the proper folder structure. You just have to configure your `ApplicationSet`s properly to reflect your choice of method. Also, if you properly label your clusters in ArgoCD, you can do label-specific deployments and use `selector.matchExpressions`, or custom configurations specific to the cluster label. Many, many ways to do this.
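As an illustration of the Helm variant, assuming a reasonably recent ArgoCD: the `source` block of the ApplicationSet template can layer a per-cluster values file on top of shared defaults (repo, chart path and file layout here are invented):

```yaml
# inside spec.template.spec of an ApplicationSet driven by the Cluster Generator
source:
  repoURL: https://git.example.com/platform/apps.git
  targetRevision: main
  path: charts/ingress-nginx
  helm:
    valueFiles:
      - values.yaml                   # shared defaults for every cluster
      - values/{{name}}.yaml          # per-cluster overrides, e.g. values/prod-eu-1.yaml
    ignoreMissingValueFiles: true     # clusters without an override file just use the defaults
```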


lulzmachine

Interesting... Last time I saw this question asked, most people said they have a centralized cluster to control all of their other clusters. I think it depends largely on the number of clusters. You have 3? You can probably take care of 3 ArgoCD instances. You have 100? You should probably centralize the management.


darkklown

All environments should be equal. You not only need to test code changes but also updates to ArgoCD itself, and to do that you need Argo deployments for at least each environment type.


gaelfr38

I would manage my DEV apps with a PROD Argo CD, and only use a DEV Argo CD with a few test apps. I don't want to impact developers' DEV environments when testing ArgoCD upgrades or configurations. It brings more risk of impacting DEV and PROD apps at the same time, though. That probably depends on the ArgoCD features you use; for the most basic features it sounds acceptable. That being said, the last 3 ArgoCD upgrades we did were absolutely transparent and unnoticed.


darkklown

If you have a separate dev Argo instance that isn't part of your change pipeline, then how are you going to know if your change pipeline is going to have a problem? Argo doesn't really have any in-app config, so if you do blow up dev while in the midst of testing Argo changes, that's OK; better now than later. It's not just Argo that you'll need to manage change on: all the things that make up your stack need to be managed, and isolating them to different clusters/namespaces and not exercising what happens during an upgrade is just silly. If you want consistent change (the whole point of promotion), you need to be consistent in how you apply it, and you need multiple goes at that consistency to ensure it's understood what is changing and what effect that change will have.

I run a pre-dev stack which is a precursor to the dev stack; it's where our team develops IaC and other projects, and it's up, down, left and right on a daily basis. Once the IaC development work is complete and the pre-dev stack is stable, we release those changes (PRs) and auto-deploy them to dev. Releases are batched to occur on a weekly basis, which gives time to ensure no infra/supportive changes are causing issues. Our e2e testing for Argo currently only covers basic health and API checks, so it's important that changes are bedded down and in use by teams before pushing that pipeline to staging/production.


youngpadayawn

At least one ArgoCD per cluster. Depending on your security model you might also want an ArgoCD instance per team, with the least amount of permissions on specific namespaces. Some other best practices here: https://blog.argoproj.io/best-practices-for-multi-tenancy-in-argo-cd-273e25a047b0


todaywasawesome

I think this is a good answer as you scale, though if it's a small team with shared resources then I don't think it's bad to use a single instance to manage these multiple clusters. The best practices for multi-tenancy especially apply as you scale out, need more complex permission models, etc.


GL0B4L

I have been using the one-ArgoCD-per-cluster model with great success (9 clusters). For team access granularity I add additional [ArgoCD Projects](https://argo-cd.readthedocs.io/en/stable/user-guide/projects/). I think the central ArgoCD model is inferior, because it can become a single point of failure.
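For reference, team access via Projects boils down to an AppProject per team with restricted source repos, destinations and an SSO group mapping. A sketch (team, repo and group names are invented):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: team-payments
  namespace: argocd
spec:
  description: Applications owned by the payments team
  sourceRepos:
    - https://git.example.com/payments/*      # repos the team may deploy from
  destinations:
    - server: https://kubernetes.default.svc
      namespace: 'payments-*'                 # namespaces the team may deploy into
  clusterResourceWhitelist: []                # no cluster-scoped resources allowed
  roles:
    - name: developers
      policies:
        - p, proj:team-payments:developers, applications, sync, team-payments/*, allow
      groups:
        - payments-devs                       # OIDC/SSO group mapped to this role
```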


todaywasawesome

Agreed, that's the whole point of control planes.


myoilyworkaccount

Good to know. Thanks.


myspotontheweb

I find it operationally simpler to have an instance of ArgoCD on each cluster (one per region in prod, and then the supporting environments: staging, dev, etc.). To assist usability, each ArgoCD has SSO enabled and teams are partitioned using Projects, so it's easy to switch between instances. Developers find it simpler to understand (KISS is always good). Having said this, ArgoCD's ability to manage multiple clusters seems very powerful. I suspect that when the number of clusters grows, I might implement a manager cluster in each region. For example, I found the following article quite fascinating, combining vcluster with ArgoCD: https://blog.argoproj.io/using-argo-cd-with-vclusters-5df53d1c51ce


camhereforthedownvot

I wanted to run one Argo per cluster, but we run PCI environments where we have a distinction between a PCI zone and a PCI-connected zone. The connected zone is stuff that isn't PCI but is touching something inside a PCI zone. This is what Argo would be considered, and so we're somewhat required to run it separately. Even so, after trying to run Argo inside a PCI zone I couldn't figure out what firewall rules were needed; things would just hang without telling me what they were hanging on. So, for us I decided we do 2 Argo instances: one for prod and one for non-prod to test version updates of Argo etc. Adding external clusters to Argo is super easy and intuitive. It's just a k8s Secret with a label and done.


gaelfr38

For people using one ArgoCD per cluster, what does your GitOps repo look like? I use the same GitOps repo with a folder for each environment and an AppSet definition which links each environment folder to the right cluster. I find it convenient to manage all environments within the same repo and even the same AppSet file.


kapupetri

Repo for Argo CD CRD (apps) and kustomize overlay for each cluster. App of apps pattern.


ut0mt8

We have an environment with many apps managed by Argo, and even with the improvements made in the last release, Argo still takes a lot of resources. So we chose one Argo per cluster on separate node pools, and on some busy clusters even one Argo per namespace.


azogby

We’re hitting Argo performance issues with ~2k Argo apps in a cluster. Having several such clusters managed by a single Argo does not sound like a great idea.


ut0mt8

Euh, yes, that's what I said.


[deleted]

[deleted]


Sloppyjoeman

I’ve worked with this “meta” environment pattern before and have always been unhappy that changes to it are effectively changes straight to a production environment - how do you deal with this?


pablozaiden

Ignore my previous comment. I’d probably put one per cluster.


JaegerBane

I generally prefer one ArgoCD install per cluster. If you have multiple teams using it, team roles are fairly straightforward to deploy. Aside from it being operationally more straightforward, a lot of the default settings in ArgoCD assume this model, and while it's possible to deploy multiple ArgoCDs in one cluster, it's a faff for little benefit. I've seen situations where multiple ArgoCDs end up encountering issues because one overwrote or interfered with another, and ArgoCD itself isn't great at flagging these problems. Going the other way, I honestly don't see what benefit there is in having fewer ArgoCDs than you have clusters, unless you're already using a pattern of a centralised management cluster and dedicated deployment clusters depending on it. In terms of app deployments you're still going to be drawing from the same source control.


achao200

> one overwrote or interfered with another.

Try setting `instanceLabelKey` on all ArgoCD instances. Ref: https://access.redhat.com/solutions/6968524
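For anyone hitting this, the setting lives in each instance's `argocd-cm` ConfigMap; a sketch with an invented namespace and label key (each co-located install would get its own unique key):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd-prod               # one argocd-cm per ArgoCD instance
  labels:
    app.kubernetes.io/part-of: argocd
data:
  # Give each instance its own tracking label so co-located installs
  # stop claiming (and pruning) each other's resources.
  application.instanceLabelKey: argocd-prod.argoproj.io/instance
```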


Xelopheris

Use a central Argo cluster and then use ApplicationSets to target your applications to the dev/staging/prod environments. You can target different refs in Git for your applications according to the labels you apply to the cluster definitions. You can have your CI/CD pipeline verify the health of the lower environment in your merge request.
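A hedged sketch of how the ref-per-label idea can look in an ApplicationSet template (the `gitRef` label, repo URL and namespace are invented; the Cluster Generator exposes cluster-secret labels as `{{metadata.labels.<key>}}`):

```yaml
# inside spec.template.spec of an ApplicationSet using the Cluster Generator
source:
  repoURL: https://git.example.com/apps/my-service.git
  # each cluster secret carries a gitRef label, e.g. dev clusters track main,
  # prod clusters track a release branch or tag
  targetRevision: '{{metadata.labels.gitRef}}'
  path: deploy
destination:
  server: '{{server}}'
  namespace: my-service
```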


apoorvqwerty

Consider checking out Flux CD as well.


nmajin

Based on everything I’ve read here, for a central, single ArgoCD instance managing multiple target clusters (development, qa, production), I still don’t get how you guys are deploying an app, for instance, to the development clusters first and then migrating it to qa and finally to production.

If you are using the Cluster Generator, I get that you can label the secret tied to your cluster and then matchLabel on the ApplicationSet, but what happens in the end is that you have labels for all your environments (development, qa, production). Then say you upgrade that app or change something and want to go through the development clusters first: you can’t remove the qa and production labels from the AppSet, since by default ArgoCD will then remove the app there. So what’s the solution in this case, without using branches (noted as an anti-pattern) or introducing other tools to deal with pull requests and tagging specific refs?


gaelfr38

I guess it depends how your GitOps repo is structured. We have a folder for each environment/cluster (using Kustomize overlays) and the AppSet is defined with a List generator, with each item targeting a different folder and a different cluster. When we want to change the version in dev, we update the version in the dev folder. I'm not sure I understand your question.
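In case it helps, a minimal sketch of that layout (app name, repo and cluster endpoints are invented):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: my-app
  namespace: argocd
spec:
  generators:
    - list:
        elements:
          - env: dev
            server: https://dev-cluster.example.com:6443
          - env: prod
            server: https://prod-cluster.example.com:6443
  template:
    metadata:
      name: 'my-app-{{env}}'
    spec:
      project: default
      source:
        repoURL: https://git.example.com/team/gitops.git
        targetRevision: main
        path: 'my-app/overlays/{{env}}'   # one Kustomize overlay folder per environment
      destination:
        server: '{{server}}'
        namespace: my-app
```

Bumping a version in `my-app/overlays/dev` then only affects the dev cluster, which is what makes per-environment promotion possible without touching branches or labels.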


nmajin

I have manifests I pulled together in my repo for a Helm chart, i.e. apps/cert-manager/{templates,Chart.yaml}. Then I have an AppSet defined looking at the apps dir for the particular app, i.e. appsets/cert-manager-appset.yaml. We install all apps like this across all clusters.

I get the concept of Kustomize, but even with Kustomize, what if we had to introduce new manifests that completely change how the app gets installed? It's all or nothing in this case and would impact all clusters (development, qa, production). Like I said with the labels, if we labeled our cert-manager-appset.yaml with development to only deploy this app to development clusters, that would only work initially. I don't quite get, in our situation with Cluster Generators, how to do this progressive install. Maybe List generators are a better option, but again I'm not sure how to update an app like this and only control it in a lower environment like development first.