.. _controller-lifecycle:

********************
Controller Lifecycle
********************

Bowtie Controllers are designed to run as fleet infrastructure across their whole lifecycle. Provisioning, updates, certificate renewal, health monitoring, and recovery are automated or scripted, so a team can operate one Controller or a fleet of them with the same tooling. This page summarizes how each phase works and links to the detailed procedures.

.. list-table:: Lifecycle at a glance
   :header-rows: 1

   * - Phase
     - How it is handled
     - Details
   * - Provisioning
     - Pre-built cloud and hypervisor images, declarative cloud-init bootstrap, and Terraform.
     - :doc:`setup-controller`, :ref:`terraform`
   * - Updates
     - Scheduled, unattended updates with per-Controller or fleet-wide version strategies.
     - :ref:`version strategies `, :ref:`unattended-updates`
   * - Certificate renewal
     - Automatic TLS issuance and renewal through ACME.
     - :doc:`setup-controller`
   * - Health monitoring
     - Built-in Grafana dashboards, Prometheus metrics, and one-click diagnostics.
     - :ref:`controller-observability`
   * - Recovery
     - No single point of failure, plus encrypted backup and restore.
     - :ref:`backup-and-restore`, :doc:`ha-controller`

Provisioning
************

Bowtie publishes Controller images for the major public clouds, including AWS GovCloud and Azure Government, and for on-premises virtualization. Every image boots from cloud-init, so a Controller can configure itself with no manual steps.

You can seed all of the following through cloud-init, which lets a Controller come up ready to serve:

- The hostname and public endpoint, along with bring-your-own TLS certificates if you do not want automatic issuance.
- The first administrator account, so no one has to complete the setup wizard by hand. The ``skip-gui-init`` file bypasses the wizard entirely.
- Single sign-on connectors.
- Cluster membership, by pointing a new Controller at an existing one with a shared key and a common site identifier. No primary node exists, so Controllers can be added or replaced in any order.

Devices can be pre-approved by serial number, and Controllers can be set to admit new cluster peers automatically when they present a valid key, which removes the last manual step from scaling out.

For infrastructure as code, build the Controller virtual machine with your normal cloud provider Terraform and the cloud-init described above, then manage in-product objects such as sites, resources, and DNS through the :ref:`Bowtie Terraform provider `.

See :doc:`setup-controller` for the full platform list and :ref:`terraform` for examples.

Updates
*******

Each Controller follows a version strategy that you set per Controller or as an organization-wide default:

- *Manual* holds the Controller on its current version.
- *Fixed Version* pins a specific version.
- *Check for Latest at time* and *Check for Latest periodically* move to newer versions automatically on a schedule.

Fleet-wide controls keep updates orderly. A stagger setting spreads update times across the fleet so Controllers do not all restart at once, a minimum age holds new releases back for a set number of days before they are adopted, and an opt-in allows preview releases.

Updates run unattended on the configured schedule and activate in place, restarting only the services that changed rather than rebooting the host. Because clients keep a connection to every Controller and move to a healthy one on their own (see :doc:`ha-controller`), the brief window while one Controller updates is largely transparent to users in a deployment with more than one Controller. This is fast fail-over rather than session preservation, so plan updates for a maintenance window if individual long-lived sessions must not be interrupted.

See :ref:`version strategies ` and :ref:`unattended-updates` for the settings and the schedule format.

Certificate Renewal
*******************

Controllers obtain and renew TLS certificates automatically through ACME. A Controller with a public address can get a working hostname and certificate with no manual DNS or certificate steps through the ``bowtie.direct`` helper. If you prefer to manage certificates yourself, you can supply your own certificate and key through cloud-init at provisioning time.

Backup encryption keys and other secret values are managed through the Control Plane secrets page. See :ref:`backup-encryption`.

Health and Monitoring
*********************

Every Controller ships with a self-hosted observability stack, so health data is available without extra setup:

- Pre-built Grafana dashboards cover host resources, the reverse proxy, tunnel traffic, and the policy enforcement path. They are served behind single sign-on at ``/grafana``.
- Prometheus collects metrics locally. You can pull them into your own monitoring through federation, or forward logs and metrics to the platform of your choice. See :ref:`exporting-telemetry`.
- The Control Plane shows the status of each Controller, the Bowtie service restarts itself automatically if it becomes unhealthy, and a one-click support bundle captures host, service, and metric state for diagnosis.

See :ref:`controller-observability` for the full observability documentation.

Recovery and Resilience
***********************

Bowtie is built to tolerate the loss of a Controller without losing the network:

- **No single point of failure.** Every Controller holds a complete copy of the configuration and keeps serving clients during a partition. A Controller that rejoins the cluster synchronizes automatically. Running more than one Controller is the recommended configuration. See :doc:`ha-controller`.
- **Automatic partition recovery.** Controllers recover from prolonged network partitions on their own as connectivity returns, with no per-Controller configuration required.
- **Backup and restore.** Controllers take encrypted, scheduled backups to a restic-compatible repository such as S3 or a local directory, with configurable retention. You can restore interactively, or unattended at boot by supplying a restore file, which makes a Controller straightforward to rebuild and rejoin. See :ref:`backup-and-restore` and :ref:`automated-restore`.
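
To give a feel for what "configurable retention" can mean for scheduled backups, the following is a minimal sketch of a keep-daily style policy, similar in spirit to the ``--keep-daily`` option of ``restic forget``. It is an illustration only: the function name ``snapshots_to_keep`` and the ``keep_daily`` parameter are hypothetical and are not Bowtie settings. The retention options a Controller actually exposes are described in :ref:`backup-and-restore`.

```python
from datetime import datetime


def snapshots_to_keep(timestamps, keep_daily):
    """Illustrative retention rule (an assumption, not Bowtie's implementation).

    For the most recent `keep_daily` calendar days that have at least one
    snapshot, keep only the newest snapshot taken on each of those days.
    """
    newest_per_day = {}
    for ts in sorted(timestamps):
        # Iterating in ascending order means later snapshots on the same
        # day overwrite earlier ones, leaving the newest per day.
        newest_per_day[ts.date()] = ts

    # Pick the most recent days first, then return results oldest-first.
    recent_days = sorted(newest_per_day, reverse=True)[:keep_daily]
    return sorted(newest_per_day[day] for day in recent_days)


# Hypothetical snapshot times: two on Jan 1, one on Jan 2, two on Jan 3.
snapshots = [
    datetime(2024, 1, 1, 2, 0),
    datetime(2024, 1, 1, 14, 0),
    datetime(2024, 1, 2, 2, 0),
    datetime(2024, 1, 3, 2, 0),
    datetime(2024, 1, 3, 20, 0),
]

for kept in snapshots_to_keep(snapshots, keep_daily=2):
    print(kept.isoformat())
```

With ``keep_daily=2`` the sketch keeps the 02:00 snapshot from January 2 and the 20:00 snapshot from January 3, and everything else becomes eligible for removal. A policy of this shape bounds repository growth while still leaving a recent restore point for each day, which is the trade-off a retention setting is meant to control.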