self-hosted/matrix-docker-ansible-deploy

mirror of https://github.com/spantaleev/matrix-docker-ansible-deploy.git synced 2024-12-24 10:27:04 +02:00

Author	SHA1	Message	Date
borisrunakov	acaebfbf67	optional media cache with range requests support (#1759)	2022-04-21 10:31:26 +03:00
Slavi Pantaleev	0364c6c634	Suppress old container cleanup (kill/rm) failures People often report and ask about these "failures". More-so previously, when the `docker kill/rm` output was collected, but it still happens now when people do `systemctl status matrix-something` and notice that it says "FAILURE". Suppressing to avoid further time being wasted on saying "this is expected".	2022-04-11 09:05:33 +03:00
Slavi Pantaleev	86c36523df	Replace ExecStopPost with ExecStop Reverts `b1b4ba501f`, `90c9801c56`, `a3c84f78ca`, .. I haven't really traced it (yet), but on some servers, I'm observing `ansible-playbook ... --tags=start` completing very slowly, waiting to stop services. I can't reproduce this on all Matrix servers I manage. I suspect that either the systemd version is to blame or that some specific service is not responding well to some `docker kill/rm` command. `ExecStop` seems to work great in all cases and it's what we've been using for a very long time, so I'm reverting to that.	2022-02-05 12:13:36 +02:00
Slavi Pantaleev	29bc22a085	Add matrix_nginx_proxy_container_additional_networks Fixes https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/1498	2022-01-10 11:51:57 +02:00
Slavi Pantaleev	b1b4ba501f	Replace ExecStop with ExecStopPost ExecStopPost should allow us to clean up (docker kill + docker rm) even if the ExecStart (docker run ..) command failed, and not just after a graceful service stop was initiated. Source: https://www.freedesktop.org/software/systemd/man/systemd.service.html#ExecStopPost=	2022-01-04 17:27:25 +02:00
Michael	33ec5710d9	0.2.1 revision	2021-02-28 22:21:40 +08:00
Michael	4c882c513b	initial PR	2021-02-20 17:19:17 +08:00
Slavi Pantaleev	512f42aa76	Do not report docker kill/rm attempts as errors These are just defensive cleanup tasks that we run. In the good case, there's nothing to kill or remove, so they trigger an error like this: > Error response from daemon: Cannot kill container: something: No such container: something and: > Error: No such container: something People often ask us if this is a problem, so instead of always having to answer with "no, this is to be expected", we'd rather eliminate it now and make logs cleaner. In the event that: - a container is really stuck and needs cleanup using kill/rm - and cleanup fails, and we fail to report it because of error suppression (`2>/dev/null`) .. we'd still get an error when launching ("container name already in use .."), so it shouldn't be too hard to investigate.	2021-01-27 10:22:46 +02:00
Slavi Pantaleev	1692a28fe4	Work around annoying Docker warning about undefined $HOME > WARNING: Error loading config file: .dockercfg: $HOME is not defined .. which appeared in Docker 20.10.	2021-01-15 00:23:01 +02:00
Slavi Pantaleev	e1690722f7	Replace cronjobs with systemd timers Fixes https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/756 Related to https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/737 I feel like timers are somewhat more complicated and dirty (compared to cronjobs), but they come with these benefits: - log output goes to journald - on newer systemd distros, you can see when the timer fired, when it will fire, etc. - we don't need to rely on cron (reducing our dependencies to just systemd + Docker) Cronjobs work well, but it's one more dependency that needs to be installed. We were even asking people to install it manually (in `docs/prerequisites.md`), which could have gone unnoticed. Once in a while someone says "my SSL certificates didn't renew" and it's likely because they forgot to install a cron daemon. Switching to systemd timers means that installation is simpler and more unified.	2021-01-14 23:35:50 +02:00
Slavi Pantaleev	d08b27784f	Fix systemd services autostart problem with Docker 20.10 The Docker 19.04 -> 20.10 upgrade contains the following change in `/usr/lib/systemd/system/docker.service`: ``` -BindsTo=containerd.service -After=network-online.target firewalld.service containerd.service +After=network-online.target firewalld.service containerd.service multi-user.target -Requires=docker.socket +Requires=docker.socket containerd.service Wants=network-online.target ``` The `multi-user.target` requirement in `After` seems to be in conflict with our `WantedBy=multi-user.target` and `After=docker.service` / `Requires=docker.service` definitions, causing the following error on startup for all of our systemd services: > Job matrix-synapse.service/start deleted to break ordering cycle starting with multi-user.target/start A workaround which appears to work is to add `DefaultDependencies=no` to all of our services.	2020-12-10 11:43:20 +02:00
Slavi Pantaleev	d702e74079	Fix matrix-nginx-proxy static files mounting when SSL retrieval is none Fixup for `12867e9f18`. This shouldn't have been caught in the `if`. Related to https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/734	2020-11-26 18:40:15 +02:00
Slavi Pantaleev	12867e9f18	Do not try to mount /matrix/ssl when matrix_ssl_retrieval_method is 'none' Since the switch from `-v` to `--mount` (in `1fca917ad1`), we've regressed when `matrix_ssl_retrieval_method == 'none'`. In such a case, we don't create `/matrix/ssl` directories at all and shouldn't be trying to mount them into the `matrix-nginx-proxy` container. Previously, with `-v`, Docker would auto-create them, effectively hiding our mistake. Now that `--mount` doesn't do such auto-creation magic, the `matrix-nginx-proxy` container was failing to start. Fixes https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/734	2020-11-26 09:55:26 +02:00
Slavi Pantaleev	1fca917ad1	Replace some -v instances with --mount `-v` magically creates the source destination as a directory, if it doesn't exist already. We'd like to avoid this magic and the potential breakage that it might cause. We'd rather fail while Docker tries to find things to `--mount` than have it automatically create directories and fail anyway, while having contaminated the filesystem. There's a lot more `-v` instances remaining to be fixed later on. This is just some start. Things like `matrix_synapse_container_additional_volumes` and `matrix_nginx_proxy_container_additional_volumes` were not changed to use `--mount`, as options for each one are passed differently (`ro` is `ro`, but `rw` doesn't exist and `slave` is `bind-propagation=slave`). To avoid breaking people's custom volume mounts, we keep it as it is for now. A deficiency with `--mount` is that it lacks the `z` option (SELinux ownership changes), and some of our `-v` instances use that. I'm not sure how supported SELinux is for us right now, but it might be, and breaking that would not be a good idea.	2020-11-24 10:26:05 +02:00
Chris van Dijk	6334f6c1ea	Remove hardcoded command paths in systemd unit files Depending on the distro, common commands like sleep and chown may either be located in /bin or /usr/bin. Systemd added path lookup to ExecStart in v239, allowing only the command name to be put in unit files and not the full path as historically required. At least Ubuntu 18.04 LTS is however still on v237 so we should maintain portability for a while longer.	2020-05-27 23:14:54 +02:00
Slavi Pantaleev	ca3b158d94	Add support to matrix-nginx-proxy to work in HTTP-only mode	2019-12-06 11:53:15 +02:00
Slavi Pantaleev	ae7c8d1524	Use SyslogIdentifier to improve logging Reasoning is the same as for matrix-org/synapse#5023. For us, the journal used to contain `docker` for all services, which is not very helpful when looking at them all together (`journalctl -f`).	2019-05-16 09:43:46 +09:00
Hugues De Keyzer	c451025134	Fix indentation in templates Use Jinja2 lstrip_blocks option in templates to ensure consistent indentation in generated files.	2019-05-07 21:23:35 +02:00
Sylvia van Os	75b1528d13	Add the possibility to pass extra flags to the docker container	2019-04-30 16:35:18 +02:00
Slavi Pantaleev	e645b0e372	Rename matrix_nginx_proxy_data_path to matrix_nginx_proxy_base_path `matrix_nginx_proxy_data_path` has always served as a base path, so we're renaming it to reflect that. Along with this, we're also introducing a new "data path" variable (`matrix_nginx_proxy_data_path`), which is really a data path this time. It's used for storing additional, non-configuration, files related to matrix-nginx-proxy.	2019-03-12 23:01:16 +02:00
Slavi Pantaleev	f6ebd4ce62	Initial work on Synapse 0.99/1.0 preparation	2019-02-05 12:09:46 +02:00
Slavi Pantaleev	96afbbb5af	Allow additional volumes to be mounted into matrix-nginx-proxy Certain use-cases may require that people mount additional files into the matrix-nginx-proxy container. Similarly to how we do it for Synapse, we are introducing a new variable that makes this possible (`matrix_nginx_proxy_container_additional_volumes`). This makes the htpasswd file for Synapse Metrics (introduced in #86, Github Pull Request) to also perform mounting using this new mechanism. Hopefully, for such an "extension", keeping htpasswd file-creation and volume definition in the same place (the tasks file) is better. All other major volumes' mounting mechanism remains the same (explicit mounting).	2019-02-05 11:46:16 +02:00
dhose	87e3deebfd	Enable exposure of Prometheus metrics.	2019-02-01 20:02:11 +01:00
Slavi Pantaleev	0be7b25c64	Make (most) containers run with a read-only filesystem	2019-01-29 18:52:02 +02:00
Slavi Pantaleev	316d653d3e	Drop capabilities in containers We run containers as a non-root user (no effective capabilities). Still, if a setuid binary is available in a container image, it could potentially be used to give the user the default capabilities that the container was started with. For Docker, the default set currently is: - "CAP_CHOWN" - "CAP_DAC_OVERRIDE" - "CAP_FSETID" - "CAP_FOWNER" - "CAP_MKNOD" - "CAP_NET_RAW" - "CAP_SETGID" - "CAP_SETUID" - "CAP_SETFCAP" - "CAP_SETPCAP" - "CAP_NET_BIND_SERVICE" - "CAP_SYS_CHROOT" - "CAP_KILL" - "CAP_AUDIT_WRITE" We'd rather prevent such a potential escalation by dropping ALL capabilities. The problem is nicely explained here: https://github.com/projectatomic/atomic-site/issues/203	2019-01-28 11:22:54 +02:00
Slavi Pantaleev	299a8c4c7c	Make (most) containers start as non-root This makes all containers (except mautrix-telegram and mautrix-whatsapp), start as a non-root user. We do this, because we don't trust some of the images. In any case, we'd rather not trust ALL images and avoid giving `root` access at all. We can't be sure they would drop privileges or what they might do before they do it. Because Postfix doesn't support running as non-root, it had to be replaced by an Exim mail server. The matrix-nginx-proxy nginx container image is patched up (by replacing its main configuration) so that it can work as non-root. It seems like there's no other good image that we can use and that is up-to-date (https://hub.docker.com/r/nginxinc/nginx-unprivileged is outdated). Likewise for riot-web (https://hub.docker.com/r/bubuntux/riot-web/), we patch it up ourselves when starting (replacing the main nginx configuration). Ideally, it would be fixed upstream so we can simplify.	2019-01-27 20:25:13 +02:00
Slavi Pantaleev	c10182e5a6	Make roles more independent of one another With this change, the following roles are now only dependent on the minimal `matrix-base` role: - `matrix-corporal` - `matrix-coturn` - `matrix-mailer` - `matrix-mxisd` - `matrix-postgres` - `matrix-riot-web` - `matrix-synapse` The `matrix-nginx-proxy` role still does too much and remains dependent on the others. Wiring up the various (now-independent) roles happens via a glue variables file (`group_vars/matrix-servers`). It's triggered for all hosts in the `matrix-servers` group. According to Ansible's rules of priority, we have the following chain of inclusion/overriding now: - role defaults (mostly empty or good for independent usage) - playbook glue variables (`group_vars/matrix-servers`) - inventory host variables (`inventory/host_vars/matrix.<your-domain>`) All roles default to enabling their main component (e.g. `matrix_mxisd_enabled: true`, `matrix_riot_web_enabled: true`). Reasoning: if a role is included in a playbook (especially separately, in another playbook), it should "work" by default. Our playbook disables some of those if they are not generally useful (e.g. `matrix_corporal_enabled: false`).	2019-01-16 18:05:48 +02:00
Slavi Pantaleev	51312b8250	Split playbook into multiple roles As suggested in #63 (Github issue), splitting the playbook's logic into multiple roles will be beneficial for maintainability. This patch realizes this split. Still, some components affect others, so the roles are not really independent of one another. For example: - disabling mxisd (`matrix_mxisd_enabled: false`), causes Synapse and riot-web to reconfigure themselves with other (public) Identity servers. - enabling matrix-corporal (`matrix_corporal_enabled: true`) affects how reverse-proxying (by `matrix-nginx-proxy`) is done, in order to put matrix-corporal's gateway server in front of Synapse We may be able to move away from such dependencies in the future, at the expense of a more complicated manual configuration, but it's probably not worth sacrificing the convenience we have now. As part of this work, the way we do "start components" has been redone now to use a loop, as suggested in #65 (Github issue). This should make restarting faster and more reliable.	2019-01-12 18:01:10 +02:00

28 Commits