Moving away from using the default bridge network to using our own.
This isolates our services from other Docker containers running
on the default network on the same host.
The benefits are that:
- isolation is a little better - we no longer share a default
bridge network with any other containers that might be running on the host
- there are no longer hard dependencies - we do service discovery
by DNS name, and not via explicit `--link` usage during container start,
so containers can start out of order and fail without bringing down others
with them
(`matrix-nginx-proxy` can continue running, even if one of the other services dies)
In the future, when other services get introduced,
the increased resilience and simplicity will help as well.
Until now, we were starting from a fresh configuration, as generated
by Synapse and manipulating it with regex and line replacements,
until we made it work.
This is more fragile and less predictable, so we're moving to a static
configuration file generated from a Jinja template.
The upside is that configuration will be stable and predictable.
The downside of this new approach is that any manual configuration changes
after the playbook is done, will be thrown away on future playbook
invocations.
There are 2 ways to work around the need for manual configuration
changes though:
- making them part of this playbook and its default template
configuration files (which benefits everyone)
- going your own way for a given host and overriding the template files
that gets used (that is, the
`matrix_synapse_template_synapse_homeserver` or
`matrix_synapse_template_synapse_log` variables)
Since cbee084ac19b6, this playbook supports Postgres 10.x,
but keeps existing Postgres-9.x installs on 9.x.
This playbook can now also be ran with `--tags=upgrade-postgres`
to make it upgrade from Postgres 9.x to 10.x (or other versions
in the future).
This playbook just tries to avoid trying to setup a Postgres 10
database with existing 9.x files, as that makes Postgres complain.
Due to this, existing installs (still on 9.x) are detected
and left on Postgres 9.x.
They need to be upgraded to Postgres 10.x manually.
Switching from from avhost/docker-matrix (silviof/docker-matrix)
to matrixdotorg/synapse.
The avhost/docker-matrix (silviof/docker-matrix) image used to bundle
in the coturn STUN/TURN server, so as part of the move,
we're separating this to a separately-ran service
(matrix-coturn.service, powered by instrumentisto/coturn-docker-image)
A `.log.config` file may be generated with a different
level of indentation depending on which (Docker image, etc.)
generates it.
With this patch, we tolerate different levels of indentation
(2 spaces, 4 spaces, etc.) and don't break the configuration.
The non-working script is supposed to be fixed
by https://github.com/matrix-org/synapse/pull/2375
To have it work, we'd need an updated Docker image
of `silviof/matrix-riot-docker:latest`, which is not yet available
at the time of this commit.
Still, the previous patched synapse_port_db didn't work well either,
so it's not like we're regressing much by getting rid of it.
As described here (
https://github.com/matrix-org/synapse/issues/2438#issuecomment-327424711
), using own SSL certificates for the federation port is more fragile,
as renewing them could cause federation outages.
The recommended setup is to use the self-signed certificates generated
by Synapse.
On the 443 port (matrix-nginx-proxy) side, we still use the Let's Encrypt
certificates, which ensures API consumers work without having to trust
"our own CA".
Having done this, we also don't need to ever restart Synapse anymore,
as no new SSL certificates need to be applied there.
It's just matrix-nginx-proxy that needs to be restarted, and it doesn't
even need a full restart as an "nginx reload" does the job of swithing
to the new SSL certificates.
Moving keeps everything in the /matrix directory, so that we
wouldn't contaminate anything else on the system or risk
clashing with something else.
Also retrieving certificates separately for the Riot and Matrix domains,
which should help in multiple ways:
- allows them to be very different (completely separate base domain..)
- allows for Riot to be disabled for the playbook some time later
and still have the code not break
Let's let the admin set them as they wish.
We don't care what they are anyway.
If other things run on the same server,
it's also better not to hijack these for our
own purposes, especially when we don't need to.
The timedatectl call also seems to fail on Ubuntu 17.04
for some reason (missing timezones information file?).
The goal is to allow these to be on separate partitions
(including remote ones in the future).
Because the `silviof/docker-matrix` image chowns
everything to MATRIX_UID:MATRIX_GID on startup,
we definitely don't want to include `media_store` in it.
If it's on a remote FS, it would cause a slow startup.
Also, adding some safety checks to the "import media store"
task, after passing a wrong path to it on multiple occassions and
wondering what's wrong.
Also, making logging configurable. The default of keeping 10x100MB
log files is likely excessive and people may want to change that.
Port 8008 is forwarded in our case, so unless we adjust
`x_forwaded` for it, Docker's local network IPs are
logged/displayed for devices.
The TLS port (8448) is not proxied in our setup,
so its `x_forwarded` setting remains `false`.
Otherwise certains values in the config file,
such as `macaroon_secret_key`, would be regenerated,
which is not something that we want.
If `macaroon_secret_key` is regenerated, all users'
auth tokens will become invalid (effectively logging out
all users).
Some CentOS 7 hosts may not have firewalld installed.
We shouldn't expect it to be, but should ensure by ourselves that it is.
Docker likes to mess around with iptables forwarding rules,
so it ought to start after firewalld.
matrix-nginx-proxy will be occupying port 80 soon,
so that we can be more user-friendly and have
http->https forwarding for the Riot hostname.
During the playbook run, acmetool also expects to use
port 80 for domain verification.
During an initial playbook run, this wouldn't cause trouble
because matrix-nginx-proxy is not installed yet.
However, on subsequent playbook runs, it would cause trouble.
This ensures that if matrix-nginx-proxy is available
and running, it would be stopped before running acmetool
and started right after.