SKSI/Guidelines/SystemMaintenance

The IT infrastructure and services provided to SKSI by the OER Foundation (OERF) have the following ongoing maintenance requirements.

Payment to Cloud Infrastructure Supplier
In this case, that is DigitalOcean. It means paying the monthly invoices, which are based on actual usage of several Virtual Private Servers (VPSs). That usage typically falls within a set of allotted pre-paid resource bands for those VPSs. The monthly amount is normally as described in the Cloud Budgeting Notes, approximately USD 90/month, although it may be somewhat higher if IT staff create temporary VPSs for testing or training purposes.

Security Updates
All of the VPSs set up by the OERF run Ubuntu Linux (mostly version 20.04, with one new 22.04 instance). As with any computing platform, Ubuntu Linux is undergoing continual development, with its developer community providing functional and security improvements. These need to be applied to all the SKSI servers regularly; we try to do so at least weekly, if not more frequently.

VPS system updates
Each of the VPS systems requires regular system upgrades, and occasionally urgent ones. For example, if a major security vulnerability is identified and a patch released, that patch might require rapid deployment to minimise the "window" of vulnerability of the SKSI VPSs.

The normal practice we follow is to log into each VPS via Secure Shell (SSH) at least once weekly, and issue the following command (usually copy-and-pasting it for consistency):



The 'sudo' prefix is required for these administrative commands, and the user issuing them must provide his or her password (assuming that the user's account has the 'sudo' privilege). The command updates the VPS' list of official Ubuntu (and possibly added community) software package sources. It then compares what is installed with what is available, offering to upgrade any packages for which a newer version exists. Next, it removes any dependent packages made redundant by those upgrades, and removes the installation files to free up disk space. It also gives the user running the command a list of any 'pinned' packages that could be upgraded but have not been, because an administrator previously requested that they be held at a specific version to, for example, ensure compatibility with a custom software package we have installed.

It also shows a disk space usage listing to help the administrator identify any storage devices that are approaching full (to allow for pre-emptive action to free up space). Finally, it restarts the web server on the host (if one is present) which, among other things, ensures that the web server is using up-to-date Secure Sockets Layer (SSL) certificates issued by Let's Encrypt, of which we make widespread use. These certificates are automatically renewed every 6 or so weeks, but require a web server reload to be applied.
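The exact command is not reproduced on this page. As a rough sketch only, the Python snippet below assembles a copy-and-paste command line that follows the steps described above. The specific tools (`apt-get`, `apt-mark showhold`, `df -h`, `systemctl reload`) and the `nginx` service name are assumptions chosen for illustration; the command the OERF actually uses may differ.

```python
import shlex


def weekly_update_command(webserver="nginx"):
    """Build one shell line for routine VPS maintenance.

    Every tool named here, and the default web server name, is an
    assumption made for illustration, not the OERF's actual command.
    """
    steps = [
        ["sudo", "apt-get", "update"],        # refresh the package source lists
        ["sudo", "apt-get", "dist-upgrade"],  # offer upgrades (asks for confirmation)
        ["sudo", "apt-get", "autoremove"],    # remove dependencies made redundant
        ["sudo", "apt-get", "clean"],         # delete downloaded installation files
        ["apt-mark", "showhold"],             # list packages held at a fixed version
        ["df", "-h"],                         # show disk space usage
    ]
    if webserver:
        # Reload so the web server picks up renewed certificates.
        steps.append(["sudo", "systemctl", "reload", webserver])
    # Chain with && so a failing step stops the sequence.
    return " && ".join(shlex.join(step) for step in steps)


if __name__ == "__main__":
    print(weekly_update_command())
```

Building the line in one place keeps it identical across servers, which is the same consistency goal served by copying and pasting. Passing `webserver=None` omits the reload step on hosts without a web server.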

Service updates
In addition to the VPS host's operating system, each VPS is also running one or more 'services', made up of collections of Docker containers (managed by Docker Compose) which, together, provide a defined service. This could be a Moodle, Docker, Rocket.Chat, Vaultwarden, or WordPress Multisite instance, among other services key to the OERF Free and Open Source Software (FOSS) stack. These are all undergoing active development by their various communities and require periodic upgrades/updates to ensure that the users of those services benefit from the latest (and greatest) versions of the software functionality, and from the latest security measures to protect the users' data stored in the system.

The process of upgrading each of these is specific to the service (each has different sub-components and dependencies, each of which, in turn, can have quirks and lore that affect upgrade processes). Many upgrade processes for these services are described in detail on the OERF-authored OERu Technology Blog for the benefit of FOSS communities, including those involved in the SKSI.
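Although each service has its own upgrade procedure, many Docker Compose based upgrades share a common core: pull newer container images, then recreate the containers from them. The sketch below illustrates only that core. It assumes the Compose v2 command form (`docker compose`), and the directory `/srv/example-service` is a hypothetical placeholder; service-specific steps such as database migrations are not covered.

```python
import subprocess


def compose_update_commands(compose_cmd=("docker", "compose")):
    """Return the two generic Compose steps: fetch images, recreate containers."""
    base = list(compose_cmd)
    return [
        base + ["pull"],      # download newer images named in the Compose file
        base + ["up", "-d"],  # recreate containers whose image or config changed
    ]


def update_service(service_dir, dry_run=True, compose_cmd=("docker", "compose")):
    """Run (or, by default, just print) the update steps inside service_dir."""
    commands = compose_update_commands(compose_cmd)
    for command in commands:
        if dry_run:
            print(f"(in {service_dir}) {' '.join(command)}")
        else:
            # cwd makes Compose read the Compose file in that directory.
            subprocess.run(command, cwd=service_dir, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical service directory, printed rather than executed.
    update_service("/srv/example-service", dry_run=True)
```

The dry-run default is deliberate: it lets an administrator review the steps before anything touches a running service. On hosts with the older standalone binary, `compose_cmd=("docker-compose",)` would be the equivalent assumption.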

Data backups
On OERF systems, nothing is more important than data, particularly that personal to users. To the extent possible, we aim to create timely backups of system and user data that allow us to protect against hardware failure, illicit access and data corruption, and accidental data loss. We attempt to ensure all of these systems are automated, not requiring active involvement from system administrators - they should only monitor the continual reliable running of these systems.

System administrators are also responsible for ensuring that backed up data can be recovered. To achieve this, they should regularly exercise the process of rebuilding a system based on backed up data.
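One way to make a recovery exercise measurable is to compare a restored directory tree against a reference copy, file by file. The following is a minimal sketch of that idea using SHA-256 checksums; the function names are hypothetical and this is not a description of the OERF's own verification procedure.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Reads each file whole; adequate for a sketch, not for huge files.
            digests[str(path.relative_to(root))] = hashlib.sha256(
                path.read_bytes()
            ).hexdigest()
    return digests


def restore_differences(reference, restored):
    """List relative paths that are missing, extra, or differ in content."""
    ref, res = tree_digests(reference), tree_digests(restored)
    return sorted(name for name in ref.keys() | res.keys()
                  if ref.get(name) != res.get(name))
```

An empty list means the restored tree matches the reference. Data on a live system keeps changing after a backup is taken, so the comparison is most meaningful against a copy captured at the same time as the backup.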

Database backups
Unlike regular files on a computer filesystem, the files comprising a database cannot normally be backed up safely by simply copying them. Databases often store crucial data in system memory, and they can leave files on the hard disk in an inconsistent (i.e. corrupt) form if the database server is suddenly, unexpectedly stopped. Backing up databases usually requires a 'dump' process, which is typically specific to each database system. The OERF typically makes daily (or, in some cases, hourly) data backups local to the system.

The OERF typically uses a mixture of databases for its services. These include MariaDB (a drop-in replacement for MySQL), PostgreSQL, SQLite, and MongoDB. For each of these, the OERF has developed backup scripts that work on databases running either on the VPS itself (usually the way the OERF implements MariaDB) or within containers (as we deploy the other databases by preference). The scripts save the resulting consistent data dumps on the VPS' filesystems, where they can be backed up by the normal automated filesystem backups.
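The OERF's backup scripts themselves are not shown here. As an illustration of why the dump step is engine-specific, the sketch below maps each database system to a commonly used dump tool and optionally wraps the command in `docker exec` for a containerised database. The tool choices, the omission of credentials, and all database and container names are assumptions.

```python
# Hypothetical mapping from database engine to a command that writes a
# consistent export to standard output. Credentials are deliberately omitted.
DUMP_TOOLS = {
    "mariadb": lambda db: ["mysqldump", "--single-transaction", db],
    "postgresql": lambda db: ["pg_dump", db],
    "sqlite": lambda db: ["sqlite3", db, ".dump"],  # db is the database file path
    "mongodb": lambda db: ["mongodump", "--db", db, "--archive"],
}


def dump_command(engine, database, container=None):
    """Return the argument list for dumping one database.

    If container is given, the dump tool runs inside that container via
    'docker exec'; otherwise it runs directly on the VPS.
    """
    if engine not in DUMP_TOOLS:
        raise ValueError(f"unsupported database engine: {engine}")
    command = DUMP_TOOLS[engine](database)
    if container:
        command = ["docker", "exec", container] + command
    return command


if __name__ == "__main__":
    print(dump_command("mariadb", "example_db"))
    print(dump_command("postgresql", "example_db", container="example-postgres"))
```

In a real script, the output of each command would be redirected to a file on the VPS' filesystem so that the normal filesystem backups pick it up.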

Remote VPS backups
The OERF implements automated data backups that are
 * remote - transferred across a network to a different physical server, providing geographic distribution. This guards against the possibility of a natural disaster devastating both a live VPS and all of its backups,
 * incremental - only data changed since the previous backup needs to be saved, not the entire corpus of data for each backup. This approach is vastly more efficient in disk space and data transfer, as well as being much faster and requiring far fewer system resources, and
 * encrypted - so that private information, particularly related to users (e.g. personal data) or the systems storing their data (e.g. secrets like passwords and encryption keys), cannot be recovered if the backup storage location is compromised by a remote attacker.

To achieve these aims, the OERF uses a FOSS backup application called 'restic' which is designed to meet all of the above requirements. We have configured some of the SKSI servers to do data backups to the OERF backup infrastructure, but the OERF Open Source Technologist recommends that the SKSI provision its own on-shore backup infrastructure to ensure that, in addition to preserving the security of user and organisational data, it also preserves data sovereignty on behalf of Samoa.
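As a hedged illustration of how a restic-based backup run might be scripted, the sketch below builds the command lines for a backup and a subsequent repository integrity check. The repository address and the backed-up paths are hypothetical placeholders, and the OERF's actual configuration is not documented here. The repository password is assumed to be supplied separately, for example through restic's `RESTIC_PASSWORD_FILE` environment variable.

```python
def restic_commands(repository, paths):
    """Build restic command lines for one backup run.

    repository: a restic repository location (hypothetical in the example).
    paths: directories on the VPS to include in the snapshot.
    """
    base = ["restic", "--repo", repository]
    return {
        # Creates a new encrypted snapshot; unchanged data is not stored again.
        "backup": base + ["backup"] + list(paths),
        # Verifies the repository's structure and integrity.
        "check": base + ["check"],
    }


if __name__ == "__main__":
    commands = restic_commands(
        "sftp:backup@backup.example.org:/srv/restic/example",  # hypothetical
        ["/etc", "/home/data"],                                 # hypothetical
    )
    for name, command in commands.items():
        print(name, "->", " ".join(command))
```

Pointing `repository` at on-shore storage would be the only change needed in this sketch to follow the data sovereignty recommendation above.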

User data backups
Our normal policy is that all user data held within the 'containerised' services the OERF deploys is stored on the host VPS' filesystems rather than in the containers. That means all containers can be removed without losing any crucial user or configuration data. Where containers (e.g. those running databases) control data that cannot safely be copied as files on the host VPS' filesystem, the OERF configures backup scripts which transform that data into a consistent form (e.g. database dumps) that provides a full record of the data and can be used to recover from the failure modes described previously.
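One way to audit this policy is to inspect the `volumes` entries of a service's Docker Compose definition and flag any that do not map a host path into the container. The sketch below assumes the policy is implemented with bind mounts written in Compose's short `SOURCE:TARGET` syntax; that assumption, and every path in the example, is illustrative only.

```python
def classify_volume(spec):
    """Classify a short-syntax Compose volume entry.

    Returns 'bind' for a host path source, 'named' for a named volume,
    and 'anonymous' when only a container path is given.
    """
    parts = spec.split(":")
    if len(parts) == 1:
        return "anonymous"
    source = parts[0]
    if source in (".", "..") or source.startswith(("/", "./", "../", "~")):
        return "bind"
    return "named"


def non_bind_volumes(volumes):
    """Return the entries that do not map an explicit host path."""
    return [spec for spec in volumes if classify_volume(spec) != "bind"]


if __name__ == "__main__":
    example = [
        "/home/data/example/uploads:/var/www/uploads",  # hypothetical bind mount
        "dbdata:/var/lib/postgresql/data",              # named volume
        "/var/cache",                                   # anonymous volume
    ]
    print(non_bind_volumes(example))
```

Entries reported by `non_bind_volumes` are candidates for review, since data held there is not at a path the filesystem backups are known to cover.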