Runbook

Warning

This content is aimed primarily at Green Web Foundation staff who operate the carbon.txt API service.

How the Green Web Foundation deploys the validator

We use the following Ansible playbook ourselves to deploy the latest version of the API to a given server.

It assumes a Linux server with a dedicated user, called deploy, set up to run the service. The playbook creates a folder structure for the service and uses the tool uv to run the latest published version of the package.

It refers to a Django config file located in the same folder the command is run from.

It also sets up a systemd service to run the API behind a reverse proxy server such as Nginx or Caddy (we use Caddy). Systemd handles restarts and failures, and collects logs to be sent to a centralised logging server.

This playbook is designed to be run from a developer’s server, or as part of an internally managed GitHub Action.
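
The playbook reads its settings from CARBON_TXT_API_* environment variables and fails early if the required ones are unset or blank. When running it from a local shell or a CI job, it can be convenient to make the same check before invoking Ansible at all. The sketch below is one way to do that. The helper name missing_env_vars is hypothetical and not part of the carbon-txt package, and the list of variables simply mirrors the playbook's own pre-flight task.

```python
import os

# Variables the playbook's pre-flight task refuses to run without.
REQUIRED_VARS = (
    "CARBON_TXT_API_SECRET_KEY",
    "CARBON_TXT_API_DATABASE_URL",
    "CARBON_TXT_API_PORT",
)


def missing_env_vars(environ, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or blank.

    Hypothetical helper: mirrors the playbook's `trim == ''` check.
    """
    return [name for name in required if not environ.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars(os.environ)
    print("Missing:", ", ".join(missing) if missing else "none")
```

In a CI job, you would typically exit with a non-zero status when the returned list is not empty, so that the deployment step is skipped.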

Deployment

---
- name: Deploy carbon.txt API to server and run service with systemd
  gather_facts: false

  hosts:
    - app5.greenweb.org

  remote_user: deploy

  vars:
    service_user: deploy
    service_restart: true

    # Environment variables for carbon.txt.api
    secret_key: "{{ lookup('env', 'CARBON_TXT_API_SECRET_KEY') | string }}"
    database_url: "{{ lookup('env', 'CARBON_TXT_API_DATABASE_URL') | string }}"
    api_port: "{{ lookup('env', 'CARBON_TXT_API_PORT') | string }}"
    git_branch: "{{ lookup('env', 'CARBON_TXT_API_GIT_BRANCH') | string }}"
    # Sentry configuration
    sentry_dsn: "{{ lookup('env', 'CARBON_TXT_API_SENTRY_DSN') | string }}"
    carbon_txt_hostname: "{{ lookup('env', 'CARBON_TXT_API_HOSTNAME') | string }}"
    service_name: "{{ lookup('env', 'CARBON_TXT_API_SERVICE_NAME') | string }}"
    project_path: "/var/www/{{ carbon_txt_hostname }}"
    service_file_name: "{{ service_name }}.service"
    sentry_trace_sample_rate: 1.0
    sentry_profile_sample_rate: 1.0

    # running these checks is slow, and we only need them when setting up
    check_for_mysql_dependencies: false

  tasks:

    - name: Verify required environment variables
      ansible.builtin.fail:
        msg: "Required environment variable {{ item }} is not set or empty"
      when: lookup('env', item) | trim == ''
      loop:
        - CARBON_TXT_API_SECRET_KEY
        - CARBON_TXT_API_DATABASE_URL
        - CARBON_TXT_API_PORT
      tags: [pre-flight]

    - name: "Install pkg-config dependency for mysql client"
      ansible.builtin.apt:
        name: "pkg-config"
        state: "present"
      when: check_for_mysql_dependencies | bool
      tags: [setup-script]

    - name: "Install clang dependency for mysql client"
      ansible.builtin.apt:
        name: "clang"
        state: "present"
      when: check_for_mysql_dependencies | bool
      tags: [setup-script]

    - name: Set up directory for running web app
      ansible.builtin.file:
        path: "{{ project_path }}"
        state: directory
        owner: "{{ service_user }}"
        group: "{{ service_user }}"
        mode: "0755"
      tags: [setup-script]

    - name: Upload bash script for running carbon.txt.api
      ansible.builtin.template:
        src: "run_carbon_txt_api.sh.j2"
        dest: "{{ project_path }}/run_carbon_txt_api.sh"
        mode: "0755"
      tags: [setup-script]

    - name: Upload dotenv file for running carbon.txt.api
      ansible.builtin.template:
        src: carbon_txt_api_dotenv.sh.j2
        dest: "{{ project_path }}/.env"
        # the env file holds credentials, so keep it readable by its owner only
        mode: "0600"
      tags: [setup-script]

    - name: Upload extra django config file
      ansible.builtin.template:
        src: carbon_txt_api_local_config.py.j2
        dest: "{{ project_path }}/local_config.py"
        mode: "0755"
      tags: [setup-script]

    - name: Upload systemd service file for running carbon.txt.api
      ansible.builtin.template:
        src: systemd.carbon_txt_api.service.j2
        dest: "/etc/systemd/system/{{ service_file_name }}"
        # unit files should not be executable; systemd logs a warning if they are
        mode: "0644"
        owner: "{{ service_user }}"
        group: "{{ service_user }}"
      become: true
      tags: [setup-script]

    - name: Reload systemd to pick up new changes
      ansible.builtin.systemd:
        daemon_reload: true
      become: true
      tags: [systemd-config]

    - name: Query state of services including api service
      ansible.builtin.service_facts:
      tags: [systemd-service, systemd-check]

    - name: Show state of services
      ansible.builtin.debug:
        var: ansible_facts.services[service_file_name]
      tags: [systemd-check]

    - name: Trigger restart for app with systemd
      ansible.builtin.systemd:
        name: "{{ service_file_name }}"
        state: restarted
      become: true
      when: service_restart | bool
      tags: [systemd-service]

The run_carbon_txt_api.sh script is the file run by systemd. Every time it runs, it pulls down the latest published version of the carbon-txt package and runs the serve command.

It also uses a local_config file, which can be used to add extra configuration.

# written to /var/www/carbon-txt-api.greenweb.org/run_carbon_txt_api.sh

# Add any extra libraries needed for database connectivity with --with, e.g. mysqlclient
/path/to/bin/uv tool run --with mysqlclient carbon-txt@latest serve \
--django-settings local_config \
--port <PORT> \
--host <HOST> \
--server granian \
--migrate

The local config file is templated out into the same directory the command is run from, which is also the directory containing the environment variables file:

# written to /var/www/carbon-txt-api.greenweb.org/local_config.py

# local_config.py

from carbon_txt.web.config.settings.production import *  # noqa

# extra settings here
EXTRA_CONFIG = True
SOME_SETTING = "some value"

Environment variables are provided through an env file:

# carbon_txt_api_dotenv.sh.j2
# templated out to /var/www/carbon-txt-api.greenweb.org/.env

DATABASE_URL="{{ database_url }}"
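
When a deployment misbehaves, one quick check is whether the rendered .env file actually contains the values the service expects. The sketch below parses simple KEY=VALUE lines and reports required keys that are missing or empty. It is not the parser that systemd or Django uses, it only handles the simple syntax shown above, and the function names and the sample database URL are hypothetical.

```python
def parse_env_file(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not an assignment, so ignore it
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def missing_keys(text, required=("DATABASE_URL",)):
    """Return required keys that are absent or empty in the env file text."""
    values = parse_env_file(text)
    return [key for key in required if not values.get(key)]


# Hypothetical rendered file content, for illustration only.
sample = '# rendered by Ansible\nDATABASE_URL="mysql://user:secret@localhost/carbon_txt"\n'
print(missing_keys(sample))
```

A non-empty result suggests that the corresponding environment variable was not set when the playbook rendered the template.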

The templated out Systemd service file looks like the example below. It uses the run_carbon_txt_api.sh script to run the service. Any required environment variables are placed in the .env environment file.

# written to /etc/systemd/system/carbon_txt_api.service

# {{ ansible_managed }}
# Last run: {{ template_run_date }}

[Unit]
Description=Carbon.txt API
Documentation=https://carbon-txt-validator.readthedocs.io/en/latest/
Wants=network-online.target
After=network-online.target
[Service]
ExecStart=bash ./run_carbon_txt_api.sh
ExecReload=/bin/kill -s HUP $MAINPID
WorkingDirectory={{ project_path }}/
EnvironmentFile={{ project_path }}/.env
User={{ service_user }}
Group={{ service_user }}
KillMode=process
KillSignal=SIGTERM
Restart=on-failure
[Install]
WantedBy=multi-user.target

Operating the front end

See the README in the carbon-txt-site GitHub repository for instructions on developing, operating and deploying the front end that consumes the API offered by the carbon.txt validator.

Seeing Logs

When the carbon.txt validator service is deployed on Green Web Foundation infrastructure, its logs are aggregated by systemd and forwarded to a centralised Loki logging server. These logs can be queried at grafana.greenweb.org: filter by the systemd unit carbon_txt_api, using the label filter {unit="carbon_txt_api.service"}.

Anonymised information on the domains validated is also logged to the Django database, in the table validation_logging_ValidationLogEntry.

You can see these statistics in the carbon.txt validations Grafana dashboard:

Screenshot of the Grafana dashboard.

The large numbers at the top of the dashboard show all-time figures and do not change with the selected time range. The first row shows the number of unique domains requested for validation (plus the number of those domains that validated successfully), and the next row shows the total number of validation requests (and the number of those that were successful).

Below these, tables and graphs give an overview for the selected time range. The first set shows a breakdown, by domain, of the validations requested during the period, plus a graph of the cumulative total number of validations requested. The second set shows the same for successful validations: a list of the domains that were successfully validated, plus a graph of the cumulative total number of successful validations over time.
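
Returning to the service logs described at the start of this section: the same label filter can be built programmatically, for example when scripting a search for errors. The sketch below assembles a LogQL query for the unit and a URL for Loki's query_range HTTP endpoint. The function names and the base URL https://loki.example.org are hypothetical. The address of the Loki server is not given in this runbook, and in practice you may prefer to run the query through Grafana.

```python
from urllib.parse import urlencode


def build_logql(unit, contains=None):
    """Build a LogQL query selecting one systemd unit's log stream."""
    query = '{unit="%s"}' % unit
    if contains:
        # |= is LogQL's "line contains" filter; escape quotes and backslashes.
        escaped = contains.replace("\\", "\\\\").replace('"', '\\"')
        query += ' |= "%s"' % escaped
    return query


def build_query_range_url(base_url, query, start_ns, end_ns, limit=100):
    """Build a URL for Loki's query_range endpoint (nanosecond timestamps)."""
    params = urlencode(
        {"query": query, "start": start_ns, "end": end_ns, "limit": limit}
    )
    return f"{base_url.rstrip('/')}/loki/api/v1/query_range?{params}"


query = build_logql("carbon_txt_api.service", contains="ERROR")
# The base URL below is a placeholder, not the real Loki server.
print(build_query_range_url("https://loki.example.org", query, 0, 10**9))
```

The query string returned by build_logql can also be pasted directly into Grafana's Explore view.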

Monitoring and exception tracking

We use Sentry’s suite of hosted monitoring tools for tracking exceptions, performance, and uptime. See greenweb.sentry.io.