Onionprobe ChangeLog

v1.3.0 - Unreleased

Fixes

  • Standalone monitoring node:
  • The start action in the onionprobe-monitor script now pulls and builds images.

  • Failure rate was erroneously reported as 1% when all services were working, when the expected value was 0%. This is now fixed.

Features

v1.2.1 - 2024-11-27

Features

Fixes

v1.2.0 - 2024-04-24

Features

  • New metrics (tpo/onion-services/onionprobe#78):

    • From the outer descriptor wrapper:
      • descriptor-lifetime.
      • revision-counter.
    • From the second layer of encryption:
      • single-onion-service.
      • pow-params.
    • HSDir latency when fetching descriptors.
  • Enhanced Grafana Dashboard (tpo/onion-services/onionprobe#80) with the following new visualizations:

    • Overview:
      • Current failure rate of onionsites.
      • Total expiring certificates in the next 7 days.
      • List of certificate expirations up to the next 180 days.
      • List of unreachable instances.
      • Graph with the total unreachable instances.
      • List of invalid HTTPS certificates.
      • List of services with HTTPS errors.
    • Performance:
      • Total of minimum, average and maximum service connection latency.
      • Total of minimum, average and maximum descriptor fetch latency.
      • Chart of minimum, average and maximum service connection latency.
      • Chart of minimum, average and maximum descriptor fetch latency.
      • Rate of services using the single hop mode, relative to the total services monitored.
      • List of slow services.
    • Descriptors:
      • List of services missing a published descriptor.
      • Chart of the minimum, average and maximum descriptor sizes (decrypted outer layer).
      • Chart of the minimum, average and maximum descriptor sizes (decrypted second layer).
    • Introduction points:
      • Chart of minimum, average and maximum number of introduction points per service.
      • List of services and their number of introduction points.
    • HSDir:
      • Total number of HSDirs tested.
      • Chart of minimum, average and maximum HSDir latency for fetching descriptors.
      • List of HSDirs sorted by descriptor fetch latency.
    • Proof of Work (PoW):
      • Ratio of services with PoW enabled, relative to the total services monitored.
      • Total number of services with PoW enabled.
      • Chart of minimum, average and maximum PoW v1 effort seen.
      • List of services with PoW enabled.
      • List of services with PoW enabled with effort greater than zero.
  • Improved log message for elapsed time.

  • New log messages for:

    • Number of introduction points.
    • HS_DESC events:
      • Descriptor reachability.
      • HSDir used.
  • Create a GitLab release at every new tag (experimental) (tpo/onion-services/onionprobe#82).

  • Running lintian on CI to check the generated Debian package.

Fixes

  • Manpage generation is now compatible with the Onion Services Ecosystem Portal (tpo/onion-services/ecosystem#1).

  • Use the correct copyright line in source files.

  • Support for a wider range of pyca/cryptography versions at setup.cfg.

  • Display Tor bootstrap messages only for the debug log level.

  • Disable stem logging if log level is below debug (tpo/onion-services/onionprobe#63).

  • Exit codes now reflect reality (tpo/onion-services/onionprobe#64).

  • Calculate the elapsed time for descriptors right after fetching.

  • Updated the SecureDrop list.

  • Upgraded Grafana image to 10.4.2.

  • Upgraded Alertmanager image to 0.27.0.

  • Upgraded Prometheus image to 2.51.2.

  • Upgraded PostgreSQL image to 16. Please run the needed upgrading steps.

  • Upgraded CI and container images to Debian bookworm.

  • Upgraded vendors/onion-mkdocs.

v1.1.2 - 2023-09-28

Fixes

  • Make the tor process quiet when generating hashed passwords (reported by @anarcat): https://gitlab.torproject.org/tpo/onion-services/onionprobe/-/issues/81

  • CI/CD: use rsync to copy slide artifacts, preserving the folder structure.

  • Minor documentation improvements.

Features

  • Debug outer and inner layer descriptor contents.

  • Decrease Prometheus certificate expiration alerts to 7 days in advance.

v1.1.1 - 2023-04-04

Fixes

  • Grafana dashboard:
  • Apply workaround for "Invalid dashboard UID in the request error on custom home dashboard": https://github.com/grafana/grafana/issues/54574

  • Docker:

  • Stick to specific upstream image versions to avoid unexpected upgrade issues.

  • Change the onionprobe image version scheme to match semantic versioning.

  • PostgreSQL upgrade script (upgrade-postgresql-database):

  • Misc fixes.

v1.1.0 - 2023-04-03

Fixes

  • Stick to a PostgreSQL docker image: See https://gitlab.torproject.org/tpo/onion-services/onionprobe/-/issues/70

  • Command-line URL parsing: https://gitlab.torproject.org/tpo/onion-services/onionprobe/-/merge_requests/17

  • Display default values for most options on onionprobe --help.

Features

  • Support for Tor metrics Prometheus exporter via MetricsPort and MetricsPortPolicy settings, available respectively as metrics_port and metrics_port_policy configuration or command line parameters.

These settings are disabled by default. If you plan to use this with the standalone monitoring node, you may also want to edit configs/prometheus/prometheus.yml and uncomment Tor's Prometheus configuration so this data becomes available at Prometheus, Alertmanager and Grafana.

WARNING: Before enabling this, it is important to understand that exposing tor metrics publicly is dangerous to the Tor network users. Please take extra precaution and care when opening this port. Set a very strict access policy with MetricsPortPolicy and consider using your operating system's firewall features for defense in depth.

We recommend, for the prometheus format, that the only address that can access this port should be the Prometheus server itself. Remember that the connection is unencrypted (HTTP), so consider using a tool like stunnel to secure the link from this port to the server.

Check the standalone monitoring node docs for detailed instructions on how to enable this additional metric collection.

  • TLS certificate verification:
  • Added a global tls_verify flag to check certificates during HTTP tests. Set it to false to ignore TLS certificate verification. By default all TLS certificates are checked.

  • Added a per-endpoint tls_verify flag to check certificates in HTTP tests, overriding the global setting for the endpoint context.

  • Changed the onion_service_valid_certificate metric to also inform when a certificate wasn't tested, by setting a value of 2 in that case. This isn't a breaking change since TLS certificate verification is enabled by default, so unless verification is disabled the metric will only vary between 0 (invalid cert) and 1 (valid cert).
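As an illustration of the three values described above for onion_service_valid_certificate, here is a minimal Python sketch of how a consumer of the metric might label them. The CertificateStatus enum and the describe_certificate_status helper are hypothetical names used only for this example; they are not part of Onionprobe.

```python
from enum import IntEnum


class CertificateStatus(IntEnum):
    """Values of onion_service_valid_certificate as described in this entry."""
    INVALID = 0     # verification ran and the certificate was rejected
    VALID = 1       # verification ran and the certificate was accepted
    NOT_TESTED = 2  # verification was disabled (tls_verify set to false)


def describe_certificate_status(value: float) -> str:
    """Return a human-readable label for a sampled metric value.

    Prometheus samples are floats, so the value is converted to an
    integer first. Anything outside 0, 1 or 2 is reported as "unknown".
    """
    try:
        status = CertificateStatus(int(value))
    except (ValueError, OverflowError):
        return "unknown"
    return status.name.lower().replace("_", " ")
```

A dashboard or alerting script could use such a helper to keep "not tested" endpoints apart from endpoints with genuinely invalid certificates.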

  • TLS and X.509 certificate test:

  • Added a new test to check the conditions of the underlying TLS connection and to get detailed certificate information.

  • This test currently only happens for endpoints with the https protocol, and only if the test_tls_connection configuration is set to true in the global scope or in the endpoint configuration.

  • Certificates are retrieved and analyzed even if they're not valid, in order to also collect data on self-signed, expired or otherwise invalid certificates.

  • A number of new metrics are included both for the TLS connection and for the server certificate:

    • onion_service_certificate_not_valid_before_timestamp_seconds: Register the beginning of the validity period of the certificate in UTC. This does not necessarily mean that the certificate is CA-validated. Value is represented as a POSIX timestamp.

    • onion_service_certificate_not_valid_after_timestamp_seconds: Register the end of the validity period of the certificate in UTC. This does not necessarily mean that the certificate is CA-validated. Value is represented as a POSIX timestamp.

    • onion_service_certificate_expiry_seconds: Register how many seconds are left before the certificate expires. Negative values indicate how many seconds have passed since the certificate expired.

    • onion_service_certificate_match_hostname: Register whether a provided server certificate matches the server hostname in a TLS connection: value is 1 for matched hostname and 0 otherwise. Check is done both on the commonName and subjectAltName fields. A value of 1 does not necessarily mean that the certificate is CA-validated.

    • onion_service_certificate_info: Register miscellaneous TLS certificate information for a given Onion Service such as version and fingerprints.

    • onion_service_tls_security_level: Tracks the SSL security level in use. Needs Python 3.10+ to work. See SSL_CTX_get_security_level(3) manpage for details: https://www.openssl.org/docs/manmaster/man3/SSL_CTX_get_security_level.html

    • onion_service_tls_info: Register miscellaneous TLS information for a given Onion Service such as version and ciphers.

  • Prometheus rules for the standalone monitoring node were updated to include an alert for certificates about to expire (defaults to 30 days in advance).

  • Details at https://gitlab.torproject.org/tpo/onion-services/onionprobe/-/issues/49
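The relationship between the certificate timestamp and expiry metrics above can be sketched in Python as follows. This is only an illustration under stated assumptions, not Onionprobe's actual implementation: the function names are hypothetical, and the 30-day threshold in the usage simply mirrors the default alert window mentioned in this entry.

```python
import time
from typing import Optional

SECONDS_PER_DAY = 86400


def certificate_expiry_seconds(not_valid_after: float,
                               now: Optional[float] = None) -> float:
    """Seconds left until the certificate expires.

    Both arguments are POSIX timestamps. The result is negative once
    the certificate has already expired.
    """
    if now is None:
        now = time.time()
    return not_valid_after - now


def expires_within(not_valid_after: float, days: int,
                   now: Optional[float] = None) -> bool:
    """True if the certificate expires in fewer than `days` days."""
    remaining = certificate_expiry_seconds(not_valid_after, now)
    return remaining < days * SECONDS_PER_DAY
```

For example, expires_within(timestamp, 30) would flag the same certificates as an alert set to fire 30 days in advance, including those that have already expired.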

  • Added the onion_service_generic_error_total metric to track probing errors not covered by other metrics.

  • Added script to handle PostgreSQL version upgrades at the service container: https://gitlab.torproject.org/tpo/onion-services/onionprobe/-/issues/70

  • Using Onion Mkdocs for the documentation, now hosted at https://tpo.pages.torproject.net/onion-services/onionprobe/

See https://gitlab.torproject.org/tpo/onion-services/onionprobe/-/issues/19

  • Expected HTTP status codes:
  • Per-endpoint configuration specifying a list of expected HTTP status codes, useful when it's expected that an endpoint returns a status other than 200.
  • Custom metric indicating if the status code is expected or not.
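The expected status code check described above could look roughly like the following Python sketch. The function name, the parameter names and the fallback to 200 when no list is configured are assumptions made for this example; this changelog does not give the actual configuration key or metric name.

```python
from typing import Iterable, Optional

# Assumed fallback when an endpoint configures no list of its own.
DEFAULT_EXPECTED_STATUS_CODES = (200,)


def status_code_is_expected(status_code: int,
                            allowed_codes: Optional[Iterable[int]] = None) -> int:
    """Return 1 if the status code is in the endpoint's list, 0 otherwise.

    The integer result is shaped like a gauge value, so it could be
    exported directly as a metric.
    """
    if allowed_codes is None:
        allowed = DEFAULT_EXPECTED_STATUS_CODES
    else:
        allowed = tuple(allowed_codes)
    return 1 if status_code in allowed else 0
```

An endpoint that is meant to require authentication might then list 401 and 403, so that a 401 response counts as a healthy result rather than an error.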

  • CI/CD: added jobs to test building debian and python packages, as well as configurations and slides.

v1.0.0 - 2022-05-31

Breaking changes

  • Changed Prometheus exporter metric names to adhere to the Best practices and to other recommendations when writing an exporter. Prometheus admins might want to rename their old metrics to the new ones to keep time series continuity, drop the old ones or keep both during a transition phase. The following metrics were renamed:
  • From onionprobe_wait to onionprobe_wait_seconds.
  • From onion_service_latency to onion_service_latency_seconds.
  • From onion_service_descriptor_latency to onion_service_descriptor_latency_seconds.
  • From onion_service_fetch_error_counter to onion_service_fetch_error_total.
  • From onion_service_descriptor_fetch_error_counter to onion_service_descriptor_fetch_error_total.
  • From onion_service_request_exception to onion_service_request_exception_total.
  • From onion_service_connection_error to onion_service_connection_error_total.
  • From onion_service_http_error to onion_service_http_error_total.
  • From onion_service_too_many_redirects to onion_service_too_many_redirects_total.
  • From onion_service_connection_timeout to onion_service_connection_timeout_total.
  • From onion_service_read_timeout to onion_service_read_timeout_total.
  • From onion_service_timeout to onion_service_timeout_total.
  • From onion_service_certificate_error to onion_service_certificate_error_total.

  • Removed the updated_at label from all metrics, which was creating a new data series for every measurement on Prometheus.

  • Removed the hsdir label from onion_service_descriptor_reachable metric, which was creating a new data series for every measurement on Prometheus.
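For admins scripting the transition, the metric renames listed in this release can be written as a lookup table. The Python sketch below only restates that list; the RENAMED_METRICS name and the new_metric_name helper are hypothetical and not shipped with Onionprobe.

```python
# Old metric name -> new metric name, as listed in this release.
RENAMED_METRICS = {
    "onionprobe_wait": "onionprobe_wait_seconds",
    "onion_service_latency": "onion_service_latency_seconds",
    "onion_service_descriptor_latency": "onion_service_descriptor_latency_seconds",
    "onion_service_fetch_error_counter": "onion_service_fetch_error_total",
    "onion_service_descriptor_fetch_error_counter": "onion_service_descriptor_fetch_error_total",
    "onion_service_request_exception": "onion_service_request_exception_total",
    "onion_service_connection_error": "onion_service_connection_error_total",
    "onion_service_http_error": "onion_service_http_error_total",
    "onion_service_too_many_redirects": "onion_service_too_many_redirects_total",
    "onion_service_connection_timeout": "onion_service_connection_timeout_total",
    "onion_service_read_timeout": "onion_service_read_timeout_total",
    "onion_service_timeout": "onion_service_timeout_total",
    "onion_service_certificate_error": "onion_service_certificate_error_total",
}


def new_metric_name(old_name: str) -> str:
    """Translate a pre-1.0.0 metric name; other names pass through unchanged."""
    return RENAMED_METRICS.get(old_name, old_name)
```

Such a table could be used to rewrite saved dashboard queries or alert rules in bulk during the transition phase.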

Features

  • Monitoring node setup using Docker Compose and Prometheus, Alertmanager and Grafana dashboards served via Onion Services.

  • Config generation improvements.

  • New metrics:

  • onion_service_fetch_requests_total.
  • onion_service_descriptor_fetch_requests_total.
  • onion_service_descriptor, with Onion Service descriptor information.
  • onion_service_probe_status, with timestamp from the last test.

  • Default Grafana Dashboard with basic metrics.

v0.3.4 - 2022-05-11

Fixes

  • Onionprobe's exporter port allocation conflict with the push gateway https://gitlab.torproject.org/tpo/onion-services/onionprobe/-/issues/45

v0.3.3 - 2022-05-11

Fixes

  • Stem is unable to find cryptography module when running from the pip package https://gitlab.torproject.org/tpo/onion-services/onionprobe/-/issues/43

v0.3.2 - 2022-05-11

Main issue: https://gitlab.torproject.org/tpo/onion-services/onionprobe/-/issues/42

Features

  • Enhanced config generators: switch all three config generators currently supported (Real-World Onion Sites, SecureDrop and TPO) to rely on argparse for command line arguments.

v0.3.1 - 2022-05-10

Main issue: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40717

Features

  • Adds packages/tpo.py to generate an Onionprobe config with Tor Project's .onions. Details at https://gitlab.torproject.org/tpo/onion-services/onionprobe/-/merge_requests/4

  • Other minor fixes and enhancements.

v0.3.0 - 2022-04-19

Main issue: https://gitlab.torproject.org/tpo/onion-services/onionprobe/-/issues/4

Features

  • Debian package.
  • Better logging.
  • Additional command line options.
  • Handling of SIGTERM and other signals.

Documentation

  • Manpage.
  • Auto-generate command line docs from CLI invocation.
  • Auto-generate manpage from argparse.

v0.2.2 - 2022-04-06

Fixes

  • Print usage when no arguments are supplied.

v0.2.1 - 2022-04-06

Fixes

  • Python package fixes.

v0.2.0 - 2022-04-06

Main issue: https://gitlab.torproject.org/tpo/onion-services/onionprobe/-/issues/3

Enhancements

  • Python packaging: https://pypi.org/project/onionprobe.
  • Support for --endpoints command line argument.
  • Display available metrics at command line usage.
  • Adds OnionprobeConfigCompiler to help compile custom configuration.

v0.1.0 - 2022-03-31

Main issue: https://gitlab.torproject.org/tpo/onion-services/onionprobe/-/issues/2

Meta

Probing

Metrics

  • Status: sleeping, probing, starting or stopping.
  • Match found / not found.
  • Metric units in the description.
  • Number of introduction points.
  • Timestamp label.
  • Register HSDir used to fetch the descriptor. Check the control-spec for HSFETCH command and the HS_DESC event (using SETEVENTS). Relevant issues:

Enhancements

  • Refactor into smaller modules.
  • Better exception handling.

Bonus

v0.0.1 - 2022-03-23

Main issue: https://gitlab.torproject.org/tpo/onion-services/onionprobe/-/issues/1

Basic

  • Take a list of onions to check and make sure that you can always fetch descriptors rather than just using cached descriptors etc.
  • Randomisation of timing, so that systemic errors do not go undetected by chance.
  • Looping support: goes through the list of onions in a loop, testing one at a time continuously.
  • Flush descriptor caches so testing happens as if from a fresh client.
  • Support for HTTP status codes.
  • Page load latency.
  • Ability to fetch a set of paths from each onion. Customisable by test path: not all our sites have content at the root, but do not bootstrap every time if that can be avoided.
  • Need to know about "does the site have useful content?" Regex for content inside the page: allow configuring a regex per path for what should be found in the returned content/headers.
  • Documentation.
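The looping and timing randomisation goals above can be illustrated with a short Python sketch. It is a simplified model with hypothetical names and parameters (base_seconds, jitter_seconds, rounds), not Onionprobe's actual scheduling code, and it only computes the waits rather than sleeping or probing.

```python
import random
from typing import Iterator, Optional, Sequence, Tuple


def randomised_wait(base_seconds: float, jitter_seconds: float,
                    rng: Optional[random.Random] = None) -> float:
    """Return the base wait plus a random extra of up to jitter_seconds."""
    if rng is None:
        rng = random.Random()
    return base_seconds + rng.uniform(0, jitter_seconds)


def probe_schedule(onions: Sequence[str], base_seconds: float,
                   jitter_seconds: float, rounds: int,
                   rng: Optional[random.Random] = None
                   ) -> Iterator[Tuple[str, float]]:
    """Yield (onion, wait_seconds) pairs, one onion at a time.

    The list is walked `rounds` times in order; only the wait before
    each test is randomised.
    """
    for _ in range(rounds):
        for onion in onions:
            yield onion, randomised_wait(base_seconds, jitter_seconds, rng)
```

Because the waits vary between tests, a failure that only shows up at a fixed interval is less likely to be missed on every pass through the list.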

Meta

  • Dockerfile (and optionally a Docker Compose).

Prometheus

  • Exports Prometheus metrics for the connection to the onion service, and extra metrics per path on the status code for each path returned by the server. If using the Prometheus exporter with Python, consider just using requests and Beautiful Soup to check that the page is returning what one expects.
  • Add in additional metrics wherever appropriate.
  • To get the timings right, the tool should take care of the test frequency and just expose the metrics rather than having Prometheus scraping individual targets on Prometheus' schedule.

Bonus