OpenCensus Service Configuration

Published by Steve Flanders on

In my last post, I talked about what the OpenCensus Service is and why you should care. After reading that post, you are likely wondering how to get started. In this post, I would like to walk through how to deploy and configure the OpenCensus Service in a greenfield or brownfield environment.

(Spoiler alert: it is really easy!)

Deployment

The best practice when deploying the OpenCensus Service is to use the latest release. While you can always build from master, a release reduces the number of steps you need to perform and ensures stability. The OpenCensus Service SIG currently works on a two-week sprint cadence and typically releases a new version at the end of the sprint. You can track the latest milestone in order to figure out what is coming next.

Agent

There are three deployment options for the Agent (listed in order of preference):

  1. Daemonset
  2. Sidecar
  3. Binary

Preference is determined based on the available resources and security concerns. For example, daemonsets are typically more resource efficient than running an Agent as a sidecar, but the attack surface is larger in this deployment model. No matter which deployment model you choose, the mapping is always 1:1 (e.g., one Agent per host for a daemonset or one Agent per service for a sidecar).
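
To make the daemonset option concrete, here is a minimal sketch of a Kubernetes manifest that runs one Agent per host. The image name, release tag, and labels are placeholders I made up for illustration; only the two ports come from this post. The example YAML file in the repository is the authoritative version.

```yaml
# Sketch only: the image, tag, and labels below are placeholders.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: oc-agent
  labels:
    app: oc-agent
spec:
  selector:
    matchLabels:
      app: oc-agent
  template:
    metadata:
      labels:
        app: oc-agent
    spec:
      containers:
        - name: oc-agent
          image: example.com/opencensus-agent:RELEASE_TAG  # placeholder image
          ports:
            - name: opencensus
              containerPort: 55678
              hostPort: 55678  # lets pods on the same node reach the Agent via the node IP
            - name: zpages
              containerPort: 55679
```

A sidecar deployment would instead add the same container to each service's pod spec, which trades higher resource usage for tighter isolation.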

Collector

The Collector can be deployed as a binary or container. In either case, it should be deployed as a standalone service. The Collector can be scaled up or out as needed. Performance information can be found on GitHub (more on OpenCensus Service performance in a future post).

Configuration

Configuration of the OpenCensus Service is generally done via YAML (though the Collector also offers CLI configuration flags). Both the Agent and Collector listen for incoming traffic on port 55678 (OpenCensus receiver) and have no exporter configured by default. Both also expose z-pages on port 55679 (more on z-pages in a future post), which can be used to troubleshoot the OpenCensus Service. The Agent and Collector are based on the same code base and, as a result, share many of the same features and configuration options.

Whether working with the Agent or the Collector, configuring receivers is the same. Here is an example configuration to enable additional receivers:

receivers:
  jaeger:
    jaeger-thrift-tchannel-port: 14267
    jaeger-thrift-http-port: 14268

  zipkin:
    address: "127.0.0.1:9411"

  prometheus:
    config:
      scrape_configs:
        - job_name: "caching_cluster"
          scrape_interval: 5s
          static_configs:
            - targets: ["localhost:8889"]

To simplify operations, the OpenCensus receiver is enabled by default, and only needs to be specified when a custom receiver configuration is desired.
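
As a sketch of what such a custom configuration might look like, the snippet below binds the OpenCensus receiver to the loopback interface only. It assumes the receiver accepts an `address` key in the same way as the zipkin receiver shown above; check the README for the exact option names before relying on it.

```yaml
receivers:
  opencensus:
    # Assumed key, mirroring the zipkin receiver's "address" setting.
    address: "127.0.0.1:55678"
```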

Protip: If you wish to enable a receiver with the default configuration, you can do so by specifying the receiver without any parameters. For example, the following is equivalent to the zipkin and jaeger configurations above:

receivers:
  jaeger: {}
  zipkin: {}

Agent

As mentioned above, the Agent (like the Collector) does not send data to any destination by default. Destinations can be specified by configuring one or more exporters. The recommended configuration is for the Agent to export to the OpenCensus Collector (via the opencensus exporter). For example:

exporters:
  opencensus:
    compression: "gzip"
    endpoint: "oc-collector.default.svc.cluster.local:55678"

While there are other configuration options for the Agent, the receivers and exporters sections are the only commonly changed options.
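
Putting the two sections together, a complete Agent configuration file might look like the sketch below. It only combines settings already shown in this post: default Zipkin and Jaeger receivers, plus an exporter pointing at the Collector.

```yaml
receivers:
  # The OpenCensus receiver (port 55678) is enabled by default,
  # so it does not need to be listed here.
  jaeger: {}
  zipkin: {}

exporters:
  opencensus:
    compression: "gzip"
    endpoint: "oc-collector.default.svc.cluster.local:55678"
```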

Collector

Like the Agent, exporting needs to be configured on the Collector. One difference with the Collector is that it leverages queued-exporters, which offer many additional configuration options. For example:

queued-exporters:
  # A friendly name for the processor
  omnition:
    batching:
      enable: false
      # sets the time, in seconds, after which a batch will be
      # sent regardless of size
      timeout: 1
      # number of spans that, once reached, triggers the batch
      # to be sent
      send-batch-size: 8192
    # num-workers is the number of queue workers that will be
    # dequeuing batches and sending them out (default is 10)
    num-workers: 2
    # queue-size is the maximum number of batches allowed in
    # the queue at a given time (default is 5000)
    queue-size: 100
    # retry-on-failure indicates whether queue processor
    # should retry span batches in case of processing
    # failure (default is true)
    retry-on-failure: true
    # backoff-delay is the amount of time a worker waits after
    # a failed send before retrying (default is 5 seconds)
    backoff-delay: 3s
    # sender-type is the type of sender used by this
    # processor, the default is an invalid sender so it forces
    # one to be specified
    sender-type: opencensus
    # configuration of the selected sender-type, in this
    # example opencensus. The settings documented here are:
    # endpoint: address of the destination to send to
    # headers: a map of any additional headers to be sent
    #          with each batch (e.g.: api keys, etc)
    # timeout: the timeout for the sender to consider the
    #          operation as failed
    opencensus:
      endpoint: "https://ingest.omnition.io"
      headers: { "x-omnition-api-key": "00000000-0000-0000" }
      secure: true
      timeout: 5s

In general, the backend you wish to export to will have a recommended default configuration. The only configuration a user needs to specify is where the destination is and how to authenticate, if applicable.
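
In other words, a queued-exporter can often be reduced to something like the sketch below, leaving the queue, retry, and batching settings at their defaults. The exporter name, endpoint, and header are hypothetical placeholders; whether extra settings such as `secure` are required depends on the backend.

```yaml
queued-exporters:
  my-backend:                 # hypothetical friendly name
    sender-type: opencensus
    opencensus:
      endpoint: "https://ingest.example.com"    # placeholder destination
      headers: { "x-api-key": "REPLACE_ME" }    # placeholder authentication header
```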

In addition to exporters, it may be necessary to specify other configuration options in the Collector, such as adding attributes to every span or renaming attribute keys. Both can be done by configuring global attributes. For example:

global:
  attributes:
    overwrite: true
    values:
      # values are key value pairs where the value can be an
      # int, float, bool, or string
      some_string: "hello world"
      some_int: 1234
      some_float: 3.14159
      some_bool: false
    key-mapping:
      # key-mapping is used to replace the attribute key with
      # different keys
      - key: servertracer.http.responsecode
        replacement: http.status_code
      - key: servertracer.http.responsephrase
        replacement: http.message
        overwrite: true
        keep: true

Another option in the Collector is the ability to configure intelligent (tail-based) sampling. This allows sampling decisions to be made from complete traces instead of when a trace starts (more on intelligent sampling in a future post). For example:

sampling:
  mode: tail
  # amount of time from seeing the first span in a trace until
  # making the sampling decision
  decision-wait: 10s
  # maximum number of traces kept in memory
  num-traces: 10000
  policies:
    # user-defined policy name
    my-string-tag-filter:
      # exporters the policy applies to
      exporters:
        - jaeger
      policy: string-tag-filter
      configuration:
        tag: tag1
        values:
          - value1
          - value2
    my-numeric-tag-filter:
      exporters:
        - zipkin
      policy: numeric-tag-filter
      configuration:
        tag: tag1
        min-value: 0
        max-value: 100

Finally, the Collector also offers command-line flags to enable functionality (i.e. without manually editing or supplying the configuration). The parameters available today are:

OpenCensus Collector

Usage:
  occollector [flags]

Flags:
  --config string
  --debug-processor (combine with log level DEBUG)
  --health-check-http-port uint (default 13133)
  -h, --help
  --http-pprof-port uint
  --log-level string (DEBUG, *INFO*, WARN, ERROR, FATAL)
  --metrics-level string (NONE, *BASIC*, NORMAL, DETAILED)
  --metrics-port uint (default 8888)
  --receive-jaeger (default {TChannel:14267 HTTP:14268})
  --receive-oc-trace (default {Port:55678}) (default true)
  --receive-zipkin (default {Port:9411})
  --receive-zipkin-scribe (default {Port:9410})
  --tail-sampling-always-sample
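
As a rough sketch of how these flags fit into a container deployment, the Kubernetes manifest below passes `--config` and `--log-level` to the Collector and exposes it under the service name used in the Agent exporter example. The image, release tag, and ConfigMap name are placeholders, and it assumes the image's entrypoint is the occollector binary so that `args` are passed straight to it.

```yaml
# Sketch only: image, tag, and ConfigMap name are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: oc-collector
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: oc-collector
  template:
    metadata:
      labels:
        app: oc-collector
    spec:
      containers:
        - name: oc-collector
          image: example.com/opencensus-collector:RELEASE_TAG  # placeholder image
          args:
            - "--config=/conf/oc-collector-config.yaml"
            - "--log-level=DEBUG"
          ports:
            - containerPort: 55678  # OpenCensus receiver
          volumeMounts:
            - name: config
              mountPath: /conf
      volumes:
        - name: config
          configMap:
            name: oc-collector-config  # hypothetical ConfigMap holding the YAML file
---
apiVersion: v1
kind: Service
metadata:
  name: oc-collector
  namespace: default
spec:
  selector:
    app: oc-collector
  ports:
    - name: opencensus
      port: 55678
      targetPort: 55678
```

With this Service in place, Agents can reach the Collector at oc-collector.default.svc.cluster.local:55678.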

Getting Started

While it may look like there are a lot of configuration options for the OpenCensus Service, the repository has some great resources to get you started quickly. If you are on Kubernetes, an example YAML file is provided. Otherwise, deployment options for running locally or via a container are provided in the README.

Some things to note regarding configuration:

If you are interested in learning more about the OpenCensus Service configuration options, take a look at the OpenCensus Service documentation or the README. If you are interested in getting involved in the project, drop me an email at steve at omnition dot io or direct message me on Gitter @flands. In addition, take a look at our help-wanted and good-first-issue GitHub issues.

