# Adding telegraf agent to opentelemetry-collector-contrib
## Import issues
While working on adding a telegraf receiver to
[opentelemetry-collector-contrib](https://github.com/open-telemetry/opentelemetry-collector-contrib)
one might notice that the dependencies pulled in by telegraf and
opentelemetry-collector-contrib conflict in several places.
Adding the following dependency to the `go.mod` of opentelemetry-collector-contrib
```
require github.com/influxdata/telegraf v1.17.1
```
results in the following build errors:
```
$ make otelcontribcol-unstable
GO111MODULE=on CGO_ENABLED=0 go build -o ./bin/otelcontribcol_unstable_darwin_amd64 \
-ldflags "-X github.com/open-telemetry/opentelemetry-collector-contrib/internal/version.GitHash=63b2f339 -X github.com/open-telemetry/opentelemetry-collector-contrib/internal/version.Version=v0.19.0 -X go.opentelemetry.io/collector/internal/version.BuildType=release" -tags enable_unstable ./cmd/otelcontribcol
go: finding module for package github.com/prometheus/prometheus/discovery/install
go: finding module for package github.com/Azure/azure-sdk-for-go/arm/compute
go: finding module for package github.com/Azure/azure-sdk-for-go/arm/network
../../../.gvm/pkgsets/go1.15.7/global/pkg/mod/github.com/prometheus/prometheus@v2.5.0+incompatible/discovery/azure/azure.go:24:2: module github.com/Azure/azure-sdk-for-go@latest found (v51.0.0+incompatible), but does not contain package github.com/Azure/azure-sdk-for-go/arm/compute
../../../.gvm/pkgsets/go1.15.7/global/pkg/mod/github.com/prometheus/prometheus@v2.5.0+incompatible/discovery/azure/azure.go:25:2: module github.com/Azure/azure-sdk-for-go@latest found (v51.0.0+incompatible), but does not contain package github.com/Azure/azure-sdk-for-go/arm/network
../../../.gvm/pkgsets/go1.15.7/global/pkg/mod/github.com/prometheus/prometheus@v2.5.0+incompatible/discovery/consul/consul.go:27:2: ambiguous import: found package github.com/hashicorp/consul/api in multiple modules:
github.com/hashicorp/consul v1.2.1 (/Users/pmalek/.gvm/pkgsets/go1.15.7/global/pkg/mod/github.com/hashicorp/consul@v1.2.1/api)
github.com/hashicorp/consul/api v1.7.0 (/Users/pmalek/.gvm/pkgsets/go1.15.7/global/pkg/mod/github.com/hashicorp/consul/api@v1.7.0)
../../../.gvm/pkgsets/go1.15.7/global/pkg/mod/go.opentelemetry.io/collector@v0.19.0/receiver/prometheusreceiver/factory.go:22:2: module github.com/prometheus/prometheus@latest found (v2.5.0+incompatible), but does not contain package github.com/prometheus/prometheus/discovery/install
make: *** [otelcontribcol-unstable] Error 1
```
### Working version with resolved import issues
There is a working version at
<https://github.com/pmalek-sumo/opentelemetry-collector-contrib/tree/telegrafreceiver>
(on the `telegrafreceiver` branch) which uses a fork of telegraf at
<https://github.com/pmalek-sumo/telegraf/releases/tag/v1.17.7>.
On the `telegrafreceiver` branch of the fork above, the receiver is added to
the unstable components list, so in order to build it one needs to run:
```
make otelcontribcol-unstable
```
This version has removed the following plugins in order to make
compilation pass:
* `inputs/kube_inventory`
* `inputs/prometheus`
It was also necessary to change `*prompb.Label` to `prompb.Label` in
a couple of lines in
`plugins/serializers/prometheusremotewrite/prometheusremotewrite.go`
due to [changes](https://github.com/prometheus/prometheus/pull/4957)
in `prompb/remote.pb.go` from `github.com/prometheus/prometheus`.
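The kind of adjustment this requires can be sketched with local stand-in types. `Label`, `TimeSeries` and `buildLabels` below are hypothetical simplifications rather than the real `prompb` definitions; the only point illustrated is that a slice of values replaces a slice of pointers.

```go
package main

import (
	"fmt"
	"sort"
)

// Label is a local stand-in for prompb.Label (assumption: reduced to two fields).
type Label struct {
	Name  string
	Value string
}

// TimeSeries is a local stand-in for prompb.TimeSeries. With the older
// generated code the Labels field would have been []*Label.
type TimeSeries struct {
	Labels []Label
}

// buildLabels is a hypothetical helper that turns tags into value-typed
// labels, sorted by name so that the result is deterministic.
func buildLabels(tags map[string]string) []Label {
	labels := make([]Label, 0, len(tags))
	for k, v := range tags {
		// Before the change this would have appended &Label{...}.
		labels = append(labels, Label{Name: k, Value: v})
	}
	sort.Slice(labels, func(i, j int) bool { return labels[i].Name < labels[j].Name })
	return labels
}

func main() {
	ts := TimeSeries{Labels: buildLabels(map[string]string{"host": "a", "cpu": "cpu0"})}
	fmt.Println(ts.Labels)
}
```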
### Incompatible Prometheus dependency
Prometheus introduced Go modules in `v2.6` ([commit][1]) without following the
Go semantic import versioning convention (a module at major version `v2` or
higher should have a `/v<num>` suffix in its module path), and there is
[no intention of changing this][2]. Because of this, requiring the Prometheus
package results in adding `github.com/prometheus/prometheus@v2.5.0` as
a dependency, since that was the last version without support for Go modules
(and hence not subject to the Go module requirement mentioned above).
This can be overcome with requiring a
[particular SHA of a release commit][3] in telegraf like so:
```
go get github.com/prometheus/prometheus@e83ef207b6c2398919b69cd87d2693cfc2fb4127
```
This particular version is [the `v2.21.0` release commit][4], which doesn't conflict with
the `00f16d1ac3a4` version required by otc-contrib (which points to the [`v2.22.1` commit][5]).
[1]: https://github.com/prometheus/prometheus/commit/a516bc2160b86c652d7ebb7d2df0fc27ca328f8b
[2]: https://github.com/prometheus/prometheus/issues/8417#issuecomment-769042914
[3]: https://github.com/prometheus/prometheus/issues/7991#issuecomment-701298893
[4]: https://github.com/prometheus/prometheus/commit/e83ef207b6c2398919b69cd87d2693cfc2fb4127
[5]: https://github.com/prometheus/prometheus/commit/00f16d1ac3a4c94561e5133b821d8e4d9ef78ec2
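The versioning convention behind this problem can be illustrated with a minimal sketch. `majorOf` and `followsImportVersioning` are hypothetical helpers written for this note, not part of the Go toolchain, and the `/v2` module path is only an illustration of what the convention would expect. The check is simplified and ignores special cases such as `gopkg.in` paths.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// majorOf extracts the major number from a version such as "v2.21.0".
func majorOf(version string) (int, error) {
	head := strings.SplitN(strings.TrimPrefix(version, "v"), ".", 2)[0]
	return strconv.Atoi(head)
}

// followsImportVersioning reports whether modulePath carries the "/vN" suffix
// that the convention expects for the given version. Versions v0 and v1 need
// no suffix. This is a simplified check, not the Go toolchain's own logic.
func followsImportVersioning(modulePath, version string) (bool, error) {
	major, err := majorOf(version)
	if err != nil {
		return false, err
	}
	if major < 2 {
		return true, nil
	}
	return strings.HasSuffix(modulePath, "/v"+strconv.Itoa(major)), nil
}

func main() {
	paths := []string{
		"github.com/prometheus/prometheus",    // actual module path
		"github.com/prometheus/prometheus/v2", // hypothetical path with a suffix
	}
	for _, p := range paths {
		ok, err := followsImportVersioning(p, "v2.21.0")
		if err != nil {
			fmt.Println("invalid version:", err)
			continue
		}
		fmt.Printf("%s at v2.21.0 follows the convention: %t\n", p, ok)
	}
}
```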
### Renaming telegraf imports
The [working version](#Working-version-with-resolved-import-issues)
also needed its import paths renamed from `github.com/influxdata/telegraf`
to `github.com/pmalek-sumo/telegraf` to prevent the build from using the original
telegraf sources, which do not work when imported into OTC.
## In process data flow with telegraf's Agent
### `telegraf.Agent` interface
Telegraf's [`Agent`][agent_1], which manages the plugins defined in the
config, exports the following methods (available to be called
from a 3rd party Go package):
* `func (a *Agent) Once(ctx context.Context, wait time.Duration) error`
which runs a single metrics gather
* `func (a *Agent) Run(ctx context.Context) error` which runs the
agent until the context is cancelled
* `func (a *Agent) Test(ctx context.Context, wait time.Duration) error`
which runs a single gather but writes the gathered metrics to stdout
As one can see, there is no point at which one could read the
gathered metrics (apart from running `Test` and hooking into the process'
stdout). Hence, with the current state of `Agent`'s API, it is impossible
to get metrics from telegraf's input plugins into otc for processing if
the in-process otc data model were chosen as the data flow.
Changing telegraf's [`agent.Run()`][agent_run], which manages plugins
(inputs, processors, aggregators, outputs), and adjusting the
code to export ingested metrics so that otc can consume them
would be a rather controversial undertaking which most likely wouldn't
be accepted upstream.
[agent_1]: https://github.com/influxdata/telegraf/blob/86e50f85b39fe9afe1b62b8e1f5ef8c268ff1894/agent/agent.go#L20-L23
[agent_run]: https://github.com/influxdata/telegraf/blob/86e50f85b39fe9afe1b62b8e1f5ef8c268ff1894/agent/agent.go#L112-L198
#### Extending the interface
One could be tempted to extend the interface with the following func:
```
func (a *Agent) RunWithChannel(ctx context.Context, out chan<- telegraf.Metric) error
```
which could be used in OTC's receiver to consume the metrics from
telegraf.
### Data model differences and controversies
`telegraf.Metric` is an interface with (among others) the following
funcs:
```
type Metric interface {
// Name is the primary identifier for the Metric and corresponds to the
// measurement in the InfluxDB data model.
Name() string
// TagList returns the tags as a slice ordered by the tag key in lexical
// bytewise ascending order. The returned value should not be modified,
// use the AddTag or RemoveTag methods instead.
TagList() []*Tag
// FieldList returns the fields as a slice in an undefined order. The
// returned value should not be modified, use the AddField or RemoveField
// methods instead.
FieldList() []*Field
// Type returns a general type for the entire metric that describes how you
// might interpret, aggregate the values.
//
// This method may be removed in the future and its use is discouraged.
Type() ValueType
...
}
```
Here metric values are stored as key-value fields returned
by `FieldList()`, and a single metric type applies to all of those values;
one gets it with `Type()`.
On the other hand, OTC has a data model where a metric has a type (more
fine grained, e.g. `pdata.MetricDataTypeDoubleGauge` instead of
`Gauge`) and data points attached to this metric, each with a value and
a timestamp set. A conversion between the two could look roughly like this:
```
var m telegraf.Metric
...
// A telegraf metric carries one timestamp shared by all of its fields.
var t = m.Time().UnixNano()
...
switch m.Type() {
// ... other types ...
// case telegraf.Cou
// Assumed case label: every field below is converted to a gauge.
case telegraf.Gauge:
	// Each field becomes a separate OTC metric named <metric>_<field>.
	for _, f := range m.FieldList() {
		pm := pdata.NewMetric()
		pm.SetName(m.Name() + "_" + f.Key)
		switch v := f.Value.(type) {
		case float64:
			pm.SetDataType(pdata.MetricDataTypeDoubleGauge)
			dps := pm.DoubleGauge().DataPoints()
			dps.Resize(1)
			dps.At(0).SetValue(v)
			dps.At(0).SetTimestamp(pdata.TimestampUnixNano(t))
		case uint64:
			pm.SetDataType(pdata.MetricDataTypeIntGauge)
			dps := pm.IntGauge().DataPoints()
			dps.Resize(1)
			// Note: values above math.MaxInt64 overflow in this conversion.
			dps.At(0).SetValue(int64(v))
			dps.At(0).SetTimestamp(pdata.TimestampUnixNano(t))
		}
		metrics.Append(pm)
	}
}
...
```
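The fan-out from one telegraf metric with many fields to several single-valued metrics can also be shown in a self-contained form. The types below (`field`, `metric`, `point`) are simplified stand-ins invented for this sketch, not the telegraf or `pdata` types, and `flatten` is a hypothetical helper. Non-numeric fields are skipped here, which is an assumption rather than a decision made in this document.

```go
package main

import (
	"fmt"
	"sort"
)

// field and metric are simplified stand-ins for telegraf.Field and telegraf.Metric.
type field struct {
	Key   string
	Value interface{}
}

type metric struct {
	Name   string
	Fields []field
	TimeNs int64
}

// point is a stand-in for one OTC gauge metric holding a single data point.
type point struct {
	Name     string
	IsInt    bool
	IntVal   int64
	FloatVal float64
	TimeNs   int64
}

// flatten turns every numeric field of m into its own point named
// <metric>_<field>, all sharing the metric's timestamp.
func flatten(m metric) []point {
	var out []point
	for _, f := range m.Fields {
		p := point{Name: m.Name + "_" + f.Key, TimeNs: m.TimeNs}
		switch v := f.Value.(type) {
		case float64:
			p.FloatVal = v
		case int64:
			p.IsInt, p.IntVal = true, v
		case uint64:
			// Values above math.MaxInt64 would overflow here.
			p.IsInt, p.IntVal = true, int64(v)
		default:
			continue // strings and bools are skipped in this sketch
		}
		out = append(out, p)
	}
	// FieldList order is undefined, so sort for a stable result.
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

func main() {
	m := metric{
		Name:   "mem",
		TimeNs: 1611000000000000000,
		Fields: []field{
			{Key: "used_percent", Value: 42.5},
			{Key: "free", Value: uint64(1024)},
			{Key: "state", Value: "ok"},
		},
	}
	for _, p := range flatten(m) {
		fmt.Printf("%+v\n", p)
	}
}
```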
## Data flow through loopback interface
Another option would be to run telegraf as described above (using
`telegraf.Agent`) but with its output plugins sending data to otc receivers.
This would require the user to pay more attention during the
configuration and it would consume more resources because of the
serialization/deserialization costs (serializing at telegraf's output
plugin and deserializing in the otc receiver).
This might also be a rather controversial idea to bring up to upstream.
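The extra hop can be illustrated with a minimal sketch in which a metric is serialized, sent over a loopback HTTP connection and deserialized on the other side. JSON and the `metric` type are stand-ins chosen for brevity, and `roundTrip` is a hypothetical helper; a real setup would use whatever wire format the chosen telegraf output plugin and otc receiver share.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// metric is a simplified stand-in for a gathered telegraf metric.
type metric struct {
	Name   string             `json:"name"`
	Fields map[string]float64 `json:"fields"`
}

// roundTrip serializes m, sends it to a local test server standing in for an
// otc receiver, and returns what the receiving side decoded.
func roundTrip(m metric) (metric, error) {
	received := make(chan metric, 1)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var got metric
		// Deserialization cost paid on the receiver side.
		if err := json.NewDecoder(r.Body).Decode(&got); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		received <- got
	}))
	defer srv.Close()

	// Serialization cost paid on the output plugin side.
	body, err := json.Marshal(m)
	if err != nil {
		return metric{}, err
	}
	resp, err := http.Post(srv.URL, "application/json", bytes.NewReader(body))
	if err != nil {
		return metric{}, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return metric{}, fmt.Errorf("unexpected status %s", resp.Status)
	}
	return <-received, nil
}

func main() {
	got, err := roundTrip(metric{Name: "cpu", Fields: map[string]float64{"usage_idle": 90.5}})
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Printf("receiver decoded %+v\n", got)
}
```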