Sending Wavefront Data from the CLI
last updated: 12 September 2019

Though Wavefront has a remarkable ability to relentlessly consume huge amounts of metrics from agents like Telegraf or CollectD, it is sometimes useful to be able to send data in a more ad-hoc way. The Wavefront CLI can help with that.

This article looks at data ingestion from the command-line. All the examples work with version 2.16.2 of the wavefront-cli gem, going into a 2018-46-44 cluster, via a 4.29 proxy.

Single, Arbitrary Points

Sometimes you want to poke just the odd point into Wavefront. The write point sub-command does just that. Here’s the syntax, with the write-specific options.

wf write point [-DnViq] [-c file] [-P profile] [-E proxy] [-t time]
         [-p port] [-H host] [-T tag...] [-u method] [-S socket] <metric>
         <value>

Options:
  -E, --proxy=URI               proxy endpoint
  -t, --time=TIME               time of data point (omit to use current
                                time)
  -H, --host=STRING             source host
  -p, --port=INT                Wavefront proxy port
  -T, --tag=TAG                 point tag in key=value form
  -F, --infileformat=STRING     format of input file or stdin
  -m, --metric=STRING           the metric path to which contents of a file
                                will be assigned. If the file contains a metric
                                name, the two will be dot-concatenated, with
                                this value first
  -i, --delta                   increment metric by given value
  -I, --interval=INTERVAL       interval of distribution (default 'm')
  -u, --using=METHOD            method by which to send points
  -S, --socket=FILE             Unix datagram socket
  -q, --quiet                   don't report the points sent summary (unless
                                there were errors)

Pretty simple, I hope.

$ wf write point dev.cli.example 10
          sent 1
      rejected 0
        unsent 0
$ wf write -u api point dev.cli.example 20
          sent 1
      rejected 0
        unsent 0
$ wf query --start=-1m "ts(dev.cli.example)"
name         ts(dev.cli.example)
query        ts(dev.cli.example)
timeseries
  label      dev.cli.example
  sparkline  > <
  host       box
  data       2019-02-13   15:51:18    20.0
             ---------------------------------------------------------------
  label      dev.cli.example
  sparkline  > <
  host       box
  tags
    env      lab
  data       2019-02-13   15:51:13    10.0

Note that the query results are in two sections, even though we sent two values on the same metric path. The directly ingested point (value 20) has no tags, but the one we sent via the proxy is tagged with env=lab. This is because my lab proxy has a preprocessor rule which tags everything going through it, and it shows that sending points directly, though useful in many cases, is not a straight substitute for using a proxy.

Most obviously, API calls are far more expensive. Sending a metric via -u api takes about a second for me, because that metric’s got to get from England to us-west-1, over HTTPS. Sending to a proxy on the same subnet is pretty much instantaneous. (The proxy batches points and compresses the bundle before sending to your cluster, so the actual delay may be longer, though it will “feel” faster to your client.)

The Wavefront proxy is not a thing you want to avoid. Even a modestly sized one can handle insane amounts of metrics, and it gives you great reliability through its buffering and retrying, and efficiency through batching and compressing of the data it sends. It lets you manipulate, mangle, or block points based on sophisticated rules. It can extract metrics from log files, tag things on the fly, understand all kinds of different formats. And it generates rich metrics of everything it does, which can be very useful for debugging and tuning.

That’s not to say direct ingestion is without value. For instance, when we made an IoT biscuit tin, we wanted its metrics to go to Wavefront. That turned out to be a huge pain, because all our proxies, and indeed, all our hosts were in AWS, and our biscuit tin was in the office. Direct ingestion – which didn’t exist at the time – would have been perfect for a little job like that. We’ve also, at times, wanted to put a small amount of data from Lambda functions into Wavefront, but the Lambdas were running in VPCs without proxy access. We could peer, or stand up proxies, but direct ingestion would have been easier all round. Now direct ingestion is available, we’re starting to use it all over the place.

Note that we didn’t specify a timestamp for the point, so the CLI assumed “now”.

When you set a timestamp, as in all wf subcommands, you can use epoch seconds, or anything naturally parseable by Ruby’s strptime() method.

$ wf write point -t 14:20:33 dev.cli.example 98.76
          sent 1
      rejected 0
        unsent 0

If you find yourself wondering whether, or how, wf will parse a time you enter, open up irb and find out.

$ irb -r time
irb(main):003:0> Time.parse('12:00')
=> 2018-10-22 12:00:00 +0100
irb(main):004:0> Time.parse('13/03/2016')
=> 2016-03-13 00:00:00 +0000
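The same check can be scripted. This is only a sketch of the idea, not the CLI’s actual parsing code: to_epoch is a hypothetical helper name, and it assumes that an all-digit string is already epoch seconds and that everything else can be handed to Time.parse.

```ruby
require 'time'

# Hypothetical helper: turn a user-supplied time into epoch seconds.
# Assumption: a string of digits is already epoch seconds; anything
# else goes to Time.parse, which raises ArgumentError if it cannot
# make sense of the input.
def to_epoch(str)
  return str.to_i if str.match?(/\A\d+\z/)

  Time.parse(str).to_i
end

puts to_epoch('1550067633')
puts to_epoch('2019-02-13 14:20:33 UTC')
```

Times without a zone are interpreted in the local timezone, which is why the irb output above shows +0100 for one date and +0000 for the other.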

Note that when you send points, you get a summary of how many were sent, rejected, or unsent. Depending on your viewpoint this is useful and reassuring, or irritating, so you have the option to make the write command quiet with -q. If anything goes wrong, even with -q specified, wf will exit nonzero and print the summary anyway.

If you are not irritated by summaries, and demand EVEN MORE verbosity when writing points, you’re in luck. --verbose (AKA -V) will make wf print out every point it sends in native Wavefront wire format. (Or as an HTTP POST if you’re going over the API.) You can combine it with -q to only get the detail.

$ wf write point -t 14:20:33 dev.cli.example 98.76 --verbose
SDK INFO: dev.cli.example 98.76 1550067633 source=box
          sent 1
      rejected 0
        unsent 0
$ wf write point -t 14:21:33 dev.cli.example 54.321 -Vq
SDK INFO: dev.cli.example 54.321 1550067693 source=box
$ wf write point -t 14:20:33 -u api dev.cli.example 98.76 --verbose
SDK INFO: dev.cli.example 98.76 1550067633 source=box
SDK INFO: uri: POST https://metrics.wavefront.com/report
SDK INFO: body: dev.cli.example 98.76 1550067633 source=box
          sent 1
      rejected 0
        unsent 0
$ wf write point -u api -t 14:21:33 dev.cli.example 54.321 -Vq
SDK INFO: dev.cli.example 54.321 1550067693 source=box
SDK INFO: uri: POST https://metrics.wavefront.com/report
SDK INFO: body: dev.cli.example 54.321 1550067693 source=box

There’s --debug too, but that will take you into the innards of wf. Hopefully everything will work all the time and you’ll never need it.

You can write points with tags, using -T. Multiples are allowed.

$ wf write point -q -t 14:25 -T cmd=wf -T subcmd=write dev.cli.example 99.999

If I don’t specify a source (or “host”), the CLI will use what it thinks is the hostname of my machine. Up to now that’s been box.

$ wf write point -q -H made-up-host dev.cli.example 99

Here are the points we just sent. Hover over them and you’ll see the tags. (If you can’t see the chart, you’ll have to enable third-party cookies for this page, because the embedded graphs use Typekit.)
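To make the wire format in the verbose output concrete, here is a minimal sketch that assembles a line from its parts. wire_format is a hypothetical helper, not something the CLI or SDK exposes. The layout is taken from the SDK INFO lines above. The key="value" quoting of tags is an assumption based on Wavefront’s data format, because the verbose examples in this article don’t show a tagged point.

```ruby
# Hypothetical helper: build one line of Wavefront wire format.
# Layout assumed from the verbose output above:
#   <metric> <value> [<timestamp>] source=<host> [key="value" ...]
def wire_format(path, value, source:, ts: nil, tags: {})
  parts = [path, value.to_f]
  parts << ts.to_i if ts               # the timestamp is optional
  parts << "source=#{source}"
  tags.each { |k, v| parts << %(#{k}="#{v}") }
  parts.join(' ')
end

puts wire_format('dev.cli.example', 98.76, ts: 1550067633, source: 'box')
puts wire_format('dev.cli.example', 99.999, source: 'box',
                 tags: { cmd: 'wf', subcmd: 'write' })
```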

I did not specify a proxy endpoint or port in any of the above examples. The write command respects the .wavefront config file, so I have my proxy stowed away in there:

$ grep -v token ~/.wavefront
[default]
endpoint = metrics.wavefront.com
format = human
proxy = wavefront.localnet

write -u api uses the endpoint and token just like any other wf command.

api isn’t the only thing you can -u. As of late 2018, proxies accept HTTP POSTed points, on the same port the socket uses. This is simple HTTP - there’s no authentication or authorization yet, but it works, and the CLI supports it.

$ wf write point dev.cli.example 123 --verbose --using http
SDK INFO: dev.cli.example 123.0 source=box
SDK INFO: uri: POST http://wavefront:2878/
SDK INFO: body: dev.cli.example 123.0 source=box
          sent 1
      rejected 0
        unsent 0
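If you want to do the same thing without the CLI, the request is easy to build by hand. This is a sketch under assumptions, not the SDK’s implementation: proxy_post is a hypothetical helper, the host and port are placeholders, and the Content-Type header is a guess. It only builds the request. The line that would actually send it is left as a comment, because it needs a reachable proxy.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: build (but do not send) a POST carrying one or
# more points in wire format, one per line.
def proxy_post(host, port, points)
  uri = URI("http://#{host}:#{port}/")
  req = Net::HTTP::Post.new(uri)
  req['Content-Type'] = 'text/plain' # assumption, not confirmed here
  req.body = Array(points).join("\n")
  [uri, req]
end

uri, req = proxy_post('wavefront.localnet', 2878,
                      'dev.cli.example 123.0 source=box')
puts "POST #{uri}"
puts req.body

# To actually send it (needs a reachable proxy):
#   Net::HTTP.start(uri.host, uri.port) { |http| http.request(req) }
```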

The final (for now) -u mechanism is unix. This writes points to a Unix socket. I added this because I wanted a very fast local mechanism to write points, which could handle a proxy temporarily being unavailable. I was already using Telegraf, so I added a socket listener by putting this in telegraf.conf.

[[inputs.socket_listener]]
  service_address = "unix:///tmp/telegraf.sock"
  data_format = "wavefront"

Then I could write points straight to the socket. (I do this from inside Ruby, using the SDK. But once it was in the SDK I thought I might as well put a CLI front-end on it.) You have to specify the path to the socket with -S (or --socket). No other credentials are required, and everything looks exactly the same as the other methods.

$ wf write point -u unix -S /tmp/telegraf.sock dev.cli.socket 1 --verbose
SDK INFO: dev.cli.socket 1.0 source=www-blue
          sent 1
      rejected 0
        unsent 0
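For a feel of what writing to that socket involves, here is a minimal sketch using only Ruby’s standard library. It is not the SDK’s code. write_to_socket is a hypothetical helper, and it assumes a stream socket (which is what a unix:// address in the Telegraf config above asks for) that accepts newline-terminated points in wire format.

```ruby
require 'socket'

# Hypothetical helper: write wire-format lines to a Unix stream socket.
# Assumption: the listener accepts one newline-terminated point per line.
def write_to_socket(path, points)
  UNIXSocket.open(path) do |sock|
    Array(points).each { |point| sock.puts(point) }
  end
end

# Needs something listening on the socket, for example:
#   write_to_socket('/tmp/telegraf.sock', 'dev.cli.socket 1.0 source=box')
```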

As other transport mechanisms appear, they will be supported by the CLI.

Multiple Points, From a File

Writing one point at a time is fine, and may well be just what you need, but it’s more likely that you want to push in a batch of points. Enter write file.

wf write file [-DnViq] [-c file] [-P profile] [-E proxy] [-H host]
         [-p port] [-F infileformat] [-m metric] [-T tag...]
         [-u method] [-S socket] <file>

A while ago, I needed to push retrospective data into Wavefront, and had to hack together some Ruby to generate and push the points. Now I could use the CLI.

Here’s an example file.

$ cat file1
1550075043 dev.cli.file1 144
1550075167 dev.cli.file1 185
1550075253 dev.cli.file1 157
1550075350 dev.cli.file1 129
1550075384 dev.cli.file1 48
1550075540 dev.cli.file1 67
1550075549 dev.cli.file1 172

Clearly the three fields are epoch timestamp, metric path, and value. I can load in that file with the following ‘write file’ command. Supplying -V will show me the points in Wavefront wire format, as they go in.

$ wf write file -V -F tmv file1
SDK INFO: dev.cli.file1 144.0 1550075043 source=box
SDK INFO: dev.cli.file1 185.0 1550075167 source=box
SDK INFO: dev.cli.file1 157.0 1550075253 source=box
SDK INFO: dev.cli.file1 129.0 1550075350 source=box
SDK INFO: dev.cli.file1 48.0 1550075384 source=box
SDK INFO: dev.cli.file1 67.0 1550075540 source=box
SDK INFO: dev.cli.file1 172.0 1550075549 source=box
          sent 7
      rejected 0
        unsent 0

And here’s the chart. Hover over the points and you’ll see the values from the file.

The key part of the wf write file command is the -F option. This lets the user describe the format of the file they wish wf to parse. t stands for timestamp, m for metric, and v for value. So tmv describes the format of file1.

The v column is mandatory, but the time and metric path can be set in other ways. For instance, the -m option allows you to define a metric path which will be applied to all data points in the file. So, the following file and command would be an identical data load to the example above.

$ cat file1
1550075043 144
1550075167 185
1550075253 157
1550075350 129
1550075384 48
1550075540 67
1550075549 172
$ wf write file -F tv -m dev.cli.file1 file1
          sent 7
      rejected 0
        unsent 0

You can also use -m to set a metric prefix, and have the final portion of the metric in your file. If you do that, the two parts will be concatenated. I’ll show you that later.

If you wish, you can even add point tags to a data load. For fine-grained control, put them at the end of each line to which they apply. To tag everything uniformly, use the -T key=val option. If you do both, you get both sets of tags. Tags have to be at the end of the line because there can be arbitrarily many for each data point, and the number may not be constant.

All this, of course, works exactly the same for any ingestion method.
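The rules for -F, -m and -T are easier to see in code. The following is a sketch of those rules only, not the CLI’s real parser: parse_line and its keyword arguments are hypothetical names, and it assumes whitespace-separated fields with any trailing fields being key=value tags.

```ruby
# Hypothetical parser for one line of input, driven by a format
# string such as 'tmv'. It only illustrates the rules described above.
def parse_line(line, format, prefix: nil, extra_tags: {})
  fields = line.split
  raise ArgumentError, "too few fields: #{line}" if fields.size < format.size

  point = { tags: extra_tags.dup }
  format.each_char.with_index do |key, i|
    case key
    when 't' then point[:ts] = Integer(fields[i])
    when 'm' then point[:path] = fields[i]
    when 'v' then point[:value] = Float(fields[i])
    else raise ArgumentError, "unknown format character: #{key}"
    end
  end
  raise ArgumentError, 'format must include v' unless point.key?(:value)

  # -m style prefix: dot-concatenated with any path found in the line
  point[:path] = [prefix, point[:path]].compact.join('.')
  raise ArgumentError, 'no metric path' if point[:path].empty?

  # Anything after the described columns is treated as key=value tags
  fields.drop(format.size).each do |tag|
    k, v = tag.split('=', 2)
    point[:tags][k] = v
  end
  point
end

p parse_line('1550075043 dev.cli.file1 144', 'tmv')
p parse_line('1550075043 144', 'tv', prefix: 'dev.cli.file1')
p parse_line('1 1029 env=lab', 'mv', prefix: 'dev.cli.d1')
```

The last call shows the prefix case: a path of 1 in the line and a prefix of dev.cli.d1 give dev.cli.d1.1.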

Multiple Points, from a Live Source

Though it’s more useful than sending a single point, I still think loading data in from a static file is something most people would use rarely, if ever. Far more useful to, in proper Unix style, set the input file to -, and read from standard in.

Maybe the simplest illustration is to generate some (pseudo) random data. (Ignoring the fact that Wavefront has a perfectly capable random() function.)

$ while true; do echo $RANDOM; sleep 1; done | wf write file -V -m dev.cli.demo -Fv -
Connecting to wavefront.localnet:2878.
Sending: dev.cli.demo 18718.0 source=box
Sending: dev.cli.demo 13481.0 source=box
Sending: dev.cli.demo 18154.0 source=box
Sending: dev.cli.demo 7834.0 source=box
Sending: dev.cli.demo 19986.0 source=box
Sending: dev.cli.demo 7418.0 source=box
Sending: dev.cli.demo 20295.0 source=box
Sending: dev.cli.demo 20602.0 source=box
...

Producing:

That’s fine, but you’re more likely to want to plot the output of a command, so to illustrate that, here’s a little script which generates the points for a parabola. You can see it outputs pairs of numbers: the first is the abscissa, as a timestamp, and the second is the ordinate.

#!/usr/bin/env ruby

h, k, a = 25, 1000, 10

1.upto(49) do |x|
  $stdout.puts "#{Time.now.to_i} #{a * (x - h) ** 2 + k}"
  $stdout.flush
  sleep 1
end

The $stdout stuff is necessary because otherwise the script will flush all its output when it exits, and I wanted to use wf’s -V option to watch the points flowing through when I was testing. (wf has a --noop flag which will not make a connection to the proxy, and will show you the points in Wavefront wire format, in real-time.)
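An alternative to flushing after every line is to put $stdout into synchronous mode once, at the top of the script. The sketch below shows that variation. The parabola method is a hypothetical refactoring of the arithmetic above, and the loop is cut down to three points with no sleep so it finishes immediately.

```ruby
# Same parabola as above, with the arithmetic pulled into a method.
def parabola(x, h: 25, k: 1000, a: 10)
  a * (x - h)**2 + k
end

# With sync set, every puts is written straight away, so there is no
# need to call flush after each one.
$stdout.sync = true

1.upto(3) do |x|
  puts "#{Time.now.to_i} #{parabola(x)}"
end
```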

Anyway, run the script, and pipe its output into wf, supplying a metric path and a description of the file format.

$ ./parabola.rb | wf write file -m dev.cli.demo -V -F tv -

Back in a previous article, I wrote some Ruby to wire DTrace into Wavefront. Now, I can use the write file command for simple D scripts.

Revisiting intr.d, I can describe the field format I expect. The old version of the CLI would ignore lines which don’t match the field definition, but in the rewrite I chose to make that throw an error. So now we have to run the output through awk to reject anything without two fields. The first field is the CPU ID, which I want as the final part of the metric path, and the second is the value to send (in this case, the total number of interrupts handled by that CPU). Because I am not supplying any timestamps, wf will use the current UTC time whenever it sends a point. -V is for verbosity.

# ./intr.d | awk 'NF == 2 { print $0 }' | wf write file -V -m dev.cli.d1 -F mv -
dtrace: script './intr.d' matched 4 probes
SDK INFO: dev.cli.d1.1 1029.0 source=cube
SDK INFO: dev.cli.d1.0 1214.0 source=cube
SDK INFO: dev.cli.d1.2 1188.0 source=cube
SDK INFO: dev.cli.d1.3 1239.0 source=cube
SDK INFO: dev.cli.d1.1 1620.0 source=cube
SDK INFO: dev.cli.d1.0 1910.0 source=cube
SDK INFO: dev.cli.d1.2 1775.0 source=cube
SDK INFO: dev.cli.d1.3 1867.0 source=cube
SDK INFO: dev.cli.d1.0 2705.0 source=cube
SDK INFO: dev.cli.d1.1 3394.0 source=cube
SDK INFO: dev.cli.d1.2 2146.0 source=cube
...

and, with the whole thing wrapped in a deriv() expression, to turn a counter into a gauge, I see:

How about kstats? Say I’d like to see a chart of network throughput when I do an NFS copy between a couple of machines. That’s now a one-liner. (Or it would be if I didn’t have to break it because of formatting issues!) Let’s use direct ingestion, just to show that it works the same.

# while true; do kstat link:0:net0:obytes64 | grep obytes; sleep 1; \
  done | wf write file -u api -V -Fmv -m dev.cli.network -
SDK INFO: dev.cli.network.obytes64 12747937388.0 source=cube
SDK INFO: uri: POST https://metrics.wavefront.com/report
SDK INFO: body: dev.cli.network.obytes64 12747937388.0 source=cube
SDK INFO: dev.cli.network.obytes64 12747937554.0 source=cube
SDK INFO: uri: POST https://metrics.wavefront.com/report
SDK INFO: body: dev.cli.network.obytes64 12747937554.0 source=cube
SDK INFO: dev.cli.network.obytes64 12747941720.0 source=cube
SDK INFO: uri: POST https://metrics.wavefront.com/report
...

That required no setting up, and nothing beyond a local installation of the wavefront-cli gem. Now you have no excuse for not putting everything in Wavefront!

Histograms

Wavefront now has an add-on histogram feature. For this to work you need to have a histogram-enabled endpoint. Speak to your sales person.

Histograms are a way around Wavefront’s one-second resolution limit, and a way of interpreting millions of points per second without it costing the earth. They work like a global statsd. You send points to a proxy, which buckets them all, and flushes a mathematical description of said bucket up to your cluster at a predefined interval. These intervals are every minute, hour, and day.
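As a rough illustration of the bucketing idea, and nothing more, the sketch below groups timestamped samples into one-minute buckets and summarises each one. It does not reflect how the proxy actually stores or describes a bucket. minute_buckets is a hypothetical name, and the “median” here is simply the middle element of the sorted values (the upper one when the count is even).

```ruby
# Illustration only: group [timestamp, value] samples into one-minute
# buckets and summarise each bucket.
def minute_buckets(samples)
  samples.group_by { |ts, _value| ts - (ts % 60) }.map do |start, pairs|
    values = pairs.map(&:last).sort
    { start: start, count: values.size, min: values.first,
      max: values.last, median: values[values.size / 2] }
  end
end

samples = [[60, 5], [61, 1], [119, 9], [120, 4]]
minute_buckets(samples).each { |bucket| p bucket }
```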

You must configure your proxy to allow histogram ingestion, and each of the intervals I mentioned has its own port. By default the “minute” bucket listens on 40001, the hourly one on 40002, and the daily on 40003. Sending metrics with the CLI and having them bucketed in one-minute intervals works exactly as I described above, except that you pop -p 40001 in the command. Watch.

$ while true
> do
> wf write point -qV -p 40001 demo.cli.histogram_1 $RANDOM
> sleep 0.1
> done
SDK INFO: demo.cli.histogram_1 1028.0 source=box
SDK INFO: demo.cli.histogram_1 11952.0 source=box
SDK INFO: demo.cli.histogram_1 12442.0 source=box
SDK INFO: demo.cli.histogram_1 26243.0 source=box
SDK INFO: demo.cli.histogram_1 17687.0 source=box
...

produces:

Once the results are in Wavefront, you can view them with an hs() (as opposed to ts()) expression, and apply various statistical functions. The chart above uses max(), median(), and min(), and uses percentile() to show the 95th percentile. As this analysis is performed on data from all hosts, it’s a true 95th percentile, not an average view of the 95th percentile from each host.

There is another way of writing histogram data to Wavefront, which is to use a “distribution”. A distribution assigns multiple values to a single metric over a given time range. So, if you were recording web server response codes, and had 150 “200”s and 6 “404”s in a minute, you could send a distribution which looked like #150 200 #6 404.

The CLI lets you send distributions just like normal points, using write distribution.

wf write distribution [-DnViq] [-c file] [-P profile] [-E proxy]
         [-H host] [-p port] [-T tag...] [-u method] [-S socket] [-I interval]
         <metric> <val>...

Wavefront describes distributions in the way I just showed you, with #a b where a is the number of times b occurred during the time range. To save you the trouble of counting your individual values, the CLI lets you describe a distribution “in the raw”.

$ wf write distribution -V demo.dist 3 1 4 1 1 2 3 6 4 1 3 2
SDK INFO: !M 1539780323 #3 3.0 #4 1.0 #2 4.0 #2 2.0 #1 6.0 demo.dist source=box
       sent 1
   rejected 0
     unsent 0

But if you have gone to the trouble of counting up the values, it would be rude of me to expect you to break them up again. So this will work too.

$ wf write distribution -Vq test.dist 3x3 4x1 2x4 2x2 1x6
SDK INFO: !M 1539781868 #3 3.0 #4 1.0 #2 4.0 #2 2.0 #1 6.0 test.dist source=box

I chose 3x1 rather than Wavefront’s #3 1 format to save you having to escape the hash. You can even mix and match, so 3x1 2 3 is fine.
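The counting involved is simple enough to sketch. This is an illustration of the notation, not the CLI’s code: to_distribution is a hypothetical helper which accepts raw values, pre-counted NxV values, or a mixture, and produces the #count value form seen in the verbose output.

```ruby
# Hypothetical helper: turn CLI-style distribution arguments into
# Wavefront's "#count value" notation. Arguments may be raw values
# ("3") or pre-counted ("4x1" means the value 1, four times).
def to_distribution(args)
  counts = Hash.new(0) # insertion-ordered, so first-seen order is kept
  args.each do |arg|
    if arg.include?('x')
      count, value = arg.split('x', 2)
      counts[Float(value)] += Integer(count)
    else
      counts[Float(arg)] += 1
    end
  end
  counts.map { |value, count| "##{count} #{value}" }.join(' ')
end

puts to_distribution(%w[3 1 4 1 1 2 3 6 4 1 3 2])
puts to_distribution(%w[3x3 4x1 2x4 2x2 1x6])
```

Both calls describe the same distribution, so they print the same line.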

When you send a distribution, you must define the time interval it covers. The -I option lets you do this, and its value can be m, h or d. If you don’t specify, m is chosen. When the CLI detects a distribution it will automatically send it to port 40000. If you need to use a different port, -p will help you out.

You can even take distributions from a file, as we saw above. When you describe the input file format, just use d for distribution instead of v for value. And instead of a single value in the file, use a comma-separated list of values. Values can be straight numbers, or they can be duplicated with an x in the way you already saw. All the other rules of write file apply.

Note that distribution and histogram data cannot be sent via the API. They must go through a proxy. This is a design decision of Wavefront itself, not of the CLI. They also don’t currently appear to work if you send them to the proxy over HTTP.

I hope you find the CLI a useful way of getting data into Wavefront. If you find any bugs, or want any enhancements, please open an issue.