May 11, 2026

Cron Monitoring Is Better When The Job Sends Data

Why I built TelemHQ to track what happened inside scheduled jobs, not only whether they checked in.

The Basic Pattern

The pattern is intentionally small:

curl -X POST "https://telemhq.com/ping/YOUR_TRACKING_TOKEN"

But the useful version looks more like this:

curl -X POST "https://telemhq.com/ping/YOUR_TRACKING_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "status": "success",
    "duration_ms": 91000,
    "records_processed": 12450,
    "records_failed": 0,
    "bytes_written": 8821044
  }'

Now the tracker has more than a timestamp. It has a small run log.

[Figure: Ping history showing a cron run with duration, records processed, failed records, and bytes written]

That is the difference between:

The nightly import ran.

And:

The nightly import ran in 91 seconds, processed 12,450 records, failed 0 records, and wrote 8.8 MB of output.

That second version is a lot easier to trust.
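
Here is a minimal sketch of how a payload like the one above turns into that second sentence. `summarizeRun` is a hypothetical helper, not part of TelemHQ; it assumes the field names from the example payload and decimal megabytes.

```javascript
// Hypothetical helper: turn a ping payload into a one-line summary.
// Field names match the example payload; the job name comes from the caller.
function summarizeRun(jobName, payload) {
  const seconds = Math.round(payload.duration_ms / 1000);
  const megabytes = (payload.bytes_written / 1e6).toFixed(1); // decimal MB
  const processed = payload.records_processed.toLocaleString("en-US");
  const failed = payload.records_failed.toLocaleString("en-US");
  return (
    `The ${jobName} ran in ${seconds} seconds, processed ${processed} records, ` +
    `failed ${failed} records, and wrote ${megabytes} MB of output.`
  );
}

console.log(
  summarizeRun("nightly import", {
    status: "success",
    duration_ms: 91000,
    records_processed: 12450,
    records_failed: 0,
    bytes_written: 8821044,
  })
);
```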

What I Would Put In The Payload

For a normal cron job, I would start with boring operational fields:

status
duration_ms
records_processed
records_failed
bytes_read
bytes_written
error

For a scheduled report:

status
duration_ms
rows_generated
recipients
email_status
report_date

For a backup:

status
duration_ms
bytes_written
file_count
destination
verification_status

For a scheduled AI job:

provider
model
project
status
input_tokens
output_tokens
total_tokens
latency_ms
cost_usd
items_processed
items_failed
eval_score

The specific fields do not matter as much as the habit: after every run, send the facts that would help you debug, explain, or trust that run later.
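
One way to build that habit is a small wrapper that times the job and always reports, whether the job succeeds or throws. This is only a sketch: `runAndReport`, `job`, and `send` are hypothetical names, and the `"failure"` status value is an assumption (only `"success"` appears in the examples above).

```javascript
// Sketch: time a job, then hand the payload to a caller-supplied `send` function.
// Not a TelemHQ SDK; `send` would typically be the HTTP POST to the ping URL.
async function runAndReport(job, send, now = Date.now) {
  const startedAt = now();
  let payload;
  try {
    // The job returns its own counters, e.g. { records_processed, records_failed }.
    const stats = await job();
    payload = { status: "success", ...stats };
  } catch (err) {
    // Assumed status value; send only the message, not the stack trace.
    payload = {
      status: "failure",
      error: err instanceof Error ? err.message : String(err),
    };
  }
  payload.duration_ms = now() - startedAt;
  await send(payload);
  return payload;
}

// Usage with an in-memory `send`, so the sketch runs without a network call.
const sentPayloads = [];
runAndReport(
  async () => ({ records_processed: 12450, records_failed: 0 }),
  async (payload) => {
    sentPayloads.push(payload);
  }
).then((payload) => console.log(payload.status, payload.records_processed));
```

Passing `send` in as a parameter keeps the timing and error handling separate from the transport, which also makes the wrapper easy to test.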

For most jobs, metadata is enough. I would not send prompts, completions, generated code, secrets, customer data, or raw private paths unless there is a clear reason and everyone involved understands it.
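
A simple guard against sending too much is an allowlist applied just before the ping. The sketch below is an assumption about how a job might do this on its own side; `ALLOWED_FIELDS` and `toSafePayload` are hypothetical names.

```javascript
// Hypothetical allowlist: only these keys ever leave the process.
const ALLOWED_FIELDS = new Set([
  "status",
  "provider",
  "model",
  "input_tokens",
  "output_tokens",
  "total_tokens",
  "latency_ms",
  "cost_usd",
  "items_processed",
  "items_failed",
  "eval_score",
]);

// Keep allowlisted scalar fields; drop everything else
// (prompts, completions, nested objects, and so on).
function toSafePayload(raw) {
  const safe = {};
  for (const [key, value] of Object.entries(raw)) {
    const isScalar = ["string", "number", "boolean"].includes(typeof value);
    if (ALLOWED_FIELDS.has(key) && isScalar) {
      safe[key] = value;
    }
  }
  return safe;
}

console.log(
  toSafePayload({
    status: "success",
    total_tokens: 1500,
    prompt: "this never gets sent",
  })
);
```

An allowlist fails closed: a new field added to the job's internal state stays private until someone deliberately adds it to the set.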

[Figure: Payload metric charts for duration, records processed, failed records, and bytes written]

The Sneaky Failure: Success With Bad Data

The most annoying failures are not always crashes.

Sometimes the script exits successfully, but the payload is bad:

records_processed is 0
records_failed is greater than 0
duration_ms is much higher than normal
cost_usd is above budget
eval_score drops below the threshold

That is why I like payload assertions.

For example, a tracker can treat these as rules:

status = success
records_processed > 0
records_failed = 0
duration_ms < 300000

[Figure: Payload assertion rules for a scheduled job]

Now the job does not get a free pass because the process completed. The payload has to make sense too.
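
To make the idea concrete, here is a minimal sketch of evaluating rules like these against a payload. It illustrates the concept only and is not TelemHQ's rule engine; `checkPayload`, `OPERATORS`, and the rule shape are assumptions.

```javascript
// Supported comparisons for this sketch.
const OPERATORS = {
  "=": (actual, expected) => actual === expected,
  ">": (actual, expected) => actual > expected,
  "<": (actual, expected) => actual < expected,
};

// Return a list of human-readable failures; an empty list means the payload passed.
function checkPayload(payload, rules) {
  const failures = [];
  for (const { field, op, value } of rules) {
    const actual = payload[field];
    // A missing field fails the rule instead of passing silently.
    if (actual === undefined || !OPERATORS[op](actual, value)) {
      failures.push(`${field} ${op} ${value} (got ${actual})`);
    }
  }
  return failures;
}

const rules = [
  { field: "status", op: "=", value: "success" },
  { field: "records_processed", op: ">", value: 0 },
  { field: "records_failed", op: "=", value: 0 },
  { field: "duration_ms", op: "<", value: 300000 },
];

// The process "succeeded" but did no work, so one rule fails.
console.log(
  checkPayload(
    { status: "success", records_processed: 0, records_failed: 0, duration_ms: 91000 },
    rules
  )
);
```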

This is especially useful for AI and data jobs, where "completed" can hide bad output. A model call can return something. A data sync can finish. An enrichment job can loop through the queue. But if the output count, failure count, quality score, or cost is wrong, I want the monitoring system to know that.

Scheduled And Ad Hoc Jobs Are Different

Some jobs run every five minutes or every night at 2 AM. Those should have a schedule. If they miss their expected check-in, the tracker should be marked as failing once the grace period has passed.

Other jobs run whenever something happens: a webhook fires, a queue receives work, a user clicks a button, or an agent starts a task. Those should not fail because they did not run on a clock.

In TelemHQ, that is why the schedule is optional.

If a tracker has a cron expression, it behaves like traditional cron monitoring: it expects pings on time.

If a tracker has no schedule, it behaves like run history for ad hoc work: it records every ping, stores the payload, and can still apply payload assertions, but it does not mark the tracker as failing because nothing happened today.

That distinction matters. A missed nightly report is a problem. A quiet queue might be perfectly fine.
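
The distinction can be sketched as a single function. This is a simplified model, not TelemHQ's implementation: `trackerState` is a hypothetical name, the state labels are made up for illustration, and `expectedAt` stands in for the due time a cron expression would produce (cron parsing is out of scope here).

```javascript
// All times are epoch milliseconds.
function trackerState(tracker, now) {
  const { expectedAt = null, graceMs = 0, lastPingAt = null } = tracker;
  // No schedule: this is ad hoc work, so silence is never a failure.
  if (expectedAt === null) return "ok";
  // Scheduled, and a ping arrived for the current expected run.
  if (lastPingAt !== null && lastPingAt >= expectedAt) return "ok";
  // Scheduled with no ping yet: fail only after the grace period has passed.
  return now > expectedAt + graceMs ? "failing" : "ok";
}

const MINUTE = 60 * 1000;
const dueAt = Date.UTC(2026, 4, 11, 2, 0, 0); // nightly job due at 02:00 UTC

const nightlyReport = { expectedAt: dueAt, graceMs: 10 * MINUTE, lastPingAt: null };
const queueWorker = { lastPingAt: null }; // no schedule

console.log(trackerState(nightlyReport, dueAt + 15 * MINUTE)); // "failing"
console.log(trackerState(queueWorker, dueAt + 15 * MINUTE)); // "ok"
```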

Why This Became TelemHQ

This started with cron jobs, but the same problem shows up anywhere work runs in the background.

For classic cron jobs, I want to know:

Did it run?
Was it late?
How long did it take?
How much work did it do?
Did anything fail?

For AI jobs, I also want to know:

What provider and model ran?
How many tokens did it use?
What did it cost?
Did it process the expected number of items?
Did the payload pass quality or budget checks?
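
As a rough sketch, an AI job can assemble those answers into a payload before it pings. Everything here is illustrative: `buildAiPayload` is a hypothetical helper, the provider and model names are placeholders, and the per-million-token rates are made-up numbers supplied by the caller, not real prices.

```javascript
// Build an AI-job payload from run counters and caller-supplied token rates.
function buildAiPayload(run, rates) {
  const totalTokens = run.inputTokens + run.outputTokens;
  const costUsd =
    (run.inputTokens * rates.inputUsdPerMillion +
      run.outputTokens * rates.outputUsdPerMillion) /
    1e6;
  return {
    provider: run.provider,
    model: run.model,
    status: run.status,
    input_tokens: run.inputTokens,
    output_tokens: run.outputTokens,
    total_tokens: totalTokens,
    latency_ms: run.latencyMs,
    // Round to six decimals to avoid floating-point noise in the payload.
    cost_usd: Math.round(costUsd * 1e6) / 1e6,
    items_processed: run.itemsProcessed,
    items_failed: run.itemsFailed,
  };
}

const aiPayload = buildAiPayload(
  {
    provider: "example-provider", // placeholder
    model: "example-model", // placeholder
    status: "success",
    inputTokens: 120000,
    outputTokens: 30000,
    latencyMs: 4200,
    itemsProcessed: 40,
    itemsFailed: 0,
  },
  { inputUsdPerMillion: 1, outputUsdPerMillion: 2 } // made-up rates
);

console.log(aiPayload.total_tokens, aiPayload.cost_usd);
```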

And for teams, I want the same pattern to roll up across projects, teammates, branches, workers, and model providers.

The integration does not need to be complicated. No SDK is required. Any script that can make an HTTP request can send a useful ping.

A Small Upgrade To One Cron Job

If you have a cron job today, the smallest upgrade is this:

Add a tracker for the job.
Paste the ping URL into the script, or store it in an environment variable the script reads.
Send a POST at the end of the run.
Add one or two payload fields that prove the job did useful work.

For example:

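// Run this at the end of the job. durationMs, recordsProcessed, and
// recordsFailed are values the job itself measured during the run.
// Top-level await needs an ES module; otherwise wrap this in an async function.
await fetch(process.env.TELEMHQ_PING_URL, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    status: "success",
    duration_ms: durationMs,
    records_processed: recordsProcessed,
    records_failed: recordsFailed
  })
});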

That is enough to start building a real history: not only that the job ran, but what it did.
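
One thing worth deciding early is what happens when the ping itself fails. The sketch below assumes you do not want a monitoring outage to break the job: it adds a timeout and turns any error into a `false` return value. `sendPing` and its options are hypothetical names; `fetchImpl` exists only so the function can be exercised without a network call. It assumes Node 18 or later for the global `fetch` and `AbortSignal.timeout`.

```javascript
// Send a ping with a timeout; never throw, so the job's own result is unaffected.
async function sendPing(url, payload, { fetchImpl = fetch, timeoutMs = 5000 } = {}) {
  try {
    const response = await fetchImpl(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
      signal: AbortSignal.timeout(timeoutMs), // abort if the request hangs
    });
    return response.ok; // true for 2xx responses
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    console.error(`ping failed: ${message}`);
    return false;
  }
}

// Example call from inside an async function at the end of a job:
//   const delivered = await sendPing(process.env.TELEMHQ_PING_URL, {
//     status: "success",
//     duration_ms: durationMs,
//   });
```

Whether a failed ping should be silent is a judgment call. For a scheduled tracker, a missing ping should still surface as a missed check-in, so swallowing the error in the job is usually safe.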

I am building TelemHQ for exactly this kind of work: scheduled and traditional cron jobs that need more than "it ran"; ad hoc AI workers and agents; AI coding tools like Codex and Claude Code; model usage across OpenAI, Anthropic, Gemini, Llama, Qwen, DeepSeek, Kimi, and GLM; and RAG syncs and evals.