Eslam HelmyEslam Helmy
7 min readEslam

Build Your Own Dev Agent — Lesson 8: Heartbeat + Remote Access

Who watches the watcher? If a cron job silently expires, if a state file gets corrupted, if tasks go stale -- who catches it? The heartbeat skill. It runs every 2 hours, checks everything, and fixes what it can.


Where You Are

your-project/
  CLAUDE.md
  .claude/
    preferences.md
    tasks-active.md
    tasks-completed.md
    progress.txt
    error-log.md
    learnings.md
    priority-map.md
    cron-jobs.json                   # 5 jobs
    failed-jobs.log
    settings.json
    hooks/
      stop-telegram.sh
      permission-gate.sh
    skills/
      daily-planner/
        SKILL.md
      pr-reviewer/
        SKILL.md
      git-reviewer/
        SKILL.md
      standup-generator/
        SKILL.md
      meeting-ingest/
        SKILL.md

Part 1: The Heartbeat

The heartbeat is the safety net for the entire system.

Heartbeat self-healing checks

What It Checks

  • Cron health — are all jobs alive? Expired jobs (7-day limit) get recreated via CronCreate.
  • Progress freshness — when was the last entry in progress.txt? Flags if stale >24h.
  • Task deadlines — any overdue P0/P1 tasks in tasks-active.md?
  • Failed jobs — unresolved entries in failed-jobs.log?
  • State validity — can all JSON/markdown files be parsed?
  • Skill files — do all referenced SKILL.md files exist?

What It Can Fix Autonomously

  • Recreate expired cron jobs (expired jobs delete themselves — heartbeat recreates from cron-jobs.json)
  • Retry failed jobs from failed-jobs.log (max 1 retry per cycle)
  • Re-read critical files after context compaction

What It Cannot Do (needs approval)

Delete files, modify skill logic, push code.


Build It: Heartbeat Skill

Prompt for Claude Code:

Create a heartbeat skill at .claude/skills/heartbeat/SKILL.md.

The skill should:

INPUT:
- Read ALL state files: cron-jobs.json, progress.txt, tasks-active.md,
  failed-jobs.log, learnings.md, error-log.md, priority-map.md, preferences.md
- Also verify all skill SKILL.md files exist

PROCESS:
Run 7 checks in order:
1. Cron health: compare CronList output vs cron-jobs.json entries.
   Recreate any missing jobs via CronCreate.
2. Progress freshness: flag if last entry >24h (MEDIUM) or >48h (HIGH)
3. Task deadlines: flag overdue P0/P1 as HIGH, P2 as MEDIUM
4. Failed jobs: flag if >0 unresolved (MEDIUM), >5 (HIGH)
5. State validity: verify JSON parses, markdown files non-empty
6. Skill files: verify all referenced SKILL.md files exist
7. Learnings: note if not updated in 7+ days

OUTPUT:
- Health report at .claude/reports/heartbeat-[date]-[time].md
  with status (HEALTHY/DEGRADED/CRITICAL), check results, actions taken
- Append summary line to progress.txt

STATE UPDATE:
- Recreate any missing crons
- If CRITICAL: send Telegram notification
- If DEGRADED 3 consecutive times: send Telegram notification

Expected output: A heartbeat skill file with self-healing capabilities.


Build It: Schedule the Heartbeat

Prompt for Claude Code:

Add heartbeat to cron-jobs.json. Schedule: every 2 hours.
The file should now have 6 jobs total.

Expected output: cron-jobs.json with six entries.


Build It: Test It

Prompt for Claude Code:

Run the heartbeat skill. Read .claude/skills/heartbeat/SKILL.md and
follow its instructions. Generate the health report.

Expected output: A health report showing current system status.


Part 2: Access Your Agent from Anywhere

Your agent runs on your machine. But you're not always at your machine. This section sets up remote access — and it's not just for the agent. Once you have tmux + Tailscale, you can reach any project, any terminal session, any running process on your machine from your phone. The agent is one use case. Your entire development environment becomes portable.

The Stack

  • tmux — Keeps terminal sessions alive after disconnect. Your agent survives laptop close, SSH drops, even reboots (with a startup service). But so does any long-running process — builds, migrations, monitoring scripts.
  • Tailscale — Mesh VPN that makes your devices find each other anywhere. SSH from your phone to your laptop without port forwarding, static IPs, or VPN servers. Works through NAT, hotel WiFi, and mobile networks.
  • Termius — Mobile SSH client (iOS/Android). Clean interface, supports SSH keys, multiple hosts. Connect and you're in your terminal.

Step 1: tmux — Persistent Sessions

brew install tmux                    # macOS
# or: sudo apt install tmux          # Linux
 
tmux new -s agent                    # create a named session
# You're now inside a tmux session. Start Claude Code here:
claude
 
# To detach (leave it running): press Ctrl+B, then D
# To reattach from anywhere:
tmux attach -t agent

You can have multiple sessions — one for the agent, one for a build, one for logs:

tmux new -s agent        # agent session
tmux new -s builds       # separate session for builds
tmux ls                  # list all sessions

Step 2: Tailscale — Reach Your Machine from Anywhere

brew install tailscale               # macOS
# or: curl -fsSL https://tailscale.com/install.sh | sh   # Linux
 
sudo tailscale up                    # authenticate via browser
tailscale ip -4                      # shows your Tailscale IP (e.g., 100.x.y.z)

Once authenticated, any device on your Tailscale network can SSH to this machine using the Tailscale IP.

Step 3: Termius — SSH from Your Phone

  1. Download Termius from App Store or Play Store
  2. Add a new host:
    • Hostname: your Tailscale IP
    • Username: your macOS/Linux username
    • Auth: SSH key (recommended) or password
  3. Connect
  4. Run: tmux attach -t agent

The Full Flow

Your Phone (Termius)
  → Tailscale (encrypted tunnel)
    → Your Machine (SSH)
      → tmux session "agent"
        → Claude Code (running, context intact)
          → 6 skills on cron
          → Telegram notifications to your phone
          → Heartbeat self-healing every 2h

This same flow works for any project. The agent is just the most useful thing to keep running.


Part 3: What You Built

Across 8 lessons you created ~20 files — CLAUDE.md, state files, hooks, 6 skills, cron scheduling, failure handling, and a self-healing heartbeat. A complete autonomous agent running on your machine.

Adding New Skills

Tell Claude Code what you need:

Create a new skill called "deployment-monitor" that watches production
deploys and alerts on error rate increases. It should run every 30 minutes.
Follow the same SKILL.md pattern as existing skills. Add it to
cron-jobs.json and register it with the heartbeat.

Deployment Phases

  • Phase 1 — Prototype (1-3 days): Run skills manually, fix issues interactively.
  • Phase 2 — Reliable (1-2 weeks): Enable cron schedules, run the heartbeat, tune thresholds.
  • Phase 3 — Production (ongoing): Persistent session with heartbeat and Telegram notifications.

The Compound Effect

Each correction is recorded and read at startup. The effect compounds — week 1 you correct 10 times, month 2 you correct 2-3 times.

Final Checklist

  • [ ] CLAUDE.md has complete session startup instructions
  • [ ] All 6 skills have SKILL.md files with Input, Process, Output, State Update
  • [ ] cron-jobs.json has entries for all scheduled skills and heartbeat is enabled
  • [ ] Telegram notifications and permission gate hooks work
  • [ ] All state files are committed to git (except secrets)

Fork It

This is not a product. It is a pattern. Fork it for your workflow:

  • Start small. Pick the 2 skills that save you the most time. Disable the rest.
  • Tune aggressively. The default thresholds are starting points. Adjust them after the first week.
  • Share skill files. Skills are portable. A teammate can copy your pr-reviewer/SKILL.md and it works.
  • Version everything. Commit .claude/ to git. Your agent's configuration is part of your codebase.

You have the architecture. You have the patterns. Now make it yours.


This is part of the Build Your Own Dev Agent course. ← Previous Lesson | Next Lesson →

Share this post