Telephony — Give Hermes phone capabilities without core tool changes

Give Hermes phone capabilities without core tool changes. Provision and persist a Twilio number, send and receive SMS/MMS, make direct calls, and place AI-driven outbound calls through Bland.ai or Vapi.

Source: Optional. Install with hermes skills install official/productivity/telephony
Path: optional-skills/productivity/telephony
Version: 1.0.0
Author: Nous Research
License: MIT
Platforms: linux, macos, windows
Tags: telephony, phone, sms, mms, voice, twilio, bland.ai, vapi, calling, texting
Related skills: maps, google-workspace, agentmail

The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.

Telephony — Numbers, Calls, and Texts without Core Tool Changes

This optional skill gives Hermes practical phone capabilities while keeping telephony out of the core tool list.

It ships with a helper script, scripts/telephony.py, that can:

  • save provider credentials into ~/.hermes/.env
  • search for and buy a Twilio phone number
  • remember that owned number for later sessions
  • send SMS / MMS from the owned number
  • poll inbound SMS for that number with no webhook server required
  • make direct Twilio calls using TwiML <Say> or <Play>
  • import the owned Twilio number into Vapi
  • place outbound AI calls through Bland.ai or Vapi

This skill is meant to cover the practical phone tasks users actually want:

  • outbound calls
  • texting
  • owning a reusable agent number
  • checking later for messages that arrive at that number
  • preserving that number and related IDs between sessions
  • keeping a stable telephony identity for inbound SMS polling and future automations

It does not turn Hermes into a real-time inbound phone gateway. Inbound SMS is handled by polling the Twilio REST API. That is enough for many workflows, including notifications and some one-time-code retrieval, without adding core webhook infrastructure.
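As a rough illustration of what polling involves, the sketch below builds the kind of authenticated GET request that lists messages sent to a number through Twilio's Messages resource. It is not the code in scripts/telephony.py: the function name build_inbox_request is hypothetical, and the sketch only constructs the request without sending it.

```python
import base64
import urllib.parse
import urllib.request

TWILIO_API_BASE = "https://api.twilio.com/2010-04-01"


def build_inbox_request(account_sid: str, auth_token: str,
                        to_number: str, limit: int = 20) -> urllib.request.Request:
    """Build (but do not send) a GET request listing messages sent to `to_number`.

    Hypothetical helper for illustration; the skill's own script may differ.
    """
    # `To` filters on the receiving number, `PageSize` caps the page length.
    query = urllib.parse.urlencode({"To": to_number, "PageSize": limit})
    url = f"{TWILIO_API_BASE}/Accounts/{account_sid}/Messages.json?{query}"

    # Twilio uses HTTP Basic auth with the account SID and auth token.
    credentials = f"{account_sid}:{auth_token}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        url, headers={"Authorization": f"Basic {token}"}, method="GET"
    )
```

Passing the returned request to urllib.request.urlopen would perform the actual poll. Because nothing is pushed to the agent, messages are only seen when a poll runs.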

Safety rules:

  1. Always confirm before placing a call or sending a text.
  2. Never dial emergency numbers.
  3. Never use telephony for harassment, spam, impersonation, or anything illegal.
  4. Treat third-party phone numbers as sensitive operational data:
    • do not save them to Hermes memory
    • do not include them in skill docs, summaries, or follow-up notes unless the user explicitly wants that
  5. It is fine to persist the agent-owned Twilio number because that is part of the user’s configuration.
  6. VoIP numbers are not guaranteed to work for all third-party 2FA flows. Use with caution and set user expectations clearly.
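The first two rules can be expressed as a small pre-flight guard. This is a minimal sketch under stated assumptions: the function names are hypothetical, the emergency-number set is illustrative and deliberately incomplete, and the helper script may enforce these rules differently or not at all.

```python
import re

# Illustrative, non-exhaustive set; extend it for the regions you operate in.
EMERGENCY_NUMBERS = {"911", "112", "999", "000"}


def normalize(number: str) -> str:
    """Keep only digits so that '9-1-1' and '911' compare equal."""
    return re.sub(r"\D", "", number)


def check_outbound_allowed(number: str, user_confirmed: bool) -> None:
    """Raise if the call or text must not proceed; return None otherwise."""
    if normalize(number) in EMERGENCY_NUMBERS:
        raise ValueError("refusing to contact an emergency number")
    if not user_confirmed:
        raise PermissionError("user confirmation is required before dialing or texting")
```

A caller would run check_outbound_allowed(number, user_confirmed) immediately before invoking any call or SMS command, and stop on either exception.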

Use this logic instead of hardcoded provider routing:

1) “I want Hermes to own a real phone number”

Use Twilio.

Why:

  • easiest path to buying and keeping a number
  • best SMS / MMS support
  • simplest inbound SMS polling story
  • cleanest future path to inbound webhooks or call handling

Use cases:

  • receive texts later
  • send deployment alerts / cron notifications
  • maintain a reusable phone identity for the agent
  • experiment with phone-based auth flows later

2) “I only need the easiest outbound AI phone call right now”

Use Bland.ai.

Why:

  • quickest setup
  • one API key
  • no need to first buy/import a number yourself

Tradeoff:

  • less flexible
  • voice quality is decent, but not the best

3) “I want the best conversational AI voice quality”

Use Twilio + Vapi.

Why:

  • Twilio gives you the owned number
  • Vapi gives you better conversational AI call quality and more voice/model flexibility

Recommended flow:

  1. Buy/save a Twilio number
  2. Import it into Vapi
  3. Save the returned VAPI_PHONE_NUMBER_ID
  4. Use ai-call --provider vapi

4) “I want to call with a custom prerecorded voice message”

Use Twilio direct call with a public audio URL.

Why:

  • easiest way to play a custom MP3
  • pairs well with Hermes text_to_speech plus a public file host or tunnel
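The four cases above can be summarized as a small routing function. This is only a sketch: the CallNeeds fields, the choose_path name, and the order in which the cases are checked are assumptions made for illustration, not logic taken from the skill.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallNeeds:
    """Hypothetical summary of what the user asked for."""
    prerecorded_audio: bool = False   # case 4: play a custom MP3
    wants_ai_call: bool = False       # cases 2 and 3: conversational AI call
    wants_best_voice: bool = False    # case 3: prioritize voice quality
    needs_owned_number: bool = False  # case 1: keep a reusable number


def choose_path(needs: CallNeeds) -> str:
    """Map a request onto one of the four paths described above."""
    if needs.prerecorded_audio:
        return "twilio direct call with public audio URL"
    if needs.wants_ai_call and (needs.wants_best_voice or needs.needs_owned_number):
        return "twilio + vapi"
    if needs.wants_ai_call:
        return "bland"
    return "twilio"
```

For example, choose_path(CallNeeds(wants_ai_call=True)) selects Bland.ai, while adding wants_best_voice=True selects Twilio + Vapi.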

The skill persists telephony state in two places.

~/.hermes/.env is used for long-lived provider credentials and owned-number IDs, for example:

  • TWILIO_ACCOUNT_SID
  • TWILIO_AUTH_TOKEN
  • TWILIO_PHONE_NUMBER
  • TWILIO_PHONE_NUMBER_SID
  • BLAND_API_KEY
  • VAPI_API_KEY
  • VAPI_PHONE_NUMBER_ID
  • PHONE_PROVIDER (AI call provider: bland or vapi)
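To make the role of these variables concrete, here is a minimal sketch that parses KEY=VALUE text and reports which providers look configured. The parsing rules and the choice of which keys count as "required" per provider are assumptions for illustration; the diagnose command may apply different checks.

```python
# Assumed minimum set of variables per provider (an illustration, not the
# skill's actual readiness rules).
REQUIRED = {
    "twilio": ("TWILIO_ACCOUNT_SID", "TWILIO_AUTH_TOKEN", "TWILIO_PHONE_NUMBER"),
    "bland": ("BLAND_API_KEY",),
    "vapi": ("VAPI_API_KEY", "VAPI_PHONE_NUMBER_ID"),
}


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def provider_readiness(env: dict) -> dict:
    """Return {provider: True/False} based on non-empty required variables."""
    return {
        name: all(env.get(key) for key in keys)
        for name, keys in REQUIRED.items()
    }
```

Reading the file with pathlib.Path.home() / ".hermes" / ".env" and passing its text to parse_env would give the input for provider_readiness.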

The skill's own state file is used for skill-only state that should survive across sessions, for example:

  • remembered default Twilio number / SID
  • remembered Vapi phone number ID
  • last inbound message SID/date for inbox polling checkpoints

This means:

  • the next time the skill is loaded, diagnose can tell you what number is already configured
  • twilio-inbox --since-last --mark-seen can continue from the previous checkpoint
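The checkpoint idea can be sketched as follows. This assumes messages arrive newest first and that the checkpoint is the SID of the newest message already seen; the state key last_inbound_sid and both function names are hypothetical, and the skill's real state layout is not shown here.

```python
def new_since_checkpoint(messages: list, last_sid) -> list:
    """Return messages newer than the checkpoint.

    `messages` is assumed to be ordered newest first, each a dict with a "sid".
    If `last_sid` is None or not found, every message counts as new.
    """
    fresh = []
    for message in messages:
        if message["sid"] == last_sid:
            break
        fresh.append(message)
    return fresh


def advance_checkpoint(state: dict, messages: list) -> dict:
    """Return a copy of `state` whose checkpoint points at the newest message."""
    updated = dict(state)
    if messages:
        updated["last_inbound_sid"] = messages[0]["sid"]
    return updated
```

In this model, --since-last corresponds to new_since_checkpoint and --mark-seen to advance_checkpoint followed by writing the state back to disk.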

After installing this skill, locate the script like this:

SCRIPT="$(find ~/.hermes/skills -path '*/telephony/scripts/telephony.py' -print -quit)"

If SCRIPT is empty, the skill is not installed yet.

This is an official optional skill, so install it from the Skills Hub:

hermes skills search telephony
hermes skills install official/productivity/telephony

Twilio — owned number, SMS/MMS, direct calls, inbound SMS polling

Sign up for a Twilio account.

Then save credentials into Hermes:

python3 "$SCRIPT" save-twilio ACXXXXXXXXXXXXXXXXXXXXXXXXXXXX your_auth_token_here

Search for available numbers:

python3 "$SCRIPT" twilio-search --country US --area-code 702 --limit 5

Buy and remember a number:

python3 "$SCRIPT" twilio-buy "+17025551234" --save-env

List owned numbers:

python3 "$SCRIPT" twilio-owned

Set one of them as the default later:

python3 "$SCRIPT" twilio-set-default "+17025551234" --save-env
# or
python3 "$SCRIPT" twilio-set-default PNXXXXXXXXXXXXXXXXXXXXXXXXXXXX --save-env

Bland.ai

Sign up for a Bland.ai account.

Save config:

python3 "$SCRIPT" save-bland your_bland_api_key --voice mason

Vapi — better conversational voice quality

Sign up for a Vapi account.

Save the API key first:

python3 "$SCRIPT" save-vapi your_vapi_api_key

Import your owned Twilio number into Vapi and persist the returned phone number ID:

python3 "$SCRIPT" vapi-import-twilio --save-env

If you already know the Vapi phone number ID, save it directly:

python3 "$SCRIPT" save-vapi your_vapi_api_key --phone-number-id vapi_phone_number_id_here

At any time, inspect what the skill already knows:

python3 "$SCRIPT" diagnose

Use this first when resuming work in a later session.

A. Buy an agent number and keep using it later

  1. Save Twilio credentials:
python3 "$SCRIPT" save-twilio AC... auth_token_here
  2. Search for a number:
python3 "$SCRIPT" twilio-search --country US --area-code 702 --limit 10
  3. Buy it and save it into ~/.hermes/.env and the skill state:
python3 "$SCRIPT" twilio-buy "+17025551234" --save-env
  4. Next session, run:
python3 "$SCRIPT" diagnose

This shows the remembered default number and inbox checkpoint state.

B. Send an SMS or MMS from the owned number

python3 "$SCRIPT" twilio-send-sms "+15551230000" "Your deployment completed successfully."

With media:

python3 "$SCRIPT" twilio-send-sms "+15551230000" "Here is the chart." --media-url "https://example.com/chart.png"
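For orientation, sending a message through Twilio's REST API comes down to a form-encoded POST with To, From, Body, and optional MediaUrl fields. The sketch below only builds that form body; build_message_form is a hypothetical name, and this is not presented as what twilio-send-sms does internally.

```python
import urllib.parse


def build_message_form(to: str, from_: str, body: str, media_urls=()) -> str:
    """Return the URL-encoded form body for a Twilio message.

    Hypothetical helper: `from_` would be the owned Twilio number, and each
    entry in `media_urls` becomes one MediaUrl field (which makes it an MMS).
    """
    fields = [("To", to), ("From", from_), ("Body", body)]
    fields.extend(("MediaUrl", url) for url in media_urls)
    return urllib.parse.urlencode(fields)
```

The media URL has to be publicly reachable, since Twilio fetches it when delivering the MMS.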

C. Check inbound texts later with no webhook server

Poll the inbox for the default Twilio number:

python3 "$SCRIPT" twilio-inbox --limit 20

Only show messages that arrived after the last checkpoint, and advance the checkpoint when you’re done reading:

python3 "$SCRIPT" twilio-inbox --since-last --mark-seen

This is the main answer to “how do I access messages the number receives next time the skill is loaded?”

D. Make a direct Twilio call with built-in TTS

python3 "$SCRIPT" twilio-call "+15551230000" --message "Hello! This is Hermes calling with your status update." --voice Polly.Joanna
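A direct call like this is driven by a short TwiML document. As a minimal sketch of what that document could look like, the hypothetical build_twiml helper below produces either a <Say> response (built-in TTS) or a <Play> response (hosted audio). The helper script may generate its TwiML differently.

```python
import xml.etree.ElementTree as ET


def build_twiml(message=None, audio_url=None, voice=None) -> str:
    """Return TwiML that either speaks `message` or plays `audio_url`.

    Hypothetical helper for illustration. Exactly one of `message` or
    `audio_url` must be given; `voice` only applies to <Say>.
    """
    if bool(message) == bool(audio_url):
        raise ValueError("provide exactly one of message or audio_url")

    response = ET.Element("Response")
    if message:
        say = ET.SubElement(response, "Say")
        if voice:
            say.set("voice", voice)
        say.text = message  # ElementTree escapes &, <, > on serialization
    else:
        ET.SubElement(response, "Play").text = audio_url
    return ET.tostring(response, encoding="unicode")
```

Building the XML with ElementTree rather than string formatting keeps user-supplied message text from breaking the markup.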

E. Call with a prerecorded / custom voice message

This is the main path for reusing Hermes’s existing text_to_speech support.

Use this when:

  • you want the call to use Hermes’s configured TTS voice rather than Twilio <Say>
  • you want a one-way voice delivery (briefing, alert, joke, reminder, status update)
  • you do not need a live conversational phone call

Generate or host audio separately, then:

python3 "$SCRIPT" twilio-call "+15551230000" --audio-url "https://example.com/briefing.mp3"

Recommended Hermes TTS -> Twilio Play workflow:

  1. Generate the audio with Hermes text_to_speech.
  2. Make the resulting MP3 publicly reachable.
  3. Place the Twilio call with --audio-url.

Example agent flow:

  • Ask Hermes to create the message audio with text_to_speech
  • If needed, expose the file with a temporary static host / tunnel / object storage URL
  • Use twilio-call --audio-url ... to deliver it by phone

Good hosting options for the MP3:

  • a temporary public object/storage URL
  • a short-lived tunnel to a local static file server
  • any existing HTTPS URL the phone provider can fetch directly

Important note:

  • Hermes TTS is great for prerecorded outbound messages
  • Bland/Vapi are better for live conversational AI calls because they handle the real-time telephony audio stack themselves
  • Hermes STT/TTS alone is not being used here as a full duplex phone conversation engine; that would require a much heavier streaming/webhook integration than this skill is trying to introduce

F. Navigate a phone tree / IVR with Twilio direct calling

If you need to press digits after the call connects, use --send-digits. Twilio interprets each w as a half-second pause, so ww waits about one second before the next key.

python3 "$SCRIPT" twilio-call "+18005551234" --message "Connecting to billing now." --send-digits "ww1w2w3"

This is useful for reaching a specific menu branch before handing off to a human or delivering a short status message.
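To reason about timing in a digit string, the following sketch builds and checks sequences like the one above. It assumes only digits, #, *, and lowercase w (a half-second pause each) are used; both function names are hypothetical and not part of the helper script.

```python
import re

# Characters assumed valid here: keypad digits, '#', '*', and 'w' pauses.
_VALID_DIGITS = re.compile(r"[0-9#*w]+")


def build_digits(keys, lead_pauses: int = 2, gap_pauses: int = 1) -> str:
    """Join keypad presses with 'w' pauses, e.g. ['1','2','3'] -> 'ww1w2w3'."""
    return "w" * lead_pauses + ("w" * gap_pauses).join(keys)


def total_pause_seconds(digits: str) -> float:
    """Return the total pause time encoded by the 'w' characters."""
    if not _VALID_DIGITS.fullmatch(digits):
        raise ValueError(f"unsupported character in digit string: {digits!r}")
    return digits.count("w") * 0.5
```

Slow phone trees often need more pauses between keys; raising gap_pauses is a simple way to experiment with that.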

G. Outbound AI phone call with Bland.ai

python3 "$SCRIPT" ai-call "+15551230000" "Call the dental office, ask for a cleaning appointment on Tuesday afternoon, and if they do not have Tuesday availability, ask for Wednesday or Thursday instead." --provider bland --voice mason --max-duration 3

Check status:

python3 "$SCRIPT" ai-status <call_id> --provider bland

Ask Bland analysis questions after completion:

python3 "$SCRIPT" ai-status <call_id> --provider bland --analyze "Was the appointment confirmed?,What date and time?,Any special instructions?"

H. Outbound AI phone call with Vapi on your owned number

  1. Import your Twilio number into Vapi:
python3 "$SCRIPT" vapi-import-twilio --save-env
  2. Place the call:
python3 "$SCRIPT" ai-call "+15551230000" "You are calling to make a dinner reservation for two at 7:30 PM. If that is unavailable, ask for the nearest time between 6:30 and 8:30 PM." --provider vapi --max-duration 4
  3. Check result:
python3 "$SCRIPT" ai-status <call_id> --provider vapi
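Because AI calls take time to finish, checking the result usually means polling until the call reaches a final state. The sketch below shows one generic way to do that. The status names in TERMINAL_STATUSES are placeholders rather than values confirmed for Bland.ai or Vapi, and fetch_status stands in for whatever wraps the ai-status command.

```python
import time

# Placeholder status names; check what the provider actually reports.
TERMINAL_STATUSES = {"completed", "failed", "no-answer", "busy", "canceled"}


def wait_for_call(fetch_status, interval: float = 5.0, timeout: float = 300.0,
                  sleep=time.sleep, clock=time.monotonic) -> str:
    """Poll `fetch_status()` until it returns a terminal status or time runs out.

    `fetch_status` is a hypothetical zero-argument callable returning a status
    string. `sleep` and `clock` are injectable so the loop can be tested.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"call still {status!r} after {timeout} seconds")
        sleep(interval)
```

Keeping the timeout slightly above the --max-duration value passed to ai-call avoids giving up while the call is still in progress.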

When the user asks for a call or text:

  1. Determine which path fits the request via the decision tree.
  2. Run diagnose if configuration state is unclear.
  3. Gather the full task details.
  4. Confirm with the user before dialing or texting.
  5. Use the correct command.
  6. Poll for results if needed.
  7. Summarize the outcome without persisting third-party numbers to Hermes memory.

This skill does not provide:

  • real-time inbound call answering
  • webhook-based live SMS push into the agent loop
  • guaranteed support for arbitrary third-party 2FA providers

Those would require more infrastructure than a pure optional skill.

Pitfalls to keep in mind:

  • Twilio trial accounts and regional rules can restrict who you can call/text.
  • Some services reject VoIP numbers for 2FA.
  • twilio-inbox polls the REST API; it is not instant push delivery.
  • Vapi outbound calling still depends on having a valid imported number.
  • Bland is easiest, but not always the best-sounding.
  • Do not store arbitrary third-party phone numbers in Hermes memory.

After setup, you should be able to do all of the following with just this skill:

  1. run diagnose to see provider readiness and remembered state
  2. search and buy a Twilio number
  3. persist that number to ~/.hermes/.env
  4. send an SMS from the owned number
  5. poll inbound texts for the owned number later
  6. place a direct Twilio call
  7. place an AI call via Bland or Vapi