audio-inbox

Turn a raw Whisper transcript into a clean, concise inbox note with provenance preserved.

When to use

  • A new .md file appears in 0_Inbox/transcripts/.
  • The user asks to “clean up this voice memo” or “process this transcript”.
  • The user runs /audio-inbox explicitly.

Workflow

  1. Read the raw transcript from 0_Inbox/transcripts/<filename>.md.

  2. Detect the language: de (German), en (English), ja (Japanese), or mixed. If mixed, use the dominant language for the concise version and keep the original mix of languages in the callout.

  3. Produce three outputs (same language as source):

    • Cleaned transcript — remove filler words (“um”, “uh”, “like”, “えーと”, “あの”), fix transcription/spelling mistakes. Preserve sentence structure and voice.
    • Concise version — what the note would read like if written by hand instead of rambled. First-person. No third-person summarization. No “Summary:” or “Overview:” prefixes.
    • Filename — no extension, no type words (“summary”, “note”, “transcript”). Content-bearing. Same language as source.
  4. Write the note to 0_Inbox/<new-filename>.md, using the filename produced in step 3 (not in transcripts/, which is for raw input only). Structure:

    ---
    created: 2026-04-17
    lang: en
    tags: [voice-memo]
    source: voice-memo
    audio-file: <path if known, else omit>
    ---
     
    <concise version as the body>
     
    > [!note]- Full transcript
    > <cleaned transcript, preserving paragraph breaks>
  5. Delete the raw transcript from 0_Inbox/transcripts/. Provenance lives in the callout and in git history.
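
The note structure from step 4 can be sketched in Python as follows. This is a minimal illustration, not the skill's actual implementation: the function name render_note and its parameters are hypothetical, and it assumes the cleaned transcript separates paragraphs with blank lines.

```python
from datetime import date
from typing import Optional


def render_note(concise: str, cleaned: str, lang: str,
                created: date, audio_file: Optional[str] = None) -> str:
    """Assemble frontmatter, concise body, and a folded transcript callout."""
    frontmatter = [
        "---",
        f"created: {created.isoformat()}",
        f"lang: {lang}",
        "tags: [voice-memo]",
        "source: voice-memo",
    ]
    if audio_file:  # omit the key entirely when the path is unknown
        frontmatter.append(f"audio-file: {audio_file}")
    frontmatter.append("---")

    # Every callout line needs a ">" prefix; blank lines become a bare ">"
    # so that paragraph breaks stay inside the callout.
    quoted = [f"> {line}" if line.strip() else ">"
              for line in cleaned.strip().splitlines()]
    callout = ["> [!note]- Full transcript", *quoted]

    return "\n".join([*frontmatter, "", concise.strip(), "", *callout]) + "\n"
```

Prefixing blank lines with a bare ">" is what keeps a multi-paragraph transcript inside a single callout instead of ending it at the first paragraph break.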

Rules

  • Produce all three outputs in the source language. If the transcript is German, the concise version and the filename are German too.
  • First person throughout. “I went to the park” — not “The user went to the park” or “The speaker discussed”.
  • No “summary”, “overview”, or “notes on” in the filename. The content itself names the note.
  • Do not route. Write to 0_Inbox/ root. process-inbox handles PARA placement.
  • Do not touch other pages. This skill is narrow. Cross-references happen in process-inbox.
  • Preserve furigana syntax if Japanese content already has it (see obsidian-markdown).
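
The filename rules can be checked mechanically, at least for English. The sketch below is an assumption-laden illustration: filename_problems and TYPE_WORDS are hypothetical names, the word list only covers the English examples given in this document, and German or Japanese filenames would need their own lists.

```python
import re

# Type words named in this document; illustrative, not exhaustive.
TYPE_WORDS = ("summary", "overview", "notes on", "note", "transcript")


def filename_problems(stem: str) -> list[str]:
    """Return the rule violations found in a proposed note filename."""
    problems = []
    lowered = stem.lower()
    if lowered.endswith(".md"):
        problems.append("has an extension")
    for word in TYPE_WORDS:
        # Whole-word match, so "footnote" does not trip the "note" rule.
        if re.search(rf"\b{re.escape(word)}\b", lowered):
            problems.append(f"contains type word: {word}")
    return problems
```

An empty list means the filename passes these checks; whether it is content-bearing still needs a human or model judgment.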

Blocked cases

  • Transcript is empty or unintelligible: leave the raw transcript in place and create 0_Inbox/_unprocessable-<timestamp>.md with a one-line note, so the user can decide what to do with it.
  • Transcript references a specific page that you can’t identify — still produce the note; cross-referencing is process-inbox’s job.
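
A minimal sketch of the first blocked case, under stated assumptions: the helper name mark_unprocessable is hypothetical, and the timestamp format (YYYYMMDD-HHMMSS) is a guess, since this document does not specify one. Only the empty case is detectable this simply; judging a transcript unintelligible is left to the caller.

```python
from datetime import datetime
from pathlib import Path


def mark_unprocessable(inbox: Path, raw: Path, now: datetime) -> Path:
    """Write a one-line marker note and leave the raw transcript untouched."""
    stamp = now.strftime("%Y%m%d-%H%M%S")  # assumed timestamp format
    marker = inbox / f"_unprocessable-{stamp}.md"
    marker.write_text(
        f"Could not process {raw.name}: transcript is empty or unintelligible.\n",
        encoding="utf-8",
    )
    return marker


def is_empty(raw: Path) -> bool:
    """True when the transcript contains nothing but whitespace."""
    return not raw.read_text(encoding="utf-8").strip()
```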

Verification

After running:

  • 0_Inbox/<new-filename>.md exists with schema-conformant frontmatter and the callout.
  • 0_Inbox/transcripts/<original-filename>.md is deleted.
  • The body reads as written-by-hand prose, not a rambled monologue.
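
The first two checks can be automated; the third is a judgment about prose quality and cannot. The sketch below is illustrative only: verify_run and REQUIRED_KEYS are hypothetical names, and treating created, lang, tags, and source as the required keys is an assumption based on the template above (audio-file is optional there).

```python
from pathlib import Path

# Assumed required keys, taken from the note template; audio-file is optional.
REQUIRED_KEYS = ("created", "lang", "tags", "source")


def verify_run(note: Path, raw: Path) -> list[str]:
    """Return a list of failed checks; an empty list means the run looks good."""
    failures = []
    if not note.is_file():
        failures.append(f"missing note: {note.name}")
    else:
        text = note.read_text(encoding="utf-8")
        # A leading "---" line opens the frontmatter, the next one closes it.
        parts = text.split("---\n", 2)
        if len(parts) < 3 or parts[0] != "":
            failures.append("no frontmatter block")
        else:
            keys = {line.split(":", 1)[0].strip()
                    for line in parts[1].splitlines() if ":" in line}
            failures += [f"frontmatter missing: {k}"
                         for k in REQUIRED_KEYS if k not in keys]
        if "> [!note]- Full transcript" not in text:
            failures.append("transcript callout missing")
    if raw.exists():
        failures.append(f"raw transcript still present: {raw.name}")
    return failures
```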