Version: 1.0
Last Updated: 2026-01-23
Status: Implementation-ready (final)
ClipLoop is a simple web app that:
- Ingests one or more Google Sheets produced by ClipPulse (upstream product), including the linked video artifacts referenced in those sheets.
- Accepts human intent (goals/constraints) and then runs a ChatGPT-style dialogue.
- Iteratively proposes next hypotheses and success criteria worth validating, grounded in the ingested data and media.
- When a human selects a hypothesis, generates a short vertical "AI influencer talking-head" style video aligned to that hypothesis via an external, official video-generation API (default: HeyGen). (HeyGen API Documentation)
- Logs everything to Google Sheets + saves all generated videos to Google Drive so the workflow can be resumed later.
Enable continuous, hypothesis-driven video creation by combining:
- Structured performance data (Sheets from ClipPulse),
- Real media assets (videos referenced in those sheets),
- Human judgment (intent and selections),
- AI-assisted iterative validation (chat + hypothesis/success-criteria loops).
A single web app with:
- Sheet selection (multi-select),
- Chat-room UI as the main experience,
- AI proposing hypotheses + success criteria,
- Button-driven video generation from the chat,
- Full logging (Sheets) + video storage (Drive),
- Resume/restart from past sessions with correct context.
Out of scope (v1):
- Automated posting to Instagram/TikTok.
- Automated re-ingestion of new ClipPulse runs on a schedule (manual selection is sufficient).
- Deep video understanding (full frame-by-frame analysis); v1 uses transcripts/metadata/artifacts available via ClipPulse and Drive (details below).
Google Apps Script Web App (V8 runtime) as the primary platform, including:
- Backend logic (server-side Apps Script),
- Frontend served via Apps Script HTML Service. (Google for Developers)

Language: Google Apps Script (JavaScript, V8).

Core Google services:
- `SpreadsheetApp` (read ClipPulse sheets + write logs)
- `DriveApp` (store session artifacts + generated videos)
- `PropertiesService` (store secrets/config)
- `CacheService` (short-lived caches)
- `LockService` (concurrency safety)
- `UrlFetchApp` (OpenAI + HeyGen API calls)
Design constraints:
- Apps Script execution time limits apply (use async polling patterns + background tasks where needed). (Google for Developers)
- Apps Script HTML Service + vanilla HTML/CSS/JS (no build step required).
- Use `google.script.run` for client ↔ server RPC (see the sketch after this list). (Google for Developers)
- OpenAI for LLM reasoning (Responses API) and optional transcription. (OpenAI Platform)
- HeyGen (default) for avatar/talking-head video generation (official API). (HeyGen API Documentation)
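For illustration, a minimal sketch of the `google.script.run` round trip. The RPC function name matches the server API defined later in this spec; the UI helpers (`renderPendingAssistantMessage`, `showError`) are hypothetical:

```javascript
// Client side (inside app.js.html; runs in the browser via HTML Service).
function sendMessage(sessionId, text) {
  google.script.run
    .withSuccessHandler(function (res) {
      // res is the plain, serializable object returned by the server function
      renderPendingAssistantMessage(res.messageId); // hypothetical UI helper
    })
    .withFailureHandler(function (err) {
      showError(err.message); // hypothetical UI helper
    })
    .sendUserMessage({ sessionId: sessionId, text: text });
}

// Server side (SessionApi.gs) — body elided; see the RPC contract below.
function sendUserMessage(payload) {
  // ...validate, log the message, kick off the OpenAI background call...
  return { messageId: 'msg_1', openaiResponseId: 'resp_abc' };
}
```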
ClipLoop only guarantees compatibility with Google Sheets produced by ClipPulse.
Each ClipPulse spreadsheet contains two tabs, `Instagram` and `TikTok`. In each tab:
- Row 1 is the header row (column names).
- Each subsequent row is one post (one record).
ClipPulse stores a Drive artifact per post and writes its link into `drive_url`. That artifact is either:
- `video.mp4` (a downloaded file), or
- `watch.html` (a lightweight HTML file containing the original platform URL).

ClipLoop must treat `drive_url` as the canonical "real media asset pointer" and attempt to enrich context from it.
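As an illustration of that enrichment step, a sketch of inspecting the artifact behind `drive_url`. The file-ID regex is a common heuristic, not part of the ClipPulse contract, and the helper name is illustrative:

```javascript
// Resolve a drive_url into a typed artifact descriptor.
function resolveDriveArtifact_(driveUrl) {
  var match = driveUrl.match(/[-\w]{25,}/); // crude Drive file-ID extraction
  if (!match) return { kind: 'unknown' };
  var file = DriveApp.getFileById(match[0]);
  if (file.getMimeType() === 'video/mp4') {
    return { kind: 'video', file: file, sizeBytes: file.getSize() };
  }
  if (file.getName() === 'watch.html') {
    // The HTML body carries the original platform URL.
    return { kind: 'watch_page', html: file.getBlob().getDataAsString() };
  }
  return { kind: 'other', mimeType: file.getMimeType() };
}
```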
ClipLoop must parse all columns as raw strings, but it must specifically recognize and use the following columns for downstream logic.
TikTok tab (from ClipPulse schema):
| Column | Used for |
|---|---|
| `platform_post_id` | stable post identifier + dedup key |
| `username`, `create_username` | creator/account metadata |
| `create_time`, `posted_at` | chronology |
| `description` | caption/context |
| `voice_to_text` | transcript-like context (preferred) |
| `view`, `like`, `comments`, `share_count`, `collect_count` | performance metrics |
| `video_duration` | content constraints |
| `region_code` | segmenting patterns |
| `hashtag_names` | topic extraction |
| `drive_url` | link to media artifact in Drive |
Instagram tab (from ClipPulse schema):
| Column | Used for |
|---|---|
| `platform_post_id` | stable post identifier + dedup key |
| `account`, `create_username` | creator/account metadata |
| `timestamp`, `posted_at` | chronology |
| `caption` | caption/context |
| `like_count`, `comments_count` | performance metrics |
| `post_url` | reference URL |
| `media_type`, `media_product_type` | content constraints |
| `drive_url` | link to media artifact in Drive |
Note: Even if additional columns exist (ClipPulse has more), ClipLoop must store them for context but only requires the above for core logic.
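A sketch of the raw-string parsing rule above. `getDisplayValues()` returns every cell as the string the sheet displays, which matches "parse all columns as raw strings"; numeric parsing happens later:

```javascript
// Read one ClipPulse tab into raw-string records keyed by the header row.
function readTabAsRecords_(spreadsheet, tabName) {
  var sheet = spreadsheet.getSheetByName(tabName);
  if (!sheet) throw new Error('Missing required tab: ' + tabName);
  var values = sheet.getDataRange().getDisplayValues(); // all cells as strings
  var headers = values[0];
  return values.slice(1).map(function (row) {
    var record = {};
    headers.forEach(function (name, i) { record[name] = row[i]; });
    return record; // becomes raw_row on the ContentCard
  });
}
```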
At session creation time, ClipLoop must accept:

- Intent / Goal (free text; required)
- Optional constraints:
  - target platform(s) (IG, TikTok, both),
  - topic/domain,
  - tone/persona,
  - language,
  - posting cadence,
  - constraints about what not to do.

This intent becomes part of the session's "Context Pack" and is always available to the AI.
Because Apps Script is constrained, ClipLoop defines "AI can check videos" as follows. For each post row, ClipLoop tries to provide the AI with:

- Caption/description text,
- Transcript-like text:
  - prefer `voice_to_text` (TikTok), else
  - optional OpenAI transcription if the Drive artifact is a small `video.mp4` that can be sent to OpenAI within file-size constraints (see §9.2),
- The `drive_url` itself (as a human-auditable pointer),
- Key metrics.
Components:

- Reference Sheets selector
  - add by pasting Google Sheet URL(s) (multi-add),
  - display the selected sheets list with remove buttons.
- Intent / Goal text area
- Optional settings:
  - default platform focus (IG/TikTok/Both),
  - default video settings (length range, avatar/voice).
- Primary CTA: Start AI
Layout:

- Left/top: chat timeline (ChatGPT-style).
- Right/bottom side panel (or collapsible drawer):
  - "Current hypotheses" list (AI-generated candidates),
  - controls:
    - "Regenerate hypotheses"
    - "Select hypothesis"
    - "Create video from selected hypothesis"
- Below timeline: message composer + send button.
- Status area: "Loading context…", "AI thinking…", "Rendering video…", etc.
1. User selects reference sheet(s) + enters intent → clicks Start AI.
2. App builds a Context Pack (data summary + exemplars) and saves it.
3. App requests the AI to produce initial hypotheses/success criteria.
4. Chat becomes active once the initial AI output is ready.
5. User iterates with AI; hypotheses panel updates each turn (or on demand).
6. User selects a hypothesis → clicks Create video.
7. App calls:
   - OpenAI to produce a Video Brief aligned with the conversation + chosen hypothesis,
   - HeyGen to generate the video.
8. When done, the video appears in chat with:
   - embedded player (if possible) + Drive link,
   - metadata (hypothesis ID, timestamps).
9. User can repeat steps 5–8 indefinitely.
From Home:

- "Resume session" list (most recent first).
- Selecting a session loads:
  - prior conversation,
  - the Context Pack snapshot used at that time,
  - prior hypotheses and videos,
  - and restarts the chat with full context.
Apps Script Web App:
- Serves the frontend (HTML Service).
- Implements backend endpoints (server functions callable via `google.script.run`).
- Owns logging + storage in:
  - a dedicated log spreadsheet,
  - a dedicated Drive folder tree.

External providers:
- OpenAI (LLM + optional transcription) via HTTPS calls. (OpenAI Platform)
- HeyGen (video generation) via HTTPS calls. (HeyGen API Documentation)
Because:
- HeyGen video rendering can take minutes to hours depending on load and plan, and
- OpenAI reasoning can take longer than a typical UI wait,

ClipLoop must use:
- OpenAI background mode (for LLM calls) + polling, (OpenAI Platform)
- HeyGen status polling (`video_status.get`) until `completed`/`failed`. (HeyGen API Documentation)
On first run, ClipLoop creates (or requests) a root folder: `Drive/ClipLoop/`. Inside:

```
ClipLoop/
  sessions/
    session_<SESSION_ID>/
      context/
        reference_sheets.json
        context_pack.json
      chat/
        messages.jsonl
        openai_calls/
          <OPENAI_RESPONSE_ID>.request.json
          <OPENAI_RESPONSE_ID>.response.json
      hypotheses/
        hypotheses_<TIMESTAMP>.json
      videos/
        video_<ITERATION>_<HEYGEN_VIDEO_ID>/
          brief.json
          heygen.request.json
          heygen.status.json
          output.mp4
          drive_link.txt
      errors/
        <TIMESTAMP>_<KIND>.json
  exports/
```
Notes:
- The log spreadsheet stores indexes and links; Drive stores full JSON payloads to avoid huge cells.
- `SESSION_ID` is globally unique (see §10.1).
Create one dedicated spreadsheet: ClipLoop Logs.
Tabs and columns:
`Sessions` tab:

| Column | Type | Notes |
|---|---|---|
| `session_id` | string | primary key |
| `created_at` | ISO string | |
| `updated_at` | ISO string | |
| `title` | string | derived from intent or user input |
| `intent` | string | original user intent |
| `reference_sheet_urls_json` | string (JSON) | list of selected sheets |
| `session_drive_folder_url` | string | Drive folder |
| `context_pack_drive_url` | string | Drive link to context_pack.json |
| `status` | enum | active, archived, error |
`Messages` tab:

| Column | Type | Notes |
|---|---|---|
| `session_id` | string | |
| `message_id` | string | unique within session |
| `ts` | ISO string | |
| `role` | enum | user, assistant, system |
| `content_text` | string | rendered content |
| `openai_response_id` | string | nullable |
| `attachments_json` | string (JSON) | e.g., video Drive links |
| `raw_drive_url` | string | optional pointer to JSONL chunk |
`Hypotheses` tab:

| Column | Type | Notes |
|---|---|---|
| `session_id` | string | |
| `ts` | ISO string | |
| `hypotheses_json_drive_url` | string | Drive link to hypotheses JSON |
| `selected_hypothesis_id` | string | nullable |
`Videos` tab:

| Column | Type | Notes |
|---|---|---|
| `session_id` | string | |
| `iteration` | number | 1, 2, 3, … |
| `ts_requested` | ISO string | |
| `ts_completed` | ISO string | nullable |
| `hypothesis_id` | string | |
| `heygen_video_id` | string | |
| `heygen_status` | string | `pending` / `waiting` / `processing` / `completed` / `failed` |
| `drive_video_url` | string | Drive link to mp4 |
| `video_brief_drive_url` | string | Drive link to brief.json |
`Errors` tab:

| Column | Type | Notes |
|---|---|---|
| `session_id` | string | |
| `ts` | ISO string | |
| `stage` | string | context_build, llm, heygen, storage, etc. |
| `error_summary` | string | short |
| `error_json_drive_url` | string | full payload |
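A sketch of how an `Errors` row pairs with its full Drive payload. `writeJson_` and `getConfig_` are hypothetical helpers (sketched later in this spec), not existing APIs:

```javascript
// Append an Errors row and park the full payload in Drive (§8.1 errors/ folder).
function logError_(sessionId, stage, summary, payload) {
  var ts = new Date().toISOString();
  var jsonUrl = writeJson_(
    'sessions/session_' + sessionId + '/errors/' + ts + '_' + stage + '.json',
    payload
  );
  SpreadsheetApp.openById(getConfig_('CLIPLOOP_LOG_SPREADSHEET_ID'))
    .getSheetByName('Errors')
    .appendRow([sessionId, ts, stage, summary, jsonUrl]);
}
```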
- Use OpenAI GPT-5.2 Pro as the primary reasoning model: `gpt-5.2-pro`. (OpenAI Platform)
- Use the OpenAI Responses API for all LLM calls. (OpenAI Platform)
All LLM calls must be created with background mode to avoid request timeouts and support polling. (OpenAI Platform)
Implementation requirement:
- Create the response → get a `response_id`.
- Poll the retrieve endpoint until a terminal state (`completed`, `failed`, `cancelled`).
- Store the `response_id` on the assistant message row for traceability. (OpenAI Platform)
- Note: background mode requires stored responses (`store: true`); background responses are retained by OpenAI for retrieval, so background mode is not compatible with strict "zero data retention" constraints. (OpenAI Platform)
- ClipLoop persists every request/response itself in Drive, so its own logging never relies on OpenAI-side storage.
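A minimal sketch of that create-then-poll pattern with `UrlFetchApp`. Endpoint paths follow the OpenAI Responses API; `getConfig_` is a hypothetical Script Properties lookup:

```javascript
// Create a background response; returns the response_id to store on the message row.
function createBackgroundResponse_(payload) {
  payload.background = true;
  payload.store = true; // background mode requires stored responses
  var res = UrlFetchApp.fetch('https://api.openai.com/v1/responses', {
    method: 'post',
    contentType: 'application/json',
    headers: { Authorization: 'Bearer ' + getConfig_('OPENAI_API_KEY') },
    payload: JSON.stringify(payload),
    muteHttpExceptions: true
  });
  return JSON.parse(res.getContentText()).id;
}

// One polling step; caller repeats until a terminal state.
function retrieveResponse_(responseId) {
  var res = UrlFetchApp.fetch('https://api.openai.com/v1/responses/' + responseId, {
    headers: { Authorization: 'Bearer ' + getConfig_('OPENAI_API_KEY') }
  });
  // status: queued | in_progress | completed | failed | cancelled
  return JSON.parse(res.getContentText());
}
```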
Because `gpt-5.2-pro` does not support Structured Outputs directly, ClipLoop must use function calling to produce machine-readable artifacts (hypothesis candidates and video briefs). (OpenAI Platform)

ClipLoop must define two functions in the OpenAI request `tools`:
- `propose_hypotheses`
- `compose_video_brief`

The assistant's natural-language chat response should remain human-readable, but hypotheses and video briefs must come from tool-call arguments.
`propose_hypotheses` (args) must conform to:
- `mode`: `"initial"` or `"iteration"`
- `hypotheses`: array of 3–7 candidates

Each hypothesis candidate must include:
- `hypothesis_id` (string, unique within session)
- `title` (string)
- `hypothesis_statement` (string; "If we do X, then Y because Z…")
- `rationale_from_data` (string; references observed patterns)
- `what_to_make_next` (string; content concept guidance)
- `success_criteria` (array of metric rules)
- `test_plan` (string; what to post, how many, timeframe)
- `risks_and_unknowns` (array of strings)
- `questions_for_user` (array of strings)

Metric rule object:
- `metric` (enum: `views`, `likes`, `comments`, `shares`, `engagement_rate`, `watch_time`, `saves`)
- `operator` (enum: `>=`, `>`, `<=`, `<`)
- `target_value` (number)
- `window` (string; e.g., `24h`, `7d`)
- `notes` (string)
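For concreteness, a sketch of the `propose_hypotheses` tool definition as a Responses API function tool. The schema is abridged (not every field above is shown); exact JSON Schema wording is an implementation choice:

```javascript
// Abridged tool definition for the Responses API `tools` array.
var PROPOSE_HYPOTHESES_TOOL = {
  type: 'function',
  name: 'propose_hypotheses',
  description: 'Propose 3-7 data-grounded hypotheses with measurable success criteria.',
  parameters: {
    type: 'object',
    properties: {
      mode: { type: 'string', enum: ['initial', 'iteration'] },
      hypotheses: {
        type: 'array',
        minItems: 3,
        maxItems: 7,
        items: {
          type: 'object',
          properties: {
            hypothesis_id: { type: 'string' },
            title: { type: 'string' },
            hypothesis_statement: { type: 'string' },
            rationale_from_data: { type: 'string' },
            success_criteria: {
              type: 'array',
              items: {
                type: 'object',
                properties: {
                  metric: { type: 'string', enum: ['views', 'likes', 'comments', 'shares', 'engagement_rate', 'watch_time', 'saves'] },
                  operator: { type: 'string', enum: ['>=', '>', '<=', '<'] },
                  target_value: { type: 'number' },
                  window: { type: 'string' },
                  notes: { type: 'string' }
                },
                required: ['metric', 'operator', 'target_value', 'window']
              }
            }
            // ...plus what_to_make_next, test_plan, risks_and_unknowns, questions_for_user
          },
          required: ['hypothesis_id', 'title', 'hypothesis_statement', 'success_criteria']
        }
      }
    },
    required: ['mode', 'hypotheses']
  }
};
```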
`compose_video_brief` (args) must conform to:

- `brief_id` (string)
- `hypothesis_id` (string)
- `platform_target` (`instagram_reels` | `tiktok` | `both`)
- `video_format`:
  - `aspect_ratio` default `"9:16"`
  - `resolution` default `{ "width": 720, "height": 1280 }`
  - `duration_seconds` integer (default 20–45)
- `script`:
  - `hook` (string, 1–2 sentences)
  - `body` (string)
  - `cta` (string)
  - `on_screen_text` (array of short strings)
  - `captions` (boolean)
- `avatar`:
  - `heygen_avatar_id` (string; can be default from settings)
  - `heygen_voice_id` (string; can be default from settings)
  - `voice_speed` (number; default 1.0)
- `safety`:
  - `requires_moderation` (boolean; default true)
  - `blocked_topics` (array; default empty)
Hard constraint: `script.body` must be sized so that the final HeyGen `input_text` is < 5000 characters. (HeyGen API Documentation)
ClipLoop's system prompt must enforce:
- Always ground hypotheses in the provided Context Pack.
- Always produce success criteria that are measurable.
- Ask clarifying questions when uncertainty is high.
- Never fabricate metrics; if unknown, label them as assumptions.
- For `compose_video_brief`, prioritize a short, punchy, vertical "AI influencer" style.
If `drive_url` resolves to a Drive `video.mp4` at or under OpenAI's file-size limit, ClipLoop may transcribe it using OpenAI speech-to-text. File uploads are limited (documented as 25 MB for speech-to-text).

Recommended model for transcription: `gpt-4o-mini-transcribe`.

If the file is too large or cannot be fetched, skip transcription and rely on the caption / `voice_to_text`.
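A sketch of that guarded transcription call. `UrlFetchApp` sends `multipart/form-data` automatically when the payload object contains a Blob; `getConfig_` is a hypothetical helper:

```javascript
// Transcribe a small Drive video with OpenAI speech-to-text, or return null to skip.
function maybeTranscribe_(file) {
  if (file.getSize() > 25 * 1024 * 1024) return null; // over the documented 25 MB limit
  var res = UrlFetchApp.fetch('https://api.openai.com/v1/audio/transcriptions', {
    method: 'post',
    headers: { Authorization: 'Bearer ' + getConfig_('OPENAI_API_KEY') },
    payload: {
      model: getConfig_('OPENAI_MODEL_TRANSCRIBE'), // e.g. gpt-4o-mini-transcribe
      file: file.getBlob()                          // triggers multipart/form-data
    },
    muteHttpExceptions: true
  });
  if (res.getResponseCode() !== 200) return null; // fall back to caption / voice_to_text
  return JSON.parse(res.getContentText()).text;
}
```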
Default video provider: HeyGen API (avatar/talking-head generation). (HeyGen API Documentation)

- List avatars: `GET https://api.heygen.com/v2/avatars` (HeyGen API Documentation)
- List voices: `GET https://api.heygen.com/v2/voices` (HeyGen API Documentation)
- Create video: `POST https://api.heygen.com/v2/video/generate` (HeyGen API Documentation)
- Video status: `GET https://api.heygen.com/v1/video_status.get` (HeyGen API Documentation)

ClipLoop must treat status values as: `pending`, `waiting`, `processing`, `completed`, `failed`. (HeyGen API Documentation)

Polling must continue until `completed` or `failed`.

HeyGen's returned `video_url` expires after 7 days; ClipLoop must download and store the video in Drive promptly on completion. (HeyGen API Documentation)
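A sketch of the submit → poll → persist path. The payload fields (`video_inputs`, `character`, `voice`, `dimension`) follow HeyGen's documented v2 generate endpoint, but verify them against the current API reference; `getConfig_` is a hypothetical helper:

```javascript
// Submit a video job built from a compose_video_brief result.
function submitHeygenJob_(brief) {
  var text = [brief.script.hook, brief.script.body, brief.script.cta].join(' ');
  if (text.length >= 5000) throw new Error('HeyGen input_text must be < 5000 chars');
  var res = UrlFetchApp.fetch('https://api.heygen.com/v2/video/generate', {
    method: 'post',
    contentType: 'application/json',
    headers: { 'X-Api-Key': getConfig_('HEYGEN_API_KEY') },
    payload: JSON.stringify({
      video_inputs: [{
        character: { type: 'avatar', avatar_id: brief.avatar.heygen_avatar_id },
        voice: {
          type: 'text',
          input_text: text,
          voice_id: brief.avatar.heygen_voice_id,
          speed: brief.avatar.voice_speed
        }
      }],
      dimension: brief.video_format.resolution // e.g. { width: 720, height: 1280 }
    })
  });
  return JSON.parse(res.getContentText()).data.video_id;
}

// One polling step; the caller repeats until 'completed' or 'failed'.
function pollHeygenOnce_(videoId, videoFolder) {
  var res = UrlFetchApp.fetch(
    'https://api.heygen.com/v1/video_status.get?video_id=' + encodeURIComponent(videoId),
    { headers: { 'X-Api-Key': getConfig_('HEYGEN_API_KEY') } }
  );
  var data = JSON.parse(res.getContentText()).data;
  if (data.status === 'completed') {
    // video_url expires after 7 days, so download into Drive immediately.
    var blob = UrlFetchApp.fetch(data.video_url).getBlob().setName('output.mp4');
    return { status: data.status, driveUrl: videoFolder.createFile(blob).getUrl() };
  }
  return { status: data.status }; // pending | waiting | processing | failed
}
```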
- `SESSION_ID`: `sess_<YYYYMMDD>_<randomBase36(10)>`
- `MESSAGE_ID`: `msg_<incrementingInt>`
- `HYPOTHESIS_ID`: `hyp_<incrementingInt>`
- `BRIEF_ID`: `brief_<incrementingInt>`
- `ITERATION`: integer starting at 1
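A sketch of the `SESSION_ID` convention:

```javascript
// Generate a session ID per the convention above.
function newSessionId_() {
  var day = Utilities.formatDate(new Date(), 'UTC', 'yyyyMMdd');
  var rand = '';
  for (var i = 0; i < 10; i++) {
    rand += Math.floor(Math.random() * 36).toString(36); // one base-36 char
  }
  return 'sess_' + day + '_' + rand; // e.g. sess_20260123_a9k2x04bqz
}
```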
`context_pack.json` must include:

- `session_id`
- `intent`
- `reference_sheets` (array of `{url, spreadsheet_id, title, ingested_at}`)
- `ingested_posts` (array of "ContentCard"; see below) OR (recommended) an `exemplars` subset + summary stats
- `summary_stats`:
  - counts, top posts, median/mean views where available,
  - hashtag frequency,
  - platform breakdown.
- `exemplars`:
  - top N by views (TikTok),
  - top N by engagement proxy,
  - N random samples.
Minimal fields:

- `platform` (`tiktok` | `instagram`)
- `platform_post_id`
- `posted_at`
- `caption_or_description`
- `transcript_text` (from `voice_to_text` or transcription)
- `drive_url`
- `metrics` (object; includes raw numeric strings + parsed numbers where safe)
- `raw_row` (key/value map for all columns)
For every session:
- The user's intent.
- The exact reference sheet URLs used.
- The Context Pack snapshot link.
- All chat messages (user + assistant).
- Every hypothesis set produced (stored as JSON in Drive + indexed in the `Hypotheses` tab).
- Every video brief and HeyGen job (request/response/status snapshots).
- The Drive link to every created video.
ClipLoop must implement these server functions callable from the frontend.

Sessions:
- `createSession({ intent, referenceSheetUrls[] }) -> { sessionId }`
- `listSessions() -> { sessions[] }`
- `loadSession({ sessionId }) -> { session, contextPack, messages, latestHypotheses, videos }`

Context build:
- `startContextBuild({ sessionId }) -> { jobId }`
- `pollContextBuild({ sessionId, jobId }) -> { status, progress, ready }`

Chat:
- `sendUserMessage({ sessionId, text }) -> { messageId, openaiResponseId }`
- `pollAssistantMessage({ sessionId, openaiResponseId }) -> { status, assistantText?, hypothesesUpdate? }`

Hypothesis selection:
- `selectHypothesis({ sessionId, hypothesisId }) -> { ok }`

Video generation:
- `startVideoGeneration({ sessionId, hypothesisId }) -> { iteration, briefOpenaiResponseId }`
- `pollVideoBrief({ sessionId, iteration, briefOpenaiResponseId }) -> { status, brief? }`
- `submitHeygenJob({ sessionId, iteration, brief }) -> { heygenVideoId }`
- `pollHeygenJob({ sessionId, iteration, heygenVideoId }) -> { status, videoUrl?, completedDriveUrl? }`
1. Create a Drive folder named `ClipLoop`.
2. Create a Google Spreadsheet named `ClipLoop Logs` with tabs `Sessions`, `Messages`, `Hypotheses`, `Videos`, `Errors` (exact schemas in §8.2).
3. Create a new Apps Script project named `ClipLoop`.
4. Add HTML Service frontend files:
   - `index.html` (single-page UI)
   - `app.js.html` (client JS)
   - `styles.css.html` (inline CSS or separate template)
5. Add server-side `.gs` files (recommended split):
   - `Main.gs` (doGet + routing)
   - `Config.gs` (script properties + constants)
   - `Storage.gs` (Drive + log sheet helpers)
   - `ClipPulseIngest.gs` (sheet parsing)
   - `ContextPack.gs` (summary + exemplars)
   - `OpenAI.gs` (Responses API wrapper + polling)
   - `HeyGen.gs` (API wrapper + polling + download)
   - `SessionApi.gs` (RPC functions)
6. Set OAuth scopes in the manifest (minimum):
   - Drive read/write
   - Sheets read/write
   - External requests
7. Deploy as a Web App:
   - Execute as: Me (owner) (recommended for consistent Drive/log ownership)
   - Access: restrict to allowed users (single-user MVP: only you). (Google for Developers)
- Implement "ensure folder" logic for `ClipLoop/` and per-session subfolders (§8.1), as sketched below.
- Implement log-sheet append/update helpers that:
  - batch writes where possible,
  - never exceed Apps Script execution time limits. (Google for Developers)
- Implement JSON persistence: write/read JSON files in Drive by path.
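A sketch of those Storage.gs helpers. The names `ensureFolderPath_` and `writeJson_` are illustrative, and `getConfig_` is the hypothetical properties lookup used throughout these sketches:

```javascript
// Walk (or create) a nested folder path under a given root folder.
function ensureFolderPath_(root, path) {
  return path.split('/').filter(String).reduce(function (parent, name) {
    var existing = parent.getFoldersByName(name);
    return existing.hasNext() ? existing.next() : parent.createFolder(name);
  }, root);
}

// Persist an object as pretty-printed JSON at a path under the ClipLoop root.
function writeJson_(path, obj) {
  var root = DriveApp.getFolderById(getConfig_('CLIPLOOP_ROOT_DRIVE_FOLDER_ID'));
  var parts = path.split('/');
  var fileName = parts.pop();
  var folder = parts.length ? ensureFolderPath_(root, parts.join('/')) : root;
  var file = folder.createFile(fileName, JSON.stringify(obj, null, 2), MimeType.PLAIN_TEXT);
  return file.getUrl();
}
```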
- For each selected reference sheet URL:
  - open the spreadsheet,
  - read the `Instagram` and `TikTok` tabs (if either is missing → error).
- Build ContentCards:
  - parse all columns into `raw_row`,
  - normalize the required fields (post_id, caption, metrics, drive_url).
- Deduplicate by `(platform, platform_post_id)` across all selected sheets (see the sketch below).
- Store a snapshot of the ingested data (or exemplars) into `context_pack.json`.
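A sketch of the dedup step referenced above:

```javascript
// Keep the first occurrence of each (platform, platform_post_id) pair.
function dedupeCards_(cards) {
  var seen = {};
  return cards.filter(function (card) {
    var key = card.platform + ':' + card.platform_post_id;
    if (seen[key]) return false;
    seen[key] = true;
    return true;
  });
}
```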
- Compute (see the sketch below for two of these):
  - per-platform counts,
  - top posts (by views when available),
  - hashtag frequencies,
  - simple engagement proxy metrics where possible.
- Select exemplars:
  - top N by views / engagement,
  - N random.
- Save `context_pack.json` and link it in the `Sessions` tab.
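A sketch covering two of those stats (per-platform counts and hashtag frequency). It assumes `hashtag_names` arrives as a comma-separated string, which should be confirmed against real ClipPulse output:

```javascript
// Compute a subset of summary_stats from deduped ContentCards.
function summarize_(cards) {
  var counts = {};
  var hashtags = {};
  cards.forEach(function (card) {
    counts[card.platform] = (counts[card.platform] || 0) + 1;
    (card.raw_row.hashtag_names || '').split(',').forEach(function (tag) {
      tag = tag.trim().toLowerCase();
      if (tag) hashtags[tag] = (hashtags[tag] || 0) + 1;
    });
  });
  return { platform_counts: counts, hashtag_frequency: hashtags };
}
```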
- Implement the OpenAI Responses API wrapper:
  - create a response with `background: true` (background mode requires `store: true`), (OpenAI Platform)
  - poll `GET /v1/responses/{id}` until a terminal state. (OpenAI Platform)
- Implement the function-calling tool definitions:
  - `propose_hypotheses`
  - `compose_video_brief` (OpenAI Platform)
- Implement the message pipeline: write the user message → kick off OpenAI → write an assistant placeholder → poll for updates → finalize the message + store the hypotheses JSON.
- Implement the HeyGen API wrapper:
  - list avatars (for the settings UI), (HeyGen API Documentation)
  - list voices (for the settings UI), (HeyGen API Documentation)
  - create video: `POST /v2/video/generate`, (HeyGen API Documentation)
  - poll status: `GET /v1/video_status.get`. (HeyGen API Documentation)
- Enforce input constraints: ensure the script text is < 5000 chars. (HeyGen API Documentation)
- On completion: download `video_url` immediately and store it in Drive (the URL expires in 7 days). (HeyGen API Documentation)
- Write back: `Videos` tab row + chat message attachment with the Drive link.
- Build `listSessions()` from the `Sessions` tab.
- `loadSession()` loads:
  - messages from `Messages`,
  - latest hypotheses from `Hypotheses`,
  - videos from `Videos`,
  - context from the Drive `context_pack.json`.
- The UI renders and allows the conversation to continue.
- Load a known ClipPulse sheet and confirm:
  - the context build completes,
  - initial hypotheses appear.
- Chat loop: send a message → receive an assistant response and a hypothesis refresh.
- Select a hypothesis → create a video:
  - video brief created,
  - HeyGen job submitted,
  - status transitions to `completed` or `failed`,
  - on `completed`: MP4 saved to Drive + shown in chat.
- Resume: reload the session and verify context + history match the prior state.
ClipLoop is "done" when:
- Reference sheet selection
- User can add and remove multiple ClipPulse sheet URLs.
- Context gating
- Chat cannot begin until Context Pack snapshot exists and initial AI message is generated.
- Hypothesis generation
- AI provides 3–7 hypotheses with explicit success criteria, and the UI presents them as selectable items.
- Video generation
- After selecting a hypothesis, user can generate a HeyGen video and receive the output in chat + Drive.
- Logging
- Every session, message, hypotheses set, and video is logged to the log spreadsheet and Drive.
- Resume
- User can reopen a previous session and continue with correct prior context.
Store in PropertiesService:
- `CLIPLOOP_LOG_SPREADSHEET_ID` (string)
- `CLIPLOOP_ROOT_DRIVE_FOLDER_ID` (string)

OpenAI:
- `OPENAI_API_KEY` (string)
- `OPENAI_MODEL_REASONING` default `gpt-5.2-pro` (OpenAI Platform)
- `OPENAI_MODEL_TRANSCRIBE` default `gpt-4o-mini-transcribe`

HeyGen:
- `HEYGEN_API_KEY` (string)
- `HEYGEN_DEFAULT_AVATAR_ID` (string)
- `HEYGEN_DEFAULT_VOICE_ID` (string)

App behavior:
- `MAX_EXEMPLARS_PER_PLATFORM` (number; default 20)
- `MAX_RANDOM_SAMPLES_PER_PLATFORM` (number; default 20)
- `HYPOTHESIS_CANDIDATE_COUNT` (number; default 5)
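A sketch of a Config.gs getter backing these properties with the stated defaults (this is the `getConfig_` helper assumed by the earlier sketches; Script Properties values are always strings, so numeric values are parsed by callers):

```javascript
// Defaults for optional properties; required keys (API keys, IDs) have none.
var CONFIG_DEFAULTS = {
  OPENAI_MODEL_REASONING: 'gpt-5.2-pro',
  OPENAI_MODEL_TRANSCRIBE: 'gpt-4o-mini-transcribe',
  MAX_EXEMPLARS_PER_PLATFORM: '20',
  MAX_RANDOM_SAMPLES_PER_PLATFORM: '20',
  HYPOTHESIS_CANDIDATE_COUNT: '5'
};

function getConfig_(key) {
  var value = PropertiesService.getScriptProperties().getProperty(key);
  if (value !== null) return value;
  if (key in CONFIG_DEFAULTS) return CONFIG_DEFAULTS[key];
  throw new Error('Missing required script property: ' + key);
}
```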