pdf-bilingual-translation-md

By Akisui

Use this skill when the user asks to translate a PDF paper, article, report, or technical document into paragraph-aligned bilingual Markdown, especially Chinese-English or another user-confirmed language pair. Preserve equations in LaTeX, translate paragraph by paragraph, crop original figures from the PDF at high resolution, include complete original captions in figure images, optionally run human-in-the-loop figure confirmation, and verify every referenced image before delivery. Do not create ZIP/DOCX/PDF/PPTX unless the user explicitly asks.

Overview

PDF bilingual paragraph-aligned Markdown translation

Use this skill for PDF-to-Markdown translation tasks where the user wants:

  • Paragraph-by-paragraph bilingual alignment.
  • Mathematical formulas preserved accurately in LaTeX.
  • Original figures inserted near the corresponding paragraphs.
  • Figure crops taken from the original PDF, not generated or redrawn.
  • Figure crops that include the full visual content and the original figure caption.
  • A final Markdown deliverable that can be opened locally by Codex or edited by the user.

Design basis

Codex skills are local workflow instructions stored in a skill directory with a required SKILL.md; scripts, references, and assets are optional.[^codex-skills] Codex uses progressive disclosure: it first sees each skill's name, description, and file path, then loads the full SKILL.md only when it decides to use the skill.[^codex-skills] Codex can activate a skill explicitly through /skills or $, or implicitly when the task matches the skill description.[^codex-skills]

For interactive work, distinguish two mechanisms:

  1. Workflow-level user confirmation: the agent asks the user a normal question in the Codex conversation and waits before continuing. Use this for confirming translation language and figure crops.
  2. Codex approval prompts: Codex approval modes control when Codex stops before running commands, editing outside scope, or using network access. They are security/permission approvals, not a substitute for content QA.[^codex-approval-modes]

If the current Codex environment exposes tool/requestUserInput, it may be used for short structured choices; OpenAI documents it as an experimental app-server tool that can prompt the user with 1-3 short questions.[^codex-request-user-input] Do not depend on it. If it is unavailable, ask the same questions in plain chat and wait for the user reply.

codex exec is non-interactive and returns the final plan/results to stdout, so it is not suitable for per-figure user confirmation. In non-interactive runs, either skip user confirmation and do mandatory self-QA, or stop after preparing preview files and tell the user to rerun in an interactive Codex session.[^codex-exec]

Output contract

Default output:

text
<source_basename>_bilingual.md
<source_basename>_assets_<unique_suffix>/
  figure1_*.png
  figure2_*.png
  ...

If the user says they only want one Markdown file, create a standalone Markdown file and embed images as base64 data:image/png;base64,... URIs. Otherwise, prefer a normal Markdown file plus a relative assets directory, because it is easier to inspect and edit locally.
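
As a rough illustration of the standalone option, the sketch below rewrites local `<img src="...">` paths into `data:image/png;base64,...` URIs. The names `to_data_uri`, `embed_images`, and the `read_bytes` callback are hypothetical and not part of this skill. It assumes every image is a PNG and only handles HTML `<img>` tags with double-quoted `src` attributes.

```python
import base64
import re
from typing import Callable

# Captures the text before the src value, the value itself, and the closing quote.
IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc=")([^"]+)(")')


def to_data_uri(png_bytes: bytes) -> str:
    """Encode raw PNG bytes as a data:image/png;base64,... URI."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return f"data:image/png;base64,{encoded}"


def embed_images(markdown: str, read_bytes: Callable[[str], bytes]) -> str:
    """Replace each local <img src="..."> path with an embedded data URI.

    read_bytes is a caller-supplied (hypothetical) loader that maps a
    relative image path to the file's bytes.
    """
    def replace(match):
        src = match.group(2)
        if src.startswith("data:"):
            return match.group(0)  # already embedded, leave untouched
        return match.group(1) + to_data_uri(read_bytes(src)) + match.group(3)

    return IMG_SRC.sub(replace, markdown)
```

A caller would typically pass a loader that resolves paths relative to the Markdown file's directory, for example `lambda p: (markdown_dir / p).read_bytes()` with `markdown_dir` as a `pathlib.Path`.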

Do not create a .zip unless the user explicitly asks for packaging.

User interaction language policy

All workflow questions, confirmation prompts, progress updates, rejection/approval choices, and final status messages must use the interaction language inferred from the current user conversation, not a hardcoded language.

Rules:

  • Infer the interaction language from the user's latest request and the surrounding conversation context.
  • If the user is writing in Chinese, ask follow-up questions and confirmation prompts in Chinese.
  • If the user is writing in Japanese, use Japanese.
  • If the user is writing in English, use English.
  • If the conversation is mixed-language, use the language the user most recently used for task instructions, unless they explicitly requested another language.
  • The interaction language is independent of the translation direction. For example, the user may ask in Chinese to translate an English paper into Japanese; in that case, ask confirmations in Chinese, but produce the translated document in Japanese as confirmed.
  • Do not hardcode Chinese examples directly into the final user prompts. Treat examples in this skill as templates and localize them to the inferred interaction language before asking the user.
  • Keep technical labels such as Figure 2, Markdown, LaTeX, assets directory, and file paths unchanged unless translating them improves clarity.
  • If the interaction language is unclear, ask one short clarification question in the language of the user's latest message.
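
The rules above describe a judgment the agent makes from context. Purely as an illustration of the first-pass idea, the sketch below guesses the interaction language from the script used in the latest message. The function name and the returned codes are hypothetical, and the heuristic is deliberately crude: Japanese written only in kanji would be misread as Chinese, so an unclear result should still lead to a clarification question.

```python
from typing import Optional


def guess_interaction_language(message: str) -> Optional[str]:
    """Guess 'ja', 'zh', or 'en' from the characters in a message.

    Returns None when no supported script is found, in which case the
    agent should ask a short clarification question instead.
    """
    # Hiragana and Katakana blocks: a strong signal for Japanese.
    has_kana = any("\u3040" <= ch <= "\u30ff" for ch in message)
    # CJK Unified Ideographs: shared by Chinese and Japanese.
    has_han = any("\u4e00" <= ch <= "\u9fff" for ch in message)
    has_latin = any(ch.isascii() and ch.isalpha() for ch in message)

    if has_kana:
        return "ja"
    if has_han:
        return "zh"
    if has_latin:
        return "en"
    return None
```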

Required initial confirmation

Before doing the full translation, ask the user to confirm these two points unless the prompt already explicitly forbids follow-up questions. Write these questions in the inferred interaction language from the policy above.

1. Confirm translation languages

Ask a concise localized question such as:

text
请确认翻译方向:
A. English → Chinese paragraph-aligned Markdown
B. Chinese → English paragraph-aligned Markdown
C. Keep original + translate into another language: <please specify>

If the source language is obvious from the PDF and the user's prompt, propose the inferred direction in the inferred interaction language, but still ask for confirmation:

text
我将按 English → Chinese 做段落对照翻译。请回复“确认”,或指定其他目标语言。

Do not start the final translation until the language direction is confirmed.

2. Ask whether to enable per-figure user confirmation

Ask the user to choose one mode, localized to the inferred interaction language:

text
图片是否需要逐张人工确认?
A. 需要:每裁一张图后暂停,让我确认;有问题就重裁。
B. 不需要:你自行逐张检查,最后给我 Markdown。

Recommended default for papers with many figures: B. Default to A instead if the user has previously complained about crop quality (for example, incomplete crops), asks for strict visual QA, or the document has complex multi-panel figures.

Workflow

1. Inspect the source PDF before translating

  1. Identify the paper title, authors, abstract, section headings, formulas, tables, figures, references, footnotes, and page count.
  2. Determine where each figure appears in the reading order.
  3. Do not rely only on extracted text. Render or visually inspect pages that contain figures, captions, complex equations, tables, or two-column layouts.
  4. For scanned PDFs, use OCR only as a fallback. Prefer embedded text extraction when available.

2. Extract and normalize text

Preserve the original logical order rather than the raw PDF column order when necessary.

Remove extraction artifacts:

  • Broken hyphenation caused by line wrapping, for example coloriza- tion → colorization.
  • PDF ligature or encoding noise when it is clearly an extraction artifact.
  • Repeated headers, footers, page numbers, and copyright boilerplate unless the user asks to keep them.

Keep meaningful content:

  • Title, authors, abstract, keywords, section headings, body paragraphs, equations, captions, acknowledgements, and references.
  • Footnotes near the paragraph where they are cited.
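
The cleanup rules above could be sketched roughly as follows. Both helpers are hypothetical and use only simple heuristics. `join_broken_hyphens` rejoins a word split by a hyphen plus whitespace, and will wrongly merge legitimate constructions such as "pre- and post-processing". `drop_repeated_lines` removes lines that recur verbatim on most pages, so it catches running headers but not page numbers that change from page to page.

```python
import re
from collections import Counter
from typing import List

# A word fragment, a hyphen, whitespace (possibly a newline), then a
# lowercase continuation of the same word.
BROKEN_HYPHEN = re.compile(r"(\w+)-\s+([a-z]\w*)")


def join_broken_hyphens(text: str) -> str:
    """Rejoin words that were hyphenated across a line break."""
    return BROKEN_HYPHEN.sub(r"\1\2", text)


def drop_repeated_lines(pages: List[str], min_fraction: float = 0.6) -> List[str]:
    """Remove lines that appear verbatim on at least min_fraction of pages."""
    counts = Counter()
    for page in pages:
        # Count each distinct non-empty line once per page.
        counts.update({line.strip() for line in page.splitlines() if line.strip()})
    threshold = max(2, int(len(pages) * min_fraction))
    repeated = {line for line, n in counts.items() if n >= threshold}
    return [
        "\n".join(l for l in page.splitlines() if l.strip() not in repeated)
        for page in pages
    ]
```

Any automatic cleanup like this still needs a visual check against the PDF, since the heuristics cannot tell an extraction artifact from intentional text.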

3. Translate paragraph by paragraph

Use this format for normal paragraphs:

markdown
**Original:** <original paragraph>

**Translation:** <translated paragraph>

For Chinese output, use:

markdown
**Original:** <original English paragraph>

**中文:** <Chinese translation>

For headings:

markdown
## 1 Introduction / 1 引言

For short labels such as “Abstract”, “Keywords”, and “References”, use bilingual headings:

markdown
## Abstract / 摘要

Translation rules:

  • Translate faithfully; do not summarize unless the user asks.
  • Preserve technical terms consistently. If a term has a standard translation, use it; otherwise keep the source term in parentheses on first occurrence.
  • Preserve citation markers such as [Levin et al. 2004], equation references such as Eq. (7), and figure references such as Figure 3(a).
  • Preserve variables exactly: do not rename G_r, G_t, V_r, E_t, x_{ij}, etc.
  • Do not silently omit paragraphs because they are difficult or repetitive.

4. Preserve formulas accurately

Convert formulas to LaTeX in Markdown.

Inline formulas:

markdown
$x_{ij}$, $G_r = (V_r, E_r)$

Display formulas:

markdown
$$
\sum_j x_{ij} = 1, \qquad \sum_i x_{ij} = 1.
$$

Formula rules:

  • Preserve equation numbering when present.
  • Use \tag{n} only when the equation number is part of the paper and useful for cross-reference.
  • Check all subscripts and superscripts visually against the PDF.
  • Distinguish similar symbols carefully: i_1 vs i, j_1 vs j, N^2 vs N_2, \lambda_{\min} vs \lambda min.
  • Preserve constraints such as x \in \{0,1\}^{N^2} and x \in [0,1]^{N^2} exactly.
  • If a formula cannot be read confidently, mark it with <!-- TODO: verify formula from page X --> rather than guessing.
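
One way to catch formulas that changed during translation is to compare the LaTeX spans in the original paragraph with those in its translation. The sketch below is a minimal, hypothetical helper under simplifying assumptions: math is delimited only by `$...$` or `$$...$$`, and formulas are compared as whitespace-normalized strings. It does not replace the visual check against the PDF.

```python
import re
from collections import Counter
from typing import List

# Display math ($$...$$) is tried first, then inline math ($...$).
MATH_SPAN = re.compile(r"\$\$(.+?)\$\$|\$([^$\n]+?)\$", re.DOTALL)


def math_spans(text: str) -> List[str]:
    """Return every math span in the text with whitespace normalized."""
    spans = []
    for display, inline in MATH_SPAN.findall(text):
        body = display or inline
        spans.append(" ".join(body.split()))
    return spans


def changed_formulas(original: str, translation: str) -> List[str]:
    """Return formulas that occur a different number of times in the two texts."""
    before = Counter(math_spans(original))
    after = Counter(math_spans(translation))
    # Counter subtraction keeps only positive differences in each direction.
    return sorted((before - after) + (after - before))
```

An empty result means the two paragraphs contain the same formulas the same number of times. A non-empty result lists both the missing and the unexpected variants, for example when `E_r` was mistyped as `E_t`.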

5. Crop figures from the original PDF

Figures must be cropped from the original PDF render. Do not generate, redraw, trace, or enhance them with image generation.

Recommended local process:

  1. Render relevant PDF pages at high DPI, usually 300–600 DPI. Use 450 DPI or higher for small diagrams and equations inside figures.
  2. Crop the figure from the rendered page.
  3. Include the full figure body, all subfigure labels, arrows, annotations, and the original caption below the figure.
  4. Keep enough margin so that no text or line is clipped.
  5. Save figures as PNG unless the source figure is photographic and JPEG is clearly smaller without visible quality loss.
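
The arithmetic behind steps 1 to 4 can be sketched as below. PDF coordinates are measured in points (72 per inch), so a page rendered at a given DPI is scaled by `dpi / 72`. The helper name, the default margin of 6 points, and the top-left-origin box convention are assumptions made for illustration; the actual rendering and cropping would be done with whatever PDF and imaging tools are available locally.

```python
import math
from typing import Tuple

POINTS_PER_INCH = 72  # PDF user space: 1 point = 1/72 inch

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1), top-left origin


def crop_box_pixels(
    box_pt: Box,
    page_size_pt: Tuple[float, float],
    dpi: int = 450,
    margin_pt: float = 6.0,
) -> Tuple[int, int, int, int]:
    """Convert a figure box in PDF points to a pixel box on the rendered page.

    The box is padded by margin_pt on every side, clamped to the page,
    and rounded outward so that no edge pixel is lost.
    """
    scale = dpi / POINTS_PER_INCH
    x0, y0, x1, y1 = box_pt
    width, height = page_size_pt
    x0, y0 = max(0.0, x0 - margin_pt), max(0.0, y0 - margin_pt)
    x1, y1 = min(width, x1 + margin_pt), min(height, y1 + margin_pt)
    return (
        math.floor(x0 * scale),
        math.floor(y0 * scale),
        math.ceil(x1 * scale),
        math.ceil(y1 * scale),
    )
```

The box passed in should already span the figure body and its caption. The margin only guards against clipped strokes at the edges.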

File naming:

text
<source_basename>_assets_<unique_suffix>/figure1_system_overview.png
<source_basename>_assets_<unique_suffix>/figure2_graph_nodes.png

Use a new unique assets directory each time the figures are regenerated. This avoids accidentally reusing old cropped images or triggering stale previewer caches.
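
A minimal sketch of deriving the output names, assuming the unique suffix is a short random hex string. The suffix scheme and the function name are hypothetical; a timestamp or a counter would serve equally well.

```python
import uuid
from pathlib import Path
from typing import Optional, Tuple


def plan_output_paths(pdf_path: str, suffix: Optional[str] = None) -> Tuple[Path, Path]:
    """Return (markdown_path, assets_dir) placed next to the source PDF."""
    source = Path(pdf_path)
    # Assumption: 8 random hex characters are unique enough per regeneration.
    unique = suffix or uuid.uuid4().hex[:8]
    markdown_path = source.with_name(f"{source.stem}_bilingual.md")
    assets_dir = source.with_name(f"{source.stem}_assets_{unique}")
    return markdown_path, assets_dir
```

Calling the function again without a suffix yields a fresh directory name, which matches the goal of never reusing an old assets directory.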

6. Per-figure user confirmation mode

Run this section only if the user chose per-figure confirmation.

For each figure, do this loop:

  1. Crop the figure into a staging assets directory using a versioned filename, for example figure2_graph_nodes_v1.png.
  2. Show or open the crop for the user if the environment supports it. Otherwise, provide the exact local file path and ask the user to inspect it.
  3. Ask, using the inferred interaction language:
text
Figure 2 是否通过?
A. 通过
B. 不通过:缺上/下/左/右边缘
C. 不通过:缺图注
D. 不通过:图太小/太模糊
E. 不通过:不是这张图或位置不对
F. 其他问题:请说明

If tool/requestUserInput is available, it may be used for this question. Otherwise, ask in plain chat.

  4. If the user rejects the crop, adjust the crop box or DPI, save a new versioned file such as figure2_graph_nodes_v2.png, and ask again.
  5. When the user approves a figure, copy or rename the approved version into the final unique assets directory and update the Markdown to reference only the approved final filename.
  6. Never leave rejected crop versions referenced in the final Markdown.
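
The versioned-filename bookkeeping in this loop could look roughly like the sketch below. The helper names are hypothetical, and it assumes staged crops always follow the `<name>_v<N>.png` pattern shown above.

```python
import re
import shutil
from pathlib import Path

VERSIONED = re.compile(r"^(?P<base>.+)_v(?P<n>\d+)\.png$")


def _parse(name: str):
    match = VERSIONED.match(name)
    if match is None:
        raise ValueError(f"not a versioned crop filename: {name}")
    return match


def next_version_name(current: str) -> str:
    """figure2_graph_nodes_v1.png -> figure2_graph_nodes_v2.png"""
    match = _parse(current)
    return f"{match['base']}_v{int(match['n']) + 1}.png"


def final_name(approved: str) -> str:
    """Drop the version suffix: figure2_graph_nodes_v3.png -> figure2_graph_nodes.png"""
    return f"{_parse(approved)['base']}.png"


def promote(staging_dir: str, approved: str, final_dir: str) -> Path:
    """Copy the approved crop into the final assets directory under its final name."""
    target_dir = Path(final_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / final_name(approved)
    shutil.copy2(Path(staging_dir) / approved, target)
    return target
```

Because only `promote` writes into the final directory, rejected versions stay in staging and cannot be referenced by accident.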

If there are many figures, you may ask the user whether to switch to batch confirmation after the first few figures:

text
是否改为批量确认?我可以生成 contact sheet,一次展示多张图;你指出有问题的图号后我再逐张修正。

7. Batch figure confirmation option

If the user wants faster confirmation, generate a contact sheet containing all figure crops with clear labels such as Figure 1, Figure 2, etc. Ask the user, in the inferred interaction language, to inspect it and list any problematic figure numbers.

If the user reports problems:

  1. Reopen only the problematic figures.
  2. Recrop them with new versioned filenames or a new assets directory.
  3. Regenerate the contact sheet.
  4. Ask for confirmation again.

Do not finalize the Markdown until the user approves all figures, or explicitly says to proceed despite known problems.
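
For the contact sheet itself, the grid geometry is simple arithmetic. The sketch below only computes the sheet size and the top-left offset of each cell; the function name and default sizes are assumptions, and resizing the crops, pasting them, and drawing the Figure N labels are left to whatever imaging library is available.

```python
import math
from typing import List, Tuple


def contact_sheet_layout(
    count: int,
    columns: int = 3,
    cell: Tuple[int, int] = (600, 450),
    padding: int = 20,
) -> Tuple[int, int, List[Tuple[int, int]]]:
    """Return (sheet_width, sheet_height, offsets) for a grid of figure crops.

    offsets[i] is the top-left pixel of the cell for the i-th figure,
    filled row by row in reading order.
    """
    if count < 1:
        raise ValueError("at least one figure is required")
    columns = min(columns, count)
    rows = math.ceil(count / columns)
    cell_w, cell_h = cell
    width = padding + columns * (cell_w + padding)
    height = padding + rows * (cell_h + padding)
    offsets = [
        (
            padding + (i % columns) * (cell_w + padding),
            padding + (i // columns) * (cell_h + padding),
        )
        for i in range(count)
    ]
    return width, height, offsets
```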

8. Insert figures at the correct positions

Insert each figure near the paragraph where the paper first discusses it, or at the same logical location as the original PDF.

Use HTML <img> rather than bare Markdown image syntax when high-resolution figures may render too large:

html
<p align="center">
  <img src="./<asset_dir>/figure2_graph_nodes.png" alt="Figure 2: Relation between graph nodes" style="max-width: 100%; height: auto;">
</p>

Below the image, add a translated caption in Markdown only if useful:

markdown
**Figure 2 / 图 2:** Relation between graph nodes. / 图节点之间的关系。

The image crop itself must still include the original caption from the PDF. The translated caption below the image is not a substitute for cropping the original caption.
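
If the image tags are generated programmatically, a small helper along these lines keeps the markup consistent and escapes quotes in the alt text. The function name and parameters are hypothetical.

```python
from html import escape


def figure_html(asset_dir: str, filename: str, alt: str) -> str:
    """Build the centered <img> snippet for one figure crop."""
    src = f"./{asset_dir}/{filename}"
    return (
        '<p align="center">\n'
        f'  <img src="{escape(src, quote=True)}" alt="{escape(alt, quote=True)}" '
        'style="max-width: 100%; height: auto;">\n'
        "</p>"
    )
```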

9. Avoid the common failure modes

Before final delivery, specifically check for these problems:

  • The Markdown references an old assets directory.
  • The assets directory contains outdated images with the same names.
  • A high-resolution figure displays at full pixel width and appears “cropped” in the previewer.
  • A figure crop includes the diagram but not the original caption.
  • A crop cuts off the bottom of a diagram, subfigure label, equation label, or caption line.
  • A page screenshot was inserted instead of a clean figure crop when a figure-level crop was requested.
  • Figure order in the Markdown does not match the paper's reading order.
  • Formula variables were changed during translation.
  • Equation numbers in the text no longer match the displayed formulas.

10. Mandatory final QA

Perform these checks before telling the user the file is ready:

  1. List all image references in the Markdown.
  2. Confirm every referenced image file exists at the exact relative path used in the Markdown.
  3. Open every referenced image one by one.
  4. For each image, verify:

     • The full figure body is visible.
     • The original caption is visible.
     • No edges, annotations, arrows, or subfigure labels are clipped.
     • The image is readable at normal Markdown preview size.
     • The image is from the current final assets directory.
     • If per-figure confirmation was enabled, the referenced image is the user-approved version.

  5. If using a standalone base64 Markdown file, confirm the embedded image count equals the number of figures and that no external image paths remain.
  6. Search the Markdown for TODO markers, broken placeholders, missing translations, and malformed LaTeX delimiters.
  7. Give the user a short checklist, in the inferred interaction language, of what was verified.
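
The mechanical parts of this checklist (listing image references, checking that the files exist, and scanning for TODO markers or unbalanced `$$`) could be sketched as follows. The helper names are hypothetical and the checks are deliberately shallow. Opening each image and judging the crop visually still has to be done separately.

```python
import re
from pathlib import Path
from typing import List

MD_IMAGE = re.compile(r"!\[[^\]]*\]\(([^)\s]+)")
HTML_IMAGE = re.compile(r'<img\b[^>]*?\bsrc="([^"]+)"')


def image_refs(markdown: str) -> List[str]:
    """Return every referenced image path or URI, in document order."""
    hits = [
        (m.start(), m.group(1))
        for pattern in (MD_IMAGE, HTML_IMAGE)
        for m in pattern.finditer(markdown)
    ]
    return [ref for _, ref in sorted(hits)]


def missing_images(markdown: str, base_dir: str) -> List[str]:
    """Return referenced local images that do not exist under base_dir."""
    return [
        ref
        for ref in image_refs(markdown)
        if not ref.startswith("data:") and not (Path(base_dir) / ref).is_file()
    ]


def qa_warnings(markdown: str) -> List[str]:
    """Flag leftover TODO markers and an odd number of $$ delimiters."""
    warnings = []
    if "TODO" in markdown:
        warnings.append("TODO marker present")
    if markdown.count("$$") % 2:
        warnings.append("unbalanced $$ delimiters")
    return warnings
```

For a standalone base64 file, comparing `len(image_refs(markdown))` with the figure count and confirming that every reference starts with `data:` covers step 5.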

11. User-facing response

Keep the final response short and use the inferred interaction language. Include the Markdown file path and, if relevant, the assets directory path.

Example:

text
已完成本地 Markdown。翻译方向已按 English → Chinese 确认;图片确认模式为逐张确认。最终 Markdown 只引用已确认的图片目录。

- Markdown: <path>
- 图片目录: <path>

If any figure or formula could not be verified, state that explicitly and identify the page/figure number.

Local installation note

Place this file at one of the following paths:

text
$HOME/.agents/skills/pdf-bilingual-translation-md/SKILL.md
<repo>/.agents/skills/pdf-bilingual-translation-md/SKILL.md

Use the global path for personal reuse across projects. Use the repo path when the workflow should be shared with the project.

References

[^codex-skills]: OpenAI, “Agent Skills – Codex.” https://developers.openai.com/codex/skills

[^codex-approval-modes]: OpenAI, “Features – Codex CLI: Approval modes.” https://developers.openai.com/codex/cli/features

[^codex-request-user-input]: OpenAI, “Codex App Server: tool/requestUserInput.” https://developers.openai.com/codex/app-server

[^codex-exec]: OpenAI, “Features – Codex CLI: Scripting Codex.” https://developers.openai.com/codex/cli/features

Install & Usage

  1. Create the skills directory:

bash
mkdir -p .claude/skills

  2. Download the skill file:

bash
mkdir -p .claude/skills && curl -o .claude/skills/pdf-bilingual-translation-md.md https://raw.githubusercontent.com/Akisui/pdf-bilingual-translation-md-skill/main/SKILL.md

  3. Invoke in Claude Code:

text
/pdf-bilingual-translation-md

Frequently Asked Questions

How to install pdf-bilingual-translation-md?

To install pdf-bilingual-translation-md, create the .claude/skills directory in your project, then run the curl command to download the skill file. Once installed, invoke it in Claude Code with /pdf-bilingual-translation-md.

What is pdf-bilingual-translation-md best for?

pdf-bilingual-translation-md is a community plugin in the Documentation category, created by Akisui.