
# Agentic Translation Research Framework

A research framework for improving AI-based translation quality. It combines several translation models with post-editing techniques to raise translation quality, and provides LLM-based semantic evaluation.

## 🚀 Key Features

- **Multiple translation engines**: DeepL API and an LLM-based translator
- **Intelligent post-processing**: context extraction, guideline generation, automatic correction
- **Semantic evaluation**: translation quality assessment with GPT-4o
- **Asynchronous batch processing**: efficient handling of large translation volumes
- **Extensible architecture**: plugin-style component design

## 📋 Notation

- $\mathcal M$: LLM model
- $\mathcal I$: system prompt (system instruction)
- $s$: source text
- $c$: context
- $g$: guidance (revision guidelines)
- $o$: translated output
- $o^\prime$: post-edited text
- $\mathbb L_s$: source language
- $\mathbb L_t$: target language

## 🏗️ Component Formulation

### Component #1. NMT Translator

$$
\mathcal T_\text{NMT}(s, \mathbb L_s, \mathbb L_t) \rightarrow o
$$

Baseline translation via neural machine translation (NMT), served by the DeepL API.

> Input: source text $s$, source language $\mathbb L_s$, target language $\mathbb L_t$
>
> Output: translation $o$

### Component #2. LLM Translator

$$
\mathcal T_\text{Agent}(\mathcal M, \mathcal I_\text{llm\_translator}, s, \mathbb L_s, \mathbb L_t) \rightarrow o
$$

LLM-based translation that takes contextual understanding and cultural appropriateness into account.

### Component #3. Context Extractor

$$
\mathcal C(\mathcal M, \mathcal I_\text{context\_extractor}, s) \rightarrow c
$$

Extracts the context needed for translation from the source text, such as the document's domain, tone, and terminology.

### Component #4. Guideline Extractor

$$
\mathcal G(\mathcal M, \mathcal I_\text{guideline\_extractor}, [s; c]) \rightarrow g
$$

Generates translation guidelines from the extracted context, giving rules that keep the translation consistent.

### Component #5. Naive Post-Editor

$$
\mathcal E_\text{naive}(\mathcal M, \mathcal I_\text{naive\_editor}, [s; o]) \rightarrow o^\prime_\text{naive}
$$

A basic post-editor that corrects simple errors using only the source text and the translation.

> Input: LLM model, system prompt, source text $s$, translation $o$
>
> Output: post-edited (corrected) text

### Component #6. Official Editor (proposed method)

$$
\mathcal E(\mathcal M, \mathcal I_\text{editor}, [s; o; c; g], \mathbb L_s, \mathbb L_t) \rightarrow o^\prime
$$

A comprehensive post-editor that uses the context and the guidelines to refine the translation.
> Relative to the naive post-editor, Component #6 adds the guideline text $g$ and the context $c$ to the editor input.

### Component #7. Translation Evaluator

$$
\mathcal V(\mathcal M_\text{GPT-4o}, s, o, \mathbb L_s, \mathbb L_t) \rightarrow \text{Score}
$$

Semantic evaluation with GPT-4o, jointly assessing meaning preservation, fluency, grammatical accuracy, and cultural appropriateness.

> This component is unnecessary if you evaluate with COMET. It is intended for LLM-as-a-Judge evaluation (e.g. AlpacaEval).

## 🔧 Model Formulation

### 1. Basic Translation Architecture

$$
s \xrightarrow{\mathcal T_\text{NMT}} o_\text{basic}
$$

The simplest form: an NMT system such as DeepL is used directly.

### 2. LLM-Enhanced Translation Architecture

$$
s \xrightarrow{\mathcal T_\text{Agent}} o_\text{llm} \xrightarrow{\mathcal V} \text{Quality Score}
$$

Uses the LLM translator and then runs the semantic evaluation on its output.

### 3. Naive Post-Editing Architecture

$$
s \xrightarrow{\mathcal T_\text{NMT}} o \xrightarrow{\mathcal E_\text{naive}} o^\prime_\text{naive}
$$

Performs simple post-editing after the baseline translation:

1. **Initial Translation**: produce a first-pass translation with the NMT system
2. **Naive Correction**: fix basic errors using only the source-translation pair
3. **Quality Check**: correct grammatical errors and obvious meaning errors

**Pros**: fast processing, low compute cost

**Cons**: limited quality improvement, ignores context

### 4. Editor After Coordinator Architecture

The most elaborate architecture: a five-stage translation pipeline in which each stage improves the input to the stages that follow.

```mermaid
flowchart TD
    A[Source text s] --> B1{① Select translation agent}
    B1 --> C1[LLM Translator]
    B1 --> D1[NMT Translator]
    C1 --> E1[Draft translation o]
    D1 --> E1

    A --> F2[② Context extraction]
    F2 --> G2[Context c]

    G2 --> H3[③ Guideline generation]
    A --> H3
    H3 --> I3[Translation rules g]

    A --> J4[④ Intelligent post-editing]
    E1 --> J4
    G2 --> J4
    I3 --> J4
    J4 --> K4[Final translation o′]

    A --> L5[⑤ Quality evaluation]
    K4 --> L5
    L5 --> M5[Evaluation score]

    style A fill:#e1f5fe
    style B1 fill:#ffecb3
    style F2 fill:#f3e5f5
    style H3 fill:#e8f5e8
    style J4 fill:#fff3e0
    style L5 fill:#fce4ec
    style K4 fill:#c8e6c9
    style M5 fill:#fff3e0
```

#### **Stage-by-Stage Description**

| Stage | Component | Input | Output | Role |
|------|----------|------|------|------|
| ① | **Translation Agent** | source text `s` | draft translation `o` | Select the LLM or NMT translator and produce the initial translation |
| ② | **Context Extractor** | source text `s` | context `c` | Identify domain, tone, and terminology |
| ③ | **Guideline Generator** | `[s; c]` | rules `g` | Generate guidelines that keep the translation consistent |
| ④ | **Official Editor** | `[s; o; c; g]` | final translation `o′` | Post-edit using the context and the rules |
| ⑤ | **Quality Evaluator** | `[s; o′]` | score | Jointly assess meaning, grammar, and fluency |

#### **Mathematical Formulation**

$$
\begin{aligned}
\text{Step 1: } & s \xrightarrow{\text{translate}} o & \text{(translation)} \\
\text{Step 2: } & s \xrightarrow{\text{extract context}} c & \text{(context extraction)} \\
\text{Step 3: } & [s; c] \xrightarrow{\text{generate guidelines}} g & \text{(guideline generation)} \\
\text{Step 4: } & [s; o; c; g] \xrightarrow{\text{post-edit}} o^\prime & \text{(post-editing)} \\
\text{Step 5: } & [s; o^\prime] \xrightarrow{\text{evaluate}} \text{Quality Score} & \text{(quality evaluation)}
\end{aligned}
$$

Here the **translate** action is one of the following:

- $s \xrightarrow{\text{translate via LLM}} o$ : translation with an LLM
- $s \xrightarrow{\text{translate via NMT}} o$ : translation with an NMT system

#### **Component Formulation**

**Step 1. Translation Agent Selection & Initial Translation**

Select a translation agent and produce the draft translation:

**Option A: LLM Translation**

$$
\mathcal T_\text{Agent}(\mathcal M, \mathcal I_\text{llm\_translator}, s, \mathbb L_s, \mathbb L_t) \rightarrow o
$$

LLM-based translation.
It takes contextual understanding and cultural appropriateness into account.

**Option B: NMT Translation**

$$
\mathcal T_\text{NMT}(s, \mathbb L_s, \mathbb L_t) \rightarrow o
$$

Baseline translation via neural machine translation, using an NMT system such as DeepL.

**Step 2. Context Extraction**

$$
\mathcal C(\mathcal M, \mathcal I_\text{context\_extractor}, s) \rightarrow c
$$

An LLM extracts the context needed for translation from the source text: the document's domain, tone, terminology, and writing style.

**Step 3. Guideline Generation**

$$
\mathcal G(\mathcal M, \mathcal I_\text{guideline\_extractor}, [s; c]) \rightarrow g
$$

Generates translation guidelines from the extracted context and the source text, such as rules for consistent terminology and consistent style.

**Step 4. Intelligent Post-Editing**

$$
\mathcal E(\mathcal M, \mathcal I_\text{editor}, [s; o; c; g], \mathbb L_s, \mathbb L_t) \rightarrow o^\prime
$$

Post-edits the draft using all four inputs: source text, draft translation, context, and guidelines. The LLM applies the rules in light of the context to produce the final translation.

**Step 5. Semantic Evaluation**

$$
\mathcal V(\mathcal M_\text{GPT-4o}, s, o^\prime, \mathbb L_s, \mathbb L_t) \rightarrow \text{Quality Assessment}
$$

GPT-4o evaluates the translation semantically, jointly analyzing meaning preservation, fluency, grammatical accuracy, and cultural appropriateness.
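
The five steps above amount to plain function composition, which the following minimal sketch makes explicit. It is not the framework's actual implementation: `PipelineResult`, `run_pipeline`, and the `stub_*` functions are hypothetical names introduced here for illustration, and the stubs stand in for the real LLM and NMT calls so that the example runs without any API key.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PipelineResult:
    draft: str       # o  : draft translation
    context: str     # c  : extracted context
    guidelines: str  # g  : generated guidelines
    final: str       # o' : post-edited translation
    score: float     # evaluation score


def run_pipeline(
    source: str,
    src_lang: str,
    tgt_lang: str,
    translate: Callable[[str, str, str], str],
    extract_context: Callable[[str], str],
    generate_guidelines: Callable[[str, str], str],
    post_edit: Callable[[str, str, str, str, str, str], str],
    evaluate: Callable[[str, str, str, str], float],
) -> PipelineResult:
    """Hypothetical driver that chains the five steps in order."""
    draft = translate(source, src_lang, tgt_lang)        # Step 1: s -> o
    context = extract_context(source)                    # Step 2: s -> c
    guidelines = generate_guidelines(source, context)    # Step 3: [s; c] -> g
    final = post_edit(                                   # Step 4: [s; o; c; g] -> o'
        source, draft, context, guidelines, src_lang, tgt_lang
    )
    score = evaluate(source, final, src_lang, tgt_lang)  # Step 5: [s; o'] -> score
    return PipelineResult(draft, context, guidelines, final, score)


# Stub components (assumptions for illustration only).
def stub_translate(s: str, src: str, tgt: str) -> str:
    return f"<{tgt}>{s}</{tgt}>"


def stub_context(s: str) -> str:
    return "domain=general; tone=neutral"


def stub_guidelines(s: str, c: str) -> str:
    return f"keep tone consistent ({c})"


def stub_post_edit(s: str, o: str, c: str, g: str, src: str, tgt: str) -> str:
    return o.strip()


def stub_evaluate(s: str, o_prime: str, src: str, tgt: str) -> float:
    return 5.0 if o_prime else 0.0


result = run_pipeline(
    "Hello world", "en", "ko",
    stub_translate, stub_context, stub_guidelines, stub_post_edit, stub_evaluate,
)
print(result)
```

Passing the components in as callables keeps Step 1 agnostic about whether an LLM or an NMT translator produced the draft. Steps 1 and 2 both depend only on $s$, so a real implementation could run them concurrently.
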
#### **Characteristics**

| Category | Details |
|------|------|
| **Pros** | • Highest translation quality of the four architectures<br>• Context-aware translation<br>• Enforced consistency<br>• Objective quality evaluation |
| **Cons** | • High compute cost<br>• Complex pipeline<br>• Many API calls required |
| **Use cases** | • Specialized document translation<br>• Projects with high quality requirements<br>• Large-volume translation where consistency matters |

## 🚀 Quick Start

### Installation

```bash
# Install dependencies
rye sync

# Set environment variables
export DEEPL_API_KEY="your_deepl_api_key"
export OPENAI_API_KEY="your_openai_api_key"  # used for evaluation
```

### Basic Usage

```python
from agents.translator import DeepLTranslator, LLMTranslator
from agents.evaluator import TranslationEvaluator

# DeepL translation
deepl_translator = DeepLTranslator()
result = deepl_translator.translate("Hello world", "en", "ko")
print(result)  # e.g. "안녕하세요 세상"

# LLM translation
llm_translator = LLMTranslator(
    model_name="fireworks_ai/accounts/fireworks/models/gpt-oss-20b"
)
result = llm_translator.translate("Hello world", "en", "ko")

# Semantic evaluation
evaluator = TranslationEvaluator(model_name="openai/gpt-4o")
evaluation = evaluator.evaluate_translation(
    "Hello world", result, "en", "ko"
)
print(f"Overall score: {evaluation['overall_score']}/5")
```

### Batch Translation

```python
import asyncio

from agents.translator import LLMTranslator


async def batch_translate():
    # Translate several texts in one asynchronous batch call
    translator = LLMTranslator()
    texts = ["Hello", "World", "How are you?"]
    results = await translator.translate_batch(texts, "en", "ko")
    return results


results = asyncio.run(batch_translate())
```

# Appendix

## LLM Translator System Prompt

```
source_language = {{source_lang}}
target_language = {{target_lang}}

You are a professional translator that converts text into {language}.

## TRANSLATION GUIDELINES:
- Accurately convey the meaning and nuance of the source text
- Paraphrase for easy understanding by {language} readers, taking into account the cultural context
- Ambiguous expressions are translated according to the most natural interpretation.

INPUT FORMAT: Text enclosed in <TEXT></TEXT> tags
OUTPUT FORMAT: {{source_lang}} -> {{target_lang}} translation only, no explanations

Source Text: {{source_text}}
```

> `source_language`: source language $\mathbb L_s$
>
> `target_language`: target language $\mathbb L_t$
>
> `source_text`: source text $s$
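
As a rough illustration of how the `{{...}}` placeholders in the prompt above might be filled, the sketch below substitutes values with a regular expression. `render_prompt` and the shortened `TEMPLATE` are hypothetical and not part of the framework, which may well use a templating library instead. Single-brace fields such as `{language}` are deliberately left untouched, because the README does not say how they are resolved.

```python
import re

# Shortened, hypothetical excerpt of the system prompt (not the full prompt).
TEMPLATE = (
    "source_language = {{source_lang}}\n"
    "target_language = {{target_lang}}\n"
    "Source Text: {{source_text}}\n"
)

# Matches double-brace placeholders such as {{source_lang}}.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_prompt(template: str, **values: str) -> str:
    """Replace every {{name}} with values[name]; fail loudly on a missing name."""

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"missing value for placeholder: {name}")
        return values[name]

    # A function replacement is inserted literally, so backslashes in the
    # source text are not interpreted as regex escape sequences.
    return _PLACEHOLDER.sub(substitute, template)


prompt = render_prompt(
    TEMPLATE,
    source_lang="en",
    target_lang="ko",
    source_text="Hello world",
)
print(prompt)
```

Raising on a missing placeholder, rather than leaving it in the text, avoids silently sending a half-filled prompt to the model.
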