# Detective - Business Process Document

## 1. System Overview

### 1.1 Process Description

#### 1.1.1 Identity & Permission Management Phase

When the system starts, Brand and Client Account data are synchronized from the external system **Domainarium** to this system. The System Administrator's responsibility is to assign roles and permissions to these synchronized accounts.

**Account Status Management**

The system strictly controls account security. If a user exceeds the system-configured threshold for consecutive failed login attempts, the system automatically sets the account to "Inactive" status to prevent brute-force attacks. Inactive accounts cannot log in and must be manually re-enabled by a System Administrator.

**White Label Service & Customization**

To meet the business needs of law firms or brand protection consultants, the system supports White Label services. Brands can configure custom logo images, color schemes, and domain pointers. When external partners or their clients log in, the system will present the Brand's exclusive visual identity instead of the default system style.

**Role & Permission Assignment**

System Administrators assign roles to each Client Account. Each role contains a set of predefined permissions, such as "Create Monitoring Task", "Review Finding", or "Export Report". The system follows the principle of least privilege, granting only necessary permissions by default.

When a Brand needs to commission external professionals for assistance, the Client can grant access to specific Brands to External Partners (such as law firms or brand protection consultants). The system's permission design follows outsourcing practices: External Partners typically possess broader operational permissions than Client Accounts to execute the full workflow, while Client Accounts are primarily responsible for review and oversight.
However, Client Accounts have the ability to assign permissions; they can grant these broader roles to External Partners and operate within their authorized scope. Account Managers possess cross-Brand operational permissions and can assist multiple Brands with monitoring work.

All permission changes, role assignments, and user login/logout activities are fully recorded in the system log, ensuring compliance and traceability.

#### 1.1.2 Monitoring Task Configuration Phase

The Account Manager or Client Account begins configuring a Monitoring Task.

**Optional Step: Create Protected Asset**

If the client needs to provide additional background information for AI Triage, such as brand logos, core descriptions, or official website lists, they can pre-create a Protected Asset. This entity contains only basic information like name, description, and cover image, and does not include any strategy configuration. This entity serves solely as background information for the AI Triage service.

**Optional Step: Define Task Label Library**

During or after task creation, the user can define a set of dedicated Labels for the Monitoring Task, such as "Legal Review", "High Priority", or "Reported". This Label Library will be shared and used in the review of all findings under this task.

**Quantity Limit**: To avoid visual fatigue, the Label Library limit for a single task is set to 30.

**Create Monitoring Task & First Phase**

The user creates a Monitoring Task. This process is consolidated; the user completes the settings for the task, the first phase, and its scopes and strategies in a single form.

**Monitoring Task Level Settings:**

- **Task Name**
- **Protected Asset** (Optional): A Monitoring Task can reference at most one Protected Asset. This setting cannot be modified after task creation to ensure consistency of background information referenced by all phases.
- **Platform Type**: Only one type can be selected.
  Supported types include: Marketplace, Social Media, Web, Images, Domain. This setting cannot be modified after task creation to ensure all phases follow a consistent data structure.
- **Task Owner**: Each task must be assigned an Account Manager as the Task Owner, responsible for subsequent maintenance and contact. This assignment is mandatory.
- **Schedule Expression** (Optional): A cron expression defining the automatic execution frequency of the task. This setting can be modified after task creation.

**First Phase Level Settings:**

- **Phase Name**
- **Start Date**: Sets the business start date of the phase. This is an editable attribute, allowing subsequent adjustments.
- **End Date**: Sets the business end date of the phase. This is an editable attribute, allowing subsequent adjustments.
- **Default Translation Language** (Optional): Marks the default review language for reports in this phase. This setting does not affect crawling; it is only used to assist the interface in providing a one-click translation function to translate finding content from non-default languages into this language. This setting belongs to the phase and cannot be modified after phase creation.
- **Reference Currency** (Conditional Optional): If the task's Platform Type is Marketplace, the user can specify a Reference Currency (e.g., USD, TWD). This setting belongs to the phase and is used to convert and display item prices in different currencies uniformly on reports and interfaces. This setting cannot be modified after phase creation to ensure the consistency of report amounts.
- **Highlighted Terms** (Optional): Defines a set of keywords used to highlight matching text in the finding list interface. This setting belongs to the phase and is an editable attribute, allowing adjustments at any time to optimize the review experience.
- **Monitoring Scope**: This setting belongs to the phase. At least one set must be configured.
  This is an editable attribute, allowing keywords and platform combinations to be added, modified, or deleted at any time during the phase execution. Each set of settings includes:
  - User checks target platforms (Multiple selection)
  - Input monitoring keyword
  - Input crawl limit (default is 500)
- **AI Triage Strategy**: This setting belongs to the phase. This is an optional setting; the user can configure multiple strategies for the phase. This is an editable attribute, allowing AI instructions and actions to be adjusted at any time to optimize triage accuracy. Settings include:
  - Strategy Title: User-defined readable name
  - Applicable Language: Defines which language of findings this strategy should execute on
  - Strategy Instruction: Instruction used for AI verification
  - Priority: User decides execution order via drag-and-drop interface; higher priority executes first
  - Action: Defines the action to execute when the strategy matches (e.g., Triage as Relevant, Triage as Less Relevant, Triage as Excluded, or Move to Trash)

After the task is created, the default status is Running.

#### 1.1.3 Monitoring Task Lifecycle Management

The lifecycle of a Monitoring Task is jointly determined by two dimensions: "Active Status" and "Execution Control". The system uses this to distinguish whether a task is operating, temporarily stopped, or permanently archived.

**Task Status Definitions**

- **Running**: The task is in "Active" status, and the "Paused" function is not enabled. This is the default status after task creation, and the system will automatically execute scans according to the schedule.
- **Paused**: The task is still in "Active" status, but the user has manually enabled the "Paused" function. The task remains valid, but the automated schedule is frozen until the user unpauses it. In this status, users can still execute Manual Retention.
- **Archived**: The task is set to "Inactive" status.
  This is a permanent termination status; task data is locked, no longer updated, and all operations, including Manual Retention, are prohibited.

**Status Executable Operations & Restrictive Editing**

The system strictly enforces restrictive editing principles to protect the integrity of the core context while retaining the flexibility of business logic.

- **Monitoring Task**: Can edit Task Name, Schedule Expression, Task Owner. Cannot edit Brand, Platform Type, Protected Asset.
- **Pause/Resume**: Only toggles the task's "Paused" switch; does not change the task's "Active" status.
- **Archive**: Sets the task to "Inactive" status; this operation will permanently terminate the schedule.
- **Phase**: Can edit Phase Name, Start Date, End Date, Monitoring Scope, Highlighted Terms, AI Triage Strategy. Cannot edit Default Translation Language, Reference Currency, or the associated Monitoring Task.

#### 1.1.4 Content Crawling & Automated Processing Phase

**Scan Trigger Mechanism**

When a task is in Running status (Active and not Paused), the system determines the scan trigger method based on its configured Schedule Expression:

- **With Schedule Expression (Regular Monitoring)**:
  - The system triggers scans at the specified frequency based on the Schedule Expression.
  - Upon each trigger, the system reads the latest Monitoring Scope configuration of the phase at that moment.
  - The system determines if the current time falls within the Start Date and End Date range set for the phase. If it does, the scan operation is executed for that phase.
- **Without Schedule Expression (One-time Scan)**:
  - If the task does not have a Schedule Expression configured, the system immediately triggers a full scan for that phase only once when the phase is created.
  - Subsequent modifications to the Monitoring Scope will require waiting for the next manual trigger or setting a schedule to take effect.
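
The status and trigger rules above can be sketched in a few lines. This is a minimal illustration, not the system's implementation: the class names, field names, and function names are hypothetical, and treating both Start Date and End Date as inclusive is an assumption the document does not state.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# All names below are illustrative; they are not the system's real schema.


@dataclass
class Task:
    active: bool                               # False corresponds to Archived
    paused: bool                               # only meaningful while active
    schedule_expression: Optional[str] = None  # cron string, or None


@dataclass
class Phase:
    start_date: date
    end_date: date


def task_status(task: Task) -> str:
    """Derive the task status from the two lifecycle dimensions."""
    if not task.active:
        return "Archived"
    return "Paused" if task.paused else "Running"


def should_scan_on_schedule(task: Task, phase: Phase, today: date) -> bool:
    """Decide whether a schedule trigger runs a scan for this phase."""
    if task_status(task) != "Running" or task.schedule_expression is None:
        return False
    # Assumption: both boundary dates are inclusive.
    return phase.start_date <= today <= phase.end_date


def manual_retention_allowed(task: Task) -> bool:
    """Manual Retention is available while Running or Paused, never when Archived."""
    return task_status(task) in ("Running", "Paused")
```

Keeping the status as a derived value rather than a stored field mirrors the two-dimension description: Pause/Resume only flips `paused`, and Archive only flips `active`.
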

**Scan Definition & Execution**

A single scan is defined as: one crawler execution for a single monitoring keyword plus a single target platform. When the schedule triggers, the system reads the list of Monitoring Scopes for phases that meet the time conditions and decomposes each scope into multiple independent Scan units.

_Example_: Suppose a phase is configured with the following Monitoring Scopes:

- Scope 1: Keyword "Nike Shoes", Platforms "Amazon, eBay, Shopee", Crawl Limit 500
- Scope 2: Keyword "Nike Sneakers", Platforms "Amazon, eBay", Crawl Limit 500

Then one schedule trigger will generate five independent scans:

1. "Nike Shoes" on "Amazon" (Max 500 items)
2. "Nike Shoes" on "eBay" (Max 500 items)
3. "Nike Shoes" on "Shopee" (Max 500 items)
4. "Nike Sneakers" on "Amazon" (Max 500 items)
5. "Nike Sneakers" on "eBay" (Max 500 items)

Each scan is executed independently, and its execution status is recorded separately in the execution log. Even if the Monitoring Scope is modified during the scan execution, the triggered scan will still complete execution based on the parameters at the time of triggering.

**Automated Crawling & Snapshot Pricing Principles**

When the crawler executes each scan:

1. Searches for the keyword on the specified platform.
2. Crawls the list of URLs from the search results.
3. **Snapshot Principle**: For URLs that have already been crawled, the system treats them as a snapshot in time. If the same URL is scanned again and the price or content has changed, the system **will not update** the price or content of the existing finding to maintain the evidence status of the first crawl.
4. **Duplicate Data & Quota**: Duplicate scanned URLs still count towards the daily crawl quota and are not exempted because the content was not updated.
5. Crawls content for each URL and creates a finding record.
6. Records the source identifier and the corresponding Monitoring Scope ID.
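
The scope decomposition and the Snapshot Principle described above can be sketched as follows. This is an illustrative model only: `MonitoringScope`, `Scan`, `FindingStore`, and their fields are hypothetical names, and an in-memory dictionary keyed by URL stands in for whatever storage the system actually uses.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# All names below are illustrative; they are not the system's real schema.


@dataclass(frozen=True)
class MonitoringScope:
    scope_id: int
    keyword: str
    platforms: Tuple[str, ...]
    crawl_limit: int = 500  # default crawl limit from the phase settings


@dataclass(frozen=True)
class Scan:
    scope_id: int
    keyword: str
    platform: str
    crawl_limit: int


def decompose_scopes(scopes: List[MonitoringScope]) -> List[Scan]:
    """Expand each scope into one Scan per (keyword, platform) pair."""
    return [
        Scan(s.scope_id, s.keyword, platform, s.crawl_limit)
        for s in scopes
        for platform in s.platforms
    ]


@dataclass
class FindingStore:
    findings: Dict[str, dict] = field(default_factory=dict)  # keyed by URL
    crawl_count: int = 0  # every crawled URL counts toward the quota

    def record(self, url: str, price: str, scope_id: int) -> bool:
        """Return True if a new finding was created, False for a duplicate."""
        self.crawl_count += 1  # duplicates are not exempt from the quota
        if url in self.findings:
            return False  # Snapshot Principle: keep the first-crawl data
        self.findings[url] = {"price": price, "scope_id": scope_id}
        return True


scopes = [
    MonitoringScope(1, "Nike Shoes", ("Amazon", "eBay", "Shopee")),
    MonitoringScope(2, "Nike Sneakers", ("Amazon", "eBay")),
]
scans = decompose_scopes(scopes)
print(len(scans))  # 5, matching the example above
```

Because each `Scan` is an immutable value built at trigger time, a later edit to the Monitoring Scope cannot alter a scan that has already been triggered, which matches the behavior described for in-flight scans.
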

**Technical Limitations Note**

For specific high-defense e-commerce platforms (such as Chinese e-commerce), due to strict anti-crawler mechanisms and account risk controls (e.g., SMS verification, account nurturing costs), the system does not guarantee 100% successful screenshots. In such gray areas, manual assistance or other outsourcing methods may be adopted if necessary.

**Retention Mechanism**

The system provides two retention mechanisms for updating the latest status of existing findings:

- **Automatic Retention**: During each scan execution, if a URL crawled by the crawler already exists in the system, the system will automatically update the latest content of that finding (Note: Following the Snapshot Principle above, key evidence like price may remain unchanged).
- **Manual Retention**: Account Managers and System Administrators can manually submit a URL for crawling when the task is in Running or Paused status.
  - **Status Selection**: When manually importing, the user can decide the initial status of the finding via a dropdown menu. The default is "Relevant", but "To Be Processed" can also be selected.

**Asynchronous Supplement of Exchange Rate & Translation**

After each retention action is completed, the system triggers an event-driven asynchronous process:

1. Query the current exchange rate data (based on the Reference Currency set for the phase).
2. Execute content translation (based on the Default Translation Language set for the phase).
3. Store the exchange rate snapshot and translation result in the finding record.

#### 1.1.5 Classification & AI Triage Phase

Account Managers or External Partners browse the list of unprocessed items in the system.

**User Interface & Operational Experience**

- **List Selection**: Users can select items by clicking on the blank space of the list row, not limited to clicking the checkbox. Supports using the Shift key for range selection; multi-selection is possible without pressing the Ctrl key.
  The bottom toolbar provides a prominent "Clear Selection" button.
- **Global Filtering**: All filters (e.g., Label, Risk Category) include a "None" option to filter for unclassified or unlabeled items.
- **Label Presentation**: Uses a floating chip layout to avoid overly long dropdown menus.
- **Label Deletion**: When deleting a label, the system pops up a warning modal. Upon confirmation, the label definition will be removed from all historical findings.

**AI Triage Service**

AI Triage is fully triggered manually by the user. The execution process is as follows:

- **Availability Check**: The user can only execute this when the phase to which the finding belongs has at least one AI Triage Strategy configured.
- **Load Strategies**: The system reads the latest AI Triage Strategy list for that phase at that moment. This means users can optimize strategy instructions at any time and re-execute triage on findings to obtain better results.
- **Load Background Information**: The system loads the Protected Asset information associated with the task as background.
- **Sort & Test**: The system sorts strategies by priority and tests the finding content one by one.
- **Execute Action**: If a strategy matches, the defined action is executed (e.g., Triage as Relevant), and a system comment is automatically added to record the triggered strategy name and reasoning.

**Manual Review & Labeling**

Reviewers can manually assign Risk Categories, add comments, apply labels, etc.

**Finding Status Flow & Pagination**

The finding list is divided into the following independent tabs, without distinguishing secondary statuses:

- **To Be Processed**: (Optional) Generated by manual import or specific strategies.
- **Relevant**: Findings confirmed as risks.
- **Enforcement**: Independent tab, located before Excluded. All findings entering the takedown request process are moved here, without further subdivision of takedown status tabs.
  In this status, the original label and color are retained for traceability.
- **Excluded**
- **Trash**

**Takedown Request**

For infringing findings confirmed as Relevant, the Account Manager can initiate a takedown request. The system updates the Takedown Status to "Requested" and triggers the subsequent disposal process. The Takedown Status is independent of the finding's classification status; even if the takedown fails, the finding can remain in "Relevant" or "Enforcement" status.

#### 1.1.6 Insight Analysis & Report Export Phase

When clients need to analyze monitoring effectiveness, they can use the Insight Analysis and Report Export functions.

- **Cross-Report Search**: The system provides an independent search page supporting retrieval across "Any Phase within the Entire File". Search targets mainly focus on high-risk seller names or specific historical sales records, not limited to title keywords. Results list matching findings and annotate their belonging phase, without distinguishing status tabs.
- **Finding List Export**: Exports raw finding data filtered by the user.
- **Statistics Report Export**: Exports aggregated statistical data for all findings within a single phase. Since the Reference Currency and Language are locked, the statistical basis is consistent.
- **Insight Analysis**: In the Task Detail, allows users to select multiple phases to generate cross-time trend or differential comparison charts in real-time.
- **Historical Report Presentation**: The interface displays the current report and historical report icons for the past 11 periods (total 12 periods), without needing to expand all historical records.

#### 1.1.7 Audit & Tracking Phase

All operations in the entire system are continuously recorded in the background, establishing a complete audit trail.

- **System Log**: Records system-level events (Login, Role changes).
- **Task Log**: Records task-related operations (Settings modification, Status changes, Label management).
- **Phase Log**: Records phase business operations (Add scope, Modify strategy, Date adjustment).
- **Finding Log**: Records finding processing flow changes (including Takedown Status changes, Label changes).
- **Execution Log**: Records technical execution results of scans and retentions.

#### 1.1.8 Internationalization & Translation Support

The system provides global interface language switching, content translation functions, and an administrator translation module.

## 2. Common Glossary

### 2.1 Organization Structure & Identity Permission Management

- **Brand**: An enterprise or organization requiring brand protection services. Formerly Portfolio.
- **Client Account**: System account for brand legal departments or brand protection team members.
- **External Partner**: External professionals commissioned by the brand.
- **Account Manager**: Internal system employee possessing cross-Brand operational permissions.
- **System Admin**: Administrator responsible for system configuration and account management.
- **White Label Service**: Service function to change system logo, color scheme, and domain.
- **Domainarium**: IPTwins' domain management system, serving as the source of Brand and Account data for this system.
- **Detective v2**: The legacy monitoring system.
- **Active**: Account or Task is in an operational state.
- **Inactive**: Account or Task is in a non-operational state, usually requiring administrator permission to change.

### 2.2 Monitoring Task & Classification

- **Monitoring Task**: Monitoring activity executed for a specific target. Holds immutable core context (Brand, Platform, Asset).
- **Monitoring Scope**: A specific execution unit belonging to a phase, defining the target platform list and monitoring keyword combination. Editable.
- **Platform Type**: The platform category targeted by a single Monitoring Task. Includes: Marketplace, Social Media, Web, Images, Domain.
- **Reference Currency**: Marketplace-specific field defining a target currency. Immutable.
- **Highlighted Terms**: Defines a set of keywords used to highlight matching text in the finding list interface. Editable.
- **Phase**: An execution stage within a Monitoring Task. Contains time range, format setting snapshot (Currency/Language), and execution logic configuration (Scope/Strategy).
- **Start/End Date**: The business time interval of the phase. Editable.
- **Schedule Expression**: Defines the automatic execution frequency of the task. Editable.
- **Default Translation Language**: Default translation target language used for interface review. Immutable.
- **Scan**: One crawler execution for a single monitoring keyword plus a single target platform.
- **AI Triage Strategy**: Configuration containing matching conditions, trigger conditions, and execution actions. Editable.
- **Cross-Report Search**: Global search function across all phases.
- **Any Phase within the Entire File**: Scope definition for Cross-Report Search.
- **Running**: A task status indicating the task is Active and the automated schedule is operating.
- **Paused**: A task status indicating the task is Active but the automated schedule is temporarily frozen.
- **Archived**: A task status indicating the task has been set to Inactive and no longer executes any operations.

### 2.3 Protected Asset

- **Protected Asset**: Represents intellectual property the brand needs to protect, serving as background knowledge for AI Triage.

### 2.4 Finding & Operations

- **Finding**: Suspected infringing content detected by the monitoring system on the web.
- **Risk Category**: A single fixed classification of infringement type.
- **Takedown Status**: Indicates the disposal progress of a finding, divided into Requested, Succeeded, Ignored, Failed. Independent of Risk Category.
- **Label**: User-defined free text marker belonging to a Monitoring Task. Limit 30.
- **Snapshot**: Content and price crawled by the crawler are fixed after the first crawl and do not update with subsequent scans.
- **Duplicate Data**: Repeatedly crawled URLs.
- **Quota / Crawl Limit**: Daily crawl limit.
- **Manual Inject**: Manually adding a URL to the system.
- **To Be Processed**: Initial status of a finding (Optional).
- **Relevant**: Status confirmed as a risk.
- **Enforcement**: Finding status tab for those where a takedown request has been initiated.
- **Excluded**: Status confirmed as non-risk.
- **Trash**: Junk data status.
- **None**: "Unclassified" or "No Label" option in filters.
- **Chips**: UI presentation form for labels.
- **Warning Modal**: Confirmation window before deletion operations.
- **Range Selection**: Selecting multiple continuous items using the Shift key.
- **Clear Selection**: Button to cancel all selected items.
- **AI Triage**: Automated decision service.

## 3. Information Architecture

### 3.1 Core Entity Relationship Diagram

#### 3.1.1 Identity & Global Assets

- **Brand** [Aggregate Root]
  - [Attribute] **White Label Settings**: Contains logo image, color scheme, domain pointer.
  - [1:N] [Owns] **Client Account List**
  - [N:M] [References] **External Partners**
  - [1:N] [Owns] **Protected Asset List**
- **User Account** [Aggregate Root]
  - [Attribute] **Status**: Active or Inactive.

#### 3.1.2 Monitoring Task Container

- **Monitoring Task** [Aggregate Root]
  - [1:1] [Attribute] **Core Context** (Immutable: Brand, Platform Type, Protected Asset)
  - [1:1] [Attribute] **Editable Settings** (Task Name, Schedule, Owner)
  - [1:1] [Attribute] **Status & Execution Control**: Defines if task is Active/Inactive, and Paused/Resumed status.
  - [1:N] [Owns] **Task Label Library** [Max 30]
  - [1:N] [Owns] **Phase List**
  - [1:N] [Owns] **Finding Data Pool**

#### 3.1.3 Phase

- **Single Phase** [Entity]
  - [1:1] [Attribute] **Basic Information** (Editable)
    - Phase Name
    - Start Date
    - End Date
  - [1:1] [Attribute] **Format Setting Snapshot** (Immutable)
    - Default Translation Language
    - Reference Currency
  - [1:1] [Attribute] **Execution Logic Configuration** (Editable)
    - Highlighted Terms List
    - Monitoring Scope List
    - AI Triage Strategy List
  - [1:N] [Owns] **Execution Logs**

#### 3.1.4 Finding Data Pool

- **Single Finding** [Aggregate Root]
  - [1:1] [Attribute] **Core Data** (Status, Risk Category, Takedown Status, Raw Data, Price Snapshot)
  - [1:1] [Reference] **Source Traceability** (Belonging Task, Phase, Monitoring Scope)
  - [1:N] [Owns] **Comments**, **Finding Logs**, **Execution Logs**

### 3.2 Mutual Exclusion Constraints

- **Platform Type Singularity**: Each Monitoring Task can only select one Platform Type.
- **Task Context Lock**: After task creation, editing its Protected Asset and Platform Type is prohibited.
- **Phase Format Lock**: After phase creation, editing its Default Translation Language and Reference Currency is prohibited to ensure consistency of reports and execution time.
- **Dynamic Execution Logic**: Monitoring Scope, Highlighted Terms, AI Triage Strategy, Start and End Dates are all editable attributes, allowing dynamic adjustment during execution.
- **AI Triage Availability**: If no AI Triage Strategy is currently configured for the phase, the AI Triage service cannot be used.

## 4. Key Design Decisions & Rationale

### 4.1 Why move Monitoring Scope and AI Triage Strategy to Phase and make them editable?

This decision is to adopt a "Hybrid Mode" to balance report consistency and execution flexibility.

- **Format Lock**: Reference Currency and Default Translation Language adopt snapshot locking to ensure the statistical report basis (such as sum of amounts) within a single phase does not become chaotic due to setting changes.
- **Logic Mutable**: Monitoring Scope and AI Triage Strategy are designed to be editable to support the iterative process of "Tuning & Optimization". Users can add keywords to expand monitoring coverage or modify AI instructions to improve triage accuracy at any time, and immediately apply them to the next scan or triage without being forced to recreate the phase due to fine-tuning settings.

### 4.2 Why import recent settings when adding a new phase?

For operational convenience. The system automatically brings in the settings of the most recent phase (including snapshot settings like Scope, Strategy) as default values, allowing users to add, delete, or modify on this basis to continue monitoring work.

### 4.3 Why are Task core fields (Asset, Platform Type) not editable?

The Monitoring Task plays the role of an immutable context container. Platform Type determines the data structure, and Protected Asset is the basic background for AI Triage. Modifying these fields would cause the context of existing data to become invalid.

### 4.4 Why does Label belong to Monitoring Task?

To improve cohesion. Label is a classification dimension at the Task level and should remain consistent throughout the entire Task lifecycle, unaffected by Phase changes.

### 4.5 Decoupled design of Crawler Schedule and Phase Time

The system decouples frequency (Task Schedule) and time range (Phase Date). This allows users to flexibly adjust business cycles. When the Monitoring Scope is modified, the new settings are saved immediately and picked up at the next schedule trigger.

### 4.6 Why are Exchange Rate and Translation event-driven asynchronous processes?

To improve crawler performance and allow retry mechanisms.
The exchange rate snapshot is recorded at the time of retention, ensuring that price conversions for historical data do not change due to subsequent exchange rate fluctuations.

### 4.7 Why use System Comments to record AI Triage reasoning?

Since AI Triage Strategies are mutable, current strategy configurations cannot explain past triage results. Therefore, when executing triage, the system writes the triggered strategy name and reasoning at that time into "System Comments" as a permanent audit record.

### 4.8 Terminology Refactoring & Domain Language Refinement

This system has undergone critical domain language refactoring to more precisely align with business essence and user mental models.

**Portfolio -> Brand**

- **Decision Rationale**: The term "Portfolio" is too abstract and carries connotations of financial investment portfolios. In the brand protection domain, clients intuitively manage specific "Brands". This renaming reduces user cognitive load and makes the hierarchy (Brand -> Protected Assets) more intuitive.

**Smart Labeling -> AI Triage**

- **Decision Rationale**: The original name "Smart Labeling" easily misled users into thinking it was just an "auto-tagging" function.
- **Imagery of Triage**: The essence of this function is decision and disposal. Like triage in an emergency room, AI's responsibility is to quickly judge the retention (Relevant vs Trash) and risk level of findings amidst massive noise based on strategy priority. Using "Triage" better conveys the professional value of "Active diversion, reducing manual burden".

**Tags -> Labels**

- **Decision Rationale**: To make a clear distinction from "AI Triage".
- **Labels vs Triage**:
  - **Labels (User-defined)**: Manual, flexible, unstructured markers applied by users (e.g., "To Discuss", "Important Client").
  - **Triage (AI-driven)**: System-judged, structured, status-oriented disposal flows (e.g., "High Risk", "Excluded").
- This renaming eliminates confusion between "Manual Tagging" and "AI Labeling" for users.

### 4.9 Why retain "To Be Processed" status?

This decision is to ensure the **clarity** and **operational continuity** of review work.

- **Clear Review Boundary (Inbox Zero)**: By retaining the "To Be Processed" status, the system can clearly distinguish between "New Data" and "Reviewed Data". This creates a workflow similar to an Inbox, where the reviewer's goal is to clear this list. Lacking this status would mix new and old data, making it difficult for users to track review progress and easily causing missed judgments or repeated reviews.
- **Continuing Existing Operational Habits**: The legacy domain monitoring system (Detective v2) already possesses this status design. Retaining this status ensures consistent operational experience between new and old systems, reducing the learning migration cost for existing users and enabling a seamless transition to the new platform's operational flow.