# Git Commands
> Add to git
```
git add <file>
```
> Add all but those in .gitignore
```
git add .
```
> Check status under current directory
```
git status
```
> Undo add
```
git restore --staged <file1> <file2>
```
> Commit
```
git commit -m "<comments>"
```
> View the commit
```
git show
```
> Push
```
git push origin main
```
## Activate virtual environment
> For macOS
```bash
source .venv_serpentai/bin/activate
```
> For Windows
#### Open an Anaconda Prompt terminal
```
conda activate serpent
```
# Serpent Commands
> Launch the game
```bash
serpent launch MiniMetro
```
> Use the AI agent
```
serpent play MiniMetro SerpentMiniMetro_RL_AgentGameAgent
```
# Serpent File Configs
`serpent_minimetro_game.py`
```python
from serpent.game import Game
from .api.api import MiniMetroAPI
from serpent.utilities import Singleton

import time


class SerpentMiniMetroGame(Game, metaclass=Singleton):

    def __init__(self, **kwargs):
        kwargs["platform"] = "steam"
        kwargs["window_name"] = "Mini Metro"
        kwargs["app_id"] = "287980"
        kwargs["app_args"] = None

        super().__init__(**kwargs)

        self.api_class = MiniMetroAPI
        self.api_instance = None

    @property
    def screen_regions(self):
        regions = {
            "SAMPLE_REGION": (0, 0, 0, 0)
        }
        return regions

    @property
    def ocr_presets(self):
        presets = {
            "SAMPLE_PRESET": {
                "extract": {
                    "gradient_size": 1,
                    "closing_size": 1
                },
                "perform": {
                    "scale": 10,
                    "order": 1,
                    "horizontal_closing": 1,
                    "vertical_closing": 1
                }
            }
        }
        return presets

    def after_launch(self):
        """
        Wait until macOS has created window 1, then run the normal
        after_launch. If moving the window raises -1719 we still
        record geometry so the frame-grabber has numbers to use.
        """
        time.sleep(2.0)  # increase if Mini Metro is slow to appear

        try:
            super().after_launch()  # find window → sets self.window_id
        except Exception:
            # swallow the AppleScript move-window error
            pass
        finally:
            # Ensure window_geometry exists
            if not self.window_geometry:
                try:
                    self.update_window_geometry()
                except Exception:
                    # last-ditch: fall back to the values you saw printed
                    self.window_geometry = {
                        "width": 1280,
                        "height": 828,
                        "x_offset": 0,
                        "y_offset": 25  # menu bar height on your MacBook
                    }
```
# File Structure and Overall Workflow
## What every file does in the RL pipeline
| File | Purpose in the pipeline |
| ---------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **SerpentMiniMetroGamePlugin/plugin.py** | Registers the game plugin and makes `files/` importable. |
| **SerpentMiniMetroGamePlugin/files/serpent\_MiniMetro\_game.py** | *Game* definition. – Starts/terminates Mini Metro, grabs frames, exposes window regions. |
| **SerpentMiniMetro\_RL\_AgentGameAgentPlugin/plugin.py** | Registers the agent plugin and adds its own `files/` to `sys.path`. |
| **…/files/actions.py** | Single source-of-truth `Enum` for every discrete action. Used by: data-merger, BC training, agent runtime. |
| **…/files/serpent\_MiniMetro\_RL\_Agent\_game\_agent.py** | Your *GameAgent* class. Holds: <br>• `setup_learn` / `handle_learn` handlers (records demos) <br>• `handle_policy` (runs model, dispatches actions) <br>• `dispatch()` helper that turns `Action` → `InputController` calls. |
| **…/files/recorder.py** | Thin wrapper around SneakySnek – logs `(timestamp, keyboard/mouse)` to CSV while you play. |
| **…/files/ml\_models/** | All ML code: <br>• `imitation.py` (train behaviour cloning) <br>• `ppo_finetune.py` (Gym env + PPO finetuning). |
| **datasets/** (outside the plugins) | Collected PNGs, CSVs, merged `.npz` / HDF5 files. |
| **offshoot.manifest.json** (project root) | Tells Offshoot which files to auto-import and which classes are “pluggables”. |
### How data flows
```text
┌────────────────────┐
│ You play in LEARN │
│ (handler=learn) │
└─────────┬──────────┘
│ ① Game frames captured at 2 fps
│ ② CSVActionRecorder logs every input event
▼
datasets/recordings/{run}/frames/*.png
datasets/inputs_mm.csv
│
│ ③ merge_frames_inputs.py → stacks 4 frames,
│ converts raw mouse/keys → Action enum
│
▼
datasets/merged_mm.npz
│
│ ④ imitation.py loads .npz, trains BC model
│ and saves ml_models/imit_bc.pkl
│
▼
ml_models/imit_bc.pkl
│
│ ⑤ ppo_finetune.py wraps Mini Metro in a Gym
│ env, loads BC weights as starting point
│ and saves ml_models/ppo_policy.pkl
│
▼
ml_models/ppo_policy.pkl
│
│ ⑥ handle_policy (handler=policy) loads
│ ppo_policy.pkl, predicts Action every frame,
│ dispatches clicks & drags through
│ self.input_controller
▼
Mini Metro plays itself — press TAB to take over
```
# ChatGPT Responses for RL agent
Below is a practical roadmap that takes you **from a working Mini Metro game-plugin to a reinforcement-learning agent that can (1) watch you play, (2) imitate, and (3) graduate to fully autonomous play while still letting you jump in and override its moves.**
I keep everything in the order you would actually do it on-disk, so you can follow along step-by-step.
---
## 0 Know the moving parts in Serpent AI
| Piece | What it really is | Why you care |
| --------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------- |
| **Game plugin** (`SerpentMiniMetroGamePlugin`) | Launches the game window and grabs frames. | You already have this. |
| **`GameFrame`** | A wrapper around each screenshot (NumPy array + timestamp). | This is the “observation” your RL algorithm will see. |
| **`InputController`** | Sends key / mouse events through PyAutoGUI (macOS). | This is the “action” interface for the agent. ([GitHub][1]) |
| **Game-agent plugin** (`SerpentMiniMetroGameAgentPlugin`) | A subclass of `GameAgent`; runs an infinite loop, receives frames, decides actions, calls `input_controller`. ([GitHub][2]) | |
| **Offshoot plugin system** | Scans `offshoot.manifest.json` and auto-imports any extra Python files that live inside the plugin directory. | So yes—you can drop as many extra `.py` files (or whole packages) as you like and just import them. |
---
## 1 Define a *clean* action space
Mini Metro is mouse-heavy, so trying to predict raw cursor coordinates is hard.
A common simplification is:
| Internal name | What it does | Implementation |
| ---------------- | ----------------------------------------------------------------------------------- | --------------------------------------------- |
| `SELECT_STATION` | Left-click on a detected station sprite. | `input_controller.click(x, y)` |
| `DRAG_TRACK` | Drag from station A to station B. | `input_controller.drag(x0, y0, x1, y1)` |
| `UPGRADE` | Press the shortcut key (e.g. `Space`) when a locomotive/line upgrade popup appears. | `input_controller.tap_key(KeyboardKey.SPACE)` |
Put them in an Enum so both your human-recorder and the RL policy share the same label set.
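A minimal sketch of what that shared `actions.py` Enum could look like (the member names come from the table above; `NOOP` and the id mappings are my additions for the ML side):

```python
# Hypothetical actions.py sketch: a single Enum shared by the recorder,
# the behaviour-cloning trainer, and the runtime agent.
from enum import Enum, auto


class Action(Enum):
    NOOP = auto()            # do nothing this frame
    SELECT_STATION = auto()  # left-click a detected station sprite
    DRAG_TRACK = auto()      # drag from station A to station B
    UPGRADE = auto()         # tap the upgrade shortcut key


# Integer labels for the ML side (cross-entropy targets)
ACTION_TO_ID = {a: i for i, a in enumerate(Action)}
ID_TO_ACTION = {i: a for a, i in ACTION_TO_ID.items()}
```

Because every consumer imports the same module, the recorder's CSV labels, the BC model's output head, and the agent's `dispatch()` can never drift apart.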
---
## 2 Collect a *labeled* demonstration set (“learn my actions”)
1. **Capture frames** while you play:
```bash
serpent launch MiniMetro # keep it windowed
serpent capture context MiniMetro 1 playing # every second
```
The images land in `datasets/collect_frames_for_context/playing`. ([GitHub][3])
2. **Record your own inputs** at the same time.
The framework has a “Gameplay Recording” addon, but the simplest hack on macOS is to run a tiny PyObjC script that listens to macOS CGEvents and dumps `(timestamp, key/mouse)` to a CSV (a one-liner with `Quartz.CGEventTapCreate`). Make sure the timestamps use the same clock as the Serpent frame-grabber (`time.time()`).
3. **Merge** the two streams into training tuples
`(latest 4 stacked frames, action-enum, t)`.
A 4-frame stack captures train motion without needing an LSTM yet.
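The merge step can be sketched as follows; `frames` and `labels` are illustrative names for the per-frame images and already-aligned action ids, not the actual variables in `merge_frames_inputs.py`:

```python
import numpy as np


def stack_frames(frames, labels, depth=4):
    """Turn per-frame (H, W) images plus per-frame action ids into
    (depth, H, W) stacks labelled with the action at the *last* frame.
    Illustrative stand-in for the real merge script."""
    X, y = [], []
    for t in range(depth - 1, len(frames)):
        X.append(np.stack(frames[t - depth + 1 : t + 1]))  # oldest → newest
        y.append(labels[t])
    return np.array(X, dtype=np.float32), np.array(y, dtype=np.int64)
```

The result can be persisted with `np.savez("datasets/merged_mm.npz", X=X, y=y)` for the BC trainer.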
---
> How to record player inputs?
Below are the **three practical ways Serpent-AI users normally capture their own keyboard + mouse actions** while frames are being saved.
Pick the one that best fits your workflow and OS security settings (you’re on macOS):
---
### 1 Use Serpent’s built-in “gameplay-recording” command
(*zero extra code, relies on the same recorder that powers the framework’s YouTube demo*)
```bash
# in the same venv where Serpent is installed
serpent record MiniMetro --name harry_play_01 --fps 8
```
What happens
1. `serpent record` launches the game and opens a SneakySnek recorder behind the scenes.
2. Every frame (default: 8 fps) is written to
`datasets/recordings/harry_play_01/frames/000001.png`, …
3. Each input event is appended to
`datasets/recordings/harry_play_01/inputs.csv` in the format
```
timestamp,event_type,key_or_button,x,y
1715572484.31,KEY_DOWN,SPACE,,
1715572485.02,MOUSE_CLICK,LEFT,682,512
```
The feature was introduced in **Serpent 2018.1.0** under the name *cross-platform gameplay recording* ([GitHub][1]) and it still works on current macOS as long as:
* the *Terminal* (or IDE) that runs the command has *System Settings ▸ Privacy & Security ▸ Accessibility* **and** *Input Monitoring* permission, and
* the game is in windowed mode (otherwise frame capture will see a black screen).
---
### 2 Start SneakySnek directly (stand-alone logger)
SneakySnek is the micro-library written for Serpent’s recorder and kept as a separate PyPI package ([GitHub][2]).
```bash
pip install sneakysnek # already in your venv if Serpent pulled it
python -m sneakysnek -o inputs_minimetro.json
```
* A tiny window tells you it’s recording.
* Hit **⌃-C** when you’re done; a JSON file containing every keyboard and mouse event with timestamps drops next to the script.
* Combine that JSON with the PNGs that Serpent captured earlier (or with your own screen-grabber) by matching the timestamps.
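Matching the logged events to Serpent's PNG timestamps is a nearest-neighbour lookup on sorted times; a minimal sketch (the function name and the 0.5 s tolerance are my own choices):

```python
import bisect


def nearest_event(frame_ts, event_times, tolerance=0.5):
    """Return the index of the input event closest in time to a frame
    timestamp, or None if nothing was logged within `tolerance` seconds.
    Assumes event_times is sorted (SneakySnek logs chronologically)."""
    i = bisect.bisect_left(event_times, frame_ts)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(event_times)]
    if not candidates:
        return None
    best = min(candidates, key=lambda j: abs(event_times[j] - frame_ts))
    return best if abs(event_times[best] - frame_ts) <= tolerance else None
```

Frames with no nearby event can be labelled `NOOP` (or dropped) during the merge.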
---
### 3 Embed the recorder in your **GameAgent** (most flexible)
If you want to capture actions only while a special “learn” handler is active, add \~20 lines to your agent:
```python
# plugins/SerpentMiniMetroGameAgentPlugin/recorder.py
import csv, os

from sneakysnek.recorder import Recorder
from sneakysnek.events import KeyboardEvent, MouseEvent


class CSVActionRecorder:
    def __init__(self, path="inputs.csv"):
        folder = os.path.dirname(path)
        if folder:  # makedirs("") raises for bare filenames
            os.makedirs(folder, exist_ok=True)
        self.file = open(path, "w", newline="")
        self.writer = csv.writer(self.file)
        self.writer.writerow(["timestamp", "kind", "code", "x", "y"])
        self.rec = Recorder.record(self._callback)

    def _callback(self, event):
        if isinstance(event, KeyboardEvent):
            self.writer.writerow([event.timestamp, "K_" + event.event.name,
                                  event.keyboard_key.name, "", ""])
        elif isinstance(event, MouseEvent):
            self.writer.writerow([event.timestamp, "M_" + event.event.name,
                                  event.button.name, event.x, event.y])

    def stop(self):
        self.rec.stop()
        self.file.close()
```
…and inside **`serpent_mini_metro_game_agent.py`**:
```python
def setup_learn(self):
    self.action_logger = CSVActionRecorder("datasets/inputs_harry.csv")

def handle_learn(self, game_frame):
    # just keep letting the recorder callback run
    pass
```
Run it with
```bash
serpent play MiniMetro SerpentMiniMetroGameAgent --handler learn
```
The recorder lives in its own thread; stopping the agent (press **Q** in the Serpent console) automatically closes the CSV.
---
#### Giving macOS the necessary permissions
*System Settings → Privacy & Security →*
* **Accessibility** → enable for Terminal (or PyCharm, VS Code)
* **Input Monitoring** → same apps above
* (Optional) **Screen Recording** → if you rely on an external screen-grabber
Without those checks, macOS will raise the same `-10006` AppleScript error you saw earlier when the recorder tries to inspect mouse coordinates.
---
### Which option should you choose?
| Goal | Best choice |
| ---------------------------------------------------------------------------------- | ------------------------------- |
| Quick one-off demo data | **Option 1** – `serpent record` |
| Capture with no Serpent window or when you want different FPS for frames vs inputs | **Option 2** |
| Integrated pipeline, switchable “human / auto” modes, custom CSV schema | **Option 3** |
Once you have the **frames** and a matching **CSV** of your actions, you can feed them straight into your behaviour-cloning or RL preprocessing script exactly as outlined in the previous message.
[1]: https://github.com/SerpentAI/SerpentAI/blob/dev/CHANGELOG.md?utm_source=chatgpt.com "SerpentAI/CHANGELOG.md at dev · SerpentAI/SerpentAI · GitHub"
[2]: https://github.com/SerpentAI/sneakysnek "GitHub - SerpentAI/sneakysnek: Dead simple cross-platform keyboard & mouse global input capture solution for Python 3.6+"
## 3 Train an *imitation* policy offline
Any RL library is fine—**Stable-Baselines-3** is convenient but needs ≥Python 3.8, so in your 3.6 venv install the last pre-3.8 wheel (`stable-baselines==2.10.2`). A minimal behaviour-cloning script:
```python
model = cnn_to_mlp()                # conv layers → 2-layer MLP (your definition)
optimizer = Adam(model.parameters(), lr=1e-4)

for obs, act in dataloader:
    optimizer.zero_grad()           # clear gradients from the previous batch
    logits = model(obs)             # [batch, |A|]
    loss = cross_entropy(logits, act)
    loss.backward()
    optimizer.step()
```
Save it as `plugins/SerpentMiniMetroGameAgentPlugin/ml/imitation.py`
and commit the path to `offshoot.manifest.json`. From the agent you will later call:
```python
self.policy = self.load_machine_learning_model("ml/imit_bc.pkl")
```
(`load_machine_learning_model` is baked into `GameAgent` and handles pickled files for you. ([GitHub][2]))
---
## 4 Create the agent plugin skeleton
```bash
serpent generate game_agent
# Game name? MiniMetro
```
Serpent drops `plugins/SerpentMiniMetroGameAgentPlugin/...` for you. ([GitHub][4])
Edit `serpent_MiniMetro_game_agent.py`:
```python
from serpent.game_agent import GameAgent
from serpent.input_controller import MouseButton, KeyboardKey

from imitation import ImitationPolicy  # your file!
from actions import Action             # the shared Enum from actions.py


class SerpentMiniMetroGameAgent(GameAgent):

    def __init__(self, **kwargs):
        super().__init__(**kwargs)

        # two modes: imitate and RL
        self.frame_handler_setups["learn"] = self.setup_learn
        self.frame_handlers["learn"] = self.handle_learn
        self.frame_handlers["policy"] = self.handle_policy

    # ----------------------------------------
    # 1) during data collection
    def setup_learn(self):
        self.recorder = CSVActionRecorder("human_actions.csv")

    def handle_learn(self, game_frame):
        self.recorder.sync(game_frame.frame_time)
        # No automated play, we just watch & save

    # ----------------------------------------
    # 2) autonomous play
    def handle_policy(self, game_frame):
        state = preprocess(game_frame.frame)  # resize, stack, float32
        action = self.policy.predict(state)   # enum label
        self.dispatch(action)

    # helper: maps logical actions to real inputs
    def dispatch(self, action):
        if action == Action.SELECT_STATION:
            x, y = self.pick_station()
            self.input_controller.click(x, y)
        elif action == Action.DRAG_TRACK:
            (x0, y0), (x1, y1) = self.pick_drag()
            self.input_controller.drag(x0, y0, x1, y1)
        ...
```
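The `preprocess` helper referenced in `handle_policy` is never defined; here is a plausible plain-NumPy stand-in (real code would more likely use `cv2.resize`, and the divisor/stack depth are my choices):

```python
import numpy as np


def preprocess(frame, stack=None, size_divisor=8):
    """Grayscale, downsample, and normalise one RGB game frame.
    Plain-NumPy sketch; production code would likely use cv2.resize."""
    gray = frame.mean(axis=2)                     # RGB → grayscale
    small = gray[::size_divisor, ::size_divisor]  # crude strided downsample
    state = (small / 255.0).astype(np.float32)    # [0, 1] floats
    if stack is not None:                         # keep a rolling 4-frame stack
        stack.append(state)
        del stack[:-4]
        return np.stack(stack)
    return state
```

Keeping the stack as a plain list owned by the agent makes it trivial to reset between episodes.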
---
## 5 Add extra `.py` files (yes, absolutely!)
Anything you place *inside the plugin directory* is importable without touching PYTHONPATH.
If you prefer a clean separation, create `plugins/SerpentMiniMetroGameAgentPlugin/ml/` and list those new modules in **`offshoot.manifest.json`**:
```json
{
"plugins": [...],
"files": [
"plugins/SerpentMiniMetroGameAgentPlugin/ml/__init__.py",
"plugins/SerpentMiniMetroGameAgentPlugin/ml/imitation.py",
"plugins/SerpentMiniMetroGameAgentPlugin/ml/ppo_finetune.py"
]
}
```
Offshoot rereads the manifest the next time you run any `serpent` command; no `pip install -e` step needed. ([GitHub][5])
---
## 6 Fine-tune with reinforcement learning
1. Wrap Mini Metro in a thin **Gym-like adapter** living in `ppo_finetune.py`:
```python
class MiniMetroEnv(gym.Env):
    def step(self, action):
        self.input_controller.dispatch(action)
        time.sleep(self.dt)  # let the world advance

        obs = self.grab_frame()
        reward, done = self.compute_reward(obs)
        return obs, reward, done, {}
```
2. Load the imitation weights into PPO as the starting point (`policy_kwargs={"weights": imitation_state_dict}`) and run PPO or A2C for, say, 1 million steps.
3. Pickle the new weights as `ml/ppo_policy.pkl` and have `handle_policy` swap models with a keypress so you can *interfere* mid-game:
```python
if self.input_controller.is_key_pressed(KeyboardKey.KEY_TAB):
    self.autonomous = not self.autonomous
```
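The `compute_reward` hook in the env sketch above is left undefined. One crude heuristic, assuming (and this is only an assumption) that the game-over overlay sharply darkens the frame: +1 for every step survived, episode done when mean brightness collapses. A real version might instead OCR the passenger counter for a shaped reward.

```python
import numpy as np


def compute_reward(obs, darkness_threshold=40.0):
    """Heuristic reward for the MiniMetroEnv sketch: +1 for every step
    the game keeps running; 'done' when mean brightness collapses, which
    this sketch assumes signals the game-over overlay."""
    brightness = float(np.mean(obs))
    done = brightness < darkness_threshold
    reward = 0.0 if done else 1.0
    return reward, done
```

Survival reward alone is weak but is enough for PPO to improve on the BC policy's obvious failure modes.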
---
## 7 Run everything
```bash
# 1) collect demonstrations (learn mode)
serpent play MiniMetro SerpentMiniMetroGameAgent --handler learn
# 2) after training
serpent play MiniMetro SerpentMiniMetroGameAgent --handler policy
```
At this point the agent will react every game frame, choose an action from your finite set, and dispatch it through the built-in InputController. You can press **Tab** (or whatever key you wired) to temporarily take back control—your clicks will simply overwrite the last action and the RL loop continues.
---
### Key take-aways
* **Perception** – Every iteration of the frame handler hands you a fresh `GameFrame` object containing a NumPy array you can reshape, crop or down-sample at will.
* **Action** – You never touch OS APIs yourself; you just call high-level helpers on `self.input_controller`.
* **Extensibility** – Extra Python files live happily inside the plugin folder and are auto-imported by Offshoot; external ML libraries can be installed in the same venv.
* **Learning loop** – Start with *behaviour cloning* from your recorded CSV, then switch to *policy-gradient* fine-tuning once the agent can at least mimic your fundamentals.
Happy training!
[1]: https://github.com/SerpentAI/SerpentAI/wiki/The-%27InputController%27-Class "The 'InputController' Class · SerpentAI/SerpentAI Wiki · GitHub"
[2]: https://github.com/SerpentAI/SerpentAI/wiki/The-%27GameAgent%27-Class "The 'GameAgent' Class · SerpentAI/SerpentAI Wiki · GitHub"
[3]: https://github.com/SerpentAI/SerpentAI/wiki/Capturing-Frames-and-Regions "Capturing Frames and Regions · SerpentAI/SerpentAI Wiki · GitHub"
[4]: https://github.com/SerpentAI/SerpentAI/wiki/The-Serpent.AI-Hello-World "The Serpent.AI Hello World · SerpentAI/SerpentAI Wiki · GitHub"
[5]: https://github.com/SerpentAI/offshoot?utm_source=chatgpt.com "SerpentAI/offshoot: Modern, elegant, minimalistic but ... - GitHub"
# FAQ
> Cannot launch MiniMetro after `serpent setup` and `serpent generate`?
:::info
Have you activated MiniMetro Plugin?
```
serpent plugins                 # check for any inactive plugins
serpent activate <PluginName>   # activate the plugin
serpent launch <Game_Name>      # launch again
```
:::
> Cannot train ML model even though ML setup is complete
:::info
Try:
```
# still **inside** the .venv_serpentai virtual-env
pip uninstall -y keras keras-nightly keras-preprocessing
# install versions contemporary with TF 1.4
pip install "keras==2.0.8" "h5py<3" # h5py 3.x needs newer TF
```
then run the classifier training again.
:::
# 5/19
* Created Action Space
Below is a “guided tour” of the two Mini-Metro plugins exactly as they appear in your tree, followed by:
# 5/26
Here’s a deeper dive into our journey—step by step, with the key decisions, false starts, and final breakthroughs:
---
## 1. The Initial “Unclassified Contour” Crash
* **Symptom**: As soon as your agent hit an in-game frame, `detect_stations()` would throw
```python
RuntimeError: Unclassified contour: v=4, circ=0.75
```
because it was picking up enormous rectangular blobs (your whole window border!) and tiny “river-crossing” ticks and trying to match them against station shapes.
* **Root Cause**:
1. **No line-vs-station separation**: your code thresholded for any dark region, then tried to recognize circles/triangles/quads on *every* contour.
2. Big UI elements and river-bridge connectors ended up in that pipeline.
---
## 2. Bringing in SerpentAI’s VisualDebugger
* **Goal**: See exactly what `detect_stations` was feeding through—and what it was rejecting.
* **What we did**:
* Called
```python
self.visual_debugger.store_image_data(frame_bgr, ..., bucket="raw")
self.visual_debugger.store_image_data(annot, ..., bucket="det")
```
so that the VisualDebugger window showed side-by-side “raw” and “det” images.
* Added temporary `cv2.drawContours(...)` in your vision code to outline each candidate before classification.
* **What we learned**:
* Entire frame and UI arrows were making their way into `raw_cnts`.
* Many tiny passenger-dot clusters and river-bridge ticks looked “polygonal” enough to confuse the N-gon tests.
---
## 3. First Refinements—Masking Out Everything but Bright White
* **Idea**: Stations in Mini-Metro are *pure white* icons; everything else (lines, UI, passengers) is colored or dark.
* **Attempt**:
* Convert to HSV, threshold on very low S (desaturation) and very high V (value) to isolate white pixels.
* Morphological open/close to remove specks.
* Then run contour→circularity/polygon test.
* **Pitfall**:
* Stations *on top* of colored tracks (e.g. on a bridge) had only a thin white rim—HSV mask sometimes missed those edges entirely.
* Isolated stations (no overlapping line) were detected perfectly, but connected ones vanished.
---
## 4. Over-Engineering the Line-Mask Subtraction
* **Goal**: Guarantee that *all* colored line pixels get removed from the white mask—so that even bridged stations’ white rings remain connected components.
* **Steps**:
1. Compute a **union** of every color mask (loop through your HSV\_RANGES → `cv2.inRange` → open → **dilate**).
* The dilation step was critical to “eat” the 1-pixel-wide river-cross markers.
2. Subtract that union mask from your white-pixel mask.
3. Find contours on the cleaned white mask.
* **Tuning**:
* Dialed dilation iterations until thin bridge-ticks vanished from your mask but station rims stayed.
* Verified in VisualDebugger that *every* station—regardless of overlap—showed up as a solid white blob.
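The union → dilate → subtract pipeline in steps 1–3 can be sketched with plain NumPy boolean masks (the real code uses `cv2.inRange`/`cv2.dilate`; the 3×3 kernel and helper names here are my own):

```python
import numpy as np


def dilate(mask, iterations=1):
    """Crude binary dilation with a 3x3 square kernel (stand-in for cv2.dilate)."""
    out = mask.copy()
    for _ in range(iterations):
        padded = np.pad(out, 1)
        out = np.zeros_like(out)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                out |= padded[1 + dy : 1 + dy + mask.shape[0],
                              1 + dx : 1 + dx + mask.shape[1]]
    return out


def clean_white_mask(white_mask, color_masks, iterations=2):
    """Union every colour-line mask, fatten it so 1-px river ticks are
    swallowed, then subtract it from the white-station mask."""
    union = np.zeros_like(white_mask)
    for m in color_masks:
        union |= m
    return white_mask & ~dilate(union, iterations)
```

Tuning `iterations` is exactly the dial mentioned above: too few and bridge ticks survive, too many and station rims get eaten.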
---
## 5. Simplified, Robust Shape Classification
* **Contour filtering**: After obtaining a clean set of white blobs (`cnts`), we applied in this order:
1. **Area filter**: discard <30px noise or >50% screen.
2. **Aspect-ratio guard**: skip anything extremely long/thin (passenger dots, random UI bars).
3. **Circularity & polygon-vertex test**:
* If `circ > 0.85` → CIRCLE
* `v == 3` → TRIANGLE
* `v == 4` → axis-aligned vs. 45° rotated → SQUARE or GEM
* `v == 5` → convex→PENTAGON else → DIAMOND
4. **Fallback**: matchShapes against your preloaded station-template contours, accepting only very tight matches (score <0.2).
* **Outcome**: No more mystery cross icons or river ticks—only real station shapes.
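The decision ladder in step 3 can be written as a pure function over the already-computed contour features; the 0.85 circularity cutoff and vertex rules come from the notes above, while the function signature is hypothetical:

```python
def classify_station(vertices, circularity, axis_aligned=True, convex=True):
    """Decision ladder from the notes: circularity first, then the
    polygon-vertex count. Returns a shape label or None (which would
    trigger the matchShapes fallback in the real pipeline)."""
    if circularity > 0.85:
        return "CIRCLE"
    if vertices == 3:
        return "TRIANGLE"
    if vertices == 4:
        return "SQUARE" if axis_aligned else "GEM"
    if vertices == 5:
        return "PENTAGON" if convex else "DIAMOND"
    return None  # fall through to cv2.matchShapes against templates
```

Keeping it pure (no cv2 calls) makes the ladder unit-testable without screenshots.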
---
## 6. Final Benefits
1. **All stations detected**, whether they sit atop a track or alone in open space.
2. **No spurious “passenger” or “UI” detections**—tiny dots and Monday-popup crosses never make it past area/aspect filters.
3. **Clean VisualDebugger views**
* **Raw**: original game frame
* **Det**: annotated detection contours and labels
* You can now watch in real time as each true station lights up a blue circle + name.
---
### Next Steps
* **Tune the strictness** of the matchShapes threshold if you ever see a mis-classification.
* **Extend** the same clean-mask approach to your line-detector (`detect_lines`) if you want to remove UI artifacts there as well.
* **Rock on**—your agent now has a rock-solid “world model” of every station on the map!
Let me know if you’d like code snippets or more on any particular tweak!
# 5/27
## Classifier
* Drop Serpent (for classification)
* Use Python Methods to classify stations
* OpenCV
* CNN to detect special stations
* Detect inventory & assets (carriages, interchanges, etc.)
* Detect Lines
* [Done] Detect context
## RL
??