# Router Hacking 2019-11-26

## Objective

* Have 4 new routers with bird2 and alpine and new hardware
* Have Balazs understand basics of routers
* Have maximum 1 minute downtimes
    * 3rd person will monitor monitoring servers
    * 3rd person will see pings from outside (server1.place4 and server1.place11)

## Input Balazs

* Can we shift traffic to new routers?
    * we need routes in kernel table
    * we need bird
    * do we need a default route (ipv4, ipv6)?
        * no, if we have the whole internet in our kernel table
    * we need a firewall first (!!!)

## Steps to take in order

1. Ensure radvd is always running (`__start_on_boot`) on routers (Balazs in the lead) **done**
    1. No dependency on keepalived
    2. Update to the cdist type
    3. Rename type
    4. Add config files for testing
    5. Apply config
2. Ensure we have replacement-router{1,2} in every place (Balazs, Sami in the lead) **DONE**
    1. place6
        1. Balazs and Sami locate 2 replacement routers
        2. ~~2 new hardware boxes~~ (Sami found 2, 2x r710)
        3. ~~Define / select suffixes~~ (if not already done)
        4. ~~Install with Alpine~~
        5. both: ~~update DNS and hostnames of the replacement-routers~~
        6. Status
            1. Both are reachable via copper (for bootstrapping) @ replacement-router[1,2].intern.place6.ungleich.ch
3. Ensure network addresses are correctly configured on replacement-router* (Balazs in the lead) **CURRENT**
    1. ~~ensure every router has a suffix assigned and reachable by network **TODAY**~~
    2. Ensure that cdist config is suitable for router{1..4}.placeX
    3. ~~Ensure that replacement-router{1,2} don't have UPSTREAM IPs (no sunrise, init7, netstream,...)~~
    4. ~~Update the wiki accordingly (-> can we put this into netbox???)~~
    5. Status
        1. Need to create config
    6. Next steps afterwards
        1. Deploy config && restart
4. Add bird to replacement-router{1,2} in place5 (Nico in the lead, Balazs) **TODAY**
    1. ~~Need from Balazs: 2x /48 (one place5, one place6)~~
    2. ~~easy to phase in, if we don't announce routes, but only receive routes (kind of safe)~~
    3. Prepare the cdist configuration for it / double check it
        1. ~~add a config snippet for replacement-router* which does not announce~~
        2. ~~add a config snippet to router-* to not accept routes from them~~
    4. configure existing routers to PEER with replacement-router{1,2} in each place
    5. configure existing routers to DROP all routes from replacement-router{1,2}
5. Ensure nft is correctly deployed and running on router3/4.placeX (Balazs) **TODAY**
    1. cdist config + nft list ruleset -a
    2. Clarify multi protocol BGP
        1. Can we (always) route via IPv6 on IPv6 only routers?
        2. Can we keep the config 1:1 and on productive routers we get upstream info via 2 peers?
6. Configure radvd on replacement-routers (Balazs) **TODAY**
    1. New cdist type
    2. Only for the test vlans
    3. Define the networks
7. Test the routing on the replacement routers (**Balazs**)
    1. create test vlan
        1. With RAs only from replacement routers
        2. One VLAN per place
        3. Assign a /64 per place
    2. How to configure clients for this VLAN
        1. To be defined by Balazs (<5m+15m task, 5 thinking, 15 doing)
    3. generate some traffic
    4. Ensure routers survive
8. Move netboot stuff (Nico, **Balazs, Sami**)
    1. Remove IPv4
    2. Create USB sticks for every server (ipxe, Sami)
    3. Add dhcpv6 for netbooting
    4. Reboot servers w/ the usb stick
    5. Clarify the image location issue ???
    6. Test netbooting with APUs
9. Move NAT64 (Nico, Balazs)
    1. Configure and use jool
        1. With test clients
        2. From the new test VLANs
    2. Need to test the synchronisation of joold
    3. Routers need to receive traffic for a certain IPv4 address (also double assigned)
10. DNS + DNS64 (Balazs) **TODAY**
    1. Configure bind
        1. Should be usable on all routers
    2. New NAT64 prefixes
        1. New one in both places
        2. Taking from existing /48
11. Redundancy
    1. If one router goes down ...
        1. 2x hardware (servers, cables, switches)
        2. 2x all IP addresses (assigned on the "bond.X device", no keepalived)
            1. disable dad for that
        3. 2x DNS servers
        4. 2x NAT64
12. Move v4 stuff -> what is stuff?
    1. maybe dhcpv4 (not sure)
13. Finally, replace the existing routers with replacement-routers
    1. Disconnect one of the old routers (keep running, but disconnected)
        1. Disconnect by shutting down the network ports on the arista
        2. LABEL the router to NOT reconnect unless you know what you are doing
    2. Remove the keepalived stuff from cdist
        1. So routers don't get this anymore
        2. Replace it with the IP being statically added to the network interface
    3. Reconfigure one of the replacement-routers to become the disconnected router
        1. Every step here will be communicated in the infrastructure channel of our chat
        2. create a /etc/hosts entry with the DISCONNECTED router's name pointing to the replacement-router's IP
        3. And configure it with cdist
            1. gets new hostname
            2. gets new network config
            3. gets new bird config
            4. ...
        4. Need to be in front of the console and reboot the server for this
            1. Also should be attached to arista to possibly kill the ports of the new router
        5. Need to monitor traffic before applying (Sami)
            1. Logged into server1.place4, server1.place11 (ping4, ping6, mtr -4, mtr -6)
            2. all started in a tmux
    4. Checks after reboot
        1. bird up and running?
        2. bird peered?
        3. routes in kernel?
        4. nft up and running?
        5. dns up and running?
        6. virtual ip addresses present on the right interfaces?
        7. traffic passed on?
        8. reachable from outside (v6,v4)
14. Cleanup tasks for the next days
    1. Remove routers / put them on a pile for future use
    2. Re-enable the ports on the arista

## Changes after the migration

* IPv6 only based netbooting

## Queue / stuff to pick

* ensure router{3,4}.place6 is ready to be used with Alpine
* status of router1.place6
* maybe call new routers "replacement-routerY", Y = {1,2}

## Long term queue

* Make 2a0a:e5c0::/64 (or similar) a "virtual network"
    * Or use a different one ???
* Ensure the netboot image also finds its way to the routers
* Ensure that 100% of stuff is in cdist of the routers and replacement-routers
* Reinstall routers once per month

## Decisions

* No VMs for replacement routers at the moment
* Add DNS64 based unbound server
* Everything has to be in cdist before we replace them... logically
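
## Sketch: scripting the checks after reboot

The "Checks after reboot" list in the replacement step could be partly scripted so that the person at the console gets a quick OK/FAIL overview. The following is only a minimal sketch under assumptions: the command lines (`birdc show status`, `birdc show protocols`, `ip -6 route show`, `nft -a list ruleset`, `ip addr show`) are our guess at suitable probes, the names `CHECKS`, `run_check` and `run_all` are hypothetical, and a check only passes when the command exits with 0 and prints something. It does not verify that BGP sessions are established or that traffic is passed on; those still need a human looking at the output and at the pings from server1.place4 and server1.place11.

```python
import subprocess

# Assumed mapping from checklist questions to probe commands (not from the notes).
CHECKS = [
    ("bird up and running?", ["birdc", "show", "status"]),
    ("bird peered?", ["birdc", "show", "protocols"]),
    ("routes in kernel?", ["ip", "-6", "route", "show"]),
    ("nft up and running?", ["nft", "-a", "list", "ruleset"]),
    ("virtual ip addresses present?", ["ip", "addr", "show"]),
]


def run_check(cmd, runner=subprocess.run):
    """Run one probe; return (ok, output).

    ok is True only if the command exits with 0 and prints non-empty output.
    A missing tool or a timeout counts as a failure.
    """
    try:
        result = runner(cmd, capture_output=True, text=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return False, str(exc)
    ok = result.returncode == 0 and bool(result.stdout.strip())
    return ok, result.stdout


def run_all(checks=CHECKS, runner=subprocess.run):
    """Run all probes and return {question: ok}."""
    results = {}
    for question, cmd in checks:
        ok, _output = run_check(cmd, runner)
        results[question] = ok
    return results
```

On a router, something like `for q, ok in run_all().items(): print("OK  " if ok else "FAIL", q)` would print the overview. The `runner` parameter exists only so the logic can be exercised without bird or nft installed.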