# Router Hacking 2019-11-26
## Objective
* Have 4 new routers on new hardware, running bird2 on Alpine
* Have Balazs understand basics of routers
* Keep every downtime to at most 1 minute
* A 3rd person will watch the monitoring servers
* A 3rd person will watch pings from outside (server1.place4 and server1.place11)
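
The one-minute downtime objective can be checked against the outside pings. The following is only a minimal sketch, assuming the ping results are available as `(timestamp_in_seconds, success)` pairs; the function names and the sample format are hypothetical and not part of any existing tooling.

```python
from typing import Iterable, Tuple

MAX_DOWNTIME_SECONDS = 60  # objective above: at most 1 minute of downtime


def longest_outage(samples: Iterable[Tuple[float, bool]]) -> float:
    """Return the longest outage in seconds.

    An outage is measured from the last successful ping before a run of
    failures (or from the first failure, if nothing succeeded before it)
    until the next successful ping (or the last sample, if still down).
    """
    longest = 0.0
    last_ok = None        # timestamp of the most recent successful ping
    outage_start = None   # start of the outage currently in progress
    last_ts = None
    for ts, ok in samples:
        last_ts = ts
        if ok:
            if outage_start is not None:
                longest = max(longest, ts - outage_start)
                outage_start = None
            last_ok = ts
        elif outage_start is None:
            outage_start = last_ok if last_ok is not None else ts
    if outage_start is not None and last_ts is not None:
        # Still down at the end of the recording
        longest = max(longest, last_ts - outage_start)
    return longest


def within_objective(samples: Iterable[Tuple[float, bool]]) -> bool:
    """True if no outage exceeded the one-minute objective."""
    return longest_outage(samples) <= MAX_DOWNTIME_SECONDS
```

Measuring from the last successful ping is deliberately pessimistic: the real outage may have started later, but it cannot have started earlier.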
## Input Balazs
* Can we shift traffic to new routers?
* we need routes in kernel table
* we need bird
* do we need a default route (IPv4, IPv6)?
* no, if we have the whole internet in our kernel table
* we need a firewall first (!!!)
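
The default-route question comes down to longest-prefix matching: a default route is only consulted when no more specific route matches. A minimal sketch of that lookup, assuming a routing table given as a plain prefix-to-next-hop mapping; the prefixes (documentation ranges) and next-hop names are made up for illustration.

```python
import ipaddress
from typing import Dict, Optional


def lookup(table: Dict[str, str], destination: str) -> Optional[str]:
    """Return the next hop of the most specific matching route, or None."""
    dest = ipaddress.ip_address(destination)
    best = None  # (network, next_hop) of the longest match so far
    for prefix, next_hop in table.items():
        net = ipaddress.ip_network(prefix)
        if net.version != dest.version or dest not in net:
            continue
        if best is None or net.prefixlen > best[0].prefixlen:
            best = (net, next_hop)
    return best[1] if best else None


# Hypothetical table without a default route
routes = {
    "2001:db8::/32": "upstream-b",
    "2001:db8:1::/48": "upstream-a",
}
print(lookup(routes, "2001:db8:1::1"))  # most specific route wins
print(lookup(routes, "2001:dead::1"))   # no match and no default route
```

With the whole internet in the table, every routable destination has a matching prefix, so a `::/0` or `0.0.0.0/0` entry would never be selected. Without a full table, unmatched destinations have no route at all.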
## Steps to take in order
1. Ensure radvd is always running (__start_on_boot) on routers (Balazs in the lead) **done**
2. No dependency on keepalived
3. Update to the cdist type
4. Rename type
5. Add config files for testing
6. Apply config
3. Ensure we have replacement-router{1,2} in every place (Balazs, Sami in the lead) **DONE**
4. place6
5. Balazs and Sami locate 2 replacement routers
6. ~~2 new hardware boxes~~ (Sami found 2, 2x r710)
7. ~~Define / select suffixes~~ (if not already done)
8. ~~Install with Alpine~~
9. both:
10. ~~update DNS and hostnames of the replacement-routers~~
11. Status
12. Both are reachable via copper (for bootstrapping) @ replacement-router[1,2].intern.place6.ungleich.ch
5. Ensure network addresses are correctly configured on replacement-router* (Balazs in the lead) **CURRENT**
2. ~~ensure every router has a suffix assigned and reachable by network **TODAY**~~
3. Ensure that cdist config is suitable for router{1..4}.placeX
4. ~~Ensure that replacement-router{1,2} don't have UPSTREAM IPs (no sunrise, init7, netstream,...)~~
5. ~~Update the wiki accordingly (-> can we put this into netbox???)~~
6. Status
7. Need to create config
8. Next steps afterwards
9. Deploy config && restart
6. Add bird to replacement-router{1,2} in place5 (Nico in the lead, Balazs) **TODAY**
7. ~~Need from Balazs: 2x /48 (one place5, one place6)~~
2. ~~easy to phase in, if we don't announce routes, but only receive routes (kind of safe)~~
3. Prepare the cdist configuration for it / double check it
4. ~~add a config snippet for replacement-router* which does not announce~~
5. ~~add a config snippet to router-* to not accept routes from them~~
5. configure existing routers to PEER with replacement-router{1,2} in each place
6. configure existing routers to DROP all routes from replacement-router{1,2}
7. ensure nft is correctly deployed and running on router3/4.placeX (Balazs) **TODAY**
2. cdist config + nft list ruleset -a
3. Clarify multi protocol BGP
4. Can we (always) route via IPv6 on IPv6 only routers?
5. Can we keep the config 1:1 and on productive routers we get upstream info via 2 peers?
8. Configure radvd on replacement-routers (Balazs) **TODAY**
9. New cdist type
4. Only for the test vlans
5. Define the networks
9. Test the routing on the replacement routers (**Balazs**)
4. create test vlan
5. With RAs only from replacement routers
6. One VLAN per place
7. Assign a /64 per place
6. How to configure clients for this VLAN
7. To be defined by Balazs (<5 min + 15 min task: 5 thinking, 15 doing)
8. generate some traffic
9. Ensure routers survive
12. Move netboot stuff (Nico, **Balazs, Sami**)
13. Remove IPv4
14. Create USB sticks for every server (ipxe, Sami)
15. Add dhcpv6 for netbooting
16. Reboot servers w/ the usb stick
17. Clarify the image location issue ???
18. Test netbooting with APUs
14. Move NAT64 (Nico, Balazs)
14. Configure and use jool
15. With test clients
16. From the new test VLANs
17. Need to test the synchronisation of joold
18. Routers need to receive traffic for a certain IPv4 address (also double assigned)
17. DNS + DNS64 (Balazs) **TODAY**
18. Configure bind
19. Should be usable on all routers
20. New NAT64 prefixes
21. New one in both places
22. Taking from existing /48
16. Redundancy
11. If one router goes down ...
12. 2x hardware (servers, cables, switches)
13. 2x all IP addresses (assigned on the "bond.X device", no keepalived)
14. disable dad for that
15. 2x DNS servers
16. 2x NAT64
17. Move v4 stuff -> what is stuff?
11. maybe dhcpv4 (not sure)
15. Finally, replace the existing routers with replacement-routers
13. Disconnect one of the old routers (keep running, but disconnected)
14. Disconnect by shutting down the network ports on the arista
15. LABEL the router to NOT reconnect unless you know what you are doing
16. Remove the keepalived stuff from cdist
17. So routers don't get this anymore
18. Replace it with the IP being statically added to the network interface
18. Reconfigure one of the replacement-routers to become the disconnected router
19. Every step here will be communicated in the infrastructure channel of our chat
17. create an /etc/hosts entry with the DISCONNECTED router's name pointing to the replacement-router's IP
18. And configure it with cdist
19. gets new hostname
20. gets new network config
21. gets new bird config
22. ...
23. Need to be in front of the console and reboot the server for this
24. Should also be connected to the Arista, to be able to shut down the new router's ports if needed
25. Need to monitor traffic before applying (Sami)
26. Logged into server1.place4, server1.place11 (ping4, ping6, mtr -4, mtr -6)
27. all started in a tmux
28. Checks after reboot
29. bird up and running?
30. bird peered?
31. routes in kernel?
32. nft up and running?
33. dns up and running?
34. virtual ip addresses present on the right interfaces?
35. traffic being forwarded?
36. reachable from outside (v6, v4)?
16. Cleanup tasks for the next days
14. Remove routers / put them on a pile for future use
15. Re-enable the ports on the arista
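
The "bird up and running?" and "bird peered?" checks after reboot could be partly automated. The following is only a sketch, assuming the usual column layout of `birdc show protocols` (Name, Proto, Table, State, Since, Info); the sample output and the protocol names in it are hand-written for illustration, not captured from a real router.

```python
from typing import List

# Hand-written sample in the layout of `birdc show protocols`;
# the protocol names are hypothetical.
SAMPLE_OUTPUT = """\
BIRD 2.0.7 ready.
Name       Proto      Table      State  Since         Info
device1    Device     ---        up     2019-11-26
kernel6    Kernel     master6    up     2019-11-26
upstream1  BGP        ---        up     2019-11-26    Established
upstream2  BGP        ---        start  2019-11-26    Active        Socket: Connection refused
"""


def bgp_sessions_not_established(birdc_output: str) -> List[str]:
    """Return the names of BGP protocols that are not up and Established."""
    problems = []
    for line in birdc_output.splitlines():
        fields = line.split()
        # Data rows carry at least: name, proto, table, state, since.
        # The banner and the header row are skipped by the proto check.
        if len(fields) < 5 or fields[1] != "BGP":
            continue
        name, state = fields[0], fields[3]
        if state != "up" or "Established" not in fields[4:]:
            problems.append(name)
    return problems


print(bgp_sessions_not_established(SAMPLE_OUTPUT))
```

On a router, the input would come from running `birdc show protocols`, for example via `subprocess.run(["birdc", "show", "protocols"], capture_output=True, text=True).stdout`. An empty result only means that every configured BGP session is established; the kernel routes, nft, DNS and virtual IP checks still need to be done separately.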
## Changes after the migration
* IPv6-only netbooting
## Queue / stuff to pick
* ensure router{3,4}.place6 is ready to be used with Alpine
* status of router1.place6
* maybe call new routers "replacement-routerY", Y = {1,2}
## Long term queue
* Make 2a0a:e5c0::/64 (or similar) a "virtual network"
* Or use a different one ???
* Ensure the netboot image also finds its way to the routers
* Ensure that 100% of stuff is in cdist of the routers and replacement-routers
* Reinstall routers once per month
## Decisions
* No VMs for replacement routers at the moment
* Add DNS64 based unbound server
* Everything has to be in cdist before we replace them... logically
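
For the DNS64 decision and the new NAT64 prefixes from the steps above, it helps to see how a DNS64 server builds an AAAA answer: the IPv4 address is embedded in the low 32 bits of a /96 NAT64 prefix (RFC 6052). A minimal sketch; `2001:db8:0:64::/96` is a documentation prefix standing in for whatever is carved out of the real /48, and the function name is hypothetical.

```python
import ipaddress


def synthesize_nat64(prefix: str, ipv4: str) -> ipaddress.IPv6Address:
    """Embed an IPv4 address into a /96 NAT64 prefix (RFC 6052 layout)."""
    net = ipaddress.ip_network(prefix)
    if net.version != 6 or net.prefixlen != 96:
        raise ValueError("this sketch only handles /96 NAT64 prefixes")
    v4 = ipaddress.IPv4Address(ipv4)
    # The IPv4 address occupies the last 32 bits of the IPv6 address
    return ipaddress.IPv6Address(int(net.network_address) | int(v4))


# Well-known prefix from RFC 6052
print(synthesize_nat64("64:ff9b::/96", "192.0.2.33"))
# Hypothetical site-specific prefix (documentation range)
print(synthesize_nat64("2001:db8:0:64::/96", "192.0.2.1"))
```

Because each place gets its own NAT64 prefix, the DNS64 server in a place has to synthesize addresses inside the prefix that the local jool instance translates.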