# Script to check if OSPF is functioning. It relies on the fact that each
# adjacent neighbor will be reachable via its router-id, and that if we're
# unable to exchange routes for whatever reason, the router-id will be
# unreachable. By default a router that comes up will not permit itself to be
# rebooted until OSPF has come up at least once. In the failure cases we've
# seen, the failing side reports neighbor state Full while the working side
# reports Exchange. So this check should prevent rebooting core routers, but
# alas, I'm not 100% confident of that.
#
# The logic here reboots if we're in a bad state with at least one peer.
# Possibly this should be negated to only reboot if no peers are in a good
# state and at least one is in a bad state (more complex though).
# Reboot possibility will get enabled once at least one peer has managed to
# come up successfully.
#
# In case of multiple OSPF instances, if any one of them is functioning we move
# towards the mayreboot state, but we will only restart non-functioning
# instances.

:if ([/file find name=ospfstatus.txt] = "") do={
    :put "ospf status file doesn't exist - creating."
    /file print file=ospfstatus
    /file set [/file find name="ospfstatus.txt"] contents="no"
    :put "Done, please re-run the script."
    # Continuing with the rest of the script is pointless as our view of /file is
    # a snapshot in spite of /file set ... which above sets it to some arbitrary
    # content (looks like a file list).
} else={
    :local mayreboot [/file get ospfstatus.txt contents]
    # After initial set the file content is completely bogus ...
    :if ($mayreboot != "yes" && $mayreboot != "no") do={
        :set mayreboot "no"
    }

    :put "Checking OSPF (mayreboot=$mayreboot) ..."
    :foreach n in=[/routing ospf neighbor find where state="Full"] do={
        :local loopback [/routing ospf neighbor get $n router-id]
        # remoteaddress was printed below but never defined; read it from the
        # neighbor entry so the message shows the peer's address.
        :local remoteaddress [/routing ospf neighbor get $n address]
        :put "Remote OSPF $loopback @ $remoteaddress"
        /ip route check $loopback once do={
            :if ($status != "failed") do={
                :put "OSPF is functioning correctly."
                # First success since the last restart: allow future restarts.
                :if ($mayreboot != "yes") do={
                    :log info "OSPF restored - restoring"
                    /file set ospfstatus.txt contents="yes"
                    /tool fetch keep-result=no url="https://slack.com/api/chat.postMessage?token=xoxb-212578866007-F6mp189CQi8vHl1gPmeaxLLo&channel=wisp&text=OSPF%20on%20$[/system identity get name]%20restored&as_user=false&username=ospf-check"
                }
            } else={
                :put "OSPF is not functioning correctly."
                :if ($mayreboot = "yes") do={
                    # Restart only the instance this neighbor belongs to.
                    :local instancename [/routing ospf neighbor get $n instance]
                    /file set ospfstatus.txt contents="no"
                    :put "Restarting OSPF"
                    :log error "OSPF malfunctioned - restarting."
                    /routing ospf instance set [/routing ospf instance find name=$instancename] disabled=yes
                    :delay 10
                    /routing ospf instance set [/routing ospf instance find name=$instancename] disabled=no
                }
            }
        }
    }
}