I believe that this problem is caused by Rachio setting the BROADCAST flag in its DHCPDISCOVER (and DHCPREQUEST) packets. The flag asks the DHCP server to send its reply as a broadcast packet. It would normally be used by older devices that are incapable of receiving unicast packets before the protocol stack has been set up. IMO, it’s very unlikely that Rachio is one of them; if they simply turned the bit off (in new firmware releases) the problem would be permanently fixed. However, if their hardware really does have this limitation, we need to do something at the customer end. See
Broadcast packets are expensive on Wi-Fi for several reasons. Among others, they must be sent from all APs on the LAN, as well as at a low speed. There is some useful detail here:
Some APs ‘handle’ the issue with a ‘block LAN to WLAN broadcast’ option, which is on by default. If that’s your case (the controller eventually hears the reply from a more distant AP), turning this off should fix your problem.
I’m not familiar with EdgeOS, but mapping the Rachio to a static IP may cause the router to issue an indefinite or much longer lease, which may mitigate or eliminate the trouble.
If neither of the above is applicable, and assuming that you have far fewer devices than the subnet allows for, i.e. you have no need to reuse LAN addresses, consider setting the DHCP lease time to e.g. 90 days (and overriding any shorter time that the device requests). I’d expect this to greatly reduce connectivity loss, unless the requests you are seeing are a result of having lost association / authentication at the Wi-Fi level, rather than the lease having run out.
Another thought: If you have two or more APs on the same channel or on overlapping channels, it’s possible that they transmit the broadcast packet simultaneously and interfere with each other. There is no acknowledgement for broadcast, so they won’t know to retransmit.
It’s typical for a device to renew its lease when half of the lease time has elapsed. In theory, if the renewal fails it should keep using the current address until the lease actually expires, while periodically retrying the renewal. However, it wouldn’t surprise me if a failed renewal results in the device going offline.
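To put numbers on that, here is a rough sketch of the default client timers from RFC 2131: renewal (T1) at 50% of the lease and rebinding (T2) at 87.5%. The function name is made up for illustration, and a server can override both timers, so treat the output as the textbook defaults rather than what Rachio or EdgeOS actually does.

```python
from datetime import timedelta


def renewal_schedule(lease: timedelta) -> dict:
    """Default DHCP client timers per RFC 2131 (a server may override them)."""
    return {
        "renew_T1": lease * 0.5,     # client starts unicast renewal attempts
        "rebind_T2": lease * 0.875,  # client falls back to broadcast rebinding
        "expires": lease,            # address must be given up if nothing worked
    }


if __name__ == "__main__":
    for days in (1, 90):
        schedule = renewal_schedule(timedelta(days=days))
        print(f"{days}-day lease:", {k: str(v) for k, v in schedule.items()})
```

With a 1-day lease the device has to get a renewal through every 12 hours; with a 90-day lease the first attempt is 45 days out, which is why a long lease should hide the problem for a while even if it does not fix it.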
Try configuring a static IP mapping for the device and see what lease time results; it might be very long. If not but you have no need to reuse addresses, try setting the lease time to 90 days or perhaps a year and see whether Rachio stays connected longer.
I’ll try moving it to the other SSID. I don’t entirely understand why it wouldn’t work as-is though, since the VLAN should be invisible to the device (I also have an EAP225) as the DHCP server is on the same subnet. But who knows at this point… The AP is plugged directly into the ER-X, so no intermediate switch (only the virtual one within the ER-X itself).
Other things I’m considering trying:
Switch the IOT VLAN DHCP server to Pi-hole instead of the built-in one. See if it responds differently.
Swap the VLANs over so the ‘default’ VLAN is actually the IOT one.
I suspect that the broadcast flag in the DISCOVER is superfluous, i.e. a unicast OFFER would be properly received and processed… This is based on my general experience with embedded systems, nothing specific to Rachio.
So if there is a problem with the AP sending the OFFER or the controller receiving it, getting the DHCP server to ignore the flag may be a solution. I believe this could be done with an iptables rule and netsed. See
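For anyone curious what such a rewrite amounts to, it is a one-bit change in the DHCP message. The sketch below is not the iptables/netsed setup itself, just a Python illustration of the edit on the raw UDP payload; the function name is hypothetical. The flags field sits at byte offset 10 of the BOOTP header and the broadcast bit is the top bit (0x8000).

```python
import struct

BROADCAST_FLAG = 0x8000  # top bit of the 16-bit BOOTP "flags" field
FLAGS_OFFSET = 10        # op, htype, hlen, hops (4 bytes) + xid (4) + secs (2)


def clear_broadcast_flag(payload: bytes) -> bytes:
    """Return a copy of a DHCP UDP payload with the broadcast bit cleared."""
    if len(payload) < FLAGS_OFFSET + 2:
        raise ValueError("too short to be a BOOTP/DHCP message")
    (flags,) = struct.unpack_from("!H", payload, FLAGS_OFFSET)
    patched = bytearray(payload)
    # Keep any other flag bits as they were; only drop the broadcast bit.
    struct.pack_into("!H", patched, FLAGS_OFFSET, flags & ~BROADCAST_FLAG & 0xFFFF)
    return bytes(patched)


if __name__ == "__main__":
    # Made-up 12-byte header prefix: BOOTREQUEST, Ethernet, xid 1, flags 0x8000.
    discover = struct.pack("!BBBBIHH", 1, 1, 6, 0, 1, 0, BROADCAST_FLAG)
    print(clear_broadcast_flag(discover).hex())
```

One caveat: anything that edits the payload in flight at packet level also has to recompute the UDP checksum (or zero it, which IPv4 allows), otherwise the server will drop the modified packet.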
Well in trying to move SSIDs the controller totally bricked itself! Won’t boot and factory reset failed too.
Will have to wait a few days to get a replacement (Rachio chat very responsive where email less so…) before trying anything else!
I’m loath to abandon the IOT VLAN altogether because of course it’s best practice and all my other devices are having no problems at all; but maybe I could be less controlling and allow some ‘well behaved’ devices onto the main LAN, leaving only less trustworthy manufacturers etc. on the IOT VLAN.
Stewart - While I understood that thread on a theoretical level, it’s way above my current ability to implement. And of course I don’t think that we should have to resort to that in the first place!
I may spin up a Netshark VM to see if I can monitor the DHCP requests.
Simpler than that, as @Gene noted, just run Wireshark on your PC (while connected to the IoT SSID).
Since your logs have already shown that the DHCP OFFER packets are broadcast, if the AP is sending them correctly they should show up in your capture.
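If you export the UDP payload of a captured packet (ports 67/68) as raw bytes, a few lines of Python are enough to check the fields that matter here. This is only a sketch: the function name, the sample IP and the sample MAC are invented for illustration, and it reads just the fixed BOOTP header, not the DHCP options.

```python
import socket
import struct

BROADCAST_FLAG = 0x8000  # top bit of the 16-bit BOOTP "flags" field
HEADER_FMT = "!BBBBIHH4s4s4s4s16s"  # op .. chaddr, 44 bytes in total


def summarize_dhcp(payload: bytes) -> dict:
    """Decode the fixed BOOTP header fields from a DHCP UDP payload."""
    size = struct.calcsize(HEADER_FMT)
    if len(payload) < size:
        raise ValueError("too short to be a BOOTP/DHCP message")
    (op, _htype, hlen, _hops, xid, _secs, flags,
     _ciaddr, yiaddr, _siaddr, _giaddr, chaddr) = struct.unpack(
        HEADER_FMT, payload[:size])
    return {
        "direction": "request" if op == 1 else "reply",
        "xid": xid,  # transaction id, matches a reply to its request
        "broadcast": bool(flags & BROADCAST_FLAG),
        "offered_ip": socket.inet_ntoa(yiaddr),
        "client_mac": chaddr[:hlen].hex(":"),
    }


# Made-up OFFER header: reply, broadcast flag set, 192.168.20.50 offered.
sample = struct.pack(
    HEADER_FMT, 2, 1, 6, 0, 0x12345678, 0, BROADCAST_FLAG,
    bytes(4), socket.inet_aton("192.168.20.50"), bytes(4), bytes(4),
    bytes.fromhex("aabbccddeeff") + bytes(10))

if __name__ == "__main__":
    print(summarize_dhcp(sample))
```

Matching the `xid` of a reply to the controller’s DISCOVER tells you whether the OFFER you see in the capture is really the answer to the Rachio’s request.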
At one point when I had issues, I tried everything I could think of to troubleshoot my TP-Link APs and my firewall/router. I went as far as installing an Omada controller (TP-Link’s centralized management for multiple APs), but at the end of the day the trouble went away after rebooting a single managed switch (no settings changed on the switch itself).
No, but set up a second mobile phone as a hotspot. Configure Rachio to connect to that. Set up a fixed schedule with all smart features disabled. Run it manually to confirm that the system is working properly. You can now take the hotspot down and Rachio will run indefinitely without Wi-Fi, operating as a dumb timer.
Hey. I took a look at that thread but I think the customer stumbled across a fix by effectively rebooting the WAP while changing various settings, and the Rachio connected. This definitely isn’t a root cause and I have confirmed that the setting doesn’t change anything.
The root cause is the DHCP broadcast packet that the Rachio is sending out, albeit somehow coupled with something on the TP-Link WAPs that doesn’t like handling it across VLANs.