Rachio Gen 3 frequently offline / DHCP failure

Hi.

I have emailed the support box twice about this but not received any reply whatsoever. Posting here for visibility. The issue persists and now when the device waters while offline it doesn’t show properly in the calendar (although it does appear in history).

I’m having issues with my Rachio Gen3 sporadically not picking up an IP address from the DHCP server. This manifests as the controller going offline for extended periods of time. Sometimes the only way to bring it back is to power cycle it and maybe the WiFi AP as well.

Very similar to the problem experienced by this user:

DHCP Logs from Router:
17:52:04.236812 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 58:d5:0a:bd:27:b0, length 300
17:52:05.238235 IP 192.168.99.1.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
17:52:52.425803 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 58:d5:0a:bd:27:b0, length 300
17:52:52.426611 IP 192.168.99.1.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
17:52:57.425302 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 58:d5:0a:bd:27:b0, length 300
17:52:57.426086 IP 192.168.99.1.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
17:53:05.425515 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 58:d5:0a:bd:27:b0, length 300
17:53:05.426320 IP 192.168.99.1.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
17:53:19.027327 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 58:d5:0a:bd:27:b0, length 300
17:53:19.028084 IP 192.168.99.1.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
17:53:23.021558 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 58:d5:0a:bd:27:b0, length 300
17:53:23.022348 IP 192.168.99.1.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
17:53:32.027259 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 58:d5:0a:bd:27:b0, length 300
17:53:32.027983 IP 192.168.99.1.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
17:54:21.015560 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 58:d5:0a:bd:27:b0, length 300
17:54:21.016414 IP 192.168.99.1.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
17:55:26.008931 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 58:d5:0a:bd:27:b0, length 300
17:55:27.010392 IP 192.168.99.1.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
17:56:10.162058 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 58:d5:0a:bd:27:b0, length 300
17:56:10.162816 IP 192.168.99.1.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
17:56:15.155670 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 58:d5:0a:bd:27:b0, length 300
17:56:15.156381 IP 192.168.99.1.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
17:56:23.154885 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 58:d5:0a:bd:27:b0, length 300
17:56:23.155615 IP 192.168.99.1.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
17:56:36.317638 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 58:d5:0a:bd:27:b0, length 300
17:56:36.318363 IP 192.168.99.1.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
17:56:40.314144 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 58:d5:0a:bd:27:b0, length 300
17:56:40.314956 IP 192.168.99.1.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
17:56:49.314132 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 58:d5:0a:bd:27:b0, length 300
17:56:49.315043 IP 192.168.99.1.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
17:57:05.311458 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 58:d5:0a:bd:27:b0, length 300
17:57:05.312279 IP 192.168.99.1.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
17:57:37.312160 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 58:d5:0a:bd:27:b0, length 300
17:57:37.312941 IP 192.168.99.1.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
17:58:41.301388 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 58:d5:0a:bd:27:b0, length 300
17:58:42.302848 IP 192.168.99.1.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
17:59:26.905857 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 58:d5:0a:bd:27:b0, length 300
17:59:26.906590 IP 192.168.99.1.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
17:59:26.909426 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 58:d5:0a:bd:27:b0, length 300
17:59:26.910153 IP 192.168.99.1.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
*No further requests after this point. Controller came back online but had been offline on/off sporadically for several days.

You can see that within a very short space of time the device requested a DHCP address multiple times, and the router sent a response each time.
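To put numbers on that, here is a minimal Python sketch that parses tcpdump-style lines like the ones above and reports how many requests and replies were seen, plus the gap between consecutive requests. The regular expression and the function names (`parse`, `summarize`) are my own, written against the log format shown in this post, and timestamps are assumed to fall within a single day.

```python
import re
from datetime import datetime

# Matches lines such as:
# 17:52:04.236812 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from ...
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}:\d{2}:\d{2}\.\d{6}) IP (?P<src>\S+) > (?P<dst>\S+): "
    r"BOOTP/DHCP, (?P<kind>Request|Reply)"
)


def parse(lines):
    """Return a list of (timestamp, kind) tuples for every DHCP line found."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m.group("ts"), "%H:%M:%S.%f")
            events.append((ts, m.group("kind")))
    return events


def summarize(events):
    """Count requests and replies and compute seconds between requests."""
    requests = [ts for ts, kind in events if kind == "Request"]
    replies = [ts for ts, kind in events if kind == "Reply"]
    gaps = [(b - a).total_seconds() for a, b in zip(requests, requests[1:])]
    return {"requests": len(requests), "replies": len(replies), "request_gaps_s": gaps}


# First four lines of the router log quoted above.
SAMPLE = """\
17:52:04.236812 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 58:d5:0a:bd:27:b0, length 300
17:52:05.238235 IP 192.168.99.1.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
17:52:52.425803 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 58:d5:0a:bd:27:b0, length 300
17:52:52.426611 IP 192.168.99.1.67 > 255.255.255.255.68: BOOTP/DHCP, Reply, length 300
"""

summary = summarize(parse(SAMPLE.splitlines()))
print(summary)
```

On those four lines it reports two requests, two replies, and a gap of roughly 48.2 seconds between the requests. Feeding it the full log would show the same pattern: every request is answered, yet the device keeps asking.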

Also attached -

  1. Screenshots of the WiFi AP page showing that the Rachio is connected to the AP, but is not accepting an IP and is therefore falling back to a 169.254.x.x (link-local) address. This also shows that the signal strength is -69 dBm, which is strong enough for a reliable connection. The controller is close to a garage door opener device which does not have any of the same connectivity issues.

  2. Screenshot from router showing the DHCP address leased to the device (should it decide to accept it); and

  3. Another screenshot from the AP a short while later this morning, when the device finally accepted the IP and came back online.
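For anyone scripting a check against their AP's client list, the 169.254.x.x symptom can be detected with Python's standard `ipaddress` module. This is only a sketch: the helper name `has_usable_lease` and the example addresses are made up for illustration (192.168.99.50 is simply an arbitrary address in the same subnet as the router in the logs above).

```python
import ipaddress


def has_usable_lease(addr: str) -> bool:
    """Return False for link-local (169.254.0.0/16) or unspecified (0.0.0.0) addresses.

    A client that shows up with one of these has not accepted a DHCP lease.
    """
    ip = ipaddress.ip_address(addr)
    return not (ip.is_link_local or ip.is_unspecified)


# Hypothetical addresses as they might appear in an AP client table.
for candidate in ("169.254.10.20", "192.168.99.50"):
    print(candidate, has_usable_lease(candidate))
```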

Any advice you can offer would be appreciated. Let me know if you’d like me to run any tests on my end or grant you access to logs etc.

I see you are using a TPlink AP, are you running it off of a managed switch by any chance?
Also, I find it useful to explicitly allow traffic to / from broadcast address 255.255.255.255 with firewall rules.

Thank you for replying.

The AP is plugged directly into the router, which is a UBNT device, so depending on how you think of it you might call it a managed switch?

I do not suspect any firewall issues because I can see the DHCP packets coming in and leaving the router, and because literally no other device behaves this way.

Is there a debugging log we can pull to see what the Rachio sees on its end?

Do you have IoT SSID on a different / dedicated VLAN?

Yes, I do. The tagging is done by the AP and the router has two DHCP servers running, one for each VLAN. The Rachio service itself would not have knowledge of any other network or even that it is on a ‘special’ VLAN.

Unfortunately I’ve never used an EdgeRouter myself (I prefer pfSense), but it seems there are some issues with the switch0 (default) interface and other VLANs. See if this video helps:

TLDR: looks like this is a known issue (# 10 here), where you’ll need to change your default VLAN from switch0 to switch0.1 in order to use additional VLAN tags.

I believe that this problem is caused by Rachio using the BROADCAST flag in its DHCPDISCOVER (and DHCPREQUEST) packets. This requests the DHCP server to send the reply as a broadcast packet. It would normally be used by older devices that are incapable of receiving unicast packets before the protocol stack has been set up. IMO, it’s very unlikely that Rachio is one of them; if they simply turned the bit off (in new firmware releases) the problem would be permanently fixed. However, if their hardware really does have this limitation, we need to do something at the customer end. See


or see https://tools.ietf.org/html/rfc2131 for the gory details.
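To make the flag concrete: in the fixed BOOTP/DHCP header defined by RFC 2131, the 16-bit `flags` field sits at byte offset 10 and the BROADCAST bit is its most significant bit (0x8000). The Python sketch below builds a bare 236-byte header and tests that bit. The helper names are my own, the header carries no DHCP options, and the MAC is the one from the logs earlier in this thread; it illustrates the field layout and says nothing about how the Rachio firmware actually builds its packets.

```python
import struct

BROADCAST_FLAG = 0x8000  # RFC 2131: leftmost bit of the 16-bit flags field


def build_bootp_header(mac: bytes, xid: int, broadcast: bool) -> bytes:
    """Build the fixed 236-byte BOOTP header of a client request (no options)."""
    flags = BROADCAST_FLAG if broadcast else 0
    return struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1,            # op: BOOTREQUEST
        1,            # htype: Ethernet
        6,            # hlen: MAC address length
        0,            # hops
        xid,          # transaction ID
        0,            # secs
        flags,        # flags (bit 15 = BROADCAST)
        bytes(4),     # ciaddr
        bytes(4),     # yiaddr
        bytes(4),     # siaddr
        bytes(4),     # giaddr
        mac,          # chaddr (zero-padded to 16 bytes by struct)
        b"",          # sname (zero-padded to 64 bytes)
        b"",          # file (zero-padded to 128 bytes)
    )


def wants_broadcast_reply(packet: bytes) -> bool:
    """Return True if the BROADCAST bit is set in the flags field."""
    (flags,) = struct.unpack_from("!H", packet, 10)
    return bool(flags & BROADCAST_FLAG)


mac = bytes.fromhex("58d50abd27b0")
with_flag = build_bootp_header(mac, xid=0x12345678, broadcast=True)
without_flag = build_bootp_header(mac, xid=0x12345678, broadcast=False)
print(wants_broadcast_reply(with_flag), wants_broadcast_reply(without_flag))
```

In a packet capture, this same bit is what you would look for to confirm whether the controller is really asking for broadcast replies.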

Broadcast packets are expensive on Wi-Fi for several reasons. Among others, they must be sent from all APs on the LAN, as well as at a low speed. There is some useful detail here:

Some APs ‘handle’ the issue with a ‘block LAN to WLAN broadcast’ option, which is on by default. If that’s your case (the controller eventually hears the reply from a more distant AP), turning this off should fix your problem.

I’m not familiar with EdgeOS, but mapping the Rachio to a static IP may cause the DHCP server to issue an indefinite or much longer lease, which may mitigate or eliminate the trouble.

If neither of the above is applicable, and assuming that you have far fewer devices than the subnet allows for, i.e. you have no need to reuse LAN addresses, consider setting the DHCP lease time to e.g. 90 days (and overriding any shorter time that the device requests). I’d expect this to greatly reduce connectivity loss, unless the requests you are seeing are a result of having lost association / authentication at the Wi-Fi level, rather than the lease having run out.

Interesting, I was always wondering why Rachio was affected whereas others seemingly were not. The broadcast flag may explain it.

Another thought: If you have two or more APs on the same channel or on overlapping channels, it’s possible that they transmit the broadcast packet simultaneously and interfere with each other. There is no acknowledgement for broadcast, so they won’t know to retransmit.

Hey folks.

Thanks for the replies. Been seeing how things have gone this last week.

A few notes:

  1. I have just one AP, so no echoes or overlapping channels.

  2. The AP does not have a toggle for LAN-to-WLAN broadcast.

  3. I updated the DHCP lease time to 7 days as an experiment. It was up for a few days but has been offline now since Saturday - over 3 days.

  4. VLANs are set up properly (always had been) in accordance with that thread.

Any more ideas? It really sounds like the Rachio firmware, which uses a configuration that literally no other device does, is at fault here.

It’s typical for a device to renew its lease when half expired. In theory, if the renewal fails it should keep using the current address, while periodically attempting renewal. However, it wouldn’t surprise me if a failed renewal resulted in the device going offline.
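As a worked example of that timing: RFC 2131 gives default values of T1 = 0.5 × lease (start renewing with the original server) and T2 = 0.875 × lease (start rebinding with any server). The small Python sketch below applies those defaults to the lease times mentioned in this thread. The function name `dhcp_timers` is my own, a server can override both timers with DHCP options, and whether the Rachio firmware follows the defaults is an assumption.

```python
from datetime import timedelta


def dhcp_timers(lease: timedelta, t1_factor: float = 0.5, t2_factor: float = 0.875):
    """Return (T1, T2) using the RFC 2131 default fractions of the lease time."""
    return lease * t1_factor, lease * t2_factor


# Lease times discussed in this thread: 7 days (tried) and 90 days (suggested).
for days in (7, 90):
    t1, t2 = dhcp_timers(timedelta(days=days))
    print(f"{days}-day lease: renew after {t1}, rebind after {t2}")
```

With a 7-day lease this gives a first renewal attempt after 3 days 12 hours and a rebind attempt after 6 days 3 hours, so a device that handles a failed renewal badly could plausibly drop off a few days after getting its lease.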

Try configuring a static IP mapping for the device and see what lease time results; it might be very long. If not but you have no need to reuse addresses, try setting the lease time to 90 days or perhaps a year and see whether Rachio stays connected longer.

Oops forgot to add that to the list…

  5. Mapped a static IP for the device. The result is still the same as above.

Sorry! I know you’re trying to help :slight_smile:

I’m actually the one you’ve quoted as having a similar problem. I agree it is frustrating, especially considering that a static IP would easily solve the issue.

In my case, root cause ended up being a managed switch between the DHCP server & the TpLink APs (I have two EAP225s). Rebooting it has fixed the issue.

A static IP assignment at the DHCP server level is not the same as a static IP configured on the device: with the former, the device still has to obtain and renew its address from the DHCP server on a regular basis, whereas with the latter it never has to contact the DHCP server at all.

Have you tried to connect Rachio to non IoT network / default VLAN?

Thanks, Gene. I agree very frustrating.

I’ll try moving it to the other SSID. I don’t entirely understand why it wouldn’t work as-is though, since the VLAN should be invisible to the device (I also have an EAP225) as the DHCP server is on the same subnet. But who knows at this point… The AP is plugged directly into the ER-X, so no intermediate switch (only the virtual one within the ER-X itself).

Other things I’m considering trying:

  1. Switch the IOT VLAN DHCP server to Pi-hole instead of the built-in one. See if it responds differently.
  2. Swap the VLANs over so the ‘default’ VLAN is actually the IOT one.

You could also try running Wireshark on a computer connected to the same subnet/VLAN. See if you see the broadcast replies from your DHCP server.

I suspect that the broadcast flag in the DISCOVER is superfluous, i.e. a unicast OFFER would be properly received and processed… This is based on my general experience with embedded systems, nothing specific to Rachio.

So if there is a problem with the AP sending the OFFER or the controller receiving it, getting the DHCP server to ignore the flag may be a solution. I believe this could be done with an iptables rule and netsed. See

Well in trying to move SSIDs the controller totally bricked itself! Won’t boot and factory reset failed too. :roll_eyes: :sweat_smile:

Will have to wait a few days to get a replacement (Rachio chat very responsive where email less so…) before trying anything else!

I’m loath to abandon the IOT VLAN altogether because of course it’s best practice and all my other devices are having no problems at all; but maybe I could be less controlling and allow some ‘well behaved’ devices onto the main LAN, leaving only less trustworthy manufacturers etc. on the IOT VLAN.

Stewart - While I understood that thread on a theoretical level, it’s way above my current ability to implement. And of course I don’t think that we should have to resort to that in the first place!

I may spin up a Netshark VM to see if I can monitor the DHCP requests.

Random thought, Gene. We both are experiencing the same issue and both have the same TPLINK AP.

How are we sure that the AP isn’t causing the issue (even though the broadcast flag is still funky)?

Simpler than that, as @Gene noted, just run Wireshark on your PC (while connected to the IoT SSID).
Since your logs have already shown that the DHCP OFFER packets are broadcast, if the AP is sending them correctly they should show up in your capture.
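When looking through a capture, it helps to tell OFFERs and ACKs apart from the client's DISCOVERs and REQUESTs. Wireshark decodes this for you, but if you export raw UDP payloads, option 53 (DHCP Message Type) carries it. The sketch below is a minimal Python parser under a few assumptions: the payload starts at the BOOTP header, options are not overloaded into the `sname`/`file` fields, and `sample_payload` is a hypothetical helper that fabricates an otherwise empty packet purely for the demo.

```python
MAGIC_COOKIE = b"\x63\x82\x53\x63"  # marks the start of the DHCP options area
MESSAGE_TYPES = {
    1: "DISCOVER", 2: "OFFER", 3: "REQUEST", 4: "DECLINE",
    5: "ACK", 6: "NAK", 7: "RELEASE", 8: "INFORM",
}


def dhcp_message_type(payload: bytes):
    """Return the DHCP message type name from a UDP payload, or None."""
    # 236-byte fixed header, then the 4-byte magic cookie, then options.
    if len(payload) < 240 or payload[236:240] != MAGIC_COOKIE:
        return None
    i = 240
    while i < len(payload):
        code = payload[i]
        if code == 255:      # End option
            break
        if code == 0:        # Pad option has no length byte
            i += 1
            continue
        if i + 1 >= len(payload):
            break            # truncated option
        length = payload[i + 1]
        value = payload[i + 2:i + 2 + length]
        if code == 53 and len(value) == 1:
            return MESSAGE_TYPES.get(value[0])
        i += 2 + length
    return None


def sample_payload(msg_type: int) -> bytes:
    """Hypothetical test packet: zeroed header + cookie + option 53 + End."""
    return bytes(236) + MAGIC_COOKIE + bytes([53, 1, msg_type, 255])


print(dhcp_message_type(sample_payload(2)))
print(dhcp_message_type(sample_payload(5)))
```

If the capture on the IoT SSID shows the client's DISCOVER/REQUEST packets but never an OFFER or ACK, that would point at the AP not forwarding the broadcast replies rather than at the router.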

At the time I had issues, I tried everything I could think of to troubleshoot the TP-Link APs and my firewall/router. I went as far as installing an Omada controller (TP-Link’s centralized management for multiple APs), but at the end of the day, the trouble went away after rebooting a single managed switch (no settings change on the switch itself).