Ignore False PWS data + Feature requests


#1

My weather station is connected to Weather Underground and I populate PWSweather through @Gene’s script

My Rachio reads from this PWS station.

A couple of days back, Rachio reported 0.28" of rainfall when we didn’t receive any rain and, as a result, skipped watering. On close inspection of the PWSweather data I noticed a few incorrect rainfall values. You can see them here - https://www.pwsweather.com/archivewx.php?id=PLANO202&d=20170418&t=day between 9:42 and 10:10am (@Gene is aware of this issue)

Is there any way Rachio can ignore these incorrect rainfall readings?

Also, I really hope Rachio integrates with Wunderground soon and also shows a Moisture Level for all schedules, not only the flexible one. I’ve applied for your beta program but haven’t heard back :-/

Thanks,
Rohan


Using WUnderground.com to integrate Personal Weather Stations
Saw a new PWS in Costco, anyone tried it yet?
#2

Yes, @ronmis has contacted me and I’m investigating how I can fix it on the wufyi side, but meanwhile maybe someone from Rachio can contact your weather provider and see if they can ignore short spikes within the reported daily rain readings.

Let’s see who finds a fix first :wink:


#3

@ronmis - the age-old computing problem of GIGO. Out of curiosity, did Weather Underground handle the spike/incorrect rainfall values correctly (i.e. zero them out)?


#4

@DLane actually wunderground doesn’t show the spike at all, but a call to its API shows those values. Maybe wunderground zeroes them before showing them on the website, or it has a bug in its API


#5

@ronmis - so if Rachio used wunderground’s API to feed their weather intelligence they would have the same issue then?

As I understand it, the suggestion is that if there is a drop back to zero for the total rainfall value for the day during the day (i.e. not around the midnight reset time) then the previous rainfall values should be ignored. Correct?

Consider these scenarios -
1) The spike/abnormal rainfall values are reported right before Rachio does its check for “is it raining, and should I run the schedule or not” (~1 hour before, I believe). The schedule would be skipped as Rachio hasn’t seen the zeroed-out values yet. Later on the data would show no rain, but the schedule would already have been skipped.

2) A heavy thunderstorm comes through with lots of rain, wind, lightning and thunder, which causes the power to go out, and the weather station/data logger isn’t on an uninterruptible power supply. When the power is restored in an hour, it is bright and sunny and there is no more rain for the day. As I think the data logger would start back over at zero rainfall for the day, the proposed solution would have the weather intelligence think that it didn’t rain and would water.

Both situations would lead to support calls - why did Rachio do this when…

So I think they are between a rock and a hard place. I think once the data has been reported and acted upon it shouldn’t be changed.
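The rule described above (a daily rainfall total that drops back to zero away from the midnight reset) can be sketched roughly as follows. This is only an illustration: the function name, the 15-minute midnight window, and the sample readings are assumptions, not anything Rachio or PWSweather is known to use. Note that it only flags the drop; it cannot tell a false spike (scenario 1) from a power-cycled data logger after real rain (scenario 2).

```python
from datetime import datetime

def find_midday_resets(readings, reset_window_minutes=15):
    """Return indices where the daily rain total drops to zero mid-day.

    readings: list of (datetime, daily_total_inches), sorted by time.
    Drops within reset_window_minutes of midnight are treated as the
    normal daily counter reset and are not flagged.
    """
    resets = []
    for i in range(1, len(readings)):
        prev_total = readings[i - 1][1]
        ts, total = readings[i]
        minutes_into_day = ts.hour * 60 + ts.minute
        near_midnight = (
            minutes_into_day < reset_window_minutes
            or minutes_into_day >= 24 * 60 - reset_window_minutes
        )
        if prev_total > 0 and total == 0 and not near_midnight:
            resets.append(i)
    return resets

# Made-up readings loosely modelled on the spike described in post #1.
day = datetime(2017, 4, 18)
readings = [
    (day.replace(hour=9, minute=30), 0.00),
    (day.replace(hour=9, minute=42), 0.28),
    (day.replace(hour=10, minute=10), 0.28),
    (day.replace(hour=10, minute=15), 0.00),
]
print(find_midday_resets(readings))  # [3]
```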


#6

Yes, unfortunately we don’t have API access to edit historical data on PWSweather the way WU can internally. They may well report the data on the website if you happen to watch it in real time, but they probably edit their historical logs as soon as they see the data drop to zero before the day is out. We have no way to edit historical data without first delaying it, which requires a database and adds overhead.

I’m currently considering creating a 30-minute db buffer, which I hope is sufficient for now. Script execution time will go up, but I don’t anticipate a problem with cron-job’s 30-second limit.


#7

Do we know what is causing these false data points? @DLane really outlined the scenarios well; I can’t think of a way to resolve this without causing other issues. I would also be concerned about changing data after a skip has already happened: from a support perspective, we need that data to stay consistent so we can understand why a skip did or didn’t happen. WUnderground is not showing these spikes? Do you know if it shows them and then corrects them later, or if it just never shows them? Is there a lag in their data so they can correct for the false data points?


#8

To give everyone an update:
I’ve been testing how long I could delay updates over to pwsweather and found some issues with that.

What I am working on now: I’m adding database support to delay just the rain data. Your temperature, pressure and other measurements will be relayed in real time, but your rainfall data may be delayed by D minutes (you choose). For example, with the delay set to 30 minutes, rain that started at 10pm would show up starting at 10:30pm.

Before the data is forwarded, I’ll add logic to filter out seemingly false data.
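A rough sketch of what such a delay buffer might look like, held in memory rather than in a database for brevity. The class name, method names, and the zeroing rule are assumptions for illustration and are not taken from the wufyi code. Readings wait D minutes before release; if the daily total drops to zero on the same calendar day while earlier readings are still buffered, those buffered readings are zeroed before they are forwarded.

```python
from collections import deque
from datetime import datetime, timedelta

class RainDelayBuffer:
    """Hold rain readings for a fixed delay so spikes can be caught first."""

    def __init__(self, delay_minutes=30):
        self.delay = timedelta(minutes=delay_minutes)
        self.pending = deque()  # (timestamp, daily_total_inches)

    def add(self, ts, daily_total):
        if daily_total == 0:
            # A same-day drop to zero makes the buffered totals suspect.
            # Readings from the previous day are kept (midnight reset).
            self.pending = deque(
                (t, 0.0 if t.date() == ts.date() else total)
                for t, total in self.pending
            )
        self.pending.append((ts, daily_total))

    def release(self, now):
        """Pop and return every reading that is at least `delay` old."""
        ready = []
        while self.pending and now - self.pending[0][0] >= self.delay:
            ready.append(self.pending.popleft())
        return ready

# Made-up example: a 0.28" spike followed by a zero ten minutes later.
buf = RainDelayBuffer(delay_minutes=30)
t0 = datetime(2017, 4, 18, 9, 42)
buf.add(t0, 0.28)
buf.add(t0 + timedelta(minutes=10), 0.00)
released = buf.release(t0 + timedelta(minutes=30))
print(released)  # the 9:42 reading, forwarded with a total of 0.0
```

The trade-off is the one raised earlier in the thread: a spike that lasts longer than D minutes still gets through, and a logger that restarts after real rain would have up to D minutes of genuine rainfall zeroed.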

In case I need wufyi users to update their configuration, I will trigger 5 response errors / day (a notification should then go through to your cron-job email account) with an otherwise constant message containing a new URL. Just update the URL within cron-job (copy + paste) and you will not get any new errors. API errors will also be limited to 5 / day so that your cron-jobs do not get disabled.


#9

First of all, thanks to @Gene for creating and hosting this script! It has saved me a fair bit of effort.

I’m seeing similar behavior to what is reported above when grabbing data from a nearby weather station. In this case, it is with the temperature and dew point measurements. Early this morning, my PWSWeather site shows a rogue data point of -999 degrees. This is obviously skewing the averages a bit. :slight_smile: This stray data point is not visible on the data table on WUnderground.

The anomaly seems to have occurred around 12:30AM Mountain on 5/8.

See https://www.wunderground.com/personal-weather-station/dashboard?ID=KCOLOVEL127#history
and
https://www.pwsweather.com/obs/WCMS80537.html

I assume this is related to the issue(s) being discussed above?

Thanks!

-Matt


#10

@Gene keep going on this track and you may be taking someone’s job…


#11

Thank you @goku3989 for bringing this to my attention, I will add this to the list of issues to be addressed, stay tuned. I’ve had the debug capture active to deal with the other issues and looked up the actual data that came from WU for the data-point in question:


[details=Raw Data from WU at 12:30AM]
{
"response": {
"version":"0.1",
"termsofService":"http://www.wunderground.com/weather/api/d/terms.html",
"features": {
"conditions": 1
}
}
, "current_observation": {
"image": {
"url":"http://icons.wxug.com/graphics/wu2/logo_130x80.png",
"title":"Weather Underground",
"link":"http://www.wunderground.com"
},
"display_location": {
"full":"Loveland, CO",
"city":"Loveland",
"state":"CO",
"state_name":"Colorado",
"country":"US",
"country_iso3166":"US",
"zip":"80537",
"magic":"1",
"wmo":"99999",
"latitude":"40.384724",
"longitude":"-105.113541",
"elevation":"1528.9"
},
"observation_location": {
"full":"Walt Clark Middle School, Loveland, Colorado",
"city":"Walt Clark Middle School, Loveland",
"state":"Colorado",
"country":"US",
"country_iso3166":"US",
"latitude":"40.384724",
"longitude":"-105.113541",
"elevation":"4985 ft"
},
"estimated": {
},
"station_id":"KCOLOVEL127",
"observation_time":"Last Updated on May 8, 12:30 AM MDT",
"observation_time_rfc822":"Mon, 08 May 2017 00:30:09 -0600",
"observation_epoch":"1494225009",
"local_time_rfc822":"Mon, 08 May 2017 00:30:29 -0600",
"local_epoch":"1494225029",
"local_tz_short":"MDT",
"local_tz_long":"America/Denver",
"local_tz_offset":"-0600",
"weather":"Clear",
"temperature_string":"-9999.0 F (-999.0 C)",
"temp_f":-9999.0,
"temp_c":-999.0,
"relative_humidity":"-999%",
"wind_string":"Calm",
"wind_dir":"North",
"wind_degrees":-9999,
"wind_mph":-9999.0,
"wind_gust_mph":0,
"wind_kph":0,
"wind_gust_kph":0,
"pressure_mb":"1008",
"pressure_in":"29.77",
"pressure_trend":"+",
"dewpoint_string":"-9999 F (0 C)",
"dewpoint_f":-9999,
"dewpoint_c":0,
"heat_index_string":"NA",
"heat_index_f":"NA",
"heat_index_c":"NA",
"windchill_string":"NA",
"windchill_f":"NA",
"windchill_c":"NA",
"feelslike_string":"-9999.0 F (-999.0 C)",
"feelslike_f":"-9999.0",
"feelslike_c":"-999.0",
"visibility_mi":"10.0",
"visibility_km":"16.1",
"solarradiation":"--",
"UV":"0","precip_1hr_string":"0.00 in ( 0 mm)",
"precip_1hr_in":"0.00",
"precip_1hr_metric":" 0",
"precip_today_string":"0.00 in (0 mm)",
"precip_today_in":"0.00",
"precip_today_metric":"0",
"icon":"clear",
"icon_url":"http://icons.wxug.com/i/c/k/nt_clear.gif",
"forecast_url":"http://www.wunderground.com/US/CO/Loveland.html",
"history_url":"http://www.wunderground.com/weatherstation/WXDailyHistory.asp?ID=KCOLOVEL127",
"ob_url":"http://www.wunderground.com/cgi-bin/findweather/getForecast?query=40.384724,-105.113541",
"nowcast":""
}
}[/details]


@mckynzee Don’t worry, it’s not easy to afford my services full time :wink:


#12

Thanks, @Gene! If I were running the script on my own box, I might be tempted to do some value clamping based on some reasonable assumptions about this area. Wish I knew why WUnderground is reporting that rogue data point, yet not showing it on their own website. Like I think you mentioned before, they must do some post processing to filter stuff like that out.

-Matt


#13

Ah, looks like it happened again–just a little after 3:00PM Mountain. This time, I’m actually seeing a mostly blanked data line in the WUnderground data; not sure if it will get corrected on their side later, though.

For what it’s worth, that particular area just got hit with a barrage of hail right around that time. That might have temporarily borked the weather station.

-Matt


#14

Looking at the data, I do not anticipate any issues with filtering this kind of data in the future; there is plenty of clearly wrong data to invalidate the whole data packet. I may expedite the fix for this particular problem since it looks pretty straightforward. Thank you @goku3989 for sharing your findings

P.S. Looks like WU didn’t yet filter out every bad data-point; it will be interesting to check back later to see if it is still there tomorrow.
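As an illustration of invalidating a whole packet, here is a minimal sketch that counts placeholder values across a few numeric fields. The field names and the -999 / -9999 placeholders come from the raw WU response quoted above; the function name, the choice of fields, and the two-field threshold are assumptions, not the actual wufyi logic.

```python
SENTINELS = {-999.0, -9999.0}
CHECKED_FIELDS = ("temp_f", "temp_c", "dewpoint_f", "wind_mph", "wind_degrees")

def is_bad_packet(observation, min_bad_fields=2):
    """Return True if enough checked fields hold a placeholder value."""
    bad = 0
    for name in CHECKED_FIELDS:
        try:
            value = float(observation.get(name))
        except (TypeError, ValueError):
            # Field missing or non-numeric (e.g. "NA"): not counted.
            continue
        if value in SENTINELS:
            bad += 1
    return bad >= min_bad_fields

# Values copied from the 12:30AM packet in post #11.
bad_obs = {"temp_f": -9999.0, "temp_c": -999.0, "dewpoint_f": -9999,
           "wind_mph": -9999.0, "wind_degrees": -9999}
# Made-up plausible values for comparison.
good_obs = {"temp_f": 55.2, "temp_c": 12.9, "dewpoint_f": 40,
            "wind_mph": 3.0, "wind_degrees": 180}

print(is_bad_packet(bad_obs))   # True
print(is_bad_packet(good_obs))  # False
```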


#15

Update: As of now, the negative readings should be filtered out of the data updates to PWS weather.
I’ve also improved error handling, so initial setup should be clearer. The site will now handle errors from others and help you figure out which of the parameters has an issue (such as that your PWS weather password is wrong in case you need to correct “pws” within the URL).
I’ve stripped HTML from the successful output so that it would show up clearer within cron-job output.
I’ve also added information on how old the data was during the transfer.

I’m still working on testing / debugging other false data filtering; the current expectation is that it will be ready next weekend.

As always, if you are using wufyi.com, updates were applied automatically. You do not have to do anything.

Cheers,
Gene


#16

Excellent–this should make those troublesome spikes go away! Thanks, @Gene!

I took a peek at your GitHub. I wonder whether, as a possible future adjustment, it might make sense to allow some negative tolerance for values where it makes sense (e.g. the temperature values). Granted, it probably doesn’t have much impact with respect to sane times of year when we’d be running our Rachio devices, but rather in the spirit of an accurate data migration from WU->PWS. Indeed, we do get some subzero overnight lows here in Colorado in the winter! :wink:

Anyway, good stuff–thanks again!

-Matt


#17

You are absolutely right, update is going up immediately. Sorry, being in Florida… :sunny:

Update is live on wufyi.
@goku3989 what do you think of the changes (link)?


#18

@Gene --those changes seem reasonable to me. I think the ‘bad data’ temps usually get reported as -999, right? So using -459 as a floor seems pretty safe. I’ll keep an eye out to see if any spikes pop up going forward.
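A floor check of this kind might look like the sketch below. The -459 value is the one mentioned above (absolute zero is about -459.67 °F, so no real reading can fall below it); the function and constant names are hypothetical and not taken from the wufyi source.

```python
TEMP_FLOOR_F = -459.0  # floor discussed above; just above absolute zero in °F

def valid_temp_f(value):
    """Return the temperature as a float, or None if non-numeric or below the floor."""
    try:
        temp = float(value)
    except (TypeError, ValueError):
        return None
    return temp if temp >= TEMP_FLOOR_F else None

print(valid_temp_f(-12.5))    # -12.5 (a real subzero low passes)
print(valid_temp_f(-9999.0))  # None  (placeholder value rejected)
```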

Cheers!

-Matt


#19

Actually, something must have gone wrong. wufyi.com is now complaining about my PWS password being incorrect. Looks like the updates stopped around 8:10PM Mountain (I have my cron job set to run every 5 minutes).

I just double-checked and I’m able to sign in just fine.

Cheers,

-Matt


#20

Ah, it looks like you are URL encoding the password. That would mess up the password in my particular case.

-Matt
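To illustrate how URL encoding can change a password, here is a small sketch using Python’s standard `urllib.parse`. The password is made up, and the double-encoding case is only one possible failure mode; the thread does not say exactly what went wrong on the wufyi side.

```python
from urllib.parse import quote, unquote

password = "p@ss+word&1"  # made-up password with URL-reserved characters

# Percent-encoding replaces the reserved characters.
encoded = quote(password, safe="")
print(encoded)  # p%40ss%2Bword%261

# Decoding exactly once recovers the original password.
assert unquote(encoded) == password

# If an already-encoded value is encoded a second time, a single decode
# on the receiving side no longer yields the original password.
double_encoded = quote(encoded, safe="")
print(unquote(double_encoded) == password)  # False
```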