My promise
As I promised a few days ago, I’ve come up with a more portable version of the noise removal algorithm for the DHT sensor.
Here’s a link to our Grove DHT Pro sensor.
Here’s a link to the 1st topic that raised the issue.
Here’s a link to the 2nd topic that raised the issue.
The problem
The problem is that the Grove DHT sensor is not reliable when it’s used for long periods of time.
As users have reported, there’s a 0.5-2% chance of getting a bad reading: an extreme value that sometimes doesn’t even fall within the sensor’s working range.
It’s quite normal for a sensor to give a bad reading now and then, and statistically, the chance of having encountered at least one erroneous reading grows with the number of readings taken.
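To put rough numbers on that, here’s a small sketch of how quickly the odds add up. It assumes every read fails independently with the same probability, which is my simplification and not something measured on the sensor; the function name is made up for this illustration.

```python
def chance_of_bad_reading(per_read_error_rate, number_of_reads):
    """Probability that at least one of `number_of_reads` reads is bad.

    Assumes the reads are independent and share the same error rate.
    """
    return 1 - (1 - per_read_error_rate) ** number_of_reads


# the 0.5% and 2% error rates reported by users
for rate in (0.005, 0.02):
    # 12 reads (a short capture window) vs. 3600 reads (an hour at 1 read/second)
    for reads in (12, 3600):
        print(rate, reads, round(chance_of_bad_reading(rate, reads), 4))
```

Under that assumption, a bad value is practically guaranteed to show up within an hour of continuous reading, so some kind of filtering is needed.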
Here’s a visual representation of data points collected over time from a Grove DHT sensor - @royhills’s contribution.
As you can see, there are many moments when the sensor is way off the chart.
Totally unreliable.
The solution
So, we have to come up with a solution to filter out those outlier values.
My approach is to use a basic statistical model for removing the outliers.
The idea is to capture a certain number of data points (values) and then filter them by excluding any value that falls outside a certain range.
The bounds of that range (the thresholds) are simple to calculate:
threshold = mean_average +- standard_deviation * factor
As you can see, we calculate the mean of the captured data, then its standard deviation, and finally apply a factor that makes the threshold more restrictive or more permissive.
The factor can be user-defined; by default it’s set to factor = 2.
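Here’s a minimal, standalone sketch of that threshold applied to a handful of numbers. The readings are made up for illustration, and filter_outliers is a hypothetical helper written with Python’s statistics module, not the library function itself.

```python
import statistics


def filter_outliers(values, factor=2):
    """Keep only the values strictly inside mean +- factor * standard deviation."""
    if not values:
        return []
    mean = statistics.mean(values)
    # population standard deviation, the same quantity numpy.std returns by default
    deviation = statistics.pstdev(values)
    if deviation == 0:
        # all values are identical, so there is nothing to filter
        return list(values)
    lower = mean - factor * deviation
    upper = mean + factor * deviation
    return [v for v in values if lower < v < upper]


# nine plausible temperature readings and one made-up spike
readings = [22, 22, 23, 22, 23, 22, 22, 23, 22, 85]
kept = filter_outliers(readings)
print(kept)                   # the 85 spike is dropped
print(statistics.mean(kept))  # mean of the remaining nine readings
```

Note that the spike itself pulls the mean up and inflates the standard deviation, which is why a factor of around 2 works better here than a very large one.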
Here’s a visual representation of data points collected over time from a Grove DHT sensor - still @royhills’s contribution.
The code
Here’s the library you have to include.
Please take a look at the folder structure down below and see how you have to use it.
As an everyday user, you shouldn’t modify this library’s implementation.
grove_dht.py
import threading
import numpy
import datetime
import math
from grovepi import dht
import time

# after a list of numerical values is provided
# the function returns a list with the outlier (or extreme) values removed
# make the std_factor_threshold bigger so that filtering becomes less strict
# and make the std_factor_threshold smaller to get the opposite
def statisticalNoiseReduction(values, std_factor_threshold = 2):
    if len(values) == 0:
        return []

    mean = numpy.mean(values)
    standard_deviation = numpy.std(values)

    # just return if we only got constant values
    if standard_deviation == 0:
        return values

    # keep only the values above the lower threshold (mean - factor * standard deviation)
    filtered_values = [element for element in values if element > mean - std_factor_threshold * standard_deviation]
    # then keep only the values below the upper threshold (mean + factor * standard deviation)
    filtered_values = [element for element in filtered_values if element < mean + std_factor_threshold * standard_deviation]

    return filtered_values
# class for the Grove DHT sensor
# it was designed so that on a separate thread the values from the DHT sensor are read
# on the same separate thread, the filtering process takes place
class Dht(threading.Thread):
    # refresh_period specifies for how long data is captured before it's filtered
    def __init__(self, pin = 4, refresh_period = 10.0, debugging = False):
        super(Dht, self).__init__(name = "DHT filtering")

        self.pin = pin
        self.refresh_period = refresh_period
        self.debugging = debugging
        self.event_stopper = threading.Event()

        self.blue_sensor = 0
        self.white_sensor = 1
        self.filtering_aggresiveness = 2
        self.callbackfunc = None
        # arguments passed to the callback; empty until setCallbackFunction is called
        self.args = ()
        self.sensor_type = self.blue_sensor

        self.lock = threading.Lock()

        self.filtered_temperature = []
        self.filtered_humidity = []

    # refresh_period specifies for how long data is captured before it's filtered
    def setRefreshPeriod(self, time):
        self.refresh_period = time

    # sets the digital port
    def setDhtPin(self, pin):
        self.pin = pin

    # use the white sensor module
    def setAsWhiteSensor(self):
        self.sensor_type = self.white_sensor

    # use the blue sensor module
    def setAsBlueSensor(self):
        self.sensor_type = self.blue_sensor

    # removes the processed data from the buffer
    def clearBuffer(self):
        self.lock.acquire()
        self.filtered_humidity = []
        self.filtered_temperature = []
        self.lock.release()

    # the bigger the parameter, the less strict the filtering process is
    # and vice versa
    def setFilteringAggresiveness(self, filtering_aggresiveness = 2):
        self.filtering_aggresiveness = filtering_aggresiveness

    # whenever there's new data processed
    # a callback takes place
    # arguments can also be sent
    def setCallbackFunction(self, callbackfunc, *args):
        self.callbackfunc = callbackfunc
        self.args = args

    # stops the current thread from running
    def stop(self):
        self.event_stopper.set()
        self.join()

    # replaces the need to custom-create code for outputting logs/data
    # print(dhtObject) can be used instead
    # note: this pops (consumes) the most recent filtered reading from the buffer
    def __str__(self):
        string = ""
        self.lock.acquire()
        if len(self.filtered_humidity) > 0:
            string = '[{}][temperature = {:.01f}][humidity = {:.01f}]'.format(
                datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
                self.filtered_temperature.pop(),
                self.filtered_humidity.pop())
        self.lock.release()
        return string

    # returns a tuple with the (temperature, humidity) format
    # if there's nothing in the buffer, then it returns (None, None)
    def feedMe(self):
        # check and pop under the same lock, so the buffer can't change in between
        self.lock.acquire()
        if len(self.filtered_humidity) > 0:
            temp = self.filtered_temperature.pop()
            hum = self.filtered_humidity.pop()
        else:
            temp, hum = None, None
        self.lock.release()
        return (temp, hum)

    # returns the length of the buffer
    # the buffer is filled with filtered data
    def length(self):
        self.lock.acquire()
        length = len(self.filtered_humidity)
        self.lock.release()
        return length
    # you mustn't call this function from the user-program
    # this one is called by threading.Thread's start function
    def run(self):
        values = []

        # while we haven't called stop function
        while not self.event_stopper.is_set():
            counter = 0

            # while we haven't done a cycle (period)
            while counter < self.refresh_period and not self.event_stopper.is_set():
                temp = None
                humidity = None

                # read data
                try:
                    [temp, humidity] = dht(self.pin, self.sensor_type)

                    # check for NaN errors
                    if math.isnan(temp) is False and math.isnan(humidity) is False:
                        new_entry = {"temp" : temp, "hum" : humidity}
                        values.append(new_entry)
                    else:
                        raise RuntimeWarning("[dht sensor][we've caught a NaN]")

                    counter += 1

                # in case we have an I2C error
                except IOError:
                    if self.debugging is True:
                        print("[dht sensor][we've got an IO error]")

                # intended to catch NaN errors
                except RuntimeWarning as error:
                    if self.debugging is True:
                        print(str(error))

                finally:
                    # the DHT can be read once a second
                    time.sleep(1)

            if len(values) > 0:
                # remove outliers
                temp = numpy.mean(statisticalNoiseReduction([x["temp"] for x in values], self.filtering_aggresiveness))
                humidity = numpy.mean(statisticalNoiseReduction([x["hum"] for x in values], self.filtering_aggresiveness))

                # insert into the filtered buffer
                self.lock.acquire()
                self.filtered_temperature.append(temp)
                self.filtered_humidity.append(humidity)
                self.lock.release()

                # if we have set a callback then call that function w/ its parameters
                if not self.callbackfunc is None:
                    self.callbackfunc(*self.args)

            # reset the values for the next iteration/period
            values = []

        if self.debugging is True:
            print("[dht sensor][called for joining thread]")
And here’s an example program for you @cluckers - you can now use it relatively easily.
This is the most basic program.
grove_dht_example.py
#!/usr/bin/env python3
from grove_dht import Dht
import signal
import sys

# Don't forget to run it with Python 3 !!
# Please read the source file(s) for more explanations
# Source file(s) are more comprehensive

dht = Dht()

# stop the sensor thread on Ctrl+C
def signal_handler(signum, frame):
    global dht
    dht.stop()

# printing the Dht object outputs the latest filtered reading
def callbackFunc():
    global dht
    print(dht)

def Main():
    print("[program is running][please wait]")
    global dht

    digital_port = 4

    # set the digital port for the DHT sensor
    dht.setDhtPin(digital_port)

    # using the blue kind of sensor
    # there's also the white one which can be set by calling [dht.setAsWhiteSensor()] function
    dht.setAsBlueSensor()

    # specifies for how long we record data before we filter it
    # it's better to have larger periods of time,
    # because the statistical algorithm has a larger pool of values
    dht.setRefreshPeriod(12)

    # the bigger the filtering factor (as in the filtering aggressiveness),
    # the less strict the algorithm is when it comes to filtering
    # and vice versa
    # the factor must be greater than 0
    # it's recommended to leave its default value unless there is a good reason to change it
    dht.setFilteringAggresiveness(2.1)

    # every time the Dht object loads new filtered data into the buffer
    # the callback function is called
    dht.setCallbackFunction(callbackFunc)

    # start the thread for gathering data
    dht.start()

    # if you want to stop the thread just
    # call dht.stop() and you're done

if __name__ == "__main__":
    signal.signal(signal.SIGINT, signal_handler)
    Main()
Here’s the folder structure you should have:
Now, this example program is really basic.
The library is capable of much more.
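For instance, instead of relying on a callback, the buffer can be polled with feedMe(), which returns (None, None) once it’s empty. Here’s a rough sketch of that pattern. Both drain_readings and FakeDht are hypothetical names invented for this illustration; FakeDht only mimics feedMe() so that the snippet runs without the sensor attached.

```python
def drain_readings(sensor):
    """Pop every filtered (temperature, humidity) pair currently in the buffer."""
    readings = []
    while True:
        temp, hum = sensor.feedMe()
        if temp is None:  # feedMe() returns (None, None) on an empty buffer
            break
        readings.append((temp, hum))
    return readings


class FakeDht:
    """Hypothetical stand-in for grove_dht.Dht, used only to run this sketch."""

    def __init__(self, pairs):
        self._pairs = list(pairs)

    def feedMe(self):
        # mimic Dht.feedMe(): newest value first, (None, None) when empty
        return self._pairs.pop() if self._pairs else (None, None)


if __name__ == "__main__":
    sensor = FakeDht([(22.1, 40.0), (22.3, 41.5)])
    for temp, hum in drain_readings(sensor):
        print("temperature = {:.1f}, humidity = {:.1f}".format(temp, hum))
```

With the real library, you would create a Dht object, call start() on it, call drain_readings(dht) every now and then from your own loop, and call dht.stop() when you’re done.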
I encourage anyone to do a code review of this library.
Maybe someone can add other features to it, thereby making it more practical.
Maybe someone is thinking of turning this library into an add-on, so that any sensor can use it.
If anything is unclear or if I made a mistake somewhere, please let me know.
Here’s a link to the repo folder - you’ll see there’s another interesting example program.
A README is also an interesting idea.
Don’t forget to use Python 3!
Thank you!