I wondered what temperatures in Orlando have done over this last week. You see, I just happen to have a set of IoT devices streaming data that I persist into an archive. One of those sensors is on a covered patio in Orlando, so it would be interesting to see what the data from the past week looks like.
In my setup, SmartThings sends data to a Spring Cloud Data Flow endpoint, which drops the data into files. Those files are parsed and uploaded to a Ceph host: a simple IoT data flow. Within Greenplum I have created external tables that can read the data, and the table looks like this:
gpadmin=# \d smartthings_tracking
          Table "public.smartthings_tracking"
   Column   |            Type             | Modifiers
------------+-----------------------------+-----------
 deviceid   | text                        |
 name       | text                        |
 value      | text                        |
 recordedat | timestamp without time zone |
 unit       | text                        |
Number of child tables: 170 (Use \d+ to list them.)
Distributed randomly
Partition by: (recordedat)
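I haven't included the DDL for the external piece here. For anyone curious, a readable external table over Ceph's S3-compatible gateway looks roughly like the sketch below; the endpoint, bucket, and config path are placeholders rather than my actual setup, and the s3 protocol needs to be enabled in the cluster first.

-- Sketch only: a readable external table over an S3-compatible Ceph endpoint.
CREATE READABLE EXTERNAL TABLE smartthings_tracking_ext (
    deviceid   text,
    name       text,
    value      text,
    recordedat timestamp,
    unit       text
)
-- Ceph exposes an S3-compatible interface, so Greenplum's s3 protocol can read the files.
LOCATION ('s3://ceph-rgw.example.com/smartthings/ config=/home/gpadmin/s3/s3.conf')
FORMAT 'CSV';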
The structure is simple, and I can sample a few of the data points with this query:
gpadmin=# \x
Expanded display is on.
gpadmin=# SELECT * FROM smartthings_tracking
gpadmin-#   WHERE deviceid = 'a15b41c3-8bf0-49df-8571-a5a5d138fdff'
gpadmin-#   ORDER BY recordedat DESC LIMIT 2;
-[ RECORD 1 ]------------------------------------
deviceid   | a15b41c3-8bf0-49df-8571-a5a5d138fdff
name       | temperature
value      | 72
recordedat | 2017-09-12 01:35:47.993
unit       | F
-[ RECORD 2 ]------------------------------------
deviceid   | a15b41c3-8bf0-49df-8571-a5a5d138fdff
name       | temperature
value      | 73
recordedat | 2017-09-12 00:20:46.677
unit       | F
I know I’m going to want a visual representation of the data, because looking at it as a list won’t easily convey the information I want. I also know I will be doing some interactive querying to get this right, so I switch from the command line over to Apache Zeppelin, which connects to my Greenplum server.
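Since Greenplum speaks the PostgreSQL wire protocol, hooking it up to Zeppelin is just a matter of pointing the jdbc interpreter at the master host and adding the PostgreSQL JDBC driver as an interpreter dependency. Roughly, with placeholder host and credentials:

default.driver    org.postgresql.Driver
default.url       jdbc:postgresql://gpdb-master.example.com:5432/gpadmin
default.user      gpadmin
default.password  ********

After that, any note paragraph that starts with %jdbc runs against Greenplum, and the result grid can be flipped between table and chart views.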
This is one of those cases where I can really appreciate interactive querying of IoT data. I need to do some exploring and will submit a few dozen queries, or attempts at a query. As I craft the visualization I will adjust the window to varying widths of data to optimize what I see, which generates even more queries. Getting immediate feedback greatly decreases the cycle time to a result. What I’m doing here is fairly simple though, and it doesn’t take long before I come up with a query and time window I like.
SELECT deviceid, recordedat::timestamp, value
FROM smartthings_tracking
WHERE deviceid = 'a15b41c3-8bf0-49df-8571-a5a5d138fdff'
  AND name = 'temperature'
  AND recordedat > '2017-09-05 00:00:00'
ORDER BY recordedat DESC;
A little bit of messing around with the visualization buttons and I come up with this
It’s not a bad image, but I quickly realize something is wrong with it. These nice angled lines do not show how temperatures behave over time. Why is that?
Then I remember: it is because these sensors kick out a temperature value only when they have registered a change of a full degree. So the time distance between two points could be 3 seconds or it could be 12 hours.
This causes time to not be represented with spatial accuracy along my X axis. While the chart does show the strange weather, reflected by the low values on the left, it isn’t giving me a good sense of how that actually unfolded over the past week.
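A quick way to see the unevenness (not a query from my original session, just an illustration) is to look at the gap between consecutive readings:

-- Illustration: how long since the previous reading from this sensor?
SELECT recordedat,
       recordedat - lag(recordedat) OVER (ORDER BY recordedat) AS gap_since_previous
FROM smartthings_tracking
WHERE deviceid = 'a15b41c3-8bf0-49df-8571-a5a5d138fdff'
  AND name = 'temperature'
  AND recordedat > '2017-09-05 00:00:00'
ORDER BY recordedat DESC;

Some gaps are a few seconds, others stretch to many hours.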
Frustration.
Then I remember a post by Caleb Welton from a few years ago on time series analysis. It has exactly what I need. A big part of making this work is setting up a few functions that will handle and maintain the last known values.
create function last_known_t(prev float8, current float8) returns float8 as $$
    # Carry the previous value forward when the current one is NULL.
    # (An explicit None check, so a legitimate value of 0 is not skipped.)
    return prev if current is None else current
$$ language plpythonu;

create aggregate last_known(float8) (
    stype   = float8,
    sfunc   = last_known_t,
    prefunc = last_known_t
);

create function last_known_t(prev timestamp, current timestamp) returns timestamp as $$
    # Same idea for timestamps.
    return prev if current is None else current
$$ language plpythonu;

create aggregate last_known(timestamp) (
    stype   = timestamp,
    sfunc   = last_known_t,
    prefunc = last_known_t
);
With those in place I can now write a new query that slices time into even intervals and drops the data into those slices, carrying forward the last known value.
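The query below also uses an interval_bound function, which isn’t shown above, to truncate each timestamp down to the start of its 15 minute slice. If you don’t have one handy, a minimal stand-in could look something like this (a sketch, not necessarily the exact definition used here):

CREATE OR REPLACE FUNCTION interval_bound(ts timestamp, width interval)
RETURNS timestamp AS $$
    -- Truncate the timestamp down to the nearest multiple of width,
    -- counting from the epoch.
    SELECT 'epoch'::timestamp
         + floor(extract(epoch FROM $1) / extract(epoch FROM $2))
           * extract(epoch FROM $2) * interval '1 second'
$$ LANGUAGE sql IMMUTABLE;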
WITH bounded AS (
    SELECT recordedat,
           interval_bound(recordedat, '15 minutes') AS slice,
           value
    FROM smartthings_tracking
    WHERE deviceid = 'a15b41c3-8bf0-49df-8571-a5a5d138fdff'
      AND name = 'temperature'
      AND recordedat::timestamp > '2017-09-05 00:00:00'
), dense AS (
    SELECT slice
    FROM generate_series('2017-09-05 00:00:00'::timestamp,
                         '2017-09-13 00:00:00', '15 minutes') s(slice)
)
SELECT slice,
       value::int,
       last_known(value::int) OVER (ORDER BY slice, recordedat)
FROM bounded RIGHT JOIN dense USING (slice)
ORDER BY slice, recordedat;
Looking at a piece of the raw output, it becomes easy to see how different the data appears once it is chunked into even time periods.
I will potentially double up on some numbers if the temperature jumped more than a degree within a single 15-minute period, but my visualization can show the average, so it is not a problem. This leads to a much better visualization of the temperatures.
The temperatures sure did deviate from the normal pattern there for a little while. This isn’t a fantastic visualization, but for under an hour of work it does give me a decent visual representation of the strangeness in the weather. I feel lucky that the only impact I see is in this graph, and I pray that those dealing with greater impacts stay healthy and find a path to recovery.