  1. Phoenix
  2. PHOENIX-471

Sensor network end to end integration test case

Details

    • Task
    • Status: Open
    • Resolution: Unresolved

    Description

      MERLSense Data


      Mitsubishi Electric Research Labs (MERL) collected motion sensor data from a network of over 200 sensors in its research facility for two years and then released a public data set ("MERLSense Data") in 2009. The data set contains over 50 million raw motion records and is distributed as a GZIP-compressed tarball approximately 1.1 GB in size. See http://go.drwren.com/wmddata. The data set is described in a technical report available at http://www.merl.com/publications/docs/TR2007-069.pdf.

      The MERLSense data captures, at a granularity of seconds, the spatio-temporal structure of individuals walking down hallways, chatting with colleagues, and attending talks and meetings, on weekdays and weekends, through varying seasons and weather.

      This data has some nice properties for testing Phoenix's join capabilities and the current secondary index implementation. The key one is that row data is immutable, since each row is a record of a time series event.

      The raw motion trace files look like:

      470 01179980510828 01179980511853 1.0
      469 01179980512169 01179980513193 1.0
      467 01179980513580 01179980514609 1.0
      468 01179980514573 01179980515598 1.0

      The first element is the sensor identification number. The second and third numbers are the timestamps of the beginning and end of the event, respectively. Timestamps are the number of milliseconds since the epoch (January 1, 1970 UTC). Take care to use 64-bit integer representations when manipulating timestamps. The fourth number is the magnitude of the sensor reading, always 1.0.
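
A raw record in this layout could be parsed along the following lines. This is a minimal sketch, not part of the data set's tooling: the names MotionEvent and parse_trace_line are hypothetical, and the input line is the first sample record shown above.

```python
from typing import NamedTuple


class MotionEvent(NamedTuple):
    sensor_id: int
    start_ms: int  # milliseconds since the epoch (UTC)
    end_ms: int
    magnitude: float


def parse_trace_line(line: str) -> MotionEvent:
    """Parse one whitespace-separated raw motion record."""
    sensor_id, start, end, magnitude = line.split()
    # int() accepts the zero-padded timestamps; Python integers do not
    # overflow, so the 64-bit caveat matters mainly in other languages.
    return MotionEvent(int(sensor_id), int(start), int(end), float(magnitude))


event = parse_trace_line("470 01179980510828 01179980511853 1.0")
duration_ms = event.end_ms - event.start_ms  # length of the motion event
```

The parsed start time is larger than the maximum 32-bit signed integer, which is why the columns holding it need a 64-bit type such as BIGINT.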

      The dataset includes a calibration file that associates the sensor IDs to a map of the lab. Each sensor ID corresponds to a unique sensor.

      sid,floor,wing
      214,8,L
      222,8,L
      256,8,W
      257,8,W

      The sensor IDs are associated with physical space by a table that contains one row per sensor, keyed by the sensor ID, with eight coordinates that specify the four corners of a quadrilateral in meters:

      sid,x1,y1,x2,y2,x3,y3,x4,y4
      214,-13.3,23.1,-13.3,25.3,-15.5,25.3,-15.5,23.1
      222,-13.3,20.9,-13.3,23.1,-15.5,23.1,-15.5,20.9
      256,-15.5,8.3,-15.5,10.5,-17.7,10.5,-17.7,8.3
      257,-13.3,8.3,-13.3,10.5,-15.5,10.5,-15.5,8.3
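
As a rough illustration of how the coordinate rows might be used, the sketch below loads two of the rows shown above and derives the covered area and a center point per sensor. The function names are hypothetical, and it assumes the four corners are listed in order around the quadrilateral, which holds for the sample rows but is not stated by the data set description.

```python
import csv
import io

# The first two rows of the coordinate table shown above.
COORDS_CSV = """sid,x1,y1,x2,y2,x3,y3,x4,y4
214,-13.3,23.1,-13.3,25.3,-15.5,25.3,-15.5,23.1
222,-13.3,20.9,-13.3,23.1,-15.5,23.1,-15.5,20.9
"""


def load_quads(text):
    """Map each sensor ID to its four (x, y) corners in meters."""
    quads = {}
    for row in csv.DictReader(io.StringIO(text)):
        quads[int(row["sid"])] = [
            (float(row[f"x{i}"]), float(row[f"y{i}"])) for i in range(1, 5)
        ]
    return quads


def quad_area(corners):
    """Shoelace formula; corners must be listed in order around the shape."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(corners, corners[1:] + corners[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def quad_center(corners):
    """Mean of the four corners, a simple stand-in for the sensor position."""
    xs, ys = zip(*corners)
    return sum(xs) / len(xs), sum(ys) / len(ys)


quads = load_quads(COORDS_CSV)
area_214 = quad_area(quads[214])      # square meters
center_214 = quad_center(quads[214])  # (x, y) in meters
```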

      In addition, the data is given temporal meaning by several calendars that record the times and locations of various meetings and gatherings, the dates of official holidays, and a record of the number of people who were out of the office on given days.

      A daily almanac of the weather conditions as measured at nearby Boston Logan airport is also provided.

      Phoenix Integration Tests using MERLSense Data


      0. Create the observation table.

      CREATE TABLE observations (
      sensor_id INTEGER NOT NULL,
      start_time BIGINT(20) NOT NULL,
      end_time BIGINT(20) NOT NULL
      CONSTRAINT pk PRIMARY KEY ( sensor_id, start_time )
      )
      IMMUTABLE_ROWS=true

      1. Create indexes for motion events in each sensor by time in descending order

      CREATE INDEX observations_${sensor} ON observations ( start_time DESC )

      2. a. Replay or bulk insert the motion sensor data.

      UPSERT INTO observations (sensor_id, start_time, end_time) VALUES (...)

      b. Generate realistic additional paths of "individuals" for upsert into the observations table at the desired rate. Choose between a short and a long walk, with short being more likely. Then perform a random walk of the chosen number of transitions with variable delay at each step. A transition is valid only if a sensor's area is reachable from its predecessor's according to sensor coverage geography. There is a helpful map provided with the data showing sensor adjacencies on a floor plan.
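
The path generation in step 2b could be sketched roughly as follows. Everything specific here is an assumption made for illustration: the adjacency map is hypothetical (the real one would be derived from the floor plan shipped with the data), and the walk lengths, the probability of a short walk, the dwell time, and the delay range are placeholder values.

```python
import random

# Hypothetical adjacency map: sensor ID -> sensors reachable in one step.
ADJACENT = {
    467: [468, 469],
    468: [467],
    469: [467, 470],
    470: [469],
}


def generate_walk(rng, start_ms, short_len=5, long_len=30, p_short=0.8,
                  dwell_ms=1000, max_delay_ms=5000):
    """Return (sensor_id, start_time, end_time) rows for one synthetic path."""
    # Short walks are more likely than long ones.
    transitions = short_len if rng.random() < p_short else long_len
    sensor = rng.choice(sorted(ADJACENT))
    t = start_ms
    rows = [(sensor, t, t + dwell_ms)]
    for _ in range(transitions):
        # Only move to a sensor whose area is reachable from the current one.
        sensor = rng.choice(ADJACENT[sensor])
        # Variable delay before the next sensor fires.
        t += dwell_ms + rng.randint(0, max_delay_ms)
        rows.append((sensor, t, t + dwell_ms))
    return rows


walk = generate_walk(random.Random(42), start_ms=1179980510828)
```

Each returned tuple matches the column order of the UPSERT statement in step 2a. Start times increase strictly along a walk, so rows from a single walk cannot collide on the (sensor_id, start_time) primary key.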

      3. Find TopN popular locations using the main observations table.
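
To check the TopN query result, the test could compute the expected answer on the client side from the rows it inserted. The sketch below is one way to do that, assuming "popular" means the number of observations per sensor. The helper name top_n and the sample sensor IDs are made up for illustration.

```python
from collections import Counter

# Sensor IDs of a handful of observations; made-up sample, not real counts.
observed_sensor_ids = [470, 469, 467, 469, 470, 469, 468]


def top_n(sensor_ids, n):
    """Return (sensor_id, count) pairs, most frequent first.

    Ties are broken by ascending sensor ID so the expected result is
    deterministic and can be compared with a query result row by row.
    """
    counts = Counter(sensor_ids)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:n]


expected_top2 = top_n(observed_sensor_ids, 2)
```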

      4. Select subsets of activity to study as joins over indexes.

      5. Join motion sensor data with the sensor location table to produce result sets with spatial context. The sensor location table is very small so this should be possible to do in memory on the server.
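
The reason a very small location table helps can be shown with a plain in-memory hash join: the small side is loaded into a dictionary once, and each observation is then matched with a single lookup. This sketch only illustrates the idea and is not Phoenix's implementation. The location rows are the calibration sample shown earlier, and the observation rows are made up for the example.

```python
# Build side: the calibration rows shown earlier, keyed by sensor ID.
SENSOR_LOCATIONS = {
    214: (8, "L"),
    222: (8, "L"),
    256: (8, "W"),
    257: (8, "W"),
}

# Probe side: illustrative (sensor_id, start_time, end_time) rows.
# The sensor/timestamp pairings are made up for this example.
observations = [
    (214, 1179980510828, 1179980511853),
    (470, 1179980512169, 1179980513193),  # no location row in this sample
    (256, 1179980513580, 1179980514609),
]


def hash_join(rows, locations):
    """Inner join: one dictionary lookup per observation row."""
    for sensor_id, start_ms, end_ms in rows:
        location = locations.get(sensor_id)
        if location is not None:
            floor, wing = location
            yield (sensor_id, start_ms, end_ms, floor, wing)


joined = list(hash_join(observations, SENSOR_LOCATIONS))
```

The same function can produce the expected result set for the join test, since rows without a matching location are dropped exactly as an inner join would drop them.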

          People

            Assignee: Unassigned
            Reporter: Andrew Kyle Purtell (apurtell)