About

Gravioli is a lightweight location data processing package written in Scala that runs time-series clustering and classification on spatiotemporal data in near-linear time. It focuses on data quality and stability for high-confidence analytics: it organizes thousands of timestamped GPS signals from a device into a complete yet concise device journey.

The wide output schema allows quick indexing and searching by attributes such as duration, travel mode, or visits. The package comes with a pre-trained model that works with different data distributions and adjusts model parameters dynamically based on the sampling rate. With key features such as outlier detection, GPS drift detection, speed estimation, and global time zone conversion, your messy location data is contextualized, compressed, and ready for rapid prototyping, case studies, or production data pipelines with minimal coding.
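To make features like speed estimation and outlier detection more concrete, here is a minimal sketch of one common approach: compute the great-circle (haversine) distance between two consecutive signals, divide by the elapsed time, and flag implausibly fast jumps. This is not Gravioli's implementation. The `SpeedSketch` object, the `Signal` type, and the `maxMps` threshold are all hypothetical names chosen for illustration.

```scala
// Illustrative sketch only: these names are assumptions, not Gravioli's API.
object SpeedSketch {
  // A single GPS signal: Unix time in seconds plus WGS84 coordinates in degrees.
  final case class Signal(epochSeconds: Long, lat: Double, lon: Double)

  // Mean Earth radius in meters.
  val EarthRadiusMeters: Double = 6371008.8

  // Great-circle distance between two signals using the haversine formula.
  def haversineMeters(a: Signal, b: Signal): Double = {
    val dLat = math.toRadians(b.lat - a.lat)
    val dLon = math.toRadians(b.lon - a.lon)
    val h = math.pow(math.sin(dLat / 2), 2) +
      math.cos(math.toRadians(a.lat)) * math.cos(math.toRadians(b.lat)) *
        math.pow(math.sin(dLon / 2), 2)
    // Clamp to 1.0 to guard against floating-point overshoot before asin.
    2 * EarthRadiusMeters * math.asin(math.sqrt(math.min(1.0, h)))
  }

  // Average speed in meters per second; None when time does not advance.
  def speedMps(a: Signal, b: Signal): Option[Double] = {
    val dt = b.epochSeconds - a.epochSeconds
    if (dt > 0) Some(haversineMeters(a, b) / dt) else None
  }

  // Flags a pair of signals whose implied speed exceeds a chosen threshold.
  def isSpeedOutlier(a: Signal, b: Signal, maxMps: Double): Boolean =
    speedMps(a, b).exists(_ > maxMps)
}
```

A pair of signals one degree of longitude apart on the equator is roughly 111 km apart, so covering it in one minute would be flagged by any realistic `maxMps`, while covering it in an hour would not.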

Result Schema

Dwell

  • Time/duration

  • Center

  • Home/work/shopping prediction

A strong cluster of signals indicating a device at a location for a period of time.

Moving

  • Time/duration

  • Distance

  • Speed/speed profile

  • Origin/destination

  • Route

  • Mode prediction

A weak cluster of signals that move spatially for a period of time.

Edge

  • Time

  • Location

The beginning or end of a moving cluster that is not connected to a dwell cluster, likely caused by a lapse in signal or a motion-triggered signal.

Noise

  • Time

  • Location

A signal that cannot be classified as dwell or moving.
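The four result types above could be modeled in Scala roughly as follows. This is a sketch for orientation only: every type and field name here (`Segment`, `LatLon`, `placePrediction`, `JourneyQueries`, and so on) is an assumption derived from the attribute lists, not the package's actual output schema.

```scala
import java.time.{Duration, Instant}

// Hypothetical types mirroring the result schema; not Gravioli's actual API.
final case class LatLon(lat: Double, lon: Double)

sealed trait Segment

// A strong cluster: the device stays at one place for a period of time.
final case class Dwell(
    start: Instant,
    end: Instant,
    center: LatLon,
    placePrediction: Option[String] // e.g. "home", "work", "shopping"
) extends Segment {
  def duration: Duration = Duration.between(start, end)
}

// A weak cluster: the device moves spatially for a period of time.
final case class Moving(
    start: Instant,
    end: Instant,
    distanceMeters: Double,
    origin: LatLon,
    destination: LatLon,
    route: Seq[LatLon],
    modePrediction: Option[String] // e.g. "walking", "driving"
) extends Segment {
  def duration: Duration = Duration.between(start, end)

  // Average speed over the whole segment, in meters per second.
  def averageSpeedMps: Double = {
    val seconds = duration.toMillis / 1000.0
    if (seconds > 0) distanceMeters / seconds else 0.0
  }
}

// Start or end of a moving cluster that is not attached to a dwell.
final case class Edge(time: Instant, location: LatLon) extends Segment

// A signal that fits neither a dwell nor a moving cluster.
final case class Noise(time: Instant, location: LatLon) extends Segment

object JourneyQueries {
  // Example search by attribute: dwells strictly longer than a minimum duration.
  def dwellsLongerThan(journey: Seq[Segment], min: Duration): Seq[Dwell] =
    journey.collect { case d: Dwell if d.duration.compareTo(min) > 0 => d }
}
```

Using a sealed trait means the compiler can check that pattern matches over a journey handle all four segment types.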

At a glance:

  • 190k raw signals from a single mobile device over 400 days turn into

  • 1.7k moving clusters with an average duration of 15 minutes and maximum duration of 4 hours

  • 2.2k dwell clusters with an average duration of 3 hours and maximum duration of 24 hours

  • which is equivalent to ~10 clusters a day

  • achieving a compression ratio of 50:1

  • taking just seconds to compute
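The headline numbers can be cross-checked with a few lines of arithmetic. The sketch below uses only the rounded figures quoted above, so the results are approximate. The object and value names are illustrative.

```scala
// Sanity check of the rounded "At a glance" figures; names are illustrative.
object AtAGlance {
  val rawSignals: Int = 190000
  val movingClusters: Int = 1700
  val dwellClusters: Int = 2200
  val days: Int = 400

  // 1,700 moving + 2,200 dwell = 3,900 clusters in total.
  val totalClusters: Int = movingClusters + dwellClusters

  // 3,900 clusters over 400 days = 9.75, i.e. roughly 10 clusters a day.
  val clustersPerDay: Double = totalClusters.toDouble / days

  // 190,000 signals / 3,900 clusters is about 48.7, i.e. roughly 50:1.
  val compressionRatio: Double = rawSignals.toDouble / totalClusters
}
```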