One of the challenges of marrying sensor data collection with machine learning algorithms to generate usable models is negotiating the right level of abstraction for the data flowing from the sensor to the learner. If the sensor data is too low-level, it functions as little more than noise, and the learning algorithm will interpret spurious random patterns as something meaningful. If the data flow is too high-level, you’ve probably wasted time and effort implementing learning infrastructure that is little more than a simple mapping from one high-level concept to another. The trick is finding the right middle ground: maximizing the usefulness of the models being generated while expending as little time and as few resources as possible processing the data from a low-level noisy signal into something more meaningful.
After weeks of discussion and brainstorming, we conceded that in order to bump the semantic level of our data up a notch, Purple Robot would have to initiate an interaction with the patient, asking a few targeted questions to help interpret the sensor data going forward. Since this is not unlike calibrating a measuring instrument, we’ve been calling these interactions “calibrators”.
While Purple Robot’s main features are its data collection mechanisms and embedded scripting environment, we’ve been working hard to integrate machine learners. Being able to execute learned models on the same device that is collecting data is enormously powerful and allows us to build functionality that takes specific actions when a learner predicts something useful (e.g. “Your mood seems to be unusually poor at the moment – would you like to play Angry Birds to take a break?”) or to help us collect a fuller dataset to improve our models of our user (e.g. “The confidence in predicting your location is low because your latitude and longitude fall outside the bounds of your previously-seen area. Where are you?”).
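To make the idea concrete, here is a minimal sketch of how an on-device prediction might be turned into one of the actions described above. The class name, labels, and threshold are all illustrative assumptions, not Purple Robot’s actual API:

```java
// Hypothetical sketch: deciding how the app should react to a fresh on-device
// prediction. "PredictionHandler", the labels, and the 0.5 confidence threshold
// are illustrative assumptions, not Purple Robot internals.
public class PredictionHandler {
    static final double LOW_CONFIDENCE = 0.5;

    /** Map a model's prediction and its confidence to an application action. */
    public static String onPrediction(String label, double confidence) {
        if (confidence < LOW_CONFIDENCE) {
            // Low confidence: ask the user to label the data, improving the model.
            return "ASK_USER";
        }
        if ("poor_mood".equals(label)) {
            // Confident negative prediction: suggest an intervention.
            return "SUGGEST_BREAK";
        }
        // Otherwise just record the prediction for later analysis.
        return "LOG_ONLY";
    }
}
```

The low-confidence branch corresponds to the second example above: prompting the user for ground truth is what lets the models improve over time.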
While implementing robust modeling on the mobile device opens up many interesting possibilities, the limitations of mobile devices constrain our opportunities. On a technical level, these limitations include:
* Battery power and lifespan.
* Computational processing power.
* Limited memory.
* Limited & expensive network access.
To create a successful mobile experience, we have to weigh the impact of the systems we are creating against how users expect their devices to behave. For example, we can’t constantly regenerate models, because that would drain the phone’s power too quickly, and the advantages of the mobile platform are lost when the user has to keep the device tethered to a charger to keep it functional. We can’t go too crazy with I/O or memory usage, because this would impact the responsiveness of other software running on the system. We also can’t use a cellular network in the same way that we might use a broadband connection – mobile users have much smaller data allocations that are orders of magnitude more expensive.
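One way to respect those constraints is to gate expensive work, like rebuilding a model, on the device’s current conditions. The sketch below is a hypothetical illustration of that idea; the class name, method, and thresholds are assumptions, not anything Purple Robot actually ships:

```java
// Hypothetical sketch: only allow an expensive model rebuild when it won't
// visibly hurt the user experience. Names and thresholds are illustrative.
public class ResourceGate {
    private final double minBatteryFraction;
    private final boolean requireUnmeteredNetwork;

    public ResourceGate(double minBatteryFraction, boolean requireUnmeteredNetwork) {
        this.minBatteryFraction = minBatteryFraction;
        this.requireUnmeteredNetwork = requireUnmeteredNetwork;
    }

    /** Returns true only when conditions make a heavy rebuild acceptable. */
    public boolean shouldRebuildModel(double batteryFraction, boolean charging, boolean unmeteredNetwork) {
        if (requireUnmeteredNetwork && !unmeteredNetwork) {
            return false; // avoid burning the user's cellular data allocation
        }
        if (charging) {
            return true;  // plugged in: battery drain is not a concern
        }
        return batteryFraction >= minBatteryFraction; // otherwise, spare the battery
    }
}
```

On Android, the battery and network inputs to a check like this would come from the platform (e.g. `BatteryManager` and `ConnectivityManager`); the point of the sketch is only the gating logic itself.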
Given the opportunity offered by machine learning and data mining technologies, we’ve been exploring different approaches to try to capture the best of both worlds. In Purple Robot, we have already addressed some of these issues on the data collection front (such as our store-and-forward data collection & transmission architecture), and some of our approaches mirror what’s worked for us in similar contexts. The remainder of this post will outline how we’re adding learner functionality to Purple Robot.
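The store-and-forward pattern mentioned above can be sketched in a few lines: readings are always persisted locally first, and only flushed upstream when the network is available. This is a minimal illustration of the pattern, not Purple Robot’s actual implementation:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of a store-and-forward buffer. Readings are queued
// locally and only flushed when a connection is available, so no data is
// lost while the device is offline.
public class StoreAndForwardBuffer {
    private final Deque<String> pending = new ArrayDeque<>();

    /** Always persist the reading locally first, regardless of connectivity. */
    public void record(String reading) {
        pending.addLast(reading);
    }

    /** Flush queued readings when a connection is available; returns the count sent. */
    public int flush(boolean networkAvailable) {
        if (!networkAvailable) {
            return 0;
        }
        int sent = pending.size();
        // In a real system, entries would be cleared only after the server
        // acknowledges receipt, to guarantee at-least-once delivery.
        pending.clear();
        return sent;
    }

    public int pendingCount() {
        return pending.size();
    }
}
```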