Posted by & filed under Deployment, Rails.

The following describes one method of deploying Rails applications in a production environment. I have only used this approach with a single server, and some changes would have to be made to get this to work in an environment with load balanced servers, multiple databases, etc. I have borrowed some of this from “Agile Web Development with Rails 4.”

Deployment server

Because it is outside the scope of this post, let’s assume you have already installed the basics on your deployment server: Apache, a database server (e.g. MySQL or PostgreSQL), git, and RVM.

At this point, install Ruby. We’ll assume version 2.0:

rvm install 2.0.0

Next, install Phusion Passenger (essentially mod_rails/mod_rack for Apache).
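
The usual route (assuming you’re using the passenger gem and its bundled Apache module installer; depending on your setup you may need sudo or rvmsudo) is:

gem install passenger
passenger-install-apache2-module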

Then set up an empty git repository:

mkdir -p ~/git/my_app.git; cd ~/git/my_app.git; git init --bare

Finally, give your account key-based SSH access to the server (so deployments can connect to it as if it were remote) by adding a public key:

test -e ~/.ssh/id_dsa.pub || ssh-keygen -t dsa; cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

Development machine

On the development machine set up your Rails application with Capistrano:

echo "gem 'capistrano', '~> 3.0.0', group: :development" >> Gemfile

echo "gem 'rvm-capistrano', group: :development" >> Gemfile

bundle --binstubs; cap install

Put your application under version control if it is not already:

git init .

Commit the changes:

git add .; git commit -m "initial commit"

Next, push the changes to the deployment server:

git remote add origin ssh://user@host/~/git/my_app.git; git push origin master

Configure your Capistrano recipe; I recommend basing it on the deployment example from “Agile Web Development with Rails 4” mentioned above. Also, edit the Capfile as necessary if you intend to use the Rails asset pipeline.

Finally, it’s time to deploy! Check your setup, and fix any warnings/errors:

cap deploy:setup

Once everything looks good, set up your database as needed with:

cap deploy:migrate

and

cap deploy:seed

Then deploy with:

cap deploy

This does the following:

  • checks out the latest commit
  • symlinks the assets (to preserve them between deploys)
  • uses Bundler to install any new gems or versions in Gemfile.lock
  • precompiles and adds cache busters to the assets
  • symlinks shared configuration files (that are not in version control, such as database.yml)
  • symlinks from the location known by Phusion Passenger
  • touches tmp/restart.txt to signal Phusion Passenger to restart the application

If anything goes wrong during these steps, Capistrano will emit a warning and roll back the deployment so no harm is done.

Posted by & filed under Data Management, Machine Learning, Purple Robot.

Purple Robot (PR) is an excellent data-collection and device-side background application, with an impressive functionality set. But for any non-trivial use – and particularly for use in an institutional setting (such as that of CBITS) – it quickly becomes necessary to enable automated data-ingestion.

tl;dr

Purple Robot Importer and Purple Warehouse are a robust and scalable system for receiving sample data uploaded by Purple Robot (or any other client implementing its simple message interface). This includes high-frequency data like that collected by the accelerometer. The data are stored in individualized per-user SQL databases, making the system fast and easy to query.

Longer Version

How to do this? Consider the following facts (guided partly by the “three Vs” framing of “big data” problems; even though we don’t have “big data” today, we expect we’ll eventually get there):

  • The volume of data can be quite large, given sub-second sampling rates on several of the probes.
    • The Accelerometer Probe, for example, is capable of capturing the X, Y, and Z dimensions, timestamped at the nanosecond level, at rates upwards of 100 samples/sec (here “samples” is used in the general scientific sense; in the rest of this post these would be “sub-samples”, in Purple Warehouse terms). Other sub-second probes include the Gyroscope Probe.
    • Additionally, other probes sample frequently, perhaps every few seconds, such as the Pressure Probe and the Temperature Probe.
  • The velocity of the data output by the swarm of mobile devices sending us data will rise as mobile and wi-fi network upload speeds increase.
  • The variety (shape) of the data structures output by Purple Robot varies from probe to probe.
    • Common attributes exist across PR probes, but each probe inherently offers different dimensions from other probes.
    • Added to this, device manufacturers, motivated by the competitive advantage of greater functionality, continue to add new sensors – sensors that we may wish to capture.
    • Non-sensor uses also create varying data structures.
  • Conventional big-data tools (e.g. Hadoop and its ecosystem of Mahout, HBase, Hive, Impala, etc.) are too heavy and user-unfriendly for a small technical team on stereotypically-tight university research budgets with bright-but-not-technically-focused researchers as our customers.
  • Near-real-time querying is desirable, given that our ability to perform behavioral interventions in a meaningful way with at-risk study participants may demand rapid response.
  • Researchers tend to use data-analysis tools like Matlab, RapidMiner, R, and Excel – none of which integrate with traditional “big data” NoSQL-based databases like MongoDB, but which are capable of integrating with SQL-based RDBMSs.

Purple Robot Importer (PRI) and Purple Warehouse (PW) are custom in-house solutions to this problem.

These services have been fast, stable, and robust, serving multiple trials and external collaborators since November 2012, with only a couple of bugfixes (both very early on) and very few feature upgrades needed since their initial deployment.

Architecture

The data-flow architecture looks like this (borrowing a slide from a presentation given about this system at ISRII in May 2013):

PRI, needing to be a high-speed, write-only API layer, was conceived before any code was written as a stateless, functional architecture, hewing closely to the design intentions of Node.js applications. This makes PRI easy to scale: simply add additional PRI instances and round-robin them at the load-balancer level.

Alternately or additionally, one may add more cores to the PRI host: both Metamorphoo and Trireme have been parallelized via the “cluster” module available for Node.js, taking full advantage of multicore systems by instantiating n_cores-1 worker processes managed by a master process.
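
A minimal sketch of that pattern (generic Node.js using the standard cluster and http modules, not the actual Metamorphoo/Trireme startup code) looks something like this:

    var cluster = require('cluster');
    var http = require('http');
    var os = require('os');

    if (cluster.isMaster) {
        // Master process: fork one worker per core, reserving one core for the master itself.
        var workerCount = Math.max(1, os.cpus().length - 1);

        for (var i = 0; i < workerCount; i++) {
            cluster.fork();
        }

        // Replace any worker that dies so capacity stays constant.
        cluster.on('exit', function () {
            cluster.fork();
        });
    } else {
        // Worker process: each worker runs its own copy of the HTTP service.
        http.createServer(function (request, response) {
            response.writeHead(200, { 'Content-Type': 'application/json' });
            response.end(JSON.stringify({ status: 'ok', pid: process.pid }));
        }).listen(8080);
    }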

Purple Robot Importer (PRI)

PRI is the system that receives sample data uploaded by PR, and stores it in PW.

PRI fundamentally consists of two web-service applications:

  1. Metamorphoo
  2. Trireme with the Dingo extension

Both applications provide a large number of HTTP(S) routes, most of which are not relevant to the PRI system.

Metamorphoo — so-named because its original purpose was data-transformation — exists as the application to which PR connects from a mobile device.

Trireme is a data-access layer (originally built as an API layer to a MongoDB instance) which houses Dingo as a module that Metamorphoo calls during sample-upload processing.

Dingo is basically a web API for executing CRUD SQL statements against either a Postgres instance or MySQL instance (the latter is a legacy carryover from iteration 1 of PRI).

Metamorphoo receives probe samples from PR in a JSON message format known and enforced on both PR and PRI (with integrity validation on both ends prior to any processing). It then converts these into a set of SQL statements that will, from scratch, create or update a user’s database to contain columns representing all of a sample’s top-level dimensions and their sub-samples. This is possible regardless of datatype, via datatype inference on the first observation (in effect, a purpose-limited slice of ORM functionality). Samples and sub-samples are converted to SQL INSERT statements.
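
As a deliberately simplified illustration of that idea (not Metamorphoo’s actual code; real table naming, value escaping, and sub-sample handling are more involved), the sample-to-SQL conversion looks roughly like this:

    // Infer a SQL column type from the first observed value of a dimension.
    function sqlTypeFor(value) {
        if (typeof value === 'number') { return 'double precision'; }
        if (typeof value === 'boolean') { return 'boolean'; }
        return 'text';
    }

    // Build the statements needed to store one sample for one probe.
    function statementsForSample(probeName, sample) {
        var table = probeName.toLowerCase().replace(/\W+/g, '_');   // hypothetical naming scheme
        var names = Object.keys(sample);

        var columns = names.map(function (name) {
            return name + ' ' + sqlTypeFor(sample[name]);
        });

        var values = names.map(function (name) {
            var value = sample[name];
            return (typeof value === 'string') ? "'" + value.replace(/'/g, "''") + "'" : value;
        });

        return [
            'CREATE TABLE IF NOT EXISTS ' + table + ' (' + columns.join(', ') + ');',
            'INSERT INTO ' + table + ' (' + names.join(', ') + ') VALUES (' + values.join(', ') + ');'
        ];
    }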

The database existence- and structure-validation steps, and the corresponding creation/modification steps when insufficient structure exists, are performed step-by-step as SQL statements executed on PW via Dingo, until the DB contains the necessary structure to represent samples and sub-samples. Structural checks occur for each message sent by PR (but not for each sample or sub-sample being inserted – that would degrade performance far too much).

At this point, the set of sample-insertion statements is sent to Dingo for submission to and execution by PW.

Purple Warehouse (PW)

PW is a data warehouse built on PostgreSQL into which all PR data is uploaded by PRI. It sits at the heart of the entire system, taking all data inflows, and responding to all queries for data.

PW stores the collected sample and sub-sample data.

Each username maps to a single user’s database.

Each probe type (e.g. accelerometer, location, etc.) maps to a sample table in the user’s database.

Each sample maps to a row in its sample table.

Each sub-sample maps to a row in the sample table’s corresponding sub-sample table.

As of Sept. 19, 2013 our main PW instance houses the roughly 490GB of data we have collected via PRI since Nov. 2012, of which 330GB has been stored since about May 18, 2013. Much of the total is from usage by researchers in activity-recognition who are making heavy use of the accelerometer.

Conclusion

PRI and PW provide the foundation for widespread mobile device sensor data-collection at CBITS. Whether low-volume data, such as survey responses, or high-volume data, such as accelerometer samples, this system has risen to the challenge of collection for hundreds of users.

Posted by & filed under Android, Machine Learning, Purple Robot.

One of the challenges of marrying sensor data collection with machine learning algorithms to generate usable models is negotiating the right level of abstraction for the data flowing from the sensor to the learner. If the sensor data is too low-level, it functions as little more than noise and the learning algorithm will interpret spurious random patterns as something meaningful. If the data flow is too high-level, you’ve probably wasted time and effort implementing learning infrastructure that is little more than a simple mapping from one high-level concept to another. The trick is finding the middle ground that maximizes the usefulness of the models being generated while expending as little time and as few resources as possible turning a low-level noisy signal into something more meaningful.

[Screenshot: Calibrator Message]

In Purple Robot, we’ve been dealing with this mismatch between our learners and our location and communication log probes. We know that we’re collecting sufficient raw data that can be used to generate models, but the system isn’t giving us models that work. For example, in some of the apps we’re building for patients with depression, knowing where the patient is helps us understand how they’re spending their time, correlate mood with places they visit, and offer suggestions to help improve their mental state (e.g. “You haven’t left home for the past 18 hours, try taking a walk.”).

Unfortunately, while the mobile device knows where you are with an error of less than a couple of meters, it thinks in three-dimensional positions (latitude, longitude, altitude) instead of the more semantic ways humans think about place (home, work, school, etc.). To get around this issue, we’ve implemented automated systems for determining more semantic labels, such as querying the Google Places or Foursquare APIs for labels. We get back more useful information that helps improve our models (“X is in a place that is surrounded by 25% shops, 30% restaurants, 20% parks, and 25% bars”), but this is a generic view of the user’s location that omits any personal meaning or significance given to a place.

Similarly, Purple Robot can access and analyze the local call and SMS logs to extract information that might be useful for determining how a patient’s social contacts influence their behavior and mental state. Purple Robot will try to use any information available in the local address book, but mobile phone users rarely keep their contacts up to date. From Purple Robot’s perspective, it can see that a patient has talked to a particular phone number – which might have a name attached – but from an algorithmic perspective, the values “Jenny, (555) 867-5309” carry as much meaning as “Z, 4”. This information is useful for distinguishing between individual contacts, but doesn’t shed much more light than that.

After weeks of discussion and brainstorming, we conceded that in order to bump the semantic level of our data up a notch, Purple Robot would have to initiate an interaction with the patient to ask a few targeted questions to assist the sensor moving forward. Since this is not unlike calibrating a measuring instrument, we’ve been calling these interactions “calibrators”.

Calibrators hook into the existing sanity-checking framework that alerts the patient if Purple Robot is misconfigured, running out of resources, or fails some other check indicating that the system is performing sub-optimally and a user action can correct the problem. A typical calibrator’s sanity check asks two questions: when (if ever) the sensor was last calibrated, and whether the system has enough data to ask good calibration questions. In the case of the communication log sensor, the patient isn’t prompted to calibrate until a sufficient number of communication events (calls, SMSs) are available to label. In the case of the location sensor, the system collects a historical log of places visited, and the patient is invited to label the places once the number of logged locations exceeds a couple hundred. Our goal is to interrupt and query the user as infrequently as possible.

The communication log calibration is pretty simple. After enough calls and texts have been made, Purple Robot prompts the patient to tell us more about the people on the other end of the line. In our case, we’re interested in learning the categories of people, so we present an interface that aggregates the communication log and sorts the contacts by frequency of interaction; the patient can tap a contact’s name/number and select a category from a set of predefined labels (significant other, friend, family, coworker, etc.). The system saves those labels, and the next time it transmits communication data to the learner, it looks up the individual’s information in the calibration dictionary and adds the category label to the data payload. This allows our learners to detect not only relationships between mental state and individuals (“talking to Coworker Bob makes me more stressed”) but also between mental state and groups (“talking to friends makes me less anxious”).
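
In code, the resulting lookup is little more than a dictionary keyed by the contact’s number; the sketch below is illustrative only, and the field names are assumptions rather than Purple Robot’s real schema:

    // Hypothetical calibration dictionary built from the patient's category selections.
    var contactCategories = {
        '+15558675309': 'friend',
        '+15551234567': 'coworker'
    };

    // Annotate an outgoing communication-log event with its calibrated category, if known.
    function annotateCommunicationEvent(event) {
        var category = contactCategories[event.number];

        if (category !== undefined) {
            event.contact_category = category;   // extra dimension for the learner
        }

        return event;
    }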

[Screenshots: Contacts List, Contact Details]

The location calibrator is a bit more involved. Modern location technologies return readings that are values on a continuous scale. I can sit perfectly still in a single place, watch my GPS position, and it will flutter in the insignificant digits due to variations like weather conditions, satellite position, and ambient interference. We can clean this data by rounding readings to a certain level of precision, but the error/flutter in the signal may be larger than the precision we’re willing to round away. From a user experience perspective, we also don’t want to place a significant labelling burden on the patient to make sense of the data (“label point X. Now label the point 100 meters west of X. Now label the point 100 meters north of X.” and so on).

To get around these two issues, we deploy a technique called “cluster analysis” that can look at a collection of points in a multidimensional space and highlight dense areas where many points are collocated. Algorithms such as k-means are popular approaches to this problem, but we decided to use an alternative approach called DBSCAN. In short, the DBSCAN algorithm takes two parameters: the maximum distance at which two points are considered neighbors within a cluster, and the minimum number of points that must be that close together for the collection to be considered a cluster. In Purple Robot, DBSCAN allows us to take hundreds of location readings and highlight the ones where the patient spends their time.
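
For the curious, a compact (toy) DBSCAN sketch follows; it illustrates the algorithm’s two parameters rather than Purple Robot’s actual implementation, and the distance function (e.g. a haversine distance over latitude/longitude pairs) is left to the caller:

    // points: array of readings; eps: neighbor distance threshold; minPts: density threshold.
    // Returns a cluster id for each point, or -1 for noise.
    function dbscan(points, eps, minPts, distance) {
        var labels = [];          // labels[i] === undefined means "not yet visited"
        var clusterId = 0;

        function neighborsOf(i) {
            var result = [];
            for (var j = 0; j < points.length; j++) {
                if (distance(points[i], points[j]) <= eps) { result.push(j); }
            }
            return result;
        }

        for (var i = 0; i < points.length; i++) {
            if (labels[i] !== undefined) { continue; }

            var seeds = neighborsOf(i);

            if (seeds.length < minPts) {
                labels[i] = -1;   // noise (may later be adopted as a border point)
                continue;
            }

            clusterId += 1;
            labels[i] = clusterId;

            for (var k = 0; k < seeds.length; k++) {
                var q = seeds[k];

                if (labels[q] === -1) { labels[q] = clusterId; }   // border point, not expanded
                if (labels[q] !== undefined) { continue; }

                labels[q] = clusterId;

                var qNeighbors = neighborsOf(q);

                if (qNeighbors.length >= minPts) {
                    seeds = seeds.concat(qNeighbors);              // grow the cluster
                }
            }
        }

        return labels;
    }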

[Screenshots: Work Cluster, Transit Cluster]

Once we generate the location clusters, we have enough information to build an effective user interface that prompts the patient to label significant places detected by the DBSCAN algorithm. In our case, we combined a map view with a list view: the patient can zoom to a given cluster by tapping the color next to the place label to see where it’s situated on the map, and once they identify the place, they can tap the name to assign it a label. The labelled clusters are then saved, and when the location sensor generates a new reading, that reading is tested against the saved clusters to determine whether it would be a member of one of them. If so, the cluster’s name is appended to the location reading before it is sent to the learning algorithm. In our initial testing this has worked well, but we are still testing the scalability of the approach to determine how much information we can save and process to improve the data we’re generating.
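
One simple way to approximate that membership test (the real implementation may differ) is to check whether a new reading falls within the clustering distance of any point belonging to a saved, labelled cluster:

    // savedClusters: [ { label: 'Home', points: [ ... ] }, ... ]  (hypothetical structure)
    function labelForReading(reading, savedClusters, eps, distance) {
        for (var c = 0; c < savedClusters.length; c++) {
            var cluster = savedClusters[c];

            for (var p = 0; p < cluster.points.length; p++) {
                if (distance(reading, cluster.points[p]) <= eps) {
                    return cluster.label;    // append this label to the reading before transmission
                }
            }
        }

        return null;   // not near any labelled place
    }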

The location and communication log calibrators are only the first of what will probably be many more calibrators. These two sensors were fairly straightforward to tackle, with a clear payoff for improved model generation. We are currently discussing extending this approach to other sensors, such as calibration for activity detection (walking, running, sitting, etc.) using the motion sensors. In this case, instead of an on-screen user interaction, we may implement an alternative sound-only interface that allows the user to keep their device in its usual location while it instructs them to go through various activities, so that the motion calibrator can collect the signals it needs to train its own internal algorithm for a given patient.

This is an exciting area of development for us, and I look forward to sharing more details and notes as we continue to make progress.

Posted by & filed under Android, Cordova, Intellicare, Javascript, Presentation, Techtopia.

Intellicare is a new project for CBITs and it involves the creation of a suite of Android “mini-apps” that each implement a clinical strategy or objective to serve users with anxiety or depression.

To speed the development of these smaller apps, I’ve created an Android template app that serves as a resource/library providing commonly-used features and assets to individual Intellicare apps. For example, it provides the common consent form required by the IRB. It provides a layer over the Android notification framework for managing alerts from a variety of related apps. It includes an icon library for use within the UI.

From my perspective, the native Android bits are falling into place quite well. However, the Cordova/PhoneGap platform is a popular tool among CBITs developers and I’m probably the least schooled in its use. I need to understand better what components from the Javascript world should be part of the Intellicare template.

So, this Friday, I’ll be introducing the app template to CBITs developers, showing how it’s already being used to develop new apps, and opening the floor for a discussion about which other JavaScript-based platforms or frameworks should be bundled with the template beyond the base Cordova system.

Friday, 4-5pm @ Techtopia Room.

Posted by & filed under Android, Django, Machine Learning, Purple Robot, RapidMiner & RapidAnalytics.

While Purple Robot’s main features are its data collection mechanisms and embedded scripting environment, we’ve been working hard to integrate machine learners. Being able to execute learned models on the same device that is collecting data is enormously powerful and allows us to build functionality that takes specific actions when a learner predicts something useful (e.g. “Your mood seems to be unusually poor at the moment – would you like to play Angry Birds to take a break?”) or to help us collect a fuller dataset to improve our models of our user (e.g. “The confidence in predicting your location is low because your latitude and longitude fall outside the bounds of your previously-seen area. Where are you?”).

While implementing robust modeling on the mobile device opens up many interesting possibilities, limitations of typical mobile devices constrain our opportunity. On a technical level, these limitations include:

  • Battery power and lifespan
  • Computational processing power
  • Limited memory
  • Limited & expensive network access

To create a successful mobile experience, we have to weigh the impacts of the systems we are creating with how the user expects their device to behave. For example, we can’t be constantly generating models because that would drain the phone’s power too quickly and the advantages of the mobile platform are lost when the user has to keep it tethered to a charger to keep the device functional. We can’t go too crazy with I/O or memory usage because this will impact the responsiveness of other software running on the system. We also can’t use a cellular network in the same way that we might use a broadband connection – mobile users have much smaller data allocations that are orders of magnitude more expensive.

Given the opportunity offered by machine learning and data mining technologies, we’ve been exploring different approaches to try and capture the best of both worlds. In Purple Robot, we have already addressed some of these issues on the data collection front (such as our store-and-forward data collection & transmission architecture) and some of our approaches mirror what’s worked for us in similar contexts. The remainder of the post will outline how we’re adding learner functionality to Purple Robot.

Training the models

The fundamental and inescapable truth that complicates our life is that Purple Robot is capable of capturing a volume of information in real time that exceeds our ability to analyze it in any time- or space-effective way on the mobile phone. Hardware sensors can be configured to collect samples at rates exceeding 100 Hz, and software-based probes typically collect data on one-, five-, or ten-minute intervals. Running a typical configuration, a mobile device can collect 10 to 15 megabytes of data in an hour. Given that we typically wish to build models from many hours’ worth of data, the overall memory footprint for the dataset in question can consume hundreds of megabytes (or more) of RAM in a typical scenario.

On the software side of things, mobile operating systems are quite conservative in the amount of memory that they will allow user apps to consume before forcefully shutting them down. The maximum size of the heap that Android will allocate varies by device, but it’s typically between 16 and 48 MB. Once we load the resources needed to run the app and any libraries required for a proper analysis, the remaining memory is simply insufficient for most training algorithms’ implementations.

Consequently, we’ve adopted an architecture that forgoes model training on the phone, delegating that responsibility to a server API that can retrieve the data collected from our storage systems and train models on that data on traditional desktop/server hardware that supports much more physical memory and related software infrastructure like disk-based swap spaces.

In our own implementations, we’ve adopted RapidAnalytics as our default learning engine because it provides a user-friendly interface that we can use to create workflows that process our data, train & evaluate models, and package the models (with assistance) in a format that can be expanded, interpreted and executed on the mobile device. (More on this below.) The RapidAnalytics server product provides a simple route from customized workflows authored in RapidMiner to exposing that functionality via a web API.

Surrounding the RapidAnalytics component is a Django web application that implements the necessary functionality that RapidAnalytics does not include. The Django application provides the following services:

  1. Retrieving and packaging data from our storage system (Postgres) into a format ingestible by RapidAnalytics (ARFF).
  2. Maintaining the batch scheduler that handles periodic tasks such as updating the cached ARFF file for a given participant & label. This batch system also automates the training and evaluation of models through RapidAnalytics.
  3. Providing a researcher-facing data dashboard that provides tools to assess the quality of the data being collected from the mobile devices.
  4. Providing the transmission channel for sending models to the mobile devices.
  5. Translating any proprietary data formats (RapidMiner or otherwise) into formats suitable for use on the mobile device.

On the server side, we are currently training models (decision trees) successfully and caching the results on a per-participant basis. We are now improving that process to better handle anomalies like missing data, which trip up more sensitive learners whose implementations don’t cope well with the messy data we collect in the real world.

[Diagram: system architecture]

Executing the models

Once we’ve generated models using our server infrastructure, the mobile device fetches the trained models from the server and replaces any existing model for a given label or concept with the newest one. Note that the mobile device has neither sufficient history nor the metadata to determine whether one model is better than another, so the server infrastructure is solely responsible for guaranteeing that it’s making the best model available to the mobile device.

Once the device has the model, we execute it locally either using our embedded scripting environments (JavaScript or Scheme) or with the assistance of an existing native Java library. In the case of decision trees, we take the model produced by RapidAnalytics and generate a JavaScript if/then tree that implements the decision tree model. In the case of support vector machines, we take a textual representation of the support vectors and generate a native evaluator using the LibSVM library.
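
To give a sense of what the generated JavaScript looks like, here is a hand-written illustration of an if/then evaluator; the feature names, thresholds, and labels are made up for the example and are not taken from a real model:

    // Hypothetical decision-tree evaluator of the kind generated from a RapidAnalytics model.
    function predictLabel(features) {
        if (features.location_cluster === 'home') {
            if (features.hours_since_last_call > 18.0) {
                return 'low';
            }

            return 'neutral';
        }

        if (features.accelerometer_variance > 0.35) {
            return 'active';
        }

        return 'neutral';
    }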

By offloading the training of the learners to the server infrastructure, but keeping the real-time evaluation of the data on the local device, we reap the benefits of computationally-expensive training algorithms while remaining responsive and accessible (in periods where network connectivity may be spotty) on the device itself. The two main drawbacks to this approach are that models on the mobile device may become stale if it’s unable to retrieve updated models from the server for any reason, and that the choice of learner algorithms is not entirely open-ended, since we must still be cognizant of how device limitations can constrain model execution.

The most salient example of these constraints that we’ve encountered is dealing with missing data. During the lifetime of a mobile device, particular sensors may be deactivated for a variety of reasons, including limited predictive utility, selective shutdowns by the system to conserve battery life, or the user changing the parameters under which the sensor operates. Consequently, the feature set that we provide with the labels is quite likely to be dynamic in structure. New feature values may be introduced in later data sets, and values may later be removed.

For models that are robust against missing values (such as C4.5 decision trees), this isn’t a major issue. However, for other algorithms (such as RapidMiner’s SVM implementation), missing data can prevent the classifier from producing results, so imputing the missing values becomes an important part of model execution. Since the mobile device does not have the storage capacity to keep a full history of everything it has ever sensed, this lack of history can preclude techniques that depend on the historical distribution of values to compute a replacement for a missing attribute. Consequently, on the server side, algorithms must be chosen and configured with these execution limitations in mind if the models used by the device are to have the same performance characteristics as the models trained on the server.
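
A sketch of what a simple device-side imputation step could look like (assuming, hypothetically, that the server ships per-feature defaults such as training-set means alongside the model):

    // Fill in any missing features with server-provided defaults before evaluating the model.
    function imputeMissing(features, defaults) {
        Object.keys(defaults).forEach(function (name) {
            if (features[name] === undefined || features[name] === null) {
                features[name] = defaults[name];
            }
        });

        return features;
    }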

Current status

On the mobile device side of the software development, we’ve run some small feasibility experiments with deserializing and executing decision tree models and have been successful on that front. While LibSVM has been included with the Purple Robot distribution for several weeks, we are still working on resolving the issues on the server that will reliably produce a trained SVM in the presence of missing and dirty data. Once we are successful on that front, we already have a process in place for converting the RapidAnalytics SVM output into a format suitable for deserialization (via LibSVM) on the device.

While we’re still in the process of assembling this machine-learning infrastructure, we’re excited to begin applying it in a productive, user-facing manner on the mobile devices. As I mentioned in the introduction, I believe that marrying the real-time execution of models with our existing trigger framework will allow us to create more personalized and responsive interventions and products than our current schedule-based systems. I’m also quite interested to see whether we can employ the confidence estimates of our predictions to help us obtain more data where we need it, and interrupt the user less often when we’re already producing reasonably confident predictions in contexts we’ve observed repeatedly.

Posted by & filed under Android.

When I started working for CBITs back in September, my initial task was to develop a set of tools that would complement existing mobile apps. Since these apps had been constructed using the Apache Cordova platform, the advantages of using JavaScript as the base technology platform were tempered by Cordova’s inability to fully integrate with native platforms (example, example) once the user was outside the app hosting the root WebView. With this challenge in hand, I started writing Purple Robot to bridge this gap in functionality.

The initial incarnation of Purple Robot was primarily concerned with providing a time-based triggering mechanism that could notify the user in one of three ways: status-bar notifications, app widgets, and a full dialog when needed. The system operated by downloading a JSON configuration file from a remote web server that encoded schedules and other conditions dictating when a given notification was to be presented. A date trigger example:

    {
        "type": "datetime",
        "name": "Repeating Test Date Trigger",
        "action": "PurpleRobot.launchUrl('http://www.twitter.com');",
        "datetime_start": "20120920T092000",
        "datetime_end": "20120920T092100",
        "datetime_repeat": "FREQ=HOURLY;INTERVAL=1;BYHOUR=9,10,11,12,13,14,15,16"
    }

This trigger opens the Twitter web page in the default browser 20 minutes after each hour between 9am and 4pm. (If you’ve used a calendaring program, the repeating concept will be familiar to you.) We created a small JavaScript runtime and inserted our own PurpleRobot global object (and mini-API) that serves as the interface between the JavaScript action and the rest of the system. If we wanted to show a status-bar notification (like you get when a new text message arrives), we could change the action parameter:

"action": "PurpleRobot.showApplicationLaunchNotification
('Open CBITs App', 'Tap this notification to open CBITs.app.', 'CBITs', 0);",

When the user selected the notification, it would launch a local app named CBITs. With the triggering mechanism, Purple Robot addressed some initial pain points and provided a way for us to push content and activities to mobile devices out in the field. In the months since then, Purple Robot has been extended dramatically to provide additional services to our other apps on Android devices:
* A full local sensor acquisition platform and methods to intelligently save sensor data to external web endpoints.
* Local interfaces to visualize and understand local sensor history. (e.g. Where have I been today?)
* An embedded web server that builds upon the initial JavaScript runtime and allows local apps and webpages to access Purple Robot services. (e.g. A Cordova app needs to update the content of a Purple Robot-provided app widget; a rough sketch of such a call appears at the end of this post.)

Over the past three months, the application has grown from a single idea into an 80-thousand-line system that provides native system integration and infrastructure to applications that would otherwise go without. Over the next several weeks, I’ll dive into particular aspects of the system to describe how it’s built and why.
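
As a taste of that embedded-web-server idea, here is a rough sketch of what a call from a local webpage or Cordova app might look like; the port, path, and payload shape below are placeholders for illustration, not the real API:

    // Hypothetical call into the embedded Purple Robot web server running on the device.
    function runPurpleRobotScript(script, callback) {
        var request = new XMLHttpRequest();

        request.open('POST', 'http://localhost:12345/json/submit', true);
        request.setRequestHeader('Content-Type', 'application/x-www-form-urlencoded');

        request.onreadystatechange = function () {
            if (request.readyState === 4) {
                callback(request.status === 200 ? request.responseText : null);
            }
        };

        request.send('json=' + encodeURIComponent(JSON.stringify({ script: script })));
    }

    runPurpleRobotScript("PurpleRobot.launchUrl('http://www.twitter.com');", function (response) {
        console.log(response);
    });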

Posted by & filed under Techtopia.

Happy 2013 and congratulations on surviving the Mayan apocalypse! My name is Chris Karr and I’m one of the software guys (along with Mark, Evan, Gabe, and others) at “Techtopia”, part of the Center for Behavioral Intervention Technologies at Northwestern University’s Feinberg School of Medicine. At Techtopia, we work with both internal and external researchers and collaborators to create the next generation of psychological tools to assist patients in need.

I’ve officially been a member of the Techtopia crew since last September (2011), but I’ve been working with them in an outside capacity as far back as 2009, when I put together a Qt-based native app for the initial Mobilyze trials on Nokia Symbian devices. When I decided to exit the independent consulting business in 2011, I evaluated a variety of options and joined CBITs because of the extremely interesting and meaningful work they were doing. In the Chicago area, there were few opportunities for me to take techniques like machine learning and ubiquitous computing and apply them to concrete problems facing everyday people. At CBITs, I saw that opportunity and took it.

In terms of my background, my formal education consists of a Bachelor’s degree in Computer Science from Princeton University (where I worked with Brian Kernighan to create a system predating and not dissimilar to Google Earth) and a Master’s degree in Media, Technology and Society from Northwestern University’s School of Communication, where my work with Darren Gergle focused on applied context-aware computing. Prior to my Master’s work, I worked for Northwestern University the first time at Academic Technologies, where I was responsible for creating technologies for professors and others that supported the University’s teaching and research missions. After receiving my Master’s degree, I started and ran Audacious Software as a consulting firm providing services to start-up and research clients.

As a software craftsman, my primary interest lies in the realm of ubiquitous computing. It’s my personal belief that we can be much more ambitious with our mobile platforms beyond today’s basic and isolated apps, and I look forward to writing more about how I translate that belief into shipping products. If you’re interested in learning more about my approach, I strongly encourage you to take a look at some of the work I’ve done thus far and my ongoing projects:
* Shion: An open-source platform for home automation. (GitHub: desktop app, iOS app)
* Pennyworth: My original context-awareness research on the Mac and other platforms.
* SMSBot: An open-sourced platform for creating interactive dialogs using SMS text messaging.
* Fresh Comics: This is what happens when an ubicomp obsession meets a comic book obsession.
* Audacious Software: A select list of my work from my past life as an independent contractor.

I’m glad to have this opportunity to share my work and thoughts in the months ahead and I hope that my upcoming posts are both interesting and useful.