
Method

The core idea of the visualization is simple and breaks down into five steps.
  1. The database is filled with a corpus of documents, each linked to the people involved.
  2. Every document in the database is analyzed for keywords using term frequency-inverse document frequency (tf-idf).
  3. A link is made between a person and every keyword that person has used.
  4. The top several hundred links are chosen and a graph is built. Every edge connects exactly one person to one keyword.
  5. The graph is laid out with a force-directed graph drawing algorithm. People are orange nodes and keywords are blue nodes. Nodes are treated as positively charged particles, so they repel one another, and edges are treated as springs that pull linked nodes together.
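
Steps 2 through 4 can be sketched in a few lines. The following is a minimal illustration in Python, not the project's Ruby code. The function name, the whitespace tokenizer, the choice to sum a person's tf-idf scores across documents, and the default of 300 links are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def person_keyword_links(docs, top_n=300):
    """Rank (person, keyword) links by summed tf-idf weight.

    docs is a list of (person, text) pairs. Splitting on whitespace and
    summing scores per person are illustrative assumptions.
    """
    tokenized = [(person, text.lower().split()) for person, text in docs]

    # Document frequency: how many documents contain each term.
    doc_freq = Counter()
    for _, words in tokenized:
        doc_freq.update(set(words))
    n_docs = len(tokenized)

    weights = defaultdict(float)
    for person, words in tokenized:
        if not words:
            continue
        for term, count in Counter(words).items():
            tf = count / len(words)
            idf = math.log(n_docs / doc_freq[term])
            weights[(person, term)] += tf * idf

    # Keep only the strongest links; each one becomes a graph edge.
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    return [(person, term, weight) for (person, term), weight in ranked[:top_n]]
```

A term that appears in every document gets an idf of zero, so common words drop to the bottom of the ranking without a separate stop-word list.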

Technologies

Gmail

The system relies on Gmail's filters and labels to automatically mark emails as processed.

Ruby

A Ruby daemon polls the email server and automatically adds emails to the document corpus as they arrive.

ActiveRecord

In retrospect, using ActiveRecord was a mistake. I encourage anyone attempting to implement this system to either use pure SQL or look into one of the saner existing implementations of tf-idf. The technicalities of implementing an efficient tf-idf system are outside the scope of this project.
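
As a rough idea of what the pure SQL route could look like, here is a sketch that computes tf-idf in a single query, run against an in-memory SQLite database from Python. The `term_counts` table, its columns, and the `nat_log` helper are hypothetical and invented for this example; the project's real schema is not shown here.

```python
import math
import sqlite3

# One row per (document, term) with the term's count in that document.
# Table and column names are assumptions made for this sketch.
TFIDF_SQL = """
WITH doc_len AS (
    SELECT doc_id, SUM(n) AS total_terms FROM term_counts GROUP BY doc_id
),
doc_freq AS (
    SELECT term, COUNT(*) AS df FROM term_counts GROUP BY term
),
corpus AS (
    SELECT COUNT(*) AS n_docs FROM doc_len
)
SELECT tc.doc_id, tc.term,
       (1.0 * tc.n / doc_len.total_terms)
       * nat_log(1.0 * corpus.n_docs / doc_freq.df) AS tfidf
FROM term_counts AS tc
JOIN doc_len ON doc_len.doc_id = tc.doc_id
JOIN doc_freq ON doc_freq.term = tc.term
CROSS JOIN corpus
ORDER BY tfidf DESC, tc.doc_id, tc.term
"""


def open_corpus():
    """Create an empty in-memory corpus with a natural-log SQL function."""
    conn = sqlite3.connect(":memory:")
    # Register our own logarithm so the query does not depend on
    # SQLite having been compiled with its optional math functions.
    conn.create_function("nat_log", 1, math.log)
    conn.execute(
        "CREATE TABLE term_counts ("
        "doc_id INTEGER, term TEXT, n INTEGER, PRIMARY KEY (doc_id, term))"
    )
    return conn


def tfidf_rows(conn):
    """Return (doc_id, term, tfidf) rows, highest score first."""
    return conn.execute(TFIDF_SQL).fetchall()
```

Letting the database do the aggregation avoids loading every document and term into application objects, which is presumably where an ORM-based approach becomes slow.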

Rails

A simple Rails site serves up the GraphML used by Flare. Any server that supports some kind of scripting would do.
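
To give a sense of what the served document might contain, the sketch below turns a list of person-keyword links into GraphML using Python's standard library. The `name` and `kind` data keys and the generated node ids are assumptions for illustration; the attributes the original site emitted for Flare are not documented here.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def links_to_graphml(links):
    """Serialize (person, keyword) pairs as a GraphML string."""
    root = ET.Element("graphml", xmlns=GRAPHML_NS)
    # Declare the per-node attributes (hypothetical names).
    for key_id in ("name", "kind"):
        ET.SubElement(root, "key", {
            "id": key_id, "for": "node",
            "attr.name": key_id, "attr.type": "string",
        })
    graph = ET.SubElement(root, "graph", edgedefault="undirected")

    node_ids = {}

    def node_id(kind, name):
        # Create each node once and reuse its id afterwards.
        key = (kind, name)
        if key not in node_ids:
            node_ids[key] = f"n{len(node_ids)}"
            node = ET.SubElement(graph, "node", id=node_ids[key])
            ET.SubElement(node, "data", key="name").text = name
            ET.SubElement(node, "data", key="kind").text = kind
        return node_ids[key]

    # Emit all nodes first, then the edges that connect them.
    pairs = [(node_id("person", p), node_id("keyword", k)) for p, k in links]
    for source, target in pairs:
        ET.SubElement(graph, "edge", source=source, target=target)

    return ET.tostring(root, encoding="unicode")
```

Keeping people and keywords in separate id namespaces means a person and a keyword that share a spelling still become two distinct nodes.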

Flex and Flare

Flare is a fantastic visualization library for ActionScript and Adobe Flex, born out of the Java prefuse library and developed by researchers at the UC Berkeley Visualization Lab. Flare has built-in support for force-directed graphs and GraphML, so implementation was a breeze.

AIM

A simple proof-of-concept plug-in for the popular instant messaging client automatically adds links for keywords. Although the mockup image shows iChat, only Windows clients are supported.

Groups identified in Overview

Roll over the blue highlighted portions of this image of a generated graph to see the groups automatically identified by the visualization. The graph was generated automatically after 1000 iterations of the physics engine. The blue circles and rollover labels were added manually after the fact by the researcher.
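
For readers curious what an iteration of the physics engine involves, here is a small sketch of a force-directed layout in Python. It is not Flare's implementation. The inverse-square repulsion, the linear springs, the per-step movement cap, and every parameter value are assumptions chosen to keep the example short and stable.

```python
import math
import random


def force_layout(nodes, edges, iterations=1000, repulsion=1.0,
                 stiffness=1.0, rest_length=1.0, step=0.02,
                 max_move=0.1, seed=0):
    """Return {node: [x, y]} after simulating charges and springs."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for n in nodes}

    for _ in range(iterations):
        force = {n: [0.0, 0.0] for n in nodes}

        # Every pair of nodes repels like two same-sign charges.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = repulsion / dist ** 2
                force[a][0] += push * dx / dist
                force[a][1] += push * dy / dist
                force[b][0] -= push * dx / dist
                force[b][1] -= push * dy / dist

        # Every edge acts as a spring pulling toward its rest length.
        for a, b in edges:
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            dist = math.hypot(dx, dy) or 1e-9
            pull = stiffness * (dist - rest_length)
            force[a][0] += pull * dx / dist
            force[a][1] += pull * dy / dist
            force[b][0] -= pull * dx / dist
            force[b][1] -= pull * dy / dist

        # Move each node a small, capped distance along its net force.
        for n in nodes:
            mx, my = step * force[n][0], step * force[n][1]
            size = math.hypot(mx, my)
            if size > max_move:
                mx, my = mx * max_move / size, my * max_move / size
            pos[n][0] += mx
            pos[n][1] += my

    return pos
```

Two linked nodes settle at the distance where the spring's pull equals the repulsion, which is why linked people and keywords end up near each other while unrelated groups drift apart.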

Live Demo of Verbose View

The verbose view works well for a limited subsection of the graph. In the future, it would be nice to identify groups automatically, using either a graph algorithm such as Dijkstra's or k-means clustering, and to visualize them this way. Try dragging nodes around to see how the springs and charges respond. Reload the page to watch the physics do its work.
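
One way the k-means idea could work is to cluster the node positions produced by the layout. The sketch below is a plain Python version of Lloyd's algorithm. It assumes clustering runs over 2D coordinates and uses a deterministic farthest-first choice of starting centroids; neither detail comes from the project itself.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two 2D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans(points, k, max_iterations=100):
    """Group 2D points into k clusters; returns (labels, centroids)."""
    # Farthest-first start: begin with the first point, then repeatedly
    # add the point farthest from all centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )

    labels = None
    for _ in range(max_iterations):
        # Assign each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels

        # Move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                centroids[i] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )

    return labels, centroids
```

The number of groups k has to be chosen in advance, which is a real limitation here, since the groups in the overview were picked out by eye.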