Method
The core idea of the visualization is simple and can be broken down into a few steps.
- The database is filled with a corpus of documents, each linked to the people involved.
- Every document in the database is analyzed for keywords using term frequency-inverse document frequency.
- A link is made between a person and every keyword that person has used.
- The top several hundred links are chosen and a graph is made. Every edge connects exactly one person to one keyword.
- A force-directed graph drawing algorithm is used to lay out the graph. People and keywords are nodes, colored orange and blue respectively. Nodes are treated as positively charged particles that repel one another, and edges are treated as springs that pull connected nodes together.
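The keyword and link steps above can be sketched in a few lines of plain Ruby. This is a minimal illustration, not the code used in the project: the `DOCS` sample data, the `tokenize`, `link_scores` and `top_links` helpers, and the choice to score a link by summing tf-idf over all of a person's documents are assumptions made for the example.

```ruby
# Hypothetical sample corpus: each document is tied to one person.
DOCS = [
  { person: "alice", text: "graph layout graph drawing" },
  { person: "bob",   text: "email filters and labels" },
  { person: "alice", text: "email graph" }
].freeze

# Split text into lowercase alphabetic tokens.
def tokenize(text)
  text.downcase.scan(/[a-z]+/)
end

# Returns { [person, keyword] => score }, where the score is the sum of
# tf-idf values over all of that person's documents (an assumption).
def link_scores(docs)
  tokenized = docs.map { |doc| tokenize(doc[:text]) }

  # Document frequency: how many documents contain each term.
  df = Hash.new(0)
  tokenized.each { |tokens| tokens.uniq.each { |term| df[term] += 1 } }

  scores = Hash.new(0.0)
  docs.each_with_index do |doc, i|
    tokens = tokenized[i]
    next if tokens.empty?

    counts = Hash.new(0)
    tokens.each { |term| counts[term] += 1 }

    counts.each do |term, count|
      tf  = count.to_f / tokens.size
      idf = Math.log(docs.size.to_f / df[term])
      scores[[doc[:person], term]] += tf * idf
    end
  end
  scores
end

# Keep only the highest-scoring person-keyword links.
def top_links(docs, limit)
  ranked = link_scores(docs).sort_by { |_, score| -score }.first(limit)
  ranked.map do |(person, keyword), score|
    { person: person, keyword: keyword, score: score }
  end
end
```

Calling `top_links(DOCS, 300)` returns at most 300 hashes, each naming one person and one keyword, which matches the one-person-to-one-keyword edge shape described above. A term that appears in every document gets an idf of zero and so sinks to the bottom of the ranking.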
Technologies
Gmail
The system relies on Gmail's filters and labels to automatically mark emails as processed.
Ruby
A Ruby daemon polls the email server and automatically adds emails to the document corpus as they arrive.
ActiveRecord
In retrospect, using ActiveRecord was a mistake. I encourage anyone attempting to implement this system to either use pure SQL or look into one of the saner existing implementations of tf-idf. The technicalities of implementing an efficient tf-idf system are outside the scope of this project.
Rails
A simple Rails site serves the GraphML used by Flare. Any server that supports some type of scripting will do.
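As a rough illustration of what such a script has to produce, the sketch below turns a list of person-keyword links into a GraphML string using only the Ruby standard library. The `to_graphml` helper, the `name` and `kind` data keys, and the node id scheme are assumptions for this example; the attributes Flare actually expects in the original project are not documented here.

```ruby
require "cgi"

# Build a GraphML document from links shaped like
# { person: "alice", keyword: "graph" } (hypothetical input format).
def to_graphml(links)
  ids   = {}
  nodes = []

  # Assign each distinct person or keyword one node id, emitting the node once.
  node_id = lambda do |kind, name|
    ids[[kind, name]] ||= begin
      id = "n#{ids.size}"
      nodes << %(    <node id="#{id}"><data key="name">#{CGI.escapeHTML(name)}</data><data key="kind">#{kind}</data></node>)
      id
    end
  end

  # Every edge connects exactly one person to one keyword.
  edges = links.map do |link|
    source = node_id.call("person", link[:person])
    target = node_id.call("keyword", link[:keyword])
    %(    <edge source="#{source}" target="#{target}"/>)
  end

  <<~XML
    <?xml version="1.0" encoding="UTF-8"?>
    <graphml xmlns="http://graphml.graphdrawing.org/xmlns">
      <key id="name" for="node" attr.name="name" attr.type="string"/>
      <key id="kind" for="node" attr.name="kind" attr.type="string"/>
      <graph edgedefault="undirected">
    #{nodes.join("\n")}
    #{edges.join("\n")}
      </graph>
    </graphml>
  XML
end
```

In a Rails controller the resulting string could be returned with an XML content type; any other scripting environment only needs to print the same text.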
Flex and Flare
Flare is a fantastic visualization library for ActionScript and Adobe Flex, born out of the Java prefuse library by the researchers at the UC Berkeley Visualization Lab. Flare has built-in support for force-directed graphs and GraphML. Implementation was a breeze.
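Flare performs the layout itself, so nothing like the following had to be written for this project. For readers unfamiliar with the charged-particle and spring model, though, one iteration of a generic force-directed layout can be sketched in Ruby as below. The `layout_step` function, its parameter names and its default constants are all illustrative assumptions and do not reflect Flare's internals.

```ruby
# One iteration of a simple force-directed layout.
# positions: { node_id => [x, y] }, edges: [[node_a, node_b], ...]
# Returns a new positions hash; the input is not modified.
def layout_step(positions, edges, repulsion: 1.0, stiffness: 0.1,
                rest_length: 1.0, step: 0.05)
  forces = {}
  positions.each_key { |id| forces[id] = [0.0, 0.0] }

  # Every pair of nodes repels, like two charges of the same sign.
  positions.keys.combination(2) do |a, b|
    dx = positions[a][0] - positions[b][0]
    dy = positions[a][1] - positions[b][1]
    dist = [Math.hypot(dx, dy), 1e-6].max
    magnitude = repulsion / (dist * dist)
    fx = magnitude * dx / dist
    fy = magnitude * dy / dist
    forces[a][0] += fx
    forces[a][1] += fy
    forces[b][0] -= fx
    forces[b][1] -= fy
  end

  # Every edge acts as a spring pulling its endpoints toward rest_length.
  edges.each do |a, b|
    dx = positions[b][0] - positions[a][0]
    dy = positions[b][1] - positions[a][1]
    dist = [Math.hypot(dx, dy), 1e-6].max
    magnitude = stiffness * (dist - rest_length)
    fx = magnitude * dx / dist
    fy = magnitude * dy / dist
    forces[a][0] += fx
    forces[a][1] += fy
    forces[b][0] -= fx
    forces[b][1] -= fy
  end

  # Move each node a small distance along its net force.
  moved = {}
  positions.each do |id, (x, y)|
    moved[id] = [x + step * forces[id][0], y + step * forces[id][1]]
  end
  moved
end
```

Repeating `layout_step` until the positions stop changing much gives the familiar result: unrelated nodes drift apart while a person and the keywords linked to them settle close together.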
AIM
A simple proof-of-concept plug-in for the popular instant messaging client automatically adds links for keywords. Although the mockup image shows iChat, only Windows clients are supported.



