Business - Written by Jeff DeChambeau on Tuesday, May 27, 2008 17:07 - 1 Comment
The Archeology of (Programmers’) Social Artifacts

Ok, not quite.
My friend Abram Hindle is doing some fascinating research: he’s working on ‘mining artifacts from versioned software.’ Here’s what that means:
In software development, programmers use a central database to keep track of every change made to the code of their software. This database keeps copies of all earlier versions of the software, as well as the current version (and maybe some unstable versions for testing).
Abram is writing a piece of software that analyzes a development database like the one described above. This software looks through every iteration of the project’s code, determines what changes were made, by whom, and when, and stores that information in its own database. Based on this data, mined by comparing previous versions of in-development programs, Abram’s software can figure out how much time the programmers spent on each part of the program. What’s more, the software can even determine which programmers create or fix the most errors.
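To make the idea concrete, here is a minimal sketch of that kind of mining. It is not Abram’s software: the commit records, the `changes` table, and the function names are all made up for illustration, and a real tool would read the history from the version control system instead of a hard-coded list.

```python
import sqlite3

# Hypothetical commit records: (revision, author, ISO timestamp, file touched).
# A real miner would read these from the version control system's history.
COMMITS = [
    (1, "alice", "2008-01-03T10:00", "parser.c"),
    (2, "bob",   "2008-01-04T09:30", "parser.c"),
    (3, "alice", "2008-01-05T14:15", "ui.c"),
    (4, "alice", "2008-01-09T11:45", "parser.c"),
]

def build_change_db(commits):
    """Store who changed what, and when, in a separate in-memory database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE changes "
        "(revision INTEGER, author TEXT, changed_at TEXT, path TEXT)"
    )
    db.executemany("INSERT INTO changes VALUES (?, ?, ?, ?)", commits)
    db.commit()
    return db

def changes_per_author(db, path):
    """Count how many recorded changes each author made to one file."""
    rows = db.execute(
        "SELECT author, COUNT(*) FROM changes WHERE path = ? "
        "GROUP BY author ORDER BY author",
        (path,),
    )
    return dict(rows.fetchall())

db = build_change_db(COMMITS)
parser_counts = changes_per_author(db, "parser.c")
print(parser_counts)
```

Once the history sits in its own table like this, questions such as "who touched this file most often?" become one-line queries.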
These techniques allow programmers to be socially linked by virtue of which parts of the code they edit, regardless of when each programmer makes their changes. On being able to map programming contributions socially like that, my buddy Phil says “I think if we did that here at my work, I’d be best friends with a guy who quit 10 years ago.” And he’s right: lots of companies have version control databases that reach back 5, 10, even 25 years. With this software you could look inside the old code and take a long view of the effectiveness of programming teams over the years under different management regimes, or just track the lifetime growth of a given subroutine.
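One simple way to picture that social linking, assuming a link just means "these two people have edited the same file," is sketched below. The edit history, the names, and the scoring rule are hypothetical, and the actual research may weight links quite differently.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical edit history: (author, year of the edit, file edited).
EDITS = [
    ("ann", 2008, "billing.c"),
    ("raj", 1998, "billing.c"),
    ("ann", 2008, "report.c"),
    ("raj", 1997, "report.c"),
    ("lee", 2007, "report.c"),
]

def shared_file_links(edits):
    """Weight each pair of authors by the number of files both have edited."""
    authors_by_file = defaultdict(set)
    for author, _year, path in edits:
        # The year is ignored on purpose: links do not depend on timing.
        authors_by_file[path].add(author)

    links = defaultdict(int)
    for authors in authors_by_file.values():
        for pair in combinations(sorted(authors), 2):
            links[pair] += 1
    return dict(links)

links = shared_file_links(EDITS)
print(links)
```

In this toy data, ann and raj end up with the strongest link even though their edits are a decade apart, which is exactly the situation Phil is joking about.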
This form of data mining isn’t only applicable to software programming, though; it will work with any kind of version controlled document (I’m looking at you, wikis). With a mining program like this, you could examine your company wiki and see — nicely summarized — the types of contributions that each editor makes, how long they take to do it, and where they like to spend their time editing.
All of this isn’t without its dark side, though. Programming can be an involved and complicated process, often too much so to be neatly summarized by a graph. That is, there’s the danger that normal programming practices could be misunderstood by managers, who penalize programmers for generating errors, all the while losing sight of the fact that those programmers are generating, by a wide margin, the most code.
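A quick bit of arithmetic shows how a raw error count can mislead. The numbers below are invented purely for illustration, and bugs per thousand lines is only one rough way to normalize. It is not a recommendation for how to judge programmers.

```python
# Hypothetical totals per programmer: (lines of code written, bugs introduced).
OUTPUT = {
    "ann": (20000, 40),
    "raj": (2000, 10),
}

def bugs_per_thousand_lines(lines, bugs):
    """Normalize a raw bug count by the amount of code written."""
    return bugs / (lines / 1000)

rates = {
    name: bugs_per_thousand_lines(lines, bugs)
    for name, (lines, bugs) in OUTPUT.items()
}

# Who looks worst on a raw count, and who looks worst once output is considered?
most_bugs = max(OUTPUT, key=lambda name: OUTPUT[name][1])
highest_rate = max(rates, key=rates.get)
print(most_bugs, highest_rate, rates)
```

Here ann has four times as many bugs as raj on a raw count, but less than half his rate once you account for how much code each of them wrote.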
I think that this is just one example of a larger theme: that we’re able to extract useful data from the very process of creating and sharing useful data. I’m very excited to see where research like this goes over the coming months.
1 Comment
Jenn Durley


I have a family member who programs for a major software company. He will attest to management’s misunderstanding of this type of data (something to do with him being the programmer who churns out the most code / bugs). That said, I agree that it is very interesting to think about non-software applications for this type of data mining. Fantasy application: exposing that deadbeat member of a group project whose contribution was the Title Page.