Naumi Haque
The digital identity divide

If you haven’t conducted this experiment yet, visit the MIT Personas project and type your name into the search field. What comes out is a visual representation of your digital self. As noted on the project page, “Personas shows you how the Internet sees you.”
Naumi MIT visual

Over the past year, we’ve been extensively researching the topic of digital identity. Not surprisingly, a lot of the theory we explore directly affects our day-to-day lives and how we interact online. As an example, I’m finding a growing “identity divide” as my various social graphs intersect (or, more importantly, don’t intersect) with my digital self. To put it bluntly, it’s becoming increasingly uncanny how my online persona, instead of converging, is in many ways actually diverging from the “real” me.

Why is this? One reason is that many of our most significant interactions, those with family and close friends, occur offline and are not captured as part of our digital identities. I think the bigger reason is that we’re constantly reminded that our digital identities are entities that need to be managed, so what does appear online tends to be a highly sanitized version of us. danah boyd has written extensively about digital identity. In her 2002 thesis, “Faceted Id/entity: Managing Representation in a Digital World,” she says:

“In any given situation, an individual presents a face, which is the social presentation of one facet of their identity. I believe that an individual has a coherent sense of self, but in presenting only facets of their identity, they are perceived as fragmented. People maintain many different social facets and often associate particular facets, and therefore faces, with particular contexts.”

For knowledge workers like me, the vast majority of Internet use is work-related. In many ways, if you’re developing a professional online brand, the decision to cut out personal details is made for you. I could use Twitter and blogs to talk about how cute the baby is or to complain about how I was up all night working on a paper, but how would that contribute to my brand? Why would anyone care about what movie I saw on the weekend or who I’m cheering for in the playoffs? The general question is: What is the value of using the public Web as a venue to air personal issues?

I would argue not much (unless, maybe, I’m a celebrity). But it does create a gap in my online identity. In a sense, it’s unfortunate that what I talk about online is of no interest to family and friends. I could develop professional distinction on the Web, but few of the people who are really near and dear to me would notice.

One of the models we can use to help think about this digital identity divide is the Johari Window. The image below shows a 2×2 matrix that contains different types of information about our identity – aspects that are known to us and those that are known to others. Over time, more and more of the matrix comes into play as personal information is digitized and shared. Prior to social networking, everyone projected a particular image within which there were characteristics of themselves that they knew about and chose to share: the Arena. There were also characteristics of which they were aware, but chose not to share: the Façade. Social networking technologies changed this, because they can now uncover a whole other aspect of the self: the Blind Spot(s) – things that we don’t know about ourselves, but that are well known to others (e.g. characteristics such as habits, expressions, or even behaviors).

The true power of social networking becomes apparent in the co-operative era, during which knowledge is taken from a narrow to a broad base by organizations that can now link up personal information known in different contexts/realms and make it useful for their own purposes. Finally, we have the tireless machine era, in which we have to imagine a world (not too far away) where machines and software will be operating 24/7/365 in the background, enabling powerful reality mining applications. Much of what is uncovered in the tireless machine era will still have been unknown to us and to others as well, but will be collected, aggregated, analyzed, monitored, and tagged by powerful, always-on systems running in the background of all Web applications. (Note: nGenera members will hear more about the Johari Window and pervasive digital identity in several forthcoming research projects.)

Johari Window Main

The progression to the tireless machine era is by no means linear, and I think what we’re experiencing now are growing pains. Currently, I think the highly managed approach to most digital identities is moving more information from the Arena to the Façade, creating the divide I talk about above. The purposeful collection of personal data by companies is adding to our Blind Spots (e.g. companies can now use social network analysis and reality mining to identify behaviors and trends that we don’t perceive), but I don’t think the technology is quite good enough yet to create much that is fully Unknown.

Johari Blind and Facade

I don’t think anyone feels comfortable with the idea of always putting up a facade – most of us would like to think we “keep it real.” However, I think in many ways the Internet gives us little choice. A Net Gen take on Richard Florida’s Creative Class asks, “Why leave your personality at home? As free time blends with time on the job you want to make a statement and be comfortable at the same time.” It’s a nice thought, but it sounds unlikely for most professional environments.


Doug Cornelius
Aug 20, 2009 13:24

Naumi –

Some really interesting thoughts here.

The MIT Personas project is very interesting, but it mischaracterizes itself. It does not produce a representation of a person; it merely produces a representation of the name. I ended up with a big chunk labeled “sports” because it pulled information on Doug Cornelius, the basketball coach at Yuba College (not me).

The other flaw is more central to your discussion. It only pulls information from the public internet, so it is missing the rich information in closed online communities. Facebook is the biggest missing piece, and it also happens to be the place where people expose more of their personal side.

Different parts of the internet allow you to expose different parts of your persona.

A professional space like this for you, or my blog for me, is probably of little interest to those who are near and dear. But in alternative spaces and closed online communities you publish information and communicate in a different way with different people than you would in a very public space.

I did a similar 4 box analysis pointing out that the line is not merely personal versus professional, but also public versus private. You can see that here:

Naumi Haque
Aug 20, 2009 16:17

Hey Doug, great point on the MIT tool – a clear flaw that I didn’t really notice since I have a pretty unique name. In terms of pulling data from Facebook and other private sites – I think that would be really neat to see since it would reveal a much truer picture of myself. Still, in many respects, I think my Facebook profile is fairly sanitized as well. I can control how I share information across different social graphs, but there’s some stuff that still doesn’t make it up there at all (pictures of me partying in university, or guilty pleasures such as bad reality TV that I’ll never admit to liking, conversations I share with my non-Facebook friends and parents, etc.). I like the 2×2 you suggest on your site – I made some comments there as well.

Steve Guengerich
Aug 20, 2009 18:25

Cool MIT gadget, but talk about yer digital divide:

- what about the blind (no meta data or apparent alt tags in the result for screen reading),

- color blind (aren’t >15% of anglo men red/green color blind?), and

- non-English speaking/reading

A classic Media Lab “proof of concept”: good at stimulating ideas, but it leaves a bit to be desired regarding inclusiveness.

Aaron Zinman
Aug 20, 2009 22:54

Hi, creator of Personas here.

Nice blog post. It is certainly true that much of our lives are being lived online, and therefore our digital identity can converge to a pretty good approximation of ourselves through our traces of history online. There is still a long way to go, but tools that aggregate and make sense out of our online histories will enable major new technologies to leap forward as computers have some kind of accurate social intelligence.

Now, to Doug Cornelius, Naumi Haque, and ESPECIALLY Steve Guengerich, a little background that’s missing on Personas. Did you guys realize that this is an art piece that is part of a larger exhibit now at the MIT Museum questioning our modern world (or even read what I wrote on the homepage)? It’s not a “tool” or a “proof of concept”. It is meant to critique data mining, and as such it is meant to include inaccuracies, rely on models with fundamental assumptions baked in by the creator of the algorithms, include only partial kinds of data, be opaque in its final representation and in how it arrived at its conclusions, etc. The viewer is meant to see the computer try to make sense of the data, oscillating back and forth and eventually converging on a box in which to put you, no matter how appropriate or inappropriate, and without any explanation of its rationale. This is the world of data mining, and the “world of tomorrow”, where programmers and mathematicians attempt to create their models of how people interact and what data means independent from the true complexities and subtleties of reality… sometimes with success and sometimes not.

And besides, Steve – do you really think it’s appropriate for a research project done by a student at university (as a side project done in 2 months, no less) to create programs for all languages in the world? Especially something that involves training large-scale language models? Come on. And besides, there were (and are) alt tags and meta data on the index.html for screen reading, and ultimately blind people can’t see a visualization so it’s not a terrible concern. And don’t forget, as art and not a tool, aesthetics matter more than attention paid to the color blind. I’d like to know if you complain about the colors on a Picasso. If you want to talk smack, at least have a valid critique that has some depth. You sound like a generic Media Lab hater: missing the big picture while arguing problems with things that were never the point.


Steve Guengerich
Aug 20, 2009 23:52

Ouch Aaron: I clearly struck a nerve. My “classic Media Lab PoC” comment wasn’t pejorative. Far from being a Media Lab hater, I’m a long-time admirer. See from two years ago: http://www.austinstartup.com/2007/09/mit-media-lab-20/ So, I’d ask you to back off on the “generic” categorizations.

I’ll grant you that my only info about Personas was what I read in Naumi’s post and then a quick run of it after clicking through to the link, so you are correct that I read none of the other contextual background about the history that you provided – thanks for that.

But as to my other comments about the visual elements, I stand by them. Having served as the CEO of a major regional non-profit treating the disabled and having co-founded and seed-funded a non-profit that focused on making technology accessible to people with disabilities, I’ve heard comments like yours before and they are weak.

Many people go blind later in life, due to disease, injury, illness. By using accessible techniques to enable descriptive interpretation of an online work of art, you give someone who once had sight the ability to inwardly visualize it and, thus, call upon a level of appreciation for that beauty or creativity that they would otherwise not have.

You can learn more about just how to do this by reaching out to the Executive Director of Knowbility, Sharron Rush, at srush@knowbility.org. I’m sure she would be happy to help you with some tips on how to make Personas more accessible. Good luck!

Naumi Haque
Aug 21, 2009 0:49

Sorry Aaron, my bad – I should have read more carefully. I used your visualization slightly out of context there. From the Personas main page:

“Personas demonstrates the computer’s uncanny insights and its inadvertent errors, such as the mischaracterizations caused by the inability to separate data from multiple owners of the same name. It is meant for the viewer to reflect on our current and future world, where digital histories are as important if not more important than oral histories, and computational methods of condensing our digital traces are opaque and socially ignorant.”

As digital art, I think it’s very cool (nice job), which is why I originally used it to kick off my post. As philosophy, I also think the point it makes is similar to the point I’m trying to make, which is that the data that exists about us online is in many instances “independent from the true complexities and subtleties of reality.” Data mining/reality mining probably will someday fulfill the promise of the Tireless Machine Era, but as we start to see seeds of this, we will guard our online identities even more closely and manicure them ever more diligently.

But the real kicker is that much of what is uncovered in the Tireless Machine Era will still be unknown to us (and to others as well), but will be collected, aggregated, analyzed, monitored, and tagged by powerful, always-on systems running 24/7 in the background of all Web applications. The temporal aspect of one’s online profile – past actions, posts, connections, and data points – will be visible from the outside, creating trend lines that may not be evident to the individual user. To your point, the data mining will be opaque, yes; but socially ignorant, maybe not.

Doug Cornelius
Aug 21, 2009 9:10

Aaron –

It’s an interesting project. The results I mentioned are exactly what you are trying to show.

I paid no attention to the opening page and missed your point. I now see the new text on the “enter your name” page.

How about something on the results page to emphasize some of the points you are making? “Is this you?” would seem to carry the point.

Aug 31, 2009 5:05

Many people go blind later in life, due to disease, injury, illness. By using accessible techniques to enable descriptive interpretation of an online work of art, you give someone who once had sight the ability to inwardly visualize it and, thus, call upon a level of appreciation for that beauty or creativity that they would otherwise not have.

Aaron Zinman
Aug 31, 2009 15:04

@Naumi – We almost mesh, except that there is a misunderstanding of the term ‘social ignorance’. It is not referring to the lack of data – certainly Personas is attempting to show just that. It is more about wisdom and the capacity to understand social interaction, what it means to be human, and how to interpret the given text. Putting things into categories is a scenario that occurs very frequently in machine learning. But the types of categories are often too rigid and uninformed with regard to the higher-level social processes that are of real value here: the ones where we would read the data and say, ‘oh, I think that person is about X’, or ‘That characteristic is an anomaly because it is not consistent with my informed world view about Y’.

I also updated the text and appended ‘for now’. But clearly anything that humans can disagree on (which is a lot) is only that much harder for a computer. This stuff is hairy and no one is really talking about it in the mainstream world, let alone offering reasonable critique from the technologists’ side. Just read any social network analysis papers, and you’ll see how optimistic, “rational”, and simplistic most papers/authors are.

@zalet & steve – I had alt tags from the beginning, and plain text descriptions on the main page. I wish I had the time to create an alternative version for the blind that worked in sound, but alas I do not. If you wish to have access to the source code to do so yourself, I would be happy to collaborate.



