Business - Written by Denis Hancock on Tuesday, October 2, 2007 21:38 - 1 Comment

Mining the blogosphere for varieties of self-expression

Different groups use language differently to get their ideas across: one only has to flip back and forth between Jerry Springer and an election debate to realize this. OK, this may be a particularly bad example of the point I’m trying to make, but hopefully you get the idea: there is a lot to be learned from how different groups communicate among themselves and with others.

Unfortunately, trying to figure out these differences has long faced a constraint common to much academic research: the time and expense required to collect and annotate data. But what if, somehow, there were suddenly millions upon millions of easily accessible communications, from all kinds of people all over the world, unedited and in very natural language, right at researchers’ fingertips?

Well, of course there is something just like that: the blogosphere. A group of professors (mostly computer scientists, with a Chair of Psychology thrown in for good measure) has put it to use in this paper, Mining the Blogosphere: Age, gender and the varieties of self-expression. Sample size for their research? To quote:

Our corpus comprises over 140 million words of naturally occurring text from randomly selected blogs by men and women from their teens into their forties.

It’s almost absurd to think how big that sample is compared with those of similar studies from the past. They must have felt like Dr. Evil sitting in their research lair getting ready to mine the data… if a trillion dollars were suddenly feasible in Dr. Evil’s quest for world domination… and Austin Powers had something to do with personal pronouns and conjunctions… and the researchers in this case had a lair of some sort. Since that comparison really makes little sense, let’s get right to some of their findings:

older bloggers tend to write about externally-focused topics, while younger bloggers tend to write about more personally-focused topics; changes in writing style with age are closely related. (translation: older folks are interested in what’s going on out there and the younger folks are me, me, me)

the linguistic factors that increase in use with age are just those used more by males of any age, and conversely, those that decrease in use with age are those used more by females of any age. (translation: I could get myself into trouble with this one so I’ll leave it as an exercise)

There are a variety of other findings, and if you are interested in this sort of thing the paper is well worth the read (noting that if you think men use auxiliary verbs more than women, you are sadly mistaken). Personally, I’m more interested in what this blogosphere sample of millions could mean for academic research: what else could we learn?



1 Comment

Naumi Haque
Oct 3, 2007 10:40

Hmm… I’m pretty sure I read something recently about the US government using similar technology to monitor Web sites and blogs for terrorist activity. Apparently terrorists have certain language indicators as well.
