Category Archives: Network research

Write, that I may find thee

A Google Dance – when Google changes its rankings of web sites – used to happen infrequently enough that each “dance” had a name – Boston, Fritz and Brandy, for instance – but updates now happen more than 500 times per year, with names like Panda #25 and Penguin 2.0 for the more notable ones. (There is even a Google algorithm change “weather report”, since many of the updates now are unnamed and very frequent.) As a consequence, search engine optimization seems to me to be changing – and funnily enough, it is less and less about optimization and more and more about origination and creation.

It turns out that Google is now more and more about original content – that means, for instance, that you can no longer boost your web site simply by using Google Translate to create a French or Korean version of your content. Nor can you create lots of stuff that nobody reads – and by nobody, I mean not just that nobody reads your article, but that the incoming links are from, well, nobodies. According to my sources, Google’s algorithms have now evolved to the point where there are just two main mechanisms for generating the good Google juice (and they are related):

  1. Write something original and good, not seen anywhere else on the web.
  2. Get some incoming links from web sites with good Google-juice, such as the New York Times, Boing Boing, a well-known university or, well, any of the “Big 10” domains (Wikipedia, Amazon, Youtube, Facebook, eBay (2 versions), Yelp, WebMD, Walmart, and Target.)

The importance of the top domains is increasing, as seen by this chart from mozcast.com:

[chart from mozcast.com showing the increasing share of the top domains]

In other words, search engines are moving towards the same strategy for determining what is important as the rest of the world uses: if something garners the attention of the movers and shakers (and, importantly, is not a copy of something else), it must be important and hence worthy of your attention.

For the serious companies (and publishers) out there, this is good news: Write well and interestingly, and you will be rewarded with more readers and more influence. This also means that companies seeking to boost their web presence may be well advised to hire good writers and create good content, rather than resort to all kinds of shady tricks – duplication of content, acquired traffic (including hiring people to search Google and click on your links and ads), and backlinks from serially created WordPress sites.

For writers, this may be good news – perhaps there is a future for good writing and serious journalism after all. The difference is that now you write to be found original by a search engine – and should a more august publication with a human behind it see what you write and publish it, that will just be a nice bonus.

What you can learn from your LinkedIn network

LinkedIn Maps is a fascinating service that lets you map out your contact network. Here is my first-level network, with 848 nodes (click for larger image):

[LinkedIn Map of my first-level contact network]

The colors are added automatically by LinkedIn, presumably based on profile similarity and links to other groups. You have to add the labels yourself – they are reasonably precise, at least for the top five groups (listed according to size and, I presume, interconnectedness).

As can be seen, I am a gatekeeper between a network of consultants and researchers in the States (the orange group) and reasonably plugged into the IT industry, primarily Norwegian (the dark blue). The others are fairly obvious, with the exception of the last category, which happens to be an eclectic group that I interact with quite a lot, but which are hard to categorize, at least from their backgrounds.

Incidentally, the “shared” map, which takes away names, provides more information for analysis. Note the yellow nodes in my green network on the right: These are the people hired by BI to manage or teach in China. They are, not in nationality but in orientation, foreigners in their own organization.

My LinkedIn policy is to accept anyone I know (i.e. have had dealings with and would like in my network), which, naturally, includes a number of students (I will friend any student of my courses as long as I can remember them, though I must admit I am a bit sloppy there.)

What is missing? Two things stand out: I have many contacts in Norwegian media and in the international blogosphere who aren’t here because, well, Norwegian media use Twitter or their own outlets, and bloggers use, well, their blogs. Hence, the commentariat is largely invisible in the LinkedIn world (except for Jill Walker Rettberg, who sicced me onto LinkedIn Maps). Also, a number of personal friends are not here, simply because LinkedIn is a professional network – and as such captures formal relationships, not your daily communications.

Now, what really would make me curious is what this map would look like for my Facebook, Twitter and Gmail accounts – and how they overlap. But the network in itself is interesting – and tells me that increasing the interaction between my USA network and the Norwegian IT industry wouldn’t hurt.

Two books on search and social network analysis

Social Network Analysis for Startups: Finding connections on the social web by Maksim Tsvetovat
My rating: 3 of 5 stars

Concise and well-written (like most O’Reilly stuff) book on basic social network analysis, complete with (Python, Unix-based) code and examples. You can ignore the code samples if you want to just read the book (I was able to replicate some of them using UCINet, a network analysis tool).
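For the curious, here is a minimal sketch of the kind of analysis the book walks through – written against the networkx library rather than the book’s own code, and with an edge list I made up for illustration:

    # Minimal social network analysis sketch using networkx (pip install networkx).
    # The edge list is invented for illustration; the book uses its own data sets.
    import networkx as nx

    edges = [("anne", "bob"), ("anne", "carol"), ("bob", "carol"),
             ("carol", "dave"), ("dave", "erik"), ("dave", "frida")]
    g = nx.Graph(edges)

    # Who sits on the most shortest paths between everyone else?
    for person, score in sorted(nx.betweenness_centrality(g).items(),
                                key=lambda item: item[1], reverse=True):
        print(f"{person}: {score:.2f}")

(If you prefer UCINet, the same centrality measures are available there as well.)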

Liked it. Recommended.

Search Analytics for Your Site: Conversations with Your Customers by Louis Rosenfeld
My rating: 4 of 5 stars

Very straightforward and practically oriented – with lots of good examples. Search log analysis – seeing what customers are looking for and whether or not they find it – is as close to having a real, recorded and analyzable conversation with your customers as you can come, yet very few companies do it. Rosenfeld shows how to do it, and also how to find the low-hanging fruit and how to justify spending resources on it.
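To make this concrete, here is a rough sketch (mine, not Rosenfeld’s) of the simplest useful analysis – counting the queries that return nothing – assuming the search log has been exported as a CSV file with a query and a result count per row:

    # Sketch: find the most frequent queries that return zero results.
    # The file name and the "query"/"results" columns are assumptions for illustration.
    import csv
    from collections import Counter

    zero_hits = Counter()
    with open("searchlog.csv", newline="") as f:
        for row in csv.DictReader(f):
            if int(row["results"]) == 0:
                zero_hits[row["query"].strip().lower()] += 1

    # The top of this list is the low-hanging fruit: things customers ask for,
    # in their own words, and do not find.
    for query, count in zero_hits.most_common(20):
        print(count, query)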

This is not rocket science – I was, quite frankly, astonished at how few companies do this. With more and more traffic coming from search engines, more and more users using search rather than hierarchical navigation, and the invisibility of dissatisfied customers (and the lost opportunities they represent), this should be high on any CIO’s agenda.

Highly recommended.

View all my reviews

Our search-detected personalities

Personas is an interesting project at the Media Lab which takes your (or anyone else’s) name as input and then characterizes your personality based on what it finds about you on the web, generating a graphical representation. This is my result:

[Personas visualization for my name]

…which I found rather disturbing: Fame, sports and religion seem to take way too much space here. The reason, of course, is that my name is rather common in Norway, and, for example, a formerly well-known skier skews the results, even though I seem to be the most web-known person with that name.

Anyway, if you have a rare name, it might be accurate – and if your name is John Smith, you might be left with an average, possibly tilted a bit towards Pocahontas:

[Personas visualization for “John Smith”]

Anyway – try it out. You might be surprised. And please remember – this is an art project, not an accurate representation of anything…

Update September 20: I somehow forgot to point to Naomi Haque’s blog post about Personas, with a discussion of how social networking changes our perception of self.

Messy works magically

Craigslist is a mess that is currently taking the mickey out of eBay and irritating Google, according to a fun article in Wired. I am not surprised. The value of a meeting place is not what happens there, but who is there – and by minimizing controls and keeping most transactions face-to-face, Craigslist is eking out the value from the network with minimal investment and a business model that really isn’t a business model.

As for the messy design, well, it is quick, and you learn very fast where to click to get what you want.

The funny thing is that in Norway, the most popular website by far is vgnett.no, the online version of the biggest tabloid paper – or, rather, an online paper that shares the name, but not much else, with VG, the paper paper. The online version has its own editorial office, and its design is evolved, evolving, and a perennial joke among web designers for its busy organization and ratty typeface. They would love to replace it with something akin to Aftenposten or the New York Times, where order, quality and completeness reign. VGnett would beg to differ – they know the use patterns of their audience and serve them, messy or not.

Network externalities in plain view, in other words.

Are social networks a help or a threat to headhunters?

In a currently hot Youtube video which breathlessly evangelizes the revolutionary nature of social networks, I found this statement: "80% of companies are using LinkedIn as their primary tool to find employees". In the comments this is corrected to "80 percent of companies use or are planning to use social networking to find and attract candidates this year", which sounds rather more believable. Social media is where the young people (and, eventually, us in the middle ages as well) are, so that is where you should look.

At the same time, many of the most prolific users of LinkedIn (and, at least according to this guy, Twitter), both in terms of number of contacts and other activities, are headhunters. It is these people’s business to know many people and be able to find someone who matches a company’s demands.

[diagram: a nine-person network where person A links two otherwise separate groups]

Headhunters are the proverbial networkers – they derive their value from knowing not just many people, but the right people. In particular, headhunters that know people in many places are valuable, because they would then be the only conduit between one group and another. Your network is more valuable the fewer of your contacts are also in contact with each other.

The American sociologist Ronald S. Burt, in his book Structural Holes: The Social Structure of Competition (1992), showed that social capital accrues to those who not only know many people, but have connections across groups. In other words: if everyone were directly linked, you would have a dense network structure. The fact that we aren’t means that there are structural holes – hence the term. In the picture, we see a social network of 9 individuals. Person A derives social capital from being the link between two groups that otherwise are only internally connected. A would be an excellent headhunter here (much as profits can only be generated if you can locate market imperfections).
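For the quantitatively inclined: the networkx library implements Burt’s constraint measure, so the brokerage argument can be checked on a toy network. The nine-person graph below is a stand-in I made up to mirror the picture, not actual data:

    # Two internally dense groups connected only through A, as in the picture.
    # Low constraint = many structural holes around you = more brokerage value.
    import networkx as nx
    from itertools import combinations

    group1, group2 = ["b", "c", "d", "e"], ["f", "g", "h", "i"]
    g = nx.Graph()
    g.add_edges_from(combinations(group1, 2))   # group 1 is fully connected
    g.add_edges_from(combinations(group2, 2))   # group 2 is fully connected
    g.add_edges_from([("a", "b"), ("a", "f")])  # A alone bridges the two groups

    constraint = nx.constraint(g)
    for node in sorted(constraint, key=constraint.get):
        print(node, round(constraint[node], 2))  # A comes out least constrained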

LinkedIn is a social network, indistinguishable from a regular one (i.e., one that is not digitally facilitated) except that you can search across it – directly up to three levels away, indirectly a bit further. Headhunters like it for this reason, and use it extensively in the early phases of locating a candidate. The trouble is, LinkedIn (not to mention the tendency for more and more people to have their CVs online on regular websites) makes searching for candidates easy for everyone else as well. In other words: while searchability is initially helpful, its long-term result may be that headhunters will no longer be necessary.

Search technology – in social networks as well as in general – lowers the transaction cost of finding something. Lower transaction costs favor coordination by markets rather than by hierarchy (or, in this case, network). Hence, the value of having a central position in that network should diminish. On the other hand, search technology (in networks in particular) allows you to extend your network, and hence increase your social capital. Which effect is stronger remains to be seen.

Anyway, this should make for interesting research. Anyone out there in headhunterland interested in talking to me about their use of these tools?

From links to seeds: Edging towards the semantic web

Wolfram Alpha just may take us one step closer to the elusive Semantic Web, by evolving a communication protocol out of its query terms.

(this is very much in ruminating form – comments welcome)

Wolfram Alpha officially launched on May 18 – an exciting new kind of "computational" search engine which, rather than looking up documents where your question has been answered before, actually computes the answer. The difference, as Stephen Wolfram himself has said, is that if you ask what the distance is to the moon, Google and other search engines will find you documents that tell you the average distance, whereas Wolfram Alpha will calculate what the distance is right now, and tell you that, in addition to many other facts (such as the average). Wolfram Alpha does not store answers, but creates them every time. And it primarily answers numerical, computable questions.

The difference between Google (and other search engines) and Wolfram Alpha is not so clear-cut, of course. If you ask Google "17 mpg in liters per 100km" it will calculate the result for you. And you can send Wolfram Alpha non-computational queries such as "Norway" and it will give an informational answer. The difference lies more in what kind of data the two services work against, and how they determine what to show you: Google crawls the web, tracking links and monitoring user responses, in a sense asking every page and every user of their services what they think about all web pages (mostly, of course, we don’t think anything about most of them, but in principle we do.) Wolfram Alpha works against a database of facts with a set of defined computational algorithms – it stores less and derives more. (That being said, they will both answer the question "what is the answer to life, the universe and everything" the same way….)

While the technical differences are important and interesting, the real difference between WA and Google lies in what kind of questions they can answer – to use Clayton Christensen’s concept, the different jobs you would hire them to do. You would hire Google to figure out information, introductions, background and concepts – or to find that email you didn’t bother filing away in the correct folder. You would hire Alpha to answer precise questions and get the facts, rather than what the web collectively has decided the facts are.

The meaning of it all

Now – what will the long-term impact of Alpha be? Google has made us replace categorization with search – we no longer bother filing things away and remembering them, for we can find them with a few half-remembered keywords, relying on sophisticated query front-end processing and the fact that most of our not-that-great minds think depressingly alike. Wolfram Alpha, on the other hand, is quite a different animal. Back in the 80s, I once saw someone exhort their not very digital readers to think of the personal computer as a "friendly assistant who is quite stupid in everything but mathematics."  Wolfram Alpha is quite a bit smarter than that, of course, but the fact is that we now have access to a service which, quite simply, will do the math and look up the facts for us. Our own personal Hermione Granger, as it were.

I think the long-term impact of Wolfram Alpha will be to further something that may not have started with Google, but certainly became apparent with them: the use of search terms (or, if you will, seeds) as references. It is already common, rather than writing out a URL, to help people find something by saying "Google this and you will find it". I have a couple of blogs and a web page, but googling my name will get you there faster (and you can misspell my last name and still not miss.) The risk in doing that, of course, is that something can intervene. As I read (in this paper), General Motors a few years ago had an ad for a new Pontiac model, at the end of which they exhorted the audience to "Google Pontiac" to find out more. Mazda quickly set up a web page with Pontiac in it, bought some keywords on Google, and quite effectively shanghaied GM’s ad.

Wolfram Alpha, on the other hand, will, given the same input, return the same answer every time. If the answer should change, it is because the underlying data has changed (or, extremely rarely, because somebody figured out a new way of calculating it.) It would not be because someone external to the company has figured out a way to game the system. This means that we can use references to Wolfram Alpha as shorthand – enter "budget surplus" in Wolfram Alpha, and the result will stare you in the face. In the sense that math is a terse and precise language for expressing certain concepts, Wolfram Alpha seeds will, I think, emerge as a notation for referring to factual information.
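A sketch of what using seeds as references could look like in practice – the /input/?i= URL pattern is my assumption about how the site accepts query strings, and may well change:

    # Sketch: turn a Wolfram Alpha "seed" into a link you can cite or embed.
    # The URL pattern is an assumption, not documented API usage.
    from urllib.parse import quote_plus

    def alpha_link(seed: str) -> str:
        return "http://www.wolframalpha.com/input/?i=" + quote_plus(seed)

    print(alpha_link("budget surplus"))
    print(alpha_link("distance to the moon"))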

A short detour into graffiti

Back in the early-to-mid-90s, Apple launched one of the first pen-based PDAs, the Apple Newton. The Newton was, for its time, an amazing technology, but for once Apple screwed it up, largely because they tried to make the device do too much. One important issue was the handwriting recognition software – it would let you write in your own handwriting, and then try to interpret it. I am a physician’s son, and I certainly took after my father in the handwriting department. Newton could not make sense of my scribbles, even if I tried to behave, and, given that handwriting recognition is hard, it took a long time doing it. I bought one, and then sent it back. Then the Palm Pilot came, and became the device to get.

The Palm Pilot did not recognize handwriting – it demanded that you, the user, write to it in a sign language called Graffiti, which recognized individual characters. Most of the characters resembled the regular characters enough that you could guess what they were; for the others you either had to consult a small plastic card or experiment. The feedback was rapid, so experimenting usually worked well, and pretty soon you had learned – or, rather, your hand had learned – to enter the Graffiti characters rapidly and accurately.

Wolfram Alpha works in the same way as Graffiti did: As Stephen Wolfram says in his talk at the Berkman Center, people start out writing natural language but pretty quickly trim it down to just the key concepts (a process known in search technology circles as "anti-phrasing".) In other words, by dint of patience and experimentation, we (or, at least, some of us) will learn to write queries in a notation that Wolfram Alpha understands, much like our hands learned Graffiti.

From links to seeds to semantics

Semantics is really about symbols and shorthand – a word is created as shorthand for a more complicated concept by a process of internalization. When learning a language, rapid feedback helps (which is why I think it is easier to learn a language with a strict and terse grammar rather than a permissive one), simplicity helps, and a structure and culture that allows for creating new words by relying on shared context and intuitive combinations (see this great video with Stephen Fry and Jonathan Ross on language creation for some great examples.)

And this is what we need to do – gather around Wolfram Alpha and figure out the best way of interacting with the system – and then conduct "what if" analyses of what happens if we change the input just a little. To a certain extent, it is happening already, starting with people finding Easter eggs – little jokes developers leave in programs for users to find. Pretty soon we will start figuring out the notation, and you will see web pages use Wolfram Alpha queries first as references, then as modules, then as dynamic elements.

It is sort of quirky when humans start to exchange query seeds (or search terms, if you will).  It gets downright interesting when computers start doing it. It would also be part of an ongoing evolution of gradually increasing meaningfulness of computer messaging.

When computers – or, if you will, programs – needed to exchange information in the early days, they did it in a machine-efficient manner – information was passed using shared memory addresses, hexadecimal codes, assembler instructions and other terse and efficient, but humanly unreadable, encoding schemes. Sometime in the early 80s, computers were getting powerful enough that the exchanges gradually could be done in human-readable format – the SMTP protocol, for instance, a standard for exchanging email, could be read and even hand-built by humans (as I remember doing in 1985, to send email outside the company network I was on.) The world wide web, conceived in the early 90s and live to a wider audience in 1994, had at its core an addressing system – the URL – which could be used as a general way of conversing between computers, no matter what their operating systems or languages. (To the technology purists out there: yes, the WWW relies on a whole slew of other standards as well, but I am trying to make a point here.) It was rather inefficient from a machine communication perspective, but very flexible and easy to understand for developers and users alike. Over time, it has been refined from pure exchange of information to the sophisticated exchanges needed to make sure it really is you when you log into your online bank – essentially by increasing the sophistication of the HTML markup language towards standards such as XML, where you can send over not just instructions and data but also definitions and metadata.
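To show just how human-readable that kind of exchange was: here is a sketch of driving a raw SMTP conversation from Python over a plain socket. The server name and addresses are placeholders, and a modern mail server would demand authentication and encryption on top of this:

    # Sketch: speaking SMTP "by hand" to show how readable the protocol is.
    # Host and addresses are made-up placeholders, not a real server.
    import socket

    commands = [
        b"HELO example.org\r\n",
        b"MAIL FROM:<self@example.org>\r\n",
        b"RCPT TO:<someone@example.com>\r\n",
        b"DATA\r\n",
        b"Subject: Hello from 1985\r\n\r\nPlain, readable text.\r\n.\r\n",
        b"QUIT\r\n",
    ]

    with socket.create_connection(("mail.example.com", 25)) as s:
        print(s.recv(1024).decode())      # server greeting, e.g. "220 ..."
        for cmd in commands:
            s.sendall(cmd)
            print(s.recv(1024).decode())  # numeric reply codes plus readable text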

The much-discussed semantic web is the natural continuation of this evolution – programming further and further away from the metal, if you will. Human requests for information from each other are imprecise but rely on a shared understanding of what is going on, an ability to interpret results in context, and a willingness to use many clues and requests for clarification to arrive at a desired result. Observe two humans interacting over the telephone – they can have deep and rich discussions, but as soon as the conversation involves computers, they default to slow and simple communication protocols: spelling words out (sometimes using the international phonetic alphabet), going back and forth about where to apply mouse clicks and keystrokes, double-checking to avoid mistakes. We just aren’t as good at communicating as computers are – but can the computers eventually get good enough to communicate with us?

I think the solution lies in mutual adaptation, and the exchange of references to data and information in other terms than direct document addresses may just be the key to achieving that. Increases in performance and functionality of computers have always progressed in a punctuated equilibrium fashion, alternating between integrated and modular architectures. The first mainframes were integrated with simple terminal interfaces, which gave way to client-server architectures (exchanging SQL requests), which gave way to highly modular TCP/IP-based architectures (exchanging URLs), which may give way to mainframe-like semi-integrated data centers. I think those data centers will exchange information at a higher semantic level than any of the others – and Wolfram Alpha, with its terse but precise query structure may just be the way to get there.

The perils of openness

Mary Beard has a really interesting perspective on the consequences of openness: Transparency is the new opacity. In the absence of confidential channels (which, given today’s storage and search capabilities, you have no guarantee will remain confidential) very little actual information gets transmitted in student appraisals.

And the only difference between job appraisals and student appraisals, I assume, lies in vocabulary. As a technologist, I can envision all kinds of technical fixes to this, assuming that those in charge of the specifications acknowledge that they are necessary: fields for comments hidden from the subject, fields that expire a certain time after being read, filters to search engines that handle confidentiality – including hiding the fact that there is a confidential comment in the first place (which turns out to be surprisingly hard to do.)

But the more natural fix is the quick conversation in the pub, the hallway, or on the private cell phone – impervious to search, storage and documentation – where the real information can be exchanged. The electronic equivalent? Encrypted Twitter, perhaps, if such a thing exists.

What we need is online coffee shops, offering the same discreet, transient and history-less marketplace for information. Now I spend time on the phone with my colleagues for that, but that doesn’t work well across time zones. So – what would it look like and how to build it?

PS: Come to think of it, Skype is encrypted, at least the phone calls.

SIM card as platform

I am at the Open Nordic Conference in Skien (about two hours south-west of Oslo), listening to Lars Ingvald Hoff from Telenor R&D talking to a bunch of developers about the new, platform-like SIM cards coming out.

The new SIM cards have plenty of memory ("gigabytes"), a USB interface (meaning you can get data off the SIM card really fast), and virtual machines (or at least closed-off virtual memory areas, called SSDs). The telecom operator has control of the card; application developers can install SSDs (whatever they are) that run in a sandbox. One business model may be that operators will charge rent for space on the SIM. Seems like a pretty full architecture to me. There is translation from HTML to APDU (the command language for the phone) in a web server on the card, so in principle you could move your cell phone onto the net. There is also a "Java Card", where you can to some extent have interoperable applications running across manufacturers – a secure and certified environment, not the full Java stack, but a pretty good selection. Standards-based, not operator-specific.
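For the non-card people: an APDU is just a short, fixed-format byte string. A rough sketch of building the standard SELECT command – the structure follows ISO 7816-4 as far as I know, and the application identifier is made up:

    # Sketch: a command APDU is [CLA, INS, P1, P2, Lc, data...] per ISO 7816-4.
    # The AID below is a made-up placeholder, not a real application identifier.
    def select_apdu(aid: bytes) -> bytes:
        header = bytes([0x00, 0xA4, 0x04, 0x00])  # CLA=00, INS=A4 (SELECT), P1=04 (by name)
        return header + bytes([len(aid)]) + aid

    print(select_apdu(bytes.fromhex("A000000001")).hex())  # -> 00a4040005a000000001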

NFC: A new short-range communications protocol, which can be used to access payment terminals and similar secure devices.

Apps can be downloaded and installed via a variety of protocols (among them BIP, the Bearer Independent Protocol) directly to the SIM card.

In other words, mobile phones are going to open up to a much larger extent. I predict that the SIM card over time will become you – an identification and payment device.

Future SIM cards will get IP stacks, threads and a full Java virtual machine, and will look more and more like servers.

Open Mobile conference musings

Tomorrow I am giving a talk on disruptive technologies at the Open Nordic Conference, and how that theory applies to open standards and open source in the mobile technology industry. The audience is apparently very technical and I, quite frankly, do not think that open source plays that much of a role – apart from providing available functionality for innovators (mostly at the user interface/user service level) to build on.

The challenge in mobile technology (and in any consumer technology whose aim is to facilitate interaction) lies in establishing a platform for users and business to build on. Right now I am listening to Nick Vitalari analyze platform establishment and growth as part of the nGenera project PBG: Building a platform for business growth.

I am thinking about how platforms get established – and playing with words. It seems to me that the process can be described in terms of four words:

  • Problem (often personal): Somebody has an itch to scratch, something that can be fixed with software, so they do it. (This is what Eric Raymond considers to be the beginning of almost any open source project.)
  • Product (or service): The solution to the problem gets productized, either in a closed or open fashion, using standard or collaborative programming and development processes.
  • Platform: The solution expands both in scale (distribution) and scope (technologies it can run on, added services, links to other solutions) until it becomes less a solution in itself and more a foundation for others to build on – customers and users get it less for itself than for the added functionality it provides.
  • Protocol: The platform becomes so open and ubiquitous that it is available everywhere, fading into the background in terms of user awareness. This can happen in many ways – it can expand to become all-encompassing (Google, for instance, maybe Facebook in certain communities, email certainly); it can be modularized with tools that pulverize the proprietary value proposition (emulation, multiple clients (like Trillian in the chat space), cross-licensing); it can be regulated into a standard (AT&T with telephones, for instance); or it can be subsumed into an underlying functional layer (Microsoft’s embrace-and-extend strategy).

In the end, it will be forced into some form of openness.

Half-baked so far, but it’s a start.

The last days of eBay

Interesting article in the London Review of Books by Thomas Jones. I always thought eBay’s competitive advantage (aside from the obvious network effects) lay in its payment system (i.e., PayPal). But proprietary platforms will over time be out-competed by open and modular ones – about time that selling something went from platform to protocol.


How free is the Internet?

Semi-liveblogged notes from a seminar at the Nobel Peace Center, Oslo, arranged by the  Norwegian Board of Technology. I ran out of battery towards the end, and had to leave before the final session. (On the plus side, the Nobel center has free and available wifi, which I deem a Very Good Thing indeed):

Introduction: Bente Erichsen, head of the Nobel Peace Center: Parvin Ardalan, one of the founders of the One Million Signatures initiative to protest discrimination against women, could not come as her passport has been confiscated by the Iranian government and she is not allowed to leave Iran. Ingvild Myhre, Chairman Norwegian Board of Technology: Increase in state-sponsored censorship on the Internet.

Jonathan Zittrain, professor of Internet Law at Oxford and founder of the Berkman Center:

Filtering the Internet is hard compared to most other networks, because of "best-effort" routing, otherwise known as "send-and-pray". It is impossible to filter in the cloud, but at the point of the ISP you can filter. Examples include geographical filtering (movie releases, newspaper articles in the US about British law cases, Google.de removing neo-nazi material from the index, videos about various things at Google Video made unavailable by the uploaders (a check-box solution)). In China, Google states that due to local law, some search results are withheld. ChillingEffects.com now gets the letters that Google receives with take-down notices. Microsoft implemented filtering of their MSN blogging system to satisfy the authorities (though it leaks like a sieve). This "check-box" form of filtering at the source is likely to increase. Nor does it need to be measurable on the net itself: in Singapore, your expressions can cause you to lose your house to a defamation suit.

It is much harder to measure surveillance than blockages. China has experimented with various measures. For a while, Google.com was redirected to a Chinese university search engine. Blocking access to content is a "parking ticket" offense. Various sites are blocked (drugs, pornography, religion, some political issues.) Saudi Arabia has a pretty clear filtering policy and is quite open about it, without much fervor.

Filtering at the device. Access is shifting from PC to cell phone and other locked devices, and many of these new endpoints are controlled by vendors and thus open to pressure.

Many technology companies are on the horns of a dilemma here – witness Google’s dilemma going into China. The Sullivan principles offer a middle way (they started out with apartheid in South Africa), now written into American law (at precisely the time Sullivan repudiated them.) Are there ways to work with the government to concede to some of the restrictions while doing the ethical thing?

Many other services: Livecastr allows direct filming from a cell phone; LiveLeaks, WikiLeaks, psiphon – allowing people to see the Internet the way you see it. Automatic translation is now at the point where it allows chatting between two speakers of different languages.

Jimbo Wales: Can Wikipedia promote free speech?

Wikipedia is a freely licensed encyclopedia written by thousands of volunteers in many languages. Now the 9th most popular website on the web; 12th most popular in Iran. How global? It follows Internet penetration, basically – large in English, only 15,000 articles in Hindi despite 280 million speakers of Hindi.

Wikipedia in China: First blocked June 2-21, 2004, then September 23-27, 2004, then from October 19, 2005 until now. Lately, the BBC and Wikipedia in English have been unblocked, unclear why – probably the Olympics. Wikipedia in Chinese has more than 170,000 articles, the 12th largest of all Wikipedias. There are more Chinese speakers outside of China than there are Dutch people anywhere. It is a mistake to think of this as written outside China – the firewall is porous, and of the 87 administrators, 29 are from mainland China.

Censorship in China is discreet and done at an industrial level; the aim is not individuals. Most youngsters know how to get to Wikipedia. If you set up a mirror you will be shut down, but the Chinese authorities have avoided having sad stories about people being arrested for reading Wikipedia.

Core point: Wikipedia is free access. You can copy, modify, redistribute, redistribute modified versions, and you can do this commercially or non-commercially. Baidu redistributes Wikipedia (except the pages they censor) in China (though they put "all rights reserved" on it).

Quality? The German Wikipedia was compared to Brockhaus, and in 43 out of 50 articles, Wikipedia was the winner. Not an archive, not a dump, not a textbook. Not a place to testify about human rights abuses, but the place to document human rights abuses in a neutral way. Wikipedia wants to be an encyclopedia, and access to knowledge should not be censored; therefore Wikipedia does not take the middle ground and refuses all kinds of censorship. Jim thinks Google makes a huge mistake, but theirs is a considered decision and they are sincerely trying. As customers, we should put pressure on Google. Force Google to tell us what they are doing in China to change the policies they now have to abide by.

Every single person on the planet? Wikipedia is available in many languages, but many of them do not have many articles. He showed a video of Desanjo, the father of the Swahili Wikipedia, who wrote day and night and recruited people – now 7,000 articles. They have now started the Wikipedia Academy in Africa, and will start many of them.

How do you design a space where people can engage in conversations? Make it open – like a restaurant that people want to be in.

Discussion: 

(I didn’t catch all of this discussion, partially because I participated in it. Notes a bit jumbled, will edit later.)

How powerful is Wikipedia? JW: More powerful than we like, especially a problem with bios of living people. We have the flag "The neutrality of this article is disputed", which I wish some newspapers would adopt.

Can you have a neutral point of view on human rights? JW: You can represent something in a neutral way, representing the different views. For instance, you can be neutral on abortion, saying that according to the Catholic church, this is a sin.

Things going in the right direction? Zittrain: Hard to say, social innovations such as Wikipedia tend to overcome attempts at censorship?

(My question, which was only partially answered.) What are the power implications over time for Google and Wikipedia? Both are on the ascendant now, profitable and popular, but does there need to be a different contribution model for a more stable Wikipedia, and what happens when Google no longer is running at a huge profit?

Mark Kriger: What worries you about the Internet five years out, at the edge of chaos? Zittrain: At the edge of chaos is suburbia: The tame, controlled online lives where things are OK, there is no reason that one bad apple can spoil everything. Jim Wales is now working on Wikisearch, more transparent about the search ranking. You don’t have a lot of investment in your use of Google, it is easy to switch, but that is not the case with many of the other services that are out there. Some regulatory interventions would be good about giving people the right to leave and easily take their information with them.

Citing Elie Wiesel: The opposite of good is not evil but indifference. Do not see the Internet as a shopping mall; keep it moving.

Part II: Censorship on the Net

In the absence of Parvin Ardalan, a movie from Iran about the million signatures movement was shown. It calls for equal rights for women in terms of judicial protection, divorce, inheritance and so on. A number of women have been arrested for collecting signatures. Parvin Ardalan was one of the organizers of this movement, and she has been arrested for this and has received a 2 year suspended prison sentence. She could not come, but the actor Camilla Belsvik delivered the speech for her:

  • The Internet is censored in Iran, but remains the most active medium for discussion of women’s issues. It has given women power, which has upset the power balance in families and between wives and husbands, and given them a means of entering the public sphere.
  • On the Internet, women can connect and find a place for expression about their private lives. For young women using blogs, this has been especially important. They can talk about their romantic and family relationships, power structures, violence and sexuality. This was a revolutionary development for them.
  • Some women have attained public identities even though they write anonymously.
  • The Internet came to Iran during the reconstruction era in the 1990s and became more available during the reform years starting in 1997. Women’s activism has been there, but in small groups. The reform period allowed more freedom of expression, but press permissions for women were few, especially for secular women. The reform period ended, and many publications were shut down. Many publications then turned to the Internet, as did NGOs where women were active.
  • Issues of feminism and sexuality are taken more seriously online. Gradually, filtering and blocking have become more severe. In 2004, the Ministry of Information Technology ordered the words "women" and "gender" to be filtered, with the excuse of blocking pornography.
  • A large problem is self-censorship on politically and culturally sensitive issues. Women’s rights is politically as well as culturally sensitive.
  • There is a lack of laws, meaning that much of the censorship is arbitrary and haphazard. It is normally left to the judge to decide, since there are no clear laws on what is permitted and what is not.
  • The One Million Signatures campaign was launched in August 2006. It aims to collect one million signatures on a petition to the Iranian government asking for equal rights for women in Iran. It has done much to focus the efforts on women’s rights in Iran.
  • The changeforequality web site has been blocked more than ten times, but each time a new domain name is registered and it continues publishing. Four of the activists have been arrested, but the struggle will continue. The action can serve as a model for movements in repressed societies everywhere.

Zittrain: Comments on censorship in Iran (discussion with Helge Tennøe).

Pervasive censorship in Iran: web sites have to be licensed, and many topics are not allowed, such as atheism. ISPs can be held responsible for criminal content – very precise censorship, with the ISP responsible. The government is not monolithic, there are struggles inside the government: first they were excited about broadband, then you needed a license to have anything faster than 128 Kbps.

Why do they have the Internet in Iran at all? Very few states explicitly reject modernity – Cuba and North Korea are some of the very few. Most states want the economic effects of the Internet. It is rather haphazardly enforced, though. Iran filters more stuff than China, but China tries harder to filter the relatively few things they filter.

The US government has actually contracted with Anonymizer to provide circumvention software for Iranians, and for Iranians only. It is rather primitive, and filtered, of all things, for pornography (the stop word "ass" means that usembassy.state.gov was filtered).

Radio Tibet – a radio in exile

Øystein Alme started broadcasting in 1996; the Chinese have been jamming it. Still, the program is getting into Tibet. Øystein got involved as a backpacker many years ago, came back home and started reading up on Tibet, and started Voice of Tibet. It now has fifteen employees, one in Norway, the rest in Pakistan and India. The main channel into Tibet is shortwave radio; in China it is the Internet. They have spent a lot of time studying how to avoid Chinese jamming of frequencies that are reserved for Voice of Tibet.

China is a repressive state, where the party dominates despite having only 6% of the population as members. (If you strip off those who are members because they need the membership to get a promotion in their job, not many remain.) China has signed up to the articles on human rights, but breaks its promises with impunity.

Internet use in China is growing dramatically. China’s Internet police number 50,000, with censoring made possible by foreign technology companies such as Google. One journalist, Shi Tao, got ten years for an article criticizing the government – and he was found thanks to information provided by Yahoo.

But the Internet is also the hope for change – with it we would not have the images from Tibet, for instance.

Discussion: Zittrain, Alme

Alme: Companies such as Yahoo, Google, Microsoft and others should join forces and together resist the policies of the government.

The Chinese government also uses the Internet proactively, to push its point of view.

Zittrain: These companies could also offer business reasons for privacy, for instance offering encrypted accounts for business conversations.

Movie from Lebanon: a recording studio with bombs going off outside. During the Israeli siege of Lebanon, the country was hit by 15,000 missiles – a country of 4 million people under siege that we hear very little about. Zena el Khalil is an artist currently based in Beirut. Her blog from Beirut during the siege of Lebanon in 2006 was followed by a number of people as well as newspapers, who found it a valuable addition to official sources.

She talked about how her blog and others both changed the world’s perspective on the war and documented it: Lebanon is lacking in history since so much of it is rewritten by the warring parties. She also documented how Israeli attacks on a power plant created an ecological disaster, as oil spread as far north as Syria and even Turkey.

Searching and finding – hard to get into

I am currently reading two books on what can only be described as Web 2.0: John Battelle’s The Search and Peter Morville’s Ambient Findability. I don’t know why (maybe just my own overdosing on reading after starting my sabbatical), but I am finding both hard to get into.

The Search is better written – it is a mix of a corporate biography and a discussion of how search capability changes society. The language is tight – though sometimes cute, as in the phrase "the database of intentions" about Google clickstreams and archived query terms – and there is a thread (roughly chronological) through the book that allows most people who have been online for a while to nod and agree on almost any page. John Battelle has an excellent blog and plenty of scars from the dot-com boom and bust (I always liked the Industry Standard and wrote a column for the Norwegian version, Business Standard, for a few years, so I am very favorably disposed), and his competence as a writer shows. The book reads like a long Wired report, but better structured, marginally below average in use of buzzwords, and John has the right industry connections to pull it off.

Ambient findability front coverAmbient Findability looks at search from the other side of the coin – how do you make yourself findable in a world where search, rather than categorization, is the preferred user interface? For one thing, you have to make your whole web site findable, make it accessible and meaningful from all entry points. Morville fills the book up with drawings and pictures on almost every page, comes off as a widely read person, but I am still looking for a thorough expansion of the central message – or at least some  decent and deep speculation on personal and organizational consequences. It is more a book popularizing information science than a book that wants to tell a story, and it shows.

While both books are well worth the read if you are relatively new to the Internet, I was a little disappointed in the lack of new ideas – they are clever, but once you accept that the marginal cost of processing, storage and communications bandwidth approaches zero, the conclusions kind of give themselves. Perhaps I am tired – actually, I am – perhaps I am unfairly critical after having treated myself to The Blank Slate, The World is Flat and Collapse, but these books, while both worthwhile, have failed to "wow" me.

Apologies. I will make a more determined re-entry once I wake up.

Linked out

I am a member of LinkedIn, a networking site with a distinctive business flair, as opposed to Friendster and Nokut/kornut/Orkut, which are more oriented towards socializing and dating.
LinkedIn is interesting because you discover people you once knew (former students/fellow students/colleagues/clients/acquaintances), and it enables you to find people you need to talk to. But there is a bit of a disconnect – getting invitations from people you have never met and have no idea who are, but who want to link up with you.
I am not sure how to deal with that – my instinct is not to let anyone into my LinkedIn network that I don’t know. I currently have 191 connections, which is semi-high (at least in Norway). Since I have lived and worked abroad, have had a lot of students over the years, and live in the field of IT and consulting (which is full of people who use computers and who connect), it is probably normal. I have not resorted to spinning through my email addresses to bulk-email people for connections, but I have searched former and current places of work.
The interesting thing, of course, is that LinkedIn really isn’t a social network, but a network of potential business and professional contacts. I still think you should know the people you connect to, at least know who they are and when you met them, but the threshold is much lower than for a normal social relationship.
This article by Lance Ulanoff in PC Magazine expresses his frustration with LinkedIn – but this very long comment in the discussion really puts it right, methinks. LinkedIn is a tool for connection junkies. And that is all it is. It may be useful, it may be useless, but if you want to find people or be found by them, it is one tool among many.
Just as long as you are allowed to decline invitations and reject forwarding of really opportunistic messages to someone who wouldn’t read them if it wasn’t for them coming through you…..

Networks in our midst

Enronic gets my vote as the best social network analysis tool ever (both system and instance). This should have great potential as a tool for consultants and researchers, provided you can deal with the privacy issues. I especially like the idea of introducing animation – imagine introducing changes into the organization, and then watching the emails fly in real time…
(Via BoingBoing)

Sasson for the defense

Today I attended Amir Sasson‘s doctoral defence, where he defended his thesis “On Affiliation and Mediation: A Study of Information Mediated Network Effects in The Banking Industry.” Amir has studied to what extent companies and banks have economic gain from being well connected, in essence: Do well-connected companies have higher survival rates, do they pay less interest on their loans, and do they have higher credit availability than others? And vice versa – do banks do better by targeting customers that are connected to each other? Amir’s conclusion is that both companies and their banks benefit from being interconnected – and that banks can provide value for their clients by increasing the number of ties between their customers.

His supervisor has been Øystein Fjeldstad, chairman of the committee has been Henrich Greve, and the opponents have been Brian Uzzi of the Kellogg School of Management, Northwestern University and Kent Eriksson, KTH Royal Institute of Technology.

The work received excellent marks – Amir has done a tremendous job in creating the data set and developing a way to analyse network structures. Most importantly, as Brian Uzzi said, the work has a high potential for generalization to other industries and, indeed to other literatures – the hallmark of an excellent dissertation.

The questions given by the opponents in a doctoral defense tend to be more difficult the better the dissertation is – and the questions from the two opponents were detailed and hard-hitting. Amir sailed through with an understanding of theory, conceptualization and method that sets a standard that will be hard to follow for other doctoral candidates at BI. Congratulations are in order both to Amir and his advisors – this is an unusually well designed and executed thesis.