Category Archives: Search

Google edging closer to being "the new Microsoft"

A few years ago, I wrote an essay about how Microsoft had become the new IBM – i.e., the dominant, love-to-hate company of the computer industry. In this interesting article, John Lanchester discusses how Google now is stepping into that role, with its aggressive moves into making the world searchable, and a lot more than you would like findable. Interesting point:

[…] as Google makes clear, nothing short of a court order is going to stop it digitising every book in print. Google doesn’t accept that that constitutes a violation of copyright. But the company won’t even discuss the physical process by which it scans the books: a classic example of how very free it is with other people’s intellectual property, while being highly protective of its own.

This issue, in all its various forms, isn’t going to go away. Book Search, Street View and many of Google’s other offerings simply bulldoze existing ideas of how things are and how they should be done. I was highly critical of Gmail when it first came in, on the grounds that the superbly effective mail system came at the unacceptable price of allowing Google to scan all emails and place text ads. But I soon began using it, because it was free, and because it’s such good software, and because I frankly never noticed the ads.

He goes on to show how a hard disk crash and a botched backup restore left him without his documents, until it dawned on him that, yes, Gmail had them all, ready for download. So big brothers can be nice, but they are still Big Brothers…

Shirky on newspapers

Clay Shirky, the foremost essayist on the Internet and its boisterous intrusion into everything, has done it again: Written an essay that brings a new and fresh perspective to something already thoroughly discussed. This time, it is on the demise of newspapers – the short message is that this is a revolution, and saving newspapers just isn’t going to happen, because this is, well, a revolution:

[..]I remember Thompson [in 1993] saying something to the effect of “When a 14 year old kid can blow up your business in his spare time, not because he hates you but because he loves you, then you got a problem.” I think about that conversation a lot these days.


Revolutions create a curious inversion of perception. In ordinary times, people who do no more than describe the world around them are seen as pragmatists, while those who imagine fabulous alternative futures are viewed as radicals. The last couple of decades haven’t been ordinary, however. Inside the papers, the pragmatists were the ones simply looking out the window and noticing that the real world was increasingly resembling the unthinkable scenario. These people were treated as if they were barking mad. Meanwhile the people spinning visions of popular walled gardens and enthusiastic micropayment adoption, visions unsupported by reality, were regarded not as charlatans but saviors.


That is what real revolutions are like. The old stuff gets broken faster than the new stuff is put in its place. The importance of any given experiment isn’t apparent at the moment it appears; big changes stall, small changes spread. Even the revolutionaries can’t predict what will happen. Agreements on all sides that core institutions must be protected are rendered meaningless by the very people doing the agreeing. (Luther and the Church both insisted, for years, that whatever else happened, no one was talking about a schism.) Ancient social bargains, once disrupted, can neither be mended nor quickly replaced, since any such bargain takes decades to solidify.

And so it is today. When someone demands to know how we are going to replace newspapers, they are really demanding to be told that we are not living through a revolution. They are demanding to be told that old systems won’t break before new systems are in place. They are demanding to be told that ancient social bargains aren’t in peril, that core institutions will be spared, that new methods of spreading information will improve previous practice rather than upending it. They are demanding to be lied to.

That simple. He draws the line back to the Gutenberg printing press and the enormous transition that caused – much more chaotic than you would think with 500 years of hindsight.

Highly recommended. And another piece of reading for my suffering students….

Interesting search:

This site is a federated search engine for classified ads – it does not (at least as far as I know) have its own ads, but acts as a portal to other ad sites, presumably in return for a share of profits.

The value created is partly from the interface technology (enter "Mercedes 450 SEL 6.9" and it knows you are looking for a car and formats the page so that you can drill down on models and years) and partly from its access to all kinds of local and community-based listings.

When I was looking for my used Mercedes I searched the sites of large advertisers – but they only show their own ads. A federated search would have found me more cars (though not any I would have bought rather than the car I did get). Useful because many markets are local and therefore hidden if you come from outside.
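To make the federated idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the three "sites" are in-memory lists with invented listings, and `search_site` and `federated_search` are hypothetical names, not anything taken from the actual service. The portal holds no ads of its own; it fans the query out to every source, tags each hit with where it came from, and merges the hits into one list.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical ad sites with made-up listings; a real portal would call
# each site's own search interface instead of reading from a dict.
SITES = {
    "site_a": [{"title": "Mercedes 450 SEL 6.9, 1978", "price": 21000},
               {"title": "Volvo 240, 1990", "price": 3000}],
    "site_b": [{"title": "Mercedes 450 SEL 6.9, 1977", "price": 18500}],
    "site_c": [{"title": "Saab 900 Turbo, 1988", "price": 4500}],
}

def search_site(site, query):
    """Return the listings on one site whose title contains every query term."""
    terms = query.lower().split()
    return [dict(ad, source=site) for ad in SITES[site]
            if all(term in ad["title"].lower() for term in terms)]

def federated_search(query):
    """Send the query to all sites in parallel and merge the hits, cheapest first."""
    with ThreadPoolExecutor() as pool:
        per_site = list(pool.map(lambda site: search_site(site, query), SITES))
    merged = [ad for hits in per_site for ad in hits]
    return sorted(merged, key=lambda ad: ad["price"])

results = federated_search("Mercedes 450 SEL 6.9")
for ad in results:
    print(ad["source"], ad["title"], ad["price"])
```

Searching any single site here finds at most one of the two cars, which is the whole argument for the portal. The sort on price is just one arbitrary way of merging; a real service would also have to deal with duplicates and differing listing formats.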

Interesting search

Since I am doing research on search, I thought I would create a list of interesting search-based web sites here, with individual blog entries describing each site and why they are interesting. Here is a starting list, which, of course, will be added to as I discover more interesting sites.


  • – visual search interface reminiscent of iPod Touch album covers (or, rather, the other way around)
  • New York Times – search-based editorial pages (topic pages) (conversational interface)
  • Times of London – search-based editorial pages (topic pages) defined by user (conversational interface)
  • Yahoo Mindset – intent-driven (or rather, intent-revealing) interface for product search. This is no longer available, but this blogpost has an explanation and a graphic of the "intent slider".

Federated search

  • Oodle – federated search for classified ads
  • Globrix – federated search for real estate in UK

Rich media search

  • SnapTell – instant product identification from mobile photo
  • TinEye – image-matching search (great service, but unfortunately the index is rather small)
  • Shazam – music-matching search for mobile phones (not quite query by humming, but close…) See article in CACM.


  • Indian search engines: (local search)
  • Chinese search engines: Baidu (a serious competitor to Google)
  • search engine: Specializing in Norwegian content not easily available on Google, such as relationships between people.


  • OpenCalais – metadata generator, useful for understanding how machines read your text

… more to come …

By all means – feel free to make suggestions!

FAST Forward 2009: Notes from the third day

Bjørn Olstad: Microsoft’s vision for enterprise search

Search as a transparent and ubiquitous layer providing information and context seamlessly – from a search box (tell me what you want in 1.4 words and I will answer) to a conversational interface (giving pointers to more information and suggestions for continued searches) to a natural interface.

Demo of Microsoft Surface: Camera interface, can recognize things. Multiuser (as opposed to Apple). Showed an application built on search with touch – whenever you touch an information object a query goes towards an ESP implementation and brings up all the information available on that object.

Very impressive demo of Excel Gemini: How do you fit enterprise data into Excel? (Picture of a VW bug with a jet engine.) Pulls 100 million rows into Excel, sorts them (instantly), slices and dices. Built on top of ESP, does extreme compression, takes advantage of high memory, allows publishing of live spreadsheets to Sharepoint. Extremely impressive, worth the whole conference.

Bjørn continues talking about search as a platform: Demoing Globrix, where you can ask questions about apartments and houses and get a rich search experience where you can change attributes and the data changes dynamically. Globrix does not hold content itself, but crawls available content on the web and shows it (much like for airline tickets).

Another demo: Search for entertainment based on location, friends and content. Moving from there to a focused movie site. This is federated search that understands some of the semantics (understands that “David Bowie” refers to a person and therefore only searches certain databases). Also incorporates community (letting users edit the results and feed them back).

FAST AdMomentum – advertising network – has had tremendous growth.

Content analytics: How can you lay a foundation for a good search experience by focusing on data quality? Demo: Content Integration Studio, sucking out semantics from unstructured text and writing it back both to the search engine and to databases (such as an HR database).

Panel session on enterprise search

Hitachi consulting (Ellen): Very big focus on the economy now, almost all conversations are about that topic. eDiscovery is important: Looking at many sources with a view towards risk discovery and risk mitigation.

EMC consulting (Mark Stone): Natural interfaces will be important, freeing up the mind to focus on the information rather than the interface. Shows a video of a small girl using the Surface table and how she very quickly starts to focus on the pictures she is manipulating rather than the interface – she completely forgets that she is working with a computer.

Sue Feldman, IDC: We have to get beyond the document paradigm. I want to see interfaces that will immerse me in the sea of information and explore it, without having to think about what application it is in.

Sue Feldman: Core issue with search: Data quality and making it a rich experience for the user. Anthropological, linguistic and cultural issues, getting people to understand both what they are seeing and what they are looking for.  We are just beginning on this journey. From keyword matching and relevance ranking to pulling the user in, having a dialogue with the information. What we are seeing is hybrid systems that combine collaboration, search, analysis etc.

AMR Research: There is a religious war going on, between collaborative systems, portals, content management systems, and search. They all claim to be the answer to the problem of connecting users with their data. There is also consolidation in the market, partially driven by the economy, but there is also a consolidation of functionality and an explosion in new ideas, many small companies coming up with new ideas.  No one technology is going to solve all of these problems. Lots of opportunity because Microsoft is gobbling up all these technologies, trying to provide one product that covers most (Sharepoint).

Q: Examples of interaction management?

Hitachi consulting: Best examples currently found in collaboration and community software.

EMC: There is a tool out there that searches not only blogs, but specifically the comment sections of blogs, looking for mentions of products. Do sentiment analysis, find out what the customers are saying about you.

Sue Feldman: Searching through corporate communications in lawsuit situations. Ad targeting. And what is the relationship between search and innovation?

Hitachi: Innovation comes from finding what you did not expect to find.

Q: This question always comes up: Search is a commodity – or is it? What is the current market doing for search adoption?

AMR: I am not sure who says that, there is so much room for innovation, so I can’t understand why anyone would say it is commoditized. Go out there and find the opportunities.

Sue F: Well, search is a tool, like a screwdriver. But I really need a screwdriver. The toolbox has expanded so much. I see the search market continuing to explode even though the technology is tanking. Possible that we will see a disruption with a new platform based on information management, access and collaboration.

EMC: We are seeing growth, the business will mature because companies have to focus on what the business really needs.

Sue Feldman & others: Search use awards

Customer awards:

  • Best productivity advancement: Verizon Business.
  • Best digital market application (I): McGraw-Hill Platts (doing industry-specific searches, 50% increase in trial subscriptions, 40% increase in revenue.)
  • Best digital market application (II): SPH Search (reader interaction and content integrated with newspaper sources, federated search.)
  • Social computing: Accenture (internal search on people profiles and content)
  • User engagement:, Japan (700m pageviews, 18m unique users)
  • User engagement: AutoTrader (peak query level of 1500 qps)

Partner awards:

  • Digital market solution: Comperio (use of search for user interaction)
  • Social computing solution: NewsGator (enterprise social computing on top of Sharepoint)
  • User experience solutions: EMC Consulting
  • Partner of the year: Hitachi consulting.

FASTForward 2009 – impressions from the second day

The second day has less of the “big picture” and more of product announcements and more technical detail. Here are some notes as the day progresses:

Kirk Koenigsbauer, Microsoft: Our enterprise search vision & roadmap

Kirk is responsible for the business side of FAST after the acquisition. He is speaking on Microsoft’s commitment to search, the roadmap and future business directions, including pricing.

About 15% of the research done in MS Research is search-oriented.  10 years support on current FAST products, even non-MS platform.

Search server express now has more than 100,000 downloads. 1/3 of MS enterprise customers have deployed a MS search solution. Partner #s have doubled.

MS vision: Create experiences that combine the magic of software with the power of Internet services across a world of devices. Search is integral to vision.

Demo: Use of search in a business setting, showing documents in a viewer format, extracting keywords and concepts.

Announcing two new products:

  • FAST for Sharepoint, which is FAST ESP integrated into Sharepoint, available at a substantially lower price than FAST ESP, typically 50% lower price. Simpler pricing model: Per-user charge for FAST ESP standalone, included in Sharepoint. Still need to buy a server at 25K a pop, but this is substantially lower price. Will be available from next rollout of Office (wave 14). Will also provide a licensing bridge for those who purchase Sharepoint now.
  • FAST Search for Internet business. New functionality for interaction management (promotions, campaigns etc.), Content Integration Studio (graphical interface for managing content restructuring and content integration), and simplified licensing: Language pack and connectors will be part of the standard package.

Valentin Richter, Raytion: User engagement

Low satisfaction with many search solutions, and 70% of search managers do not study search logs with an eye to improving the experience. Went through a list of common myths about search (such as “people know what they are looking for”). People want simplicity – they cannot handle expressions and need more of a drill-down approach, navigating through related information. Installing search platforms immediately leads to a focus on information quality: You find duplicates, you find confidential documents everywhere, and so on – be ready for it both in a technical and organizational sense.

Walton Smith, Booz Allen Hamilton: Case study of use of FAST and Sharepoint

BAH based in Virginia, traditionally centralized, but expanding. 300 partners, all wanting to go in different directions. De facto collaboration tool was Outlook. Created a social computing platform called Among the results: Have given access to more esoteric material, which caused issues with indexing. Were able to pull new people from other parts of the organization on a project. Other application:, finding people with the right credentials and experience, pulling information from many sources. crawls hell and iShare. About 1/3 of the firm is now using the platform, lots of information on individuals.

Charlene Li: Transformation based on social technologies

It is all about engaging users in dialogue: H&R Block has a page on Facebook where they discuss tax issues – not trying to pull people in, at least not explicitly. Comcast is on Twitter with their customer service people. Starbucks testing ideas, such as automated purchasing based on a customer card. Beth Israel’s CEO blogs about what it is like to run a hospital. Necessary to change search to include social software: Technorati searches blogs, allows social bookmarking. You can use Twitter mapping to see what people are discussing – showing that what is rated high somewhere may not be what is most discussed. Amazon now lets you filter reviews by friends.

Conclusion: Social networks will be like air, and will transform companies from the outside in. Social media is impacting search at multiple levels, refining results based on personalization details derived from their social circles.

Jørn Ellefsen, Comperio: In search of profits

Comperio has more than 100 customers and has created a front application, Comperio Front, that sits between the customer’s web pages and their search engine. Introduced Drew Brunell who works with SEO for, among others, News International. Paid search is the growing part of the advertising market, everything else is either flat (display ads) or sinking (traditional ads). Doing a lot of experimentation linking into customer behavior – for instance, matching content with areas that see a lot of comments, “invisible newspapers”. Another notion is the “curated content model”, setting up pages with a blend of original content and stuff from the outside web. Topic pages based on “zero-term search”, offering editorial content put together automatically. Stefan Sveen, CTO Comperio, demonstrated topic pages from Times Online: Users and journalists can create their own topic pages, based on search results, and mark entries coming in after the page is created.

Venkat Krishnamoorthy, Thomson Reuters: Delivering Contextual and Intelligent Information to Premium Customers

Reuters delivers context-sensitive information for pre-investment analysis to premium customers. They have done this for a long time, but want to change from being a data-delivery company to being integrated into the user’s workflow. Challenges here included having too many applications the customers needed to stitch together, and that finding information was difficult, especially across different kinds of assets – more than 40 content databases. Solution: Put in a search and navigation layer between their desktop products (they have two, a web-based one and a premium, client-based one).

Liveblogging from Sophia Antipolis

These are my running notes from visiting Accenture’s Technology Labs in Sophia Antipolis, as part of a Master of Management program called "Strategic Business Development and Innovation" for the Norwegian School of Management.

Accenture’s Technology Labs is a relatively small organization: 200 researchers, against 180,000 employees in Accenture. There are four tech labs: Silicon Valley, Chicago (the largest), Sophia Antipolis and Bangalore. They should all be able to do everything, but in practice there is specialization. The four main activities of the tech labs are technology visioning, research, development of specific platforms, and innovation workshops (with clients, press, consultants etc.). The themes pursued are mobility and sensors; analytics and insight; human interaction & performance; systems integration (architecture, development methods); and infrastructure (virtualization, cloud computing).

Kelly Dempski: Power Shift: Accenture Technology vision

The visioning used to be far-thinking, visionary etc., now have a much more immediate focus, want to look at things that you can implement today, make it much more "grounded in reality"

Eight critical trends:

  • 1: Cloud computing and SaaS: Hardware cloud (IBM, Google (now the third largest producer of servers in the world)), desktop cloud (Google, Zimbra, MS Office Live Workspace), SaaS cloud (Netsuite, CrownPeak), and services cloud (Google Checkout, Amazon web services, eBay, Yahoo)
    • examples: Flextronics has moved its HR applications to a SaaS model. AMD emulates chips in software for testing purposes, and now contracts with Sun to do that in the cloud. New York Times had 4 TB of articles that they wanted to convert to PDF: Someone went on Amazon with their credit card, uploaded the 4 TB and processed it (24h). There was a bug the first time, so they had to do it all again (48h in total). Total cost: $250 on someone’s credit card.
    • issues:
      • data location (where is the data)
      • privacy and security
      • performance
  • 2: Systems – regular and lite
    • SOA as the integration paradigm (regular), mashups (lite)
    • traditional back-end apps vs. end-user apps
    • small number of apps maintained by CIOs vs. large number of User and user-group created applications (long tail)
    • examples:
      • REST is a light architectural approach for interoperability & data extraction
      • Mashups (JackMe (trading platform tools), Serena, Duet (SAP and Microsoft), IBM) becoming more important in the enterprise arena
      • Widgets and gadgets are light-weight desktop UIs that continually update some data
  • 3: Enterprise intelligence at scale
    • combination of internet-scale computing, petabytes of data, and new algorithms
    • almost all the large systems vendors have partnered with or acquired some analytics oriented software company (such as Microsoft acquiring FAST)
    • rampant use of data: evolution through access, reporting, external & internal, unstructured etc.
  • Trends 1-2-3 together: The new CIO
    • hardware and software procured from the cloud
    • business units, end-users create their own lightweight apps
    • The new CIO:
      • "Data Fort Commander" – ensure security, privacy, integrity of corporate data and manage back-end apps
      • "Chief Intelligence Officer" – provide data analysis services & insights to business units
  • 4: Continuous access
    • mobile device "first class" IT object
    • No concept of enterprise desktop/laptop
    • location-based services
  • 5: Social computing
    • amplify and support the value of the community
    • three major directions: Platformization, inter-operability, identity management
  • 6: User-generated content
    • community-based CRM (users making videos about how to run certain kinds of software or build something from IKEA)
    • new forms of entertainment
    • revenue erosion of traditional media companies
    • this has marketing implications: You can measure the sentiment out there in the user community. You switch from advertising to engaging.
  • 7: Industrialization of software development
    • converging trends will increase integration: Predictive metrics, model-driven development, domain-specific languages, service-oriented architecture, agile-development & Forever Beta.
  • 8: Green computing
    • global warming, energy prices, consumer pressure, compliance and valuation
    • switch out energy-intensive processes for information-intensive processes: Electronic collaboration; Warehousing, supply chain & logistics optimization; Smart factories, plants, buildings & homes; and new businesses such as carbon auditing and trading

Cyrille Bataller: Biometric Identity Management

Biometric identification is coming, driven by increasing demand and technological progress. Biometric identification is defined as "automated recognition of individuals based on their physiological and/or behavioral characteristics". Physiological can be face, iris, fingerprint; behavioral can be signature, voice, or walk. Involves a tradeoff, as with all security systems, between the level of security and the convenience of the system. Fingerprint is most used (38%), face is the most natural, iris the most accurate. Many others: Finger/hand vein, gait, ear shape, electricity, heat signature, hand geometry and so on…

The balance point between FMR (false match rate, i.e., false positive identification) and FNMR (false non-match rate) is called the equal error rate (EER). Iris has an EER of .002%, 10 fingerprints .01%, a single fingerprint .4%, signature 3%, face recognition 6%, voice 8%. Many parameters in addition to this.
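The equal error rate is easier to grasp with numbers in hand. The sketch below is my own illustration, not something shown at the lab: the similarity scores are made up, and the function names are hypothetical. It assumes a higher score means "more likely the same person" and that a comparison is accepted when its score is at or above the threshold. Raising the threshold lowers the FMR and raises the FNMR; the EER is read off where the two meet.

```python
def error_rates(genuine, impostor, threshold):
    """FMR: share of impostor scores accepted. FNMR: share of genuine scores rejected."""
    fmr = sum(score >= threshold for score in impostor) / len(impostor)
    fnmr = sum(score < threshold for score in genuine) / len(genuine)
    return fmr, fnmr

def equal_error_rate(genuine, impostor):
    """Try every observed score as threshold; return (threshold, eer) where FMR and FNMR are closest."""
    best = None
    for threshold in sorted(set(genuine) | set(impostor)):
        fmr, fnmr = error_rates(genuine, impostor, threshold)
        gap = abs(fmr - fnmr)
        if best is None or gap < best[0]:
            # With finite data the two rates may never be exactly equal,
            # so report their average at the closest point.
            best = (gap, threshold, (fmr + fnmr) / 2)
    return best[1], best[2]

# Made-up scores: same-person comparisons and different-person comparisons.
genuine = [0.9, 0.8, 0.75, 0.6, 0.4]
impostor = [0.1, 0.2, 0.3, 0.5, 0.7]

threshold, eer = equal_error_rate(genuine, impostor)
print(f"threshold={threshold}, EER={eer:.0%}")
```

A system tuned for security would pick a threshold above this point (fewer false matches, more rejected legitimate users), one tuned for convenience a threshold below it, which is the tradeoff mentioned earlier.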

SecuriMetrics has something called HIIDE, a mobile unit that does a number of biometrics, used in Iraq. Voice is very interesting because it can be done over the phone, interesting for call centers, banks etc. Multimodal is important, because it is hard to spoof.

Airports are a good example of what you can do with proper identification: You can move 99.9% of the check-in away from the airport. Bag drop can also be almost fully automated. Portugal is the leader in the EU and has automated passport control with facial recognition (scan, use electronic passport etc.). Most people are not very concerned with privacy given some assurance and convenience. Likely to see lots of automated border clearance for the masses, but also registered travelers who go through even quicker and are interoperable across many airports. One common misunderstanding is that automated identity checking is moving away from 100% accuracy, but human passport/security control is an error-ridden process and automated processes are mostly more accurate.

Antoine Caner: Next Generation Branch

This is a showcase exhibit of best practice banking technology and processes. The showroom has about 40 company visits (banks, mostly) per year.

Most banks have a multi-channel strategy. They have backed away from a strategy of getting rid of branches, and instead want to redefine the branch. Rather than doing low-value transactions, the branches are seen as a mesh network for business development.

Key principles behind the branch of the future:

  • generating and taking advantage of the traffic
  • flexibility throughout the day
  • adaptation to client’s value
  • sell & service oriented
  • modular space according
  • entertaining and attractive
  • focused on customer experience


  • turning the branch windows into an interactive display (realty, for instance)
  • Bluetooth-enabled push information
  • swipe card at entrance to let branch know you are there, let your account manager know, apply Amazon-like features
  • digital displays for marketing
  • avatar-based teller services
  • biometric-based ATMs to allow for more advanced transactions, as well as more opportunistic sales applications
  • do both identification and authentication
  • digital pen user interface for capturing data from forms
  • RFID-based or NFC (Near Field Communication) in brochures, swipe and get info on screen
  • "interactive wall" for interaction with clients in information seeking mode
  • visual tracking of movement in the branch
  • modular office that can change shape during the day, reconfigurable furniture

What impressed me was not the individual applications per se – though they were impressive – but the way everything had been put together, with a back-office application that can be used by the branch manager to track how this whole customer interface (i.e., the whole bank branch) works.

Alexandre Naressi: Emerging Web Technologies

Alexandre leads the rich Internet applications community of interest within Accenture. He started off giving some background on Web 2.0 and used Flickr as an example of a Web 2.0 application, where a company uses user-generated content and tagging to get network effects on its side. Important here is not only the user interface but also having APIs that allow anyone to create applications and to have your content or services embedded into other platforms. Dimpls is another example. More than one billion people have Internet access, 50% of the world has broadband access, which allows for richer applications. Customers’ behavior is changing – it is now a "read-write" web. It has also gotten so much cheaper to launch something: Excite cost $3m, JotSpot $200k, Digg cost $200.

Rich Internet Applications and Social Software represent low-hanging fruit in this scenario. RIA allows the functionality of a fat client in a browser interface, with very rich and capable components for programmers to play around with.

Two families of technologies: JavaScript/Ajax (doesn’t require a plugin, advocated by Google), and three different plugin-based platforms: Silverlight (Microsoft), Flash/Flex from Adobe, and JavaFX from Sun. All of them have offline clients that can be downloaded as well. A good example is, which gives a better user interface – Accenture has developed something similar for their internal enterprise search.

Social Software: Accenture has its own internal version of Facebook. Youtube is also a possible corporate platform where people can contribute screencasts of all kinds of interesting demos and prototypes.

Kirsti Kierulf: Nordic Innovation Model for Accenture and Microsoft

Accenture and Microsoft collaborating (own a company, Avanade, together), and have set up an Innovation lab in Oslo called the Accenture Innovation Lab on Microsoft Enterprise Search. Three agendas: Network services, enterprise search (iAD), and service innovation. Running a number of innovation processes internally. This happens on a Nordic level, so collaboration is with academic institutions and companies all over.

Have made a number of tools to support innovation methodologies: InnovateIT, InnovoteIT, and InnomindIT (mind maps), as well as a method for making quick prototypes of systems and concepts for testing and experimentation: 6 weeks from idea to test.

Current innovation models are not working for long-term, risky projects. Closed models do not work – hence, looser, more informal and open innovation models with shorter innovation cycles. Pull people in, share costs throughout the network. Try to avoid the funnel which closes down projects with no clear business case, and NIH. Try to park ideas rather than kill them.

Important: Ask for advice, stay in the question, maintain relationships, don’t spend time on legalities and financials.

CACM becomes much more readable

CACM (Communications of the ACM) is one of my favorite journals – and it is currently in the throes of an editorial upheaval that I think is very positive. In addition to scholarly articles, it is moving in the direction of essays and more generally accessible articles, without loosening the quality criteria. Ever since BYTE disappeared (a victim of the need for targeted advertising) I have missed a general, quite technical yet accessible journal – CACM is now getting closer to what I am looking for.

Here are two articles I found very interesting:

  • "Will the Future of Software be Open Source?", a well-reasoned reflection by Martin Campbell-Kelly, giving a very terse, yet comprehensive and useful description of the evolution of software markets. Answer: OS is a tempting conclusion if you extrapolate, but extrapolation has not been a very successful prediction technique so far…
  • "Searching the Deep Web", by Alex Wright, which explores two different approaches to searching beyond static web pages – the trawling approach, which relies on local storage, and the angling approach, which produces targeted results in real time.

Small firm, large firm, we are all equal now

Hal Varian has a good post on the democratization of data over at the Google blog – in short, that small firms now can access information and analysis (including consultants) much like large firms can.

My interpretation: Information access is now close to free. What you now need is understanding. That takes people, and if you can access the smart ones in person as well as their explicated output, you will do well.

One danger of search-collected newspapers

United Airlines’ share price dropped 76% when Google News erroneously picked up a six-year-old story about UAL filing for bankruptcy and pushed it to the front page.

Not that this couldn’t happen in any newspaper, but Google News is automatically generated. This opens up interesting possibilities for pump-and-dump…

iAD center opening

Monday was exciting – not only was it the Fall workshop for the iAD Center for Research-based Innovation, but it was also the opening of the iAD Lab [Norwegian language story here] – a physical manifestation of the research project, as well as an important tool for drawing the researchers from the five Oslo-based participants (FAST, Accenture, Schibsted, UiO and BI) closer together.

[Photo: Bjørn Olstad, CTO of FAST, opening the lab]

Myself, I plan to spend at least one day per week in the lab – there is nothing like physical proximity to get to know an organization and a field, notwithstanding all the communications capabilities, electronic and otherwise, we surround ourselves with.

The lab itself, incidentally, is just six workspaces, a few computers and access cards for researchers. Gone are the days when the opening of a computing center was photogenic, with blinking lights and spinning tape decks. But it will enable us to store sensitive data in a secure environment, have enough horsepower to really analyze them, and provide a natural focal point for demonstrations, prototypes and experiments.

Serendipity, researchwise

Mary B. has this account of finding interesting material bound with another book from the library – and then discovering that all the stuff was available through Google Booksearch. Which raises the point – how do we recreate the serendipity often found in research (go into any library and look at the books next to the one you are looking for) in an electronic context?

Online newspapers (as well as domain squatters) face this challenge every day – not just serving what the customer wants, but also something they didn’t know they wanted, often sufficiently similar that it may be, if not a substitute, at least a diversion.

Perhaps Google should have a new subcategory on their result screen – an appropriately random link under the heading of "and now, for something completely different…"

Google and network externalities

Here is a bunch of links about Google that I have had lying around for a while – trying to think about the first one and to what extent Hal Varian is right about Google not having a network externality competitive advantage. I think he is wrong, but why is hard to articulate.

So, here goes (note that Google, rather nicely, includes a list of links to each blog post, which is fodder for further discussion):

  • Hal Varian: Our secret sauce, arguing that Google’s competitive advantage is due to experience and innovation, not network externalities.
  • Tom Evslin: Sitemaps and how the rich get richer: Essentially, Google has an advantage because they are the biggest and people adjust their web sites to the Google engine and its various algorithmic quirks.
  • Hal Varian: Why data matters. Brief overview of search and PageRank.
  • Hal Varian: How auctions set ad prices. Brief explanation of Google’s auction system for ads. One interesting effect, not mentioned here, is that the more precisely the advertiser can describe the targeted population, the lower the ad price – thus, Google has both an incentive to make targeting imprecise (to have enough actors competing for a particular keyword/target) and an incentive to make it precise (to increase click rates).
  • Marissa Mayer: A peek into our search factory. Various presentations, with notes, about the infrastructure underlying Google’s various offerings.
  • Udi Manber: Introduction to Google search quality. Overview of what Google does to fight spam, increase precision, and other things. (Reads like a transcript of a talk.)
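
To make the targeting effect in the auction item above concrete, here is a minimal sketch. It assumes a plain sealed-bid second-price auction (the winner pays the runner-up's bid, or a reserve price if nobody else bids), which is a simplification and not a description of how Google's ad system actually works. The function name, the bidder names, the bids and the reserve are all made up for illustration.

```python
def second_price_auction(bids, reserve=0.0):
    """Run a simplified sealed-bid second-price auction.

    bids: dict mapping a (hypothetical) bidder name to its bid.
    Returns (winner, price), or (None, None) if nobody meets the reserve.
    """
    if not bids:
        return None, None
    # Rank bidders from highest to lowest bid.
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner, top_bid = ranked[0]
    if top_bid < reserve:
        return None, None
    # The winner pays the runner-up's bid, or the reserve if it is alone.
    runner_up = ranked[1][1] if len(ranked) > 1 else reserve
    return winner, max(runner_up, reserve)


# Broad target: several advertisers compete for the same keyword.
broad = {"airline": 2.00, "hotel": 1.80, "car_rental": 1.50}
# Narrow target: the same top bidder, but nobody else wants this audience.
narrow = {"airline": 2.00}

broad_result = second_price_auction(broad, reserve=0.10)
narrow_result = second_price_auction(narrow, reserve=0.10)
print("broad target: ", broad_result)
print("narrow target:", narrow_result)
```

With these made-up numbers the same advertiser wins both times, but pays the runner-up's bid when the target is broad and only the reserve when the target is so narrow that it bids alone. That is the tension: precise targeting thins out the competition for each slot, even as it presumably raises click rates.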

Here are two articles that everyone trying to understand Google should read (come to think of it, this blog post is starting to resemble the layout for a class):

  • Brin, S. and L. Page (1998). The Anatomy of a Large-Scale Hypertextual Web Search Engine. Seventh International WWW Conference, Brisbane, Australia. (The classic on PageRank.)
  • Ghemawat, S., H. Gobioff, et al. (2003). The Google File System. ACM Symposium on Operating Systems Principles, ACM. Description of the architecture of Google’s index, a file system geared for few writes and very many reads, redundancy, and low response time. PDF here.


Here is a UNIX-like interface to Google. I like it – wonderful how a sparse interface can improve productivity. It almost makes me long for the days of DESQview and all those other text-based multitasking hacks of the 90s. (Mind you, this is just an interface, not a full Unix shell.)

(Via David Weinberger.)

iAD Master students wanted!

(This message is meant for students at the Norwegian School of Management, but I am posting it here for distribution – and if someone from another institution should be interested, by all means, get in touch.)

The Center for Technology Strategy is seeking M.Sc. students who are looking for interesting topics for their thesis, offering the opportunity to write their thesis under the iAD research project. This project is a joint project of FAST, Accenture, Schibsted and six universities, among them NSM. The purpose of NSM’s part of the project is to understand the business impact of search technologies and other new technologies for information access.

This opportunity is open to all Master students, in any specialty, and would involve finding a research topic connected to the iAD project. (See a list of proposed topics here, but feel free to come up with your own.) The topic definition will happen in collaboration with faculty from your M.Sc. specialty. The thesis advisor will be either your own faculty, one of the faculty associated with the iAD project (Espen Andersen, Ingunn Myrtveit, Erik Stensrud, Torger Reve), or possibly an advisor from FAST, Accenture or Schibsted, as appropriate.

We are planning an information meeting on

    April 2, at 0900-1030 at room C2-040, BI Nydalen

If you are interested, please send me an email so I know how many will be there.

Updated March 25: A list of some suggested topics can be found here