Tuesday, September 28, 2010

LocalMaps or MicroMaps - an idea for maps at the "micro" level

In Aug '09, I went to IGI Airport (DEL) to pick up a friend who had flown to Delhi and who had a few hours before she took a connecting train to her final destination. After parking my car in the lot, I wondered where the Arrival terminal might be!

I digress for a moment.

What does one do when one is in Gauteng and wants to drive to Kruger National Park? Simple: one feeds Kruger National Park into one's GPS device, and the device guides one to one's destination. This is an example of mapping at the macro level.

I think we also need maps at the micro level - maps that work inside large airports, large entertainment parks, large shopping malls, etc. I name such a feature LocalMaps (or MicroMaps).

Back to IGI. Had there been such a feature, my phone would have sensed that it was inside an area that supports LocalMaps, and would have automatically asked me either to choose from a list of points of interest discovered by my phone in that zone, or to search for a person, place, product, or service I was interested in within that zone. As soon as I started typing Arri, the phone would have suggested Arrival Terminal, and choosing this option would have displayed (or spoken) the direction in which I should start moving (towers in the zone would have helped my device pinpoint its location within the zone and suggest directions).

I believe LocalMaps can save people time when they're inside large, well-defined areas. Looking for a specific Axe item at a huge shopping mall? Type Axe into the search and you'll be told not only about the availability (and price, etc.) of the item, but also the direction in which you should move to get to it. Ask the phone to check whether a friend is inside the same mall, and you'll be told both the answer and, if applicable, the direction to walk to reach him or her. Inside a Pick n Pay store but unable to locate a particular product? Worry not, for you can query the location of the product.
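For concreteness, here's a minimal sketch of how a LocalMaps client on a phone might behave. Everything in it is hypothetical - the zone discovery, the Zone and PointOfInterest names, the bearing calculation - and it exists only to illustrate the choose/search/direction flow described above.

```python
import math
from dataclasses import dataclass

@dataclass
class PointOfInterest:
    name: str          # e.g. "Arrival Terminal", a shop, or a product shelf
    x: float           # position within the zone, in metres
    y: float

@dataclass
class Zone:
    name: str
    points: list       # points of interest advertised by the zone

def discover_zone():
    """Hypothetical: in a real system the phone would learn this from
    towers installed in the zone; here it is simply hard-coded."""
    return Zone("IGI Airport (DEL)", [
        PointOfInterest("Arrival Terminal", x=120.0, y=340.0),
        PointOfInterest("Departure Terminal", x=80.0, y=510.0),
        PointOfInterest("Parking Lot P1", x=10.0, y=20.0),
    ])

def search(zone, query):
    """Prefix search over the zone's points of interest, so typing 'Arri'
    already suggests 'Arrival Terminal'."""
    q = query.lower()
    return [p for p in zone.points if p.name.lower().startswith(q)]

def direction_to(me_x, me_y, poi):
    """Compass-style heading from the phone's current position (as
    triangulated by the zone's towers) to the chosen point of interest."""
    angle = math.degrees(math.atan2(poi.x - me_x, poi.y - me_y)) % 360
    headings = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]
    return headings[int((angle + 22.5) // 45) % 8]

if __name__ == "__main__":
    zone = discover_zone()
    matches = search(zone, "Arri")
    print("Suggestions:", [p.name for p in matches])
    # Suppose the towers place me at the parking lot (10, 20):
    print("Head", direction_to(10.0, 20.0, matches[0]))
```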

Tuesday, September 21, 2010

The world needs a "SearchMyStuff" service

I've created content on numerous Web properties:
  1. Emails in Gmail, Hotmail and Yahoo Mail
  2. Chats in Gmail
  3. Photos on Flickr, Orkut, and Picasa Web
  4. Tweets on Twitter
  5. Comments on Facebook
  6. Bookmarks and Videos on YouTube
  7. Posts on GMAT Club
  8. Questions on LinkedIn Answers and Yahoo Answers
  9. Blogs on Blogger
  10. Scraps on Orkut
  11. Contacts in Google Contacts and Yahoo Contacts
  12. Documents in Google Docs and Zoho
  13. Files on SkyDrive
  14. Events, etc., on Google Calendar
  15. Etc.
I know of no service that lets me search across all the content I've created, scattered over various Web properties owned by different companies. And I think the world needs such a service - a service that allows me, for example, to search for the term Chrome and retrieve all the Chrome-related content I've created online. Building such a service will be a tall order, involving APIs, business interests, privacy, security, etc., but that doesn't mean it won't be worth the effort.
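At its core, such a service would be a federated search: fan the query out to each property, then merge the results. Below is a small sketch of that pattern, assuming a hypothetical uniform Connector.search(query) interface per property; real services expose very different APIs, and authentication (OAuth, etc.) is omitted entirely.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Item:
    service: str      # e.g. "Gmail", "Twitter", "Flickr"
    kind: str         # e.g. "email", "tweet", "photo"
    title: str
    created: datetime

class Connector:
    """Hypothetical per-property adapter. A real connector would call the
    property's API with the user's credentials and normalise the response
    into Item objects."""
    def __init__(self, service, items):
        self.service = service
        self._items = items   # stand-in for the remote content

    def search(self, query):
        q = query.lower()
        return [i for i in self._items if q in i.title.lower()]

def search_my_stuff(connectors, query):
    """Fan the query out to every connector and merge the hits, newest first."""
    hits = []
    for c in connectors:
        hits.extend(c.search(query))
    return sorted(hits, key=lambda i: i.created, reverse=True)

if __name__ == "__main__":
    connectors = [
        Connector("Gmail", [Item("Gmail", "email", "Chrome extension ideas",
                                 datetime(2010, 3, 1))]),
        Connector("Twitter", [Item("Twitter", "tweet", "Trying out Chrome 5",
                                   datetime(2010, 6, 12))]),
        Connector("Blogger", [Item("Blogger", "post", "Why I still use Firefox",
                                   datetime(2010, 9, 9))]),
    ]
    for hit in search_my_stuff(connectors, "chrome"):
        print(hit.service, "-", hit.kind, "-", hit.title)
```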

Tuesday, September 14, 2010

Building Google-like search engine using Google Search Appliance

The idea below rests on certain assumptions I've made. It seems workable, unless this "misuse" was foreseen and measures were put in place to thwart it. I first wrote down this idea in Jun '09 but never posted it on the Web.

Online news media are abuzz over the release of an updated version of the Google Search Appliance (GSA) - version 6.0. Among the many updates in this version, one feature has intrigued me the most - the ability to index billions of documents (using clustering). (1)

To understand a possible implication of this, it's important to realize that Google's search algorithms are one of its most important pieces of IP. What makes Google Google is its secret sauce - the algorithms it uses to rank Web content. Everyone can crawl the Web, but it's the relevance-determining algorithms that give Google much of its competitive edge over rival search engines such as Yahoo Search, Ask.com, and Bing. (2)

It's also known that GSA uses Google's ranking algorithms to rank indexed content. (3)

We also know that the ability to crawl and ingest the Web is not a major source of competitive advantage for search engines. Even a simple program such as HTTrack can do a relatively decent job of downloading a website by jumping from URL to URL, and the process HTTrack uses to crawl a website is similar to how contemporary search engines crawl the Web. It should be possible, without much difficulty, to configure (or customize) HTTrack to crawl the Web rather than just a single website. (4)
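To illustrate why crawling itself isn't the hard part, here's a bare-bones breadth-first crawler in the spirit of what HTTrack does: fetch a page, extract its links, follow them. It uses only the Python standard library; a real crawler would add politeness (robots.txt, rate limits), large-scale deduplication, and proper error handling.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
from urllib.request import urlopen

class LinkExtractor(HTMLParser):
    """Collects the href targets of <a> tags on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(seed_url, max_pages=20):
    """Breadth-first crawl starting from seed_url, visiting at most max_pages."""
    seen, queue, pages = set(), deque([seed_url]), {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        if url in seen:
            continue
        seen.add(url)
        try:
            html = urlopen(url, timeout=10).read().decode("utf-8", "replace")
        except Exception:
            continue                      # skip pages that fail to download
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            absolute, _ = urldefrag(urljoin(url, link))  # resolve and drop #fragment
            if absolute.startswith("http") and absolute not in seen:
                queue.append(absolute)
    return pages

if __name__ == "__main__":
    downloaded = crawl("https://example.com/")
    print("Fetched", len(downloaded), "pages")
```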


What all of this leads me to believe is that it should be possible to cluster multiple GSAs to create a pseudo-Google - a search engine that uses Google's secret algorithms to rank the Web, but is powered by a cluster of GSAs. If this is indeed possible, it'll make it super-easy for clever entrepreneurs to launch new search engines that provide high-quality results.
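Stripped of GSA specifics, the clustering idea is just sharding: split the corpus across several boxes, send each query to every shard, and merge the ranked partial results. The toy sketch below shows that fan-out-and-merge pattern, with a trivial term-count score standing in for Google's ranking algorithms; it isn't based on how GSA clustering actually works internally.

```python
class Shard:
    """One appliance in the cluster, holding a slice of the corpus."""
    def __init__(self, documents):
        self.documents = documents        # {url: text}

    def query(self, term):
        """Return (score, url) pairs; the score here is a crude term count,
        standing in for a real ranking algorithm."""
        term = term.lower()
        hits = []
        for url, text in self.documents.items():
            score = text.lower().count(term)
            if score:
                hits.append((score, url))
        return hits

def cluster_search(shards, term, top_k=5):
    """Fan the query out to every shard and merge the partial result lists."""
    merged = []
    for shard in shards:
        merged.extend(shard.query(term))
    merged.sort(reverse=True)             # highest score first
    return [url for score, url in merged[:top_k]]

if __name__ == "__main__":
    shards = [
        Shard({"http://a.example/1": "google search appliance clustering",
               "http://a.example/2": "nothing relevant here"}),
        Shard({"http://b.example/1": "search engines rank the web",
               "http://b.example/2": "search search search"}),
    ]
    print(cluster_search(shards, "search"))
```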

References:
  1. Google Search Appliance Now Can Index Billions of Documents, PC World, June 2, 2009
  2. The Big Cheese - Powerful Version Of Google Search Appliance Can Grow Exponentially, TechCrunch, June 2, 2009
  3. Google Enterprise Search - Search Appliance
  4. Google Search Appliance on Wikipedia

Thursday, September 9, 2010

By not serving environment-optimized binaries, Mozilla is denying Firefox users an optimum experience (and hurting Firefox)

ALSO SEE OID 212Z.

To summarize, I've observed that Mozilla's practice of distributing official binaries that aren't optimized to take advantage of new features and other improvements in modern CPUs (and possibly operating systems) results in a relatively inferior user experience for Firefox users, ultimately hurting Firefox (and thus Mozilla).

More specifically, the binaries distributed by Mozilla use the Greatest Common Divisor (GCD)/Highest Common Factor (HCF) approach to compilation (an approach mistakenly labeled Least Common Multiple (LCM) by many, many people), resulting in binaries that don't make use of relatively newer instruction set extensions such as SSE, SSE2, etc. Additionally, the compilation settings used by Mozilla produce binaries optimized for size at the expense of performance. The result is a non-insignificant deterioration in Firefox's performance (I've observed a non-insignificant performance gap between non-optimized and optimized versions of Firefox on popular benchmark tests such as V8). This difference is visible to, and hurts, those Firefox users whose systems allow Firefox to deliver only slightly less than just-acceptable performance (such users would probably get slightly more than just-acceptable performance with optimized builds).

Mozilla's practice also denies a better experience to users who have faster connections (so they don't mind downloading a larger installer) and newer CPUs (so their systems are capable of performing better).

One of the computers I own and maintain is exactly such a system. Chrome runs smoothly on it, whereas the official version of Firefox 3.6.8 struggles, as if it's out of breath. Switching to an optimized version of Firefox has improved performance noticeably, and has kept me from switching full-time to Chrome on this system. My own example shows that there's a group of users whose machines don't give them a satisfactory Firefox experience because Mozilla doesn't optimize Firefox for performance. Is the number of such users large? Is it so large that a switch by these users from Firefox to the undeniably snappier Chrome will materially hurt Firefox/Mozilla?

How can Mozilla solve this problem? The following options come to my mind:
  1. Fat installer, normal install: The installer will contain multiple versions of the binaries, intended to cover the most common types of systems in existence. During installation, it will perform a system capability assessment, and the binaries best suited to the environment will be installed. This solution will most likely result in a decrease in the number of downloads, due to the increased installer size.
  2. Normal installer, normal install: The installer will carry only the GCD/HCF code (thus keeping it small enough for most people to download) but will perform a system capability assessment during installation. If it determines that more optimized binaries can run on the system, it will either quietly pull those binaries from the Web or ask the user for consent before pulling the more optimized files.
  3. Separate installer, normal install: Since only power users are expected to care about optimized binaries, Mozilla should make official but optimized versions of Firefox available on Mozilla.com, albeit hidden from general users (so they don't accidentally download an installer their system might not support). The installer, even for an official but optimized version, will perform a system capability assessment to ensure that installation is being conducted on a supported system. This approach has the disadvantage that it serves only power users; it fails to bring the benefits of an optimized Firefox to the masses.
  4. Regular installer, deferred optimization: Install the regular Firefox; on first launch, it will itself perform a system capability assessment and ask your permission to optimize itself by downloading and installing the versions of its files best suited to the current system. Alternatively, it will quietly download and install these files, and the effects will become visible after a restart.
  5. Thin installer, normal install: This is the approach I favor the most. A tiny installer that doesn't carry any binaries will perform a system capability assessment and submit the results to Mozilla.com, which will supply the installer with the files best suited to the system (a rough sketch of such a capability check appears after this list). This type of installer - Chrome also uses a thin installer, although I'm unaware whether it pulls system-specific binaries - appears to be suitable for both amateur and power users.
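As a rough illustration of what the "system capability assessment" in these options could look like, here's a sketch that inspects the CPU's feature flags (via /proc/cpuinfo on Linux) and picks a build variant accordingly. The flag names (sse, sse2) are real x86 feature flags, but the variant names and the /proc-based approach are just one possible implementation; a Windows installer would rely on something like the IsProcessorFeaturePresent API instead.

```python
def cpu_flags(path="/proc/cpuinfo"):
    """Read the CPU feature flags on Linux; returns an empty set elsewhere."""
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()

def choose_build(flags):
    """Pick the most optimized build variant the CPU can run.
    Variant names are hypothetical; the feature flags are real."""
    if "sse2" in flags:
        return "firefox-sse2"      # can use SSE2 instructions throughout
    if "sse" in flags:
        return "firefox-sse"
    return "firefox-generic"       # the GCD/HCF build that runs anywhere

if __name__ == "__main__":
    flags = cpu_flags()
    print("Installer would fetch:", choose_build(flags))
```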
Related content: Swiftweasel; Swiftfox