<body>

Tuesday, March 16, 2004

Another fine Tuesday morning!

Up and at 'em at 4:30. Actually looking forward to work this week as it looks like I will have some work to do - as opposed to the usual "consulting". Ick.

This weekend I was thinking about data on the internet. The biggest problem with the internet is the amount of data. Strange, because this is also the (only?) cool thing about the internet. There are major problems in the area of "search".

...and when I say "search" I am not talking about Google. Honestly, I think the style of searching provided by Google is...less than ideal. I don't think you can brute force the internet - even if you do come up with a few sneaky hacks. I think the key to Google's success has been in trying to relate to the "human" side of the internet. The algorithm that powers Google is not simply matching words you type with words on a web page. It also goes a step further and tries to determine the relevance of a web page by using various factors - including the number of times the site is referenced from other sites.
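
To make that idea concrete, here is a rough sketch in Python of ranking pages by word matches and then boosting them by how many other sites link to them. This is only a toy under my own assumptions, not Google's actual algorithm: the function name `rank_pages`, the scoring formula, and the example URLs are all made up for illustration.

```python
from collections import Counter


def rank_pages(query, pages, links):
    """Rank pages for a query (hypothetical toy scorer).

    pages: dict mapping url -> page text
    links: iterable of (source_url, target_url) pairs
    """
    # Count inbound links from *other* sites; drop duplicates and self-links.
    inbound = Counter(
        target for source, target in set(links) if source != target
    )
    terms = query.lower().split()
    results = []
    for url, text in pages.items():
        words = text.lower().split()
        # Plain word matching: how often the query terms appear on the page.
        matches = sum(words.count(term) for term in terms)
        if matches == 0:
            continue  # the page never mentions the query at all
        # Assumed formula: word matches boosted by the inbound link count.
        score = matches * (1 + inbound[url])
        results.append((score, url))
    # Highest score first; break ties alphabetically by URL.
    results.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for _, url in results]
```

With this scoring, a page that mentions a word once but is linked from three other sites can outrank a page that just repeats the word three times, which is roughly the "human side" effect described above.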

There are some obvious flaws with this approach:
- first, people don't necessarily know what they are searching for or how to describe it. For example, if I am looking for "a Mother's Day present", how do I begin? ...I start off a bit in the dark. ...and this is why I say a Google search is fundamentally flawed. Twenty years from now, we should be able to search for "a Mother's Day present" and the search should execute based on what it knows about our mother, what it knows about "similar mothers", etc.

- second, I don't see how Google can determine a site's relevance accurately based purely on HTML data. Why do links from other websites correspond to relevance? Clearly, this is something that can be "faked". I am sure Google tracks its users as well to determine which links they are accessing - if the second link is clicked more than the first, then maybe it should be promoted? But again, this doesn't really correspond with relevance because God only knows why someone clicked the second link. Additionally, the link was clicked, in part, because it was provided. This creates a bit of a catch-22.
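
The catch-22 in the second point can be shown with a tiny simulation. Assume (and this is purely my assumption, with invented numbers) that clicks depend only on where a link sits in the results, and that the engine re-sorts results by total clicks. The names `simulate_click_feedback` and `position_share` are hypothetical.

```python
def simulate_click_feedback(initial_order, rounds=10,
                            position_share=(0.6, 0.3, 0.1)):
    """Re-rank results by accumulated clicks (hypothetical toy model).

    position_share holds the assumed fraction of clicks each result
    position receives, no matter which page is shown there.
    """
    clicks = {url: 0.0 for url in initial_order}
    order = list(initial_order)
    for _ in range(rounds):
        for position, url in enumerate(order):
            # Clicks are driven by position only, not by relevance.
            if position < len(position_share):
                clicks[url] += position_share[position]
        # Promote the most-clicked pages; sorted() is stable, so ties
        # keep their current order.
        order = sorted(order, key=lambda u: -clicks[u])
    return order, clicks
```

Under these assumptions, whatever page starts on top collects the most clicks and so stays on top. The click data ends up confirming the ranking that produced it, rather than measuring relevance.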

I think the search engine of the future should be browser based. ...more of a P2P thing, maybe...

...more later.
