+1 (410) 742-9088 david@highcontext.com

High Context Consulting, LLC

Archive for the 'Search' Category

January 11, 2006

Nielsen on the importance of converting search engine ad traffic

Jakob Nielsen has posted a short article you should read about the importance of converting search engine advertising traffic: Search Engines as Leeches on the Web.

Search engines extract too much of the Web’s value, leaving too little for the websites that actually create the content. Liberation from search dependency is a strategic imperative for both websites and software vendors.

Worth a read if you are using advertising on search engines to drive traffic to your web site.


November 10, 2005

Boxwood Technology Adds RSS to Job Board Service

Boxwood Technology is pretty much on top of the heap for hosted job board services for associations. (Disclaimer: I was a client of theirs when I worked at ASHA and I serve with Boxwood Chairman John Bell on the ASAE Tech Council.) They have just added RSS feeds to their service, which is a fantastic extension. Now job seekers can subscribe to all new jobs or to the results of a specific search. After they subscribe, any newly posted jobs will appear in their newsreader of choice. Nice! They should mention this service on their web site.

For an example, see ASAE’s job center. There is an orange RSS button at the bottom of the screen.

A couple of improvements I think they could make include:


September 12, 2005

Verity Ultraseek: Free Download

Verity’s Ultraseek, a search engine, is now available for download and free trial. This is the tool we used at ASHA when I was working there. It offers an excellent ability to tune results, and the interface can be customized relatively easily using Python and HTML (although the templates were rather incomprehensible spaghetti code, which is normally hard to do with Python). Hopefully the spaghetti issue was improved in the latest release.

Spotted via SearchTools.


September 2, 2005

Paging Robert Scoble: Tell msnbot to Calm Down

I’m posting this note with Robert Scoble’s name in it in order to get some attention from Microsoft about the behavior of their RSS bot, msnbot.

Over the past week, the bot has hit my site more than 27,000 times, using about 38 MB of bandwidth. The bot is almost exclusively hitting RSS feeds. However, most of the feeds it requests on my site are for individual entries, which exist so people can track comments. Each feed is getting hit about 100 times a week, which is a big waste of effort for older entries that get few comments. Once a day should be plenty.

So, Robert, when you see this in one of your ego notifications, please pass the word to whoever manages msnbot to chill out a bit on the hits. I love to be indexed, but not at a load this heavy, which wastes my bandwidth and Microsoft’s. If the load goes much higher, I might ban the bot for poor manners.
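For anyone who wants to check their own logs for this kind of behavior, here is a minimal sketch of how the per-feed hit counts could be tallied. It assumes the server writes the common Apache "combined" log format; the sample lines, paths, and IP addresses are made up for illustration and are not from my actual logs.

```python
import re
from collections import Counter

# Assumed format: Apache "combined" access log. Adjust the pattern if your
# server logs differently.
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "(?:GET|HEAD) (?P<path>\S+) [^"]*" '
    r'\d{3} (?P<bytes>\d+|-) "[^"]*" "(?P<agent>[^"]*)"'
)


def bot_hits(lines, bot_name="msnbot"):
    """Count hits per path and total bytes served to one crawler."""
    hits = Counter()
    total_bytes = 0
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or bot_name not in match.group("agent").lower():
            continue  # unparseable line, or a different user agent
        hits[match.group("path")] += 1
        if match.group("bytes") != "-":
            total_bytes += int(match.group("bytes"))
    return hits, total_bytes


# Hypothetical sample data (documentation-range IP addresses).
MSNBOT = "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"
SAMPLE_LOG = [
    f'192.0.2.1 - - [01/Sep/2005:10:00:00 -0400] "GET /archives/12/feed/ HTTP/1.0" 200 1024 "-" "{MSNBOT}"',
    f'192.0.2.1 - - [01/Sep/2005:11:00:00 -0400] "GET /archives/12/feed/ HTTP/1.0" 200 1024 "-" "{MSNBOT}"',
    f'192.0.2.1 - - [01/Sep/2005:11:05:00 -0400] "GET /feed/ HTTP/1.0" 200 2048 "-" "{MSNBOT}"',
    '192.0.2.7 - - [01/Sep/2005:11:06:00 -0400] "GET /archives/12/ HTTP/1.1" 200 4096 "-" "Mozilla/5.0"',
]

hits, total_bytes = bot_hits(SAMPLE_LOG)
for path, count in hits.most_common():
    print(count, path)
print("bytes served to bot:", total_bytes)
```

Sorting by hit count makes it easy to spot feeds for old entries that are being polled far more often than they change.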


August 31, 2005

Search Log Analysis Book

Lou Rosenfeld is coauthoring a book on search log analysis. Excellent!

Based on my recent posting, it might not come as a huge surprise that I’m co-authoring (with Rich Wiggins) a new book on search log analysis (SLA). I’m happy to report that we’re already a couple chapters deep and I’m actually enjoying the process of writing, which usually requires a lot more self-discipline than my genetic programming supports.

I’m gung-ho on SLA because it seems so obvious, and yet it’s still uncommon in the worlds of UCD and, more broadly, web design. Rich and I hope our book helps clear away many barriers to SLA–practical, technical, and political–by collecting both how-to info and justification in a single, short book.

When I was at ASHA, we found that reviewing our search logs on a regular basis taught us a lot about what people were looking for and where we needed to pay some attention. We would identify searches that didn’t return the appropriate content, then either tweak the content so it floated higher in the results or promote it manually through best bet links. We would also identify searches for which we had no content, which gave us great ideas about what to add to the site.

Can’t wait to see what Lou and Rich come up with in the book.
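As a rough illustration of the kind of review I described above, here is a minimal sketch that tallies the most common queries and the queries that returned nothing. It assumes the search engine can export each search as a (query, result count) pair; that export format and the sample rows are hypothetical, not what Ultraseek or any other product actually produces.

```python
from collections import Counter


def summarize_search_log(rows):
    """Tally how often each query was run and how often it found nothing.

    rows: iterable of (query, result_count) pairs (assumed export format).
    """
    query_counts = Counter()
    zero_result = Counter()
    for query, result_count in rows:
        # Fold case and collapse whitespace so trivial variants count together.
        normalized = " ".join(query.lower().split())
        if not normalized:
            continue
        query_counts[normalized] += 1
        if result_count == 0:
            zero_result[normalized] += 1
    return query_counts, zero_result


# Hypothetical sample data.
SAMPLE_ROWS = [
    ("Membership Dues", 14),
    ("membership dues", 14),
    ("job board", 3),
    ("parking", 0),
    ("Parking ", 0),
    ("membership  dues", 14),
]

query_counts, zero_result = summarize_search_log(SAMPLE_ROWS)
print("Top queries:", query_counts.most_common(3))
print("No results:", zero_result.most_common())
```

The top queries are candidates for best bet links, and the zero-result list is a ready-made list of content gaps.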


July 20, 2005

Crawling Robots!

Search Engine World recently crawled 75,000 robots.txt files. (A robots.txt file contains instructions for the search engines that index your site. You can use it to keep search engines out of certain directories, block specific search engines, etc.) They report on the common errors they found in the files.

The worst robots.txt error I ever saw was for a site whose owners complained that they never showed up in Google search results. I took a peek at their robots.txt file and, sure enough, someone had set it to disallow all search engines. Oops! This was probably a leftover from when the site was in development. Have you checked your robots.txt file recently?
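If you want to automate that check, here is a minimal sketch using the robots.txt parser in Python's standard library. The two rule sets, the example.com URL, and the "ExampleBot" user agent are illustrative assumptions; the first rule set is the kind of leftover development file described above.

```python
from urllib.robotparser import RobotFileParser


def blocks_all_crawlers(robots_txt, test_url="http://example.com/",
                        user_agent="ExampleBot"):
    """Return True if these rules stop a generic crawler at the home page."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, test_url)


# What a leftover development robots.txt typically looks like.
LEFTOVER_DEV_RULES = """\
User-agent: *
Disallow: /
"""

# A more typical production file that hides only one directory.
SELECTIVE_RULES = """\
User-agent: *
Disallow: /private/
"""

print(blocks_all_crawlers(LEFTOVER_DEV_RULES))
print(blocks_all_crawlers(SELECTIVE_RULES))
```

To test a live site instead of a string, the same parser can load the file with `set_url()` followed by `read()`.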


June 7, 2005

Google Sitemap Protocol

Google has announced a new protocol that they are now using to better index web sites: the Sitemap Protocol.

“The Sitemap Protocol allows you to inform search engine crawlers about URLs on your Web sites that are available for crawling. A Sitemap consists of a list of URLs and may also contain additional information about those URLs, such as when they were last modified, how frequently they change, etc.”

I imagine it won’t be long before most of the major CMSs out there have the ability to create one of these sitemap files. The primary benefit is to reveal pages to search engine crawlers that they would not find via their normal crawling of your site.
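To give a sense of what a CMS would have to produce, here is a minimal sketch that builds a sitemap file with Python's standard library. The element names (urlset, url, loc, lastmod, changefreq) follow the protocol as quoted above; the namespace used here is the one from the later sitemaps.org version of the protocol, which is an assumption on my part and not necessarily what Google's original announcement specified. The page list is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace from the later sitemaps.org version of the protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from dicts with 'loc' and optional extra fields."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        # Optional hints: last modification date and expected change rate.
        for field in ("lastmod", "changefreq"):
            if field in page:
                ET.SubElement(url, field).text = page[field]
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages, including one that normal crawling might never reach.
PAGES = [
    {"loc": "http://example.com/", "lastmod": "2005-06-07", "changefreq": "daily"},
    {"loc": "http://example.com/archive/report.html"},
]

sitemap_xml = build_sitemap(PAGES)
print(sitemap_xml)
```

A CMS already knows every page it publishes and when each one changed, which is why generating this file should be an easy feature for vendors to add.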


March 20, 2005

Google Code

Google has released a number of open source code projects developed by their staff: Google Code. A lot of it is pretty esoteric. One that caught my eye was PyGoogle, a Python module that can be used to call the Google search API. We use a search engine at work that uses Python, so in theory we could use the PyGoogle library to combine Google search results with our own. Nifty.


February 2, 2005

Thunderstone: Search Appliance SBE

Via CMS Watch: Thunderstone Search Appliance SBE. This is Thunderstone’s answer to the Google Mini. I’ve always liked Thunderstone’s search engine and agree with CMS Watch that it is a viable alternative to the Google boxes.


January 13, 2005

Google Goes Mini

Google just launched another search appliance: the Google Mini. It costs $5,000, which seems like a good price point for a lot of non-profit organizations looking to invest in search for their site or intranet. It will index up to 50,000 documents, which should be plenty for most needs in this market as well.


Copyright © 2008 High Context Consulting
