Thursday, December 01, 2005

Information Management

There is a lot of information flying around these days. Far too much to do more than a quick filter as it flies through, from one ear to the other. I decided to have a look at the Information Management Show at Olympia to see what new techniques are around. There were a few CMS stands too, so I could look at some ideas in that area as well.

You'll find the exhibitor list at http://www.online-information.co.uk/cgi-events/exhibitors.pl?exhibition_id=79&search_trail_7=618&search=SEARCH

There are some big companies getting involved in putting information online. Libraries, universities, large company research teams: the list is endless. I know that people like Google, Microsoft and Amazon are involved in trying to scan all the available books in the world. At the show, there were numerous companies trying to sell their database and indexing services to let us access the precise piece of data that we need from this wealth of information. It was interesting to see the number of strategies being employed but, surprisingly, not many had much more than a passing interest in images, and I'm not sure that I spoke to anyone involved in the indexing of video or film clips. The Israeli image processing software that recognises faces and scenes, which I saw at NAB years ago, was nowhere to be seen. Even Canon had no ideas on indexing. They were more interested in the scan and print functions.

The main push was to make the step past keyword search, with synonym and homonym databases, sometimes sensitive to categories. Searching for relevance in the text is becoming quite a science. I haven't looked at it for a while, and there is now some heavyweight (and usually expensive) software around that can help. People like MondoSoft are offering very comprehensive indexing strategies, with behavioural tracking applied to their indexes as well. The software indexes all the pages, but also sits on the servers watching where users go. Words that are misspelt can be picked up and added to the thesaurus to build more power into the indexes over a period of a few months. I suppose the software is not that expensive if you have a large site, but it is a little over the top in our terms. Simpler engines like Filehopper might give a simpler way of doing things. They have a J2EE engine that sits in the back end. There are quite a few Java engines around, actually. Java seems a popular choice for information databases.
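
To get my own head around the thesaurus idea, here is a minimal sketch of how a search might go past plain keywords. It is only my guess at the general technique, not how MondoSoft or anyone else at the show actually does it. The class name TinyIndex and its methods are made up for illustration, and Python's difflib stands in for whatever matching the commercial engines use to spot misspelt words.

```python
from collections import defaultdict
import difflib


class TinyIndex:
    """Hypothetical toy index: keyword postings plus a thesaurus."""

    def __init__(self):
        self.postings = defaultdict(set)   # word -> ids of pages containing it
        self.thesaurus = defaultdict(set)  # word -> words treated as equivalent

    def add_page(self, page_id, text):
        # Index every whitespace-separated word of the page, lower-cased.
        for word in text.lower().split():
            self.postings[word].add(page_id)

    def add_synonym(self, a, b):
        # Synonyms work in both directions.
        self.thesaurus[a].add(b)
        self.thesaurus[b].add(a)

    def learn_misspelling(self, typed):
        # Assumption: a query word that is not in the index but is very
        # close to one that is gets treated as a misspelling of it.
        typed = typed.lower()
        if typed in self.postings:
            return None
        close = difflib.get_close_matches(
            typed, list(self.postings), n=1, cutoff=0.8
        )
        if not close:
            return None
        self.thesaurus[typed].add(close[0])
        return close[0]

    def search(self, query):
        # A page matches if it contains any query word or any thesaurus
        # entry for that word.
        results = set()
        for word in query.lower().split():
            for term in {word} | self.thesaurus.get(word, set()):
                results |= self.postings.get(term, set())
        return results


index = TinyIndex()
index.add_page("p1", "digital camera reviews")
index.add_page("p2", "photo printers")
index.add_synonym("photo", "picture")

before = index.search("camra")            # nothing known about "camra" yet
learned = index.learn_misspelling("camra")
after = index.search("camra")             # now found via the thesaurus
```

A real engine would presumably learn the misspellings from server logs over months, rather than one word at a time as here, but the principle of feeding them back into the thesaurus is the same.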

I spent a while trying to grasp all the concepts of Topic Maps from the inventors of the techniques, Ontopia. It is a way to describe a series of links between items, with more of a relational database feel about it. Items are classified according to their type, and placed in one or more context layers. By adjusting the importance values given to the contexts and item types, we can get a feel for the relative importance of the links. There are some interesting aspects to it, so I should download the white paper to read when I have a little more time.
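
As a note to self, this is roughly how I picture the weighting working. It is a sketch of my understanding from the stand, not the actual Topic Maps data model or anything from Ontopia's software. The Item class, the link_score function, the sample items and all the weights are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    name: str
    kind: str             # the item's type
    contexts: frozenset   # the context layers the item is placed in


def link_score(a, b, type_weight, context_weight):
    """Hypothetical importance of a link between two items.

    Assumption: a link only counts if the items share a context layer.
    The score is the weight of the most important shared context,
    scaled by the weights of both item types.
    """
    shared = a.contexts & b.contexts
    if not shared:
        return 0.0
    best_context = max(context_weight.get(c, 0.0) for c in shared)
    return (
        best_context
        * type_weight.get(a.kind, 1.0)
        * type_weight.get(b.kind, 1.0)
    )


puccini = Item("Puccini", "composer", frozenset({"music", "biography"}))
tosca = Item("Tosca", "work", frozenset({"music"}))
rome = Item("Rome", "city", frozenset({"geography"}))

type_weight = {"composer": 1.0, "work": 0.5, "city": 0.5}
context_weight = {"music": 2.0, "biography": 1.0, "geography": 1.0}

strong = link_score(puccini, tosca, type_weight, context_weight)
none = link_score(puccini, rome, type_weight, context_weight)
```

Turning the "music" weight up or down changes how strongly the Puccini to Tosca link shows, which is the sort of adjustment I think they were describing.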

As for CMS companies, EasySite was there with a reasonably priced offering. It still has some good facilities at the lower end. Easy to use and quick for a budget package. Not many of the open source companies were there. Squiz.net was the only one I noticed, but they still just use Postgres, so I'll wait till they have their MySQL interface before I look at them again. Microsoft were pushing the SharePoint packages. Probably makes sense for the larger user, but it seems over-engineered for the type of businesses that I have to deal with. I'm going to make some time to use the latest ASP.NET 2 development software over the next few weeks, so I don't think I'll get involved with the SharePoint distraction. There are far more interesting things to play with.

One interesting CMS from Denmark was Sitepoint. They are a tad more expensive than I'd like, in the versions that had fuller facilities, but they had all the right buzzwords on their stand. Longhorn was one of them, but I don't think we have to wait for that to have a play with the CMS. The interesting aspect for me was that it was very much a Microsoft-based package using all the latest XAML, ASP.NET and C# technologies, but they also advertise running it on Linux by using the Mono platform. Something to look at another day. I might have a look at the CMS though, if I take some time to load up the WPF layers under XP. I'd like to look at Sparkle and Cider anyway, so I should have a more in-depth look at XAML. I wonder how it matches up to the 2005 versions.

Another CMS that deserves more time was Digimaker, this time from Norway. Must be those long Arctic winters giving the Scandinavians time to write software. It's .NET based and they talked well about a number of areas.
