What is Semantic Information Retrieval?

19. August 2010 um 00:45 Keine Kommentare

The most fun part of my dissertation is when I can procastinate dig deeply to the foundation of computer and information science. Lately I tried to find out when the terms „file“ and the „directory“ were coined in its current sense. The first commercial disk drive was the IBM 350, introduced in 1956. It had the size of a wardrobe, stored 4.4 megabytes 6-bit-characters and could be leased for 3,200$/month. Instances of it were also called „files“. But user files first appeared in the early 1960s with the Compatible Time-Sharing System (CTSS), the earliest ancestor of Unix. You should watch this great video from 1964 in which Robert Fano talks about making computers accessible to people. A wonderful demonstration of one of the very first command lines of a multi-user system! The explicit aims and concepts of computer systems are very similar to today. The more I read about history of computing, the more it seems to be that all important concepts were developed in the 1960s and 1970s. The rest is just reinventing and application on a broader scale.

Robert Fano was director of project MAC, a laboratory that brought together pioneers in operating systems, artificial intelligence, and other areas of the emerging discipline computer science. I browsed the historical publications of the laboratory at MIT where you can find a report of CTSS. Also published at MAC in 1964, I stumbled upon Bertram Raphael’s PhD thesis. It is titled SIR: A COMPUTER PROGRAM FOR SEMANTIC INFORMATION RETRIEVAL and its abstracts sounds like todays Semantic Web propaganda:

This system demonstrates what can reasonably be called an ability to „understand“ semantic information. SIR’s semantic and deductive ability is based on the construction of an internal model, which uses word associations and property lists, for the relational information normally conveyed in conversational statements. […] The system has some capacity to recognize exceptions to general rules, resolve certain semantic ambiguities, and modify its model structure in order to save computer memory space.

The SIR expert system even seems to go beyong current RDF techniques in supporting exceptions. By the way Bertram Raphael was at MAC at the same time as Joseph Weizenbaum. Weizenbaum fooled expectations in articial intelligence with his program ELIZA that he created between 1964 and 1966. He later became an important critic of artificial intelligence and the application of computer technology in general. By the way we need more like him instead of well-meaning, megalomaniac technology evangelists. See the documentary Rebel at work about Weizenbaum or even better the promising film Plug & Pray!

So what is Semantic Information Retrieval? In short: bullshit. The term is also used independently for search indices on graph structured data (2009), digital libraries (1998) and more. But why bothering with words, meaning, and history if computers will surely „understand“ soon?

No Comments yet »

RSS feed for comments on this post. TrackBack URI

Sorry, the comment form is closed at this time.