Thursday 10 July 2008

Crowd Sourcing the Semantic Web

I've always been interested in the potential power of the semantic web, but until now there hasn't appeared to be any killer application. Sure, there's semantic information out there, but most of it is what I would consider highbrow: databases of scientific information converted to semantic web form.

Yet what I always thought was the most obvious use of the semantic web technology would be to build a repository of human knowledge.

OK, first you think of Wikipedia. Then, if you're savvy, you think of all the efforts out there to turn things like Wikipedia into semantic information.

What I am thinking of is something much more. I am imagining a website, or even a series of websites, where people could input semantic information. I guess it would be a semantic version of Wikipedia, though I can't think of a cool name.

People would go to this website with the intent of creating semantic knowledge. For instance, you could go there and input the information that a table is flat and has legs, and that an average table has four legs. The idea is not that I now become the authoritative source on tables, but that I have just input a tiny piece of information.
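To make that concrete, here is a minimal sketch of what one person's input about tables might look like if stored as subject/predicate/object statements, the basic shape of semantic web data. The `Assertion` class, the contributor name and the predicate names (`hasShape`, `hasPart`, `typicalLegCount`) are all made up for illustration and don't come from any real ontology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    """One tiny piece of information entered by one contributor (hypothetical model)."""
    contributor: str
    subject: str
    predicate: str
    obj: str

    def triple(self):
        # The statement itself, without who said it.
        return (self.subject, self.predicate, self.obj)


# What a single visitor might enter about tables (illustrative values only).
submissions = [
    Assertion("alice", "table", "hasShape", "flat"),
    Assertion("alice", "table", "hasPart", "leg"),
    Assertion("alice", "table", "typicalLegCount", "4"),
]

for assertion in submissions:
    print(assertion.triple())
```

Keeping the contributor alongside each statement matters here, because the point is that nobody's entry is authoritative by itself.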

From there we apply the crowd sourcing principle and probability networks to weigh each piece of information that the crowd has entered. If enough people input similar information about a table, that information is treated as a fact. I believe this is the closest approximation to how we as humans learn.

The biggest failing of Wikipedia is that even though I can input fantastic information, the next person can override it. I would prefer that the next person input their information under the same heading and that the system work out what most people believe, based on the most popular information.

There are reasoning systems out there that can work on information like this. This would allow anyone to input any information (like Wikipedia), but rather than anyone being able to overwrite any information, you would have to fight against the crowd. So if 50,000 people put in information about a table being flat with legs and 6 people said that tables were round and squishy, a reasoning system would still come to the right conclusion.
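A rough sketch of the 50,000 versus 6 case follows. It uses a plain share-of-votes count as a stand-in for a real probabilistic reasoner, which is my simplification and not how an actual reasoning system works. The function names `weigh` and `consensus` and the `texture` predicate are invented for the example.

```python
from collections import Counter, defaultdict


def weigh(assertions):
    """Return {(subject, predicate): {value: share of votes}}.

    Simple relative frequency; a stand-in for a probability network.
    """
    counts = defaultdict(Counter)
    for subject, predicate, value in assertions:
        counts[(subject, predicate)][value] += 1

    weights = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        weights[key] = {value: n / total for value, n in counter.items()}
    return weights


def consensus(weights, key):
    """Pick the value with the largest share of votes for one heading."""
    values = weights[key]
    return max(values, key=values.get)


# 50,000 people say tables are flat, 6 say they are squishy.
crowd = [("table", "texture", "flat")] * 50_000
crowd += [("table", "texture", "squishy")] * 6

weights = weigh(crowd)
best = consensus(weights, ("table", "texture"))
print(best, weights[("table", "texture")])
```

Nothing is overwritten: the six dissenting entries stay in the store with a tiny weight, and they would only win if the crowd moved with them.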

So that's my great idea. Perhaps it's already being done, but I don't know of it. If we were to stay standards compliant, we would also need some way of crowd sourcing the ontology that goes with everything. I think there is potential here.

Then people could point their probability reasoners at this store of knowledge and could actually begin to understand human knowledge.

Google, are you listening?