On a number of fronts recently I've been thinking a bunch about RDF, the DCMI Abstract Model, and the Semantic Web, all with an eye towards understanding these things more than I have in the past. I think I've made some progress, although I can't claim to fully grok any of these yet. One thing does occur to me, although it's probably a gross oversimplification. The difference in the Semantic Web/RDF approach from the, say, Google approach is this: is the robustness in the data or is it in the system?
The Semantic Web (et al) would like the data to be self-explanatory, to say itself explicitly what it is it is describing and with explicit reference to all the properties used in the description. The opposite end of the spectrum is systems like Google which assume some kind of intelligence went into the creation of the data but doesn't expect the data itself to explicitly manifest it. The approach of these systems is to reverse engineer that data, getting at the human intelligence that created it in the first place.
The difference is one of who is expected to to the work - the sytem encoding the data in the first place (Semantic Web approach) or the system decoding the data for use in a specific application. Both obviously present challenges, and it's not clear to me at this point which will "win." Maybe the "good enough and a person can go the last bit" approach really is appropriate - no system can be perfect! Or maybe as information systems evolve our standards for the performance of these systems will be raised to a degree where self-describing data is demanded. As a moderate, I guess I think both will probably be necessary for different uses. But which way will the library community go? Can we afford to have feet in both camps into the future?