Second, the amount of data collected is so big that data mining it is close to useless. What counts as ordinary? Given personal differences, it is quite probable that almost every individual has at least one trait that falls outside the ordinary. Those out-of-the-ordinary things are mostly harmless (fetishes, intelligence, number of social connections), but telling the harmless ones from the dangerous ones is hard. And even "dangerous" terms are not a very good signal. If I talk about terrorism, attacks, Molotov cocktails, bioagents, blah, blah, blah, that data is probably a false positive: I could be doing it just by talking about things in a forum (or because I hold an interest in applied math and infectious disease). And such cases aren't few. There are probably far more false positives than true ones, making the program nearly useless. Either you are very specific to avoid these false positives (meaning that even a rudimentary code could get past it), or you are so lenient that you get so many cases that human analysts, who have better discrimination skills, probably aren't up to the task of looking at all the files. Internet traffic and similar things are also incredibly hard to classify correctly with a program simple enough to go through all the data they collect: the filter is either easy to fool (a few proxies, open hotspots, public places, etc.) or so wide that it yields more useless information than anything else.
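To put rough numbers on the false-positive point, here is a minimal back-of-the-envelope sketch. Every figure in it is made up for illustration (the population size, the number of real targets, and both error rates are assumptions, not data about any actual program), and `flag_precision` is just a hypothetical helper name.

```python
def flag_precision(population, true_targets, hit_rate, false_alarm_rate):
    """Return (true_hits, false_alarms, precision) for a screening filter.

    hit_rate: fraction of real targets the filter flags.
    false_alarm_rate: fraction of harmless people the filter flags anyway.
    """
    harmless = population - true_targets
    true_hits = true_targets * hit_rate
    false_alarms = harmless * false_alarm_rate
    # Precision: of everyone flagged, what share is a real target?
    precision = true_hits / (true_hits + false_alarms)
    return true_hits, false_alarms, precision


# Hypothetical scenario: 100 million people monitored, 1,000 real targets,
# and a filter that is right 99% of the time in both directions.
true_hits, false_alarms, precision = flag_precision(
    population=100_000_000,
    true_targets=1_000,
    hit_rate=0.99,
    false_alarm_rate=0.01,
)

print(f"real targets flagged:    {true_hits:,.0f}")
print(f"harmless people flagged: {false_alarms:,.0f}")
print(f"share of flags that are real: {precision:.4%}")
```

Even with a filter that generous, these assumed numbers give roughly a thousand harmless people flagged for every real target, which is the pile of files the human analysts would then have to go through. Tightening the filter shrinks that pile, but it also makes the filter easier to slip past.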