Topic on a current event
Beyond IBM's “Concept Search” Concept

Theory on Molecular Formulation of Complete Human Knowledge



Hua Fang

(August 2005)


Before opening the Codon bag


Most recently, the technology giant IBM publicly released its concept-search technology, UIMA (Unstructured Information Management Architecture). Having reviewed its contents, I find that the system lacks a solid philosophical foundation, although its software engineering design may be technically sound. In this article, using UIMA as a point of comparison, I would like to set out my view of the concept and its structural role in human reasoning, learning, exchange of understanding, and knowledge.


My whole argument rests on a simple formulation of the complete conceptual structure of human knowledge, which I call the “Codon” by analogy with the structure of DNA. For that reason the theory deserves the adjective “Molecular”.



What is a concept?


Throughout their conceptual overview of UIMA, the IBM researchers never give a clear definition of a concept. The best answer implied by their manual on this most critical point is that concepts are not keywords; they are entities with deeper meanings shared by different keywords or texts. My intuition and logic tell me that they are still talking about keywords in disguise, which leaves the question unanswered: what is a concept? We are therefore forced to make a philosophical leap, one that will lead us out of the darkness of this conceptual cave.


Opening the Codon bag

Now, let us put the “concept” question aside and ask a more fundamental one, in the hope of reaching broader answers or more general conclusions: is there any basic structural element present in all human reasoning, learning, exchange of understanding, and knowledge? A question on this scale forces every truth seeker to look into the history of every branch of philosophy. Unfortunately, after doing my homework, I found, as you might guess, no direct answer. Nonetheless, there are hints everywhere, from ancient to modern and from Western to non-Western thought. First, all human reasoning, scientific or non-scientific, religious or non-religious, is based on rules. In science they are called “Law”, “Rule”, “Principle”, “Proposition”, or “Theorem”; outside science they are called “Law”, “Rule”, “Doctrine”, or “Dogma”. What they share is that they are believed to have governing power over Phenomenon, in religion and outside it alike. (For the time being, let us set aside the issue of Truth, a non-structural concept, although it is one of the most philosophically critical ones.) As a scientist, my favorites on the list of Laws are Newton’s three laws of motion, quantum theory, the theory of DNA structure, and so on (I have to keep it short). However, doing science does not make me deny the fact that the Bible contains some great Laws that have anchored the faith of its believers for thousands of years.


It is time to draw a general structure. We now know that it must contain a Law or Law-like structure, “L” for short, and a Phenomenon, the object before any meaningful explanation has been given, “P” for short. Do we then have the solution, a series of LPs? Not yet. Any Law deserves a simple term to express its totality, so “L” does not carry just a single point of meaning; and a Phenomenon, whether words, text, image, sound, or anything else, has no meaning by itself. Again, “P” is the object to be explained. We therefore need something that is not a totality to carry a single meaning of any object. Intuitively and logically, it cannot be anything but the Concept, “C” for short. We now have the most basic structural unit of human thinking. It seems that it could be written in any of the following orders: LCP, LPC, CPL, CLP, PCL, and PLC. However, logic and common sense allow only one order: LCP. Let us set it as a rule.
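The six candidate orders listed above are simply the 3! = 6 permutations of the three symbols. As a minimal sketch (the names `SYMBOLS`, `CANONICAL`, and `all_orderings` are mine, chosen only for illustration), they can be enumerated like this:

```python
from itertools import permutations

# The three symbols from the text: Law, Concept, Phenomenon.
SYMBOLS = ("L", "C", "P")

# The single ordering the text adopts as a rule.
CANONICAL = "LCP"


def all_orderings(symbols=SYMBOLS):
    """Return every ordering of the symbols as strings, e.g. 'LCP'."""
    return ["".join(p) for p in permutations(symbols)]


orderings = all_orderings()
print(len(orderings), orderings)
print(CANONICAL in orderings)
```

The enumeration only shows that the candidate list is complete; the choice of LCP among the six remains the rule set by the argument, not something the code derives.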


Before the age of bioinformatics and modern information technology, such a discovery of a “human thinking unit” or “reasoning element”, even if it had been made, would have made little sense to the public and would have been useless anyway. For such thinking about human thinking to become truly meaningful, it had to wait for its time, by God or by chance, like the evolution of the human species itself. First, the discovery of the structure of DNA, the double helix, by Watson and Crick gave us insight into life of every kind. Its biological and technological significance has been proven over and over again, at an exponentially growing pace. Ironically, its philosophical impact on the way humans think has not been explored. For example, in his book “Quantum Philosophy: Understanding and Interpreting Contemporary Science”, Professor Roland Omnes describes on page 63 his dream of finding a universal mold for the philosophy of science, yet on page 253, the only such mention in the book, he refers to the revolutionizing role of DNA in a purely biological sense.


Now we have DNA, a structure well understood at the molecular level, and the basic unit of human thinking, a codon-like structure, both waiting to be used and assembled to their full sense. The only missing piece is the bond, written with the symbol “-”. With it we can form an activated unit, “-LCP-”. With basic counting and some elementary set theory, we can do this: one piece of knowledge is written (-LCP-)_1; adding up to a quantity “n” (a natural number; a googol, 10^100, is just one of them), numerous pieces of knowledge are written (-LCP-)_n. Note that “n” need not apply in equal proportion to each component of an LCP. For instance, a Codon can be L_1 C_4 P_n, meaning one law with four concepts and a great many phenomena to be explained. Newton's second law of motion is exactly such a case. F = ma involves four concepts: force “F”, mass “m”, acceleration “a”, and the mathematical relationship between them, namely that mass times acceleration equals force. Examples of Phenomenon are “Galileo drops balls” or “NASA launches a rocket”, and so on. If we keep this “Codon chain” (-L_Newton2ndLaw C_4 P_n-) growing, the Codons bound on its left side should be those of quantum theory; those on the right should come from applied mechanics, such as ballistics. Written as formulas, they may look like -L_QuantumMechanics C_n P_n- or -L_Cosmology C_{Space,Time} P_n-L_Newton2ndLaw C_4 P_n-L_Ballistics C_n P_n-.
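To make the notation concrete, here is a minimal sketch of how a Codon and a Codon chain might be held in a program. The class `Codon` and the helper `chain_notation` are hypothetical names of my own, and the rendering with underscores is just one plain-text convention for the subscripts; nothing here is a fixed part of the theory.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Codon:
    """One -LCP- unit: a law, its concepts, and the phenomena it explains."""
    law: str
    concepts: List[str] = field(default_factory=list)
    phenomena: List[str] = field(default_factory=list)

    def signature(self) -> str:
        """Count the parts of the unit, e.g. 'L_1 C_4 P_2'."""
        return f"L_1 C_{len(self.concepts)} P_{len(self.phenomena)}"


def chain_notation(chain: List[Codon]) -> str:
    """Join the laws of a chain with the bond symbol '-' at both ends."""
    return "-" + "-".join(f"L_{c.law}" for c in chain) + "-"


# Newton's second law as described in the text: one law, four concepts.
newton = Codon(
    law="Newton2ndLaw",
    concepts=["force F", "mass m", "acceleration a", "F = m * a"],
    phenomena=["Galileo drops balls", "NASA launches a rocket"],
)
chain = [Codon("Cosmology", ["Space", "Time"]), newton, Codon("Ballistics")]

print(newton.signature())     # L_1 C_4 P_2
print(chain_notation(chain))  # -L_Cosmology-L_Newton2ndLaw-L_Ballistics-
```

Only two phenomena are listed here, so the sketch prints P_2; in the text the count of phenomena is the open-ended “n”.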


To build a library in this Codon format for the complete human knowledge collected from all recorded history, one more key element is needed: a time tag. It marks the character of the thinking or reasoning in an absolute historical sense, and it provides the reference point that is critically needed when an inference is carried out spontaneously by taking advantage of the “bonded and potentially activated” nature of the chain, like real DNA molecules or any polymer with reactive potential. We call this kind of chemical-like reaction reasoning, or thinking, including deduction and induction.
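As a rough illustration of what the time tag buys us, the sketch below attaches a year to each Codon and asks what was on record at a given moment. The names `TaggedCodon` and `known_at`, and the use of a calendar year as the tag, are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TaggedCodon:
    """A law with a hypothetical time tag (year it entered the record)."""
    law: str
    year: int


def known_at(library: List[TaggedCodon], year: int) -> List[str]:
    """Return the laws tagged at or before `year`, oldest first."""
    hits = [c for c in library if c.year <= year]
    return [c.law for c in sorted(hits, key=lambda c: c.year)]


library = [
    TaggedCodon("DNA double helix", 1953),
    TaggedCodon("Newton's second law", 1687),
]

print(known_at(library, 1900))  # ["Newton's second law"]
print(known_at(library, 2005))  # ["Newton's second law", 'DNA double helix']
```

With the tag in place, any inference can be restricted to the Codons that existed at its reference point, which is the role the time tag plays in the argument.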


Taking all the points above together, a logically concluded formula for complete human knowledge should be expressed as follows:

It can be read as: “Right now, this is all that is already known about all things.” (T: time; @ T0: at time zero; R_: the set of negative real numbers.)


Now, what can we do with the Codon?


Let us come back to reality. Contrary to what IBM's UIMA documentation states, all human expressions, in natural language or in other forms, have intrinsic structures of reasoning, most of which have already been studied. Therefore, the concepts embedded in published theories can be formulated as -LCP- Codons (perhaps without even requiring many man-hours), although many remain controversial, meaning that the same set of phenomena can be explained in different ways. It is not hard to imagine that, with significantly advanced tools from bioinformatics and information technology in general, a true concept-searching tool is possible in theory. It is equally obvious that many more “previously unthinkable” things can and should become thinkable. Among my favorites, I would like to propose a system named “Direct Understanding Exchange”, “DUE” for short.


To explain DUE simply, I will use an analogy from bioinformatics that makes the point quickly and sharply: DNA/gene array technology. Imagine that the thousands, if not millions, of DNA fragments on a chip are conceptually replaced by our conceptual molecules, the Codons. Given the nature of the Codon, we can call all the information on the “chip” a complete set of expert opinions (a complete list of laws, concepts, and questions to be answered). The sample to be tested is the knowledge base of an individual, which should cover the complete scope of that individual's knowledge in the specific field, including personal experience. Once the hybridization, the reaction in the DNA array testing chamber, is done, the report should show the matching score, which in the Codon sense means what knowledge, and how much of it, the person tested has in that specific field.
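The “hybridization” step can be pictured, in a very reduced form, as a set comparison between the expert chip and the individual's sample. In the sketch below a Codon is simplified to a (law, concept) pair, and the function name `due_score` and the fraction-of-chip-matched score are my own assumptions; a real DUE system would need a far richer notion of matching.

```python
from typing import Dict, Set, Tuple

# A Codon reduced to a (law, concept) pair; a hypothetical simplification.
CodonKey = Tuple[str, str]


def due_score(chip: Set[CodonKey], sample: Set[CodonKey]) -> Dict[str, object]:
    """Compare an individual's Codons (sample) against the expert chip."""
    matched = chip & sample
    fraction = len(matched) / len(chip) if chip else 0.0
    return {"matched": matched, "missing": chip - sample, "score": fraction}


chip = {
    ("Newton2ndLaw", "force"),
    ("Newton2ndLaw", "mass"),
    ("Newton2ndLaw", "acceleration"),
    ("Newton2ndLaw", "F = m * a"),
}
sample = {
    ("Newton2ndLaw", "force"),
    ("Newton2ndLaw", "mass"),
    ("Ballistics", "range"),  # knowledge outside this chip is ignored
}

report = due_score(chip, sample)
print(report["score"])            # 0.5
print(sorted(report["missing"]))
```

The report answers both halves of the question posed above: `matched` and `missing` say what the person knows, and `score` says how much of the field that covers.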


Theoretically, we can go even further with the DUE system. Consider a person who, for some reason, can never fully understand certain sets of laws containing many forever-mysterious concepts, but who knows this much: whatever is inside the black box, consistent, reliable, and favorable results are always produced as long as some seemingly superficial rules are strictly followed. We all have countless personal experiences of this kind. Driving an automobile is one. Taking a medicine is another. Now, let me make my point. I call such a scenario “Assisted Intelligence”, “AI” for short. In common terms, such things are called “tools”. What is special about this tool is that it helps people intelligently. In other words, you do not need to know everything in order to do everything intelligently. AI equips you with the complete sets of knowledge you need for the job.


Concluding words for IBM researchers and the like


Whether the Codon theory is right or wrong, the simple question of what a concept is has to be answered to a philosophically reasonable degree of satisfaction before any true concept-search technology can be developed. Sometimes we have to force ourselves, painfully, to cut off our attachment to our own shadows, as Plato warned thousands of years ago, and welcome the light we have never seen.


