BLACK BOX SCIENCE

Leona F. Fass

From: AAAI Technical Report SS-95-03. Compilation copyright © 1995, AAAI (www.aaai.org). All rights reserved.

We have looked at the process of discovery as an aspect of reasoning about knowledge, particularly in relation to problems of knowledge acquisition or learning. In those problems we have found to be of interest, a body of knowledge to-be-acquired may be infinite. Still, it can be "learned" through discovery of a model that correctly represents the knowledge in some finite way. Realistically, experiments or observations that determine the characterizing model also must be finite. A successful discovery process concludes, determining a model of knowledge, effectively. Once the model is acquired the body of knowledge is learned, in the sense that it is precisely characterized.

within a domain of candidates. If the process is sufficiently constrained, a unique M (e.g., minimal structure) for S may be obtained. Instances of this process include determining the internal structure (and thus, all future behavior) of a vending machine, from observations of its operation; determining a grammar G for an entire language L, given a sentence sample; or finding a "simplest" program P to implement a particular function f, when the function is described only by selected input/output, i.e., (x, f(x)), pairs.

A testing process, on the other hand, experiments with a given candidate M1 claimed to model S, to see if it indeed does satisfy that specification. Unlike the inference process, the testing process "knows" S and so, how a model of S should always behave. It is a nontrusting, or adversarial, process that may systematically choose experiments (using positive data from S, and/or negative data from "not S") to discover whether, or where, a potential model M1 may be incorrect. The goal of testing is to detect incorrectness of a given "box" M1 so that the box may be "fixed" and a better potential model of S may be found (this is the "Popperian" view [2,5]).
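As an illustration only, the inference and testing processes described above can be sketched for the finite-state case. This is a minimal sketch under stated assumptions, not the construction used in the cited work: it assumes the behavior S is a set of strings with a decidable membership query and is realizable by a device with a known bound on its number of states. All names here (`ALPHABET`, `infer_model`, `find_counterexample`, and the example behaviors `spec` and `faulty`) are hypothetical.

```python
from itertools import product

ALPHABET = "ab"  # hypothetical input alphabet for the black box


def strings_up_to(max_len, alphabet=ALPHABET):
    """Yield every string over the alphabet, shortest first, up to max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def infer_model(member, max_states, alphabet=ALPHABET):
    """Inference: construct a finite-state model from membership experiments.

    Assumption: the behavior S is realizable by a device with at most
    max_states states (max_states >= 1), so suffixes of length below
    max_states tell any two of its states apart.
    """
    experiments = list(strings_up_to(max_states - 1, alphabet))

    def signature(prefix):
        # Finite record of how the box behaves after reading prefix.
        return tuple(member(prefix + suffix) for suffix in experiments)

    start = signature("")
    access = {start: ""}   # state -> one input string that reaches it
    delta = {}             # (state, symbol) -> next state
    pending = [start]
    while pending:
        state = pending.pop()
        for symbol in alphabet:
            target = signature(access[state] + symbol)
            if target not in access:
                access[target] = access[state] + symbol
                pending.append(target)
            delta[(state, symbol)] = target
    # experiments[0] is the empty string, so entry 0 records acceptance.
    accepting = {state for state in access if state[0]}
    return start, delta, accepting


def accepts(model, word):
    """Run the inferred model on a word."""
    start, delta, accepting = model
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in accepting


def find_counterexample(candidate, spec, max_len, alphabet=ALPHABET):
    """Testing: search for a shortest input on which candidate and S disagree.

    Unlike inference, the tester knows S (spec). Returns None when no
    experiment of length <= max_len detects incorrectness.
    """
    for word in strings_up_to(max_len, alphabet):
        if candidate(word) != spec(word):
            return word
    return None


# Hypothetical behavior S: the number of 'a's is a multiple of 3.
def spec(word):
    return word.count("a") % 3 == 0


model = infer_model(spec, max_states=3)
print(find_counterexample(lambda w: accepts(model, w), spec, max_len=5))  # None


# A faulty "box" M1 that counts modulo 2 instead of modulo 3.
def faulty(word):
    return word.count("a") % 2 == 0


print(find_counterexample(faulty, spec, max_len=5))  # aa
```

In this sketch the inference step trusts the membership answers it receives, while the testing step compares a candidate against a known S and reports where it fails. If both the candidate and S are finite-state, with n and m states respectively, inputs of length below n + m are enough to expose any disagreement, which is one concrete sense in which experimentation can be deemed sufficient.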
Since only finite knowledge can be observed effectively, we view the entire body of knowledge as a black box. The process of discovering a knowledge model is, from our perspective, experimentally determining correct structure of the "black box", given finite samples or examples of how it does, or should (or should not) "behave".

For some time we have investigated systematic methods of "black box discovery" for learning or knowledge acquisition, in the sense we have described. We first considered a language acquisition problem, establishing discovery of a linguistic model from explicit experiments with appropriate data. Our success in this area led us to examine other problem domains, seeking a general procedure for modeling knowledge. We began with a "positive" process, using techniques of inductive inference [1] to discover a model constructively. Motivated by Cherniavsky et al. [3] and by his "Popperian approach" [2] (after the late Karl Popper [5]), we then investigated the default discovery of a correct model "adversarially", by testing for incorrectness.

But, if no experiment detects incorrectness and if experimentation is deemed sufficient, the discovery of the correct "box" M1 is verified. M1 might be a newly built vending machine on a trial run; a grammar claimed to generate a language L; or a program claimed to compute f(x) on input x. A "Popperian" would be satisfied if a vending machine short-circuit were detected; an incorrect sentence generated; or a program bug found. In each case, reparation of the discovered defect could yield a better potential behavioral model. A "verificationist" could be satisfied if sufficient experiments detected no such errors. With adequate experimentation a successful testing process might establish that M1 produces exactly what is claimed, and nothing else.

Here we briefly describe these two related processes for discovering correct "black box" models, and thus acquiring knowledge.
We also describe some cases where the processes might, or can, be applied.

At first glance inference and testing might seem to be complementary processes to discover, for some body of knowledge, a correct "black box" model. Inference would systematically construct a correct "box" M1 through finite experiments, or correctness could be discovered through systematic effective tests. But inference and testing are not complementary processes for discovering correct models, for testing is more difficult than inference [2,3,6,7]. Successful inference depends on finitely characterizing correct behavior of the "box" M1. Successful testing will discover correct M1 only if its experiments characterize correctness and incorrectness, effectively.

An inductive inference process discovers a correct or "best" model of knowledge, i.e., having some specified behavior S, from a finite behavioral sample. In constructive approaches a characterizing sample S (positive data) is observed and generalized to find a model M for the entire behavior. This is a trusting process that depends on selection, provision of, or access to, appropriate sample data that will lead to correct generalization. Experiments defined by the sample may identify components of a (black box) model M for S, from

We have found that when considering behaviors, or bodies of knowledge, that are finitely-realizable, and for which membership queries are decidable, both inference and testing will systematically lead to discovery of correct models of knowledge. We have successfully applied both techniques in such cases: e.g., discovering finite-state devices and finite grammatical models. In such cases we have established existence of precise models, and systematic experimentation that terminates upon discovering them. We have found "approximating" results, in cases where only some of the conditions (decidability, realizability, finiteness) apply: e.g., discovering correct programs, approximately.
(Some of our results appear in [6-9]).

In the fuzzy areas of cognitive processes [4] we have much to learn about discovery of models, and so, too, in the noisy, nonconstrained, non-"logical" or non-"algebrized" domain of "real-life" science. In these cases things don't fit so neatly into a "black box".

In the area of systematic methods of scientific discovery, we would hope to contribute our experience in modeling of knowledge, as our research in discovery has shifted from specific engineering/computer-science problems into epistemology and philosophy of science. We would hope to gain new insights and perspectives on discovering models of knowledge from observed phenomena or processes, to propose solutions (or at least approaches) to the many open problems we have encountered (e.g. in [4]), and the many we have yet to see.

Selected Relevant Papers And Presentations By The Author

[6] Fass, L. F., "A Common Basis for Inductive Inference and Testing," Proc. of the Seventh Pacific Northwest Software Quality Conference, Portland, September 1989, pp. 183-200.

[7] Fass, L. F., "Inference, Testing and Verification," Ninth International Congress on Logic, Methodology and Philosophy of Science and Logic Colloquium 91, Uppsala, Sweden, August 1991. Abstracted in Congress Volume I, p. 193, and in J. Symbolic Logic, Vol. 58, No. 2 (June 1993), pp. 763-764.

[8] Fass, L. F., "Modeling Perfect Behavior: A Goal-Driven Learning Analysis," in Notes of the AAAI Spring Symposium on Goal-Driven Learning, Stanford, March 1994, pp. 125-127.

[9] Fass, L. F., "Modeling Perfect Behavior: A Machine-Theoretic Approach", Proc. On Summary, Joint Conference on Information Sciences: Computer Theory and Informatics, Duke University/Pinehurst N.C., November 1994, pp. 141-144.

Selected References

[1] Angluin, D. and C. H. Smith, "Inductive Inference: Theory and Methods," Computing Surveys, Vol. 15 (1983), pp. 237-269.

[5] Popper, K., The Logic of Scientific Discovery, Harper Torch Books, New York, 1968.
[2] Cherniavsky, J. C., "Computer Systems as Scientific Theories -- A Popperian Approach To Testing," Proc. of the Fifth Pacific Northwest Software Quality Conference, Portland, October 1987, pp. 297-308.

[3] Cherniavsky, J. C., R. Statman and M. Velauthapillai, "Testing and Inductive Inference: Abstract Approaches," Proc. of the First Workshop on Computational Learning Theory, Morgan-Kaufmann, 1988.

[4] Lenat, D. B., and R. V. Guha, K. Pittman, D. Pratt, M. Shepherd, "CYC: Toward Programs With Common Sense," Communications of the ACM, Vol. 33 (1990), pp. 30-49.

Leona F. Fass received a B.S. in Mathematics and Science Education from Cornell University and an M.S.E. and Ph.D. in Computer and Information Science from the University of Pennsylvania. Prior to obtaining her Ph.D. she held research, administrative and/or teaching positions at Penn and Temple University. Since then she has been on the faculties of the University of California, Georgetown University and the Naval Postgraduate School. Her research primarily has focused on language structure and processing; knowledge acquisition; and the general interactions of logic, language and computation. She has had particular interest in inductive inference processes, and applications/adaptations of inference results to the practical domain. Dr. Fass may be reached at

Mailing Address:
P.O. Box 2914
Carmel, CA 93921