Probabilistic programming is considered as a framework in which the basic components of cognitive architectures can be represented in a unified and elegant fashion. At the same time, the necessity of adopting some components of cognitive architectures to extend the capabilities of probabilistic programming languages is pointed out. In particular, implicit specification of generative models via the declaration of concepts and links between them is proposed, and the usefulness of declarative knowledge for achieving efficient inference is briefly discussed.
Any AGI system should rely on some knowledge (experience) representation, learning (prediction) methods, and reasoning (action selection) methods. Although these components are not necessarily explicit, and some systems can be more syncretic than others, we can characterize approaches to AGI by them. For example, basic models of universal algorithmic intelligence like AIXI implicitly represent knowledge in the form of programs, and use Solomonoff prediction and exhaustive search for action selection. Cognitive architectures (CAs) usually utilize more restrictive representations and learning methods for the sake of computational efficiency. Some architectures use one uniform representation and a corresponding learning method, yielding "grand unification and functional elegance", e.g. [1], but losing expressiveness. Others utilize quite general knowledge representations and many inference strategies [2], which results in higher expressiveness but causes difficulties with the integration of different components of the CA.
Achieving “grand unification and functional elegance” for more general representations can be considered a direction of further development of CAs. Here, we claim that the probabilistic programming paradigm can be seen as a theory for CAs with the properties of grand unification and functional elegance for universal (Turing-complete) representations. We also show that insights from CAs can be very useful for the further development of probabilistic programming languages (PPLs).
The basic purpose of PPLs is to conduct conditional inference over generative models specified in the form of programs with random choices. One can specify models corresponding both to particular narrow machine learning methods and to a sort of universal induction (if the model generates arbitrary programs). The same inference engine can be used to solve deductive reasoning tasks (see an example with the subset sum problem in [3]). One can also perform a sort of knowledge-based reasoning using probabilistic programming for free (see, e.g., [4]).
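To make this concrete, here is a minimal Python sketch (hypothetical illustrative code, not from the paper) of the pattern that PPLs generalize: a generative model is an ordinary program with random choices, and conditional inference simply reruns it until the observation holds.

```python
import random

def rejection_query(model, condition, max_tries=100000):
    """Generic conditional inference by rejection sampling:
    rerun the generative model until the condition holds."""
    for _ in range(max_tries):
        trace = model()
        if condition(trace):
            return trace
    raise RuntimeError("no sample satisfied the condition")

# A toy generative model: two independent fair coin flips.
def two_flips():
    return (random.random() < 0.5, random.random() < 0.5)

# Condition on "at least one flip came up heads" and query the trace.
sample = rejection_query(two_flips, lambda t: t[0] or t[1])
```

The same `rejection_query` works unchanged for any computable model and condition; only its efficiency varies, which is exactly the issue discussed below.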
Of course, PPLs usually don’t support a distinct representation of knowledge separated from the rest of the code. This also has a positive side: any kind of computable knowledge can be expressed.
Thus, PPLs can be used to implement the three basic components of AGI systems (knowledge representation, reasoning, and learning) quite naturally and uniformly. Of course, there are some obvious differences between PPLs and cognitive architectures: PPLs only provide the capabilities for representing knowledge and performing reasoning and learning, which still have to be realized and combined in a concrete architecture. However, PPLs seem suitable as a meta-tool for designing and implementing cognitive architectures in a convenient and unified way. The real problem here is not in designing a specific architecture, but in the efficiency of inference.
Indeed, the inefficiency of AIXI is directly reflected in its implementation in a PPL. Turing-incomplete PPLs use more efficient inference methods, but they cannot reproduce AIXI. One possible way to try to achieve both efficiency and universality is program specialization [5]. The idea is to automatically construct an efficient projection of a universal inference method w.r.t. a given specific task or generative model. Given a program in a PPL, one should not immediately apply a general inference method, but should first try to optimize that method w.r.t. this program.
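The flavor of program specialization can be conveyed by the classic power-function example (an illustrative sketch with our own naming, not the actual method of [5]): when part of a program's input is known statically, a specializer can emit a residual program with the interpretive overhead removed.

```python
def power(x, n):
    """General program: compute x**n with a loop."""
    r = 1
    for _ in range(n):
        r *= x
    return r

def specialize_power(n):
    """Specializer: with the exponent n known statically, unroll the
    loop and emit a residual program that just multiplies."""
    body = "*".join(["x"] * n) if n > 0 else "1"
    return eval("lambda x: " + body)

cube = specialize_power(3)  # residual program: lambda x: x*x*x
```

Specializing a universal inference engine w.r.t. a fixed generative model would follow the same scheme, but the analysis required is vastly harder than unrolling a loop.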
There are some attempts to do something like this in PPLs. For example, in [6] program analysis is performed to propagate observations backward through the program. In [7] something similar to the specialization of a PPL inference engine w.r.t. a given program is performed.
However, there is no simple and universal solution for efficient program specialization (with possibly exponential gains in speed), just as there is no simple and practically efficient universal inference method. The specializer should be an expert in program analysis, and it should be able to learn new ways to analyze and optimize programs. That is, it is impossible to put efficient and general methods inside a (static) PPL interpreter, because such an interpreter would already have to be a mature AGI. Instead, the AGI system should have the capability of becoming such an expert. The question, then, is: what are the main requirements for the AGI core, if they are not efficient and general inference itself? How should we extend the paradigm of PPLs to make them more suitable both for AGI development and for real-world applications?
Consider the following simple program in the Church language [4].
(rejection-query
  (define x (random-integer 10))
  x
  (= (+ x 5) 10))

Basic PPLs will blindly search for an appropriate solution. This by itself is not necessarily bad, since if you ask a small child to find a value whose sum with 5 equals 10, she or he (possessing basic knowledge about numbers) will also do this by blindly searching for the appropriate number.
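For comparison, such a blind search can be transliterated into Python as plain rejection sampling (an illustrative sketch, not Church's actual implementation):

```python
import random

def rejection_query_example():
    """Blind rejection sampling for the Church query above."""
    while True:
        x = random.randint(0, 9)   # (define x (random-integer 10))
        if x + 5 == 10:            # condition: (= (+ x 5) 10)
            return x               # query expression: x
```

On average ten samples are drawn before the single satisfying value x = 5 is found; an engine that could propagate the condition backward through the arithmetic would avoid the search entirely.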
More sophisticated PPLs might be able