Computability
Logic
(CoL)
Section 8
CoL-based knowledgebase and resourcebase
systems
The
reason for
the failure of x(p(x) p(x)) as a
computability-theoretic principle is that the problem represented by
this
formula may have no effective solution --- that is, the predicate p(x)
may be undecidable. The reason why this principle would fail in the
context of
knowledgebase systems, however, is much simpler. A knowledgebase system
may
fail to solve the problem x(Married(x) Married(x)) not because
the
latter has no effective solution (of course it has one), but because
the system
simply lacks sufficient knowledge to tell any given person’s marital
status. On the
other hand, any system
would be able to “solve” the problem x(Married(x) Married(x)) as this is an
automatically won elementary game so that there is nothing to solve at
all.
Similarly, while yxFather(x,y)
is an automatically solved elementary problem expressing the trivial
knowledge
that every person has a father, ability to solve the problem yxFather(x,y)
implies the nontrivial knowledge of everyone’s actual father. Obviously
the
knowledge expressed by AB or xA(x)
is generally stronger than the
knowledge expressed by AB or xA(x),
yet the formalism of classical logic
fails to capture this difference --- the difference whose relevance
hardly
requires any explanation. The traditional approaches to knowledgebase
systems
try to mend this gap by augmenting the language of classical logic with
special
epistemic constructs, such as the modal “know that” operator □, after which
probably
□A□B would be
suggested as a translation for AB and yx□A(x,y)
for yxA(x,y).
Leaving it for the philosophers to argue whether, say, yx□A(x,y) really
expresses the constructive meaning of yxA(x,y),
and forgetting that epistemic constructs often yield unnecessary and
unpleasant complications
such as
messiness and non-semidecidability of the resulting logics, some of the
major
issues still do not seem to be taken care of. Most of the actual
knowledgebase
and information systems are interactive, and what we really need is a
logic of interaction rather than
just a logic of
knowledge. Furthermore, a knowledgebase logic needs to be resource-conscious.
The informational
resource expressed by x(Pregnant(x) Pregnant(x)) is
not as strong as the
one expressed by x(Pregnant(x) Pregnant(x)) x(Pregnant(x) Pregnant(x)): the former
implies
the resource provider’s commitment to tell only one (even though an
arbitrary
one) woman’s pregnancy status, while
the
latter is about doing the same for any two Women.
The former is the informational resource
provided by a single disposable pregnancy test device, while the latter
is the
resource provided by two such devices. Neither classical logic nor its
standard
epistemic extensions have the ability to account for such differences.
But CoL
promises to be adequate. It is a
logic of interaction, it is
resource-conscious, and it does
capture the relevant differences between truth and actual ability to
find/compute/know truth.
When
CoL is
used as a logic of knowledgebases, its
formulas represent interactive queries. A formula whose main operator
is or
can
be understood as a
question asked by the
user, and a formula whose main operator is or
as
a question asked by the
system. Consider
the problem xyHas(x,y),
where Has(x,y) means
“patient x
has disease y”
(with Healthy counting as one of
the
possible “diseases”). This formula is the following question asked by
the
system: “Who do you want me to diagnose?”
The user's response can be “Laura”. This move brings the
game down to yHas(Laura,y). This is now a question asked by the
user: “What does Laura
have?”. The system’s response can be “flu”, taking us to the terminal
position Has(Laura,Flu).
The system has been successful iff
Laura really has the flu.
Successfully
solving the above problem xyHas(x,y) requires
having all relevant medical
information for each possible patient, which in a real diagnostic
system would
hardly be the case. Most likely, such a system, after receiving a
request to
diagnose x, would make
counterqueries
regarding x’s symptoms, blood
pressure, test results, age, gender, etc., so that the query that the
system
will be solving would have a higher degree of interactivity than the
two-step
query xyHas(x,y)
does, with questions and counterquestions interspersed in some complex
fashion.
Here is when other CoL operations come into play. Negation
turns
queries into counterqueries; parallel operations
generate combined queries; recurrence operations allow repeated
queries; implications
and rimplications act as various sorts of query reduction operations;
etc. Here
we are expanding our example. Let Sympt(x,s)
mean “patient x has (set of)
symptoms
s”, and Pos(x,t)
mean “patient x tests positive for
test t”.
Imagine a diagnostic system which can diagnose any particular patient x, but needs some additional
information. Specifically, it needs to know x’s
symptoms; plus, the system may require to have x
taken a test t which it
selects dynamically in the course of a dialogue with the user depending
on what
responses it received. The interactive task/query that such a system is
performing/solving can then be expressed by the following formula,
which we
shall call Diagn for further
references:
Diagn ≡ x(sSympt(x,s) t(Pos(x,t) Pos(x,t)) yHas(x,y)).
A
possible
scenario of playing the above game is the
following. At the beginning, the system waits until the user specifies
a
patient x to be diagnosed. We can
think of this stage as systems’s requesting the user to select a
particular
(value of) x, remembering that the
presence of x automatically
implies such a request. After a patient x
--- say x=X
--- is selected, the system requests to specify X’s
symptoms. Notice that our game rules
make the system successful if the user fails to provide this
information, i.e.
specify a (the true) value for s in
sSympt(X,s).
Once a response --- say, s=S --- is
received, the system selects a
test t=T
and asks the user to perform it on X,
i.e. to choose the true disjunct of
Pos(X,T) Pos(X,T).
Finally,
provided that the user gave correct answers
to all counterqueries (and if not, the user has lost), the system makes
a
diagnostic decision, i.e. specifies a value Y
for y in yHas(X,y)
for which Has(X,Y) is
true.
The
presence
of a single “copy” of t(Pos(x,t) Pos(x,t)) in the
antecedent
of Diagn
means
that the system may request testing a given patient only once. If up to
n tests were potentially needed
instead,
this would be expressed by taking the -conjunction
of n identical conjuncts t(Pos(x,t) Pos(x,t)). And if the
system
potentially needed an unbounded number of tests, then we would write t(Pos(x,t) Pos(x,t)), thus further
weakening Diagnoser: a system that
performs this weakened task
is
not as good as the one performing the original Diagn, as it
requires stronger external (user-provided) informational resources.
Replacing
the main quantifier x by x, on the other
hand,
would strengthen Diagn,
signifying the system’s ability to diagnose a patent purely on the
basis of
his/her symptoms and test result without knowing who the patient really
is.
However, if in its diagnostic decisions the system uses some additional
information on patients such their medical histories stored in its
knowledgebase and hence needs to know the patient’s identity, x cannot be
upgraded
to x. Replacing x by x would be a
yet
another way to strengthen Diagn,
signifying the system’s ability to diagnose all patients rather than
any
particular one; obviously effects of at least the same strength would
be
achieved by just prefixing Diagn
with or
.
The
knowledgebase KB
of the system may include atomic elementary formulas expressing factual
knowledge, such as Married(Laura), or
non-atomic elementary
formulas expressing general knowledge, such as x(yFather(x,y) M ale(x)) and xy(x✕(y+1) = (x✕y)+x); it can also
include nonclassical formulas such as x(Female(x) Male(x)), expressing
potential knowledge of everyone's gender, or xy(x2=y), expressing ability to repeatedly
compute the square
function, or something more complex and more interactive, such as the
earlier
seen formula Diagn.
With each resource R?#381;KB is associated
(if not physically, at least
conceptually) its provider
--- an agent that solves the query R
for the system, i.e. plays the game R
against the system. Physically the
provider could be a computer program allocated to the system, or a
network
server having the system as a client, or another knowledgebase system
to which
the system has querying access, or even human personnel servicing the
system.
E.g., the provider for as xyBloodpressure(x,y)
would probably be a team of nurses repeatedly performing the task of
measuring
the blood pressure of a patient specified by the system and reporting
the
outcome back to the system. Again, we do not think of providers as a
part of
the system itself. The latter only sees what
resources are available to it, without knowing or caring about how the corresponding providers do their
job; furthermore, the system does not even care whether
the providers really do their job right. The system's
responsibility is only to correctly solve queries for the user as long as none of the providers fails to
do their job. Indeed,
if the system
misdiagnoses a patient because a nurse-provider gave it wrong
information about
that patient’s blood pressure, the hospital (ultimate user) is unlikely
to fire the system
and demand refund from its
vendor; more likely, it would fire the nurse. Of course, when R is elementary, the provider has
nothing to do, and its successfully playing R
against the system simply means that R
is true. Note that in the picture that we have just presented, the
system plays
each game R∈KB in the role
of , so that,
from the
system’s perspective, the game that it plays against the provider of R is R rather than R.
The
most
typical internal informational resources,
such as factual knowledge or queries solved by computer programs, can
be reused
an arbitrary number of times and with unlimited branching capabilities,
i.e. in
the strong sense captured by , and thus
they
would be prefixed with a as
we did with x(Female(x) Male(x)). There was no
point
in -prefixing Married(Laura), x(yFather(x,y) Male(x)) or xy(x✕(y+1) = (x✕y)+x)
because every elementary
game A is equivalent to A and hence
remains
“recyclable” even without recurrence operators. As noted in Section 3.8 for the
corresponding rimplications,
there is no difference between and
as
long as “simple”
resources are concerned
such as such xy(x2=y). However, in some cases --- say, when
a resource with a high degree
of interactivity is supported by an unlimited number of independent
providers
each of which however allows to run only one single “session” --- the
weaker
operator will
have to be used instead
of .
Yet, some of
the
internal informational resources could be essentially non-reusable. A
provider possessing a single piece of litmus paper would be able to
support the
resource x(Acid(x) Acid(x)) but not x(Acid(x) Acid(x)) and not even x(Acid(x) Acid(x)) x(Acid(x) Acid(x)). Most users,
however, would try to refrain from including this sort of a resource
into KB, and rather make it a part
(antecedent) of possible queries. Indeed, knowledgebases with
non-recyclable
resources would tend to weaken from query to query and require more
careful
maintainance and updates. Whether recyclable or not, all of the
resources of KB can be used
independently and in
parallel. This is exactly what allows us to identify KB
with the -conjunction
of its
elements.
Assume
KB = R1 ... Rn, and let us
now try
to visualize a system solving a query F
for the user. The designer would probably select an interface where the
user
only sees the moves made by the system in F,
and hence gets the illusion that the system is just playing F.
But in fact the game that the system
is really playing is KBF, i.e.
R1...RnF. Indeed, the
system
is not only interacting with the user in F,
but --- in parallel --- also with its resource providers against whom,
as we
already know, it plays R1,...,Rn. As long as
those
providers do not fail to do their job, the system loses each of the
games R1,...,Rn. Then
our semantics for implies
that the system wins
its play over the
“big game” R1...RnF if and only
if it
wins it in the F component, i.e.
successfully solves the query F.
Notice that it is uniform-constructive
soundness rather
than simple soundness of the built-in (axiomatization of the) logic
that allows
the knowledgebase system to function. By simple
soundness here we mean that every provable formula is
nonlogically valid.
This is not sufficient for two reasons.
The
other
reason why simple soundness of the built-in
logic would not be sufficient for a knowledgebase system to function
--- even
if every provable formula was known to be logically valid --- is the
following.
With simple soundness, after finding a proof of E,
even though the system would know that a solution for E*
exists, it might have no way to
actually find such a solution. On the other hand, uniform-constructive
soundness guarantees that a (uniform) solution for every provable
formula not
only exists, but can be effectively extracted from a proof.
As
for
completeness of the built-in logic, unlike
uniform-constructive soundness, it is a desirable but not necessary
condition.
So far complete axiomatizations have been found only for various
limited
fragments of the language of CoL. We hope that the future will bring
completeness results for more and more expressive fragments as well.
But even
if not, we can still certainly succeed in finding ever stronger
axiomatizations
that are uniform-constructively sound even if not necessarily complete.
It
should be remembered that, when it comes to practical applications in
the
proper sense, the logic that will be used is likely to be far from
complete
anyway. For example, the popular classical-logic-based systems and
programming
languages are incomplete, and the reason is not that a complete
axiomatization
for classical logic is not known, but rather the unfortunate fact of
life that
often efficiency only comes at the expense of completeness.
The
context of
knowledgebase systems can be further
extended to systems for planning and action. Roughly, the formal
semantics in
such applications remains the same, and what changes is only the
underlying
philosophical assumption that the truth values of predicates and
propositions
are fixed or predetermined. Rather, those values in CoL-based planning
systems will be viewed as something that interacting agents may be able
to manage.
That is,
predicates or propositions there stand for tasks
rather than facts. E.g., Pregnant(Laura)
--- or, perhaps, Impregnate(Laura)
instead --- can be seen as having
no predetermined truth value, with Laura or her mate being in control
of
whether to make it true or not. And the nonelementary formula xHit(x)
describes the task of hitting any one
target x selected by the
environment/commander/user. Note how naturally resource-consciousness
arises
here: while xHit(x)
is a task accomplishable with one
ballistic missile, the stronger task xHit(x) xHit(x)
would require two missiles instead.
All of the other operators of CoL, too, have natural interpretations as
operations on physical (as opposed to just informational) tasks, with acting
as a task reduction
operation. To get a
feel of this, let us look at the task
Give me a
wooden stake Give
me
a silver bullet Destroy
the vampire Kill
the
werewolf.
This
is a task
accomplishable by an agent who has a
mallet and a gun as well as sufficient time, energy and bravery to chase and eliminate any
one (but not both)
of the two monsters, and only needs
a
wooden stake and/or a silver bullet to complete his noble mission. Then
the
story told by the legal run 〈1.1, 0.1〉 of the above
game is that the environment asked the
agent to kill the werewolf, to which the agent replied by the
counterrequest to
give him a silver bullet. The task will be considered eventually
accomplished
by the agent iff he indeed killed the werewolf as long as a silver
bullet was
indeed given to him.