UNREAL PROBABILITIES
Partial Truth with Clifford Numbers
Carlos C. Rodriguez
Department of Mathematics and Statistics
University at Albany, SUNY
Albany NY 12222, USA
Abstract
This paper introduces and studies the basic properties of Clifford algebra
valued conditional measures.
1 Introduction
Probability theory was given a firm mathematical foundation in 1933,
when Kolmogorov [] introduced his axioms. By defining
probability as an uninterpreted special case of a positive
measure with total unit mass (plus an additional definition for
independence), the subject exploded with new results and
found innumerable applications. In 1946, Cox (see []) showed
that the Kolmogorov axioms for probability are really theorems that
follow from basic desiderata about the representation of partial truth
with real numbers. We owe to Ed Jaynes (see []) the
discovery of the importance of Cox's 1946 work ([]). After
Jaynes, it became clear why the calculus of probability is so
successful in the real world. Probability works because its axioms
axiomatize the right thing: partial truth of a logical
proposition given another. Even more, the rules of probability are
unique in the sense that any other set of consistent rules can be
brought into the standard sum and product rules by a change of scale
(or we may say logical gauge). This is in fact Cox's main result and
it makes futile the enterprise of looking for alternatives to the
calculus of normalized real valued probabilities. It is only by
allowing the partial truth of a proposition to be encoded by an object
other than a real number in the interval [0,1] that we could find
alternatives to the standard theory of probability.
We seek to find out what happens when standard probability theory is
modified by relaxing the axiom that the probability of an event must
be a real number in the interval [0,1]. We show that, by allowing
the measure of a proposition to take a value in a Clifford Algebra, we
automatically find the methods of standard quantum theory without ever
introducing anything specifically related to nature itself.
The main motivation for this article has come from realizing that the
derivations in Cox [] still apply if real numbers are
replaced by complex numbers as the encoders of partial truth. This was
first mentioned by Youssef [] and checked in more detail
by Caticha [], who also showed that non-relativistic
quantum theory, as formulated by Feynman [], is the only
consistent calculus of probability amplitudes. By measuring
propositions with Clifford numbers we automatically include the reals,
the complex numbers, the quaternions, spinors and any combination of
them (among others) as special cases.
2 The Axioms
In this section we introduce the notation and collect the simple properties
of Boolean and Clifford algebras that will be needed for the definition
of $\psi$ below.
2.1 The Boolean Algebra $\mathcal{A}$
Let $\mathcal{A}$ be a Boolean $\sigma$-algebra of propositions $a, b, c, \ldots$. We denote by $0$ the false proposition, by $1$ the true
proposition, by $a+b$ the logical sum, by $ab$ the logical product and by $\bar{a}$ the negation. Each proposition $b \in \mathcal{A}$ defines the
set $\mathcal{A}_b$ where,
\[ \mathcal{A}_b = \{ ba : a \in \mathcal{A} \} = b\mathcal{A} \tag{1} \]
Clearly, $\mathcal{A}_b$ is a subset of $\mathcal{A}$ that
contains $b$ and $0$; it is closed under sums and products, and thus $\mathcal{A}_b$
is a subalgebra of $\mathcal{A}$ with $b$ as the unit. From the fact
that $a = ac + a\bar{c}$ it follows that
\[ \mathcal{A} = \mathcal{A}_c \oplus \mathcal{A}_{\bar{c}} \tag{2} \]
Given two propositions $a, b \in \mathcal{A}$ we have,
\[ \mathcal{A} \supset \mathcal{A}_a \supset \mathcal{A}_{ab} \tag{3} \]
The set $X \subset \mathcal{A}$ is called the set of elementary
propositions of $\mathcal{A}$ (and we say that $\mathcal{A}$ is a $\sigma$-algebra of propositions in $X$) if,
- for $x, y \in X$, $xy = 0$ whenever $x \neq y$
- every $a \in \mathcal{A}$ is the sum of propositions in $X$. We
write
\[ a = \sum_{x \in a} x \tag{4} \]
If $\mathcal{A}$ and $B$ are two Boolean $\sigma$-algebras of
propositions in $X$ and $Y$ respectively, then
$\mathcal{A} \times B$ is a Boolean $\sigma$-algebra of propositions in $X \times Y$ if
one defines the truth value of $(a,b) \in \mathcal{A} \times B$
as the truth value of $ab$, i.e. true only when both $a \in \mathcal{A}$
and $b \in B$ are true. We denote by $\mathcal{A}^n$ the $\sigma$-algebra of $n$ copies of $\mathcal{A}$ of propositions in $X^n$. We have,
\[ P \in \mathcal{A}^n \iff P = \sum_{x \in P} x_1 x_2 \cdots x_n \tag{5} \]
and by this we mean that $P$ is always the sum of propositions in $X^n$
and each $x \in X^n$ is the conjunction of $n$ propositions, one for each
copy of $X$. Finally we let
$\mathcal{A}^* = \mathcal{A} \setminus \{0\}$.
Notice that these Boolean $\sigma$-algebras are nothing but the standard
sets where general measures (in particular probability measures) are
defined. We chose the notation of logical sums and products instead
of the traditional set notation of unions and intersections to emphasize
the fact that we are interested in the encoding of partial truth of
logical propositions, but this is only a choice of notation and
there is a complete one-to-one correspondence between the two languages.
As general references see e.g. Halmos [] or
Chow and Teicher [].
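The correspondence with set notation can be made concrete in a few lines. The following is only a minimal sketch: the three elementary propositions and the names `ALGEBRA`, `lsum`, `lprod`, `neg` and `sub_algebra` are hypothetical choices made for illustration. Propositions are represented as frozen sets of elementary propositions, so that (1), (2) and (3) can be checked by enumeration.

```python
from itertools import combinations

# Hypothetical finite set of elementary propositions (illustrative names).
X = frozenset({"x1", "x2", "x3"})

# Every proposition is a sum (union) of elementary propositions.
ALGEBRA = [frozenset(s) for r in range(len(X) + 1)
           for s in combinations(sorted(X), r)]

ZERO, ONE = frozenset(), X   # the false and the true proposition


def lsum(a, b):
    """Logical sum a + b (set union)."""
    return a | b


def lprod(a, b):
    """Logical product ab (set intersection)."""
    return a & b


def neg(a):
    """Negation of a (set complement in X)."""
    return X - a


def sub_algebra(b):
    """The set A_b = { ba : a in A } of equation (1)."""
    return {lprod(b, a) for a in ALGEBRA}


b = frozenset({"x1", "x2"})
c = frozenset({"x2", "x3"})
A_b = sub_algebra(b)

# Decomposition a = ac + a(not c) behind equation (2).
decomposes = all(a == lsum(lprod(a, c), lprod(a, neg(c))) for a in ALGEBRA)
print(len(ALGEBRA), len(A_b), decomposes)
```

With three elementary propositions the algebra has eight elements and $\mathcal{A}_b$ has four, with $b$ acting as its unit.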
2.2 The algebra of Clifford numbers $\mathcal{G}$
Let $\mathcal{G}$ be an arbitrary finite dimensional Clifford algebra
with real scalars. We try to follow the notation in []. We
denote the elements of $\mathcal{G}$ by capital letters like $A, B, C, \ldots$. A general Clifford number $M$ always expands as the sum of
its scalar, vector, bivector, etc. parts like:
\[ M = \langle M \rangle_0 + \langle M \rangle_1 + \langle M \rangle_2 + \cdots \tag{6} \]
where $\langle M \rangle_k$ denotes the $k$-vector part of $M$. If $u$ and $v$ are vectors
in $\mathcal{G}$ then their geometric (Clifford) product $uv$ can be
decomposed into a symmetric part $u \cdot v$ and an antisymmetric part $u \wedge v$ as
\begin{align}
uv &= \frac{1}{2}(uv + vu) + \frac{1}{2}(uv - vu) \tag{7} \\
   &= u \cdot v + u \wedge v \tag{8}
\end{align}
The inner product between two vectors is always a scalar and their wedge
product is always a bivector. The operation of reversion of a
Clifford number $M$ is denoted by $M^\dagger$ and defined as a linear
operation with the properties,
\[ \alpha^\dagger = \alpha, \quad u^\dagger = u, \quad (MN)^\dagger = N^\dagger M^\dagger \tag{9} \]
where $\alpha$ is a scalar, $u$ is a vector, and $M$ and $N$ are arbitrary
Clifford numbers. Reversion therefore reverses the order of the factors in any
product of vectors. The Euclidean inner product on $\mathcal{G}$ is given
by,
\[ \langle M, N \rangle_{\mathcal{G}} = \langle M^\dagger N \rangle_0 \tag{10} \]
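The operations (6) to (10) are easy to experiment with. The following is a minimal sketch, assuming a Euclidean signature (every basis vector squares to $+1$) and a bitmask encoding of basis blades; the function names `gp`, `rev`, `grade`, `inner` and `vector` are illustrative and are not taken from any reference.

```python
def blade_sign(a, b):
    """Sign produced by reordering the product of basis blades a and b.

    Blades are bitmasks: bit i set means the basis vector e_(i+1) is present.
    """
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps & 1 else 1


def gp(M, N):
    """Geometric product of multivectors stored as {bitmask: coefficient}."""
    out = {}
    for a, x in M.items():
        for b, y in N.items():
            blade = a ^ b   # e_i e_i = +1 (assumed Euclidean signature)
            out[blade] = out.get(blade, 0) + blade_sign(a, b) * x * y
    return {k: v for k, v in out.items() if v != 0}


def add(M, N, s=1):
    """Return M + s*N."""
    out = dict(M)
    for k, v in N.items():
        out[k] = out.get(k, 0) + s * v
    return {k: v for k, v in out.items() if v != 0}


def scale(M, s):
    """Return s*M for a scalar s."""
    return {k: s * v for k, v in M.items() if s * v != 0}


def grade(M, k):
    """The k-vector part <M>_k of equation (6)."""
    return {b: v for b, v in M.items() if bin(b).count("1") == k}


def rev(M):
    """Reversion: a grade-k blade picks up the sign (-1)^(k(k-1)/2)."""
    out = {}
    for b, v in M.items():
        k = bin(b).count("1")
        out[b] = v if (k * (k - 1) // 2) % 2 == 0 else -v
    return out


def inner(M, N):
    """Euclidean inner product <M, N>_G = <M^dagger N>_0 of equation (10)."""
    return gp(rev(M), N).get(0, 0)


def vector(*coords):
    """Build a vector from its coordinates on e_1, e_2, ..."""
    return {1 << i: c for i, c in enumerate(coords) if c != 0}


u, v = vector(1, 2, 3), vector(4, -1, 2)
uv, vu = gp(u, v), gp(v, u)
symmetric_part = scale(add(uv, vu), 0.5)        # u . v, a scalar
antisymmetric_part = scale(add(uv, vu, -1), 0.5)  # u ^ v, a bivector
print(symmetric_part, antisymmetric_part)
```

For these two vectors the symmetric part is the scalar $u \cdot v = 8$ and the antisymmetric part is a pure bivector, as (7) and (8) require.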
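2.3 Definition of $\psi$
By a Clifford algebra valued conditional measure (or simply a $\psi$) we
mean a function,
\[ \psi : \mathcal{A} \times \mathcal{A}^* \to \mathcal{G} \tag{11} \]
satisfying the following two axioms:
- (I) If $c \Rightarrow b$ then
\[ \psi(ab, c) = \psi(a, c) \quad \forall a \in \mathcal{A} \tag{12} \]
- (II) If $\{a_1, a_2, \ldots\} \subset \mathcal{A}$, with $a_j a_k = 0$ for $j \neq k$, then
\[ \psi\Big( \sum_j a_j, c \Big) = \sum_j \psi(a_j, c) \tag{13} \]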
Since the only property a proposition in $\mathcal{A}$ always has is its truth
value, we can interpret $\psi(a,c)$ as the Clifford number that
represents the truth in $a$ when $c$ is certain. Axiom (I) says that
$c$ is certain (e.g. take $a = 1$ and $b = c$) and axiom (II) says that the
whole truth of $a$ for a given $c$ is always the sum of the truths of
its separate parts.
2.3.1 The truth of 0
By taking each $a_j = 0$ in (13) we get,
\[ \psi(0,c) = \psi(0,c) + \psi(0,c) + \cdots \tag{14} \]
and therefore $\psi(0,c)$ is either $0 \in \mathcal{G}$ or unbounded,
but if it is unbounded then all the propositions will be assigned an
unbounded value since $\psi(a,c) = \psi(a+0,c) = \psi(a,c) + \psi(0,c)$.
Hence,
\[ \psi(0,c) = 0 \quad \text{for all } c \in \mathcal{A}^* \tag{15} \]
3 The spaces $H_c$
The functions $\psi$, as defined by (12) and (13),
are specified independently at each $c \in \mathcal{A}^*$. So far,
there is no link between the $\psi$ in the domain of discourse of $c$, i.e. $\psi(\cdot, c)$, and the $\psi$ in the more specialized domain of discourse of $bc$, i.e. $\psi(\cdot, bc)$. We shall talk about changing domains of discourse
in the next section, but in this section we describe the important properties
that the functions $\psi(\cdot, c)$ have as functions of their first argument
only, for fixed $c \in \mathcal{A}^*$. To simplify the notation we simply
write $\psi(a)$ instead of $\psi(a,c)$ in the formulas below. Thus, whenever
the background proposition $c$ is not subject to change we take $\psi$ as
any $\sigma$-additive function defined on $\mathcal{A}_c$ with
values in $\mathcal{G}$. The condition (12) is
automatically satisfied since $c$ is the true proposition in $\mathcal{A}_c$.
Let $H_c$ be the set of all $\sigma$-additive functions defined on
$\mathcal{A}_c$ with values in $\mathcal{G}$.
3.1 The $H_c$ are Hilbert spaces
Since the sum of two $\sigma$-additive functions and the product of a
$\sigma$-additive function by a scalar are still $\sigma$-additive
functions, the $H_c$ are vector spaces. The scalars are
the scalars in $\mathcal{G}$. In principle the field of scalars could be taken as
the reals or the complex numbers, but it seems that the reals are all
that is needed in most applications.
3.1.1 The inner product in $H_c$
For $\phi, \psi \in H_c$ define the real inner product between them by:
\begin{align}
\langle \phi, \psi \rangle &= \sum_{x \in X} \langle \phi(x), \psi(x) \rangle_{\mathcal{G}} \tag{16} \\
 &= \sum_{x \in X} \langle \phi^\dagger(x) \psi(x) \rangle_0 \tag{17}
\end{align}
By considering only $\psi$s with finite norm we make $H_c$ a real Hilbert
space. From now on we assume the finite norm to be part of the definition of
$H_c$ itself, i.e.
\[ H_c = \Big\{ \psi : \psi \text{ is } \sigma\text{-additive on } \mathcal{A}_c \text{ and } \sum_{x \in X} \langle \psi^\dagger(x) \psi(x) \rangle_0 < \infty \Big\} \tag{18} \]
Notice that the spaces $H_c$ are complete for the inner product
(17) since $\mathcal{G}$ with the scalar product $\langle \cdot, \cdot \rangle_{\mathcal{G}}$
is complete. When $X$ is a finite set (i.e. when it contains only a
finite number of propositions) the proof is trivial: just use the fact
that if $\{\phi_n\}_{n \in \mathbb{N}} \subset H_c$ is a Cauchy sequence then for
each $x \in X$ the sequence $\{\phi_n(x)\}_{n \in \mathbb{N}}$ is also a Cauchy
sequence of elements of $\mathcal{G}$ and thus it converges to some
$\phi(x) \in \mathcal{G}$, and therefore $\phi \in H_c$ is the limit of the
original sequence in $H_c$. When $X$ is infinite we need to
reinterpret the sums as integrals (for which we need a measure in
$X$), and we also need to reinterpret the $\psi$s as $\mathcal{A}$-measurable
densities, but after that, the proof is essentially the standard proof
that $L^2$ is complete. An important example of an infinite $X$
occurs when the propositions in $X$ are labeled with the vectors in
$\mathcal{G}$. In this case the sum in (17) is replaced by the
integral with respect to the standard Lebesgue measure in $X$.
3.2 The isomorphic spaces: $H(\mathcal{A}_c) \cong H_c(\mathcal{A})$
In order to be able to understand the differences between the current
approach and ordinary probability theory it is convenient to introduce two
other spaces closely related to $H_c$. These are the space of all $\sigma$-additive functions on $\mathcal{A}_c$ with values in $\mathcal{G}$,
\[ H(\mathcal{A}_c) = \{ \psi_c : \mathcal{A}_c \to \mathcal{G} \mid \psi_c \ \sigma\text{-additive on } \mathcal{A}_c \} \tag{19} \]
and the space,
\[ H_c(\mathcal{A}) = \{ \psi : \mathcal{A} \to \mathcal{G} \mid \psi \ \sigma\text{-additive on } \mathcal{A} \text{ and, if } c \Rightarrow b, \ \psi(ab) = \psi(a) \} \tag{20} \]
Both are Hilbert spaces with the inner product (17) when only
elements of finite norm are considered.
Notice that if $\psi \in H_c(\mathcal{A})$ then its restriction to $\mathcal{A}_c$ belongs to $H(\mathcal{A}_c)$, i.e.
\[ \psi|_{\mathcal{A}_c} = \psi_c \in H(\mathcal{A}_c) \]
and conversely, if $\psi_c \in H(\mathcal{A}_c)$ then the function $\psi$ defined by:
\[ \psi(a) = \psi_c(ac) \quad \forall a \in \mathcal{A} \]
belongs to $H_c(\mathcal{A})$ since it is clearly $\sigma$-additive,
and if $c \Rightarrow b$ then $\bar{c} + b = 1$ and multiplying both sides by $c$
we get $bc = c$, from where,
\[ \psi(ab) = \psi_c(abc) = \psi_c(ac) = \psi(a) \]
The map $\psi \to \psi_c$ is obviously linear, one-to-one and onto,
so it makes the two spaces isomorphic.
Consider now two propositions $b$ and $c$ such that $c, bc \in \mathcal{A}^*$. Then we can write:
\[ H(\mathcal{A}_{bc}) \cong H(b\mathcal{A}_c) \cong H_b(\mathcal{A}_c) \tag{21} \]
In other words each $\psi(\cdot, bc) \in H(\mathcal{A}_{bc})$ uniquely
defines a function $\phi(\cdot\, b, c) \in H_b(\mathcal{A}_c)$ and
that is all we can say. Since the $\psi$s are unnormalized we cannot write
a general product rule as in normalized standard probability theory.
Nevertheless, it is possible to justify a restricted product rule for
independence as we do in section 5 below. When $c = 1 \in \mathcal{A}$ we simply
write $H(\mathcal{A})$ instead of $H_1(\mathcal{A})$ or $H(\mathcal{A}_1)$.
4 The truth with $\psi$
The remarkable fact about the functions $\psi$ is that, without
committing to a particular value for $\psi(1)$ in $\mathcal{G}$, they still allow
us to tell which propositions are true. We show in this section that
$b \in \mathcal{A}$ is considered to be true by $\psi$ when
$\psi(\bar{b}a) = 0 \ \forall a \in \mathcal{A}$. By liberating ordinary probabilities from the
constraint that the probability of the whole space must always be fixed
at one, we make the space of all possible assignments of partial truth
into a Hilbert space without losing the ability to identify truth.
4.1 Propositions as operators
Each proposition $b \in \mathcal{A}$ defines two complementary linear
operators on $H_c$, by multiplication, $\hat{b}$, and by addition, $\check{b}$, to the first argument of $\psi$. In symbols
\begin{align}
\hat{b}\,\psi(a,c) &= \psi(ab,c) \tag{22} \\
\check{b}\,\psi(a,c) &= \psi(a+b,c) \tag{23}
\end{align}
To simplify the notation we often omit the hats and simply write
$b\psi$ instead of $\hat{b}\psi$. From $bb = b$ and $b+b = b$ it follows
that $\hat{b}$ and $\check{b}$ are projectors and therefore they are
self-adjoint with eigenvalues either 0 or 1. We can write,
Theorem 1
The following two complementary statements are true.
- If $\psi \in H_c$ is an eigenvector of the operator $\hat{b}$
with eigenvalue 1 then $\psi(b) = \psi(1)$ and we say that $\psi$
makes $b$ true conditional on $c$. Conversely, if $c \Rightarrow b$
then every $\psi \in H_c$ is an eigenvector of the operator
$\hat{b}$ with eigenvalue 1.
- If $\psi \in H_c$ is an eigenvector of the operator
$\check{b}$ with eigenvalue 1 then $\psi(b) = 0$ and we say that
$\psi$ makes $b$ false conditional on $c$. Conversely, if
$c \Rightarrow \bar{b}$ then every $\psi \in H_c$ is an eigenvector
of the operator $\check{b}$ with eigenvalue 1.
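Proof
- For the first statement,
\[ \hat{b}\psi = \psi \ \Rightarrow\ \psi(ab,c) = \psi(a,c) \ \forall a \in \mathcal{A} \ \Rightarrow\ \psi(b,c) = \psi(1,c) \]
where the last implication follows by taking $a = 1$. Conversely,
if $c \Rightarrow b$ then from (12) we have,
\[ \psi(ab,c) = \psi(a,c) \quad \forall a \in \mathcal{A} \]
Thus, $\hat{b}\psi = \psi$.
- For the second statement,
\[ \check{b}\psi = \psi \ \Rightarrow\ \psi(a+b,c) = \psi(a,c) \ \forall a \in \mathcal{A} \ \Rightarrow\ \psi(b,c) = \psi(0,c) \]
where the last implication follows by taking $a = 0$. Conversely,
\[ (c \Rightarrow \bar{b}) \ \Rightarrow\ \bar{c} + \bar{b} = 1 \ \Rightarrow\ a + bc = a \quad \forall a \in \mathcal{A} \]
Now from the fact that $\psi$ is a function and applying (12)
twice we have,
\[ \psi(a+b,c) = \psi((a+b)c,c) = \psi(ac+bc,c) = \psi(ac,c) = \psi(a,c) \]
Thus, $\check{b}\psi = \psi$ ∎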
The following theorem elaborates on the same theme.
Theorem 2
Let $b \in \mathcal{A}$ be an arbitrary proposition in a $\sigma$-algebra of
propositions in $X$ and let $\psi \in H(\mathcal{A})$.
The following are all equivalent:
- 1. $b\psi = \psi$, i.e., $\psi$ makes $b$ true.
- 2. $\bar{b}\psi = 0$, i.e., $\psi$ makes $\bar{b}$ false.
- 3. $\|\bar{b}\psi\| = 0$
- 4. $\|\psi\| = \|b\psi\|$
Proof:
We show that $1 \Leftrightarrow 2 \Leftrightarrow 3 \Leftrightarrow 4$.
The first equivalence follows from $\psi = b\psi + \bar{b}\psi$, the second
equivalence is a property of the norm and the third equivalence is
Pythagoras' theorem since $(b\psi) \perp (\bar{b}\psi)$ ∎
It is evident from this last theorem that the norm in the Hilbert
spaces $H_c$ provides a mechanism for translating the Clifford
numbers $\psi(b)$ assigned to the propositions in $\mathcal{A}$ by a function
$\psi \in H(\mathcal{A})$ into positive real numbers
measuring how close $\psi$ is to making the proposition $b$ true.
It is also clear from (24) and (25)
that it is the square of the norm, and not just the norm, that is needed.
It is only with the square of the norms that we can say that the
amount of truth of $\bar{b}$ (measured by $\|\bar{b}\psi\|^2$)
equals the amount of truth assigned to the true proposition
(measured by $\|1\psi\|^2$) minus the amount of truth assigned to
$b$ (measured by $\|b\psi\|^2$).
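The bookkeeping of squared norms can be illustrated numerically. The following is a minimal sketch under stated assumptions: the four elementary propositions and the values of `psi` are invented for illustration, and complex numbers stand in for the Clifford numbers, so that $\langle M^\dagger N \rangle_0$ becomes the real part of $\bar{M} N$. The names `hat`, `inner` and `norm2` are hypothetical.

```python
# Hypothetical elementary propositions and an illustrative complex-valued psi.
X = ["x1", "x2", "x3", "x4"]
psi = {"x1": 1 + 2j, "x2": -0.5j, "x3": 3 + 0j, "x4": 0j}


def hat(b, psi):
    """(b psi)(x) = psi(bx): bx = x when x is in b, and bx = 0 otherwise."""
    return {x: (v if x in b else 0j) for x, v in psi.items()}


def inner(phi, psi):
    """Inner product (17), with <M^dagger N>_0 taken as Re(conj(M) * N)."""
    return sum((phi[x].conjugate() * psi[x]).real for x in phi)


def norm2(psi):
    """Squared norm of psi."""
    return inner(psi, psi)


b = {"x1", "x3"}
not_b = set(X) - b

b_psi, not_b_psi = hat(b, psi), hat(not_b, psi)
print(norm2(psi), norm2(b_psi), norm2(not_b_psi), inner(b_psi, not_b_psi))

# A psi supported inside b makes b true: the component along not-b vanishes.
psi_true = {"x1": 2j, "x2": 0j, "x3": -1 + 0j, "x4": 0j}
print(norm2(hat(not_b, psi_true)), hat(b, psi_true) == psi_true)
```

The two projected components are orthogonal, so their squared norms add up to the squared norm of $\psi$, which is the statement that the truth of $\bar{b}$ equals the whole truth minus the truth of $b$.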
4.2 Commutativity, orthogonality and a Clifford number times a proposition
Propositional operators can be composed to form other operators. Thus, if $a, b \in \mathcal{A}$ and $\psi \in H(\mathcal{A})$ we have,
\[ \hat{a}(\hat{b}\,\psi(x)) = \psi(abx) = \widehat{(ab)}\,\psi(x) \tag{26} \]
(27)
(28)
and we can see that checks commute with other checks and hats commute with
other hats, but in general, hats don't commute with checks.
If $A \in \mathcal{G}$ and $b \in \mathcal{A}$ we can define the
operator $Ab$ by,
\[ (Ab)\psi(x) = A(b\psi(x)) = A\psi(bx) \tag{29} \]
and similarly for $A\check{b}$. These definitions allow a very rich algebra
of operators that mix Boolean and Clifford algebra properties in new ways.
One particularly interesting example of this kind of mix is given by the
following statement: mutually exclusive propositions are orthogonal.
More explicitly, if $a, b \in \mathcal{A}$ and $\psi_1, \psi_2 \in H(\mathcal{A})$ then $ab = 0 \Rightarrow \langle a\psi_1, b\psi_2 \rangle = 0$ and
Pythagoras' theorem holds,
\[ \| a\psi_1 + b\psi_2 \|^2 = \|a\psi_1\|^2 + \|b\psi_2\|^2 \tag{30} \]
5 Independence
If the Clifford number $\psi(a,c)$ is interpreted as a representation of the
partial truth of $a$ when we assume $c$ to be certain, then there is only one
rational way to define independence, namely:
Preliminary Definition: We say that $\psi$ makes propositions $a$
and $b$ in $\mathcal{A}$ logically independent conditionally on
$c \in \mathcal{A}^*$ if the additional knowledge of one of them does
not affect the value of $\psi$ for the other, i.e.,
\[ \psi(a,bc) = \psi(a,c) \quad \text{and} \quad \psi(b,ac) = \psi(b,c) \tag{31} \]
whenever the conditional $\psi$s exist.
5.1 A restricted product rule
If we try to find the value of $\psi(ab,c)$ in terms of the partial truths
that $\psi$ assigns to $a$ and $b$, then the most general relation is,
\[ \psi(ab,c) = F(\psi(a,c), \psi(a,bc), \psi(b,c), \psi(b,ac)) \tag{32} \]
where $F$ is an arbitrary function of its arguments. If we assume further
that $a$ and $b$ are logically independent conditionally on $c$ then using (31) the most general relation becomes,
\[ \psi(ab,c) = F(\psi(a,c), \psi(b,c)) \tag{33} \]
Let $u = \psi(a,c)$, $v = \psi(b,c)$ and $w = \psi(d,c)$ and use the commutativity
and associativity properties of the logical product to get the following two
properties for the function $F$:
\begin{align}
F(u,v) &= F(v,u) \tag{34} \\
F(F(u,v),w) &= F(u,F(v,w)) \tag{35}
\end{align}
In other words, $F$ must be symmetric and it must satisfy the usual
associativity equation. If the $\psi$s take values only on a
commutative subspace of $\mathcal{G}$ (e.g. reals, complex numbers or pseudoscalars)
then the only solution is $F(u,v) = uv$ (see []), but this
cannot be the solution if $uv \neq vu$. Given that $F(u,v)$ must be
symmetric, and that it must reduce to $uv$ when $u$ and $v$ commute,
an obvious solution is given by the symmetrization of the product,
i.e. $(uv+vu)/2$. In principle, it seems feasible that a modification
of the standard argument of Aczel (see [] or
[]) may yield the symmetrized product as the unique
solution of (35) and (34) for $u,v \in \mathcal{G}$,
at least for $u,v$ in some subset of $\mathcal{G}$ for which (35)
is still true. At the present time there is no such proof. In any case,
the lack of a uniqueness proof does not prevent us from turning the
formula into the definition of independence. If it turns out that
there are multiple solutions (which seems highly unlikely) the results
obtained from this particular solution will still be valid. Thus, from
now on we say that $\psi$ makes $a$ and $b$ (logically) independent
given $c$ if,
\[ \psi(ab,c) = \frac{1}{2} \left[ \psi(a,c)\psi(b,c) + \psi(b,c)\psi(a,c) \right] \tag{36} \]
More generally we have,
Definition: We say that $\psi$ makes $a_1, a_2, \ldots, a_n$
logically independent given $c$ if, for $k = 1, 2, \ldots, n$ and
$1 \le i_1 < i_2 < \cdots < i_k \le n$,
\[ \psi\Big( \prod_{j=1}^{k} a_{i_j}, c \Big) = \frac{1}{k!} \sum_{\sigma} \psi(a_{\sigma(i_1)},c)\, \psi(a_{\sigma(i_2)},c) \cdots \psi(a_{\sigma(i_k)},c) \tag{37} \]
where the sum runs over all the permutations $\sigma$ of
$(i_1, i_2, \ldots, i_k)$.
The associativity equation (35) imposes a heavy restriction
on the possible $\psi$ assignments for independent propositions. In fact
we have,
Theorem 3
If $\psi$ makes three or more propositions $a, b, d, \ldots$
independent conditionally on $c$ then the Clifford numbers
$u = \psi(a,c)$, $v = \psi(b,c)$, $w = \psi(d,c), \ldots$ are such that
each of them commutes with the anticommutator of any other two.
Proof:
From (37) it suffices to show that
$v$ must commute with $F(u,w) = (uw+wu)/2$ when
$a, b, d$ are independent given $c$.
The right hand side of (35) simplifies to,
\[ F(u,F(v,w)) = \frac{1}{4} \{ uvw - uwv + vwu - wvu \} \tag{38} \]
Similarly the left hand side of (35) is given by,
\[ F(F(u,v),w) = \frac{1}{4} \{ uvw - vuw + wuv - wvu \} \tag{39} \]
Equating (38) to (39) and simplifying
we get,
where $[u,v] = uv - vu$ denotes the usual commutator product ∎
Notice that when the Clifford numbers $u, v, w, \ldots$ either
commute or anticommute with each other then (40)
is true. But there are many other solutions. For example
(40) is also true when $u, v, w, \ldots$
are arbitrary vectors.
5.2 Independence and Orthogonality
The above definition for independence makes the following statement true:
Theorem 4
If $\psi(ab,c) = 0$ then $\psi(a,c)$ anticommutes with $\psi(b,c)$
when and only when $\psi$ makes $a$ and $b$ independent given $c$.
There is nothing like this in standard probability theory, where
mutually exclusive events that are possible (i.e. that have positive
probability) are never independent. We are used to thinking that this
makes sense, for if we know that one of the events happens then we
also know that the other could not happen. The events are totally
linked so they cannot be independent.
This is fine for real numbers, which are commutative, but not for
Clifford numbers. There is, however, an extreme case where the
above theorem is true even in standard probability theory. Suppose
that $a$, $b$ and $c$ are three mutually exclusive propositions. Then
$\psi(ab,c) = \psi(a,c) = \psi(b,c) = \psi(a,c)\psi(b,c) = 0$ and we would
have to say that $a$ and $b$ are independent given $c$ even though
neither $a$ nor $b$ is possible given $c$. Anticommutativity allows
this to happen even when $\psi(a,c)$ and $\psi(b,c)$ are not zero. Two
events can be completely linked (i.e. mutually exclusive) and at the
same time be logically independent from each other! This is as weird
as entanglement in quantum mechanics.
6 Flipping n coins
If $\mathcal{A}$ is a $\sigma$-algebra of propositions in $X$ then, by
the $\sigma$-additivity property, every
$\psi \in H(\mathcal{A})$ is completely specified on $\mathcal{A}$ by just giving
$\psi(x)$ for all $x \in X$, i.e.,
\[ \psi(x) = \sum_{y \in X} \psi(y)\, \delta_y(x) \tag{41} \]
where for $x, y \in X$, $\delta_y(x)$ is $1 \in \mathcal{G}$ if $x = y$ and $0 \in \mathcal{G}$ otherwise.
We consider the following special case.
6.1 The Binomial experiment with $\psi$s
Let $a$ be an arbitrary proposition and let $X = \{a, \bar{a}\}$ and $\mathcal{A} = \{1, 0, a, \bar{a}\}$. Clearly $\mathcal{A}$ is a
Boolean algebra of propositions in $X$. From (41) we
have
\[ \psi(x) = A\,\delta_a(x) + B\,\delta_{\bar{a}}(x) \tag{42} \]
where $\psi \in H(\mathcal{A})$ and $A, B \in \mathcal{G}$.
This is the canonical Bernoulli experiment. There are only two possible
outcomes $a$ and $\bar{a}$ with partial truths encoded by the Clifford
numbers $A = \psi(a)$ and $B = \psi(\bar{a})$.
As in standard probability
theory, consider now $n$ independent repetitions of the Bernoulli
experiment, i.e., consider $X^n$ with its
corresponding Boolean algebra $\mathcal{A}^n$ of elements in $X^n$
(see (5)).
From (41) a general $\psi_n \in H(\mathcal{A}^n)$ is given by,
\[ \psi_n(x) = \sum_{y \in X^n} \psi_n(y)\, \delta_y(x) \tag{43} \]
From the assumption that $\psi_n$ makes the different repetitions independent,
we obtain, using (37), that
\[ \psi_n(x) = \psi_n(x_1, \ldots, x_n) = M_n(m(x)) \tag{44} \]
where for each integer $k$ with $0 \le k \le n$, $M_n(k) \in \mathcal{G}$
is the symmetrization of the product $A^k B^{n-k}$ and $m(x)$ is the number
of $a$'s in $x$. Here are some examples for $n = 2$ and $n = 3$,
\begin{align*}
\psi_2(a,a) &= A^2 = M_2(2), \qquad \psi_2(a,\bar{a}) = \psi_2(\bar{a},a) = \frac{1}{2}[AB + BA] = M_2(1) \\
\psi_3(a,\bar{a},a) &= \psi_3(\bar{a},a,a) = \frac{1}{3}[A^2B + ABA + BA^2] = M_3(2).
\end{align*}
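The symmetrized products $M_n(k)$ can be computed by brute force for small $n$. The following is a minimal sketch under stated assumptions: the Clifford numbers are represented by $2 \times 2$ integer matrices (one possible matrix representation, chosen here only for illustration), and the names `symmetrized`, `mat_mul`, `S1` and `S3` are hypothetical. Averaging over the $\binom{n}{k}$ distinct words gives the same result as averaging over all $n!$ permutations.

```python
from fractions import Fraction
from itertools import combinations

IDENTITY = ((1, 0), (0, 1))


def mat_mul(P, Q):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def mat_add(P, Q):
    """Sum of two 2x2 matrices."""
    return tuple(tuple(P[i][j] + Q[i][j] for j in range(2)) for i in range(2))


def symmetrized(A, B, n, k):
    """M_n(k): average of all distinct words with k factors A and n-k factors B."""
    total, count = ((0, 0), (0, 0)), 0
    for a_slots in combinations(range(n), k):
        word = IDENTITY
        for i in range(n):
            word = mat_mul(word, A if i in a_slots else B)
        total = mat_add(total, word)
        count += 1
    return tuple(tuple(Fraction(e, count) for e in row) for row in total)


# Two anticommuting matrices that square to the identity (illustrative choice).
S1 = ((0, 1), (1, 0))
S3 = ((1, 0), (0, -1))

# Two commuting (diagonal) matrices.
D1 = ((2, 0), (0, 3))
D2 = ((5, 0), (0, 7))

print(symmetrized(S1, S3, 2, 1))   # (AB + BA)/2
print(symmetrized(S1, S3, 3, 2))   # (A^2 B + ABA + B A^2)/3
print(symmetrized(D1, D2, 3, 2))   # reduces to A^2 B when A and B commute
```

For the anticommuting pair, $M_2(1)$ vanishes and $M_3(2)$ equals one third of $A^2B$, while for the commuting pair $M_3(2)$ is just $A^2B$.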
Now define the proposition $P_n^k \in \mathcal{A}^n$ by,
\[ P_n^k = \text{``exactly $k$ of the $n$ repetitions is an $a$''} \tag{45} \]
Recall that by (22) we have,
\[ P_n^k \psi_n(x) = \psi_n(P_n^k x) = \begin{cases} \psi_n(x) & \text{if } m(x) = k \\ 0 & \text{otherwise} \end{cases} \tag{46} \]
By the first part of Theorem 1, $\psi_n$ makes $P_n^k$
true when $P_n^k \psi_n = \psi_n$. So the question is: how far is
$\psi_n$ from making $P_n^k$ true? Answer:
$\|\psi_n - P_n^k \psi_n\|^2$.
6.2 Computation of $\|\psi_n - P_n^k \psi_n\|^2$
To compute this distance we use the fact that $P_n^k$ and its
negation in $\mathcal{A}^n$, $1 - P_n^k$, are mutually exclusive
propositions, hence orthogonal (see (30)), and by
Pythagoras,
\[ \|\psi_n - P_n^k \psi_n\|^2 = \|\psi_n\|^2 - \|P_n^k \psi_n\|^2 \tag{47} \]
Let us compute each of these terms. From (17),
\[ \|\psi_n\|^2 = \sum_{x \in X^n} \langle \psi_n^\dagger(x) \psi_n(x) \rangle_0. \tag{48} \]
and using (43) and (44) we can write,
\[ \psi_n(x) = \sum_{y \in X^n} M_n(m(y))\, \delta_y(x) \tag{49} \]
from where we obtain,
\begin{align*}
\psi_n^\dagger(x) \psi_n(x) &= \sum_{y_1, y_2 \in X^n} M_n^\dagger(m(y_1))\, M_n(m(y_2))\, \delta_{y_1}(x)\, \delta_{y_2}(x) \\
 &= M_n^\dagger(m(x))\, M_n(m(x))
\end{align*}
and replacing in (48) we get,
\[ \|\psi_n\|^2 = \sum_{x \in X^n} |M_n(m(x))|^2 = \sum_{j=0}^{n} \binom{n}{j} |M_n(j)|^2 \tag{50} \]
where $|M|^2 = \langle M^\dagger M \rangle_0$. The last equality follows from the fact that there are $\binom{n}{j}$
propositions in $X^n$ with exactly $j$ components equal to $a$. We use the
same fact again to compute the other norm in (47),
\[ \|P_n^k \psi_n\|^2 = \sum_{x \in X^n} \langle \psi_n^\dagger(P_n^k x)\, \psi_n(P_n^k x) \rangle_0 \tag{51} \]
to obtain,
\[ \|P_n^k \psi_n\|^2 = \binom{n}{k} |M_n(k)|^2 \tag{52} \]
Replacing (50) and (52) in (47) we
get,
\[ \|\psi_n - P_n^k \psi_n\|^2 = \sum_{j=0}^{n} \binom{n}{j} |M_n(j)|^2 - \binom{n}{k} |M_n(k)|^2 \tag{53} \]
Let us consider the proposition $P_{n,\varepsilon}^f \in \mathcal{A}^n$ defined by,
\[ P_{n,\varepsilon}^f = \text{``The observed frequency of $a$'s in $n$ independent repetitions is $k/n$ with } f - \varepsilon \le \frac{k}{n} \le f + \varepsilon\text{''} \tag{54} \]
In other words, for $x \in X^n$, $P_{n,\varepsilon}^f x \neq 0$ when and only
when the proportion of $a$'s in $x = (x_1, \ldots, x_n)$ is within $\varepsilon$ of the specified frequency $f$. The proposition
$P_{n,\varepsilon}^f$ is equal to the following disjunction of $2n\varepsilon + 1$ mutually
exclusive propositions $P_n^k$:
\[ P_{n,\varepsilon}^f = \sum_{k=n(f-\varepsilon)}^{n(f+\varepsilon)} P_n^k \tag{55} \]
hence, from (30) we get,
\[ \|P_{n,\varepsilon}^f \psi_n\|^2 = \sum_{k=n(f-\varepsilon)}^{n(f+\varepsilon)} \|P_n^k \psi_n\|^2 \tag{56} \]
and from (47) and (53) we can write,
\[ \|\psi_n - P_{n,\varepsilon}^f \psi_n\|^2 = \sum_{j=0}^{n} \binom{n}{j} |M_n(j)|^2 - \sum_{k=n(f-\varepsilon)}^{n(f+\varepsilon)} \binom{n}{k} |M_n(k)|^2 \tag{57} \]
In general this distance increases without limit as $n \to \infty$
but it can converge relative to the size of $\psi_n$. Let us define the
relative error by,
\[ D_{n,\varepsilon}^f = \frac{\|\psi_n - P_{n,\varepsilon}^f \psi_n\|^2}{\|\psi_n\|^2} \tag{58} \]
Using (57) and (50) we have,
\[ D_{n,\varepsilon}^f = 1 - \frac{\displaystyle\sum_{k=n(f-\varepsilon)}^{n(f+\varepsilon)} \binom{n}{k} |M_n(k)|^2}{\displaystyle\sum_{k=0}^{n} \binom{n}{k} |M_n(k)|^2} \tag{59} \]
We separate the computation of $D_{n,\varepsilon}^f$ into three
different cases.
6.3 Case: $AB = BA$
From (59) we can write the following,
Theorem 5
If $AB \neq 0$, $AB = BA$ and $|A^k B^{n-k}| = |A|^k |B|^{n-k}$ then,
\[ D_{n,\varepsilon}^f = 1 - \sum_{k=n(f-\varepsilon)}^{n(f+\varepsilon)} \binom{n}{k} p^k (1-p)^{n-k} \tag{60} \]
where,
\[ p = \frac{|A|^2}{|A|^2 + |B|^2} \tag{61} \]
Proof:
Under the conditions of the theorem we have,
\[ |M_n(k)|^2 = |A|^{2k} |B|^{2(n-k)} \]
Replacing this last equation in (59) and noticing that,
\[ \sum_{k=0}^{n} \binom{n}{k} |A|^{2k} |B|^{2(n-k)} = \left( |A|^2 + |B|^2 \right)^n \]
we immediately obtain (60) and (61) ∎
It is not always true that for $A, B \in \mathcal{G}$, $|AB| = |A|\,|B|$ even when
$AB = BA$ (take for example $A = 1 + \alpha u$, $B = 1 - \beta u$ for a unit
vector $u$ and scalars $\alpha$ and $\beta$), so the extra condition
besides commutativity is needed for the theorem to be true.
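This example can be checked directly. The following is a minimal sketch, assuming the two-dimensional algebra spanned by $1$ and a single unit vector $u$ with $u^2 = 1$; a pair `(x, y)` stands for $x + yu$, and the names `mul` and `norm2` are hypothetical.

```python
def mul(M, N):
    """Product in the algebra spanned by 1 and a unit vector u (u*u = 1)."""
    (x1, y1), (x2, y2) = M, N
    return (x1 * x2 + y1 * y2, x1 * y2 + y1 * x2)


def norm2(M):
    """|M|^2 = <M^dagger M>_0; reversion fixes scalars and vectors."""
    x, y = M
    return x * x + y * y


alpha, beta = 2, 3
A, B = (1, alpha), (1, -beta)          # A = 1 + alpha*u, B = 1 - beta*u

print(mul(A, B), mul(B, A))            # the two products agree
print(norm2(mul(A, B)), norm2(A) * norm2(B))

# With alpha = beta = 1 the product of two nonzero elements vanishes.
print(mul((1, 1), (1, -1)))
```

Here $A$ and $B$ commute, yet $|AB|^2 = |A|^2|B|^2 - 4\alpha\beta$, so the norm is multiplicative only when $\alpha\beta = 0$.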
6.4 Case: AB = 0
Unlike the real (or complex) numbers, the product of nonzero Clifford
numbers can be zero (e.g. take $\alpha = \beta = 1$ in the example
above), so this case is not trivial. When $AB = 0$ the following is true,
Theorem 6
If AB = 0 then,
Proof:
Notice that when $AB = 0$ all the symmetrized products, except the
two extremes, are zero, i.e., $M_n(k) = 0$ for all $0 < k < n$, and
$M_n(n) = A^n$ and $M_n(0) = B^n$. Substituting these values into
(59) we obtain (62) ∎
6.5 Case: AB = -BA
When $A$ and $B$ anticommute we have,
Theorem 7
If $AB \neq 0$, $AB = -BA$ and $|A^k B^{n-k}| = |A|^k |B|^{n-k}$ then,
\[ D_{n,\varepsilon}^f = 1 - \frac{\displaystyle\sum_{k=n(f-\varepsilon)}^{n(f+\varepsilon)} b_n(k) (1 - 2 l_n(k))^2}{\displaystyle\sum_{k=0}^{n} b_n(k) (1 - 2 l_n(k))^2} \tag{63} \]
where $b_n(k)$ are the binomial probabilities,
\[ b_n(k) = \binom{n}{k} p^k (1-p)^{n-k}, \quad \text{with $p$ as before, i.e., } p = \frac{|A|^2}{|A|^2 + |B|^2} \tag{64} \]
and the numbers $l_n(k)$ satisfy $l_n(k) = l_n(n-k)$ and,
for $k \le n/2$, $l_n(k)$ is the chance of drawing an odd number of
RED balls out of $k$ draws without replacement from a box containing either
$n/2$ REDS and $n/2$ BLUES if $n$ is even, or $(n+1)/2$ REDS and $(n-1)/2$ BLUES
if $n$ is odd.
Proof:
Recall that $M_n(k)$ is the symmetrization of $A^k B^{n-k}$, i.e., the
average over all the permutations of $A^k B^{n-k}$. There are $\binom{n}{k}$
permutations and, by the assumed anticommutativity of $A$ with $B$, each
permutation is either $A^k B^{n-k}$ or $-A^k B^{n-k}$, so we have,
\[ M_n(k) = \frac{r(n,k)}{\binom{n}{k}}\, A^k B^{n-k} \tag{65} \]
where $r(n,k)$ is an integer. From the fact that $|M_n(k)|$ is
invariant under the transformation $A \to B$, $B \to A$, and
$k \to (n-k)$ it follows that $|r(n,k)| = |r(n,n-k)|$.
In order to prove the theorem it is
sufficient to show that,
\[ \frac{|r(n,k)|}{\binom{n}{k}} = |1 - 2 l_n(k)| \tag{66} \]
since if (66) is true, by using the conditions of the theorem we have,
\[ |M_n(k)|^2 = (1 - 2 l_n(k))^2\, |A|^{2k} |B|^{2(n-k)} \tag{67} \]
and dividing the numerator and the denominator of (58) by
$(|A|^2 + |B|^2)^n$ we obtain (63).
Let us show that (66) is true by giving an explicit
formula for $|r(n,k)|$ when $k \le n/2$. To do this, represent each
permutation of $A^k B^{n-k}$ by the $k$ integers
$(j_1 j_2 \ldots j_k)$ that correspond to the positions of the $A$'s in increasing
order. For example, for $n = 6$ and $k = 3$, the permutation $ABABBA$ is
represented by $(136)$, since the $A$s are found at positions 1, 3
and 6. The permutation $AABBBA$ is represented by $(126)$, etc. Define
the parity of $(j_1 \ldots j_k)$ as
\[ \text{parity of } (j_1 j_2 \ldots j_k) = (-1)^{j_1 + j_2 + \cdots + j_k} = (-1)^{j_1} (-1)^{j_2} \cdots (-1)^{j_k} \tag{68} \]
Note that the transposition of an $A$ with a $B$ located next to it
changes by one the position of that $A$ in the permutation, and hence
the parity of the permutation obtained after the transposition is
always the reverse of the parity of the original permutation. From
this and the fact that we can transform any permutation into any other
by a sequence of transpositions it follows that two permutations have the
same parity if and only if the number of flips (transpositions)
necessary for transforming one permutation into the other is even.
The permutation $A^k B^{n-k}$ always corresponds to $(1 2 \ldots k)$
and therefore an arbitrary permutation $(j_1 j_2 \ldots j_k)$ will
have the same parity as $A^k B^{n-k}$ if the parity of the number of
odd integers in the set $\{j_1, j_2, \ldots, j_k\}$ is the same as
the parity of the number of odd integers in the set
$\{1, 2, \ldots, k\}$. In other words, if there are an even number of odd
integers in the set $\{1, 2, \ldots, k\}$ then every permutation
$(j_1 j_2 \ldots j_k)$ which also contains an even number of odd
integers can be reordered into $A^k B^{n-k}$, but if the number of odd
integers in $\{j_1, \ldots, j_k\}$ is odd then the permutation
reorders into $-A^k B^{n-k}$. Therefore, we can write
\[ |r(n,k)| = \Big| \sum_{1 \le j_1 < j_2 < \cdots < j_k \le n} (-1)^{j_1 + j_2 + \cdots + j_k} \Big| \tag{69} \]
Thus, if we call $N_e$ the number of permutations with an even number
of odd integers among $\{j_1, \ldots, j_k\}$ and we call $N_o$ the
number of permutations with an odd number of odds, then,
\[ |r(n,k)| = |N_e - N_o| \tag{70} \]
Using the fact that $N_e + N_o = \binom{n}{k}$ we also have that,
\[ |r(n,k)| = \Big| \binom{n}{k} - 2 N_o \Big| \tag{71} \]
We now turn to the computation of $N_o$. Let $N_o(m)$ be the total
number of permutations $(j_1 j_2 \ldots j_k)$ with exactly $m$
of the positions of the $A$'s being odd. We have,
\[ N_o = \sum_{t=0}^{\lfloor (k-1)/2 \rfloor} N_o(2t+1) \tag{72} \]
where, for $0 \le m \le k \le n/2$,
\[ N_o(m) = \begin{cases} \dbinom{n/2}{m} \dbinom{n/2}{k-m} & \text{if $n$ is even} \\[2ex] \dbinom{(n+1)/2}{m} \dbinom{(n-1)/2}{k-m} & \text{if $n$ is odd} \end{cases} \tag{73} \]
This is because the set $\{1, 2, \ldots, n\}$ contains an equal number of odd
and even numbers when $n$ is even, but the number of odds is one more than
the number of evens when $n$ is odd.
So dividing (71) by $\binom{n}{k}$ and using (72) and
(73) we obtain (66) with $l_n(k)$ defined as
the theorem says. There are four different formulas for $l_n(k)$
depending on the parities of $n$ and $k$. Let us check one of them.
When $n$ and $k$ are both even and $k \le n/2$ we have,
\[ l_n(k) = \sum_{t=0}^{k/2-1} \frac{\dbinom{n/2}{2t+1} \dbinom{n/2}{k-2t-1}}{\dbinom{n}{k}} \tag{74} \]
and we can see that (74) is the chance of drawing an odd
number of red balls when drawing at random $k$ balls, without
replacement, from a box containing $n/2$ red balls and $n/2$ blue
balls. This completes the proof of the theorem ∎
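The counting argument can be checked by brute force for small $n$. The following is a minimal sketch: the names `r` and `lam` are hypothetical, `r(n, k)` is computed by counting sign flips for every arrangement, and `lam(n, k)` evaluates the urn probability directly for every $k$ (an assumption of this sketch; the theorem only uses the urn description for $k \le n/2$). Exact rational arithmetic is used so that (66) can be tested as an equality.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def r(n, k):
    """Signed count of the arrangements of k A's and n-k B's when AB = -BA.

    Moving every A to the front costs one sign flip per B that precedes it.
    """
    total = 0
    for positions in combinations(range(1, n + 1), k):
        # The i-th A (i = 1..k) sitting at position j has j - i B's before it.
        flips = sum(j - i for i, j in enumerate(positions, start=1))
        total += (-1) ** flips
    return total


def lam(n, k):
    """Chance of an odd number of reds in k draws without replacement.

    The box holds one red ball per odd position and one blue ball per even
    position of {1, ..., n}.
    """
    reds, blues = (n + 1) // 2, n // 2
    odd = sum(comb(reds, m) * comb(blues, k - m) for m in range(1, k + 1, 2))
    return Fraction(odd, comb(n, k))


for n, k in [(2, 1), (3, 1), (4, 2), (6, 3)]:
    print(n, k, r(n, k), Fraction(abs(r(n, k)), comb(n, k)), abs(1 - 2 * lam(n, k)))
```

In every row the last two columns agree, as (66) states.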
7 The weak law of large numbers
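7.1 Taking limits as $n \to \infty$
In this section we compute
\[ \lim_{n \to \infty} D_{n,\varepsilon}^f \]
for the three cases considered in the previous section.
Theorem 8
If $AB \neq 0$, $|A^k B^{n-k}| = |A|^k |B|^{n-k}$ and either $AB = BA$
or $AB = -BA$ then, for all sufficiently small $\varepsilon > 0$,
\[ \lim_{n \to \infty} D_{n,\varepsilon}^f = \begin{cases} 0 & \text{if } f = p \\ 1 & \text{if } f \neq p \end{cases} \tag{75} \]
where as before, $p = |A|^2 / (|A|^2 + |B|^2)$.
Moreover, if $AB = 0$, then $\forall \varepsilon > 0$,
\[ \lim_{n \to \infty} D_{n,\varepsilon}^f = \begin{cases} 0 & \text{if } (f = 0 \text{ and } |A| < |B|) \text{ or } (f = 1 \text{ and } |A| > |B|) \\ 1/2 & \text{if } |A| = |B| \text{ and either } f = 0 \text{ or } f = 1 \\ 1 & \text{otherwise} \end{cases} \tag{76} \]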
Proof
For the first part we use equations (60)
and (63). By the usual Gaussian approximation for the
binomial probabilities (e.g. see [] p.59) we have
that for any integers
$0 \le k_1 \le k_2 \le n$ and any function $g_n$ with finite expectation
with respect to the standard Gaussian,
\[ \sum_{k=k_1}^{k_2} b_n(k)\, g_n(k) = \int_{(k_1 - np)/\sqrt{npq}}^{(k_2 - np)/\sqrt{npq}} g_n\big(np + x\sqrt{npq}\big)\, \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\, dx\ \big(1 + o(n^0)\big) \tag{77} \]
with $q = 1 - p$. Thus, taking $k_1 = n(f-\varepsilon)$, $k_2 = n(f+\varepsilon)$,
$g_n(y) = 1$ for $0 \le y \le n$ and $g_n(y) = 0$ outside $[0,n]$,
we obtain from equation (60) that,
\[ \lim_{n \to \infty} D_{n,\varepsilon}^f = 1 - \lim_{n \to \infty} \int_{n(f-p-\varepsilon)/\sqrt{npq}}^{n(f-p+\varepsilon)/\sqrt{npq}} \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\, dx \tag{78} \]
Hence, when $f \neq p$, for any $0 < \varepsilon < |f-p|$ the limits of the
integral in equation (78) are both positive or both
negative and both go to infinity in absolute value as $n \to \infty$, so the desired
limit is $1 - 0 = 1$. On the other hand, when $f = p$, for any $\varepsilon > 0$
the desired limit is $1 - 1 = 0$, and this shows that
(75) is true for the commutative case.
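The convergence in the commutative case can also be observed numerically by evaluating (60) for increasing $n$. The following is a minimal sketch: the function names `binom_pmf` and `relative_error`, the values $|A| = 3$, $|B| = 4$ (so that $p = 0.36$) and the rounding of the window endpoints $n(f \pm \varepsilon)$ to integers are all assumptions made for illustration.

```python
from math import ceil, exp, floor, lgamma, log


def binom_pmf(n, k, p):
    """Binomial probability b_n(k), computed through logarithms to avoid overflow."""
    log_b = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
             + k * log(p) + (n - k) * log(1 - p))
    return exp(log_b)


def relative_error(n, f, eps, abs_A, abs_B):
    """Relative error of equation (60) for commuting A and B with the given norms."""
    p = abs_A ** 2 / (abs_A ** 2 + abs_B ** 2)
    lo = max(0, ceil(n * (f - eps)))     # first integer k inside the window
    hi = min(n, floor(n * (f + eps)))    # last integer k inside the window
    return 1.0 - sum(binom_pmf(n, k, p) for k in range(lo, hi + 1))


for n in (100, 1000, 4000):
    at_p = relative_error(n, 0.36, 0.05, 3, 4)     # f = p
    off_p = relative_error(n, 0.60, 0.05, 3, 4)    # f != p, eps < |f - p|
    print(n, at_p, off_p)
```

As $n$ grows the first column of errors shrinks toward 0 while the second stays close to 1, in agreement with (75).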
To show (75) for the anticommutative case we take
which increases like $y^6$ and therefore it has finite expectation
with respect to the standard Gaussian. If we show that
\[ g_n(k) = n^2 (1 - 2 l_n(k))^2 + o(n^0) \tag{80} \]
then it will follow from (80), (77) and
(63) that,
\[ \lim_{n \to \infty} D_{n,\varepsilon}^f = 1 - \lim_{n \to \infty} \frac{\displaystyle\int_{n(f-p-\varepsilon)/\sqrt{npq}}^{n(f-p+\varepsilon)/\sqrt{npq}} g_n\big(np + x\sqrt{npq}\big)\, \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\, dx}{\displaystyle\int_{-np/\sqrt{npq}}^{nq/\sqrt{npq}} g_n\big(np + x\sqrt{npq}\big)\, \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\, dx} \tag{81} \]
and by the same reasoning as in the commutative case we
obtain (75) for the anticommutative case.
Let us then show (80). Notice that from (74)
we can write,
$$ l_n(k) = \sum_{t=0}^{k/2-1} W(n,2t+1,k) \tag{82} $$
where the hypergeometric probabilities,
\begin{align}
W(n,m,k) &= \frac{\binom{n/2}{m}\binom{n/2}{k-m}}{\binom{n}{k}} \notag \\
&= \frac{\left[\frac{1}{n^{m}}\binom{n/2}{m}\right]\left[\frac{1}{n^{k-m}}\binom{n/2}{k-m}\right]}{\left[\frac{1}{n^{k}}\binom{n}{k}\right]} \tag{83} \\
&= \frac{\left[\frac{1}{m!}\,\frac{1}{2}\left(\frac{1}{2}-\frac{1}{n}\right)\cdots\left(\frac{1}{2}-\frac{m-1}{n}\right)\right]\left[\frac{1}{(k-m)!}\,\frac{1}{2}\left(\frac{1}{2}-\frac{1}{n}\right)\cdots\left(\frac{1}{2}-\frac{k-m-1}{n}\right)\right]}{\left[\frac{1}{k!}\,1\left(1-\frac{1}{n}\right)\left(1-\frac{2}{n}\right)\cdots\left(1-\frac{k-1}{n}\right)\right]} \notag
\end{align}
expanding the products up to terms of order
$1/n$ and letting $W = W(n,m,k)$ we have,
\begin{align}
W &= \binom{k}{m} \frac{\left[2^{-m}\left\{1 - \frac{m(m-1)}{n} + o(n^{-1})\right\}\right]\left[2^{m-k}\left\{1 - \frac{(k-m)(k-m-1)}{n} + o(n^{-1})\right\}\right]}{\left(1-\frac{1}{n}\right)\left(1-\frac{2}{n}\right)\cdots\left(1-\frac{k-1}{n}\right)} \notag \\
&= \binom{k}{m}\left(\frac{1}{2}\right)^{k}\left\{1 - \frac{1}{n}\left[m^2+(k-m)^2-k\right] + o(n^{-1})\right\}\left\{1 + \frac{k(k-1)}{n} + o(n^{-1})\right\} \notag \\
&= \binom{k}{m}\left(\frac{1}{2}\right)^{k}\left\{1 + \frac{2}{n}\,m(k-m) + o(n^{-1})\right\} \tag{84}
\end{align}
We can readily check that,
$$ \sum_{t=0}^{k/2-1} \binom{k}{2t+1}\left(\frac{1}{2}\right)^{k} = \frac{1}{2} \tag{85} $$
and that,
$$ \sum_{t=0}^{k/2-1} (2t+1)(k-2t-1)\binom{k}{2t+1} = \frac{1}{4}\,k(k-1)\,2^{k-1} \tag{86} $$
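Both identities can be checked by brute force in exact arithmetic. This is a minimal Python sketch with helper names of our own choosing. It sums over every odd $m = 2t+1 \le k$ and checks the range $3 \le k \le 12$ only.

```python
from fractions import Fraction
from math import comb

def odd_binomial_sum(k: int) -> Fraction:
    """Left side of (85): sum over odd m of C(k, m) (1/2)^k."""
    return sum(Fraction(comb(k, m), 2 ** k) for m in range(1, k + 1, 2))

def odd_weighted_sum(k: int) -> int:
    """Left side of (86): sum over odd m of m (k - m) C(k, m)."""
    return sum(m * (k - m) * comb(k, m) for m in range(1, k + 1, 2))

for k in range(3, 13):
    assert odd_binomial_sum(k) == Fraction(1, 2)
    # Right side of (86): (1/4) k (k-1) 2^(k-1) = k (k-1) 2^(k-3), an integer for k >= 3
    assert odd_weighted_sum(k) == k * (k - 1) * 2 ** (k - 3)
```

Using `Fraction` and integer arithmetic avoids any rounding, so the assertions test the identities themselves rather than a floating-point approximation.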
From (85), (86), (84) and
(82) we have,
$$ l_n(k) - \frac{1}{2} = \frac{k(k-1)}{4n} + o(n^{-1}). \tag{87} $$
Squaring both sides of (87) and multiplying
through by 4n2 we obtain,
$$ n^2\bigl(1-2 l_n(k)\bigr)^2 = \frac{1}{4}\,k^2(k-1)^2 + o(n^0) \tag{88} $$
which is exactly (80). This ends the proof
for the anticommutative case. The second part of the theorem
i.e. (76) follows directly from (62)
by taking limits as $n \to \infty$ ·
7.2 Flipping an infinite number of coins
As in standard probability theory there is a subtle nuisance with
limits such as (75) and (76) that needs
to be faced in order to have a straight probabilistic interpretation
for laws of large numbers. The problem with (75) and
(76) is that it is not clear how to paste all the
$\psi_n$ together into one global $\psi_\infty$. It was
due to this kind of problem that modern measure-theoretic
probability theory was born.
To be able to make statements about infinite sequences of Bernoulli
trials we need to specify a Boolean $\sigma$-algebra, $\mathcal{A}_\infty$,
that contains at least those statements. This can be done as in
standard probability theory (e.g. see []), i.e.
$\mathcal{A}_\infty$ is defined as the smallest $\sigma$-algebra containing
the cylinder sets; in particular it contains the propositions
$P_n^k$ defined in (45) but now $n$ refers to the first
$n$ repetitions in an infinite sequence of Bernoulli trials. Having
constructed $\mathcal{A}_\infty$ we also need to construct the Hilbert space,
$H(\mathcal{A}_\infty)$, containing the functions $\psi = \psi_\infty$.
Again, the construction is not trivial but well known in functional
analysis as the standard construction of an infinite tensor product of
Hilbert spaces (e.g. see []). These standard
constructions allow us to write,
where $\psi = \psi_\infty \in H(\mathcal{A}_\infty)$. Equation
(89) can be used to re-write the statements
(75) and (76) as,
Theorem 9
Let $X_\infty$ be the space of infinite sequences of independent
tosses of a coin and let $\mathcal{A}_\infty$ be the smallest
$\sigma$-algebra containing all the propositions $P_n^k$ about
elements in $X_\infty$. If for each toss the $\psi$ values for
falling heads and tails are the Clifford numbers $A$ and $B$
satisfying,
- $|A|^2+|B|^2 = 1$
- $AB \neq 0$
- either $AB = BA$ or $AB = -BA$
- $|A^k B^{n-k}| = |A|^k |B|^{n-k}$ for all $n \in \mathbb{N}$ and all $0 \le k \le n$.
Then, for all sufficiently small $\epsilon > 0$ the propositions
$P_{\infty,\epsilon}^{|A|^2}$
are true.
Proof
Under the conditions of the theorem we have from (89) and
(75) that when the $\psi_n$ are normalized, i.e. when
$\|\psi_n\| = 1$ for all $n$, then
$$ \|\psi - P_{n,\epsilon}^{|A|^2}\psi\| \to 0 \quad \text{as } n \to \infty $$
or equivalently,
$$ \lim_{n\to\infty} P_{n,\epsilon}^{|A|^2}\,\psi = P_{\infty,\epsilon}^{|A|^2}\,\psi = \psi \tag{90} $$
so that $\psi$ is an eigenvector of the operator
$P_{\infty,\epsilon}^{|A|^2}$ with eigenvalue 1 and thus, it makes
the proposition true ·
We also have,
Theorem 10
Let $X_\infty$ and $\mathcal{A}_\infty$ be as in the previous theorem but
now suppose that the Clifford numbers $A$ and $B$ satisfy,
- $AB = 0$
- $|A| > |B|$
Then for all $\epsilon > 0$ the propositions
$P_{\infty,\epsilon}^{1}$
are true.
Proof
Under the conditions of the theorem we have from (89) and
(76) that when the $\psi_n$ are all of unit norm then
$$ \|\psi - P_{n,\epsilon}^{1}\psi\| \to 0 \quad \text{as } n \to \infty $$
or equivalently,
$$ \lim_{n\to\infty} P_{n,\epsilon}^{1}\,\psi = P_{\infty,\epsilon}^{1}\,\psi = \psi \tag{91} $$
so that $\psi$ is an eigenvector of the operator
$P_{\infty,\epsilon}^{1}$ with eigenvalue 1 and thus, it makes
the proposition true ·
7.3 Interpretation and examples
The previous two theorems can be interpreted as in standard
probability theory. They say that an infinite sequence of independent
tosses of a coin with $\psi(\text{heads}) = A$ and $\psi(\text{tails}) = B$
will have for sure (relative to $\psi$) a frequency of heads within
$\epsilon$ of $|A|^2$ in the first case and within $\epsilon$ of
1 in the $AB = 0$ case. When $AB = 0$ the theorem assures us that
(again relative to $\psi$) the coin will show up heads with frequency
100% whenever $|A| > |B|$!
The four conditions on $A$ and $B$ that are needed for the $AB \neq 0$
case impose heavy restrictions on the possible values that $A$ and
$B$ can take but there are lots of examples. Let $p$ be a real number
in the interval $[0,1]$ and consider,
- Example 1
-
- Example 2
-
where $\hat{B} = \sigma_1\sigma_2\cdots\sigma_r$ is a unit blade,
i.e. it can be factorized into a product of orthogonal (anticommuting)
unit vectors $\sigma_j$.
- Example 3
-
where $\hat{A}$ and $\hat{B}$ are both unit blades possibly of different
dimensions.
- Example 4
-
$$ A = \sqrt{p}\; e^{ia}\,\hat{A}, \qquad B = \sqrt{1-p}\; e^{ib}\,\hat{B} \tag{95} $$
where $a$ and $b$ are scalars,
$\hat{A}$ and $\hat{B}$ are both unit blades and $i$ is any
multivector such that $i^2 = -1$ and $i$ commutes or anticommutes
with both $\hat{A}$ and $\hat{B}$, i.e. $i\hat{A} = \pm\hat{A}i$ and
$i\hat{B} = \pm\hat{B}i$.
It can be readily checked that all these examples satisfy the four
conditions of the theorem and hence, coin tosses with these $\psi$'s will
show up heads with probability $p$.
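A quick numerical check of the four conditions is possible in a matrix representation. The sketch below is an illustration under stated assumptions, not the paper's own construction: it represents three orthonormal vectors by the Pauli matrices, takes $|M|^2 = \mathrm{tr}(M^\dagger M)/2$ as the norm, and tests the special case of Example 4 with $a = b = 0$. All function names are ours.

```python
from math import isclose, sqrt

# 2x2 complex matrices as nested tuples; the Pauli matrices stand in for
# three orthonormal (mutually anticommuting) unit vectors.
S1 = ((0, 1), (1, 0))
S2 = ((0, -1j), (1j, 0))
S3 = ((1, 0), (0, -1))

def mul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def scale(c, a):
    return tuple(tuple(c * x for x in row) for row in a)

def norm(a):
    """Assumed norm: |M|^2 = trace(M^dagger M) / 2."""
    return sqrt(sum(abs(x) ** 2 for row in a for x in row) / 2)

def power(a, n):
    out = ((1, 0), (0, 1))
    for _ in range(n):
        out = mul(out, a)
    return out

def close(a, b):
    return all(abs(a[i][j] - b[i][j]) < 1e-12 for i in range(2) for j in range(2))

def satisfies_conditions(A, B, n_max=6):
    """Check the four conditions, the last one only for n up to n_max."""
    AB, BA = mul(A, B), mul(B, A)
    unit = isclose(norm(A) ** 2 + norm(B) ** 2, 1.0)
    nonzero = norm(AB) > 1e-12
    sign = close(AB, BA) or close(AB, scale(-1, BA))
    multiplicative = all(
        isclose(norm(mul(power(A, k), power(B, n - k))),
                norm(A) ** k * norm(B) ** (n - k))
        for n in range(1, n_max + 1) for k in range(n + 1))
    return unit and nonzero and sign and multiplicative

p = 0.3
A = scale(sqrt(p), S1)                     # sqrt(p) times a unit vector
B_anti = scale(sqrt(1 - p), S2)            # unit vector anticommuting with A
B_comm = scale(sqrt(1 - p), mul(S2, S3))   # unit bivector commuting with A
print(satisfies_conditions(A, B_anti), satisfies_conditions(A, B_comm))
```

A pair such as $\sigma_1$ and $(\sigma_1+\sigma_3)/\sqrt{2}$, which neither commute nor anticommute, fails the third condition under the same check.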
7.4 Why isn't everyone a frequentist?
For the same reason as in probability theory these laws of large
numbers cannot be used to define what we mean by the partial
truth that the coin will show up heads in the next toss, since the
theorem only says that the propositions $P_{\infty,\epsilon}^{p}$ are
made true by $\psi$. So any attempt to use the law of large
numbers as the definition of what $\psi$ is, or means, is
circular.
8 The Boolean algebra of Caticha's temporal filters
Let $X$ be a set and let $\mathcal{B}$ be a $\sigma$-algebra of subsets of
$X$. Notice that we are using the standard set notation for the
elements of $\mathcal{B}$ instead of the logical notation used in the rest of
the paper. The reason for changing the notation is that the Boolean
$\sigma$-algebra that we are trying to define is not $\mathcal{B}$ itself but
only based on $\mathcal{B}$. Think of $X$ as the set of possible locations
for a point particle and define the elementary propositions $e(x,t)$
by the statement: the particle is at location $x$ at time $t$.
As in [], $e(x,t)$ is a pure hypothesis, not the
result of a measurement. The truth value of $e(x,t)$
can be obtained, at least in principle, by imagining a filter
that covers all of $X$ except at location $x$ where it has an
infinitesimal hole. This magical filter materializes only
for an instant at time $t$ and then disappears leaving no trace
of its existence. If after time $t$ we still find the particle
somewhere then we conclude that $e(x,t)$ is true. These filters
form a Boolean algebra with the definitions below.
Let $T$ be a subset of the real line and define for
$t \in T$ and $B \in \mathcal{B}$ the proposition $e(B,t)$ as: an elementary
filter at time $t$ with $B$ open. Thus, $e(B,t)$ is true if and only
if the statement: the particle is somewhere in $B$ at time $t$
is true. We define the logical product of two elementary filters as
the operation of putting one on top of the other and we define the
negation of an elementary filter as the filter that closes the
holes and opens the rest. In symbols:
where $\bar{B} = X \setminus B$ is the complement of $B$ with respect to $X$.
Notice that (98) follows from (96) and
(97) by using De Morgan's law i.e.,
We also have that for all $s,t \in T$,
- "Filter at time $s$ followed (or on top of) filter at time $t$"
- "Filter at time $s$ OR filter at time $t$"
- "Barrier (nothing open) at time $t$" $= 0$ (100)
- "Absence of filter (all open) at time $t$" $= 1$ (101)
We define $\mathcal{F}$ as the smallest $\sigma$-algebra containing the elementary
filters $e(B,t)$ i.e.,
$$ \mathcal{F} = \sigma\{\, e(B,t) : B \in \mathcal{B},\ t \in T \,\} \tag{102} $$
The Boolean algebra of temporal filters $\mathcal{F}$ spells out the
usual algebra of events of a stochastic process with state space $X$.
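For filters acting at one and the same time, the verbal definitions above can be mimicked with finite sets. The sketch below is a toy model under our own assumptions: $X$ is a hypothetical four-point set, a filter is identified with its set of open locations, and stacking two filters is taken to leave open exactly the locations open in both. The names `product`, `negation` and `logical_sum` are ours.

```python
from itertools import combinations

X = frozenset(range(4))          # hypothetical finite set of particle locations

def product(a: frozenset, b: frozenset) -> frozenset:
    """Stack two same-time filters: only holes open in both stay open."""
    return a & b

def negation(b: frozenset) -> frozenset:
    """Close the holes and open the rest (complement with respect to X)."""
    return X - b

def logical_sum(a: frozenset, b: frozenset) -> frozenset:
    """OR of two same-time filters, built from product and negation by De Morgan."""
    return negation(product(negation(a), negation(b)))

BARRIER = frozenset()            # nothing open: the proposition 0
NO_FILTER = X                    # all open: the proposition 1

subsets = [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]
for a in subsets:
    assert product(a, negation(a)) == BARRIER
    assert logical_sum(a, negation(a)) == NO_FILTER
    for b in subsets:
        # De Morgan's law reproduces the union of the open sets
        assert logical_sum(a, b) == a | b
```

The model covers only filters at a single time; products of filters at different times are not sets of locations and are outside this sketch.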
8.1 The Markov Property
Due to the fact that there is no product rule for the unnormalized
$\psi$'s we cannot make use of the standard Markov property of
probability theory directly. The following definition is all that
is needed to recover nonrelativistic quantum mechanics,
Definition:
$\psi \in H(\mathcal{F})$ is said to have independent segments given
$c \in \mathcal{F}$ if for all
$n = 1,2,\ldots$, all times $t_1 < t_2 < \cdots < t_n$
in $T$ and all locations $x_1,x_2,\ldots,x_n$ in $X$ the
propositions
$$ e(x_1,t_1)e(x_2,t_2),\ e(x_2,t_2)e(x_3,t_3),\ \ldots,\ e(x_{n-1},t_{n-1})e(x_n,t_n) $$
are independent given $c$.
8.2 Time evolution and the Schrödinger equation
When $\psi \in H(\mathcal{F})$ has independent segments, it evolves according
to the Schrödinger equation. The usual jargon of quantum mechanics
is recovered with the notation,
- Probability Amplitude:
- $\psi(e(x,s)e(y,t), e(x_0,t_0))$ is
the amplitude for the particle to go from location $x$ at time $s$ to
location $y$ at time $t > s$ given that it was initially prepared
at location $x_0$ at time $t_0$. We denote this amplitude
by $K(y,t;x,s)$.
- Wave Function:
- $\psi(e(x,t), e(x_0,t_0))$ is the amplitude of
going from the initial position to location $x$ at time $t$. It is
often denoted by just $\Psi(x,t)$.
Thus, with this notation, a particle which is prepared by
$e(x_0,t_0)$ and for which $\psi \in H(\mathcal{F})$ has independent
segments conditionally on this preparation, will satisfy,
$$ \Psi(x,t) = \sum_{y \in X} \frac{1}{2}\bigl[ K(x,t;y,s)\Psi(y,s) + \Psi(y,s)K(x,t;y,s) \bigr] \tag{103} $$
since
\begin{align}
\Psi(x,t) &= \sum_{y \in X} \psi\bigl(e(x,t)e(y,s),\, e(x_0,t_0)\bigr) \notag \\
&= \sum_{y \in X} \psi\bigl([e(x_0,t_0)e(y,s)]\,[e(y,s)e(x,t)],\, e(x_0,t_0)\bigr) \notag
\end{align}
taking derivatives in (103) with respect to $t$ and evaluating
at $t = s$ we obtain,
$$ \left.\frac{\partial \Psi(x,t)}{\partial t}\right|_{t=s} = \sum_{y \in X} \frac{1}{2}\left[ \left.\frac{\partial K(x,t;y,s)}{\partial t}\right|_{t=s} \Psi(y,s) + \Psi(y,s) \left.\frac{\partial K(x,t;y,s)}{\partial t}\right|_{t=s} \right] $$
Defining the Hamiltonian H by,
$$ \left.\frac{\partial K(x,t;y,s)}{\partial t}\right|_{t=s} = -\frac{i}{\hbar}\, H(x,y,s) \tag{104} $$
where $i$ is any multivector that squares to $-1$ and commutes
with all the $\psi$'s. Relabeling $s$ as $t$, we can write the
Schrödinger equation for possibly non-commuting $\psi$'s as,
$$ i\hbar\, \frac{\partial \Psi(x,t)}{\partial t} = \sum_{y \in X} \frac{1}{2}\bigl[ H(x,y,t)\Psi(y,t) + \Psi(y,t)H(x,y,t) \bigr] \tag{105} $$
When the wave functions $\Psi$ commute with the Hamiltonian (e.g.
when all the $\psi$'s take values in a commutative subspace of $\mathcal{G}$),
(105) reduces to the usual Schrödinger equation.
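The commutative reduction can be tried on a toy system. The sketch below is only an illustration under our own assumptions: $X$ has two sites, amplitudes are ordinary complex numbers (so they commute with $H$), $\hbar = 1$, and `H` is a hypothetical real symmetric matrix, for which the order of its two location arguments is immaterial. It integrates the symmetrized right-hand side of (105) with a fourth-order Runge-Kutta step and compares the result with the closed-form solution for this particular `H`.

```python
from math import cos, sin

HBAR = 1.0                                   # units with hbar = 1 (assumption)
H = [[0.0, 0.5], [0.5, 0.0]]                 # hypothetical two-site Hamiltonian H(x, y)

def rhs(psi):
    """dPsi/dt from (105): symmetrized product (H Psi + Psi H) / 2, divided by i*hbar."""
    out = []
    for x in range(len(psi)):
        total = sum(0.5 * (H[x][y] * psi[y] + psi[y] * H[x][y])
                    for y in range(len(psi)))
        out.append(total / (1j * HBAR))
    return out

def rk4_step(psi, dt):
    def shifted(base, slope, h):
        return [b + h * s for b, s in zip(base, slope)]
    k1 = rhs(psi)
    k2 = rhs(shifted(psi, k1, dt / 2))
    k3 = rhs(shifted(psi, k2, dt / 2))
    k4 = rhs(shifted(psi, k3, dt))
    return [p + dt / 6 * (a + 2 * b + 2 * c + d)
            for p, a, b, c, d in zip(psi, k1, k2, k3, k4)]

def evolve(psi0, t, steps=1000):
    psi = list(psi0)
    for _ in range(steps):
        psi = rk4_step(psi, t / steps)
    return psi

# Particle prepared at site 0; closed form for this H is (cos(gt), -i sin(gt)), g = 0.5.
psi_t = evolve([1 + 0j, 0j], t=2.0)
exact = [cos(0.5 * 2.0), -1j * sin(0.5 * 2.0)]
print(psi_t, exact)
```

Because the amplitudes here commute with `H`, the two terms in the bracket of (105) are equal and the update is the ordinary matrix form of the Schrödinger equation; a genuinely non-commuting example would need a multivector or matrix-valued $\Psi$.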
9 Next:
- Using the Spacetime algebra
- How to connect the above
with the Dirac-Hestenes equation.
- y assignments in the real continuous case
- Minimum Fisher information and the Huber-Frieden derivation of the
time independent Schrödinger equation.
- y and Brownian motion
- Nagasawa's diffusion model.
- Comments and conclusion
- What the hell is this all about
and what it may be likely to become....
Footnotes:
1 carlos@math.albany.edu