Complexity and Order
Kevin Swingler
Jun 29, 2015
What is Complexity?
Interaction Order
(figure: example interactions of orders 1, 2 and 3 between nodes)
Total Possible Interactions
• Where n is the number of nodes:
– Possible 1st order interactions = n
– Possible 2nd order interactions = n(n−1)/2
– Possible order k interactions = C(n, k) = n! / (k!(n−k)!)
– Total over all orders = Σ_{k=0}^{n} C(n, k) = 2^n
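These counts can be checked with a short Python sketch (`math.comb` gives the binomial coefficient):

```python
from math import comb

n = 4
# Possible order-k interactions among n nodes: C(n, k)
counts = [comb(n, k) for k in range(n + 1)]  # [1, 4, 6, 4, 1]
# Summing over all orders gives 2**n possible interactions in total
total = sum(counts)
print(counts, total)
```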
Measuring Complexity
• Enumerate the possible interactions w_0 .. w_{2^n−1}
• Count the number that are used
• Then a measure of complexity of a system might be the number of interactions used, possibly divided by 2^n
• We might also want to consider higher order interactions as being more complex than lower ones
– For example, accounting for the number of samples needed to model them …
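A minimal sketch of such a measure, assuming the system's interactions are held in a dictionary mapping interaction index to coefficient (the dictionary layout is an illustrative assumption, not from the slides):

```python
def complexity(weights, n):
    # Fraction of the 2**n possible interactions actually used (non-zero)
    used = sum(1 for w in weights.values() if w != 0)
    return used / 2 ** n

# Hypothetical 3-node system using 2 of its 8 possible interactions
w = {0: 1.5, 3: -0.5}
print(complexity(w, 3))  # 0.25
```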
Function Modelling
Now we define a system more specifically
We would like to express any such function so that:
• The interactions are explicit
• It reproduces the function perfectly
• Local maxima are attractor points
• If the function is a PMF, we can sample from it and calculate any probability we like
Why?
• We can specifically manage and understand the complexity of the model
• We can find optimal points in input space that maximise the output
• We can sample from the function (which in general is difficult)
How?
• I am using Mixed Order Hyper-Networks (MOHNs)
• A type of neural network
Neural Networks
• Generally:
– A set of processing units, u, with roles of either:
• Receiving input
• Making calculations
• Providing output
– Defined by a set of weighted connections between pairs of units
– Each unit makes the same calculation:
u_i = f( Σ_j w_ij u_j )
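As a sketch, the shared unit calculation might look like this (function and variable names are illustrative assumptions):

```python
def unit_output(weights, inputs, f):
    # u_i = f( sum_j w_ij * u_j ): every unit applies the same f
    return f(sum(w * u for w, u in zip(weights, inputs)))

step = lambda x: 1 if x > 0 else -1  # a threshold activation
print(unit_output([0.5, -1.0], [1, -1], step))  # 0.5 + 1.0 > 0, so 1
```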
MOHNs
• Units do not have roles – no input, output etc.
• Connections are not just between pairs of units, but into single ones, and amongst subsets of all sizes
• Defined by a set of parameters w_0 .. w_{2^n−1}
• The unit function is a threshold: output 1 if the sum > 0, else −1
• Takes values over c ∈ {−1, 1}^n
Can I See a Picture of One?
(figure: a four-unit MOHN, u1..u4, with weights W0, W1, W2, W4, W6, W7, W8, W9 and W15 connecting subsets of the units)
In What Follows …
(figures: the four-unit network annotated with example weight indices and subsets)
w8 = 1000, w2 = 0010, w13 = 1101
Q8 = 1000, Q6 = 0110
A weight's index, written in binary with one bit per unit, marks which units it connects (e.g. w13 = 1101 touches u4, u3 and u1).
How? (1) Learning a Function
Qi is the subset of neurons connected to weight i.
How? (2) Learning a PMF
Qi is the subset of neurons connected to weight i.
Calculating Function Output
Once learning is complete, we calculate the function output by:
f(c) = Σ_i W_i Π_{j ∈ Q_i} u_j
Example, for a four-unit network with weights W13 and W3:
f(−1, 1, 1, −1) = u1 u3 u4 W13 + u1 u2 W3 = W13 − W3
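This example can be reproduced with a sketch that reads Q_i from the binary form of the weight index (bit j of i standing for unit u_{j+1} is my assumed ordering; it is consistent with the w13 and w3 examples above):

```python
def mohn_output(weights, u):
    # f(c) = sum_i W_i * prod_{j in Q_i} u_j, with Q_i read from the
    # bits of i: bit j set means unit u_{j+1} is in the subset
    total = 0.0
    for i, w in weights.items():
        prod = 1
        for j, uj in enumerate(u):
            if i >> j & 1:
                prod *= uj
        total += w * prod
    return total

# w13 connects u1, u3, u4 (13 = 1101); w3 connects u1, u2 (3 = 0011)
weights = {13: 2.0, 3: 0.5}
print(mohn_output(weights, (-1, 1, 1, -1)))  # W13 - W3 = 1.5
```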
Calculating Output Averages
• We may want to ask questions such as “If input 3 is set to one, what is the average output?”
• Or, more generally, we want to calculate:
f(h), h ∈ {−1, 1, *}^n
• For example:
f(1, 1, *, *, 1)
• * is, of course, a wild card
Calculating Output Averages
• To calculate the average output, we sum the weights, as before, but with one change:
– Not all of the weights are used – just those defined by the *s in h:
1**0 produces Ψ = {0000, 0001, 1000, 1001} = {W0, W1, W8, W9}
Schemata Averages
f(h) = Σ_{i ∈ Ψ} W_i Π_{j ∈ Q_i} h_j
Examples:
f(***) = W0
f(*0*) = W0 + W2
f(01*) = W0 – W2 + W4 – W6
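A sketch of the schema average, writing wildcards as '*' and fixed values as ±1 (the slides write schemata in 0/1 notation; here the ±1 values are used directly, and the bit-to-unit ordering is my assumption):

```python
def schema_average(weights, h):
    # Units set to '*' average to zero over {-1, 1}, so only weights
    # whose subset Q_i lies entirely in the fixed positions survive
    total = 0.0
    for i, w in weights.items():
        prod = 1
        for j, hj in enumerate(h):
            if i >> j & 1:
                if hj == '*':
                    prod = 0
                    break
                prod *= hj
        total += w * prod
    return total

# With h fixing only the first and last units, only W0, W1, W8, W9
# can contribute; W2 touches a wildcarded unit and drops out
weights = {0: 1.0, 1: 2.0, 8: 4.0, 9: 8.0, 2: 100.0}
print(schema_average(weights, (1, '*', '*', 1)))  # 1 + 2 + 4 + 8 = 15.0
```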
Finding Attractors
• Attractors in a function are maximal turning points – the tops of hills
• They are of particular interest in optimisation problems
• If we treat the output of the function as a score, or measure of quality, then the attractor patterns are in some sense good examples of the concept being learned
Finding Attractors
• We find the attractors by starting at a random point in {−1, 1}^n
• And then repeatedly apply these two steps:
– Pick a unit, u_i, at random
– Set u_i to the thresholded sum, over the weights connecting into u_i, of each weight times the other units it connects, e.g. in the four-unit network:
u1 = w13 × u4 × u3 + w3 × u2, then thresholded
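The two steps can be sketched as follows, under the same assumed convention of weight indices as bit masks over the units (names are illustrative):

```python
import random

def find_attractor(weights, u, steps=100):
    # Hill-climb to an attractor of the MOHN's threshold dynamics
    u = list(u)
    for _ in range(steps):
        i = random.randrange(len(u))          # step 1: pick a unit
        s = 0.0
        for idx, w in weights.items():
            if idx >> i & 1:                  # weight connects into u_i
                prod = w
                for j, uj in enumerate(u):
                    if j != i and idx >> j & 1:
                        prod *= uj
                s += prod
        u[i] = 1 if s > 0 else -1             # step 2: threshold update
    return tuple(u)

# Two units coupled by a positive pairwise weight settle into agreement
random.seed(0)
print(find_attractor({3: 1.0}, (1, -1), steps=20))
```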
Probabilities and Sampling
• If we learn the Probability Mass Function (PMF) from samples, we can calculate the probability of a pattern occurring as:
p(c) = f(c)
• Marginal and joint probabilities are calculated using the function average method described above, e.g.
p(u1 = 1 ∧ u3 = 1) = f(*, 1, *, 1)
Sampling
• Let’s say we want to generate 1000 patterns, across which the distribution is the same as that of the data used to build the MOHN
• Useful in search and optimisation
• And for many other reasons
Sampling Algorithm
1. Start with h = *,*,*,*,* …
2. Pick a random location, i
3. Calculate p(h) with h_i set to 1
4. Repeat:
   1. Leave h_i = 1 with the given probability, else set h_i = −1
   2. Choose another i (h_i = *) at random
   3. Calculate p(h ∧ h_i = 1 | h)
5. Until all bits are set
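The steps above can be sketched as follows, assuming a `prob` function that returns the probability of a partial pattern h (e.g. via the schema averages of a learned PMF); `prob` and the other names are illustrative assumptions:

```python
import random

def sample_pattern(prob, n):
    # Fix bits one at a time, each conditioned on those already set
    h = ['*'] * n
    order = list(range(n))
    random.shuffle(order)
    for i in order:
        p_context = prob(tuple(h))        # p(h) with h_i still free
        h[i] = 1
        p1 = prob(tuple(h)) / p_context   # p(h_i = 1 | h)
        if random.random() >= p1:
            h[i] = -1                     # else leave h_i = 1
    return tuple(h)

# Under a uniform PMF every fixed bit halves the probability,
# so each bit is set to 1 or -1 with probability 0.5
uniform = lambda h: 0.5 ** sum(v != '*' for v in h)
print(sample_pattern(uniform, 4))
```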
Example 1 – Function Learning
Binary to Integer encoding
Weights and Averages
Weights:
W0=127.5, W1 = 0.5, W2=1, W4=2, W8=4
W16=8, W32=16, W64=32, W128=64
ΣWi = 255
******** = 127.5
1******* = 191.5
*******1 = 128
1111111* = 254.5
Example 2: Symmetry
• Function output is measure of vertical symmetry
• No first order interactions
• Some second order
• None higher
Attractors
Example 3: Sampling