Uploaded by Seshadri Mazumder

IIITH Data System Assignment 1

Assignment 1 : Othello/Reversi Page Design
Course Name : Data Systems
Course Code : CS4.401
Semester : Monsoon’22
Instructor Name : Prof. Kamal Karlapalem
Sk. Abukhoyer
Roll : 2021201023
Mtech CSE
Seshadri Mazumder
Roll : 2021801002
Research Scholar (PhD CSE)
Center for Visual Information Technology
IIIT Hyderabad
September 13, 2022
Contents
1 Initial Assumptions
2 Data Structure Description
2.1 Structure of B+ Tree
2.1.1 Leaf Node
2.1.2 Internal Node
2.2 Time Complexity
3 Page Design
3.1 Textual Explanation
3.2 Visual Explanation
3.2.1 Time Step 1
3.2.2 Time Step 2
3.2.3 Time Step 3
3.3 Indexed Tree
4 Storage Space Design
4.1 Motivation
4.1.1 Example
4.2 Storage Technique : A Theoretical Idea
4.3 Storage Technique : A Mathematical Approach
4.4 Storage Technique : A more Compact Possibility
4.5 Question : How much storage is needed to store one completed game ?
5 Examples from 8x8 Othello Board
5.1 Example : Indexing
5.1.1 Time Step 1
5.1.2 Time Step 2
5.1.3 Time Step 3
5.2 Example : Storage
6 A Thought : ”Our Page Design is the Best”
6.1 Reasoning about Indexing
6.2 Reasoning to Page Design & Storage : Overall Architecture
7 Conclusion
1
Initial Assumptions
2
Data Structure Description
The B+ tree is a balanced multi-way search tree. It uses a multi-level indexing structure.
Leaf nodes in the B+ tree hold the actual data references. The B+ tree keeps all leaf
nodes at the same height. The leaf nodes are connected to each other via a linked list.
As a result, a B+ tree supports both sequential and random access.
2.1
Structure of B+ Tree
Every leaf node in the B+ tree is at the same distance from the root node. The B+ tree
has an order n, which is fixed for a given tree. It has two kinds of nodes: leaf nodes
and internal nodes.
2.1.1
Leaf Node
A leaf node of the B+ tree holds at least n/2 record pointers and n/2 key values, and
at most n record pointers and n key values. Each leaf node also contains a block pointer
P that points to the next leaf node.
2.1.2
Internal Node
Except for the root node, an internal node of the B+ tree must have at least n/2 child
pointers. An internal node of the tree can hold at most n pointers.
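The occupancy rules above can be sketched as two small Python classes. This is only an illustrative sketch: the class names, the `record_ptrs` field (integer storage offsets) and the choice to round n/2 up are assumptions made for the example, not part of the original design.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Key = Tuple[int, int]  # a (t, m) tuple, as used later in the Page Design section


@dataclass
class LeafNode:
    order: int                                    # the order n of the tree
    keys: List[Key] = field(default_factory=list)
    record_ptrs: List[int] = field(default_factory=list)  # assumed: storage offsets
    next_leaf: Optional["LeafNode"] = None        # block pointer P to the next leaf

    def is_valid(self, is_root: bool = False) -> bool:
        # Assumption: n/2 is rounded up; a root leaf may hold as little as one key.
        minimum = 1 if is_root else (self.order + 1) // 2
        return (len(self.keys) == len(self.record_ptrs)
                and minimum <= len(self.keys) <= self.order)


@dataclass
class InternalNode:
    order: int
    keys: List[Key] = field(default_factory=list)
    children: List[object] = field(default_factory=list)  # child node pointers

    def is_valid(self, is_root: bool = False) -> bool:
        # An internal node with k keys has k + 1 children; the root needs only 2.
        minimum = 2 if is_root else (self.order + 1) // 2
        return (len(self.children) == len(self.keys) + 1
                and minimum <= len(self.children) <= self.order)
```

With order n = 4, a non-root leaf holding two keys passes the check, while one holding a single key or five keys does not.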
2.2
Time Complexity
• Insertion : O(log N), where N is the number of keys stored in the tree
• Search / Retrieval : O(log N)
3
Page Design
3.1
Textual Explanation
We characterize every game state as a tuple (t, m). Here t represents the time step of
the game, and m represents one of the moves possible at the t-th time step. All
the data samples or game states are stored in sorted order, adhering to the rules of
the B+ Tree.
For sorting we have two keys :
• Primary Key : Time Step (t) - This represents the number of game moves that
have happened.
• Secondary Key : Possible Move (m) - This identifies one of the game moves that
are possible at time step t. Sorting by the secondary key is not strictly required,
but we take it into account for retrieval, e.g. when the query is about the k-th of
the m possible moves at the t-th time step.
Apart from the insertion into the B+ Tree itself, sorting the items incurs no
additional overhead cost.
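The two-key ordering can be illustrated with a minimal Python sketch in which a sorted list stands in for the key order maintained by the B+ Tree. The function names are hypothetical, and the sketch assumes m is numbered from 1 so that (t, 0) is smaller than every real key of time step t.

```python
import bisect
from typing import List, Optional, Tuple

Key = Tuple[int, int]  # (t, m): time step, index of the possible move


def insert_key(keys: List[Key], key: Key) -> None:
    """Insert a (t, m) key while keeping the list sorted by t, then by m."""
    bisect.insort(keys, key)


def keys_at_time_step(keys: List[Key], t: int) -> List[Key]:
    """Return every key of time step t (m is assumed to start at 1)."""
    lo = bisect.bisect_left(keys, (t, 0))
    hi = bisect.bisect_left(keys, (t + 1, 0))
    return keys[lo:hi]


def kth_move(keys: List[Key], t: int, k: int) -> Optional[Key]:
    """Look up the k-th possible move of time step t, or None if it is absent."""
    i = bisect.bisect_left(keys, (t, k))
    if i < len(keys) and keys[i] == (t, k):
        return keys[i]
    return None
```

Python compares tuples lexicographically, so the primary key t decides the order and the secondary key m breaks ties, which is the ordering described above.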
3.2
Visual Explanation
3.2.1
Time Step 1 :
For time step one, we use the tuple (t, m) where the time step t = 1 and the number of
possible moves m = 1. The state (1, 1) is indexed in the B+ Tree, whereas its board
matrix is stored in the form of a sparse matrix in the storage space. The leaf node
having value (1, 1) points to the stored matrix. The entire indexed tree structure is
shown in Section 3.3.
Figure 1: Time Step 1
3.2.2
Time Step 2 :
In time step 2 we have four possible moves, as shown in Figure 2. The four different
moves are identified as : (2, 1), (2, 2), (2, 3), (2, 4). They are also stored as sparse
matrices in the storage space. Each leaf node points to its corresponding matrix. The
entire indexed tree structure is shown in Section 3.3.
Figure 2: All Possible Moves in Time Step 2
3.2.3
Time Step 3 :
In time step 3 we have twelve possible moves, as shown in Figure 3.
The twelve different moves are identified as : (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6), (3, 7),
(3, 8), (3, 9), (3, 10), (3, 11), (3, 12). They are also stored as sparse matrices in the
storage space. Each leaf node points to its corresponding matrix. The entire indexed
tree structure is shown in Section 3.3.
3.3
Indexed Tree
The entire indexed tree after inserting values up to the 3rd time step for an 8 x 8 Othello
board is shown in Figure 4, where each tuple (t, m) represents the time step and the
possible move in that layer.
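As a rough illustration of the leaf level of such a tree, the sketch below generates the 1 + 4 + 12 = 17 keys of the first three time steps and packs them into linked leaves. The leaf capacity of 4 and all names are assumptions made for the example; the packing is a simplification that ignores internal nodes and the minimum-occupancy rule, so it does not reproduce the exact tree of the figure.

```python
from typing import List, Optional, Tuple

Key = Tuple[int, int]  # (t, m)

# Number of possible moves per time step, as stated in the Visual Explanation.
MOVES_PER_STEP = {1: 1, 2: 4, 3: 12}
ALL_KEYS: List[Key] = [(t, m)
                       for t, count in MOVES_PER_STEP.items()
                       for m in range(1, count + 1)]


class Leaf:
    def __init__(self, keys: List[Key]) -> None:
        self.keys = keys
        self.next: Optional["Leaf"] = None  # link to the following leaf


def build_leaf_chain(keys: List[Key], capacity: int) -> Leaf:
    """Pack sorted keys into leaves of at most `capacity` keys and link them."""
    if not keys:
        raise ValueError("at least one key is required")
    ordered = sorted(keys)
    leaves = [Leaf(ordered[i:i + capacity])
              for i in range(0, len(ordered), capacity)]
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves[0]


def scan(head: Leaf) -> List[Key]:
    """Sequential access: follow the leaf links from left to right."""
    out: List[Key] = []
    leaf: Optional[Leaf] = head
    while leaf is not None:
        out.extend(leaf.keys)
        leaf = leaf.next
    return out
```

Walking the chain with `scan` returns all 17 keys in (t, m) order, which is the sequential access that the linked leaves provide.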
4
Storage Space Design
We designed an efficient Storage Mechanism inspired by the concepts of Vector Space,
which is explained in the below sections :
4.1
Motivation
In a 2D vector space with integer entries we can have countably infinitely many vectors.
However, we have two basis vectors, $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ and
$\begin{pmatrix} 1 \\ 0 \end{pmatrix}$, and we can represent any vector as a linear
combination of these.
4.1.1
Example
Let’s say we want to generate the vector $\begin{pmatrix} 2 \\ 3 \end{pmatrix}$.
Figure 3: All Possible Moves in Time Step 3
Figure 4: B+ Tree after Time Step 3
Now we can represent
$$\begin{pmatrix} 2 \\ 3 \end{pmatrix} = 2 \cdot \begin{pmatrix} 1 \\ 0 \end{pmatrix} + 3 \cdot \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$
Here we can see that two vectors, together with the set of integers and their linear
combinations, can generate a countably infinite set of vectors.
This motivates our proposal of basis matrices.
4.2
Storage Technique : A Theoretical Idea
Taking motivation from vector spaces, we first form a subspace of all the matrices
that can possibly represent a game state. There then exists a set of basis matrices
whose linear combinations can represent all the possible matrices in that subspace. Now,
instead of storing all the game states as matrices, if we store the basis matrices and a
few integers, we can recreate the entire subspace of matrices that stores all the possible
games.
4.3
Storage Technique : A Mathematical Approach
Let’s assume that the dimension of the board is $n \times n$.
Now the possible number of game states can be huge and, taken over all board sizes n,
is countably infinite. Storing each game state explicitly for retrieval takes up to
$n^2$ space per state, multiplied by that number of states.
Now all the game states lie in a vector space $V$ where $V \subseteq \mathbb{Z}_+^{n^2}$.
Now create a subspace $V'$ where $V' \subseteq V$ and $V'$ represents all the possible
game states on an $n \times n$ board.
Therefore there exist at most $n^2$ basis vectors, each of dimension $n^2$.
Thus the total number of entries, considering all the basis vectors of the subspace,
is $n^2 \times n^2 = n^4$.
Now each of the basis vectors has a 1 in only one position and 0 in all the other
positions. Therefore the number of 0s and 1s in a basis vector is :
• Number of 1s : 1
• Number of 0s : $n^2 - 1$, since each basis vector is of dimension $n^2$.
Therefore out of $n^4$ entries we have the following number of 0s and 1s :
• Number of 1s : $1 \times n^2$, because there are $n^2$ basis vectors.
• Number of 0s : $n^2 \times (n^2 - 1) = (n^4 - n^2)$ entries.
Here we can see that $n^2 \ll (n^4 - n^2)$ for large n.
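As a worked instance of these counts, taking the 8 x 8 board used in the examples of this report (the choice n = 8 is only for illustration):

```latex
\begin{align*}
n &= 8 \\
\text{number of basis vectors} &= n^2 = 64 \\
\text{total entries} &= n^2 \times n^2 = n^4 = 4096 \\
\text{number of 1s} &= 1 \times n^2 = 64 \\
\text{number of 0s} &= n^4 - n^2 = 4096 - 64 = 4032
\end{align*}
```

So for n = 8 only 64 of the 4096 entries are non-zero.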
Now since these are binary matrices, we can store them in sparse form by ignoring the
0s, which boils down to a total cost of $n^2$ for storing the basis matrices. Since we
store only the locations of the non-zero entries, we need their x & y positions, i.e.
we actually spend $2 \cdot n^2$.
But $2 \cdot n^2$ is of the same order as $n^2$, i.e. $O(n^2)$.
So the vector space $V'$, which can contain a huge number of states, can be represented
with only linear combinations of $n^2$ values, which are in turn the sparse
representations of the $n^2$ basis matrices.
To retrieve an original matrix we take a linear combination like this :
$$\text{matrix} = c_1 \cdot e_1 + c_2 \cdot e_2 + \dots + c_{n^2} \cdot e_{n^2}$$
where $c_1, c_2, c_3, \dots, c_{n^2}$ are the coefficients, which are simple integers, and
$e_1, e_2, e_3, \dots, e_{n^2}$ represent the non-zero elements of the basis matrices.
Now since there are two colours in the game, we can add another variable v where
v ∈ {1, 2}. Multiplying v with the previous equation will give a resultant matrix with
positions having values 1 & 2.
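The reconstruction can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, each basis matrix is stored only as the (row, column) position of its single 1, the colour value is folded into the coefficient (0 = empty, 1 and 2 = the two colours), and which colour is 1 or 2 is an arbitrary choice.

```python
from typing import Dict, List, Tuple

Position = Tuple[int, int]  # (row, col) of the single 1 in a basis matrix
Board = List[List[int]]


def basis_positions(n: int) -> List[Position]:
    """Sparse form of the n*n basis matrices: one (row, col) pair each."""
    return [(r, c) for r in range(n) for c in range(n)]


def encode(board: Board) -> Dict[int, int]:
    """Keep only the non-zero coefficients c_i, keyed by basis index i."""
    n = len(board)
    return {r * n + c: board[r][c]
            for r in range(n) for c in range(n) if board[r][c] != 0}


def decode(coeffs: Dict[int, int], n: int) -> Board:
    """Rebuild the board as the sum of c_i * e_i over the stored coefficients."""
    basis = basis_positions(n)
    board = [[0] * n for _ in range(n)]
    for i, c in coeffs.items():
        r, col = basis[i]
        board[r][col] += c  # e_i contributes only at its single non-zero entry
    return board


def othello_start(n: int = 8) -> Board:
    """Starting position with four central discs (colour values are assumed)."""
    board = [[0] * n for _ in range(n)]
    mid = n // 2
    board[mid - 1][mid - 1] = board[mid][mid] = 2
    board[mid - 1][mid] = board[mid][mid - 1] = 1
    return board
```

For the starting position only four coefficients are stored, and `decode(encode(board), 8)` returns the original board.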
4.4
Storage Technique : A more Compact Possibility
Recall from undergraduate mathematics the concepts of the ”Characteristic
Polynomial” & the ”Characteristic Equation”, used while calculating the eigenvalues and
eigenvectors of a matrix.
In our scenario there will be cases where a minimal characteristic polynomial exists
because of the dependency among the moves, which is inherent to the game.
In those cases the dimension of the basis will be $< n^2$, which leads to an even more
compact representation of the game.
4.5
Question : How much storage is needed to store one completed game ?
For storing a completed game, with all states saved, and assuming that all the mathematics
above is correct, it should take a list of $n^2$ constant values as a lower bound
and $n^4$ as an upper bound, where all constants are taken into account.
5
Examples from 8x8 Othello Board
5.1
Example : Indexing
5.1.1
Time Step 1
This is the B+ Tree at time step 1. We have only one node, which is the root as
well as the leaf node. It points to a matrix in the storage. In (1, 1), the first 1
indicates time step 1 and the second 1 indicates that there is only one possible move
after step 0.
Figure 5: B+ Tree after Time Step 1
5.1.2
Time Step 2
This is the B+ Tree at time step 2. We have 3 internal nodes and 5 leaf nodes. Each
leaf node points to a matrix in the storage. For each of the tuples (t, m), t indicates
the time step and m indicates one of the possible moves at time step t.
Figure 6: B+ Tree after Time Step 2
5.1.3
Time Step 3
Figure 7: B+ Tree after Time Step 3
5.2
Example : Storage
The storage mechanism explained in the previous section can’t be shown visually; however,
the mathematics above describes it in full.
6
A Thought : ”Our Page Design is the Best”
6.1
Reasoning about Indexing
We use the B+ Tree as our data structure, in which insertion and retrieval are
reasonably fast, with the following time complexities :
• Insertion : O(log N), where N is the number of keys stored in the tree
• Search / Retrieval : O(log N)
6.2
Reasoning to Page Design & Storage : Overall Architecture
For storage as well, instead of saving every $n^2$-dimensional vector for each and every
possible state, whose number becomes countably infinite, we save only the following :
• Basis vectors of the subspace $V'$
• Sparse matrix representation of the basis vectors, which boils down to an integer
for one basis vector, and there exist $n^2$ such.
• A few supporting coefficients, which are integers and include $c_1, c_2, c_3, \dots$ and the
corresponding v where v ∈ {1, 2}
• Also, for storing the basis vectors, the dimension can be brought lower than $n^2$ if
there exists a minimal characteristic polynomial.
7
Conclusion
In this report we have designed a strategy for storing and retrieving all states of a possible
Othello game on an n × n board, where n can be very large, within reasonable
time. We have used the B+ Tree data structure for storing the pointers, with operations that
can be done in logarithmic time because of the sorted order, and we extended the concept of
basis vectors to form basis matrices, so that linear combinations of a few matrices can
generate a countably infinite list of states.