2. RELATED WORKS
In this section, we present the most advanced encryption-based secure access control solutions devised up to now and show their limitations.
Super encryption In [MS04], Miklau and Suciu rely on XML encryption, which keeps the structure of the original XML document while subparts of it are encrypted in place in the document. When receiving the document, the user decrypts subparts of the document depending on the decryption keys in his possession. The decrypted data can in turn be a subtree containing encrypted parts as well. The user can continue the decryption process recursively as long as he holds the corresponding decryption keys. One of the most interesting aspects of their work is that they compile into the encryption the logical conditions governing access to the data, e.g., access to a subtree may be granted to all the users in possession of the keys K1 and K2, or simply of K3. This is achieved by encrypting the subtree P with an inner key X3 and by using extra nodes next to P as follows: a node containing the key X1 encrypted with K1, another containing the key X2 encrypted with K2 and a last one containing the key X3 encrypted with K3, such that X3 = X1 ⊕ X2, where ⊕ denotes the XOR operator. If the user holds the key K3, he gains access to the key X3 and thus to P. In the same way, owning the keys K1 and K2 gives access to X1 and X2 and thus to P after computing X3 from X1 and X2.
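To make the key combination concrete, the following sketch (illustrative only; helper names are ours, and a real system would use XML encryption with a block cipher rather than the XOR-based stand-in below) shows how the inner key X3 can be recovered either from K3 alone or from K1 and K2 together.

    import os

    def xor(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    # Stand-in for a real cipher (XML encryption would use e.g. AES);
    # a one-time XOR keeps the sketch self-contained.
    def encrypt(key: bytes, data: bytes) -> bytes:
        return xor(key, data)

    decrypt = encrypt  # XOR is its own inverse

    # Inner key X3 protects the subtree P; X1 and X2 are shares with X3 = X1 xor X2.
    X1, X2 = os.urandom(16), os.urandom(16)
    X3 = xor(X1, X2)
    K1, K2, K3 = os.urandom(16), os.urandom(16), os.urandom(16)   # user keys

    # Extra nodes stored next to the encrypted subtree P:
    node1 = encrypt(K1, X1)   # X1 encrypted with K1
    node2 = encrypt(K2, X2)   # X2 encrypted with K2
    node3 = encrypt(K3, X3)   # X3 encrypted with K3

    # A user owning K3 recovers X3 directly...
    assert decrypt(K3, node3) == X3
    # ...and a user owning both K1 and K2 recovers it as X1 xor X2.
    assert xor(decrypt(K1, node1), decrypt(K2, node2)) == X3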
While this solution provides an elegant way to reduce the number of keys to be distributed to the users (in the extreme case, only one key is needed per user), it suffers from several limitations. First, it does not solve the problem of the dynamicity of rights. Indeed, removing a right from a user requires re-encrypting the parts of the document he was previously authorized to see with a different encryption key, a process that is particularly complex under super-encryption. Second, the decryption cost incurred by recursive encryption and the use of inner keys adds to the cryptographic initialization process, making the scheme inappropriate for devices with low processing capacities. Finally, as no compression is considered, the overhead incurred by XML encryption and inner keys can be significant.
Well-formed encryption The previous solution does not perform well when a user is interested in a rather small subset of the document. Indeed, it provides no indexation structure to converge towards the relevant parts of the document. The idea developed in [Carminati] is to rely on well-formed encryption, which encrypts tags, attributes and values in place in the document depending on the access control policies. A query on the structure can be performed easily on the encrypted document by encrypting in place, in the XPath expression, the tags, attributes and values (e.g., /a/b[c=5] can become /eZ/r5[er=53]). To support selections on values, they rely on [Hacigumus] and consider index partitioning, which consists of appending an index to each encrypted value. When considering numerical values, the index tells to which interval the value belongs (e.g., values in [1, 100] have an index value of 1). In this scheme, the relative order is preserved and values having a greater index have a greater value. This enables selections using inequalities. These indexes are appended in the form of extra elements in the document. Finally, the problem when dealing with queries is to guarantee completeness. This is achieved by extending the Merkle Hash Tree (reference) to an XML tree. Each internal XML node of the tree is associated with a hash computed as the hash of the concatenation of the tag name, its content and the hashes of all its child nodes.
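As an illustration of this hash computation, the sketch below (our own minimal reading, using SHA-1 and a naive serialization of tag name, textual content and child hashes) computes such a hash recursively over an XML tree.

    import hashlib
    import xml.etree.ElementTree as ET

    def node_hash(elem: ET.Element) -> bytes:
        # Hash of an XML node: hash of the concatenation of its tag name, its
        # textual content and the hashes of all its child nodes.
        h = hashlib.sha1()
        h.update(elem.tag.encode())
        h.update((elem.text or "").encode())
        for child in elem:
            h.update(node_hash(child))
        return h.digest()

    doc = ET.fromstring("<a><b><c>5</c></b><d>x</d></a>")
    print(node_hash(doc).hex())   # root hash authenticating the whole document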
While this solution combines many existing techniques to secure access control, it suffers from several weaknesses. First, the well-formed encryption can be subject to inference attacks (e.g., statistics on the number of occurrences and inference on structures). Second, the encoding scheme has a dramatic overhead: when considering a secure encryption function such as 3DES or AES (which produces 64-bit or 128-bit blocks), tags and values have to be padded accordingly, which can drastically increase the size of the document. Moreover, index information and schema information (which basically tells the key used for encryption), added as extra elements, contribute to the space overhead. Third, extending the Merkle Hash Tree, which originally operates on binary trees, leads to a dramatic overhead: when requesting an element having n siblings, their n hashes (SHA-1 produces 20-byte hashes) are sent along with the answer. Finally, this model does not support updates well. Indeed, when access rights are updated, data needs to be re-encrypted accordingly.
In the context of XML filtering [expedite] and XML routing [Suciu with SIX], the authors devised a streaming index which consists of appending to each subtree its size, giving the possibility to skip it. However, no information about the content of the subtree is provided, making its use very limited (e.g., for the query /a/b, only the siblings of b can be skipped; when considering //, no skip can be done).
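The following sketch (with a hypothetical flat event encoding of our own) illustrates how the size information allows a streaming evaluator to jump over irrelevant subtrees when matching a simple path such as /a/b.

    # Hypothetical flat encoding of <a><x><y/></x><b><z/></b></a>: one
    # (tag, subtree_size) entry per element, size = number of descendant entries.
    stream = [("a", 4), ("x", 1), ("y", 0), ("b", 1), ("z", 0)]

    def matching_children(stream, start, wanted):
        # Children of the element at 'start' whose tag matches; non-matching
        # subtrees are skipped in a single move thanks to the size information.
        _, size = stream[start]
        pos, end, hits = start + 1, start + size, []
        while pos <= end:
            tag, child_size = stream[pos]
            if tag == wanted:
                hits.append(pos)
            pos += child_size + 1   # jump over the whole child subtree
        return hits

    # Evaluating /a/b: the <x> subtree is skipped without visiting <y>.
    print(matching_children(stream, 0, "b"))   # -> [3]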
Delivery When the elements are delivered to the terminal, we use a simple representation of data based on tag compression. Basically, the starting tag is represented using an id, and the ending tag using the null byte. Characters are output in place as is. Tag encodings are prefixed with a bit set to one, and characters with a zero bit. The terminal then replaces tag ids with the proper tag names using a tag dictionary. In case a positive rule is nested in a negative rule, an orphan subtree can be output. In this case, the tags (found in the tag stack) linking the subtree to the last output element are appended to the orphan subtree in order to keep the document structure consistent.
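A minimal sketch of this tag compression is given below; the byte-oriented layout (high bit of a byte distinguishing a tag identifier from a character, a null byte closing the current element) is our own concrete interpretation of the scheme.

    def encode(events, tag_ids):
        # events: list of ("open", tag), ("text", s) or ("close",) items.
        out = bytearray()
        for ev in events:
            if ev[0] == "open":
                out.append(0x80 | tag_ids[ev[1]])  # high bit set: tag identifier
            elif ev[0] == "text":
                out.extend(ev[1].encode("ascii"))  # high bit clear: characters as-is
            else:
                out.append(0x00)                   # null byte closes the current element
        return bytes(out)

    def decode(data, id_tags):
        doc, stack = [], []
        for b in data:
            if b == 0x00:
                doc.append("</%s>" % stack.pop())
            elif b & 0x80:
                tag = id_tags[b & 0x7F]            # tag dictionary on the terminal side
                stack.append(tag)
                doc.append("<%s>" % tag)
            else:
                doc.append(chr(b))
        return "".join(doc)

    tag_ids = {"agenda": 1, "contact": 2}
    id_tags = {v: k for k, v in tag_ids.items()}
    events = [("open", "agenda"), ("open", "contact"), ("text", "Bob"), ("close",), ("close",)]
    print(decode(encode(events, tag_ids), id_tags))  # <agenda><contact>Bob</contact></agenda>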
Pending delivery The pending parts are externalized to the terminal in an encrypted form using a temporary encryption key. If, later in the parsing, the pending part is found to be authorized, the temporary key is delivered; otherwise it is discarded. A different temporary key is generated for pending parts depending on different predicates. In the following, we refer to an output block as either a contiguous piece of output encrypted with the same key or a contiguous piece of clear-text output.
When issuing an output block, we have to consider the case where some of the previously issued output blocks may be discarded, and thus find a way to connect the block to the last authorized (possibly pending) output block, which may not be known in advance. To this end, we append the following information to each output block. The last tag of the output block is marked with a random number which serves as an anchor (this mark is also pushed on the tag stack). We also append the list of tags connecting the block to the last authorized output block (either clear-text output or an encrypted block whose key has been delivered); for each tag in this list, we append its mark if present. This way, the terminal can easily reconstruct the document.
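One plausible way to picture an output block and the reconnection step on the terminal is sketched below; the field names and the exact reconstruction logic are our own illustration of the anchoring mechanism described above, not the actual wire format.

    from dataclasses import dataclass, field

    @dataclass
    class OutputBlock:
        payload: bytes            # compressed output, clear-text or encrypted
        anchor: int               # random mark put on the last tag of the block
        connect_tags: list = field(default_factory=list)   # [(tag, mark or None), ...]
                                  # path back to the last authorized output block

    def missing_tags(authorized_anchors, block):
        # Tags that still have to be re-opened on the terminal before appending the
        # block: everything below the deepest connecting tag whose mark belongs to a
        # block that was finally authorized.
        missing = []
        for tag, mark in block.connect_tags:
            if mark is not None and mark in authorized_anchors:
                missing = []      # this ancestor is already materialized in the output
            else:
                missing.append(tag)
        return missing

    # Example: the block hangs under <a><b>, and the block that opened <b> (anchor 42)
    # was authorized, so no tag needs re-opening before appending the payload.
    blk = OutputBlock(b"...", anchor=7, connect_tags=[("a", None), ("b", 42)])
    print(missing_tags({42}, blk))   # -> []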
The attentive reader may notice that a user can infer the possible values of discarded output blocks (e.g., the user can deduce that a salary is less than $1000 since he knows that a rule conditioned by the salary is defined). To tackle this problem, fake discarded blocks may be issued randomly.
Coping with multiple pending predicates When coping with multiple pending predicates, the question which arises is how to manage the different buffers efficiently, because the delivery of a subtree may depend on a complex logical expression over predicates. As the number of pending predicates is likely to be small (less than a dozen), we can model each logical expression using a bit array representing its truth table.
The logical expression ab which conditions the delivery of an expression can be represented as in Figure
YYY. Only the gray parts are encoded and stored in memory, the rest is implicit. So we consider two
vectors: V={(p, d)} a list of predicate identified by the predicate id and the depth at which they occur, B
equals to the truth table results which represents the logical expression.
The blocks which share the same logical expression E are grouped into classes. Each class is associated with an encryption key which serves to encrypt its blocks. In the following, we describe how the logical expressions can be constructed incrementally and how to evaluate them.
For ease of understanding, we consider here that only one pending predicate is associated with each rule; the extension to multiple predicates per rule is straightforward. When considering a subtree conditioned by an expression P, and an inner subtree to which a pending rule R having a pending predicate F applies, the resulting logical expression is E = P ∧ F if R is a positive rule and E = P ∧ ¬F if it is a negative one. The new vectors V' and B' of E can be constructed from the vectors V and B of P as follows. First we copy V to V' and insert the predicate in V' in (predicate id, depth) order. B' is then computed as follows: B is first expanded by duplicating every set of 2^index(p) bits (in Figure 1, (b) is constructed from (a) by duplicating each row and adding the c column to build the new result), index(p) being the position of p in V. Then, for the inserted column, we consider an alternation of blocks of 2^(|V| - index(p)) bits of 0 and 1 (here an alternation of one-bit blocks of 0 and 1). Finally, we compute the resulting bits using a bitwise AND. V' and B' are then compared against the logical expressions of the other classes; if they are found in another class, the two classes are merged.
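The sketch below illustrates this construction on the example of Figure 1 (representation of our own: B is kept as a plain list of bits, and the expanded table is recomputed directly rather than duplicated in place).

    def insert_predicate(V, B, pred, positive):
        # Insert predicate pred = (id, depth) into the class (V, B) and AND the
        # truth table with pred (positive rule) or with its negation (negative
        # rule). Rows are enumerated with V[0] as the most significant bit.
        V2 = sorted(V + [pred])
        k = V2.index(pred)                      # position of the new predicate in V'
        n = len(V2)
        B2 = []
        for row in range(2 ** n):
            bit = (row >> (n - 1 - k)) & 1      # value of pred in this row
            old_row = ((row >> (n - k)) << (n - 1 - k)) | (row & ((1 << (n - 1 - k)) - 1))
            keep = bit if positive else 1 - bit
            B2.append(B[old_row] & keep)
        return V2, B2

    # Class 1: expression a AND b
    V1, B1 = [("a", 1), ("b", 2)], [0, 0, 0, 1]
    # A negative pending rule with predicate c refines it into a AND b AND (NOT c):
    V2, B2 = insert_predicate(V1, B1, ("c", 2), positive=False)
    print(V2, B2)   # [('a', 1), ('b', 2), ('c', 2)]  [0, 0, 0, 0, 0, 0, 1, 0]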
(a) Class 1: V = {(a,1), (b,2)}, expression a ∧ b
    a b | a∧b
    0 0 |  0
    0 1 |  0
    1 0 |  0
    1 1 |  1

(b) Class 2: V = {(a,1), (b,2), (c,2)}, expression a ∧ b ∧ ¬c
    a b c | a∧b∧¬c
    0 0 0 |   0
    0 0 1 |   0
    0 1 0 |   0
    0 1 1 |   0
    1 0 0 |   0
    1 0 1 |   0
    1 1 0 |   1
    1 1 1 |   0

(c) Class 1-2, after c is resolved to false: V = {(a,1), (b,2)}, expression a ∧ b
    a b | a∧b
    0 0 |  0
    0 1 |  0
    1 0 |  0
    1 1 |  1

(d) Class 1-2, after b is resolved to false: V = {(a,1)}
    a | result
    0 |   0
    1 |   0

Figure 1. Multiple pending predicate management
When a predicate p is found to be false, the rows in which it takes a true value are removed, that is, all the rows in the intervals [k*2^(index(p)+1), k*2^(index(p)+1) + 2^index(p)] are removed and the rest is shifted backward to remove the gaps, k varying from 1 to n, n being the number of predicates. Conversely, if a predicate p is found to be true, every row in the interval [k*2^index(p), k*2^index(p) + 2^index(p)] is removed. Finally, when all the bits of B are set to one, we can conclude that the associated
buffers are to be delivered, and when they are all set to zero, that they must be discarded. In Figure 1, when c is found to be false, the table in (b) is simplified into the one in (c). As one can notice, the resulting truth table is the same as that of class 1 (in (a)), so these two classes are merged into class 1-2. When the predicate b is in turn found to be false, the table is simplified as in table (d). As we can see, the result contains only 0 bits, so the logical expression is false and all the elements of this class can be discarded.
Each class is in charge of a decryption key, and when its logical expression is resolved (to true or false), the key is delivered or discarded accordingly. When two classes merge, the key of the first class serves to encrypt the output to come.
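Continuing the sketch given before Figure 1 (same illustrative representation), resolving a predicate amounts to dropping the rows that contradict its value and removing it from V; an all-one result column triggers key delivery, and an all-zero one triggers discarding.

    def resolve_predicate(V, B, pred_id, value):
        # Remove the resolved predicate from V and keep only the rows of B that are
        # consistent with its value; rows are enumerated with V[0] as the most
        # significant bit, as in the previous sketch.
        k = next(i for i, (p, _) in enumerate(V) if p == pred_id)
        n = len(V)
        V2 = V[:k] + V[k + 1:]
        B2 = [b for row, b in enumerate(B)
              if ((row >> (n - 1 - k)) & 1) == int(value)]
        return V2, B2

    # Class 2 from Figure 1: a AND b AND (NOT c), then c turns out to be false...
    V, B = [("a", 1), ("b", 2), ("c", 2)], [0, 0, 0, 0, 0, 0, 1, 0]
    V, B = resolve_predicate(V, B, "c", False)
    print(V, B)   # [('a', 1), ('b', 2)]  [0, 0, 0, 1]  -> same table as class 1: merge
    # ...then b is found to be false: the table becomes all zeros, the class is discarded.
    V, B = resolve_predicate(V, B, "b", False)
    print(V, B, all(bit == 0 for bit in B))   # [('a', 1)]  [0, 0]  True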
6. ACCESS RIGHT MANAGEMENT
As access rights can evolve, we design a process to refresh the access rights in the smartcard from the
unsecured server.
The access rights are stored encrypted on the server and can only be decrypted by the smartcards. As access rights are defined incrementally, we identify them with a timestamp which is incremented for every new access right, so that we can detect whether an access right is missing. For each of them, we store the document timestamp, the user id and the encrypted rule definition. Note that only the rules and the signatures are encrypted. Finally, the first data block of the document contains the document timestamp (incremented every time the document is updated) and the timestamp of the access rights at the time the document was modified. To refresh its access rights, the smartcard requests all the access rights with a timestamp greater than the one of its last connection.
    ts   doc ts   user   rule       signature
    1    TD1      John   E(rules)   sig
    2    TD1      Mary   E(rules)   sig
    3    TD2      John   E(rules)   sig
Figure XXX. Access rights stored on the server
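The sketch below (hypothetical field names, timestamps modeled as plain integers) illustrates the refresh step: the smartcard fetches every access right newer than its last connection and checks that no timestamp is missing.

    from typing import NamedTuple

    class AccessRightEntry(NamedTuple):
        ts: int                 # global timestamp, incremented for every new access right
        doc_ts: str             # document timestamp when the right was defined
        user: str
        encrypted_rule: bytes   # rule definition, only decryptable by the smartcards
        signature: bytes

    # Server-side store corresponding to Figure XXX
    store = [
        AccessRightEntry(1, "TD1", "John", b"...", b"..."),
        AccessRightEntry(2, "TD1", "Mary", b"...", b"..."),
        AccessRightEntry(3, "TD2", "John", b"...", b"..."),
    ]

    def refresh(store, last_connection_ts):
        # Entries the smartcard must fetch: everything newer than its last connection.
        # A gap in the ts sequence reveals a missing (withheld) access right.
        entries = [e for e in store if e.ts > last_connection_ts]
        expected = list(range(last_connection_ts + 1, last_connection_ts + 1 + len(entries)))
        assert [e.ts for e in entries] == expected, "missing access right detected"
        return entries

    print(refresh(store, 1))   # a card last connected at ts 1 fetches entries 2 and 3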
When considering updates on the document, several situations have to be considered. Suppose that a document has been stored on the server with timestamp TD1, together with its associated rules with timestamp TR1. Now suppose that the owner of the document updates the document (its timestamp becoming TD2) and/or the access rights (their timestamp becoming TR2, i.e., the access rights defined since the last connection).
When the client requests the document, the server may not be trustworthy and four situations may occur:
- the server sends TD2 and TR2. The data and the access rights are consistent and up-to-date.
- the server sends TD1 and TR1. The data and the access rights are consistent. The user may have missed newly granted or denied accesses. As the user was authorized to see the now-denied parts at a prior date, this case does not give access to extra information and does not violate the confidentiality constraints.
- the server sends TD1 and TR2. Access rights granting extra access may give access to denied parts of TD1 (which have since been modified) to which the user did not have access. This leads to a confidentiality leak. This situation is detected thanks to the document timestamp appended to the access rights.
- the server sends TD2 and TR1. Access rights removing access to subparts of TD2 could have been defined in TR2. In this situation, the user keeps extra access to these subparts. This leads to a confidentiality leak. This situation is detected thanks to the timestamp of the last access rights stored with the document.
If we consider the case where the access rights are updated but not the document, then if the server does not reflect the new access rights, the user will in the worst case get access to subparts of the document he previously had access to, which does not violate the confidentiality constraints.
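The two harmful cases above can be detected with a simple timestamp comparison; the sketch below (our own reading, field names hypothetical, timestamps modeled as plain integers) illustrates it.

    def consistent(doc_ts, doc_rights_ts, rights_doc_ts, rights_ts):
        # Detect the two harmful cases: stale document with fresh rights (TD1 + TR2)
        # and fresh document with stale rights (TD2 + TR1).
        # doc_ts, doc_rights_ts   : document timestamp and the access-right timestamp
        #                           stored in the document's first data block
        # rights_doc_ts, rights_ts: document timestamp stored with the newest access
        #                           right received, and that right's own timestamp
        if rights_doc_ts > doc_ts:
            return False   # rights refer to a newer document: TD1 + TR2
        if doc_rights_ts > rights_ts:
            return False   # document was produced knowing newer rights: TD2 + TR1
        return True

    print(consistent(doc_ts=2, doc_rights_ts=2, rights_doc_ts=2, rights_ts=2))  # True  (TD2, TR2)
    print(consistent(doc_ts=1, doc_rights_ts=1, rights_doc_ts=1, rights_ts=1))  # True  (TD1, TR1)
    print(consistent(doc_ts=1, doc_rights_ts=1, rights_doc_ts=2, rights_ts=2))  # False (TD1, TR2)
    print(consistent(doc_ts=2, doc_rights_ts=2, rights_doc_ts=1, rights_ts=1))  # False (TD2, TR1)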
Optimization issues In order to reduce the number of access rights to be fetched, that is, to skip the access rights of other users, we append to each access right extra information giving the last timestamps of the other users. This information can be reduced by using hash partitions over the users and giving the last timestamp for each partition. In this situation, a user fetches the last timestamps and converges to his own access rights by following them.
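A possible sketch of this partitioning (hash function and layout chosen for illustration only) is given below.

    import hashlib

    def partition(user: str, nb_partitions: int = 4) -> int:
        # Hypothetical partition function over user ids.
        return int(hashlib.sha1(user.encode()).hexdigest(), 16) % nb_partitions

    # Each access right carries, per partition, the timestamp of the previous right
    # concerning a user of that partition; a card follows these back-pointers to
    # fetch only the rights relevant to its own user.
    last_ts_per_partition = {0: 14, 1: 9, 2: 17, 3: 11}
    my_partition = partition("John")
    print("next relevant access right has timestamp", last_ts_per_partition[my_partition])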