CS510 Concurrent Systems
Tyler Fetters

A Methodology for Implementing Highly Concurrent Data Objects

Agenda
• Definitions
• Author Biography
• Overview
• Small Objects
• Class Exercise
• Large Objects
• Summary & Contributions

Definitions
• Non-Blocking
• Wait-Free
• Concurrent Objects
• load_linked (LL)
• store_conditional (SC)
• Small Object
• Large Object
• Linearizable

Author Biography
Ph.D. in CS from MIT
Professor:
  Carnegie Mellon University
  Brown University
Awards:
  2003 & 2012 Dijkstra Prize
  2004 Gödel Prize
  2013 W. Wallace McDowell Award

Overview
• Provides a framework for transforming sequential data structures into concurrent ones
• Requires writing operations as stylized sequential operations
• Increases ease of reasoning
• Uses LL and SC as core primitives (said to be “universal” in terms of their ability to reach consensus in a wait-free manner for an unlimited number of threads)

Overview Cont.
• Implement data objects as stylized sequential programs without explicit synchronization*
• Apply synchronization and memory management techniques
• This transformation will, “in theory”, transform any sequential object into a non-blocking or wait-free concurrent object.

Overview Cont.
Linearizability is used as the basic correctness condition for the implementation.
This doesn’t mean that a concurrent version which allows other values to occur is incorrect.
Thus, non-linearizable algorithms are not necessarily incorrect.

Small Objects
At a high level:
• Reads the memory pointer using load_linked
• Copies the version into another block
• Applies the sequential operation to the copy
• Calls store_conditional to swing the pointer from the old version to the new
• On success, the transformation is complete
• On failure, retry

Small Objects Cont. - Code
• Preventing a race condition: do the check.
• If the check values match, now we can perform our dequeue operation!
• Copy the old data to the new version.

  int Pqueue_deq(Pqueue_type **Q){
    ...
    while(1){
      old_pqueue = load_linked(Q);
      old_version = &old_pqueue->version;
      new_version = &new_pqueue->version;
      first = old_pqueue->check[1];
      copy(old_version, new_version);
      last = old_pqueue->check[0];
      if (first == last){
        result = pqueue_deq(new_version);
        /* Try to publicize the new heap via store_conditional, which could fail, and we loop back. */
        if (store_conditional(Q, new_version)) break;
      }
      /* If the check values do not match, loop again. We failed. */
    }
    /* Lastly, copy the old concurrent object pointer to the new concurrent pointer. */
    new_pqueue = old_pqueue;
    /* Return our priority queue result. */
    return result;
  }

Small Objects Cont. – Back Off

    ...
      if (first == last) {
        result = pqueue_deq(new_version);
        if (store_conditional(Q, new_version)) break;
      }
      /* When the consistency check or the store_conditional fails,
         introduce back-off for a random amount of time! */
      if (max_delay < DELAY_LIMIT) max_delay = 2 * max_delay;
      delay = random() % max_delay;
      for (i = 0; i < delay; i++);
    } /* end while */
    new_pqueue = old_pqueue;
    return result;
  }

Small Objects Cont. - Performance
[Figures: Small Object, Non-Blocking (naive); Small Object, Non-Blocking (back-off)]

Small Objects Cont. – Wait Free
• Operation combining: before trying to do work on the concurrent object, a thread records what it is trying to do.
• It then reads what all other threads are doing and tries to complete the work for them.
• Once it has done all of their work, it does its own.
• Gains failure tolerance at the cost of efficiency.

Small Objects Cont. – Wait Free
Use operation combining to transform the non-blocking object into a wait-free one.
• A process starts an operation.
• Record the call in Invocation.
• Upon completion of the operation, record the result in Result.

Small Objects Cont. – Wait Free

  ...
  /* Record the operation name in this process's announce entry. */
  announce[P].op_name = DEQ_CODE;
  /* Flip the toggle bit. */
  new_toggle = announce[P].toggle = !announce[P].toggle;
  if (max_delay > 1) max_delay = max_delay >> 1;
  while(((*Q)->responses[P].toggle != new_toggle)
     || ((*Q)->responses[P].toggle != new_toggle)){
    old_pqueue = load_linked(Q);
    old_version = &old_pqueue->version;
    new_version = &new_pqueue->version;
    first = old_pqueue->check[1];
    memcopy(old_version, new_version, sizeof(pqueue_type));
    last = old_pqueue->check[0];
    if (first == last){
      result = pqueue_deq(new_version);
      /* Apply pending operations to the NEW version. */
      apply(announce, Q);
      if (store_conditional(Q, new_version)) break;
    }
    if (max_delay < DELAY_LIMIT) max_delay = 2 * max_delay;
    delay = random() % max_delay;
    for (i = 0; i < delay; i++);
  } /* end while */
  ...

Small Objects Cont. – Wait Free
[Figures: Small Object, Non-Blocking (back-off); Small Object, Wait Free (back-off)]

Class Exercise
• Sequential code for removing the head node of a linked list
• Apply synchronization and memory management techniques
• Add exponential back-off for performance

Class Exercise

  #include <stdbool.h>
  #include <stddef.h>

  /* Result of a remove: whether the list was non-empty, and the removed value. */
  typedef struct {
    bool non_empty;
    int ret_val;
  } Res;

  Res linkList_removeHead(LinkList_type *l){
    node *temp;
    Res value;
    value.non_empty = false;
    value.ret_val = 0;
    if (l->head != NULL) {
      value.non_empty = true;
      temp = l->head->next;
      value.ret_val = l->head->value;
      l->head = temp;   /* unlink the old head */
    }
    return value;
  }

Class Exercise

  Res LinkList_removeHead(LinkList_type **L){
    ...
    while(1){
      old_linkList = load_linked(L);
      old_version = &old_linkList->version;
      new_version = &new_linkList->version;
      first = old_linkList->check[1];
      copy(old_version, new_version);
      last = old_linkList->check[0];
      if (first == last){
        result = linkList_removeHead(new_version);
        if (store_conditional(L, new_version)) break;
      }
      if (max_delay < DELAY_LIMIT) max_delay = 2 * max_delay;
      delay = random() % max_delay;
      for (i = 0; i < delay; i++);
    }
    new_linkList = old_linkList;
    return result;
  }

Large Objects
Per-process pool of memory
3 states: committed, allocated and freed
Operations:
  set_alloc moves block from committed (freed?)
    to allocated and returns address
  set_free moves block to freed
  set_prepare marks blocks in allocated as consistent
  set_commit sets committed to union of freed and committed
  set_abort sets freed and allocated to the empty set

Summary & Contributions
Foundation for transforming sequential implementations (small and large) to concurrent operations
• Possible to be performed by a compiler
• Maintains a “reasonable” level of performance
• Utilizing LL and SC as base primitives
• Addresses the issue of conceptual complexity (thoughts?)

Sources
• A Methodology for Implementing Highly Concurrent Data Objects – Slides from Tina Swenson – 2010
• http://en.wikipedia.org/wiki/Maurice_Herlihy
• http://cs.brown.edu/~mph/
• http://web.cecs.pdx.edu/~walpole/class/cs510/papers/10.pdf
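
Appendix: Small Object Protocol Sketch

The small-object steps (read the root pointer, copy the version into a private block, apply the sequential operation to the copy, swing the pointer, back off on failure) might be sketched in portable C11 roughly as follows. This is a minimal illustration under stated assumptions, not the paper's code. Compare-and-swap from <stdatomic.h> stands in for load_linked/store_conditional. The bounded stack `small_stack` and the names `stack_init`, `seq_push`, `seq_pop`, and `concurrent_apply` are hypothetical. Displaced blocks are deliberately never freed or reused, which sidesteps the check counters and block recycling shown on the slides at the cost of leaking memory.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define CAPACITY 8
#define DELAY_LIMIT 1024u

/* Hypothetical small object: a bounded stack that fits in one block. */
typedef struct {
    int items[CAPACITY];
    int count;
} small_stack;

/* A sequential operation: no synchronization, works on a private copy. */
typedef bool (*seq_op)(small_stack *s, int *arg);

/* Shared root pointer. Assumption: CAS substitutes for LL/SC here. */
static _Atomic(small_stack *) root;

void stack_init(void) {
    small_stack *s = calloc(1, sizeof *s);
    if (s == NULL) abort();
    atomic_store(&root, s);
}

bool seq_push(small_stack *s, int *arg) {
    if (s->count == CAPACITY) return false;
    s->items[s->count++] = *arg;
    return true;
}

bool seq_pop(small_stack *s, int *arg) {
    if (s->count == 0) return false;
    *arg = s->items[--s->count];
    return true;
}

/* Non-blocking wrapper: copy, apply, try to swing the root, back off. */
bool concurrent_apply(seq_op op, int *arg) {
    static _Thread_local unsigned seed = 12345u;
    unsigned max_delay = 1;
    small_stack *scratch = malloc(sizeof *scratch);
    if (scratch == NULL) abort();
    for (;;) {
        small_stack *old = atomic_load(&root);    /* role of load_linked */
        memcpy(scratch, old, sizeof *scratch);    /* copy the old version */
        bool result = op(scratch, arg);           /* sequential op on the copy */
        /* Role of store_conditional: publish only if root is unchanged. */
        if (atomic_compare_exchange_strong(&root, &old, scratch))
            return result;                        /* old block is leaked on purpose */
        /* Failed: exponential back-off for a pseudo-random delay. */
        if (max_delay < DELAY_LIMIT) max_delay *= 2;
        seed = seed * 1103515245u + 12345u;
        unsigned delay = (seed >> 16) % max_delay;
        for (volatile unsigned i = 0; i < delay; i++) { }
    }
}
```

Because published blocks are never modified or reused in this sketch, a reader always copies a consistent version and a pointer value can never reappear, so plain CAS is enough. A version that recycles blocks, as the slides do, would need the check counters or true LL/SC.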