MORTGAGE-BACKED SECURITIES VALUATION FOR MOBILE DEVICES
APRIL 3, 2015
ASYMPTOTIC INVESTORS

Contents
Motivation
Algorithms
MBS Risk Factors
   Credit Default Risk
   Prepayment Risk
Note Pricing
MBS Trading Strategies
Architecture Diagram
   Object Diagram
Google Play Deployment
FHA/FNMA Data Processing Using Machine Learning
File-Based Model Training User Interface (in development)
   XML Schema
   Control of Rendering of Restricted Boltzmann Machine in Android GUI
      Data Structures in Java for Restricted Boltzmann Machine Components
      Line Rendering Between Visible and Hidden Layers
   Text Nodes in DOM Tree
   Translation from FHA Data to XML
   File Selection for Browse Button
   Handling Invalid File Path
   Nesting a ScrollView for the Canvas
Utilizing Machine Learning to Infer Future Prices (Win32 GUI, C++ Logic)
   About the Boltzmann Machine
   Restricted Boltzmann Network Diagram
   Energy Computation
   Probability Computation
   Restricted Boltzmann Machine Inference
      Mapping Double-Precision Floating Point to Binary
      Preprocessor Class
      Training Data and Loader
   Multiplicity of Models – Price, Default Rate, and Prepayment Rate
   Constructors and Instantiation
   Sample of Inference in Programmatically Created Network
   Test Run of Simple Network Inference
   Sample Training via Contrastive Divergence
   RBM Training Algorithm
      Contrastive Divergence Implementation in C++
   Boltzmann Machine Classes
      C++ Template Specializations
   Aside: C++ Print Formatting Bug
   Boltzmann Machine GUI
      Translating Graphical Images to a Neural Network
      Canvas
      Icon Tray
      Win32 Methods and Types
      Compiler and Linker
      Resource Files
   Icons in Win32
   Pointers to Pointers
   Preventing Circular Header Dependency
   Variable Scope in Switch Statement Cases
   Static Text Controls
Windows Store Deployment
Android ViewAnimator
Android SDK
UI Layout
Shape Drawable
Test Case
Bibliography

Motivation

Mortgage-backed securities are a highly relevant aspect of the global economy and a product group to which financial engineering methods can be applied in many ways. The market size has been estimated at USD 13.7 trillion in US issuances alone. Currently, these products are traded via the TradeWeb fixed income platform, and the government-agency-issued notes carry denominations such as USD 1,000 (FNMA class ASQ1), 25,000 (GNMA issuances), and 100,000 (FNMA class X1). Methods which can accurately price, trade, and account for the risk of these products could yield significant returns for a hedge fund.

Algorithms

This application computes the Conditional Prepayment Rate (CPR), Single Monthly Mortality (SMM), scheduled payment, and monthly principal balance for mortgage-backed security issues. The following data fields will be rendered in a time series table.
Monthly Payment – depends on the number of months remaining.
CPR – Conditional Prepayment Rate, which depends on the current month (ramping by 0.2% per month up to 6%) and the PSA factor: if month <= 30, CPR = .002*PSA*month_index; otherwise, CPR = .06*PSA.
SMM – Single Monthly Mortality, computed as SMM = 1-(1-CPR)^(1/12).
Scheduled Interest – pass-through rate times beginning principal.
Scheduled Principal – monthly payment minus scheduled interest.
Prepaid Principal (SMM) – SMM percentage times (beginning principal - scheduled principal).
End Principal – beginning principal - (scheduled principal + prepaid principal).
Total Cash Flow – the discount rate entered in the UI is used to move cash flows back to time zero.

The tabular computational output is stored within the "Vector<MonthInfo> monthData" data structure. The GUI consists of a vertical ScrollView, which contains a horizontal ScrollView, which contains a vertical LinearLayout, which contains a series of horizontal LinearLayout instances.

A second algorithm, implemented on the server side with an Android client, will be a Restricted Boltzmann Machine mechanism for calculating the prepayment rate, default rate, or MBS price based on a model deduced from training data.

MBS Risk Factors

Credit Default Risk

While default risk is not accounted for explicitly, the average coupon is an interest rate which reflects the risk borne by the lender, while the pass-through rate reflects the risk borne by the note investor. An additional risk adjustment might be made utilizing the credit valuation adjustment (CVA), which reduces the asset value by the present value of the expected loss given default. Ideally, the present value of the interest cash flows above the amount accounted for by the risk-free rate (e.g. LIBOR) would equal the CVA amount. Such an adjustment (along with the additional metrics of Debt Valuation Adjustment (DVA) and Funding Valuation Adjustment (FVA)) may be implemented in version 2.0.
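The CVA adjustment described above can be expressed as a discounted expected-loss sum. The following is a minimal sketch, not the planned version 2.0 implementation; the flat hazard rate and the constant exposure profile are hypothetical placeholders, not calibrated values.

```cpp
#include <cassert>
#include <cmath>

// Sketch of a credit valuation adjustment (CVA):
// CVA = sum over months of discountFactor * P(default in month) * LGD * exposure.
// The flat hazard rate and constant exposure here are hypothetical placeholders.
double computeCva(double exposure, double lossGivenDefault,
                  double hazardRate, double riskFreeRate, int months) {
    double cva = 0.0;
    double survival = 1.0; // probability the counterparty has not yet defaulted
    for (int m = 1; m <= months; ++m) {
        double t = m / 12.0;
        double survivalNext = std::exp(-hazardRate * t);
        double defaultProb = survival - survivalNext; // default occurs in month m
        double discount = std::exp(-riskFreeRate * t);
        cva += discount * defaultProb * lossGivenDefault * exposure;
        survival = survivalNext;
    }
    return cva;
}
```

Because the per-month default probabilities sum to less than one, the CVA is always bounded above by exposure times loss-given-default.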
Prepayment Risk

Prepayment risk is the danger that interest payments will be forfeited due to the early repayment of mortgage loans by borrowers. This usually occurs when refinancing takes place during periods of low interest rates. Multiple methods of prepayment estimation exist. In the initial release, the single monthly mortality (SMM) measure will be computed. This is a heuristic which expresses the annual conditional prepayment rate (CPR) in monthly form. The CPR is computed by multiplying a time-based amount (min(.002*monNum, .06)) by the Public Securities Association (PSA) scale factor. In a later version, a machine-learning approach is proposed for estimating the prepayment rate in a given month by training a model on historical Federal Housing Administration (FHA) data.

Product Universe

Two types of mortgage-backed securities exist: agency pass-throughs and collateralized mortgage obligations (CMOs). Separate pricing lines exist for CMOs and pass-throughs at sites such as The Wall Street Journal, and the quote conventions also differ. Pass-throughs are quoted as a "per-100" price (e.g. 98 14/32 or 106 10/32), while CMOs are quoted as a basis point spread above US Treasury bonds of comparable maturity (e.g. 175 bps). Pass-through securities forward principal and interest cash flows received from loan collections through to the bondholders indiscriminately. CMOs, on the other hand, are issued in multiple tranches, whereby the most senior tranche will receive payment priority in the event that cash flows are inadequate to pay the promised principal and interest on the bonds.

Collateralized mortgage obligations themselves are organized into two classes: sequentials and planned amortization class (PAC) bonds. Sequentials distribute the prepayment risk among tranches, requiring all principal payments to be allocated to the lowest tranche first until it is retired. Subsequent principal payments flow to the next lowest tranche, and so on.
Interest payments are allocated evenly across the tranches. It is unclear how the default risk is distributed among the tranches, but it would be expected that the junior tranches would receive interest payments after the senior tranches in such a case. The PAC bonds also involve the allocation of prepayment and default cash flows (or the lack thereof). However, in this case, the unexpected cash flows are diverted away from the PAC tranche, making these bonds behave like a Treasury bond with predictable cash flows. It is expected that the initial products traded will be FNMA pass-throughs exclusively. In the second phase, FNMA collateralized mortgage obligations will be added, with Freddie Mac (FHLMC) and Ginnie Mae (GNMA) offerings to follow.

Note Pricing

The price is canonically defined as the present value of the cash flows of the MBS. In the case of this application, summing the "Discounted cash flow (PSA)" column yields the value, with a value equivalent to the notional indicating par. In an Excel sheet with two different prepayment options, one a constant 4 basis points and the other a graduated SMM approach reaching a maximal value of 6%, the slower prepayment approach was worth significantly less. The value on a scale in which 100 is par is also computed, with the slow prepayment approach worth about 83 and the SMM cash flows worth more than 134. The PSA discounted cash flow value and the par-100 price are also computed in the Android application. The values obtained under parameterization identical to the Excel sheet (PSA=1.2, WAC = .08, PT Rate = .07, notional = 2000000, r=.03, T=360) were identical to the outputs of the Excel macro. A sanity check using only a single month (so the entire principal amount was scheduled for one payment) produced similarly correct results (2006643.78 present value and 100.33 normalized price).
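The schedule columns defined under Algorithms and the pricing computation above can be sketched end-to-end as follows. This is a minimal reimplementation for illustration, not the application's MBSCalc code; it assumes the common convention of amortizing the payment at the WAC while passing interest through at the PT rate, and it does not assert the Excel figures, since those depend on the exact discounting convention used there.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct MonthInfo {
    double beginPrincipal, payment, cpr, smm;
    double scheduledInterest, scheduledPrincipal, prepaidPrincipal, endPrincipal;
    double totalCashFlow;
};

// Builds the month-by-month schedule described in the Algorithms section.
// The payment is recomputed each month on the surviving balance so that the
// pool still amortizes to zero after prepayments (a common convention).
std::vector<MonthInfo> buildSchedule(double notional, double wac, double ptRate,
                                     double psa, int term) {
    std::vector<MonthInfo> rows;
    double balance = notional;
    double rm = wac / 12.0; // borrower-side monthly rate
    for (int m = 1; m <= term && balance > 1e-9; ++m) {
        MonthInfo row{};
        row.beginPrincipal = balance;
        int n = term - m + 1; // months remaining
        row.payment = balance * rm / (1.0 - std::pow(1.0 + rm, -n));
        // PSA ramp: 0.2% CPR per month through month 30, then flat 6%, scaled by PSA.
        row.cpr = (m <= 30 ? 0.002 * m : 0.06) * psa;
        row.smm = 1.0 - std::pow(1.0 - row.cpr, 1.0 / 12.0);
        row.scheduledInterest = balance * ptRate / 12.0;      // passed through to investors
        row.scheduledPrincipal = row.payment - balance * rm;  // amortization at the WAC
        row.prepaidPrincipal = row.smm * (balance - row.scheduledPrincipal);
        row.endPrincipal = balance - row.scheduledPrincipal - row.prepaidPrincipal;
        row.totalCashFlow = row.scheduledInterest + row.scheduledPrincipal
                          + row.prepaidPrincipal;
        rows.push_back(row);
        balance = row.endPrincipal;
    }
    return rows;
}

// Present value of the pass-through cash flows at a flat annual discount rate,
// compounded monthly.
double presentValue(const std::vector<MonthInfo>& rows, double r) {
    double pv = 0.0;
    for (std::size_t i = 0; i < rows.size(); ++i)
        pv += rows[i].totalCashFlow / std::pow(1.0 + r / 12.0,
                                               static_cast<double>(i + 1));
    return pv;
}
```

With the parameters from the text (PSA = 1.2, WAC = .08, PT rate = .07, notional = 2000000, r = .03, T = 360), the scheduled and prepaid principal columns sum back to the notional, and the present value exceeds par, consistent with a 7% pass-through discounted at 3%.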
MBS Trading Strategies

The primary market for mortgage-backed securities (at least the agency notes) involves purchasing the debt from the issuer (e.g. FNMA). Two types of securities, Discount Notes and Benchmark Bills, can be acquired. Discount Notes are issued via Federal Reserve banks and sold by a limited group of brokers, while Benchmark Bills are purchased via an electronic Dutch auction. Nonetheless, the more liquid market for MBS securities appears to be the secondary market, specifically the TBA, or to-be-announced, market. In this scenario, counterparties set a price at which securities of a given notional value will be sold on a delivery date (or within a settlement month). The exact securities are not specified, but the contract indicates the issuer, mortgage type, maturity, and coupon. Trading in the secondary market might involve arbitrage with contracts of different coupon rates and payment priority (tranche). For example, a butterfly strategy might involve going long one 5% coupon contract, selling two 7% coupon contracts, and buying one 9% contract.

Architecture Diagram

[Diagram: MBSPricer (View 1) and MBSTable (View 2) exchange cash-flow computations and a table model with MBSCalc (Quant Logic).]

Object Diagram

[Diagram: the GUI module comprises NetworkManipulator, BoltzmannGUI, and DragDropHandler; the Boltzmann Machine training and inference module comprises Boltzmann and TLU.]

Google Play Deployment

The calculator utility (and, hopefully, later the neural network code, although this may be sold separately on the Windows AppStore) will be available for sale on Google Play. In order to accomplish this, the following steps were necessary.
1. Obtain a Google developer account.
2. Generate the APK archive via ApkBuilder using the -u (unsigned) flag, not the -d (debug sign) flag.
3. Sign the APK in release mode (using keytool and jarsigner).
   a. keytool generates a private/public key pair.
   b. jarsigner affixes the digital signature to the APK.
4. zipalign the APK file – this aligns uncompressed data within the APK on fixed byte boundaries, allowing the device to access it efficiently.
5. Upload the APK into the production store, setting the price and metadata.
6. Upload images with pixel dimensions matching the Play specifications for the icon, feature, and promo PNGs.
7. Complete a content rating questionnaire.
8. Create a privacy policy web site.
9. Create a merchant account.

FHA/FNMA Data Processing Using Machine Learning

Aside from the PSA/SMM approach to estimating prepayment rates of the pool, data from the Federal Housing Administration (FHA) about home loans can be used to derive a prepayment model. Rows in the tabular data released by the FHA can be parsed as training vectors for a machine learning algorithm. The Federal Housing Administration programs now fall under the auspices of the Department of Housing and Urban Development (HUD). The GUI application will permit loan data to be downloaded to train Boltzmann prepayment and default models. While the HUD site contains numerous data links, some appeared not to lead to any site, and others linked to datasets which lacked prepayment or default information. However, a search of other sites indicated that the Home Equity Conversion Mortgage (HECM) data could be interpreted to yield prepayment rates. While this is actually data about "reverse mortgages", it might be interpreted to represent the state of the home loan market in general (if no more direct source is available). This inference, however, cannot be made directly from the data, as there is no "prepayment" column or attribute in the schema. One possible formula is, for each year, to compute (number of loans paid to termination)/(number of loans outstanding).
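The yearly ratio proposed above reduces to a guarded division. A minimal sketch follows; the parameter names are hypothetical, since the HECM schema itself exposes no such columns directly and the counts would have to be derived first.

```cpp
#include <cassert>

// Sketch of the proposed yearly prepayment estimate from HECM-style loan
// counts: rate = loans paid to termination / loans outstanding. The parameter
// names are hypothetical; the HECM schema has no explicit prepayment field.
double yearlyPrepaymentRate(long loansPaidToTermination, long loansOutstanding) {
    if (loansOutstanding <= 0) return 0.0; // guard against empty years
    return static_cast<double>(loansPaidToTermination) /
           static_cast<double>(loansOutstanding);
}
```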
However, it seems unlikely that the mapping between government loan performance and MBS defaults and performance is linear, as the FHA loans seem likely to be low-income housing, whereas the mortgage-backed securities are comprised of loans for multiple grades of housing. It seems that Fannie Mae's PoolTalk site, which provides issuance and monthly statistics about MBS securities for lookup by CUSIP, might be a more reliable source of information. A screenshot of this application is below (copyright FNMA, used without permission). New Issuance files are divided by record types, where 1 is the pool statistics record, 3 is the loan purpose, and 9 is the geographic region. Each pool id is associated on a one-to-one basis with the CUSIP identifier for the MBS security. The total notional of the pool and the number of loans in various geographic regions are identified in the new issuance files. In the excerpt below, pool AS5545 (CUSIP 3138WFET9) contains 1023 loans, 739 of which are purchase loans and 284 of which are refinance loans. Eighty-four were issued for properties in California, and thirty-six were made for properties in Florida.
AS5545|01|3138WFET9|07/01/2015|FNMS 03.5000 CLAS5545|$190,124,021.00|3.5||08/25/2015|MULTIPLE|MULTIPLE|1023|185911.73|08/01/2045|||||||4.064|||0|360|360|80|751|0.0||4.42|CL ||29.62|81
AS5545|02|MAX|199999.0|4.25|97.0|824|360|7|361
AS5545|02|75%|192000.0|4.125|92.0|786|360|0|360
AS5545|02|MED|185250.0|4.125|80.0|757|360|0|360
AS5545|02|25%|180000.0|3.99|75.0|722|360|0|360
AS5545|02|MIN|175000.0|3.875|20.0|623|301|-1|301
AS5545|03|PURCHASE|739|72.28|$137,414,031.68
AS5545|03|REFINANCE|284|27.72|$52,709,989.54
AS5545|04|1|1016|99.29|$188,770,295.22
AS5545|04|2 - 4|7|0.71|$1,353,726.00
AS5545|05|PRINCIPAL RESIDENCE|951|92.9|$176,618,379.90
AS5545|05|SECOND HOME|37|3.65|$6,929,248.12
AS5545|05|INVESTOR|35|3.46|$6,576,393.20
AS5545|08|2015|1022|99.9|$189,931,563.39
AS5545|08|2014|1|0.1|$192,457.83
AS5545|09|ARIZONA|21|2.05|$3,902,863.41
AS5545|09|ARKANSAS|6|0.58|$1,098,500.00
AS5545|09|CALIFORNIA|84|8.23|$15,642,819.25
AS5545|09|COLORADO|50|4.86|$9,237,062.23
AS5545|09|CONNECTICUT|5|0.48|$917,560.00

Unfortunately, neither the fixed-rate monthly file nor the loan-level disclosure file contains fields disclosing the prepayment rate or the default rate (credit events). However, the loan-level disclosure file provides some clues via the initial and current coupon rate and the initial and current unpaid principal balance. A rising coupon indicates likely credit issues, and a steeply declining unpaid principal balance is a sign of prepayment. The yield maintenance premium is provided in some files, such as those for discount mortgage-backed securities (DMBS). However, while this provides insight into the cash flows resulting from prepayment, it does not represent the likelihood of prepayment. It seems likely that some other source of historical MBS performance, indicating prepayments, defaults, and prices over the term of the underlying loans, is needed in order to obtain training vectors in the requisite format.
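Records like those in the excerpt above can be split on the '|' delimiter before field extraction. A minimal sketch follows; the field positions are inferred from the excerpt and should be treated as assumptions rather than the official file layout.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Splits one pipe-delimited PoolTalk new-issuance record into its fields.
// Interior empty fields are preserved so positional indexing still works;
// a trailing empty field after a final '|' is dropped by std::getline.
std::vector<std::string> splitRecord(const std::string& line) {
    std::vector<std::string> fields;
    std::stringstream ss(line);
    std::string field;
    while (std::getline(ss, field, '|'))
        fields.push_back(field);
    return fields;
}
```

Splitting the record "AS5545|03|PURCHASE|739|72.28|$137,414,031.68" yields the pool id at index 0, the record type ("03", loan purpose) at index 1, and the purchase-loan count at index 3.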
However, FNMA does publish a spreadsheet of prepayment data which specifies origination year, month of prepayment, and amount. This information would be sufficient to create a corpus for the prepayment rate training, given the appropriate data translation. In addition, the US Federal Reserve publishes historical charge-off and delinquency rates for real-estate loans at a quarterly granularity which, when combined with other data about outstanding loan notional and term values, could provide training for the default rate metric. Ginnie Mae also seems to offer statistics on delinquency and unscheduled paydowns. Both the number of loans and the unpaid balance amounts are provided for thirty-, sixty-, and ninety-day delinquencies. MBS pricing information is available on a daily basis from sites such as The Wall Street Journal, and TradeWeb provides a limited 12-month graph of prices. The iShares ETF historical pricing information is available on the NASDAQ web site.

File-Based Model Training/Test

A third screen is being added to the Android application to import Housing and Urban Development (possibly), US Treasury, FNMA, and GNMA datasets into the Restricted Boltzmann Machine model's training set. The first utility implemented in this UI is the parsing of Restricted Boltzmann Machine XML and the rendering of the network in the Android GUI. This will involve two segments: a file selection panel and a Restricted Boltzmann Machine rendering view. A preliminary screen shot (prior to implementing the XML parser and renderer) is below. Upon examining the model, data can be loaded and Contrastive Divergence training run on the model or, if training is complete, inference can be carried out on an input vector (these UI controls will be designed later). The XML file is stored on the device's SD card and facilitates both the graphical rendering and the serialization of the network graph.
It will be read using the I/O methods Environment.getExternalStorageDirectory and BufferedReader's readLine.

XML Schema

The XML file encodes the sequence of visible layer threshold logic units, hidden layer threshold logic units, and inter-layer connections. In the future, tags for the input and output values of each TLU should be added, as should tags for the energy. An optional tag, the name of the node, such as "WAC0" to indicate the leftmost bit of the WAC input, is included but not currently parsed (it should be added to the Android display). A sample of such a file is below.

<?xml version="1.0"?>
<RBM>
   <visibleLayer>
      <TLU>
         <name>WAM0</name>
         <index>0</index>
         <value>0</value>
      </TLU>
      <TLU>
         <index>1</index>
         <value>0</value>
      </TLU>
   </visibleLayer>
   <hiddenLayer>
      <TLU>
         <index>0</index>
         <value>0</value>
      </TLU>
      <TLU>
         <index>1</index>
         <value>0</value>
      </TLU>
   </hiddenLayer>
   <connections>
      <connection>
         <visibleIndex> 0 </visibleIndex>
         <hiddenIndex> 1 </hiddenIndex>
         <weight>10</weight>
      </connection>
      <connection>
         <visibleIndex> 1 </visibleIndex>
         <hiddenIndex> 0 </hiddenIndex>
         <weight>5</weight>
      </connection>
   </connections>
</RBM>

A new optional child element of <TLU> has been added to accommodate the threshold (bias) of a unit. This is a double-precision floating point value known as <threshold>. A TLU including this element might be specified as follows.

<TLU>
   <index>0</index>
   <value>0</value>
   <threshold>1.2</threshold>
</TLU>

Control of Rendering of Restricted Boltzmann Machine in Android GUI

The RBM rendering will be handled in the "onDraw" method of the CanvView view nested in the FHAUI view. This will check the encapsulating FHAUI's "visibleLayerPts" and "hiddenLayerPts" hash tables and the "connections" vector for the network objects to be rendered. Such hash table and vector data structures will be populated from the XML file, loaded and parsed into a Document Object Model (DOM) tree.
The index field will be retrieved from the XML text, and the screen shot demonstrates that nonsequential ID numbers can be utilized, but the y-offset will be set by a counter, not by the displayed value. A preliminary screen shot of a rendering of the visible layer described in the XML above is illustrated below.

Data Structures in Java for Restricted Boltzmann Machine Components

The actual Boltzmann Machine class was written in C++ and will run outside of the Android device. In order to integrate the C++ code with the Android code, the XML file will be transferred between the platforms. However, a mapping from the visible (hidden) layer index in the XML file (an index tag exists) to the position of the Threshold Logic Unit image on the canvas is required, as is a mapping from the (visible, hidden) pair (of TLUs connected by an edge) to the weight of this edge (as a line is drawn and the weight is used to label the line). In the future, a mechanism to store the value of the TLU will also be requisite. A new hash table for each layer (visible or hidden), mapping index to value, would suffice for this purpose (and has been implemented).

Line Rendering Between Visible and Hidden Layers

Lines will be traced between the boundaries of the TLU circles to signify connections and will be labeled with weights. The coordinates of the lines' endpoints will be based on the centers of the TLUs whose indices are stored in the visible and hidden indices of the ConnInfo object. The visible layer endpoint of the line will be at (centerX+radius, centerY), and the hidden side will be at (centerX-radius, centerY). The connections XML for the example case, with two lines, is as follows (note that the index values are nonconsecutive and used as ID numbers).
<connections>
   <connection>
      <visibleIndex> 1000 </visibleIndex>
      <hiddenIndex> 66 </hiddenIndex>
      <weight>10</weight>
   </connection>
   <connection>
      <visibleIndex> 99 </visibleIndex>
      <hiddenIndex> 760 </hiddenIndex>
      <weight>5</weight>
   </connection>
</connections>

The rendering of the above XML is below.

Text Nodes in DOM Tree

Interestingly, even in nodes such as <visibleLayer> which contain no text, it was necessary to filter out text nodes from the tree traversal. It is not clear whether the inclusion of an empty text node with all DOM elements is standard or parser-implementation-specific. In fact, it seems that white space is being construed as text, as the log which follows indicates.

I/System.out( 3065): renderHidden, text node:
I/System.out( 3065):

Translation from FHA Data to XML

The data files from the Federal Housing Administration must be translated into the XML which will be stored in the DOM. Two mechanisms might accomplish such a mapping.
1. Utilization of the Win32 graphical user interface by a human operator to manually create a Restricted Boltzmann Machine after reading the data and then saving the topology into XML (in progress).
2. Running an Android utility to parse the FHA file, map it to inputs, visible units, hidden units, and outputs, and persist the XML to a file (to be developed).

File Selection for Browse Button

Upon clicking the "Browse" button in most file load user interfaces, a file selection dialog is opened. Within this dialog, double-clicking a directory descends into the directory and appends that directory to the path in the selection text view. In Android, no standard widget for this exists within the API. In the initial phase, it will be verified that the /sdcard directory's files and directories can be rendered. A subclass of AlertDialog, FileChooser, will be implemented.
The files and folders can be scrolled through, and the icon displayed for a file or folder will be determined by the "isDirectory()" method on the File object. Icons are displayed using the TextView's "setCompoundDrawablesWithIntrinsicBounds" method. When the "Select" button is clicked, the contents of the "path" EditText are copied back to the parent window. In phase two, a click on a folder will open it and update the path EditText, and a single click on a file will add its name to the path (it will not set the file path in the parent dialog, but this would be a convenient feature). This will require handling clicks on the TextView and looking up the view in a hash table mapping the view to a file/folder status. Instead of hashing on the view directly (which would require a hash function for the TextView), the view will be assigned an ID based on its index in the directory listing, and this will be the hash key. Both of these phases have been implemented, but no highlighting when the view is touched has been added (this is a desired feature). An XML file was successfully found and loaded via the browse mechanism. A screen shot of the dialog is below. Additional functionality to handle returning to parent directories in searching for a file was added with the "parent directory" button. Moving from /sdcard/Samsung to /sdcard is illustrated below. Two issues remain outstanding:
1. The entire path is difficult to view and scroll through in portrait mode.
2. An additional '/' is appended to the path when moving up with the "parent directory" button and then back into a subdirectory.

Handling Invalid File Path

For the time being, a blank canvas is displayed with only the visible/hidden partition, and the exception is logged to the Android "logcat" stream. In the future, a dialog or Toast should appear notifying the user of the error.
Nesting a ScrollView for the Canvas

The Restricted Boltzmann Machine can have a variable number of TLUs in its visible and hidden layers. Thus, given a constant scale, the height of the diagram derived from the XML is unknown at the time of the original layout rendering, so the CanvView object is placed within a ScrollView. However, there is no direct imperative means to set the maximal height of the ScrollView (the size of the canvas that, when exceeded, induces scrolling and wrapping). Utilizing the ViewGroup.LayoutParams parameter to the addView method, though, enables both the actual (thus maximal) width and height of the ScrollView to be specified. These parameters are set in the LayoutParams constructor, and then the LayoutParams object is passed to addView along with the ScrollView. However, this approach, in which onTouch events are handled by both the internal and external ScrollViews, causes the internal view to be scrolled at a slow rate. It seems that the shift offset which constitutes a scroll is divided between the internal and external ScrollViews. The only way to ameliorate this would be to override the touch or motion event handler used by the internal ScrollView and ensure that the event is not propagated to the external ScrollView. This might involve the external ScrollView ignoring the scroll message if the touch fell within the internal view. However, in the case when the touch does not fall within the internal ScrollView, the scroll invocation's offset, based on the length of the gesture, is non-trivial to compute. The problem may also be that every scroll entails an onDraw call to the canvas, but this behavior would seem to be the same for any other view embedded in a ScrollView. For now, nesting of ScrollViews will be avoided, and such an approach will be considered only if the RBM becomes prohibitively large.
The canvas and flip button will be removed and re-added with each file load in order to enable the resizing of the canvas in the main linear layout. Note that some accounting for the nested behavior appears in API level 21 of the Android SDK, where “nested scroll” event listeners can be designated for the control, but the current application is compiled against the API level 16 library.
Inference View
A link from the Boltzmann load GUI will enable mortgage-backed securities parameters to be entered into the user interface. When the “infer price” button is clicked, the parameters, along with the state representation (likely the XML), will be serialized and sent from the Android client to a C++ web service listener on the server. The input screen will be fairly similar to the initial screen used as input for the MBS cash flow computation. When the “calculate price” button is clicked, an HTTP GET is made to a web service front-end for the Boltzmann Machine, which performs the inference and returns the price. A screen shot of the Android user interface is below.
SOAP Web Service
The server-side Restricted Boltzmann Machine will be accessed from the client via the Simple Object Access Protocol (SOAP), which enables RPC calls to be made over HTTP with marshalling in the form of XML documents. The problem is to determine which libraries will be utilized for the HTTP server, method dispatch, and client TCP/IP proxy code generation. One alternative being considered is the open-source platform SimpleSOAP by Scott Seely and Jasen Plietz. However, problems compiling this library against the Windows socket headers suggest that this solution may not be usable. The web server will dispatch to the PrepaymentApp class, passing it an XML representation of the desired RBM and the input parameters for prepayment rate inference.
A new deserialization method to generate the neural network TLUs, weights, and connections will be implemented in the Boltzmann class invoked by the PrepaymentApp object. The XML will be parsed using the C++ API of the Apache Xerces Document Object Model (DOM) XML parser. This traversal of the DOM tree will be similar to that accomplished in the FHAUI class’s Java code used to render the Boltzmann XML in the Android GUI. The TLU elements will be read into “visible” and “hidden” vectors, while the connection elements will be read into the map data structures for connections and weights.
[Diagram: the Android client issues a getPrepaymentRate call (XML message) to the web server (SOAP listener), which preprocesses the input, invokes the Restricted Boltzmann Machine logic to infer, postprocesses, and returns the getPrepaymentRate response (XML message).]
SimpleSoap Build and Deployment
The typical SOAP model for client generation and invocation is as follows:
1. Download the service description in the XML format known as WSDL from the server.
2. Generate proxy code for making the HTTP calls to the web service and parsing the HTTP response.
3. Invoke the web service client methods from the application.
The SimpleSoap code base was written to compile on Linux, but the MinGW C++ compiler (currently the compiler of choice for the firm) does not seem to run correctly within the Cygwin environment. Thus, the current approach has been to rewrite the SimpleSoap build scripts as Windows cmd files. However, this induces errors in the socket code, as the TCP/IP code is not compatible with using the “winsock” library in place of “sys/socket”. This also fails, due to apparent inconsistencies in the MinGW winsock implementation and the compiler’s treatment of function templates.
Utilizing Machine Learning to Infer Future Prices (Win32 GUI, C++ Logic)
A Boltzmann machine will be implemented to learn a pricing function from a data set.
Both prepayment and default can be parameterized in terms of the time to maturity, the credit rating of the obligor, the weighted average coupon, etc.
About the Boltzmann Machine
A Boltzmann Machine is a form of recurrent neural network with two types of nodes: hidden and visible. The weights on the connections between nodes are trained via gradient ascent, similarly to other neural networks. However, the weight update formula is somewhat anomalous, as it adds the difference between the expected value of the product of the two connected states’ outputs over the data set and the expected value of the same product over the thermal equilibrium (temperature 1) distribution of the model. The notion of thermal equilibrium is somewhat counter-intuitive, but it simply means the output of the model in the configuration that is most likely for the given weight set, so the weight adjustment is similar to standard gradient ascent, subtracting the actual (model) output from the expected (data) output. In addition, the output at a logic unit is a stochastic function of the dot product of the weights and inputs, not just the thresholded dot product as in a standard TLU. While most neural networks are used for supervised learning problems, the Boltzmann machine tends to be utilized for unsupervised feature clustering problems. The Hopfield Network, a simpler form of a Boltzmann network (without stochastic activation), is utilized for content-addressable memory. Content-addressable memory means that a data item can be found by querying the memory with a fragment of the data. How such an architecture relates to mortgage-backed security pricing is not intuitively obvious. However, precedent exists for Boltzmann network utilization in the field of credit risk (see Tomczak and Zieba, 2015). Basically, a Boltzmann machine is just a canonical neural network in which the state of a TLU is a probabilistic function.
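The weight update described in the paragraph above is conventionally written as follows (a standard statement of the rule, with epsilon denoting the learning rate; this notation is not taken from the application’s code):

```latex
\Delta w_{ij} = \varepsilon \left( \langle v_i h_j \rangle_{\text{data}} - \langle v_i h_j \rangle_{\text{model}} \right)
```

Here the first expectation is the product of connected states averaged over the training data, and the second is the same product averaged over the model’s equilibrium distribution.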
The final hidden layer can be construed as an output or inference layer, just as the output layer of a feed-forward neural network is. Neural networks are models of a biological brain, where the threshold logic units represent neurons and the weights are analogous to synapses. The training of weights can be considered similar to a memory function, whereby the weights enable the network to remember the correlation between the inputs and outputs (bidirectionally). Another interpretation is that the Boltzmann machine is likely to generate visible input vectors in the memorized set given the abstract state fixed in the hidden layer units. In this mortgage-backed securities application, the visible units will represent inputs such as the weighted average coupon, the pass-through interest rate, the notional, the borrowers’ credit ratings, the average maturity of the underlying loans, the PSA rate, the month number, etc. The hidden units will represent one of: the default rate for a month, the prepayment rate for the month, or (for a high-level inference engine) the price of the MBS securities.
Restricted Boltzmann Network Diagram
A restricted Boltzmann network has only one visible and one hidden layer, and no connections within a layer are permitted.
[Diagram: visible layer TLUs connected to hidden layer TLUs.]
Energy Computation
For a given TLU, an iterator traverses all connections and considers those beginning with that TLU (if a visible unit) or ending with that TLU (if a hidden unit). For each such connection, the quantity weight*state(neighbor) is added to the TLU’s energy.
Probability Computation
Given the energy, the probability that the state is set to 1.0 is 1/(1+e^(-energy)). If this probability exceeds the TLU’s threshold (by default 0.5), the state is set to 1.0. Likewise, the probability of any configuration of the visible and hidden units is p(v,h) = e^(-E(v,h))/∑u,g e^(-E(u,g)), where the sum in the denominator ranges over all visible and hidden configurations (u,g).
Restricted Boltzmann Machine Inference
Inference is carried out by clamping a set of states to the visible units and computing the values of the hidden units. The input units will represent the following properties of a mortgage-backed security issue: weighted average maturity, weighted average coupon, pass-through rate, number of loans, notional, and credit rating. Each input layer (visible layer) TLU will represent one bit of one of these values. The output layer will be configurable, representing the prepayment rate, default rate, or MBS price. Thus, a quantity like the FHA prepayment rates can be learned and utilized in a cash flow matrix in the same manner that the SMM rates were. The algorithm for inference is as follows.
1. Given a vector of variables (like the WAM and WAC), preprocess these into a vector of single bits (stored as doubles).
2. Assign the bits obtained to the visible layer TLUs in a 1-1 mapping.
3. Compute the energy of all hidden-layer TLUs.
4. Compute the probability of each hidden layer TLU’s being set to 1, given the energy.
5. Based on this probability, compute a random draw in the range 0 to 1; the probability is the chance that the draw will yield 1. (Multiply the probability by 1000 and take this as the cutoff. If a random number between 0 and 1000 is less than or equal to the cutoff, the draw returns 1, else 0.) If the draw is greater than or equal to the threshold for the TLU, set the state of the TLU to 1, else set the state to 0.
Mapping Double-Precision Floating Point to Binary
The “one unit” or “one TLU” description above means one multi-bit unit, which must be split into multiple single-bit units. This will be done using two techniques: making each TLU a selector, where setting the state to one means that this particular option was selected, or representing the value as a binary string.
An example of the selector method would be the notional, where three TLUs are utilized: the first set means a notional of 100 million, the second set means a notional of 10 million, and the third set means a notional of 1 million. An example of the second approach would be the weighted average coupon (WAC), where four bits are used to represent the percentage (0-15, e.g. 1000 for 8%). In the first case, three threshold logic units are used, and, in the second case, four units are used. For percentage inputs such as the risk-free rate and WAC, the rate is multiplied by 100 prior to translation to binary. Note that credit ratings are assumed to follow the Standard and Poor’s convention without the use of the +/- qualifier, and that the lowest rating considered is “D” (default). Note that “WAM”, weighted average maturity, is treated differently from “TTM”, total time to maturity: the WAM value is the average number of months remaining on the loans in the bond portfolio, while the TTM value is the number of months until the bond itself matures. The initial mappings are as follows (post-processing of output is included).
Input fields (field, type, number of TLUs, semantics):
- psa (double, 8 TLUs): in 1% increments, 0-255 (e.g. PSA 120% = 01111000)
- r (double, 5 TLUs): in 1% increments, 0-31%, rounded to the nearest percent
- wam (double, 10 TLUs): 0-1023 months
- wac (double, 5 TLUs): in 1% increments, 0-31%
- ptRate (double, 5 TLUs): in 1% increments, 0-31%
- numLoans (double, 6 TLUs): leftmost bit set -> 10^5 loans or fewer, second from left -> 10000 or fewer loans, ..., rightmost bit set -> 1 or fewer
- notional (double, 6 TLUs): leftmost bit set -> 1 bln or less, second from left -> 100 mln or less, ..., rightmost bit set -> 10000 or less
- rtg (string, 3 TLUs): 000=D (default), 001=C, 010=B, 011=BBB, 100=A, 101=AA, 111=AAA
- ttm (integer, 10 TLUs): average time to maturity (months), simple binary string in range 0-1023
- output type (integer, 2 TLUs, not implemented): 00=MBS price, 01=default rate, 10=prepayment rate
Output field (field, type, number of TLUs/bits, semantics):
- output value (double, 9 TLUs): If the output type is the MBS price, the 9 bits hold a value up to 511 (par 100), which is divided by 511 and multiplied by 100. If the output is a percentage of defaults or prepayments, the bit value is the rightmost 9 (significant) bits of the product (decimal rate times 511), rounded to the nearest integer. Thus, dividing the bit value by 511 yields the rate (this allows decimal percents less than 1%, and all 1’s refers to 100%, for a range of 0.0 to 1.0). Final output units are in decimal.
Preprocessor Class
A preprocessor abstraction has been designed for converting business-level parameters (weighted average maturity, notional, etc.) to bit vectors. High-level methods such as wacToBits invoke intermediate methods, such as doubleToFiveBits, which are common to multiple variables (risk-free rate, WAC, etc.). All fields needed for inference of the prepayment rate, default rate, and MBS price will be translated to bit vectors. The “preprocess” method will invoke all of these (as implemented initially; later only a subset may be used) and amalgamate the bits into a single network input bit vector.
The order of parameters in the input object is: PSA, risk-free rate, WAM, WAC, pass-through rate, number of loans, notional, credit rating, and time to maturity. As the parameters are heterogeneous, an encapsulating type has been defined to contain them. Preprocess methods for a parameter like r involve rounding the value, multiplying it by 100 (to convert it to a percent), and then considering the bits representing 2^4, 2^3, ..., 2^0. Thus, a decreasing for loop is run on the exponent of base two, where the value to be converted is tested against (1 << exponent). If greater than or equal, a “1” is pushed onto the vector; else a zero is pushed. Then the value is updated according to x = x % (1 << exponent), and the exponent is decremented. For the output used in supervised algorithms, the double-precision value is multiplied by 511 (the rightmost 9 bits of the integer form will be used) and then rounded to the nearest integer. Bits of this value are considered from left to right, where a one is pushed if the remainder of the original integer is greater than or equal to the current power of two; otherwise, a zero is pushed. For 500: 256 would be checked first, resulting in a 1; then the remainder (244) would be checked against 128 (push 1, giving 11); then 116 against 64 (111); then 52 against 32 (1111); 20 against 16 (11111); 4 against 8 (111110); 4 against 4 (1111101); 0 against 2 (11111010); and 0 against 1 (111110100). Two new classes have been added to handle the training and preprocessing phases: TrainInput and NetworkInput. TrainInput is the class created from the input file, which includes both input and output (i.e. rating, WAC, …, and prepayment rate). NetworkInput contains only the input fields (WAC, WAM, r, etc.). NetworkInput is the superclass of TrainInput.
Training Data and Loader
In order to validate the model, training vectors will be loaded from flat files and used to set the weights in memory.
Later, both the training vectors and the weights will be stored in the database. The file format will be the standard comma-delimited (.csv) format, with each row representing a distinct training vector and output pair. The CSV file will be translated to a vector of TrainInput objects, representing the inputs and outputs to the mortgage-backed securities model. For each of the three inference models (prepayment, default rate, MBS price), a sample file is below. Note that it makes little sense to mix the three types in a single file, as the neural network does not take the output type as an explicit input; thus, the weights learned would not be able to discriminate between output types for a given input vector. Here, the output type of 0 represents the prepayment rate, the output type of 1 represents the default rate, and the output type of 2 represents the MBS price. A rating of 3 is BBB (011), and a rating of 4 is A (100).
1. trainvecs_prepay.csv
psa,r,wam,wac,ptrate,numLoans,notional,rtg,ttm,outputType,outputValue
120,.05,360,.07,.06,10000,1000000000,3,360,0,.02
120,.05,360,.07,.06,10000,1000000000,3,360,0,.03
120,.05,360,.07,.06,10000,1000000000,3,360,0,.097
120,.04,330,.08,.06,1000,200000000,4,360,0,.011
2. trainvecs_default.csv
psa,r,wam,wac,ptrate,numLoans,notional,rtg,outputType,outputValue
.05,360,.07,.06,10000,1000000000,3,1,.04
.05,360,.07,.06,10000,1000000000,3,1,.035
.05,360,.07,.06,10000,1000000000,3,1,.026
.04,330,.08,.06,1000,200000000,4,1,.05
3. trainvecs_mbs.csv
psa,r,wam,wac,ptrate,numLoans,notional,rtg,outputType,outputValue
.02,330,.06,.04,10000,1000000000,7,2,102
.03,390,.08,.07,10000,1000000000,7,2,110
.03,360,.05,.04,10000,1000000000,2,2,80
.04,330,.08,.06,1000,200000000,3,2,90
Multiplicity of Models – Price, Default Rate, and Prepayment Rate
Due to the relative simplicity and generality of the Restricted Boltzmann Machine, the same basic formulas and design can be leveraged for more than one variable’s model.
Possible outputs include the prepayment rate, the default rate, and the MBS price, and these can be chosen within the construction kit graphical user interface. Prepayment and default rates will be per month and should be parameterized by the current month number or number of months remaining in the loan. The output quantity will be stored along with the topology in the XML files. For the time being, the same input variable set will be leveraged for all models, although this might later be configurable. These variables are: risk-free rate, WAM, WAC, pass-through rate, number of loans, notional, and rating. If the output is either the default rate or the prepayment rate in a given year (year n), the MBS price can be inferred by obtaining default and prepayment rates for all years up to the maturity of the security. These rates can be utilized to derive the cash flows, which are discounted and summed to obtain the price. Given that the MBS price is the output, assumptions can be made about the growth dynamics of the prepayment and default rates. Then, in each year n, the cash flow, default rate, and prepayment rate might be backed out, given the price (2n unknowns might be reduced to two: the initial-year prepayment and default rates. If one more assumption can be made, this can be combined with the cash flow’s present value formula to obtain a solvable system). For the time being, a single model will permit only one output type, although it would be possible to pass the output type as a parameter in the training vector. The output type will be passed as a separate parameter to the train and test methods.
Constructors and Instantiation
No constructors which create a usable network have been written initially, as the network topology is determined by the user. Basically, any constructor would require a set of visible TLUs, a set of hidden TLUs, connections between visible and hidden nodes, and weights for each connection.
The “createSampleNetwork” function calls the default network constructor and then adds the TLUs, connections, and weights by setting public instance variables. A method to instantiate a network from an XML file has been written, as has a TLU class. The TLU stores threshold (bias), index, name, and value fields. Note that the index is a unique identifier which, while intended to indicate the ordinal position of the TLU in the layer, is not explicitly used to sort the TLUs.
Sample of Inference in Programmatically Created Network
In the main method, a simple network with two nodes in each layer is constructed. Without training, inference is called using the weights of 1 and the inputs provided on the command line. The output type is set to be the prepayment rate. However, given only two input-layer TLUs, the preprocess function will still convert a vector of seven double-precision floating point numbers into a vector of 40 bits, of which only the first two will be utilized. Thus, only the first parameter serves to dictate the input into the neural network (the remaining 38 bits are discarded in the “setVisible” method, which assigns the state of visible-layer neurons from the coordinates of training input vectors). In addition, the two bits of output allow four discrete possible values for the prepayment rate. For now, these four values will be: {11=75.15%, 10=50.098%, 01=25.049%, 00=0%}. The following command line is utilized for this test.
c:\mbs>Boltzmann /?
Usage: Boltzmann (<risk-free rate> <WAM> <WAC> <pass-through rate> <number of loans> <notional> <rating> <trainDataFile> <outType>) | /?
OutTypes: 0=prepayment rate, 1=default rate, 2=MBS price
If “/?” is entered as the first argument, the usage message is printed. The inference on the network containing two TLUs in each layer was successfully tested. The topology is sketched below.
[Network sketch: Visible TLU 0 (input 1) connects to Hidden TLU 1 with weight .7; Visible TLU 1 (input 1) connects to Hidden TLU 0 with weight 1. Hidden TLU 0: energy 1, probability .731, threshold .5. Hidden TLU 1: energy .7, probability .668, threshold .5.]
Note that, as the input to the 0th visible TLU is 1 and the Visible0-Hidden1 weight is 0.7, the 1st hidden unit has an energy of .7. This generates a probability of 1/(1+exp(-.7)) = .668. The variance in the output seems correct and in accord with a stochastic neural network. One problem noted was the monotonically increasing nature of the first random draw. However, this is to be expected, as the random number generator is reseeded with the current time at each app restart. Given the verification of the neural network state propagation, more complex networks with full pre- and post-processing and contrastive divergence training can be implemented.
Test Run of Simple Network Inference
The following command line was executed:
Boltzmann .03 360 .07 .06 100 20000000 AA
The parameters are: 3% risk-free rate, 360-month weighted average maturity, 7% WAC, 6% pass-through rate, 100 loans in the portfolio, 20 million dollar notional, and AA credit rating. Basically, a network of two visible and two hidden nodes (drawn above) with the weights of .7 and 1 as above is fed the inputs on the command line. The final output of two bits is translated to a prepayment rate as follows: (bit0*256 + bit1*128)/511. The use of the rndDraw function given a probability to set the hidden layer was somewhat suspect over repeated runs in a short time. Since the random number generator is seeded with the time, the first random number of each run is monotonically increasing with the time elapsed between runs. However, for multiple calls in a single run, the numbers are random and, over a span of 1000 time increments, the draw will output a value below the threshold p in p of the steps.
Sample Training via Contrastive Divergence
The simple two-layer network illustrated above will be trained with the CSV file discussed in the “Training Data and Loader” section. The translation of the CSV file to a TrainInput object is a straightforward string-to-numerical (integer or double) conversion. However, the TrainInput object must subsequently be converted to a chain of bits to input to the threshold logic units via the Preprocessor class in the “positive” step, and to another chain of bits to set as the hidden output in the “negative” step of Contrastive Divergence. This is handled via the “preprocess” and “getOutBits” methods of the Preprocessor class, which are invoked by the “train” method of the Boltzmann class. The input fields for each training row are read into forty threshold logic units. Strangely, the “setPositive” method (later modified to set the hidden units from the training vector), which assigns the hidden units the outputs from the training vector and computes the products of visible and hidden TLU outputs on a connection, crashed in the second iteration of the loop over connections. The problem was that 40 input bits and 9 output bits needed to be condensed to two input bits and two output bits. This was corrected by adjusting the training code to handle the case where the number of neurons is fewer than the number of input bits (the leftmost bits are placed and the remainder dropped). Strangely, convergence then occurred in a single iteration. At each iteration, the current weight settings are printed to the console. In that one iteration, no update was made to either weight, so the weights remained at the original 0.7 and 1 values. This turned out to be due to an error in the negative phase, in which the TLU output computation was performed in the visible layer instead of the hidden layer. The second problem was that the bits passed to the visible layer TLUs were always zero.
This is likely because the only two bits used are the left two bits of the risk-free rate, which are always zero. This was corrected by setting the risk-free rate to 32% for one input vector (more visible TLUs would prevent such truncation), but the negative phase output was still zero, owing to confusion between the numerical format stored (double) and the format printed (int). This was corrected by storing the value, which is just the product of bits, as an integer. A problem similar to the setting of the visible bits from training data occurred in the setting of the hidden bits: only the first two bits of the prepayment rate were used in the output, and these were both 0. Thus, a prepayment rate of 100% was set for one training vector (again, using more hidden-layer TLUs would make this measure unnecessary). The one remaining problem was that outputs from the positive and negative phases seemed always to match, inducing convergence in one iteration without weight adjustment (and eliminating the stochastic element). The immediate convergence was resolved by lowering the convergence threshold of the weight delta (from 1 to .1), and the lack of divergence between the positive and negative phases was apparently an aberration. The update of the weights is correctly carried out as wij = wij + (positiveij - negativeij), where i and j are the indices of the visible and hidden layer TLUs, respectively. Excerpts from the training and inference logs illustrate the process. In each iteration, four training vectors are processed and two weights are updated per training vector. Only the first training vector resulted in interesting output, as this is the only one for which the visible-layer TLUs do not output all zeros (zero outputs force the visible-hidden products, and hence the updates, to zero).
Training:
bits size 40, visible size 2
visible len 2, bits len 2
setVisibleBits, bit 1
setVisibleBits, bit 1
setPositive, before for, num out bits: 9
connections[0] 1
setPositive for loop
setpos visind 1
setPositive, vis 1, hid 0
visible[visInd].outputVal 1
hidden[hidInd].outputVal 1
pair hash operator access
hash code returned 32768
pair hash operator access
hash code returned 32768
pair equal_to operator access
pair equal_to returns 1
after positive set, value 1
setPositive for loop
setpos visind 0
setPositive, vis 0, hid 1
visible[visInd].outputVal 1
hidden[hidInd].outputVal 1
pair hash operator access
hash code returned 1
pair hash operator access
hash code returned 1
pair equal_to operator access
pair equal_to returns 1
after positive set, value 1
after setPositive
setNegative entry
pair hash operator access
hash code returned 32768
pair equal_to operator access
pair equal_to returns 1
computeTLUState, layer 1, index 0, energy 1, prob 0.731059
rndDraw, rndInt 963, cutoff 731
pair hash operator access
hash code returned 1
pair equal_to operator access
pair equal_to returns 1
computeTLUState, layer 1, index 1, energy 0.7, prob 0.668188
rndDraw, rndInt 178, cutoff 668
visInd 0, hidInd 1
visible[visInd].outputVal 1
hidden[hidInd].outputVal 1
product: 1
pair hash operator access
hash code returned 1
pair hash operator access
hash code returned 1
pair equal_to operator access
pair equal_to returns 1
after negative set, value 1
visInd 1, hidInd 0
visible[visInd].outputVal 1
hidden[hidInd].outputVal 0
product: 0
pair hash operator access
hash code returned 32768
pair hash operator access
hash code returned 32768
pair equal_to operator access
pair equal_to returns 1
after negative set, value 0
setNegative exit
…
visible index 1, hiddenIndex 0, weight 2.000000
pair hash operator access
hash code returned 1
pair equal_to operator access
pair equal_to returns 1
visible index 0, hiddenIndex 1, weight 0.700000
iter 1, deltaWtSum 1.000000
Inference:
computeTLUState, layer 1,
index 0, energy 0, prob 0.5
rndDraw, rndInt 813, cutoff 500
pair hash operator access
hash code returned 1
pair equal_to operator access
pair equal_to returns 1
computeTLUState, layer 1, index 1, energy 0, prob 0.5
rndDraw, rndInt 175, cutoff 500
postProcess, outVec len 2
postProcPrepay, outVec size 2
while loop expVal 256, ind 0
outVec[ind]=0
while loop expVal 128, ind 1
outVec[ind]=1
postProc, vector 0, 1, retVal 0.250489
infer returns: 0.250489
rbm inferred prepayment rate: 0.250489
Test of Full Input/Output Prepayment Inference
Both the training and inference will be tested on a production-scale (56 inputs, 9 outputs) RBM. A new class, PrepayApp, was created to call the Boltzmann class’s training and test methods with the output type set to “prepayment”. Although they may be moved into the Boltzmann class later, the PrepayApp class maintains statistics about the accuracy of the training and testing. These metrics can be reported via the “printTrainSummary” and “printTestSummary” methods. The input must be read from a file into a TrainInput object, serialized to a bit vector, and read into the Restricted Boltzmann Machine. The output bits from the RBM must be postprocessed to convert the bits back into the prepayment rate (double-precision floating point). A critical problem in writing this application was how to report training statistics (number of training vectors correctly classified in training) and test statistics (number of test vectors classified correctly), known only to the Boltzmann object, to the PrepayApp object. This will be accomplished by modifying the Boltzmann class’s train and test methods to return a ReportInfo object. The test method will return the expected and actual results for each input vector in a list of TestResult objects along with the summary information, and the train method will include the number of iterations required for convergence and the convergence threshold.
The per-training-vector error computed by the training phase will be the positive-negative difference calculated within the contrastive divergence algorithm, and the metric in the case of testing will be the difference between the expected (training vector) prepayment rate and the actual rate. The per-vector error threshold will initially be fixed at .01, but this variable should be parameterized. The reporting functionality in the PrepayApp class can also access the weights in the RBM via the public members of the Boltzmann object. The “printTrainSummary” and “printTestSummary” methods will be similar, with “printTrainSummary” outputting the number of iterations, the overall threshold for convergence, and the final weights, along with the values printed in test (per-vector threshold, number of vectors, number correct).
Test XML File
The network topology itself will be described in an XML file with 58 input TLUs, 9 output TLUs, and fully connected visible and hidden layers. The full connectedness will induce a fairly large file, as there will be 58x9=522 connections to describe. However, it seems problematic to determine which input units should be connected to which output units, as every output bit appears to be a combination of all input bits.
RBM Training Algorithm
The training of a Restricted Boltzmann Machine is carried out by a method known as contrastive divergence. The objective is to learn weights which best correlate the visible node vectors of the training set with the hidden unit state vectors of the same training set (making this particular RBM implementation a supervised learning case). In the first iteration, given the inputs (the outputs of the visible layer TLUs), the hidden layer TLUs are set (based on the energy of each unit and the probability resulting from the energy). For each TLU pair connected by an edge eij, Positiveij = xi*xj is computed.
In the next iteration, the hidden unit values are clamped to the training vector output-induced values, and the visible layer units are inferred. Then the hidden units are derived again. The edge values derived at this point are denoted Negativeij. For each training vector, the weights are updated as follows: wij = wij + lambda*(positiveij - negativeij). Here, lambda is the learning rate. All training vectors are iterated over until convergence (error beneath threshold for hidden layer output) or some upper bound on the iterations. (Update: this is the canonical clustering algorithm, but the supervised categorization applications herein require the modifications outlined below.)
Contrastive Divergence Implementation in C++
The algorithm runs until either one hundred iterations complete or the error rate on the training set is less than ten percent. In each iteration, a positive and a negative pass over all connections are made, setting the Positive and Negative data structures. Using these positive and negative values, the weights are updated with each training vector. The steps of the given contrastive divergence algorithm are as follows.
1. Clamp the visual layer with the training inputs.
2. Use the visual layer values to compute the energies of the hidden nodes, the probabilities that they are on, and their states (sampling from the Bernoulli probabilities).
3. Set the “positive” data structure positive[i][j] = vi*hj, using the hidden node outputs.
4. Use the hidden outputs from (2) to compute the energies, probabilities, and states of the visual nodes.
5. Use the visual values to compute the hidden node values.
6. Set the negative[i][j] data structure using these hidden outputs (as outlined above).
7. The difference, positive[i][j] - negative[i][j], is the delta in weights[i][j].
At the end of the iteration, all training vectors are checked and the accuracy assessed. If the accuracy is sufficient, convergence is declared, and the algorithm halts.
The means to assess this accuracy is discussed in the derivation section below. Update: the algorithm steps will be modified as follows to provide a supervised learning approach. Step 2: clamp the hidden layer with the training vector output bits. Step 4: recompute the hidden layer using the energy formula and the input bits.

Derivation of Contrastive Divergence

The problem of the training algorithm is how to obtain weights such that the network's hidden units accurately predict a mortgage prepayment rate given the WAC, WAM, risk-free rate, etc. It seems that this algorithm requires some exogenous aspect, in that the output rates from the training samples must form part of the input to the training. Otherwise, the algorithm could at best cluster the training samples as a function of their inputs and outputs. An explanatory paper and implementation of a Restricted Boltzmann Machine and contrastive divergence at deeplearning.net appears to be more mathematically rigorous than the simpler CD algorithm outlined above. The probability of a neuron's state being on is expressed in terms of the “free energy”, and instead of computing the weights based on the difference between the hidden layer values in the positive and negative phases, the algorithm computes the gradient of the difference between the negative and positive free energies and adds the partial derivative of this “cost” function with respect to the weight to be updated. The code, implemented in the Python “Theano” library, calculates the gradient of the free energy function, which is apparently a vector of scalars with dimension equal to the sample batch size. In fact, the grad function apparently computes symbolic derivatives (and subsequently, it is assumed, makes numerical substitutions). Where the RBM is constructed and the training function is called for each network in the batch is unclear (the paper seems to take N samples of k batches each).
In fact, it was discovered later that the complete source on a separate web page includes a test function which calls the RBM __init__ method and the get_cost_updates contrastive divergence method. The Theano code, in fact, is identical in logic to the simpler contrastive divergence algorithm outlined above. It computes the same positive and negative energy values as contrastive divergence, but then takes the additional step of minimizing the “cost” function, which is the difference (Positive_ij - Negative_ij), by solving for the parameter set which makes the gradient 0 (using gradient descent). However, simply retaining the cost output and multiplying it by a learning rate to update the weights should have the same effect (if the reconstruction turns off units which should be on, Positive_ij - Negative_ij > 0 will hold, and the weights will increase, increasing the number of units likely to turn on). Unfortunately, both of these algorithms are based on the premise that only the visual layer is populated by training inputs. The purpose of the hidden layer is to facilitate feedback to reconstruct the visual layer. The ultimate goal of the algorithm seems to be to enable the visual layer to sample from the training probability distribution. However, such a usage is insufficient for the computation of metrics such as the prepayment rate or the MBS price. Thus, a discriminative version of the Restricted Boltzmann Machine must be utilized. This version will mimic the discriminative RBM described by Larochelle and Bengio, but will replace the gradient-descent weight updates with the simpler contrastive divergence described by Chen. This network will make the entire hidden layer consist of the class bits, and the positive stage of the CD algorithm will use the inputs and outputs directly from the training sample.
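A minimal sketch of the supervised positive phase just described, where the hidden layer is clamped directly to the training sample's class bits instead of being inferred (the function and parameter names are hypothetical):

```cpp
#include <vector>

// Supervised positive phase: the hidden layer is clamped to the training
// sample's output (class) bits, so positive[i][j] is simply the product
// of visible bit i and label bit j, with no inference step.
std::vector<std::vector<int>> supervisedPositive(const std::vector<int>& v,
                                                 const std::vector<int>& labelBits) {
    std::vector<std::vector<int>> pos(v.size(),
                                      std::vector<int>(labelBits.size(), 0));
    for (std::size_t i = 0; i < v.size(); ++i)
        for (std::size_t j = 0; j < labelBits.size(); ++j)
            pos[i][j] = v[i] * labelBits[j];
    return pos;
}
```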
The preprocessing step will need to be modified to preprocess the training set outputs, such as the prepayment rate, so they can be dealt with as bits in the hidden layer. The Preprocessor class will be modified to set the output bits as hidden in the positive phase, and the positive phase of contrastive divergence will not involve inference but will simply use the hidden values set from these outputs. One interesting issue addressed in the discriminative and sampling papers is the conditional probability distribution of a single TLU, p(h_i=1|v). The key insight is that the individual h_i values are independent, and thus the other hidden units can be ignored in considering h_i's probability. Bayes' Rule yields p(h_i=1|v) = p(h_i=1,v)/p(v), where p(h_i=1,v) = exp(-E(v,1))/Z and p(v) = (exp(-E(v,0)) + exp(-E(v,1)))/Z, with Z the normalization constant in both cases. Given expressions for E (energy) as free energy in terms of b'v, c_i, and W_i v, this simplifies to sigm(c_i + W_i v), where sigm(x) = 1/(1+exp(-x)).

Boltzmann Machine Classes

1. Boltzmann – represents the neural network. This includes the topology (as represented by the visible and hidden layer TLU vectors and weights), the training/testing, and the inference.
2. TLU – a single threshold logic unit.

C++ Template Specializations

In order to utilize the unordered_map template (hash table functionality) with custom classes like “Pair” as the key, a hash function and an equality operator must be provided to the unordered_map. This can be accomplished by passing user-defined function objects to the unordered_map constructor. A more elegant way, though, is to extend the std namespace and specialize the hash and equal_to struct templates for the user-defined type. This enables the default hash and equality calls made by the unordered_map to access the type-specific method. In the header file, the template specialization (indicating that an implementation for a specific type exists) is as follows.
//hash assumes at most 2^15 nodes/layer
template<> struct hash<Pair> {
    size_t operator()(const Pair& p) const;
};

In the .cpp file, the implementation of the operator is as delineated below.

size_t hash<Pair>::operator()(const Pair& p) const {
    printf("pair hash operator access\n");
    size_t code = 0;
    size_t maxVal = 1 << 15;
    code += maxVal * p.x1;
    code += p.x2;
    return code;
}

Aside: C++ Print Formatting Bug

An interesting phenomenon occurred when the author mistakenly declared an integer variable as a double-precision floating point member. The printf call formatted both this variable and another index variable using the “%d” designation for integers. The result was that the floating point number, a large integer, printed as 0, while the index variable (a monotonically increasing sequence starting with 0) printed as a 10-digit number. Correctly declaring the double variable as an integer solved the printing issue for both. A format specifier mismatch is undefined behavior in C: printf fetches its variadic arguments according to the format string, so passing a double where “%d” expects an int corrupts the reading of subsequent arguments as well.

Boltzmann Machine GUI

A graphical user interface for interacting with the Boltzmann machine C++ code will be written for the desktop platform. The classical Windows Win32 GUI library will be leveraged for this implementation. High-level functionality provided by this module will include: drag-and-drop of threshold logic units (TLU objects), drag-and-drop of connections between TLUs, weight assignments of connections, a train button to learn the optimal weights, an infer button to infer the equilibrium output state for a given input (visible layer), and energy computation for a given state and weight assignment. The user can select the qualitative meaning of the visible layer states (credit rating above BBB, notional above 100000000, maturity > 10 years, etc.) and of the hidden layer states (probability of default > 20%, probability of prepayment > 10%, etc.).
The objective is to utilize at least one combobox to specify the output type, (unimplemented) multiple combo boxes for the input types, buttons for the TLU components (neurons and connections), and a canvas to drag and drop the components onto.

Translating Graphical Images to a Neural Network

The key challenge of the GUI lies in detecting relationships between the shapes and lines on the canvas and constructing the Restricted Boltzmann Machine data structures from these pixels. The steps involved in the construction of a network graph from user mouse actions are as follows.

1. When an icon is dropped on the canvas, the image must be converted from an icon to a painted geometric object, if resizing is to occur. This is necessary for synapses and the preferred way to handle neurons. The synapses will be converted to painted lines using the “MoveToEx” and “LineTo” Win32 functions.
2. The overlap of neurons (circles) and connections (lines) needs to be checked using an “overlaps” function.
3. The “layer” of the network in which neurons are placed must be inferred based on their x-offset in the canvas (i.e., which side of the partition).
4. The new object’s extrema (enclosing rectangle for a neuron, endpoints for a synapse) must be stored in a hash table. For a TLU, this will map the geometric coordinates to an index in the visible layer or hidden layer (separate tables for each layer). For a connection, the endpoints are mapped to a pair of indices in the visible and hidden layers. A Boltzmann network object must be stored in the GUI instance. This is necessary, as the canvas must be redrawn with each move of a dragged neuron or synapse. Repaint redraws all images based on this data structure.
a. A “lines” vector will store all synapse connections and, when a synapse is dropped or moved, this vector will be updated.
b. All lines will be repainted whenever the canvas boundary and partition are repainted.
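Steps 2 and 3 above (overlap checking and layer inference) reduce to simple geometry; a sketch, with hypothetical names:

```cpp
// Which side of the canvas partition a neuron's x-offset falls on
// determines its layer: left of the partition is visible, right is hidden.
enum Layer { VISIBLE, HIDDEN };

Layer layerOf(int x, int partitionX) {
    return x < partitionX ? VISIBLE : HIDDEN;
}

// A synapse endpoint "overlaps" a neuron when it falls within the
// neuron's circle (center (cx, cy), radius r).
bool endpointInNeuron(int ex, int ey, int cx, int cy, int r) {
    int dx = ex - cx, dy = ey - cy;
    return dx * dx + dy * dy <= r * r;
}
```

An “overlaps” function for a full synapse would apply the endpoint test to each end against the candidate neurons in each layer's hash table.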
Canvas

The canvas is a white rectangle with a black line partitioning the two network segments (visible and hidden). Two text windows will also indicate the regions separated by the partition. A custom pen is created via the “CreatePen” GDI function to obtain a pen of user-defined width for the separator. A problem noted after the icon drag was implemented was that repainting the screen causes the canvas to become a gray gap (as opposed to the initial white rectangle with black border). Modifying the drop code to destroy the old window and add a new window to the canvas does not ameliorate this problem, nor does calling setupCanvas in the “handleDrop” code. Interestingly, decomposing the logic that paints the lines and background of the canvas from the code which instantiates the window, and invalidating the canvas rectangle in this function, seems to induce correct behavior. The “InvalidateRect” call is necessary to essentially delete the painted lines and background from the canvas's memory, it seems. Otherwise, the new paint commands do not replace the lines. The new paint function is called by the redraw (move) logic and the drop logic. One persistent problem is that the icon renders merely as a gray square. This seems to be due to the fact that the dragged image is still a child of the main window, not the canvas. A second problem was that the icons, when placed, were shifted. This was due to a failure to translate the main window coordinates to canvas coordinates, i.e. x -= canvX, y -= canvY. Lastly, a newer icon, when placed atop an older one, “underlaps” it, which is counter-intuitive. This has something to do with the paint order in the canvas, which is reversed when another icon is placed (possibly a bug in the Win32 library). Interestingly, this phenomenon was corrected by adding another “InvalidateRect” call (with the bErase parameter set to FALSE) to the drop handler immediately after creating the new icon as a child of the canvas.
The drag-and-drop functionality highlighted another problem with the use of Win32 windows and messages. Whenever an image icon needed to be erased and redrawn in the main window, the DestroyWindow and InvalidateRect methods were sufficient. However, when dragging or dropping on the canvas, these methods do not induce the icon to be deleted (the canvas needs to be repainted before this deletion takes effect). It seems that there may be a problem with the event-handling logic which prevents the WM_PAINT message from being correctly sent to and processed by the parent window. This problem is remedied by passing NULL as the window argument to InvalidateRect, which apparently induces the WM_PAINT message to be sent to the primary window, where it is processed by the original event handler, not the dispatch event handler set for the window when the drag event is processed. However, this causes all painted effects in the canvas to be erased. Also, whenever the icon is dragged over the rectangle or partition, these are erased. The only solution which has worked thus far is to repaint the rectangle and partition of the canvas with each move and drop of the icons. In addition, all lines need to be repainted (as will all TLUs which have been dropped, once they are stored as shapes and not images). An updated screen shot is below.

Selection of Canvas Objects

Once a neuron or synapse is dropped on the canvas, it becomes part of the neural network data structure. In order to allow network reconfiguration via operations such as the deletion of neurons, the transfer of neurons between layers, and the connecting of neurons in multiple fashions, the graphical objects must be selectable. Upon left-click selection, the synapses can be dragged, resized, and rotated to enable new connections between neurons. Neurons can be moved and resized. The right-click operation will enable deletion of neurons and synapses.
Selection behavior will mirror that in other commercial drawing applications, resulting in the highlighting of “action points” on the line or circle. For the synapse, three points, at the two endpoints and the middle, will be highlighted. The mouse-up event will cause the setting of a “selected” flag for the line in question, and the “paintCanvas” method will be updated to paint the “selected” line with dots. In addition, a subsequent mouse-down and mouse-up, if not on this line, must unselect the line, which is handled by marking the line's flag as “false” in the data structure. These actions, which involve fairly complex logic and dispatch, will be handled by a new module and class, the “NetworkManipulator”. This object needs to maintain a pointer back to the instantiating BoltzmannGUI's canvas object and lines vector. Unfortunately, the event handler method must be static, and there is no well-defined way to pass custom data about objects to it. Thus, the solution utilized was to make the BoltzmannGUI's reference to itself a public, static member of the class, which the NetworkManipulator can access. Once a line or neuron is selected, its selection points can be dragged. Dragging the endpoints of a line causes an expansion (if dragged away from the center) or a contraction (if dragged towards the center). A separate event handler was utilized for the drag mechanism in the case of icons, and it may be necessary to implement a similar facility for the dragging of lines (synapses) and circles (neurons). In fact, it was later determined empirically that the only way to handle LBUTTONUP events is via a separate event handler.

Shape Outlines and Fills

Shape outlines are drawn based on a “pen” (HPEN) handle, and shapes are filled with a color based on a “brush” (HBRUSH) handle. It seems that both the brush and the pen can be selected for the shape using the “SelectObject” function.
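The hit-test deciding whether a mouse-up selects a synapse can be sketched as a point-to-segment distance check (a hypothetical helper; the actual tolerance would be a UI constant):

```cpp
#include <algorithm>
#include <cmath>

// Hit-test a click against a synapse line: the line is selected if the
// click (px, py) falls within tol pixels of the segment (x1,y1)-(x2,y2).
bool hitsLine(double px, double py, double x1, double y1,
              double x2, double y2, double tol) {
    double dx = x2 - x1, dy = y2 - y1;
    double len2 = dx * dx + dy * dy;
    // Project the click onto the segment, clamping the parameter to [0, 1]
    // so that clicks beyond an endpoint measure distance to that endpoint.
    double t = (len2 == 0.0) ? 0.0
             : std::max(0.0, std::min(1.0,
                   ((px - x1) * dx + (py - y1) * dy) / len2));
    double qx = x1 + t * dx, qy = y1 + t * dy;
    return std::hypot(px - qx, py - qy) <= tol;
}
```

On a hit, the line's selected flag is set and its three action points painted; on a miss, any previously selected line is unselected.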
For example, ellipses for the line selection will be drawn with a black outline and white fill. A screen shot of the new lines, with one selected, is below.

Event Handler Specification

Interestingly, the functionality to handle clicks on lines in order to select them (to later move and expand them) reveals another possible bug in the Win32 logic. Whenever the “CallWindowProc” function is invoked on the previous dispatch function, the current dispatch function is called instead, yielding an infinite recursion. In addition, the “SetWindowLong” call to revert the event handler back to the previous function sets the value to the current event handler. This was remedied by explicitly passing the correct function names, but why the correct function pointer was not in the “previous handle” variable remains a mystery. When running in GDB with breakpoints around the CallWindowProc and SetWindowLong functions, no error occurred. Thus, this may be a timing issue between GUI threads (although it is believed that both handlers would run in the same thread).

Icon Tray

The icon tray is a set of two images which can be selected and dragged to the canvas. Upon placement of an icon on the canvas, its parent window is changed from the main window to the canvas. A screenshot of the icon tray with the canvas is below.

Win32 Methods and Types

Some non-intuitive structures and functions exist in Win32. WNDCLASSEX is a structure of metadata about the window. It is stored in a registry in memory via RegisterClassEx, which links the metadata to a name. The name is looked up and the window instantiated in memory as a displayable entity via the CreateWindowEx method invocation, which returns an HWND handle. The HWND can be passed to the ShowWindow function. The HWND is apparently an index into a table which references the bitmap for the painting, etc.
Interestingly, the CreateWindowEx method is also used to instantiate controls such as combo boxes, but the RegisterClassEx method need not be invoked before such instantiation. Icon images need to be placed in a resource file in order to assign them to a window, it seems (based on the documentation for the LoadIcon function). An event loop runs by calling the GetMessage function in a while loop until it returns 0 (apparently when the application process is terminated). One of the fields of the WNDCLASSEX struct is the event handler, known as lpfnWndProc. This method handles all events occurring on controls within the window. Note that the Win32 “CALLBACK” macro modifies this method; it specifies the stdcall calling convention, and the handler must be a free function or a static member. One of the parameters to the method, “lParam”, is a handle to the control which originated the event (e.g., the button which was clicked). Note that the message type is simply WM_COMMAND for all control notifications; the low word of the “wParam” contains the control's identification code, and the high word contains the notification code. One of the keys to implementing a window is to implement the event handler and handle messages for WM_NCCREATE, WM_CREATE, and WM_PAINT. These can all be processed using the “DefWindowProc” function, it seems. In addition, any “default” cases should be handled by calling this function. The order of messages sent on window instantiation was “GETMINMAXINFO”, “NCCREATE”, “NCCALCSIZE”, “CREATE”, “SHOWWINDOW”. In order to use a traditional combobox, a window must be created with class WC_COMBOBOX and style CBS_DROPDOWNLIST (editing is thus disabled). Strangely (using both the mingw and VisualStudio/Windows SDK libraries), the combobox list items could not be selected by a mouse click, but only by the keyboard arrows (finalizing a change by pressing “Enter”). When a mouse click occurs on an item in an expanded dropdown list, no event messages are even sent to the event handler method, it seems.
It was initially suspected that the COM method CoInitialize needed to be called to properly set up the message pump for the event loop, but even after adding this call, the combobox behavior was incorrect. In fact, the correct behavior was obtained by passing NULL as the second parameter to the GetMessage method in the event loop instead of the window handle. This is apparently due to the fact that the messages generated by mouse moves and clicks within the combobox's list are not attributed to the window handle and thus would not be dispatched (this is arguably an error, as messages to a child handle should be addressed to the parent window, and the static control correctly propagates messages, as does the combobox for keyboard-generated messages). The error handling of this API makes it somewhat troublesome to program against. For example, when setting an image in a static control, setting the styles to SS_CENTER and SS_BITMAP caused no image to display. However, no exception occurred in either CreateWindow or SendMessage.

Drawing

Drawing lines and shapes in the Win32 GUI is also somewhat non-intuitive, although not completely different from other graphical APIs such as Java Swing. Basically, the WM_PAINT event-handling logic must call the code which draws the geometry. In order to set the pen or brush colors, the “device context” must be set to represent the pen, and then the handle to the context passed to “SetDCPenColor”, along with the RGB values for the color ((0,0,0) for black, (255,255,255) for white, etc.). Interestingly, a “device context” is instantiated to provide the memory for display output operations (the “BeginPaint” function returns the handle to this memory). An object (such as a pen, brush, or region) is selected within the device context, and properties of the object (such as color) can be set. Note that the handle to the device context is a 32-bit integer which maps the current display context to a location in memory (in a lookup table of the OS).
The “DC” function set is part of the API known as the “Graphics Device Interface”, or “GDI”, whose symbols are located in Gdi32.dll. A screen shot of a simple UI, involving static text, a combo box, a rectangle, and images from a resource bundle linked into the executable, is below.

COM OLE Drag and Drop

Although low-level mouse-down, mouse-up, and mouse-move events could be processed for drag-and-drop, a higher-level API exists via object linking and embedding (OLE). The Component Object Model (COM) remote procedure call framework is utilized to send messages between objects which might reside in different processes (it is unclear that COM is really needed for drag and drop within a single window, but the OLE abstraction may require it by sending the events to an intermediary dispatch process). Classes for which these methods are implemented are part of the Microsoft Foundation Classes (MFC) object-oriented library. Unfortunately, the OLE and MFC libraries are not distributed with the MinGW compiler. The author located some files which will be tried with both the g++ and cl (Microsoft) compiler/linker sets. Note that one source of confusion will be the mixture of object-oriented MFC/ATL (ATL stands for Active Template Library, and is a COM abstraction) code and the C-style procedural Win32 code. One problem is that, in an MFC implementation, the main function, WinMain, resides in the DLL and not the user's module. In addition, GUI widgets are referenced on a “document” basis, and not as a “handle”. It is expected that, if the main dispatch loop is not started with the MFC libraries and the user interface state initialized within this library, it would be impossible to utilize the drag-and-drop methodology. Another source of confusion is the client-server paradigm of OLE and COM, when really only a single application process is involved. The premise of OLE is that data objects need to be shared at runtime between applications via Windows Explorer operations like cut and paste.
In the case of this neural network construction application, no cross-application sharing is required, so the COM and OLE libraries seem unnecessary.

Win 32 (C API) Drag and Drop

A custom Win32 implementation of the drag-and-drop functionality will involve handling of the WM_MOUSEMOVE, WM_LBUTTONDOWN, and WM_LBUTTONUP events, as well as tracking state stored within a DragDropHandler object. The drag-and-drop starts when the left button is pressed over a draggable image (the TLU or connection button, or images on the canvas). The image is redrawn in a semi-translucent form. After the drag begins, every mouse move of a distance of 10 or more pixels (without releasing the left button) induces a repaint of the semi-transparent image. When the left button is released over the canvas, either a TLU or a connection line is painted. This GUI structure must be reflected in data structures for the neural network in memory and serialized to an XML file. Unlike some of the other messages, the mouse button and move messages do not provide the handle to the window (i.e., the UI control, such as a button) where the action occurred. Thus, a means must be found to infer, from the X and Y coordinates which triggered the event, the UI widget upon which the mouse button was depressed. This will be accomplished via the GetWindowRect function, which returns the rectangle for a given window handle. Interestingly, the WM_LBUTTONDOWN message is never processed by the event loop, and mouse-downs are construed as clicks. The workaround for this is intercepting the WM_PARENTNOTIFY message (which occurs prior to a click message). This results in a correct button-down catch. However, the x and y coordinates of the mouse fall well outside those which are assigned to either of the draggable buttons. The reason for this seems to be that the coordinates returned by the GetWindowRect function are absolute coordinates in the OS display.
If the application window is shifted left, the left boundary of the rectangle shifts left accordingly. The solution to this problem was to find the rectangles for both the child (image button) window and the parent (application) window. For example, leftX = controlRect.left - windowRect.left, and rightX = controlRect.right - windowRect.left. Upon making this correction, the drag start methods are correctly invoked. New copies of the buttons are painted in response to the mouse-down event, but the redraw does not work correctly and the mouse-up event is not correctly detected. In addition, the images do not appear to be the translucent copies. This last issue was because the paintNewImg method used the original images (no Win32 bug). Another problem was that the position was passed to the LoadImage function instead of the size. The position is associated with the enclosing window and not with the bitmap itself. The mouse-up event was not conveyed via either a WM_LBUTTONUP message or a WM_COMMAND message, as was expected. The only message sent when the button is released is WM_SETCURSOR (code 32). However, this message is so general that catching it prevents the GUI's being repainted whenever the cursor is above the window. Interestingly, the mouse-up event was caught by setting the capture of the mouse to the window of the translucent image and by delegating a new dispatch function to the two network component buttons. Strangely, even when the button-up occurs outside of the image, the event is caught (due to the capture, it is suspected). In addition, it is peculiar that a dispatch to a separate event handler is necessary, but in fact the button-up event was not caught without such logic. Another issue was a crash that occurred when repeated mouse-down events occurred on the TLU or connection button. This seems to be due to a flood of MOUSE_ACTIVATE messages which occur on the second mouse-down.
The exception seemed to be ameliorated by setting the capture to the button window (not the application window) on the mouse-down and also associating the delegate with the button window. However, this is incorrect, as the mouse move coordinates would not be relative to the containing window's upper-left corner. Another way to eliminate the ACTIVATE message flood was to avoid calling the old wndproc method at the exit of the delegate (as this causes an infinite recursion of messages, it seems). Strangely, the method of subtracting the parent window's coordinate from the current window's coordinate to obtain the parent-relative offset does not seem to be fully effective (although it approximately works), as the connection button's x-boundary is off by 11 and its y-boundary is too low by 40. This problem was corrected by adjusting for the window's borders on the top and left (which accounted for the extra pixels). Such an adjustment is made by subtracting the “client window” width or height from the “window” width or height, as the client rectangle excludes the border. The left border is based on dividing the extra x-pixels by two, and the top border is based on subtracting the left border from the extra y-pixels. However, in this case, a mouse-move event occurs when the translucent image is first drawn, as the cursor's position changes; this was due to an incorrect setting of the previous position to the upper left of the image instead of the center. After the above adjustments (shifting from the center of the image to the upper-left corner prior to painting and converting screen coordinates to window coordinates), the initial select and drag of the icons was successful. However, while the mouse-up correctly identifies the image when no movement has occurred before it, the rectangle for the icon is incorrect after a drag has occurred. This was corrected by caching the image rectangle in the “translucentTLU” and “translucentConn” handles.
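The border and coordinate adjustments described above amount to the following arithmetic (a sketch with hypothetical names; in the real code the rectangles would come from GetWindowRect and GetClientRect):

```cpp
struct Rect { int left, top, right, bottom; };

// Convert a control's screen rectangle to coordinates relative to the
// parent window's client area. The border widths are derived from the
// difference between the window rectangle and the client rectangle:
// left border = extra x-pixels / 2, top border = extra y-pixels - left border.
Rect screenToClient(const Rect& control, const Rect& window,
                    int clientW, int clientH) {
    int extraX = (window.right - window.left) - clientW;
    int extraY = (window.bottom - window.top) - clientH;
    int leftBorder = extraX / 2;
    int topBorder = extraY - leftBorder;
    return { control.left - window.left - leftBorder,
             control.top - window.top - topBorder,
             control.right - window.left - leftBorder,
             control.bottom - window.top - topBorder };
}
```

For a window of outer size 400x300 with a 384x260 client area, this yields a left border of 8 pixels and a top border of 32 pixels (border plus caption), matching the kind of x/y offsets observed above.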
However, when one image was dropped, dragging and dropping a subsequent image caused the original dropped image to be erased. This was corrected by resetting the “imgHnd” pointer which is passed to the DestroyWindow function to NULL after a drop occurs. However, DestroyWindow does not cause the old images to disappear, and the constant reset of the canvas on paint causes placed objects to be lost. This was due to an incorrect default message handler implementation in the delegate, which must call CallWindowProc(prevWndProc, …) for all events except WM_LBUTTONUP (returning 1 in that case). The problem of the deletion of items left on the canvas was addressed by instantiating the canvas only in the setup code, not in the mouse-move event handler. Apparently, the default event handler cleans up the display in such a way that DestroyWindow's changes are effective, while the simple DestroyWindow and InvalidateRect statements alone are not sufficient. In addition, the button-up event caused an infinite loop by setting an event handler to its predecessor and then directly calling the predecessor of that handler (the original handler), causing an infinite recursion (f(a,b)->f(a,b)->f(a,b)…). A screen shot is below. As of now, the drag-and-drop functionality works, except that:

1. Drops can occur anywhere (they should be allowed only on the canvas).
2. Upon drop, the image should solidify and not remain an icon.
3. The image on the canvas should be selectable and movable.

The problem of discarding images not placed on the canvas was solved by calling DestroyWindow if the mouse button is released while the image is outside the canvas. In an attempt to reduce the flicker of the window during a drag, only the icon's rectangle was invalidated on a mouse move. However, this induced a disappearance of the image in the canvas. Invalidating only the canvas when the image was inside it failed to ameliorate this situation.
The final approach was to invalidate the entire window but to set the bErase parameter to FALSE, which induced less flicker than a call with bErase set to TRUE. The problem when not invalidating the window seems to be that, as the static icon windows are placed on the canvas without being children of the canvas, the canvas can be painted above them unless the window is repainted (as the root window remembers the order in which children were added). A minor issue is that the canvas border disappears when an icon is dragged to it. This is because the black rectangle is painted only in the setup method, while it should be restored in every WM_PAINT event handler.

Compiler and Linker

The GUI software was originally built using the mingw compiler and linker (g++ and ld). Later, an additional makefile was written to compile with the Microsoft Visual Studio compiler and linker (cl) and to link to the official Windows SDK libraries. Behavior of UI rendering and event messaging seems identical, but there is no standard output console by default when building with the cl compiler. This disparity was reconciled by adding a standard “main” method and (implicitly) causing cl to use the /Subsystem:Console option. Upon the implementation of logic encoding the placed graphical objects into data structures, the Boltzmann Machine core logic, including the TLU and Boltzmann modules, needed to be linked into the executable. Strangely, building the GUI along with the Boltzmann code generated errors involving the TLU class which did not occur when building the Boltzmann code independently of the GUI. This was due to the aberration that an enum defined in the DragDropHandler class contained a “TLU” enumerator, which the compiler had placed in its symbol table prior to parsing the TLU class.

Resource Files

In order to include custom images in a Win32 application, the API constrains the developer to the utilization of “resource” identifiers in specifying the image to load.
These resource IDs are mapped to an offset in a binary resource file, which is a bundle of all of the images (and possibly sounds, strings, etc.) used by the program. Basically, a ".rc" file, mapping an integer ID to a resource type and a file name, is compiled using the "rc" utility. This outputs a ".res" file, which the Microsoft linker can link into the executable. In order to use the GNU Compiler Collection (gcc) for the build, however, the "windres" utility must be used to convert the .rc file to an object (.o) file. Note that a resource file is not required if the image is installed along with the .exe and the LR_LOADFROMFILE flag is passed to the LoadImage function. In the case of the cl (Visual Studio) compilation, the .res file is linked directly into the output (like a .o or .lib file).

Icons in Win32

In Windows, program icons (for Explorer, the task bar, etc.) are stored in .ico files. Just as with bitmaps, these are attached to the executable via the resource bundle. Apparently, the first icon found within the resource bundle is selected by Windows as the program icon. Note also that .ico files are archives which can contain multiple copies of an image (at different sizes or resolutions), and that an external utility, Converticon, was needed to generate the image in the correct format. While exporting images at 16x16, 32x32, and 256x256 succeeded, preserving the original image resolution resulted in an error in the "windres" conversion program, which generates the ".o" object file from the resource script. The resource compiler output, executable link output, and icon result are below.

c:\mbs>makerc
c:\mbs>rc /v BoltzmannGUI.rc
Microsoft (R) Windows (R) Resource Compiler Version 6.1.7600.16385
Copyright (C) Microsoft Corporation. All rights reserved.
Using codepage 1252 as default
Creating BoltzmannGUI.res
BoltzmannGUI.rc.
Writing BITMAP:100, lang:0x409, size 67072.
Writing BITMAP:101, lang:0x409, size 36584.
Writing ICON:1, lang:0x409, size 1128
Writing ICON:2, lang:0x409, size 4264
Writing ICON:3, lang:0x409, size 270376
Writing GROUP_ICON:102, lang:0x409, size 48

c:\mbs>makeboltzmanngui
c:\mbs>windres BoltzmannGUI.rc BoltzmannGUIRes.o
c:\mbs>g++ -g -o BoltzmannGUI.exe BoltzmannGUI.cpp BoltzmannGUIRes.o -lGdi32

In order to set the image which appears in the upper-left corner of the window, the LoadIcon function is called and the result assigned to the hIcon field of the WNDCLASS struct. Two problems were encountered here: the ID number of the image could not be 102, which seemed to conflict with the Windows built-in "question mark" icon, and the handle argument to LoadIcon needed to be the current instance (not NULL, as is used for the default application icon). A screen shot, showing the task bar and upper-left corner icons, is below.

Pointers to Pointers

In order to set the address stored by a pointer from within a method, the pointer's address is passed to the function (pass-by-reference). This could be useful in the paint method, as a new window must be created and stored in a handle at instance scope, while the method remains indifferent to whether it is processing a connection image or a TLU image (polymorphism). However, the code was written so that the window handle is returned up to a parent function, where the type of the dragged image is determined.

Preventing Circular Header Dependency

In one instance, the BoltzmannGUI class retained a reference to a NetworkManipulator object, which maintained a pointer back to the parent BoltzmannGUI object that instantiated it. Since the BoltzmannGUI declaration is not yet visible when the preprocessor expands NetworkManipulator.h, mutual #includes in NetworkManipulator.h and BoltzmannGUI.h are not feasible.
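The standard cure is a forward declaration plus a pointer member. A one-file miniature (class names follow the report; the constructors and accessors are illustrative, not the project's actual interfaces):

```cpp
#include <cassert>

// One-file miniature of the header arrangement: BoltzmannGUI.h forward-
// declares NetworkManipulator and stores only a pointer; the full class
// definition is needed only where members are actually used (the .cpp file).
class NetworkManipulator;          // forward declaration ("BoltzmannGUI.h")

class BoltzmannGUI {               // may hold a pointer to an incomplete type
public:
    explicit BoltzmannGUI(NetworkManipulator* nm) : nm_(nm) {}
    NetworkManipulator* manipulator() const { return nm_; }
private:
    NetworkManipulator* nm_;
};

class NetworkManipulator {         // full definition ("NetworkManipulator.h")
public:
    explicit NetworkManipulator(BoltzmannGUI* parent) : parent_(parent) {}
    BoltzmannGUI* parent() const { return parent_; }
private:
    BoltzmannGUI* parent_;         // back-pointer to the instantiating GUI
};
```

A pointer to an incomplete type is legal because its size is known; only dereferencing it requires the full definition.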
This is avoided by using a forward declaration of "class NetworkManipulator;" in BoltzmannGUI.h, #include "NetworkManipulator.h" in BoltzmannGUI.cpp (where its methods are invoked), and #include "BoltzmannGUI.h" in NetworkManipulator.h.

Variable Scope in Switch Statement Cases

In a switch statement, variables declared in one case are visible in another case, but their values are not initialized in the latter case. As the message parsing logic is a large switch statement regulating the dispatch of GUI events, compile errors can easily occur. Partitioning the cases with braces {} prevents this shared scope.

Static Text Controls

Interestingly, WC_STATIC style controls display with a different background color than the window proper (gray instead of white). Including the SS_WHITERECT style does not remedy the situation, as the white background covers the text, it seems. For the labels in the canvas, the font size is manipulated using the WM_SETFONT message. Prior to this, a CreateFont call must be made, passing a logical font size based on the following transform for 10pt (from MSDN):

logHt = -MulDiv(10, GetDeviceCaps(hDC, LOGPIXELSY), 72)

Here, hDC is the device context for the screen, and MulDiv multiplies its first two arguments and then divides the product by the third. The background color can be set by the SetBkColor function, which takes a device context handle and an RGB color reference. The documentation suggests that this processing be done in the handler of the WM_CTLCOLORSTATIC message, in which a brush reference is returned to the caller that sent the message. In fact, setting the background outside of such a handler (which is called whenever the control is repainted) results in no color being painted. The "Visible Layer" and "Hidden Layer" labels below are examples of static controls with custom fonts and programmatically set backgrounds.
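The 10pt transform can be sanity-checked numerically. The sketch below re-implements MulDiv's round-to-nearest behavior portably (positive arguments only; the real Win32 MulDiv also handles negatives and overflow) and assumes a typical 96-DPI screen, where GetDeviceCaps(hDC, LOGPIXELSY) returns 96:

```cpp
#include <cassert>
#include <cstdint>

// Portable stand-in for Win32 MulDiv: (n * num) / den, rounded to nearest.
// Restricted to positive arguments for this illustration.
int mulDivSim(int n, int num, int den) {
    std::int64_t p = static_cast<std::int64_t>(n) * num;
    return static_cast<int>((p + den / 2) / den);
}

// Logical font height for a point size, per the MSDN transform quoted above:
//   logHt = -MulDiv(pointSize, LOGPIXELSY, 72)
int fontHeightForPoints(int pointSize, int logPixelsY) {
    return -mulDivSim(pointSize, logPixelsY, 72);
}
```

At 96 DPI this gives -MulDiv(10, 96, 72) = -13 as the CreateFont height for 10pt text; at 72 DPI (one logical pixel per point) it reduces to -10, as expected.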
Windows Store Deployment

Any code not embedded in the Android app (such as the Windows GUI) will likely be released as an app in the Windows Store.

Android ViewAnimator

Used to switch between the main valuation view and the machine learning module.

Android SDK

This application will utilize Android API version 16. The "aapt" binary is used to generate an ".arsc" resource file, as well as to update the "R.java" source file. The "dx --dex" utility is used to convert Java bytecode to a ".dex" archive to be run in the Dalvik Virtual Machine. Interestingly, it seems that the .dex file needed to be deleted prior to re-archiving in order to ensure that the most recent classes are included in the APK file. If this is the case, there may be a bug in the SDK's dx utility.

UI Layout

This application will utilize a two-sided UI which can be toggled via a "flip button" and which makes use of the ViewAnimator class. Note that the contents of EditText inputs, etc., do not change when the screen is toggled. The table for computation output uses a fixed column width set to 1/5 of the screen width in the current orientation. Code to obtain such a width uses the android.view.Display getSize method. Interestingly, when the table grows too large (more than 25 rows), the flip button to toggle back to the input screen cannot be reached. Note that, currently, the onCreate event is raised whenever the device is rotated. This should not be the case, as a rotation should retain the state of the ViewAnimator and EditText inputs.

Shape Drawable

A shape drawable is an XML file which is translated to code that paints an image. A preferable approach would be to instantiate this Drawable object in code, but even RectShape does not seem to provide the requisite methods to set the stroke width, fill color, etc. The shape configuration should be saved in the file <app_home>/res/drawable/<shape_name>.xml. These XML files are parsed by the aapt resource compilation utility.
For example, given the XML file:

<?xml version="1.0" encoding="utf-8" ?>
<shape xmlns:android="http://schemas.android.com/apk/res/android" android:shape="rectangle">
    <solid android:color="#ffffff"/>
    <stroke android:width="3dp" android:color="#000000"/>
</shape>

the following R.java fragment results:

public static final class drawable {
    public static final int data_bg=0x7f020000;

Note that the new R.class files need to be generated by makeaapt before the XML files can be utilized as background resources in UI widgets.

Test Case

Cash flow simulations were carried out both in an Excel spreadsheet and in the Android application (only the first 12 months of the simulation are covered by Excel for now). The parameters set in the input screen are as follows:

PSA: 1.2
WAC: .08
Pass-through: .07
Principal Amt: 2000000
Risk-Free Rate: .03
Months Remaining: 360

The Excel sheet output is below, followed by the Android input and output screen shots. Note that the Excel sheet includes "FHA" estimated prepayments, which are not relevant in version 1.0.

MBS Cash Flow Model. Prepayment model: PSA 1.2 (CPR increment 0.0024); WAC 0.08; pass-through rate 0.07; risk-free 0.03.
Note: "interest" is interest paid to MBS bondholders, not to mortgage lenders.
Note: prepaid principal below is month-start principal, minus the scheduled amount, times the prepayment rate.

Month  FHA prepaid(%)  CPR     SMM(%)      Begin principal  Monthly payment  Sched. principal  Interest
  1    0.0004          0.0024  0.00020022  2000000.0        14675.29         1341.96           11666.67
  2    0.0004          0.0048  0.00040088  1998257.9        14672.35         1350.63           11656.50
  3    0.0004          0.0072  0.00060199  1996106.7        14666.47         1359.09           11643.96
  4    0.0004          0.0096  0.00080354  1993546.8        14657.64         1367.33           11629.02
  5    0.0004          0.0120  0.00100554  1990578.7        14645.86         1375.34           11611.71
  6    0.0004          0.0144  0.00120799  1987203.1        14631.14         1383.12           11592.02
  7    0.0004          0.0168  0.00141090  1983421.1        14613.46         1390.66           11569.96
  8    0.0004          0.0192  0.00161426  1979234.0        14592.84         1397.95           11545.53
  9    0.0004          0.0216  0.00181807  1974643.3        14569.29         1405.00           11518.75
 10    0.0004          0.0240  0.00202234  1969650.9        14542.80         1411.79           11489.63
 11    0.0004          0.0264  0.00222708  1964258.6        14513.39         1418.33           11458.18
 12    0.0004          0.0288  0.00243228  1958468.9        14481.07         1424.61           11424.40

Note: CF assumes 100% pass-through of principal prepayment.

Month  Prepaid(FHA)  Prepaid(SMM)  End prin.(FHA)  End prin.(PSA)  Disc. factor  DCF(FHA)   DCF(PSA)
  1    799.46        400.17        1997858.58      1998257.9       0.997503      13773.61   13375.32
  2    798.76        800.53        1996108.47      1996106.7       0.995012      13737.04   13738.80
  3    797.90        1200.82       1993949.72      1993546.8       0.992528      13697.83   14097.73
  4    796.87        1600.80       1991382.60      1990578.7       0.990050      13655.98   14451.91
  5    795.68        2000.23       1988407.65      1987203.1       0.987578      13611.52   14801.10
  6    794.33        2398.86       1985025.66      1983421.1       0.985112      13564.46   15145.10
  7    792.81        2796.44       1981237.66      1979234.0       0.982652      13514.83   15483.70
  8    791.13        3192.73       1977044.95      1974643.3       0.980199      13462.65   15816.70
  9    789.30        3587.49       1972449.05      1969650.9       0.977751      13407.95   16143.88
 10    787.30        3980.46       1967451.77      1964258.6       0.975310      13350.74   16465.06
 11    785.14        4371.40       1962055.15      1958468.9       0.972875      13291.07   16780.05
 12    782.82        4760.07       1956261.46      1952284.2       0.970446      13228.95   17088.65

Bibliography

1. http://www.xavier.edu/williams/tradingcenter/documents/research/edu_ppts/01_MortgageBackedSecurities.ppt — CPR, SMM, etc. formulas
2.
http://developer.android.com/guide/topics/resources/drawable-resource.html — Shape XML specification
3. http://www.stackoverflow.com/questions/1016896/get-screen-dimensions-in-pixels — Retrieval of display size
4. http://en.wikipedia.org/wiki/PSA_prepayment_model — PSA discussion
5. http://www.scholarpedia.org/article/Boltzmann_machine — Boltzmann Machine overview
6. http://www.sciencedirect.com/science/article/pii/S0957417414006393 — Boltzmann Machine in credit risk
7. https://msdn.microsoft.com/en-us/library/windows/desktop/ms632680(v=vs.85).aspx — Win32 guide
8. https://www.allegro.cc/forums/thread/212101 — Discussion of dropdown lists in Win32
9. http://blogs.msdn.com/b/larryosterman/archive/2004/04/28/122240.aspx — Discussion of COM initialization and threaded apartments
10. https://msdn.microsoft.com/en-us/library/windows/desktop/hh298364(v=vs.85).aspx — Win32 combobox tutorial
11. http://blog.echen.me/2011/07/18/introduction-to-restricted-boltzmann-machines/ — Practical discussion of Restricted Boltzmann Machines
12. http://www.mingw.org/wiki/MS_resource_compiler — Discussion of resources under GCC
13. https://msdn.microsoft.com/en-us/library/windows/desktop/dd162957(v=vs.85).aspx — Selecting an object within a device context
14. http://portal.hud.gov/hudportal/HUD?src=/program_offices/housing/rmra/oe/rpts/hecmdata/hecmdatamenu — HECM data
15. http://newviewadvisors.com/commentary/tag/hecm-prepayments — Discussion of 2012 FHA data
16. http://stackoverflow.com/questions/7655736/win32-changing-program-icon — Win32 icons
17. http://www.converticon.com/ — Image conversion utility
18. http://stackoverflow.com/questions/8157937/how-to-specialize-stdhashkey-operator-for-userdefined-type-in-unordered — Template specialization for hash/equal_to
19. https://support.google.com/googleplay/android-developer/answer/113469?hl=en — Distribution of Android apps
20. http://developer.android.com/tools/publishing/app-signing.html — Signing the APK
21. https://msdn.microsoft.com/en-us/library/5t7ex8as.aspx — Win32 drag and drop
22. http://en.wikipedia.org/wiki/Active_Template_Library — ATL discussion
23. http://www.cboe.com/micro/gvz/introduction.aspx — GVZ Gold Volatility
24. http://www.codeproject.com/Tips/400920/Handling-WM-LBUTTONUP-WM-LBUTTONDOWNMessages-for — Catching WM_LBUTTONDOWN via WM_PARENTNOTIFY
25. http://wiki.winehq.org/List_Of_Windows_Messages — Win32 message codes
26. http://social.msdn.microsoft.com/forums/vstudio/en-US/8960c64a — Defining alternate window event handlers
27. https://msdn.microsoft.com/en-us/library/windows/desktop/ms632682(v=vs.85).aspx — DestroyWindow discussion
28. www.deeplearning.net/tutorial/rbm.html — Restricted Boltzmann Machine with Python implementation
29. Larochelle, H. and Bengio, Y. "Classification Using Discriminative Restricted Boltzmann Machines". Proceedings of the 25th International Conference on Machine Learning. Helsinki, Finland. 2008.
30. https://mbsdisclosure.fanniemae.com/PoolTalk/index.html# — Fannie Mae MBS Pool Application
31. https://msdn.microsoft.com/en-us/library/aa911419.aspx — Discussion of font height in Win32
32. http://winprog.org/tutorial/fonts.html — Font example code (Win32)
33. http://www.informit.com/articles/article.aspx?p=328647&seqNum=3 — GDI Pen and Brush discussion
34. http://www.fanniemae.com/portal/funding-the-market/mbs/multifamily/dusprepaymenthistory.html — Prepayment spreadsheet site
35. http://www.federalreserve.gov/releases/chargeoff/about.htm — Charge-off and delinquency data
36. https://en.wikipedia.org/wiki/Mortgage-backed_security#Valuation — MBS overview
37. http://info.fanniemae.com/resources/file/mbs/pdf/gems_remic_term_sheet_111214.pdf — FNMA REMIC term sheet example
38. http://www.ginniemae.gov/doing_business_with_ginniemae/investor_resources/mbs_disclosure_data/Pages/consolidated_data_history.aspx — Ginnie Mae reports of delinquency, etc.
39. http://www.nasdaq.com/symbol/mbb/historical — iShares MBS ETF prices
40. http://developer.android.com/reference/android/widget/TextView.html — Discussion of setCompoundDrawables methods
41. http://www.fanniemae.com/resources/file/debt/pdf/debt_library.pdf — Fannie Mae auction system
42. http://simplesoap.sourceforge.net/ — Simple SOAP
43. http://www.investinginbonds.com/learnmore.asp?catid=11&subcatid=56&id=137 — CMO types
44. http://www.wsj.com/mdc/public/page/2_3024-bondmbs.html — The Wall Street Journal MBS quotes
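As a closing cross-check of the Test Case arithmetic above, the month-1 figures can be reproduced from the standard formulas (a sketch only: the level-payment annuity, PSA ramp, and CPR-to-SMM conversion follow reference 1; function names are illustrative, not the application's API):

```cpp
#include <cassert>
#include <cmath>

// Sketch of the month-1 cash flow from the Test Case (PSA 1.2, WAC 8%,
// pass-through 7%, principal 2,000,000, 360 months remaining).

// Level monthly payment on a fixed-rate mortgage pool (annuity formula).
double monthlyPayment(double principal, double wac, int monthsLeft) {
    double r = wac / 12.0;
    return principal * r / (1.0 - std::pow(1.0 + r, -monthsLeft));
}

// CPR under the PSA ramp (first 30 months): 0.2% * month * multiplier,
// then flat at 6% * multiplier.
double cprPsa(int month, double psaMultiplier) {
    int m = month < 30 ? month : 30;
    return 0.002 * m * psaMultiplier;
}

// Single monthly mortality (monthly prepayment rate) from annual CPR.
double smmFromCpr(double cpr) {
    return 1.0 - std::pow(1.0 - cpr, 1.0 / 12.0);
}
```

With the test-case inputs, these functions reproduce the spreadsheet's month-1 row to its displayed precision: payment 14675.29, bondholder interest 11666.67, scheduled principal 1341.96, SMM 0.00020022, and SMM prepayment 400.17.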