MORTGAGE-BACKED
SECURITIES VALUATION FOR
MOBILE DEVICES
APRIL 3, 2015
ASYMPTOTIC INVESTORS
Contents
Motivation
Algorithms
MBS Risk Factors
  Credit Default Risk
  Prepayment Risk
Note Pricing
MBS Trading Strategies
Architecture Diagram
  Object Diagram
Google Play Deployment
FHA/FNMA Data Processing Using Machine Learning
File-Based Model Training User Interface (in development)
  XML Schema
  Control of Rendering of Restricted Boltzmann Machine in Android GUI
    Data Structures in Java for Restricted Boltzmann Machine Components
    Line Rendering Between Visible and Hidden Layers
  Text Nodes in DOM Tree
  Translation from FHA Data to XML
  File Selection for Browse Button
  Handling Invalid File Path
  Nesting a ScrollView for the Canvas
Utilizing Machine Learning to Infer Future Prices (Win32 GUI, C++ Logic)
  About the Boltzmann Machine
  Restricted Boltzmann Network Diagram
  Energy Computation
  Probability Computation
  Restricted Boltzmann Machine Inference
    Mapping Double-Precision Floating Point to Binary
    Preprocessor Class
    Training Data and Loader
  Multiplicity of Models – Price, Default Rate, and Prepayment Rate
  Constructors and Instantiation
  Sample of Inference in Programmatically Created Network
  Test Run of Simple Network Inference
  Sample Training via Contrastive Divergence
  RBM Training Algorithm
    Contrastive Divergence Implementation in C++
  Boltzmann Machine Classes
    C++ Template Specializations
  Aside: C++ Print Formatting Bug
  Boltzmann Machine GUI
    Translating Graphical Images to a Neural Network
    Canvas
    Icon Tray
    Win32 Methods and Types
    Compiler and Linker
    Resource Files
  Icons in Win32
  Pointers to Pointers
  Preventing Circular Header Dependency
  Variable Scope in Switch Statement Cases
  Static Text Controls
Windows Store Deployment
Android ViewAnimator
Android SDK
UI Layout
Shape Drawable
Test Case
Bibliography
Motivation
Mortgage-backed securities are a highly relevant aspect of the global economy and a product group to
which financial engineering methods can be applied in many ways. The market size has been
estimated at 13.7 trillion USD in US issuances alone. Currently, these products are traded via the
TradeWeb fixed income platform and have denominations such as USD 1,000 (FNMA class ASQ1),
25,000 (for GNMA issuances), and 100,000 (FNMA class X1) for the government-agency-issued notes.
Methods which can accurately price, trade, and account for the risk of these products could yield
significant returns for a hedge fund.
Algorithms
This application computes the Conditional Prepayment Rate (CPR), Single Monthly Mortality (SMM),
scheduled payment, and monthly principal balance for mortgage-backed security issues. The following
data fields will be rendered in a time series table.
Monthly Payment – depends on the number of months remaining
CPR – Conditional Prepayment Rate, which depends on the current month (the percentage increases
month by month up to 6%) and the PSA rate, i.e. if month <= 30, CPR = .002*PSA*month_index; otherwise CPR = .06*PSA.
SMM – Single Monthly Mortality, which is 1-(1-CPR)^(1/12)
Scheduled Interest – pass-through rate times begin principal
Scheduled Principal – monthly payment minus scheduled interest
Prepaid Principal (SMM) – SMM percent times (begin principal minus scheduled principal)
End Principal – begin principal minus (scheduled principal + prepaid principal)
Total Cash Flow – the discount rate input into the UI is used to move cash flows back to time zero.
The tabular computational output is stored within the “Vector<MonthInfo> monthData” data structure.
The GUI consists of a Vertical ScrollView, which contains a horizontal ScrollView, which contains a
vertical LinearLayout, which contains a series of horizontal LinearLayout instances.
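The month-by-month fields listed above can be sketched as follows. This is an illustration, not the application's actual MBSCalc source: all class and method names here are hypothetical, and the scheduled-principal line follows the document's simplified definition (monthly payment minus pass-through interest).

```java
public class MbsCashFlowSketch {
    // Present value of the pool cash flows under the field definitions above.
    public static double presentValue(double wac, double ptRate, double psa,
                                      double r, double notional, int term) {
        double begin = notional;
        double pv = 0.0;
        for (int month = 1; month <= term; month++) {
            int monthsLeft = term - month + 1;
            double i = wac / 12.0;
            // Monthly payment depends on the number of months remaining.
            double payment = begin * i / (1.0 - Math.pow(1.0 + i, -monthsLeft));
            // CPR ramps by 0.2% per month, capped at 6%, scaled by the PSA rate.
            double cpr = Math.min(0.002 * month, 0.06) * psa;
            // SMM expresses the annual CPR in monthly form.
            double smm = 1.0 - Math.pow(1.0 - cpr, 1.0 / 12.0);
            double schedInterest = ptRate / 12.0 * begin;    // pass-through rate * begin principal
            double schedPrincipal = payment - schedInterest; // monthly payment - scheduled interest
            double prepaid = smm * (begin - schedPrincipal);
            double totalCashFlow = schedInterest + schedPrincipal + prepaid;
            pv += totalCashFlow / Math.pow(1.0 + r / 12.0, month); // move back to time zero
            begin -= (schedPrincipal + prepaid);             // end principal for this month
        }
        return pv;
    }

    public static void main(String[] args) {
        double pv = presentValue(0.08, 0.07, 1.2, 0.03, 2_000_000.0, 360);
        System.out.printf("PV = %.2f, par-100 price = %.2f%n", pv, 100.0 * pv / 2_000_000.0);
    }
}
```

In the real application each month's intermediate values would be stored in a MonthInfo object and appended to the monthData vector for rendering in the table.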
A second algorithm, implemented on the server-side with an Android client, will be a Restricted
Boltzmann Machine mechanism for calculating the prepayment rate, default rate, or MBS price based
on a model deduced from training data.
MBS Risk Factors
Credit Default Risk
While default risk is not accounted for explicitly, the average coupon is an interest rate which reflects
the risk of the lender, while the pass-through rate reflects the risk of the note investor. An additional
risk adjustment might be made utilizing the credit valuation adjustment (CVA), which reduces the asset
value by the present value of the expected loss-given-default. Ideally, the present value of the interest
cash flows above the amount accounted for by the risk-free rate (e.g. LIBOR) would equal the CVA
amount. Such an adjustment (along with the additional metrics of Debt Valuation Adjustment (DVA)
and Funding Valuation Adjustment (FVA)) may be implemented in version 2.0.
Prepayment Risk
Prepayment risk is the danger that interest payments will be forfeited due to the early repayment of
mortgage loans by borrowers. This usually occurs when refinancing takes place during periods of low
interest rates. Multiple methods of prepayment estimation exist. In the initial release, the single
monthly mortality (SMM) measure will be computed. This is a heuristic which expresses the annual
conditional prepayment rate (CPR) in monthly form. The CPR is computed by multiplying a time-based
amount (min(.002*monNum, .06)) by the Public Securities Association (PSA) method scale factor.
In a later version, a machine-learning approach to estimating the prepayment rate in a given month by
training a model on historical Federal Housing Administration (FHA) data is proposed.
Product Universe
Two types of mortgage-backed securities exist: agency pass-throughs and collateralized mortgage
obligations (CMOs). Separate pricing lines exist for CMOs and pass-throughs at sites such as The Wall
Street Journal, and the quote convention also differs. Pass-throughs are quoted as a “per-100” price
(e.g. 98 14/32 or 106 10/32), while CMOs are quoted as a basis point spread above US Treasury bonds of
comparable maturity (e.g. 175 bps). Pass-through securities forward principal and interest cash flows
received from loan collections through to the bondholders indiscriminately. CMOs, on the other hand,
are issued in multiple tranches, whereby the most senior tranche will receive payment priority in the
event that cash flows are inadequate to pay the promised principal and interest on the bonds.
Collateralized mortgage obligations themselves are organized into two classes: sequentials and planned
amortization class (PAC) bonds. Sequentials distribute the prepayment risk among tranches, requiring
all principal payments to be allocated to the lowest tranche first until it is retired. Subsequent principal
payments flow to the next lowest tranche, and so on. Interest payments are allocated evenly across the
tranches. It is unclear how the default risk is distributed among the tranches, but it would be expected
that the junior tranches would receive interest payments after the senior tranches in such a case. The
PAC bonds also involve allocation of prepayment and default cash flows (or lack of them). However, in
this case, the unexpected cash flows are diverted away from the PAC tranche, making these bonds
behave like a treasury bond with predictable cash flows.
It is expected that the initial products traded will be FNMA pass-throughs exclusively. In the second
phase, FNMA collateralized mortgage obligations will be added, with Freddie Mac (FHLMC) and Ginnie
Mae (GNMA) offerings to follow.
Note Pricing
The price is canonically defined as the present value of the cash flows of the MBS. In the case of this
application, adding the “Discounted cash flow (PSA)” column will obtain the value, with a value
equivalent to the notional indicating par. In an Excel sheet with two different prepayment options, one
a constant 4 basis points, and the other a graduated SMM approach reaching a maximal value of 6%, the
slower prepayment approach was worth significantly less. The value on a scale in which 100 is par is
also computed, with the slow prepayment approach worth about 83 and the SMM cash flows worth
more than 134.
The PSA discounted cash flow value and the par-100 price are also computed in the Android application.
The values obtained under identical parameterization to the Excel sheet (PSA=1.2, WAC = .08, PT Rate
= .07, notional = 2000000, r=.03, T=360) were identical to the outputs of the Excel macro. A sanity check
using only a single month (so the entire principal amount was scheduled for one payment) produced
similarly correct results (2006643.78 present value and 100.33 normalized price).
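As a quick check of the normalization, the par-100 price is just the present value rescaled by the notional:

```latex
\text{price}_{100} = 100 \times \frac{PV}{\text{notional}}
                   = 100 \times \frac{2{,}006{,}643.78}{2{,}000{,}000} \approx 100.33
```

which matches the single-month sanity check figures above.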
MBS Trading Strategies
The primary market for mortgage-backed securities (at least the agency notes) involves purchasing the
debt from the issuer (e.g. FNMA). Two types of securities, Discount Notes and Benchmark Bills, can be
acquired. Discount Notes are issued via Federal Reserve banks and sold by a limited group of brokers,
while Benchmark Bills are purchased via an electronic Dutch Auction. Nonetheless, the more liquid
market for MBS securities appears to be the secondary market, specifically the TBA, or to-be-announced
market. In this scenario, counterparties set a price for securities of a given notional value to be sold for
on a delivery date (or within a settlement month). The exact securities are not specified, but the
contract indicates the issuer, mortgage type, maturity, and coupon.
Trading in the secondary market might involve arbitrage with contracts of different coupon rates and
payment priority (tranche). For example, a butterfly strategy might involve going long one 5% coupon
contract, selling two 7% coupon contracts, and buying one 9% contract.
Architecture Diagram
[Diagram: MBSPricer (View 1), MBSTable (View 2), and MBSCalc (Quant Logic), connected via a table
model and a “compute cash flows” interaction.]
Object Diagram
[Object diagram: a GUI Module containing NetworkManipulator, BoltzmannGUI, and DragDropHandler;
and a Boltzmann Machine Training and Inference module containing Boltzmann and TLU.]
Google Play Deployment
The calculator utility (and, hopefully, later the neural network code, although this may be sold
separately on the Windows AppStore) will be available for sale on Google Play. In order to accomplish
this, the following steps were necessary.
1. Obtain a Google developer account
2. Generate the APK archive via ApkBuilder using the -u (unsigned) flag, not the -d (debug sign) flag.
3. Sign the APK in release mode (using keytool and jarsigner)
   a. keytool generates a private/public key pair
   b. jarsigner affixes the digital signature to the APK
4. zipalign the APK file – this aligns uncompressed data within the archive on four-byte boundaries,
allowing Android to memory-map resources directly at runtime.
5. Upload the APK into the production store, setting the price and metadata
6. Upload images sized to the Play specifications for the icon, feature graphic, and promo PNG.
7. Complete a content rating questionnaire.
8. Create a privacy policy web site
9. Create a merchant account.
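Steps 2 through 4 above can be made concrete with the command-line tools named in the list. The flag spellings follow the Android signing documentation of this era; the keystore, alias, and APK file names below are hypothetical.

```shell
# Generate a release private/public key pair in a keystore (names hypothetical).
keytool -genkey -v -keystore my-release-key.keystore -alias release \
        -keyalg RSA -keysize 2048 -validity 10000

# Affix the digital signature to the unsigned APK.
jarsigner -verbose -sigalg SHA1withRSA -digestalg SHA1 \
          -keystore my-release-key.keystore MBSPricer-unsigned.apk release

# Align the archive contents on 4-byte boundaries.
zipalign -v 4 MBSPricer-unsigned.apk MBSPricer-release.apk
```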
FHA/FNMA Data Processing Using Machine Learning
Aside from the PSA/SMM approach to estimating prepayment rates of the pool, data from the Federal
Housing Administration (FHA) about home loans can be used to derive a prepayment model. Rows in
the tabular data released by the FHA can be parsed as training vectors for a machine learning algorithm.
The Federal Housing Administration programs now fall under the auspices of the Department of
Housing and Urban Development (HUD). The GUI application will permit loan data to be downloaded to
train Boltzmann prepayment and default models. While the HUD site contains numerous data links,
some appeared to not lead to any site, and others linked to datasets which lacked prepayment or
default information. However, a search of other sites indicated that the Home Equity Conversion
Mortgage (HECM) data could be interpreted to yield prepayment rates. While this is actually data about
“reverse mortgages”, it might be interpreted to represent the state of the home loan market in
general (if no more direct source is available). This inference, however, cannot be made directly from
the data, as there is no “prepayment” column or attribute in the schema. One possible formula is, for
each year, compute (number of loans paid to termination)/(number of loans outstanding). However, it
seems unlikely that the mapping between government loan performance and MBS defaults and
performance is linear, as the FHA loans seem likely to be for low-income housing, whereas the
mortgage-backed securities are comprised of loans for multiple grades of housing.
It seems that Fannie Mae’s PoolTalk site, which provides issuance and monthly statistics about MBS
securities for lookup by CUSIP, might be a more reliable source of information. A screenshot of this
application is below (copyright FNMA, used without permission).
New Issuance files are divided by record types, where 1 is the pool statistics record, 3 is the loan
purpose, and 9 is the geographic region. Each pool id is associated on a one-to-one basis with the
CUSIP identifier for MBS securities. The total notional of the pool and the number of loans in
various geographic regions are identified in the new issuance files. In the excerpt below, pool AS5545
(CUSIP 3138WFET9) contains 1023 loans, 739 of which are purchase loans and 284 of which are refinance
loans. Eighty-four were issued for properties in California, and thirty-six were made for properties in
Florida.
AS5545|01|3138WFET9|07/01/2015|FNMS 03.5000 CLAS5545|$190,124,021.00|3.5||08/25/2015|MULTIPLE|MULTIPLE|1023|185911.73|08/01
/2045|||||||4.064|||0|360|360|80|751|0.0||4.42|CL ||29.62|81
AS5545|02|MAX|199999.0|4.25|97.0|824|360|7|361
AS5545|02|75%|192000.0|4.125|92.0|786|360|0|360
AS5545|02|MED|185250.0|4.125|80.0|757|360|0|360
AS5545|02|25%|180000.0|3.99|75.0|722|360|0|360
AS5545|02|MIN|175000.0|3.875|20.0|623|301|-1|301
AS5545|03|PURCHASE|739|72.28|$137,414,031.68
AS5545|03|REFINANCE|284|27.72|$52,709,989.54
AS5545|04|1|1016|99.29|$188,770,295.22
AS5545|04|2 - 4|7|0.71|$1,353,726.00
AS5545|05|PRINCIPAL RESIDENCE|951|92.9|$176,618,379.90
AS5545|05|SECOND HOME|37|3.65|$6,929,248.12
AS5545|05|INVESTOR|35|3.46|$6,576,393.20
AS5545|08|2015|1022|99.9|$189,931,563.39
AS5545|08|2014|1|0.1|$192,457.83
AS5545|09|ARIZONA|21|2.05|$3,902,863.41
AS5545|09|ARKANSAS|6|0.58|$1,098,500.00
AS5545|09|CALIFORNIA|84|8.23|$15,642,819.25
AS5545|09|COLORADO|50|4.86|$9,237,062.23
AS5545|09|CONNECTICUT|5|0.48|$917,560.00
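Records like those above can be parsed by splitting on the pipe delimiter and dispatching on the record type in the second field. The sketch below extracts the loan-purpose counts (record type 03); the field positions are inferred from the excerpt, not taken from an official layout, and all names are hypothetical.

```java
import java.util.*;

// Sketch of parsing the pipe-delimited New Issuance records shown above.
// Field positions are inferred from the excerpt, not an official layout.
public class PoolRecordParser {
    public static Map<String, Integer> loanPurposeCounts(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            String[] f = line.split("\\|");
            // f[0] = pool id, f[1] = record type ("03" = loan purpose in the excerpt),
            // f[2] = purpose label, f[3] = loan count
            if (f.length > 3 && f[1].equals("03")) {
                counts.put(f[2], Integer.parseInt(f[3]));
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "AS5545|03|PURCHASE|739|72.28|$137,414,031.68",
            "AS5545|03|REFINANCE|284|27.72|$52,709,989.54");
        System.out.println(loanPurposeCounts(sample)); // {PURCHASE=739, REFINANCE=284}
    }
}
```

The same dispatch pattern extends to the other record types (01 pool statistics, 09 geographic region) by adding cases on f[1].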
Unfortunately, neither the fixed rate monthly file nor the loan-level disclosure file contains fields
disclosing the prepayment rate or the default rate (credit events). However, the loan-level disclosure
file provides some clues, via the initial and current coupon rate and the initial and current unpaid
principal balance. A rising coupon indicates likely credit issues, and a steeply declining unpaid principal
balance is a sign of prepayment. The yield maintenance premium is provided in some files, like those for
discount mortgage-backed securities (DMBS). However, while this provides insight into the cash flows
resulting from prepayment, it does not represent the likelihood of prepayment. It seems likely that
some other source of historical MBS performance, indicating prepayments, defaults, and prices over the
term of the underlying loans is needed in order to obtain training vectors in the requisite format.
However, FNMA does publish a spreadsheet of prepayment data which specifies origination year, month
of prepayment, and amount. This information would be sufficient to create a corpus for the
prepayment rate training, given the appropriate data translation. In addition, the US Federal Reserve
Bank publishes historical charge-off and delinquency rates for real-estate loans at a quarterly granularity
which, when combined with other data about outstanding loan notional and term values, could provide
training for the default rate metric. Ginnie Mae also seems to offer statistics of delinquency and
unscheduled paydowns. Both the number of loans and the unpaid balance amounts are provided for
thirty, sixty, and ninety day delinquencies. MBS pricing information is available on a daily basis from
sites such as The Wall Street Journal, and TradeWeb provides a limited 12-month graph of prices. The
iShares ETF historical pricing information is available on the NASDAQ web site.
File-Based Model Training/Test
A third screen is being added to the Android application to import Housing and Urban Development
(possibly), US Treasury, FNMA, and GNMA datasets into the Restricted Boltzmann Machine model’s
training set. The first utility implemented in this UI is the parsing of Restricted Boltzmann Machine XML
and the rendering of the network in the Android GUI. This will involve two segments: a file selection
panel and a Restricted Boltzmann Machine Rendering view. A preliminary screen shot (prior to
implementing the XML parser and renderer) is below. Upon examining the model, data can be loaded
and Contrastive Divergence training run on the model or, if training is complete, inference can be
carried out on an input vector (these UI controls will be designed later).
The XML file is stored on the device’s SD card and facilitates both the graphical rendering and the
serialization of the network graph. It will be read using the I/O methods
Environment.getExternalStorageDirectory and BufferedReader’s readLine.
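A minimal sketch of the line-by-line read follows. On the device the base directory would come from Environment.getExternalStorageDirectory(); since that API is Android-only, the sketch uses plain java.io with a hypothetical class name.

```java
import java.io.*;

// Sketch of reading the RBM XML file line by line. On Android, the File passed
// in would be rooted at Environment.getExternalStorageDirectory().
public class XmlFileReader {
    public static String readAll(File file) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(new FileReader(file))) {
            String line;
            while ((line = br.readLine()) != null) { // readLine is on BufferedReader
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }
}
```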
XML Schema
The XML file encodes the sequence of visible layer threshold logic units, hidden layer threshold logic
units, and inter-layer connections. In the future, tags for input and output values of each TLU should be
added, as should tags for the energy. An optional tag, the name of the node, such as “WAC0” to
indicate the leftmost bit of the WAC input, is included but not currently parsed (should be added to
Android display). A sample of such a file is below.
<?xml version="1.0"?>
<RBM>
  <visibleLayer>
    <TLU>
      <name>WAM0</name>
      <index>0</index>
      <value>0</value>
    </TLU>
    <TLU>
      <index>1</index>
      <value>0</value>
    </TLU>
  </visibleLayer>
  <hiddenLayer>
    <TLU>
      <index>0</index>
      <value>0</value>
    </TLU>
    <TLU>
      <index>1</index>
      <value>0</value>
    </TLU>
  </hiddenLayer>
  <connections>
    <connection>
      <visibleIndex>0</visibleIndex>
      <hiddenIndex>1</hiddenIndex>
      <weight>10</weight>
    </connection>
    <connection>
      <visibleIndex>1</visibleIndex>
      <hiddenIndex>0</hiddenIndex>
      <weight>5</weight>
    </connection>
  </connections>
</RBM>
A new optional child element of <TLU> has been added to accommodate the threshold (bias) of a unit.
This is a double-precision floating point value known as <threshold>. A TLU including this element might
be specified as follows.
<TLU>
  <index>0</index>
  <value>0</value>
  <threshold>1.2</threshold>
</TLU>
Control of Rendering of Restricted Boltzmann Machine in Android GUI
The RBM rendering will be handled in the “onDraw” method of the CanvView view nested in the FHAUI
view. This will check the encapsulating FHAUI’s “visibleLayerPts” and “hiddenLayerPts” hash tables and
the “connections” vector for the network objects to be rendered. These hash table and vector data
structures will be populated from the XML file, which is loaded and parsed into a Document Object
Model (DOM) tree. The index field will be retrieved from the XML text, and the screen shot
demonstrates that non-sequential ID numbers can be utilized; the y-offset, however, will be set by a
counter, not by the displayed value.
A preliminary screen shot of a rendering of the visible layer described in the XML above is illustrated
below.
Data Structures in Java for Restricted Boltzmann Machine Components
The actual Boltzmann Machine class was written in C++ and will run outside of the Android device. In
order to integrate the C++ code with Android code, the XML file will be transferred between the
platforms. However, a mapping from the visible (hidden) layer index in the XML file (an index tag exists)
to the position of the Threshold Logic Unit image on the canvas is required, as is the mapping from the
(visible,hidden) pair (of TLUs connected by an edge) to the weight of this edge (as a line is drawn and the
weight is used to label the line). In the future, a mechanism to store the value of the TLU will also be
requisite. A new hash table for each layer (visible or hidden), mapping index to value, would suffice for
this purpose (and has been implemented).
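The lookup structures described in this section can be sketched as follows. This is an illustration with hypothetical names: the Pt class stands in for android.graphics.Point (unavailable off-device), and the connection map is keyed by a composite string of the (visible, hidden) index pair.

```java
import java.util.*;

// Sketch of the Java-side lookup structures described above.
public class RbmLookups {
    static class Pt { final int x, y; Pt(int x, int y) { this.x = x; this.y = y; } }

    // XML index -> canvas position of the TLU circle, one table per layer.
    final Map<Integer, Pt> visibleLayerPts = new HashMap<>();
    final Map<Integer, Pt> hiddenLayerPts = new HashMap<>();
    // (visibleIndex, hiddenIndex) pair -> edge weight, keyed by a composite string.
    final Map<String, Double> weights = new HashMap<>();
    // Per-layer index -> unit value, as noted at the end of the section.
    final Map<Integer, Integer> visibleValues = new HashMap<>();
    final Map<Integer, Integer> hiddenValues = new HashMap<>();

    void addConnection(int visibleIndex, int hiddenIndex, double weight) {
        weights.put(visibleIndex + ":" + hiddenIndex, weight);
    }

    Double weightOf(int visibleIndex, int hiddenIndex) {
        return weights.get(visibleIndex + ":" + hiddenIndex); // null if no such edge
    }
}
```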
Line Rendering Between Visible and Hidden Layers
Lines will be traced between the boundaries of the TLU circles to signify connections and will be labeled
with weights. The coordinates of the lines’ endpoints will be based on the centers of the TLUs whose
indices are stored in the visible and hidden indices of the ConnInfo object. The visible layer endpoint of
the line will be at (centerX+radius, centerY), and the hidden side will be at (centerX-radius, centerY).
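The endpoint rule just stated can be captured in a small helper (the class and method names are hypothetical):

```java
// Endpoints of a connection line, per the rule above: the line leaves the
// right edge of the visible-layer circle and enters the left edge of the
// hidden-layer circle.
public class LineEndpoints {
    /** Returns {vx1, vy1, hx2, hy2} given the two TLU centers and radius. */
    public static int[] endpoints(int visCenterX, int visCenterY,
                                  int hidCenterX, int hidCenterY, int radius) {
        return new int[] {
            visCenterX + radius, visCenterY,  // visible side: (centerX + radius, centerY)
            hidCenterX - radius, hidCenterY   // hidden side:  (centerX - radius, centerY)
        };
    }
}
```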
The connections XML for the example case, with two lines, is as follows (note that the index values are
nonconsecutive and used as ID numbers).
<connections>
<connection>
<visibleIndex>
1000
</visibleIndex>
<hiddenIndex>
66
</hiddenIndex>
<weight>10</weight>
</connection>
<connection>
<visibleIndex>
99
</visibleIndex>
<hiddenIndex>
760
</hiddenIndex>
<weight>5</weight>
</connection>
</connections>
The rendering of the above XML is below.
Text Nodes in DOM Tree
Interestingly, even in nodes such as <visibleLayer> which contain no text, it was necessary to filter out
text nodes from the tree traversal. It is not clear whether inclusion of an empty text node with all DOM
elements is standard or parser implementation-specific. In fact, it seems that white space is being
construed as text, as the log which follows indicates.
I/System.out( 3065): renderHidden, text node:
I/System.out( 3065):
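The whitespace behavior above is standard DOM: indentation between elements becomes TEXT_NODE children, so the traversal must filter on getNodeType(). A self-contained sketch (hypothetical class name) follows.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.*;
import java.io.ByteArrayInputStream;

// Demonstrates that indented XML yields whitespace text nodes between
// elements, which the traversal filters out via getNodeType().
public class DomFilterDemo {
    public static int elementChildren(String xml, String parentTag) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
        Node parent = doc.getElementsByTagName(parentTag).item(0);
        int count = 0;
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                count++; // skip the whitespace-only TEXT_NODE children
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<visibleLayer>\n  <TLU/>\n  <TLU/>\n</visibleLayer>";
        // 5 child nodes in total, but only 2 are elements.
        System.out.println(elementChildren(xml, "visibleLayer")); // 2
    }
}
```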
Translation from FHA Data to XML
The data files from the Federal Housing Administration must be translated into the XML which will be
stored in the DOM. Two mechanisms might account for such mapping. 1. Utilization of the Win32
graphical user interface by a human operator to manually create a Restricted Boltzmann Machine after
reading the data and then saving the topology into XML (in progress). 2. Running an Android utility to
parse the FHA file, map it to inputs, visible units, hidden units, and outputs, and persist the XML to a file
(to be developed).
File Selection for Browse Button
Upon clicking the “Browse” button in most file load user interfaces, a file selection dialog is opened.
Within this dialog, double-clicking a directory descends into it and appends that directory to
the path in the selection text view. In Android, no standard file-chooser widget exists within the API.
In the initial phase, verification that the /sdcard directory’s files and directories can be rendered will be
confirmed. A subclass of AlertDialog, FileChooser, will be implemented. The files and folders can be
scrolled through, and the icon displayed for a file or folder will be determined by the
“isDirectory()” method on the File object. Icons are displayed using the TextView’s
“setCompoundDrawablesWithIntrinsicBounds” method. When the “Select” button is clicked, the
contents of the “path” EditText are copied back to the parent window.
In phase two, a click on a folder will open it and update the path EditText, and a single click on a file will
add its name to the path (it will not set the file path in the parent dialog, but this would be a convenient
feature). This will require handling clicks on the TextView and looking up the view in a hash table
mapping the view to a file/folder status. Instead of hashing on the view directly (which would require a
hash function for the TextView), the view will be assigned an ID based on its index in the directory
listing, and this ID will be the hash key. Both of these phases have been implemented, but no
highlighting when the view is touched has been added (this is a desired feature). An XML file was
successfully found and loaded via the browse mechanism.
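The phase-two bookkeeping can be sketched as follows: list the directory, assign each entry an ID equal to its index in the listing, and keep a hash table from ID to directory status. Plain java.io.File stands in for the Android views, and the names are hypothetical.

```java
import java.io.File;
import java.util.*;

// Sketch of the ID-keyed lookup described above: each row's TextView gets an
// ID equal to its index in the directory listing, and a hash table maps that
// ID to whether the entry is a directory (which icon and click behavior to use).
public class DirListingIds {
    public static Map<Integer, Boolean> buildIdMap(File dir) {
        Map<Integer, Boolean> idToIsDirectory = new HashMap<>();
        File[] entries = dir.listFiles();
        if (entries == null) return idToIsDirectory; // invalid path: empty map
        for (int id = 0; id < entries.length; id++) {
            idToIsDirectory.put(id, entries[id].isDirectory());
        }
        return idToIsDirectory;
    }
}
```

On a click, the handler reads the view's ID, consults the map, and either descends into the folder or appends the file name to the path EditText.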
A screen shot of the dialog is below.
Additional functionality to handle returning to parent directories in searching for a file was added with
the “parent directory” button. Moving from /sdcard/Samsung to /sdcard is illustrated below.
Two issues which remain outstanding are:
1. The entire path is difficult to view and scroll through in portrait mode.
2. An additional ‘/’ is appended to the path when moving up with the “parent directory” button
and then back to a subdirectory.
Handling Invalid File Path
For the time being, a blank canvas is displayed with only the visible/hidden partition, and the exception
is logged to the Android “logcat” stream. In the future, a dialog or Toast should appear notifying the user
of the error.
Nesting a ScrollView for the Canvas
The Restricted Boltzmann Machine can have a variable number of TLUs in its visible and hidden layers.
Thus, given a constant scale, the height of the diagram derived from the XML is unknown at the time of
the original layout rendering, so the CanvView object is placed within a ScrollView. There is
no direct imperative means to set the maximal height of the ScrollView (the canvas size that, when
exceeded, induces scrolling and wrapping). However, passing a ViewGroup.LayoutParams parameter
to the addView method enables both the actual (thus maximal) width and height of the ScrollView to be
specified. These parameters are set in the LayoutParams constructor, and then the LayoutParams
object is passed to addView along with the ScrollView. However, this approach, which involves onTouch
events’ being handled by the internal and external ScrollViews, causes the internal view to be scrolled at
a slow rate. It seems that the shift offset, which constitutes a scroll, is divided between the internal
and external ScrollViews. The only way to ameliorate this would be to override the touch or motion
event handler used by the internal ScrollView and ensure that it is not propagated to the external
ScrollView. This might involve the external ScrollView’s ignoring the scroll message, if the touch fell
within the internal view. However, in the case when the touch does not fall within the internal
ScrollView, the scroll invocation’s offset, based on the length of the gesture, is non-trivial to compute.
The problem may also be that every scroll would entail an onDraw call to the canvas, but this behavior
would seem to be the same for any other view embedded in a ScrollView.
For now, nesting of ScrollView will be avoided, and such an approach will be considered only if the RBM
becomes prohibitively large. The canvas and flip button will be removed and re-added with each file
load in order to enable the resizing of the canvas in the main linear layout.
Note that some accounting for the nested behavior seems to occur in API level 21 of the Android SDK,
where “nested scroll” event listeners can be designated for the control, but the current application is
compiled with the API level 16 library.
Inference View
A link from the Boltzmann load GUI will enable mortgage-backed securities parameters to be entered
into the user interface. Upon the clicking of the “infer price” button, the parameters, along with the
state representation (likely the XML) will be serialized and sent from the Android client to a C++ web
service listener on the server. The input screen will be fairly similar to the initial screen used as input for
the MBS cash flow computation. Upon clicking of the “calculate price” button, an HTTP GET is made to a
web service front-end for the Boltzmann Machine, which performs the inference and returns the price.
A screen shot of the Android user interface is below.
SOAP Web Service
The server-side Restricted Boltzmann Machine will be accessed from the client via the Simple Object
Access Protocol (SOAP), which enables RPC calls to be made via HTTP and marshalling to take the form
of XML documents. The problem is to determine which libraries will be utilized for the HTTP server,
method dispatch, and client TCP/IP proxy code generation. One alternative being considered is the
open-source platform SimpleSOAP by Scott Seely and Jasen Plietz. However, problems with compilation
of this library with the Windows socket headers suggest that this solution may not be usable.
The web server will dispatch to the PrepaymentApp class, passing it an XML representation of the
desired RBM and the input parameters for prepayment rate inference. A new deserialization method to
generate the neural network TLUs, weights, and connections will be implemented in the Boltzmann class
invoked by the PrepaymentApp object. The XML will be parsed using the C++ API of the Apache Xerces
Document Object Model (DOM) XML parser. This traversal of the DOM tree will be similar to that
accomplished in the FHAUI class’s Java code used to render the Boltzmann XML in the Android GUI.
The TLU elements will be read into “visible” and “hidden” vectors, while the connection elements will be
read into the map data structures for connections and weights.
[Diagram: the Android client issues a getPrepaymentRate call (XML message) to the web server (SOAP listener), which preprocesses the inputs, invokes the Restricted Boltzmann Machine logic to infer, postprocesses the result, and returns the getPrepaymentRate response (XML message) to the client.]
SimpleSoap Build and Deployment
The typical SOAP model for client generation and invocation is as follows:
1. Download service description in the XML format known as WSDL from the server
2. Generate proxy code for making the HTTP calls to the web service and parsing the HTTP
response
3. Invoke the web service client methods from the application
The SimpleSoap code base was written to compile on Linux, but the MinGW C++ compiler (currently the
compiler of choice for the firm) does not seem to run correctly within the Cygwin environment. Thus,
the current approach has been to rewrite the SimpleSoap build scripts as Windows cmd files. However,
this induces errors in the socket code, as the TCP/IP code is not compatible with using the
"winsock" library in place of "sys/socket". This also fails, due to apparent inconsistencies in the MinGW
winsock implementation and the compiler's treatment of function templates.
Utilizing Machine Learning to Infer Future Prices (Win32 GUI, C++ Logic)
A Boltzmann machine will be implemented to learn a pricing function from a data set. Both prepayment
and default can be parameterized in terms of the time to maturity, credit rating of the obligor, weighted
average coupon, etc.
About the Boltzmann Machine
A Boltzmann Machine is a form of recurrent neural network with two types of nodes: hidden and
visible. The weights on the connections between nodes are trained via gradient ascent, similarly to
other neural networks. However, the weight update formula is somewhat anomalous, as it involves
adding the difference of the expected value of the product of the two connected states’ outputs in the
data set and the expected value of the same product over the thermal equilibrium (temperature 1)
distribution of the model. The notion of thermal equilibrium is somewhat counter-intuitive, but it
simply means the output of the model in the configuration that is most likely for the given weight set, so the
weight adjustment resembles standard gradient ascent, subtracting actual (model) output from
expected (data) output. In addition, the output at a logic unit is a stochastic function of the dot product
of the weights and inputs, not just the thresholded dot product, as in a standard TLU. While most neural
networks are used for supervised learning problems, the Boltzmann machine seems to be utilized for
unsupervised feature clustering problems.
The Hopfield Network, a simpler form of a Boltzmann network (without stochastic activation) is utilized
for content-addressable memory. Content-addressable memory means that a data item can be found
by querying the memory with a fragment of the data. How such an architecture relates to mortgage-backed security pricing is not intuitively obvious. However, precedent exists for Boltzmann network
utilization in the field of credit risk (see Tomczak and Zieba, 2015).
Basically, a Boltzmann machine is just a canonical neural network in which the state of a TLU is a
probabilistic function. The final hidden layer can be construed as an output or an inference layer, just as
the output layer of a feed-forward neural network is. Neural networks are models of a biological brain,
where the threshold logic units represent neurons and the weights are analogous to synapses. The
training of weights can be considered similar to a memory function, whereby the weights enable the
network to remember the correlation between the inputs and outputs (bidirectionally). Another
interpretation would be that the Boltzmann machine is likely to generate visible input vectors in the
memorized set given the abstract state fixed in the hidden layer units.
In this mortgage-backed securities application, the visible units will represent inputs such as the
weighted average coupon, the pass-through interest rate, the notional, the borrowers’ credit ratings,
the average maturity of the underlying loans, the PSA rate, the month number etc. The hidden units will
represent one of: the default rate for a month, the prepayment rate for the month, or (for a high-level
inference engine) the price of the MBS securities.
Restricted Boltzmann Network Diagram
A restricted Boltzmann network has only one visible and one hidden layer, and no connections within
the layers are permitted.
[Diagram: a bipartite graph in which every Visible Layer TLU connects to every Hidden Layer TLU, with no intra-layer connections.]
Energy Computation
For a given TLU, an iterator traverses all connections and considers those beginning with that TLU (if a
visible unit) or ending with that TLU (if a hidden unit). For each such connection, the quantity
(weight*state(neighbor)) is added to the TLU’s energy.
Probability Computation
Given the energy, the probability that the state is set to 1.0 is 1/(1+e^(-energy)). If this probability
exceeds the TLU's threshold (by default 0.5), the state is set to 1.0. Likewise, the probability of any
configuration of the visible and hidden units is p(v,h) = e^(-E(v,h))/∑u,g e^(-E(u,g)). Here, u and g range
over all visible and hidden configurations, so the denominator is the normalizing partition function.
Restricted Boltzmann Machine Inference
Inference is carried out by clamping a set of states to the visible units and computing the values of the
hidden units. The input units will represent the following properties of a mortgage-backed security
issue: weighted average maturity, weighted average coupon, pass-through rate, number of loans,
notional, and credit rating. Each input layer (visible layer) TLU will represent one bit for one of these
values. The output layer will be configurable, representing the prepayment rate, default rate, or MBS
price. Thus, a quantity like the FHA prepayment rates can be learned and utilized in a cash flow matrix
in the same manner that the SMM rates were.
The algorithm for inference is as follows.
1. Given a vector of variables (like the WAM and WAC), preprocess these into a vector of single bits
(stored as doubles).
2. Assign the bits obtained to the visible layer TLUs in a 1-1 mapping.
3. Compute the energy of all hidden-layer TLUs.
4. Compute the probability of each hidden layer TLU’s being set to 1, given the energy.
5. Based on the probability above, compute a random draw in the range of 0 to 1. Basically, the
probability is the chance that the draw will yield 1. (Multiply the probability by 1000 and set this
to be the cutoff. If a random number between 0 and 1000 is less than or equal to the cutoff,
return 1, else return 0). If the draw is greater than or equal to the threshold for the TLU, set the
state of the TLU to 1, else set the state to 0.
Mapping Double-Precision Floating Point to Binary
The “one unit” or “one TLU” description above means one multi-bit unit, which must be split into
multiple single-bit units. This will be done using one of two techniques: making each TLU a selector, where
setting the state to one means that a particular option was selected, or representing the value as a
binary string. An example of the selector method would be the notional, where three TLUs are utilized:
the first set means a notional of 100 million, the second set means a notional of 10 million, and the third
set means that the notional is 1 million. An example of the second approach would be the weighted
average coupon (WAC), where four bits are used to represent the percentage (0-15, e.g. 1000 for 8%).
In the first case, three threshold logic units are used, and, in the second case, four units are used. For
percentage inputs such as the risk-free rate and WAC, the rate is multiplied by 100 prior to translating to
binary. Note that credit ratings are assumed to follow the Standard and Poor’s convention without the
use of the +/- qualifier, and that the lowest rating considered is “D” (default).
Note that “WAM”, weighted average maturity, is treated differently from “TTM”, or total time to
maturity. The WAM value is the average number of months remaining of the loans in the bond
portfolio, while the TTM value is the number of months until the bond itself matures.
The initial mappings are as follows (post-processing of output is included).

Input fields:

input field | type    | number of TLUs | Semantics
PSA         | double  | 8  | in 1% increments, 0-255 (e.g. PSA 120% = 01111000)
r           | double  | 5  | in 1% increments, 0-31%, rounded to the nearest percent
wam         | double  | 10 | 0-1023 months
wac         | double  | 5  | in 1% increments, 0-31%
ptRate      | double  | 5  | in 1% increments, 0-31%
numLoans    | double  | 6  | leftmost bit set -> 10^5 loans or fewer, second from left -> 10000 or fewer loans, ..., rightmost bit set -> 1 or fewer
notional    | double  | 6  | leftmost bit set -> 1 bln or less, second from left -> 100 mln or less, ..., rightmost bit set -> 10000 or less
rtg         | string  | 3  | 000=D (default), 001=C, 010=B, 010=BB, 011=BBB, 100=A, 101=AA, 111=AAA
ttm (average time to maturity, months) | integer | 10 | simple binary string in range 0-1023

Output fields:

output field | type                      | number of TLUs/bits | Semantics
outputType (not implemented) | integer   | 2 | 00=MBS Price, 01=Default Rate, 10=Prepayment Rate
outputValue  | double                    | 9 | If the output type is MBS price, use 9 bits (up to 511, par 100), divided by 511 and multiplied by 100. If the output is a percent of defaults or prepayments, the bit value is the rightmost 9 (significant) bits of the product (decimal rate times 511), rounded to the nearest integer. Thus, divide the bit value by 511 to recover the rate (this allows decimal percents less than 1%, and all 1's refers to 100%, for a range of 0.1 to 1.0). Final output units are in decimal.
Preprocessor Class
A preprocessor abstraction has been designed for converting business-level parameters (weighted
average maturity, notional, etc) to bit vectors. The high-level methods such as wacToBits invoke some
intermediate methods, such as doubleToFiveBits, which are common among multiple variables (risk-free
rate, WAC, etc). All fields needed for inference of the prepayment rate, default rate, and MBS price will
be translated to bit vectors. The “preprocess” method will invoke all of these (as implemented initially,
later only a subset may be used) and amalgamate the bits into a single network input bit vector. The
order of parameters in the input object is: PSA, risk-free rate, WAM, WAC, pass-through rate, number of
loans, notional, credit rating, and time to maturity. As the parameters are heterogeneous, an
encapsulating type has been defined to contain the parameters. Preprocess methods for a parameter
like r involve rounding the value, multiplying it by 100 (to convert it to a percent), and then considering
the bits representing 2^4, 2^3, . . ., 2^0. Thus, a decreasing for loop is run on the exponent of base two,
where the value to be converted is tested against 1<<exponent. If greater or equal, a "1" is pushed
onto the vector, else a zero is pushed. Then the value is updated according to x = x%(1<<exponent),
and the exponent is decremented. For the output used in supervised algorithms, the double-precision
value is divided by 511 (rightmost 9 bits of integer form will be used) and then rounded to the nearest
integer. Bits of this value are considered from left to right, where a one is pushed as the ith bit
from the right if the remainder of the original integer equals or exceeds 2^i. Otherwise, a zero is pushed. For 500,
256 would be checked first, resulting in a 1, then the remainder (244) would be checked against 128
(push 1 for 11), then 116 would be checked vs. 64 (111), then 52 against 32 (1111), 20 against 16
(11111), 4 against 8 (111110), 4 against 4 (1111101), 0 against 2 (11111010), and 0 against 1
(111110100).
Two new classes have been added to handle the training and preprocessing phases, TrainInput and
NetworkInput. TrainInput is the class created from the input file, which includes both input and output
(i.e. rating, WAC, …, and prepayment rate). NetworkInput contains only the input fields (WAC, WAM, r,
etc). NetworkInput is the superclass of TrainInput.
Training Data and Loader
In order to validate the model, training vectors will be loaded from flat files and used to set the weights
in memory. Later, both the training vectors and the weights will be stored in the database. The file
format will be the standard comma-delimited (.csv) format, with each row representing a distinct
training vector and output pair. The CSV file will be translated to a vector of TrainInput objects,
representing the inputs and outputs to the mortgage-backed securities model. For each of the three
(prepayment, default rate, MBS price) inference models, a sample file is below. Note that it makes little
sense to mix the three types in a single file, as the neural network does not take the output type as an
explicit input. Thus, the weights learned would not be able to discriminate between output types for a
given input vector. Here, the output type of 0 represents the prepayment rate, the output type of 1
represents the default rate, and the output type of 2 represents the MBS price. A rating of 3 is BBB
(011), and a rating of 4 is A (100).
1. trainvecs_prepay.csv
psa,r,wam,wac,ptrate,numLoans,notional,rtg,ttm,outputType,outputValue
120,.05,360,.07,.06,10000,1000000000,3,360,0,.02
120,.05,360,.07,.06,10000,1000000000,3,360,0,.03
120,.05,360,.07,.06,10000,1000000000,3,360,0,.097
120,.04,330,.08,.06,1000,200000000,4,360,0,.011
2. trainvecs_default.csv
psa,r,wam,wac,ptrate,numLoans,notional,rtg,outputType,outputValue
.05,360,.07,.06,10000,1000000000,3,1,.04
.05,360,.07,.06,10000,1000000000,3,1,.035
.05,360,.07,.06,10000,1000000000,3,1,.026
.04,330,.08,.06,1000,200000000,4,1,.05
3. trainvecs_mbs.csv
psa,r,wam,wac,ptrate,numLoans,notional,rtg,outputType,outputValue
.02,330,.06,.04,10000,1000000000,7,2,102
.03,390,.08,.07,10000,1000000000,7,2,110
.03,360,.05,.04,10000,1000000000,2,2,80
.04,330,.08,.06,1000,200000000,3,2,90
Multiplicity of Models – Price, Default Rate, and Prepayment Rate
Due to the relative simplicity and generality of the Restricted Boltzmann Machine, the same basic
formulas and design can be leveraged for more than one variable’s model. Possible outputs include the
prepayment rate, the default rate, and the MBS price, and these can be chosen within the construction
kit graphical user interface. Prepayment rate and default rates will be per month and should be
parameterized by the current month number or number of months remaining in the loan. The output
quantity will be stored along with the topology in the XML files. For the time being, the same input
variable set will be leveraged for all models, although this might later be configurable.
These variables are: risk-free rate, WAM, WAC, pass-through rate, number of loans, notional, rating.
If the output is either the default rate or prepayment rate in a given year (year n), the MBS price can be
inferred by obtaining default and prepayment rates for all years up to the maturity of the security.
These rates can be utilized to derive the cash flows, which are discounted and summed to obtain the
price. Given that the MBS price is the output, assumptions can be made about the growth dynamics of
the prepayment and default rates. Then, in each year n, the cash flow, default rate, and prepayment
rate might be backed out, given the price (the 2n unknowns might be reduced to two: the initial-year
prepayment and default rates. If one more assumption can be made, this can be combined with the
cash flow's present-value formula to obtain a solvable system).
For the time being, a single model will permit only one output type, although it would be possible to
pass the output type as a parameter in the training vector. The output type will be passed as a separate
parameter to the train and test methods.
Constructors and Instantiation
No constructors which create a usable network have been created initially, as the network topology is
determined by the user. Basically, any constructor would require a set of visible TLUs, a set of hidden
TLUs, connections between visible and hidden nodes, and weights for each connection. The
“createSampleNetwork” function calls the default network constructor and then adds the TLUs,
connections, and weights by setting public instance variables.
A method to instantiate a network from an XML file has been written, as has a TLU class. The TLU stores
threshold (bias), index, name, and value fields. Note that the index is a unique identifier which, while
intended to indicate the ordinal position of the TLU in its layer, is not explicitly used to sort the TLUs.
Sample of Inference in Programmatically Created Network
In the main method, a simple network with two nodes in each layer is constructed. Without training,
the inference is called using the weights of 1 and the inputs provided on the command line. The output
type is set to be the prepayment rate. However, given only two input-layer TLUs, the preprocess
function will convert a vector of seven double-precision floating point numbers into a vector of
40 bits, of which only the first two will be utilized. Thus, only the first parameter
serves to dictate the input into the neural network (the remaining 38 bits are discarded in the
"setVisible" method, which assigns the state of visible-layer neurons from the coordinates of training
input vectors).
In addition, the two bits of output allow four discrete possible values for the prepayment rate. For now,
these four values will be: {11=75.15%, 10=50.098%, 01=25.049%, 00=0%}.
The following command line is utilized for this test.
c:\mbs>Boltzmann /?
Usage: Boltzmann (<risk-free rate> <WAM> <WAC> <pass-through rate> <number of loans> <notional> <rating> <trainDataFile> <outType>) | /?
OutTypes: 0=prepayment rate, 1=default rate, 2=MBS price
If “/?” is entered as the first argument, the usage message is printed.
The inference on the network containing two TLUs in each layer was successfully tested. The topology
is sketched below.
[Diagram: Visible TLU 0 (input 1) connects to Hidden TLU 1 (energy .7, probability .668, threshold .5) with weight .7; Visible TLU 1 (input 1) connects to Hidden TLU 0 (energy 1, probability .73, threshold .5) with weight 1.]
Note that, as the input to the 0th visible TLU is 1 and the Visible0-Hidden1 weight is 0.7, the 1st hidden
unit has an energy of .7. This generates a probability of 1/(1+exp(-.7)) = .668. The variance in output seems
correct and in accord with a stochastic neural network. One problem noted was the monotonic
increasing nature of the first random draw. However, this is to be expected, as the random number
generator is reset using the current time with each app restart. Given the verification of the neural
network state propagation, more complex networks with full pre-and-post-processing and contrastive
divergence training can be implemented.
Test Run of Simple Network Inference
The following command line was executed:
Boltzmann .03 360 .07 .06 100 20000000 AA
The parameters are: 3% risk-free rate, 360-month weighted average maturity, 7% WAC, 6% pass-through rate, 100 loans in the portfolio, 20 million dollar notional, and AA credit rating.
Basically, a network of two visible and two hidden nodes (drawn above) with the weights of .7 and 1
trained as above is fed the inputs on the command line. The final output of two bits is translated to a
prepayment rate as follows:
(bit0*256+bit1*128)/511
The use of the rndDraw function given a probability to set the hidden layer was somewhat suspect over
repeated runs in a short time. Since the random number generator is seeded with the time, the first
random number of each run is monotonically increasing with time elapsed between runs. However, for
multiple calls in a single run, the numbers are random and, over a span of 1000 time increments, the
draw will output a value below the threshold p in p of the steps.
Sample Training via Contrastive Divergence
The simple two-layer network illustrated above will be trained with the CSV file discussed in the
“Training Data and Loader” section. The translation of the CSV file to a TrainInput object is a
straightforward string-to-numerical (integer or double) conversion. However, the TrainInput object must
subsequently be converted to a chain of bits to input to the threshold logic units via the Preprocessor
class in the "positive" step, and another chain of bits to set as the hidden output in the "negative"
step of Contrastive Divergence. This is handled via the "preprocess" and "getOutBits" methods of the
Preprocessor class, which are invoked by the “train” method of the Boltzmann class. The input fields for
each training row are read into forty threshold logic units. Strangely, the "setPositive" method (later
modified to set the hidden units from the training vector), which assigns the hidden units the outputs
from the training vector and computes the products of visible and hidden TLU outputs on a connection,
crashed in the second iteration of the loop over connections. The problem was that 40 input bits and 9
output bits needed to be condensed to two input bits and two output bits. This was corrected by adjusting
the training code to handle the case where the number of neurons is fewer than the number of
input bits (the leftmost bits are placed and the remainder dropped).
Strangely, convergence occurred immediately, in one iteration. At each iteration, the current weight
settings are printed to the console. In that one iteration, no update was made to either weight, and so the
weights remained at the original 0.7 and 1 values. This seems to have been due to an error in the negative phase in
which the TLU output computation was performed in the visible layer instead of the hidden layer. The
second problem was that the bits passed to the visible layer TLUs were always zero. This is likely
because the only two bits used are the left two bits of the risk-free rate, which are always zero. This was
corrected by setting the risk-free rate to 32% for one input vector (more visible TLUs would prevent such
truncation), but the negative phase output was still zero, as there was confusion between the numerical
format stored (double) and the format printed (int). This was corrected by storing the value, which is
just the product of bits, as an integer. A problem similar to the problem with the setting of the visible
bits from training data occurred in the setting of the hidden bits. This involved the use of only the first
two bits of the prepayment rate in the output, which were both 0. Thus, a prepayment rate of 100%
was set for one training vector (again, using more hidden-layer TLUs would prevent this measure from
being required). The one remaining problem is that outputs from the positive and negative phases seem
to always match, inducing convergence in one iteration without weight adjustment (and eliminating the
stochastic element). The immediate convergence was resolved by lowering the convergence threshold
of the weight delta (from 1 to .1), and the lack of divergence of the positive and negative phases was
apparently an aberration. The update of the weights is correctly carried out as wij = wij + lambda*(positiveij-negativeij),
where i and j are the indices of the visible and hidden layer TLUs, respectively.
Excerpts from the training and inference logs illustrate the process. In each iteration, four training
vectors are processed and two weights are updated per training vector. Only the first training vector
resulted in interesting output, as it is the only one for which the visible layer TLUs do not all output
zero (which forces the visible-hidden products to be zero).
Training:
bits size 40, visible size 2
visible len 2, bits len 2
setVisibleBits, bit 1
setVisibleBits, bit 1
setPositive, before for, num out bits: 9
connections[0] 1
setPositive for loop
setpos visind 1
setPositive, vis 1, hid 0
visible[visInd].outputVal 1
hidden[hidInd].outputVal 1
pair hash operator access
hash code returned 32768
pair hash operator access
hash code returned 32768
pair equal_to operator access
pair equal_to returns 1
after positive set, value 1
setPositive for loop
setpos visind 0
setPositive, vis 0, hid 1
visible[visInd].outputVal 1
hidden[hidInd].outputVal 1
pair hash operator access
hash code returned 1
pair hash operator access
hash code returned 1
pair equal_to operator access
pair equal_to returns 1
after positive set, value 1
after setPositive
setNegative entry
pair hash operator access
hash code returned 32768
pair equal_to operator access
pair equal_to returns 1
computeTLUState, layer 1, index 0, energy 1, prob 0.731059
rndDraw, rndInt 963, cutoff 731
pair hash operator access
hash code returned 1
pair equal_to operator access
pair equal_to returns 1
computeTLUState, layer 1, index 1, energy 0.7, prob 0.668188
rndDraw, rndInt 178, cutoff 668
visInd 0, hidInd 1
visible[visInd].outputVal 1
hidden[hidInd].outputVal 1
product: 1
pair hash operator access
hash code returned 1
pair hash operator access
hash code returned 1
pair equal_to operator access
pair equal_to returns 1
after negative set, value 1
visInd 1, hidInd 0
visible[visInd].outputVal 1
hidden[hidInd].outputVal 0
product: 0
pair hash operator access
hash code returned 32768
pair hash operator access
hash code returned 32768
pair equal_to operator access
pair equal_to returns 1
after negative set, value 0
setNegative exit
…
visible index 1, hiddenIndex 0, weight 2.000000
pair hash operator access
hash code returned 1
pair equal_to operator access
pair equal_to returns 1
visible index 0, hiddenIndex 1, weight 0.700000
iter 1, deltaWtSum 1.000000
Inference:
computeTLUState, layer 1, index 0, energy 0, prob 0.5
rndDraw, rndInt 813, cutoff 500
pair hash operator access
hash code returned 1
pair equal_to operator access
pair equal_to returns 1
computeTLUState, layer 1, index 1, energy 0, prob 0.5
rndDraw, rndInt 175, cutoff 500
postProcess, outVec len 2
postProcPrepay, outVec size 2
while loop expVal 256, ind 0
outVec[ind]=0
while loop expVal 128, ind 1
outVec[ind]=1
postProc, vector 0, 1, retVal 0.250489
infer returns: 0.250489
rbm inferred prepayment rate: 0.250489
Test of Full Input/Output Prepayment Inference
Both the training and inference will be tested on a production-scale (56 inputs, 9 outputs) RBM. A new
class, PrepayApp, was created to call the Boltzmann class's training and test methods with the output
type set to "prepayment". The PrepayApp class maintains statistics about the accuracy of the training
and testing (although these may be moved into the Boltzmann class later). These metrics can be reported
via the “printTrainSummary” and “printTestSummary” methods. The input must be read from a file into
a TrainInput object, serialized to a bit vector, and read into the Restricted Boltzmann Machine. The
output bits from the RBM must be postprocessed to convert the bits back into the prepayment rate
(double precision floating point).
A critical problem in writing this application was how to report training (number of training vectors
correctly classified in training) and test (number of test vectors classified correctly) statistics known only
to the Boltzmann object to the PrepayApp object. This will be accomplished by modifying the
Boltzmann class’s train and test methods to return a ReportInfo object. The test method will return the
expected and actual results for each input vector in a list of TestResult objects along with the summary
information, and the train method will include the number of iterations required for convergence and
the convergence threshold. The per-training-vector error computed by the training phase will be the
positive-negative difference calculated within the contrastive divergence algorithm, and the metric in
the case of testing will be the difference between the expected (training vector) prepayment rate and
the actual rate. The per-vector error threshold will initially be fixed at .01, but this variable should be
parameterized. The reporting functionality in the PrepayApp class can also access the weights in the
RBM via the public members of the Boltzmann object. The “printTrainSummary” and
“printTestSummary” methods will be similar, with the “printTrainSummary” outputting the number of
iterations, the overall threshold for convergence, and the final weights along with the values printed in
test (per-vector threshold, number of vectors, number correct).
Test XML File
The network topology itself will be described in an XML file with 58 input TLUs, 9 output TLUs, and fully
connected visible and hidden layers. Full connectivity induces a fairly large file, as this means
that there will be 58x9=522 connections to describe. However, it seems problematic to determine
which input units should be connected to which output units, as all output bits seem to be a
combination of all input bits.
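A fragment of such a file might look like the following (the element and attribute names are illustrative assumptions; the actual format is governed by the XML Schema section):

```xml
<network>
  <visibleLayer size="58"/>
  <hiddenLayer size="9"/>
  <connections>
    <!-- one element per visible/hidden pair; 58x9 = 522 in total -->
    <connection visible="0" hidden="0" weight="0.0"/>
    <connection visible="0" hidden="1" weight="0.0"/>
    <!-- ... -->
  </connections>
</network>
```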
RBM Training Algorithm
The training of a Restricted Boltzmann Machine is carried out by a method known as contrastive
divergence. The objective is to learn weights which best correlate the visible node vectors of the
training set with the hidden unit state vectors of the same training set (making this particular RBM
implementation a supervised learning case). In the first iteration, given the inputs (or outputs of the
visible layer TLUs), the hidden layer TLUs are set (based on the energy of each unit and the probability
resulting from the energy). For each TLU pair connected by an edge e_ij, Positive_ij is computed. Here,
Positive_ij = x_i*x_j. In the next iteration, the hidden unit values are clamped to the training vector
output-induced values, and the visible layer units are inferred. Then the hidden units are derived again.
The edge values derived at this point are denoted Negative_ij. For each training vector, the weights are
updated as follows:
w_ij = w_ij + lambda*(Positive_ij - Negative_ij)
Here, lambda is the learning rate. All training vectors are iterated over until convergence (error beneath
threshold for hidden layer output) or some upper bound on the iterations. (Update: this is the
canonical clustering algorithm, but the supervised categorization application required herein requires
the modifications outlined below.)
Contrastive Divergence Implementation in C++
The algorithm runs until either one hundred iterations complete or the error rate on the training set is
less than ten percent. In each iteration, a positive and negative pass over all connections is made,
setting the Positive and Negative data structures. Using these positive and negative values, the weights
are updated with each training vector. The steps of the given contrastive divergence algorithm are as
follows.
1. Clamp the visual layer with training inputs
2. Use the visual layer values to compute energies of the hidden nodes, probabilities they are on, and
their states (sampling from the Bernoulli probabilities).
3. Set the “positive” data structure positive[i][j] = vi*hj, using the hidden node outputs
4. Use the hidden outputs from (2) to compute the energies, probabilities, and states of the visual
nodes.
5. Use the visual values to compute the hidden node values.
6. Set the negative[i][j] data structure using these hidden outputs (as outlined above).
7. The difference, positive[i][j]-negative[i][j], scaled by the learning rate, is the delta in weights[i][j].
At the end of the iteration, all training vectors are checked and the accuracy assessed. If accuracy is
sufficient, convergence is determined, and the algorithm halts. The means to assess this accuracy is
discussed in the derivation section below.
Update: the algorithm steps will be modified as follows to provide a supervised learning approach:
Step 2: clamp the hidden layer with the training vector output bits.
Step 4: recompute the hidden layer using the energy formula and the input bits
Derivation of Contrastive Divergence
How to obtain weights such that the network’s hidden units accurately predict a mortgage prepayment
rate given the WAC, WAM, risk-free rate, etc. is the problem of the training algorithm. It seems that this
algorithm would require some exogenous aspect, in that the output rates from the training samples
would need to form part of the input into the training. Otherwise, the algorithm could at best cluster
the training samples as a function of their inputs and outputs.
An explanatory paper and implementation of a Restricted Boltzmann Machine and contrastive
divergence at deeplearning.net appear to be more mathematically rigorous than the simpler CD
algorithm outlined above. The probability of a neuron’s state being on is expressed in terms of the “free
energy”, and instead of computing the weights based on the difference between the hidden layer values
in positive and negative phases, the algorithm computes the gradient of the difference between
negative and positive free energies and adds the partial derivative of this “cost” function with respect to
the weight to be updated. The code, implemented in the Python “Theano” library, calculates the
gradient of the free energy function, which is apparently the vector of scalars with dimension equal to
the sample batch size. In fact, the grad function apparently computes symbolic derivatives (and
subsequently, it is assumed, makes numerical substitution). Where the RBM is constructed and the
training function is called for each network in the batch is unclear (the paper seems to take N samples of
k batches each). In fact, it was discovered later that the complete source on a separate web page
includes a test function which calls the RBM __init__ method and the get_cost_updates contrastive
divergence method. The Theano code, in fact, is identical in logic to the simpler contrastive divergence
algorithm outlined above. It computes the same positive and negative energy values as contrastive
divergence, but then takes the additional step of minimizing the “cost” function which is the difference
(positiveij-negativeij) by solving for the parameter set which makes the gradient 0 (using gradient
descent). However, simply retaining the cost output and multiplying it by a learning rate to update the
weights should have the same effect (if the reconstruction turns off units which should be on,
positive_ij - negative_ij > 0 will hold, and the weights will increase, increasing the number of units likely to turn on).
Unfortunately, both of these algorithms are based on the premise that only the visual layer is populated
by training inputs. The purpose of the hidden layer is to facilitate feedback to reconstruct the visual
layer. The ultimate goal of the algorithm seems to be to enable the visual layer to sample from the
training probability distribution. However, such a usage is insufficient for computation of metrics such
as the prepayment rate or MBS price. Thus, a discriminative version of the Restricted Boltzmann
Machine must be utilized. This version of the Restricted Boltzmann Machine will mimic the
discriminative RBM described by Larochelle and Bengio, but replacing the gradient descent weight
updates with the simpler contrastive divergence described by Chen. This network will make the entire
hidden layer consist of the class bits, and the positive stage of the CD algorithm will use the inputs and
outputs directly from the training sample. The preprocessing step will need to be modified to
preprocess the training set outputs such as prepayment rate so they can be dealt with as bits in the
hidden layer. The Preprocessor class will be modified to set the output bits as hidden in the positive
phase, and the positive phase of contrastive divergence will not involve inference but simply use the
hidden values set from these outputs.
One interesting issue addressed in the discriminative and sampling papers is the conditional probability
distribution of a single TLU, p(h_i=1|v). The key insight is that the individual h_i values are independent
given v, and thus the other hidden units can be ignored in considering h_i’s probability. Bayes’ Rule yields
p(h_i=1|v) = p(h_i=1,v)/p(v), where p(h_i=1,v) = exp(-E(v,h_i=1))/Z and p(v) = (exp(-E(v,h_i=0)) +
exp(-E(v,h_i=1)))/Z, where Z is the normalization constant in both cases. Given expressions for the energy E
in terms of b’v, c_i, and W_i·v, this simplifies to p(h_i=1|v) = sigm(c_i + W_i·v), where sigm(x) = 1/(1+exp(-x)).
Boltzmann Machine Classes
1. Boltzmann – represents the neural network. This includes the topology (as represented by
visible and hidden layer TLU vectors and weights), the training/testing, and the inference.
2. TLU – a single threshold logic unit
C++ Template Specializations
In order to utilize the unordered_map template (hash table functionality) with custom classes like “Pair”
as the key, a hash function and equality operator must be provided to the unordered_map. This can be
accomplished by passing specific user-defined function pointers to the unordered_map constructor. A
more elegant way, though, is to extend the std namespace and specialize the hash and equal_to struct
templates for the user-defined type. This enables the default hash and equality calls made by the
unordered_map to access the type-specific method. In the header file, the template specialization
(indicating that an implementation for a specific type exists) is as follows.
//hash assumes at most 2^15 nodes/layer
template<> struct hash<Pair> {
    size_t operator()(const Pair& p) const;
};
In the .cpp file, the implementation of the operator is as delineated below.
size_t hash<Pair>::operator()(const Pair& p) const {
    printf("pair hash operator access\n");
    size_t code = 0;
    size_t maxVal = 1 << 15;
    code += maxVal * p.x1;
    code += p.x2;
    return code;
}
Aside: C++ Print Formatting Bug
An interesting phenomenon occurred when the author mistakenly declared an integer variable as a
double-precision floating point member. The printf call formatted both this variable and another index
variable using the “%d” designation for integers. The result was that the floating point number, a large
integer, printed as 0 while the index variable (a monotonically increasing sequence starting with 0)
printed as a 10-digit number. Correctly declaring the double variable as an integer solved the printing
issue for both. The explanation is that passing a double where “%d” expects an int is undefined behavior:
the double occupies a different register or stack slot (and a different size), so printf reads the wrong
locations not only for that argument but also for the arguments which follow it.
Boltzmann Machine GUI
A graphical user interface for interacting with the Boltzmann machine C++ code will be written for the
desktop platform. The classical Windows Win32 GUI library will be leveraged for this implementation.
High-level functionality provided by this module will include: drag-and-drop of threshold logic unit (TLU
objects), drag-and-drop of connections between TLUs, weight assignments of connections, a train
button to learn the optimal weights, an infer button to infer the equilibrium output state for a given
input (visible layer), and energy computation for a given state and weight assignment. The user can
select the qualitative meaning of the visible layer states (credit rating above BBB, notional above
100000000, maturity > 10 years, etc) and the hidden layers (probability of default > 20%, probability of
prepayment > 10%, etc).
The objective is to utilize at least one combobox to specify the output type, multiple combo boxes for
the input types (unimplemented), buttons for TLU components (neurons and connections), and a canvas
to drag and drop the components onto.
Translating Graphical Images to a Neural Network
The key challenge of the GUI lies in detecting relationships between the shapes and lines on the canvas
and constructing the Restricted Boltzmann Machine data structures from these pixels. The steps
involved in the construction of a network graph from user mouse actions are as follows.
1. When an icon is dropped on the canvas, the image must be converted from an icon to a painted
geometric object, if resize is to occur. This is necessary for synapses and the preferred way to
handle neurons. The synapses will be converted to painted lines using the “MoveToEx” and
“LineTo” Win32 functions.
2. The overlap of neurons (circles) and connections (lines) needs to be checked using an “overlaps”
function.
3. The “layer” of the network in which neurons are placed must be inferred based on their x-offset
in the canvas (i.e. which side of the partition).
4. The new object’s extrema (enclosing rectangle for a neuron, endpoints for a synapse) must be
stored in a hash table. For a TLU, this will map the geometric coordinates to an index in the
visible layer or hidden layer (separate tables for each layer). For a connection, the endpoints are
mapped to a pair of indices in the visible and hidden layers. A Boltzmann network object must
be stored in the GUI instance. This is necessary, as the canvas must be redrawn with each move
or drag of a new neuron or synapse. Repaint redraws all images based on this data
structure.
a. A “lines” vector will store all synapse connections and, when a synapse is dropped or
moved, this vector will be updated
b. All lines will be repainted whenever the canvas boundary and partition is repainted.
Canvas
The canvas is a white rectangle with a black line partitioning the two network segments (visible and
hidden). Two text windows will also indicate the regions separated by the partition. A custom pen is
created via the “CreatePen” GDI function to obtain a pen of user-defined width for the separator.
A problem noted after the icon drag was implemented was that the repainting of the screen causes the
canvas to become a gray gap (as opposed to the initial white rectangle with black border). Modifying
the drop code to destroy the old window and add a new window to the canvas does not ameliorate this
problem, nor does calling setupCanvas in the “handleDrop” code. Interestingly, decomposing the logic
that paints the lines and background of the canvas from the code which instantiates the window and
invalidating the canvas rectangle in this function seems to induce correct behavior. The “InvalidateRect”
call is necessary to essentially delete the painted lines and background from the canvas’s memory, it
seems. Otherwise, the new paint commands do not replace the lines. The new paint function is called
by the redraw (move) logic and the drop logic. One persistent problem is that the icon renders merely
as a gray square. This seems to be due to the fact that the dragged image is still a child of the main
window, not the canvas. A second problem was that the icons, when placed were shifted. This was due
to a failure to translate the main window coordinates to canvas coordinates, i.e. x -= canvX, y-= canvY.
Lastly, the newer icon, when placed atop an older one, “underlaps” it, which is counter-intuitive. This
has something to do with the paint order in the canvas, and it is reversed when another icon is placed
(possibly a bug in the Win32 library). Interestingly, this phenomenon was corrected by adding another
“InvalidateRect” (with the berase parameter set to FALSE) to the drop handler immediately after
creating the new icon as a child of canvas.
The drag-and-drop functionality highlighted another problem with the use of Win32 windows and
messages. Whenever an image icon needs to be erased and redrawn in the main window, the
DestroyWindow and InvalidateRect methods were sufficient. However, when dragging or dropping on
the canvas, these methods do not induce the icon to be deleted (the canvas needs to be repainted
before this deletion takes effect). It seems that there may be a problem with the event-handling logic
which prevents the WM_PAINT message from being correctly sent and processed by the parent window.
This problem is remedied by passing NULL as the window argument to InvalidateRect, which apparently
induces the WM_PAINT message to be sent to the primary window, and the paint message is processed
by the original event handler, not the dispatch event handler set for the window when the drag event is
processed. However, this causes all painted effects in the canvas to be erased. Also, whenever the icon
is dragged over the rectangle or partition, these are erased. The only solution which thus far has
worked is to repaint the rectangle and partition of the canvas with each move and drop of the icons. In
addition, all lines need to be repainted (as will all TLUs which have been dropped, when they are stored
as shapes and not images). An updated screen shot is below.
Selection of Canvas Objects
Once a neuron or synapse is dropped on the canvas, it becomes a part of the neural network data
structure. In order to allow the network reconfiguration via operations such as the deletion of neurons,
the transfer of neurons between layers, and the connecting of neurons in multiple fashions, the
graphical objects must be able to be selected. Upon left-click selection, the synapses can be dragged,
resized, and rotated to enable new connections between neurons. Neurons can be moved and resized.
The right click operation will enable deletion of neurons and synapses. Selection behavior will mirror
that in other commercial drawing applications, resulting in the highlighting of “action points” on the line
or circle. For the synapse, three points, at the two endpoints and the middle, will be highlighted. The
mouse-up event will cause the setting of a “selected” bitmap for the line in question, and the
“paintCanvas” method will be updated to paint the “selected” line with dots. In addition, the
subsequent mouse-down and mouse-up, if not on this line, must unselect the line, which is handled by
marking the line’s flag as “false” in the data structure. These actions, which involve fairly complex logic
and dispatch, will be handled by a new module and class, the “NetworkManipulator”. This object needs
to maintain a pointer back to the instantiating BoltzmannGUI’s canvas object and lines vector.
Unfortunately, the event handler method must be static, and there is no well-defined way to pass
custom data about objects to this. Thus, the solution utilized was to make the BoltzmannGUI’s
reference to itself be a public, static member of the class, which the NetworkManipulator can access.
Once a line or neuron is selected, its selection points can be dragged. Dragging the endpoints of a line
causes an expansion (if dragged away from the center) and a contraction (if dragged towards the
center). A separate event handler was utilized for the drag mechanism in the case of icons, and it may
be necessary to implement a similar facility for the dragging of lines (synapses) and circles (neurons). In
fact, it was later determined empirically that the only way to handle LBUTTONUP events is via a separate
event handler.
Shape Outlines and Fills
Shape outlines are drawn based on a “pen” (HPEN) struct, and shapes are filled with a color based on a
“brush” (HBRUSH) structure. It seems that both the brush and the pen can be selected for the shape
using the “SelectObject” function. For example, ellipses for the line selection will be drawn with a black
outline and white fill. A screen shot of the new lines, with one selected, is below.
Event Handler Specification
Interestingly, the functionality to handle the clicks on lines in order to select them (to later move and
expand them) reveals another possible bug in the Win32 logic. Whenever the “CallWindowProc”
function is invoked on the previous dispatch function, the current dispatch function is called instead,
yielding an infinite recursion. In addition, the “SetWindowLong” call to revert the event handler back to
the previous function sets the value to the current event handler. This was remedied by explicitly
passing the correct function names, but why the correct function pointer was not in the “previous
handle” variable remains a mystery. When running in GDB with breakpoints around the
CallWindowProc and SetWindowLong functions, no error occurred. Thus, this may be a timing issue
between GUI threads (although it is believed that both handlers would run in the same thread).
Icon Tray
The icon tray is a set of two images which can be selected and dragged to the canvas. Upon the
placement of the icon on the canvas, its parent window is changed from the main window to the canvas.
A screenshot of the icon tray with the canvas is below.
Win32 Methods and Types
Some non-intuitive structures and functions exist in Win32. WndClassEx is a structure of metadata
about the window. It is stored in a registry in memory via RegisterClassEx, which links the metadata to a
name. The name is looked up and the window instantiated in the memory as a displayable entity via the
CreateWindowEx method invocation, which returns a HWND handle. The HWND can be passed to the
ShowWindow function. The HWND is apparently an address in a table which references the bitmap for
the painting, etc. Interestingly, the CreateWindowEx method is also used to instantiate controls such as
combo boxes, but the RegisterClassEx method need not be invoked before such instantiation. Icon
images need to be placed in a resource file in order to assign them to a window, it seems (based on the
documentation for the LoadIcon function). An event loop runs by calling the GetMessage function in a
while loop until it returns 0 (apparently when the application process is terminated). One of the fields
of the WndClassEx struct is the event handler, known as lpfnWndProc. This method handles all events
occurring on controls within the window. Note that the Win32 “CALLBACK” keyword (the stdcall calling
convention) is used to modify this method, which must be a free or static function. However, one of the
parameters to the method, “lParam”, is a handle to the control which originated the event (e.g. the
button which was clicked). Note that the message type is simply WM_COMMAND for all control
notifications, with the low word of the “wParam” to the event handler containing the control’s
identification code and the high word the notification code.
One of the keys to implementing a window is to implement the event handler and handle messages for
WM_NCCREATE, WM_CREATE, and WM_PAINT. These can all be processed using the “DefWindowProc”
function, it seems. In addition, any “default” cases should be handled by calling this function. The order
of messages sent on window instantiation was, “GETMINMAXINFO”, “NCCREATE”, NCCALCSIZE”,
“CREATE”, “SHOWWINDOW”.
In order to use a traditional combobox, a window must be created with class WC_COMBOBOX and style
CBS_DROPDOWNLIST (editing is thus disabled). Strangely (using both mingw and VisualStudio/Windows
SDK libraries), the combobox list items cannot be selected by the mouse click, but only by the keyboard
arrows (finalizing a change by pressing “Enter”). When a mouse click occurs on an item in an expanded
dropdown list, no event messages are even sent to the event handler method, it seems. It was initially
suspected that the COM method CoInitialize needed to be called to properly set up the message pump
for the event loop, but even after adding this call, the combobox behavior was incorrect. In fact, the
correct behavior was obtained by passing NULL as the second parameter to the GetMessage method in
the event loop instead of the window handle. This is apparently due to the fact that the messages
generated by mouse moves and clicks within the combobox’s list are not attributed to the window
handle and thus would not be dispatched (this is an error, as messages to a child handle should be
addressed to the parent window, and the static control correctly propagates messages, as does the
combobox for keyboard-generated messages).
The error handling of this API makes it somewhat troublesome to write to. For example, when setting
an image in a static control, setting the styles to SS_CENTER and SS_BITMAP caused no image to display.
However, no exception occurred either in CreateWindow or SendMessage.
Drawing
Drawing lines and shapes to the Win32 GUI is also somewhat non-intuitive, although not completely
different from other graphical APIs such as Java Swing. Basically, the WM_PAINT event-handling logic
must call the code which draws the geometry. In order to set the pen or brush colors, the “device
context” must be set to represent the pen, and then the handle to the context passed to the
“SetDCPenColor”, along with the RGB values for the color ((0,0,0) for black, (255,255,255) for white,
etc). Interestingly, a “device context” is instantiated to provide the memory for display output
operations (the “BeginPaint” function returns the handle to this memory). An object (such as a pen,
brush, or region) is selected within the device context, and properties of the object (such as color) can
be set. Note that the handle to the device context is a 32-bit integer which maps the current display
context to a location in memory (in a lookup table of the OS). The “DC” function set is a part of the API
known as “Graphics Device Interface”, or “GDI”, whose symbols are located in Gdi32.dll.
A screen shot of a simple UI, involving static text, combo box, rectangle, and images from a resource
bundle linked into the executable is below.
COM OLE Drag and Drop
Although low-level mouse-down, mouse-up, and mouse-move events could be processed for drag-and-drop,
a higher-level API exists via Object Linking and Embedding (OLE). The Component Object Model (COM)
remote procedure call framework is utilized to send messages between objects which might reside in
different processes (it is unclear that COM is really needed for drag and drop within a single window, but
the OLE abstraction may require it by sending the events to an intermediary dispatch process). Classes
for which these methods are implemented are part of the Microsoft Foundation Classes (MFC)
object-oriented library. Unfortunately, the OLE and MFC libraries are not distributed with the MinGW
compiler. The author located some files which will be tried in both the g++ and cl (Microsoft)
compiler/linker sets.
Note that one source of confusion will be the mixture of object-oriented MFC/ATL (ATL stands for Active
Template Library, and is a COM abstraction) code and the C-style procedural Win32 code. One problem
is that, in an MFC implementation, the main function, WinMain, resides in the DLL and not the user’s
module. In addition, GUI widgets are referenced on a “document” basis, and not as a “handle”. It is
expected that, if the main dispatch loop is not started with the MFC libraries and the user interface state
initialized within this library, it would be impossible to utilize the drag and drop methodology.
Another source of confusion is the client-server paradigm of OLE and COM, when really only a single
application process is involved. The premise of OLE is that data objects need to be shared at runtime
between applications via Windows Explorer operations like cut and paste. In the case of this neural
network construction application, no cross-application sharing is required, so the COM and OLE libraries
seem unnecessary.
Win 32 (C API) Drag and Drop
A custom Win32 implementation of the drag-and-drop functionality will involve handling of the
WM_MOUSEMOVE, WM_LBUTTONDOWN, and WM_LBUTTONUP events as well as tracking state stored
within a DragDropHandler object. The drag-and-drop starts when the left button is pressed over a
draggable image (TLU or connection button or images on the canvas). The image is redrawn in a
semi-translucent form. After the drag begins, every mouse move of a distance of 10 or more pixels away
(without releasing the left button) induces a repaint of the semi-transparent image. When the left
button is released over the canvas, either a TLU or a connection line is painted. This GUI structure must
be reflected in data structures for the neural network in memory and serialized to an XML file.
Unlike some of the other messages, the mouse button and move messages do not provide the handle to
the window (i.e. UI control such as a button) where the action occurred. Thus, a means to infer the UI
widget upon which the mouse button was depressed from the X and Y coordinates which triggered the
event must be found. This will be accomplished via the GetWindowRect function, which returns the
rectangle for a given window handle.
Interestingly, the WM_LBUTTONDOWN message is never processed by the event loop, and
mousedowns are construed as clicks. The workaround for this is intercepting the WM_PARENTNOTIFY
message (which occurs prior to a click message). This results in a correct button down catch. However,
the x and y coordinates of the mouse fall well outside those which are assigned to either of the
draggable buttons. The reason for this seems to be that the coordinates returned by the
GetWindowRect function are the absolute coordinates in the OS display. If the application window is
shifted left, the left boundary of the rectangle shifts left accordingly. The solution to this problem was to
find the rectangle for both the child (image button) window and the parent (application) window. For
example, leftX = controlRect.left-windowRect.left, and rightX=controlRect.right-windowRect.left. Upon
making this correction, the drag start methods are correctly invoked.
New copies of the buttons are painted in response to the mousedown event, but the redraw does not
work correctly and the mouseup event is not correctly detected. In addition, the images do not appear
to be the translucent copies. This last issue was because the paintNewImg method used the original
images (no Win32 bug). Another problem was that the position was passed to the LoadImage function
instead of the size. The position is associated with the enclosing window and not with the bitmap itself.
The mouseup event was not conveyed via either a WM_LBUTTONUP message or a WM_COMMAND
message, as was expected. The only message sent when the button is released is WM_SETCURSOR
(code 32). However, this message is so general that catching it prevents the GUI’s being repainted
whenever the cursor is above the window. Interestingly, the mouse-up event was caught by setting the
capture of the mouse to the window of the translucent image and by delegating a new dispatch function
to the two network component buttons. Strangely, even when the button-up occurs outside of the
image, the event is caught (due to capture, it is suspected). In addition, it is peculiar that a dispatch to a
separate event-handler is necessary, but in fact the button-up event was not caught without such logic.
Another issue was a crash that occurred when repeated mousedown events occurred on the TLU or
connection button. This seems to be due to a flood of MOUSE_ACTIVATE messages which occur on the
second mousedown. The exception seemed to be ameliorated by setting the capture to the button
window (not the application window) on the mousedown and also associating the delegate with the
button window. However, this is incorrect, as the mouse move coordinates would not be relative to the
containing window’s upper-left corner. Another way to eliminate the ACTIVATE message flood was to
avoid calling the old wndproc method at the exit of the delegate (as this causes an infinite recursion of
messages, it seems).
Strangely, the method of subtracting the parent window’s coordinate from the current window’s
coordinate to obtain the parent-relative offset does not seem to be effective (although it approximately
works) as the connection button’s x-boundary is off by 11 and its y-boundary is too low by 40. This
problem was corrected by adjusting for the window’s borders on the top and left (which accounted for
the extra pixels). Such an adjustment is made by subtracting the “client window” width or height from
the “window” rect or height, as the client rectangle excludes the border. The left border is based on
dividing the extra x-pixels by two, and the top border is based on subtracting the left border from the
extra y-pixels. However, in this case, a mousemove event occurs when the translucent image is first
drawn, as the cursor’s position changes, which was due to an incorrect setting of the previous position
to the upper left of the image instead of the center.
After the above adjustments (shifting from the center of the image to the upper-left corner prior to
paint and converting screen coordinates to window coordinates), the initial select and drag of the icons
was successful. However, while the mouseup correctly identifies the image when no movement has
occurred before a mouseup, the rectangle for the icon is incorrect, after a drag has occurred. This was
corrected by caching the image rectangle in the “translucentTLU” and “translucentConn” handles.
However, when one image was dropped, dragging and dropping a subsequent image caused the original
dropped image to be erased. This was corrected by resetting the “imgHnd” pointer which is passed to
the DestroyWindow function to be NULL after a drop occurs. However, the DestroyWindow does not
cause the old images to disappear, and the constant reset of the canvas on paint causes placed objects
to be lost. This was due to an incorrect default message handler implementation in the delegate, which
must call CallWindowProc(prevWndProc…) for all events except for WM_LBUTTONUP (return 1 in this case).
The problem of deletion of items left on the canvas was addressed by instantiating the canvas only in
the setup code, not in the mousemove event handler. Apparently, the default event handler cleans up
the display in such a way that DestroyWindow’s changes are effective, while the simple DestroyWindow
and InvalidateRect statements alone are not sufficient. In addition, the button-up event caused an
infinite loop: the event handler was set to its predecessor, which then directly called the predecessor
of that handler (the original handler), producing an infinite recursion (f(a,b)->f(a,b)->f(a,b)…).
A screen shot is below. As of now, the drag and drop functionality works, except that:
1. Drops can occur anywhere (should be allowed only on canvas).
2. Upon drop, the image should solidify and not be an icon.
3. The image on the canvas should be selectable and movable.
The problem of discarding images not placed on the canvas was solved by calling DestroyWindow if the
mouse button is released when the image is outside the canvas.
In an attempt to reduce the flicker of the window during drag, only the icon’s rectangle was invalidated
on a mouse-move. However, this induced a disappearance of the image in the canvas. Invalidating only
the canvas when the image was inside it failed to ameliorate this situation. The final approach was to
invalidate the entire window but to set the bErase parameter to FALSE, which induced less flicker than a
call with bErase set to TRUE. The problem when not invalidating the window seems to be that, as the
static icon windows are placed on the canvas without being the children of the canvas, the canvas can
be painted above them, unless the window is repainted (as the root window remembers the order in
which children were added).
A minor issue is that the canvas border disappears when an icon is dragged to it. This is because the
black rectangle is painted only in the setup method, while it should be restored in every WM_PAINT
event handler.
Compiler and Linker
The GUI software was originally built using the mingw compiler and linker (g++ and ld). Later, an
additional makefile was written to compile with the Microsoft Visual Studio compiler and linker (cl) and
to link to the Windows SDK official libraries. Behavior of UI rendering and event messaging seems
identical, but there is no standard output console by default if building with the cl compiler. This
disparity was reconciled by adding a standard “main” method and (implicitly) causing cl to use the
/Subsystem:Console option.
Upon the implementation of logic encoding the placed graphical objects into data structures, the
Boltzmann Machine core logic, including the TLU and Boltzmann modules, needed to be linked into the
executable. Strangely, building the GUI along with the Boltzmann code generated errors involving the
TLU class which did not occur when building the Boltzmann code independently of the GUI. This was
due to the aberration that an enum defined in the DragDropHandler class contained a “TLU” element
which the compiler had placed in its symbol table prior to parsing the TLU class.
Resource Files
In order to include custom images in a Win32 application, the API requires the developer to use
“resource” identifiers to specify the image to load. These resource IDs are mapped to an
offset in a binary resource file, which is a bundle of all of the images (and possibly sounds, strings, etc.)
used by the program. Basically, a “.rc” file, mapping an integer ID to a resource type and a file name,
is compiled using the “RC” utility. This outputs a “.res” file, which can be linked by the Microsoft linker
into the executable. However, in order to utilize the Gnu Compiler Collection (gcc) for the build, the
“windres” utility must be utilized to convert the .rc file to an object (.o) file. Note that a resource file is
not required, if the image is installed along with the .exe and the “LR_LOAD_FROM_FILE” parameter is
passed to the LoadImage function. In the case of the CL (Visual Studio) compilation, the res file is
directly linked to the output (like a .o or .lib file).
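A minimal .rc file of the kind described might look like the following; the IDs and file names here are illustrative, not the project's actual resource script.

```
// Illustrative .rc script: each line maps an integer ID to a resource type
// and the file to bundle. Compiled with "rc" (MSVC) or "windres" (GCC).
100 BITMAP "tlu.bmp"
101 BITMAP "connection.bmp"
103 ICON   "boltzmann.ico"
```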
Icons in Win32
In Windows, program icons (for Explorer, the task bar, etc) are stored in .ico files. Just as is the case with
bitmaps, these are attached to the executable via the resource bundle. Apparently, the first icon found
within the resource bundle is selected by Windows as the program icon. Note also that .ico files are
archives which can contain multiple copies of an image (at different sizes or resolutions), and that an
external utility, Converticon, was needed to generate the image in the correct format. Note that, while
exporting images of 16x16, 32x32, and 256x256 succeeded, preserving the original image resulted in an
error in the “windres” conversion program, which generates the “.o” object file from the “.rc” resource script.
The resource compiler output, executable link output, and icon result are below.
c:\mbs>makerc
c:\mbs>rc /v BoltzmannGUI.rc
Microsoft (R) Windows (R) Resource Compiler Version 6.1.7600.16385
Copyright (C) Microsoft Corporation. All rights reserved.
Using codepage 1252 as default
Creating BoltzmannGUI.res
BoltzmannGUI.rc.
Writing BITMAP:100,
lang:0x409,
size 67072.
Writing BITMAP:101,
lang:0x409,
size 36584.
Writing ICON:1, lang:0x409,
size 1128
Writing ICON:2, lang:0x409,
size 4264
Writing ICON:3, lang:0x409,
size 270376
Writing GROUP_ICON:102, lang:0x409,
size 48
c:\mbs>makeboltzmanngui
c:\mbs>windres BoltzmannGUI.rc BoltzmannGUIRes.o
c:\mbs>g++ -g -o BoltzmannGUI.exe BoltzmannGUI.cpp BoltzmannGUIRes.o -lGdi32
In order to set the image which appears in the upper-left corner of the window, the LoadIcon function is
called and the result assigned to the hIcon field of the WNDCLASS struct. Note that two problems were
encountered here: the id number of the image could not be 102, which seemed to conflict with the
Windows built-in “question mark” icon, and the handle argument to LoadIcon needed to be the current
instance (not NULL, as was used for the default application icon). A screen shot, showing the task bar
and upper-left corner icons is below.
Pointers to Pointers
In order to set the address stored by a pointer from within a method, the pointer’s address is passed to
the function (pass-by-reference). This could be useful in the paint method, as a new window must be
created and stored within a handle at instance scope, but the method will be indifferent to whether it is
processing a connection image or a TLU image (polymorphism). However, the code was written so that
the window handle was returned up to a parent function, where the type of the dragged image is
determined.
Preventing Circular Header Dependency
In one instance, the BoltzmannGUI class retained a reference to a NetworkManipulator object, which
maintained a pointer back to the parent BoltzmannGUI object that instantiated it. Since the
BoltzmannGUI declaration is not yet visible when the preprocessor expands the
NetworkManipulator.h file, mutual #includes in NetworkManipulator.h and
BoltzmannGUI.h are not feasible. The cycle is broken by using a forward declaration of “class
NetworkManipulator” in BoltzmannGUI.h, #include “NetworkManipulator.h” in BoltzmannGUI.cpp
(where methods are invoked), and #include “BoltzmannGUI.h” in NetworkManipulator.h.
Variable Scope in Switch Statement Cases
In a switch statement, variables declared in one case are visible in the other cases, but their
initialization is skipped when control jumps to a later case. As the message-parsing logic is a large
switch statement regulating the dispatch of GUI events, compile errors can easily occur. Partitioning
the cases with {} prevents this shared scope.
Static Text Controls
Interestingly, WC_STATIC style controls display with a different background color than the window
proper (gray instead of white). Trying to include the style SS_WHITERECT does not remedy the
situation, as the white background covers the text, it seems. For the labels in the canvas, the font size
will be manipulated utilizing the WM_SETFONT message. Prior to this, a CreateFont function call must
be made, passing the logical font size based on the following transform for 10pt (from MSDN).
logHt = -MulDiv(10, GetDeviceCaps(hDC,LOGPIXELSY),72)
Here, hDC is the device context for the screen, and MulDiv refers to a multiply of the first two
arguments followed by a divide of the product by the third argument.
The background color can be set by the SetBkColor function, which takes the window handle and an RGB
color reference. The documentation suggests that this processing be done in the handler of the
WM_CTLCOLORSTATIC message, whereby a brush reference is returned to the caller which sent the
message. In fact, setting the background outside of such a handler (which is called whenever the
control is repainted) results in no color being painted.
The “Visible Layer” and “Hidden Layer” labels below are examples of static controls with custom-fonts
and programmatically-set backgrounds.
Windows Store Deployment
Any code not embedded in the Android app (such as the Windows GUI) will likely be released as an app
in the Windows Store.
Android ViewAnimator
The ViewAnimator class is used to switch between the main valuation view and the machine learning module.
Android SDK
This application will utilize Android API version 16. The “aapt” binary is utilized to generate an “arsc”
resource file, as well as to update the “R.java” source file. The “dx --dex” utility is used to convert Java
bytecode to a “.dex” archive to be run in the Dalvik Virtual Machine. Interestingly, it seems that the .dex
file needed to be deleted prior to re-archiving in order to ensure that the most recent classes are included
in the APK file. If this is the case, there may be a bug in the SDK’s dx utility.
UI Layout
This application will utilize a two-sided UI which can be toggled via a “flip button” and which makes use
of the ViewAnimator class. Note that the contents of EditText inputs, etc. will not change when the
screen is toggled.
The table for computation output will utilize a fixed column width format set to 1/5 of the screen width
given the current orientation. Code to obtain such a width utilizes the android.view.Display’s getSize
method. Interestingly, when the table grows too large (more than 25 rows), the flip button to toggle
back to the input screen cannot be reached.
Note that, currently, the onCreate event is thrown whenever rotation of the device occurs. This should
not be the case, as a rotation should retain the state of the ViewAnimator and EditText inputs.
Shape Drawable
A shape drawable is an XML file which is translated into code that paints an image. A preferable approach would be to
instantiate this Drawable object in code, but even the RectShape does not seem to provide requisite
methods to set the stroke width, fill color, etc. The shape configuration should be saved in the file
<app_home>/res/drawable/<shape_name>.xml. These XML files are parsed by the aapt resource
compilation utility.
For example, given the XML file:
<?xml version="1.0" encoding="utf-8" ?>
<shape xmlns:android="http://schemas.android.com/apk/res/android" android:shape="rectangle">
<solid android:color="#ffffff"/>
<stroke android:width="3dp" android:color="#000000"/>
</shape>
The following R.java fragment results.
public static final class drawable {
public static final int data_bg=0x7f020000;
Note that the new R.class files need to be generated by makeaapt before the XML files can be utilized as
background resources in UI widgets.
Test Case
Cash flow simulations were carried out both in an Excel spreadsheet and in the Android application
(only the first 12 months of simulation are covered by Excel for now). The parameters set in the input
screen are as follows:
PSA: 1.2
WAC: .08
Pass-through: .07
Principal Amt: 2000000
Risk-Free Rate: .03
Months Remaining: 360
The Excel sheet output is below, followed by the Android input and output screen shots. Note that the
Excel sheet includes “FHA” estimated prepayments, which are not relevant in version 1.0.
MBS Cashflow Model — Prepayment Model
WAC: 0.08   Pass-through rate: 0.07   Risk-free: 0.03   PSA: 1.2   CPR increment per month: 0.0024

Notes:
- Interest is the interest paid to MBS bondholders, not to mortgage lenders.
- Prepaid principal is the month-start principal, minus the scheduled amount, times the prepayment rate.
- CF assumes 100% pass-through of principal prepayment.

Month  FHA Prepaid(%)  CPR     SMM         Begin Principal  Monthly Payment  Scheduled Principal  Interest
1      0.0004          0.0024  0.00020022  2000000          14675.29         1341.958             11666.67
2      0.0004          0.0048  0.00040088  1998257.9        14672.35         1350.634             11656.5
3      0.0004          0.0072  0.00060199  1996106.7        14666.47         1359.093             11643.96
4      0.0004          0.0096  0.00080354  1993546.8        14657.64         1367.33              11629.02
5      0.0004          0.012   0.00100554  1990578.7        14645.86         1375.34              11611.71
6      0.0004          0.0144  0.00120799  1987203.1        14631.14         1383.116             11592.02
7      0.0004          0.0168  0.0014109   1983421.1        14613.46         1390.655             11569.96
8      0.0004          0.0192  0.00161426  1979234          14592.84         1397.951             11545.53
9      0.0004          0.0216  0.00181807  1974643.3        14569.29         1404.999             11518.75
10     0.0004          0.024   0.00202234  1969650.9        14542.8          1411.794             11489.63
11     0.0004          0.0264  0.00222708  1964258.6        14513.39         1418.332             11458.18
12     0.0004          0.0288  0.00243228  1958468.9        14481.07         1424.608             11424.4

Month  Prepaid Prin (FHA)  Prepaid Prin (SMM)  End Principal (FHA)  End Principal (PSA)  Discount Factor  DCF (FHA)  DCF (PSA)
1      799.463217          400.17199           1997858.58           1998257.9            0.997503         13773.61   13375.32
2      798.762894          800.52558           1996108.47           1996106.7            0.995012         13737.04   13738.8
3      797.899047          1200.8164           1993949.72           1993546.8            0.992528         13697.83   14097.73
4      796.871788          1600.7994           1991382.6            1990578.7            0.99005          13655.98   14451.91
5      795.681332          2000.2286           1988407.65           1987203.1            0.987578         13611.52   14801.1
6      794.327994          2398.8579           1985025.66           1983421.1            0.985112         13564.46   15145.1
7      792.812189          2796.4411           1981237.66           1979234              0.982652         13514.83   15483.7
8      791.134432          3192.7321           1977044.95           1974643.3            0.980199         13462.65   15816.7
9      789.29534           3587.4852           1972449.05           1969650.9            0.977751         13407.95   16143.88
10     787.295628          3980.4552           1967451.77           1964258.6            0.97531          13350.74   16465.06
11     785.136113          4371.398            1962055.15           1958468.9            0.972875         13291.07   16780.05
12     782.81771           4760.0707           1956261.46           1952284.2            0.970446         13228.95   17088.65
Bibliography
1. http://www.xavier.edu/williams/tradingcenter/documents/research/edu_ppts/01_MortgageBackedSecurities.ppt CPR, SMM, etc
formulas
2. http://developer.android.com/guide/topics/resources/drawable-resource.html Shape XML
specification
3. http://www.stackoverflow.com/questions/1016896/get-screen-dimensions-in-pixels Retrieval
of display size
4. http://en.wikipedia.org/wiki/PSA_prepayment_model PSA discussion
5. http://www.scholarpedia.org/article/Boltzmann_machine Boltzmann Machine overview.
6. http://www.sciencedirect.com/science/article/pii/S0957417414006393 Boltzmann Machine in
credit risk
7. https://msdn.microsoft.com/en-us/library/windows/desktop/ms632680(v=vs.85).aspx Win32
guide
8. https://www.allegro.cc/forums/thread/212101 Discussion of dropdown lists in Win32
9. http://blogs.msdn.com/b/larryosterman/archive/2004/04/28/122240.aspx Discussion of COM
initialization and threaded apartments
10. https://msdn.microsoft.com/en-us/library/windows/desktop/hh298364(v=vs.85).aspx win32
combobox tutorial
11. http://blog.echen.me/2011/07/18/introduction-to-restricted-boltzmann-machines/ Practical
discussion of Restricted Boltzmann Machine.
12. http://www.mingw.org/wiki/MS_resource_compiler Discussion of resources under GCC.
13. https://msdn.microsoft.com/en-us/library/windows/desktop/dd162957(v=vs.85).aspx
Information about selecting an object within a device context.
14. http://portal.hud.gov/hudportal/HUD?src=/program_offices/housing/rmra/oe/rpts/hecmdata/
hecmdatamenu HECM data
15. http://newviewadvisors.com/commentary/tag/hecm-prepayments Discussion of 2012 FHA data
16. http://stackoverflow.com/questions/7655736/win32-changing-program-icon Win32 Icons
17. http://www.converticon.com/ Image conversion utility.
18. http://stackoverflow.com/questions/8157937/how-to-specialize-stdhashkey-operator-for-userdefined-type-in-unordered Template specialization for hash/equal_to.
19. https://support.google.com/googleplay/android-developer/answer/113469?hl=en Distribution
of Android apps.
20. http://developer.android.com/tools/publishing/app-signing.html Signing APK.
21. https://msdn.microsoft.com/en-us/library/5t7ex8as.aspx Win32 drag and drop.
22. http://en.wikipedia.org/wiki/Active_Template_Library ATL discussion.
23. http://www.cboe.com/micro/gvz/introduction.aspx GVZ Gold Volatility.
24. http://www.codeproject.com/Tips/400920/Handling-WM-LBUTTONUP-WM-LBUTTONDOWNMessages-for LBUTTONDOWN via PARENTNOTIFY catching
25. http://wiki.winehq.org/List_Of_Windows_Messages Win32 message codes
26. http://social.msdn.microsoft.com/forums/vstudio/en-US/8960c64a Defining alternate window
event handlers
27. https://msdn.microsoft.com/en-us/library/windows/desktop/ms632682(v=vs.85).aspx
DestroyWindow discussion.
28. www.deeplearning.net/tutorial/rbm.html Discussion of Restricted Boltzmann Machine with
Python implementation
29. Larochelle, H. and Bengio, Y. “Classification Using Discriminative Restricted Boltzmann
Machines”. Proceedings of the 25th International Conference on Machine Learning. Helsinki, Finland.
2008.
30. https://mbsdisclosure.fanniemae.com/PoolTalk/index.html# Fannie Mae MBS Pool Application
31. https://msdn.microsoft.com/en-us/library/aa911419.aspx Discussion of font height in Win32
32. http://winprog.org/tutorial/fonts.html Font example code (Win32)
33. http://www.informit.com/articles/article.aspx?p=328647&seqNum=3 GDI Pen and Brush
discussion
34. http://www.fanniemae.com/portal/funding-the-market/mbs/multifamily/dusprepaymenthistory.html Prepayment spreadsheet site
35. http://www.federalreserve.gov/releases/chargeoff/about.htm Charge-off and delinquency data
36. https://en.wikipedia.org/wiki/Mortgage-backed_security#Valuation MBS Overview
37. http://info.fanniemae.com/resources/file/mbs/pdf/gems_remic_term_sheet_111214.pdf FNMA
REMIC term sheet example
38. http://www.ginniemae.gov/doing_business_with_ginniemae/investor_resources/mbs_disclosur
e_data/Pages/consolidated_data_history.aspx Ginnie Mae reports of delinquency, etc
39. http://www.nasdaq.com/symbol/mbb/historical iShares MBS ETF prices
40. http://developer.android.com/reference/android/widget/TextView.html Discussion of
setCompoundDrawables methods.
41. http://www.fanniemae.com/resources/file/debt/pdf/debt_library.pdf Fannie Mae auction
system.
42. http://simplesoap.sourceforge.net/ Simple SOAP
43. http://www.investinginbonds.com/learnmore.asp?catid=11&subcatid=56&id=137 CMO types
44. http://www.wsj.com/mdc/public/page/2_3024-bondmbs.html The Wall Street Journal MBS
Quotes