Gary W. Howell, Ph.D.
512 Farmington Woods Dr., Cary, NC 27511
Phone: (919) 469-5454
Email: gary_howell@ncsu.edu

PERSONAL QUALIFICATIONS
Good team member. Broad knowledge of science, mathematics, and computer science. Love to dig into new technical areas. Enjoy letting others teach me. Independent thinker. Good listener. Meet deadlines. Able to write proposals, make presentations, mentor, teach, and organize team efforts.

EDUCATION
Ph.D., Mathematics, University of Florida, Gainesville, FL - 1986
M.Sc., Engineering Sciences, University of Florida - 1984
M.Sc., Mathematics, University of Florida - 1981
B.A., Mathematics, New College, Sarasota, Florida - 1973

LANGUAGES: C, C++, FORTRAN 77 & 90, APL, Matlab, Maple, LaTeX, MPI, OpenMP, PVM, MS Office, Perl, LSF, HTML, shell scripts on UNIX platforms (Linux, Solaris, Tru64, AIX)

SOME RECENT WORK ON ALGORITHMS AND SOFTWARE
Working to extend Householder bidiagonalization to run with high cache efficiency in the sparse matrix case. This will provide a stable computation for the sparse singular value problem. The code extends the Householder bidiagonalization routines for the dense case (discussed in LAPACK Working Note 174), which are currently being incorporated into LAPACK.

BHESS (Algorithm 841 in the ACM TOMS collection), together with BR iteration (G. A. Geist, G. W. Howell, and D. S. Watkins, "The BR Eigenvalue Algorithm", SIAM J. on Matrix Analysis and Applications, Vol. 20, 4, pp. 1083-1098, July 1999), provides an efficient means of calculating eigenvalues of unsymmetric matrices.

G. W. Howell, C. T. Fulton, J. W. Demmel, S. Hammarling, K. Marmol, "Cache Efficient Algorithms for Householder Bidiagonalization", LAPACK Working Note 174; a shorter version has been submitted to ACM Transactions on Mathematical Software.

G. W. Howell and N. Diaa, "Algorithm 841: BHESS, cache efficient reduction to similar smallband Hessenberg form", ACM Transactions on Mathematical Software, (March 2005).

AREAS OF EXPERTISE

Software Libraries and User Environments
Worked on configuring and optimizing I/O and message processing at a 1024-processor supercomputing site and on several other clusters. Maintained and ported software libraries and provided benchmarking and documentation. Aided in porting, debugging, and optimizing parallel applications and in developing a stable and usable programming environment.

MPI: Served on the Message Passing Interface (MPI) Forum. MPI is the standard message passing library that provides a portable high-level interface for parallel coding. The forum produced the MPI standard set of protocols for message passing. These are now very widely used, enabling parallel algorithms to be portably coded.

LAPACK Library: LAPACK is the standard software library for dense matrix computations using the BLAS. By making better use of in-cache data, LAPACK gains substantial speedups compared to the earlier LINPACK and EISPACK packages. I developed Householder bidiagonalization routines for the next LAPACK library release.

BLAS Library: Served on the BLAS (Basic Linear Algebra Subroutines) committee. The committee established a standard library of basic linear algebra subroutines to enable portable scientific computation codes to run fast on serial computers. Contributed to the BLAS software library. The Basic Linear Algebra Subroutines allow linear algebra operations such as matrix multiply to run with megaflop rates near CPU clock speeds. I proposed some BLAS 2.5 operations which speed reduction to banded (e.g. bidiagonal) form.

Other libraries: Experience porting and customizing other libraries, e.g. porting and tutoring users in much of the DOE ACTS software, as well as many other standard open source numerical and application packages. Supported users in customizing and scripting packages as necessary for their own applications or contracted work in algorithm and package development.

Numerical Linear Algebra
Worked in numerical linear algebra for more than ten years, developing algorithms and implementing them in software. Worked on solving linear systems, solving least squares problems, and computing eigenvalues and singular vectors, for dense and sparse systems, and in parallel (see papers below). In consulting with users, this experience helps me choose appropriate libraries. A current project is finding singular values of a sparse matrix with flop rates comparable to the dense case.

Applications
Parallel Monte Carlo: Invented a method of separating isotopes via oscillatory flow. Wrote Monte Carlo codes to simulate the underlying convection diffusion process.
Error bounds for polynomial interpolation: Generalized an 1830s theorem due to Augustin-Louis Cauchy to give error bounds for approximation of derivatives. Wrote codes to automatically determine error bounds for polynomial interpolation.
Spacecraft redesign: Predicted re-entry temperatures of Atlas rockets using finite difference and finite element codes.
Thermal design of recovery system for an Atlas booster: Developed finite difference and finite element analysis to determine the amount of insulation/ablation required for a tumbling rocket booster re-entering the atmosphere.
Device to measure cross-section of the Shuttle Solid Booster Rockets (SineBar): The mathematical analysis and computer implementation of a circular spline device to accomplish this measurement as part of the redesign after the Challenger explosion.
Design of pressure sensing system for release of parachutes.
Numerical analysis of stress on water impact and of flotation requirements when wave action submerges flotation devices.
Image processing algorithms for a successful DARPA proposal for a real-time LIDAR sensor.
Benchmarking codes, least squares analyses, inverse & regularization problems, local search algorithms, statistical analysis.

Teaching, Mentoring & Publishing
Developed and taught short courses in MPI and graduate courses in parallel computation.
Several decades of experience teaching graduate and undergraduate courses in scientific computation, numerical analysis, mathematics, operations research, statistics and computer science.
Supervised theses in parallel computations, numerical analysis, statistics, and computer science.
Served on numerous Ph.D. and M.S. thesis committees in Engineering, Computer Science, Computer Engineering, Aviation & Human Factors, Operations Research, Electrical Engineering, Mathematics, Physics and Space Sciences.
Author of more than thirty publications in numerical linear algebra, approximation theory, differential equations, statistics, and thermal design.
$1000 award for mentoring the winner of the 2001 Nelson Ying Science Competition for high school students.

PROFESSIONAL POSITIONS
Jan. 2004 - present Computational Scientist, High Performance Computing, ITD, NCSU, Raleigh, NC
Summer 2005 Visiting Researcher, CERFACS, Toulouse, France
2002 - Jan. 2004 Technical Consultant II, High Performance Scientific and Parallel Computation group, Hewlett Packard Corp, Vicksburg, MS
2001 - 2002 Professor of Mathematical Sciences, Florida Tech, Melbourne, FL
Summer 2001 Visiting Researcher, CERFACS, Toulouse, France
2001 - 2002 Senior Scientist Consultant, Harris Corporation, Melbourne, FL
1997 - 2000 Operations Research Faculty, Florida Tech, Melbourne, FL
1991 - 2000 Associate Professor of Applied Mathematics, Florida Tech, Melbourne, FL
Spring 1997 Visiting Professor of Applied Mathematics, Washington State University, Pullman, WA
1996 - 1999 Summer Faculty - Associate Professor of Computer Science, Florida Tech, Melbourne, FL
1994 - 1995 Summer Faculty Fellowship Researcher, Oak Ridge National Laboratory, Oak Ridge, TN
1987 - 1990 Engineering Consultant, Thiokol Corporation, Kennedy Space Center, FL
1986 - 1991 Assistant Professor of Applied Mathematics, Florida Tech, Melbourne, FL

FUNDED RESEARCH
Funded work includes:
2005 NIH senior scientist for $500K grant for ECCR Drug Discovery (algorithm development and benchmarking).
2002 PI of $104K NSF grant EIA0103642 from Next Generation Software to develop a more cache efficient algorithm for Householder bidiagonalization (to be incorporated into LAPACK).
2001 Senior Scientist Consultant to Harris Corp. for Jigsaw LIDAR real-time image processing sensor proposal, funded by DARPA.
2001 NSF-MRI (Major Research Instrumentation) $250K to design and construct a 48 node Beowulf parallel computer at Florida Tech to be used by faculty, students, and local industry.
1994 - 1995 Summer Faculty Fellowship at Oak Ridge National Laboratory to develop algorithms to reduce a general matrix to similar small band form and to efficiently determine eigenvalues.
1989 - 1991 Co-PI of a $100K NSF grant to study parallel algorithms.
1991 - 1992 $13K from Florida Solar Energy Commission to study isotope separation.
1990 - 1991 $20K grant from TRDA to study Convection Diffusion problems, joint with V. Lakshmikantham.
1989 - 1990 $30K from Thiokol Corporation to determine thermal heating of Atlas thruster during re-entry.
1987 - 1988 $30K from Morton Thiokol to analyze piecewise circular spline algorithm as part of Challenger Redesign Effort.

INVITED TALKS
February 2006 SIAM Conference on Parallel Computing in San Francisco, minisymposium, "Cache Efficient Householder Bidiagonalization for LAPACK".
October 2001 SIAM Conference on Applied Linear Algebra in Raleigh, NC, "Cache Efficient Reduction Bidiagonalization", "ELMRES solution of sparse linear systems of equations".
June 2001 CERFACS, Toulouse, France, Sparse Days conference, "Sparse Householder Bidiagonalization".
Sept. 2000 CERFACS, Toulouse, France, "The ELMRES algorithm for solving sparse nonsymmetric linear systems".
July 2000 ICCAM, Leuven, Belgium, "Cache-Efficient Bidiagonalization".
June 1999 SIAM annual conference in Atlanta, "Cache-Efficient Reduction to Small Band Form".
Oct. 1998 Basic Linear Algebra Standards National committee (BLAST Forum).
Oct. 1998 "The ELMRES algorithm for solving sparse non-symmetric linear systems", IMACS conference in honor of David Young, Austin, TX.
June 1998 Basic Linear Algebra Standards National committee.
Dec 1997 Basic Linear Algebra Standards National committee (BLAST Forum).
Oct. 1997 SIAM Conference on Applied Linear Algebra at Snowbird, Utah. Minisymposium presentation "Gaussian Similarity Transformations In Computing Matrix Eigenvalues".
April 1997 Washington State University, Mathematics Colloquium, "Large Eigenvalue Problems".
July 1995 Oak Ridge National Labs, "BR Iteration".
July 1994 Oak Ridge National Labs, "BHESS Reduction to Small Band Form".
Nov 1994 University of Central Florida Math. Seminar, "Efficient Computation of Eigenvalues".
March 1993 University of Central Florida, "Quasi-Circular Approximation".
Nov 1992 University of Florida, Quantum Mechanics Seminar, "Efficient Computation of Eigenvalues".
Nov 1991 University of Central Florida, Math. Seminar, "Error Bounds for Derivatives of Interpolating Polynomials".

PROFESSIONAL MEMBERSHIP
Member of SIAM, AMS, IEEE and ACM.

PROFESSIONAL SERVICE
National Computational Science Standards Committees: MPI (Message Passing Interface) Forum, BLAST (Basic Linear Algebra Subroutines) committee
President of Brevard Chapter of Sigma Xi (1998-99)
Book reviews for JAMSA
Over 50 thesis and dissertation committees in math, CS, OR, and various engineering disciplines.
Refereeing:
National Science Foundation proposals
Oxford University Press
Wiley & Sons publishers
Journal of Applied Mathematics and Computation
Journal of Applied Mathematics and Simulation
Journal of Approximation Theory
Simon & Schuster
Morgan & Kaufman
Approximation Theory and its Applications
Journal of Applied Mathematics and Stochastic Analysis
Computational and Applied Mathematics
Indian Journal of Pure and Applied Mathematics
Rocky Mountain Journal of Mathematics

COMMUNITY SERVICE
Delegate to Wake County Democratic convention, 2006.
Chair of Florida Tech faculty senate welfare committee, 2000-2001.
Florida Tech faculty senate, 1999-2001.
City Commissioner for Town of Melbourne Village (1990-1995).
Chair of Public Works Committee of Melbourne Village (1993-1998).

HOBBIES
Enjoy playing the piano, clarinet and saxophone with a community jazz band. Jogging with my slow, old dog. Hiking & bird watching. Enjoy reading and going to the opera when possible. Have been known to find a volleyball game.

PUBLICATIONS

Numerical Analysis
G. W. Howell, J. W. Demmel, C. T. Fulton, S. Hammarling, and K. Marmol, "Cache Efficient Householder Bidiagonalization", submitted to ACM Transactions on Mathematical Software, April 2006. Preliminary version available at http://www.netlib.org/lapack/lawns/downloads/ (see LAPACK Working Note 174).
G. W. Howell and N. Diaa, "Algorithm 841: BHESS, cache efficient reduction to similar smallband Hessenberg form", ACM Transactions on Mathematical Software, (March 2005).
Desmond Stephens and Gary Howell, "The Elementary Residual Method", Contemporary Mathematics, Vol. 275, pp. 107-116 (2001).
G. A. Geist, G. W. Howell, and D. S. Watkins, "The BR Eigenvalue Algorithm", SIAM J. on Matrix Analysis and Applications, Vol. 20, 4, pp. 1083-1098, (July 1999).
Gary W. Howell, "Towards an Efficient and Stable determination of spectra of a general matrix and a more efficient solution of the Lyapunov equation", Proceedings of the First World Congress of Nonlinear Analysts IV, pp. 3913-3927, Tampa, Walter de Gruyter (1996).
G. W. Howell and G. A. Geist, "Direct Reduction to a Near-Tridiagonal Form", Proceedings of the ISCA 8th International Conference on Parallel and Distributed Systems, pp. 426-432 (1995).
G. W. Howell and G. A. Geist, "Necessity of High Precision Arithmetic for Large-Scale Computation", Proceedings of Neural, Parallel, and Scientific Computations 1, Atlanta (1995).
G. W. Howell and K. Rekab, "Expected Conditioning for Eigenvalues of Randomly Generated Matrices", Neural, Parallel, and Scientific Computations 3, 2, pp. 263-270, (June 1995).
Gary W. Howell, "Efficient Computation of Eigenvalues of Randomly Generated Matrices", Applied Mathematics and Computation 66, pp. 9-24 (1994).
S. K. Sen and Gary Howell, "Direct Fail-Proof Triangularization Algorithms for AX + XB = C with Error-Free and Parallel Implementations", Applied Mathematics and Computation 50, pp. 255-278, (1992).

Approximation Theory
Paul Godfrey and G. W. Howell, "A Three Term Recurrence for Computing Approximate Euler, Bernoulli, and Genocchi Numbers", Bulletin of the Marathwada Mathematical Society (Dec. 2001).
S. G. Deo and G. W. Howell, "A Highway to Trigonometry", Bulletin of the Marathwada Mathematical Society Vol 1, pp. 26-62, (Dec 2000).
Gary Howell, "Generalizations of the Cauchy Remainder", Proceedings of International Conference on Conjectures in Approximation Theory, Sofia, Bulgaria (June 1993).
Gary Howell, "Derivative Error Bounds for Lagrange Interpolation", Journal of Approximation Theory 67, 2, pp. 164-173 (1991).
Gary Howell, "Conditioning of Lidstone Polynomial Interpolations", Proceedings of 6th S.E. Approximation Theory Conference, (1991).
A. K. Varma and Gary Howell, "Best Error Bounds for Derivatives in Two Point Birkhoff Interpolation Problems", Journal of Approximation Theory 38, 3, (July 1983).

Cryptography
V. Lakshmikantham, S. K. Sen, and G. W. Howell, "Vectors vs. Matrices, p-inversion, cryptographic applications, and vector implementation", Neural, Parallel & Scientific Computations Vol 4, 2, pp. 129-140, (June 1996).

Differential Equations
K. N. Murty, G. W. Howell, and G. V. R. L. Sarma, "Two Multi point Non-Linear Lyapunov Systems Associated with an nth Order Non-Linear System of Differential Equation--Existence and Uniqueness", Mathematical Problems in Engineering, (2000).
K. N. Murty, G. W. Howell and S. Sivasundaram, "Two (multi) Point Nonlinear Lyapunov Systems, Existence and Uniqueness", Journal of Mathematical Analysis and Applications, pp. 505-515, (July 1992).
K. N. Murty, Gary Howell, and S. Sivasundaram, "Two Point Boundary Value Problems associated with a System of First Order Non Linear Impulse Differential Equations", Applicable Analysis 51, pp. 303-313, (1993).
Gary Howell and V. Lakshmikantham, "A New Monotone Iterative Technique for Solution of Nonlinear Systems of Differential Equations", Applicable Analysis 39, pp. 113-118, (1990).

Fluids
Gary Howell, Pavlos Kairis, and Kamel Rekab, "Monte Carlo Simulation of Dispersion in Slow Oscillatory Flow", Journal of Mathematics and Physical Sciences 27, 4, pp. 257-269, (1993).
Gary Howell, "Isotope Separation by Oscillatory Flow", Physics of Fluids 31, 6, (1988).
U. H. Kurzweg, G. W. Howell and M. J. Jaeger, "Enhanced dispersion in oscillatory flows", Physics of Fluids 27, 5, (1984).

Splines
Gary Howell, L. V. Fausett, and D. Fausett, "Quasi-Circular Splines, a Shape Preserving Algorithm", CVGIP: Graphical Models and Image Processing 55, 2, pp. 89-97, (1993).
Gary Howell and A. K. Varma, "0-2 Spline Interpolation With Quartic Splines", Numerical Functional Analysis and Optimization, (1991).
Gary Howell, "Optimal Error Bounds for Two Even Degree Tridiagonal Splines", Journal of Applied Mathematics and Simulation 3, pp. 117-134, (1990).
Gary Howell, "Shape Generation by Linear Deformation of Circular Splines", Proc. 6th Texas A&M Conf. on Approximation Theory 1, pp. 333-336, Academic Press, (1989).
Gary Howell and A. K. Varma, "Best Error Bounds for Quartic Spline Interpolation", Journal of Approximation Theory 58, 1, (July 1989).

Some Technical Reports
Benchmarking: "Comparison of Opteron and Xeon Blades", G. Howell and E. Sills, http://www.ncsu.edu/itd/hpc/Documents/Publications/gary_howell (April 2006).
Benchmarking: "Using SHMEM for Low-Latency Communication on the Compaq SC40 and SC45", G. W. Howell, pp. 30-31, The Resource, U.S. Army Research and Development Center Information Technology Laboratory, Fall 2003.
Numerical Analysis: Oak Ridge Technical Report: "Error Analysis of Reduction to Similar Banded Hessenberg Form", G. W. Howell, G. A. Geist, and T. H. Rowan, ORNL/TM-13334, March 1998.
Splines: Technical Report: "Mathematical Evaluation of Sine-Bar Measuring Device", with Donald W. Fausett and Laurene V. Fausett, Morton Thiokol, contract 7MF103, (1987).
Thermal Analysis: Technical Report: "A Finite Difference Routine for Solving Time-dependent Heat Flux for Composite Materials", delivered to Thiokol Corporation, (April 1989).
Thermal Analysis: J. I. Frankel, T. P. Wang, and G. W. Howell, "The Use of the Finite Integral Transform Technique for Thermal Analysis in Microelectronic Chip Modules", 5th Int. Conf. on Numerical Methods for Thermal Problems, Montreal, Canada, (June 1987).

Recent Conference Proceedings
Y. Zhang, X.-M. Hu, G. W. Howell, E. Sills, J. D. Fast, W. I. Gustafson Jr., R.A "Modeling Atmospheric Aerosols in WRF/Chem". Extended abstract for 2005 Joint WRF/MM5 User's Workshop, 27-20 June, 2005, Boulder, CO. 2005
Gary Howell, Charles Fulton, Sumit Malhotra, Jim Parker, "Cache Efficient Householder Bidiagonalization", 2003 SIAM Conference on Applied Linear Algebra, July 16-19, Williamsburg, VA.
P. Angeli, O. Basset, C. Fulton, G. Howell, R. Hsu, D. Richardson, A. Sawetprawhichkal, M. Schuster, H. Thompson, S. Wilberscheid, "Some Issues in Efficient Implementation of a Vector Based Model for Document Retrieval", The 2001 International Symposium on Information Systems and Engineering (ISE'2001), June 25-28, 2001, Monte Carlo Resort, Las Vegas, Nevada, USA.