Other References

[1] Accelerated Strategic Computing (ASCI) Initiative. A report by US Department of Energy,
Lawrence Livermore, Los Alamos, Sandia National Laboratory, 1996
[2] Interconnection Networks, J. Duato, S. Yalamanchili, L. Ni, Morgan Kaufmann, 2002
[3] Boden, NJ et al, "Myrinet: A Gigabit-per-Second Local Area Network", IEEE Micro, Feb.
1995
[4] Stenstrom, P., Joe, T., and Gupta, A. Comparative performance evaluation of cache-coherent
NUMA and COMA architectures. In Proceedings of the 19th International Symposium on
Computer Architecture (1992), IEEE Computer Society, IEEE Press, pp. 80-91
[5] Adve S, Hill M, Vernon M. Comparison of Hardware and Software Cache Coherence
Schemes. Proc. of the 18th Annual International Symposium on Computer Architecture, 1991,
(Jun.): 298~308
[8] Hwang K. Advanced computer architecture: parallelism, scalability, and programmability.
McGraw-Hill, 1993
[9] Silicon Graphics, Origin 200 and Origin 2000, Technical Report, 1996
[10] Stephen R. Wheat, Timothy G. Mattson, David Scott. A TeraFLOPS Supercomputer in 1996:
The ASCI TFLOP System. In Proceedings of the 1996 International Parallel Processing
Symposium, 1996
[11] Tom Anderson, David Culler, Dave Patterson, and the NOW Team. A Case for NOW
(Networks of Workstations). IEEE Micro 15, 1, February 1995, pp. 54-64
[12] Brent R P. The Parallel Evaluation of General Arithmetic Expressions. Journal of the ACM,
1974, 21(2): 201-206
[13] Amdahl G. Validity of the Single Processor Approach to Achieving Large Scale Computing
Capabilities. AFIPS Conf. Proc. 30, April, Thompson Books, Washington D.C., 1967,
483-486
[14] Gustafson J L. Reevaluating Amdahl's Law. Comm. of ACM, 31(5): 532-533, 1988
[15] Sun X H, Ni L M. Another View on Parallel Speedup. Proc. Supercomputing '90, 324-333, 1990
[16] Kumar V, Rao V N. Parallel Depth-First Search, Part II: Analysis. Int'l J. of Parallel
Programming, 16(6): 501-519, 1987
[17] Sun X H, Rover D T. Scalability of Parallel Algorithm-Machine Combinations. IEEE
Trans. on Parallel and Distributed Systems, 5(6): 599-613, 1994
[18] Zhang X D, Yan Y, He K Q. Latency Metric: An Experimental Method for Measuring and
Evaluating Parallel Program and Architecture Scalability. J. of Parallel and Distributed
Computing, 22: 392-410, 1994
[19] S. Fortune and J. Wyllie. Parallelism in random access machines. Proc. 10th Annual ACM
Symp. on Theory of Computing, San Diego, California, 1978, 114-118
[20] Chen Guoliang. Scalability Analysis of Parallel Algorithms. Mini-Micro Systems
(小型微型计算机系统), Vol. 16, No. 2, pp. 10-16, 1995
[21] Ben HH Juurlink, Harry AG Wijshoff: A Quantitative Comparison of Parallel Computation
Models. ACM Trans. Comput. Syst. 16(3): 271-318 (1998)
[22] Mark Goudreau, Kevin Lang, Satish Rao, Torsten Suel, Thanasis Tsantilas: Towards
Efficiency and Portability: Programming with the BSP Model. SPAA 1996: 1-12
[23] T. Cheatham, A. Fahmy, D. C. Stefanescu, and L. G. Valiant. Bulk synchronous parallel
computing - A paradigm for transportable software. In Proc. of the 28th Hawaii International
Conference on System Sciences. Vol. 2: Software Technology, pages 268-275, 1995.
[24] Chlebus B, Vrto I. Parallel Quick Sort. Journal of Parallel and Distributed Computing, 1991,
11: 332-337
[25] Dekel E, Nassimi D, Sahni S. Parallel Matrix and Graph Algorithms. SIAM J. on
Computing, 1981, 10: 657-673
[26] Galil Z. Optimal Parallel Algorithms for String Matching. Info. and Control, 1985, 67(1-3):
144-157
[27] Hoare C A R. Quicksort. Computer Journal, 1962, 5: 10-15
[28] JaJa J. An Introduction to Parallel Algorithms. Addison-Wesley Pub. Company, 1992
[29] Knuth D E, Morris J H, Pratt V R. Fast Pattern Matching in Strings. SIAM J. Computing,
1977, 6(2): 323-350
[30] Sedgewick R. Implementing Quicksort Programs. Communications of the ACM, 1978,
21(10): 847-857
[31] Singh V, Kumar V, Agha G et al. Efficient Algorithms for Parallel Sorting on Mesh
Multi-computers. International Journal of Parallel Programming, 1991, 20(2): 95-131
[32] Vishkin U. Optimal Parallel Matching in Strings. Info. and Control, 1985, 67(1-3): 91-113
[33] Wagar B A. Hyperquicksort: A Fast Sorting Algorithm for Hypercubes. Proc. of the Second
Conference on Hypercube Multiprocessors, 1987, 292-299
[34] E. Horowitz and A. Zorat, "Divide-and-conquer for parallel processing," IEEE Trans. Comput.,
vol. 32, pp. 582-585, June 1983.
[35] Daniel S. Hirschberg: Parallel Algorithms for the Transitive Closure and the Connected
Component Problems. STOC 1976: 55-57.
[36] H. T. Kung, "Why systolic architectures?", IEEE Computer 15, 1 (1982), 37-46.
[37] Richard Cole and Uzi Vishkin. Deterministic coin tossing with applications to optimal
parallel list ranking. Information and Control, 70(1): 32-53, July 1986.
[38] A. V. Goldberg, S. A. Plotkin, and G. E. Shannon. Parallel symmetry-breaking in sparse graphs.
SIAM J. Disc. Math., 1: 434-446, 1989.
[39] JaJa J. An Introduction to Parallel Algorithms. Addison-Wesley Pub. Company, 1992
[40] Benjamin W. Wah, Guo-Jie Li, Chee Fen Yu: Multiprocessing of Combinatorial Search
Problems. IEEE Computer 18(6): 93-108 (1985)
[41] David L. Parnas and Paul C. Clements. A rational design process: how and why to fake it. IEEE
Transactions on Software Engineering, SE-12(2), pp. 251-257, Feb. 1986.
[42] G. Fox et al. Solving Problems on Concurrent Processors. Prentice Hall, 1988.
[43] G. C. Fox, R. D. Williams, and P. C. Messina. Parallel Computing Works! Morgan Kaufmann
Publishers, Inc., 1994.
[44] Nicol, Saltz. "An Analysis of Scatter Decomposition", IEEE Transactions on Computers,
November 1990, pages 1153-1161.
[45] Foster I. Designing and building parallel programs: concepts and tools for parallel software
engineering, Addison-Wesley, 1995
[46] Feng T Y. A Survey of Interconnection Networks. IEEE Computer, 1981, 14(12): 12-27
[47] Hwang K. Advanced Computer Architecture: Parallelism, Scalability, Programmability.
McGraw-Hill, Inc., 1993
[48] Kumar V, Gupta A, Gupta A et al. Introduction to Parallel Computing: Design and Analysis
of Algorithms. Benjamin/Cummings Publishing Company, Inc., 1994
[49] Berntsen J. Communication Efficient Matrix Multiplication on Hypercubes. Parallel
Computing, 1989, 12: 335-342
[50] Bertsekas D P and Tsitsiklis J N. Parallel and Distributed Computation: Numerical Methods.
Prentice-Hall, 1989
[51] Cannon L E. A Cellular Computer to Implement the Kalman Filter Algorithm: Ph.D.
thesis. Montana State Univ., 1969
[52] Fox G C, Otto S W, Hey A J G. Matrix Algorithms on Hypercube I: Matrix Multiplication.
Parallel Computing, 1987, 4: 17-31
[53] Golub G H, Van Loan C F. Matrix Computations (2nd Ed). The Johns Hopkins Univ. Press, 1989
[54] Gupta A and Kumar V. The Scalability of Matrix Multiplication Algorithms on Parallel
Computers. Proc. Int'l 93 Conference on Parallel Processing, 1993, III-115 to III-119
[55] Ho C T, Johnsson S L, Edelman A. Matrix Multiplication on Hypercubes using Full
Bandwidth and Constant Storage. Proc. Int'l 91 Conference on Parallel Processing,
1991, 447-451
[56] Kumar V, Gupta A, Rao V. Scalable Load Balancing Techniques for Parallel Computers. J.
Parallel & Distributed Computing, 1994, 22(1): 60-79
[57] Kumar V, Gupta A, Gupta A et al. Introduction to Parallel Computing: Design and Analysis
of Algorithms. Benjamin/Cummings Publishing Company, Inc., 1994
[58] Don Heller. A survey of parallel algorithms in numerical linear algebra. SIAM Rev. 20 (1978),
pp. 740-777
[59] JM Ortega, Introduction to Parallel and Vector Solution of Linear Systems, Plenum Press,
New York, 1989.
[60] KA Gallivan, RJ Plemmons, and AH Sameh, Parallel algorithms for dense linear algebra
computations, SIAM Rev. 32 (1990), no. 1, 54–135.
[61] M. T. Heath, E. Ng and B. W. Peyton, Parallel algorithms for sparse linear systems, SIAM Review,
Vol. 33, 1991, pp. 420-460
[62] J. M. Ortega and R. G. Voigt. Solution of partial differential equations on vector and parallel
computers. SIAM Review, 27: 149-240, 1985.
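[63] Parallel computing methods: Zhang Baolin et al. Principles and Methods of Numerical Parallel Computing. National Defense Industry Press, 1999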
[64] J. W. Cooley and J. W. Tukey, "An algorithm for the machine calculation of complex Fourier
series," Math. Comp., vol. 19, pp. 297-301, April 1965.
[65] Nussbaumer, H. J. Fast Fourier Transform and Convolution Algorithms, 2nd ed. New York:
Springer-Verlag, 1982.
[66] Paul N. Swarztrauber. Multiprocessor FFTs. Parallel Computing, 5:197-210, 1987.
[67] A. Averbuch, E. Gabber, B. Gordissky and Y. Medan, "A Parallel FFT on a MIMD Machine,"
Parallel Computing, vol. 15, 1990, pp. 61-74
[68] Blumrich M A, Dubnicki C, Felten E W et al. Protected User-Level DMA for the SHRIMP
Network Interface. Proc. 2nd Int'l Symp. on High-Performance Computer Architecture, 1996
[69] Comer D E. Internetworking with TCP/IP. 3rd Ed. Prentice-Hall, 1995
[70] Lauria M, Chien A. MPI-FM: High Performance MPI on Workstation Clusters. J. of Parallel
and Distributed Computing, 1997, 40(1): 4-18
[71] Mellor-Crummey J M, Scott M L. Algorithms for Scalable Synchronization on Shared
Memory Multiprocessors. ACM Trans. Computer Systems, 1991, 9(1): 21-65
[72] Messina P, Sterling T (Eds). System Software and Tools for High Performance Computing
Environment. SIAM, 1993
[73] Pancake C M. Software Support for Parallel Computing: Where are We Headed? Comm. of
the ACM, 1991, 34(11): 53-64
[74] Pfister G F. In Search of Clusters. Prentice-Hall PTR, 1995
[75] IEEE, POSIX P1003.4a: Threads Extension for Portable Operating Systems, IEEE, 1994
[76] Snir M et al. The Communication Software and Parallel Environment of the IBM SP2. IBM
Systems Journal, 1995, 34(2): 205-221
[77] Stallings W. Operating Systems (2nd Ed). Prentice-Hall, 1995
[78] Agha G. Concurrent Object-Oriented Programming. Comm. of the ACM, 1990, 33(9): 125-141
[79] Allan S J, Oldehoeft R. HEP SISAL: Parallel Functional Programming. In: Kowalik (Ed).
Parallel MIMD Computation: HEP Supercomputers and Applications. MIT Press, 1985
[80] ANSI Technical Committee X3H5. Parallel Processing Model for High-level Programming
Languages, 1993
[81] Bal H E, Steiner J G, Tanenbaum A S. Programming Languages for Distributed Computing
Systems. ACM Computing Surveys, 1989, 21(3): 261-322
[82] OpenMP Standards Board. OpenMP: A Proposed Industry Standard API for Shared Memory
Programming, Oct. 1997
[83] OpenMP Standards Board. OpenMP Fortran Application Program Interface Version 1.0,
Oct. 1997
[84] IEEE, POSIX P1003.4a: Threads Extension for Portable Operating Systems, IEEE, 1994
[85] Silicon Graphics, IRIS Power C User's Guide. Silicon Graphics Computer Systems, 1989
[86] Wilson G V, Lu P (Eds). Parallel Programming Using C++. MIT Press, 1996
[87] Xu Z, Hwang K. Coherent Parallel Programming in C//. Proc. of Int'l Conf. on Advances in
Parallel and Distributed Computing, IEEE Computer Society Press, Mar. 1997, 116-122
[88] Adams J et al. The Fortran 90 Handbook. McGraw-Hill, 1992
[89] Adams J et al. The Fortran 95 Handbook. MIT Press, 1997
[90] Chapman B et al. Extending HPF for Advanced Data-Parallel Applications. IEEE Parallel &
Distributed Technology, 1994, 2(3): 15-27
[91] Fox G et al. FORTRAN D Language Specification. Rice Univ., 1992.
[92] Geist A et al. PVM: Parallel Virtual Machine - A User's Guide and Tutorial for Networked
Parallel Computing. MIT Press, 1994
[93] Hillis W D, Steele G L. Data Parallel Algorithms. Comm. ACM, 1986, 29(12): 1170-1183
[94] Hwang K, Xu Z. Scalable Parallel Computing: Technology, Architecture, Programming.
WCB/McGraw-Hill Companies, 1998
[95] Koelbel C et al. The High Performance Fortran Handbook. MIT Press, 1994
[96] Mehrotra P et al. High Performance Fortran: History, Status and Future. Parallel Computing,
1998, 24: 325-354
[97] MPI Forum. MPI: A Message Passing Interface. Proceedings of Supercomputing '93. IEEE
Computer Society, 1993, 878-883
[98] Zima H et al. Vienna FORTRAN - A Language Specification. ICASE, 1992. Version 1.1
[99] Alliant. Alliant Product Summary. Alliant Computer Systems Corporation, 1989
[100] Babaoglu O et al. Paralex: An Environment for Parallel Programming in Distributed
Systems. Proc. of ACM Int'l Conf. on Supercomputing, 1992
[101] Banerjee U. Dependence Analysis for Supercomputing. Boston: Kluwer Academic Press,
1988
[102] Beguelin A et al. Visualization and Debugging in a Heterogeneous Environment. IEEE
Computer, 1993, 26(6)
[103] Boudier G et al. An Overview of PCTE+. SIGPLAN, 1982, 2(24): 248-257
[104] Brown J S. Debuggers for High Performance Computers. Proc. of the Supercomputing '93,
1993
[105] Cheng Y. A Survey of Parallel Programming Languages and Tools. Technical Report RND-93
[106] Gosling J. Unix Emacs. Carnegie-Mellon Computer Science Dept., 1982
[107] Gupta A and Kumar V. The Scalability of Matrix Multiplication Algorithms on Parallel
Computers. Proc. Int'l 93 Conference on Parallel Processing, 1993, III-115 to III-119
[108] Hwang K. Advanced Computer Architecture: Parallelism, Scalability, Programmability.
McGraw-Hill, Inc., 1993
[109] Kacsuk P et al. A Graphical Development and Debugging Environment for Parallel
Programs. Parallel Computing, 1997, 22: 1747-1770
[110] Luque E et al. Overview and New Trend on PSEE. IEEE Software, 1992
[111] Newton P, Browne J C. The CODE 2.0 Graphical Parallel Programming Language. Proc. of
ACM Int'l Conf. on Supercomputing, 1992
[112] Reiss S P. Software Tools and Environments. ACM Computing Surveys, 1996, 28(1): 281-284
[113] Ries B. The Paragon Performance Monitoring Environment. Proc. of the Supercomputing '93,
1993
[114] Ross D T. Applications and Extensions of SADT. IEEE Computer, 1985, 18(4): 25-35
[115] Rumbaugh J et al. Object-Oriented Modeling and Design. Prentice-Hall, 1991
[116] Scheidler C, Schafers L. TRAPPER: A Graphical Parallel Programming Environment for
Industrial High Performance Applications. Proc. of PARLE '93: Parallel Architectures and
Languages, 1993
[117] Wolfe M. High-Performance Compilers for Parallel Computing. Addison-Wesley Pub.
Company, 1996
[118] NASA Ames Research Center, 1993
[119] Cheng D, Hood R. A Portable Debugger for Parallel and Distributed Programs. Proc. of
the Supercomputing '94, 1994
[120] Banerjee U. Dependence Analysis. Boston: Kluwer Academic Publishers, 1996
[121] Blume W, Eigenmann R. Performance Analysis of Parallelizing Compilers on the Perfect
Benchmarks Programs. IEEE Trans. on Parallel and Distributed Systems, 1992,
3(6): 643-656
[122] Blume W et al. Automatic Detection of Parallelism: A Grand Challenge for
High-Performance Computing. IEEE Parallel and Distributed Technology, 1994, 2(3): 37-47
[123] Blume W et al. Parallel Programming with Polaris. IEEE Computer, 1996, 29(12): 78-82