The Craft of Writing a Research Paper
Brian A. Malloy
Computer Science Department
Clemson University

What is it?
• Craft
• Art
• Learned skill

Papers as patterns
• Learn the pattern
• Stay within the pattern
• Stray outside the pattern:
  – Prospectus
  – Documentation
• In this talk, I describe a pattern for a research paper

Outline
• What is research
• Where to send
• How to organize a paper
• Sections of a paper
• Figures & tables
• Ingredients of powerful writing
• Ingredients of lucid writing

Sources
• Reviewers
• Colleagues
• “Bugs in Writing” by Lyn Dupre

What is research?
• Identify a problem
• Find out what others have done
• Develop a solution
• Show your solution:
  – Works
  – Better
  – Sound & complete

Suggested organization of a research paper
• Intro/motivation
• Background
• Overview of my solution
• My solution
• Results
• Related work
• Concluding remarks

Comparison of what is research? & organization
What is research:
• identify a problem
• find out what others have done
• develop a solution
• show your solution works/better/sound
Organization:
• intro/motivation
• background
• overview of my solution
• my solution
• results
• related work
• concluding remarks

Where to send?
• Conference
• Journal
• Workshop
• TR

Where to send?
• Conference:
  – 3 kinds
    • accept “everything?”
    • IEEE (less than 50% accept rate)
    • Less than 20% accept
  – Quick

Where to send?
• Journal
  – archival
  – respectable
  – experience
  – magazine

Where to send?
• Workshop
  – PASTE
  – IWPC

Where to send?
• TR
  – Not refereed
  – Large as you like

Where to send?
• Conference:
  – 3 kinds
    • Accept “everything?”
    • IEEE (accept less than 50%)
    • Accept less than 20%
  – Quick
• Journal
  – Archival
  – Respectable
  – Experience
  – Magazine
• Workshop
  – PASTE
  – IWPC
  – ICSE Workshops
• TR
  – Not refereed
  – Large as you like

The sections of a research paper

Introduction: 4 parts
• Set the scene & motivate
• High level view of what others have done and why it’s inadequate
• In this paper…
• In the next section …

Background vs related work
• Background is a review of information the reader will need to understand your paper
• Related work is what other researchers have done

Overview of “my solution”
• A high level view of the system
  – works best w/ a figure!
• Show how the “bits & pieces” fit together
• Can highlight the advantages of your approach

System overview for ISSRE’03

System overview for ASE’02

Describing my approach
• Most important thing to remember: You know it, they don’t!
• Write as if you’re writing to your grandmother!
• Every new word should be italicized and followed by a definition
• A picture is worth 1000 words

Results
• Experiment
• Results
• Case study

Results
• Describe the platform, processor, OS, language, compiler, compiler optimizations
• Describe the test suite, which should be legit! You can’t just make up the test programs!!
• Use tables: describe each row & column
• Sometimes a graph is better than a table

“A picture is worth…”

Impact of Results
• How many times did you perform each experiment?
• Validity of results
  – Any weaknesses of the test suite
  – Anything “hokey” about the approach
• Threats to generalize
  – Can you really generalize: why/why not
  – Is it automatable or automated?!

Most important!
• Do not claim more than you did
• Do not generalize from one study or result
• Do not claim that because it worked well on a few test cases that it will work well on all test cases, all platforms and for all inputs!
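The advice in the Results and Impact of Results slides (describe the platform, and say how many times each experiment was run) can be illustrated with a small script. This is only a minimal sketch in Python, not a method taken from the talk: the helper names `describe_platform`, `time_repeated`, `summarize` and `workload` are hypothetical, and the workload is a stand-in for a program from a real test suite.

```python
import platform
import statistics
import time


def describe_platform():
    """Collect environment details that a Results section should report."""
    return {
        "machine": platform.machine(),
        "processor": platform.processor(),
        "os": f"{platform.system()} {platform.release()}",
        "language": f"Python {platform.python_version()}",
        "implementation": platform.python_implementation(),
    }


def time_repeated(func, repetitions=10):
    """Run func `repetitions` times; return each wall-clock time in seconds."""
    times = []
    for _ in range(repetitions):
        start = time.perf_counter()
        func()
        times.append(time.perf_counter() - start)
    return times


def summarize(times):
    """Report how many runs were made and how much they varied."""
    return {
        "runs": len(times),
        "mean_s": statistics.mean(times),
        # The sample standard deviation needs at least two measurements.
        "stdev_s": statistics.stdev(times) if len(times) > 1 else 0.0,
        "min_s": min(times),
        "max_s": max(times),
    }


def workload():
    # Hypothetical stand-in for one program from a real test suite.
    sorted(range(50_000, 0, -1))


if __name__ == "__main__":
    for key, value in describe_platform().items():
        print(f"{key}: {value}")
    print(summarize(time_repeated(workload, repetitions=10)))
```

Reporting the number of runs and their spread alongside the mean lets a reviewer judge whether a difference between two approaches is larger than the run-to-run noise. For a compiled language, the compiler and its optimization flags would have to be recorded as well; this sketch does not capture them.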

Figures
• Cannot simply “drop” them into the paper (see Figure 3)
• Writers are immersed in the subject -- readers are not!!
• When referring to a specific Figure/Section/Table, use upper case, otherwise use lower case:
  – The overview in Figure 3 is better than the other figures.
• Should describe each item in a figure: “The icon in the upper left corner of the figure represents…”
• Should motivate the figure: “This figure provides an overview of our system, including input …”

Figures (cont)
• If they contain code, number each line
• Refer to specific lines, or sets of lines, in the text
• Should be either pseudo-code, or language specific (tell them what the language is!)
• All elements of figure/graph should be marked (key or legend)
• Each figure should have a tag and a caption

Tag & caption

More powerful writing
Use active voice & present tense!

You & your reader
• Must remember that you’re familiar with the work & the reader is not!
• Every term must be defined at its first use: highlight the definition with italics
UGLY: The careful reader may consult the dragon book for an explanation of control flow graphs (see [1]). We use CFG’s to compute…
GOOD: A control flow graph, CFG, is a graph whose nodes represent basic blocks and whose edges represent the flow of information into and out of the basic blocks.

Engage the reader
• Writing has changed in the last 10 or 20 years:
UGLY: The careful reader will learn the figure without any explanation on the part of this author!
UGLY: The reader will observe a novel algorithm if she …
GOOD: We suggest here a novel algorithm for computing data flow information on a control flow graph, CFG.

You & your reader (cont)
• RULE: Speak directly to the reader
• If single author: I or we
• Coauthors: we
• The reader: you

You & your reader (cont)
UGLY: In this discussion, it is assumed that it is possible to get a closed form for at least one of the equations.
GOOD: We assume that we can compute a closed form for at least one of the equations.

Avoid passive voice!
• Can only say that an event took place: without admitting who or what did it!
UGLY: The data sets were lost. The data sets were lost by the first author.
UGLY: The first algorithm fails to compute the result in a timely manner because a solution to the traveling salesman problem is required.

Use active voice
• Take responsibility for your work!
GOOD: We lost the data sets.
GOOD: Our first algorithm is too slow because the computation requires a solution to the traveling salesman problem.

Passive voice is vague
BAD: By removing an item from the list during each iteration, it is guaranteed that the loop will terminate.
The first part of the sentence suggests that you will reveal who or what is removing. To what does “it” refer?
GOOD: We remove an item from the list during each iteration of the loop; thus, the loop is guaranteed to terminate.
GOOD: In the algorithm of Figure 3, we remove …

Active voice is stronger & clearer
UGLY: In a queue, insertions are performed at the rear and deletions are performed at the front; therefore a pointer to the front and the rear must be maintained.
GOOD: We use a queue, a structure that permits insertions at the rear and deletions from the front. We maintain a pointer to the front and rear of the structure.

Use present tense
• Present tense is stronger than future tense
WEAK: In this paper, we will show…
STRONG: In this paper, we show that …

Use present tense
• Past tense degrades into “diary writing”
BAD: In this work we wanted to …
GOOD: The goals of our work are to …

Don’t change tense
BAD: In this chapter, we have described what happens when we do the wrong thing. We examined the behavior and determine that they are correct.
BAD: The analysis reported in the preceding section will show that Nick can differentiate shod from shoddy.

More lucid writing
Consistency, Symmetry & Correctness!

You MUST read each of your sentences for:
• Content/meaning
• Structure
• Style

Be consistent: call a spade a spade!
• Can’t change terminology, even for a good reason, w/out explanation
  – remote proxy vs proxy
  – data structure vs structure

Use symmetry when structuring sentences, paragraphs & sections
We describe both a stack and a queue: a queue is a FIFO structure and a stack is a LIFO structure.
In this section, we review background information about program representations and computation of data flow information using the program representations. To compute data flow information …

Which vs that
• that identifies the object about which you are speaking
• which provides further info about the object
GOOD: The car that is speeding down the road is about to crash into a pole.
GOOD: The car, which is speeding down the road, is about to crash into a pole.

Avoid fuzzy words
• very, easily, actually, truly, in fact, some, thing
• etc.
BAD: In comparing our algorithm with the algorithm described in reference 17, we see that ours is very fast.
BAD: In this section, we define the terms node, tree, graph, etc.

Semicolon
• Connects two sentences that are closely related to each other
• Use a semicolon when what follows constitutes a complete sentence
• When what follows is a fragment, you must use a comma or an em dash

Semicolon
CAREFUL: Max’s head was throbbing; Lyn’s heart was sinking.
The semicolon implies that there is a connection between Max’s head throb and Lyn’s sinking heart!!
BAD: Holly wanted to live on a farm with plenty of chickens; and to have a stellar career as well.
GOOD: This machine is difficult to use; for example, it crashes whenever you turn it on.

Commas
• Commas provide guidance to your reader about how to parse your sentence!
• Place them wherever a speaker should pause
UGLY: Greg was worried; however he remained calm.
GOOD: Brendan was hungry; however, he remained calm.
OKAY: Lyn and Richard were still puzzled, however many times they reread the directions for assembling the stepper climber; however, they remained calm.

Colon
• The colon signifies that what follows it expands on or explains what precedes it: this sentence is an example.
Lyn could tell that Red had been out hunting again: There were three mice neatly laid out on the upstairs rug.
• Frequently a period or an em dash will also work
• Use at the end of a sentence, followed by a list
GOOD: This talk does not assume that you know the basics: how to form a sentence, how to use words and how to laugh at your mistakes.

Summary
• Every claim must be explained and substantiated
• Everything that you state is a claim
• Any decent reviewer will assume that “if you don’t state it, you didn’t do it and you can’t handle it!”
• Get a “reader” to read your paper
• Writing a well-written paper is a lot of work!