Technical Writing

[Much of this text taken from a presentation by Kevin W. Hamlen, Univ. of Texas at Dallas]

Let’s talk about how to write a research paper:

  • advice about paper structure
  • good technical writing style
  • quality typography
  • bibliography guidelines

How much does writing matter?

The importance of writing cannot be overestimated!

  • Many (maybe most) paper rejections arise from reviewer misinterpretations of your work.
    • This can be avoided via very, very clear and concise writing
  • How often your paper is read and cited depends significantly upon how easy it is to read.
    • improve impact by making it an easy, enjoyable read

Assume a rushed reader

  • Referees must often review 10 or more papers per week.
  • They look for quick rejections: If an early sentence gives a bad impression, they may decide to reject before reading the rest. Very hard to recover from this.
  • Authors scrambling to assemble a literature review often skim 50 papers in a day. Clarity is everything if you want to be cited.

Typical Paper Structure

  • abstract
  • introduction
  • overview
  • technical section(s)
  • discussion
  • related work
  • conclusion
  • bibliography

(often, but not necessarily, in that order)

Writing Order

Students tend to be obsessed with low-level implementation details

  • most readers don’t care (until convinced to care)
  • need to learn to see the research from a higher-level “rest of the world” perspective

Suggested writing order for build-before-write projects:

  • Related Work first
  • Overview next
  • Technical Section(s) next
  • adjust the overview as you go to add concepts you forgot
  • make a to-do list for Discussion section as you go
  • Discussion (influenced by Related Work)
  • Go play tennis
  • Introduction (hardest section to write)
  • Conclusion (easiest section to write)
  • Abstract (compressed version of Introduction)

The Abstract

Purpose

  • help reader/reviewer decide whether to read/review
  • declare keywords for search engines
  • provide template for others to summarize your work

Style guidelines

  • third person (no “I” or “we”)
  • should make sense when dislocated from paper
  • tightest, most succinct possible prose
  • no citations, avoid math

Content

  • describe problem in 1-2 sentences
  • describe solution in 1-2 sentences
  • highlight novelties & big results in 1-2 sentences

From Gutierrez and Schrum (2020), “Generative Adversarial Network Rooms in Generative Graph Grammar Dungeons for The Legend of Zelda”

Introduction

Most important section for acceptance!

Content

  • describe the problem in detail (citing key works)
  • explain why past solutions are inadequate
    • vital for acceptance!
    • must cover every solution that might occur to reader
  • summarize solution
    • dazzle reader with a brilliant new insight (climax of paper)
  • list contributions in readily visible form (e.g., bullets)
  • roadmap for rest of paper
    • controversial: some think this is a waste of space

From Gutierrez and Schrum (2020), “Generative Adversarial Network Rooms in Generative Graph Grammar Dungeons for The Legend of Zelda”

Shall we read on?

Related Work

Early or late? (highly controversial)

  • Best practice: Put early only if absolutely necessary to introduce concepts critical for presenting novel contributions
  • Early
    • delays reader from reaching real contributions
    • Result: reviewers complain
  • Late
    • delays full argument of novelty
    • reviewers remain unconvinced throughout majority of paper
    • Result: reviewers reject
  • What about none?

Purpose

  • convince reviewers that you know all related work
  • convince reader that your work improves upon past
  • tell story in which your work emerges as the hero

Content

  • organize by theme / approach
    • not a laundry list!
    • needs to tell a story
  • avoid pre- or re-presenting your own work
  • avoid re-presenting prior work
    • summarize connections & differences
  • avoid insulting prior work (!!!)
  • cite ACCURATELY (requires deep understanding of papers!)

Overview

Purpose

  • provide whole-system context in which reader can place lower-level innovations introduced later
  • put innovative claims into practical context

Content

  • describe system from user/admin perspective
  • summarize main components (picture is ideal)
  • clearly define the TCB (trusted computing base) and system assumptions
  • conceptual description of each component’s design
  • What’s the “big idea”?

Technical Section

Purpose

  • provide detailed answers to all reader questions
  • present evidence/proof of claims

Content

  • must be well-organized and systematic
    • Some referees won’t fully read it! They hunt for answers to specific questions.
  • Never present old news as if it’s new.
    • Re-invention / rediscovery is an admission of ignorance!
    • Reuse of prior discoveries immunizes you (somewhat) from criticism on those points.

Discussion Section

  • mainly about (apparent) limitations
    • If you have trouble thinking of limitations, you don’t understand your work well enough.
  • think like an attacker
    • pull all the dirtiest tricks; don’t play “fair”
  • put a positive spin on each limitation
    • best spin: mitigated by related work
    • okay spin: mitigated by future work
    • okay spin: out of our scope (but declare your scope early)
    • dangerous: open, unsolved problem
  • reviewers DO use these admissions against you
    • but omitting a limitation entirely is much worse

Conclusions

Purpose

  • reinforce what readers should remember
  • make a “final impression”
  • excite reviewers about future progress

Content

  • restate main innovations and results, but this time you can use terms/concepts introduced throughout the paper
  • suggest future work (try to be inclusive!)

Bibliography

Purpose

  • give credit where credit is due
  • prove that you know how to give credit accurately
  • prove your awareness of past work

Quality standards for first submission

  • needn’t be perfect but must be accurate!
  • referees are angered by miscitation
  • double-check author orderings
  • ensure each citation is to the most recent version
  • do not trust CiteSeer or DBLP! (consult but verify)
  • Google Scholar is more reliable, but don’t blindly copy & paste its BibTeX entries

Quality standards for camera-ready copy

  • journal editors may do it for you
    • proof-read their edits VERY carefully (usually wrong)
  • or they may leave it to you
    • read EVERY entry in its entirety and double-check ALL info
    • always include page numbers if they exist
    • be consistent about venue acronyms, publishers, addresses, etc. (don’t include a detail in one entry but omit it from others)
    • don’t just proof the source; proof the final document
    • don’t just skim; actually read every (boring) line
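
Part of the consistency checking above can be automated. The following is a minimal sketch, assuming the bibliography has already been parsed into one Python dict per entry. The function name check_entries, the "key" field, and the sample entries are hypothetical placeholders, not part of any BibTeX tool.

```python
def check_entries(entries, optional_fields=("publisher", "address")):
    """Report missing page numbers and optional fields used inconsistently."""
    problems = []
    for entry in entries:
        if "pages" not in entry:
            problems.append(f"{entry['key']}: no pages field (confirm none exist)")
    for field in optional_fields:
        missing = [e["key"] for e in entries if field not in e]
        # Inconsistent means some entries have the field and some do not.
        if 0 < len(missing) < len(entries):
            problems.append(
                f"{field}: present in some entries but missing from {', '.join(missing)}"
            )
    return problems


# Hypothetical, already-parsed entries (placeholders, not real references).
sample = [
    {"key": "alpha2020", "pages": "1--10", "publisher": "Example Press"},
    {"key": "beta2021", "publisher": "Example Press", "address": "Example City"},
    {"key": "gamma2022", "pages": "5--9"},
]

for problem in check_entries(sample):
    print(problem)
```

A script like this only catches omissions. It cannot tell whether an author ordering or page range is accurate, so it supplements reading every entry rather than replacing it.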

Technical Writing Style

  • no monolithic paragraphs
    • max paragraph height: ~2 inches
    • larger paragraphs = poorly organized ideas
  • topic sentences
    • far more prominent than buried sentences
    • make sure anything you want readers to see is a topic sentence
    • try reading only topic sentences in order—should make sense and tell the story of your paper!
    • find “most important” sentence in each paragraph—that one should be the topic sentence
    • Common mistake: Most students put all the best sentences at the ends of their paragraphs rather than the beginnings.
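
The "read only the topic sentences" test can be approximated with a short script. This is a rough sketch, assuming plain-text input with paragraphs separated by blank lines. The function name topic_sentences and the sample draft are illustrative, and the sentence splitting is naive (an abbreviation such as "e.g." ends a sentence early).

```python
import re


def topic_sentences(text):
    """Return the first sentence of each blank-line-separated paragraph."""
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    firsts = []
    for paragraph in paragraphs:
        # Collapse line wraps so a sentence broken across lines stays whole.
        flat = " ".join(paragraph.split())
        # Shortest prefix ending in ., !, or ? that is followed by a space or the end.
        match = re.match(r"(.+?[.!?])(?:\s|$)", flat)
        firsts.append(match.group(1) if match else flat)
    return firsts


draft = """Our checker finds all known vulnerabilities in X. It does so
by exploring both branches.

Compilation is modular. Each module is compiled separately."""

for sentence in topic_sentences(draft):
    print(sentence)
```

If the printed sentences, read in order, do not tell the story of the paper, the paragraphs probably need reordering.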

Tight Prose

  • each sentence conveys its meaning in the fewest words/clauses possible
  • each paragraph makes a self-contained point in a clear, logical, concise manner
  • Requires MANY EDITING PASSES
    • I rewrite each paragraph many many times, removing/reordering words and sentences each time.
    • When in paper-shrinking mode, I additionally remove non-essential, redundant, or ancillary points.

Strategies for tightening

  • reorder and restructure thoughts to prioritize main subjects/objects
    • BAD: “First it compiles the module, which can fail, but if it succeeds it results in object code, and then all the object codes are linked together to create the final executable.”
    • BETTER: “Final executables are generated by compiling all modules and linking the resulting object codes. Compilation failure results in … ”
  • relocate dangling prepositions/participles
    • “The first solution we thought of didn’t work.”
    • “The first solution of which we thought didn’t work.”
    • “Our first solution failed.”
  • shorten clumsy, overly verbose phrases
    • “the fact that”
    • “in order to”
    • “all of”
    • “so called”
    • “the reason is that”
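
The phrases listed above are easy to search for mechanically. Here is a minimal sketch that reports the line on which each phrase occurs. The phrase list is copied from the bullets above, while the function name find_verbose_phrases and the sample text are hypothetical. It only flags candidates and leaves the rewording to the author.

```python
import re

# Phrases taken from the list above; extend as needed.
VERBOSE_PHRASES = [
    "the fact that",
    "in order to",
    "all of",
    "so called",
    "the reason is that",
]


def find_verbose_phrases(text):
    """Return (line_number, phrase) pairs for each listed phrase found."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for phrase in VERBOSE_PHRASES:
            # Allow any whitespace or a hyphen between words ("so-called").
            words = [re.escape(word) for word in phrase.split()]
            pattern = r"\b" + r"[\s-]+".join(words) + r"\b"
            if re.search(pattern, line, flags=re.IGNORECASE):
                hits.append((number, phrase))
    return hits


sample_text = "In order to test all of the modules,\nwe ran the so-called fast path."

for number, phrase in find_verbose_phrases(sample_text):
    print(f"line {number}: {phrase}")
```

Matching is per line, so a phrase that wraps across a line break is missed; joining the lines of each paragraph first would catch those cases.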

Tense

  • Use present tense for almost everything
    • past tense = less practical
      • “Our checker found all the vulnerabilities in X.”
      • “Our checker finds all the vulnerabilities in X.”
    • future tense = unproved/untested
      • “The analysis will check both the positive and negative branch.”
      • “The analysis checks both the positive and negative branch.”
  • Cross-references should ALWAYS be in present tense
    • INCORRECT: In Section 4 we will discuss this further.
    • CORRECT: Section 4 discusses this further.
  • Some past tense is unavoidable
    • “One experiment failed, but we were unable to duplicate it.”
  • Future tense should usually be reserved for future work
    • “Once version 3 is publicly available, we will support it.”

Voice

  • Avoid passive voice
    • BAD: Results were tabulated in parallel.
    • BETTER: Our system tabulated results in parallel.
    • BEST: Our system tabulates results in parallel.
  • Objections to passive voice
    • imprecise (Who did the tabulating?)
    • weak
    • unnecessarily verbose
  • Possible exception: abstracts (controversial)
    • OPTION 1: “This paper presents X, Y, and Z.”
    • OPTION 2: “X, Y, and Z are presented.”

Word Choice

  • “proven” (adjective) vs. “proved” (past participle)
    • INCORRECT: These results have been proven.
    • CORRECT: These results have been proved.
    • CORRECT: These are proven results.
  • “which” (descriptor) vs. “that” (qualifier)
    • I appreciate info flow papers, which are written well.
    • I appreciate info flow papers that are written well.
  • “whether” (alternatives) vs. “if” (implication)
    • INCORRECT: Module X decides if P is true or false.
    • CORRECT: Module X decides whether P is true or false.
  • “like” (similarity) vs. “such as” (example)
    • INCORRECT: We use many tools, like SAT-solvers and disassemblers.
    • CORRECT: We use many tools, such as SAT-solvers and disassemblers.
  • “dissertation” (a document) vs. “thesis” (an assertion)
    • My thesis is that widgets are useful. My dissertation proves the veracity of my thesis.
  • “may” (permission) vs. “might” (possibility) vs. “can” (capability)
    • The user may delete all the files. [User is allowed to delete them.]
    • The user might delete all the files. [Possible that user deletes them.]
    • The user can delete all the files. [User is capable of deleting them.]
  • Never begin a sentence with “Besides, …”
    • It means, “Even if you ignore everything I already said, …”
    • You are admitting that everything prior is inconsequential!
    • Example: “Smoking causes lung cancer. Besides, I dislike the taste.”
      • Translation: The real reason I don’t smoke is that I dislike the taste. That argument alone explains my behavior, so the other argument is unnecessary and can be safely disregarded by the reader.
  • To build upon what you’ve previously said, use…
    • “Additionally, …”
    • “Furthermore, …”
    • “Moreover, …”
    • Or just nothing. Just say the next thing.
  • To provide examples of previous points, use…
    • “For example, …”
    • “For instance, …”
  • To summarize previous points, use…
    • “In summary, …”
    • “In conclusion, …”

Terminology

  • Never invent your own unless it’s a genuinely new concept.
    • If you think it’s a new concept, you had better be right.
    • Reusing terminology accurately is good:
      • convinces reader of your competence
      • connects concepts to those already known by readers
      • shifts criticism away from you (and toward inventors of those terms)
    • Important terms should be italicized at first use
      • never use “scare quotes”
        • Possible exception: referring to the word rather than its meaning
        • Example: The word “foobar” was first coined by Allied soldiers in World War II.
      • don’t italicize twice
      • use \emph (not \it or \textit)

Citations

  • Avoid using citations as nouns
    • BAD: The results were published in [1].
    • GOOD: The published results [1] are quite promising.
  • Never begin a sentence with a citation
    • INCORRECT: [2] is an excellent paper.
    • CORRECT: The most recent study [2] is excellent.
  • Author names in text
    • USUALLY BAD: Weninger [3] rejects this approach.
    • GOOD: Related work rejects this approach [3].
    • System names are okay
      • GOOD: Google [4] adopts a different approach.
    • Super-famous researchers (e.g., Turing winners) are special
      • GOOD: Hoare’s axiomatization [5] is the standard approach.
  • Wikipedia is rarely an authoritative source
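
The rule against beginning a sentence with a citation can be checked with a regular expression. The sketch below assumes rendered text with bracketed numeric citations such as [2] or [1,2]. It does not handle LaTeX \cite commands, and the function name is hypothetical.

```python
import re

# A bracketed numeric citation at the very start of the text, or right after
# sentence-ending punctuation followed by whitespace.
LEADING_CITATION = re.compile(r"(?:^|(?<=[.!?])\s+)(\[\d+(?:\s*,\s*\d+)*\])")


def citations_starting_sentences(text):
    """Return each citation that appears as the first token of a sentence."""
    return [match.group(1) for match in LEADING_CITATION.finditer(text)]


sample_text = (
    "[2] is an excellent paper. "
    "The most recent study [2] is excellent. "
    "[3, 4] reject this approach."
)

print(citations_starting_sentences(sample_text))
```

Only the first and third sentences are reported. The citation in the middle of the second sentence is left alone because it does not follow sentence-ending punctuation.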

Figure and Table Captions

  • Check style guide for journal/conference
  • If not specified…
    • begin with a tag: a descriptive phrase, not a complete sentence, but capitalized and punctuated like a sentence
    • optionally follow with complete sentences

From Gutierrez and Schrum (2020), “Generative Adversarial Network Rooms in Generative Graph Grammar Dungeons for The Legend of Zelda”

Punctuation

  • The Harvard/Oxford/Serial comma
    • INCORRECT: Writing can be good, bad or ugly.
    • CORRECT: Writing can be good, bad, or ugly.
  • intro words
    • INCORRECT: In conclusion we solved the problem.
    • CORRECT: In conclusion, we solved the problem.
  • sub-clauses
    • INCORRECT: We applied many tools such as X and Y to the problem.
    • CORRECT: We applied many tools, such as X and Y, to the problem.
  • commas in general: separate thoughts, introduce pauses, etc.
    • use to lucidly structure ideas within each sentence
    • too many commas = overly convoluted sentences
    • too few commas = reader must work too hard to disentangle ideas
  • dashes
    • em-dash (“—”): for clause separation
      • “Every writer has bad days—often followed by sleepless nights.”
      • “Certifying compilers—particularly those developed within the past few years—provide much higher assurance.”
    • en-dash (“–”): for ranges: “See chapters 2–7.”
    • hyphen (“-”): for compound words: “The code-producer is trusted.”
  • semicolon
    • separate two complete, short sentences to be kept together
      • “Our efforts failed; however, we subsequently found an alternative approach.”
    • lists in which some items contain commas
  • plural acronyms: VMs (no apostrophe), OSes (no apostrophe)
    • Possible exception: math variables (all x’s in the equation)
    • Definite exception: possessive acronyms (“the API’s header file”)

Latin Abbreviations

  • Examples vs. clarification vs. exhaustive itemization
    • “Access modifiers (e.g., private) are helpful.”
    • “Access modifiers (i.e., field qualifiers that specify security properties) are helpful.”
    • “Access modifiers (viz., public, private, and protected) are helpful.”
  • Different types of external citation
    • “AOP [1] is useful.” (The cited work defines AOP.)
    • “AOP (e.g., [1,2]) is useful.” (The works are examples of AOP, but not necessarily definitional or exhaustive.)
    • “AOP is useful (cf. [1]).” (The work has something to say about this assertion, but might not agree with it. The reader is advised to compare this statement with those cited.)
    • “AOP is a huge field (cf. [1]).” (Cross-reference the cited work to find all the related works.)

LaTeX vs Word

LaTeX is required. BibTeX is useful. Word is for the business majors.