Overview
This document is targeted at my research group, to outline my general thoughts and expectations on this topic. Writing a research paper can be quite field-specific, but there are quite a lot of general rules and expectations for the “scientific report” format that we use for experimental studies. Some expectations and writing conventions are not always so clear, so I will try and be fairly comprehensive and cover all of these that I can think of. If there is something I’ve missed let me know and I will try and update this document to add it in!
It actually makes the most sense to start writing your scientific report with the results section. This is the most important part of your manuscript, and what goes in here will guide you in writing the discussion, introduction, and materials and methods sections. The recommended order of writing is results, discussion, introduction, and you can write the methods section whenever you want, although it’s better to do this as you are doing the experiments so you don’t forget what you were doing when you go back to write this up later!
The Results section
General considerations
Although strictly speaking nothing is stopping you from leaving some data out when you write your paper, I strongly feel that all results you have produced should be included in some way, even if it’s one sentence linking to a supplementary table (try to avoid making a single sentence paragraph though!). Likewise, all your raw data and all information necessary to interpret the results needs to be available to the reader, either via a data repository or via supplementary tables. The reader should be able to completely replicate your data analyses, if they so choose, or to reuse your data for other analyses.
Overall structure of this section
This is the most important section of your manuscript. You have two main options for structuring this section: 1) most important/interesting to least important/interesting, or 2) logical progression, where the first results are necessary to understand the later results, or are built on by the later results. Results structures that you should NOT use (unless they happen to match 1) or 2) by chance) include: 3) in the order you did the experiments, or 4) by method. It’s not super bad to organize your results by 3) or 4), but there are some clear advantages to using 1) or 2) instead.
The argument for using 1), and listing results from most important/interesting to least important/interesting, is related to the fact that we want the reader to be interested in continuing to read our paper. Reading a whole lot of results first that lead to a ‘so what?’ or boredom from the reader is not ideal. We do want to report all our results, but the most boring results (confirmation of something that is already well known, for instance, or method validations) should either get only a sentence or two directing the reader to the supplementary material, or should be presented last in the results section so the reader can skip over them without missing anything major. The average reader of a published scientific paper is not going to read through the whole paper in detail, so we need to make it as easy as possible to find the major results.
In order of importance is my favourite way of structuring the results section, but sometimes it’s not really feasible to structure the results this way: quite often we have a logical flow to the results where we need to first describe one part before it makes sense to describe the next part. A common example of this is when we have generational studies: it would be confusing to start talking about results in the third generation when we haven’t heard anything about the first generation yet. Perhaps we want to report on generation of hybrid material before talking about characterizing it. Or perhaps we want to present individual trait analyses before we look at correlations between traits. This is not always the case for these examples (maybe we aren’t actually interested in talking about each generation or trait separately, for instance) but often there are very good reasons to structure the results section in a particular logical order.
It’s common to see a default results structure of either by method or by experiment order: often these are the same. A simple reason that the order you did the experiments in may not be a good idea is that this might not actually be the most logical or interesting – it might even be a bit random, which is not helpful for maintaining reader interest or in building a logical flow through the results. A more serious disadvantage of the ‘by methods’ structure is that for many papers with multiple methods or experiments, we are actually more interested in the results which come from a combination of methods. An example of this would be if we are characterizing individual plants or genotypes using a range of different methods: are we really interested in the traits individually across the population (measurements of x, then y, then z), or are we trying to find features in common within sub-populations (x, y and z in population 1 vs x, y and z in population 2), or correlations between traits? Of course it’s possible we are interested in each trait (method) individually, but often it’s much more interesting to read results sections which are organised by e.g. differences in or between groups based on x, y and z, rather than structured by all measurements of x, then all measurements of y, then all measurements of z. You will know your own results best, so the key message about results section structure is to consider your results in terms of interest and logical flow, and to think about how best to order your results sections based on these two points, rather than just using a methods-based or experimental timeline-based order by default.
Making an outline and deciding on text, figures or tables
So, to start the Results section, the best strategy is to list out all your results as dot-points (brief, one sentence summaries). These will represent each of your results sections or paragraphs. Don’t cluster multiple results together unless they really are very similar. Then order all the results from most interesting to least interesting. This is a good time to get feedback from me or, even better, our research group, which will be more representative of a wider readership with different levels of background on the topic.
Once you have an order of results, now you can think about how best to represent each of these results: as figures, tables or text. I haven’t talked about data analysis here, but most likely you have some draft figures already as a result of visualizing your data to see what is happening, as well as the results of statistical tests for specific correlations, associations or hypotheses. Think about the best way to present each result. We are often limited to a certain number of figures by journals, and also inclusion of too many figures in the main text can distract from the most important results. Despite this, figures are often the best way to present results, and you should think carefully about which results you want to highlight when choosing which figures to include. That said, sometimes clear text can be better than a figure, because in text we can also highlight specific points we want to present without other distractions.
So should a result be presented as a figure, as text or as a table? Generally, figures are easiest for readers to understand, provided these are nice and clear. Figures are definitely more interesting at least! There are some things you really need a table for, like listing references comprehensively in a review paper, but as a general rule if it CAN be a figure rather than a table then it SHOULD be a figure rather than a table. Definitely don’t present e.g. results of statistical analyses like ANOVA in the main text in table form though, I think it’s safe to say almost no-one reading the paper wants to see this! The best results sections read like an illustrated story, introducing the reader slowly to the main important results in a logical flow of information.
So, you have a results section outlined, and have decided how to present each result and in which order. Now, the easiest way to continue is to use subheadings for each result, where each subheading actually describes the result. An example would be that instead of using ‘Chromosome inheritance in second generation interspecific Brassica hybrids’, you would write ‘Second generation interspecific Brassica hybrids retain more chromosomes than expected by chance’. Putting the main result up front is of huge benefit to the average (lazy, uninterested, busy or easily confused) reader. Within each section then you can describe this result in more detail, and give the results of any related statistical analysis. For each result we want to present it only once, e.g. only in the figure or only in the text, not in detail in both, as this is redundant / unnecessary. You can also have different paragraphs under each subheading if this is clearer – for each paragraph make sure the first sentence of the paragraph explains what the paragraph is about, and how this relates to the result in the subheading (i.e. use topic sentences).
Making figures
Making clear, high quality figures is another topic, although there has been a lot written online and elsewhere about principles of data visualization. Use colour-blind friendly colour palettes (no red and green together), limit the use of colour to only what is really needed to add information to the figure, and use clear, strong lines, large labels on all axes and minimal text inside the figure. Feel free to annotate figures if you need to for clarity though (e.g. adding arrows to point to specific things). Always use scale bars for photos: this is a strict requirement. Try and use a “vector” file format for line graphics: this is one which stores text and lines as actual information rather than as pixels. Examples include .pdf, .emf and .eps formats: if you are unsure try zooming in on some text in your figure – if you can keep zooming in forever and it stays sharp then it’s a vector format, if it goes pixelated then it’s just saved as a (raster) image. Don’t use boxplots for low numbers of samples (<50 or so) unless you also superimpose the actual dots representing the individual data values, as this is otherwise very misleading. If you are presenting categories along one axis (e.g. like a boxplot), think about the most sensible or useful way to order these for reader understanding: don’t just use alphabetical order.
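As a rough illustration of the last two points, here is a minimal Python sketch (standard library only) that orders categories by their median instead of alphabetically, and flags groups small enough that the raw points should be drawn over the boxplot. The function names, the cutoff constant and the example data are all made up for illustration, and ordering by median is just one possible "sensible" order; the actual plotting is left to whatever plotting tool you use.

```python
from statistics import median

# Assumption: the text only says "<50 or so", so treat this as a soft cutoff.
SMALL_SAMPLE_CUTOFF = 50


def order_by_median(groups):
    """Return category names sorted by median value, lowest first."""
    return sorted(groups, key=lambda name: median(groups[name]))


def needs_raw_points(values, cutoff=SMALL_SAMPLE_CUTOFF):
    """True if individual data points should be drawn over the boxplot."""
    return len(values) < cutoff


# Hypothetical seed counts per plant for three made-up lines.
seeds_per_plant = {
    "Line A": [12, 15, 11, 14],
    "Line B": [3, 5, 4, 6, 2],
    "Line C": [8, 9, 7],
}

# Order to use along the category axis instead of alphabetical order.
plot_order = order_by_median(seeds_per_plant)

for name in plot_order:
    values = seeds_per_plant[name]
    print(name, "median:", median(values), "overlay points:", needs_raw_points(values))
```

Depending on the result you want to highlight, a different ordering (e.g. by parental species or by generation) may make more sense than the median; the point is only that the order should be a deliberate choice.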
I don’t personally want to see any multi-part figures unless there is a clear and compelling need to compare and contrast between each of the figures presented together. This format is a historical artefact of colour printing costs, which led journals to a) limit the number of figures per paper and b) put all figures on a single page as a ‘colour plate’. Today, we still see this in journals with extremely restrictive format and length requirements, such as Nature and Science, as the authors try and squish in multiple figures to get in under the page and figure limit, and it is considered acceptable by most of the scientific community. Personally, I think it’s terrible science communication and we need to get rid of this convention entirely. One figure should represent one major result, and if multiple different methods are used to confirm this result then the authors should pick the clearest one and put the rest in the supplementary information, for the sake of the reader.
Combining results and discussion (I would rather you didn’t)
Combined results and discussion sections are not very common, but are still given as an option by some journals. The advantage of a combined results/discussion section is that you can introduce your results then directly interpret their significance to the reader. This is very appealing to many research scientists, particularly since it avoids some of the complications involved in structuring the Discussion section. However, I don’t recommend this format, for two (related) reasons. Firstly, we have a split results and discussion section in a standard scientific report format for a reason, and that reason is to separate as much as possible the presentation of the data from the interpretation of the data. Now of course we already ‘interpret’ the data to some degree at least by how we choose to present our results, but the idea here is that another researcher reading your results section has the opportunity to come to their own conclusions about the interpretation of your results, with minimal influence of your opinion. This leads to a more robust, critical scientific literature, where results presented by one paper can more easily be reinterpreted, for example if new discoveries come to light. I have a classic personal example of reading a very old paper (I think 1930-1950s) where the (single) author did not know that dicotyledonous plants do not go through cytokinesis in meiosis until after the second meiotic division is complete (in animals and monocotyledonous plants cytokinesis happens twice, after both meiotic divisions). The author recorded what they thought was the number of cells following first and second divisions of meiosis. Actually, all cells were post-second division, and they had instead recorded frequencies of unreduced gametes (failure of meiosis to separate either homologous chromosomes or sister chromatids, resulting in two meiotic products instead of the usual four). 
Regardless of the technical details, I was able to reinterpret and utilise these results from an improved future perspective, even though the discussion was completely wrong in interpretation. So, this is a foundational concept for results sections: do not interpret or discuss your results in this section.
The second, related reason I recommend always splitting results and discussion sections is more practical. Basically, if you submit your paper for publication and a reviewer disagrees with your interpretation of any of your results, but your results are all clearly presented without interpretation, most reasonable reviewers will just ask you to revise your discussion section, rather than rejecting the manuscript. If the reviewer has already read through the results section, liked it / found it interesting and then hit a potentially upsetting mistake or interpretation they don’t agree with in the discussion section, they will be much more positively inclined towards the manuscript than if this problematic interpretation is immediately presented along with the result. This is in any case my feeling on this point – I have reviewed manuscripts many times where I don’t agree with the authors’ interpretation of the results either wholly or in part, but I would never recommend outright rejection if I thought the discussion could just be rewritten to accommodate different interpretations. Also just from experience, when the combined results / discussion section is done really well it can be amazing, but if this is not done very well it is usually confusing and substantially worse to read than a split results / discussion of similar quality. Some very high impact journals like Nature ask for combined results and discussion sections, but writing for these journals (and a much more general audience) requires a very different format and style of writing overall compared to more standard, field-specific journals.
The Materials and Methods section
So let’s talk about the methods section. This is easily the most boring section, but is usually also the easiest to write. The most important point of the materials and methods section is to allow someone else to exactly replicate your experiments if they want to. Usually we start by describing the experimental materials (e.g. plant genotypes and where you got these from), then have sections describing each of the experimental methods used to generate the results, then finish with sections describing the data analysis methods used. Methods described previously should be referenced instead of being written out again in full, and it is fine to say ‘according to the manufacturer’s instructions’ if you used a kit, since this information should be publicly available.
Stay true to the idea that you need to present enough information for someone to replicate your experiment exactly, but also be prepared to send this hypothetical person reading back through other papers to collect all the methodological details. Do check that the papers you reference actually detail the method you used, and make sure to present any differences as ‘with modifications as follows:…’
The best time to write this section is actually as you are doing the experiments or data analyses, so that the details are fresh in your mind. Coming back to write up methods a year or more later can really be an exercise in frustration – this is also where writing up your data analysis in either your lab book or a centrally available word document is highly recommended!
There are some details which authors often forget to include in this section. In my experience, these include the following: genotype and origin (provenance) of the experimental lines, number of individuals and/or individual cells measured from each individual, experimental growth conditions (including field location and year if this was a field experiment), and information about the statistical analyses used.
One important point about the methods section is that sometimes we have results which are actually also kind of methods. An example of this is when we do a standard validation of a particular method, or of data obtained (e.g. sequence quality scores), or when we generate experimental material (number of plants obtained per seed sown etc.). How successful we were in generating experimental material or in validating our method, data or analysis is technically results, but it’s usually not something we are actually interested in per se for the paper. In this case, I personally find it acceptable and preferable to put this information into the methods, perhaps with a reference to the supplementary material for more details, than to ‘contaminate’ our results section with this information that no one really cares about, but which should otherwise normally be placed at the beginning of the results section since other results are based on it. I would argue that these types of ‘results’ are really more like methodological controls, and hence should go in the methods section. Apart from this though (which is more a personal preference, you can also just very briefly summarise and reference these types of results in the actual results section if you prefer) we don’t want to see any results in the methods section, and definitely no discussion.
It is acceptable to provide (very) brief context about why a particular method was chosen, but if you really feel you need to do this it’s better to do this via referencing previous papers which used this method, and to slot this information into a half or maximum single sentence. As a reviewer, I might sometimes be surprised or question why one method was used in preference to another method, but as long as the results are equivalent in supporting or rejecting the given hypothesis then the choice of methods to obtain these results should be irrelevant. Too often, complaints about the use of old, but perfectly acceptable methods discriminate against labs which simply can’t afford more modern methodologies. And if the authors used an out-of-date reference genome? Well, maybe the analysis was actually done five years ago – if it doesn’t affect the main conclusions significantly then it’s unreasonable to ask for the entire data analysis to be redone. As an editor I also enforce this, although you may get unlucky with the review process. In any case I therefore don’t recommend justification of your methods in the methods section: if the methods are appropriate to the research question then justification seems unnecessary. If the method is novel or untested, then discuss this in the discussion section. If the method is the main point of the paper then the validation and comparison of the new method with established methods should go in the results.
The Discussion section
Okay, so on to the Discussion section. If you really want to it’s fine to write the introduction before the discussion, but the introduction is easier to rewrite so I would recommend solidifying the Discussion first, based more directly on the results, then using the discussion to inform what needs to go in the introduction. The Discussion section generally causes the most issues to write out of all the scientific paper sections. I think this is because it’s hardest to know what to include in this section, while it’s relatively clear what should go in the other sections content-wise. The main mistakes I tend to see include 1) just stating the results again in detail without much interpretation or reference to the literature, 2) discussion paragraphs that discuss multiple unrelated results together with no clear logical thread and 3) paragraphs that don’t reference the results of the paper at all, but just include literature review. Fortunately, these mistakes can all be avoided by just using a clear, strict format for the discussion section, which I will describe in the following paragraphs.
Firstly, with regards to overall structure, I do have a preference (not a rule) for using the first paragraph of the discussion as a summary/conclusions paragraph. If an actual conclusions section is required then of course this may not be a good idea. But there is a case to be made for having the first paragraph of the discussion summarise and briefly interpret the major results and specifically the results which relate directly to the hypotheses. The reason for this is that many readers do skim papers by reading just the abstract, last paragraph of introduction, and first paragraph of discussion. As a reader, it’s usually less work to read the authors’ interpretation of the results than to read and interpret the results themselves, so just reading the beginning of the Discussion makes sense in this context. Also, readers may not make it to the end of the Discussion, so discussing major results up front increases the amount of information conveyed. This then also affects Discussion structure: as with the Results section, the preferred order of paragraphs should be from most interesting/important to least interesting/important. While it is possible to mirror the results section in terms of logical structure and discuss results in order of dependence on previous results etc., I think the case for doing this in the Discussion is relatively weak. I would therefore always recommend using the most important/interesting to least important/interesting structure for the Discussion paragraphs from second to last, assuming the first paragraph is the special summary/conclusions one.
More specifically on the structure of the first paragraph of the Discussion section, if you choose to use this as a summary: the very first sentence should present the main hypotheses or aims of the experiments again (e.g. ‘We hypothesized…’, or ‘We aimed…’). This helps the reader contextualize and work out what to expect. Then, the major results should each be reiterated as 1-2 sentences maximum per result (e.g. ‘We found…’). Following this, give the main ‘meta’ interpretations of the results in 1-3 sentences or so. This might be something like ‘These fertility and chromosome inheritance results suggest that subgenome interdependence is already well-established in Brassica allopolyploids, supporting similar results by Pelé et al. (2016) in Brassica napus. Our findings suggest establishment of new, stable karyotypes following hybridization between allopolyploids in Brassica will most likely also be coupled with ploidy increase and/or extensive karyotype restructuring.’ After these ‘big picture’ results, you may also want to add one or two sentences further highlighting the significance of your study, e.g. ‘Our results are of interest for understanding mechanisms of karyotype restructuring following interspecific hybridisation events. As well, our results suggest frequent recombination between the Brassica A, C, and to a lesser extent B genomes occurs in these hybrid types, which is potentially useful for introgression breeding.’ Don’t go crazy with this part – more than a few sentences will definitely seem like too much (i.e. overly defensive about the interest/significance of the paper) to the reader.
So, on to the ‘main’ Discussion section paragraphs. Although there are no strict rules, what we need to focus on in the Discussion is 1) one major idea per paragraph, 2) a clear topic sentence introducing the main result relating to this idea at the start of the paragraph, 3) minimal presentation of results, just enough to contextualize the discussion but not a repeat of the results section, 4) clear reference to other studies supporting or contradicting the main result or idea, and 5) logical flow from sentence to sentence within the paragraph. The last sentence or two may also offer a broader interpretation of the result or a ‘conclusions’ type sentence, e.g. ‘We conclude that naponigra hexaploids do not benefit in terms of fertility or meiotic stability from having Brassica napus as a parent, but may be a good choice of hybrid for obtaining introgressions between the B and A/C genomes’. Following this general structure for each Discussion paragraph is recommended. Then, just keep going until you run out of results of interest to discuss!
Most journals prefer no subsection headings in the Discussion (I have to admit I also don’t like to see subheadings here, but I suspect I have just been trained into it since I can’t see any reason not to use these). This makes it even more critical that the first sentence of each paragraph (the “topic sentence”) clearly reiterates the result which will be discussed in the paragraph, and if relevant also links back to the hypotheses/aims. Subsequent sentences may add minor, adjacent or linked results, but only if this is relevant to the discussion of the main result – don’t add unnecessary detail or repeat your results section here. One sentence is usually sufficient, and this should then be directly logically connected to the primary result, usually as it relates to the interpretation of the main result. Subsequent sentences should avoid adding more result details unless strictly necessary for the argument being made. As an example, say your primary result was ‘Naponigra allohexaploids were highly infertile’. We might start off the paragraph with this sentence, then add some related references that showed a similar result. However, we might want to also offer a possible explanation for this result based on another result, e.g. ‘Meiosis in Naponigra allohexaploids was highly irregular’. In this case it’s okay to add this result into the same discussion paragraph, but it should be very clear why it’s there, e.g. ‘A possible explanation for the low fertility of our Naponigra hexaploids is that meiosis in these hybrids was highly irregular, which may have hindered production of viable euploid gametes’. The next paragraph would probably also then refer to the literature again, maybe specifically to e.g. describe the link between irregular meiosis and fertility in other Brassica studies.
In this scenario, it would also work fine to split the discussion of irregular meiosis into another paragraph, or to introduce the two results in tandem at the start of the paragraph (‘Naponigra hybrids showed irregular meiosis and low fertility’). There are no really strict rules here, which can make it hard to know how best to write the Discussion paragraphs.
You don’t actually need to discuss every result, although it’s better to start off aiming for this. The Discussion section should be 3-6 pages or so (1.5 line spacing, 11 pt font), but shorter is better than longer if you can manage to say everything that needs to be said. Paragraphs should be as long as necessary, but most likely are too long if they are more than two thirds of a page or so. If you are using subheadings in the Discussion, these may replace topic sentences if having the topic sentence is redundant (if it is just a repeat of the subheading).
The Introduction section
The Introduction section can best be thought of as an inverted triangle. At the beginning, we need to explain broad concepts and general interest topics (the base of the triangle). As we get through the introduction we narrow in on the exact topic of the paper and the novel contribution it is making to the scientific literature, finishing with the hypotheses/aims at the end of the introduction (tip of the triangle). Each paragraph should have one specific idea or topic, and we should use our topic sentence structure here too, so it is clear from the first sentence what the paragraph is going to be about.
So to start off: what is the big problem that your research is (in some small way) addressing? Is it a gap in our understanding, or is it related to a topic of agricultural importance such as drought stress causing loss of crop production? This problem at the most general level possible should be the first thing you introduce in the introduction. This is also why so many papers start with “Climate change…” but this may be too general for most journals and research topics! It’s important to think about the target audience – who are we writing for? Which possible journals? The introduction can always be rewritten to target a different journal / audience.
Next, think about what concepts the reader needs to understand in order to understand why your research is important – what concepts do you need to explain, and what terminology do you need to define? How can the reader understand this problem and the role your research / experimental work plays in relation to this? This needs to be covered in increasing detail as we go from general interest to specific research questions.
The last paragraph of your introduction should introduce your specific study system, hypotheses and aims. Generally, we should aim to have the last sentence of the introduction be the central hypothesis. If you have a set of smaller sub-hypotheses it’s better to introduce these individually with a sentence or two of context each, rather than just listing these one after the other at the end of the introduction.
Sometimes authors like to also present the major results at the end of the introduction. I don’t really like to do this, because it violates the scientific report structure (results should go in the results section). However, there’s a good argument to be made that this helps the reader, who then doesn’t have to read through the results to get the main findings. On the other hand, these main results should also be in the abstract… In any case I don’t feel too strongly about this, but if you do choose to put the major results at the end of the Introduction section these should be clear, understandable and relatively general, and presented in only 1-3 sentences. Sort of a “teaser” for the following results section!
Make sure that by the end of the introduction, the reader knows why your research is important or interesting, what we know already about this topic (in general terms), what specific similar studies have been done before (if any) and how these are different from your current work, and what your major hypotheses or aims are that you will be addressing in this study.
References
Always use a reference manager program. Doing references by hand is a huge waste of time, especially when it comes to the point that the references need to be formatted differently for journal submissions. Mendeley is free, popular and easy to use and therefore a good current choice, even though I know there are other (probably better) options. If I am co-authoring a paper with you I will expect you to either use Mendeley (and have a clean reference database, so that if I need to make changes to the references there should be no manual edits required to the text to correct issues), or to use another program to manage all the references yourself (in this case I will tell you to fix references etc. in revisions or for journal submission).
Always reference every statement you make in a scientific paper, even if it’s something extremely basic like “rapeseed is a major oilseed crop”. The vast majority of your references should be peer-reviewed scientific papers, although it’s okay to cite book chapters, books or more “scientific” or data-based websites like the Food and Agriculture Organisation statistical database (https://www.fao.org/faostat/en/) if these have important information. We don’t cite Wikipedia, blogs or newspaper articles etc. normally in our research area.
When making general statements (particularly early on in the introduction) try to cite reviews. For example, “polyploidy contributes to genome plasticity and adaptation” is a very general statement, so instead of citing specific examples I would add “(reviewed by Leitch and Leitch 2008)” or a couple of more recent review articles. However, if we are making more specific statements, we should be referencing the original experimental studies which provide evidence for these statements. This will be most of the referencing that you will be doing, particularly in the later parts of the Introduction and in the Discussion section. Don’t reference a statement made in the introduction section of an experimental paper to support one of your own statements: this is bad practice. The citation you provide should be a direct link to a paper providing experimental evidence that supports that statement.
Try not to list more than three or four references for a single point: if you find you have more references you want to add, then split them by e.g. species, study system or method rather than making one massive list. For example, “Neopolyploids are often unstable” should not be directly followed by a dozen references from across all different species, but could be divided as “Neopolyploids are often unstable: this has been observed in Tragopogon (Chester et al. 2012), rapeseed (Song et al. 1995, Gaeta and Pires 2007, Szadkowski et al. 2010), Arabidopsis…” etc.
Sometimes, it’s not possible to obtain the original paper that has been cited in other papers. In this case, you can use the format “Smith et al. 1956 as cited in Wang et al. 2010” if you really need to, although I think if you can manage to get hold of at least the abstract of the original paper and read it this is enough grounds to cite the original paper. This is kind of the point of the abstract, to provide a summary of the main results and conclusions of the paper so you don’t need to read the entire manuscript.
Scientific writing conventions
There are a few minor conventions which are hard to know if you haven’t been specifically taught them. Unfortunately, some journals also have their own conventions, and will disregard these general scientific writing principles. In any case, always do the following by default:
- Always include the measurement unit after every measured value, with a space between the number and the measurement unit (e.g. 5 μm, not 5μm; you can find all the symbols you need in “Windows Accessories: Character Map”)
- Put all genus and species names in italic font (but not family or higher taxonomic names, so Brassica napus in the Brassiceae tribe in the Brassicaceae family)
- Put all gene names in italics and protein names in CAPITALS (this can really differ by field and journal though unfortunately)
- Spell out whole numbers below ten, but use numerals for 10 and above (e.g. one, two, three, but 11, 12, 13); the exception is a sentence containing a mix of numbers above and below ten, in which case you can use numerals for everything if you prefer
- The start of sentences should always be a capital letter, so not a number or a lower-case letter (spell out the number if you want to start the sentence with it, or try and rearrange the sentence so as not to start with a number or lower-case gene name).
- If you are using italics for a subheading, then species and gene names need to be in non-italic font, even though this might look a bit weird: e.g. in the italic subheading “Brassica hybrids inherit more chromosomes than expected”, the word “Brassica” would be the only word not in italics
- The crossing symbol is not “x” (lower-case letter x) but “×”, the multiplication sign. Likewise, the micro prefix in measurements such as micrometers is not a lower-case letter “u” but the Greek letter mu: μ
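If you want to check a draft for a few of these conventions automatically, something like the following rough Python sketch could help. The function names, the word list and the unit list are all made up for illustration (they are not from any existing tool), and treating ten itself as a numeral is my assumption here. It only covers the number, unit spacing, micro sign and crossing symbol rules, and it will not catch every case, so treat it as a helper and not as a replacement for proofreading.

```python
import re

# Illustrative lists only (assumptions): extend them for your own manuscript.
NUMBER_WORDS = ["zero", "one", "two", "three", "four",
                "five", "six", "seven", "eight", "nine"]
UNITS = ["μm", "mm", "cm", "kg", "mg", "ml", "g", "m"]


def format_count(n):
    """Spell out whole numbers below ten, use numerals from 10 upwards."""
    if 0 <= n < 10:
        return NUMBER_WORDS[n]
    return str(n)


def tidy_symbols(text):
    """Apply the unit-spacing, micro-sign and crossing-symbol conventions."""
    # "um" directly after a number (with or without a space) becomes "μm".
    text = re.sub(r"(\d\s?)um\b", r"\1μm", text)
    # Insert the missing space between a digit and a known unit: "5μm" -> "5 μm".
    units = "|".join(re.escape(u) for u in UNITS)
    text = re.sub(rf"(\d)({units})\b", r"\1 \2", text)
    # A lone lower-case "x" between spaces is assumed to be a cross.
    text = re.sub(r"(?<=\s)x(?=\s)", "×", text)
    return text


print(format_count(3), format_count(11))
print(tidy_symbols("Cells of 5um were seen in B. napus x B. rapa hybrids"))
```

The regular expressions are deliberately conservative: a unit is only matched when it directly follows a digit, so ordinary words are left alone. Italics for species and gene names cannot be checked this way in plain text.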
Paragraph and sentence structure
Another important foundational point is paragraph structure. A paragraph should 1) start with a topic sentence, 2) contain only one idea or topic, and 3) be longer than one sentence. A topic sentence is the first sentence of the paragraph: this sentence tells you what the paragraph is about. This applies to every manuscript section, with the possible exception of the materials and methods (where section headings or subheadings can replace the function of the topic sentence in telling the reader what the following paragraph is about). It is very important for readability that there is only one topic or idea per paragraph, and each sentence within the paragraph should be logically connected to the next sentence. Sentences should not just pop up out of nowhere, surprising the reader, but should gently flow from the previous sentence, aided by the use of contextualizing words and phrases such as ‘however’, ‘in contrast’, ‘as well’, ‘also in relation to’ etc.
If possible, each sentence should start with the element that links it to the previous sentence. This sounds a bit complicated, but isn’t really. Let’s look at an example. Say your topic sentence is ‘Genotypic effects on the frequency of successful interspecific hybridization events in Brassica have been frequently observed.’ This topic sentence tells the reader that this paragraph (in the Introduction) is about the effect of genotype on interspecific hybridization success in Brassica. The next sentence might read: ‘For example, in crosses between Brassica napus and Brassica nigra, genotype of both the B. napus parent and the B. nigra parent had a significant effect on number of seeds obtained per bud pollination (Gaebelein et al. 2019).’ The words ‘For example’ here tell the reader what relationship this sentence has to the first sentence. In this case, we are adding supporting evidence for our previous statement, so ‘For example’ is an excellent way to start this sentence. The third sentence might be ‘Similar genotype-specific success rates were also observed in crosses between B. rapa and B. oleracea, where cross-pollination was followed by embryo rescue (Abel et al. 2004).’ In this sentence, we want to include a bit of extra information: that embryo rescue was also used (don’t worry about the specific meanings of these examples, just pay attention to the sentence structure). This sentence could also be written as ‘Abel et al. (2004) crossed between different B. rapa and B. oleracea lines followed by embryo rescue, and also observed an effect of parent genotype on success rate in obtaining interspecific hybrid progeny.’ Both sentence versions are technically correct. However, try reading sentence two then sentence three for both versions:
Version 1: ‘For example, in crosses between Brassica napus and Brassica nigra, genotype of both the B. napus parent and the B. nigra parent had a significant effect on number of seeds obtained per bud pollination (Gaebelein et al. 2019). Similar genotype-specific success rates were also observed in crosses between B. rapa and B. oleracea, where cross-pollination was followed by embryo rescue (Abel et al. 2004).’
Version 2: ‘For example, in crosses between Brassica napus and Brassica nigra, genotype of both the B. napus parent and the B. nigra parent had a significant effect on number of seeds obtained per bud pollination (Gaebelein et al. 2019). Abel et al. (2004) crossed between different B. rapa and B. oleracea lines followed by embryo rescue, and also observed an effect of parent genotype on success rate in obtaining interspecific hybrid progeny.’
For Version 2, we have to read the whole of the second sentence in order to work out how it connects to the previous sentence and to the theme. This is a greater cognitive load (mental effort) for the reader, which we want to avoid where possible. Hence, Version 1 is preferred, with clear connections between the end of one sentence and the start of the next. In most scientific papers we are already asking a lot of the reader in terms of understanding and remembering different concepts, so we should at least make our writing as easy to read as possible!
Scientific terminology and acronyms
This leads me to another point about language, which is the use of jargon. Jargon is any field-specific terminology which probably wouldn’t be understood by a non-expert on the topic. Some jargon (or highly technical language) is necessary to allow scientists to communicate accurately and with no potential misunderstandings with other scientists in the same field. One example of this is in taxonomy, when we use names for specific leaf shapes, e.g. pinnate, or when we want to very accurately label specific stages of a process (e.g. zygotene in meiosis). However, in general, there should be as little jargon as possible in your scientific writing, so that it can be read and understood by the widest possible audience that is reasonable given the topic. This is not ‘dumbing down’, but just trying to always use the simplest language possible to communicate your meaning. There are no points for using complicated vocabulary: we aren’t trying to write English literature! Always remember that the majority of scientists worldwide are not native English speakers (probably just like you, statistically speaking) and try to write clear, short, logical and easy-to-understand sentences and paragraphs. If you introduce a concept or use a term that the average biological scientist may not be 100% familiar with, always explain this term the first time you use it in the manuscript. I use this same rule for conference presentations and have yet to have someone complain that there was too much introduction, or that it was too basic! Also, avoid using acronyms unless it’s really necessary. Every acronym you introduce also adds to the mental load of the reader: they must try and recall the meaning of the acronym every time they read it, which is usually more annoying than just seeing the same term written many times.
Very commonly used acronyms (as in ones most readers would have seen already, like DNA) or ones that stand for long, multiple word concepts are usually okay to use, but only if they are really frequently used throughout the manuscript. There is no rule as such, but if the acronym shows up fewer than 10 to 20 times in the manuscript then there is probably no need to use it at all. I can’t recommend making up your own acronyms (or terminology) under any circumstances really.
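A quick way to see which acronyms might not be earning their place is to count how often each one appears in your draft. The Python sketch below is only a rough illustration: the function name and the default threshold of 10 uses are my own assumptions (taken from the lower end of the range above), and it simply treats any word of two or more capital letters as an acronym, so it will also pick up protein names written in capitals.

```python
import re
from collections import Counter


def rarely_used_acronyms(text, min_uses=10):
    """Return acronyms (runs of 2+ capital letters) used fewer than min_uses times."""
    # Hypothetical definition of "acronym": a whole word of capitals only.
    counts = Counter(re.findall(r"\b[A-Z]{2,}\b", text))
    return {acronym: n for acronym, n in counts.items() if n < min_uses}


# Toy manuscript text: "DNA" appears 12 times, "QTL" only twice.
draft = "DNA " * 12 + "QTL mapping and QTL analysis"
print(rarely_used_acronyms(draft))
```

Anything this reports is a candidate for writing out in full, although very familiar acronyms like DNA are fine to keep regardless of the count.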