Difference between revisions of "Coding style/Formatting"
Latest revision as of 10:18, 19 October 2012
The following coding style considerations apply to the most mechanical aspects of C source code style.
Quick summary
- Keep lines to 79 characters or less.
- Use four-column indents.
- Do not use tabs (except in the few files where they're still in use).
- For comments longer than two lines, put the /* and */ on their own lines.
- Put spaces after flow control keywords but not after function names or cast operators.
- Put open braces on the same line as flow control statements, separated by a space.
- Use braces for flow control statement bodies longer than one line of text.
- If using braces around one arm of an if-else statement, use them around both arms.
- Put spaces around binary operators, but not unary operators.
- When breaking long lines, break after binary operators, not before.
- Don't parenthesize return statements.
- Use UTF-8 for non-ASCII characters in comments (as in proper names).
Maximum width 79 columns
Source code lines should not exceed 79 columns in width.
Current conformance
Existing code mostly conforms to this guideline.
Rationale
A width of 79 columns fits on most terminals, and is most suitable for printing with a decent column width. Long lines resulting from deeply indented code are often a symptom of design flaws.
Four-column basic indentation offset
Every level of block nesting should be indented by an additional four columns. Labels, including "switch" labels, should be at one less level of indentation than their surrounding code:
void
foo(int x)
{
    switch (x) {
    case 0:
        bar();
        break;
    case 1:
        quux();
        break;
    default:
        break;
    }
}
This four-column basic offset is the increment of indentation. It is different from tab stops. In files that contain tab characters, the assumption is that tab stops are at every 8 columns. (This is the convention for Unix.) Therefore, in a file that uses tab characters, a line at the second indentation level will have a single tab character as its indentation. (We are attempting to phase out the use of tab characters.)
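As a rough illustration of how the four-column offset relates to 8-column tab stops, the sketch below computes the display width of a line's leading whitespace. The function name indent_columns is hypothetical and is not part of the krb5 tree; it only assumes the Unix tab stop convention described above.

```c
#include <stddef.h>

/*
 * Return the display width, in columns, of the leading whitespace of
 * line, assuming tab stops at every 8 columns.
 */
size_t
indent_columns(const char *line)
{
    size_t col = 0;

    for (; *line == ' ' || *line == '\t'; line++) {
        if (*line == '\t')
            col += 8 - col % 8;
        else
            col++;
    }
    return col;
}
```

Under this assumption a single leading tab and eight leading spaces both yield 8, which is the second indentation level.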
Current conformance
Existing code varies in conformance. Much of the core library code (src/lib/krb5, etc.) conforms, but other subsystems chose different indentation offsets. Exceptions include:
- Code of BSD-related origin -- typically eight columns
  - src/plugins/kdb/db2/libdb2
  - src/lib/rpc
  - Parts of src/lib/gssapi/mechglue
- Code derived from OpenVision -- various
  - Parts of src/lib/gssapi/krb5
  - src/lib/kadm5
  - src/kadmin
Rationale
Combined with the 79-column width limit, this somewhat limits the level of nesting. This indentation offset allows for visual identification of indentation levels while avoiding long-line problems resulting from using an eight-column indentation offset with some of the long identifier names we use.
No tab characters
No tab characters should appear in source files. This guideline will probably be one of the more difficult ones to adopt in a non-disruptive manner. See Coding style/Transition strategies for one possibility.
Current conformance
Existing code does not conform. Much of the existing code was written in Emacs, which defaults to using sequential tab characters at the beginning of stretches of horizontal whitespace longer than one column.
Rationale
Tab stop locations are not consistent across different editors and platforms, and can make code harder to read on a platform other than the one on which it was written. Tab characters also make diffs harder to read.
Example:
12345678901234567890
    four spaces
        eight spaces
	tab
No trailing whitespace
There should be no whitespace at the end of a line. Blank lines should not contain any horizontal whitespace.
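One possible way to check this rule mechanically is sketched below. The helper name has_trailing_whitespace is hypothetical; it treats an optional final newline as the end of the line and reports whether a space or tab precedes it, which also covers blank lines that contain only indentation.

```c
#include <string.h>

/* Return 1 if line ends in a space or tab, ignoring a final newline. */
int
has_trailing_whitespace(const char *line)
{
    size_t len = strlen(line);

    if (len > 0 && line[len - 1] == '\n')
        len--;
    return len > 0 && (line[len - 1] == ' ' || line[len - 1] == '\t');
}
```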
Current conformance
Existing code is highly variable in this area. Particularly problematic are boilerplate, such as copyright notices, which contain trailing whitespace. Blank lines in code sometimes contain indentation whitespace.
Rationale
Trailing whitespace is difficult to see in many editors. It can also create problems when generating patch files.
Comment formatting
Comments to the right of code start in column 32. Comments not to the right of code are indented at the prevailing indent for the surrounding code. Make the comments complete sentences. If you need more than two lines, use a block comment like this:
/*
 * This is a block comment. It should consist of complete
 * sentences.
 *
 * Paragraphs should be separated by blank lines so that emacs
 * fill commands will work properly.
 */
Important one-line or two-line comments should also be done in block form:
/*
 * This is a really important one-line comment.
 */
Two-line comments which are not made into block comments should look like:
/* A brief explanatory comment which does not quite fit onto one
 * line. */
In order to get the start and end delimiters for block comments to stay when you use emacs to fill paragraphs in the comments, set both the c-hanging-comment-starter-p and the c-hanging-comment-ender-p variables to nil. This will be done by the tentative "krb5" style for the emacs cc-mode.
Since we are mostly aiming for C89 compatibility, don't use "//" comments.
For Doxygen markup in comment blocks, use an extra asterisk "*" to begin a Doxygen block, and "@" (at-sign) for Doxygen command keywords.
/**
 * @file krb5.h
 * @brief the main krb5 header
 */
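The same conventions can be applied to a function, as in the following sketch. The function clamp_int is a hypothetical example and not a krb5 API; @brief, @param, and @return are standard Doxygen commands.

```c
/**
 * @brief Clamp a value to a closed range.
 *
 * @param val value to clamp
 * @param lo lower bound
 * @param hi upper bound (assumed to be no less than lo)
 * @return val limited to the range [lo, hi]
 */
int
clamp_int(int val, int lo, int hi)
{
    if (val < lo)
        return lo;
    if (val > hi)
        return hi;
    return val;
}
```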
Current conformance
Rationale
Horizontal white space
One space goes after keywords ("if", "for", "while", "do", "switch", and "return"). Do not put a space after "sizeof", but do parenthesize its argument. Do not put a space between a function name and the opening parenthesis of its argument list. Do not put a space after a cast operator. Do not put a space between a keyword and its immediately-following semicolon. One space goes after each comma in a function argument or parameter list or comma expression.
if (x) {
    foo((long)x);
}
while (y) {
    baz(&y);
    if (quux())
        return;
}
Semicolons separating the expressions of a "for" statement have one space after them unless the following expression is empty.
for (;;) {
    /* ... */
}
for (i = 0; i < n; i++) {
    /* ... */
}
Put spaces around binary operators, but not around unary or postfix operators. The structure member operators "." and "->" count as postfix operators, not binary operators.
x = --a + b / c - d++;
y = p->z.v[x];
Omitting spaces around some binary operators may be justified when it improves readability:
s[len+1] = '\0';
Put spaces around the "?" and ":" characters in a conditional expression:
x = y ? f() : g();
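The spacing rules in this section can be seen together in a small, self-contained sketch. The names average and sample_average are hypothetical and chosen only for illustration; the code shows "sizeof" with a parenthesized argument and no space, a cast with no following space, spaces after commas and "for" semicolons, and spaces around binary and conditional operators.

```c
#include <stddef.h>

/* Return the average of the n values in vals, or 0 if n is 0. */
double
average(const int *vals, size_t n)
{
    size_t i;
    long sum = 0;

    for (i = 0; i < n; i++)
        sum += vals[i];
    return n ? (double)sum / n : 0;
}

/* Average a fixed table, computing its length with sizeof. */
double
sample_average(void)
{
    static const int table[] = { 2, 4, 9 };

    return average(table, sizeof(table) / sizeof(table[0]));
}
```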
Current conformance
Spacing around binary operators is mostly consistent. Existing code is not consistent about putting the opening parenthesis of a function call immediately after the function name.
Rationale
Extra spacing around keywords helps to distinguish them from functions. Spacing around binary operators improves readability. Some of the arbitrary quirks in these guidelines ("sizeof", cast operators) are for consistency with BSD coding style.
Continuation lines
When breaking long lines, the continuation lines should be indented by an additional indentation level. Break lines after binary operators, not before. When continuing a parenthesized expression, line up the continuation to the right of the corresponding opening parenthesis.
x = y + z + do_something_here() * number_of_the_counting +
    do_something_else() + quux;
if (x != y && silly_variable <= something_long_here &&
    random_comparision(x, y) == 0) {
    /* ... */
}
Current conformance
Existing code varies.
Rationale
This guideline is mostly for consistency with BSD style, except that BSD style uses half-indents for continuations.
Function definitions and declarations
Function names in function definitions should begin at the leftmost column. The return type name of a function in a function definition should go on the line preceding the function name. The opening brace of a function definition should also go in the leftmost column. Use ANSI-style function definitions, not the K&R style; the K&R style is obsolescent.
char *
foo(int a)
{
    /* ... */
}
For functions with sufficiently many arguments that they do not fit on one line, one possibility is to place one parameter per line like:
krb5_error_code
krb5_do_something(
    krb5_context context,
    char *string)
{
    /* ... */
}
Note that the opening parenthesis is at the end of the line, and the closing parenthesis immediately follows the final parameter.
Another style is to line up the continuation of the parameter list to the right of the opening parenthesis:
int
lengthy_function_name(char *really, int ridiculously, long lengthy_argument,
                      void *list, struct goes *here)
{
    /* ... */
}
Try to use a consistent form within a file.
For function prototype declarations that are not part of a definition, do not omit parameter names, and try to place the return type name on the same line as the function name. Also, try to avoid the above one-line-per-parameter style for prototypes.
void krb5int_buf_add(struct k5buf *buf, const char *data);
Some function prototypes will have many characters preceding the function name, such as calling convention or other attribute macros; when combined with a long function name, this makes putting the function name at the beginning of a line a better idea:
krb5_error_code KRB5_CALLCONV
krb5_calculate_checksum(krb5_context context, krb5_cksumtype ctype,
                        krb5_const_pointer in, size_t in_length,
                        krb5_const_pointer seed, size_t seed_length,
                        krb5_checksum *outcksum);
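To contrast the prototype and definition layouts described in this section, here is a minimal sketch using a hypothetical function, count_char, which is not part of krb5. The prototype keeps its parameter names and places the return type on the same line as the function name; the definition moves the return type to the preceding line so that the name starts in the leftmost column.

```c
#include <stddef.h>

/* Prototype: parameter names kept, return type on the same line. */
size_t count_char(const char *str, char ch);

/* Definition: return type on its own line, name in the first column. */
size_t
count_char(const char *str, char ch)
{
    size_t n = 0;

    for (; *str != '\0'; str++) {
        if (*str == ch)
            n++;
    }
    return n;
}
```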
Current conformance
Existing code is variable.
Rationale
Placing function names in the leftmost column helps some tools, such as ctags or Emacs.
Omitting parameter names from prototypes is the usual style for system headers, as an attempt to minimize namespace conflicts. (Parameter name identifiers in function prototype declarations have a scope ending with the closing parenthesis of the prototype, and should be irrelevant, but user-defined macros can rewrite the parameter names, causing syntax errors or unintended effects.) Leaving parameter names in prototypes helps a reader remember the meanings of the parameters. Alternatives, such as putting a parameter name in a comment, are less readable to humans and to Doxygen. (Doxygen also does not handle omitted parameter names.)
Flow control statements
If the body of a flow control statement is more than one line of text (even if it is only one statement), it should be surrounded with braces. If braces are used for one arm of an "if", "else", or "else if" statement, they should be used for all arms. The body of a "do" or "switch" statement should always use braces. Otherwise, braces should not be used for single-line bodies.
if (ret)
    goto cleanup;
Braces opening substatements should be on the same line as the keyword or expression associated with that substatement. There should be one space before the opening brace. This is sometimes called "hanging" braces. The closing brace of the substatement should be the first non-whitespace character on its line, and be placed at the same indentation level as the keyword for that substatement.
if (x) {
    foo(x);
    bar(x);
}
The "while" keyword in a do-while construct should sit on the same line as the closing brace of the substatement following "do":
do {
    baz();
} while (x);
An "if" substatement immediately following an "else" keyword should be on the same line as the "else":
if (x) {
    foo();
    foo2();
} else if (y) {
    bar();
}
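The brace rules above can be sketched in two small hypothetical functions (neither is krb5 code). In collatz_steps, the "else" arm spans two lines of text because of its comment, so both arms take braces; in decimal_width, the single-line "if" body takes none, while the "do" body always does.

```c
/* Return the number of Collatz steps needed to reach 1 from n. */
unsigned int
collatz_steps(unsigned long n)
{
    unsigned int steps = 0;

    if (n == 0)
        return 0;
    while (n != 1) {
        if (n % 2 == 0) {
            n /= 2;
        } else {
            /* This arm is two lines of text, so both arms use braces. */
            n = 3 * n + 1;
        }
        steps++;
    }
    return steps;
}

/* Return the number of characters needed to print val in decimal. */
int
decimal_width(long val)
{
    int width = 0;

    if (val < 0)
        width++;
    do {
        width++;
        val /= 10;
    } while (val != 0);
    return width;
}
```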
Do not parenthesize the expression in a "return" statement.
Current conformance
Existing code mostly conforms. Some Sun-derived code parenthesizes the expressions of "return" statements.
Rationale
This style is mostly for consistency with the BSD coding style. The GNU brace style consumes a larger amount of vertical space. The "brace-else-if-brace" style also prevents "stairstepping" within a long series of conditional statements.
UTF-8 character encoding/repertoire
The character encoding for source code should be UTF-8. The use of characters outside the US-ASCII repertoire should be restricted to spelling proper names in comments and similar things. Avoid characters such as "curly quotation marks" when ordinary US-ASCII equivalents exist.
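A simple check for text that stays within the US-ASCII repertoire might look like the sketch below. The helper is_us_ascii is hypothetical; it relies only on the fact that every byte of a multi-byte UTF-8 sequence has its high bit set.

```c
/* Return 1 if every byte of str is in the US-ASCII range, else 0. */
int
is_us_ascii(const char *str)
{
    const unsigned char *p = (const unsigned char *)str;

    for (; *p != '\0'; p++) {
        if (*p > 0x7f)
            return 0;
    }
    return 1;
}
```

A line for which this returns 0 is not necessarily wrong; it may contain a proper name in a comment, which the guideline permits.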
Current conformance
Current code mostly conforms. Limiting the use of non-ASCII characters enhances portability, and there are very few reasons for non-ASCII characters to exist in source code outside of comments.
Rationale
Using a consistent character encoding makes it easier to edit and copy code without needing to repeatedly re-encode the file.