LCLint is a tool for statically checking C programs. With minimal effort, LCLint can be used as a better lint.[1] If additional effort is invested adding annotations to programs, LCLint can perform stronger checks than can be done by any standard lint.
Some problems detected by LCLint include:
LCLint checking can be customized to select what classes of errors are reported using command line flags and stylized comments in the code.
This document is a guide to using LCLint. Section 1 is a brief overview of the design goals of LCLint. Section 2 explains how to run LCLint, interpret the messages it produces, and control checking. Sections 3-10 describe particular checks done by LCLint.
Appendix A. Availability
Appendix B. Communication
Appendix C. Flags
Appendix D. Annotations
Appendix E. Control Comments
Appendix F. Libraries
Appendix G. Specifications
Appendix H. Emacs
1. Overview
The main goals for LCLint are to:
LCLint does many of the traditional lint checks including unused declarations, type inconsistencies, use-before-definition, unreachable code, ignored return values, execution paths with no return, likely infinite loops, and fall-through cases. This document focuses on more powerful checks that are made possible by additional information given in source code annotations. [2] Annotations are stylized comments that document certain assumptions about functions, variables, parameters, and types. They may be used to indicate where the representation of a user-defined type is hidden, to limit where a global variable may be used or modified, to constrain what a function implementation may do to its parameters, and to express checked assumptions about variables, types, structure fields, function parameters, and function results. In addition to the checks specifically enabled by annotations, many of the traditional lint checks are improved by exploiting this additional information.
The best way to learn to use LCLint, of course, is to actually use it (if you don't already have LCLint installed on your system, download it). Before you read much further in this document, I recommend finding a small C program. Then, try running:
lclint *.c
For most C programs, this will produce a large number of messages. To turn off reporting for some of the messages, try:
lclint -weak *.c
The -weak flag is a mode flag that sets many checking parameters to select weaker checking than is done in the default mode. Other LCLint flags will be introduced in the following sections; a complete list is given in Appendix C.
A typical message is:
sample.c: (in function faucet)
sample.c:11:12: Fresh storage x not released before return
  A memory leak has been detected. Newly-allocated or only-qualified storage is not released before the last reference to is it lost. Use -mustfree to suppress message.
  sample.c:5:47: Fresh storage x allocated
The first line gives the name of the function in which the error is found. This is printed before the first message reported for a function. (The function context is not printed if -showfunc is used.)
The second line is the text of the message. This message reports a memory leak - storage allocated in a function is not deallocated before the function returns. The text is preceded by the file name, line and column number where the error is located. The column numbers are used by the emacs mode (see Appendix H) to jump to the appropriate line and column location. (Column numbers are not printed if -showcolumn is used.)
The next line is a hint giving more information about the suspected error. Most hints also include information on how the message may be suppressed. For this message, setting the -mustfree flag would prevent the message from being reported. Hints may be turned off by using -hints. Normally, a hint is given only the first time a class of error is reported. To have LCLint print a hint for every message regardless, use +forcehints.
The final line of the message gives additional location information. For this message, it tells where the leaking storage is allocated.
The generic message format is (parts enclosed in square brackets are optional):
[<file>:<line> (in <context>)]
<file>:<line>[,<column>]: message
[hint]
<file>:<line>,<column>: extra location information, if appropriate
The text of messages and hints may be longer than one line. They are split
into lines of length less than the value set using -linelen
<number>. The default line length is 80 characters. LCLint
attempts to split lines in a sensible place as near to the line length limit as
possible.
The +parenfileformat flag
can be used to generate file locations in the format recognized by
Microsoft Developer Studio. If +parenfileformat is set, the
line number follows the file name in parentheses (e.g.,
sample.c(11)).
2.2 Flags
So that many programming styles can be supported, LCLint provides over 300
flags for controlling checking and message reporting. Some of the flags are
introduced in the body of this document. Appendix C describes every flag.
Modes and shortcut flags are provided for setting many flags at once.
Individual flags can override the mode settings.
Flags are preceded by + or -. When a flag is preceded by + we say it is on; when it is preceded by - it is off. The precise meaning of on and off depends on the type of flag.
The +/- flag settings are used for consistency and clarity, but they contradict standard UNIX usage, and it is easy to accidentally use the wrong one. To reduce the likelihood of using the wrong flag, LCLint issues warnings when a flag is set in an unusual way. Warnings are issued when a flag is redundantly set to the value it already had (these errors are not reported if the flag is set using a stylized comment), if a mode flag or special flag is set after a more specific flag that will be set by the general flag was already set, if value flags are given unreasonable values, or if flags are set in an inconsistent way. The -warnflags flag suppresses these warnings.
Default flag settings will be read from ~/.lclintrc if it is readable. If
there is a .lclintrc file in the working directory, settings in this file will
be read next and its settings will override those in ~/.lclintrc. Command-line
flags override settings in either file. The syntax of the .lclintrc file is
the same as that of command-line flags, except that flags may be on separate
lines and the # character may be used to indicate that the remainder of the
line is a comment. The -nof
flag prevents the ~/.lclintrc file from being
loaded. The -f <filename> flag loads options from filename.
2.3 Stylized Comments
Stylized comments are used to provide extra information about a type, variable
or function interface to improve checking, or to control flag settings
locally.
All stylized comments begin with /*@ and are closed by the end of the comment. The role of the @ may be played by any printable character. Use -commentchar <char> to select a different stylized comment marker.
Syntactic comments for function interfaces are described in Section 4, comments for declaring constants in Section 8.1, and comments for declaring iterators in Section 8.4. Sections 3-8 include descriptions of annotations for expressing assumptions about variables, parameters, return values, structure fields and type definitions. A summary of annotations is found in Appendix D.
Most flags (all except those characterized as "global" in Appendix C) can be set locally using control comments. A control comment can set flags locally to override the command line settings. The original flag settings are restored before processing the next file. The syntax for setting flags in control comments is the same as that of the command line, except that flags may also be preceded by = to restore their setting to the original command-line value. For instance,
/*@+boolint -modifies =showfunc@*/
sets boolint on (this makes bool and int indistinguishable types), sets modifies off (this prevents reporting of modification errors), and sets showfunc to its original setting (this controls whether or not the name of a function is displayed before a message).
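As a minimal sketch of how such a control comment might bracket a piece of legacy code, the fragment below turns boolint on for a single function and then restores the command-line setting with =boolint. The function is_even is hypothetical and exists only for illustration; whether a message would actually be reported without the control comments depends on the other flag settings in effect.

```c
#include <assert.h>

/* Hypothetical legacy helper, not part of LCLint or its libraries.
   It stores the result of a comparison in an int, which mixes the
   boolean and int types.  The control comment below turns boolint on
   so the two types are treated as indistinguishable here. */

/*@+boolint@*/
int is_even (int n)
{
  int even = (n % 2 == 0);          /* comparison result kept in an int */

  assert (even == 0 || even == 1);  /* a comparison yields 0 or 1 in C */
  return even;
}
/*@=boolint@*/
/* From here on, boolint is back at its original command-line value. */
```

Keeping the relaxed setting confined to the code that needs it leaves the stricter checking in force for the rest of the file.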
Traditionally, programming books wax mathematical when they arrive at the topic of abstract data types... Such books make it seem as if you'd never actually use an abstract data type except as a sleep aid.
- Steve McConnell
Information hiding is a technique for handling complexity. By hiding implementation details, programs can be understood and developed in distinct modules and the effects of a change can be localized. One technique for information hiding is data abstraction. An abstract type is used to represent some natural program abstraction. It provides functions for manipulating instances of the type. The module that implements these functions is called the implementation module. We call the functions that are part of the implementation of an abstract type the operations of the type. Other modules that use the abstract type are called clients.
Clients may use the type name and operations, but should not manipulate or rely on the actual representation of the type. Only the implementation module may manipulate the representation of an abstract type. This hides information, since implementers and maintainers of client modules should not need to know anything about how the abstract type is implemented. It provides modularity, since the representation of an abstract type can be changed without having to change any client code.
LCLint supports abstract types by detecting places where client code depends on the concrete representation of an abstract type.
To declare an abstract type, the abstract annotation is added to a typedef. For example (in mstring.h),
typedef /*@abstract@*/ char *mstring;
declares mstring as an abstract type. It is implemented using a char *, but clients of the type should not depend on or need to be aware of this. If it later becomes apparent that a better representation such as a string table should be used, we should be able to change the implementation of mstring without having to change or inspect any client code.
In a client module, abstract types are checked by name, not structure. LCLint reports an error if an instance of mstring is passed as a char * (for instance, as an argument to strlen), since the correctness of this call depends on the representation of the abstract type. LCLint also reports errors if any C operator except assignment (=) or sizeof is used on an abstract type. The assignment operator is allowed since its semantics do not depend on the representation of the type.[4] The use of sizeof is also permitted, since this is the only way for clients to allocate pointers to the abstract type. Type casting objects to or from abstract types in a client module is an abstraction violation and will generate a warning message.
Normally, LCLint will assume a type definition is not abstract unless the /*@abstract@*/ qualifier is used. If instead you want all user-defined types to be abstract types unless they are marked as concrete, the +impabstract flag can be used. This adds an implicit abstract annotation to any typedef that is not marked with /*@concrete@*/.
Some examples of abstraction violations detected by LCLint are shown in Figure 2.
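The following is a small sketch, not taken from the figure, of how the mstring type might be implemented and used. The operation names (mstring_create, mstring_length, mstring_free) and the client function client_length are assumptions made for illustration. The three parts are shown in one block for brevity; in practice they would live in separate files, since which code has access to the representation depends on how the program is divided into modules.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* ---- mstring.h (hypothetical interface) ---- */
typedef /*@abstract@*/ char *mstring;

/* ---- mstring.c (hypothetical implementation module) ----
   Only this part relies on the representation being a char pointer. */
mstring mstring_create (const char *s)
{
  mstring m;

  assert (s != NULL);
  m = malloc (strlen (s) + 1);
  if (m == NULL) {
    abort ();                        /* out of memory: give up */
  }
  strcpy (m, s);
  return m;
}

size_t mstring_length (mstring m)
{
  return strlen (m);
}

void mstring_free (mstring m)
{
  free (m);
}

/* ---- client code (hypothetical) ----
   The client uses only the type name and the operations. */
size_t client_length (const char *text)
{
  mstring m = mstring_create (text);
  size_t n = mstring_length (m);     /* fine: goes through an operation */

  /* Calling strlen (m) here would pass an mstring where a char pointer
     is expected, which is the kind of use the text says is reported. */
  mstring_free (m);
  return n;
}
```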
There are several ways of selecting what code has access to the representation of an abstract type:
The mutability of a concrete type is determined by its type definition. For abstract types, mutability does not depend on the type representation but on what operations the type provides. If an abstract type has operations that may change the value of instances of the type, the type is mutable. If not, it is immutable. The value of an instance of an immutable type never changes. Since object sharing is noticeable only for mutable types, they are checked differently from immutable types.
The /*@mutable@*/ and /*@immutable@*/ annotations are used to declare an abstract type as mutable or immutable. (If neither is used, the abstract type is assumed to be mutable.) For example,
typedef /*@abstract@*/ /*@mutable@*/ char *mstring;
typedef /*@abstract@*/ /*@immutable@*/ int weekDay;
declares mstring as a mutable abstract type and weekDay as an immutable
abstract type.
Clients of a mutable abstract type need to know the semantics of assignment. After the assignment expression s = t, do s and t refer to the same object (that is, will changes to the value of s also change the value of t)?
LCLint prescribes that all abstract types have sharing semantics, so s and t would indeed be the same object. LCLint will report an error if a mutable type is implemented with a representation (e.g., a struct) that does not provide sharing semantics (controlled by the mutrep flag).
The mutability of an abstract type is not necessarily the same as the
mutability of its representation. We could use the immutable concrete type int
to represent mutable strings using an index into a string table, or declare
mstring as immutable as long as no operations are provided that modify the
value of an mstring.
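To make the sharing semantics concrete, here is a hedged sketch of a mutable mstring with one modifying operation. The operation names and sharing_demo are hypothetical. Because the representation is a pointer, assigning t = s makes both names refer to the same object, so a change made through s is visible through t.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef /*@abstract@*/ /*@mutable@*/ char *mstring;

/* Hypothetical operations on the type. */
mstring mstring_create (const char *s)
{
  mstring m;

  assert (s != NULL);
  m = malloc (strlen (s) + 1);
  if (m == NULL) {
    abort ();
  }
  strcpy (m, s);
  return m;
}

/* This operation changes the value of an instance, which is what
   makes the type mutable. */
void mstring_setChar (mstring m, size_t i, char c)
{
  assert (i < strlen (m));
  m[i] = c;
}

char mstring_getChar (mstring m, size_t i)
{
  assert (i < strlen (m));
  return m[i];
}

void mstring_free (mstring m)
{
  free (m);
}

/* Returns the first character seen through t after modifying s. */
char sharing_demo (void)
{
  mstring s = mstring_create ("cat");
  mstring t = s;                  /* s and t now refer to the same object */
  char seen;

  mstring_setChar (s, 0, 'b');    /* change made through s ... */
  seen = mstring_getChar (t, 0);  /* ... is visible through t */
  mstring_free (s);               /* t no longer refers to live storage */
  return seen;
}
```

If mstring_setChar were removed and no other operation modified an instance, the same representation could be declared immutable instead.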
3.3 Boolean Types
Standard C has no boolean representation - the result of a comparison operator
is an integer, and no type checking is done for test expressions. Many common
errors can be detected by introducing a distinct boolean type and stronger type
checking.
Use the -booltype name flag to select the type name used to represent boolean values.[8] Relations, comparisons and certain standard library functions are declared to return bool types.
LCLint checks that the test expression in an if, while, or for statement or an operand to &&, || or ! is a boolean. If the type of a test expression is not a boolean, LCLint will report an error depending on the type of the test expression and flag settings. If the test expression has pointer type, LCLint reports an error if predboolptr is on (turning this flag off prevents messages for the idiom of testing if a pointer is not null without a comparison). If it is type int, an error is reported if predboolint is on. For all other types, LCLint reports an error if predboolothers is on.
Since using = instead of == is such a common bug, reporting of test expressions that are assignments is controlled by the separate predassign flag. The message can be suppressed by adding extra parentheses around the test expression.
Appendix C describes other flags for controlling boolean checking.
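The sketch below illustrates the kinds of test expressions discussed in this section. The function count_positive is hypothetical. The code that is actually compiled uses explicit comparisons throughout; the alternatives that would fall under predboolptr, predboolint and predassign appear only in comments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: counts the positive entries in an array. */
int count_positive (const int *values, size_t n)
{
  int count = 0;
  size_t i;

  /* Explicit comparison: the test expression is a boolean.
     Writing "if (!values)" would use a pointer as a test, which is
     reported when predboolptr is on. */
  if (values == NULL) {
    return 0;
  }

  for (i = 0; i < n; i++) {
    /* Writing "if (values[i])" would use an int as a test, which is
       reported when predboolint is on. */
    if (values[i] > 0) {
      count++;
    }
    /* Writing "if (count = 0)" would be an assignment used as a test;
       reporting of that case is controlled by predassign. */
  }

  assert ((size_t) count <= n);   /* sanity check on the result */
  return count;
}
```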
3.4 Primitive C Types
Two types have compatible type if their types are the same.
- ANSI C, 3.1.2.6.
Two types need not be identical to be compatible.
- ANSI C, footnote to 3.1.2.6.
LCLint supports stricter checking of primitive C types. The char and enum types can be checked as distinct types, and the different numeric types can be type-checked strictly.
The +charint flag can be used for checking legacy programs where char and int
are used interchangeably. If charint is on, char types are indistinguishable
from ints. To keep char and int as distinct types, but allow chars to be used
to index arrays, use +charindex.
3.4.2 Enumerators
Standard C treats user-declared enum types just like integers. An arbitrary
integral value may be assigned to an enum type, whether or not it was listed as
an enumerator member. LCLint checks each user-defined enum type as a distinct
type. An error is reported if a value that is not an enumerator member is
assigned to the enum type, or if an enum type is used as an operand to an
arithmetic operator.
If the enumint flag is on, enum and int types may be used interchangeably. Like charindex, if the enumindex flag is on, enum types may be used to index arrays.
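As an illustration of code that respects the distinct-enum checking, the sketch below steps through an enum with a switch instead of arithmetic. The color type and color_next are hypothetical names chosen for this example.

```c
#include <assert.h>

typedef enum { RED, GREEN, BLUE } color;

/* Hypothetical example: returns the next color, wrapping around. */
color color_next (color c)
{
  /* Writing "return c + 1;" would use an enum type as an operand to an
     arithmetic operator, and "color bad = 7;" would assign a value that
     is not an enumerator member.  Both are the kind of use described
     above as reported unless enumint is on. */
  switch (c) {
    case RED:   return GREEN;
    case GREEN: return BLUE;
    case BLUE:  return RED;
  }

  assert (0);   /* not reached for valid enumerator members */
  return RED;
}
```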
Similarly, if a signed value is assigned to an unsigned, LCLint will report an error since an unsigned type cannot represent all signed values correctly. If the ignoresigns flag is on, checking is relaxed to ignore all sign qualifiers in type comparisons (this is not recommended, since it will suppress reporting of real bugs, but may be necessary for quickly checking certain legacy code).
/*@integraltype@*/
An arbitrary integral type. The actual type may be any one of short, int, long, unsigned short, unsigned, or unsigned long.
/*@unsignedintegraltype@*/
An arbitrary unsigned integral type. The actual type may be any one of unsigned short, unsigned, or unsigned long.
/*@signedintegraltype@*/
An arbitrary signed integral type. The actual type may be any one of short, int, or long.
LCLint reports an error if the code depends on the actual representation of a type declared as an arbitrary integral. The match-any-integral flag relaxes checking and allows an arbitrary integral type to match any integral type.
Other flags set the arbitrary integral types to a concrete type. These should only be used if portability to platforms that may use different representations is not important. The long-integral and long-unsigned-integral flags set the type corresponding to /*@integraltype@*/ to be long and unsigned long respectively. The long-unsigned-unsigned-integral flag sets the type corresponding to /*@unsignedintegraltype@*/ to be unsigned long. The long-signed-integral flag sets the type corresponding to /*@signedintegraltype@*/ to be long.
A function prototype documents the interface to a function. It serves as a contract between the function and its caller. In early versions of C, the function "prototype" was very limited. It described the type returned by the function but nothing about its parameters. The main improvement provided by ANSI C was the ability to add information on the number and types of parameters to a function. LCLint provides the means to express much more about a function interface: what global variables the function may use, what values visible to the caller it may modify, whether a pointer parameter may be a null pointer or point to undefined storage, whether storage pointed to by a parameter is deallocated before the function returns, whether the function may create new aliases to a parameter, whether the caller can modify or deallocate the return value, etc.
The extra interface information places constraints on both how the function may be called and how it may be implemented. LCLint reports places where these constraints are not satisfied. Typically, these indicate bugs in the code or errors in the interface documentation.
This section describes syntactic comments that may be added to a function
declaration to document what global variables the function implementation may
use and what values visible to its caller it may modify. Sections 5-7 describe
annotations that may be added to parameters to constrain valid arguments to a
function and how these arguments may be used after the call, and to the return
value to constrain results.
4.1 Modifications
The modifies clause lists what values visible to the caller may be modified by
a function. Modifies clauses limit what values a function may modify, but they
do not require that listed values are always modified. The declaration,
int f (int *p, int *q) /*@modifies *p@*/;
declares a function f that may modify the value pointed to by its first argument but may not modify the value of its second argument or any global state.
LCLint checks that a function does not modify any caller-visible value not
encompassed by its modifies clause and does modify all values listed in its
modifies clause on some possible execution of the function. Figure 4 shows an
example of modifies checking done by LCLint.
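A minimal sketch of an implementation that is consistent with the declaration of f given above follows. The body is an assumption made for illustration; only the declaration comes from the text.

```c
#include <assert.h>
#include <stddef.h>

/* Declaration from the text: f may modify *p and nothing else. */
int f (int *p, int *q) /*@modifies *p@*/;

/* Hypothetical body, written only to illustrate the clause. */
int f (int *p, int *q)
{
  assert (p != NULL && q != NULL);

  *p = *q + 1;     /* allowed: *p is listed in the modifies clause */

  /* Adding "*q = 0;" here would modify a caller-visible value that the
     modifies clause does not cover, so it would be reported. */
  return *p;
}
```

Note that the body modifies *p on every execution, which also satisfies the check that each listed value is modified on some possible execution.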
4.1.1 Special Modifications
A few special names are provided for describing function modifications:
internalState
The function modifies some internal state (that is, the value of a static variable). Even though a client cannot access the internal state directly, it is important to know that something may be modified by the function call both for clear documentation and for checking undefined order of evaluation (Section 10.1) and side-effect free parameters (Section 8.2.1).
fileSystem
The function modifies the file system. Any modification that may change the system state is considered a file system modification. All functions that modify an object of type pointer to FILE also modify the file system. In addition, functions that do not modify a FILE pointer but modify some state that is visible outside this process also modify the file system (e.g., rename). The flag mod-file-system controls reporting of undocumented file system modifications.
nothing
The function modifies nothing (i.e., it is side-effect free).
The syntactic comment, /*@*/ in a function declaration or definition (after the
parameter list, before the semi-colon or function body) denotes a function that
modifies nothing and does not use any global variables (see Section 4.2).
4.1.2 Missing Modifies Clauses
LCLint is designed so programs with many functions that are declared without
modifies clauses can be checked effectively. Unless modnomods is on, no
modification errors are reported when checking a function declared with no
modifies clause.
A function with no modifies clause is an unconstrained function since
there are no documented constraints on what it may modify. When an
unconstrained function is called, it is checked differently from a function
declared with a modifies clause. To prevent spurious errors, no modification
error is reported at the call site unless the moduncon flag is on. Flags
control whether errors involving unconstrained functions are reported for other
checks that depend on modifications (side-effect free macro parameters
(Section 8.2.1), undefined evaluation order (Section 10.1), and likely infinite loops
(Section 10.2.1).)
4.1.3 Limitations
Determining whether a function modifies a particular parameter or global is in
general an undecidable[9] problem. To enable useful
checking, certain simplifying assumptions are necessary. LCLint assumes an
object is modified when it appears on the left hand side of an assignment or it
is passed to a function as a parameter which may be modified by that function
(according to the called function's modifies clause). Hence, LCLint will report
spurious modification errors for assignments that do not change the value
of an object or modifications that are always reversed before a procedure
returns. The /*@-mods@*/ and /*@=mods@*/ control comments can be used around
these modifications to suppress the message.
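The following sketch shows a modification that is always reversed before the function returns, bracketed by the control comments mentioned above. The function prefix_to_long is hypothetical. It temporarily terminates the string after n characters so that strtol sees only the prefix, then restores the saved character.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: parses the first n characters of s as a decimal
   number.  From the caller's point of view s is unchanged on return. */
long prefix_to_long (char *s, size_t n) /*@modifies nothing@*/
{
  char saved;
  long value;

  assert (s != NULL && n <= strlen (s));

  saved = s[n];
  /*@-mods@*/
  s[n] = '\0';                 /* temporary change ... */
  /*@=mods@*/

  value = strtol (s, NULL, 10);

  /*@-mods@*/
  s[n] = saved;                /* ... reversed before returning */
  /*@=mods@*/

  return value;
}
```

Keeping the suppressed region as small as possible means that any other, unintended modification of s in the function would still be reported.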
4.2 Global Variables
Another aspect of a function's interface is the global variables it uses. A
globals list in a function declaration lists external variables that may be
used in the function body. LCLint checks that global variables used in a
procedure match those listed in its globals list. A global is used in a
function if it appears in the body directly, or it is in the globals list of a
function called in the body. LCLint reports if a global that is used in a
procedure is not listed in its globals list, and if a listed global is not used
in the function implementation.
Figure 5 shows an example function definition with a globals list and associated checking done by LCLint.
A global or file static variable declaration may be preceded by an annotation to indicate how the variable should be checked. In order of decreasing checks, the annotations are:
/*@checkedstrict@*/
Strictest checking. Undocumented uses and modifications of the variable are reported in all functions whether or not they have a globals list (unless checkstrictglobs is off).
/*@checked@*/
Undocumented use of the variable is reported in a function with a globals list, but not in a function declared with no globals (unless globnoglobs is on).
/*@checkmod@*/
Undocumented uses of the variable are not reported, but undocumented modifications are reported. (If modglobsnomods is on, errors are reported even in functions declared with no modifies clause or globals list.)
/*@unchecked@*/
No messages are reported for undocumented use or modification of this global variable.
If a variable has none of these annotations, an implicit annotation is determined by the flag settings.
Different flags control the implicit annotation for variables declared with
global scope and variables declared with file scope (i.e., using the static
storage qualifier). To set the implicit annotation for variables declared in
context (globs for external variables or statics for file static variables)
to be annotation (checked, checkmod, checkedstrict), use
imp<annotation><context>. For example,
+impcheckedstrictstatics makes the implicit checking on unqualified file static
variables checkedstrict. (See Appendix C for a complete list of globals
checking flags.)
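As a rough sketch of how a globals list and the variable annotations fit together, consider the fragment below. All names are hypothetical, and the exact punctuation accepted inside a globals clause is an assumption here; it may differ between LCLint versions.

```c
#include <assert.h>

/* Hypothetical globals.  request_count is checked, so its use must be
   documented in any function that has a globals list.  debug_level is
   unchecked, so no messages are reported for it. */
/*@checked@*/ int request_count = 0;
/*@unchecked@*/ int debug_level = 0;

/* Uses and modifies request_count, and documents both. */
int next_request_id (void)
  /*@globals request_count@*/ /*@modifies request_count@*/
{
  assert (debug_level >= 0);   /* unchecked global: need not be listed */
  request_count++;
  return request_count;
}

/* Uses request_count but modifies nothing. */
int peek_request_count (void)
  /*@globals request_count@*/ /*@modifies nothing@*/
{
  return request_count;
}
```

If next_request_id omitted request_count from its globals list, the undocumented use would be reported; if it listed a global that its body never used, that would be reported as well.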
4.3 Declaration Consistency
LCLint checks that function declarations and definitions are consistent. The
general rule is that the first declaration of a function implies all later
declarations and definitions. If a function is declared in a header file, the
first declaration processed is its first declaration (if it is declared in more
than one header file an error is reported if redecl is set). Otherwise, the
first declaration in the file defining the function is its first declaration.
Later declarations may not include variables in the globals list that were not included in the first declaration. The exception to this is when the first declaration is in a header file and the later declaration or definition includes file static variables. Since these are not visible in the header file, they cannot be included in the header file declaration. Similarly, the modifies clause of a later declaration may not include objects that are not modifiable in the first declaration. The later declaration may be more specific. For example, if the header declaration is:
extern void setName (employee e, char *s) /*@modifies e@*/;
the later declaration could be,
void setName (employee e, char *) /*@modifies e->name@*/;
If employee is an abstract type, the declaration in the header should not refer to a particular implementation (i.e., it shouldn't rely on there being a name field), but the implementation declaration can be more specific.
This rule also applies to file static variables. The header declaration for a function that modifies a file static variable should use modifies internalState since file static variables are not visible to clients. The implementation declaration should list the actual file static variables that may be modified.
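A small sketch of the file static case follows. The names counter_next and last_issued are hypothetical, and the header and implementation are shown in one block only for brevity.

```c
#include <assert.h>

/* ---- counter.h (hypothetical header) ----
   Clients cannot see the static variable, so the header only says that
   some internal state may change. */
extern int counter_next (void) /*@modifies internalState@*/;

/* ---- counter.c (hypothetical implementation) ---- */
static int last_issued = 0;

/* The implementation declaration is more specific: it names the file
   static variable that may actually be modified. */
int counter_next (void) /*@modifies last_issued@*/
{
  assert (last_issued >= 0);
  last_issued++;
  return last_issued;
}
```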
LCLint can detect many memory management errors at compile time including:
Yea, from the table of my memory I'll wipe away all trivial fond records, all saws of books, all forms, all pressures past, that youth and observation copied there.
- Hamlet prefers garbage collection (Shakespeare, Hamlet. Act I, Scene v)
This section describes execution-time concepts for describing the state of storage more precisely than can be done using standard C terminology. Certain uses of storage are likely to indicate program bugs, and are reported as anomalies.
LCL assumes a CLU-like object storage model.[11] An object is a typed region of storage. Some objects use a fixed amount of storage that is allocated and deallocated automatically by the compiler.
Other objects use dynamic storage that must be managed by the program.
Storage is undefined if it has not been assigned a value, and defined after it has been assigned a value. An object is completely defined if all storage that may be reached from it is defined. What storage is reachable from an object depends on the type and value of the object. For example, if p is a pointer to a structure, p is completely defined if the value of p is NULL, or if every field of the structure p points to is completely defined.
When an expression is used as the left side of an assignment expression we say it is used as an lvalue. Its location in memory is used, but not its value. Undefined storage may be used as an lvalue since only its location is needed. When storage is used in any other way, such as on the right side of an assignment, as an operand to a primitive operator (including the indirection operator, *),[12] or as a
function parameter, we say it is used as an rvalue. It is an anomaly to use undefined storage as an rvalue.
A pointer is a typed memory address. A pointer is either live or dead. A live pointer is either NULL or an address within allocated storage. A pointer that points to an object is an object pointer. A pointer that points inside an object (e.g., to the third element of an allocated block) is an offset pointer. A pointer that points to allocated storage that is not defined is an allocated pointer. The result of dereferencing an allocated pointer is undefined storage. Hence, it is an anomaly to use it as an rvalue. A dead (or "dangling") pointer does not point to allocated storage. A pointer becomes dead if the storage it points to is deallocated (e.g., the pointer is passed to the free library function.) It is an anomaly to use a dead pointer as an rvalue.
There is a special object null corresponding to the NULL pointer in a C
program. A pointer that may have the value NULL is a possibly-null
pointer. It is an anomaly to use a possibly-null pointer where a non-null
pointer is expected (e.g., certain function arguments or the indirection
operator).
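The terms above can be traced through a short function. This is only an illustrative sketch with a hypothetical function name; the comments mark the state of the storage at each step, and the anomalous uses appear only in comments.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical example: stores value in allocated storage and returns
   value + 1, or -1 if no storage could be allocated. */
int storage_states_demo (int value)
{
  int *p;        /* p is undefined: using it as an rvalue is an anomaly */
  int result;

  assert (value < INT_MAX);     /* so that value + 1 cannot overflow */

  p = malloc (sizeof (*p));     /* p is used as an lvalue only */
  if (p == NULL) {              /* the result of malloc is possibly-null, */
    return -1;                  /* so it is tested before any dereference */
  }

  /* p is now an allocated pointer: it points to allocated storage that
     is not yet defined, so reading *p here would be an anomaly. */
  *p = value;                   /* *p used as an lvalue; now defined */
  result = *p + 1;              /* rvalue use of defined storage */

  free (p);                     /* p is now a dead pointer */
  /* Reading *p after this point would use a dead pointer as an rvalue. */

  return result;
}
```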
5.2 Deallocation Errors
There are two kinds of deallocation errors with which we are concerned:
deallocating storage when there are other live references to the same storage,
or failing to deallocate storage before the last reference to it is lost. To
handle these deallocation errors, we introduce a concept of an obligation to
release storage. Every time storage is allocated, it creates an obligation to
release the storage. This obligation is attached to the reference to which the
storage is assigned.[13] Before the scope of the
reference is exited or it is assigned to a new value, the storage to which it
points must be released. Annotations can be used to indicate that this
obligation is transferred through a return value, function parameter or
assignment to an external reference.
5.2.1 Unshared References
`Tis in my memory lock'd, and you yourself shall keep the key of it.
- Ophelia prefers explicit deallocation (Hamlet. Act I, Scene iii)
The only annotation is used to indicate a reference is the only pointer to the object it points to. We can view the reference as having an obligation to release this storage. This obligation is satisfied by transferring it to some other reference in one of three ways:
All obligations to release storage stem from primitive allocation routines (e.g., malloc), and are ultimately satisfied by calls to free. The standard library declares the primitive allocation and deallocation routines.
The basic memory allocator, malloc, is declared:[14]
/*@only@*/ void *malloc (size_t size);
It returns an object that is referenced only by the function return value.
The deallocator, free, is declared:[15]
void free (/*@only@*/ void *ptr);
The parameter to free must reference an unshared object. Since the parameter is declared using only, the caller may not use the referenced object after the call, and may not pass in a reference to a shared object. There is nothing special about malloc and free -- their behavior can be described entirely in terms of the provided annotations.
Figure 6. Deallocation errors.
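As a sketch of how the obligation to release storage might move through user code, the fragment below wraps malloc and free in two hypothetical functions. The obligation is created by malloc, passed to the caller through an only return value, and finally handed to an only parameter that satisfies it by calling free.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator: returns fresh storage.  The only annotation
   on the result transfers the obligation to the caller. */
/*@only@*/ char *copy_string (const char *s)
{
  char *copy;

  assert (s != NULL);
  copy = malloc (strlen (s) + 1);
  if (copy == NULL) {
    abort ();                   /* never returns a null pointer */
  }
  strcpy (copy, s);
  return copy;
}

/* Hypothetical deallocator: takes over the obligation through an only
   parameter and satisfies it by passing the storage to free. */
void discard_string (/*@only@*/ char *s)
{
  free (s);
}

size_t copied_length (const char *s)
{
  char *tmp = copy_string (s);  /* tmp now holds the obligation */
  size_t n = strlen (tmp);

  discard_string (tmp);         /* obligation transferred; tmp must not
                                   be used after this call */
  return n;
}
```

Dropping the call to discard_string would leave the obligation unsatisfied when tmp goes out of scope, which is the situation reported as a memory leak.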
5.2.7 Inner Storage
An annotation always applies to the outermost level of storage. For example,
/*@only@*/ int **x;
declares x as an unshared pointer to a pointer to an int. The only annotation applies to x, but not to *x. To apply annotations to inner storage a type definition may be used:
typedef /*@only@*/ int *oip;
/*@only@*/ oip *x;
Now, x is an only pointer to an oip, which is an only pointer to an int.
When annotations are used in type definitions, they may be overridden in instance declarations. For example,
/*@dependent@*/ oip x;
makes x a dependent pointer to an int.
An implicit memory management annotation may be assumed for declarations with no explicit memory management annotation. Implicit annotations are checked identically to the corresponding explicit annotation, except error messages indicate that they result from an implicit annotation.
Unannotated function parameters are assumed to be temp. This means if memory checking is turned on for an unannotated program, all functions that release storage referenced by a parameter or assign a global variable to alias the storage will produce error messages. (Controlled by paramimptemp.)
Unannotated return values, structure fields and global variables are assumed to be only. With implicit annotations (on by default), turning on memory checking for an unannotated program will produce errors for any function that does not return unshared storage, and for any assignment of shared storage to a global variable or structure field.[16] (Controlled by retimponly, structimponly and globimponly. The codeimponly flag sets all of the implicit only flags.)
Figure 8. Implicit annotations
LCLint supports reference counting by using annotations to constrain the use of reference counted storage in a manner similar to other memory management annotations.
A reference counted type is declared using the refcounted annotation. Only pointer to struct types may be declared as reference counted, since reference counted storage must have a field to count the references. One field in the structure (of integral type) is preceded by the refs annotation to indicate that the value of this field is the number of live references to the structure.
For example (in rstring.h),
typedef /*@abstract@*/ /*@refcounted@*/ struct {
/*@refs@*/ int refs;
char *contents;
} *rstring;
declares rstring as an abstract, reference-counted type. The refs field counts
the number of references and the contents field holds the contents of a
string.
All functions that return refcounted storage must increase the reference count before returning. LCLint cannot determine if the reference count was increased, so any function that directly returns a reference to refcounted storage will produce an error. This is avoided by using a function to return a new reference (e.g., rstring_ref in Figure 9).
A reference counted type may be passed as a temp or dependent parameter. It may not be passed as an only parameter. Instead, the killref annotation is used to denote a parameter whose reference is eliminated by the function call. Like only parameters, an actual parameter corresponding to a killref formal parameter may not be used in the calling function after the call. LCLint checks that the implementation of a function releases all killref parameters, either by passing them as killref parameters, or assigning or returning them without increasing the reference count.
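The rules above can be sketched with the rstring type. This is an illustrative sketch, not the contents of Figure 9: rstring_create and rstring_release are hypothetical names, and whether LCLint accepts these bodies without further annotations or message suppression is not verified here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef /*@abstract@*/ /*@refcounted@*/ struct {
  /*@refs@*/ int refs;
  char *contents;
} *rstring;

/* Creates an rstring holding a copy of s, with one live reference. */
rstring rstring_create (const char *s)
{
  rstring r = (rstring) malloc (sizeof (*r));

  assert (r != NULL && s != NULL);
  r->contents = (char *) malloc (strlen (s) + 1);
  assert (r->contents != NULL);
  strcpy (r->contents, s);
  r->refs = 1;
  return r;
}

/* Counts a new reference before handing it out. */
rstring rstring_ref (rstring r)
{
  r->refs++;
  return r;
}

/* Gives up one reference. The caller may not use r after the call.
   Storage is released when no references remain. */
void rstring_release (/*@killref@*/ rstring r)
{
  r->refs--;
  if (r->refs == 0) {
    free (r->contents);
    free (r);
  }
}
```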
LCLint will report an error if a unique parameter may be aliased by another parameter or global variable.
Figure 10. Unique parameters.
6.1.2 Returned Parameters
LCLint reports an error if a function returns a reference to storage reachable
from one of its parameters (if retalias is on) since this may introduce
unexpected aliases in the body of the calling function when the result is
assigned.
The returned annotation denotes a parameter that may be aliased by the return
value. LCLint checks the call assuming the result may be an alias to the
returned parameter. Figure 11 shows an example use of a returned annotation.
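A small sketch of a returned parameter, separate from Figure 11: the hypothetical function skip_spaces returns a pointer into the storage passed as its argument, so the parameter is marked returned.

```c
#include <assert.h>
#include <stddef.h>

/* The result points into the storage referenced by s, so callers must
   treat the result as a possible alias of their argument. */
char *skip_spaces (/*@returned@*/ char *s)
{
  assert (s != NULL);
  while (*s == ' ') {
    s++;
  }
  return s;
}
```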
6.2 Exposure
LCLint detects places where the representation of an abstract type is exposed.
This occurs if a client has a pointer to storage that is part of the
representation of an instance of the abstract type. The client can then modify
or examine the storage this points to, and manipulate the value of the abstract
type instance without using its operations.
There are three ways a representation may be exposed:
typedef /*@abstract@*/ struct {
char *name;
int id;
} *employee;
...
char *employee_getName (employee e) { return e->name; }
LCLint produces a message to indicate that the return value exposes the
representation. One solution would be to return a fresh copy of e->name.
This is expensive, though, especially if we expect employee_getName is used
mainly just to get a string for searching or printing. Instead, we could
change the declaration of employee_getName to:
extern /*@observer@*/ char *employee_getName (employee e);
Now, the original implementation is correct. The declaration indicates that the result may not be modified by the caller, so it is acceptable to return shared storage.[17] LCLint checks that the return value is not modified by the caller. An error is reported if observer storage is modified directly, passed as a function parameter that may be modified, assigned to a global variable or reference derivable from a global variable that is not declared with an observer annotation, or returned as a function result or a reference derivable from the function result that is not annotated with an observer annotation.
Figure 12 shows examples of exposure problems detected by LCLint.
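The observer declaration can be put together into a compilable sketch. The constructor employee_create and the client function employee_nameLength are hypothetical additions for illustration; only employee_getName comes from the text.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef /*@abstract@*/ struct {
  char *name;
  int id;
} *employee;

/* Hypothetical constructor: stores a private copy of name. */
employee employee_create (const char *name, int id)
{
  employee e = (employee) malloc (sizeof (*e));

  assert (e != NULL && name != NULL);
  e->name = (char *) malloc (strlen (name) + 1);
  assert (e->name != NULL);
  strcpy (e->name, name);
  e->id = id;
  return e;
}

/* Returns shared storage. The observer annotation tells callers they
   may read the result but not modify it. */
/*@observer@*/ char *employee_getName (employee e)
{
  return e->name;
}

/* A client that only reads the observer result. */
size_t employee_nameLength (employee e)
{
  return strlen (employee_getName (e));
}
```

A client that wrote through the result, for example employee_getName (e)[0] = 'x', would be modifying observer storage directly.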
7. Value Constraints
LCLint can be used to constrain values of parameters, function results, global
variables, and derived storage such as structure fields. These constraints are
checked at interface points -- where a function is called or returns.
Section 7.1 describes how to constrain parameters, return values and structures
to detect use before definition errors. A similar approach is used for
restricting the use of possibly null pointers in Section 7.2. To do both well, and avoid spurious errors, information about when and if a function returns is
useful. Annotations for documenting execution control are described in
Section 7.3.
uses a local variable before it is defined, a use before definition error is reported. Use before definition checking is controlled by the usedef flag.
LCLint can do more checking than standard checkers though, because the
annotations can be used to describe what storage must be defined and what
storage may be undefined at interface points. Unannotated references are
expected to be completely defined at interface points. This means all storage
reachable from a global variable, parameter to a function, or function return
value is defined before and after a function call.
7.1.1 Undefined Parameters
Sometimes, function parameters or return values are expected to reference
undefined or partially defined storage. For example, a pointer parameter may
be intended only as an address to store a result, or a memory allocator may
return allocated but undefined storage. The out annotation denotes a pointer
to storage that may be undefined.
LCLint does not report an error when a pointer to allocated but undefined storage is passed as an out parameter. Within the body of a function, LCLint will assume an out parameter is allocated but not necessarily bound to a value, so an error is reported if its value is used before it is defined.
LCLint reports an error if storage reachable by the caller after the call is not defined when the function returns. This can be suppressed by -mustdefine. When checking a call, an actual parameter corresponding to an out parameter is assumed to be completely defined after the call returns.
When checking unannotated programs, many spurious use before definition errors
may be reported. If impouts is on, no error is reported when an
incompletely-defined parameter is passed to a formal parameter with no
definition annotation, and the actual parameter is assumed to be defined after
the call. The /*@in@*/ annotation can be used to denote a parameter that must
be completely defined, even if impouts is on. If impouts is off, there is an
implicit in annotation on every parameter with no definition annotation.
Figure 13. Use before definition.
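As a brief sketch of the out annotation (not a reproduction of Figure 13), the hypothetical function divide below uses two pointer parameters purely as addresses for results.

```c
#include <assert.h>

/* quot and rem may point to undefined storage on entry. Both are
   defined on every path before the function returns. */
void divide (int num, int den, /*@out@*/ int *quot, /*@out@*/ int *rem)
{
  assert (den != 0);
  *quot = num / den;
  *rem = num % den;
}
```

A caller may pass the addresses of uninitialized locals, and may treat them as completely defined after the call.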
7.1.2 Relaxing Checking
The reldef annotation relaxes definition checking for a particular declaration.
Storage declared with a reldef annotation is assumed to be defined when it is
used, but no error is reported if it is not defined before it is returned or
passed as a parameter.
It is up to the programmer to check reldef fields are used correctly. They
should be avoided in most cases, but may be useful for fields of structures
that may or may not be defined depending on other constraints.
7.1.3 Partially Defined Structures
The partial annotation can be used to relax checking of structure fields. A
structure with undefined fields may be passed as a partial parameter or
returned as a partial result. Inside a function body, no error is reported
when the field of a partial structure is used. After a call, all fields of a
structure that is passed as a partial parameter are assumed to be completely
defined.
7.1.4 Global Variables
Special annotations can be used in the globals list of a function declaration
(Section 4.2) to describe the states of global variables before and after the
call.
If a global is preceded by undef, it is assumed to be undefined before the call. Thus, no error is reported if the global is not defined when the function is called, but an error is reported if the global is used in the function body before it is defined.
The killed annotation denotes a global variable that may be undefined when the call returns. For globals that contain dynamically allocated storage, a killed global variable is similar to an only parameter (Section 5.2). An error is reported if it contains the only reference to storage that is not released before the call returns.
Figure 14. Annotated globals lists.
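A sketch of globals lists using undef and killed follows. The names globnum, globname, initialize and finalize are illustrative assumptions, and the clause syntax is written from the description in this section and Section 4.2 rather than copied from Figure 14.

```c
#include <assert.h>
#include <stdlib.h>

int globnum;
/*@only@*/ char *globname;

/* Neither global needs to be defined before this call, but neither
   may be used in the body before it is assigned. */
void initialize (/*@only@*/ char *name)
  /*@globals undef globnum, undef globname@*/
{
  assert (name != NULL);
  globnum = 0;
  globname = name;
}

/* globname may be undefined after the call. Its storage is released
   here because it held the only reference. */
void finalize (void)
  /*@globals killed globname@*/
{
  free (globname);
}
```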
7.2 Null Pointers
A common cause of program failures is when a null pointer is dereferenced.
LCLint detects these errors by distinguishing possibly NULL pointers at
interface boundaries.
The null annotation is used to indicate that a pointer value may be NULL. A pointer declared with no null annotation may not be NULL. If null checking is turned on (controlled by null), LCLint will report an error when a possibly null pointer is passed as a parameter, returned as a result, or assigned to an external reference with no null qualifier.
If a pointer is declared with the null annotation, the code must check that it is not NULL on all paths leading to a dereference of the pointer (or the pointer being returned or passed as a value with no null annotation). Dereferences of possibly null pointers may be protected by conditional statements or assertions (to see how assert is declared see Section 7.3) that check the pointer is not NULL.
Consider two implementations of firstChar in Figure 15. For firstChar1, LCLint reports an error since the pointer that is dereferenced is declared with a null annotation. For firstChar2, no error is reported since the true branch of the s == NULL if statement returns, so the dereference of s is only reached if s is not NULL.
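The two implementations can be sketched from the description above. This is a reconstruction from the prose, not a copy of Figure 15.

```c
#include <stddef.h>

/* Dereferences s with no check. Since s is declared with a null
   annotation, this is the case the text says is reported. */
char firstChar1 (/*@null@*/ char *s)
{
  return *s;
}

/* The s == NULL branch returns, so the dereference is reached only
   when s is not NULL. */
char firstChar2 (/*@null@*/ char *s)
{
  if (s == NULL) return '\0';
  return *s;
}
```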
A function annotated with truenull is assumed to return TRUE if its first parameter is NULL and FALSE otherwise. For example, if isNull is declared as,
/*@truenull@*/ bool isNull (/*@null@*/ char *x);
we could write firstChar2:
char firstChar2 (/*@null@*/ char *s)
{
if (isNull (s)) return '\0';
return *s;
}
No error is reported since the dereference of s is only reached if isNull(s) is
false, and since isNull is declared with the truenull annotation this means s
must not be null.
The falsenull annotation is not quite the opposite of truenull. If a function declared with falsenull returns TRUE, it means its parameter is not NULL. If it returns FALSE, the parameter may or may not be NULL.
For example, we could define isNonEmpty to return TRUE if its parameter is not NULL and has at least one character before the NUL terminator:
/*@falsenull@*/ bool isNonEmpty (/*@null@*/ char *x)
{
return (x != NULL && *x != '\0');
}
LCLint does not check that the implementation of a function declared with
falsenull or truenull is consistent with its annotation, but assumes the
annotation is correct when code that calls the function is checked.
7.2.3 Relaxing Null Checking
An additional annotation, relnull may be used to relax null checking (relnull
is analogous to reldef for definition checking). No error is reported when a
relnull value is dereferenced, or when a possibly null value is assigned to an
identifier declared using relnull.
This is generally used for structure fields that may or may not be null
depending on some other constraint. LCLint does not report an error when NULL
is assigned to a relnull reference, or when a relnull reference is
dereferenced. It is up to the programmer to ensure that this constraint is
satisfied before the pointer is dereferenced.
7.3 Execution
To detect certain errors and avoid spurious errors, it is important to know
something about the control flow behavior of called functions. Without
additional information, LCLint assumes that all functions eventually return and
execution continues normally at the call site.
The exits annotation is used to denote a function that never returns. For example,
extern /*@exits@*/ void fatalerror (/*@observer@*/ char *s);
declares fatalerror to never return. This allows LCLint to correctly analyze code like,
if (x == NULL) fatalerror ("Yikes!");
*x = 3;
Other functions may exit, but sometimes (or usually) return normally. The
mayexit annotation denotes a function that may or may not return. This doesn't
help checking much, since LCLint must assume that a function declared with
mayexit returns normally when checking the code.
To be more precise, the trueexit and falseexit annotations may be used. Similar to truenull and falsenull (see Section 7.2.1), trueexit and falseexit mean that a function always exits if the value of its first argument is TRUE or FALSE respectively. They may be used only on functions whose first argument has a boolean type.
A function declared with trueexit must exit if the value of its argument is TRUE, and a function declared with falseexit must exit if the value of its argument is FALSE. For example, the standard library declares assert as[20]:
/*@falseexit@*/ void assert (/*@sef@*/ bool /*@alt int@*/ pred);
This way, code like,
assert (x != NULL);
*x = 3;
is checked correctly, since the falseexit annotation on assert means the dereference of x is not reached if x != NULL is false.
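The exits and falseexit annotations can also be applied to user-defined functions. In the sketch below, require and store_three are hypothetical names, and the C99 bool type stands in for whatever boolean type the program uses.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Never returns: prints the message and terminates the program. */
/*@exits@*/ void fatalerror (/*@observer@*/ const char *s)
{
  fprintf (stderr, "%s\n", s);
  exit (EXIT_FAILURE);
}

/* Exits when ok is false, so code following a call may assume that
   the condition held. */
/*@falseexit@*/ void require (bool ok, /*@observer@*/ const char *msg)
{
  if (!ok) {
    fatalerror (msg);
  }
}

/* The dereference is reached only if x != NULL was true. */
void store_three (/*@null@*/ int *x)
{
  require (x != NULL, "store_three: null pointer");
  *x = 3;
}
```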
Special clauses may be used to constrain the state of a parameter or return value before or after a call. One or more special clauses may appear in a function declaration, before the modifies or globals clauses. Special clauses may be listed in any order, but the same special clause should not be used more than once. Parameters used in special clauses must be annotated with /*@special@*/ in the function header. In a special clause list, result is used to refer to the return value of the function. If result appears in a special clause, the function return value must be annotated with /*@special@*/.
The following special clauses are used to describe the definition state of parameters before and after the function is called and the return value after the function returns:
/*@uses references@*/
References in the uses clause must be completely defined before the function is called. They are assumed to be defined at function entrance when the function is checked.
/*@sets references@*/
References in the sets clause must be allocated before the function is called. They are completely defined after the function returns. When the function is checked, they are assumed to be allocated at function entrance and an error is reported if there is a path on which they are not defined before the function returns.
/*@defines references@*/
References in the defines clause must not refer to unshared, allocated storage before the function is called. They are completely defined after the function returns. When the function is checked, they are assumed to be undefined at function entrance and an error is reported if there is a path on which they are not defined before the function returns.
/*@allocates references@*/
References in the allocates clause must not refer to unshared, allocated storage before the function is called. They are allocated but not necessarily defined after the function returns. When the function is checked, they are assumed to be undefined at function entrance and an error is reported if there is a path on which they are not allocated before the function returns.
/*@releases references@*/
References in the releases clause are deallocated by the function. They must correspond to storage which could be passed as an only parameter before the function is called, and are dead pointers after the function returns. When the function is checked, they are assumed to be allocated at function entrance and an error is reported if they refer to live, allocated storage at any return point.
Additional generic special clauses can be used to describe other aspects of the state of inner storage before or after a call. Generic special clauses have the form state:constraint. The state is either pre (before the function is called), or post (after the function is called). The constraint is similar to an annotation. The following constraints are supported:
Aliasing Annotations
pre:only, post:only
pre:shared, post:shared
pre:owned, post:owned
pre:dependent, post:dependent
References refer to only, shared, owned or dependent storage before (pre) or after (post) the call.
Exposure Annotations
pre:observer, post:observer
pre:exposed, post:exposed
References refer to observer or exposed storage before (pre) or after (post) the call.
Null State Annotations
pre:isnull, post:isnull
References have the value NULL before (pre) or after (post) the call. Note, this is not the same name or meaning as the null annotation (which means the value may be NULL).
pre:notnull, post:notnull
References do not have the value NULL before (pre) or after (post) the call.
Some examples of special clauses are shown in Figure 17. The defines clause for record_new indicates that the id field of the structure pointed to by the result is defined, but the name field is not. So, record_create needs to call record_setName to define the name field. Similarly, the releases clause for record_clearName indicates that no storage is associated with the name field of its parameter after the return, so no failure to deallocate storage message is produced for the call to free in record_free.
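The record functions described for Figure 17 might look roughly like the following. This is a sketch reconstructed from the prose, so the exact clauses, the field layout and the constant stored in id are assumptions.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
  int id;
  /*@only@*/ char *name;
} *record;

/* Only the id field of the result is defined on return. */
static /*@special@*/ record record_new (void)
  /*@defines result->id@*/
{
  record r = (record) malloc (sizeof (*r));

  assert (r != NULL);
  r->id = 3;
  return r;
}

/* Defines the name field that record_new leaves undefined. */
static void record_setName (/*@special@*/ record r, /*@only@*/ char *name)
  /*@defines r->name@*/
{
  r->name = name;
}

record record_create (/*@only@*/ char *name)
{
  record r = record_new ();

  record_setName (r, name);
  return r;
}

/* After the call, no storage is associated with r->name. */
void record_clearName (/*@special@*/ record r)
  /*@releases r->name@*/
  /*@post:isnull r->name@*/
{
  free (r->name);
  r->name = NULL;
}

void record_free (/*@only@*/ record r)
{
  record_clearName (r);
  free (r);
}
```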
LCLint eliminates most of the potential problems by detecting macros with
dangerous implementations and dangerous macro invocations. Whether or not a
macro definition is checked or expanded normally depends on flag settings and
control comments (see Section 8.3). Stylized macros can also be used to define
control structures for iterating through many values (see Section 8.4).
8.1 Constant Macros
Macros may be used to implement constants. To get type-checking for constant
macros, use the constant syntactic comment:
/*@constant null char *mstring_undefined@*/
Declared constants are not expanded and are checked according to the declaration. A constant with a null annotation may be used as only storage.
# define square(x) x * x
This works fine for a simple invocation like square(i). It behaves unexpectedly, though, if it is invoked with a parameter that has a side effect.
For example, square(i++) expands to i++ * i++. Not only does this give the incorrect result, it has undefined behavior since the order in which the operands are evaluated is not defined. (See Section 10.1 for more information on how expressions exhibiting undefined evaluation order behavior are detected by LCLint.) To correct the problem we either need to rewrite the macro so that its parameter is evaluated exactly once, or prevent clients from invoking the macro with a parameter that has a side-effect.
Another possible problem with macros is that they may produce unexpected results because of operator precedence rules. The invocation, square(i+1) expands to i+1*i+1, which evaluates to i+i+1 instead of the square of i+1. To ensure the expected behavior, the macro parameter should be enclosed in parentheses where it is used in the macro body.
Macros may also behave unexpectedly if they are not syntactically equivalent to an expression. Consider the macro definition,
# define incCounts() ntotal++; ncurrent++;
This works fine unless it is used where a single statement is expected. For example,
if (x < 3) incCounts();
increments ntotal if x < 3 but always increments ncurrent.
One solution is to use the comma operator to define the macro:
# define incCounts() (ntotal++, ncurrent++)
More complicated macros can be written using a do while construction:
# define incCounts() \
do { ntotal++; ncurrent++; } while (FALSE)
LCLint detects these pitfalls in macro definitions, and checks that a macro
behaves as much like a function as possible. A client should only be able to
tell that a function was implemented by a macro if it attempts to use the macro
as a pointer to a function.
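The precedence and statement pitfalls can be observed directly in a small program. The macro and function names below are hypothetical, and 0 is used in place of FALSE so the sketch needs no extra definitions.

```c
#define SQUARE_UNSAFE(x) x * x
#define SQUARE_SAFE(x) ((x) * (x))

static int ntotal = 0;
static int ncurrent = 0;

#define INC_COUNTS_UNSAFE() ntotal++; ncurrent++;
#define INC_COUNTS_SAFE() do { ntotal++; ncurrent++; } while (0)

/* Returns 1 when the two square macros disagree for i + 1. */
int squares_differ (int i)
{
  int unsafe = SQUARE_UNSAFE (i + 1);  /* expands to i + 1 * i + 1 */
  int safe = SQUARE_SAFE (i + 1);      /* expands to ((i + 1) * (i + 1)) */

  return unsafe != safe;
}

/* Resets the counters, runs the guarded increment, and returns
   ncurrent. Only ntotal++ is guarded by the if. */
int guarded_increment_unsafe (int x)
{
  ntotal = 0;
  ncurrent = 0;
  if (x < 3)
    INC_COUNTS_UNSAFE ();
  return ncurrent;
}

/* Same as above, but the whole macro body is guarded by the if. */
int guarded_increment_safe (int x)
{
  ntotal = 0;
  ncurrent = 0;
  if (x < 3)
    INC_COUNTS_SAFE ();
  return ncurrent;
}
```

With x = 5 the unsafe version still increments ncurrent, while the safe version leaves both counters at zero.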
These checks are done by LCLint on a macro definition corresponding to a function:
extern int square (/*@sef@*/ int x);
# define square(x) ((x) *(x))
Now, LCLint will not report an error checking the definition of square even though x is used more than once.
A message will be reported, however, if square is invoked with a parameter that has a side-effect.
For the code fragment,
square (i++)
LCLint produces the message:
Parameter 1 to square is declared sef, but the argument may modify i: i++
It is also an error to pass a non-sef macro parameter as a sef macro parameter in the body of a macro definition. For example,
extern int sumsquares (int x, int y);
# define sumsquares(x,y) (square(x) + square(y))
Although x only appears once in the definition of sumsquares, it will be evaluated twice since square is expanded. LCLint reports an error when a non-sef macro parameter is passed as a sef parameter.
A parameter may be passed as a sef parameter without an error being reported,
if LCLint can determine that evaluating the parameter has no side-effects. For
function calls, the modifies clause is used to determine if a side-effect is
possible.[22] To prevent many spurious
errors, if the called function has no modifies clause, LCLint will report an
error only if sefuncon is on. Justifiably paranoid programmers will insist on
setting sefuncon on, and will add modifies clauses to unconstrained functions
that are used in sef macro arguments.
8.2.2 Polymorphism
One problem with our new definition of square is that while the original macro
would work for parameters of any numeric type, LCLint will now report an error
if the new version is used with a non-integer parameter.
We can use the /*@alt <type>,+@*/ syntax to indicate that an alternate type may be used. For example,
extern int /*@alt float@*/ square (/*@sef@*/ int /*@alt float@*/ x);
# define square(x) ((x) *(x))
declares square for both ints and floats.
Alternate types are also useful for declaring functions for which the return
value may be safely ignored (see Section 10.3.2).
8.3 Controlling Macro Checking
By default, LCLint expands macros normally and checks the resulting code after
macros have been expanded. Flags and control comments may be used to control
which macros are expanded and which are checked as functions or constants.
If the fcnmacros flag is on, LCLint assumes all macros defined with parameter lists implement functions and checks them accordingly. Parameterized macros are not expanded and are checked as functions with unknown result and parameter types (or using the types in the prototype, if one is given). The analogous flag for macros that define constants is constmacros. If it is on, macros with no parameter lists are assumed to be constants, and checked accordingly. The allmacros flag sets both fcnmacros and constmacros. If the macrofcndecl flag is set, a message reports parameterized macros with no corresponding function prototype. If the macroconstdecl flag is set, a similar message reports macros with no parameters with no corresponding constant declaration.
The macro checks described in the previous sections make sense only for macros that are intended to replace functions or constants. When fcnmacros or constmacros is on, more general macros need to be marked so they will not be checked as functions or constants, and will be expanded normally. Macros which are not meant to behave like functions should be preceded by the /*@notfunction@*/ comment. For example,
/*@notfunction@*/
# define forever for(;;)
Macros preceded by notfunction are expanded normally before regular checking is done. If a macro that is not syntactically equivalent to a statement without a semi-colon (e.g., a macro which enters a new scope) is not preceded by notfunction, parse errors may result when fcnmacros or constmacros is on.
The C language provides no mechanism for creating user-defined iterators. LCLint supports a stylized form of iterators declared using syntactic comments and defined using macros.
Iterator declarations are similar to function declarations except instead of returning a value, they assign values to their yield parameters in each iteration. For example, we could add this iterator declaration to intSet.h:
/*@iter intSet_elements (intSet s, yield int el);@*/
The yield annotation means that the variable passed as the second actual argument is declared as a local variable of type int and assigned a value in each loop iteration.
typedef /*@abstract@*/ struct {
int nelements;
int *elements;
} *intSet;
...
# define intSet_elements(s,m_el) \
{ int m_i; \
for (m_i = (0); m_i < ((s)->nelements); m_i++) { \
int m_el = (s)->elements[(m_i)];
# define end_intSet_elements }}
Each time through the loop, the yield parameter m_el is assigned to the next
value. After all values have been assigned to m_el for one iteration, the loop
terminates. Variables declared by the iterator macro (including the yield
parameter) are preceded by the macro variable namespace prefix m_ (see Section
8.2) to avoid conflicts with variables defined in the scope where the iterator
is used.
iter (<params>) stmt; end_iter
For example, a client could use intSet_elements to sum the elements of an intSet:
intSet s;
int sum = 0;
...
intSet_elements (s, el) {
sum += el;
} end_intSet_elements;
The actual parameter corresponding to a yield parameter, el, is not declared in
the function scope. Instead, it is declared by the iterator and assigned to an
appropriate value for each iteration.
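Put together, the iterator and a client can be compiled as one unit. This sketch assumes intSet is a pointer to the structure and that valid indexes run from 0 to nelements - 1. The function intSet_sum is a hypothetical client.

```c
#include <assert.h>
#include <stddef.h>

typedef /*@abstract@*/ struct {
  int nelements;
  int *elements;
} *intSet;

/*@iter intSet_elements (intSet s, yield int el);@*/
#define intSet_elements(s, m_el) \
  { int m_i; \
    for (m_i = 0; m_i < (s)->nelements; m_i++) { \
      int m_el = (s)->elements[m_i];
#define end_intSet_elements }}

/* Sums the elements of s. The yield variable el is declared by the
   iterator macro, not in the function scope. */
int intSet_sum (intSet s)
{
  int sum = 0;

  assert (s != NULL);
  intSet_elements (s, el) {
    sum += el;
  } end_intSet_elements;
  return sum;
}
```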
LCLint will do the following checks for uses of stylized iterators:
Names may be constrained by the scope of the name (external, file static, internal), the file in which the identifier is defined, the type of the identifier, and global constraints.
Of course, this is a complete jumble to the uninitiated, and that's the joke.
- Charles Simonyi, on the Hungarian naming convention
Czech[23] names denote operations and variables of abstract types by preceding the names by <type>_. The remainder of the name should begin with a lowercase character, but may use any other character besides the underscore. Types may be named using any non-underscore characters.
The Czech naming convention is selected by the czech flag. If accessczech is on, a function, variable, constant or iterator named <type>_<name> has access to the abstract type <type>.
Reporting of violations of the Czech naming convention is controlled by different flags depending on what is being declared:
czechfcns
Functions and iterators. An error is reported for a function name of the form <prefix>_<name> where <prefix> is not the name of an accessible type. Note that if accessczech is on, a type named <prefix> would be accessible in a function beginning with <prefix>_. If accessczech is off, an error is reported instead. An error is reported for a function name that does not have an underscore if any abstract types are accessible where the function is defined.
czechvars
Variables, constants and expanded macros. An error is reported if the identifier name starts with <prefix>_ and prefix is not the name of an accessible abstract type, or if an abstract type is accessible and the identifier name does not begin with <type>_ where type is the name of an accessible abstract type. If accessczech is on, the representation of the type is visible in the constant or variable definition.
czechtypes
User-defined types. An error is reported if a type name includes an underscore character.
The slovak flag selects the Slovak naming convention. Like Czech names, it may
be used with accessslovak to control access to abstract representations. The
slovakfcns, slovakvars, slovakconstants, and slovakmacros flags are analogous
to the similar Czech flags. If slovaktype is on, an error is reported if a
type name includes an uppercase letter.
9.1.3 Czechoslovak Names
Czechoslovak names are a combination of Czech names and Slovak names.
Operations may be named either <type>_ followed by any
sequence of non-underscore characters, or <type> followed by an
uppercase letter and any sequence of characters. Czechoslovak names have been
out of favor since 1993, but may be necessary for checking legacy code. The
czechoslovakfcns, czechoslovakvars, czechoslovakmacros, and
czechoslovakconstants flags are analogous to the similar Czech flags. If
czechoslovaktype is on, an error is reported if a type name contains either an
uppercase letter or an underscore character.
9.2 Namespace Prefixes
Another way to restrict names is to constrain the leading character sequences
of various kinds of identifiers. For example, the names of all user-defined
types might begin with "T" followed by an uppercase letter and all file static
names begin with an uppercase letter. This may be useful for enforcing a
namespace (e.g., all names exported by the X-windows library should begin with
"X") or just making programs easier to understand by establishing an enforced
convention. LCLint can be used to constrain identifiers in this way to detect
identifiers inconsistent with prefixes.
All namespace flags are of the form, -<context>prefix <string>. For example, the macro variable namespace restricting identifiers declared in macro bodies to be preceded by "m_" would be selected by -macrovarprefix "m_". The string may contain regular characters that may appear in a C identifier. These must match the initial characters of the identifier name. In addition, special characters (shown in Table 1) can be used to denote a class of characters.[24] The * character may be used at the end of a prefix string to specify the rest of the identifier is zero or more characters matching the character immediately before the *. For example, the prefix string "T^*" matches "T" or "TWINDOW" but not "Twin".
^   Any uppercase letter, A-Z
&   Any lowercase letter, a-z
%   Any character that is not an uppercase letter (allows lowercase letters, digits and underscore)
~   Any character that is not a lowercase letter (allows uppercase letters, digits and underscore)
$   Any letter (a-z, A-Z)
/   Any letter or digit (A-Z, a-z, 0-9)
?   Any character valid in a C identifier
#   Any digit, 0-9
Table 1. Prefix character codes.
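One way to read the matching rules is as a small matcher over the character codes in Table 1. The sketch below is an interpretation, not LCLint's implementation: it assumes a trailing X* means every remaining character must match X, and that without a trailing * only the leading characters are constrained. The names class_matches and prefix_matches are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Does identifier character c belong to the class named by code p? */
static bool class_matches (char p, char c)
{
  unsigned char u = (unsigned char) c;

  switch (p) {
  case '^': return isupper (u) != 0;
  case '&': return islower (u) != 0;
  case '%': return (isalnum (u) || c == '_') && !isupper (u);
  case '~': return (isalnum (u) || c == '_') && !islower (u);
  case '$': return isalpha (u) != 0;
  case '/': return isalnum (u) != 0;
  case '?': return isalnum (u) || c == '_';
  case '#': return isdigit (u) != 0;
  default:  return p == c;  /* regular character: must match exactly */
  }
}

/* Returns true if name is consistent with the prefix pattern. */
bool prefix_matches (const char *pattern, const char *name)
{
  size_t i = 0;  /* index into pattern */
  size_t j = 0;  /* index into name */

  assert (pattern != NULL && name != NULL);
  while (pattern[i] != '\0') {
    if (pattern[i + 1] == '*' && pattern[i + 2] == '\0') {
      /* Trailing "X*": every remaining character must match X. */
      for (; name[j] != '\0'; j++) {
        if (!class_matches (pattern[i], name[j])) return false;
      }
      return true;
    }
    if (name[j] == '\0' || !class_matches (pattern[i], name[j])) {
      return false;
    }
    i++;
    j++;
  }
  return true;  /* pattern consumed: the rest of the name is free */
}
```

Under this reading, "T^&*" accepts "TWindow" (T, one uppercase letter, then lowercase letters only) and rejects "TWINDOW".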
Different prefixes can be selected for the following identifier
contexts:
macrovarprefix
Any variable declared inside a macro body
uncheckedmacroprefix
Any macro that is not checked as a function or constant (see Section 8.4)
tagprefix
Tags for struct, union and enum declarations
enumprefix
Members of enum types
typeprefix
Name of a user-defined type
filestaticprefix
Any identifier with file static scope
globvarprefix
Any variable (not of function type) with global scope
externalprefix
Any exported identifier
If an identifier is in more than one of the namespace contexts, the most specific defined namespace prefix is used (e.g., a global variable is also an exported identifier, so if globvarprefix is set, the variable name is checked against it; if not, the identifier is checked against the externalprefix).
For each prefix flag, a corresponding flag named <prefixname>exclude controls whether errors are reported if identifiers in a different namespace match the namespace prefix. For example, if macrovarprefixexclude is on, LCLint checks that no identifier that is not a variable declared inside a macro body uses the macro variable prefix.
Here is a (somewhat draconian) sample naming convention:
-uncheckedmacroprefix "~*"
unchecked macros have no lowercase letters
-typeprefix "T^&*"
all type names begin with T followed by an uppercase letter; the rest of the name is all lowercase letters
+typeprefixexclude
no identifier that does not name a user-defined type may begin with the type name prefix (set above)
-filestaticprefix "^&&&"
file static scope variables begin with an uppercase letter and three lowercase letters
-globvarprefix "G"
all global variables start with G
+globvarprefixexclude
no identifier that is not a global variable starts with G
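To make the prefix character codes concrete, here is a minimal sketch of a matcher for prefix patterns. It is not LCLint's implementation. The function names code_matches and name_has_prefix are hypothetical, and two readings are assumptions inferred from the sample convention rather than taken from the table: that "^" stands for an uppercase letter, and that a trailing "*" lets the preceding code repeat for the rest of the name (the sketch requires the preceding code to match at least once).

```c
#include <ctype.h>
#include <stdbool.h>

/* Returns true if identifier character ch satisfies the prefix code.
   Codes follow Table 1; '^' (uppercase letter) is an assumption inferred
   from the sample naming convention. Any other code is a literal. */
static bool code_matches(char code, char ch)
{
    unsigned char c = (unsigned char) ch;

    /* Only characters that are valid in a C identifier can match. */
    if (!(isalnum(c) || c == '_')) {
        return false;
    }
    switch (code) {
    case '^': return isupper(c) != 0;   /* assumed: any uppercase letter */
    case '&': return islower(c) != 0;   /* any lowercase letter */
    case '%': return isupper(c) == 0;   /* anything but an uppercase letter */
    case '~': return islower(c) == 0;   /* anything but a lowercase letter */
    case '$': return isalpha(c) != 0;   /* any letter */
    case '/': return isalnum(c) != 0;   /* any letter or digit */
    case '?': return true;              /* any identifier character */
    case '#': return isdigit(c) != 0;   /* any digit */
    default:  return code == ch;        /* literal character such as 'T' */
    }
}

/* Returns true if name begins with the given prefix pattern.
   Assumed semantics: a '*' in the pattern means every remaining
   character of the name must match the code just before the '*'. */
bool name_has_prefix(const char *pattern, const char *name)
{
    char previous = '\0';

    for (; *pattern != '\0'; pattern++) {
        if (*pattern == '*') {
            for (; *name != '\0'; name++) {
                if (!code_matches(previous, *name)) {
                    return false;
                }
            }
            return true;
        }
        if (*name == '\0' || !code_matches(*pattern, *name)) {
            return false;
        }
        previous = *pattern;
        name++;
    }
    return true;   /* the whole prefix was consumed */
}
```

Under these assumptions, name_has_prefix("T^&*", "TPoint") holds while name_has_prefix("T^&*", "TPointList") does not, because the "L" breaks the run of lowercase letters.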
9.3 Naming Restrictions
Additional naming restrictions can be used to check that names do not conflict with names reserved for the standard library, and that identifiers are sufficiently distinct (either for the compiler and linker, or for the programmer). Restrictions may be different for names that are needed by the linker (external names) and names that are only needed during compilation (internal names). Names of non-static functions and global variables are external; all other names are internal.
9.3.1 Reserved Names
Many names are reserved for the implementation and standard library. A complete list of reserved names can be found in [vdL, p. 126-128] or [ANSI, Section 4]. Some name prefixes such as str followed by a lowercase character are reserved for future library extensions. Most C compilers do not detect naming conflicts, and they can lead to unpredictable program behavior. If ansireserved is on, LCLint reports errors for external names that conflict with reserved names. If ansireservedinternal is on, errors are also reported for internal names.
9.3.2 Distinct Identifiers
The decision to retain the old six-character case-insensitive restriction on significance was most painful.
- ANSI C Rationale
LCLint can check that identifiers differ within a given number of characters, optionally ignoring alphabetic case and differences between characters that look similar. The number of significant characters may be different for external and internal names.
Using +distinctexternalnames sets the number of significant characters for external names to six and makes alphabetical case insignificant for external names. This is the minimum significance acceptable in an ANSI-conforming compiler. Most modern compilers exceed these minimums (which are particularly hard to follow if one uses the Czech or Slovak naming convention). The number of significant characters can be changed using the externalnamelength <number> flag. If externalnamecaseinsensitive is on, alphabetical case is ignored in comparing external names. LCLint reports identifiers that differ only in alphabetic case.
For internal identifiers, a conforming compiler must recognize at least 31 characters and treat alphabetical cases distinctly. Nevertheless, it may still be useful to check that internal names are more distinct than required by the compiler to minimize the likelihood that identifiers are confused in the program. Analogously to external names, the internalnamelength <number> flag sets the number of significant characters in an internal name and internalnamecaseinsensitive sets the case sensitivity. The internalnamelookalike flag further restricts distinctions between identifiers. When set, similar-looking characters match -- the lowercase letter "l" matches the uppercase letter "I" and the number "1"; the letter "O" or "o" matches the number "0"; "5" matches "S"; and "2" matches "Z". Identifiers that are not distinct except for look-alike characters will produce an error message. External names are also internal names, so they must satisfy both the external and internal distinct identifier checks.
Figure 18 illustrates some of the name checking done by LCLint.
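The distinctness rules described above can be sketched as a comparison of two identifiers over their significant characters. This is an illustration, not LCLint's code. The names chars_match and names_clash are hypothetical, and the way the case and look-alike options combine (a character pair matches if either rule accepts it) is an assumption.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Maps each look-alike character to one representative:
   l, I, 1 -> '1';  O, o, 0 -> '0';  S, 5 -> '5';  Z, 2 -> '2'. */
static char fold_lookalike(char c)
{
    switch (c) {
    case 'l': case 'I': case '1': return '1';
    case 'O': case 'o': case '0': return '0';
    case 'S': case '5':           return '5';
    case 'Z': case '2':           return '2';
    default:                      return c;
    }
}

/* Returns true if two characters count as the same under the options. */
static bool chars_match(char a, char b, bool ignore_case, bool lookalike)
{
    if (a == b) {
        return true;
    }
    if (ignore_case &&
        tolower((unsigned char) a) == tolower((unsigned char) b)) {
        return true;
    }
    return lookalike && fold_lookalike(a) == fold_lookalike(b);
}

/* Returns true if two different identifiers are NOT distinct within the
   first `significant` characters, i.e. a distinctness check would
   complain about them. */
bool names_clash(const char *a, const char *b, size_t significant,
                 bool ignore_case, bool lookalike)
{
    for (size_t i = 0; i < significant; i++) {
        if (!chars_match(a[i], b[i], ignore_case, lookalike)) {
            return false;          /* found a distinguishing character */
        }
        if (a[i] == '\0') {
            return true;           /* both names ended without differing */
        }
    }
    return true;                   /* identical in all significant characters */
}
```

With six significant characters and case ignored (the effect the text attributes to +distinctexternalnames), parse_header_v1 and parse_header_v2 clash because they only differ after the sixth character.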
All side effects before a sequence point must be complete before the sequence point, and no evaluations after the sequence point shall have taken place [ANSI, Section 2.1.2.3]. Between sequence points, side effects and evaluations may take place in any order. Hence, the order in which expressions or arguments are evaluated is not specified. Compilers are free to evaluate function arguments and parts of expressions (that do not contain sequence points) in any order. The behavior of code that uses a value that is modified by another expression that is not required to be evaluated before or after the other use is undefined.
LCLint detects instances where undetermined order of evaluation produces undefined behavior. If modifies clauses and globals lists are used, this checking is enabled in expressions involving function calls. Evaluation order checking is controlled by the evalorder flag.
When checking systems without modifies and globals information, evaluation order checking may report errors when unconstrained functions are called in procedure arguments. Since LCLint has no annotations to constrain what these functions may modify, it cannot be guaranteed that the evaluation order is defined if another argument calls an unconstrained function or uses a global variable or storage reachable from a parameter to the unconstrained function. It is best to add modifies and globals clauses to constrain the unconstrained functions in ways that eliminate the possibility of undefined behavior. For large legacy systems, this may require too much effort. Instead, the -evalorderuncon flag may be used to prevent reporting of undefined behavior due to the order of evaluation of unconstrained functions.
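As a small illustration of the problem, the sketch below calls two functions in the arguments of a single call, one of which modifies a global that the other reads. All names are hypothetical, and the globals and modifies annotations are written in the stylized-comment form as an assumption about their spelling. The result of order_dependent depends on which argument the compiler evaluates first; order_independent forces the order with separate statements.

```c
/* Hypothetical shared counter used by the helpers below. */
static int counter = 0;

/* Increments the counter and returns the new value. */
int next_id(void) /*@globals counter@*/ /*@modifies counter@*/
{
    return ++counter;
}

/* Reads the counter without changing it. */
int current_id(void) /*@globals counter@*/
{
    return counter;
}

int difference(int a, int b)
{
    return a - b;
}

/* Order-dependent: yields 1 - 1 if next_id runs first,
   1 - 0 if current_id runs first. */
int order_dependent(void)
{
    counter = 0;
    return difference(next_id(), current_id());
}

/* Order fixed by sequencing the calls in separate statements. */
int order_independent(void)
{
    int first;
    int second;

    counter = 0;
    first = next_id();
    second = current_id();
    return difference(first, second);
}
```

A call like the one in order_dependent is the kind of expression evaluation order checking is meant to catch once the two helpers carry modifies and globals information.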
Figure 20 shows examples of infinite loops detected by LCLint. An error is reported for the loop in line 14, since neither of the values used in the loop condition (x directly and glob1 through the call to f) is modified by the body of the loop. If the declaration of g is changed to include glob1 in the modifies clause no error is reported. (In this example, if we assume the annotations are correct, then the programmer has probably called the wrong function in the loop body. This isn't surprising, given the horrible choices of function and variable names!)
If an unconstrained function is called within the loop body, LCLint will assume that it modifies a value used in the condition test and not report an infinite loop error, unless infloopsuncon is on. If infloopsuncon is on, LCLint will report infinite loop errors for loops where there is no explicit modification of a value used in the condition test, but where there may be an undetected modification through a call to an unconstrained function (e.g., line 15 in Figure 20).
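The following sketch shows the shape of loop this check targets. It is not the code from Figure 20. The names are hypothetical and the annotations are an assumption about how the helpers would be constrained. In drain_suspicious nothing in the body changes the value tested by the condition, so the loop cannot terminate once entered; drain calls a helper that is declared to modify that value.

```c
/* Hypothetical amount of outstanding work. */
static int remaining = 3;

/* Consumes one unit of work. */
void consume(void) /*@globals remaining@*/ /*@modifies remaining@*/
{
    if (remaining > 0) {
        remaining--;
    }
}

/* Touches nothing that the loops below test. */
void note_progress(void) /*@modifies nothing@*/
{
}

/* Suspicious: the condition reads `remaining`, but the body only calls
   a function that modifies nothing. Shown for contrast; never call it
   while remaining > 0. */
void drain_suspicious(void)
{
    while (remaining > 0) {
        note_progress();
    }
}

/* Well-formed: the body calls a function that modifies `remaining`.
   Returns the number of units consumed. */
int drain(void)
{
    int steps = 0;

    while (remaining > 0) {
        consume();
        steps++;
    }
    return steps;
}
```

If both helpers were left unannotated, the text above suggests the loop in drain_suspicious would only be reported when infloopsuncon is on.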
For switches on enum types, LCLint reports an error if a member of the enumerator does not appear as a case in the switch body (and there is no default case). (Controlled by misscase.)
An example of switch checking is shown in Figure 21.
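A minimal sketch of the enum case check, separate from Figure 21, is given below. The type and function names are hypothetical. The first function omits BLUE and has no default case, which is the situation the misscase check is described as reporting; the second covers every member.

```c
typedef enum { RED, GREEN, BLUE } color;

/* Incomplete: BLUE is not handled and there is no default case,
   so a BLUE argument silently falls out of the switch. */
const char *color_name_incomplete(color c)
{
    switch (c) {
    case RED:
        return "red";
    case GREEN:
        return "green";
    }
    return "unknown";
}

/* Complete: every member of the enum appears as a case. */
const char *color_name(color c)
{
    switch (c) {
    case RED:
        return "red";
    case GREEN:
        return "green";
    case BLUE:
        return "blue";
    }
    return "unknown";
}
```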
LCLint reports an error if the marker preceding a break is not consistent with its effect. An error is reported if innerbreak precedes a break that is not breaking an inner loop, switchbreak precedes a break that is not breaking a switch, or loopbreak precedes a break that is not breaking a loop.
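As an illustration, the sketch below marks a break inside a nested loop with innerbreak. The function name is hypothetical, and writing the marker as a stylized comment immediately before the break is an assumption about its placement. The marker documents that only the inner loop is being left; the outer loop continues with the next row.

```c
#include <stddef.h>

/* Counts the non-negative cells at the start of each row of a
   rows x cols grid stored row by row. Counting in a row stops at
   its first negative cell. */
size_t count_leading_nonnegative(const int *grid, size_t rows, size_t cols)
{
    size_t count = 0;

    for (size_t r = 0; r < rows; r++) {
        for (size_t c = 0; c < cols; c++) {
            if (grid[r * cols + c] < 0) {
                /* Leaves only the inner (column) loop. */
                /*@innerbreak@*/ break;
            }
            count++;
        }
    }
    return count;
}
```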
if (x == 0) { return "nil"; }
else if (x == 1) { return "many"; }
produces an error message since the second if has no matching else branch.
Figure 22. Statements with no effect.
Alternate types (Section 8.2.2) can be used to declare functions that return values that may safely be ignored by declaring the result type to alternately be void. Several functions in the standard library are specified to alternately return void to prevent ignored return value errors for standard library functions (e.g., strcpy) where the result may be safely ignored (see Appendix F).
Figure 23 shows examples of ignored return value errors reported by LCLint.
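Independently of Figure 23, the sketch below shows why an ignored result matters. The function names are hypothetical. store_value reports failure through its return value; set_default_unchecked discards that value, so a failed store goes unnoticed, while set_default passes the failure on to its caller.

```c
#include <stddef.h>

/* Stores value in *slot. Returns 0 on success, -1 if slot is NULL. */
int store_value(int *slot, int value)
{
    if (slot == NULL) {
        return -1;
    }
    *slot = value;
    return 0;
}

/* Ignores the status: a NULL slot fails silently. This is the kind of
   call an ignored return value check is meant to flag. */
void set_default_unchecked(int *slot)
{
    store_value(slot, 42);
}

/* Uses the status and propagates failure to the caller. */
int set_default(int *slot)
{
    if (store_value(slot, 42) != 0) {
        return -1;
    }
    return 0;
}
```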