LCLint Sample - Naming Conventions

Changes from Checks Mode 2

Differences

Fixed errors reported in second iteration.

Now, running LCLint in checks mode detects no anomalies.

Modules

   * employee - employee datatype (LCL, Code, Header)
   * empset - sets of employees (LCL, Code, Header)
   * dbase - database of employees (LCL, Code, Header)
   * eref - reference to an employee (LCL, Code, Header)
   * erc - collection of erefs (LCL, Code, Header)
   * ereftab - table of employees and erefs (LCL, Code, Header)

Naming Conventions

Now we check that the code conforms to a naming convention. There is no
defined naming convention, so we will make one up that is close to what the
code already (almost) follows.

Czech Names

The code most closely fits the Czech naming convention, where identifiers
are preceded by the associated type name followed by an underscore. We use
the +czech flag to select the Czech naming convention. LCLint reports 10
anomalies. The first message reports a constant that does not follow the
Czech naming convention:

eref.lcl:7,15: Constant erefNIL name is not consistent with Czech naming
                  convention.  The name should begin with eref_
  Constant name is not consistent with Czech naming convention.  Use
  -czechconsts to suppress message.

In fact, it follows the Slovak convention. We could use the
+slovakconstants flag to require that constants follow the Slovak instead of
the Czech naming convention. Instead, we change the name to eref_undefined.
After changing the declaration, we can run lclint using +repeatunrecog to
find all the places where erefNIL appears and replace them with
eref_undefined. (Of course, if this were a larger system we would want to
use emacs tags and M-x tags-query-replace to do this more efficiently.) The
next six messages report function names that are inconsistent with the Czech
naming convention:

< reading spec dbase.lcl >
dbase.lcl:17: Function hire name is not consistent with Czech naming
                 convention.  Accessible types: db
  Function or iterator name is not consistent with Czech naming convention.
  Use -czechfcns to suppress message.
dbase.lcl:32: Function uncheckedHire name is not consistent with Czech naming
                 convention.  Accessible types: db
dbase.lcl:41: Function fire name is not consistent with Czech naming
                 convention.  Accessible types: db
dbase.lcl:49: Function query name is not consistent with Czech naming
                 convention.  Accessible types: db
dbase.lcl:57: Function promote name is not consistent with Czech naming
                 convention.  Accessible types: db
dbase.lcl:68: Function setSalary name is not consistent with Czech naming
                 convention.  Accessible types: db

The names are in the dbase module where the only accessible type is the
specification-only type db. We add db_ in front of the names.

The other message is for check, which was added to bool.h:

bool.h:34,29: Function check name is not consistent with Czech naming
                 convention.  Accessible types: bool

The check macro really does not belong in the bool module. In a real
program, we would add a separate utilities file. Here, we add a
/*@noaccess@*/ comment before check is declared. Since there are no
accessible types now, check is a valid function name.

The final two messages report type names that are inconsistent with the
Czech naming convention:

eref.h:9,30: Datatype eref_status name violates Czech naming convention.  Czech
                datatype names should not use the _ charater.
  Type name is not consistent with Czech naming convention.  Czech type names
  must not use the underscore character.  Use -czechtypes to suppress message.
eref.h:14,3: Datatype eref_ERP name violates Czech naming convention.  Czech
                datatype names should not use the _ charater.

Since the Czech prefix is distinguished by the underscore character, names
of types cannot use the underscore character. The types are renamed
erefStatus and erefTable.
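
The two Czech rules applied above can be sketched in C. This is only an
illustration of the convention, not LCLint's implementation: the helper
names czech_has_prefix and czech_valid_type_name are hypothetical, and the
sketch ignores which types are accessible at the point of declaration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if `name` starts with `type` followed by an
   underscore, the Czech rule for names associated with a type. */
bool czech_has_prefix(const char *name, const char *type)
{
    size_t len;

    assert(name != NULL && type != NULL);
    len = strlen(type);
    /* strncmp succeeds only if name has at least len matching characters,
       so reading name[len] is safe. */
    return strncmp(name, type, len) == 0 && name[len] == '_';
}

/* Hypothetical helper: true if `type_name` contains no underscore, since
   the underscore is reserved to separate the type prefix from the rest. */
bool czech_valid_type_name(const char *type_name)
{
    assert(type_name != NULL);
    return strchr(type_name, '_') == NULL;
}
```

Under this sketch, eref_undefined and db_hire pass the prefix check for the
types eref and db, erefNIL and hire do not, and eref_status is rejected as a
type name while erefStatus is accepted.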

Distinct Names

LCLint can detect names that are not sufficiently different from other names
in the program. This can be necessary to check portability to old compilers
that only use the first n characters of an identifier. It is also useful for
the programmer, to reduce the possibility of using the wrong names. Running
LCLint using +distinctexternalnames produces 34 messages. The default number
of significant characters in an external name is 6, and alphabetic case is
not significant in comparisons. (This is the minimum that is acceptable in
an ANSI conforming compiler.)

If we were really determined to have a program that is portable to old
systems, we should change these names to be different in the first six
characters. (We would probably have to abandon the Czech naming convention
to do this.) Instead, we use -externalnamelength n to find the minimum
number of characters used in comparisons. We can try different values to see
how many errors are reported. With -externalnamelength 12, 2 errors are
reported. With -externalnamelength 14, no errors are reported.

We haven't made the program any more portable, but at least this is clearly
documented now. Someone trying to compile the program in an environment
where fewer than 14 characters are used for external names will need to edit
the source code first.
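
The comparison behind this check, and the search for a workable name
length, can be sketched in C. This is a minimal illustration rather than
LCLint's algorithm: names_clash and min_distinct_length are hypothetical
helpers, and the names empset_insert and empset_intersect used in the note
below are invented for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: nonzero if `a` and `b` are indistinguishable when
   only the first `n` characters are significant and case is ignored. */
int names_clash(const char *a, const char *b, size_t n)
{
    size_t i;

    assert(a != NULL && b != NULL);
    for (i = 0; i < n; i++) {
        int ca = tolower((unsigned char) a[i]);
        int cb = tolower((unsigned char) b[i]);

        if (ca != cb) {
            return 0;
        }
        if (ca == '\0') {
            return 1;   /* both names ended: they are identical */
        }
    }
    return 1;           /* equal throughout the first n characters */
}

/* Hypothetical helper: smallest number of significant characters, up to
   `limit`, at which no two of the `count` names clash; 0 if none works. */
size_t min_distinct_length(const char *const names[], size_t count,
                           size_t limit)
{
    size_t n, i, j;

    for (n = 1; n <= limit; n++) {
        int clash = 0;

        for (i = 0; i < count && !clash; i++) {
            for (j = i + 1; j < count && !clash; j++) {
                clash = names_clash(names[i], names[j], n);
            }
        }
        if (!clash) {
            return n;
        }
    }
    return 0;
}
```

With six significant characters, the invented names empset_insert and
empset_intersect clash, because both begin with empset. They first differ
at the tenth character, so ten is the smallest length that separates them.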

Internal names can be checked similarly, except the default length is 31
characters. We can use +internalnamelookalike to check that names do not
look the same (e.g., differ only in lookalike characters like 1 and l).
Checking reports no errors.
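
A lookalike comparison can be sketched as follows. The set of confusable
characters used here (1, l and I; 0 and O) is an assumption for
illustration and may not match the set LCLint uses; names_lookalike is a
hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Map easily confused characters onto one representative. The groups
   chosen here are an assumption made for this sketch. */
static char lookalike_class(char c)
{
    switch (c) {
    case '1': case 'l': case 'I':
        return '1';
    case '0': case 'O':
        return '0';
    default:
        return c;
    }
}

/* Hypothetical helper: nonzero if `a` and `b` are different names of the
   same length that differ only in lookalike characters. */
int names_lookalike(const char *a, const char *b)
{
    size_t i;
    int differ = 0;

    assert(a != NULL && b != NULL);
    for (i = 0; a[i] != '\0' && b[i] != '\0'; i++) {
        if (a[i] != b[i]) {
            if (lookalike_class(a[i]) != lookalike_class(b[i])) {
                return 0;   /* a real, visible difference */
            }
            differ = 1;
        }
    }
    /* Both strings must end together, and must not be identical. */
    return differ && a[i] == b[i];
}
```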

Namespace Prefixes

The Czech naming convention prescribes prefixes for names associated with
abstract types. We can use specific namespace prefixes to restrict other
names, and place more restrictions on the abstract type names.

One common convention is that names of expanded macros should use all
uppercase letters. This is expressed by -uncheckedmacroprefix "^~*". That
is, an uppercase letter followed by zero or more characters that are not
lowercase letters.
Macros that implement functions or constants and are checked by LCLint do
not have to match the uncheckedmacroprefix, since clients should not need to
be aware that the implementation is a macro. One message reports a violation
of this convention, for the macro employeeFormat.

Since employeeFormat is preceded by /*@notfunction@*/, it is expanded
normally and is in the unchecked macro namespace. We initially rename it to
EMPLOYEEFORMAT. This produces one new message:

employee.h:8: Name EMPLOYEEFORMAT is reserved for future ANSI library
    extensions. Macros beginning with E and a digit or uppercase letter may be
    added to <errno.h>. (See ANSI, Section 4.13.1)
  External name is reserved for system in ANSI standard.  Use -ansireserved to
  suppress message.

Names beginning with E may be reserved for the ANSI library, so we should
use a different name. It is changed to FORMATEMPLOYEE.

We might want to use a similar convention for enumerator members. We add
-enumprefix "^^~*". This means an enumerator must start with two capital
letters, and the rest must be all non-lowercase letters. Fourteen messages
report violations of the enum prefix. We fix these by changing the names to
use all capitals.
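
How the two prefix patterns constrain a name can be sketched with a small
matcher. It handles only the symbols used in this section, assuming ^
stands for an uppercase letter, ~ for any character that is not a lowercase
letter, and * lets the preceding class repeat to the end of the name. The
helper prefix_matches is hypothetical and is not LCLint's own matcher.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Test one character against one pattern symbol. Symbols other than '^'
   and '~' are treated as literal characters in this sketch. */
static int class_matches(char symbol, char c)
{
    unsigned char uc = (unsigned char) c;

    switch (symbol) {
    case '^':
        return isupper(uc) != 0;
    case '~':
        return islower(uc) == 0;
    default:
        return symbol == c;
    }
}

/* Hypothetical helper: nonzero if `name` satisfies `pattern`. */
int prefix_matches(const char *pattern, const char *name)
{
    size_t p = 0;
    size_t i = 0;

    assert(pattern != NULL && name != NULL);
    while (pattern[p] != '\0') {
        if (pattern[p + 1] == '*') {
            /* The class must hold for every remaining character. */
            while (name[i] != '\0') {
                if (!class_matches(pattern[p], name[i])) {
                    return 0;
                }
                i++;
            }
            return 1;
        }
        if (name[i] == '\0' || !class_matches(pattern[p], name[i])) {
            return 0;
        }
        p++;
        i++;
    }
    return 1;   /* pattern exhausted: the rest of the name is unconstrained */
}
```

With "^~*", FORMATEMPLOYEE matches and employeeFormat does not. With
"^^~*", a name needs at least two leading uppercase letters, so a
single-letter name such as A is rejected.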

Initialization File

Now, we move the name convention flags into a .lclintrc file. The .lclintrc
file in the current directory is read before checking begins. If we wanted
this naming convention to apply to code in other directories too, we would
put the flags in the .lclintrc file in our home directory.
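
As a sketch, the resulting file might look like the following. It collects
only the flags used in this section and assumes they are written in the
file exactly as they are on the command line; the choice of one flag per
line is for readability.

```
+czech
+distinctexternalnames
-externalnamelength 14
+internalnamelookalike
-uncheckedmacroprefix "^~*"
-enumprefix "^^~*"
```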
