Perl Version
Documentation
Version
Comments
Warnings and Strict
Lexical Variables
Exporting
Pass by Reference
Garbage Collection
DESTROY
die() and exit()
shift and @_
Tests
STDIN/STDOUT
System Calls

STYLE

Terminology
Names

Code formatting

Indents and Blank Space
Parentheses
Braces
Operators
Other

INTERNATIONALIZATION

String extraction styleguide

CODING PRCEDURE
BUG REPORTS, PATCHES
TO DO
CHANGES
VERSION

NAME

RT::StyleGuide - RT Style Guide

INTRODUCTION

All code and documentation that is submitted to be included in the RT distribution should follow the style in this document. This is not to try to stifle your creativity, but to make life easier for everybody who has to work with your code, and to aid those who are not quite sure how to do something.

These conventions below apply to perl modules, web programs, and command-line programs, specifically, but also might apply to some degree to any Perl code written for use in RT.

Note that these are all guidelines, not unbreakable rules. If you have a really good need to break one of the rules herein, however, then it is best to ask on the rt-devel mailing list first.

Note that with much of this document, it is not so much the Right Way as it is Our Way. We need to have conventions in order to make life easier for everyone. So don't gripe, and just follow it, because you didn't get a good grade in "Plays Well With Others" in kindergarten and you want to make up for it now.

If you have any questions, please ask us on the rt-devel mailing list:

	http://www.bestpractical.com/rt/lists.html

We don't always follow this guide. We are making changes throughout our code to be in line with it. But just because we didn't do it yet, that is no excuse. Do it anyway. :-)

This document is subject to change at the whims of the core RT team. We hope to add any significant changes at the bottom of the document.

CODING PRINCIPLES

Perl Version

We code everything to perl 5.6.1. Some features require advanced unicode features in perl 5.8.0. It is acceptable that unicode features work only for US-ASCII on perl 5.6.1.

Documentation

All modules will be documented using the POD examples in the module boilerplate. The function, purpose, use of the module will be explained, and each public API will be documented with name, description, inputs, outputs, side effects, etc.

If an array or hash reference is returned, document the size of the array (including what each element is, as appropriate) and name each key in the hash. For complex data structures, map out the structure as appropriate (e.g., name each field returned for each column from a DB call; yes, this means you shouldn't use "SELECT *", which you shouldn't use anyway).

Also document what kind of data returned values are. Is it an integer, a block of HTML, a boolean?

All command-line program options will be documented using the boilerplate code for command-line programs, which doesn't yet exist. Each available function, switch, etc. should be documented, along with a statement of function, purpose, use of the program. Do not use the same options as another program, for a different purpose.

All web templates should be documented with a statement of function, purpose, and use in a mason comment block.

Any external documents, and documentation for command-line programs and modules, should be written in POD, where appropriate. From there, they can be translated to many formats with the various pod2* translators. Read the perlpod manpage before writing any POD, because although POD is not difficult, it is not what most people are used to. It is not a regular markup language; it is just a way to make easy documentation for translating to other formats. Read, and understand, the perlpod manpage, and ask us or someone else who knows if you have any questions.

Version

Our distribution versions use tuples, where the first number is the major revision, the second number is the version, and third number is the subversion. Odd-numbered versions are development versions. Examples:

	1.0.0		First release of RT 1
	1.0.1		Second release of RT 1.0
	1.0.10		etc.
	1.1.0		First development release of RT 1.2 (or 2.0)
	2.0.0		First release of RT 2

Versions can be modified with a hyphen followed by some text, for special versions, or to give extra information. Examples:

	2.0.0-pre1	Notes that this is not final, but preview

In perl 5.6.0, you can have versions like v2.0.0, but this is not allowed in previous versions of perl. So to convert a tuple version string to a string to use with $VERSION, use a regular integer for the revision, and three digits for version and subversion. Examples:

	1.1.6	->	1.001006
	2.0.0	->	2.000000

This way, perl can use the version strings in greater-than and less-than comparisons.

Comments

All code should be self-documenting as much as possible. Only include necessary comments. Use names like "$ticket_count", so you don't need to do something like:

	# ticket count
	my $tc = 0;

Include any comments that are, or might be, necessary in order for someone else to understand the code. Sometimes a simple one-line comment is good to explain what the purpose of the following code is for. Sometimes each line needs to be commented because of a complex algorithm. Read Kernighan & Pike's Practice of Programming about commenting. Good stuff, Maynard.

Warnings and Strict

All code must compile and run cleanly with "use strict" enabled and the perl "-w" (warnings) option on. If you must do something that -w or strict complains about, there are workarounds, but the chances that you really need to do it that way are remote.

Lexical Variables

Use only lexical variables, except for special global variables ($VERSION, %ENV, @ISA, $!, etc.) or very special circumstances (see %HTML::Mason::Commands::session ). Global variables for regular use are never appropriate. When necessary, "declare" globals with "use vars" or "our()".

A lexical variable is created with my(). A global variable is pre-existing (if it is a special variable), or it pops into existence when it is used. local() is used to tell perl to assign a temporary value to a variable. This should only be used with special variables, like $/, or in special circumstances. If you must assign to any global variable, consider whether or not you should use local().

local() may also be used on elements of arrays and hashes, though there is seldom a need to do it, and you shouldn't.

Exporting

Do not export anything from a module by default. Feel free to put anything you want to in @EXPORT_OK, so users of your modules can explicitly ask for symbols (e.g., "use Something::Something qw(getFoo setFoo)"), but do not export them by default.

Pass by Reference

Arrays and hashes should be passed to and from functions by reference only. Note that a list and an array are NOT the same thing. This is perfectly fine:

	return($user, $form, $constants);

An exception might be a temporary array of discrete arguments:

	my @return = ($user, $form);
	push @return, $constants if $flag;
	return @return;

Although, usually, this is better (faster, easier to read, etc.):

	if ($flag) {
		return($user, $form, $constants);
	} else {
		return($user, $form);
	}

We need to talk about Class::ReturnValue here.

Garbage Collection

Perl does pretty good garbage collection for you. It will automatically clean up lexical variables that have gone out of scope and objects whose references have gone away. Normally you don't need to worry about cleaning up after yourself, if using lexicals.

However, some glue code, code compiled in C and linked to Perl, might not automatically clean up for you. In such cases, clean up for yourself. If there is a method in that glue to dispose or destruct, then use it as appropriate.

Also, if you have a long-running function that has a large data structure in it, it is polite to free up the memory as soon as you are done with it, if possible.

	my $huge_data_structure = get_huge_data_structure();
	do_something_with($huge_data_structure);
	undef $huge_data_structure;

DESTROY

All object classes must provide a DESTROY method. If it won't do anything, provide it anyway:

	sub DESTROY { }

die() and exit()

Don't do it. Do not die() or exit() from a web template or module. Do not call kill 9, $$. Don't do it.

In command-line programs, do as you please.

shift and @_

Do not use @_. Use shift. shift may take more lines, but Jesse thinks it leads to cleaner code.

	my $var = shift;			# right
	my($var) = @_;				# ick. no
	sub foo { uc $_[0] }			# icky. sometimes ok.

	my($var1, $var2) = (shift, shift);	# Um, no.

        my $var1 = shift;                       # right
        my $var2 = shift;

Tests

Modules should provide test code, with documentation on how to use it. Test::Inline allows tests to be embedded in code. Test::More makes it easy to create tests. Any code you write should have a testsuite. Any code you alter should have a test suite. If a patch comes in without tests, there is something wrong.

When altering code, you must run the test harness before submitting a patch or committing code to the repository.

"make regression" will extract inline tests, blow away the system database and run the test suite.

"make regression-quiet" will do all that and not print the "ok" lines.

STDIN/STDOUT

Always report errors using $RT::Logger. It's a Log::Dispatch object. Unlike message meant for the user, log messages are not to be internationalized.

There are several different levels ($RT::Logger methods) of logging:

debug: Used for messages only needed during system debugging.
info: Should be used to describe "system-critical" events which aren't errors. Examples: creating users, deleting users, creating tickets, creating queues, sending email (message id, time, recipients), recieving mail, changing passwords, changing access control, superuser logins)
error: Used for RT-generated failures during execution.
crit: Should be used for messages when an action can not be completed due to some error condition beyond our control.

In the web UI and modules, never print directly to STDERR. Do not print directly to STDOUT, unless you need to print directly to the user's console.

In command-line programs, feel free to print to STDERR and STDOUT as needed for direct console communication. But for actual error reporting, use the logging API.

System Calls

Always check return values from system calls, including open(), close(), mkdir(), or anything else that talks directly to the system. Perl built-in system calls return the error in $!; some functions in modules might return an error in $@ or some other way, so read the module's documentation if you don't know. Always do something, even if it is just calling $RT::Logger->warning(), when the return value is not what you'd expect.

STYLE

Much of the style section is taken from the perlsyle manpage. We make some changes to it here, but it wouldn't be a bad idea to read that document, too.

Terminology

RT the name

"RT" is the name of the project. "RT" is, optionally, the specific name for the actual file distribution. That's it.

While we sometimes use "RT2" or "RT3", that's shortand that's really not recommended. The name of the project is "RT".

To specify a major version, use "RT 3.0". To specify a specific release, use "RT 3.0.12"

function vs. sub(routine) vs. method

Just because it is the Perl Way (not necessarily right for all languages, but the documented terminology in the perl documentation), "method" should be used only to refer to a subroutine that are object methods or class methods; that is, these are functions that are used with OOP that always take either an object or a class as the first argument. Regular subroutines, ones that are not object or class methods, are functions. Class methods that create and return an object are optionally called constructors.

Users

"users" are normally users of RT, the ones hitting the site; if using it in any other context, specify. "system users" are user names on the operating system. "database users" are the user names in the database server. None of these needs to be capitalized.

Names

Don't use single-character variables, except as iterator variables.

Don't use two-character variables just to spite us over the above rule.

Constants are in all caps; these are variables whose value will never change during the course of the program.

	$Minimum = 10;		# wrong
	$MAXIMUM = 50;		# right

Other variables are lowercase, with underscores separating the words. They words used should, in general, form a noun (usually singular), unless the variable is a flag used to denote some action that should be taken, in which case they should be verbs (or gerunds, as appropriate) describing that action.

	$thisVar      = 'foo';	# wrong
	$this_var     = 'foo';	# right
	$work_hard    = 1;	# right, verb, boolean flag
	$running_fast = 0;	# right, gerund, boolean flag

Arrays and hashes should be plural nouns, whether as regular arrays and hashes or array and hash references. Do not name references with "ref" or the data type in the name.

	@stories     = (1, 2, 3);      # right
	$comment_ref = [4, 5, 6];      # wrong
	$comments    = [4, 5, 6];      # right
	$comment     = $comments->[0]; # right

Make the name descriptive. Don't use variables like "$sc" when you could call it "$story_count". See "Comments".

There are several variables in RT that are used throughout the code, that you should use in your code. Do not use these variable names for anything other than how they are normally used, and do not use any other variable names in their place. Some of these are:

	$self		# first named argument in object method

Subroutines (except for special cases, like AUTOLOAD and simple accessors) begin with a verb, with words following to complete the action. Accessors don't start with "Get" if they're just the name of the attribute.

Accessors which return an object should end with the suffix Obj.

This section needs clarification for RT.

Words begin with a capital letter. They should as clearly as possible describe the activity to be peformed, and the data to be returned.

	Load();		# good
	LoadByName();	# good
	LoadById();		# good

Subroutines beginning with _ are special: they are not to be used outside the current object. There is not to be enforced by the code itself, but by someone very big and very scary.

For large for() loops, do not use $_, but name the variable. Do not use $_ (or assume it) except for when it is absolutely clear what is going on, or when it is required (such as with map() and grep()).

	for (@list) {
		print;			# OK; everyone knows this one
		print uc;		# wrong; few people know this
		print uc $_;		# better
	}

Note that the special variable _ should be used when possible. It is a placeholder that can be passed to stat() and the file test operators, that saves perl a trip to re-stat the file. In the example below, using $file over for each file test, instead of _ for subsequent uses, is a performance hit. You should be careful that the last-tested file is what you think it is, though.

	if (-d $file) {		# $file is a directory
		# ...
	} elsif (-l _) {	# $file is a symlink
		# ...
	}

Package names begin with a capital letter in each word, followed by lower case letters (for the most part). Multiple words should be StudlyCapped.

	RT::User			# good
	RT::Database::MySQL		# proper name
	RT::Display::Provider	        # good
	RT::CustomField			# not so good, but OK

Plugin modules should begin with "RTx::", followed by the name of the plugin.

Code formatting

Use perltidy. Anything we say here is wrong if it conflicts with what perltidy does. Your perltidyrc should read:

-lp -vt=2 -vtc=2 -nsfs -bar

Indents and Blank Space

All indents should be tabs. Set your tab stops whatever you want them to be; I use 8 spaces per tabs.

No space before a semicolon that closes a statement.

	foo(@bar) ;	# wrong
	foo(@bar);	# right

Line up corresponding items vertically.

	my $foo   = 1;
	my $bar   = 2;
	my $xyzzy = 3;

	open(FILE, $fh)	  or die $!;
	open(FILE2, $fh2) or die $!;

	$rot13 =~ tr[abcedfghijklmnopqrstuvwxyz]
	            [nopqrstuvwxyzabcdefghijklm];

	# note we use a-mn-z instead of a-z,
	# for readability
	$rot13 =~ tr[a-mn-z]
	            [n-za-m];

Put blank lines between groups of code that do different things. Put blank lines after your variable declarations. Put a blank line before a final return() statement. Put a blank line following a block (and before, with the exception of comment lines).

An example:

	# this is my function!
	sub foo {
                my $val = shift;
		my $obj = new Constructor;
		my($var1, $var2);

		$obj->SetFoo($val);
		$var1 = $obj->Foo();

		return($val);
	}

	print 1;

Parentheses

For control structures, there is a space between the keyword and opening parenthesis. For functions, there is not.

	for(@list)	# wrong
	for (@list)	# right

	my ($ref)	# wrong
	my($ref)	# right

Be careful about list vs. scalar context with parentheses!

	my @array = ('a', 'b', 'c');
	my($first_element) = @array;		# a
	my($first_element) = ('a', 'b', 'c');	# a
	my $element_count  = @array;		# 3
	my $last_element   = ('a', 'b', 'c');	# c

Always include parentheses after functions, even if there are no arguments. There are some exceptions, such as list operators (like print) and unary operators (like undef, delete, uc).

There is no space inside the parentheses, unless it is needed for readability.

	for ( map { [ $_, 1 ] } @list )	# OK
	for ( @list )			# not really OK, not horrible

On multi-line expressions, match up the closing parenthesis with either the opening statement, or the opening parenthesis, whichever works best. Examples:

	@list = qw(
		bar
		baz
	);			# right

	if ($foo && $bar && $baz
		 && $buz && $xyzzy
	) {
		print $foo;
	}

Whether or not there is space following a closing parenthesis is dependent on what it is that follows.

	print foo(@bar), baz(@buz) if $xyzzy;

Note also that parentheses around single-statement control expressions, as in if $xyzzy, are optional (and discouraged) if it is absolutely clear -- to a programmer -- what is going on. There is absolutely no need for parentheses around $xyzzy above, so leaving them out enhances readability. Use your best discretion. Better to include them, if there is any question.

The same essentially goes for perl's built-in functions, when there is nothing confusing about what is going on (for example, there is only one function call in the statement, or the function call is separated by a flow control operator). User-supplied functions must always include parentheses.

	print 1, 2, 3;				# good
	delete $hash{key} if isAnon($uid);	# good

However, if there is any possible confusion at all, then include the parentheses. Remember the words of Larry Wall in the perlstyle manpage:

	When in doubt, parenthesize.  At the very least it will
	let some poor schmuck bounce on the % key in vi.

	Even if you aren't in doubt, consider the mental welfare
	of the person who has to maintain the code after you, and
	who will probably put parens in the wrong place.

So leave them out when it is absoutely clear to a programmer, but if there is any question, leave them in.

Braces

(This is about control braces, not hash/data structure braces.)

There is always a space befor the opening brace.

	while (<$fh>){	# wrong
	while (<$fh>) {	# right

A one-line block may be put on one line, and the semicolon may be omitted.

	for (@list) { print }

Otherwise, finish each statement with a semicolon, put the keyword and opening curly on the first line, and the ending curly lined up with the keyword at the end.

	for (@list) {
		print;
		smell();
	}

Generally, we prefer "uncuddled elses":

	if ($foo) {
		print;
	}
	else {
		die;
	}

_If_ the if statement is very brief, sometimes "cuddling" the else makes code more readable. Feel free to cuddle them in that case:

	if ($foo) {
		print;
	} else {
		die;
	}

Operators

Put space around most operators. The primary exception is the for aesthetics; e.g., sometimes the space around "**" is ommitted, and there is never a space before a ",", but always after.

	print $x , $y;	# wrong
	print $x, $y;	# right

	$x = 2 >> 1;	# good
	$y = 2**2;	# ok

Note that "&&" and "||" have a higher precedence than "and" and "or". Other than that, they are exactly the same. It is best to use the lower precedence version for control, and the higher for testing/returning values. Examples:

	$bool = $flag1 or $flag2;	# WRONG (doesn't work)
	$value = $foo || $bar;		# right
	open(FILE, $file) or die $!;

	$true  = foo($bar) && baz($buz);
	foo($bar) and baz($buz);

Note that "and" is seldom ever used, because the statement above is better written using "if":

	baz($buz) if foo($bar);

Most of the time, the confusion between and/&&, or/|| can be alleviated by using parentheses. If you want to leave off the parentheses then you must use the proper operator. But if you use parentheses -- and normally, you should, if there is any question at all -- then it doesn't matter which you use. Use whichever is most readable and aesthetically pleasing to you at the time, and be consistent within your block of code.

Break long lines AFTER operators, except for "and", "or", "&&", "||". Try to keep the two parts to a binary operator (an operator that has two operands) together when possible.

	print "foo" . "bar" . "baz"
		. "buz";			# wrong

	print "foo" . "bar" . "baz" .
		"buz";				# right

	print $foo unless $x == 3 && $y ==
		4 && $z == 5;			# wrong

	print $foo unless $x == 3 && $y == 4
		&& $z == 5;			# right

Other

Put space around a complex subscript inside the brackets or braces.

	$foo{$bar{baz}{buz}};	# OK
	$foo{ $bar{baz}{buz} };	# better

In general, use single-quotes around literals, and double-quotes when the text needs to be interpolated.

It is OK to omit quotes around names in braces and when using the => operator, but be careful not to use a name that doubles as a function; in that case, quote.

	$what{'time'}{it}{is} = time();

When making compound statements, put the primary action first.

	open(FILE, $fh) or die $!;	# right
	die $! unless open(FILE, $fh);	# wrong

	print "Starting\n" if $verbose;	# right
	$verbose && print "Starting\n";	# wrong

Use here-docs instead of repeated print statements.

		print <<EOT;
	This is a whole bunch of text.
	I like it.  I don't need to worry about messing
	with lots of print statements and lining them up.
	EOT

Just remember that unless you put single quotes around your here-doc token (<<'EOT'), the text will be interpolated, so escape any "$" or "@" as needed.

INTERNATIONALIZATION

String extraction styleguide

Web templates

Templates should use the /l filtering component to call the localisation framework

The string Foo!

Should become <&|/l&>Foo!</&>

All newlines should be removed from localized strings, to make it easy to grep the codebase for strings to be localized

The string Foo Bar Baz Should become <&|/l&>Foo Bar Baz</&>

Variable subsititutions should be moved to Locale::MakeText format

The string Hello, <%$name %>

should become <&|/l, $name &>Hello, [_1]</&>

Multiple variables work just like single variables The string You found <%$num%> tickets in queue <%$queue%>

should become <&|/l, $num, $queue &>You found [_1] tickets in queue [_2]</&>

When subcomponents are called in the middle of a phrase, they need to be escaped too:

The string <input type="submit" value="New ticket in">&nbsp<& /Elements/SelectNewTicketQueue&>

should become <&|/l, $m->scomp('/Elements/SelectNewTicketQueue')&><input type="submit" value="New ticket in"> [_1]</&>

The string <& /Elements/TitleBoxStart, width=> "40%", titleright => "RT $RT::VERSION for $RT::rtname", title => 'Login' &>

should become <& /Elements/TitleBoxStart, width=> "40%", titleright => loc("RT [_1] for [_2]",$RT::VERSION, $RT::rtname), title => loc('Login'), &>

Library code

Within RT's core code, every module has a localization handle available through the 'loc' method:

The code return ( $id, "Queue created" );

should become return ( $id, $self->loc("Queue created") );

When returning or localizing a single string, the "extra" set of parenthesis () should be omitted.

The code return ("Subject changed to ". $self->Data );

should become return $self->loc( "Subject changed to [_1]", $self->Data );

It is important not to localize the names of rights or statuses within RT's core, as there is logic that depends on them as string identifiers. The proper place to localize these values is when they're presented for display in the web or commandline interfaces.

CODING PRCEDURE

This is for new programs, modules, specific APIs, or anything else.

Contact for core team is the slashcode-development mailing list.

Present idea to core team

We may know of a better way to approach the problem, or know of an existing way to deal with it, or know someone else is working on it. This is mostly informal, but a fairly complete explanation for the need and use of the code should be provided.

Present complete specs to core team

The complete proposed API to the core team should be submitted for approval and discussion. For web and command-line programs, present the functionality and interface (op codes, command-lin switches, etc.).

The best way to do this is to take the documentation portion of the boilerplate and fill it in. You can make changes later if necessary, but fill it in as much as you can.

Announce any changes to interface

If the way it works or how it is called is going to change, notify the core team.

Prepare for core review

When you are done, the code will undergo a code review by a member of the core team, or someone picked by the core team. This is not to belittle you (that's just a nice side effect), it is to make sure that you understand your code, that we understand your code, that it won't break other code, that it follows the documentation and existing proposal. It is to check for possible optimizations or better ways of doing it.

For members of the core team, one or more other members of the team will perform the review.

Note that all code is expected to follow the coding principles and style guide contained in this document.

Finish it up

After the code is done (possibly going through multiple code reviews), if you do not have repository access, submit it to rt-<major-version>-bugs@fsck.com as a unified diff. From that point on, it'll be handled by someone with repository access.

BUG REPORTS, PATCHES

Use rt-<major-version>-bugs@fsck.com for any bug that is not being fixed immediately. If it is not in RT, there is a good chance it will not be dealt with.

Send patches to rt-<major-version>-bugs@fsck.com, too. Use diff -u for patches.

TO DO

Talk about DBIx::SearchBuilder

Talk about mason component style cascading style sheets Talk about adding a new translation

Talk more about logging

CHANGES

        Adapted from Slash Styleguide by jesse - 20 Dec, 2002

VERSION

0.1