XML
The XML fad has created a bonanza of opportunities for obfuscation. The basic
technique is to pick a random hunk of code, then invent an obscure way of
representing its logic in XML. Then replace the piece of code with an XML
properties file and an XML parser. Make sure the XML representation you choose
is so limited that almost nothing other than the original logic can be
expressed in it. Of course you never document the XML language extension or the
parser. Nobody questions the simplicity of XML. Using this technique, you should
easily be able to balloon 10 lines of simple Java code up to 100 lines of
perfectly opaque XML.
Obfuscated C
Follow the obfuscated C contests on the Internet and sit at the lotus feet of
the masters.
Find a Forth or APL Guru
In those worlds, the terser your code and the more bizarre the way it works, the
more you are revered.
I’ll Take a Dozen
Never use one housekeeping variable when you could just as easily use two or
three.
Jude the Obscure
Always look for the most obscure way to do common tasks. For example, instead of
using arrays to convert an integer to the corresponding string, use code like
this:
char *p;
switch (n)
{
case 1:
    p = "one";
    if (0)
case 2:
    p = "two";
    if (0)
case 3:
    p = "three";
    printf("%s", p);
    break;
}
Foolish Consistency Is the Hobgoblin of Little Minds
When you need a character constant, use many different formats: ' ', 32, 0x20,
040. Make liberal use of the fact that 10 and 010 are not the same number in C
or Java.
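A minimal Java sketch of the point, using made-up names (CharFormats, SPACES, DECIMAL_TEN, OCTAL_TEN) that are illustrative only. It shows that the four spellings denote the same character and that a leading zero turns 010 into eight:

```java
class CharFormats {
    // Four spellings of the same space character (code point 32).
    static final char[] SPACES = { ' ', 32, 0x20, 040 };

    // A leading zero makes an integer literal octal, so 010 is eight.
    static final int DECIMAL_TEN = 10;
    static final int OCTAL_TEN = 010;

    // True only if every entry in SPACES is the space character.
    static boolean allSpaces() {
        for (char c : SPACES) {
            if (c != ' ') {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allSpaces());             // true
        System.out.println(DECIMAL_TEN - OCTAL_TEN); // 2
    }
}
```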
Casting
Pass all data as a void * and then typecast to the appropriate structure. Using
byte offsets into the data instead of structure casting is fun too.
The Nested Switch
A switch within a switch is the most difficult type of nesting for the human
mind to unravel.
Exploit Implicit Conversion
Memorize all of the subtle implicit conversion rules in the programming language.
Take full advantage of them. Never use a picture variable (in COBOL or PL/I) or
a general conversion routine (such as sprintf in C). Be sure to use floating-point
variables as indexes into arrays, characters as loop counters, and perform
string functions on numbers. After all, all of these operations are well-defined
and will only add to the terseness of your source code. Any maintainer who tries
to understand them will be very grateful to you because they will have to read
and learn the entire chapter on implicit data type conversion; a chapter that
they probably had completely overlooked before working on your programs.
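A few of these conversions can be sketched in Java, assuming hypothetical method names chosen for illustration. Java refuses a floating-point array index without a cast, so this sketch sticks to conversions the compiler accepts silently:

```java
class ImplicitConversions {
    // A char used as a loop counter; += silently narrows the int sum back to char.
    static int countLetters() {
        int count = 0;
        for (char c = 'a'; c <= 'z'; c += 1) {
            count++;
        }
        return count;
    }

    // Compound assignment hides a double-to-char narrowing cast.
    static char nudge(char c) {
        c += 1.7; // behaves like c = (char) (c + 1.7)
        return c;
    }

    // String concatenation is left-associative, so operand order changes the result.
    static String stringFirst() { return "" + 1 + 2; } // "12"
    static String stringLast()  { return 1 + 2 + ""; } // "3"

    // char + int is promoted to int, not char.
    static int promoted() { return 'a' + 1; }          // 98

    public static void main(String[] args) {
        System.out.println(countLetters());
        System.out.println(nudge('a'));
        System.out.println(stringFirst() + " " + stringLast());
        System.out.println(promoted());
    }
}
```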
int literals
When using ComboBoxes, use a switch statement with integer cases rather than
named constants for the possible values.
If you have an array with 100 elements in it, hard code the literal
100 in as many places in the program as possible. Never use a static final named
constant for the 100, or refer to it as myArray.length.
To make changing this constant even more difficult, use the literal 50 instead
of 100/2, or 99 instead of 100-1. You can further disguise the 100 by checking
for a == 101 instead of a > 100, or a > 99 instead of a >= 100.
Consider things like page sizes, where a page consists of x header lines, y body
lines, and z footer lines: you can apply the obfuscations independently to each
of these and to their partial or total sums.
These time-honoured techniques are especially effective in a program with two
unrelated arrays that just accidentally happen to both have 100 elements. If the
maintenance programmer has to change the length of one of them, he will have to
decipher every use of the literal 100 in the program to determine which array it
applies to. He is almost sure to make at least one error, hopefully one that won’t
show up for years.
There are even more fiendish variants. To lull the maintenance programmer into a
false sense of security, dutifully create the named constant, but very
occasionally "accidentally" use the literal 100 value instead
of the named constant. Most fiendish of all, in place of the literal 100 or the
correct named constant, sporadically use some other unrelated named constant
that just accidentally happens to have the value 100, for now. It almost goes
without saying that you should avoid any consistent naming scheme that would
associate an array name with its size constant.
Semicolons!
Always use semicolons whenever they are syntactically allowed. For example:
if ( a );
else;
{
    int d;
    d = c;
}
Use Octal For Obscurity
Smuggle octal literals into a list of decimal numbers like this:
array = new int[]
{
    111,
    120,
    013,
    121,
};
Convert Indirectly
Java offers great opportunity for obfuscation whenever you have to convert. As a
simple example, if you have to convert a double to a String, go circuitously,
via Double with new Double(d).toString() rather than the more direct
Double.toString(d). You can, of course, be far more circuitous than that! Avoid
any conversion techniques recommended by the Conversion Amanuensis. You get
bonus points for every extra temporary object you leave littering the heap after
your conversion.
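A hedged sketch of a more circuitous route, with invented names (RoundaboutConversion, direct, circuitous). It swaps in Double.valueOf for new Double, since that constructor is deprecated in recent JDKs; the chain of temporaries is otherwise in the same spirit:

```java
class RoundaboutConversion {
    // The direct route the text says to avoid.
    static String direct(double d) {
        return Double.toString(d);
    }

    // The scenic route: box, unbox, build, copy, rebuild.
    static String circuitous(double d) {
        Double boxed = Double.valueOf(d);         // temporary wrapper object
        StringBuilder sb = new StringBuilder();   // temporary builder
        sb.append(boxed.doubleValue());           // unbox again before appending
        String viaBuilder = sb.toString();        // temporary String
        char[] copy = viaBuilder.toCharArray();   // temporary char array
        return String.valueOf(copy);              // yet another String
    }

    public static void main(String[] args) {
        System.out.println(direct(1.5));
        System.out.println(circuitous(1.5));
    }
}
```

Both methods produce the same text; the second just leaves more garbage behind.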
Nesting
Nest as deeply as you can. Good coders can get up to 10 levels of ( ) on a
single line and 20 { } in a single method. C++ coders have the additional
powerful option of preprocessor nesting totally independent of the nest
structure of the underlying code. You earn extra Brownie points whenever the
beginning and end of a block appear on separate pages in a printed listing.
Wherever possible, convert nested ifs into nested [? : ] ternaries. If they span
several lines, so much the better.
C’s Eccentric View of Arrays
C compilers transform myArray[i] into *(myArray + i), which is equivalent to
*(i + myArray), which is equivalent to i[myArray]. Experts know to put this to
good use. To really disguise things, generate the index with a function:
int myfunc(int q, int p) { return p%q; }
…
myfunc(6291, 8)[Array];
Unfortunately, these techniques can only be used in native C classes, not Java.
L o n g L i n e s
Try to pack as much as possible into a single line. This saves the overhead of
temporary variables, and makes source files shorter by eliminating new line
characters and white space. Tip: remove all white space around operators. Good
programmers can often hit the 255 character line length limit imposed by some
editors. The bonus of long lines is that programmers who cannot read 6 point
type must scroll to view them.
Exceptions
I am going to let you in on a little-known coding secret. Exceptions are a pain
in the behind. Properly-written code never fails, so exceptions are actually
unnecessary. Don’t waste time on them. Subclassing exceptions is for
incompetents who know their code will fail. You can greatly simplify your
program by having only a single try/catch in the entire application (in main)
that calls System.exit(). Just stick a perfectly standard set of throws on every
method header whether they could actually throw any exceptions or not.
When To Use Exceptions
Use exceptions for non-exceptional conditions. Routinely terminate loops with an
ArrayIndexOutOfBoundsException. Pass standard return values out of a method in
an exception.
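Both habits can be sketched in a few lines of Java. The names ExceptionLoop, ResultException, and compute are hypothetical, chosen only for this illustration:

```java
class ExceptionLoop {
    // Sums an array with no bounds check; the loop "ends" when indexing fails.
    static int sum(int[] values) {
        int total = 0;
        int i = 0;
        try {
            while (true) {
                total += values[i++];
            }
        } catch (ArrayIndexOutOfBoundsException e) {
            // Perfectly normal loop exit, as far as this code is concerned.
        }
        return total;
    }

    // Carries an ordinary result out of a method.
    static class ResultException extends RuntimeException {
        final int result;

        ResultException(int result) {
            this.result = result;
        }
    }

    // "Returns" the sum by throwing it.
    static void compute(int[] values) {
        throw new ResultException(sum(values));
    }

    public static void main(String[] args) {
        try {
            compute(new int[] { 1, 2, 3 });
        } catch (ResultException e) {
            System.out.println(e.result);
        }
    }
}
```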
“Efficient” Exceptions
Throwing an Exception has quite a high overhead. The JVM has to scan the stack
looking for a ton of information to potentially use in a stack trace. You can
avoid this overhead by constructing an Exception object once and throwing it
many times. The stack trace will be for the spot in the code where the Exception
was constructed, not where it was thrown. This will really keep them guessing
where the bugs are.
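A minimal sketch of the effect, assuming a typical JVM that records the trace when the Throwable is constructed. The names ReusedException, build, and failHere are invented for the example:

```java
class ReusedException {
    static Exception shared;

    // The stack trace is captured here, when the object is constructed.
    static void build() {
        shared = new Exception("reused");
    }

    // The real failure site, which never shows up in the trace.
    static void failHere() throws Exception {
        throw shared;
    }

    // Returns the method name that the top stack frame blames.
    static String reportedMethod() {
        build();
        try {
            failHere();
        } catch (Exception e) {
            return e.getStackTrace()[0].getMethodName();
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(reportedMethod());
    }
}
```

The top frame names build rather than failHere, so the trace points at the wrong place every time the object is rethrown.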
Use Threads With Abandon
The title says it all.
Lawyer Code
Follow the language lawyer discussions in the newsgroups about what various bits
of tricky code should do, e.g. a=a++; or f(a++,a++); then sprinkle your code
liberally with the examples. In C, the effects of pre/post increment and
decrement code such as
*++b ? (*++b + *(b-1)) : 0
are not defined by the language spec. Every compiler is free to evaluate in a
different order. This makes them doubly deadly. Similarly, take advantage of the
complex tokenising rules of C and Java by removing all spaces.
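In Java, unlike C, the two newsgroup favourites have defined results, because operands and arguments are evaluated left to right. They remain just as puzzling to read. A small sketch, with hypothetical names LawyerCode, selfAssign, and twoIncrements:

```java
class LawyerCode {
    // a = a++ saves the old value, increments a, then stores the old value back.
    static int selfAssign() {
        int a = 5;
        a = a++;
        return a;
    }

    // Hypothetical helper that just reports the arguments it received.
    static String f(int x, int y) {
        return x + "," + y;
    }

    // Arguments are evaluated left to right, so f sees 5 and 6, and a ends at 7.
    static String twoIncrements() {
        int a = 5;
        return f(a++, a++) + "," + a;
    }

    public static void main(String[] args) {
        System.out.println(selfAssign());
        System.out.println(twoIncrements());
    }
}
```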
Early Returns
Rigidly follow the guidelines about no goto, no early returns, and no labelled
breaks especially when you can increase the if/else nesting depth by at least 5
levels.
Avoid {}
Never put in any { } surrounding your if/else blocks unless they are
syntactically obligatory. If you have a deeply nested mixture of if/else
statements and blocks, especially with misleading indentation, you can trip up
even an expert maintenance programmer. For best results with this technique, use
Perl. You can pepper the code with additional ifs after the statements,
to amazing effect.
Tabs From Hell
Never underestimate how much havoc you can create by indenting with tabs instead
of spaces, especially when there is no corporate standard on how much indenting
a tab represents. Embed tabs inside string literals, or use a tool to convert
spaces to tabs that will do that for you.
Magic Matrix Locations
Use special values in certain matrix locations as flags. A good choice is the [3][0]
element in a transformation matrix used with a homogeneous coordinate system.
Magic Array Slots revisited
If you need several variables of a given type, just define an array of them,
then access them by number. Pick a numbering convention that only you know and
don’t document it. And don’t bother to define #define constants for
the indexes. Everybody should just know that the global variable widget[15] is
the cancel button. This is just an up-to-date variant on using absolute
numerical addresses in assembler code.
Never Beautify
Never use an automated source code tidier (beautifier) to keep your code aligned.
Lobby to have them banned from your company on the grounds they create
false deltas in PVCS/CVS (version control tracking) or that every programmer
should have his own indenting style held forever sacrosanct for any module he
wrote. Insist that other programmers observe those idiosyncratic conventions in
"his" modules. Banning beautifiers is quite easy, even though they save
millions of keystrokes doing manual alignment and days wasted misinterpreting
poorly aligned code. Just insist that everyone use the same tidied format,
not just for storing in the common repository, but also while they are editing.
This starts an RWAR and the boss, to keep the peace, will ban automated tidying.
Without automated tidying, you are now free to accidentally misalign the
code to give the optical illusion that bodies of loops and ifs are longer or
shorter than they really are, or that else clauses match a different if than
they really do.
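A minimal Java sketch of the else-matching illusion, using a hypothetical classify method. The indentation suggests the else belongs to the outer if, but the compiler pairs it with the nearest one:

```java
class DanglingElse {
    static String classify(boolean a, boolean b) {
        String result = "untouched";
        // Misleading layout: the else below actually belongs to "if (b)".
        if (a)
            if (b) result = "both";
        else result = "looks like not-a";
        return result;
    }

    public static void main(String[] args) {
        System.out.println(classify(false, true)); // the else never runs here
        System.out.println(classify(true, false)); // the else runs here instead
    }
}
```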
The Macro Preprocessor
It offers great opportunities for obfuscation. The key technique is to nest
macro expansions several layers deep so that you have to discover all the
various parts in many different *.hpp files. Placing executable code into macros
then including those macros in every *.cpp file (even those that never use those
macros) will maximize the amount of recompilation necessary if ever that code
changes.
Exploit Schizophrenia
Java is schizophrenic about array declarations. You can do them the old C way,
String x[] (which uses mixed pre-postfix notation), or the new way, String[] x,
which uses pure prefix notation. If you want to really confuse people, mix the
notations, e.g.:
byte[] rowvector, colvector, matrix[];
which is equivalent to:
byte[] rowvector;
byte[] colvector;
byte[][] matrix;
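The equivalence can be checked at run time. This sketch wraps the mixed declaration in a hypothetical class, MixedDeclarations, and asks each variable for its type:

```java
class MixedDeclarations {
    // The mixed declaration from the text: matrix picks up a second dimension.
    static byte[] rowvector, colvector, matrix[];

    // Reports the run-time types of the three variables.
    static String describe() {
        rowvector = new byte[3];
        colvector = new byte[3];
        matrix = new byte[3][3];
        return rowvector.getClass().getSimpleName() + " "
                + colvector.getClass().getSimpleName() + " "
                + matrix.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```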
Hide Error Recovery Code
Use nesting to put the error recovery for a function call as far as possible
away from the call. This simple example can be elaborated to 10 or 12 levels of
nest:
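A minimal two-level Java sketch of this nesting. The steps stepA and stepB, the OK code, and the outcome strings are all hypothetical stand-ins for real calls:

```java
class FarAwayRecovery {
    static final int OK = 0;

    // Hypothetical steps: each returns OK or a non-zero error code.
    static int stepA(boolean succeed) { return succeed ? OK : 1; }
    static int stepB(boolean succeed) { return succeed ? OK : 2; }

    static String run(boolean aWorks, boolean bWorks) {
        String outcome;
        if (stepA(aWorks) == OK) {
            if (stepB(bWorks) == OK) {
                outcome = "normal completion";
            } else {
                outcome = "recovered from B";
            }
        } else {
            // Recovery for stepA sits as far from the stepA call as the nesting allows.
            outcome = "recovered from A";
        }
        return outcome;
    }

    public static void main(String[] args) {
        System.out.println(run(true, true));
        System.out.println(run(false, true));
    }
}
```

Each extra level of nesting pushes the outermost recovery branch further from the call it belongs to.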
Pseudo C
The real reason for #define was to help programmers who are familiar with
another programming language to switch to C. Maybe you will find declarations
like #define begin { or #define end } useful to write more interesting code.
Confounding Imports
Keep the maintenance programmer guessing about what packages the methods you are
using are in. Instead of:
import com.mindprod.mypackage.Read;
import com.mindprod.mypackage.Write;
use:
import com.mindprod.mypackage.*;
Never fully qualify any method or class no matter how obscure. Let the
maintenance programmer guess which of the packages/classes it belongs to. Of
course, inconsistency in when you fully qualify and how you do your imports
helps most.
Toilet Tubing
Never under any circumstances allow the code from more than one function or
procedure to appear on the screen at once. To achieve this with short routines,
use the following handy tricks:
- Blank lines are generally used to separate logical blocks of code. Each line is
a logical block in and of itself. Put blank lines between each line.
- Never comment your code at the end of a line. Put it on the line above. If you’re
forced to comment at the end of the line, pick the longest line of code in the
entire file, add 10 spaces, and left-align all end-of-line comments to that
column.
- Comments at the top of procedures should use templates that are at least 15
lines long and make liberal use of blank lines. Here’s a handy template:
The technique of putting so much redundant information in documentation almost
guarantees it will soon go out of date, and will help befuddle maintenance
programmers foolish enough to trust it.
Encapsulate The Trivial
Create entire classes or methods to encapsulate trivialities that could never
possibly change, but which then require complex invocation, and careful
unravelling to discover that the code does almost nothing. Here is a classic
example:
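One hypothetical illustration in Java, not necessarily the classic the text has in mind. Every name here (TrivialWrapper, ZeroProvider, ZeroProviderFactory, startIndex) is invented for the sketch, which takes an interface, a factory, and a lambda to produce the number zero:

```java
class TrivialWrapper {
    // An abstraction over the constant zero, in case zero ever changes.
    interface ZeroProvider {
        int provideZero();
    }

    // A factory, because constructing the provider directly would be too obvious.
    static class ZeroProviderFactory {
        static ZeroProvider newInstance() {
            return () -> 0;
        }
    }

    // All of the above, just to say "0".
    static int startIndex() {
        return ZeroProviderFactory.newInstance().provideZero();
    }

    public static void main(String[] args) {
        System.out.println(startIndex());
    }
}
```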
Loops
The humble canonical for loop, for (int i=0; i<n; i++), should never be used.
Always randomly disguise it, for example by:
- Redoing it as a while or do while loop.
- Reversing the names of the i and n variables, or making up fanciful names for
either that have nothing to do with their purpose as index and count.
- Changing the < to <=.
- Using i-- just for a change of pace.
It goes without saying you should never use the compact for:each loop. There are
many ways to rearrange the parts of an Iterator loop over a Collection so every
time the maintenance programmer looks at a simple Iterator loop, it appears to
be something novel.
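The disguises listed above can be sketched side by side. All method and variable names (DisguisedLoops, fred, and so on) are made up for illustration, and every method computes the same sum as the canonical loop:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class DisguisedLoops {
    // The canonical form, shown only for comparison.
    static int canonical(int[] a) {
        int sum = 0;
        for (int i = 0; i < a.length; i++) sum += a[i];
        return sum;
    }

    // Names swapped: here n is the index and i is the count.
    static int swappedNames(int[] a) {
        int sum = 0;
        int i = a.length;
        for (int n = 0; n < i; n++) sum += a[n];
        return sum;
    }

    // Recast as a while loop, with <= and an adjusted bound.
    static int whileWithLessOrEqual(int[] a) {
        int sum = 0;
        int fred = 0;
        while (fred <= a.length - 1) {
            sum += a[fred];
            fred++;
        }
        return sum;
    }

    // Counting down with i--.
    static int countingDown(int[] a) {
        int sum = 0;
        for (int i = a.length; i-- > 0; ) sum += a[i];
        return sum;
    }

    // An Iterator loop with all the work moved into the for header.
    static int viaIterator(List<Integer> values) {
        int sum = 0;
        for (Iterator<Integer> it = values.iterator(); it.hasNext(); sum += it.next()) { }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = { 1, 2, 3, 4 };
        System.out.println(canonical(data));
        System.out.println(swappedNames(data));
        System.out.println(whileWithLessOrEqual(data));
        System.out.println(countingDown(data));
        System.out.println(viaIterator(Arrays.asList(1, 2, 3, 4)));
    }
}
```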