SCID
©1996-2009 Roedy Green, Canadian Mind Products
This essay does not describe an existing computer program, just one that should
exist. This essay is about a suggested student project in Java programming. This
essay gives a rough overview of how it might work. I have no source, object,
specifications, file layouts or anything else useful to implementing this project.
This project outline is not like the artificial tidy problems you are spoon-fed
in school, when all the facts you need are included, nothing extraneous is
mentioned, the answer is fully specified, along with hints to nudge you toward a
single expected canonical solution. This project is much more like the real
world of messy problems where it is up to you to fully define the end point,
or a series of ever more difficult versions of this project, and research the
information yourself to solve them.
Everything I have to say to help you with this project is written below. I am not
prepared to help you implement it; or give you any additional materials. I have
too many other projects of my own.
Though I am a programmer, I don’t do people’s homework
for them. That just robs them of an education.
You have my full permission to implement this project in any way you please and
to keep all the profits from your endeavor.
Please do not email me about this project without reading the disclaimer above.
Java Source Code SCID-style browser/editor
“If builders built buildings the way programmers write programs, then the first
woodpecker that came along would destroy civilization.”
~ Weinberg’s Second Law (born: 1933 age: 76)
“An invasion of armies can be resisted, but not an idea whose time has come.”
~ Victor Hugo (born: 1802-02-26 died: 1885-05-22 at age: 83), written 1852,
Histoire d’un Crime
I mean, source code in files; how quaint, how seventies!
~ Kent Beck (born: 1961 age: 48),
evangelist for extreme programming.
SCID means Source Code In Database. This is one of many student
projects. We have been teaching our customers to regard their data as a
precious resource that should be milked and reused by finding many possible ways
of summarising, viewing and updating it. However, we programmers have not yet
learned to treat our source code as a similar structured data resource.
This is an enormous project, but you could start small. The basic idea is that you
pre-parse your code and put it in a database. The problem is programs are
getting huger and huger. We need tools to help you temporarily ignore most of
them so you can concentrate on your immediate needs. We need tools to rapidly
navigate programs. We need tools to help you get a mental forest picture before
delving into the tree detail.
I have been talking up the SCID idea since the early 70s. Mostly people have
just hooted with derisive laughter. However, SCID-think is gradually catching on.
The RADs, such as Visual Café, IBM Visual Age and Inprise JBuilder, let
you write code to control the properties of widgets on the screen by right
clicking on visual elements to view the associated properties. You can tick off
entries in pop-up listboxes and checkboxes or fill in the blanks. This is an
important step away from thinking of programs strictly as linear streams of
ASCII characters. Java Studio lets you view and write Java code by playing
plumber — visually connecting JavaBeans.
I think it is a case that the shoemaker’s children have no shoes.
Programmers in creating source code in linear text files do the equivalent of
keeping their accounting books using a CP/M WordStar text editor. We would never
dream of handing a customer such error prone tools for manipulating such
complicated cross-linked data as source code. If a customer had such data, we
would offer a GUI-based data entry system with all sorts of point and click
features, extreme data validation, and ability to reuse that data, view it in
many ways, and search it by any key.
Once you have your program pre-parsed, you can display the program in a variety
of ways. Here are just a few examples:
- Make the beginnings of methods more visually obvious. The plain Java syntax
tends to camouflage them. Ditto for temporary variable declarations.
- Hide comments.
- Show just loop structure.
- Show just code involved with class X.
- Hide all code that involves calls to the java.awt package
- Highlight all code that does bit shifting or division.
- Highlight all uses of the sin method, but only in class Moses, not Math.
- Gray out all code that deals with error handling. All that is left is the normal
case.
- Colour code so that the darker the shade, the more frequently it is executed.
- Show me a switch statement as if it had been handled with a set of subclasses.
There is underlying deep structure here. I should be able to view the code as if
it had been done with switch or as if it had been done with polymorphism.
Sometimes you are interested in all the facts about Dalmatians. Sometimes you are
interested in comparing all the different ways different breeds of dogs bury
their bones. Why should you have to pre-decide on a representation that lets you
see only one point of view?
- Sometimes you want to see all the code concerning the save button. Other times
you want to see only code involving hooking up listeners for all buttons/menu
items. Other times you are interested only in code that affects layout. The SCID
can hide non relevant code. It can also dynamically reorder code for the
current purpose. You no longer have to decide whether to bundle all your layout
code together or to bundle it with the corresponding button instantiation. You
can have it both ways!
- Show me a set of subclasses as if the code had been handled by a giant
switch. This lets me compare the equivalent code in all subclasses. Similarly
let me compare just two subclasses. This is like a DIFF utility that notices the
differences between two methods and generates a single program that handles both,
taking advantage of commonalities.
- Show me the logic as if it had been written with a
PET decision table. Here you have a list of conditions, then a list of actions.
The SCID can ensure that all possible combinations are covered, and you can easily
proofread the logic originally coded in traditional nested if logic. You can
write a decision table and have the SCID convert it to nested if logic or a
giant switch indexed by concatenated binary condition representations. Here is a
simple example:
CONDITIONS
kettleFull()     -  Y  Y  Y  N  N  N
makingTea()      N  Y  N  Y  Y  N  Y
makingCoffee()   N  N  Y  Y  N  Y  Y
ACTIONS
addWater()       -  -  -  -  X  X  X
boilWater()      -  X  X  X  X  X  X
addCoffee()      -  -  X  X  -  X  X
addTea()         -  X  -  X  X  -  X
Might generate code like this:
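As a minimal sketch of such output, here is the giant-switch form, indexed by the
three conditions concatenated as bits. The class name TeaDecisionTable, the boolean
fields standing in for the condition methods, and the log list that records which
actions ran are all assumptions made for illustration:

```java
import java.util.ArrayList;
import java.util.List;

class TeaDecisionTable {
    // Hypothetical stand-ins for kettleFull(), makingTea() and makingCoffee().
    final boolean kettleFull, makingTea, makingCoffee;
    // Records the actions taken, in order, so the result can be inspected.
    final List<String> log = new ArrayList<>();

    TeaDecisionTable(boolean kettleFull, boolean makingTea, boolean makingCoffee) {
        this.kettleFull = kettleFull;
        this.makingTea = makingTea;
        this.makingCoffee = makingCoffee;
    }

    void addWater()  { log.add("addWater"); }
    void boilWater() { log.add("boilWater"); }
    void addCoffee() { log.add("addCoffee"); }
    void addTea()    { log.add("addTea"); }

    void run() {
        // Concatenate the conditions into a 3-bit index: kettleFull, makingTea, makingCoffee.
        int index = (kettleFull ? 4 : 0) | (makingTea ? 2 : 0) | (makingCoffee ? 1 : 0);
        switch (index) {
            case 0b000: // - N N : nothing wanted, kettle state does not matter
            case 0b100:
                break;
            case 0b110: boilWater(); addTea(); break;                          // Y Y N
            case 0b101: boilWater(); addCoffee(); break;                       // Y N Y
            case 0b111: boilWater(); addCoffee(); addTea(); break;             // Y Y Y
            case 0b010: addWater(); boilWater(); addTea(); break;              // N Y N
            case 0b001: addWater(); boilWater(); addCoffee(); break;           // N N Y
            case 0b011: addWater(); boilWater(); addCoffee(); addTea(); break; // N Y Y
            default: throw new AssertionError("uncovered combination " + index);
        }
    }
}
```

Each case label corresponds to one column of the table, which is what makes it
easy to check that every combination is covered.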
Better logic still would merge test conditions and actions to reduce code size
and to avoid computing a condition when it did not matter.
- Let me switch rapidly back and forth between different representations of my
code. I would like to see a high level CASE view, e.g. Warnier-Orr diagram or
your favourite flavour of UML or sequence diagrams, and then zoom in on coding
detail, something like TogetherJ offers. Let me see a flow chart of the program’s
basic loop structure, then zoom in on part of it. When a project starts,
typically all the energy is focussed on the UML, specifications and the bird’s
eye view. As the project progresses, the energy is focussed on the detail and
entropy gradually destroys the high level documentation. It is not kept in sync.
The high level documentation becomes worse than useless to orient an incoming
maintenance programmer. The spec, the UML, the high level stuff, the code,
various levels of comment detail and the end user docs must be more
closely integrated so that you can navigate at any level. All
levels must be kept up to date and in sync. The navigation function provides
motive to keep the high level docs accurate. On the other end of the spectrum,
let me zoom down and examine the byte code or machine code.
We make the error of thinking computer programs are primarily for communicating
with computers. On a project that requires more than one person, the source code
is primarily for communicating between people. The SCID gives you a mechanism to
record information only of interest to people and to help you manage that
information overload.
- With Java 1.5 enums, you want an aligned grid so you can study the enum
constructors either by row or by
column so that you can compare enums for a certain property, or study the
properties of one particular enum. You would like the grid to behave as a table
in HTML or better still as a spreadsheet, with non scrolling headings,
adjustable column widths so you can squeeze the most information on the screen
without scrolling. It wraps within cells. It would look something like this:
/**
* constants for Application categories
*/
public enum AppCat {
| Enum          |   | shortName        |   | description                                          |   | aliases                             |    |
| APPLET        | ( | "Applet"         | , | "Java Applet"                                        | , | "applet"                            | ), |
| APPLICATION   | ( | "application"    | , | "Java application"                                   | , | "application"                       | ), |
| DOCUMENTATION | ( | "documentation"  | , | "documentation"                                      | , | "documentation"                     | ), |
| HYBRID        | ( | "hybrid"         | , | "Java Applet that can also be run as an application" | , | "hybrid"                            | ), |
| JWS           | ( | "Java Web Start" | , | "Java Web Start"                                     | , | "jws", "weblet", "webstart", "jaws" | ), |
| LIBRARY       | ( | "Class"          | , | "Class library"                                      | , | "class", "classes", "library"       | ), |
| SERVLET       | ( | "servlet"        | , | "Java Servlet"                                       | , | "servlet"                           | ), |
| UTILITY       | ( | "Utility"        | , | "non-Java Utility"                                   | , | "utility"                           | ); |
}
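Behind the grid sits ordinary enum source. A minimal sketch of what that might
look like, abridged to three rows; the constructor taking varargs aliases and the
toRow helper are assumptions for illustration, not taken from any existing code:

```java
enum AppCat {
    APPLET("Applet", "Java Applet", "applet"),
    JWS("Java Web Start", "Java Web Start", "jws", "weblet", "webstart", "jaws"),
    LIBRARY("Class", "Class library", "class", "classes", "library");

    private final String shortName;
    private final String description;
    private final String[] aliases;

    // Assumed shape: two fixed properties followed by any number of aliases.
    AppCat(String shortName, String description, String... aliases) {
        this.shortName = shortName;
        this.description = description;
        this.aliases = aliases;
    }

    /** One grid row: the cells a SCID could lay out in aligned columns. */
    String[] toRow() {
        return new String[] { name(), shortName, description, String.join(", ", aliases) };
    }
}
```

Reading down index 1 of every row gives the shortName column; reading across one
row gives everything about a single constant.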
- Show me the definition of this variable or method just by clicking it.
- Tell me which classes and methods call this method and how many times. This XREF
is always up to date because the source is in a database. It does not need to be
scanned periodically to create a fresh XREF.
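One way to picture an always-current XREF is an index that is adjusted every time
a call is added to or removed from the stored program, rather than rebuilt by a
scan. This is only a sketch: CallIndex and its method names are hypothetical, and
a real SCID would key on resolved method identities rather than plain strings.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class CallIndex {
    // callee -> (caller -> number of call sites in that caller)
    private final Map<String, Map<String, Integer>> callers = new HashMap<>();

    /** Record one new call site; invoked whenever the editor stores a call. */
    void addCall(String caller, String callee) {
        callers.computeIfAbsent(callee, k -> new TreeMap<>()).merge(caller, 1, Integer::sum);
    }

    /** Forget one call site; invoked whenever the editor deletes a call. */
    void removeCall(String caller, String callee) {
        Map<String, Integer> sites = callers.get(callee);
        if (sites == null) {
            return;
        }
        // Returning null from the remapping function removes the entry.
        sites.computeIfPresent(caller, (k, n) -> n > 1 ? n - 1 : null);
    }

    /** Who calls this method, and how many times each. */
    Map<String, Integer> whoCalls(String callee) {
        return callers.getOrDefault(callee, Collections.emptyMap());
    }
}
```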
- Optionally show me code with all class names fully qualified by package, or
remove that qualification for all or some classes.
- Tell me which classes and methods look at this variable and how many times.
Similarly, tell me which ones change it.
- Collapse/expand level of detail, e. g. collapse detail of CASE bodies, LOOP
bodies, IF/ELSE bodies, parameter details leaving just the names of the methods
being called, collapse purely arithmetic assignments.
- By using specially coded comments you can hide/reveal various classes of them.
You can hide code and just read comments, or perhaps just see the overview
comments, or just the comments explaining what the various classes are for etc.
The key is to show you just the level of detail in comments you need for the
current task without being overwhelmed with irrelevant detail. You could
configure which categories of comments you wanted to see fully expanded, and
which you wanted revealed only by hoverhelp, and which you wanted totally
suppressed. By using hoverhelp to display comments you free up screen real
estate to see more code at once on screen. You could implement comment hiding/hoverhelp
without a SCID using a smart traditional text editor, using markers to tag
comments with level of detail and importance (or severity, as Chuck Sheehan, the
technique’s inventor, calls it).
Programmers very familiar with the code might be less likely to remove JavaDoc
or complain about it, if they could get it off their screens. The main
drawback of doing this is out of sight, out of mind. Cowboy coders would be even
less likely to keep comments in sync with the code.
- When I write the code to call a method, show me the names, types and JavaDoc for
the parameters.
- Show me the names of the parameters next to the parameters themselves in each
method invocation so I can proof-read it, the way you can in Ada-95:
drawCircle( x => point.x, y => point.y, radius => 5 );
or Modula-3:
drawCircle( x := point.x, y := point.y, radius := 5 );
or as Java-style comments:
drawCircle( /* x */ point.x, /* y */ point.y, /* radius */ 5 );
As I become familiar with certain methods, turn this expansion off for those
methods only. In a similar way, optionally expand/collapse calls with parameter
type information as well.
- When I type in an identifier the SCID has never heard of, use spell check logic
to suggest what I likely meant. Eclipse now does this. I spend so much time
correcting typos, variant abbreviations, errors in capitalisation and inconsistent
capitalisation, e.g. Hashtable vs HashTable, when several variants are plausible.
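The spell check logic could be as simple as ranking known identifiers by edit
distance from what was typed. The sketch below uses plain Levenshtein distance,
which is my choice for illustration; the class name IdentifierSuggester and the
maxDistance cut-off are assumptions, and a real tool would likely weight
capitalisation slips more lightly than other edits.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

class IdentifierSuggester {
    /** Classic Levenshtein edit distance, computed one row at a time. */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            int[] cur = new int[b.length() + 1];
            cur[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                cur[j] = Math.min(Math.min(cur[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            prev = cur;
        }
        return prev[b.length()];
    }

    /** Known identifiers within maxDistance of what was typed, closest first. */
    static List<String> suggest(String typed, Collection<String> known, int maxDistance) {
        List<String> out = new ArrayList<>();
        for (String k : known) {
            if (distance(typed, k) <= maxDistance) {
                out.add(k);
            }
        }
        out.sort(Comparator.comparingInt((String k) -> distance(typed, k))
                .thenComparing(Comparator.naturalOrder()));
        return out;
    }
}
```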
- Warn me if I reuse a name locally that is already defined as an instance or
static variable, except for the usual exceptions.
- Show me my declarations aligned in columns, perhaps using compact glyphs to
indicate static, instance, public etc. so that I can easily pick out parallels
in names and types.
- Let me see switch statements as if they had been coded in Eiffel as inspect
statements. Let me see declarations, expressions, loops, if nests etc, in my
favourite syntax, in any of the Algol family of languages such as: Eiffel, Ada-95,
Java, Dylan, Scheme, Algol-68, beta, Pascal, Delphi, Oberon, Modula, NetRexx,
Python, Sather, roll-your-own such as Abundance or even as flowcharts. You
should be able to key code in any of these modes too. The language would still
be Java underneath, with a surface veneer to simulate the coding conventions of
these other languages.
- Show me the JavaDoc comment for that parameter on demand.
- Global method renaming. Accurate, unambiguous method and variable naming is the
most underrated technique for writing maintainable code. Whenever you add a new
method, there is a strong possibility some existing similar method should be
renamed so the distinction between the two is more clear. Scope name clashes can
be resolved to avoid confusing programmers. Compilers have no trouble with
accidentally duplicated names, but programmers are easily befuddled. Renaming
globally by hand is so error-prone that it is almost never done. With
a SCID, it would be effortless and completed in an eye blink. You could also do
generalisations of renaming, e.g. reordering parameters to some more consistent
standard, or adding overloaded methods to handle common default parameters, and
having all code converted to use the new overloaded methods.
- Show me the program with the Spanish strings inserted. Show it to me with the
Spanish variable names where they are available, but use English ones where not.
Let me read it as if it were written purely for Spanish with any
internationalisation bubblegum housekeeping hidden.
- Show me the program with all needless () levels removed that our newbie
programmer put in. The () are not actually stored in the database. They are
regenerated to suit the individual programmer preference.
- Show me the program with extra parentheses () inserted because I can never
remember the precedence distinctions between && and <.
- Show me the program with the Whazmotron custom glyph set so that I can easily
pick out if begin end, loop begin end, class begin end, method begin end.
- What classes are available to me at this point in the program? What local
variables are in scope?
- How should you display semicolons? Once you have the parse tree, they are purely
for the convenience of the humans reading the code. They are not actually in the
computer’s parse tree. You could display with any statement
convention you wanted. Every programmer could flip between any display mode they
wanted.
- You might leave them out.
- You might use Pascal separator, Java-like terminator, or Eiffel-like only-when-it-would-otherwise-be-confusing
rules.
- You might use a pure indenting convention.
- You might draw boxes, or non-outlined boxes around each statement in a subtly
different shade from the background colour.
- You might use a special fat glyph, perhaps a little red stop sign, that is very
easy to tell apart from a colon.
- You should be able to ask, what methods are available at this point in the
program that produce a Zomblat object? What methods are available that take a
Zomblat object as a parameter? What methods are available that take both a
Zomblat and a Color object?
- What methods are available anywhere that produce a Zomblat object? What methods
are available that take a Zomblat object as a parameter?
- Show/hide the Eiffelian pre/post assertions. You can fill in dialog boxes about
each parameter, variable or return results. For ints you may specify the
acceptable low-high bounds. For strings you would specify whether they may be
null or empty "" and whether they may have leading/trailing blanks. You might
specify that they must be all upper case, all numeric, all lower case, no
accented letters etc. For enumerations you would specify the list of allowable
values. For debugging, you can turn this code on to ensure all the conditions
are being met.
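The code a SCID generates from such a dialog box might be no more than a small
check object per parameter. A minimal sketch for the int low-high case; the class
IntBounds and its static enabled switch are hypothetical names chosen here, not
part of any existing library:

```java
class IntBounds {
    /** Hypothetical global switch: the checks only run while it is on. */
    static boolean enabled = true;

    final int low;
    final int high;

    IntBounds(int low, int high) {
        this.low = low;
        this.high = high;
    }

    /** Returns the value unchanged, or throws if checking is on and it is out of bounds. */
    int check(String name, int value) {
        if (enabled && (value < low || value > high)) {
            throw new IllegalArgumentException(
                    name + "=" + value + " is outside " + low + ".." + high);
        }
        return value;
    }
}
```

A method body would then start with something like month = MONTH.check("month",
month), and the SCID could show or hide that line on request.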
- When debugging, the SCID secretly captures information about where in the code
each string in the output came from so that you can click anything in the
console output and instantly jump to the System.out.println statement that
generated it. It should be easy, when debugging, to temporarily assign colours to
the console output from different classes or println statements to help classify
them, similar to the way logging can be configured.
- Capture additional information about fields useful for data entry, such as low-high
bounds, blank if zero, left leading zero fill, commas, lists of legal values,
justification, natural layout parameters, field name or display, prompts, field
widths, validation routines,… Programs can access this data rather than
specifying it inline in the code. This keeps everything about the variable in
one place where it can be easily accessed and changed. It also facilitates
searching for fields that share some property and bulk replacing it.
- That was an ambiguous name for that method. Change it everywhere it is used to
this clearer name, but don’t change it where that same name is used in
another class. Computer, be clever, don’t pester me to figure it out for
you which ones should be changed. IBM’s Visual Age can do this already.
With a database, a variable or method name string actually appears in only one
place, (everywhere else the name is represented by a pointer to that name), so
it is trivial to make a global change.
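The "name appears in only one place" idea can be sketched with a table of names
and code that refers to them by index. SymbolTable and its methods are
hypothetical, and the token list (Strings for literal text, Integers for symbol
references) is a deliberately crude stand-in for a real parse tree:

```java
import java.util.ArrayList;
import java.util.List;

class SymbolTable {
    // The only place any identifier's spelling is stored.
    private final List<String> names = new ArrayList<>();

    /** Declare a new symbol and return the id that code will refer to it by. */
    int declare(String name) {
        names.add(name);
        return names.size() - 1;
    }

    /** A global rename touches exactly one slot. */
    void rename(int id, String newName) {
        names.set(id, newName);
    }

    /** Regenerate text: Integer tokens are symbol ids, anything else is literal text. */
    String render(List<Object> tokens) {
        StringBuilder sb = new StringBuilder();
        for (Object token : tokens) {
            sb.append(token instanceof Integer ? names.get((Integer) token) : token);
        }
        return sb.toString();
    }
}
```

Two methods in different classes that happen to share a spelling get different
ids, so renaming one cannot disturb the other.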
- Display the program using foreground/background color, font family, font size,
font style (bold, italic), lines, glyphs and icons to pack as much additional
information on the screen as possible. For example you might be able to tell a
stack/temporary variable, from an instance variable from a static variable from
a constant just by looking at the font, or some slight shade of foreground or
background colour difference, e.g. dark brown, orangey/brown, and light brown
foreground. The clues may be almost subliminal. You could encode all kinds of
information compactly such as: local, parameter, instance, static, my class, Sun
class, type, package, class, definition, keyword, final… all in a way
that did not get in your face. You could encode for whatever distinctions were
important at the time.
Variable pitch fonts are possible without giving up alignment. They put more on
the screen and are more readable than fixed pitch fonts.
Exactly how these abilities are used will change constantly depending on your
current task. The idea is to encode information about symbols in their look.
- You could use the full colour abilities of the modern screen to give subliminal
clues, e.g. by automatically assigning a portion of the spectrum to each package/class
using pastel shades as the backgrounds to any references to methods or
variables of that class. You could bold face the definition of any identifier to
make it stand out. You could make calls to Sun code look different from calls to
your own code.
- Chris Uppal suggested using colour coding by author. He noted that in every shop
where he had worked,
there were programmers he could trust and ones he did not. If he were attempting
to understand or debug a chunk of code it would help if he knew which stretches
could be trusted to do what they claimed to do, and which stretches warranted
more scrutiny.
- You could encode the age of code by colour. Generally the newest code is most
suspect if there is a problem. Sometimes old code, that was done before some
specification change occurred, needs to be examined and ticked off as compatible
with the new spec. You can use colour to help keep track of which code has been
checked. A SCID would know the age of every token to the millisecond, much finer
resolution than could be pulled off with CVS deltas.
- You could ask that all code be filtered out unless it had to do with
instantiating objects (other than common ones like String). This skeleton view
would give you a pretty good overview of how all the classes fit together.
- You could ask to globally visit all references to a given method or variable,
and tick them off once each was dealt with.
- You could do quite a bit of code writing by point and click. There is no need to
type a variable or method name, just select it from a palette of likely variable
or method names. You could type personal abbreviations for them and have them
expanded. You could view code with your personal names or the official ones. For
example to write a FOR loop there might be boxes you fill in for the various
initializer, terminator, and incrementor expressions. They would default to int
i=0, i<n, and i++.
You could give the loop a name to be displayed when its body is collapsed. You
could convert the loop to one that ran backwards by a single click, or to one
that generated a WHILE or UNTIL loop similarly. If you ticked enumeration,
all you would need to type is the name of the Enumeration generator.
The rest would be generated for you, accurately.
- Alternate display with common functions displayed as if they were infix
operators using special glyphs (simulated here with red). For example,
instead of seeing:
if ( a.equals(b) )
You might see
if ( a == b )
Instead of
if ( a.compareTo(b) < 0 )
You might see
if ( a < b )
- The SCID would act as a Java lint, displaying suspicious or unusual code in a
special colour, and perhaps ask you for confirmation when you inserted code of
the form if (myString == "abc") or if (myBool = a & b).
- Show or hide explicit conversions.
- Display declarations in a grid so that is easy to pick out the variable name,
the type and the initialisation. They line up nicely in columns like a
spreadsheet, possibly with each column separately scrollable so that you can see
the big picture and home in on the detail when you need it.
- Embed HTML comments in your code that render, complete with diagrams and images
when you read the code. There could be links to references to where the
algorithms were documented etc.
- Show me the code in pseudo NetRexx, Bali or JPython, with obvious declarations
removed so I can focus on the procedural logic, or vice versa. I would then see an
enumeration iteration written tersely as for r in reminders instead of the usual
Java verbosity.
- Bali-style variable size parentheses.
In Java a piece of code might be displayed like this:
int a = ((b+c)/(e+f))*(g(i)+h);
That same piece of code displayed in Bali might look like this:
int a = ((b + c)/(e + f)) * (g(i) + h);
The red is just to highlight the outsized (), though
colour coding matching () and {} is not such a bad idea. It might even be
optionally displayed like this:
        b + c
int a = ----- * (g(i) + h);
        e + f
- Show get/set method invocations as if they were direct access to an associated
property variable, similar to Delphi or Eiffel. This simplifies the syntax.
Instead of seeing:
setFudge( getFudge()+1 );
you would see:
fudge++;
- Use colour to display literals to group digits by three for decimal and octal
and by 4 for hex, emphasising the trailing indicator char in a different colour
so it does not get confused with a digit. A listing might look like this:
In some Asian countries, decimal digits are also grouped by four. The SCID would
allow either preference, defaulting to the locale default, so different people
would see the same code differently.
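Plain text cannot show colour, so the sketch below stands in for it by marking
the group boundaries with underscores, which Java 7 and later also accept inside
numeric literals. The class name LiteralGrouper is hypothetical, and only integer
literals with an optional 0x prefix and L suffix are handled:

```java
class LiteralGrouper {
    /** Group decimal and octal digits by three and hex digits by four, from the right. */
    static String group(String literal) {
        String body = literal;
        String suffix = "";
        if (body.endsWith("L") || body.endsWith("l")) {
            // Split off the trailing indicator so it is never counted as a digit.
            suffix = body.substring(body.length() - 1);
            body = body.substring(0, body.length() - 1);
        }
        String prefix = "";
        int size = 3;
        if (body.startsWith("0x") || body.startsWith("0X")) {
            prefix = body.substring(0, 2);
            body = body.substring(2);
            size = 4;
        }
        StringBuilder grouped = new StringBuilder(body);
        // Work right to left so earlier insertions do not shift later positions.
        for (int i = body.length() - size; i > 0; i -= size) {
            grouped.insert(i, '_');
        }
        return prefix + grouped + suffix;
    }
}
```

Under these assumptions 1234567 would display as 1_234_567 and 0xCAFEBABEL as
0xCAFE_BABEL; a locale that groups by four would simply use a different size.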
- Display using lines or slight shade variations in background colour to mark the
bounds of ifs and loops. Programs would look more like flow charts, or more like
text with highlights, as the programmer preferred for the current purpose.
Vertical striation watermarks in the background would make it easier to see
matching alignments. You might draw thin vari-shaded boxes around each nested
block. You might bracket blocks with {} turned 90 degrees and made 10.16 cm (4 in)
wide. CSD is one such representation.
- Optionally apply Hungarian notation warts to variable names to indicate variable
type or scope. Turn them on and off at will. They are always accurate, e. g.
Scope prefixes might work like this:
- local a (e.g. aPoint)
- param p (e.g. pPoint)
- member instance m (e.g. mPoint)
- static s (e.g. sPoint)
- exception X (e.g. XOutOfBounds)
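Because the SCID already knows each variable's scope, the warts can be computed
at display time rather than typed. A minimal sketch using the prefixes listed
above; the enum name Scope and the decorate method are assumptions:

```java
enum Scope {
    LOCAL("a"), PARAM("p"), MEMBER("m"), STATIC("s"), EXCEPTION("X");

    final String prefix;

    Scope(String prefix) {
        this.prefix = prefix;
    }

    /** Display form of a stored name: prefix plus the name with its first letter capitalised. */
    String decorate(String name) {
        return prefix + Character.toUpperCase(name.charAt(0)) + name.substring(1);
    }
}
```

The stored name stays plain (point); Scope.MEMBER.decorate("point") is only ever
what one programmer chooses to see.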
- Highlight all code involving floating point. What I am talking about here is not
permanently highlighting floating point operators and operands, for example, but
just for the next 10 minutes, because floating point is the thing I am
concentrating on at the moment.
The syntax colouring schemes I am familiar with are designed to be done once and
left alone once you have them tweaked the way you like them.
For a SCID, you need not only ways to change the syntax highlighting, but to rapidly
flip between presets to enhance the current interest and to suppress the current
irrelevancies. You also need ways to rapidly set up new interest constellations.
I use the word highlight in a broader sense. With a GUI and SCID you may
use combinations of colour, font, size, glyphs, background colour, hiding,
folding, lines, geometric shapes, bold, italic, blink etc. etc. to make a
certain constellation of currently interesting features leap out at you.
Different interesting features would use different highlighting techniques to
grab your attention simultaneously.
- If you elected to view an IF as a flow chart, you could more easily compare the
true and false branches line by line side by side. With the modern GUI’s
ability to rapidly pan in 2D or even 3D, we should break the mindset that
programs have to be a single linear column of text. We can pay more
attention to what actually works for the eye, not what is easiest to code. I am
pretty sure that long horizontal lines of text, stretching all the way across
the screen, so popular now, will prove to be suboptimal. You might look for
inspiration to website navigation aids or the Windows ME exploding menus.
- You might create your own glyphs or icons to represent methods, classes,
operators, variables, syntactic elements etc. That way you can pack more
information onto the screen at once. You create a personal way of
displaying the program to yourself that no one else in the universe need be able
to make sense of. You share the underlying code syntax tree with your fellow
programmers. The representation is personal and evanescent.
- You want to see program flow under a certain set of circumstances. Code that
would not be executed when those circumstances don’t apply is temporarily
suppressed from the display. You are left with a simplified flow chart that
shows execution flow. You can focus on the usual case, then later view various
pathological cases, independently. You don’t have to deal with the full
complexity of everything all the time the way you do in conventional coding. The
whole point of the SCID is to temporarily suppress what is temporarily
irrelevant.
- Idiom expansion. There are many things in Java
that take reams of code to express. You can’t abbreviate it by writing a
method. Instead, you code an abbreviation, or fill in some blanks in a dialog
box, and it generates the bubblegum for you, error-free.
- Idiom detection. Java is verbose, but tends to follow standard patterns called
idioms, e.g. enumerating a set, hooking up buttons to listen for events. The
SCID can detect the pattern and replace it with an abbreviation for display. If
code that looks like an idiom refuses to abbreviate, you can be sure it is not
quite the standard idiom. That may be an error, or it may be deliberate. Code is
much easier to proofread this way. You don’t need
your eyes to detect tiny variations from the standard idioms.
- You want to be able to trace not only program flow but data flow. Consider a
program rendered like a flow chart, with parts of it suppressed. Lines show how
a particular datum flows through the system, how it gets operated on and
modified. Code that is not germane to that flow is temporarily suppressed. I am
hand waving frantically here. What the heck am I talking about? Consider a
program where you entered a birthdate. There are parts of the program that would
be totally unaffected by that birthdate. Those parts could be suppressed if you
were concerned with how the birthdate affects the program. There are degrees of
association. A test on birthdate is a little less associated than some code that
transforms a birthdate ordinal into year, month and day for display.
- Pale finals. I would like the SCID to mark all variables that are not redefined
with a pale final to let me know I need not worry about subsequent redefinition
of the value. Similarly, I would like the SCID to mark all methods and classes
that are not overridden with a pale final to let me know there is no
redefinition of the method to worry
about. These would be generated dynamically, not part of the source code, and
could be turned on or off. They would look like regular finals, except would be
displayed in a pale colour to indicate their ghostly nature. They would not
prevent me from redefining the variable or the method. The pale final would
simply disappear.
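For variables, deciding where a pale final belongs comes down to counting
assignments. The sketch below assumes the SCID can hand over the list of
assigned variable names in a scope, with a declaration that has an initialiser
counting as one assignment; PaleFinals is a hypothetical name, and flow analysis
(assignments inside loops, definite assignment) is ignored:

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class PaleFinals {
    /** Variables assigned exactly once, in first-seen order: candidates for a pale final. */
    static Set<String> candidates(List<String> assignments) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String variable : assignments) {
            counts.merge(variable, 1, Integer::sum);
        }
        Set<String> result = new LinkedHashSet<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() == 1) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

Adding a second assignment to a variable changes its count, so the next redisplay
simply drops the marker.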
- Display complicated expressions in true mathematical form much as TeX
or the Microsoft equation editor would display them, with variable sized parentheses
and denominators under numerators. To help understand expressions, you could ask
bits of them to collapse on screen. A simple version would adjust the amount of
space around each operator to indicate relative precedence.
Low precedence operators would be surrounded by more space. Java has 13 levels
of precedence, but you would not normally find them all in one expression. The relative
distinctions in spacing could be obvious. You would not have a fixed
amount of space for each precedence level. You could also display complex
expressions as a parse tree, with operators at the nodes. What you see need not
have that much relation to what you type. For example, you could type GT for >
since for some people it is easier to type.
- Let you refactor code by breaking up methods into smaller ones. You just
highlight the hunk of inline code you want made into a separate method. In
theory it should even be possible to automatically determine the parameters and
their types. The system could then go looking for code that does inline what
your new methods do, and replace that inline code with the new encapsulated
calls.
- Lets you select colours within the SCID using a ColorChooser. Colours and
variables/constants representing colours in code can be represented in any
combination of three ways:
- By colour sample swatch.
- By colour name.
- By colour number (hex/decimal)
- Exploit new high res 1 meter square LCD or gas plasma panels so you have room to
see everything at once, visually navigating your way around the entire code
space, rather than peeking at it through a toilet tube the way we do now.
- With everything preparsed, writing your own custom code conversions would be a
lot easier. For example, you might write a translator from AWT to Swing code.
- True visual editing. Your GUI program looks like the final screen output. You
right click on any component which brings up a dialog box from which you can
change, colour, font, border, initialisation, associated event handlers…
mostly by ticking off boxes and making multiple choice selections. Your program
always works to some degree since you can’t select anything syntactically
invalid. Navigation is far easier, since you don’t even need to remember
the names of things. You can of course find out the names of things by right
clicking them. Code becomes far less procedural and more OO.
A supermarket parking lot helps its customers find their cars by posting signs
with animals on them in various parts of the parking lot. It is much easier for
someone to remember they left their car near the salmon than in sector E6.
Similarly you could embed landmark symbols in the code, perhaps with purely
personal meaning, and only visible to one programmer, just to help her find her
way around. You could click on a landmark symbol to
get back to a section of code you were working on recently.
- For importing ordinary Java source code, a parser such as JavaCC or ANTLR might
be useful. See parser in the Java & Internet Glossary.
- Ability to add shortcuts to the syntax such as Abundance-style moods and for-each
loops. Instead of saying x.keyin(); y.keyin(); z.keyin();
you can declare keyin as a mood, and say: keyin
x, y, z; You could say things like: for ( MyArray) {
MyArray.x += MyArray.y + 1; } to run through all the elements of the
array and provide an implied default subscript inside the loop body. Collection
iteration could be much terser as it is in most modern languages.
You can invent your own language shortcuts. No other programmer need view them.
They would see perfectly standard Java. However, when you viewed their code,
your shortcuts would be applied so you would not have to wade through their reams
of dinosaurian repetition. You might for example add case ranges to the switch,
implemented with a binary search. If your shortcut got in the way, you could
drop it, and instantly see standard Java again. After all, this is software,
right? It is supposed to be malleable and comfortable. With traditional coding
techniques, software becomes so rigid that it is harder to change than the supposed hardware.
- I suspect that SCIDs will create a revolution in terseness of language design.
It will come about gradually like this. A SCID will give you the ability to
temporarily hide bookkeeping/busywork/plumbing/wiring (pick your analogy)
details. Next will come the call for the SCID to automatically generate those
details. Next will come the complete suppression of them from the application
programmer’s awareness. They will be hidden completely behind the walls,
no longer part of the day-to-day application programming language. Every time
you can suppress 30% of the busywork, a whole new
set of patterns emerges that was formerly obscured by all that fussy detail.
Suddenly, you discover new ways to more tersely specify your desires to the
computer. You see new levels of bookkeeping/busywork/plumbing/wiring that can be
similarly hidden, revealing still more deep structure. This is not just
speculation. I have seen this process in the evolution of my own Forth-based
language, Abundance.
- SCIDs will also have another influence on language design. Right now we are
stuck in a mindset that a computer language is a linear stream of vanilla 7-bit
ASCII characters. SCIDs will loosen that up. We are already seeing that with
tools like the Symantec layout editor. A program can be a diagram in 2D. Font
style can have semantic meaning. As Noam Chomsky might put it,
programs may have many temporary surface representations of a single deep
structure. We will see multiple alias names for variables, multilingual variants
of the Strings that can be flipped with the click of a menu item.
- A SCID could potentially store a lot more information (not normally visible)
than a text file representation would. For example, you could fairly easily
automatically record who changed each individual element in each line of code,
why and when and as part of which job layer. See dynamic
version control for what I mean by job layer. Global renamings would be
labelled as such, not as a million separate little transactions. This
information could also be used by the boss to track precisely what a
telecommuting employee did during the week. You could add all the commentary you
wanted without worrying about overwhelming the reader for whom it was irrelevant.
The other information you could track at each node is who has access to look at
or change that piece of code.
- For some notes on how a SCID might be implemented so that many users could be
updating the same code simultaneously from several globally dispersed sites,
again see dynamic
version control.
- If Microsoft (the inventors of the dancing paperclip) ever gets hold of the
SCID concept, I suspect they will totally misunderstand it. Their SCID will be a 3D
simulation of a Disneyland ride where you passively watch transactions being
processed by a McDonald’s hamburger machine to the endlessly repeating
strains of It’s A Small Small World. To debug, you watch the
individual bits shaped like French fries being cooked, salted then "added",
get it? It takes only 10 seconds for the animation to complete the addition of
two numbers.
- Have a look at Visual
SlickEdit. It is not a SCID, but with every release it develops more and
more SCID-like features.
- I have written an essay
on online books. I propose a SCID-like solution to handling the problem of
information overload in technical documentation.
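
The precedence-based spacing idea mentioned earlier in this list could be prototyped in a few lines. What follows is only a sketch: the class name PrecedenceSpacing, the token-list input and the rank numbers in the table are my assumptions for illustration. It covers just ten binary operators and ignores parentheses and unary operators. The tightest-binding operator present gets no padding and each looser level present gets one more space, so the spacing is relative to the expression rather than fixed per precedence level.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

class PrecedenceSpacing {
    // Illustrative ranks only: a higher number binds tighter.
    private static final Map<String, Integer> RANK = Map.of(
            "*", 10, "/", 10, "%", 10,
            "+", 9, "-", 9,
            "<", 8, ">", 8,
            "==", 7,
            "&&", 6,
            "||", 5);

    /** Joins tokens, padding each operator according to its relative precedence. */
    static String format(List<String> tokens) {
        // Collect only the precedence levels that occur in this expression.
        TreeSet<Integer> present = new TreeSet<>();
        for (String token : tokens) {
            Integer rank = RANK.get(token);
            if (rank != null) {
                present.add(rank);
            }
        }
        StringBuilder out = new StringBuilder();
        for (String token : tokens) {
            Integer rank = RANK.get(token);
            if (rank == null) {
                out.append(token); // operand: no padding
            } else {
                // One space for every tighter level present in the expression.
                int spaces = present.tailSet(rank, false).size();
                String pad = " ".repeat(spaces);
                out.append(pad).append(token).append(pad);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: a + b*c  ==  d
        System.out.println(format(List.of("a", "+", "b", "*", "c", "==", "d")));
    }
}
```

The sketch needs Java 11 or later for String.repeat. A real SCID would walk its parse tree rather than a flat token list, but the relative-ranking trick would be the same.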
How Might You Implement a SCID
Instead of the traditional CVS or editor model, where you have lines of ASCII text,
you would have a tangled hairball of objects, one object for each token, e.g. IF,
variable reference, method definition. The objects would have pointers to each
other so you can rapidly find related information and rapidly navigate the
program at any level of detail. References to a variable would not contain the
name of the variable, just a pointer to its associated token object. The actual
string name of a variable or method would appear in only one place. (This makes
global rename and aliasing trivial.)
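
To make that concrete, here is a minimal sketch of a declaration object that owns the name and reference objects that merely point at it. The class names VarDecl, VarRef and RenameSketch are hypothetical, invented for this illustration; a real SCID would have many more node types.

```java
import java.util.ArrayList;
import java.util.List;

/** A declaration node: the only place the variable's name is stored. */
class VarDecl {
    String name;
    final String type;

    VarDecl(String type, String name) {
        this.type = type;
        this.name = name;
    }
}

/** A use of a variable: holds a pointer to the declaration, not a copy of the name. */
class VarRef {
    final VarDecl target;

    VarRef(VarDecl target) {
        this.target = target;
    }

    /** Looks the name up at display time, so it is always current. */
    String render() {
        return target.name;
    }
}

class RenameSketch {
    public static void main(String[] args) {
        VarDecl count = new VarDecl("int", "cnt");
        List<VarRef> uses = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            uses.add(new VarRef(count));
        }
        count.name = "lineCount"; // the global rename is a single assignment
        for (VarRef use : uses) {
            System.out.println(use.render()); // prints lineCount three times
        }
    }
}
```

Because no reference holds a copy of the string, a rename cannot miss an occurrence, and an alias could be added as a second display name on the declaration.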
There are TreeMaps so you can find symbols by name or approximate name or by
name/property combinations.
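
As a rough illustration of the TreeMap lookup, the sketch below indexes symbols by name and treats "approximate name" as a prefix match, which is my simplifying assumption. The class SymbolIndex is hypothetical, and a plain String description stands in for a real declaration node.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

class SymbolIndex {
    // name -> description of the declaring node (a String stands in for a node object)
    private final TreeMap<String, String> byName = new TreeMap<>();

    void add(String name, String description) {
        byName.put(name, description);
    }

    /** Exact lookup; returns null when the name is unknown. */
    String findExact(String name) {
        return byName.get(name);
    }

    /** All names starting with the prefix, in sorted order. */
    List<String> findByPrefix(String prefix) {
        // Every key in [prefix, prefix + '\uffff') starts with the prefix.
        SortedMap<String, String> range =
                byName.subMap(prefix, prefix + Character.MAX_VALUE);
        return new ArrayList<>(range.keySet());
    }

    public static void main(String[] args) {
        SymbolIndex index = new SymbolIndex();
        index.add("getName", "method in Person");
        index.add("getValue", "method in Cell");
        index.add("setName", "method in Person");
        index.add("get", "method in Cache");
        System.out.println(index.findByPrefix("get")); // [get, getName, getValue]
    }
}
```

Name/property combinations could be handled with additional maps keyed on the property, or by filtering the range returned here.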
There is no source code, just the parse tree. You are thus free to display it in
many different possible formats, or to export traditional Java source. The parse
tree always represents a syntactically valid Java program.
The parse tree contains much more data than the equivalent source code, e.g.
history of change, who changed each token and why.
The parse tree is RAM-resident, or stored in a decent persistent object database
that approaches RAM-resident performance, such as ObjectStore.
Even for a purely RAM-resident implementation, the data must be persisted,
that is, dumped to disk and restored as a lump with all the interconnections
intact. Execution (but not startup) would be faster than using a POD (Persistent
Object Database). You need to
log transactions to disk, but everything else lives in a giant virtual RAM space.
Someday we will learn to snapshot entire virtual address spaces and pick up
later exactly where we left off.
I repeat, the parse tree always represents a syntactically valid program. It
might not necessarily do anything sensible, but it would "compile".
Changes to the parse tree are applied in the form of atomic transactions to
ensure the integrity of the tree cannot be compromised.
Other sorts of auxiliary data may be stored in a conventional SQL database where
it would be accessible to user-written queries. However, the source code itself
has too complex a structure to fit into the row-column SQL model.
There is a log of transactions that can be replayed in event of failure, or
analysed to recreate the dynamic change history. You can play the log forward or
back. The advantage of this log is that even in the event of catastrophic
failure you would never lose more than a few seconds’ worth of keying.
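
A log that can be played forward or back might look something like the following sketch. Everything here is assumed for illustration: the Edit interface, the InsertToken edit and the EditLog class are hypothetical names, a flat list of tokens stands in for the parse tree, and the log lives in memory where a real one would be written to disk.

```java
import java.util.ArrayList;
import java.util.List;

/** One reversible change to the program. */
interface Edit {
    void apply(List<String> tokens);
    void undo(List<String> tokens);
}

/** Inserts a single token at a fixed position. */
class InsertToken implements Edit {
    private final int index;
    private final String token;

    InsertToken(int index, String token) {
        this.index = index;
        this.token = token;
    }

    @Override
    public void apply(List<String> tokens) {
        tokens.add(index, token);
    }

    @Override
    public void undo(List<String> tokens) {
        tokens.remove(index);
    }
}

/** An in-memory log that can be played forward or back. */
class EditLog {
    private final List<Edit> entries = new ArrayList<>();
    private int cursor = 0; // number of entries currently applied

    /** Applies a new edit and appends it, discarding any undone tail. */
    void record(Edit edit, List<String> tokens) {
        entries.subList(cursor, entries.size()).clear();
        edit.apply(tokens);
        entries.add(edit);
        cursor++;
    }

    /** Undoes the latest applied edit; returns false at the start of the log. */
    boolean back(List<String> tokens) {
        if (cursor == 0) {
            return false;
        }
        entries.get(--cursor).undo(tokens);
        return true;
    }

    /** Reapplies the next undone edit; returns false at the end of the log. */
    boolean forward(List<String> tokens) {
        if (cursor == entries.size()) {
            return false;
        }
        entries.get(cursor++).apply(tokens);
        return true;
    }

    public static void main(String[] args) {
        List<String> tokens = new ArrayList<>();
        EditLog log = new EditLog();
        log.record(new InsertToken(0, "int"), tokens);
        log.record(new InsertToken(1, "x"), tokens);
        log.back(tokens);
        System.out.println(tokens); // [int]
        log.forward(tokens);
        System.out.println(tokens); // [int, x]
    }
}
```

Recovery after a crash would amount to loading the last snapshot and calling forward until the log runs out.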
When you get around to implementing dynamic version control, this transaction
log must be sent to a central site and merged in real time with transactions of
other people’s changes, then redistributed to all the redundant hot copies
of the database. This implies a 24-hour Internet connection between all the
programmer sites, or at least while any programmer is active at a site. The key
is that all copies of the database must process all the transactions in exactly the
same order. For speed, you might process local transactions immediately, then back
them out if it turns out there were transactions from other sites that actually
needed to be processed first. For more detail on how that might work see dynamic
version control.
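
The back-out-and-reapply idea can be sketched as follows. This is a toy under stated assumptions, not a design: the classes Txn and Replica are hypothetical, each transaction is assumed to carry a globally agreed ordering key seq (how the sites agree on it is left open), and a transaction here just appends text, so undoing one means trimming the end of the buffer.

```java
import java.util.ArrayList;
import java.util.List;

/** A transaction with an assumed globally agreed ordering key. */
class Txn {
    final long seq;
    final String text; // payload: text appended to the document

    Txn(long seq, String text) {
        this.seq = seq;
        this.text = text;
    }
}

/** One hot copy of the database, reduced to a text buffer. */
class Replica {
    private final List<Txn> applied = new ArrayList<>(); // always in seq order
    private final StringBuilder doc = new StringBuilder();

    /** Applies t at once, first backing out any later transactions already applied. */
    void receive(Txn t) {
        int keep = applied.size();
        while (keep > 0 && applied.get(keep - 1).seq > t.seq) {
            Txn later = applied.get(keep - 1);
            doc.setLength(doc.length() - later.text.length()); // undo the append
            keep--;
        }
        List<Txn> redo = new ArrayList<>(applied.subList(keep, applied.size()));
        applied.subList(keep, applied.size()).clear();
        apply(t);
        for (Txn later : redo) {
            apply(later); // reapply in the agreed order
        }
    }

    private void apply(Txn t) {
        doc.append(t.text);
        applied.add(t);
    }

    String contents() {
        return doc.toString();
    }

    public static void main(String[] args) {
        Replica site = new Replica();
        site.receive(new Txn(2, "b")); // local work, applied immediately
        site.receive(new Txn(3, "c"));
        site.receive(new Txn(1, "a")); // arrives late from another site
        System.out.println(site.contents()); // abc
    }
}
```

Two replicas that receive the same transactions in different arrival orders end up with identical contents, which is the property the scheme depends on.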
Books
 |
recommend book⇒Software Engineering Environments: Automated Support for Software Engineering |
| | paperback |
|---|
| ISBN13: | 978-0-07-707432-6 |
|---|
| publisher: | McGraw-Hill |
| published: | 1993-03 |
| by: | Alan Brown |
|
|
 |
recommend book⇒Object Oriented Databases: and Their Applications to Software Engineering |
| | paperback |
|---|
| ISBN13: | 978-0-07-707247-6 |
|---|
| publisher: | McGraw-Hill |
| published: | 1991-08 |
| by: | Alan Brown |
| He describes, among other things, the ECMA Portable Common Tools Environment (which is a spec rather than a product) and how a few actual CASE tools match up to it |
|
 |
recommend book⇒Doing Hard Time: Developing Real-Time Systems with UML, Objects, Frameworks and Patterns |
| | hardcover |
|---|
| ISBN13: | 978-0-201-49837-0 |
|---|
| publisher: | Addison-Wesley |
| published: | 1999-05-21 |
| by: | Bruce Douglass |
| on the ROPES software-development method built into Rhapsody |
|
Real World SCID Implementations
The usual reaction I get from programmers when I mention SCIDs is that they have
tried them and they hate them. What they have tried are coding templates
where you fill in the blanks. These stop you from coding in the old way, yet
offer almost no payback. Granted, SCIDs will force you to rethink how you compose
programs. Code must at all times be 100%
syntactically correct. However, a good SCID will pay back a hundredfold for this
inconvenience. If you try to import or paste code that is not correct, you will
find much of it being turned into a special kind of comment.
- Semantic Oriented Programming
- SCIDs are not a totally mythical beast. Smalltalk and Logo programmers have been
using them for a long time. IBM’s VisualAge for Java uses a SCID,
though they backed off somewhat with its successor, Eclipse. SCID users are very
enthusiastic about them, even though I think the current crop of tools has just
begun to exploit the possibilities. Jade
stores its code as a preparsed tree. Mozart
develops the idea of concept programming, where you create application-specific
syntax.
- Lisp has been treating programs as structured data for many years.
- See Martin Fowler’s work on Refactoring.
His ideas on automated source transformations require analysing the code as a
parse tree.
- Every version of SlickEdit
comes out with a more and more SCID-like user interface.
- The Xerox PARC people have been experimenting with a new way of organising Java
programs called Aspect Oriented
Programming as a way of specifying facts in only one place declaratively rather
than by sprinkling them redundantly throughout the procedural code. Doing that
makes code much easier to maintain. Declaratively specifying a huge amount of
information that is traditionally handled procedurally is the key to my own
computer language, Abundance, whose primary design goal was ease of maintenance.
You can specify information declaratively and automatically generate the
corresponding procedural Java bubblegum.
Also see Cristina Lopes’ PhD
defense of Aspect Oriented Programming.
- Microsoft had an Intentional Programming project. An intention
is the core essence of a program once you strip out the housekeeping bubblegum
that is necessary to explain picky implementation details the language/tool
cannot handle on its own. Once the programmer has formed an executable thought,
the programmer’s next question is not "what do I have to say to get
the computer to do this?", as was the case in traditional programming, but "what
do I insist on saying?". Intentions are the program spec plus sufficient
detail to specify how you want the problem solved.
- SCIDs are similar to Bell Labs’ SeeSoft,
which generates bird’s eye graphical displays of the entire project that use
colour coded pixels to indicate such things as code age or hot spots where a
profiler determines code spends most of its time executing. You can zoom in on
interesting places to see the actual code. Other things you can colour code with
pixels or coloured background include: code I have recently changed, code others
have recently changed, code that was changed during some time period where a
problem first showed up, code that is frequently changed, code that makes use of
a certain class or method, and where the comments are densest. Basically any metric
you can compute from the parse tree representation can be expressed as colour.
Colouring for absolute frequency of execution points out areas that could
benefit from optimisation. Colouring for relative frequency of execution helps
you pick out the most common paths through the code, i.e. what happens in the
usual case.
- Jim Little’s Prism project seeks to find a representation for SCID data
that can be shared by different programs. That way you could build your SCID
system up out of pluggable components.
- You might mine the i-Logix Rhapsody project for
ideas on visual programming. It is a diagrammatic code generator for C++.
It is based on UML (Unified Modeling
Language, the high-level language for real-time,
multitasking systems) and i-Logix Statemate.
The idea behind Rhapsody is to make the documentation executable. And the
documentation is in the form of a number of diagrams you draw. i-Logix'
Statemate uses enhanced bubble charts that, to paraphrase the Buick commercial,
are not your father’s bubble charts. Briefly, they allow an action upon
entry to a state, while in a state, and upon exit. Further, exits from a state
can branch conditionally and a sub-machine can remember its last state to pick
up where it left off upon re-entry. There’s more, but suffice it to say
that Statemate is very powerful.
- CodeGuide is an IDE that
is taking more of a plunge in the SCID direction than usual.
- The i programming language is reputed to be SCID friendly.
- OpenJava can also be
regarded as a toolkit for constructing a Java preprocessor.
- Jatha is a
simple preprocessor for Java that is inspired by the power of Lisp macros. It is
released under the GPL.
- Juliet lets you ask SCID-like
questions about your source code and rapidly navigate it. It is not an editor,
just a browser.
- Aubjex Alajava was a technology that transformed
Java code into an especially efficient and complete database form, with
generalized capabilities that do for Java source code what database query and
manipulation products do for business data. Author Don Gilmore writes, "Aubjex
is built on SCID. It can parse the entire JDK 1.4
java source package in 30 seconds, into a database that maintains all
information. We have hundreds of XML scripts that query and manipulate the
database. There is a dataflow scripting tool for creating new scripts, although
it is not yet documented."
- There is a SCID discussion group hosted by Google Groups. To get on, send an email
to brightone@o2.pl. You will need to create
a Google Groups account. Then you can visit the SCID
website. You can go here
to unsubscribe. The moderator is Polish, and the host is google.pl, but go ahead
and post in English.
- The following people have expressed interest in writing a SCID. You might get
together with them on a combined project. Email me to
add your name to the list.
Unfortunately, the email addresses below are not clickable.
Further you cannot copy/paste them into
your email program. You must manually re-type them. The email addresses are
graphic *.png images created by Masker. I inconvenience you this way to
discourage spammers from harvesting email addresses from the website with
automated website spidering.
SCID Enthusiasts

| email | name | notes |
|---|---|---|
| | Martin Fowler | repository based code and language workbenches. |
| (image) | Kyle Lahnakoski | |
| (image) | dIon Gillard | |
| (image) | Roedy Green | The author of this essay. |
| (image) | Bill Kress | |
| (image) | Jim Little | |
| (image) | Lew Maestas | |
| (image) | Fabien Duminy | |
| (image) | Steve Lewis | |
| (image) | Graham Perkins | |
| (image) | Robert Bossanyi | |
| (image) | Marcos Diez | |
| (image) | Chris Tutty | |
| (image) | John Bäckstrand | |
| (image) | Maxim Friedental | |
| (image) | Carl Rosenberger | Doing a SCID project with Java, C# and Smalltalk. They plan to be able to generate code in different languages from a common deep structure. |
| (image) | Richard Mullins | |
| (image) | Hugh Doar | |
| (image) | Kimberley Burchett | |
| (image) | S. Saravanan | |
| (image) | David Rosenstrauch | Has completed the initial portion of a SCID project for Java. The app currently parses Java code while the user types it, and then stores it in a database-like format. Project is currently on hold due to lack of time and money. Considering release as open source in the future. His work can be downloaded from darose.net |
| (image) | Rohan Pall | |
| (image) | Don Gilmore & Jonathan Colt | |
| (image) | Ian | Has written a PHP SCID, and is working on a rewrite. |
| (image) | Kirill Osenkov | His thesis is dedicated to building an experimental structured editor for C#. He’s enthusiastic about SCID, intentional programming etc. He believes a structured editor would be a nice front-end for a SCID and is building a structured editor framework for that purpose. www.osenkov.com www.guilabs.net |