Transcript Document

Programming Notes & Advice
Todd Littell
22SEP05
Basic Coding Practices
Maintain State Locally
//////////// Bad C
int count = 0;
void countTokens()
{
while (…)
++count;
}
//////////// Bad Java
public class TokenCount
{
public int count;
public String token;
}
//////////// Good C
int countTokens()
{
int count = 0
while (…)
++count;
return count;
}
//////////// Good Java
public class TokenCount
{
private int count;
private String token;
public int getCount() { return count; }
public void setCount(int cnt)
{ this.count = cnt; }
}
Maintain State Locally (2)
//////////// Bad C
typedef struct {
int count;
char *token;
} TokenCount;
TokenCount tcArray[1000];
int tcArraySize = 1000;
int indexOfToken(char *token)
{
for (int i = 0; i < tcArraySize; i++)
if (!strcmp(tcArray[i], token))
return i;
return -1;
}
//////////// Good C
typedef struct {
int count;
char *token;
} TokenCount;
static TokenCount tcArray[1000];
static int tcArraySize = 1000;
int indexOfToken(TokenCount *array, int
arraySize, char *token)
{
for (int i = 0; i < arraySize; i++)
if (!strcmp(array[i], token))
return i;
return -1;
}
int main()
{
int k = indexOfToken(tcArray, tcArraySize,
argv[1]);
}
Note: Always keep data as “local” and “private” as possible. In C++/Java, use “private” keyword; in C
use static keyword for any globals. Note #2: Only have globals at the highest level of your program, if
at all.
Be Careful of Side-Effects
//////////// Bad Java
void dumpValidAddresses(Set addrSet)
{
Iterator it = addrSet.iterator();
while (it.hasNext())
{
Address addr = (Address)it.next();
if (addr.getCity() == null || addr.getState() ==
null)
it.remove();
else
System.out.println(“Address: “ + addr);
}
}
//////////// Bad C++
void myProc(Address arr[], int &num)
{
for (; num >= 0; --num) ;
}
//////////// Good Java
void dumpValidAddresses(Set addrSet)
{
Iterator it = addrSet.iterator();
while (it.hasNext())
{
Address addr = (Address)it.next();
if (addr.getCity() != null && addr.getState() !=
null)
System.out.println(“Address: “ + addr);
}
}
//////////// Good C++
void myProc(Address arr[], int num)
{
for (; num >= 0; --num) ;
}
Maintain Consistent Style
//////////// Bad C (inconsistent naming)
void cutBreadHorizontally();
void cut_bread_vertically();
//////////// Good C (consistent naming)
void cutBreadHorizontally();
void cutBreadVertically();
//////////// Bad C (inconsistent naming)
void cut_bread_horizontally();
void vertically_cut_bread();
void cut_bread_with_diagonal();
//////////// Good C (consistent naming)
void cut_bread_horizontally();
void cut_bread_vertically();
void cut_bread_diagonally();
//////////// Bad C (inconsistent typing)
short countTokens();
void printTokenCount(int count, char *token);
//////////// Good C (consistent typing)
int countTokens();
void printTokenCount(int count, char *token);
// Or
typedef int count_t;
count_t countTokens();
void
printTokenCount(count_t count, …);
//////////// Bad C (inconsistent signatures)
int indexOf(char c, int start);
int indexOf(int start, char *str);
//////////// Good C (consistent signatures)
int indexOf(char c, int start);
int indexOfchar *str, int start);
Comment when needed
//////////// Bad C
int foo(int *a, int b)
{
char *c = (char*)malloc(b);
int d, e, f=0;
for (d = 0; d < b; d++)
{
if (c[d]) continue;
c[d] = 1;
for (e=a[d]; e-d; e=a[e], c[d]++)
c[e] = 1;
f = (c[d] > f? c[d] : f);
}
free(c); return f;
}
//////////// Good C
int maxCycleLength(int array[], int sz)
{
// visited[] will store 1 if visited; else 0.
char *visited = (char*)malloc(sz);
int i, j, max=0;
for (i = 0; i < sz; i++) // for each array member
{
if (visited[i]) // if visited then skip
continue;
visited[i] = 1;
// for each member in cycle,
// update visited count.
for (j=array[i]; j != i; j = array[j], visited[i]++)
visited[j] = 1;
max = (visited[i] > max? visited[i] : max);
}
free(visited); return max;
}
Be Defensive!
//////////// Bad C
int foo(TokenCount *tcArray, int tcArraySize) {
for (int i = 0; i < tcArraySize; i++)
{
printf(“ %s %d\n”, tcArray[i].token,
tcArray[i].count);
}
return 0;
}
//////////// Bad Java
void foo(TokenCount tcArray[]) {
for (int i = 0; i < tcArray.length; i++) ;
}
//////////// Good C
#define NO_ERR 0
#define ERR_INVALID_ARGS -1
int foo(TokenCount *tcArray, int tcArraySize) {
if (tcArray == NULL || tcArraySize < 0)
return ERR_INVALID_ARGS;
for (int i = 0; i < tcArraySize; i++)
{
if (tcArrray[i].name == NULL)
continue;
printf(“ %s %d\n”, tcArray[i].token,
tcArray[i].count);
}
return NO_ERR;
}
//////////// Good Java
void foo(TokenCount tcArray[]) {
if (tcArray == null)
throw new IllegalArgumentException(“foo()”);
for (int i = 0; i < tcArray.length; i++) ;
}
Maintain Orthogonality
//////////// Bad C
void searchAndReplace(char *string, char
oldChr, char newChr);
//////////// Good C
int find(char *string, int pos, char oldChr);
void replace(char *string, int pos, char newChr);
//////////// Bad C++
class DataFileConverter {
void convert(char *oldFilename, char
*newFilename);
};
//////////// Better C++
class DataFileConverter {
void convert(FILE *oldFile, FILE *newFile);
void convert(int oldfd, int newfd);
};
//////////// Good Java
class TimerStack {
void pushTime(Time currTime);
Time popTime();
}
class Car {
boolean startEngine();
boolean driveToLocation(Location destination);
}
//////////// Bad Java
class TimerStack {
void pushCurrentTime();
Time popTime();
}
class Car {
boolean startCarAndDriveToStore();
}
Think: Reusability
•
If a function (method) is more than 30 lines of code, then refactor it!
•
If a file contains more than 300-400 lines of code, then refactor it!
•
Be able to describe the purpose of your function in one sentence, and the purpose of your class
(or file) in one paragraph. If it seems complex to you (the programmer) then it is too complex and
needs refactoring.
•
WHY?
Be Resource Conscious
•If an object (or function) opens (or allocates or reserves) a resource, then the same
object (or function) should release it.
Example #1:
void foo(char *mystring)
{
char *tmpbuf =
(char*)malloc((strlen(mystring)+1)*sizeof(ch
ar));
// processing happens.
free(tmpbuf);
}
File *gl_infile; // Global input file ptr.
char *gl_proc_buf; // Global proc’ing buf.
void init(char *infilename) {
gl_infile = fopen(infilename, “r”); // opens input
file.
gl_proc_buf = malloc(200);
}
void finish() {
fclose(gl_infile);
free(gl_proc_buf);
}
int main() {
init(argv[1]);
// Main processing loop
finish();
}
Quick Review
•
Know your language!
•
It is possible to write a bad (or good) program in any language. Some simple
practices can go a long way.
•
Keep variables (state) as local (private) as possible. Reduces dependencies.
•
Maintain a consistent style; not just naming/indentation, but also in how you code.
•
Stay away from side-effects, and be explicit about any mutators you may have (i.e. in
function name & in comments).
•
Keep your code modular; 30 lines/function; 300 lines/file. KISS.
•
Be defensive, not offensive. Increase level of error checking.
•
Use common idioms & practices. For example, for defining constants, macros, testing
code, assertion code, etc.
Language Basics
Review of Address Space
Unix/Linux Address Space
Stack
Stack Segment
Heap
Data Segment
Global Data
Executable Code
Text Segment
C Language
•
A level above assembly.
•
Still remains the underlying model behind many other languages.
•
Procedural language with pass-by-value argument passing.
•
Comes with powerful pre-processor (e.g. #include, #define, #if, #ifdef, etc.).
•
Separate declarations in header files (.h) from definitions in (.c) source files.
•
Pointer arithmetic is not only supported; it’s encouraged.
•
Typing is weakly enforced; especially with primitives. Anything can be cast into
anything else, basically.
•
Functions can be passed as arguments. Powerful!
Review variable scope & duration. Review address space partitioning.
Common C Idioms
•
Pointers are used to simulate pass-by-reference:
–
•
Using #define for constants (instead of enum{}):
–
–
–
•
#ifndef MYPROG_H
#define MYPROG_H
…
#endif
Testing null-ness the C way:
–
•
#define IS_ERR(err) ((err) < 0)
Using #ifdef, #ifdef, #ifndef for header file inclusion:
–
–
–
–
•
#define NO_ERROR
0
#define ERR_CONNECT -1
#define ERR_DATA
-2
Using #define for macros:
–
•
void foo(Address *array, int arraySize);
char *cptr = cptr2; if (cptr) …
Pointer manipulation:
–
–
typedef struct { char *string; short bwidth; void *array; } SkipListNode;
SkipListNode* make(int bwidth) { SkipListNode* s =
(SkipListNode*)malloc(sizeof(SkipListNode) + (bwidth-1)*sizeof(void*)); return s; }
C++ Language
•
An Object-Oriented extension of C.
•
A better C.
•
OO language with both pass-by-value and pass-by-reference argument passing. (e.g.
print(Address addr); vs. print(Address& addr); ).
•
Separate declarations in header files (.hpp) from definitions in (.cpp) source files.
•
Typing (with objects) is strongly enforced.
•
Run-Time Type Information and Exceptions are available, but not equally supported
across compilers.
Review variable scope & duration.
Common C++ Idioms
•
Use pass-by-ref instead of pass-by-value pointers.
•
Use new/delete operators instead of malloc/free.
•
Do NOT override operators (esp. new/delete), possibly with the exception of streaming operators
<< and >>.
•
Use smart pointers (reference):
–
{ auto_ptr<Car> car_ptr(new Car()); }
•
Use std::string for strings instead of char*.
•
Follow the Law of the Big Three (reference): If you have a non-trivial dtor, then you must provide
a suitable copy ctor & assignment op. Or, you must hide them.
•
Avoiding nested templates (or templates altogether).
•
Design-by-contract using Abstract Base Classes (ABCs) as interfaces.
–
–
class Car { virtual void startEngine() const = 0; };
class Hummer : public Car { virtual void startEngine() { fillUpTank(500); start(); };
Java Language
•
An interpreted, portable Object-Oriented language, with pass-by-value argument passing.
•
C++ look-a-like, but Smalltalk semantics.
•
Variable are either primitives or object references! All objects are allocated on heap. Only
primitives & references are on stack.
•
All methods are virtual (all the time)!
•
Built-in garbage collector & memory manager.
•
Declaration is not separated from definition. Both go in one source file (.java).
•
Pointer arithmetic does not exist, which makes for safer code.
•
Typing is strongly enforced.
•
Built-in multi-threading support.
Review variable scope & duration. Review address space partitioning.
Common Java Idioms
•
Use interface to define constants:
–
public interface ProgConstsIF { public final static String DFLT_TMP_PATH = “/tmp”; }
•
Throw RuntimeException for very common exceptions; else define and throw your
own app exception (that is not derived from RuntimeException).
•
When the construction & initialization of an object is complex, then a factory method
is often used.
–
•
Introspection is used often:
–
•
Example: javax.xml.parsers.DocumentBuilder.parse(InputStream is) :
org.w3c.dom.Document;
Example: void foo(Object o) { if (o instanceof String) … }
You must override hashCode() iff you override equals().
I/O Comparison
C
BufferedFormatted
BufferedUnformatted
fprintf()
fscanf()
fread()/fgets() read()/getc()
fwrite()/fputs() write()/putc()
C++ Overloaded
<< and >>
Java Classes:
strm.get()
strm.put()
strm.read()
strm.write()
Unbuffered
strm.get()
strm.put()
strm.read()
strm.write()
Classes:
Classes:
StreamTokenizer BufferedReader Reader
BufferedWriter
Writer
PrintWriter
Collections Comparison
List
C++ stl:
vector<>
list<>
Java java.util:
List
ArrayList
LinkedList
Set
Map
Bit Vector
stl:
set<>
multiset<>
java.util:
Set
HashSet
TreeSet
stl:
map<>
multimap<>
java.util:
Map
HashMap
TreeMap
stl:
vector<bool>
bitset<>
java.util:
BitSet
Note 1: The C++ Standard Template Library is iterator-based, while the Java Utils
are collection-based. Note 2: that Apache Common Collections are commonly used
as an extension. Note 3: the STL doesn’t have a hash table.
Design Basics
Common Design Principles
•
Design top-down, build bottom-up.
•
Design with layers. Layer N can only call functions/objects in layer N-1.
•
Design-by-Contract (aka Interface). Think of objects first in terms of their interface –
what they do – not how they are built. Use ABCs or interfaces often.
•
Design with composition, not inheritance. Compose objects with aggregation, not
inheritance. Rules for inheriting B from A:
–
–
–
–
An instance of B object will always be an instance of an A object.
A B object is a special kind-of an A object.
Never inherit for convenience (e.g. Stack inheriting from Vector is wrong!).
Use roles idiom to handle multiple types, or types that can change over time.
•
Use design patterns such as Composite, Factory, Strategy, Template, Visitor, Adaptor,
Singleton to simplify your code (reference). Also, Manager paradigm.
•
Class hierarchies tend to be flat, not deep. Common practice in C++ is to add a Root
class. (Not much method overriding actually happens).
Common Design Principles (2)
•
KISS
•
Do not intertwine object (data) behavior with application behavior.
–
•
Example:
• class Car { boolean startEngine();
boolean driveToLocation(Location destination);
• class DriveStategies { boolean driveCarToStore();
boolean driveCarToGasStation(); }
Weak coupling via Observer-Observable or a Semantic Event subsystem.
Design Practices
HOW TO ACTUALLY BUILD A DESIGN?
•
Data Models: A data model such as an Entity Relationship (ER) diagram or Extended Entity
Relationship (EER) is still very common.
–
–
–
•
Object Models: An object model encapsulates objects & class hierarchies. Most common
notation: Unified Modeling Language (UML).
–
–
–
–
•
Why? Because RDBMS still dominate as the persistent storage.
How? Use pencil & paper, or whiteboard, or Visio, or software.
Notations: Crows foot, Chen, homegrown.
Why? Because OO Programming Languages (OOPL) dominate software field.
How? Use pencil & paper, or whiteboard, or Visio, or software (Poseidon, Rational Rose, etc).
Notations: UML has taken over. OMT was predecessor in many ways.
UML is a Swiss Army knife – many views supported for different stakeholders.
Architectural Designs: This can be as simple as a napkin sketch, or as complete as a full-blown
UML model. Architectural designs are evaluated by “design principles”.
–
–
–
–
Why? Any large software system needs a clear, consistent design.
How? Whatever works!
Notations: If not done in UML, then can be whatever conveys the essence of the design.
Food for thought: A good Software Architect will make $200-300K per year, plus fringe, plus publicity, plus
stocks, etc.
Superb Books & References
•
Online e-Books/References:
–
–
–
•
General Programming Books:
–
–
•
C++ How to Program, Deitel & Deitel. (Great b/c it teaches very good many OO concepts).
C++ FAQs, Cline, et al.
Great Unix Book:
–
•
Code Complete, by Steve McConnell.
Writing Solid Code, Steve Maguire.
Introductory C++ Books:
–
–
•
The Art of Unix Programming, by Eric Raymond.
Thinking in Java/C++, by Bruce Eckel.
C++ FAQ Lite
Advanced Programming in the Unix Environment, Richard Stevens. (aka “The Bible”)
Software Engineering Books:
–
–
–
–
Instant UML, Muller.
Object-Oriented Methods, Graham. (General OO methodology/technology survey).
Patterns in Java, Mark Grand. (esp. volumes 1-2).
Advanced Object-Oriented Analysis and Design using UML, Odell.