Sucking Less: Checking In More Often

Posted in Jerry Andrews on March 20th, 2010 by admin

I'm fairly fearless when coding, which means that about once a week, I delete a huge chunk of something I should've kept, or change something into something unrecognizable, thereby inadvertently breaking a dozen unit tests. When I discover the problem, usually about 4 hours later, I no longer have any idea what I did that made the bad thing happen. Then I spend another 2 or 3 hours figuring out what I broke and fixing it.  Ugly.

On my personal projects, I check in code every time I get a unit test working. My checkins are something like 15-20 minutes apart.  On projects I get paid for, though, checking in means running the whole unit test suite, and that can take 10 minutes (on a good project) or 2 hours (on a bad one)--so I don't do it very often. That's when I get into trouble.  I've been meaning to solve that problem for some time, and Joel Spolsky's blog topic last Wednesday (Joel on Software) finally kicked me in the pants.  It took 15 minutes to solve the problem; here's how I did it.

The new breed of source code control systems, distributed systems like Git and Mercurial, have had my attention for a while, as I usually work with a team that's spread out geographically.  I regularly need to share code that's not quite ready to be delivered with another developer, and that developer is very likely in a different city.  Typically we're reduced to emailing files to each other. A distributed SCC system would resolve this, as we could sync changes between our personal development repositories, but I assumed setting up and learning a new SCC would be painful, as it has been in the past, so I never got around to it.  Joel's article on Mercurial, however, got me thinking about it seriously, and since I had a few hours on my hands, I figured I'd give it a try.  What I didn't expect was that I'd be up and functional in 10 minutes.

I downloaded Mercurial and installed it on my main dev box, which is a Windows laptop (I know, I suck). That took about 2 minutes, including googling for the Mercurial web site (http://mercurial.selenic.com/). Then I changed to the main development directory of my current project, and typed (per Joel's tutorial at http://hginit.com/):
hg init
hg add

The first impressive thing is these commands 'just worked'.  The second is that's all that's required to set up a local repository and add an entire project to it. Really. I'm so psyched!

The "add" command was naive, because it added everything, including build output and Subversion control directories (**/.svn/**).  I spent the next 10 minutes reverting my add ("hg revert --all"), building an "ignore" file, adding again, and finally committing.  To shorten your search (Mercurial has great documentation, by the way--it only took me a few minutes to figure this stuff out), here's what I ended up doing.

1. I created a .hgrc file in my home directory (C:\Documents and Settings\Jerry) with the following content:

[ui]
username=jerry
editor=C:/bin/vim/vim71/gvim

The "username" entry preempts a request by the "commit" command for a username, and I prefer vi to the default editor, which is notepad.

2. In the root of my project directory, I created a .hgignore file with the following content:

syntax: glob
.svn
*.class
*.log
.hgignore
antbuild/*
build.properties

This tells Mercurial to ignore all files or directories named .svn (which is where subversion stores its status), all .class and .log files, the .hgignore file itself, the build.properties file, and anything in the antbuild directory (or its subdirectories). Mercurial ignore files support at least 3 different syntaxes for specifying files; the documentation is available on the Mercurial wiki and it's quite complete.

Finally:

hg add
hg commit

Now I'm in a position to check in locally every few minutes, but when I have a small feature complete, I can deliver via svn to the project's repository.  That's right: I'm using Mercurial locally, and SVN for the project.

On my next project, I hope to have a chance to specify that the whole team uses Mercurial for the whole project; working with another programmer who's not physically nearby just got a whole lot easier; we can exchange our updates directly with Mercurial, then push them up to a central repository independently. Sweet!

Annotating Custom Types in Hibernate

Posted in Jerry Andrews on March 3rd, 2010 by admin
Hibernate has a lot of nice features, and it's pretty well documented, but a recent need to add a simple custom type to an existing mapping left me flailing around for documentation on exactly how to do it. I wanted to do it with annotations, not by updating the Hibernate configuration (that approach is well-documented). Here's how it's done.

Two new classes are needed.  You can do it with one (and the Hibernate examples do it that way), but they really have different functions, so I coded them separately.

The first is the class you want to use for the column.  In my case, I needed a Date with no milliseconds, which is a thin wrapper over java.util.Date.  Here's my class:


/**
 * Oracle stores dates in DATE columns down to the second; Java stores them to the millisecond.
 * This occasionally can confuse Hibernate as to what data are stale.  This class slices off
 * any milliseconds which might be present in its representation.
 */
public class DateNoMs extends java.util.Date {
    private static final long serialVersionUID = 1L;


    /** @see java.util.Date#Date() */
    public DateNoMs() {
        super();
        long t = getTime();
        setTime(t - t%1000);
    }


    /** @see java.util.Date#Date(long) */
    public DateNoMs(long time) {
        super(time - time%1000);
    }
    
    /**
     * Copy constructor; drops any milliseconds present in the source date.
     * @param value the date to copy
     */
    public DateNoMs(java.util.Date value) {
        long t = value.getTime();
        setTime(t - t%1000);
    }


    /** @see java.util.Date#setTime(long)     */
    @Override
    public void setTime(long time) {
        super.setTime(time - time%1000);
    }
}

Straightforward, right?  Now, in my class, I have a field mapping:

    @Column(name = "PAYMENT_DATE")
    private DateNoMs m_paymentDate;

Of course, this won't run--Hibernate will gag on the mapping, because it doesn't know how to map a JDBC DATE column to a DateNoMs--as one would expect.  There are two things we need at this point: first, an object which Hibernate can use to transform a JDBC DATE into a DateNoMs, and second, an annotation pointing to that "Factory".  The factory class is produced by implementing (in the simplest case) org.hibernate.usertype.UserType. Documentation for this interface is pretty thin, but there are good examples available in the Hibernate distribution. Here's my implementation.  I'm greatly helped by the fact that my class (DateNoMs) is very close to java.util.Date, and java.sql.Date extends java.util.Date.

/**
 * Map "things" (currently Oracle Date columns) to the DateNoMs.
 */
public class DateNoMsType implements UserType {

    /** @see org.hibernate.usertype.UserType#assemble(java.io.Serializable, Object)     */
    public Object assemble(Serializable cached, @SuppressWarnings("unused") Object owner) {
        return cached;
    }

    /** @see org.hibernate.usertype.UserType#deepCopy(Object)     */
    public Object deepCopy(Object value) {
        if (value==null)
            return null;
        
        if (! (value instanceof java.util.Date))
            throw new UnsupportedOperationException("can't convert "+value.getClass());
        return new DateNoMs((java.util.Date)value);
    }

    /** @see org.hibernate.usertype.UserType#disassemble(Object)     */
    public Serializable disassemble(Object value) throws HibernateException {
        if (value==null)
            return null;

        if (! (value instanceof java.util.Date))
            throw new UnsupportedOperationException("can't convert "+value.getClass());

        return new DateNoMs((java.util.Date)value);
    }

    /** @see org.hibernate.usertype.UserType#equals(Object, Object)     */
    public boolean equals(Object x, Object y) throws HibernateException {
        // either side may be null when the column is nullable
        return x==y || (x!=null && x.equals(y));
    }

    /** @see org.hibernate.usertype.UserType#hashCode(Object)     */
    public int hashCode(Object value) throws HibernateException {
        return value==null ? 0 : value.hashCode();
    }

    /** @see org.hibernate.usertype.UserType#isMutable()     */
    public boolean isMutable() {
        return true;
    }

    /** @see org.hibernate.usertype.UserType#nullSafeGet(java.sql.ResultSet, String[], Object)     */
    public Object nullSafeGet(ResultSet rs, String[] names, @SuppressWarnings("unused") Object owner)
            throws HibernateException, SQLException {
        // assume that we only map to one column, so there's only one column name
        // (some JDBC drivers drop the time of day in getDate(); if yours does, use getTimestamp())
        java.sql.Date value = rs.getDate( names[0] );
        if (value==null)
            return null;
        
        return new DateNoMs(value.getTime());
    }

    /** @see org.hibernate.usertype.UserType#nullSafeSet(java.sql.PreparedStatement, Object, int)     */
    public void nullSafeSet(PreparedStatement stmt, Object value, int index)
            throws HibernateException, SQLException {
        if (value==null) {
            stmt.setNull(index, Types.DATE);
            return;
        }

        if (! (value instanceof java.util.Date))
            throw new UnsupportedOperationException("can't convert "+value.getClass());

        stmt.setDate( index, new java.sql.Date( ((java.util.Date)value).getTime()) );
    }

    /** @see org.hibernate.usertype.UserType#replace(Object, Object, Object)     */
    public Object replace(Object original, 
            @SuppressWarnings("unused") Object target, @SuppressWarnings("unused") Object owner)  {
        return original;
    }

    /** @see org.hibernate.usertype.UserType#returnedClass()     */
    @SuppressWarnings("unchecked")
    public Class returnedClass() {
        return DateNoMs.class;
    }

    /** @see org.hibernate.usertype.UserType#sqlTypes()     */
    public int[] sqlTypes() {
        return new int[] {Types.DATE};
    }

}

The core of this class is the two methods which get and set values associated with my new type: nullSafeSet and nullSafeGet.  One key thing to note is that nullSafeGet is supplied with a list of all the column names mapped to the custom datatype in the current query.  In my case, there's only one, but in complex cases, you can map multiple columns to one object (there are examples in the Hibernate documentation).

The final piece of the puzzle is a @Type annotation on the column, which tells Hibernate to use the new "Type" class to generate objects of your custom type:

    @Type(type="com.gorillalogic.type.DateNoMsType")
    @Column(name = "PAYMENT_DATE")
    private DateNoMs m_paymentDate;

The @Type annotation needs the fully-qualified name of the class that implements the UserType interface; this is the factory for producing the target type of the mapped column.

If you're going to use your new type in a lot of places, you can shorten the @Type annotation by doing a typedef; you can place this in package-info.java in any package you like (I put mine in the same package as the UserType class).  Here's the line for the type defined above:

@TypeDefs(
  {
    @TypeDef(name = "dateNoMs", typeClass = com.gorillalogic.type.DateNoMsType.class)
  }) package com.gorillalogic.type;

Now my column annotation can look like this:

    @Type(type="dateNoMs")
    @Column(name = "PAYMENT_DATE")
    private DateNoMs m_paymentDate;

That should be enough to get you started. 

java.util.DuctTape

Posted in Jerry Andrews on February 21st, 2010 by admin
Overview

The proposed class, java.util.DuctTape, is designed as a general purpose fix for a variety of commonly-observed situations in production code. It serves as a temporary patch until a permanent solution is developed and deployed.

Features

DuctTape has the following features:
  1. Transform any internal data representation into any external representation without creating dependencies on either side.
  2. Transform any component interface into the interface required by any caller, again without creating dependencies.
  3. Perform branches into any point in existing code, and returns from any point in existing code, allowing for maximal reuse without recoding where logic already exists to perform some function.
  4. Intercept attempted calls to any unimplemented method and send an email request to support to perform the operation manually.

Discussion

The utility of a properly-executed DuctTape implementation seems obvious. Once code is in production, users inevitably find edge and corner cases (and sometimes whole use cases) not anticipated by the requirements, design, implementation, or QA teams. The lead time required to implement these cases is often not available; a quick-and-dirty DuctTape-based solution is required to keep things running smoothly. A standard DuctTape implementation is far preferable to the hacks commonly used to hold things together until the next point release.

We urge immediate action on this by the developer community.

Acknowledgement

Thanks to Bob Hedlund for the original DuctTape class concept. Thanks to the TCAS team for suggestions on the feature list.

Database/Code impedance mismatch

Posted in Jerry Andrews on February 17th, 2010 by admin

I love natural keys in database design. You have to pay attention, though: the natural impedance mismatch between a programming language representation and the database representation of the key can bite you.

Consider an object whose primary key might contain a date--say, a change log record. Oracle and DB2 both store a DATE as a time containing year, month, day, hours, minutes, and seconds. No timezone. The natural mapping for a Java tool like Hibernate is to map to a java.util.Date, which stores the Date as a time in milliseconds since the epoch GMT, and then maps it to whatever timezone is set on the machine where the code is running for display and conversion.

Now consider what might happen (especially if our change log record is attached to some parent object):

  1. We create and save the object; it is persisted. The local cached copy contains a non-zero value for milliseconds, but the database has truncated the milliseconds value and saved it.
  2. Later on in the code somewhere, we have reason to save the object again, perhaps as part of some collection operation.
  3. Hibernate looks in its cache, compares it with the database, and notes that the values of the Date don't match--so it tries to save the value again.
  4. The database dutifully tosses out the spare milliseconds, and bam! we have an attempt to re-insert an existing record, so it throws an exception.
This is all terribly confusing to the programmer, who, inspecting the objects in question, sees no difference between what's in the database and what's in her code, especially since the default display characteristics of her database browser and her debugger don't show the milliseconds.
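
To make the mismatch concrete, here is a minimal sketch in plain Java (no Hibernate involved). The class name MillisecondMismatchDemo, the helper truncateToSeconds, and the sample timestamp are made up for illustration; the helper only imitates what a seconds-only DATE column keeps.

```java
import java.util.Date;

class MillisecondMismatchDemo {
    /** Drops the millisecond part, imitating what a seconds-only DATE column keeps. */
    static long truncateToSeconds(long epochMillis) {
        return epochMillis - epochMillis % 1000;
    }

    public static void main(String[] args) {
        long t = 1266400000123L; // arbitrary instant carrying 123 ms
        Date inMemory = new Date(t);                        // what the cached object holds
        Date fromDatabase = new Date(truncateToSeconds(t)); // what comes back from the column

        // The two values are not equal, so a dirty check sees a change...
        System.out.println(inMemory.equals(fromDatabase));               // false
        System.out.println(inMemory.getTime() - fromDatabase.getTime()); // 123

        // ...but Date.toString() has no millisecond field, so they look identical.
        System.out.println(inMemory.toString().equals(fromDatabase.toString())); // true
    }
}
```

The last line is the confusing part: two dates that print identically are still unequal as far as equals() is concerned.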

The easy fix in this case is to declare a class which matches the database representation--in this case, a good choice would be to declare a new class which truncates the milliseconds. A modest example is shown below:

/**
* Public Domain; use or extend at will.
*/
import java.util.Date;

public class DbDate extends Date {
    /** increment if you change the state model */
    private static final long serialVersionUID = 1L;

    /** @see java.util.Date#Date() */
    public DbDate() {
        long t = getTime();
        setTime(t - t%1000);
    }

    /** @see java.util.Date#Date(long) */
    public DbDate(long t) {
        super(t - t%1000);
    }

    /** @see java.util.Date#setTime(long) */
    @Override
    public void setTime(long time) {
        super.setTime(time - time%1000);
    }
}

Also note that if you declare the database column as a TIMESTAMP, the Java and database representations more-or-less match, which avoids this kind of problem. Be aware that Oracle doesn't support TIMESTAMP WITH TIME ZONE in a primary key, and DB2 didn't implement TIMESTAMP WITH TIME ZONE at all--as of the last time I had access to DB2.

Dealing with timezones is another topic entirely--one which I'll take up in a future post.

Limiting Irreversibility

Posted in Jerry Andrews on February 14th, 2010 by admin
This afternoon I was reading Martin Fowler's commentary on architecture: http://www.martinfowler.com/ieeeSoftware/whoNeedsArchitect.pdf, and ran across the following:

At a fascinating talk at the XP 2002 conference (http://martinfowler.com/articles/xp2002.html), Enrico Zaninotto, an economist, analyzed the underlying thinking behind agile ideas in manufacturing and software development. One aspect I found particularly interesting was his comment that irreversibility was one of the prime drivers of complexity. He saw agile methods, in manufacturing and software development, as a shift that seeks to contain complexity by reducing irreversibility--as opposed to tackling other complexity drivers. I think that one of an architect’s most important tasks is to remove architecture by finding ways to eliminate irreversibility in software designs.

I think he's absolutely on the mark with this--in fact, I think it illuminates one of the two or three key roles of architecture in system implementation. If the architect focuses on helping to define approaches which are hard to change once the system "complexifies", and in helping developers write solid code that can easily be modified when needs change, (s)he goes a long, long way towards making the system flexible and maintainable.

Note that there are two architectural roles there: (1) helping to make key decisions early, and (2) mentoring developers around those decisions.

On most of the dev teams I've worked with, the first set of decisions is made by proposal and refinement--one of us proposes an overall approach, and the rest of us point out tweaks or whole new approaches until the whole thing gels in everyone's mind. It seems to work, if everyone is engaged. Some architects would prefer to "rule by fiat", but I've found that generally results in systems which can't be maintained or in far more work than is needed. It's very hard to get key decisions right by yourself. To illustrate: I once proposed a relatively modest refactoring in a large thorny user interface, to separate business logic from display logic and generally make it easier to maintain. The reigning architect decided we needed a complete rewrite, in direct opposition to the opinion of everyone else on the team. Nobody objected strongly, though everyone quietly agreed that a rewrite probably wasn't needed--it's a lot more fun to write new code than to modify old. Nobody asked what the minimum effort needed to meet the requirements was. About two years and a couple million dollars later, the new system is, indeed, quite a bit better-structured than the old one. It's not clear if the new UI will allow faster, cleaner updates than the old one--but it sure cost a lot to build: about twice the initial estimate, and about 6 times the original proposal. (By the way: the rewrite is considered a success by all involved. See "What Could Possibly Be Worse Than Failure?" by Alex Papadimoulis.)

A second role implied in minimizing irreversibility is "mentor". Writing good code is hard; writing well-structured code without someone to bounce your design off of is doubly hard. I spend a lot of time talking with the members of my teams, trying to make sure we have a design that's flexible and understandable. A lot of what we think of as "architecture" starts out as a small feature being implemented by a relatively junior developer. I try to make sure (s)he has someone to work with in the early stages of that work, or at least a trial design to start from.

I love Martin Fowler's writing: he always gives me something good to think about.

SEMAT and development principles

Posted in Jerry Andrews on February 3rd, 2010 by admin
My first reaction to SEMAT was--is this practical? But as I've thought about it more, I've decided there are some principles of good software design and implementation they can probably agree upon and illuminate.

For context: I spent 15 years as a practicing nuclear engineer before becoming a practicing software developer.

"Engineering practice", as I found it in the field, is often as arbitrary as "software development practice". What is "good" is measured first by "what works", second by "what's elegant", and finally by "what's inexpensive", and is judged primarily by senior practitioners working against their own experience rather than some set of objective standards (though those exist as well). In hardware engineering, the time lapse between design review and implementation is quite often very long (especially large-scale design, e.g. power plants, where I worked). As a result, feedback loops are even longer and harder to manage. What dominates in the large scale seems to be "what worked"--and just as often, "what failed".

This seems to me to be exactly the type of thing we're developing now as developers--we now have lots of categories of languages for solving different types of problems and we're developing solid tools and techniques for measuring performance, managing projects, and closing the loop between design and implementation. The world for software developers is a FAR less arbitrary place, and has far more development of tools and techniques, than in 1976 when I started coding.

The body of common practice is regularly changing in hardware engineering, as analysis tools get better. When I graduated, one of my first tasks required a stress analysis, which I did with a calculator by hand. These days, an engineer would set that up in a desktop finite-element analysis program with a nice UI, and he'd get a better answer in less time. The principles are the same, though: determine, through an understanding of material behavior and machine design, what a specific application required of the machine and what combination of off-the-shelf components and custom machining could be used to implement that application. It's very much what I do today: I pick off-the-shelf components and custom components to create a design to meet certain requirements.

Can we, as a group, specify some of the principles on which those decisions get made? I suspect we can. While I'm only really fluent in one sub-part of OO design, I know of a few; consider the SOLID principles in OO design, various common design patterns, and the corpus of tools and techniques in Knuth's "The Art of Computer Programming".

Coupling Design and Implementation

Posted in Jerry Andrews on January 10th, 2010 by admin
Six weeks ago or so, our development team reviewed a small design change in the way status is managed by an object. Basically, we broke one state variable into four, and thought more carefully about the state transitions allowed and expected for the class in question.

Yesterday, one of our analysts, at the prompting of one of our developers and two of our data designers, reviewed a completely new take on the same design change... none of the guys in question knew about the previous design change (they'd missed the design review). We ended up going with the previously-reviewed design.

The whole meeting and the thought that led up to it could have been avoided if there had been some mechanism for ensuring the approved design was implemented. This bit of design, like all design, is pretty much wasted if it remains in design documents.

One way to handle this would have been to somehow automatically compare the existing design artifacts to the implementation to see if they matched, and complain if they didn't. Even if we didn't implement the approved design right away, then, there'd be a mechanism for reminding developers that they planned to do something one way, and haven't made it happen yet.

Three snippets of “interesting” code

Posted in Jerry Andrews on January 7th, 2010 by admin
Here's a bit of code I use in interviews:

try {
    if (foo())
        return 1;
} catch (Exception e) {
    throw e;
} finally {
    return 3;
}

The questions I start with are:
  1. what does this do if foo()==true? If foo()==false? If foo() throws an exception?

  2. how could you recode this more simply?

  3. the original spec was:
    • if foo is true, return 1,

    • if foo is false, return 3,

    • propagate exceptions.

Provide code to implement the spec.

It's sobering to note that over 2/3 of interviewees for senior Java programmer positions fail all three questions. How would you answer them?
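
If you'd rather check your answers by running something, here's one possible sketch. The names FinallyDemo, original, and perSpec are mine, and foo() is passed in as a BooleanSupplier (an assumption, so each case can be tried without three copies of the snippet). The perSpec method is just one way to meet the spec, not the only one.

```java
import java.util.function.BooleanSupplier;

class FinallyDemo {
    /** The interview snippet, with foo() passed in so each case can be exercised. */
    @SuppressWarnings("finally")
    static int original(BooleanSupplier foo) {
        try {
            if (foo.getAsBoolean())
                return 1;
        } catch (Exception e) {
            throw e;
        } finally {
            return 3; // replaces the pending "return 1" and discards any in-flight exception
        }
    }

    /** One way to meet the stated spec: no try block needed, exceptions simply propagate. */
    static int perSpec(BooleanSupplier foo) {
        return foo.getAsBoolean() ? 1 : 3;
    }

    public static void main(String[] args) {
        BooleanSupplier boom = () -> { throw new IllegalStateException("boom"); };

        System.out.println(original(() -> true));  // 3
        System.out.println(original(() -> false)); // 3
        System.out.println(original(boom));        // 3, the exception is swallowed

        System.out.println(perSpec(() -> true));   // 1
        System.out.println(perSpec(() -> false));  // 3
        try {
            perSpec(boom);
        } catch (IllegalStateException expected) {
            System.out.println("propagated: " + expected.getMessage()); // propagated: boom
        }
    }
}
```

A return inside finally always wins, which is why the original returns 3 in every case and silently eats the exception.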

I firmly believe that even mid-tier developers should have no problem describing what they mean in code, and in understanding what others mean, even in code which is poorly written. There always seems to be a snarl somewhere that nobody wants to touch (I've written one or two of those myself). For the most part, though, the bad code I see is the result of "I'm not sure how to do this, but this seems to work".

Here's one such snippet:

static final BigDecimal ONE_HUNDRED_PERCENT = new BigDecimal("1.00")
.setScale(MathUtils.DOLLAR_SCALE, MathUtils.STANDARD_ROUNDING_MODE);

This is just bad code, unless you're letting BigDecimal manage all the results' scales itself (we aren't, and you shouldn't; see my previous article on BigDecimal rounding). It's completely replaceable with "BigDecimal.ONE". It leaves the resulting code less clear, and performs no useful function.

I found the following code (paraphrased) in a Java application I'm working on:

foo(vc.getNewValue(), ppCorrectedValue=vc.getNewValue());

It's perfectly correct and, as long as getNewValue() has no side effects, equivalent to:

ppCorrectedValue = vc.getNewValue();
foo(vc.getNewValue(), ppCorrectedValue);

That sort of expression is common in C; most C developers are aware that an assignment is itself an expression, which evaluates to the value just assigned to the left-hand side. Many Java developers aren't aware of this fact--so I was surprised to see it. I'm not sure it improves the readability of the code, though, so I'm refactoring it into the second form above.

Where I think this behavior of the assignment operator really helps in Java is when setting up initial values:

i = j = 0;
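
For anyone who hasn't seen assignment used as an expression, a tiny sketch follows. AssignmentValueDemo and pair are made-up names; pair is a hypothetical stand-in for foo() that just hands its arguments back so they can be inspected.

```java
class AssignmentValueDemo {
    /** Hypothetical stand-in for foo(): returns both arguments so they can be inspected. */
    static int[] pair(int a, int b) {
        return new int[] {a, b};
    }

    public static void main(String[] args) {
        int i, j;
        i = j = 0; // parsed as i = (j = 0): j is assigned first, then its value goes to i

        int corrected;
        int[] seen = pair(42, corrected = 42); // the assignment's value becomes the 2nd argument

        System.out.println(i + " " + j);               // 0 0
        System.out.println(seen[1] + " " + corrected); // 42 42
    }
}
```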


Good code is hard enough to write; have pity on the next guy, and be as straightforward and clear as possible. Arabesques like assignment inside a function call parameter list just make your code harder to read and maintain.

Introducing SEMAT

Posted in Jerry Andrews on December 11th, 2009 by admin
I'm trained as a nuclear and mechanical engineer. In those disciplines, there's an underlying body of knowledge and practice for design and implementation, based on a couple hundred years of experimentation and data analysis, on what works and what doesn't, and why.

Software "engineering" as it is practiced where I usually have worked, seems to be 75% gut feel and 20% anecdotal experience, with the remaining 5% subject to real analysis. I am committed to bringing at least some of the discipline of field engineering (those parts I understand well, and can adapt) to the practice of producing software.

A group of the leading lights in methodology and architecture have come together to form SEMAT -- an acronym of "Software Engineering Method and Theory", with the goal to "refound software engineering based on a solid theory, proven principles, and best practices". It would sound like so much smoke, but consider some of the signatories:

Scott Ambler
Barry Boehm
Bill Curtis
James Odell
Ivar Jacobson
Erich Gamma

The list is quite long, and luminous.

I suggest you go have a look: http://www.semat.org. I'm looking forward to finding out what Scott Ambler and Ivar Jacobson can agree on.

BigDecimal

Posted in Jerry Andrews on October 7th, 2009 by admin
Whenever you're working with money in Java, you should be using BigDecimal. There are other applications, but far and away the most common is monetary calculations. The biggest reason is accuracy: binary floating point values (stored using IEEE standard 754) can't exactly represent many common decimal amounts. For that, you need actual arbitrary precision decimal math--the same kind you learned in grade school. That's what BigDecimal provides. Yes, the operations are clumsy-looking and require a lot of typing. Wouldn't you want your bank or your tax office to have full control over the precision of the values they use to calculate your interest or taxes? You owe it to your users to provide the same precision.

The problem is illustrated simply; pick any decimal fraction whose denominator isn't a power of 2 (0.35, say), store it, then print it out.

System.out.printf("float: %60.55f\n",0.35f);

float: 0.3499999940395355000000000000000000000000000000000000000

Using doubles helps, but doesn't make the problem go away--it just delays the point where you'll see a problem. The only real solution is to use arbitrary-precision math (or switch to a CPU which works in base 10 rather than base 2).

Wherever you can, make your first conversion from user input or database storage to BigDecimal using strings. The internal conversion exposes how these values are stored in the first place:

System.out.printf("BD by string: %60.55f\n",new BigDecimal("0.35"));
System.out.printf("BD by double: %60.55f\n",new BigDecimal(0.35d));
System.out.printf("BD by float: %60.55f\n",new BigDecimal(0.35f));

BD by string: 0.3500000000000000000000000000000000000000000000000000000
BD by double: 0.3499999999999999777955395074968691915273666381835937500
BD by float: 0.3499999940395355224609375000000000000000000000000000000

Once you have values stored as BigDecimals, keep all your math between two BigDecimal values. You don't gain anything by storing a value in BigDecimal, then multiplying it by a float or a double! Doing that reintroduces the very problem you sidestepped by moving to BigDecimal. I've seen senior developers do the following:

public BigDecimal multiply(BigDecimal bd, double d){
    return bd.multiply(new BigDecimal(d));
}

Given the "BD by double" result above, is this really going to produce the results you expect? By doing this, you just gave up whatever advantage you had using BigDecimal in the first place!
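
Here's a small sketch of the difference, assuming the factor really is meant to be 0.35. The class and method names (MixedMathDemo, viaDoubleConstructor, viaString) are invented for the example.

```java
import java.math.BigDecimal;

class MixedMathDemo {
    /** The pattern shown above: the double's binary error is carried into the product. */
    static BigDecimal viaDoubleConstructor(BigDecimal bd, double d) {
        return bd.multiply(new BigDecimal(d));
    }

    /** Staying in decimal: the factor is parsed from its decimal text. */
    static BigDecimal viaString(BigDecimal bd, String factor) {
        return bd.multiply(new BigDecimal(factor));
    }

    public static void main(String[] args) {
        BigDecimal hundred = new BigDecimal("100");

        // Prints a long value just under 35, not 35 itself.
        System.out.println(viaDoubleConstructor(hundred, 0.35d));

        System.out.println(viaString(hundred, "0.35")); // 35.00

        // If a double is all you are given, BigDecimal.valueOf(double) goes through
        // Double.toString(), so 0.35d becomes exactly 0.35.
        System.out.println(hundred.multiply(BigDecimal.valueOf(0.35d))); // 35.00
    }
}
```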

If you use division in BigDecimal, you're going to have to round your results. To illustrate why, consider the following:

BigDecimal result = BigDecimal.ONE.divide(new BigDecimal("3"));

Exception in thread "main" java.lang.ArithmeticException:
Non-terminating decimal expansion; no exact representable decimal result.
at java.math.BigDecimal.divide(BigDecimal.java:1594)
at BDDemo.main(BDDemo.java:12)

Any calculation which results in a repeating decimal result will throw this exception. To deal with this problem, we need to discuss the relationship between "precision", "scale" and "rounding". "Precision" is the number of "significant digits" we learned about in science class in high school. The following numbers all have a precision of 4:

123.4
1234
12340
12.34
0.000001234

Unfortunately, BigDecimal doesn't store "precision" directly. It stores an integer of arbitrary size (the "unscaled value"), and then places the decimal point using a concept called "scale"--the number of places to the right of the decimal point. Thus, the scale of "12.34" is 2; the scale of "0.000001234" is 9. Precision is derived from the unscaled value: precision() counts its digits, so trailing zeros count, and "12340" reports 5 rather than the 4 you'd expect from science class. When you force a BigDecimal to a smaller scale, you're reducing its precision.
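
A quick way to see how BigDecimal views the five numbers listed above is to ask it. This is only a sketch; ScaleDemo and describe are names I made up, while unscaledValue(), scale(), and precision() are the real BigDecimal accessors.

```java
import java.math.BigDecimal;

class ScaleDemo {
    /** Reports how BigDecimal stores the given decimal literal. */
    static String describe(String literal) {
        BigDecimal bd = new BigDecimal(literal);
        return literal + ": unscaled=" + bd.unscaledValue()
                + " scale=" + bd.scale()
                + " precision=" + bd.precision();
    }

    public static void main(String[] args) {
        System.out.println(describe("123.4"));       // 123.4: unscaled=1234 scale=1 precision=4
        System.out.println(describe("1234"));        // 1234: unscaled=1234 scale=0 precision=4
        System.out.println(describe("12340"));       // 12340: unscaled=12340 scale=0 precision=5
        System.out.println(describe("12.34"));       // 12.34: unscaled=1234 scale=2 precision=4
        System.out.println(describe("0.000001234")); // 0.000001234: unscaled=1234 scale=9 precision=4
    }
}
```

Note the third line: BigDecimal counts the trailing zero of 12340 as a digit, so its idea of precision is not quite the significant-digits rule from science class.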

In order to deal with the 'repeating' result, you're going to have to know enough about your problem domain to specify a scale, and a rounding mode to determine the result.
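
As a sketch of what that looks like in code: the three-argument divide(divisor, scale, roundingMode) overload takes both pieces of information. The scale of 2 and the HALF_EVEN mode below are assumptions that suit a dollars-and-cents example, not a recommendation for your domain; DivideDemo and divideMoney are invented names.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class DivideDemo {
    /** Divides and rounds to 2 decimal places (an assumed money scale) using HALF_EVEN. */
    static BigDecimal divideMoney(BigDecimal dividend, BigDecimal divisor) {
        return dividend.divide(divisor, 2, RoundingMode.HALF_EVEN);
    }

    public static void main(String[] args) {
        BigDecimal three = new BigDecimal("3");

        System.out.println(divideMoney(BigDecimal.ONE, three));        // 0.33
        System.out.println(divideMoney(new BigDecimal("2"), three));   // 0.67

        // The same division at a different scale gives a different (equally valid) answer.
        System.out.println(BigDecimal.ONE.divide(three, 10, RoundingMode.HALF_EVEN)); // 0.3333333333
    }
}
```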

Rounding modes are discussed in detail in the BigDecimal javadoc, and I'll discuss them in a future entry, but for now, here's the list:

ROUND_UP           round away from zero.
ROUND_DOWN         round towards zero.
ROUND_CEILING      round towards positive infinity.
ROUND_FLOOR        round towards negative infinity.
ROUND_HALF_DOWN    round towards the "nearest neighbor" unless both neighbors are equidistant, in which case round down.
ROUND_HALF_EVEN    round towards the "nearest neighbor" unless both neighbors are equidistant, in which case round towards the even neighbor.
ROUND_HALF_UP      round towards the "nearest neighbor" unless both neighbors are equidistant, in which case round up.
ROUND_UNNECESSARY  assert that the requested operation has an exact result, hence no rounding is necessary.

Note that "ROUND_DOWN" is the same as truncating the result, and by convention, banks and accountants use ROUND_HALF_EVEN (and have for at least the 3 decades I've been paying attention, if not longer).
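
The differences between the modes only show up on ties and on negative numbers, so a short sketch helps. It uses the java.math.RoundingMode enum (available since Java 5), whose constants correspond to the ROUND_* names above without the prefix. RoundingModeDemo and toWhole are made-up names.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class RoundingModeDemo {
    /** Rounds a decimal literal to a whole number using the given mode. */
    static BigDecimal toWhole(String literal, RoundingMode mode) {
        return new BigDecimal(literal).setScale(0, mode);
    }

    public static void main(String[] args) {
        for (RoundingMode mode : RoundingMode.values()) {
            if (mode == RoundingMode.UNNECESSARY)
                continue; // would throw ArithmeticException here, because rounding is needed
            System.out.println(mode + ": 2.5 -> " + toWhole("2.5", mode)
                    + ", -2.5 -> " + toWhole("-2.5", mode));
        }

        // HALF_EVEN breaks ties towards the even neighbor: 2.5 -> 2, but 3.5 -> 4.
        System.out.println(toWhole("3.5", RoundingMode.HALF_EVEN)); // 4
    }
}
```

Running it shows, for example, that UP and FLOOR agree on -2.5 (both give -3) but disagree on 2.5 (3 versus 2).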

Rules for using BigDecimals are heavily tied to your problem domain, but I do have a few rules of thumb as starting places:

1. Don't round at all until you must: (a) when you divide, or (b) when you have to store or output results.

2. BigDecimal conversion to and from a database should be done as strings. Hibernate, iBatis, and many other tools support this conversion easily and directly.

3. Standardize your representations early; put a small list of rounding mode/scale pairs for your application in an enumeration somewhere so they will be consistent across your application.

4. Talk frankly with your architects, designers, and requirements people about the use of arbitrary-precision math; make sure everyone is on the same page.
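
For rule 3, one possible shape for that enumeration is sketched below. Everything in it is hypothetical: the names MoneyRounding, DOLLARS, and INTEREST_RATE, and the particular scales, are placeholders you'd replace with whatever your problem domain dictates.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class RoundingPolicyDemo {
    public static void main(String[] args) {
        // 1000.00 * 0.0425 = 42.500000 (scales add when multiplying)
        BigDecimal interest = new BigDecimal("1000.00").multiply(new BigDecimal("0.0425"));

        System.out.println(MoneyRounding.DOLLARS.apply(interest));                // 42.50
        System.out.println(MoneyRounding.DOLLARS.apply(new BigDecimal("2.345"))); // 2.34
        System.out.println(MoneyRounding.DOLLARS.apply(new BigDecimal("2.355"))); // 2.36
    }
}

/** Hypothetical application-wide list of scale/rounding-mode pairs. */
enum MoneyRounding {
    DOLLARS(2, RoundingMode.HALF_EVEN),
    INTEREST_RATE(6, RoundingMode.HALF_EVEN);

    private final int scale;
    private final RoundingMode mode;

    MoneyRounding(int scale, RoundingMode mode) {
        this.scale = scale;
        this.mode = mode;
    }

    /** Rounds the value to this policy's scale using this policy's rounding mode. */
    BigDecimal apply(BigDecimal value) {
        return value.setScale(scale, mode);
    }
}
```

Keeping the pairs in one place means a change of policy is a one-line edit rather than a search through every setScale call.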

Using BigDecimal seems clumsy and wordy when you start. You'll develop style conventions and approaches which work if you take the plunge, and you'll have a lot more confidence in your numeric results once you do.