» code smell » Managed Chaos by Naresh Jain

What heuristics do you use to decide Long Method Smell?

Friday, March 25th, 2011

I find myself using the following heuristics:

Can I quickly comprehend what is going on?
Do I need to parse the method’s code more than once?
Does this code violate the Single Responsibility Principle (SRP)?
Is the Cyclomatic Complexity of this code more than 5?
Does this method have more than 7 +/- 2 (Miller’s Magical Number Seven) concepts?

More details: Long Method Smell: When is a method too big?

Posted in Agile, Code Smells, Design, legacy code, Programming | No Comments »

An Example of Primitive Obsession

Tuesday, October 20th, 2009

Couple of years ago, I was working with a team which was building an application for handling User profile. It had the following functionality

User Signup
User Profile
Login/Logout API
Authentication and Authorization API
and so on…

As you can see, this is pretty common to most applications.

The central entity in this application is User. And we had a UserService to expose its API. So far, so good.

The UserService had 2 main methods that I want to focus on. The first one is:

    public boolean authenticate(String userId, String password) {
        ...
    }

Even before the authenticate() method gets called, the calling class does basic validations on the password parameter. Stuff like

the password cannot be null,
it needs to be more than 6 char long and less than 30 char long
should contain at least one special char or upper case letter
should contain at least one letter
and so on …

Some of these checks happen to reside as separate methods on a PasswordUtil class and some on the StringUtil class.

Once the authenticate method is called, we retrieve the respective User from the database, fetch the password stored in the database and match the new password against it. Wait a sec, we don’t store plain password in the DB any more, we hash them before we store ’em. And as you might already know, we use one-way hash; which means given a hash, we cannot get back the original string. So we hash the newly entered password. For which we use a HashUtil class. Then we compare the 2 hashes.

The second method is:

    public User create(final UserDTO userDTO) {
       ...
    }

Before the create() method is called, we validate all the fields inside the UserDTO. During this validation, we do the exact same validations on password as we do before the authenticate method. If all the fields are valid, then inside the create method, we make sure no one else has the same userid. Then we take the raw text password and hash it, so that we can store it in our DB. Once we save the user data in DB, we send out an activation email and off we are.

Sorry that was long. What is the point? Exactly my point. What is the point. Why do I need to know all this stuff? I can’t really explain you the pain I go through when I see:

All these hops & jumps around these large meaningless classes (UserDTO, PasswordUtil, StringUtil, HashUtil)
Conceptual and Data duplication in multiple places
Difficulty in knowing where I can find some logic (password logic seems to be sprayed all over the place)
And so on …

This is an example of Primitive Obsession.

A huge amount of complexity can be reduced,
clarity can be increased and
duplication can be avoided in this code

If we can create a Password class. To think about it, Password is really an entity like User in this domain.

Password class’ constructor can do the validations for you.
You can give it another password and ask if they match. This will hide all the hashing and rehashing logic from you
You can kill all those 3 Utils classes (PasswordUtil, StringUtil, HashUtil) & move the logic in the Password class where it belong

So once we are done, we have the following method signatures:

    public User userWithMatching(UserId id, Password userEnteredPwd) {
        ...
    }

    public User create(final User newUser) {
       ...
    }

Posted in Design | No Comments »

Refactoring Teaser II: Solution: Take 1

Tuesday, August 4th, 2009

Following is the first take at refactoring the II Teaser.

Started off by refactoring the tests:

@Test
public void mineSenderFromEdgeServerRecordInMailHeaders() {
    header(
            mail_from("67.205.47.130", "agileindia.org").receivedBy("75.119.213.4", "mails.agileindia.org"),
            mail_from(null, "mail-vw0-f172.google.com").receivedBy("67.205.47.130", "agileindia.org"));
    senderIP = extractSenderIPFrom(headers).forRecipientDomain(GMAIL);
    assertThat(senderIP).is("209.85.212.172");
    assertDistanceOfMatchingHeaderFromTopIs(2);
}

@Test
public void useSenderIPForInvalidSenderEdgeServerDomainName() {
    header(
            mail_from("67.205.47.130", "agileindia.org").receivedBy("75.119.213.4", "mails.agileindia.org"),
            mail_from("209.85.212.172", "cannot-exist.agilefaqs.com").receivedBy("67.205.47.130", "agileindia.org"));
    senderIP = extractSenderIPFrom(headers).forRecipientDomain(GMAIL);
    assertThat(senderIP).is("209.85.212.172");
    assertDistanceOfMatchingHeaderFromTopIs(2);
}

@Test
public void senderNameCanBeIPAddress() {
    header(mail_from(null, "209.85.212.172").receivedBy("67.205.47.130", "agileindia.org"));
    senderIP = extractSenderIPFrom(headers).forRecipientDomain(GMAIL);
    assertThat(senderIP).is("209.85.212.172");
    assertDistanceOfMatchingHeaderFromTopIs(1);
}

@Test
public void matchMXRecordIPWithReciever() {
    header(mail_from("", "mail-vw0-f172.google.com").receivedBy("67.205.47.130", "apache2-echo.robin.dreamhost.com"));
    senderIP = extractSenderIPFrom(headers).forRecipientDomain(GMAIL);
    assertThat(senderIP).is("209.85.212.172");
    assertDistanceOfMatchingHeaderFromTopIs(1);
}

@Test
public void skipHeaderRecordsThatDontCrossEdgeServers() {
    header(
            mail_from("192.168.1.47", "smtp.gmail.com").receivedBy("75.119.213.4", "mail.gmail.com"),
            mail_from("192.168.1.3", "192.168.6.242").receivedBy("192.168.1.47", "smtp.gmail.com"));
    senderIP = extractSenderIPFrom(headers).forRecipientDomain(GMAIL);
    assertFalse(senderIP.isValid());
    assertDistanceOfMatchingHeaderFromTopIs(0);
}

Following is the crux of the Sender Edge Server IP Extraction Algo:

public IPAddressExtractor(final List receivedHeaders, final Domain recepientDomain) {
    Domain recepientMXRecord = recepientDomain.retrieveFirstMXRecord();
    for (MatchingCriterion critierion : asList(MATCHING_DOMAIN, MATCHING_IP, MATCHING_SECOND_LEVEL_DOMAIN)) {
        Match result = foreach(receivedHeaders).and(recepientMXRecord).match(critierion);
        if (result.success()) {
            storeSenderIPWithDistance(result);
            break;
        }
    }
}

To do the whole Fluent interfaces on line number 21, I had to create a private method:

private CriterionMatcher foreach(final List mailHeaders) {
    return new CriterionMatcher(mailHeaders);
}

and a package protected CriterionMatcher class

class CriterionMatcher {
    private final List mailHeaders;
    private int counter;
    private Domain mxRecord;
 
    CriterionMatcher(final List mailHeaders) {
        this.mailHeaders = mailHeaders;
    }
 
    CriterionMatcher and(final Domain mxRecord) {
        this.mxRecord = mxRecord;
        return this;
    }
 
    Match match(final MatchingCriterion criterion) {
        for (EmailHeader header : mailHeaders) {
            counter++;
 
            if (criterion.isSatisfiedBy(mxRecord, header)) {
                return new Match(header.fromDomain, header.fromIp, counter);
            }
        }
        return Match.NULL;
    }
}

Other than the switch statement smell and conditional complexity, the original code was obsessed with Primitive Obsession code smell. To fix this issue, the first thing I had to do was great first class citizens (Objects). So I ended up creating

public class IPAddress {
    private static final String IP_ADDRESS_REGEX = "\\b(?:\\d{1,3}\\.){3}\\d{1,3}\\b";
    private static final Pattern VALID_IP = Pattern.compile(IP_ADDRESS_REGEX);
    private static final String LOCAL_HOST = "127.0.0.1";
    public static final IPAddress NULL = new IPAddress("");
 
    private final String ip;
 
    private IPAddress(final String address) {
        ip = address;
    }
 
    public static IPAddress parse(final String address) {
        if (address == null) {
            return NULL;
        }
        Matcher matcher = VALID_IP.matcher(address);
 
        if (!matcher.find()) {
            return NULL;
        }
        return new IPAddress(matcher.group(0));
    }
 
    @Override
    public String toString() {
        return ip;
    }
 
    public boolean isLocalhost() {
        return LOCAL_HOST.equals(ip);
    }
 
    public boolean isValid() {
        return this != NULL;
    }
}

and

public class Domain {
    private static final Pattern SLD = Pattern.compile("(.*\\.)?(.*\\..*)");
    public static final Domain NULL = new Domain("", Network.NULL);
 
    private final String name;
    private final Network network;
 
    protected Domain(final String domainName, final Network network) {
        name = domainName.toLowerCase();
        this.network = network;
    }
 
    public IPAddress resolveIP() {
        try {
            String ipAddress = network.ipAddressOf(name);
            return IPAddress.parse(ipAddress);
        } catch (UnknownHostException e) {
            return IPAddress.NULL;
        }
    }
 
    public Domain secondLevelDomain() {
        Matcher mxRecordMatch = SLD.matcher(name);
        if (mxRecordMatch.find()) {
            return new Domain(mxRecordMatch.group(2), network);
        }
        return this;
    }
 
    public Domain retrieveFirstMXRecord() {
        List mxRecords = network.firstMXRecordFor(name);
        if (mxRecords.size() &gt; 0) {
            return new Domain(mxRecords.get(0), network);
        }
        return NULL;
    }
 
    public boolean isValid() {
        return this != NULL;
    }
}

To create a Domain, we use a static Factory called DomainFactory

public final class DomainFactory {
    private static final String DOMAIN_NAME_REGEX = "[\\w.-]+\\.[A-Za-z]{2,6}";
    private static final int MAX_DOMAIN_NAME_LENGTH = 255;
 
    public static Domain build(final String domainName, final Network network) {
        if (isValidDomain(domainName)) {
            return new Domain(domainName, network);
        }
        IPAddress ip = IPAddress.parse(domainName);
        if (ip.isValid()) {
            return retrieveDomainName(ip, network);
        }
        return Domain.NULL;
    }
 
    private static Domain retrieveDomainName(final IPAddress ip, final Network network) {
        try {
            String hostName = network.hostNameFor(ip.toString());
            if (ip.toString().equals(hostName)) {
                return Domain.NULL;
            }
            return new Domain(hostName, network);
        } catch (UnknownHostException e) {
            return Domain.NULL;
        }
    }
}

Finally we’ve the 3 Criterion for checking if a header contains the edge server

public abstract class MatchingCriterion {
    public static final MatchingCriterion MATCHING_DOMAIN = new MatchingDomainCriterion();
    public static final MatchingCriterion MATCHING_IP = new MatchingIPCriterion();
    public static final MatchingCriterion MATCHING_SECOND_LEVEL_DOMAIN = new MatchingSecondLevelDomainCriterion();
 
    public boolean isSatisfiedBy(final Domain mxRecord, final EmailHeader header) {
        return header.fromDomain.isValid() &amp;&amp; satisfies(mxRecord, header);
    }
 
    protected abstract boolean satisfies(Domain mxRecord, EmailHeader header);
}

private static class MatchingDomainCriterion extends MatchingCriterion {
    @Override
    protected boolean satisfies(final Domain mxRecord, final EmailHeader header) {
        return !(header.byIp.equals(header.fromIp) || header.fromIp.isLocalhost() || !header.byDomain.equals(mxRecord));
    }
}

private static class MatchingIPCriterion extends MatchingCriterion {
    @Override
    protected boolean satisfies(final Domain mxRecord, final EmailHeader header) {
        return header.byIp.equals(mxRecord.resolveIP());
    }
}

private static class MatchingSecondLevelDomainCriterion extends MatchingCriterion {
    @Override
    protected boolean satisfies(final Domain mxRecord, final EmailHeader header) {
        Domain secondLevelDomain = mxRecord.secondLevelDomain();
        return secondLevelDomain.equals(header.byDomain.secondLevelDomain());
    }
}

Also notice that for testing purpose we don’t want to hit the network, so I created a FakeNetwork class which stubs out all Network calls. Network is injected into all Domain classes through the DomainFactory. (I’m not very happy with this design, it feels like a bit of a hack to inject Network this way.)

public class FakeNetwork extends Network {
    private static final Map domain2ip = new HashMap() {
        {
            put("mail-vw0-f172.google.com", "209.85.212.172");
            put("209.85.212.172", "mail-vw0-f172.google.com");
            put("mails.agileindia.org", "72.14.203.121");
            put("agileindia.org", "67.205.47.130");
        }
    };
 
    @Override
    public String ipAddressOf(final String domainName) throws UnknownHostException {
        return lookup(domainName);
    }
 
    @Override
    public String hostNameFor(final String ipAddress) throws UnknownHostException {
        return lookup(ipAddress);
    }
 
    @Override
    public List firstMXRecordFor(final String name) {
        return asList("agileindia.org");
    }
 
    private String lookup(final String value) throws UnknownHostException {
        String data = domain2ip.get(value);
        if (data == null) {
            throw new UnknownHostException();
        }
        return data;
    }
}

Feel free to download the whole project.

Posted in Agile, Design, Programming, Testing | 2 Comments »

The I Naming Convention

Saturday, June 13th, 2009

I don’t like the I<something> naming convention for interfaces for various reasons.

Let me explain with an example. Lets say we have IAccount interface. Why is this bad?

All I care is, I’m talking to an Account, it really does not matter whether its an interface, abstract class or a concrete class. So the I in the name is noise for me.
It might turn out that, I might only ever have one type of Account. Why do I need to create an interface then? Its the speculative generality code smell and violating the YAGNI principle. If someday I’ll need multiple types of Accounts, I’ll create it then. Its probably going to be as much effort to create it then v/s doing it now. Minus all the maintenance overhead.
Let’s say we’ve multiple types of Accounts. Instead of calling it IAccount and the child classes as AccountImpl1 or SavingAccountImpl, I would rather refer to it as Account and the different types of account as SavingAccount or CreditAccount. It might also turn out that, there is common behavior between the two types of Account. At that point instead of having IAccount and creating another Abstract class called AbstractAccount I would just change the Account interface to be an abstract class. I don’t want to lock myself into an Interface.

Personally I think the whole I<something> is a hang-over from C++ and its precursors.

Some people also argue that using the I<something> convention is easy for them to know whether its an interface or not (esp. coz they don’t want to spend time typing new Account() and then realizing that its an interface).

The way I look at it is good IDEs will show the appropriate icon and that can help you avoid this issue to some extent. But even if it did not, to me its not a big deal. The number of times the code is read is far more than its written. Maintaining one extra poorly named Interface is far more expensive than the minuscule time saved in typing.

P.S: I’m certainly not discouraging people from creating interfaces. What I don’t like is having only one class inheriting from an interface. May be if you are exposing an API, you might still be able to convenience me. But in most cases people use this convention through out their code base and not just at the boundaries of their system.

In lot of cases I find myself starting off with an Interface, because when I’m developing some class, I don’t want to start building its collaborator. But then when I start TDDing that class, I’ll change the interface to a concrete class. Later my tests might drive different types of subclass and I might introduce an Interface or an Abstract class as suitable. And may be sometime in the future I might break the hierarchy and use composition instead. Guess what, we are dealing with evolutionary design.

Posted in Design, Programming | 12 Comments »