XNSIO
  About   Slides   Home  

 
Managed Chaos
Naresh Jain's Random Thoughts on Software Development and Adventure Sports
     
`
 
RSS Feed
Recent Thoughts
Tags
Recent Comments

Refactoring Teaser 2

Monday, July 20th, 2009

Here comes the second Refactoring Teaser.

The purpose of this code is to determine the sender domain and IP address of the email server used by the sender of an email.

Following tests explain the purpose in more detail:

public class IPAddressExtractorTest {
    private ArrayList header = new ArrayList();
 
    @Test
    public void mineSenderFromEdgeServerRecordInMailHeaders() {
        mail_from("67.205.47.130", "agileindia.org").RecievedBy("75.119.213.4", "mails.agileindia.org");
        mail_from(null, "mail-vw0-f172.google.com").RecievedBy("67.205.47.130", "agileindia.org");
        assertSenderIPIs("209.85.212.172");
    }
 
    @Test
    public void useSenderIPForInvalidSenderEdgeServerDomainName() {
        mail_from("67.205.47.130", "agileindia.org").RecievedBy("75.119.213.4", "mails.agileindia.org");
        mail_from("209.85.212.172", "cannot-exist.agilefaqs.com").RecievedBy("67.205.47.130", "agileindia.org");
        assertSenderIPIs("209.85.212.172");
    }
 
    @Test
    public void senderNameCanBeIPAddress() {
        mail_from(null, "209.85.212.172").RecievedBy("67.205.47.130", "agileindia.org");
        assertSenderIPIs("209.85.212.172");
    }
 
    @Test
    public void matchMXRecordIPWithReciever() {
        mail_from("", "mail-vw0-f172.google.com").RecievedBy("67.205.47.130", "apache2-echo.robin.dreamhost.com");
        assertSenderIPIs("209.85.212.172");
    }
 
    @Test
    public void skipHeaderRecordsThatDontCrossEdgeServers() {
        mail_from("192.168.1.47", "smtp.gmail.com").RecievedBy("75.119.213.4", "mail.gmail.com");
        mail_from("192.168.1.3", "192.168.6.242").RecievedBy("192.168.1.47", "smtp.gmail.com");
        assertSenderIPIs("");
 
    }
 
    private void RecievedBy(final String ip, final String domainName) {
        header.add(new ReceiveFromByHeaders(ip, fromIp, domainName, fromDomain));
    }
 
    private String fromIp;
    private String fromDomain;
 
    private IPAddressExtractorTest mail_from(final String ip, final String domainName) {
        fromIp = ip;
        fromDomain = domainName;
        return this;
    }
 
    private void assertSenderIPIs(final String senderIP) {
        IPAddressExtractor addressExtractor = new IPAddressExtractor(header, "gmail.com");
        assertEquals(senderIP, addressExtractor.getSenderIP());
    }
}

Following is our big ball of mud:

public class IPAddressExtractor {
    private static Pattern regexSLD = Pattern.compile("(.*\\.)?(.*\\..*)");
    private static Pattern regexIP = Pattern.compile(Constants.ValidIpAddressRegex);
    private static Logger logger = Logger.getLogger(IPAddressExtractor.class.getName());
 
    private List _receiveFromByHeaders;
    private String _recepientDomain;
 
    private String _senderIP;
    private int _distance;
 
    public IPAddressExtractor(final List receiveFromByHeaders, final String recepientDomain) {
        logger.info("Entering IPAddressExtractor");
        logger.info(recepientDomain);
 
        _receiveFromByHeaders = receiveFromByHeaders;
        _recepientDomain = recepientDomain;
 
        ExtractSenderIPfromReceiveHeadersFromTop();
 
        logger.info("Leaving IPAddressExtractor");
    }
 
    private void ExtractSenderIPfromReceiveHeadersFromTop() {
        String senderDomain = "";
        boolean gotSenderDomain = false;
 
        String[] mxRecords = null;
 
        try {
            mxRecords = DnsMx.GetMXRecords(_recepientDomain);
 
            if (mxRecords == null) {
                return;
            }
        } catch (Exception ex) {
            // no records found
            return;
        }
        // generally first MX record is considered;
        String mxRecordIP = ResolveIPAddress(mxRecords[0]);
        logger.info(mxRecords[0] + " " + mxRecordIP);
 
        String ipByRecipMXServer = "";
 
        // exact MX Match
        int counter = 0;
        for (ReceiveFromByHeaders rOBj : _receiveFromByHeaders) {
            counter++;
 
            if (rOBj.getReceiveByIpAddress() != null
                    && (rOBj.getReceiveByIpAddress().equals(rOBj.getReceiveFromIpAddress()) || "127.0.0.1".equals(rOBj
                            .getReceiveFromIpAddress()))) {
                continue;
            }
 
            if (mxRecords[0].toLowerCase() == rOBj.getReceiveByHeader().toLowerCase()) {
                if (VerifyDomain(rOBj.getReceiveFromHeader()) && !VerifyIPAddress(rOBj.getReceiveFromHeader())) {
                    senderDomain = rOBj.getReceiveFromHeader();
                    gotSenderDomain = true;
                    ipByRecipMXServer = rOBj.getReceiveFromIpAddress();
                    _distance = counter;
                    break;
                } else if (VerifyIPAddress(rOBj.getReceiveFromHeader()))// since somethimes theres an ipAddress instead
                // of domain
                {
                    senderDomain = GetHostName(rOBj.getReceiveFromHeader());
                    gotSenderDomain = true;
                    _distance = counter;
                    break;
                }
            }
        }
 
        // MX IP match
        if (!gotSenderDomain) {
            counter = 0;
 
            for (ReceiveFromByHeaders rOBj : _receiveFromByHeaders) {
                counter++;
                if (mxRecordIP.equals(rOBj.getReceiveByIpAddress())) {
                    if (VerifyDomain(rOBj.getReceiveFromHeader()) && !VerifyIPAddress(rOBj.getReceiveFromHeader())) {
                        senderDomain = rOBj.getReceiveFromHeader();
                        gotSenderDomain = true;
                        ipByRecipMXServer = rOBj.getReceiveFromIpAddress();
                        _distance = counter;
                        break;
                    } else if (VerifyIPAddress(rOBj.getReceiveFromHeader()))// since somethimes theres an ipAddress
                    // instead of domain
                    {
                        senderDomain = GetHostName(rOBj.getReceiveFromHeader());
                        gotSenderDomain = true;
                        _distance = counter;
                        break;
                    }
                }
            }
        }
 
        // MX SLD match
        if (!gotSenderDomain) {
            counter = 0;
 
            for (ReceiveFromByHeaders rOBj : _receiveFromByHeaders) {
                counter++;
 
                Matcher mxRecordMatch = regexSLD.matcher(mxRecords[0]);
                Matcher rOBJMatch = regexSLD.matcher(rOBj.getReceiveByHeader());
 
                if (!(mxRecordMatch.find() && rOBJMatch.find())) {
                    continue;
                }
 
                if (mxRecordMatch.group(2).toLowerCase() == rOBJMatch.group(2).toLowerCase()) {
                    if (VerifyDomain(rOBj.getReceiveFromHeader()) && !VerifyIPAddress(rOBj.getReceiveFromHeader())) {
                        senderDomain = rOBj.getReceiveFromHeader();
                        gotSenderDomain = true;
                        ipByRecipMXServer = rOBj.getReceiveFromIpAddress();
                        _distance = counter;
                        break;
                    } else if (VerifyIPAddress(rOBj.getReceiveFromHeader()))// since somethimes theres an ipAddress
                    // instead of domain
                    {
                        String extractIP = ExtractIP(rOBj.getReceiveFromHeader());
                        senderDomain = GetHostName(extractIP);
                        gotSenderDomain = true;
                        _distance = counter;
                        break;
                    }
                }
            }
        }
 
        String ipAddress = "";
 
        try {
            if (senderDomain != null && senderDomain.trim().length() > 0) {
                ipAddress = ResolveIPAddress(senderDomain);
            }
        } catch (Exception e) {
        }
 
        if (ipAddress == null || ipAddress.trim().length() == 0) {
            ipAddress = ipByRecipMXServer;
        }
 
        _senderIP = ipAddress;
    }
 
    // sometimes IP can enclosed in brackets or extra chars
    private String ExtractIP(final String str) {
        logger.info("Entering ExtractIP");
        logger.info(str);
 
        return regexIP.matcher(str).group(1);
    }
 
    private String ResolveIPAddress(final String domain) {
        String ipAddress = "";
 
        if (!(domain.length() == 0 || domain.length() > Constants.MaxDomainLength || !Pattern.matches(Constants.DomainNameRegex, domain))) {
            try {
                ipAddress = InetAddress.getByName(domain).getHostAddress();
            } catch (UnknownHostException e) {
                logger.log(Level.INFO, "Not a valid Domain Name " + domain);
            }
            logger.info("IPAddress " + ipAddress + " found for domain " + domain);
        } else {
            logger.log(Level.INFO, "Not a valid Domain Name " + domain);
        }
        return ipAddress;
    }
 
    private boolean VerifyDomain(final String senderDomain) {
 
        if (senderDomain != null && senderDomain.trim().length() > 0) {
            if (!(senderDomain.length() == 0 || senderDomain.length() > Constants.MaxDomainLength || !Pattern.matches(
                    Constants.DomainNameRegex, senderDomain))) {
                logger.log(Level.FINE, "Sender domain identified as " + senderDomain);
                return true;
            } else {
                logger.log(Level.FINE, "Sender domain identified is not a valid Domain Name " + senderDomain);
                return false;
            }
        }
 
        return true;
    }
 
    private boolean VerifyIPAddress(final String ipAddress) {
        logger.info("Entering VerifyAddress");
        logger.info(ipAddress);
 
        if (ipAddress != null && ipAddress.trim().length() > 0) {
            return regexIP.matcher(ipAddress).find();
        }
 
        return false;
    }
 
    private String GetHostName(final String ipAddress) {
        try {
            InetAddress ip = InetAddress.getByName(ipAddress);
            return ip.getHostName();
        } catch (UnknownHostException e) {
            e.printStackTrace();
        }
        return "";
    }
 
    public String getSenderIP() {
        return _senderIP;
    }
 
    public int getDistance() {
        return _distance;
    }
}

This class depends on:

public class ReceiveFromByHeaders {
    private String receiveByIpAddress;
    private String receiveFromIpAddress;
    private String receiveByHeader;
    private String receiveFromHeader;
 
    public ReceiveFromByHeaders(final String receiveByIpAddress, final String receiveFromIpAddress, final String receiveByHeader,
            final String receiveFromHeader) {
        this.receiveByIpAddress = receiveByIpAddress;
        this.receiveFromIpAddress = receiveFromIpAddress;
        this.receiveByHeader = receiveByHeader;
        this.receiveFromHeader = receiveFromHeader;
    }
 
    public String getReceiveByIpAddress() {
        return receiveByIpAddress;
    }
 
    public void setReceiveByIpAddress(final String receiveByIpAddress) {
        this.receiveByIpAddress = receiveByIpAddress;
    }
 
    public String getReceiveFromIpAddress() {
        return receiveFromIpAddress;
    }
 
    public void setReceiveFromIpAddress(final String receiveFromIpAddress) {
        this.receiveFromIpAddress = receiveFromIpAddress;
    }
 
    public String getReceiveByHeader() {
        return receiveByHeader;
    }
 
    public void setReceiveByHeader(final String receiveByHeader) {
        this.receiveByHeader = receiveByHeader;
    }
 
    public String getReceiveFromHeader() {
        return receiveFromHeader;
    }
 
    public void setReceiveFromHeader(final String receiveFromHeader) {
        this.receiveFromHeader = receiveFromHeader;
    }
}

Some Contants:

public class Constants {
    public static final String ValidIpAddressRegex = "\\b(?:\\d{1,3}\\.){3}\\d{1,3}\\b";
    public static final int MaxDomainLength = 255;
    public static final String DomainNameRegex = "[\\w.-]+\\.[\\w-[0123456789]]{2,6}";
}

Finally, we’ve used the following class to stub out MxRecord look up:

public class DnsMx {
    public static String[] GetMXRecords(final String domain) {
        return new String[] { "agileindia.org", "alt2.gmail-smtp-in.l.google.com" };
    }
}

Download the Java Project or C# Version.

Refactoring Teaser 1: Take 1

Tuesday, July 14th, 2009

Last week I posted a small code snippet for refactoring under the heading Refactoring Teaser.

In this post I’ll try to show step by step how I would try to refactor this mud ball.

First and foremost cleaned up the tests to communicate the intent. Also notice I’ve changed the test class name to ContentTest instead of StringUtilTest, which means anything and everything.

public class ContentTest {
    private Content helloWorldJava = new Content("Hello World Java");
    private Content helloWorld = new Content("Hello World!");
 
    @Test
    public void ignoreContentSmallerThan3Words() {
        assertEquals("", helloWorld.toString());
    }
 
    @Test
    public void buildOneTwoAndThreeWordPhrasesFromContent() {
        assertEquals("'Hello', 'World', 'Java', 'Hello World', 'World Java', 'Hello World Java'", helloWorldJava.toPhrases(6));
    }
 
    @Test
    public void numberOfOutputPhrasesAreConfigurable() {
        assertEquals("'Hello'", helloWorldJava.toPhrases(1));
        assertEquals("'Hello', 'World', 'Java', 'Hello World'", helloWorldJava.toPhrases(4));
    }
 
    @Test
    public void returnsAllPhrasesUptoTheNumberSpecified() {
        assertEquals("'Hello', 'World', 'Java', 'Hello World', 'World Java', 'Hello World Java'", helloWorldJava.toPhrases(10));
    }
}

Next, I created a class called Content, instead of StringUtil. Content is a first-class domain object. Also notice, no more side-effect intense statics.

public class Content {
    private static final String BLANK_OUTPUT = "";
    private static final String SPACE = " ";
    private static final String DELIMITER = "', '";
    private static final String SINGLE_QUOTE = "'";
    private static final int MIN_NO_WORDS = 2;
    private static final Pattern ON_WHITESPACES = Pattern.compile("\\p{Z}|\\p{P}");
    private List phrases = new ArrayList();
 
    public Content(final String content) {
        String[] tokens = ON_WHITESPACES.split(content);
        if (tokens.length > MIN_NO_WORDS) {
            buildAllPhrasesUptoThreeWordsFrom(tokens);
        }
    }
 
    @Override
    public String toString() {
        return toPhrases(Integer.MAX_VALUE);
    }
 
    public String toPhrases(final int userRequestedSize) {
        if (phrases.isEmpty()) {
            return BLANK_OUTPUT;
        }
        List requiredPhrases = phrases.subList(0, numberOfPhrasesRequired(userRequestedSize));
        return withInQuotes(join(requiredPhrases, DELIMITER));
    }
 
    private String withInQuotes(final String phrases) {
        return SINGLE_QUOTE + phrases + SINGLE_QUOTE;
    }
 
    private int numberOfPhrasesRequired(final int userRequestedSize) {
        return userRequestedSize > phrases.size() ? phrases.size() : userRequestedSize;
    }
 
    private void buildAllPhrasesUptoThreeWordsFrom(final String[] words) {
        buildSingleWordPhrases(words);
        buildDoubleWordPhrases(words);
        buildTripleWordPhrases(words);
    }
 
    private void buildSingleWordPhrases(final String[] words) {
        for (int i = 0; i < words.length; ++i) {
            phrases.add(words[i]);
        }
    }
 
    private void buildDoubleWordPhrases(final String[] words) {
        for (int i = 0; i < words.length - 1; ++i) {
            phrases.add(words[i] + SPACE + words[i + 1]);
        }
    }
 
    private void buildTripleWordPhrases(final String[] words) {
        for (int i = 0; i < words.length - 2; ++i) {
            phrases.add(words[i] + SPACE + words[i + 1] + SPACE + words[i + 2]);
        }
    }
}

This was a big step forward, but not good enough. Next I focused on the following code:

    private void buildAllPhrasesUptoThreeWordsFrom(final String[] words) {
        buildSingleWordPhrases(words);
        buildDoubleWordPhrases(words);
        buildTripleWordPhrases(words);
    }
 
    private void buildSingleWordPhrases(final String[] words) {
        for (int i = 0; i < words.length; ++i) {
            phrases.add(words[i]);
        }
    }
 
    private void buildDoubleWordPhrases(final String[] words) {
        for (int i = 0; i < words.length - 1; ++i) {
            phrases.add(words[i] + SPACE + words[i + 1]);
        }
    }
 
    private void buildTripleWordPhrases(final String[] words) {
        for (int i = 0; i < words.length - 2; ++i) {
            phrases.add(words[i] + SPACE + words[i + 1] + SPACE + words[i + 2]);
        }
    }

The above code violates the Open-Closed Principle (pdf). It also smells of duplication. Created a somewhat generic method to kill the duplication.

    private void buildAllPhrasesUptoThreeWordsFrom(final String[] fromWords) {
        buildPhrasesOf(ONE_WORD, fromWords);
        buildPhrasesOf(TWO_WORDS, fromWords);
        buildPhrasesOf(THREE_WORDS, fromWords);
    }
 
    private void buildPhrasesOf(final int phraseLength, final String[] tokens) {
        for (int i = 0; i <= tokens.length - phraseLength; ++i) {
            String phrase = phraseAt(i, tokens, phraseLength);
            phrases.add(phrase);
        }
    }
 
    private String phraseAt(final int currentIndex, final String[] tokens, final int phraseLength) {
        StringBuilder phrase = new StringBuilder(tokens[currentIndex]);
        for (int i = 1; i < phraseLength; i++) {
            phrase.append(SPACE + tokens[currentIndex + i]);
        }
        return phrase.toString();
    }

Now I had a feeling that my Content class was doing too much and also suffered from the primitive obsession code smell. Looked like a concept/abstraction (class) was dying to be called out. So created a Words class as an inner class.

    private class Words {
        private String[] tokens;
        private static final String SPACE = " ";
 
        Words(final String content) {
            tokens = ON_WHITESPACES.split(content);
        }
 
        boolean has(final int minNoWords) {
            return tokens.length > minNoWords;
        }
 
        List phrasesOf(final int length) {
            List phrases = new ArrayList();
            for (int i = 0; i <= tokens.length - length; ++i) {
                String phrase = phraseAt(i, length);
                phrases.add(phrase);
            }
            return phrases;
        }
 
        private String phraseAt(final int index, final int length) {
            StringBuilder phrase = new StringBuilder(tokens[index]);
            for (int i = 1; i < length; i++) {
                phrase.append(SPACE + tokens[index + i]);
            }
            return phrase.toString();
        }
    }

In the constructor of the Content class we instantiate a Words class as follows:

    public Content(final String content) {
        Words words = new Words(content);
        if (words.has(MIN_NO_WORDS)) {
            phrases.addAll(words.phrasesOf(ONE_WORD));
            phrases.addAll(words.phrasesOf(TWO_WORDS));
            phrases.addAll(words.phrasesOf(THREE_WORDS));
        }
    }

Even though this code communicates well, there is duplication and noise that can be removed without compromising on the communication.

     phrases.addAll(words.phrasesOf(ONE_WORD, TWO_WORDS, THREE_WORDS));

There are few more version after this, but I think this should give you an idea about the direction I’m heading.

Why should I bother to learn and practice TDD?

Thursday, May 21st, 2009

My 2 cents, based on personal experience:

  • With TDD, I’m a lot more confident about my solutions. If I spot a design improvement, I can quickly jump in and fix it with confidence. When given feedback, I’m able to respond to it quickly. I feel I’m in control.
  • It really helps me keep my design and code minimalistic and clean. No bells and whistles. No buy one get one free combo offers. <Perfection in design is achieved not when there is nothing more to add, but rather when there is nothing more to take away>
  • Turns out that my code is lot more easier to test. Because its designed to be testable. Lots of people argue that they will write tests after writing the code and it would be more efficient. The biggest problem I find with this approach is that the code ends up being something not readily testable. Then either I’ve to spend time making it testable or I skip testing saying its not possible or not required.
  • It helps me build a safety net of executable, living, up-to-date specification for my classes and features. Its a great starting point for new team members to understand my software. Tests are a great reference point for someone who wants to use my code.
  • TDD is a great teacher. It helps me listen to my code and learn from it. It forces me to pay close attention to what is happening. Its easy to spot bad things faster. Its all about feedback and visibility.
  • TDD forces me to slow down and think. It encourages me to take baby-steps. Sometimes I find people are in such a hurry that they spill mess all over the place. It takes soo much more time and effort to clean up the mess they leave behind.
  • My tests tend to communicate my design choices much longer after I’m gone.
  • I massively reduce the amount of time I spend in the debugger or trying to manually test (monkey test) from the UI. When something breaks, I no longer need to crawl through the logs to figure out where things are going wrong. I get pin-pointed feedback.
  • TDD helps me maintain focus on measurable outcome (producing software that accomplishes a concrete objective). I’m not longer drifting down ratholes.
  • TDD also helps me reduce the hand-overs between developers and tests and hence the wastage introduced because of all that overhead and context switching.
  • And so on…

Having said that, TDD alone is not sufficient to achieve the above. At times you need to spike/prototype things. One needs to have (or at least start focusing on building) a good knowledge of Design Principles and Patterns. Its easy to get lost, having a pair can really help.

Again I don’t want to sound dogmatic. I don’t think TDD is the only way to build great software. There are lots of great developers out there, building amazing software without TDD. However I think TDD is a much easier approach to achieve the same.

Checkstyle’s Strict Duplicate Code Check Rocks!

Tuesday, March 3rd, 2009

Recently, during a project rescue effort, I introduced Checkstyle as part of the automated build. This morning I saw the build failing because of the following Checkstyle violation:

File: SuccessEvent.java, Line: 10, Type: StrictDuplicateCodeCheck, Priority: High, Category: Duplicates
Found duplicate of 14 lines in AcceptEvent.java, starting from line 10

When I checked the source code of the 2 files, this is what I found:

1
2
public class SuccessEvent extends EngineEvent {
    private final Document document;
1
2
3
    public SuccessEvent(final Document document) {
        this.document = document;
    }
1
2
3
4
    @Override
    Document getDocument() {
        return document;
    }
1
2
3
4
5
    @Override
    String getMessage() {
        return "";
    }
}
1
2
public class AcceptEvent extends EngineEvent {
    private final Document document;
1
2
3
    public AcceptEvent(final Document document) {
        this.document = document;
    }
1
2
3
4
    @Override
    Document getDocument() {
        return document;
    }
1
2
3
4
5
    @Override
    String getMessage() {
        return "";
    }
}

As you can see, except the name of the class, the 2 files are identical. On one hand I’m thrilled that Checkstyle does a great job at catching such errors, on other hand I’m sad that there are developers who write such sloppy code.

    Licensed under
Creative Commons License