Wednesday, February 13, 2019

Two ways to extend enum functionality

Preface

In my previous article I explained how and why to use enums instead of the switch/case control structure in Java code. Here I will show how to extend the functionality of existing enums.


Introduction

A Java enum is a kind of compiler magic. In the byte code any enum is represented as a class that extends the abstract class java.lang.Enum and has several static members. Therefore an enum cannot extend any other class or enum: there is no multiple inheritance.

A class cannot extend an enum either. This limitation is enforced by the compiler.

Here is a simple enum:

enum Color {red, green, blue}

This class tries to extend it:

class SubColor extends Color {}

This is the result of an attempt to compile class SubColor:

$ javac SubColor.java 
SubColor.java:1: error: cannot inherit from final Color
class SubColor extends Color {}
                       ^
SubColor.java:1: error: enum types are not extensible
class SubColor extends Color {}
^
2 errors

An enum can neither extend nor be extended. So how is it possible to extend its functionality? The key word is "functionality". An enum can implement methods. For example, enum Color may declare the abstract method draw() and each member can override it:
enum Color {
    red { @Override public void draw() { } },
    green { @Override public void draw() { } },
    blue { @Override public void draw() { } },
    ;
    public abstract void draw();
}

Popular usage of this technique is explained here. Unfortunately it is not always possible to implement a method in the enum itself because:
  1. the enum may belong to a third-party library or to another team in the company
  2. the enum may already be overloaded with so much other data and so many functions that it becomes unreadable
  3. the enum belongs to a module that does not have the dependencies required to implement the method draw().
This article suggests the following solutions to this problem.


Mirror enum

We cannot modify enum Color? No problem! Let's create enum DrawableColor that has exactly same elements as Color. This new enum will implement our method draw():
enum DrawableColor {
    red { @Override public void draw() { } },
    green { @Override public void draw() { } },
    blue { @Override public void draw() { } },
    ;
    public abstract void draw();
}
This enum is a kind of reflection of the source enum Color, i.e. Color is its mirror.
But how do we use the new enum? All our code uses Color, not DrawableColor. The simplest way to implement this transition is to use the built-in enum methods name() and valueOf() as follows:
Color color = ...
DrawableColor.valueOf(color.name()).draw();
Since the name() method is final and cannot be overridden, and valueOf() is generated by the compiler, these methods always fit each other, so no functional problems are expected here. The performance of this transition is also good: name() does not even create a new String but returns a pre-initialized one (see the source code of java.lang.Enum), and valueOf() is implemented using a Map, so its complexity is O(1).
The code above has an obvious problem. If the source enum Color is changed, the secondary enum DrawableColor does not know about it, so the trick with name() and valueOf() will fail at runtime (valueOf() throws IllegalArgumentException for an unknown name). We do not want this to happen. But how can we prevent it? We have to let DrawableColor know that its mirror is Color and enforce this, preferably at compile time or at least in the unit test phase. Here we suggest validation during unit test execution. An enum can have a static initializer, which is executed when the enum class is initialized, i.e. the first time the enum is actually used. This means that if the static initializer validates that DrawableColor fits Color, a test like the following is enough to make sure that the code will never be broken in production:
@Test
public void drawableColorFitsMirror() {
    // Touching the enum triggers its static initializer, which performs the validation.
    DrawableColor.values();
}
The static initializer just has to compare the elements of DrawableColor and Color and throw an exception if they do not match. This code is simple and can be written for each particular case. Fortunately, a simple open source library named enumus already implements this functionality, so the task becomes trivial:
enum DrawableColor {
    ....
    static {
        Mirror.of(Color.class);
    }
}
That's it. The test will fail if the source enum and DrawableColor no longer fit each other. The utility class Mirror has another method that takes 2 arguments: the classes of the 2 enums that have to fit. This version can be called from anywhere in the code, not only from the enum that has to be validated.
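
For readers who prefer not to add a dependency, the hand-written check mentioned above could look roughly like the sketch below. It is only an illustration: the class MirrorCheck and the helper assertMirrors() are hypothetical names and are not part of enumus, and the check simply compares constant names in declaration order.

```java
import java.util.ArrayList;
import java.util.List;

public class MirrorCheck {
    public enum Color { red, green, blue }

    public enum DrawableColor {
        red, green, blue;

        static {
            // Runs once, when DrawableColor is initialized; fails fast if the enums diverge.
            assertMirrors(Color.class, DrawableColor.class);
        }
    }

    /** Hypothetical helper: throws if the two enums do not declare the same names in the same order. */
    public static void assertMirrors(Class<? extends Enum<?>> source, Class<? extends Enum<?>> mirror) {
        List<String> expected = names(source);
        List<String> actual = names(mirror);
        if (!expected.equals(actual)) {
            throw new IllegalStateException(
                    mirror.getSimpleName() + " " + actual + " does not mirror "
                            + source.getSimpleName() + " " + expected);
        }
    }

    /** Collects the constant names of an enum class in declaration order. */
    private static List<String> names(Class<? extends Enum<?>> type) {
        List<String> result = new ArrayList<>();
        for (Enum<?> constant : type.getEnumConstants()) {
            result.add(constant.name());
        }
        return result;
    }

    public static void main(String[] args) {
        // Touching the enum triggers the static initializer and therefore the validation.
        System.out.println(DrawableColor.valueOf(Color.green.name()));
    }
}
```

If a constant is added to Color only, the first use of DrawableColor fails during class initialization, which is exactly what the unit test shown earlier relies on.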

EnumMap

Do we really have to define another enum that just holds the implementation of one method? In fact, we do not. Here is an alternative solution. Let's define the interface Drawer as follows:
public interface Drawer {
    void draw();
}
Now let's create a mapping between the enum elements and implementations of the interface Drawer:
// Explicit type arguments keep the anonymous subclass compilable on Java 8 as well.
Map<Color, Drawer> drawers = new EnumMap<Color, Drawer>(Color.class) {{
    put(Color.red, new Drawer() { @Override public void draw() { } });
    put(Color.green, new Drawer() { @Override public void draw() { } });
    put(Color.blue, new Drawer() { @Override public void draw() { } });
}};

The usage is simple:

drawers.get(color).draw();

EnumMap is chosen here as the Map implementation for better performance. A Map guarantees that each enum element appears in it only once. However, it does not guarantee that there is an entry for each enum element. It is enough, though, to check that the size of the map equals the number of enum elements:


drawers.size() == Color.values().length

Enumus offers a convenient utility for this case too. The following code throws IllegalStateException with a descriptive message if the map does not fit Color:

EnumMapValidator.validateValues(Color.class, map, "Colors map");


It is important to call the validator from code that is executed by a unit test. In this case the map-based solution is safe against future modifications of the source enum.
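
The same completeness check can also be written by hand. The sketch below is an assumption-laden illustration, not the enumus implementation: the class DrawerMapCheck and the helper missingKeys() are hypothetical names. Unlike the plain size comparison, it also reports which constants have no entry.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

public class DrawerMapCheck {
    public enum Color { red, green, blue }

    public interface Drawer {
        void draw();
    }

    /** Hypothetical helper: returns the enum constants that have no entry in the map. */
    public static <E extends Enum<E>> Set<E> missingKeys(Class<E> type, Map<E, ?> map) {
        Set<E> missing = EnumSet.allOf(type);
        missing.removeAll(map.keySet());
        return missing;
    }

    public static void main(String[] args) {
        Map<Color, Drawer> drawers = new EnumMap<>(Color.class);
        drawers.put(Color.red, () -> System.out.println("drawing in red"));
        drawers.put(Color.green, () -> System.out.println("drawing in green"));
        // Color.blue is left out on purpose to show the check at work.

        Set<Color> missing = missingKeys(Color.class, drawers);
        if (!missing.isEmpty()) {
            System.out.println("No drawer registered for: " + missing);
        }
    }
}
```

Called from a unit test, a non-empty result can simply be turned into a test failure.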


EnumMap and Java 8 functional interface


In fact, we do not have to define a special interface to extend enum functionality. We can use one of the functional interfaces provided by the JDK starting from version 8 (Function, BiFunction, Consumer, BiConsumer, Supplier etc.). The choice depends on the parameters that have to be passed to the function. For example, Supplier can be used instead of the Drawer interface defined in the previous example (since Supplier must return a value, a Supplier<Void> returns null):

Map<Color, Supplier<Void>> drawers = new EnumMap<Color, Supplier<Void>>(Color.class) {{
    put(Color.red, () -> { /* draw in red */ return null; });
    put(Color.green, () -> { /* draw in green */ return null; });
    put(Color.blue, () -> { /* draw in blue */ return null; });
}};

Usage of this map is pretty similar to the one from the previous example:

drawers.get(color).get();

This map can be validated exactly like the map that stores instances of Drawer.



Conclusions


This article shows how powerful Java enums can be if we put some logic inside. It also demonstrates two ways to extend the functionality of enums that work despite the language limitations. Finally, it introduces the open source library enumus, which provides several useful utilities that make working with enums easier.


Featured enum instead of switch

Problem and  its solution

Switch/case is a common control structure implemented in most imperative programming languages. A switch is considered more readable than a series of if/else statements.

Here is a simple example:
// Switch with int literal
switch (c) {
  case 1: one(); break;
  case 2: two(); break;
  case 3: three(); break;
  default: throw new UnsupportedOperationException(String.format("Operation %d is not supported", c));
}


Here is the list of the main problems in this code:

  1. The relationship between the int literals (1, 2, 3) and the executed code is not obvious.
  2. If one of the values (e.g. 2) is no longer supported and this switch is not updated accordingly, it will contain the unused code forever.
  3. If a new possible value of c (e.g. 4) is introduced and the switch is not updated accordingly, the code will probably throw UnsupportedOperationException at runtime without any compile-time notification.
  4. Such a switch structure tends to be duplicated several times in the code, which makes problems 2 and 3 even worse.
The first and simplest fix is to use int constants instead of literals. First, let's define the constants:

// The constants must be final: case labels have to be compile-time constants.
private static final int ONE = 1;
private static final int TWO = 2;
private static final int THREE = 3;

Now the code will look like this:

switch (c) {
  case ONE: one(); break;
  case TWO: two(); break;
  case THREE: three(); break;
  default: throw new UnsupportedOperationException(String.format("Operation %d is not supported", c));
}


(Obviously, in real life the names of the constants must be self-descriptive.)
This snippet is more readable but all the other disadvantages are still relevant. The next attempt to improve the initial code snippet uses enums, introduced to the Java language in version 5 in 2004. Let's define the following enum:

enum Action {ONE, TWO, THREE}


Now the switch snippet will be slightly changed:

Action a = ...
switch (a) {
  case ONE: one(); break;
  case TWO: two(); break;
  case THREE: three(); break;
  default: throw new UnsupportedOperationException(String.format("Operation %s is not supported", a));
}


This code is a little bit better: it will produce a compilation error if one of the elements is removed from enum Action. However, it will not cause a compilation error if an additional element is added to enum Action. Some IDEs or static code analysis tools may produce a warning in this case, but who pays attention to warnings? Fortunately an enum can declare an abstract method that has to be implemented by each element:


enum Action {
  ONE { @Override public void action() { } },
  TWO { @Override public void action() { } },
  THREE { @Override public void action() { } },
  ; // the semicolon ends the list of constants
  public abstract void action();
}


Now the switch statement can be replaced by single line:


Action a = ...
a.action();


This solution does not have any of the disadvantages enumerated above:

  1. It is readable. The method is "attached" to the enum element; one can write as much javadoc as needed if the meaning of the method is unclear. The code that calls the method is trivial: what can be simpler than a method invocation?
  2. There is no way to remove an enum constant without removing its implementation, so no unused code will remain if some functionality is no longer relevant.
  3. A new enum element cannot be added without an implementation of the method action(). Code without the implementation does not compile.
  4. If several actions are required, they can all be implemented in the enum. As already mentioned, the code that calls a specific function is trivial, so there is no code duplication any more.
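
Point 4 can be sketched as follows. The example is illustrative only: the class ActionDemo and the methods describe() and apply() are hypothetical names chosen for this sketch, and the arithmetic stands in for real actions.

```java
public class ActionDemo {
    public enum Action {
        ONE {
            @Override public String describe() { return "adds one"; }
            @Override public int apply(int x) { return x + 1; }
        },
        TWO {
            @Override public String describe() { return "doubles the value"; }
            @Override public int apply(int x) { return x * 2; }
        },
        THREE {
            @Override public String describe() { return "squares the value"; }
            @Override public int apply(int x) { return x * x; }
        };

        // Every constant must implement both methods, otherwise the enum does not compile.
        public abstract String describe();
        public abstract int apply(int x);
    }

    public static void main(String[] args) {
        // No switch is needed at any call site, however many actions the enum offers.
        for (Action a : Action.values()) {
            System.out.println(a + " " + a.describe() + ": " + a.apply(3));
        }
    }
}
```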

Conclusion

Although the switch/case structure is well known and widely used in various programming languages, its usage may cause a lot of problems. The solution described above, which uses Java enums, does not have these disadvantages. The next article in this series shows how to extend the functionality of an existing enum.

Saturday, February 2, 2019

Syntax highlighting

I have written a lot of blog posts that contain code snippets in several programming languages (mostly Java). I separated each code snippet with empty lines and used a monospace font to improve readability. Changing the font for code snippets is annoying and does not give the result I want: I prefer highlighted code.

So, I searched  for tools that can do this work for me and found 2 types of tools:

  1. Tools that take your code snippet and produce HTML that can be embedded into any blog post
  2. Tools that do this transformation at runtime, so the code snippet remains clear text.

The tools of the first type are often very flexible and support various color themes but have a serious disadvantage: the HTML they generate is almost impossible to edit. If you want to change your code snippet, you usually have to regenerate its HTML representation. This also means that you have to store your original snippet for future use, for example as a GitHub gist. It is not a show stopper but an obvious disadvantage.

The tools of the second type do their magic at runtime. The code snippet remains human readable. The injected JavaScript runs when the page is loaded and changes the color of the reserved words of the programming language used in the embedded code snippet.

The most popular and best-looking syntax highlighter that I found is the one created by Alex Gorbatchev.

Here is an example of code snippet highlighted by this tool:
public class MyTest {
    @Test
    public void multiply() {
        assertEquals(4, 2* 2);
    }
}

There are 2 things I had to do to make this magic happen:

  1. Include several scripts and CSS files into HTML header
  2. Write the code snippet into <pre> tag with specific style:
public class MyTest {
    @Test
    public void multiply() {
        assertEquals(4, 2* 2);
    }
}
Typically external resources (either scripts or CSS) are included by reference, i.e.
<script src='http://domain.com/path/script.js' type='text/javascript'></script> 
<link href='http://domain.com/path/style.css' rel='stylesheet' type='text/css'/> 
This works perfectly with the SyntaxHighlighter scripts in a standalone HTML document but did not work when I added the scripts to the theme of my blog. Investigation showed that blogger.com for some reason changed the absolute resource references to protocol-relative ones, so they did not work. Instead of src="http://domain.com/path/script.js" that I had written, the following appeared: src="//domain.com/path/script.js", i.e. the http: prefix is omitted.


So, I downloaded all the scripts to be able to put their source code directly into the body of a <script> tag. For convenience and better performance I minified the scripts using one of the online tools available on the web. The code is available here. This code should be added to the <head> of the HTML page.

Now I can enjoy the great syntax highlighter.





Thursday, January 4, 2018

Why Gradle is called Gradle

Today I asked myself: what does the name "Gradle" mean? I asked Google and here are the first 2 answers:


Answer #1

It's not an abbreviation, and doesn't have any particular meaning.
The name came from Hans Dockter (Gradle founder) who thought it sounded cool.


Answer #2

My original idea was to call it Cradle. The disadvantages of that name were:
  • too diminutive
  • not very unique
As Gradle is using Groovy for the DSL I went down the G-road and thought about calling it Gradle. Everyone I asked liked it so that became the official name. That was about 4 years ago. I'm still very happy with the name.









Conclusions: both answers are correct

  1. This name indeed was invented by the Gradle founder Hans Dockter and IMHO he thinks that the name is cool.
  2. The name has some meaning



Thursday, December 7, 2017

DB #42

DB #42 or how to choose technology :)


We have to implement a new feature. No matter what kind of feature it is. Well, we have to expose a REST API and store and retrieve some dynamic data. We spent a lot of time choosing a DB. The chief architect wanted MySQL. Well, this idea did not pass our sanity filter. The data architect wanted Aerospike, the team leader suggested MongoDB, which was criticized by DevOps, who proposed Redis.

But as we all know the absolute answer to all questions is 42. So, we decided to choose the DB #42 from alphabetically sorted list of NoSQL databases. 

I opened Google and typed "list of nosql databases". Then I chose the following article. I had to extract the list of database names and did it using the following command:

curl http://nosql-database.org/  | grep h3 | grep href > /tmp/db.hrefs.txt

(because all DB names in this list are surrounded by tag <a> with reference to their web page)

There was one case when 2 databases were written in one physical line and I fixed this manually. 
Then I ran the following command to get the alphabetical list of DB names:

cat /tmp/db.hrefs2.txt |  sed 's#</a>.*##' | sed 's/.*>//' | sort > /tmp/db.names.txt

The last phase is to print this list with attached numbers:

 i=1; cat /tmp/db.names.txt | while read l; do echo "$i,$l"; i=`expr $i + 1`; done

Here is the list:
1,Accumulo
2,acid-state
3,Aerospike
4,AlchemyDB
5,allegro-C
6,AllegroGraph
7,Amazon SimpleDB
8,AmisaDB:
9,Apache Flink
10,Applied Calculus
11,ArangoDB
12,ArangoDB
13,ArangoDB
14,Axibase
15,Azure DocumentDB
16,Azure Table Storage
17,BagriDB
18,BangDB
19,BaseX
20,BayesDB
21,BergDB
22,Berkeley DB
23,Berkeley DB XML
24,Bigdata
25,BinaryRage
26,BoltDB
27,BrightstarDB
28,BrightstarDB
29,Cachelot
30,Cassandra
31,Chordless
32,Chronicle Map
33,Cloudata
34,Cloud Datastore
35,Cloudera
36,Clusterpoint Server
37,CodernityDB
38,ConcourseDB
39,CoreObject
40,CortexDB
41,Couchbase Server
42,CouchDB
43,Crate Data
44,DaggerDB
45,Datomic
46,db4o
47,DBreeze
48,densodb
49,djondb
50,Druid
51,DynamoDB
52,Dynomite
53,EJDB
54,Elassandra
55,Elastic
56,Elliptics
57,EMC Documentum xDB
58,ESENT
59,Eventsourcing for Java (es4j)
60,Event Store
61,Execom IOG
62,eXist
63,eXtremeDB
64,eXtremeDB
65,eXtremeDB Financial Edition
66,EyeDB
67,Faircom C-Tree
68,Fallen 8
69,FileDB:
70,filejson
71,FlockDB
72,FoundationDB
73,FramerD
74,GemFire
75,GemStone/S
76,GenieDB
77,Genomu
78,GigaSpaces
79,Globals:
80,GPUdb
81,GraphBase
82,GridGain
83,GT.M
84,gunDB
85,gunDB
86,gunDB
87,Hadoop / HBase
88,Hazelcast
89,Hibari
90,HPCC
91,HSS Database
92,HyperDex
93,HyperGraphDB
94,Hypertable
95,IBM Cloudant
96,IBM Informix
97,IBM Lotus/Domino
98,iBoxDB
99,Infinispan
100,Infinite Graph
101,InfinityDB
102,influxdata
103,InfoGrid
104,Informix Time Series Solution
105,Intersystems Cache
106,ISIS Family
107,JADE
108,JasDB
109,jBASE
110,JEntigrator
111,JSON ODM
112,KAI
113,kdb+
114,KirbyBase
115,KitaroDB
116,KUDU
117,LevelDB
118,LightCloud
119,LSM
120,Magma
121,MapR
122,MarcelloDB
123,MarkLogic Server
124,Maxtable
125,MemcacheDB
126,MentDB:
127,Meronymy
128,MiniM DB
129,Mnesia
130,Model 204 Database
131,MonetDB
132,MongoDB
133,Moonshadow
134,Morantex
135,NCache
136,NDatabase
137,NeDB
138,NEO
139,Neo4J
140,nessDB
141,Newt DB
142,Ninja Database Pro
143,NosDB
144,NoSQL embedded db
145,ObjectDB
146,Objectivity
147,Onyx Database
148,OpenInsight
149,OpenLDAP
150,OpenLink Virtuoso
151,OpenQM
152,Oracle Coherence
153,Oracle NOSQL Database
154,OrientDB
155,OrientDB
156,Perst
157,PickleDB
158,PicoLisp
159,Pincaster
160,pipelinedb
161,Prevayler
162,Qizx
163,quasardb
164,Queplix
165,RaptorDB
166,RaptorDB
167,rasdaman
168,RavenDB
169,RDM Embedded
170,Reality
171,ReasonDB
172,Recutils:
173,Redis
174,RethinkDB
175,Riak
176,Riak TS
177,RockallDB
178,RocksDB
179,Scalaris
180,Scalien
181,SciDB
182,SCR Siemens Common Repository
183,Scylla
184,SDB
185,Sedna
186,SequoiaDB
187,Serenety
188,SharedHashFile
189,siaqodb
190,SisoDB
191,Sophia
192,Sparksee
193,Splice Machine
194,SpreadsheetDB
195,Starcounter
196,Sterling
197,STSdb
198,Symas LMDB
199,Tarantool/Box
200,TayzGrid
201,Terrastore
202,ThruDB
203,TIBCO Active Spaces
204,Tieto TRIP
205,TigerLogic PICK
206,TITAN
207,Tokutek:
208,Tokyo Cabinet / Tyrant
209,ToroDB
210,TreodeDB
211,Trinity
212,U2
213,upscaledb
214,VaultDB
215,VelocityDB
216,Versant
217,VertexDB
218,Voldemort
219,Vyhodb
220,weaver
221,WhiteDB
222,WonderDB
223,Yserial
224,ZODB


And the winner is .... CouchDB - number 42!

Conclusions

Every time you cannot agree which technology to choose just find the longest list of relevant technologies, sort them alphabetically and choose #42.

Thursday, March 17, 2016

Performance of try/catch

Yesterday I had a discussion with a friend who said that, in his opinion, the try/catch statement in Java is very expensive in terms of performance. Indeed, it is always recommended to check a value before using it instead of trying to use it and then catching the exception thrown because of a wrong value.

I decided to check this and tried several code samples. All functions accept int and return the argument multiplied by 2.
But there were the differences:

  1. just calculate the value and return it (foo())
  2. calculate the value inside a try block followed by a catch block (tryCatch())
  3. calculate the value inside a try block followed by a finally block (tryFinally())
  4. calculate the value inside a try block followed by catch and finally blocks (tryCatchFinally())
  5. divide an integer value by zero inside a try block followed by a catch block that just returns -1 (tryThrowCatch())
  6. divide an integer value by zero inside a try block followed by a catch block that re-throws the exception. An outer try/catch structure catches the re-thrown exception and returns -1 (tryThrowCatch1())
  7. divide an integer value by zero inside a try block followed by a catch block that wraps the thrown exception in another RuntimeException and throws it. An outer try/catch structure catches the secondary exception and returns -1 (tryThrowCatch2())
I ran each test 100,000,000 times in loop and measured elapsed time. Here are the results.


Test name         Elapsed time, ms
foo               46
tryCatch          45
tryFinally        45
tryCatchFinally   44
tryThrowCatch     133
tryThrowCatch1    139
tryThrowCatch2    62293
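
The original test code is linked at the end of this post. As a rough illustration of the setup, the sketch below reconstructs four of the tested methods from their descriptions; it is an assumption, not the author's code. The class name, the measure() helper and the zero field are hypothetical, and a naive timing loop like this is only indicative (a harness such as JMH gives more trustworthy numbers).

```java
import java.util.function.IntUnaryOperator;

public class TryCatchBenchmark {
    // Non-final on purpose, so the division below is not a compile-time constant expression.
    static int zero = 0;

    static int foo(int i) {
        return i * 2;
    }

    static int tryCatch(int i) {
        try {
            return i * 2;
        } catch (RuntimeException e) {
            return -1;
        }
    }

    static int tryThrowCatch(int i) {
        try {
            return i / zero; // throws ArithmeticException
        } catch (ArithmeticException e) {
            return -1;
        }
    }

    static int tryThrowCatch2(int i) {
        try {
            try {
                return i / zero;
            } catch (ArithmeticException e) {
                throw new RuntimeException(e); // wrap and throw a new exception
            }
        } catch (RuntimeException e) {
            return -1;
        }
    }

    /** Hypothetical helper: runs op the given number of times and returns elapsed milliseconds. */
    static long measure(IntUnaryOperator op, int iterations) {
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += op.applyAsInt(i);
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink); // uses the result so the loop cannot be discarded
        }
        return elapsedMs;
    }

    public static void main(String[] args) {
        int iterations = 1_000_000; // far fewer than in the post, to keep the run short
        System.out.println("foo            " + measure(TryCatchBenchmark::foo, iterations));
        System.out.println("tryCatch       " + measure(TryCatchBenchmark::tryCatch, iterations));
        System.out.println("tryThrowCatch  " + measure(TryCatchBenchmark::tryThrowCatch, iterations));
        System.out.println("tryThrowCatch2 " + measure(TryCatchBenchmark::tryThrowCatch2, iterations));
    }
}
```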


Analysis

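  1. the try/catch/finally structure written in the code does not by itself cause any performance degradation
  2. throwing and catching an exception is about 3 times more expensive than a simple method call
  3. wrapping an exception in another one and throwing the wrapper is really expensive: in this test it is more than 1,000 times slower than the plain call, presumably because creating the new RuntimeException has to capture a stack trace.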


Conclusions

Catching exceptions does not by itself carry any performance penalty. Throwing an exception is indeed expensive, so validating values before using them is better not only from the design perspective but also from the performance perspective.

The important conclusion is that we should avoid a very common pattern in performance-critical code:

try {
     // some code
} catch (ThisLayerException e) {
    throw new UpperLayerException(e);
}

This pattern lets us use layer-specific exceptions on each layer of our code. However, an exception thrown from a lower layer can be wrapped many times, which creates an extremely long stack trace and causes serious performance degradation. A better approach is probably to derive our domain-level exceptions from RuntimeException and to wrap only checked exceptions, and only once (like Spring does).
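
A minimal sketch of the "wrap only checked exceptions, and only once" idea follows. All names in it (WrapOnce, StorageException, readRecord, load, describe) are hypothetical and exist only for this illustration.

```java
import java.io.IOException;

public class WrapOnce {
    /** Hypothetical domain exception; unchecked, so upper layers do not need to re-wrap it. */
    public static class StorageException extends RuntimeException {
        public StorageException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Lowest layer: may fail with a checked exception.
    static String readRecord(String id) throws IOException {
        if (id.isEmpty()) {
            throw new IOException("empty id");
        }
        return "record-" + id;
    }

    // Boundary layer: the only place where the checked exception is wrapped.
    static String load(String id) {
        try {
            return readRecord(id);
        } catch (IOException e) {
            throw new StorageException("Cannot load record '" + id + "'", e);
        }
    }

    // Upper layer: no try/catch and no further wrapping; StorageException just propagates.
    static String describe(String id) {
        return "Loaded " + load(id);
    }

    public static void main(String[] args) {
        System.out.println(describe("7"));
        try {
            describe("");
        } catch (StorageException e) {
            System.out.println(e.getMessage() + ", caused by " + e.getCause());
        }
    }
}
```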


Source code

The source code used here can be found on github.


    Wednesday, February 24, 2016

    Dangerous String.format()

    Introduction


    The static method format() that was added to class java.lang.String in Java 5 became a popular and widely used method that replaced MessageFormat, string concatenation and verbose calls of StringBuilder.append().

    However, when using this method we should remember that this lunch is not free.

    Performance issues

    1. This method accepts varargs and therefore creates a new Object array on each call to wrap the passed arguments. An extra object is created and must later be removed by the GC.
    2. Internally it creates an instance of java.util.Formatter that parses the format specification. Yet another object and a lot of CPU-intensive parsing.
    3. It creates a new instance of StringBuilder used to store the formatted data.
    4. At the end it calls StringBuilder.toString() and therefore creates yet another object, copying the accumulated characters into the new String.
    So, a call of String.format() creates at least 4 short-lived objects and parses the format specification. In a real application it probably parses the same format millions of times.

    Solution

    Use Formatter directly. Compare the following code snippets:



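    // n is the number of iterations; it is defined elsewhere in the benchmark class.

    // Variant 1: String.format() creates a Formatter and a temporary String on every iteration.
    public static CharSequence str() {
        StringBuilder buf = new StringBuilder();
        for (int i = 0; i < n; i++) {
            buf.append(String.format("%d\n", 1));
        }
        return buf;
    }

    // Variant 2: a single Formatter writes straight into the StringBuilder.
    public static CharSequence fmt() {
        StringBuilder buf = new StringBuilder();
        Formatter fmt = new Formatter(buf);
        for (int i = 0; i < n; i++) {
            fmt.format("%d\n", 1);
        }
        return buf;
    }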
    


    Method fmt() is about 1.5 times faster than method str(). Even better results may be achieved by writing directly to a stream instead of creating a String and then writing it to the stream.
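
Writing directly to a stream could look like the sketch below. It is an illustration under stated assumptions rather than the code measured above: the class and method names are hypothetical, UTF-8 is an arbitrary choice of encoding, and Locale.ROOT is passed explicitly so the output does not depend on the machine settings.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Formatter;
import java.util.Locale;

public class FormatToStream {
    /** Formats n lines straight into the stream; no intermediate String is built per line. */
    static void writeLines(OutputStream out, int n) {
        Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
        Formatter fmt = new Formatter(writer, Locale.ROOT);
        for (int i = 0; i < n; i++) {
            fmt.format("%d\n", 1);
        }
        fmt.flush(); // pushes buffered characters through the writer into the stream
    }

    public static void main(String[] args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeLines(out, 3);
        System.out.print(new String(out.toByteArray(), StandardCharsets.UTF_8));
    }
}
```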


    String.format() is Locale sensitive

    There are 2 format() methods:

    public static String format(String format, Object... args)

    and 

    public static String format(Locale l, String format, Object... args)


    The method that does not receive a Locale argument uses the default locale, Locale.getDefault(Locale.Category.FORMAT), which depends on the machine configuration. This means that changing machine settings changes the behavior of your application and may even break it. The most common problems are:
    • decimal separator
    • digits

    Decimal separator


    Programmers are so used to the decimal separator being a dot (.) that they sometimes forget that it depends on the locale. I've written a simple code snippet that iterates over all available locales and checks what character is used as a decimal separator:

    Decimal separator   Number of locales
    Dot (.)             71
    Comma (,)           89

    If the produced string is later parsed, the parsing may be broken by a change of the default locale on the current machine.
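
The snippet that produced the table is not shown in the post; a possible reconstruction is sketched below. It is an assumption: the class and method names are hypothetical, and the counts it prints depend on the JDK version and its locale data, so they will not necessarily match the table.

```java
import java.text.DecimalFormatSymbols;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

public class DecimalSeparators {
    /** Returns the decimal separator used by the given locale. */
    static char separatorOf(Locale locale) {
        return DecimalFormatSymbols.getInstance(locale).getDecimalSeparator();
    }

    /** Counts, for every available locale, which character serves as the decimal separator. */
    static Map<Character, Integer> countSeparators() {
        Map<Character, Integer> counts = new TreeMap<>();
        for (Locale locale : Locale.getAvailableLocales()) {
            counts.merge(separatorOf(locale), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        countSeparators().forEach((separator, count) ->
                System.out.println("'" + separator + "' is used by " + count + " locales"));
    }
}
```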

    Digits

    Everyone knows that digits are 1,2,3,... This is right, but not in every locale. Arabic, Hindi, Thai and other languages use other characters to represent the same digits. Here is a code sample:




    for (Locale locale : Locale.getAvailableLocales()) {
     String one = String.format(locale, "%d", 1);
     if (!"1".equals(one)) {
       System.out.println("\t" +locale + ": " + one);
     }
    }
    


    And this is its output when it runs on a Linux machine with Java 8:

            hi_IN: १
            th_TH_TH_#u-nu-thai: ๑


    Being executed on Android this code produces 109 lines long output. It includes:

    1. all versions of Arabic locales, 
    2. as, bn, dz, fa, ks, mr, my, ne, pa, ps, uz with dialects.
    This may easily break application on some locales. 



    Conclusions

    1. Since java formatting is locale dependent it should be used very carefully. Probably in some cases it is better to specify locale explicitly, e.g. Locale.US
    2. Be careful when calling String.format() in performance critical sections of code. Using other API (e.g. direct invocation of class Formatter) may significantly improve performance. 
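
The first conclusion can be illustrated with a short sketch. The class and method names are hypothetical; the point is only that text meant to be parsed later is formatted with a fixed locale, while text shown to a user may follow the user's locale.

```java
import java.util.Locale;

public class ExplicitLocale {
    /** For data that will be parsed again: always a dot as decimal separator and ASCII digits. */
    static String machineReadable(double value) {
        return String.format(Locale.US, "%.2f", value);
    }

    /** For text shown to a person: the locale is an explicit, visible decision of the caller. */
    static String forUser(Locale userLocale, double value) {
        return String.format(userLocale, "%.2f", value);
    }

    public static void main(String[] args) {
        String stored = machineReadable(1.5);
        // Double.parseDouble() expects a dot, so this round trip works on any machine.
        System.out.println(Double.parseDouble(stored));
        System.out.println(forUser(Locale.GERMANY, 1.5));
    }
}
```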

    Acknowledgements

    I'd like to thank Eliav Atoun, who inspired the discussion about this issue and helped me to try the code sample on Android.

    Source code 

    Code snippets used here may be found on github




    Thursday, December 17, 2015

    Creating a self-extracting tar.gz

    Motivation


    WinZip can create a self-extracting executable. It is actually the unzip utility bundled with the content of the zip file, which is extracted when the executable runs.

    Recently I was working on the distribution package of our software for Linux. On Windows we have a huge zip file and a small script that we run once the zip is extracted. The scripts for Linux are a little bit more complicated because they have to grant permissions, create a user and a group etc. So, I wrote a script that performs all the necessary actions including extraction of the archive. The disadvantage is that now the distribution consists of at least 2 files: the archive and the script.

    So, I decided to check how to create a self-extracting archive on Linux.


    Used commands

    I started from the following exercise:

    #!/bin/sh
    echo $0
    exit
    foo bar


    This script runs ignoring the last line "foo bar". 
    Then I created tar.gz file and appended it to this script:

    cat script.sh my.tar.gz >script.with.tar.sh

    Although now the script contains binary content of tar.gz it runs as expected. 
    But I want to create self extracting script. Therefore I need a way to separate script from its "attachment". Command "dd" helps to implement this:

    dd bs=1 skip=$SCRIPT_PREFIX  if=$SELF_EXTRACTING_TAR

    But how can a script with an attachment find out the size of its own script part? If the script contains a known number of lines (e.g. 3) we can do the following:

    head -3 $SCRIPT | wc -c


    Script can access its own name using variable $0
    Taking both commands together we can write line that extracts attachment appended to script:

    dd bs=1 skip=`head -3 $0 | wc -c` if=$0

    Extracting a tar.gz can be achieved using the command

    gunzip -c  | tar -x

    So, this command extracts the attached tar.gz to current directory:

    dd bs=1 skip=`head -3 $0 | wc -c` if=$0 | gunzip -c | tar -x


    Script

    It is a good idea to create script that takes regular tar.gz and creates self-extracting archive.
    The script is available here. It accepts the following arguments:

    1. mandatory path to tar.gz 
    2. optional command that is automatically executed right after the archive extracting. Typically it is script packaged into the tar. 
    The name of resulting executable is as name of tar.gz with suffix ".self".



    Usage Examples: 

    Create a self-extracting executable:
    ./selftar.sh my.tar.gz

    Create a self-extracting executable with a command that is executed after the archive is extracted:
    ./selftar.sh my.tar.gz "folder/install.sh"

    Both examples create the executable file my.tar.gz.self, which extracts the files originally packaged into my.tar.gz to the current directory:

    ./my.tar.gz.self


    Usage

    This script and article were inspired by my work on a distribution package based on a simple archive. Generally this technique is wrong by definition. A distribution package depends on the target platform and should be created using the platform-specific tools: .msi for MS Windows, .rpm for RedHat, .deb for Debian etc. A self-extracting executable, however, allows creating a simple package that can be used on most Unix-based platforms, which is very convenient, especially when the packaged application is written in a cross-platform language like Java.

    Wednesday, December 10, 2014

    Why not to develop code on Windows


    I have been developing code professionally for the last 17 years. For most of these years I used MS Windows because this operating system is the most widely used in the Israeli hi-tech industry. However, the years spent with Linux were the most productive and fun for me.

    So, I decided to write the reasons why not to develop code on Windows. Obviously this is irrelevant for people that develop applications for Windows. 
    1. Windows is not free
    2. you have to install Office, which is not free either
    3. the shell is awful
    4. even Linux-like shells (Cygwin, gitshell etc.) do not work exactly like Linux shells.
    5. security issues drive you mad. Often you cannot remove a file although you are an administrator. For example, on Windows 8.1 the Java method File.canWrite() returns false unless the file is under the user home, although the file is indeed writable.
    6. Often files remain locked even if the process that locked them is killed.
    7. Backslash as a path delimiter instead of forward slash causes lots of mistakes.
    8. The "Program Files" folder is very important. The space in its name causes many bat files that do not wrap each path in quotes to fail.
    9. Most open source tools are developed and tested on Linux. Even if the tools are cross-platform, some issues often happen on Windows only.
    10. It is not Linux


    Thursday, January 23, 2014

    Usage trends of JavaScript MVC Frameworks

    Analysis

    JavaScript MVC frameworks have become very popular during the last 5 years. Naturally there are a lot of such frameworks. I have spent most of my time as a developer on the server side, so I am not very familiar with those frameworks but want to learn them. But where to start? I do not want to spend time on a framework that will be obsolete in a year, because I do not want to look like people who start learning J2ME, applet programming, the Log4J API and configuration, or start a new project with Ant and CVS these days.

    So, I googled "JavaScript MVC frameworks" and found the following articles:

    I know that the list is not complete but if it is good enough for the authors it is good enough for me :).
    Then I wanted to check, using Google Trends, whether these libraries are popular with other people. Unfortunately Google Trends does not allow comparing more than 5 search terms, so I split the list into groups and then compared the best libraries from the first round.

    Here are links to the results.

    Semi final


    Final




    According to this graph AngularJS wins.

    Conclusions

    It seems that AngularJS is the most popular JS MVC framework now and its popularity is growing very fast, so I am going to learn it. 

    Thursday, July 11, 2013

    Performance of checking computer clock

    Very often we check the computer clock using either System.currentTimeMillis() or System.nanoTime(). Often we call these methods to check how long a certain part of our program runs in order to improve performance. But how much does a call of these methods cost? Or, in other words:

    How long does it take to ask "What time is it now?"

    I asked myself this question and wrote the following program.

    public static void main(String[] args) {
        long tmp = System.nanoTime();
        long before = System.nanoTime();
        for (int i = 0; i < 1000_000_000; i++) {
            // do the call
        }

        long after = System.nanoTime();
        System.out.println((after - before) / 1000_000);
    }

    Then I replaced the comment "do the call" with interesting code fragments and measured the time. Here are my results.

    Code | Elapsed time, ms
    nothing | 5
    call of foo() {return 0;} | 5
    f+=f | 320
    call of foo() {return f+=f;} where f is a class level static variable initialized to System.nanoTime() | 325
    call of System.nanoTime() | 19569
    call of System.currentTimeMillis() | 22639
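
    For reference, the System.nanoTime() row can be reproduced with a self-contained sketch like the one below. The class name ClockCost, the sink field and the method averageNanoTimeCostNs() are my own illustrative names, and the result is reported per call rather than per billion calls, so treat it as one possible way to fill in the "do the call" comment rather than the exact program behind the table.

```java
final class ClockCost {
    // Written once after the loop so that the JIT cannot discard the measured calls.
    static volatile long sink;

    /** Returns the average duration of one System.nanoTime() call, in nanoseconds. */
    static double averageNanoTimeCostNs(int iterations) {
        long acc = 0;
        long before = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            acc += System.nanoTime(); // the call being measured
        }
        long after = System.nanoTime();
        sink = acc;
        return (after - before) / (double) iterations;
    }

    public static void main(String[] args) {
        averageNanoTimeCostNs(1_000_000); // warm-up pass so that the loop gets compiled
        System.out.println(averageNanoTimeCostNs(100_000_000) + " ns per call");
    }
}
```

    The absolute numbers depend heavily on the operating system, the hardware and the JVM version, so only the relation between the rows is meaningful.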

    This means that:

    1. A method that just returns a constant is not executed at all. Calling a method that returns 0 takes exactly the same time as doing nothing.
    2. The method call itself costs practically nothing. Executing f+=f directly and calling a method that does the same take practically the same time (320 ms vs. 325 ms). We have to say "thanks" to the JVM, whose JIT compiler optimizes the code at runtime.
    3. A call of currentTimeMillis() is about 15% heavier than a call of nanoTime() (22639 ms vs. 19569 ms).
    4. Each of the two clock methods costs about as much as ~65 arithmetic operations.

    Conclusions

    1. Checking the computer clock can itself take noticeable time when it is used to measure the performance of relatively small pieces of code. So, we should be careful doing this.
    2. Using nanoTime() is preferable for measuring a time period not only because it gives higher precision and is not sensitive to changes of the computer clock but also because it runs faster. Moreover, this method returns more reliable results because it uses a monotonic clock: if you perform 2 consecutive calls, the second call never returns a smaller number than the first one, which is not guaranteed for currentTimeMillis().
    3. Do not try to optimize code by manually inlining your logic. The JVM does it for us at runtime. Indeed, running an arithmetic operation directly and calling a method that contains only this operation take practically the same time.

    Acknowledgements

    I would like to thank Arnon Klein for his valuable comments. 

    Wednesday, January 9, 2013

    Annotation based design patterns


    Annotations, introduced in Java 5.0, have become a very well known and widely used language feature. There are more or less "standard" solutions where annotations can be used successfully. This article tries to classify different usages and to define some annotation based design patterns.

    Introduction

    Annotations define metadata that can be discovered using the reflection API. They affect neither the class hierarchy nor the relationships between objects. However, they can be used to label classes, methods and fields, to look up services, to define model transformation rules, to configure proxies etc. These use cases can be classified as re-usable patterns exactly like design patterns in object oriented programming.

    Scope

    This article is not a tutorial that explains what annotations can do. It assumes that the reader is familiar with annotations. The article shows the most popular patterns that can be used for solving typical tasks. Annotations can be classified by their retention policy. This article discusses annotations that are retained by the VM at runtime, i.e. marked as

    @Retention(RUNTIME)  
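
    To illustrate why the retention policy matters for everything that follows, here is a minimal sketch. The names RetentionDemo, Visible, Hidden and Sample are made up for this example. Only the annotation retained at runtime can be discovered through reflection.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

final class RetentionDemo {
    // Retained in the class file and kept by the VM: visible to reflection.
    @Retention(RetentionPolicy.RUNTIME)
    @interface Visible {}

    // Retained in the class file only: not visible to reflection.
    @Retention(RetentionPolicy.CLASS)
    @interface Hidden {}

    @Visible
    @Hidden
    static class Sample {}

    static boolean isVisiblePresent() {
        return Sample.class.isAnnotationPresent(Visible.class);
    }

    static boolean isHiddenPresent() {
        return Sample.class.isAnnotationPresent(Hidden.class);
    }
}
```

    Here isVisiblePresent() returns true while isHiddenPresent() returns false, although both annotations are written on the same class.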

    Classification of annotations

    Annotations may be classified by:
    • target (type, method, field, annotation etc)
    • role (stereotype, transformation, validation etc)
    • module that uses this annotation. Usually an annotation is used by another class, but sometimes the annotated class itself uses its own annotation.
    • instance lifecycle phases
      • before instance creation
      • before executing business logic (runner, injector)
      • during executing business logic (discovering stack trace)

    Annotation based design patterns

    Stereotype (Tag)

    Tag interfaces (interfaces that do not declare any methods and serve as a kind of label on the classes that implement them) were in use long before annotations were invented. The classic example of such an interface is java.io.Serializable. Since a tag interface does not declare methods, there are two ways to use it:

    if (obj instanceof Serializable) {...}
    or
    if (Serializable.class.isAssignableFrom(clazz)) {...}

    If we use annotations instead of tag interface we can replace definition:

    class MyClass implements MyTag {...}

    where MyTag is defined as 

    interface MyTag {}

    by the following one:

    @MyTag
    class MyClass {...}

    where MyTag is defined as:

    @Retention(RetentionPolicy.RUNTIME)
    @interface MyTag {}

    A class marked with this annotation can be detected using the code:

    if (clazz.getAnnotation(MyTag.class) != null) {...}

    Using an annotation instead of a tag interface requires no additional effort during implementation and has an advantage: an annotation can hold one or several parameters.
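
    A minimal sketch of a tag that carries a parameter may look as follows. The value() attribute and the classes Tagged and Plain are assumptions made for illustration only.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

final class TagDemo {
    // A tag annotation that, unlike a tag interface, carries a value.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface MyTag {
        String value() default "";
    }

    @MyTag("billing")
    static class Tagged {}

    static class Plain {}

    /** Returns the value of the tag, or null if the class is not tagged. */
    static String tagOf(Class<?> clazz) {
        MyTag tag = clazz.getAnnotation(MyTag.class);
        return tag == null ? null : tag.value();
    }
}
```

    With these definitions tagOf(Tagged.class) returns "billing" and tagOf(Plain.class) returns null.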

    Tags or Stereotypes allow defining what the class does in different contexts.

    Examples:

    • org.springframework.transaction.annotation.Transactional
    • javax.persistence.Transient
    • java.lang.annotation.Documented
    • java.lang.annotation.Retention
    • java.lang.annotation.Target

    Service locator

    A typical software product consists of modules or services. Components should be able to find other services that provide the required functionality. A service, like a tagged class, may be defined as a class that implements a specific interface. But contrary to a tag interface, a service interface declares methods.

    Before annotations were invented, frameworks could locate a service if it implemented the required interface. This method does not allow distinguishing between two different implementations of the same interface. Annotations allow this. For example, the Service annotation of the Spring Framework can hold the service name:

    @Service("manager1") 
    public class FirstManager implements Manager {}

    @Service("manager2") 
    public class SecondManager implements Manager {}

    This example shows two services that implement the same interface but can be identified by the name written as an annotation argument. 
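
    How a framework might use such names can be sketched with plain reflection. Note that this is not how Spring is implemented: NamedService is a hypothetical stand-in for @Service, and the Registry class with its explicit list of candidate classes is an assumption made to keep the example self-contained.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.HashMap;
import java.util.Map;

final class ServiceLocatorDemo {
    // Hypothetical stand-in for Spring's @Service; not part of any library.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface NamedService {
        String value();
    }

    interface Manager {}

    @NamedService("manager1")
    static class FirstManager implements Manager {}

    @NamedService("manager2")
    static class SecondManager implements Manager {}

    /** Indexes candidate classes by the name given in @NamedService. */
    static final class Registry {
        private final Map<String, Class<?>> byName = new HashMap<>();

        Registry(Class<?>... candidates) {
            for (Class<?> candidate : candidates) {
                NamedService service = candidate.getAnnotation(NamedService.class);
                if (service != null) {
                    byName.put(service.value(), candidate);
                }
            }
        }

        /** Creates the service registered under the name, or returns null if there is none. */
        <T> T lookup(String name, Class<T> type) {
            Class<?> impl = byName.get(name);
            if (impl == null || !type.isAssignableFrom(impl)) {
                return null;
            }
            try {
                return type.cast(impl.getDeclaredConstructor().newInstance());
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
    }
}
```

    A lookup by "manager2" yields a SecondManager although both classes implement the same Manager interface, which is exactly what the interface-only approach could not express.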

    Not only class but also separate method can be a service. For example, each test in the test case is "service" that validates specific testing scenario. Particular test is labeled with annotation @Test and framework can find it. 

    More complicated example of service pattern is @RequestMapping of Spring MVC. This annotation supports several attributes that help the framework to choose suitable services according to HTTP request parameters. Both classes and methods can be annotated with @RequestMapping. 


    Injector

    Injector is an annotation that helps framework to inject (or initialize) method arguments when calling business method. @PathVariable of Spring MVC that is used in conjunction with @RequestMapping instructs the framework how to initialize method arguments:


    @RequestMapping(value = "/user/{username}", method = RequestMethod.GET)
    @ResponseBody
    public User getUser(@PathVariable("username") String userName) {...}


    Now, when the URL looks like http://thehost/app/user/johnsmith, the framework extracts johnsmith from the URL and passes it as the method argument userName because this argument is marked with the annotation @PathVariable("username").
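
    The mechanics behind such injection can be sketched with Method.getParameterAnnotations(). The annotation Var, the class UserController and the map of already extracted variables are hypothetical; a real framework would also parse the URL and convert types.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

final class InjectorDemo {
    // Hypothetical stand-in for @PathVariable.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.PARAMETER)
    @interface Var {
        String value();
    }

    static class UserController {
        public String getUser(@Var("username") String userName) {
            return "user:" + userName;
        }
    }

    /** Fills each @Var parameter from the map and invokes the method. */
    static Object invoke(Object target, Method method, Map<String, String> variables) {
        Annotation[][] annotations = method.getParameterAnnotations();
        Object[] args = new Object[annotations.length];
        for (int i = 0; i < annotations.length; i++) {
            for (Annotation annotation : annotations[i]) {
                if (annotation instanceof Var) {
                    args[i] = variables.get(((Var) annotation).value());
                }
            }
        }
        try {
            return method.invoke(target, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Simulates a request where "johnsmith" was already extracted from the URL. */
    static Object demo() {
        Map<String, String> variables = new HashMap<>();
        variables.put("username", "johnsmith");
        try {
            Method method = UserController.class.getMethod("getUser", String.class);
            return invoke(new UserController(), method, variables);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

    Calling InjectorDemo.demo() returns "user:johnsmith": the argument was chosen by the annotation value, not by the parameter name or position.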



    Master (Runner)

    Contrary to a Service Locator annotation, which defines metadata of the class itself, the runner annotation refers to another class that will be used to run the current class. The typical example is the @RunWith annotation of JUnit. This annotation marks a test case and defines the runner that will execute it:


    @RunWith(SpringJUnit4ClassRunner.class)
    public class MyTest {...}


    Configurer

    An annotation may configure a class at runtime. An example is the @Parameters annotation of JUnit that must be used in conjunction with the Parameterized test runner.
    We can run the same test case with different sets of arguments and expected results using @Parameters. The following simple example shows how to use these annotations:

    @RunWith(value = Parameterized.class)
    public class MatchTest {
        private Pattern pattern = Pattern.compile("^\\d+$");
        private String number;
        private boolean expected;

        public MatchTest(String number, boolean expected) {
            this.number = number;
            this.expected = expected;
        }

        @Parameters
        public static Collection<Object[]> data() {
            return Arrays.asList(new Object[][] {
                {"100 kg", false}, {"123", true}, {"220 lb", false}
            });
        }

        @Test
        public void match() {
            Assert.assertEquals(expected, pattern.matcher(number).find());
        }
    }

    In the above example JUnit runs this test case 3 times passing "100 kg", "123", "220 lb" as the test arguments and false, true, false as the expected results.


    Proxy/Wrapper/AOP

    Annotations may be used to create and configure class wrapper or dynamic proxy or even to modify the byte code of the class itself utilizing byte code engineering. Spring security framework implements this pattern as follows:


    @PreAuthorize("hasRole('ROLE_USER')")
    @PostFilter("hasPermission(filterObject, 'read') or hasPermission(filterObject, 'admin')")
    public List<Contact> getAll();



    The example above shows how to annotate a bean method so that its invocation is permitted only if the current user has the role ROLE_USER. Moreover, the returned collection is filtered automatically and the result contains only the elements for which the condition defined by @PostFilter is true. Spring wraps the application level bean with a dynamic proxy to perform the security check and to filter the results according to the annotation based configuration.
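
    The idea of an annotation-driven dynamic proxy can be sketched with java.lang.reflect.Proxy. This is only an illustration of the pattern, not the way Spring Security works internally: the annotation RequiresRole, the interface ContactService and the explicit set of current roles are assumptions.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.List;
import java.util.Set;

final class SecuredProxyDemo {
    // Hypothetical, much simplified stand-in for @PreAuthorize.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface RequiresRole {
        String value();
    }

    interface ContactService {
        @RequiresRole("ROLE_USER")
        List<String> getAll();
    }

    /** Wraps the target so that annotated methods are checked before they run. */
    static <T> T secure(Class<T> api, T target, Set<String> currentRoles) {
        InvocationHandler handler = (proxy, method, args) -> {
            RequiresRole required = method.getAnnotation(RequiresRole.class);
            if (required != null && !currentRoles.contains(required.value())) {
                throw new SecurityException("Missing role " + required.value());
            }
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow the original exception of the target
            }
        };
        return api.cast(Proxy.newProxyInstance(
                api.getClassLoader(), new Class<?>[] {api}, handler));
    }
}
```

    The business class stays free of security code: a caller without ROLE_USER gets a SecurityException from the proxy, and the real getAll() is never reached.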

    Validator

    Annotations can be used to configure data validation. Two types of validation can be distinguished: bean validation (JSR 303) and method validation. The Hibernate Validator framework supports both types in a similar way.


    Bean validation

    public class Car {
        @NotNull
        private String manufacturer;

        @NotNull
        @Size(min = 2, max = 14)
        private String licensePlate;

        @Min(2)
        private int seatCount;

        ................
    }




    Method validation

    public @NotNull @Size(min = 1) Collection<Role> findRolesOfUser(@NotNull User user) {...}

    Both bean and method validations use similar annotation syntax. However they differ in implementation.

    • A method validator typically uses a dynamic proxy or byte code engineering. Bean validation may be implemented utilizing validation mechanisms of other systems. For example, a @NotNull annotated field may be validated using the appropriate constraint of the target relational database.
    • A method validator is always synchronous while validation of a bean field may be asynchronous. For example, for a persisted Hibernate entity the constraint violation may be found only when the transaction is being committed. This does not necessarily happen directly after the bean field is modified.
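
    A synchronous bean validator can be sketched in a few lines of reflection. The annotations Required and MinValue below are hypothetical and are named differently from the JSR 303 ones on purpose; the sketch shows the pattern only and is not an implementation of Bean Validation.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

final class BeanValidatorDemo {
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Required {}

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface MinValue {
        long value();
    }

    static class Car {
        @Required
        private final String manufacturer;
        @MinValue(2)
        private final int seatCount;

        Car(String manufacturer, int seatCount) {
            this.manufacturer = manufacturer;
            this.seatCount = seatCount;
        }
    }

    /** Returns one message per violated constraint; an empty list means the bean is valid. */
    static List<String> validate(Object bean) {
        List<String> violations = new ArrayList<>();
        for (Field field : bean.getClass().getDeclaredFields()) {
            field.setAccessible(true);
            Object value;
            try {
                value = field.get(bean);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
            if (field.isAnnotationPresent(Required.class) && value == null) {
                violations.add(field.getName() + " must not be null");
            }
            MinValue min = field.getAnnotation(MinValue.class);
            if (min != null && value instanceof Number
                    && ((Number) value).longValue() < min.value()) {
                violations.add(field.getName() + " must be at least " + min.value());
            }
        }
        return violations;
    }
}
```

    For new Car("Audi", 4) the list of violations is empty, while new Car(null, 1) produces two messages, one per violated field.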

    Transformer

    Transformer defines how to convert value from one form to another. Typical usage is marking bean properties or appropriate getters. The most popular examples are:
    • JAXB annotations @XmlElement, @XmlAttribute, @XmlTransient that define how to transform bean properties to XML elements and vice versa.
    • Hibernate annotations @Table, @Id, @Column etc. that define how to transform objects to records stored in a relational database and vice versa.
    The above-mentioned @PostFilter annotation that configures a dynamic proxy can also be classified as a transformer. It defines additional criteria for filtering a collection retrieved from the database.


    Annotation container

    Two or more annotations of the same type are not allowed on a single element. This is forbidden by the Java compiler. One solution is to use an array instead of a simple value for the specific attribute. For example, the already mentioned @RequestMapping can map a specific controller or its method to several URLs:

    @RequestMapping(value = {
            "/user/{username}",
            "/userlist?username={username}"},
        method = RequestMethod.GET)
    @ResponseBody
    public User getUser(@PathVariable("username") String userName) {...}

    This is a good solution if the value of the attribute is not used in conjunction with the value of the other attribute: mappings of both URLs are valid for HTTP GET.

     Let's see another example. 

    @XmlElement(name = "user-name", required=true)
    public String getUsername() {return username;}

    In this example the field username is mapped to the mandatory XML element user-name. How can we support mapping of the same bean field to an additional XML element UserName? We cannot add the annotation:

    @XmlElement(name = "UserName", required=false)

    to the same getter getUsername(). Fortunately JAXB provides another annotation, @XmlElements, that plays the role of a container for other annotations:

    @XmlElements({
        @XmlElement(name = "user-name", required=true),
        @XmlElement(name = "UserName", required=false)
    })
    public String getUsername() {return username;}

    The getter is annotated once using @XmlElements. However, both object-to-XML mappings are provided. 

    Definition of @XmlElements is very simple. The value type is an array of XmlElement:


    public @interface XmlElements {
        XmlElement[] value();
    }
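
    The same container technique works for any custom annotation. In the sketch below Alias and Aliases are hypothetical annotations that mimic the @XmlElement/@XmlElements pair, and aliasNames() shows how the code that consumes them unpacks the container.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

final class ContainerDemo {
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Alias {
        String name();
        boolean required() default false;
    }

    // The container: its only attribute is an array of the contained annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Aliases {
        Alias[] value();
    }

    static class User {
        @Aliases({
            @Alias(name = "user-name", required = true),
            @Alias(name = "UserName")
        })
        public String getUsername() {
            return "johnsmith";
        }
    }

    /** Collects the names of all aliases stored in the container, in declaration order. */
    static List<String> aliasNames(Method getter) {
        List<String> names = new ArrayList<>();
        Aliases container = getter.getAnnotation(Aliases.class);
        if (container != null) {
            for (Alias alias : container.value()) {
                names.add(alias.name());
            }
        }
        return names;
    }

    static List<String> demo() {
        try {
            return aliasNames(User.class.getMethod("getUsername"));
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

    ContainerDemo.demo() returns the list [user-name, UserName], i.e. both mappings are available although the getter carries a single annotation.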

    Custom annotation

    Regular classes can be customized using inheritance. Annotations do not support inheritance. How can we customize generic annotations?

    A possible solution is to use a custom annotation which is itself annotated with another annotation provided by the framework. The @Profile annotation of the Spring framework is an example of this pattern. As described here, Spring supports profiles. We can define several profiles and run a different set of beans for each one. A bean can be associated with a profile using the @Profile annotation. Both FirstDevService and SecondDevService will run when the profile is dev.


    @Profile("dev") @Service
    public class FirstDevService { ... }


    @Profile("dev") @Service
    public class SecondDevService { ... }

    Using the @Profile("dev") annotation is relatively verbose and error prone: a bean that is by mistake marked as @Profile("deu") will not start in mode "dev", and no error message will be generated. Fortunately Spring allows creating a custom annotation @Dev and marking it with @Profile:



    @Target(ElementType.TYPE)
    @Retention(RetentionPolicy.RUNTIME)
    @Profile("dev")
    public @interface Dev {}

    Now we can use this new annotation as follows:


    @Dev @Service
    public class FirstDevService { ... }


    @Dev  @Service
    public class SecondDevService { ... }

    Syntax that uses custom annotations is shorter, more readable and less error prone: simple mistake in annotation name will produce compilation error.
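
    A framework can support such custom annotations by looking not only at the annotations of a class but also at the annotations of those annotations. Below is a minimal sketch of this lookup. Env is a hypothetical stand-in for @Profile, and envOf() is not Spring's actual resolution algorithm.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

final class MetaAnnotationDemo {
    // Hypothetical stand-in for @Profile; it may be placed on classes and on other annotations.
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.TYPE, ElementType.ANNOTATION_TYPE})
    @interface Env {
        String value();
    }

    // The custom annotation: the string "dev" is written exactly once, here.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @Env("dev")
    @interface Dev {}

    @Dev
    static class FirstDevService {}

    @Env("prod")
    static class ProdService {}

    static class Unmarked {}

    /** Resolves the environment from @Env used directly or as a meta-annotation. */
    static String envOf(Class<?> clazz) {
        Env direct = clazz.getAnnotation(Env.class);
        if (direct != null) {
            return direct.value();
        }
        for (Annotation annotation : clazz.getAnnotations()) {
            Env meta = annotation.annotationType().getAnnotation(Env.class);
            if (meta != null) {
                return meta.value();
            }
        }
        return null;
    }
}
```

    envOf(FirstDevService.class) returns "dev" although the class itself mentions only @Dev. Note the use of annotationType(): calling getClass() on an annotation instance would return the proxy class generated by the VM, which carries no annotations.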

    Caller identifier

    Sometimes code has to discover its caller. Let's examine the following examples. The trick is to iterate over stack trace and to look for specific annotation.

    • A factory can create an instance of a special mockup implementation instead of the real implementation when running in a test environment. The test context may be identified if one of the stack trace elements is annotated with @Test.
    • Special logic may be required when running in a web context. The web context can be detected if the @WebServlet or @Controller annotations are found.


    Here is a simple implementation of utility that checks whether the code was called by class annotated with specified annotation. For example CallerUtil.isCallerClassAnnotatedBy(Controller.class) determines whether the caller is Spring MVC controller. 

    import java.lang.annotation.Annotation;
    import java.util.HashMap;
    import java.util.Map;

    public class CallerUtil {
        // Cache of already loaded classes, keyed by class name
        private static Map<String, Class<?>> classes = new HashMap<String, Class<?>>();

        public static boolean isCallerClassAnnotatedBy(
                Class<? extends Annotation> annotationClass) {
            // Walk the current stack trace and check the class of each frame
            for (StackTraceElement e : new Throwable().getStackTrace()) {
                Class<?> clazz = getClazz(e.getClassName());
                if (clazz.getAnnotation(annotationClass) != null) {
                    return true;
                }
            }
            return false;
        }

        private static Class<?> getClazz(String className) {
            Class<?> clazz = classes.get(className);
            if (clazz == null) {
                try {
                    clazz = Class.forName(className);
                    classes.put(className, clazz);
                } catch (ClassNotFoundException e) {
                    throw new IllegalStateException(e);
                }
            }
            return clazz;
        }
    }

    Annotated interface

    Annotation targeted to type and annotated as @Inherited can be retrieved even if it is used to annotate not the class itself but its super class:


    @Retention(RUNTIME) 
    @Target(TYPE)
    @Inherited
    public @interface  MyAnnotation  {}

    @MyAnnotation
    public class Base {}

    public class Child extends Base {}

    The call Child.class.getAnnotation(MyAnnotation.class) will return an instance of MyAnnotation although class Child is not annotated with MyAnnotation, because the base class is annotated and MyAnnotation is itself annotated as @Inherited. Unfortunately, annotations are inherited only from super classes. The method getAnnotation() does not return annotations applied to interfaces implemented by the current class:

    @MyAnnotation
    public interface Foo {}

    public class FooImpl implements Foo {}

    The call FooImpl.class.getAnnotation(MyAnnotation.class) will return null because Foo is an interface.

    Although I have not seen this pattern utilized by popular libraries, I personally think that annotated interfaces may be very useful. To retrieve an annotation from an interface we have to iterate over the interfaces implemented by the class and try to retrieve the annotation from each interface separately. The following utility method retrieves an annotation applied to the class itself, to its base class (if the annotation is @Inherited) or to any interface declared by the class or by one of its super classes. Interfaces that are only extended by those interfaces are not examined.

    public static <A extends Annotation> A getAnnotation(Class<?> clazz, Class<A> annotationType) {
        A classAnnotation = clazz.getAnnotation(annotationType);
        if (classAnnotation != null) {
            return classAnnotation;
        }
        for (Class<?> c = clazz; c != null; c = c.getSuperclass()) {
            for (Class<?> implementedInterface : c.getInterfaces()) {
                A interfaceAnnotation = implementedInterface.getAnnotation(annotationType);
                if (interfaceAnnotation != null) {
                    return interfaceAnnotation;
                }
            }
        }
        return null;
    }
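
    The method above looks only at interfaces listed directly in the implements clause of the class and of its super classes. If an annotated interface is extended by another interface, a recursive variant is needed. Here is a possible sketch; the names AnnotationLookup, Marker, Foo, Bar and BarImpl are invented for this example.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

final class AnnotationLookup {
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Marker {}

    @Marker
    interface Foo {}

    interface Bar extends Foo {}

    static class BarImpl implements Bar {}

    /** Searches the class, its interfaces, their super-interfaces and then the super classes. */
    static <A extends Annotation> A find(Class<?> clazz, Class<A> annotationType) {
        if (clazz == null) {
            return null;
        }
        A found = clazz.getAnnotation(annotationType);
        if (found != null) {
            return found;
        }
        for (Class<?> implementedInterface : clazz.getInterfaces()) {
            found = find(implementedInterface, annotationType);
            if (found != null) {
                return found;
            }
        }
        // getSuperclass() returns null for interfaces and for Object, which ends the recursion
        return find(clazz.getSuperclass(), annotationType);
    }
}
```

    Here BarImpl.class.getAnnotation(Marker.class) returns null, while find(BarImpl.class, Marker.class) follows BarImpl, then Bar, then Foo and returns the annotation.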





    Shared constructor parameter

    Let's review an example. There is an abstract class Base with a constructor that accepts a parameter of type Class:

    public abstract class Base {
        protected Base(Class<?> type) {
            // uses argument type here
        }
    }


    There is an abstract subclass Intermediate that has nothing to do with the parameter. 
    public abstract class Intermediate extends Base {
        protected Intermediate(Class<?> type) {
            super(type);
        }
    }


    There is the concrete class Concrete that passes the same value of the parameter for all its instances.

    public class Concrete extends Intermediate {
        public Concrete() {
            super(String.class);
        }
    }

    Although Intermediate has nothing to do with type it must declare constructor that just passes the parameter to its base class. The inheritance chain may be longer. But each class in the chain must have such trivial constructor, i.e. must be aware of the base class parameter existence.

    Alternatively we can pass this data using annotations.

    Let's define the annotation ConcreteType:

    @Target(ElementType.TYPE)
    @Retention(RetentionPolicy.RUNTIME)
    @Inherited
    @Documented
    @interface ConcreteType {
        Class<?> value();
    }

    The constructor of the base class does not have to accept the parameter. It extracts this information from the annotation that exists on the concrete class:

    public abstract class Base {
        protected Base() {
            Class<?> type =
                getClass().getAnnotation(ConcreteType.class).value();
            // deal with type
        }
    }


    Neither Intermediate nor Concrete declares an explicit constructor at all:

    public abstract class Intermediate extends Base {
    }

    The value of type is defined using the annotation.

    @ConcreteType(String.class)
    public class Concrete extends Intermediate {
    }

    This implementation is shorter and easier to modify. For example, if the parameter type is changed, the Intermediate class does not have to be modified at all.

    By the way, Concrete class can be subclassed too:

    public class MoreConcrete extends Concrete {
    }

    Class MoreConcrete is not annotated with @ConcreteType but since @ConcreteType is annotated as @Inherited, MoreConcrete  inherits it from Concrete.
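
    The fragments above can be combined into one runnable sketch. The field type, the null check and the enclosing class SharedParameterDemo are additions made for illustration; the rest follows the classes described in this section.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Inherited;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

final class SharedParameterDemo {
    @Target(ElementType.TYPE)
    @Retention(RetentionPolicy.RUNTIME)
    @Inherited
    @Documented
    @interface ConcreteType {
        Class<?> value();
    }

    abstract static class Base {
        final Class<?> type;

        protected Base() {
            // getClass() returns the runtime class, so the annotation of the concrete class is found
            ConcreteType annotation = getClass().getAnnotation(ConcreteType.class);
            if (annotation == null) {
                throw new IllegalStateException(
                        getClass().getName() + " is not annotated with @ConcreteType");
            }
            type = annotation.value();
        }
    }

    // No constructor needed: Intermediate knows nothing about the parameter.
    abstract static class Intermediate extends Base {}

    @ConcreteType(String.class)
    static class Concrete extends Intermediate {}

    // Inherits @ConcreteType from Concrete because the annotation is @Inherited.
    static class MoreConcrete extends Concrete {}
}
```

    Both new Concrete().type and new MoreConcrete().type are String.class. A concrete subclass that forgets the annotation fails fast in the Base constructor instead of producing a NullPointerException.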



    Conclusions

    Design patterns are a well known technique for creating robust and reusable software components. Various ways of using annotations in the Java programming language can be classified as annotation based design patterns. This article suggests a classification of typical tasks that can be implemented using annotations. The author hopes that this classification may be helpful when designing and choosing instruments for the implementation of other similar tasks.