The programmer's notes: 2011

Tuesday, December 6, 2011

How slow is getStackTrace()?

Today, during a discussion about low-level Java tricks, I was told that Throwable.getStackTrace() is a very heavy operation. So I decided to check this myself and benchmark Throwable.getStackTrace() against other widely used Java operations.

I wrote a loop that calls each code snippet 10 million times. I believe this sample is large enough. Here are the results.

Action	Time (milliseconds)	Comment
System.currentTimeMillis()	149
new Throwable().getStackTrace()	19081
Thread.currentThread().getStackTrace()	24334	Added later and tested in a different environment, so this value is calculated relative to new Throwable().getStackTrace()
throwable.getStackTrace()	343
Create object and invoke its hashCode()	548
new Thread()	8898
new Thread().start()	804*1000	Used 10000 iterations here. 10 millions take ages...
file print	1647
file print (long lines)	6673

We can see that getStackTrace() is a really heavy operation compared with an ordinary method call. But its cost is comparable to the time needed to print a 60-character line to a file on the local hard drive, and it is much lighter than starting a new thread.

This means that a nicely formatted log record that contains the source method name is about twice as heavy as a log record that does not print the method name. But starting a new thread is about 40 times heavier.
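To make the logging case concrete, here is a minimal sketch of a helper that resolves the calling method through getStackTrace(). The class name CallerLog and its methods are hypothetical, and the index 1 assumes that format() is called directly from the method being logged.

```java
class CallerLog {
    /** Formats a log record that is prefixed with the name of the calling method. */
    static String format(String message) {
        // Element 0 is format() itself, element 1 is the method that called it.
        StackTraceElement[] trace = new Throwable().getStackTrace();
        String method = trace.length > 1 ? trace[1].getMethodName() : "<unknown>";
        return "[" + method + "] " + message;
    }

    /** Hypothetical business method that writes one log record. */
    static String doWork() {
        return format("working");
    }

    public static void main(String[] args) {
        System.out.println(doWork());
    }
}
```

Every call to format() pays for one stack trace capture, which is the cost measured in the table above.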

Recently I found out that there is yet another way to retrieve the current stack trace: Thread.currentThread().getStackTrace(). I decided to test this method relative to new Throwable().getStackTrace() and found that it is 20% heavier. Code investigation showed that the reason is probably an additional security check (invocation of the security manager) done in the code of Thread.getStackTrace().
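A comparison of this kind can be reproduced with a very small harness. The following is only a sketch, not the original benchmark code: the class name StackTraceBench is made up, the iteration count is reduced to keep the run short, and lambdas require Java 8 or later. Absolute numbers will differ between machines and JVM versions.

```java
class StackTraceBench {
    /** Runs the action the given number of times and returns the elapsed milliseconds. */
    static long time(int iterations, Runnable action) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            action.run();
        }
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) {
        int n = 100_000; // assumption: far fewer iterations than the 10 million used above
        System.out.println("new Throwable().getStackTrace(): "
                + time(n, () -> new Throwable().getStackTrace()) + " ms");
        System.out.println("Thread.currentThread().getStackTrace(): "
                + time(n, () -> Thread.currentThread().getStackTrace()) + " ms");
    }
}
```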

Conclusions

Throwable.getStackTrace() is relatively heavy, but there are other operations that take approximately the same or even more time. So we can use it when we need it, but we should be careful.

Acknowledgement
I want to thank Vladi Bar On, who prompted me to perform this investigation.

Friday, July 29, 2011

Find bad coding practices using regular expressions

I have been using Eclipse for the last 8 years and am very glad to witness its continuous improvement. Specifically, I mean its compilation warnings. The number of warnings it can produce grows from version to version, which is great. I typically enable most of them and believe that this improves my code.

Unfortunately, some warnings are still absent. Sometimes people use concrete classes or overly specialized interfaces where more general interfaces should (and almost must) be used. Here are code examples that irritate me:

// Using a concrete class on the left side of an assignment
ArrayList<String> a = new ArrayList<String>();

// Using a concrete class inside generics
new ArrayList<ArrayList<String>>();

// Using a concrete class as a method parameter or return type
HashSet<Integer> foo(ArrayList<String> list){}

I personally never write such code, but I know people who do.
When I see such code I want to fix it. But how do I find all the places? Eclipse does not create warnings about any of these problems. Fortunately, Eclipse supports search using regular expressions, so I decided to write several regular expressions that help to solve this problem.

Here are the expressions:

// left side of assignment
// [A-Z]\w+(List|Set|Map|Bean|Impl)\s*(<.*?>)?\s+\w+\s*=

// nested generics
// <\w+(List|Set|Map|Bean|Impl)\b[\w,\s<>]*>

// method argument
// \s+\w+\s*\([^)]*[A-Z]\w+(List|Map|Set)\b

// return type
// \w+(List|Set|Map|Bean|Impl)\s*(<.*?>)?\s+\w+\s*\(.*\{

I used them on a pretty large codebase and found them useful. I tried to write patterns that are as short as possible and that work without false negatives and with a minimum of false positives.
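Outside of the Eclipse search dialog, the same expressions can be tried with java.util.regex. The sketch below uses the "left side of assignment" expression unchanged (only escaped for a Java string literal); the class name ConcreteTypeFinder and the sample lines are made up for illustration.

```java
import java.util.regex.Pattern;

class ConcreteTypeFinder {
    // The "left side of assignment" expression from above, escaped for a Java string literal.
    static final Pattern LEFT_SIDE = Pattern.compile(
            "[A-Z]\\w+(List|Set|Map|Bean|Impl)\\s*(<.*?>)?\\s+\\w+\\s*=");

    /** Returns true if the line declares a variable whose type name looks like a concrete class. */
    static boolean isSuspicious(String sourceLine) {
        return LEFT_SIDE.matcher(sourceLine).find();
    }

    public static void main(String[] args) {
        String[] lines = {
            "ArrayList<String> a = new ArrayList<String>();",
            "List<String> b = new ArrayList<String>();",
            "WorkerImpl worker = new WorkerImpl();"
        };
        for (String line : lines) {
            System.out.println(isSuspicious(line) + "  " + line);
        }
    }
}
```

The first and third lines are reported, while the declaration that already uses the List interface is not.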

Limitations

Obviously, regular expressions cannot solve 100% of the problems. The method assumes specific naming conventions. For example, if I call my interface Worker and the class that implements it WorkerImpl, this method will find expressions like WorkerImpl worker = new WorkerImpl(). But it will not work for the following code sample:

class Producer implements Runnable {
    @Override
    public void run() {
        // some code
    }
}

//.............................................
Producer p = new Producer();
new Thread(p).start();

Obviously, this code sample uses the instance of Producer only as a Runnable and therefore should be written as:

Runnable p = new Producer();

new Thread(p).start();

Unfortunately no regular expression can find this situation.

Here is yet another limitation. Sometimes people use the "wrong" interface. Here is a code sample:

List<String> list = new ArrayList<String>();
for (String s : list) {/*do something*/}

List is not needed here. It should be replaced by Collection:

Collection<String> list = new ArrayList<String>();
for (String s : list) {/*do something*/}

Why is this important? In the future you may decide to store the elements in a Set and still want to iterate over them. In this case only the right side of the assignment has to be changed:

Collection<String> list = new LinkedHashSet<String>();

for (String s : list) {/*do something*/}

It is not a problem when the code that declares the collection variable and the code that iterates over it are in the same class or even the same method. But if the collection is created somewhere and then passed through several layers to the code that iterates over it, changing the signatures of 50 methods from List to Set (or, better, to Collection) may take a lot of time.
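The effect on method signatures can be shown with a small sketch. The class CollectionParamDemo and the method totalLength() are hypothetical; the point is only that a method declared with Collection keeps working when the caller switches from a List to a Set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;

class CollectionParamDemo {
    /** Only iterates over the elements, so Collection is the most general sufficient type. */
    static int totalLength(Collection<String> items) {
        int total = 0;
        for (String s : items) {
            total += s.length();
        }
        return total;
    }

    public static void main(String[] args) {
        Collection<String> asList = new ArrayList<String>(Arrays.asList("ab", "cd", "ab"));
        // Switching to a Set changes only the right side; totalLength() stays untouched.
        Collection<String> asSet = new LinkedHashSet<String>(asList);
        System.out.println(totalLength(asList));
        System.out.println(totalLength(asSet));
    }
}
```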

Conclusions

Regular expressions can help us to locate bad coding practices. Although the method cannot find all problems and sometimes produces false positives, it was tested on a large code base and worked well enough. But a "real" solution can be implemented only at the IDE level. Here is a link to the bug report that I created: https://bugs.eclipse.org/bugs/show_bug.cgi?id=353380. I hope the Eclipse team will implement this suggestion.

Tuesday, July 26, 2011

Automatic detection of debugger

Yesterday I was debugging a test case that consists of a few similar unit tests. All the unit tests call an API that executes tasks on a background thread. I can verify the test results only when the thread is done.

The "right" solution is to call the asynchronous API and then call wait(). The background thread should call notify() when it is done, and the test thread will then continue and validate the results. In practice I cannot modify the background thread and add notify() there: it is too deep inside the application I am working on. Moreover, I do not want to modify production code just to help myself write unit tests. So I decided to use a "simple" solution: each of my tests calls the asynchronous API, then invokes Thread.sleep(100L), then verifies the results. 100 milliseconds are enough for the asynchronous task to complete.

But when I am debugging code that is invoked by the background thread, I do not want the test to terminate. In that case I want my test to sleep indefinitely. I changed the parameter of sleep() many times and realized that I need an automatic mechanism that detects that the code is being debugged and chooses the sleep period accordingly. I checked the system properties and did not see any difference between a normal run and a debug run.
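For comparison, the wait()/notify() handshake described as the "right" solution might look roughly like the sketch below. Everything in it is hypothetical: WaitNotifyDemo stands in for the real asynchronous API, and the computation is a placeholder. It shows why the approach needs cooperation from the background thread.

```java
class WaitNotifyDemo {
    private final Object lock = new Object();
    private boolean done;
    private int result;

    /** Stands in for the asynchronous API: computes a value on a background thread. */
    void startAsync() {
        new Thread(() -> {
            int computed = 6 * 7; // placeholder for the real work
            synchronized (lock) {
                result = computed;
                done = true;
                lock.notifyAll(); // wake up the waiting test thread
            }
        }).start();
    }

    /** Blocks the calling (test) thread until the background thread signals completion. */
    int awaitResult() {
        synchronized (lock) {
            try {
                while (!done) { // the loop guards against spurious wakeups
                    lock.wait();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
            return result;
        }
    }

    public static void main(String[] args) {
        WaitNotifyDemo demo = new WaitNotifyDemo();
        demo.startAsync();
        System.out.println(demo.awaitResult());
    }
}
```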

Another idea was to try JMX. Indeed, the Runtime MXBean can help here:

ManagementFactory.getRuntimeMXBean().getInputArguments(). According to the javadoc, this method:

Returns the input arguments passed to the Java virtual machine which does not include the arguments to the main method.

So this is exactly what I need!

Here are 2 examples of the return value of this method:

Running a program with a remote debugger:

[-Xdebug, -Xrunjdwp:transport=dt_socket,address=8000,server=y,suspend=n]

Debugging a program under Eclipse:

[-agentlib:jdwp=transport=dt_socket,suspend=y,address=localhost:49709, -Dfile.encoding=UTF-8]

The following utility method returns true if the program is running under a debugger and false otherwise:

public static boolean isDebugging() {
    Pattern p = Pattern.compile("-Xdebug|jdwp");
    for (String arg : ManagementFactory.getRuntimeMXBean().getInputArguments()) {
        if (p.matcher(arg).find()) {
            return true;
        }
    }
    return false;
}

I hope that the regular expression I used here is general enough to support other IDEs.

Now we can use this code when we need our code to behave differently depending on whether it is executed normally or under a debugger.
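Applied to the sleeping tests described earlier, the check could be used as in the following sketch. The class DebugAwareSleep and the helper sleepMillis() are hypothetical names. The overload that takes an explicit argument list exists only so the matching logic can be tried without actually attaching a debugger.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.regex.Pattern;

class DebugAwareSleep {
    private static final Pattern DEBUG_ARG = Pattern.compile("-Xdebug|jdwp");

    /** Same check as above, but on an explicit list of JVM arguments. */
    static boolean isDebugging(List<String> jvmArgs) {
        for (String arg : jvmArgs) {
            if (DEBUG_ARG.matcher(arg).find()) {
                return true;
            }
        }
        return false;
    }

    /** Checks the arguments of the running JVM. */
    static boolean isDebugging() {
        return isDebugging(ManagementFactory.getRuntimeMXBean().getInputArguments());
    }

    /** Hypothetical helper: how long a test should wait for the background task. */
    static long sleepMillis() {
        // Under a debugger, wait practically forever; otherwise the usual 100 ms.
        return isDebugging() ? Long.MAX_VALUE : 100L;
    }

    public static void main(String[] args) {
        System.out.println("debugging: " + isDebugging() + ", sleep: " + sleepMillis());
    }
}
```

A test would then call Thread.sleep(DebugAwareSleep.sleepMillis()) instead of using a hard-coded value.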

Tuesday, July 12, 2011

Performance of method invocation by reflection

One day, while exploring the code of the project I am working on, I found the following:

try {
    module.getClass().getMethod(methodName, Serializable.class).invoke(module, message);
} catch (Exception e) {
   throw e;
}

Theoretically I knew that reflection is slower than direct invocation. But how much slower?
Since this code was found in a very performance-critical part of the system, I decided to do some benchmarking first. I wrote a class with a single method foo() that does nothing, implemented 3 invocation scenarios and ran each of them 1 million times:

direct invocation
invocation using reflection when getMethod() was called once
invocation using reflection when getMethod() was called on each loop iteration.

And here are the results.

direct invocation took 5 ms
reflection took 38 ms
getMethod() + reflection invocation took 435 ms

This means that reflection can generally be used even in systems that are required to be fast, provided that getMethod() is not called for each invocation separately.
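The three scenarios can be sketched as follows. This is not the original benchmark: the class ReflectionBench is a reconstruction from the description, foo() increments a counter instead of doing nothing (so that the calls can be verified), and ReflectiveOperationException requires Java 7 or later. Timings will vary with the JVM.

```java
import java.lang.reflect.Method;

class ReflectionBench {
    public static class Target {
        int calls;
        public void foo() { calls++; }
    }

    /** Scenario 1: direct invocation. */
    static void direct(Target t, int n) {
        for (int i = 0; i < n; i++) {
            t.foo();
        }
    }

    /** Scenario 2: getMethod() is called once and the Method object is reused. */
    static void cachedReflection(Target t, int n) {
        try {
            Method m = Target.class.getMethod("foo");
            for (int i = 0; i < n; i++) {
                m.invoke(t);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Scenario 3: getMethod() is called on each loop iteration. */
    static void lookupEachTime(Target t, int n) {
        try {
            for (int i = 0; i < n; i++) {
                Target.class.getMethod("foo").invoke(t);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        Target t = new Target();
        long t0 = System.nanoTime();
        direct(t, n);
        long t1 = System.nanoTime();
        cachedReflection(t, n);
        long t2 = System.nanoTime();
        lookupEachTime(t, n);
        long t3 = System.nanoTime();
        System.out.println("direct: " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("cached reflection: " + (t2 - t1) / 1_000_000 + " ms");
        System.out.println("getMethod() each time: " + (t3 - t2) / 1_000_000 + " ms");
    }
}
```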

Thursday, July 7, 2011

File access: stream vs nio channel

Java provides 2 ways to access files: streams and NIO channels. Streams implement blocking I/O: the read() method blocks until some content is available. Channels let us read content without blocking. This allows us, for example, to avoid allocating a dedicated thread per data source (file, socket etc.).

While this advantage is very important when communicating over sockets, it is probably less relevant when reading and writing files. At least the code looks pretty much the same. I decided to compare the performance of streams and NIO when reading and writing files sequentially.

I wrote a simple program that copies a file using streams and NIO. With NIO I used 2 types of buffers: regular and direct. The utility can also read the file without writing it back to disk, which allows comparing the reading and writing speed separately. The following table shows the results I got for a 23 MB file. Times are given in milliseconds.

Operation	NIO	NIO + Direct buffer	Stream
read + write	638	392	281
read	205	87	37
write (calculated)	433	315	244

The results show quite clearly that good old streams work much faster when accessing files sequentially, especially for reading. Reading files using streams is more than twice as fast as NIO even with a direct buffer (37 ms versus 87 ms).

I was surprised by these results and tried to find the reason. A short investigation of the JDK code explained everything. The FileInputStream.read() method is declared as native, so when we call it we directly use the native mechanisms of the current operating system. FileChannel, which is used to read files, is an abstract class. The real implementation is in FileChannelImpl. Its read() method calls a lot of methods implemented in Java, uses synchronized blocks etc. Obviously it works slower than a native method.
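For reference, the two copy strategies compared above could look roughly like this. It is a sketch rather than the original utility: the class name FileCopyDemo, the 8 KB buffer size and the selfCheck() helper are assumptions, and the code uses try-with-resources and java.nio.file, which need Java 7 or later.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

class FileCopyDemo {
    private static final int BUFFER_SIZE = 8192; // assumption: the original buffer size is not known

    /** Copies src to dst with classic blocking streams. */
    static void copyWithStreams(Path src, Path dst) throws IOException {
        try (InputStream in = new FileInputStream(src.toFile());
             OutputStream out = new FileOutputStream(dst.toFile())) {
            byte[] buffer = new byte[BUFFER_SIZE];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
    }

    /** Copies src to dst with NIO channels, using either a heap or a direct buffer. */
    static void copyWithChannels(Path src, Path dst, boolean direct) throws IOException {
        ByteBuffer buffer = direct
                ? ByteBuffer.allocateDirect(BUFFER_SIZE)
                : ByteBuffer.allocate(BUFFER_SIZE);
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst,
                     StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            while (in.read(buffer) != -1) {
                buffer.flip();                 // switch the buffer from filling to draining
                while (buffer.hasRemaining()) {
                    out.write(buffer);
                }
                buffer.clear();                // make the buffer ready for the next read
            }
        }
    }

    /** Copies a small temporary file with every strategy and checks that the copies are identical. */
    static boolean selfCheck() {
        try {
            Path src = Files.createTempFile("copy-src", ".bin");
            Path dst = Files.createTempFile("copy-dst", ".bin");
            try {
                byte[] data = new byte[20000];
                for (int i = 0; i < data.length; i++) {
                    data[i] = (byte) i;
                }
                Files.write(src, data);
                copyWithStreams(src, dst);
                boolean ok = Arrays.equals(data, Files.readAllBytes(dst));
                copyWithChannels(src, dst, false);
                ok &= Arrays.equals(data, Files.readAllBytes(dst));
                copyWithChannels(src, dst, true);
                ok &= Arrays.equals(data, Files.readAllBytes(dst));
                return ok;
            } finally {
                Files.deleteIfExists(src);
                Files.deleteIfExists(dst);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("copies identical: " + selfCheck());
    }
}
```

Wrapping each copy call in System.nanoTime() measurements gives a comparison like the one in the table.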

Conclusions

NIO gives us a lot of advantages. But it cannot completely replace the good old streams. At least sequential reading and writing of regular files works much faster when implemented using streams.

Tuesday, July 5, 2011

java.util.regex.Pattern vs regular string search

Following a discussion at work where I took the role of advocate for regular expressions, I decided to verify how fast they really are. I remembered reading that once a Pattern is compiled (which might take some time) it works as quickly as a regular string search. So I wrote a test that calls String.contains() and Pattern.matcher().find() 1 million times each. Here are the results: the contains test took 102 ms while the pattern test took 340 ms.

This means that patterns are about 3.3 times slower. Therefore, although regular expressions provide a very convenient way of searching within strings, when the patterns are very simple and the code is very performance critical we should use the good old methods provided by class java.lang.String.

That said, I noticed that indexOf() accepts a String, while contains() accepts a CharSequence but is implemented as

return indexOf(s.toString()) >= 0;

Pattern, on the other hand, works directly with CharSequence. This means that if you use StringBuilder you should be careful when passing a StringBuilder instance as an argument to String methods: creating a new String object may cause significant performance degradation.
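The difference can be seen in a small sketch that searches inside a StringBuilder both ways. The class name SearchDemo and the search text are made up; Pattern.quote() is used so that the pattern matches the text literally.

```java
import java.util.regex.Pattern;

class SearchDemo {
    private static final String NEEDLE = "needle";
    // Compiled once; Pattern.quote() makes the pattern match the text literally.
    private static final Pattern NEEDLE_PATTERN = Pattern.compile(Pattern.quote(NEEDLE));

    /** contains() is a String method, so the builder content is copied into a new String first. */
    static boolean viaString(StringBuilder text) {
        return text.toString().contains(NEEDLE);
    }

    /** The matcher works on the CharSequence directly, without an intermediate String. */
    static boolean viaPattern(CharSequence text) {
        return NEEDLE_PATTERN.matcher(text).find();
    }

    public static void main(String[] args) {
        StringBuilder haystack = new StringBuilder("hay hay needle hay");
        System.out.println(viaString(haystack) + " " + viaPattern(haystack));
    }
}
```

Which variant is faster depends on the text size and on how often the builder has to be converted, so it is worth measuring in the real scenario.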

Sunday, January 30, 2011

Codility certificate

Recently I found a cool site that allows programmers to verify their programming skills and recruiters to verify the programming skills of candidates. I took an exam and got a silver certificate.
http://codility.com/cert/view/cert3D8D4M-ZZC7578U6V78B8UE

As far as I can see, each task has 2 types of solutions: an O(n^2) one that can get a silver certificate at most, and a better O(n) or O(log n) one that has a chance of getting a gold certificate.

Thursday, January 13, 2011

Get program entry point

Today I read the following discussion: How can I determine which class's `main` method was invoked at runtime?

The problem was to determine the program's entry point, i.e. which class's main was used to start the program. The suggestion was obvious:

public static String getMainClassName() {
   StackTraceElement[] elem = new Exception().getStackTrace();
   return elem[elem.length - 1].getClassName();
}

This solution is fine, but it is correct for the main thread only. How can we detect the program entry point from another thread? Fortunately, the method getAllStackTraces() has been available in class Thread since Java 1.5. So the solution is to iterate over all threads, find the main thread and return the last element of its stack trace:

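private static String getMainClassName() {
 Map<Thread, StackTraceElement[]> map = Thread.getAllStackTraces();
 for (Map.Entry<Thread, StackTraceElement[]> entry : map.entrySet()) {
  Thread thread = entry.getKey();
  // The main thread is named "main" and belongs to the thread group "main".
  if ("main".equals(thread.getName()) && "main".equals(thread.getThreadGroup().getName())) {
   StackTraceElement[] trace = entry.getValue();
   // The last element of the trace is the bottom of the stack, i.e. the entry point.
   return trace[trace.length - 1].getClassName();
  }
 }
 return null;
}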

This method can be called from any thread in the application. But it also has a limitation: getAllStackTraces() returns only live threads. If the main thread terminated before the method is called, the program entry point cannot be found.

Tuesday, January 4, 2011

Access windows registry with pure java

Introduction

Java is a cross-platform language and therefore does not support platform-specific features. Windows has a unique feature: the registry. So Java cannot access the registry directly, and we have to use JNI or external processes (directly or indirectly) to do this.

Solution

But recently I found a cool Java feature: preferences, implemented by class java.util.prefs.Preferences. Yes, I am eating my hat now! This feature was introduced in Java 1.4 and I did not know about it and had never used it! What's cool about this feature? Its implementation is platform specific. As always, we work with the abstract class java.util.prefs.Preferences and get its instances using static methods like userRoot(), systemRoot() and others. But the concrete implementation depends on the current operating system. Class WindowsPreferences uses the registry to store values. It declares several native methods: WindowsRegOpenKey, WindowsRegCloseKey, WindowsRegDeleteKey etc. Sounds good, doesn't it?
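Here is a minimal sketch of the public Preferences API in use. The class PrefsDemo, the node path com/example/prefsdemo and the key are made up for illustration. On Windows the value is expected to end up in the registry under the user hive, while other platforms use a different backing store.

```java
import java.util.prefs.BackingStoreException;
import java.util.prefs.Preferences;

class PrefsDemo {
    /** Stores a value under a hypothetical node, reads it back and removes the node again. */
    static String roundTrip(String key, String value) {
        Preferences node = Preferences.userRoot().node("com/example/prefsdemo");
        node.put(key, value);
        String read = node.get(key, null); // null is the default if the key is missing
        try {
            node.removeNode(); // clean up so the demo leaves nothing behind
        } catch (BackingStoreException e) {
            // Cleanup is best-effort in this sketch.
        }
        return read;
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("color", "blue"));
    }
}
```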

So I decided to abuse this class and implement access to the registry using these methods. It was not a very simple task. The class WindowsPreferences uses only a specific registry path: Software\JavaSoft\Prefs under HKLM and HKCU. The class itself is package-private and can be neither inherited nor called directly. So reflection is the only real option.

Implementation

I called my class Registry and decided to make it a singleton. It contains a static enum Hive that enumerates all standard registry hives (HKLM, HKCU, HKCR etc). It implements the following public methods:

keys(Hive hive, String path)
values(Hive hive, String path)
get(Hive hive, String path, String name)
createKey(Hive hive, String path)
put(Hive hive, String path, String name, String value)
removeKey(Hive hive, String path)
removeValue(Hive hive, String path, String name)

The implementation is based on invoking, via reflection, the private native methods declared in the WindowsPreferences class:



@SuppressWarnings("unchecked")
private <R> R call(String methodName, Class<?>[] types, Object[] args) {
  try {
    // Look up the private static native method and make it callable.
    Method m = winPrefClazz.getDeclaredMethod(methodName, types);
    m.setAccessible(true);
    return (R) m.invoke(null, args);
  } catch (Exception e) {
    throw new RuntimeException(e);
  }
}

Here is an example of how this method is used. This is what I would like to write:

int result = WindowsRegCloseKey(handle);

This is what I have to do instead:

int result = call("WindowsRegCloseKey", new Class[] {int.class}, new Object[] {handle});

Using reflection sometimes has benefits. For example, the methods that enumerate registry keys and values are very similar, but they call different native methods (e.g. WindowsRegEnumValue and WindowsRegEnumKeyEx). So the methods childrenNamesSpi() and keysSpi() are almost duplicates in WindowsPreferences. Using reflection allows creating one implementation for both cases.

Limitations

A careful look at the API exposed by the Registry class shows that it is limited to string values only: put accepts a value of type String and get returns a String. Unfortunately, this is a limitation of the current approach. Class WindowsPreferences is not designed to work with other value types. The native method WindowsRegSetValueEx works with string values only, and an attempt to retrieve a value of another type using WindowsRegQueryValueEx returns null.

Conclusions

Java programmers are used to relying on native libraries for platform-specific features. But sometimes such features are already implemented by the JVM and just not exposed as a public API. Investigating the JDK classes and using reflection gives us access to the Windows registry with simple and very compact Java code. The implementation deals with string values only, which is a serious limitation, but it can still be used because most values stored in the registry are strings. The suggested technique is most useful when code size is critical, for example for applications started using JNLP.

Full source code of Registry class and unit test may be found here.

References

I found this trick myself and wrote this article. Then I googled "read windows registry with java" and found at least 2 similar implementations:

http://lenkite.blogspot.com/2008/05/access-windows-registry-using-java.html
http://www.davidc.net/programming/java/reading-windows-registry-java-without-jni

I have no idea why I did not search the web before writing the code... The advantage of my solution is that I implemented a full API that can be used in any application, while these guys just showed that this solution is possible.