String.replaceAll in Java might not do what you expect it to do

At first glance it seems obvious what String.replaceAll(String regex, String replacement) does, and most of the time it does exactly what you want, but under some conditions it does not. Let me demonstrate with a few unit tests I wrote recently while fixing a bug in our production system:


    @Test
    public void regularReplace1() {
        final String input = "this is a user description";
        String result = input.replaceAll("\\{\\[user\\]\\}", "name");

        Assert.assertEquals(input, result);
    }

This is what we expected. Let's try something else:


    @Test
    public void regularReplace2() {
        final String input = "this is a {[user]} description";
        String result = input.replaceAll("\\{\\[user\\]\\}", "name");

        Assert.assertEquals("this is a name description", result);
    }

Still fine, but what happens when our replacement string contains a $?


    @Test(expected = StringIndexOutOfBoundsException.class)
    public void regularReplace3() {
        final String input = "this is a {[user]} description";
        String result = input.replaceAll("\\{\\[user\\]\\}", "name $");

        Assert.assertEquals("this is a name $ description", result);
    }

As you can see, the test expects a StringIndexOutOfBoundsException to be thrown. Why? We’ll get to that later. Let's try moving the $ to the beginning of the replacement string:


    @Test(expected = IllegalArgumentException.class)
    public void regularReplace4() {
        final String input = "this is a {[user]} description";
        String result = input.replaceAll("\\{\\[user\\]\\}", "$name");

        Assert.assertEquals("this is a $name description", result);
    }

Now replaceAll throws an IllegalArgumentException. If you know how regular expressions work you are probably starting to figure out what is going on. Let's try another magical character, the backslash:


    @Test
    public void regularReplace5() {
        final String input = "this is a {[user]} description";
        String result = input.replaceAll("\\{\\[user\\]\\}", "\\ name");

        // We expect "this is a \ name description", but the backslash is
        // consumed as an escape character and only the space after it survives
        Assert.assertEquals("this is a  name description", result);
    }

No exception, but not what we expected. Let's move the backslash to the end of the replacement string:


    @Test(expected = StringIndexOutOfBoundsException.class)
    public void regularReplace6() {
        final String input = "this is a {[user]} description";
        String result = input.replaceAll("\\{\\[user\\]\\}", "name \\");

        Assert.assertEquals("this is a name \\ description", result);
    }

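Ok, now we are getting a StringIndexOutOfBoundsException again. (Newer JDKs throw an IllegalArgumentException for a trailing $ or \ instead, so the expected exception in tests 3 and 6 depends on the Java version.) All of this seems rather strange, but figuring out the cause is not hard. The Java documentation for String.replaceAll tells you that str.replaceAll(regex, repl) yields exactly the same result as: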

Pattern.compile(regex).matcher(str).replaceAll(repl)

so moving on to the documentation for Matcher.replaceAll we can read the following:

Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string. Dollar signs may be treated as references to captured subsequences as described above, and backslashes are used to escape literal characters in the replacement string.

Really? That is not quite what I expected. I thought the replacement string was just a string that would replace whatever the regular expression matched. It turns out that replaceAll is a bit more powerful than that, and also somewhat dangerous: if the replacement string comes from user input, we must escape the special characters $ and \ before calling String.replaceAll. To fix the bug in our system I first implemented a method, safeReplaceAll:


    public static String safeReplaceAll(String input, String regex, String replacement)
    {
        if (input == null) { return null; }

        // Escape special characters in replacement, then do replace
        return input.replaceAll(regex, Matcher.quoteReplacement(replacement));
    }
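
To make the two behaviours concrete, here is a minimal sketch (the class and method names are made up for illustration and are not part of our code base). It shows a replacement string being interpreted as a pattern with a group reference, and what Matcher.quoteReplacement does to a string before it is handed to replaceAll:

```java
import java.util.regex.Matcher;

final class ReplacementDemo {

    // "$2 $1" is not literal text: $1 and $2 refer to the capture groups of the match.
    static String swapWords(String input) {
        return input.replaceAll("(\\w+) (\\w+)", "$2 $1");
    }

    // Same idea as safeReplaceAll above: the value is escaped first,
    // so any $ or \ in it ends up in the output unchanged.
    static String fillUser(String template, String value) {
        return template.replaceAll("\\{\\[user\\]\\}", Matcher.quoteReplacement(value));
    }

    public static void main(String[] args) {
        System.out.println(swapWords("hello world"));                // world hello
        System.out.println(Matcher.quoteReplacement("name $"));      // name \$
        System.out.println(Matcher.quoteReplacement("\\ name"));     // \\ name
        System.out.println(fillUser("this is a {[user]} description", "$name"));
    }
}
```

quoteReplacement simply puts a backslash in front of every $ and \, which is exactly the escaping the Matcher documentation asks for.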

Here are the same tests using our safeReplaceAll:


    @Test
    public void safeReplace1() {
        final String input = "this is a user description";
        String result = StringUtil.safeReplaceAll(input, "\\{\\[user\\]\\}", "name");

        Assert.assertEquals(input, result);
    }


    @Test
    public void safeReplace2() {
        final String input = "this is a {[user]} description";
        String result = StringUtil.safeReplaceAll(input, "\\{\\[user\\]\\}", "name");

        Assert.assertEquals("this is a name description", result);
    }


    @Test
    public void safeReplace3() {
        final String input = "this is a {[user]} description";
        String result = StringUtil.safeReplaceAll(input, "\\{\\[user\\]\\}", "name $");

        Assert.assertEquals("this is a name $ description", result);
    }


    @Test
    public void safeReplace4() {
        final String input = "this is a {[user]} description";
        String result = StringUtil.safeReplaceAll(input, "\\{\\[user\\]\\}", "$name");

        Assert.assertEquals("this is a $name description", result);
    }


    @Test
    public void safeReplace5() {
        final String input = "this is a {[user]} description";
        String result = StringUtil.safeReplaceAll(input, "\\{\\[user\\]\\}", "\\ name");

        Assert.assertEquals("this is a \\ name description", result);
    }


    @Test
    public void safeReplace6() {
        final String input = "this is a {[user]} description";
        String result = StringUtil.safeReplaceAll(input, "\\{\\[user\\]\\}", "name \\");

        Assert.assertEquals("this is a name \\ description", result);
    }

After this I realized that Java 1.5 introduced an overload of String.replace(char oldChar, char newChar):

String.replace(CharSequence target, CharSequence replacement)

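… this method works as a drop-in replacement for String.replaceAll whenever the thing you are searching for is a literal string rather than a pattern, because both the target and the replacement are taken literally! Sometimes reading the API before coding would save you time …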
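
As a small sketch of what that looks like for the placeholder used in the tests above (the class and method names are hypothetical): note that the target is written as plain text, without any regex escaping, because String.replace treats neither argument as a pattern.

```java
final class LiteralReplaceDemo {

    // Replaces every literal occurrence of the {[user]} placeholder.
    // No regex is involved, so $ and \ in userName need no escaping.
    static String fillUser(String template, String userName) {
        return template.replace("{[user]}", userName);
    }

    public static void main(String[] args) {
        String template = "this is a {[user]} description";
        System.out.println(fillUser(template, "name $"));   // this is a name $ description
        System.out.println(fillUser(template, "$name"));    // this is a $name description
        System.out.println(fillUser(template, "name \\"));  // this is a name \ description
    }
}
```

If the search side really does need to be a regular expression, safeReplaceAll is still the right tool.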

Experiments with Play! Framework

As I wrote in my last post, I recently rewrote one of my hobby projects from ASP.NET MVC to Play! Framework. Play! was very easy to get started with: the only requirements were to have a JDK installed, download the zip distribution of Play, unzip it to a directory and fire it up using the Play console. Play comes bundled with the Netty HTTP server, which can be used for both development and production. Play is an MVC framework, much in the style of ASP.NET MVC. I am not going to describe in detail how it works or its features, as there are already plenty of articles about that out there. Instead I want to touch on a few things I think a framework needs in order to be productive.

IDE Support
Generating a project for IntelliJ IDEA was very easy using the Play console. Just type “idea” in the console (or “eclipsify” if you prefer Eclipse). One thing to remember is that every time you add a new dependency to the project (in Build.scala) you need to rerun the “idea” command, otherwise IDEA will not find the packages and you will not be able to compile from within the IDE.

Support for testing
Play comes with built-in support for integration testing (writing tests that test the entire application stack, controller to database). The tests can easily be run from the Play console, but running them from within IDEA turned out to be tricky. Regular unit tests that do not use the in-memory database run just fine from the IDE, but I have not been able to configure IDEA to set up the fake application context needed to run the integration tests. It seems like this issue has been brought up by the Play community several times, but no one seems to have an answer.

Dependency injection
Play! doesn’t have any preferred way of doing DI, so it is up to you to use the DI container of your choice. It turns out that there is a Play plugin for Google Guice, so getting started with Guice was easy.

This is what my Guice bootstrap looks like:

public class Dependencies implements Module {

    public void configure(Binder binder) {
        binder.bind(new TypeLiteral>(){}).to(RabbitMqQueue.class);
        binder.bind(ISubscriberHandler.class).to(SubscriberHandler.class);
        binder.bind(IEmailHandler.class).to(EmailHandler.class);
        binder.bind(ISendAccountHandler.class).to(SendAccountHandler.class);
        binder.bind(ISettingsReader.class).to(SettingsReader.class);
        binder.bind(IGeoLocationHandler.class).to(GeoLocationHandler.class);
        binder.bind(ISmtpHandler.class).to(SmtpHandler.class);
    }
}

In Play, controllers are static. The reasoning is that controllers should have no state and therefore should be static. In some ways this makes sense, since controllers should not keep state, but it also limits us to injection into static fields instead of constructor injection. To use dependency injection in the controllers we annotate the static fields like this:

public class Emails extends Controller {

    @Inject
    public static EmailQueueHandler emailQueueHandler;

    @Inject
    public static ISmtpHandler smtpHandler;

    @Inject
    public static ISettingsReader settingsReader;

...
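
To illustrate the trade-off between the two injection styles, here is a container-free sketch in plain Java. Nothing in it comes from Play or Guice: all type names are hypothetical and the wiring that a DI container would do is done by hand.

```java
// Hypothetical dependency, standing in for something like an SMTP handler.
interface MailSender {
    String send(String recipient);
}

// Static field injection: the field is assigned from the outside some time
// after class loading, so it cannot be final and may still be null when used.
final class StaticEmails {
    static MailSender mailSender;

    static String sendWelcome(String recipient) {
        return mailSender.send(recipient);
    }
}

// Constructor injection: the dependency is required up front and can be final.
final class InstanceEmails {
    private final MailSender mailSender;

    InstanceEmails(MailSender mailSender) {
        this.mailSender = mailSender;
    }

    String sendWelcome(String recipient) {
        return mailSender.send(recipient);
    }
}

final class InjectionStylesDemo {
    public static void main(String[] args) {
        MailSender fake = recipient -> "sent to " + recipient;

        StaticEmails.mailSender = fake; // what the container does for an @Inject static field
        System.out.println(StaticEmails.sendWelcome("a@example.com"));

        System.out.println(new InstanceEmails(fake).sendWelcome("b@example.com"));
    }
}
```

With static fields every test shares the same global wiring, which is the main thing you give up compared to passing the dependency through a constructor.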

HTML Templating
Why do I mention the templating? It turns out that Play uses a new Scala-based templating engine that is heavily inspired by the ASP.NET Razor view engine, which is the best view engine that I have used. In general the Scala view engine is great. The only complaint I have is that the error messages can be very cryptic when something doesn’t compile.

Deployment
The preferred way of deploying Play is to use the built-in web server with a proxy such as Nginx in front of it to serve static files. This is how I have deployed my application. However, there is support for packaging a war file and deploying it to an application server such as Tomcat. The downside is that you lose some functionality. I think this is something that Play needs to improve to become more enterprise friendly.

Conclusion
Without any previous knowledge of the Play framework I was able to rewrite my application from ASP.NET to Play! in less than a week. I also took the opportunity to rewrite a lot of parts of the application that I had been wanting to redo for a long time, and switched HTML/CSS framework from Blueprint to Twitter Bootstrap. So overall I’m quite happy with Play.

New country, new city, new job

So a lot is new since the last time I posted. At the end of the summer I moved from Stockholm to San Francisco to start working for a company called Skout, building mobile social networks. With this new job came some big changes :) I am now back to coding Java after several years of mostly working with C#/.NET. I am back in a MacOS/Linux environment after having used Windows for most professional development in the last few years. And I am back at a small company (about 100 people in total) after having worked at a company with over 1000 people in IT alone. So a lot has changed, and I am very excited about it!

Let's roll back time a bit. In school I was taught C, C++, Java and Haskell, plus a bunch of languages no one has heard of. When I graduated I expected to get a job coding Java, but I ended up with a job doing mostly Python and C. After some time at that company I abruptly switched fields to building web services, and to .NET. I had done some C# by then, mostly using the Mono compiler. C# was still at version 1.x and lacked almost everything that makes it a great language today, but I found it to be a better version of Java. Also, at that time a lot of Java felt like it was mainly about enterprise beans, abstractions and more abstractions. On top of that came a shit ton of XML to configure every single aspect of everything, instead of doing it in code. I never liked this. Switching to C# felt like a fresh breeze. Sure, .NET has its own baggage as well, for example the core ASP.NET framework could have seen plenty of improvements and IIS was a mess at that time, but getting things done was much faster and easier compared to Java.

Over time things have changed. .NET is still a great environment to work in (maybe better than ever, I would say), but Java has also moved on. There are fewer abstractions, the application stacks are not necessarily as tall, and lightweight containers are preferred over big application servers. Overall it seems like getting stuff done has become easier. However, Java is still behind C# on many language features, such as closures, lambda expressions, expression trees and anything like LINQ. But it seems like there is hope! In Java 8 we will get virtual extension methods and lambdas, which will allow for a very different style of coding in Java. Today the closest you get is anonymous classes combined with final variables. It does the trick but is very verbose.
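
To show what that verbosity looks like, here is a small sketch that builds the same closure twice: first as an anonymous class capturing a final variable, then with the lambda syntax that Java 8 brings. The Greeting interface and all names are made up for the example.

```java
// Hypothetical single-method interface used by both variants.
interface Greeting {
    String text();
}

final class ClosureDemo {

    // Anonymous class: the captured local variable has to be final.
    static Greeting anonymousGreeting(final String name) {
        return new Greeting() {
            @Override
            public String text() {
                return "Hello, " + name;
            }
        };
    }

    // Java 8 lambda: same behaviour, one line.
    static Greeting lambdaGreeting(String name) {
        return () -> "Hello, " + name;
    }

    public static void main(String[] args) {
        System.out.println(anonymousGreeting("Java").text()); // Hello, Java
        System.out.println(lambdaGreeting("Java").text());    // Hello, Java
    }
}
```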

Before I got the job I decided to freshen up my Java skills by rewriting one of my personal hobby projects from C# to Java. I did some research on what was the most up-to-date, cool and hip Java framework, and it turned out to be Play! So I rewrote it using Play! Naturally my next article is going to be about my experiences and frustrations with Play!

Stay tuned.

Will the Metro user interface work on the Desktop?

Two days ago Windows 8 Release Preview was released to the public. While it is not the final release of Windows 8, it is probably quite close. You can download the ISO files from here:


http://windows.microsoft.com/en-US/windows-8/iso

If you, like me, had problems finding the key for it (it was hidden on the FAQ page), here is the key to install it: TK8TP-9JN6P-7X7WW-RFFTV-B7QPF.

I have tried the earlier builds of Windows 8 that have come out, and after playing with the Release Preview for a short while I must say that my feelings toward Windows 8 are still somewhat mixed. I think the new application model, WinRT, is a step in the right direction to get away from the old legacy Win32. The Market Place is a must to compete with Apple. And the Metro interface seems like a good fit for tablets, phones, etc. But Metro on the desktop? I am not sure about this.

A big difference between mobile devices and the desktop is screen real estate. On a tablet the screen area is limited, and even more so on a phone. In this case it makes sense to run all applications in full screen. But on the desktop you generally have more screen real estate available, and I prefer to use this to have more information visible. For example, I always want to be able to:

  • see a list of open applications
  • see the URL I am browsing
  • see all the tabs that are open
  • see the list of my friends who are online
  • and something as simple as always have a visible clock on the screen

It does not matter if this wastes some of my pixels, I have plenty of them!

The Metro applications I have seen for the desktop so far have all been good looking. But take for example the calendar application: full screen on a tablet, that is fine. Full screen on a 27″ monitor? No thanks. Also, most of the Metro applications I have seen so far seem to be simple, not so useful applications along the lines of an app for Twitter, an app for stocks, an app for something else that I can already do just fine in a web app. What I really want to see is a real, complex application converted to Metro, like Office or, even better, Visual Studio. It might work, but even if it does I think using a Metro application on the desktop will feel like it is limiting your ability to multitask because of its full screen nature.

I guess time will tell, but for now it feels like the “legacy” desktop will be the frequently used one in Windows 8.

Limiting MongoDB memory usage on Windows 2008 Server

By default MongoDB tries to memory-map as much of the database file as possible. Given a fairly big database, this will consume all your memory. If MongoDB is running on a dedicated server this is totally fine, but on a shared server it will cause unnecessary swapping for your other applications. At that point you want to limit the maximum amount of memory MongoDB can use. On Linux I have not found a way to do this (if you know a way, please let me know!), but on Windows Server it is possible to limit it using WSRM (Windows System Resource Manager).

WSRM allows greater control over the CPU and memory available to a process. It is an additional feature shipped with Windows Server that can be installed from the Server Manager under “Features Summary”:

Once installed we need to create a new resource allocation policy from the manager:

and give it some name:

Now we need to select processes to match with the policy:

We will select the process from the list of registered services:

Here we find MongoDB:

Click OK a few times to exit the process selection. On the memory tab we can limit the memory for this resource allocation policy:

The final step is to make the policy active. To do this, go to the top level page in the manager and click the “Selected Policy” link. In here you can set the active policy:

That’s it! Now MongoDB will not consume more than 500MB of memory.

ColourSearch – a simple image search engine

A few weeks ago I randomly got interested in how you match images with other images. Having a very limited background in computer graphics, I started reading some research papers on how to match images. One strategy that seemed to work pretty well for most people was a histogram-based comparison. Given two histograms it is possible to calculate the distance, or correlation, between them. The image with the lowest distance or highest correlation would be the best match. To try out my newly acquired knowledge I created a small application, ColourSearch, which, given a directory, calculates a histogram for each image and stores it in memory. In the GUI you can then pick a color to search for, and the application will find the best matching images using 4 different algorithms and present the results side by side.
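
As a rough illustration of the idea (not the ColourSearch code, which is written in C#), the Java sketch below builds a normalized colour histogram from packed 0xRRGGBB pixels and compares two histograms with two common measures. The bin layout, the choice of histogram intersection and Euclidean distance, and all names are assumptions made for the example.

```java
final class HistogramDemo {

    // Normalized colour histogram with `bins` buckets per channel.
    static double[] histogram(int[] rgbPixels, int bins) {
        double[] h = new double[bins * bins * bins];
        for (int p : rgbPixels) {
            int r = ((p >> 16) & 0xFF) * bins / 256;
            int g = ((p >> 8) & 0xFF) * bins / 256;
            int b = (p & 0xFF) * bins / 256;
            h[(r * bins + g) * bins + b] += 1.0 / rgbPixels.length;
        }
        return h;
    }

    // Histogram intersection: close to 1 for similar histograms, 0 for disjoint ones.
    static double intersection(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.min(a[i], b[i]);
        }
        return sum;
    }

    // Euclidean distance: 0 for identical histograms, larger means less similar.
    static double euclidean(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        double[] red = histogram(new int[] {0xFF0000, 0xFF0000}, 4);
        double[] blue = histogram(new int[] {0x0000FF, 0x0000FF}, 4);
        System.out.println(intersection(red, red));
        System.out.println(intersection(red, blue));
        System.out.println(euclidean(red, blue));
    }
}
```

Searching for a single colour then amounts to building a histogram with all of its weight in that colour's bin and ranking the stored histograms against it.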

A problem with the current implementation is that I have not found any way to index histograms, so when searching the input image needs to be compared to each image in the database (O(n) yay!). For larger datasets this would of course not be feasible, but at least the problem is easy to scale over several CPUs, so I can take advantage of all my cores.
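
The linear scan is at least trivial to spread over cores. A minimal Java sketch of that idea, assuming the histograms are kept in a map from file name to histogram and using Euclidean distance as a stand-in for whichever measure is used (all names are hypothetical):

```java
import java.util.Comparator;
import java.util.Map;

final class LinearScanDemo {

    // Euclidean distance between two histograms of equal length.
    static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Compares the query against every stored histogram (O(n)), using all
    // available cores, and returns the name of the closest image or null
    // if the index is empty.
    static String bestMatch(Map<String, double[]> index, double[] query) {
        return index.entrySet()
                .parallelStream()
                .min(Comparator.comparingDouble(
                        (Map.Entry<String, double[]> e) -> distance(e.getValue(), query)))
                .map(Map.Entry::getKey)
                .orElse(null);
    }
}
```

Each comparison is independent of the others, which is why the work divides so cleanly.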

You can find ColourSearch on GitHub (code in C#, works perfectly under Mono!):
https://github.com/moberg/coloursearch

If you have any ideas on how indexing can be done or on more efficient matching algorithms I would be very interested in hearing them!

Unlocking your Windows Phone for development

It is not totally obvious how to unlock your WP7 device for development. The first step is of course to sign up for the development program at http://create.msdn.com. But then? I had the exact same experience as this guy:

http://www.pitorque.de/MisterGoodcat/post/Unlocking-a-Windows-Phone-7-device-for-development.aspx

So together with the SDK a small activation program was installed, “Windows Phone Developer Registration”. Start it and log in to activate your device!

Force update of your Lumia 800 to WP7 1600.2487.8107.12070

So for some reason it takes Nokia a long time to push out the WP7 updates. 1600.2487.8107.12070 is a new official build of WP7 that was released a while back, but Zune still tells me that there is no update for my phone. The most important features of the update, to me, seem to be improved battery performance and improved bass in audio playback. I have been quite happy with the current battery performance of my phone (about 1.5 days on one charge), but at the same time I know other Lumia 800 owners who are very unhappy with theirs, which do not even last one work day. It seems like there are some good and some not so great devices out there. Even better battery life would of course be great! On the audio side, the sound has been quite flat, and improved bass would be very welcome.

This blog post has great step by step instructions on how to force the update:
http://nokiagadgets.com/2012/03/08/force-the-newest-lumia-800-update-1600-2487-8107-12070-to-your-phone/

Basically what you have to do is to download the update and install some tools for phone developers. Once you have downloaded everything you run the “WP7 Update Cab Sender.bat”:

The update process takes about 10 minutes. The first thing I tested after updating was of course to play some music, and the audio is indeed much improved. Time will tell if the battery performance got better!

Protothon #2 – WebRTC

Yesterday I went to a hackathon called Protothon. Protothon describes itself as “Space for the place between code and creativity” and the idea is to bring together programmers, creatives and entrepreneurs and, in a very limited time, build an application from scratch. The focus of this hackathon was WebRTC, which is a new standard for Real Time Communications using HTML5/JavaScript. At the moment it is so new that no browser actually supports it yet, but there are special builds of Chrome and Firefox in which you can enable it. So the task of this hackathon was basically to do something cool with WebRTC.

The team I was in consisted of me, Tomek Augustyn, Patrik Spathon and Pebbles Lim. Together we came up with the idea to build a multiplayer version of the classic game Pong, but using motion detection to control the game. We named the game Spong:

Each player runs the game in their web browser, which accesses the web cam using WebRTC. The motion detector analyzes the video stream and detects where the largest amount of movement is, which gives us a coordinate on the screen. The Y part of this coordinate is used to control the position of the player’s paddle. You can see this in the image above, where the cross marks where I just moved my hand. Each player’s movements are sent to the other player via a Node.js/Socket.IO relay server.
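
One simple way to get such a coordinate is frame differencing: compare the current frame with the previous one and average the positions of the pixels that changed. The Java sketch below does this for the Y coordinate only. It is an illustration of the general technique under that assumption, not the detector we used in Spong (which ran in the browser), and the names and threshold are made up.

```java
final class MotionDemo {

    // Returns the mean row (Y) of all pixels whose brightness changed by more
    // than `threshold` between two grayscale frames of the same size,
    // or -1 if no pixel changed that much.
    static int motionY(int[][] previous, int[][] current, int threshold) {
        long rowSum = 0;
        long changed = 0;
        for (int y = 0; y < current.length; y++) {
            for (int x = 0; x < current[y].length; x++) {
                if (Math.abs(current[y][x] - previous[y][x]) > threshold) {
                    rowSum += y;
                    changed++;
                }
            }
        }
        return changed == 0 ? -1 : (int) (rowSum / changed);
    }

    public static void main(String[] args) {
        int[][] previous = new int[4][2];
        int[][] current = new int[4][2];
        current[3][0] = 255; // something moved in the bottom row
        current[3][1] = 255;
        System.out.println(motionY(previous, current, 30)); // 3
    }
}
```

The returned row can then be scaled to the height of the playing field to position the paddle.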

To summarize Protothon #2 I must say that I was really impressed with all the applications presented by the different teams. It was one intense day, but a lot of fun! I am looking forward to attending again in the future.

Serving your ASP.NET MVC site on Nginx / fastcgi-mono-server4

In my previous post I showed you how to compile and install Mono and get your MVC site up and running using the development web server xsp4. The next step is to serve your site using a real web server. My choice is nginx.

First, we need a configuration file for the nginx site. If you haven’t already got nginx installed, install it (sudo apt-get install nginx).

/etc/nginx/sites-enabled/mvc:

server {
    listen   80;
    server_name mvctest.sourcecodebean.com;
    root  /home/peter/MonoMvcDeploy/;

    location / {
      root /home/peter/MonoMvcDeploy/;
      index index.html index.htm default.aspx Default.aspx;
      fastcgi_index /Home;
      fastcgi_pass 127.0.0.1:8000;
      include /etc/nginx/fastcgi_params;
    }
}
