source code bean

08 Mar, 2010

Trying out CouchDB for the first time

Posted by: Peter In: NoSQL

There are several good libraries that abstract away access to CouchDB. However, in order to understand what goes on inside those libraries, I think it is important to first understand what happens at a lower level, so that is what I will show you.

Step 1: install CouchDB. CouchDB for OS X can be downloaded here; the bundle contains both the Erlang runtime and CouchDB, so no compiling or installing is needed. Just download the .dmg, mount it and run the application! If you are on Linux, it is quite likely that CouchDB is in the repository of your distribution. In Ubuntu, CouchDB is found in “Universe” and can easily be installed using apt. I haven’t tried CouchDB on Windows so I can’t give you any guidance there; if you try it out, please leave a comment.

Now that CouchDB is installed, let’s explore it using one of my favorite tools, good old cURL. CouchDB listens on port 5984 by default.

    $ curl -X GET http://localhost:5984/
    {"couchdb":"Welcome","version":"0.10.0"}

CouchDB is up and running! Let’s create a new database:

    $ curl -X PUT http://localhost:5984/testdb
    {"ok":true}

CouchDB’s responses are in JSON form as well. Let’s inspect the database we just created:

    $ curl -X GET http://localhost:5984/testdb
    {
      "db_name":"testdb",
      "doc_count":0,
      "doc_del_count":0,
      "update_seq":0,
      "purge_seq":0,
      "compact_running":false,
      "disk_size":79,
      "instance_start_time":"1266963717052501",
      "disk_format_version":4
    }

CouchDB returns some statistics about the database; we can see that it contains 0 documents. Let’s store an empty document in the database:

    $ curl -X POST http://localhost:5984/testdb/ -H "Content-Type: application/json" -d '{}'
    {
      "ok":true,
      "id":"ef40feff87010a6ef3a45a16df5af977",
      "rev":"1-967a00dff5e02add41819138abb3284d"
    }

CouchDB returns the unique id of the document and its revision (did I mention that every document is versioned? :)). The next step is to fetch all documents:

    $ curl -X GET http://localhost:5984/testdb/_all_docs
    {
      "total_rows":1,
      "offset":0,
      "rows":[{
        "id":"ef40feff87010a6ef3a45a16df5af977",
        "key":"ef40feff87010a6ef3a45a16df5af977",
        "value":{"rev":"1-967a00dff5e02add41819138abb3284d"}
      }]
    }

Yes! It is there! You should now also be able to see your database “testdb” and the document you created in the CouchDB GUI.
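For a feel of what a client library does with these responses, here is a minimal JavaScript sketch (no network calls) that pulls the id and revision of each row out of the _all_docs response and builds the request for updating one of the documents. The helper names listRevisions and buildUpdateRequest and the title field are made up for this example; the one CouchDB rule it relies on is that updating an existing document means sending its current revision as _rev.

```javascript
// The _all_docs response from the walkthrough, as the raw text a client would receive.
const allDocsResponse = JSON.stringify({
  total_rows: 1,
  offset: 0,
  rows: [{
    id: "ef40feff87010a6ef3a45a16df5af977",
    key: "ef40feff87010a6ef3a45a16df5af977",
    value: { rev: "1-967a00dff5e02add41819138abb3284d" }
  }]
});

// Hypothetical helper: reduce every row to { id, rev }.
function listRevisions(responseText) {
  const body = JSON.parse(responseText);
  return body.rows.map(function (row) {
    return { id: row.id, rev: row.value.rev };
  });
}

// Hypothetical helper: describe the PUT that would update a document.
// The current revision travels in the body as _rev.
function buildUpdateRequest(baseUrl, db, doc, fields) {
  const body = Object.assign({}, fields, { _rev: doc.rev });
  return {
    method: "PUT",
    url: baseUrl + "/" + db + "/" + encodeURIComponent(doc.id),
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body)
  };
}

const storedDocs = listRevisions(allDocsResponse);
const updateRequest = buildUpdateRequest(
  "http://localhost:5984", "testdb", storedDocs[0], { title: "Hello" }
);
console.log(updateRequest.method, updateRequest.url);
console.log(updateRequest.body);
```

Sending the request is left out on purpose; the same thing can be done from the command line with curl -X PUT and the JSON body passed via -d.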

01 Mar, 2010

CouchDB a NoSQL database

Posted by: Peter In: NoSQL

Traditionally, relational databases have been the primary way of storing, sorting and searching data, and for most purposes they are very good at it. However, in the last few years, with the growth of cloud computing and sites such as Facebook pushing relational databases to their limits, people have started to look for alternatives. The problem with relational databases is that it has been hard to scale them horizontally (spread them over several servers with a linear, or close to linear, performance increase). A few databases in a cluster is no problem, but when it comes to several terabytes of data that should be searched in real time, they are simply not enough. CouchDB and the other NoSQL databases aim to provide a truly horizontally scalable database.

NoSQL is an umbrella term for a wide variety of data stores, which all have in common that they do not store data in a relational way. Some examples of NoSQL databases are CouchDB, MongoDB, Amazon SimpleDB and Google BigTable. These are some of the properties they have in common:

1. No schema
The data is stored in one big hashtable-like data structure. No schema is needed.

2. No more joins
Joins are slow in general, and when they are spread out over several servers it gets even worse. In CouchDB there is no join; instead, data should be duplicated. This might sound odd if you, like me, were taught back in college that normalization is of the highest importance and that you should really, really avoid duplicating data in your database. This is still true for most relational databases, but keep in mind that when relational databases were invented, disk space was expensive and normalization was a great way to save a few bytes. Storage is anything but expensive today, which is why NoSQL emphasizes that data should be duplicated instead.

3. Eventual consistency
When you update the database there is no longer a guarantee that all subsequent queries will get the updated value immediately; it might take some time depending on the system load. For some systems, such as banking systems, this kind of behavior would be a big no-no. But for most large web sites this is no problem, since the data is going to be cached in one way or another anyway.
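To make the “no more joins” point concrete, here is a small JavaScript sketch of a denormalized document. The blog-post shape and all field names are invented for illustration; the only point is that the author’s name is copied into the post, so rendering it needs a single read instead of a lookup in a second table.

```javascript
// Normalized (relational) style: the post only references the author,
// so rendering it needs a second lookup, which is what a join does.
const authorTable = { 7: { name: "Peter" } };
const normalizedPost = { title: "CouchDB a NoSQL database", authorId: 7 };

// Denormalized (document) style: the author's name is duplicated into the post.
const denormalizedPost = {
  title: "CouchDB a NoSQL database",
  author: { id: 7, name: "Peter" }
};

function bylineWithJoin(post, authors) {
  return post.title + " by " + authors[post.authorId].name;
}

function bylineWithoutJoin(post) {
  return post.title + " by " + post.author.name;
}

console.log(bylineWithJoin(normalizedPost, authorTable));
console.log(bylineWithoutJoin(denormalizedPost));
```

The price of the duplication is on the write side: if the author changes name, every document that embeds the old name has to be updated.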


CouchDB

CouchDB is a document database, accessible via a RESTful JSON API. Everything stored in the database is a “document” and is stored in a flat address space. There are no schemas; the documents are stored and retrieved as JSON objects.

To add structure back to this semi-structured data, CouchDB has something called “views”. A view is as close as we get to a SQL query. Views are expressed in JavaScript and consist of a map function and an optional reduce function. They are built dynamically and do not affect the underlying documents.
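As a rough illustration of what a view’s map function looks like, here is a sketch that can be run outside CouchDB. The map function has the shape CouchDB expects (it is called once per document and calls emit for every row it wants in the view); the emit and runView functions here are simplified stand-ins written for this example, and the document fields (type, author, title) are assumptions.

```javascript
// Stand-in for the emit() function that CouchDB provides to map functions.
let emittedRows = [];
function emit(key, value) {
  emittedRows.push({ key: key, value: value });
}

// The map function: index every page document by its author.
function mapPagesByAuthor(doc) {
  if (doc.type === "page" && doc.author) {
    emit(doc.author, doc.title);
  }
}

// Hypothetical, simplified view engine: run the map function over all
// documents and return the rows sorted by key.
function runView(mapFn, documents) {
  emittedRows = [];
  documents.forEach(function (doc) { mapFn(doc); });
  return emittedRows.slice().sort(function (a, b) {
    return a.key < b.key ? -1 : (a.key > b.key ? 1 : 0);
  });
}

const sampleDocuments = [
  { _id: "a", type: "page", author: "Peter", title: "Home" },
  { _id: "b", type: "comment", author: "Martin" },
  { _id: "c", type: "page", author: "Anna", title: "About" }
];

const viewRows = runView(mapPagesByAuthor, sampleDocuments);
console.log(JSON.stringify(viewRows));
```

The comment document is skipped and the documents themselves are left untouched, which matches the idea that a view is a derived index rather than a change to the data.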

CouchDB is written in Erlang and runs on all systems that the Erlang runtime supports (Linux, Windows, OS X and other Unix systems). I have tested CouchDB on Linux and OS X. This is a screenshot of the CouchDBX GUI running on OS X:

The JSON representation of the same document:

In my next post I will show you some hands-on action with CouchDB.


14 Feb, 2010

Articles coming up on CouchDB and ExtJS

Posted by: Peter In: ExtJS| PHP

It has been a bit too long since my last post, so I wanted to give you a short update on what I have been up to lately. Last year I built a web service using Amazon SimpleDB, which got me interested in document-based databases. Since then I have been reading about and following the progress of different open source document-based databases, such as CouchDB and MongoDB.

During that time, I have been thinking about what other kinds of projects a document-based database would be suitable for. Because of my lack of imagination (or just because I was too eager to get started :)), I decided to build a simple web CMS using PHP and CouchDB. Pages in a CMS are perfect for storing as documents in a document database, so it seemed like a good choice.

Building the CMS backend actually turned out to be what took the shortest time. I wanted to create a nice-looking JavaScript-based GUI for the editors and administrators, and decided to build it using the JavaScript library ExtJS. I had never worked with ExtJS before (and I am more of a backend guy than a JavaScript frontend guy), so it took me some time to get my mind around it, but the result is sweet: ExtJS is really powerful.

The CMS is far from complete, but so far it has given me quite a few ideas for new blog posts, so stay tuned for the upcoming posts on CouchDB and ExtJS!

In this post I am going to show how easy it is to create a JSON-RPC web service using the built in support in Zend Framework.

First we need to create the PHP file that will handle the incoming RPC calls. It is not advised to put this inside the MVC structure of a Zend web application, since that will lead to unnecessary complexity and overhead. The Zend people recommend that we create the JSON-RPC endpoint under /public/api/vX/, so let’s create the file /public/api/v1/jsonrpc.php (if you haven’t set up your Zend MVC structure, read my blog post Getting started with the Zend Framework to get started).

We will have to do the regular bootstrapping to get our application up and running:

    <?php
    // Define path to application directory
    defined('APPLICATION_PATH')
        || define('APPLICATION_PATH', realpath(dirname(__FILE__) . '/../../../application'));

    // Define application environment
    defined('APPLICATION_ENV')
        || define('APPLICATION_ENV', (getenv('APPLICATION_ENV') ? getenv('APPLICATION_ENV') : 'production'));

    // Ensure library/ is on include_path
    set_include_path(implode(PATH_SEPARATOR, array(
        realpath('../../../library'),
    )));

    /** Zend_Application */
    require_once 'Zend/Application.php';

    // Create application, bootstrap, and run
    $application = new Zend_Application(
        APPLICATION_ENV,
        APPLICATION_PATH . '/configs/application.ini'
    );

    $application->bootstrap();

The next step is to create the class that will be exposed through the service. I will create a very simple class that just adds two ints. It is very important to describe the input parameters using the @param tag in the docblock comment. This information is used by Zend_Json_Server when creating the SMD (Service Mapping Description).

    /**
     * Simple - sample class to expose via JSON-RPC
     */
    class Simple
    {
        /**
         * Return sum of two variables
         *
         * @param  int $x
         * @param  int $y
         * @return array
         */
        public function add($x, $y)
        {
            return array('result' => $x + $y);
        }
    }

The last step to get the JSON-RPC server running:

    // Instantiate server, etc.
    $server = new Zend_Json_Server();
    $server->setClass('Simple');

    if ('GET' == $_SERVER['REQUEST_METHOD']) {
        // Indicate the URL endpoint, and the JSON-RPC version used:
        $server->setTarget('/api/v1/jsonrpc.php')
               ->setEnvelope(Zend_Json_Server_Smd::ENV_JSONRPC_2);

        // Grab the SMD
        $smd = $server->getServiceMap();

        // Return the SMD to the client
        header('Content-Type: application/json');
        echo $smd;
        return;
    }

    $server->handle();

Now your JSON-RPC server should be up and running. Browsing to http://{your web server}/api/v1/jsonrpc.php should result in the following SMD:

    {
      "transport":"POST",
      "envelope":"JSON-RPC-2.0",
      "contentType":"application\/json",
      "SMDVersion":"2.0",
      "target":"\/api\/v1\/jsonrpc.php",
      "services":{
        "add":{
          "envelope":"JSON-RPC-2.0",
          "transport":"POST",
          "parameters":[
            {"type":"integer","name":"x","optional":false},
            {"type":"integer","name":"y","optional":false}
          ],
          "returns":"array"
        }
      },
      "methods":{
        "add":{
          "envelope":"JSON-RPC-2.0",
          "transport":"POST",
          "parameters":[
            {"type":"integer","name":"x","optional":false},
            {"type":"integer","name":"y","optional":false}
          ],
          "returns":"array"
        }
      }
    }
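Under the hood, a call to the add service is just a small JSON document POSTed to the target URL. The sketch below builds such a request and unwraps a response following the JSON-RPC 2.0 envelope; the function names are made up for this example, and the sample response is my assumption of what the server sends back for add(1, 1), based on the Simple class returning array('result' => $x + $y).

```javascript
let nextRequestId = 1;

// Build a JSON-RPC 2.0 request body with positional parameters.
function buildJsonRpcRequest(method, params) {
  return JSON.stringify({
    jsonrpc: "2.0",
    method: method,
    params: params,
    id: nextRequestId++
  });
}

// Unwrap a JSON-RPC 2.0 response: return the result, or throw if the
// server reported an error object instead.
function parseJsonRpcResponse(responseText) {
  const response = JSON.parse(responseText);
  if (response.error) {
    throw new Error("JSON-RPC error " + response.error.code + ": " + response.error.message);
  }
  return response.result;
}

const addRequest = buildJsonRpcRequest("add", [1, 1]);
console.log(addRequest);

// Assumed response body for add(1, 1); not captured from a real server.
const sampleAddResponse = '{"jsonrpc":"2.0","result":{"result":2},"id":1}';
console.log(parseJsonRpcResponse(sampleAddResponse).result);
```

The nested result key is a consequence of the sample class wrapping the sum in an array; returning the plain number from add() would remove one level.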

jQuery does not support calling JSON-RPC services out of the box, but fortunately there are plenty of plugins for jQuery that fix this. One is the JSON-RPC client found here. Download the client and put the JavaScript files into your /js folder. Then create a new file test.html and add the following HTML:

    <html>
    <head>
        <script type="text/javascript" src="js/jquery-1.3.min.js"></script>
        <script type="text/javascript" src="js/json2.js"></script>
        <script type="text/javascript" src="js/jquery.zend.jsonrpc.js"></script>
        <script type="text/javascript">
            $(document).ready(function () {
                // Create the JSON-RPC client for our endpoint
                var test = jQuery.Zend.jsonrpc({url: '/api/v1/jsonrpc.php'});
                // Call the remote add() method and show the sum
                alert(test.add(1, 1)['result']);
            });
        </script>
    </head>
    <body>
    </body>
    </html>

We are all done! Browsing to test.html should result in an alert box containing the result.

Congratulations! You have created a JSON-RPC service!




Today I upgraded a project we started working on last year from Zend Framework 1.7 to Zend Framework 1.9. I expected to run into several API incompatibilities, but the only problem I got was the autoloader.

In Zend 1.7, and earlier versions, the autoloader was registered like this:

    require_once "Zend/Loader.php";
    Zend_Loader::registerAutoload();

In Zend Framework 1.9 this has changed slightly; you now have to register the namespaces you want to autoload:

    require_once 'Zend/Loader/Autoloader.php';
    $loader = Zend_Loader_Autoloader::getInstance();
    $loader->registerNamespace('Dqc_');

This was the only change we needed to make to upgrade from 1.7 to 1.9. Quite impressive!
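To see why registering the namespace matters, it helps to picture how a prefixed class name turns into a file to include. The JavaScript sketch below mimics that lookup as I understand it: only class names that start with a registered prefix are handled, and underscores become directory separators. The function resolveClassFile and the class name Dqc_Model_User are made up for illustration, and this is not the framework’s actual code.

```javascript
// Prefixes the autoloader has been told about (illustrative list).
const registeredNamespaces = ["Zend_", "Dqc_"];

// Hypothetical helper: map a class name to a relative file path, or
// return null when no registered namespace matches.
function resolveClassFile(className, namespaces) {
  const handled = namespaces.some(function (prefix) {
    return className.indexOf(prefix) === 0;
  });
  if (!handled) {
    return null;
  }
  return className.split("_").join("/") + ".php";
}

console.log(resolveClassFile("Dqc_Model_User", registeredNamespaces));
console.log(resolveClassFile("Other_Thing", registeredNamespaces));
```

In this model a class outside the registered prefixes is simply ignored, which is what an upgraded project runs into until its own namespace has been registered.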

13 Dec, 2009

Getting started with the Zend Framework

Posted by: Peter In: PHP| Web| Zend Framework

It used to be a bit tricky to get started with the Zend Framework. The Zend Framework is very flexible and allows you to set it up in almost any way that fits your needs. This means, for example, that the directory structure and location of files is up to you; however, there is a recommended layout. When I first started using Zend I had to figure this out by looking at examples and reading the (at that time) rather poor documentation available. Zend Framework 1.9 ships with a tool, zf.sh, that simplifies creating a new site a lot. In this blog post I will guide you through the process of setting up a Zend development environment in OS X. The only part that is OS X specific is MAMP. Zend Framework runs just fine under Windows and Linux as well.

Step 1 – Getting a *AMP setup (LAMP, MAMP, WAMP)
The AMP (Apache, MySQL, PHP) stack is available for almost all modern operating systems. I am writing this on my MacBook, so in this tutorial I will use MAMP. When I develop PHP in Windows I usually use WAMPServer, and in Linux you can install the LAMP stack using the packaging system in most distributions.

Setting up MAMP is pretty straightforward: download the MAMP .dmg file and drag the MAMP folder to your Applications folder.

Start the application and press the “Open start page” button to make sure everything works. On the start page you will find information about your site, phpInfo, phpMyAdmin and SQLiteManager. We will get back to configuring the HTTP root directory later.

Step 2 – Download and install Zend Framework
Go to the framework download page and download the minimal distribution. In this tutorial I am using 1.9.6, but the instructions will probably apply to all 1.9 versions.

Extract the downloaded file and move the folder to a shared location, for example /usr/local/. The next step is to create an alias for the zf.sh tool. Edit your ~/.bash_profile and add the following line (change the path to where you moved the extracted files):

alias zf=/usr/local/ZendFramework1.9/bin/zf.sh

This will allow you to execute the zf tool without using the full path (open a new terminal or run source ~/.bash_profile first). Try it out by executing zf show version; it should return the version number of the release you downloaded.


$ zf show version
Zend Framework Version: 1.9.6

Step 3 – Create your project
Go to the folder where you want to create your new project, in my case ~/Development/. Run zf create project zf-tutorial. This will set up the default directory structure and create the necessary files.

Zend Framework directory structure

The application/ folder is where the source code for your website lives. It contains separate folders for models, views and controllers. The public/ folder is the folder that is going to be your document root.

Now you need to copy the Zend library (in my case /usr/local/ZendFramework1.9/library/Zend) or create a symlink for it so your site can find the Zend files. I prefer using a symlink:


$ cd ~/Development/zf-tutorial/library
$ ln -s /usr/local/ZendFramework1.9/library/Zend/ Zend

Step 4 – Run the project
The last step before we have a running site is to do what we skipped in Step 1: configuring the Apache document root.

Open up MAMP and click the Preferences button; under the Apache tab you will find the document root. Click Select and navigate to the “public” folder in your Zend project. Apply the settings and restart the server.

Now open your browser and direct it to http://localhost and you should see the default welcome screen:

Now open your application/controllers/IndexController.php and start hacking your code!




About a month ago I finished reading the book ASP.NET MVC 1.0 – Test Driven Development by Emad Ibrahim. The book weighs in at only about 300 pages, making it easily something you can read in a couple of nights. The book is written in tutorial fashion and is probably best read with a laptop running Visual Studio on your lap, so you can follow the examples in the book. The paradigm of the book is Problem, Design, Solution. Emad guides the reader through the full process of creating a web application in a test-driven manner.

Besides presenting MVC and TDD, Emad presents several very useful libraries for Test Driven Development:

  • Moq – A mocking library for .NET that uses the power of LINQ to create mocks.
  • Ninject – Dependency injection library for .NET
  • MbUnit – An alternative to VSTest

I think the book gives a good kickstart in ASP.NET MVC and TDD. My favorite part of the book was actually not reading about MVC itself; it was reading about testing it. Emad shows the strengths of MVC by showing how it makes testing easier.

07 Dec, 2009

ZFS for home NAS?

Posted by: Peter In: Storage| UNIX

I have been doing some research on NASes for home use. I basically want a NAS that offers redundancy (some form of RAID) and the ability to add disks as I go. It should also support at least SMB as file sharing protocol (but preferably others as well), and of course not be too expensive. All home NASes I have found so far have been lacking in at least one of the above criteria.

I have read about people using ZFS on FreeBSD or OpenSolaris for their storage servers. ZFS is an open source file system developed by Sun Microsystems which has some features that make it very compelling for a file storage server. Unfortunately ZFS is not available on Linux at the time of writing (I think licensing issues are preventing a port); if it were, I would definitely go for it.

To give it a try, I downloaded OpenSolaris 2009.06 and installed it as a virtual machine in VMware Fusion. Instead of having to add several virtual disks to the VM, I decided to test the features of ZFS using regular files (ZFS can use files as disk devices). An easy way to create some “disks” is to use the mkfile command; it will create a file that can be used as a disk device:

# mkfile 100m /tmp/disk1
# mkfile 100m /tmp/disk2
# mkfile 100m /tmp/disk3
# mkfile 100m /tmp/disk4

ZFS has three levels: devices, pools and filesystems. A ZFS pool is built from one or more devices, and it can hold several ZFS filesystems. Filesystems within a pool share its resources and are not restricted to a fixed size. You can add devices to a pool (for example to increase your storage space) while the pool is online. Devices in a pool can be configured in mirrored mode or in RAIDZ mode to offer redundancy. ZFS also supports filesystem-level snapshots and cloning of existing filesystems. The two main ZFS commands are:

zpool - Manages the pools and the devices within them
zfs - Manages ZFS filesystems

OK, so let’s create a pool from the disks we created earlier:

# zpool create storage /tmp/disk1 /tmp/disk2

# zpool list
NAME      SIZE   USED   AVAIL  CAP  HEALTH  ALTROOT
rpool     7.94G  4.28G  3.66G  53%  ONLINE  -
storage   191M   74.5K  191M   0%   ONLINE  -

As you can see, we combined two disks into one pool. The filesystem automatically gets mounted on /storage (this is the default mount point; it can be changed). No volume management, configuration or formatting is needed. Let’s destroy this pool to create a more interesting one.

# zpool destroy storage

# zpool list
NAME      SIZE   USED   AVAIL  CAP  HEALTH  ALTROOT
rpool     7.94G  4.36G  3.58G  54%  ONLINE  -

As you can see, it is gone. Let’s create a new pool using RAIDZ (a form of RAID, similar to RAID 5):

# zpool create storage raidz /tmp/disk1 /tmp/disk2 /tmp/disk3

# zpool list
NAME      SIZE   USED   AVAIL  CAP  HEALTH  ALTROOT
rpool     7.94G  4.38G  3.56G  55%  ONLINE  -
storage   286M   140K   286M   0%   ONLINE  -

One thing that’s a little different in a ZFS raidz pool versus other RAID-5 filesystems is that the reported available disk space doesn’t subtract the space required by parity. Of course parity will take up space, so this is something to keep in mind when monitoring the disks. We can monitor the status of the pool by using the zpool status command:

# zpool status storage
  pool: storage
 state: ONLINE
 scrub: none requested
config:

        NAME            STATE     READ WRITE CKSUM
        storage         ONLINE       0     0     0
          raidz1        ONLINE       0     0     0
            /tmp/disk1  ONLINE       0     0     0
            /tmp/disk2  ONLINE       0     0     0
            /tmp/disk3  ONLINE       0     0     0

errors: No known data errors
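Since zpool list reports the raw size of the raidz pool, it can be useful to estimate how much of it is left once parity is subtracted. The JavaScript sketch below does that arithmetic for the pool above; it is a back-of-the-envelope estimate that assumes one disk’s worth of space goes to parity for raidz1 and ignores filesystem metadata, and the function name is made up for this example.

```javascript
// Estimate the usable space of a single raidz group from the raw size
// that zpool list reports. parityDisks is 1 for raidz1.
function estimateRaidzUsable(rawSize, diskCount, parityDisks) {
  if (diskCount <= parityDisks) {
    throw new Error("a raidz group needs more disks than parity disks");
  }
  return rawSize * (diskCount - parityDisks) / diskCount;
}

// The "storage" pool: 286M raw, three file-backed disks, raidz1.
const rawMegabytes = 286;
const usableMegabytes = estimateRaidzUsable(rawMegabytes, 3, 1);
console.log(usableMegabytes.toFixed(1) + "M usable of " + rawMegabytes + "M raw");
```

For the three 100 MB test disks this comes out at roughly two thirds of the reported size, so about a third of what zpool list shows as available cannot actually be filled with data.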

After some playing around with ZFS I certainly think it would be a great choice for a storage server. It is way easier to use than the software RAID/LVM solutions I have tried on Linux. The biggest drawback would be OpenSolaris itself; I just find the GNU application userland easier to use compared to the Solaris one. Maybe I should give Nexenta (OpenSolaris kernel, GNU application userland) a chance?

Read more:
ZFS on Wikipedia
RAID-Z

So I restored a backup from MSSQL 2005 to an MSSQL 2008 database, which kept the database in “SQL 2005 compatibility mode”. I thought this would mean that the database could be backed up on the MSSQL 2008 server and restored on the MSSQL 2005 server; however, it turns out that this is not the case. Backing it up from 2008 and trying to restore it on 2005 resulted in the following slightly cryptic error:

An exception occurred while executing a Transact-SQL statement or batch. (Microsoft.SqlServer.Express.ConnectionInfo)
The media family on device ‘C:\Temp\db.bak’ is incorrectly formed. SQL Server cannot process this media family.
RESTORE HEADERONLY is terminating abnormally. (Microsoft SQL Server, Error: 3241)

After some googling it turns out that MSSQL 2005 cannot read backups from 2008. The solution I found to the problem was to export the database as a SQL script and run the script on the SQL 2005 server. Maybe not the best solution, but it saved my day.

This is how you create the script file:
In Object Explorer in SQL 2008 Management Studio, right-click the database and select Tasks->Generate Scripts. In the options dialog enable everything, including Script Data. Make sure you select “Script for SQL 2005” (otherwise you will export a SQL 2008 script file). Then run the script on your SQL 2005 server and hopefully you are done!

While reading the EPiServer 5 SDK documentation, I found this:

Rename a Folder

There is no Rename method on the EPiServer.Web.Hosting.UnifiedDirectory class. To rename a folder you need to call the MoveTo method as follows:

    protected void RenameFolder(string path, string oldName, string name)
    {
        if (IsFolder(path))
        {
            UnifiedDirectory directory =
                System.Web.Hosting.HostingEnvironment.VirtualPathProvider.GetDirectory(path) as UnifiedDirectory;

            int e = -1;
            while (path.IndexOf(oldName, ++e) > -1) ;

            StringBuilder sb = new StringBuilder();
            sb.Append(path.Substring(0, e - 1));
            sb.Append(name);
            sb.Append("/");

            directory.MoveTo(sb.ToString());
        }
    }

What a convenient way of renaming a folder :) Good thing that you don’t have to do it too often.
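It takes a moment to see what the empty while loop in the SDK sample is doing: it advances e until the old name no longer occurs at or after that position, so e - 1 ends up being the index of the last occurrence. The JavaScript sketch below ports just that string handling (not the EPiServer call) and compares it with a version based on lastIndexOf; both function names are made up for this example.

```javascript
// Port of the SDK loop: scan forward until oldName is no longer found,
// then cut the path at the last occurrence and append the new name.
// Note: if oldName never occurs, e - 1 is -1; JavaScript's substring()
// treats that as 0, whereas the C# Substring call would throw.
function renamedPathWithLoop(path, oldName, name) {
  let e = -1;
  while (path.indexOf(oldName, ++e) > -1) { /* keep scanning */ }
  return path.substring(0, e - 1) + name + "/";
}

// Same idea using lastIndexOf, with an explicit "not found" check.
function renamedPath(path, oldName, name) {
  const start = path.lastIndexOf(oldName);
  if (start === -1) {
    throw new Error("'" + oldName + "' does not occur in '" + path + "'");
  }
  return path.substring(0, start) + name + "/";
}

console.log(renamedPathWithLoop("/Documents/old/", "old", "new"));
console.log(renamedPath("/Documents/old/", "old", "new"));
```

Both versions replace only the last occurrence of the old name, which is what you want when a parent folder happens to share the name of the folder being renamed.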


    About

    Welcome to source code bean! You will find information on tips and tricks on programming languages, server side stuff, and anything that causes troubles to web development.