Let's get rid of middle management.
No, not you, boss.
I'm talking about middle management in Java code. At one presentation in the New England Software Symposium in Boston last year, some Java-bashing guy (Stuart Halloway) described Java as the language that invented middle management. You have probably seen them: these classes often end with "Manager.java". I'm wondering whether we really need them.
Monday, January 25, 2010
Saturday, January 16, 2010
What's new in KBLauncher 1.1.1
I recently released a new version of KBLauncher, my application quick launcher for the BlackBerry. This blog entry serves as my release notes for this app. The rest of you who are here for non-BlackBerry topics can safely move on.
Sunday, January 10, 2010
On the breadcrumb metaphor
I've heard the term used in UI work: people refer to "breadcrumbs" when they mean a navigation aid that shows UI path used to arrive at a point in the program. This supposedly prevents a user from feeling lost. It could be a trail of screens or dialogs like Options > Advanced Options > Applications. I don't have an objection to that. I do think, on the other hand, that "breadcrumbs" is a pretty bad term to describe this UI construct.
The most famous example of using bread crumbs to mark a trail resulted in those crumbs being eaten by wild animals and the adventurers nearly becoming dinner themselves. All versions of of the Hansel and Gretel tale that I've read had the children successfully return home when Hansel used pebbles or small stones to mark a trail. This trick failed when Hansel used bread crumbs. With the crumb trail eaten by animals, they end up lost in the woods. They eventually wander to that candy house where they are trapped by a witch who tries to eat them. You know the rest.
So if you say you want to use breadcrumbs, are you really saying that you want to leave the user lost in the woods to be eaten by a cannibalistic witch?
The most famous example of using bread crumbs to mark a trail resulted in those crumbs being eaten by wild animals and the adventurers nearly becoming dinner themselves. All versions of of the Hansel and Gretel tale that I've read had the children successfully return home when Hansel used pebbles or small stones to mark a trail. This trick failed when Hansel used bread crumbs. With the crumb trail eaten by animals, they end up lost in the woods. They eventually wander to that candy house where they are trapped by a witch who tries to eat them. You know the rest.
So if you say you want to use breadcrumbs, are you really saying that you want to leave the user lost in the woods to be eaten by a cannibalistic witch?
Oops, did you accidentally ship your BlackBerry source code?
If you develop BlackBerry software using RIM's BlackBerry Java plugin for Eclipse and use Subversion for version control, you could end up shipping your source code with your executable if you are not careful. I had hoped this would be solved with the new version (1.1) of the plugin -- still in beta -- but based on my experience with one of the betas this does not appear to be the case.
The problem comes from the plugin's naive handling of non-Java files. Subversion keeps a copy of every version-controlled file in your sandbox. For example, if you have a Foo.java you will also have a copy as .svn/text-base/Foo.java.svn-base. This allows you to do diffs with the base version or revert your changes without even contacting the SVN server. But when RIM's plugin builds your *.cod executable, it assumes those *.svn-base files are resource files to be bundled with your COD file. The result is that if you view your resulting COD file in a text viewer, you will see copies of your source files -- those svn-base files -- embedded in the binary. Good grief.
Until RIM fixes this, one way to work around this is to specifically tell Eclipse to omit those SVN support files from the build. In your project properties, go to Java Build Path, select the Source tab and edit the source folders' "Excluded" entries. You can use wildcards in your exclusion patterns. And of course, make a test build and view it to make sure your source code is really excluded.
The problem comes from the plugin's naive handling of non-Java files. Subversion keeps a copy of every version-controlled file in your sandbox. For example, if you have a Foo.java you will also have a copy as .svn/text-base/Foo.java.svn-base. This allows you to do diffs with the base version or revert your changes without even contacting the SVN server. But when RIM's plugin builds your *.cod executable, it assumes those *.svn-base files are resource files to be bundled with your COD file. The result is that if you view your resulting COD file in a text viewer, you will see copies of your source files -- those svn-base files -- embedded in the binary. Good grief.
Until RIM fixes this, one way to work around this is to specifically tell Eclipse to omit those SVN support files from the build. In your project properties, go to Java Build Path, select the Source tab and edit the source folders' "Excluded" entries. You can use wildcards in your exclusion patterns. And of course, make a test build and view it to make sure your source code is really excluded.
Thursday, December 17, 2009
The cloudy future of relational databases
Cloud computing is the big new thing. The fate of the formerly ubiquitous relational database, on the other hand, is uncertain. There are many ways for a developer to deploy his application to the "cloud", whatever that might mean. On the other hand, the various cloud vendors do not necessarily have a database story to tell, or if they do that story isn't necessarily a good one. As a result, deploying a traditional database-backed application to the cloud may be difficult.
One story I have heard is to simply opt out of using a relational database. You would use some "NoSQL" persistence technology, such as SimpleDB on Amazon or Google's datastore. Such solutions have their advantages in terms of performance, scalability and ease of mapping to objects. If you just want to store dumb data objects, that may be all you need. But there are situations where the traditional RDBMS is more appropriate. If you have a data-centric view of your problem domain, where the data is ontologically independent of the application and can outlive the application, where you want the database to guarantee certain transactional and integrity rules, where you want to do lots of ad-hoc querying ... you might just want an RDBMS. My worry is that for many cloud platforms, you may not have that option.
For the platform as a service (PaaS) sort of cloud, where you may not even have filesystem access, setting up your own database server will not be an option. Even where the vendor provides a database service, your choice remains limited to that one service. For example, if you use Azure you can only use SQL Server. The infrastructure as a service (IaaS) sort of cloud will generally give you that freedom, but it may have limitations. Generally, IaaS vendors like Amazon offer:
At this point, I think the reader of this blog may want to slap me on the head and point out that yes, Amazon tells a perfectly good database story. In fact, it must have at least half a dozen stories to tell. You can use their MySQL service, put up an Oracle AMI or stand up your own database server: there are lots of choices both from Amazon and third parties. There are choices partly because Amazon's EC2 can use elastic block storage (EBS). This is a filesystem which is both big and fast, especially if you set up several EBS volumes in a RAID configuration. I noticed that my former colleague Chip recently blogged about his success deploying database-backed applications to Amazon's cloud.
OK, so you have lots of database choices if you go with the Amazon cloud. But only Amazon. I have been talking to another cloud IaaS vendor without a database story, being stuck with the fast/small vs slow/big dichotomy. With other vendors, it seems, you either have just one choice of RDBMS or you simply cannot get a production-ready RDBMS in the cloud. Did I miss an another option or two? You tell me.
One story I have heard is to simply opt out of using a relational database. You would use some "NoSQL" persistence technology, such as SimpleDB on Amazon or Google's datastore. Such solutions have their advantages in terms of performance, scalability and ease of mapping to objects. If you just want to store dumb data objects, that may be all you need. But there are situations where the traditional RDBMS is more appropriate. If you have a data-centric view of your problem domain, where the data is ontologically independent of the application and can outlive the application, where you want the database to guarantee certain transactional and integrity rules, where you want to do lots of ad-hoc querying ... you might just want an RDBMS. My worry is that for many cloud platforms, you may not have that option.
For the platform as a service (PaaS) sort of cloud, where you may not even have filesystem access, setting up your own database server will not be an option. Even where the vendor provides a database service, your choice remains limited to that one service. For example, if you use Azure you can only use SQL Server. The infrastructure as a service (IaaS) sort of cloud will generally give you that freedom, but it may have limitations. Generally, IaaS vendors like Amazon offer:
- a compute service: a virtual machine with local filesystem storage. Small, fast storage.
- a storage service: an API to read/write data on the cloud. Big, slow storage.
At this point, I think the reader of this blog may want to slap me on the head and point out that yes, Amazon tells a perfectly good database story. In fact, it must have at least half a dozen stories to tell. You can use their MySQL service, put up an Oracle AMI or stand up your own database server: there are lots of choices both from Amazon and third parties. There are choices partly because Amazon's EC2 can use elastic block storage (EBS). This is a filesystem which is both big and fast, especially if you set up several EBS volumes in a RAID configuration. I noticed that my former colleague Chip recently blogged about his success deploying database-backed applications to Amazon's cloud.
OK, so you have lots of database choices if you go with the Amazon cloud. But only Amazon. I have been talking to another cloud IaaS vendor without a database story, being stuck with the fast/small vs slow/big dichotomy. With other vendors, it seems, you either have just one choice of RDBMS or you simply cannot get a production-ready RDBMS in the cloud. Did I miss an another option or two? You tell me.
Wednesday, December 9, 2009
Hibernate formulas do not like subselects
Hibernate formulas and the subselect fetch strategy are currently incompatible. There is already a bug associated with it: HHH-1400. I am linking the bug at the beginning because blog posts like this have a limited shelf life. If you came across this post from Google, it is possible that this note is already obsoleted by a bug fix. I have little hope of this happening very soon right now, though, seeing that HHH-1400 was opened almost 4 years ago.
What's a formula? A formula is a Hibernate feature that lets you map a property to an SQL expression rather than a table column. Hibernate would automatically compute the value of the property for you each time you load an entity. A formula in annotation form looks like the following:
I think it is an elegant feature. Unfortunately, this feature seems to interfere with the subselect fetch strategy.
Why do subselects matter? The subselect fetch mode is an important tool to deal with the n+1 select problem when loading entities with one-to-many relations. Suppose we had an entity called Parent with a collection of Child entities. Suppose also we load n Parents and eventually want to iterate over each element in Parent.children. The typical Hibernate-generated SQL with a lazy-loaded collection would look something like:
In other words, we would have n+1 select statements, which is bad if n is large. We could do a join between Parent and Child to eagerly load the collection, but this only works with one collection. If Parent has multiple collections, a join-fetch of multiple collections would lead to a Cartesian product, retrieving a prohibitively large number of rows. This is where the subselect fetch strategy comes in. If mapped to use the SUBSELECT fetch strategy, we would only need two select statements, with the first one's where clause reused in the second:
Unfortunately, if you had annotated a property with a @Formula, Hibernate will mangle the SQL, leading to an SQL parse error:
If you really need to use subselect fetches, you'll have to find an alternative to @Formula for all its elegance. Possible solutions:
What's a formula? A formula is a Hibernate feature that lets you map a property to an SQL expression rather than a table column. Hibernate would automatically compute the value of the property for you each time you load an entity. A formula in annotation form looks like the following:
@Formula("(select count(*) from foo where ...)")
public Integer getFooCount() { return fooCount; }
I think it is an elegant feature. Unfortunately, this feature seems to interfere with the subselect fetch strategy.
Why do subselects matter? The subselect fetch mode is an important tool to deal with the n+1 select problem when loading entities with one-to-many relations. Suppose we had an entity called Parent with a collection of Child entities. Suppose also we load n Parents and eventually want to iterate over each element in Parent.children. The typical Hibernate-generated SQL with a lazy-loaded collection would look something like:
select * from Parent where ...;
select * from Child c where c.parent_id=?;
select * from Child c where c.parent_id=?;
select * from Child c where c.parent_id=?;
...
In other words, we would have n+1 select statements, which is bad if n is large. We could do a join between Parent and Child to eagerly load the collection, but this only works with one collection. If Parent has multiple collections, a join-fetch of multiple collections would lead to a Cartesian product, retrieving a prohibitively large number of rows. This is where the subselect fetch strategy comes in. If mapped to use the SUBSELECT fetch strategy, we would only need two select statements, with the first one's where clause reused in the second:
select * from Parent where ...;
select * from Child c where c.parent_id in
(select id from Parent where ...);
Unfortunately, if you had annotated a property with a @Formula, Hibernate will mangle the SQL, leading to an SQL parse error:
select * from Parent where ...;
select * from Child c where c.parent_id in
(select ...) as formula0_1_, ...;
If you really need to use subselect fetches, you'll have to find an alternative to @Formula for all its elegance. Possible solutions:
- Compute the properties with separate queries only as needed. This is tricky if you frequently need those computed properties.
- Sort of "denormalize" (I'm sure I'm misusing that term here) by persisting the computed values in the table. This works if the computed values rarely change, but is a problem otherwise.
- Have the DB manage the computed values using views. Eyal Lupu has a great blog entry on how to do this. Obviously, this only works if your database supports views, and requires DB-specific SQL.
Thursday, December 3, 2009
BlackBerry development: big things happening
The buzz in mobile development seems to be on other platforms. People outside of the BlackBerry community probably haven't been paying much attention, but there has been some major announcements in the area of BlackBerry development. These announcements, coinciding with the recently concluded BlackBerry Developer Conference, show that RIM is actively responding to its competition on several fronts. RIM's BlackBerry Developer's Blog has been covering these developments faithfully, so go there if you want to read from the official source. Here, I will go over what seem the most significant announcements to me as a developer.
Subscribe to:
Posts (Atom)