Archive for the ‘XML’ Category

Schema First and Anemic Domain Models

Thursday, November 8th, 2007

I was recently introduced to the notion of Anemic Objects while discussing schema-first versus POJO-first XML definition. I come from the schema-first camp and considered the resulting Java objects to be just a normal side effect of this approach. The alternative has scary ramifications from a system design/interoperability perspective.

Before I get into my thoughts on either approach, let’s cover what an anemic object is. According to Martin Fowler, they’re an anti-pattern.

The fundamental horror of this anti-pattern is that it’s so contrary to the basic idea of object-oriented design; which is to combine data and process together.

The service, or business, objects are responsible for extracting the required bits and performing everything from validation to the actual business logic.
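To make the distinction concrete, here is a minimal sketch in Java. The class names (AnemicOrder, OrderService, RichOrder) and the shipping rule are made up for illustration; they come from neither Fowler's article nor any generated binding.

```java
import java.math.BigDecimal;

class AnemicVsRich {
    // Anemic: a generated-style data holder with getters and setters only.
    static class AnemicOrder {
        private BigDecimal total = BigDecimal.ZERO;
        private boolean shipped;

        BigDecimal getTotal() { return total; }
        void setTotal(BigDecimal total) { this.total = total; }
        boolean isShipped() { return shipped; }
        void setShipped(boolean shipped) { this.shipped = shipped; }
    }

    // The service object extracts the bits it needs and owns the rule.
    static class OrderService {
        void ship(AnemicOrder order) {
            if (order.getTotal().signum() <= 0) {
                throw new IllegalStateException("cannot ship an empty order");
            }
            order.setShipped(true);
        }
    }

    // Rich: the same (hypothetical) rule lives next to the data it protects.
    static class RichOrder {
        private final BigDecimal total;
        private boolean shipped;

        RichOrder(BigDecimal total) { this.total = total; }

        void ship() {
            if (total.signum() <= 0) {
                throw new IllegalStateException("cannot ship an empty order");
            }
            shipped = true;
        }

        boolean isShipped() { return shipped; }
    }

    public static void main(String[] args) {
        AnemicOrder anemic = new AnemicOrder();
        anemic.setTotal(new BigDecimal("19.99"));
        new OrderService().ship(anemic);

        RichOrder rich = new RichOrder(new BigDecimal("19.99"));
        rich.ship();

        System.out.println(anemic.isShipped() + " " + rich.isShipped());
    }
}
```

Both versions end up in the same state; the difference is where the rule lives and who is allowed to bypass it.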

So is schema-first really a cause for this problem? Casually speaking, yes, but I believe it’s mostly due to the ease of code gen and the focus on ease of development. For most developers, learning XML Schema is not a simple undertaking, just like learning any language or standard. If I know Java, and a tool will create the required artifacts for me, then I can ignore what’s generated as long as it keeps in sync with my domain model.

Keeping in sync with a domain model can be challenging when a developer doesn’t control the underlying schema. Think about the experience of mapping a robust database schema to an object model. The object/relational impedance mismatch has resulted in numerous frameworks (Hibernate, JDO, JPA, EJB, TopLink and many more) to handle these differences. Even then, all of them allow a way to just use plain SQL and do the mapping yourself. The same mismatch applies to XML Schema. There are various frameworks, JAXB and XMLBeans for example, that help out in this area as well.

It’s easier to allow the Domain Model to drive these models or to just consider any schema-first output as a second class object or bit bucket. I’ll admit to subscribing to the latter. As an enterprise architect, it’s easier to think in terms of data schema and service endpoints that act on it. WS-* and REST both encourage this type of behavior. For the developer on the ground, this can be limiting as it does not help promote OO design.

Indeed often these models come with design rules that say that you are not to put any domain logic in the domain objects.

But it doesn’t have to be that way.

I’ll pick on the automated build process for a moment. I think what happens is that the auto-gen of either schema or POJO gets baked into the build process and developers forget about it. The output is a secondary thought that is constantly sync’d with their domain. The problem is that the external parties that rely on those artifacts are also treated as second class. A developer changes an external interface contract without even thinking. This is the part that is scary.

The middle ground is to use the schema gen once and lock it down. Put it in the repository and protect it with the walls of governance. The auto-gen can still take place, but it also needs to verify that it has not broken the contract. This way, a schema that started from a POJO and was given official approval as the external interface is protected from unwarranted change, and the developer is granted the ability to grow the POJO into the rich domain model they want.
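As a rough sketch of what that verification step could look like, the Java below fingerprints the approved schema and the freshly generated one and reports whether they still match. Everything here is an assumption for illustration: the class and method names are hypothetical, the schema fragments are inlined rather than read from the repository, and collapsing whitespace is a naive stand-in for a real semantic comparison of two schemas.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class ContractCheck {
    // Collapse whitespace so pure formatting changes don't count as a contract change.
    // Naive on purpose: whitespace inside attribute values is treated the same way.
    static String normalize(String schema) {
        return schema.trim().replaceAll("\\s+", " ");
    }

    // SHA-256 fingerprint of the normalized schema text, as lowercase hex.
    static String fingerprint(String schema) {
        try {
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha.digest(normalize(schema).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // True when the generated schema still matches the locked-down one.
    static boolean contractUnchanged(String approved, String generated) {
        return fingerprint(approved).equals(fingerprint(generated));
    }

    public static void main(String[] args) {
        String approved = "<xs:element name=\"order\" type=\"xs:string\"/>";
        String reformatted = "<xs:element  name=\"order\"\n type=\"xs:string\"/>";
        String drifted = "<xs:element name=\"order\" type=\"xs:int\"/>";

        System.out.println("reformatted ok: " + contractUnchanged(approved, reformatted));
        System.out.println("drifted ok: " + contractUnchanged(approved, drifted));
    }
}
```

A build would fail the moment contractUnchanged returns false, which turns an accidental contract change into a visible decision.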

This approach places constraints on the evolution of a domain object, so it’s recommended that some time be taken to get the domain object as close to the use case/user story as possible. Any evolution will be quickly detected and will signal integration or legacy concerns immediately.

Anemic models can be addressed with a slight change in build process and a stronger embrace of the binding tools that typically generate them.

Orbeon Forms 3.5 moves closer to InfoPath

Tuesday, January 23rd, 2007

InfoPath is a pretty solid business user application. A small project I cranked out 6 months ago is still going strong. If only its output could be saved into the eXist database, I’d have a great forms solution.

Orbeon has been on my radar for some time. The premise is that editing XHTML/XForms is the most direct way for folks to create a dynamic web form. After reading the Orbeon Forms User Guide I’d have to agree that they’re definitely getting closer. It’s not quite the InfoPath WYSIWYG style editor, but it sure is coming close. I’m going to seriously consider it the next time I reach for InfoPath.

SOA: Better Cardboard Delivery!

Tuesday, January 23rd, 2007

Have you seen this before? Your architects understand the mechanisms that allow services to operate. They get service buses, registries, data services, etc., but when it comes to data, they simply throw the responsibility to the database team. In my opinion, they’re making a big mistake. SOA is not about moving data around the same way a network moves packets. In other words, packets do not equal messages.

Pat Helland thought about this quite a bit. In his presentation “Thoughts on Messages, Data, and Business Process in a Service Oriented Architecture,” he talks about the difference between messages and data. In fact he breaks it down into four specific types of data:

  • Request/Response
  • Reference Data
  • Activity Oriented Data
  • Resource Oriented Data

To focus on one aspect of Pat’s presentation, Activity Oriented Data represents the messages that flow within a SOA. In Pat’s “Metropolis” presentation, he refers to XML as cardboard because it wraps up the data into a safe package. This analogy would explain the ubiquity of XML and why it’s so important to SOA.

If you look deeper into Pat’s “Thoughts on Messages…” presentation you’ll see that XML doesn’t work for every single situation (he said this in 2003, I’m surprised he wasn’t shot) and that means that designing a canonical form of data for SOA is not the same as designing a database. So why would an architect ignore such an important aspect of architecture?

I’ll draw the line in the sand and guess that for all the emphasis that is put on XML, architects don’t understand the XML related technologies. XML Schema, WSDL, WS-Security and the rest of WS-* present a really steep learning curve. I’ll also add that REST doesn’t make this any easier than WS-*. REST and WS-* both require that you understand the data you’re dealing with, more specifically the activity oriented data.

Dan Pritchett points out the fallacy of ignoring the data requirements when implementing services in his blog post “A Real eBay Architect Analyzes Part 3”:
“…state style interactions [REST] can actually lead to lower levels of efficiency in the implementation. When a client makes an imperative statement like CompleteSale, we are completely clear on the intent of the operation…But if the client passes back an Item (which consists of over 200 state elements) with some state changed, the first task we have to perform is determining the state transition. This will involve retrieving the item and potentially other state in the system. All of this is a precursor necessary to determine intent. This certainly increases the resource requirements.”

In my own words, ignoring your data will not help you scale regardless of REST vs WS-*. Keep an eye on the data that is being sent across the SOA and you won’t regret it, because the shape of the data matters.
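Here is a small Java sketch of the extra work Dan describes. It is only an illustration: the three-field item, the STORED map standing in for a database lookup, and the intent names returned are all made up, and a real item would carry hundreds of state elements.

```java
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

class IntentVsState {
    // Hypothetical stored item, standing in for a lookup against the system of record.
    static final Map<String, String> STORED =
        Map.of("status", "OPEN", "price", "10.00", "title", "Lamp");

    // Imperative style: the operation name itself carries the intent.
    static String completeSale() {
        return "CompleteSale";
    }

    // State style: the server must diff the submitted item against what it
    // has stored before it can even guess what the client meant.
    static String inferIntent(Map<String, String> submitted) {
        Map<String, String> changed = new TreeMap<>();
        for (Map.Entry<String, String> field : submitted.entrySet()) {
            if (!Objects.equals(STORED.get(field.getKey()), field.getValue())) {
                changed.put(field.getKey(), field.getValue());
            }
        }
        if (changed.isEmpty()) {
            return "NoOp";
        }
        if (changed.size() == 1 && "SOLD".equals(changed.get("status"))) {
            return "CompleteSale";
        }
        return "Unknown change to " + changed.keySet();
    }

    public static void main(String[] args) {
        System.out.println(completeSale());
        System.out.println(inferIntent(
            Map.of("status", "SOLD", "price", "10.00", "title", "Lamp")));
        System.out.println(inferIntent(
            Map.of("status", "SOLD", "price", "12.00", "title", "Lamp")));
    }
}
```

The second call shows the cost: once more than one field changes, the server can no longer tell a sale from an edit without more knowledge of the data.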

Mozilla Does Microformats

Tuesday, January 2nd, 2007

Microformats are the fair haired child of the Web 2.0 movement.  They’re intended to be simple formats for representing common data patterns, like dates, contacts, resumes, etc.  They’re basically rel and class attributes on XHTML that shape the data.  They piggyback on the HTML rendering engines of a browser to provide basic representations.  When combined with JavaScript they become interactive data islands.
Mozilla Does Microformats: Firefox 3 as Information Broker
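To show how little machinery is involved, here is a minimal Java sketch that pulls a name out of an hCard-style fragment purely by looking at class attributes. The markup, the example.org URL and the helper names are assumptions for illustration; only the vcard, fn and url class names come from the hCard microformat.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class HCardReader {
    // A made-up XHTML fragment carrying hCard class names.
    static final String PAGE =
        "<div class=\"vcard\">"
      + "<a class=\"url fn\" href=\"http://example.org/\">Jane Doe</a>"
      + "</div>";

    // Parse well-formed XHTML into a DOM tree.
    static Document parse(String xhtml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XHTML", e);
        }
    }

    // Return the text of the first element whose class attribute contains
    // the given token, or null when no element carries it.
    static String firstTextWithClass(Document doc, String className) {
        NodeList all = doc.getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            Element element = (Element) all.item(i);
            String[] tokens = element.getAttribute("class").trim().split("\\s+");
            if (Arrays.asList(tokens).contains(className)) {
                return element.getTextContent();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(firstTextWithClass(parse(PAGE), "fn"));
    }
}
```

Nothing in the markup tells the parser that fn means "formatted name"; that meaning exists only in the convention the reader and the writer happen to share.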

So, if Mozilla detects these microformats and moves them into an online storage service, why the need for XHTML as the basis?  Why not simply extend XHTML?

I’m on the fence when it comes to microformats.  The biggest reason is that there is a potential for ambiguity.  The semantics of the XHTML language are being overridden by convention, not through an extension protocol (like XSD).  However, this view is contrary to the microformat objective.

In another debate, JSON vs XML, XML is meant to represent data that must exist beyond a few seconds, while JSON is effectively a “struct on the wire” approach to data transfer.  I think most of the microformats are expected to be long lived, so they would fall into the XML category.  XHTML, by contrast, is short lived; it basically serves the representational state transfer of the web server.  The microformats are embedded data documents that are meant for long term storage.

An Antic Disposition

Sunday, December 24th, 2006

Keep it easy to read! I’ve always favored keeping engineering artifacts easy to comprehend, even without any context for what is being read. I favor Java because of its verbosity. It turns out that XML can be just as verbose without a penalty to performance. It appears I’m not alone:

I usually favor readability. I can always tune the code to make it faster. But the developers are at a permanent disadvantage if the language uses cryptic. I can’t tune them.

An Antic Disposition: Celerity of Verbosity

Comparing XSLT and XQuery

Wednesday, November 8th, 2006

A great read for anyone interested in XSLT and XQuery. Dr. Michael Kay does a great analysis of the differences. Comparing XSLT and XQuery

XQOM: Standing on the shoulders of giants

Thursday, November 2nd, 2006

Looks like Ilya Sterin had an agenda when he was surveying the existing XML database solutions out there.

Enterprise Java Community: Object To XML Persistence Frameworks: Interview with Ilya Sterin of Nextrials

The initial goal of XQOM is basically an abstraction layer that allows the mapping of XQuery expressions and their result sets to the object graph

If I understand what he’s trying to accomplish, then he’s replicating a BEA product called AquaLogic Data Services Platform (full disclosure: I work for BEA professional services). Currently, BEA does this with XML and uses XmlBeans to create a generic wrapper around the output, but it can be any XML marshaling technology.

To Ilya’s credit, he is looking at the complete end to end solution beyond just the database to XML service layer. He wants a complete XML data source to front an XQuery object mapper/model. It’s about time these XML databases had a chance to shine. Perhaps they’ll overcome the hurdles that object databases could not.
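To picture the kind of mapping being described, here is a small Java sketch that runs a query over XML and turns the result set into plain objects. It is not XQOM's API, which I haven't seen: it swaps in the JDK's built-in XPath engine as a stand-in for XQuery, and the Book class, the sample document and the query helper are all hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class QueryToObjects {
    // Hypothetical domain object the query results are mapped onto.
    static class Book {
        final String id;
        final String title;

        Book(String id, String title) {
            this.id = id;
            this.title = title;
        }
    }

    // Made-up sample data standing in for an XML data source.
    static final String CATALOG =
        "<books>"
      + "<book id=\"b1\" year=\"2006\">XQuery Basics</book>"
      + "<book id=\"b2\" year=\"2001\">XSLT Basics</book>"
      + "</books>";

    // Evaluate an XPath expression (XPath here, not XQuery) and map each
    // matching element to a Book.
    static List<Book> query(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<Book> books = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element element = (Element) nodes.item(i);
                books.add(new Book(element.getAttribute("id"), element.getTextContent()));
            }
            return books;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad expression: " + expression, e);
        }
    }

    public static void main(String[] args) {
        for (Book book : query(CATALOG, "/books/book[@year='2006']")) {
            System.out.println(book.id + " " + book.title);
        }
    }
}
```

The caller only ever sees Book instances; the XML stays behind the query method, which is the encapsulation an object mapper like this is after.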

One thing that is interesting is that XQOM will encapsulate the XML from an XML store to the Java runtime. To some extent, Ilya falls into the very group that he attempts to enlighten in his previous post.

I think most enterprise developers today look at XML as an intermediary transport format…

XQOM attempts to do just that. Getting back to ALDSP for a second, the XML exposed by ALDSP comes from the underlying RDBMS, but really can come from any J2EE source (RAR, WS, RDBMS) and be expressed in an XML shape or schema. This is essentially creating a virtual XML database. Add SDO to the mix and now there is a read/write runtime that does what Hibernate, JPA, and JDO do very well.

It appears that XQOM will attempt to tackle the “last three meters”. Even better, if XQOM creates the plugins for Hibernate or JDO* to communicate with SDO or an XQuery/XML backed data source, that would go a long way toward promoting XML databases as first class, enterprise citizens.

* Update: Looks like the great minds at JPOX are a step ahead of me and it did not fall on deaf ears.

Making XML Behave

Friday, October 20th, 2006

XML is the foundation of almost all applications on the market. If an application does not use XML inside its run-time, it surely knows how to import or export XML. This is the application of the XML Infoset to application data structures. In so many words…this is XML as a data bucket. XML Schema, DTD and RELAX NG all help make the data bucket easier to handle.

XML for holding data is one thing, XML that defines behavior is another. XML is ubiquitous on the server side for holding configuration. However, XML on the client side is still making its debut. XAML is Microsoft’s entry, Synth is Sun’s, but probably the most successful and widely distributed XML framework is XHTML.

If only the two could be merged so I could use XML data structures and the behavioral markup of XHTML. For example, I have an XML schema that defines a custom type and I want to display that XML in a web page, but I also want to add style sheets and onclick events so that it is a first class citizen in the browser.

It turns out that you can. Extending XHTML through modularization is the process. However, it requires an advanced understanding of XML Schema and of how to extend an existing schema. Luckily, examples are provided; however, it’s anyone’s guess whether Firefox, IE or Safari support it.

Orbeon - Form-based Web Applications Done the Right Way

Monday, October 9th, 2006

I’ve been looking for a replacement for InfoPath. Don’t get me wrong, I think InfoPath is a great product and InfoPath 12/2007 will have some incredibly neat features, but it also has some limitations around its support for XML Schema [1]. All in all, InfoPath is a boon for any enterprise developer and should definitely be seriously considered when information needs to be collected. Storing in SharePoint is a no brainer as well. Well, unless you want to actually use that information in a meaningful way.

What would be ideal is to be able to store the resulting output of InfoPath (which is simply XML) in an XML database that allows for advanced queries. Enter the eXist database, stage left. This is a true XML database and can be run on an individual computer or clustered together for a much more scalable solution [2]. All the output from InfoPath can be stored in eXist, or copied from SharePoint from time to time, so one can process the information. The result is rapid UI development using InfoPath and extremely powerful XML database features in eXist.

So, why the post about Orbeon? While I think InfoPath works wonders, it’s a proprietary platform and requires a substantial outlay of capital to use. If you’re Sun, Google, a start-up, a non-profit or an OSS project, then using anything from Microsoft is out of the question. Orbeon makes form generation accessible for everyone…as long as you’re willing to learn XForms, XHTML and XPath. The good thing is that these are standards that will eventually be adopted by tool vendors and projects. Until then, Orbeon is attractive for those that like vi, Emacs or Notepad. InfoPath provides a great visual toolkit for creating forms rapidly, but Orbeon does it for free at the price of approachability.

I’m taking the time to learn Orbeon because I think that all enterprise architects need to understand these standards. The tools will follow, but for now it’s time to roll up the sleeves and learn some new languages.

PS. Skyoxft also provides commercial support for XForms.
PPS. There are XML editors that support XForms editing as well, but typically in a raw XML form, not a GUI editor.

1. anyURI cannot be edited and the use of choice groups is clumsy.
2. In terms of availability, I haven’t seen any numbers relating to throughput. YMMV

XML Databases

Friday, September 1st, 2006

Ilya, have a look at eXist. They have a nice web-based query engine.