Web Content Management - WCM

At the core of all managed web sites is a content management system (CMS). Early in the development process, a platform and product are selected. There are currently more than 400 options. It is important to find the tool set that most closely matches the defined business requirements.

Choose an overly simple solution, and you will find yourself in the unfortunate position of needing to port your solution to a more capable system. Go too complex, and you become a slave to the implementation and tool experts, regardless of how expensive or flaky they may be.

Choosing the right WCM system for your website, or indeed for your enterprise, can be both confusing and frustrating: you have over 500 systems to choose from, with more choices being added daily.

Whether that system is something complex or something simple (e.g. hand editing), your ability to implement and use your CMS is an essential part of a successful site. You must be able to enable content providers and editors, however inexperienced, to perform website updates.

Here is a round-up of the posts in our community that relate to Web Content Management (WCM)

Will Day stay committed to web standards under Adobe's ownership?
Post by sggottlieb
I JUST heard about Adobe’s acquisition of Day Software and have to admit my first reaction was total disappointment. I always admired Day’s commitment to architecture and standards. Day is one of the few upper upper tier web content management companies to stay focused on the web, not just as a place [...]
Day’s acquisition by Adobe: point of view of a competitor
Post by tristan renaud
No fiesta
Every time an acquisition is announced, it is time to celebrate. Well, it depends for whom.
I have learnt to love acquisitions as a shareholder of the acquired company. I have learnt, as a manager, to rebuild teams, business and clients’ trust on the ruins of acquired companies, several times. And I have learnt how tough, and how rare, it is to acquire a company successfully: as an investor, an employee or whatever, it is always a challenge.

Mission accomplished?
So I am not the kind to say “congrats” during an acquisition, except to the team in charge of the acquisition, of course. They have “accomplished the mission”, and anyone who has been in such business knows how tricky it can sometimes be. So yes, congrats to the management team of Day Software: they have sold the company very successfully.
But from my point of view, there is not so much to celebrate during an acquisition, especially for many of the employees and for many of the clients. The most difficult part is still to come. Acquisition does not matter so much compared to integration. It is like celebrating a deal when what matters is the Go Live of a project, not the signature of the contract.
 

Acquisition means “risky business”
OK let’s write “challenging” instead of “risky” to be positive.
I have not read much today about how a mid-size company with a strong taste of Swiss attitude and open source philosophy will fit into a mainly US-centric international proprietary software company. I have also learnt that during acquisitions people matter, but not so much actually. Not because you don’t want them to, just because you cannot take too much care of the details, and that’s what individuals are during an acquisition. Of course some individuals will find great opportunities thanks to the deal; that’s not the point. It is just to say that an acquisition is not a fiesta for everyone, far from it: it mainly means “risky business”.
And one of the key advantages of Day has been several very gifted, skilled and committed individuals. I am wondering how Adobe will handle the famous “The surprising truth about what motivates us”.
Day Software was one of the rare independent high-end WCM vendors and will now be just one of the products of a major software company. That’s a major evolution. Uncertainty will prevail for weeks if not months, as for any similar acquisition, at least for many employees, even if, of course, Adobe may argue the opposite. Many are speculating and will speculate about the product integration, the strategic fit, the risks, the advantages, the constraints and so on. I don’t really care so much, as my employer is a competitor, so I have to focus on my clients, the team and our own product. But to finish my post on something funnier, I cannot stop myself from encouraging the shareholders of Day to agree to this acquisition. Let me be more specific:

Four Reasons to sell your shares to Adobe
1. The price is right
The company’s share price was extremely high before the acquisition, and I assumed the market was expecting something irrational, sorry I meant “exceptional”. And the market is so often right, so now it is really time to do something, guys. Frankly I am impressed. I could write for pages about why this valuation looks so unreal to me (maybe because I come from a different world), but frankly, just sell your shares for that price. Exaggerating a little, I would say you are more likely to be hit by an asteroid tomorrow morning than to get a better deal soon.
2. That’s good for my business
Everybody knows an acquisition is always good for the competition’s business, at least for the months to come. Beyond a year, as usual in this kind of business, it does not really matter so much; that’s another time frame, and we always have to face competitors of many kinds: that’s business. What matters is adaptation and anticipation, not conjecture. So for the time being, I believe that’s good news for my employer’s business development, as we have very frequently met Day Software both in Europe and North America these last months.

3. You will give Adobe a cooler image
I really like the spirit and the values of the Day team. I don’t believe in miracles but I hope they will influence Adobe somewhere in a positive way. At least, they won’t be a bad influence!
4. This summer is as boring as any summer, thanks for the show
For different reasons, and somewhat surprisingly, daily business is always thriving during July and August, but market news is usually so boring during these sunny weeks, in Europe too. At last, some news to really talk about during the summer break. Cool. And just before the CMS geek up sessions! Very cool.

So please do me a favor, just sell your shares to Adobe.

Further reading:

Analysis, interesting comments and some (inevitable) speculation from our WCM industry gurus: Adobe to acquire Day – First Take ECM perspective

Great critique, as usual, from Seth Gottlieb: Will Day stay committed to web standards under Adobe’s ownership?

CMS Wire article: Web CMS: Adobe Buys Day Software for US$ 240 Million

Boris-Magnolia’s own publicity but with several excellent remarks: http://www.betterfasterbigger.com/2010/07/day-to-be-acquired-by-adobe.html

Jon on tech blog post: http://jonontech.com/2010/07/28/a-fine-day-for-adobe/

Jeff Potts who was very quick to write about the acquisition: Adobe acquires Day Software for $240 million

 

 

 

CRX Gems: Rendering content as PDF + XFDF
Post by dev.day.com
I promised last time to show a simple way to render CRX content as PDF. The technique in question involves using a PDF form as the readymade container, into which form data is imported using XFDF. The latter is the XML version of Adobe's Forms Data Format, which in turn is a file format specifically designed to allow import and export of data to and from PDF forms.

The way it works is simple: Suppose you have a PDF form that you want to populate with data. You merely need to create a small data file (in XFDF format) and put it on the server. When a user requests the data file (which has a mimetype of "application/vnd.adobe.xfdf"), Acrobat Reader (or the Reader browser plug-in) detects the fact that form data will need to be imported into a form. The XFDF file itself contains a pointer to the actual form to be used. Reader fetches the form, then imports the form data into it, and renders the result as a PDF file containing the data. It all happens transparently to the user, and the user need only have Acrobat Reader (not a full copy of Acrobat Professional).
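
To give a rough idea of what such a data file contains, here is a minimal sketch in plain JavaScript that assembles an XFDF document as a string. It is not the xfdf.esp script from this post: the helper names (escapeXml, buildXfdf), the form URL and the field names are all assumptions made up for illustration, and the field names would have to match those defined in your own PDF form.

```javascript
// Escape the five XML special characters so that values are safe in markup.
function escapeXml( text ) {
    return String( text )
        .replace( /&/g, "&amp;" )
        .replace( /</g, "&lt;" )
        .replace( />/g, "&gt;" )
        .replace( /"/g, "&quot;" )
        .replace( /'/g, "&apos;" );
}

// Build an XFDF document: the <f> element points at the PDF form to use,
// and each <field> carries the value for one form field.
function buildXfdf( formUrl, fields ) {
    var lines = [];
    lines.push( '<?xml version="1.0" encoding="UTF-8"?>' );
    lines.push( '<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">' );
    lines.push( '  <f href="' + escapeXml( formUrl ) + '"/>' );
    lines.push( '  <fields>' );
    for ( var name in fields ) {
        lines.push( '    <field name="' + escapeXml( name ) + '">' +
            '<value>' + escapeXml( fields[ name ] ) + '</value></field>' );
    }
    lines.push( '  </fields>' );
    lines.push( '</xfdf>' );
    return lines.join( "\n" );
}

// Hypothetical form location and field names, for illustration only.
var xfdf = buildXfdf( "http://localhost:7402/content/forms/film.pdf",
    { Title: "Terminator 2", Year: "1991" } );
```

The resulting string is what the server would send back with the "application/vnd.adobe.xfdf" mimetype; Reader then follows the href to fetch the form itself.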

In the example I'm going to show below, we generate the XFDF file dynamically on the server, via a script called (what else?) xfdf.esp. We'll get to that in a minute.

The example we're going to talk about assumes that there is content in CRX (under a path of /content/films) that looks something like this:

This particular content node is named terminator_2. It lives under /content/films/ in my CRX repository.

Notice, in the above list, that there is a property (at the bottom) called sling:resourceType, set to a value of "films." This tells CRX to look under /apps/films for any scripts that might be necessary to render the content.

In previous blogs, I've shown how to write scripts that render this content as HTML, SVG, or CSV. Right now, what we need is an XFDF renderer. That turns out to be pretty easy to set up.

First, we need to create a PDF form to hold our data. In the Acrobat Professional forms editor, such a form looks like this:


CRX Gems: Rendering content as CSV
Post by dev.day.com
I've shown how easy it is to push spreadsheet data into CRX (in such a way that there is one content node per row of data, where properties on that node correspond to column data). The reverse is also possible: It's easy to write a script that converts sibling nodes to row data formatted as CSV (comma-separated values per RFC 4180). Such a script, csv.esp, looks something like this:

<%
// Given a list of sibling nodes (presumably
// similar in structure), and an array of
// property names, convert each node
// to one "row" of CSV data, where
// columns correspond to properties.
// We will encode all property data as
// comma-separated values per RFC 4180.
function nodesToCSV( nodes, propertyNames ) {

        var records = new Array( );

        for ( var i = 0; i < nodes.length; i++ ) {

                var aRecord = new Array( );

                // suck in the data for each property:
                for ( var k = 0; k < propertyNames.length; k++ ) {
                        var data = nodes[ i ][ propertyNames[ k ] ];
                        var escaped = escapeData( data );
                        aRecord.push( escaped );
                }
                records.push( aRecord.join( "," ) );
        }

        var CRLF = String.fromCharCode(13) +
        String.fromCharCode(10);

        return records.join( CRLF );
}

// Return an array of property names for this node
function getOrderedProperties( node ) {

        var array = new Array();
        for ( var i in node )
                array.push( i );

        return array;
}

// Escape field data per RFC 4180
function escapeData( data ) {

        // replace " with ""
        data = String(data).replace( /"/g, "\"\"" );

        // if data contains comma, CRLF, or "
        // we need to wrap the entire thing in double quotes
        var escapables = /,|(\r\n)|"/;
        if ( data.match( escapables ) )
                return "\"" + data + "\"";

        return data;
}
%>
<% var nodes = currentNode.getNodes( );
// get a list of property names
var propertyNames =
        getOrderedProperties( nodes[0] );%>
<%= nodesToCSV( nodes, propertyNames ) %>
 

The rules for escaping data for CSV are extremely simple. First, any data string that contains the double-quote (") character needs to have each such character converted to two double-quotes (""). Secondly, if the data contains a comma, the entire data string needs to be wrapped in quotation marks. The same is true for any data that contains double-quotes or line breaks (which RFC 4180 defines as CRLF -- carriage return followed by linefeed). The following very simple function enforces these escaping rules:

// Escape field data per RFC 4180
function escapeData( data ) {

   // replace " with ""
   data = String(data).replace( /"/g, "\"\"" );
 
   // if data contains comma, CRLF, or "
   // we need to wrap the entire thing in double quotes  
   var escapables = /,|(\r\n)|"/;
   if ( data.match( escapables ) )
      return "\"" + data + "\"";
      
   return data;
}
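
To make these rules concrete, here is a small sketch that runs the same escaping logic over a few made-up strings. The function is repeated (in slightly condensed form) only so that the snippet runs on its own; the sample values are illustrative and are not taken from the film data.

```javascript
// Same escaping rules as escapeData() in the post, condensed.
function escapeData( data ) {
    // double up any embedded double-quotes
    data = String( data ).replace( /"/g, "\"\"" );
    // wrap in quotes if the field contains a comma, a CRLF, or a quote
    if ( /,|\r\n|"/.test( data ) )
        return "\"" + data + "\"";
    return data;
}

var samples = [
    "Alien",                  // nothing to escape
    "Good, Bad and Ugly",     // contains a comma
    "The \"Burbs\"",          // contains double-quotes
    "Line one\r\nLine two"    // contains a CRLF
];

var escapedSamples = [];
for ( var i = 0; i < samples.length; i++ ) {
    escapedSamples.push( escapeData( samples[ i ] ) );
}
```

The first string passes through untouched, the comma and CRLF cases are simply wrapped in quotation marks, and the third has its quotes doubled before being wrapped.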

The function that actually converts nodes to records is very straightforward as well:

function nodesToCSV( nodes, propertyNames ) {

   var records = new Array( );

   for ( var i = 0; i < nodes.length; i++ ) {

      var aRecord = new Array( );

      // suck in the data for each property:
      for ( var k = 0; k < propertyNames.length; k++ ) {
         var data = nodes[ i ][ propertyNames[ k ] ];
         var escaped = escapeData( data );
         aRecord.push( escaped );
      }
      records.push( aRecord.join( "," ) );
   }

   var CRLF = String.fromCharCode(13) +
                    String.fromCharCode(10);

   return records.join( CRLF );
}

Note that we need to explicitly provide the function a list of property names, rather than (say) let the function iterate through property names on an introspective basis. The reason for this is that if we simply try gathering property names with a for/in loop, we will get back property names in no particular order. And the order will, in fact, vary from content node to content node even if all of the content nodes have properties with exactly the same names. The unorderedness of the properties (as obtained through simple iteration) would scramble the column data in our CSV file. We don't want that. Hence, we pass in an array of property names, and march through the array in orderly fashion when pulling property data from each node.
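
The difference can be sketched with two stand-in "nodes". These are plain JavaScript objects rather than real CRX nodes, and the helper names (valuesByIteration, valuesByList) are made up for this illustration. Both objects hold the same properties, but they were created in a different order.

```javascript
// Two stand-in nodes with identical property names, defined in different order.
var nodeA = { Title: "Alien", Year: "1979" };
var nodeB = { Year: "1991", Title: "Terminator 2" };

// Collect values in whatever order for/in happens to yield the names.
function valuesByIteration( node ) {
    var values = [];
    for ( var name in node )
        values.push( node[ name ] );
    return values.join( "," );
}

// Collect values in the order dictated by an explicit list of names.
function valuesByList( node, propertyNames ) {
    var values = [];
    for ( var k = 0; k < propertyNames.length; k++ )
        values.push( node[ propertyNames[ k ] ] );
    return values.join( "," );
}

var columns = [ "Title", "Year" ];
var byIteration = [ valuesByIteration( nodeA ), valuesByIteration( nodeB ) ];
var byList      = [ valuesByList( nodeA, columns ), valuesByList( nodeB, columns ) ];
```

With the explicit list, both rows come out as Title followed by Year. With plain iteration, the column order depends on each object, which is exactly the scrambling we want to avoid.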

When I placed csv.esp in my repository under /apps/films and then navigated to http://localhost:7402/content/films.csv, CRX dutifully fired my script and produced a CSV file containing all of the data from my /films content nodes, causing my browser (in turn) to inform me that I was downloading a file of type "csv" (it then asked me what program I wanted to use to open the file; I specified scalc.exe, and OpenOffice dutifully loaded the file as a spreadsheet).

So far, I've shown how to render /films data as HTML, SVG, and CSV. Next time, I want to show a simple trick for rendering the data as PDF. It's easier than you think!


[LOTD] IBM's approach to JCR text search
Post by dev.day.com
It's always good to get a glimpse into the approaches taken by non-OSS JCR implementations: in a recent technical article on the developerWorks website, Malarvizhi Kandasamy describes how IBM goes about JCR fulltext search. The actual engine is Juru, a Java library developed by the IBM Haifa research lab. According to the article, Juru is capable of some natural language processing, such as stemming or finding similar spellings.

IBM uses a JCR compliant repository in a number of their products, e.g. Lotus Web Content Management or WebSphere Portal.


CRX Gems: Rendering content as HTML and SVG
Post by dev.day.com
A few days ago, I talked about how to "shred and store" a spreadsheet -- i.e., how to push rows of a spreadsheet into individual nodes in CRX (one node per row, with column data stored as properties). I also gave JavaScript code for doing this in an OpenOffice macro. For testing purposes, I used the CSV file a1-film.csv, representing 1741 movies catalogued by Georgia Tech's College of Computing.

After running my OpenOffice macro on the Georgia Tech CSV file, my CRX repository now contains movie data (Title, Director, Year, etc.) for 1741 films, each film with its own nt:unstructured node under the path /content/films/. In the CRX Content Explorer, a given node (in this case, the node at http://localhost:7402/content/films/terminator_2) looks something like this:


Keeping your content DRY
Post by sggottlieb
After over 10 years of working in content management, I have come to realize that there is only one way to learn the value of managing structured information: the hard way, and that way is only 50% effective. People can intellectually accept concepts like content re-use and content/layout separation, but in the heat [...]
CRX Gems: CRXDE Lite
Post by dev.day.com
Since version 2.0, CRX comes with CRXDE Lite (CRX Development Environment - Lite), a web-based tool to ease the development of CRX-based applications. CRXDE Lite is implemented using the ExtJS JavaScript library and aims to replace the CRX 1.x Content Explorer with a modern AJAX-based repository editor and browser. It also provides improved means for searching and code editing, as well as integrations for code version management and handling of non-scripted code. As a tool primarily for developers, it also comes with server-side development features such as compilation of Java code, OSGi bundle creation and autodeployment, a project wizard, etc.


Work Breakdown Structure vs. Deadlines
Post by sggottlieb
One of the most common points of friction between project managers and developers is planning work. Most programmers hate creating work breakdown structures (WBS). You can’t blame them: accurately predicting steps and effort required to build undesigned software is impossible. Yes, you heard that right. Software development planning is impossible [...]
CRX Gems: Using an OpenOffice macro to store spreadsheet data in CRX
Post by dev.day.com
In a recent blog, I talked about how easy it is to store snippets of text from OpenOffice in a CRX repository using a little bit of JavaScript and the Sling REST API. While being able to store arbitrary bits of text this way is certainly useful, it would be even more useful to be able to store spreadsheet data. Of course, storing a spreadsheet in CRX, per se, is not much of a challenge: with WebDAV, it's a matter of drag and drop. But storing an entire spreadsheet as a single monolithic content item doesn't necessarily give you the greatest content-management bang for the buck. Often, what you really want to do is granularize the spreadsheet into records (or row data), and store individual rows as content items. (You could take it further and store individual cells as content items, but that would probably be overkill for most situations, although there's certainly nothing preventing you from doing it.)

In the database world, where decisions often have to be made as to how best to decompose an XML document when mapping it to tables in a database, this general process (of decomposing a large document along the lines of its natural internal fine-structure) is known as shredding. What would be handy is to have an OpenOffice macro that could shred a spreadsheet into rows, and push the rows into nodes in CRX. That's what I propose to show you right now.

It turns out to be pretty easy to parse a spreadsheet in an OpenOffice macro. Using JavaScript:

   // First, get the document object
   // from the scripting context
   oDoc = XSCRIPTCONTEXT.getDocument();

   // Next, get the XSpreadsheetDocument
   // interface from the document
   xSDoc = UnoRuntime.queryInterface(XSpreadsheetDocument, oDoc);

   // Then get a reference to the sheets for this doc
   var sheets = xSDoc.getSheets();

   // get Sheet1
   var sheet1 = sheets.getByName("Sheet1");

Once you've gotten the sheet reference, you can use it to obtain a cell reference:

var cell = sheet.getObject().getCellByPosition( column, row );

The cell, in turn, contains data, which (depending on whether you're dealing with a native OpenOffice spreadsheet versus a freshly imported CSV file) can be a floating-point value, a string, or something else. For purposes of this discussion I'm going to assume that you've just imported a CSV or tab-delimited file into OpenOffice, in which case all cells will automatically contain string data. To get the string data from a cell in a freshly imported CSV file, you have to do:

var content = cell.getFormula();

At least, that's what works in OpenOffice 3.2.

The general plan of attack, then, is to come up with a function that can parse a row's worth of data out of a spreadsheet; and have another function that can persist a row of data as a content item in CRX. Then it should be possible to create a macro that simply loops over all rows in a spreadsheet and pushes them out to the repository.

The row-parsing function is pretty straightforward:

function getRow( sheet, rownumber, startColumn, endColumn )  {

    var obj = sheet.getObject();
    var record = [];

    for (var k = startColumn; k < endColumn ; k++) {
         var cell = obj.getCellByPosition( k, rownumber );
         var content = cell.getFormula();
         record.push( content );
    }

    return record;
}

Given a reference to a Sheet, along with a row number and the starting and ending column numbers, this function loops through cells and pushes cell values into an array. The returned array represents a row's worth of data.

To persist a row to CRX, we have a function that looks like this:

function persistRow( sheet, rownumber, startColumn, endColumn ) {

   // get first row of data (column names)
   var columnNames = getRow( sheet, 0, startColumn, endColumn );

   // get specified record
   var row = getRow( sheet, rownumber, startColumn, endColumn );

   // build the request
   var request = {};
   request[":nameHint"] = row[2]; // Title
   request["sling:resourceType"] = "films";
   for ( var i = 0; i < columnNames.length; i++) {
       request[ columnNames[ i ] ] = row[ i ];
   }   
   var data = createRequest( request );

   // where to store it
   var url = "http://localhost:7402/content/films/";

   // finally, hit the repository
   var response = doJavaPOST( url, data );

   return response;
}

Notice that the code assumes that the first row of "data" in the spreadsheet contains the column names. This was the case with the spreadsheet I used for testing this macro, namely a spreadsheet called a1-film.csv, representing 1741 movies catalogued by Georgia Tech's College of Computing. Each row in the spreadsheet has information for a particular film, such as the film's title, the year the film was made, its genre, the name of the director, major actors and actresses, etc.
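
To see what actually goes over the wire, here is a standalone sketch of how a header row and a data row turn into a form body. The column names and cell values are invented for illustration (they are not rows from a1-film.csv), and buildRequest and encodeForm are hypothetical helper names. One assumption worth flagging: this sketch percent-encodes names and values with encodeURIComponent, which is what the application/x-www-form-urlencoded content type expects when cell data contains spaces, "&" or "=".

```javascript
// Hypothetical header row and data row, standing in for what getRow() returns.
var columnNames = [ "Year", "Subject", "Title" ];
var row         = [ "1999", "Comedy", "Salt & Pepper" ];

// Pair each column name with the matching cell, plus the two Sling hints.
function buildRequest( columnNames, row ) {
    var request = {};
    request[ ":nameHint" ] = row[ 2 ];            // Title column, as in persistRow()
    request[ "sling:resourceType" ] = "films";
    for ( var i = 0; i < columnNames.length; i++ )
        request[ columnNames[ i ] ] = row[ i ];
    return request;
}

// Join name=value pairs with "&", percent-encoding both sides so that
// characters inside the data cannot break the form body apart.
function encodeForm( request ) {
    var pairs = [];
    for ( var name in request )
        pairs.push( encodeURIComponent( name ) + "=" +
            encodeURIComponent( String( request[ name ] ) ) );
    return pairs.join( "&" );
}

var body = encodeForm( buildRequest( columnNames, row ) );
```

Without the encoding step, the "&" in the sample title would be read as a separator between two form fields, and the stored title would be cut short.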

Without further ado, here is the complete code for the OpenOffice macro:



// Spreadsheet2CRX Macro
// Kas Thomas, 15 July 2010
// Public domain. Use at your own risk.
// Tested with v3.2 of OpenOffice.org

importClass(Packages.com.sun.star.uno.UnoRuntime);
importClass(Packages.com.sun.star.sheet.XSpreadsheetDocument);

// Do a POST
function doJavaPOST( url, content ) {
        var reply = "";
        var responseCode = "";
        try {
                var URL = new java.net.URL( url );
                var urlConn = URL.openConnection( );
                urlConn.setDoOutput ( true );
                urlConn.setRequestMethod( "POST" );
                urlConn.setUseCaches( false );
                urlConn.setRequestProperty ("Content-Type",
                        "application/x-www-form-urlencoded" );
                var printout =
                        new java.io.DataOutputStream ( urlConn.getOutputStream ( ) );
                printout.writeBytes ( content );
                printout.flush ( );
                printout.close ( );
                responseCode = urlConn.getResponseCode();
        }
        catch(exception) {
                java.lang.System.out.println( exception.toString() );
        }

        return responseCode;
}

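// munge together the form data
// into "name1=value1&name2=value2" etc.
// Names and values are percent-encoded so that characters such as
// "&", "=" and spaces in cell data cannot corrupt the form body.
function createRequest( object ){

        var data = [];
        for ( var i in object )
                data.push( encodeURIComponent( i ) + "=" +
                        encodeURIComponent( String( object[ i ] ) ) );

        var dataString = data.join( "&" );
        return dataString;
}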

// Modal dialog with OK/cancel and a text field
function prompt( msg ) {
        var swing = Packages.javax.swing;
        var text = swing.JOptionPane.showInputDialog(
                new java.awt.Frame(), msg );
        return ( null == text ) ? "" : text; // always return a string
}

// a Swing UI for displaying console info
function EditorPane( ) {

        Swing = Packages.javax.swing;
        this.pane = new Swing.JEditorPane("text/html","" );
        this.jframe = new Swing.JFrame( );
        this.jframe.setBounds( 100,100,500,400 );
        var editorScrollPane = new Swing.JScrollPane(this.pane);
        editorScrollPane.setVerticalScrollBarPolicy(
        Swing.JScrollPane.VERTICAL_SCROLLBAR_ALWAYS);
        editorScrollPane.setPreferredSize(new java.awt.Dimension(250, 250));
        editorScrollPane.setMinimumSize(new java.awt.Dimension(10, 10));
        this.jframe.setVisible( true );
        this.jframe.getContentPane().add( editorScrollPane );

        // public methods
        this.getPane = function( ) { return this.pane; }
        this.getJFrame = function( ) { return this.jframe; }
}

function getRow( sheet, rownumber, startColumn, endColumn )  {

        var obj = sheet.getObject();
        var record = [];

        for (var k = startColumn; k < endColumn ; k++) {
                var cell = obj.getCellByPosition( k, rownumber );
                var content = cell.getFormula();
                record.push( content );
        }

        return record;
}

function persistRow( sheet, rownumber, startColumn, endColumn ) {

        // get first row of data (column names)
        var columnNames = getRow( sheet, 0, startColumn, endColumn );

        // get specified record
        var row = getRow( sheet, rownumber, startColumn, endColumn );

        // build the request
        var request = {};
        request[":nameHint"] = row[2]; // Title
        request["sling:resourceType"] = "films";
        for ( var i = 0; i < columnNames.length; i++) {
                request[ columnNames[ i ] ] = row[ i ];
        }
        var data = createRequest( request );

        // where to store it
        var url = "http://localhost:7402/content/films/";

        // finally, hit the repository
        var response = doJavaPOST( url, data );

        return response;
}

( function main( ) {

        //get the document object from the scripting context
        oDoc = XSCRIPTCONTEXT.getDocument();

        //get the XSpreadsheetDocument interface from the document
        xSDoc = UnoRuntime.queryInterface(XSpreadsheetDocument, oDoc);

        // get a reference to the sheets for this doc
        var sheets = xSDoc.getSheets();

        // get Sheet1
        var sheet1 = sheets.getByName("Sheet1");

        // construct a new EditorPane
        var editor = new EditorPane( );
        var pane = editor.getPane( );

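        var size = prompt("Enter total rows and total columns, separated by a comma (e.g., '100,8')");
        if ( !size )
                return "No row/column info supplied.";

        // split "rows,cols" and make sure both parts are numbers
        var parts = String( size ).split( "," );
        var rows = Number( parts[ 0 ] );
        var cols = Number( parts[ 1 ] );
        if ( parts.length != 2 || isNaN( rows ) || isNaN( cols ) )
                return "Expected two numbers separated by a comma.";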

        var errors = 0;
        for ( var i = 1; i <= rows; i++) {
                var response = persistRow( sheet1, i, 0, cols );
                var text = pane.getText();
                pane.setText( text + "\nProcessing: " + i );
                // count 5xx (server error) responses
                if ( String( response ).indexOf("5")==0 )
                        errors++;
                // provide a little bit of throttling:
                java.lang.Thread.sleep( 200 );
        }
        pane.setText( pane.getText() + "\n" + errors + " errors" );
})();




You'll notice that the code creates a JEditorPane window to act as an error console. When you run the macro, a JOptionPane dialog appears, asking you to supply the number of rows and columns in the spreadsheet. (For the Georgia Tech spreadsheet, you can enter "1741,8", minus quotes.) Once you dismiss the dialog, the code goes to work looping over all the rows in the spreadsheet, posting each row to CRX at a path of http://localhost:7402/content/films/.

Each new node is named according to a :nameHint parameter based on the Title of the film.

Notice also, we designate a sling:resourceType for each node of "films." (This happens in the persistRow() function.) This fact will be important in a later blog when I show how to write server-side scripts that handle various types of requests for film data.
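To make the request format concrete, here is a minimal sketch of how one spreadsheet row might end up as the body of the POST. It assumes form-encoded name=value pairs, as in createRequest(); buildFormBody is a hypothetical helper name, and the column names and cell values are invented for illustration rather than taken from the Georgia Tech spreadsheet.

```javascript
// Hypothetical helper: turn column names plus one row of cell values
// into a form-encoded body for a Sling POST.
function buildFormBody( columnNames, row, titleIndex ) {
        var request = {};
        request[ ":nameHint" ] = row[ titleIndex ];   // hint for naming the new node
        request[ "sling:resourceType" ] = "films";    // same resource type as in persistRow()
        for ( var i = 0; i < columnNames.length; i++ ) {
                request[ columnNames[ i ] ] = row[ i ];
        }
        var pairs = [];
        for ( var name in request ) {
                pairs.push( encodeURIComponent( name ) + "=" +
                        encodeURIComponent( String( request[ name ] ) ) );
        }
        return pairs.join( "&" );
}

// Made-up sample data (an assumption, for illustration only)
var body = buildFormBody( [ "Year", "Title" ], [ "1991", "Terminator 2" ], 1 );
```

Every column becomes one property on the new node, while :nameHint and sling:resourceType travel in the same body as ordinary form fields.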

And that's about it: Now you know how to shred a spreadsheet (say that 3 times in a row fast...) and store the results in CRX, using OpenOffice.


Open source project filtering
Post by sggottlieb
Roberto Galoppini has an interesting case study on selecting an open source project management tool. In it, he describes his SOS Open Source methodology for filtering open source projects by looking at a number of factors organized into three categories: sustainability, industrial strength, and project strategy. The case study doesn’t go into much [...] Related posts:
  1. Another Open Source Project Management Tool A few months ago I was looking for a...
  2. Evaluating open source and closed source software Gartner has been saying how the current recession favors...
  3. Honest Open Source I just read Kris Buytaert’s blog post “Honest Open...
Link to original post
Learning about ESP vs. JSP in Sling
Post by dev.day.com
The first version of this post originally was published here.

Lately I've been doing a fair amount of server-side scripting using ESP (ECMAScript Pages) in Sling. At first blush, such pages tend to look a lot like Java Server Pages, since they usually contain a lot of scriptlet markup, like:

<%  // script code here  %>

and

<%=  // stuff to be evaluated here  %>

So it's tempting to think ESP pages are simply some different flavor of JSP. But they're not. From what I can tell, ESP pages are just server pages that get handed to an EspReader before being served out. The EspReader, in turn, handles the interpretation of scriptlet tags and expression tags (but doesn't compile anything into a servlet). Bottom line, ESP is not JSP, and despite the availability of scriptlet tags, things work quite a bit differently in each case.

Suppose you want to detect, from an ESP page or a JSP page, what kind of browser a given page request came from. In a Sling JSP page you could do:

<%@taglib prefix="sling" uri="http://sling.apache.org/taglibs/sling/1.0" %>

<sling:defineObjects/>
<html><body>

<%
java.util.Enumeration c = request.getHeaders("User-Agent");

String s = "";

while ( c.hasMoreElements() )
    s += c.nextElement();
%>

<%= s %>
</body></html>

But what do you do in ESP? Remember, <sling:defineObjects/> is not available in ESP.

It turns out that Sling automatically (without the need for any directives) exposes certain globals to the JavaScript Context at runtime, and one of them is a request object. Thus, in ESP you'd simply do:

<%

var c = request.getHeaders("User-Agent");

var s = "";

while ( c.hasMoreElements() )
    s += c.nextElement();

%>

<%= s %>

Very similar to the JSP version.
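Outside Sling there is no request global, so to try the loop on its own you can wrap it in a plain function and feed it a stand-in object. The sketch below does that; collectHeader and fakeRequest are hypothetical names, and the stand-in only mimics the two Enumeration methods the loop uses. In a real ESP page you would pass the request object that Sling supplies.

```javascript
// Concatenate all values of one header from an object that exposes
// getHeaders( name ), returning something Enumeration-like.
function collectHeader( request, name ) {
        var values = request.getHeaders( name );
        var s = "";
        while ( values.hasMoreElements() )
                s += values.nextElement();
        return s;
}

// Stand-in for the Sling request global (an assumption, for testing only)
function fakeRequest( headers ) {
        return {
                getHeaders: function( name ) {
                        var list = headers[ name ] || [];
                        var pos = 0;
                        return {
                                hasMoreElements: function( ) { return pos < list.length; },
                                nextElement: function( ) { return list[ pos++ ]; }
                        };
                }
        };
}

var agent = collectHeader(
        fakeRequest( { "User-Agent": [ "ExampleBrowser/1.0" ] } ), "User-Agent" );
```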

So the next question I had was, what are the other globals that are exported into the JavaScript runtime scope by Sling? From what I can determine, the Sling globals available in ESP are:

currentNode
currentSession
log
out
reader
request
resource
response
sling

currentNode is the JCR node underlying the current resource; currentSession is what it sounds like, a reference to the current Session object; log refers to the org.slf4j.Logger; reader returns request.getReader(), which allows for reading the request body; request is a reference to the SlingHttpServletRequest; resource is the current Resource; response is, of course, a reference to the SlingHttpServletResponse; and sling is a SlingScriptHelper. All of these are available all the time, throughout the life of any ESP script in Sling.

The nice part about server-side scripting in Sling (one of many nice parts), incidentally, is that you don't have to choose to do just ESP pages or just JSP; you can write an ESP handler for one situation and a JSP for another, and use ESP/JSP in any combination. You're not locked into one technology or the other.

For more information, try the Sling Javadocs here or Day's page of resources here (note, in particular, the list of References on the right).


New ASF board elected
Post by dev.day.com
The current board of directors of the Apache Software Foundation has just been elected - congratulations to:

  • Shane Curcuru
  • Doug Cutting
  • Bertrand Delacretaz
  • Roy T. Fielding
  • Jim Jagielski
  • Sam Ruby
  • Noirin Shirley
  • Greg Stein
  • Henri Yandell

Roy and Bertrand are colleagues of mine at Day Software.

To find out more about what the board actually does, have a look at "How the ASF works".


Serving the long tail of information needs with social intranets
Post by Oscar Berg
“Flexible access to people and resources can be enormously powerful in a world driven by changes that, more often than not, lead us in unanticipated directions…we need to become more adept at ‘capability leverage’ – finding and accessing complementary capabilities, wherever they reside in the world, to deliver more value.”  
- From “The Power of Pull” by J Hagel, J S Brown, L Davidson 
Businesses, in particular in the Western world, are becoming more and more knowledge-intensive with an increasing part of the workforce engaged in knowledge-based work. A study by The Work Foundation has estimated that we have a 30-30-40 workforce - 30 per cent in jobs with high knowledge content, 30 per cent in jobs with some knowledge content, and 40 per cent in jobs with less knowledge content.

Knowledge work is about such things as solving problems, performing research and creative work, interacting and communicating with other people, and so on. Such work is by nature less predictable and repeatable than traditional industry work (transformational and transactional activities organized into repeatable processes). Both the inputs and outputs of knowledge work – which are information and knowledge – vary from time to time, from situation to situation. So do the purpose, activities, roles and resources involved in knowledge work. Knowledge work is also less structured, and its structure typically emerges as the work proceeds.

In a knowledge-intensive business environment, it is often very hard or even impossible to anticipate in advance what information is needed. You simply cannot know what information will be relevant before the moment you need it. The information might not exist until the moment you need it, or you are simply unaware of its existence. That’s why more is better (“more is more”) when it comes to information supply in a knowledge-intensive business environment. If there is more to choose from, chances are there will be something for (almost) any need. That’s also why it has become critical for knowledge workers to have access to the information abundance on the Internet. We also need to have immediate access to anyone who might possess the knowledge and information we need but which is not captured – often because it is hard to capture or simply does not allow itself to be captured (tacit knowledge) and exchanged.

There’s a long tail of information needs that still needs to be served

Assuming we have a long tail of diverse, constantly changing and virtually unlimited information needs, we need to do what can be done to serve these needs in some way or another. The problem is that the information resources that most businesses choose to produce and provide access to are not aimed at serving these infrequent, uncertain and constantly changing information needs. Let’s use the Long Tail power graph below to illustrate and further expand this reasoning.



In the left end of the power graph we have the information resources which are most frequently used because they are serving frequently recurring information needs. The information which is needed for transformational and transactional activities - but also administrative knowledge work - is likely to be served by information resources in the left part of the Long Tail power graph. This information does not change very often and thus can be quite easily reused. It’s the kind of information used for a commonly performed activity, which means that the need is predictable. An information need that has occurred once will for certain occur again. This allows us to define, design and produce the type and structure of the information as well as the actual information before the next time the information need arises (the activity is performed).

Knowledge work is often a completely different story. While the information used as input to an activity or process is likely to be found in the left part of the Long Tail power graph, the information needed for a knowledge work activity is likely to be found in the long tail. There you have information resources which are used infrequently or maybe only once. The information which is needed varies from time to time, from situation to situation. Not only does the actual information vary; often the type and structure of the information resource vary too. This makes it virtually impossible to define a reusable information resource before it is needed.

The unpredictable nature of knowledge work is why we need to give knowledge workers access to all information that exists and that might be relevant. Since we don’t know what might be relevant until a certain need arises (which we might never be aware of until we discover certain information), we can’t really put the relevant information in one “for keeps” pile and all other information in another “to be trashed” pile. We also need to provide them with tools so they can create, capture and exchange information with each other, or else there will not be enough information available to serve the knowledge workers’ information needs. To help people find and discover information that is relevant to their tasks when they need it, we also need to create powerful pull mechanisms which allow relevant information to automatically surface and be placed at the fingertips of knowledge workers just when they need it.

Traditional intranets are not designed for knowledge work

This leads me to the changing role of intranets in knowledge-intensive businesses. These intranets need to provide flexible access to both information and people by employing pull models for serving as many knowledge worker information needs as possible, including unanticipated information needs. Information supply needs to be maximized by supporting the creation of and access to user-generated content as well as by allowing for easy integration of external information sources. The intranet needs to be turned into an “information broker platform” where information is freely and easily created, aggregated, shared, found and discovered at minimal effort. Such an intranet gives everybody access to all information which is available and makes room for virtually infinite amounts of information.

However, most of today’s intranets primarily consist of pre-produced information artifacts which are intended to serve information needs which can be anticipated in advance. They aim to serve people who perform predefined and repeatable tasks. They are push platforms. As such they might work well for repeatable routine work where the information needs can be defined in advance, but they are quite dysfunctional for knowledge work. It’s not a coincidence that many knowledge workers find it much easier to find information on the web than in their internal systems and that the intranet plays a marginal role in their daily work.

The information that knowledge workers need can often not be anticipated and served by a push-based intranet. It is also critical that they have access to ALL information that is available, including collaborative content produced by teams, content produced by external resources, tacit knowledge captured in conversations, and so forth. Since the information artifacts on an intranet typically are produced by a relatively small part of the organization’s total workforce, the resources available for producing these artifacts are limited. A line needs to be drawn between information needs which can be served and those which cannot be served. A common approach is to identify the most common information needs and focus available resources on serving these needs as well as possible. Assuming that the resources for producing and maintaining information resources are scarce, this is a seemingly feasible approach. But it’s not a feasible approach for an intranet that needs to serve the information needs of knowledge workers.

The problem here is that traditional intranets are based on a production model which says that all information artifacts on the intranet must be produced in advance (only serving information needs which can be anticipated) by a small subset of all available resources (employees) and that the entire body of information needs to be supervised by a few people for the purpose of controlling the message, format and/or organization of the information resources.

Knowledge workers need a social intranet 

There are plenty of definitions trying to define what a social intranet is, but most of the ones I’ve seen have not been able to see beyond tools and technologies. They don't succeed in describing the paradigm change that is transforming intranets into something completely different from what they are today.

The social intranet is not just about adding a layer of social collaboration tools; it is a platform that combines the powers of push with the powers of pull to supply anyone who participates and contributes within an extended enterprise with the information, knowledge and connections they need to make the right decisions and act to fulfill their objectives. It equips everyone with the tools that allow them to participate, contribute, attract, discover, find and connect with each other to exchange information and knowledge and/or collaborate. It connects information demand with information supply in knowledge-intensive businesses, something which can only be done by involving all employees in the information supply, removing bottlenecks created by the production model (such as approval workflows and the requirement that everything fit into a central taxonomy) and enabling employee-to-employee information exchange.

When it comes to information supply, the previously dominating "less is more" paradigm is being replaced by a "more is more" paradigm. A social intranet must necessarily be designed for information abundance. The increasing volume of information resources needs to be seen as an opportunity to be embraced rather than as a problem – a problem which can only be solved by reducing the body of information down to an amount which can be managed by a few people (relative to the entire population of the extended enterprise).

Although too many options can decrease your performance and create stress, information abundance does not equal an abundance of choice; the social intranet is a pull platform with mechanisms for automatically attracting relevant information and people to you. What’s important is that the options you are presented with are relevant and usable. But that’s another issue. The point is that if the information you need is not there in the first place, chances are that none of the options you are presented with will do. That’s of course an unwanted situation, as you might not be able to perform your task or you might make an incorrect decision that can have serious consequences. Deliberately hindering information from reaching people is not the way to avoid the sensation commonly called information overload because, as Clay Shirky argues, the problem is not the amount of information but rather that the filters we have fail to sort it properly for us. We need to get the filters in place instead of demonizing information supply.

The social intranet also has an important part to play when it comes to supporting serendipity: enabling people to find both information and people they didn’t know they were looking for. To do so it must have mechanisms that allow information and people that might be useful to us to be pulled to us. Spending time and effort searching for relevant information and people where there is information abundance just won’t pay off. We must have ways that “automagically” attract useful information and connections to us. We just need to implicitly and explicitly share what we do and know with other people in our networks, with people who share our interests, or with people who happen to pass us by at any other kind of crossroads.

Needless to say, the push-based production model used for most intranets will still have an important role to play - but only as a component within a social intranet. It will continue to serve the most common, stable and predictable information needs. Even though it is important and sometimes critical that these can be served efficiently and effectively, the greatest value that can be created with the use of an intranet relies on the long tail of information. This is because the long tail of information supports the core of a knowledge-intensive modern business: the knowledge work.



Link to original post

Day’s acquisition by Adobe: point of view of a competitor
Post by tristan renaud
No fiesta
Every time an acquisition is announced, it is time to celebrate. Well, it depends on whom you ask.
I have learnt to love acquisitions as a shareholder of the acquired company. I have learnt, several times, to rebuild teams, business and clients’ trust on the ruins of acquired companies as a manager. I have learnt how tough, and how rare, it is to acquire a company successfully. Whether you are an investor, an employee or whatever, it is always a challenge.

Mission accomplished?
So I am not the kind to say “congrats” during an acquisition, except to the team in charge of the acquisition, of course. They have “accomplished the mission”, and anyone who has been in such a business knows how tricky it can sometimes be. So yes, congrats to the management team of Day Software: they have sold the company very successfully.
But from my point of view, there is not so much to celebrate during an acquisition, especially for many of the employees and for many of the clients. The most difficult part is still to come. Acquisition does not matter so much compared to integration. It is like celebrating the signature of a contract, whereas what matters is the go-live of the project.
 

Acquisition means “risky business”
OK, let’s write “challenging” instead of “risky” to be positive.
I have not read much today about how a mid-size company with a strong taste for Swiss attitude and open source philosophy will fit into a mainly US-centric international proprietary software company. I have also learnt that during acquisitions people matter, but actually not so much. Not because you don’t want them to, but because you cannot take too much care of the details, and that’s what individuals are during an acquisition. Of course some individuals will find great opportunities thanks to the deal; that is not the point. The point is that an acquisition is not a fiesta for everyone, far from it: it mainly means “risky business”.
And one of the key advantages of Day has been several very gifted, skilled and committed individuals. I am wondering how Adobe will handle the famous “The surprising truth about what motivates us”.
Day Software was one of the rare independent high-end WCM vendors and will now be just one of the products of a major software company. That’s a major evolution. Uncertainty will prevail for weeks if not months, as with any similar acquisition, at least for many employees, even if, of course, Adobe may argue the opposite. Many are speculating and will speculate about the product integration, the strategic fit, the risks, the advantages, the constraints and so on. I don’t really care so much: my employer is a competitor, so I have to focus on my clients, my team and our own product. But to finish my post on a lighter note, I cannot help encouraging the shareholders of Day to agree to this acquisition. Let me be more specific:

Four Reasons to sell your shares to Adobe
1. The price is right
The company’s share price was extremely high before the acquisition, and I assume the market was expecting something irrational, sorry, I meant “exceptional”. And the market is so often right, so now it really is time to do something, guys. Frankly, I am impressed. I could write for pages about why this valuation looks so unreal to me (maybe because I come from a different world), but frankly, just sell your shares at that price. To exaggerate a little, you are more likely to be hit by an asteroid tomorrow morning than to get a better deal soon.
2. That’s good for my business
Everybody knows an acquisition is always good for the competition’s business, at least for the months to come. Beyond a year or so, as usual in this kind of business, it does not matter so much; that’s another time frame, and we always have to face competitors of many kinds: that’s business. What matters is adaptation and anticipation, not conjecture. So for the time being, I believe this is good news for my employer’s business development, as we have met Day Software very frequently in both Europe and North America these last months.

3. You will give Adobe a cooler image
I really like the spirit and the values of the Day team. I don’t believe in miracles, but I hope they will influence Adobe somewhere in a positive way. At least, they won’t be a bad influence!
4. This summer is as boring as any summer, thanks for the show
For different reasons, and somewhat surprisingly, daily business is always thriving during July and August, but market news is usually very boring during these sunny weeks, in Europe too. At last, some news really worth talking about during the summer break. Cool. And just before the CMS geek up sessions! Very cool.

So please do me a favor: just sell your shares to Adobe.

Further readings:

Analysis, interesting comments and some (inevitable) speculation from our WCM industry gurus: Adobe to acquire Day – First Take ECM perspective

A great critique, as usual, from Seth Gottlieb: Will Day stay committed to web standards under Adobe’s ownership?

CMS Wire article: Web CMS: Adobe Buys Day Software for US$ 240 Million

Boris-Magnolia’s own publicity but with several excellent remarks: http://www.betterfasterbigger.com/2010/07/day-to-be-acquired-by-adobe.html

Jon on tech blog post: http://jonontech.com/2010/07/28/a-fine-day-for-adobe/

Jeff Potts who was very quick to write about the acquisition: Adobe acquires Day Software for $240 million

 

 

 



Link to original post
CRX Gems: Rendering content as PDF + XFDF
Post by dev.day.com
I promised last time to show a simple way to render CRX content as PDF. The technique in question involves using a PDF form as the readymade container, into which form data is imported using XFDF. The latter is the XML version of Adobe's Forms Data Format, which in turn is a file format specifically designed to allow import and export of data to and from PDF forms.

The way it works is simple: Suppose you have a PDF form that you want to populate with data. You merely need to create a small data file (in XFDF format) and put it on the server. When a user requests the data file (which has a mimetype of "application/vnd.adobe.xfdf"), Acrobat Reader (or the Reader browser plug-in) detects the fact that form data will need to be imported into a form. The XFDF file itself contains a pointer to the actual form to be used. Reader fetches the form, then imports the form data into it, and renders the result as a PDF file containing the data. It all happens transparently to the user, and the user need only have Acrobat Reader (not a full copy of Acrobat Professional).

In the example I'm going to show below, we generate the XFDF file dynamically on the server, via a script called (what else?) xfdf.esp. We'll get to that in a minute.
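For orientation, here is a rough sketch of what such an XFDF data file contains, built as a string in plain JavaScript. It is not the xfdf.esp script itself: buildXfdf and escapeXml are hypothetical helper names, the form URL and field names are made-up assumptions, and the element layout follows my understanding of the XFDF format (an f element pointing at the PDF form, then one field element per value).

```javascript
// Escape the five XML special characters
function escapeXml( text ) {
        return String( text )
                .replace( /&/g, "&amp;" )
                .replace( /</g, "&lt;" )
                .replace( />/g, "&gt;" )
                .replace( /"/g, "&quot;" )
                .replace( /'/g, "&apos;" );
}

// Build an XFDF document: <f href> points at the PDF form,
// and each <field> carries one value to import into it.
function buildXfdf( formUrl, fields ) {
        var xml = '<?xml version="1.0" encoding="UTF-8"?>\n';
        xml += '<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">\n';
        xml += '  <f href="' + escapeXml( formUrl ) + '"/>\n';
        xml += '  <fields>\n';
        for ( var name in fields ) {
                xml += '    <field name="' + escapeXml( name ) + '">';
                xml += '<value>' + escapeXml( fields[ name ] ) + '</value></field>\n';
        }
        xml += '  </fields>\n';
        xml += '</xfdf>\n';
        return xml;
}

// Hypothetical form location and field names (assumptions)
var xfdf = buildXfdf( "http://localhost:7402/content/films/form.pdf",
        { Title: "Terminator 2", Year: "1991" } );
```

The field names have to match the names of the form fields in the PDF, otherwise Reader has nowhere to put the values.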

The example we're going to talk about assumes that there is content in CRX (under a path of /content/films) that looks something like this:

This particular content node is named terminator_2. It lives under /content/films/ in my CRX repository.

Notice, in the above list, that there is a property (at the bottom) called sling:resourceType, set to a value of "films." This tells CRX to look under /apps/films for any scripts that might be necessary to render the content.

In previous blogs, I've shown how to write scripts that render this content as HTML, SVG, or CSV. Right now, what we need is an XFDF renderer. That turns out to be pretty easy to set up.

First, we need to create a PDF form to hold our data. In the Acrobat Professional forms editor, such a form looks like this:


Will Day stay committed to web standards under Adobe’ ownership?
Post by sggottlieb
I JUST heard about Adobe’s acquisition of Day Software and have to admit my first reaction was total disappointment. I always admired Day’s commitment to architecture and standards. Day is one of the few upper upper tier web content management companies to stay focused on the web — not just as a place [...] Related posts:New ECM Interoperability Standard Proposal on AIIM There is a new proposal for a standards based......I JUST heard about Adobe’s acquisition of Day Software and have to admit my first reaction was total disappointment. I always admired Day’s commitment to architecture and standards. Day is one of the few upper upper tier web content management companies to stay focused on the web — not just as a place [...] Related posts:
  1. New ECM Interoperability Standard Proposal on AIIM There is a new proposal for a standards based...
  2. Whatever happened to the URL? Even back when I was developing websites in 1998,...
  3. iECM: Interoperable Enterprise Content Management iECM is a new standard being developed through AIIM...
CRX Gems: Rendering content as CSV
Post by dev.day.com

I've shown how easy it is to push spreadsheet data into CRX (in such a way that there is one content node per row of data, where properties on that node correspond to column data). The reverse is also possible: It's easy to write a script that converts sibling nodes to row data formatted as CSV (comma-separated values per RFC 4180). Such a script, csv.esp, looks something like this:

<%
// Given a list of sibling nodes (presumably
// similar in structure), and an array of
// property names, convert each node
// to one "row" of CSV data, where
// columns correspond to properties.
// We will encode all property data as
// comma-separated values per RFC 4180.
function nodesToCSV( nodes, propertyNames ) {

        var records = new Array( );

        for ( var i = 0; i < nodes.length; i++ ) {

                var aRecord = new Array( );

                // suck in the data for each property:
                for ( var k = 0; k < propertyNames.length; k++ ) {
                        var data = nodes[ i ][ propertyNames[ k ] ];
                        var escaped = escapeData( data );
                        aRecord.push( escaped );
                }
                records.push( aRecord.join( "," ) );
        }

        var CRLF = String.fromCharCode(13) +
        String.fromCharCode(10);

        return records.join( CRLF );
}

// Return an array of property names for this node
function getOrderedProperties( node ) {

        var array = new Array();
        for ( var i in node ) {
                array.push( i );
        }

        return array;
}

// Escape field data per RFC 4180
function escapeData( data ) {

        // replace " with ""
        data = String(data).replace( /"/g, "\"\"" );

        // if data contains comma, CRLF, or "
        // we need to wrap the entire thing in double quotes
        var escapables = /,|(\r\n)|"/;
        if ( data.match( escapables ) ) {
                return "\"" + data + "\"";
        }

        return data;
}
// get the child nodes and a list of property names
var nodes = currentNode.getNodes( );
var propertyNames = getOrderedProperties( nodes[0] );
// keep the tags adjacent so no stray newlines precede the CSV data
%><%= nodesToCSV( nodes, propertyNames ) %>
 

The rules for escaping data for CSV are extremely simple. First, any data string that contains the double-quote (") character needs to have each such character converted to two double-quotes (""). Secondly, if the data contains a comma, the entire data string needs to be wrapped in quotation marks. The same is true for any data that contains double-quotes or line breaks (which RFC 4180 defines as CRLF -- carriage return followed by linefeed). The following very simple function enforces these escaping rules:

// Escape field data per RFC 4180
function escapeData( data ) {

   // replace " with ""
   data = String(data).replace( /"/g, "\"\"" );
 
   // if data contains comma, CRLF, or "
   // we need to wrap the entire thing in double quotes  
   var escapables = /,|(\r\n)|"/;
   if ( data.match( escapables ) )
      return "\"" + data + "\"";
      
   return data;
}
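
To see the rules in action, here is a small sketch that can be run outside CRX (for example in Node.js or a browser console). It repeats the logic of the function above and feeds it a few made-up sample strings; the sample values are purely illustrative and are not taken from the film data.

```javascript
// Escape field data per RFC 4180 (same logic as above)
function escapeData( data ) {
    // replace " with ""
    data = String( data ).replace( /"/g, "\"\"" );
    // wrap in double quotes if the data contains a comma, CRLF, or "
    if ( /,|(\r\n)|"/.test( data ) ) {
        return "\"" + data + "\"";
    }
    return data;
}

// Made-up sample values, for illustration only
var samples = [ "plain text", "Smith, John", "say \"hi\"", "line one\r\nline two" ];
var escaped = [];
for ( var i = 0; i < samples.length; i++ ) {
    escaped.push( escapeData( samples[ i ] ) );
}
// escaped[0] is unchanged; the other three come back wrapped in double quotes
```

Note that a field containing a bare linefeed without a carriage return passes through unquoted, since the check looks only for the CRLF pair.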

The function that actually converts nodes to records is very straightforward as well:

function nodesToCSV( nodes, propertyNames ) {

   var records = new Array( );

   for ( var i = 0; i < nodes.length; i++ ) {

      var aRecord = new Array( );

      // suck in the data for each property:
      for ( var k = 0; k < propertyNames.length; k++ ) {
         var data = nodes[ i ][ propertyNames[ k ] ];
         var escaped = escapeData( data );
         aRecord.push( escaped );
     }
      records.push( aRecord.join( "," ) );
   }

   var CRLF = String.fromCharCode(13) +
                    String.fromCharCode(10);

   return records.join( CRLF );
}

Note that we need to explicitly provide the function a list of property names, rather than (say) let the function iterate through property names on an introspective basis. The reason for this is that if we simply try gathering property names with a for/in loop, we will get back property names in no particular order. And the order will, in fact, vary from content node to content node even if all of the content nodes have properties with exactly the same names. The unorderedness of the properties (as obtained through simple iteration) would scramble the column data in our CSV file. We don't want that. Hence, we pass in an array of property names, and march through the array in orderly fashion when pulling property data from each node.
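
As a minimal sketch of that idea, the following passes a fixed list of property names and also writes a header row first (the header row is an addition, not something csv.esp does). It runs outside CRX on plain objects standing in for content nodes; the `toCSV` name, the sample records, and the chosen column list are assumptions for illustration.

```javascript
// Escape field data per RFC 4180
function escapeData( data ) {
    data = String( data ).replace( /"/g, "\"\"" );
    return /,|(\r\n)|"/.test( data ) ? "\"" + data + "\"" : data;
}

// Convert records to CSV, taking columns in the order given by propertyNames.
// A header row is written first (an addition not present in csv.esp).
function toCSV( records, propertyNames ) {
    var lines = [];
    var header = [];
    for ( var h = 0; h < propertyNames.length; h++ ) {
        header.push( escapeData( propertyNames[ h ] ) );
    }
    lines.push( header.join( "," ) );
    for ( var i = 0; i < records.length; i++ ) {
        var fields = [];
        for ( var k = 0; k < propertyNames.length; k++ ) {
            fields.push( escapeData( records[ i ][ propertyNames[ k ] ] ) );
        }
        lines.push( fields.join( "," ) );
    }
    return lines.join( "\r\n" );
}

// Hypothetical stand-ins for content nodes; note the differing key order
var records = [
    { Title: "Film A", Director: "Doe, Jane", Year: 1990 },
    { Year: 1991, Title: "Film B", Director: "Roe" }
];
var csv = toCSV( records, [ "Title", "Director", "Year" ] );
```

Because the column order comes from the array rather than from each object, both rows line up under the same headings even though their keys were defined in a different order.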

When I placed csv.esp in my repository under /apps/films and then navigated to http://localhost:7402/content/films.csv, CRX dutifully fired my script and produced a CSV file containing all of the data from my /films content nodes, causing my browser (in turn) to inform me that I was downloading a file of type "csv" (it then asked me what program I wanted to use to open the file; I specified scalc.exe, and OpenOffice dutifully loaded the file as a spreadsheet).

So far, I've shown how to render /films data as HTML, SVG, and CSV. Next time, I want to show a simple trick for rendering the data as PDF. It's easier than you think!


[LOTD] IBM's approach to JCR text search
Post by dev.day.com

It's always good to get a glimpse into the approaches taken by non-OSS JCR implementations: In a recent technical article on the developerWorks website, Malarvizhi Kandasamy describes how IBM goes about JCR fulltext search. The actual engine is Juru, which is a Java library developed by the IBM Haifa research lab. According to the article, Juru is capable of some natural language processing like stemming or finding similar spellings.

IBM uses a JCR compliant repository in a number of their products, e.g. Lotus Web Content Management or WebSphere Portal.


CRX Gems: Rendering content as HTML and SVG
Post by dev.day.com

A few days ago, I talked about how to "shred and store" a spreadsheet -- i.e., how to push rows of a spreadsheet into individual nodes in CRX (one node per row, with column data stored as properties). I also gave JavaScript code for doing this in an OpenOffice macro. For testing purposes, I used the CSV file a1-film.csv, representing 1741 movies catalogued by Georgia Tech's College of Computing.

After running my OpenOffice macro on the Georgia Tech CSV file, my CRX repository now contains movie data (Title, Director, Year, etc.) for 1741 films, each film with its own nt:unstructured node under the path /content/films/. In the CRX Content Explorer, a given node (in this case, the node at http://localhost:7402/content/films/terminator_2) looks something like this:


Keeping your content DRY
Post by sggottlieb
After over 10 years of working in content management, I have come to realize that there is only one way to learn the value of managing structured information: the hard way, and that way is only 50% effective. People can intellectually accept concepts like content re-use and content/layout separation, but in the heat [...]
Related posts:
  1. Migrating Content There has been a great thread on the CM...
  2. Are content managers ready for personalization? I have been catching up on product demos recently...
  3. A Content Management Definition I just heard Frank Gilbane define Content Management as...
CRX Gems: CRXDE Lite
Post by dev.day.com

Since version 2.0, CRX comes with CRXDE Lite (CRX Development Environment - Lite), a web-based tool to ease the development of CRX-based applications. CRXDE Lite is implemented using the ExtJS JavaScript library and aims to replace the CRX 1.x Content Explorer with a modern AJAX-based repository editor and browser. It also provides improved means for searching and code editing, along with integrations for code version management and handling of non-scripted code. As a tool primarily for developers, it also comes with server-side development features such as compilation of Java code, OSGi bundle creation and autodeployment, a project wizard, etc.


Work Breakdown Structure vs. Deadlines
Post by sggottlieb
One of the most common points of friction between project managers and developers is planning work. Most programmers hate creating work breakdown structures (WBS). You can’t blame them: accurately predicting steps and effort required to build undesigned software is impossible. Yes, you heard that right. Software development planning is impossible [...]
Related posts:
  1. Plone Strategic Summit results posted Notes and action items from the Plone Strategic Planning...
  2. Fixed bid implementation work: a marriage made in Vegas Most of my CMS selection clients are not just...
  3. How I use Twitter for Work Publishing Decision Tree V2 Originally uploaded by sggottlieb I...
CRX Gems: Using an OpenOffice macro to store spreadsheet data in CRX
Post by dev.day.com

In a recent blog, I talked about how easy it is to store snippets of text from OpenOffice in a CRX repository using a little bit of JavaScript and the Sling REST API. While being able to store arbitrary bits of text this way is certainly useful, it would be even more useful to be able to store spreadsheet data. Of course, storing a spreadsheet in CRX, per se, is not much of a challenge: with WebDAV, it's a matter of drag and drop. But storing an entire spreadsheet as a single monolithic content item doesn't necessarily give you the greatest content-management bang for the buck. Often, what you really want to do is granularize the spreadsheet into records (or row data), and store individual rows as content items. (You could take it further and store individual cells as content items, but that would probably be overkill for most situations, although there's certainly nothing preventing you from doing it.)

In the database world, where decisions often have to be made as to how best to decompose an XML document when mapping it to tables in a database, this general process (of decomposing a large document along the lines of its natural internal fine-structure) is known as shredding. What would be handy is to have an OpenOffice macro that could shred a spreadsheet into rows, and push the rows into nodes in CRX. That's what I propose to show you right now.

It turns out to be pretty easy to parse a spreadsheet in an OpenOffice macro. Using JavaScript:

   // First, get the document object
   // from the scripting context
   oDoc = XSCRIPTCONTEXT.getDocument();

   // Next, get the XSpreadsheetDocument
   // interface from the document
   xSDoc = UnoRuntime.queryInterface(XSpreadsheetDocument, oDoc);

   // Then get a reference to the sheets for this doc
   var sheets = xSDoc.getSheets();

   // get Sheet1
   var sheet1 = sheets.getByName("Sheet1");

Once you've gotten the sheet reference, you can use it to obtain a cell reference:

var cell = sheet1.getObject().getCellByPosition( column, row );

The cell, in turn, contains data, which (depending on whether you're dealing with a native OpenOffice spreadsheet versus a freshly imported CSV file) can be a floating-point value, a string, or something else. For purposes of this discussion I'm going to assume that you've just imported a CSV or tab-delimited file into OpenOffice, in which case all cells will automatically contain string data. To get the string data from a cell in a freshly imported CSV file, you have to do:

var content = cell.getFormula();

At least, that's what works in OpenOffice 3.2.

The general plan of attack, then, is to come up with a function that can parse a row's worth of data out of a spreadsheet; and have another function that can persist a row of data as a content item in CRX. Then it should be possible to create a macro that simply loops over all rows in a spreadsheet and pushes them out to the repository.

The row-parsing function is pretty straightforward:

function getRow( sheet, rownumber, startColumn, endColumn )  {

    var obj = sheet.getObject();
    var record = [];

    for (var k = startColumn; k < endColumn ; k++) {
         var cell = obj.getCellByPosition( k, rownumber );
         var content = cell.getFormula();
         record.push( content );
    }

    return record;
}

Given a reference to a Sheet, along with a row number and the starting and ending column numbers, this function loops through cells and pushes cell values into an array. The returned array represents a row's worth of data.

To persist a row to CRX, we have a function that looks like this:

function persistRow( sheet, rownumber, startColumn, endColumn ) {

   // get first row of data (column names)
   var columnNames = getRow( sheet, 0, startColumn, endColumn );

   // get specified record
   var row = getRow( sheet, rownumber, startColumn, endColumn );

   // build the request
   var request = {};
   request[":nameHint"] = row[2]; // Title
   request["sling:resourceType"] = "films";
   for ( var i = 0; i < columnNames.length; i++) {
       request[ columnNames[ i ] ] = row[ i ];
   }   
   var data = createRequest( request );

   // where to store it
   var url = "http://localhost:7402/content/films/";

   // finally, hit the repository
   var response = doJavaPOST( url, data );

   return response;
}

Notice that the code assumes that the first row of "data" in the spreadsheet contains the column names. This was the case with the spreadsheet I used for testing this macro, namely a1-film.csv, representing 1741 movies catalogued by Georgia Tech's College of Computing. Each row in the spreadsheet has information for a particular film, such as the film's title, the year the film was made, its genre, the name of the director, major actors and actresses, etc.

Without further ado, here is the complete code for the OpenOffice macro:



// Spreadsheet2CRX Macro
// Kas Thomas, 15 July 2010
// Public domain. Use at your own risk.
// Tested with v3.2 of OpenOffice.org

importClass(Packages.com.sun.star.uno.UnoRuntime);
importClass(Packages.com.sun.star.sheet.XSpreadsheetDocument);

// Do a POST
function doJavaPOST( url, content ) {
        var reply = "";
        var responseCode = "";
        try {
                var URL = new java.net.URL( url );
                var urlConn = URL.openConnection( );
                urlConn.setDoOutput ( true );
                urlConn.setRequestMethod( "POST" );
                urlConn.setUseCaches( false );
                urlConn.setRequestProperty( "Content-Type",
                        "application/x-www-form-urlencoded" );
                var printout = new java.io.DataOutputStream(
                        urlConn.getOutputStream( ) );
                printout.writeBytes ( content );
                printout.flush ( );
                printout.close ( );
                responseCode = urlConn.getResponseCode();
        }
        catch(exception) {
                java.lang.System.out.println( exception.toString() );
        }

        return responseCode;
}

// munge together the form data
// into "name1=value1&name2=value2" etc
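function createRequest( object ){

        var data = [];
        for ( var i in object ) {
                // URL-encode names and values so that characters such as
                // & and = in the cell data cannot corrupt the form body
                data.push( java.net.URLEncoder.encode( String( i ), "UTF-8" ) +
                        "=" +
                        java.net.URLEncoder.encode( String( object[ i ] ), "UTF-8" ) );
        }

        var dataString = data.join( "&" );
        return dataString;
}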

// Modal dialog with OK/cancel and a text field
function prompt( msg ) {
        var swing = Packages.javax.swing;
        var text = swing.JOptionPane.showInputDialog(
        new java.awt.Frame(), msg );
        return ( null == text ) ? "" : text; // always return a string
}

// a Swing UI for displaying console info
function EditorPane( ) {

        Swing = Packages.javax.swing;
        this.pane = new Swing.JEditorPane("text/html","" );
        this.jframe = new Swing.JFrame( );
        this.jframe.setBounds( 100,100,500,400 );
        var editorScrollPane = new Swing.JScrollPane(this.pane);
        editorScrollPane.setVerticalScrollBarPolicy(
        Swing.JScrollPane.VERTICAL_SCROLLBAR_ALWAYS);
        editorScrollPane.setPreferredSize(new java.awt.Dimension(250, 250));
        editorScrollPane.setMinimumSize(new java.awt.Dimension(10, 10));
        this.jframe.setVisible( true );
        this.jframe.getContentPane().add( editorScrollPane );

        // public methods
        this.getPane = function( ) { return this.pane; }
        this.getJFrame = function( ) { return this.jframe; }
}

function getRow( sheet, rownumber, startColumn, endColumn )  {

        var obj = sheet.getObject();
        var record = [];

        for (var k = startColumn; k < endColumn ; k++) {
                var cell = obj.getCellByPosition( k, rownumber );
                var content = cell.getFormula();
                record.push( content );
        }

        return record;
}

function persistRow( sheet, rownumber, startColumn, endColumn ) {

        // get first row of data (column names)
        var columnNames = getRow( sheet, 0, startColumn, endColumn );

        // get specified record
        var row = getRow( sheet, rownumber, startColumn, endColumn );

        // build the request
        var request = {};
        request[":nameHint"] = row[2]; // Title
        request["sling:resourceType"] = "films";
        for ( var i = 0; i < columnNames.length; i++) {
                request[ columnNames[ i ] ] = row[ i ];
        }
        var data = createRequest( request );

        // where to store it
        var url = "http://localhost:7402/content/films/";

        // finally, hit the repository
        var response = doJavaPOST( url, data );

        return response;
}

( function main( ) {

        //get the document object from the scripting context
        oDoc = XSCRIPTCONTEXT.getDocument();

        //get the XSpreadsheetDocument interface from the document
        xSDoc = UnoRuntime.queryInterface(XSpreadsheetDocument, oDoc);

        // get a reference to the sheets for this doc
        var sheets = xSDoc.getSheets();

        // get Sheet1
        var sheet1 = sheets.getByName("Sheet1");

        // construct a new EditorPane
        var editor = new EditorPane( );
        var pane = editor.getPane( );

        var size = prompt("Enter total rows and total columns, separated by a comma (e.g., '100,8')");
        if ( !size ) {
                return "No row/column info supplied.";
        }

        var rows = Number( size.substring(0,size.indexOf(",")) );
        var cols = Number( size.substring( size.indexOf(",")+1) );

        var errors = 0;
        for ( var i = 1; i <= rows; i++) {
                var response = persistRow( sheet1, i, 0, cols );
                var text = pane.getText();
                pane.setText( text + "\nProcessing: " + i );
                // count server errors (HTTP status codes starting with 5)
                if ( response.toString().indexOf("5")==0 ) {
                        errors++;
                }
                // provide a little bit of throttling:
                java.lang.Thread.sleep( 200 );
        }
        pane.setText( pane.getText() + "\n" + errors + " errors" );
})();




You'll notice that the code creates a JEditorPane window to act as an error console. When you run the macro, a JOptionPane dialog appears, asking you to supply the number of rows and columns in the spreadsheet. (For the Georgia Tech spreadsheet, you can enter "1741,8", minus quotes.) Once you dismiss the dialog, the code goes to work looping over all the rows in the spreadsheet, posting each row to CRX at a path of http://localhost:7402/content/films/.
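
The macro trusts that the text typed into the dialog is well formed. As a small, hedged sketch, the parsing could be pulled into a helper that rejects anything other than two positive whole numbers; `parseSize` is a hypothetical name and is not part of the macro above.

```javascript
// Parse "rows,cols" (e.g. "1741,8") into numbers; return null if malformed.
function parseSize( text ) {
    var match = /^\s*(\d+)\s*,\s*(\d+)\s*$/.exec( String( text ) );
    if ( !match ) {
        return null;
    }
    var rows = Number( match[ 1 ] );
    var cols = Number( match[ 2 ] );
    if ( rows < 1 || cols < 1 ) {
        return null;
    }
    return { rows: rows, cols: cols };
}

var size = parseSize( "1741,8" );   // well-formed input
var bad = parseSize( "1741" );      // missing the column count
```

A caller would check for `null` before starting the loop, so that a typo in the dialog stops the run instead of silently looping zero times.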

Each new node is named according to a :nameHint parameter based on the Title of the film.

Notice also, we designate a sling:resourceType for each node of "films." (This happens in the persistRow() function.) This fact will be important in a later blog when I show how to write server-side scripts that handle various types of requests for film data.
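
To make the request concrete, here is a rough sketch of how the form body for a single row could be assembled. It runs outside OpenOffice; the `buildFormBody` name, the column names, and the sample row are hypothetical. It percent-encodes names and values with the standard `encodeURIComponent` function, so that characters such as `&` or `=` in a title cannot break the `name=value` pairs (inside a macro, `java.net.URLEncoder` would be an alternative).

```javascript
// Join name/value pairs into an application/x-www-form-urlencoded body.
function buildFormBody( params ) {
    var pairs = [];
    for ( var name in params ) {
        pairs.push( encodeURIComponent( name ) + "=" +
                    encodeURIComponent( String( params[ name ] ) ) );
    }
    return pairs.join( "&" );
}

// Hypothetical header row and data row, for illustration only
var columnNames = [ "Year", "Length", "Title" ];
var row = [ "1990", "100", "Cops & Robbers" ];

var request = {};
request[ ":nameHint" ] = row[ 2 ];           // hint used when naming the new node
request[ "sling:resourceType" ] = "films";   // ties the node to scripts under /apps/films
for ( var i = 0; i < columnNames.length; i++ ) {
    request[ columnNames[ i ] ] = row[ i ];
}
var body = buildFormBody( request );
```

Every column becomes one property on the new node, while the two extra parameters control the node's name and the scripts used to render it.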

And that's about it: Now you know how to shred a spreadsheet (say that 3 times in a row fast...) and store the results in CRX, using OpenOffice.


Open source project filtering
Post by sggottlieb
Roberto Galoppini has an interesting case study on selecting an open source project management tool. In it, he describes his SOS Open Source methodology for filtering open source projects by looking at a number of factors organized into three categories: sustainability, industrial strength, and project strategy. The case study doesn’t go into much [...]
Related posts:
  1. Another Open Source Project Management Tool A few months ago I was looking for a...
  2. Evaluating open source and closed source software Gartner has been saying how the current recession favors...
  3. Honest Open Source I just read Kris Buytaert’s blog post “Honest Open...
Learning about ESP vs. JSP in Sling
Post by dev.day.com
The first version of this post originally was published here.

Lately I've been doing a fair amount of server-side scripting using ESP (ECMAScript Pages) in Sling. At first blush, such pages tend to look a lot like Java Server Pages, since they usually contain a lot of scriptlet markup, like:

<%  // script code here  %>

and

<%=  // stuff to be evaluated here  %>

So it's tempting to think ESP pages are simply some different flavor of JSP. But they're not. From what I can tell, ESP pages are just server pages that get handed to an EspReader before being served out. The EspReader, in turn, handles the interpretation of scriptlet tags and expression tags (but doesn't compile anything into a servlet). Bottom line, ESP is not JSP, and despite the availability of scriptlet tags, things work quite a bit differently in each case.

Suppose you want to detect, from an ESP page or a JSP page, what kind of browser a given page request came from. In a Sling JSP page you could do:

<%@taglib prefix="sling" uri="http://sling.apache.org/taglibs/sling/1.0" %>

<sling:defineObjects/>
<html><body>

<%
java.util.Enumeration c = request.getHeaders("User-Agent");

String s = "";

while ( c.hasMoreElements() )
    s += c.nextElement();
%>

<%= s %>
</body></html>

But what do you do in ESP? Remember, <sling:defineObjects/> is not available in ESP.

It turns out that Sling automatically (without the need for any directives) exposes certain globals to the JavaScript Context at runtime, and one of them is a request object. Thus, in ESP you'd simply do:

<%

var c = request.getHeaders("User-Agent");

var s = "";

while ( c.hasMoreElements() )
    s += c.nextElement();

%>

<%= s %>

Very similar to the JSP version.
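
Because the loop has the same shape in both languages, it can be factored into a small helper. The following is a sketch only: `headerValues` and `makeFakeRequest` are hypothetical names, and the fake request imitates just the two Enumeration methods used here so that the snippet can run outside Sling. In an ESP script you would pass the real `request` global instead.

```javascript
// Collect every value of the named header into an array.
function headerValues( request, name ) {
    var values = [];
    var e = request.getHeaders( name );
    while ( e.hasMoreElements() ) {
        values.push( String( e.nextElement() ) );
    }
    return values;
}

// Stand-in for the Sling request object, for running outside Sling only
function makeFakeRequest( headers ) {
    return {
        getHeaders: function( name ) {
            var list = headers[ name ] || [];
            var index = 0;
            return {
                hasMoreElements: function() { return index < list.length; },
                nextElement: function() { return list[ index++ ]; }
            };
        }
    };
}

var fakeRequest = makeFakeRequest( { "User-Agent": [ "ExampleBrowser/1.0" ] } );
var agents = headerValues( fakeRequest, "User-Agent" );
```

Returning an array instead of one concatenated string keeps multiple header values distinguishable; the caller can still join them for display.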

So the next question I had was, what are the other globals that are exported into the JavaScript runtime scope by Sling? From what I can determine, the Sling globals available in ESP are:

currentNode
currentSession
log
out
reader
request
resource
response
sling

currentNode is the JCR node underlying the current resource; currentSession is what it sounds like, a reference to the current Session object; log refers to the org.slf4j.Logger; out is the java.io.PrintWriter used to write to the response; reader returns request.getReader(), which allows for reading the request body; request is a reference to the SlingHttpServletRequest; resource is the current Resource; response is, of course, a reference to the SlingHttpServletResponse; and sling is a SlingScriptHelper. All of these are available all the time, throughout the life of any ESP script in Sling.

The nice part about server-side scripting in Sling (one of many nice parts), incidentally, is that you don't have to choose to do just ESP pages or just JSP; you can write an ESP handler for one situation and a JSP for another, and use ESP/JSP in any combination. You're not locked into one technology or the other.

For more information, try the Sling Javadocs here or Day's page of resources here (note, in particular, the list of References on the right).


New ASF board elected
Post by dev.day.com

The current board of directors of the Apache Software Foundation has just been elected - congratulations to:

  • Shane Curcuru
  • Doug Cutting
  • Bertrand Delacretaz
  • Roy T. Fielding
  • Jim Jagielski
  • Sam Ruby
  • Noirin Shirley
  • Greg Stein
  • Henri Yandell

Roy and Bertrand are colleagues of mine at Day Software.

To find out more about what the board actually does, have a look at "How the ASF works".


Serving the long tail of information needs with social intranets
Post by Oscar Berg
“Flexible access to people and resources can be enormously powerful in a world driven by changes that, more often than not, lead us in unanticipated directions…we need to become more adept at ‘capability leverage’ – finding and accessing complementary capabilities, wherever they reside in the world, to deliver more value.”  
- From “The Power of Pull” by J Hagel, J S Brown, L Davison
Businesses, in particular in the Western world, are becoming more and more knowledge-intensive with an increasing part of the workforce engaged in knowledge-based work. A study by The Work Foundation has estimated that we have a 30-30-40 workforce - 30 per cent in jobs with high knowledge content, 30 per cent in jobs with some knowledge content, and 40 per cent in jobs with less knowledge content.

Knowledge work is about such things as solving problems, performing research and creative work, interacting and communicating with other people, and so on. Such work is by nature less predictable and repeatable than traditional industry work (transformational and transactional activities organized into repeatable processes). Both the inputs and outputs of knowledge work, which are information and knowledge, vary from time to time, from situation to situation. So do the purpose, activities, roles and resources involved in knowledge work. Knowledge work is also less structured, and its structure typically emerges as the work proceeds.

In a knowledge-intensive business environment, it is often very hard or even impossible to anticipate in advance what information is needed. You simply cannot know what information will be relevant before the moment you need it. The information might not exist until the moment you need it, or you may simply be unaware of its existence. That’s why more is better (“more is more”) when it comes to information supply in a knowledge-intensive business environment. If there is more to choose from, chances are there will be something for (almost) any need. That’s also why it has become critical for knowledge workers to have access to the information abundance on the Internet. We also need to have immediate access to anyone who might possess the knowledge and information we need but which is not captured, often because it is hard to capture or simply does not allow itself to be captured (tacit knowledge) and exchanged.

There’s a long tail of information needs that still needs to be served

Assuming we have a long tail of diverse, constantly changing and virtually unlimited information needs, we need to do what can be done to serve these needs in some way or another. The problem is that the information resources that most businesses choose to produce and provide access to are not aimed at serving these infrequent, uncertain and constantly changing information needs. Let’s use the Long Tail power graph below to illustrate and further expand this reasoning.



At the left end of the power graph we have the information resources which are most frequently used because they serve frequently recurring information needs. The information which is needed for transformational and transactional activities - but also administrative knowledge work - is likely to be served by information resources in the left part of the Long Tail power graph. This information does not change very often and thus can be quite easily reused. It’s the kind of information used for a commonly performed activity, which means that the need is predictable. An information need that has occurred once will almost certainly occur again. This allows us to define, design and produce the type and structure of the information, as well as the actual information, before the next time the information need arises (the activity is performed).

Knowledge work is often a completely different story. While the information used as input to an activity or process is likely to be found in the left part of the Long Tail power graph, the information needed for a knowledge work activity is likely to be found in the long tail. There you have information resources which are used infrequently or maybe even only once. The information which is needed varies from time to time, from situation to situation. Not only the actual information varies; often the type and structure of the information resource varies too. This makes it virtually impossible to define a reusable information resource before it is needed.
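As a rough numerical illustration of the head-versus-tail reasoning above, the following sketch assumes that the usage of information resources follows a Zipf-style power law. That distribution, the exponent, the number of resources and the function names are all assumptions made here for illustration; the post itself does not specify them. Under those assumptions the sketch computes how much of total usage falls on the most popular resources and how much falls on the long tail.

```python
def zipf_usage_shares(n_resources: int, exponent: float = 1.0) -> list[float]:
    """Return each resource's share of total usage, ordered by rank.

    Assumes usage of the resource at rank r is proportional to 1 / r**exponent
    (a Zipf-style power law, chosen here purely for illustration).
    """
    weights = [1.0 / (rank ** exponent) for rank in range(1, n_resources + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def head_and_tail_share(shares: list[float], head_size: int) -> tuple[float, float]:
    """Split total usage into the share of the top `head_size` resources and the rest."""
    head = sum(shares[:head_size])
    return head, 1.0 - head


# Hypothetical intranet with 10,000 information resources,
# of which the 100 most used are treated as the "head".
shares = zipf_usage_shares(10_000)
head, tail = head_and_tail_share(shares, head_size=100)
print(f"Top 100 resources: {head:.0%} of usage")
print(f"Remaining 9,900 resources: {tail:.0%} of usage")
```

Under these assumed parameters, the long tail as a whole accounts for a share of usage comparable to that of the head, even though each individual tail resource is rarely used. An intranet that serves only the head would leave that share of needs unserved.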

The unpredictable nature of knowledge work is why we need to give knowledge workers access to all information that exists and that might be relevant. Since we don’t know what might be relevant until a certain need arises (which we might never be aware of until we discover certain information), we can’t really put the relevant information in one “for keeps” pile and all other information in another “to be trashed” pile. We also need to provide them with tools so they can create, capture and exchange information with each other, or else there will not be enough information available to serve the knowledge workers’ information needs. To help people find and discover information that is relevant to their tasks when they need it, we also need to create powerful pull mechanisms which allow relevant information to automatically surface and be placed at the fingertips of knowledge workers just when they need it.

Traditional intranets are not designed for knowledge work

This leads me to the changing role of intranets in knowledge-intensive businesses. These intranets need to provide flexible access to both information and people by employing pull models for serving as many knowledge worker information needs as possible, including unanticipated information needs. Information supply needs to be maximized by supporting the creation of and access to user-generated content as well as by allowing for easy integration of external information sources. The intranet needs to be turned into an “information broker platform” where information is freely and easily created, aggregated, shared, found and discovered at minimal effort. Such an intranet gives everybody access to all information which is available and makes room for virtually infinite amounts of information.

However, most of today’s intranets primarily consist of pre-produced information artifacts which are intended to serve information needs which can be anticipated in advance. They aim to serve people who perform predefined and repeatable tasks. They are push platforms. As such they might work well for repeatable routine work where the information needs can be defined in advance, but they are quite dysfunctional for knowledge work. It’s not a coincidence that many knowledge workers find it much easier to find information on the web than in their internal systems and that the intranet plays a marginal role in their daily work.

The information that knowledge workers need can often not be anticipated and served by a push-based intranet. It is also critical that they have access to ALL information that is available, including collaborative content produced by teams, content produced by external resources, tacit knowledge captured in conversations, and so forth. Since the information artifacts on an intranet typically are produced by a relatively small part of the organization’s total workforce, the resources available for producing these artifacts are limited. A line needs to be drawn between information needs which can be served and those which cannot be served. A common approach is to identify the most common information needs and focus available resources on serving these needs as well as possible. Assuming that the resources for producing and maintaining information resources are scarce, this is a seemingly feasible approach. But it’s not a feasible approach for an intranet that needs to serve the information needs of knowledge workers.

The problem here is that traditional intranets are based on a production model which says that all information artifacts on the intranet must be produced in advance (only serving information needs which can be anticipated) by a small subset of all available resources (employees) and that the entire body of information needs to be supervised by a few people for the purpose of controlling the message, format and/or organization of the information resources.

Knowledge workers need a social intranet 

There are plenty of attempts to define what a social intranet is, but most of the ones I’ve seen have not been able to see beyond tools and technologies. They don't succeed in describing the paradigm change that is transforming intranets into something completely different from what they are today.

The social intranet is not just about adding a layer of social collaboration tools; it is a platform that combines the powers of push with the powers of pull to supply anyone who participates and contributes within an extended enterprise with the information, knowledge and connections they need to make the right decisions and act to fulfill their objectives. It equips everyone with the tools that allow them to participate, contribute, attract, discover, find and connect with each other to exchange information and knowledge and/or collaborate. It connects information demand with information supply in knowledge-intensive businesses, something which can only be done by involving all employees in the information supply, removing bottlenecks created by the production model (such as approval workflows and the requirement that everything must fit in a central taxonomy) and enabling employee-to-employee information exchange.

When it comes to information supply, the previously dominating "less is more" paradigm is being replaced by a "more is more" paradigm. A social intranet must necessarily be designed for information abundance. The increasing volume of information resources needs to be seen as an opportunity to be embraced rather than as a problem – a problem which can only be solved by reducing the body of information down to an amount which can be managed by a few people (relative to the entire population of the extended enterprise).

Although too many options can decrease your performance and create stress, information abundance does not equal an abundance of choice; the social intranet is a pull platform with mechanisms for automatically attracting relevant information and people to you. What’s important is that the options you are presented with are relevant and usable. But that’s another issue. The point is that if the information you need is not there in the first place, chances are that none of the options you are presented with will do. That’s of course an unwanted situation, as you might not be able to perform your task or you might make an incorrect decision that can have serious consequences. Deliberately hindering information from reaching people is not the way to avoid the sensation commonly called information overload, because, as Clay Shirky argues, the problem is not the amount of information but rather that the filters we have fail to sort it properly for us. We need to get the filters in place instead of demonizing information supply.

The social intranet also has an important part to play when it comes to supporting serendipity: enabling people to find both information and people they didn’t know they were looking for. To do so it must have mechanisms that allow information and people that might be useful to us to be pulled to us. Spending time and effort searching for relevant information and people where there is information abundance just won’t pay off. We must have ways that “automagically” attract useful information and connections to us. We just need to implicitly and explicitly share what we do and know with other people in our networks, with people who share our interests, or with people who happen to pass us by at any other kind of crossroads.

Needless to say, the push-based production model used for most intranets will still have an important role to play - but only as a component within a social intranet. It will continue to serve the most common, stable and predictable information needs. Even though it is important and sometimes critical that these can be served efficiently and effectively, the greatest value that can be created with the use of an intranet relies on the long tail of information. This is because the long tail of information supports the core of a knowledge-intensive modern business: the knowledge work.



Link to original post






© 2010 Content Management Connection. All rights reserved.

The Dearing Group LLC. | 214.536.7072