Upgrade time?

Inspired by conversations I had at the Alfresco BeeCon I’ve decided to put down some of my thoughts and experiences about going through the upgrade cycle.

It can be a significant amount of work to do an upgrade, even if you have little or no customization, as you need to check that none of the functionality you rely on has changed or broken so it’s not something to be undertaken lightly.

In my experience there are several main factors in helping decide whether it’s time to upgrade:

  • Time since last upgrade
  • Security patches
  • Bug fixes you’ve been waiting for
  • Exciting new features

Using the example of Alfresco Community Edition I find that a good time to start thinking about this is when a new EA version has been released. This means that the previous release is about as stable as it’s going to get and new features are starting to be included. I know many people are a lot more conservative than this so you’ll have to think about what works with your organization.

Time since last release

This is often the deciding factor as you don’t want to get too far behind in the release cycle, otherwise upgrading can become a nightmare. In a previous job I observed an upgrade project that took over a year to complete despite having considerable resources thrown at it and not having any significant new features added – mostly because of the large gap in versions (although there were some poor customization decisions)

Security patches

It’s always important to review security patches and apply them when appropriate but this is generally much easier to do if you’re on a recent version so this is an argument for keeping reasonably up to date.

Bug fixes

Sometimes a bug fix will make it into the core product and you can remove it from your customizations (a good thing), sometimes it’s almost like a new feature but sometimes it will expose a new bug. Generally a positive thing to have.

Exciting new features

Shiny new toys! It’s always tempting to get hold of interesting new features but, unless there’s a really good reason that you want it, it’s usually best to wait for it to stabilize before moving to production but this can be a reason for a more aggressive release cycle.

My process

This is a little more Alfresco specific but the general points apply.

OK so there’s a nice new, significant, version out – for the sake of argument let’s say 5.2 – and I’m on version 5.0 in production so what do I do.

Wait for the SDK to catch up – this is a bit frustrating as sometimes I only have quite a short window to work on Alfresco and if the SDK isn’t working then I’ll have to go and do something else.

I feel that the release should be being built with the SDK but it does tend to lag significantly behind. At the time of writing Alfresco_Community_Edition_201605_GA isn’t supported at all and Alfresco_Community_Edition_201604_GA needs some patching of the SDK while Alfresco_Community_Edition_201606_EA is out. (The SDK 2.2.0 is listed on the release notes page for all of these even though it doesn’t work…)

It’s also a little unclear about what works with what – for example can I run Share 5.1.g(from 201606_EA) with Repo 5.1.g (from 201604_GA)? (which I might be able to make work with the SDK, and I know there are bug fixes I want in Share 5.1.g…) or stick with the Repo 5.1.g/Share 5.1.f combo found in the 201605 GA? (which I can’t build yet)

I should have an existing branch (see below) that is close to working on an earlier EA (or GA) version so in theory I can just update the version number(s) in the pom.xml and rebuild and test. In practice it’s more complicated than that as it’s necessary to go through each customization and check the implications against the code changes in the product (again see below). Sometimes this is easier than others, for example, 5.0.c to 5.0.d seemed like a big change for a minor version increment.

Why create a branch against an EA?

As I mentioned above I’ll try and create a branch against the new EA. Why do this when there’s no chance that I’ll deploy it?

There are a several reasons that I like to do this.

I don’t work with Alfresco all the time so while my thoughts are in that space it’s convenient, and not much slower (see below), to check the customizations against two versions rather than one.

It’s a good time to find and submit bugs – if you find them in the first EA then you’ve got a chance that they’ll be fixed before the GA.

Doing the work against the EA, hopefully, means that when the next GA comes along it won’t be too hard to get ready for a production release.

You get a test instance where you can try out the exciting new features and see if they are good/useful as they sound.

How to check customizations?

This can be a rather time consuming process, and, as it’s not something you do very often, easy to get wrong.

There are a number of things you might need to check (and I’m sure that there are others)

  • Bean definitions
  • Java changes
  • web.xml

While I’m sure everybody has a good set of tests to check your modifications, it’s unlikely that these will be sufficient.

Bean definitions

You might have made changes, for example, to restrict permissions on site creation, and the default values have changed – in this case extra values were added between 4.2 and 5.0, and 5.0 and 5.1

Java changes

Sometimes you might need to override, or extend, existing classes so you need to see if the original class has changed and if you need to take account of these changes

web.xml

CAS configuration is an example of why you might have changed your web.xml and need to update it.

Upgrade Assistant

I’ve started a project https://github.com/wrighting/upgrade-assist to try and help with the more mechanical aspects of checking customizations. I’ve found it helpful and I hope other people will as well – see github for further details.

 

Python, MPI and Sun Grid Engine

Really you need to go here
Do not apt-get install openmpi-bin without reading this first

To see whether your Open MPI installation has been configured to use Sun Grid Engine:

ompi_info | grep gridengine
MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.3)
./configure --with-sge
make
make install

Do the straightforward virtualenv setup

sudo apt-get install python-virtualenv
virtualenv example
cd example
source bin/activate
pip install numpy
pip install cython

Installing hdf5 with mpi

Install hdf5 from source to ~/install if necessary – the package should be OK

wget http://www.hdfgroup.org/ftp/HDF5/current/src/hdf5-1.8.13.tar.gz
tar zxvf hdf5-1.8.13.tar.gz
cd hdf5-1.8.13
export CC=/usr/local/bin/mpicc
mkdir ~/install
./configure --prefix=/home/${USER}/install --enable-parallel --enable-shared
make
#make test
make install
#If you want to...
export PATH=/home/${USER}/install/bin:${PATH}
export LD_LIBRARY_PATH=/home/${USER}/install/lib:${LD_LIBRARY_PATH}
export CC=/usr/local/bin/mpicc
pip install mpi4py
pip install h5py --install-option="--mpi"
#If hdf5 is installed in your home directory add --hdf5=/home/${USER}/install to the --install-option

SGE configuration

http://docs.oracle.com/cd/E19923-01/820-6793-10/ExecutingBatchPrograms.html is quite useful but a number of the commands are wrong…

Before you can run parallel jobs, make sure that you have defined the parallel environment and queue before running the job.
To see queues

qconf -spl

To define a new parallel environment

qconf -ap mpi_pe

To look at the config of a pr

qconf -sp mpi_pe

The value of control_slaves must be TRUE; otherwise, qrsh exits with an error message.

The value of job_is_first_task must be FALSE or the job launcher consumes a slot. In other words, mpirun itself will count as one of the slots and the job will fail, because only n-1 processes will start.
The allocation_rule must be either $fill_up or $round_robin or only one host will be used.

You can look at the remote execution parameters using

qconf -sconf
qconf -aq mpi.q
qconf -mattr queue pe_list "mpi_pe" mpi.q

Checking and running jobs

The program demo.py – note order of imports. This tests the use of h5py in an MPI environment so may be more complex than you need.

from mpi4py import MPI
import h5py

rank = MPI.COMM_WORLD.rank  # The process ID (integer 0-3 for 4-process run)

f = h5py.File('parallel_test.hdf5', 'w', driver='mpio', comm=MPI.COMM_WORLD)
#f.atomic = True

dset = f.create_dataset('test', (MPI.COMM_WORLD.Get_size(),), dtype='i')
dset[rank] = rank

grp = f.create_group("subgroup")
dset2 = grp.create_dataset('host',(MPI.COMM_WORLD.Get_size(),), dtype='S10')
dset2[rank] = MPI.Get_processor_name()

f.close()

The command file

source mpi/bin/activate
mpiexec --prefix /usr/local python demo.py

Submitting the job

qsub -cwd -S /bin/bash -pe mpi_pe 2 runq.sh 


Checking mpiexec

mpiexec --prefix /usr/local -n 4 -host oscar,november ~/temp/mpi4py-1.3.1/run.sh
#!/bin/bash
(cd
source mpi/bin/activate
cd ~/temp/mpi4py-1.3.1/
python demo/helloworld.py
)

To avoid extracting mpi4py/demo/helloworld.py

#!/usr/bin/env python
"""
Parallel Hello World
"""

from mpi4py import MPI
import sys

size = MPI.COMM_WORLD.Get_size()
rank = MPI.COMM_WORLD.Get_rank()
name = MPI.Get_processor_name()

sys.stdout.write(
    "Hello, World! I am process %d of %d on %s.n"
    % (rank, size, name))

Troubleshooting

If you get Host key verification failed. make sure that you can ssh to all the nodes configured for the queue (server1 is not the same as server1.example.org)

Use NFSv4 – if you use v3 then you will get the following message:

File locking failed in ADIOI_Set_lock(fd 13,cmd F_SETLKW/7,type F_WRLCK/1,whence 0) with return value FFFFFFFF and errno 5.
- If the file system is NFS, you need to use NFS version 3, ensure that the lockd daemon is running on all the machines, and mount the directory with the 'noac' option (no attribute caching).
- If the file system is LUSTRE, ensure that the directory is mounted with the 'flock' option.
ADIOI_Set_lock:: Input/output error
ADIOI_Set_lock:offset 2164, length 4
File locking failed in ADIOI_Set_lock(fd 12,cmd F_SETLKW/7,type F_WRLCK/1,whence 0) with return value FFFFFFFF and errno 5.
- If the file system is NFS, you need to use NFS version 3, ensure that the lockd daemon is running on all the machines, and mount the directory with the 'noac' option (no attribute caching).
- If the file system is LUSTRE, ensure that the directory is mounted with the 'flock' option.
ADIOI_Set_lock:: Input/output error
ADIOI_Set_lock:offset 2160, length 4
[hostname][[54842,1],3][btl_tcp_endpoint.c:459:mca_btl_tcp_endpoint_recv_blocking] recv(17) failed: Connection reset by peer (104)
[hostname][[54842,1],2][btl_tcp_endpoint.c:459:mca_btl_tcp_endpoint_recv_blocking] recv(15) failed: Connection reset by peer (104)

Using

The basic idea is to split the work into chunks and then combine the results. You can see from the demo.py above that if you are using h5py then writing your results out is handled transparently, which is nice.

Scatter(v)/Gather(v)

https://www.cac.cornell.edu/ranger/MPIcc/scatterv.aspx
https://github.com/jbornschein/mpi4py-examples/blob/master/03-scatter-gather
http://stackoverflow.com/questions/12812422/how-can-i-send-part-of-an-array-with-scatter/12815841#12815841

The v variant is used if you cannot break the data into equally sized blocks.

Barrier blocks until all processes have called it.

Getting results from all workers – this will return an array [ worker_data from rank 0, worker_data from rank 1, … ]

worker_data = ....
comm.Barrier()

all_data = comm.gather(worker_data, root = 0)
if (rank == 0):
    #all_data contains the results

BCast/Reduce

BCast sends data from one process to all others
Reduce combines data from all process

Send/Recv

This is probably easier to understand than scatter/gather but you are doing extra work.

There are two obvious strategies available.

Create a results variable of the right dimensions and fill it in as each worker completes:

comm = MPI.COMM_WORLD
rank = comm.rank
size = comm.size

#Very crude e.g. if total_size not a multiple of size
total_size = 20
chunk_size = total_size / ((size - 1))

if rank == 0:
    all_data = np.zeros((total_size, 4), dtype='i4')
    num_workers = size - 1
    closed_workers = 0
    while closed_workers < num_workers:
        data = np.zeros((chunk_size, 4), dtype='i4')
        x = MPI.Status()
        comm.Recv(data, source=MPI.ANY_SOURCE,tag = MPI.ANY_TAG, status = x)
        source = x.Get_source()
        tag = x.Get_tag()
        insert_point = ((tag - 1) * chunk_size)
        all_data[insert_point:insert_point+chunk_size] = data
        closed_workers += 1

Wait for each worker to complete in turn and append to the results

AC_TAG = 99 
if rank == 0:
   for i in range(size-1):
       data = np.zeros((chunk_size, 4), dtype='i4')
       comm.Recv(data, source=i+1,tag = AC_TAG)
       if i == 0:
         all_data = data
       else:
         all_data = np.concatenate((all_data,data))

Just as an example here we are expecting the data to be a numpy 2d array but it could be anything and could just be created once with np.empty as the contents will be overwritten.

The key difference to notice is the value of the source and tag parameters to comm.Recv this needs to be matched by the corresponding parameter to comm.Send i.e. tag = rank for the first example, tag = AC_TAG for the second
e.g. comm.Send(ac, dest=0,tag = rank)
Your use of tag and source may vary…

Input data

There are again different ways to do this – either have the rank 0 do all the reading and use Send/Recv to send the data to be processed or let each worker get it’s own data.

Evaluation

MPI.Wtime() can be used to get the elapsed time between two points in a program

Aikau and CMIS

This is still a work in progress but now has a released version and, with a small amount of testing seems work, do please feel free to try it out and feedback either via the blog or as an issue on github.

The post was originally published in order to help with jira 21647.

Following on from my previous post CMIS Dojo store I thought I’d provide a example working with Aikau and the store from github https://github.com/wrighting/dojoCMIS

Note that this is not intended to be a detailed tutorial on working with Aikau, or CMIS, but should be enough to get you going.

As a caveat there are some fairly significant bugs that cause problems with this:

Installation

The code is available as a jar for use with Share but, of course, there’s nothing to stop you using the javascript on its own as part of an Aikau (or dojo) based application.

Just drop the jar into the share/WEB-INF/lib folder or, if you are using maven, you can install using jitpack.io with the following dependency.

        <dependency>
            <groupId>com.github.wrighting</groupId>
            <artifactId>dojoCMIS</artifactId>
            <version>v0.0.1</version>
        </dependency>

Background

A good example for Aikau is Share Page Creation Code

My scenario is as follows:

We have a custom content type used to describe folders containing some work. These folders can be created anywhere however it’s useful to have a page that lists all the custom properties on all the folders of this type. As an added bonus we’ll make these editable as well.

The first thing I’m going to do is write my CMIS query and make sure it returns what I want.
It will end up something like this:
SELECT * FROM wrighting:workFolder join cm:titled as t on cmis:objectId = t.cmis:objectId

It is better to enumerate the fields rather than using * but I’m using * to be concise here.

Simple Configuration

As part of the dojoCMIS jar there’s a handy widget called CmisGridWidget that inspects that data model to fill in the detailed configuration of the column definitions.

You do need to define which columns you want to appear in the data but that is fairly straightforward.

So in Data Dictionary/Share Resources/Pages create a file of type surf:amdpage, with content type application/json. See the file in aikau-example/

You can then access the page at /share/page/hrp/p/name

{
  "widgets": [{
    "name": "wrighting/cmis/CmisGridWidget",
    "timeout" : 10000,
    "config": {
      "query": {
        "path": "/Sites/test-site/documentLibrary/Test"
      },
      "columns" : [ {
            "parentType": "cmis:item",
            "field" : "cmis:name"
          }, {
            "parentType": "cm:titled",
            "field" : "cm:title"
          }, {
            "parentType": "cm:titled",
            "field" : "cm:description"
          }, {
            "parentType": "cmis:item",
            "field" : "cmis:creationDate"
          }
          ]
    }
  }]
}

You’ll notice that this is slightly different from the example/index.html in that it uses CmisGridWidget instead of CmisGrid. (I think it’s easier to debug using example/index.html).
The Widget is the same apart from inheriting from alfresco/core/Core, which is necessary to make it work in Alfresco, and using CmisStoreAlf instead of CmisStore to make the connection.

There are a couple of properties that can be used to improve performance.

If you set configured: true then CmisGrid won’t inspect the data model but will use the columns as configured. If you want to see what a fully configuration looks like then set loggingEnabled: true and the full config will be logged, and can be copied into your CmisGrid definition. Note that if you do this then changes to the data model, e.g. new LIST constraint values, won’t be dynamically updated.

What it does in the jar

Share needs to know about my extensions. I’ve also decided that I’m going to import dgrid because I want to use a table to show my information but the beauty of this approach is that you can use any dojo widget that understands a store so there are a lot to choose from. (I don’t need to tell it about the wrighting or dgrid packages because that’s already in the dojoCMIS jar)

So in the jar these are defined in ./src/main/resources/extension-module/dojoCMIS-extension.xml, if you were doing something similar in your amp you’d add the file src/main/amp/config/alfresco/web-extension/site-data/extensions/example-amd-extension.xml

 <extension>
   <modules>
      <module>
         <id>example Package</id>
         <version>1.0</version>
         <auto-deploy>true</auto-deploy>
         <configurations>
           <config evaluator="string-compare" condition="WebFramework" replace="false">
             <web-framework>
                <dojo-pages>
                  <packages>
                    <package name="example" location="js/example"/>
                  </packages>
               </dojo-pages>
             </web-framework>
           </config>
         </configurations>
      </module>
   </modules>
</extension>

For convenience there’s also a share-config-custom.xml which allows you to specialize the type to surf:amdpage

CmisGrid will introspect the data model, using CMIS, to decide what to do with each field listed in the columns definition.

Configuration

The targetRoot is slightly tricky due to authentication issues.

Prior to 5.0.d you cannot authenticate against the CMIS API without messing about with tickets otherwise you’ll be presented with a popup to type in your username and password (basic auth). (The ticket API seems to have returned in repo 5.2 so it should be possible to use that again – but untested)

(For an example using authentication tickets see this example)

In 5.0.d and later it will work by using the share proxy by default, however updates (POST) are broken – see JIRA referenced above.

You can use all this outside Share (see example/index.html in the git repo) but you’ll need to get all your security configuration sorted out properly.

Which Store should I use?

There are two stores available – wrighting/cmis/store/CmisStore and wrighting/cmis/store/CmisStoreAlf.

Unsurprising CmisStoreAlf is designed to be used within Alfresco as it uses CoreXhr.serviceXhr to make calls.

CmisStore uses jsonp callbacks so is suitable for use outside Alfresco. CmisStore will also work inside Alfresco under certain circumstances e.g. if CSRF protection isn’t relevant.

Detailed Configuration

If you want more control over your configuration then you can create your own widget as shown below.
The jsonModel for the Aikau page (eg in src/main/amp/config/alfresco/site-webscripts/org/example/alfresco/components/page/get-list.get.js) should contain a widget definition along with the usual get-list.desc.xml and get-list.get.html.ftl (<@processJsonModel group="share"/>)

model.jsonModel = {
 widgets : [
  {
    name : "example/work/List",
    config : {}
  }
 ]
}

Now we add the necessary js files in src/main/amp/web/js according to the locations specified in the configuration above.

So I’m going to create a file src/main/amp/web/js/example/work/List.js

Some things to point out:

This is quite a simple example showing only a few columns but it’s fairly easy to extend.

Making the field editable is a matter of defining the cell as:
editor(config, Widget) but look at the dtable docs for more details.

I like to have autoSave enabled so that changes are saved straight away.

To stop the post getting too cluttered I’m not showing the List.css or List.properties files.

There another handy widget called cggh/cmis/ModelMultiSelect that will act the same as Select but provide MultiSelect capabilities.

The List.html will contain

 <div data-dojo-attach-point="wrighting_work_table"></div>

 

define(
        [
                "dojo/_base/array", // array.forEach
                "dojo/_base/declare", "dojo/_base/lang", "dijit/_WidgetBase", "dijit/_TemplatedMixin", "dojo/dom", "dojo/dom-construct",
                "wrighting/cmis/store/CmisStore", "dgrid/OnDemandGrid", "dgrid/Editor", "dijit/form/MultiSelect", "dijit/form/Select",
                "dijit/form/DateTextBox", "dojo/text!./List.html", "alfresco/core/Core"
        ],
        function(array, declare, lang, _Widget, _Templated, dom, domConstruct, CmisStore, dgrid, Editor, MultiSelect, Select, DateTextBox, template, Core) {
            return declare(
                    [
                            _Widget, _Templated, Core
                    ],
                    {
                        cssRequirements : [
                                {
                                    cssFile : "./List.css",
                                    mediaType : "screen"
                                }, {
                                    cssFile : "js/lib/dojo-1.10.4/dojox/grid/resources/claroGrid.css",
                                    mediaType : "screen"
                                },
                                {
                                  cssFile: 'resources/webjars/dgrid/1.1.0/css/dgrid.css'
                                }
                        ],
                        i18nScope : "WorkList",
                        i18nRequirements : [
                            {
                                i18nFile : "./List.properties"
                            }
                        ],
                        templateString : template,
                        buildRendering : function wrighting_work_List__buildRendering() {
                            this.inherited(arguments);
                        },
                        postCreate : function wrighting_work_List_postCreate() {
                            try {

                                var targetRoot;
                                targetRoot = "/alfresco/api/-default-/public/cmis/versions/1.1/browser";

                                this.cmisStore = new CmisStore({
                                    base : targetRoot,
                                    succinct : true
                                });

                                //t.cm:title is the value used
                                this.cmisStore.excludeProperties.push('cm:title');
                                this.cmisStore.excludeProperties.push('cmis:description');
                                this.cmisStore.excludeProperties.push('cm:description');


                                var formatFunction = function(data) {

                                    if (data != null) {
                                        if (typeof data === "undefined" || typeof data.value === "undefined") {
                                            return data;
                                        } else {
                                            return data.value;
                                        }
                                    } else {
                                        return "";
                                    }
                                };

                                var formatLinkFunction = function(text, data) {

                                    if (text != null) {
                                        if (typeof text === "undefined" || typeof text.value === "undefined") {
                                            if (data['alfcmis:nodeRef']) {
                                                return '' + text + '';
                                            } else {
                                                return text;
                                            }

                                        } else {
                                            return text.value;
                                        }
                                    } else {
                                        return "";
                                    }
                                };
                                this.grid = new (declare([dgrid,Editor]))(
                                        {
                                            store : this.cmisStore,
                                            query : {
                                                'statement' : 'SELECT * FROM wrighting:workFolder ' +
                                                    'join cm:titled as t on cmis:objectId = t.cmis:objectId',
                                            },
                                            columns : [
                                                       {
                                                           label : this.message("work.id"),
                                                           field : "cmis:name",
                                                           formatter : formatLinkFunction
                                                       }, 
                                                       {
                                                           label : this.message("work.schedule"),
                                                           field : "p.work:onSchedule",
                                                           autoSave : true,
                                                           editor : "checkbox",
                                                           get : function(rowData) {
                                                               var d1 = rowData["p.work:onSchedule"];
                                                               if (d1 == null) {
                                                                   return false;
                                                               }
                                                               var date1 = d1[0];
                                                               return (date1);
                                                           }
                                                       }, {
                                                           label : this.message("work.title"),
                                                           field : "t.cm:title",
                                                           autoSave : true,
                                                           formatter : formatFunction,
                                                           editor : "text"
                                                       }, {
                                                           field : this.message('work.submitted.date'),
                                                           editor: DateTextBox,
                                                           autoSave : true,
                                                           get : function(rowData) {
                                                               var d1 = rowData["p.work:submittedDate"];
                                                               if (d1 == null) {
                                                                   return null;
                                                               }
                                                               var date1 = new Date(d1[0]);
                                                               return (date1);
                                                           },
                                                           set : function(rowData) {
                                                               var d1 = rowData["p.work:submittedDate"];
                                                               if (d1) {
                                                                   return d1.getTime();
                                                               } else {
                                                                   return null;
                                                               }
                                                           }
                                                       }
                                                       ]
                                        }, this.wrighting_work_table);
                                this.grid.startup();
                            
                            } catch (err) {
                                //console.log(err);
                            }

                        }
                    });
        });