SSO

After my last post I thought I ought to write something about Single Sign On (SSO) – this post will cover a bit more than just SSO.

I’ve done a lot of work with SSO but it’s one of those things you only visit periodically so it’s easy to forget things – also it’s harder than it feels like it should be!

This post is intended to be fairly generic, although I will use some examples from applications that I’ve worked on.

For anyone especially interested in Alfresco please note that I haven’t looked at any of the version 6/Identity Service or ADF stuff as yet.

My first point is that if you’re working on a new SSO project then the first thing you need to do is to work out how you are going to merge your user data from the different apps – this is generally the hardest part and I’ve seen quite a number of projects give up because of this. It’s a good argument for getting your SSO sorted out as soon as possible.

Single Sign On is where you log in once and are authenticated to all applications (also potentially logged out).

Shared Sign On is where you use a shared user dictionary e.g. LDAP but each application is responsible for it’s own authentication.

Authentication is who are you, authorization is what you can do.

There are many, many guides to this on the internet and if you’re really, really lucky you might find one you can understand.

Single Sign On

My first point is that if you do this right then the protocol/technology doesn’t matter all that much – this area is a lot more mature than it was even a couple of years ago.

What you are trying to achieve is to protect a list of endpoints (URLs) (probably not all e.g. CSS) and communicate a user id through to your application.

Try and keep this as (logically) separate as possible – just identify the user. It can be tempting to link this up to authorization but try not to, it just causes trouble.

The aim here is to intercept the incoming request and process it before it gets to your application. There are different ways of doing this e.g. in Apache, in Tomcat, in filters.

For java what you are aiming for is that a call to request.getRemoteUser() will get you the user id.

For java applications my preference is to try and use a web-fragment.xml to define the filters and endpoints. The problem with this is that any filters defined in the web-fragment are applied after filters in the web.xml so depending on how the application is structured this may not be possible (sometimes you can get away with writing another application specific filter but that’s not ideal – I managed to do this for 5.1 Share and repo but not 5.2 Share).

See https://issues.alfresco.com/jira/browse/ALF-21848 for a suggestion for restructuring the Share web.xml to make this easier.

Another common approach is to just edit the web.xml, sometimes using a maven profile, but that’s not ideal.

A quick aside here – be careful about your username/id – for example we log in using email address but use a different attribute as the user id. Single sign on systems will support this but you might need to be careful in your configuration.

Username/password log in

Sometimes you will want to log in using a username and password – this might be for using a non web or mobile client e.g. in Alfresco CMIS, mobile or IMAP.

Probably the cleanest choice here is to make the client do the work by obtaining a token from SSO and using that in conjunction with your SSO mechanism of choice, however that’s not always practical.

Another alternative is to identify the request, one way is look for the authorization header, and proceed from there either by using SSO proxy authentication (see below) or carrying on to your normal username/password auth method (note Alfresco doesn’t support using a different username attribute out of the box)

Proxy Authentication

This is where you have logged in to one application and want to pass that authentication information through to another.

The idea here is that after the client application (C) has authenticated and thus allowed the server side application (A) to authenticate itself using the SSO mechanism then application A can obtain a token from the SSO server and passes that token through to the other server side application (B) it wants to talk to. Application B then validates the token against the SSO server and as a result obtains the identity information.

What you are doing is intercepting and wrapping the outgoing request from A to B and including the SSO token (typically as part of a header but this can be handled by a library)

An Alfresco specific aside, at least up to version 5.2, Share SSO communicates with the repo/platform/ACS (whatever you want to call it…) by setting a custom HTTP header containing the username (you have to be careful about the security of your configuration!)

So in summary this splits into the following parts:

  • Application A obtains an access token from the SSO server (standard part of libraries)
  • Application A injects the access token into requests to application B (depends on how requests are made)
  • Application B intercepts the request and uses the token to obtain the username (this will be the same configuration as is used for the normal SSO authentication, and subject to the same security constraints)

Authorization

As I said earlier try and keep authentication and authorization separate.

It probably helps to understand this if you consider the evolution of (a lot of) applications.

Start off with a custom authentication and authorization layer, then realize that you need to use shared authentication (probably keeping the original method) so add in synchronization (with LDAP and others), then you realize that you want SSO so add that in later.

There are a few approaches you can take here:

(consider the performance implications, how look ups will be cached – you’ll be doing this a lot and the caching mechanisms, time out etc are not as well considered as for authorization)

Native SSO

Most SSO systems support some form of attribute release – you can use this information to set the rights within your application e.g. OAuth scope or parse a list of LDAP group memberships.

This means that the application must be used in an SSO context.

This can be, and probably would be, done with the same validation request as is used for authorization but do try to keep it logically separate.

Mixed SSO/Shared Auth backend

Use the SSO authorization and then an additional query to the shared backend to determine rights e.g. run a query against LDAP to determine group membership.

Mixed SSO/Custom/Shared Auth

This is probably the most common model that I’ve seen even though it’s relatively painful to configure.

Custom authorization model is set to sync with the shared auth backend.

Use the SSO authorization and then an additional query to the custom model to determine rights.

Custom model can contain rights(groups/group membership) not held in the shared backend.

There can be timing problems waiting for the custom model to be updated from the shared backend.

(Technically can be done without the shared auth but that’s really not a good idea so I’m not including it)

Proxy authorization service

Make a proxy authenticated request to a separate service which can manage the lookup(s)

This is a more flexible version of native SSO but comes with potential performance issues.

User Information

I’m throwing this in as an extra because it’s pretty similar to authorization in concept.

You might want to have information about the user, for example, their name to use in your application.

This information can be retrieved using the same methods as for the authorization information.

I’ve put this separately mainly because if you want to use an avatar or picture for the user then you potentially have a much larger piece of data to consider (and might want to convert the image into a more suitable format)

(Alfresco note – the use of the an image isn’t supported out of the box but it’s something that can be done with customizations)

Authorization Management

Chances are that if you are using SSO then there will some external process for managing authorization e.g. LDAP groups.

If you want to manage authorization from within your application then ideally you need to authenticate against the existing management system before making any changes – this is one of the few occasions where you may want to be able to retrieve the user password from the SSO system e.g. to authenticate against LDAP. The alternative is use some form of super user from within your application but that isn’t ideal as it potentially gives elevated rights to your user by mistake.

Single Sign Out

You may not care about this e.g. by default Share in SSO configuration removes the log out option from the menu and there’s no way to log out. (It’s not too hard to put back in however – see previous posts and the alfresco-cas project on github)

Most applications have their own way of determining whether you are logged in as well as whatever is used by SSO. This is to support non SSO log ins.

The important thing to remember here is that you are logged in to both the SSO system and the application and when you log out of one, you need to log out of both (otherwise you’ll just be logged straight back in again)

Normally the idea would be to log out of the application first and then forward to the SSO logout page (probably with a redirection parameter to send you somewhere afterwards).

This doesn’t cover the more complex case where the user logs out of a different application. In this case the SSO application will send a logout request to your application (as part of the SSO logout process) – this logout request can then be handled to ensure that the local application logout also happens. i.e. logging out from application A also logs you out from applications B, C, D… as well as SSO.

It’s common for the first case to be handled but more unusual to handle the second case.

Summary

SSO is in a much better place than it was a few years ago but there’s still no one right way to do it and that’s likely to remain the case. (I’ve spent a lot of time helping people with CAS SSO for Alfresco)

SSO brings big benefits, both to the users and by providing a single place to manage authentication and, potentially, authorization.

Be careful! SSO provides a single point of failure for the organization and while this can mitigated by suitable configuration you still need to be careful especially during upgrades.

Don’t forget to keep on top of the upgrades!

Keep it as simple as possible as any customization makes upgrading more difficult (you are keeping on top of the upgrades aren’t you?)

Try to keep to only using one SSO system otherwise it’s not really SSO, as well as being more difficult to maintain – you’ll probably end up with some sort of shared LDAP based backend e.g. OpenLDAP or ActiveDirectory.

Make sure that you keep authentication and authorization (logically) separate.

If you write your application in a suitably flexible way then it should be possible to easily support any current or future protocols. To achieve this there are three main parts to consider:

  • Make sure the client application can handle the authentication protocol – don’t forget authentication failures (should be fairly easy with client libraries, response interceptors etc)
  • abstract the mechanism for protecting incoming requests – how to specify the endpoints to protect, and how to pass the validated information through to your application. (e.g. web-fragment.xml and getRemoteUser for java)
  • Provide an abstraction layer for proxy requests i.e. make sure it’s easy to modify any requests between different application components.

 

Alfresco DevCon 2019

Alfresco DevCon – Some thoughts and impressions

These are very much impressions and if you want more details then I encourage you to go and look at the presentations – there’s a lot of good stuff there!

I haven’t really been keeping up with what’s going on in the Alfresco ecosystem for a couple of years so I thought that DevCon would be a good opportunity to catch up.

It’s always nice to get an idea of what’s going on and meet up with friends old and new.

One of the things I like about DevCon is that it gives me a change to get a, fairly conservative, view on what’s going on in the technology space and well as the more specific Alfresco related topics.

There were about 300 attendees and four parallel streams of talks – I like the lightning talk format with auto advancing slides and five minutes per presenter – it’s a different skill but allows you to hear from more people.

John Newton’s key note is always a good introduction and provides context to the rest of the conference.
The overall feeling of this was surprising similar to Zaragoza two years ago – I think this is good as it shows more consistency of vision that has sometimes been the case in the past.
Joining up process, process driven, and content feels like a good thing to be doing.
Once again AI was prominent, to be honest, despite working at the Big Data Institute, AI (or perhaps Machine Learning(ML) would be a better term), I don’t find this very interesting in the Alfresco context. As the Hackathons this year, and even two years ago, have shown this is fairly easy to link up to Alfresco so the interesting part is what you do when you’ve linked it up and that’s going to be very specific to your own needs.
The other hype technology mentioned was blockchain – this seems fairly sensible in the context of governance services.

Moving on the road map and architecture.
It felt like Alfresco were embarking on a fairly major change after 5.2 and I certainly thought that the right thing to do would be to wait a couple of versions before attempting to upgrade, and I don’t think I’m alone in that!
My impression is that it’s still not there yet although progress has been made and it seems to be heading in a generally good direction – more on this below.

The blog posted by Jeff Potts @jeffpotts01 about whether Alfresco should provide more than Lego bricks caused a few conversations in the Q&As – I suggest that the CEO and product managers have a conversation here as I got conflicting messages from the two sessions. I’m firmly in the camp that there needs to be a reasonably functional application if only to show what can be done by putting the Lego together.

One the technology themes here was that containerisation is coming of age. This is the same direction as we have been going internally so it’s good to have some confirmation this.
We’ve got experience of Kubernetes so we’re not afraid of this – it’s perhaps overkill for a lot of installations but does have the benefit of being relatively cloud platform agnostic (interestingly Azure was seen as the second most popular platform to AWS when our observation is that the Google Platform is the best option)
Another observation is that as we’re being pushed more towards cloud architectures it would be nice if the official S3 object storage connector was released to the community (there is already a community version), and perhaps other object storage connectors?

Another potentially huge change/improvement is the introduction of an event bus into a right sized (as Brian said not micro) services architecture, this is one of the presentations I need to catch up on at home. I was advocating ESB based service architectures ten years ago so it’s nice to see this sort of thing happening.  Jeff Potts did a nice presentation of how he had effectively done his own implementation of this.

(As an aside if you’re spec’ing out new hardware get 32Gb if you can)
Kubernates
The ADF related applications feel a bit like where Share was in about 2010 – I like this but it’s not yet a compelling reason to upgrade an existing installation.
Where I’d like to be is to be able to run Share alongside the new stuff with the objective of phasing out Share eventually – I don’t feel we’re there yet.
(good to see the changes for ADF 3 but puzzled as to how the old stuff was there when using OpenAPI as the new stuff seems to match what would come out of the code generators…)

One of my objectives was to get a feel for how hard it would be to upgrade, and what the benefits would be and as Angel (@AngelBorroy) and David’s (@davidcognite) sessions were the most popular, lots of other people had the same idea (Thanks for the shout out Angel). I’m also going to mention Bindu’s (@binduwavell) question in the CEO Q&A here.

This is tricky, and judging from this, and some informal conversations, there’s still a real lack of help and support from both Alfresco and partners in this area.
One of the strengths of Alfresco is the ability to provide your own customisations but it does, potentially, make it difficult to upgrade.
I think the new architecture is a step in the right direction here as it’s going to make it easier to introduce some loosely coupled customisations – there will still be a place for the old way however.

My biggest problem with upgrade is SSO, it always breaks!,  so I was very interested to see the introduction of the Alfresco Identity Service – it’s great to see this area getting some love.

I really want this to work but I’m pretty disappointed with what was presented.

Keycloak is solid choice for SSO but I *really* don’t want to introduce it running in parallel with my existing SSO infrastructure – by all means have it there as an option for people who don’t have existing infrastructure (or for test/dev environs) but please try and do a better job of integrating with existing, standards based, installations – this is quite a mature area now.

No integration with Share is a pretty big absence (and I’m assuming mobile) – I suspect that the changes to ACS for this mean that the existing SSO won’t work any more (I’ve seen it broken but not sure why yet)

In principal I agree with the aim of moving the authentication chain out of the back end but there may be some side effects e.g. one conversation I had was around the frustration of not being able to use the jpegPhoto from LDAP as the avatar in Alfresco – this is fairly easy to provide as a customization to the LDAP sync (I’ve done it and can share the code) but doesn’t fit so well if you’re getting the identity information from an identity service.

All in all an enjoyable conference with lots learned.

P.S. One option for the future would be to reduce the carbon footprint of the conference by holding it in Oxford – it’s possible to use one of the colleges when the students aren’t there e.g. end of June to October.

Edinburgh was nice but I think a few people found it a wee bit chilly.