Tue, 12 March 2013
Adrian Brown and James Lappin talk about digital preservation, and its relevance to the information challenges faced by organisations
Adrian Brown is head of digital preservation and access at the Parliamentary Archives here in the UK. His book 'Practical digital preservation: a how-to guide for organisations of any size' will be published in the UK and US this May.
Adrian talks of a 'blurring of the boundary' between digital objects (.doc, .jpeg, .xls etc.) and the applications that they are held in. Key information about these digital objects is held by the applications. Digital preservation has had some success in tackling the problem of how to preserve the file formats of the objects themselves. Now it faces a more complex problem: how do we preserve the information that an application has about the objects it holds? How do we enable digital objects to move from one application to another without losing that information?
Adrian describes different models for digital repositories, and different ways of tackling the preservation issues arising from a complex application such as a Building Management System.
Fri, 15 February 2013
In this episode Alan Pelz-Sharpe explains why he thinks advanced analytics tools will come into mainstream business document management and collaboration systems over the next few years. He discusses the different things that large organisations use advanced analytics for (automatic classification; checking communications for suspicious activities and/or for compliance breaches). Alan and James debate whether or not advanced analytics could have records management uses.
James Lappin asks Alan whether it would be feasible to:
- equip an information governance tool with records management rules such as retention rules and a records classification
- get the information governance tool to apply those records management rules to content held in the many different applications used within an organisation.
Alan said that it was perfectly possible, but that the key limiting factor was the scope of each application's Application Programming Interface (API).
The most important interoperability standard for content management APIs is the Content Management Interoperability Services (CMIS) specification. At the time of writing CMIS supports basic document management tasks but not records management tasks.
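The idea James puts to Alan can be sketched roughly as follows. This is only an illustration under stated assumptions, not a description of any real product or of CMIS itself: the `Repository` protocol, the `InMemoryRepository` class and the `RetentionRule` fields are all hypothetical names standing in for whatever each application's API actually exposes. It shows why API scope is the limiting factor: the governance tool can only act on a rule where the application's API offers the operation the rule needs.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Protocol


@dataclass
class Item:
    item_id: str
    classification: str      # entry in the records classification
    last_modified: date


@dataclass
class RetentionRule:
    classification: str
    retain_days: int         # how long to keep items of this class


class Repository(Protocol):
    """Hypothetical view of what an application's API lets a governance tool do."""
    name: str
    supports_delete: bool

    def list_items(self) -> list[Item]: ...
    def delete(self, item_id: str) -> None: ...


@dataclass
class InMemoryRepository:
    """Stand-in for a real application; purely illustrative."""
    name: str
    supports_delete: bool
    items: dict[str, Item] = field(default_factory=dict)

    def list_items(self) -> list[Item]:
        return list(self.items.values())

    def delete(self, item_id: str) -> None:
        if not self.supports_delete:
            raise NotImplementedError(f"{self.name} API has no delete operation")
        del self.items[item_id]


def apply_retention(repos, rules, today):
    """Apply one central set of retention rules across many repositories."""
    rule_by_class = {rule.classification: rule for rule in rules}
    report = {"deleted": [], "blocked": []}
    for repo in repos:
        for item in repo.list_items():
            rule = rule_by_class.get(item.classification)
            if rule is None:
                continue  # no rule covers this class of record
            if item.last_modified + timedelta(days=rule.retain_days) > today:
                continue  # still within its retention period
            key = (repo.name, item.item_id)
            if repo.supports_delete:
                repo.delete(item.item_id)
                report["deleted"].append(key)
            else:
                # The rule is due, but the API's scope does not allow acting on it.
                report["blocked"].append(key)
    return report
```

In this sketch a repository whose API lacks a delete operation ends up in the `blocked` list: the rule is known centrally but cannot be enforced there.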
This episode was recorded in London on 15 February 2013.
Alan Pelz-Sharpe is Research Director for content management and collaboration at the 451 Group. James Lappin blogs at www.thinkingrecords.co.uk
Fri, 13 July 2012
Alan Pelz-Sharpe, Sharon Richardson and James Lappin discuss Office 365, Microsoft's cloud offering.
Office 365 is a 'bundle' of cloud services, including e-mail, the cloud version of SharePoint, and Lync instant messaging.
We discuss the challenges that Microsoft have in providing both cloud and on-premise versions of SharePoint.
Sharon said Microsoft will need to improve the way it moves cloud customers from one version of SharePoint to the next.
The cloud version of SharePoint 2007 had been part of BPOS (the predecessor to Office 365). The cloud version of SharePoint 2010 was not released until nearly a year after the on-premise version. There was no simple upgrade path; instead cloud customers had to migrate over from BPOS to Office 365.
The plus side was that the online version of SharePoint 2010 contains almost the whole range of SharePoint functionality (it doesn't have the SharePoint records centre though). It includes for example most of the service applications (such as the user profile service and the search service).
The main competitor to Office 365 is Google Apps, which is nowhere near as powerful or as configurable as SharePoint Online and does not have an on-premise version. Its features have developed very little over the past few years, and such changes as have been made have been deployed by Google without disruption to customers.
SharePoint Online also faces competition from two very different sources:
- The relatively simple filesharing services such as Huddle, Box and Dropbox, which like Google Apps do not have on-premise versions
- The powerful, complex and configurable document management products from ECM vendors such as Open Text, IBM, Oracle and Documentum, which have on-premise versions as well as cloud versions and hybrid versions.
Alan said that the document management products from the ECM vendors are overwhelmingly deployed on-premise. If an organisation is deploying a serious document management system, with a degree of customisation and of integration with other applications, then it is almost certainly going to want that application to be on-premise rather than in the cloud.
Alan Pelz-Sharpe is Research Director of the 451 Group. Sharon Richardson is an independent consultant and founder of Joining Dots Limited. James Lappin is an independent records management consultant and founder of Thinking Records Ltd.
Thu, 1 March 2012
In this podcast James Lappin asks Matt Mullen to explain what Big Data is.
This podcast was prompted by a blogpost Matt had written: 'Big Data plus enterprise search = Big enterprise disappointment?'
Matt contrasts the vendor-driven, enterprise-centric vision of Big Data (vendors selling tools to help organisations make use of the content they have accumulated over the years in different repositories) with the more transparent, idealistic and web-centric vision of linked data (organisations marking up their structured data with RDF and making it available for others to run queries on, or to use for data mash-ups).
Matt explains why it is easier for Google to make sense of the world wide web than it is for an enterprise search engine to make sense of documents and data from multiple different repositories within an organisation. James and Matt discuss whether or not the distinction between structured and unstructured data is a meaningful one.
The podcast was recorded at the Royal Festival Hall, London on 27 February 2012
Matt Mullen is an analyst for the Real Story Group, specialising in Search and in Web Content Management. He is on Twitter as @MattMullenUK
Direct download: ECM_Talk-episode013-BigData_and_Search-MattMullen.mp3
Category:general -- posted at: 10:07 AM
Thu, 26 January 2012
Richard Harbridge gives his rule of thumb for answering the following question: when a new area or function comes on board in a SharePoint implementation, is it best to set up a SharePoint site collection or simply a site within an existing site collection?
We discuss the pros and cons of 'site collections', which are a feature unique to SharePoint. A site collection is a hierarchical collection of SharePoint sites sharing common administrative settings and some common information architecture features such as content types. Crucially a site collection cannot be split across separate SQL Server content databases, so there are storage as well as information architecture considerations in deciding how many site collections to set up and what for. Microsoft recommends that each site collection does not exceed 100GB in size.
James asks about the relationship between site collections and search, and Richard describes some tips for configuring a SharePoint search centre with search 'scopes' set up to enable your users to target their searches at particular site collections or at particular types of content. We discuss the strengths and weaknesses of refiners in SharePoint search. Refiners are a set of links that are returned alongside SharePoint 2010 search results and which enable users to filter those results by defined parameters (for instance date modified, document type, project title). James is disappointed firstly that the SharePoint 2010 refiners only filter the first 500 results, but more importantly that they give the user no indication that only the first 500 results have been refined.
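To see why the 500-result limit matters, here is a small sketch. It is not SharePoint's implementation: `refiner_counts` is a hypothetical function, and the only detail taken from the discussion is the figure of 500. It counts refiner values over a capped sample of the result set and returns a flag saying whether the sample was truncated, which is the kind of indication James would have liked users to see.

```python
from collections import Counter


def refiner_counts(results, field, sample_limit=500):
    """Count values of `field` over the first `sample_limit` results only.

    Returns the counts plus a flag saying whether results beyond the
    sample were left out of the refinement.
    """
    sample = results[:sample_limit]
    counts = Counter(result[field] for result in sample)
    truncated = len(results) > len(sample)
    return counts, truncated


# Illustrative data: 800 hits, with every PDF ranked below the first 500.
results = [{"type": "docx"}] * 500 + [{"type": "pdf"}] * 300
counts, truncated = refiner_counts(results, "type")
```

With this made-up data the refiner would offer 'docx' but never 'pdf', even though 300 PDFs matched the query; only the `truncated` flag reveals that the refinement is incomplete.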
The discussion then touches on the managed metadata service in SharePoint 2010 as a way of getting controlled vocabularies out of the confines of a single site collection and into a place where they can be used by any site collection. Richard outlines some of the ways in which the managed metadata service does not work as well as he would like (and mentions an article by Michal Pisarek in which these weaknesses are collected) but says he still recommends his clients make some use of it.
We finish by talking about 'business connectivity services' in SharePoint. This enables data (in the form of database rows and columns) to be imported into SharePoint from another database within the organisation. Once the data is in SharePoint it can be used as a controlled vocabulary to improve the findability of content. Richard gives the example of a law firm importing into SharePoint a list of its matter numbers from its customer database. The one disappointment is that the business connectivity service does not work with the managed metadata service: it is not possible to import a list (for example a list of clients) into the managed metadata service from a line of business database and use that as a controlled vocabulary within SharePoint.
Direct download: ECM_Talk_012_-_Richard_Harbridge_on_information_architecture_in_SharePoint.mp3
Category:general -- posted at: 2:27 PM
Fri, 14 October 2011
Brad says that SharePoint 2010 can be regarded as a records management system, with the caveat that it may not do things in the way that traditional records management systems do them. James concedes that SharePoint 2010 has records management features (such as holding and applying retention rules, holding a hierarchical classification, locking documents down as records) but feels that these features are not brought together in a coherent enough way to justify calling SharePoint a records management 'system'.
SharePoint 2010 offers organisations two different approaches to records management: the in-place approach and the records centre approach. Brad and James describe and critique these two approaches. James characterises the choice between them as being like that between 'a rock and a hard place'.
Brad describes the challenge of managing the routing rules necessary to get documents from SharePoint team sites to the record centre. James describes the problem of in-place records management which leaves records scattered around team sites under the control of local site owners without providing any reporting capability to give a records manager visibility over them all.
Brad and James will be debating the issue of records management in SharePoint live at the SharePoint Symposium in Washington on 2 November 2011
Direct download: ECM_Talk011-IsSharePointaRecordsManagementSystem.mp3
Category:general -- posted at: 2:09 PM
Sat, 8 October 2011
James Lappin asks Alan Pelz-Sharpe 10 questions about the current state of the enterprise content management market
Here is a flavour of some of Alan's answers - there is a lot more detail in the actual podcast itself
Why have HP bought Autonomy?
Alan said that most analysts were surprised at how much HP paid for Autonomy. The best guess at what HP (a hardware company) wants to do with Autonomy (a software company) is that they may wish to create some kind of appliance which has Autonomy's IDOL search engine already loaded onto it (a bit like the Google search appliance). One thing that HP and Autonomy have in common is that they have both bought well-regarded electronic records management systems (Tower and Meridio respectively), and done very little with them.
How hard have the ECM vendors been hit by the rise of SharePoint?
Alan said that the ECM vendors haven't been hit as hard as you might think. Their revenues are still rising, and most of them enjoy good relations with Microsoft.
How do EMC and Open Text compare with the bigger ECM vendors (Oracle and IBM)?
Alan said that Oracle and IBM are so big because they do a huge variety of stuff as well as ECM. But at the end of the day if you are buying FileNet from IBM you are dealing with the FileNet division, not the whole massive company. So for buyers of ECM systems company size doesn't matter that much. Open Text is the largest company that focuses exclusively on ECM. EMC's business is mainly about storage. They bought Documentum, but Documentum is very different from the rest of the EMC group and there have not been many synergies.
What is happening in the CRM (Customer relationship management) arena and how does it relate to ECM?
Essentially ECM and CRM are separate worlds without much overlap. CRM is a vital tool for many organisations. As yet there are not many tie-ins with ECM. Oracle has both a CRM and an ECM suite, which work together reasonably well. SAP signed a large deal with Open Text but there doesn't seem to be a huge number of organisations using SAP together with Open Text products. Many of the CRM tools will do a little bit of document management of customer-related documents, but for the most part organisations will have CRMs that don't talk to whatever ECM product(s) they have.
The Europeans have just revised their electronic records management specification (MoReq2010). When will the US records management standard DoD 5015 be revised (it was issued back in 2007)?
Alan said he didn't know of any plans to revise DoD 5015. SharePoint drove a horse and cart through DoD 5015 because Microsoft made the decision to release a document management product that did not comply with it but had huge market success. Vendors didn't like DoD 5015 because it was very hard for them to tailor their products to it.
What is happening in the intranet arena?
Alan said that nothing dramatic is happening in the intranet arena. Some intranet makeover projects will have been hit by the economic downturn. Alan can't understand why some organisations want to use the same product to manage their external web-site and their intranet: to him they are fundamentally different things.
Do you know any organisation that manages their e-mail well?
Alan said that of all the ECM implementations that he sees, the type that gives the quickest and most reliable return on investment is an e-mail archiving tool brought in to take stored e-mails off the mail servers.
What do you think of PAS 89?
Alan thought PAS 89 a good attempt to define the scope of enterprise content management, although he can't think of what an organisation would specifically use it for.
How does Alfresco compare with the proprietary ECM products?
Alan said that if we were talking about open source ECM products Nuxeo should be mentioned alongside Alfresco. Both of them are established, mainstream enterprise content management systems. The main difference between them and the proprietary ECM products is the licensing model.
How does Google Apps compare with the established ECM products?
In terms of impact on the ECM market Alan is more interested in Box.Net than Google Apps. Alan and James discussed the prospect of new start ups deciding not to set up shared drives and instead using services like Box.Net in the cloud to provide a relatively simple place for colleagues to store and share documents.
Wed, 27 July 2011
In this episode Alan Pelz-Sharpe discusses the current state of ECM in Brazil with Walter Koch. Topics they cover include:
This podcast was recorded on the 19 July 2011, and lasts for 31 minutes
Wed, 13 July 2011
In this episode analyst Ralph Gammon, author of the Document Imaging Report newsletter and blog, joins Alan Pelz-Sharpe and James Lappin to discuss the state of the market for document capture software.
Capture software, such as Kofax and Captiva, is used to make sense of scanned documents. It is typically used to apply optical character recognition (OCR), or barcode recognition, to scanned documents.
More sophisticated use cases involve integrating a capture product with an enterprise content management system (ECM), an enterprise resource planning system (ERP) such as SAP, or a line of business (LOB) application. The capture product might be used to identify what type of document a scanned image is, and to kick-off an appropriate workflow within an ECM/ERP/LOB application. Or the capture product might be trained to help with form processing where a large volume of paper forms are received and scanned. The role of the capture product might be to read the entry in each field of the form and place that entry in the appropriate metadata field within the ECM/ERP/LOB, which could then trigger an appropriate workflow.
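The form-processing flow described above can be sketched in a few lines. This is a deliberately simplified illustration, not how Kofax, Captiva or any other capture product works: the keyword lists, field mappings and workflow names are all hypothetical, and it assumes OCR has already turned the scanned image into lines of 'Label: value' text.

```python
# Hypothetical configuration: which keyword marks each document type,
# which form labels map to which metadata fields in the ECM/ERP/LOB
# application, and which workflow each document type should start.
DOC_TYPE_KEYWORDS = {"invoice": "Invoice No", "claim": "Claim Ref"}
FIELD_MAP = {
    "Invoice No": "invoice_number",
    "Total": "amount",
    "Claim Ref": "claim_reference",
}
WORKFLOWS = {"invoice": "accounts_payable_approval", "claim": "claims_handling"}


def classify(ocr_text):
    """Identify the document type from a keyword found in the OCR text."""
    for doc_type, keyword in DOC_TYPE_KEYWORDS.items():
        if keyword in ocr_text:
            return doc_type
    return None


def extract_fields(ocr_text):
    """Read each 'Label: value' line and place it in the mapped metadata field."""
    metadata = {}
    for line in ocr_text.splitlines():
        label, separator, value = line.partition(":")
        target = FIELD_MAP.get(label.strip())
        if separator and target:
            metadata[target] = value.strip()
    return metadata


def process(ocr_text):
    """Classify a scanned form, extract its metadata and pick a workflow."""
    doc_type = classify(ocr_text)
    return {
        "doc_type": doc_type,
        "metadata": extract_fields(ocr_text),
        "workflow": WORKFLOWS.get(doc_type),
    }
```

Every field filled in by `extract_fields` is one that nobody has to key in by hand, which is the saving in manual effort that capture software is bought for.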
Ralph identified the main value that capture software brings as reducing keystrokes: reducing the amount of manual effort needed to make scanned images of paper documents useable by an organisation on its electronic systems. Alan points out the downside of this: some large capture projects result in job losses.
Alan said that many of his clients think that Kofax and Captiva are the only players in the Capture market. Ralph said that many of the traditional ECM vendors have some sort of partnership with a capture vendor. EMC (owners of Documentum) own Captiva. IBM bought Datacap. Oracle have a relationship with Brainware. Kofax and ReadSoft are independent of any one ECM vendor. Microsoft are not linked with any particular capture vendor, and several vendors have worked on plug-ins to integrate capture software with SharePoint.
Thu, 21 April 2011
James and Cheryl McKinnon started by discussing the rise of open source enterprise content management systems.
They went on to discuss the impact of CMIS (Content Management Interoperability Services).
CMIS is an OASIS specification, created by a group of enterprise content management system vendors (IBM, EMC, Microsoft, Alfresco, Open Text and others).
CMIS enables different content repositories within an organisation to interoperate with each other even if they are written in different programming languages. If a vendor adds a CMIS-compliant layer to their application, then other applications can use CMIS protocols to perform basic content management operations on that application.
For example if an organisation installed an application that had a CMIS layer, it could allow one of its other applications to use CMIS protocols to do things such as
James and Cheryl discussed the progress vendors had made in adding CMIS layers to their products.
Towards the end of the podcast James and Cheryl discussed the question of whether it was either possible or meaningful to make a distinction between 'documents' and 'records'.
The podcast was recorded on 21 April 2011 via Skype.
Direct download: ECM_Talk_006_-_CMIS_-_Content_Management_Interoperability_Services_-Cheryl_McKinnon_-_stv.mp3
Category:general -- posted at: 4:00 AM