Alfresco Data Migration: The Ultimate Guide

The other day my colleagues and I had a lively, somewhat nostalgic talk about how we used to share photos, audio recordings, and study papers: first on floppy disks, later by email, while the most advanced of us used FTP servers. Yes, we had our moments.

But the world moves forward. Dreamers with a Jules Verne book under their arms were bound to invent something that made transferring files as easy as ABC. That is how electronic document management systems were born in the 90s. Adoption took off in the mid-2000s. The electronic document management system (EDMS) market is still growing fast and is forecast to surpass $6 billion by 2024.

Last year Gartner named Alfresco a Challenger, placing it on the same level as IBM, Oracle, and Laserfiche, which occupy the Enterprise Content Management (ECM) market. Growing demand among companies that use EDMS will increase the need for smart data migration. Anticipating that rush of orders, we decided to create a step-by-step guide that covers the vital points of migrating data and Alfresco migration services.

When do you need to migrate your data to Alfresco?

When is it necessary to migrate your data? In nine cases out of ten, company management initiates a data migration in order to cut software maintenance costs.

When a company migrates from Enterprise Edition (EE) to Community Edition (CE), the obligation to pay around $20 000 annually for post-deployment support by certified Alfresco developers ceases to exist. Proper support can then be provided by software outsourcing companies like Aimprosoft, for which development and support of Alfresco-based systems is a strong suit.

Likewise, EE attracts clients with its clustered environment. When an EDM system has to scale to serve more end clients per unit of time, tech specialists prefer to group several Alfresco Content Services instances within one cluster. Clustering is a great means of handling increased user activity when a single server can’t cope with the load. One more twin in-house server is set up to reduce the load on the operating server. An experienced Alfresco developer on call will stand you in good stead here.

The load Alfresco can handle hinges on the power of the hardware that runs the Alfresco instance. We described in detail how to evaluate hardware power against the number of active users and the Alfresco hardware requirements in one of our previous articles, How to make a digital transformation in the enterprise and succeed with Alfresco? Part 2.

Alfresco content migration may be of two types: migration between editions and version upgrade. In both cases, data will be duplicated from one platform to another. Let’s explore the most important facets.

How to perform Alfresco Data Migration?

Alfresco Content Services (formerly Alfresco One) servers and databases are open to various migration procedures in both CE and EE. Contributors and maintainers have bent over backwards to open the gates of Alfresco migration services wide for developers. So let’s hit the high spots of how to perform Alfresco data migration and point out only the important matters.

What are the types of migrated data?

When you are thinking about moving to a new system, you probably want your business processes transferred along with your documents in a cut-copy-paste way. Any document management system consists of files containing the program code of the business logic and of working documentation. There are two points we have to highlight in this regard.

Content. Documents, user accounts, user groups, and other files in Alfresco should be migrated smoothly. All digital files have to be categorized and stored in workflows strictly according to the company’s established rules. Migration starts with transferring repository content from one Alfresco instance to another. Migrated files are supposed to be recorded in the master directory. The entire Alfresco database and file system can be copied with the out-of-the-box replication service or with migration tools that support a variety of migration scenarios. Once the data has been duplicated faultlessly, workflows are the next thing clients most want to keep exactly as they were on the first day. But it’s not that simple.

Business processes. These are hard to migrate because, although the first release of Activiti (the framework that powers Alfresco Process Services (BPM)) came out back in 2011, it still has not made great strides. Systems developed earlier are powered by old frameworks. For example, Alfresco gave up the worn-out jBPM in 5.1.

If business processes (or rather process definitions, that is, workflow templates with if-then-else decision trees) are based on Activiti, they can be migrated with their logic and settings preserved. If, in turn, a business process was written in a different language, it has to be developed from scratch in the new one. That entails large investments of time and money.

The latest version of Alfresco is 5.2.2. It brought no major changes to Alfresco data migration. We won’t elaborate on the full list of alterations; if you want to check it, please find the available information here.

Alfresco Data Migration Approach

We’ve reached the point where we can draw the line between migration, upgrade, and replication. Data migration can be performed between editions or when upgrading versions. Let’s shed some light. Here migration means the process of transferring data between servers. An upgrade is the process of replacing a platform with a newer version of the same Alfresco platform.

If you’ve been on the 4.2 Community version for quite a while and plan to migrate to Community Edition 5.2, data migration will be done in the twinkling of an eye. Alfresco developers set up a new Alfresco instance and connect a new database to it. The instance then upgrades and launches automatically. The same happens when migrating data between EE versions.

Migrating from EE to CE is highly complicated and isn’t available out of the box. EE has more diverse, full-featured ECM capabilities such as content and repository encryption, clustering, storage policies, an advanced administration console, hybrid cloud for syncing on-premises and virtual Alfresco, AWS quick deployment, connectors for Salesforce, Amazon S3, and EMC Centera, engines for content transformation, and others. What does that mean? To transfer data from EE to CE you will have to do it manually, because there is no automated process. Even the database architecture is different.

Another way to cope with it is to set up one more server nearby and move data manually from the old Alfresco EDM to the new one.

“When it comes to pleasing clients, think automated,” we say to ourselves. Out of necessity, we created a migrator: a module that queries the data and moves it to the new software without losing any values. We developed an entire custom module that could grab all data (users and their groups, permissions, websites, roles, emails, etc.) and apply it to a new instance. It was of paramount importance for the client to have all workflow settings transferred precisely during the Alfresco migration. If your system has a complex structure, migrating data and workflow settings will take far more time.

Replication. Alfresco data replication comes in handy with geographically distributed deployments. Where high network latency and bandwidth limitations may affect software performance, it is recommended to set up an additional independent Alfresco instance, provided that the versions are identical. Data will flow from the first instance to the neighboring one. It works like a smart backup.

Simple or advanced: that is the question

You may want to migrate only files, without saving meta information. Then you can use any program that works with the FTP, CMIS, CIFS, or WebDAV protocols. Either way, the data can be downloaded to a local server and uploaded to the Alfresco ECMS. Your data will then be fixed and transferred in the state it had at the moment of saving on local hardware, without the dates of creation and change, the custom file version, the workflow it belongs to, the names of users who contributed to the document, etc. But you probably chose Alfresco for its ability to add metadata, didn’t you?

For advanced data migration, it is necessary to develop custom functionality that saves and copies a document’s creation date, the name of its last modifier, all previous document versions, etc. It is hard and challenging work. In our experience, we created modules that could save additional parameters about files when migrating.
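
To make the idea of saving additional parameters more concrete, here is a minimal sketch in Python. It is not the module described above. It assumes the documents have already been exported to a local folder, and it records only what the file system knows (size and modification time) in a JSON manifest that travels with the files. The function names `capture_metadata` and `save_manifest` are hypothetical, and repository-level properties such as the creator or the version history would have to come from Alfresco itself.

```python
import json
import os
from datetime import datetime, timezone


def capture_metadata(root):
    """Collect basic file-system metadata for every file under `root`.

    Returns a dict keyed by the path relative to `root` (with "/" separators).
    Only file-system facts are recorded; repository properties are out of scope.
    """
    manifest = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            info = os.stat(path)
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
            manifest[rel] = {
                "size_bytes": info.st_size,
                "modified_utc": modified.isoformat(),
            }
    return manifest


def save_manifest(manifest, target):
    """Write the manifest as JSON so it can be migrated alongside the files."""
    with open(target, "w", encoding="utf-8") as fh:
        json.dump(manifest, fh, indent=2, sort_keys=True)
```

On the target side, a manifest like this could be read back and compared with, or applied to, the imported documents.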

What should you pay attention to?

Iterative Alfresco development has taught us to single out the focal points in Alfresco data migration. We have gathered several practical tips that helped us and, we believe, will help you too. Finally, we’d like to close with three unavoidable highlights of Alfresco migration services that we can’t leave our reader without.

Timely backups. It is vital to make backups whenever an update is made on the server during migration. Before you move to the new Alfresco in full, it is imperative to have the latest backup. Migration can last from two hours to a couple of months, depending on the amount of data. When we started, we focused on getting the technical part of our work right. Years later, it became obvious that we should be more customer-centric and let the client keep working with documents. One of the methods Aimprosoft developers use is custom code that checks whether anything has changed on the main server and updates the backup at once, without waiting until the next day or slowing down the EDM system.
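
A change-checking backup of this kind can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the code used in production: both the main content store and the backup are assumed to be plain folders, a file counts as changed when it is missing from the backup, differs in size, or has a newer modification time, and the function name `sync_changed_files` is hypothetical.

```python
import os
import shutil


def _needs_copy(src, dst):
    """Return True when `dst` is missing or looks older than `src`."""
    if not os.path.exists(dst):
        return True
    src_info, dst_info = os.stat(src), os.stat(dst)
    return (src_info.st_size != dst_info.st_size
            or src_info.st_mtime > dst_info.st_mtime)


def sync_changed_files(source_root, backup_root):
    """Copy only new or changed files from `source_root` into `backup_root`.

    Returns the sorted list of relative paths that were copied.
    """
    copied = []
    for folder, _dirs, files in os.walk(source_root):
        for name in files:
            src = os.path.join(folder, name)
            rel = os.path.relpath(src, source_root)
            dst = os.path.join(backup_root, rel)
            if _needs_copy(src, dst):
                os.makedirs(os.path.dirname(dst), exist_ok=True)
                # copy2 also copies the modification time, so an unchanged
                # file is not picked up again on the next run.
                shutil.copy2(src, dst)
                copied.append(rel.replace(os.sep, "/"))
    return sorted(copied)
```

Run on a schedule or triggered by a change notification, a routine like this touches only what changed, so users can keep working while the backup stays current.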

Accuracy of end result. Gathering requirements is a necessary phase of any development process. At the start, the statement of work is handed to developers together with user acceptance criteria for the migrated data. After the Alfresco migration is done, quality assurance specialists set about detecting bugs and developers about fixing them. For most of the clients from law firms or logistics companies we’ve had the honor to work with, it is of utmost importance that file creation dates match the pre-migration versions. Working in tandem with the client leaves little margin for error, because the best inspector is the one who knows the system inside and out and will stress the essential concerns. Alternatively, Alfresco developers can write a piece of code that automatically checks the hashes of batches of newly migrated documents. Verification takes from one hour to one week.
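
The hash check mentioned above could look roughly like the following Python sketch. It assumes that both the original and the migrated documents are reachable as folders on disk (for example, through an export or a mounted share) and compares them by SHA-256. The function names are hypothetical, and a real check against a live repository would read content through one of Alfresco’s own interfaces instead.

```python
import hashlib
import os


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large documents do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root):
    """Map each file's relative path under `root` to its SHA-256 hash."""
    hashes = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            hashes[rel] = file_sha256(path)
    return hashes


def compare_trees(source_root, target_root):
    """Report files that are missing, unexpected, or different after migration."""
    src, dst = hash_tree(source_root), hash_tree(target_root)
    return {
        "missing": sorted(set(src) - set(dst)),
        "unexpected": sorted(set(dst) - set(src)),
        "mismatched": sorted(p for p in src.keys() & dst.keys() if src[p] != dst[p]),
    }
```

An empty report in all three lists means that every document arrived with identical content. It says nothing about metadata such as creation dates, which still has to be checked separately.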

Customisations. Let’s assume a client has styled their EDM system with a branded theme. Naturally, they will want the login skin, for example, to remain unchanged, along with the other alterations that have to be migrated. The Alfresco platform changes from version to version. When moving to a new version, one should not forget about the elements that were customized in the previous one.