Thursday, 22 December 2011

Migrating from Sharepoint

Hi All,

Sharepoint is a whore to get away from. But here is some useful information and how to do it, quick and dirty.

First things first, do you know where your data is? Nope! It's in the database.

This is just for interest. Run this against your sharepoint DB:

Select d.dirname, d.leafname, d.setuppathuser, a.[content] from docs d
left outer join alldocversions a on a.id = d.id
where dirname = '[your directory name in the URL]'
and a.[content] is not null

What this will do is return all the documents. The important column here is a.[content]. This is your file converted into hex. If you don't believe me copy and paste it into a hex convertor on the web or something. Paste that into notepad, save as the right file type yada yada.

Anyway, get rid of the a.[content] entries in the sql then run it again.

Acceptable loss 1. The only docs with pathsetupuser are ones with multiple versions. Single versions do not have this metadata. That is in another hex encoded column called metainfo (select metainfo from docs).

Unless you want to faff with all that, you will lose the metadata. I couldn't personally be arsed and it was an acceptable loss as we are moving to a new file storage system.

Now you need to rescue your files. You can do this by running a tool here:

c:\program files\common files\microsoft shared\web server
extensions\12\bin

This is a sample command line you will need to run:

stsadm.exe -o export -url http://[servername]/sitename]/ -versions 1 -filename d:\sharepointexport -nofilecompression
Ok, quick overview.

-o export: does what it says
-url: the base url before the /forms/allitems.aspx
-versions 1: only the latest versions of files
-filename: destination directory. This mustn't exist before you run the command.
-nofilecompression: important for us. This tool exports .dat files, with compression on we will just get one big one.

OK, run that commandline and get your files. You should have all sorts of files like:

0000000A.dat
00000004.dat
....

These are your files. You should also get an xml file called manifest. You need this as it has the metadata like the real bloody filename!

At this point, I gave up on windows. I was 3 hours in and I managed to get some help off a friend of mine.

We copied the xml file to a linux box, and used perl to parse the xml file and generate a .BAT file at the end that would rename the files and move them into the right directory. Here is the perl code:

use strict;
print "\@echo off\n";
open (my $FL, "<manifest.xml");
foreach my $line (<$FL>) {
chomp($line);
if ($line =~ /^<file url="'\">)
my $realfilename = $line;
$realfilename =~ s/.*?Name=\"(.*?)\".*/\1/;
my $directory = $line;
$directory =~ s/.*?File Url\"(.*?)\".*/\1/;
$directory =~ s/\/.*?$//;
my $currfilename = $line;
$currfilename =~ s/.*?FileValue=\"(.*?)\".*/\1/;
print "copy $currfilename \"$directory\\$realfilename\"\n";
}

on your linux box run this script and > rescuemyfiles.BAT

Right!
Copy this file to your sharepoint server, and drop it into the same directory as your rescued .dat files.
All the .dat files are, are your files. You can rename one from .dat to .jpg or whatever and they will work. All the .bat file does is grab the file info from manifest.xml and copies them into a directory for you.

So there you have it. How to resuce your files from sharepoint. Have fun!

thanks for reading,

Trev
p.s. I never said this was clean did I? :-)




 

No comments:

Post a Comment