This article describes how you can add the ability to convert DOC to PDF (DOC2PDF) to Microsoft
Office SharePoint Server 2007 (MOSS) using Aspose.Words.
In this article, we show how you can create a small console application in Visual Studio that works
as a document converter for SharePoint and invokes the Aspose components to perform the
conversion.
It is easy to add other types of conversions, such as DOC to DOCX, DOC to RTF, RTF to DOC, DOC to
WordML, WordML to DOC and HTML to DOC, by following the example in this article. You can also
investigate other Aspose file format components such as Aspose.Cells and Aspose.Slides and use
them to support even more types of document conversions in SharePoint.
Document Converters in SharePoint
Microsoft Office SharePoint Server 2007 includes a new feature that allows the conversion of
documents from one format (content type) to another. You can use document conversions to
transform your content to suit your business requirements. You can invoke the conversion from the
user interface or programmatically via the SharePoint Object Model.
Built-in Document Converters in SharePoint
SharePoint includes several document converters that you can use out of the box:

.DOCX (Office Open XML) to HTML web page (also .DOCM to web page)

InfoPath to HTML web page

XML to HTML web page
Converting a Word Document (DOCX) to a Web Page using a built-in MOSS document
converter.
Need for More Document Converters
The set of document converters included with MOSS is limited. You can only convert DOCX, InfoPath
and XML documents into web pages.
There are many possible scenarios where additional converters might be required:

When a draft document is stored in one format (Microsoft Word DOC) and the final
document is published in another format (Adobe PDF) to a customer-facing site.

When the main format for documents is DOCX inside the organization, but it needs to
make the documents available to its customers and partners as DOC documents or
vice versa.
Extensible Document Converter Framework
Thankfully, the document converter framework in SharePoint is extensible. Custom converters can be
implemented and integrated into SharePoint, so any required content type conversion can be
supported.
MSDN has a good section about document converters, Document Converters Overview.
Although it is geared towards developers implementing a custom document converter, it makes
good reading for any IT professional who is tasked with planning or supporting document converters
in SharePoint.
Summary of Document Converters in SharePoint
To summarize the features of Document Converters in SharePoint:

Extensible. Custom converters can be added to facilitate almost any content
conversion.

A Document Converter is an executable. You can develop one or find a suitable
commercial product.

Document conversions are usually resource intensive; they run on the server(s) and
are controlled by the SharePoint load balancer service.

Documents that are the results of the conversion can be versioned and they maintain
a link to the original document in their metadata, history, properties etc.
Aspose to the Rescue
Aspose provides a great line of .NET and Java components. Trusted by thousands of customers
worldwide, the products include File Format Components, Reporting Products, Visual Components
and Utility Components.
Aspose File Format Components include products such as Aspose.Words, Aspose.Cells, Aspose.Pdf,
Aspose.Slides and so on that allow you to programmatically open, modify, generate, save, merge
and convert documents in various formats including DOC, DOCX, RTF, WordML, HTML, PDF, XLS,
PPT and others. These products are class libraries that developers use when building .NET
or Java applications that require access to documents in different formats.
Aspose File Format Components are often chosen for their superior performance, scalability and
stability in a server environment over Microsoft Office Automation. Microsoft Office Automation is
not recommended on the server for these reasons. While Aspose components cannot be directly
used as document converters for SharePoint out of the box, this article shows how you can easily
create a small .NET application that wraps an Aspose component and works as a document
converter for SharePoint.
Create a Document Converter for MOSS
A document converter for SharePoint (MOSS) is a custom executable that SharePoint calls with
command line arguments. The arguments specify the input, output, configuration and log files. The
command line arguments are described in detail in Document Converter Run Command in MSDN.
We are going to create a simple console application in Visual Studio 2005 that supports the
command line arguments passed by SharePoint and performs the DOC to PDF conversion using
Aspose.Words.
This example uses Visual Studio 2005 and builds the application for .NET 2.0, but you can also use
Visual Studio 2003 and build the document converter for .NET 1.1, which works just as well.
SharePoint places no requirements on the .NET version of a document converter; in fact, a document
converter does not have to be a .NET application at all. It just needs to be an executable.
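As an illustration of the command line contract described above, the following is a minimal sketch of an argument parser that keeps all four values and fails early when a switch has no value. The `ConverterArgs` and `ParseDemo` names and the sample paths are hypothetical and not part of SharePoint or Aspose; only the four switch names come from this article.

```csharp
using System;

// Hypothetical helper: holds the four file names SharePoint passes on the command line.
public class ConverterArgs
{
    public string InFile;
    public string OutFile;
    public string ConfigFile;
    public string LogFile;

    public static ConverterArgs Parse(string[] args)
    {
        ConverterArgs result = new ConverterArgs();
        for (int i = 0; i < args.Length; i++)
        {
            string name = args[i];

            // Every switch is expected to be followed by a file name.
            if (i + 1 >= args.Length)
                throw new ArgumentException("Missing value for argument: " + name);

            string value = args[i + 1];
            i++; // Skip the value on the next loop iteration.

            switch (name.ToLowerInvariant())
            {
                case "-in": result.InFile = value; break;
                case "-out": result.OutFile = value; break;
                case "-config": result.ConfigFile = value; break;
                case "-log": result.LogFile = value; break;
                default:
                    throw new ArgumentException("Unknown command line argument: " + name);
            }
        }

        // The conversion cannot run without an input and an output file.
        if (result.InFile == null || result.OutFile == null)
            throw new ArgumentException("Both -in and -out must be specified.");

        return result;
    }
}

public class ParseDemo
{
    public static void Main()
    {
        // Sample paths are placeholders for illustration only.
        string[] sample = {
            "-in", @"C:\temp\in.doc", "-out", @"C:\temp\out.pdf",
            "-config", @"C:\temp\cfg.xml", "-log", @"C:\temp\log.txt" };

        ConverterArgs parsed = ConverterArgs.Parse(sample);
        Console.WriteLine(parsed.InFile + " -> " + parsed.OutFile);
    }
}
```

Keeping the `-log` value around makes it possible to write diagnostics to the file SharePoint asks for instead of a hard coded path.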
Download and Install Aspose Components
You need to download Aspose.Words for .NET from Aspose Downloads.
Install Aspose.Words on your development computer. All Aspose components, when installed, work
in evaluation mode. The evaluation mode has no time limit and injects watermarks into produced
documents.
Create a Project
Start Visual Studio 2005 and create a new console application. This example will show a C# console
application, but you can use VB.NET too.
Add References
Add a reference to C:\Program Files\Aspose\Aspose.Words\Bin\net2.0\Aspose.Words.DLL.
Add Code
Example
The following is the complete code of the document converter.
[C#]
using System;
using System.IO;
using Aspose.Words;

namespace Examples
{
    /// <summary>
    /// DOC2PDF document converter for SharePoint.
    /// Uses Aspose.Words to perform the conversion.
    /// </summary>
    public class ExMossDoc2Pdf
    {
        /// <summary>
        /// The main entry point for the application.
        /// </summary>
        [STAThread]
        static void Main(string[] args)
        {
            // Although SharePoint passes "-log <filename>" to us and we are
            // supposed to log there, for the sake of simplicity, we will use
            // our own hard coded path to the log file.
            //
            // Make sure there are permissions to write into this folder.
            // The document converter will be called under the document
            // conversion account (not sure what name), so for testing purposes
            // I would give the Users group write permissions into this folder.
            gLog = new StreamWriter(@"C:\Aspose2Pdf\log.txt", true);
            try
            {
                gLog.WriteLine(DateTime.Now.ToString() + " Started");
                gLog.WriteLine(Environment.CommandLine);

                ParseCommandLine(args);

                // Uncomment the code below when you have purchased a license for Aspose.Words.
                // You need to deploy the license in the same folder as your
                // executable; alternatively, you can add the license file as an
                // embedded resource to your project.
                //
                // // Set license for Aspose.Words.
                // Aspose.Words.License wordsLicense = new Aspose.Words.License();
                // wordsLicense.SetLicense("Aspose.Total.lic");

                ConvertDoc2Pdf(gInFileName, gOutFileName);
            }
            catch (Exception e)
            {
                // Log the failure and report it to SharePoint via a non-zero exit code.
                gLog.WriteLine(e.Message);
                Environment.ExitCode = 100;
            }
            finally
            {
                gLog.Close();
            }
        }

        private static void ParseCommandLine(string[] args)
        {
            int i = 0;
            while (i < args.Length)
            {
                string s = args[i];
                switch (s.ToLower())
                {
                    case "-in":
                        i++;
                        gInFileName = args[i];
                        break;
                    case "-out":
                        i++;
                        gOutFileName = args[i];
                        break;
                    case "-config":
                        // Skip the name of the config file and do nothing.
                        i++;
                        break;
                    case "-log":
                        // Skip the name of the log file and do nothing.
                        i++;
                        break;
                    default:
                        throw new Exception("Unknown command line argument: " + s);
                }
                i++;
            }
        }

        private static void ConvertDoc2Pdf(string inFileName, string outFileName)
        {
            // You can load not only DOC here, but any format supported by
            // Aspose.Words: DOC, RTF, WordML, HTML.
            Document doc = new Document(inFileName);
            doc.Save(outFileName, SaveFormat.Pdf);
        }

        private static string gInFileName;
        private static string gOutFileName;
        private static StreamWriter gLog;
    }
}
[Visual Basic]
Imports Microsoft.VisualBasic
Imports System
Imports System.IO
Imports Aspose.Words

Namespace Examples
    ''' <summary>
    ''' DOC2PDF document converter for SharePoint.
    ''' Uses Aspose.Words to perform the conversion.
    ''' </summary>
    Public Class ExMossDoc2Pdf
        ''' <summary>
        ''' The main entry point for the application.
        ''' </summary>
        <STAThread> _
        Shared Sub Main(ByVal args As String())
            ' Although SharePoint passes "-log <filename>" to us and we are
            ' supposed to log there, for the sake of simplicity, we will use
            ' our own hard coded path to the log file.
            '
            ' Make sure there are permissions to write into this folder.
            ' The document converter will be called under the document
            ' conversion account (not sure what name), so for testing purposes
            ' I would give the Users group write permissions into this folder.
            gLog = New StreamWriter("C:\Aspose2Pdf\log.txt", True)
            Try
                gLog.WriteLine(DateTime.Now.ToString() & " Started")
                gLog.WriteLine(Environment.CommandLine)

                ParseCommandLine(args)

                ' Uncomment the code below when you have purchased a license for Aspose.Words.
                ' You need to deploy the license in the same folder as your
                ' executable; alternatively, you can add the license file as an
                ' embedded resource to your project.
                '
                ' ' Set license for Aspose.Words.
                ' Dim wordsLicense As Aspose.Words.License = New Aspose.Words.License()
                ' wordsLicense.SetLicense("Aspose.Total.lic")

                ConvertDoc2Pdf(gInFileName, gOutFileName)
            Catch e As Exception
                ' Log the failure and report it to SharePoint via a non-zero exit code.
                gLog.WriteLine(e.Message)
                Environment.ExitCode = 100
            Finally
                gLog.Close()
            End Try
        End Sub

        Private Shared Sub ParseCommandLine(ByVal args As String())
            Dim i As Integer = 0
            Do While i < args.Length
                Dim s As String = args(i)
                Select Case s.ToLower()
                    Case "-in"
                        i += 1
                        gInFileName = args(i)
                    Case "-out"
                        i += 1
                        gOutFileName = args(i)
                    Case "-config"
                        ' Skip the name of the config file and do nothing.
                        i += 1
                    Case "-log"
                        ' Skip the name of the log file and do nothing.
                        i += 1
                    Case Else
                        Throw New Exception("Unknown command line argument: " & s)
                End Select
                i += 1
            Loop
        End Sub

        Private Shared Sub ConvertDoc2Pdf(ByVal inFileName As String, ByVal outFileName As String)
            ' You can load not only DOC here, but any format supported by
            ' Aspose.Words: DOC, RTF, WordML, HTML.
            Dim doc As Document = New Document(inFileName)
            doc.Save(outFileName, SaveFormat.Pdf)
        End Sub

        Private Shared gInFileName As String
        Private Shared gOutFileName As String
        Private Shared gLog As StreamWriter
    End Class
End Namespace
Select the Release configuration and rebuild the solution.
You now have the AsposeDoc2Pdf.exe executable that can be used as a document converter for
SharePoint.
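Before deploying the converter, it can help to run it by hand with the same kind of arguments SharePoint passes. The sketch below is one way to do that; the `ConverterSmokeTest` class, all file paths and the order of the switches are assumptions for illustration. It relies only on the behavior shown in the code above: the converter accepts `-in`, `-out`, `-config` and `-log`, and sets exit code 100 on failure.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public class ConverterSmokeTest
{
    // Builds a command line in the "-in ... -out ... -config ... -log ..." shape
    // that the converter's ParseCommandLine method understands.
    public static string BuildArguments(string inFile, string outFile,
                                        string configFile, string logFile)
    {
        return "-in \"" + inFile + "\" -out \"" + outFile + "\""
             + " -config \"" + configFile + "\" -log \"" + logFile + "\"";
    }

    // Starts the converter, waits for it to finish and returns its exit code.
    public static int Run(string exePath, string arguments)
    {
        ProcessStartInfo info = new ProcessStartInfo(exePath, arguments);
        info.UseShellExecute = false;
        using (Process process = Process.Start(info))
        {
            process.WaitForExit();
            return process.ExitCode;
        }
    }

    public static void Main()
    {
        // All paths below are placeholders; adjust them to your machine.
        string exe = @"C:\Aspose2Pdf\AsposeDoc2Pdf.exe";
        if (!File.Exists(exe))
        {
            Console.WriteLine("Converter not found: " + exe);
            return;
        }

        string arguments = BuildArguments(
            @"C:\Aspose2Pdf\test.doc", @"C:\Aspose2Pdf\test.pdf",
            @"C:\Aspose2Pdf\config.xml", @"C:\Aspose2Pdf\sharepoint-log.txt");

        int exitCode = Run(exe, arguments);
        Console.WriteLine(exitCode == 0
            ? "Conversion succeeded."
            : "Conversion failed, exit code " + exitCode);
    }
}
```

If the run fails, the message written by the converter to C:\Aspose2Pdf\log.txt is the first place to look.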
How to Build Converters for Other Formats
It is very easy to build more document converters. Aspose.Words supports DOC, DOCX, RTF,
WordML and HTML documents and can perform conversions between these formats in any direction.
Conversions between Microsoft Word formats (DOC, DOCX, RTF and WordML) are high-fidelity,
meaning no content or formatting in the document is lost.
Example
Converts an RTF document to OOXML.
[C#]
public static void ConvertRtfToDocx(string inFileName, string outFileName)
{
    // Load an RTF file into Aspose.Words.
    Aspose.Words.Document doc = new Aspose.Words.Document(inFileName);
    // Save the document in the OOXML format.
    doc.Save(outFileName, Aspose.Words.SaveFormat.Docx);
}
[Visual Basic]
Public Shared Sub ConvertRtfToDocx(ByVal inFileName As String, ByVal outFileName As String)
    ' Load an RTF file into Aspose.Words.
    Dim doc As Aspose.Words.Document = New Aspose.Words.Document(inFileName)
    ' Save the document in the OOXML format.
    doc.Save(outFileName, Aspose.Words.SaveFormat.Docx)
End Sub
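If one executable should serve several conversions, the target format can be derived from the extension of the output file. The following is only a sketch: `FormatPicker` is a hypothetical helper, and the returned strings are assumed to match member names of the Aspose.Words `SaveFormat` enumeration (`Pdf` and `Docx` appear in this article; `Doc`, `Rtf` and `Html` should be checked against your Aspose.Words version).

```csharp
using System;
using System.IO;

public class FormatPicker
{
    // Returns the assumed name of the SaveFormat member that matches the
    // extension of the output file, for example "out.pdf" gives "Pdf".
    public static string SaveFormatNameFor(string outFileName)
    {
        string extension = Path.GetExtension(outFileName).ToLowerInvariant();
        switch (extension)
        {
            case ".doc": return "Doc";
            case ".docx": return "Docx";
            case ".rtf": return "Rtf";
            case ".pdf": return "Pdf";
            case ".htm":
            case ".html": return "Html";
            default:
                throw new NotSupportedException("No converter for extension: " + extension);
        }
    }

    public static void Main()
    {
        string name = SaveFormatNameFor("report.PDF");
        Console.WriteLine(name);

        // With Aspose.Words referenced, the name could be turned into the enum value:
        // SaveFormat format = (SaveFormat)Enum.Parse(typeof(SaveFormat), name);
        // new Document(inFileName).Save(outFileName, format);
    }
}
```

Each conversion still needs its own DocumentConverter definition in SharePoint, because a definition ties one source extension to one destination extension.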
Deploy a Document Converter to MOSS
The document converter for SharePoint must be packaged as a SharePoint Feature and deployed at
the Web-application level.
If you need an overview of deploying document converters as SharePoint features, see the following
topics in MSDN:

Document Converter Deployment.

Working with Features.
A Feature in SharePoint is a unit of functionality that can be added to or removed from a SharePoint
server. A feature is defined in an XML file that describes the feature, its name, scope and required
files. The feature definition XML and accompanying files must be placed in a folder in the C:\Program
Files\Common Files\Microsoft Shared\web server extensions\12\TEMPLATE\FEATURES folder.
Each feature needs to have a Feature.xml file that specifies the feature name, unique id, scope and
the elements that comprise the feature.
Create a Folder for the Feature
Create the C:\Program Files\Common Files\Microsoft Shared\web server
extensions\12\TEMPLATE\FEATURES\AsposeDoc2Pdf folder on the SharePoint server.
Create a Feature Definition XML File
In the feature folder, create the Feature.xml as shown below.
Content of the Feature.xml file.
<Feature xmlns="http://schemas.microsoft.com/sharepoint/"
         Id="{b4ce4c29-8aaf-4b80-bb63-d676e836f8ef}"
         Title="DOC to PDF Converter (by Aspose)"
         Description="Makes it possible to convert documents from DOC to PDF."
         Scope="WebApplication">
  <ElementManifests>
    <ElementManifest Location="Elements.xml"/>
    <ElementFile Location="AsposeDoc2Pdf.exe"/>
    <ElementFile Location="Aspose.Words.dll"/>
  </ElementManifests>
</Feature>
If you create more converters later on, pick a different GUID for the feature. The easiest way to
generate a unique GUID is to use the Tools / Generate GUID menu in Visual Studio.
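If Visual Studio is not at hand, a GUID in the same braced form can also be produced with a few lines of code. This is a minimal sketch; the `NewFeatureId` class name is made up for the example.

```csharp
using System;

public class NewFeatureId
{
    // The "B" format specifier produces hyphenated hex digits wrapped in braces,
    // which is the form used by the Id attributes in the XML files.
    public static string NewBracedGuid()
    {
        return Guid.NewGuid().ToString("B");
    }

    public static void Main()
    {
        // Use a separate GUID for the feature and for the converter definition.
        Console.WriteLine("Feature Id:   " + NewBracedGuid());
        Console.WriteLine("Converter Id: " + NewBracedGuid());
    }
}
```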
Create a Document Converter Definition XML File
The ElementManifest element in the Feature.xml file refers to the Elements.xml file. This file
contains the definition of the document converter. The definition of the document converter includes
unique id, display name, the name of the executable to launch and the extensions of the source and
destination content types.
In the feature folder, create the Elements.xml as shown below.
Content of the Elements.xml file.
<Elements xmlns="http://schemas.microsoft.com/sharepoint/">
  <DocumentConverter Id="{a4df1dac-a22c-431a-bbf6-dcc91848fee9}"
                     Name="Word Document to PDF (by Aspose)"
                     App="AsposeDoc2Pdf.exe"
                     From="doc"
                     To="pdf"
  />
</Elements>
If you create more converters later on, pick a different GUID for the converter. The easiest way to
generate a unique GUID is to use the Tools / Generate GUID menu in Visual Studio.
AsposeDoc2Pdf is now deployed as a SharePoint Feature.
Enable Document Converters
You need to enable document conversions in SharePoint, as they seem to be disabled by default.
Go to the Central Administration / Application Management / Configure Document Conversion screen
and enable document conversions.
Enable document conversions in MOSS.
It is a good idea to check that the document conversion services are installed and running. In my
case they were installed and running.
Go to Central Administration / Operations / Services on Server and make sure that the
Document Conversions Launcher Service and the Document Conversions Load Balancer Service are
installed and running.
Check Document Conversion services are installed and running.
Install the Document Converter Feature
Now we need to install the feature so the document converter becomes available in SharePoint.
Execute the following command on the server:
"C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\BIN\STSADM.EXE"
-o installfeature -filename AsposeDoc2Pdf\Feature.xml -force
Activate the Document Converter Feature
Now we need to activate the document converter. Execute the following command on the server:
"C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\BIN\STSADM.EXE"
-o activatefeature -name AsposeDoc2Pdf -url http://win2k3r2ee
Note that you need to specify a URL in the -url argument. I have not fully figured out exactly what
URL must be specified; I just specified the name of my SharePoint server and it worked, making the
document converter available to all SharePoint sites on this server.
Now is a good time to verify that the feature was in fact installed and activated. In my case, I found
I still needed to click the Activate button in the SharePoint Central Administration / Application
Management / Manage Web Application Features window.
Make sure the new document converter feature is activated in the Manage Web
Application Features window.
Copy the Document Converter Files!
After installing and activating the document converter as a SharePoint Feature, I was expecting that
the conversions would just run. But the conversions did not run (nothing was happening) and I had
to examine the logs in the C:\Program Files\Common Files\Microsoft Shared\web server
extensions\12\LOGS folder.
The error message I was getting was that the Document Conversion Launcher Service was
attempting to start my converter C:\Program Files\Microsoft Office
Servers\12.0\TransformApps\AsposeDoc2Pdf.exe, but there was no such file in that directory.
Most likely, I have done something wrong in the Feature Definition XML file so the files of the
feature were not copied to the correct location, but I could not find a solution here, so I copied the
files manually.
Copy the following files from C:\Program Files\Common Files\Microsoft Shared\web server
extensions\12\TEMPLATE\FEATURES\AsposeDoc2Pdf to C:\Program Files\Microsoft Office
Servers\12.0\TransformApps:

AsposeDoc2Pdf.exe

Aspose.Words.DLL
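The manual copy can be scripted so it is easy to repeat after each rebuild. The following is a minimal sketch, assuming the default folder locations named in this article; the `CopyConverterFiles` class is hypothetical and the program must run under an account that can write to the TransformApps folder.

```csharp
using System;
using System.IO;

public class CopyConverterFiles
{
    // Copies the named files from one folder to another, overwriting older copies.
    public static void CopyFiles(string sourceDir, string targetDir, string[] fileNames)
    {
        foreach (string name in fileNames)
        {
            string source = Path.Combine(sourceDir, name);
            string target = Path.Combine(targetDir, name);
            File.Copy(source, target, true); // true = overwrite an existing file
            Console.WriteLine("Copied " + name);
        }
    }

    public static void Main()
    {
        // Default MOSS 2007 locations as used in this article.
        string featureDir = @"C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\TEMPLATE\FEATURES\AsposeDoc2Pdf";
        string transformApps = @"C:\Program Files\Microsoft Office Servers\12.0\TransformApps";

        if (!Directory.Exists(featureDir) || !Directory.Exists(transformApps))
        {
            Console.WriteLine("Expected SharePoint folders were not found on this machine.");
            return;
        }

        CopyFiles(featureDir, transformApps,
            new string[] { "AsposeDoc2Pdf.exe", "Aspose.Words.dll" });
    }
}
```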
Make Sure the Converter is Enabled for the Site
Go to your SharePoint home page, click Site Actions, Site Settings, Modify All Site Settings, Site
Content Types, Document, Manage Document Conversion for This Content Type and make sure your
document converter is enabled.
Checking Document Converter is enabled for Documents on this SharePoint Site.
Test Your Conversion
Finally, we can test if the conversion works.
Upload a test DOC file to the server. I uploaded a document called “Distributable VHD Image
EULA.doc”.
Upload a DOC file to MOSS.
Click on your test file so the context menu opens, select Convert Document, Word Document to PDF
(by Aspose).
Selecting to convert a DOC document to PDF.
Click OK in the window confirming the request for conversion.
Note the conversion is managed by the document conversion schedule and document conversion
load balancer service so it might not happen instantaneously. By default, the document conversion
starts every minute.
Just refresh the page with the list of documents every few seconds until you see the converted
document appear in the list.
The document that is a result of the conversion appears in the list.
Click on the document to download it. It is a PDF document and will open Adobe Reader on your
machine.
Adobe Reader displays the PDF document downloaded from the MOSS site.
Just to verify that Aspose.Words did a great job at accurately converting DOC to PDF, open the
original DOC in Microsoft Word and compare with what you have in Adobe Reader.
The original DOC file opened in Microsoft Word to compare how well it was converted to
PDF.
Troubleshooting
If the conversion does not work, check the MOSS log files, which are detailed. The log files are in
C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\LOGS.
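Because the log files are large, filtering them for the converter name saves time. The sketch below prints matching lines from the most recently written log file. The `LogScanner` class is hypothetical, and both the `*.log` file pattern and the keyword are assumptions; adjust them to what you see in the LOGS folder.

```csharp
using System;
using System.IO;

public class LogScanner
{
    // Returns the most recently written *.log file in the folder, or null if there is none.
    public static string NewestLogFile(string folder)
    {
        string newest = null;
        DateTime newestTime = DateTime.MinValue;
        foreach (string path in Directory.GetFiles(folder, "*.log"))
        {
            DateTime written = File.GetLastWriteTimeUtc(path);
            if (written > newestTime)
            {
                newestTime = written;
                newest = path;
            }
        }
        return newest;
    }

    // Prints every line that contains the keyword (case-insensitive) and returns the count.
    public static int PrintMatches(string path, string keyword)
    {
        int count = 0;
        // FileShare.ReadWrite lets us read a log that SharePoint may still be writing to.
        using (FileStream stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        using (StreamReader reader = new StreamReader(stream))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (line.IndexOf(keyword, StringComparison.OrdinalIgnoreCase) >= 0)
                {
                    Console.WriteLine(line);
                    count++;
                }
            }
        }
        return count;
    }

    public static void Main()
    {
        string logs = @"C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\LOGS";
        if (!Directory.Exists(logs))
        {
            Console.WriteLine("LOGS folder not found: " + logs);
            return;
        }

        string newest = NewestLogFile(logs);
        if (newest == null)
        {
            Console.WriteLine("No log files found.");
            return;
        }

        int matches = PrintMatches(newest, "AsposeDoc2Pdf");
        Console.WriteLine(matches + " matching line(s) in " + newest);
    }
}
```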
Summary
In this article, I have shown how to use Aspose.Words to add the DOC2PDF conversion feature to
MOSS. I have also shown it is very easy to add many more types of conversions to SharePoint using
Aspose components.
If you feel that building your own document converter as shown in this article is too much for you
and you would prefer a finished product with a simple installer, let us know. We might package it as
a product eventually, say Aspose.Words for SharePoint.
If you are using Microsoft SQL Server Reporting Services, make sure to check out our other
product, Aspose.Words for Reporting Services, which makes it possible to generate true DOC,
DOCX, RTF and WordprocessingML reports in Microsoft SQL Server 2005 Reporting Services.
Please excuse any technical inaccuracies regarding SharePoint (if you find any) because my
experience with SharePoint before this article was nil and I had to spend some time grasping many
concepts that were new to me such as sites, web applications, document libraries and so on and
what they mean in the context of MOSS.
Any questions or comments are welcome in the Aspose.Words Forums.