Posted at: 3:53 PM on 24 February 2010 by Muhimbi
Earlier today we released a new version of our popular PDF Converter for SharePoint. One of the key changes in this version is that it fully supports Office 2010 file formats, including all new formatting features such as Excel 2010 Sparklines.
Full installation details are provided in Appendix – Office 2010 Installation of the Administration Guide. However, one detail is so essential that we feel we have to repeat it in this blog post. After all, who reads boring documentation anyway?
The single most important thing to take away from this post is that no matter what CPU architecture you deploy the PDF Converter on, if you plan to use Office 2010 to carry out the conversions then you should always install the 32 bit version of Office 2010.
Even though the Muhimbi PDF Converter for SharePoint is a hybrid 32 / 64 bit application, the 64 bit version of Office will not work in combination with our software. Even if it did, it would provide little to no benefit.
For more details about the differences between the 32 and 64 bit versions of Office, including Microsoft’s recommendation to run the 32-bit version of Office 2010 on 64 bit hardware, read this article on Microsoft’s Office 2010 blog.
Labels: Articles, PDF Converter, Products
Posted at: 2:29 PM on by Muhimbi
Ah, those pesky customers of ours, always looking for some niche functionality that is impossible to include in a generic product. However, using the Workflow Power Pack for SharePoint we can achieve almost anything we can think of…..almost.
Previously I described how to configure PDF Security settings from a SharePoint workflow and how to automatically watermark PDF files from a workflow. This time I’ll show how to add JavaScript to any PDF file to automatically print the current date (the date the PDF was opened) on every page. In essence this adds a print date without modifying the PDF file every day to include the current date.
A quick introduction for those not familiar with the product: The Muhimbi Workflow Power Pack for SharePoint allows custom C# or VB.NET code to be embedded in SharePoint Designer Workflows without the need to resort to complex Visual Studio based workflows, the development of bespoke Workflow Activities or long development cycles.
The solution presented below executes a workflow whenever a PDF file is added or updated. It iterates over all pages and inserts a form field on each page. Some client side JavaScript is then added to the PDF file that iterates over all newly added fields to insert the current date every time the PDF file is opened.
As the code is well documented, it is easy to make further changes and customisations, e.g. change the formatting of the date or the position of the label. Note that this has only been tested with a recent version of Adobe Acrobat Reader. If you use a different PDF viewer your mileage may vary.
Create the workflow as follows:
- Download and install the Muhimbi Workflow Power Pack for SharePoint.
- Download and install the Muhimbi PDF Converter for SharePoint.
Note that you need version 3.2.0.20 or newer; older versions do not allow JavaScript to be inserted.
- Download this article’s source code.
- We need to be able to access functionality in the Muhimbi.SharePoint.DocumentConverter.PDF and System.Drawing assemblies. Add these references to the relevant Web Application using the Workflow Power Pack Central Administration screens as described in the Administration Guide. Make sure to place each reference on a new line.
- Make sure you have the appropriate privileges to create workflows on a site collection.
- Create a new workflow using SharePoint Designer.
- On the Workflow definition screen associate the workflow with the Shared Documents library, tick the boxes next to both ‘Automatically start….’ options and proceed to the next screen.
- We only want to act on files of type PDF. Although we could have put this validation in the code, in this example we use a workflow condition for it. Add a Compare Any Data Source condition and:
a. Click on the first value followed by the display data binding (fx) button.
b. Select Current Item as the Source and select File Type in the Field. Click the OK button to continue.
d. Click on the second value and enter pdf. (Use lower case as the compare option is case sensitive).
- Click the Actions button and insert the Execute Custom Code action.
- Optionally click parameter 1 and enter a relative or absolute destination path. Leave the parameter empty to save the modified file on top of the existing PDF file. For details about how paths are handled, see this post and search for the words ‘this url’.
- Insert the C# based code embedded in step #3’s download (also listed below) by clicking this code.
/*********************************************************************************************
Muhimbi PDF Converter - JavaScript Watermarking
Copyright 2010, Muhimbi Ltd - www.muhimbi.com - All rights reserved
The following code shows a simple way of adding JavaScript to existing PDF Files. It adds
the current date to each page in the document in order to simulate a 'print date' that is
always up to date without the need to modify the PDF file. The code is automatically executed
when the document is opened in the Adobe Acrobat Viewer.
Error and permission checking as well as other minor features have been omitted for the sake
of brevity and clarity.
Ideally PDF Conversion, applying security and watermarking is executed in the same step, see
http://blog.muhimbi.com/2010/01/configure-pdf-security-from-sharepoint.html
This code requires Muhimbi’s PDF Converter and Workflow Power Pack to be installed.
*********************************************************************************************/
using System.Drawing;
using System.IO;
using Syncfusion.Pdf;
using Syncfusion.Pdf.Parsing;
using Syncfusion.Pdf.Graphics;
using Syncfusion.Pdf.Interactive;
using Muhimbi.SharePoint.DocumentConverter.PDF;
SPFile spSourceDocument = MyWorkflow.Item.File;
string destinationFileName = spSourceDocument.Name;
string destinationFolderName = MyWorkflow.Parameter1 as string;
// ** Load the document
PdfLoadedDocument sourceDocument = new PdfLoadedDocument(spSourceDocument.OpenBinary());
PdfDocument destinationDocument = new PdfDocument();
// ** Copy all pages from the source document into the destination document
// ** so we can add JavaScript actions.
destinationDocument.ImportPageRange(sourceDocument, 0, sourceDocument.Pages.Count - 1);
sourceDocument.Dispose();
// ** Iterate over all pages and add a form element
for (int i = 0; i < destinationDocument.Pages.Count; i++)
{
    PdfPage destinationPage = destinationDocument.Pages[i];
    // ** Create a new field using a unique name
    PdfTextBoxField field = new PdfTextBoxField(destinationPage, "_M_PrintDateField_" + i);
    // ** Center the field
    const int BOX_WIDTH = 200;
    int boxLeft = (int)((destinationPage.Size.Width - BOX_WIDTH) / 2);
    field.Bounds = new RectangleF(boxLeft, 20, BOX_WIDTH, 20);
    // ** Format the field
    PdfFont font = new PdfStandardFont(PdfFontFamily.Helvetica, 12f);
    field.Font = font;
    field.BorderColor = new PdfColor(Color.White);
    field.BackColor = new PdfColor(Color.White);
    field.ReadOnly = true;
    field.TextAlignment = PdfTextAlignment.Center;
    destinationDocument.Form.Fields.Add(field);
}
// ** Create a client side script that iterates over all fields and populates the date
string jscript = @"
    var pages = " + destinationDocument.Pages.Count + @";
    var today = util.printd('dd-mm-yyyy', new Date());
    for (var i = 0; i < pages; i++)
    {
        var field = this.getField('_M_PrintDateField_' + i);
        field.value = 'Today is: ' + today;
    }
";
// ** Attach the script to the Document Open event.
PdfJavaScriptAction jsAction = new PdfJavaScriptAction(jscript);
destinationDocument.Actions.AfterOpen = jsAction;
// ** Construct the path and file to write the watermarked PDF file to.
if (string.IsNullOrEmpty(destinationFolderName) == true)
    destinationFolderName = spSourceDocument.ParentFolder.Url;
SPFolder destinationFolder = Utility.GetSPFolder(destinationFolderName, MyWorkflow.Web);
string destinationFilePath = string.Format("{0}/{1}", destinationFolder.Url, destinationFileName);
SPWeb destinationWeb = destinationFolder.ParentWeb;
SPFile spDestinationFile = destinationWeb.GetFile(destinationFilePath);
// ** If a document library requires manual checkout and the file is not checked out, then
// ** check the file out before uploading.
if (spDestinationFile.Exists && spDestinationFile.Item.ParentList.ForceCheckout &&
    spDestinationFile.CheckOutStatus == SPFile.SPCheckOutStatus.None)
{
    spDestinationFile.CheckOut();
}
// ** Add the file to the site including the meta data
using (MemoryStream watermarkedFile = new MemoryStream())
{
    destinationDocument.Save(watermarkedFile);
    spDestinationFile = destinationWeb.Files.Add(destinationFilePath, watermarkedFile,
        spSourceDocument.Item.Properties, true);
}
// ** Check the file back in if this script was responsible for checking it out.
if (spDestinationFile.Item.ParentList.ForceCheckout == true)
{
    spDestinationFile.CheckIn("Auto check-in after PDF watermarking.");
}
- Click the Actions button, select Log to History List, click this message and enter File watermarked.
- Close the Workflow Designer.
- Update an existing PDF or add a new PDF file to your library to trigger the workflow and apply the JavaScript.
Naturally this is just a simple example. Feel free to play around with the code, change which parameters are passed into the workflow, or add different JavaScript. Note that you may want to add a check to the code to see whether the JavaScript / fields have previously been added; otherwise duplicate form fields may be added every time the PDF is updated.
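A check for previously added fields could look something like the sketch below. It deliberately avoids any PDF library calls: it assumes you have already collected the names of the document's existing form fields (how you enumerate them depends on the PDF library version you use) and only tests for the `_M_PrintDateField_` prefix used by the workflow script. `PrintDateGuard` and `HasPrintDateFields` are hypothetical names, not part of any Muhimbi or Syncfusion API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the Muhimbi or Syncfusion APIs.
public static class PrintDateGuard
{
    // Prefix the workflow script uses when naming the fields it adds.
    public const string FieldPrefix = "_M_PrintDateField_";

    // Returns true if any existing field name starts with the prefix,
    // i.e. the document appears to have been processed before.
    public static bool HasPrintDateFields(IEnumerable<string> existingFieldNames)
    {
        if (existingFieldNames == null)
            return false;

        foreach (string name in existingFieldNames)
        {
            if (name != null && name.StartsWith(FieldPrefix, StringComparison.Ordinal))
                return true;
        }
        return false;
    }
}

public static class Program
{
    public static void Main()
    {
        var untouched = new List<string> { "CustomerName", "Signature" };
        var processed = new List<string> { "CustomerName", "_M_PrintDateField_0" };

        Console.WriteLine(PrintDateGuard.HasPrintDateFields(untouched)); // False
        Console.WriteLine(PrintDateGuard.HasPrintDateFields(processed)); // True
    }
}
```

If the check returns true, the workflow code can skip adding the fields and the JavaScript and leave the file untouched.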
Adobe’s JavaScript for Acrobat reference can be found here.
Labels: Articles, PDF Converter, Products, Workflow, WPP
Posted at: 11:02 AM on by Muhimbi
We are very excited to announce the new version of the Muhimbi PDF Converter for SharePoint. The main change in this version is support for Office 2010 based converters and file formats.
We are quite surprised by the number of customers asking for Office 2010 support, especially considering that at the time of writing it is still in beta. On the other hand, it appears to be very stable and particularly the improvements in converting InfoPath forms to PDF format make it worth considering.
For those not familiar with the product, the PDF Converter for SharePoint is a lightweight solution that allows end-users to convert common document types to PDF format from within SharePoint without the need to install any client side software or Adobe Acrobat. It integrates at a deep level with SharePoint and leverages facilities such as the Audit log, localisation, security and tracing. It runs on both WSS 3 as well as MOSS and is available in English, German, Dutch, French and Japanese. For detailed information check out the product page.
Convert files using the User Interface or an automated Workflow
The main changes in version 3.2 are as follows:
| 778 | New: Support for Office 2010 has been added. |
| 768 | New: For InfoPath conversions, disabling of external data sources and embedded code has been made optional. |
For more information check out the following resources:
As always, feel free to contact us using Twitter, our Blog, regular email or subscribe to our newsletter.
Download your free trial here (4MB).
Labels: News, PDF Converter, Products
Posted at: 11:50 AM on 22 February 2010 by Muhimbi
Just a quick note to make sure that anyone typing this error message into a search engine will find this post.
Due to a change in the SharePoint 2010 AllDocVersions table it is no longer possible to rebuild indexes as part of a SQL 2008 maintenance plan and keep the indexes on-line at the same time.
The reason behind this is that the MetaInfo field is no longer of type Image. It is now of type tCompressedBinary:varbinary(MAX).
If you get the error listed below then make sure you open the Rebuild Index Task in your SQL Maintenance plan and disable the ‘Keep index online while reindexing’ option.
If you don’t then you’ll get the following error:
Executing the query "ALTER INDEX [AllDocVersions_PK] ON [dbo].[AllDocVe..." failed with the following error: "An online operation cannot be performed for index 'AllDocVersions_PK' because the index contains column 'MetaInfo' of data type text, ntext, image, varchar(max), nvarchar(max), varbinary(max), xml, or large CLR type. For a non-clustered index, the column could be an include column of the index. For a clustered index, the column could be any column of the table. If DROP_EXISTING is used, the column could be part of a new or old index. The operation must be performed offline.". Possible failure reasons: Problems with the query, "ResultSet" property not set correctly, parameters not set correctly, or connection not established correctly.
If you use one of the default maintenance plans then this error happens before the Backup step. As a result your databases will not be backed up.
Labels: Articles, News, SP2010
Posted at: 1:22 PM on 18 February 2010 by Muhimbi
Not too long ago we wrote about how to create a Short URL from a SharePoint workflow using the Muhimbi URL Shortener (MuSH) in combination with our Workflow Power Pack. The response from our customers has been so positive that we decided to ship a Workflow Action with the new version of MuSH.
For those not familiar with the product, the Muhimbi URL Shortener for SharePoint, aka MuSH, can be used to shorten URLs for typical web applications and SharePoint in particular. It integrates tightly with both WSS and MOSS and allows short URLs to be created directly from a list item’s context menu, workflows and web services. For details see the original product announcement.
Creating short URLs from a workflow can be very useful. For example, you can create a short URL named after data in an InfoPath form, or a short URL for a deeply nested folder. In the example below we create a short URL that always points to the latest entry in the Announcements list. Not sure if this is useful, but it illustrates the power of this facility.
Create the workflow as follows:
- Download and install the Muhimbi URL Shortener for SharePoint.
- Make sure you have the appropriate privileges to create workflows on a site collection.
- Create a new workflow using SharePoint Designer.
- On the Workflow Definition screen associate the workflow with the Announcements list, tick the box next to ‘Automatically start this workflow when a new item is created’ and proceed to the next screen.
- From the Actions Menu select Create Short URL (you may need to click More Actions first).
- The following Workflow Sentence is inserted:
- To auto generate the short URL, leave the optional short name empty, but in our case we always want to give it the same name, so enter Announce.
- Click this ID / address, click the Workflow Lookup button and select Current Item as the Source and ID as the field.
- Click Document / Display Form and select Document (when used in a Document Library) or Display Form showing the item’s properties. As we are not dealing with a Document Library, it doesn’t matter what is selected.
- Click Overwrite / Return null and select Overwrite, as we always want to write the latest announcement using the same short name. (Return Null will return null in the output variable, which can then be tested for so that action can be taken accordingly.)
- Click Variable: this variable and specify the variable the Short URL will be stored in. In this example name it shortURL.
- Add a Log To History List Action and specify the name of the workflow variable the Short URL has been stored in using the Workflow Lookup dialog box.
Close the workflow and create a new Announcement. When the workflow has finished, click the completed link to see the output. Click the generated URL to link to the latest announcement.
Create another Announcement. The Short URL should now link to this latest announcement.
Labels: Articles, MuSH, News, Products, Workflow
Posted at: 2:30 PM on 16 February 2010 by Muhimbi
Life never stops at Muhimbi. It has only been 7 days since we announced a new version of the Workflow Power Pack and here we are again with the brand new ‘2.0’ version of our URL Shortener for SharePoint. This version adds support for generating short URLs from workflows, manually specifying short URL names, new languages as well as some other new features and fixes. For full details see the table below.
For those not familiar with the product, the Muhimbi URL Shortener for SharePoint, aka MuSH, can be used to shorten URLs for typical web applications and SharePoint in particular. It integrates tightly with both WSS and MOSS and allows short URLs to be created directly from a list item’s context menu, workflows and web services. For details see the original announcement.
The main changes and improvements are as follows:
| 562 | New: Allow users to specify their own Short URL. |
| 556 | New: Allow users to specify if they want the short URL to point to the Document rather than the Display Form. |
| 760 | New: Allow the URL Shortener to be called from any page using SharePoint’s Personal Action’s menu. |
| 561 | New: Allow the URL Shortener to be invoked from a SharePoint Designer Workflow. |
| 735 | Fixed: Make sure that the same Short URL is returned if a long URL has been shortened before. |
| 655 | New: Add Support for Simplified Chinese in the user interface. |
For more information check out the:
As always, feel free to contact us using Twitter, our Blog or regular email or subscribe to our newsletter.
Download your free trial here (1MB).
Labels: MuSH, News, Products, Workflow
Posted at: 2:17 PM on 09 February 2010 by Muhimbi
In part 4 of our series of User Guide related blog postings for the Muhimbi Workflow Power Pack for SharePoint we show how to create your own methods in a WPP script in order to keep the code organised and easy to maintain.
A quick introduction in case you are not familiar with the product: The Muhimbi Workflow Power Pack for SharePoint allows custom C# or VB.NET code to be embedded in SharePoint Designer Workflows without the need to resort to complex Visual Studio based workflows, the development of bespoke Workflow Activities or long development cycles.
The following Blog postings are part of this User Guide series:
- Language Features: Discusses the script like syntax, the generic workflow action and condition, passing parameters, returning values from a workflow and using the MyWorkflow property.
- Embedding .net code in a Workflow Condition: Provides a number of examples of how to use the Evaluate Custom Code condition to carry out basic as well as complex conditional tasks.
- Embedding .net code in a Workflow Action: Contains a number of examples of how to use the Execute Custom Code to basically carry out any action you can think of in a SharePoint Designer Workflow.
- Creating Custom Methods (this article): Shows how to create your own methods in your scripts in order to keep the code organised and easy to maintain.
Due to its scripting like approach, the Workflow Power Pack does not allow regular .NET methods to be created. However, by cleverly using delegates you can create your own reusable pieces of code.
To facilitate this, the following delegates can be used in addition to the normal delegates available in the .NET Framework. Note that this only works for C#, as VB.NET does not allow anonymous methods to be created.
delegate void WorkflowMethod(params object[] parameters);
delegate object WorkflowFunction(params object[] parameters);
delegate void WorkflowMethod<ParameterType>(params ParameterType[] parameters);
delegate ReturnType WorkflowFunction<ParameterType, ReturnType>(params ParameterType[] parameters);
There is no need to add these delegates to your WPP Code, they are added automatically.
| Delegate name | Description |
| WorkflowMethod | Method with a void return type. Accepts any number of Object based parameters that can be accessed from the delegate body using the parameters array. Parameters may need to be cast to the correct type before they can be used. |
| WorkflowFunction | Method using a return type of Object. Accepts any number of Object based parameters that can be accessed from the delegate body using the parameters array. Parameters may need to be cast to the correct type before they can be used. |
| WorkflowMethod (Using generics) | Generics based version that allows strongly typed parameters to be passed. |
| WorkflowFunction (Using generics) | Generics based version that allows strongly typed parameters to be passed and returned |
The example provided below creates a generic Debug method to concatenate information to a string. This string is then returned as the workflow’s ReturnValue, from where it can be written to the Workflow History.
string debugString = String.Empty;
WorkflowMethod<string> Debug = delegate(string[] parameters)
{
    debugString += parameters[0] + "\r\n";
};
WorkflowFunction Calculate = delegate(object[] parameters)
{
    return (int)parameters[0] + (int)parameters[1];
};
WorkflowFunction<int, string> Calculate2 = delegate(int[] parameters)
{
    return (parameters[0] + parameters[1]).ToString();
};
Debug("Hello");
Debug("World");
Debug(Calculate(1, 2).ToString());
Debug(Calculate2(3, 4));
MyWorkflow.ReturnValue = debugString;
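To experiment with this pattern outside SharePoint, a small console program along the following lines should do. This is only a sketch: the two generic delegates are declared by hand here because the Workflow Power Pack is not present to add them, and the names `Log`, `Sum` and `Run` are made up for the illustration. It also shows that the `params` declaration lets callers pass any number of arguments.

```csharp
using System;
using System.Text;

// Declared manually only because this runs outside the Workflow Power Pack.
public delegate void WorkflowMethod<ParameterType>(params ParameterType[] parameters);
public delegate ReturnType WorkflowFunction<ParameterType, ReturnType>(params ParameterType[] parameters);

public static class Program
{
    public static string Run()
    {
        StringBuilder log = new StringBuilder();

        // Appends every argument to the log on its own line.
        WorkflowMethod<string> Log = delegate(string[] parameters)
        {
            foreach (string line in parameters)
                log.Append(line).Append("\r\n");
        };

        // Adds up all arguments, however many are passed.
        WorkflowFunction<int, int> Sum = delegate(int[] parameters)
        {
            int total = 0;
            foreach (int value in parameters)
                total += value;
            return total;
        };

        Log("Hello", "World");
        Log(Sum(1, 2, 3).ToString());
        return log.ToString();
    }

    public static void Main()
    {
        // Prints Hello, World and 6 on separate lines.
        Console.Write(Run());
    }
}
```

Inside a real WPP script the delegate declarations and the `Program` class would be dropped, and the resulting string would be assigned to MyWorkflow.ReturnValue as in the example above.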
Labels: Articles, News, Products, Workflow, WPP
Posted at: 11:25 AM on by Muhimbi
I can’t believe it has only been 6 weeks since we launched the Workflow Power Pack for SharePoint. We are getting great feedback from our customers, who seem to universally love the product. The support call from one frustrated SharePoint Designer workflow developer who was almost in tears stood out particularly.
The version released today adds support for the number one user request, which is the ability to add your own custom methods to the code to allow some degree of reusability and reduce the size of scripts. Read this post for more details about how to use this new functionality.
A quick introduction for those not familiar with the product: The Muhimbi Workflow Power Pack for SharePoint allows custom C# or VB.NET code to be embedded in SharePoint Designer Workflows without the need to resort to complex Visual Studio based workflows, the development of bespoke Workflow Activities or long development cycles.
We have been working very hard to write as many blog posts as possible to provide examples of what can be achieved using the product as well as how to integrate the WPP with our other products such as the PDF Converter and URL Shortener. Have a look at the following posts:
Embed C# code directly into a SharePoint Designer workflow
The main changes in version 1.1 are as follows:
| 743 | Add Support for Custom methods using Delegates (See details in User Guide) |
| 763 | Trial version causes an error when used after a Pause For Duration activity. |
For more information check out the following resources:
As always, feel free to contact us using Twitter, our Blog, regular email or subscribe to our newsletter.
Download your free trial here (1MB).
Labels: News, Products, Workflow, WPP
Posted at: 11:36 AM on 04 February 2010 by Muhimbi
As most of our products can be used from a SharePoint workflow, it is perhaps useful to know how to tweak SharePoint’s workflow engine for high-load or other specific scenarios.
This article explains in detail what can be tuned and how it can be tuned. If you are in a rush then you can skip over the first 20%.
In summary:
- Workflow Throttle: Controls how many workflows can be processing at any one time on the entire server farm. This setting does not control how many workflows can be "In Progress" concurrently, but rather how many can be actively using the processor. When this number is exceeded, workflow instances that are started and events that wake up dehydrated workflows are queued for later processing. The default value is 15. This setting is per farm, so the number of front-end Web servers is irrelevant.
stsadm -o setproperty -pn workflow-eventdelivery-throttle -pv "25"
- Workflow Batch Size: Workflows, by their very nature, do not execute in a nonstop, linear fashion. Instead, they run for a little while, pause, run some more, and then pause again, continuing in this manner until the process is complete. Although an outside observer or a developer might disagree, workflows are a collection of batches and the workflow engine is simply a glorified batch controller.
stsadm -o setproperty -pn workitem-eventdelivery-batchsize -pv "125"
- Workflow Timeout: The timeout setting specifies the amount of time (in minutes) in which a workflow timer job must complete before it is considered to have stopped responding and is forced to stop processing. Jobs that time out are returned to the queue to be reprocessed later. The default timeout period is five minutes.
stsadm -o setproperty -pn workflow-eventdelivery-timeout -pv "10"
- Workflow Timer Interval: The workflow timer interval specifies how often the workflow SPTimer job fires to process pending workflow tasks. This interval also represents the granularity of delay timers within your workflow. If a timer is set to delay for one minute, but the interval timer fires only every five minutes, the workflow delays for five minutes, not one minute.
stsadm -o setproperty -pn job-workflow -pv value -url http://myWssServer
For our products you may need to tweak Workflow Timeout for very long running PDF Conversions. Changing the Timer Interval can be useful during development when using Pause Until or Pause For workflow Activities.
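As a rough model of the timer granularity described above, a requested delay is effectively rounded up to the next multiple of the timer interval. The sketch below is a simplification that assumes the delay starts exactly when the timer fires; `WorkflowTimerModel` and `EffectiveDelayMinutes` are hypothetical names and not part of SharePoint.

```csharp
using System;

// Hypothetical model of delay rounding, not a SharePoint API.
public static class WorkflowTimerModel
{
    // Rounds the requested delay up to a whole number of timer intervals.
    public static int EffectiveDelayMinutes(int requestedMinutes, int intervalMinutes)
    {
        if (requestedMinutes < 0)
            throw new ArgumentOutOfRangeException("requestedMinutes");
        if (intervalMinutes <= 0)
            throw new ArgumentOutOfRangeException("intervalMinutes");

        // Integer ceiling division: number of timer firings needed.
        int firings = (requestedMinutes + intervalMinutes - 1) / intervalMinutes;
        return firings * intervalMinutes;
    }
}

public static class Program
{
    public static void Main()
    {
        // A one minute delay with a five minute interval takes five minutes.
        Console.WriteLine(WorkflowTimerModel.EffectiveDelayMinutes(1, 5)); // 5
        // With a one minute interval the same delay takes one minute.
        Console.WriteLine(WorkflowTimerModel.EffectiveDelayMinutes(1, 1)); // 1
    }
}
```

This is why shortening the interval on a development machine makes Pause Until and Pause For activities much quicker to test.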
Labels: Articles, MuSH, PDF Converter, Workflow, WPP
Posted at: 4:41 PM on 02 February 2010 by Muhimbi
The following article is an ‘open Kimono session’ where I discuss some of the internals of my company as well as our marketing program. It is my opinion that we are the victim of click-fraud; however, my investigation is not 100% scientific and I have had to make some assumptions based on observations and time constraints. Please draw your own conclusion and consider everyone innocent until proven guilty. The figures, charts and tables presented in this article originate from Google’s own Analytics and AdWords software.
Update: Latest developments and responses from Google can be found at the end of this post.
After witnessing unexpected browsing behaviour from visitors who arrived on our site via a Google AdWords campaign that we ran a year ago for our PDF Converter for SharePoint, I was pretty sure that we were the victim of click-fraud.
Unfortunately, due to a lack of time and detailed figures to back up my suspicions, I decided not to pursue the matter at the time. However, after recently analysing another campaign it became clear that something suspicious is going on. Naturally it is not Google who is committing the fraud, but they are not doing enough to prevent it either.
Note that Google settled a click-fraud related class action lawsuit in 2006 for $90M, a drop in the ocean compared to their level of revenue. The problems appear to be ongoing; read on for my findings.
What is Google AdWords?
Ever wonder how Google make tens of billions of dollars each year? One word: Advertising! AdWords is the platform that allows customers to specify keywords, bids and budget for displaying adverts next to Google’s search results as well as in-line on any website that is willing to display adverts in exchange for a share in the revenue.
When creating a campaign you can specify where the ads appear:
- In Google’s Search Results: Based on the search terms and the keywords specified, relevant adverts are displayed next to and above the search results. The more you are willing to pay, the higher the advert will be displayed, increasing the chances of a user clicking it. Every time an advert is clicked, Google charges the advertiser a fee. As Google’s site is a trusted entity, this way of advertising is relatively fraud proof. In all fairness it appears to work exceptionally well and Muhimbi probably could not survive without it.
- On the content network: This is where I suspect the majority of fraud is taking place. Anyone who can host any kind of content, e.g. a blog, can sign up as an affiliate and place Google ads in their content. Every time an advert is clicked on the content network Google charges the advertiser and part of the income is paid to the ‘owner’ of the content.
Although Google is putting a lot of effort into preventing fraud, the engineer in me can think of many ways to abuse the content network program, particularly using cheap labour, proxy servers and spyware-like applications to simulate real user clicks.
For more detail read Wikipedia’s definition of the AdWords platform.
Muhimbi’s market and products
What makes this investigation relatively easy is the fact that we serve a niche market. All our products are aimed at corporate IT departments for use in their SharePoint environment. The campaign discussed in this article is for our Workflow Power Pack, a product that allows SharePoint Designer developers to embed C# or VB code into their workflows. A great product, but I believe there is a box shot of our product next to the definition of the word niche.
We are a small, but extremely committed company, which makes it difficult to swallow that our hard earned money appears to be used for funding criminal activity.
Normal, genuine, users
Based on our experience with other campaigns as well as ‘organic visitors’ who visit our site via external links or regular Google searches, our normal audience has the following characteristics:
- When we send a newsletter to these users, the email rarely bounces.
- They visit during weekdays. During the weekend our site has 75% fewer visitors compared to weekdays.
- They arrive on our site via Google with relevant keyword searches or via links from external sites that are relevant to our niche.
- They browse around before going to the download page. Only 41% go from the landing page directly to the download page.
- Roughly half of interested visitors contact our support / sales department at some stage for further information.
Regular visits by day of the week. Guess which data points represent the weekend.
Evil users
The usage pattern of these alleged fraudulent users is completely different:
- A large proportion of newsletters sent to these users bounce.
- They visit every day of the week including the weekend.
- They arrive on our site via questionable, unrelated sites. More about this later.
- 75% of the users go directly to the download page from the landing page without getting any further information about the product.
- None get in contact with our support / sales department to request any kind of information. I would like to think the information on the site is crystal clear, however this does not match the pattern we see from other products and campaigns.
- They spend an average of 1 minute on the site. I am not sure if this is the minimum that Google Analytics reports, but these people are clearly ‘very committed’.
Visits by day of the week for pages related to this particular campaign. No weekend dips, clearly hard workers.
So why do these alleged fraudulent users go through the effort of downloading our software and registering for newsletters after clicking the advertisement, which is when they make their money? The reason behind this is that many Google campaigns as well as marketing professionals measure the success of their campaign based on conversions. For example:
- A user enters the site via an AdWords campaign and downloads the software. This is considered a conversion and a sign of a campaign being successful. The marketing executive will get a pat on the back from the CEO and everyone is happy (initially).
- A user enters the site via an AdWords campaign and subscribes to a newsletter. This is considered a conversion as well, resulting in CEO –> pats marketing on back –> Happy –> time expires –> Sad –> Fired –> Divorce –> Death. (See the pain these people are causing!)
Clever marketing people and, if configured that way, Google AdWords measure conversions and allocate more budget to sites that generate them. That is a good reason for fraudsters to simulate some activity after clicking an advertisement.
Which sites are the worst offenders?
It should come as no surprise that sites that allow anonymous users to host their own content and insert Google advertisements are the worst offenders as it is almost impossible to trace these people. From a geographic perspective it appears that Chinese web sites are the worst, but many other countries are just as bad.
Our advert has been displayed 2.5 million times on 767 sites over the course of one month. 611 different sites have referred at least one visitor. Out of those sites I consider about 60 relevant in the loosest sense of the word. (Whether intentionally or not, eggheadcafe.com, for example, is legit, although until recently it was very mischievous in the way it presented and positioned advertisements to make them look like clickable answers to questions. It still does this on some threads, but not as badly as it used to.)
76 sites had an amazing 100% click-through ratio (CTR), 213 had a CTR of more than 20%, 345 more than 10% and 450 more than 1% (which is still an amazing rate considering mail.google.com has a 0.02% CTR).
I consider sites such as divxphoto.com to be irrelevant, as the domain is for sale and doesn’t actually display any advertisements(!). Most of the other sites on the list can be categorised as domain for sale, dodgy software download site, driver download site or rubbish content aggregation site.
Listed below are the top 15 sites by highest number of advertising clicks.
| Domain | Clicks | Impr. | CTR | Comments |
|---|---|---|---|---|
| thaimanga.net | 249 | 485640 | 0.05% | Manga comics, not relevant to our advert. |
| softpedia.com | 223 | 158636 | 0.14% | Download site, not relevant to our advert. |
| incoto.com | 178 | 192949 | 0.09% | Some Chinese site |
| webs.com | 159 | 19109 | 0.83% | Create your own website service, which makes it easy to host dodgy content. |
| conduit.com | 139 | 7221 | 1.92% | Browser toolbar company. God knows what is going on here. |
| mail.google.com | 74 | 399955 | 0.02% | Wow, a legit one |
| eggheadcafe.com | 53 | 109068 | 0.05% | Legit, but sometimes misleading i.m.o. |
| csdn.net | 50 | 146830 | 0.03% | Chinese programming site, maybe legit, probably not. |
| gyanii.com | 48 | 7481 | 0.64% | Software download site, looks rubbish and full of advertisements. |
| blogspot.com | 41 | 13915 | 0.29% | Host your own content. Partly legit. |
| csharpfr.com | 38 | 12915 | 0.29% | French C# site, probably legit. |
| pin5i.com | 34 | 15196 | 0.22% | Chinese programming site. Could be legit or just an aggregator. My Chinese isn’t what it used to be. |
| dotnet-news.com | 33 | 2806 | 1.18% | Another French .net site related to csharpfr.com. Possibly legit, but I wonder why they are generating so many clicks |
Green: Most likely legit - Amber: Likely to be illegitimate - Red: Almost certainly illegitimate
Listed below are the top 15 Sites by Click Through Ratio with more than 25 impressions (otherwise the table would contain 76 sites with a 100% CTR after a single impression, which is rather useless).
| Domain | Clicks | Impr. | CTR | Comments |
|---|---|---|---|---|
| 9mine.com | 9 | 34 | 26.47% | Free games, not relevant to our advert. |
| hbrsd.com | 19 | 87 | 21.84% | Domain for sale, no ads. Who knows where the clicks came from. |
| 5dmail.net | 6 | 36 | 16.67% | Chinese site, could be legit, could be aggregator. |
| boxsoftware.net | 5 | 37 | 13.51% | Spanish software download site |
| meiying.com | 6 | 48 | 12.50% | Dodgy site to display just SharePoint related ads without any content. |
| myalbums.tk | 18 | 160 | 11.25% | Dodgy site to display just MS Development related ads without any content. |
| codehaus.org | 6 | 54 | 11.11% | Some open source site. Could be legit, but not relevant so doesn't explain the high CTR. |
| micorcsolft.com.cn | 8 | 76 | 10.53% | Site doesn't even exist. |
| douziwang.cn | 4 | 40 | 10.00% | Similar to meiying.com. Dodgy site to just display ads for MS Dev tools |
| paramegsoft.com | 3 | 30 | 10.00% | Arabic online games, not relevant to our advert. |
| kidwaresoftware.com | 4 | 46 | 8.70% | Possibly legit site, but not relevant so doesn't explain the high CTR |
| thaiboxsoftware.com | 3 | 38 | 7.89% | Thai software download site. Glad to see we are so popular in Thailand |
| netcsharp.cn | 3 | 40 | 7.50% | Malware site as reported by Google Chrome, yet Google allow advertisements. |
| download3k.com | 3 | 45 | 6.67% | Another software download site |
| technos-sources.com | 2 | 30 | 6.67% | French tech site. Could be legit, could be aggregator. Doesn't explain the high CTR. |
Green: Most likely legit - Amber: Likely to be illegitimate - Red: Almost certainly illegitimate
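For anyone who wants to run the same ranking against their own placement report, here is a minimal sketch in Python. It is not the tooling used for the tables above; the `Placement` type, the `top_by_ctr` function and the `single-impression.example` row are made up for illustration, while the other rows are copied from the tables. It computes the CTR as clicks divided by impressions and only ranks sites with more than 25 impressions, so a single lucky (or dodgy) click cannot push a site to the top.

```python
from typing import NamedTuple


class Placement(NamedTuple):
    """One row of a placement report (hypothetical structure)."""
    domain: str
    clicks: int
    impressions: int

    @property
    def ctr(self) -> float:
        """Click-through ratio as a percentage."""
        if self.impressions == 0:
            return 0.0
        return 100.0 * self.clicks / self.impressions


def top_by_ctr(placements, min_impressions=25, limit=15):
    """Rank placements by CTR, skipping sites with too few impressions."""
    eligible = [p for p in placements if p.impressions > min_impressions]
    return sorted(eligible, key=lambda p: p.ctr, reverse=True)[:limit]


placements = [
    Placement("9mine.com", 9, 34),
    Placement("hbrsd.com", 19, 87),
    Placement("mail.google.com", 74, 399955),
    Placement("conduit.com", 139, 7221),
    # Hypothetical row, not from the report: one click on one impression.
    Placement("single-impression.example", 1, 1),
]

for p in top_by_ctr(placements):
    print(f"{p.domain:<20} {p.clicks:>4} {p.impressions:>7} {p.ctr:6.2f}%")
```

The threshold of 25 matches the cut-off used for the table; raising it trades a shorter list for fewer false alarms on low-traffic sites.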
I realise that detecting and solving click fraud is much more difficult than actually causing it, especially without access to key information such as site demographics, visitor behaviour, click streams and conversion data. On the other hand, as Google Analytics tracks the Muhimbi Site, they actually have most of this data. I will present my findings to Google and give them a chance to respond and hopefully improve the situation for everyone. Perhaps some kind of validation system or list of ‘trusted sites’ could be created by Google.
As a Google shareholder I wonder how much of Google’s income actually comes from this kind of alleged criminal activity. According to Google: “…we manage the problem of invalid clicks very well. We have a large team of expert engineers and analysts devoted to it. By far, most invalid clicks are caught by our automatic filters and discarded *before* they reach an advertiser’s bill. And for the clicks that are not caught in advance, advertisers can notify Google and ask for reimbursement.”
This situation cannot continue any longer. I am naturally upset that my company appears to be the victim of fraud, but what about the thousands (millions?) of other advertisers who do not have the knowledge or resources to detect fraud? It should not be up to the customers to research and report fraud; Google should step up its game and clean up its act, no matter how difficult or painful it is.
So, is Google guilty of fraud? I seriously doubt it. However, they appear to be profiteering from other people’s criminal activity in a manner not dissimilar to the way illegal media sharing sites behave. “We are not doing anything illegal, we can’t help it that other people upload illegal movies / music / software / <insert excuse here> to our site even though it has clearly been designed for this purpose.”
… Not good enough. To be continued.
02-Feb-2010 - Update 1: We are clearly not the only party experiencing click-fraud. For more information visit the links below:
Report Google click-fraud here.
18-Feb-2010 - Update 2: Google have responded and claim that only a small percentage of the clicks are fraudulent. The remaining clicks are all part of normal user activity. It appears their response is largely automated, so I have replied asking for further details as I don’t accept their findings. I find it astonishing how much they downplay the issue of click-fraud. Apparently it is up to me to manually exclude domains that I consider not to be relevant. This is just laughable.
19-Feb-2010 – Update 3:
01-Mar-2010 – Update 4: Received another reply from Google AdWords support. They have disabled some of the accounts that have caused fraudulent clicks, but they are not allowed to tell us which clicks were the fraudulent ones. Apparently it is up to us to police Google’s content network, painstakingly go through all reports, check out each domain and then take Google’s word for it that they are taking the appropriate action. In light of my findings, Google’s word is not worth much to me at the moment, even if they have the best of intentions.
Labels: Articles, News