Create an MHTML archive

Introduction

If you need to ship a more-or-less static web page to an embedded or unconnected device, you can use the MHTML format, initiated by Microsoft and supported by, at least, all recent versions of IE. Tools are available to work with it under Linux as well. The result is the same as using "Save As" from the IE menu and selecting "Web archive, single file (mht)".

The idea is that you create one multipart MIME file, similar to an email with attachments, which includes all the static elements of a webpage: scripts, stylesheets and images. The archive does not appear to embed resources referenced from <OBJECT> tags, such as movies, but references to them remain valid if they specify an absolute URL.
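
To make the "email with attachments" comparison concrete, here is a minimal sketch that uses Python's standard email package to assemble a multipart/related message holding a page and its stylesheet. It only illustrates the general layout; it is not how IE or CDO builds its archives, and the function name, URLs and content are all made up for the example.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

def build_archive(page_url, html, css_url, css):
    # Hypothetical helper: one multipart/related container, one part per resource.
    archive = MIMEMultipart("related")
    archive["Subject"] = "Example archive"

    # The page itself comes first; Content-Location records its original URL.
    page = MIMEText(html, "html", "utf-8")
    page["Content-Location"] = page_url
    archive.attach(page)

    # Each static resource becomes a further part, again tagged with its URL.
    stylesheet = MIMEText(css, "css", "utf-8")
    stylesheet["Content-Location"] = css_url
    archive.attach(stylesheet)
    return archive

# Illustrative values only.
archive = build_archive(
    "http://example.invalid/index.html",
    "<html><head><link rel='stylesheet' href='style.css'></head></html>",
    "http://example.invalid/style.css",
    "body { margin: 0 }",
)
text = archive.as_string()
```

Printing text shows the boundary-separated parts, each with its own headers, much like a mail message with attachments.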

Use a CDO Message to create an MHTML archive

Although one's first inclination is to try to automate IE via the InternetExplorer.Application object, that approach offers no way to avoid the user prompt. Instead, and just as efficiently, you can use the .CreateMHTMLBody method of the CDO.Message object, designed for this very purpose. Having created the message body, you get hold of its associated ADO stream to persist the result to disc.

The SaveToFile method of the ADO stream takes as its second parameter either 1, which creates the file if it doesn't exist and fails if it does; or 2, which overwrites the file if it exists and creates it otherwise.
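
The two values correspond to adSaveCreateNotExist and adSaveCreateOverWrite in ADO's SaveOptionsEnum. If bare numbers read poorly in your code, one option is to name them yourself, as in this small sketch; the SaveOptions class and save_option helper are hypothetical conveniences, not part of ADO or pywin32.

```python
from enum import IntEnum

class SaveOptions(IntEnum):
    """Named copies of the two values Stream.SaveToFile accepts."""
    adSaveCreateNotExist = 1   # create the file; fail if it already exists
    adSaveCreateOverWrite = 2  # overwrite an existing file; create otherwise

def save_option(overwrite):
    # Hypothetical helper: choose the option from a simple yes/no flag.
    if overwrite:
        return SaveOptions.adSaveCreateOverWrite
    return SaveOptions.adSaveCreateNotExist
```

When handing the result to the stream, passing int (save_option (True)) keeps the COM call working with a plain integer.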

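from win32com.client.gencache import EnsureDispatch as Dispatch

URL = r"http://local.timgolden.me.uk"
FILEPATH = r"timgolden.me.uk.mht"

# Build a CDO message whose body is the page at URL plus its static resources
message = Dispatch ("CDO.Message")
message.CreateMHTMLBody (URL)

# Fetch the ADO stream behind the message and write it out;
# 2 means overwrite the file if it exists, create it otherwise
stream = Dispatch (message.GetStream ())
stream.SaveToFile (FILEPATH, 2)
stream.Close ()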
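
Because the saved file is a MIME document, you can check what ended up in it with Python's standard email package. The sketch below is one way to do that, assuming each part carries a Content-Location header naming its original URL, as MHTML archives typically do; list_parts is a hypothetical helper name.

```python
import email
from email import policy

def list_parts(raw):
    """Return (content type, Content-Location) for each non-container part."""
    message = email.message_from_bytes(raw, policy=policy.default)
    parts = []
    for part in message.walk():
        if part.is_multipart():
            # Skip the multipart container itself; only leaf parts hold resources.
            continue
        location = part.get("Content-Location")
        parts.append((
            part.get_content_type(),
            None if location is None else str(location),
        ))
    return parts

# Usage, assuming the archive was written by the code above:
#   with open("timgolden.me.uk.mht", "rb") as f:
#       for content_type, location in list_parts(f.read()):
#           print(content_type, location)
```

A part with no Content-Location header is reported with None in place of the URL.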