Getting process info: Win32 and COM with python and C++ |
- SUMMARY
- Introduction
- Getting the process list: PdhEnumObjectItems
- Getting the process ids: Several Pdh Calls
- The COM client
- Making the COM objects
- The Python COM object
- The C++ COM object
- In Conclusion
- Further Info
- Author
SUMMARY
Python is a rich scripting language offering a lot of the power of C++ while retaining the ease of use of VBscript. With it's simplified C++ style, win32 access, and ability to make COM servers, it's a natural rapid development environment for the developer.
When working with an unfamiliar API, python is great for helping you understand how to solve the problem without getting in the way. Even if you have to supply a C++ COM object, it is often easier to first figure out the details with python and then compose the C++ solution. Python is very similar to C++ pseudo-code, so you can follow it as an outline for the C++. In this case, we're going to talk about how to use both python and C++ to expose a list of processes and their corresponding ids with a COM object.
Introduction
To get process information for both NT and W2K (but not the 9x family) you can use the Performance Data Helper library(PDH) available in the SDK at microsoft's ftp site. It provides a convenient interface to performance information stored fundamentally in the registry. The basic process of using the PDH encompasses the following:
- Get a list of all the objects you want
- Get a list of the object's instances and data available for each instance: called 'items' or 'counters'
- Get a group of performance data for each counter
In our case the object we want is the process object, the object's instances are it's list of processes, and the counter we want for the processes is 'ID Process'.
The specific set of API called are the following:
- PdhEnumObjectItems -- get the list of instances (processes) for the process object
- PdhOpenQuery -- initialize query handle which will contain all the counters (in our case only: ID Process)
- PdhMakeCounterPath -- Each counter has a specific path, kind of like like a url
- PdhAddCounter -- convert the path to a handle and group the counters paths(in this case only one) together for the query
- PdhCollectQueryData -- gathers the actual data
- PdhGetFormattedCounterValue -- call for each counter to convert the data to a typical format
- PdhCloseQuery -- close the counters and the query handle
We'll cover these points now in more depth.
Getting the process list: PdhEnumObjectItems
The MSDN describes the call as the following:
PDH_STATUS PdhEnumObjectItems( LPCTSTR szDataSource, LPCTSTR szMachineName, LPCTSTR szObjectName, LPTSTR mszCounterList, LPDWORD pcchCounterListLength, LPTSTR mszInstanceList, LPDWORD pcchInstanceListLength, DWORD dwDetailLevel, DWORD dwFlags );
The python call is similar though simpler. For example, you do not need to bother with the list length -- it takes care of that for you. Both the python and C++ examples are taken from their COM components shown later.To call make with python would look like the following
def proclist(self): try: junk, instances = win32pdh.EnumObjectItems(None,None,self.object, win32pdh.PERF_DETAIL_WIZARD) return instances except: raise COMException("Problem getting process list")
The variable instances contains the list of processes, and you'll find the item or counter than we want 'ID Process' present in the list of items. Since you can have multiple processes with the same name, in python it is convenient to use a dictionary to store a list of processes and how many you found for each type.
for instance in instances: if proc_dict.has_key(instance): proc_dict[instance] = proc_dict[instance] + 1 else: proc_dict[instance]=0
The C++ call though essentially the same in spirit is much more involved. To help, I use map(like a python dictionary) and string from Standard C++. The additional things you need to manage are:
- strings -- convert to TCHAR for ansi/unicode support using ATL macros
- the memory for the buffers that PdhEnumObjectItems needs
- Parse it's NULL padded results.
HRESULT getinst (map < string,int > & m_inst) { map<string, int>::iterator iter; USES_CONVERSION; LPTSTR szCountersBuf = NULL; DWORD dwCountersSize = 0; LPTSTR szInstancesBuf = NULL; DWORD dwInstancesSize = 0; LPTSTR szTemp = NULL; PDH_STATUS status; std::string str_obj="Process"; status = PdhEnumObjectItems( NULL, NULL, A2CT(str_obj.c_str()), NULL, &dwCountersSize, szInstancesBuf, &dwInstancesSize, PERF_DETAIL_WIZARD, 0 ); if ( ERROR_SUCCESS != status ) return E_FAIL; if (dwCountersSize) { szCountersBuf = (LPTSTR)malloc (dwCountersSize * sizeof (TCHAR)); if (szCountersBuf==NULL) { return E_FAIL; } } else szCountersBuf=NULL; if (dwInstancesSize) { szInstancesBuf = (LPTSTR)malloc (dwInstancesSize * sizeof (TCHAR)); if (szInstancesBuf==NULL) { free(szCountersBuf); return E_FAIL; } } else szInstancesBuf = NULL; status = PdhEnumObjectItems( NULL, NULL, A2CT(str_obj.c_str()), szCountersBuf, &dwCountersSize, szInstancesBuf, &dwInstancesSize, PERF_DETAIL_WIZARD, 0); if ( ERROR_SUCCESS != status ) return E_FAIL; //it's a series of contingous NULL terminated strings, ending w/zero length string if (szInstancesBuf){ for (szTemp = szInstancesBuf;*szTemp != 0;szTemp += lstrlen(szTemp) + 1) { m_inst[T2A(szTemp)]++; //increment instance counter //default value is zero for arith element } } return S_OK; }
Getting the process ids: Several Pdh Calls
A whole sequence of calls are necessary once you get the process list. To refresh your memory, you need:
- PdhOpenQuery -- initialize query handle which will contain all the counters (in our case only: ID Process)
- PdhMakeCounterPath -- Each counter has a specific path, kind of like like a url
- PdhAddCounter -- convert the path to a handle and group the counters paths(in this case only one) together for the query
- PdhCollectQueryData -- gathers the actual data
- PdhGetFormattedCounterValue -- call for each counter to convert the data to a typical format
- PdhCloseQuery -- close the counters and the query handle
for instance, max_instances in proc_dict.items(): for inum in xrange(max_instances+1): try: hq = win32pdh.OpenQuery() # initializes the query handle path = win32pdh.MakeCounterPath( (None,self.object,instance, None, inum, self.item) ) counter_handle=win32pdh.AddCounter(hq, path) #convert counter path to counter handle win32pdh.CollectQueryData(hq) #collects data for the counter type, val = win32pdh.GetFormattedCounterValue(counter_handle, win32pdh.PDH_FMT_LONG) proc_ids.append(instance+'\t'+str(val)) win32pdh.CloseQuery(hq) except: raise COMException("Problem getting process id")
Again, the C++ code is more involved and makes use of Standand C++, vector, map, and string. It converts a map of process names and the number for each name, to a vector of strings each which has tab-delimited process id entry. Also, the process id info is returned in the format of a double, which is converted to a string.
HRESULT getprocid (map<string,int>& m_inst, vector<string> &v_ids) { USES_CONVERSION; PDH_STATUS status = 0; HQUERY hQuery = NULL; HCOUNTER hCounter = NULL; DWORD dwType = 0; map<string,int> m_idinst; std::string objname="Process"; std::string counter="ID Process"; char *buffer;int junk,junk2; // initialize the query handle map < string, int>::iterator iter; for (iter=m_inst.begin();iter != m_inst.end();++iter) { for (int i=0;i<= iter->second;++i) { status = PdhOpenQuery( NULL, 0, &hQuery ); if ( status != ERROR_SUCCESS ) return status; TCHAR szCounterPath[2048]; DWORD dwPathSize = 2048; PDH_COUNTER_PATH_ELEMENTS pdh_elm; pdh_elm.szMachineName = NULL; pdh_elm.szObjectName = A2T(objname.c_str()); pdh_elm.szInstanceName = A2T(iter->first.c_str()); pdh_elm.szParentInstance = NULL; pdh_elm.dwInstanceIndex = i; pdh_elm.szCounterName = A2T(counter.c_str()); status = PdhMakeCounterPath( &pdh_elm, szCounterPath, &dwPathSize, 0 ); if ( status != ERROR_SUCCESS ) { return E_FAIL; } // Add the counter to the query //PdhAddCounter converts each counter path into a counter handle status = PdhAddCounter( hQuery, szCounterPath, 0, &hCounter ); if ( status != ERROR_SUCCESS ) { return E_FAIL; } //PdhCollectQueryData gets raw data for the counters status = PdhCollectQueryData(hQuery); if ( status != ERROR_SUCCESS ) { return E_FAIL; } //PdhGetFormattedCounterValue formats counter values for display DWORD dwFormat = PDH_FMT_DOUBLE; PDH_FMT_COUNTERVALUE fmtValue; status = PdhGetFormattedCounterValue (hCounter, dwFormat, (LPDWORD)NULL, &fmtValue); if (status == ERROR_SUCCESS) { buffer=_fcvt( fmtValue.doubleValue, 0, &junk,&junk2 ); string id=buffer; v_ids.push_back(iter->first+'\t'+id); } } status = PdhCloseQuery (hQuery); //PdhCloseQuery closes the query handle and it's counters } return S_OK; }
The COM client.
The Python COM client which will call the C++ COM object and Python COM object does the following:
import win32com.client a=win32com.client.Dispatch('NtPerf.process') #C++ com object print a.procids() b=win32com.client.Dispatch('PyPerf.process') #python com object print b.procids()
As far as it is concerned, there is no difference between the 2 objects. Both returns a list of processes and their respective id's seperated by tab.
Making the COM objects.
From a 1000 mile perspective, ATL C++ and python offer a class based COM object approach. In both approaches, the methods of the com object are simply methods of a class. However, creating a COM object in python is much easier than C++, again allowing you to focus on the problem first while still retaining the C++ feel.
Much of the details with python COM objects are exposed through a default policy which leverages IDispatch. You simply add a few attributes your python class, to expose your methods, prog id, add a line to register your class, and you are done. The policy knows what to do. Thus, it's easy enough to take a simple win32 class you wrote and add a few attributes and convert it to a COM object. Creating and developing python COM objects is simple, all that is needed is notepad. You don't need a full blown IDE nor do you have to go back and find the source code that created the object, since the object is the source code. It lends itself to very rapid development.
ATL provides wizards and a lot of the basic implementation goo (like COM interfaces for IUnknown and IDispatch), and wrappers for data types. However, there still a big difference between a simple console based app and an ATL COM object. The COM world of Variants, SafeArrays, and BSTR's is (as we'll see below) unfriendly to C++.
Python behind the scenes converts back and forth between it's native types and BSTR, Variants, SafeArrays, etc. When you change your COM object, you don't have to worry about changing IDL and constructing new Variant structures, it is managed for you.
In C++, it is more messy. First of all: strings. Since the concept of what text is isn't consistent, COM standardizes with it's own OLECHAR. Also, for non-COM text, because of issues between using ansi and unicode, you need to use the TCHAR data type which is a generic type that maps at compile time to what is necessary. And, with regard to BSTRs (length-prefixed strings), you need to convert the OLECHAR to BSTR's with SysAllocString. You notice the COM object below uses the Standard C++ string (which I prefer to use). When necessary, it converts the string it to the necessary COM type leveraging ATL conversion macros and SysAllocString (another option is to use the CComBSTR class).
Secondly: arrays. Since we are returning a list of processes and their ids, to be friendly with all languages, everything needs to be converted to SafeArrays of Variants housing BSTRs. Python does the conversion for you. With C++, you'll need to make various Variant calls to create the necessary structure. In the C++ COM object, I've encapsulated all the necessary code in a single function that converts any vector of strings into a 1 dimensional safe array of variants.
Now for some code:
The Python COM object.
The Python COM object has 2 methods proclist and procids. Proclist is trivial, simply returning the list of processes from EnumObjectItems. Procids calls proclist, constructs a dictionary to count the number of processes with the same name, and then makes the necessary calls to their their ids. Each function simple returns the python list (which is then converted for you). If you later decide to only return a single string, simple change what you return, and python again will convert for you. Each method also uses the very cool function COMException which returns errors back to the client. In addition to the methods, there are 4 basic attributes, I set to define the COM object and a line to register it.
import win32pdh, string, win32api from win32com.server.exception import COMException import win32com.server.util import win32com.client.dynamic #to generate guids use: #import pythoncom #print pythoncom.CreateGuid() class pyperf: # COM attributes. _reg_clsid_ = '{763AE791-1D6B-11D4-A38B-00902798B22B}' #guid for your class in registry _reg_desc_ = "get process list and ids" _reg_progid_ = "PyPerf.process" #The progid for this class _public_methods_ = ['procids','proclist' ] #names of callable methods def __init__(self): self.object='process' self.item='ID Process' def proclist(self): try: junk, instances = win32pdh.EnumObjectItems(None,None,self.object, win32pdh.PERF_DETAIL_WIZARD) return instances except: raise COMException("Problem getting process list") def procids(self): #each instance is a process, you can have multiple processes w/same name instances=self.proclist() proc_ids=[] proc_dict={} for instance in instances: if proc_dict.has_key(instance): proc_dict[instance] = proc_dict[instance] + 1 else: proc_dict[instance]=0 for instance, max_instances in proc_dict.items(): for inum in xrange(max_instances+1): try: hq = win32pdh.OpenQuery() # initializes the query handle path = win32pdh.MakeCounterPath( (None,self.object,instance, None, inum, self.item) ) counter_handle=win32pdh.AddCounter(hq, path) #convert counter path to counter handle win32pdh.CollectQueryData(hq) #collects data for the counter type, val = win32pdh.GetFormattedCounterValue(counter_handle, win32pdh.PDH_FMT_LONG) proc_ids.append(instance+'\t'+str(val)) win32pdh.CloseQuery(hq) except: raise COMException("Problem getting process id") proc_ids.sort() return proc_ids if __name__=='__main__': import win32com.server.register win32com.server.register.UseCommandLine(pyperf)
The C++ COM object.
As you notice from the idl, C++ COM object also exposes 2 methods, proclist and procids. Proclist calls the getinst function returns returns a map of processes, converts that to a vector of strings, and the calls make_safe to convert that to a Safe array of Variants. Procids does much the same except that after calling getinst, it then calls getprocid, which returns a vector of strings containing the processes and their ids. The vector of strings is then converted to a SafeArray with make_safe. Unlike python, you don't actually return the SafeArray(since every COM method has to return an HRESULT). Instead, you store the values in a Variant pointer.
Here is the relevant excerpt from the IDL: interface Iprocess : IDispatch { [id(1), helpstring("lists current processes")] HRESULT proclist([out, retval] VARIANT *plist); [id(2), helpstring("method procids")] HRESULT procids([out, retval] VARIANT *pids); } Here is the source for the cpp file #include "stdafx.h" #include "Ntperf.h" #include "process.h" //MS stuff #include "pdh.h" #include "pdhmsg.h" // fix problem with different versions of pdh.dll #undef PdhOpenQuery // PdhOpenQueryA extern "C" long __stdcall PdhOpenQuery ( IN LPCSTR szDataSource, IN DWORD dwUserData, IN HQUERY *phQuery ); //STD C++ stuff #pragma warning(disable : 4786) //get rid of stl warnings #include <string> #include <vector> #include <map> using namespace std; ///////////////////////////////////////////////////////////////////////////// // Cprocess HRESULT make_safe(vector<string>& v_list, VARIANT *plist) { HRESULT hr = S_OK; USES_CONVERSION; VariantInit(plist); plist->vt = VT_ARRAY | VT_VARIANT; //set type of plist to variant array //now create the 1 dimensional safearray of variants LPSAFEARRAY psa; SAFEARRAYBOUND rgsabound[] = { v_list.size(), 0 }; // size elements, 0-based psa = SafeArrayCreate(VT_VARIANT, 1, rgsabound); if (!psa) { return E_OUTOFMEMORY; } VARIANT * VarArray; //Increment lock count and get pointer to the array data if (FAILED(hr = SafeArrayAccessData(psa,(void **) &VarArray ))) { return hr; } for (int i =0; i<v_list.size();i++) { VarArray[i].vt = VT_BSTR; //convert ascii to olestr then bstr VarArray[i].bstrVal = SysAllocString(A2OLE(v_list[i].c_str())); if (!VarArray[i].bstrVal) { VariantClear(VarArray); return hr = E_OUTOFMEMORY; } } SafeArrayUnaccessData( psa ); plist->parray = psa; //now set the array in plist to be the created array return S_OK; } HRESULT getprocid (map<string,int>& m_inst, vector<string> &v_ids) { USES_CONVERSION; PDH_STATUS status = 0; HQUERY hQuery = NULL; HCOUNTER hCounter = NULL; DWORD dwType = 0; map<string,int> m_idinst; std::string objname="Process"; std::string counter="ID Process"; char *buffer;int junk,junk2; // initialize the query handle map<string, int>::iterator iter; for (iter=m_inst.begin();iter != m_inst.end();++iter) { for (int i=0;i<= iter->second;++i) { status = PdhOpenQuery( NULL, 0, &hQuery ); if ( status != ERROR_SUCCESS ) return status; TCHAR szCounterPath[2048]; DWORD dwPathSize = 2048; PDH_COUNTER_PATH_ELEMENTS pdh_elm; pdh_elm.szMachineName = NULL; pdh_elm.szObjectName = A2T(objname.c_str()); pdh_elm.szInstanceName = A2T(iter->first.c_str()); pdh_elm.szParentInstance = NULL; pdh_elm.dwInstanceIndex = i; pdh_elm.szCounterName = A2T(counter.c_str()); status = PdhMakeCounterPath( &pdh_elm, szCounterPath, &dwPathSize, 0 ); if ( status != ERROR_SUCCESS ) { return E_FAIL; } // Add the counter to the query //PdhAddCounter converts each counter path into a counter handle status = PdhAddCounter( hQuery, szCounterPath, 0, &hCounter ); if ( status != ERROR_SUCCESS ) { return E_FAIL; } //PdhCollectQueryData gets raw data for the counters status = PdhCollectQueryData(hQuery); if ( status != ERROR_SUCCESS ) { return E_FAIL; } //PdhGetFormattedCounterValue formats counter values for display DWORD dwFormat = PDH_FMT_DOUBLE; PDH_FMT_COUNTERVALUE fmtValue; status = PdhGetFormattedCounterValue (hCounter, dwFormat, (LPDWORD)NULL, &fmtValue); if (status == ERROR_SUCCESS) { buffer=_fcvt( fmtValue.doubleValue, 0, &junk,&junk2 ); string id=buffer; v_ids.push_back(iter->first+'\t'+id); } } status = PdhCloseQuery (hQuery); //PdhCloseQuery closes the query handle and it's counters } return S_OK; } HRESULT getinst (map<string,int>& m_inst) { map<string, int>::iterator iter; USES_CONVERSION; LPTSTR szCountersBuf = NULL; DWORD dwCountersSize = 0; LPTSTR szInstancesBuf = NULL; DWORD dwInstancesSize = 0; LPTSTR szTemp = NULL; PDH_STATUS status; std::string str_obj="Process"; status = PdhEnumObjectItems( NULL, NULL, A2CT(str_obj.c_str()), NULL, &dwCountersSize, szInstancesBuf, &dwInstancesSize, PERF_DETAIL_WIZARD, 0 ); if ( ERROR_SUCCESS != status ) return E_FAIL; if (dwCountersSize) { szCountersBuf = (LPTSTR)malloc (dwCountersSize * sizeof (TCHAR)); if (szCountersBuf==NULL) { return E_FAIL; } } else szCountersBuf=NULL; if (dwInstancesSize) { szInstancesBuf = (LPTSTR)malloc (dwInstancesSize * sizeof (TCHAR)); if (szInstancesBuf==NULL) { free(szCountersBuf); return E_FAIL; } } else szInstancesBuf = NULL; status = PdhEnumObjectItems( NULL, NULL, A2CT(str_obj.c_str()), szCountersBuf, &dwCountersSize, szInstancesBuf, &dwInstancesSize, PERF_DETAIL_WIZARD 0); if ( ERROR_SUCCESS != status ) return E_FAIL; //it's a series of contingous NULL terminated strings, ending w/zero length string if (szInstancesBuf){ for (szTemp = szInstancesBuf;*szTemp != 0;szTemp += lstrlen(szTemp) + 1) { m_inst[T2A(szTemp)]++; //increment instance counter //default value is zero for arith element } } return S_OK; } STDMETHODIMP Cprocess::proclist(VARIANT *plist) { if (!plist) { return E_INVALIDARG;} HRESULT hr = NOERROR; vector<string> test(50, "hello"); map<string,int> m_inst; map<string, int>::iterator iter; hr=getinst(m_inst); if FAILED(hr) {return hr;} vector<string> v_inst; for (iter=m_inst.begin();iter != m_inst.end();++iter) { //go through index of processes for(int i=0;i<iter->second;i++){ //put onto vector multiple procs w/same name v_inst.push_back(iter->first); } } //send a vector of strings and a variant to make_safe hr = make_safe(v_inst, plist); //getprocid(m_inst); return hr; } STDMETHODIMP Cprocess::procids(VARIANT *pids) { // TODO: Add your implementation code here if (!pids) { return E_INVALIDARG;} HRESULT hr = NOERROR; map<string,int> m_inst; map<string, int>::iterator iter; hr=getinst(m_inst); if FAILED(hr) {return hr;} vector<string> v_ids; getprocid(m_inst,v_ids); //send a vector of strings and a variant to make_safe hr = make_safe(v_ids, pids); //getprocid(m_inst); return hr; }
In Conclusion
That was a quick tour of Python and C++ in the win32 and COM world. Both languages have their strengths and weaknesses. With C++ you have ultimate granularity and power. It obviously comes at a cost of more details to keep track of. Python's strength is rich productivity. It is fast to write the win32 and COM sever code, yet still have a sophisticated language at your disposal. You lose some of the flexibility of C++, which often does not matter. And, when it does, python can help you understand how to solve the problem, before wading into the details.
Further Info
- Pdh stuff found at ftp://ftp.microsoft.com in something similar to /developr/platformsdk/april2000/x86/redist/pdh
- Mirosoft MSDN references at http://msdn.microsoft.com
- Relevant Pdh Python libraries: win32pdh.py, win32pdhutil.py
Author
John Nielsen, jn@who.net-- Have a great time with programming with python!
Getting process info: Win32 and COM with python and C++ |