RSA Key Exchange with Windows Crypto API and OpenSSL Part 1


Microsoft Crypto API (CAPI) was first released with the Windows NT4 operating system in 1996. The OpenSSL project, which was originally a fork of SSLeay by Eric Young and Tim Hudson, was initiated in 1998 and has since become one of the most widely distributed cryptographic libraries available.

I recently needed a Windows application using CAPI that can sign and verify files using the RSA digital signature algorithm, but it had to read RSA keys and signatures generated by OpenSSL. The keys are stored using Abstract Syntax Notation One (ASN.1) and Privacy Enhanced Mail (PEM) format. The signatures are stored in binary using the big-endian convention.

In this post, we’ll focus specifically on RSA key generation, the importation/exportation of RSA keys and the key management standards used to exchange keys in a platform independent manner.

I understand RSA is being phased out in favor of Elliptic Curve Cryptography (ECC), which may be discussed in a future post. I’m also aware ECC will eventually be phased out in favor of quantum-resistant cryptography, which is still under a lot of research and development, but that could be 10+ years away and RSA still offers a good security margin, albeit with less efficiency.

Part 2 will focus specifically on generation and exchange of session keys over TCP for symmetric encryption, but the bulk of work needed to reach that stage is really within this post.

Key Management Standards

The two main issues developers appear to complain about for interoperability between OpenSSL and Microsoft Crypto API are:

  1. Signatures exported by OpenSSL functions use the big-endian convention, whereas Microsoft Crypto API uses little-endian.
  2. Public and private keys exported by OpenSSL functions use ASN.1 structures, whereas Microsoft Crypto API uses its own structures, referred to as “Blobs”.

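However, both APIs support standard ASN.1 formats for key management: the Public Key Info structure for public keys and the Private Key Info structure (PKCS #8) for private keys, both of which are used later in this post.

We just need to import and export keys using these standard formats to successfully exchange keys between the two APIs.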

Then there’s the issue of textual encoding using Privacy Enhanced Mail (PEM) format, which is defined in RFC 1421, RFC 1422 and RFC 1423.

Essentially, this encoding is the base64 form of the binary ASN.1 data, wrapped between BEGIN and END header lines.
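To make the base64 step concrete, here is a minimal sketch of a decoder in portable C. It is not the code from the RSA tool; the names b64_value and b64_decode are made up for illustration, and the function simply skips anything outside the base64 alphabet (line breaks, '=' padding) rather than validating the input strictly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Map one base64 character to its 6-bit value, or -1 if it is not
   part of the alphabet (line breaks, '=' padding and so on). */
static int b64_value(int c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

/* Decode base64 text from `in` into `out`.
   Returns the number of bytes written, or (size_t)-1 if `out` is too small. */
size_t b64_decode(const char *in, unsigned char *out, size_t out_max)
{
    unsigned int acc  = 0;   /* bit accumulator */
    int          bits = 0;   /* number of valid bits held in acc */
    size_t       n    = 0;

    assert(in != NULL && out != NULL);

    for (; *in != '\0'; in++) {
        int v = b64_value((unsigned char)*in);
        if (v < 0) continue;                  /* skip padding and whitespace */
        acc   = (acc << 6) | (unsigned int)v;
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            if (n == out_max) return (size_t)-1;
            out[n++] = (unsigned char)((acc >> bits) & 0xFF);
            acc &= (1u << bits) - 1u;         /* keep only the leftover bits */
        }
    }
    return n;
}
```

Calling b64_decode("SGVsbG8=", buf, sizeof buf) writes the five bytes of "Hello" into buf. For a real key file, the output would be the ASN.1 encoded structure that the import functions expect.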

RSA Digital Signatures and RSA Key Exchange

When signing a file, we derive a cryptographic hash from its data. This hash is padded and then encrypted using an RSA private key and modular exponentiation. The resulting ciphertext is called a signature. Verification of the signature involves decryption using the RSA public key and modular exponentiation, then comparing the result with a freshly computed hash of the file.
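The modular exponentiation step can be illustrated with a toy key. The sketch below is only a demonstration of the arithmetic: it uses the textbook parameters p = 61, q = 53 (so n = 3233, e = 17, d = 2753), applies no padding, and treats a small integer as the "hash". The names mod_pow, toy_sign and toy_verify are invented for this example; real keys are far too large for 64-bit integers.

```c
#include <assert.h>
#include <stdint.h>

/* Compute (base ^ exp) mod n by square-and-multiply.
   Only valid for n < 2^32 so that the 64-bit products cannot overflow. */
uint64_t mod_pow(uint64_t base, uint64_t exp, uint64_t n)
{
    uint64_t result = 1;

    assert(n > 1 && n <= 0xFFFFFFFFu);

    base %= n;
    while (exp > 0) {
        if (exp & 1) result = (result * base) % n;  /* multiply on a set bit */
        base = (base * base) % n;                   /* square for the next bit */
        exp >>= 1;
    }
    return result;
}

/* Toy key for illustration only: p = 61, q = 53 */
enum { TOY_N = 3233, TOY_E = 17, TOY_D = 2753 };

/* "Sign": raise the hash to the private exponent d. */
uint64_t toy_sign(uint64_t hash)  { return mod_pow(hash, TOY_D, TOY_N); }

/* "Verify": raise the signature to the public exponent e,
   which should give back the original hash. */
uint64_t toy_verify(uint64_t sig) { return mod_pow(sig, TOY_E, TOY_N); }
```

For any value h below 3233, toy_verify(toy_sign(h)) returns h again, which is the property the verifier relies on when it compares the recovered value against its own hash of the file.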

When exchanging session keys, the client side generates a value from a cryptographically secure pseudo-random number generator (CSPRNG). This value is used as the symmetric encryption key. It’s then encrypted using an RSA public key and modular exponentiation before being sent to a remote server. The server performs RSA decryption using the private key to recover the same session key.

Byte Order

As stated already, OpenSSL exports signatures using the Big-Endian convention whereas Microsoft Crypto API uses Little-Endian.

High-end servers and mainframes in the 80s and 90s used big-endian architectures like SPARC, MIPS and POWER. As a legacy of this, many cryptography libraries use the big-endian convention to store data on disk.

To accommodate this on Windows, which predominantly runs on the x86 architecture, we swap the order of bytes after signing and before verification.
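The swap itself is just an in-place reversal of the signature buffer. A minimal sketch, assuming the signature is held in a plain byte array (reverse_bytes is an illustrative name, not necessarily the one used in the RSA tool):

```c
#include <assert.h>
#include <stddef.h>

/* Reverse `len` bytes in place, converting a big-endian integer
   to little-endian or back again. */
void reverse_bytes(unsigned char *buf, size_t len)
{
    size_t i, j;

    assert(buf != NULL || len == 0);

    if (len < 2) return;
    for (i = 0, j = len - 1; i < j; i++, j--) {
        unsigned char t = buf[i];
        buf[i] = buf[j];
        buf[j] = t;
    }
}
```

Applying it twice restores the original order, so the same routine serves both directions: once on the CAPI output before writing a signature to disk, and once on an OpenSSL signature before handing it to CAPI.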

Then we have no problem verifying signatures generated by OpenSSL.

RSA key context

The following structure is defined to hold RSA keys.

RSA Key Generation

CAPI uses 65537 as the public exponent in key generation so we need to use the same for OpenSSL.

Reading and writing PEM files

Before using the CAPI functions, we need to decode the PEM files into ASN.1 encoded structures. The CryptStringToBinary and CryptBinaryToString APIs can convert to and from PEM; however, these have only been available since Windows XP and Windows Server 2003.

See PEM_read_file and PEM_write_file functions for more details.
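As a rough idea of what reading a PEM file without CryptStringToBinary involves, the sketch below pulls the base64 body out from between the BEGIN and END lines. It is an assumption-laden illustration, not the PEM_read_file function itself: pem_body is a hypothetical name, the input is expected to be a NUL-terminated string already loaded into memory, and encrypted PEM files with extra header fields are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the base64 body found between the "-----BEGIN ...-----" and
   "-----END ...-----" lines of `pem` into `body` as a NUL-terminated
   string with line breaks removed.
   Returns the body length, or -1 if the armor is missing or `body`
   is too small. */
long pem_body(const char *pem, char *body, size_t body_max)
{
    const char *begin, *end, *p;
    size_t n = 0;

    assert(pem != NULL && body != NULL);

    if (body_max == 0) return -1;

    begin = strstr(pem, "-----BEGIN ");
    if (begin == NULL) return -1;
    begin = strchr(begin, '\n');            /* body starts on the next line */
    if (begin == NULL) return -1;
    end = strstr(begin, "-----END ");
    if (end == NULL) return -1;

    for (p = begin; p < end; p++) {
        if (*p == '\n' || *p == '\r') continue;   /* drop line breaks */
        if (n + 1 >= body_max) return -1;         /* leave room for NUL */
        body[n++] = *p;
    }
    body[n] = '\0';
    return (long)n;
}
```

The resulting string can then be passed through a base64 decoder to obtain the binary ASN.1 data.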

Importing public and private keys

  • Crypto API

For the public key, decode the ASN.1 structure into a Public Key Info structure before importing it into a CAPI key object using the CryptImportPublicKeyInfo API.

For the private key, decode the ASN.1 structure into a Private Key Info structure, then convert the PrivateKey value into a CAPI Private Key Blob before importing it into a CAPI key object.

Unfortunately, there’s no CryptImportPrivateKeyInfo API, hence the extra call to CryptDecodeObjectEx.

  • OpenSSL

OpenSSL offers a much simpler solution with a single API call for both private and public keys. We also don’t have to decode the PEM format beforehand. Nice, eh?

Exporting keys

  • Crypto API

Since 2000, we can use CryptExportPKCS8 to export the private key.

Export the public key using CryptExportPublicKeyInfo, then encode the result with ASN.1.

  • OpenSSL

Signing a file

  • Crypto API

  • OpenSSL

Verifying signature

  • Crypto API

  • OpenSSL

RSA Tool Usage

  • Key Generation


  • Signing a file

  • Verifying signature


The purpose of this post was to cover the main problems of key exchange between OpenSSL and Microsoft Crypto API.

For symmetric key exchange, so long as we use ASN.1 encoding for the exchange of public and private keys and remember that OpenSSL uses the big-endian convention whereas CAPI uses little-endian, there isn’t a significant problem.

In part 2, we’ll examine how to perform end-to-end encryption of network traffic between a Windows machine using Crypto API and a Linux machine using OpenSSL.

Source code for the RSA tool can be found here

Posted in crypto api, cryptography, openssl, programming, security, windows

Listing processes on Windows in C


I was writing something recently which required obtaining a list of running processes on Windows. The problem was that it had to run on systems as early as Windows NT right up to Windows 10 and there’s no single API you can use for this.

I also had to pick a compiler that would support Windows NT since Microsoft generally avoid supporting legacy operating systems with their compilers after a certain point.

API options

So what APIs are available to use?

  • Performance Data
  • Available since Windows NT but undocumented. Mark Russinovich’s pslist uses this so it can also work remotely provided the Remote Registry Service is enabled.

  • Win32_Process
  • Seems to be available since Windows NT. Tried checking with VBScript but cscript.exe wasn’t installed. This would be ideal if you needed to list processes on a remote system but is also much harder to use when programming with C++ instead of a scripting language such as VBScript or Powershell.

  • Process32First
  • You can certainly use this for 32 and 64-bit now since Windows XP but it’s not available on Windows NT.

  • EnumProcesses
  • Apparently the PSAPI.DLL file where this API resides can be installed from CD-ROM but it’s not installed by default which means we can’t depend on it.

  • NtQuerySystemInformation
  • Available since Windows NT so we can use it for that but doesn’t work for 64-bit systems.

    The solution is to use NtQuerySystemInformation if executing on Wow64 or legacy systems and to use Process32First if we’re building for 64-bit systems.

    Process Entry Structure

    Rather than work with 2 different structures, I only copy the module name and process id to a new structure which is then parsed by the caller. The module names are in Unicode format.

    typedef struct _PROCENTRY_T {
      DWORD id;
      WCHAR name[MAX_PATH];
    } PROCENTRY, *PPROCENTRY;
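The list built from this structure is a plain array that grows by one entry at a time and ends with an entry whose id is zero. The following portable sketch shows that pattern with standard C types standing in for DWORD and WCHAR; proc_entry, proc_list_add and proc_list_count are hypothetical names used only here, and realloc stands in for the xrealloc helper.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <wchar.h>

#define NAME_MAX_CHARS 260            /* stand-in for MAX_PATH */

typedef struct {
    uint32_t id;                      /* process id; 0 marks the end of the list */
    wchar_t  name[NAME_MAX_CHARS];    /* module name */
} proc_entry;

/* Append one entry and keep a zeroed terminator after the last real entry.
   `count` is the number of real entries already stored.
   Returns the (possibly moved) list, or NULL on failure, in which case
   the old list has been freed. */
proc_entry *proc_list_add(proc_entry *list, size_t count,
                          uint32_t id, const wchar_t *name)
{
    proc_entry *grown;

    assert(name != NULL);

    grown = realloc(list, (count + 2) * sizeof *grown);
    if (grown == NULL) {
        free(list);
        return NULL;
    }
    grown[count].id = id;
    wcsncpy(grown[count].name, name, NAME_MAX_CHARS - 1);
    grown[count].name[NAME_MAX_CHARS - 1] = L'\0';

    grown[count + 1].id      = 0;     /* terminator entry */
    grown[count + 1].name[0] = L'\0';
    return grown;
}

/* Count entries by walking to the terminator, as a caller would. */
size_t proc_list_count(const proc_entry *list)
{
    size_t n = 0;
    if (list == NULL) return 0;
    while (list[n].id != 0) n++;
    return n;
}
```

Writing the terminator explicitly means the caller's loop does not depend on the allocator returning zeroed memory.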

    Legacy Mode

    So here’s a little function in C called GetProcessList which uses those 2 API to retrieve a list of running processes.

    PPROCENTRY GetProcessList(VOID)
    {
      pNtQuerySystemInformation   NtQuerySystemInformation;
      pRtlCompareUnicodeString    RtlCompareUnicodeString;
      PSYSTEM_PROCESS_INFORMATION p;
      ULONG                       len=0, total=0, pe_size=0;
      NTSTATUS                    status;
      LPVOID                      list=NULL;
      PPROCENTRY                  pe;
      DWORD                       i;

      // resolve the API address at runtime
      NtQuerySystemInformation = (pNtQuerySystemInformation)GetProcAddress(
          GetModuleHandleA("ntdll"), "NtQuerySystemInformation");
      if (!NtQuerySystemInformation) {
        // we couldn't resolve API address
        return NULL;
      }
      list = xmalloc(2048);
      // grow the buffer until the whole process list fits
      do {
        len += 2048;
        list = xrealloc (list, len);
        if (list==NULL) {
          // we couldn't reallocate memory
          return NULL;
        }
        status = NtQuerySystemInformation(SystemProcessInformation, list, len, &total);
      } while (status == STATUS_INFO_LEN_MISMATCH);
      if (status < 0) {
        // we were unable to obtain list of process
        return NULL;
      }
      p       = (PSYSTEM_PROCESS_INFORMATION)list;
      pe_size = sizeof(PROCENTRY);
      pe      = xmalloc(pe_size);
      for (i=0;;)
      {
        if (p->ProcessName.Buffer != 0)
        {
          // copy process id and module name
          pe[i].id = p->ProcessId;
          lstrcpyW(pe[i].name, p->ProcessName.Buffer);
          // make room for the next entry
          pe_size += sizeof(PROCENTRY);
          pe = xrealloc(pe, pe_size);
          if (pe==NULL) {
            break;
          }
          // a zero id marks the end of the list for the caller
          pe[++i].id = 0;
        }
        // no more entries? break
        if (p->NextEntryDelta==0) break;
        // advance to next entry
        p = (PSYSTEM_PROCESS_INFORMATION)(((char *)p) + p->NextEntryDelta);
      }
      return pe;
    }
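The buffer returned by NtQuerySystemInformation is not an array: each record is variable in length and stores the byte offset to the next one, with zero marking the last record. The sketch below reproduces that traversal on a mock buffer in portable C. The record type is a hypothetical stand-in, not the real SYSTEM_PROCESS_INFORMATION layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for SYSTEM_PROCESS_INFORMATION: each record
   stores the byte offset to the next record, 0 for the last one. */
typedef struct {
    uint32_t next_delta;   /* plays the role of NextEntryDelta */
    uint32_t id;           /* plays the role of ProcessId */
} record;

/* Visit every record in `buf` and return how many were seen.
   If `ids` is not NULL, the first `ids_max` ids are stored there. */
size_t walk_records(const void *buf, uint32_t *ids, size_t ids_max)
{
    const unsigned char *p = buf;
    size_t n = 0;

    assert(buf != NULL);

    for (;;) {
        record r;
        memcpy(&r, p, sizeof r);        /* avoids alignment assumptions */
        if (ids != NULL && n < ids_max) ids[n] = r.id;
        n++;
        if (r.next_delta == 0) break;   /* no more entries */
        p += r.next_delta;              /* advance by the stored offset */
    }
    return n;
}
```

Because the step size comes from the data itself, the walk keeps working even when records carry different amounts of trailing data, which is why the real loop advances with pointer arithmetic on a char pointer instead of indexing.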

    Long Mode

    PPROCENTRY GetProcessList(VOID)
    {
      HANDLE          hSnap;
      PROCESSENTRY32W pe32;
      PPROCENTRY      pe=NULL;
      DWORD           i=0, pe_size;

      hSnap = CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0);
      if (hSnap != INVALID_HANDLE_VALUE)
      {
        pe32.dwSize = sizeof(PROCESSENTRY32W);
        if (Process32FirstW(hSnap, &pe32))
        {
          i       = 0;
          pe_size = sizeof(PROCENTRY);
          pe      = xmalloc(pe_size);
          do {
            // skip the idle process
            if (pe32.th32ProcessID==0) continue;
            pe[i].id = pe32.th32ProcessID;
            lstrcpyW(pe[i].name, pe32.szExeFile);
            // make room for the next entry
            pe_size += sizeof(PROCENTRY);
            pe = xrealloc(pe, pe_size);
            if (pe==NULL) {
              break;
            }
            // a zero id marks the end of the list for the caller
            pe[++i].id = 0;
          } while (Process32NextW(hSnap, &pe32));
        }
        // release the snapshot
        CloseHandle(hSnap);
      }
      return pe;
    }


    The following is a small example of using the above functions.

    int main(void)
    {
      PPROCENTRY pe;
      PPROCENTRY list = GetProcessList();
      if (list==NULL) {
        printf ("\nUnable to retrieve list of processes");
        return 0;
      }
      printf ("\nList of processes");
      printf ("\n=================");
      // the list ends with an entry whose id is zero
      for (pe=list; pe->id; pe++) {
        printf ("\n%-30ws - %i", pe->name, pe->id);
      }
      return 0;
    }

    See pslist.c here

    To compile for just listing processes, use CL /DTEST pslist.c

    Posted in programming, windows

    Windows ICMP API in C/C++


    In the old days, pinging a computer on Windows required building an ICMP packet from scratch and using RAW sockets to send the packet to its destination. Worse, you then had to listen for a response and parse it manually.
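To give a flavour of that manual work, one piece of building an ICMP packet by hand is computing the Internet checksum (RFC 1071) over the header and payload. The sketch below is a generic, portable version of that calculation; it is not taken from the sources of this post, and inet_checksum is an illustrative name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Internet checksum (RFC 1071): the one's complement of the one's
   complement sum of the data taken as 16-bit big-endian words.
   Intended for packet-sized inputs (up to 64 KB). */
uint16_t inet_checksum(const unsigned char *data, size_t len)
{
    uint32_t sum = 0;
    size_t   i;

    assert(data != NULL || len == 0);

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len & 1)
        sum += (uint32_t)data[len - 1] << 8;    /* pad the odd byte with zero */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);     /* fold the carries back in */
    return (uint16_t)~sum;
}
```

For an echo request header of type 8, code 0, checksum field 0, identifier 1 and sequence number 1, the function returns 0xF7FD. Once that value is stored big-endian in the checksum field, running the function over the header again yields 0, which is how a receiver validates the packet.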

    Thankfully, Microsoft since Windows 2000 made available an ICMP helper library so you don’t have to perform any hard work anymore. Well, admittedly, it’s much easier to use .NET than Win32 API but if you’re curious about how to do it with C/C++, read on.

    Ping class and reply structure

    Because we can sometimes resolve multiple addresses for one hostname, I’ve defined a structure to hold information about each response.

    It contains basic information like the IP address as a string, the hostname (if available) and a status message.

    typedef struct _ICMP_REPLY {
      DWORD dwCode;
      wchar_t Message[256];
      wchar_t Dns[255];
      wchar_t Ip[255];
      struct _ICMP_REPLY *next;
    } ICMP_REPLY, *PICMP_REPLY;

    The class with properties and methods.

    class Ping {
      private:
        BOOL FlushDnsCache ();
        PICMP_REPLY rlist, current;
        void Add (PICMP_REPLY);
        void Clear (void);
        int SendEcho(ICMP_REPLY*, PADDRINFOW, int);
        HANDLE IcmpCreate(int);
      public:
        Ping() { rlist=NULL; }
        BOOL Send (wchar_t address[], int);
        BOOL Send (wchar_t address[], int, DWORD timeout);
        PICMP_REPLY GetReplies (void);
    };

    Flushing the DNS cache

    Let’s say a machine has rebooted and you’re waiting for it to come back up and it just won’t respond to a ping anymore. Chances are your cache is out of date. While this may not be a problem, it’s still always good practice to flush the DNS cache before trying again.

    So from the command line, you can use

    ipconfig /flushdns

    But if you’d rather not execute this command and want to flush the cache from code instead, you can call the undocumented API DnsFlushResolverCache.

    /**
     * Uses undocumented API from DNSAPI.DLL
     * Same as : ipconfig /flushdns
     */
    BOOL Ping::FlushDnsCache (VOID) {
      BOOL bResult = FALSE;
      BOOL (WINAPI *Flush) ();
      HMODULE hDNS = LoadLibrary (L"dnsapi");
      if (hDNS != NULL) {
        *(FARPROC *)&Flush = GetProcAddress (hDNS, "DnsFlushResolverCache");
        if (Flush != NULL) {
          bResult = Flush ();
        }
        FreeLibrary (hDNS);
      }
      return bResult;
    }

    Create an ICMP file

    The first thing to do is create an ICMP file handle. The following function returns a handle for either IPv4 or IPv6 depending on the value of the family parameter, which should be AF_INET or AF_INET6.

    // create an ICMP handle for ipv4 or ipv6
    HANDLE Ping::IcmpCreate(int family)
    {
      if (family==AF_INET) {
        return IcmpCreateFile();
      } else {
        return Icmp6CreateFile();
      }
    }

    Resolving network address for hostname

    Steps to resolve all network addresses for a hostname

    1. Resolve all addresses using GetAddrInfo
    2. Resolve name for each address using GetNameInfo
    3. Convert network address to string using WSAAddressToString
    BOOL Ping::Send (wchar_t address[], int family, DWORD timeout)
    {
      ADDRINFOW  hints;
      PADDRINFOW e, list = NULL;
      ICMP_REPLY r;
      wchar_t    host[NI_MAXHOST], serv[NI_MAXSERV], ip[INET6_ADDRSTRLEN];
      DWORD      res, size, idx;

      // clear any previous entries from the DNS cache
      if (FlushDnsCache ())
      {
        ZeroMemory (&hints, sizeof (hints));
        hints.ai_family   = family;
        hints.ai_socktype = SOCK_STREAM;
        hints.ai_protocol = IPPROTO_TCP;
        // resolve all available addresses
        if (GetAddrInfo (address, NULL, &hints, &list) == NO_ERROR)
        {
          // loop through each entry
          for (e=list; e!=NULL; e=e->ai_next)
          {
            // resolve name if available; pass the real address length
            // so that IPv6 addresses are handled too
            res = GetNameInfo (e->ai_addr, (socklen_t)e->ai_addrlen, host,
                NI_MAXHOST, serv, NI_MAXSERV, NI_NUMERICSERV);
            // copy name to structure
            StrCpy (r.Dns, (res == NO_ERROR) ? host : L"unresolved");
            // convert ip address to string; the size is in characters
            size = sizeof (ip) / sizeof (ip[0]);
            res = WSAAddressToString (e->ai_addr,
                (DWORD)e->ai_addrlen, NULL, ip, &size);
            StrCpy (r.Ip,
              (res == NO_ERROR) ? ip : family==AF_INET ? L"" : L"::");
            if (SendEcho(&r, e, timeout)) {
              Add (&r);
            }
          }
          FreeAddrInfo (list);
        } else {
          printf ("\n%i", GetLastError());
        }
      }
      return rlist != NULL;
    }

    Sending echo

    • Create ICMP file handle
    • If IPV4, use IcmpSendEcho, if IPV6, use Icmp6SendEcho2
    int Ping::SendEcho(ICMP_REPLY *r, PADDRINFOW addr, int timeout)
    {
      wchar_t             req_data[4];
      PICMP_ECHO_REPLY    pReply4;
      PICMPV6_ECHO_REPLY  pReply6;
      LPVOID              reply[sizeof(ICMPV6_ECHO_REPLY) + sizeof(req_data) * 2];
      HANDLE              hIcmpFile;
      struct sockaddr_in  v4;
      struct sockaddr_in6 v6, sa;
      DWORD               reply_size, idx;

      reply_size = sizeof(reply);
      // don't send uninitialized stack data as the payload
      ZeroMemory (req_data, sizeof(req_data));
      // create icmp file handle
      hIcmpFile = IcmpCreate(addr->ai_family);
      // both create functions return INVALID_HANDLE_VALUE on failure
      if (hIcmpFile==INVALID_HANDLE_VALUE) return 0;
      if (addr->ai_family==AF_INET)
      {
        // send ipv4
        memcpy (&v4, addr->ai_addr, addr->ai_addrlen);
        IcmpSendEcho (hIcmpFile,
          (IPAddr)v4.sin_addr.S_un.S_addr, req_data,
          sizeof(req_data), NULL,
          reply, reply_size, timeout);
        pReply4 = (PICMP_ECHO_REPLY)reply;
        r->dwCode = pReply4->Status;
      } else {
        // send ipv6
        memcpy(&v6, addr->ai_addr, addr->ai_addrlen);
        sa.sin6_addr     = in6addr_any;
        sa.sin6_family   = AF_INET6;
        sa.sin6_flowinfo = 0;
        sa.sin6_port     = 0;
        Icmp6SendEcho2 (hIcmpFile, NULL, NULL, NULL,
          &sa, &v6, req_data, sizeof(req_data),
          NULL, reply, reply_size, timeout);
        pReply6 = (PICMPV6_ECHO_REPLY)reply;
        r->dwCode = pReply6->Status;
      }
      // default message, replaced if the status code is in the table
      StrCpy (r->Message, L"Check status code");
      for (idx=0; idx<sizeof(pStatus)/sizeof(STATUS_MSG); idx++)
      {
        if (r->dwCode == pStatus[idx].dwCode)
        {
          StrCpy (r->Message, pStatus[idx].pMessage);
        }
      }
      IcmpCloseHandle (hIcmpFile);
      return 1;
    }
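The pStatus table used in the lookup loop is not shown in this post. As a rough idea of its shape, here is a portable sketch of a code-to-message table with a fallback string. The status_msg type, the status_table contents and the status_message function are assumptions made for illustration; the numeric values are a few of the IP_STATUS codes from the Windows ipexport.h header, and the message strings are invented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    unsigned long code;
    const char   *message;
} status_msg;

/* A small subset of IP_STATUS codes; the wording is illustrative. */
static const status_msg status_table[] = {
    {     0, "Success" },                          /* IP_SUCCESS */
    { 11002, "Destination network unreachable" },  /* IP_DEST_NET_UNREACHABLE */
    { 11003, "Destination host unreachable" },     /* IP_DEST_HOST_UNREACHABLE */
    { 11010, "Request timed out" },                /* IP_REQ_TIMED_OUT */
};

/* Return the message for `code`, or `fallback` if the code is unknown. */
const char *status_message(unsigned long code, const char *fallback)
{
    size_t i;

    assert(fallback != NULL);

    for (i = 0; i < sizeof status_table / sizeof status_table[0]; i++) {
        if (status_table[i].code == code)
            return status_table[i].message;
    }
    return fallback;
}
```

A call such as status_message(11010, "Check status code") returns "Request timed out", while an unlisted code falls back to "Check status code", mirroring the default that SendEcho writes before searching the table.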


    Simple demonstration using both ipv4 and ipv6

    See sources here

    Posted in networking, programming, windows