Zip engine detect whether password is used

JoshK(Posted 2012) [#1]
The zip engine module has support for passwords, if you know whether a password is used, and what it is. However, I don't know any way to detect whether or not a password is used in an unknown zip file, and how to tell whether the password used is the right one. I'd like to be able to prompt the user to enter a password, only when it is needed, and be able to tell them if their password is incorrect. Is there any way to do this with zip engine? Thanks.


BlitzSupport(Posted 2012) [#2]
This structure seems to relate to an individual file entry (which may individually be encrypted or unencrypted) within a zip file:

Type SZIPCentralFileHeader 
	Field Sig:Int
	Field VersionMadeBy:Short
	Field VersionToExtract:Short
	Field GeneralBitFlag:Short	' **** That's your feller ****
	...


It looks like it comes from here (unzip.h), with the addition of a 'magic number' Sig field:

/* unz_file_info contain information about a file in the zipfile */
typedef struct unz_file_info_s
{
    uLong version;              /* version made by                 2 bytes */
    uLong version_needed;       /* version needed to extract       2 bytes */
    uLong flag;                 /* general purpose bit flag        2 bytes */
    ....


And from http://www.pkware.com/documents/casestudies/APPNOTE.TXT :


general purpose bit flag: (2 bytes)

Bit 0: If set, indicates that the file is encrypted.



[Encrypted = has password.]

So, I believe you want to call ZipFile.getFileInfo, on each entry within the zip, to receive one of these...

Type SZipFileEntry
	Field zipFileName:String
	Field simpleFileName:String
	Field path:String
	Field header:SZIPCentralFileHeader
	...


... then check whether bit 0 is set in header:SZIPCentralFileHeader --> GeneralBitFlag.

As far as I can see, you can't just check if a password is right or wrong -- I think it's applied to the data as it's read, i.e. you just have to try to unpack it, and it'll fail if the password's wrong.
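
The same bit-0 test can be illustrated outside BlitzMax. The following is a minimal Python sketch, assuming the standard `zipfile` module, whose `ZipInfo.flag_bits` exposes the general purpose bit flag; `encrypted_entries` and `ENCRYPTED_FLAG` are hypothetical names chosen for this example, not part of zip engine.

```python
import io
import zipfile

ENCRYPTED_FLAG = 0x0001  # bit 0 of the general purpose bit flag


def encrypted_entries(zf: zipfile.ZipFile) -> list:
    """Return the names of entries whose bit 0 (encrypted) is set."""
    return [info.filename for info in zf.infolist()
            if info.flag_bits & ENCRYPTED_FLAG]


# Build a small unencrypted archive in memory to exercise the check.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("readme.txt", "hello")

with zipfile.ZipFile(buffer) as zf:
    needs_password = encrypted_entries(zf)

print(needs_password)  # [] because nothing in this archive is encrypted
```

Because the flag is stored per entry, an archive can mix encrypted and unencrypted files, so a prompt for a password is only needed when the returned list is non-empty.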

Last edited 2012


BlitzSupport(Posted 2012) [#3]
This seems to work...


Import gman.zipengine

Local zr:ZipReader = New ZipReader

If zr.OpenZip ("myfile.zip")
	
	Print "OK"
	
	' ITERATE THROUGH INDIVIDUAL ENTRIES CHECKING FOR PASSWORDS...

	For Local entry:Int = 0 Until zr.getFileCount ()
		
		Local zfe:SZipFileEntry = zr.getFileInfo (entry)
		
		If zfe
			
			Local info:String = zfe.zipFileName
			
			If zfe.header
				' Bit 0 of GeneralBitFlag set = this entry is encrypted
				If zfe.header.GeneralBitFlag & 1 Shl bit
					info = info + " [Password protected]"
				EndIf
			EndIf
			
			Print info
			
		EndIf
		
	Next
	
	zr.CloseZip ()
	
Else
	Print "FAIL"
EndIf


Last edited 2012


BlitzSupport(Posted 2012) [#4]
As I say, I believe the only way to find out if a password is correct is to try to extract the file entry in question.


xlsior(Posted 2012) [#5]
IIRC that is on purpose -- the password is used to encrypt the data as well, and doing this makes it harder for someone to brute-force password crack a zipfile.


JoshK(Posted 2012) [#6]
I thought WinRar and other programs would prompt you again for the password if you got it wrong. How do they figure it out?


matibee(Posted 2012) [#7]
How do they figure it out?


I've always assumed the decrypted and decompressed data would fail the CRC check. Whether that's the CRC of an individual block or of the whole file, and whether the stored CRC is for the plain data or the packed data, I don't know.

If the stored CRCs are for the plain data, you'd just have to run the extracted data through the same CRC algorithm to check it was extracted correctly.

James' pkware link mentions the CRC algorithm used:

The CRC-32 algorithm was generously contributed by
          David Schwaderer and can be found in his excellent
          book "C Programmers Guide to NetBIOS" published by
          Howard W. Sams & Co. Inc.
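
On the question of which data the stored CRC covers: in the zip format it is computed over the uncompressed (plain) bytes of each entry. The Python sketch below checks this with the standard `zipfile` and `zlib` modules; `matches_stored_crc` is a hypothetical helper name, and the all-zero buffer only stands in for what a wrong password might produce.

```python
import io
import zipfile
import zlib


def matches_stored_crc(info: zipfile.ZipInfo, data: bytes) -> bool:
    """Compare the CRC-32 of extracted bytes with the entry's stored CRC."""
    return (zlib.crc32(data) & 0xFFFFFFFF) == info.CRC


plain = b"The quick brown fox jumps over the lazy dog" * 20

# Write one deflate-compressed entry to an in-memory archive.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("fox.txt", plain)

with zipfile.ZipFile(buffer) as zf:
    info = zf.getinfo("fox.txt")

good = matches_stored_crc(info, plain)             # the original plain bytes
bad = matches_stored_crc(info, bytes(len(plain)))  # same length, all zeroes
print(good, bad)
```

The entry is stored compressed, yet the CRC of the uncompressed text is what matches, so running extracted data through CRC-32 and comparing with the header value is a workable correctness check.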



JoshK(Posted 2012) [#8]
The problem seems to be that the function unzOpenCurrentFilePassword() always returns UNZ_OK, regardless of whether the password is right or not.

I don't know what the "Shl bit" is doing. "bit" is an undeclared identifier:
			If zfe.header
				If zfe.header.GeneralBitFlag & 1 Shl bit
					info = info + " [Password protected]"
				EndIf
			EndIf


Here's what I am doing:
Local password:String="This is the password!"
If (szipfileentry.header.GeneralBitFlag & 1)
	extractedfile = zrObject.ExtractFile( szipfileentry.zipFileName,False,password)
Else
	extractedfile = zrObject.ExtractFile(szipfileentry.zipFileName)
EndIf
If extractedFile
	outFile = WriteFile( path+"\"+szipfileentry.zipFileName )
	If outfile
		CopyStream(extractedFile,outFile)
		outfile.Close()
		success=True
	EndIf
Else
	Notify "Failed to extract file ~q"+szipfileentry.zipFileName+"~q.",True
	zrObject.CloseZip()
	HideGadget progresswindow
	Return Null
EndIf


This works to detect passwords (I think) but if the password is wrong, there is no error. The extracted files are just blank, most likely all zeroes. How can I check to tell if the password is right or not?

Some Russian guy is having the exact same problem here:
http://translate.google.com/translate?hl=en&sl=ru&u=http://forum.codenet.ru/archive/index.php/t-65168.html&ei=Bt9TT_7NMKquiQLj5vG0Bg&sa=X&oi=translate&ct=result&resnum=1&ved=0CCMQ7gEwADgK&prev=/search%3Fq%3D%2522unzOpenCurrentFilePassword%2522%2Bwrong%26start%3D10%26hl%3Den%26client%3Dfirefox-a%26hs%3DxSz%26sa%3DN%26rls%3Dorg.mozilla:en-US:official%26prmd%3Dimvns

Last edited 2012


JoshK(Posted 2012) [#9]
Okay, it looks like the only way is to compare the CRC32 value stored in the zip header to what you calculate for the file contents, and if they don't match that either means an error occurred, or the password is wrong.

How do I calculate a CRC32 value for a file's contents? Is there already a BMX function somewhere I can copy and paste?
SuperStrict

Function CRC32:Long(stream:TStream)
	Local pos:Int=stream.Pos()
	Local value:Long
	stream.Seek(0)
	
	'code here
	
	stream.Seek(pos)
	Return value
EndFunction


Last edited 2012


BlitzSupport(Posted 2012) [#10]

"bit" is an undeclared identifier


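That's just because you're in Strict/SuperStrict mode. In non-strict mode an undeclared variable is created automatically with the value 0, so "1 Shl bit" is just 1, i.e. a test of bit 0. It looks like you're past that anyway.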

There are several examples of CRC32 checks throughout the site -- here are a couple:

http://www.blitzbasic.com/Community/posts.php?topic=47524
http://www.blitzbasic.com/codearcs/codearcs.php?code=1686


JoshK(Posted 2012) [#11]
This is my attempt. The CRC32 value this produces does not match the one stored in the zip file info header. I suspect this may have something to do with signed/unsigned integers (although the zip file info header stores a signed int).

CRC32.bmx:
SuperStrict

Import "CRC32.c"

Extern "c"
	Function gen_crc_table()
	Function update_crc:Int(crc_accum:Int,data:Byte Ptr,size:Int)
EndExtern

gen_crc_table()

Function CRC32:Long(stream:TStream)
	Local pos:Int=stream.Pos()
	Local value:Long
	Local size:Int=stream.Size()
	Local bank:TBank=CreateBank(size)
	
	stream.Seek(0)
	stream.ReadBytes(bank.buf(),size)
	value=update_crc(-1,bank.buf(),size)
	stream.Seek(pos)
	Return value
EndFunction

CRC32.c:
//======================================================== file = crc32.c =====
//=  Program to compute CRC-32 using the "table method" for 8-bit subtracts   =
//=============================================================================
//=  Notes: Uses the standard "Charles Michael Heard" code available from     =
//=         http://cell-relay.indiana.edu/cell-relay/publications/software    =
//=         /CRC which was adapted from the algorithm described by Aram       =
//=         Perez, "Byte-wise CRC Calculations," IEEE Micro 3, 40 (1983).     =
//=---------------------------------------------------------------------------=
//=  Build:  bcc32 crc32.c, gcc crc32.c                                       =
//=---------------------------------------------------------------------------=
//=  History:  KJC (8/24/00) - Genesis (from Heard code, see above)           =
//=============================================================================
//----- Include files ---------------------------------------------------------
#include <stdio.h>                  // Needed for printf()
#include <stdlib.h>                 // Needed for rand()

//----- Type defines ----------------------------------------------------------
typedef unsigned char      byte;    // Byte is a char
typedef unsigned short int word16;  // 16-bit word is a short int
typedef unsigned int       word32;  // 32-bit word is an int

//----- Defines ---------------------------------------------------------------
#define POLYNOMIAL 0x04c11db7L      // Standard CRC-32 polynomial
#define BUFFER_LEN       4096L      // Length of buffer

//----- Global variables ------------------------------------------------------
static word32 crc_table[256];       // Table of 8-bit remainders

//----- Prototypes ------------------------------------------------------------
void gen_crc_table(void);
word32 update_crc(word32 crc_accum, byte *data_blk_ptr, word32 data_blk_size);

//===== Main program ==========================================================
/*void main(void)
{
  byte        buff[BUFFER_LEN]; // Buffer of packet bytes
  word32      crc32;            // 32-bit CRC value
  word16      i;                // Loop counter (16 bit)
  word32      j;                // Loop counter (32 bit)

  // Initialize the CRC table
  gen_crc_table();

  // Load buffer with BUFFER_LEN random bytes
  for (i=0; i<BUFFER_LEN; i++)
    buff[i] = (byte) rand();

  // Compute and output CRC
  crc32 = update_crc(-1, buff, BUFFER_LEN);
  printf("CRC = %08X \n", crc32);
}*/

//=============================================================================
//=  CRC32 table initialization                                               =
//=============================================================================
void gen_crc_table(void)
{
  register word16 i, j;
  register word32 crc_accum;

  for (i=0;  i<256;  i++)
  {
    crc_accum = ( (word32) i << 24 );
    for ( j = 0;  j < 8;  j++ )
    {
      if ( crc_accum & 0x80000000L )
        crc_accum = (crc_accum << 1) ^ POLYNOMIAL;
      else
        crc_accum = (crc_accum << 1);
    }
    crc_table[i] = crc_accum;
  }
}

//=============================================================================
//=  CRC32 generation                                                         =
//=============================================================================
word32 update_crc(word32 crc_accum, byte *data_blk_ptr, word32 data_blk_size)
{
   register word32 i, j;

   for (j=0; j<data_blk_size; j++)
   {
     i = ((int) (crc_accum >> 24) ^ *data_blk_ptr++) & 0xFF;
     crc_accum = (crc_accum << 8) ^ crc_table[i];
   }
   crc_accum = ~crc_accum;

   return crc_accum;
}
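
A likely reason for the mismatch, more than signedness: the C code above shifts the register left and feeds bits most-significant-first with polynomial 0x04c11db7, whereas the CRC-32 used by zip is the bit-reflected form (right shifts, polynomial 0xEDB88320). Both use the same initial value and final inversion but give different results. The Python sketch below is an illustration of that difference, not a fix to the C file: `crc32_msb_first` is a bitwise re-statement of `update_crc(-1, ...)`, `reflect` is a hypothetical helper, and `zlib.crc32` stands in for the zip CRC.

```python
import zlib

POLY = 0x04C11DB7  # same polynomial as the C code above, MSB-first form


def crc32_msb_first(data: bytes) -> int:
    """Bitwise equivalent of update_crc(-1, ...) from the C code above."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte << 24
        for _ in range(8):
            if crc & 0x80000000:
                crc = ((crc << 1) ^ POLY) & 0xFFFFFFFF
            else:
                crc = (crc << 1) & 0xFFFFFFFF
    return crc ^ 0xFFFFFFFF


def reflect(value: int, width: int) -> int:
    """Reverse the lowest `width` bits of value."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


data = b"123456789"
msb = crc32_msb_first(data)
zip_crc = zlib.crc32(data) & 0xFFFFFFFF

# Bit-reversing every input byte and the final result converts one into the other.
converted = reflect(crc32_msb_first(bytes(reflect(b, 8) for b in data)), 32)

print(f"{msb:08X} {zip_crc:08X} {converted:08X}")
```

The first two values differ while the third equals the second, which is consistent with the two routines being the same polynomial processed in opposite bit orders.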


Last edited 2012


JoshK(Posted 2012) [#12]
Cool, I found some code that works. Here's what I ended up using. Thanks!
Strict

Import brl.stream

Global crc_table[256]

crc_init()

Function crc_init()
	Local i
	Local j
	Local value
	
	For i=0 To 255
		value=i
		For j=0 To 7
			If (value & $1) Then 
				value=(value Shr 1) ~ $EDB88320
			Else
				value=(value Shr 1)
			EndIf
		Next
		crc_table[i]=value
	Next
EndFunction

Function GetStreamCRC32(stream:TStream)
	Local bbyte
	Local crc
	Local pos:Int=stream.pos()
	
	crc=$FFFFFFFF
	While Not Eof(stream)
		bbyte=ReadByte(stream)
		crc=(crc Shr 8) ~ crc_table[bbyte ~ (crc & $FF)]
	Wend
	stream.seek pos
	Return ~crc
EndFunction
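
For comparison, here is a Python sketch of the same table-driven routine, checked against `zlib.crc32`. It also shows the signed/unsigned point raised earlier: a BlitzMax Int is a signed 32-bit value, so a CRC with the top bit set prints as a negative number even though the bit pattern is identical. `make_crc_table`, `crc32_table` and `as_signed32` are names invented for this example.

```python
import zlib


def make_crc_table() -> list:
    """Same table as crc_init() above: reflected polynomial 0xEDB88320."""
    table = []
    for i in range(256):
        value = i
        for _ in range(8):
            if value & 1:
                value = (value >> 1) ^ 0xEDB88320
            else:
                value >>= 1
        table.append(value)
    return table


CRC_TABLE = make_crc_table()


def crc32_table(data: bytes) -> int:
    """Table-driven CRC-32, returned as an unsigned 32-bit value."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = (crc >> 8) ^ CRC_TABLE[(byte ^ crc) & 0xFF]
    return crc ^ 0xFFFFFFFF


def as_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    return value - 0x100000000 if value & 0x80000000 else value


data = b"123456789"
unsigned_crc = crc32_table(data)
print(hex(unsigned_crc))          # 0xcbf43926, the standard CRC-32 check value
print(as_signed32(unsigned_crc))  # negative, because the top bit is set
```

As long as both the computed CRC and the header CRC are held in the same 32-bit signed type, comparing them for equality works regardless of how they print.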



matibee(Posted 2012) [#13]
Nice work.

It made sense that re-calculating the CRC from the plain data would be the way. Obviously that's still pretty secure because CRC values are only 32 bits (and so can't be unique), therefore it's not possible to really tell you have got the original plain data even if the CRC matches. That prevents brute force extraction simply halting when the CRCs match (sprinkle in checks of any included and known file headers though and you'd be on your way :) ). There are zip cracking programs out there that probably use that approach.
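
The point that a matching CRC cannot prove the data is the original can be demonstrated directly. The Python sketch below searches pseudo-random 8-byte messages until two different ones share a CRC-32; with a 32-bit value this typically happens after somewhere around a hundred thousand candidates. `find_crc32_collision` is a hypothetical helper, and SHA-256 is used only as a convenient deterministic source of test messages.

```python
import hashlib
import zlib


def find_crc32_collision(limit: int = 2_000_000):
    """Search pseudo-random 8-byte messages for two with the same CRC-32."""
    seen = {}
    for i in range(limit):
        message = hashlib.sha256(str(i).encode()).digest()[:8]
        crc = zlib.crc32(message) & 0xFFFFFFFF
        other = seen.get(crc)
        if other is not None and other != message:
            return other, message, crc
        seen[crc] = message
    return None


result = find_crc32_collision()
if result:
    first, second, crc = result
    print(first.hex(), second.hex(), f"{crc:08X}")
```

For a single wrong password the chance of an accidental CRC match is still only about 1 in 4 billion, so the check is perfectly adequate for telling a user that their password was wrong.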

We once had a smart-arse customer that put a password on his zips to make them "secure" when emailing. His 4 digit password took seconds to find but I didn't want to rip his warm cozy blanket of ignorance.