Per #260, I am assuming that internal version 6 refers to files
created with Access 2019. I can't find any documentation on this format,
so I am calling it ACE17. Testing welcome.
According to the HACKING file, the file's default language ID is stored
in the database header. Use this value instead of a generic English
language locale for indexing JET4 files.
Columns can have their own text sorting rules, including a language ID
distinct from the file's language ID, but this is not addressed as we'd
have to break the mdb_index_hash_text function signature, which I'm not
prepared to do just yet.
There appear to be two bytes after the language ID that may indicate
additional sorting flags. These bytes need additional research.
Using the notes and RC4 key provided in the HACKING file, decrypt the
database definition page all at once instead of decrypting individual
fields with ad-hoc keys. Use the newly decrypted header to access the
database code page at offset 0x3C, and use this numeric value to
initialize the iconv converter with an appropriate charset name for
popular Windows code pages. More encodings can be added later, with
the eventual goal of getting rid of the MDB_JET3_CHARSET environment
variable.
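
For reference, whole-page decryption amounts to a single RC4 pass over the
buffer. The following is a minimal sketch of textbook RC4, not the libmdb
code: `rc4_crypt` is a hypothetical name, and the key bytes and the range of
the page that is actually encrypted are left to the caller (see the HACKING
file).

```c
#include <stddef.h>
#include <stdint.h>

/* Textbook RC4, applied in place. Encryption and decryption are the
 * same operation. Hypothetical helper, not part of libmdb. */
static void rc4_crypt(const uint8_t *key, size_t key_len,
                      uint8_t *buf, size_t buf_len)
{
    uint8_t S[256];
    uint8_t i = 0, j = 0, tmp;
    size_t n;

    /* Key-scheduling algorithm (KSA) */
    for (n = 0; n < 256; n++)
        S[n] = (uint8_t)n;
    for (n = 0; n < 256; n++) {
        j = (uint8_t)(j + S[n] + key[n % key_len]);
        tmp = S[n]; S[n] = S[j]; S[j] = tmp;
    }

    /* Pseudo-random generation algorithm (PRGA), XORed over the buffer */
    i = 0;
    j = 0;
    for (n = 0; n < buf_len; n++) {
        i = (uint8_t)(i + 1);
        j = (uint8_t)(j + S[i]);
        tmp = S[i]; S[i] = S[j]; S[j] = tmp;
        buf[n] ^= S[(uint8_t)(S[i] + S[j])];
    }
}
```

Because the keystream is generated once for the whole span, every field in
the decrypted header can then be read at its plain offset.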
Note that individual columns can have their own code pages, but this
issue is not addressed here.
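
The file-level code page lookup described above could be sketched roughly
as follows. This is an illustration under assumptions, not the committed
code: `read_le16`, `codepage_to_charset` and `header_charset` are
hypothetical names, the code page is assumed to be a 16-bit little-endian
value, and the `CPnnnn` names are assumed to be accepted by the local
iconv (they are by GNU iconv).

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 16-bit value from a byte buffer. */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Hypothetical helper: map a Windows code page number to an iconv
 * charset name. Returns NULL for code pages not in the table so the
 * caller can fall back to a default. */
static const char *codepage_to_charset(uint16_t code_page)
{
    switch (code_page) {
    case 874:  return "CP874";
    case 932:  return "CP932";
    case 936:  return "CP936";
    case 949:  return "CP949";
    case 950:  return "CP950";
    case 1250: return "CP1250";
    case 1251: return "CP1251";
    case 1252: return "CP1252";
    case 1253: return "CP1253";
    case 1254: return "CP1254";
    case 1255: return "CP1255";
    case 1256: return "CP1256";
    case 1257: return "CP1257";
    case 1258: return "CP1258";
    default:   return NULL;
    }
}

/* Look up the charset for an already-decrypted header page.
 * The caller must supply at least 0x3E bytes. */
static const char *header_charset(const uint8_t *decrypted_header)
{
    return codepage_to_charset(read_le16(decrypted_header + 0x3C));
}
```

Returning NULL for unknown code pages leaves room for a fallback such as
the MDB_JET3_CHARSET environment variable until the table is complete.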
An extra field is added to the MdbFile structure. Because this
struct is allocated internally, this should not break the public
ABI.
Finally, only set the db_passwd field if it's a JET3 database (see #144).
* Separate -D (date only) and -T (date/time) format options in mdb-export and mdb-json
* New public mdb_set_shortdate_fmt() function in libmdb
* New private(ish) mdb_col_is_shortdate() function
I'm calling it "shortdate" in order to preserve the existing API.
See https://github.com/mdbtools/mdbtools/issues/12
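
The idea behind the split can be illustrated with plain strftime. This is a
sketch only and does not use the libmdb API: `format_date_value` and its
`is_shortdate` flag are hypothetical stand-ins for whatever the library
does internally once it knows a column holds dates without a time part.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical illustration: pick the date-only format for "shortdate"
 * columns and the full date/time format otherwise. Returns the number
 * of characters written, or 0 if the output buffer is too small. */
static size_t format_date_value(char *out, size_t out_len,
                                const struct tm *value, int is_shortdate,
                                const char *datetime_fmt,
                                const char *shortdate_fmt)
{
    const char *fmt = is_shortdate ? shortdate_fmt : datetime_fmt;
    return strftime(out, out_len, fmt, value);
}
```

With `-D` supplying the date-only format and `-T` the date/time format,
each kind of column can be rendered independently.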
This should fix long-standing complaints about the default bind size
without causing undue memory inflation in existing applications.
Could make this adjustable on the command line later.
Supersedes:
https://github.com/mdbtools/mdbtools/pull/137
Quickstart (requires Clang 6 or later):
$ export LIB_FUZZING_ENGINE=/path/to/fuzzing/library.a
$ ./configure --enable-fuzz-testing
$ make
$ cd src/fuzz
$ make fuzz_mdb
$ ./fuzz_mdb
Also add a new `mdb_open_buffer` function to facilitate in-memory
fuzz-testing. This requires fmemopen, which may not be present on all
systems. The internal API has been reworked to use file streams instead
of file descriptors. This allows reading from memory and reading from
files using a consistent API.
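
A minimal sketch of why streams help, assuming a POSIX system with
fmemopen: the same read routine works whether the `FILE *` came from fopen
or from a memory buffer. `open_buffer_stream` and `read_at` are
hypothetical names for illustration and are not the libmdb internals.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdio.h>

/* Wrap a read-only memory buffer in a stdio stream (POSIX fmemopen).
 * Returns NULL on failure. Close the stream with fclose(). */
static FILE *open_buffer_stream(const void *buf, size_t len)
{
    return fmemopen((void *)buf, len, "r");
}

/* Read up to len bytes starting at offset. Works identically for
 * streams backed by a file or by memory. Returns bytes read. */
static size_t read_at(FILE *stream, long offset, void *out, size_t len)
{
    if (fseek(stream, offset, SEEK_SET) != 0)
        return 0;
    return fread(out, 1, len, stream);
}
```

A fuzzer can hand its input buffer straight to a stream like this without
writing a temporary file.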
There are more modern tools for memory debugging, so get rid of the
DMALLOC crap in the source code.
I've left one reference in backend.c to prevent a merge conflict but
this can be removed later.
Attempt to make the backend handling logic thread-safe. This removes the
last MDB_CONSTRUCTOR. Also get rid of some JAVA junk and make the
remaining static variables in backend.c constant. Finally remove some
obsolete fields from MdbFile.
Some Access 2010 files use 0x03 as the version number rather than
0x0103. For this reason I have changed the call to mdb_get_int32 to
mdb_get_byte.
In addition, according to the Library of Congress page:
https://www.loc.gov/preservation/digital/formats/fdd/fdd000463.shtml
Access 2016 uses 0x05 as the version number. I have inferred from the
Wikipedia page that Access 2013 likely uses 0x04.
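
To illustrate why reading a single byte handles both encodings: assuming
the version field is stored little-endian, 0x0103 and 0x03 share the same
first byte. The sketch below is hypothetical (`access_year_from_version` is
not a libmdb function) and only covers the values mentioned in these notes,
including the inferred 0x04 and the assumed 0x06 from #260.

```c
#include <stdint.h>

/* Hypothetical helper: map the first byte of the version field to an
 * Access release year. Returns 0 for values not covered here. */
static int access_year_from_version(const uint8_t *version_field)
{
    switch (version_field[0]) {   /* low byte only; higher bytes ignored */
    case 0x03: return 2010;
    case 0x04: return 2013;       /* inferred, not confirmed */
    case 0x05: return 2016;
    case 0x06: return 2019;       /* assumed, per #260 */
    default:   return 0;
    }
}
```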