GLib will automatically convert command line options to UTF-8 provided that setlocale(LC_CTYPE, "") is called first, and the argument type is STRING (but not FILENAME). Update the CLI tools to take advantage of this behavior, and likewise implement it in fakeglib.
GLib does not automatically convert non-option arguments (i.e. everything remaining in argv after option processing), so manually call g_locale_to_utf8 on these arguments when they represent table names. This should fix the CLI tools when processing non-ASCII table names in non-UTF-8 locales. Also update fakeglib to implement a fast and loose version of g_locale_to_utf8, and factor out some of the code page => iconv name logic in iconv.c so it can be used in our fake g_locale_to_utf8. This adds a new symbol mdb_iconv_name_from_code_page that is not advertised in the main header file. I did not want to include mdbtools.h from fakeglib.c, but maybe that's not important.
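As a rough illustration of the manual conversion step, here is a minimal sketch that converts a string from the current locale's encoding to UTF-8 using POSIX iconv and nl_langinfo. The function name example_locale_to_utf8 is hypothetical; this is neither the GLib implementation nor the fakeglib one, and it assumes the caller has already called setlocale(LC_CTYPE, "").

```c
#include <iconv.h>
#include <langinfo.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of GLib, fakeglib, or mdbtools).
 * Returns a malloc'd UTF-8 copy of `input`, or NULL on failure. */
char *example_locale_to_utf8(const char *input)
{
    /* Name of the current locale's character set, e.g. "ISO-8859-1". */
    const char *codeset = nl_langinfo(CODESET);
    size_t in_left = strlen(input);
    /* Assumption: 4 output bytes per input byte is enough for UTF-8. */
    size_t out_size = in_left * 4 + 1;
    size_t out_left = out_size - 1;
    char *in_ptr = (char *)input;
    char *output, *out_ptr;
    iconv_t cd = iconv_open("UTF-8", codeset);

    if (cd == (iconv_t)-1)
        return NULL;
    output = malloc(out_size);
    if (output == NULL) {
        iconv_close(cd);
        return NULL;
    }
    out_ptr = output;
    if (iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left) == (size_t)-1) {
        /* Invalid input for this locale, or the buffer guess was too small. */
        free(output);
        iconv_close(cd);
        return NULL;
    }
    *out_ptr = '\0';
    iconv_close(cd);
    return output;
}
```

A CLI tool would apply something like this to each remaining argv entry that names a table, and free the result when done.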
Other programs (e.g. gmdb2) use mdb_print_col, so restore the old enum
names and values. MDB_EXPORT_ESCAPE_INVISIBLE can be OR'ed into the
last argument to enable C-style escaping of text fields.
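To illustrate how a flag can share an argument with an enum, here is a small sketch. All names and numeric values below are hypothetical stand-ins chosen for the example; they are not the actual mdbtools definitions.

```c
/* Hypothetical values for illustration only; see mdbtools.h for the
 * real enum names and values. */
enum {
    EXAMPLE_BIN_STRIP = 0,
    EXAMPLE_BIN_RAW = 1,
    EXAMPLE_BIN_OCTAL = 2,
    EXAMPLE_BIN_MASK = 0x0F,           /* low bits hold the enum value */
    EXAMPLE_ESCAPE_INVISIBLE = 1 << 4  /* separate bit for the flag */
};

/* Recover the enum value, ignoring any flag bits. */
int example_bin_mode(int flags)
{
    return flags & EXAMPLE_BIN_MASK;
}

/* Report whether the escape flag was OR'ed in. */
int example_wants_escape(int flags)
{
    return (flags & EXAMPLE_ESCAPE_INVISIBLE) != 0;
}
```

Because the flag occupies a bit outside the enum's range, callers that pass only the old enum values keep their existing behavior.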
Print a warning if use_index is turned on but libmswstr is not found. I
suppose we could enable indexes only on JET3 databases, but I am not at
all confident in the JET3 index logic, as it seems to break on non-ASCII
input. So I'd rather just print this warning and require some
hoop-jumping.
* Migrate the Windows Msys2 build from Appveyor to GitHub Actions
* Fix build with newer versions of Msys2 (fix `vasprintf` conflict)
* Enable SQL tests on the Cygwin build on Appveyor
* Fix an error message about Bison not being available when in fact Flex was not available
* Don't fail fast with Mac and Linux GitHub Actions
src/libmdb/backend.c:mdb_print_indexes() and
src/libmdb/backend.c:mdb_get_relationships(): In PostgreSQL the INDEX
names explicitly must not have a namespace name on them; they are always
created in the namespace of the table. See:
http://www.postgresql.org/docs/current/static/sql-createindex.html
which says: "The name of the index to be created. No schema name can be
included here; the index is always created in the same schema as
its parent table."
By observation the same is true for CONSTRAINT names; they are refused
if the namespace is included before them.
Also omit the namespace from the FOREIGN KEY constraint _column_ names
on PostgreSQL (it's not clear that the _column_ names should ever be
namespaced, but behaviour should currently be unchanged for databases
other than PostgreSQL).
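The naming rule can be sketched as follows: the table name carries the namespace, the index name does not. The function name and the exact statement layout are assumptions made for this example and do not reproduce what backend.c emits.

```c
#include <stdio.h>

/* Hypothetical formatter (not the backend.c code). Writes a PostgreSQL
 * CREATE INDEX statement into buf: the table is schema-qualified, the
 * index name is not. Embedded double quotes in names are not escaped
 * in this sketch. Returns the snprintf result. */
int example_pg_create_index(char *buf, size_t size, const char *schema,
                            const char *table, const char *index,
                            const char *column)
{
    return snprintf(buf, size,
                    "CREATE INDEX \"%s\" ON \"%s\".\"%s\" (\"%s\");",
                    index, schema, table, column);
}
```

PostgreSQL then creates the index in the same schema as the table, so no information is lost by dropping the prefix.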
According to the HACKING file, the file's default language ID is stored
in the database header. Use this value instead of a generic English
language locale for indexing JET4 files.
Columns can have their own text sorting rules, including language ID
distinct from the file's language ID, but this is not addressed as we'd
have to break the mdb_index_hash_text function signature, which I'm not
prepared to do just yet.
There appear to be two bytes after the language ID that may indicate
additional sorting flags. These bytes need additional research.
Using the notes and RC4 key provided in the HACKING file, decrypt the
database definition page all at once instead of decrypting individual
fields with ad-hoc keys. Use the newly decrypted header to access the
database code page at offset 0x3C, and use this numeric value to
initialize the iconv converter with an appropriate charset name for
popular Windows code pages. More encodings can be added later, with
the eventual goal of getting rid of the MDB_JET3_CHARSET environment
variable.
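For reference, decrypting a whole buffer in one pass with RC4 looks roughly like the sketch below. This is the textbook RC4 algorithm under a hypothetical function name; the key itself and the exact byte range of the definition page that is encrypted come from the HACKING file and are not reproduced here.

```c
#include <stddef.h>

/* Textbook RC4, applied in place. Hypothetical helper name; the caller
 * supplies the key and the span of bytes to decrypt. RC4 is symmetric,
 * so the same call encrypts and decrypts. */
void example_rc4(const unsigned char *key, size_t key_len,
                 unsigned char *data, size_t data_len)
{
    unsigned char s[256];
    size_t i, j = 0, n;

    if (key_len == 0)
        return;

    /* Key scheduling: permute the state array using the key. */
    for (i = 0; i < 256; i++)
        s[i] = (unsigned char)i;
    for (i = 0; i < 256; i++) {
        unsigned char t;
        j = (j + s[i] + key[i % key_len]) & 0xFF;
        t = s[i]; s[i] = s[j]; s[j] = t;
    }

    /* Keystream generation, XORed over the buffer. */
    i = j = 0;
    for (n = 0; n < data_len; n++) {
        unsigned char t;
        i = (i + 1) & 0xFF;
        j = (j + s[i]) & 0xFF;
        t = s[i]; s[i] = s[j]; s[j] = t;
        data[n] ^= s[(s[i] + s[j]) & 0xFF];
    }
}
```

Decrypting the page once up front means later reads can treat every header field as plaintext at its documented offset.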
Note that individual columns can have their own code pages but this
issue is not addressed.
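The code page lookup can be sketched as a simple switch. The function name and the particular set of code pages below are assumptions for illustration and may differ from what iconv.c supports; the "CPnnnn" names are ones accepted by glibc iconv.

```c
#include <stddef.h>

/* Hypothetical lookup (not the mdbtools function): map a numeric Windows
 * code page to an iconv charset name. Returns NULL for code pages this
 * sketch does not know about. */
const char *example_iconv_name_from_code_page(int code_page)
{
    switch (code_page) {
    case 1250: return "CP1250"; /* Central European */
    case 1251: return "CP1251"; /* Cyrillic */
    case 1252: return "CP1252"; /* Western European */
    case 1253: return "CP1253"; /* Greek */
    case 1254: return "CP1254"; /* Turkish */
    default:   return NULL;
    }
}
```

A caller would pass the returned name to iconv_open and fall back to a default charset when the result is NULL.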
An extra field is added to the MdbFile structure - because this
struct is allocated internally, this should not break the public
ABI.
Finally, only set the db_passwd field if it's a JET3 database (see #144).
Replace the jerry-built UTF-16 => Latin-1 code path with a cross-platform wcstombs solution that emits UTF-8.
This adds an element to the end of the MdbHandle struct, but should not break any existing code.
A run-time option could be added later to emit other encodings, but people who care about such things can just use the iconv code path.
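A minimal sketch of the wcstombs approach follows, assuming little-endian UTF-16 input. The function name is hypothetical and this is not the mdbtools code. Note that wcstombs encodes to the current locale's multibyte encoding, so the output is UTF-8 only when a UTF-8 locale is active.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: decode UTF-16LE bytes into wchar_t, then let
 * wcstombs encode them in the current locale's multibyte encoding.
 * Returns the number of bytes written (excluding the terminator),
 * or (size_t)-1 on failure. */
size_t example_utf16le_to_mbs(const unsigned char *src, size_t src_len,
                              char *dst, size_t dst_len)
{
    size_t units = src_len / 2;
    size_t n = 0, i, written;
    wchar_t *wide;

    if (dst_len == 0)
        return (size_t)-1;
    wide = malloc((units + 1) * sizeof *wide);
    if (wide == NULL)
        return (size_t)-1;

    for (i = 0; i < units; i++) {
        uint32_t c = src[2 * i] | ((uint32_t)src[2 * i + 1] << 8);
#if WCHAR_MAX > 0xFFFF
        /* 32-bit wchar_t: combine a surrogate pair into one code point. */
        if (c >= 0xD800 && c <= 0xDBFF && i + 1 < units) {
            uint32_t lo = src[2 * i + 2] | ((uint32_t)src[2 * i + 3] << 8);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                c = 0x10000 + ((c - 0xD800) << 10) + (lo - 0xDC00);
                i++;
            }
        }
#endif
        wide[n++] = (wchar_t)c;
    }
    wide[n] = L'\0';

    /* Leave room for the terminator, which wcstombs may not write. */
    written = wcstombs(dst, wide, dst_len - 1);
    free(wide);
    if (written == (size_t)-1)
        return (size_t)-1;
    dst[written] = '\0';
    return written;
}
```

On platforms where wchar_t is 16 bits wide, the code units are passed through unchanged and surrogate handling is left to the C library.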