Feature Request / Improvement
Summary
PyIceberg's HiveCatalog supports Kerberos (GSSAPI) authentication via hive.kerberos-authentication, but does not support DIGEST-MD5 SASL authentication with Hadoop delegation tokens. In many production Hadoop environments, pods/containers authenticate to HMS using delegation tokens (read from $HADOOP_TOKEN_FILE_LOCATION) rather than Kerberos keytabs. This means PyIceberg's Hive catalog cannot be used in these environments without building a custom client.
Proposed Enhancement
Extend _HiveClient to support DIGEST-MD5 delegation token auth:
- Add a new config property (e.g. hive.metastore.authentication=DIGEST-MD5)
- When DIGEST-MD5 is configured, read credentials from $HADOOP_TOKEN_FILE_LOCATION (Hadoop Writable credentials format)
- Use TSaslClientTransport with mechanism=DIGEST-MD5 and the extracted token identifier/password
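
To make the token-reading step concrete, here is a minimal sketch of parsing a Hadoop token file in the Writable layout. It is not existing PyIceberg code: the names DelegationToken, read_vint, parse_token_file and sasl_credentials are hypothetical, and the layout (HDTS magic, a version byte of 0, then vint-prefixed fields) and the Base64 encoding of identifier and password for SASL are assumptions based on how Hadoop's Credentials and SaslRpcServer classes behave, so they should be checked against the Hadoop version in use.

```python
import base64
import io
import struct
from dataclasses import dataclass


@dataclass
class DelegationToken:
    """Hypothetical container for one token entry from the credentials file."""
    alias: str
    identifier: bytes
    password: bytes
    kind: str
    service: str


def _read_exact(stream, n):
    data = stream.read(n)
    if len(data) != n:
        raise EOFError("token file ended unexpectedly")
    return data


def read_vint(stream):
    """Decode a Hadoop variable-length integer (WritableUtils.readVInt)."""
    first = struct.unpack("b", _read_exact(stream, 1))[0]  # signed byte
    if first >= -112:
        return first  # small values are stored directly in one byte
    # Otherwise the first byte encodes the sign and the total size.
    size = (-119 - first) if first < -120 else (-111 - first)
    value = int.from_bytes(_read_exact(stream, size - 1), "big")
    return ~value if first < -120 else value


def _read_bytes(stream):
    return _read_exact(stream, read_vint(stream))


def parse_token_file(data):
    """Parse the assumed Writable layout: magic, version, then the token map."""
    stream = io.BytesIO(data)
    if _read_exact(stream, 4) != b"HDTS":
        raise ValueError("not a Hadoop token storage file")
    version = _read_exact(stream, 1)[0]
    if version != 0:
        # Assumption: version 0 is the Writable format; others are not handled.
        raise ValueError(f"unsupported token storage version {version}")
    tokens = []
    for _ in range(read_vint(stream)):
        alias = _read_bytes(stream).decode("utf-8")
        identifier = _read_bytes(stream)
        password = _read_bytes(stream)
        kind = _read_bytes(stream).decode("utf-8")
        service = _read_bytes(stream).decode("utf-8")
        tokens.append(DelegationToken(alias, identifier, password, kind, service))
    return tokens  # secret keys that follow the token map are ignored here


def sasl_credentials(token):
    """Base64-encode identifier and password (assumed Hadoop SASL convention)."""
    username = base64.b64encode(token.identifier).decode("ascii")
    password = base64.b64encode(token.password).decode("ascii")
    return username, password
```

In practice the bytes would come from opening the path in $HADOOP_TOKEN_FILE_LOCATION in binary mode and selecting the token whose kind or service matches the metastore. The resulting username/password pair would then presumably be passed to TSaslClientTransport with mechanism DIGEST-MD5; whether the underlying SASL library accepts them in that form is untested here.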