
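Problems encountered connecting to Hive with impala (the impyla package) in Python 3.6

These are some of the problems I ran into when connecting to Hive from Python 3 with impala. For the errors, I mainly referred to: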

https://blog.csdn.net/Xiblade/article/details/82318294

https://blog.csdn.net/wx0628/article/details/86550582

https://blog.csdn.net/woay2008/article/details/79905627

The code is simple:

from impala.dbapi import connect
# Note: auth_mechanism is required here, but database is optional
conn = connect(host='172.26.xxx.xxx', port=10000, auth_mechanism='PLAIN')
cur = conn.cursor()

cur.execute('SHOW DATABASES')
print(cur.fetchall())
cur.execute('SHOW Tables')
print(cur.fetchall())

           

Packages to install:

pip install pure-sasl

pip install thrift_sasl==0.2.1 --no-deps

pip install thrift==0.9.3

pip install impyla

1. Error when installing impyla

error: Microsoft Visual C++ 14.0 is required. Get it with "Microsoft Visual C++ Build Tools": 

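Some people suggest installing from the binary wheel instead: pip install impyla-0.14.1-py3-none-any.whl. I tried that and it did not work; the same error came back.

Solution:

Install Visual Studio 2015 (the C: drive needs at least 6 GB of free space; there is no way around it, it has to be installed).

The download link is here.

In the installer, select only "Common Tools for Visual C++ 2015".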


With Visual Studio installed, run the impyla installation again; this time it succeeds.

Error:

ThriftParserError: ThriftPy does not support generating module with path in protocol 'c'

Solution:

Open \Lib\site-packages\thriftpy\parser\parser.py and find this block:

if url_scheme == '':
    with open(path) as fh:
        data = fh.read()
elif url_scheme in ('http', 'https'):
    data = urlopen(path).read()
else:
    raise ThriftParserError('ThriftPy does not support generating module '
                            'with path in protocol \'{}\''.format(
                                url_scheme))
           

Change it to:

if url_scheme == '':
    with open(path) as fh:
        data = fh.read()
elif url_scheme in ('c', 'd', 'e', 'f'):
    # Windows drive letters (C:, D:, ...) parsed as a URL scheme: treat as local files
    with open(path) as fh:
        data = fh.read()
elif url_scheme in ('http', 'https'):
    data = urlopen(path).read()
else:
    raise ThriftParserError('ThriftPy does not support generating module '
                            'with path in protocol \'{}\''.format(
                                url_scheme))
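
The patch works because the drive letter of a Windows path is read as a URL scheme, which is why the error message names protocol 'c'. The snippet below is a minimal sketch of that behaviour using only the standard library. It assumes thriftpy derives `url_scheme` with `urlparse`, as the variable name suggests; the helper `looks_like_local_path` and the sample path are hypothetical and not part of thriftpy.

```python
from urllib.parse import urlparse


def looks_like_local_path(path):
    """Return True when the parsed 'scheme' is empty or a single drive letter.

    Hypothetical helper: a single-letter scheme is assumed to be a
    Windows drive letter rather than a real URL protocol.
    """
    scheme = urlparse(path).scheme
    return scheme == '' or (len(scheme) == 1 and scheme.isalpha())


# Sample path for illustration only
windows_path = r'C:\Python36\Lib\site-packages\impala\thrift\ExecStats.thrift'

print(urlparse(windows_path).scheme)                              # c
print(looks_like_local_path(windows_path))                        # True
print(looks_like_local_path('https://example.com/foo.thrift'))    # False
```

Checking for any single-letter scheme, rather than listing 'c' through 'f', would also cover installations on other drives, at the cost of a slightly broader match.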
           

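Error:

thriftpy.transport.TTransportException: TTransportException(type=1, message="Could not start SASL: b'Error in sasl_client_start (-4) SASL(-4): no mechanism available: Unable to find a callback: 2'")

This one took me a long time; I remember hitting the same error when installing pyhive.

The main cause is a conflict between the sasl and pure-sasl packages. In that situation, simply uninstalling the sasl package fixes it.

Solution: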

pip uninstall SASL
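
Before uninstalling, it can help to confirm that both packages really are present in the active environment. The following is a rough sketch using the standard library; the helper `find_installed` is hypothetical, and the import names `sasl`, `puresasl` and `thrift_sasl` are assumed to be the module names these packages install.

```python
import importlib.util


def find_installed(module_names):
    """Map each top-level module name to True if it can be found for import."""
    return {name: importlib.util.find_spec(name) is not None
            for name in module_names}


# Assumed import names for sasl, pure-sasl and thrift_sasl
status = find_installed(['sasl', 'puresasl', 'thrift_sasl'])
print(status)

if status['sasl'] and status['puresasl']:
    print('Both sasl and pure-sasl are installed; consider: pip uninstall sasl')
```

Run it with the same interpreter that raises the error, since each virtual environment has its own site-packages.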
           

Error: TypeError: can't concat str to bytes

Solution:

Go to the last entry in the traceback, line 94 of __init__.py (watch the code indentation):

header = struct.pack(">BI", status, len(body))
self._trans.write(header + body)
           

Change it to:

header = struct.pack(">BI", status, len(body))
if isinstance(body, str):
    body = body.encode()
self._trans.write(header + body)
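
The reason for the error is that `struct.pack` returns `bytes`, and in Python 3 `bytes` cannot be concatenated with `str`. The standalone sketch below reproduces the problem and the fix outside the library; the function `build_frame` and the sample values are hypothetical and only mirror the two patched lines.

```python
import struct


def build_frame(status, body):
    """Return a 1-byte status + 4-byte big-endian length header, then the body."""
    if isinstance(body, str):
        # Encode first so that len(body) counts bytes, not characters
        body = body.encode()
    header = struct.pack(">BI", status, len(body))
    return header + body


# Without the encode step, bytes + str raises TypeError
try:
    struct.pack(">BI", 1, 5) + 'PLAIN'
except TypeError as exc:
    print('TypeError:', exc)

print(build_frame(1, 'PLAIN'))   # b'\x01\x00\x00\x00\x05PLAIN'
```

Unlike the patch above, this sketch encodes the body before computing the header, so the length field stays correct for non-ASCII text as well.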
           

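Be careful with the indentation when editing this code; otherwise you will end up with errors that are hard to make sense of.

At this point, you can access the Hive database.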