Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Not A Problem
Affects Version/s: 0.5.0
Fix Version/s: None
Component/s: None
Environment: AWS Lambda
Description
When using pyarrow in an AWS Lambda function like this:
import pyarrow as pa
import pyarrow.parquet as pq
import pandas as pd

def lambda_handler(event, context):
    df = pd.DataFrame([data])  # data is dictionary
    table = pa.Table.from_pandas(df)
    pq.write_table(table, 'tmp/test.parquet', compression='snappy')
    table = pq.read_table('tmp/test.parquet')
    table.to_pandas()
    print(table)
    return "Success"
the following module initialization error occurs:
module initialization error: [Errno 2] No such file or directory: '/var/task/__pycache__/_cffi__x762f05ffx6bf5342b.c'
The deployment package was prepared by running the following commands:
virtualenv nameofenv
source nameofenv/bin/activate
pip install pyarrow
sudo apt-get install libsnappy-dev
pip install python-snappy
pip install pandas
The files from the site-packages directory are then zipped together with the Lambda function.
This does not seem obviously related to pyarrow. What happens if you exclude pyarrow from the deployment package?
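One way to act on that question without rebuilding the package several times is to import each bundled module separately and record which one fails. The sketch below is only an illustration: `probe_imports` and `CANDIDATES` are hypothetical names that do not appear in the report, and it assumes `snappy` is the import name installed by python-snappy.

```python
import importlib
import traceback

# Import names of the packages listed in the deployment steps above.
# "snappy" is assumed to be the module installed by python-snappy.
CANDIDATES = ("pandas", "snappy", "pyarrow", "pyarrow.parquet")


def probe_imports(names=CANDIDATES):
    """Import each module on its own and record "ok" or the traceback text."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = "ok"
        except Exception:
            # Keep the full traceback so the failing package is visible.
            results[name] = traceback.format_exc()
    return results


def lambda_handler(event, context):
    # Imports happen inside the handler, so a failure is returned
    # in the response instead of aborting module initialization.
    return probe_imports()
```

If only one entry comes back with a traceback mentioning `_cffi__`, that module is the one to investigate, independent of pyarrow.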