Apache Arrow / ARROW-11427

[C++] Arrow uses AVX512 instructions even when not supported by the OS


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0.0
    • Component/s: C++, Python
    • Environment: Windows Server 2012 Datacenter, Azure VM (D2_v2), Intel Xeon Platinum 8171M

    Description

      Update: Azure D2_v2 VMs no longer spin up with a Xeon Platinum 8171M, so I'm unable to test it with other OSes. Azure VMs are assigned different CPUs of the same "class" depending on availability. I will try my "luck" later.

      VMs with a Xeon Platinum 8171M running on Azure (D2_v2) start crashing after upgrading from pyarrow 2.0 to pyarrow 3.0. However, this only happens when reading parquet files larger than 4096 bits!?

      Windows terminates Python with exit code 255 and logs the following application error:

      Faulting application name: python.exe, version: 3.8.3150.1013, time stamp: 0x5ebc7702
      Faulting module name: arrow.dll, version: 0.0.0.0, time stamp: 0x60060ce3
      Exception code: 0xc000001d
      Fault offset: 0x000000000047aadc
      Faulting process id: 0x1b10
      Faulting application start time: 0x01d6f4a43dca3c14
      Faulting application path: D:\SvcFab\_App\SomeApp.FabricType_App32\SomeApp.Fabric.Executor.ProcessActorPkg.Code.1.0.218-prod\Python38\python.exe
      Faulting module path: D:\SvcFab\_App\SomeApp.FabricType_App32\temp\Executions\50cfffe8-9250-4ac7-8ba8-08d8c2bb3edf\.venv\lib\site-packages\pyarrow\arrow.dll

      Exception code 0xc000001d is STATUS_ILLEGAL_INSTRUCTION, consistent with the process executing an instruction (here, AVX-512) that the OS/CPU combination does not support.

      Tested on:

      OS                             | Xeon Platinum 8171M or 8272CL | Other CPUs
      Windows Server 2012 Datacenter | Fail                          | OK
      Windows Server 2016 Datacenter | OK                            | OK
      Windows Server 2019 Datacenter |                               |
      Windows 10                     |                               | OK
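      As the issue title says, the CPU supporting AVX-512 is not enough: the OS must also enable AVX-512 state saving, and older Windows releases (such as Server 2012) do not. On Windows this can be probed with the Win32 call `IsProcessorFeaturePresent`. A minimal sketch; the constant value `41` for `PF_AVX512F_INSTRUCTIONS_AVAILABLE` comes from winnt.h, and the non-Windows branch is only a placeholder:

      ```python
      import ctypes
      import sys

      # Win32 constant from winnt.h (assumed value; verify against your SDK).
      PF_AVX512F_INSTRUCTIONS_AVAILABLE = 41

      def os_supports_avx512() -> bool:
          """Return True if the OS reports AVX-512 as usable."""
          if sys.platform != "win32":
              # Placeholder for non-Windows platforms; a real check would
              # read XCR0 via CPUID/XGETBV instead.
              return False
          return bool(ctypes.windll.kernel32.IsProcessorFeaturePresent(
              PF_AVX512F_INSTRUCTIONS_AVAILABLE))

      print(os_supports_avx512())
      ```

      On a Windows Server 2012 VM this would be expected to return False even on an AVX-512-capable Xeon, matching the Fail column above.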

       

      Example code (Python): 

      import numpy as np
      import pandas as pd
      
      data_len = 2**5
      data = pd.DataFrame(
          {"values": np.arange(0., float(data_len), dtype=float)},
          index=np.arange(0, data_len, dtype=int)
      )
      
      data.to_parquet("test.parquet")
      data = pd.read_parquet("test.parquet", engine="pyarrow")  # fails here!
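
      As a possible stop-gap on affected machines, Arrow's C++ runtime reportedly consults an `ARROW_USER_SIMD_LEVEL` environment variable to cap the runtime-dispatched SIMD level; treat the variable name and accepted values as assumptions for your exact build, and note it must be set before pyarrow is loaded:

      ```python
      import os

      # Assumed workaround: cap Arrow's runtime SIMD dispatch so AVX-512
      # code paths are not selected. Accepted values reportedly include
      # "NONE", "SSE4_2", "AVX2". Must be set BEFORE importing pyarrow.
      os.environ["ARROW_USER_SIMD_LEVEL"] = "AVX2"

      # Only now import the pyarrow-backed reader, e.g.:
      # import pandas as pd
      # data = pd.read_parquet("test.parquet", engine="pyarrow")
      ```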
      


            People

              Assignee: Antoine Pitrou (apitrou)
              Reporter: Ali Cetin (ali.cetin)
              Votes: 0
              Watchers: 5


                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 1h 50m