Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Invalid
- Affects Version: 2.4.5
- None
- None
- OS: Windows
- SparkSession: spark = SparkSession.builder.appName("annonces_organiques").getOrCreate()
Description
I have a PipelineModel saved on my computer that I can't load using PipelineModel.load(path).
When I run my code on a Databricks cluster, it works. There, path is the path to my model saved on DBFS, accessible via a mount point: path = "/dbfs/path/to/my/model".
However, on my machine, calling PipelineModel.load("C:\\Users\\path\\to\\my model") throws a ValueError("RDD is empty").
Here is how the model is saved on my computer:
pipeline.txt
\---model
    +---metadata
    |       part-00000
    |       _SUCCESS
    |
    \---stages
        +---0_CountVectorizer_b92625354bf7
        |   +---data
        |   |       part-00000-tid-9156766819779394023-5cf6aecb-8959-48b3-be24-65bfa0543465-62-1-c000.snappy.parquet
        |   |       _committed_9156766819779394023
        |   |       _started_9156766819779394023
        |   |       _SUCCESS
        |   |
        |   \---metadata
        |           part-00000
        |           _SUCCESS
        |
        \---1_LinearSVC_108fa01daf43
            +---data
            |       part-00000-tid-4403060754466700849-27841dd9-de88-4015-9dfa-7854c2a15f15-65-1-c000.snappy.parquet
            |       _committed_4403060754466700849
            |       _started_4403060754466700849
            |       _SUCCESS
            |
            \---metadata
                    part-00000
                    _SUCCESS
(I just downloaded the model from my DataLake to my computer)
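One thing that may be worth checking on the downloaded copy: in PySpark, the model loader reads the text files under each metadata directory and takes the first record, and RDD.first() raises ValueError("RDD is empty") when there is nothing to read. The sketch below uses only the standard library to list metadata directories that contain no non-empty part file. It is a diagnostic aid, not a confirmed explanation of this report; the function name check_metadata is hypothetical.

```python
import os


def check_metadata(model_dir):
    """Return the metadata directories under model_dir that have no non-empty part-* file.

    Hypothetical helper: it only inspects the local directory layout and
    does not involve Spark at all.
    """
    problems = []
    for root, _dirs, files in os.walk(model_dir):
        # Only the "metadata" directories (model root and each stage) matter here.
        if os.path.basename(root) != "metadata":
            continue
        part_files = [name for name in files if name.startswith("part-")]
        has_content = any(
            os.path.getsize(os.path.join(root, name)) > 0 for name in part_files
        )
        if not has_content:
            problems.append(root)
    return sorted(problems)
```

An empty list means every metadata directory has at least one part file with content; any path returned points at a directory whose metadata was lost or emptied, for example during the download.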
How can I load this model when running my code locally?
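One option to try is passing an explicit file: URI instead of a bare Windows path, so that the path cannot be resolved against a different default filesystem. Whether that is the cause here is an assumption; the report does not confirm it. In the sketch below, to_file_uri and load_local_model are hypothetical helper names, and the pyspark import is deferred so the URI conversion can be tried without Spark installed.

```python
from pathlib import PureWindowsPath


def to_file_uri(windows_path):
    """Convert an absolute Windows path to a file: URI using forward slashes."""
    return PureWindowsPath(windows_path).as_uri()


def load_local_model(windows_path):
    """Load a saved PipelineModel from a local Windows path via a file: URI.

    Assumes pyspark is installed and a SparkSession can be created locally.
    """
    from pyspark.ml import PipelineModel  # deferred import: only needed for loading

    return PipelineModel.load(to_file_uri(windows_path))
```

For example, to_file_uri("C:\\Users\\someone\\model") gives "file:///C:/Users/someone/model" (an illustrative path, not the one from this report).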