[SPARK-3297] [Spark SQL][UI] SchemaRDD toString with many columns messes up Storage tab display - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: 1.0.2
Fix Version/s: 1.1.1, 1.2.0
Component/s: SQL, Web UI
Labels:
- newbie

Description

When a SchemaRDD with many columns (57 in this example) is cached using sqlContext.cacheTable, the Storage tab of the driver Web UI renders badly: the long string representation of the SchemaRDD makes the first column much wider than the others, and in fact much wider than the browser window. It would be nice to restrict the first column to, say, 50% of the width of the browser window, with some minimum.

For example this is the SchemaRDD text for my table:

RDD Storage Info for ExistingRdd ActionGeo_ADM1Code#198,ActionGeo_CountryCode#199,ActionGeo_FeatureID#200,ActionGeo_FullName#201,ActionGeo_Lat#202,ActionGeo_Long#203,ActionGeo_Type#204,Actor1Code#205,Actor1CountryCode#206,Actor1EthnicCode#207,Actor1Geo_ADM1Code#208,Actor1Geo_CountryCode#209,Actor1Geo_FeatureID#210,Actor1Geo_FullName#211,Actor1Geo_Lat#212,Actor1Geo_Long#213,Actor1Geo_Type#214,Actor1KnownGroupCode#215,Actor1Name#216,Actor1Religion1Code#217,Actor1Religion2Code#218,Actor1Type1Code#219,Actor1Type2Code#220,Actor1Type3Code#221,Actor2Code#222,Actor2CountryCode#223,Actor2EthnicCode#224,Actor2Geo_ADM1Code#225,Actor2Geo_CountryCode#226,Actor2Geo_FeatureID#227,Actor2Geo_FullName#228,Actor2Geo_Lat#229,Actor2Geo_Long#230,Actor2Geo_Type#231,Actor2KnownGroupCode#232,Actor2Name#233,Actor2Religion1Code#234,Actor2Religion2Code#235,Actor2Type1Code#236,Actor2Type2Code#237,Actor2Type3Code#238,AvgTone#239,DATEADDED#240,Day#241,EventBaseCode#242,EventCode#243,EventId#244,EventRootCode#245,FractionDate#246,GoldsteinScale#247,IsRootEvent#248,MonthYear#249,NumArticles#250,NumMentions#251,NumSources#252,QuadClass#253,Year#254, MappedRDD[200]

I would personally love to fix the toString method so that it does not print every column but cuts the list off after the first few. This would improve the printout in the Spark Shell as well. For example:

ActionGeo_ADM1Code#198,ActionGeo_CountryCode#199,ActionGeo_FeatureID#200,ActionGeo_FullName#201,ActionGeo_Lat#202 .... and 52 more columns
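
The truncation proposed above could be sketched roughly as follows. This is only an illustration in Python, not the change that resolved this issue and not Spark code; the function name `summarize_columns` and the parameter `max_shown` are hypothetical, and the output format simply mirrors the example line above.

```python
def summarize_columns(columns, max_shown=5):
    """Join column names with commas, cutting the list off after max_shown.

    Hypothetical helper: 'columns' is a list of attribute strings such as
    'ActionGeo_Lat#202'; 'max_shown' is an assumed cutoff, not a Spark setting.
    """
    if max_shown < 1:
        raise ValueError("max_shown must be at least 1")

    shown = ",".join(columns[:max_shown])
    hidden = len(columns) - max_shown
    if hidden <= 0:
        # Short schemas are printed in full, exactly as before.
        return shown

    noun = "column" if hidden == 1 else "columns"
    return f"{shown} .... and {hidden} more {noun}"
```

For instance, `summarize_columns(["a#1", "b#2", "c#3"], max_shown=2)` returns `a#1,b#2 .... and 1 more column`, while a list that fits within the cutoff is returned unchanged.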

Issue Links

relates to: SPARK-3827 Very long RDD names are not rendered properly in web UI (Resolved)

People

Assignee: Hossein Falaki

Reporter: Evan Chan

Votes: 0

Watchers: 2

Dates

Created: 29/Aug/14 07:32

Updated: 07/Oct/14 18:48

Resolved: 07/Oct/14 18:48