Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.5
    • Fix Version/s: 3.6, 4.0-ALPHA
    • Component/s: Schema and Analysis
    • Labels:
      None

      Description

      Provides support for monetary values to Solr/Lucene with query-time currency conversion. The following features are supported:

      • Point queries
      • Range queries
      • Sorting
      • Currency parsing by either currency code or symbol.
      • Symmetric & Asymmetric exchange rates. (Asymmetric exchange rates are useful if there are fees associated with exchanging the currency.)

      At indexing time, money fields can be indexed in a native currency. For example, if a product on an e-commerce site is listed in Euros, indexing the price field as "1000,EUR" will index it appropriately. By altering the currency.xml file, the sorting and querying against Solr can take into account fluctuations in currency exchange rates without having to re-index the documents.

      The new "money" field type is a polyfield which indexes two fields, one which contains the amount of the value and another which contains the currency code or symbol. The currency metadata (names, symbols, codes, and exchange rates) are expected to be in an xml file which is pointed to by the field type declaration in the schema.xml.
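      As a rough illustration of the two pieces described above, a schema declaration and exchange-rate file could look like the sketch below. The element and attribute names here are illustrative (see the patch and the wiki page for the actual format), and the rates are made up:

```xml
<!-- schema.xml: a "money" field type pointing at the currency metadata file -->
<fieldType name="money" class="solr.MoneyType" currencyConfig="currency.xml" />
<field name="price" type="money" indexed="true" stored="true" />

<!-- currency.xml: exchange rates. Listing each direction separately is what
     allows the asymmetric rates mentioned above. -->
<currencyConfig version="1.0">
  <rates>
    <rate from="USD" to="EUR" rate="0.87" />
    <rate from="EUR" to="USD" rate="1.17" /> <!-- asymmetric: can include a fee -->
  </rates>
</currencyConfig>
```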

      The current patch is factored such that the Money utility functions and configuration metadata live in Lucene (see MoneyUtil and CurrencyConfig), while the MoneyType and MoneyValueSource live in Solr. This was meant to mirror the work being done on the spatial field types.

      This patch will be getting used to power the international search capabilities of the search engine at Etsy.

      Also see WIKI page: http://wiki.apache.org/solr/MoneyFieldType

      1. SOLR-2202-solr-1.patch
        17 kB
        Greg Fodor
      2. SOLR-2202-lucene-1.patch
        24 kB
        Greg Fodor
      3. SOLR-2202-solr-2.patch
        18 kB
        Greg Fodor
      4. SOLR-2022-solr-3.patch
        29 kB
        Greg Fodor
      5. SOLR-2202-solr-4.patch
        30 kB
        Greg Fodor
      6. SOLR-2202-solr-5.patch
        30 kB
        Greg Fodor
      7. SOLR-2202-solr-6.patch
        37 kB
        Greg Fodor
      8. SOLR-2202-solr-7.patch
        40 kB
        Greg Fodor
      9. SOLR-2202-solr-8.patch
        40 kB
        Greg Fodor
      10. SOLR-2202-solr-9.patch
        33 kB
        Greg Fodor
      11. SOLR-2202.patch
        33 kB
        Greg Fodor
      12. SOLR-2202.patch
        41 kB
        Jan Høydahl
      13. SOLR-2202.patch
        41 kB
        Jan Høydahl
      14. SOLR-2202.patch
        41 kB
        Greg Fodor
      15. SOLR-2202-solr-10.patch
        41 kB
        Andrew Morrison
      16. SOLR-2202.patch
        43 kB
        Jan Høydahl
      17. SOLR-2202.patch
        52 kB
        Jan Høydahl
      18. SOLR-2202.patch
        56 kB
        Jan Høydahl
      19. SOLR-2202.patch
        58 kB
        Jan Høydahl
      20. SOLR-2202-fix-NPE-if-no-tlong-fieldType.patch
        2 kB
        Koji Sekiguchi
      21. SOLR-2202-no-fieldtype-deps.patch
        5 kB
        Jan Høydahl
      22. SOLR-2202-3x-stabilize-provider-interface.patch
        8 kB
        Jan Høydahl

        Issue Links

          Activity

          Greg Fodor added a comment -

          Initial patch.

          Greg Fodor added a comment -

          Forgot to include currency.xml.

          Robert Muir added a comment -

          Hello, taking a look at the code: is there any specific reason you didn't use NumberFormat/Currency in Java for parsing, output, symbols, etc.?

          I think this would make it easier to support more locales.

          Some examples below, using NumberFormat.getCurrencyInstance() with a specified currency, specifying locale:

          thai baht:

          en_US: THB1,234.50
          th_TH: ฿1,234.50
          th_TH_TH: ฿๑,๒๓๔.๕๐
          

          euro:

          de_DE: 1.234,50 €
          en_US: EUR1,234.50
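          Those renderings can be reproduced with plain JDK classes. A minimal sketch follows; the exact symbols emitted vary by JDK/CLDR version (newer JDKs may print "฿" or "€" where older ones printed the ISO code), so it round-trips through the same format instance rather than assuming one particular rendering:

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Currency;
import java.util.Locale;

public class CurrencyFormatDemo {

    // Currency format for an ISO 4217 code under a given locale.
    static NumberFormat currencyFormat(String isoCode, Locale locale) {
        NumberFormat nf = NumberFormat.getCurrencyInstance(locale);
        nf.setCurrency(Currency.getInstance(isoCode));
        return nf;
    }

    // Format an amount and parse it back with the same format instance;
    // this sidesteps JDK/CLDR differences in the exact symbol emitted.
    static double roundTrip(double amount, String isoCode, Locale locale) {
        NumberFormat nf = currencyFormat(isoCode, locale);
        try {
            return nf.parse(nf.format(amount)).doubleValue();
        } catch (ParseException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(currencyFormat("THB", Locale.US).format(1234.50));
        System.out.println(currencyFormat("EUR", Locale.GERMANY).format(1234.50));
        System.out.println(roundTrip(1234.50, "THB", Locale.US)); // 1234.5
    }
}
```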
          
          Uwe Schindler added a comment - - edited

          The other question is: why don't you use NumericField (in Solr, the TrieField type) instead of saving the values as plain numbers in the index?

          In general it's wrong to use float/double as the currency-backing type, as you get rounding problems. To index/store the fields in Lucene/Solr or any database, you have to use fixed point, e.g. a TrieField instance storing the "US-cent" value.

          This would enable range queries without the field cache!
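          The rounding problem Uwe refers to is easy to demonstrate: binary floating point cannot represent most decimal fractions exactly, so arithmetic on prices stored as doubles drifts, while integer "cent" values stay exact. A small sketch (not the patch's code):

```java
public class FixedPointDemo {

    // Sum n prices of 0.10 currency units as doubles: accumulates error.
    static double sumAsDouble(int n) {
        double total = 0.0;
        for (int i = 0; i < n; i++) total += 0.10;
        return total;
    }

    // The same sum in integer cents: exact.
    static long sumAsCents(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) total += 10; // 0.10 == 10 cents
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumAsDouble(3)); // 0.30000000000000004, not 0.3
        System.out.println(sumAsCents(3));  // 30
    }
}
```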

          Uwe Schindler added a comment -

          Here is some info on why it's wrong to use double/float: http://download.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html

          Robert Muir added a comment -

          Here is some info on why it's wrong to use double/float: http://download.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html

          Also, you can use Currency.getDefaultFractionDigits to fix this.

          For example, the default number of fraction digits for the Euro is 2, while for the Japanese Yen it's 0. In the case of pseudo-currencies, such as IMF Special Drawing Rights, -1 is returned.
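          A sketch of how getDefaultFractionDigits drives the conversion from a decimal amount to the currency's smallest unit (BigDecimal is my choice of intermediate here, not something the patch prescribes):

```java
import java.math.BigDecimal;
import java.util.Currency;

public class MinorUnits {

    // Convert a decimal amount string to the currency's smallest unit,
    // e.g. "12.34" EUR -> 1234 euro cents, "1250" JPY -> 1250 yen.
    // longValueExact() throws ArithmeticException if the amount carries
    // more precision than the currency allows (e.g. "12.345" EUR).
    static long toMinorUnits(String amount, String isoCode) {
        int digits = Currency.getInstance(isoCode).getDefaultFractionDigits();
        return new BigDecimal(amount).movePointRight(digits).longValueExact();
    }

    public static void main(String[] args) {
        System.out.println(toMinorUnits("12.34", "EUR")); // 1234
        System.out.println(toMinorUnits("1250", "JPY"));  // 1250
    }
}
```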

          Greg Fodor added a comment -

          Hey Uwe & Robert, thank you for your feedback. I will look into better integration with the native Currency classes, that makes a lot of sense. As far as the numerical indexing goes, I might need some help.

          First, I'm guessing the necessary change is to switch the "amount" subfield from a "double" to a "tdouble" (TrieDouble) field type. How exactly does this help with rounding errors? I'm not that familiar with the Trie types, but since the values are coerced to doubles through the ValueSource, it seems there is always a risk of rounding error.

          Additionally, how does the trie field avoid hitting the field cache? I am implementing a custom value source, which applies the currency conversion by iterating over the ValueSources for the subfields.

          Finally, you mention saving the values as "plain numbers in the index". I assumed this is what I was already doing by constructing the subfields and storing the values in the "amount" field, whose field type is "double". Can you explain what steps I need to take to be sure these values are stored as numbers in the index?

          Thanks!

          Robert Muir added a comment -

          I will look into better integration with the native Currency classes, that makes a lot of sense.

          Great! Just for your own reference, you get even more powerful behavior with ICU's currency support.

          For example, if you use NumberFormat.getInstance(new ULocale("en", "US"), NumberFormat.PLURALCURRENCYSTYLE),
          then setCurrency to Currency.getInstance("USD"), it will output "1,234.50 US dollars".

          So, with ICU you don't have to maintain the plural data you have (or worry about the plural formatting, etc.).

          Greg Fodor added a comment - - edited

          A few questions, now that I've done a bit more research and thinking:

          • First, currency parsing in Java appears to be locale-dependent (which obviously makes sense). The concern here is that the locale of the end user performing queries is likely not the same as the locale of the search engine. Is there currently a standard mechanism in Solr to acquire the user's locale? What do we do for other internationalized components?
          • NumberFormat fails to parse "10.00USD" or "10.00 USD", relying instead on the symbol ("$10.00"). This seems like a limitation, since using the currency code as a suffix is a locale-independent way of specifying a monetary value, making indexing code easy to write (simply append the document's currency code to the value). It may well be a good idea to standardize on this approach for indexing and avoid all the locale-specific issues that come up with currency symbols.
          • NumberFormat parsing does not yield back the currency, just the value. It seems the currency itself still needs to be extracted somehow. Is there a built-in mechanism to do this? Currently the patch iterates over all currencies, attempting to extract the symbol or code from the value.
          • How important is it that users have control over the currencies table? The ability to define fake currencies was quite useful for testing (as is done in the example currency.xml file); if I changed the implementation to use Java's currency table, this might be a limitation if non-testing use cases exist.
          • I'd like to know in more detail what rounding-related errors, if any, I need to be concerned with. You'll notice the range query in the patch applies an EPSILON at the edges to avoid floating-point equality issues, and the point query actually executes a range query. Are there additional problems I need to address? It seems there will always be some margin of error when exchange rates are applied, since this requires floating-point multiplication of indexed values at execution time.
          • Looking further, I'm not really sure I understand how the TrieField can benefit me here. It seems an entire iteration through the ValueSource is necessary for range queries, as conversion rates may dictate that documents with the minimum and maximum amount values need to be visited.
          • Right now the name, plural name, etc. are unused. If they end up being needed, it will make sense to remove them or to obtain them via Java's native APIs.

          Thanks again for reviewing this patch!

          Uwe Schindler added a comment -

          Looking further I'm not really sure I understand how the TrieField can benefit me here. It seems that an entire iteration through the ValueSource is necessary for range queries, as conversion rates may dictate that the minimum and maximum amount value documents need to be visited.

          Very easy: don't do the transformation for each value in the term index / per document. Just convert the currency value from the query string into the index's local currency before building the trie range.

          First, I'm guessing the necessary change will be to change the "amount" subfield from a double to a tdouble field type. (TrieDouble) How exactly does this help with rounding errors? I'm not that familiar with the Trie types, but it seems that since the values are coerced through the ValueSource to doubles there is always a risk of rounding error.

          I said you should remove double/float completely and save the value as a "long" or "tlong" using Currency.getDefaultFractionDigits (e.g. convert euros into eurocents, because for EUR the fraction digits are 2).

          Hide
          Robert Muir added a comment - - edited

          Hi Greg, these are excellent questions... I'll reply only to the localization ones and let Uwe or others talk about the Trie stuff.

          First, currency parsing in Java appears locale-dependent (which obviously makes sense.) The concern here is that the locale of the end-user performing queries is likely not the same as the locale of the search engine. Is there currently a standard mechanism in Solr to acquire the user's locale? What do we do for other internationalized components?

          Well, in general components are internationalized, but this is actually a localization problem. Usually the Solr service is not handling the end user's request directly, so it's best if the Locale is somehow a parameter and the default Locale is never used at all. In other words, it's up to you to figure out how to determine which Locale to use, and Solr would just respect that.

          NumberFormat parsing fails to parse "10.00USD", or "10.00 USD", instead relying upon the symbol. ("$10.00"). This seems like a limitation since generally using the currency code as suffix is a locale-independent way of specifying a monetary value, making indexing code easy to write (simply append the currency code for the document to the value). It may very well be a good idea to simply standardize on this approach for the purposes of indexing, and avoid all the locale-specific issues that come up regarding currency symbols.

          It's not really a limitation; it depends on the NumberFormat in use. The one you used for parsing is just that Locale's default format from getCurrencyInstance, but you can supply your own DecimalFormat pattern too. This is a printf/scanf-like pattern that can contain special characters, particularly ¤ (\u00A4): "Currency sign, replaced by currency symbol. If doubled, replaced by international currency symbol. If present in a pattern, the monetary decimal separator is used instead of the decimal separator."

          Ideally here, you could allow this pattern to be a parameter.
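          A sketch of that idea: an explicit DecimalFormat pattern using the doubled currency sign ("\u00A4\u00A4", which formats and parses the international ISO code rather than the localized symbol) handles "10.00 USD"-style values regardless of the locale's default layout. The helper names here are illustrative:

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParseException;
import java.util.Currency;
import java.util.Locale;

public class PatternParam {

    // Pattern with "\u00A4\u00A4": the doubled currency sign stands for the
    // international (ISO) currency symbol, e.g. "USD" instead of "$".
    static DecimalFormat isoSuffixFormat(String isoCode, Locale locale) {
        DecimalFormat df = new DecimalFormat("#,##0.00 \u00A4\u00A4",
                DecimalFormatSymbols.getInstance(locale));
        df.setCurrency(Currency.getInstance(isoCode));
        return df;
    }

    static String render(double amount, String isoCode, Locale locale) {
        return isoSuffixFormat(isoCode, locale).format(amount);
    }

    static double parseAmount(String text, String isoCode, Locale locale) {
        try {
            return isoSuffixFormat(isoCode, locale).parse(text).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("unparseable: " + text, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render(1234.5, "USD", Locale.US));              // 1,234.50 USD
        System.out.println(parseAmount("1,234.50 USD", "USD", Locale.US)); // 1234.5
    }
}
```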

          The NumberFormat parsing does not yield back the currency, just the value. It seems the currency itself still needs to be extracted somehow. Is there a built in mechanism to do this? Currently the patch iterates over all currencies attempting to extract the symbol or code from the value.

          Right, NumberFormat parses the actual number. Really, it's best if the currency ISO code (e.g. USD) itself is supplied as a parameter, because these symbols are not unique; for example, $ is used for many currencies. I think this is what Solr should do. If the end-user application doesn't know the code somehow, it can use more sophisticated mechanisms to "guess" it; in particular, things like ICU's CurrencyMetaInfo let you supply a filter based on things like region and timeframe to get a list of the currencies used in that region at that time. But I think Solr should just take the ISO code as input.

          How important is it that users have control over the currencies table? It was quite useful to have the ability to define fake currencies for testing (as is done in the example currency.xml file), it seems that if I changed the implementation to use Java's currency table this might be a limitation if non-testing oriented use-cases exist.

          I don't think it's important to have fake currencies, but such things can be done with the Locale SPI, I think. We could just use real currencies for testing.

          Robert Muir added a comment -

          Sorry for editing my response above; I just wanted to be clearer that I think any guess-work should be outside the scope of Solr.

          Greg Fodor added a comment -

          Uwe: apologies if I'm missing something, but I'm still having a hard time understanding how this would work. If the user performs a query for 10.00EUR, it seems the appropriate trie range, in theory, would be the largest and smallest possible converted currency value, across all currencies. For example, if we assume currencies 1FOO = 100EUR, and 1BAR = 0.001EUR, then we'd have to do a trie range for the query 1EUR from 0.001 to 100, is this correct?

          If this is what you're proposing, I'm not sure it will be that big a win, given that there are exchange rates exceeding 100x (for example, USD -> JPY). Presumably, if you are indexing e-commerce products that range from, say, $10.00 to $1,000 USD, you'll likely be forced to scan the entire range of values for most queries. For example, a $25.00 USD query will need to scan from approximately 10.00 to 2500.00 to be sure no documents are missed (lower bound dictated by GBP or EUR, upper bound by JPY, for example).
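          The bound expansion described above can be sketched as follows; the rates are made-up illustrations, not real exchange rates:

```java
public class RangeBounds {

    // Given a query amount in the query currency, and the minimum/maximum
    // rate from the query currency into any indexed currency, the raw stored
    // values that could match lie in [amount * minRate, amount * maxRate].
    static double[] scanBounds(double amount, double minRate, double maxRate) {
        return new double[] { amount * minRate, amount * maxRate };
    }

    public static void main(String[] args) {
        // $25.00 query; illustrative rates: USD->GBP 0.4 (lower bound),
        // USD->JPY 100 (upper bound) -> must scan roughly 10.00 .. 2500.00.
        double[] b = scanBounds(25.00, 0.4, 100.0);
        System.out.println(b[0] + " .. " + b[1]);
    }
}
```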

          Robert: thanks for the clarifications. First, if no such mechanism exists, would it make sense to include the locale as one of the parameters to the query via the edismax parser? I think the specified locale can be useful for parsing the currency in cases where "$", for example, is ambiguous, and we should defer, as you said, to the ICU/Java parser. However, I think including the currency code as part of the value itself (particularly when indexing it as a field) is an important use case to support as well, since it makes indexing so much easier to implement. Is this what you mean by Solr taking the ISO code as input? For example: "price:10.00USD", where the code is part of the value. This is what the patch currently supports. Note that I do not allow spaces, since that breaks the range query parser ("price:[10.00USD to 40.00USD]"). This can probably be addressed later.

          Greg Fodor added a comment -

          I should clarify my comment re: TrieField. I'm wondering whether it is more expensive to perform a Trie-based query against a large portion of the value's range than a direct field-cache-based range query. My assumption (which might be incorrect) is that Trie-based range queries across the entire span of values are more expensive than non-Trie full-span range queries. If that isn't the case, then it makes sense to do as you suggest and use Trie ranges even though they will often span the entire range of values.

          Robert Muir added a comment -

          However, I think including the currency code as part of the value itself (particularly when indexing it as a field) is an important use case to support as well, since it makes indexing so much easier to implement. Is this what you mean by Solr taking the ISO code as input? For example: "price:10.00USD", where the code is part of the value.

          No, I don't think we should do this.

          I think the code, even in this case should be provided separate.
          Because 10.00USD is still just a number format (NumberFormat.ISOCURRENCYSTYLE), no less ambiguous than $10.00 (NumberFormat.CURRENCYSTYLE) to me.

          Under a Chinese locale it's USD10.00; under German it's 10,00 USD (with a space!).
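          That variance is easy to see by rendering the same amount and currency under different locales. The exact strings depend on the JDK's locale data, so the sketch below prints them for inspection and only treats the separators (which are stable locale data) as facts:

```java
import java.text.DecimalFormatSymbols;
import java.text.NumberFormat;
import java.util.Currency;
import java.util.Locale;

public class LocaleVariance {

    // Render one amount/currency pair under a given locale's default layout.
    static String render(double amount, String isoCode, Locale locale) {
        NumberFormat nf = NumberFormat.getCurrencyInstance(locale);
        nf.setCurrency(Currency.getInstance(isoCode));
        return nf.format(amount);
    }

    public static void main(String[] args) {
        Locale[] locales = { Locale.US, Locale.GERMANY, Locale.SIMPLIFIED_CHINESE };
        for (Locale l : locales) {
            System.out.println(l + ": " + render(10.0, "USD", l));
        }
        // The separators themselves are locale data, e.g. German uses ','
        System.out.println(
            DecimalFormatSymbols.getInstance(Locale.GERMANY).getDecimalSeparator());
    }
}
```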

          Really, I can't stand how Lucene handles the current date-range stuff via query parsing either.
          In my opinion, when query-parsing this stuff: for a date range query/indexing, you should have to provide a DateFormat;
          for a currency query/indexing, you should have to provide a DecimalFormat.

          For dates, Solr seems to have opted for a standardized, required DateFormat across the board, and it's up to clients to convert.
          We really need to think this through for Currency, because passing the stuff necessary to build a DecimalFormat is going to be verbose (Locale + currency ISO code + format string + the string containing the currency value itself)...

          Really, I wonder if we could force the client to deal with localization and parsing, since that's where it fits best anyway, and make it provide just the raw long + ISO code to Solr for this...

          The fact that Solr forces you to implement query parsing server-side is going to introduce complexity here unless we can find a trick...

          Greg Fodor added a comment -

          I think the fact that we do have a locale-independent way to specify currency, the ISO code, is the lever we need.

          You'll notice that the MoneyType is a polyfield of string and double. It could be that we might want to introduce a standard syntax for specifying polyfield values, which could be leveraged here. For PointType it's merely comma delimited. So, we could use the raw long and the ISO code, separated by a comma.

          price: 12345,USD

          It's not pretty to look at, but it at least would be a "correct" solution by removing the dependency upon locale altogether.
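
          The locale-free comma-delimited syntax proposed here is trivial to parse. A minimal sketch follows; the MoneyValue class and method names are illustrative, not taken from the actual patch.

```java
// Hypothetical sketch of parsing the proposed "<amount>,<ISO code>" value,
// e.g. "12345,USD". Class and method names are illustrative.
class MoneyValue {
    final long amount;          // raw long value (minor units, e.g. cents)
    final String currencyCode;  // ISO 4217 code, e.g. "USD"

    MoneyValue(long amount, String currencyCode) {
        this.amount = amount;
        this.currencyCode = currencyCode;
    }

    static MoneyValue parse(String externalVal) {
        int comma = externalVal.indexOf(',');
        if (comma < 0) {
            throw new IllegalArgumentException(
                "Expected <amount>,<ISO code> but got: " + externalVal);
        }
        long amount = Long.parseLong(externalVal.substring(0, comma).trim());
        String code = externalVal.substring(comma + 1).trim();
        return new MoneyValue(amount, code);
    }
}
```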

          Robert Muir added a comment -

          It's not pretty to look at, but it at least would be a "correct" solution by removing the dependency upon locale altogether.

          Right, I definitely have no problem with that from a localization standpoint... then there is no issue at all, and it's up to the client
          to deal with formatting/parsing/pluralization/whatever.

          Uwe Schindler added a comment -

          I guess I should clarify my comment re: TrieField. I guess I'm wondering if it is more expensive to perform a Trie-based query against a large portion of the value's range instead of a direct fieldcache based range query. My assumption (which might be incorrect) is that trie-based range queries across the entire span of values are more expensive than non-Trie full-span range queries. If this isn't the case then it makes sense to do as you suggest and use Trie ranges even though often they will span the entire range of values.

          That's exactly the trick behind the trie field. A query that spans all values is as fast as a query that spans fewer values (OK, it still depends on the number of documents, but the part that selects the terms to match is very effective). The trick behind trie is to reduce the number of terms by using multiple indexed values in the same field and only choosing those that match best. Please read the docs about Lucene's NumericRangeQuery. If the range matches only some values on a sparse index, you lose lots of time iterating the FieldCache.

          And FieldCache (in 3.x) has a big disadvantage: It supports only one value per document and it cannot detect NULL values.
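
          The multi-precision trick described above can be illustrated in miniature. This is not Lucene's actual NumericUtils encoding (which also tags each term with its shift), only a sketch of the idea: each value is additionally indexed at coarser precisions by dropping low bits, so a range query can cover a wide span with a handful of coarse terms instead of one term per exact value.

```java
// Sketch only: index each long at progressively coarser precisions.
// Lucene's real prefix coding (NumericUtils) differs in the details.
class TriePrecisions {
    static long[] terms(long value, int precisionStep) {
        int n = 64 / precisionStep;
        long[] terms = new long[n];
        for (int i = 0; i < n; i++) {
            terms[i] = value >>> (i * precisionStep);  // coarser term at each step
        }
        return terms;
    }
}
```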

          Greg Fodor added a comment -

          Ok, great, thanks Uwe. I will make a pass to incorporate the changes noted here:

          • Indexing tlong instead of double
          • Construct range query efficiently based upon maximal/minimal conversion rate
          • Remove locale specific logic, standardize on input format being <long value>,<ISO code>

          Note that coercion to doubles will still occur during the actual conversion, but they will be immediately coerced back into longs.
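
          The second bullet, building the range query from maximal/minimal conversion rates, might look roughly like the following sketch. The rate table and names are hypothetical, and it assumes non-negative amounts (as is typical for prices): a user range [low, high] in the target currency bounds the raw indexed amounts, across all currencies, via the min and max rates out of the target currency.

```java
import java.util.Map;

// Hypothetical sketch of a naive trie bounding range: rateFromTarget maps
// a currency code to the (target -> that currency) conversion rate.
class BoundingRange {
    static long[] bounds(long low, long high, Map<String, Double> rateFromTarget) {
        double minRate = Double.POSITIVE_INFINITY;
        double maxRate = Double.NEGATIVE_INFINITY;
        for (double r : rateFromTarget.values()) {
            minRate = Math.min(minRate, r);
            maxRate = Math.max(maxRate, r);
        }
        // assumes non-negative amounts: widest possible raw-amount range
        return new long[] { (long) Math.floor(low * minRate),
                            (long) Math.ceil(high * maxRate) };
    }
}
```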

          Greg Fodor added a comment -

          I've attached an updated patch. In the process of removing the cruft for Currency parsing, I pulled everything that was in Lucene out. This is entirely a Solr-based patch now.

          Money-based fields expect their values in the form <long>,<ISO code>, where <long> is the converted long value based upon the known currency fraction digits for the ISO code.

          Uwe, could you please check my implementation of getRangeQuery()? The way I implemented this was via the creation of a range query on the TrieField, which has as its range the max and min potential conversions of the upper and lower bound of the user specified range respectively. This query is then wrapped with a FilteredQuery that applies a Filter that performs the same ValueSource based Scorer as before over the documents to determine if they fall within the range (once converted).

          Presumably this means the outer range query will only pass forward documents to the inner, more expensive, ValueSource filter if they have amount values that fall within the max and min possible amounts across all currencies (given the specified range being queried). I'm assuming that the Filter in a FilteredQuery is applied after the documents are screened by the Query being filtered.

          Greg Fodor added a comment -

          Bugfixes for computation of bounding range for trie query.

          Greg Fodor added a comment -

          Fix for error when computing converted values.

          Lance Norskog added a comment -

          This is a really interesting concept. Do other search engines or databases have a currency field type with this configurability? The range value support alone makes it cool.

          Also, it is nice to have a full example of how to implement Polyfields outside of the spatial code. And Trie integration and ZooKeeper integrations.

          A minor nit: equals(Object other) has to be symmetric, so this.equals(other) has to do the same thing as other.equals(this):

          if (o == null || getClass() != o.getClass()) return false;

          If other is null, this.equals(other) returns false, but other.equals(this) throws a NullPointerException.
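
          A null-safe equals of the kind being discussed, using a hypothetical Money class for illustration; the null guard makes this.equals(null) return false, as the Object.equals contract requires, instead of throwing.

```java
// Hypothetical Money class; the equals/hashCode pattern is the point here.
class Money {
    private final long amount;
    private final String currencyCode;

    Money(long amount, String currencyCode) {
        this.amount = amount;
        this.currencyCode = currencyCode;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;  // null-safe guard
        Money other = (Money) o;
        return amount == other.amount && currencyCode.equals(other.currencyCode);
    }

    @Override
    public int hashCode() {
        return 31 * Long.hashCode(amount) + currencyCode.hashCode();
    }
}
```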

          Robert Muir added a comment -

          Greg, one more nitpick:

          I think the reloadCurrencyConfig could be improved:

          1. It seems to use the resource loader to read the XML file into a String line by line, but then concatenates all these lines and converts them back into a byte array, just to get an input stream.
          2. It uses a charset of "UTF8" (should be "UTF-8").

          I think easier/safer would be to just get an InputStream directly from the resource loader (ResourceLoader.openResource) without this encoding conversion.

          Greg Fodor added a comment - - edited

          Please ignore, there are problems in the current patch, will update later today.

          Greg Fodor added a comment - - edited

          This update to the patch includes a number of performance enhancements and is the version of the patch we will be likely to push to production.

          First, this patch introduces the defaultCurrency parameter, which defaults to USD. The default currency allows you to omit the currency code in the field value (i.e., "5000" instead of "5000,USD"). However, it plays a more pivotal role in improving performance.

          The previous patches provided a naive approach to constructing the trie bounding range by taking the current max and min currency exchange rates to the target currency. This proved to be minimally useful, since the relative magnitudes of currency units vary wildly and hence the bounding range often spanned the full document set.

          The solution I took in this patch is to compute the bounding range by taking into account the "currency drift." Before getting to that, though, the indexing process was updated to include a new dynamic field that indexes the value of the field in the default currency, exchanged at the current rate at indexing time. (Additionally, a stored field is optionally created if the money field is marked as stored.)

          The historical max and min exchange rates (the "drift") are now tracked by solr in a properties file. The properties file is named after the currency config file. For example, if the config file is "currency.xml", the properties file is "currency.xml.drift.properties". This file is designed to work correctly with replication, and is updated by Solr whenever the currency config file is loaded.

          To compute an accurate bounding range, it is necessary to compute the max and min "historical composite exchange rates". The "historical" refers to the fact that the historical max/min exchange rates are used instead of the current exchange rate. The "composite" refers to the fact that the max/min exchange rate is computed by taking the max/min of a composition of the max/min exchange rates between the source currency S, the target currency T, and all intermediate currencies Z. For example, to compute the max historical composite exchange rate between USD and EUR, take the max of the value x*y, where x is the max historical exchange rate between USD->Z, and y is the max historical exchange rate between Z->EUR, for all currencies Z.
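
          The composite computation described above can be sketched directly. The nested rate table is illustrative: maxRate.get(a).get(b) holds the historical max a->b rate, and the table is assumed to include the identity rate (a->a = 1), so the direct conversion is among the candidates.

```java
import java.util.Map;

// Sketch of the max historical composite exchange rate from source S to
// target T: the max over all intermediate currencies Z of
// maxRate(S->Z) * maxRate(Z->T). Table shape and names are illustrative.
class CompositeRate {
    static double maxComposite(String source, String target,
                               Map<String, Map<String, Double>> maxRate) {
        double best = Double.NEGATIVE_INFINITY;
        for (String z : maxRate.keySet()) {
            Double sToZ = maxRate.get(source).get(z);
            Double zToT = maxRate.get(z).get(target);
            if (sToZ != null && zToT != null) {
                best = Math.max(best, sToZ * zToT);
            }
        }
        return best;
    }
}
```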

          I made an attempt at proving mathematically that this historical composite exchange rate approach computes a minimal upper bound and maximal lower bound for the trie query. If necessary I can attach this proof.

          Beyond this, I added some additional intra-query caching and changed the query construction from the FilteredQuery approach (which seemed to be inefficient in leveraging the trie query) to the BooleanQuery. You'll note that I rely upon the second clause in the BooleanQuery being scored first, which eliminates the expensive exchange rate conversions from happening for documents that fall outside the trie range.

          I ran into a limitation of the current resource loader API, however, in that it does not allow access to creating or writing new resources, which is needed to maintain the drift properties file. For now, I only support SolrResourceLoader which writes to the local filesystem by extracting the config directory. However, the new ZkResourceLoader is not supported, for example. A non-fatal warning is emitted to the log when this occurs. The side effect of this is that currency exchange rate drift will not be tracked, resulting in incorrect range and point queries if the currency.xml file is updated. It would be nice if it were possible to ask the ResourceLoader for an OutputStream to a new resource for this purpose.

          Some limitations:

          • The default currency cannot be changed after the initial index, otherwise the index effectively is corrupt since the value for the trie bound is indexed in the default currency.
          • Loss or corruption of the drift file will cause erroneous range and point queries (documents will be omitted from the results, though no incorrect documents will appear.)
          • As mentioned above, the only ResourceLoaders supported are SolrResourceLoaders that respond to getConfigDir(). Please let me know if there is a safer, more canonical way to store and load Solr-maintained metadata that lives with the index.

          Also note that this has been tested with replication. The only thing necessary for replication to work is that the currency.xml and currency.xml.drift.properties file be included as part of the replication. A limitation here is that if no documents are updated but the currency exchange rates change, the file will not be replicated due to Solr's policy of not replicating files without index changes. It would be useful to allow this behavior to be overridden. In our case this isn't a problem since our index churn is high enough that replication events happen regularly.

          In the end these changes result in accurate currency range queries that perform nearly as fast as their non-currency counterparts.

          Greg Fodor added a comment -

          Small bugfix for currency drift not considering asymmetric conversions correctly.

          Greg Fodor added a comment -

          Further testing on production datasets revealed that the use of the trie query, particularly in conjunction with the currency-based scorer, yielded much poorer performance than simply computing the values directly using the field cache. (My guess is that this is due to the low cardinality of price values in practical datasets.)

          So, this patch rolls back a lot of the complexity of previous patches by removing the need to construct accurate trie bounds. This removes the entire "currency drift" properties file discussed above and also removes the incompatibility with the zookeeper resource loader.

          Other changes in this patch are improvements to the MoneyValueSource to improve performance through caching and removing boxing.
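
          The direct field-cache-style evaluation this patch settles on can be sketched as a per-document conversion and comparison; the arrays stand in for field-cache data and the names are illustrative.

```java
// Sketch: amounts[i] and currencyOrd[i] describe document i (as a field
// cache would), and rateToTarget is a per-currency cache of conversion
// rates to the query's target currency, indexed by currency ordinal.
class DirectRangeScan {
    static boolean[] inRange(long[] amounts, int[] currencyOrd,
                             double[] rateToTarget, long low, long high) {
        boolean[] hit = new boolean[amounts.length];
        for (int i = 0; i < amounts.length; i++) {
            long converted = (long) (amounts[i] * rateToTarget[currencyOrd[i]]);
            hit[i] = converted >= low && converted <= high;
        }
        return hit;
    }
}
```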

          Greg Fodor added a comment - - edited

          Etsy has been running this patch in production for a little over a month now successfully. One issue is that reloading the currency.xml file requires a full core reload, which can cause live query performance to suffer. We're working on adding a request handler to handle this instead of waiting for schema change notifications for folks who need this optimization.

          That said, the implementation appears correct, performant, and useful enough for our needs. I'd like to see if there are any outstanding TODOs for this to be seriously considered for inclusion into Solr.

          Jan Høydahl added a comment -

          Any interest in reviving this and working towards committing a first version?

          Jan Høydahl added a comment -

          Greg, do you have an updated patch for this one? We'll help you get it the last mile for inclusion.

          PS: When uploading patches, we prefer that you name it SOLR-2202.patch every time. JIRA will automatically "grey out" the older versions for you.

          Greg Fodor added a comment - - edited

          Hey Jan, awesome! The latest patch is the implementation we have been running in production for some time now. (We are on Solr trunk, however, and there were some small tweaks necessary to get it to build there.) One enhancement we are likely to make is the ability to reload the currency.xml file without reloading the Solr cores. If we can live without that feature, I think this should be good to go.

          It's been quite a while since I have looked at this patch – please let me know if you find that it does not merge in or test properly with the current release of Solr. Also, let me know what I need to do documentation wise and so on.

          Thanks a bunch, excited to see this come together!

          Greg Fodor added a comment -

          I will actually take a crack at merging this into Solr's latest release today. My guess is since it has been so long there are surely a few issues.

          Steve Rowe added a comment -

          Greg, there have been structural changes since your last patch; you can read about it in SOLR-2452.

          I committed a Perl script that can convert a patch against the old structure into a patch against the current structure. You can find it in your working copy at:

          dev-tools/scripts/SOLR-2452.patch.hack.pl

          There is an example of its use in a comment at the top of the script.

          Greg Fodor added a comment -

          Yeah, this patch is a total mess now. I will see if other engineers at Etsy have been maintaining it (I think they have, since we are on some variation of trunk.) If not, I'm sad to say I don't expect to be able to find time to clean it up and re-learn the internals I needed to learn to get it working anytime soon.

          Greg Fodor added a comment -

          I decided to roll up my sleeves, wasn't as bad as I thought! New patch attached, passes against trunk.

          Jan Høydahl added a comment -

          Great, Greg.

          The patch now applies cleanly. I've tested it a bit and uploaded a new patch:

          • Added CHANGES.txt entry
          • Added <fieldType> and <dynamicField> entries to example schema
          • Added currency.xml to example config with real rates between USD and many currencies as well as cross-rates for GBP, EUR, NOK
          • Added money.xml in exampledocs/

          I also added a new Wiki page at http://wiki.apache.org/solr/MoneyFieldType to document this. Please review and change anything I've got wrong.

          At the bottom of the Wiki page I've put some questions/TODOs which I couldn't figure out immediately:

          • How do decimals work? I.e., should USD values be entered in the currency's minor units, so that $1.00 is written as "100,USD"?
          • Can a value be returned in the search result in a different currency than the one indexed?
          • Range facets do not work with this field type - this should be fixed

          I think it could be nice to switch the "price" field in the example schema over from float to "money", but that should wait until range facets work for money field type.

          Greg Fodor added a comment -

          Hey Jan, thanks for cleaning things up!

          To answer your questions:

          • Yes, decimal values should be encoded in minor units: $1.00 is "100,USD".
          • Not as far as I know, the way we do this at Etsy is we do a conversion on the way out at render-time. This requires the frontend to have access to the same currency exchange rates as the search engine.
          • Can you explain in a bit more detail what is required here? I'm not sure how to address this.
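The render-time conversion Greg describes can be sketched roughly like this (a minimal illustration; the `convert` helper and the 1.30 rate are assumptions, not Solr or Etsy code):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class RenderTimeConversion {
    // Hypothetical front-end helper: converts an indexed amount (in minor
    // units, e.g. euro cents) into another currency using a rate the
    // front-end shares with the search engine.
    static long convert(long amountMinorUnits, double exchangeRate) {
        return BigDecimal.valueOf(amountMinorUnits)
                .multiply(BigDecimal.valueOf(exchangeRate))
                .setScale(0, RoundingMode.HALF_UP)
                .longValueExact();
    }

    public static void main(String[] args) {
        // A product indexed as "1000,EUR", rendered in USD at an assumed
        // EUR->USD rate of 1.30:
        System.out.println(convert(1000L, 1.30)); // prints 1300
    }
}
```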
          Greg Fodor added a comment -

          I noticed a reference to currency.xml.drift.properties in the wiki. This drift idea was abandoned by me after realizing the performance was no good. Was there somewhere you saw this file cropping up? There should be no trace of it in the current patch.

          Greg Fodor added a comment -

          wiki updated to reflect the answers above

          Simon Rosenthal added a comment -

          One enhancement we are likely to make is the ability to reload the currency.xml file without reloading the Solr cores

          Greg: I'm working on a patch to handle the more general case of reloading changed config files (stopwords.txt, synonyms.txt, etc.) without requiring a core reload. The class just needs to be ResourceLoaderAware, which seems to be the case here.

          Will submit this patch for issue SOLR-1307.

          Greg Fodor added a comment -

          This is excellent news! Thanks a bunch. This is an issue we've been struggling with at Etsy and haven't had a chance to try to address appropriately.

          Lance Norskog added a comment - - edited

          +1 on getting this in. It's a cool feature that makes sense, but that nobody would think of. (Ok, no American would.)

          Jan Høydahl added a comment -

          Greg,

          Can you explain in a bit more detail what is required here? I'm not sure how to address this.

          facet.range can do range facets for numbers and dates. It would be natural to apply range facets to prices, but this currently does not work. I guess we should document it as a limitation, open a Jira for it and tackle it later. It doesn't make sense to operate on mixed currencies in a facet, so the faceting code would need to operate on a normalized currency, most naturally the defaultCurrency?

          Another thing is that it would be sooo much more user friendly to allow decimal in the user-facing form, e.g. to be allowed to insert "1.00,USD" instead of "100,USD". Could we not allow this in the string-form of the input and stored version, but normalize it to whatever other internal format in the indexed version?

          Greg Fodor added a comment - - edited

          Hm. It could be tricky to allow decimal input, no? Right now, there is no locale-specific code in this implementation, it's generic. If we allowed decimal input, we'd have to understand each locale and how to convert it to the internal format. For example, with decimal support, "1,USD" could mean "100,USD" or "1,USD", and we'd need to know for USD there are hundredths (cents), so we'd save 100. Am I misunderstanding?
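For what it's worth, the JDK already treats the minor-unit scale as currency-specific rather than locale-specific, so the lookup needs no locale at all:

```java
import java.util.Currency;

public class FractionDigitsDemo {
    public static void main(String[] args) {
        // java.util.Currency knows the ISO 4217 minor-unit scale per
        // currency code, independent of any locale.
        System.out.println(Currency.getInstance("USD").getDefaultFractionDigits()); // 2
        System.out.println(Currency.getInstance("JPY").getDefaultFractionDigits()); // 0
    }
}
```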

          Jan Høydahl added a comment -

          I'm no currency expert, but it feels wrong to put this burden on the user (or front-end, or Solr APIs for different programming languages) to know that inputting "1" means 0.01 for USD but means 1 for JPY. We're not talking locale support here, but a strict format with "." as the decimal point: "1.234,CCC". Couldn't the conversion from decimal form to internal integer form be done by the FieldType:

          double d = 1.2345;
          String c = "USD";
          long l = Math.round(d * Math.pow(10.0, Currency.getInstance(c).getDefaultFractionDigits()));
          // ==> l == 123

          Perhaps I'm missing something here?

          Greg Fodor added a comment -

          Actually that makes sense. I was getting my wires crossed thinking this would be locale specific, not currency specific, which is OK. I will try to work this in.

          Jan Høydahl added a comment -

          Greg, I'm gonna look at this again and try to prepare a first candidate for committing to 3.x.
          What we have is already quite good I think!

          Do you have time to do the planned last changes? Do you need any help?

          Greg Fodor added a comment -

          Hey Jan, I can try to take a crack at it this week. The biggest issue right now is that if we change the indexing format we have to change our indexer at Etsy which is a very scary proposition.

          Jan Høydahl added a comment -

          My biggest wish is human-friendly decimal support e.g. "1.5,USD". Are you sure you need to change the index format for this, perhaps enough to add some parse/display code?

          Reloading config would be nice, but as there is a known workaround by reloading core it's not a blocker.

          Ryan McKinley added a comment -

          My biggest wish is human-friendly decimal support e.g. "1.5,USD". Are you sure you need to change the index format for this, perhaps enough to add some parse/display code?

          perhaps this is a good candidate for a DocTransformer – though it seems like something that is likely better done at the client end...
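Either way (transformer or client), the display-side step is just the inverse of the indexing-time scaling; a sketch, where `toDisplay` is a hypothetical helper rather than anything in the patch:

```java
import java.math.BigDecimal;
import java.util.Currency;

public class DisplayFormat {
    // Hypothetical display helper: renders an internally stored long
    // (minor units) back into the human-friendly "1.50,USD" form.
    static String toDisplay(long amountMinorUnits, String currencyCode) {
        int digits = Currency.getInstance(currencyCode).getDefaultFractionDigits();
        return BigDecimal.valueOf(amountMinorUnits, digits).toPlainString()
                + "," + currencyCode;
    }

    public static void main(String[] args) {
        System.out.println(toDisplay(150L, "USD")); // 1.50,USD
        System.out.println(toDisplay(150L, "JPY")); // 150,JPY
    }
}
```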

          Jan Høydahl added a comment -

          Updated patch which applies cleanly to current trunk. DocValues renamed to FunctionValues. Had to change an assert in testMoneyFieldType() to test for fields[i].numericValue() instead of fields[i].binaryValue() to get the test to pass.

          Jan Høydahl added a comment -

          Updated description, as it was a bit outdated

          Greg Fodor added a comment -

          Hey Jan, I took a crack at making your proposed change, but I'm afraid it seems to break the point query test. (Yet range queries strangely are OK.) I have no clue how this could be since the change to the parser seems pretty straightforward. Seems like a floating point rounding thing but I'm still very confused. Let me know if you have any ideas or if you want to go forward with the current version. I can try looking into it more but right now I'm fairly pressed for time.

          Greg Fodor added a comment -

          Success! I was calling parse() twice in the call chain for point queries, and that no longer works since it scales the value by the fraction digits 2x on the way in. Fixed patch is attached.
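The double-scaling failure mode is easy to reproduce with a sketch of such a parse step (a hypothetical `parseAmount`, not the patch's actual code): applying it twice multiplies by 10^fractionDigits a second time.

```java
import java.util.Currency;

public class DoubleParseBug {
    // Hypothetical parse step: scales a decimal amount string into minor units.
    static long parseAmount(String decimal, String currencyCode) {
        int digits = Currency.getInstance(currencyCode).getDefaultFractionDigits();
        return Math.round(Double.parseDouble(decimal) * Math.pow(10, digits));
    }

    public static void main(String[] args) {
        long once = parseAmount("1.00", "USD");               // 100 (cents), correct
        long twice = parseAmount(Long.toString(once), "USD"); // 10000 -- scaled again
        System.out.println(once + " " + twice);
    }
}
```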

          Greg Fodor added a comment -

          Wiki updated

          Jan Høydahl added a comment -

          This is great! I'll see if I can test it a bit more next week, and see if I spot any bugs or improvements.

          This will be a great addition.
          It would be kind of nice to demo the fieldType for all the prices in the exampledocs. Then we would need to change the fieldType of the "price" field to MoneyType etc. The only problem then is that we cannot demo price range facets in /browse as we do now.

          Does anyone know what it takes to make range facets work with this field type?

          And a minor one: Why do we call the class MoneyType? All the other fields are called XxxField (except for LatLonType and PointType). Is this intentional? To me "CurrencyField" would sound even better.

          Greg Fodor added a comment -

          I have no problems with renaming it – at the time I started the patch, LatLonType and PointType were the polyfields I saw in the repo – at this point so much has probably changed that naming things appropriately makes sense. Let me know if this is a patch you'd like me to do or if you want to take it from here. Thanks Jan!

          Jan Høydahl added a comment -

          Other opinions on the naming? Or other blockers for committing? I have not tested the latest change regarding decimal point support yet but will try to soon...

          Greg Fodor added a comment -

          Any updates on this?

          Jan Høydahl added a comment - - edited

          Had no chance to get back to this yet. Afraid I won't have time next week either

          In the mean time, perhaps other committers could chime in with their views on preferred naming?
          a) MoneyType
          b) MoneyField
          c) CurrencyType
          d) CurrencyField

          As for range facets, I'll open a new issue once the basics for this are committed.

          Erik Hatcher added a comment -

          d) CurrencyField - that's more consistent with the likes of TextField, StrField, etc. We have some new field types that suffix with "Type", but they're the exception rather than the norm. I prefer the term Currency over the less formal Money.

          Andrew Morrison added a comment -

          Hello Jan.

          I've attached a patch against https://svn.apache.org/repos/asf/lucene/dev/trunk@1220795 that adds the ability to do range faceting. I'd be happy to move this to another ticket and clean it up.

          Hoss Man added a comment -

          a) CurrencyField (and by extension "CurrencyValue") gets my vote

          b) I really only reviewed the facet stuff in SOLR-2202-solr-10.patch (I know Jan has already been reviewing the more core stuff about the type) ... it makes me realize that we really need to refactor the range faceting code to be easier to use in custom FieldTypes, but that's certainly no fault of this issue and can be done later.

          The facet code itself looks correct, but my one concern is that (if I'm understanding all of this MoneyValue conversion stuff correctly) it should be possible to facet with start/end/gap values specified in any currency, as long as they are all consistent – but there is no test of this situation. The negative test only looks at using an inconsistent gap, and the positive tests only use USD, or the "default", which is also USD. We should have at least one test that uses something like EUR for start/end/gap and verifies that the counts are correct given the conversion rates used in the test.

          Incidentally: I don't see anything actually enforcing that start/end are in the same currency – just that gap is in the same currency as the values it's being added to, so essentially that start and gap use the same currency. But I'm actually not at all clear on why there is any attempt to enforce that the currencies used are the same, since the whole point of the type (as I understand it) is that you can do conversions on the fly – it may seem silly for someone to say facet.range.start=0,USD & facet.range.gap=200,EUR & facet.range.end=1000,YEN, but is there any technical reason why we can't let them do that?
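Normalizing the range parameters to one currency, as discussed above, would amount to converting each of start/end/gap into the field's base currency before bucketing; a toy sketch under an assumed rate (not the patch's code):

```java
public class FacetRangeNormalize {
    // Hypothetical: convert a facet.range parameter, given in minor units of
    // some currency, into the base currency's minor units via an assumed rate.
    static long toBaseUnits(long amountMinorUnits, double rateToBase) {
        return Math.round(amountMinorUnits * rateToBase);
    }

    public static void main(String[] args) {
        // facet.range.gap=200,EUR with an assumed EUR->USD rate of 1.25:
        System.out.println(toBaseUnits(200L, 1.25)); // 250 (i.e. 250,USD)
    }
}
```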

          Jan Høydahl added a comment -

          New patch (without faceting).

          • Renamed to CurrencyField
          • Fixed bug when default type missing in document
          • Added a copyField from price to price_c in schema
          • Various cleanup

          I think this is more or less ready

          Jan Høydahl added a comment -

          Andrew, please see SOLR-3218 for faceting support.
          I suggest we first commit this basic field type for Solr3.6, and then add faceting support

          Jan Høydahl added a comment -

          I will have a new patch ready soon. It:

          • has truly pluggable ExchangeRateProvider through new param exchangeRateProvider on fieldType
          • pulls out ExchangeRateProvider interface into its own file, with init(), reload(), inform() and list() methods
          • reading and parsing of config file delegated to FileExchangeRateProvider
          • cleans up static strings into constants
          • normalizes the stored value when the currency is missing, i.e. an input of "3.5" becomes "3.5,USD" if USD is the default currency
          • adds ASL license headers to all new files
          • reverts schema field "price" back to "float" for backward compat
          • removes defaultCurrency param from <field> definition and adds it to <fieldType>

          Actually, the defaultCurrency param does not work at all on the <field> level. Any ideas on how it could be made to work?
          We should create a test with a different ExchangeRateProvider plugin just to prove that it works.
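From the method list above, the interface might look roughly like this; the signatures are guesses from this description, not the committed code, and the MockProvider here is only in the spirit of the MockExchangeRateProvider mentioned later in this thread:

```java
import java.util.Map;
import java.util.Set;

public class ExchangeRateProviderSketch {
    // Sketch of the pluggable provider; method names follow the comment
    // above (init/reload/list), signatures are assumptions.
    interface ExchangeRateProvider {
        void init(Map<String, String> args);
        boolean reload();
        Set<String> listAvailableCurrencies();
        double getExchangeRate(String source, String target);
    }

    // Minimal hard-coded implementation for testing, with an assumed
    // symmetric USD<->EUR rate.
    static class MockProvider implements ExchangeRateProvider {
        public void init(Map<String, String> args) { }
        public boolean reload() { return true; }
        public Set<String> listAvailableCurrencies() { return Set.of("USD", "EUR"); }
        public double getExchangeRate(String source, String target) {
            if (source.equals(target)) return 1.0;
            return source.equals("USD") ? 0.75 : 1.0 / 0.75;
        }
    }

    public static void main(String[] args) {
        ExchangeRateProvider p = new MockProvider();
        System.out.println(p.getExchangeRate("USD", "EUR")); // 0.75
    }
}
```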

          Jan Høydahl added a comment -

          Here's the patch

          Jan Høydahl added a comment -

          New patch:

          • Added MockExchangeRateProvider with hardcoded rates
          • Added fieldType and field for mock provider in test-schema
          • Added test case validating mock provider
          Jan Høydahl added a comment - - edited

          Some further tests added, and fixed a bug in listAvailableCurrencies(). Also now printing the price_c value instead of price in the Velocity "/browse" template; looks nice.

          I plan to commit this to trunk shortly and start backporting to 3.x.

          Greg Fodor added a comment -

          Great! Thanks for the cleanup work

          Jan Høydahl added a comment -

          Committed to trunk. Now go start building ExchangeRateProviders

          Jan Høydahl added a comment -

          Backport from SOLR-3228 committed to branch_3x

          Jan Høydahl added a comment -
          Updated Wiki: http://wiki.apache.org/solr/CurrencyField
          Koji Sekiguchi added a comment -

          Reopening, as CurrencyField depends on the "tlong" fieldType definition in schema.xml, and I don't think that is a good idea.

          If the tlong fieldType is not there, I get an NPE.

          // CurrencyField.java
          protected static final String FIELD_TYPE_CURRENCY   = "string";
          protected static final String FIELD_TYPE_AMOUNT_RAW = "tlong";
          
          Koji Sekiguchi added a comment -

          A draft patch which hasn't been tested yet.

          Jan Høydahl added a comment -

          Agree, it should be possible to create a schema with only one "currency" field in it. Thanks for the patch.

          Koji Sekiguchi added a comment - - edited

          My patch introduces another NPE problem (try going to the schema browser, for example) because the "tlong" field which is created in the init() method lacks some required properties (e.g. typeName).

          I don't like the idea of hard-coding the construction of the tlong object; how about introducing an attribute, something like "subFieldType", the way AbstractSubTypeFieldType currently does?

          Defaulting "subFieldType" to tlong is OK this time, I think, as we can throw a SolrException if the sub type doesn't exist, like AbstractSubTypeFieldType does.

          Jan Høydahl added a comment -

          Here's a patch based on Koji's, independent of other schema fields. We create the two fields with omitNorms=true, and we now also allow precisionStep to be specified, which will be passed on to the TrieLong. All tests pass.

          I don't think anyone will need the extra configurability of choosing other fieldTypes. If such a request comes, we can easily add it later.

          Koji Sekiguchi added a comment -

          Patch looks good!

          Jan Høydahl added a comment -

          Fix checked in to trunk and branch_3x. We now do not rely on other fieldTypes in schema, and we can control precisionStep for the TrieLong. Wiki updated.

          Jan Høydahl added a comment -

          This patch backports the ExchangeRateProvider interface stabilizations and class loader fixes checked in to TRUNK as part of SOLR-3255. This should now be a good basis for coding new providers.
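
          With the stabilized ExchangeRateProvider interface, an alternative provider is plugged in via the field type declaration. A hypothetical sketch; the providerClass, ratesFileLocation, and refreshInterval attributes follow the OpenExchangeRatesOrgProvider discussed below, but exact names may differ by version:

          ```xml
          <!-- Sketch: swapping the default file-based rate provider for the
               openexchangerates.org provider; refreshInterval is in minutes -->
          <fieldType name="currency" class="solr.CurrencyField"
                     providerClass="solr.OpenExchangeRatesOrgProvider"
                     ratesFileLocation="open-exchange-rates.json"
                     refreshInterval="60"/>
          ```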

          Jan Høydahl added a comment -

          Checked the stabilization patch in to branch_3x.

          Yonik Seeley added a comment -

          I can't seem to run a lot of tests from IntelliJ after this.

          Caused by: java.lang.RuntimeException: Can't find resource 'open-exchange-rates.json' in classpath or '/opt/code/lusolr/solr/build/solr-idea/classes/test/solr/conf/', cwd=/opt/code/lusolr
          	at org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:293)
          	at org.apache.solr.schema.OpenExchangeRatesOrgProvider.reload(OpenExchangeRatesOrgProvider.java:126)
          
          
          $ find . -name open-exchange-rates.json
          ./build/solr-core/test-files/solr/conf/open-exchange-rates.json
          ./core/src/test-files/solr/conf/open-exchange-rates.json
          
          

          Anyone else?

          Jan Høydahl added a comment -

          I think this could rather be related to the latest check-in for SOLR-3255?
          The test OpenExchangeRatesOrgProviderTest depends on the file "open-exchange-rates.json" in test-files/solr/conf. It appears that SolrResourceLoader does not look there when you run tests from IntelliJ?

          Yonik Seeley added a comment -

          Looks like all of the files are in
          ./build/solr-idea/classes/test/solr/conf
          except for one... the .json file. It's either not being copied in the first place, or it's being deleted somehow.

          Chris Male added a comment -

          Is it that your Compiler settings in IntelliJ don't include .json files? That is usually my first goto when a file with an exotic extension isn't getting copied into the build space.

          Steve Rowe added a comment -

          Is it that your Compiler settings in IntelliJ don't include .json files? That is usually my first goto when a file with an exotic extension isn't getting copied into the build space.

          Yes, this was the problem. I committed a fix to both trunk and branch_3x after Yonik asked me about the problem on #lucene-dev IRC.
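
          For anyone hitting the same thing locally: IntelliJ decides which non-source files get copied to the output directory via the resource patterns in its compiler settings (Settings → Compiler → Resource patterns). A sketch of the kind of entry involved in the project's compiler configuration; the exact file location and element names vary by IDEA version, so treat this as an assumption:

          ```xml
          <!-- .idea/compiler.xml (sketch): adding ?*.json to the resource
               patterns makes IDEA copy .json test resources to the output dir -->
          <component name="CompilerConfiguration">
            <wildcardResourcePatterns>
              <entry name="?*.json" />
              <entry name="?*.xml" />
              <entry name="?*.properties" />
            </wildcardResourcePatterns>
          </component>
          ```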

          Jan Høydahl added a comment -

          Thanks for sorting this out!


            People

            • Assignee: Jan Høydahl
            • Reporter: Greg Fodor
            • Votes: 3
            • Watchers: 6
