I wrote a Selector which can:
- compute a value of a file
- compare this value with one stored in a cache
- decide whether to select

The "compute" algorithm and the cache are configurable. Current implementations:
- HashvalueAlgorithm: reads the file's content into a String and uses its hashCode(), therefore only files whose CONTENT has changed are selected
- PropertyfileCache: uses java.util.Properties for storing the key-value pairs

While writing the selector and its testcase I realized that the doco needs some improvements in that area. So I have done that, too.
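The compute / compare / decide cycle described in this comment can be sketched roughly as follows. This is not the attached implementation: the class name CacheSelectorSketch, the nested Algorithm interface and the isSelected method are hypothetical, and the content is passed in as a String instead of being read from a file, to keep the sketch self-contained.

```java
import java.util.Properties;

public class CacheSelectorSketch {

    /** Hypothetical stand-in for the configurable "compute" step. */
    public interface Algorithm {
        String getValue(String content);
    }

    // java.util.Properties used as the key-value cache, as in the comment above.
    private final Properties cache = new Properties();
    private final Algorithm algorithm;

    public CacheSelectorSketch(Algorithm algorithm) {
        this.algorithm = algorithm;
    }

    /**
     * Computes a value for the content, compares it with the cached value
     * for the same key, updates the cache on a difference, and reports
     * whether the entry should be selected.
     */
    public boolean isSelected(String key, String content) {
        String newValue = algorithm.getValue(content);
        String cachedValue = cache.getProperty(key); // null if never seen
        boolean changed = !newValue.equals(cachedValue);
        if (changed) {
            cache.setProperty(key, newValue);
        }
        return changed;
    }

    public static void main(String[] args) {
        // Assumption: the value is simply the String's hashCode().
        CacheSelectorSketch selector =
            new CacheSelectorSketch(c -> Integer.toString(c.hashCode()));
        System.out.println(selector.isSelected("a.txt", "hello"));  // true  (not cached yet)
        System.out.println(selector.isSelected("a.txt", "hello"));  // false (unchanged)
        System.out.println(selector.isSelected("a.txt", "hello!")); // true  (content changed)
    }
}
```

A real selector would also have to persist the Properties object between builds; that part is left out here.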
Created attachment 6622 [details] Zip containing diffs and complete files
The zip contains:
docs\manual\CoreTypes\*.html - the improved/updated doco (complete files)
docs\manual\CoreTypes\*.diff - the improved/updated doco (diff files)
docs\manual\CoreTypes\*.png - graphics used in the doco
**\cacheselector\*.java - new files for the selector (complete files)
CacheSelectorTest.java - JUnit testcase for the new selector
AbstractFileSet, SelectorContainer - adding support for the new selector
BaseExtendSelector - my fault :-( no changes done

I forgot the BaseSelectorContainer changes needed to support the new selector. They are in the next attachment.
Created attachment 6623 [details] the forgotten diff
Created attachment 6642 [details] Add support for CacheSelector (without this, many classes can
Created attachment 6671 [details] Complete refactored implementation and updated documentation
Attachment 6671 contains the full new sources (complete files and diffs), so the other attachments are out of date.
I like the idea behind the cache selector. But I am not comfortable with the HashValueAlgorithm where you convert a byte array into a character string object before computing the hashvalue for it. Instead of converting from bytes to string, use MessageDigest along with the MD5 algorithm to calculate the MD5 hashvalue for each file (See the code for the checksum task for an example). This is purely stream based and not string based.
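A minimal sketch of the stream-based approach suggested here, using only java.security.MessageDigest from the JDK. The class name StreamDigestSketch and the digestHex helper are made up for illustration and are not taken from the checksum task or from any attachment.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class StreamDigestSketch {

    /**
     * Feeds the stream into a MessageDigest in fixed-size chunks, so the
     * bytes are never converted to a String and never held in memory as a whole.
     * Returns the digest as a lowercase hex string.
     */
    public static String digestHex(InputStream in, String algorithm)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance(algorithm);
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            digest.update(buffer, 0, read);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        // An in-memory stream stands in for a FileInputStream here.
        InputStream in = new ByteArrayInputStream(
            "hello".getBytes(StandardCharsets.US_ASCII));
        System.out.println(digestHex(in, "MD5")); // 5d41402abc4b2a76b9719d911017c592
    }
}
```

For a file, the same helper could be called with a java.io.FileInputStream opened in a try-with-resources block.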
Now I know why I implemented the CacheSelector against the Algorithm _interface_ :-) I will try an implementation as MD5Algorithm.
Created attachment 6675 [details] MD5Algorithm uses MessageDigest for computing the value
Created attachment 6676 [details] Added test for MD5Algorithm
Created attachment 6677 [details] Added MD5Algorithm (contains the other modifications)
After a while and a refactoring, this should be the final version.

Features:
- CacheSelector uses a third interface, java.util.Comparator, for comparing the new and the cached value
- smaller Algorithm interface
- configuration of Algorithm and Cache is done via IntrospectionHelper
- MD5Algorithm is changed to DigestAlgorithm and supports SHA
- default values for easier use

Now I need comments. Then I can try my first change to the CVS tree :-)
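To illustrate the role of the Comparator listed above, here is a rough sketch in which the decision "has the value changed?" is delegated to a java.util.Comparator instead of a hard-coded equals(). The names ComparatorSelectorSketch, Algorithm and isSelected are hypothetical, a plain HashMap stands in for the Cache, and nothing here is taken from the attached sources.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

public class ComparatorSelectorSketch {

    /** Hypothetical, deliberately small "compute" interface. */
    public interface Algorithm {
        String getValue(String content);
    }

    private final Map<String, String> cache = new HashMap<>(); // stand-in for the Cache
    private final Algorithm algorithm;
    private final Comparator<String> comparator;

    public ComparatorSelectorSketch(Algorithm algorithm, Comparator<String> comparator) {
        this.algorithm = algorithm;
        this.comparator = comparator;
    }

    /** Selects when no value is cached or the comparator reports a difference. */
    public boolean isSelected(String key, String content) {
        String fresh = algorithm.getValue(content);
        String cached = cache.get(key);
        boolean changed = cached == null || comparator.compare(cached, fresh) != 0;
        if (changed) {
            cache.put(key, fresh);
        }
        return changed;
    }

    public static void main(String[] args) {
        // Assumption for the demo: the value is the content itself and
        // the comparator ignores case, so a case-only change is not a change.
        ComparatorSelectorSketch selector =
            new ComparatorSelectorSketch(c -> c, String.CASE_INSENSITIVE_ORDER);
        System.out.println(selector.isSelected("k", "Hello")); // true
        System.out.println(selector.isSelected("k", "HELLO")); // false
        System.out.println(selector.isSelected("k", "World")); // true
    }
}
```

Swapping the comparator changes what counts as "modified" without touching the algorithm or the cache, which seems to be the point of keeping the three pieces separate.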
Created attachment 7049 [details] ZIP containing refactored CacheSelector and docs
In conjunction with Forrest I created a test scenario. I attach a zip file containing the two files for running the scenario on Windows.

Needed:
- Ant distribution containing the cacheselector
- Ant-Contrib

Install + Run:
- unzip Forrest's fresh-site.zip into %forrest_home%\fresh-site
- unzip the attached cacheselector+forrest.zip into %forrest_home%\fresh-site
- open a cmd.exe in %forrest_home%\fresh-site
- type #suite.bat

Scenario:
- deletes all generated files (so you can run this multiple times)
- generate the site
- copy to dist-1: should contain all files
- copy to dist-2: should contain no files, because nothing has changed
- generate the site
- copy to dist-3: should contain no files, because no CONTENT has changed
- modify two sources (using notepad)
- generate the site
- copy to dist-4: should contain only the derived files (faq.html, faq.pdf, test1.html)
Created attachment 7050 [details] Scenario CacheSelector in conjunction with Apache Forrest
Seems a useful addition, and a nicely done patch. One comment: when I see <cache/>, it's not immediately obvious what the selector does. I think the fact that a cache is used is more an implementation detail, than anything of relevance to the user. Perhaps <modified/> would be better? --Jeff
Committed as <modified> selector.
Jan