Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Description
Migrate to git as the main source code repository. After this work:
- The source code repository will become https://gitbox.apache.org/repos/asf/sis.
- The https://svn.apache.org/repos/asf/sis/trunk/ repository will become read-only.
We will continue to use Subversion for the site, sis-data and non-free. Before making the new git repository available for use, we will try to clean up its history by removing large files, especially:
- California_Restaurants.csv (19 MB)
- DEPARTEMENT.SHP (3 MB)
- ANC90Ply_4326.shp (0.7 MB)
These large files were identified with the following commands (source: Stack Overflow):
    git rev-list --objects --all | sort -k 2 > allfileshas.txt
    git gc && git verify-pack -v .git/objects/pack/pack-*.idx | egrep "^\w+ blob\W+[0-9]+ [0-9]+ [0-9]+$" | sort -k 3 -n -r > bigobjects.txt
    for SHA in `cut -f 1 -d ' ' < bigobjects.txt`; do
        echo $(grep $SHA bigobjects.txt) $(grep $SHA allfileshas.txt) | awk '{print $1,$3,$7}' >> bigtosmall.txt
    done;
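The same kind of listing can probably be obtained without intermediate files. The sketch below is not the procedure used for this issue. It is an alternative that relies on `git cat-file --batch-check` to report object sizes directly. The function name `list_large_blobs` and its two parameters are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original procedure): print the N largest
# blobs reachable from any ref, largest first, one per line, in the form
#   <size-in-bytes> blob <sha> <path>
list_large_blobs() {
    repo=${1:-.}     # repository path; defaulting to "." is an assumption
    count=${2:-10}   # number of entries to print; 10 is an arbitrary default
    # rev-list prints "<sha> <path>"; cat-file exposes the path as %(rest).
    git -C "$repo" rev-list --objects --all |
        git -C "$repo" cat-file \
            --batch-check='%(objectsize) %(objecttype) %(objectname) %(rest)' |
        awk '$2 == "blob"' |      # keep blobs, drop commits and trees
        sort -k 1,1 -n -r |       # numeric sort on size, descending
        head -n "$count"
}
```

For example, `list_large_blobs . 20` run at the root of a clone would list the 20 largest blobs. Sizes reported this way are uncompressed object sizes, so they may differ from the packed sizes shown by `git verify-pack`.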
Commands executed for removing them:
    git filter-branch --tree-filter 'find . -name "California_Restaurants.csv" -delete' -- --all
    git filter-branch --tree-filter 'find . -name "DEPARTEMENT.*" -delete' -- --all
    git filter-branch --tree-filter 'find . -name "ANC90Ply_4326*" -delete' -- --all
    git filter-branch --tree-filter 'find . -name "*~" -delete' -- --all
    git filter-branch --tree-filter 'rm -rf "sis-data"' -- --all
    git update-ref -d refs/original/refs/heads/master
    git reflog expire --expire=now --all
    git gc --prune=now
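One way to check that such a rewrite had the intended effect is to count the objects still reachable under a given path. The sketch below is an illustration only, not a step recorded in this issue. The function name `count_history_paths` and its parameters are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original procedure): count the objects
# reachable from any ref whose recorded path matches an extended regex.
# A result of 0 suggests the path is gone from all reachable history.
count_history_paths() {
    repo=$1       # repository path
    pattern=$2    # extended regular expression matched against the path
    # rev-list prints "<sha> <path>"; commits have no path, so "cut -s"
    # drops those lines and keeps only the path part of the others.
    git -C "$repo" rev-list --objects --all |
        cut -s -d ' ' -f 2- |
        { grep -c -E -- "$pattern" || true; }   # grep -c prints 0 on no match
}
```

For example, `count_history_paths . 'California_Restaurants\.csv$'` should print 0 once the file has been removed from every ref. Because `--all` walks every ref under `refs/`, backup refs such as `refs/original/` are included in the count for as long as they exist.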