Infrastructure / INFRA-4711

Hive Git mirror hasn't updated in a couple days

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: Initial Clearing
    • Component/s: Git
    • Labels:
      None

        Activity

        Jukka Zitting added a comment -
        Nice work Todd! Resolving as fixed.
        Enis Soztutar added a comment -
        I can see that the git repos are updated. Thanks Todd, you're the man.
        Todd Lipcon added a comment -
        danielsh helped me get SVN access to the repo, so I committed the changes back. If someone with JIRA fu can assign this JIRA to me and mark it resolved, that would be cool. Thanks.
        Todd Lipcon added a comment -
        I figured out where they're kept in SVN, and then noticed that the checked-out versions on git.apache.org had a bunch of local changes. So I jumped in on the fun and made some more local changes (after first making a backup in the same dir).

        I made the following improvements:
        - use hardlink-based locking instead of mkdir
        - the file that's created as a lock contains info like the following: Locked by pid 14981 at Thu Apr 26 03:35:46 GMT 2012
        - exit immediately if the lock cannot be obtained (this way one "stuck" lock doesn't back up the syncing of all the repos)
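The three changes listed above could be sketched roughly as follows. This is not the committed script: the function names `acquire_lock` and `release_lock` are hypothetical, and the lock text only imitates the format quoted above.

```shell
#!/bin/sh
# Sketch of hardlink-based locking. acquire_lock and release_lock are
# hypothetical names, not taken from the real mirror scripts.

# Try to take the lock at path $1.
# Returns 0 on success, 1 if the lock is already held (never waits).
acquire_lock() {
    lock_path=$1
    lock_tmp="$lock_path.$$"
    # Write the owner info to a private temp file first.
    echo "Locked by pid $$ at $(date -u)" > "$lock_tmp" || return 1
    # ln fails if $lock_path already exists, so creating the hard link
    # is the atomic test-and-set step.
    if ln "$lock_tmp" "$lock_path" 2>/dev/null; then
        rm -f "$lock_tmp"
        return 0
    fi
    rm -f "$lock_tmp"
    return 1
}

# Release the lock at path $1.
release_lock() {
    rm -f "$1"
}
```

A cron-driven caller would presumably run something like `acquire_lock "$GIT_DIR/update-repo.lock" || exit 0` before syncing a repo and call `release_lock` afterwards, so that one held lock makes that run skip the repo instead of blocking on it.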

        So far this seems to be working much better. A further improvement would be a timeout, so that a lock older than some threshold can be removed.
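One way the timeout idea might look, as a sketch only: the helper names and the use of minutes as the unit are assumptions, and it relies on a `find` that supports `-mmin` (GNU and BSD `find` do; it is not in POSIX, so it may not exist on the actual server).

```shell
#!/bin/sh
# Sketch of stale-lock cleanup. lock_is_stale and break_stale_lock are
# hypothetical names; the lock is assumed to be a plain file.

# Return 0 if lock file $1 exists and was last modified more than
# $2 minutes ago. A missing lock counts as not stale.
lock_is_stale() {
    [ -n "$(find "$1" -mmin +"$2" 2>/dev/null)" ]
}

# Remove the lock at $1 if it is older than $2 minutes, logging the
# owner info stored in it to stderr first.
break_stale_lock() {
    if lock_is_stale "$1" "$2"; then
        echo "Removing stale lock $1: $(cat "$1" 2>/dev/null)" >&2
        rm -f "$1"
    fi
}
```

Note that breaking a lock this way is not race-free: two processes could both decide the lock is stale, and the slower one could delete a lock the faster one has just re-created. A generous threshold makes that unlikely rather than impossible.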
        Todd Lipcon added a comment -
        I futzed around on the git server a bit to look into this. trafficserver-plugins.git had a 6-day-old "update-repo.lock" file which was blocking all of the cron jobs - they'd get to that repo and just block forever.

        I rmdired it and things seem to be making a little progress now... I'll keep looking into it. Useful commands for future spelunking:

        Find update-mirrors which are being blocked by locks:
        {code}
        # The [u] keeps grep from matching its own entry in the ps output.
        for x in $(ps -ef | grep '[u]pdate-mirror' | awk '{print $2}') ; do pgrep -P $x -l | ggrep -q sleep && (echo -n $x: ; pargs -e $x | grep GIT_DIR) ; done
        {code}

        Find update-mirrors which are actually running:
        {code}
        # The [u] keeps grep from matching its own entry in the ps output.
        for x in $(ps -ef | grep '[u]pdate-mirror' | awk '{print $2}') ; do pgrep -P $x -l | ggrep -q sleep || (echo -n $x: ; pargs -e $x | grep GIT_DIR) ; done
        {code}
        (This latter one will generate some error messages, because it can race against the processes finishing.)

        To improve this situation, we could do a few things:
        - when we make the lock dir, drop a file inside it containing the PID of the process that created it
        - if we see the lock, just exit rather than sleeping forever waiting for it (when being run from cron context). That way processes don't pile up
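For the existing mkdir-style lock, the two suggestions above might be sketched like this. The function names and the `pid` file name inside the lock directory are hypothetical.

```shell
#!/bin/sh
# Sketch of a non-blocking mkdir lock that records its owner.
# try_lock_dir, lock_owner, unlock_dir and the "pid" file name are
# hypothetical, not taken from the real mirror scripts.

# Try to create lock directory $1. Returns 0 on success, 1 if the
# lock is already held; it never sleeps or retries.
try_lock_dir() {
    # mkdir fails if the directory already exists, which makes it the
    # atomic test-and-set step.
    if mkdir "$1" 2>/dev/null; then
        echo "$$" > "$1/pid"
        return 0
    fi
    return 1
}

# Print the PID recorded in lock directory $1, if any.
lock_owner() {
    cat "$1/pid" 2>/dev/null
}

# Release lock directory $1.
unlock_dir() {
    rm -f "$1/pid"
    rmdir "$1"
}
```

With the PID recorded, someone investigating a stuck lock can check whether the owning process still exists before removing it. There is a brief window between `mkdir` and the write of the `pid` file in which the lock exists without owner information, so a reader should treat a missing file as "unknown", not as "abandoned".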

        Can someone point me to where these scripts are held in version control?
        Enis Soztutar added a comment -
        This is true for HBase, and some other projects as well. The HBase repo has not been updated since 04/21.

        Since this severely hinders development, can we get some love for it soon?

        Thanks in advance.
        Carl Steinbach added a comment -
        The Hive git mirror at git://git.apache.org/hive.git hasn't updated in the past couple of days:

        % svn log -q -l 20
        ------------------------------------------------------------------------
        r1329507 | cws | 2012-04-23 16:21:48 -0700 (Mon, 23 Apr 2012)
        ------------------------------------------------------------------------
        r1329492 | namit | 2012-04-23 15:31:14 -0700 (Mon, 23 Apr 2012)
        ------------------------------------------------------------------------
        r1329461 | hashutosh | 2012-04-23 14:25:50 -0700 (Mon, 23 Apr 2012)
        ------------------------------------------------------------------------
        r1329416 | cws | 2012-04-23 13:13:03 -0700 (Mon, 23 Apr 2012)
        ------------------------------------------------------------------------
        r1329381 | cws | 2012-04-23 12:19:55 -0700 (Mon, 23 Apr 2012)
        ------------------------------------------------------------------------
        r1329314 | hashutosh | 2012-04-23 09:23:35 -0700 (Mon, 23 Apr 2012)
        ------------------------------------------------------------------------
        r1328568 | hashutosh | 2012-04-20 19:36:57 -0700 (Fri, 20 Apr 2012)


        versus


        % git remote -v
        apache git://git.apache.org/hive.git (fetch)
        apache git://git.apache.org/hive.git (push)
        (master) [ ~/Work/repos/hive6 ]
        % git fetch apache
        (master) [ ~/Work/repos/hive6 ]
        % glogt -4 apache/trunk
        dc0281c 2012-04-21 HIVE-2966 :Revert HIVE-2795 (Thejas Nair via Ashutosh Chauhan)
        e645544 2012-04-20 HIVE-2965 : Revert HIVE-2612 (hashutosh)
        151710d 2012-04-20 HIVE-2958 [jira] GROUP BY causing ClassCastException [LazyDioInteger cannot be cast LazyInteger] (Navis Ryu via Ashutosh Chauhan)
        3b6b6d7 2012-04-18 HIVE-2959 [jira] TestRemoteHiveMetaStoreIpAddress always uses the same port (Kevin Wilfong via Ashutosh Chauhan)
        (master) [ ~/Work/repos/hive6 ]

          People

          • Assignee: Unassigned
          • Reporter: Carl Steinbach
          • Votes: 4
          • Watchers: 4
