Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.3.0
    • Fix Version/s: 0.3.0
    • Component/s: User Interface
    • Labels:
      None
    • Environment:

      Web Stack

      Description

      Being able to specify custom tags for a machine.

      I would like to be able to tag a machine as part of a specific stack, such as Stack1 and Stack2, which is already implemented as the 'cluster' tag specified in chukwa-agent-conf.xml. I would also like to be able to tag each machine according to its role, such as WebServer or DataBase. This is not currently possible. The best solution would probably be to allow arbitrary tags. This functionality should also be exposed through dump.sh, so that I can filter on any tag.

      1. fixedTagMatcher.patch
        1 kB
        Ari Rabkin
      2. fixedTagMatcher-1.patch
        1 kB
        Eric Yang
      3. generaltagging.patch
        18 kB
        Ari Rabkin

        Activity

        asrabkin Ari Rabkin added a comment -

        Just to clarify: The scope of this issue is as follows:
        Make it possible for dump.sh to filter on arbitrary tags, not just cluster. This probably requires modifying DumpChunks.java, and possibly adding utility methods elsewhere. The rest of Chukwa should already tolerate arbitrary tags.

        Proposed dump interface:
        tag.tagname=pattern.

        This verifies that the user understood what they were doing.
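        The proposed tag.tagname=pattern argument could be handled roughly as sketched below. This is only an illustration: the class TagFilterSketch and its methods are hypothetical and are not taken from DumpChunks.java, and it assumes that tags are stored as space-separated name="value" pairs and that the pattern must match the whole tag value.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper; not part of DumpChunks.java or any Chukwa API. */
final class TagFilterSketch {
    private static final String PREFIX = "tag.";

    private final String tagName;
    private final Pattern valuePattern;

    /** Parses a filter argument of the proposed form tag.tagname=pattern. */
    TagFilterSketch(String spec) {
        int eq = spec.indexOf('=');
        if (!spec.startsWith(PREFIX) || eq <= PREFIX.length()) {
            throw new IllegalArgumentException("expected tag.<name>=<pattern>: " + spec);
        }
        this.tagName = spec.substring(PREFIX.length(), eq);
        this.valuePattern = Pattern.compile(spec.substring(eq + 1));
    }

    /** True if the tag string holds tagName="value" and value fully matches the pattern. */
    boolean matches(String tags) {
        // Assumed storage format: space-separated name="value" pairs.
        Matcher m = Pattern
            .compile("(?:^|\\s)" + Pattern.quote(tagName) + "=\"([^\"]*)\"")
            .matcher(tags);
        return m.find() && valuePattern.matcher(m.group(1)).matches();
    }
}
```

        Under these assumptions, a filter built from tag.role=Data.* accepts a chunk tagged role="Database" and rejects one tagged role="WebServer" or one with no role tag at all.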

        asrabkin Ari Rabkin added a comment -

        Did some refactoring of filter mechanism. Note that support for extracting general tags is now in ChunkImpl, and that this method is part of the Chunk interface.

        asrabkin Ari Rabkin added a comment -

        Since this patch modifies core parts of Chukwa – even if very slightly – I don't intend to treat silence as consent.

        jboulon Jerome Boulon added a comment -

        I need more information, because it is already possible to add tags, either directly in chukwa-agent-conf.xml or dynamically.
        Let me know if you need something else.

        <property>
          <name>chukwaAgent.tags</name>
          <value>cluster="@TODO-CLUSTER-NAME@"</value>
          <description>The cluster's name for this agent</description>
        </property>

        Originally the property was named chukwaAgent.cluster, but I renamed it so that additional tags could be defined; cluster is just the one tag that is there by default.

        So if you want to define a role for example you could do:

        <property>
          <name>chukwaAgent.tags</name>
          <value>cluster="@TODO-CLUSTER-NAME@" role="Database"</value>
          <description>The cluster's name for this agent and its role</description>
        </property>

        Also, there's a public API that you can use to set additional tags and the FileAdaptor is using it to add a timestamp. This is done by this line: chunk.addTag("time=\"" + fileTime + "\"");

        • org.apache.hadoop.chukwa.Chunk#getTag(String) has not been implemented because we couldn't agree on a standard format for a tag.
          So I would rather move that method to a helper class unless everybody agrees on this format: tagname="val". Personally, I'm OK with this format, since I'm the one who put it in.
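        To make the tagname="val" format concrete, here is a minimal sketch of an addTag/getTag round trip. TagFieldSketch is a hypothetical stand-in for a chunk's tag field, not the real Chunk or ChunkImpl; the space separator and the null return for a missing tag are assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for a chunk's tag field; not the real Chunk or ChunkImpl. */
final class TagFieldSketch {
    private final StringBuilder tags = new StringBuilder();

    /** Appends one tag already written as name="value", separated by a space. */
    void addTag(String tag) {
        if (tags.length() > 0) {
            tags.append(' ');
        }
        tags.append(tag);
    }

    /** Returns the value of the named tag, or null if the tag is absent. */
    String getTag(String tagName) {
        Pattern p = Pattern.compile("(?:^|\\s)" + Pattern.quote(tagName) + "=\"([^\"]*)\"");
        Matcher m = p.matcher(tags);
        return m.find() ? m.group(1) : null;
    }

    /** Returns the raw tag string as it would be stored. */
    String getTags() {
        return tags.toString();
    }
}
```

        A caller that adds a tag the way the FileAdaptor line above does, addTag("time=\"" + fileTime + "\""), can then read the value back with getTag("time"). A tag field written in some other format would simply yield null here.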

        That being said, I'm currently working on something else that may impact the Chunk class, so here is more information on what I'm doing.
        I'm working on a streaming implementation for Chukwa, and I need to manage chunk routing dynamically using rules.
        The rules are simple string comparisons against specific fields such as host, category, cluster, and something similar to the role.
        So I could use either the current Map or tags. The Map is not necessarily the best option, since I will be adding fields that are more like dimensions/metadata (cluster, host, rack, role, etc.) and are not part of the data. Tags would be the right place for them, but there's an overhead in using only text.

        So my preference would be to add another Map for storing dimensions or metadata related to the list of records. This would simplify the code and speed it up, because we would no longer use regexes to match tags, at the cost of some additional bytes.
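        The trade-off described here can be sketched as follows: parse the text tags once into a map, then evaluate routing rules as plain string comparisons. Everything in this block is an assumption made for illustration (the class name, the rule represented as a map of required values, and word-character tag names); it is not the streaming implementation being discussed.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch; class, methods, and rule shape are assumptions, not Chukwa code. */
final class DimensionMapSketch {
    // Assumes tag names consist of word characters and values contain no quotes.
    private static final Pattern TAG = Pattern.compile("(\\w+)=\"([^\"]*)\"");

    /** Parses name="value" pairs once into a map, keeping their order. */
    static Map<String, String> parse(String tags) {
        Map<String, String> dims = new LinkedHashMap<>();
        Matcher m = TAG.matcher(tags);
        while (m.find()) {
            dims.put(m.group(1), m.group(2));
        }
        return dims;
    }

    /** A routing rule as plain string comparison: every required dimension must be equal. */
    static boolean routeMatches(Map<String, String> dims, Map<String, String> rule) {
        for (Map.Entry<String, String> e : rule.entrySet()) {
            if (!e.getValue().equals(dims.get(e.getKey()))) {
                return false;
            }
        }
        return true;
    }
}
```

        The regex runs once per chunk in parse; each rule check afterwards is a map lookup and an equals call, which is the simplification and speed-up argued for above, paid for with the extra bytes of the map.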

        As an intermediate step, we could decide to use the tag field, but with a JSON string.

        asrabkin Ari Rabkin added a comment -

        Yes. My patch leverages chukwaAgent.tags and addTag(), and makes no change to that part of the system. As you point out, it implicitly standardizes key="value", or at least encourages that format. You'd still be able to use that field in other ways, but callers of getTag(String) would be confused. I'm prepared to pay that price.

        Rather than adding a Map to Chunk, you might extend ChunkImpl to include your map, if you don't need to serialize it. This would preserve the existing serialized format.

        asrabkin Ari Rabkin added a comment -

        Jerome, can I take your response as consent? The only change to ChunkImpl, the existing tagging mechanism, and the tag-handling code is the addition of the Chunk.getTag(String) method, which shouldn't do any harm even if we later do something nonstandard with tags; at worst, it will give misleading results.

        I thought about pushing the functionality into a helper class, and decided against it, since if tags are a standard mechanism (and they should be), then a default tag format seems like a reasonable thing to standardize and push into Chunk.

        jboulon Jerome Boulon added a comment -

        +1 for the Chunk.getTag(String)
        +1 for the tag format: tagname="val"[space]tagname2="val2"
        +1 for the patch

        asrabkin Ari Rabkin added a comment -

        I just committed this.

        asrabkin Ari Rabkin added a comment -

        Slight bug: the regex for extracting tags overmatched if chunks had multiple tags.
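        The patches themselves are not reproduced in this thread, so the following is only an illustration of how a tag-extraction regex can overmatch when the tag string holds more than one tag. Neither pattern is taken from the attached patches, and the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustration only; neither pattern is taken from the attached patches. */
final class OvermatchSketch {
    /** Greedy capture: (.*) runs on to the last quote in the string. */
    static String greedy(String tags, String name) {
        return first(Pattern.compile(Pattern.quote(name) + "=\"(.*)\""), tags);
    }

    /** Bounded capture: [^"]* stops at the quote that closes this value. */
    static String bounded(String tags, String name) {
        return first(Pattern.compile(Pattern.quote(name) + "=\"([^\"]*)\""), tags);
    }

    /** Returns the first captured group, or null when the pattern does not match. */
    private static String first(Pattern p, String tags) {
        Matcher m = p.matcher(tags);
        return m.find() ? m.group(1) : null;
    }
}
```

        For the tag string cluster="Stack1" role="Database", greedy(tags, "cluster") returns Stack1" role="Database, while bounded(tags, "cluster") returns Stack1. With a single tag both return the same value, which is why this kind of bug only appears once a second tag is present.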

        eyang Eric Yang added a comment -

        I think the regular expression that you are looking for is:

        "."tagName"=\"(.?)\".*"

        eyang Eric Yang added a comment -

        Comment messed things up. The correct regex is enclosed in the file.

        asrabkin Ari Rabkin added a comment -

        Eric, I think these two patches are equivalent, except perhaps for performance. And this patch is very far from a critical path, so I'm going to commit your version.

        asrabkin Ari Rabkin added a comment -

        I just committed this. Thanks, Eric.

        hudson Hudson added a comment -

        Integrated in Chukwa-trunk #112 (see http://hudson.zones.apache.org/hudson/job/Chukwa-trunk/112/)

          People

          • Assignee:
            asrabkin Ari Rabkin
            Reporter:
            aaronbee Aaron Beitch