Details

    • New Feature
    • Status: To Do
    • Minor
    • Resolution: Unresolved
    • Apache MXNet Backend
    • None

    Description

      IndexArray (index_array) is an operator that returns, for every element of the input array, the index of that element.

      For an input array with shape (d_1, d_2, ..., d_n), index_array returns a (d_1, d_2, ..., d_n, n) array idx, where idx[i_1, i_2, ..., i_n, :] = [i_1, i_2, ..., i_n].
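The definition above can be illustrated with a small NumPy reference. This is only a sketch of the intended semantics, not the MXNet operator itself; the helper name `index_array_reference` is hypothetical and introduced here for illustration.

```python
import numpy as np

def index_array_reference(data):
    """NumPy sketch of the described semantics (hypothetical helper)."""
    # np.indices returns an array of shape (n, d_1, ..., d_n);
    # moving the leading axis to the end gives (d_1, ..., d_n, n).
    grid = np.indices(data.shape)
    return np.moveaxis(grid, 0, -1)

x = np.zeros((2, 3))
idx = index_array_reference(x)
print(idx.shape)   # (2, 3, 2)
print(idx[1, 2])   # [1 2]
```

Each position of the result holds its own coordinates, which is what makes the output usable as a meshgrid.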

      Additionally, when the parameter axes is specified, idx will be a
      (d_1, d_2, ..., d_n, m) array where m is the length of axes, and the following
      equality will hold: idx[i_1, i_2, ..., i_n, j] = i_{axes[j]}.
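A NumPy sketch of the axes variant follows. It assumes that axes entries are zero-based and that negative entries count from the last input axis, as the usage examples in this issue suggest; the helper name `index_array_with_axes` is hypothetical.

```python
import numpy as np

def index_array_with_axes(data, axes=None):
    """NumPy sketch of index_array with an optional axes parameter."""
    grid = np.moveaxis(np.indices(data.shape), 0, -1)  # (d_1, ..., d_n, n)
    if axes is None:
        return grid
    # Indexing the last axis with axes picks the coordinate along axes[j]
    # for each j; negative values wrap around like ordinary NumPy indices.
    return grid[..., list(axes)]                        # (d_1, ..., d_n, m)

x = np.zeros((2, 3, 4))
idx = index_array_with_axes(x, axes=(-2, -1, -2, -1))
print(idx.shape)      # (2, 3, 4, 4)
print(idx[1, 2, 3])   # [2 3 2 3]
```

Repeating an axis in axes simply repeats the corresponding coordinate, which the prior box example relies on.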

      Motivation

      This operator can generate meshgrids for tensors whose exact shapes are not known during graph construction. For instance, it can be used to build a simple prior box generator for anchor-based computer vision models:

      # F is assumed to be mx.nd or mx.sym, as passed to HybridBlock.hybrid_forward.
      feature_map = F.ones((8, 128, 128, 256)) # N x H x W x C, no shape information when using the Symbol API.
      prior_box_stride = 16
      box_size = [8, 8]

      template = F.squeeze(F.slice_axis(feature_map, begin=0, end=1, axis=-1), axis=-1) # N x H x W
      box_centres = F.contrib.index_array(template, axes=(-2, -1, -2, -1)).astype("float32") # N x H x W x 4
      box_centres = F.broadcast_mul(box_centres, F.array([prior_box_stride]).reshape((1, 1, 1, 1))) # N x H x W x 4
      corner_offsets = F.array(box_size).reshape((1, 1, 1, 2)) # 1 x 1 x 1 x 2
      corner_offsets = F.concat(-corner_offsets/2, corner_offsets/2, dim=-1) # 1 x 1 x 1 x 4
      box_corners = F.broadcast_plus(box_centres, corner_offsets) # N x H x W x 4
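To make the arithmetic of the prior box example concrete, here is a NumPy sketch of the same computation on a tiny feature map. It emulates index_array with `np.indices` and is an illustration under that assumption, not the proposed operator; the small shape values are chosen only for readability.

```python
import numpy as np

n, h, w = 1, 2, 3            # tiny N x H x W template
stride = 16
box_size = np.array([8.0, 8.0])

# Emulate index_array(template, axes=(-2, -1, -2, -1)).
grid = np.moveaxis(np.indices((n, h, w)), 0, -1)               # N x H x W x 3
centres = grid[..., [1, 2, 1, 2]].astype("float32") * stride   # N x H x W x 4

# Half the box size is subtracted for the first corner and added for the second.
offsets = np.concatenate([-box_size / 2, box_size / 2]).reshape((1, 1, 1, 4))
corners = centres + offsets                                    # N x H x W x 4

print(corners[0, 1, 2])   # [12. 28. 20. 36.]
```

With axes ordered as (H, W, H, W), each box comes out as (y_min, x_min, y_max, x_max).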

      It can also be used to implement positional encodings for sequence processing, e.g.:

      sequence_embeddings = F.ones((65, 8, 256)) # T x N x C, no shape information when using the Symbol API.
      template = sequence_embeddings.reshape((0, 0, -1, 2)) # T x N x C -> T x N x (C/2) x 2
      pos, i = F.split(
          F.contrib.index_array(template, axes=(0, 2)).astype("float32"), # T x N x (C/2) x 2 x 2
          axis=-1,
          num_outputs=2,
          squeeze_axis=True
      ) # T x N x (C/2) x 2 and T x N x (C/2) x 2
      base = F.ones((1, 1, 1, 1)) * 10000
      dmodel = F.slice_axis(F.shape_array(sequence_embeddings), begin=-1, end=None, axis=0)
      dmodel = dmodel.reshape((1, 1, 1, 1)).astype("float32")
      tmp = F.broadcast_div(pos, F.broadcast_power(base, F.broadcast_div(2 * i, dmodel))) # T x N x (C/2) x 2
      sin_input, cos_input = F.split(tmp, axis=-1, num_outputs=2, squeeze_axis=True) # T x N x (C/2) and T x N x (C/2)
      positional_encoding = F.stack(F.sin(sin_input), F.cos(cos_input), axis=-1).reshape((0, 0, -3)) # T x N x C
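The positional encoding example can be checked against a NumPy sketch of the same steps. It again emulates index_array with `np.indices`, and it assumes the intended result is the usual sinusoidal encoding in which even channels hold sines and odd channels hold cosines; the small T, N, C values are illustrative only.

```python
import numpy as np

t, n, c = 5, 2, 8
template = np.zeros((t, n, c // 2, 2))                  # T x N x (C/2) x 2

# Emulate index_array(template, axes=(0, 2)): keep the T and C/2 coordinates.
grid = np.moveaxis(np.indices(template.shape), 0, -1).astype("float32")
pos, i = grid[..., 0], grid[..., 2]                     # each T x N x (C/2) x 2

angle = pos / np.power(10000.0, 2 * i / c)              # T x N x (C/2) x 2
# Sine on the first slot of each pair, cosine on the second, then flatten pairs.
pe = np.stack([np.sin(angle[..., 0]), np.cos(angle[..., 1])], axis=-1)
pe = pe.reshape((t, n, c))                              # T x N x C

print(pe.shape)   # (5, 2, 8)
```

Channel 2k of `pe` holds sin(pos / 10000^(2k/C)) and channel 2k+1 holds the matching cosine.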

       

            People

              Assignee: Unassigned
              Reporter: Nick Guletskii (nickguletskii)

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 5.5h