KAFKA-7556

KafkaConsumer.beginningOffsets does not return actual first offsets

Details

    Description

      Description of the problem

      The Javadoc for `org.apache.kafka.clients.consumer.KafkaConsumer.beginningOffsets` states that the method will 'Get the first offset for the given partitions.'

      I used it with a compacted topic, and it always returned offset 0 for all partitions.
      I am not sure whether compaction actually matters here, but I am including the detail in case it does.

      When a Kafka topic has retention set and old log files are removed as a result, the effective start offset of each affected partition moves forward, so it is greater than offset 0.

      However, calling the `beginningOffsets` method always returns offset 0 as the first offset.

      In contrast, when the method `org.apache.kafka.clients.consumer.KafkaConsumer.offsetsForTimes` is called with a timestamp of 0L (the UNIX epoch, 1 January 1970), it correctly returns the effective start offset of each partition.
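
      To make the expectation concrete, here is a minimal toy model in plain Java. It does not call Kafka; `LogRecord`, `buildLog`, `applyRetention`, `effectiveStartOffset` and `earliestOffsetAtOrAfter` are hypothetical names invented for this sketch, and retention is simplified to dropping individual records (Kafka removes whole log segments). The lookup rule follows the `offsetsForTimes` Javadoc: the earliest offset whose timestamp is greater than or equal to the given timestamp.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalLong;

public class StartOffsetModel {

    /** One record in a toy partition log. Hypothetical type, not part of kafka-clients. */
    public static final class LogRecord {
        final long offset;
        final long timestamp;

        public LogRecord(long offset, long timestamp) {
            this.offset = offset;
            this.timestamp = timestamp;
        }
    }

    /** Builds a log whose offsets start at 0 and whose timestamps grow by stepMillis per record. */
    public static List<LogRecord> buildLog(int count, long firstTimestamp, long stepMillis) {
        List<LogRecord> log = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            log.add(new LogRecord(i, firstTimestamp + i * stepMillis));
        }
        return log;
    }

    /** Simplified retention (assumption): keeps only records at or after the cutoff timestamp. */
    public static List<LogRecord> applyRetention(List<LogRecord> log, long cutoffTimestamp) {
        List<LogRecord> retained = new ArrayList<>();
        for (LogRecord r : log) {
            if (r.timestamp >= cutoffTimestamp) {
                retained.add(r);
            }
        }
        return retained;
    }

    /** Offset of the first record still present, i.e. the effective start offset. */
    public static OptionalLong effectiveStartOffset(List<LogRecord> log) {
        return log.isEmpty() ? OptionalLong.empty() : OptionalLong.of(log.get(0).offset);
    }

    /** Earliest offset whose timestamp is greater than or equal to targetTimestamp. */
    public static OptionalLong earliestOffsetAtOrAfter(List<LogRecord> log, long targetTimestamp) {
        for (LogRecord r : log) {
            if (r.timestamp >= targetTimestamp) {
                return OptionalLong.of(r.offset);
            }
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        List<LogRecord> full = buildLog(10, 1000L, 100L);
        List<LogRecord> retained = applyRetention(full, 1500L);

        System.out.println("first offset ever written:     " + full.get(0).offset);
        System.out.println("effective start offset:        " + effectiveStartOffset(retained).getAsLong());
        System.out.println("timestamp lookup, target 0:    " + earliestOffsetAtOrAfter(retained, 0L).getAsLong());
    }
}
```

      In this model every retained record has a timestamp of at least 0, so a lookup with target 0 always lands on the first retained record. That is why the timestamp query can serve as a stand-in for the effective start offset.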

      Output of using `org.apache.kafka.clients.consumer.KafkaConsumer.beginningOffsets`: 

      {test.topic-87=0, test.topic-54=0, test.topic-21=0, test.topic-79=0, test.topic-46=0, test.topic-13=0, test.topic-70=0, test.topic-37=0, test.topic-12=0, test.topic-95=0, test.topic-62=0, test.topic-29=0, test.topic-4=0, test.topic-88=0, test.topic-55=0, test.topic-22=0, test.topic-80=0, test.topic-47=0, test.topic-14=0, test.topic-71=0, test.topic-38=0, test.topic-5=0, test.topic-96=0, test.topic-63=0, test.topic-30=0, test.topic-56=0, test.topic-23=0, test.topic-89=0, test.topic-48=0, test.topic-15=0, test.topic-81=0, test.topic-72=0, test.topic-39=0, test.topic-6=0, test.topic-64=0, test.topic-31=0, test.topic-97=0, test.topic-24=0, test.topic-90=0, test.topic-57=0, test.topic-16=0, test.topic-82=0, test.topic-49=0, test.topic-40=0, test.topic-7=0, test.topic-73=0, test.topic-32=0, test.topic-98=0, test.topic-65=0, test.topic-91=0, test.topic-58=0, test.topic-25=0, test.topic-83=0, test.topic-50=0, test.topic-17=0, test.topic-8=0, test.topic-74=0, test.topic-41=0, test.topic-0=0, test.topic-99=0, test.topic-66=0, test.topic-33=0, test.topic-92=0, test.topic-59=0, test.topic-26=0, test.topic-84=0, test.topic-51=0, test.topic-18=0, test.topic-75=0, test.topic-42=0, test.topic-9=0, test.topic-67=0, test.topic-34=0, test.topic-1=0, test.topic-85=0, test.topic-60=0, test.topic-27=0, test.topic-77=0, test.topic-52=0, test.topic-19=0, test.topic-76=0, test.topic-43=0, test.topic-10=0, test.topic-93=0, test.topic-68=0, test.topic-35=0, test.topic-2=0, test.topic-86=0, test.topic-53=0, test.topic-28=0, test.topic-78=0, test.topic-45=0, test.topic-20=0, test.topic-69=0, test.topic-44=0, test.topic-11=0, test.topic-94=0, test.topic-61=0, test.topic-36=0, test.topic-3=0}
      

      Output of using `org.apache.kafka.clients.consumer.KafkaConsumer.offsetsForTimes`:

      {test.topic-87=(timestamp=1511264434285, offset=289), test.topic-54=(timestamp=1511265134993, offset=45420), test.topic-21=(timestamp=1511265534207, offset=63643), test.topic-79=(timestamp=1511270338275, offset=380750), test.topic-46=(timestamp=1511266883588, offset=266379), test.topic-13=(timestamp=1511265900538, offset=98512), test.topic-70=(timestamp=1511266972452, offset=118522), test.topic-37=(timestamp=1511264396370, offset=763), test.topic-12=(timestamp=1511265504886, offset=61108), test.topic-95=(timestamp=1511289492800, offset=847647), test.topic-62=(timestamp=1511265831298, offset=68299), test.topic-29=(timestamp=1511278767417, offset=548361), test.topic-4=(timestamp=1511269316679, offset=144855), test.topic-88=(timestamp=1511265608468, offset=107831), test.topic-55=(timestamp=1511267449288, offset=129241), test.topic-22=(timestamp=1511283134114, offset=563095), test.topic-80=(timestamp=1511277334877, offset=534859), test.topic-47=(timestamp=1511265530689, offset=71608), test.topic-14=(timestamp=1511266308829, offset=80962), test.topic-71=(timestamp=1511265474740, offset=83607), test.topic-38=(timestamp=1511268268259, offset=166460), test.topic-5=(timestamp=1511276243850, offset=294307), test.topic-96=(timestamp=1511276749138, offset=483237), test.topic-63=(timestamp=1511276798188, offset=441051), test.topic-30=(timestamp=1511269265206, offset=151727), test.topic-56=(timestamp=1511267813861, offset=171045), test.topic-23=(timestamp=1511268644790, offset=103736), test.topic-89=(timestamp=1511270771269, offset=276851), test.topic-48=(timestamp=1511279335687, offset=696518), test.topic-15=(timestamp=1511265886647, offset=82049), test.topic-81=(timestamp=1511272770367, offset=426329), test.topic-72=(timestamp=1511265783620, offset=76346), test.topic-39=(timestamp=1511265444819, offset=39325), test.topic-6=(timestamp=1511343902830, offset=501911), test.topic-64=(timestamp=1511276754029, offset=517845), test.topic-31=(timestamp=1511265905957, 
offset=68886), test.topic-97=(timestamp=1511271052868, offset=378159), test.topic-24=(timestamp=1511268471118, offset=185533), test.topic-90=(timestamp=1511265280669, offset=54403), test.topic-57=(timestamp=1511266299095, offset=154119), test.topic-16=(timestamp=1511266140680, offset=91612), test.topic-82=(timestamp=1511266586862, offset=158657), test.topic-49=(timestamp=1511265550828, offset=67356), test.topic-40=(timestamp=1511266374225, offset=133092), test.topic-7=(timestamp=1511270077779, offset=178568), test.topic-73=(timestamp=1511324184866, offset=1897104), test.topic-32=(timestamp=1511269989097, offset=291600), test.topic-98=(timestamp=1511282897007, offset=813386), test.topic-65=(timestamp=1511265848788, offset=97431), test.topic-91=(timestamp=1511270007091, offset=302858), test.topic-58=(timestamp=1511273273743, offset=416483), test.topic-25=(timestamp=1511266304470, offset=85250), test.topic-83=(timestamp=1511272177062, offset=426855), test.topic-50=(timestamp=1511266726637, offset=151059), test.topic-17=(timestamp=1511265892637, offset=71425), test.topic-8=(timestamp=1511276315670, offset=386542), test.topic-74=(timestamp=1511265671808, offset=79210), test.topic-41=(timestamp=1511265957561, offset=84127), test.topic-0=(timestamp=1511265529344, offset=56920), test.topic-99=(timestamp=1511265343408, offset=34036), test.topic-66=(timestamp=1511268683052, offset=285773), test.topic-33=(timestamp=1511292268724, offset=627169), test.topic-92=(timestamp=1511265745483, offset=67471), test.topic-59=(timestamp=1511275909747, offset=398275), test.topic-26=(timestamp=1511268821082, offset=167519), test.topic-84=(timestamp=1511320800595, offset=1820740), test.topic-51=(timestamp=1511267699931, offset=201574), test.topic-18=(timestamp=1511277137591, offset=447521), test.topic-75=(timestamp=1511266718159, offset=101415), test.topic-42=(timestamp=1511268280357, offset=102338), test.topic-9=(timestamp=1511321331900, offset=1012234), 
test.topic-67=(timestamp=1511266755881, offset=157516), test.topic-34=(timestamp=1511265028871, offset=18917), test.topic-1=(timestamp=1511269548372, offset=114164), test.topic-85=(timestamp=1511265426041, offset=59734), test.topic-60=(timestamp=1511299615444, offset=1395047), test.topic-27=(timestamp=1511284039796, offset=567605), test.topic-77=(timestamp=1511265412189, offset=45907), test.topic-52=(timestamp=1511268603935, offset=212559), test.topic-19=(timestamp=1511266992878, offset=98667), test.topic-76=(timestamp=1511269937877, offset=304288), test.topic-43=(timestamp=1511267795292, offset=130635), test.topic-10=(timestamp=1511265000373, offset=26730), test.topic-93=(timestamp=1511275532751, offset=545451), test.topic-68=(timestamp=1511266797346, offset=160989), test.topic-35=(timestamp=1511279364499, offset=694191), test.topic-2=(timestamp=1511270517080, offset=79696), test.topic-86=(timestamp=1511266816638, offset=142042), test.topic-53=(timestamp=1511265524531, offset=44083), test.topic-28=(timestamp=1511266431654, offset=97500), test.topic-78=(timestamp=1511266431145, offset=95872), test.topic-45=(timestamp=1511273274002, offset=346044), test.topic-20=(timestamp=1511268405607, offset=166317), test.topic-69=(timestamp=1511266953570, offset=143188), test.topic-44=(timestamp=1511289270173, offset=870214), test.topic-11=(timestamp=1511271446913, offset=182125), test.topic-94=(timestamp=1511290978907, offset=1022343), test.topic-61=(timestamp=1511303529222, offset=1204397), test.topic-36=(timestamp=1511266872111, offset=172640), test.topic-3=(timestamp=1511266312670, offset=98632)}
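
      The two outputs above are long, so comparing them by eye is error-prone. Here is a small sketch that parses both printed maps and lists the partitions where they disagree. The class `OffsetOutputDiff` and its methods are hypothetical helpers written for this report, and the regular expressions assume the exact `toString` layout shown above. The sample strings in `main` are the first three entries of each output.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OffsetOutputDiff {

    /** Matches entries such as "test.topic-87=0" in the beginningOffsets output. */
    public static final Pattern BEGINNING = Pattern.compile("([\\w.-]+)=(\\d+)");

    /** Matches entries such as "test.topic-87=(timestamp=1511264434285, offset=289)". */
    public static final Pattern FOR_TIMES =
            Pattern.compile("([\\w.-]+)=\\(timestamp=\\d+,\\s*offset=(\\d+)\\)");

    /** Extracts partition name -> offset from a printed map, using the pattern for that output. */
    public static Map<String, Long> parse(Pattern entry, String printedMap) {
        Map<String, Long> offsets = new LinkedHashMap<>();
        Matcher m = entry.matcher(printedMap);
        while (m.find()) {
            offsets.put(m.group(1), Long.parseLong(m.group(2)));
        }
        return offsets;
    }

    /** Partitions whose effective offset differs from the reported beginning offset. */
    public static Map<String, Long> mismatches(Map<String, Long> reported, Map<String, Long> effective) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : effective.entrySet()) {
            Long r = reported.get(e.getKey());
            if (r != null && !r.equals(e.getValue())) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String beginning = "{test.topic-87=0, test.topic-54=0, test.topic-21=0}";
        String forTimes = "{test.topic-87=(timestamp=1511264434285, offset=289), "
                + "test.topic-54=(timestamp=1511265134993, offset=45420), "
                + "test.topic-21=(timestamp=1511265534207, offset=63643)}";

        Map<String, Long> reported = parse(BEGINNING, beginning);
        Map<String, Long> effective = parse(FOR_TIMES, forTimes);

        for (Map.Entry<String, Long> e : mismatches(reported, effective).entrySet()) {
            System.out.println(e.getKey() + ": beginningOffsets=" + reported.get(e.getKey())
                    + ", offsetsForTimes(0)=" + e.getValue());
        }
    }
}
```

      The `\s*` in `FOR_TIMES` tolerates a line break between the timestamp and the offset, which occurs where the pasted output wraps.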
      

      Proposed fix

      • If the intention is to return the first offset the partition ever had, the Javadoc should say so explicitly, and another method should be added to query the effective start offsets.
      • Alternatively, the method should return the actual start offset of each partition at the time of the call. For topics with a retention policy set, this differs from what the start offset was before log files were removed.

            People

              jolshan Justine Olshan
              rob_v Robert V