Solr / SOLR-6558

Solr does not insert the first line of the CSV file

Details

    Description

      This question is also posted on Stack Overflow: http://stackoverflow.com/questions/26000623/solr-does-not-insert-the-first-line-in-the-csv-file

      When a CSV file is uploaded with the curl command below

      C:\>curl "http://localhost:8983/solr/update/csv?commit=true&stream.file=C:\dev\tools\solr-4.7.2\data.txt&stream.contentType=text/csv&header=false&fieldnames=id,cat,pubyear_i,title,author,series_s,sequence_i&skipLines=0"

      and data.txt contains the following

      book1,fantasy,2000,A Storm of Swords,George R.R. Martin,A Song of Ice and Fire,3
      book2,fantasy,2005,A Feast for Crows,George R.R. Martin,A Song of Ice and Fire,4
      book3,fantasy,2011,A Dance with Dragons,George R.R. Martin,A Song of Ice and Fire,5
      book4,sci-fi,1987,Consider Phlebas,Iain M. Banks,The Culture,1
      book5,sci-fi,1988,The Player of Games,Iain M. Banks,The Culture,2
      book6,sci-fi,1990,Use of Weapons,Iain M. Banks,The Culture,3
      book7,fantasy,1984,Shadows Linger,Glen Cook,The Black Company,2
      book8,fantasy,1984,The White Rose,Glen Cook,The Black Company,3
      book9,fantasy,1989,Shadow Games,Glen Cook,The Black Company,4
      book10,sci-fi,2001,Gridlinked,Neal Asher,Ian Cormac,1
      book11,sci-fi,2003,The Line of Polity,Neal Asher,Ian Cormac,2
      book12,sci-fi,2005,Brass Man,Neal Asher,Ian Cormac,3

      the first record in data.txt, whose id is "book1", is not inserted into Solr. Can someone please explain why?

      http://localhost:8983/solr/query?q=id:book1
      {
      "responseHeader":{
      "status":0,
      "QTime":1,
      "params":{
      "q":"id:book1"}},
      "response":{"numFound":0,"start":0,"docs":[]
      }}

      The Solr log, however, reports that book1 is being added.

      15440876 [searcherExecutor-5-thread-1] INFO org.apache.solr.core.SolrCore – [collection1] Registered new searcher Searcher@177fcdf1[collection1] main{StandardDirectoryReader(segments_1g:124:nrt _z(4.7):C12)}

      15440877 [qtp84034882-11] INFO org.apache.solr.update.processor.LogUpdateProcessor – [collection1] webapp=/solr path=/update params={fieldnames=id,cat,pubyear_i,title,author,series_s,sequence_i&skipLines=0&commit=true&stream.contentType=text/csv&header=false&stream.file=C:\dev\tools\solr-4.7.2\data.txt} {add=[?book1 (1480070032327180288), book2 (1480070032332423168), book3 (1480070032335568896), book4 (1480070032337666048), book5 (1480070032339763200), book6 (1480070032341860352), book7 (1480070032343957504), book8 (1480070032347103232), book9 (1480070032349200384), book10 (1480070032351297536), ... (12 adds)],commit=} 0 92
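
      The `?` printed before book1 in the add list above may be an unprintable character at the start of the first id. One possible explanation, offered here as an assumption rather than a confirmed diagnosis, is that data.txt begins with a UTF-8 byte order mark (BOM), which some Windows editors prepend when saving. In that case the first document would be indexed under an id of U+FEFF followed by "book1", and the query id:book1 would not match it. The sketch below checks a byte string for the BOM and strips it; the helper names `has_utf8_bom` and `strip_utf8_bom` are hypothetical, and the sample bytes stand in for the real file.

      ```python
      import codecs


      def has_utf8_bom(raw: bytes) -> bool:
          """Return True if the bytes start with the UTF-8 BOM (EF BB BF)."""
          return raw.startswith(codecs.BOM_UTF8)


      def strip_utf8_bom(raw: bytes) -> bytes:
          """Return the bytes without a leading UTF-8 BOM, if one is present."""
          if has_utf8_bom(raw):
              return raw[len(codecs.BOM_UTF8):]
          return raw


      # Hypothetical sample: the first line of data.txt with a BOM prepended.
      # Whether the real file has one is an assumption to verify.
      sample = codecs.BOM_UTF8 + (
          b"book1,fantasy,2000,A Storm of Swords,"
          b"George R.R. Martin,A Song of Ice and Fire,3\n"
      )

      # With the BOM left in place, the first CSV field is not plain "book1".
      id_with_bom = sample.decode("utf-8").split(",")[0]
      id_without_bom = strip_utf8_bom(sample).decode("utf-8").split(",")[0]

      print(has_utf8_bom(sample))
      print(repr(id_with_bom))
      print(repr(id_without_bom))
      ```

      To check the real file, read it in binary mode (for example `open(path, "rb").read()`) and pass the result to `has_utf8_bom`. If a BOM is found, writing the stripped bytes back and re-uploading would be one way to test the assumption.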

      If I query for all documents, book1 is still missing from the result, as shown below

      http://localhost:8983/solr/query?q=id:book*&sort=pubyear_i+desc&fl=id,title,pubyear_i&rows=15
      {
      "responseHeader":{
      "status":0,
      "QTime":1,
      "params":{
      "fl":"id,title,pubyear_i",
      "sort":"pubyear_i desc",
      "q":"id:book*",
      "rows":"15"}},
      "response":{"numFound":11,"start":0,"docs":[
      { "id":"book3", "pubyear_i":2011, "title":["A Dance with Dragons"]},
      { "id":"book2", "pubyear_i":2005, "title":["A Feast for Crows"]},
      { "id":"book12", "pubyear_i":2005, "title":["Brass Man"]},
      { "id":"book11", "pubyear_i":2003, "title":["The Line of Polity"]},
      { "id":"book10", "pubyear_i":2001, "title":["Gridlinked"]},
      { "id":"book6", "pubyear_i":1990, "title":["Use of Weapons"]},
      { "id":"book9", "pubyear_i":1989, "title":["Shadow Games"]},
      { "id":"book5", "pubyear_i":1988, "title":["The Player of Games"]},
      { "id":"book4", "pubyear_i":1987, "title":["Consider Phlebas"]},
      { "id":"book7", "pubyear_i":1984, "title":["Shadows Linger"]},
      { "id":"book8", "pubyear_i":1984, "title":["The White Rose"]}
      ]
      }}
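
      Comparing the returned ids against the ids in data.txt can also be done programmatically, which is handy when the file is larger than twelve rows. The following is a minimal sketch that assumes the response body has the JSON shape shown above; the function name `missing_ids` is hypothetical, and the sample response is trimmed to two documents for brevity.

      ```python
      import json


      def missing_ids(response_text, expected_ids):
          """Return the expected ids that are absent from a Solr JSON response."""
          docs = json.loads(response_text)["response"]["docs"]
          found = {doc["id"] for doc in docs}
          # Preserve the order of expected_ids in the result.
          return [doc_id for doc_id in expected_ids if doc_id not in found]


      # Trimmed, hypothetical response with the same structure as the one above.
      sample_response = (
          '{"responseHeader":{"status":0},'
          '"response":{"numFound":2,"start":0,"docs":['
          '{"id":"book3","pubyear_i":2011},'
          '{"id":"book2","pubyear_i":2005}]}}'
      )

      expected = ["book1", "book2", "book3"]
      print(missing_ids(sample_response, expected))
      ```

      Against the full response above and the ids book1 through book12, this kind of check would report only book1, which matches numFound being 11 rather than 12.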

          People

            Assignee: Unassigned
            Reporter: fatih (fatih.tekin85@gmail.com)
            Votes: 0
            Watchers: 2

              Time Tracking

                Original Estimate: 24h
                Remaining Estimate: 24h
                Time Spent: Not Specified