quick search:
 

Query CSV Files

Submitted by: runyaga
Last Edited: 2004-10-14

Category: Python(External Method)

Average rating is: 5.0 out of 5 (1 ratings)

Description:
sometimes you need to query a CSV file. this is a very simple
solution of doing this. It only returns the first found critera
could easily be modified to return a sequence of data.

Now handles returning multiple items.

NOTES:
* still retains carriage returns! there could be more strip()ing done
* ideally each result wouldnt be a row of dumb data but a sequence of mappings
this way you could call <dtml-in queryCSV> <dtml-var name> <dtml-var ssn> </dtml-in>

these are left to the reader or for someone else who wants to add a comment



Source (Text):
#create csv_query.py in /Extensions and paste below code into it
def _parseFile(file):
    """ responsible for parsing the file and returning the datastructure """
    parsed={}
    columns=[] #in order of appearance
    f=open(file, 'r')
    lines=f.readlines()
    for column in lines[0].split(','):
        column=column.strip()
        columns.append(column)
        parsed[column]=()

    for line in lines[1:]:
        data=line.strip().split(',')
        for x in range(0, len(columns)):
            column_name=columns[x]
            values=parsed[column_name]
            parsed[column_name]=values+(data[x], )

    f.close()
    return parsed

def _queryCSVMapping(map, field, value):
    """ return the position (of the sequence of dat) inside the value of the mapping """
    values=map[field]
    results=() # line numbers where found
    for x in range(0, len(values)):
        if value==values[x]:
            results=results+(x,)
    return results

def _getDataFromLineNumbers(file, line_nums):
    data=() #sequence of text found
    f=open(file, 'r')
    contents=f.readlines()
    for line_num in line_nums:
        data=data+(contents[line_num+1],)
    f.close()
    return data

def csv_query(file, field, value):
    csv_mapping=_parseFile(file)
    line_number=_queryCSVMapping(csv_mapping, field, value)
    lines=_getDataFromLineNumbers(file, line_number)
    return lines #this is sequence of raw lines found


#in ZOPE create a External Method
#id: queryCSVFile
#module: csv_query
#function: csv_query

#create a datafile, lets say data.csv in /Extensions
John Smith,111-1111,123456789
John Smith,222-2222,123456789
Betty Lee,333-3333,234567890
Joe Anderson,444-4444,345678901

#now in your Script(Python)
f='e:\\zope25b1\\extensions\\data.csv' #path to file
data=context.queryCSVFile(f, 'phone', '222-2222')
return data

#and you should be returned
John Smith,222-2222,123456789

#you could in your DTMLMethod call it by
<dtml-in "queryCSVFile('e:\\zope25b1\\extensions\\data.csv', 'name', 'John Smith')">
  <dtml-var sequence-item> <br />
</dtml-in>

#and you should get
John Smith,111-1111,123456789 <br/>
John Smith,222-2222,123456789 <br/>

Explanation:
_parseFile opens a file, f reads the first line (lines[0]) into
a sequence called columns. then it iterates over the rest of the lines
putting them into the mapping {}, parsed.

{'name':('John Smith','John Smith','Betty Lee', 'Joe Anderson'),
'ssn':(other stuff),
'phone': (phone sequence) }

_queryCSVMapping takes the field and value and queries the parsed
mapping and returns the sequence number it found in the values. it returns
the column number found in the sequence.

_getdataFromLineNumbers opens the line and returns that line from the
csv file.


Comments:

Error by beacon - 2004-10-14
This won't work quite right, since fields containing a comma must be enclosed by quotes: you parse on commas only.

I suggest using the excellent ASV module, which gives you the advantage of having someone else maintain it ;-)

http://tratt.net/laurie/python/asv/

 
Re: Error by runyaga - 2004-10-14
want to create a recipe that uses it? :)
i didnt even think about fields that would have commas in it.  there 
is also a CSV module for python.  oh well.

http://object-craft.com.au/projects/csv/

would be nice for someone to wrap these and provide a recipe